Re: Merged Compression Tests

themabus, you have no idea about how those methods really work do you
no worries, let's start with imagediff, which, if you had used it at least once, you would know is slower than simple decompression

unecm "Final Fantasy IX (E)(Disc 1 of 4)[SLES-02965].bin.ecm" ~75 sec
imagediff "Final Fantasy IX (E)(Disc 1 of 4)[SLES-02965].bin" into version "(F)(Disc 1 of 4)[SLES-02966]" ~5 min

imagediff is patching in 3 basic steps
1. calc md5 (hdd speed) to validate integrity of foreign image
2. create sector map of foreign image (also hdd speed)
3. rebuild image while comparing imagediff's sector map to sector map of foreign image (and this one is "slightly" 2-3x slower than just reading whole image)

it reads whole original image at least 3 times!
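
To make the three passes concrete, here is a rough sketch in Python. It is not ImageDiff's actual code or patch format: the `recipe` structure, the function names, and the use of MD5 per sector are assumptions made only for illustration. The only fixed fact is that a raw CD sector is 2352 bytes.

```python
import hashlib
import io

SECTOR = 2352  # raw CD sector size in bytes


def md5_pass(img):
    """Pass 1: hash the whole foreign image to validate it."""
    img.seek(0)
    h = hashlib.md5()
    for chunk in iter(lambda: img.read(1 << 20), b""):
        h.update(chunk)
    return h.hexdigest()


def sector_map_pass(img, size=SECTOR):
    """Pass 2: map each sector digest to the index of its first occurrence."""
    img.seek(0)
    index_of = {}
    i = 0
    while True:
        sector = img.read(size)
        if not sector:
            break
        index_of.setdefault(hashlib.md5(sector).digest(), i)
        i += 1
    return index_of


def rebuild_pass(img, index_of, recipe, size=SECTOR):
    """Pass 3: rebuild the target from a hypothetical recipe.

    Each recipe entry is ("copy", digest) to take a sector from the
    foreign image, or ("data", raw_bytes) for a sector stored in the patch.
    """
    out = bytearray()
    for kind, value in recipe:
        if kind == "copy":
            img.seek(index_of[value] * size)  # random access into the image
            out += img.read(size)
        else:
            out += value
    return bytes(out)


# tiny demo with 4-byte "sectors" instead of 2352
demo = io.BytesIO(b"AAAABBBBCCCC")
demo_map = sector_map_pass(demo, size=4)
```

Passes 1 and 2 are sequential reads, while pass 3 seeks around the image, which would fit with it being the slow one.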

and now compare it to extracting ALL 8 files to their original form (without any patching) in 8 (eight) minutes from a FreeArc archive (+ unecm on 8 files, which is even slower than decompression = 8 * 75 sec = ~10 minutes)

FreeArc is ultra fast because of its method:
- first decompress 380mb archive into repetition filter's dictionary + non-common data (in this step you don't have the 5gb un-merged set, only ~700mb dictionary + ~50mb of data)
- next repetition filter is working in a way which you thought imagediff should work - it rebuilds files from common parts which are stored in the dictionary and loaded into RAM for fast access (and here the magic of un-merging into 5gb is done)
- rebuilt data is stored in cache and then written onto hdd
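
A toy illustration of the idea behind a repetition filter, in Python. This is not FreeArc's implementation: the fixed block size, the token format and the function names are assumptions picked to keep the sketch short. Repeated blocks are replaced by references back into data already seen, and decoding copies them back out of the buffer held in memory.

```python
def rep_encode(data, block=4):
    """Replace repeated fixed-size blocks with (offset, length) references."""
    seen = {}    # block content -> offset of its first occurrence
    tokens = []  # ("lit", bytes) or ("ref", offset, length)
    for pos in range(0, len(data), block):
        chunk = data[pos:pos + block]
        if len(chunk) == block and chunk in seen:
            tokens.append(("ref", seen[chunk], block))
        else:
            seen.setdefault(chunk, pos)
            tokens.append(("lit", chunk))
    return tokens


def rep_decode(tokens):
    """Rebuild the original data; references copy from the output so far."""
    buf = bytearray()
    for tok in tokens:
        if tok[0] == "lit":
            buf += tok[1]
        else:
            _, offset, length = tok
            buf += buf[offset:offset + length]
    return bytes(buf)


sample = b"ABCDEFGHABCDxyz!EFGH"
encoded = rep_encode(sample)
```

In a merged set the repeated blocks would be the sectors shared between regional versions of the same disc, so only the first copy has to be stored.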

and as I already wrote from the beginning "no need for storing by ImageDiff when merging with repetition filter is much more convenient" big_smile

Re: Merged Compression Tests

7zip 4.57 is the latest version that produces the same output as the 4.53beta included with packiso.
All later versions have a little bit changed. smile

Re: Merged Compression Tests

unecm "Final Fantasy IX (E)(Disc 1 of 4)[SLES-02965].bin.ecm" ~75 sec
imagediff "Final Fantasy IX (E)(Disc 1 of 4)[SLES-02965].bin" into version "(F)(Disc 1 of 4)[SLES-02966]" ~5 min

are you sure it's 5 minutes?
i have whole PCE set on ImageDiff - it's always a matter of seconds on my machine
those are small ImageDiffs, however - rarely larger than 10 megabytes

md5 on 600mb file (fsum.exe) completes in ~20 seconds

if ImagePatch is slow, then it's the implementation's fault, as in theory it should be faster than unecm
ecm has a map of sectors where to recreate ecc
all a program like ImagePatch needs, imho, is a map of sectors to insert and those sectors themselves
it's practically 'copy /b file1+file2 file3'
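
A minimal sketch of such a "less complex patcher", in Python. It assumes both images have the same number of sectors and that the patch is simply a dict of sector index to replacement sector; both the format and the name `apply_sector_patch` are made up for illustration and are not how ImagePatch works.

```python
SECTOR = 2352  # raw CD sector size in bytes


def apply_sector_patch(base, inserts, size=SECTOR):
    """Return a copy of base with the given sectors replaced.

    inserts maps sector index -> replacement sector bytes.
    """
    out = bytearray(base)
    for index, sector in inserts.items():
        start = index * size
        if len(sector) != size or start + size > len(out):
            raise ValueError("bad sector %d" % index)
        out[start:start + size] = sector
    return bytes(out)
```

This is a single pass over the base image plus the replacement sectors, with no hashing and no sector matching at patch time.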

as you said yourself FreeArc uses a symmetric compression algorithm, which 7z does not
i don't think it's ultrafast - it can never be faster than 7z
it can compress better or be more convenient
but same files should extract faster from 7z

yes i'm theorizing
i'm out of space now and basically can not check anything so i have to trust you

so time results you provided earlier

unpack 650mb from 350mb 7z ~80 seconds

what are they from?

i mean if it's single file (or two) extracted from 7z containing one .ecm and ImageDiffs,
whole archive should decompress only little slower as it is solid most likely

so what we have is .ecm and .ImageDiffs in 2 minutes
vs one ecm (or several) in 7 minutes

now if ImagePatch is bottleneck here it should be possible to replace it with less complex patcher and it should zap

or i don't understand something?

- rebuilt data is stored in cache and then written onto hdd

ok, maybe you think author will implement ecm and joining of files in FreeArc and everything will be done in RAM (5gb)
and then written to HDD so time on HDD access will be saved
but it's overkill imho - 5gb of RAM is not sane and probably same could be done then with ramdisk

i mean it's all good to have all those features in one program but decompression algorithm itself is way slower
so that's where it ends, imho

edit:
i don't mean that 7z is ultimate archiver btw.
if there's one faster at same or better ratio (like TAK vs APE & FLAC) - by all means
i just don't think that FreeArc is the case unfortunately

29 (edited by cHrI8l3 2009-04-06 16:55:30)

Re: Merged Compression Tests

as you said yourself FreeArc uses a symmetric compression algorithm, which 7z does not

I never said that! quite the opposite: when you look at the config table you will see that the fast, low-memory method from FreeArc uses the same algorithm as 7z - LZMA !! and FA has a faster implementation of LZMA than 7z
LZMA is asymmetric
symmetric compressors are used in NanoZip

ImageDiff uses 2 methods for creating patches, I've been using patches created with "best diff type" option and "slow match" for smallest patches

unpack 650mb from 350mb 7z ~80 seconds

unpack 1 version of FF IX (in .ecm format) from 1 archive (7z) in 80 seconds (no ImageDiffs here!!)

but it's overkill imho - 5gb of RAM is not sane and probably same could be done then with ramdisk

i mean it's all good to have all those features in one program but decompression algorithm itself is way slower
so that's where it ends, imho

man you should educate yourself or at least play some with those algorithms before making such statements

I know my english is not perfect, but I already wrote all the information needed to understand the whole idea
the decompression tests and the whole diffs thing might be confusing, because I did not intend to make those in the first place
I will think about it and maybe add those unpacking times to table with results , perhaps that will be easier to understand ;/

Re: Merged Compression Tests

ok, i'll clean up hdd, fetch those images and test for myself
asap

symmetric compressors are used in NanoZip

i thought it's the same program, like mode or something
so there's 2?
what about records with FreeArc+NanoZip, then? are they compressed twice, like zip+arj?

edit:
i'm basing my statements on what you write
i've asked you about only comparable value in that table - whether it's slower, and you said it is
that 'fast' part you added later
and after that you wrote that FA extract in 7 minutes,
...about the same as compression value given ~10 (7 is about 10 - 7 approximates to 10) = symmetrical
had you written - FA extracts in 1 minute, i'd think it's fast (as 7z extracts in 2),
but you wrote - in seven

so what would be decompression speed difference of exactly same files 7z vs FA?
have you tested?

edit:

i don't mean that 7z is ultimate archiver btw.
if there's one faster at same or better ratio (like TAK vs APE & FLAC) - by all means

so if FreeArc does that - it's great
if you can talk author into integrating ecm then - it's even better

31 (edited by cHrI8l3 2009-04-06 20:29:03)

Re: Merged Compression Tests

what about records with FreeArc+NanoZip

let's make one thing clear... FreeArc is a compression suite (not a compressor!!), it allows you to link different algorithms
it has some nice built-in algorithms like LZMA (the one 7z uses), repetition filter, exe filters, delta...
and you can configure it to work with almost any other command line compressors/data filters, e.g. ECM, NanoZip, Precomp, APE, WinRAR, etc... whatever you need
and then.. you can create your own packing profiles by linking compressors with filters etc.

you can run for example arc.exe a -mecm+rep:1gb+lzma:128 -dp"C:\Working Dir" -- "C:\Working Dir\arc.arc" "file1" "file2"
and it will first ecm both files, then filter repetitions within a 1gb range, and then compress the result with LZMA using a 128mb dictionary
it's simple smile

if you can talk author into integrating ecm then - it's even better

it is already possible, but ecm has some issues on large sets of data... it can be used on <2gb files though
might be worked out in one of the next releases

(7 is about 10 - 7 approximates to 10) = symmetrical

wtf smile repetition filter takes most of the time, decompressing from LZMA takes less than half of that time

Try it, definitely !!
download and run installer of v0.50: http://freearc.org/Download.aspx
download and unpack over installed version update pack: http://www.haskell.org/bz/arc1.arc (recommended)

Edit:
if you want you can store your own configurations in arc.ini, here is one of mine:
cso7=ecm+rep:1gb+lzma:128mb:max:bt4:273
and you run it with -mcso7 switch
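
To show how such a profile line reads, here is a small Python sketch that splits it into stages. It only mirrors the `name=stage+stage:param` shape visible in the line above; it is not FreeArc's real parser, and the function name `parse_profile` is made up.

```python
def parse_profile(line):
    """Split 'name=a+b:x+c:y:z' into (name, [(method, [params]), ...])."""
    name, chain = line.split("=", 1)
    stages = []
    for stage in chain.split("+"):
        method, *params = stage.strip().split(":")
        stages.append((method, params))
    return name.strip(), stages


profile = parse_profile("cso7=ecm+rep:1gb+lzma:128mb:max:bt4:273")
```

For the cso7 line this gives three stages applied in order: ecm with no parameters, rep with 1gb, and lzma with 128mb, max, bt4 and 273.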

and... following can also be configured and run when you have audio stored as wav:
packiso= $obj => ecm+7z, $wav => ape

and.. FA also has a GUI, but you cannot run custom configurations from the GUI yet.. however that will be resolved soon

in short... it's software for PROs and that's why I thought redump staff might be interested tongue are you ? smile

Re: Merged Compression Tests

This definitely looks pretty good and I might try it soon. There are just a few things though. First of all it looks like this program is still in development and the current version is an alpha build so I don't think it would be a good idea to migrate everything to this just yet until the program is more perfected by the author. Another thing I've been thinking of is the possible amount of PC resources people are going to need to extract an archive made with this program. A lot of our dumps are being shared on torrent networks now as well as usenet and most people in the torrent community want something that is easy to extract and not all of them have the latest and greatest hardware. A lot of them even have a problem figuring out packIso (like that's real hard to use). Basically if we are going to use this to help spread our dumps we need to make sure it's going to be usable and accepted by everyone first (or at least the majority of people that know what they are doing) before we start migrating everything to a new format.

Re: Merged Compression Tests

thank you cHrI8l3, that does sound intriguing

so in 'Arc 6' there's ECMs of all 8 images -> diff
decompression and reverse of diff on all 8 images would complete in 7-8 minutes, leaving ECMs, right?

can you make a filter chain then, that would produce one ECM and diffs, alike to 'ImageDiffs+ECM+7z' ?

Re: Merged Compression Tests

I do not think this is meant to be used for spreading the images. Instead it is meant as to store many games in very little space. smile

It might be useful if you want to have something like a "complete psx FF7 collection" and share that with others.

But I myself prefer the approach of "1 disc - 1 archive", even when it is space-wasting. smile

(I currently have all games torrentzipped, since it is the easiest way to quickly scan the whole collection :x)

35 (edited by cHrI8l3 2009-04-07 14:58:44)

Re: Merged Compression Tests

First of all it looks like this program is still in development and the current version is an alpha build so I don't think it would be a good idea to migrate everything to this just yet until the program is more perfected by the author.

yes it's still an unstable release, and every time you want to store an archive, you should run a test after compression (-t switch)

Basically if we are going to use this to help spread our dumps we need to make sure it's going to be usable and accepted by everyone first (or at least the majority of people that know what they are doing) before we start migrating everything to a new format.

yes yes i agree ! don't use freearc yet for anything official, just try it for personal use for now, get used to it, learn basic usage etc...
It will however be possible to create bundles of FreeArc with other compressors (like ECM) and a config, let's say e.g. "FreeArc 0.5 (Redump Pirate Release)", and all the user will need to do is unpack it, run FreeArc.exe, select and extract archives made with ECM tongue no need for installation, no need for messing with config, that's one example...

so in 'Arc 6' there's ECMs of all 8 images -> diff
decompression and reverse of diff on all 8 images would complete in 7-8 minutes, leaving ECMs, right?

yes there are 8 .ecm files (no diffs!!) understand one thing... I'm looking for a method that will not involve diffs, I added the diff method to the tests only to compare the archive sizes you can get with and without diffs, and as you can see both methods give pretty much the same size (with a few MB in favour of diff and greater speed/convenience in favour of repetition filter...)

can you make a filter chain then, that would produce one ECM and diffs, alike to 'ImageDiffs+ECM+7z' ?

I don't think so, FreeArc config does not treat every file inside the archive separately, but as a merged solid bundle; for patches you would need a more complicated algorithm that would select pairs of files and create patches from those pairs...

I do not think this is meant to be used for spreading the images. Instead it is meant as to store many games in very little space.

hell yeah smile for everyone who is short on hdd space and doesn't want too much mess with images

Re: Merged Compression Tests

Size

Split:

 zip (7z -tzip)       :3996932848 <| 8 * 499616606 - average from 3 samples

 rar (-m5)            :3725107104 <| 8 * 465638388 - average from 3 samples

 PackIso (ECM->7z)    :3020367596 <| taken from cHrI8l3's table but it's slightly off: each archive is about 4..6 bytes smaller

Merged:

 ImageDiff+ECM->7z    : 379311961 <| ImageDiff with default settings

 Xdelta3+ECM->7z      : 387772715 <| Xdelta: -N -D -R -n -0; 7z: -mx=9; it's strange though, patches themselves are smaller uncompressed

 Xdelta3+ECM->FreeArc : 394458697 <| -m4, -m4x (size is the same)

 Xdelta3+ECM->FreeArc : 388348030 <| -m9x

Compression speed

Split:

 zip (7z -tzip)       : 8 *  ~87 =  ~696 seconds

 rar (-m5)            : 8 * ~347 = ~2776

 PackIso (ECM->7z)    : 8 * ~250 = ~2000

Merged:

 ImageDiff+ECM->7z    : 814
  ECM                 : 36
  ImageDiff           : 7 *  ~56 =  ~392
  7z                  : 386

 Xdelta3+ECM->7z      : 605
  ECM                 : 36
  Xdelta3             : 7 *  ~28 =  ~196
  7z                  : 373

 Xdelta3+ECM->FreeArc : 624
  ECM                 : 36
  Xdelta3             : 7 *  ~28 =  ~196
  FreeArc             : 392               <| -m4

 Xdelta3+ECM->FreeArc : 622
  ECM                 : 36
  Xdelta3             : 7 *  ~28 =  ~196
  FreeArc             : 390               <| -m4x

 Xdelta3+ECM->FreeArc : 797
  ECM                 : 36
  Xdelta3             : 7 *  ~28 =  ~196
  FreeArc             : 565               <| -m9x

Decompression speed

Split:

 zip (7z -tzip)       : 32..256 (8 *  ~32 =  ~256)

 rar (-m5)            : 40..320 (8 *  ~40 =  ~320)

 PackIso (ECM->7z)    : 72..576 (8 *  ~72 =  ~576)

Merged:

 ImageDiff+ECM->7z    : 84(209)..959
  unECM               : 36
  ImagePatch          : 7 * ~125 =  ~875
  7z (1 or many diffs): 48

 Xdelta3+ECM->7z      : 84(118)..322
  unECM               : 36
  Xdelta3             : 7 *  ~34 =  ~238
  7z (1 or many diffs): 48

 Xdelta3+ECM->FreeArc : 97(131)..335
  unECM               : 36
  Xdelta3             : 7 *  ~34 =  ~238
  FA (1 or many diffs): 61                <| -m4, -m4x

 Xdelta3+ECM->FreeArc : 101(135)..339
  unECM               : 36
  Xdelta3             : 7 *  ~34 =  ~238
  FA (1 or many diffs): 65                <| -m9x
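
The "a(b)..c" figures in the merged rows can be reproduced from the per-step times. The reading used here is inferred from the arithmetic in the table, not stated in it: a is the base image only (unpack + unECM), b adds a single patch, c adds all 7 patches. A short Python check, with the helper name `decompress_range` being made up:

```python
def decompress_range(unecm, unpack, patch_each, patches=7):
    """Return (base only, base + one patch, base + all patches) in seconds."""
    base = unecm + unpack
    return base, base + patch_each, base + patches * patch_each


imagediff_7z = decompress_range(36, 48, 125)
xdelta_7z = decompress_range(36, 48, 34)
xdelta_fa_m4 = decompress_range(36, 61, 34)
xdelta_fa_m9x = decompress_range(36, 65, 34)
```

These come out as 84/209/959, 84/118/322, 97/131/335 and 101/135/339, matching the rows above.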

Programs used

7-Zip 4.53 (PackIso)
7-Zip 4.65
ECM v1.0
FreeArc 0.50
ImageDiff v0.9.8
RAR 3.80
Xdelta 3.0u

ImageDiff is quite slow with larger files indeed, though the patches it produces, while being larger, compress better for some reason.
replacing it with similar program: Xdelta3, improved both: compression and decompression speeds a lot.
replacing 7z with FreeArc on the other hand didn't improve anything, though i tested just a few options:
m4 - for being suggested as equal to -mx=9 of 7z, which i commonly use
and couple more
(maybe it does beat 7z with some - i'm not saying it doesn't, they're quite close anyway)
also i didn't test those built-in filter chains

from those results i'd say 'Xdelta3+ECM->LZMA' is optimal configuration,
if the .ecm were created for the most demanded version from the set (U or E, whichever it is)
it would lose only a few seconds on decompression to PackIso, while improving ratio a lot
(would this game contain audio tracks it'd be a tie probably (TAK vs APE))
it would be worse if patching is required, but still acceptable, imho
also whole set would compress/decompress considerably faster,
but it's unlikely somebody would do that, imho, not often at least
also memory requirements for 7z @x9 are ok: 700mb/70mb

would such set be created now - it'll involve a lot of constant recompression, though - whenever title is added,
so it's too early, imho, but otherwise i like it a lot

Re: Merged Compression Tests

is anyone else in favour of using patches for storing images instead of merged archives ?

Lil' Update:
- the ECM issue I mentioned a few posts ago (that you cannot ECM large files inside FreeArc) was found to be caused not by FreeArc but by ECM itself... ECM cannot handle large files, so when you add more than 4gb of discs into an archive and run ECM on it .. it will not work! let's hope it will be resolved...

Re: Merged Compression Tests

Haldrie wrote:

A lot of our dumps are being shared on torrent networks now as well as usenet and most people in the torrent community want something that is easy to extract and not all of them have the latest and greatest hardware. A lot of them even have a problem figuring out packIso (like that's real hard to use). Basically if we are going to use this to help spread our dumps we need to make sure it's going to be usable and accepted by everyone first (or at least the majority of people that know what they are doing) before we start migrating everything to a new format.

Seriously, this...
How could you share the dumps on UG or similar with patches, or an unusual, ridiculous compression program? 99% of the regular users will just ignore those torrents and the redump.org project will never get out of its niche.

It could annoy more people and give the tosec guy another reason to make jokes about us. Already now, in nearly every UG TOSEC torrent, you can read things like "redump.org artificially forges dumps", "redump.org method is the worst of the internet and it's very difficult to implement", "redump.org dumps are ghost dumps because 90% of discs listed on their site doesn't exist on the internet", etc... I can already imagine: "lol, redump.org dumps are compressed in an abstruse way that it takes hours to get back to a common format, and maybe you have to be an engineer to make this happen".

To go head to head with TOSEC the compression format should be the common torrentzip or winrar.
Just my humble opinion, but I think I have a very strong point here.
Of course I'll help to seed on UG when the torrents will be out no matter the final compression standard. Keep up the good work to finally spread the dump! smile

Re: Merged Compression Tests

i don't think there's anything to worry about
neither me nor cHrI8l3 maintain those sets, they'll likely remain as they are

i myself think it's still way too early for this,
it would be more appropriate when PSX is about 80% or so complete
and even then as alternative to PackIso likely

though regarding .zip or .rar, i think it's a step backwards
they cater uncompromisingly to people with fast connections, a lot of storage space and time to waste

the compression increase a merged set offers is unprecedented
it won't be 8 times of course, but on average it'd easily be x2 over PackIso
and decompression speed is still good
http://img410.imageshack.us/img410/5558/merged.png
so let's say this 80% PSX set takes 1tb with PackIso
then it would be 500gb merged
my connection allows maximum download speed of 500 kilobytes per second, which is about average i guess
so i'd save (1024*1024)/60/60/24 = ~12 full days on download
that's a lot of space and time economy
the price is slower decompression speed,
but to lose those 12 days i saved, i'd have to decompress really often
let's be generous and say it's 2 minutes overhead on decompression, which is not true
(it's not even true for zip vs merged, but it's ok - let's be generous)
so then 12*24*60/2 = 8640
i'd need to decompress 8640 images one by one, to claim 'it wasn't worth it' -
would i decompress all merged versions of same title at once, i'd actually save time
(8640 happen to be about 80% of PSX titles, so i'd need to decompress each one of them: French, German, etc...
it's very unlikely and still i'd save space)
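
The arithmetic above, spelled out as a small Python check. The inputs are the ones from the post (500gb saved, 500 kilobytes per second, 2 minutes of extra decompression per image); the function names are made up, and the break-even figure uses the rounded 12 days as the post does.

```python
KB_PER_GB = 1024 * 1024


def download_days(size_gb, speed_kb_per_s):
    """Days needed to download size_gb at a constant speed."""
    seconds = size_gb * KB_PER_GB / speed_kb_per_s
    return seconds / (60 * 60 * 24)


def break_even_extractions(saved_days, overhead_minutes):
    """Extractions needed before the extra decompression time eats the saving."""
    return saved_days * 24 * 60 / overhead_minutes


saved = download_days(500, 500)           # a bit over 12 days
needed = break_even_extractions(12, 2)    # 8640 extractions
```

Without the rounding the break-even point is slightly higher still, so the conclusion does not change.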

about ease of use - it's not a problem either, imho
graphical frontend can be made that would allow user friendly extraction

Re: Merged Compression Tests

I am not sure about other sets, as uploaders of those didn't use the forum post of the project at UG, but packiso has worked a lot better with PSX, and the community has accepted its use. in fact one of our project torrents has a dat file with packiso's CRCs for quick renaming, and we plan to release the dat with every update from now on, which should solve the time issue in renaming those sets.

so i believe packiso is doing a good job there at the moment, though i myself would like a merged set at some point to save HDD space; i have 2.5TB of space and it's been almost full for the last 6 months or so.

PS: we also made a little installer there which makes it easy to work with packiso; anyone who wants can use that too.

Re: Merged Compression Tests

in fact one of our project torrents has a dat file with packiso's CRCs for quick renaming, and we plan to release the dat with every update from now on, which should solve the time issue in renaming those sets.

for people that would get those images elsewhere and then compress accordingly - with PackIso
to join in torrent at higher position?
but the thing is - there aren't a lot of those people, as i understand it.
majority won't be bothered with renaming themselves - they'll take what is given.

so, imho, if somebody feels like recompressing and reuploading, like xenogears i suspect may (to .zip/.rar)
it wouldn't do any harm - the more of those torrents there are, the better
one enforced set without alternatives wouldn't be good
even though i think .zip or .rar particularly aren't rational - still if somebody feels different about it - it won't harm
(well, not any more than any torrent based on redump.org .dats at current state
i see a general problem with names, i think they're terribly wrong,
hence in torrents also and anything else derived from them
but that is a concern of redump.org crew)

afterwards to maintain names in sync with updates made at redump.org it would be enough to compare .dat files (redump's):
one taken when the set was made, and the current one - there's no need to rescan the whole set every time, imho
(and to produce alternative .dats for that reason)
when it's uploaded it's locked to redump.org @certain point in time (like a snapshot),
so you can say this torrent is a subset of that .dat
when you see CRCs of some title change from .dat to .dat - you know this CD should be updated
when same CRCs belong to different title now - it was renamed then
and new records would manifest as new titles with new CRCs
that's all the management there is, and it is a concern solely of the person maintaining the torrent, not everyone downloading it, as i see it
i don't really see application for .dat indexing compressed files
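
The comparison described above can be sketched in Python. This is a simplification: each .dat is reduced to a dict of title to one CRC, while real redump .dat files list several files per title, and the titles and CRC values in the demo are invented.

```python
def diff_dats(old, new):
    """Compare two {title: crc} snapshots.

    Returns (updated, renamed, added):
      updated - same title, CRC changed
      renamed - same CRC, now under a different title, as (old, new) pairs
      added   - new title with a CRC not seen before
    """
    old_by_crc = {crc: title for title, crc in old.items()}
    updated, renamed, added = [], [], []
    for title, crc in new.items():
        if title in old:
            if old[title] != crc:
                updated.append(title)
        elif crc in old_by_crc:
            renamed.append((old_by_crc[crc], title))
        else:
            added.append(title)
    return updated, renamed, added


# hypothetical snapshots: one taken when the set was made, one current
snapshot = {"Game A (E)": "11111111", "Game B (U)": "22222222"}
current = {"Game A (E)": "aaaaaaaa", "Game B (USA)": "22222222", "Game C (J)": "33333333"}
result = diff_dats(snapshot, current)
```

Only the two .dat files are read; the torrent contents never need to be rescanned.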

Re: Merged Compression Tests

Anyone interested can download the PackISO installer here: http://www.mediafire.com/?otb3dmmtznh

43 (edited by BadSector 2009-04-16 05:14:06)

Re: Merged Compression Tests

I see what u are saying, and that's why we also provide a change list which lists games that have been changed since the last update, though at this moment all this is done manually, and an automated solution would be great. the compressed crc dat files are for people like me who have missed a couple of updates and thus can use that dat to quickly find which games need to be updated

about naming of the games: though i am a big fan of serial # coz that's what i have been using since Psx_renamer days, i do agree we need to somehow provide the info on what version of a game a given dump is.

i personally don't worry how long a game name is as long as it provides the following info

GameName-region-Language-version-edition-serial

PS: i need patches for a few PSX Asia games dumped by u, and maybe a few tracks/full images, could u provide them somehow?

Re: Merged Compression Tests

themabus wrote:

3) Now he shares the pars with other testers to see how many blocks are needed.

about equal to archive size, i guess, which would be huge

I will go back on this later.
I suggested this for stuff like Gamecube that have no better compression with 7z.
If images are set to the same date and compressed with the same version of winrar and the same settings, you'll have the same as packps2, or not?
