I tried converting a 7 MB chunk of non-game data by first adding a BMP header taken from a real 7 MB BMP image.
Interestingly, if you open it, save it as TIFF, PSD, RAW, or PXR, then open that copy and save it back over the original BMP,
there is practically no difference: only one PAR2 block changes, and if we set the PAR2 block size very small, the repair could be as little as a few bytes.
The only problem is that when saved as TIFF, PSD, RAW, or PXR, the file is the same size as the BMP.
Does anyone know of a file format like TIFF, PSD, RAW, or PXR that would save smaller than the original BMP, but when saved back as a BMP would give us back our BMP, maybe with a small repair file?
The SPICE must flow.
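For anyone who wants to try the same experiment, here is a rough Python sketch of the "wrap raw data in a BMP header" step described above. It is not the exact tool used in the post; the 24-bit format and the example dimensions are assumptions, and width * height * 3 has to match the data size exactly.

# Rough sketch only: wrap a raw chunk of bytes in a 24-bit uncompressed BMP
# header so image editors will open it. The dimensions are illustrative and
# width * height * 3 must equal the data size; width is chosen so each row
# (width * 3 bytes) is already a multiple of 4 and needs no row padding.
import struct

def wrap_as_bmp(raw, width, height, out_path):
    assert width * height * 3 == len(raw), "data must fill the pixel area exactly"
    pixel_offset = 14 + 40                      # BITMAPFILEHEADER + BITMAPINFOHEADER
    file_header = struct.pack("<2sIHHI", b"BM", pixel_offset + len(raw), 0, 0, pixel_offset)
    info_header = struct.pack("<IiiHHIIiiII",
                              40, width, -height,   # negative height = rows stored top-down
                              1, 24, 0, len(raw), 2835, 2835, 0, 0)
    with open(out_path, "wb") as f:
        f.write(file_header + info_header + raw)

# Example: a chunk of 1712 * 1428 * 3 = 7,334,208 bytes (about 7 MB) could be
# wrapped as a 1712 x 1428 image; a real chunk would need trimming to fit.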
Random data is random data, no matter which format you save it as. It simply can't be (losslessly) compressed in any way.
It can be losslessly compressed, but there's no advantage because it doesn't get smaller.
What you mean is, if you save it from a 32-bit BMP to a 16-bit one (making it smaller) and then from 16-bit back to 32-bit, you lose all the original data. Or at least that's what I was trying to say.
The SPICE must flow.
You could try PNG. Its compression is lossless. Here's more info about its compression scheme: http://en.wikipedia.org/wiki/Portable_N … ompression
Considering that this is random data and not an actual picture, I doubt you'll see any significant compression gains over another file type. Wouldn't hurt to try though.
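If anyone wants to test the PNG route quickly, here is a small sketch (assuming Pillow and NumPy are installed; the 1 MB sample size is arbitrary) that saves random bytes as a greyscale PNG and compares the sizes:

# Quick test: save random bytes as a lossless greyscale PNG and compare sizes.
# Assumes Pillow and NumPy are available; the 1 MB sample size is arbitrary.
import os
import numpy as np
from PIL import Image

data = np.frombuffer(os.urandom(1024 * 1024), dtype=np.uint8).reshape(1024, 1024).copy()
Image.fromarray(data).save("random.png", optimize=True)

print("raw bytes:", data.size)
print("png bytes:", os.path.getsize("random.png"))
# On random input the PNG normally comes out slightly *larger* than the raw
# data, because the filtering and deflate stages find nothing to exploit.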
It's kind of sad PNG hasn't been mentioned until the sixth post! But anyway, is it feasible?
8 2010-04-09 04:13:45 (edited by tossEAC 2010-04-09 04:14:21)
I know of a way that could compress this sort of image. I couldn't make the program myself, and I'm not 100% sure it would work, but it might.
An Xbox ISO is 7433027584 bytes.
Imagine it was split into 7433027584 separate files, one byte per file,
and every byte was numbered from 0000000001 to 7433027584 and kept in order.
A program would be made to scan the ISO.
When it sees the first byte (let's just say it was FF), it would create a folder called FF, and then all the FF bytes would be grouped into this folder, each named by its byte position in the ISO.
So say the first byte was FF and the 0000010099th byte was also FF: the FF folder would contain the bytes at 0000000001 and 0000010099, and so on and so forth.
Once the ISO had been split into separate bytes and separate byte folders,
all the separate folders could be compressed individually.
Then, to get the ISO back, all the files would be extracted to a single folder and joined in number order to form the ISO again.
The SPICE must flow.
9 2010-04-09 04:24:15 (edited by amarok 2010-04-09 04:25:27)
The amount of disk space needed to store the byte positions' numbers would be even bigger than the actual uncompressed file. Let alone the fact that Windows would pad these 1-byte files to 4096 bytes (?) anyway (edit: granted, not if you compressed the files afterwards ^^). If this worked, even the biggest files could be compressed to a few KB, which is physically impossible. Forget it, random data like that is practically incompressible.
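To put numbers on that objection, a quick back-of-the-envelope check (the 7433027584-byte ISO size comes from the thread; the rest is plain arithmetic):

# Back-of-the-envelope check of the folder-per-byte-value idea.
import math

iso_size = 7433027584                               # Xbox ISO size from the thread
bits_per_position = math.ceil(math.log2(iso_size))  # bits needed for one byte offset
bytes_per_position = math.ceil(bits_per_position / 8)
index_size = iso_size * bytes_per_position          # one position entry per original byte

print("bits per position :", bits_per_position)               # 33
print("bytes per position:", bytes_per_position)              # 5
print("index alone       : %.1f GiB" % (index_size / 2**30))  # ~34.6 GiB
# The position index alone is roughly five times the size of the ISO it
# describes, before even counting the 256 folders of byte values.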
10 2010-04-09 05:13:12 (edited by tossEAC 2010-04-09 05:16:26)
Anyone know of any actual programs that can split a file of 100 MB or more into separate 1-byte files, numerically ordered? So far I've only found tools that can split into up to 999 files. I need something that can split down to 1-byte files and be able to handle at least 100 MB, or even 1073741824-byte (1 GB), files.
I'm beginning to think we have no option other than WinRAR's store mode. But it was worth a go.
The SPICE must flow.
11 2010-04-09 07:27:42 (edited by velocity37 2010-04-09 07:36:05)
tossEAC wrote: Anyone know of any actual programs that can split a file of 100 MB or more into separate 1-byte files, numerically ordered?
I really wouldn't suggest you do this. You would make that 100 MB file take up anywhere from 200 GB to terabytes on disk. Explorer freezes up with tens of thousands of files; I'd be surprised if the OS didn't die at tens of millions.
As r09 & amarok said, it is pretty much impossible to cut down random data.
I get what your idea is: to group all the same bytes together and compress them by describing them as a contiguous sequence. This isn't far from a real concept (RLE, run-length encoding), but it doesn't work for bytes that aren't already grouped together, since it takes more than one byte to store each byte's position.
This is a byte:
o
If this byte was in a 7 GB file, I'd have to name it something like this:
7421251637
This filename alone is 10 bytes as decimal text (5 bytes stored as a raw binary number).
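As a tiny illustration of the RLE point above, here is a sketch run-length encoder (the example byte strings are made up) showing how the scheme only pays off when equal bytes already sit next to each other:

# Tiny run-length encoder: each run becomes (count, value), two bytes per run.
def rle_encode(data: bytes) -> bytes:
    out = bytearray()
    i = 0
    while i < len(data):
        run = 1
        while i + run < len(data) and data[i + run] == data[i] and run < 255:
            run += 1
        out += bytes([run, data[i]])
        i += run
    return bytes(out)

grouped   = b"\xff" * 16 + b"\x00" * 16      # long runs: 32 bytes -> 4 bytes
scattered = bytes(range(16)) * 2             # no runs:   32 bytes -> 64 bytes

print(len(rle_encode(grouped)), len(rle_encode(scattered)))   # prints: 4 64
# Random padding looks like the second case, so describing positions or runs
# costs more than the bytes themselves.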
There is a common way to store information about a large amount of data in a small amount: the checksum we all know and love. The checksum, however, can't be used to regenerate the data. A CRC32 is four bytes long, so it can have ~4.3 billion values. If we had just one more file than that maximum, we'd be guaranteed to run into an instance where two unique files had the same checksum (known as a collision). In fact, there are programs that can produce collisions intentionally.
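The collision point is easy to see in practice: CRC32 has only about 4.3 billion possible values, so by the birthday paradox two different random inputs usually share a checksum after roughly a hundred thousand tries. A rough sketch:

# Find two different random 16-byte strings with the same CRC32.
# By the birthday paradox this usually takes on the order of 100,000 tries.
import os
import zlib

seen = {}
while True:
    blob = os.urandom(16)
    crc = zlib.crc32(blob)
    if crc in seen and seen[crc] != blob:
        print("collision after", len(seen), "samples")
        print(seen[crc].hex(), "and", blob.hex(), "-> CRC32", hex(crc))
        break
    seen[crc] = blob
# Two different inputs, one checksum: a CRC can detect corruption, but it can
# never be used on its own to regenerate the original data.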
12 2010-04-09 09:11:45 (edited by tossEAC 2010-04-09 09:33:44)
Thanks for sharing that info, velocity37; you know more than me.
I'm not that technically minded, but I am a fighter and don't like giving in or losing. So I persist, even if I'm banging my head against a brick wall most of the time.
Just looking at that pic above, when I saw it the first time it made me feel sick. Why? Because looking at it, there is a hidden pattern: RANDOMNESS.
If you think about it, they are using the same byte values, just in a random order. I read somewhere that the GameCube garbage is some kind of cipher output. In other words, completely random.
In an ideal world we would make some small random file, using say every hex value, and have it shift randomly, like a blob of mercury, and at the same time have another program, like PAR2, that scans the shifting file until it finds a matching block, copies the block, and carries on shifting and finding more blocks.
FACT (not much of one): imagine QuickPar's block size could be set to one byte. You can set it to 384,000 bytes and below,
but that's still 384 thousand times bigger than we want. The smallest it goes, on very small files, is 4 bytes.
Take any 4 consecutive bytes, "F3 28 A9 F7", and they only appear once. Take the next 4 consecutive bytes, "4D 1B 95 1E", and they only appear once. It would probably go on like that throughout the whole ISO.
So there's a clue in there pointing to the obvious: it's as random as can be. But going back to my earlier point, we need to make a small, constantly shifting random file that kind of acts like a safe cracker.
If only PAR2 did 1-byte blocks. Maybe the program could be hacked to do it; it may be capable, but some of its possibilities were left out for various reasons. Maybe on big files a 1-byte block size would lock up systems, and that's probably why they left that option out!
I keep thinking we need to crack this, rather than forget it.
By the way, I have tried converting a small sample (about 4000 characters) of the random data to binary text. It came out a lot bigger in binary, which I thought might then compress to less, but it ended up bigger. I have also tried PNG.
Would this work: converting hex to, say, base 36?
F328A9F78C67ED651971BF74DCE989D5 - HEXADECIMAL
F328A9F78C67F0000000000000000000 - BASE 16
EE8OQPJ6BFKKWCGWGGOOOG0K0 - BASE 36
3631212476743063760000000000000000000000000 - OCTAL
The SPICE must flow.
If the data is truly random, then by necessity no compressor will compress it. If you want to 'compress' it, the only hope is to find out how the random padding was generated and emulate it if possible (a severe long shot). One way I can think of where this might be possible is if they used a pseudo-random number generator (where the entire sequence is determined by an initial 'seed' number); then you could brute-force the seed. Maybe if Microsoft has developed its own random number generator you might have somewhere to start from, but anyway, good luck with that.
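Purely as an illustration of the seed idea, and with the big caveat that nobody knows what generator, if any, the mastering process uses: the xorshift-style generator and the tiny 16-bit seed space below are invented stand-ins, but they show the shape of the approach.

# Illustration only: IF the padding came from a known pseudo-random generator,
# the whole thing could be regenerated from a tiny seed. The generator and the
# 16-bit seed space below are invented for the demo; the real tool is unknown.

def fake_padding_generator(seed, length):
    # Made-up xorshift-style byte stream, standing in for the real (unknown) tool.
    state = seed & 0xFFFFFFFF or 1
    out = bytearray()
    for _ in range(length):
        state ^= (state << 13) & 0xFFFFFFFF
        state ^= state >> 17
        state ^= (state << 5) & 0xFFFFFFFF
        out.append(state & 0xFF)
    return bytes(out)

secret_seed = 0x1F3A                                  # pretend we do not know this
observed = fake_padding_generator(secret_seed, 32)    # "first 32 bytes of padding"

for seed in range(1, 1 << 16):                        # tiny seed space, demo only
    if fake_padding_generator(seed, 32) == observed:
        print("seed recovered:", hex(seed))
        break
# A real attempt would face a far larger seed space and an unknown algorithm,
# which is why this is such a long shot.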
I have a few questions, just out of curiosity:
1/ How much padding would an average game have? Does the amount vary from game to game?
2/ Would 2 copies of the same game have the same padding?
3/ Is the padding common to different games (maybe a particular developer has their own)?
4/ Is the entire padding for a game contiguous?
5/ Is the padding referenced in any way (maybe a hash check to ensure it's there), or is it properly just padding?
If padding is the same across many games, you've got the simplest answer. Distribute the games with the random padding zeroed, and distribute the padding separately as a patch (which works for all the games with that padding). You couldn't use ImageDiff, but a simple patch format could be made.
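A minimal sketch of that zero-and-patch idea, assuming the padding region's offset and length are already known (which is precisely the hard part) and using placeholder file names:

# Sketch of the zero-the-padding / ship-it-separately idea. Assumes the padding
# region's offset and length are already known; file names are placeholders.
# Reads whole files for simplicity; a real tool would stream 7+ GB images.

def strip_padding(iso_path, stripped_path, padding_path, offset, length):
    with open(iso_path, "rb") as f:
        data = bytearray(f.read())
    with open(padding_path, "wb") as f:
        f.write(data[offset:offset + length])        # keep the padding on its own
    data[offset:offset + length] = b"\x00" * length  # zeros compress to almost nothing
    with open(stripped_path, "wb") as f:
        f.write(data)

def restore_padding(stripped_path, padding_path, iso_path, offset):
    with open(stripped_path, "rb") as f:
        data = bytearray(f.read())
    with open(padding_path, "rb") as f:
        padding = f.read()
    data[offset:offset + len(padding)] = padding
    with open(iso_path, "wb") as f:
        f.write(data)
# If the same padding really were shared between games, that one padding file
# would serve as the "patch" for all of them.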
tossEAC, sorry man, but the safe-cracker idea seems dubious. What would this constantly shifting random file do exactly? There's nothing to say that 4 consecutive bytes at location X can't be identical to 4 consecutive bytes at location Y; all we know about the data is that it is random. Hex is base 16, so there is something wrong with how you've converted the numbers, but what are you trying to do by converting between bases? Computers store data in base 2; there is no way around that. When you say that the binary ended up bigger, I'm guessing you are saving the digits in a text file? Each written character of binary then consumes 8 bits (or 16, depending on the format) instead of the intended 1, and written hex takes up 8 or 16 bits instead of 4. Does that make sense?
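To put numbers on that last point, here are the same 16 bytes (taken from the hex string quoted a few posts up) written out as binary text, as hex text, and kept as raw bytes:

# The same 16 bytes in different *textual* encodings, versus the raw bytes.
raw = bytes.fromhex("F328A9F78C67ED651971BF74DCE989D5")

as_binary_text = "".join(format(b, "08b") for b in raw)   # '11110011...'
as_hex_text    = raw.hex()                                # 'f328a9f7...'

print("raw bytes   :", len(raw))              # 16
print("hex text    :", len(as_hex_text))      # 32 characters = 32 bytes on disk
print("binary text :", len(as_binary_text))   # 128 characters = 128 bytes on disk
# Writing the digits out as text always inflates the data; changing the base
# only changes how much it inflates, it never gets below the raw 16 bytes.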
14 2010-04-09 12:39:55 (edited by tossEAC 2010-04-09 13:03:48)
This is all I've got time for now..
1/ How much padding would an average game have? Does the amount vary from game to game?
2/ Would 2 copies of the same game have the same padding?
3/ Is the padding common to different games (maybe a particular developer has their own)?
4/ Is the entire padding for a game contiguous?
5/ Is the padding referenced in any way (maybe a hash check to ensure it's there), or is it properly just padding?
A1/ The smaller the actual game data, the more padding, so it's exactly the same as the GameCube.
A2/ Yep, I believe they do; two dumps that are equal have the same padding.
A3/ I haven't checked, as to check I would have to make a 1:1 dump and then, with great difficulty, wipe the game data. That could be something I might try, though.
A4/ Dunno?
A5/ Dunno?
jamjam wrote: If padding is the same across many games, you've got the simplest answer.
Definitely, but I'm pretty sure it's random for each game, just as each individual game seems to have random padding from start to finish.
I did examine a section from the beginning of a padded part of the ISO. I split off 999 parts of 2048 bytes each. I tried compressing them using APE and they all came out bigger except two. I think those were probably the two half-blank sectors that contained some words of interest.
It said something like "MICROSOFT*XBOX*DVD"; I should have made a note of it. There was a bit more writing. I get the feeling it has something to do with the padding, but it was just a header-type bit of text, like at the start of a Dreamcast .bin file. It was maybe 50 sectors from the start of the file.
You mods, if you haven't already seen this: go to the LBA past the end of the video section, then use sector view in IsoBuster; about 20-30 sectors further on you'll see the MICROSOFT*XBOX*DVD text.
I have decided to give up finding better ways of compressing the random data.
The best compression I have found for the random padding is WinRAR's store mode (m0).
That only leaves one feasible option that would make us all jump for joy: being able to generate the padding from scratch.
Wipe the padding first so the image compresses, then work out how to generate the exact data that was wiped; hopefully whoever does this will spot what the random data is generated by.
The SPICE must flow.
The "MICROSOFT*XBOX*DVD" thing is a header that marks the start of the file system, if I'm not mistaken. It's like the "CD001" header in ISO9660 compliant CDs.
16 2010-04-09 14:49:37 (edited by Ghazy 2010-04-09 14:52:43)
Nice thread and good thoughts. tossEAC's idea is great and sounds utopian, but that is the key.
jamjam's suggestion is also good; it would be so awesome if someone could find out the algorithm for the random padding.
I just doubt that either thing is possible. I like this discussion; maybe someone has an even better suggestion. Let's see.
In my opinion it's better to leave that data alone. Who knows if it's really useless? It could be copy protection.
I've compared the compressed file sizes of some games released on both PS2 and Xbox, and the difference is astounding. I can see why this is such an issue, and I hope that someone can pull a miracle out of the bag. I guess larger games end up having a smaller compressed file size, because there's a lower percentage of random padding?
jamjam wrote: I guess larger games end up having a smaller compressed file size
How do you mean?
I'm not really in the mood to check all posts, but I just wanted to let you guys know that the mastering tool for Xbox discs is in the SDK, so maybe the algorithm for the padding can somehow be traced there.
jamjam wrote: I guess larger games end up having a smaller compressed file size
How do you mean?
By that I mean that for games with more game data (as opposed to random padding), a higher percentage of the 7433027584 bytes an Xbox image takes up has a chance of being compressed well. I hope that makes sense.
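As a rough worked example (the 4 GiB of game data and its 2:1 compressibility are invented numbers; only the image size comes from the thread):

# Rough worked example of why more game data means a smaller archive.
image_size = 7433027584
game_data  = 4 * 2**30               # invented: 4 GiB of real, somewhat compressible data
padding    = image_size - game_data

archive = game_data / 2 + padding    # assume game data shrinks ~2:1, random padding not at all
print("%.1f GiB instead of %.1f GiB" % (archive / 2**30, image_size / 2**30))
# With less game data and more padding, the incompressible share grows and the
# archive ends up closer to the full 6.9 GiB.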
Ah, I see your point. Yes, it makes sense, but I think Xbox game data is already pre-compressed, so a compressor would not treat the game data much differently from the randomly padded parts.
I think jamjam is right in what he says; I found that to be the case. If the random padding is large, then so will the fully packed ISO be.
Here's the text I found in the ISO:
0010 : 00 00 00 00 00 00 00 00 4D 49 43 52 4F 53 4F 46 ........MICROSOF
0020 : 54 2A 58 42 4F 58 2A 4D 45 44 49 41 82 87 12 00 T*XBOX*MEDIA....
and in the next sector
0010 : 00 00 00 00 00 00 00 00 58 42 4F 58 5F 44 56 44 ........XBOX_DVD
0020 : 5F 4C 41 59 4F 55 54 5F 54 4F 4F 4C 5F 53 49 47 _LAYOUT_TOOL_SIG
Hopefully this find is of some use.
The SPICE must flow.