What are people's thoughts on how to handle game prototypes or other CD/DVD/BD-R releases where each disc has a different hash because of unique identifiers in one or more files on the disc?

I know that this affects Capcom prototypes in the Xbox 360 era, probably other publishers too.

Most prototypes were probably destroyed or lost, but in a worst-case scenario there could be a few hundred entries for the same edition, all with different hashes. The same goes for any indie releases that have watermarked files or similar.

2 (edited by Hiccup 2019-08-02 16:16:33)

I can think of two ways to do it (within the current database):

1. Each disc would have a different entry, but with a comment/tag explaining that it's the same as the other discs apart from the ID
2. There would only be one entry for the discs and the hash is of the disc with the unique data FF'd/00'd out. This unique data would then be listed in the comment.
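
Option 2 could be sketched roughly like this. It is only an illustration, not an existing Redump tool: the function name `masked_hashes`, the `(offset, length)` range format, and the choice of 0x00 as the fill byte are all assumptions made for the example.

```python
import hashlib
import zlib


def masked_hashes(image: bytes, unique_ranges, fill: int = 0x00) -> dict:
    """Hash a disc image after overwriting the per-disc unique bytes.

    unique_ranges is a hypothetical list of (offset, length) pairs that
    mark where the per-disc identifier lives in the image.
    """
    data = bytearray(image)
    for offset, length in unique_ranges:
        if offset < 0 or length < 0 or offset + length > len(data):
            raise ValueError(f"range ({offset}, {length}) is outside the image")
        # Overwrite the unique bytes with a constant so every copy hashes alike.
        data[offset:offset + length] = bytes([fill]) * length
    return {
        "crc32": f"{zlib.crc32(data) & 0xFFFFFFFF:08x}",
        "md5": hashlib.md5(data).hexdigest(),
        "sha1": hashlib.sha1(data).hexdigest(),
    }


# Two toy "discs" that differ only in an 8-byte identifier at offset 16.
disc_a = bytes(16) + b"ID-00001" + bytes(16)
disc_b = bytes(16) + b"ID-00002" + bytes(16)
same = masked_hashes(disc_a, [(16, 8)]) == masked_hashes(disc_b, [(16, 8)])
```

With the identifier masked, both toy images produce the same hashes. A real image would be far too large to hold in memory like this and would need to be hashed in chunks.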

It would be nice to have xdelta patches attached to entries for such items. Ideally the CRC32 for all variants would still show up in searches.
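
As a rough sketch of that idea: the example below does not use the xdelta format. It substitutes a plain list of `(offset, bytes)` runs as a stand-in patch, and the `entry` dictionary layout is invented for illustration, not how the database stores anything.

```python
import zlib


def crc32_hex(data: bytes) -> str:
    """CRC32 as 8 lowercase hex digits."""
    return f"{zlib.crc32(data) & 0xFFFFFFFF:08x}"


def make_patch(base: bytes, variant: bytes):
    """Return [(offset, bytes)] runs where variant differs from base."""
    if len(base) != len(variant):
        raise ValueError("this simple patch format needs equal-length images")
    patch, start = [], None
    for i in range(len(base) + 1):
        differs = i < len(base) and base[i] != variant[i]
        if differs and start is None:
            start = i  # a run of differing bytes begins here
        elif not differs and start is not None:
            patch.append((start, variant[start:i]))
            start = None
    return patch


def apply_patch(base: bytes, patch) -> bytes:
    """Rebuild a variant from the base image and its patch."""
    data = bytearray(base)
    for offset, chunk in patch:
        data[offset:offset + len(chunk)] = chunk
    return bytes(data)


# Toy data: a zeroed-out base image and two per-disc variants.
base = bytes(40)
variants = [
    bytes(16) + b"ID-00001" + bytes(16),
    bytes(16) + b"ID-00002" + bytes(16),
]

# Hypothetical entry: one base hash plus a patch per variant, keyed by CRC32.
entry = {"base_crc32": crc32_hex(base), "variants": {}}
for image in variants:
    entry["variants"][crc32_hex(image)] = make_patch(base, image)

# Searching by the CRC32 of any individual disc still finds the entry.
found = crc32_hex(variants[1]) in entry["variants"]
```

Keying the patches by each variant's CRC32 is what would let a search for any single disc's checksum land on the shared entry.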

All my posts and submission data are released into Public Domain / CC0.

Hiccup wrote:

This unique data would then be listed in the comment.

I don't think we can do that. If this is an identifier, then there is a reason why the developers implemented it in the first place, namely to identify the source of a leaked disc. We cannot publish such information.

There must be one or more master discs that either have no identifier set or have it set to a transparent ID.

5 (edited by user7 2019-08-02 17:16:19)

I think just as often, mismatching could be bad mastering.

For example, read the comments: http://redump.org/disc/607/

user7 wrote:

I think just as often, mismatching could be bad mastering.

That's surely a mistake; these need to be added as separate entries. Just look at the Saturn or Mega CD pre-releases - many of them have the same data and differ only in gaps or in differently shifted audio.

7 (edited by user7 2019-08-03 12:40:54)

F1ReB4LL wrote:

That's surely a mistake; these need to be added as separate entries. Just look at the Saturn or Mega CD pre-releases - many of them have the same data and differ only in gaps or in differently shifted audio.

And I wouldn't call our current handling of those Sega systems ideal.

Non-pressed discs should be treated differently - for example Total NBA 98 was handled correctly.
I have a couple of PS2 final betas I never submitted to Redump because I don't believe an 8-byte difference warrants having an entry or collecting a ROM. However, the result is that those particular discs are not accounted for in Redump.
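
To put a number on a difference like that, two dumps can be compared sector by sector. This is just a sketch: `differing_sectors` is a made-up helper, and the 2048-byte sector size is an assumption that fits ISO-style images but not raw 2352-byte CD dumps.

```python
import os
import tempfile

SECTOR_SIZE = 2048  # assumption: 2048-byte user-data sectors (ISO/DVD style)


def differing_sectors(path_a: str, path_b: str, sector_size: int = SECTOR_SIZE):
    """Return (differing_byte_count, sector_numbers) for two equal-sized dumps."""
    if os.path.getsize(path_a) != os.path.getsize(path_b):
        raise ValueError("dumps differ in size; compare layouts first")
    byte_count, sectors = 0, []
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        index = 0
        while True:
            a, b = fa.read(sector_size), fb.read(sector_size)
            if not a:
                break
            if a != b:
                # Count the individual bytes that differ inside this sector.
                byte_count += sum(x != y for x, y in zip(a, b))
                sectors.append(index)
            index += 1
    return byte_count, sectors


# Toy demo: two 4-sector images that differ by 8 bytes inside sector 2.
with tempfile.TemporaryDirectory() as tmp:
    path_a = os.path.join(tmp, "a.bin")
    path_b = os.path.join(tmp, "b.bin")
    image = bytearray(SECTOR_SIZE * 4)
    with open(path_a, "wb") as f:
        f.write(image)
    image[SECTOR_SIZE * 2 + 100:SECTOR_SIZE * 2 + 108] = b"\x01" * 8
    with open(path_b, "wb") as f:
        f.write(image)
    result = differing_sectors(path_a, path_b)
```

Knowing which sectors differ also makes it easier to tell whether the change sits inside a file or in padding outside the filesystem.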

for example Total NBA 98 was handled correctly

It wasn't. You shouldn't look at the checksums/entries from the pokerom collector's view.

If you don't treat betas differently, you could have hundreds of entries for the same builds with a few byte differences just because the burner used some crap software / poor quality discs.

Betas are betas. If they were burned that way, then yes, we need the hundreds of entries to document them properly. Either that, or we ignore them and don't add them at all.

I have to agree with F1ReB4LL on this.
When you are dumping/doing preservation work, you should not be hacking things into compliance or to "intended" values, even if that feels convenient and/or reduces the number of dumps. You are supposed to be documenting the disc itself (not just the data that is on the disc). Nothing more, nothing less. If the disc is crap, you have to document a crap disc, not make it look nice and proper.

In my opinion, any other approach would be exactly the same big no-no as an archaeologist discarding a fragment of bone or pottery because he already has thousands of very similar fragments, or correcting a grammatical error in an ancient script because what was actually written and what was intended to be written were not the same.

I agree that this approach may feel counterintuitive to the layman and that it is even pointless from a gamer's or ROM collector's point of view. However, preservation is just coincidentally useful to these groups and should not go out of its way to cater for such needs. Documenting discs is the point here. At least that's how I understand the project.

Redump already modifies data by not using rawdump. We're organizing data in a useful way. Fixing bad mastering falls in line with that. Just note and offer patches for the bad parts.

13 (edited by Hiccup 2019-08-08 08:23:01)

Regarding the sensitivity of the data: that's a valid concern, but only for recent discs. And the data should still be stored, but kept private for a while.

I don't think the rawdump comparison is valid. Redump dumping methods are "natural"/replicable ways of interpreting the data - not something that is done manually to fix something that is seen as a mistake.

14 (edited by reentrant 2019-08-06 17:17:48)

Redump already modifies data by not using rawdump

The best way to preserve the disc is to scan it with a microscope. Period :)

Just to be clear:

The discs that prompted me to post this thread are not bad burns. They intentionally have a file with a random number in it. That is, each disc was technically a custom one-off burn created by a DVD-burning station, probably with a robot disc changer so the publisher could make a stack of a few hundred quickly.

They are also not otherwise identical to the release version of the game. Some of them are from months before the release build, and others have debug functionality enabled.

In other words, even without making other assumptions, these are unique versions of the games. There were likely never versions of these discs made that didn't each have a unique ID file.

user7, I have a few like that as well, however, I err on the side of submitting them because of the somewhat-recent discovery that some PS2 master discs had prototype data used as disc padding. That is, even if the game data looks virtually identical, there may be something unusual that's not visible in the filesystem.