Hey everyone

I've been advised to ask iR0b0t for a database query/dump.

I'm trying to do a comparison between submitted Total CRC-32 values for dumps against dumps that I converted to CloneCD format.

I'm hoping to get a csv with the following info:

1. Title
2. Disc Number/Total Discs for title
3. System
4. Region
5. Serial
6. Build Date
7. Version
8. Edition
9. Bar Code
10. Total CRC-32

I primarily need this for Sega Saturn discs, but it would be wonderful to get output like this for all discs I might dump in the future.

Thank you

I don't get notifications from this forum, and i don't often check PMs here as a result. If you need to reach me, please just email me or contact F1ReB4LL.

10. Total CRC-32

I cannot do a database dump of it. Total CRC-32 is not stored in the database; it's generated on the fly each time you review a disc page.

PX-760A (+30), PX-W4824TA (+98), GSA-H42L (+667), GDR-8164B (+102), SH-D162D (+6), SOHD-167T (+12)

iR0b0t wrote:

10. Total CRC-32

I cannot do a database dump of it. Total CRC-32 is not stored in the database; it's generated on the fly each time you review a disc page.

Every time you open the page, or every time you edit the page? If the latter, shouldn't it be saved somewhere?

iR0b0t wrote:

10. Total CRC-32

I cannot do a database dump of it. Total CRC-32 is not stored in the database; it's generated on the fly each time you review a disc page.

Hmm, this presents a problem. We're working on building a scraper for this, but had really hoped grabbing the CRC-32 would be far easier for the verification process.

I don't get notifications from this forum, and i don't often check PMs here as a result. If you need to reach me, please just email me or contact F1ReB4LL.

5 (edited by Jackal 2019-04-24 05:32:30)

A Murder of Crows wrote:
iR0b0t wrote:

10. Total CRC-32

I cannot do a database dump of it. Total CRC-32 is not stored in the database; it's generated on the fly each time you review a disc page.

Hmm, this presents a problem. We're working on building a scraper for this, but had really hoped grabbing the CRC-32 would be far easier for the verification process.

If this is for a personal list of sorts, then you're better off making one manually in Excel (I've been doing the same to keep track of my IBM PC collection). There would be no benefit in automating anything; you only have to update the list when a new dump is added to the db.

F1ReB4LL wrote:

Every time you open the page, or every time you edit the page?

It's the first one.

PX-760A (+30), PX-W4824TA (+98), GSA-H42L (+667), GDR-8164B (+102), SH-D162D (+6), SOHD-167T (+12)

Jackal wrote:

If this is for a personal list of sorts, then you're better off making one manually in Excel (I've been doing the same to keep track of my IBM PC collection). There would be no benefit in automating anything; you only have to update the list when a new dump is added to the db.

Hardly. A friend and I have developed a tool to automatically convert Redump images from Bin+Cue to CloneCD format, primarily because Bin+Cue is not supported by some ODEs and it makes little sense to maintain 2 separate image sets. Also, converting 2000+ images takes a LOT of time (approximately 50 hours the last time I did it), and each time we still run into certain errors, indicating that something in the conversion process is going wrong, but not on all images.

What I want to do is add CRC-32 checking after the image conversion process completes, on a per-image basis, and flag the images that don't match.
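A per-image check along these lines could be sketched in Python as below. This is only an illustration, not the conversion tool described in this post: `file_crc32`, `flag_mismatches`, and the `expected` mapping are hypothetical names, and it assumes a reference CRC-32 is already known for each converted image file.

```python
import zlib


def file_crc32(path, chunk_size=1 << 20):
    """Return the CRC-32 of a file as an 8-digit lowercase hex string."""
    crc = 0
    with open(path, "rb") as handle:
        # Feed the file through zlib.crc32 in chunks so a large image
        # never has to fit in memory at once.
        while chunk := handle.read(chunk_size):
            crc = zlib.crc32(chunk, crc)
    return f"{crc & 0xFFFFFFFF:08x}"


def flag_mismatches(expected):
    """Yield (path, actual_crc) for every image whose CRC-32 differs.

    `expected` maps an image path to the CRC-32 hex string it should have
    (hypothetical input; where the reference values come from is up to you).
    """
    for path, wanted in expected.items():
        actual = file_crc32(path)
        if actual != wanted.lower():
            yield path, actual
```

Anything yielded by `flag_mismatches` would be a candidate for re-conversion or a closer look.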

Either way, I do not want to manually visit 2000+ pages to pull a single point of data, much less 10 points.

There certainly is benefit to automating. Even if a new dump is added to the DB, the date added can be checked, and if it is earlier than the last time the tool was run, no update is needed for that title. Otherwise, snag the updated info and flag it for attention.

There are very few tasks I can think of where automation isn't welcome. Heck, even though I haven't been able to dump in a while due to various issues, the only reason I can dump at all is because my friend and I looked at the dump process and developed tools to automate creating the command for DIC as well as moving and zipping the resulting files.

Anyway, it appears that using a scraper is currently the only way to obtain the CRC-32 info. I hope that changes in the future, as verifying a successful and accurate transition between formats is pretty important to preservation efforts and to preventing inaccurate images from getting out into the wild.

I don't get notifications from this forum, and i don't often check PMs here as a result. If you need to reach me, please just email me or contact F1ReB4LL.

8 (edited by wiggy2k 2019-04-24 17:47:20)

@AMOK.

Might be worth having a look at the code that combines the CRC32 values, if iR0b0t is willing to share it, or writing something from scratch. Pretty sure it can't be that hard, and you could script something (Python/PHP?) to generate the combined CRC32 from the DAT files themselves. At least that way you won't need to scrape the pages in cases of corrections / updates.
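For reference, pulling the per-track values out of a DAT might look like this minimal Python sketch. It assumes the DAT is Logiqx-style XML with `<game>` elements containing `<rom name="..." size="..." crc="..."/>` entries, and that only the `.bin` entries (not the `.cue`) count toward a total. `tracks_from_dat` and the sample entries are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Made-up sample in the assumed DAT layout; not real database values.
SAMPLE_DAT = """
<datafile>
  <game name="Example Disc">
    <rom name="Example Disc.cue" size="100" crc="aaaaaaaa"/>
    <rom name="Example Disc (Track 1).bin" size="2352" crc="0123ABCD"/>
    <rom name="Example Disc (Track 2).bin" size="4704" crc="89abcdef"/>
  </game>
</datafile>
"""


def tracks_from_dat(dat_xml):
    """Map each game name to its ordered list of (crc_hex, size) tuples."""
    root = ET.fromstring(dat_xml)
    games = {}
    for game in root.iter("game"):
        tracks = []
        for rom in game.iter("rom"):
            name = rom.get("name", "")
            # Assumption: the cue sheet is not part of the disc data,
            # so only the track .bin files are kept.
            if not name.lower().endswith(".bin"):
                continue
            tracks.append((rom.get("crc").lower(), int(rom.get("size"))))
        games[game.get("name")] = tracks
    return games


print(tracks_from_dat(SAMPLE_DAT))
```

The per-track CRC and size pairs are the inputs a CRC-combining routine would need; the track order in the DAT is assumed to be the track order on the disc.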

Slightly off topic:

From the sounds of it you are probably heavily invested in the Phoebe/Rhea ODE, but it may be worth a look at the upcoming Satiator from Professor Abrasive.

I am currently awaiting the beta board for testing.

One of the main advantages is that his cue parser is specifically designed for split bin/cue images, along with leaving the actual optical drive untouched, obviously. Plus, there is nothing in Saturn subs that isn't covered by the redump .cue.

Yeah. This function should be similar across different scripts and is very fast in execution, which is why I chose on-the-fly calculation over storage. Of course, it has the disadvantage that one cannot quick search for total CRC32 values.

Even though it could be useful in some situations to have an extra table of "total crc32" values, such as for the quick search function, I am still not sure we need that, because one won't be able to compare the "total crc32" values of foreign images due to the offset from uncorrected audio tracks.

PX-760A (+30), PX-W4824TA (+98), GSA-H42L (+667), GDR-8164B (+102), SH-D162D (+6), SOHD-167T (+12)

wiggy2k wrote:

@AMOK.

Might be worth having a look at the code that combines the CRC32 values, if iR0b0t is willing to share it, or writing something from scratch. Pretty sure it can't be that hard, and you could script something (Python/PHP?) to generate the combined CRC32 from the DAT files themselves. At least that way you won't need to scrape the pages in cases of corrections / updates.

Slightly off topic:

From the sounds of it you are probably heavily invested in the Phoebe/Rhea ODE, but it may be worth a look at the upcoming Satiator from Professor Abrasive.

I am currently awaiting the beta board for testing.

One of the main advantages is that his cue parser is specifically designed for split bin/cue images, along with leaving the actual optical drive untouched, obviously. Plus, there is nothing in Saturn subs that isn't covered by the redump .cue.

I'm well aware of JHL's project and am one of its biggest critics, as it doesn't allow for use of every title on the system (anything where the MPEG card is optional or required can't be used). While there aren't a lot of detractors out there, my understanding is that he's not going to be implementing an MPEG card in the unit, and therefore the unit misses the point for me. In my mind, any ODE product should enhance the capabilities and user experience of the original product without reducing compatibility or functionality.

While the Rhea/Phoebe does remove the ability to play optical discs, I have yet to run into a single disc image it can't handle that hasn't been fixed in some way (I believe there were a couple of discs early on that couldn't play, but to my knowledge those have since been fixed via firmware). This seems a fair trade to me. Though I'd much prefer the ability to switch easily between using the ODE and using the original drive, I'm not willing to sacrifice titles out of the library for that feature.

In any case, I'm still much more of a fan of using CCD over BIN/CUE.

We're going to continue work on the scraper for now, but yes, having an alternative method of getting the total CRC-32 without needing to build the scraper would probably be ideal.

Also, I'm in agreement that the need I have doesn't necessarily justify an additional table, but I'm not exactly sure how one could convert the existing bin/cue sets over to CCD in batch and verify that the images retained integrity without doing what I'm doing. It would be a different story if a CCD-based set already existed, or if the raw files generated by DIC were what was distributed instead of bin/cue, but for end users that would likely be a nightmare.

So yes, I'm open to any and all suggestions to make this process more self-contained. No matter what, some online update is required (either a new dat file must be downloaded and compared, or the site needs to be date checked and re-scraped each time a new disc is added).

I don't get notifications from this forum, and i don't often check PMs here as a result. If you need to reach me, please just email me or contact F1ReB4LL.

Am I able to get any forward momentum on at least the calculation, for using the dats as a source?

I don't get notifications from this forum, and i don't often check PMs here as a result. If you need to reach me, please just email me or contact F1ReB4LL.

A Murder of Crows wrote:

Am I able to get any forward momentum on at least the calculation, for using the dats as a source?

Sure!

PX-760A (+30), PX-W4824TA (+98), GSA-H42L (+667), GDR-8164B (+102), SH-D162D (+6), SOHD-167T (+12)

iR0b0t wrote:

Yeah. This function should be similar across different scripts and is very fast in execution, which is why I chose on-the-fly calculation over storage. Of course, it has the disadvantage that one cannot quick search for total CRC32 values.

Even though it could be useful in some situations to have an extra table of "total crc32" values, such as for the quick search function, I am still not sure we need that, because one won't be able to compare the "total crc32" values of foreign images due to the offset from uncorrected audio tracks.

I am also interested in seeing the total CRC32 stored in the db, as it is an easy way to quick search this data when you make new dumps.
It is really useful to check already dumped discs against this CRC32, as sometimes the serial of the disc has not yet been added to the db and the title is not correct (or you don't know how to write it if it is in a foreign language), so the CRC is the most relevant data to search for to know whether a dump already exists.
Then you could compare the CRC32 of all the tracks to know whether you need to submit a new dump or just a verification.

So I think it could be a good feature, and useful.

Saturn Database {-} Retro Deals search engine that helps you find stuff (plextor drives, games, etc.) easily on eBay {-} My Redump Logs

While we're here, @iR0b0t, I didn't see anything back from you about the calculation. I'm stuck in limbo on at least one project until we make progress with Total CRC32.

If we skip that one value and find a way to add it in later, I'd still like the rest of the data dump.

I don't get notifications from this forum, and i don't often check PMs here as a result. If you need to reach me, please just email me or contact F1ReB4LL.

What do you want me to do? I thought you were going to use the dats, as stated here:

A Murder of Crows wrote:

Am I able to get any forward momentum on at least the calculation, for using the dats as a source?

PX-760A (+30), PX-W4824TA (+98), GSA-H42L (+667), GDR-8164B (+102), SH-D162D (+6), SOHD-167T (+12)

I apologize, I thought my question was fairly straightforward.

I don't know how you're calculating the total CRC-32. My attempts at using various calculators to take 2 or more CRC values from the dat/webpage and sum them up don't produce the value you have as the total CRC.
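To illustrate why summing doesn't work: a CRC-32 is not additive, so adding per-track values will not reproduce the CRC of the tracks taken together. The Python sketch below assumes the total is the CRC-32 of the track data read back to back in track order, which is an assumption about the site's calculation; `total_crc32` is a hypothetical helper and the byte strings stand in for real track files.

```python
import zlib


def total_crc32(tracks):
    """CRC-32 of several byte strings treated as one continuous stream."""
    crc = 0
    for data in tracks:
        # Passing the previous value continues the same running CRC,
        # exactly as if the pieces had been concatenated first.
        crc = zlib.crc32(data, crc)
    return crc & 0xFFFFFFFF


track1 = b"1234"   # stand-in for the first track's bytes
track2 = b"56789"  # stand-in for the second track's bytes

combined = total_crc32([track1, track2])
naive_sum = (zlib.crc32(track1) + zlib.crc32(track2)) & 0xFFFFFFFF

print(f"running CRC over both tracks: {combined:08x}")
print(f"CRC of the concatenation:     {zlib.crc32(track1 + track2):08x}")
print(f"sum of the two track CRCs:    {naive_sum:08x}")
```

The first two printed values agree by construction; the third is just arithmetic on two unrelated checksums. Getting the same total from the per-track CRCs alone, without the track data, additionally needs each track's length.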

I'm not interested in reinventing the wheel here. I need a solution that allows me to check the site pretty much any time, see whether there are changes or updates to any disc (not just Saturn, mind you), and, if there is a change against what was last checked/calculated, obtain the total CRC again and verify it against the dumps I currently have for that title.

I'd MUCH prefer some semi-static listing of Total CRC, either in the database or in the dat, but upon opening the dat, I saw the value wasn't listed there either. I don't want to factor your on-the-fly calculation into my programs/scripts, as doing that will add an extra potential point of failure. If you won't incorporate Total CRC as a static stat into either the dat or the webpages, I've got no choice but to do that on my own, but I would still rather be handed the code/algorithm than try to "figure it out".

And yes, the database dump would also be super awesome, since a lot of the requested data isn't in the dat file anyway.

Thank you!

I don't get notifications from this forum, and i don't often check PMs here as a result. If you need to reach me, please just email me or contact F1ReB4LL.

A Murder of Crows wrote:

I'm hoping to get a csv with the following info:

1. Title
2. Disc Number/Total Discs for title
3. System
4. Region
5. Serial
6. Build Date
7. Version
8. Edition
9. Bar Code

Dump attached.

Post's attachments

saturn.csv 239.81 kb, 16 downloads since 2019-06-18 

PX-760A (+30), PX-W4824TA (+98), GSA-H42L (+667), GDR-8164B (+102), SH-D162D (+6), SOHD-167T (+12)

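/* GF(2) helpers, assumed to match the same-named routines in zlib's crc32.c */

/* multiply GF(2) matrix $mat by bit vector $vec */
function gf2_matrix_times($mat, $vec) {
    $sum = 0;
    for ($i = 0; $vec != 0; $vec >>= 1, $i++) {
        if ($vec & 1)
            $sum ^= $mat[$i];
    }
    return $sum;
}

/* store $mat * $mat in $square */
function gf2_matrix_square(&$square, $mat) {
    for ($n = 0; $n < 32; $n++)
        $square[$n] = gf2_matrix_times($mat, $mat[$n]);
}

function crc32_combine($crc1, $crc2, $len2) {
    /* nothing appended: the pool is unchanged */
    if ($len2 <= 0)
        return $crc1;

    $even = array();
    $odd = array();

    /* put operator for one zero bit in odd */
    $odd[0] = 0xedb88320; /* CRC-32 polynomial */
    $row = 1;
    for ($n = 1; $n < 32; $n++) {
        $odd[$n] = $row;
        $row <<= 1;
    }

    /* operator for two zero bits in even, then four zero bits in odd */
    gf2_matrix_square($even, $odd);
    gf2_matrix_square($odd, $even);

    do {
        /* apply zeros operator for this bit of len2 */
        gf2_matrix_square($even, $odd);
        if ($len2 & 1)
            $crc1 = gf2_matrix_times($even, $crc1);
        $len2 >>= 1;

        /* if no more bits set, then done */
        if ($len2 == 0)
            break;

        /* another iteration of the loop with odd and even swapped */
        gf2_matrix_square($odd, $even);
        if ($len2 & 1)
            $crc1 = gf2_matrix_times($odd, $crc1);
        $len2 >>= 1;
    } while ($len2 != 0);

    /* fold in the CRC of the appended track */
    $crc1 ^= $crc2;
    return $crc1;
}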

$crc1 => pool (the combined CRC-32 of the tracks added so far)
$crc2 => CRC-32 of the track to be added to the pool
$len2 => length in bytes of the track with $crc2
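For anyone scripting this outside PHP, here is a rough Python port of the function above. The two GF(2) helpers are written to behave like the same-named routines in zlib's crc32.c, which is an assumption about what the site's code does; `combine_tracks` is a hypothetical convenience wrapper that folds per-track (CRC, size) pairs into one total, starting from an empty pool of 0.

```python
import zlib

GF2_DIM = 32  # a CRC-32 value is a vector of 32 bits


def gf2_matrix_times(mat, vec):
    """Multiply a GF(2) matrix (list of 32 column ints) by a bit vector."""
    total = 0
    index = 0
    while vec:
        if vec & 1:
            total ^= mat[index]
        vec >>= 1
        index += 1
    return total


def gf2_matrix_square(mat):
    """Return mat * mat over GF(2)."""
    return [gf2_matrix_times(mat, mat[n]) for n in range(GF2_DIM)]


def crc32_combine(crc1, crc2, len2):
    """CRC-32 of A followed by B, from crc32(A), crc32(B) and len(B) in bytes."""
    if len2 <= 0:
        return crc1  # appending nothing leaves the pool unchanged

    # Operator that advances a CRC over one zero bit.
    odd = [0xEDB88320] + [1 << n for n in range(GF2_DIM - 1)]
    even = gf2_matrix_square(odd)  # two zero bits
    odd = gf2_matrix_square(even)  # four zero bits

    # Apply len2 zero bytes to crc1, one bit of len2 at a time.
    while True:
        even = gf2_matrix_square(odd)
        if len2 & 1:
            crc1 = gf2_matrix_times(even, crc1)
        len2 >>= 1
        if len2 == 0:
            break
        odd = gf2_matrix_square(even)
        if len2 & 1:
            crc1 = gf2_matrix_times(odd, crc1)
        len2 >>= 1
        if len2 == 0:
            break

    return crc1 ^ crc2


def combine_tracks(tracks):
    """Fold (crc, size) pairs, in track order, into one total CRC-32."""
    pool = 0
    for crc, size in tracks:
        pool = crc32_combine(pool, crc, size)
    return pool


# Sanity check against zlib on in-memory data.
a, b = b"hello ", b"world"
print(crc32_combine(zlib.crc32(a), zlib.crc32(b), len(b)) == zlib.crc32(a + b))
```

With CRCs parsed from hex strings, a call might look like `combine_tracks([(int(crc_hex, 16), size), ...])`, with the tracks listed in disc order.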

PX-760A (+30), PX-W4824TA (+98), GSA-H42L (+667), GDR-8164B (+102), SH-D162D (+6), SOHD-167T (+12)