1 (edited by user7 2018-05-01 05:27:47)

Just to establish this as a topic of discussion...

Pioneer LaserActive games were released on a variant of the LaserDisc format. LaserActive games included both analog video and digital content. I presume this means it's theoretically possible to do a perfect "dump" of the digital data. I can only assume this would require extensive and expensive hardware and software hacking, only doable by a LaserActive-specialized genius. But if such a person exists out there... they should probably know that we would love to chat about making some perfect LaserActive dumps.

More info + a missing-discs list: http://wiki.redump.org/index.php?title= … sing_Discs

Some interesting discussion threads on other forums:
http://gendev.spritesmind.net/forum/vie … amp;t=1647
http://forum.lddb.com/viewtopic.php?f=32&t=2671

All my posts and submission data are released into Public Domain / CC0.

2 (edited by user7 2019-04-05 06:14:12)

I caught up a bit with the folks from the Domesday project about how the digital portion of LD dumping could be adapted for redump:

#domesday86: Domesday86 - Recreating the Domesday Experience - https://domesday86.com/
[20:19] == HSmith [b9f55717@gateway/web/freenode/ip.185.245.87.23] has joined #domesday86
[20:21] == Morska [3f9b0134@gateway/web/cgi-irc/kiwiirc.com/ip.63.155.1.52] has joined #domesday86
[20:21] <HSmith> Hi guys, i'm dropping by from Redump.org disc preservation project. A few of us are trying to find out the basics on verifiability of dumping.
[20:22] <HSmith> For example, I would presume the analog data would result in different hashes for each dump?
[20:22] <simon_dd86> Morning, and welcome :)
[20:22] <HSmith> Hi :)
[20:23] <HSmith> I was reading here http://gendev.spritesmind.net/forum/viewtopic.php?f=17&t=1647&sid=1e086ebe155369010d8b54a7f32e3b69&start=60#p34937
[20:23] <simon_dd86> yes; every dump will have variance due to the player, the disc and the mechanics
[20:23] <HSmith> It sounds like from that link that the digital portion is very similar to a giant CD-Rom lead-in (with the TOC), pre-gap, track data, lead-out, full subcodes
[20:23] <HSmith> Presumably the digital portion would give verifiable dumps, unless the read offset differs between machines.
[20:24] <simon_dd86> the digital audio on a laserdisc is encoded using EFM; once decoded it would give a verifiable hash for the digital audio part of the RF signal - but the laserdisc is still analogue, so the EFM is modulated into the main analogue signal
[20:25] <simon_dd86> (i.e. there are no zeros and ones on the actual laserdisc - just a single analogue 'track' which is the modulated sum of all the content on the disc)
[20:26] <HSmith> interesting, in regards to LaserActive, i wonder how it could have the CDRom parts without 0's
[20:26] <simon_dd86> I've tried to explain this in a 'tech-light' version here: https://www.domesday86.com/?page_id=1379
[20:26] <simon_dd86> well, that I can explain :)
[20:27] <simon_dd86> on a cdrom, when the laser passes over the pits and gaps, the light falls in and out of the pits - this doesn't generate a square wave - it generates a sine wave
[20:28] <simon_dd86> the sinewave is generally run through a Phase-Locked Loop which times the sine wave half-periods and 'estimates' it back into a square wave of 0 and 1s
[20:29] <simon_dd86> because of that, you can simply record a sine wave on some non-CD medium; provided you can get the sine wave back, you can PLL it and get back a square wave
[20:30] <simon_dd86> and that's what an LD does.. the sine wave is basically 'audio' encoded into the modulated signal.  The player filters the part with the EFM sine wave to recover it from the main signal - and then runs it through a PLL to get the EFM back
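
(Aside: to illustrate the PLL step simon describes above, here's a rough Python sketch of turning the filtered EFM waveform back into channel bits by timing zero crossings and rounding to whole bit periods. Purely my own toy illustration, not ld-decode code; the 4.3218 Mbit/s channel-bit rate is the standard 1x CD figure.)

[code]
import numpy as np

def efm_channel_bits(samples, sample_rate, bit_rate=4_321_800):
    """Toy zero-crossing 'PLL': measure the spacing between zero crossings
    of the filtered EFM waveform and round each run to a whole number of
    channel-bit periods (legal EFM runs are 3T..11T)."""
    t = sample_rate / bit_rate                            # samples per channel bit (one T)
    crossings = np.flatnonzero(np.diff(np.signbit(samples).astype(np.int8)))
    bits = []
    for run in np.diff(crossings):
        n = int(round(run / t))                           # run length in T
        n = min(max(n, 3), 11)                            # clamp to legal EFM run lengths
        bits.extend([1] + [0] * (n - 1))                  # NRZI: each transition is a 1
    return bits
[/code]
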
[20:31] <HSmith> i'm thinking in terms of verifiability how it would make the most sense to separate the data into what's 'verifiable' and what's not. my understanding might be too rudimentary, but...
[20:32] <HSmith> if the digital data is verifiable across different machines etc, then it may make sense to store it as a separate "track" from the analog data
[20:32] <HSmith> at the least, the digital data could be verified to be perfect.
[20:32] <HSmith> with discs we do this as well. two users will dump a disc and compare hashes
[20:33] <HSmith> perhaps the closest analogy is a GD-Rom, which is two discs in one. http://redump.org/disc/19672/
[20:33] <HSmith> the two disc portions are glued together in a CUE with REM
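
(Aside: for anyone who hasn't seen one, a redump GD-ROM cue looks roughly like this - both density areas sit in one cue, separated by REM lines. Filenames and times below are made up for illustration.)

[code]
REM SINGLE-DENSITY AREA
FILE "track01.bin" BINARY
  TRACK 01 MODE1/2352
    INDEX 01 00:00:00
FILE "track02.bin" BINARY
  TRACK 02 AUDIO
    INDEX 00 00:00:00
    INDEX 01 00:02:00
REM HIGH-DENSITY AREA
FILE "track03.bin" BINARY
  TRACK 03 MODE1/2352
    INDEX 01 00:00:00
[/code]
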
[20:35] <HSmith> perhaps i'm putting the cart before the horse, seeing as there's no LaserActive emulator yet, but i'd prefer to start off preservation on the right foot, figuring out what makes sense from a structural point of view as to separating out the data into "tracks" via cue, referring to the LD TOC
[20:36] <simon_dd86> well, even that isn't technically a preservation of the CD though... it's a preservation of the data represented on the disc. A CD is still EFM (which is an analogue sine wave coming off the disc) and every disc will vary.
[20:36] <HSmith> luckily CD-Rom has a c2 error detection mechanism, but verifiability via multiple dumpers has helped us prove the integrity of our database entries as well
[20:37] <simon_dd86> once you PLL the EFM, convert it (using 14 to 8 bit EFM decoding)... then CIRC error correct it - you have 'data' which should be the same between discs
[20:37] <simon_dd86> you can do the same trick with LD EFM, but not with the rest of the LD contents
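
(Aside: the channel stream simon is talking about is organised into 588-bit frames before that 14-to-8 lookup happens. A small Python sketch of how one frame breaks down - again just my own illustration of the layout, not ld-decode code.)

[code]
def split_f3_frame(channel_bits):
    """Split one 588-bit EFM frame (a string of '0'/'1') into its parts:
    a 24-bit sync pattern, then 33 symbols of 14 channel bits, each symbol
    preceded by 3 merging bits. Symbol 0 carries the subcode (control) byte;
    the other 32 carry data and CIRC parity."""
    assert len(channel_bits) == 588
    sync = channel_bits[:24]
    pos = 24
    symbols = []
    for _ in range(33):
        pos += 3                                  # skip the 3 merging bits
        symbols.append(channel_bits[pos:pos + 14])
        pos += 14
    # the final 3 merging bits (positions 585..587) lead into the next frame's sync
    return sync, symbols[0], symbols[1:]
[/code]
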
[20:38] <HSmith> i wonder if it would be beneficial to an emulator to separate the digital vs analog data with a cue system for LDs
[20:38] <HSmith> it would seem, from a preservation standpoint, it would be better to make the separation - so what data can be verified would be verified by different dumpers
[20:39] <simon_dd86> technically, you can actually capture the raw RF of a CD and convert it from there - and know exactly the condition of the disc.  I say 'technically' because I've done it (albeit with very beta software)
[20:39] <HSmith> would that be the same as "scrambled" data?
[20:39] <simon_dd86> well, from an emulation stand point, separating things makes sense; but it's not a 'preserved copy' (of either the CD or the LD)
[20:40] <simon_dd86> scrambling is one stage of the EFM decoding process... there is EFM demod, C1 CIRC, C2 CIRC, descrambling and de-interleaving
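
(Aside: the descrambling stage at least is completely documented - ECMA-130 Annex B - and is just an XOR against a fixed LFSR sequence. A quick Python sketch, my own illustration:)

[code]
def scramble_table(length=2340):
    """ECMA-130 Annex B scrambler sequence: a 15-bit LFSR (x^15 + x + 1),
    seeded with 1, clocked out LSB first, one byte per data byte."""
    lfsr = 0x0001
    out = bytearray(length)
    for i in range(length):
        b = 0
        for bit in range(8):
            b |= (lfsr & 1) << bit
            fb = (lfsr ^ (lfsr >> 1)) & 1        # feedback taps for x^15 + x + 1
            lfsr = (lfsr >> 1) | (fb << 14)
        out[i] = b
    return bytes(out)

TABLE = scramble_table()

def descramble_sector(sector):
    """XOR bytes 12..2351 of a raw 2352-byte sector against the table; the
    12-byte sync pattern is left alone. Scrambling and descrambling are the
    same operation."""
    assert len(sector) == 2352
    return sector[:12] + bytes(x ^ y for x, y in zip(sector[12:], TABLE))
[/code]
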
[20:40] <HSmith> presumably if you can get the digital portion to verify across multiple dumps, then at least those 'tracks' would be preserved
[20:41] <simon_dd86> well, the data in those tracks would be preserved :)  it's a matter of perspective...  you can't recreate the original disc from the decoded data
[20:41] <simon_dd86> but it is a matter of technicality
[20:42] <simon_dd86> to recreate the original disc, you would need the original EFM data - from a 'usability' point of view, it's all the same...  from a hardcore preservation point of view... it's not :)
[20:43] <HSmith> my interest would be more creating what we're doing with redump, but with laserdiscs
[20:43] <HSmith> useful, track based, verifiable descrambled data
[20:44] <simon_dd86> then, whatever you can do with a CD, you can do with the digital track of a LaserDisc
[20:45] <simon_dd86> still, RF decoding of CDs is interesting because it allows much more intelligent recovery of a disc (with a lot more information about the state of the original recorded signal)
[20:45] <simon_dd86> but we are (software wise) a little way off that at the moment
[20:45] <simon_dd86> if you have software engineering experience, the current EFM decoder for the ld-decode project is here: https://github.com/happycube/ld-decode/tree/rev5/tools/ld-process-efm
[20:46] <HSmith> i'm afraid i'm more of a coordination guy
[20:46] <simon_dd86> that software can do both audio and data recovery from both CDs and LDs
[20:46] <HSmith> but i'll pass this info along
[20:46] <HSmith> Do you know if the LDs would have offsets, similar to CD-Roms?
[20:46] <simon_dd86> the missing piece is the front-end RF signal capture and demod, which @happycube is working on
[20:47] <simon_dd86> by offsets, you mean table of content pointers to the tracks?
[20:47] <HSmith> http://www.accuraterip.com/driveoffsets.htm
[20:48] <simon_dd86> ah ok
[20:48] <HSmith> By correcting the write offset on discs with audio tracks, we're able to match hashes of two different discs with the same content but different write offsets
[20:49] <simon_dd86> the processing lag of the player is a function of the actual hardware that decodes it (and the ECMA spec doesn't state what it should be) - so in this case the lag is generated by the software I just linked
[20:49] <HSmith> Thanks for all your help.
[20:49] <simon_dd86> right now it doesn't resync, but I have a plan to add it in
[20:49] <HSmith> Have you had the chance to look at a TOC of a LaserActive disc? I'm curious if there would be more than one Track.
[20:50] <simon_dd86> so, once it's in a release state, I'll be able to measure the 'offset' between the subcode channel timecode information and the actual audio data (because that is what you are really measuring with the 'offset')
[20:51] <HSmith> the offset basically is a means to properly split multiple "tracks", but if LDs only have a single track in their TOC (like 3DO), then write offset correction probably wouldn't be relevant afaik
[20:51] <simon_dd86> LD can have (and does have) multiple tracks, they follow the chapters of the disc.  Also LD can have sections with and without EFM data so, unlike a CD, the data may not be continuous across the disc
[20:52] <simon_dd86> with a software decode you can track the difference between the subcode times and the track times - only for data discs; audio discs only have the timing information in the subcode channel.... it's all a bit confusing
[20:53] <simon_dd86> even if you measure the player (like you did in your table) there is no rule that says it will be right for every disc though :/
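
(Aside: on the CD side, offset correction really is just sliding the whole 16-bit stereo sample stream by N samples before splitting at the track boundaries. A minimal sketch, with the sign convention picked arbitrarily:)

[code]
def apply_offset(audio_bytes, offset_samples):
    """Shift a 16-bit stereo PCM stream by a combined read/write offset given
    in samples (1 sample = 4 bytes). Positive offsets drop samples from the
    start and pad the end with silence, negative offsets do the reverse; the
    total length stays the same. The sign convention here is arbitrary."""
    shift = offset_samples * 4
    if shift >= 0:
        return audio_bytes[shift:] + b"\x00" * shift
    return b"\x00" * (-shift) + audio_bytes[:shift]
[/code]
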
[20:53] <HSmith> Thanks, i'll pass this conversation along to some people in my camp who are smarter than me.
[20:55] <HSmith> i'm hoping we can design a track-based system with a cue which will be useful for emulators in the future, while maintaining verifiability and gluing the analog data in via a REM or a track
[20:55] <simon_dd86> it is based on the encoding used when the disc was mastered - as well as the player that you play it on :)
[20:55] <simon_dd86> well, here's the good news... anything a real CD player/LD player can do - a software decode can also do
[20:56] <simon_dd86> only - a software decode can also do more :)
[20:56] <simon_dd86> but, at this type of level, the discussion is highly technical - the 'pre-data' bits of the processing chain are really about how discs are mastered and how a player 'plays' them - at the electronics level
[20:58] <simon_dd86> on a similar note - my intention is to software decode and (at the same time) produce a JSON file containing all the metadata about the image
[20:58] <simon_dd86> that JSON file is probably what you are getting at... all the 'extra' stuff to do with subcode channels and the like
[20:59] <happycube> heya
[20:59] <simon_dd86> morning :)
[20:59] <HSmith> I'm a bit in over my head, but i'm thinking about the smartest way to separate a LaserDisc into a track-based cue system.
[21:00] <HSmith> instead of just one bin blob of all analog/digital/subchannel data merged
[21:00] <Gamn2> the decoding is already doing that basically. it'll split it into separate files and metadata files for the video, analog audio, and EFM
[21:00] == Gamn2 has changed nick to Gamn
[21:00] <simon_dd86> it's basically what I was planning - one file containing the 'image' and another the metadata
[21:01] <HSmith> how would the parts be glued together? a cue or similar?
[21:02] <simon_dd86> you would take the JSON and the image and run it through your preferred 'convert me to this' software
[21:02] <HSmith> so you're thinking the image would contain all the tracks instead of separating them out into individual bins?
[21:03] <simon_dd86> yes, but the metadata would tell you the 'map' - so with the two, you could separate it (if that's what you want)
[21:03] <simon_dd86> the reason for this is that different uses have different requirements - so from a decode perspective we want to do as little as possible to the image
[21:04] <simon_dd86> otherwise it limits the downstream possibilities
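
(Aside: that one-image-plus-metadata idea sounds workable from our side - something like the sketch below could cut the decoded digital image into per-track bins. The JSON layout here is purely hypothetical, not whatever format ld-decode actually emits.)

[code]
import json

SECTOR = 2352  # bytes per decoded CD-style sector

def split_tracks(image_path, map_path):
    """Cut a decoded digital image into track##.bin files using a
    (hypothetical) metadata map listing each track's start and length
    in sectors."""
    with open(map_path) as f:
        layout = json.load(f)   # e.g. {"tracks": [{"number": 1, "start": 0, "sectors": 1234}, ...]}
    with open(image_path, "rb") as img:
        for trk in layout["tracks"]:
            img.seek(trk["start"] * SECTOR)
            data = img.read(trk["sectors"] * SECTOR)
            with open("track%02d.bin" % trk["number"], "wb") as out:
                out.write(data)
[/code]
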
[21:04] <HSmith> we've found that by separating Tracks, you can test the verifiability of individual tracks for matching hashes.
[21:05] <HSmith> For example, a game in two different regions may have a different data track but the same audio tracks (once write offset corrected)
[21:05] <simon_dd86> it's great until you find a source that doesn't follow the rules... which is why ld-decode is a chain of processing tools
[21:06] <simon_dd86> doing the 'bit' you are discussing, is pretty much the end of the chain - and where you will find the most 'specialization'
[21:07] <simon_dd86> so - it's perfectly possible to do (in case I didn't make that clear)
[21:08] <HSmith> again, going a bit out of my field of expertise, but i believe if you don't detect the write offset, multiple tracks may not be perfectly separated at the correct bits - especially in regards to audio tracks.
[21:09] <simon_dd86> btw - the reason why you have to offset correct and hash the individual parts is that you don't have control of the decoding process... with ld-decode the EFM to data decode is all in software... you *know* if it's right because you will see any failures in the decoding process
[21:10] <simon_dd86> even in its early form, the EFM decoder tells you the number of failed EFM to 8 bit conversions, C1 errors, C2 errors (the number of unrecovered C2s) - and then in the data, the same thing for the CIRC there too
[21:10] <HSmith> so it's like the idea of an offset isn't really relevant here. that should simplify things
[21:10] <simon_dd86> you can even see the CRC16 results from decoding the subcode channels
[21:11] <HSmith> "the number of unrecovered C2s" interesting, we use a similar software to this for disc dumping at redump. it detects the sectors the c2 errors exists in and rereads those sectors in an attempt to recover
[21:11] <simon_dd86> well the offset tells you how to split the data into tracks based on the subcode timing information - so you probably still need it; but it's not required to 'verify' the data
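
(Aside: the subcode CRC16 simon mentions is easy to check yourself - each Q-channel frame is 96 bits: control/ADR plus 72 data bits, then a 16-bit CRC. Sketch below; my understanding is that the stored CRC is the inverted CRC-16/CCITT remainder, so treat that detail as an assumption.)

[code]
def crc16_ccitt(data, poly=0x1021):
    """Plain CRC-16/CCITT, initial value 0."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            crc = ((crc << 1) ^ poly) if crc & 0x8000 else (crc << 1)
            crc &= 0xFFFF
    return crc

def q_subcode_ok(q):
    """Check one 12-byte Q-channel frame: 10 payload bytes followed by a
    2-byte CRC (stored inverted, per my reading of the spec)."""
    assert len(q) == 12
    stored = (q[10] << 8) | q[11]
    return crc16_ccitt(q[:10]) == (~stored & 0xFFFF)
[/code]
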
[21:12] <happycube> yeah - having the entire decoding path in software's gonna be nice for verification
[21:12] <simon_dd86> with a software decode, you could RF capture many copies of the same disc... and recover the C1/C2 from multiple sources...  the technical possibilities are quite exciting
[21:12] <happycube> ... and you know when you only need one ;)
[21:15] <Gamn> merging multiple copies will be a pretty cool thing on many fronts.
[21:15] <HSmith> i wonder if the track hashes domesday produces will match the methods used by Nemesis
[21:15] <simon_dd86> the duplicator produces an analogue sample - every one will be different at that level
[21:17] <Gamn> in the video space, I'm thinking of using imagemagick to do this: https://petapixel.com/2013/05/29/a-look-at-reducing-noise-in-photographs-using-median-blending/
[21:17] <Gamn> eliminate noise and dropouts in one swoop. just need enough copies.
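
(Aside: the median-blending trick Gamn links works the same way on any aligned captures - in numpy it's a one-liner once the captures are lined up, which is the hard part and is hand-waved here.)

[code]
import numpy as np

def median_stack(captures):
    """Combine several aligned captures of the same disc by taking the
    per-sample median; noise and dropouts that only appear in a minority of
    the captures fall away. 'captures' is a list of equal-length 1-D arrays."""
    return np.median(np.vstack(captures), axis=0)
[/code]
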
[21:20] <simon_dd86> from the perspective of CDs - it's the raw EFM data that's interesting; with multiple copies of the same CD pressing, you'll be able to get a perfect copy of what's on the disc's surface (at the point of mastering) - so, unlike LDs, the preservation could be really extreme :)
[21:21] <simon_dd86> there's not much advantage from a 'using it' perspective though - it really depends on the aim
[21:23] <simon_dd86> for most people it falls into the 'don't care' category - but for LDs, where your only source is analogue, it's a game of getting it 'as right as possible' - since there will never be a verifiable hash
[21:26] <simon_dd86> but, the fact that LD decoding has to be so 'extreme' is good news for CDs, since you basically get the same extreme approach for free
[21:28] <HSmith> "but for LDs, where your only source is analogue, it's a game of getting it 'as right as possible' - since there will never be a verifiable hash"
[21:29] <HSmith> which is why - imo - it's smarter to separate the verifiable data for hashing
[21:29] <HSmith> so verify what you can verify, and quarantine the rest into its own bin
[21:30] <HSmith> an emulator could use a cue to link to the analog dumped data as a separate file
[21:30] <simon_dd86> I'm not saying that can't be the case; but the decoding process doesn't have to do that bit; you take the decoded object and the metadata and then do what you like with it
[21:31] <HSmith> i'm thinking in terms of organizing the data into a database http://redump.org/disc/61151/
[21:32] <simon_dd86> it will be different depending on what you are aiming to do - in your case, it's verifying 'perfect' audio data I guess... but that has basically nothing to do with preserving a disc... just the contents of the audio portion of a disc
[21:32] <simon_dd86> it really is a matter of perspective :)  not right or wrong... but the overall aim of what you are attempting to do
[21:32] <HSmith> i'm thinking more like breaking a disc down into its individual parts
[21:33] <simon_dd86> well, again, at the EFM level, there are no parts
[21:33] <HSmith> to create a file based structural system
[21:33] <HSmith> sure, but decoded you have a TOC and data track and subchannels
[21:33] <simon_dd86> so you are already several layers above what is really (physically) on the disc
[21:33] <HSmith> so it's basically a philosophy of decoding the information and organizing it in a useful way
[21:34] <simon_dd86> I'm not saying that isn't worth while or correct @HSmith, what I'm really getting at is that you have already lost a lot of what is physically on the disc by that point
[21:34] <simon_dd86> which doesn't matter if you are concerned with preserving the encoded audio or data
[21:34] <simon_dd86> but it's not a preservation of the 'disc' - just the interpretation of the contents
[21:34] <simon_dd86> (if that makes any sense :) )
[21:35] <HSmith> i believe so, if its similar to cd-rom etc.
[21:36] <HSmith> for example, for emulation we find the data tracks, audio tracks, and structure (cue) useful, and sometimes (but rarely) subchannel data
[21:37] <simon_dd86> ok, but now we are discussing two different things.  Preservation is the act of making a perfect copy.  Emulation is the act of using a copy for something.
[21:37] <simon_dd86> The better the preservation, the more opportunity there is for the emulation (as you have a choice of more data)
[21:38] <simon_dd86> but what you do with the copy, doesn't affect how you preserve it - it's the other way around, since the preservation limits the emulation
[21:38] <HSmith> well, i think of it this way. if you dump every time and get different hashes, then how sure can you be of the integrity of your data? whereas if you separate it out into verifiable parts - at least you can be sure of the integrity of those parts
[21:39] <simon_dd86> for data - you can perform error correction and know that you have a viable copy of the information represented on the disc.  That works well for CDs
[21:39] <simon_dd86> but for LDs, only the data is verifiable - the rest isn't (video, analogue audio, VBI, etc)
[21:40] <simon_dd86> (and there are many LDs which don't contain any data at all)
[21:41] <simon_dd86> so - for LD there is never a way to be 100% certain - what you can do is sample many discs of the same pressing and compare them - but you get a degree of 'certainty' - not a '100% this is correct'
[21:41] <HSmith> Sure, i'm just thinking in terms of a DAT-able data structure, it's probably wise to separate the verifiable digital data from the analog, instead of keeping it all in one big binary blob
[21:42] <simon_dd86> since the data track on an LD is the same as a CD, you can do the same tricks - but the digital data is a very small part of the overall contents of an LD (unlike a CD where it is the only part)
[21:43] <HSmith> Understood
[21:43] <simon_dd86> I think we are agreeing here :)  We are just cross-discussing three things - CD data, LD data and LDs in general
[22:19] <HSmith> i thought of another question, i really don't know if it's off-base or not, but does the TOC identify the analog areas of an LD and what sectors(?) they fall into?
[22:19] <simon_dd86> happycube is the author of ld-decode btw... now introductions are done :)
[22:19] <happycube> i don't think it does directly
[22:19] <simon_dd86> the TOC gives a time offset into the disc
[22:19] <simon_dd86> so you know whereabouts to look for something
[22:20] <simon_dd86> (there are no 'exact' locations though)
[22:20] <HSmith> so perhaps its fair to say the TOC glues the digital tracks and the analog time-offset-locations together
[22:21] <happycube> yeah
[22:21] <simon_dd86> yes, it's an aid for the player to find something.
[22:22] <simon_dd86> of course; it relies on the mastering being correct (so the EFM and VBI time codes match)... and bitter experience has taught me that mastering is often far from perfect :)
[22:23] <simon_dd86> (VBI is the 'data' in the analogue video btw - it's in between each visible video field)
[22:29] <simon_dd86> btw @HSmith - in the ECMA-130 spec, clause 18 on page 17 you will find the following text:
[22:29] <simon_dd86> "These  Sections  are  asynchronous  with  the  Sectors,  i.e.  there  is  no  prescribed  relation between the number of the F1-Frame in which the first byte of a Sector is placed and the number of the F3-Frame inwhich the first Control byte of the table is placed. Each Section has its own table with Control bytes"
[22:30] <simon_dd86> that is the bit that refers to the 'offset' you were talking about
[22:30] <simon_dd86> the 'no prescribed relationship' bit is the reason why it's a pain in the butt :)
[22:31] <simon_dd86> generally it will be player dependent; but there is no rule that states it couldn't vary based on the mastering process
[22:31] <simon_dd86> one of these days I'm going to dig in and see what's true - not really important, but I'm curious by nature
[22:33] <simon_dd86> (the control byte is the subcode channel btw, and the sector is the actual data) - it's even worse for audio, since the spec doesn't even bother to mention it :)
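
(Aside: if I follow that last part, the 'offset' could eventually be measured in software by comparing the absolute time in the Q subcode against the MSF address in each sector header. A rough sketch of the comparison, assuming you already have a decoded 2352-byte sector and its matching 12-byte Q frame:)

[code]
def bcd(b):
    return (b >> 4) * 10 + (b & 0x0F)

def frames(m, s, f):
    """Convert an MSF time to a frame (sector) count: 75 frames per second."""
    return (m * 60 + s) * 75 + f

def sector_msf(sector):
    """MSF address from the header of a raw 2352-byte data sector (bytes 12..14, BCD)."""
    return bcd(sector[12]), bcd(sector[13]), bcd(sector[14])

def q_absolute_msf(q):
    """Absolute time from a mode-1 Q subcode frame (AMIN/ASEC/AFRAME, bytes 7..9, BCD)."""
    return bcd(q[7]), bcd(q[8]), bcd(q[9])

def subcode_skew(sector, q):
    """How many sectors the subcode timing leads or lags the data it nominally
    describes - the 'no prescribed relation' ECMA-130 is talking about."""
    return frames(*sector_msf(sector)) - frames(*q_absolute_msf(q))
[/code]
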

Unless I'm wrong, this seems entirely doable.
