For most CDs, unscrambled images are more useful day to day than scrambled ones. Fortunately, an unscrambled image can be used to regenerate the scrambled data exactly or almost exactly, with perhaps a few KB of differences from what the drive actually returned. Sectors without any errors should re-scramble to exactly the data received from the drive. The differences come from sectors with intentional errors or incorrect mastering that were replaced with dummy sectors during the initial descrambling step; for those, the re-scrambled data will not match the original scrambled data.
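For reference, the re-scrambling step itself is small. Here is a minimal sketch in Python of the ECMA-130 Annex B scrambler: a 15-bit shift register with polynomial x^15 + x + 1, preset to 0x0001, XORed over bytes 12 to 2351 of each 2352-byte sector, leaving the 12 sync bytes alone. The function names are my own, and it assumes the image is already sector-aligned, so it ignores any drive or write offset that a raw scrambled dump may include.

```python
SECTOR_SIZE = 2352   # raw CD sector size in bytes
SYNC_SIZE = 12       # sync pattern, which is not scrambled


def build_scramble_table() -> bytes:
    """Generate the 2340-byte XOR sequence from the ECMA-130 Annex B LFSR."""
    table = bytearray(SECTOR_SIZE - SYNC_SIZE)
    lfsr = 0x0001  # 15-bit register, preset to 1 at the start of each sector
    for i in range(len(table)):
        value = 0
        for bit in range(8):
            # Output is taken LSB first from the low bit of the register.
            value |= (lfsr & 1) << bit
            # Feedback for x^15 + x + 1: XOR of the two lowest bits.
            feedback = (lfsr ^ (lfsr >> 1)) & 1
            lfsr = (lfsr >> 1) | (feedback << 14)
        table[i] = value
    return bytes(table)


SCRAMBLE_TABLE = build_scramble_table()


def scramble_sector(sector: bytes) -> bytes:
    """Scramble (or descramble) one raw sector; the operation is its own inverse."""
    if len(sector) != SECTOR_SIZE:
        raise ValueError("expected a 2352-byte raw sector")
    body = bytes(b ^ t for b, t in zip(sector[SYNC_SIZE:], SCRAMBLE_TABLE))
    return sector[:SYNC_SIZE] + body
```

Because the scrambler is a plain XOR with a fixed sequence, calling `scramble_sector` on an unscrambled sector gives the scrambled form, and calling it again gives back the original.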
For archival purposes, I'm storing both unscrambled and scrambled images for all the CDs I dump. This takes quite a lot of storage, because each disc is stored twice: once fully scrambled and once fully unscrambled. What I'd ideally like to do is keep only the unscrambled image and, alongside it, a difference file recording which bytes (if any) differ when the unscrambled data is used to regenerate the scrambled data. This would save a tremendous amount of space: the scrambled data could still be fully reconstructed, but only the bytes that cannot be regenerated from the unscrambled image would need to be stored.
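To make the idea concrete, here is a rough sketch in Python of what I mean by a difference file. It is not an existing tool or format: `make_diff` and `apply_diff` are hypothetical names, the diff is simply a list of (offset, bytes) runs held in memory, and both images are assumed to be the same length and already aligned to each other.

```python
def make_diff(regenerated: bytes, original: bytes) -> list[tuple[int, bytes]]:
    """Return runs of original bytes that differ from the regenerated image."""
    if len(regenerated) != len(original):
        raise ValueError("images must be the same length")
    runs = []
    start = None  # offset where the current run of differing bytes began
    for i, (a, b) in enumerate(zip(regenerated, original)):
        if a != b:
            if start is None:
                start = i
        elif start is not None:
            runs.append((start, original[start:i]))
            start = None
    if start is not None:
        # The images differ right up to the final byte.
        runs.append((start, original[start:]))
    return runs


def apply_diff(regenerated: bytes, runs: list[tuple[int, bytes]]) -> bytes:
    """Rebuild the original scrambled image from the regenerated one plus the diff."""
    out = bytearray(regenerated)
    for offset, data in runs:
        out[offset:offset + len(data)] = data
    return bytes(out)
```

For a disc with no error sectors the run list would be empty, so the only cost over the unscrambled image is an empty diff. A real implementation would need to stream the images rather than load them whole, and to serialize the runs in some agreed format.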
Is there any existing software or image format that supports this type of storage? It seems like it could be a nice feature for Aaru, though I don't believe Aaru currently supports it. I've thought about writing a utility to do it myself, but it would feel much tidier if it tied into something the community already uses for archival. If Aaru can't do it natively, would it be possible to add a custom metadata field to Aaru images that encodes the differing bytes?
Does anyone have any thoughts or insight? My motivation is that some discs seem to embed data inside error sectors (e.g., sarami has pointed out previously that at least one disc has lines from the poem Jabberwocky stored in its erroneous sectors), and this data is thrown away when the descrambled image is built. I'd like to keep that data.