Newsgroups: sci.electronics,alt.cd-rom,comp.periphs.scsi
Path: agate!doc.ic.ac.uk!pipex!bnr.co.uk!bnrgate!bmerh85!bmerh160!lreid
From: lreid@bmerh160.bnr.ca (Lewis Reid)
Subject: Re: CD error correction, was:
Message-ID: <1993Sep2.151402.2534@bmerh85.bnr.ca>
Summary: Oops.
Sender: news@bmerh85.bnr.ca (Usenet News)
Organization: Bell-Northern Research, Ottawa, Canada
References: <25svrr$krd@ebh.eb.ele.tue.nl> <1993Aug30.235816.2632@cmkrnl.com> <1993Sep1.150428.5543@bmerh85.bnr.ca>
Date: Thu, 2 Sep 93 15:14:02 GMT
Lines: 190
Xref: agate sci.electronics:64174 alt.cd-rom:13839 comp.periphs.scsi:15627

Well, well.  Having received such an overwhelming "BZZZTT, yourself!" from so many, I decided to look up the facts.  I'll start by responding to Ed Hall's comments, and first off perform two net.rarities.  First, let me apologize to anyone I offended with my attitude and my misconceptions.  Second, let me provide the real scoop... with references, no less.  Read on, if you have the time.

My exposure to CD error correction came from the CD-ROM side of the world.  I had to write C code to do the 2-D Reed-Solomon error correction for CD-ROM data, since the Hitachi drives we were using did not do this in firmware.  The bulk of what I had read dealt explicitly with CD-ROM error correction and said little about the implicit error correction built into the data encoding on the disc.  If there's a grain of truth to my statement that CD audio has no error correction, it is this: in the user-accessible data alone that you can read from a CD, there is no provision for error correction, no additional bytes available.  That's where my misconception lay.

Sometimes if you're wrong long enough, you really start to believe that you're right.  That's what happened with me, although now I understand why I thought what I thought.  If I learned anything from grad school, at a minimum it's (a) that I'm not always right, and (b) what to do when I realize I'm not.  The first couple of replies to my posting just sort of irritated me.  The ones after that gave me that sinking feeling of impending crow to be eaten.  In half-self-righteous indignation and a desire to find out the objective truth, I went to the library for a few hours last night.  Since I no longer work for the company where I did the CD-ROM work, I no longer have the references on hand.  In addition, it's been about a year since I worked with CD-ROM at all, and almost two since I worked with error correction.

First reference, one that I had at my previous job: CD ROM - The New Papyrus, Microsoft Press, 1986.  This is essentially a collection of papers, including one by Andrew Hardwick (pp. 73-83) entitled "Error Correction Codes: Key to Perfect Data."  The chapter details the error correction used on CD-ROM, specifically the 2-dimensional Reed-Solomon code that I was referring to earlier.  Looking closer, I found that it actually did say that CD-ROM and CD audio both use Reed-Solomon (RS) coding, particularly CIRC, or Cross-Interleaved Reed-Solomon Coding.  According to this book, this level provides a *Byte* Error Rate of 1E-9.

Jamie, if you want a fairly clear description of the CD-ROM layer of error correction, this book would be a good choice.  Hardwick doesn't get technical enough for you to write code to perform the algorithms, but it's detailed enough to give you a good feel for what's going on.  One thing to be aware of, though: he speaks almost solely in terms of Byte Error Rates, rather than the Bit Error Rates (BER) that most other authors use.

QUIZ for the stochastically inclined: given a Bit Error Rate of x, 0 < x < 1, what is the corresponding Byte Error Rate?  Note: BER refers to the fraction of bits that are in error on average, typically expressed as 1E-9, for example.  Also note that a BER of 1/8 does not mean a Byte ER of 1: two or more bit errors may occur in the same byte, and some bytes may have no errors.
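(If you want to check your answer, here's a minimal sketch in C, assuming bit errors are independent and uniformly distributed - an idealization, since real disc errors come in bursts.  A byte is clean only if all eight of its bits are clean.)

    #include <stdio.h>
    #include <math.h>

    /* Under independent bit errors at rate x, a byte is error-free
     * only if all 8 of its bits are, so
     *     Byte ER = 1 - (1 - x)^8,  roughly 8x for small x.  */
    int main(void)
    {
        double x[] = { 1e-9, 1e-4, 0.125 };
        int i;

        for (i = 0; i < 3; i++)
            printf("BER = %g  ->  Byte ER = %g\n",
                   x[i], 1.0 - pow(1.0 - x[i], 8.0));
        return 0;
    }

(For x = 1/8 this gives about 0.66, not 1 - exactly the point of the hint above.)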
A little information on the 2-D RS coding for CD-ROM: first, the 2340-byte block is split into an even and an odd subblock.  This automatically cuts in half the length of a string of consecutive errors.  Next, the bytes in each subblock are arranged into a 2-D array.  Instead of filling up one row, then the next row, and so on, the bytes are staggered diagonally, so that no two adjacent data bytes from the disc are on the same row or column.  This also cuts down the effect of a string of errors.  Then the RS codes are applied to each column and each row, alternately.  The RS codes can correct any single error and detect most multiple errors.

The beauty of this method is that, even if you have multiple errors on a given row, they must fall on different columns.  Thus, the row correction would fail, but the next round of column correction would fix both.  This method falls apart when you have errors, say, in a square:

    X - - - X     <- two errors on this row - can't correct
    - - - - -
    - - - - -
    - - - - -
    X - - - X     <- two errors on this row - can't correct
    ^       ^
    |_______|_____ two errors on these columns - can't correct

The syndromes generated by the RS codes, however, can give the correction program an idea of where the errors are, and other methods can be used to fix even error patterns such as this.  Apparently that's part of the proprietary information of Hardwick's company, because he really jumps around the subject, giving little detail.  Nonetheless, the article says that this method fails less than once in 1E4 times, giving a Byte ER of 1E-13, better than the 1E-12 required by the computer industry.

This book had a sequel, CD ROM volume two, also from Microsoft Press, 1987.  In chapter 3 (pp. 31-42), the data in a CD-ROM frame is broken into 12 bytes of sync, 4 header bytes, 2048 user data bytes, 4 EDC bytes, 8 unused bytes, and 276 ECC bytes.  The author says that for a CD-ROM with a BER of 1E-4 and bursts of over 1000 bad bits, the correction can regenerate all but "one in every 1E-12 [sic] bits."  Hmmm... we can correct all but one of every one-trillionth of a bit.  Go figure.

My next reference is the Essential Guide to CD-ROM, 1986, from Meckler Publishing.  In chapter 2 (pp. 13-32), Bert Gall gets deeper into the actual encoding on the disc than the other books had touched.  In the terminology he uses, a CD "frame" consists of 588 channel bits - meaning 588 bits on the disc itself, where a 0 is a pit or land, and a 1 is the transition between the two.  These are broken up as follows:

    Synch                         24 + 3      channel bits
    Control & display symbol     1 x (14+3)   channel bits
    Data symbols                24 x (14+3)   channel bits
    Error-correction symbols     8 x (14+3)   channel bits
                                -------------------------
                                588           channel bits

The (14+3) refers to two things: every eight-bit byte is expanded into 14 bits so that the data is self-clocking, and 3 bits are added between bytes to keep a zero DC component (equal numbers of zeros and ones) in the data on average.  The synch data is the only part of the disc that is not encoded with these bits, although after the synch data the 3 bits are inserted.  The conversion from 8 bits to 14 is referred to as EFM, or Eight-to-Fourteen Modulation.  This is done by a simple table lookup, although the 256 words used from the possible 2**14 = 16384 were chosen to have special properties.
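(The "special properties" are run-length limits: to keep the signal self-clocking, a codeword's 1s sit at least two and at most ten 0s apart.  Here's a minimal sketch that counts how many 14-bit words pass that in-word test.  Note this is a simplified criterion of my own; the real Red Book selection applies further rules at codeword boundaries to settle on its final 256.)

    #include <stdio.h>

    /* Count 14-bit words whose 1s are separated by at least two
     * and at most ten 0s - a simplified, in-word-only version of
     * the EFM run-length rule.  Slightly more than 256 words pass;
     * the standard's boundary rules trim the final set to 256.   */
    static int rll_ok(unsigned w)
    {
        int i, prev = -1;       /* position of the previous 1 bit */

        for (i = 0; i < 14; i++) {
            if (w & (1u << i)) {
                if (prev >= 0) {
                    int zeros = i - prev - 1;
                    if (zeros < 2 || zeros > 10)
                        return 0;
                }
                prev = i;
            }
        }
        return 1;
    }

    int main(void)
    {
        unsigned w;
        int count = 0;

        for (w = 0; w < (1u << 14); w++)
            count += rll_ok(w);
        printf("candidate codewords: %d\n", count);
        return 0;
    }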
From the above breakdown, we see that for every 24 data symbols (bytes), there are 8 error-correction symbols (bytes).  These are arranged physically on the disc as 12+4+12+4.  This gives a code efficiency of 3/4, meaning that 1/4 of the data on the disc is for error correction.  The first group of 4 corrects single errors in the 24 bytes and flags most multiple-symbol errors.  The second group of 4 corrects up to 2 more symbol errors, given the positions flagged by the first pass.  Any errors that this level finds but cannot fix are flagged to the D/A conversion layer for muting, interpolation, and concealment.  Note that some errors may slip by even this layer undetected.

"EDC/ECC correction ensures the high reliability of the CD-ROM system.  The disc has a bit-error-rate (BER) of 1E-5 to 1E-6.  The CIRC error-correction system reduces this to 1E-11 to 1E-12.  ECC and EDC finally brings this to the rate of 1E-15 to 1E-16.  This is one of the best corrected bit error rates obtainable today."  (p. 20)

My final reference is Principles of Digital Audio, a Sams publication by Ken C. Pohlmann (1985; 2nd ed., 1989).  Chapter 8, Error Correction (pp. 185-228), and chapter 12, The Compact Disc (pp. 321-373), contained very valuable information.  As the title implies, this book deals specifically with digital audio, in contrast to most of the other sources I've read.  It reads almost like a textbook, without homework problems at the end of the chapters.  It gives a background on error correction/detection theory, then goes into some more specifics about the CIRC coding on CDs.

The data is first encoded using a (28,24) code, C2, producing 28 bytes from the original 24 data bytes, then coded using a (32,28) code, C1, adding four more bytes.  The 24 data bytes consist of six pairs of 16-bit samples, but the six sampling periods are scrambled chronologically so that they do not appear on the disc in the order in which they occur in time.  This helps to reduce the audible effect of long burst errors, although it doesn't help the correction of errors any.  Pohlmann gives burst-error lengths and their treatments as follows:

    Correctable:            up to 3874 bits
    Good concealment:       13282 bits
    Marginal concealment:   15495 bits

He also gives interesting bit-rate values, saying that the channel bit rate on the CD itself is 4.3218 Mbits/sec, of which only 1.41 Mbits/sec is audio data after synch, error correction, and EFM demodulation.  Nearly 2/3 of the bits on the surface of the CD are overhead.  A 1-hour audio CD contains roughly 15.5 billion channel bits, of which around 5 billion are audio data bits.
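(Those numbers check out against the frame structure above.  A quick back-of-the-envelope in C, using the standard CD parameters of 44100 samples/sec, 2 channels, 16 bits/sample, and six stereo samples carried per 588-channel-bit frame:)

    #include <stdio.h>

    /* Re-deriving Pohlmann's rates: 44.1 kHz x 2 channels x 16 bits
     * of audio, six stereo samples per 588-channel-bit frame.      */
    int main(void)
    {
        double audio   = 44100.0 * 2 * 16;   /* audio data bits/sec */
        double frames  = 44100.0 / 6;        /* frames/sec          */
        double channel = frames * 588;       /* channel bits/sec    */

        printf("audio data rate : %.4f Mbit/s\n", audio / 1e6);
        printf("channel bit rate: %.4f Mbit/s\n", channel / 1e6);
        printf("overhead        : %.1f%%\n", 100 * (1 - audio / channel));
        printf("one hour        : %.2f billion channel bits, "
               "%.2f billion audio bits\n",
               channel * 3600 / 1e9, audio * 3600 / 1e9);
        return 0;
    }

(This prints 1.4112 and 4.3218 Mbit/s, 67.3% overhead, and 15.56 vs. 5.08 billion bits per hour - matching his figures.)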
Regarding the guy drilling holes in his CDs to test the error correction: a burst error of 3874 bits is about 2.5 mm along the pit track.  This much can be fully corrected, even on an audio CD.  The limit for good concealment, 13282 bits, is about 7.7 mm.  Now we see why your 3 mm hole didn't affect the sound, while your 6 mm hole did.  I find this amazing.

Pohlmann also substantiates the raw BER of a CD as 1E-5 to 1E-6, although he says, "In practice, because of the data density, even a mildly defective disc can exhibit a much higher bit error rate."  Elsewhere he says, "If large numbers of adjacent samples are flagged, the concealment circuitry performs muting.  Using delay lines, a number of previous valid samples (perhaps 30) are gradually attenuated....  Errors that have escaped the CIRC decoder without being flagged are not detected by the concealment circuitry, therefore do not undergo concealment and might produce an audible click in the audio reproduction.  Not all CD players are alike in terms of error correction."  He also gives a CD-ROM BER of 1E-15 for mode 1 data.

As an aside, he says Navy studies show that a cruiser carries about 5.32 million pages of documents, about 36 tons.  The amount carried above the main deck can affect the ship's stability.  In theory, the equivalent on CD-ROM would be 20 discs, weighing 280 grams.

So... if you've managed to read this far, thanks for your patience.  Again, I'm sorry I jumped out with my misinformation, but it was an honest mistake.  I am actually glad to know the truth, painful as the process may have been, and an evening's research in the library wouldn't kill most of us.

Now, does anyone want to hear the truth on oversampling???  8-O

-Bill Eason
--
All opinions and factoids expressed are my own or those I've collected,
not necessarily those of my employer.
Bill Eason, using lreid's account
Northern Telecom, Atlanta, GA (on loan to BNR, Ottawa)