Article 17745 of comp.sys.ibm.pc:
From: pete@octopus.UUCP (Pete Holzmann)
Subject: RLL Technical Details (long) (was Re: RLL- why it is hard on drives)
Message-ID: <218@octopus.UUCP>
Date: 15 May 88 03:16:04 GMT
Organization: Octopus Enterprises, Cupertino CA

If you read all the way through this, you will (hopefully) understand WHY
RLL works/doesn't work depending on the configuration you set up. You will
also understand WHY many of the horror stories applied to RLL are almost
certainly mis-applied.

I. How is data stored on a disk drive?

As magnetic flux reversals (think of it as + to -). The POLARITY of the
magnetic flux doesn't mean a thing. It is the TIMING of the flux reversals
that is used to encode data.

II. What is RLL? What does the '2,7' in '2,7 RLL' mean?

RLL means Run Length Limited. The Limits in disk drive RLL refer to the
minimum and maximum number of clock periods without a flux reversal
between two consecutive reversals. '2,7' means minimum of 2, maximum of 7.
A minimum of zero would mean that flux reversals can occur in every clock
period. Thus, '2,7' means that flux reversals occur at least every 8th
clock period (7 periods without a reversal), but no more often than every
third clock.

RLL codes are 'self clocking'. Since you are guaranteed to have a flux
reversal within a limited time, a phase-locked-loop circuit can find the
basic clock period of data on the drive. As the basic clock period gets
smaller and/or the maximum inter-flux-reversal time increases, the job
gets harder and harder for the phase-locked-loop circuitry.

III. What about MFM?

MFM is simply 1,3 RLL encoding, with a basic clock period of 50 nsec. One
data bit is encoded every two clock periods. The MFM code is relatively
easy to understand [and I have some notes handy], so I'll give the
complete details:

In this table of flux encoding, '0' means no flux change, '1' means a flux
change encoding a '1' data bit, 'C' means a flux change required to encode
a '0' data bit due to clocking requirements.

The code:
        1 always becomes    0 1
        0 becomes           0 0   if preceded by a 1
        0 becomes           C 0   if preceded by a 0

Message Data: 1   0   0   0   0   0   0   0   0   0   0   0
Disk Data:    0 1 0 0 C 0 C 0 C 0 C 0 C 0 C 0 C 0 C 0 C 0 C 0 ...

Message Data: 1   1   1   1   1   1   1   1   1   1   1   1
Disk Data:    0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 ...

Message Data: 1   0   0   1   1   0   1   0   0   1   0
Disk Data:    0 1 0 0 C 0 0 1 0 1 0 0 0 1 0 0 C 0 0 1 0 0

Note that there are between 1 and 3 zeros between every pair of flux
reversals (1 or C) in the disk data!

Note that since 'C' is physically the same as '1' (both are flux
reversals), the setup gets in trouble if it loses track of clock periods!
The way this is handled on a disk drive is that there is a special data
sequence encoded at the beginning of each sector, with special hardware to
detect it:

First, there is a long string of zeros; a hardware 'zero detector' is
enabled to look for it. At this point, it could as easily find a string of
ones as a string of zeros, since they are identical when taken out of
context.

Second, a special byte is encoded that VIOLATES the RLL rules: an 'A1' or
'A8' byte is written, with a clock missing in one of the sequential zero
bits (the A1 and A8 tell us whether we are looking at the header of the
sector, which contains cyl/sector/etc info, or the data portion of the
sector). The special byte is called the Address Mark.

If zeros followed by an Address Mark are found, then the PLL (phase locked
loop) is synchronized and data can be read.

IV. Ok, so explain the 'RLL' schemes.

I don't have complete tables of code schemes for all of the RLL formats
handy; it would also take a long time to type them all in. Instead, I'll
explain what IS important about them.

First, let's compare 2,7 RLL with 1,3 RLL. Both codes happen to encode one
data bit into 2 clock periods. With 1,3 RLL (MFM), a flux reversal can
occur every two clock periods. With 2,7 RLL, a flux reversal can occur
every 3 clock periods.
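
To put numbers on that comparison, here is a minimal Python sketch. It
only does the arithmetic implied by the (minimum, maximum) limits and works
in units of clock periods, so no absolute timing is assumed. The names
(spacing, clock_speedup, data_gain) are made up for this illustration.

```python
from fractions import Fraction

MFM = (1, 3)       # (min, max) clock periods with no reversal
RLL_2_7 = (2, 7)

def spacing(code):
    """Closest and farthest distance, in clock periods, from one flux
    reversal to the next for a (min, max) run-length-limited code."""
    d, k = code
    return d + 1, k + 1

def clock_speedup(code, reference=MFM):
    """How much faster than the reference the clock can run while the
    closest pair of reversals stays just as far apart in real time."""
    return Fraction(spacing(code)[0], spacing(reference)[0])

def data_gain(code, bits_per_clock, reference=MFM,
              ref_bits_per_clock=Fraction(1, 2)):
    """Data rate relative to the reference at that faster clock."""
    return clock_speedup(code, reference) * bits_per_clock / ref_bits_per_clock

print("MFM reversal spacing (clocks):", spacing(MFM))          # (2, 4)
print("2,7 reversal spacing (clocks):", spacing(RLL_2_7))      # (3, 8)
print("2,7 clock speedup vs MFM:", clock_speedup(RLL_2_7))     # 3/2
print("2,7 data gain vs MFM:", data_gain(RLL_2_7, Fraction(1, 2)))  # 3/2
```

Both codes carry half a data bit per clock period, so the whole gain comes
from the closest reversals being 3 clocks apart instead of 2.
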

If we increase the clock rate by 50% using a 2,7 RLL scheme, we get the
same maximum flux reversal rate as for MFM. But, we get 50% more data out
of the drive, at a 50% higher data rate.

Other RLL encoding schemes involve changes in the number of clock periods
used to encode a data bit. For example, 1,7 RLL encodes 2 data bits into 3
clock periods. The 1,7 clock period must be kept the same as for 1,3 (MFM)
(I hope you see why by now: both schemes involve a flux change as often as
every 2 clocks). The result is a 33% increase in storage capacity (2/3 of
a data bit per clock period instead of 1/2), somewhat less than the 50%
you get with 2,7 RLL.

Why not use 1,7 RLL? Because the difference between minimum and maximum
flux-change-intervals is so great. It turns out that the PLL electronics
for detecting this wide a range of intervals is a real pain; worse,
presumably, than the problems involved in implementing 2,7 RLL.

Other encoding schemes use different clock rates and different min/max
combinations. They all set things up so the maximum flux-reversal
frequency is the same. The IMPORTANT differences between the schemes
involve maximum clock frequency (50% higher for 2,7 RLL than MFM, 100%
higher for ARLL than MFM) and maximum Frequency Ratio (comparing minimum
and maximum flux-reversal intervals).

In addition, some schemes involve simpler encoding/decoding algorithms
(e.g. the normal 1,3 RLL/MFM); others are very complex: 2,7 RLL is a
variable length code (e.g. 0011 maps to 00001000 but 010 maps to 100100);
I don't have a simple formula for the 2,7 RLL code! Variable length codes
make error recovery more difficult, and hence make bad-sector marking more
important.

A high frequency clock requires great accuracy in timing all along the
chain from disk surface to final data to be read (and the reverse). The
time period during which the controller must decide whether a flux
reversal is present or not is called the 'window'.
The variation in flux-reversal detection (+ or - from the nominal 'perfect
detection time') allowed by a given encoding scheme is called the
'required window margin'. Higher frequency clocks mean a shorter window,
and so allow less timing variation (a tighter window margin requirement).

On a given drive/controller combination, the window margin can be
measured: simply sync up the electronics to the pulses on the drive, read
a worst-case data pattern, and see what kind of variation in flux-reversal
timing you get. Good drive/controller combinations will place all flux
reversals in a very narrow time window, giving a very good window margin,
and hence will work well with high-frequency encoding schemes.

A big difference between minimum and maximum flux reversal intervals
simply requires complex decoding and phase-locked-loop circuitry that can
handle a wide range of frequencies.

All of which leads us to...

V. What does all this mean in terms of real drives, controllers, etc.?

First, let's understand which parts of the whole deal go where. Here are
the pieces needed to read/write disk info, and where they are located:

        Component                   Where it is

        Disk surface                Drive
        Head                        Drive
        Analog head electronics     Drive (conditions signal to/from head)
        Cable                       Between drive and controller
        Analog data separator       Controller (detects flux reversals)
        Phase Locked Loop           Controller (determines data clocking)
        Digital read/write stuff    Controller (includes bit/byte
                                    conversions, etc etc etc)

Note that MOST of the junk is in the controller, not the drive!

On the drive:

Oxide-surface disks on early drive designs (e.g. ST-225, ST-238) do not
place the flux-reversal with enough accuracy to be used in most RLL
situations. This is why ST-225/238 drives have so much trouble. Newer
drive designs use plated media, which allow better magnetic definition.

The drive head and associated electronics are usually tuned to match the
expected signals to and from the drive.
If the drive was designed without 'RLL' (2,7) in mind, the frequency
response of the drive electronics is 'mushy': it may not provide a
crisp/accurate enough signal to allow the PLL to correctly sync up. On
more recent drives, the same exact setup is used for 'MFM' (1,3 RLL) and
'RLL' (2,7); the drives that are certified for RLL are simply tested to
verify that everything is OK. (The reason I'm so down on Seagate ST225/238
is that they didn't redesign anything. They simply test the same old
stuff, and if it happens to pass the RLL test, they sell it as RLL).

On the controller:

On an 'RLL' controller, everything must be carefully designed to meet the
tighter timing requirements. Note that a VERY accurate controller can make
up for a somewhat mushy drive: the overall timing requirements are based
on the sum total of electronics in the path from disk media to final
digital output. Spreading the timing error evenly between drive and
controller is theoretically cheaper, since neither one need be set up for
very tight tolerances; however, a very accurate controller is not that
hard to build today, hence the better success we're all having at running
'non-RLL' drives with RLL controllers.

In general:

There's no such thing as a free lunch. There is no encoding scheme (so
far) that gets you more data without requiring more density or more timing
accuracy of some kind.

Somebody mentioned an amazing new Perstor controller that doubles drive
density, supposedly without increasing the timing requirements. HAH! You
sure can't get double the flux-reversals in the same space, so you MUST do
it by increasing the timing requirements. The Perstor simply is an ARLL
controller (I'm not certain, but I believe ARLL, getting 100% more data
than MFM, is a 4,7 RLL encoding scheme); it will have trouble with some
low quality drives just like the other RLL controllers do.

I have not personally tested the window margins on lots of drives or
controllers.
I have talked with people who HAVE done this testing; their results say
that the Adaptec RLL controllers have the best timing of all RLL
controllers on the market today (as of a month ago), and confirm what I've
heard/seen about Miniscribe and Maxtor drives (they also have good enough
timing), and about Seagate ST225/238 (poor to marginal).

VI. What about ESDI and SCSI?

Well, they are kind of handy: all of the data encoding/decoding circuitry
is on the drive; it is all designed together, and is well matched
(hopefully!). Putting it all together like that makes it easier to use
fancier high frequency encoding schemes, so you'll typically see higher
data densities on ESDI and SCSI drives.

VII. Anything else?

Sure! There are lots of even more technical, related issues to discuss:
bit shift details (bit shift is a lower level description of what causes
poor window margins, i.e. wide variation in flux-reversal timing, on a
given drive); signal-to-noise ratios; pulse amplification; pulse
equalization; etc etc... and far on into things that I know nothing about
(and hope I never have to!).

Actually, it's pretty amazing when you think about it: for 99.999% of the
people out there, this stuff is just boxes, cables and cards that you
plunk together and they just *work*!

Well, that's about it. I've run out of time, so I'd better send this now.
I hope it helped more than it confused!

[And no, I don't think you'll find drive manufacturers or controller
manufacturers very willing to provide detailed specs on their window
margins; that would make it too easy to compare drive quality! :-(]

Pete

P.S.: If you read all the way to here, congratulations! I don't really
expect that this stuff would really be interesting enough for people to
read through 250 lines of gobbledygook...
:-)
--
  OOO   __| ___      Peter Holzmann, Octopus Enterprises
 OOOOOOO___/ _______ USPS: 19611 La Mar Court, Cupertino, CA 95014
  OOOOO \___/        UUCP: {hpda,pyramid}!octopus!pete
___| \_____          Phone: 408/996-7746


From ucdavis!ucbvax!hplabs!pyramid!octopus!pete Tue May 17 11:13:55 PDT 1988
Article 17805 of comp.sys.ibm.pc:
Path: ucdavis!ucbvax!hplabs!pyramid!octopus!pete
From: pete@octopus.UUCP (Pete Holzmann)
Newsgroups: comp.sys.ibm.pc,comp.periphs,comp.sys.misc
Subject: HERE IS THE RLL code! (unburied sooner than I thought...)
Message-ID: <226@octopus.UUCP>
Date: 16 May 88 19:57:19 GMT
Reply-To: pete@octopus.UUCP (Pete Holzmann)
Followup-To: comp.periphs
Organization: Octopus Enterprises, Cupertino CA
Lines: 38
Xref: ucdavis comp.sys.ibm.pc:17805 comp.periphs:1029 comp.sys.misc:1515

[Note: I've directed followups to comp.periphs, although I don't
personally read that group]

You asked for it; I happened to find a copy in one of my magazines (Fall
1986 Computer Technology Review)... so here it is: the RLL code! I think
you'll agree that it *is* a variable length code, with constant encoding
density. It is kind of fun to play with it and verify that it really is a
2,7 RLL code. It isn't at all obvious how to start with "I want a 2,7 RLL
code" and end up with this chart:

        Data            Code

        1               00
        01              0001
        10              0100
        11              1000
        000             100100
        001             001000
        010             000100
        0110            00100100
        0011            00001000

Have fun!

Pete

P.S.: People have requested the ERLL and ARLL codes. I don't have them
handy. I'm not sure I have a recent enough printed reference. I know where
to go (actually, who to talk to) to get the chart; but if somebody on the
net has the codes handy, maybe they can pipe up! I can't be the only one
with access to this stuff!
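
For anyone who wants to "play with it" by machine, here is a minimal Python
sketch of a 2,7 encoder plus a brute-force run-length check. One loud
caveat: the chart as posted leaves the parsing rule open (for instance, 1
is also the start of 10 and 11), so this sketch does NOT use it. It
substitutes the widely published seven-entry 2,7 table, which is
prefix-free and agrees with the two mappings quoted in the earlier article
(0011 to 00001000, 010 to 100100). That table, the function names and the
matching loop are assumptions for illustration, not something taken from
the posts or from any real controller.

```python
from itertools import product

# ASSUMPTION: commonly published prefix-free (2,7) table, not the chart
# from the article. Every entry maps n data bits to 2n code bits.
RLL27 = {
    "10":   "0100",
    "11":   "1000",
    "000":  "000100",
    "010":  "100100",
    "011":  "001000",
    "0010": "00100100",
    "0011": "00001000",
}

def encode_rll27(bits):
    """Encode a string of '0'/'1' data bits into code bits, where a
    code '1' stands for a flux reversal. Raises ValueError if the data
    ends in the middle of a table entry."""
    out = []
    i = 0
    while i < len(bits):
        # The table is prefix-free, so at most one length can match.
        for length in (2, 3, 4):
            word = bits[i:i + length]
            if word in RLL27:
                out.append(RLL27[word])
                i += length
                break
        else:
            raise ValueError(f"data ends mid-word at bit {i}")
    return "".join(out)

def zero_runs(code):
    """Lengths of the runs of 0s between consecutive 1s."""
    ones = [i for i, c in enumerate(code) if c == "1"]
    return [b - a - 1 for a, b in zip(ones, ones[1:])]

def check_run_lengths(depth=4):
    """Encode every sequence of `depth` table entries and return the
    shortest and longest zero run seen between two reversals."""
    lo, hi = 99, 0
    for words in product(RLL27, repeat=depth):
        data = "".join(words)
        code = encode_rll27(data)
        assert len(code) == 2 * len(data)   # constant encoding density
        runs = zero_runs(code)
        lo, hi = min([lo] + runs), max([hi] + runs)
    return lo, hi

print(encode_rll27("0011"))   # 00001000
print(encode_rll27("010"))    # 100100
print(check_run_lengths())    # (2, 7)
```

The longest run shows up when 1000 is followed by 00001000 (three trailing
zeros plus four leading ones); the shortest sits inside 100100.
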