11.2 Writable CD Formats
The physical and logical format used by writable CDs is defined in the rainbow books described in the CD-ROM chapter. The following sections provide an overview of how data is physically and logically stored on writable CDs. For further detail, refer to the rainbow books.
CD-R discs are manufactured with a pregroove track, which is 600 nanometers (nm) wide with a 1,600nm pitch. The pregroove includes an impressed timing wobble of ±3 nm radial excursion at 22.05 kHz, with an FM carrier modulated at 1 kHz superimposed on the pregroove. This modulation supplies an absolute clock signal, called absolute time in pregroove (ATIP), which serves as an absolute location reference for any sector on the CD-R disc. Absolute addresses on the CD-R disc are specified in the form HH:MM:SS using ATIP information. Audio CDs are addressable in this manner with a resolution of one second (75 sectors). Data CDs are addressable to the individual sector level.
11.2.1 Physical Formats
Because they must be readable in a standard CD-ROM drive or CD player, writable CDs use a physical format nearly identical to pressed CDs. The dimensions of a CD are 120.00mm in diameter (60.00mm radius) with a 15mm diameter central hole, which accommodates the rotating center spindle of the drive. Beginning at the edge of the center hole (radius 7.50mm) and proceeding outwards, a CD-R disc is divided into the following areas:
- Clamping Area
The Clamping Area is that portion of the disc that the drive spindle grasps to rotate the disc. On a pressed CD, this area extends from radius 7.50mm to 23.00mm. On a writable CD, this area occupies radius 7.50mm to 22.35mm.
- System Use Area
The System Use Area (SUA) is present only on writable discs, occupies radius 22.35mm to 23.00mm, and can be thought of as equivalent to the boot sector of a hard disk. The SUA contains data that tells a CD drive or player what kind of information is stored on the disc, where it is located, and what format it uses. The SUA lies inside the innermost radius that standard CD-ROM drives and CD players can read, and so can be read (and written to) only by CD recorders. The SUA is divided into two sub-areas:
- Optimal Power Calibration Area
The Optimal Power Calibration Area (OPCA), often called the Power Calibration Area (PCA) for short, is used by the CD writer as a testing area to decide the best write schema to use when writing to that disc. Each time you insert a disc into a CD-R drive, the drive fires its writing laser at the PCA to calibrate that disc against the drive. Each such calibration uses one ATIP frame. Only 99 PCA ATIP frames are available at most, which limits a CD-R disc to 99 or fewer recording sessions.
Many variables determine how the drive should best write to that disc: the type of dye and reflective backing material the disc uses, the proposed write speed, the firmware level of the drive, and so on. From this calibration testing, the drive decides the power level to use when writing, and whether to use a short write schema (typical for cyanine-based discs) or a long write schema (typical for phthalocyanine- and azo-based discs). The PCA begins at radius 22.35mm (ATIP -00:00:36 relative to the 23.00mm beginning of the Lead-in Area).
- Program Memory Area
The Program Memory Area (PMA) begins where the PCA ends, and extends to the beginning of the Lead-in Area at radius 23.0mm. The PMA is used to store a temporary Table of Contents (TOC) until the disc is finalized or closed. Closing a disc writes the temporary TOC stored in the PMA to the Lead-in Area, described below. That makes the TOC (and therefore the disc) readable by a CD-ROM drive or CD player, but also means that the disc can no longer be written to by a CD recorder. The PMA can store location information for up to 99 track numbers, including the start and stop times for each track (for audio) or the sector addresses for data.
- Information Area
The Information Area (IA) occupies a width of 35.0mm to 35.5mm, beginning at radius 23.0mm and ending between radius 58.0mm and 58.5mm. This area provides the general storage space to which user data is written. The IA is the only area of the CD that is visible to standard CD-ROM drives and CD players, and includes the following sub-areas:
- Lead-in Area
The Lead-in Area occupies radius 23.0 to 25.0mm on both pressed and writable CDs. This area contains digital silence in the main channel, as well as control information in various subcode channels, which can be used to provide additional information to the drive or reader about the content of the disc. The most important of the subcode channel data is the Table of Contents for the disc, which is stored in the Q-channel. The length of the Lead-in Area is determined by the space required to store up to 99 TOC entries, one for each of the 99 tracks that may potentially be written to the Program Area.
A CD has a main data channel which stores audio and/or computer data and eight interleaved subcode channels, designated P through W, which can store supplemental control data that can be read by CD-ROM drives and CD players. When the CD format was originally designed, it was intended that the main channel would contain only data and that subcode channels would be used to store administrative information. Nowadays, such supplemental information is usually encoded within the main data channel, and the only subchannels that are generally used are the P-channel, which specifies the start and end of each track, and the Q-channel, which stores the TOC, the track type/catalog number, and the timecodes (in HH:MM:SS and frames) used to locate data on the disc. Subchannels R through W were formerly sometimes used to store graphics and other supplemental data, but are now seldom used. The DVD specification eliminates subchannel coding as superfluous.
If you've ever wondered why a CD-R disc that has been written to but not closed can be read in a CD recorder but not in a standard CD-ROM drive or CD player, this is why. Standard readers look for the TOC in the Lead-in Area, where it has not yet been written for a disc that is not yet closed. CD recorders can read the temporary TOC stored in the PMA, which allows them to read that disc. The PMA is invisible to standard CD-ROM drives and CD players, so as far as they're concerned, that disc has no TOC.
- Program Area
The Program Area (PA) occupies a width of 33.0mm to 33.5mm, beginning at radius 25.0mm and ending between radius 58.0mm and 58.5mm. The PA is where actual user data (audio or computer data) is stored. The PA varies in capacity according to the CD-R disc you use. Discs are available that store 63 minutes of audio (which corresponds to about 550 MB of data), 74 minutes (~650 MB), and 80 minutes (~700 MB). Different brands of discs also have minor variations from nominal capacity. Some nominally 74-minute discs, for example, can store as much as 76.5 minutes.
- Lead-out Area
The Lead-out Area occupies a width of 0.5mm to 1.0mm, beginning between radius 58.0mm and 58.5mm and ending between radius 59.0mm and 59.5mm. The Lead-out Area is created when the disc is closed, and defines the end of the Information Area.
The remaining 0.5mm to 1.0mm at the outer edge of the disc is unused. This area has no formal name that we know of, and exists simply to protect the outer portion of the track from damage.
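The nominal Program Area capacities quoted above follow from simple arithmetic on playing time. The following is a minimal sketch, assuming 75 sectors per second, 2 KB (2,048 bytes) of user data per data sector, and "MB" meaning 2^20 bytes. The function name `data_capacity_mb` is hypothetical, chosen only for illustration.

```python
SECTORS_PER_SECOND = 75        # one second of playing time spans 75 sectors
USER_BYTES_PER_SECTOR = 2048   # assumption: 2 KB of user data per data sector
BYTES_PER_MB = 1024 * 1024     # assumption: "MB" means 2**20 bytes


def data_capacity_mb(minutes: float) -> float:
    """Approximate data capacity of a disc rated for `minutes` of audio."""
    sectors = minutes * 60 * SECTORS_PER_SECOND
    return sectors * USER_BYTES_PER_SECTOR / BYTES_PER_MB


for minutes in (63, 74, 80):
    print(f"{minutes}-minute disc: about {data_capacity_mb(minutes):.0f} MB")
```

Real discs deviate slightly from these figures because, as noted above, different brands vary from nominal capacity.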
The preceding assumes that the data on the disc exists as one session, which is nearly always true for commercial pressed CDs, as well as for writable CDs produced using Disc-at-Once recording (described in a later section). But Orange Book defines a concept called multisession for CD-R discs.
With multisession recording, the overall disc layout remains the same. As with a single-session disc, a multisession disc contains a Lead-in Area, a Program Area, and a Lead-out Area. The difference is that the Program Area on a multisession disc stores more than one session, each of which contains its own session-based Lead-in Area, Program Area, and Lead-out Area.
Like the disc itself, a session can be opened, written to, and closed. When a session is closed, that session can no longer be written to, but additional sessions can be added to the disc. In fact, closing a session on a multisession disc automatically opens a new session to which additional data can be written. Closing the session writes the session TOC to the PMA. This session TOC includes pointers to the start of the session Program Area for the new session and to the start time of the last-used (outermost) Lead-out Area.
Closing the session does not close the disc, however, which means that until the disc itself is closed, sessions on a multisession disc can be read only by a CD recorder (which can read the temporary TOC in the PMA) and by some recent CD-ROM drives. When the disc itself is closed, all sessions are closed and the temporary TOC is written to the Lead-in Area, allowing the disc to be read in any CD-ROM drive and most CD players.
Although the PMA makes provision for 99 tracks or sessions, in practice the number of sessions that can be recorded on a CD-R disc is much lower because of the overhead required for each session. When writing multiple sessions to a disc, the Lead-in Area for each session occupies 4,500 sectors (60 seconds or 9,000 KB). The Lead-out Area for the first session occupies 6,750 sectors (90 seconds or 13,500 KB). The Lead-out Area for the second and subsequent sessions occupies 2,250 sectors (30 seconds or 4,500 KB).
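To see how quickly this overhead accumulates, the sector counts above can be totaled for any number of sessions. This is a rough sketch that uses only the figures given in the preceding paragraph (2 KB per sector follows from 4,500 sectors occupying 9,000 KB). The function names are hypothetical.

```python
SECTOR_KB = 2                 # 4,500 sectors occupy 9,000 KB
LEAD_IN_SECTORS = 4500        # every session
FIRST_LEAD_OUT_SECTORS = 6750 # first session only
LATER_LEAD_OUT_SECTORS = 2250 # second and subsequent sessions


def session_overhead_sectors(session_count: int) -> int:
    """Total Lead-in plus Lead-out sectors consumed by `session_count` sessions."""
    if session_count < 1:
        return 0
    lead_ins = session_count * LEAD_IN_SECTORS
    lead_outs = FIRST_LEAD_OUT_SECTORS + (session_count - 1) * LATER_LEAD_OUT_SECTORS
    return lead_ins + lead_outs


def session_overhead_kb(session_count: int) -> int:
    """The same overhead expressed in KB."""
    return session_overhead_sectors(session_count) * SECTOR_KB


for count in (1, 2, 10):
    print(f"{count} session(s): {session_overhead_kb(count):,} KB of overhead")
```

Under these figures the first session costs 22,500 KB and each later session costs 13,500 KB, so ten sessions consume 144,000 KB before any user data is counted.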
11.2.2 Logical Formats
The logical format of a CD specifies how data is arranged on the CD, and largely determines how data may be structured on the disc and what operating systems will be able to access it. CDs commonly use one of the logical formats described in the following sections.
11.2.2.1 ISO 9660
Most data CDs use the ISO 9660 format or one of its variants. ISO 9660 is based on the de facto standard High Sierra format, which was developed by the CD-ROM industry as a cooperative effort because of the lack of formal standards that then existed for writing data to CDs. In the days before High Sierra came into use, it was quite common to find that you could not read the data on a particular CD-ROM because that CD was incompatible with your software.
The primary purpose of ISO 9660, which was adopted in 1988, was to standardize a common logical data format for data CDs and, at the same time, to facilitate data exchange among different computing platforms. As a least-common-denominator format, the original ISO 9660 format is feature-poor because it supports only features that are common across many platforms. For example, the MS-DOS 8.3 file-naming convention limited ISO 9660 to using 8.3 filenames.
At the time ISO 9660 was adopted, these limitations were not much of a problem. Most people ran either MS-DOS or a Mac using floppy disks or small hard disks, and the limitations of ISO 9660 were not onerous in those environments. But the world soon changed, and the strict limits enforced by ISO 9660 became a problem, particularly for those who wanted to use deeply nested directories and long filenames. Accordingly, the ISO 9660 specification was expanded to include three ISO 9660 Interchange Levels for naming files and directories on disc. From most to least restrictive, these include:
- ISO 9660 Level 1
ISO 9660 Level 1 is the least-common-denominator level, developed to accommodate DOS filename limitations. Each file must be written to disc as a single, continuous stream of bytes, called an extent. Files may not be fragmented or interleaved. Filenames may contain from one to eight d-characters. Filename extensions may contain from zero to three d-characters (see following section). Directory names may contain from one to eight d-characters, and may not have an extension.
- ISO 9660 Level 2
ISO 9660 Level 2 also requires that files be written to disc as a single extent, but filenames may be up to 255 d-characters long, with an extension from zero to three d-characters. ISO 9660 Level 2 discs are unreadable by some operating systems, notably DOS.
- ISO 9660 Level 3
ISO 9660 Level 3 allows a file to be written in multiple extents, and so is used for packet writing. Filenames may be up to 255 characters long, with the same limitations as ISO 9660 Level 2.
Strictly interpreted, ISO 9660 filenames must end with a semicolon followed by the version number, e.g., FILENAME.TXT;1. Most operating systems ignore these final two characters when they access files or display directory listings. Versions of the Macintosh OS prior to 7.5 and some versions of Unix do not suppress the semicolon and version number, which causes problems if they attempt to access FILENAME.TXT rather than the actual filename of FILENAME.TXT;1.
The various ISO 9660 Levels vary significantly in which characters are legal. In ISO 9660-speak, these characters are designated as follows:
- d-characters
For strict compliance with ISO 9660 Level 1 file and directory naming conventions, only this character set may be used (and only in 8.3 format). d-characters include uppercase A through Z, digits 0 through 9, and the underscore character.
- a-characters
The character set usable for ISO Volume Descriptors (see below). a-characters include all d-characters as well as the following symbols: space; comma; semicolon; colon; period; question mark; exclamation point; right and left parentheses; single and double quotes; greater-than and less-than symbols; percent; ampersand; equals; asterisk; plus and minus (hyphen) symbols; and forward slash.
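The Level 1 naming rules and the d-character set lend themselves to a simple check. The following is a minimal sketch, not a full ISO 9660 validator: it tests only the 8.3 length limits and the d-character set described above, and it ignores the trailing semicolon and version number. The function names are hypothetical.

```python
import re

# d-characters: uppercase A-Z, digits 0-9, and underscore.
# Filename: 1 to 8 d-characters, optionally a period and 0 to 3 d-characters.
_LEVEL1_FILE = re.compile(r"[A-Z0-9_]{1,8}(\.[A-Z0-9_]{0,3})?")
# Directory name: 1 to 8 d-characters, no extension.
_LEVEL1_DIR = re.compile(r"[A-Z0-9_]{1,8}")


def is_level1_filename(name: str) -> bool:
    """True if `name` fits the Level 1 8.3 rules sketched above."""
    return _LEVEL1_FILE.fullmatch(name) is not None


def is_level1_dirname(name: str) -> bool:
    """True if `name` is a legal Level 1 directory name under the same rules."""
    return _LEVEL1_DIR.fullmatch(name) is not None


for candidate in ("README.TXT", "readme.txt", "MY-FILE.TXT", "LONGFILENAME.TXT"):
    print(candidate, is_level1_filename(candidate))
```

Lowercase letters and the hyphen fail the check because neither belongs to the d-character set.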
ISO 9660 Volume Descriptors are optional information fields recorded at the beginning of the data area on the disc. Volume Descriptors were originally intended for use by CD publishers, but may be used by anyone who creates an ISO 9660 disc, assuming the mastering software supports assigning ISO Volume Descriptors (some don't, or support only some of the available volume descriptors). ISO 9660 Volume Descriptors include the following, with allowable sizes in parentheses:
- System Name
The operating system for which the disc is intended. (0 to 32 a-characters)
- Volume Name
The disc name, displayed by the OS when the disc is mounted. (0 to 32 a-characters)
- Volume Set Name
Used in multi-disc sets to assign a common group name to each disc in the set. (0 to 32 d-characters).
- Publisher's Name
The publisher of the disc. (0 to 128 a-characters)
- Data Preparer's Name
The author of the disc content. (0 to 128 a-characters)
- Application Name
The name of the program, if any, needed to access data on the disc. (0 to 128 a-characters)
- Copyright File Name
Points to a file (which, if present, must reside in the root directory of the disc) that contains copyright information. (Maximum 8.3 d-characters)
- Abstract File Name
Points to a file (which, if present, must reside in the root directory of the disc) that contains text describing the contents of the disc. (Maximum 8.3 d-characters)
- Bibliographic File Name
Points to a file (which, if present, may reside in any directory on the disc) that contains bibliographic information, such as ISBN number. (Maximum 8.3 d-characters)
- Date Fields
Four Volume Descriptor fields exist for dates: Creation Date; Modification Date; Expiration Date; and Effective Date. Each of these fields, if present, stores a date and time in the following format, with size given in bytes in parentheses: Year (4); Month (2); Day (2); Hour (2); Minute (2); Second (2); Hundredths of a second (2); Timezone (1 byte, signed integer; specifies the number of 15-minute increments from UTC, from -48 West to +52 East).
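The date field layout above adds up to 17 bytes (4 + 2 + 2 + 2 + 2 + 2 + 2 + 1). As an illustration, the sketch below packs a timestamp into that layout, assuming the numeric fields are stored as ASCII digits and the timezone as a single signed byte. The function name `encode_volume_date` is hypothetical and is not taken from any mastering package.

```python
from datetime import datetime, timedelta, timezone


def encode_volume_date(moment: datetime) -> bytes:
    """Pack a timezone-aware datetime into the 17-byte layout described above."""
    offset = moment.utcoffset()
    if offset is None:
        raise ValueError("moment must carry timezone information")
    # Timezone is counted in 15-minute (900-second) increments from UTC.
    quarter_hours = int(offset.total_seconds() // 900)
    if not -48 <= quarter_hours <= 52:
        raise ValueError("timezone offset outside the -48 to +52 range")
    digits = (
        f"{moment.year:04d}{moment.month:02d}{moment.day:02d}"
        f"{moment.hour:02d}{moment.minute:02d}{moment.second:02d}"
        f"{moment.microsecond // 10000:02d}"   # hundredths of a second
    )
    # Assumption: digit fields are ASCII text; the last byte is a signed integer.
    return digits.encode("ascii") + quarter_hours.to_bytes(1, "big", signed=True)


stamp = datetime(2001, 6, 15, 13, 45, 30, 250000,
                 tzinfo=timezone(timedelta(hours=-5)))
print(encode_volume_date(stamp))
```

A zone five hours west of UTC is stored as -20, because five hours contain twenty 15-minute increments.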
11.2.2.2 ISO 9660 Variants
The very real limitations of ISO 9660 formatted discs gave rise to several alternative formats, all of which were based on ISO 9660:
- Rock Ridge
The Rock Ridge format is an extension of the ISO 9660 format, intended for use on Unix systems, which have much more liberal restrictions on the length of and characters used in filenames and directory names, as well as the depth of directories. Using Rock Ridge allows a CD to support long mixed-case filenames, symbolic links, and other conventions common to Unix systems. Although full Rock Ridge support is available only on Unix systems, a system running MS-DOS, Windows, or the Mac OS can still access the data on a Rock Ridge disc, but not the long filenames and other extended information. The Rock Ridge standard is available at ftp://ftp.ymi.com/pub/rockridge if you want to learn more about it.
- Romeo
The Romeo format is an obsolete extension to ISO 9660, developed by Adaptec as a stopgap measure for early versions of their Easy-CD premastering software. The raison d'être for the cutely-named Romeo format was that Windows NT 3.5a did not support the proprietary Microsoft Joliet format, described below. Romeo supports filenames of up to 128 characters, including spaces. However, unlike Joliet, Romeo supports neither the Unicode character set nor associated short (MS-DOS 8.3) filenames. Romeo-formatted discs can be read under Windows NT 3.51 and 4.0, Windows 98/SE/Me, and Windows 2000/XP. Because there is no associated short filename, Romeo-formatted discs cannot be read under MS-DOS. Romeo-formatted discs can be read on a Macintosh to the extent that they do not use filenames that exceed 31 characters. The Romeo format was essentially overtaken by events, was seldom used even when current, and is almost never encountered today.
- Joliet
Joliet is an extension of ISO 9660, developed by Microsoft to allow CDs to support long filenames, the Unicode character set, and associated short (MS-DOS 8.3) filenames. Joliet allows filenames up to 64 characters, including spaces. When read on a system running Windows 9X, Windows NT 4, Windows 2000, or recent releases of Linux, a Joliet-formatted disc displays long filenames and directory names. When read on a system running an operating system that does not support Microsoft long filename standards, the Joliet-formatted disc is recognized as a standard ISO 9660 disc. Full information about the Joliet standard is available at http://www-plateau.cs.berkeley.edu/people/chaffee/jolspec.html.
Consider logical formatting issues carefully if you plan to use CD-R premastering software to back up a hard disk that uses Windows long filenames and long folder names. ISO formatting restrictions mean that it's quite possible to have multiple subdirectories in one directory (or multiple files in one directory) whose long names are unambiguous, but whose truncated names are not. That means you might be unable to copy all files to CD unless you are very careful about using filenames and directory names that will truncate to unambiguous short names.
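One way to spot such problems before burning is to truncate every name in a directory and look for duplicates. The sketch below uses a deliberately naive truncation rule (uppercase, replace illegal characters with underscores, cut to 8.3). That rule is an assumption made for illustration only, because actual premastering software may shorten names differently. Both function names are hypothetical.

```python
import re
from collections import defaultdict


def naive_short_name(long_name: str) -> str:
    """Truncate a long name to 8.3 using a simplistic, hypothetical rule."""
    base, dot, ext = long_name.rpartition(".")
    if not dot:                      # no period: the whole name is the base
        base, ext = long_name, ""

    def clean(part: str) -> str:
        # Uppercase, then replace anything outside A-Z, 0-9, _ with underscore.
        return re.sub(r"[^A-Z0-9_]", "_", part.upper())

    short = clean(base)[:8]
    return f"{short}.{clean(ext)[:3]}" if ext else short


def find_collisions(names: list[str]) -> dict[str, list[str]]:
    """Map each ambiguous short name to the long names that truncate to it."""
    groups: dict[str, list[str]] = defaultdict(list)
    for name in names:
        groups[naive_short_name(name)].append(name)
    return {short: longs for short, longs in groups.items() if len(longs) > 1}


folder = ["Quarterly Report 2001.doc", "Quarterly Report 2002.doc", "Budget.xls"]
print(find_collisions(folder))
```

Here the two quarterly reports both truncate to the same short name, so one of them would need to be renamed before the directory could be copied intact.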
11.2.2.3 Universal Disk Format (UDF)
ISO 9660 and its variants were designed for duplicating or premastering discs, but were never intended to allow incrementally adding small amounts of data to a disc. Although ISO 9660 allows adding data to a disc (until that disc has been closed), the only way to do so is by opening a new session on that disc. That means that writing even one new file incurs the overhead required for a new session, which ranges from 13 MB to 22 MB.
In part to address these ISO 9660 limitations, OSTA defined a new logical format for optical discs. The official designation of this format is ISO 13346, but the common name is Universal Disk Format (UDF). UDF is an operating system-independent logical formatting standard that defines how data is written to various types of optical discs, including CD-R, CD-RW, DVD-ROM, DVD-Video, and DVD-Audio. UDF uses a redesigned directory structure that allows small amounts of data (called packets) to be written incrementally and individually to disc without incurring the large overhead associated with writing a new session under ISO 9660.
In effect, with UDF each packet is written as a subsession within a standard session, incurring the standard session overhead only when that standard session is closed. Packet-writing software typically closes the session automatically when the disc is ejected using the eject feature of the software. As with ISO 9660, an open session on a UDF-formatted disc can be read only by a CD recorder. Closing the session allows the disc to be read by a standard CD-ROM drive or CD player. It's possible, however, to subsequently open a new session and add additional packet data to the disc.
In addition to session overhead, UDF addresses another issue that makes ISO 9660 completely inappropriate for packet writing. ISO 9660 must know, in advance, exactly which files are to be written during a session. It uses this information to create and write the Path Tables and Primary Volume Descriptors, which point to the physical locations of the files on disc. Because packet writing allows any arbitrarily selected file to be written to disc at any time, the information that ISO 9660 requires is not available before the write occurs.
UDF solves this problem by accumulating data about the physical locations of files as they are written. At the end of a packet-writing session, UDF consolidates these location pointers and writes them to disc as the Virtual Allocation Table (VAT). The VAT address of a file remains the same, even if it is overwritten. At the end of each packet-writing session, UDF creates a new VAT that includes not just the pointers for newly created or modified files, but also the pointers stored in the old VAT. That means the current VAT always includes pointers to every file that has been written to the disc since it was originally formatted.
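The behavior described above can be modeled with a small lookup table. This is a conceptual sketch only: the class, its methods, and the sector numbers are hypothetical, and the real on-disc VAT layout defined by the UDF specification is considerably more involved. It shows two properties from the text: a file keeps its VAT address when it is overwritten, and each new VAT carries forward the pointers from the old one.

```python
from typing import Optional


class VirtualAllocationTable:
    """Toy model: maps a stable virtual address to the latest physical sector."""

    def __init__(self, previous: Optional["VirtualAllocationTable"] = None):
        # A new VAT starts with every pointer stored in the old VAT.
        self._entries = dict(previous._entries) if previous else {}

    def record_write(self, virtual_address: int, physical_sector: int) -> None:
        """Point a virtual address at wherever the data was just written."""
        self._entries[virtual_address] = physical_sector

    def locate(self, virtual_address: int) -> int:
        """Return the physical sector currently holding this virtual address."""
        return self._entries[virtual_address]


# First packet-writing session: two files land at sectors 1000 and 1050.
first_vat = VirtualAllocationTable()
first_vat.record_write(0, 1000)
first_vat.record_write(1, 1050)

# Second session: file 0 is rewritten elsewhere; file 1 is untouched.
second_vat = VirtualAllocationTable(previous=first_vat)
second_vat.record_write(0, 2000)

print(second_vat.locate(0), second_vat.locate(1))
```

Because the medium is write-once, the rewritten file occupies new sectors, but anything that refers to virtual address 0 continues to work by consulting the newest VAT.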
The advantages of packet writing come at the cost of reduced capacity. A typical CD-R/RW disc stores about 650 MB with ISO 9660 formatting, but stores only about 500 MB with UDF formatting. About 100 MB of that reduced capacity is accounted for by the complex UDF directory and control structures that allow data to be added and deleted incrementally. The remaining 50 MB or so is used to implement various measures to distribute wear evenly across the CD-RW disc, preventing some areas from being overused and thereby rendered unwritable while other areas remain lightly used.
Two versions of UDF are in common use:
- UDF 1.02
UDF 1.02 was adopted in August 1996, and is the finalized version of the October 1995 UDF 1.0 specification. UDF 1.02 specifies standards for DVD and DVD-ROM, but does not support writable optical media. Windows NT 4, Windows 98/SE/Me, and Windows 2000 include native UDF 1.02 support, which allows them to access DVD video and DVD-ROM discs natively.
- UDF 1.5
UDF 1.5 was adopted in February 1997, and addresses the requirements of sequential recorded media, including CD-R, CD-RW, and DVD-RAM. UDF 1.5 adds the Virtual Allocation Table (VAT), which is analogous to the DOS File Allocation Table, and, optionally, the Sparing Table, which allows bad sectors to be marked as unusable and replaced by spare sectors. Windows 2000 includes native UDF 1.5 support, but Windows NT and Windows 9X do not. You can download UDF 1.5 reader software for these versions of Windows from http://www.adaptec.com/products/overview/udfreaders.html.
The UDF 2.0 and 2.01 specifications are available, but not yet commonly used in commercial products. For more information about UDF, see http://www.osta.org.