
NAB ENGINEERING HANDBOOK. Copyright © 2007 Focal Press. All rights of reproduction in any form reserved.

CHAPTER 3.4

Audio Recording

MICHAEL STARLING
National Public Radio
Washington, D.C.

JOHN EMMETT
Broadcast Project Research Ltd.
Teddington, United Kingdom

    INTRODUCTION

Anyone who has been away from the field of audio recording for a number of years may be surprised by the range of technologies that are discussed in this chapter. That is not to say that the process of sound recording has changed fundamentally, but the specific recording equipment has changed and, in many cases, even the physical media might appear to have vanished. In large part, audio has followed text in becoming a generic Information Technology (IT) data application.

For much of the 20th century, analog magnetic tape was the broadcaster's principal medium for audio recording. In the 1980s and 1990s, the industry focused on the standardization and implementation of digital recording techniques using magnetic and optical media. Recent years have seen the near-completion of the migration to digital systems, including solid-state storage with no moving parts and the continuing refinement of hard-disk-based storage technologies.

This chapter covers a wide range of recording technologies and devices, both past and present. It includes material originally published in previous editions of the NAB Engineering Handbook¹ on both analog and digital recording, followed by new sections that discuss the migration to IT-based recording.² In-depth coverage of computer-based audio recording systems may be found in Chapter 3.6, "Radio Station Automation, Networks, and Storage."

    HISTORY

Until about 130 years ago, all sound was live sound. Despite the developments of mechanical recording, it was only with the harnessing and commercial use of electricity that amplification and broadcasting could be developed, and that subsequently led to a wider need to store and reproduce sound waves. It also offered the tools that were needed to develop ever more sophisticated recording systems.

    Mechanical Recording

Audio recording preceded and helped fuel the introduction of broadcasting. The earliest recorded audio came from Thomas Edison's 1877 mechanical cylinder phonograph, which employed a constant-velocity vertical recording groove. The phonograph's cylindrical media mandated that each recording be a master and stymied mass production.

In 1887, Emile Berliner patented a successful system of sound recording on flat discs. The first discs were made of glass, later zinc, and eventually plastic. An acoustic horn was used to collect the sound, which was converted into vibrations recorded in a spiral groove that was etched into the disc. For replay, the disc rotated on a turntable while a pickup arm held a needle that read the grooves in the record. The vibrations in the needle produced the sound that was transmitted mechanically to the horn speaker, the whole machine being known as a gramophone.

The flat disc recordings could be duplicated by creating molds from the master recordings, allowing copies to be mass-produced. Berliner's discs, which became known as records, dominated recorded audio for the next half-century.

¹Mike Starling, "Audio Recording Systems," NAB Engineering Handbook, 9th edition, National Association of Broadcasters, 1999, pp. 321–340.

²John Emmett, "Sound Recording," Broadcast Engineers Reference Book, Focal Press, 2004, pp. 599–607. Used with permission.

In the 1920s, the technology for disc recording improved greatly with the introduction of microphones and electronic amplifiers to drive an electromagnetic cutting head on a lathe to produce the master disc, instead of relying on the acoustic horn.

Early broadcast use of recorded media exploded in the late 1920s. This development coincided with the rapid proliferation of AM broadcasting. Among the first actions of the Federal Radio Commission in 1928 was the deletion of several stations due to their heavy reliance on airing commercial records, which the FRC cited as "provision of a service which the public can readily enjoy without the service." The new FRC favored original programming, and this stimulated the use of disc recording lathes by the burgeoning population of radio stations. While many, if not most, early broadcast facilities acquired recording lathes for production of recorded audio, the widespread introduction of the more forgiving and affordable magnetic tape recording did not begin until after World War II.

    Magnetic Recording

Danish telephone engineer Valdemar Poulsen demonstrated a magnetic wire recorder as early as 1898, as shown in Figure 3.4-1. It was 30 years later that German researchers pioneered magnetically coated, paper-based tape for good-quality recorders/reproducers. By 1936, German scientists had advanced magnetic recording using cellulose-based tape and achieved remarkably good sound quality.

After the war, the AEG Magnetophon (see Figure 3.4-2) was copied and commercially exploited worldwide. A host of benefits compared to disc-based recording, including portability, immediacy of playback, ease of storage, wide dynamic range, low distortion, and freedom from ticks and pops, propelled magnetic recording to the forefront in broadcasting.

    Principles of Magnetic Recording 

Magnetic audio tape recording uses a tape composed of a plastic base material with a thin coating, or emulsion, of ferric oxide powder. This is a ferromagnetic material, meaning that if exposed to a magnetic field, it becomes magnetized by the field.

The tape recording device uses a record head that applies a magnetic flux to the oxide on the tape as it is moved past the record head at a constant speed. The oxide then responds to the flux as it passes. The record head is a small, circular electromagnet with a small gap in it. During recording, the audio signal is sent to the record head to create a magnetic field in the core. At the gap, magnetic flux forms a fringe pattern to bridge the gap, and this flux is what magnetizes the oxide on the tape. During playback, the motion of the tape pulls a varying magnetic field across the gap in the playback head (which may be the same head used for recording, or a different head). This creates a varying magnetic field in the core and therefore a signal in the coil of the electromagnet. This signal is amplified to drive the speakers.
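The playback step just described is Faraday induction. As a minimal formulation (the symbols are generic conventions, not notation from this chapter), the voltage induced in a playback head coil of N turns by the flux Φ(t) crossing the gap is

```latex
e(t) = -N \,\frac{d\Phi(t)}{dt}
```

so the head output for a recorded sine wave rises with frequency (about 6 dB per octave), one of the nonuniformities that the record and playback equalization discussed later in this chapter must correct.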

FIGURE 3.4-1 Valdemar Poulsen's Telegraphone won a Grand Prix Award at the 1900 Paris World's Fair. The device used pole pieces located on each side of the wire. (From Finn Jorgensen, The Complete Handbook of Magnetic Recording, 4th edition, 1995, McGraw-Hill. Reproduced with permission of the McGraw-Hill Companies.)

FIGURE 3.4-2 Early German Magnetophon brought to the USA after capture by officers in the U.S. Army Signal Corps. (From Finn Jorgensen, The Complete Handbook of Magnetic Recording, 4th edition, 1995, McGraw-Hill. Reproduced with permission of the McGraw-Hill Companies.)


From the earliest days, various techniques were used to improve the quality of the recorded analog signal. These include adding a high-frequency bias signal to the signal applied to the record head; this improves the linearity of the magnetic recording process and greatly reduces distortion levels. Another technique is to provide record and playback audio equalization that improves frequency response and signal-to-noise ratio.

     Further Developments

Numerous variants of the analog magnetic tape recorder were developed for different applications, all based on the same basic recording principles. These variants included different arrangements of the reels or cassettes, as shown later in the chapter. A range of magnetic emulsion formulations was developed with features specific to the application. In some cases, chromium dioxide or metal particles were used rather than ferric oxide. Machines with different tape speeds, tape widths, track widths, bias, equalization, and level standards were all developed for specific purposes.

There were decades of incremental refinements in frequency response, signal-to-noise ratio, print-through, emulsion composition, backing media, lubrication, and adhesive composition, all of which interact with one another and require trade-offs depending on the intended purpose of the recording device. From the 1960s onward, various techniques for noise reduction were introduced, including the well-known systems from Dolby Laboratories.

    SOUND RECORDING FORMATS

Because of the long history of sound recording, broadcast sound operators may be presented for many years to come with source recordings in many and varied forms, so the most significant are discussed in the following sections.

In order to recognize the majority of sound recording formats for what they are, and especially with all the languages in European broadcasting, the European Broadcasting Union (EBU) has worked on the International Broadcast Tape Number (IBTN) scheme and an associated bar-code label specification given in the EBU document Tech 3279.

The IBTN scheme can be applied to any broadcast tape and related items and enables them to be uniquely identified from the earliest stages of the production process. The bar-code representation of the IBTN allows broadcast tapes to be scanned as they move from production facilities to broadcasting outlets and during transfers between broadcasters.

For convenience, recording formats can be divided into those carrying analog or digital signals and then subdivided by the type of mechanical carrier. Table 3.4-1 provides an extract of the most common IBTN sound recording format codes, from which these subdivisions can be made.

The formats listed in Table 3.4-1 currently form the bulk of broadcast industry sound archives.

TABLE 3.4-1 The Most Common International Broadcast Tape Number (IBTN) Scheme Sound Recording Format Codes

Code  Material
16T   16 mm sepmag analog audio film
17T   17.5 mm sepmag analog audio film
33L   33 rpm LP phonogram analog audio disc
35T   35 mm sepmag analog audio film
45D   45 rpm phonogram analog audio disc
78D   78 rpm phonogram analog audio disc
A01   6.3 mm (¼") analog audio tape, full track
A02   6.3 mm (¼") analog audio tape, 2-channel
A04   6.3 mm (¼") half-track analog audio tape, stereo
A08   12.5 mm (½") analog audio tape, 8-channel
A16   25.4 mm (1") analog audio tape, 16-channel
A32   25.4 mm (1") analog audio tape, 32-channel
AI1   AIT (Advanced Intelligent Tape) digital data tape, 25 GB capacity
AI2   AIT digital data tape, 50 GB capacity
AI3   AIT digital data tape, 36 GB capacity
AIX   AIT digital data tape, extended length, 35 GB capacity
AS2   6.3 mm (¼") analog audio tape, 2-channel stereo
AT2   6.3 mm (¼") analog audio tape, 2-channel stereo and TC
CCA   Compact Cassette format analog audio tape, cassette
CDA   Compact Disc Audio digital audio disc
CDD   CD-ROM digital data disc
CDR   Recordable CD digital data disc
D24   25.4 mm (1") DASH format digital audio tape, 24-track
D32   25.4 mm (1") PD format digital audio tape, 32-channel
D48   25.4 mm (1") DASH format digital audio tape, 48-track
DA2   DAT format digital audio tape, 2-channel
DAT   DAT format digital audio tape, stereo
DCC   DCC format digital audio tape
DD2   6.3 mm (¼") DASH format digital audio tape, 2-channel
DP2   6.3 mm (¼") PD format digital audio tape, 2-channel


    ANALOG RECORDING FORMATS

    Quarter-Inch and Cassette Tape Formats

The media for the most common professional analog audio tape formats are shown in Figure 3.4-3. On the left is a typical analog radio station mainstay, the 10-1/2-inch diameter "NAB" center 1/4-inch tape reel providing one hour of recording time at 7-1/2 ips. Alongside is a full 7-inch reel of Ampex 600 tape providing 1/2 hour of recording time. Run time is reduced by half at 15 ips. Portable 1/4-inch tape recorders for field reporting could also record at 3-3/4 ips, giving longer record times with smaller tape reels. Specialized recorders for station logging could run at speeds as low as 1-7/8 ips or even 15/16 ips to give extra-long record times at reduced quality.
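The speed/run-time trade-off above is simple proportionality; as a rough sketch (the 2,400 ft tape length assumed here for a 10-1/2-inch reel of standard-play tape is a typical figure, not one from this chapter):

```python
# Run time of an open-reel tape: time = length / speed.
# Assumes a 2,400 ft standard-play load on a 10-1/2-inch reel (typical, assumed figure).
REEL_LENGTH_FT = 2400

def run_time_minutes(speed_ips: float, length_ft: float = REEL_LENGTH_FT) -> float:
    """Return run time in minutes for a tape of length_ft at speed_ips (inches per second)."""
    inches = length_ft * 12
    return inches / speed_ips / 60

for speed in (15, 7.5, 3.75, 1.875, 0.9375):
    print(f"{speed:>7} ips: {run_time_minutes(speed):4.0f} min")
```

At 7-1/2 ips this reproduces roughly the one hour quoted for the 10-1/2-inch reel, and each halving of speed doubles the run time, in line with the text.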

The NAB standardized endless-loop cartridge, or cart, on the right in Figure 3.4-3, was the staple contribution format for radio station commercials and inserts from the 1960s until the 1990s. This format used 1/4-inch tape with a lubricated backing, usually running at 7-1/2 ips. The center cue track carried several automation tones, which could be used to cue and trigger cart players in order to automate commercial breaks. Distributed tapes were usually recorded on open-reel recorders, or duplicators, and then physically transferred to the cartridge for distribution. For archival or restoration purposes, the tape can be extracted for replay on an open-reel recorder, and this is to be preferred for transfer, as the tape path control in cart players is necessarily poor. The format is still in use by some broadcasters, but for quality and reliability reasons, there has been a steady migration to digital storage such as the MiniDisc and, later, hard disk and solid-state storage.

In the background is a compact cassette, which typically recorded 45 minutes per side in stereo at 1-7/8 ips. The Compact Cassette was developed by Philips and introduced for consumer recording in 1963, but from the late 1960s, and with the use of noise-reduction techniques, cassettes became increasingly popular as rugged interview recorders for professional users. More recently, they were superseded by MiniDisc and solid-state recorders.

For many years, the workhorse of broadcast sound recording was the 1/4-inch open-reel recorder, of which an example is shown in Figure 3.4-4.

In dealing with legacy analog tape program material that has to be played back, one first needs to establish the tape speed and track layout of the original recording so that a suitable playback machine can be found. The record characteristics for the tape should also be determined if possible, although this may be more difficult to establish. High-frequency and (sometimes) low-frequency equalization is applied during analog tape recording to compensate for nonuniform response in the head-tape system. Equalization has to be applied during playback to achieve a flat overall frequency response. The characteristic was standardized in the United States according to NAB-published criteria and by the CCIR and IEC in Europe.

TABLE 3.4-1 (continued)

Code  Material
H8A   Hi-8 format 8-channel digital audio tape, cassette
LAQ   Lacquer phonograph analog audio disc
MDA   MD (MiniDisc) digital audio disc
NAB   NAB cartridge analog audio tape
SVA   A-DAT 8-channel digital audio tape
WAX   Wax cylinder phonogram analog audio disc

FIGURE 3.4-3 The main analog tape media.

FIGURE 3.4-4 Analog 1/4-inch tape recorder. Mounting a recorder that would take 10.5-inch spools into a 19-inch rack was always a challenge, and the Revox PR99 was one machine with this capability.


Table 3.4-2 lists the most common recording equalization characteristics for 1/4-inch tapes. The break frequencies of the filters are described by the time constants of simple RC low-pass networks. If a replay machine with playback settings matching the tape cannot be found, the correction equalization can be quite easily applied, post replay, inside a digital audio workstation.
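A time constant τ maps to the break frequency of its RC network as f = 1/(2πτ); as a sketch, the Table 3.4-2 constants convert as follows:

```python
import math

def corner_freq_hz(tau_us: float) -> float:
    """Break frequency of a simple RC network, from its time constant in microseconds."""
    return 1.0 / (2 * math.pi * tau_us * 1e-6)

# A few of the time constants from Table 3.4-2:
for tau in (3180, 120, 90, 70, 50, 35):
    print(f"{tau:>5} us -> {corner_freq_hz(tau):8.1f} Hz")
```

For example, the 3180 µs bass-boost constant corresponds to roughly 50 Hz, and the 50 µs high roll-off to roughly 3.2 kHz.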

The 1/4-inch analog tapes recorded with center time code or sync tracks for video synchronization require a special replay machine that may be increasingly difficult to find.

Multitrack Tape and Sepmag Film Recording Formats

Multitrack analog audio tapes, especially in the 2-inch wide 24-track form, still form a major interchange format within recording studios for remixing archive music sessions, but in broadcast use these tapes are rare.

Sprocketed analog magnetic film, known as separate magnetic (sepmag) recording, was for many years associated with the huge quantities of 16 mm broadcast film material used for television. There are few variations in playback standards, although any sepmag transfer will require a specialized playback deck.

MAINTENANCE, CARE, AND STORAGE OF MAGNETIC TAPE RECORDINGS

Although many analog recordings have held up in good condition for decades, they are quite sensitive to permanent physical damage from improper handling, machine malfunction, and environmental hazards. Winding tapes tails out immediately after complete playback is the most important safeguard in preventing physical edge damage to audiotapes. Environmental damage is best guarded against by cleanliness and by carefully controlling temperature and humidity.

Tape wind, or pack, must be even to prevent protrusion scatter between layers that will crease and permanently damage tape edges during subsequent playback. Scatter-wound tape is susceptible to edge damage from the pressure exerted on flanges during careless handling. For this reason, reels should be handled by their hubs rather than by the flanges. Similarly, cinching of layers, with actual foldover, is possible during rapid acceleration/deceleration from jerky transport operation.

Many professional recorders have a library wind mode that operates at a higher than normal operating speed but with constant tension to assure a smooth pack. Tape libraries invariably have professional tape winding equipment that is optimized for gentle handling during higher speed precision winding. At professional libraries, preventive maintenance includes periodic rewinding to minimize print-through and depletion of lubrication, and to interrupt stiction buildup from adhesive action. The recommended period between rewindings varies greatly with storage conditions: tapes stored at 20°C should be rewound every 3,000 hours, while tapes stored at 30°C should be rewound roughly every 300 hours.

Minute particles can cause serious system degradation. Static buildup, scraping, scratching of the tape surface, and separation in pack and head contact can cause dropouts and permanent damage to tape and equipment. Thus, frequent cleaning of heads, guides, capstan, and pinch roller, typically after each recording or playback session, is imperative. Careful demagnetization of heads is also required for best performance, at intervals depending on the number of hours of operation. Oils and salts from fingerprints will attract foreign particles and can themselves interfere with a reliable head-to-tape interface.

Hydrolysis is a chemical reaction with water that affects polyester-based recording tapes. High temperature and high humidity will accelerate hydrolysis reactions in any polyester-based tape stock. However, from roughly 1977–1983, an industrywide polyester binder phenomenon, referred to as sticky-shed syndrome, exacerbated the rate of hydrolysis reactivity.

Tapes from the sticky-shed era typically exhibit slip-stick phenomena as carboxylic acid and alcohol are sloughed from the binder as debris products. Tapes of this vintage are frequently unusable due to residue buildup that causes transports to squeal and bind. Fortunately, this phenomenon has been extensively documented and can be reversed temporarily with no apparent damage to the tape recording. The reversal process consists of warming (or baking) the tapes in a convection oven at 120°F for 24 hours. The tapes will then be usable upon cooling for several weeks before hydrolysis again sheds sufficient

TABLE 3.4-2 The IEC and NAB Record Equalization Characteristics for 1/4-Inch and Cassette Tape

                              CCIR/IEC Time Constants      NAB Time Constants
Tape Speed / Use              Bass Boost   High Roll-Off   Bass Boost   High Roll-Off
15 ips, studio                none         35 µs           3180 µs      50 µs
7-1/2 ips, studio             none         70 µs           3180 µs      50 µs
7-1/2 ips, studio carts       none         50 µs           none         50 µs
3-3/4 ips, reporter & home    3180 µs      90 µs           3180 µs      90 µs
1-7/8 ips, logging & home     3180 µs      120 µs          3180 µs      90 µs
1-7/8 ips, Fe cassette        3180 µs      120 µs          —            —
1-7/8 ips, Cr cassette        3180 µs      70 µs           —            —


signal with amplitude on the order of one quantization step. The signal value crosses back and forth across the threshold, resulting in a square wave signal from the quantizer. Dither suppresses such quantization error. Dither is a low-amplitude analog noise added to the input analog signal (similarly, digital dither must be employed in the context of digital computation when rounding occurs).

When dither is added to a signal with amplitude on the order of a quantization step, the result is duty-cycle modulation that preserves the information of the original signal. The average value of the quantized signal can move continuously between two steps; thus, the incremental effect of quantization has been alleviated. Audibly, the result is the original waveform, with added noise. That is more desirable than the clipped quantization waveform. With dither, the resolution of a digitization system extends below the least significant bit.
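The duty-cycle-modulation effect is easy to demonstrate numerically. In this sketch (plain Python with TPDF dither, both implementation choices rather than anything specified here), a sine of 0.4-step amplitude vanishes completely under plain rounding but survives when dithered:

```python
import math
import random

random.seed(1)
N = 20_000
PERIOD = 200  # samples per cycle of the test tone

def fundamental_amplitude(samples):
    """Amplitude of the test tone, recovered by correlating with its DFT bin."""
    re = sum(s * math.cos(2 * math.pi * i / PERIOD) for i, s in enumerate(samples))
    im = sum(s * math.sin(2 * math.pi * i / PERIOD) for i, s in enumerate(samples))
    return 2 * math.hypot(re, im) / len(samples)

# Test signal: a sine of 0.4 LSB amplitude -- below one quantization step.
signal = [0.4 * math.sin(2 * math.pi * i / PERIOD) for i in range(N)]

# Plain quantization: a signal smaller than half a step rounds to zero everywhere.
undithered = [round(s) for s in signal]

# TPDF dither (sum of two uniform sources, +/-1 step peak) added before rounding:
# the quantizer output becomes duty-cycle modulated and the tone survives.
dithered = [round(s + random.uniform(-0.5, 0.5) + random.uniform(-0.5, 0.5))
            for s in signal]

print(fundamental_amplitude(undithered))  # 0.0 -- the tone is gone
print(fundamental_amplitude(dithered))    # close to 0.4 -- the tone is preserved
```

Averaging the dithered output over many cycles recovers the original waveform, at the cost of a benign noise floor, exactly as the text describes.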

The recording section of a pulse-code modulation (PCM) system, shown in Figure 3.4-6(a), consists of input amplifiers, a dither generator, input (antialiasing) low-pass filters, sample-and-hold circuits, analog-to-digital converters, a multiplexer, digital processing circuits for error correction and modulation, and a storage medium such as digital tape. The reproduction section, shown in Figure 3.4-6(b), contains processing circuits for demodulation and error correction, a demultiplexer, digital-to-analog converters, output sample-and-hold circuits, output (anti-imaging) low-pass filters, and output amplifiers. In most contemporary designs, digital filters are used in both the input and output stages. The output section forms the basis for a compact disc player.

FIGURE 3.4-5 Summary of discrete-time sampling, shown in the time and frequency domains. (From Ken C. Pohlmann, Principles of Digital Audio.)

DIGITAL AUDIO RECORDING SYSTEMS⁴

Removable Media for Digital Recordings

Digital audio can be recorded using a wide variety of optical, magnetic, magneto-optical, and solid-state media. The most common removable media for professional digital recording are shown in Figure 3.4-7.

In the 1980s, rotary-head developments from analog video recording and the low importance of print-through for digital recordings led to a big jump forward in sound recording density. The stereo R-DAT tape on the left in Figure 3.4-7 is essentially a miniaturized video tape cassette using 4 mm wide tape, while for the multitrack DA-88, introduced in 1992, the tape was 8 mm wide and actually used the same cassette as the Hi-8 consumer analog video recorder. The compact disc (CD)-sized, 120 mm diameter optical disc, first introduced as the audio CD in 1983, can carry any one of a number of recorded formats. The optical discs had an immediate advantage of quick access, rather than lengthy tape spooling, but that meant that a discipline of recording metadata was needed. The 80 mm disc in the caddy is uniquely a MiniDisc for recording stereo audio, introduced in 1992.

FIGURE 3.4-6 Block diagram of the recording (a) and reproduction (b) sections of a linear PCM system. (From Pohlmann, Principles of Digital Audio.)

⁴This section is adapted from a contribution by Ken C. Pohlmann, NAB Engineering Handbook, 8th edition, National Association of Broadcasters, 1992, pp. 863–875, with new additions.


The various devices and technologies that utilized these media are discussed next.

Compact Disc

The compact disc was developed to store up to 74 minutes of stereo digital audio program material as 16-bit PCM data sampled at 44.1 kHz. The total user capacity is over 650 MB. In addition, for successful storage on a nonperfect medium, error correction, synchronization, modulation, and subcoding were required. The CD was originally conceived as a distribution medium to replace vinyl records, but its high quality of uncompressed digital audio firmly established it as the medium of choice for playback of prerecorded music in the professional domain. Later variants were introduced for recording and playing back both data and audio.

Although the audio CD has now to some extent been superseded by more recent optical developments such as the DVD family, which grew out of it, and although, for professional purposes, the streaming format used for recording has been largely replaced by file-based audio, a study of the design process is common to all optical storage systems and provides a good insight into data storage in general.
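The capacity figures quoted above follow from simple arithmetic; a quick sketch (the 75 sectors/s rate and 2,048-byte user sector are CD-ROM Mode 1 parameters, which is where the 650 MB data figure comes from, while raw audio packs more user bytes into the same playing time):

```python
SECONDS = 74 * 60 + 33          # Red Book maximum playing time (74 min 33 s)
FS, BITS, CHANNELS = 44_100, 16, 2

# Raw audio: every sample byte is program material.
audio_bytes = SECONDS * FS * CHANNELS * BITS // 8
print(f"raw audio: {audio_bytes / 1e6:.0f} MB")      # ~789 MB

# CD-ROM Mode 1 user data over 74 minutes: 75 sectors/s x 2048 bytes/sector.
data_bytes = 74 * 60 * 75 * 2048
print(f"Mode 1 data: {data_bytes / 2**20:.0f} MiB")  # ~650 MiB
```

The difference between the two figures is the extra sector-level error correction and addressing that data storage on a CD-ROM requires.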

Compact Disc Physical Design

The diameter of a compact disc is 120 mm, its center hole diameter is 15 mm, and its thickness is 1.2 mm. Data are recorded in an area 35.5 mm wide, bounded by a lead-in area and a lead-out area, which contain nonaudio subcode data used to control the player's operation. The disc is constructed with a transparent polycarbonate substrate. Data are represented by pits that are impressed on the top of the substrate. The pit surface is covered with a thin metal (typically aluminum) layer 50–100 nm thick, and a plastic layer 10–30 µm thick. A label 5 µm thick is printed on top. Disc physical characteristics are shown in Figure 3.4-8.

Pits are configured in a continuous spiral from the inner circumference to the outer. The pit construction of the disc is diffraction-limited; the dimensions are as small as permitted by the wave nature of light at the wavelength of the readout laser. A pit is about 0.5 µm wide. The track pitch is 1.6 µm. There is a maximum of 20,188 revolutions across the disc's data area.

The disc rotates with a constant linear velocity (CLV), in which a uniform relative velocity is maintained between the disc and the pickup. To accomplish this, the rotation speed of the disc varies depending on the radial position of the pickup. The disc rotates at a speed of about 8 rev/s when the pickup is reading the inner circumference, and as the pickup moves outward, the rotational speed gradually decreases to about 3.5 rev/s. The player reads frame synchronization words from the data and adjusts the speed to maintain a constant data rate.
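The quoted rotation rates follow from the constant linear velocity; a sketch of the arithmetic (the 25 mm and 58 mm program-area radii are assumed typical values, not figures from this chapter):

```python
import math

V = 1.3            # pit-track linear velocity in m/s (mid-range of 1.2-1.4 m/s)
R_INNER = 0.025    # assumed radius at the start of the program area, m
R_OUTER = 0.058    # assumed radius at the end of the program area, m

def rev_per_s(radius_m: float, v: float = V) -> float:
    """Rotation rate needed to keep the pickup moving at linear velocity v."""
    return v / (2 * math.pi * radius_m)

print(f"inner: {rev_per_s(R_INNER):.1f} rev/s")  # about 8 rev/s
print(f"outer: {rev_per_s(R_OUTER):.1f} rev/s")  # about 3.6 rev/s
```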

The CD standard permits a maximum of 74 minutes, 33 seconds of audio playing time on a disc. However, when encoding specifications such as track pitch and linear velocity are modified, it is possible to manufacture discs with over 80 minutes of music. Although the linear velocity of the pit track on a given disc is constant, it can vary from 1.2–1.4 m/s, depending on disc playing time. All audio compact discs and players must be manufactured according to the Red Book, the CD standards document authored by Philips and Sony.

Compact Disc Encoding

CD encoding is the process of placing audio and other data in a frame format suitable for storage on the disc. The information contained in a CD frame prior to modulation consists of a 27-bit sync word, 8-bit subcode, 192 data bits, and 64 parity bits. The input audio bit rate is 1.41 × 10⁶ bps. Following encoding, the channel bit rate is 4.3218 × 10⁶ bps. Premastered digital audio data are typically stored on a 3/4 in. U-Matic video transport via a digital audio processor with a 44.1 kHz sampling rate and 16-bit linear quantization.

A frame is encoded with six 32-bit PCM audio sampling periods, alternating left and right channel 16-bit samples. Each 32-bit sampling period is divided to yield four 8-bit audio symbols. The CD system employs two error correction techniques: interleaving to distribute errors and parity to correct them. The standardized error correction algorithm used is the Cross-Interleave Reed-Solomon Code (CIRC), developed specifically for the compact disc system. It uses two correction codes and three interleaving stages. With error correction, over 200 errors per second can be completely corrected, and indeed, on such a storage medium, error rates of this size are to be expected.

FIGURE 3.4-7 The most-used digital recording media, left to right: R-DAT tape, compact disc, MiniDisc.

FIGURE 3.4-8 Compact disc physical specifications. (From Ken C. Pohlmann, The Compact Disc.)

    Subcode Data

Following CIRC encoding, an 8-bit subcode symbol is added to each frame. The 8 subcode bits (designated P, Q, R, S, T, U, V, and W) are used as 8 independent channels. Only the P and Q bits are required in the audio format; the other 6 bits are available for video or other information as defined by the CD+G/M (Graphics/MIDI) format. The CD player collects subcode symbols from 98 consecutive frames to form a subcode block with eight 98-bit words; blocks are output at a 75 Hz rate. A subcode block contains its own synchronization word, instruction and data, commands, and parity. An example of P and Q data is shown in Figure 3.4-9.

The P channel contains a flag bit that can be used to identify disc data areas. Most players use information in the more comprehensive Q channel. The Q channel contains four types of information: control, address, data, and a cyclic redundancy check code (CRCC) for subcode error detection. The control bits specify several playback conditions: the number of audio channels (two/four); pre-emphasis (on/off); and digital copy prohibited (yes/no). The address information consists of 4 bits designating three modes for the Q data bits. Mode 1 data are contained in the table of contents (TOC), which is read during disc initialization. The TOC stores data indicating the number of music selections as track numbers and the starting points of the tracks in disc running time. In the program and lead-out areas, Mode 1 contains track numbers, indices within a track, track time, and disc time. The optional Mode 2 contains the catalog number of the disc. The optional Mode 3 contains a country code, the owner code, the year of the recording, and a serial number.
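The 75 Hz subcode block rate is not arbitrary; it follows from the frame parameters given in this chapter. A quick numeric check:

```python
CHANNEL_BIT_RATE = 4_321_800   # channel bits per second (given earlier)
BITS_PER_FRAME = 588           # channel bits in one encoded frame
FRAMES_PER_BLOCK = 98          # frames collected into one subcode block

frame_rate = CHANNEL_BIT_RATE / BITS_PER_FRAME     # 7350 frames/s
block_rate = frame_rate / FRAMES_PER_BLOCK         # 75 blocks/s
print(frame_rate, block_rate)

# Cross-check: each frame carries 6 stereo samples at 44.1 kHz.
assert frame_rate == 44_100 / 6
```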

    EFM Encoding and Frame Assembly

    The audio, parity, and subcode data are modulatedusing eight-to-fourteen modulation (EFM) in whichsymbols of 8 data bits are assigned an arbitrary wordof 14 channel bits. When 14-bit words with a low num-

     ber and known rate of transitions are chosen, greaterdata density can be achieved. Each 14-bit word islinked by three merging bits. The 8-bit input symbolsrequire 256 different 14-bit code patterns. To achieve

    pits of controlled length, only those patterns are used in which more than two but less than ten 0s appear continuously. Two other patterns are used for subcode synchronization words. The selection of EFM bit patterns defines the physical relationship of the pit dimensions. The channel stream comprises a collection of 9 pits and 9 lands that range from 3T to 11T in length, where T is one period. A 3T pit ranges in length from 0.833–0.972 µm, and an 11T pit ranges in length from 3.054–3.560 µm, depending on pit track linear velocity. Each pit edge, whether leading or trailing, is a “1” and all increments in between, whether inside or outside a pit, are 0s, as shown in Figure 3.4-10.

    FIGURE 3.4-9 Typical subcode contents of the P and Q channels. (From Pohlmann, The Compact Disc.)
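    The run-length rule above can be checked by brute force (a Python sketch): of the 16,384 possible 14-bit words, 267 satisfy the constraint, from which the 256 EFM codewords and the subcode synchronization words are drawn.

```python
def valid_efm(word: int) -> bool:
    """Apply the EFM run-length rule to a candidate 14-bit channel word:
    2 to 10 zeros between consecutive 1s, and no zero run longer than
    10 at either end of the word."""
    ones = [i for i in range(14) if (word >> (13 - i)) & 1]
    if not ones:
        return False                       # fourteen 0s exceeds the limit
    if ones[0] > 10 or 13 - ones[-1] > 10:
        return False                       # edge zero run too long
    return all(2 <= b - a - 1 <= 10 for a, b in zip(ones, ones[1:]))

valid = [w for w in range(2 ** 14) if valid_efm(w)]
print(len(valid))   # 267 qualifying patterns; 256 become EFM codewords
```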

    The start of a frame is marked with a 24-bit synchronization pattern, plus three merging bits. The total number of channel bits per frame after encoding is 588, composed of 24 synchronization bits, 336 (12 × 2 × 14) data bits, 112 (4 × 2 × 14) error correction bits, 14 subcode bits, and 102 (34 × 3) merging bits.
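    A quick arithmetic check of this frame budget, and of the rates it implies (each frame carries 24 audio bytes, i.e., 6 stereo 16-bit samples):

```python
# Channel-bit budget of one CD frame, per the figures above.
sync = 24
merging = 34 * 3                  # 34 words, 3 merging bits each
audio = 12 * 2 * 14               # 24 audio symbols of 14 channel bits
parity = 4 * 2 * 14               # 8 CIRC parity symbols of 14 channel bits
subcode = 14                      # one subcode symbol per frame
frame_bits = sync + merging + audio + parity + subcode
print(frame_bits)                 # 588

# Six stereo samples per frame at 44.1 kHz gives the frame rate,
# the 98-frame subcode block rate, and the channel bit rate.
frame_rate = 44_100 // 6          # 7350 frames per second
print(frame_rate // 98)           # 75 subcode blocks per second
print(frame_rate * frame_bits)    # 4_321_800 channel bits per second
```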

    Data Readout 

    CD pickups use an aluminum gallium arsenide (AlGaAs) semiconductor laser generating laser light with a 780 nm wavelength, which was the most economical type to be developed during the late 1970s. Developments of the CD use shorter wavelength lasers in order to record smaller pits and hence denser data. The beam passes through the substrate, is focused on the metalized pit surface, and is reflected back. Because the disc data surface is physically separated from the reading side of the substrate, dust and surface damage on the substrate do not lie in the focal plane of the reading laser beam; hence, their effect is minimized. The polycarbonate substrate has a refractive index of 1.55; because of the bending of the beam from the change in refractive index, the thickness of the substrate, and the numerical aperture (0.45) of the laser pickup’s lens, the size of the laser spot is reduced from approximately 0.8 mm on the disc surface to approximately 1.7 µm at the pit surface. The laser spot on the data surface is an Airy function with a bright central spot and successively darker rings, and spot dimensions are quoted as half-power levels.

    When viewed from the laser’s perspective, the pits appear as bumps with height between 0.11–0.13 µm. This dimension is much smaller than the laser beam’s 500 nm wavelength in polycarbonate; the height of the bumps is approximately one-fourth of the laser’s wavelength in the substrate. The reflective flat surface of a CD is called land. Light striking land travels a distance one-half wavelength longer than light striking a bump, as shown in Figure 3.4-11. This creates an out-of-phase condition between the part of the beam reflected from the bump and the part reflected from the surrounding land. The beam thus undergoes destructive interference, resulting in cancellation. Optically, if the CD pit surface is considered as a two-dimensional reflective grating, the focused laser beam diffracts into higher orders, resulting in interference. The disc surface data thus modulate the intensity of the reflected light beam. In this way, the data physically encoded on the disc are recovered by the laser.
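    The quarter-wave geometry can be verified numerically (Python; values taken from the text):

```python
# Quarter-wave bump geometry: 780 nm laser, polycarbonate n = 1.55.
wavelength_vacuum = 780e-9                       # metres
n_polycarbonate = 1.55
wavelength_substrate = wavelength_vacuum / n_polycarbonate
print(round(wavelength_substrate * 1e9))         # ~503 nm ("about 500 nm")

bump_height = wavelength_substrate / 4           # ~0.125 um, within 0.11-0.13 um
path_difference = 2 * bump_height                # reflected light crosses the step twice
print(path_difference / wavelength_substrate)    # 0.5 -> half-wave, destructive
```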

    FIGURE 3.4-10 Channel bits as represented by the pit structure. (From Pohlmann, Principles of Digital Audio.)

    FIGURE 3.4-11 A pit causes cancellation through destructive interference.


    Data Decoding 

    A CD player’s data path, shown in Figure 3.4-12, directs the modulated light from the pickup through a series of processing circuits, ultimately yielding a stereo analog signal. Data decoding follows a procedure that essentially duplicates, in reverse order, the encoding process. The pickup’s photodiode array and its processing circuits output EFM data as a high-frequency signal. The first data to be extracted from the signal are the synchronization words. This information is used to synchronize the 33 symbols of channel information in each frame, and a synchronization pulse is generated to aid in locating the zero crossing of the EFM signal.

    The EFM signal is demodulated so that 17-bit EFM words (14 channel bits plus 3 merging bits) again become 8 bits. A memory is used to buffer the effect of disc rotational wow and flutter. Following EFM demodulation, data are sent to a CIRC decoder for de-interleaving, and error detection and correction. The CIRC decoder accepts one frame of 32 8-bit symbols: 24 audio symbols and 8 parity symbols. One frame of 24 8-bit symbols is output. Parity from two Reed-Solomon decoders is utilized. The first error correction decoder corrects random errors, and detects and flags burst errors. The second decoder primarily corrects burst errors, as well as random errors that the first decoder was unable to correct. Error concealment algorithms employing interpolation and muting circuits follow CIRC decoding.

    In most cases, the digital audio data are converted to a stereo analog signal. This reconstruction process requires one or two D/A converters, and low-pass filters to suppress high-frequency image components. Rather than use an analog brick wall filter after the signal has been converted to analog form, the digitized signal is processed before D/A conversion using an oversampling digital filter. An oversampling filter uses samples from the disc as input and then computes interpolation samples, digitally implementing the response of an analog filter.

    A finite impulse response (FIR) transversal filter is used in most CD players. Resampling is used to increase the sample rate; for example, in a four-times oversampling filter, three zero values are inserted for every data value output from the disc. This increases the data rate from 44.1 kHz to 176.4 kHz. Interpolation is used to generate the values of intermediate sample points—for example, three intermediate samples for each original sample. These samples are computed using coefficients derived from a low-pass filter response.
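    The zero-stuff-then-filter process can be sketched as follows. This is an illustrative toy, not a CD-grade filter: the triangular taps simply perform linear interpolation between the original samples, whereas a real player uses a much longer optimized FIR.

```python
def oversample_4x(samples, taps):
    """Four-times oversampling: zero-stuff the input, then apply an FIR
    low-pass filter by direct convolution (a sketch, not a CD filter)."""
    stuffed = []
    for s in samples:
        stuffed.extend([s, 0.0, 0.0, 0.0])    # 44.1 kHz -> 176.4 kHz
    out = []
    for n in range(len(stuffed)):
        acc = 0.0
        for k, h in enumerate(taps):
            if 0 <= n - k < len(stuffed):
                acc += h * stuffed[n - k]
        out.append(acc)
    return out

# Triangular taps = linear interpolation between original samples.
taps = [0.25, 0.5, 0.75, 1.0, 0.75, 0.5, 0.25]
out = oversample_4x([1.0, 1.0, 1.0], taps)
print(out[3:8])   # steady state: a constant input passes through at unity gain
```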

    The spectrum of the oversampled output waveform contains image spectra placed at multiples of the oversampling rate; for example, in a four-times oversampled signal, the first image is centered at 176.4 kHz. Because the audio baseband and sidebands are separated, a low-order analog filter can be used to remove the images, without causing phase shift or other artifacts common to high-order analog brick wall filters.

    Traditionally, D/A conversion is performed with a multibit PCM converter. In theory, a 16-bit converter could perfectly process the 16-bit signal from the disc. However, because of inaccuracies in converters, 18-bit D/A converters are often used because they can more accurately represent the signal. Alternatively, low-bit (sometimes called 1-bit) D/A converters can be used. They minimize many problems inherent in multibit converters, such as low-level nonlinearity and zero-cross distortion. Low-bit systems employ very high oversampling rates, noise shaping, and low-bit conversion.

    Also present in the audio output stage of every CD player is an audio de-emphasis circuit. Some CDs were encoded with an audio pre-emphasis characteristic with time constants of 15 and 50 µsec. Upon playback, de-emphasis is automatically carried out, resulting in an improvement in S/N.

    FIGURE 3.4-12 Block diagram of a CD player with digital filtering. (From Pohlmann, The Compact Disc.)


    Recordable CD-R

    With a CD-R (or CD-WO) write-once optical disc recorder, the user may record data until the disc capacity is filled. Recorded CD-R discs are playable on conventional CD players. A block diagram of a CD-R recorder is shown in Figure 3.4-13. An encoder circuit accepts an input PCM signal, performs CIRC error correction encoding, EFM modulation, and other coding, and directs the data stream to the recorder. The recorder accepts audio data and records up to 74 minutes in real time. In addition to audio data, a complete subcode table is written in the disc TOC, and appropriate flags are placed across the playing surface.

    Write-once media are manufactured similarly to conventional playback-only discs. As with regular CDs, they employ a substrate, reflective layer, and protective top layer. Sandwiched between the substrate and reflective layer, however, is a recording layer composed of an organic dye. Together with the reflective layer, it provides a typical in-groove reflectivity of 70% or more. Unlike playback-only CDs, a pregrooved spiral track is used to guide the recording laser along the spiral track; this greatly simplifies recorder hardware design and ensures disc compatibility. Shelf life of the media is said to be 10 years or more at 25°C and 65% relative humidity. However, the dye used in these discs is vulnerable to sunlight; thus, discs should not be exposed to bright sun over a long period.

    The CD-R format is defined in the Orange Book Standard authored by Philips and Sony. In CD recorders adhering to the Orange Book I Standard, a disc must be recorded in one pass—start-stop recording is not permitted. In recorders adhering to the Orange Book II Standard, recording may be stopped and started. In many players, tracks may be recorded at different times and replayed, but because the disc lacks the final TOC, it can be played only on a CD-R recorder. When the entire disc is recorded, the interim TOC data are transferred to a final TOC, and the disc may be played in any CD audio player. The program memory area (PMA) located at the inner portion of the disc contains the interim TOC record of the recorded tracks. In addition, discs contain a power calibration area (PCA); this allows recorders to automatically make test recordings to determine optimum laser power for recording. Some recorders exceed the Orange Book II Standard; they generate an interim TOC that allows partially recorded discs to be played on playback-only CD players.

    FIGURE 3.4-13 Block diagram of a CD-R recorder. (From Pohlmann, The Compact Disc.)

    CD-R recorders are useful because they eliminate the need to create an edited master tape prior to CD recording. If a passage is not wanted, it can be marked prior to writing the final TOC so that the recorder will not play it back. For example, dead air during a live performance can be marked so it is deleted whenever the disc is played back. The data physically continue to exist on the disc, however.

    Magnetic Digital Recording Design Considerations

    A great advantage of digital magnetic recording is that system performance is no longer limited by the performance of the storage medium. Since transitions, rather than perfect waveforms, are the fundamental language of digital recording systems, neither AC bias nor a particularly high signal-to-noise ratio is required. In fact, distorted waveforms are the norm. However, since a massive transition density must be stored for high fidelity audio, higher bandwidth and more precise magnetic emulsions are needed. Linear density, or kilobits per inch, is the critical factor. Several techniques are employed to maximize density capabilities, as well as to minimize density requirements.

    The need for higher storage densities for digital audio accelerated research and development in tape composition and magnetic head design. At higher recording densities, error vulnerability requires ever smoother recording media and revolutionary designs of recording and playback heads.

    Due to decreased signal-to-noise ratio requirements, print-through effects are operationally nonexistent, and much thinner tape base thicknesses and oxide layers are commonly employed. However, coercivity is much higher on digital magnetic media and typically ranges from 800–1500 Oe, versus the more typical 300–400 Oe in analog recordings. Thus, digital recordings are deep and robust.

    Acicular magnetic particles are cigar-shaped fragments employed in most magnetic digital recording media. Because transitions are the basis of digital recording, saturation recording is employed and is typically of the traditional longitudinal format. However, for greater storage density, the acicular particles can be oriented perpendicularly to the direction of the recording medium’s travel. A balance is required between too low a density, which requires excessive tape consumption, and too high a density, which requires additional error correction to combat dropouts and intersymbol interference.

    Isotropic recording utilizes longitudinal and vertical modes simultaneously. In isotropic recordings, the vertical field erases the longitudinal fields near the tape’s surface. Thus, the tape is recorded to saturation with longitudinal fields and is multiplexed with vertical fields near the surface. The longitudinal field is structured for dominance at low frequencies, and the vertical field carries the higher frequencies. Because the head gaps in isotropic recordings are so minute, there is essentially no intersymbol interference, because only a small area at the trailing edge of the gap is recorded.

    Thin film heads used for digital recording are of a substantially different design from analog heads. These heads are manufactured using photolithography to achieve a minute, precise shape. Multiturn thin-film inductive record heads (IRH) are used for recording but do not have good playback characteristics at slow speeds. However, magneto-resistive (MR) heads are useful because their output is independent of tape speed. With MR heads, the head never touches the tape, and thus both head and media life are prolonged. Both crosstalk and signal-to-noise characteristics are excellent in such systems.

    In order to minimize damage and errors due to head-to-media contact in systems with high media velocities, a load-carrying air film is formed at the interface between the record head and the magnetic media. Physical contact should occur only as the media starts and stops its motion. The air film must be thick enough to conceal any near-contact surface irregularities and thin enough to provide a reliable record and playback signal. Head-to-medium separation ranges from about 50 nm to 0.3 µm, and the roughness of the head and medium surfaces ranges from 1.5 to 10 nm rms.5

    Rotary-Head Digital Audio Tape

    The rotary-head digital audio tape (R-DAT or DAT) format was originally designed as a consumer medium to replace the analog cassette. However, the format has found wider application as a low-cost professional digital recording system, and although now obsolescent, it represented the state of the art in rotary head recording for many years; a study of the specification will be of great help in understanding many such tape formats. An example of a portable R-DAT recorder is shown in Figure 3.4-14.

    Format Specifications

    The DAT format is based on a tape only 3.81 mm wide, using rotating heads to achieve the head-to-tape speed necessary for digital recording. It supports four record/playback modes and two playback-only modes. The standard record/playback and both playback-only modes, wide and normal, are implemented on every DAT recorder. The standard mode offers 16-bit linear quantization and a 48 kHz sampling rate. Both playback-only modes use a 44.1 kHz sampling rate, for user-recorded and prerecorded tapes. Three other record/playback modes, called Options 1, 2, and 3, all use 32 kHz sampling rates. Option 1 provides two-hour recording time with 16-bit linear quantization. Option 2 provides four hours of recording time with 12-bit nonlinear quantization. Option 3 provides 4-channel recording and playback, also using 12-bit nonlinear quantization. These specifications are summarized in Figure 3.4-15.

    5See Bharat Bhushan, “Tribology of the Head-Medium Interface,” Magnetic Recording Technology, McGraw-Hill, 1996, Chapter 7.

    The user can write and erase nonaudio information into the subcode area: a start ID indicating the beginning of a selection, a skip ID to skip over a selection, and a program number indicating selection order. These subcode data permit rapid search and other functions. Although subcode data are recorded onto the tape in the helical scan track along with the audio signal, they are treated independently and can be rewritten without altering the audio program, and entered either during recording or playback. With the ID codes entered into the subcode area, desired points on the tape, such as the beginnings of selections, can be searched for at high speed by detecting each ID code. During playback, if the skip ID is marked, playback skips to the point at which the next start ID is marked, and playback begins again.

    In the DAT format, the recorded area is distinguished from a blank section of tape with no recorded signal, even if the recorded area does not contain an audio signal: unlike blank areas, the track format is always encoded on the tape even if no signal is present. If these sections are mixed on a tape, search operations may be slowed; hence, blank sections should be avoided. A consumer DAT deck with an interface meeting the specifications of the Sony/Philips digital interface format (SPDIF) will identify when data have been recorded with a copy-inhibit Serial Copy Management System (SCMS) flag in the subcode (ID6 in the main ID in the main data area) and will not digitally copy that recording. In other words, SCMS permits first-generation digital copying, but not second-generation copying. Analog copying is not inhibited.

    FIGURE 3.4-14 Sony PCM 2000 portable R-DAT recorder. This recorder continued some of the rugged engineering of its analog predecessors, weighing some eight pounds, while the cassette weighs merely an ounce. Recording duration on battery power was always an issue with portable R-DAT recorders.

    FIGURE 3.4-15 DAT standard specifications. (From Pohlmann, Principles of Digital Audio.)


    DAT Recorder Design

    From a hardware point of view, a DAT recorder utilizes many of the same elements as a CD-R recorder: A/D and D/A converters, modulators and demodulators, error correction encoding and decoding. Audio input is received in digital form, or is converted to digital by an A/D converter. Error correction code is added and interleaving is performed. As with any helical scan system, time compression must be used to separate the continuous input analog signal into segments prior to recording, and then rejoin them upon playback with time expansion to form a continuous audio output signal. Subcode information is added to the bitstream, and it undergoes eight-to-ten (8/10) modulation. This signal is recorded via a recording amplifier and rotary transformer.

    In the playback process, the rotary head recovers the recorded waveform. Track-finding signals are derived from the tape and used to automatically adjust tracking. Eight-to-ten demodulation takes place, and subcode data are separated and used for operator and servo control. A memory permits de-interleaving, as well as time expansion and elimination of wow and flutter. Error correction is accomplished in the context of de-interleaving. Finally, the audio signal is output as a digital signal or, through D/A converters, as an analog signal.

    The DAT rotary head permits slow linear tape speed while achieving high bandwidth. Each track is discontinuously recorded as the tape runs past the tilted head drum spinning rapidly in the same direction as tape travel. The result is diagonal tracks at an angle of slightly more than 6° from the tape edge, as shown in Figure 3.4-16. Despite the slow linear tape speed of 8.15 mm per second (5/16 in. per second), a high relative tape-to-head speed of about 3 m per second (120 in. per second) is obtained. A DAT rotating drum (typically 30 mm in diameter) rotates at 2000 rpm, typically has two heads placed 180° apart, and has a tape wrap of only 90°. Four-head designs provide direct read after write, so the recorded signal can be monitored.

    FIGURE 3.4-16 DAT track configuration. (From Pohlmann, Principles of Digital Audio.)

    Azimuth recording (or guard-bandless recording) is used, in which the drum’s two heads are angled differently with respect to the tape; this creates two track types, sometimes referred to as A and B, with differing azimuth angles between successively recorded tracks. This ±20° azimuth angle means that the A head will read an adjacent B track at an attenuated level due to phase cancellation. This reduces crosstalk between adjacent tracks, eliminates the need for a guard band between tracks, and promotes high-density recording. Erasure is accomplished by overwriting new data to tape such that successive tracks partially write over previous tracks. Thus, the head gaps (20.4 microns) are approximately 50% wider than the tracks (13.59 microns) recorded to tape.

    The length of each track is 23.501 mm. Each bit of data occupies 0.67 microns, with an overall recording data density of 114 Mb per square inch. With a sampling rate of 48 kHz and 16-bit quantization, the audio data rate for two channels is 1.536 Mbps. However, error correction encoding adds extra information amounting to about 60% of the original, increasing the data rate to about 2.46 Mbps. Subcode raises the overall data rate to 2.77 Mbps.
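    These figures can be sanity-checked directly (Python; drum geometry and rates from the text):

```python
import math

# Relative head-to-tape speed: drum circumference times revolutions
# per second. Tape motion adds only ~8 mm/s to this.
drum_diameter_m = 0.030
rpm = 2000
head_speed = math.pi * drum_diameter_m * (rpm / 60)   # metres per second
print(round(head_speed, 2))                # ~3.14 m/s, "about 3 m per second"

# Audio data rate for two 16-bit channels sampled at 48 kHz,
# plus the ~60% error correction overhead quoted above.
audio_rate = 48_000 * 16 * 2
print(audio_rate)                          # 1_536_000 bps
print(round(audio_rate * 1.6 / 1e6, 2))    # ~2.46 Mbps
```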

    The primary types of data recorded on each track are PCM audio, subcode, and automatic track finding (ATF) patterns. Each data (or sync) block contains a sync byte, an ID code byte, a block address code byte, a parity byte, and 32 data bytes. In total, there are 288 bits per data block; following 8/10 modulation, this is increased to 360 channel bits. Four 8-bit bytes are used for sync and addressing. The ID code contains information on pre-emphasis, sampling frequency, quantization level, tape speed, copy-inhibit flag, channel number, and so on. Subcode data are used primarily for program timing and selection numbering. The subcode capacity is 273.1 kbps. The parity byte is the exclusive-or sum of the ID and block address bytes, and is used to error correct them.
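    The block arithmetic can be checked in a few lines (Python; the example byte values passed to the hypothetical `block_parity` helper are arbitrary):

```python
# One DAT sync block, as described above: four overhead bytes plus
# 32 data bytes, expanded by eight-to-ten modulation on the way to tape.
sync_byte, id_byte, addr_byte, parity_byte = 8, 8, 8, 8   # bits each
data_bits = 32 * 8
block_bits = sync_byte + id_byte + addr_byte + parity_byte + data_bits
print(block_bits)                 # 288 bits per data block
print(block_bits * 10 // 8)       # 360 channel bits after 8/10 modulation

# The parity byte is the exclusive-or of the ID and block address bytes:
def block_parity(id_code: int, block_address: int) -> int:
    return id_code ^ block_address

print(hex(block_parity(0x5A, 0x3C)))   # 0x66
```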

    Since the tape is always in contact with the rotating heads during record, playback, and search modes, tape wear necessitates sophisticated error correction. DAT is thus designed to correct random and burst errors. Random errors are caused by crosstalk from an adjacent track, traces of an imperfectly erased signal, or mechanical instability. Burst errors occur from dropouts caused by dust or scratches on the tape, or by head clogging with dirt.

    To facilitate error correction, each data track is split into halves between left and right channels. In addition, data for each channel are interleaved into even and odd data blocks, one for each head; half of each channel’s samples are recorded by each head. All of the data are encoded with a doubly encoded Reed-Solomon error correction code. The error correction system can correct any dropout error up to 2.6 mm in diameter, or a stripe 0.3 mm high. Dropouts up to 8.8 mm long and 1.0 mm high can be concealed with interpolation.

    Other Rotary Head Digital Tape Formats

     ADAT and DA88 (Multitrack)

    Once the R-DAT format was established and adopted by professional users, it was a small step to produce multitrack versions using S-VHS cassettes (ADAT) or 8 mm video cassettes (DA88 type). Professional users tended toward the generic DA88 series, which had track bounce and integral time code features (see Figure 3.4-17). Originally carrying 16-bit audio tracks (and integral time code), a newer series of these machines will record 20-bit PCM tracks. Although these newer machines will play back 16-bit recordings, the earlier machines will not play back 20-bit recordings.

    Interconnection of these multitrack recorders in the digital domain is via proprietary interfaces: a parallel electrical TDIF interface in the case of the DA88s, and a serial optical “lightpipe” for the ADAT.

    “1610”-Type Videotape Recording (Stereo)

    The vital need in the mid-1970s for compact disc source recordings in an editable digital form created one of the most bizarre audio recording formats ever, and it has left us the curious legacy of a sample rate of 44.1 kHz, as used for the compact disc. As an idea, it started with the then-emerging helical scan video recorders, which were capable of recording (via a special adapter) three 16-bit (stereo) samples of digital audio on each television line. The most practical video recording format in those days was the U-Matic cassette with 3/4-inch tape. These tapes could be assemble-edited using multiple machines and an edit controller. Because analog recorders have no storage, the time when the rotary heads switched over at the tape edges needed to be avoided for recording, and the system therefore used only 588 lines of the 625-line, 25-frame system for recording the audio. The 588 lines times 3 samples times 25 frames per second gave 44,100 samples per second. The equivalent U.S. system used 490 lines out of 525, but the slightly lower frame rate of 29.97 gave 44,056 samples per second. The 0.1% difference in these sample rates was soon forgotten, and 44.1 kHz lives on.
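    The arithmetic behind the legacy rates checks out directly (Python):

```python
# European (625-line, 25-frame) source: 588 usable lines per frame.
pal_rate = 588 * 3 * 25
print(pal_rate)            # 44100 samples per second

# U.S. (525-line) source: 490 usable lines at ~29.97 frames per second.
ntsc_frame_rate = 30 / 1.001          # ~29.97 Hz
ntsc_rate = 490 * 3 * ntsc_frame_rate
print(round(ntsc_rate))    # 44056 samples per second, about 0.1% lower
```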

    FIGURE 3.4-17 Tascam DA-88 digital multitrack recorder.


    Fixed-Head Digital Tape Formats

    DASH (Stereo and Multitrack)

    During the 1980s, several digital audio stationary head (DASH) recording formats were developed, often based on the platform of existing analog tape decks. The attraction at that time was the illusory economic advantage of razor blade editing, combined with digital audio quality. The most enduring of these DASH formats, especially for storage and exchange in the music recording industry, is probably the Sony multitrack version using 1/2-inch wide tape, capable of recording up to 24 audio channels. In broadcasting circles, multitrack machines such as this were used only for serious music recording backup, although even today, the obvious successor in terms of bulk multitrack storage has yet to be found.

    DCC Cassettes (Stereo)

    The advent of the compact disc in 1982 exposed the inadequacies of the compact cassette in home recording applications and, as a result, there was a three-sided development of digital recording formats for the consumer market. Ultimately, two of these—the R-DAT and the MiniDisc—were to find ready acceptance in professional radio production applications. The third format started out life as the stationary head digital audio tape (S-DAT) format, using a compact cassette-sized tape, but with uncompressed 16-bit PCM stereo recording. It became the digital compact cassette (DCC) when MPEG Layer I bit rate reduction was applied. A few examples of these cassettes may be found in libraries; however, the recorders were discontinued around 1996.

    Optical and Magneto-Optical Disc Formats

    CD and DVD Audio

    CD-A (the commercial record format described in the “Compact Disc” section earlier), whether in CD-R, RW, or glass-mastered form, is not a particularly good format to use for broadcast recording, although it may have attractions for easily played samplers. It possesses limited error correction, a table of contents must be written at the start of the disc, and the encoding is limited to 16-bit PCM. As a physical carrier, however, the optical medium of the CD (or equally the similar DVD) has a lot to commend it, not least the fact that the IT industry has greatly reduced recorder and blank media prices by adopting them in such large numbers.

    CD/DVD-R and RW 

    The computer industry, and particularly Kodak with its Photo-CD application, pushed forward the CD-R (and the related RW) formats into affordable computer components, and derivatives of them now form economical carriers for fast audio file exchange. CD-R media can be “closed” to the ISO 9660 format, which was the basic universal CD-ROM format. However, CD-RW discs cannot be closed to the ISO 9660 format. More recent formats use a derivative such as UDF, which was basic to the recordable DVD; these formats have various benefits, including allowing longer names to be entered.

    MiniDisc

    The Sony MiniDisc recorder (see Figure 3.4-18) uses a 2.5 in. magneto-optical disc housed in a caddy for protection. It uses the proprietary adaptive transform acoustic coding (ATRAC) audio data rate reduction system, based on block frequency transforms. Magneto-optical recording technology combines magnetic recording and laser optics, utilizing the record/erase benefits of magnetic materials with the high density and contactless pickup of optical materials.

    With magneto-optics, a magnetic field is used to record data, but the applied magnetic field is much weaker than conventional recording fields. It is not strong enough to orient the magnetic particles. However, the coercivity of the particles sharply decreases as they are heated to their Curie temperature. A laser beam focused through an objective lens heats a spot of magnetic material, and only the particles in that spot are affected by the magnetic field from the recording coil, as shown in Figure 3.4-19(a). After the laser pulse is withdrawn, the temperature decreases and the orientation of the magnetic layer records the data. In this way, the laser beam creates a small recorded spot, thus increasing recording density.

    The Kerr effect may be used to read data; it describes the slight rotation of the plane of polarization of polarized light as it reflects from a magnetized material. The rotation of the plane of polarization of light reflected from reverse-oriented regions differs from that reflected from unreversed regions, as shown in Figure 3.4-19(b). To read the disc, a low-powered laser is focused on the data surface, and the angle of rotation of the reflected light is monitored, thus recovering data from the laser light.

    Available as both studio models and affordable portable recorders, the MiniDisc was a popular format for reporters as well as for radio playback applications, where it formed a direct replacement for NAB tape cartridge players.

    FIGURE 3.4-18 Sony pocket MiniDisc recorder of the early 1990s.


    The MiniDisc system uses a TOC, very much like that on a CD-A; when one is recording, this TOC is written automatically after the stop button is pressed. This detail was sometimes forgotten by journalist users, and physical vibration could sometimes impair recordings (including that all-important TOC), but large buffer memories helped to minimize this. The low bit rate makes playback access to any track particularly fast, a detail that made the format popular with previous users of compact cassette recorders.

    Solid-State and Hard Disk Recorders

    Solid-State Recorders

    An obvious solution to the possible mechanical vulnerability of portable digital recorders is to record directly to a solid-state memory. The memory itself may be removable or fixed, and in order to keep the power consumption down and the recording time up, bit rate reduction may be used. Examples of solid-state recorders are shown in Figures 3.5-15 and 3.5-16 in Chapter 3.5.

Probably the most important consideration is how quickly and conveniently the recording can be dumped to a computer or sent back to the home station, so that the recorder can be quickly reused.

Another consideration, if bit rate reduction is used, is to what extent post processing (which might be as simple as equalization) will compromise the audio quality. The most efficient bit rate reduction systems, such as MP3, are efficient only because the artifacts are evenly distributed across the threshold of our hearing. A simple equalization change can quite easily uncover those artifacts in a particular, and audible, area of the audio spectrum.

Metadata may seem an expensive luxury for a simple interview recording, but in the case of solid-state recorders, the transfer of the material will leave no space for any label, and some form of explanation or traceability becomes vital. Fortunately, an initiative between industry, the AES, the EBU, and a UK-based broadcast craft group called the Institute of Broadcast Sound has recognized the problems of entering (or not entering) consistent manual metadata. They have instituted an XML-based automatically recorded data chunk, iXML, which allows the originating machine and take time to be traced uniquely and passed along through the sound workflow.6
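An iXML chunk is plain XML stored alongside the audio essence; a minimal, illustrative fragment might look like the following (the element values here are invented, and the selection of elements is abbreviated from the published specification):

```xml
<BWFXML>
  <IXML_VERSION>1.5</IXML_VERSION>
  <PROJECT>Morning interview</PROJECT>
  <SCENE>1</SCENE>
  <TAKE>3</TAKE>
  <TAPE>CARD_001</TAPE>
</BWFXML>
```

Because the chunk is written automatically by the recorder, the take remains traceable even after the file has been copied away from its original medium.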

     NAGRA and Similar Portable Recorders

The NAGRA series of 1/4-inch analog recorders made by the Swiss Kudelski company were highly regarded for location recording and especially for feature film applications. Several digital versions of this old favorite have appeared, using solid-state memory as well as disk packs or tape. In particular, the NAGRA-D offered four 24-bit PCM channels on tape, a highly attractive top-quality capture format in its time. The only thing to consider with all tape-based storage is the need for real time to dump these recordings to computer files, but equally, removable disk or solid-state packs can make for an expensive alternative to a tape library.

    Digital Dubbers

The name digital dubber is a film industry misnomer for solid-state or disk-based temporary stores for (typically) eight tracks of up to 24-bit PCM audio, used during audio editing and dubbing, hence the name. These recorders are rarely used for the actual interchange of recordings and can be viewed as a sort of local server.

    Computer Disks

Hard-disk-based storage now forms the heart of most modern broadcast facilities with digital audio workstations (DAW) and audio servers. Some of the implications of this revolution are discussed later in this chapter, and the systems and equipment are covered in depth in Chapter 3.6.

    DIGITAL SIGNAL PROCESSING

Digital signal processing (DSP) has improved the performance of many existing audio functions such as equalization and dynamic range compression, and permits new functions such as ambience processing, dynamic noise cancellation, and time alignment. DSP is a technology used to analyze, manipulate, or generate signals in the digital domain. It uses the same principles as any digitization system; however, instead of a storage medium such as CD or DAT, it is a processing method.

FIGURE 3.4-19 Magneto-optical recording (a) and playback (b).

6See http://www.ixml.info/.

    DSP Applications and Design

DSP employs technology similar to that used in computers and microprocessor systems; however, there is an important distinction. A regular computer processes data, whereas a DSP system processes signals. It is accurate to say that an audio DSP system is in reality a computer dedicated to the processing of audio signals.

Some audio functions that DSP can perform include error correction, multiplexing, sample rate conversion, speech and music synthesis, data compression, filtering, adaptive equalization, dynamic range compression and expansion, crossovers, reverberation, ambience processing, time alignment, acoustic noise cancellation, mixing and editing, and acoustic analysis. Some DSP functions are embedded within other applications; for example, the error correction systems and oversampling filters found in CD players are examples of DSP. In other applications the user has control over the DSP functions.

Digital processing is more precise and repeatable than analog processing, and can perform operations that are impossible with analog techniques. Noise and distortion can be much lower with DSP; thus, audio fidelity is much higher. In addition, whereas analog circuits age, lose calibration, and are susceptible to damage in harsh environments, DSP circuits do not age, cannot lose calibration, and are much more robust. However, DSP is an expensive technology to develop. Hardware engineers must design the circuit or employ a DSP chip, and software engineers must write appropriate programs. Special concerns must be addressed when writing the code needed to process the signal. For example, if a number is simply truncated without regard to its value, a significant error could occur, and the error would be compounded as many calculations take place, each using truncated results. The resulting numerical error would be manifested as distortion in the output signal. Thus, all computations on the audio signal must be highly accurate. This requires long word lengths; DSP chips employ digital words that are 32 bits in length or longer.
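The accumulation of truncation error can be demonstrated with a short sketch (the gain, sample value, and iteration count are illustrative, not taken from the handbook): a gain slightly below unity is applied repeatedly to a sample value, quantizing each intermediate result back to an integer word, once by truncation and once by rounding.

```python
import math

def apply_gain_repeatedly(sample, gain, times, quantize):
    """Apply a gain many times, quantizing each intermediate result
    back to an integer word, as a fixed-point DSP chain would."""
    for _ in range(times):
        sample = quantize(sample * gain)
    return sample

start = 20000                              # a 16-bit PCM sample value
truncated = apply_gain_repeatedly(start, 0.999, 1000, math.trunc)
rounded = apply_gain_repeatedly(start, 0.999, 1000, round)
exact = start * 0.999 ** 1000              # full-precision reference

# Truncation biases every step downward, so its error accumulates;
# rounding is unbiased and stays near the full-precision result.
print(truncated, rounded, round(exact))
```

After a thousand operations the truncated result has drifted well below the full-precision value, while the rounded result remains far closer to it, which is one reason DSP chips carry such long internal word lengths.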

In addition, even simple DSP operations may require several intermediate calculations, and complex operations may require hundreds of operations. To accomplish this, the hardware must execute the steps very quickly. Because all computation must be accomplished in real time—that is, within the span of one sample period—the processing speed of the system is crucial. A DSP chip must often process 50–100 million instructions per second. This allows it to run complete software programs on every audio sample as it passes through the chip.

DSP products are more complicated than similar analog circuits, but DSP possesses an inherent advantage over analog technology: it is programmable. Through the use of software, many complicated functions can be performed entirely with coded instructions. Figure 3.4-20(a) shows a band-pass filter using conventional analog components. Figure 3.4-20(b) shows the same filter, represented as a DSP circuit. It employs the three basic DSP operators of delay, addition, and multiplication. However, this DSP circuit may be realized in software terms. Figure 3.4-20(c) shows an example of the computer code (Motorola DSP56001) needed to perform band-pass filtering with a DSP chip. There are many advantages to this software implementation. Whereas hardware circuits would require new hardware components and new circuit design to change their processing tasks, the software implementation could be changed by altering parameters in the code. Moreover, the program could be written so different parameters could be employed based on user control.
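The same delay/add/multiply structure can be sketched in a high-level language. The following illustrative Python version implements a direct-form biquad band-pass filter; the coefficient formula follows the widely used audio-EQ "cookbook" design, not the specific coefficients of the handbook's figure.

```python
import math

def bandpass_coeffs(fs, f0, q):
    """Band-pass biquad coefficients (0 dB peak gain at f0),
    per the common audio-EQ cookbook formulation."""
    w0 = 2 * math.pi * f0 / fs
    alpha = math.sin(w0) / (2 * q)
    a0 = 1 + alpha
    b = [alpha / a0, 0.0, -alpha / a0]              # feed-forward taps
    a = [1.0, -2 * math.cos(w0) / a0, (1 - alpha) / a0]  # feedback taps
    return b, a

def biquad(x, b, a):
    """Filter a sequence using only delays, additions, and multiplications."""
    x1 = x2 = y1 = y2 = 0.0                          # the delay elements
    out = []
    for xn in x:
        yn = b[0] * xn + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
        x2, x1 = x1, xn                              # shift the delay line
        y2, y1 = y1, yn
        out.append(yn)
    return out
```

Retuning the filter to a different center frequency or Q requires only recomputing the multiplier coefficients, which is exactly the programmability advantage described above.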

As noted, DSP can be used in lieu of most conventional analog processing circuits. The advantages of DSP are particularly apparent when various applications such as recording, mixing, equalization, and editing are combined in a workstation. For example, a personal computer, combined with a DSP hardware card, hard disk drive, appropriate software, and a DAT or CD recorder, forms a complete postproduction system. Such a system allows comprehensive signal manipulation, including the capability to cut, paste, copy, replace, reverse, trim, invert, fade in, fade out, smooth, loop, mix, change gain and pitch, crossfade, and equalize. The integrated nature of such a workstation, its low cost, and its high processing fidelity make it clearly superior to analog techniques.

FIGURE 3.4-20 (a) A band-pass filter represented by an analog circuit, (b) digital signal processing circuit, and (c) digital signal processing instructions.


    SOUND RECORDING AS A PROCESS

Recording delays an audio signal, and it also enables the material to be shared. For a radio interview the delay element might involve only a few seconds, just enough for a “top and tail” edit. On the other hand, a transcription recording of a live concert could well be recorded and then lie dormant for many years.

Nowadays, economic pressures in the broadcast industry demand a recording process that is far more complex than these two examples represent. For instance, if a recording can be made available to multiple operators soon after the start of that recording, the editing processes for different distribution paths and different programs could take place in parallel.

    Workflow

Therefore, we now need to think of broadcast sound recording as a part of general workflow, and in this context the audio could usefully accompany other essence such as text and pictures. These essence items are all linked together by the metadata, which takes the place of the information once carried on the label and package of a disc or tape recording, although in the digital world the metadata by itself possesses a much greater potential power than a label ever did.

For the overall workflow process to be a success, several presumptions are made about the sound recording. The highest source quality needs to be preserved as far along the workflow chain as possible in order to preserve the possibilities for later processing or for future (and possibly lucrative) applications of the recorded material. The technical quality of the recording must be adequate to survive the numerous signal processing and editing procedures that may be applied at any stage along the workflow route. This does not mean that current sound recording practices need to be perfect, but it does mean that other items in the signal chain must produce a greater level of impairments than the recording. This has implications for any bit rate reduction that is being considered.

In fact, the days of needing bit-rate-reduced recordings for the convenience or economic use of IT applications are long over, although low bit rate contribution of interviews and similar recordings over telephone circuits is likely to remain a prime use of bit rate reduction for many years to come, especially in areas without widespread Internet access.

    Advantages of Digital Recording

The “transparent” quality of digital systems should not give the impression that sterility has crept into sound recording. In reality, the creative opportunities available in the most basic computer recording equipment well exceed those existing in the most complex analog studio facilities of 10 years ago. What has gone are the subtle perceptual sound effects that were inherent in analog recording or noise-reduction processes. However much of a cult has built up around some of those effects over the years, there can be no doubt about the technical efficiency of a digital recording process. Digital recording has enabled big and helpful changes in audio production, mainly as a result of the transparency factor. For instance, the recording of two-channel stereo in the form of “M” and “S” (mid-side or mono-stereo) channels was not considered using analog equipment, because any recording artifacts such as noise on the M channel would appear as a coherent center image in the reproduced sound field. “A and B” (left and right channel) analog recording resulted in a much more diffusely reproduced field of any recording noise; therefore, this method became the standard for two-channel stereo recording. M and S recording can prove quite useful, as the apparent width of the stereo image can be varied just by adjusting the S channel level.

Perhaps this is a good point to step back from the actual details of the recording technology and ask whether we have actually experienced some kind of digital audio revolution.
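The M/S arithmetic just described can be sketched in a few lines (the function names and the form of the width control are illustrative):

```python
def ms_encode(left, right):
    """Convert A/B (left-right) samples to mid-side form."""
    mid = [(l + r) / 2 for l, r in zip(left, right)]
    side = [(l - r) / 2 for l, r in zip(left, right)]
    return mid, side

def ms_decode(mid, side, width=1.0):
    """Rebuild left/right; width < 1 narrows the stereo image,
    width > 1 widens it, and width = 0 collapses to mono (M only)."""
    left = [m + width * s for m, s in zip(mid, side)]
    right = [m - width * s for m, s in zip(mid, side)]
    return left, right
```

With width = 1.0 the original channels are recovered exactly, which is why any noise recorded on the M channel reappears as a coherent center image: it is common to both decoded outputs.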

    A Digital Audio Revolution

While there have undoubtedly been great changes, what has taken place in the field of sound recording has been perhaps more an evolution than a revolution and, what is more, it has taken place in two distinct steps.

First, digital encoding of the recorded signal circumvented the imperfections inherent in analog recording systems. This led to the development of dedicated digital audio recording formats, mostly using magnetic tape or optical disc as the storage media.

The second stage came when the encoded digital audio signals from stage one became available for computer-based editing. This led to an obvious use of generic IT storage for sound recordings. The main disadvantage of any tape-based system lies in the inevitably slow access to any given part on the tape, although the archive life of tapes may not be exceptional either. On the other hand, IT-based storage media are often not so easy to exchange, sometimes for physical reasons and sometimes because of format incompatibilities.

Returning to the “audio” part of the digital audio evolution, the key stage was the initial encoding of the audio in a digital form. The later digital recording and manipulation technologies were both largely developed by the generic IT industry. Audio coding in the broadcast industry in the form of pulse-code modulation (PCM) was first put into practical use during the 1970s for the accurate long-distance transmission of the two channels of stereo radio programs. Prior to the use of this system, there was a limit of a hundred miles or so for the distance over which a sufficiently good match of two analog landline paths, as necessary for stereo operation, could be maintained. At this time, a similar form of weakness was also beginning to show up in the recording studio, as the number of generations or layers of recording that could be employed before the signal quality was compromised was beginning to limit multitrack recording techniques, especially in some popular overdub formats. When accurate digital audio coding arrived (and accuracy in audio conversion did take some years to develop), it quickly bypassed the need for the expensive mechanical precision in professional analog recorders. This need for precision had passed to the electronics, and that development was paid for, not by the small audio market, but by the millions of computers, mobile phones, and countless other electronic items in the mass consumer market.

The huge market for IT-based products rapidly created an economic jump that reduced the cost of processing, mixing, and editing of the audio. These processes have now become integrated in the form of what has become known as an audio workstation.

Meanwhile, the advances made in distributing the audio material, in finished or unfinished form, have been crucial factors in creating even further jumps in recording economics. These advances are nowhere more visible than in the often “invisible” contributions from radio reporters in the field. These contributions are now sent via e-mail on telephone lines or satellites, and have led to fast audio file transfer between workstations anywhere in the world. In this way the audio rushes from Hollywood can be sent during the evening to London, England, for editing during the day, and the finished material can then be sent back ready for the following morning on the West Coast.

In parallel with these operational advances, a previously mentioned but often overlooked (and slow) technical advance has taken place in the quality of the basic audio digital coding and decoding. If anyone doubts this, compare the D-to-A performance of any early CD player with the much lower cost equivalent of today, where what is actually heard approaches the theoretical coding quality promised in 1982.

Revolution or not, one solid piece of advice when facing any new regime is to have a long-term strategy established. All too often, the economic advantages of one technical advance have been reversed when the next advance came along. Planning any sound recording process for broadcasting requires much thought, especially as the sound is now often linked with other workflow patterns such as video production, and these other patterns are still in the process of changing inside their own digital revolutions.

Inherent complexities in IP-based topologies dictate that contingencies for on-air product reliability should be built into all design and operating initiatives using digital audio networked systems. All systems will occasionally fail, and the mission-critical dependence on such systems is a potential single point of failure not typical of the analog systems being replaced.

    STREAMING AND FILE FORMATS

In discussing the migration to file-based audio, it is useful to consider the distinction between real-time audio streams and audio data files.

    Principles

A microphone channel produces streaming audio, and a real-time output data stream will be required at many stages along the signal route to feed loudspeakers or headphones. Any streaming format can be thought of as one with the capability for audio data delivery in real time with a low and controlled latency. Streaming audio cannot be slowed down or speeded up, and it cannot usually be interrupted without undesirable consequences. It can start up immediately when the signal is available, and it can go on streaming indefinitely. As it exists only in real time, it effectively carries its own timing information with it. While this could apply to any analog audio signal, when adapted to digital forms, there arises a need to declare the original sampling frequency at the very least in the associated metadata.

A file, on the other hand, never existed in the analog world, although a finished physical recording was not a dissimilar concept. A file has to wait until a definable portion of the streaming program is available to be packaged. The size of any file is limited and needs to be declared, as do all the conditions necessary to rebuild a streaming output (that is, replay it). Once this metadata information has been gathered and stored in a header chunk, tightly coupled with the audio essence (bare data samples), the file is fully formed and only then can it be handled in the same way as any data file.

In a typical recording production chain, it might at first seem easy to see where both streaming and files sit. This is not necessarily true, as for instance microphone signals might stream into a recorder via a mixer of some kind. The recorder, however, may then record the signal as a series of tiny files, and even on a digital broadcasting system the signal will be formed of packets, which may be viewed as files of a tiny size. These packets must be delivered at the output as a streaming format. There is therefore some crossover between streaming and files, and the fundamental penalty for any use of intermediate files is signal latency (delay) between the input and the output streams.

File formats, on the other hand, are fairly well-defined and pre-agreed arrangements for storing and exchanging data of any kind at unspecified speeds. For audio, any file format must, at the very minimum, contain some form of header containing the information necessary to accurately rebuild the audio stream from which the files were initially built. Summing up, therefore, file formats and streaming formats are inextricably linked, and in some cases, such as in those data formats used for packet transmission, the division between streaming and file may be blurred.
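The latency penalty is easy to quantify: a packet (or tiny file) cannot be emitted until its last sample has arrived, so each packetizing stage adds at least samples-per-packet divided by the sampling rate of delay. A small sketch, with illustrative packet sizes at a 48 kHz sampling rate:

```python
def packet_latency_ms(samples_per_packet, sample_rate):
    """Minimum delay added by one packetizing stage, in milliseconds."""
    return 1000.0 * samples_per_packet / sample_rate

for n in (64, 256, 1024, 48000):
    print(f"{n:>6} samples @ 48 kHz -> {packet_latency_ms(n, 48000):8.2f} ms")
```

A 64-sample packet adds little more than a millisecond, whereas a one-second file adds a full second; a chain of several such stages adds its delays cumulatively.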

    Streaming Audio Formats

The AES3 digital audio format is the most important streaming format at the heart of modern broadcast audio systems. It was originally designed around the existing cable and routing infrastructure in the analog studios of the early 1980s, so that the connectors first specified were balanced XLR, allowing analog patch cables to be used. Unbalanced signals on 75 ohm video coax using BNC connectors can also be found, and the consumer version of this interface (SPDIF) in the IEC7 standard was similarly (but not identically) electrically based on unbalanced signals on RCA phono connectors. Both in the PCM IEC60958 form and the packet-carrying version 61937, which is used for reduced bit rate multichannel carriage, an optical fiber link is often used on consumer equipment. Between these professional and consumer PCM interfaces, the audio essence is identical, and the electrical interface differences are in reality no more problematic than those found in analog practice. There are, however, different metadata formats (called Channel Status) in these two standards.

    Audio File Formats

There are many types of possible audio file exchange, and it is a relatively trivial task to convert between formats. However, there are always some penalties in conversion, and the EBU decided in 1996 that the demands of the professional broadcaster would be best met with an extensible generic format. The EBU adopted an interesting approach, in the form of the Broadcast Wave File (BWF). The BWF is a development of the existing WAV format, used on many digital audio workstations and computers. A Wave file is an audio file that is one type of the more general Resource Interchange File Format (RIFF) file. RIFF was developed by the IBM and Microsoft corporations.

However, no computer is concerned with what a file is for or what it actually does. The computer can be asked to look at the file-name extension when given a file, and if a suitable application is available—either inside or connected to the computer—it will offer that file to the application. WAV audio sample files can be recognized by many different types of audio applications, and these applications will look at the chunks of data in order to see if they recognize any data relevant to that application. They will leave all other chunks alone, a fact that enables broadcast-specific information to be inserted into generic Wave files without disabling any low-cost existing applications such as simple players. If the mandatory file chunk known as “fmt-ck” contains parameters such as sample rate and data information which suit that application, then the application will be satisfied that it can play the data. It will then send the data chunk itself (and no other content) to a buffer, from which it will play the file through a pre-arranged port. The beauty of the Wave procedure is that not only can we expand files by adding other chunks of broadcast-specific information, but we can also specify different audio formats for the data chunk itself. Some of the first practical applications of the BWF used MPEG Layer II coding, for example.
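The chunk-walking behavior described above can be sketched in a few lines. This illustrative parser reads only each chunk's ID and size and skips over the contents, exactly as a generic application skips chunks it does not recognize; a BWF file would simply show an extra broadcast-extension ("bext") entry in the listing without disturbing anything else.

```python
import struct

def list_chunks(path):
    """Return (chunk_id, size) for every top-level chunk in a RIFF/Wave file."""
    chunks = []
    with open(path, "rb") as f:
        riff, _size, wave = struct.unpack("<4sI4s", f.read(12))
        assert riff == b"RIFF" and wave == b"WAVE", "not a Wave file"
        while True:
            header = f.read(8)
            if len(header) < 8:
                break                       # end of file
            ckid, cksize = struct.unpack("<4sI", header)
            chunks.append((ckid.decode("ascii"), cksize))
            f.seek(cksize + (cksize & 1), 1)  # skip contents; chunks are word-aligned
    return chunks
```

A player needs only the "fmt " and "data" entries from this list; everything else, broadcast-specific or otherwise, can be ignored without harm.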

The full BWF file format is defined in the EBU Tech document 3285,8 and this basic recommendation is further extended into the complexities of a native file format for use in computer editing systems, within the AES standard AES-31. The beauty of the whole procedure is that it lends itself to being extended even further, allowing industry users to incorporate their own enhancements to the basic arrangement. For example, when the BWF file goes to form part of a multimedia package, the BWF file can easily become a component of an exchange format such as MXF, or it could be incorporated in an assembly of files and metadata such as in the AAF structure.

    Piggy-Back Audio Formats

“Piggy-back” audio networks are a relatively new approach, although we are effectively riding an IT “wave” with all audio recording applications. These networks, in the broadest sense, use existing generic standards to carry smaller specialized audio formats within, or on top of, them. They have arisen out of the simple economics of using generic IT or telecom-based distribution methods, or even just cabling, for carrying multiple streams of digital audio. However, strictly audio formats, such as IEC61937, actually use the consumer version of the AES/EBU streaming format in order to carry packet audio information for multichannel co

