
Radio Broadcast Issues
Frank Foti, Omnia Audio
Robert Orban, CRL/Orban
June, 2001
Few people in the record industry really know how
a radio station processes their material before
it hits the FM airwaves. This article's purpose
is to remove the many myths and misconceptions surrounding
this arcane art.
Every
radio station uses a transmission audio processor
in front of its transmitter. The processor's most
important function is to control the peak modulation
of the transmitter to the legal requirements of
the regulatory body in each station's nation.
However, very few stations use a simple peak limiter
for this function. Instead, they use more complex
audio chains. These can accurately constrain peak
modulation while significantly decreasing the
peak-to-average ratio of the audio. This makes
the station sound louder within the allowable
peak modulation.
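As a rough illustration of what "peak-to-average ratio" means, the
following Python sketch measures it for a sine wave before and after
hard clipping. It assumes that "average" means RMS level; the function
name peak_to_average_db and the test signal are hypothetical and do not
come from any broadcast processor.

```python
import math

def peak_to_average_db(samples):
    """Return the peak-to-RMS ratio of a block of samples, in dB."""
    peak = max(abs(s) for s in samples)
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20.0 * math.log10(peak / rms)

# One cycle of a sine wave, and the same wave hard-clipped at half its peak.
n = 1000
sine = [math.sin(2.0 * math.pi * i / n) for i in range(n)]
clipped = [max(-0.5, min(0.5, s)) for s in sine]

sine_db = peak_to_average_db(sine)        # about 3 dB for a pure sine
clipped_db = peak_to_average_db(clipped)  # lower: less peak per unit of level
print(round(sine_db, 2), round(clipped_db, 2))
```

A lower figure means more average level fits under the same peak
ceiling, which is why reducing this ratio makes a station sound louder.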
Garbage In, Garbage Out
Manufacturers have tuned broadcast processors
to process the clean, dynamic program material
that the recording industry has typically released
throughout its history. (The only significant
exception that comes to mind is 45-rpm singles,
which often were overtly distorted.) Because these
processors have to process speech, commercials,
and oldies in addition to current material, they
can't be tuned exclusively for "hypercompressed,"
distorted CDs. Indeed, experience has shown that
there's no way to tune them successfully for this
degraded material.
For
20 years, broadcast processor designers have known
that achieving highest loudness consistent with
maximum punch and cleanliness requires extremely
clean source material. For more than 20 years,
Orban has published application notes to help
broadcast engineers clean up their signal paths.
These notes emphasize that any clipping in the
path before the processor will cause subtle degradation
that the processor will often exaggerate severely.
The notes promote adequate headroom and low distortion
amplification to prevent clipping even when an
operator drives the meters into the red.
About
three years ago, we started to notice CDs arriving
at radio stations that had been pre-distorted
in production or mastering to increase their loudness.
For the first time, we started seeing frequently
recurring flat-topping caused by brute-force
clipping in the production process. Broadcast
processors react to pre-distorted CDs exactly
the same way as they have reacted to accidentally
clipped material for more than 20 years: they exaggerate
the distortion. Because of phase rotation, the
source clipping never increases on-air loudness;
it just adds grunge.
The
authors understand the reasoning behind the CD
loudness wars. Just as radio stations wish to
offer the loudest signal on the dial, it is evident
that recording artists, producers, and even some
record labels want to have a loud product that
stands out against its competition in a CD changer
or a music store's listening station.
In
radio broadcasting this competition has existed
for at least the last 25 years. Twenty-five years ago,
radio stations used simple clipping to get louder,
and this 25-year-old technique has now migrated
to the music industry. The following graphic shows
a section of a severely clipped waveform from
a contemporary CD. The area marked between the
two pointers highlights the clipped portion. This
is one of the roots of the problem as described
in this paper; the other is excessive digital
limiting that does not necessarily cause flat-topping,
but still removes transient punch and impact from
the sound.
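One simple way to look for flat-topping in a digital recording is to
search for long runs of consecutive samples that all sit at the peak
value. The Python sketch below does this under simplifying assumptions:
the function name longest_flat_top and the tolerance are hypothetical,
and a real test would also have to catch clipping that occurred below
full scale.

```python
import math

def longest_flat_top(samples, tolerance=1e-6):
    """Length of the longest run of consecutive samples at the peak magnitude."""
    peak = max(abs(s) for s in samples)
    longest = run = 0
    for s in samples:
        if peak - abs(s) <= tolerance:
            run += 1
            longest = max(longest, run)
        else:
            run = 0
    return longest

# One cycle of a clean sine wave, and the same wave clipped at 70% of peak.
n = 1000
sine = [math.sin(2.0 * math.pi * i / n) for i in range(n)]
clipped = [max(-0.7, min(0.7, s)) for s in sine]

print(longest_flat_top(sine), longest_flat_top(clipped))
```

The clean wave touches its peak for a single sample at a time, while
the clipped wave stays there for hundreds of samples in a row.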
The
problem today is that we now have sophisticated
and powerful audio processing for the broadcast
transmission system and this processing does not
coexist well with a signal that has already been
severely clipped. Unfortunately, with current
pop CDs, the example shown above is more the norm
than the exception.
The
attack and release characteristics of broadcast
multiband compression were tuned to sound natural
with source material having short-term peak-to-average
ratios typical of vinyl or pre-1990 CDs. Excessive
digital limiting of the source material radically
reduces this short-term peak-to-average ratio
and presents the broadcast processor with a new,
synthetic type of source that the broadcast processor
handles less gracefully and naturally than it
handles older material. Instead of being punchy,
the on-air sound produced from these hypercompressed
sources is small and flat, without the dynamic
contours that give music its dramatic impact.
The on-air sound resembles musical wallpaper and
makes the listener want to turn down the volume
control to background levels.
There
is a myth that broadcast processing will affect
hypercompressed material less than it will more
naturally produced material. This is true in only
one respect: if there is no long-term dynamic range
coming in, then the broadcast processor's AGC
will not further reduce it. However, the broadcast
processor will still operate on the short-term
envelopes of hypercompressed material and will
further reduce the peak-to-average ratio, degrading
the sound even more.
Hypercompressed
material does not sound louder on the air. It
sounds more distorted, making the radio sound
broken in extreme cases. It sounds small, busy,
and flat. It does not feel good to the listener
when turned up, so he or she hears it as background
music. Hypercompression, when combined with "major-market"
levels of broadcast processing, sucks the drama
and life from music. In more extreme cases, it
sounds overtly distorted and is likely to cause
tune-outs by adults, particularly women.
A Typical Processing Chain: What Really Goes On When
Your Recording Is Broadcast
A
typical chain consists of the following elements,
in the order that they appear in the chain:
1. Phase Rotator
The phase rotator is a chain of allpass filters
(typically four poles, all at 200Hz) whose group
delay is very non-constant as a function of frequency.
Many voice waveforms (particularly male voices)
exhibit as much as 6dB asymmetry. The phase rotator
makes voice waveforms more symmetrical and can
sometimes reduce the peak-to-average ratio of
voice by 3-4dB. Because this processing is linear
(it adds no new frequencies to the spectrum, so
it doesn't sound raspy or fuzzy) it's the closest
thing to a "free lunch" that one gets
in the world of transmission processing.
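The behavior described above can be sketched in Python as a cascade of
four first-order allpass sections with their corner at 200Hz. This is
only an illustration under stated assumptions: the 48kHz sample rate,
the bilinear-transform coefficient, the synthetic test tone, and all
function names are choices made for this example, not the design of any
particular processor.

```python
import cmath
import math

FS = 48000.0      # sample rate in Hz (assumed)
CORNER = 200.0    # allpass corner frequency named in the text
SECTIONS = 4      # four first-order sections, i.e. four poles

def allpass_coeff(corner=CORNER, fs=FS):
    """Coefficient of a first-order digital allpass (bilinear transform)."""
    t = math.tan(math.pi * corner / fs)
    return (t - 1.0) / (t + 1.0)

def rotator_response(freq, fs=FS):
    """Complex frequency response of the whole allpass cascade."""
    c = allpass_coeff(fs=fs)
    z1 = cmath.exp(-2j * math.pi * freq / fs)   # z^-1 on the unit circle
    return ((c + z1) / (1.0 + c * z1)) ** SECTIONS

def phase_rotate(samples, fs=FS):
    """Run samples through the cascade, one section at a time."""
    c = allpass_coeff(fs=fs)
    out = list(samples)
    for _ in range(SECTIONS):
        x_prev = y_prev = 0.0
        stage = []
        for x in out:
            y = c * x + x_prev - c * y_prev
            stage.append(y)
            x_prev, y_prev = x, y
        out = stage
    return out

def asymmetry(samples):
    """Ratio of the larger peak to the smaller peak (1.0 is symmetrical)."""
    hi, lo = max(samples), -min(samples)
    return max(hi, lo) / min(hi, lo)

# Asymmetric test tone: 100 Hz fundamental plus a 200 Hz harmonic,
# phased so the negative peak is twice (6 dB above) the positive peak.
n = int(FS // 2)
tone = [math.sin(2 * math.pi * 100 * i / FS)
        + 0.5 * math.cos(2 * math.pi * 200 * i / FS) for i in range(n)]
rotated = phase_rotate(tone)
tail = slice(n // 2, n)   # skip the filter's start-up transient

print(abs(rotator_response(1000.0)))                  # magnitude stays at 1
print(asymmetry(tone[tail]), asymmetry(rotated[tail]))
```

The magnitude response is unity at every frequency, yet the rotated
tone is closer to symmetrical than the original because its two
components have been shifted in phase by different amounts.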
There
are a few prices to pay. In the good old days
when source material wasn't grossly clipped, the
main price was a very subtle reduction in transparency
and definition in music. This was widely accepted
as a valid trade-off to achieve greatly reduced
speech distortion, because the phase rotator's
effects on music are unlikely to be heard on typical
consumer radios, like car radios, boom boxes,
"Walkman"-style portables, and table
radios.
However,
with the rise of the clipped CD, things have changed.
The phase rotator radically changes the shape
of its input waveform without changing its frequency
balance. A measurement of the phase rotator's
magnitude response alone would read "flat";
only a measurement that also captured phase response
would show that, while the magnitude response
is flat, the phase response is highly non-linear
with frequency. The practical effect of this non-linear
phase response is that flat tops in the original
signal can end up anywhere in the waveform after
processing. It's common to see them go right through
a zero crossing. They end up looking like little
smooth sections of the waveform where all the
detail is missing, a bit like a scar from a severe
burn. This is an apt metaphor for their audible
effect, because they no longer help reduce the
peak-to-average ratio of the waveform. Instead,
their only effect is to add unnecessary grungy
distortion.
There
has been a myth in the recording world that broadcast
processing will modify these clipped, over-compressed
CDs less than it will modify clean, dynamic CDs. Thanks
in part to phase rotation, this myth is absolutely
false. In particular, any clipping in the source
material causes nothing but added distortion without
increasing on-air loudness at all.
2. AGC
The next stage is usually an average-responding
AGC. By recording studio standards, this AGC is
required to operate over a very wide dynamic range,
typically about 25dB. Its function is to compensate
for operator errors (in live production environments)
and for varying average levels (in automated environments).
Average levels vary mainly because the peak-to-average
ratio of CDs themselves has varied so
much in the last 10 years or so. Therefore, normalizing
hard disk recordings (to use all available headroom)
has the undesirable side effect of causing gross
variations in average levels. Indeed, 1:1 transfers
(which are also common) will also exhibit this
variation, which can be as large as 15dB.
The
price to be paid is simple: the AGC will eliminate
long-term dynamics in your recording. Virtually
all radio station program directors want their
stations to stay loud always, eliminating the
risk that someone tuning the radio to their station
will either miss the station completely or will
think that it's weak and can't be received satisfactorily.
Radio people often call this effect "dropping
off the dial."
AGCs
can be either single-band or multiband. If they
are multiband, it's rare to use more than two
bands because AGCs operate slowly, so "spectral
gain intermodulation" (such as bass pumping
the midrange) is not as big a potential problem
as it is for later compression stages, which operate
more quickly.
AGCs
are always gated in competent processors. This
means that their gain essentially freezes if the
input drops below a preset threshold, preventing
noise suck-up despite the large amount of gain
reduction.
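The gating idea can be sketched in a few lines of Python. This is a
deliberately simplified model, not any manufacturer's algorithm: it
works on one average level per block, and the names gated_agc, target,
gate, step, and max_gain, along with their values, are assumptions made
for the example (a max_gain of 18 corresponds to roughly 25dB).

```python
def gated_agc(block_levels, target=0.25, gate=0.02, step=0.05, max_gain=18.0):
    """Return the gain applied to each block, given each block's average level.

    The gain drifts slowly toward target / level, but freezes whenever
    the input level falls below the gate threshold.
    """
    gain = 1.0
    gains = []
    for level in block_levels:
        if level >= gate:
            wanted = min(target / level, max_gain)
            gain += step * (wanted - gain)   # slow move toward the wanted gain
        # Below the gate: leave the gain untouched so noise is not pulled up.
        gains.append(gain)
    return gains

# Quiet program, then a pause containing only low-level noise, then program.
levels = [0.05] * 200 + [0.001] * 50 + [0.05] * 10
gains = gated_agc(levels)
print(round(gains[199], 3), round(gains[249], 3))
```

Without the gate, the gain during the pause would climb toward
max_gain and bring the noise floor up with it.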
3. Stereo Enhancement
Not all processors implement stereo enhancement,
and those that do may implement it somewhere other
than after the AGC. (In fact, stand-alone stereo
enhancers are often placed in the program line
in front of the transmission processor.)
The
common purpose of stereo enhancement is to make
the signal stand out dramatically when the car
radio listener punches the tuning button. It's
a technique to make the sound bigger and more
dramatic. Overdone, it can remix the recording.
Assuming that stereo reverb, with considerable
L-R energy, was used in the original mix, stereo
enhancement, for example, can change the amount
of reverb applied to a center-channel vocalist.
The moral? When mixing for broadcast, err on the
"dry" side, because some stations' processors
will bring the reverb more to the foreground.
Because
each manufacturer uses a different technique for
stereo enhancement, it's impossible to generalize
about it. The only universal constraints are the
need for strict mono compatibility (because FM
radio is frequently received in mono, even on
"stereo" radios, due to signal-quality-triggered
mono blend circuitry), and the requirement that
the stereo difference signal (L-R) not be enhanced
excessively. Excessive enhancement always increases
multipath distortion (because the part of the
FM stereo signal that carries the L-R information
is more vulnerable to multipath). Excessive enhancement
will also reduce the loudness of the transmission
(because of the "interleaving" properties
of the FM stereo composite waveform, which we
won't further discuss).
These
constraints mean that recording-studio-style stereo
enhancement is often incompatible with FM broadcast,
particularly if it significantly increases average
L-R levels. In the days of vinyl, a similar constraint
existed because of the need to prevent the cutter
head from lifting off the lacquer, but with CDs,
this constraint no longer exists. Nevertheless,
any mix intended for airplay will yield the lowest
distortion and highest loudness at the receiver
if its L-R/L+R ratio is low. Ironically, mono
is loudest and cleanest!
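A mix engineer who wants to check the L-R/L+R ratio of a mix could
measure it roughly as follows. This Python sketch assumes the ratio is
taken between RMS levels and expressed in dB; the function names are
hypothetical and the two-tone test signal is only a stand-in for real
program material.

```python
import math

def rms(samples):
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def side_to_mid_db(left, right):
    """Level of L-R relative to L+R, in dB (more negative is closer to mono)."""
    mid = rms([l + r for l, r in zip(left, right)])
    side = rms([l - r for l, r in zip(left, right)])
    if side == 0.0:
        return float("-inf")   # pure mono: no difference signal at all
    if mid == 0.0:
        return float("inf")    # pure out-of-phase: no sum signal at all
    return 20.0 * math.log10(side / mid)

n = 1000
left = [math.sin(2.0 * math.pi * i / n) for i in range(n)]
mono_right = list(left)                # identical channels
narrow_right = [0.5 * s for s in left]  # same signal, 6 dB lower on the right

print(side_to_mid_db(left, mono_right))
print(round(side_to_mid_db(left, narrow_right), 2))
```

The second case has L-R at one third of L+R, or about -9.5dB.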
4. Equalization
Equalization may be as simple as a fixed-frequency
bass boost, or as complex as a multi-stage parametric
equalizer. EQ has two purposes in a broadcast
processor. The first is to establish a signature
for a given radio station that brands the station
by creating a "house sound." The second
purpose is to compensate for the frequency contouring
caused by the subsequent multiband dynamics processing
and high frequency limiting. These may create
an overall spectral coloration that can be corrected
or augmented by carefully chosen fixed EQ before
the multiband dynamics stages.
5. Multiband Compression and Limiting
Depending on the manufacturer, this may occur
in one or two stages. If it occurs in two stages,
the multiband compressor and limiter can have
different crossovers and even different numbers
of bands. If it occurs in one stage, the compressor
and limiter functions can "talk" to
each other, optimizing their interaction. Both
design approaches can yield good sound and each
has its own set of tradeoffs.
Usually
using anywhere between four and six bands, the
multiband compressor/limiter reduces dynamic range
and increases audio density to achieve competitive
loudness and dial impact. It's common for each
band to be gated at low levels to prevent noise
rush-up, and manufacturers often have proprietary
algorithms for doing this while minimizing the
audible side effects of the gating.
An
advanced processor may have dozens of setup controls
to tune just the multiband compressor/limiter.
Drive and output gain controls for the various
compressors, attack and release time controls,
thresholds, and sometimes crossover frequencies
are adjustable, depending on the processor design.
Each of these controls has its own effect on the
sound, and an operator needs extensive experience
if he or she is to tune a broadcast multiband
compressor so that it sounds good on a wide variety
of program material without constant readjustment.
Unlike mastering in the record industry, in broadcast
there's no mastering engineer available to optimize
the processing for each new source!
6. Pre-Emphasis and HF Limiting
FM radio is pre-emphasized at 50 microseconds
or 75 microseconds, depending on the country in
which the transmission occurs. Pre-emphasis is
a 6dB/octave high frequency boost that's 3dB up
at 2.1kHz (75µs) or 3.2kHz (50µs).
With 75µs pre-emphasis, 15kHz is up 17dB!
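These figures follow from treating pre-emphasis as a first-order
high-frequency boost with time constant tau, whose corner frequency is
1/(2*pi*tau). The short Python sketch below reproduces them; the
function names are hypothetical, and the ideal first-order curve is an
assumption (real processors band-limit the boost above 15kHz).

```python
import math

def preemphasis_corner_hz(tau_us):
    """Frequency at which the boost reaches 3dB, for a time constant in microseconds."""
    return 1.0 / (2.0 * math.pi * tau_us * 1e-6)

def preemphasis_boost_db(freq_hz, tau_us):
    """Boost of an ideal first-order pre-emphasis curve at freq_hz, in dB."""
    w_tau = 2.0 * math.pi * freq_hz * tau_us * 1e-6
    return 10.0 * math.log10(1.0 + w_tau * w_tau)

for tau in (75.0, 50.0):
    print(tau,
          round(preemphasis_corner_hz(tau)),
          round(preemphasis_boost_db(15000.0, tau), 1))
```

With 50µs pre-emphasis the same calculation gives a boost of
about 13.7dB at 15kHz.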
Depending
on the processor's manufacturer, pre-emphasis
may be applied before or after the multiband compressor/limiter.
The important thing for mixers and mastering engineers
to understand is that putting lots of energy above
5kHz creates significant problems for any broadcast
processor because the pre-emphasis will greatly
increase this energy. To prevent loudness loss,
the processor applies high frequency limiting
to these boosted high frequencies. HF limiting
may cause the sound to become dull, distorted,
or both, in various combinations. One of the most
important differences between competing processors
is how effectively a given processor performs
HF limiting to minimize audible side effects.
In state-of-the-art processors, HF limiting is
usually performed partially by HF gain reduction
and partially by distortion-cancelled clipping.
7. Clipping
In most processors, the clipping stage is the
primary means of peak limiting. It's crucial to
broadcast processor performance. Because of the
FM pre-emphasis, simple clipping doesn't work
well at all. It produces difference-frequency
IM distortion, which the de-emphasis in the radio
then exaggerates. (The de-emphasis is flat below
2-3kHz, but rolls off at 6dB/octave thereafter,
effectively exaggerating energy below 2-3kHz.)
The result is particularly offensive on cymbals
and sibilance ("essses" become "efffs").
In
the late seventies, one of the authors of this
article (R.O.) invented distortion-cancelled clipping.
This manipulates the distortion spectrum added
by the clipper's action. In FM, it typically removes
the clipper-induced distortion below 2kHz (the
flat part of the receiver's frequency response).
This typically adds about 1dB to the peak level
emerging from the clipper, but, in exchange, allows
the clipper to be driven much harder than would
otherwise be possible.
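The general idea can be illustrated with a toy Python sketch. It is not
the authors' algorithm or any product's implementation: it treats the
clipper's distortion as the difference between the clipped and
unclipped signals, isolates the part of that distortion below 2kHz with
a block DFT, and subtracts it. The sample rate, block length,
threshold, test tones, and function names are all assumptions chosen so
that every frequency falls exactly on a DFT bin, and pre-emphasis is
left out for simplicity.

```python
import cmath
import math

FS = 48000          # sample rate in Hz (assumed)
N = 480             # block length; DFT bins are FS / N = 100 Hz apart
CUTOFF_BIN = 20     # 2 kHz, the cancellation limit named in the text
THRESHOLD = 0.7     # clipping threshold (arbitrary)

def dft_bin(signal, k):
    """Single DFT coefficient of a length-N real signal."""
    return sum(s * cmath.exp(-2j * math.pi * k * n / N)
               for n, s in enumerate(signal))

def low_band(signal):
    """Rebuild only the part of a real signal that lies below 2 kHz."""
    out = [dft_bin(signal, 0).real / N] * N
    for k in range(1, CUTOFF_BIN):
        coeff = dft_bin(signal, k)
        for n in range(N):
            out[n] += 2.0 * (coeff * cmath.exp(2j * math.pi * k * n / N)).real / N
    return out

def amplitude_at(signal, freq_hz):
    """Amplitude of the sinusoidal component at freq_hz."""
    return 2.0 * abs(dft_bin(signal, freq_hz * N // FS)) / N

# Tones at 3 kHz and 5 kHz: clipping them creates an IM product at 1 kHz.
x = [0.6 * math.sin(2 * math.pi * 3000 * n / FS)
     + 0.6 * math.sin(2 * math.pi * 5000 * n / FS) for n in range(N)]
clipped = [max(-THRESHOLD, min(THRESHOLD, s)) for s in x]
distortion = [c - s for c, s in zip(clipped, x)]
cancelled = [c - d for c, d in zip(clipped, low_band(distortion))]

print(amplitude_at(clipped, 1000), amplitude_at(cancelled, 1000))
print(max(abs(s) for s in clipped), max(abs(s) for s in cancelled))
```

In this toy case the 1kHz distortion product disappears from the
output, and the peak level rises slightly above the clipping threshold,
which is the trade described above.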
Provided
that it doesn't introduce audibly offensive distortion,
distortion-cancelled clipping is a very effective
means of peak limiting because it affects only
the peaks that actually exceed the clipping threshold
and not surrounding material. Accordingly, clipping
does not cause pumping, which gain reduction can
do, particularly when gain reduction operates
on pre-emphasized material. Clipping also causes
minimal HF loss by comparison to HF limiting that
uses gain reduction. For these reasons, most FM
broadcast processors use the maximum practical
amount of clipping that's consistent with acceptably
low audible distortion.
Real-world
clipping systems can get very complicated because
of the requirement to strictly band-limit the
clipped signal to less than 19kHz despite the
harmonics that clipping adds to the signal. (Band-limiting
prevents aliasing between the stereo main and
subchannel, protects subcarriers located above
55kHz in the FM stereo composite baseband, and
protects the stereo pilot tone at 19kHz.) Linearly
filtering the clipped signal to remove energy
above 15kHz causes large overshoots (up to 6dB
in worst case) because of a combination of spectral
truncation and time dispersion in the filter.
Even a phase-linear lowpass filter (practical
only in DSP realizations) causes up to 2dB overshoot.
Therefore, state-of-the-art processors use complex
overshoot compensation schemes to reduce peaks
without significantly adding out-of-band spectrum.
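The overshoot caused by spectral truncation alone can be seen with an
idealized example. The Python sketch below builds a 1kHz square wave
(a stand-in for a heavily clipped signal) from only those harmonics
that an ideal, phase-linear 15kHz lowpass filter would pass, then
measures how far the result exceeds the original peak. The function
name and the choice of test signal are assumptions for illustration; a
real filter with time dispersion would behave worse.

```python
import math

def bandlimited_square(fundamental_hz, cutoff_hz, points=4800):
    """One cycle of a unit square wave, keeping only harmonics up to cutoff_hz."""
    wave = []
    for i in range(points):
        theta = 2.0 * math.pi * i / points
        total = 0.0
        harmonic = 1
        while harmonic * fundamental_hz <= cutoff_hz:
            total += math.sin(harmonic * theta) / harmonic  # odd harmonics only
            harmonic += 2
        wave.append(4.0 / math.pi * total)
    return wave

filtered = bandlimited_square(1000.0, 15000.0)
peak = max(abs(s) for s in filtered)
overshoot_db = 20.0 * math.log10(peak)   # the unfiltered peak is exactly 1.0

print(round(peak, 3), round(overshoot_db, 2))
```

Even with no phase distortion at all, the band-limited wave overshoots
the original peak by well over 1dB.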
Some
chains also apply composite clipping or limiting
to the output of the stereo encoder. The stereo
encoder is the circuit that encodes the left and
right channels into the single multiplex signal
that drives the transmitter, and it's actually
the peak level of this signal that government
broadcasting authorities regulate. Composite clipping
or limiting has long been a controversial technique,
but the latest generation of composite clippers
or limiters has greatly reduced the interference
problems characteristic of earlier technology.
Conclusions
Broadcast processing is complex and sophisticated,
and was tuned for the recordings produced using
practices typical of the recording industry during
almost all of its history. In this historical
context, hypercompression is a short-term anomaly
and does not coexist well with the "competitive"
processing that most pop-music radio stations
use. We therefore recommend that record companies
provide broadcasters with radio mixes. These can
have all of the equalization, slow compression,
and other effects that producers and mastering
engineers use artistically to achieve a desired
"sound." What these radio mixes should
not have is fast digital limiting and clipping.
Leave the short-term envelopes unsquashed. Let
the broadcast processor do its work. The result
will be just as loud on-air as hypercompressed
material, but will have far more punch, clarity,
and life.
A
second recommendation to the record industry is
to employ studio or mastering processing that
provides the desired sonic effect, but without
the undesired extreme distortion component that
clipping creates. The alternative to brute-force
clipping is digital look-ahead limiting, which
is already widely available to the recording industry
from a number of different manufacturers (including
the authors' companies). This processing creates
lower modulation distortion than clipping and
also avoids blatant flat-topping of waveforms.
Compared to clipping, it is therefore substantially
more compatible with broadcast processing. Nevertheless,
even digital limiting can have a deleterious effect
on sound quality by reducing the peak-to-average
ratio of the signal to the point that the broadcast
processor responds to it in an unnatural way,
so it should be used conservatively. Ultimately,
the only way to tell how one's production processing
will interact with a broadcast processor is to
actually apply the processed signal to a real-world
broadcast processor and to listen to its output,
preferably through a typical consumer radio.