[BC] dueling algorithms and audio quality

Thu Oct 8 08:07:05 CDT 2009

    Richard is right.  16-bit audio is just barely adequate for
a final delivery system, as long as you optimize your gain
structure.  Most non-technical folks miss that last part, though.
That's why our biggest challenge is education.

    When explaining it to non-technical folks, I believe that a
couple of pictures are worth a thousand words, so I show them
two .tif or .bmp photos of the exact same scene, one that is
very-high-resolution (lots and lots of pixels) and another one
that is the exact resolution of their computer monitor (usually
1024 x 768).  I demonstrate to them that they both look exactly
the same on their computer monitor.

    Then I crop them both to about 40% of the original size,
and then blow them up to fill the screen again.  This is a
visual process that is similar to what happens when an audio
signal that sits at around -25 dBFS in a 16-bit environment
gets expanded, compressed and limited until it peaks at
-0.5 dBFS.  The
"jaggies" that result in the processed lower-rez source
file become immediately obvious to the naked eye, while the
processed high-rez source file still looks fine.
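
    To put rough numbers on that analogy, here is a minimal
Python sketch of the arithmetic.  It assumes the usual rule of
thumb of about 6.02 dB per bit, and the function names are made
up for illustration; it does not model any particular processor.

```python
import math

DB_PER_BIT = 20 * math.log10(2)   # about 6.02 dB per bit (rule of thumb)

def effective_bits(peak_dbfs, bits):
    """Bits actually exercised by a signal peaking at peak_dbfs (0 or below)."""
    return bits + peak_dbfs / DB_PER_BIT

def makeup_gain_db(source_dbfs, target_dbfs):
    """Gain needed to bring the source peak up to the target peak."""
    return target_dbfs - source_dbfs

source, target = -25.0, -0.5      # levels taken from the example in the text
print(f"make-up gain:           {makeup_gain_db(source, target):.1f} dB")
print(f"bits used, 16-bit file: {effective_bits(source, 16):.1f}")
print(f"bits used, 24-bit file: {effective_bits(source, 24):.1f}")
```

    The 24.5 dB of make-up gain lifts the quantization floor by
the same amount, so the 16-bit source is left with roughly 11.8
bits' worth of resolution, while a 24-bit source recorded at the
same level still has about 19.8.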

    With respect to dither. . . it is basically noise that is
added at a low level to prevent the quantization granularity
from causing intermodulation effects with the audio.  Generally,
this noise is shaped in the frequency domain to be masked by the
audio so that it is minimally noticeable.  However, much like
bit-rate reduction artifacts, subsequent audio processing
(especially multi-band audio processing) reduces the
effectiveness of this "hiding in plain sight" technique, and
then the dither becomes more audible, in some cases quite
irritating.
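
    A minimal sketch of what dither buys you at very low levels.
It assumes plain triangular (TPDF) dither of +/- 1 LSB with no
frequency shaping, which is simpler than the shaped noise
described above.  The helper names and the 0.4 LSB test tone are
illustrative assumptions.

```python
import math
import random

PERIOD = 100  # samples per cycle of the test tone (arbitrary choice)

def quantize(x):
    """Round to the nearest whole step (1 LSB)."""
    return math.floor(x + 0.5)

def quantize_sine(amplitude_lsb, n, dither, seed=1):
    """Quantize n samples of a sine, optionally adding TPDF dither first."""
    rng = random.Random(seed)
    out = []
    for i in range(n):
        x = amplitude_lsb * math.sin(2 * math.pi * i / PERIOD)
        if dither:
            # Difference of two uniforms: triangular PDF over +/- 1 LSB.
            x += rng.random() - rng.random()
        out.append(quantize(x))
    return out

def recovered_amplitude(samples):
    """Project the quantized output back onto the original sine."""
    n = len(samples)
    return 2 / n * sum(s * math.sin(2 * math.pi * i / PERIOD)
                       for i, s in enumerate(samples))

plain = quantize_sine(0.4, 100_000, dither=False)
dithered = quantize_sine(0.4, 100_000, dither=True)
print("undithered:", recovered_amplitude(plain))
print("dithered:  ", round(recovered_amplitude(dithered), 2))
```

    Without dither, every sample of the 0.4 LSB tone rounds to
zero and the signal simply disappears.  With dither, the tone
survives at close to its original amplitude, at the cost of a
steady noise floor.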

    Because that noise is then modulated by the audio level
variations, it behaves much like the noise in a
limited-dynamic-range storage or transmission medium that uses
a complementary (encode in front, decode at the output) noise
reduction system such as dbx or Dolby to "improve" the apparent
signal-to-noise ratio of the channel.  The noise goes away
(hopefully gracefully) as the audio level drops, but it is
still there whenever the audio is present, and is therefore
perceived as non-harmonically-related distortion.

    I've been in recording studios where such noise reduction
systems had no apparent noise, but the audio sounded ugly in an
indefinable way.  Everyone could hear the problem, but none of
the studio staff could figure out what it was.  When I put the
NR system into bypass mode (something that the staff had never
done), the egregiously high levels of hum and buzz in the
channels immediately became apparent.  They were due to poor
grounding and shield termination techniques going into and out
of a 24-track machine with unbalanced inputs and outputs
(anyone remember the Stevens decks?), and were trivial to
correct.

Grady

======================
Richard Johnson mused:

> 16-bit audio is signed (two's complement). It runs from -32768
> through 0 to +32767. The peak dynamic range is therefore 32767:1
> (90 dB). This is not what the ear hears, though. The ear hears
> the "power," so the real dynamic range is less by the peak-to-RMS
> ratio, i.e., 0.707 * 32767 = 23167. In other words, a sine wave
> with peak amplitude of +/- 32767 counts only has a RMS value of
> 23167 counts.
>
> However, we got along just fine for many years with only 70 dB
> of dynamic range when analog equipment was used. The difference
> was that very low-level analog signals were embedded in noise,
> which seemed natural.  Digital signals, however, become granular
> which is unlike anything heard in nature. The solution is dither.
> It does not make any difference if you run 16-, 18-, or 24-bit
> converters if the equipment is designed correctly.
>
> For instance, suppose you are going to sum the inputs of four
> channels to provide an output. How do you sum them? If you are
> exercising all 16-bits on each of the input channels, do you
> need four times that number of bits on the output channel?
> The answer is NO as long as you do the summation correctly.
> At least one "respected" digital board I reviewed two years
> ago at the NAB was designed without a clue.
>
> If you attempt to divide the four inputs by 4 before you sum
> them to the output channel so you do not overload the output,
> you have removed the two LSBs on all four input channels!
> This is going to cause granulation artifacts on those channels.
> The "correct" solution is to use (or emulate) floating-point
> summation.
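
    The two summing approaches Richard describes can be sketched
in a few lines of Python.  This is only an illustration under
simple assumptions: the function names are hypothetical, Python's
unbounded integers stand in for a wider accumulator, and the
final rounding (and dithering) back down to 16 bits is left out.

```python
def mix_divide_first(samples):
    """Divide each input by 4 with an integer shift, then sum.

    The shift discards the two LSBs of every input before they are added.
    """
    return sum(s >> 2 for s in samples)

def mix_sum_first(samples):
    """Sum at full precision, then scale once in floating point."""
    return sum(samples) / 4.0

quiet = [3, 3, 3, 3]      # four low-level inputs, 3 counts each
loud = [32767] * 4        # four inputs at positive full scale

print(mix_divide_first(quiet))   # 0: all four inputs vanished
print(mix_sum_first(quiet))      # 3.0: the low-level detail survives
print(mix_sum_first(loud))       # 32767.0: still no overload
```

    Scaling once after the sum keeps the low-level information
that the divide-first version throws away, and the output still
fits within the 16-bit range.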