Audio Engineering for Beginners

We have more "how to" audio tutorials online than we're ever going to need. So I'm giving you something else: this blog provides easy-to-comprehend explanations of basic audio terminology and processes. For over a decade it was my job to teach these concepts at several New York City colleges and universities. My students felt that my methods of presenting these topics were often easier to grasp than those found elsewhere. So if this is what you're looking for, read a post and leave a comment.

Starting at 0 dB - Part I: Sound Pressure Level (SPL)

As beginning audio engineers we're going to have to learn a few things about decibels (dB) that aren't particularly intuitive or obvious. One of the most fundamental is that decibels aren't only used to measure differences in sound pressure: they're also used to compare differences of intensity in audio signals that are passing through our gear before they hit our speakers. We're going to be looking at visual representations of these signals whenever we use any kind of audio level meter.

We're also going to have to learn that the decibel is a ratio: an expression of the difference between two values. This can be confusing because only one of the values is given; the other is implied. The implied value is a reference level of 0 dB.

Let's start with this statement and break it down:

"Average conversation levels are around 70 dB"

Two things are being implied here and it's up to you to know how to fill in the blanks:

1. What quantity is being measured?

The dB by itself is just a unit. The statement above is comparable to saying "I have 70 ounces". 70 ounces of what? Implied in the statement is that we're talking about sound that we can hear: changes in the intensity of sound pressure, or sound pressure level (SPL).

2. What is the 0 dB reference point? What are we comparing 70 dB to?

0 dB SPL is defined as the softest sound that a human can possibly hear. And that's really soft!

So what this statement is really saying is:

"Average conversation levels are around 70 dB above 0 dB SPL. 0 dB SPL is the softest sound that a human can possibly hear"

Not too difficult so far right? Good.

NOTE: At this point I have to at least mention that decibels use a logarithmic scale. 70 dB SPL is not 70 times more powerful than 0 dB SPL. It's 10,000,000 times more powerful! This discussion is way beyond the scope of this short post. I've put a link to a more in-depth article at the end if you want to look into it further.
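If you'd like to check that number yourself, here's a minimal sketch of the arithmetic. It assumes the standard decibel formula for power (intensity) ratios, dB = 10 x log10(ratio); the function names are made up for this example.

```python
import math

def power_ratio_from_db(db):
    """Convert a decibel difference to a power (intensity) ratio."""
    return 10 ** (db / 10)

def db_from_power_ratio(ratio):
    """Convert a power (intensity) ratio to a decibel difference."""
    return 10 * math.log10(ratio)

# 70 dB above the reference is a power ratio of ten million, not seventy
print(power_ratio_from_db(70))

# every additional 10 dB multiplies the power ratio by ten
for db in (0, 10, 20, 30):
    print(db, "dB ->", power_ratio_from_db(db))

# going the other way: doubling the power adds roughly 3 dB
print(round(db_from_power_ratio(2), 2))
```

Notice that 0 dB gives a ratio of 1: the level being measured is the same as the reference.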

So now we know that:

The dB is used to compare changes in power or intensity. But without a quantifier after the "dB" we don't know what we're measuring and comparing
SPL refers to sound pressure level. When someone says "the music was loud! Around 100 decibels" we know that this implies 100 dB SPL
Decibels use 0 dB as a reference: all other levels are compared against it
0 dB SPL is referenced as the softest sound that a human can possibly hear

Now take a look at the analog meter above. Notice where 0 dB is on its scale. Since we know that 0 dB SPL is the softest sound that a human can hear this meter must be referencing something other than sound pressure.

Look at this digital peak meter. On this meter the 0 dB reference is at the top of the scale! It must be measuring something other than SPLs.

This is where the confusion usually starts. And that's where we're going to pick it up in the next post.

In the meantime if you do a search for "decibel chart" you'll find a bunch of graphics similar to the one below. These will give you a basic idea of the dB SPL ratings for some common sounds.

Also grab an SPL meter app for your smartphone. Many of them are free!

If you want to do some more in-depth reading on your own, here's a good article by Justin Colletti at Sonicscoop: Beyond the Basics: Demystifying dB

Thanks!

Hey if you turn a dB on its side you get an emoticon!

Karl Wenninger is an audio engineer, synthesist/sound designer, composer, guitarist and DIY audio electronics enthusiast. As an adjunct professor he has taught Pro Tools at The New School for Jazz and Contemporary Music, Computer Music at York College and Audio Post-Production for the Media Arts Program at NJCU. He was a program administrator and associate professor at the former Digital Media Arts program at Touro College in New York City for over a decade.

Analog to Digital Conversion - Part II: Bit Depths and Quantization

a "bit" of humor to "process"

In previous posts we've covered sound pressure waves, transducers, analog audio, preamplifiers and amplifiers, waveforms, binary basics and digital audio sample rates. This article assumes that you already have a handle on these fundamentals. If you find yourself a little confused about the terminology presented here you might want to review the earlier posts.

This time we're going to tackle bit depths and quantization: how your analog to digital converter (ADC) will be forced to interpret or round off the levels of continuous analog signals that fall in between the fixed values of a digital sampler. This rounding-off of values is called "quantizing" or sometimes "rounding error". In this case don't let the word "error" make you think that the system is malfunctioning; it's how this stuff was designed to work. And once you understand the basics it's actually a pretty simple process.

If this is new to you the concept of continuous analog vs. fixed digital may seem a little abstract. So let's start with an analogy that we can all relate to.

Analog vs Digital Clock

An analog and a digital clock are synchronized. When the minute hand on the analog clock hits 12:05 the digital clock also reads 12:05.

Over the next 59 seconds the minute hand on the analog clock moves gradually forward. Although it's moving slowly it is continuously moving. In the meantime the digital clock continues to display 12:05.

It's not until the analog clock hits exactly 12:06 that the numeric display on the digital clock bumps over and both are in sync again.

In this example the analog clock is capable of conveying much more information than the digital clock.

Increasing Resolution Increases Accuracy

If we need more information from the digital clock we could increase its resolution by adding two digits to the right of the minutes column displaying seconds. If we were timing a sporting event we may need even more accuracy from the digital clock. In that case we'd need to increase its resolution further by adding more fields to display milliseconds. We could go on subdividing time values on our digital clock and increasing its resolution almost forever. But at some point the digital clock is going to give us all the information we need to get our work done. Increasing its resolution beyond this wouldn't be useful and it would make the digital clock more complicated and expensive to manufacture.

Analog vs Digital Audio

A continuously varying analog audio signal is represented in red. The green dot illustrates a single digital sample: an analog voltage measurement converted to a 16 bit digital word

An analog audio signal consists of a continuously varying voltage (change in electron pressure).
A digital audio sample is not continuous: it has a finite number of possible values to work with.

Analog to Digital Conversion

In order to capture and encode a digital audio sample a circuit called an analog to digital converter (ADC) is used.

The ADC repeatedly measures the amplitude (intensity or loudness) of the analog signal at regularly timed intervals.
Each individual measurement is converted to a binary number and stored.
Each digitally stored measurement is called a sample.
Allocating more bits to each sample increases its resolution and therefore the accuracy of each measurement.

Got that? Let me see if we can make it a little simpler.

The images below illustrate what might happen if we tried to digitally capture an analog audio signal with an ADC using just one bit.

The graphic on the left describes an analog input signal before it hits the ADC. Notice that with each repeated cycle of the signal its amplitude decreases.
Each green dot (sometimes called a "lollipop" graph) on the center graphic illustrates a single digital sample. A one bit converter would have only two possible fixed values to choose from: "1" or "0" (labeled on the graph's vertical axis). Notice that:

if the amplitude of the analog signal goes above the center line the ADC must quantize (round) the value up and assign it a digital "1"
if the analog signal falls below the center the ADC must quantize the value down and assign it a digital "0"

On playback the digital to analog converter (DAC) looks at the values of the encoded samples (lollipops) and uses them to construct a new analog output. While the frequency content of the input is maintained all variations in amplitude are lost and our signal has been rendered to a square wave.

Now let's increase the number of bits available to our converter from one to two and see how it increases resolution.

In this instance the center image depicts the same number of samples (the sample rate) as in the previous example.
The ADC still must quantize the analog signal up or down to the closest available step. But by increasing the bit depth from one to two bits we've doubled the number of discrete values available to the converter to choose from. These are again labeled on the graph's vertical axis.
On output the DAC uses the stored sample values to construct a new analog signal. Notice that while the output is still very distorted some of the amplitude variations of the input are beginning to reappear.

Let's see what would happen if we upped our converters to three bits.

By increasing the ADC's bit depth from two to three bits we've again doubled the number of values that it has to work with from four to eight. Starting to see a pattern here?
The sample rate remains the same as in the previous examples. The only thing that's changed is the sampler's bit depth.
Our output is beginning more and more to look like (and sound like) the input.

And here's the same signal quantized by a four bit converter.

By raising the number of bits available to the ADC from three to four we've again doubled the number of available quantize steps from eight to sixteen.
While our output still isn't identical to the input the variations in amplitude are nearly the same.

Starting to get the idea?

Adding bits to a sample allows it to capture an analog signal's amplitude with greater accuracy.

Every additional bit that we allocate to a binary word doubles the number of values it can yield.
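The one, two, three and four bit examples above can be sketched in a few lines of code. This is a simplified illustration, not how a real converter is built: it assumes the analog signal stays between -1.0 and 1.0, spreads the available levels evenly across that range, and uses a made-up function name (quantize).

```python
import math

def quantize(value, bits):
    """Round a value between -1.0 and 1.0 to the nearest of 2**bits evenly spaced levels."""
    levels = 2 ** bits                      # every extra bit doubles this number
    step = 2.0 / (levels - 1)               # distance between neighboring levels
    index = round((value + 1.0) / step)     # nearest level, counted from the bottom
    index = max(0, min(levels - 1, index))  # stay inside the available levels
    return -1.0 + index * step

# one cycle of a sine wave measured at 16 points stands in for the analog input
signal = [math.sin(2 * math.pi * n / 16) for n in range(16)]

for bits in (1, 2, 3, 4, 16):
    worst = max(abs(s - quantize(s, bits)) for s in signal)
    print(bits, "bits:", 2 ** bits, "levels, worst rounding error", round(worst, 5))
```

With one bit every sample lands on either the top or the bottom level, which is the square wave from the first example. As the bit depth goes up the worst rounding error shrinks, but it never quite reaches zero.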

Got it? If so I think you've managed to absorb the big picture that I hoped to convey here.

But before we wrap it up let's consider this:

If greater bit depths increase the resolution (and therefore the accuracy) of a digital sample, we might assume that at some point we could arrive at a bit depth where quantization becomes unnecessary. Right?

Nope. That's just not going to happen.

As you're recording your converter is firing off samples at a fixed frequency determined by its sample rate. As we found out in Part I of this series, sample rates determine the highest audio frequency that your converter can accurately capture. However, your converter is going to be recording these samples at regularly timed intervals regardless of the frequency or phase of the analog signal you're feeding it. The amplitude of your analog signal may just happen to coincide with one of the quantized steps of a sample. But it's just as likely to fall somewhere between the cracks.

And that's why we have something called dither. And that's what we'll cover next time. We'll also find out why your DAW hardware and software offers options for recording and processing samples at both 16 and 24 bit. Which means we'll also have to discuss dynamic range. And what is that 32 bit floating point thing anyway?

In the meantime: if you're interested in pursuing more information on the topic this book was my primary reference. It's also been one of the definitive texts on the subject for several decades.

Pohlmann, Ken. Principles of Digital Audio, Sixth Edition: 2010

A nice example of patience and practice

Before leaving I want to mention that I was a musician for many, many years before I became interested in audio technology. I still vividly remember how overwhelmed I was with all this jargon and terminology at first. None of it came easy for me. But with some patience and practice it all eventually started to make sense.

As much as I can I'm trying to approach these articles from a beginner's perspective. If you've read a post and still have questions post a comment and let me know. I'll do what I can to help. Really.

Thanks.

Analog to Digital Conversion - Part I: Sample Rates, Aliasing and Harry

In this post we're going to learn the basics of digital sampling and sample rates. Specifically, how an analog to digital converter (ADC) circuit does exactly what its name implies: measures analog audio signals and converts these measurements to binary numbers. Once we've had our fun mangling those signals in our audio software we need to get them out to our ears and (hopefully) to the ears of others. Since digital data can't push a speaker back and forth we need another circuit called, as you might expect, a digital to analog converter (DAC). Your DAC reads the stored digital samples and uses them to construct a new analog signal. This signal is routed through the outputs of your audio interface to your speakers. Your speakers then vibrate and create sound pressure waves in the air.

We'll also learn that in many ways this all got started with some ideas put forth by a guy named Harry. Harry worked for the phone company 90 years ago.

I invented this stuff! I want stock options from Apple and Spotify!

Sorry Harry you've been in the ground for decades. But every time anyone uses any kind of digitally encoded audio your work comes into play. And now it happens millions and millions of times every day. Thank you!

Quick Review

Before going forward here's a quick refresher on sound waves, analog audio, waveforms, frequency and Hertz:

a sound pressure wave and an analog audio signal are both made up of fluctuating pressure variations

in a sound wave these are pressure variations in the air
in an analog audio signal these are electron pressure variations called voltage

one positive and one negative pressure variation comprise one complete cycle of a wave
the number of complete cycles that occur in one second is called the frequency and is measured in Hertz (Hz)
a waveform is a graphic that describes the physical attributes of a wave

I've covered these topics in previous posts. If you're unclear on the above it's in your own best interest to read them before proceeding. Here's a link:

Sound Waves, Transducers and Analog Audio

And here we go...

Analog to Digital Conversion (ADC)

Take a look at the image below. The red waveform represents an analog audio signal. The green line with the dot or "lollipop" represents a single digital sample. As you're recording your analog to digital converter (ADC) looks at the incoming audio signal and measures its amplitude (voltage level) at a specific point in time. As far as a single sample is concerned all the ADC is measuring is amplitude; the frequency or phase of the signal is irrelevant.

Once the ADC has determined the amplitude of the analog signal it converts this measurement to a binary word (digital data). Notice that in the example given the binary word is 16 bits (ones and zeros) long. If your ADC is set to record at 16 bits every single sample will be encoded at that word length.

Note: As we add more bits to a digital sample an ADC is able to encode the amplitude of an analog signal with greater accuracy. We'll talk more about bit depths in an upcoming post.

When the ADC is finished with its first sample it repeats the process. The amplitude of the analog audio signal is again measured, encoded as a digital word and stored.

As you're recording this process continues over and over again until you hit the stop button.

When you hit stop all of the samples are stored in a single audio file on your drive. That file contains many, many thousands of discrete digital measurements of the amplitude of your analog audio input.

Digital samples are discrete measurements of analog amplitudes

Digital to Analog Conversion (DAC)

When you hit play the conversion process is essentially reversed. Your digital to analog converter (DAC) looks at all those discrete samples and uses them to construct a new analog signal. This analog signal is sent to your speaker and causes the speaker cone to vibrate. The speaker cone's vibrations create sound pressure waves in the air. Now we have something that we can hear. Everyone is happy.

Note: I've been talking about ADCs and DACs as if they're two separate circuits and they are. But most of you have both circuits (and a lot of other features) built into your audio interface.

Sample Rate

The number of times per second your ADC is measuring, encoding and storing samples is called the sample rate. Sample rates are measured in Hertz (Hz). Again, a single sample is only capable of capturing an analog signal's amplitude at one specific point in time. The sample rate (how many measurements per second) determines how accurately your ADC can capture an analog signal's frequency.

On the graphic below I've indicated one complete cycle or wavelength of the analog signal.

Note: wavelength is often indicated by the Greek letter lambda (λ, which looks like an upside-down "y")

The green lollipops that represent our digital samples on this graphic occur at the same sample rate (sometimes called sample frequency) as our audio frequency. For every complete cycle of the analog audio waveform there is a digital sample to account for it.

See any problem with this? Look at it again.

In this example only the positive side of the analog waveform is being sampled.

This just isn't going to work. At a minimum we need to sample both the positive and the negative phase of each complete cycle of the analog signal.

In the example above we've doubled the amount of digital lollipops. Our ADC can now capture both the positive and negative phase of the analog signal.

In order to accurately capture an analog audio signal our digital sample rate must be at least twice the frequency of the analog signal.
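Here's a small sketch of the two lollipop graphics in code form. It assumes a 1,000 Hz cosine wave as the analog signal (chosen so that the samples land on the peaks) and a made-up helper called sample_cosine; the results are rounded to hide tiny floating point leftovers.

```python
import math

def sample_cosine(signal_hz, sample_rate_hz, count):
    """Measure a cosine wave's amplitude at evenly spaced sample times."""
    return [round(math.cos(2 * math.pi * signal_hz * n / sample_rate_hz), 6)
            for n in range(count)]

# sample rate equal to the signal frequency: one sample per cycle
print(sample_cosine(1000, 1000, 6))   # [1.0, 1.0, 1.0, 1.0, 1.0, 1.0]

# sample rate twice the signal frequency: two samples per cycle
print(sample_cosine(1000, 2000, 6))   # [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
```

In the first case every sample has the same value, so the stored data looks like a constant voltage and the wave has disappeared. In the second case the samples swing between positive and negative, so the back and forth motion of the wave is captured. If the samples at exactly twice the frequency happened to land on the zero crossings instead of the peaks they would all read zero, which is one reason real systems sample somewhat faster than the bare minimum.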

Does that make sense? Good. You now understand the basis of the Nyquist Theorem. Harry Nyquist was an electronic engineer who worked for Bell Labs during the first half of the 20th century. His research is the foundation of what would eventually become the process for digitally sampling all kinds of waves. Sometimes big companies with lots of money pay smart people to try to figure stuff out in the hopes of making more money. And sometimes it benefits all of us.

So now we know that in order to accurately sample an analog signal our sample rate has to be at least twice that of the highest frequency of the signal we hope to capture. The range of human hearing in frequency is said to be 20 Hz to 20 kHz. If the highest frequency that we can possibly hear is 20 kHz we might think that we could set the sample rate on our ADC to 40 kHz and be done with it. But...

good doggie!

Just because we can't hear frequencies above 20 kHz doesn't mean they're not there. And just because an ADC running at a 40 kHz sample rate won't accurately capture frequencies we can't hear doesn't mean that it's not going to capture something.

Aliasing

Take a look at the graphic below. The samples (lollipops) occur at the same rate as the previous example. But this time I've overlaid an analog signal with a frequency that is obviously much too fast to be accurately captured.

However our ADC is still going to measure, encode and save samples at each sample period. Remember, each single sample is just a measurement of amplitude regardless of frequency.

If I remove the original analog signal we're left with this series of samples.

When you press play your digital to analog converter (DAC) is going to look at these samples and output a new, much lower analog frequency that wasn't in your original input signal AT ALL!
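To put some numbers on it, here's a rough sketch. It assumes a 40 kHz sample rate and a 30 kHz input (an arbitrary frequency above the 20 kHz limit), and uses the standard frequency "folding" arithmetic for sampled signals. The function names are made up for this example.

```python
import math

def alias_frequency(signal_hz, sample_rate_hz):
    """Frequency that the stored samples of signal_hz appear to describe."""
    folded = signal_hz % sample_rate_hz
    return min(folded, sample_rate_hz - folded)

def sample_cosine(signal_hz, sample_rate_hz, count):
    """Measure a cosine wave's amplitude at evenly spaced sample times."""
    return [round(math.cos(2 * math.pi * signal_hz * n / sample_rate_hz), 6)
            for n in range(count)]

print(alias_frequency(30000, 40000))   # 10000

# the samples of the 30 kHz input match the samples of a 10 kHz signal
high = sample_cosine(30000, 40000, 8)
low = sample_cosine(10000, 40000, 8)
print(high == low)                      # True
```

Because the two sets of samples are identical, the DAC has no way of knowing that the original input was 30 kHz. It rebuilds the 10 kHz signal instead.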

This is called aliasing and we can't have it. Fortunately you don't have to worry about it. The engineers who designed your converter already took care of it.

Anti-aliasing Low Pass Filter

A low pass filter does exactly what its name implies: it lets low frequencies pass while filtering out the highs. There's a low pass filter built into your ADC just for this purpose: to filter out higher analog frequencies that can cause aliasing.

A filter has a cut-off frequency. This is the point at which it begins to attenuate (reduce) signals. The filter doesn't attenuate infinitely at its cut-off point; it has a gradual slope. When it reaches the end of the slope all frequencies above that are removed entirely.

The graphic above illustrates a low pass filter that was designed to begin to attenuate all analog frequencies at 20 kHz and eliminate them entirely after 22.05 kHz. These signals are removed before they hit your ADC.

Remember, Mr. Nyquist told us that in order to accurately sample an analog audio signal our sample rate must be at least twice as fast as the highest frequency we want to capture.

22.05 kHz * 2 = 44.1 kHz

Ever see that number before?

The first commercially released audio CD

Every audio CD that was ever made since the beginning of time in 1982 was encoded at a sample rate of 44.1 kHz

A Few Points Worth Pointing Out

Occasionally I run across someone who believes that a specific audio software application sounds better than another for recording. This is impossible. When you're recording you're capturing signals from your analog gear (probably mic and mic pre) to your ADC. Your ADC is creating the digital audio samples. The software you're using is irrelevant. Claiming that one DAW sounds better for recording is analogous to claiming that Photoshop is the best program for taking pictures.
This should also clue you into the fact that regardless of how good DAWs and plugins get at emulating analog gear for mixing your mics, preamps and converters are still going to have a huge effect on the sound of your digital recordings.
There are a lot of options for working at sample rates higher than 44.1 kHz or "CD quality". 48 kHz is the standard audio sample rate for the film/video industry. Other than that the topic of higher sample rates can be highly contentious and has been debated fervently for quite a while now. It's beyond the scope of this article to get into it but I suggest you check it out on your own; you're eventually going to start asking about it anyway. Here are a few links to get you started:

Finally...

A Little Bit on Bit Depth

All those lollipop graphics I showed you earlier are really nice for illustrating sample rates. However the truth of the matter is that those dots on the lollipop tops are very often not going to precisely coincide with the amplitudes of your analog signals. Don't worry about it. This was all taken into consideration and compensated for a long time ago. But next time we'll have to talk about bit depths and quantization.

In the meantime if you're not too sure what a "bit" is you can find out here:

Bits and Binary 101 = 5


Bits and Binary 101 = 5

It's not necessary to understand binary code to work with digital audio but it is useful when learning the theories behind digital sampling. Every time you start a new session in your DAW or bounce your finished production out to the rest of the world you make decisions concerning sample rates, bit depth or bit rates. You probably already know that the settings you choose for these parameters affect the quality of your audio. Having a basic grasp of what those ones and zeros mean in these contexts can help you make more informed decisions. So here we go...

Decimal (base-10) numbers

We learn to count using a decimal or base-10 system. This becomes so second-nature and intuitive to us that we're barely cognizant of it. In our decimal number system we have

ten different digits representing the values zero through nine (0-9)
no single digit for the number ten (10) or greater

each time we exceed a value of 9 we add a new number column to the left of that digit and start over again with 1
each new column represents a value ten times the column to its right

For example, we immediately recognize the value of the number "101" without having to think that it represents 1 hundred, 0 tens and 1 one. If we were trying to explain this concept to someone who had never been exposed to it before it might help to present the number in a table:

In a decimal number system each column represents a value 10 times greater than the column to its right

Binary (base-2) numbers

A binary or base-2 system uses

two digits representing the values zero and one (0 and 1)
when we exceed the value of 1 we start over again with a new column on the left
each new column represents a value two times the column to its right

In a binary number system each column represents a value 2 times greater than the column to its right

As an analogy for reading binary code it's often said that you can think of its zeroes and ones as switches; zero being "off" and one being "on". I like this analogy because we also tend to think of a switch as a device that controls a circuit or a machine. And right now you're reading this post on a machine that's using a lot of electronic switches.

Let's plug our "101" into a binary table and see what we get. Keep in mind that it no longer represents the decimal value "one hundred and one". Instead think of it as a series of on and off switches.

Reading the table above from right to left:

the "one" column is switched on returning a decimal value of 1
the "two" column is switched off returning a decimal value of 0
the "four" column is switched on returning a decimal value of 4

add the values together and you get decimal 5
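The same right-to-left reading can be written as a short piece of code. This is just an illustration of the switch idea; the function name binary_to_decimal is made up for this example.

```python
def binary_to_decimal(bits):
    """Add up the column values of every bit that is switched on."""
    total = 0
    column_value = 1                  # the rightmost column is the "one" column
    for bit in reversed(bits):        # read from right to left
        if bit == "1":                # switch is on: count this column's value
            total += column_value
        column_value *= 2             # each column to the left is worth twice as much
    return total

print(binary_to_decimal("101"))       # 5
print(binary_to_decimal("111"))       # 7
print(int("101", 2))                  # 5, using Python's built-in conversion
```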

Get it? If you're still unsure study the table below and it should start to make sense. Remember to look at the binary ones and zeros as switches; they just turn the value at the top of their column on or off.

Bits

In geek speak a bit is a contraction of the words binary digit. Every single zero or one in a string of binary numbers constitutes one bit.

The table above illustrates every possible 3-bit binary number with its decimal equivalent presented on the right. Take a look and note that:

using only one bit there are 2 possible values: 0 and 1. To express a value larger than 1 it's necessary to add another bit to the left.
using two bits there are 4 possible values: 00, 01, 10 and 11. To express a value larger than 11 it's necessary to add another bit to the left.
with three bits there are 8 possible values: 000, 001, 010, 011, 100, 101, 110 and 111. To express a value larger than 111 it's necessary to add another bit to the left.

Note: if you just read 111 as "one hundred eleven" please go stare at the picture of the light switches above

Do you see any patterns here?

for every single bit added to a binary word the range of possible values doubles. If you didn't notice look at the table again. Understanding this is going to be very helpful when you learn about bit depth, quantization and dynamic range in digital audio systems.

when every bit in a binary word is at "1" it has reached its maximum value and can't increase any further. In digital audio these maximum values are referred to as full scale and trying to exceed them will result in clipping. Heard of that?
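Both patterns are easy to check with a few lines of code. This sketch uses a made-up function name (word_stats) and assumes plain unsigned binary numbers like the ones in the table.

```python
def word_stats(bits):
    """Return (number of possible values, largest value, largest value in binary)."""
    count = 2 ** bits                 # doubles with every added bit
    full_scale = count - 1            # every bit switched on
    return count, full_scale, format(full_scale, "b")

# every possible 3-bit word, in counting order
print([format(n, "03b") for n in range(8)])

for bits in (1, 2, 3, 4, 16, 24):
    count, full_scale, pattern = word_stats(bits)
    print(bits, "bits:", count, "possible values, largest is", full_scale)
```

At 3 bits the largest value is 111 in binary, or decimal 7. At 16 bits there are 65,536 possible values and at 24 bits there are 16,777,216.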

Bytes

01000010

A string of eight bits together is called a byte and represented by an uppercase "B". We most often run across bytes in our day to day work when referencing file size or storage space.

Conversely, bits are represented by a lowercase "b" and commonly used to indicate bit rate which refers to the amount of data transferred or processed in one second.

File size and bit rate are different terms which are often confused. Remember to look for the big "B" for bytes or little "b" for bits. Now you know...
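Here's a rough sketch of how the two relate. The function names are made up, the CD figures assume two channels (stereo) at 16 bits and 44.1 kHz, the 128 kb/s rate is just an example value, and a megabyte is treated as 1,000,000 bytes.

```python
def bit_rate_bps(sample_rate_hz, bits_per_sample, channels):
    """Uncompressed audio bit rate in bits per second (little "b")."""
    return sample_rate_hz * bits_per_sample * channels

def file_size_bytes(bit_rate, seconds):
    """Storage needed in bytes (big "B"): 8 bits make one byte."""
    return bit_rate * seconds / 8

cd_rate = bit_rate_bps(44100, 16, 2)
print(cd_rate)                                   # 1411200 bits per second
print(file_size_bytes(cd_rate, 60) / 1_000_000)  # megabytes for one minute of audio

# a compressed file at an example bit rate of 128 kilobits per second
print(file_size_bytes(128_000, 60) / 1_000_000)  # megabytes for one minute of audio
```

One minute of CD-quality stereo works out to a little over 10.5 megabytes, while one minute at 128 kilobits per second is just under 1 megabyte.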

On the left the storage requirements of a Pro Tools session folder are shown in gigabytes.
On the right the bit rate of an MP3 file is shown in kilobits per second.

Machine Code

Finally, if you want to have a few bits (or bytes) of fun click on this link and type in your name or favorite ice cream flavor. In return you'll get to see what the processor on your device is really looking at once your entry gets through your browser, your OS and many, many layers of programming languages.

Something to keep in mind as you look at all those 1s and 0s: the processor on your device doesn't see them as we do. A "1" to your processor is just a little voltage—a little bit of electricity. A "0" means no voltage.

Every single bit is processed by your device on a tiny transistor: an electronic switch that's either on or off.
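If you're curious how text turns into ones and zeros, here's a minimal sketch. It only shows how characters are stored as bytes, assuming UTF-8 text encoding, which is a small part of what your processor actually deals with. The function name text_to_bits is made up for this example.

```python
def text_to_bits(text):
    """Show each byte of the text as a string of eight bits."""
    return " ".join(format(byte, "08b") for byte in text.encode("utf-8"))

print(text_to_bits("B"))    # 01000010
print(text_to_bits("dB"))   # 01100100 01000010
```

The uppercase "B" comes out as 01000010: eight bits, or one byte.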

Amazing.

Next time we're going to put your new binary knowledge to good use as we begin to cover the analog to digital audio conversion process in detail.

Petzold, Charles. Code: The Hidden Language of Computer Hardware and Software, 2000

This book was my primary reference for this article. A great read for those who are interested in learning how your digital devices really work under the hood.
