The main parameters to guarantee the audio quality of digital product

This article explains the main parameters that guarantee the audio quality of a digital product, including formats and compression.

Audio settings on recorders or sound interfaces can be very confusing. But if you are going to work with videos or podcasts, it is useful to know how to interpret the parameters when recording and exporting files, so you can improve the audio quality of your digital product, whether in Audacity (free), Reaper, Adobe Audition or a video editor.

Here we will discuss digital audio formats, the differences between sample rates, resolution (bit depth), file compression rates and format variations. That way, you will be more confident about the options you have regarding the audio quality of your digital product, and you can guarantee good results.

In short, you will understand why we recommend recording in an uncompressed format (WAV, for example) at 24 bits and 48 kHz. You will also learn why, in most cases, we do not need more than a 192 kbps MP3 to export with excellent audio quality.

We’ll also talk about the possibility of compressing podcast files even further, generating 64 kbps mono MP3s for easy online consumption.

Formats, extensions and codecs: what do they mean?

When it comes to audio files, we can talk about formats, extensions and codecs. In summary, the format refers to the type of file, identified by its extension (*.mp3, *.wav, *.ogg, *.wma, etc.), which often tells us how the file was encoded, that is, which codec it uses.

For example, a file in the MP3 format has the extension *.mp3 and uses the MPEG-1 Audio Layer III codec.

These terms are often mixed up, but they are not the same thing.

Uncompressed files

Audio recording equipment usually offers us options to record files without losing any information. These lossless files can be generated in various formats and extensions, such as WAV and AIFF (uncompressed) or FLAC and ALAC (compressed, but without discarding information). For those who are familiar with photography, they are equivalent to RAW or DNG.

As they are usually very heavy, using lossless formats in the final product is only recommended in some cases, such as:

  • when the final product can be processed by the consumer (files destined for sound banks, for example);
  • when there will be recording on physical media (CD, DVD and Blu-ray);
  • or for the audiophile market (for a matter of perceived value and guarantee of high quality).

But, even if you don’t want to end the process with a WAV (one of the most common), lossless formats can be very useful in the editing stage. Because they contain a lot of information, they withstand more extreme alterations without damaging the audio quality.

Plugins, conversions and renderings can then be applied more freely while preserving excellent quality, even if a compressed file is generated afterwards.

Compressed files

Most of the equipment available on the market (cameras, cell phones and even audio recorders) usually delivers files that are already compressed. This type of file is more practical, easier to process and much smaller (in bytes), so it requires less storage space.

Some examples of these formats are 3GP, AAC, M4A, OGG, WMA and MP3, which is, without a doubt, the best known. These files are the audio counterpart of JPEG or GIF in the field of images.

Through a complex algorithm, these files are generated seeking to keep only the information that is relevant to our ears. Depending on the compression mode, we can generate an MP3 from a WAV and get a file 10 times smaller, without noticeable changes in audio quality.
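To see where a figure like "10 times smaller" can come from, here is a minimal sketch that compares the bitrate of uncompressed PCM audio with that of an MP3. The function names are illustrative, and the calculation assumes CD-style audio (44.1 kHz, 16 bits, stereo) and ignores file headers and metadata.

```python
def pcm_bitrate_kbps(sample_rate_hz, bit_depth, channels):
    """Bitrate of uncompressed PCM audio, in kilobits per second."""
    return sample_rate_hz * bit_depth * channels / 1000


def size_ratio(pcm_kbps, compressed_kbps):
    """How many times smaller the compressed stream is than the PCM one."""
    return pcm_kbps / compressed_kbps


# Assumed source: CD-style WAV (44.1 kHz, 16 bits, 2 channels)
wav_kbps = pcm_bitrate_kbps(44_100, 16, 2)

print(wav_kbps)                             # 1411.2
print(round(size_ratio(wav_kbps, 128), 1))  # 11.0
print(round(size_ratio(wav_kbps, 320), 1))  # 4.4
```

Under these assumptions, a 128 kbps MP3 is roughly 11 times smaller than the WAV it came from, while a 320 kbps MP3 is only about 4 times smaller.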

Files with the same type of extension do not always have the same codec, and vice versa.

This is worth knowing so that you do not feel lost when software that normally plays your *.m4a files refuses to play another file with the same extension, for example.

Such a situation could indicate that the codecs used are different. In that case, the solution would be to use other software to read the file or to convert it (re-encode it). This can be done even in video editors.

The variations of formats and codecs depend on the choices of the companies that develop the software that plays the files. Many interests are at play in these choices, such as technical specifications and patent agreements.

On the other hand, files are usually divided into two types: uncompressed or compressed.

Speaking of MP3, despite its great popularity, it is currently considered an obsolete format, since others, such as AAC (extension .aac or .m4a), make it possible to obtain smaller files with higher quality.

Even so, MP3 is still widely used, as a large part of the software and equipment was developed for this format. So, to talk about compression rates, we will use it as an example.

Compression rate: What is its relationship to audio quality of digital product?

Now that you understand that a file can be compressed and maintain a sufficient quality for our ears, you should know that the level of compression can vary greatly.

It is through the compression rate (or bitrate) that we control the size of the file and, therefore, the audio quality.

For example, a 320 kbps (kilobits per second) MP3 can sound just as good as uncompressed audio on a CD or DVD. As the bitrate decreases, the file size is reduced, but the quality losses can become noticeable, depending on the audio.

To get a sense of how this rate affects sound quality, take a look at the following references: 

  • 320 kbps – audio that does not differ from the quality of a CD;
  • 192 kbps – no significant loss for most people;
  • 128 kbps – slightly noticeable losses;
  • 96 kbps – quality similar to FM radio;
  • 32 kbps – similar to AM radio;
  • 16 kbps – similar to short wave radio (“walkie-talkie”).

We remind you that the values and descriptions above are only an approximation, since compression behaves differently with each type of audio. The more perceptible information there is (that is, the more complex the audio in question), the more room there is for compression to affect quality.

That is why, for a podcast without a soundtrack, it may not be a problem to generate a file of just 64 kbps, mono, with a single audio signal playing simultaneously on the left (L) and right (R) channels.

However, a well-produced studio song, played on several different instruments, can suffer perceptible loss even if the compressed file is 128 kbps, stereo, with a different signal for each channel, left and right.
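To translate the bitrates above into file sizes, the sketch below estimates how large a one-hour episode would be at different rates. It assumes a constant bitrate and ignores metadata and container overhead; the function name is only illustrative.

```python
def compressed_size_mb(bitrate_kbps, duration_s):
    """Approximate size, in megabytes (10**6 bytes), of a constant-bitrate file."""
    bits = bitrate_kbps * 1000 * duration_s
    return bits / 8 / 1_000_000


one_hour = 60 * 60  # seconds

for kbps in (320, 192, 128, 64):
    print(f"{kbps} kbps -> {compressed_size_mb(kbps, one_hour):.1f} MB")
```

With these assumptions, one hour at 64 kbps takes about 29 MB, against 144 MB at 320 kbps, which is why low bitrates are attractive for spoken-word content consumed online.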

Here we are talking about fixed compression rates (CBR, constant bitrate), but there is also the possibility of generating files with variable rates, such as VBR (variable bitrate) or ABR (average bitrate).

In VBR, the algorithm analyzes the audio and decides where it can compress more aggressively and where it should remove less information. ABR acts in a similar way, but keeps the average at the previously stipulated rate. These two methods, despite being more intelligent, can cause incompatibility with some sound players.

When we talk about compression vs audio quality, remember that there are no rules: each case is different and it is necessary to evaluate them individually to know to what extent the losses are acceptable, or when it is worth giving up on quality in favor of ease of use (faster download or lower storage impact, for example).

Remember that some websites and services recode the audio after uploading it. As we cannot control this process, it may be a good idea to send files with a little higher quality than necessary, to have a margin of safety in case of new conversions.

Winamp player showing the compression rate of an MP3

Amplitude resolution: 16 bit or 24 bit?

If you are going to use an audio interface, sound card or recorder, you will be faced with options for bit depth values. This is related to the PCM digital audio standard and does not apply to compressed files.

These values relate to the signal-to-noise ratio. In other words, they have to do with the dynamic range, or the range of volume levels that the file manages to record with quality.

Think of it as the amplitude resolution of the sound. In theory, 16-bit audio can represent 65,536 volume levels between the lowest and highest value on the scale, whereas 24-bit audio offers about 16.7 million gradations.
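The number of gradations follows directly from the bit depth: each extra bit doubles the number of values a sample can take. A minimal sketch (the function name is illustrative):

```python
def quantization_levels(bit_depth):
    """Number of distinct amplitude values a sample of this bit depth can hold."""
    return 2 ** bit_depth


for bits in (16, 24):
    print(f"{bits}-bit: {quantization_levels(bits):,} levels")
# 16-bit: 65,536 levels
# 24-bit: 16,777,216 levels
```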

Despite the large numerical difference, in practice, it is not a perceptible variation to our ears. But, there is a technical difference that can, in some cases, give the 24-bit file an advantage when it comes to capturing and editing.

We know that we must be careful with the input level when recording, so that the audio does not “burst” (generating clipping). This is what happens when we let the level meter rise too far and reach 0 dB (the maximum value before digital saturation or distortion occurs). Therefore, a certain margin of safety, called “headroom”, must be respected.
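As a rough illustration of headroom, the sketch below measures the peak level of a recording relative to digital full scale. It assumes the samples are floating-point values normalised to the range -1.0 to 1.0, where 1.0 corresponds to 0 dB on the meter; the function names and the sample values are made up for the example.

```python
import math


def peak_dbfs(samples):
    """Peak level in dB relative to full scale, for samples in -1.0..1.0."""
    peak = max(abs(s) for s in samples)
    if peak == 0:
        return float("-inf")  # digital silence has no measurable peak
    return 20 * math.log10(peak)


def headroom_db(samples):
    """Distance in dB between the loudest sample and full scale."""
    return -peak_dbfs(samples)


def is_clipping(samples):
    """True if any sample reaches or exceeds full scale."""
    return any(abs(s) >= 1.0 for s in samples)


take = [0.02, -0.31, 0.5, -0.12]  # hypothetical recorded samples
print(round(peak_dbfs(take), 2))    # -6.02
print(round(headroom_db(take), 2))  # 6.02
print(is_clipping(take))            # False
```

In this example the loudest sample sits at half of full scale, which leaves about 6 dB of headroom before clipping.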

In 16 bits, in addition to avoiding clipping, it is also recommended to make sure that the input level is not kept too low.

The reason is that, since there is not enough resolution to accurately record extremely weak signals, those sounds can come out digitally distorted by quantization errors, or noisy because of dithering, a process that adds low-level noise to mask those errors.

In this way, as the 16-bit file registers fewer volume gradations (about 48 dB less dynamic range than the 24-bit one), you theoretically run the risk of finding a higher dose of hiss when you increase the volume in the software. In 24-bit, in practical terms, there is no such risk.
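The 48 dB figure can be checked with a short calculation. This sketch uses the common approximation that the theoretical dynamic range of PCM audio is the ratio, in decibels, between the largest value and the smallest step (about 6 dB per bit); real equipment will deliver less because of analog noise.

```python
import math


def dynamic_range_db(bit_depth):
    """Theoretical dynamic range of PCM audio, in dB (about 6.02 dB per bit)."""
    return 20 * math.log10(2 ** bit_depth)


print(round(dynamic_range_db(16), 1))                         # 96.3
print(round(dynamic_range_db(24), 1))                         # 144.5
print(round(dynamic_range_db(24) - dynamic_range_db(16), 1))  # 48.2
```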

Despite what has been said, you will surely find a certain amount of noise (noise floor) coming from various sources, such as cables, the electrical mains, preamplifiers, microphones, low-quality components, noises from the environment itself (“room noise”) and even the natural functioning of the equipment used (some manufacturers even specify the value in the manual).

In practice, bit depth will probably not influence your recording in a relevant way. So if your computer only supports 16-bit, rest assured. At the end of the day, it is the same bit depth as an audio CD, which can deliver crystal clear sound in most uses.

However, since a 24-bit file is not much heavier than a 16-bit file (about 50% larger for the same duration), it is worth recording in the highest resolution whenever possible. In addition to ensuring a higher margin of safety when processing the file digitally, 24-bit is the standard for DVD and Blu-ray. Thus, unnecessary conversions are avoided in case the final audio is destined for one of these physical media.

Currently, there is equipment that works in 32 bits but, as we saw, you will hardly benefit from something like this, since it is an option for specific cases.

One example is when you create sound directly inside the computer, without going through all the analog paraphernalia that ends up adding various noises in the process.

Illustrative representation of amplitude resolution

Sampling rate: what does that value tell about audio quality of digital product?

Some values that you will find refer to the sampling rate (or sample rate). Those numbers indicate how many times per second the analog sound is “recorded” so that it can be reconstructed digitally (44.1 kHz equals 44,100 samples per second). They are similar to the number of frames per second needed in video for our eyes to perceive the illusion of movement.

These values also determine the maximum frequency (highest-pitched sound) that can be represented in the file.

To clarify this, it is worth remembering that the lower a sound (low pitch), the lower its frequency (measured in hertz). The higher the pitch, the higher the numerical value in Hz.

In general, the lowest sound that we can hear is around 20 Hz (or 20 wave oscillations per second) and the highest is around 20 kHz (or 20,000 oscillations per second).

For technical reasons (Nyquist’s theorem), the digital medium must sample at twice the highest frequency to be reproduced, at least. Thus, the sampling rate of a CD (the industry standard for a long time) was defined as 44.1 kHz.

This means that with this value you have enough data (per second) to represent frequencies up to approximately 22 kHz. In theory, it is more than is necessary to reproduce any sound we can hear, considering that many people cannot perceive such high frequencies. Mainly with advancing age, a large part of adults do not hear frequencies above 17 kHz or even 16 kHz.
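The relationship between sampling rate and the highest representable frequency is simply a division by two. A minimal sketch (the function name is illustrative):

```python
def nyquist_frequency_hz(sample_rate_hz):
    """Highest frequency that can be represented at a given sampling rate."""
    return sample_rate_hz / 2


for rate in (44_100, 48_000, 96_000):
    print(f"{rate} Hz sampling -> up to {nyquist_frequency_hz(rate):.0f} Hz")
# 44100 Hz sampling -> up to 22050 Hz
# 48000 Hz sampling -> up to 24000 Hz
# 96000 Hz sampling -> up to 48000 Hz
```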

With the arrival of DVD in the mid-1990s, the chosen standard became 48 kHz. Again, the number was defined by a technical question: basically, to obtain round values in relation to the number of frames per second (fps) in video.
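One way to read the "round values" argument is to count how many audio samples fit in a single video frame. The sketch below does this for three common frame rates (24, 25 and 30 fps, chosen here only as an assumption for the example) and uses exact fractions to avoid rounding errors.

```python
from fractions import Fraction


def samples_per_frame(sample_rate_hz, fps):
    """Exact number of audio samples that span one video frame."""
    return Fraction(sample_rate_hz) / Fraction(fps)


for rate in (44_100, 48_000):
    for fps in (24, 25, 30):
        spf = samples_per_frame(rate, fps)
        kind = "whole" if spf.denominator == 1 else "fractional"
        print(f"{rate} Hz at {fps} fps: {float(spf)} samples per frame ({kind})")
```

Under these assumed frame rates, 48 kHz gives a whole number of samples per frame in all three cases, while 44.1 kHz gives a fractional value at 24 fps.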

Based on what we saw earlier, it is clear that this slight increase does not alter our perception of the reproduced sound.

Despite this, some equipment allows recording at up to 96 kHz or more. The only reason to work with such high sample rate values is to have data to manipulate files digitally (something similar to what we saw about working with WAV compared to MP3). 

But, as this means more storage space and more demanding processing, we do not recommend it. For online video or podcast, the benefits will likely be negligible. Also, in some cases, very high sample rate values can lead to unwanted harmonic distortions.

Some of the possible compatibility errors affect the length of the audio and the pitch that is played. For example, a 44.1 kHz file may sound faster and higher-pitched in a project set to 48 kHz, whereas a 48 kHz file read as a 44.1 kHz file will sound slower and lower-pitched.
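The size of this effect can be estimated with a short calculation. The sketch below assumes the software plays the samples at the wrong rate without any conversion; the function names are illustrative.

```python
import math


def playback_speed(file_rate_hz, assumed_rate_hz):
    """Speed factor when a file is played at a rate other than its own."""
    return assumed_rate_hz / file_rate_hz


def pitch_shift_semitones(speed):
    """Pitch change, in semitones, caused by a given speed factor."""
    return 12 * math.log2(speed)


speed = playback_speed(44_100, 48_000)  # 44.1 kHz file read as 48 kHz
print(round(speed, 3))                         # 1.088
print(round(pitch_shift_semitones(speed), 2))  # 1.47
print(round(60 / speed, 3))                    # 55.125
```

In this scenario the audio plays almost 9% faster, about a semitone and a half higher, and a 60-second clip ends after roughly 55 seconds. Reversing the two rates gives the opposite result: slower and lower.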

Audios in 44.1 kHz and 48 kHz in the same timeline

Fortunately, most of the current software manages to identify the differences in sampling rates and automatically interprets the file correctly, performing an instantaneous conversion (usually followed by a warning) when the value does not correspond to what is defined in the software.

In some cases, for those who work exclusively with audio (mainly music), it may be a good idea to stay at 44.1 kHz, because, despite the fact that CD is not used as much as before, it is still the main physical medium of musical consumption.

In practice, you will rarely have a problem converting from one standard to another. As we said, current platforms and software read and interpret both sampling rates very well.

Our recommendations are only a safeguard against possible, and rare, problems that can generate small errors (digital artifacts) caused by failed conversions.

Applying the knowledge in your day to day: what should you consider for audio quality of digital product?

Talking about audio settings, preferences and recommendations demands some caveats. The mode of consumption varies a lot, as does people’s hearing ability, so what is excellent quality for some may not be for others.

In addition, there are countless elements in an audio chain that can alter the sound in a more significant way than the topics mentioned here.

For an audio enthusiast, someone with attentive ears, who uses excellent hi-fi equipment, the differences in parameters (such as compression rates) can be more noticeable, depending on the sounds in question. 

There is also a theory that some very low sounds, such as infrasound between 4 and 16 Hz, can be perceived through touch despite not being audible to us.

Some studies (which are controversial) also suggest that ultrasonic frequencies (higher than 20 kHz) can, in some cases, be perceived by our body, not necessarily by the auditory system.

Lastly, our hearing is not as developed as our vision. Therefore, it is more difficult to make evaluations and it is common for the “placebo effect” to arise when we analyze audio quality of digital product.

For the same reason, the electronics market can sometimes take advantage of the technical evolution of equipment (higher values of bit depth, sample rate, frequency response) to sell products that, in practice, may not present any difference for the user.