Revision as of 13:56, 16 August 2006
Audio mastering is the process of preparing and transferring recorded audio to a medium for future duplication. The specific medium that receives the mastered audio varies, depending on the intended release format of the final product. This medium is then used as the master copy, on which all further production of the audio material is based.
History/Overview
Traditionally, audio mastering was the process of transferring audio recordings from magnetic tape to a phonograph lathe for the production of vinyl records. The transfer was performed in real time to disc, so any mistakes appeared on the master lacquer disc. With the advent of the Compact Disc, the lathe was replaced with a digital encoder and recording device. In the 1990s, the digital audio workstation (DAW) became common in many mastering facilities, allowing the off-line manipulation of recorded audio via a graphical user interface (GUI).
In recent years, the rapid worldwide migration of much music recording from large, expensive studio facilities with highly trained and experienced staff to smaller DAW-based "artist" studios has started to have a profound effect on the traditional mastering process. While many of these smaller studios are well suited for creating and recording music, they are usually not acoustically designed or staffed for precision audio mixing. As a result, mastering studios have to cope with more and more mixes that are musically sound but sonically unbalanced. When a mix submitted for mastering is unbalanced (e.g. a vocal is too low in level or a bass is too boomy), the mastering engineer can quickly get caught up in a series of audio compromises, since everything has already been mixed together into a single "soup". If the vocal level is brought up by isolating it with one band of a multi-band processor, a guitar part in the same frequency range will be brought up as well. Similarly, if the boomy bass range is equalized or compressed via the multi-band processor, the bass drum will also be affected. The more balance problems the submitted mix has, the more compromises arise.
A process solution developed and rapidly adopted by the mastering industry starting in 2005 replaces the practice of dividing the 2-track mix into (typically) four or more frequency bands via a multi-band processor. Instead, the artist or engineer submits the mix as four or more synced tracks (files) called Separations (or sometimes, incorrectly, Stems). The process is analogous to high-end page layout software splitting artwork into four "color separation" files (CMYK) for submission to an offset printing house. The mastering engineer can now adjust major elements without compromise and then sum them together to create the final master. The Separation Mastering process also eliminates "alternate mixes" (since the Separations are continuously variable) and costly "recall" or remix sessions (since mixes don't have to be rejected by the mastering house, and virtually any reasonably close mix can be salvaged and enhanced). The final masters are also reported to be consistently clearer and more detailed due to, among other causes, the re-rendering with a higher-end, low-jitter mastering studio digital clock.
Possible drawbacks to this method include giving the mastering engineer too much control over the respective levels (for example, vocal volume in relation to guitar volume). For this reason, a copy of the artist's original stereo mix is usually used as an A-B reference by the mastering engineer to ensure that the original mix intent and musical proportions are adhered to.
With traditional mastering, the mix proportions can and frequently do change due to the action of the compressors and/or limiters used in the mastering process, especially when trying to achieve a "loud" CD (something very common in today's hyper-competitive music market and commonly referred to as the "loudness war"). In the extreme case, musical parts or instruments with high dynamic ranges have their peaks brought down significantly while softer parts are amplified, which changes the apparent mix. As some parts are squashed, others can bloom up in volume around them, especially when multi-band dynamics processors are used improperly.
Professional mastering engineers draw on considerable experience and skill to balance the effects of increasing loudness and enhancing fullness while minimizing apparent changes to the mix. This is an area of potential sonic compromise that requires a trained ear and considerable technical knowledge to avoid. Done incorrectly, even a great album with stunning songs and performances can end up sounding "recorded at home" rather than "store bought."
With Separations, as the track "loudness" is increased during mastering, any resulting apparent mix or proportion changes can be dialed back into place without compromise, because the major musical elements are on separate tracks. This would seem to be the best of all worlds, but again, the extra flexibility can cause problems if improperly applied by an over-enthusiastic or inexperienced engineer.
In either case, this type of balancing is not "mixing while mastering". It means staying true to the artist's original mix intent while making the album translate well to a wide variety of playback systems (home, car, boombox, portable player, etc.), keeping the sound consistent from track to track, and ensuring the kind of overall professional sonic presentation demanded by both the artist and the consumer.
Process
The process of audio mastering varies depending on the specific needs of the audio to be processed. Steps of the process typically include:
- Load the recorded audio tracks into the DAW.
- Correct any problems with the audio, such as volume level, tonal balance, or undesirable artifacts.
- Sequence the separate songs or tracks as they will appear on the final product (for example, a CD).
- Transfer the audio to the final master format (e.g., Red Book CD audio, CD-R, etc.).
Examples of possible actions taken during mastering:
- Apply noise reduction to eliminate hum and hiss.
- Limit the tracks to set the highest peaks in audio volume to a preset level; the overall audio should never exceed 0 dBFS.
- Equalize audio between tracks to ensure there are no jumps in bass, treble, midrange, volume or pan.
- Apply a compressor (for example, 1.5:1 starting at -10 dB) to reduce the peaks; make-up gain can then raise the overall level, which brings up the softer parts.
- In the case of mastering for broadcast, the bandwidth of the signal has to be reduced. For example, for TV broadcast: apply a high-pass filter at 80 Hz with a slope of 18 dB/octave to filter out low frequencies, and a low-pass filter at 12 kHz with a slope of 9 dB/octave to filter out high frequencies.
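The compressor setting mentioned in the list above (1.5:1 starting at -10 dB) can be illustrated with a static gain curve. This is only a minimal sketch: the function name `compress_db` and the `makeup_db` parameter are hypothetical, levels are assumed to be in dB relative to full scale, and a real compressor also has attack and release behaviour that is ignored here.

```python
def compress_db(level_db, threshold_db=-10.0, ratio=1.5, makeup_db=0.0):
    """Static compressor curve: output level (dB) for a given input level (dB).

    Hypothetical helper for illustration; attack/release timing is not modelled.
    """
    if level_db > threshold_db:
        # Above the threshold, each `ratio` dB of input yields 1 dB of output.
        level_db = threshold_db + (level_db - threshold_db) / ratio
    # Make-up gain is applied to everything, loud or soft.
    return level_db + makeup_db


# A peak at -1 dB is 9 dB over the threshold; 9 / 1.5 = 6 dB over after compression.
print(compress_db(-1.0))                   # -4.0
# A soft passage below the threshold is left alone by the compressor itself.
print(compress_db(-20.0))                  # -20.0
# With 3 dB of make-up gain the peak is back at -1 dB and the soft part is 3 dB higher.
print(compress_db(-1.0, makeup_db=3.0))    # -1.0
print(compress_db(-20.0, makeup_db=3.0))   # -17.0
```

The sketch shows that the compressor itself only reduces levels above the threshold; softer material ends up louder only because of the make-up gain applied afterwards.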
Please note that the above are not specific instructions but some processes that may or may not be applied. Audio mastering needs to examine the characteristics of the input media, the expectations of the source producer or recipient, and the limitations of the end medium, and process the material accordingly. General rules of thumb can rarely be applied.
RMS in music, average loudness, punch
The Root Mean Square (RMS) in audio production terminology is a measure of average level and is found widely in software tools. Values are given in dB relative to full scale, with 0 as the maximum, so a larger (less negative) RMS number means a higher average level: for example, -9 RMS is 2 dB louder than -11 RMS. The loudest records are -7 to -9 RMS, the softest -12 to -16 RMS. The RMS level is no absolute guarantee of loudness, though; the perceived loudness of signals of similar RMS level can vary widely, since the perception of loudness depends on several factors, such as the spectrum of the sound (see Fletcher-Munson curves) and the density of the music (e.g. slow ballad or fast rock).
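As a rough illustration of how such an RMS figure can be computed, the sketch below measures the RMS level of a block of samples in dB relative to full scale. It assumes samples normalised to the range -1.0 to 1.0 and uses a full-scale square wave as the 0 dB reference, so a full-scale sine reads about -3 dB; some meters use a different reference, so their readings may be offset. The names `rms_dbfs` and `sine` are hypothetical.

```python
import math


def rms_dbfs(samples):
    """RMS level of samples in [-1.0, 1.0], in dB relative to full scale."""
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0:
        return float("-inf")  # digital silence
    # 10*log10 of the mean square equals 20*log10 of the RMS value.
    return 10.0 * math.log10(mean_square)


def sine(amplitude, cycles=10, n=1000):
    """Test tone: a whole number of sine cycles at the given peak amplitude."""
    return [amplitude * math.sin(2 * math.pi * cycles * i / n) for i in range(n)]


print(round(rms_dbfs(sine(1.0)), 2))  # about -3.01 for a full-scale sine
print(round(rms_dbfs(sine(0.5)), 2))  # about -9.03, i.e. 6 dB lower at half amplitude
```

Because the measure is an average over the whole block, two signals with the same peak level can have very different RMS readings.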
Since most major pop releases have long been in the -9 to -11 RMS range, the loudness war (the trend of new records sounding louder) has practically ended, simply because the physical and artistic limit of loudness has been reached. In pop music, the louder the average level gets, the less punch there is, since the punch depends largely on how loud the kick drum is in relation to the average level of the music. The distance of the RMS level below the 0 dBFS peaks can be thought of as the volume difference between the kick drum and the rest of the music: the larger it is, the louder the kick drum is in relation to everything else.
Compressed higher RMS vs clipped higher RMS, density
Some experienced listeners feel that around -12 RMS in general or during loud parts, and -14 to -16 RMS during soft parts, is a "sweet spot" for optimal punch and loudness, neither too loud nor too soft. This perception is still valid considering that the extra loudness (usually 1-3 dB) has often been achieved by simply clipping the smoothly curved tops of the waveforms, resulting in flat-topped waveforms resembling square waves, which may or may not result in a subjective improvement of the sound. Prior to clipping, usually the last procedure in audio production, the "natural" RMS of many songs is in fact just around -12 RMS. Thus, in many cases where the final RMS is -8 to -11, the RMS has not "really" been increased over the -12 RMS "sweet spot"; only the tops of the waveforms have been clipped by 1-3 dB. The music is not any thicker or denser, but merely played louder with less punch and more distortion.
In contrast, a "true" higher RMS is achieved by increasing the density (usually by compression) of the sounds contributing most to the average level, i.e. everything but the drums. Their volume as a group can then be lower in relation to the (usually drum) peaks while retaining the same RMS and perceived average loudness as the clipped version, but without clipping, and often with a stronger sense of density and pressure. In practice, however, this version would probably be subjected to some clipping as well, resulting in even higher loudness and pressure than the version that was merely clipped.
So it's not how loud you make it, it's how you make it loud.
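The clipping case described above can be reproduced with a small experiment: a full-scale test tone is boosted by a few dB and its tops are cut off at full scale, which raises the RMS reading while the peak level stays the same. This is only a sketch with a sine wave standing in for music; the helper names `rms_dbfs` and `gain_and_clip` are hypothetical, and samples are assumed to be normalised to the range -1.0 to 1.0.

```python
import math


def rms_dbfs(samples):
    """RMS level of samples in [-1.0, 1.0], in dB relative to full scale."""
    mean_square = sum(s * s for s in samples) / len(samples)
    return 10.0 * math.log10(mean_square)


def gain_and_clip(samples, gain_db):
    """Boost by gain_db, then hard-clip anything beyond full scale (+/-1.0)."""
    gain = 10 ** (gain_db / 20)
    return [max(-1.0, min(1.0, s * gain)) for s in samples]


# Five cycles of a sine that already peaks at full scale.
tone = [math.sin(2 * math.pi * 5 * i / 1000) for i in range(1000)]
clipped = gain_and_clip(tone, 3.0)  # 3 dB of boost, flat-topped at full scale

print("peak before:", max(abs(s) for s in tone))
print("peak after: ", max(abs(s) for s in clipped))
print("RMS before: ", round(rms_dbfs(tone), 2))
print("RMS after:  ", round(rms_dbfs(clipped), 2))
```

The peak reading is unchanged, but the RMS reading goes up by less than the 3 dB of boost, because the part of the waveform that would have exceeded full scale is flattened rather than reproduced.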
Limits of maximum RMS in music, average loudness
If the kick drum is hitting 0 dBFS (the maximum of digital sound) on most of its cycles (typically 2 or more) and/or processed (e.g. clipped) to sound louder than 0 dBFS, −8 RMS can have good punch. With precise mixing and separation mastering techniques, perhaps −7 to −6 RMS is achievable without distortion, with an audible but not very punchy kick drum.
Bass punch, kick drum, bass drum, frequencies, waveform
The physical punch of the sound waves that the listener feels is caused by the short burst (around 20-175 ms) of low-frequency energy from the kick drum or some other low-frequency sound. The majority of the energy that can be felt is in the 0-200 Hz range; the peak (the loudest frequency) is often in the 40-80 Hz range, commonly about 65-75 Hz. The punch felt in the chest is around 120-170 Hz, often about 160 Hz.
The kick drum's visual shape, its waveform, consists of one or more, and up to 7-10, (nearly) full-amplitude (0 dBFS) cycles that are approximately sine or square waves. The more cycles hit 0 dBFS, i.e. the longer the sound stays at maximum level, the stronger the perceived punch, up to a limit.
Software tools for mastering
Digital Audio Workstations
- JAMin
- Sonic Solutions
- SADiE
- Bias Peak
- Steinberg Wavelab
- Digidesign Pro Tools
- Samplitude
- MOTU AudioDesk
- Adobe Audition