Hopefully you already know about Paul Frindle. He worked at SSL, on the legendary G-Series console amongst other things (you know, the one with THAT bus compressor!) Later he worked on the equally legendary Sony Oxford digital console, and on several tools you may be aware of by now in plugin form - like the (yes!) legendary Oxford Inflator, for example, and more recently the unique DSM v3.
So as you may have gathered by now, Paul absolutely knows what he's talking about - and he's a really nice chap to boot. He can often be found on audio forums generously sharing his wisdom and experience with anyone who asks, and he has helped me personally expand my understanding (and correct some errors!) on multiple occasions, as he has for many others.
As a perfect example, he was recently part of a discussion on the Gearspace mastering forum after I shared my videos about inter-sample peaks and True Peak limiting there, and that conversation is what has prompted this post.
Because the thread broadened to a much wider topic - namely the incorrect use of technical terms when we’re talking about audio, and more importantly the confusion and problems this can cause.
(It happens a lot and is something I've been guilty of myself - most recently in the first two of those ISP videos, and also back in the past with the term 'Dynamic Range' when applied to measuring audio - which is why I now take care to talk simply about 'dynamics' instead, to try and avoid any extra confusion.)
As part of this discussion, Paul shared some excellent, concise examples of common technical misunderstandings, and I’ve asked his permission to re-share them with you, here.
So without further ado, here they are, with a few interjections from me. And since we’ve already mentioned the topic, this seems like a good place to start:
For a system, 'dynamic range' means the total difference between the largest legal signal and the noise level of the system - when the data is decoded into signal.
[For dithered 16-bit audio, this is 93 dB, or 141 dB for 24-bit audio - Ian]
For music, this means the total difference between the highest and the lowest signal parts of the program during its duration - when the data is decoded into signal.
[For classical music and feature films this could be 70 dB or more, whereas for some pop & rock it might be as little as 4 dB (!) - Ian]
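[If you want to check the 16-bit figure yourself, here's a minimal Python sketch (assuming numpy, and TPDF dither at +/- 1 LSB, a common choice): quantise a full-scale sine to 16 bits and measure the gap down to the resulting noise floor - Ian]

```python
import numpy as np

rng = np.random.default_rng(0)
fs, n = 48000, 1 << 18
t = np.arange(n) / fs
signal = np.sin(2 * np.pi * 997 * t)         # full-scale (0 dBFS) test tone

q = 2.0 / (2 ** 16)                          # 16-bit quantisation step
tpdf = (rng.random(n) - rng.random(n)) * q   # TPDF dither, +/- 1 LSB peak
quantised = np.round((signal + tpdf) / q) * q

noise = quantised - signal                   # dither plus quantisation error
snr = 20 * np.log10(np.sqrt(np.mean(signal ** 2)) / np.sqrt(np.mean(noise ** 2)))
print(f"largest legal signal to noise floor: {snr:.1f} dB")  # ~93 dB
```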
For a system, the difference between the highest possible legal signal and the data disconnected entirely - i.e. no data at all! (Because 'no signal at all' isn’t a valid signal)
For music, the difference between brute peak levels and some kind of averaging algorithm (RMS) with perceptual weighting included - i.e. a 'loudness meter'.
[Measuring the peak-to-loudness ratio can give us helpful feedback about how hard the audio has been clipped or limited - this is how my Dynameter plugin works - but as Paul says it’s not a measure of dynamic range - Ian]
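[Here's a rough sketch of that measurement, for the curious - plain RMS stands in for the K-weighted, gated average a real BS.1770 loudness meter would use, so treat the numbers as illustrative only - Ian]

```python
import numpy as np

def peak_to_average_db(x):
    """Sample peak relative to RMS level, in dB (the 'crest factor')."""
    return 20 * np.log10(np.max(np.abs(x)) / np.sqrt(np.mean(x ** 2)))

fs = 44100
t = np.arange(fs) / fs
sine = np.sin(2 * np.pi * 440 * t)         # a pure sine reads ~3 dB
clipped = np.clip(1.5 * sine, -1.0, 1.0)   # hard clipping shrinks the ratio
print(f"sine: {peak_to_average_db(sine):.1f} dB, "
      f"clipped: {peak_to_average_db(clipped):.1f} dB")
```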
For a system, it means absolutely nothing - unless it's broken! In a correctly dithered data system ALL parts of every waveform in the signal are represented and included in the data, even beneath the noise floor as far down as you are prepared to measure it. If we are taking it to mean 'bit depth' it only refers to the potential noise floor of the system - as described above.
Take DSD for example - this system uses just 1 bit (pulse-density modulation rather than conventional PCM) but is nonetheless capable of encoding the full audio signal.
For music, it means absolutely nothing - as per the above. In a multi-bit dithered data path, music program is not subject to timing limits, level limits or distortion. It's continuous, like analogue. It is not a measure of signal 'quality' when properly decoded - despite what the marketing pundits want you to believe.
[You can see my video demonstrating this here: The truth about bit-depth (and digital audio 'resolution') - Ian]
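[And here's a small numpy sketch of the same idea: a 1 kHz tone at -110 dBFS - well below the 16-bit noise floor - survives dithered quantisation, and a long FFT pulls it straight back out - Ian]

```python
import numpy as np

rng = np.random.default_rng(1)
fs, n = 48000, 1 << 20
t = np.arange(n) / fs
q = 2.0 / (2 ** 16)                          # 16-bit quantisation step

tone = 10 ** (-110 / 20) * np.sin(2 * np.pi * 1000 * t)  # -110 dBFS tone
tpdf = (rng.random(n) - rng.random(n)) * q   # TPDF dither, +/- 1 LSB peak
quantised = np.round((tone + tpdf) / q) * q  # noise floor is around -93 dBFS

spectrum = np.abs(np.fft.rfft(quantised * np.hanning(n)))
freqs = np.fft.rfftfreq(n, 1 / fs)
print(f"strongest component: {freqs[np.argmax(spectrum)]:.0f} Hz")  # 1000 Hz
```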
(Also known as Inter-sample Overs, Reconstruction Overs, etc.)
For a system, it means the generation of digital data that may not be decoded correctly, even though the data itself does not breach maximum sample peak values.
For music, it means simply that the quality of the decoded program varies depending on what happens to the data after the event and how it's finally decoded.
The reason that it exists at all is that in a PCM data stream, theoretically not all parts of the potential program complexity can pass at full sample values and still be decoded correctly. This is because the filtering necessary to do the final decoding relies on frequency content and data history.
You need to leave some headroom to make sure it's going to be perfect. This is what was missing from the original PCM spec.
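[You can demonstrate this headroom problem in a few lines (a sketch assuming numpy and scipy; True Peak meters use oversampling in a similar way): a sine at a quarter of the sample rate, phase-shifted so no sample lands on the crest, reads 0 dBFS sample peak but reconstructs about 3 dB higher - Ian]

```python
import numpy as np
from scipy.signal import resample_poly

fs, n = 44100, 4096
t = np.arange(n) / fs
x = np.sin(2 * np.pi * (fs / 4) * t + np.pi / 4)  # crests fall between samples
x /= np.max(np.abs(x))                            # samples just touch 0 dBFS

true_peak = np.max(np.abs(resample_poly(x, 8, 1)))  # 8x oversampled estimate
print(f"sample peak: 0.00 dBFS, true peak: {20 * np.log10(true_peak):+.2f} dBFS")
```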
For a system - because such specs don't formally exist - it doesn't mean the system is technically faulty in any way, as far as the PCM standard is concerned. Makers of digital systems are not obliged to pass reconstruction overs / errors without damage - although the best manufacturers may try to accommodate them as far as possible.
For music, it doesn't mean that finding 2 consecutive samples at near full level necessarily indicates a reconstruction over. It's more complex than that, depending on what else in the signal is being represented in that part of the data - in relation to what will eventually be reconstructed during decoding. It varies with frequency content and history.
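[To see the flip side, compare the quarter-sample-rate example above with a full-scale 100 Hz sine: a long run of consecutive samples sits at near full level around each crest, yet oversampling shows essentially no reconstruction over - Ian]

```python
import numpy as np
from scipy.signal import resample_poly

fs = 44100
t = np.arange(8192) / fs
low = np.sin(2 * np.pi * 100 * t)
low /= np.max(np.abs(low))                   # sample peak at exactly 0 dBFS

true_peak = np.max(np.abs(resample_poly(low, 8, 1)))
print(f"true peak: {20 * np.log10(true_peak):+.4f} dBFS")  # ~0 dB - no over
```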
Bear in mind that the PCM specification requires that no frequency content above half the sample rate is encoded to PCM data (i.e. via an Analogue to Digital Converter), and that no frequency content above half the sample rate is decoded by the final DAC. This means a properly designed ADC feeding a properly designed DAC should produce no errors.
So if you process the signal internally in your DAW to produce frequencies that would be above half the sampling rate and send them out to your files - you are effectively producing an illegal data stream. So for instance with your whizz-bang new distortion plugin at 44.1 kHz sampling, you can only create 2nd harmonic distortion on program up to around 10 kHz, and 3rd harmonic up to around 6.6 kHz or so, before you put the system under stress. It may not get you brute inter-sample over readings - but it's iffy as to how it will turn out when decoded.
However, if this distorted content comes into the system via an ADC, the stuff you can hear will pass - but the stuff you can't hear will be rejected - so the sound of the signal should be preserved without aliasing mush.
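[The arithmetic behind Paul's figures is easy to check - the hard ceiling is half the sample rate divided by the harmonic number, and his rounder numbers leave sensible margin for the reconstruction filter - Ian]

```python
fs = 44100                                    # sample rate in Hz
for k in (2, 3, 4, 5):                        # harmonic number
    print(f"harmonic {k}: fundamental must stay below {fs / 2 / k:.0f} Hz")
# harmonic 2: 11025 Hz, harmonic 3: 7350 Hz - Paul's ~10 kHz and ~6.6 kHz
# figures sit safely below these hard Nyquist ceilings.
```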
Obviously, boosting and fast-clipping signals during the final digital process to get more loudness is a sure-fire way of creating all this aliasing and the 'mush' it produces when decoded 😕 You end up with a very fragile product.
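[Here's a deliberately crude numpy sketch of that 'mush' (hard clipping only, no oversampling): the odd harmonics of a clipped 9 kHz tone at 44.1 kHz land at 27 kHz and 45 kHz, which fold back down to 17.1 kHz and 0.9 kHz - frequencies musically unrelated to the original note - Ian]

```python
import numpy as np

fs, n = 44100, 1 << 16
t = np.arange(n) / fs
x = np.clip(2.0 * np.sin(2 * np.pi * 9000 * t), -1.0, 1.0)  # boost, then clip

spectrum = 20 * np.log10(np.abs(np.fft.rfft(x * np.hanning(n))) + 1e-12)
freqs = np.fft.rfftfreq(n, 1 / fs)

def level_at(f_hz):                    # level of the nearest FFT bin (rel. dB)
    return spectrum[np.argmin(np.abs(freqs - f_hz))]

for f in (9000, 17100, 900):           # fundamental, and two aliased harmonics
    print(f"{f:>5} Hz: {level_at(f):6.1f} dB")
```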
I hope this is helpful - Paul
So there you go! I love the clarity and economy of Paul's statements, which is why I wanted to share them with you.
Thanks again to Paul for taking the time to share this information with us, and for giving permission to re-use it here!