Or, the importance of critical listening
Do you trust your hearing?
Should you?
There were several factors that led to me putting my foot in my mouth last week – or rather, my keyboard – in a post where I incorrectly announced that Spotify’s “Volume Normalisation” feature makes everything sound terrible. (It doesn’t.)
This post describes my mistakes, and contains some words of caution for anyone trying to decide whether one thing sounds better than another. First I’ll tell you the factors that led to my wrong conclusions, then I’ll explain how they tripped me up.
(If you haven’t already read the post in question, this will make more sense if you do.)
Here are the factors that caught me out:
1. The ‘Smile Curve’ or Fletcher-Munson Effect
In a nutshell, loud things sound brighter and bassier than quiet ones. So, play someone two identical pieces of music but boost the level of one by half a dB, and that’s the one most people will think sounds better.
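To put that half-decibel in perspective, here’s a quick sketch (my own illustration, not anything from the listening tests) of how small such a boost is in linear terms:

```python
def db_to_gain(db):
    """Convert a level change in decibels to a linear amplitude factor."""
    return 10 ** (db / 20)

# A 0.5 dB boost is roughly a 6% increase in amplitude - barely a nudge,
# yet often enough to make listeners prefer the "louder" version outright.
print(db_to_gain(0.5))  # ~1.059
```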
This effect may be an evolutionary mechanism, to help us react better to threats that are close. It’s the reason some amps have a “loudness button” – to boost the bass and treble at low volumes. And it’s also a driving factor behind the loudness war – in the quest to sound better, the temptation to simply make things a little louder is too strong, and levels gradually creep up over time.
2. Level-matching is critical
It follows on naturally from the “smile curve” that whenever you’re comparing two pieces of audio, it’s essential to level-match them first. Bear in mind, though, that this isn’t always easy – should you match peak level? Average level? A perceptually weighted level that attempts to take the smile curve into account? Whatever method you choose, when judging fine details, tiny level differences can make a big difference.
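As an illustration of the two simplest options – peak and average matching – here’s a sketch in plain Python (the function names are mine, not from any particular tool):

```python
import math

def peak(samples):
    """Highest absolute sample value in a block."""
    return max(abs(s) for s in samples)

def rms(samples):
    """Root-mean-square (average) level of a block."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def match_rms(reference, target):
    """Scale `target` so its RMS level matches `reference`.

    A perceptually weighted match would filter both signals first
    (e.g. the K-weighting used by the BS.1770 loudness standard)
    before measuring - one way of accounting for the smile curve.
    """
    gain = rms(reference) / rms(target)
    return [s * gain for s in target]

# A half-level copy of a test tone, brought back up to the reference level:
ref = [math.sin(2 * math.pi * i / 100) for i in range(1000)]
quiet = [0.5 * s for s in ref]
matched = match_rms(ref, quiet)
```

Note that after an RMS match, two different recordings’ peak levels may still disagree – which is exactly why the choice of method matters.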
3. Listener Bias
You should always go into a listening test with an open mind. Any prior assumptions will colour your results. In fact, in a true scientific test, even the person running the test doesn’t know which version the person being tested is listening to – the so-called “double blind”.
4. Too much trust in labels
Just because you can enable or disable a feature in a program’s settings doesn’t mean that you have complete control. Don’t read too much into the way something is labelled.
In my testing of Spotify’s Volume Normalisation, I fell foul of all of these factors.
Where I went wrong
First – I was a biased listener. I was investigating claims made by lots of people that Spotify’s normalisation feature made things sound “flat” and “empty”. I also had my own preconception that the U2 and Imogen Heap albums didn’t sound as good on Spotify as they did from CD or my iPod.
Second – I didn’t level-match first, I just disabled the normalisation feature. Since the tracks I was listening to dropped in volume, the smile curve did its work and they didn’t sound as good to me as the louder un-normalised versions.
Next, though, I did the right thing. The Fletcher-Munson effect is an everyday fact of my work as a mastering engineer – it’s why level-balancing is a key aspect of mastering – so I level-matched and listened again. At this point, my confidence wavered, and I posted on Twitter about it.
Next I did a null test – and of course this revealed the limiter pumping I posted about originally. Unfortunately for me, because of my listener bias I then jumped to completely the wrong conclusion –
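For anyone unfamiliar with the technique: a null test subtracts two sample-aligned versions of the same audio, so anything the two have in common cancels out. A minimal sketch (the hard clipping is my own crude stand-in for limiting):

```python
import math

def null_test(a, b):
    """Subtract two sample-aligned signals; the residual is their difference.

    If the residual is silence, the two versions are identical. Any
    processing one received and the other didn't is left behind - which
    is how the limiter pumping revealed itself in my test.
    """
    return [x - y for x, y in zip(a, b)]

original = [0.1 * math.sin(2 * math.pi * i / 50) for i in range(500)]
identical = list(original)
# Crude stand-in for a limiter: hard-clip the loudest moments.
processed = [max(-0.08, min(0.08, s)) for s in original]

silence = null_test(original, identical)   # all zeros - a perfect null
residual = null_test(original, processed)  # non-zero where "limiting" acted
```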
Third – I didn’t level-match accurately enough – which is why there is some fizzy, “toppy” stuff left after my null test. But more crucially my listener bias got in the way again – I assumed that since I could hear a difference between the two, and I already thought one sounded worse, that it did. Adding to this misconception was the final factor:
Fourth – Spotify’s options clearly allow you to enable or disable “Volume Normalisation”. I wrongly assumed that Spotify disabled all processing when normalisation was off.
As I described in the original post, Spotify has a limiter which stops quieter, more dynamic tracks being distorted when their volume is boosted by the normalisation feature. What I didn’t realise is that this limiter is always on – even when “Volume Normalisation” is off.
At first sight this may sound bizarre – why would you need a limiter, if you aren’t changing the level of any songs? And even if it’s on, why would it be doing anything to audio whose level hasn’t been changed? The answer is the final factor I failed to take account of:
Inter-sample decoding peaks
There’s a reason Spotify’s limiter is enabled all the time – most CDs mastered today, when encoded to mp3, AAC or Ogg Vorbis and then decoded back to audio – as happens in Spotify – contain inter-sample peaks. There’s not enough space here to discuss them in detail, but the short version is: most CDs mastered in the last few years will be going “into the red” when they are decoded.
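To make the idea concrete, here’s a simplified true-peak sketch using NumPy. It oversamples via FFT zero-padding, which is my own shortcut – real meters use the polyphase “true peak” filters specified in ITU-R BS.1770:

```python
import numpy as np

def true_peak(samples, oversample=4):
    """Estimate the inter-sample (true) peak by FFT oversampling.

    The stored samples are only points on a smooth band-limited waveform;
    upsampling approximates that waveform between the points.
    """
    spectrum = np.fft.rfft(samples)
    n_up = len(samples) * oversample
    padded = np.zeros(n_up // 2 + 1, dtype=complex)
    padded[:len(spectrum)] = spectrum
    upsampled = np.fft.irfft(padded, n=n_up) * oversample
    return np.max(np.abs(upsampled))

# A tone at a quarter of the sample rate, phase-shifted so every stored
# sample misses the waveform's crest:
n = np.arange(1000)
x = np.sin(2 * np.pi * 0.25 * n + np.pi / 4)
x = x / np.max(np.abs(x))        # sample peak is now exactly 1.0 (0 dBFS)

sample_peak = np.max(np.abs(x))  # 1.0 - looks perfectly "legal"
inter_peak = true_peak(x)        # ~1.41 - about +3 dB "into the red"
```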
So, a player like Spotify needs to make a decision – should the resulting audio just be allowed to clip, should it be turned down, or should it be limited, in an attempt to minimise the distortion?
Since the “Volume Normalisation” feature turns tracks down, and users complained about that, it’s not unreasonable for tracks to be left at full level when the feature is disabled. And since digital clipping distortion sounds bad, limiting seems like a reasonable solution, right? Well, maybe – see the end of this post. Meanwhile:
Summary – why I was wrong about Spotify
- The differences in sound I heard and detected with the null test are because Spotify adds extra limiting to high-level tracks when “Volume Normalisation” is disabled – not when it’s enabled
- This is because high-level audio goes “into the red” when decoded from the Ogg Vorbis stream used by Spotify
- Which is a limitation of the encoder (and lossy compression in general) not Spotify
- Limiting is a sensible way to deal with this, but changes the sound a little on high-level audio
- The “normalised” versions are closer to the original but don’t sound as good because of an aural illusion – the smile curve or Fletcher-Munson effect (or because they were lower-level to begin with, and people don’t like the sound of Spotify’s limiter)
So, where does that leave us?
Well, it leaves me with egg on my face, and Spotify in the clear. It’s particularly embarrassing for me, since as a mastering engineer I deal with all the issues I’ve raised here on a daily basis – I should have known better!
One final question remains, though.
Is limiting high-level decoded audio the right thing for Spotify to do?
I blind-tested myself quickly and 5 out of 6 times I correctly spotted the limited (not-normalised) audio. The kick drums and bass end in general have less impact. Ironically the people who petitioned for the “disable” option on Spotify’s normalisation setting haven’t picked up on this – but perhaps they are listening to music that was lower-level to begin with.
Personally I think Spotify would be better to simply reduce the output of the decoded streams a little, instead of limiting them.
Designing a great-sounding limiter is no easy task, which is why companies like Waves and TC Electronic can charge thousands of pounds for the ones they’ve developed.
I can hear Spotify’s limiter working, and it’s changing the audio more than I would expect. The decoded Ogg Vorbis streams would only need to be reduced a little to prevent clipping distortion, and would sound better as a result.
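What I’m proposing is far simpler than a limiter – a fixed attenuation applied after decoding, big enough to absorb typical inter-sample overshoots. A sketch (the 1 dB headroom figure is my guess at a sensible value, not anything Spotify has published):

```python
def with_headroom(samples, headroom_db=1.0):
    """Attenuate decoded audio by a fixed amount instead of limiting it.

    Unlike limiting, a plain gain change is completely transparent: the
    waveform's shape - and the impact of the kick drums - is untouched,
    only its level changes.
    """
    gain = 10 ** (-headroom_db / 20)
    return [s * gain for s in samples]

# A decoded stream overshooting to 1.05 now stays safely below full scale:
decoded = [1.05, -0.9, 0.3]
safe = with_headroom(decoded)
```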
Conclusion – back where I started – I love Spotify
I don’t want to make a huge song and dance about it, though – most users will never notice the limiter working, and even fewer will be worried by it. In the meantime, Spotify have done the right thing – songs are normalised by default, making the Loudness Wars a non-issue. If the practice spreads, this may be the biggest disincentive ever to making music unnecessarily loud, which can only be a good thing.
Now, if you’ll excuse me, I’m off to get the phrase “level-match before listening” tattooed onto my forehead…
(Image by Marco Mutzke)