Understanding lossless, high-resolution, and spatial audio

Apple recently announced a lossless audio option for its Apple Music streaming service. The company will also offer high-resolution lossless audio, as well as surround sound audio using Dolby Atmos encoding.

Apple is the latest member of the lossless club, joining services such as Tidal, Amazon, Deezer, and Qobuz, which have been offering this option for a while. Even Spotify, which has yet to start rolling out lossless audio to its subscribers, beat Apple to the punch by announcing its own Spotify HiFi service a few months earlier.

Apple is now deploying lossless audio on its platform, and it should automatically become the largest music streaming service to offer this feature. Millions of people will suddenly have access to lossless audio without any additional effort or extra cost.

So now the question is, what even is lossless audio and why should anyone care? And what in the malarkey is high-resolution audio or Dolby Atmos? It’s time we clear some of the air surrounding these technologies and dispel some of the myths that are associated with them. But, before we do that…

The basics

The audio we hear is analog, as that is the domain in which we operate. In the past, the media used to distribute this audio to us was also analog. Audio cassettes and vinyl records are two such examples of analog media, with cassettes being used until somewhat recently and vinyl records still in use among enthusiasts.

However, recording audio in analog form put severe limitations on the quality of the recorded information. Moreover, analog media became harder to fit into an increasingly digital world. This is where digital audio comes in.

A 4-bit PCM encoding

An analog audio signal can be represented as a continuous wave; a simple sine wave is the usual example. To represent this analog signal in the discrete digital domain, a technique called pulse-code modulation, or PCM, is used. Today, an overwhelming majority of the digital audio that we consume uses this method.

Pulse-code modulation, or more specifically linear pulse-code modulation (LPCM), works by taking discrete samples in time and amplitude of the aforementioned analog sine wave. The quality of this analog-to-digital conversion depends on how often the samples are taken (sampling rate) and the number of possible digital values that can be used to represent each sample (bit depth). Theoretically, the higher these values, the closer the digital signal comes to representing the original analog audio.

Because each sample has to be stored as one of a finite number of values, a process called quantization is necessary to map the measured amplitude to the nearest available value, using techniques such as rounding and truncation. However, this process adds noise, and the amount of noise in the converted signal falls as the bit depth rises: each additional bit lowers the quantization noise floor by roughly 6dB. As such, increasing the bit depth of the digital signal reduces the noise floor and thereby increases the dynamic range.
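To make sampling and quantization concrete, here is a minimal sketch that samples a 1kHz sine wave at 44.1kHz and rounds each sample to a signed integer code. The function names and the choice of test tone are illustrative assumptions, and real converters are considerably more involved.

```python
import math

def sample_and_quantize(freq_hz, sample_rate_hz, bit_depth, num_samples):
    """Sample a unit-amplitude sine wave and quantize each sample to a signed integer code."""
    # Largest positive code for a signed integer of this bit depth, e.g. 7 for 4 bits.
    max_code = 2 ** (bit_depth - 1) - 1
    codes = []
    for n in range(num_samples):
        t = n / sample_rate_hz                    # discrete step in time
        x = math.sin(2 * math.pi * freq_hz * t)   # "analog" value in [-1, 1]
        codes.append(round(x * max_code))         # discrete step in amplitude
    return codes

def max_quantization_error(freq_hz, sample_rate_hz, bit_depth, num_samples):
    """Largest gap between the original sample values and their quantized versions."""
    max_code = 2 ** (bit_depth - 1) - 1
    codes = sample_and_quantize(freq_hz, sample_rate_hz, bit_depth, num_samples)
    worst = 0.0
    for n, code in enumerate(codes):
        x = math.sin(2 * math.pi * freq_hz * n / sample_rate_hz)
        worst = max(worst, abs(x - code / max_code))
    return worst

coarse = max_quantization_error(1000, 44_100, 4, 441)
fine = max_quantization_error(1000, 44_100, 16, 441)
print(f"4-bit worst-case error:  {coarse:.5f}")
print(f"16-bit worst-case error: {fine:.7f}")
```

The 16-bit error comes out thousands of times smaller than the 4-bit one, which is the lower noise floor described above.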

Of course, most of the digital audio we hear today isn’t full of background noise. That’s because of a clever technique called dithering. Dithering adds a small amount of deliberately chosen noise before quantization, which replaces the natural error pattern of the quantization process with a noise pattern of our choosing. Combined with noise shaping, this allows us to have the noise that we want and also where we want it in the frequency spectrum. The result is a more subtle, consistent noise that’s less audible and is shifted into the parts of the frequency range that our ears are less sensitive to.
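A small sketch of why dither helps, using a constant signal smaller than half a quantization step. Without dither, rounding erases it completely; with triangular (TPDF) dither added first, the signal survives as the average of a noisy output. The names, the seed, and the 0.3-step level are assumptions chosen for illustration, and noise shaping is not shown.

```python
import math
import random

def quantize(x):
    """Round to the nearest integer step (one step = one least significant bit)."""
    return math.floor(x + 0.5)

def quantize_with_dither(x, rng):
    """Add triangular (TPDF) dither spanning +/- 1 step, then round."""
    dither = rng.uniform(-0.5, 0.5) + rng.uniform(-0.5, 0.5)
    return quantize(x + dither)

rng = random.Random(0)   # fixed seed so the run is repeatable
level = 0.3              # a constant signal smaller than half a step
n = 20_000

plain = [quantize(level) for _ in range(n)]
dithered = [quantize_with_dither(level, rng) for _ in range(n)]

plain_mean = sum(plain) / n        # the signal is rounded away entirely
dithered_mean = sum(dithered) / n  # the signal is preserved on average
print(plain_mean, round(dithered_mean, 2))
```

The trade is a slightly higher but signal-independent noise level in exchange for removing the distortion that plain rounding introduces.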

Quantization noise

Speaking of frequency response, our ears are sensitive to audio waves that fall within the 20Hz to 20,000Hz range. This is a fairly generous figure: most people have a narrower range than this, especially in the higher regions of the spectrum. Moreover, as you age, this range naturally shrinks and you start hearing less of the higher frequencies. But for our discussion, let’s just go with the general 20-20,000Hz range.

To ensure the recording covers frequencies up to at least 20,000Hz, the sampling rate has to be at least twice that, as dictated by the Nyquist-Shannon sampling theorem. Sampling any slower causes aliasing, where frequencies above half the sampling rate fold back into the audible range as spurious tones that are still there when the digital signal is converted back to analog for playback. This means that to capture up to 20,000Hz, or 20kHz, you need a sampling rate of at least 40kHz.
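The arithmetic behind the sampling theorem can be sketched in a few lines. The helper names are hypothetical; the folding rule is the standard one for an ideal sampler with no anti-aliasing filter in front of it.

```python
def minimum_sample_rate(max_freq_hz):
    """Lowest sampling rate that can capture content up to max_freq_hz."""
    return 2 * max_freq_hz

def alias_frequency(freq_hz, sample_rate_hz):
    """Frequency at which a tone appears after sampling, with no anti-aliasing filter."""
    folded = freq_hz % sample_rate_hz
    nyquist = sample_rate_hz / 2
    # Anything above half the sampling rate is mirrored back below it.
    return sample_rate_hz - folded if folded > nyquist else folded

print(minimum_sample_rate(20_000))       # 40000
print(alias_frequency(1_000, 44_100))    # 1000: below the limit, unchanged
print(alias_frequency(30_000, 44_100))   # 14100: folded into the audible range
```

This is why converters filter out content above half the sampling rate before sampling.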

The standard for digital audio today is the Audio CD, which uses 16 bits of information per sample at a 44.1kHz sampling rate in uncompressed LPCM audio. Despite being around 40 years old, it is still considered the gold standard in digital audio and is what we use to compare other standards, such as lossy compressed audio and high-resolution music.

However, as good as CD audio still is to this day, it’s not particularly convenient to stream or download due to its large file sizes. This issue was particularly severe in the early days of the internet and online music, as internet speeds were much slower back then. This led to the invention of compressed audio, which eventually ended up taking over the audio world.
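To put numbers on "large file sizes", here is a rough calculation of the CD data rate, assuming two channels (stereo). The 256kbps lossy figure is only an illustrative comparison point, not a value taken from this article.

```python
def pcm_bit_rate(sample_rate_hz, bit_depth, channels):
    """Bits per second of uncompressed PCM audio."""
    return sample_rate_hz * bit_depth * channels

def megabytes_per_minute(bit_rate_bps):
    """Storage needed for one minute of audio, in megabytes (10^6 bytes)."""
    return bit_rate_bps / 8 * 60 / 1_000_000

cd_bps = pcm_bit_rate(44_100, 16, 2)   # 1,411,200 bits per second
lossy_bps = 256_000                    # illustrative lossy bit rate (assumption)

print(f"CD audio: {cd_bps / 1000:.1f} kbps, {megabytes_per_minute(cd_bps):.2f} MB per minute")
print(f"Lossy:    {lossy_bps / 1000:.1f} kbps, {megabytes_per_minute(lossy_bps):.2f} MB per minute")
```

At these rates a four-minute song is roughly 42MB uncompressed versus under 8MB in the lossy example.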

Compression is a common technique in computing to reduce file sizes. When you create a ZIP file, you’re effectively compressing its contents to occupy less space on the disk. However, ZIP is an example of lossless compression, which can achieve smaller sizes but not by much. To see the real gains, you need to go lossy.

Now, if your archiving application started discarding random data while creating your ZIP file, you probably wouldn’t be too pleased with it. However, audio works differently. Even standard CD audio has a lot of information that human ears often cannot perceive, depending on the person listening and the equipment they may be using. This makes it possible to compress it by discarding only those bits that weren’t that important to begin with.

Early compression techniques weren’t that great. We all remember old MP3 files that had very audible compression artifacts even when listening on less than perfect equipment. You didn’t even need to know what a compression artifact was; you could just hear it because of how obvious it was.

Over time, things improved. We had better and more efficient codecs, which could store more data in less space. More importantly, we got better encoders so the data could be packaged more effectively. Today, lossy compressed audio is everywhere. Every music streaming service has it, every video streaming service has it, and even other formats like audiobooks and podcasts use it. Every smartphone video you have recorded has compressed audio. It’s good enough that most people don’t even realize their music is compressed, even if they happen to have heard the original in the past. It just works.

This then brings us to our next topic…

What is lossless audio and why should I care?

I’m going to make this clear right off the bat: lossless doesn’t mean uncompressed. I see people use these terms interchangeably and they most definitely do not mean the same thing.

As mentioned before, you can compress things in a way that discards some data to achieve significantly smaller file sizes or you can compress things in a way that preserves all the data for a relatively small reduction in file sizes. The latter is lossless compression.

Lossless audio is audio that has been compressed using techniques that preserve all the data in the original file. The result is a file that is easier to stream or download over the internet compared to the original uncompressed file, although it can still be quite large compared to lossy compression.
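The defining property of lossless compression is that decoding gives back exactly the bytes you started with. The sketch below demonstrates that round trip on one second of synthetic 16-bit PCM using Python's general-purpose zlib module, which stands in here for audio-specific codecs such as FLAC or ALAC. The 441Hz tone is an arbitrary choice that happens to repeat exactly every 100 samples, so it compresses far better than real music would.

```python
import math
import struct
import zlib

def sine_pcm16(freq_hz, sample_rate_hz, num_samples):
    """Build mono 16-bit little-endian PCM bytes for a half-scale sine tone."""
    samples = [
        round(32767 * 0.5 * math.sin(2 * math.pi * freq_hz * n / sample_rate_hz))
        for n in range(num_samples)
    ]
    return struct.pack(f"<{num_samples}h", *samples)

original = sine_pcm16(441, 44_100, 44_100)   # one second of audio
compressed = zlib.compress(original, 9)      # lossless, general-purpose compression
restored = zlib.decompress(compressed)

print(f"original:   {len(original)} bytes")
print(f"compressed: {len(compressed)} bytes")
print(f"bit-identical after decompression: {restored == original}")
```

A lossy codec would fail that final equality check by design, because it discards data it judges to be inaudible.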

Tidal is one of the earliest services to offer lossless streaming

The advantages of this should be clear. Lossy compression techniques have improved over the years to the point where, even with modern equipment, most people cannot differentiate between lossless and lossy audio, but the result is still not the original file. This means that those with really good hearing, good equipment, and the ability to discern the difference between lossy and lossless audio, or those who just want the original file unchanged for archival purposes, now have the option to listen to the lossless version.

The primary reason lossy codecs had to be invented was to make music easier to distribute over the internet. But with increasing bandwidth in most places worldwide, faster computing devices, and generally more storage, it is easier than ever to consume the original audio without losing any data in the process. Streaming has also made the storage issue somewhat irrelevant, as music no longer has to be stored locally and can be streamed on demand.

Even if you are someone who doesn’t necessarily have great hearing, the knowledge of what to look for in good audio, or good equipment to aid you in that search, but who just happens to have really fast internet with no data caps, why would you listen to lossy compressed audio? If you can consume the lossless audio the same way you would the lossy one, why would you pick the lossy one?

Amazon recently got into the lossless audio game with Music HD

That’s pretty much the rationale behind lossless audio. It’s not necessarily better audio practically speaking but simply the original audio, as it would have arrived on an Audio CD all those years ago. Yes, in a way, we have come full circle back to the Audio CD and it’s strange to see lossless audio being hyped this way when, technically speaking, it’s usually just CD audio, except without the CD. It’s funny how things work out sometimes.

Still, lossless audio doesn’t necessarily have to be CD audio. When it is similar in spec to CD audio, you will see companies calling it “CD-quality” audio. More often than not, that just means audio in 16-bit, 44.1kHz, but when the term is used alongside lossless, it means audio identical to what you’d find on an actual Audio CD.

But CD quality isn’t good enough for some people anymore. This is where we come to the final boss of digital audio.

High-resolution audio

Like with the previous section, I’m going to start this one with a clarification. High-resolution does not mean lossless and lossless does not mean high-resolution. High-resolution can be in lossy or lossless formats. Lossless audio can be low or high-resolution. The two are distinct.

Apple will be offering its high-resolution tracks in the same lossless ALAC codec as the CD-quality lossless files. Amazon doesn’t specify its codec but also offers lossless audio for its high-res and CD-quality files. Tidal, on the other hand, offers its CD-quality tracks in lossless FLAC but the high-resolution tracks in the lossy MQA codec within a FLAC container.

So what is high-resolution audio then? For this, we have to go back to our old friends, bit depth and sampling rate. High-resolution just has more of these: more bits and more samples.

Amazon Music HD specifications

High-resolution audio usually has a bit depth of at least 24 bits and a sampling rate of at least 88.2kHz, but the sampling rate can go as high as 192kHz or even 384kHz in some cases.

High-resolution audio is sometimes also referred to as high-definition audio. Then there’s also Hi-Res Audio, which is a Sony brand that others can license to indicate their devices support high-resolution audio but it’s not necessary, and you can have high-resolution audio hardware without this branding.

High-resolution audio has been around for a while now, starting in the Super Audio CD days. The thing is, most people, including audiophiles and engineers, can’t seem to agree on whether high-resolution is a useful thing to have or just snake oil. The standard Red Book specification of 16 bits at 44.1kHz used by the Audio CD not only encompasses the entirety of average human hearing but does so in a way that is practically very difficult to improve upon.

Let’s look at the first advantage that high-resolution audio brings, which is higher bit depth. When you reduce the bit depth of the analog-to-digital conversion, the quantization process adds more noise, and that noise is still there when the signal is converted back to analog. By increasing the bit depth, you naturally reduce the noise and thereby increase the dynamic range.

However, even with undithered 16 bits, you can get a dynamic range of 96dB, which is very close to maxing out the limits of human hearing (120dB), and the additional headroom offered by 24 bits (144dB) goes so far beyond it that it even exceeds the limitations of most equipment. In other words, you can’t hear it.
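The 96dB and 144dB figures follow from the common rule of thumb that the ratio between the largest and smallest representable step is 2 to the power of the bit depth, which works out to about 6dB per bit. A quick check, with a hypothetical helper name:

```python
import math

def dynamic_range_db(bit_depth):
    """Theoretical dynamic range of undithered linear PCM, in decibels."""
    return 20 * math.log10(2 ** bit_depth)

for bits in (16, 24):
    print(f"{bits}-bit: {dynamic_range_db(bits):.1f} dB")
```

This is the simplified figure; more detailed formulas that account for a full-scale sine wave add a little under 2dB to each number.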

Secondly, clever techniques like dithering can help reduce and shape the noise even in a 16-bit signal such that it would be inaudible to anything except precise equipment. The dynamic range of a dithered 16-bit signal can easily be made to go beyond 120dB by reducing and reshaping the noise in the audible range. This means for all practical purposes, 16-bit is perfectly adequate for human ears.

The other advantage high-resolution has is higher sampling rates. A 192kHz sampling rate means the audio can have a frequency response ceiling as high as 96kHz. As I already mentioned, humans can only hear as high as 20kHz, and even then only those with perfect hearing in the prime of their life. Most people top out lower than that.

For audio to have frequencies beyond the humanly audible range is like having a TV that shows light outside of the visible range. You can hear 96kHz sound almost as much as you can see X-rays. Which is to say, not at all.

Having to reproduce audio with such high frequencies can also put a strain on the equipment and drivers, which can introduce additional distortion. This distortion most definitely is in the audible range, which means you’re distorting sound you can hear for the sound you can’t hear.

Some people have suggested that the frequencies that exist beyond human hearing have an impact on the frequencies we can hear, so having those extra frequencies can improve how the audible bits sound. However, there’s no real consensus on this.

Tidal uses MQA, a lossy codec with a unique folding algorithm for packing in more data

The thing is, both higher bit depths and higher sampling rates have an advantage, but it’s primarily when it comes to producing music. Like working with a RAW image, working on an audio track that has a higher bit depth makes it easier to adjust levels. The primary advantage is the lower noise floor: noise adds up when layering multiple tracks, so if all of them are 24-bit or higher, the combined noise floor can still be quite low. The same is true for the sampling rate and the higher frequency response it affords. But once you are done composing the track, you can just export it to 16-bit, 44.1kHz and still get a perfect output.
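A rough way to see how noise adds up across layered tracks: if every track carries uncorrelated noise of equal power, the combined noise power grows with the number of tracks, which raises the floor by 10·log10(N) decibels. This is a simplified model with hypothetical function names; it ignores dither, gain changes, and processing between tracks.

```python
import math

def quantization_floor_db(bit_depth):
    """Approximate quantization noise floor relative to full scale, in dB."""
    return -20 * math.log10(2 ** bit_depth)

def mixed_noise_floor_db(bit_depth, num_tracks):
    """Noise floor after summing num_tracks tracks with uncorrelated, equal-power noise."""
    return quantization_floor_db(bit_depth) + 10 * math.log10(num_tracks)

for bits in (16, 24):
    floor = mixed_noise_floor_db(bits, 32)
    print(f"32 tracks at {bits}-bit: about {floor:.1f} dB relative to full scale")
```

Under this model, a 32-track mix loses about 15dB of noise margin either way, but the 24-bit version starts from so much lower that the result still sits far below the 16-bit floor.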

It’s difficult to say definitively if high-resolution has tangible benefits to humans for simply listening to music. Most people who claim to hear a difference may also simply be listening to a different version of the track; a lot of the music released in high-resolution audio got remastered, which can make something sound dramatically different. The thing making the difference here is the mastering, which was often lackluster back in the day, and why a lot of music is re-released after being remastered these days. This can sound better even in standard CD audio so make sure if you are doing any comparisons that you use the same masters across both formats.

High-resolution audio also got a bad rep early on as companies that jumped on the bandwagon, whether music distribution services or hardware sellers, tacked on a hefty fee for high-res support. Earlier, most digital-to-analog converters (DACs) would only support up to 16 bits at 48kHz. Even today, this is what you’d find on a majority of smartphones and computers out in the world. This is why Apple also recommends using an external DAC when playing back the high-resolution files on Apple Music.

Having said that, things have improved considerably in the past couple of years. These days, you can get even budget smartphones that have the ability to decode a 24-bit, 192kHz signal. High quality DACs for desktop use are now cheaper than ever and you can get something like a Schiit Fulla (yes, that’s what it’s called) for about $109 along with a great built-in amplifier. The only limitation now is being able to find sufficient high-resolution audio content, as it’s still fairly limited.

Spatial audio

Apple’s announcement of lossless audio on Apple Music also came with the addition of Spatial Audio. To be specific, Apple claims to now support Spatial Audio with Dolby Atmos on Apple Music. So what does this mean?

First, let’s discuss Dolby Atmos. Dolby is known to have multiple surround sound codecs and is one of the biggest names in cinema audio. However, traditional surround sound formats have had discrete channel support, such as 5.1 or 7.1. This meant that while mastering audio in these formats, the audio had to be placed within the channels based on where the director wanted the sound to come from.

Dolby Atmos introduced object-based sound rather than channel-based sound. This meant that while mastering, the audio engineer would just need to place the sound in a 3D space around the listener and then the system would figure out which speakers to use to reproduce that sound. This also meant that Atmos had a theoretically infinite number of channels, as one can always add more speakers to increase the immersiveness of the sound, something you cannot do with fixed channel-based formats.
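The idea of describing a sound by its position and letting the playback system choose the speakers can be illustrated with a toy panner. This is emphatically not Dolby's renderer: the speaker layouts, the distance-based weighting, and every name below are assumptions made up for the example.

```python
import math

# Hypothetical speaker layouts: name -> (x, y, z) in metres, listener at the origin.
FIVE_SPEAKERS = {
    "front_left": (-1.0, 1.0, 0.0),
    "front_right": (1.0, 1.0, 0.0),
    "rear_left": (-1.0, -1.0, 0.0),
    "rear_right": (1.0, -1.0, 0.0),
    "top": (0.0, 0.0, 1.0),
}
STEREO = {"left": (-1.0, 1.0, 0.0), "right": (1.0, 1.0, 0.0)}

def render_gains(object_pos, speakers):
    """Toy distance-based panner: closer speakers get more of the object's signal."""
    weights = {
        name: 1.0 / (math.dist(object_pos, pos) + 0.1)
        for name, pos in speakers.items()
    }
    # Normalize so the total power (sum of squared gains) is always 1.
    norm = math.sqrt(sum(w * w for w in weights.values()))
    return {name: w / norm for name, w in weights.items()}

overhead_object = (0.0, 0.2, 0.9)   # a sound placed just above the listener
for layout in (FIVE_SPEAKERS, STEREO):
    gains = render_gains(overhead_object, layout)
    print({name: round(g, 2) for name, g in gains.items()})
```

The same object description produces sensible gains for either layout, which is the property that lets an object-based mix adapt to however many speakers are present.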

Atmos then goes a step further and adds a height element to the sound, which means you can now have speakers above you and the sound can be encoded to appear from above you. This makes the sound significantly more immersive as it can now come from all directions, similar to real life.

Dolby Atmos for music works similarly. It lets the composer arrange the music such that when listening to it on a surround sound speaker setup with height channels, it can be made to sound as if it’s coming from around you, thus enveloping you in the sound sphere and making you feel like you’re there. You can imagine how this can be used for something like a live recording of a performance.

Apple then takes this Dolby Atmos track and feeds it through its tech called Spatial Audio. Spatial Audio has been available for video content through the AirPods Pro and AirPods Max when paired with Apple devices. When playing back Dolby Atmos content on your iPhone or iPad, the AirPods Pro/Max will use the data in the audio track to simulate a 3D sphere of sound around you. Moreover, it will also use the accelerometer and gyroscope on these AirPods models to track your head movement, so that the sound stays anchored in place as you turn your head, as it would in the real world.

For Apple Music, Spatial Audio does the part where it makes the sound appear all around you but without the head tracking bit. Since it’s not doing the head tracking, it can now also work on the base AirPods model. In fact, it works on all AirPods and Beats models with the W1/H1 chip and the built-in speakers of all iPhone, iPad, and Mac models with stereo speakers.

Now the question is whether this sounds any good. I have yet to experience Apple’s Spatial Audio with Apple Music but, speaking as a long-time subscriber of Tidal, surround sound music kinda sucks.

This isn’t an issue with the format itself but rather the recordings. You will occasionally come across tracks that are specially designed to show off surround sound audio, and they might sound pretty cool. However, the music you actually want to listen to, that is, the surround sound masters of popular tracks released by studios, just doesn’t sound very good. This is especially true of studio albums, which were originally designed to be played back on regular stereo speakers and just sound odd and discombobulated when listening to the surround sound master. Personally, I’d much rather listen to the standard stereo version.

I can see how this may work well for live recordings of concerts but I haven’t come across many of those. In the end, I’m still somewhat optimistic about this tech but if you’re someone who doesn’t have the hardware to hear it then don’t worry, you aren’t missing out on much.

What about Bluetooth audio?

So far I’ve largely been talking of audio in the wired realm but a lot of people these days listen to music on Bluetooth devices, whether it’s Bluetooth earbuds like the AirPods, Bluetooth headphones like the Sony WH-1000XM4, Bluetooth speakers, or in-car audio over Bluetooth.

Bluetooth, however, adds an extra layer of complexity as well as an extra layer of compromise. All audio, and I mean ALL audio, sent over Bluetooth is compressed. Every single Bluetooth codec available today, from the baseline SBC to AAC, aptX, aptX HD, LDAC, LHDC, and Samsung Scalable Codec, is a lossy codec. Also, to dispel another popular myth: if your music is in AAC and your Bluetooth headphones and device are using AAC for transmission, the music will still get compressed and re-encoded, and not just sent through as-is.

The AirPods Pro use lossy AAC codec for audio transmission

This is just how audio over Bluetooth works currently. It simply does not have the bandwidth for lossless transmission with the existing codecs, let alone uncompressed transmission. It’s possible that a codec could come about that is so efficient it can pack all the original data in without discarding anything and still manage to send it over Bluetooth’s minuscule bandwidth but that hasn’t happened yet.
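To give a sense of the gap, the sketch below compares the uncompressed CD data rate with approximate, commonly quoted maximum bit rates for a few Bluetooth codecs. These codec figures are not from this article and are best-case numbers; real-world rates are often lower and vary by device and connection quality.

```python
# Uncompressed CD audio: 44.1kHz, 16-bit, stereo.
CD_BIT_RATE_KBPS = 44_100 * 16 * 2 / 1000   # 1411.2 kbps

# Approximate, commonly quoted maximums (assumptions for illustration only).
NOMINAL_CODEC_KBPS = {
    "SBC": 328,
    "aptX HD": 576,
    "LDAC": 990,
}

def fraction_of_cd(codec_kbps):
    """Share of the raw CD data rate that a codec's bit rate can carry."""
    return codec_kbps / CD_BIT_RATE_KBPS

for name, kbps in NOMINAL_CODEC_KBPS.items():
    print(f"{name}: {kbps} kbps, about {fraction_of_cd(kbps):.0%} of uncompressed CD audio")
```

Even the highest of these falls well short of the raw CD rate, which is why each of them has to discard some data.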

Depending upon the codec being used for Bluetooth transmission, you can get pretty good results, especially if the bit rate of the Bluetooth codec far exceeds that of your audio file. However, if you are listening to lossless audio, then you will only retain as much information as the Bluetooth codec is capable of carrying. Still, in the case of lossless audio, since the Bluetooth codec is working with the complete file, the results may be a bit better than compressing the same file twice in a lossy manner. Whether you can notice the difference is another matter altogether.

Conclusion

There are a lot of words on this page and they may come in handy if you have a curious mind. But at the end of the day, music is more than the sum of bits and waves that make it. The purpose of listening to music or any piece of audio is to enjoy it.

For some people such as myself, taking apart the technical side of the music and studying the minutiae of it is part of the listening experience. I don’t remember the last time I listened to a piece of music and didn’t take subconscious notes on the quality of the recording, the arrangement, and how the speakers or headphones I’m using are reproducing all of it. I make sure all the equipment is working correctly, all the bit depths and sampling rates are correct and match the source audio, and that the audio has a clean path from the source without any elements that could affect the final delivery.

If you’re like me, or an even bigger nerd as many are, then things like lossless audio and high-resolution audio are for you. Sometimes it doesn’t matter to us whether we can tell a difference, only that everything is correct and working as we like it. We’re the people who are still buying and being enthusiastic about wired headphones, using external DACs and amps even though we keep getting told the ones that come built into our computers these days are “good enough”, and disabling all EQ and DSP effects.

If you are not like that, first of all, congratulations because you are normal. Secondly, you can just ignore everything here and continue listening and enjoying music as you always have. Things like lossless and high-res will always be there if you ever feel like taking it to the next level. If not, that’s perfectly fine too. What matters is if you’re having a good time.
