Spatial tap: Turning sound quality up to 11

  • With both Apple and Google adding spatial audio to their respective earbuds, 2023 may be the year the technology starts to break through
  • The hope is that enhanced audio will be as appealing to device users as enhanced imagery and cameras

A new audio technology is shaping up to be the next big thing for device vendors desperate to find a new must-have feature to give the smartphone sector a boost following a lacklustre 2022.

The newish ‘sound sensation’ is called spatial audio, a digital update of the surround sound found in quadraphonic music media and home cinema systems. It relies on state-of-the-art digital audio processing and, for now, delivers its best results through separate speakers arrayed to create a soundscape, as in today’s complex (and expensive) AV and cinema installations.

Smartphone vendors would love to tap the same sonic advantages via headphones or earbuds, orchestrating a fully separated soundscape in the privacy of the user’s skull (no need to lug around those speakers to make it work).

A somewhat spatially separated sound experience is already available via today’s high-quality buds and headphones using Dolby Atmos encoding technology. Really hitting the immersive heights, though, requires further technical tinkering to trick the brain into perceiving specific direction and distance, so that a chirruping cricket not only sounds like a cricket, but sounds as though it’s located 10 metres away and slightly to the left (for instance). At the same time, the unmistakable rumble of a herd of wildebeest arriving from behind can be complemented by the screech of a pterodactyl flying overhead… you get the idea.
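For the curious, the two cues mentioned above (direction and distance) can be roughed out in a few lines of Python. This is only a back-of-the-envelope sketch, not how any vendor actually does it: real products rely on measured head-related transfer functions rather than simple formulas. It assumes a spherical head of radius 8.75cm, uses the textbook Woodworth approximation for the tiny difference in arrival time between the two ears, and treats loudness as falling off in proportion to distance. The function names and constants are illustrative assumptions.

```python
import math

# Assumed values for illustration only
HEAD_RADIUS_M = 0.0875       # rough average head radius
SPEED_OF_SOUND_M_S = 343.0   # in air at about 20 degrees C


def interaural_time_difference(azimuth_deg):
    """Woodworth approximation of the arrival-time gap between the ears.

    azimuth_deg: 0 is straight ahead, positive is to the right,
    negative is to the left. Intended for sources in front of the
    listener (-90 to +90 degrees).
    Returns seconds; positive means the sound reaches the right ear first.
    """
    theta = math.radians(abs(azimuth_deg))
    itd = (HEAD_RADIUS_M / SPEED_OF_SOUND_M_S) * (theta + math.sin(theta))
    return itd if azimuth_deg >= 0 else -itd


def distance_gain(distance_m, reference_m=1.0):
    """Simple inverse-distance loudness factor relative to reference_m."""
    return reference_m / max(distance_m, reference_m)


# The cricket: 10 metres away and slightly (say 10 degrees) to the left
cricket_itd = interaural_time_difference(-10.0)
cricket_gain = distance_gain(10.0)
print(f"ITD: {cricket_itd * 1e6:.0f} microseconds, gain: {cricket_gain:.2f}")
```

A negative time difference here simply means the left ear hears the cricket a fraction of a millisecond before the right one, which is one of the hints the brain uses to place it.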

Ideally, smartphone vendors would like to make spatial audio a prime selling point in the same way as camera and image technology.

To that end, Apple was recently first out of the traps with its AirPods Pro, AirPods Max and Beats Fit Pro, which use tiny gyroscope-based head-tracking systems to work out how users are moving their heads relative to the sound they are listening to. Other techniques exploit the topography of the ear to add further spatial nuance. The objective is to make the user feel at the centre of the action.
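The basic idea behind head tracking can be shown with a little arithmetic. The sketch below is an illustration under simplifying assumptions, not Apple’s method: it considers only left-right head rotation (yaw), and the function name is hypothetical. When the head turns, the renderer shifts each sound the opposite way so that it appears to stay put in the room.

```python
def relative_azimuth(source_azimuth_deg, head_yaw_deg):
    """Direction of a sound relative to where the head is now pointing.

    Both angles are in degrees, measured in the room: 0 is the original
    straight-ahead direction, positive is to the right.
    The result is wrapped into the range [-180, 180).
    """
    return (source_azimuth_deg - head_yaw_deg + 180.0) % 360.0 - 180.0


# A voice fixed dead ahead in the room; the listener turns 30 degrees right.
# The voice should now be rendered 30 degrees to the listener's left.
print(relative_azimuth(0.0, 30.0))  # -30.0
```

A full system would also track nodding and tilting, and would have to cope with sensor drift, but the principle of subtracting head movement from the sound’s position is the same.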

Google also recently announced that, this month, it is introducing enhanced spatial audio functionality to the Pixel 6 (and 6 Pro) and Pixel 7 (and 7 Pro) smartphones when they are paired with Pixel Buds Pro. It’s promising users they will soon be able to “fully immerse” themselves with the new setup. Google also says its spatial features include “head tracking”.

The fact that, until now, breakthrough audio technologies have undershot their initial billing – even stereo took a while to really take hold – speaks to the inherent difficulty of whipping up enthusiasm for audio improvements, some of which might not always markedly improve the listening experience.

Music is probably best encoded, most of the time, for a conventional one-dimensional sound arrangement (where the musical sources are arrayed across a stage in front of the listener for a stereophonic effect), unless it’s Pink Floyd or similar, where the sounds pop up from multiple directions. The sound source separation promised by full-on three-dimensional spatial audio best supports an action movie, horror movie or, perhaps, a lavish wildlife documentary (the ultimate would be a movie borrowing from all three genres: think Jurassic Park). Listening to conventional music with full audio separation might actually distract and detract from the music itself.

When it comes to audio tech, there’s always been a whiff of placebo effect around alleged performance improvements – I remember hi-fi magazine reviews alleging significant sound quality differences between speaker stand designs! This tendency to imagine “awesome” sound improvement when really what’s being observed is a slight tonal difference is bound to continue with spatial audio, so potential marketer beware!

The big hurdle to broad acceptance of the technology, though, is probably the distribution of the encoded media. Just before Christmas, Sony announced it had created real-time live distribution technology by evolving its proprietary 360 Spatial Sound technology. It claims the move will enable artists and music creators to create a “360-degree musical experience” by mapping sound sources with positional information “to suit their creative and artistic purposes”. Sony says it’s developing an encoder that delivers both real-time performance and sound quality for artists and sound engineers, enabling them to assign position information to each sound source and arrange the sources in a spherical space.
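What “position information” in a “spherical space” might look like in practice is easy enough to sketch. The example below is purely hypothetical and is not Sony’s format: the class, its field names and the axis convention (x to the right, y straight ahead, z up) are assumptions chosen for illustration. Each sound source is given a direction and a distance, which can then be converted to ordinary x, y, z coordinates around the listener.

```python
import math
from dataclasses import dataclass


@dataclass
class SoundSource:
    """Hypothetical positional tag for one sound in a mix."""
    name: str
    azimuth_deg: float    # 0 = straight ahead, positive = to the right
    elevation_deg: float  # 0 = ear level, 90 = directly overhead
    distance_m: float

    def to_cartesian(self):
        """Return (x, y, z) with x right, y ahead, z up, listener at origin."""
        az = math.radians(self.azimuth_deg)
        el = math.radians(self.elevation_deg)
        horizontal = self.distance_m * math.cos(el)
        x = horizontal * math.sin(az)
        y = horizontal * math.cos(az)
        z = self.distance_m * math.sin(el)
        return (x, y, z)


# An illustrative mix: a singer in front, strings to the right, a choir overhead
mix = [
    SoundSource("vocals", 0.0, 0.0, 1.0),
    SoundSource("strings", 90.0, 0.0, 1.0),
    SoundSource("choir", 0.0, 90.0, 2.0),
]
for source in mix:
    x, y, z = source.to_cartesian()
    print(f"{source.name}: x={x:.2f} y={y:.2f} z={z:.2f}")
```

An encoder would carry tags of this sort alongside the audio itself, leaving the playback device to work out how each source should sound through a given pair of earbuds or set of speakers.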

So what are the chances that this sort of spatial audio is adopted broadly in advance of whatever the immersive visual metaverse turns out to be? There’s an old saying that radio is TV with better pictures – the theory being that images you form in your imagination while listening to words and sound effects may be ‘better’ than those force-fed by a screen. If true, spatial audio paired with radio and podcast content might turn out to be a formidable coupling.

