diff options
Diffstat (limited to 'docs/ambisonics.txt')
-rw-r--r-- | docs/ambisonics.txt | 126 |
1 files changed, 126 insertions, 0 deletions
diff --git a/docs/ambisonics.txt b/docs/ambisonics.txt new file mode 100644 index 00000000..2d94427e --- /dev/null +++ b/docs/ambisonics.txt @@ -0,0 +1,126 @@ +OpenAL Soft's renderer has advanced quite a bit since its start with panned +stereo output. Among these advancements is support for surround sound output, +using psychoacoustic modeling and more accurate plane wave reconstruction. The +concepts in use may not be immediately obvious to people just getting into 3D +audio, or people who only have more indirect experience through the use of 3D +audio APIs, so this document aims to introduce the ideas and purpose of +Ambisonics as used by OpenAL Soft. + + +What Is It? +=========== + +Originally developed in the 1970s by Michael Gerzon and a team others, +Ambisonics was created as a means of recording and playing back 3D sound. +Taking advantage of the way sound waves propogate, it is possible to record a +fully 3D soundfield using as few as 4 channels (or even just 3, if you don't +mind dropping down to 2 dimensions like many surround sound systems are). This +representation is called B-Format. It was designed to handle audio independent +of any specific speaker layout, so with a proper decoder the same recording can +be played back on a variety of speaker setups, from quadraphonic and hexagonal +to cubic and other periphonic (with height) layouts. + +Although it was developed decades ago, various factors held ambisonics back +from really taking hold in the consumer market. However, given the solid +theories backing it, as well as the potential and practical benefits on offer, +it continued to be a topic of research over the years, with improvements being +made over the original design. One of the improvements made is the use of +Spherical Harmonics to increase the number of channels for greater spatial +definition. Where the original 4-channel design is termed as "First-Order +Ambisonics", or FOA, the increased channel count through the use of Spherical +Harmonics is termed as "Higher-Order Ambisonics", or HOA. The details of higher +order ambisonics are out of the scope of this document, but know that the added +channels are still independent of any speaker layout, and aim to further +improve the spatial detail for playback. + +Today, the processing power available on even low-end computers means real-time +Ambisonics processing is possible. Not only can decoders be implemented in +software, but so can encoders, synthesizing a soundfield using multiple panned +sources, thus taking advantage of what ambisonics offers in a virtual audio +environment. + + +How Does It Help? +================= + +Positional sound has come a long way from pan-pot stereo (aka pair-wise). +Although useful at the time, the issues became readily apparent when trying to +extend it for surround sound. Pan-pot doesn't work as well for depth (front- +back) or vertical panning, it has a rather small "sweet spot" (the area the +head needs to be in to perceive the sound in its intended direction), and it +misses key distance-related details of sound waves. + +Ambisonics takes a different approach. It uses all available speakers to help +localize a sound, and it also takes into account how the brain localizes low +frequency sounds compared to high frequency ones -- a so-called psychoacoustic +model. It may seem counter-intuitive (if a sound is coming from the front-left, +surely just play it on the front-left speaker?), but to properly model a sound +coming from where a speaker doesn't exist, more needs to be done to construct a +proper sound wave that's perceived to come from the intended direction. Doing +this creates a larger sweet spot, allowing the perceived sound direction to +remain correct over a larger area around the center of the speakers. + +In addition, Ambisonics can encode the near-field effect of sounds, effectively +capturing the sound distance. The near-field effect is a subtle low-frequency +boost as a result of wave-front curvature, and properly compensating for this +occuring with the output speakers (as well as emulating it with a synthesized +soundfield) can create an improved sense of distance for sounds that move near +or far. + + +How Is It Used? +=============== + +As a 3D audio API, OpenAL is tasked with playing 3D sound as best it can with +the speaker setup the user has. Since the OpenAL API does not explicitly handle +the output channel configuration, it has a lot of leeway in how to deal with +the audio before it's played back for the user to hear. Consequently, OpenAL +Soft (or any other OpenAL implementation that wishes to) can render using +Ambisonics and decode the ambisonic mix for a high level of accuracy over what +simple pan-pot could provide. + +This is effectively what the high-quality mode option does, when given an +appropriate decoder configuation for the playback channel layout. 3D rendering +and effect mixing is done to an ambisonic buffer, which is later decoded for +output utilizing the benefits available to ambisonic processing. + +The basic, non-high-quality, renderer uses similar principles, however it skips +the frequency-dependent processing (so low frequency sounds are treated the +same as high frequency sounds) and does some creative manipulation of the +involved math to skip the intermediate ambisonic buffer, rendering more +directly to the output while still taking advantage of all the available +speakers to reconstruct the sound wave. This method trades away some playback +quality for less memory and processor usage. + +In addition to providing good support for surround sound playback, Ambisonics +also has benefits with stereo output. 2-channel UHJ is a stereo-compatible +format that encodes some surround sound information using a wide-band 90-degree +phase shift filter. It works by taking a B-Format signal, and deriving a +frontal stereo mix with the rear sounds attenuated and filtered in with it. +Although the result is not as good as 3-channel (2D) B-Format, it has the +distinct advantage of only using 2 channels and being compatible with stereo +output. This means it will sound just fine when played as-is through a normal +stereo device, or it may optionally be fed to a properly configured surround +sound receiver which can extract the encoded information and restore some of +the original surround sound signal. + + +What Are Its Limitations? +========================= + +As good as Ambisonics is, it's not a magic bullet that can overcome all +problems. One of the bigger issues it has is dealing with irregular speaker +setups, such as 5.1 surround sound. The problem mainly lies in the imbalanced +speaker positioning -- there are three speakers within the front 60-degree area +(meaning only 30-degree gaps in between each of the three speakers), while only +two speakers cover the back 140-degree area, leaving 80-degree gaps on the +sides. It should be noted that this problem is inherent to the speaker layout +itself; there isn't much that can be done to get an optimal surround sound +response, with ambisonics or not. It will do the best it can, but there are +trade-offs between detail and accuracy. + +Another issue lies with HRTF. While it's certainly possible to play an +ambisonic mix using HRTF and retain a sense of 3D sound, doing so with a high +degree of spatial detail requires a fair amount of resources, in both memory +and processing time. And even with it, mixing sounds with HRTF directly will +still be better for positional accuracy. |