aboutsummaryrefslogtreecommitdiffstats
path: root/docs/hrtf.txt
blob: ba8cd8fa2e6df5f77e2c11ce775279014130a215 (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
HRTF Support
============

Starting with OpenAL Soft 1.14, HRTFs can be used to enable enhanced
spatialization for both 3D (mono) and multi-channel sources, when used with
headphones/stereo output. This can be enabled using the 'hrtf' config option.

For multi-channel sources this creates a virtual speaker effect, making it
sound as if speakers provide a discrete position for each channel around the
listener. For mono sources this provides much more versatility in the perceived
placement of sounds, making it seem as though they are coming from all around,
including above and below the listener, instead of just to the front, back, and
sides.

The default data set is based on the KEMAR HRTF data provided by MIT, which can
be found at <http://sound.media.mit.edu/resources/KEMAR.html>. It's only
available when using 44100hz or 48000hz playback.


Custom HRTF Data Sets
=====================

OpenAL Soft also provides an option to use user-specified data sets, in
addition to or in place of the default set. This allows users to provide their
own data sets, which could be better suited for their heads, or to work with
stereo speakers instead of headphones, or to support more playback sample
rates, for example.

The file format is specified below. It uses little-endian byte order.

==
ALchar   magic[8] = "MinPHR02";
ALuint   sampleRate;
ALubyte  sampleType;  /* Can be 0 (16-bit) or 1 (24-bit). */
ALubyte  channelType; /* Can be 0 (mono) or 1 (stereo). */
ALubyte  hrirSize;    /* Can be 8 to 128 in steps of 8. */
ALubyte  fdCount;     /* Can be 1 to 16. */

struct {
    ALushort distance;        /* Can be 50mm to 2500mm. */
    ALubyte evCount;          /* Can be 5 to 128. */
    ALubyte azCount[evCount]; /* Each can be 1 to 128. */
} fields[fdCount];

/* NOTE: ALtype can be ALshort (16-bit) or ALbyte[3] (24-bit) depending on
 * sampleType,
 * hrirCount is the sum of all azCounts.
 * channels can be 1 (mono) or 2 (stereo) depending on channelType.
 */
ALtype coefficients[hrirCount][hrirSize][channels];
ALubyte delays[hrirCount][channels]; /* Each can be 0 to 63. */
==

The data is described as thus:

The file first starts with the 8-byte marker, "MinPHR02", to identify it as an
HRTF data set. This is followed by an unsigned 32-bit integer, specifying the
sample rate the data set is designed for (OpenAL Soft will not use it if the
output device's playback rate doesn't match).

Afterward, an unsigned 8-bit integer specifies how many sample points (or
finite impulse response filter coefficients) make up each HRIR.

The following unsigned 8-bit integer specifies the number of fields used by the
data set.  Then for each field an unsigned 16-bit short specifies the distance
for that field (in millimeters), followed by an 8-bit integer for the number of
elevations.  These elevations start at the bottom (-90 degrees), and increment
upwards.  Following this is an array of unsigned 8-bit integers, one for each
elevation which specifies the number of azimuths (and thus HRIRs) that make up
each elevation.  Azimuths start clockwise from the front, constructing a full
circle.  Mono HRTFs use the same HRIRs for both ears by reversing the azimuth
calculation (ie. left = angle, right = 360-angle).

The actual coefficients follow. Each coefficient is a signed 16-bit or 24-bit
sample.  Stereo HRTFs interleave left/right ear coefficients.  The HRIRs must
be minimum-phase.  This allows the use of a smaller filter length, reducing
computation.  For reference, the default data set uses a 32-point filter while
even the smallest data set provided by MIT used a 128-sample filter (a 4x
reduction by applying minimum-phase reconstruction).

After the coefficients is an array of unsigned 8-bit delay values, one for
each HRIR (with stereo HRTFs interleaving left/right ear delays). This is the
propagation delay (in samples) a signal must wait before being convolved with
the corresponding minimum-phase HRIR filter.