Three Binaural Auditory Scenes

As part of a psychoacoustic experiment for my dissertation, I designed three binaural auditory scenes. Below are several renderings of each auditory scene. The auditory scenes should be played back over headphones at a low-to-medium listening level. The best sense of presence is achieved if the listener sits upright and closes his or her eyes. If you are not accustomed to listening to binaural auditory scenes, I recommend listening to these a few times, as the sense of presence generally improves as your perceptual expectations adapt to the headphone experience.

Binaural scenes are synthesized by filtering monaural source signals with 50 HRTF pairs for directions surrounding the listener. Each virtual scene includes multiple sound sources as well as 1st- and 2nd-order acoustic reflections for a medium-sized room. For the first scene the listener is slowly rotating and the sources are stationary with respect to the room. For the other two scenes, the listener is stationary and the sources move along straight-line trajectories. The monaural source signals were collected from a variety of sources: the trumpet, percussion, and speech signals are true anechoic recordings, but the remaining signals are informal field recordings that include natural reverberation and background noise. A frame size of 50 ms is used with 5 ms overlapping raised-cosine-squared onsets and offsets. The binaural signals are computed in MATLAB. More details can be found in my dissertation, which will be posted shortly.
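For readers curious about the mechanics of the synthesis, here is a minimal sketch in Python of the frame-based HRTF filtering described above. The actual renderings were computed in MATLAB, and the reflection paths and level calibration are omitted here; the HRIR lookup table, the function name, and the direction-per-frame input are illustrative assumptions, not code from the dissertation.

```python
import numpy as np

def render_moving_source(mono, hrirs, dir_per_frame, fs=44100,
                         frame_ms=50, ramp_ms=5):
    """Frame-based binaural rendering sketch (illustrative only).

    mono          : 1-D array, monaural source signal
    hrirs         : dict mapping a direction index to an (n_taps, 2) HRIR pair
    dir_per_frame : direction index to use for each 50 ms frame
    """
    frame = int(fs * frame_ms / 1000)      # 50 ms frames
    ramp = int(fs * ramp_ms / 1000)        # 5 ms raised-cosine-squared ramps
    hop = frame - ramp                     # frames overlap by the ramp length

    # Raised-cosine-squared onset/offset; overlapping ramps sum to unity.
    win = np.ones(frame)
    t = np.arange(ramp) / ramp
    win[:ramp] = np.sin(0.5 * np.pi * t) ** 2
    win[-ramp:] = np.cos(0.5 * np.pi * t) ** 2

    n_taps = next(iter(hrirs.values())).shape[0]
    out = np.zeros((len(mono) + n_taps, 2))

    for k, d in enumerate(dir_per_frame):
        start = k * hop
        seg = mono[start:start + frame]
        if seg.size == 0:
            break
        w = win[:seg.size] * seg           # windowed frame
        h = hrirs[d]                       # HRIR pair for this frame's direction
        for ch in range(2):                # filter with left/right HRIRs, overlap-add
            y = np.convolve(w, h[:, ch])
            out[start:start + y.size, ch] += y
    return out
```

In the full renderings, each source (and each of its 1st- and 2nd-order image sources) would be processed in this manner and the results summed into a single binaural pair.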

There are three renderings of each auditory scene:
1. with measured HRTFs (full-order impulse responses, N=256),
2. with two low-order MISO state-space systems (N=6 each), and
3. with diffuse reverberation added to the rendering in #2.
The addition of diffuse reverberation improves the immersiveness of the scenes (and for scene #2, reverberation also makes the listening experience more comfortable, as the anechoic recording of rapid staccato notes played on a trumpet appears to trigger the stapedius (acoustic) reflex even at low levels).
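As a rough illustration of what "two low-order MISO state-space systems" means in practice, the sketch below runs an order-N state-space filter that takes all of the per-direction input signals for one ear and produces that ear's output; a second system of the same form produces the other ear. The matrices A, B, C, D are assumed to be given (they come from a model-reduction step described in the dissertation, which is not shown here), and the function name and calling convention are my own.

```python
import numpy as np

def miso_state_space_filter(A, B, C, D, U):
    """Run a discrete-time MISO state-space system sample by sample.

    A : (N, N)  state matrix (N = 6 in the low-order renderings)
    B : (N, M)  input matrix, one column per virtual direction
    C : (N,)    output vector for one ear
    D : (M,)    direct feed-through terms
    U : (T, M)  the M per-direction input signals
    Returns the length-T output signal for that ear.
    """
    N = A.shape[0]
    T, M = U.shape
    x = np.zeros(N)                 # state vector
    y = np.zeros(T)
    for n in range(T):
        u = U[n]
        y[n] = C @ x + D @ u        # y[n]   = C x[n] + D u[n]
        x = A @ x + B @ u           # x[n+1] = A x[n] + B u[n]
    return y
```

The appeal of the reduced-order form is cost: per output sample and per ear, the order-6 system needs roughly N² + N·M multiplies, versus 256 multiplies per direction for direct convolution with the full-order HRIRs.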


The nine files below are MP3s (256 kbps). I notice no significant loss in fidelity with the MP3s, but uncompressed CD-quality WAVs are included further below as well.
HRTF Implementation                 | Auditory Scene 1             | Auditory Scene 2             | Auditory Scene 3
                                    | Speech, Percussion, Birds    | Trumpet, Applause, Fireworks | Animals in Motion
                                    | Slowly Spinning Listener     | Stationary Listener          | Stationary Listener
Full-Order HRTFs                    | PLAY                         | PLAY                         | PLAY
N=6 State-Space                     | PLAY                         | PLAY                         | PLAY
N=6 State-Space with Diffuse Reverb | PLAY                         | PLAY                         | PLAY

The binaural signals in the last row are generated with the same state-space system as the middle row, with the addition of a modest amount of diffuse reverberation. The reverberation was added using the conventional 'Reverb' function in Sound Forge v.5 (as opposed to the 'Acoustic Mirror' or other room simulators). No early reflections were included (other than the 1st- and 2nd-order reflections that are modeled with HRTFs). The delay time of the diffuse reverb is 0.7 s for the 1st and 3rd scenes and 1.7 s for the 2nd scene. The diffuse reverberation is mixed into the signal at relative levels of -23 dB, -10 dB, and -17 dB for scenes 1, 2, and 3, respectively.
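To make the mixing step concrete, here is a small sketch of one way to add a wet reverb return at a stated relative level. Interpreting "relative level" as a dB offset between the RMS of the wet return and the RMS of the dry signal is my assumption, not something read out of the Sound Forge settings.

```python
import numpy as np

def mix_reverb(dry, wet, rel_db):
    """Mix a diffuse-reverb return into a dry binaural signal.

    rel_db is the level of the wet return relative to the dry signal
    (e.g. -23 for scene 1, -10 for scene 2, -17 for scene 3).
    NOTE: measuring the relative level as an RMS ratio is an assumption.
    """
    rms = lambda x: np.sqrt(np.mean(np.square(x)))
    gain = 10.0 ** (rel_db / 20.0) * rms(dry) / rms(wet)
    return dry + gain * wet
```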

The six files below are WAVs.
HRTF Implementation | Auditory Scene 1             | Auditory Scene 2             | Auditory Scene 3
                    | Speech, Percussion, Birds    | Trumpet, Applause, Fireworks | Animals in Motion
                    | Slowly Spinning Listener     | Stationary Listener          | Stationary Listener
Full-Order HRTFs    | PLAY                         | PLAY                         | PLAY
N=6 State-Space     | PLAY                         | PLAY                         | PLAY