The Role of Dynamic Information in Virtual Acoustic Displays
Principal Investigators
Elizabeth M. Wenzel, Durand R. Begault

Problem
Unlike the use of spatial audio for entertainment (e.g., stereophony or surround-sound), spatial auditory displays for high-stress human interfaces demand a rigorous approach to human performance criteria and psychoacoustical validation. If such validation is not a mandatory component of the design strategy, disastrous results could follow from presenting inadequate, incorrect, or inappropriate spatial information to a human operator. The research described here recognizes that the development of advanced human-machine interfaces, and their eventual technological transfer, requires an understanding of human perceptual and cognitive requirements in order to develop and integrate the technology successfully. While the project has already made significant strides in an integrated basic research and technology development program, much of the basic human performance information that will determine the nature of future technology remains to be gathered. The outcome of this research is linked to several important deliverables: (1) human factors guidelines that form the basis of engineering specifications for virtual displays, (2) standalone hardware systems for synthesis of multi-channel, spatial auditory communications, and (3) fully functional hardware/software systems for synthesis of interactive, virtual acoustic environments.

Approach and Objectives
Currently, we have baseline data for localization of spatialized sound using static (non-head-coupled), anechoic (echoless) stimuli. Such stimuli tend to produce increased localization errors relative to real sound sources, including increased reversal rates (sounds heard with a front-back or up-down error across the interaural and horizontal axes), decreased elevation accuracy, and failures of externalization (Figure 3.1a). These errors are probably due to the static nature of the stimulus and the inherent ambiguities resulting from the geometry of the head and ears, the so-called cones of confusion (Figure 3.1b). The rather fragile cues provided by the complex spectral shaping of the HRTFs as a function of location (Figure 3.1c) are essentially the only means for disambiguating the location of static sounds lying on a particular cone of confusion. With head motion (Figure 3.1d), however, the situation may improve greatly; it has been hypothesized that listeners can disambiguate front-back locations by tracking changes in the size of the interaural cues over time, and that pinna cues are always dominated by interaural cues (Wallach, 1939; 1940). Wallach's early work with real sound sources also suggests that head motion may be a factor in externalization.
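The following minimal numerical sketch illustrates the reasoning behind the head-motion hypothesis. It uses a simple spherical-head (Woodworth-style) ITD approximation, with assumed head radius and sound speed rather than measured values: a front source and its back-mirrored counterpart on the same cone of confusion produce identical static ITDs, but a small head turn drives the two ITDs in opposite directions, which is the dynamic information a listener could exploit.

```python
# Sketch: front-back ambiguity of static ITDs and its resolution by head rotation.
# Head radius and sound speed are illustrative assumptions, not study parameters.
import math

HEAD_RADIUS_M = 0.0875   # assumed average head radius
SPEED_OF_SOUND = 343.0   # m/s

def itd_seconds(azimuth_deg):
    """Woodworth-style ITD approximation for a source at the given azimuth
    (0 deg = straight ahead, 90 deg = right, 180 deg = behind). The ITD depends
    only on the lateral angle, so front and back mirror positions coincide."""
    lateral = math.asin(math.sin(math.radians(azimuth_deg)))  # fold front/back onto one lateral angle
    return (HEAD_RADIUS_M / SPEED_OF_SOUND) * (math.sin(lateral) + lateral)

front, back = 30.0, 150.0                      # mirror pair on one cone of confusion
print(itd_seconds(front), itd_seconds(back))   # identical ITDs: ambiguous with a static head

head_turn = 10.0                               # head rotates 10 deg to the right
print(itd_seconds(front - head_turn) - itd_seconds(front))  # ITD shrinks for the front source
print(itd_seconds(back - head_turn) - itd_seconds(back))    # ITD grows for the back source
```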

We propose that localization errors such as reversal rates, poor elevation accuracy, and the proportion of non-externalized stimuli will be reduced (relative to baseline conditions using static, non-reverberant synthesis techniques) by enabling head and/or source movement. The methods used were based on standard absolute judgment paradigms in which the subjects' task was to provide verbal estimates of sound source azimuth, elevation, and distance. Acoustic stimuli consisted of broadband signals (e.g., continuous noise, noise bursts) filtered by a Convolvotron or similar spatial audio system. Each study included six or more paid adult volunteers with normal hearing in both ears as measured by a standard audiometric test. In general, the experimental designs were within-subjects, repeated-measures factorial designs with at least 5 repetitions per stimulus condition, tested over a range of locations intended to sample the stimulus space as fully as practicable; a schematic trial-list sketch follows.
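As a rough illustration of such a design (the location grid, factor names, and trial counts below are assumptions for the example, not the actual protocol), a within-subjects, repeated-measures factorial trial list with 5 repetitions per condition might be generated and independently randomized per subject as follows:

```python
# Hypothetical trial-list generator for a within-subjects, repeated-measures
# factorial design; all factor levels are illustrative assumptions.
import itertools
import random

azimuths_deg   = [0, 45, 90, 135, 180, 225, 270, 315]   # hypothetical horizontal-plane sampling
elevations_deg = [-36, 0, 36]                            # hypothetical elevation grid
cue_conditions = ["static", "dynamic"]                   # example factor
repetitions    = 5                                       # at least 5 repetitions per condition

def trial_list(seed):
    """Return a randomized trial order for one subject."""
    trials = [
        {"azimuth": az, "elevation": el, "condition": cond, "rep": rep}
        for az, el, cond in itertools.product(azimuths_deg, elevations_deg, cue_conditions)
        for rep in range(repetitions)
    ]
    random.Random(seed).shuffle(trials)   # each subject gets an independent order
    return trials

for subject_id in range(6):               # six paid volunteers per study
    trials = trial_list(seed=subject_id)
    print(subject_id, len(trials))        # 8 * 3 * 2 * 5 = 240 trials per subject
```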

Accomplishments
A recent study by Wightman and Kistler, our collaborators at the University of Wisconsin-Madison, suggests that head movements significantly reduce the rate of front-back confusions with virtual sound sources synthesized from personalized HRTFs. The study also included conditions in which the sound source moved but the listener did not. The results indicate that source motion alone does not significantly reduce confusions; that is, subjects' rates of front-back confusions decreased only when they had active knowledge of the relative source position, either through their own head motions or through keyboard control of the direction of source motion.

Wenzel recently examined the relative contribution of the two primary localization cues, interaural time differences (ITDs) and interaural level differences (ILDs), to the localization of virtual sound sources both with and without head motion. Stimuli were synthesized from nonpersonalized HRTFs. During dynamic conditions, listeners were encouraged to move their heads; the position of the listener's head was tracked, and the stimuli were synthesized in real time using a Convolvotron to simulate a stationary external sound source. ILDs and ITDs were either correctly or incorrectly correlated with head motion: (1) both ILDs and ITDs correctly correlated, (2) ILDs correct, ITDs fixed at 0 degrees azimuth and 0 degrees elevation, (3) ITDs correct, ILDs fixed at 0 degrees, 0 degrees. Corresponding static conditions were also run in which none of the cues changed with head motion. The data indicated that, compared to static conditions, head movements helped listeners resolve confusions primarily when ILDs were correctly correlated, although a smaller effect was also seen for correct ITDs (static vs. dynamic reversal rates: condition (1), 28 vs. 7%; (2), 33 vs. 11%; (3), 45 vs. 21%). There was also a small but general trend toward greater externalization with dynamic sounds, especially in the conditions with correct ILDs. Thus, there is some evidence that head motion can enhance the perceived externalization of virtual sources. In general, the results suggest that, when head motion is enabled, the pinna cues (ILDs) may play a more prominent role in sound localization than might have been expected from the early proposals by Wallach and from previous data for static sounds. Correctly synthesized pinna cues may therefore be quite important for virtual acoustic displays if one is to gain the maximum benefit from dynamic cues.
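The sketch below shows the logic of the three cue conditions in schematic form. It is not the Convolvotron's actual HRTF processing: the sine-law ITD, the broadband ILD model, and the 20 dB maximum ILD are illustrative stand-ins. The key point is that the source direction is recomputed in head-fixed coordinates on every tracker update, while the ITD and ILD can each be taken either from that true relative direction ("correct") or from a direction frozen at 0 degrees azimuth, 0 degrees elevation.

```python
# Schematic cue-manipulation logic for the three stimulus conditions;
# simplified broadband cues, not measured HRTF processing.
import math

SPEED_OF_SOUND = 343.0
HEAD_RADIUS_M = 0.0875
MAX_ILD_DB = 20.0          # assumed broadband ILD at 90 deg lateral

def relative_azimuth(source_az_deg, head_yaw_deg):
    """Azimuth of a stationary source in head-fixed coordinates, wrapped to (-180, 180]."""
    return (source_az_deg - head_yaw_deg + 180.0) % 360.0 - 180.0

def itd_us(azimuth_deg):
    lateral = math.asin(math.sin(math.radians(azimuth_deg)))
    return 1e6 * (HEAD_RADIUS_M / SPEED_OF_SOUND) * (math.sin(lateral) + lateral)

def ild_db(azimuth_deg):
    return MAX_ILD_DB * math.sin(math.radians(azimuth_deg))

def render_cues(source_az_deg, head_yaw_deg, itd_correct, ild_correct):
    """Return the (ITD, ILD) pair presented on one tracker update."""
    rel = relative_azimuth(source_az_deg, head_yaw_deg)
    itd = itd_us(rel) if itd_correct else itd_us(0.0)   # condition (2): ITD frozen at 0 az, 0 el
    ild = ild_db(rel) if ild_correct else ild_db(0.0)   # condition (3): ILD frozen at 0 az, 0 el
    return itd, ild

# A stationary source 30 deg to the right while the listener turns the head:
for yaw in (0.0, 15.0, 30.0):
    print(yaw, render_cues(30.0, yaw, itd_correct=True, ild_correct=False))
```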

Future plans
Future work will include testing additional cue combinations to confirm these findings. We will also extend these investigations to localization of virtual sources in non-anechoic environments and test the hypothesis that appropriately chosen environmental cues can improve localization accuracy in dynamic contexts, using technology developed jointly by Crystal River and NASA Ames. For example, preliminary observations suggest that (1) the upward bias in elevation seen for anechoic sounds is reduced when a floor reflection is added, and (2) lateral reflections may be particularly important for a realistic, externalized auditory percept.
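As an illustration of what adding a single floor reflection involves (this is a generic image-source sketch, not the Crystal River / NASA Ames rendering code, and all geometry and the absorption coefficient are assumed values), the reflection can be treated as a second virtual source mirrored through the floor plane, spatialized from that image position with extra propagation delay and attenuation:

```python
# Image-source sketch of a single floor reflection; assumed geometry and absorption.
import math

SPEED_OF_SOUND = 343.0

def floor_reflection(source_xyz, listener_xyz, floor_z=0.0, absorption=0.3):
    """Return the image-source position, the extra delay (s), and the gain of the
    floor-reflected path relative to a unit-gain direct path."""
    sx, sy, sz = source_xyz
    image = (sx, sy, 2.0 * floor_z - sz)                 # mirror the source through the floor
    direct = math.dist(source_xyz, listener_xyz)
    reflected = math.dist(image, listener_xyz)
    delay_s = (reflected - direct) / SPEED_OF_SOUND      # extra time of flight
    gain = (1.0 - absorption) * (direct / reflected)     # 1/r spreading plus surface loss
    return image, delay_s, gain

# Source 2 m ahead at ear height, listener ears 1.7 m above the floor:
print(floor_reflection(source_xyz=(2.0, 0.0, 1.7), listener_xyz=(0.0, 0.0, 1.7)))
```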

Key references
Wallach, H. (1939). On sound localization. Journal of the Acoustical Society of America, 10, 270-274.
Wallach, H. (1940). The role of head movements and vestibular and visual cues in sound localization. Journal of Experimental Psychology, 27, 339-368.
Wenzel, E. M. (1992). Localization in virtual acoustic displays. Presence, 1, 80-107.
Wenzel, E. M. (1995). The relative contribution of interaural time and magnitude cues to dynamic sound localization. Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, New Paltz, NY, October 15-18. New York: IEEE Press.
Wightman, F. L., & Kistler, D. J. (1995). The importance of head movements for localizing virtual auditory display objects. In G. Kramer & S. Smith (Eds.), Proceedings of the 1994 International Conference on Auditory Displays (p. 283). Santa Fe, NM.
Wightman, F. L., & Kistler, D. J. (In press). Factors affecting the relative salience of sound localization cues. In R. Gilkey & T. Anderson (Eds.), Binaural and Spatial Hearing. Hillsdale, NJ: Lawrence Erlbaum.
Figure 3.1a. Illustration of perceptual errors observed during localization.
Figure 3.1b. Illustration of the cones of confusion.
Figure 3.1c. Illustration of the frequency-dependent interaural level differences (ILDs) provided by HRTFs of an individual subject as a function of 4 different source locations.
Figure 3.1d. Illustration of the interaural cues produced by head motion.