Collecting Binaural Data for Recordings – Accuracy vs Realism
This article will look at some of the ways to get binaural data with emphasis on microphone techniques.
The human auditory system is equipped with two input ports – the left and right ears. This “binaural” processing system provides us with the ability to localize where sound is coming from, something that a one-eared listener would have difficulty doing. Playback systems may utilize any number of channels to surround the listener with sound, but two channels are always enough to simulate the human listener.
Recording enthusiasts discovered the benefits of stereo microphones long ago. While not necessarily “human-like,” they can produce recordings that add spaciousness and realism to the recorded material. Two-channel acoustic measurements are important for the same reason – they add a human characteristic to the data. In this article, I will use the term “binaural” to describe recording processes that provide data for two ears. For our purposes, there is no need to distinguish between making a recording and making a measurement, as either or both may be the motive of the investigator.
This article will look at some of the ways to get binaural data. Many modern measurement platforms support two-channel recording. We will assume that one of them is being used, and concentrate here on microphone techniques.
Accuracy vs. Realism
One of the first decisions that must be made by the data gatherer is whether accuracy or realism is more important. After a little consideration, it becomes apparent that one cannot have both. Setup parameters that provide a more accurate view of the loudspeaker’s response will require that the effects of the environment be minimized. On the other hand, if the effect of the room is to be considered, then accuracy will need to be sacrificed to include it. The question becomes, “Do I want to know what is actually happening, or do I want to know what is perceived to be happening?” The answer to this question will fundamentally affect the method used to collect the data.
Response Times Three
It is important to note that at least three responses are being gathered in the recording – the loudspeaker, the listener, and the room. The listener’s response is a constant. The ear/brain system is assumed to be processing sound the same way at every seat. The loudspeaker’s response can be dramatically position-dependent, but it does not have to be. Loudspeakers that are designed for covering an audience evenly can have a similar response over a large area. The room also has a response, but it is unique for each listening position. This is one of the reasons why we can’t correct room acoustic problems with electronics.
The Case for Accuracy
Is the goal of the measurement accuracy or realism? If the purpose of the measurement is to calibrate an equalizer or crossover network, then accuracy should be considered first. It is desirable to know the true acoustic response of a transducer at a point in space, usually for the purpose of improving this response through signal processing. A stereo mic on a stand at ear height might convey what a listener will hear, but this response will include seat-dependent artifacts, such as a strong reflection from the floor or other nearby objects. The resultant comb filters will make it impossible to observe the response that is due to the loudspeaker alone (Figure 3). If one were to attempt to compensate for the effect of the floor reflection, the compensation would not be correct for a closer or more distant listener seat. As such, it is best to ignore the floor reflection altogether when “tuning” the system. Also, such a “seat-dependent” response would average out if a large number of measurements were averaged across an auditorium. This is why near-field and ground plane measurement techniques play an important role in sound system tuning. This article isn’t about either, so we will leave the subject for a future one.
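The seat dependence of the floor reflection can be illustrated with a little arithmetic. The sketch below is only a simplified model: it assumes a single specular floor bounce, a point source, and a speed of sound of 343 m/s, and the function name and the example heights and distances are hypothetical values chosen for illustration. It computes the extra path length of the reflection and the first few comb-filter notch frequencies that result.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at about 20 degrees C


def floor_reflection_notches(source_height, mic_height, distance, count=3):
    """Return the first `count` comb-filter notch frequencies in Hz.

    Assumes one specular floor reflection (image-source model) that
    arrives with the same polarity as the direct sound.
    All lengths are in meters.
    """
    direct = math.hypot(distance, source_height - mic_height)
    reflected = math.hypot(distance, source_height + mic_height)
    delay = (reflected - direct) / SPEED_OF_SOUND  # seconds
    if delay == 0.0:
        # Mic (or source) on the floor: no path difference, no notches.
        return []
    # Notches fall where the delay equals an odd number of half periods.
    return [(2 * n + 1) / (2.0 * delay) for n in range(count)]


# Hypothetical setup: loudspeaker at 3.0 m, mic at a seated ear height of 1.2 m.
for seat_distance in (5.0, 10.0, 20.0):
    notches = floor_reflection_notches(3.0, 1.2, seat_distance)
    print(seat_distance, [round(f) for f in notches])
```

Under these assumptions the notch frequencies move as the seat distance changes, which is why an equalizer setting that fills the notches at one seat would be wrong at another. Setting the mic height to zero removes the path difference entirely, which is the idea behind a ground plane measurement.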
The Case for Realism
If the measurer wants to know what a sound system/room sounds like, then accuracy must give way to realism. Realism requires a binaural recording technique, and it must include the same effects from the room that might affect a live listener. Mic placement is actually much easier than when considering accuracy, as the measurer simply listens to the system wherever they like and then replaces their head with the microphone.
Microphone Choices
Stereo
A simple stereo mic can yield left/right information. Two cardioid mics in an X/Y configuration can yield convincing stereo. Spaced omnis are another popular method. This is art, not science, so there really aren’t any rules to break. If you like what you hear, then it’s okay.
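For readers who want a feel for how these two stereo techniques encode direction, the following sketch may help. It assumes ideal cardioid patterns, a coincident X/Y pair angled 45 degrees either side of center, and a distant source for the spaced pair; none of these values come from a particular microphone, and the function names are hypothetical. The X/Y pair produces a level difference between channels, while the spaced omnis produce an arrival-time difference.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed


def cardioid_gain(angle_off_axis_deg):
    """Ideal cardioid sensitivity: 1.0 on axis, 0.0 directly behind."""
    return 0.5 * (1.0 + math.cos(math.radians(angle_off_axis_deg)))


def xy_level_difference_db(source_azimuth_deg, half_angle_deg=45.0):
    """Left-minus-right level in dB for a coincident X/Y cardioid pair.

    Positive azimuth means the source is toward the left. Intended for
    frontal sources only, since a cardioid has a null at 180 degrees.
    """
    left = cardioid_gain(source_azimuth_deg - half_angle_deg)
    right = cardioid_gain(source_azimuth_deg + half_angle_deg)
    return 20.0 * math.log10(left / right)


def spaced_omni_time_difference_ms(source_azimuth_deg, spacing_m):
    """Arrival-time difference in ms for two omnis, distant-source approximation."""
    path = spacing_m * math.sin(math.radians(source_azimuth_deg))
    return 1000.0 * path / SPEED_OF_SOUND


for azimuth in (0.0, 15.0, 30.0, 45.0):
    print(
        azimuth,
        round(xy_level_difference_db(azimuth), 2),
        round(spaced_omni_time_difference_ms(azimuth, 0.5), 3),
    )
```

A source on the center line gives no difference in either case, and both cues grow as the source moves off to one side. Which cue sounds better remains a matter of taste.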
Head Simulation
An added element of realism can be achieved by simulating the presence of a human head. The “head effect” is described by the Head-Related Transfer Function, or HRTF. The Crown SASS™ uses omni mics spaced at human dimensions with an absorptive mass in between (Figure 2). Frequency-dependent directivity is achieved by boundary-loading the mics on small, flat panels. This mic is quite convenient to use, although you might have to buy it an admission ticket in some venues.
Head/Torso/Pinnae Simulation
Perhaps the best binaural mic is the dummy head (Fig. 2). This includes the effect of the head, torso, and even the ear structure. The major benefits of this technique are customization and repeatability. The response can be modified electronically and physically to whatever is desired, and setups can be recalled in the future if needed. Digital signal processing provides a low-cost, powerful way to modify the response. Dummy heads can cost many thousands of dollars, but the cost is easily justified for researchers who need the benefits.
Human Mics
One way to make a “poor man’s” dummy head is to utilize your own (no offense intended). Everything is already in place except the microphones. I have seen numerous mic placement mechanisms over the years, including eyeglass mounts, wires, and even earrings. Possibly the most clever and realistic approach to date is the In-The-Ear™ (ITE) recording technique pioneered by Don and Carolyn Davis in the late ’80s. This involved placing probe mics at the surface of the eardrum. This technique captured the outer ear response, including the ear canal resonance. The resonance was removed with an inverse filter during playback.
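The idea of removing a resonance with an inverse filter can be sketched in a few lines. This is not the filter used in the ITE technique, whose details are not given here. It is a minimal illustration that assumes the resonance can be modeled as a single peaking biquad (using the widely published audio EQ cookbook formulas) with a hypothetical center frequency, gain, and Q. Because such a filter is minimum phase, its inverse is obtained by swapping numerator and denominator.

```python
import cmath
import math


def peaking_biquad(center_hz, gain_db, q, sample_rate):
    """Peaking-EQ biquad coefficients (b, a), normalized so a[0] == 1."""
    amp = 10.0 ** (gain_db / 40.0)
    w0 = 2.0 * math.pi * center_hz / sample_rate
    alpha = math.sin(w0) / (2.0 * q)
    b = [1.0 + alpha * amp, -2.0 * math.cos(w0), 1.0 - alpha * amp]
    a = [1.0 + alpha / amp, -2.0 * math.cos(w0), 1.0 - alpha / amp]
    return [x / a[0] for x in b], [x / a[0] for x in a]


def invert(b, a):
    """Swap numerator and denominator, renormalizing the new a[0] to 1.

    Only valid when the original filter is minimum phase, as this one is.
    """
    return [x / b[0] for x in a], [x / b[0] for x in b]


def response(b, a, freq_hz, sample_rate):
    """Complex frequency response of a biquad at one frequency."""
    z_inv = cmath.exp(-2j * math.pi * freq_hz / sample_rate)
    num = b[0] + b[1] * z_inv + b[2] * z_inv ** 2
    den = a[0] + a[1] * z_inv + a[2] * z_inv ** 2
    return num / den


# Hypothetical resonance model: 10 dB peak at 3 kHz, Q of 2, 48 kHz sampling.
RATE = 48000.0
res_b, res_a = peaking_biquad(3000.0, 10.0, 2.0, RATE)
inv_b, inv_a = invert(res_b, res_a)

for freq in (500.0, 3000.0, 8000.0):
    resonance = response(res_b, res_a, freq, RATE)
    combined = resonance * response(inv_b, inv_a, freq, RATE)
    print(freq, round(20.0 * math.log10(abs(resonance)), 2), round(abs(combined), 6))
```

The combined magnitude of the resonance and its inverse stays at unity across the band, which is the sense in which the resonance is "removed." A measured ear canal response would call for a more careful filter design than this single-band model.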
A variation on this technique that sacrifices some accuracy for practicality is to place small mics at the entrance to the ear canal. I will call this “At-The-Ear” to distinguish it from the previous technique. The mics are held in place by foam inserts (Figure 4). The two mics have XLR male connectors and can connect directly to my data recorder. I normally survey the auditorium without wearing the mics to determine the measurement positions, and then return to the seats with mics in place to gather data. Figure 5 shows a comparison between a free-field measurement and the “At-The-Ear” placement in both the time and frequency domains. The responses have been overlaid for comparison.
The methods used to gather data are determined by the intended use of the data. This often requires more than one technique, each preserving or enhancing the information in a way that yields more insight into the particular problem being solved. When making measurements, arrive equipped to acquire both accurate data and realistic data, and then let the question being pondered determine the preferred perspective. pb