Small Room and Loudspeaker Interaction

Understanding Loudspeaker behavior and acoustical distortions in a small room
by Hadi Sumoro & Xian

The Common Questions

Several questions about a loudspeaker’s sound reproduction come up again and again, such as:

1. Why does a loudspeaker sound different when moved to another room?

2. Why does my new bookshelf loudspeaker sound terrible at home? They were great in the showroom!

3. Why does the loudspeaker sound muddy/bassy inside a room? It was great when I listened to it outside.

There are other questions, but readers can get the point from the examples above. A loudspeaker is often evaluated for how well it reproduces sound, but the room seldom comes into the discussion. This article discusses how loudspeakers and rooms interact.

Outdoor vs. Indoor

Impulse response (IR) is the response of a system to an impulse, usually plotted in a time domain graph (amplitude vs time). The input impulse has specific characteristics: it contains all frequencies, it has the same energy at all frequencies, and it has an infinitesimally short duration. This perfect impulse is also called a Dirac delta function. It is also good to understand that a loudspeaker is usually designed, measured and specified in a free space condition. The term free space condition simply means no reflections.


Figure 1

Figure 1 shows a free space condition where a hand clap is recorded. The time domain graph will show one spike at a certain time. This spike is the arrival of the direct sound.

A true free space condition does not include any reflections, including floor reflection. Therefore a theoretical free space will be a sound source hanging in the air, far (>10m away) from boundaries.

When a sound source is confined in a room, sound will bounce off the walls, floor, ceiling, furniture, etc. Figure 2 shows a more complex time domain graph which includes reflections (blue spikes).

Side note: when a mathematical function called the fast Fourier transform (FFT) or discrete Fourier transform (DFT) is applied to an IR, one can see the frequency response of the loudspeaker’s IR or the room’s IR. Frequency response is frequency domain data (amplitude vs frequency) showing the device’s output spectrum in response to a stimulus.
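As a minimal sketch of this side note, the snippet below applies a direct DFT to two made-up impulse responses: a single spike (direct sound only) and the same spike plus one equal-level reflection. The function name, the 64-sample length and the 8-sample delay are illustrative assumptions, not values from the measurements in this article.

```python
import cmath
import math

def dft_magnitude_db(ir):
    """Magnitude spectrum in dB of an impulse response, using a direct DFT."""
    n = len(ir)
    spectrum = []
    for k in range(n // 2 + 1):  # bins from 0 Hz up to the Nyquist frequency
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(ir))
        spectrum.append(20 * math.log10(max(abs(acc), 1e-12)))  # floor avoids log10(0)
    return spectrum

# Direct sound only: one unit spike, which gives a flat spectrum.
direct_only = [0.0] * 64
direct_only[0] = 1.0
flat = dft_magnitude_db(direct_only)

# Direct sound plus one equal-level reflection 8 samples later: a comb-shaped spectrum.
with_reflection = list(direct_only)
with_reflection[8] = 1.0
comb = dft_magnitude_db(with_reflection)
```

The first spectrum is flat at 0dB in every bin, while the second alternates between peaks of about +6dB and deep notches, which is the comb shape discussed later in the article.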


Figure 2

Like a microphone, our ears hear all sounds in a room: direct sound and room reflections. The room reflections will affect the frequency response of the loudspeakers. Figure 3 shows a frequency response of the same loudspeaker first measured outdoors (floor reflection is removed by a time window) and then indoors.

The black curve in figure 3 represents the true frequency response of the loudspeaker without any wall/floor reflections. The loudspeaker’s frequency response is within ±1dB from 80Hz to 12000Hz. This loudspeaker is later measured in a studio as shown in figure 4. The blue curve in figure 3 shows the frequency response inside the studio.

Figure 3


Figure 4

The blue curve looks like a comb. It isn’t smooth anymore. Instead, it contains a lot of narrow peaks and notches. This is what is commonly called a comb filter. The frequency response shows acoustical distortions due to room reflections. A common misunderstanding is thinking that electronic corrections such as EQ or other electronic processors can be used to fix/eliminate room reflections. Acoustical distortions are best handled with acoustical treatments, finding better source/listener positions, selecting appropriate loudspeakers (for bigger rooms) and lastly, electronic correction (EQ).

Room Effect

We can study the effect of room reflections by observing the IR of a room. Illustratively, figure 5 shows a room’s IR.

Figure 5

Side notes: The time for the direct sound to arrive is called time of flight, and the time between the direct sound and the first reflection is called the initial time gap/delay.

Direct sound always arrives first (figure 5 – red spike). The X-axis shows relative time because the reflections are timed from the arrival of the direct sound.

Any reflections arriving up to 50ms after the direct sound are called early reflections. Early reflections can strongly color the frequency response of the sound source if their level is higher than -20dB relative to the direct sound. Beyond 50ms, reflections are perceived as more separate from the direct sound.
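The timing terms above can be sketched in a few lines. The speed of sound (343m/s), the path lengths and the function names below are assumptions chosen for illustration. The 50ms and -20dB thresholds are the guidelines quoted in the text.

```python
SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at about 20 degrees C

def arrival_times_ms(direct_path_m, reflected_path_m, c=SPEED_OF_SOUND):
    """Return (time of flight, relative delay of the reflection), both in ms."""
    time_of_flight = 1000.0 * direct_path_m / c
    relative_delay = 1000.0 * (reflected_path_m - direct_path_m) / c
    return time_of_flight, relative_delay

def classify_reflection(relative_delay_ms, relative_level_db):
    """Label a reflection using the 50 ms and -20 dB guidelines from the text."""
    if relative_delay_ms <= 50.0:
        if relative_level_db > -20.0:
            return "early reflection, likely to color the sound"
        return "early reflection, weak"
    return "late reflection / reverberation"

# Hypothetical geometry: 2 m direct path, 3.5 m path via a side wall.
tof, delay = arrival_times_ms(direct_path_m=2.0, reflected_path_m=3.5)
label = classify_reflection(delay, relative_level_db=-6.0)
print(f"time of flight {tof:.2f} ms, reflection arrives {delay:.2f} ms later: {label}")
```

If the side-wall path is the first reflection to arrive, its relative delay is also the initial time gap.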

Reflections arriving more than 50ms after the direct sound are generally considered reverberation. Reverberation is usually diffuse in energy. Diffuse means equal energy per unit volume of air. Reverberation time, strictly speaking, applies to a large room. Late reflections also arrive more than 50ms after the direct sound but stand out because of their relatively high level. If a clear repetition of the direct sound can be perceived, this is called an echo. Although an echo may not color the direct sound as strongly as early reflections, it creates time domain problems and can degrade speech intelligibility, especially in commercial venues or other large rooms that require a sound system for communications.

We have just discussed reflections based on the relative time arrival. Now, let’s see how one dominant reflection can affect a frequency response based on its relative level.

Figure 6

A mathematical experiment is done using two Dirac (perfect) impulses separated by 1ms. The delayed impulse (at 1ms) is set to 0dB, -10dB, -20dB and -40dB relative to the 0ms impulse, and the resulting frequency responses are observed. This experiment shows the effect of a delayed signal, here treated as a reflection arriving 1ms after the direct sound, at four different relative levels.

Side Note: A 3dB difference in level is generally perceived as just noticeable and a 10dB difference in level is generally perceived as twice/half the loudness.

Figure 6 – the blue curve shows the resultant frequency response with a 1ms delayed reflection at the same amplitude as the direct sound. The comb filtering contains deep notches and peaks. 1ms is the period of a 1000Hz wave, calculated using the following equation:

1 = T × f

Where T is the period in seconds, and f is the frequency in Hz.

A 1ms reflection will create deep notches in the frequency response wherever the delay equals an odd number of half periods. 1ms is the half period of 500Hz, thus notches can be observed at odd integer multiples of 500Hz: 500Hz, 1500Hz, 2500Hz, etc.

A 6dB peak/bump can be observed wherever the delay equals a whole number of periods, which corresponds to integer multiples of 1000Hz: 1000Hz, 2000Hz, 3000Hz, etc.

The reader can also observe that neither summation nor cancellation happens at 333Hz, 667Hz, 1333Hz, 1667Hz, etc. At these frequencies the 1ms delay corresponds to a phase shift of 120° or 240°, so the sum has the same level as the direct sound alone.

At low frequencies, the frequency response is close to +6dB. This happens because a 1ms delay is only a small phase shift for a long wavelength (long wave period), so the two signals add almost in phase.
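The notch and peak frequencies above can be checked numerically. This is a minimal sketch that models the experiment as a direct signal plus one delayed copy. The function name and the list of test frequencies are illustrative choices.

```python
import cmath
import math

def comb_response_db(freq_hz, delay_s, reflection_level_db=0.0):
    """Level in dB (relative to the direct sound) of direct sound plus one delayed copy."""
    gain = 10 ** (reflection_level_db / 20.0)         # dB to linear amplitude
    phase = -2.0 * math.pi * freq_hz * delay_s         # phase lag of the delayed copy
    total = 1.0 + gain * cmath.exp(1j * phase)
    return 20.0 * math.log10(max(abs(total), 1e-12))  # floor avoids log10(0)

delay = 0.001  # 1 ms
for f in (20, 333.33, 500, 1000, 1500, 2000):
    print(f"{f:8.2f} Hz: {comb_response_db(f, delay):8.2f} dB")
```

With an equal-level reflection the function returns about +6dB at 1000Hz and 2000Hz, about 0dB near 333Hz, a very deep notch at 500Hz and 1500Hz, and close to +6dB at 20Hz.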

What happens if the reflection is -10dB relative to the direct sound? Figure 6 – the red curve shows the resultant frequency response. The notches and peaks are not as extreme as in the blue curve. The comb filter shows peaks and dips of about ±3dB. This variation in frequency response is not very audible and the delayed reflection is practically masked by the direct sound. In the recording world, a microphone placement technique called the 3:1 rule results in roughly -10dB of sound leakage. At that level, the sound spill/leakage is practically masked by the main signal.
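The roughly -10dB figure for the 3:1 technique can be reproduced with the inverse square law. The sketch below assumes an ideal point source in a free field, which a real stage or studio only approximates, and the function name is hypothetical.

```python
import math

def level_drop_db(distance_ratio):
    """Level drop in dB when the distance to a point source grows by distance_ratio."""
    return 20.0 * math.log10(distance_ratio)

# A neighbouring microphone three times farther from the source than the main one.
spill_db = -level_drop_db(3.0)
print(f"spill level relative to the main microphone: {spill_db:.1f} dB")
```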

While a 10dB difference seems adequate for masking a signal in recording/listening, its effect on the frequency response (±3dB) is still considerable. Figure 7 shows a frequency response comparison with the delayed reflection at -20dB (green curve) and -40dB (red curve). It is clear that if one wants to minimize the effect of a reflection, an attenuation of 20dB or more is recommended. Absorbers or diffusers can be used to attenuate specular reflections. By properly placing absorbers/diffusers, one can shape the room impulse response to specific acoustic needs, e.g. a recording studio, lecture hall, music hall, etc.
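To put numbers on the ripple at each relative level, the peak and dip of the comb filter can be computed directly from the reflection’s amplitude. This is a sketch of the same two-impulse model. The function name is an illustrative assumption.

```python
import math

def comb_ripple_db(reflection_level_db):
    """Return (peak, dip) in dB for a single reflection at the given relative level."""
    gain = 10 ** (reflection_level_db / 20.0)
    peak = 20.0 * math.log10(1.0 + gain)              # in-phase summation
    dip = 20.0 * math.log10(max(1.0 - gain, 1e-12))   # out-of-phase cancellation
    return peak, dip

for level in (-10, -20, -40):
    peak, dip = comb_ripple_db(level)
    print(f"{level:4d} dB reflection: peak {peak:+.2f} dB, dip {dip:+.2f} dB")
```

A -10dB reflection gives roughly +2.4/-3.3dB of ripple, a -20dB reflection stays within about ±1dB, and a -40dB reflection within about ±0.1dB.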

Side Note: A comb filter does not look good on the graph, but it does not always sound bad. For example, a concert hall generates comb filtering due to its massive number of reflections, but it sounds good.

Figure 7

Small Room Acoustics

To listen to a sound source without any room effect, one must go outside, although listening outdoors will most likely still be affected by a floor reflection. A room with dimensions of less than 10m/33ft will strongly affect a sound source. Reflections create acoustical distortions such as comb filtering, room resonances, and boundary interference/loading.

Comb Filter

Nearby furniture and walls will give strong specular reflections. Unless absorbers or diffusers are installed at the point of reflection, these reflections may arrive with minimum attenuation and will result in a strong comb filter. This is easily noticed at high frequencies. Figure 3 – the blue curve shows strong comb filtering at high frequencies (>500Hz).

A high frequency comb filter is easily controlled by using absorbers, diffusers, or by slanting the wall. Aside from room modification, the type, placement and orientation of the sound source also have a large effect. Placing a loudspeaker too close to a boundary, such as on a table or too close to a reflective side wall, can result in a strong reflection that can’t be easily attenuated. A comb filter is typically perceived as hollow or harsh sound reproduction at mid/high frequencies.

Room Resonances

Room resonances are commonly known as room modes. These are standing waves that form between two parallel boundaries that are not absorptive, diffusive or reactive at certain frequencies. When we talk about a room, a question arises: how big is a room and how does that relate to sound? It depends: sound behaves differently depending on its wavelength relative to the room’s dimensions.

Figure 8

Figure 8 above shows a frequency response in a room. The pressure zone occurs at low frequencies where the wavelength is much larger than the room dimensions. Here the room will most likely provide extra gain and can ‘extend’ the low frequency response of a subwoofer.

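The modal zone is where room modes occur. Here the wavelengths are comparable to the room dimensions. Two parallel walls will create axial standing waves. Standing waves in a rectangular room can be calculated using the equation below (for axial, tangential, oblique modes):

$$ f_{l,m,n} = \frac{c}{2}\sqrt{\left(\frac{l}{L_x}\right)^2 + \left(\frac{m}{L_y}\right)^2 + \left(\frac{n}{L_z}\right)^2} $$

Where c is the speed of sound propagation in air, L_x, L_y and L_z are the room’s length, width and height, and l, m, n are integers giving the order of the room mode. The axial room modes (orders 1,0,0; 0,1,0 and 0,0,1) for a room with dimensions 3m x 2m x 2.5m, taking c = 343m/s as an assumed value, are approximately:

f(1,0,0) = 343 / (2 × 3) ≈ 57Hz
f(0,1,0) = 343 / (2 × 2) ≈ 86Hz
f(0,0,1) = 343 / (2 × 2.5) ≈ 69Hz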
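As a minimal sketch, the standard rectangular-room mode formula, f = (c/2)·sqrt((l/Lx)² + (m/Ly)² + (n/Lz)²), can be evaluated for the 3m x 2m x 2.5m example room. The speed of sound (343m/s), the function names and the choice to list orders up to 2 are assumptions made for illustration.

```python
import math
from itertools import product

SPEED_OF_SOUND = 343.0  # m/s, assumed

def mode_frequency(order, dimensions, c=SPEED_OF_SOUND):
    """Frequency in Hz of the room mode with order (l, m, n) in a rectangular room."""
    return (c / 2.0) * math.sqrt(sum((p / d) ** 2 for p, d in zip(order, dimensions)))

def mode_type(order):
    """Axial: one non-zero index; tangential: two; oblique: three."""
    return {1: "axial", 2: "tangential", 3: "oblique"}[sum(1 for p in order if p)]

room = (3.0, 2.0, 2.5)  # metres, the example room from the text
modes = sorted(
    (mode_frequency(order, room), order, mode_type(order))
    for order in product(range(3), repeat=3)
    if any(order)  # skip the (0, 0, 0) non-mode
)
for freq, order, kind in modes[:8]:
    print(f"{freq:6.1f} Hz  {order}  {kind}")
```

The first-order axial modes come out at roughly 57Hz, 69Hz and 86Hz, with tangential modes starting just above them.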

For example: in the middle of the 3m length, 57Hz will nearly disappear due to the standing wave, but it will be loud/very noticeable at either end of the 3m length, against the walls. Right in the middle of the room, a lot of cancellation occurs due to room resonances and this is commonly known as bass suck/weak bass.

In a cube room (all dimensions are equal), the cancellations between the 3 pairs of boundaries (two wall pairs and floor/ceiling) happen at the same modal frequencies. This will create strong resonances at specific frequencies, creating very deep notches and very high peaks in the frequency response. These notches/cancellations in the frequency response can’t be fixed using EQ. Bass trapping can help, and a good room ratio will result in well spaced (less noticeable) modal frequencies.

The resonant frequencies and distribution of room modes are mostly determined by the room’s dimensions. Even if the room is not symmetrical and has no parallel walls, standing waves can still occur among several walls (they are just harder to predict). The loudspeaker position will affect the excitation of room modes and the positions of the listeners will affect the audibility of the room modes.

The Schroeder frequency, marked fs in figure 8, is the transition from the modal zone to the diffuse zone. In the diffuse zone, the room dimensions are large enough relative to the wavelength that there is a sufficient density of modes. A diffuse field has equal energy density at all points in the room and a good probability that sound will arrive from any direction. A commonly used approximation is:

$$ f_s \approx 2000\sqrt{\frac{T_{60}}{V}} $$

Where T60 is the reverberation time in seconds (it is known as decay time for a small room) and V is the room volume in m³.
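A quick sketch of the commonly used approximation fs ≈ 2000·sqrt(T60/V) follows. The decay times and the 600m³ hall below are hypothetical values chosen only to show how the transition frequency moves with room size. They are not measurements from this article.

```python
import math

def schroeder_frequency(t60_s, volume_m3):
    """Approximate Schroeder frequency in Hz: 2000 * sqrt(T60 / V)."""
    return 2000.0 * math.sqrt(t60_s / volume_m3)

# Hypothetical small room: 3 m x 2 m x 2.5 m with a 0.3 s decay time.
fs_small = schroeder_frequency(0.3, 3.0 * 2.0 * 2.5)
# Hypothetical hall: 600 cubic metres with a 1.2 s reverberation time.
fs_hall = schroeder_frequency(1.2, 600.0)
print(f"small room: {fs_small:.0f} Hz, hall: {fs_hall:.0f} Hz")
```

Under these assumptions the small room has a transition near 283Hz, while the hall’s transition is near 89Hz, which illustrates why modal problems dominate much more of the audible range in small rooms.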

In general, a room is considered ‘big’ for frequencies above 300Hz.

At mid/high frequencies, back and forth reflections between two parallel surfaces can be noticeable. This is called flutter echo: a series of regularly spaced reflections separated by a constant time. The peaks in the resulting comb filter can be much sharper than those from a single strong reflection.

Lack of Diffusion / Sound Envelopment

The specular zone occurs at frequencies above four times the Schroeder frequency as shown in figure 8. A specular reflection can be thought of as a mirror-like reflection and can create a strong comb filter in the frequency domain. Although the decay time of a treated vs untreated room can be roughly the same, the sonic impression can be very different (note: the term decay time is used in a small room and should be differentiated from reverberation time/T60). Using proper placement of absorbers and diffusers, the density of early/late reflections can be modified to create a different sound envelopment experience.

Sound envelopment is a sense of space. To create this experience in a small room, reflections can be diffused. Diffusers are commonly used to increase diffusivity in a small room, hence also increasing/modifying the sound envelopment. In movies, multichannel audio is widely used to provide envelopment and immerse the listener in the space shown on the screen.

In a loudspeaker’s sound reproduction, sound waves spread differently at different frequencies, becoming less directional with decreasing frequency. Here are sound dispersion balloons for a 2-way loudspeaker with 8in woofer and a tweeter in a small horn.


Figure 9 – left 125Hz, right 500Hz

Figure 10 – left 2000 Hz, right 8000 Hz

This explains why a loudspeaker needs to be aimed properly at the listeners (high frequency radiation tends to focus to the front of loudspeaker) and subwoofers can be placed anywhere. Directivity also exists in other sound sources such as the human voice or acoustic instruments. Various sound source types, placement or orientation can excite/energize a room differently at different frequencies.

A sense of sound envelopment comes from late arriving reflections. Listening to a live band in a small room can give more intimacy, which is a sense of being close to the stage/band, but probably less sound envelopment. The same live band in a large room may feel less intimate but it will generate enhanced sound envelopment. In a larger room, a band can be reinforced with a sound system, which may use high directivity loudspeakers. The use of high directivity loudspeakers can reduce the room effect, thus altering sound envelopment when the band is playing with vs without the sound system.

Modifying sound envelopment in a small room can be done by diffusing reflections, specifically at first reflection points. By locating diffusers at the first reflection points (side walls, ceiling and back wall), one can get dense early reflections. In studio design, this was known as a rich reflection zone. Sound envelopment can get a boost by doing this (can be a good tweak for a home theater); however, if the dense reflections are relatively high in level, this can cause a clarity/image problem. A good balance can be achieved using a combination of diffusers and absorbers where sound envelopment can get a slight boost without sacrificing the clarity/image of the sound reproduction. First reflection points can be found easily by using a mirror. Move a mirror along the side walls (most important), the ceiling and the back wall until the listener can see the loudspeaker in the mirror. These locations are the first reflection points and are very important in shaping the sonic impression.

Side Note: To help build diffusers for breaking up sound wave reflections, one can use software such as Sound Splash.

Boundary Loading

Loudspeaker – boundary interference response is the interaction between a loudspeaker’s direct sound and the adjacent boundaries. This mostly affects low frequencies, as they are usually radiated omnidirectionally. It can be useful or it can create unwanted comb filtering.

The lower the frequency, the longer the wavelength. Since the sound’s wavelength can be >3m and loudspeakers’ dimensions are usually <2m, a loudspeaker usually radiates low frequencies like a sphere, or omnidirectionally (projecting sound in all directions with equal energy). This omnidirectional characteristic is valid if a subwoofer is hung far from boundaries. Once a loudspeaker is placed on a reflective floor, the radiation becomes hemispherical because the floor reflects half of the sound energy.

By putting a subwoofer on the floor (half space), in the corner of two walls (quarter space), or in the corner of two walls and the floor/ceiling (eighth space), the subwoofer gains extra output. Each boundary can add up to 6dB of output, especially at low frequencies.
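The "up to 6dB per boundary" figure can be tabulated for the placements above. This is an idealized sketch that assumes perfectly rigid boundaries and a source much smaller than the wavelength, so each boundary doubles the sound pressure. Real rooms deliver less, and the function name is hypothetical.

```python
import math

def ideal_boundary_gain_db(num_boundaries):
    """Idealized low-frequency gain in dB for a source placed against rigid boundaries."""
    # Each boundary adds a coherent mirror image, doubling the sound pressure.
    return 20.0 * math.log10(2 ** num_boundaries)

placements = (("free space", 0), ("half space", 1), ("quarter space", 2), ("eighth space", 3))
for name, n in placements:
    print(f"{name:13s}: up to {ideal_boundary_gain_db(n):4.1f} dB")
```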

Half Space

To understand how a loudspeaker interacts with a boundary, we will discuss how sound behaves in a half space mounting condition, assuming the wall is reflective. A common half space condition is a flush mounted loudspeaker, where the body of the loudspeaker is inside the wall as shown in figure 11 – left.

Figure 11 – left is flush mounted, right is in front of the wall

In a flush mounted condition (assuming the loudspeaker is properly designed for this – no drivers/ports at the side/back), sound waves radiate from the front only. At frequencies where the loudspeaker directivity is low (usually <500Hz), the output will rise. Flush mounting is usually done at one corner of the room, so aside from the low frequency rise, room modes are excited more. At high frequencies, a properly done flush mount may not affect the frequency response.

Figure 11 – right shows a common loudspeaker placement, which is in front of a wall. When the front of the loudspeaker is at a certain distance from the wall, a cancellation occurs at the frequency whose ¼ wavelength equals that distance. If the front of the loudspeaker is approx. 60cm away from the wall, a notch in the frequency response can be noticed at about 143Hz.
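The 60cm example follows from the quarter wavelength relation f = c / (4 × d). The sketch below assumes c = 343m/s and a perfectly reflective wall. The function name and the extra distances are illustrative.

```python
SPEED_OF_SOUND = 343.0  # m/s, assumed

def wall_cancellation_hz(distance_m, c=SPEED_OF_SOUND):
    """First cancellation frequency for a source distance_m in front of a reflective wall."""
    # The reflection travels 2 * distance further than the direct sound;
    # cancellation occurs when that extra path equals half a wavelength.
    return c / (4.0 * distance_m)

for d in (0.3, 0.6, 1.0, 2.0):
    print(f"{d:.1f} m from the wall: first notch near {wall_cancellation_hz(d):.0f} Hz")
```

Moving the loudspeaker closer to the wall pushes the notch up in frequency, and moving it away pulls the notch down toward the bass range.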

While the flush mount condition seems ideal (because it gives extra low frequency output and extends the low frequency response of the loudspeaker), it is not very practical for most people. Placing loudspeakers in front of a wall is more typical. The closer the loudspeaker is to the wall (<1m), the higher the cancellation frequency will be. This higher frequency cancellation/comb filter may not be too noticeable, and thick absorbers behind the loudspeaker can be used to control it. The extra low frequency output (boundary loading) may be desirable, though a few loudspeaker manufacturers provide a low shelf filter to reduce the effect. When the distance between the loudspeaker and the wall is >1m, the comb filter starts at low/low-mid frequencies. This can create a subjective impression such as hollow or thin sound, and the boundary interference is harder (or more expensive) to control.

Lastly, putting a subwoofer or loudspeaker on the floor is also a half space condition. The output rise depends on the listener position and the directivity of the loudspeaker. Figure 12 shows a point sound source radiation of a loudspeaker. Point source means it radiates from one point of origin (another type of source is line source). When you place the same loudspeaker on a floor (assuming the boundary is solid, reflective and smooth), the radiation becomes a hemisphere as shown in figure 13.


Figure 12

In both figures 12 and 13, an ellipse is shown in front of the loudspeaker. This illustrates the focused radiation of mid/high frequencies, and the main sphere illustrates the low frequency radiation. The focused radiation/directivity of the loudspeaker is usually specified by its -6dB down-point. In this example, the ellipse shows the location where mid/high frequencies are down 6dB relative to the loudspeaker’s on-axis reference point (center of the ellipse). Outside the ellipse, the attenuation of mid/high frequencies is >6dB.

Figure 13

In figure 13, two locations are marked A and B. Location A is in front of the loudspeaker, right on the ground. This location will receive +6dB gain at all frequencies due to the boundary/floor. At location B, the low frequencies will likely receive +6dB gain, but the location is outside the loudspeaker’s mid/high main radiation/sound lobes. This will result in little or no output gain at mid/high frequencies, as well as the possibility of comb filtering in the mid/high frequency response. Figure 14 shows the side view of figure 13.

Figure 14

The boundary acts like an imaginary mirror for sound waves. In figure 14, the top loudspeaker is the real loudspeaker, and the bottom loudspeaker is its mirror image. Note: the radiation spheres from the two sources are misaligned at location B, but aligned at location A. The slight path difference at location B can cause comb filtering as previously discussed, and the extra overall gain caused by the boundary is less than 6dB (closer to 3dB in general).
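The mirror image idea can be turned into a short calculation: the reflected path is the straight line from the image source (the same distance below the floor) to the listener. The heights and distance below are hypothetical, c = 343m/s is assumed, and the function name is illustrative.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed

def floor_reflection_notch_hz(source_height_m, listener_height_m, distance_m, c=SPEED_OF_SOUND):
    """First comb-filter notch caused by a floor reflection, using the mirror-image source."""
    direct = math.hypot(distance_m, listener_height_m - source_height_m)
    reflected = math.hypot(distance_m, listener_height_m + source_height_m)  # image below floor
    path_difference = reflected - direct
    if path_difference == 0.0:
        return math.inf  # paths aligned, no cancellation (a listener or microphone on the floor)
    return c / (2.0 * path_difference)

# Hypothetical geometry: acoustic center 0.3 m above the floor, listener 3 m away.
notch_a = floor_reflection_notch_hz(0.3, 0.0, 3.0)  # like location A, on the ground
notch_b = floor_reflection_notch_hz(0.3, 1.5, 3.0)  # like location B, at ear height
print(f"location A: {notch_a} Hz, location B: {notch_b:.0f} Hz")
```

With these assumed numbers, location A has no notch at all, while location B gets its first notch in the mid frequencies, around 640Hz.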

A common loudspeaker measurement method is called ground plane measurement. By tilting the loudspeaker slightly and placing the microphone at location A in figure 15, one is able to measure the frequency response without destructive interference from the floor reflection. This can also be applied to microphone placement for recording purposes such as capturing a guitar/bass/keyboard amp.

Figure 15

Visualizing Point Source’s Sound Propagation

The pictures below show the sound propagation of one loudspeaker installed in a medium hall with dimensions 14.5m x 9m x 4.6m. The loudspeaker used is a 2-way with a 15in direct radiating woofer and a 60×60 horn for the high frequencies. Four snapshots of sound propagation at 2ms, 4ms, 8ms and 16ms are shown.

Figure 16 – 2 ms snapshot

At 2ms, figure 16, sound starts to propagate like a sphere. The light blue arrow shows the first reflection from the closest side wall. As explained previously, this boundary interference will boost low frequency output (<200Hz), create comb filtering at mid frequencies (200 – 2000Hz) and likely have less/no influence at high frequencies (>2000Hz) due to the focused radiation of the 60×60 horn.

Figure 17 – 4ms snapshot

At 4ms, figure 17, we can tell that the direct sound (marked by black arrows) is leading and will arrive first. The reader may notice that the black arrows point to different colors. This shows the focused radiation of the loudspeaker to the front (light blue particles), less radiation to the bottom of the loudspeaker (green particles) and much less radiation to the back of the loudspeaker (blue particles). Please zoom in for a better view.

Light blue arrows show the side wall reflection and red arrows show the second reflection from the ceiling above the loudspeakers.

At 8ms (figure 18) and 16ms (figure 19), the reader can further observe the sound propagation as it expands and fills the room. This medium directivity loudspeaker will project most sound to the audience area, while providing less spill of mid/high frequencies to the stage.

Figure 18 – 8 ms snapshot

Figure 19 – 16 ms snapshot

Electronic Correction

Electronic correction in the form of equalization (EQ) can be used to lower peaks/bumps in the frequency response due to the room effect. However, it cannot fix cancellations/notches. It is also common to base the equalization on the energy average of several measurement points in the room, known as a spatial average. This is done because the frequency response differs at various locations in a room. A spatial average does not have phase information, therefore it is typical to use a linear phase FIR filter to flatten or shape the spatially averaged curve so the loudspeaker’s phase response is not affected by the filter’s phase. A custom FIR filter can be created using software such as Filter Hose.
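A spatial average on an energy basis can be sketched as follows: each dB value is converted to power, the powers are averaged across microphone positions, and the result is converted back to dB. The function names and the three-bin example data are made up for illustration and are not measurements.

```python
import math

def energy_average_db(levels_db):
    """Energy (power) average of levels in dB measured at several positions."""
    mean_power = sum(10 ** (level / 10.0) for level in levels_db) / len(levels_db)
    return 10.0 * math.log10(mean_power)

def spatial_average(responses_db):
    """Average several magnitude responses (one list per microphone position) bin by bin."""
    return [energy_average_db(bin_levels) for bin_levels in zip(*responses_db)]

# Hypothetical levels in dB at three frequency bins, measured at three positions.
positions = [
    [80.0, 74.0, 70.0],
    [80.0, 68.0, 76.0],
    [80.0, 71.0, 73.0],
]
averaged = spatial_average(positions)
print([round(v, 1) for v in averaged])
```

Because the average is taken on power, a deep notch at one position pulls the result down far less than an arithmetic average of the dB values would, which fits the point that EQ should target peaks rather than position-dependent notches.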

EQ is not a substitute for acoustical treatment in small or large rooms, but it can help to smooth out acoustical distortions.


Understanding how a room and a loudspeaker interact is the key to answering why a loudspeaker sounds different when moved to another room. Once the problems are known, the correct acoustical treatments can be prescribed to achieve the needed acoustical function.

We would like to thank Chris Devenney, Pat Brown (SynAudCon) and Neil Shade (Acoustical Design Collaborative) for their insights and for reviewing this article prior to publication.