Fun with Pink Noise and Rooms and Loudspeakers ...

This page goes on ... and on ... much longer than I originally intended. So much so, that a list of the sections seems in order!

Introduction: Setting the scene.
The Problem: Unfortunate interactions between loudspeakers and rooms.
A (Partial) Solution?: Why equalization might help.
Room Equalization with the Behringer DEQ-2496: How to use a specific device for room equalization.
Verifying the equalization "works": Higher resolution measurements of room behaviour before and after equalization. Includes some discussion of room modes.
What is wrong with all this?: Objections to the sort of equalization we have carried out and more sophisticated measurements of room behaviour. Why some reverberation may be a good thing.
Generating Pink Noise: Ways to do this in both the digital and analog domains.
Measuring with Spot Sine Wave Frequencies: This very basic technique is said to be "bad" when looking at the frequency response of rooms. How bad is it, though?
Any Conclusions?: What do experts conclude about loudspeakers and rooms? It seems some things are pretty certain, but on other points, there is disagreement.
Finally ...: Some thoughts on what audio systems are really capable of -- and what they are not capable of. The more I dig in to the realities of sound reproduction in rooms, the more I have to wonder about what we are really trying to achieve.

Components of both the problem and the (partial, arguably questionable) solution thereto: getting decent sound in a room which is "acoustically challenged". Note the speaker placement (too close to the wall behind them, leading to undesirable early reflections -- ideally they should be at least 1 metre out) and the listening position (with a wall immediately behind it -- ideally there should be several metres of "free space" behind the listener).

Sooner or later, everyone who plays with audio ends up getting some sort of equalizer (a fancy tone control, not a gun) and tries to "improve the sound of the room" they listen in. While this turns out to be a very dodgy proposition in so many ways (not to mention frustrating), IMHO it really is possible to get a marked improvement (in spite of some very valid theoretical and practical objections). YMMV, of course! This page documents my fairly feeble attempts in this direction ... feeble, but rewarding (to my ears).

Caution! Take what follows with a liberal pinch of salt. I don't want to misrepresent myself as anything like an "expert" in these matters. These experiments and associated reading were educational -- but the biggest lesson was perhaps that how human hearing works in a reverberant environment (and especially in small domestic rooms) is very complex and not fully understood. Higher cognitive brain functions are definitely involved in what is really an "imaginative reconstruction" of an acoustic event -- a process which seems able to ignore all manner of measurable defects and problems ... but certainly not all. Even real experts -- I highly recommend reading as much as possible written by Floyd Toole and Siegfried Linkwitz -- do not fully agree on some quite major issues (such as the relative merits of stereo and surround systems, amongst several other matters). While the criteria for "transparency" in audio electronics are (to my mind) very clear (and meeting those criteria is essentially a solved problem), what gives the best possible sound reproduction with loudspeakers in a small room is not at all certain (although some key points are settled) and this area of sound reproduction (and analogous issues with microphones during recording) is by no means "done and dusted".

The Problem

Pretty much always, we listen to loudspeakers indoors: in a room. While loudspeakers are in many ways the weakest link in the electrical reproduction chain (OK - they are electromechanical) with significant frequency response and distortion issues, the interaction of the sounds they produce with the room in which they are placed is still more problematic -- much more problematic. Sounds in a room reflect off the surfaces in the room (to varying degrees depending on the nature of the surface) and the reflections interact with each other (and any direct component). This interaction leads to frequency and position dependent enhancements (peaks) and cancellations (nulls), resonances, ringing, flutter echoes and lots of other things with interesting names. One major consequence of this is that the frequency response of the reproduced sound becomes very far from flat and is different at every point in the room. All of this is dealt with in the venerable science of acoustics and that is a truly enormous subject. For a very approachable introduction, I would suggest having a look at Ethan Winer's "The Audio Expert", Chapters 17-20. The literature beyond that is vast and can get frighteningly difficult.

Floyd Toole's "Sound Reproduction: The Acoustics and Psychoacoustics of Loudspeakers and Rooms" is an excellent choice for further reading. This deals with the specific problem that interests us -- unlike much of the acoustics literature, which deals with concert halls and similar large spaces. It turns out that smaller spaces have their own characteristics which in many ways make them more difficult to analyze than large spaces. The way the ear and brain work to "make sense" of the very complex physical sound field in a small room is fascinating -- and, like the sound field, also very complex. Unfortunately, the simple rules and guidelines found in introductory material turn out to have many caveats ... and may be misguided in some ways. There is more on this in the What is wrong with all this? section below.

For the time being, let's assume that having a flat frequency response is a good thing for a room. It certainly sounds like it should be, doesn't it? Since departure from flatness occurs due to sounds reflecting from the walls and reverberation has the same cause, reducing the reflections will also reduce reverberation. Logically, eliminating all reflections (and reverberation) might appear desirable. Since recordings made in concert halls and so forth will already contain the reverberation "signature" of that hall, the replay system in your room should not add its own reverberation "signature", but simply reproduce what is in the recording. In spite of the logic of this idea, all experience indicates that trying to create a totally "dead" listening room leads to unhappy outcomes. Reproducing the recorded reverberation characteristics of a large hall in a small room just doesn't seem to work well at all as far as listening enjoyment goes. Nonetheless, gross departures from a flat frequency response surely can't be a good thing. Well, let's go along with that for now, anyway.

As with all wave phenomena, there are two distinct types of behaviour: when the wavelength is short compared to the sizes of objects with which the wave interacts (i.e. at higher frequencies), things can be understood in terms of ray-like propagation and geometry. When the wavelength is similar to the size of interacting objects, the wave-like nature dominates and the whole system (the wave and the room in this case) must be considered together.
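To put rough numbers on the two regimes, the wavelength at a given frequency is simply the speed of sound divided by that frequency. The following is a minimal sketch, assuming a speed of sound of 340m/s (the figure used later on this page). The function name is purely illustrative.

```python
SPEED_OF_SOUND = 340.0  # m/s, assumed value for air at room temperature


def wavelength_m(frequency_hz, c=SPEED_OF_SOUND):
    """Return the wavelength in metres of a sound wave at frequency_hz."""
    return c / frequency_hz


# Print the wavelength at a few frequencies across the audio band.
for f in (20, 50, 100, 500, 1000, 5000, 20000):
    print(f"{f:>6} Hz -> {wavelength_m(f):8.3f} m")
```

At 50Hz the wavelength (6.8m) is comparable to the dimensions of a domestic room, while at 5kHz it is under 7cm, which is why the two ends of the audio band behave so differently in the same room.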

It would seem that the right way to even out the frequency response of a room is to change the room. For high frequency problems, the reflections at critical points need to be attenuated. The "critical points" can be found from geometry and correspond to the positions of the first (earliest) reflections. At these points, absorbing material can be added to walls (such material is usually effective from 500Hz and up). To soak up low frequency energy in room resonances (see more on this below), absorbing structures (simple materials will not work) need to be put in the corners (and perhaps at other edges where the walls join to the ceiling and floor). Note that experts do not fully agree on the importance of fixing these room problems.

The high frequency solution will only work for a known, fixed listening position, and there are arguments that for the best results, this should be 40% of the longest dimension of the room away from the wall behind the speakers. Certainly not just in front of another wall. The low frequency correction will also have considerable position dependence. Furthermore, the speakers should be positioned "firing" down the longest dimension, and they should be the "right" distance away from the wall behind them, and angled in towards the listening position ... and so forth.

Unfortunately, the right way is very often the impractical way for many rooms. The location of doors and windows often gives a natural layout for the furniture in the room that just doesn't fit in with the best acoustical layout. Also, the idea of installing bulky sound absorbers often doesn't appeal (especially to one's "significant other" who may very well blow a fuse at the very thought of it -- not unreasonably). In the case of my living/listening room (pictured above), there really is no practical way to fit everything in and meet even the most basic recommendations where speaker placement and listener positioning are concerned. (We have tried various arrangements ... but even I recognize that it isn't possible to live with the results.)

A (Partial) Solution?

So what can be done? Well, if you cannot selectively absorb sound to even out the room's frequency response, you can try changing the relative magnitudes of the frequencies that go in to the room via the speakers in the first place. If the signal to the power amplifier driving the speakers is shaped to have the inverse of the room's frequency response, maybe that would help.
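As a rough illustration of "the inverse of the room's frequency response", the sketch below turns a set of measured band levels into equalizer gains by mirroring each level about the average. The measured values, the function name and the boost/cut limits are all made up for the example. They are assumptions, not figures from any real device or measurement.

```python
def inverse_eq_gains(measured_db, max_boost_db=6.0, max_cut_db=15.0):
    """Return per-band EQ gains (dB) that mirror the measured levels about their mean.

    measured_db maps band centre frequency (Hz) to measured level (dB).
    The boost and cut limits are hypothetical safety limits.
    """
    mean_db = sum(measured_db.values()) / len(measured_db)
    gains = {}
    for band_hz, level_db in measured_db.items():
        gain = mean_db - level_db  # a peak gets cut, a dip gets boosted
        gains[band_hz] = max(-max_cut_db, min(max_boost_db, gain))
    return gains


# Made-up measurement: a big bass hump at 50Hz and a dip at 400Hz.
measured = {50: 12.0, 100: -3.0, 200: 0.0, 400: -9.0}
print(inverse_eq_gains(measured))
```

The limit on boost is deliberate: cutting a peak is generally safe, whereas large boosts ask the amplifier and speakers for extra output that they may not have.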

Enter the equalizer! The most frequently used, and perhaps the most appropriate, type of equalizer for assisting with this problem is the graphical equalizer, which was common, in analog form, in the 1970's. Another very useful type is the parametric equalizer but this has not been used so much for this task -- although it may be a good solution for specific problems as noted below.

The graphical equalizer has a number of filters, centered at specific frequencies, each of which can be set to boost or cut a frequency band. The 1970's analog implementation often had a set of slider potentiometers to control the degree of boost or cut in each band (arranged in increasing frequency). Since the settings of the sliders looked like a graph of the desired frequency response, the name graphical equalizer was very descriptive.
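The centre frequencies of such a filter bank are usually spaced at a constant fraction of an octave. The sketch below assumes 31 bands at exact one-third octave spacing around 1kHz (31 being the band count of the graphic equalizer used later on this page). Real units label their bands with rounded nominal values, and the exact centres a given device uses should be taken from its manual rather than from this example.

```python
def third_octave_centres(n_low=-17, n_high=13, ref_hz=1000.0):
    """Return one-third octave centre frequencies ref_hz * 2**(n/3) for n_low..n_high.

    The default range gives 31 bands spanning roughly 20Hz to 20kHz (an assumption).
    """
    return [ref_hz * 2.0 ** (n / 3.0) for n in range(n_low, n_high + 1)]


centres = third_octave_centres()
print(len(centres), "bands")
print(", ".join(f"{fc:.1f}" for fc in centres))
```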

These original graphical equalizers worked, but came with a number of disadvantages: ripples in the frequency response where the bands joined together, as well as the inevitable extra noise, distortion and so forth. The number of bands also had to be fairly small to keep the physical size and quantity of circuitry reasonable. All of these problems can be overcome by implementing the equalizer in the digital domain. Modern DSPs (or CPUs) are quite capable of crunching numbers at a rate sufficient to keep up with all standard digital audio sampling rates and depths while dealing with multiple channels.

What follows is a fairly detailed description of how I used a specific, DSP based, equalizer. There are many alternatives to this particular device (such as DSPeaker Anti-Mode) and software only versions (for example MathAudio). Some of these alternatives use theoretically superior filtering techniques, amongst other differences. However, I have no direct experience of using them, and can't comment on them. Several of them do look interesting ...

Room Equalization with the Behringer DEQ-2496

In 2003, Behringer introduced the DEQ-2496 which provides a wide range of processing options for 2-channel digital audio at up to 96kHz / 24 bit formats. The DEQ-2496 implements a pipeline of operations through which the audio signal passes (any or all of which can be bypassed). These (optional) processes are (in order):

Graphical Equalizer with 31 bands.
Parametric Equalizer.
Dynamic Equalizer (frequency dependent attenuation with thresholds)
Dynamics Processor (compressor/expander/limiter)
Stereo Width processor.

In addition, the DEQ-2496 has a pink noise generator and a real time 61 band spectrum analyzer, with a measurement microphone input. This is designed to work with the graphical equalizer section to "flatten" out the measured room response using a built in automatic equalization process. It also has comprehensive signal metering functions, balanced and unbalanced analog inputs and outputs and optical S/PDIF digital i/o. For more on what pink noise is and why it is useful, see the Generating Pink Noise section of this webpage.

The product is still available (mid 2014) -- for example Thomann have it for the princely sum of £220 ... This is an amazing bargain, in my opinion. You get an awful lot for your money. Now -- needless to say -- this is far too cheap for many people. It obviously can't be any good at that price, now can it? Well, I don't know that there is anything wrong with it. There could be some feasibly valid nit-picking with the A/D and D/A stages (maybe), but if you stick to using the digital i/o path, those stages are irrelevant. Ah - but I've forgotten about jitter and maybe they've been careless with their signal processing implementations ... Yeah, right ... Then again, maybe they haven't. Personally, I'm more than happy with it.

The primary market for the DEQ-2496 is the professional sound reinforcement industry (i.e. the stuff used at live concerts, etc.). As an aside -- when building a domestic "hi-fi" system I would recommend "looking outside the box" at the pro and pro-sumer audio equipment world. It seems to me that you can get excellent devices at a small fraction of what is charged for similar things in the pure domestic "hi-fi" world.

As a further aside: Some of the prices of so-called "high end hi-fi" equipment seem borderline insane to me. For example, I recently saw a review of a D/A converter priced at £11,000 which wasn't the top of its particular range (by a long chalk) and noticed the PCB wasn't fully populated. At that asking price, they are saving money on the PCB design by using one board across multiple products? Really? Well, at least you "always get an effortless sound quality" ... apparently. It puzzles me why people don't try "USB soundcards" as D/As ... you get a nice A/D thrown in if you go that route. And you could buy 40 or 50 of them for the price of the above mentioned D/A. Having an A/D through which a computer can record sound turns out to be very useful for measuring room responses in detail, as we shall see. It is obviously very useful for other things too.

The first thing to use the DEQ-2496 for is to just have a look at the room's frequency response. If you haven't seen plots of room responses before, be prepared for a shock. To do this:

Connect a microphone to the RTA/MIC IN connector. This is easy if you buy a Behringer ECM8000 measurement microphone. The DEQ-2496 can supply phantom power for this (at a non-standard 15V). I had another measurement microphone and pre-amp (EMM8 microphone with MP-1r pre-amp from iSEMCon), and used this through my own single ended to balanced converter.
Connect the DEQ-2496 to the pre-amp (or whatever) so that its output can be heard on the loudspeakers. In my case, the S/PDIF output of the DEQ-2496 is connected -- via a Maplin Optical to Coaxial converter -- to the input of my Meridian DSP-5000 active digital loudspeakers. (Yes -- Maplin! Well, I'm sure that kills off any possibility of "street cred" as far as the "audiophile" world is concerned!) These speakers contain their own DSP system, which implements a volume control and a 3 way cross-over in the digital domain then feeds the woofer, mid-range and tweeter drivers by three separate D/A converters and 75W power amplifiers. The volume setting is sent over a separate (from S/PDIF) control channel from a Meridian G91 Controller (a digital pre-amp with analog inputs, DVD-A player and FM tuner). The DEQ-2496 fits in very naturally with this system. The main S/PDIF digital output of the G91 feeds the S/PDIF input of the DEQ-2496 via a Maplin Coaxial to Optical converter.
Select the pink noise generator as the input in page 1 of the I/O menu on the DEQ-2496. Set the noise gain (to say -6dB) and the system volume to get a fairly loud pink noise sound from the speakers.
Bypass all processing modules in the DEQ-2496 using the BYPASS menu.
Select RTA/MIC as the input source for the RTA in page 3 of the I/O menu.
Press the RTA button on the left of the DEQ-2496.

The LCD display will then show the RTA results ... the signal from the microphone analyzed into 61 frequency bands.

Note the vertical scale. Yep ... that has variations of the order of +/-15dB! Bit different from a CD player, isn't it?

The next step is to see what the DEQ-2496 can do automatically to smooth out this response with its automatic equalization facility. This is as simple as pressing the button next to the "Auto Eq" option shown in the above image. Before doing that, though, it is best to go in to the Graphical Equalizer and ensure that all frequency bands are selected. By default, the DEQ-2496 will not attempt to equalize frequencies below 100Hz. This seems odd, but there is a good reason for it, as we shall see. In this case, we will ask it to work on the sub 100Hz bands though. Once started, we see the AEQ display:

Note that the AEQ process will not stop until you tell it to. It doesn't iterate until it settles down ... it just keeps going. The AEQ process works by adjusting the setting of the GEQ (graphical equalizer). If you switch to page 3 of the AEQ menu, you can see the AEQ making modifications to the GEQ:

(Apologies for the blurry picture.) The arrows in the HF bands indicate the AEQ is currently adjusting these up. When you have had enough, press the button next to the "Done" option to terminate AEQ and keep the GEQ settings it has made. The final result in the GEQ is shown below.

The next thing to do is to use the RTA to measure the room response again with the GEQ in circuit. Here is what had happened to that:

That is much flatter than it was, isn't it? Does it sound any better? Well, to my ears, very definitely. We are now beyond what can be called objective facts, however. As noted above, you may not agree that it sounds better if you try a similar process. Certainly it will sound different.

Perhaps because I was used to the very pronounced bass hump at around 50Hz (clearly visible in the initial response plot), I modified the automatic equalizer setting in the GEQ to put back a little more bass. The final GEQ setting is shown below:

This equalization setting is now used for almost all listening. Since the DEQ-2496 can store and recall up to 64 complete (named) configuration settings, it is easy to experiment with variations.

There are a few other practical issues to note. By default, the DEQ-2496 does not attempt to auto equalize frequencies below 100Hz, even though this may well be the range that most needs treatment. The reason why not is that it is often hopeless to try to boost deep bass frequencies to try to make up for a lack in that region. If, for example, the speakers have very little output below 50Hz (not uncommon for small speakers), the required gains would be just too much. It simply wouldn't work and attempting it would probably make the speakers sound a lot worse as they tried to do the impossible. In our case, there was too much bass and so we wanted to cut those frequency bands. That will work.

Finally, if any bands are boosted, it is necessary to think about clipping. Full amplitude signals in the boosted bands will clip, and it is necessary to decrease the overall gain of the equalizer -- independent of frequency -- to avoid this. This can be done by changing the GAIN OFFSET (EQ) option in the UTILITY menu.
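A simple way to estimate the offset needed is to take the largest boost among the band settings and reduce the overall gain by that amount. This is only a sketch under that assumption. Adjacent boosted bands overlap, so the true peak gain of the combined response can be somewhat higher than the largest single slider value, and a little extra margin may be wise. The function name is illustrative.

```python
def gain_offset_db(geq_gains_db):
    """Return an overall gain offset (dB, never positive) that cancels the largest boost.

    geq_gains_db is a sequence of per-band gains in dB (positive means boost).
    """
    largest_boost = max(geq_gains_db)
    return -largest_boost if largest_boost > 0 else 0.0


# Made-up band settings: mostly cuts, with one band boosted by 4.5dB.
print(gain_offset_db([-12.0, -3.0, 0.0, 4.5, 2.0]))
```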

Verifying the equalization "works"

In the end, this exercise has "worked" if things sound better when using the equalization than when not using it. Switching the GEQ in and out using the DEQ-2496 BYPASS function leaves me in no doubt that an improvement has been made.

Out of curiosity, and as a further example of what can be done easily using the Python programming language and its supporting libraries (especially Numpy and Matplotlib), I had a go at measuring the room frequency response in more detail. To do that, I made use of the A/D capability of an EMU-0202 USB "soundcard" attached to a computer running the "Audacity" software. The DEQ-2496 was set to generate pink noise, as above, but this time, the output of the measurement microphone was recorded as a WAV file on the computer using "Audacity".

Now, pink noise is random noise whose amplitude and power have a particular dependence on frequency. Namely, the power density (power per Hz) in a pink noise signal declines at a constant rate of 3dB per octave. That is, every time the frequency doubles, the power density halves. Since power is the square of amplitude, the amplitude falls by a factor of sqrt(2) per octave, which is the same 3dB per octave when expressed in decibels. It is easy to use Wikipedia to find out much more about pink noise and its relationship to other noise types, so that is omitted here. The critical thing is that if a frequency analysis of pink noise is done and the resulting spectrum is averaged over frequency ranges corresponding to successive octaves, each octave bandwidth should have the same average power (given a long enough averaging time). Departures from a flat octave (or constant fraction of an octave) bandwidth averaged response reveal the frequency response of the system through which the pink noise has passed. In this case, the loudspeaker -- room -- measurement microphone system.
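The "equal power in every octave" property is easy to check numerically. The sketch below is not how the DEQ-2496 generates its noise. It simply shapes white noise in the frequency domain so that its power density falls as 1/f, then sums the power in a few octave bands. The function names, block length, band centres and random seed are arbitrary choices for the illustration.

```python
import numpy as np


def pink_noise(n_samples, rng):
    """Shape white Gaussian noise so that its power density falls as 1/f."""
    spectrum = np.fft.rfft(rng.standard_normal(n_samples))
    freqs = np.fft.rfftfreq(n_samples)        # in cycles per sample
    scale = np.zeros_like(freqs)              # leaves the DC bin at zero
    scale[1:] = 1.0 / np.sqrt(freqs[1:])      # amplitude ~ 1/sqrt(f), so power ~ 1/f
    return np.fft.irfft(spectrum * scale, n=n_samples)


def octave_band_power_db(signal, sample_rate, centres_hz):
    """Return the total power (dB, arbitrary reference) in each octave band."""
    power = np.abs(np.fft.rfft(signal)) ** 2
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    levels = []
    for fc in centres_hz:
        in_band = (freqs >= fc / np.sqrt(2)) & (freqs < fc * np.sqrt(2))
        levels.append(10 * np.log10(power[in_band].sum()))
    return np.array(levels)


rng = np.random.default_rng(0)
noise = pink_noise(2 ** 18, rng)
centres = [63, 125, 250, 500, 1000, 2000, 4000, 8000]
levels = octave_band_power_db(noise, 44100, centres)
# Deviation of each octave from the mean level; these should all be small.
print(np.round(levels - levels.mean(), 2))
```

Running the same band analysis on white noise instead would show the level rising by about 3dB per octave, since each successive octave is twice as wide.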

If we record a reasonable length of pink noise, as modified by the loudspeakers and room it is played in, we can use Fourier analysis to get the room response. This is what the RTA in the DEQ-2496 does. However, we can get a much finer frequency resolution if we do this in our own software.

In this case, we recorded the noise at 44100 samples per second for around 8 minutes. The frequency resolution depends on how many samples we analyze -- the more samples, the finer the resolution. Here, we analyzed the signal in blocks of 262144 samples (about 5.9 seconds). This gives a frequency bin spacing of about 1/6Hz (44100/262144, or roughly 0.17Hz). A complete recording of about 8 minutes will contain about 80 such blocks. By averaging the spectra found for all of these 80 blocks, reproducible results are obtained and random variations due to the randomness of the pink noise are much reduced (in theory, by around sqrt(80) or about 9 times). The Python program written to do this can be found here. The various graphs below were generated directly by this. Note that all filenames and values (such as the number of sub-octaves) are set in the program code.
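The block averaging described above can be sketched in a few lines of NumPy. This is not the program linked above, just a minimal illustration of the same idea. The function name is made up, and the Hann window applied to each block is an assumption on my part (it reduces leakage between bins).

```python
import numpy as np


def averaged_power_spectrum(samples, sample_rate, block_size=262144):
    """Average the power spectra of consecutive, non-overlapping blocks.

    Returns (freqs_hz, mean_power). Any samples left over after the last
    complete block are ignored.
    """
    n_blocks = len(samples) // block_size
    if n_blocks == 0:
        raise ValueError("recording is shorter than one block")
    window = np.hanning(block_size)           # assumed window choice
    total = np.zeros(block_size // 2 + 1)
    for i in range(n_blocks):
        block = samples[i * block_size:(i + 1) * block_size]
        total += np.abs(np.fft.rfft(block * window)) ** 2
    freqs = np.fft.rfftfreq(block_size, d=1.0 / sample_rate)
    return freqs, total / n_blocks


# Small self-check with a synthetic 1kHz tone rather than a real recording.
rate = 8000
tone = np.sin(2 * np.pi * 1000 * np.arange(8 * 1024) / rate)
freqs, power = averaged_power_spectrum(tone, rate, block_size=1024)
print(freqs[np.argmax(power)])
```

The spacing between frequency bins is sample_rate / block_size, so longer blocks give finer resolution but fewer blocks to average over for a recording of a given length. For a real measurement the samples could come from something like scipy.io.wavfile.read, converted to floating point first.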

Using these methods, let's look first at the "before and after" -- with the DEQ-2496 bypassed and with it in circuit using the final, hand tweaked, GEQ settings shown above. At 1/3rd. octave, we have this:

The "after" curve is very obviously much flatter than "before" -- although hardly ruler flat! What is being hidden by the relatively crude 1/3rd octave averaging though? The next plot shows the whole audio frequency range at 1/12th octave:

The finer the octave subdivisions we look at, the more fine features we are likely to find. At higher frequencies, much of the variation in response is due to comb filtering as reflections from the walls, floor, ceiling and everything else in the room that isn't sound absorbent add up in or out of phase to varying degrees. This comb filtering is extremely sensitive to frequency and position changes and gives rise to huge amounts of fine detail in the frequency response above a few hundred Hertz. For this reason, very fine sub-octave averaging is not very informative at high frequencies. Moving the microphone a few centimeters will completely change the results at very fine resolutions. The "after" curve still stays mostly in a +/-5dB range, however, while "before" occupies a greater than +/-10dB range.
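For reference, here is one way the fractional octave averaging might be done, given a finely resolved power spectrum. It is a sketch only, and the band edges, frequency limits and function name are my assumptions rather than a description of the program used for the plots. The power of all bins in each band is summed (rather than averaged), which is what makes pink noise come out flat.

```python
import numpy as np


def fractional_octave_levels(freqs, power, fraction=3, f_min=20.0, f_max=20000.0):
    """Sum a power spectrum into 1/fraction octave bands.

    Returns (band_centres_hz, band_levels_db). Bands containing no bins are skipped.
    """
    n_bands = int(np.floor(fraction * np.log2(f_max / f_min))) + 1
    centres = f_min * 2.0 ** (np.arange(n_bands) / fraction)
    half_width = 2.0 ** (1.0 / (2 * fraction))   # ratio from centre to band edge
    out_centres, out_db = [], []
    for fc in centres:
        in_band = (freqs >= fc / half_width) & (freqs < fc * half_width)
        if in_band.any():
            out_centres.append(fc)
            out_db.append(10 * np.log10(power[in_band].sum()))
    return np.array(out_centres), np.array(out_db)


# Self-check with an ideal 1/f (pink) power spectrum: the result should be flat.
freqs = np.linspace(0.0, 22050.0, 131073)
power = np.zeros_like(freqs)
power[1:] = 1.0 / freqs[1:]
band_centres, band_db = fractional_octave_levels(freqs, power, fraction=3)
print(np.round(band_db - band_db.mean(), 2))
```

Setting fraction to 12, 24 or 48 gives the finer subdivisions, at the cost of fewer bins per band at the low frequency end.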

At lower frequencies, room resonances are perhaps the dominant mechanism controlling the frequency response and because the wavelengths involved are metres rather than centimeters, there is likely to be less fine detail. This region can be probed at finer resolution with consistent results. Room resonances or modes are caused by standing waves where the echoes from one or more reflective surfaces get back to the source in phase and so reinforce the wave. For a rectangular room, the simplest case gives rise to "axial modes" where the wavelength is twice the distance between two opposing walls. For example, my listening room (known as "our living room" by my wife) is 3.75m X 5.10m X 2.42m. Sound (in air at sea level and room temperature) travels at about 340m/s. We expect "axial modes" at about 45.3Hz, 33.3Hz and 70.2Hz given these dimensions (from v=f*w, where v is velocity, f is frequency and w is wavelength -- which needs to be twice an axial dimension to get a standing wave). More complex analysis gives the following formula for room resonance modes:

$$ f_{p,q,r} = \frac{c}{2} \sqrt{\left(\frac{p}{L}\right)^2 + \left(\frac{q}{W}\right)^2 + \left(\frac{r}{H}\right)^2} $$

Here, c is the velocity of sound, L, W and H are the length, width and height of the room, and p, q and r are integers denoting the mode number. These can usefully be in the range 0 to 4. Modes (1,0,0), (0,1,0) and (0,0,1) correspond to the "axial modes" above. Lower number modes are likely to be the strongest (perhaps). A tiny Python program to calculate these mode frequencies can be found here. Here are the results for my room with mode numbers up to 2:

nick@nick-ThinkPad-T60:~/RoomCheck$ python modes.py --m 2 5.1 3.75 2.42
================================================================
modes.py - Find frequencies of rectangular room resonance modes.
================================================================
f=33.33 df=33.33 m=(1,0,0)
f=45.33 df=12.00 m=(0,1,0)
f=56.27 df=10.94 m=(1,1,0)
f=66.67 df=10.40 m=(2,0,0)
f=70.25 df=3.58 m=(0,0,1)
f=77.76 df=7.51 m=(1,0,1)
f=80.62 df=2.86 m=(2,1,0)
f=83.61 df=2.99 m=(0,1,1)
f=90.01 df=6.40 m=(1,1,1)
f=90.67 df=0.66 m=(0,2,0)
f=96.60 df=5.93 m=(1,2,0)
f=96.85 df=0.25 m=(2,0,1)
f=106.93 df=10.09 m=(2,1,1)
f=112.54 df=5.61 m=(2,2,0)
f=114.70 df=2.16 m=(0,2,1)
f=119.44 df=4.75 m=(1,2,1)
f=132.66 df=13.22 m=(2,2,1)
f=140.50 df=7.83 m=(0,0,2)
f=144.40 df=3.90 m=(1,0,2)
f=147.63 df=3.23 m=(0,1,2)
f=151.35 df=3.72 m=(1,1,2)
f=155.51 df=4.17 m=(2,0,2)
f=161.98 df=6.47 m=(2,1,2)
f=167.21 df=5.23 m=(0,2,2)
f=170.50 df=3.29 m=(1,2,2)
f=180.01 df=9.51 m=(2,2,2)
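For readers who want to try this without downloading anything, the calculation behind the listing above can be sketched as follows. This is not the modes.py program itself, just a minimal version of the same rectangular room formula, f = (c/2) * sqrt((p/L)^2 + (q/W)^2 + (r/H)^2), assuming c = 340m/s. The function name and argument names are illustrative.

```python
import itertools
import math


def room_modes(length, width, height, max_order, c=340.0):
    """Return a sorted list of (frequency_hz, (p, q, r)) for a rectangular room.

    Every combination of mode numbers from 0 to max_order is included,
    apart from (0, 0, 0), which is not a mode.
    """
    modes = []
    for p, q, r in itertools.product(range(max_order + 1), repeat=3):
        if (p, q, r) == (0, 0, 0):
            continue
        f = (c / 2.0) * math.sqrt(
            (p / length) ** 2 + (q / width) ** 2 + (r / height) ** 2
        )
        modes.append((f, (p, q, r)))
    return sorted(modes)


# The five lowest modes of the 5.1m x 3.75m x 2.42m room.
for f, mode in room_modes(5.1, 3.75, 2.42, max_order=2)[:5]:
    print(f"f={f:.2f} m={mode}")
```

With these dimensions the lowest modes come out at 33.33Hz, 45.33Hz, 56.27Hz, 66.67Hz and 70.25Hz, in agreement with the first five lines of the listing.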

What does the low frequency region (below 400Hz) look like in detail? Here are plots at 1/24th and 1/48th octave resolutions:

We can certainly see the peak around 45Hz (nearer 47Hz in the measurements, in fact), but the other predicted modes are not so obvious. Measured peaks seem to be found around 47Hz, 63Hz, 74Hz, 90Hz ("after" only), 115Hz, 125Hz, 150Hz and 190Hz (roughly). This discrepancy is not surprising. The calculations assume very stiff walls with no features such as windows and doors (or anything else) in them. These requirements will almost never be met in practice, unless living in a medieval castle, perhaps. Doors, windows and plasterboard style construction will shift the resonant frequencies and make the peaks broader than they otherwise would be. The "sharpness" of a resonant peak is described by its "Q" value -- 2π times the ratio of energy stored to energy dissipated per cycle. Sharp peaks have high "Q", broad peaks have low "Q". Walls that flex, or otherwise do not perfectly reflect sound, dissipate the energy of the sound wave and hence give lower "Q", broader, resonances. Lower "Q" resonances tend to blend in to one another -- which should make them less audible.

Another instructive experiment is to see how the frequency response changes with the position of the measurement microphone. To do this, I positioned it at points on the (3 seater) sofa corresponding to the left, center and right seats and half way in between. The microphone was roughly where the mid point of the head would be in all these experiments. The plots below show what happened:

The major features largely remain, especially at low frequencies, but there is obviously a lot of change too. That is pretty much what would be expected. Sitting on the left seat, there is a mighty null at about 100Hz! Another experiment, which I won't be doing, might be to see what happens as the microphone is moved out from the wall. That should also have a significant effect at higher frequencies due to strong reflections from the wall immediately behind the sofa. In fact, at high frequencies, just about everything will have an effect!

What is wrong with all this?

Well, this is sort of all OK as far as it goes. As I say, I think fiddling about with the DEQ-2496 has made a really significant improvement to the listening experience in my room. What would apparently be much worse would be to try to measure the room response with spot sine wave frequencies. I must admit, that was my initial thought for how to make room measurements, but it is said to be a very bad idea.

Pink noise, however, is (statistically speaking) a constant, unchanging, signal. Music is not. Another significant factor in how we perceive the frequency response of a room turns out to be the reverberation time of the room. This is a function of frequency. The frequencies with longer reverberation times will be emphasised relative to ones with shorter reverberation times. This effect cannot be measured with pink noise. Ideally, all frequencies would have the same reverberation time. That is, the time taken for a signal at a particular frequency to die away (to some specified level such as -60dB relative to its initial value) after it is turned off should ideally be the same for all frequencies. This is another goal for acoustic treatments of rooms and cannot be easy to achieve. Data on room response incorporating reverberation time measurements is presented as three dimensional waterfall plots, where horizontal is frequency, vertical is power and the third axis (out of the screen as it were) is time after signal termination.

Much more sophisticated signals and analysis software are required to factor in reverberation time effects. Several systems for doing that exist, among them, REW - Room Eq Wizard (cross-platform and free) and FuzzMeasure (for MacOSX only and commercial). Information on the "more sophisticated signals" can be found in the paper "Transfer Function Measurement with Sweeps" by S. Müller and P. Massarani. These more advanced (than pink noise) techniques have many advantages, including much reduced time required to make a measurement.
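To give a flavour of what such a signal looks like, the sketch below generates an exponential (logarithmic) sine sweep, one common type of measurement sweep, in which the frequency rises by the same ratio in each unit of time. It is an illustration only. I am not claiming that this is the exact signal any particular measurement package uses, and the function name and parameter values are my own choices.

```python
import numpy as np


def exponential_sweep(f_start, f_end, duration_s, sample_rate):
    """Return a sine sweep whose frequency rises exponentially from f_start to f_end."""
    n = int(round(duration_s * sample_rate))
    t = np.arange(n) / sample_rate
    rate = np.log(f_end / f_start)
    # The phase is the integral of 2*pi*f(t), with f(t) = f_start * exp(t * rate / duration_s).
    phase = 2 * np.pi * f_start * duration_s / rate * (np.exp(t * rate / duration_s) - 1.0)
    return np.sin(phase)


sweep = exponential_sweep(20.0, 20000.0, duration_s=1.0, sample_rate=44100)
print(len(sweep), "samples")
```

Because every octave gets the same share of the sweep time, such a signal has the same 3dB per octave fall in power density as pink noise.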

I purchased FuzzMeasure and made some measurements with it. Here are some of the results:

Room (no equalization) frequency response measured with a 1 second sine sweep.

This is gratifyingly similar to the frequency response we measured with pink noise. A single measurement can, however, be made much more quickly without losing frequency resolution. In this case, each measurement took 1 second. Measurements can (and should) be averaged for best results, though.

Room (no equalization) "waterfall" frequency dependent decay times

The image above gives information which pink noise cannot -- namely frequency dependent decay times showing the reverberation characteristics of the room. Let's look at the same graphs using the equalization we settled on.

Equalized room frequency response measured with a 1 second sine sweep.

Again, this is rather similar to the pink noise based measurement. Which is a good thing.

Equalized room "waterfall" frequency dependent decay times.

The decay times have not changed much as a result of the equalization. This is very much what we should expect, of course.

Because measurements can be taken very quickly using these more sophisticated techniques, it is possible to manually adjust the GEQ settings on the DEQ-2496 to smooth out the response. The best (flattest) result I could get (so far, anyway) by doing this is as follows:

Manually equalized room frequency response measured with a 1 second sine sweep.

Manually equalized room "waterfall" frequency dependent decay times.

This result is slightly flatter than the AEQ based equalization. However ... I'm not sure it sounds better! I'm not sure I can tell the difference, in fact. And we know the only way to really tell if that is the case, don't we?

Apart from the inability of our equalization to do anything about decay times, there is some reason to think a bit more carefully about the validity of what we are trying to do as a whole.

As mentioned, at higher frequencies (above around 500Hz, say), the sound field is extremely complex with huge variations in amplitude with small changes in position and frequency. It is quite a mess, in fact. This is due to comb filtering, as we have noted. And yet, except with special signals (constant frequency sine tones) and when specifically listening for it, we do not hear this comb filtering at all. At first sight, this is little short of astonishing. True -- reflected versions of waves are attenuated and so will not give anything like the infinitely deep nulls that exactly equal amplitudes would create, and the dimensions of small rooms are such that the frequencies of these nulls start in the "wave-like behaviour" zone where comb filtering doesn't apply. But our experience is still much "smoother" in all sorts of ways than might be expected from measuring the sound pressure field.
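
The point about attenuated reflections can be put into numbers. This is a minimal sketch assuming the simplest possible model: the direct sound plus a single reflection that arrives later and weaker. The 5ms delay and the reflection gain of 0.5 are made-up illustrative values, not measurements from my room.

```python
import cmath
import math

def comb_response_db(freq_hz, delay_s, reflection_gain):
    """Level of direct sound plus one delayed, attenuated reflection, in dB."""
    h = 1.0 + reflection_gain * cmath.exp(-2j * math.pi * freq_hz * delay_s)
    return 20.0 * math.log10(max(abs(h), 1e-12))  # floor avoids log10(0)

def null_frequencies(delay_s, max_freq_hz):
    """Nulls fall where the reflection arrives half a cycle late: (2k+1)/(2*delay)."""
    freqs = []
    k = 0
    while (2 * k + 1) / (2.0 * delay_s) <= max_freq_hz:
        freqs.append((2 * k + 1) / (2.0 * delay_s))
        k += 1
    return freqs

DELAY_S = 0.005          # assumed extra travel time of the reflection
REFLECTION_GAIN = 0.5    # assumed: reflection arrives at half the amplitude

for f in null_frequencies(DELAY_S, 1000.0):
    print(f"{f:7.1f} Hz: {comb_response_db(f, DELAY_S, REFLECTION_GAIN):6.2f} dB")
```

With a reflection at half amplitude the nulls are only about 6dB deep and the peaks about 3.5dB high, a long way from the infinitely deep nulls of the equal amplitude case.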

This should give us pause for thought. In fact, it seems as if the wall reflections (and consequent reverberation) of the kind found in small rooms are consistently heard as beneficial in every experiment that has looked in to this. The goal of attenuating first reflections by acoustical treatment of walls may not be a good idea after all.

So what does this mean for the sort of equalization we have been doing? Perhaps that it only makes sense to adjust the frequency response in the low frequency region. Reducing the big hump around 47Hz is definitely a good thing, for example. Trying to fix problems in the higher frequencies might not be worthwhile, however. Although major humps do tend to be found in about the same places as positions change (according to our measurements above) there is a lot of variation ... and the higher the resolution at which you look, the more variation you see.

Now, if we are only going to address problems at low frequencies, and we have high resolution frequency response measurements (much higher resolution than 1/3rd octave), then the parametric equalizer would seem to be the best tool to use. Parametric equalizers let you independently set the center frequency, the "width" (Q), and the boost/cut at the center frequency. Forget 1/3rd octave graphical equalizers and automatic correction systems based on them and "null out" specific peaks with parametric eq. either manually or, possibly, with a new sort of automated process. It would be possible to do this (manually) with the Behringer DEQ-2496, but I haven't tried it myself.
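
To illustrate the three parametric controls, here is a minimal sketch of a single digital "peaking" filter using the widely published Audio EQ Cookbook formulas (R. Bristow-Johnson). The 47Hz centre frequency comes from the hump mentioned above; the Q of 4, the 9dB cut and the 48kHz sample rate are assumptions picked for illustration, and I make no claim that the DEQ-2496 implements its parametric bands this way.

```python
import cmath
import math

def peaking_eq_coefficients(centre_hz, q, gain_db, sample_rate):
    """Biquad coefficients (b, a), normalised so that a[0] == 1."""
    amp = 10.0 ** (gain_db / 40.0)
    w0 = 2.0 * math.pi * centre_hz / sample_rate
    alpha = math.sin(w0) / (2.0 * q)
    b = [1.0 + alpha * amp, -2.0 * math.cos(w0), 1.0 - alpha * amp]
    a = [1.0 + alpha / amp, -2.0 * math.cos(w0), 1.0 - alpha / amp]
    return [x / a[0] for x in b], [x / a[0] for x in a]

def response_db(b, a, freq_hz, sample_rate):
    """Magnitude response of the biquad at one frequency, in dB."""
    z1 = cmath.exp(-2j * math.pi * freq_hz / sample_rate)  # z^-1 on the unit circle
    numerator = b[0] + b[1] * z1 + b[2] * z1 * z1
    denominator = a[0] + a[1] * z1 + a[2] * z1 * z1
    return 20.0 * math.log10(abs(numerator / denominator))

# Assumed settings: cut a narrow peak at 47 Hz by 9 dB.
b, a = peaking_eq_coefficients(47.0, 4.0, -9.0, 48000)
for f in (20.0, 40.0, 47.0, 55.0, 100.0, 1000.0):
    print(f"{f:7.1f} Hz: {response_db(b, a, f, 48000):6.2f} dB")
```

The filter applies the full cut at the centre frequency and leaves frequencies well away from it essentially untouched, which is exactly what is wanted for nulling out one specific peak.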

Another major spanner in the works (but also a blessed relief at a more fundamental level) is the degree of adaptability of human hearing. It seems that people automatically learn to compensate for the effect a room is having on the sound in it within a few tens of seconds of entering it. They can then "hear through" the room to distinguish quite subtle effects in the source material which the room's acoustical properties do not mask. There are limits to this adaptability, and it doesn't mean that all rooms are completely equal, but this adaptability makes it quite difficult to make reliable subjective judgements about relatively small variations (say, with the equalization settings).

Perhaps there is one rule for room equalization that we need to learn: know when to stop!

Generating Pink Noise

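To generate pink noise, we start (one way or another) with white noise. White noise has uniform power per unit bandwidth (per Hz) at all frequencies. Each octave spans twice the bandwidth of the one below it, so averaged over octave bandwidths, the power of white noise increases with frequency at +3dB per octave (a doubling of power; the amplitude grows by a factor of √2 per octave, which is the same 3dB). Pink noise has equal power in every octave, so white noise must be filtered with a slope of -3dB per octave to produce it.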

In software, the obvious way to generate pink noise is to first generate white noise, then filter it. White noise is simply the output of a uniform random number generator, and filtering could be done most easily in the frequency domain (using FFTs). Other filtering methods obviously exist, but the issue noted below for analog filters applies to digital ones too, so this is not entirely trivial.

Another, more direct, way is to add up white noises with different bandwidth limits in the right proportions. If we pick a random number at every sample time, that is the highest frequency noise. To get lower frequency noises, we can pick a new random number every second, fourth, eighth, etc. sample time, so that each source changes half as often as the one before. Summing multiple white noises which are allowed to change at different rates is, I believe, the basis of the Voss algorithm. There are variations on this. This article is the best I have found on these algorithms.
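
Here is a minimal sketch of the summing approach in its simplest form, in which each successive source is redrawn half as often as the one before and all sources are added with equal weight. The function name, the number of sources ("rows") and the seed are assumptions for illustration, and the result is only approximately pink; the refined variants described in the literature smooth out the spectrum further.

```python
import random

def voss_pink_noise(n_samples, n_rows=8, seed=None):
    """Approximate pink noise: sum of random sources redrawn every 1, 2, 4, ... samples."""
    rng = random.Random(seed)
    rows = [0.0] * n_rows
    output = []
    for i in range(n_samples):
        for k in range(n_rows):
            # Row k holds its value for 2**k samples before being redrawn.
            if i % (1 << k) == 0:
                rows[k] = rng.uniform(-1.0, 1.0)
        output.append(sum(rows) / n_rows)  # scaled to stay within -1..+1
    return output

# One second of noise at 48 kHz; 12 rows cover roughly 12 octaves.
samples = voss_pink_noise(48000, n_rows=12, seed=1)
print(min(samples), max(samples))
```

The slowly changing rows supply the low frequency content, so neighbouring samples are strongly correlated, unlike white noise where each sample is independent of the last.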

Generating pink noise with analog hardware would seem to be problematic, since filters built with capacitors and inductors naturally have 6dB/octave characteristics. It is possible to get a 3dB/octave characteristic though. This extract from the National Semiconductor Audio Handbook (June 1976) shows how. That handbook also includes designs for analog graphical equalizers and a "room equalizing instrument". It is full of good stuff, in fact.

It looks like the MM5837 IC used in that design is still available!

Measuring with Spot Sine Wave Frequencies

As I mentioned, feeding constant amplitude sine waves at a set of spot frequencies (spaced, say, 1/12th of an octave apart) to the loudspeakers and measuring the response with a microphone at the listening position seems the most straightforward way of measuring a room response. After all, this is the sort of thing that works well with electronic devices (with a voltmeter in place of a microphone).
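
For reference, spot frequencies spaced 1/12th of an octave apart can be listed with a few lines of code. This is a minimal sketch; the function name and the 20Hz to 20480Hz range (exactly ten octaves) are assumptions for illustration and not necessarily the range I measured over.

```python
def spot_frequencies(f_start, f_end, steps_per_octave=12):
    """Frequencies from f_start up to f_end in equal fractions of an octave."""
    freqs = []
    n = 0
    while True:
        f = f_start * 2.0 ** (n / steps_per_octave)
        if f > f_end * (1.0 + 1e-9):  # small tolerance for rounding at the top end
            break
        freqs.append(f)
        n += 1
    return freqs

frequencies = spot_frequencies(20.0, 20480.0)
print(len(frequencies), "spot frequencies, from",
      round(frequencies[0], 1), "Hz to", round(frequencies[-1], 1), "Hz")
```

Each frequency is a fixed ratio (the twelfth root of two, about 1.059) above the previous one, so the points are evenly spaced on the logarithmic frequency axis normally used for response plots.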

However, many sites on the internet state that such a method is inappropriate, except, perhaps, for "highly damped" rooms. How bad can it really be? Since I happen to have the equipment and software to hand to make automated measurements of this kind, I thought I would find out ...

I used my HP 3325A synthesised function generator and HP 8903E distortion analyzer and level meter controlled by a Python program which used my GPIBlib library. The photo below shows the setup in action:

Plotting the accumulated results actually shows quite similar results to the other two methods we have used. Which is comforting. But ... yes ... the results are "peakier". Where room modes are concerned, we will only be exciting one at a time (or none). Maybe that is the reason.

Well ... enquiring minds wanted to know!

While I was about it, I tried making measurements of the distortion (i.e. THD+N) from each of the driver units in both of my speakers. This came out consistently at about 0.6% +/- 0.1% using 75Hz, 300Hz and 4kHz and with the microphone positioned close to each driver so as to pick up its direct signal. Probably. I think this is more or less what I would expect -- and, actually, rather lower than it might be. The sound levels involved were not excessive. Around 85dB SPL at the microphone position. The measurement method is questionable, however. The microphone was certainly in the near field of the speaker and without taking many measurements at different positions, near field measurements are of dubious value.

I also wouldn't be at all sure about the relative contributions of the speaker and the microphone (and????). The HP-3325A generates signals with 0.05% distortion (I checked) and it isn't a significant factor in this. There is no analysis of the different harmonic components ... I just measured a single overall number. However, it may be a valid data point.
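
A quick bit of arithmetic shows why the generator's own distortion can be ignored. This is a minimal sketch which assumes the two contributions are uncorrelated and therefore add as powers (root-sum-square); that assumption may not hold exactly for harmonic distortion, so treat the corrected figure as indicative only. The function names are made up for illustration.

```python
import math

def percent_to_db(percent):
    """Express a distortion percentage as dB relative to the fundamental."""
    return 20.0 * math.log10(percent / 100.0)

def remove_source_contribution(measured_percent, source_percent):
    """Subtract the source's share, assuming uncorrelated (power) addition."""
    return math.sqrt(measured_percent ** 2 - source_percent ** 2)

measured = 0.6   # % THD+N seen at the microphone
generator = 0.05 # % distortion of the signal generator
print(f"measured:  {percent_to_db(measured):6.1f} dB")
print(f"generator: {percent_to_db(generator):6.1f} dB")
print(f"corrected: {remove_source_contribution(measured, generator):.3f} %")
```

The generator sits more than 20dB below the measured figure, and removing it changes 0.6% to just under 0.598%, far less than the +/- 0.1% scatter in the measurements.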

Any Conclusions?

This web page has grown to be far too long ... that is one firm conclusion! As mentioned before, after some considerable reading around the subject of loudspeakers in small rooms, it is clear that real experts do not always agree with one another. Some things which do seem to be more or less agreed on (and which have major practical implications) are:

Loudspeakers should have a flat on-axis frequency response. The flatter the better. No surprise there, of course.
Loudspeakers should also have the flattest possible off-axis frequency response too. This is because the off-axis output will be heard via reflections from the walls and the reflected sound should be spectrally neutral just like the direct sound.
Loudspeakers should be wide dispersion designs. That is, they should radiate sound as uniformly as possible over the fullest possible range of angles. The best sense of "envelopment" and stereo imaging is gained when they do this.
Accepting that the above characteristics of loudspeakers are of major importance (along with the room they are used in, very probably the biggest influence on how an audio system sounds, in fact), detailed information on the on-axis and off-axis frequency response and dispersion characteristics of loudspeakers is essential when choosing them. Of course, it isn't necessarily in the interests of manufacturers to publish this! Without it though, there is no really good basis for picking a loudspeaker.
There are no magical room dimensions that will give better sound than others. Nor will non-rectangular rooms magically fix things. All rooms will have standing wave issues at some frequencies in the bass region.
Positioning loudspeakers to the exact inch (or millimeter) is, to put it mildly, unnecessary! Roughly right is quite good enough.
If possible, loudspeakers should be positioned far enough from walls that the first reflections are delayed by more than 6ms, otherwise they are perceived as "smearing" rather than separate events that are (probably) beneficial to a sense of spaciousness. This means more than 1 metre from any wall. This is often not feasible.
If possible, there should be extensive free space behind the listener, again to ensure reflections are delayed (in this case, the more delay the better). This is also often not feasible. Treating a wall close behind the listener with absorbing material that actually works (above 300Hz or so) is worthwhile if a decent amount of free space is not there.
The loudspeaker+room system will have major frequency response issues and those in the bass region should be addressed somehow or other.
Although measured frequency response curves almost always look absolutely dreadful, issues above the bass region sound nowhere near as bad as the curves would suggest they might. Any room in which conversations can be comfortably held will be a decent room in which to listen to music.
Adding the sort of furnishings, carpeting and curtains usually found in domestic living rooms will generally result in a room that is comfortable for conversing in (and hence for music listening). The reverberation times of such rooms are in the 0.3s to 0.5s range. The same rooms without furnishings may be up in the 1s to 2s range and that would be bad for both conversation and music. This is good news, of course, because it means that "normal rooms" are basically OK as "listening rooms".
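
The link between the 6ms figure and the "more than 1 metre" rule of thumb in the list above is simple arithmetic. This is a minimal sketch assuming a speed of sound of 343m/s (air at roughly 20 degrees C) and the simplest geometry, in which sound reflected from a wall behind the speaker travels to the wall and back, i.e. about twice the speaker-to-wall distance further than the direct sound. Real room geometry (side walls, listener off axis) will give somewhat different numbers.

```python
SPEED_OF_SOUND_M_PER_S = 343.0  # assumed: air at about 20 degrees C

def extra_path_for_delay(delay_s):
    """Extra distance a reflection must travel to arrive delay_s late."""
    return SPEED_OF_SOUND_M_PER_S * delay_s

def min_wall_distance(delay_s):
    """Speaker-to-wall distance, assuming the reflection goes to the wall and back."""
    return extra_path_for_delay(delay_s) / 2.0

print(f"extra path for 6 ms:   {extra_path_for_delay(0.006):.2f} m")
print(f"minimum wall distance: {min_wall_distance(0.006):.2f} m")
```

A 6ms delay corresponds to about 2 metres of extra path, which is where the figure of roughly 1 metre from the wall comes from.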

Beyond this, agreement amongst experts tails off ... Points of some disagreement include:

Is equalization (of any kind) an effective solution to room frequency response problems? Is the only proper solution acoustic treatments applied to the room to change its characteristics?
- The argument against equalization is that it can only be effective at a single point in space. Even the distance between two ears is enough to invalidate the correction at higher frequencies.
- The argument for equalization is that it at least improves matters in the bass region over a useful region of space -- not just a point. It may not be perfect, but it is a lot better than nothing.
I am firmly of the opinion that equalization is useful to tame bass response issues, if for nothing else. It is worth doing. It may well be that acoustic devices would give superior results, but they are often impractical in living rooms (my opinion). Active "sound field management" may be even better, but I'm not sure if any such systems are commercially available. Multiple sub-woofers (typically four) seem to be required, which pose practical positioning problems. They seem to be a complicated and expensive solution (although that may change, of course), albeit perhaps the best.
Should walls be treated to attenuate reflections above the bass region? Is the goal of this to create an "acoustically dead" room in which the reverberations in the recording are reproduced without the replay room characteristics impinging on them? There seems to be a great deal of evidence that "dead rooms" sound much worse than normally reverberant rooms. There is logic to the idea of attenuating replay room reflections so that "recording hall sound" is reproduced correctly ... but in reality that just doesn't seem to work from what I can gather.
Is stereo or surround sound the best system for listening to music? Here, the most highly qualified experts simply disagree. It would seem that stereo has a fundamental flaw as far as the centre phantom image is concerned: because of the separation of the ears and the shadowing effect of the head, there is a 6dB dip in the frequency response heard at about 2kHz! This can be measured using a mannequin head with microphones in the ear canals and it can be heard as a dulling of pink noise at the exact location of the stereo "sweet spot". This is another reason why a normally reflective room is better than one with attenuating acoustic treatment above the bass region: the reflections fill in this 2kHz dip. It may also be a reason to prefer systems with a real center loudspeaker rather than stereo. I've never owned a surround sound system or listened to one at any length, so I have no real opinion on this. It seems to me that:
- Surround sound systems must, in principle, have more potential for "envelopment" and a generally fuller listening experience than 2 channel systems.
- If your main interest is in watching movies, choosing a surround sound system is a "no brainer". You need one!
- How often is music (as opposed to movies) recorded specifically for a surround sound system? I don't know for sure, but I'm not aware that this is done very often.
- So much about surround systems seems ad hoc to me. Why 5.1? Why 7.1? Well, the answers come from how cinema sound developed and that is fair enough as far as it goes. But how would a 7.1 mix (say) be created for a recording of a symphony orchestra? There would seem to be an infinite number of possible ways starting from a close miked multi-track recording. It must all be down to artistic judgement by the mastering engineer(s), presumably. Which may also be fair enough ... but ... I'm not sure.
- There are many different ways of replaying stereo recordings on surround sound systems. The process of automagically converting stereo to surround "on-the-fly" at replay is called "upmixing". Well, apparently some versions of this can work well, at least on some stereo material. The whole thing seems very questionable to me though.
- There is something appealingly pure about the idea of recording an acoustic performance with a pair of microphones and then reproducing it with a pair of loudspeakers in the "classic stereo" fashion. The problem, though, is that this is almost never how stereo recordings are made! The original recordings are almost always made with many microphones positioned close to performers and recorded on a multi-track recording system. The stereo recording we listen to is made in mix-down just as much as a 5.1 or 7.1 surround recording is. Now, it may well be that some sorts of acoustic music should be made with a simple two microphone system ... but they very rarely are.
- Where do you put seven speakers and a sub-woofer (or more) in a normal living room? I don't know, and they surely wouldn't fit in mine! Surround sound seems to almost require a dedicated listening room. Building the speakers in to the room walls seems to be the best answer to this ... but that is hardly easy to do with an existing living room -- although it would be possible. Trying it might be very helpful for divorce lawyers though.
The future must belong to surround sound, although I'm not sure the point will ever come when I personally feel the need to move beyond stereo (given the practical issues involved).

Finally ...

Looking seriously in to the details of sound recording and reproduction eventually leads you to think hard about what we are actually trying to do. Back in the day, advertisements for the "QUAD" brand of audio equipment always contained the phrase "... the closest approach to the original sound...". This made complete sense to me at the time and still sounds great today. From an "objectivist" point of view this surely sums up what an audio system should be all about. Concise, clear and memorable. But, in reality, it often doesn't make much sense at all ...

To begin with, our current systems do not and cannot record and reproduce a physical sound field over a significant volume of space. Nor is it likely that any future system will do this fully successfully. It has been done for scientific purposes in very limited circumstances and with great difficulty.

The nearest thing to that in conventional audio is probably Ambisonics. Although "conventional" may not be the appropriate word. This does attempt to capture the 3D sound field at a single point in space and then to reproduce this with 4 or more loudspeakers. This is a fascinating system and much more information on it is available here. It has a very firm theoretical basis, but the theory doesn't apply in the presence of room reflections and is strictly valid only for a single point. Consequently, listening to an Ambisonics recording should presumably be done in a totally dead room -- an anechoic chamber. For these reasons, its practicality is doubtful, although it remains a fascinating option. It may be more robust in practice than the theory behind it suggests and it may yet play an important role in the future. I hope so, because it has the potential to be a "pure form" of surround sound recording while other surround systems inherently involve ad hoc artistic decisions at the mixing stage (which may be done by very talented people and give excellent results). There is something about the possibility of making a "direct recording" that appeals to people like me, however misguided that may be. This possibility also exists with "two microphone" stereo, which is why that appeals too, in spite of its limitations. Of course, "stereo" recordings are almost never made in such a "pure form", as noted below.

Perhaps the purest practical system of sound recording and reproduction is binaural recording and headphone reproduction. Binaural recording uses an anatomically correct mannequin head with microphones in the ear canals. In an extreme version, it could even be a cast of your actual head so the shape of the external ears, etc., exactly match your own. This undoubtedly works very well if the limitations of headphone listening (and the practical issues of making the recordings) are accepted. Note that "in ear" headphones (in the ear canal) are needed to avoid the outer ear response being incorrectly applied twice. Furthermore, if you are not recording with a model of your own head, you will experience someone else's outer ear response (or that of some generic mannequin). Apparently you get used to this quickly, so it isn't a fatal objection. Both the problem of the "wrong" outer ear response and the "double" outer ear response can be "undone" by measuring your actual response and applying signal processing to make appropriate corrections. You can then use "over ear" headphones (which I for one find a lot more comfortable than the "in ear" types). Most people would conclude binaural recording and reproduction isn't exactly ideal in the real world, though.

Real world sound systems simply do not record or reproduce anything close to the physical sound field you experienced in the concert hall or at the gig. The shocking implication of this is that what you experience when listening to recorded music is an imaginative reconstruction of an acoustic event carried out by your ears and brain involving higher cognitive functions that are far from fully understood (to say the least). That is if there ever was an actual acoustic event in the first place, as we shall see. As an "objectivist" I don't much like this, but it is the reality of the situation. You could still say that the goal is to evoke an imaginative reconstruction that is the closest approach to what you remember of the original sound! Not quite as compelling as the "QUAD" tag line, is it?

The next shocking thing to realize is that very often there is no original sound! In any sort of pure form the idea is only relevant to acoustic instruments played live and recorded with a minimal number of microphones (e.g. two for stereo). What is the original sound of a multi-track recording? The only logical answer to this is what the mastering engineer hears in the control room while he/she is creating the stereo (5.1, 7.1 ...) mix. As Dr. Toole points out, it is ironic that there are no standards for the listening environment at this mix-down stage, and that the environments in question: (a) are often very different from one another and (b) bear no resemblance to a domestic listening environment! In any event, you have no idea what the mixing engineer heard and it is highly likely that what you get from your loudspeakers is significantly different from whatever that may have been anyway. This really makes you think. Multi-track recording with close miking became the usual way of doing things in the late 1960's, starting with non-classical music (where multi-tracking is close to universal). Almost every recording -- of classical music or otherwise -- is made with multi-tracking and this has been the case for a very long time.

Beyond that, there are many recordings which have significant elements that are not acoustic at all and simply don't exist outside the realm of audio electronics. This includes things from 1950's musique concrete, through to sampling and all manner of modern electronica. There is nothing "original" (as far as actual sound in air goes) to "closely approach" here. The recording is all there is.

So what is the goal of a sound reproduction system then? As an "objectivist" it galls me to admit that ultimately it must just be to make a pleasant, entertaining noise. The most pleasant, entertaining and enjoyable noise possible given practical considerations, of course. Our current systems can be very good indeed at doing that. The irony here is that, under the right circumstances, very modest audio systems can be very satisfying. Highly sophisticated systems are not necessarily needed to gain great pleasure from reproduced music. In fact, it may be that part of the frustration and endless fiddling about with equipment that many have noted as "audiophile" traits are due to grasping for a kind of "perfection" that always remains tantalizingly just out of reach. High quality systems may be good enough to make people think that actual "perfection" is possible -- if only we just changed something a little bit. Lower quality systems clearly are never going to be perfect and hence don't provoke such behaviour. Instead, we just enjoy the music when listening to them.