Many are shocked when they first encounter the illustration on page 43 of Audio Perfectionist Journal #12 because that illustration shows the actual frequency range of musical notes and conflicts with their preconceptions.
The range of human hearing is commonly assumed to be 20Hz to 20kHz (a ratio of 1,000:1), and many believe that low notes are toward the bottom of this range and high notes are toward the top, with midrange notes falling somewhere in the middle. Because speakers have woofers, midrange drivers and tweeters, it seems logical to conclude that woofers reproduce low notes, tweeters reproduce high notes and midrange drivers handle the rest. But this isn’t how music works and it certainly isn’t how speakers work.
A musical note has three distinguishing characteristics: pitch, intensity and timbre. A note’s pitch is determined by the frequency of its fundamental. The intensity of the note depends on how loudly it is played. Timbre is determined by harmonic overtones added to or subtracted from the fundamental depending on phase relationships.
High-fidelity loudspeakers try to reproduce these characteristics accurately so that they sound like real music. Multiple drivers are necessary in order to reproduce the entire frequency range, but much of that range (in fact, everything above 4,186Hz) is made up solely of harmonics or sounds other than musical notes.
If you refer to the illustration mentioned above you’ll find that the piano keyboard spans the frequency range of all the instruments in the symphony orchestra. You may be startled to learn that the lowest note found there is an “A” at 27.5Hz. The highest note is a “C” at 4,186Hz. All the fundamental frequencies of all the notes that all the instruments can play fall between these frequencies. Middle “C” is at 261.63Hz and the “A” above, which is commonly used for tuning, is 440Hz. Does that mean that our speakers really don’t need tweeters?
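The keyboard frequencies quoted above can be checked with a few lines of arithmetic. This is a minimal sketch that assumes twelve-tone equal temperament with the tuning “A” at 440Hz; real pianos are usually tuned with some stretch, so measured values will differ slightly. The function name and the key-numbering scheme (1 for the lowest “A”, 88 for the highest “C”) are illustrative choices, not something taken from the Journal.

```python
def piano_key_frequency(key: int, a4_hz: float = 440.0) -> float:
    """Equal-tempered fundamental of an 88-key piano key.

    Key 1 is the lowest A, key 40 is middle C, key 49 is the tuning A,
    and key 88 is the highest C. Each semitone multiplies the frequency
    by the twelfth root of two.
    """
    if not 1 <= key <= 88:
        raise ValueError("key must be in the range 1..88")
    return a4_hz * 2.0 ** ((key - 49) / 12.0)


if __name__ == "__main__":
    landmarks = [("lowest A", 1), ("middle C", 40), ("tuning A", 49), ("highest C", 88)]
    for name, key in landmarks:
        print(f"{name:>10}: {piano_key_frequency(key):8.2f} Hz")
```

Under these assumptions the whole keyboard covers a little over seven octaves, while the range from 4,186Hz up to 20kHz adds barely more than two octaves, and no fundamentals live there.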
The answer depends on what you want to accomplish. You can detect the tune over the telephone, which has extremely limited bandwidth, but you may be hard-pressed to distinguish what instrument is playing. The characteristic sound of each instrument—its timbre—is established by the complex waveform it produces. The complex waveform (made up of many frequencies) contains the fundamental frequency of the note plus harmonic “overtones” which may extend to the limits of human hearing and beyond.
A tenor, an alto and a soprano can all sing the “A” above middle “C” at 440Hz. A violin, a cello and a viola can all play this same note. Even if the pitch and intensity of each is the same they’ll sound different because each instrument and voice has a unique timbre. This unique timbre can be observed on the screen of an oscilloscope as a complex (made up of many frequencies) waveform. Change the waveform and you’ll change the timbre. (That’s what voice-disguising scramblers do.)
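To see what “same pitch, different waveform” means in practice, here is a rough sketch that builds two tones on the same 440Hz fundamental from different mixes of harmonics. The overtone recipes are invented for illustration and are not measurements of any real instrument or voice; the 44,000Hz sample rate is likewise an assumption, chosen only so that one cycle of 440Hz lands on exactly 100 samples.

```python
import math

SAMPLE_RATE_HZ = 44000   # assumed; makes one 440 Hz cycle exactly 100 samples
FUNDAMENTAL_HZ = 440.0


def synthesize(harmonic_amps, n_samples=300):
    """Sum sine harmonics; harmonic_amps[k] is the amplitude of harmonic k+1."""
    samples = []
    for n in range(n_samples):
        t = n / SAMPLE_RATE_HZ
        value = sum(
            amp * math.sin(2.0 * math.pi * (k + 1) * FUNDAMENTAL_HZ * t)
            for k, amp in enumerate(harmonic_amps)
        )
        samples.append(value)
    return samples


# Hypothetical overtone recipes (fundamental first), purely illustrative.
voice_a = synthesize([1.0, 0.5, 0.25])
voice_b = synthesize([1.0, 0.0, 0.3, 0.0, 0.2])

# Both waveforms repeat every 100 samples (same pitch) ...
period = int(SAMPLE_RATE_HZ / FUNDAMENTAL_HZ)
# ... yet their shapes differ sample by sample (different timbre).
max_difference = max(abs(a - b) for a, b in zip(voice_a, voice_b))

if __name__ == "__main__":
    print("period in samples:", period)
    print("largest difference between the two waveforms:", round(max_difference, 3))
```

Both lists repeat at the same rate, which is why the ear hears the same note, but an oscilloscope would show two visibly different traces.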
With one exception all the devices used to record music and play recordings have minimal impact on complex waveforms. Loudspeakers are the exception and the reason is simple: electrical dividing networks, commonly called crossovers. Other components in the recording and playback chain handle the entire waveform at once. Loudspeakers break up the waveform into frequency bands, which are handled separately.
Even the highest-quality loudspeaker drive element can provide optimum performance only over a limited range of frequencies. Low frequency drivers must be large and powerful, high frequency drivers must be small and fast, and drivers optimized for middle frequencies must be capable of delivering articulate and uncolored midrange frequencies with low distortion and wide dispersion. Several drivers are necessary in order to cover the entire frequency range that a full-range speaker is expected to reproduce.
An electrical dividing network, commonly called a crossover, is used to separate the frequency spectrum into appropriate sections so that each drive element is presented with only those frequencies that it can reproduce optimally. Crossovers are usually designed to blend the outputs from all drive elements to provide the flattest frequency response. Unfortunately, the reactive components used to create crossover networks cause frequency-dependent phase shift. Crossovers with steeper rates of attenuation, like 12dB/octave, 18dB/octave, and 24dB/octave, use more reactive components, store and release more energy, and produce more phase shift. Because crossover-induced phase shift is frequency dependent it cannot be compensated for by simply staggering drive elements.
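The way phase shift grows with crossover slope can be put into numbers. The sketch below assumes Butterworth filter alignments, which the text does not specify; other alignments give different figures. It computes the phase of the low-pass section at the crossover frequency for first- through fourth-order filters from the textbook Butterworth pole positions. The function name is hypothetical.

```python
import cmath
import math


def butterworth_lowpass_phase_deg(order: int, freq_ratio: float) -> float:
    """Unwrapped phase, in degrees, of a Butterworth low-pass at f/fc = freq_ratio.

    The normalized poles sit on the unit circle in the left half-plane.
    Each factor (s - pole) has a positive real part, so its angle stays
    between -90 and +90 degrees and the sum needs no unwrapping.
    """
    s = 1j * freq_ratio
    total_deg = 0.0
    for k in range(1, order + 1):
        pole = cmath.exp(1j * math.pi * (2 * k + order - 1) / (2 * order))
        total_deg -= math.degrees(cmath.phase(s - pole))
    return total_deg


if __name__ == "__main__":
    for order in (1, 2, 3, 4):
        lag = butterworth_lowpass_phase_deg(order, 1.0)
        print(f"{6 * order:2d} dB/octave: {lag:7.1f} degrees at the crossover frequency")
```

With this alignment each added order costs another 45 degrees of lag at the crossover point, and the matching high-pass section leads by the same amount, so the two drivers end up 90 degrees apart per filter order. The amount of shift also changes with frequency, which is the reason physically staggering the drivers cannot undo it.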
In fact, typical third-order crossover networks produce so much phase shift that drivers handling adjacent frequency bands, like the woofer and midrange driver or midrange driver and tweeter, may cancel each other in the range of frequencies where their outputs overlap. To prevent suck-outs in frequency response it is common for speakers with third-order (18dB/octave) crossovers to have midrange driver elements wired out-of-phase from woofers and tweeters. This improves linearity in the frequency domain at the expense of accuracy in the time domain. Crossover phase shift brings the drivers back in phase in the overlap regions but they are now out-of-phase in their respective pass bands.
Regardless of what you might have read elsewhere, this is not good time-domain performance. It is compromised time-domain performance for the sake of improved frequency-domain performance. What’s wrong with this approach? If the midrange driver is out-of-phase with the woofer and the tweeter some harmonics will be out-of-phase with some fundamentals. Let me be even more specific.
Suppose a piano sounds middle “C” at 261.63Hz. The woofer in a typical 3-way loudspeaker would reproduce this fundamental. Harmonic overtones would occur at multiples of the fundamental frequency so the midrange driver would reproduce overtones up to about 17 times the fundamental. Additional overtones (if there are any) would be reproduced by the tweeter. If the midrange driver is out-of-phase with the woofer, frequencies reproduced by the midrange driver that should add to the fundamental (reproduced by the woofer) would instead subtract and frequencies that should subtract would instead add. Instruments richer in high frequency content would span the midrange-to-tweeter range of frequencies with similar results. This is not a recipe for the accurate reproduction of timbre. Complex waveforms will be altered. Timbre will be altered. Imaging will be degraded. This is not good time-domain performance. Period!
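Here is a rough numerical version of that example. The two crossover frequencies are assumptions picked so that the midrange driver covers overtones up to roughly the seventeenth; they do not describe any particular speaker. The crossovers are treated as abrupt hand-offs, and the polarity reversal as a simple sign flip with no added phase shift, which is a considerable simplification of a real network.

```python
import math

MIDDLE_C_HZ = 261.63

# Hypothetical crossover points for an illustrative 3-way speaker.
WOOFER_TO_MID_HZ = 350.0
MID_TO_TWEETER_HZ = 4500.0


def driver_for(freq_hz: float) -> str:
    """Name the driver that handles a frequency, assuming abrupt crossovers."""
    if freq_hz < WOOFER_TO_MID_HZ:
        return "woofer"
    if freq_hz < MID_TO_TWEETER_HZ:
        return "midrange"
    return "tweeter"


# Which driver reproduces each of the first 20 harmonics of middle C?
assignment = {n: driver_for(n * MIDDLE_C_HZ) for n in range(1, 21)}
highest_midrange_harmonic = max(n for n, d in assignment.items() if d == "midrange")


def waveform_peak(midrange_sign: float, steps: int = 3600) -> float:
    """Peak of fundamental + sign * (third harmonic at one-third amplitude).

    The fundamental stands in for the woofer's output and the third
    harmonic for the midrange driver's output.
    """
    return max(
        abs(math.sin(x) + midrange_sign * math.sin(3.0 * x) / 3.0)
        for x in (2.0 * math.pi * i / steps for i in range(steps))
    )


if __name__ == "__main__":
    print("highest harmonic on the midrange driver:", highest_midrange_harmonic)
    print("peak, midrange in phase:  ", round(waveform_peak(+1.0), 3))
    print("peak, midrange inverted:  ", round(waveform_peak(-1.0), 3))
```

The harmonic levels are identical in both cases, so a frequency response plot would not tell them apart, yet the waveform peak moves from about 0.94 to about 1.33 of the fundamental’s amplitude once the overtone is inverted. That is the sense in which the waveform is altered.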
How can you tell?
Amplifiers can be objectively measured and so can speakers. Audio Perfectionist Journal #13 is all about speaker tests and how to interpret them.
The step response test shows how a loudspeaker responds to a stimulus containing many frequencies all at once. If the output from the speaker under test is anything other than a positive (upward) triangle-shaped plot (as shown in the ideal example in APJ#13), some drivers push while others pull.
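A toy model shows why an inverted driver leaves a negative mark on the step response. The sketch assumes an idealized two-way speaker with a first-order crossover at a hypothetical 3,000Hz and perfect drivers with no bandwidth limits of their own. Because of that last assumption the in-phase result is a flat step rather than the triangle a real speaker produces; the point here is only the sign of the output.

```python
import math


def step_response(t_s: float, crossover_hz: float, tweeter_polarity: int = 1) -> float:
    """Summed output of an idealized first-order two-way crossover to a unit step."""
    tau = 1.0 / (2.0 * math.pi * crossover_hz)
    woofer = 1.0 - math.exp(-t_s / tau)                # low-pass branch rises toward 1
    tweeter = tweeter_polarity * math.exp(-t_s / tau)  # high-pass branch decays from 1
    return woofer + tweeter


CROSSOVER_HZ = 3000.0                       # assumed crossover frequency
times_s = [i * 1e-5 for i in range(101)]    # 0 to 1 ms in 10 microsecond steps

in_phase = [step_response(t, CROSSOVER_HZ, +1) for t in times_s]
inverted = [step_response(t, CROSSOVER_HZ, -1) for t in times_s]

if __name__ == "__main__":
    for i in (0, 2, 5, 10, 100):
        print(f"t = {times_s[i] * 1e6:6.0f} us  in phase: {in_phase[i]:6.3f}  "
              f"tweeter inverted: {inverted[i]:6.3f}")
```

With both branches in the same polarity the sum stays positive from the first instant. With the tweeter inverted the output starts out negative and only later swings positive: one driver pulls while the other pushes.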
There’s an old saying in racing: “when the green flag drops, the bullshit stops.” In racing, the guy who crosses the finish line first wins. In loudspeakers a design is either time- and phase-accurate or it’s not. If some part of the output signal is negative in response to a positive input stimulus, it’s not.