A Compilation of 50+ Blind Tests Reveals the Only Gear Upgrade That Actually Matters

These blind test results haven’t changed since 1977 and the industry still ignores them.

One product category scored 97% while everything else landed at coin-flip.

Somebody spent 14 years cataloging more than 50 blind listening tests, and the results should have ended the argument by now. The Head-Fi.org compilation stretches from 1977 to 2024 and covers eight equipment categories.

The protocol is standardized. Listeners hear two products, then an unknown sample. Score above 50% consistently and you’ve proven a real audible difference. Score at 50% and you’ve proven a coin flip.
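
To make "above 50% consistently" concrete, here is a minimal sketch of the usual one-sided binomial check for a single ABX run. The function name `abx_p_value` and the 12-of-16 score are illustrative assumptions, not figures from the compilation.

```python
from math import comb

def abx_p_value(correct: int, trials: int) -> float:
    """Chance of scoring at least `correct` out of `trials` by pure guessing."""
    if not 0 <= correct <= trials:
        raise ValueError("correct must be between 0 and trials")
    # Under the null hypothesis every ABX trial is an independent 50/50 guess,
    # so each of the 2**trials answer sequences is equally likely.
    favourable = sum(comb(trials, k) for k in range(correct, trials + 1))
    return favourable / 2 ** trials

# Hypothetical listener: 12 correct answers in 16 trials.
p = abx_p_value(12, 16)
print(f"p = {p:.4f}")
```

A small p-value means guessing alone rarely produces a score that high. A score near half the trials gives a p-value close to or above 0.5, which is the coin-flip outcome described above.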

Category after category, the percentage lands in the same devastating range, except for one.

Cable Blind Test Results

In 2004, a group at Audioholics wired up Monster 1000 speaker cables alongside coat hangers stripped and bent into makeshift connectors. They ran five blind tests on a high-end system, with nobody in the room knowing which cable was playing.

“After 5 tests, none could determine which was Monster… impossible to determine which sounded best majority of time,” the testers reported.

This coat hanger test became audiophile folklore, but it was not an outlier.

The compilation pulls together 17 blind tests conducted over several decades and across multiple countries, comparing everything from inexpensive hardware-store wire to cables costing tens of thousands of dollars. Of these, 14 found no audible difference, one was inconclusive, and only one produced a clear pass, with just three listeners.

The one remaining “pass” came from a six-year-old making six comparisons, which makes it difficult to treat as meaningful evidence.
Results from a 1991 engineering study comparing 12 speaker cables (From: Audio Engineering Society)

That same pattern appears in more focused engineering and listening studies:

Roughly 82% of these cable tests (14 of 17) failed to show an audible difference. That held whether bargain-bin wire faced five-figure cables, and whether the protocol was strict ABX or a looser blind listening trial.

Amplifier Blind Test Results

Richard Clark offered $10,000 to anyone who could tell two amplifiers apart in a controlled ABX test. The challenge ran from the early 1990s to roughly 2006.

Winning required two sets of 12 correct identifications. Clark matched volumes to within fractions of a decibel and kept both amplifiers operating in their linear range, below clipping. Listeners got as many attempts as they wanted.

They had to prove they could hear a difference, not merely believe one existed.

But of the approximately 2,000 participants who took the test, none won.
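
For a sense of scale, the sketch below estimates how unlikely it is to win such a challenge by guessing. It assumes each trial is an independent 50/50 guess and, as a simplification, that each of the roughly 2,000 participants made exactly one scored attempt at 24 straight correct answers. The function names are illustrative.

```python
def chance_of_streak(correct_in_a_row: int) -> float:
    """Probability of guessing every one of N two-way trials correctly."""
    return 0.5 ** correct_in_a_row

def chance_anyone_wins(participants: int, per_person: float) -> float:
    """Probability that at least one of several independent guessers succeeds."""
    return 1.0 - (1.0 - per_person) ** participants

# Two sets of 12 correct identifications = 24 correct in a row.
single = chance_of_streak(24)
# Assumption: about 2,000 participants, one scored attempt each.
crowd = chance_anyone_wins(2000, single)

print(f"one guesser: {single:.2e}")
print(f"anyone in the crowd: {crowd:.2e}")
```

Under these assumptions a winner by luck alone would be very unlikely, so a payout would have been strong evidence of a real audible difference. Repeated attempts per person would raise the odds somewhat.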

In 1985, Stereophile challenged Bob Carver to make a $400 amplifier indistinguishable from a Conrad-Johnson Premier Five worth more than $6,000. Stereophile gave him 48 hours. Using null-difference testing, Carver modified his amp until the output matched the Conrad-Johnson’s.

Again, the editors failed every blind comparison in their own room, on their own speakers.

Then, at a 1989 Stereophile ABX session, 505 listeners tried to identify amplifiers. Their scores formed a bell curve centered on random chance. A few listeners beat the average, but in a sample that large, probability alone guarantees that some will.

“There is bugger all between the 2 preamps, they were so close that any difference could not be reliably picked,” an Australian tester concluded after a 2008 Stereo.net blind test.

Across all 12 amplifier blind tests in the compilation, roughly 75% produced clear failures. The single pass involved fundamentally different amplifier architectures.

DAC Blind Test Results

A $30 FiiO DAC and a $3,000 Forssell sounded identical in 2017 ABX trials run by a DIYAudio forum group. Nobody heard a difference.

Results from Archimago’s 2024 DAC blind listening survey, where a $10 Apple USB-C dongle ranked alongside far more expensive DACs. (From: Archimago)

Archimago ran a larger version in 2024, surveying 105 self-selected audiophiles over six weeks. The lineup spanned a 2,000-to-one price gap, from a $10 Apple USB-C dongle to the $20,000 Linn Klimax, with every participant listening to the same 24-bit/96kHz FLAC files through their own systems.

Forty-three percent said the upgrade wasn’t worthwhile.

The most revealing result, however, came from listeners who owned systems costing more than $10,000. They ranked the Apple dongle above the Klimax.

Archimago noted that the large sample size at least gave the test enough statistical power to be meaningful, even if the results didn’t favor expensive gear.

“Nice demonstration that with enough audiophile listeners, we can achieve ‘power’ to detect differences using blind testing,” he wrote.

The Pile Keeps Growing

Test setup from a 2004 ABX blind experiment on power cords (From: Home Theater Hifi)

The same pattern shows up in several other product categories. Individually, these areas have fewer tests than cables or amplifiers. But together, they point in the same direction: once the comparison is blind, expensive add-ons and boutique formats usually collapse back toward chance.

  • Power cords: In 2004, listeners scored 49% accuracy across 149 ABX trials, almost exactly what random guessing would produce.
  • HDMI cables: Across four blind tests, there were zero passes, and in every round the cheaper cables outscored the expensive ones. Which? magazine reported the same result in 2021, finding no meaningful performance difference between a £10 HDMI cable and a £100 one.
  • High-resolution physical formats: In the Meyer-Moran AES study, 60 listeners completed 554 double-blind trials comparing standard CD audio with higher-resolution formats such as SACD and DVD-Audio. The study found that listeners could not reliably tell them apart.
  • Fringe products: The Boston Audio Society tested green ink applied to CD edges, Armor-All sprayed on disc surfaces, and other CD tweaks. Each product promised to improve readability and sound quality. Across the full battery, accuracy came back at 48.3%.

These categories do not look like isolated exceptions. Power cords, HDMI cables, high-resolution disc formats, and CD tweaks all land in the same narrow band near coin-flip performance.
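
One way to check whether a score such as the power-cord result is "near coin-flip" is to put a confidence interval around it. The sketch below uses the simple normal-approximation (Wald) interval. The count of 73 correct answers is an assumption, back-calculated from 49% of 149 trials, and `proportion_ci` is an illustrative name.

```python
from math import sqrt

def proportion_ci(correct: int, trials: int, z: float = 1.96) -> tuple[float, float]:
    """Approximate 95% confidence interval for a success rate (Wald method)."""
    p = correct / trials
    # Standard error of a binomial proportion, scaled by the z-score.
    half_width = z * sqrt(p * (1 - p) / trials)
    return p - half_width, p + half_width

# Assumption: 49% of 149 ABX trials corresponds to 73 correct answers.
low, high = proportion_ci(73, 149)
print(f"95% CI: {low:.3f} to {high:.3f}")
print("consistent with guessing" if low <= 0.5 <= high else "differs from guessing")
```

If the interval contains 0.5, the data cannot distinguish the listeners from guessers. Exact or Wilson intervals are better behaved for small samples, but the conclusion here should not depend on that choice.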

Then Speakers Broke the Pattern

The usual objection is that blind tests are flawed by design. They create stress, increase fatigue, and suppress the subtle differences listeners would notice in normal use.

Some critics also argue that blind protocols encourage null results by introducing bias of their own or by relying on test structures that are statistically unbalanced.

But if blind testing is fundamentally broken, it should break for everything.

In the compilation, five speaker blind tests all produced clear passes. So under the same general kind of protocols that cables and amplifiers routinely fail, speakers repeatedly stood out as audibly different.

According to the ABX Comparator data cited in the compilation, roughly 97% of participants correctly identified which speaker was playing.

The reason is physical, not mysterious.

Speakers produce large frequency-response differences, often on the order of 3 to 10 dB or more across the audible range.

Competent amplifiers typically differ by less than 0.1 dB. Cables differ by even less. Meanwhile, speakers are the component that actually turns an electrical signal into acoustic output. So, changes in driver design, cabinet tuning, dispersion, and bass performance are far more likely to create audible differences than electronics upstream that already measure near-identically.
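
To see how far apart those decibel figures are, the short sketch below converts them to linear ratios using the standard definitions (20·log10 for amplitude, 10·log10 for power). The function names are illustrative, and the sample values are the ones quoted above.

```python
def db_to_amplitude_ratio(db: float) -> float:
    """Convert a level difference in dB to a pressure or voltage ratio."""
    return 10 ** (db / 20)

def db_to_power_ratio(db: float) -> float:
    """Convert a level difference in dB to a power ratio."""
    return 10 ** (db / 10)

# 0.1 dB: typical amplifier difference; 3 and 10 dB: typical speaker differences.
for db in (0.1, 3.0, 10.0):
    amp = db_to_amplitude_ratio(db)
    pwr = db_to_power_ratio(db)
    print(f"{db:>4} dB -> amplitude x{amp:.3f}, power x{pwr:.3f}")
```

A 0.1 dB difference changes amplitude by only about 1%, while a 10 dB difference more than triples it and multiplies power tenfold.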

In fact, Floyd Toole’s work at Harman International showed that speaker measurements strongly predict listener preference, with a correlation of 0.86. And when bass performance is controlled, that correlation rises to 0.995.

Headphones follow a similar pattern: AES Paper 9878 reported a measurement-to-preference correlation of 0.91 across 30 earphones and 71 listeners.

The pattern is not that blind tests erase audible differences. It is that they separate large, repeatable differences from imagined ones. Speakers and headphones keep passing. Most upstream electronics and tweaks do not.

Why You Hear What You See

The differences people report are often real to them. They just are not always caused by sound.

For instance, a HiFi News blind test played identical signals through the same amplifier and asked listeners whether they heard a difference. Even though nothing had changed acoustically, 35% said yes.

Harman International tested the same effect more rigorously in 1994. Forty employees evaluated four loudspeakers twice (once sighted and once blind).

When listeners could see the speakers, they consistently rated the larger, more expensive models higher. But once the visual cues disappeared, so did the bias. Trained and casual listeners were both affected.

More recently, at CES 2004, Wilson Audio played music through their flagship speakers and asked attendees to evaluate the source. The audience praised what they believed was a $20,000 CD player. It was an iPod.

Blind tests do not erase audible differences. They only remove the non-acoustic cues that shape perception. The ears still work. The problem is that the eyes, the price tag, and the story around the product often speak first.

💬 Conversation: 10 comments

  1. What about the room? Acoustic treatment should also be included. A bit more difficult to set up but well worth it. Too many audiophiles miss the one thing that could result in more improvement than anything else. Please discuss.

  2. Clear back in the late 1970’s, in my early days as an audiophile, I had reached the conclusion, with several others, that the most important elements in sound quality were the speakers and the source material. Everything else was marginal, although under less than ideal conditions, differences could be noted, such as shielded cables could be helpful in locations with lots of EM interference fields, and gold plated switch contacts could prevent noise or signal loss after years of exposure that could cause corrosion in cheaper units, etc. There were still reasons to buy quality gear, such as less deterioration after years of use, and better phono cartridges using lighter tracking weights reduced vinyl wear, better tape heads could produce better recordings, and better amplifiers allowing more headroom without clipping, which reduced speaker damage and distortion at high volumes, etc. But there was certainly a steep curve in the cost/benefit scale, after going beyond cheap gear to reasonably good quality gear.

  3. Thanks for the article, I enjoyed it. Speakers making the biggest difference correlates with what I’ve found when upgrading components and being honest with myself.

  4. I play in several bands, certain professional bass speakers were better than early models due to improved technology on magnets and using aluminium wires. In home stereo situations I found Rotel was the best amplifiers. Used JBL brand speakers plus their mid range ones too. I selected good horn/tweeters. I never had good hearing as inherited bad hearing from father’s side. But I still maintain that vinyl records had that warmer tone as well as valve amps.

  5. I’m sure I know what the answer will be, but I’ve always wanted to see a blind test between speakers with those fancy feet (IsoAcoustics, etc.) and without. People swear by them and I’ve listened to videos that show a difference, but I’ve never heard them in person. Some have said the difference is due to the height it adds to the location of the tweeter. If anyone knows of such a test, let me know. If they help, I’d be willing to get them, but they are so expensive and I would hate to throw my money away for no actual benefit.

  6. This is only an example of a population test of who can hear well and who cannot. The mean response would represent those who have good hearing and those who attended loud rock concerts without hearing protection. It’s like showing a painting to a population of half color blind people and half color enabled. What does this prove? I don’t care what people hear who no longer have high frequency hearing.

  7. On the Richard Clark amp challenge, it was basically a simple scientific test where all variables between amps were eliminated so that one variable could be tested. Most people thought the test was to prove all amps were the same, but what was actually being tested was the listener’s hearing.

    Both amps in the test were tested beforehand on an Audio Precision analyzer, and any frequency response differences, input or output voltages, phase changes, and any other differences between the amps were adjusted using equalization and other methods so that both amps had identical output. Some amp designs, like SET amps, were banned from the test because their outputs could not be made identical to other amp designs. Given the 10-20 pre-test listening rounds needed to prove your hearing was good enough for the $10,000 money round, and the actual 10-12 part money round (where one had to get every round correct to get the money), most listeners’ hearing was shot before the money round began.

    I know of one person who went through the pre-tests and the money round and got every round 100% correct. He beat the Amp Challenge, but Clark refused to pay out the money because afterwards Clark retested both amps and found a 0.4 dB difference in output.

    Clark is correct in saying under his test conditions no one could hear a difference between amps. But his amp challenge test was so rigged against the listener, it was not a fair A/B/X test. In real life use, differences between amps can be heard, but this test eliminated all differences.

  8. I’ve gone back and forth on this subject for a while (like I’m sure most of us have) and was sold on the ‘speaker cables don’t matter’ until one day, for kicks, I told chatGPT about my system and it recommended switching my speaker cable from my Kimber to my Silver Sonics. I put on my mofi copy of The Nightfly and noticed no difference, UNTIL a textural rhythm guitar part kicked in – with the SilverSonics, the texture was much clearer, with the Kimber it was more muted/smeared, etc. My point is only that the difference between cables would’ve been impossible to identify if it weren’t for this one particular sound. This experience called into question every blind test I’ve read. For example, if you test two amplifiers playing Mahler LOUD, and one has tiny caps and the other $$$$ big caps, will a blind test on amps fail then? I’d love to see a blind test where someone who identifies a difference then blind tests to see if others hear a difference as well.

  9. We need to know the participants’ experience in every study. Are they noobs or experienced in hifi? Many people who are not experienced will actually choose the worse sounding gear – because they like/think the “bright” sound is more detailed, or boomy bass because it’s more pronounced… things like that. The test results are fine if you’re studying the public in general, but NOT if you are assessing people’s ability to discern top quality gear.
