Okay, the posts from yesterday indicate that some clarification is still needed. First, the point has never been that there's no difference between amplifiers. I almost always list the requirements for audibly flat frequency response and inaudibly low noise and distortion when this comes up. If an amplifier doesn't meet one or more of these requirements, it will sound different. However, modern audio technology makes audibly transparent amplification the norm, even in moderately priced receivers. Independent lab results show that these units almost uniformly deviate from perfect performance by only extremely small amounts.
I studied that 1989 Stereophile blind test that's linked quite a few years ago, and it's quite instructive. It's curious, however, to see it cited as evidence of audible differences when the tests failed to show statistically significant results. Stereophile selected a solid-state ($750) and a tube ($4900) amplifier "as different in design as possible", apparently anticipating positive results. The 505 participants in a same/different single-blind test returned 52.3% correct responses. This may not seem very impressive, but because of the large number of trials it would be statistically significant if compared against 50% guessing. However, as Dr. Carlson and the two following letters in Part 4 point out, there was a bias toward voting "different" (62% of the trials with the same amplifier were voted different!), and since about 54% of the trials actually used different amplifiers (rather than the 50/50 split a valid same/different test requires), this skewed the results and dropped them to a statistically insignificant figure.
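To see why both halves of that argument matter, here's a quick sketch of the arithmetic. The total trial count is an assumption on my part: 505 listeners times 7 judgments each (the 7 is inferred from the 7/7 perfect scores mentioned in the report), giving 3535 trials.

```python
import math

# Figures from the post; 7 trials per listener is my assumption,
# inferred from the 7/7 perfect scores mentioned in the report.
n_trials = 505 * 7          # = 3535 total same/different judgments
p_correct = 0.523           # reported 52.3% correct overall

# Naive z-test against a 50% guessing null (normal approximation)
z = (p_correct - 0.5) * math.sqrt(n_trials) / 0.5
print(f"z = {z:.2f}")       # comfortably above 1.96, i.e. "significant" vs. 50%

# But the 50% null is wrong: about 54% of trials used different amplifiers,
# so a listener who simply always votes "different" scores ~54% correct
# while hearing nothing at all.
always_different_accuracy = 0.54
print(always_different_accuracy > p_correct)  # the bias alone beats 52.3%
```

In other words, against the correct baseline (what a biased guesser achieves given the actual trial mix), 52.3% is nothing special, which is exactly the point Dr. Carlson's letter makes.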
Although the overall results were yet another failure to show audibility, some apparently claim that the fact that a few participants scored extremely well (6 of the 505 scored 7/7) "proved" that they heard real differences. That isn't how the statistical significance of blind tests works, however: in any such test with a large number of participants, a small number will return very high scores purely by chance (just as some coin flippers will get 9 or 10 heads out of 10). If time is available, the high scorers should be given supplemental tests to check whether the earlier results are reliable. An example of this is found in this
paper, where in sections 4 and 5 a participant who showed significant initial results failed to reproduce them in supplemental testing.
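The "6 of 505 scored 7/7" claim can be checked directly with the numbers given in the post. Under pure guessing, each listener has a 1-in-128 chance of going 7/7, so among 505 guessers we'd already expect about four perfect scores, and six is well within normal chance variation:

```python
import math

# Numbers from the post: 505 listeners, 6 of them scored a perfect 7/7.
n = 505
p_perfect = 0.5 ** 7                 # chance of 7/7 by guessing = 1/128

expected_perfect = n * p_perfect     # expected perfect scorers under pure chance

# Exact binomial tail: probability that 6 or more of 505 guessers go 7/7
p_at_least_6 = 1 - sum(
    math.comb(n, k) * p_perfect**k * (1 - p_perfect)**(n - k)
    for k in range(6)
)
print(f"expected perfect scorers: {expected_perfect:.1f}")
print(f"P(>= 6 perfect by chance): {p_at_least_6:.2f}")
```

About a one-in-five probability of seeing six or more perfect scores from guessing alone, which is why the high scorers prove nothing by themselves and supplemental testing is needed.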