A sinusoidal sound wave (actually any sinusoidal wave) is fully described by three parameters: amplitude, frequency and phase angle. When two sound waves of the same frequency combine, we use vector addition. The magnitude of each vector is the amplitude, and the angle between the two is the phase difference. A sound wave in a room gets reflected by different surfaces many times and combines with itself many times, and each time the result is a vector (phase-dependent) sum. For example, when bass leaves the speaker, half of it goes back and half forward; the part going back is reflected by the wall and recombines with the other half of itself, but with a phase delay. For at least one frequency the extra path travelled by the reflected wave will exactly equal half a wavelength (that is, a phase delay of 180 degrees, which is the same as adding an inverted copy of the wave), and this causes near-complete cancellation, or a NULL, at that frequency (for complete cancellation the two waves would have to be of EXACTLY the same amplitude). For all other frequencies, whether the reflected wave adds to or subtracts from the direct one depends on the relative phase angle, which is different for every frequency because different frequencies have different wavelengths while the physical distance from your speaker to the wall is constant. You can see, then, that bass response can be nothing but a rollercoaster.
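To put some numbers on the vector sum, here is a minimal sketch in Python. It is only an illustration under simplifying assumptions that are mine, not measurements: a single reflection from the wall behind the speaker, a listener straight in front, a speed of sound of 343 m/s, and a reflected wave whose amplitude relative to the direct wave is a free parameter (`reflection`). The function names and the 1 m wall distance are made up for the example.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at room temperature


def combined_amplitude(freq_hz, wall_distance_m, reflection=1.0):
    """Amplitude of direct + wall-reflected wave, with the direct wave = 1.

    The reflected wave travels an extra 2 * wall_distance_m, so it lags the
    direct wave by 2*pi * extra_path / wavelength radians.
    """
    wavelength = SPEED_OF_SOUND / freq_hz
    phase = 2.0 * math.pi * (2.0 * wall_distance_m) / wavelength
    direct = complex(1.0, 0.0)
    reflected = reflection * cmath.exp(-1j * phase)
    return abs(direct + reflected)  # vector (phasor) sum


def first_null_hz(wall_distance_m):
    """Lowest frequency at which the extra path equals half a wavelength."""
    return SPEED_OF_SOUND / (4.0 * wall_distance_m)


if __name__ == "__main__":
    d = 1.0  # m, hypothetical speaker-to-wall distance
    print(f"first null near {first_null_hz(d):.1f} Hz")
    for f in (20, 40, 60, 86, 120, 172):
        amp = combined_amplitude(f, d, reflection=0.8)
        print(f"{f:4d} Hz  {20 * math.log10(amp):+6.1f} dB")
```

With `reflection=1.0` the null is total; with anything less, the two vectors no longer cancel exactly and the dip is only partial, which is the "near-complete cancellation" described above.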

This is a good time to take a break and think about why we do not have this problem in other frequency ranges such as midrange or treble. The reason is that at those frequencies the distance from the speaker to the wall cannot really be expressed as a single phase angle. The wavelengths are so short that, from one point on the wall to the next, a small change in distance (to the speaker, or to your ear) results in a large phase difference. The phase angle of the reflected sound is therefore different for different parts of the wall, and on average the response comes out smooth. The trouble begins at about 150 - 200 hertz and below.
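A quick calculation shows how differently bass and treble react to the same small change in path length. This is a rough sketch only: the 10 cm path difference is a hypothetical figure I picked, and 343 m/s is an assumed speed of sound.

```python
SPEED_OF_SOUND = 343.0  # m/s, assumed


def phase_shift_degrees(freq_hz, path_difference_m):
    """Phase shift in degrees caused by a given extra path length."""
    wavelength = SPEED_OF_SOUND / freq_hz
    return 360.0 * path_difference_m / wavelength


if __name__ == "__main__":
    extra_path = 0.10  # m, hypothetical difference between two points on the wall
    for f in (50, 200, 1000, 5000):
        print(f"{f:5d} Hz: {phase_shift_degrees(f, extra_path):7.1f} degrees")
```

At 50 Hz the 10 cm barely moves the phase, so the whole wall reflects more or less in step. At 5000 Hz the same 10 cm is worth more than a full cycle, so neighbouring patches of wall arrive at every possible phase and average out.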

It is bad enough that you have all the walls, the floor and the ceiling, and that the sound reflected from each of these surfaces combines vectorially with the direct sound, but in fact the situation is much worse. After the sound is reflected once and reaches another surface, it gets reflected again, and again and again and again. Thus at some frequencies you will get "standing waves." A standing wave is a wave endlessly reflected back upon itself but in the opposite direction - note I did not say opposite phase, but opposite direction. You CANNOT in fact define a single relative phase for two waves at all points if they're not traveling in the same direction - what you get instead (in a standing wave) is that the relative phase of the waves traveling back and forth is different at every point in space, but constant with time. Thus you get regions of reinforcement (antinodes) and regions of cancellation (nodes) spread all over your room. Why does this make the situation so much worse? Because it becomes impossible to equalize the bass response flat for more than a single point in the room. You can place a microphone at a certain point, measure the response and equalize it flat, then take two steps in any direction and the response there will be nowhere near flat again.
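The position dependence is easy to see in the simplest textbook case. The sketch below assumes an idealised one-dimensional room: two perfectly rigid, lossless parallel walls and nothing else. In that case the standing waves sit at multiples of c / (2 * L), and the pressure amplitude along the room follows a cosine. The 5 m dimension and the function names are hypothetical; a real room has three dimensions, flexing walls and losses, so treat this as a picture of the mechanism and not as a prediction.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed


def axial_mode_hz(n, length_m):
    """Frequency of the n-th standing wave between two rigid parallel walls."""
    return n * SPEED_OF_SOUND / (2.0 * length_m)


def relative_pressure(n, x_m, length_m):
    """Pressure amplitude of mode n at position x, relative to the wall (1.0)."""
    return abs(math.cos(n * math.pi * x_m / length_m))


if __name__ == "__main__":
    length = 5.0  # m, hypothetical room dimension
    positions = (0.0, 1.25, 2.5, 3.75, 5.0)
    for n in (1, 2, 3):
        levels = ", ".join(
            f"{relative_pressure(n, x, length):.2f}" for x in positions
        )
        print(f"mode {n}: {axial_mode_hz(n, length):5.1f} Hz  |p| along room: {levels}")
```

In this model the first mode has a pressure minimum exactly in the middle of the room, while the second mode has a pressure maximum at the same spot. An equalizer setting that flattens both at the middle of the room will be wrong a metre or so away.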

Let's summarize briefly, then. We have anomalies due to reflections from the back wall and the like, which determine how well the speaker couples to the room at a given frequency; these effects will be felt across the entire room, and they can be equalized successfully if you're equipped for the job. But then you also have the standing waves, which affect different points in the room differently, and you can only equalize them for a single location, such as where your head is.

There are even more problems, though. Standing waves store energy. Energy storage "smears" the signal in the time domain, so a transient that was supposed to be short and sharp can become long, and not even a transient at all but an oscillation. And then it gets worse yet again. By equalizing these standing waves you can achieve a flat response for the total of stored and direct sound energy, but because energy is not stored equally at all frequencies you will in fact create hills and valleys in the response of both the direct and the stored energy when they are considered separately.

Now you should understand why people use bass traps. Bass traps try to fight the standing waves. We will not discuss bass traps here though, because this topic is too large.

Finally, the walls, floor and ceiling all flex under the bass and thus both store and absorb bass energy, not just reflect it - but the effects of reflection are stronger.

You should have guessed by now that it is impossible to predict room gain with any degree of accuracy; it's best to just use a parametric EQ after the speaker is built. A graphic equalizer, even a 31-band one, is not well suited for this job. The fear of equalizers in the signal path that most people have is unfounded - the signal is fed through an equalizer many times during mixing and mastering of the recording anyway; maybe in a few years the studios will be equipped to perform all operations and transfer data losslessly, strictly in the digital domain, but it has not happened yet.
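For readers who have not met a parametric EQ: each band is a single filter with three knobs, namely center frequency, gain and Q (narrowness). The sketch below shows one common digital form, the peaking biquad from Robert Bristow-Johnson's widely circulated "Audio EQ Cookbook." This is my choice of illustration and not the only way to build such a filter; the 48 kHz sample rate and the example band (45 Hz, -6 dB, Q of 4) are hypothetical values.

```python
import cmath
import math


def peaking_biquad(f0_hz, gain_db, q, sample_rate_hz=48000.0):
    """Coefficients (b, a) of a peaking EQ biquad, normalised so a[0] == 1."""
    amp = 10.0 ** (gain_db / 40.0)
    w0 = 2.0 * math.pi * f0_hz / sample_rate_hz
    alpha = math.sin(w0) / (2.0 * q)
    b = [1.0 + alpha * amp, -2.0 * math.cos(w0), 1.0 - alpha * amp]
    a = [1.0 + alpha / amp, -2.0 * math.cos(w0), 1.0 - alpha / amp]
    return [c / a[0] for c in b], [c / a[0] for c in a]


def response_db(b, a, freq_hz, sample_rate_hz=48000.0):
    """Magnitude response of the biquad at one frequency, in dB."""
    z1 = cmath.exp(-1j * 2.0 * math.pi * freq_hz / sample_rate_hz)  # z^-1
    num = b[0] + b[1] * z1 + b[2] * z1 * z1
    den = a[0] + a[1] * z1 + a[2] * z1 * z1
    return 20.0 * math.log10(abs(num / den))


if __name__ == "__main__":
    # Hypothetical band: pull down a 45 Hz room peak by 6 dB with a narrow filter.
    b, a = peaking_biquad(45.0, -6.0, 4.0)
    for f in (20, 35, 45, 60, 100, 1000):
        print(f"{f:5d} Hz  {response_db(b, a, f):+6.2f} dB")
```

Because the center frequency and Q are free, the filter can be set right on top of a measured peak and made just as wide as that peak. A graphic equalizer has fixed centers and fixed widths, which is why it is a poor fit for the job.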

Many people will give you advice such as "in a small room use a sealed speaker." They thus assume that they can guess at the room's transfer function and have the speaker match its inverse; you should see that this logic is flawed.

Dipole and cardioid speakers interact with the room differently, but this topic is already treated on another website, namely linkwitzlab.com. The only thing I want to add to the treatment given there is that Linkwitz is a little biased towards dipoles; he has not convinced me that dipoles are much better than cardioids, for example.