Echo Cancellation Demystified
The problem is that if the two codecs are clocked at slightly different rates (say both nominally sample at 8 kHz, but the rates are not exactly equal because each codec is clocked from its own quartz oscillator), then we can't simply take each sample from one codec, process it, and pass it to the other codec. Eventually, the sampling rate difference will lead either to samples accumulating somewhere in the sample buffers or to sample depletion, i.e. there will be nothing to take out of a buffer when a sample is needed.
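To get a feel for the time scale involved, the drift can be worked out directly. The sketch below is only an illustration: the 100 ppm clock mismatch and the 80-sample buffer margin are assumed figures, not values from any particular codec or oscillator datasheet.

```python
def drift_samples_per_second(nominal_hz, ppm_offset):
    """Samples gained (or lost) per second for a given clock mismatch in ppm."""
    return nominal_hz * ppm_offset / 1e6


def seconds_until_slip(buffer_margin_samples, nominal_hz, ppm_offset):
    """Time until a buffer with the given spare margin overflows or runs dry."""
    rate = abs(drift_samples_per_second(nominal_hz, ppm_offset))
    if rate == 0:
        return float("inf")  # perfectly matched clocks never slip
    return buffer_margin_samples / rate


# Assumed example: two nominally 8 kHz codecs whose clocks differ by 100 ppm.
drift = drift_samples_per_second(8000, 100)   # about 0.8 samples per second
time_left = seconds_until_slip(80, 8000, 100)  # about 100 seconds
print(f"{drift:.2f} samples/s, buffer slips after {time_left:.0f} s")
```

Under these assumptions the buffers slip roughly every couple of minutes, so the problem cannot be ignored even for short calls.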
The first solution is to choose the codecs such that they're clocked from the same clock source, i.e. the same quartz oscillator. This is the best solution, and with a little provision at the hardware design stage the problem can be eliminated completely. Even if the specifics of the application do not allow the same codecs to be used in both places, it is still better to have a single clock source for both, because this makes it possible to use sample rate conversion with a constant upsampling and downsampling ratio, with no synchronization issues.
But if codec synchronization via a common clock source cannot be achieved (as is the case with ISDN phones, where the data rate is unrelated to the codec clock), then a different solution is needed.
Engineers are often tempted to solve this problem using one of the following approaches:
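When both codecs share one clock, the conversion ratio is an exact rational number that never changes. As a small sketch (the function name `conversion_factors` is made up for this illustration), the integer upsampling and downsampling factors can be derived like this, using 8 kHz and 9.6 kHz as an example pair of rates:

```python
from fractions import Fraction


def conversion_factors(rate_in_hz, rate_out_hz):
    """Return (up, down): the integer upsampling and downsampling factors."""
    ratio = Fraction(rate_out_hz, rate_in_hz)  # reduced to lowest terms
    return ratio.numerator, ratio.denominator


up, down = conversion_factors(8000, 9600)
print(up, down)  # 6 5: upsample by 6, then downsample by 5
```

Because both rates derive from the same oscillator, these factors stay exact for the life of the connection and no buffer ever drifts.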
- Continuously tuning the codec's sampling rate
- Dropping samples received from the codec and repeating samples to be sent when there's nothing to send
But both our experience and logical reasoning show that these approaches are wrong, because they fail to solve the problem they're supposed to solve. Here's why.
The first approach is not viable because it introduces additional nonlinear distortion into the echo path and also effectively changes the echo path delay. The second is not viable because it changes the echo path delay abruptly. Changing the echo path reduces the quality of echo cancellation and can even force the echo canceller to diverge if the residual echo error becomes too big. The worst case is the double-talk situation, i.e. when both the near-end and far-end talker signals are present. In such situations the echo canceller usually doesn't adapt the filter coefficients, or adapts them very slowly. If the echo path remains constant during double-talk, the echo canceller performs well, but if the echo path changes, the echo canceller will not be able to adapt to the change and it will diverge. So, if we want the echo cancellers to operate, we can't use either of these non-solutions.
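The abrupt delay change caused by dropping a sample is easy to reproduce. The following is a toy sketch, not a model of a real line: the echo is assumed to be a pure 5-sample delay of a random test signal, and `best_lag` is a helper written for this illustration that estimates the delay by brute-force correlation.

```python
import random


def best_lag(ref, sig, max_lag):
    """Return the lag d in [0, max_lag] that maximizes sum(ref[n] * sig[n + d])."""
    best, best_score = 0, float("-inf")
    for d in range(max_lag + 1):
        n = min(len(ref), len(sig) - d)
        score = sum(ref[i] * sig[i + d] for i in range(n))
        if score > best_score:
            best, best_score = d, score
    return best


rng = random.Random(0)
far_end = [rng.choice((-1.0, 1.0)) for _ in range(400)]  # test signal

ECHO_DELAY = 5   # assumed echo path: a pure delay of 5 samples
echo = [0.0] * ECHO_DELAY + far_end[:-ECHO_DELAY]

DROP_AT = 200    # one received sample is dropped here to "fix" the clock drift
slipped = echo[:DROP_AT] + echo[DROP_AT + 1:]

# Estimate the echo delay before and after the dropped sample.
delay_before = best_lag(far_end[:150], slipped[:DROP_AT], 10)
delay_after = best_lag(far_end[DROP_AT:350], slipped[DROP_AT:], 10)
print(delay_before, delay_after)  # 5 4
```

From the canceller's point of view the echo path has jumped by one sample in a single step, so filter coefficients that were converged for a delay of 5 no longer match the path.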
A solution to this problem is an adaptive sample rate converter, or simply an adaptive interpolator. It must be placed between the codecs (or between the codec and the ISDN interface). In fact, two of them are needed, one for each signal direction. The interpolator should initially be tuned to upsample or downsample from one frequency to the other if they're known to be different (for example, 8 kHz and 9.6 kHz), so that it starts from the nominal conversion ratio. As time goes on, the actual rate at which each codec transfers samples can be observed. The difference between the rates can be used as feedback to adapt the interpolator to the actual ratio of the sampling rates.
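The feedback idea can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a production design: the class name `AdaptiveResampler`, the FIFO fill level used as the rate-difference measurement, the proportional gain, and the linear interpolation (a stand-in for a proper band-limited interpolation filter) are all choices made for this example.

```python
class AdaptiveResampler:
    """Fractional resampler whose step tracks the measured rate difference."""

    def __init__(self, nominal_ratio=1.0, target_fill=32, gain=1e-4):
        self.nominal = nominal_ratio    # input samples per output sample
        self.ratio = nominal_ratio      # current, adapted ratio
        self.target_fill = target_fill  # desired FIFO depth in samples
        self.gain = gain                # proportional feedback gain (assumed)
        self.fifo = []
        self.phase = 0.0                # fractional read position in the FIFO

    def push(self, samples):
        """Store samples arriving from the input-side codec."""
        self.fifo.extend(samples)

    def pull(self):
        """Return one output sample, or None on underrun."""
        whole = int(self.phase)
        if len(self.fifo) < whole + 2:
            return None
        del self.fifo[:whole]           # discard samples already passed
        self.phase -= whole
        a, b = self.fifo[0], self.fifo[1]
        y = a + (b - a) * self.phase    # linear interpolation
        # Feedback: a FIFO fuller than the target means input is arriving
        # faster than assumed, so consume slightly more input per output.
        error = len(self.fifo) - self.target_fill
        self.ratio = self.nominal + self.gain * error
        self.phase += self.ratio
        return y


def simulate(gain, ticks=100_000, input_per_output=1.0005, target_fill=32):
    """Feed input 500 ppm fast; return (final ratio, final fill, underruns)."""
    rs = AdaptiveResampler(1.0, target_fill, gain)
    rs.push([0.0] * target_fill)
    acc, underruns = 0.0, 0
    for _ in range(ticks):
        acc += input_per_output
        n = int(acc)                    # whole input samples that arrived
        acc -= n
        rs.push([0.0] * n)
        if rs.pull() is None:
            underruns += 1
    return rs.ratio, len(rs.fifo), underruns


print(simulate(gain=0.0))   # fixed ratio: the FIFO keeps growing
print(simulate(gain=1e-4))  # adaptive: ratio settles near 1.0005
```

With the gain set to zero the converter behaves like a fixed-ratio one and its buffer grows without bound, whereas with feedback the ratio drifts smoothly toward the true one and the buffer depth settles a few samples above the target. For known different rates such as 8 kHz to 9.6 kHz, `nominal_ratio` would start at 8000/9600 instead of 1.0. One such converter would be needed per signal direction.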
Kane Computing Ltd
7 Theatre Court, London Road, Northwich, Cheshire, CW9 5HB, UK
Tel: +44(0)1606 351006 - Fax: +44(0)1606 351007/8
Email: firstname.lastname@example.org - Web: www.kanecomputing.com