By: Nathan Blundy, NCT (Europe) Ltd
It is estimated that over a third of the world’s population spend time talking on mobile phones. The quality of that communication experience very much depends on the clarity of the voice that is heard at each end of the conversation. In an increasingly noisy world, it is easy to see that noise reduction and echo cancellation play a major role in helping mobile phone developers meet the expectations of this massive and demanding customer base.
Even away from this huge consumer-driven market, people such as the emergency services and Formula 1 drivers depend on communication systems that can deliver clear voice communication even in the most extreme environments. This article discusses some of the issues faced by communication equipment developers and a few of the solutions that are applied to help them deliver a noise- and echo-free world.
Noise and echo can have a big influence over the performance of communication systems. For a voice-activated telematics system, a recognition accuracy degraded to 90% by automotive noise will render it useless for dialling a 10-digit phone number. In a fire engine, poor communication with the control room due to the screaming of the siren can result in a slower or confused response, potentially resulting in tragedy. For a Formula 1 driver and the pit crew, engine noise could easily wipe out their chance of a podium finish if there were a misunderstanding regarding strategy. Although it is taken for granted by most of us, technology for handling noise and echo is essential for many of today’s communication systems.
Good System Design – An Important Start
Audio engineers know that many of the problems that can cause noise and echo could be minimized at the initial design stage. A good example that illustrates some of these design considerations is a hands-free communication system in a car.
Simply using a uni-directional microphone pointing towards the driver, rather than an omni-directional microphone, can eliminate a great deal of road noise, sound from the music system and passenger noise.
The location of the microphone relative to the loudspeaker can also have a considerable impact on the performance of the hands-free communication system, as the microphone will pick up the sound from the loudspeaker to a greater or lesser extent depending on its position and orientation relative to the loudspeaker.
Developing a good analog filter for the signal input from the microphone and choosing a higher sampling rate for its conversion to the digital domain are important first steps in the electronic design. The good analog filter can reduce electrical interference and microphone buffeting, for example, while the higher sampling rate will offer a broader bandwidth, capturing more of the voice signal frequencies and therefore a better quality of voice for the system to handle.
A further consideration will be the amplification and management of the signal strength; too big and the signal will experience distortion through clipping, too small and it will be embedded in the noise of the system and will be difficult to extract.
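The trade-off between too much and too little gain can be illustrated with a minimal sketch. It assumes 16-bit signed PCM samples and a hard-limiting stage; the function name `apply_gain`, the test tone and the gain values are illustrative choices, not part of any particular product.

```python
import math

FULL_SCALE = 32767  # largest positive 16-bit sample (assumed sample format)


def apply_gain(samples, gain):
    """Scale samples and hard-limit them to the 16-bit range.

    Returns the limited samples and the number of samples that clipped.
    """
    out = []
    clipped = 0
    for s in samples:
        v = s * gain
        if v > FULL_SCALE:
            v = FULL_SCALE
            clipped += 1
        elif v < -FULL_SCALE - 1:
            v = -FULL_SCALE - 1
            clipped += 1
        out.append(int(round(v)))
    return out, clipped


# A 440 Hz test tone at an 8 kHz sampling rate, peaking near 8000 counts.
tone = [int(8000 * math.sin(2 * math.pi * 440 * n / 8000)) for n in range(800)]

for gain in (0.01, 2.0, 8.0):
    scaled, clipped = apply_gain(tone, gain)
    peak = max(abs(s) for s in scaled)
    print(f"gain {gain:>5}: peak {peak:>6} counts, {clipped} clipped samples")
```

With the smallest gain the tone occupies only a few dozen counts and sits close to the quantization and system noise; with the largest gain the peaks are flattened at full scale, which is the clipping distortion described above.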
A strong, clear, undistorted, voice signal is an essential starting point for the achievement of good voice quality. Once this is in place, the processes of noise reduction and echo cancellation can be applied for further enhancement.
The Drive Thru – A Simple System
The communication system at your local drive-through restaurant is a great example of a voice system that could employ noise reduction software. When you speak into the ‘microphone post’ to place your order, the system ideally removes background traffic noise, and even the noise from your own car, making order errors less likely and order taking faster.
You will see from the diagram above that the system contains a DSP (Digital Signal Processor). This processor provides the platform for running the speech enhancement algorithms. Most often this will be a Texas Instruments or an Analog Devices device, but as we move forward it is equally likely to be a Bluetooth device or some new form of WiFi processor. In some instances, these complex and highly refined voice algorithms are integrated into dedicated silicon and sold to developers as a discrete component performing all of the functions illustrated in the diagram and more. Alternatively, in more complex systems, these algorithms may be run as a task on operating systems such as OSE, Linux, Windows or QNX.
The Referenced Noise Filter
The Referenced Noise Filter is an example of a noise reduction technique that can target a specific type of problem where the noise-producing source is man-made and accessible. If we use the example of the fire engine in the introduction, the problem faced here is the constant background sound of the fire engine’s siren. When the engine driver talks into his microphone, it will be picking up his voice as well as the sound of the siren.
The referenced noise filter takes a signal (direct tap) from the source of the noise (the siren) and uses this as a reference to enable the filter to target the noise that needs to be removed. Using this complex software algorithm on the DSP in the system, the results can be quite profound, resulting in the region of a 90% reduction in the unwanted sound.
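One common textbook way to build a filter of this kind is a normalized LMS (NLMS) adaptive filter, sketched below. This is an illustration of the general idea only, not the algorithm used in any commercial product. The function name `nlms_cancel`, the tap count, the step size, the swept-tone "siren", the random "voice" stand-in and the two-path acoustic model are all assumptions made for the example.

```python
import math
import random


def nlms_cancel(primary, reference, taps=16, mu=0.1, eps=1e-6):
    """Remove from `primary` whatever can be predicted from `reference`.

    primary   -- microphone samples (voice plus siren)
    reference -- samples tapped directly from the noise source
    Returns the error signal, which is the cleaned output.
    """
    w = [0.0] * taps      # adaptive filter weights
    buf = [0.0] * taps    # most recent reference samples, newest first
    out = []
    for d, x in zip(primary, reference):
        buf = [x] + buf[:-1]
        y = sum(wi * bi for wi, bi in zip(w, buf))   # estimated siren at the mic
        e = d - y                                    # cleaned sample
        step = mu * e / (eps + sum(b * b for b in buf))
        w = [wi + step * bi for wi, bi in zip(w, buf)]
        out.append(e)
    return out


def power(sig):
    return sum(s * s for s in sig) / len(sig)


random.seed(0)
fs = 8000
n_samples = 2 * fs

# Siren stand-in: a tone sweeping between 600 Hz and 1200 Hz once per second.
siren, phase = [], 0.0
for n in range(n_samples):
    freq = 900 + 300 * math.sin(2 * math.pi * n / fs)
    phase += 2 * math.pi * freq / fs
    siren.append(math.sin(phase))

# Voice stand-in: low-level random samples (real speech would be used in practice).
voice = [random.uniform(-0.2, 0.2) for _ in range(n_samples)]


def delayed(sig, i):
    return sig[i] if i >= 0 else 0.0


# Assumed acoustic path from siren to microphone: two delayed, scaled copies.
mic = [voice[n] + 0.6 * delayed(siren, n - 3) + 0.3 * delayed(siren, n - 5)
       for n in range(n_samples)]

cleaned = nlms_cancel(mic, siren)

# Compare the siren energy left in the second half, once the filter has adapted.
half = n_samples // 2
before = power([m - v for m, v in zip(mic[half:], voice[half:])])
after = power([c - v for c, v in zip(cleaned[half:], voice[half:])])
print(f"residual siren power is {after / before:.4f} of the original")
```

The filter only ever sees the reference tap and the microphone signal; it never needs a clean copy of the voice. The `voice` list is used here solely to measure how much siren remains.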
Voice Recognition Enhancer
Voice recognition is still very much an emerging technology. One of its key applications will be for the control of in-car telematic systems. Using your voice, you’ll be able to instruct your navigation system, dial up your favourite restaurant to book a table, maybe even open the sunroof and select ‘Katrina and the Waves’.
One of the biggest challenges for these systems has been overcoming noise. As discussed, a 90% voice recognition accuracy is close to useless (on average, one digit in a 10-digit phone number will be wrong). The challenge is that as the car accelerates, wind and road noise start to degrade the performance of the system.
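The arithmetic behind that claim can be checked with a short sketch. It assumes that each digit is recognised independently with the same accuracy, which is a simplification; the function names are illustrative.

```python
def prob_all_correct(per_digit_accuracy, digits):
    """Chance that every digit is recognised, assuming independent errors."""
    return per_digit_accuracy ** digits


def expected_errors(per_digit_accuracy, digits):
    """Average number of wrongly recognised digits per number."""
    return (1.0 - per_digit_accuracy) * digits


for accuracy in (0.90, 0.95, 0.99):
    p = prob_all_correct(accuracy, 10)
    errs = expected_errors(accuracy, 10)
    print(f"{accuracy:.0%} per digit: {p:.1%} of 10-digit numbers fully correct, "
          f"{errs:.1f} wrong digits on average")
```

Under this assumption, 90% per-digit accuracy dials a complete 10-digit number correctly only about 35% of the time, which is why even modest gains in per-digit accuracy matter so much.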
The x-axis on the graph shows that as the car accelerates from 0mph to greater than 70mph, the ‘Speech to Noise Ratio’ declines. In turn, the Hit Rate (voice recognition accuracy) also declines. Using a voice recognition enhancer software algorithm on the DSP can provide valuable improvements in the hit rate at the higher speeds. It is easy to see that a 10% improvement can be the difference between a successful voice instruction and a failed one. A VRE basically works by analyzing the sampled input data and making decisions on that. It decides what is speech and leaves that alone (crucial for successful speech recognition), and what is noise and then reduces that in amplitude.
It can differentiate successfully between speech and noise because speech varies rapidly in amplitude and pitch, whereas noise varies much more slowly (to the point of being what audio engineers term ‘stationary’).
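The speech-versus-noise decision can be sketched in a heavily simplified form as follows. This is not the VRE algorithm itself: it looks only at frame energy (not pitch), tracks a slowly rising noise floor, and attenuates any frame that stays close to that floor. The function name `enhance`, the frame size, the thresholds and the synthetic test signal are all assumptions made for the example.

```python
import math
import random


def enhance(samples, frame=160, floor_rise=1.02, margin=3.0, atten=0.3):
    """Attenuate frames whose energy stays near a slowly tracked noise floor.

    Frames well above the floor are treated as speech and passed unchanged.
    """
    out = []
    noise = None
    for start in range(0, len(samples), frame):
        block = samples[start:start + frame]
        energy = sum(s * s for s in block) / len(block)
        if noise is None or energy < noise:
            noise = energy                            # floor follows drops at once
        else:
            noise = min(noise * floor_rise, energy)   # floor rises only slowly
        is_speech = energy > margin * noise
        gain = 1.0 if is_speech else atten
        out.extend(s * gain for s in block)
    return out


random.seed(1)
fs = 8000

# One second of steady (stationary) background noise ...
noisy = [random.gauss(0.0, 0.05) for _ in range(fs)]
# ... with a louder 200 Hz burst standing in for speech from 0.4 s to 0.7 s.
for n in range(3200, 5600):
    noisy[n] += 0.5 * math.sin(2 * math.pi * 200 * n / fs)

cleaned = enhance(noisy)


def power(sig):
    return sum(s * s for s in sig) / len(sig)


print(f"noise-only power: {power(noisy[:3200]):.5f} -> {power(cleaned[:3200]):.5f}")
print(f"burst power:      {power(noisy[3200:5600]):.5f} -> {power(cleaned[3200:5600]):.5f}")
```

Because the floor is allowed to rise only slowly, a sudden jump in energy is treated as speech and left alone, while steady background noise is turned down. A practical enhancer would work per frequency band and smooth its gain changes to avoid audible artefacts.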
Hands-Free Systems – a new set of issues
For in-vehicle communication systems, hands-free systems are essential for safety and in order to comply with driving legislation. However, a hands-free system can be complex to design effectively. As with the voice recognition system, there is concern about road, wind and engine noise but in addition to this, a hands-free system will generate echo for the caller.
The problem is that the caller’s voice will be emitted by the loudspeaker and will travel around the vehicle before being picked up by the microphone. The caller will hear an echo of their own voice, and this will constantly interfere with the overall communication.
Indeed this problem can exist even when using a mobile handset or a Bluetooth headset – there will be acoustic echo generated by direct coupling between the loudspeaker and the microphone to a greater or lesser degree. To make hands-free systems viable, developers use a combination of a noise reduction algorithm along with an echo cancellation algorithm running on the DSP device. Often, this is also bundled with some speech enhancing software that provides the services of some acoustic filters and gain control to boost the clarity of the voice.
How Can You Benefit from this Technology?
The technology described is readily available as off-the-shelf software components. It is developed by expert audio and software engineers and proven in millions of devices. Developers choose to buy in these algorithms in order to ensure that they have the very best voice quality for their products. Many of these algorithms are optimized for small-footprint devices and can also be tailored for specific applications, e.g. Bluetooth headsets, 3G mobile phones, Tetra radios, Formula 1 racing, etc.
A range of evaluation hardware is readily available, enabling engineers to assess the impact this technology would have on their next design and to enable them to become familiar with the issues surrounding this type of software component.
When you receive the next call on your mobile phone, just listen to how clearly you can hear your caller. NCT alone has invested over 70 man-years of mathematical, acoustic and software programming expertise to deliver the software that provides this level of clear speech. Noise reduction and echo cancellation have been a big subject for the communications sector over the last 10 years. It is expected that as voice recognition and Bluetooth systems roll out, the next 10 years will be equally busy for voice enhancement software companies such as ourselves.