Model-based Speech Enhancement for ASR in Mobile Environments

As voice-driven applications move from the desktop to mobile environments (e.g., telematics, voice portals accessed from cell phones, and hand-held computers), the loss of recognition accuracy caused by noise has become a limiting factor in the adoption of ASR in the marketplace. ClearStream is an expert system built around a model of what clean voice should sound like, independent of what is being said. Any sound that does not fit the voice model is deemed to be noise and is tracked separately. The portions of voice that are deemed occluded by noise are then reconstructed using the speech model. This last, crucial step is not taken by standard approaches to speech enhancement such as spectral subtraction. This paper provides an evaluation of the technology using the Aurora standard database (TIDigits) and recognition engine (HTK) in a variety of noise environments, with implications for a number of markets, including telephony, automotive, and consumer.
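For readers unfamiliar with the baseline named above, the following is a minimal sketch of magnitude-domain spectral subtraction, not of ClearStream itself. The function name `spectral_subtract`, the `floor` parameter, and the toy numbers are illustrative assumptions, and the noisy mixture is built with an elementwise maximum as a crude stand-in for noise masking speech. It is meant only to show that, in a bin where noise dominates, subtraction leaves a floored residue rather than a reconstruction of the occluded speech.

```python
import numpy as np

def spectral_subtract(noisy_mag, noise_mag, floor=0.05):
    """Magnitude-domain spectral subtraction with a spectral floor.

    noisy_mag: array of shape (frames, bins), noisy magnitude spectra
    noise_mag: array of shape (bins,), estimated noise magnitude per bin
    floor:     fraction of the noisy magnitude kept as a lower bound
               (hypothetical default, chosen for illustration)
    """
    cleaned = noisy_mag - noise_mag
    # Clamp to a small fraction of the input so magnitudes stay positive.
    return np.maximum(cleaned, floor * noisy_mag)

# Toy data: one frame, four frequency bins (all values are made up).
speech = np.array([[4.0, 3.0, 0.5, 2.0]])
noise = np.array([0.5, 0.5, 3.0, 0.5])

# Assumption: the louder source dominates each bin (a masking approximation).
noisy = np.maximum(speech, noise)

enhanced = spectral_subtract(noisy, noise)
print(enhanced)
```

In the third bin the noise occludes the speech, and the output there is just the floor value. The information about the speech in that bin is gone, which is the gap a model-based reconstruction step is intended to fill.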

