Recent Advances in Robust Speech Recognition Technology

This E-book is a collection of articles that describe advances in speech recognition technology. Robustness in speech recognition refers to the need to maintain high speech recognition accuracy even ...
Integration of Statistical-Model-Based Voice Activity Detection and Noise Suppression for Noise Robust Speech Recognition

Masakiyo Fujimoto


This chapter addresses robust front-end processing for automatic speech recognition in noisy environments. Recognizing corrupted speech accurately requires methods that are robust against various types of interference. Noise suppression is the usual front-end processing for speech recognition in the presence of noise. Voice activity detection (VAD) is also used in the front end to eliminate redundant non-speech periods. The two are typically combined in series. However, VAD and noise suppression should not be treated as separate techniques, because the output of each is useful to the other. This chapter therefore introduces integrated front-end processing of VAD and noise suppression, in which each method can use the other's input and output information.
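
The mutual dependence between VAD and noise suppression can be illustrated with a minimal sketch. It is not the statistical-model-based method of the chapter: it substitutes a simple energy-ratio decision for the VAD and plain spectral subtraction for the noise suppression, and it assumes that the first few frames contain noise only. The function name `integrated_front_end` and all of its parameters are hypothetical and chosen for illustration.

```python
import numpy as np


def integrated_front_end(power_spec, init_frames=5, snr_threshold_db=3.0,
                         alpha=0.9, floor=0.01):
    """Toy coupled VAD / noise suppression loop (illustrative only).

    power_spec: array of shape (num_frames, num_bins) holding noisy
    power spectra. Returns (speech_flags, enhanced_power).
    """
    power_spec = np.asarray(power_spec, dtype=float)
    # Assumption: the first `init_frames` frames contain noise only.
    noise = power_spec[:init_frames].mean(axis=0)
    flags = np.zeros(len(power_spec), dtype=bool)
    enhanced = np.empty_like(power_spec)

    for t, frame in enumerate(power_spec):
        # VAD reuses the noise estimate kept for noise suppression:
        # a frame is speech if its energy exceeds the noise energy
        # by more than the threshold (in dB).
        frame_energy = max(frame.sum(), 1e-12)
        noise_energy = max(noise.sum(), 1e-12)
        snr_db = 10.0 * np.log10(frame_energy / noise_energy)
        flags[t] = snr_db > snr_threshold_db

        # Noise suppression reuses the VAD decision: the noise
        # estimate is updated only in frames judged non-speech.
        if not flags[t]:
            noise = alpha * noise + (1.0 - alpha) * frame

        # Spectral subtraction with a spectral floor to avoid
        # negative power values.
        enhanced[t] = np.maximum(frame - noise, floor * frame)

    return flags, enhanced
```

In a purely serial arrangement the VAD would run first with its own noise model, and the suppressor would run afterwards without feeding anything back. Here the two steps share one noise estimate inside the same loop, which is the kind of information exchange that the integrated approach exploits.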


NTT Communication Science Laboratories, NTT Corporation, 2-4, Hikari-dai, Seika-cho, Souraku-gun, Kyoto, 619-0237, Japan