Robust adaptation to non-native accents in automatic speech recognition. (Q1422264)

Conversation between human beings is a very complex process. People can communicate intentions easily and without much effort, provided they speak the same language. Communication and language are very natural; the capability is acquired step by step, beginning in infancy and then growing and developing unconsciously throughout life. What the book is interested in, however, is how computers can be made to understand human language. The overall goal is to let computers understand human speech as it is used in human conversations. Solving this problem involves many technical difficulties that seem to pose no problem at all in human-human communication. One step towards this goal is a technology called Automatic Speech Recognition (ASR). Among the many problems that are still unsolved, the research presented here is concerned with robust adaptation to native as well as to non-native speakers. ASR systems designed for native speakers have major problems when used by non-native speakers, resulting in unacceptable recognition rates. Traditionally, speaker adaptation techniques are widely used to improve the recognition rates of ASR systems. In this book, the semi-supervised adaptation approach is also applied to improve the performance for non-native speakers.
The book is structured as follows. Chapter 2 gives a general overview of ASR systems. Background information about the necessary pre-processing of the speech data and the theory of stochastic modelling of speech is given in Chapter 3 and Chapter 4, respectively. Chapter 5 describes the knowledge sources that are necessary in each ASR system. The improved speaker adaptation algorithm that was developed during this research work is described in detail in Chapter 6, together with the experiments and the results that were achieved. The applied approach for confidence modelling and its application to speaker adaptation are described in Chapter 7, and the new pronunciation adaptation approach is presented in Chapter 8. A perspective on future work is given in Chapter 9, and the research is summarised in Chapter 10.

reviewed by

Attila Fazekas

zbMATH Keywords

speech recognition

MaRDI profile type

Publication

full work available at URL

https://doi.org/10.1007/3-540-36290-8