A Study on Layered Decoding Method Based on Articulatory Feature for Mandarin Speech Recognition
Abstract: Traditional automatic speech recognition (ASR) decoding methods extend path candidates indiscriminately across the whole search space. They do not use auxiliary information to divide the search space, so they cannot intensify or prune the search according to how promising each subspace is, which causes a large amount of unnecessary computation. In addition, traditional decoding methods do not use auxiliary information to assess the confidence of path candidates, and therefore cannot adjust the direction of extension during decoding. Building on multiple acoustic and semantic cues and on an articulatory feature (AF) framework for Mandarin speech, this study will explore methods for automatic AF extraction. As a kind of auxiliary information, AFs provide a stable representation of speech from the perspective of speech production. The study will then explore AF modeling methods and a two-level decoding method that integrates articulatory information. The first-level decoding uses the articulatory model to divide the search space into several subspaces; the second-level decoding uses the acoustic model to extend path candidates within the resulting subspaces according to how promising those subspaces are. Furthermore, the study will explore a method for assessing path candidates based on articulatory information. The assessment result is integrated into the decoding process to guide the direction in which path candidates are extended, yielding a novel ASR method that is consistent with the way the human brain assesses candidate hypotheses using heuristic cues.
Keywords: decoding algorithm; search space; feature extraction; acoustic model; articulatory feature
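Illustrative sketch: the two-level decoding idea described in the abstract could look roughly like the following for a single decoding step. This is not the authors' implementation. The function name, the subspace labels, the posterior floor, the proportional expansion budget, and the way the articulatory posterior is added to the path score are all assumptions made for illustration, and the numbers are invented toy values.

```python
import math
from typing import Dict, List, Tuple

Path = Tuple[Tuple[str, ...], float]  # (unit sequence, accumulated log score)


def two_level_step(
    paths: List[Path],
    subspace_post: Dict[str, float],      # level 1: articulatory posterior per subspace
    subspace_units: Dict[str, List[str]],  # units (e.g. phones) belonging to each subspace
    acoustic_logp: Dict[str, float],       # level 2: acoustic log-likelihood per unit
    beam: int = 4,
    floor: float = 0.05,
) -> List[Path]:
    """Extend path candidates by one unit using a hypothetical two-level scheme."""
    candidates: List[Path] = []
    for name, post in subspace_post.items():
        # Level 1: prune subspaces the articulatory model considers unpromising.
        if post < floor:
            continue
        # Assumption: more promising subspaces get a larger expansion budget.
        budget = max(1, round(beam * post))
        units = sorted(
            subspace_units[name], key=lambda u: acoustic_logp[u], reverse=True
        )[:budget]
        # Level 2: extend each path with the acoustically best units of the subspace.
        for seq, score in paths:
            for u in units:
                # Assumption: articulatory confidence is added as a log-domain bonus.
                new_score = score + acoustic_logp[u] + math.log(post)
                candidates.append((seq + (u,), new_score))
    candidates.sort(key=lambda c: c[1], reverse=True)
    return candidates[:beam]


# Toy example with invented values.
subspace_post = {"nasal": 0.70, "plosive": 0.27, "fricative": 0.03}
subspace_units = {"nasal": ["m", "n"], "plosive": ["b", "p", "d"], "fricative": ["s", "f"]}
acoustic_logp = {"m": -0.5, "n": -1.0, "b": -0.7, "p": -2.0, "d": -1.5, "s": -0.2, "f": -3.0}

result = two_level_step([((), 0.0)], subspace_post, subspace_units, acoustic_logp)
for seq, score in result:
    print(seq, round(score, 3))
```

In this toy run the fricative subspace is skipped entirely, even though "s" has the best acoustic score, because its articulatory posterior falls below the assumed floor. This is the kind of saving in unnecessary computation that the abstract attributes to dividing the search space.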
Contact:
YANG Zhanlei
E-mail: zhanlei.yang@nlpr.ia.ac.cn