Title: Speaker Segmentation and Adaptation for Speech Recognition on Multiple-Speaker Audio Conference Data
Abstract: In this paper, we address the problem of improving automatic speech recognition (ASR) performance on audio conference data through speaker segmentation and speaker adaptation. We propose a new speaker segmentation method in which speaker turns and speaker labels are determined automatically. For speaker adaptation, we use Vocal Tract Length Normalization (VTLN) and Maximum Likelihood Linear Regression (MLLR). On a corpus of multi-speaker teleconferences, the word error rate of the ASR system is reduced by more than 4% absolute.
Publication Year: 2007
Publication Date: 2007-07-01
Language: en
Type: article
Indexed In: ['crossref']
Access and Citation
Cited By Count: 5