Title: Speech transformations based on a sinusoidal representation
Abstract: This paper presents a new speech analysis/synthesis technique based on a sinusoidal representation of the speech production mechanism but which is independent of pitch and the voiced/unvoiced speech state. The resulting synthetic speech preserves the waveform shape and is essentially perceptually indistinguishable from the original. The method provides the basis for a general class of speech transformations and is successfully applied to time-scale modification, frequency scaling, and scaling of pitch. Furthermore, these modifications can be performed with a time-varying rate of change, allowing, for example, continuous adjustment of a speaker's fundamental frequency and rate of articulation. Although the analysis/synthesis system was originally designed for single-speaker signals, it is equally capable of recovering and modifying nonspeech signals such as music, multiple speakers, marine biological sounds, and speech in the presence of interference such as noise and musical backgrounds.
Publication Year: 2005
Publication Date: 2005-03-23
Language: en
Type: article
Indexed In: ['crossref']
Cited By Count: 15