G.S. MOZE COLLEGE OF ENGINEERING, BALEWADI, PUNE-45

A PRESENTATION ON Voice Morphing

PROJECT GUIDE : By: Anil Mahadik Prof. Sonali Ghote


Post on 17-Dec-2015



Content

Title
Introduction
History
Need of vocal tract area function
Vocal tract area function
AR-HMM analysis
AR-HMM diagram
Re-synthesis of converted voice
Training phase
Conversion and morphing phase
Application
Conclusion
References

Title

The project title is “Voice Morphing”. It describes flexible voice morphing based on a linear combination of multiple speakers’ vocal tract area functions.

Voice morphing, or voice conversion, usually means transforming a source speaker’s speech into a target speaker’s.

Introduction

The main goal of the audio morphing methods developed so far is a smooth transformation from one sound to another.

These techniques can be regarded as a kind of point-to-point mapping in a feature space.

Many applications may benefit from this sort of technology.

Research on voice morphing aims to relax this point-to-point restriction and achieve area-to-area mapping by introducing multiple speakers.

History

Voice morphing is a technology developed at the Los Alamos National Laboratory in New Mexico, USA, by George Papcun and publicly demonstrated in 1999.

Voice morphing enables speech patterns to be cloned and an accurate copy of a person's voice to be made, which can then say anything the operator wishes it to say.

Need of vocal tract area function

Since the 1990s, many techniques for voice conversion have been proposed [1-7].

One successful technique uses a statistical method to map a source speaker’s voice to a target speaker’s, but a weakness of such methods is the discontinuity of formants.

The proposed method employs an estimated vocal tract area function to avoid this weakness.

Vocal tract area function (A)

Interpolation in the vocal tract area domain is considered to provide a reasonably continuous transition of formants.

Estimation of the vocal tract area function implies simultaneous estimation of the voice source characteristics.
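The interpolation idea above can be sketched as a weighted combination of several speakers' area functions. This is only an illustrative sketch: the slides do not give the exact formula, so blending in the log-area domain, the function name `morph_area_functions`, and the four-section example values are all assumptions.

```python
import numpy as np

def morph_area_functions(areas, weights):
    """Blend several speakers' vocal tract area functions in the log domain.

    areas:   shape (n_speakers, n_sections), strictly positive section areas
    weights: shape (n_speakers,), expected to sum to 1
    """
    areas = np.asarray(areas, dtype=float)
    weights = np.asarray(weights, dtype=float)
    if np.any(areas <= 0):
        raise ValueError("section areas must be positive")
    if not np.isclose(weights.sum(), 1.0):
        raise ValueError("weights must sum to 1")
    # A weighted sum of log-areas is a weighted geometric mean of the areas.
    return np.exp(weights @ np.log(areas))

# Hypothetical 4-section area functions for two speakers (arbitrary units).
speaker_a = [1.0, 2.0, 4.0, 2.0]
speaker_b = [4.0, 2.0, 1.0, 2.0]
halfway = morph_area_functions([speaker_a, speaker_b], [0.5, 0.5])
print(halfway)
```

Moving the weights gradually from one speaker to another changes every tube section smoothly, which is the property the slide relies on for continuous formant transitions.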

AR-HMM analysis

To estimate the vocal tract area function, the method introduces Auto-Regressive Hidden Markov Model (AR-HMM) analysis of speech.

The AR-HMM model represents the vocal tract characteristics by an AR model and the glottal source wave by an HMM.

The AR-HMM analysis estimates the vocal tract resonance characteristics and vocal source waves in the sense of maximum likelihood estimation.
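To make the AR part of the model concrete, the sketch below estimates AR coefficients with the classical autocorrelation method (Levinson-Durbin recursion). It is not the AR-HMM analysis itself: it assumes a white-noise excitation instead of an HMM glottal source, and the function name `lpc_autocorrelation` and the synthetic test signal are assumptions for illustration.

```python
import numpy as np

def lpc_autocorrelation(signal, order):
    """Estimate a[1..order] for the model x[n] = sum_k a[k] * x[n-k] + e[n]."""
    x = np.asarray(signal, dtype=float)
    n = len(x)
    # Autocorrelation for lags 0..order.
    r = np.array([np.dot(x[: n - lag], x[lag:]) for lag in range(order + 1)])
    a = np.zeros(order + 1)        # a[0] is unused
    reflection = np.zeros(order)
    err = r[0]                     # prediction error energy
    for i in range(1, order + 1):
        # Reflection coefficient of stage i.
        k = (r[i] - np.dot(a[1:i], r[i - 1:0:-1])) / err
        prev = a.copy()
        a[i] = k
        a[1:i] = prev[1:i] - k * prev[i - 1:0:-1]
        err *= 1.0 - k * k
        reflection[i - 1] = k
    return a[1:], reflection, err

# Synthetic AR(2) signal with known coefficients 1.3 and -0.6.
rng = np.random.default_rng(0)
noise = rng.standard_normal(20000)
x = np.zeros_like(noise)
for t in range(2, len(x)):
    x[t] = 1.3 * x[t - 1] - 0.6 * x[t - 2] + noise[t]

coeffs, reflection, err = lpc_autocorrelation(x, 2)
print(coeffs)  # close to the true coefficients, up to estimation noise
```

The reflection coefficients returned here are the quantities that a lossless-tube model relates to ratios of adjacent section areas; the sign convention for that relation varies between texts.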

Diagram of AR-HMM

Re-synthesis of the converted voice

There are two phases: the training phase, and the conversion and morphing phase.

The procedure of each phase is shown in the diagrams that follow.

Training phase

AR-HMM analysis: Speech samples with the same phonetic content from both the source and the target speaker are analyzed.

Feature alignment: The feature vectors obtained above are time-aligned using dynamic time warping (DTW) to compensate for any differences in duration between the source and target utterances.

Estimation of the conversion function: The aligned vectors are used to train a joint GMM, whose parameters are then used to construct a stochastic conversion function.
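The feature alignment step above can be illustrated with a minimal dynamic time warping sketch. It assumes Euclidean distance between frames and the basic step pattern (diagonal, up, left); the function name `dtw_path` and the toy one-dimensional features are assumptions, not taken from the slides.

```python
import numpy as np

def dtw_path(source, target):
    """Align two feature sequences of shape (n_frames, n_dims).

    Returns the accumulated cost and a list of (source_idx, target_idx) pairs.
    """
    source = np.asarray(source, dtype=float)
    target = np.asarray(target, dtype=float)
    n, m = len(source), len(target)
    # Pairwise Euclidean distances between all frames.
    dist = np.linalg.norm(source[:, None, :] - target[None, :, :], axis=2)
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            acc[i, j] = dist[i - 1, j - 1] + min(
                acc[i - 1, j - 1], acc[i - 1, j], acc[i, j - 1]
            )
    # Backtrack from the end to recover the warping path.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        step = np.argmin([acc[i - 1, j - 1], acc[i - 1, j], acc[i, j - 1]])
        if step == 0:
            i, j = i - 1, j - 1
        elif step == 1:
            i -= 1
        else:
            j -= 1
    path.reverse()
    return acc[n, m], path

# Toy 1-D features: the target holds the second frame one step longer.
src = [[0.0], [1.0], [2.0], [3.0]]
tgt = [[0.0], [1.0], [1.0], [2.0], [3.0]]
cost, path = dtw_path(src, tgt)
print(cost, path)
```

Each pair in the path gives one aligned source/target frame, and these pairs are what would be stacked into joint vectors for training the GMM.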

Training phase

Conversion and morphing phase

AR-HMM analysis: In this phase only the source speaker’s utterances are analyzed.

Features transformation: The GMM-based transformation function constructed during training is used to convert every source log vocal tract area function and vocal cord cepstrum into its most likely target equivalent.

The remaining steps are linear interpolation, synthesis of the source wave, and LPC synthesis.
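The features transformation step can be sketched with the mapping commonly used in joint-GMM voice conversion: each mixture component contributes a linear regression from source to target features, weighted by the component's posterior probability given the source frame. The slides do not spell out the formula, so this form is an assumption, and the function names and the hand-set parameters below are hypothetical; training the GMM (usually with EM) is not shown.

```python
import numpy as np

def gaussian_pdf(x, mean, cov):
    """Density of a multivariate normal distribution at x."""
    diff = x - mean
    norm = np.sqrt((2.0 * np.pi) ** len(mean) * np.linalg.det(cov))
    return np.exp(-0.5 * diff @ np.linalg.solve(cov, diff)) / norm

def convert_frame(x, weights, mu_x, mu_y, cov_xx, cov_yx):
    """Map one source feature vector x to a target-like feature vector."""
    x = np.asarray(x, dtype=float)
    mu_x = np.asarray(mu_x, dtype=float)
    mu_y = np.asarray(mu_y, dtype=float)
    cov_xx = np.asarray(cov_xx, dtype=float)
    cov_yx = np.asarray(cov_yx, dtype=float)
    # Posterior probability of each component given the source frame.
    lik = np.array([
        w * gaussian_pdf(x, mx, cxx)
        for w, mx, cxx in zip(weights, mu_x, cov_xx)
    ])
    post = lik / lik.sum()
    # Posterior-weighted sum of per-component linear regressions.
    y = np.zeros(mu_y.shape[1])
    for p, mx, my, cxx, cyx in zip(post, mu_x, mu_y, cov_xx, cov_yx):
        y += p * (my + cyx @ np.linalg.solve(cxx, x - mx))
    return y

# Hypothetical two-component model over 2-D features.
weights = [0.5, 0.5]
mu_x = [[0.0, 0.0], [100.0, 100.0]]
mu_y = [[1.0, 1.0], [50.0, 50.0]]
cov_xx = [2.0 * np.eye(2), 2.0 * np.eye(2)]
cov_yx = [np.eye(2), np.eye(2)]

y = convert_frame([1.0, 2.0], weights, mu_x, mu_y, cov_xx, cov_yx)
print(y)
```

In this toy setup the source frame lies next to the first component, so the output is essentially that component's regression; with overlapping components the posterior weighting blends the regressions smoothly.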

Conversion and morphing phase

Application

Applications include the creation of peculiar voices in animation films.

Voice morphing has tremendous possibilities in military psychological warfare and subversion.

Voice morphing is a powerful battlefield weapon that can be used to issue fake orders to the enemy's troops, appearing to come from their own commanders.

Conclusion

This paper has presented a voice morphing method based on mappings in the vocal tract area space and the glottal source wave spectrum, each of which can be modified independently.

These features have been realized using AR-HMM analysis of speech.

In future, we will investigate how to improve the quality of voice conversion with interpolation techniques.

References

[1] L. M. Arslan, D. Talkin, “Voice conversion by codebook mapping of line spectral frequencies and excitation spectrum,” Proc. Eurospeech, pp. 1347-1350, 1997.

[2] Y. Stylianou, O. Cappe, “A system for voice conversion based on probabilistic classification and a harmonic plus noise model,” Proc. ICASSP, pp. 281-284, 1998.

[3] A. Kain, “Spectral voice conversion for text-to-speech synthesis,” Proc. ICASSP, pp. 285-288, 1998.

[4] H. Ye, S. Young, “High Quality Voice Morphing,” Proc. IEEE ICASSP, pp. 9-12, 2004.