1
Analysis of Parameter Importance in Speaker Identity
Ricardo de Córdoba, Juana M. Gutiérrez-Arriola
Speech Technology Group
Departamento de Ingeniería Electrónica
Universidad Politécnica de Madrid
e-mail: [email protected], [email protected]
2
Index
Introduction
System description
Parameter extraction
Voice conversion and synthesis
Parameter analysis
Application to a voice quality task
Results
Conclusions
3
Introduction
[Diagram: the source speaker voice and the target speaker voice are each analyzed into parameters; the transformation functions are computed from both parameter sets; voice conversion applies the transformation functions to obtain the converted (target speaker) parameters, which are synthesized into speech]
5
System description
[Diagram blocks: source speaker (455 units); target speaker (455 units); transformation functions computation; speaker conversion; parametrized units concatenation; formant synthesizer]
10
Parameter Extraction III
[Equations: definitions of the glottal source parameters Skewness, Open quotient, Return and Velocity in terms of glottal pulse time instants]
11
Parameter Extraction IV
We calculate F0, AV (amplitude of voicing), AF (amplitude of frication), formant frequencies and bandwidths
Pitch marks and formants are manually revised
Only voiced sounds are transformed
13
Voice conversion I
Linear transformation functions: $Par_{target} = A \cdot Par_{source} + B$
For each pair of source-target units we compute the transformation coefficients, which are stored in a file
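A minimal sketch of how the coefficients A and B of a linear transformation could be obtained for one parameter of one source-target unit pair. The slides do not say how the coefficients are computed, so the least-squares fit, the function names and the F1 values below are all assumptions made for illustration.

```python
def fit_linear_transform(source, target):
    """Least-squares fit of target ~ a * source + b for one parameter.

    Assumption: the source and target frames are already time-aligned.
    """
    n = len(source)
    if n != len(target) or n < 2:
        raise ValueError("need two aligned sequences of at least two frames")
    mean_s = sum(source) / n
    mean_t = sum(target) / n
    var_s = sum((s - mean_s) ** 2 for s in source)
    if var_s == 0:
        # Constant source track: fall back to a pure offset.
        return 1.0, mean_t - mean_s
    cov = sum((s - mean_s) * (t - mean_t) for s, t in zip(source, target))
    a = cov / var_s
    b = mean_t - a * mean_s
    return a, b


def apply_linear_transform(source, a, b):
    """Convert a source parameter track with the fitted coefficients."""
    return [a * s + b for s in source]


# Hypothetical F1 values (Hz) for one source/target unit pair.
source_f1 = [500.0, 520.0, 540.0, 560.0]
target_f1 = [610.0, 632.0, 654.0, 676.0]

a, b = fit_linear_transform(source_f1, target_f1)
converted_f1 = apply_linear_transform(source_f1, a, b)
```

One pair (a, b) per parameter and per unit pair would then be what gets written to the coefficient file.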
14
Synthesis
Formant synthesizer (Klatt)
Parameterized units concatenation
Prosodic modification, changing glottal pulse length and the number of glottal pulses
Formant smoothing during unit transitions
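As an illustration of formant smoothing at a unit boundary, the sketch below replaces a few frames on each side of the join with a straight line. The slides do not describe the smoothing method actually used, so the linear interpolation, the window width and the function name are assumptions.

```python
def smooth_transition(left, right, width=3):
    """Concatenate two formant tracks, interpolating linearly across the join.

    The last `width` frames of `left` and the first `width` frames of `right`
    are replaced by a straight line between the frames just outside that window.
    """
    if width < 1 or len(left) <= width or len(right) <= width:
        raise ValueError("each track must be longer than the smoothing width")
    start = left[-width - 1]   # last untouched frame of the left unit
    end = right[width]         # first untouched frame of the right unit
    steps = 2 * width + 1      # intervals between start and end
    ramp = [start + (end - start) * k / steps for k in range(1, steps)]
    return left[:-width] + ramp + right[width:]


# Hypothetical F2 tracks (Hz) of two consecutive units.
left_unit = [500.0] * 5
right_unit = [640.0] * 5
smoothed = smooth_transition(left_unit, right_unit, width=3)
```

The output has the same number of frames as the two units together, so the unit durations are unchanged.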
16
Parameter Analysis I
11 speakers (5 female, 6 male)
EUROM1 database in Castilian Spanish
Sentence: “Mi abuelo me animó a estudiar solfeo” (My grandfather encouraged me to study solfa)
Fs = 16 kHz
18
Parameter Analysis III
We want to know which parameters are actually relevant for speaker identity
Discriminant functions are linear combinations of variables that best discriminate classes
– They can be used to rank the variables in terms of their relative contribution to class discrimination
Linear discriminant analysis (LDA) is performed:
– For each phoneme of the sentence (does not work well for the whole sentence)
– Coefficients of the first discriminant function are used to rank the parameters
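The ranking idea can be sketched as follows, assuming NumPy is available. The code computes the first discriminant function from the within-class and between-class scatter matrices and orders the parameters by the absolute value of their coefficients. Standardizing each parameter first is an assumption made here so that coefficients of parameters with different units are comparable; the slides do not state whether this was done. The function names and the toy data are hypothetical.

```python
import numpy as np


def first_discriminant(X, y):
    """Coefficients of the first linear discriminant function."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    # Assumption: z-score every parameter so coefficients are comparable.
    X = (X - X.mean(axis=0)) / X.std(axis=0)
    overall_mean = X.mean(axis=0)
    n_features = X.shape[1]
    Sw = np.zeros((n_features, n_features))  # within-class scatter
    Sb = np.zeros((n_features, n_features))  # between-class scatter
    for label in np.unique(y):
        Xc = X[y == label]
        mean_c = Xc.mean(axis=0)
        centered = Xc - mean_c
        Sw += centered.T @ centered
        diff = (mean_c - overall_mean).reshape(-1, 1)
        Sb += len(Xc) * (diff @ diff.T)
    # Directions maximizing between-class over within-class scatter.
    eigvals, eigvecs = np.linalg.eig(np.linalg.pinv(Sw) @ Sb)
    top = np.argmax(eigvals.real)
    return eigvecs[:, top].real


def rank_parameters(names, coefficients):
    """Parameter names ordered from largest to smallest |coefficient|."""
    order = np.argsort(-np.abs(coefficients))
    return [names[i] for i in order]


# Toy data: three speakers whose frames differ mainly in F0.
rng = np.random.default_rng(0)
names = ["F0", "F1", "B1"]
frames, labels = [], []
for speaker in range(3):
    for _ in range(40):
        frames.append([100 + 50 * speaker + rng.normal(0, 5),
                       500 + rng.normal(0, 30),
                       80 + rng.normal(0, 10)])
        labels.append(speaker)

coeffs = first_discriminant(frames, labels)
ranking = rank_parameters(names, coeffs)
```

Running the analysis once per phoneme, as described above, would give one such ranking per phoneme.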
19
Application to a Voice Quality Task
We extracted four sentences from the Brian VOQUAL'03 database: normal, clear, creaky, and relax.
We analyzed two phonemes of the sentence: “She has leeft for a great paarty today”
We wanted to rank parameter importance to discriminate between the four classes:
– We use the coefficients of the first discriminant function
21
Results I
Voice Quality Task
Frame classification for E and A using LDA for the first two discriminant functions
[Figure: panels E and A; legend: normal, creaky, clear, relax]
22
Results II
Voice Quality Task
[Figure: panels E and A, "First function coefficients"; parameters on the horizontal axis: F0, F1, B1, F2, B2, F3, B3, F4, B4, F5, B5, F6, OQ, RE, SK, VE]
Absolute values of the coefficients that multiply each parameter in the first discriminant functions
23
Results III
Speaker Identity
[Figure: two panels over the parameters F0, F1, B1, F2, B2, F3, B3, F4, B4, F5, B5, F6, OQ, RE, SK, VE; top panel "more relevant" (axis 0 to 15000), bottom panel "less relevant" (axis 0 to 10000)]
Number of times each parameter has been the most relevant (top) and the least relevant (bottom) in the first discriminant function
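Counts of this kind could be produced as in the sketch below, assuming one set of first-function coefficients per LDA run. The function name, the data layout and the coefficient values are hypothetical and are not taken from the study.

```python
from collections import Counter


def tally_extremes(coefficient_sets):
    """Count how often each parameter has the largest and the smallest
    absolute coefficient across a collection of LDA runs."""
    most, least = Counter(), Counter()
    for coeffs in coefficient_sets:
        most[max(coeffs, key=lambda name: abs(coeffs[name]))] += 1
        least[min(coeffs, key=lambda name: abs(coeffs[name]))] += 1
    return most, least


# Hypothetical first-function coefficients from three separate LDA runs.
runs = [
    {"F0": 2.1, "F1": -0.4, "OQ": 0.9},
    {"F0": -1.7, "F1": 0.2, "OQ": 1.9},
    {"F0": 1.2, "F1": -0.1, "OQ": -0.6},
]

most_relevant, least_relevant = tally_extremes(runs)
```

Plotting `most_relevant` and `least_relevant` as two bar charts over the parameter names would give a figure with the same layout as the one described above.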