1

Analysis of Parameter Importance in Speaker Identity

Ricardo de Córdoba, Juana M. Gutiérrez-Arriola
Speech Technology Group
Departamento de Ingeniería Electrónica
Universidad Politécnica de Madrid

e-mail: [email protected], [email protected]

2

Index

Introduction
System description
Parameter extraction
Voice conversion and synthesis
Parameter analysis
Application to a voice quality task
Results
Conclusions

3

Introduction

[Figure: voice conversion block diagram. Blocks: Analysis (source speaker voice to parameters), Analysis (target speaker voice to parameters), Transformation functions computation, Voice conversion (applies the transformation functions), Synthesis (converted parameters to target speaker speech).]

5

System description

[Figure: system block diagram. Blocks: Source speaker (455 units), Target speaker (455 units), Transformation functions computation, Speaker conversion, Parametrized units concatenation, Formant synthesizer.]

8

Parameter Extraction I

Glottal parameters:

9

Parameter Extraction II

10

Parameter Extraction III

[Equations: definitions of the four glottal parameters, written with these symbols: Skewness (t_p, t_e), Open (K, t_c, t_O), Return (T_a), Velocity (t_p, t_c).]

11

Parameter Extraction IV

We calculate F0, AV, AF, formant frequencies and bandwidths

Pitch marks and formants are manually revised

Only voiced sounds are transformed
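As an illustration of how F0 relates to the pitch marks, here is a minimal sketch that derives one F0 value per glottal period from consecutive marks. It assumes the marks are sample indices and that the sampling rate is 16 kHz (the rate quoted later for the analysis data); the function name is hypothetical and this is not the authors' extraction code.

```python
def f0_from_pitch_marks(marks, fs=16000):
    """Return one F0 estimate in Hz per period between consecutive pitch marks.

    marks: pitch mark positions in samples, in increasing order (assumed format).
    fs:    sampling rate in Hz (assumed to be 16 kHz here).
    """
    # Each period lasts (b - a) samples, so its frequency is fs / (b - a).
    return [fs / (b - a) for a, b in zip(marks, marks[1:]) if b > a]


# Three periods: two of 160 samples and one of 180 samples.
f0_track = f0_from_pitch_marks([0, 160, 320, 500])
```

A manually corrected pitch mark changes only the two periods that touch it, which is one reason to revise marks rather than smooth the resulting F0 contour.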

13

Voice conversion I

Linear transformation functions:

$\mathrm{Par}_{\mathrm{target}} = A \cdot \mathrm{Par}_{\mathrm{source}} + B$

For each pair of source-target units we compute the transformation coefficients, which are stored in a file
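The slides do not say how the coefficients are estimated. One plausible reading, sketched below, fits A and B for a single parameter of one unit pair by ordinary least squares. It assumes the source and target units have already been time-aligned frame by frame; the function names are hypothetical.

```python
def fit_linear_transform(source, target):
    """Least-squares fit of target ~ a * source + b for one parameter track."""
    n = len(source)
    mean_s = sum(source) / n
    mean_t = sum(target) / n
    var_s = sum((s - mean_s) ** 2 for s in source)
    if var_s == 0:
        # Constant source track: fall back to a pure offset (assumption).
        return 1.0, mean_t - mean_s
    cov = sum((s - mean_s) * (t - mean_t) for s, t in zip(source, target))
    a = cov / var_s
    b = mean_t - a * mean_s
    return a, b


def apply_linear_transform(source, a, b):
    """Convert a source parameter track with the fitted coefficients."""
    return [a * s + b for s in source]


# Invented F0 tracks (Hz) for one source-target unit pair.
a, b = fit_linear_transform([100.0, 110.0, 120.0], [170.0, 185.0, 200.0])
converted = apply_linear_transform([105.0], a, b)
```

With one (a, b) pair per parameter and per unit, the stored file would hold a small table of coefficients for each of the 455 units.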

14

Synthesis

Formant synthesizer (Klatt)

Parameterized units concatenation

Prosodic modification, changing glottal pulse length and the number of glottal pulses

Formant smoothing during unit transitions
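The prosodic modification step can be pictured with a little arithmetic: the pulse length sets the pitch and the number of pulses sets the duration. The sketch below assumes one glottal pulse per pitch period and a 16 kHz sampling rate; the function name is hypothetical and the synthesizer's actual interface is not shown in the slides.

```python
def glottal_pulse_plan(duration_s, f0_hz, fs=16000):
    """Return (pulse length in samples, number of pulses) for a voiced segment."""
    # One pulse per pitch period: fs / f0 samples, rounded to whole samples.
    pulse_len = round(fs / f0_hz)
    # Enough pulses to cover the requested duration, at least one.
    n_pulses = max(1, round(duration_s * fs / pulse_len))
    return pulse_len, n_pulses


# The same 80 ms segment at two target pitches.
low_pitch = glottal_pulse_plan(0.08, 125.0)
high_pitch = glottal_pulse_plan(0.08, 250.0)
```

Raising the pitch at a fixed duration shortens each pulse and increases the pulse count, so both quantities have to change together.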

16

Parameter Analysis I

11 speakers (5 female, 6 male)

EUROM1 database in Castilian Spanish

Sentence: “Mi abuelo me animó a estudiar solfeo” (My grandfather encouraged me to study solfa)

Fs=16 kHz

17

Parameter Analysis II

18

Parameter Analysis III

We want to know which parameters are actually relevant for speaker identity

Discriminant functions are linear combinations of variables that best discriminate classes
– They can be used to rank the variables in terms of their relative contribution to class discrimination

LDA is performed:
– For each phoneme of the sentence (does not work well for the whole sentence)
– Coefficients of the first discriminant function are used to rank the parameters
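The ranking idea can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the authors' tool chain: parameters are z-scored first so that coefficient sizes are comparable, the first discriminant direction is found by power iteration on (within-class scatter)^-1 (between-class scatter) in place of a full eigendecomposition, and all function names and the toy data are invented.

```python
from collections import defaultdict
from itertools import product


def column_means(rows):
    return [sum(col) / len(rows) for col in zip(*rows)]


def add_outer(mat, v, scale=1.0):
    """Accumulate scale * v v^T into mat in place."""
    for i, vi in enumerate(v):
        for j, vj in enumerate(v):
            mat[i][j] += scale * vi * vj


def solve(a, b):
    """Return x with a x = b (b is a matrix), by Gauss-Jordan elimination."""
    n = len(a)
    aug = [list(a[i]) + list(b[i]) for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def first_discriminant(frames, labels, iterations=100):
    """Coefficients of the first discriminant function on z-scored parameters."""
    dim = len(frames[0])
    mu = column_means(frames)
    sd = [(sum((r[j] - mu[j]) ** 2 for r in frames) / len(frames)) ** 0.5
          for j in range(dim)]
    z = [[(r[j] - mu[j]) / sd[j] for j in range(dim)] for r in frames]
    groups = defaultdict(list)
    for row, label in zip(z, labels):
        groups[label].append(row)
    within = [[0.0] * dim for _ in range(dim)]
    between = [[0.0] * dim for _ in range(dim)]
    for rows in groups.values():
        centre = column_means(rows)
        add_outer(between, centre, scale=len(rows))  # grand mean is 0 after z-scoring
        for r in rows:
            add_outer(within, [x - c for x, c in zip(r, centre)])
    m = solve(within, between)
    w = [1.0] * dim  # power iteration converges to the leading eigenvector
    for _ in range(iterations):
        w = [sum(m[i][j] * w[j] for j in range(dim)) for i in range(dim)]
        peak = max(abs(x) for x in w)
        w = [x / peak for x in w]
    return w


def rank_parameters(names, weights):
    """Order parameter names by decreasing absolute coefficient."""
    return sorted(names, key=lambda n: -abs(weights[names.index(n)]))


# Toy data (invented): three speakers, eight frames of one phoneme each.
names = ["F0", "F1", "B1"]
centres = {"spk1": (100, 500, 50), "spk2": (150, 520, 50), "spk3": (200, 540, 50)}
frames, labels = [], []
for spk, (f0, f1, b1) in centres.items():
    for d0, d1, d2 in product((-1, 1), repeat=3):
        frames.append([f0 + 5 * d0, f1 + 20 * d1, b1 + 10 * d2])
        labels.append(spk)
weights = first_discriminant(frames, labels)
ranking = rank_parameters(names, weights)
```

In the toy data B1 has the same mean for every speaker, so it carries no speaker information and ends up last in the ranking, while F0 separates the speakers most cleanly relative to its within-speaker spread.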

19

Application to a Voice Quality Task

We extracted four sentences from the Brian VOQUAL'03 database: normal, clear, creaky, and relax.

We analyzed two phonemes of the sentence: “She has leeft for a great paarty today”

We wanted to rank parameter importance to discriminate between the four classes:
– We use the coefficients of the first discriminant function

21

Results I
Voice Quality Task

[Figure: frame classification for E and A using LDA for the first two discriminant functions. Panels: E, A. Classes: normal, creaky, clear, relax.]

22

Results II
Voice Quality Task

[Figure: first function coefficients. Absolute values of the coefficients that multiply each parameter in the first discriminant functions. Panels: E, A. Parameters on the horizontal axis: F0, F1, B1, F2, B2, F3, B3, F4, B4, F5, B5, F6, OQ, RE, SK, VE.]

23

Results III
Speaker Identity

[Figure: number of times each parameter has been the most relevant (top panel, "more relevant", vertical axis 0 to 15000) and the least relevant (bottom panel, "less relevant", vertical axis 0 to 10000) in the first discriminant function. Parameters on the horizontal axis: F0, F1, B1, F2, B2, F3, B3, F4, B4, F5, B5, F6, OQ, RE, SK, VE.]
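The counts in this figure amount to a tally over many discriminant analyses. A minimal sketch of that bookkeeping follows, assuming one list of first-function coefficients per LDA run; what constitutes a run is not stated in the slides, and the function name and numbers are invented.

```python
from collections import Counter


def tally_extremes(coefficient_sets, names):
    """Count how often each parameter has the largest and the smallest |coefficient|."""
    most, least = Counter(), Counter()
    for coeffs in coefficient_sets:
        magnitudes = [abs(c) for c in coeffs]
        most[names[magnitudes.index(max(magnitudes))]] += 1
        least[names[magnitudes.index(min(magnitudes))]] += 1
    return most, least


# Three invented runs over three of the parameters.
names = ["F0", "F1", "OQ"]
runs = [[2.0, -0.5, 1.0], [-3.0, 0.2, 0.4], [0.1, 0.9, -1.5]]
most, least = tally_extremes(runs, names)
```

Counting the least relevant parameter as well as the most relevant one shows whether a parameter is consistently unimportant or simply never dominant.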

25

Conclusions

Parameter importance depends on:
– the type of speech
– the gender of the speaker
– the phonemes under study

Results show that F0, formant frequencies and OQ are the most important parameters for speaker classification.