Page 1: Functional Data Analysis for Speech Research

Functional Data Analysis for Speech Research

Michele Gubian

Radboud University Nijmegen, The Netherlands

London, March 24th 2010

Cambridge, March 26th 2010

Page 2: Functional Data Analysis for Speech Research

Content

Functional Data Analysis (FDA): what and why

Motivation

Case study 1

Case study 2 – pitch re-synthesis

How to use FDA

Using the R package ‘fda’

Page 3: Functional Data Analysis for Speech Research

Motivation

Page 4: Functional Data Analysis for Speech Research

Analyzing curves

PCA

ANOVA

Linear models

[Figure: curves with the features dur and ext marked, summarized in a table of feature values]

dur    ext
58     2.8
48     3.8
98     2.9

Page 5: Functional Data Analysis for Speech Research

Problems

[Figure: curve with the features dur and ext marked]

Decide which features of a curve are important, using

models

intuition / trial and error

However

Those features may not capture all the relevant dynamic aspects

e.g. concavity/convexity

long-range correlations

Page 6: Functional Data Analysis for Speech Research

Problems (2)

[Figure: curve with the features dur and ext marked]

Identify those feature points

manually

(semi)automatically

However

The identification may be hard, even ill-posed

time consuming

risk of subjective judgment

Page 7: Functional Data Analysis for Speech Research

Analyzing curves with FDA

[Figure: curve with the features dur and ext marked, labeled "Functional Data Analysis"]

Page 8: Functional Data Analysis for Speech Research

Analyzing curves with FDA

All the information contained in the curve (dynamics) is used

No need to reduce a curve to a set of significant features

No need to introduce assumptions about what is relevant in a curve shape and what is not

FDA provides both VISUAL and QUANTITATIVE results

input is curves, output is also curves

plus classic statistical output like p-values, confidence intervals

Page 9: Functional Data Analysis for Speech Research

Functional Data Analysis: an extension of (some) statistical techniques to the domain of functions

Example

CLASSIC

Ask people: How old are you? How much do you earn?

Each data point is a point in 2D

[Figure: scatter plot of salary against age]

FDA

Record people's salaries through the years

Each “data point” is a whole CURVE

[Figure: salary curves plotted against age]

Page 10: Functional Data Analysis for Speech Research

Case study

Page 11: Functional Data Analysis for Speech Research

Diphthong vs. hiatus in Spanish

/ja/ vs. /i.a/ contrast is unstable in European Spanish

Diachronically, in Romance languages /i.a/ becomes /ja/

Diatopically, in Latin American Spanish the contrast seems to be lost

It is not present in orthography (“ia” in either case)

No strict minimal pairs

Investigate

Consistent realization of the contrast

Inter-speaker variation

Cues used in the realization

Page 12: Functional Data Analysis for Speech Research

Cues        DIPHTHONG /ja/       HIATUS /i.a/

Duration    short                long

Formants    f1, f2 [figure]      f1, f2 [figure]

Pitch       f0 [figure]          f0 [figure]

Page 13: Functional Data Analysis for Speech Research

Example diphthong

Page 14: Functional Data Analysis for Speech Research

Example hiatus

Page 15: Functional Data Analysis for Speech Research

Dataset

Read speech

Diphthong

‘Emiliana no, …’ /e.mi.lja.na#no#.../ (‘Not Emiliana, …’)

Hiatus

‘Mi liana no, …’ /mi#li.a.na#no#.../ (‘Not my liana, …’)

9 speakers (gender balanced)

20 repetitions per speaker per type

In total 365 utterances

Page 16: Functional Data Analysis for Speech Research

Duration

Page 17: Functional Data Analysis for Speech Research

Pitch

Pitch was extracted from the beginning of /l/ to the end of the rising gesture

In Spanish the pitch rising peak falls beyond the accented syllable

[Figure: pitch contours over the segments "lja" and "li a"]

Page 18: Functional Data Analysis for Speech Research

The raw data

[Figure: raw curves by speaker, /ja/ vs /i.a/]

Page 19: Functional Data Analysis for Speech Research

FDA data preparation

Each sampled curve has to be turned into a function

Decide how much detail to retain (smoothing)

Page 20: Functional Data Analysis for Speech Research

FDA data preparation (2)

All functions will be obtained by a combination of so-called basis functions, usually B-splines

All functions will be linearly stretched in time to become of equal duration

[Figure: functional representation of a curve built from B-splines]
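
The two preparation steps on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code behind the talk (which points to the R package ‘fda’): the function names, the knot vector, the sample times and the coefficients are all made up. It shows linear time normalization to [0, 1] and the evaluation of a curve written as a weighted sum of cubic B-splines (Cox-de Boor recursion).

```python
def normalize_time(times):
    """Linearly stretch sample times so that the curve spans [0, 1]."""
    t0, t1 = times[0], times[-1]
    return [(t - t0) / (t1 - t0) for t in times]


def bspline_basis(i, degree, knots, t):
    """Value at t of the i-th B-spline of the given degree (Cox-de Boor)."""
    if degree == 0:
        # Half-open intervals: the right end of the domain is excluded.
        return 1.0 if knots[i] <= t < knots[i + 1] else 0.0
    left = right = 0.0
    if knots[i + degree] > knots[i]:
        left = ((t - knots[i]) / (knots[i + degree] - knots[i])
                * bspline_basis(i, degree - 1, knots, t))
    if knots[i + degree + 1] > knots[i + 1]:
        right = ((knots[i + degree + 1] - t)
                 / (knots[i + degree + 1] - knots[i + 1])
                 * bspline_basis(i + 1, degree - 1, knots, t))
    return left + right


def evaluate_curve(coefs, degree, knots, t):
    """Evaluate the function sum_i coefs[i] * B_i(t)."""
    return sum(c * bspline_basis(i, degree, knots, t)
               for i, c in enumerate(coefs))


# Hypothetical sample times (seconds) of two curves with different durations.
times_a = [0.00, 0.05, 0.10, 0.15, 0.20]
times_b = [0.00, 0.10, 0.20, 0.30, 0.40]
norm_a = normalize_time(times_a)
norm_b = normalize_time(times_b)  # both now run from 0 to 1

# Cubic B-splines on [0, 1] with one interior knot: 5 basis functions.
DEGREE = 3
KNOTS = [0.0, 0.0, 0.0, 0.0, 0.5, 1.0, 1.0, 1.0, 1.0]
N_BASIS = len(KNOTS) - DEGREE - 1
coefs = [100.0, 105.0, 120.0, 140.0, 150.0]  # made-up coefficients (Hz)
value_mid = evaluate_curve(coefs, DEGREE, KNOTS, 0.5)
```

In a real analysis the coefficients are not chosen by hand but estimated from the sampled curve, and the number of basis functions (together with any roughness penalty) is what controls how much detail is retained.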

Page 21: Functional Data Analysis for Speech Research

Classic Principal Component Analysis (PCA)

[Figure: scatter plot with axis labels age (25, 65) and salary, showing the principal component directions PC1 and PC2]
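
As a reminder of what classic PCA computes on 2D points, here is a minimal Python sketch. It assumes nothing beyond the standard definition: PC1 is the direction of largest variance, obtained from the sample covariance matrix, which for two variables has a closed-form eigendecomposition. The function name and the age/salary values are hypothetical.

```python
import math


def pca_2d(points):
    """Return (mean, pc1, pc2, variances) for a list of (x, y) points."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    # Sample covariance matrix [[sxx, sxy], [sxy, syy]].
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    # Eigenvalues of a symmetric 2x2 matrix in closed form.
    half_trace = (sxx + syy) / 2
    radius = math.sqrt(((sxx - syy) / 2) ** 2 + sxy ** 2)
    variances = (half_trace + radius, half_trace - radius)
    # Angle of the leading eigenvector (direction of largest variance).
    angle = 0.5 * math.atan2(2 * sxy, sxx - syy)
    pc1 = (math.cos(angle), math.sin(angle))
    pc2 = (-math.sin(angle), math.cos(angle))
    return (mx, my), pc1, pc2, variances


# Made-up (age, salary in thousands) observations, one point per person.
people = [(25, 20), (30, 26), (40, 31), (50, 42), (65, 48)]
mean, pc1, pc2, variances = pca_2d(people)
```

Each person is then described by two scores, the projections of the centered point onto pc1 and pc2.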

Page 22: Functional Data Analysis for Speech Research

Functional PCA on pitch contours

Page 23: Functional Data Analysis for Speech Research

Functional PCA on pitch contours

PCA does not know about labels !!
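
The point that PCA never sees the labels can be made concrete with a small Python sketch. It is a discretized stand-in for functional PCA, assuming the curves have already been time-normalized and sampled on a common grid (the ‘fda’ package works on basis coefficients instead). The leading component is found by power iteration; the curve values, the function names and the labels are hypothetical.

```python
import math


def mean_curve(curves):
    """Pointwise mean of curves sampled on the same grid."""
    n = len(curves)
    return [sum(c[j] for c in curves) / n for j in range(len(curves[0]))]


def first_pc(curves, iterations=200):
    """Leading principal component of curves on a common time grid.

    Returns (mean, pc1, scores): pc1 is a unit-norm curve and scores[i]
    is the projection of the i-th centered curve onto pc1.
    """
    mean = mean_curve(curves)
    centered = [[v - m for v, m in zip(c, mean)] for c in curves]
    m = len(mean)
    # Power iteration: repeatedly apply X^T X to a start vector.
    # (A constant start vector fails only if pc1 is orthogonal to it.)
    v = [1.0 / math.sqrt(m)] * m
    for _ in range(iterations):
        proj = [sum(a * b for a, b in zip(row, v)) for row in centered]
        w = [sum(p * row[j] for p, row in zip(proj, centered))
             for j in range(m)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    scores = [sum(a * b for a, b in zip(row, v)) for row in centered]
    return mean, v, scores


# Made-up pitch curves: a flat 100 Hz baseline plus a varying amount of
# one bump-shaped deviation.
shape = [0.0, 1.0, 2.0, 1.0, 0.0]
amounts = [-2.0, -1.0, 1.0, 2.0]
labels = ["diphthong", "diphthong", "hiatus", "hiatus"]  # never passed in
curves = [[100.0 + a * s for s in shape] for a in amounts]

mean, pc1, scores = first_pc(curves)
# Labels are used only afterwards, to inspect how the scores group.
by_label = list(zip(labels, scores))
```

If the two categories differ along the main mode of variation, their scores separate even though the labels played no role in computing the component.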

Page 24: Functional Data Analysis for Speech Research

Functional PCA on pitch contours

PC1

Page 25: Functional Data Analysis for Speech Research

Functional PCA on pitch contours

PC1

Page 26: Functional Data Analysis for Speech Research

Functional PCA on pitch contours

PC2

Page 27: Functional Data Analysis for Speech Research

Functional PCA on pitch contours

PC2

Page 28: Functional Data Analysis for Speech Research

Functional PCA on formants

[Figure: panels labeled PC1 and PC2 for the formants f1 and f2]

Page 29: Functional Data Analysis for Speech Research

Functional PCA on formants

[Figure: two panels, both labeled PC1]

Page 30: Functional Data Analysis for Speech Research

Cues coordination

[Figure: two panels, "Duration vs formants" and "Duration vs pitch"]

Page 31: Functional Data Analysis for Speech Research

Summary

FDA provides tools to extract relevant dynamic characteristics of a set of curves

Traditional tools like PCA (and linear regression) are extended to curves

Functional PCA revealed the main dynamic cues used in the realization of a (weak) contrast in Spanish

Without using the label information

Without extracting features from the curves (e.g. peaks)

Combining multi-dimensional curves (formants) without effort

Page 32: Functional Data Analysis for Speech Research

References

Functional Data Analysis website: www.functionaldata.org

Books:

Software: a bilingual (R and MATLAB) tool is freely available online

Page 33: Functional Data Analysis for Speech Research

Appendix

Page 34: Functional Data Analysis for Speech Research

Functional linear models

y(t) = a(t) + b(t) x

diphthong, x = 0

hiatus, x = 1

Confidence intervals for a(t) and b(t)

R^2(t) = percentage of variance explained at time t
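
For a binary predictor, the model y(t) = a(t) + b(t) x can be fitted pointwise by least squares: a(t) is the mean diphthong curve and b(t) is the difference between the mean hiatus and mean diphthong curves. The Python sketch below assumes curves sampled on a common grid and uses made-up values and a hypothetical function name; it returns R^2(t) as a proportion rather than a percentage and does not compute the confidence intervals.

```python
def fit_functional_linear_model(curves, x):
    """Pointwise least-squares fit of y(t) = a(t) + b(t) x for x in {0, 1}.

    Returns (a, b, r2), each a list with one value per grid point.
    """
    group0 = [c for c, xi in zip(curves, x) if xi == 0]
    group1 = [c for c, xi in zip(curves, x) if xi == 1]
    m = len(curves[0])
    mean0 = [sum(c[j] for c in group0) / len(group0) for j in range(m)]
    mean1 = [sum(c[j] for c in group1) / len(group1) for j in range(m)]
    a = mean0                                     # intercept curve a(t)
    b = [m1 - m0 for m0, m1 in zip(mean0, mean1)]  # effect curve b(t)
    r2 = []
    for j in range(m):
        grand = sum(c[j] for c in curves) / len(curves)
        sst = sum((c[j] - grand) ** 2 for c in curves)
        sse = sum((c[j] - (a[j] + b[j] * xi)) ** 2
                  for c, xi in zip(curves, x))
        r2.append(1.0 - sse / sst if sst > 0 else 0.0)
    return a, b, r2


# Made-up curves on a 3-point grid: x = 0 for diphthong, x = 1 for hiatus.
curves = [
    [1.0, 2.0, 3.0],
    [3.0, 2.0, 3.0],
    [2.0, 4.0, 7.0],
    [4.0, 4.0, 9.0],
]
x = [0, 0, 1, 1]
a, b, r2 = fit_functional_linear_model(curves, x)
```

Plotting b(t) against time shows where along the curve the two categories differ, and R^2(t) shows how much of the variation at each time point the category accounts for.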

