Xkl: A Tool For Speech Analysis Eric Truslow Adviser: Helen Hanson

Post on 17-Dec-2015

Xkl: A Tool For Speech Analysis

Eric Truslow
Adviser: Helen Hanson

Outline

• Introduction to speech analysis
  – Production mechanism
  – Models of speech production
• Background about Xkl
• Design
  – Pitch Detection
  – Labeling
  – Portability

• Future Work

Speech Production

Figure: Periodic Source → Vocal Tract Frequency Response → Output. Nasal cavities contribute too.

Speech Model: Basic

Block diagram:
• Pitch Period → Impulse Train Generator → Glottal Pulse Model → Gain (X)
• Random Noise Generator → Gain (X)
• Voiced/Unvoiced Decision selects between the two sources
• Vocal Tract Parameters → Vocal Tract Model
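The source side of the basic model can be sketched in a few lines of Python. This is a minimal illustration, not Xkl code: the function names, the sample rate, and the filter coefficient are assumptions, the glottal pulse model is omitted, and the vocal tract is stood in for by a generic all-pole filter.

```python
import random

def excitation(n_samples, sample_rate, voiced, pitch_hz=100.0, gain=1.0, seed=0):
    """Source signal: an impulse train if voiced, otherwise white noise."""
    if voiced:
        period = max(1, round(sample_rate / pitch_hz))  # pitch period in samples
        return [gain if i % period == 0 else 0.0 for i in range(n_samples)]
    rng = random.Random(seed)
    return [gain * rng.uniform(-1.0, 1.0) for _ in range(n_samples)]

def vocal_tract(source, coeffs):
    """Stand-in vocal tract model: y[n] = x[n] - sum_k coeffs[k-1] * y[n-k]."""
    out = []
    for n, x in enumerate(source):
        y = x
        for k, a in enumerate(coeffs, start=1):
            if n - k >= 0:
                y -= a * out[n - k]
        out.append(y)
    return out

# Voiced/unvoiced decision made by hand here; coefficient -0.9 is illustrative only.
voiced_source = excitation(400, 8000, voiced=True, pitch_hz=100.0)
noise_source = excitation(400, 8000, voiced=False)
speech = vocal_tract(voiced_source, coeffs=[-0.9])
```

Switching the `voiced` flag is the only change needed to move between the periodic and the noise branch, which mirrors the single decision block in the diagram.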

Speech Model: Klatt

Parameters

• Source characterization
  – Voiced or unvoiced
  – Frequency of periodic source
  – Energy distribution of a noise source
• Vocal tract model
  – Resonant frequencies (formants), antiresonant frequencies, and bandwidths
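To make the vocal tract parameters concrete, the sketch below turns one formant frequency and bandwidth into a second-order digital resonator, using the coefficient formulas from Klatt's 1980 formant synthesizer description. The function names and the example values (500 Hz, 60 Hz, 10 kHz) are assumptions for illustration; antiresonances are not shown.

```python
import math

def resonator_coeffs(formant_hz, bandwidth_hz, sample_rate):
    """Coefficients (a, b, c) of y[n] = a*x[n] + b*y[n-1] + c*y[n-2]."""
    t = 1.0 / sample_rate
    c = -math.exp(-2.0 * math.pi * bandwidth_hz * t)
    b = 2.0 * math.exp(-math.pi * bandwidth_hz * t) * math.cos(2.0 * math.pi * formant_hz * t)
    a = 1.0 - b - c  # normalizes the gain at 0 Hz to 1
    return a, b, c

def resonate(x, coeffs):
    """Run the signal x through one resonator."""
    a, b, c = coeffs
    y1 = y2 = 0.0  # previous two output samples
    out = []
    for sample in x:
        y = a * sample + b * y1 + c * y2
        out.append(y)
        y2, y1 = y1, y
    return out

coeffs = resonator_coeffs(500.0, 60.0, 10000.0)
impulse_response = resonate([1.0] + [0.0] * 99, coeffs)
```

A narrower bandwidth moves `c` closer to -1, so the impulse response rings longer; cascading several such resonators gives one formant each.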

Background - Xkl

• Developed in-house at MIT by Dennis Klatt in the 1980s; originally a command-line tool on VAX systems.

• Later ported to UNIX, with an X11/Motif GUI added.

• Currently runs on Linux.

• Praat has become a very versatile alternative to Xkl, but Xkl has functionality that Praat does not.

Xkl – Features

• Allows users to easily examine speech signals in fine detail.

• Automatically computes DFT and spectrogram.

• Can perform a variety of computations not available in other packages.

• Averages spectra over time or waveforms
• Smooth spectrum
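The computations listed here can be illustrated with a small pure-Python sketch: a DFT magnitude spectrum, an average of spectra across frames, and a smoothed spectrum. The slides do not say how Xkl smooths its spectra, so the moving average below is an assumption, as are all function names.

```python
import cmath
import math

def dft_magnitude(frame):
    """Magnitude of DFT bins 0..N/2 of a real-valued frame (direct O(N^2) sum)."""
    n = len(frame)
    return [abs(sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
            for k in range(n // 2 + 1)]

def average_spectra(frames):
    """Bin-by-bin mean of the magnitude spectra of equal-length frames."""
    spectra = [dft_magnitude(f) for f in frames]
    return [sum(col) / len(spectra) for col in zip(*spectra)]

def smooth(spectrum, width=3):
    """Moving average over `width` bins (shorter at the edges)."""
    half = width // 2
    out = []
    for k in range(len(spectrum)):
        window = spectrum[max(0, k - half): k + half + 1]
        out.append(sum(window) / len(window))
    return out

# A 64-sample cosine that completes exactly 8 cycles, so its energy sits in bin 8.
frame = [math.cos(2 * math.pi * 8 * t / 64) for t in range(64)]
mags = dft_magnitude(frame)
averaged = average_spectra([frame, frame])
smoothed = smooth(mags)
```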

Spectrogram and DFT in Xkl

Figure panels: Spectrogram; DFT and smoothed spectrum

Design Requirements

Users surveyed wanted:
1. Pitch period estimator
2. An improved labeling system
3. Portability
   1. Compatibility with multiple operating systems
   2. Support for more audio file formats

Pitch Detection

• Pitch reflects how rapidly the vocal tract is excited with periodic pulses.
• Carries lexical and prosodic information.
• During computation we must decide whether speech is voiced or unvoiced.
  – Errors in computation often occur during transitions between sounds.
  – Errors depend on the type of pitch detector being used.
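As a rough illustration of the voiced/unvoiced decision, the sketch below estimates pitch for one frame from the peak of its normalized autocorrelation and reports the frame as unvoiced when that peak is weak. It is a simplified stand-in, not Praat's or Xkl's detector: the search range, the threshold of 0.45, and the function name are assumptions, and no window correction is applied.

```python
import math
import random

def estimate_pitch(frame, sample_rate, min_hz=75.0, max_hz=500.0, voicing_threshold=0.45):
    """Return the pitch in Hz, or None if the frame is judged unvoiced."""
    energy = sum(s * s for s in frame)
    if energy == 0.0:
        return None  # silence
    min_lag = int(sample_rate / max_hz)
    max_lag = min(int(sample_rate / min_hz), len(frame) - 1)
    best_lag, best_r = None, 0.0
    for lag in range(min_lag, max_lag + 1):
        # Autocorrelation at this lag, normalized by the lag-0 value (the energy).
        r = sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag)) / energy
        if r > best_r:
            best_lag, best_r = lag, r
    if best_lag is None or best_r < voicing_threshold:
        return None  # no strong periodicity: treat as unvoiced
    return sample_rate / best_lag

sample_rate = 8000
tone = [math.sin(2 * math.pi * 100 * i / sample_rate) for i in range(800)]
rng = random.Random(1)
noise = [rng.uniform(-1.0, 1.0) for _ in range(800)]

tone_pitch = estimate_pitch(tone, sample_rate)
noise_pitch = estimate_pitch(noise, sample_rate)
silence_pitch = estimate_pitch([0.0] * 800, sample_rate)
```

A frame that straddles a transition between sounds contains part of each, which weakens the autocorrelation peak and is one way the errors mentioned above arise.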

Pitch Detection: Design

• There are many different pitch detectors

• Praat's was chosen because it
  – Outperforms other detectors (SNR, HNR)
  – Is readily available

Pitch Detection: Algorithm

Figure: Praat Pitch Detector flow (panel label: "Tone 4"):
Remove Hanning Window Sidelobe → Compute Global Peak Value → Process Frame To Obtain Local Optimal Choices → Find Path with Globally Minimum Cost

• Time domain, autocorrelation method
• Frame processing determines strongest pitch candidates, including unvoiced.
• Viterbi algorithm minimizes global cost from candidates.
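The path-finding step can be sketched as a small Viterbi search over per-frame candidates. This is a simplified illustration, not Praat's implementation: each candidate is assumed to be a `(pitch_hz_or_None, strength)` pair with `None` meaning unvoiced, and the two cost weights are illustrative values.

```python
import math

def best_path(frames, octave_jump_cost=0.35, voicing_change_cost=0.14):
    """Pick one candidate per frame so that the total cost is minimal.

    frames: list of candidate lists; a candidate is (pitch_hz or None, strength).
    Returns the chosen pitch per frame (None = unvoiced).
    """
    def transition(prev, cur):
        if prev is None and cur is None:
            return 0.0
        if prev is None or cur is None:
            return voicing_change_cost
        return octave_jump_cost * abs(math.log2(prev / cur))

    # cost[j]: cheapest total cost of any path ending at candidate j of the current frame.
    cost = [-strength for _, strength in frames[0]]
    back = []  # back[t-1][j]: best predecessor index for candidate j of frame t
    for t in range(1, len(frames)):
        new_cost, pointers = [], []
        for pitch, strength in frames[t]:
            options = [cost[i] + transition(frames[t - 1][i][0], pitch)
                       for i in range(len(frames[t - 1]))]
            i_best = min(range(len(options)), key=options.__getitem__)
            new_cost.append(options[i_best] - strength)
            pointers.append(i_best)
        cost = new_cost
        back.append(pointers)

    # Trace back from the cheapest final candidate.
    j = min(range(len(cost)), key=cost.__getitem__)
    path = [j]
    for pointers in reversed(back):
        j = pointers[j]
        path.append(j)
    path.reverse()
    return [frames[t][j][0] for t, j in enumerate(path)]

# The middle frame's strongest candidate is an octave error (200 Hz).
candidates = [
    [(100.0, 0.9), (None, 0.3)],
    [(200.0, 0.85), (100.0, 0.8), (None, 0.3)],
    [(100.0, 0.9), (None, 0.3)],
]
contour = best_path(candidates)
```

Choosing the strongest candidate in each frame independently would select 200 Hz in the middle frame; the transition cost makes the smooth 100 Hz path cheaper overall.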

Labeling

• Support for reading and saving TextGrid files, for interaction with Praat [1].

  – Tiers for grouping labels
• Want labels to be displayed in same window as waveform
  – Different from Xkl's separated window layout
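For orientation, a TextGrid groups labels into named tiers, and each interval label has a start time, an end time, and text. The sketch below writes interval tiers in what is assumed to be Praat's long text layout; the function name and the example labels are hypothetical, and the output should be checked against a file saved by Praat before relying on it.

```python
def textgrid_text(xmin, xmax, tiers):
    """Build TextGrid text for interval tiers.

    tiers: list of (tier_name, [(start, end, label), ...]).
    The intervals of each tier are expected to cover xmin..xmax without gaps.
    """
    def quote(s):
        # Double quotes inside a label are written as two double quotes.
        return '"' + s.replace('"', '""') + '"'

    lines = ['File type = "ooTextFile"', 'Object class = "TextGrid"', '',
             f'xmin = {xmin}', f'xmax = {xmax}', 'tiers? <exists>',
             f'size = {len(tiers)}', 'item []:']
    for i, (name, intervals) in enumerate(tiers, start=1):
        lines += [f'    item [{i}]:',
                  '        class = "IntervalTier"',
                  f'        name = {quote(name)}',
                  f'        xmin = {xmin}',
                  f'        xmax = {xmax}',
                  f'        intervals: size = {len(intervals)}']
        for j, (start, end, label) in enumerate(intervals, start=1):
            lines += [f'        intervals [{j}]:',
                      f'            xmin = {start}',
                      f'            xmax = {end}',
                      f'            text = {quote(label)}']
    return '\n'.join(lines) + '\n'

grid = textgrid_text(0, 1.5, [
    ("words", [(0, 0.6, "hello"), (0.6, 1.5, 'say "hi"')]),
])
```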

Labeling

Portability

• PortAudio
  – a cross-platform audio library
  – supports most operating systems
  – simplifies software maintenance

• Runs on OS X
  – Since it natively runs X11

• Added support to open Microsoft .wav files.
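Reading a .wav file comes down to parsing the header for the sample rate, channel count, and sample width, then unpacking the samples. The sketch below shows the idea with Python's standard `wave` module; it is not how Xkl does it, it handles 16-bit PCM only, and the function name is an assumption.

```python
import io
import struct
import wave

def read_wav(file):
    """Read a 16-bit PCM WAV file or file-like object.

    Returns (sample_rate, samples), where samples are the first channel
    scaled to the range [-1, 1).
    """
    with wave.open(file, "rb") as w:
        if w.getsampwidth() != 2:
            raise ValueError("this sketch handles 16-bit PCM only")
        n_channels = w.getnchannels()
        rate = w.getframerate()
        raw = w.readframes(w.getnframes())
    values = struct.unpack("<%dh" % (len(raw) // 2), raw)  # little-endian int16
    return rate, [v / 32768.0 for v in values[::n_channels]]

# Round trip through an in-memory file so the example needs no file on disk.
buf = io.BytesIO()
with wave.open(buf, "wb") as w:
    w.setnchannels(1)
    w.setsampwidth(2)
    w.setframerate(16000)
    w.writeframes(struct.pack("<4h", 0, 16384, -16384, 32767))
buf.seek(0)
rate, samples = read_wav(buf)
```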

Outline

• Introduction to speech analysis
  – Production mechanism
  – Models of speech production
• Background about Xkl
• Design
  – Requirements
  – Alternatives
  – Final Design

• Future Work

Future Work

• Deploy to users for feedback
• Finalize
  – Labeling
  – Pitch Contour

• Fix bugs and add small features

Software Used

Eclipse – Integrated Development Environment.

Doxygen – A documentation generation system.

SVN – A version control system.

Open Motif – X Window System widget library and window manager.

GDB – The GNU debugger.

GNU build system on OS X.

PortAudio – A multiplatform audio library.

Thank you for your attention.

Special thanks to:
• Professor Helen Hanson
• Dr. Stefanie Shattuck-Hufnagel (MIT)
• Dennis H. Klatt
• Survey Participants
• ECE Department

Questions?

References

1: Paul Boersma & David Weenink (2009): Praat: doing phonetics by computer (Version 5.1.05) [Computer program]. Retrieved May 1, 2009, from http://www.praat.org/

2: Paul Boersma, Accurate Short-term analysis of the Fundamental Frequency and the Harmonics-to-Noise Ratio of a Sampled Sound, 1993, http://www.fon.hum.uva.nl/paul/papers/Proceedings_1993.pdf