Speaker Discrimination: The Challenge of Conversational Data

Dissertation Proposal

Uchechukwu O. Ofoegbu

Speech Processing Lab

Sept. 29, 2006

Dissertation Committee

Advisor: Robert Yantorno, Ph.D.

Members: Dennis Silage, Ph.D.; Brian Butz, Ph.D.; Iyad Obeid, Ph.D.; Eugene Kwatny, Ph.D.

Presentation Outline

• Problem Statement and Research Goal

• Scope of Research

• Distance Analysis

• Feature Analysis

• Data Analysis

• Application Systems

• Fusion of Distances

• Proposal Summary

Problem Statement and Research Goal

Conventional Speaker Recognition

[Block diagram]

Reference Speech → Feature Extraction → Model Building → Comparison

Test Speech → Feature Extraction → Comparison → Recognition Decision → System Output

• Speaker Identification
– Who is this speaker?

• Speaker Verification
– Is he who he claims to be?

Conversation Segmentation

• Broadcast News/Conference Data

• Conversational Data

[Figure: two speech waveform plots; x-axis: Time (seconds), y-axis: Amplitude]

Problems with Conversational Data

• No a priori information available from participating speakers
– Training is impossible

• No a priori knowledge of change points

• Speakers alternate very rapidly
– Limited amounts of data for single-speaker representations

• Distortion
– Channel noise, co-channel data

Proposed Solutions

1. Selective creation of data models

2. Development of an “optimal” distance measure

3. Decision level fusion of distance measures

4. Development of application-specific system

Scope of Research

Criminal Activity Detection

• Monitoring inmate conversations
– Prevention of 3-way calls
– Notification of suspicious contacts
– Enhancement of keyword detection
– Uncooperative data collection

• Forensics
– Voiceprints

Commercial Services

• Automated Customer Services
– Personalized contact with customers

• Search/Retrieval of Audio Data

Homeland Security

• Military Activities
– Pilot-control tower communications
– Detection of unidentified speakers on pilot radio channels

• Terrorist Identification

Distance Analysis

Distance Measures

• Univariate vs. Multivariate Analysis

[Figure labels: Difference between means; Standard Deviation]

Distance Measures

• Notation
– Random variables being compared:

X = [X1, X2, …, Xp]: nx by p matrix

Y = [Y1, Y2, …, Yp]: ny by p matrix

• Properties
– Q(X, Y) ≥ 0
– Q(X, Y) = 0 iff X = Y
– Q(X, Y) = Q(Y, X)
– Q(X, Y) ≤ Q(X, Z) + Q(Z, Y)

Distance Measures

• Mahalanobis Distance

Q_{MAHALANOBIS}(X, Y) = (\mu_x - \mu_y)^T \Sigma^{-1} (\mu_x - \mu_y)

\Sigma = combined covariance matrix of X and Y

• Hotelling’s T-Square Statistic

Q_{TSQ}(X, Y) = \frac{n_x n_y}{n_x + n_y} \sum_{i=1}^{p} \sum_{k=1}^{p} (\mu_{xi} - \mu_{yi}) C^{ik} (\mu_{xk} - \mu_{yk})

C = \frac{(n_x - 1)\Sigma_x + (n_y - 1)\Sigma_y}{n_x + n_y - 2}

C^{ik} = element in the ith row and kth column of the inverse of C
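The two measures on this slide can be sketched in plain Python as follows. This is only an illustration: it assumes that the "combined covariance matrix" is the pooled covariance C from the T-Square definition, and the helper names (`mean_vector`, `covariance`, `solve`, `pooled_covariance`, `mahalanobis`, `t_square`) are hypothetical, not taken from the proposal.

```python
def mean_vector(rows):
    """Column-wise mean of an n-by-p list of feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def covariance(rows):
    """Unbiased sample covariance matrix (divides by n - 1)."""
    n = len(rows)
    mu = mean_vector(rows)
    p = len(mu)
    return [[sum((r[i] - mu[i]) * (r[k] - mu[k]) for r in rows) / (n - 1)
             for k in range(p)] for i in range(p)]

def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def pooled_covariance(x, y):
    """C = ((nx - 1) Sx + (ny - 1) Sy) / (nx + ny - 2)."""
    nx, ny = len(x), len(y)
    cx, cy = covariance(x), covariance(y)
    p = len(cx)
    return [[((nx - 1) * cx[i][k] + (ny - 1) * cy[i][k]) / (nx + ny - 2)
             for k in range(p)] for i in range(p)]

def mahalanobis(x, y):
    """(mu_x - mu_y)^T C^{-1} (mu_x - mu_y); ASSUMES Sigma is the pooled C."""
    d = [a - b for a, b in zip(mean_vector(x), mean_vector(y))]
    z = solve(pooled_covariance(x, y), d)  # z = C^{-1} d
    return sum(di * zi for di, zi in zip(d, z))

def t_square(x, y):
    """Hotelling's T-Square: the same quadratic form scaled by nx*ny/(nx+ny)."""
    nx, ny = len(x), len(y)
    return nx * ny / (nx + ny) * mahalanobis(x, y)

# Toy data: Y is X shifted by 2 along the first dimension.
X = [[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [2.0, 2.0]]
Y = [[2.0, 0.0], [4.0, 0.0], [2.0, 2.0], [4.0, 2.0]]
print(mahalanobis(X, Y), t_square(X, Y))
```

Solving the linear system instead of forming the inverse explicitly keeps the sketch short; under the pooled-covariance assumption the T-Square value differs from the Mahalanobis value only by the sample-size factor.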

Distance Measures

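• Kullback-Leibler (KL) Distance

Q_{KULLBACK}(X, Y) = \frac{1}{2} (\mu_x - \mu_y)^T (\Sigma_x^{-1} + \Sigma_y^{-1}) (\mu_x - \mu_y) + \frac{1}{2} \mathrm{tr}(\Sigma_x^{-1} \Sigma_y + \Sigma_y^{-1} \Sigma_x - 2I)

• Bhattacharyya Distance

Q_{BHATTACHARYYA}(X, Y) = \frac{1}{4} (\mu_x - \mu_y)^T (\Sigma_x + \Sigma_y)^{-1} (\mu_x - \mu_y) + \frac{1}{2} \log \frac{|(\Sigma_x + \Sigma_y)/2|}{\sqrt{|\Sigma_x| |\Sigma_y|}}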
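As a rough illustration of the two Gaussian-based distances on this slide, the sketch below evaluates them in the univariate case (p = 1), where each covariance matrix reduces to a single variance. Restricting to one dimension is a simplifying assumption made here to avoid matrix algebra, and the function names are hypothetical.

```python
import math

def kl_symmetric_1d(mu_x, var_x, mu_y, var_y):
    """Symmetric KL distance between two 1-D Gaussians."""
    d2 = (mu_x - mu_y) ** 2
    mean_term = 0.5 * d2 * (1.0 / var_x + 1.0 / var_y)
    cov_term = 0.5 * (var_y / var_x + var_x / var_y - 2.0)
    return mean_term + cov_term

def bhattacharyya_1d(mu_x, var_x, mu_y, var_y):
    """Bhattacharyya distance between two 1-D Gaussians."""
    d2 = (mu_x - mu_y) ** 2
    mean_term = 0.25 * d2 / (var_x + var_y)
    cov_term = 0.5 * math.log((var_x + var_y) / (2.0 * math.sqrt(var_x * var_y)))
    return mean_term + cov_term

# Two unit-variance Gaussians whose means are 2 apart.
print(kl_symmetric_1d(0.0, 1.0, 2.0, 1.0))   # mean term only, since variances match
print(bhattacharyya_1d(0.0, 1.0, 2.0, 1.0))
```

Both functions return zero for identical distributions; unlike the Mahalanobis distance, both also grow when only the variances differ.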

Distance Measures

• Levene’s Test

– Derived from T-Square statistics as follows:

• Each set of feature vectors is transformed, dimension by dimension, into its absolute deviation from the mean vector

• The T-Square Statistic is then applied on the transformed features.
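The transformation step described above might look like the following minimal sketch. The function name `absolute_deviations` is hypothetical, and the subsequent T-Square computation on the transformed sets is not repeated here.

```python
def absolute_deviations(rows):
    """Replace each feature value by its absolute deviation from the column mean."""
    n = len(rows)
    mu = [sum(col) / n for col in zip(*rows)]
    return [[abs(v - m) for v, m in zip(row, mu)] for row in rows]

# Two 2-dimensional feature vectors with mean vector (2, 12).
features = [[1.0, 10.0], [3.0, 14.0]]
print(absolute_deviations(features))
```

Because the transformed values measure spread around the mean, applying a mean-comparison statistic such as T-Square to them turns it into a comparison of the variability of the two sets.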

Procedural Set-up

Intra-speaker distance computations

• HTIMIT database used

• Average Utterance Length = 5 seconds

[Flow diagram] 10 utterances from Speaker A → Randomly Select 2 Utterances (Utterance 1, Utterance 2) → Window Data → Compute 14th Order LPCC → Compute Distance

Procedural Set-up

Inter-speaker, different utterances distance computations

[Flow diagram] Speaker A, Speaker B → Randomly Select Utterance (one per speaker) → Window Data → Compute 14th Order LPCC → Compute Distance

Analysis of Distance Measures

• Mahalanobis Distance – Gaussian Estimate

[Figure: Mahalanobis Distance Comparisons; x-axis: Distance, y-axis: Probability of Occurrence; curves: SSDU, DSDU]

Analysis of Distance Measures

• Levene’s Test – Gamma Estimate

[Figure: Levene's Test Comparisons, LPCC-based Distances; y-axis: Probability of Occurrence]

Feature Analysis

Cepstral Analysis

Frequency Analysis of Speech

STFT of Speech = Excitation Component (fast varying harmonics) × Vocal Tract Component (slowly varying formants)

Log of STFT = Log of Excitation + Log of Vocal Tract Component

IDFT of Log of STFT = Vocal tract + Excitation
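The chain on this slide (transform, take the log, inverse transform) can be sketched for a single frame as below. This is a minimal illustration using a naive DFT from the standard library; the function names and the small floor that guards the logarithm are assumptions of this sketch, not details from the proposal.

```python
import cmath
import math

def dft(x, inverse=False):
    """Naive O(N^2) discrete Fourier transform, or its inverse."""
    n_pts = len(x)
    sign = 1.0 if inverse else -1.0
    out = []
    for k in range(n_pts):
        acc = sum(x[n] * cmath.exp(sign * 2j * math.pi * k * n / n_pts)
                  for n in range(n_pts))
        out.append(acc / n_pts if inverse else acc)
    return out

def real_cepstrum(frame, floor=1e-12):
    """IDFT of the log magnitude spectrum of one frame."""
    spectrum = dft(frame)
    # The log turns the product of excitation and vocal tract spectra into a sum.
    log_mag = [math.log(max(abs(v), floor)) for v in spectrum]
    return [v.real for v in dft(log_mag, inverse=True)]

# A scaled impulse has a flat spectrum, so only the first cepstral value is non-zero.
print(real_cepstrum([2.0, 0.0, 0.0, 0.0]))
```

In the cepstral domain the slowly varying vocal tract contribution concentrates in the low-order coefficients and the excitation in the higher ones, which is why the low-order coefficients are typically kept as features.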

Cepstral Features

• Linear Predictive Cepstral Coefficients

– Obtained Recursively from LPC Coefficients

• Mel-Scale Frequency Cepstral Coefficients

– Nonlinear warping of frequency axis to model the human auditory system
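For the LPCC bullet above, one commonly used recursion is sketched here. It assumes the all-pole model H(z) = 1 / (1 - sum_k a_k z^-k) and ignores the gain term; other sign conventions for the LPC coefficients change the signs in the recursion, and the function name is hypothetical.

```python
def lpc_to_lpcc(a, n_ceps):
    """Convert LPC coefficients a[0..p-1] (a_1..a_p) to n_ceps cepstral coefficients.

    ASSUMED convention: H(z) = 1 / (1 - sum_k a_k z^-k), gain term omitted.
    """
    p = len(a)
    c = []
    for n in range(1, n_ceps + 1):
        acc = a[n - 1] if n <= p else 0.0
        for k in range(1, n):
            if 1 <= n - k <= p:
                # c_n += (k / n) * c_k * a_{n-k}
                acc += (k / n) * c[k - 1] * a[n - k - 1]
        c.append(acc)
    return c

# For a single-pole model the cepstrum is a^n / n.
print(lpc_to_lpcc([0.5], 3))
```

The recursion needs no Fourier transform, which is one reason LPCC features are inexpensive to compute once the LPC coefficients are available.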

Cepstral Features

• Delta Cepstral Coefficients

– First and second derivatives of cepstral coefficients

– Reflect dynamic information

– Used as a supplement to the original cepstral features
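The slide does not say how the derivatives are estimated. The sketch below uses the regression formula that is common in speech front ends, with edge frames repeated at the boundaries; both choices, the window width, and the function name are assumptions made for illustration.

```python
def delta(frames, width=2):
    """Regression-based delta of a sequence of feature vectors.

    d_t = sum_{n=1..width} n * (c_{t+n} - c_{t-n}) / (2 * sum_{n=1..width} n^2)
    Frames beyond either end are replaced by the nearest edge frame.
    """
    t_max = len(frames) - 1
    denom = 2 * sum(n * n for n in range(1, width + 1))
    out = []
    for t in range(len(frames)):
        row = []
        for j in range(len(frames[t])):
            num = 0.0
            for n in range(1, width + 1):
                ahead = frames[min(t + n, t_max)][j]
                behind = frames[max(t - n, 0)][j]
                num += n * (ahead - behind)
            row.append(num / denom)
        out.append(row)
    return out

# A coefficient that rises by 1 per frame has a delta of 1 away from the edges.
cepstra = [[0.0], [1.0], [2.0], [3.0], [4.0]]
deltas = delta(cepstra)
delta_deltas = delta(deltas)  # second derivative: apply the same operator again
print(deltas[2], delta_deltas[2])
```

Appending `deltas` and `delta_deltas` to each original vector is the usual way such coefficients supplement the static features.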

Analysis of Cepstral Features

• Mahalanobis Distance

[Figure: Mahalanobis Distance Comparisons; panels: LPCC-based Distances, MFCC-based Distances; y-axis: Probability of Occurrence; curves: Intra, Inter]

Analysis of Cepstral Features

• Levene’s Test

[Figure: Levene's Test Comparisons; panels: LPCC-based Distances, MFCC-based Distances; y-axis: Probability of Occurrence; curves: Intra, Inter]

Feature Combination

• Proposed Investigation

– What’s the best feature combination?

– Will the delta and delta-delta coefficients contribute to the speaker-differentiating ability of the features?

Feature Combination Analysis

• T-test Based Evaluation

– Why?

• Robust to departures from the Gaussian distribution, especially for large data sizes and when the two samples being compared are of approximately equal size.

• Unaffected by differences in the variances of the compared variables

Data Analysis

Traditional Speaker Modeling

• Examples
– Gaussian Mixture Models
– Hidden Markov Models
– Neural Networks
– Prosody-Based Models

• Disadvantages
– Require large amounts of data
– Sometimes require a training procedure
– Relatively complex

Conversational Data Modeling

• Current Method
– Equal segmentation of data
– Indiscriminate use of data
– Poor performance

• Problems
– Change points unknown
– Not all speech is useful

Proposed Speaker Modeling

[Diagram: the speech stream is labeled frame by frame as S, V, or U; the V (voiced) portions are grouped into SEGMENT 1 … SEGMENT M; each segment → FEATURE COMPUTATION → MEAN AND COVARIANCE MATRIX COMPUTATION → MODEL 1 … MODEL M]
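The modeling pipeline in this diagram can be sketched as follows, assuming the voiced phonemes have already been detected and converted to per-frame feature vectors. The data layout (a list of phonemes, each a list of frame vectors), the handling of leftover phonemes, and the function names are assumptions of this sketch.

```python
def mean_and_covariance(rows):
    """Mean vector and unbiased covariance matrix of a list of feature vectors."""
    n = len(rows)
    mu = [sum(col) / n for col in zip(*rows)]
    p = len(mu)
    cov = [[sum((r[i] - mu[i]) * (r[k] - mu[k]) for r in rows) / (n - 1)
            for k in range(p)] for i in range(p)]
    return mu, cov

def build_models(voiced_phonemes, n_per_model):
    """Group every n_per_model consecutive voiced phonemes into one segment model.

    Each model is a (mean, covariance) pair; leftover phonemes that do not
    fill a complete segment are dropped (an ASSUMPTION of this sketch).
    """
    models = []
    last_start = len(voiced_phonemes) - n_per_model
    for start in range(0, last_start + 1, n_per_model):
        group = voiced_phonemes[start:start + n_per_model]
        frames = [frame for phoneme in group for frame in phoneme]
        models.append(mean_and_covariance(frames))
    return models

# Three toy "phonemes" of two 2-D frames each, grouped two phonemes per model.
phonemes = [
    [[0.0, 0.0], [2.0, 2.0]],
    [[0.0, 2.0], [2.0, 0.0]],
    [[1.0, 1.0], [3.0, 3.0]],
]
models = build_models(phonemes, n_per_model=2)
print(len(models), models[0][0])
```

Each resulting (mean, covariance) pair is exactly the input that the distance measures from the Distance Analysis section operate on.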

Proposed Speaker Modeling

• Why voiced only?
– Same speech class compared
– Contains the most information

• What’s the appropriate number of phonemes?
– Large enough to sufficiently represent speakers
– Small enough to avoid speaker overlap

Modeling Analysis

[Figure: Distribution of Mahalanobis Distance - Utterance Based; x-axis: Distance Value, y-axis: Probability; curves: Same Speaker, Different Speaker]

N = 20 – 4 seconds of voiced speech

Modeling Analysis

[Figure: Speaker Differentiation with Respect to Data Size; x-axis: Number of segments, y-axis: Mahalanobis distance; curves: Same Speaker, Different Speaker]

Modeling Analysis

[Figure: Distributions of Mahalanobis Distance - Segment Based; x-axis: Distance value, y-axis: Probability; curves: Same Speaker, Different Speaker]

N = 5 – 1 second of voiced speech

Application Systems

Unsupervised Speaker Indexing

• The Restrained-Relative Minimum Distance (RRMD) Approach

[Distance matrix between segment models, from which the REFERENCE MODELS are taken]

0 D1,2 D1,3 …

D2,1 0 D2,3 …

D3,1 D3,2 0 …

…

Unsupervised Speaker Indexing

• The Restrained-Relative Minimum Distance (RRMD) Approach

[Flowchart: Observe distance to Reference 1 and Reference 2 → Min. Distance → Same Speaker? → Restraining Condition; Passed → Same Speaker; Failed → Relative Distance Condition; Passed → Same Speaker; Failed → Unusable Data]

RRMD Approach

• Restraining Condition

– Distance Likelihood Ratio

DLR = \frac{f(x \mid \mu_1, \sigma_1)}{f(x \mid \mu_2, \sigma_2)}

DLR > 1 → Same Speaker

DLR < 1 → Check Relative Distance Condition
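A minimal sketch of the restraining condition is given below. It assumes that both densities are Gaussian, that the numerator models same-speaker distances and the denominator different-speaker distances, and that the parameters have been estimated beforehand; the parameter values and function names are purely illustrative.

```python
import math

def gaussian_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def distance_likelihood_ratio(x, same_params, diff_params):
    """DLR = f(x | same-speaker parameters) / f(x | different-speaker parameters)."""
    return gaussian_pdf(x, *same_params) / gaussian_pdf(x, *diff_params)

def restraining_condition(x, same_params, diff_params):
    """Apply the DLR rule to one observed distance x."""
    if distance_likelihood_ratio(x, same_params, diff_params) > 1.0:
        return "same speaker"
    return "check relative distance condition"

# Illustrative (made-up) parameters: (mean, standard deviation) of each distance pdf.
SAME = (1.0, 0.5)
DIFF = (3.0, 0.5)
print(restraining_condition(1.5, SAME, DIFF))
print(restraining_condition(2.5, SAME, DIFF))
```

With equal standard deviations, as in this toy setting, the rule reduces to asking which of the two means the observed distance is closer to.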

RRMD Approach

• Relative Distance Condition

– Relative Distance:

Drel = dmax – dmin

– Drel > threshold → Same Speaker
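The relative distance condition might be coded as in the sketch below. It assumes that dmin and dmax are the smaller and larger of a model's distances to the two reference models, that a passing model is assigned to the closer reference, and that a failing model is treated as unusable; the function name and return labels are hypothetical.

```python
def relative_distance_decision(d_ref1, d_ref2, threshold):
    """Assign a model to the closer reference only if the two distances differ enough."""
    d_min = min(d_ref1, d_ref2)
    d_max = max(d_ref1, d_ref2)
    d_rel = d_max - d_min
    if d_rel > threshold:
        # Clear winner: same speaker as the closer reference (ASSUMED assignment rule).
        return "reference 1" if d_ref1 <= d_ref2 else "reference 2"
    return "unusable"

print(relative_distance_decision(20.0, 150.0, threshold=60.0))
print(relative_distance_decision(90.0, 110.0, threshold=60.0))
```

Raising the threshold makes the rule more conservative: fewer models are assigned to a speaker and more are set aside as unusable.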

[Figure: a model's distances dmin and dmax to Reference 1 and Reference 2]

Preliminary Results

• Experiments

– 245 telephone conversations from the SWITCHBOARD database, with an average length of 400 seconds.

– T-Square statistics used

– Ground truth obtained from Mississippi State Transcriptions

Preliminary Results

• Best N Estimation

[Figure: Average Indexing Accuracy With Respect to Number of Voiced Phonemes Per Model; x-axis: N = Number of Voiced Phonemes, y-axis: Accuracy; marked point: N = 5]

Preliminary Results

• RRMD Experiments

– Drel varied from 0 to 200

– Two errors defined

• Indexing Error: Ierr = 100 – Accuracy

• Undecided Error: $U_{Err} = \frac{N_u - N_c}{N_T} \times 100\%$

Nu = number of detected undecided/unusable samples,

Nc = number labeled as co-channel data

Preliminary Results

[Figure: Classification Error with Respect to Relative Distance Threshold. x-axis: Threshold; y-axis: Percent Error; curves: Indexing Error, Undecided Error; annotation: Equal error]

Speaker Count System

• The Residual Ratio Algorithm (RRA)

• Process is repeated K-1 times for counting up to K speakers

[Diagram: Reference Model Selected Randomly → DLR-based Model Comparison → Reference Model Selected Randomly → DLR-based Model Comparison → . . . ; side branch: "Too little data removed, select another model"]

RRA Examples – 2 Speakers

[Figure: How the Residual Ratio Algorithm Works for Two-Speaker Conversations. Panels: Initial Speech, Round 2 Residual, Round 3 Residual; x-axis: Time (Seconds); segments labeled Speaker 1 and Speaker 2]

RRA Examples – 3 Speakers

[Figure: How the Residual Ratio Algorithm Works for Three-Speaker Conversations. Panels: Initial Speech, Round 2 Residual, Round 3 Residual; x-axis: Time (Seconds); segments labeled Speaker 1, Speaker 2 and Speaker 3]

Comparison

[Figure: Residual Ratio after 2nd round of RRA, shown side by side for the TWO-SPEAKER RESIDUAL and the THREE-SPEAKER RESIDUAL; segments labeled Speaker 2]

Preliminary Results

• Experiments

– HTIMIT Database

– 1000 artificially generated K-speaker conversations for each K from 1 to 4

– Average conversation length = 1 min

– Mahalanobis distance used

Preliminary Results

• Counting Techniques

– Stopped Residual Ratio (SRR)

– Added Residual Ratio (ARR)

• Speaker count determined from the sum of the Residual Ratios over all K-1 rounds: the higher the ARR, the higher the speaker count

Preliminary Results

[Figure: ARR Probability Distributions. x-axis: ARR; y-axis: Probability; curves: 1 Speaker, 2 Speakers, 3 Speakers, 4 Speakers]

Preliminary Results

[Figure: RRA Accuracy. x-axis: Accuracy Method (1 or More; 1, 2 or More; 1, 2, 3 or 4); y-axis: Percent Correct; series: SRR, ARR]


Fusion of Distances

Correlation Analysis

[Figure: Draftsman's Display of Distances (LPCC). Pairwise plots of the Levene, Bhattacharyya, KL, T-Squared and Mahalanobis distances; legend: Intra, Inter]

Correlation Analysis

[Figure: Draftsman's Display of Distances (MFCC). Pairwise plots of the Levene, Bhattacharyya, KL, T-Squared and Mahalanobis distances; legend: Intra, Inter]

“Best Distance”

• Optimized Fusion of Distances

– Minimize intra-speaker variation

– Maximize inter-speaker variation

– Maximize T-test value between the inter-class (intra-speaker vs. inter-speaker) distance distributions

$$T_{max} = a^{T} X$$

Tmax = New Distance

X = vector consisting of the distance measure values

a = vector of the weights assigned to each distance measure
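A weighted fusion of this kind, and the T-test value used to judge it, can be sketched as follows. The slide does not say which form of the t statistic is used, so the Welch two-sample form below is an assumption; the function names, the weights and the toy distance values are invented purely for illustration.

```python
from math import sqrt
from statistics import mean, variance


def fuse(weights, distances):
    """Weighted sum a^T X of several distance measure values."""
    return sum(w * x for w, x in zip(weights, distances))


def t_value(intra, inter):
    """Welch two-sample t statistic between two sets of distances.

    Assumption: this particular form is chosen for illustration only.
    """
    std_err = sqrt(variance(intra) / len(intra) + variance(inter) / len(inter))
    return (mean(inter) - mean(intra)) / std_err


# Hypothetical data: each row holds two distance measures for one model pair.
intra_X = [[1.0, 10.0], [1.2, 14.0], [0.8, 12.0]]
inter_X = [[2.0, 30.0], [2.4, 26.0], [2.2, 34.0]]
weights = [1.0, 0.05]  # hypothetical weight vector a

fused_intra = [fuse(weights, x) for x in intra_X]
fused_inter = [fuse(weights, x) for x in inter_X]
t_fused = t_value(fused_intra, fused_inter)
```

Searching over the weight vector for the largest `t_fused` is one way to read the optimization goal; the sketch only evaluates a single candidate.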

“Best Distance”

[Figure: plot with axes Distance Measure 1 and Distance Measure 2]

Preliminary Experiments

LPCCs

[Figure: intra- and inter-speaker probability distributions for each distance. Distance Feature: LPCC; y-axis: Probability; legend: intra, inter]

Panel titles (distance - value):

– Tmax - 44.3369

– Mahalanobis - 44.0652

– T-Squared - 35.2111

– KL - 22.7672

– Bhattacharyya - 33.7449

– Levene - 13.4432

Preliminary Experiments

LPCCs

[Figure: Increase in Inter-Class Separation as Number of Distances is Increased (Features: LPCC). x-axis: Number of Distances (1 to 5); y-axis: T-Test Value (about 44.05 to 44.35); points labeled Mahalanobis, Levene, Bhattacharyya, KL, T-Squared]

Preliminary Experiments

MFCCs

[Figure: intra- and inter-speaker probability distributions for each distance. Distance Feature: MFCC; y-axis: Probability; legend: intra, inter]

Panel titles (distance - value):

– (first panel, label not recovered) 39.4524

– Mahalanobis - 38.2733

– T-Squared - 32.5542

– KL - 11.0738

– Bhattacharyya - 23.4276

– Levene - 16.8735

Preliminary Experiments

MFCCs

[Figure: Increase in Inter-Class Separation as Number of Distances is Increased (Features: MFCC). x-axis: Number of Distances (1 to 5); y-axis: T-Test Value (about 38.2 to 39.6); points labeled Mahalanobis, Levene, KL, T-Squared, Bhattacharyya]


Proposal Summary

Research Goal Revisited

• To overcome the following challenges faced in differentiating between speakers participating in conversations:

– No a priori information

– Limited data size

– No knowledge of change points

– Co-channel speech

Summary of Work Accomplished

• Practical demonstration of the existence of the problem

• Analysis of distance measures and features

• Development of a novel model formation technique

• Development, implementation and evaluation of two conversation-based speaker differentiation systems

• Introduction to and preliminary testing of an “optimal” distance formation

Proposed Work

• Feature Combinations

– Determination of the best combination of features using univariate tests of similarity

– Enhancement of feature combinations using Principal Component Analysis

• Fusion of Distance Measures

– Enhancement of fusion technique using mutual information suppression techniques

– Decision-level distance measure fusion

Proposed Work

• Further development of introduced systems

– Use of all distance measures

– Use of best feature combination

– Use of the “optimal distance”

– Implementation of decision-level fusion technique

Final Goal

A speaker recognition system for conversations that yields results comparable to those of non-conversational systems.

Publications

• U. Ofoegbu, A. Iyer, R. Yantorno, “Detection of a Third Speaker in Telephone Conversations”, ICSLP, INTERSPEECH 2006

• U. Ofoegbu, A. Iyer, R. Yantorno and S. Wenndt, “Unsupervised Indexing of Noisy Conversations with Short Speaker Utterances”, IEEE Aerospace Conference, March 2007.

• U. Ofoegbu, A. Iyer, R. Yantorno, “A Simple Approach to Unsupervised Speaker Indexing”, IEEE ISPACS. 2006.

• U. Ofoegbu, A. Iyer, R. Yantorno, “A Speaker Count System for Telephone Conversations”, IEEE ISPACS. 2006.

•  A. Iyer, U. Ofoegbu, R. Yantorno, “Speaker Discriminative Distances: Comprehensive Study”, IEEE Transactions on Speech and Audio Processing. (Submitted).

Cepstral Features

• Linear Predictive Cepstral Coefficients

– Obtained Recursively from LPC Coefficients


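Let LPC vector = [a0 a1 a2 … ap] and

LPCC vector = [c0 c1 c2 … cp … cn-1]

$$c_0 = \ln E^{2}$$

$$c_m = a_m + \sum_{k=1}^{m-1} \left(\frac{k}{m}\right) c_k \, a_{m-k}, \quad 1 \le m \le p$$

$$c_m = \sum_{k=1}^{m-1} \left(\frac{k}{m}\right) c_k \, a_{m-k}, \quad p < m < n$$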
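The recursive computation of LPCCs from LPC coefficients can be sketched in Python. This follows the widely used textbook recursion and rests on several assumptions that the slide does not confirm: the coefficients a_1 … a_p are taken as predictor coefficients of an all-pole model H(z) = G / (1 − Σ a_k z^{-k}), a_j is treated as zero for j > p, a[0] is ignored, and c_0 is taken as the log of the squared gain. With the opposite sign convention for the LPC polynomial, the signs of the a_k terms would flip. The function name is hypothetical.

```python
import math


def lpc_to_lpcc(a, gain, n):
    """Convert LPC coefficients to n cepstral coefficients c_0 .. c_{n-1}.

    a: list [a_0, a_1, ..., a_p]; a[0] is ignored (assumed convention).
    gain: model gain, used only for c_0 (assumption: c_0 = ln(gain^2)).
    n: number of cepstral coefficients to return (n >= 1).
    """
    p = len(a) - 1
    c = [0.0] * n
    c[0] = math.log(gain ** 2)
    for m in range(1, n):
        # Direct term a_m exists only while m <= p.
        acc = a[m] if m <= p else 0.0
        for k in range(1, m):
            j = m - k
            if j <= p:  # a_j is treated as zero beyond the model order
                acc += (k / m) * c[k] * a[j]
        c[m] = acc
    return c


# Single-pole example: a_1 = 0.5, unit gain, four cepstral coefficients.
lpcc = lpc_to_lpcc([1.0, 0.5], gain=1.0, n=4)
```

For a single pole at 0.5 the cepstrum has the closed form c_m = 0.5^m / m, which gives a quick sanity check on the recursion.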

Conversational Data Modeling

• Current Method

– Equal segmentation of data

– Indiscriminate use of data

• Problems

– Change points unknown

– Not all speech is useful

“Best Distance”

• Intra-speaker and inter-speaker distance lengths are always equal, therefore:

$$T = \frac{\mu_1 - \mu_2}{\sqrt{\sigma_1^{2} + \sigma_2^{2}}}$$

P = sum of the covariance matrices of the two classes.

Q = the square of the distance between the mean vectors of the two classes.

λ1 = maximum eigenvalue obtained by solving the generalized eigenvalue problem:

$$Q a_1 = \lambda_1 P a_1$$

[Remaining equations, involving a, P and a constant k, are not recoverable from the extracted text]
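The generalized eigenvalue problem named on this slide can be illustrated numerically for two distance measures. The sketch below assumes that Q is the outer product d dᵀ of the mean difference vector d (one reading of "the square of the distance between the mean vectors"). Under that assumption Q has rank one, so the maximum eigenvalue is dᵀ P⁻¹ d and the corresponding weight vector is proportional to P⁻¹ d. The function name and the numbers are hypothetical.

```python
def best_weights_2d(P, d):
    """Solve Q a = lambda P a for Q = d d^T in the 2x2 case.

    P: 2x2 matrix (sum of the two class covariance matrices).
    d: difference of the two class mean vectors.
    Returns (a, lam), where a = P^{-1} d and lam = d^T P^{-1} d.
    """
    (p11, p12), (p21, p22) = P
    det = p11 * p22 - p12 * p21
    if det == 0:
        raise ValueError("P must be invertible")
    # a = P^{-1} d, using the closed-form inverse of a 2x2 matrix
    a = [(p22 * d[0] - p12 * d[1]) / det,
         (-p21 * d[0] + p11 * d[1]) / det]
    lam = d[0] * a[0] + d[1] * a[1]
    return a, lam


# Hypothetical values for two distance measures.
P = [[2.0, 0.0], [0.0, 1.0]]
d = [2.0, 1.0]
a, lam = best_weights_2d(P, d)
```

The returned vector is defined only up to scale, so any positive multiple of it gives the same class separation.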

RRMD Approach

• Relative Distance Condition

[Figure: Distribution of T-Square Statistics for N = 5. x-axis: T-Square Statistics; y-axis: Probability; curves: Intra Speaker, Inter Speaker; annotation: Drel]
