Automatic Speaker Recognition System Using MFCC and VQ Approach

Automatic Speaker Recognition system using Mel Frequency Cepstral Coefficients (MFCC) and Vector Quantization (VQ) approach. Presented by: Md. Abdullah-al-MAMUN

Upload: alpha-reaction

Post on 16-Aug-2015


Page 1: Automatic Speaker Recognition system using MFCC and VQ approach

Automatic Speaker Recognition system using Mel Frequency Cepstral Coefficients (MFCC) and Vector Quantization (VQ) approach

Presented by: Md. Abdullah-al-MAMUN

Page 2: Automatic Speaker Recognition system using MFCC and VQ approach

OUTLINE

• What is speaker recognition?
– Speaker Identification
– Speaker Verification
• The Structure of Speaker Recognizer
• Feature Extraction: MFCC
• Speech Signal to Vector Quantization (VQ)
• Database Creation Process
• Speaker Identification
• Speaker Verification
• Table: Speaker Recognition Result
• Applications
• Conclusion
• References

Page 3: Automatic Speaker Recognition system using MFCC and VQ approach

What is Speaker Recognition?

Speaker Recognition is the process of automatically recognizing who is speaking on the basis of individual information included in speech signals.

Speaker Recognition = Speaker Identification, Speaker Verification

Page 4: Automatic Speaker Recognition system using MFCC and VQ approach

Speaker Identification

Whose voice is this?


Page 5: Automatic Speaker Recognition system using MFCC and VQ approach

Speaker Verification

• Synonyms: authentication, detection.
• User claims an identity.
• System task: Accept or reject identity claim.

Is this Ahmad’s voice?

Page 6: Automatic Speaker Recognition system using MFCC and VQ approach

Model of Speaker Recognizer

Fig. 1: Simple model of Speaker Recognizer. (Figure labels: "Hello, Mr. John"; "You are permitted to access".)

Page 7: Automatic Speaker Recognition system using MFCC and VQ approach

The Structure of Speaker Recognizer

Figure 2: Functional Scheme of an ASR System. (Block labels: Feature Extraction, Feature Vector, Training Mode, Recognition, Speaker Modeling, Classification, Decision Logic, Speaker #ID, Speaker_1.)

Page 8: Automatic Speaker Recognition system using MFCC and VQ approach

Speech Signal Analysis

Feature Extraction: The aim is to extract the voice features to distinguish different phonemes of a language.

Page 9: Automatic Speaker Recognition system using MFCC and VQ approach

MFCC extraction

Speech signal x(n) → Pre-emphasis → x'(n) → Window → x_t(n) → DFT → X_t(k) → Mel filter banks → Y_t(m) → Log(| |^2) → IDFT → MFCC y_t^(m)(k)

MFCC stands for Mel-frequency cepstral coefficients, a representation of the short-term power spectrum of a sound used in audio processing.

The MFCCs are the amplitudes of the resulting spectrum.
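The extraction chain on this slide (pre-emphasis, windowing, DFT, mel filter banks, log, inverse transform) can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the presenter's implementation: the frame length, hop size, number of filters, number of coefficients, the pre-emphasis factor `alpha` and the mel formula `2595 * log10(1 + f / 700)` are all assumptions, and a DCT-II is used in place of the IDFT, which is a common substitution for a real, symmetric log spectrum.

```python
import numpy as np

def hz_to_mel(f):
    # A widely used mel-scale formula (assumed here; the slides do not give one).
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters spaced evenly on the mel scale."""
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sample_rate).astype(int)
    fbank = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        left, center, right = bins[i - 1], bins[i], bins[i + 1]
        for k in range(left, center):      # rising slope
            fbank[i - 1, k] = (k - left) / (center - left)
        for k in range(center, right):     # falling slope, peak of 1 at center
            fbank[i - 1, k] = (right - k) / (right - center)
    return fbank

def mfcc(signal, sample_rate=8000, frame_len=256, hop=128,
         n_filters=20, n_ceps=12, alpha=0.97):
    """Return an array of shape (n_frames, n_ceps). All defaults are assumptions."""
    # Pre-emphasis: x'(n) = x(n) - alpha * x(n - 1)
    emphasized = np.append(signal[0], signal[1:] - alpha * signal[:-1])
    # Split into overlapping frames x_t(n) and apply a Hamming window.
    n_frames = 1 + (len(emphasized) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = emphasized[idx] * np.hamming(frame_len)
    # DFT of each frame, then power spectrum |X_t(k)|^2.
    power = np.abs(np.fft.rfft(frames, n=frame_len, axis=1)) ** 2
    # Mel filter bank energies Y_t(m), then log (small constant avoids log(0)).
    energies = power @ mel_filterbank(n_filters, frame_len, sample_rate).T
    log_energies = np.log(energies + 1e-10)
    # DCT-II as a stand-in for the IDFT; coefficient 0 is skipped.
    m = np.arange(n_filters)
    basis = np.cos(np.pi * np.outer(np.arange(1, n_ceps + 1), m + 0.5) / n_filters)
    return log_energies @ basis.T

# Usage on a synthetic one-second 440 Hz tone sampled at 8 kHz.
t = np.arange(8000) / 8000.0
tone = np.sin(2 * np.pi * 440.0 * t)
features = mfcc(tone)
print(features.shape)
```

Each row of `features` is one feature vector for one short frame of speech; a whole utterance therefore becomes a sequence of such vectors.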

Page 10: Automatic Speaker Recognition system using MFCC and VQ approach

Explanatory Example

(Figure panels: speech waveform of a phoneme "\ae"; after pre-emphasis and Hamming windowing; power spectrum; MFCC.)

Page 11: Automatic Speaker Recognition system using MFCC and VQ approach

Speech Signal to Feature Vector

Feature Vector to Classification: Vector Quantization (VQ)

Page 12: Automatic Speaker Recognition system using MFCC and VQ approach

Vector Quantization (VQ)

Aim of VQ: representation of large amounts of data by (few) prototype vectors.

Example: identification and grouping in clusters of similar data.

Assignment of a feature vector to the closest prototype w (similarity or distance measure, e.g. Euclidean distance).
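As a rough illustration of the idea above, the sketch below builds a small set of prototype vectors (a codebook) and assigns each feature vector to its closest prototype by Euclidean distance. The slides do not say which clustering algorithm was used, so plain k-means is assumed here; the function names, the codebook size and the iteration count are hypothetical.

```python
import numpy as np

def train_codebook(features, n_codewords=8, n_iter=20, seed=0):
    """Cluster feature vectors into n_codewords prototypes (plain k-means, assumed)."""
    rng = np.random.default_rng(seed)
    # Start from randomly chosen feature vectors.
    start = rng.choice(len(features), n_codewords, replace=False)
    codebook = features[start].copy()
    for _ in range(n_iter):
        # Euclidean distance from every vector to every prototype.
        dists = np.linalg.norm(features[:, None, :] - codebook[None, :, :], axis=2)
        nearest = dists.argmin(axis=1)
        # Move each prototype to the mean of the vectors assigned to it.
        for j in range(n_codewords):
            members = features[nearest == j]
            if len(members) > 0:
                codebook[j] = members.mean(axis=0)
    return codebook

def quantize(features, codebook):
    """Return, for each vector, the index of and distance to its closest prototype."""
    dists = np.linalg.norm(features[:, None, :] - codebook[None, :, :], axis=2)
    return dists.argmin(axis=1), dists.min(axis=1)

# Usage on toy 2-D data with two well-separated groups.
rng = np.random.default_rng(1)
data = np.vstack([rng.normal(0.0, 0.5, size=(50, 2)),
                  rng.normal(10.0, 0.5, size=(50, 2))])
codebook = train_codebook(data, n_codewords=2)
labels, distances = quantize(data, codebook)
```

In a speaker recognizer, `features` would be the MFCC vectors of one speaker's training speech, so each speaker ends up with their own codebook.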

Page 13: Automatic Speaker Recognition system using MFCC and VQ approach

Database Creation Process

(Figure labels: Database; Speaker #1, Speaker #2, Speaker #3; "Hello, Speaker #1"; "Hello, Speaker #2".)

Page 14: Automatic Speaker Recognition system using MFCC and VQ approach

Speaker Identification

(Figure labels: Database with #1, #2, #3; input "Speaker #?"; output "Speaker #1".)
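One common way to realise this identification step with VQ codebooks is to score the unknown utterance against every enrolled speaker's codebook and pick the speaker with the lowest average distortion. The slides do not state the decision rule, so this minimum-distortion rule is an assumption, and the codebooks and feature vectors below are made-up toy values.

```python
import numpy as np

def average_distortion(features, codebook):
    """Mean Euclidean distance from each feature vector to its closest codeword."""
    dists = np.linalg.norm(features[:, None, :] - codebook[None, :, :], axis=2)
    return float(dists.min(axis=1).mean())

def identify(features, codebooks):
    """Return the enrolled speaker whose codebook fits the features best."""
    scores = {name: average_distortion(features, cb) for name, cb in codebooks.items()}
    best = min(scores, key=scores.get)
    return best, scores

# Toy database: one hypothetical 2-D codebook per enrolled speaker.
codebooks = {
    "Speaker_1": np.array([[0.0, 0.0], [1.0, 1.0]]),
    "Speaker_2": np.array([[5.0, 5.0], [6.0, 6.0]]),
    "Speaker_3": np.array([[-5.0, 5.0], [-6.0, 6.0]]),
}
unknown = np.array([[0.1, -0.1], [0.9, 1.2], [0.2, 0.1]])
best, scores = identify(unknown, codebooks)
print(best)
```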

Page 15: Automatic Speaker Recognition system using MFCC and VQ approach

Speaker Verification

(Figure labels: Database with #1, #2, #3; claimed identity "Speaker #1"; decision "Accept".)
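Verification differs from identification in that only the claimed speaker's codebook is consulted and the result is an accept or reject decision. A minimal sketch, assuming the decision is made by comparing the average VQ distortion with a fixed threshold; the threshold value, the codebook and the test vectors are hypothetical, and in practice the threshold would have to be tuned on held-out data.

```python
import numpy as np

def average_distortion(features, codebook):
    """Mean Euclidean distance from each feature vector to its closest codeword."""
    dists = np.linalg.norm(features[:, None, :] - codebook[None, :, :], axis=2)
    return float(dists.min(axis=1).mean())

def verify(features, claimed_codebook, threshold):
    """Accept the identity claim if the distortion does not exceed the threshold."""
    score = average_distortion(features, claimed_codebook)
    return score <= threshold, score

# Toy codebook for the claimed identity (Speaker #1) and two test utterances.
speaker_1 = np.array([[0.0, 0.0], [1.0, 1.0]])
genuine = np.array([[0.1, -0.1], [0.9, 1.2]])
impostor = np.array([[5.0, 5.0], [6.0, 6.0]])

accepted_genuine, score_genuine = verify(genuine, speaker_1, threshold=1.0)
accepted_impostor, score_impostor = verify(impostor, speaker_1, threshold=1.0)
```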

Page 16: Automatic Speaker Recognition system using MFCC and VQ approach

Database Creation Condition

Table 1: Database description.

Parameter            Characteristics
Language             Bangla
No. of speakers      5
Speech type          Sentence reading
Recording condition  A normal room condition
Audio length         60-90 seconds
Audio type           Stereo
Sample format        16-bit PCM
Sampling frequency   8 kHz
Bit rate             1411 kbps

Page 17: Automatic Speaker Recognition system using MFCC and VQ approach

Speaker Recognition Result

Table 3: Test result for speaker recognition system.

Speaker    No. of input  Correct  Incorrect  Accuracy
Speaker_1  5             5        0          100%
Speaker_2  9             8        1          88.88%
Speaker_3  6             6        0          100%
Speaker_3  12            11       1          91.67%
Speaker_4  8             8        0          100%
Speaker_5  10            10       0          100%
Total      50            48       2          96%
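The accuracy column is the number of correct decisions divided by the number of inputs. The short check below recomputes the totals from the rows of Table 3; the row labels are copied as printed in the table, and the variable and function names are only illustrative.

```python
# (speaker label as printed, number of inputs, number correct), copied from Table 3.
rows = [
    ("Speaker_1", 5, 5),
    ("Speaker_2", 9, 8),
    ("Speaker_3", 6, 6),
    ("Speaker_3", 12, 11),
    ("Speaker_4", 8, 8),
    ("Speaker_5", 10, 10),
]

def accuracy(n_inputs, n_correct):
    """Accuracy in percent."""
    return 100.0 * n_correct / n_inputs

total_inputs = sum(n for _, n, _ in rows)
total_correct = sum(c for _, _, c in rows)
overall = accuracy(total_inputs, total_correct)

for name, n, c in rows:
    print(f"{name}: {accuracy(n, c):.2f}%")
print(f"Total: {total_correct}/{total_inputs} = {overall:.2f}%")
```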

Page 18: Automatic Speaker Recognition system using MFCC and VQ approach

Applications

• Transaction authentication
– Toll fraud prevention
– Telephone credit card purchases
– Telephone brokerage (e.g., stock trading)

• Access control
– Physical facilities
– Computers and data networks

• Information retrieval
– Customer information for call centers
– Audio indexing (speech skimming device)

• Forensics
– Voice sample matching

Page 19: Automatic Speaker Recognition system using MFCC and VQ approach

Conclusions

• 100% accuracy is really difficult to achieve, whereas our proposed system achieves 96% accuracy with limited resources (speakers and utterances).
• A poor-quality microphone should be avoided to get better accuracy.
• Training the recognizer will provide an even better experience.

Page 20: Automatic Speaker Recognition system using MFCC and VQ approach

Thank You