speaker verification for remote authentication

7/28/2019 Speaker Verification for Remote Authentication

1/31

MAJOR PROJECT FINAL PRESENTATION:

TEXT-PROMPTED REMOTE SPEAKER AUTHENTICATION

Project Members:

Ganesh Tiwari (75010)

Madhav Pandey (75014)

Manoj Shrestha (75018)

Project Supervisor:

Dr. Subarna Shakya

Associate Professor

Internal Examiner:

Er. Manoj Ghimire

External Examiner

Er. Bimal Acharya

Tribhuvan University

Institute of Engineering

Pulchowk Campus

Department of Electronics and Computer Engineering


2/31

INTRODUCTION

Voice biometric system

User login

Text-prompted system: the claimant is asked to speak a prompted (random) text

Speech and Speaker Recognition

Why text-prompted? To resist playback attacks


3/31

OUR SYSTEM

Features: MFCC

Modeling and classification: both statistical

GMM: speaker modeling

HMM/VQ: speech modeling


4/31

PROPERTIES OF SPEECH SIGNAL

Carries both Speech Content and Speaker identity

What makes a speech signal unique?

Each phoneme resonates at its own fundamental frequency and its harmonics

Studied over a short period: short-time spectral analysis

What is speaker-dependent information?

Fundamental frequency, primarily a function of the dimensions and tension of the vocal cords

Size and shape of the mouth, throat, nose, and teeth

Studied over a long period: all the variations from that speaker


5/31

UNIQUENESS IN PHONEME

[Figure: waveforms of phoneme /ah/ and phoneme /i:/; x-axis: Samples, y-axis: Amplitude]


6/31

Pre-Processing and Feature Extraction


7/31

PREPROCESSING: STEPS

1) Silence Removal

[Figure: two waveform panels, "Silence Signal" (signal containing silence) and "Silence Removed"]
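The slides do not say how silence is detected. A minimal sketch, assuming a simple short-time energy threshold (the names `frame_len` and `rel_threshold` are hypothetical, not taken from the project):

```python
def remove_silence(samples, frame_len=256, rel_threshold=0.05):
    """Drop frames whose mean energy is below rel_threshold * peak frame energy.

    Assumption: energy thresholding; the project may have used another detector.
    """
    # Split the signal into consecutive, non-overlapping frames.
    frames = [samples[i:i + frame_len] for i in range(0, len(samples), frame_len)]
    # Mean energy of each frame.
    energies = [sum(s * s for s in f) / len(f) for f in frames]
    peak = max(energies, default=0.0)

    kept = []
    for frame, energy in zip(frames, energies):
        # Keep only frames that are loud enough relative to the loudest frame.
        if peak > 0 and energy >= rel_threshold * peak:
            kept.extend(frame)
    return kept
```

For example, a signal made of two silent frames, two voiced frames and two more silent frames comes back with only the voiced samples.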


8/31

PREPROCESSING: STEPS (CONTD.)

1) Silence Removal

2) Pre-Emphasis

[Figure: two magnitude spectra, |Y(f)| vs. Frequency (Hz), 0 to 12000 Hz; annotations: "Suppressed high frequencies" and "Boosted high frequencies"]
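Pre-emphasis is commonly implemented as a first-order high-pass filter, y[n] = x[n] - a*x[n-1]. A minimal sketch; the coefficient `alpha=0.97` is a typical textbook value and an assumption here, since the slides do not give it:

```python
def pre_emphasis(samples, alpha=0.97):
    """Boost high frequencies with y[n] = x[n] - alpha * x[n-1].

    alpha is an assumed value; the first sample is passed through unchanged.
    """
    if not samples:
        return []
    out = [samples[0]]
    for n in range(1, len(samples)):
        out.append(samples[n] - alpha * samples[n - 1])
    return out
```

A constant (low-frequency) input is almost cancelled, while a rapidly alternating (high-frequency) input is amplified, which is the boost shown in the spectra.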


9/31

1) Silence Removal

2) Pre-Emphasis

3) Framing

Frame length 23 ms, 50% overlap between consecutive frames
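The framing step can be sketched as follows. The 23 ms length and 50% overlap come from the slide; the sample rate is not stated there, so `sample_rate` is a caller-supplied assumption:

```python
def frame_signal(samples, sample_rate, frame_ms=23.0, overlap=0.5):
    """Split a signal into fixed-length frames that overlap by `overlap`.

    Trailing samples that do not fill a whole frame are dropped.
    """
    frame_len = int(round(sample_rate * frame_ms / 1000.0))
    hop = max(1, int(frame_len * (1.0 - overlap)))  # step between frame starts
    frames = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frames.append(samples[start:start + frame_len])
    return frames
```

With a 50% overlap every sample (except at the edges) appears in two frames, which softens the effect of the window taper applied in the next step.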



10/31

1) Silence Removal

2) Pre-Emphasis

3) Framing

4) Windowing

[Figure: Hamming window, and a frame after applying it ("Windowed Signal")]
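The Hamming window has the standard form w[n] = 0.54 - 0.46*cos(2*pi*n/(N-1)). A short sketch of building the window and applying it to one frame (function names are illustrative):

```python
import math

def hamming(n_points):
    """Return the n_points-sample Hamming window."""
    if n_points == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * n / (n_points - 1))
            for n in range(n_points)]

def apply_window(frame):
    """Multiply a frame sample-by-sample with a Hamming window of equal length."""
    window = hamming(len(frame))
    return [s * w for s, w in zip(frame, window)]
```

The window tapers each frame towards 0.08 at both ends, which reduces spectral leakage caused by the abrupt frame boundaries.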


11/31

FEATURE EXTRACTION

MFCC: Mel-Frequency Cepstral Coefficients

Perceptual approach

The human ear processes audio signals on the Mel scale

Mel scale: approximately linear up to 1 kHz and logarithmic above 1 kHz
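One widely used closed form for the Mel scale maps a frequency f in Hz to m in mels as shown here. Which variant the project used is not stated in the slides, so this particular formula is an assumption:

```latex
m = 2595 \log_{10}\!\left(1 + \frac{f}{700}\right),
\qquad
f = 700\left(10^{m/2595} - 1\right)
```

With this form, f = 1000 Hz maps to roughly 1000 mel, and equal steps in mel correspond to progressively wider steps in Hz above 1 kHz.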


12/31

MFCC EXTRACTION (CONTD.)

Steps:

FFT → Mel Filter → Log → DCT → CMS (cepstral mean subtraction)

Mel Filter: 12

Filtering of the absolute FFT coefficients using a triangular filter bank on the Mel scale

MFCC gives the distribution of energy across the filters in the Mel frequency band

[Figure: Mel filter bank]
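The FFT → Mel filter → Log → DCT → CMS chain can be sketched in plain Python. This is an illustration under assumptions, not the project's code: it uses a naive DFT in place of an FFT, the common 2595*log10(1 + f/700) Mel formula, 12 filters, and DCT coefficients 0 to 11. The parameter names `n_filters` and `n_coeffs` are hypothetical.

```python
import cmath
import math

def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def magnitude_spectrum(frame):
    """|DFT| for bins 0..N/2 (naive O(N^2) DFT; a real system would use an FFT)."""
    n = len(frame)
    return [abs(sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
            for k in range(n // 2 + 1)]

def mel_filterbank(n_filters, n_bins, sample_rate):
    """Triangular filters whose edges are equally spaced on the Mel scale."""
    nyquist = sample_rate / 2.0
    top = hz_to_mel(nyquist)
    edges_hz = [mel_to_hz(top * i / (n_filters + 1)) for i in range(n_filters + 2)]
    edges = [hz * (n_bins - 1) / nyquist for hz in edges_hz]  # fractional bin positions
    bank = []
    for j in range(1, n_filters + 1):
        lo, mid, hi = edges[j - 1], edges[j], edges[j + 1]
        row = []
        for k in range(n_bins):
            if lo < k <= mid:
                row.append((k - lo) / (mid - lo))   # rising slope
            elif mid < k < hi:
                row.append((hi - k) / (hi - mid))   # falling slope
            else:
                row.append(0.0)
        bank.append(row)
    return bank

def mfcc(frame, sample_rate, n_filters=12, n_coeffs=12):
    """Mel filtering of |DFT|, log, then DCT-II."""
    spec = magnitude_spectrum(frame)
    bank = mel_filterbank(n_filters, len(spec), sample_rate)
    log_e = [math.log(sum(w * s for w, s in zip(row, spec)) + 1e-10) for row in bank]
    return [sum(log_e[j] * math.cos(math.pi * i * (j + 0.5) / n_filters)
                for j in range(n_filters))
            for i in range(n_coeffs)]

def cepstral_mean_subtraction(frames_mfcc):
    """Subtract the per-coefficient mean taken over all frames of an utterance."""
    n = len(frames_mfcc)
    mean = [sum(col) / n for col in zip(*frames_mfcc)]
    return [[c - m for c, m in zip(vec, mean)] for vec in frames_mfcc]
```

CMS is applied across all frames of one utterance, so it removes slowly varying channel effects such as a fixed microphone response.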


13/31

EXTRA FEATURES: ENERGY AND DELTAS

To achieve a higher recognition rate

Energy feature

Delta and Delta-Delta

Delta → velocity feature

Double delta → acceleration feature

Co-articulation
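Delta features are often computed with a regression over neighbouring frames, d[t] = sum_n n*(c[t+n] - c[t-n]) / (2*sum_n n^2). The sketch below assumes that formula with a window of 2 frames and edge frames repeated at the boundaries; the slides do not state the exact scheme. Applying the function twice gives the delta-delta (acceleration) features.

```python
def delta(features, window=2):
    """Regression-based delta of a list of per-frame feature vectors.

    Frames beyond the start or end are replaced by the first or last frame.
    """
    n_frames = len(features)
    denom = 2.0 * sum(n * n for n in range(1, window + 1))
    out = []
    for t in range(n_frames):
        vec = []
        for d in range(len(features[t])):
            acc = 0.0
            for n in range(1, window + 1):
                nxt = features[min(t + n, n_frames - 1)][d]
                prv = features[max(t - n, 0)][d]
                acc += n * (nxt - prv)
            vec.append(acc / denom)
        out.append(vec)
    return out
```

For a feature that rises by 1 per frame, the delta in the middle of the sequence is 1 (constant velocity) and the delta-delta there is 0 (no acceleration).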


14/31

COMPOSITION OF FEATURE VECTOR

12 MFCC features

12 Δ MFCC

12 ΔΔ MFCC

1 energy feature

1 Δ energy

1 ΔΔ energy

39 features from each frame
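The 39-dimensional vector can be assembled per frame as below. The use of log energy and the ordering of the components are assumptions for illustration; the slide only gives the counts.

```python
import math

def log_energy(frame):
    """Log of the summed squared samples of one frame (assumed energy measure)."""
    return math.log(sum(s * s for s in frame) + 1e-10)

def assemble_feature_vector(mfcc, d_mfcc, dd_mfcc, energy, d_energy, dd_energy):
    """Concatenate 3 x 12 cepstral values and 3 energy values into 39 features."""
    if not (len(mfcc) == len(d_mfcc) == len(dd_mfcc) == 12):
        raise ValueError("expected 12 MFCC, 12 delta and 12 delta-delta values")
    return list(mfcc) + list(d_mfcc) + list(dd_mfcc) + [energy, d_energy, dd_energy]
```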


15/31

Speech Recognition/Verification by HMM/VQ


16/31

HIDDEN MARKOV MODEL (HMM)

An HMM is an extension of a Markov process

A Markov process consists of observable states

An HMM has hidden states and observable symbols per state

An HMM is a stochastic model


17/31

HMM (CONTD.)

Parameters

1) The initial state distribution (π)

2) State transition probability distribution (A)

3) Observation symbol probability distribution (B)

The HMM model

λ = (A, B, π)
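Given the three parameter sets, the probability that a discrete HMM produces an observation sequence is usually computed with the forward algorithm. A minimal sketch without scaling, so suitable only for short sequences (the function and variable names are illustrative):

```python
def forward_probability(A, B, pi, observations):
    """P(observations | model) for a discrete HMM.

    A[i][j]: probability of moving from state i to state j
    B[i][k]: probability of emitting symbol k in state i
    pi[i]:   probability of starting in state i
    """
    n = len(pi)
    # Initialisation: start in state i and emit the first symbol.
    alpha = [pi[i] * B[i][observations[0]] for i in range(n)]
    # Induction: move to state j from any state i, then emit the next symbol.
    for symbol in observations[1:]:
        alpha = [sum(alpha[i] * A[i][j] for i in range(n)) * B[j][symbol]
                 for j in range(n)]
    # Termination: sum over all possible final states.
    return sum(alpha)
```

In a recogniser, each word model is scored this way and the model with the highest probability is chosen. Long sequences need log-domain or scaled arithmetic to avoid underflow.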


18/31

EXAMPLE:

PRONUNCIATION MODEL OF WORD TOMATO

λ = (A, B, π)


19/31

HMM IMPLEMENTATION

Feature vectors → observation symbols, 256

Phonemes → hidden states, 6

Left-to-right HMM

Discrete Hidden Markov Model (DHMM) with Vector Quantization (VQ) technique
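Two pieces of this setup are easy to sketch: mapping a feature vector to the index of its nearest codeword (the discrete observation symbol), and building a left-to-right transition matrix. The codebook is assumed to be already trained, and the `stay` probability of 0.5 is a hypothetical initial value, not taken from the slides.

```python
def quantize(vector, codebook):
    """Return the index of the codeword closest to `vector` (squared Euclidean)."""
    best_index, best_dist = 0, float("inf")
    for index, codeword in enumerate(codebook):
        dist = sum((v - c) ** 2 for v, c in zip(vector, codeword))
        if dist < best_dist:
            best_index, best_dist = index, dist
    return best_index

def left_to_right_transitions(n_states, stay=0.5):
    """Initial transition matrix: each state may only stay or move one step right."""
    A = [[0.0] * n_states for _ in range(n_states)]
    for i in range(n_states):
        if i == n_states - 1:
            A[i][i] = 1.0            # the last state has nowhere else to go
        else:
            A[i][i] = stay
            A[i][i + 1] = 1.0 - stay
    return A
```

With a 256-entry codebook, every 39-dimensional frame becomes one symbol in the range 0 to 255, and a 6-state word model uses `left_to_right_transitions(6)` as a starting point for training.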


20/31

SPEECH RECOGNITION SYSTEM


21/31

VECTOR QUANTIZATION


22/31

Speaker Recognition/Verification by GMM


23/31

SPEAKER VERIFICATION SYSTEM


24/31

SPEAKER MODELING (GMM)

Gaussian Mixture Model

Parametric probability density function

Based on soft clustering technique

Mixture of Gaussian components

λ = (w, μ, Σ): mixture weights, means, and covariances
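A GMM scores a feature vector as a weighted sum of Gaussian densities. The sketch below assumes diagonal covariance matrices (a common simplification that the slides do not confirm) and works in the log domain for numerical stability:

```python
import math

def log_gaussian_diag(x, mean, var):
    """Log density of a Gaussian with diagonal covariance (var = per-dimension variances)."""
    return sum(-0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))

def gmm_log_likelihood(x, weights, means, variances):
    """log p(x | model) = log sum_k w_k * N(x; mean_k, var_k), via log-sum-exp."""
    logs = [math.log(w) + log_gaussian_diag(x, m, v)
            for w, m, v in zip(weights, means, variances)]
    top = max(logs)
    return top + math.log(sum(math.exp(value - top) for value in logs))
```

The score of a whole utterance is the sum of the per-frame log-likelihoods, assuming frames are treated as independent.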


25/31

SPEAKER MODEL TRAINING

Estimate the model parameters

Expectation Maximization algorithm
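One EM iteration alternates between computing responsibilities (E-step) and re-estimating weights, means and variances (M-step). The sketch below is one-dimensional for brevity, whereas the speaker model works on multi-dimensional feature vectors; the variance floor of 1e-6 is an assumed safeguard.

```python
import math

def em_step(data, weights, means, variances):
    """Run one EM iteration for a 1-D Gaussian mixture; returns updated parameters."""
    k = len(weights)
    # E-step: responsibility of each component for each data point.
    resp = []
    for x in data:
        p = [weights[j]
             * math.exp(-0.5 * (x - means[j]) ** 2 / variances[j])
             / math.sqrt(2.0 * math.pi * variances[j])
             for j in range(k)]
        total = sum(p)
        resp.append([pj / total for pj in p])
    # M-step: re-estimate each component from its weighted data.
    new_w, new_m, new_v = [], [], []
    for j in range(k):
        nj = sum(r[j] for r in resp)
        mu = sum(r[j] * x for r, x in zip(resp, data)) / nj
        var = sum(r[j] * (x - mu) ** 2 for r, x in zip(resp, data)) / nj
        new_w.append(nj / len(data))
        new_m.append(mu)
        new_v.append(max(var, 1e-6))  # floor to avoid collapsing variances
    return new_w, new_m, new_v
```

Repeating `em_step` until the parameters stop changing gives a local maximum of the training-data likelihood; the result depends on the initial values.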


26/31

SPEAKER VERIFICATION

Based on likelihood ratio

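Λ(X) = p(X | claimed speaker model) / p(X | impostor model); accept the claim if Λ(X) exceeds a threshold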
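A likelihood-ratio decision is usually taken in the log domain. The sketch below assumes an alternative (background or impostor) model is available and that scores are normalised by the number of frames; the threshold value is hypothetical and would be tuned on held-out data.

```python
def verify(claimed_loglik, background_loglik, n_frames, threshold=0.0):
    """Accept the claim if the per-frame log-likelihood ratio reaches the threshold.

    claimed_loglik:    log p(X | claimed speaker model), summed over frames
    background_loglik: log p(X | alternative model), summed over frames
    Returns (accepted, score).
    """
    score = (claimed_loglik - background_loglik) / n_frames
    return score >= threshold, score
```

Raising the threshold lowers the false-acceptance rate at the cost of more false rejections.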


27/31

TOOLS USED

Languages: Adobe Flex, Java

BlazeDS for RPC

Servers: Apache Tomcat, MySQL

Versioning: TortoiseSVN


28/31

OUTPUT : SNAPSHOT (GUI)


29/31

APPLICATION AREAS

Telephone transaction

Telephone credit card purchase

Telephone stock trading

Access control

Physical facilities

Computer networks

Information retrieval

Customer information

Forensics

Voice sample matching


30/31

LIMITATION AND FUTURE ENHANCEMENT

Noise reduction

Training on more data

Combine with

other features

other classification methods


31/31

Thanks

Any queries?
