Machine Learning in Voice Biometrics

Michał Dankiewicz, DataKRK Meetup, 30.01.2017

basic concepts

Agenda

● Biometrics in general

● R&D @ VoicePIN.com

● Machine learning techniques in voice biometrics

● Feature extraction

● Gaussian Mixture Models

● i-vectors

● Gotchas

● Challenge

Biometrics

sources: commons.wikimedia.org, clipartfest.com, pixabay.com

Voice Biometrics

voice = excitation + filter

sources: synthschool.com, commons.wikimedia.org, https://youtu.be/ZQcEyXI1OGM?t=54s
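
The excitation + filter view can be illustrated with a toy synthesis. This is only a sketch under stated assumptions: the excitation is a plain impulse train, the filter is a single two-pole resonator standing in for one formant, and the names `impulse_train` and `resonator` as well as all numeric values are made up for illustration.

```python
import math

def impulse_train(n_samples, period):
    """Excitation: one unit pulse every `period` samples (a crude glottal source)."""
    return [1.0 if i % period == 0 else 0.0 for i in range(n_samples)]

def resonator(signal, freq_hz, bandwidth_hz, sample_rate):
    """Filter: a two-pole resonator, a stand-in for a single vocal-tract formant."""
    r = math.exp(-math.pi * bandwidth_hz / sample_rate)  # pole radius (< 1, so stable)
    a1 = 2.0 * r * math.cos(2.0 * math.pi * freq_hz / sample_rate)
    a2 = -r * r
    out = []
    for i, x in enumerate(signal):
        y1 = out[i - 1] if i >= 1 else 0.0
        y2 = out[i - 2] if i >= 2 else 0.0
        out.append(x + a1 * y1 + a2 * y2)
    return out

sample_rate = 8000                              # assumed sampling rate in Hz
excitation = impulse_train(800, period=80)      # 8000 / 80 = 100 Hz pitch
voiced = resonator(excitation, 700.0, 100.0, sample_rate)
```

Changing `period` changes the pitch, while changing the resonator frequency changes the timbre; a real vocal tract behaves like several such resonances at once.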

Voice Biometrics

Access control

Transaction authentication

Internet of Things

Law enforcement

sources: giphy.com

Enrollment & verification

Enrollment

Verification

sources: pixabay.com

First attempts

Enrollment

Verification

cross-correlation

sources: pixabay.com, commons.wikimedia.org
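
A first attempt of this kind might compare the enrollment and verification waveforms directly. The sketch below is an assumption about what such a baseline could look like, not the method actually used: it takes the peak of the normalized cross-correlation over a small range of lags. The function name, the synthetic sine signals and the lag range are all hypothetical.

```python
import math

def normalized_cross_correlation(a, b, max_lag):
    """Peak normalized cross-correlation of a and b over lags in [-max_lag, max_lag]."""
    norm = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    if norm == 0.0:
        return 0.0
    best = -1.0
    for lag in range(-max_lag, max_lag + 1):
        total = 0.0
        for i, x in enumerate(a):
            j = i + lag
            if 0 <= j < len(b):
                total += x * b[j]
        best = max(best, total / norm)
    return best

# Synthetic stand-ins for recordings (assumed data, for illustration only).
enrolled = [math.sin(2.0 * math.pi * 5 * n / 200) for n in range(200)]
same_delayed = [0.0] * 3 + enrolled[:-3]          # same signal, 3 samples late
different = [math.sin(2.0 * math.pi * 7 * n / 200) for n in range(200)]

score_same = normalized_cross_correlation(enrolled, same_delayed, max_lag=5)
score_other = normalized_cross_correlation(enrolled, different, max_lag=5)
```

Real speech never repeats sample for sample, which is why such raw waveform matching only works as a starting point.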

R&D @ VoicePIN.com

Feature extraction

framing: waveform → matrix
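
Framing can be sketched as slicing the waveform into overlapping windows, each becoming one row of the matrix. The function name and the tiny frame and hop sizes below are illustrative assumptions; speech systems commonly use frames of roughly 20-30 ms with a hop of about 10 ms.

```python
def frame_signal(samples, frame_length, hop_length):
    """Slice a 1-D waveform into overlapping frames (rows of a matrix)."""
    if len(samples) < frame_length:
        return []
    n_frames = 1 + (len(samples) - frame_length) // hop_length
    return [
        samples[i * hop_length : i * hop_length + frame_length]
        for i in range(n_frames)
    ]

waveform = list(range(10))  # stand-in for audio samples
frames = frame_signal(waveform, frame_length=4, hop_length=2)
# frames is [[0, 1, 2, 3], [2, 3, 4, 5], [4, 5, 6, 7], [6, 7, 8, 9]]
```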

MFCC – mel-frequency cepstral coefficients

● psychoacoustics
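
The usual MFCC recipe for a single frame can be sketched as: window, power spectrum, triangular mel filterbank, logarithm, DCT. The code below is a slow, dependency-free illustration under assumptions (naive DFT, Hamming window, the common `2595 * log10(1 + f / 700)` mel formula, 20 filters, 12 coefficients); production code would rely on an FFT library and differs in many details.

```python
import cmath
import math

def hz_to_mel(hz):
    return 2595.0 * math.log10(1.0 + hz / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def power_spectrum(frame):
    """Power spectrum (bins 0..N/2) of a Hamming-windowed frame via a naive DFT."""
    n = len(frame)
    windowed = [
        x * (0.54 - 0.46 * math.cos(2.0 * math.pi * i / (n - 1)))
        for i, x in enumerate(frame)
    ]
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(windowed))
        spectrum.append(abs(acc) ** 2 / n)
    return spectrum

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters whose centers are evenly spaced on the mel scale."""
    n_bins = n_fft // 2 + 1
    top_mel = hz_to_mel(sample_rate / 2.0)
    mel_points = [top_mel * i / (n_filters + 1) for i in range(n_filters + 2)]
    bin_points = [mel_to_hz(m) * n_fft / sample_rate for m in mel_points]  # fractional bins
    bank = []
    for f in range(1, n_filters + 1):
        left, center, right = bin_points[f - 1], bin_points[f], bin_points[f + 1]
        row = []
        for b in range(n_bins):
            if left < b <= center:
                row.append((b - left) / (center - left))
            elif center < b < right:
                row.append((right - b) / (right - center))
            else:
                row.append(0.0)
        bank.append(row)
    return bank

def mfcc(frame, sample_rate, n_filters=20, n_coeffs=12):
    spectrum = power_spectrum(frame)
    bank = mel_filterbank(n_filters, len(frame), sample_rate)
    # Small constant avoids log(0) for empty filters.
    log_energies = [
        math.log(sum(w * p for w, p in zip(row, spectrum)) + 1e-10) for row in bank
    ]
    # DCT-II of the log filterbank energies gives the cepstral coefficients.
    return [
        sum(e * math.cos(math.pi * c * (m + 0.5) / n_filters) for m, e in enumerate(log_energies))
        for c in range(n_coeffs)
    ]

sample_rate = 8000  # assumed sampling rate in Hz
frame = [math.sin(2.0 * math.pi * 440.0 * i / sample_rate) for i in range(256)]
coeffs = mfcc(frame, sample_rate)
```

The psychoacoustic part is the mel scale: filters are narrow at low frequencies and wide at high ones, roughly mirroring how human hearing resolves pitch.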

bottleneck features

● ANN with a narrow layer

sources: Deep neural networks in speaker recognition – K. Odrzywołek, 2016
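
Extracting bottleneck features amounts to running a frame through the network and reading out the activations of the narrow layer instead of the final output. The sketch below only shows that read-out: the layer sizes, the `tanh` nonlinearity and the random (untrained) weights are assumptions for illustration. In practice the network would first be trained on some auxiliary task.

```python
import math
import random

def dense(inputs, weights, biases):
    """One fully connected layer without nonlinearity."""
    return [sum(w * x for w, x in zip(row, inputs)) + b for row, b in zip(weights, biases)]

def random_layer(n_in, n_out, rng):
    """Untrained placeholder weights; a real system would load trained ones."""
    weights = [[rng.gauss(0.0, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

def bottleneck_features(frame_features, layers, bottleneck_index):
    """Forward pass that stops at the narrow layer and returns its linear output."""
    activations = frame_features
    for index, (weights, biases) in enumerate(layers):
        activations = dense(activations, weights, biases)
        if index == bottleneck_index:
            return activations
        activations = [math.tanh(a) for a in activations]
    raise ValueError("bottleneck_index is beyond the last layer")

rng = random.Random(0)
layer_sizes = [39, 64, 8, 64, 10]  # hypothetical: 39 inputs, 8-unit bottleneck
layers = [random_layer(a, b, rng) for a, b in zip(layer_sizes, layer_sizes[1:])]
frame = [rng.gauss(0.0, 1.0) for _ in range(39)]
features = bottleneck_features(frame, layers, bottleneck_index=1)  # 8 values
```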

Gaussian Mixture Models

sources: commons.wikimedia.org

λ – model

x – observation (1 point)

X – sequence of observations
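
With this notation, log p(X|λ) for a GMM can be sketched as a sum over the observations x of the log of the weighted component densities. The code assumes diagonal covariances and independent observations, and the parameter values in the usage lines are invented for illustration.

```python
import math

def log_gaussian_diag(x, mean, var):
    """Log density of a Gaussian with diagonal covariance at point x."""
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )

def gmm_log_likelihood(X, weights, means, variances):
    """log p(X|λ), treating the observations in X as independent."""
    total = 0.0
    for x in X:
        terms = [
            math.log(w) + log_gaussian_diag(x, m, v)
            for w, m, v in zip(weights, means, variances)
        ]
        peak = max(terms)  # log-sum-exp trick for numerical stability
        total += peak + math.log(sum(math.exp(t - peak) for t in terms))
    return total

# λ: a hypothetical 2-component model over 2-dimensional features.
weights = [0.4, 0.6]
means = [[0.0, 0.0], [3.0, 3.0]]
variances = [[1.0, 1.0], [0.5, 2.0]]
X = [[0.1, -0.2], [2.9, 3.5]]
score = gmm_log_likelihood(X, weights, means, variances)
```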

Identification

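identified speaker = argmax over λ of p(X|λ)

[Figure: p(X|λ) evaluated for each enrolled speaker's model; labeled speaker: Charlie]

sources: pixabay.com, commons.wikimedia.org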
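
Identification by argmax can be sketched as scoring X against every enrolled model and picking the best one. To stay short, the example assumes 1-dimensional GMMs given as lists of (weight, mean, variance) tuples; the model parameters and the observations are invented.

```python
import math

def log_likelihood(X, model):
    """log p(X|λ) for a 1-D GMM given as a list of (weight, mean, variance)."""
    total = 0.0
    for x in X:
        total += math.log(sum(
            w * math.exp(-0.5 * (x - m) ** 2 / v) / math.sqrt(2.0 * math.pi * v)
            for w, m, v in model
        ))
    return total

def identify(X, models):
    """Return the name of the model λ that maximizes p(X|λ)."""
    return max(models, key=lambda name: log_likelihood(X, models[name]))

# Hypothetical enrolled speakers.
models = {
    "Alice": [(0.5, -2.0, 1.0), (0.5, 0.0, 1.0)],
    "Bob": [(0.5, 3.0, 1.0), (0.5, 5.0, 1.0)],
    "Charlie": [(0.5, 9.0, 1.0), (0.5, 11.0, 1.0)],
}
X = [9.5, 10.2, 8.8, 11.1]
winner = identify(X, models)  # "Charlie"
```

Note that argmax always returns one of the enrolled speakers, even for a voice that belongs to none of them.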

Verification, UBM & LLR

[Figure: p(X|λ) for Bob's model; verdict: "Totally Bob"]

Bayes' theorem:

P(λ|X) = p(X|λ) · P(λ) / P(X)

● P(λ|X) – we want this

● p(X|λ) – GMM formula

● P(λ) – same for every speaker, negligible

● P(X) ≈ p(X|λ_UBM)

LLR = log( p(X|λ) / p(X|λ_UBM) )

UBM – Universal Background Model

LLR – log-likelihood ratio

sources: pixabay.com, commons.wikimedia.org
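
Verification with the LLR can be sketched as scoring X against both the claimed speaker's model and the UBM, then comparing the difference of the log-likelihoods with a threshold. The example assumes 1-dimensional GMMs as lists of (weight, mean, variance) tuples; the models, the observations and the threshold of 0.0 are hypothetical, and a real threshold would be tuned on held-out data.

```python
import math

def log_likelihood(X, model):
    """log p(X|λ) for a 1-D GMM given as a list of (weight, mean, variance)."""
    total = 0.0
    for x in X:
        total += math.log(sum(
            w * math.exp(-0.5 * (x - m) ** 2 / v) / math.sqrt(2.0 * math.pi * v)
            for w, m, v in model
        ))
    return total

def llr(X, speaker_model, ubm):
    """LLR = log p(X|λ) - log p(X|λ_UBM)."""
    return log_likelihood(X, speaker_model) - log_likelihood(X, ubm)

def verify(X, speaker_model, ubm, threshold=0.0):
    """Accept the identity claim when the LLR exceeds the threshold."""
    return llr(X, speaker_model, ubm) > threshold

bob = [(0.5, 3.0, 1.0), (0.5, 5.0, 1.0)]                             # claimed speaker
ubm = [(1.0 / 3.0, 0.0, 9.0), (1.0 / 3.0, 4.0, 9.0), (1.0 / 3.0, 8.0, 9.0)]  # broad background model

genuine = [3.2, 4.8, 4.1, 2.9]
impostor = [-1.5, 9.0, 0.3, 10.2]
accepted = verify(genuine, bob, ubm)    # True
rejected = verify(impostor, bob, ubm)   # False
```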

Verification, UBM & LLR

Problem with p(X|λ): where should we set a threshold?

[Figure: log(p(X|λ)) under Alice's model for Alice's voice and Bob's voice; panels: clear conditions, noisy conditions]

LLR is a solution

[Figure: log(p(X|λ)) - log(p(X|λ_UBM)), i.e. Alice's model in relation to the UBM, for Alice's voice and Bob's voice; panels: clear conditions, noisy conditions]

i-vectors

● D-dimensional GMM with C components has D*C mean values

● Concatenation of them is a mean supervector

M = m + T*w

M – speaker supervector [D*C × 1]

m – UBM supervector [D*C × 1]

T – total variability matrix [D*C × W]

w – speaker i-vector [W × 1]

source: Low-dimensional speech representation based on Factor Analysis and its applications - Najim Dehak and Stephen Shum
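
The shapes in M = m + T*w can be made concrete with a toy example. The sketch below builds a supervector from a known w and then recovers w by ordinary least squares. This is a deliberate simplification: actual i-vector extractors estimate w as a posterior mean from statistics collected with the UBM. All values, sizes (D = 2, C = 2, W = 2) and helper names are invented.

```python
def transpose(A):
    return [list(col) for col in zip(*A)]

def matvec(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def matmul(A, B):
    Bt = transpose(B)
    return [[sum(a * b for a, b in zip(row, col)) for col in Bt] for row in A]

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (aug[r][n] - sum(aug[r][c] * x[c] for c in range(r + 1, n))) / aug[r][r]
    return x

# D = 2 features, C = 2 components, so supervectors have D*C = 4 entries; W = 2.
m = [0.0, 1.0, 2.0, 3.0]                                   # UBM supervector [D*C x 1]
T = [[1.0, 0.0], [0.5, 1.0], [0.0, 2.0], [1.5, -1.0]]      # total variability [D*C x W]
w_true = [0.5, -1.0]                                       # speaker i-vector [W x 1]

M = [mi + ti for mi, ti in zip(m, matvec(T, w_true))]      # M = m + T*w

# Least-squares recovery: (T^T T) w = T^T (M - m).
Tt = transpose(T)
residual = [Mi - mi for Mi, mi in zip(M, m)]
w_estimated = solve(matmul(Tt, T), matvec(Tt, residual))
```

The point of the representation is the size reduction: a speaker is described by W numbers instead of D*C.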

Gotchas

● quality of recordings

● gender

● device/channel

● real case (conditions)

● session variability

[Figures: train vs. test examples for each gotcha]

Challenge

Sneakers, 1992

https://www.youtube.com/watch?v=-zVgWpVXb64

Challenge

"What if someone records my voice?"

www.spoofingchallenge.org

@VoicePINcom

Thanks!

VoicePIN.com
