![Page 1: L ETTER TO P HONEME A LIGNMENT Reihaneh Rabbany Shahin Jabbari](https://reader031.vdocument.in/reader031/viewer/2022013011/56649f2b5503460f94c452d0/html5/thumbnails/1.jpg)
LETTER TO PHONEME ALIGNMENT
Reihaneh Rabbany
Shahin Jabbari

OUTLINE
Motivation
Problem and its Challenges
Relevant Works
Our Work
◦ Formal Model
◦ EM
◦ Dynamic Bayesian Network
Evaluation
◦ Letter to Phoneme Generator
◦ AER
Result

TEXT TO SPEECH PROBLEM
Conversion of Text to Speech: TTS
◦ Automated Telecom Services
◦ E-mail by Phone
◦ Banking Systems
◦ Handicapped People

PRONUNCIATION
Pronunciation of the words
◦ Dictionary Words
◦ Non-Dictionary Words
Phonetic Analysis
Dictionary Look-up
◦ Language is alive, new words are added
◦ Proper Nouns
[Diagram: Word → Phonetic Analysis → Pronunciation]

PROBLEM
Letter to Phoneme (L2P) Alignment
◦ Letter: c a k e
◦ Phoneme: k ei k

CHALLENGES
No Consistency
◦ City / s /
◦ Cake / k /
◦ Kid / k /
No Transparency
◦ K i d (3) / k i d / (3)
◦ S i x (3) / s i k s / (4)
◦ Q u e u e (5) / k j u: / (3)
◦ A x e (3) / a k s / (3)

ONE-TO-ONE EM
Daelemans et al., 1996
Length of word = length of pronunciation
◦ Produce all possible alignments
◦ Inserting null letter/phoneme
Alignment probability
$P(A) = \prod_i P(p_i \mid l_i)$
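
A minimal sketch of how the one-to-one alignment probability could be evaluated, assuming a table of conditional probabilities p(phoneme | letter) is already available. The function name, the `NULL` marker and all probability values are hypothetical and chosen only for illustration.

```python
from math import prod

NULL = "_"  # hypothetical marker for a null letter or null phoneme


def alignment_probability(letters, phonemes, cond_prob):
    """Return P(A) as the product over i of p(p_i | l_i) for a one-to-one alignment."""
    if len(letters) != len(phonemes):
        raise ValueError("one-to-one alignment needs equal lengths; pad with NULL")
    # Unseen (letter, phoneme) pairs get probability 0 in this sketch.
    return prod(cond_prob.get((l, p), 0.0) for l, p in zip(letters, phonemes))


# Made-up probabilities, not taken from the slides.
cond_prob = {("c", "k"): 0.5, ("a", "ei"): 0.25, ("k", "k"): 1.0, ("e", NULL): 0.5}

# "cake" has 4 letters and 3 phonemes, so a null phoneme pads the pronunciation.
p = alignment_probability(list("cake"), ["k", "ei", "k", NULL], cond_prob)
print(p)
```

Padding with a null symbol is what lets words such as "cake" fit the equal-length requirement of the one-to-one model.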

DECISION TREE
Black et al., 1996
Train a CART Using Aligned Dictionary
Why CART?
A Single Tree for Each Letter

KONDRAK
Alignments are not always one-to-one
◦ A x e / a k s /
◦ B oo k / b ú k /
Only Null Phoneme
Similar to one-to-one EM
◦ Produce All Possible Alignments
◦ Compute the Probabilities

FORMAL MODEL
Word: sequence of letters
$L = l_1 l_2 \ldots l_n$
Pronunciation: sequence of phonemes
$P = p_1 p_2 \ldots p_m$
Alignment: sequence of subalignments
$A = a_1 a_2 \ldots a_k, \quad a_i = \langle L_i, P_i \rangle, \quad |L_i|, |P_i| \le 2$
Problem: Finding the most probable alignment
$A_{best} = \arg\max_A P(A \mid L, P)$
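
The search space of the formal model can be sketched by enumerating every way of cutting a word and its pronunciation into subalignments. This is only an illustration under stated assumptions: each subalignment is assumed to hold one or two letters and one or two phonemes, null symbols are left out, and the function name `enumerate_alignments` is hypothetical.

```python
def enumerate_alignments(letters, phonemes, max_len=2):
    """Yield every alignment as a list of (letter_chunk, phoneme_chunk) pairs.

    Each chunk holds between 1 and max_len symbols (an assumption of this sketch).
    """
    if not letters and not phonemes:
        yield []  # both sequences consumed: one complete alignment
        return
    for i in range(1, min(max_len, len(letters)) + 1):
        for j in range(1, min(max_len, len(phonemes)) + 1):
            head = (tuple(letters[:i]), tuple(phonemes[:j]))
            for rest in enumerate_alignments(letters[i:], phonemes[j:], max_len):
                yield [head] + rest


# "six" has 3 letters and 4 phonemes, so no one-to-one alignment exists.
alignments = list(enumerate_alignments(list("six"), ["s", "i", "k", "s"]))
for a in alignments:
    print(" ".join("".join(l) + ":" + "".join(p) for l, p in a))
```

Picking the best alignment then amounts to scoring each enumerated candidate and keeping the maximum; for long words a dynamic program would avoid the exhaustive enumeration.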

MANY-TO-MANY EM
1. Initialize prob(SubAlignments)
// Expectation Step
2. For each word in training_set
   2.1. Produce all possible alignments
   2.2. Choose the most probable alignment
// Maximization Step
3. For all subalignments
   3.1. Compute new_p(SubAlignments)
$P(a_i) = \frac{M[l_i, p_i]}{M[l_i]}$
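
The training loop above could look roughly as follows. This is a sketch, not the authors' implementation: it assumes M[.] denotes counts collected from the chosen alignments, that chunks hold one or two symbols, that null symbols are ignored, and that unseen subalignments receive a small constant score `floor`. All names (`enumerate_alignments`, `train_hard_em`, `floor`) are hypothetical.

```python
from collections import Counter
from math import prod


def enumerate_alignments(letters, phonemes, max_len=2):
    """Yield all alignments whose chunks hold 1..max_len symbols on each side."""
    if not letters and not phonemes:
        yield []
        return
    for i in range(1, min(max_len, len(letters)) + 1):
        for j in range(1, min(max_len, len(phonemes)) + 1):
            head = (tuple(letters[:i]), tuple(phonemes[:j]))
            for rest in enumerate_alignments(letters[i:], phonemes[j:], max_len):
                yield [head] + rest


def train_hard_em(training_set, iterations=5, floor=1e-3):
    """Alternate between picking the best alignment per word and re-estimating."""
    prob = {}  # step 1: an empty table scores every subalignment at `floor`
    for _ in range(iterations):
        pair_count = Counter()    # M[l, p]
        letter_count = Counter()  # M[l]
        for letters, phonemes in training_set:  # step 2
            candidates = list(enumerate_alignments(letters, phonemes))  # 2.1
            if not candidates:
                continue  # word cannot be aligned under the chunk-size limit
            best = max(candidates,  # 2.2: most probable alignment
                       key=lambda a: prod(prob.get(s, floor) for s in a))
            for chunk_l, chunk_p in best:
                pair_count[(chunk_l, chunk_p)] += 1
                letter_count[chunk_l] += 1
        # step 3: P(a) = M[l, p] / M[l]
        prob = {(l, p): c / letter_count[l] for (l, p), c in pair_count.items()}
    return prob


training_set = [
    (list("kid"), ["k", "i", "d"]),
    (list("six"), ["s", "i", "k", "s"]),
    (list("cake"), ["k", "ei", "k"]),
]
model = train_hard_em(training_set)
for (chunk_l, chunk_p), p in sorted(model.items()):
    print("".join(chunk_l), "->", " ".join(chunk_p), round(p, 3))
```

Because step 2.2 keeps only the single best alignment per word, this is the hard (Viterbi-style) variant of EM; a soft variant would weight every candidate alignment by its posterior probability instead.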

DYNAMIC BAYESIAN NETWORK
Model
[Diagram: network over the nodes $l_i$, $P_i$ and $a_i$]
$P(A) = \prod_{i=1}^{k} P(a_i \mid L_i, P_i)$
Subalignments are considered as hidden variables
Learn DBN by EM
$P(a_i) = \frac{M[a_i]}{M[p_i, l_i]}$

CONTEXT DEPENDENT DBN
Context independence assumption
◦ Makes the model simpler
◦ It is not always a correct assumption
◦ Example: Chat and Hat
Model
[Diagram: network over the nodes $l_i$, $P_i$, $a_i$ and $a_{i-1}$]
$P(A) = \prod_{i=1}^{k} P(a_i \mid a_{i-1}, L_i, P_i)$
$P(a_i) = \frac{M[a_i]}{M[a_{i-1}, p_i, l_i]}$
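
To illustrate how conditioning on the previous subalignment enters the product, here is a small sketch. It makes simplifying assumptions that are not in the slides: the conditional table is keyed by (previous subalignment, current subalignment), a hypothetical `START` placeholder stands in for the missing predecessor of the first subalignment, the phoneme symbol "tS" is an ASCII stand-in, and every probability value is made up.

```python
START = ("<s>", "<s>")  # hypothetical placeholder for the predecessor of a_1


def context_dependent_probability(alignment, cond_prob):
    """Multiply P(a_i | a_{i-1}) along the alignment, starting from START."""
    prob = 1.0
    prev = START
    for sub in alignment:
        prob *= cond_prob.get((prev, sub), 0.0)  # unseen contexts score 0 here
        prev = sub
    return prob


ch = (("c", "h"), ("tS",))
h = (("h",), ("h",))
a = (("a",), ("a",))
t = (("t",), ("t",))

# Made-up values, for illustration only.
cond_prob = {
    (START, ch): 0.5,
    (ch, a): 0.5,
    (START, h): 0.25,
    (h, a): 0.5,
    (a, t): 1.0,
}

p_chat = context_dependent_probability([ch, a, t], cond_prob)
p_hat = context_dependent_probability([h, a, t], cond_prob)
print(p_chat, p_hat)
```

With the context in the key, the same letter "h" can be treated differently depending on what was aligned just before it, which is the distinction the Chat and Hat example points at.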

EVALUATION DIFFICULTIES
Unsupervised Evaluation
◦ No Aligned Dictionary
Solutions
◦ How much it boosts a supervised module: Letter to Phoneme Generator
◦ Comparing the result with a gold alignment: AER

LETTER TO PHONEME GENERATOR
Percentage of correctly generated phonemes and words
How does it work?
◦ Finding Chunks: Binary Classification Using Instance-Based Learning
◦ Phoneme Prediction: Phoneme is predicted independently for each letter, or phoneme is predicted for each chunk
◦ Hidden Markov Model

ALIGNMENT ERROR RATIO
AER: Evaluating by Alignment Error Ratio
◦ Counting common pairs between our aligned output ($A$) and the gold alignment ($G$)
◦ Calculating AER
$\mathrm{AER} = 1 - \frac{|A \cap G|}{|A|}$
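
A small sketch of the AER computation, assuming AER is one minus the fraction of pairs in our output A that also occur in the gold alignment G. That exact form is an assumption, as are the function name, the (position, letters, phonemes) encoding of a pair and the "_" null marker.

```python
def alignment_error_ratio(output_pairs, gold_pairs):
    """Return 1 - |A ∩ G| / |A| for output alignment A and gold alignment G."""
    output_set, gold_set = set(output_pairs), set(gold_pairs)
    if not output_set:
        raise ValueError("output alignment is empty")
    common = output_set & gold_set  # pairs shared by both alignments
    return 1.0 - len(common) / len(output_set)


# Each pair is (start position in the word, letter chunk, phoneme chunk).
output = [(0, "c", "k"), (1, "a", "ei"), (2, "k", "k"), (3, "e", "_")]
gold = [(0, "c", "k"), (1, "a", "ei"), (2, "ke", "k")]

aer = alignment_error_ratio(output, gold)
print(aer)
```

Including the position in each pair keeps repeated subalignments within one word (for example two separate k:k pairs) from collapsing into a single set element.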

RESULTS
10-fold cross-validation

| Model | Word Accuracy (%) | Phoneme Accuracy (%) |
|---|---|---|
| Best previous results | 66.82 | 92.45 |
| One_To_One EM | 53.87 | 85.66 |
| Many_To_Many EM | 76 | 94.5 |
| DBN, Context Independent | 79.12 | 95.23 |
| DBN, Context Dependent | 81.54 | 96.70 |