World Computer Congress 2013-International Conference for Artificial Intelligence (ICAI2013)...

Upload: thomas-summers

Post on 17-Dec-2015

TRANSCRIPT

  • Slide 1
  • 1 of 39 | World Computer Congress 2013-International Conference for Artificial Intelligence (ICAI2013) | Arnel C. Fajardo and Yoon-Joong Kim, Ph.D | Hanbat National University, Korea | Development of Filipino Phonetically-Balanced Words and Test Using Hidden Markov Model
  • Slide 2
  • 2 of 39 | Arnel C. Fajardo & Yoon-Joong Kim, Ph.D | Hanbat National University, South Korea | Development Of Filipino Phonetically-balanced Words And Test Using HMM
  • Introduction. Objectives:
      1. To present the development of Filipino phonetically balanced words (PBW).
      2. To test the recognition accuracy of the developed PBW using a Hidden Markov Model.
  • Slide 3
  • Related Studies: In 2003, an automatic speech recognition (ASR) system for the Filipino alphabet was developed and reported a recognition accuracy of 85.5%. However, that recognizer handled phoneme utterances with a discrete Hidden Markov Model (HMM) rather than continuous word recognition. Source: Navaro, R. D., Recognition of Tagalog Alphabets Using The Hidden Markov Model.
  • Slide 4
  • Related Studies:
      • Dela Roca G., et al. (2003) attempted continuous speech recognition using the Filipino Speech Corpus developed by Guevara R., et al. (2002) and reported only 32% recognition accuracy.
      • In 2010, an Indonesian speech corpus was added to the recognizer's training set to recognize Filipino utterances. The Indonesian corpus contains 80 hours of recordings, versus 4 hours in the Filipino speech corpus; the system achieved 79.50% recognition accuracy.
      • Note: none of these previous studies used a phonetically balanced set of words to develop its speech corpus.
      • Sources: Sakti S., Isotani, R., Kawai H., and Nakamura, S., The Use of Indonesian Speech Corpora for Developing a Filipino Continuous Speech Recognition System (2010). Guevara, R., Co, M., Espina, E., Gracia, I., Tan, E., Ensomo, R., and Sagum, R., Development of a Filipino speech corpus (2002).
  • Slide 5
  • The Specifics of the Filipino Language:
      • Filipino is the language used most widely in the Philippines, with 22 million native speakers.
      • Its alphabet consists of 20 letters: 5 vowels (a, e, i, o, u) and 15 consonants (b, k, d, g, h, l, m, n, ng, p, r, s, t, w, y).
      • Some words are spelled the same but differ slightly in pronunciation, which changes the meaning: bata /b:a - ta/, a child; bata /ba - ta/, to bear or endure.
  • Slide 6
  • The Filipino Phoneme System: Filipino vowel system (columns: front, central, back; each height row is subdivided into upper and lower)
      • High: /i/ (front), /u/ (back)
      • Mid: /e/ (front), /o/ (back)
      • Low: /a/ (central)
  • Slide 7
  • The Filipino Phoneme System: Filipino consonant system (places of articulation: labial, dental, alveolar, palatal, velar, glottal)
      • Stops, voiceless: /p/ /t/ /k/
      • Stops, voiced: /b/ /d/ /g/
      • Nasals, voiced: /n/ /ŋ/
      • Fricatives, voiceless: /s/ /h/
      • Affricates, voiceless: none listed
      • Lateral, voiced: /l/
      • Flap, voiced: /r/
      • Glides, voiced: /w/ /y/
  • Slide 8
  • The Filipino Phoneme System:
      • The diphthongs /iw/ /ay/ /aw/ /oy/ /ey/ /uy/ were also included in the vowel phoneme list.
      • Phonemes such as /p:/ /b:/ /m:/ /t:/ /d:/ /n:/ /s:/ /l:/ /k:/ /g:/ were included.
  • Slide 9
  • Development of the Filipino Phonetically Balanced Word List. Word extraction source: 16 articles from the Tagalog textbook Bagwis.
  • Slide 10
  • Development of the Filipino Phonetically Balanced Word List. Word extraction source: 16 articles from the Tagalog textbook Bagwis; 9768 transcribed words.
  • Slide 11
  • Development of the Filipino Phonetically Balanced Word List. Word extraction source: 16 articles from the Tagalog textbook Bagwis; 2938 unique words.
  • Slide 12
  • Development of the Filipino Phonetically Balanced Word List. Word extraction source: 16 articles from the Tagalog textbook Bagwis; 2938 unique words. Frequency by syllable count:
      • 1 syllable: 101
      • 2 syllables: 780
      • 3 syllables: 912
      • 4 syllables: 740
      • 5 syllables: 299
      • 6 syllables: 72
      • 7 syllables: 23
      • 8 syllables: 7
      • 9 syllables: 2
      • 10 syllables: 1
      • 13 syllables: 1
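A minimal sketch of how a syllable-count table like the one above could be produced. It assumes, as a simplification not stated on the slides, that every vowel letter (a, e, i, o, u) is its own syllable nucleus, so "bituin" counts as bi-tu-in. The function names and the sample words are illustrative only. Abbreviated spellings such as "mga" would be miscounted and would need a pronunciation lexicon.

```python
from collections import Counter

VOWELS = set("aeiou")


def count_syllables(word):
    """Count syllables as the number of vowel letters (one nucleus each)."""
    return sum(1 for ch in word.lower() if ch in VOWELS)


def syllable_histogram(words):
    """Map each syllable count to the number of words that have it."""
    return Counter(count_syllables(w) for w in words)


# Illustrative sample, not the Bagwis word list.
sample = ["bata", "bahay", "bituin", "kultural"]
hist = syllable_histogram(sample)
print(sorted(hist.items()))
```

Run over the 2938 unique words, a histogram of this kind would give the per-count frequencies listed on the slide, provided the counting rule matches the one the authors used.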
  • Slide 13
  • Development of the Filipino Phonetically Balanced Word List. Word extraction source: 16 articles from the Tagalog textbook Bagwis; 2938 unique words. Syllable count: 780 (2-syllable words), 912 (3-syllable words).
  • Slide 14
  • Development of the Filipino Phonetically Balanced Word List. Syllable count: 780 (2-syllable words), 912 (3-syllable words).
  • Slide 15
  • Development of the Filipino Phonetically Balanced Word List. Syllable count: 780 (2-syllable words), 912 (3-syllable words). Word occurrence must be > 1.
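The occurrence rule on this slide can be sketched as a frequency filter. This is only an illustration: the tokenization (whitespace split, lowercasing) and the sample sentence are assumptions, not the authors' procedure.

```python
from collections import Counter


def words_occurring_more_than_once(tokens):
    """Keep only the words whose occurrence count in the text is greater than 1."""
    counts = Counter(t.lower() for t in tokens)
    return {word: n for word, n in counts.items() if n > 1}


# Illustrative text, not taken from the Bagwis articles.
tokens = "ang bata ay nasa bahay at ang bahay ay malaki".split()
kept = words_occurring_more_than_once(tokens)
print(kept)
```

The counts are kept alongside the words because the later slides weight phonemes by word occurrence.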
  • Slide 16
  • Development of the Filipino Phonetically Balanced Word List. Syllable count: 780 (2-syllable words), 912 (3-syllable words). Word occurrence: 323 (2-syllable words), 249 (3-syllable words).
  • Slide 17
  • Development of the Filipino Phonetically Balanced Word List. Syllable count: 780 (2-syllable words), 912 (3-syllable words). Word occurrence: 323 (2-syllable words), 249 (3-syllable words). Syllabic structure: 80% of frequency.
  • Slide 18
  • Development of the Filipino Phonetically Balanced Word List. Syllable count: 780 (2-syllable words), 912 (3-syllable words). Word occurrence: 323 (2-syllable words), 249 (3-syllable words). Syllabic structures and their frequencies:
      • 2-syllable words: cv-cvc 130, v-cvc 43, cvc-cv 27, cv-cv 61
      • 3-syllable words: cv-cv-cvc 100, cvc-cv-cvc 14, v-cv-cvc 11, cv-cv-cv 38, cv-cv-vc 9, v-cv-cv 8, cvc-cv-cv 11, cv-cvc-cvc 15, cv-cvc-cv 8
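A rough sketch of how a word could be mapped to a structure label such as cv-cvc. The syllabification rule here is an assumption, not the authors' stated method: "ng" is treated as one consonant, every vowel letter is a nucleus, a single consonant between two vowels opens the next syllable, and in a cluster of two or more the first consonant closes the current syllable. Loanwords and words where "n" and "g" belong to different sounds may be labeled differently by a proper lexicon.

```python
VOWELS = set("aeiou")


def to_units(word):
    """Turn a word into a list of 'c'/'v' units, treating 'ng' as one consonant."""
    word = word.lower()
    units, i = [], 0
    while i < len(word):
        if word.startswith("ng", i):
            units.append("c")
            i += 2
        else:
            units.append("v" if word[i] in VOWELS else "c")
            i += 1
    return units


def cv_structure(word):
    """Return a hyphenated CV pattern, e.g. 'bahay' -> 'cv-cvc'."""
    units = to_units(word)
    nuclei = [i for i, u in enumerate(units) if u == "v"]
    if not nuclei:
        return "".join(units)
    syllables, start = [], 0
    for k, n in enumerate(nuclei):
        if k + 1 < len(nuclei):
            gap = nuclei[k + 1] - n - 1  # consonants before the next nucleus
            # 0 or 1 consonant: syllable ends at the vowel; 2 or more: keep one as coda
            end = n + 1 if gap <= 1 else n + 2
        else:
            end = len(units)  # last syllable takes all remaining units
        syllables.append("".join(units[start:end]))
        start = end
    return "-".join(syllables)


for w in ["bahay", "bansa", "anak", "bituin", "bagong"]:
    print(w, cv_structure(w))
```

Counting these labels over the filtered word list, and keeping the most frequent ones, is one way to arrive at a structure table like the one on the slide.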
  • Slide 19
  • Development of the Filipino Phonetically Balanced Word List. Syllable count: 780 (2-syllable words), 912 (3-syllable words). Word occurrence: 323 (2-syllable words), 249 (3-syllable words). Syllabic structure: 261 (2-syllable words), 214 (3-syllable words).
  • Slide 20
  • Development of the Filipino Phonetically Balanced Word List. Symbols:
      • F: frequency of phonemes represented in the word list
      • pfw: frequency of phoneme in a word
      • wf: frequency of word occurrence
      • n: total number of phonemes in the word list
  • Slide 21
  • Development of the Filipino Phonetically Balanced Word List
  • Slide 22
  • Development of the Filipino Phonetically Balanced Word List. Syllabic structure: 261 (2-syllable words), 214 (3-syllable words).
      • Acceptance value: aV = 1 / (x*m). Results:
      • 2-syllable words: vowel = 0.0023, consonant = 0.0018
      • 3-syllable words: vowel = 0.0015, consonant = 0.0012
      • F = frequency of phonemes; Pfw = frequency of phonemes in a word; Wf = frequency of word occurrence; N = total number of phonemes
  • Slide 23
  • Results: Phonetically Balanced Word List. Extracted word list:
      • 257 2-syllable words, representing 32 phonemes
      • 212 3-syllable words, representing 31 phonemes
  • Slide 24
  • Results: PBW list, 2-syllable words: abay, agad, akin, aking, akoy, alam, alon, amin, aming, anak, anong, apat, araw, atin, ating, ayaw, ayon, bagay, bago, bagong, bahay, baka, bakit, banal, bansa, bata, batay, batas, batay, bato, bawal, bawat, bayan, belen, beses, bigat, bigay, bigla, bihis, bilang, bisa, bisig, bomba, buhay, bukas, bukid, bukod, bulsa, bunso, buwan, dagat, dahil, dahon, dakong, dala, damit, dapat, dati, dating, datu, dito, diwa, diyos, dugong, edad, gabi, gabing, galing, ganap, gawa, gawin, gaya, gayon, gayong, gitna, gubat, guro, gusto, gutom, habang, halos, hapon, harap, hari, haring, hatid, hawak, higit, hindi, hiram, huli, huling, huwag, ibat, ibang, ibig, ibon, ikaw, ilan, ilang, ilog, inay, inang, inay, isang, isip, itoy, itong, iyay, iyan, iyoy, iyon, iyong, kahit, kalye, kamay, kami, kaming, kanya, kapag, kaya, kayat, kayang, kayo, kaysa, kita, kulang, kulay, kundi, kunin, kuya, laban, labas, labi, labis
  • Slide 25
  • Results: PBW list, 2-syllable words (continued): lagay, lagi, laging, lahat, lahi, lakas, lalo, lalong, laman, lamang, lamig, langit, lasa, libing, lihim, likod, lima, limang, linggo, lobo, lubos, lugar, lupa, lupit, maging, mahal, malay, mata, mga, mismo, mula, muli, muling, mundo, naging, naman, namang, namin, nasa, natin, ngalan, ngayoy, ngayon, ngayong, ngunit, nila, nilang, nina, ninyo, nito, nitoy, nitong, niya, niyang, niyon, oras, pagod, papel, para, parang, pari, pasko, patay, pera, perang, pero, pisngi, piso, pugad, pulis, puno, punto, puso, puto, rito, sabi, saging, saka, sana, sanay, sanang, sanay, sanhi, santa, sila, silay, sina, sino, siya, siyay, siyang, suman, sunod, tabi, takbo, tanging, tanong, tapat, tatlo, tatlo, tawa, tawag, taynga, tayo, tayong, tinda, tingin, tinig, tiyak, tugon, tulad, tulak, tunay, tungo, tuwing, ulit, unang, upang, utak, wala, walang, wari, waring, wika, wikang
  • Slide 26
  • Results: PBW list, 3-syllable words: abala, agila, akala, alila, animas, animoy, anumang, asawa, bahagi, bakuran, banayad, bayani, bihirang, binata, binigyan, bituin, bulaklak, bulsikot, bumili, buwaya, dahilang, dakila, dakilang, dalangin, dalawa, dalawang, daliri, dalisay, daluyong, damdamin, damdaming, darating, dayuhang, debate, dinatnan, dumating, galapong, gamitin, ganito, ganitong, gawain, gawaing, gayundin, gayunman, gilingang, ginawa, global, gumawa, halaman, halamang, humingi, ibayo, ibigay, ihulog, ilalim, inagaw, inutang, kabila, kabilang, kahapon, kakanin, kalamnan, kanila, kanilang, kanluran, kapatid, kapiling, kapilit, katulad, katulong, kawawa, kawawang, kukunin, kultural, lalaki, larawan, larawang, libangin, lipunan, lipunang, liwanag, lumabas, lumikha, lumipad, lumitaw, lupain, lupaing, mabilang, mabilis, mabuhay, mabuti, mabuting, madilim, maganda, magbigay, magiging, magulang, mahabang, mahina
  • Slide 27
  • Results: PBW list, 3-syllable words (continued): mahirap, makita, makitid, malagkit, malakas, malaki, malaking, malaman, malapit, malayo, maliban, maliit, malinaw, malinis, malungkot, maluwag, marahan, marami, maraming, marinig, marunong, masarap, matanda, matandang, matapos, materyal, matiyak, matulog, matuto, mayamang, minana, nabuhay, nagawa, nagbalik, nagulat, nakita, namatay, napagod, napansin, napulot, narinig, narito, nariyan, naroon, nasaan, nasabi, natural, nawala, nilikha, nobela, pagdalaw, pagiging, pagitan, palakas, paligid, pamilya, panahon, panahong, pananaw, pangalan, panlaban, patuloy, puhunan, pumasok, puwesto, sabado, sabungan, salapi, sapagkat, sapagkat, sarili, sariliy, sariling, sasagi, simbahan, sinabi, sinisi, subalit, sumagot, sumapit, sumunod, tagalog, taginting, tagumpay, tahanan, talagang, tanggihan, tawagin, tinapos, tinawag, trabaho, tumakbo, tumigil, tumulong, turista, ulilang, umaga, umuwi, unidos, utusan, yumaong
  • Slide 28
  • Test of PBW: 50 respondents (25 female, 25 male).
  • Slide 29
  • Test of PBW: 50 respondents (25 female, 25 male). All respondents were:
      • native Tagalog speakers
      • able to read and speak
      • free of any speech ailment
      • in proper disposition
  • Slide 30
  • Test of PBW: 50 respondents (25 female, 25 male), divided into dependent and independent speakers:
      • 20 dependent female speakers
      • 20 dependent male speakers
      • 5 independent female speakers
      • 5 independent male speakers
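A small sketch of a per-gender split with these proportions. The slides do not say how speakers were assigned to each group, so the seeded random assignment, the speaker IDs and the function name below are all assumptions made for illustration.

```python
import random


def split_speakers(speaker_ids, n_independent, seed=0):
    """Randomly hold out n_independent speakers; the rest are dependent (training) speakers."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    ids = list(speaker_ids)
    rng.shuffle(ids)
    independent = sorted(ids[:n_independent])
    dependent = sorted(ids[n_independent:])
    return dependent, independent


# Hypothetical speaker IDs: 25 female (F01..F25) and 25 male (M01..M25).
female = [f"F{i:02d}" for i in range(1, 26)]
male = [f"M{i:02d}" for i in range(1, 26)]

dep_f, ind_f = split_speakers(female, 5, seed=0)
dep_m, ind_m = split_speakers(male, 5, seed=1)
print(len(dep_f), len(ind_f), len(dep_m), len(ind_m))
```

Splitting each gender separately keeps both groups balanced, which matches the 20/5 counts per gender on the slide.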
  • Slide 31
  • Test of PBW: Recording specification
      • Sampling rate: 16 kHz, mono
      • Environment: isolated room
      • Microphone: unidirectional, 5-10 cm away from the mouth
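A recording can be checked against the sampling rate and channel count above with Python's standard `wave` module. This is a sketch only: the helper names are hypothetical, the silent test files stand in for real recordings, and the 16-bit sample width is an assumption because the slide gives no bit depth.

```python
import os
import tempfile
import wave


def write_silence(path, seconds=0.1, rate=16000, channels=1):
    """Write a short silent PCM WAV file (a stand-in for a real recording)."""
    with wave.open(path, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(2)  # 16-bit samples: an assumption, not stated on the slide
        wav.setframerate(rate)
        wav.writeframes(b"\x00\x00" * channels * int(seconds * rate))


def matches_spec(path, rate=16000, channels=1):
    """Return True if the WAV header reports the expected rate and channel count."""
    with wave.open(path, "rb") as wav:
        return wav.getframerate() == rate and wav.getnchannels() == channels


with tempfile.TemporaryDirectory() as tmp:
    good = os.path.join(tmp, "good.wav")
    bad = os.path.join(tmp, "bad.wav")
    write_silence(good)                         # 16 kHz mono
    write_silence(bad, rate=44100, channels=2)  # 44.1 kHz stereo
    results = (matches_spec(good), matches_spec(bad))

print(results)
```

A check like this only inspects the file header; it says nothing about room noise or microphone distance.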
  • Slide 32
  • Test of PBW: Speech recognition. Input: recorded speech (.wav file).
  • Slide 33
  • Test of PBW: Speech recognition with the Hidden Markov Model Toolkit (HTK). Input: recorded speech (.wav file).
  • Slide 34
  • Test of PBW: Speech recognition with the Hidden Markov Model Toolkit (HTK) and MFCC features. Input: recorded speech (.wav file).
  • Slide 35
  • Test of PBW: Speech recognition with the Hidden Markov Model Toolkit (HTK) and MFCC features. Input: recorded speech (.wav file). 20 dependent speakers (training data), 5 independent speakers (testing data).
  • Slide 36
  • Test of PBW: Speech recognition with the Hidden Markov Model Toolkit (HTK) and MFCC features. Input: recorded speech (.wav file). 40 dependent speakers (training data), 10 independent speakers (testing data). Output: results.
  • Slide 37
  • Results: Test of speech recognition. Recognition accuracy was lower for the independent speakers than for the dependent speakers, because the independent speakers were not part of the training data.
      • PBW2: dependent speakers 93.25%, independent speakers 88.67%
      • PBW3: dependent speakers 99.53%, independent speakers 96.30%
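The slides do not state how recognition accuracy was scored. A minimal sketch, assuming isolated-word scoring where accuracy is the percentage of test utterances whose recognized word equals the prompted word; the function name and the word lists are illustrative, not the study's data.

```python
def recognition_accuracy(reference, recognized):
    """Percentage of utterances whose recognized word equals the reference word."""
    if not reference or len(reference) != len(recognized):
        raise ValueError("reference and recognized must be non-empty and equally long")
    correct = sum(ref == hyp for ref, hyp in zip(reference, recognized))
    return 100.0 * correct / len(reference)


# Illustrative labels only.
reference = ["bata", "bahay", "araw", "gabi"]
recognized = ["bata", "bahay", "apat", "gabi"]
accuracy = recognition_accuracy(reference, recognized)
print(f"{accuracy:.2f} %")
```

Computed separately for the dependent and independent speaker groups, a score of this kind would fill a table shaped like the one above.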
  • Slide 38
  • Researchers:
      • Arnel C. Fajardo, Hanbat National University, South Korea
      • Yoon-Joong Kim, Ph.D., Hanbat National University, South Korea
  • Slide 39
  • Thank you for listening! Arnel C. Fajardo, Yoon-Joong Kim, Ph.D. | World Computer Congress 2013-International Conference for Artificial Intelligence (ICAI2013)