“Improving Pronunciation Dictionary Coverage of Names by Modelling Spelling Variation” - Justin Fackrell and Wojciech Skut Presented by Han

Post on 30-Dec-2015

TRANSCRIPT

Page 1: Presented by Han

“Improving Pronunciation Dictionary Coverage of Names by Modelling Spelling Variation” - Justin Fackrell and Wojciech Skut

Presented by Han

Page 2: Presented by Han

The Problem:

• The pronunciation of out-of-vocabulary (OOV) words is a major problem in TTS.

• Many OOV words are names.

• The orthography of English names is highly irregular.

• Current approaches to this problem have low accuracy.
– They use hand-written or automatically learned rules to replace a sequence of graphemes with a sequence of phonemes.

Page 3: Presented by Han

The Challenge

Page 4: Presented by Han

Their Method

• Scope: English surnames, forenames, street names and place names.

• Based on: the observation that some words in these categories have the same pronunciation but slightly different spellings.

• Approach: learn the rules governing these variations from existing data (data-driven), so that when an OOV word is encountered, the rules can be applied to try to transform it into an in-vocabulary (IOV) word.

Page 5: Presented by Han

Different Orthographical Expressions for the Same Pronunciation

Page 6: Presented by Han

Hypothesis

• Given a name that is not in the dictionary, there is about a 10% chance that a valid pronunciation for it DOES exist in the dictionary, under a variant spelling. The task is to map the name to that in-dictionary word.

Page 7: Presented by Han

A Hierarchical Approach

[Figure: hierarchy of Dictionary, Filter 1, Filter 2, etc.]

Page 8: Presented by Han

Two Ways of Using This Method and Their Results

• Online
– Results: the suggested pronunciations are good in 80% of cases.

• Offline
– For surnames, a model trained on a 23,000-entry dictionary was able to add 5,000 new entries, increasing the coverage by about 1%.

Page 9: Presented by Han

The Algorithm (Part I): Training

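• 1) Reverse the dictionary (pronunciation -> orthography).

• 2) Delete one-to-one mappings (pronunciations with only one spelling).

• 3) Each pair of spellings that share a common pronunciation generates a set of rewrite rules, r_i where i = 0 to n, of the form “A -> B / L _ R” (rewrite A as B when preceded by L and followed by R).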
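The three training steps above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the slides do not say how the n+1 rules for a pair differ, so the sketch assumes they differ only in how much left and right context they carry. The names `reverse_dictionary`, `rules_for_pair`, `generate_rules`, the `max_context` parameter, the `#` word-boundary marker and the pronunciation strings are all hypothetical.

```python
from collections import defaultdict
from itertools import permutations


def reverse_dictionary(lexicon):
    """Steps 1 and 2: map pronunciation -> spellings, keep only shared ones."""
    by_pron = defaultdict(set)
    for spelling, pron in lexicon.items():
        by_pron[pron].add(spelling)
    # Drop one-to-one mappings (a pronunciation with a single spelling).
    return {p: sorted(s) for p, s in by_pron.items() if len(s) > 1}


def rules_for_pair(src, dst, max_context=2):
    """Step 3: rules (A, B, L, R) meaning 'A -> B / L _ R', turning src into dst.

    Assumption: rule i uses i characters of context on each side.
    """
    s, d = f"#{src}#", f"#{dst}#"  # '#' marks the word boundaries
    # Length of the longest common prefix.
    p = 0
    while p < min(len(s), len(d)) and s[p] == d[p]:
        p += 1
    # Length of the longest common suffix that does not overlap the prefix.
    q = 0
    while q < min(len(s), len(d)) - p and s[-1 - q] == d[-1 - q]:
        q += 1
    a, b = s[p:len(s) - q], d[p:len(d) - q]
    rules = []
    for k in range(max_context + 1):
        left = s[max(0, p - k):p]
        right = s[len(s) - q:len(s) - q + k]
        rule = (a, b, left, right)
        if rule not in rules:  # short words can run out of context
            rules.append(rule)
    return rules


def generate_rules(lexicon, max_context=2):
    """Rule sets for every ordered pair of spellings sharing a pronunciation."""
    rules = {}
    for spellings in reverse_dictionary(lexicon).values():
        for src, dst in permutations(spellings, 2):
            rules[(src, dst)] = rules_for_pair(src, dst, max_context)
    return rules
```

For an invented lexicon in which "smith" and "smyth" share a pronunciation, the pair ("smyth", "smith") would yield y -> i with no context, then y -> i / m _ t, then y -> i / sm _ th.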

Page 10: Presented by Han

The Algorithm (Part I): Training

Page 11: Presented by Han

The Algorithm (Part I): Training

• Each rule, r_i, is then evaluated on the rest of the dictionary to see how useful it is, with four possible outcomes:
– MISS
– OOV
– DIFF
– GOOD

The rule thus gets four scores: n_i^MISS, n_i^OOV, n_i^DIFF, and n_i^GOOD.

• From each set of rules generated by a pair, only one rule is chosen: the shortest one with n_i^DIFF = 0.
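The evaluation and selection step might look like the sketch below. The slides name the four outcomes but do not define them, so the definitions here are assumptions: MISS means the rule does not match the word, OOV means the rewritten word is not in the dictionary, DIFF means it is in the dictionary with a different pronunciation, and GOOD means it is in the dictionary with the same pronunciation. "Shortest" is taken to mean the least context. The helper names and the rule tuple `(A, B, L, R)` are hypothetical, and a rule is applied only at its first match.

```python
def apply_rule(rule, word):
    """Apply rule (A, B, L, R) at its first match; return None if no match."""
    a, b, left, right = rule
    marked = f"#{word}#"  # '#' marks the word boundaries
    pos = marked.find(left + a + right)
    if pos < 0:
        return None
    start = pos + len(left)
    rewritten = marked[:start] + b + marked[start + len(a):]
    # Reject rewrites that would fall outside the boundary markers.
    if not (rewritten.startswith("#") and rewritten.endswith("#")):
        return None
    return rewritten[1:-1]


def evaluate_rule(rule, lexicon):
    """Count the assumed MISS / OOV / DIFF / GOOD outcomes over the lexicon."""
    counts = {"MISS": 0, "OOV": 0, "DIFF": 0, "GOOD": 0}
    for word, pron in lexicon.items():
        out = apply_rule(rule, word)
        if out is None or out == word:
            counts["MISS"] += 1   # rule does not apply to this word
        elif out not in lexicon:
            counts["OOV"] += 1    # result is not a dictionary word
        elif lexicon[out] != pron:
            counts["DIFF"] += 1   # dictionary word, different pronunciation
        else:
            counts["GOOD"] += 1   # dictionary word, same pronunciation
    return counts


def choose_rule(candidates, lexicon):
    """Pick the rule with the least context among those with DIFF == 0."""
    safe = [r for r in candidates if evaluate_rule(r, lexicon)["DIFF"] == 0]
    return min(safe, key=lambda r: len(r[2]) + len(r[3]), default=None)
```

Under these assumptions, a context-free rule such as y -> i is rejected as soon as it maps one dictionary name onto another with a different pronunciation, and the next more specific rule from the same pair is chosen instead.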

Page 12: Presented by Han

The Algorithm (Part I): Prediction

• Sort all rules by score.

• When given an OOV word, use the rule with the highest score that can map it into an IOV word.
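A rough sketch of this prediction step follows. The slides do not say how the score is computed, so the sketch simply takes rules that already carry a numeric score; `predict`, `apply_rule`, the rule tuple `(A, B, L, R)` for "A -> B / L _ R", the scores and the pronunciation strings are hypothetical, and each rule is applied only once, at its first match.

```python
def apply_rule(rule, word):
    """Apply rule (A, B, L, R) at its first match; return None if no match."""
    a, b, left, right = rule
    marked = f"#{word}#"  # '#' marks the word boundaries
    pos = marked.find(left + a + right)
    if pos < 0:
        return None
    start = pos + len(left)
    rewritten = marked[:start] + b + marked[start + len(a):]
    if not (rewritten.startswith("#") and rewritten.endswith("#")):
        return None
    return rewritten[1:-1]


def predict(word, scored_rules, lexicon):
    """Return (iov_word, pronunciation, rule_used), or None if nothing maps.

    scored_rules is a list of (rule, score) pairs; higher scores are tried first.
    """
    if word in lexicon:
        return word, lexicon[word], None  # already in vocabulary
    ranked = sorted(scored_rules, key=lambda rs: rs[1], reverse=True)
    for rule, _score in ranked:
        candidate = apply_rule(rule, word)
        if candidate in lexicon:
            return candidate, lexicon[candidate], rule
    return None


# Invented example data, for illustration only.
lexicon = {"smith": "s m I T"}
scored_rules = [(("y", "i", "m", "t"), 3), (("e", "", "", "#"), 5)]
result = predict("smyth", scored_rules, lexicon)
```

Here the higher-scoring rule (delete a final "e") does not match "smyth", so the y -> i / m _ t rule is used and the word inherits the pronunciation of "smith". A word such as "smythe" would need two rules in sequence and is left unresolved by this single-rule sketch.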

Page 13: Presented by Han

Some Examples of the Resulting Rewrite Rules

Page 14: Presented by Han

Some Results

Page 15: Presented by Han

Accuracy Test Results

Page 16: Presented by Han

Accuracy Test Results
