Natural Language Processing in iOS / OSX

Tech Talk NLP Tools in iOS/OSX Todd Kramer

Upload: cotap-engineering

Post on 16-Jul-2015


TRANSCRIPT

Page 1: Natural language processing in iOS / OSX

Tech Talk NLP Tools in iOS/OSX

Todd Kramer

Page 2: Natural language processing in iOS / OSX

NLP Tools in iOS/OSX: Topics

• CFStringTransform: transliteration, normalization

• CFStringTokenizer: string tokenization, language identification

• UITextChecker: spell check

• NSLinguisticTagger: parts-of-speech tagging, named entity recognition, lemmatization, language/script identification

• NSDataDetector: data detection

Page 3: Natural language processing in iOS / OSX

NLP Tools in iOS/OSX: CFStringTransform

The CFStringTransform Function

Page 4: Natural language processing in iOS / OSX

NLP Tools in iOS/OSX: CFStringTransform

Transliterate Thai to Latin

Original: สวัสดี; Transformed: sw̄ạsd̄ī

Page 5: Natural language processing in iOS / OSX

NLP Tools in iOS/OSX: CFStringTransform

Page 6: Natural language processing in iOS / OSX

NLP Tools in iOS/OSX: CFStringTransform

Transliterate Latin to Gujarati

Original: Gujarātī; Transformed: ગુજરાતી

Page 7: Natural language processing in iOS / OSX

NLP Tools in iOS/OSX: CFStringTransform

Remove Diacritics and Accents

Original: sw̄ạsd̄ī; Transformed: swasdi
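
The code for these slides did not survive in the transcript. As a minimal sketch (modern Swift, not the speaker's original code), the Thai-to-Latin and strip-marks steps could be chained as below. The helper name `applyTransform` is hypothetical; `CFStringTransform`, `kCFStringTransformToLatin`, and `kCFStringTransformStripCombiningMarks` are the Core Foundation symbols.

```swift
import Foundation

/// Hypothetical helper (not from the slides): runs a Core Foundation
/// transform over a mutable copy of `input` and returns the result,
/// or nil if the transform could not be applied.
func applyTransform(_ transform: CFString, to input: String, reverse: Bool = false) -> String? {
    let buffer = NSMutableString(string: input)
    // A nil range asks CFStringTransform to process the whole string.
    let succeeded = CFStringTransform(buffer as CFMutableString, nil, transform, reverse)
    return succeeded ? (buffer as String) : nil
}

let thai = "สวัสดี"
// Step 1: transliterate to Latin script. Step 2: strip the combining marks.
if let latin = applyTransform(kCFStringTransformToLatin, to: thai),
   let plain = applyTransform(kCFStringTransformStripCombiningMarks, to: latin) {
    print("Original: \(thai); Latin: \(latin); Stripped: \(plain)")
}
```

The function mutates its argument in place, which is why the sketch copies the input into an `NSMutableString` first.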

Page 8: Natural language processing in iOS / OSX

NLP Tools in iOS/OSX: CFStringTransform

Describe Unicode Characters

Original: 👍; Transformed: \N{THUMBS UP SIGN}
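
For the Latin-to-Gujarati and Unicode-name slides, a rough sketch using Foundation's `applyingTransform(_:reverse:)` wrapper (iOS 9 / OS X 10.11 and later) might look like this. It assumes the ICU transform ID `Latin-Gujarati` is available on the system; Foundation has no built-in constant for that transform, so the ID is passed as a raw value.

```swift
import Foundation

// Assumption: "Latin-Gujarati" is an ICU transform ID known to the system.
let latinToGujarati = StringTransform(rawValue: "Latin-Gujarati")

// Both calls return nil when the transform cannot be applied.
let gujarati = "Gujarātī".applyingTransform(latinToGujarati, reverse: false)
let unicodeName = "👍".applyingTransform(.toUnicodeName, reverse: false)

print("Gujarati: \(gujarati ?? "transform unavailable")")
print("Unicode name: \(unicodeName ?? "transform unavailable")")
```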

Page 9: Natural language processing in iOS / OSX

CFStringTokenizer

Page 10: Natural language processing in iOS / OSX

NLP Tools in iOS/OSX: CFStringTokenizer

Tokenize Into Words: Simplified Chinese

Tokens: [人, 人生, 而, 自由, 在, 尊严, 和, 权利, 上, 一律, 平等, 他们, 赋有, 理性, 和, 良心, 并, 应, 以, 兄弟, 关系, 的, 精神, 互相, 对待]

Page 11: Natural language processing in iOS / OSX

NLP Tools in iOS/OSX: CFStringTokenizer

Transliterate Tokens: Simplified Chinese

Tokens: [rén, rénshēng, ér, zìyóu, zài, zūnyán, hé, quánlì, shàng, yīlǜ, píngděng, tāmén, fùyǒu, lǐxìng, hé, liángxīn, bìng, yìng, yǐ, xiōngdì, guānxī, de, jīngshén, hùxiāng, duìdài]
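
The tokenizing code itself is missing from the transcript. One way to get both the word tokens and their Latin transcriptions is sketched below; it is an illustration, not the speaker's implementation. The helper name `wordTokens` and the sample sentence (reassembled from the tokens listed on the slide) are assumptions.

```swift
import Foundation

/// Hypothetical helper: splits `text` into word tokens and pairs each
/// token with its Latin transcription when the tokenizer provides one.
func wordTokens(in text: String, localeIdentifier: String) -> [(token: String, latin: String?)] {
    let cfText = text as CFString
    let fullRange = CFRangeMake(0, CFStringGetLength(cfText))
    let locale = NSLocale(localeIdentifier: localeIdentifier) as CFLocale
    guard let tokenizer = CFStringTokenizerCreate(
        kCFAllocatorDefault, cfText, fullRange, kCFStringTokenizerUnitWord, locale
    ) else { return [] }

    var results: [(token: String, latin: String?)] = []
    // AdvanceToNextToken returns an empty option set once the text is exhausted.
    while !CFStringTokenizerAdvanceToNextToken(tokenizer).isEmpty {
        let range = CFStringTokenizerGetCurrentTokenRange(tokenizer)
        let token = (text as NSString).substring(
            with: NSRange(location: range.location, length: range.length))
        // The attribute is nil when no transcription is available for the token.
        let latin = CFStringTokenizerCopyCurrentTokenAttribute(
            tokenizer, kCFStringTokenizerAttributeLatinTranscription) as? String
        results.append((token: token, latin: latin))
    }
    return results
}

for entry in wordTokens(in: "人人生而自由,在尊严和权利上一律平等。", localeIdentifier: "zh_Hans") {
    print(entry.token, entry.latin ?? "-")
}
```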

Page 12: Natural language processing in iOS / OSX

NLP Tools in iOS/OSX: CFStringTokenizer

Language Identification: Icelandic

Language Code: is
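
Language identification goes through `CFStringTokenizerCopyBestStringLanguage`. The slide's input text is not in the transcript, so the Icelandic sentence below is a stand-in chosen for this sketch, and `bestLanguage` is a hypothetical helper name.

```swift
import Foundation

/// Hypothetical helper: returns the language code Core Foundation
/// considers most likely for `text`, or nil if it cannot decide.
func bestLanguage(for text: String) -> String? {
    let cfText = text as CFString
    let fullRange = CFRangeMake(0, CFStringGetLength(cfText))
    guard let code = CFStringTokenizerCopyBestStringLanguage(cfText, fullRange) else {
        return nil
    }
    return code as String
}

// Stand-in sample text (not the slide's input).
let sample = "Hver maður er borinn frjáls og jafn öðrum að virðingu og réttindum."
print("Language Code: \(bestLanguage(for: sample) ?? "unknown")")
```

Short inputs give the detector little to work with, so longer passages tend to be identified more reliably.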

Page 13: Natural language processing in iOS / OSX

UITextChecker

Page 14: Natural language processing in iOS / OSX

NLP Tools in iOS/OSX: UITextChecker

Page 15: Natural language processing in iOS / OSX

NLP Tools in iOS/OSX: UITextChecker

Spell Check

Misspelled Range: (7,4); Guesses: Optional([ice, Bice, bide, nice, vice, bike, bile, bite, bace, bbce, bcce, bdce, bece, bfce, dice, lice, mice, pice, rice, brice, bicep])

Misspelled Range: (12,3); Guesses: Optional([ay, cay, day, say])
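
Output like the above could come from a loop over `UITextChecker`, roughly as sketched here. This is UIKit-only, so it runs on iOS rather than OS X. The helper name `misspellings` and the sample sentence are assumptions; the slide's input is not in the transcript.

```swift
import UIKit

/// Hypothetical helper: collects every misspelled range in `text`
/// together with the checker's suggested corrections.
func misspellings(in text: String, language: String = "en_US") -> [(range: NSRange, guesses: [String])] {
    let checker = UITextChecker()
    let fullRange = NSRange(location: 0, length: text.utf16.count)
    var results: [(range: NSRange, guesses: [String])] = []
    var offset = 0

    while offset < fullRange.length {
        let found = checker.rangeOfMisspelledWord(
            in: text, range: fullRange, startingAt: offset, wrap: false, language: language)
        // NSNotFound signals that no further misspellings exist.
        if found.location == NSNotFound { break }
        let guesses = checker.guesses(forWordRange: found, in: text, language: language) ?? []
        results.append((range: found, guesses: guesses))
        // Resume the search just past the word that was reported.
        offset = found.location + found.length
    }
    return results
}

// Made-up sample sentence for illustration.
for item in misspellings(in: "Have a bice dya") {
    print("Misspelled Range: (\(item.range.location),\(item.range.length)); Guesses: \(item.guesses)")
}
```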

Page 16: Natural language processing in iOS / OSX

NSLinguisticTagger

Page 17: Natural language processing in iOS / OSX

NLP Tools in iOS/OSX: NSLinguisticTagger

Parts of Speech Tagging and Named Entity Recognition

Page 18: Natural language processing in iOS / OSX

NLP Tools in iOS/OSX: NSLinguisticTagger

NSLinguisticTagger Schemes

Page 19: Natural language processing in iOS / OSX

NLP Tools in iOS/OSX: NSLinguisticTagger

Parts of Speech Tagging and Named Entity Recognition

Page 20: Natural language processing in iOS / OSX

NLP Tools in iOS/OSX: NSLinguisticTagger

Parts of Speech Tagging and Named Entity Recognition

Token: What; Tag: Pronoun
Token: is; Tag: Verb
Token: the; Tag: Determiner
Token: capital; Tag: Noun
Token: of; Tag: Preposition
Token: New York; Tag: PlaceName
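
A sketch of how such tags might be produced, assuming the combined `nameTypeOrLexicalClass` scheme and the `joinNames` option (which would explain "New York" appearing as a single token). The unit-based `enumerateTags` call used here requires iOS 11 / macOS 10.13 or later, so it is newer than the talk, and `NSLinguisticTagger` has since been deprecated in favor of the NaturalLanguage framework. `tagWords` is a hypothetical helper name.

```swift
import Foundation

/// Hypothetical helper: returns each word with either its lexical class
/// (Noun, Verb, ...) or, for named entities, its name type (PlaceName, ...).
func tagWords(in text: String) -> [(token: String, tag: String)] {
    let tagger = NSLinguisticTagger(tagSchemes: [.nameTypeOrLexicalClass], options: 0)
    tagger.string = text
    let fullRange = NSRange(location: 0, length: text.utf16.count)
    // joinNames merges multi-word names into one token.
    let options: NSLinguisticTagger.Options = [.omitWhitespace, .omitPunctuation, .joinNames]

    var results: [(token: String, tag: String)] = []
    tagger.enumerateTags(in: fullRange, unit: .word, scheme: .nameTypeOrLexicalClass,
                         options: options) { tag, tokenRange, _ in
        guard let tag = tag else { return }
        let token = (text as NSString).substring(with: tokenRange)
        results.append((token: token, tag: tag.rawValue))
    }
    return results
}

for entry in tagWords(in: "What is the capital of New York?") {
    print("Token: \(entry.token); Tag: \(entry.tag)")
}
```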

Page 21: Natural language processing in iOS / OSX

NLP Tools in iOS/OSX: NSLinguisticTagger

Script Identification

Page 22: Natural language processing in iOS / OSX

NLP Tools in iOS/OSX: NSLinguisticTagger

Script Identification

Token: hello; Tag: Latn
Token: สวัสดี; Tag: Thai
Token: bonjour; Tag: Latn
Token: 你; Tag: Hani
Token: 好; Tag: Hani
Token: હેલો; Tag: Gujr
Token: привет; Tag: Cyrl
Token: नमस्ते; Tag: Deva
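
Script identification uses the same tagger with the `script` scheme, which yields ISO 15924 codes such as `Latn` or `Thai`. The following is a sketch under the same assumptions as before (iOS 11 / macOS 10.13 or later API, hypothetical helper name `scriptTags`, sample string assembled from some of the tokens on the slide).

```swift
import Foundation

/// Hypothetical helper: tags each word in `text` with its script code.
func scriptTags(in text: String) -> [(token: String, script: String)] {
    let tagger = NSLinguisticTagger(tagSchemes: [.script], options: 0)
    tagger.string = text
    let fullRange = NSRange(location: 0, length: text.utf16.count)

    var results: [(token: String, script: String)] = []
    tagger.enumerateTags(in: fullRange, unit: .word, scheme: .script,
                         options: [.omitWhitespace, .omitPunctuation]) { tag, tokenRange, _ in
        guard let tag = tag else { return }
        let token = (text as NSString).substring(with: tokenRange)
        results.append((token: token, script: tag.rawValue))
    }
    return results
}

for entry in scriptTags(in: "hello สวัสดี bonjour 你好 привет नमस्ते") {
    print("Token: \(entry.token); Tag: \(entry.script)")
}
```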

Page 23: Natural language processing in iOS / OSX

NSDataDetector

Page 24: Natural language processing in iOS / OSX

NLP Tools in iOS/OSX: NSDataDetector

Extracting Structured Data

Page 25: Natural language processing in iOS / OSX

NLP Tools in iOS/OSX: NSDataDetector

Extracting Structured Data

Match: Lunch tomorrow at 12:30PM
  - Date: Optional(2014-11-20 20:30:00 +0000)
Match: 1600 Pennsylvania Ave. NW, Washington, D.C. 20500
  - Street: Optional(1600 Pennsylvania Ave.)
  - Zip: Optional(20500)
Match: 202-456-1414
Match: 2:15PM
  - Date: Optional(2014-11-19 22:15:00 +0000)
Match: Southwest Airlines Flight 737
Match: www.southwest.com
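
The detector code is not in the transcript. A minimal sketch of extracting dates, addresses, phone numbers, links, and transit information follows. The helper name `detectData` and the sample message are made up for illustration; the message is only loosely modeled on the matches listed on the slide.

```swift
import Foundation

/// Hypothetical helper: runs a data detector for several checking types
/// over the whole of `text`.
func detectData(in text: String) throws -> [NSTextCheckingResult] {
    let types: NSTextCheckingResult.CheckingType =
        [.date, .address, .phoneNumber, .link, .transitInformation]
    let detector = try NSDataDetector(types: types.rawValue)
    let fullRange = NSRange(location: 0, length: text.utf16.count)
    return detector.matches(in: text, options: [], range: fullRange)
}

// Made-up sample message for illustration.
let message = "Lunch tomorrow at 12:30PM at 1600 Pennsylvania Ave. NW, Washington, D.C. 20500. "
    + "Call 202-456-1414. Southwest Airlines Flight 737 leaves at 2:15PM, see www.southwest.com"

do {
    for match in try detectData(in: message) {
        print("Match: \((message as NSString).substring(with: match.range))")
        // Each result type exposes its structured payload through different properties.
        if let date = match.date { print("  - Date: \(date)") }
        if let address = match.addressComponents {
            if let street = address[.street] { print("  - Street: \(street)") }
            if let zip = address[.zip] { print("  - Zip: \(zip)") }
        }
        if let phone = match.phoneNumber { print("  - Phone: \(phone)") }
        if let url = match.url { print("  - URL: \(url)") }
    }
} catch {
    print("Could not create detector: \(error)")
}
```

Relative dates such as "tomorrow" are resolved against the current date and time zone, so the printed dates vary from run to run.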