Cognitive Systems Institute External Speaker Series January 15, 2015
Chris Biemann
Adaptive Natural Language Processing
2
Natural Language Understanding – the key to intelligent behavior
§ Most information and knowledge is encoded in unstructured form in natural language
§ When humans learn about a new topic, they read about it – machines should do the same
§ Natural language content on the internet is growing constantly
§ Natural language is evolving, and natural language processing should account for that
Cognitive computing
Cognitive computing systems learn and interact naturally with people to extend what either humans or machine could do on their own. They help human experts make better decisions by penetrating the complexity of Big Data.
http://www.research.ibm.com/cognitive-computing
3
Why Language is difficult ..
He sat on the river bank and counted his dough.
She went to the bank and took out some money.
Lexical Layer: bank, dough, money
Concept Layer: bank is polysemous (river bank vs. financial institution); dough and money are synonymous
7
Why Not To Use Dictionaries or Ontologies
Advantages:
§ Sense inventory given
§ Linking to concepts
§ Full control
Disadvantages:
• Dictionaries have to be created
• Dictionaries are incomplete
• Language changes constantly: new words, new meanings …
“give a man a fish and you feed him for a day…
Photo by zeh fernando under Creative Commons licence
http://www.w3.org/wiki/SweoIG/TaskForces/CommunityProjects/LinkingOpenData
10
Structure Discovery Paradigm
… teach a man to fish and you feed him for a lifetime”
Consequences:
§ Only raw text input required
§ No fine-grained control on categories
§ Cognitive system: learns from and adapts to data
[Diagram: SD algorithms take Text Data, find regularities by analysis, and annotate the data with these regularities; the annotations are used as features for a Task]
11
The JoBimText project – www.jobimtext.org
Partners:
§ Lead at IBM: Alfio Gliozzo, IBM Watson DeepQA, Yorktown, NY, USA
§ Lead at TU DA: Chris Biemann, Language Technology, TU Darmstadt, Germany
Software Capabilities:
§ Compute a Distributional Thesaurus
§ Compute Sense Representations
§ 2-Dimensional Text: Contextualized Expansion
§ RESTful API and Web Demo
Features:
§ Scalable architecture
§ Open Source, ASL 2.0
12
2D Text: Matching Meaning beyond Keywords
almost no word overlap
Where was the first professor for electric science established?
In 1883 the first faculty for electrical engineering was founded there.
teacher professor student graduate alumnus staff campus
electric mechanical thermal electronic industrial optical automotive
science sciences biology physics economics mathematics psychology
co-found form establish own join rename bear
director emeritus dean lecturer president psychologist historian
electrical heavy-duty antique battery-powered electronic stainless diesel
biology economics sciences mathematics physics math psychology
create form set maintain found abolish strengthen
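The matching idea on this slide can be sketched in a few lines of Python. This is only an illustration, not the JoBimText implementation: the expansion lists are copied from the slide, but which list belongs to which word is an assumption, and the names `EXPANSIONS`, `expand` and `match_terms` are hypothetical. Two terms count as a match if one appears in the expansion of the other.

```python
# Expansion lists as shown on the slide. The assignment of each list to a
# word is an assumption made for this sketch.
EXPANSIONS = {
    # terms from the question (lemmatized by hand)
    "professor": ["director", "emeritus", "dean", "lecturer", "president",
                  "psychologist", "historian"],
    "electric": ["electrical", "heavy-duty", "antique", "battery-powered",
                 "electronic", "stainless", "diesel"],
    "science": ["biology", "economics", "sciences", "mathematics", "physics",
                "math", "psychology"],
    "establish": ["create", "form", "set", "maintain", "found", "abolish",
                  "strengthen"],
    # terms from the answer sentence (lemmatized by hand)
    "faculty": ["teacher", "professor", "student", "graduate", "alumnus",
                "staff", "campus"],
    "electrical": ["electric", "mechanical", "thermal", "electronic",
                   "industrial", "optical", "automotive"],
    "engineering": ["science", "sciences", "biology", "physics", "economics",
                    "mathematics", "psychology"],
    "found": ["co-found", "form", "establish", "own", "join", "rename", "bear"],
}

QUESTION = ["professor", "electric", "science", "establish"]
ANSWER = ["faculty", "electrical", "engineering", "found"]


def expand(term):
    """Return the term together with its similar terms (the second dimension)."""
    return {term, *EXPANSIONS.get(term, [])}


def match_terms(question_terms, answer_terms):
    """Pair up terms where one occurs in the expansion of the other."""
    matches = []
    for q in question_terms:
        for a in answer_terms:
            if a in expand(q) or q in expand(a):
                matches.append((q, a))
    return matches


if __name__ == "__main__":
    print("keyword overlap:", set(QUESTION) & set(ANSWER))
    print("matches via expansions:", match_terms(QUESTION, ANSWER))
```

With plain keywords the two sentences share no content word; with the expansions, each question term finds a partner in the answer sentence.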
15
Sipping cappuccino ..
16
.. in Milan.
18
Clustering of DT entries: Sense Induction
bright#JJ
paper#NN
C. Biemann (2006): Chinese Whispers - an Efficient Graph Clustering Algorithm and its Application to Natural Language Processing Problems. Proceedings of the HLT-NAACL-06 Workshop on Textgraphs-06, New York, USA.
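A minimal sketch of Chinese Whispers label propagation, as used for clustering the neighbors of a DT entry into senses, might look as follows. The neighbor words and edge weights are made up for illustration, and ties are broken deterministically here (the original algorithm breaks them randomly) so that the example is reproducible.

```python
import random
from collections import defaultdict

# Hypothetical similarity graph among neighbors of "paper" (weights invented).
EDGES = [
    ("newspaper", "article", 10), ("newspaper", "magazine", 10),
    ("article", "magazine", 10),
    ("cardboard", "plastic", 10), ("cardboard", "glass", 10),
    ("plastic", "glass", 10),
    ("magazine", "cardboard", 1),  # weak link between the two groups
]


def chinese_whispers(edges, iterations=20, seed=0):
    """Cluster an undirected weighted graph by iterative label propagation."""
    graph = defaultdict(dict)
    for u, v, weight in edges:
        graph[u][v] = weight
        graph[v][u] = weight

    labels = {node: node for node in graph}  # every node starts in its own class
    rng = random.Random(seed)
    nodes = sorted(graph)

    for _ in range(iterations):
        rng.shuffle(nodes)  # nodes are visited in random order
        changed = False
        for node in nodes:
            # Sum edge weights per label in the neighborhood.
            scores = defaultdict(float)
            for neighbor, weight in graph[node].items():
                scores[labels[neighbor]] += weight
            # Adopt the strongest label; ties go to the larger label name.
            best = max(scores.items(), key=lambda kv: (kv[1], kv[0]))[0]
            if best != labels[node]:
                labels[node] = best
                changed = True
        if not changed:
            break
    return labels


if __name__ == "__main__":
    clusters = defaultdict(list)
    for word, label in chinese_whispers(EDGES).items():
        clusters[label].append(word)
    print(sorted(sorted(members) for members in clusters.values()))
```

The number of clusters is not given in advance; it emerges from the graph, which is what makes the method suitable for sense induction.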
19
Features for Disambiguation
paper 0 (newspaper): read#VB#-dobj 45, reading#VBG#-dobj 45, write#VB#-dobj 38, read#VBD#-dobj 37, writing#VBG#-dobj 36, wrote#VBD#-dobj 34, original#JJ#amod 27, wrote#VBD#-prep_in 26, recent#JJ#amod 26, published#VBN#partmod 25, written#VBN#-dobj 23, published#VBN#-nsubjpass 20, published#VBD#-dobj 19, copy#NN#-prep_of 18, said#VBD#-prep_in 18, author#NN#-prep_of 17, pages#NNS#-prep_of 16, told#VBD#-dobj 15, buy#VB#-dobj 14, published#VBN#-prep_in 14, page#NN#-prep_of 14
paper 1 (material): piece#NN#-prep_of 21, pieces#NNS#-prep_of 17, made#VBN#-prep_from 13, bags#NNS#-nn 11, white#JJ#amod 9, paper#NN#-conj_and 9, glass#NN#-conj_and 9, products#NNS#-nn 9, industry#NN#-nn 8, plastic#NN#conj_and 8, plastic#NN#-conj_and 8, bits#NNS#-prep_of 8, bag#NN#-nn 8, plastic#NN#conj_or 8, sheet#NN#-prep_of 7, recycled#JJ#amod 7, tons#NNS#-prep_of 7, glass#NN#conj_and 7, buy#VB#-dobj 6, plates#NNS#-nn 6, pile#NN#-prep_of 6
These features are shared by paper and the members of the respective sense cluster.
Disambiguation: find these features in the context.
Example: I am reading an original paper on the paper.
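As a rough sketch of this disambiguation step: look up the context features of an occurrence of paper in each sense's feature list and pick the sense with the highest total. The feature counts are a subset of those on the slide; summing counts as the score, and the names `SENSE_FEATURES` and `disambiguate`, are assumptions made for this illustration. Extracting the features from a sentence would need a dependency parser, so they are passed in directly here.

```python
# Subset of the per-sense feature counts shown on the slide.
SENSE_FEATURES = {
    0: {  # paper 0 (newspaper)
        "read#VB#-dobj": 45,
        "reading#VBG#-dobj": 45,
        "original#JJ#amod": 27,
        "published#VBN#partmod": 25,
        "buy#VB#-dobj": 14,
    },
    1: {  # paper 1 (material)
        "piece#NN#-prep_of": 21,
        "made#VBN#-prep_from": 13,
        "white#JJ#amod": 9,
        "recycled#JJ#amod": 7,
        "buy#VB#-dobj": 6,
    },
}


def disambiguate(context_features):
    """Return (best sense id or None, score per sense) for the given features.

    Scoring by summed counts is an assumption of this sketch.
    """
    scores = {
        sense: sum(features.get(f, 0) for f in context_features)
        for sense, features in SENSE_FEATURES.items()
    }
    best = max(scores, key=scores.get)
    if scores[best] == 0:
        return None, scores  # no known feature found in the context
    return best, scores


if __name__ == "__main__":
    # "I am reading an original paper": paper is the direct object of
    # "reading" and is modified by "original".
    print(disambiguate(["reading#VBG#-dobj", "original#JJ#amod"]))
```

For the first occurrence of paper in the example sentence, both context features appear only in the feature list of sense 0.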
20
Paraphrasing with JoBimText
22
JoBimText Model example “beetle”
S. Mitra, R. Mitra, M. Riedl, C. Biemann, A. Mukherjee, P. Goyal (2014): That’s sick dude!: Automatic identification of word sense change across different timescales. Proceedings of ACL-2014, Baltimore, MD, USA
http://www.thezooom.com/2013/01/10749/
24
Outlook: From Similarities and Relations…
Cathy liked the blue dress very much.
She bought it for 15 Euros from the shop.
gown skirt blouse
Pat Brian Kevin
red purple green
currency greenback yen
store restaurant boutique
COLOR CLOTHING FIRSTNAME
MONEY SALESPOINT
HAS-PROPERTY
1: ENTITIES 2: RELATIONS
25
Sneak Preview: Induction of Relations
§ JoBimText model on pairs and paths between pairs
26
… to Frames and Causality
She bought it for 15 Euros from the shop. MONEY SALESPOINT
FIRSTNAME adored CLOTHING FIRSTNAME found CLOTHING great
POSITIVE-OPINION-ABOUT
subj=FIRSTNAME obj=CLOTHING
VERKAUFSVORGANG (German: sales transaction)
subj=AGENT obj=THING für=MONEY (für: for) loc=SALESPOINT
FIRSTNAME
CLOTHING
Cathy
dress
Cathy
dress
3: FRAMES 4: CAUSALITY
Cathy liked the blue dress very much. COLOR CLOTHING FIRSTNAME HAS-PROPERTY
27
Sneak Preview: Frame Induction
28
Applications of JoBimText in IBM Watson
§ JoBimText informs relation extraction: significant improvements in EMRA application, e.g. for finding drug prescriptions for diseases
§ JoBimText sense clusters are being used to inform term matching, e.g. when finding justifications for answers
§ JoBimText is one of the solutions for knowledge induction from text in new domains
29
Conclusion
§ The role of Natural Language Processing in Cognitive Computing is two-fold:
  § the technology for natural interaction with the system
  § a technology subject to be framed in the cognitive paradigm
§ Adaptive Natural Language Processing
  § makes use of static AND dynamically generated resources
  § is driven by (text) data that defines its application domain
  § accounts for language evolution and new meanings by adaptation to the data
  § goes beyond NLP pipelines
31
Thanks..
.. and now some (deep) QA!
www.jobimtext.org
Special Track: Semantic and Cognitive Computing
32
33
The @-ing (‘holing’) operation: producing pairs of Jos and Bims
SENTENCE: I suffered from a cold and took aspirin.
STANFORD COLLAPSED DEPENDENCIES (http://nlp.stanford.edu:8080/parser/): nsubj(suffered, I); nsubj(took, I); root(ROOT, suffered); det(cold, a); prep_from(suffered, cold); conj_and(suffered, took); dobj(took, aspirin)
WORD-CONTEXT PAIRS:
Jo        Bim                      Count
suffered  nsubj(@@, I)             1
took      nsubj(@@, I)             1
cold      det(@@, a)               1
suffered  prep_from(@@, cold)      1
suffered  conj_and(@@, took)       1
took      dobj(@@, aspirin)        1
I         nsubj(suffered, @@)      1
I         nsubj(took, @@)          1
a         det(cold, @@)            1
cold      prep_from(suffered, @@)  1
took      conj_and(suffered, @@)   1
aspirin   dobj(took, @@)           1
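The holing operation on this slide can be reproduced with a short Python sketch. It assumes the dependency parse is already available as (relation, governor, dependent) triples; the function name `holing` and this input format are choices made for the illustration, not the JoBimText API. Each dependency yields two pairs, one per argument, with @@ marking the position of the Jo; the root relation is skipped, as on the slide.

```python
from collections import Counter

# Collapsed dependencies for "I suffered from a cold and took aspirin.",
# written as (relation, governor, dependent) triples.
DEPENDENCIES = [
    ("nsubj", "suffered", "I"),
    ("nsubj", "took", "I"),
    ("root", "ROOT", "suffered"),
    ("det", "cold", "a"),
    ("prep_from", "suffered", "cold"),
    ("conj_and", "suffered", "took"),
    ("dobj", "took", "aspirin"),
]


def holing(dependencies):
    """Count (Jo, Bim) pairs: each word paired with its context, word replaced by @@."""
    pairs = Counter()
    for relation, governor, dependent in dependencies:
        if relation == "root":
            continue  # the artificial ROOT node produces no pairs
        pairs[(governor, f"{relation}(@@, {dependent})")] += 1
        pairs[(dependent, f"{relation}({governor}, @@)")] += 1
    return pairs


if __name__ == "__main__":
    for (jo, bim), count in holing(DEPENDENCIES).items():
        print(jo, bim, count)
```

Other holing operations (for example neighboring words instead of dependencies) would only change how the Bim string is built; the downstream counting stays the same.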
34
Distributional Thesaurus (DT)
§ Computed from distributional similarity statistics
§ Entry for a target word consists of a ranked list of neighbors
meeting: meeting 288, meetings 102, hearing 89, session 68, conference 62, summit 51, forum 46, workshop 46, hearings 46, ceremony 45, sessions 41, briefing 40, event 40, convention 38, gathering 36, ...
articulate: articulate 89, explain 19, understand 17, communicate 17, defend 16, establish 15, deliver 14, evaluate 14, adjust 14, manage 13, speak 13, change 13, answer 13, maintain 13, ...
[Figure: First order: graph linking the words immaculate and perfect to the context features amod(condition,@@), amod(timing,@@), nsubj(@@,hair), cop(@@,remains), amod(Church,@@). Second order: immaculate and perfect linked with weight 3.]
35
Scaling Computation with MapReduce
Example text: Roomano is a hard Gouda-like cheese from Friesland in the northern part of The Netherlands. It pairs well with aged sherries ...
Steps and their parameters:
§ FreqSig (t: min freq; s: min sign)
§ Holing using gramm. relations
§ AggrPerFt
§ SimCounts (w: weighting for # words / feature)
§ PruneGraph (p: max number of features per word; s) (like data below)
§ Convert sum threshold
Intermediate data:
word      feature              t
hard#a    cheese#ADJ_MODn      17
cheese#n  Gouda-like#ADJ_MODa  5
cheese#n  hard#ADJ_MODa        17
pair#v    well#ADV_MODa        3
...       ...                  ...

word      feature              s
hard#a    cheese#ADJ_MODn      15.8
cheese#n  Gouda-like#ADJ_MODa  7.6
cheese#n  hard#ADJ_MODa        0.4
...       ...                  ...

feature          words
cheese#ADJ_MODn  hard#a, yellow#a, French#a
hard#ADJ_MODa    cheese#n, stone#n
...              ...

word      word      w.sum
hard#a    yellow#a  0.234
yellow#a  hard#a    0.234
cheese#n  stone#n   3.14
...       ...       ...

Resulting DT entry:
ibm: i.b.m. 164, intel 154, hewlett-packard 151, dell 141, cisco 134, microsoft 125, hp 124
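To make the data flow of these steps concrete, here is a single-process Python sketch. It is a strong simplification and not the MapReduce implementation: the significance score of FreqSig and the weighting w of SimCounts are left out, similarity is simply the number of shared features, and the function name `build_dt` is hypothetical. Of the counts below, only 17, 5, 17 and 3 appear on the slide; the others are invented for the example.

```python
from collections import defaultdict
from itertools import permutations

# (word, feature) -> frequency, as produced by holing and counting.
COUNTS = {
    ("hard#a", "cheese#ADJ_MODn"): 17,
    ("cheese#n", "Gouda-like#ADJ_MODa"): 5,
    ("cheese#n", "hard#ADJ_MODa"): 17,
    ("pair#v", "well#ADV_MODa"): 3,
    # invented counts below
    ("yellow#a", "cheese#ADJ_MODn"): 4,
    ("French#a", "cheese#ADJ_MODn"): 6,
    ("stone#n", "hard#ADJ_MODa"): 9,
    ("hard#a", "stone#ADJ_MODn"): 9,
    ("yellow#a", "stone#ADJ_MODn"): 2,
}


def build_dt(counts, t=2, p=1000):
    """Build a distributional thesaurus from word-feature counts.

    t: minimum frequency of a word-feature pair
    p: maximum number of features kept per word
    """
    # Frequency filter (the significance filter s is omitted in this sketch).
    features_per_word = defaultdict(list)
    for (word, feature), freq in counts.items():
        if freq >= t:
            features_per_word[word].append((freq, feature))

    # Pruning: keep the p most frequent features per word, then
    # aggregate per feature: feature -> set of words.
    words_per_feature = defaultdict(set)
    for word, features in features_per_word.items():
        for _, feature in sorted(features, reverse=True)[:p]:
            words_per_feature[feature].add(word)

    # Similarity counts: every shared feature adds 1 to a word pair.
    similarity = defaultdict(lambda: defaultdict(int))
    for words in words_per_feature.values():
        for w1, w2 in permutations(sorted(words), 2):
            similarity[w1][w2] += 1

    # Convert to ranked neighbor lists.
    return {
        word: sorted(neighbors.items(), key=lambda kv: (-kv[1], kv[0]))
        for word, neighbors in similarity.items()
    }


if __name__ == "__main__":
    for word, neighbors in sorted(build_dt(COUNTS).items()):
        print(word, neighbors)
```

Aggregating per feature before counting pairs avoids comparing every word with every other word: only words that share at least one feature are ever paired, which is what makes the computation suitable for distribution.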