biemann ibm cog_comp_jan2015_noanim

Cognitive Systems Institute External Speaker Series, January 15, 2015. Chris Biemann: Adaptive Natural Language Processing

Upload: diannepatricia

Post on 03-Aug-2015


Page 1: Biemann ibm cog_comp_jan2015_noanim

Cognitive Systems Institute External Speaker Series January 15, 2015

Chris Biemann

Adaptive Natural Language Processing

Page 2: Biemann ibm cog_comp_jan2015_noanim

Natural Language Understanding – the key to intelligent behavior

§ Most information and knowledge is encoded in unstructured form in natural language
§ When humans learn about a new topic, they read about it – machines should do the same
§ Natural language content on the internet is growing constantly
§ Natural language is evolving, and natural language processing should account for that

Cognitive computing: Cognitive computing systems learn and interact naturally with people to extend what either humans or machines could do on their own. They help human experts make better decisions by penetrating the complexity of Big Data.

http://www.research.ibm.com/cognitive-computing

Page 3: Biemann ibm cog_comp_jan2015_noanim

Why Language is difficult ..

He sat on the river bank and counted his dough.

She went to the bank and took out some money.

Lexical Layer vs. Concept Layer:

§ polysemous: "bank" is one word on the lexical layer that maps to two concepts (river bank, financial institution)
§ synonymous: "dough" and "money" are two words on the lexical layer that map to one concept
Page 7: Biemann ibm cog_comp_jan2015_noanim

Why Not To Use Dictionaries or Ontologies

Advantages:
§ Sense inventory given
§ Linking to concepts
§ Full control

Disadvantages:
• Dictionaries have to be created
• Dictionaries are incomplete
• Language changes constantly: new words, new meanings …

“give a man a fish and you feed him for a day…

Photo by zeh fernando under Creative Commons licence

http://www.w3.org/wiki/SweoIG/TaskForces/CommunityProjects/LinkingOpenData

Page 10: Biemann ibm cog_comp_jan2015_noanim

Structure Discovery Paradigm

… teach a man to fish and you feed him for a lifetime”

Consequences:
§ Only raw text input required
§ No fine-grained control on categories
§ Cognitive system: learns from and adapts to data

[Diagram: SD algorithms take Text Data, find regularities by analysis, and annotate the data with regularities; the Task uses the annotations as features.]

Page 11: Biemann ibm cog_comp_jan2015_noanim

The JoBimText project – www.jobimtext.org

Partners:
§ Lead at IBM: Alfio Gliozzo, IBM Watson DeepQA, Yorktown, NY, USA
§ Lead at TU DA: Chris Biemann, Language Technology, TU Darmstadt, Germany

Software Capabilities:
§ Compute a Distributional Thesaurus
§ Compute Sense Representations
§ 2-Dimensional Text: Contextualized Expansion
§ RESTful API and Web Demo

Features:
§ Scalable architecture
§ Open Source, ASL 2.0

Page 12: Biemann ibm cog_comp_jan2015_noanim

2D Text: Matching Meaning beyond Keywords

Where was the first professor for electric science established?

In 1883 the first faculty for electrical engineering was founded there.

The two sentences have almost no word overlap. Expansion lists shown on the slide:

teacher professor student graduate alumnus staff campus

electric mechanical thermal electronic industrial optical automotive

science sciences biology physics economics mathematics psychology

co-found form establish own join rename bear

director emeritus dean lecturer president psychologist historian

electrical heavy-duty antique battery-powered electronic stainless diesel

biology economics sciences mathematics physics math psychology

create form set maintain found abolish strengthen
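
The matching idea can be made concrete with a minimal sketch. It assumes that each expansion list on the slide belongs to the content word it most plausibly expands (the transcript does not preserve that assignment), that words are lemmatized ("establish", "found"), and that two words match when their expanded sets overlap. The names `EXPANSIONS`, `expand` and `match` are hypothetical and not part of the JoBimText API.

```python
# Expansion lists copied from the slide. Which list belongs to which
# word is an assumption made for this sketch.
EXPANSIONS = {
    "professor": ["director", "emeritus", "dean", "lecturer",
                  "president", "psychologist", "historian"],
    "electric": ["electrical", "heavy-duty", "antique", "battery-powered",
                 "electronic", "stainless", "diesel"],
    "science": ["biology", "economics", "sciences", "mathematics",
                "physics", "math", "psychology"],
    "establish": ["create", "form", "set", "maintain",
                  "found", "abolish", "strengthen"],
    "faculty": ["teacher", "professor", "student", "graduate",
                "alumnus", "staff", "campus"],
    "electrical": ["electric", "mechanical", "thermal", "electronic",
                   "industrial", "optical", "automotive"],
    "engineering": ["science", "sciences", "biology", "physics",
                    "economics", "mathematics", "psychology"],
    "found": ["co-found", "form", "establish", "own",
              "join", "rename", "bear"],
}


def expand(word):
    """Return the word plus its expansions (the second text dimension)."""
    return {word} | set(EXPANSIONS.get(word, []))


def match(question_words, answer_words):
    """Pair up words whose expanded sets overlap."""
    pairs = []
    for q in question_words:
        for a in answer_words:
            shared = expand(q) & expand(a)
            if shared:
                pairs.append((q, a, sorted(shared)))
    return pairs


# Lemmatized content words of the two example sentences.
question = ["professor", "electric", "science", "establish"]
answer = ["faculty", "electrical", "engineering", "found"]
matches = match(question, answer)

for q, a, shared in matches:
    print(q, "<->", a, "via", shared)
```

Plain keyword matching finds nothing in common between the two word lists, while the expanded sets link every question word to exactly one answer word.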

Page 15: Biemann ibm cog_comp_jan2015_noanim

Sipping cappuccino ..

.. in Milan.

Page 18: Biemann ibm cog_comp_jan2015_noanim

Clustering of DT entries: Sense Induction

[Figures: clustered DT entries for bright#JJ and paper#NN]

C. Biemann (2006): Chinese Whispers - an Efficient Graph Clustering Algorithm and its Application to Natural Language Processing Problems. Proceedings of the HLT-NAACL-06 Workshop on Textgraphs-06, New York, USA.
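
A minimal sketch of Chinese Whispers on a toy neighborhood graph may help here. Every node starts in its own class and then repeatedly adopts the class with the highest summed edge weight among its neighbors. The graph below (DT neighbors of "paper" with invented weights) is hypothetical, and ties are broken alphabetically to keep the run reproducible, whereas the published algorithm breaks ties randomly.

```python
import random
from collections import defaultdict
from itertools import combinations


def chinese_whispers(edges, iterations=20, seed=0):
    """Cluster an undirected weighted graph given as (a, b, weight) triples."""
    rng = random.Random(seed)
    neighbors = defaultdict(dict)
    for a, b, w in edges:
        neighbors[a][b] = w
        neighbors[b][a] = w
    # Every node starts in its own class.
    label = {node: node for node in neighbors}
    nodes = sorted(neighbors)
    for _ in range(iterations):
        rng.shuffle(nodes)  # nodes are visited in random order
        changed = False
        for node in nodes:
            # Sum the edge weights per class in the neighborhood.
            votes = defaultdict(float)
            for other, w in neighbors[node].items():
                votes[label[other]] += w
            # sorted() makes tie-breaking deterministic (a simplification).
            best = max(sorted(votes), key=votes.get)
            if best != label[node]:
                label[node] = best
                changed = True
        if not changed:
            break
    clusters = defaultdict(set)
    for node, cls in label.items():
        clusters[cls].add(node)
    return sorted(sorted(c) for c in clusters.values())


def clique(words, weight=1.0):
    """Fully connect a list of words with equal edge weights."""
    return [(a, b, weight) for a, b in combinations(words, 2)]


# Hypothetical neighbors of "paper": two dense groups, one weak cross link.
edges = (
    clique(["article", "journal", "magazine", "newspaper"])
    + clique(["cardboard", "glass", "metal", "plastic"])
    + [("journal", "cardboard", 0.1)]
)
senses = chinese_whispers(edges)
print(senses)
```

Each resulting cluster corresponds to one induced sense of the target word; the target word itself is left out of the graph so that it does not tie all of its neighbors together.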

Page 19: Biemann ibm cog_comp_jan2015_noanim

Features for Disambiguation

paper 0 (newspaper): read#VB#-dobj 45, reading#VBG#-dobj 45, write#VB#-dobj 38, read#VBD#-dobj 37, writing#VBG#-dobj 36, wrote#VBD#-dobj 34, original#JJ#amod 27, wrote#VBD#-prep_in 26, recent#JJ#amod 26, published#VBN#partmod 25, written#VBN#-dobj 23, published#VBN#-nsubjpass 20, published#VBD#-dobj 19, copy#NN#-prep_of 18, said#VBD#-prep_in 18, author#NN#-prep_of 17, pages#NNS#-prep_of 16, told#VBD#-dobj 15, buy#VB#-dobj 14, published#VBN#-prep_in 14, page#NN#-prep_of 14

paper 1 (material): piece#NN#-prep_of 21, pieces#NNS#-prep_of 17, made#VBN#-prep_from 13, bags#NNS#-nn 11, white#JJ#amod 9, paper#NN#-conj_and 9, glass#NN#-conj_and 9, products#NNS#-nn 9, industry#NN#-nn 8, plastic#NN#conj_and 8, plastic#NN#-conj_and 8, bits#NNS#-prep_of 8, bag#NN#-nn 8, plastic#NN#conj_or 8, sheet#NN#-prep_of 7, recycled#JJ#amod 7, tons#NNS#-prep_of 7, glass#NN#conj_and 7, buy#VB#-dobj 6, plates#NNS#-nn 6, pile#NN#-prep_of 6

These features are shared by paper and the members of the respective sense cluster. Disambiguation: find these features in the context, e.g. "I am reading an original paper on the paper."
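
One way to read "find features in context" is sketched below. The feature counts are a subset of those on the slide; scoring a sense by the summed counts of the context features it contains is an assumption of this sketch, not necessarily the scoring used in JoBimText, and `disambiguate` is a hypothetical helper name.

```python
# Subset of the feature counts listed on the slide for the two senses.
SENSE_FEATURES = {
    "paper 0 (newspaper)": {
        "read#VB#-dobj": 45,
        "reading#VBG#-dobj": 45,
        "write#VB#-dobj": 38,
        "original#JJ#amod": 27,
        "recent#JJ#amod": 26,
        "buy#VB#-dobj": 14,
    },
    "paper 1 (material)": {
        "piece#NN#-prep_of": 21,
        "made#VBN#-prep_from": 13,
        "white#JJ#amod": 9,
        "recycled#JJ#amod": 7,
        "buy#VB#-dobj": 6,
    },
}


def disambiguate(context_features):
    """Score each sense by the summed counts of matching context features.

    Summing counts is an assumed scoring rule for illustration.
    """
    scores = {
        sense: sum(features.get(f, 0) for f in context_features)
        for sense, features in SENSE_FEATURES.items()
    }
    best = max(scores, key=scores.get)
    return best, scores


# "I am reading an original paper": features observed around "paper".
first = disambiguate(["reading#VBG#-dobj", "original#JJ#amod"])

# Hypothetical second context, e.g. "a white piece of paper".
second = disambiguate(["piece#NN#-prep_of", "white#JJ#amod"])

print(first)
print(second)
```

A feature such as buy#VB#-dobj occurs with both senses, so it shifts the scores only slightly; features that are exclusive to one cluster carry the decision.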

Page 20: Biemann ibm cog_comp_jan2015_noanim

Paraphrasing with JoBimText

Page 22: Biemann ibm cog_comp_jan2015_noanim

JoBimText Model example “beetle”

S. Mitra, R. Mitra, M. Riedl, C. Biemann, A. Mukherjee, P. Goyal (2014): That’s sick dude!: Automatic identification of word sense change across different timescales. Proceedings of ACL-2014, Baltimore, MD, USA

http://www.thezooom.com/2013/01/10749/

Page 24: Biemann ibm cog_comp_jan2015_noanim

Outlook: From Similarities and Relations…

Cathy liked the blue dress very much.

She bought it for 15 Euros from the shop.

1: ENTITIES
§ Cathy: Pat, Brian, Kevin (FIRSTNAME)
§ blue: red, purple, green (COLOR)
§ dress: gown, skirt, blouse (CLOTHING)
§ Euros: currency, greenback, yen (MONEY)
§ shop: store, restaurant, boutique (SALESPOINT)

2: RELATIONS
§ HAS-PROPERTY (between CLOTHING and COLOR: blue dress)

Page 25: Biemann ibm cog_comp_jan2015_noanim

Sneak Preview: Induction of Relations

§ JoBimText model on pairs and paths between pairs

Page 26: Biemann ibm cog_comp_jan2015_noanim

… to Frames and Causality

Cathy liked the blue dress very much. (FIRSTNAME, COLOR, CLOTHING; HAS-PROPERTY)

She bought it for 15 Euros from the shop. (MONEY, SALESPOINT)

3: FRAMES
§ POSITIVE-OPINION-ABOUT: subj=FIRSTNAME obj=CLOTHING (here: Cathy, dress); further realizations: "FIRSTNAME adored CLOTHING", "FIRSTNAME found CLOTHING great"
§ VERKAUFSVORGANG (German: sales transaction): subj=AGENT obj=THING für=MONEY loc=SALESPOINT (für = for; here: Cathy, dress)

4: CAUSALITY

Page 27: Biemann ibm cog_comp_jan2015_noanim

Sneak Preview: Frame Induction

Page 28: Biemann ibm cog_comp_jan2015_noanim

Applications of JoBimText in IBM Watson

§ JoBimText informs relation extraction: significant improvements in the EMRA application, e.g. for finding drug prescriptions for diseases
§ JoBimText sense clusters are being used to inform term matching, e.g. when finding justifications for answers
§ JoBimText is one of the solutions for knowledge induction from text in new domains

Page 29: Biemann ibm cog_comp_jan2015_noanim

Conclusion

§ The role of Natural Language Processing in Cognitive Computing is two-fold:
  § the technology for natural interaction with the system
  § a technology that is itself to be framed in the cognitive paradigm

§ Adaptive Natural Language Processing
  § makes use of static AND dynamically generated resources
  § is driven by (text) data that defines its application domain
  § accounts for language evolution and new meanings by adaptation to the data
  § goes beyond NLP pipelines

Page 31: Biemann ibm cog_comp_jan2015_noanim

Thanks..

.. and now some (deep) QA!

www.jobimtext.org

Special Track: Semantic and Cognitive Computing

Page 33: Biemann ibm cog_comp_jan2015_noanim

The @-ing (‘holing’) operation: producing pairs of Jos and Bims

SENTENCE: I suffered from a cold and took aspirin.

STANFORD COLLAPSED DEPENDENCIES (http://nlp.stanford.edu:8080/parser/): nsubj(suffered, I); nsubj(took, I); root(ROOT, suffered); det(cold, a); prep_from(suffered, cold); conj_and(suffered, took); dobj(took, aspirin)

WORD-CONTEXT PAIRS:

Jo        Bim                       Count
suffered  nsubj(@@, I)              1
took      nsubj(@@, I)              1
cold      det(@@, a)                1
suffered  prep_from(@@, cold)       1
suffered  conj_and(@@, took)        1
took      dobj(@@, aspirin)         1
I         nsubj(suffered, @@)       1
I         nsubj(took, @@)           1
a         det(cold, @@)             1
cold      prep_from(suffered, @@)   1
took      conj_and(suffered, @@)    1
aspirin   dobj(took, @@)            1
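
The holing operation on this slide can be reproduced with a short sketch. It takes the dependency string exactly as listed, replaces one argument at a time with the hole symbol @@, and skips the artificial root relation, for which the slide lists no pairs. The function name `holing` and the regular expression are illustrative choices, not the JoBimText implementation.

```python
import re
from collections import Counter

DEPENDENCIES = (
    "nsubj(suffered, I); nsubj(took, I); root(ROOT, suffered); "
    "det(cold, a); prep_from(suffered, cold); "
    "conj_and(suffered, took); dobj(took, aspirin)"
)


def holing(dependencies, hole="@@"):
    """Turn rel(head, dep) triples into (Jo, Bim) pairs.

    Each dependency yields two pairs: one with the head as Jo and the
    head position holed out, one with the dependent as Jo.
    """
    pairs = []
    pattern = r"(\w+)\(([^,\s]+), ([^)\s]+)\)"
    for rel, head, dep in re.findall(pattern, dependencies):
        if rel == "root":
            continue  # the slide lists no pairs for the ROOT relation
        pairs.append((head, f"{rel}({hole}, {dep})"))
        pairs.append((dep, f"{rel}({head}, {hole})"))
    return pairs


pairs = holing(DEPENDENCIES)
counts = Counter(pairs)

for (jo, bim), count in counts.items():
    print(jo, bim, count)
```

Over a large corpus the same pairs recur, so the counts grow beyond 1 and become the input for the frequency and significance statistics.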

Page 34: Biemann ibm cog_comp_jan2015_noanim

Distributional Thesaurus (DT)

§ Computed from distributional similarity statistics
§ Entry for a target word consists of a ranked list of neighbors

meeting: meeting 288, meetings 102, hearing 89, session 68, conference 62, summit 51, forum 46, workshop 46, hearings 46, ceremony 45, sessions 41, briefing 40, event 40, convention 38, gathering 36, ...

articulate: articulate 89, explain 19, understand 17, communicate 17, defend 16, establish 15, deliver 14, evaluate 14, adjust 14, manage 13, speak 13, change 13, answer 13, maintain 13, ...

[Figure: First order: the words immaculate and perfect with context features amod(condition,@@), amod(timing,@@), nsubj(@@,hair), cop(@@,remains), amod(Church,@@). Second order: immaculate and perfect connected directly, labeled 3.]
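
The step from first-order to second-order relations can be sketched as counting shared context features. The feature names below come from the slide, but which word carries which feature is not recoverable from the transcript, so the assignment (chosen so that immaculate and perfect share three features) and the extra word "tidy" are assumptions. Unlike the entries above, this sketch does not list a word as its own neighbor.

```python
from collections import defaultdict
from itertools import combinations

# First order: words with their context features (assignment assumed).
WORD_FEATURES = {
    "immaculate": {"amod(condition,@@)", "amod(timing,@@)",
                   "nsubj(@@,hair)", "cop(@@,remains)"},
    "perfect": {"amod(condition,@@)", "amod(timing,@@)",
                "cop(@@,remains)", "amod(Church,@@)"},
    "tidy": {"amod(condition,@@)"},  # hypothetical third word
}


def distributional_thesaurus(word_features):
    """Second order: rank neighbors by the number of shared features."""
    by_feature = defaultdict(set)
    for word, features in word_features.items():
        for feature in features:
            by_feature[feature].add(word)
    shared = defaultdict(int)
    for words in by_feature.values():
        for a, b in combinations(sorted(words), 2):
            shared[(a, b)] += 1
            shared[(b, a)] += 1
    entries = defaultdict(list)
    for (a, b), n in shared.items():
        entries[a].append((b, n))
    for word in entries:
        # Highest count first, alphabetical order for equal counts.
        entries[word].sort(key=lambda item: (-item[1], item[0]))
    return dict(entries)


dt = distributional_thesaurus(WORD_FEATURES)
print(dt["immaculate"])
```

Grouping words per feature first avoids comparing every word with every other word, which is what makes the computation practical on large corpora.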

Page 35: Biemann ibm cog_comp_jan2015_noanim

Scaling Computation with MapReduce

Input text: Roomano is a hard Gouda-like cheese from Friesland in the northern part of The Netherlands. It pairs well with aged sherries ...

Steps (green on the slide) and parameters (blue on the slide):

§ Holing using gramm. relations

word      feature               t
hard#a    cheese#ADJ_MODn       17
cheese#n  Gouda-like#ADJ_MODa   5
cheese#n  hard#ADJ_MODa         17
pair#v    well#ADV_MODa         3
...

§ FreqSig (t: min freq, s: min sign)

word      feature               s
hard#a    cheese#ADJ_MODn       15.8
cheese#n  Gouda-like#ADJ_MODa   7.6
cheese#n  hard#ADJ_MODa         0.4
...

§ AggrPerFt

feature           words
cheese#ADJ_MODn   hard#a, yellow#a, French#a
hard#ADJ_MODa     cheese#n, stone#n
...

§ SimCounts (w: weighting for # words/ feature)

word      word      w.sum
hard#a    yellow#a  0.234
yellow#a  hard#a    0.234
cheese#n  stone#n   3.14
...

§ PruneGraph (p: max number of features per word ; s)

§ Convert: sum threshold

(like data below)

ibm: i.b.m. 164, intel 154, hewlett-packard 151, dell 141, cisco 134, microsoft 125, hp 124
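
The steps named on this slide can be traced in a single-machine sketch, given under several assumptions: the slide does not name the significance measure, so Lexicographer's Mutual Information (LMI) is swapped in; every shared feature adds 1 to a word pair, so the w weighting is not modeled; PruneGraph is applied before AggrPerFt; and all counts except the 17 for hard#a with cheese#ADJ_MODn are invented. `build_dt` is a hypothetical name, and the real system distributes these steps over MapReduce jobs.

```python
import math
from collections import Counter, defaultdict
from itertools import permutations


def build_dt(pair_counts, t=2, s=0.0, p=1000):
    """pair_counts maps (word, feature) to its frequency after holing."""
    n = sum(pair_counts.values())
    word_n, feat_n = Counter(), Counter()
    for (word, feat), c in pair_counts.items():
        word_n[word] += c
        feat_n[feat] += c
    # FreqSig: drop rare pairs (t) and insignificant pairs (s).
    # LMI is an assumed significance measure.
    sig = {}
    for (word, feat), c in pair_counts.items():
        if c < t:
            continue
        lmi = c * math.log2(n * c / (word_n[word] * feat_n[feat]))
        if lmi >= s:
            sig[(word, feat)] = lmi
    # PruneGraph: keep the p most significant features per word.
    per_word = defaultdict(list)
    for (word, feat), score in sig.items():
        per_word[word].append((score, feat))
    kept = {
        word: [feat for _, feat in sorted(scored, reverse=True)[:p]]
        for word, scored in per_word.items()
    }
    # AggrPerFt: collect the words that share each feature.
    per_feat = defaultdict(set)
    for word, feats in kept.items():
        for feat in feats:
            per_feat[feat].add(word)
    # SimCounts: each shared feature adds 1 to a word pair (uniform weight).
    sim = Counter()
    for words in per_feat.values():
        for a, b in permutations(sorted(words), 2):
            sim[(a, b)] += 1
    return sim


# Invented counts, except 17 for (hard#a, cheese#ADJ_MODn).
pair_counts = Counter({
    ("hard#a", "cheese#ADJ_MODn"): 17,
    ("yellow#a", "cheese#ADJ_MODn"): 5,
    ("hard#a", "stone#ADJ_MODn"): 6,
    ("yellow#a", "stone#ADJ_MODn"): 1,   # removed by t
    ("French#a", "wine#ADJ_MODn"): 9,
    ("yellow#a", "wine#ADJ_MODn"): 2,    # removed by s (negative LMI)
})
sim = build_dt(pair_counts)
print(dict(sim))
```

In this toy run only hard#a and yellow#a end up as neighbors: the shared stone and wine features of yellow#a are filtered out by the frequency and significance thresholds before similarity is counted.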