Representing Meaning in Unsupervised Word Sense Disambiguation
Bridget T. McInnes
5 September 2008
University of Minnesota Twin Cities
What is WSD?
The culture count doubled.
Culture
Sense Inventory: Laboratory Culture, Anthropological Culture
Approaches to WSD
Supervised
Advantages: obtains a high accuracy
Disadvantages: manually annotated training data is required for each word that needs to be disambiguated, so it cannot scale
Unsupervised
Advantages: does not require manually annotated training data
Disadvantages: generally does not obtain as high an accuracy as supervised approaches
Unsupervised Approaches
Similarity and Relatedness Based: Patwardhan, Banerjee and Pedersen 2005; Pedersen, et al 2006; Budanitsky and Hirst 2006
Vector Based: Mohammad and Hirst, 2006; Patwardhan, 2003; Pedersen, et al 2006; Humphrey, et al 2006
Clustering: Pedersen and Bruce, 1997; Schütze, 1998; Pedersen and Bruce, 1998; Purandare and Pedersen, 2004; Kulkarni and Pedersen, 2005
Road Map
Previous Approaches
Our vector approach
Future Work
Previous Approaches
Similarity and Relatedness Based
SenseRelate (Banerjee and Pedersen, 2003)
Vector-based
Semantic Type Indexing (Humphrey et al 2006)
Clustering
SenseClusters (Kulkarni and Pedersen, 2005)
SenseRelate (Banerjee and Pedersen, 2003)
Target Word: Transport
Concept 1: Biological Transport (C0005528)
Concept 2: Patient Transport (C0150390)
Example: Transport of glutathione S-linked conjugates.
Context words: glutathione S-linked conjugates
Context concepts: C0017817, C0522529, C0301869
C0005528 = SS + SS + SS = Total SS for Concept 1
C0150390 = SS + SS + SS = Total SS for Concept 2
Each SS is the semantic similarity score between the candidate concept and one of the context concepts.
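The scoring on this slide can be sketched in Python as follows. This is a minimal illustration, not the SenseRelate implementation: the similarity values in `TOY_SS` are invented for the example, and `total_ss` and `disambiguate` are hypothetical helper names. Each candidate concept receives the sum of its similarity scores with the context concepts, and the candidate with the largest total is chosen.

```python
# Hypothetical similarity scores between candidate concepts and context concepts.
# The numbers are made up for illustration; a real system would obtain them
# from a similarity or relatedness measure.
TOY_SS = {
    ("C0005528", "C0017817"): 0.6,
    ("C0005528", "C0522529"): 0.3,
    ("C0005528", "C0301869"): 0.5,
    ("C0150390", "C0017817"): 0.1,
    ("C0150390", "C0522529"): 0.2,
    ("C0150390", "C0301869"): 0.1,
}


def total_ss(candidate, context_concepts, ss=TOY_SS):
    """Sum the similarity scores between one candidate and every context concept."""
    return sum(ss.get((candidate, c), 0.0) for c in context_concepts)


def disambiguate(candidates, context_concepts, ss=TOY_SS):
    """Return the candidate concept with the highest total similarity score."""
    return max(candidates, key=lambda cand: total_ss(cand, context_concepts, ss))


context = ["C0017817", "C0522529", "C0301869"]
best = disambiguate(["C0005528", "C0150390"], context)
print(best)
```

With these toy scores the Biological Transport concept wins; with a different similarity measure or context the outcome could of course differ.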
Semantic Type Indexing (STI) for WSD (Humphrey et al, 2006)
Target Word: Transport
Concept 1: Biological Transport, semantic type: Cell Function
Concept 2: Patient Transport, semantic type: Health Care Activity
Example: Transport of glutathione S-linked conjugates.
TW: JDI vector for the target word
CV1, CV2: JDI vectors for Concept 1 and Concept 2
Cosine 1: cosine between the Target Word Vector and the Concept 1 Vector
Cosine 2: cosine between the Target Word Vector and the Concept 2 Vector
STI - Target Word Vectors
Transport of glutathione S-linked conjugates.
Contains the words surrounding the ambiguous word
STI - Concept Vectors
The concept vectors are created based on their semantic type(s)
Transport:
C0005528: Biological Transport (semantic type: Cell Function), built from one-word terms in the Metathesaurus associated with Cell Function
C0150390: Patient Transport (semantic type: Health Care Activity), built from one-word terms in the Metathesaurus associated with Health Care Activity
SenseClusters (SC) (Kulkarni and Pedersen, 2005)
Target Word: Transport
Concept 1: Biological Transport
Concept 2: Patient Transport
Example: Transport of glutathione S-linked conjugates.
Instances of the target word (Instance 1, Instance 2, …, Instance 13, …) are grouped into Concept 1 and Concept 2
The Target Word Vector is compared with the Concept 1 Vector (Cosine 1) and the Concept 2 Vector (Cosine 2)
SC - Vectors
Contain the words surrounding the ambiguous word
Created using:
First order co-occurrences
Second order co-occurrences
First Order Co-occurrence Vectors
Context: glutathione S-linked conjugates

| | Word 1 | Word 2 | … | Word N |
|---|---|---|---|---|
| glutathione | 50 | 6 | … | 5 |
| S-linked | 5 | 6 | … | 1 |
| conjugates | 5 | 0 | … | 15 |
| Target Vector | 20 | 4 | … | 7 |
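The target vector on this slide equals the column-wise average of the three context-word rows (60/3, 12/3, 21/3). A small Python sketch of that computation, using only the three displayed columns; the function name `average_vectors` is ours, not part of any package named in the talk:

```python
def average_vectors(vectors):
    """Column-wise average of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


# Rows for the context words; columns are Word 1, Word 2, Word N (slide values).
first_order = {
    "glutathione": [50, 6, 5],
    "S-linked": [5, 6, 1],
    "conjugates": [5, 0, 15],
}

target_vector = average_vectors(list(first_order.values()))
print(target_vector)  # [20.0, 4.0, 7.0]
```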
Second Order Co-occurrence Vectors
[Figure: word-by-word co-occurrence matrix with rows and columns Word 1, Word 2, …, Word N]
1st order glutathione: 10 30 0
2nd order glutathione: 0 2 2 …
Second Order Co-occurrence Vectors
Context: glutathione S-linked conjugates

| | Word 1 | Word 2 | … | Word N |
|---|---|---|---|---|
| glutathione | 10 | 30 | … | 2 |
| S-linked | 0 | 6 | … | 0 |
| conjugates | 5 | 0 | … | 13 |
| Target Vector | 5 | 13 | … | 5 |
Our unsupervised approach
CuiTools Approach
Our approach uses a general vector approach with SenseClusters vectors
Target Word: Transport
Concept 1: Biological Transport (C0005528)
Concept 2: Patient Transport (C0150390)
Example: Transport of glutathione S-linked conjugates.
The Target Word Vector is compared with the Concept 1 Vector (Cosine 1) and the Concept 2 Vector (Cosine 2)
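The cosine comparison in this diagram can be sketched as below. This is an illustrative outline rather than CuiTools code: the vectors are toy values invented for the example, and `cosine` and `choose_concept` are hypothetical helper names. The sketch assumes the concept with the highest cosine is selected.

```python
import math


def cosine(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def choose_concept(target_vector, concept_vectors):
    """Return the concept ID whose vector has the highest cosine with the target vector."""
    return max(concept_vectors, key=lambda cui: cosine(target_vector, concept_vectors[cui]))


# Toy vectors, invented for illustration only.
target = [20, 4, 7]
concepts = {
    "C0005528": [18, 5, 6],  # Biological Transport
    "C0150390": [1, 9, 0],   # Patient Transport
}
print(choose_concept(target, concepts))
```

Cosine ignores vector length, so a concept vector built from far more text than the target context can still be compared with it directly.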
CuiTools Approach
We explore using
First-order co-occurrence vectors
Second-order co-occurrence vectors
Target Word Vector
Contains the words surrounding the ambiguous word
Transport of glutathione S-linked conjugates.
CuiTools - Concept Vectors
How to create a vector that can represent the meaning of a concept for word sense disambiguation?
To answer this question
We explore information in the UMLS that can be used to represent the meaning of a concept.
CuiTools - Concept Vectors
CUI definition
Adjustment
Individual Adjustment: Conceptually broad term referring to a state of harmony between internal needs and external …
Adjustment Action: The act of making necessary corrections or modifications …
Psychological Adjustment: A state of harmony between internal needs and external demands and the processes used …
Blood Pressure
Blood Pressure: Force exerted by the blood on the walls of the arteries and other vessels.
Blood Pressure Determination: Actions performed to measure the diastolic and systolic pressure of the blood.
Arterial Pressure: NO DEFINITION
CuiTools - Concept Vectors
CUI definition
If the CUI definition doesn't exist, use: PARent definition, Semantic Type definition
SYNonymous terms
For example: C0430400: Laboratory Culture: laboratory culture, microbial culture, sample culture
SIBlings
For example: C0010453: Anthropological Culture: archeology, family, social groups
TOP 50 most frequent words surrounding the terms associated with the CUI
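One way to collect the most frequent words surrounding a concept's terms is sketched below. The slides do not say how "surrounding" is defined, so the window size, the tokenizer, and the restriction to single-word terms are all assumptions made for this example; `top_context_words` is a hypothetical helper, not a CuiTools function.

```python
import re
from collections import Counter


def top_context_words(sentences, terms, k=50, window=2):
    """Return the k most frequent words found within `window` tokens of any term.

    Assumes single-word terms and a simple lowercase tokenizer.
    """
    terms = {t.lower() for t in terms}
    counts = Counter()
    for sentence in sentences:
        tokens = re.findall(r"[a-z0-9-]+", sentence.lower())
        for i, token in enumerate(tokens):
            if token in terms:
                left = tokens[max(0, i - window):i]
                right = tokens[i + 1:i + 1 + window]
                counts.update(w for w in left + right if w not in terms)
    return [word for word, _ in counts.most_common(k)]


# Tiny invented corpus, for illustration only.
corpus = [
    "The culture count doubled overnight.",
    "A blood culture count was ordered.",
    "The culture was incubated.",
]
example = top_context_words(corpus, ["culture"], k=2, window=1)
print(example)
```

In practice a stop list would probably be applied first, since function words such as "the" dominate raw counts.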
Dataset
National Library of Medicine's Word Sense Disambiguation (NLM-WSD) Dataset
50 words from the 1998 MEDLINE abstracts
100 instances for each of the 50 words
The target word was manually assigned a UMLS concept or None
All instances of None were removed
Average number of concepts per ambiguous word is 2.26
Data subsets
Humphrey subset
Humphrey, et al 2006
45 out of the 50 words in NLM-WSD
5 words were excluded because at least two of the possible concepts associated with these words have the same semantic type
Instances that were assigned “None” were removed
Training Data
The training data used to create the 1st and 2nd order co-occurrence vectors is the 2005 Medline baseline
Results
Results of Co-occurrence Vectors
Results of the Representations of Meaning
Results of the Representations of Meaning - CUI
Adding the parent and semantic type definitions decreased the accuracy by 6 and 7 percentage points, respectively
Parent and semantic type definitions are too broad to define the meaning of a concept
Results of the Representations of Meaning - SYN
Using the synonymous terms associated with a concept is too narrow to represent the meaning.
Adjustment Action:
Adjustment – action
Adjustments
Adjustment, NOS
Adjustment – action qualifier value
Adjustment – action procedure
Results of the Representations of Meaning - SIB
Using the terms associated with the siblings of a concept is too broad to represent the meaning.
Adjustment Action: Biopsy, Cauterisation, Cautery, Cold Therapy, Desiccation, Drainage procedure, Electrolysis
Supervised versus Unsupervised
Systems compared: Joshi et al 04, McInnes et al 07, Stevenson et al 08, SenseClusters, Humphrey et al 06, CuiTools
To recap
How to create a vector that can represent the meaning of a concept for word sense disambiguation?
Conclusions
To answer this we explored information in the UMLS that could be used to represent the meaning of a concept
Finding a context to represent the meaning of a concept is difficult
We found that using the top 50 most frequent words surrounding the terms associated with the concept best represented the concept for the task of word sense disambiguation
Take away message
Unsupervised approaches are showing promise
Their disadvantage, lower disambiguation accuracy than supervised approaches, is slowly disappearing
But we are not there yet … so there is more work to do
Future Work
UMLS-Similarity package
Using the Semantic Similarity scores rather than frequency in the 1st order co-occurrence vectors
First Order Co-occurrence Vectors
Context: glutathione S-linked conjugates
Cell values: FREQ (glutathione, word N)

| | Word 1 | Word 2 | … | Word N |
|---|---|---|---|---|
| glutathione | 50 | 6 | … | 5 |
| S-linked | 5 | 6 | … | 1 |
| conjugates | 5 | 0 | … | 15 |
| Target Vector (Average) | 20 | 4 | … | 7 |
First Order Co-occurrence Vectors
Context: glutathione S-linked conjugates
Cell values: Similarity (glutathione, word N)

| | Word 1 | Word 2 | … | Word N |
|---|---|---|---|---|
| glutathione | .5 | .6 | … | .5 |
| S-linked | .5 | .6 | … | .1 |
| conjugates | .5 | 0 | … | .15 |
| Target Vector (Average) | .5 | .4 | … | .25 |
First Order Co-occurrence Vectors
Context: glutathione S-linked conjugates
Cell values: Similarity (glutathione, word N)

| | Word 1 | Word 2 | … | Word N |
|---|---|---|---|---|
| glutathione | .5 | .6 | … | .5 |
| S-linked | .5 | .6 | … | .1 |
| conjugates | .5 | 0 | … | .15 |
| Target Vector (Sum, like SenseRelate) | 1.5 | 1.2 | … | .75 |
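The two ways of combining similarity rows into a target vector, by sum or by average, can be compared with a short Python sketch. It uses the three displayed columns of the similarity rows for the context words; `combine` is a hypothetical helper name.

```python
def combine(vectors, how="sum"):
    """Combine equal-length vectors column by column, by sum or by average."""
    totals = [sum(column) for column in zip(*vectors)]
    if how == "sum":
        return totals
    if how == "average":
        return [t / len(vectors) for t in totals]
    raise ValueError("how must be 'sum' or 'average'")


rows = [
    [0.5, 0.6, 0.5],   # glutathione
    [0.5, 0.6, 0.1],   # S-linked
    [0.5, 0.0, 0.15],  # conjugates
]
summed = combine(rows, "sum")
averaged = combine(rows, "average")
print([round(x, 2) for x in summed])    # [1.5, 1.2, 0.75]
print([round(x, 2) for x in averaged])  # [0.5, 0.4, 0.25]
```

Because cosine is insensitive to vector length, summing and averaging the same rows give target vectors that point in the same direction; the choice matters only if the magnitudes are used elsewhere.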
First Order Co-occurrences

| | Word 1 | Word 2 | … | Word N |
|---|---|---|---|---|
| glutathione | .5 | .6 | … | .5 |

Word N (C0005528)
Similarity with C0005528: C0000000 = .3, C0000001 = .2
Similarity (glutathione, Word N) = .3 + .2 = .5
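A sketch of how a word-to-word similarity cell might be filled from concept-level scores is given below. It assumes that C0000000 and C0000001 are placeholder CUIs standing for the concepts of glutathione and that the scores of all concept pairs are summed; neither point is stated explicitly on the slide. `word_similarity` is a hypothetical helper, not a UMLS-Similarity function.

```python
# Concept-to-concept similarity scores (values taken from the slide;
# the CUIs C0000000 and C0000001 are placeholders).
CONCEPT_SIM = {
    ("C0000000", "C0005528"): 0.3,
    ("C0000001", "C0005528"): 0.2,
}


def word_similarity(concepts_a, concepts_b, sim=CONCEPT_SIM):
    """Sum the similarity scores over every pair of concepts from the two words."""
    return sum(sim.get((a, b), 0.0) for a in concepts_a for b in concepts_b)


score = word_similarity(["C0000000", "C0000001"], ["C0005528"])
print(round(score, 2))  # 0.5
```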
Future Work
UMLS-Similarity package
Creating 2nd order co-occurrence matrices based on highly similar concepts rather than words in text
Second Order Co-occurrence Vectors
[Figure: co-occurrence matrix with rows and columns Word 1, Word 2, …, Word N]
Words come from training corpus
Frequency counts
Second Order Co-occurrence Vectors
[Figure: matrix with rows and columns CUI 1, CUI 2, …, CUI N]
Use concepts from the UMLS
Similarity scores
Future Work
UMLS-Similarity package
Creating 2nd order co-occurrence matrices based on highly similar concepts rather than co-occurrences in text
Use terms associated with CUIs that have a high similarity score with the possible concept to represent the meaning of the concept
Using the Semantic Similarity scores rather than frequency in the 1st order co-occurrence vectors
Similarity Scores
What is potentially gained by using the similarity or relatedness measures?
May catch words/concepts that are similar but do not frequently occur together in the training data
culture and ethnology
Ethnology is a branch of anthropology
ethnology appears with culture only five times in the training data
The concepts Anthropological Culture and Ethnology would have a high similarity score, whereas Laboratory Culture and Ethnology would not
Software
CuiTools version 0.19
http://cuitools.sourceforge.net
Thank you
Lan Aronson, François Lang, Jim Mork, Aurélie Névéol, Will Rogers
Olivier Bodenreider, Allen Browne, May Chey, Dina Demner-Fushman, Guy Divita, Kin Wah Fung, Susanne Humphrey, Dwayne McCully, Tom Rindflesch, Suresh Srinivasan