What the brain can tell us about language: evidence from machine learning, WordNet, and latent semantic analysis
Colleen Crangle (2), Marcos Perreau-Guimaraes (1), Patrick Suppes (1)
(1) Center for the Study of Language and Information, Stanford University, California, USA
(2) US-UK Fulbright Fellow 2013, Department of Computing and Communications, Lancaster University, UK; Converspeech LLC, Palo Alto, California, USA
Earlier version presented at the 18th Annual Cognitive Neuroscience Meeting, April 2-5, 2011, San Francisco, California
© Colleen E Crangle 2013
Electroencephalography (EEG) measures the electric potential generated by the synchronous activity of thousands or millions of neurons that have similar spatial orientation, allowing brain wave activity to be detected. EEG has very high temporal resolution, on the order of milliseconds, with a sampling rate of 500 to 1000 Hz common in neuro-cognitive studies; that is, 500 to 1000 readings a second.
[Figure: fifteen seconds of EEG data. From http://sccn.ucsd.edu]
15-22-128 channels of information
Simple Experiment
one, two, three, four, five, six, seven, eight, nine, ten, left, right, yes, no
Repeated visually or auditorily many times while EEG recordings are made...
Data
Extract segments of brain waveforms corresponding to the time at which each individual word was heard (or read).
Question
Can we tell which brain data sample is associated with which word? That is, can we train a classifier to correctly classify the brain data samples?
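The segment-extraction step can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function name, the single-channel signal, the onset times, and the 0.5-second window are all assumptions made for the example.

```python
def extract_epochs(signal, onsets_s, sampling_rate_hz, window_s):
    """Cut one fixed-length segment out of `signal` for each stimulus onset."""
    window_len = int(round(window_s * sampling_rate_hz))
    epochs = []
    for onset in onsets_s:
        start = int(round(onset * sampling_rate_hz))
        end = start + window_len
        # Skip windows that would run past the end of the recording.
        if start >= 0 and end <= len(signal):
            epochs.append(signal[start:end])
    return epochs


# Hypothetical single-channel recording: 3 seconds sampled at 1000 Hz.
sampling_rate_hz = 1000
signal = [0.0] * (3 * sampling_rate_hz)
onsets_s = [0.5, 1.5, 2.8]  # made-up word onset times, in seconds
epochs = extract_epochs(signal, onsets_s, sampling_rate_hz, window_s=0.5)
print(len(epochs), len(epochs[0]))
```

Each returned segment is one "brain data sample" in the sense used here; the last onset is dropped because its window does not fit inside the recording.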
Can we train a classifier of EEG data so that we can predict which word / sentence / word within a sentence the participant is seeing or hearing?
1997: Brain-wave recognition of words
1998: Brain-wave recognition of sentences
Can we train the classifier on some participants and use it to make predictions for other participants?
1999: Invariance between subjects of brain wave representations of language
Can we train the classifier using visually presented words and use that classifier to make predictions for words presented auditorily (and vice versa)?
2004: Classification of individual trials based on the best independent component of EEG-recorded sentences
2005: Recognition of Words from the EEG Laplacian
2006: Multichannel classification of single EEG trials with independent component analysis
2007: Single-trial classification of MEG recordings
Can we train the classifier using words and use that classifier to make predictions for pictures depicting what the word refers to (and vice versa)?
1999: Invariance of brain-wave representations of simple visual images and their names
Answering yes to these questions establishes that we are recognizing the meaning of the word, or the idea or concept behind the word, and not, for instance, the sound of the word, its orthography, or its idiosyncratic use by one person.
EXPERIMENTAL SETUP for recognizing words within sentences
A computer presented 48 spoken sentences to each of 9 participants in 10 randomized blocks, with all 48 sentences in each block. So there were 480 trials for each participant.
The sentences were about the geography of Europe. Half were true, half false; half positive, half negative.
The capital of Italy is Paris. (F)
Paris is not east of Berlin. (T)
Spain is west of Russia. (T)
Participants were asked to determine the truth or falsity of each statement while EEG recordings were made.
X ∈ {Berlin, London, Moscow, Paris, Rome, Warsaw, Madrid, Vienna, Athens}
Y ∈ {France, Germany, Italy, Poland, Russia, Austria, Greece, Spain}
W ∈ {the capital, the largest city}
Z ∈ {north, south, east, west}
W of Y is [not] X; X is [not] W of Y
X is [not] Z of X; Y is [not] Z of Y
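The sentence templates above can be expanded mechanically. The sketch below enumerates every sentence the templates allow; it assumes that a city or country is never compared with itself, which the slide does not state, and it does not reproduce the particular 48 sentences chosen for the experiment.

```python
from itertools import permutations, product

CITIES = ["Berlin", "London", "Moscow", "Paris", "Rome",
          "Warsaw", "Madrid", "Vienna", "Athens"]
COUNTRIES = ["France", "Germany", "Italy", "Poland",
             "Russia", "Austria", "Greece", "Spain"]
ROLES = ["the capital", "the largest city"]
DIRECTIONS = ["north", "south", "east", "west"]


def generate_sentences():
    """Expand the four templates, with and without negation."""
    sentences = []
    for neg in ("", "not "):
        for role, country, city in product(ROLES, COUNTRIES, CITIES):
            sentences.append(f"{role} of {country} is {neg}{city}")
            sentences.append(f"{city} is {neg}{role} of {country}")
        # Assumption: the two places in a direction sentence are distinct.
        for group in (CITIES, COUNTRIES):
            for a, b in permutations(group, 2):
                for direction in DIRECTIONS:
                    sentences.append(f"{a} is {neg}{direction} of {b}")
    # Capitalize the first letter so "the capital of ..." reads as a sentence.
    return [s[0].upper() + s[1:] for s in sentences]


sentences = generate_sentences()
print(len(sentences))
print("Paris is not east of Berlin" in sentences)
```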
[Figure. From http://www2.le.ac.uk]
Paris is north of Berlin
The capital of Germany is Berlin
Spain is east of France
London is not the capital of France
Vienna is east of Moscow
The largest city of Poland is Athens
[Figure: fifteen seconds of EEG data. From http://sccn.ucsd.edu]
LANGUAGE | BRAIN
What our work does...
Compares the brain data and the language data and finds structural similarities between them.
HOW DO WE REPRESENT THE EEG DATA?
HOW DO WE REPRESENT THE LANGUAGE DATA?
HOW DO WE COMPARE THE TWO?
Berlin London Moscow Paris
Rome Warsaw Madrid Vienna
Athens France Germany Italy
Poland Russia Austria Greece
Spain north south east west
Paris is north of Berlin
Spain is east of France
The capital of Germany is Berlin
Paris
east of
Germany
Machine learning approach to the study of brain and language
For the 10 geography words (London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia) we have 640 EEG data samples for each participant.
We want to classify these 640 samples into the correct 10 classes.
Use 580 brain samples and their associated words to train the classifier.
Test the remaining 60 samples.
Do this many times, each time using a different set of training samples.
Find the average classification rate; make sure it is statistically significant.
Obtained classification rates in the range 25 to 29, with a mean classification rate of around 24.5% (p < 10E−10). Significantly higher than chance (10%).
We used a 5-fold linear discriminant model, with principal component analysis for blind source separation, to classify the segments of EEG data obtained from the individual trials.
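One way to check that a classification rate is above the 10% chance level is an exact one-sided binomial test. The slides do not say which test was used, so the sketch below is an illustrative choice, and the count of 157 correct out of 640 (about 24.5%) is a hypothetical figure picked to match the reported mean rate.

```python
from math import comb


def binomial_tail(successes, trials, chance):
    """P(X >= successes) when X ~ Binomial(trials, chance)."""
    return sum(
        comb(trials, k) * chance**k * (1 - chance) ** (trials - k)
        for k in range(successes, trials + 1)
    )


# Hypothetical outcome: 157 of 640 samples classified correctly,
# tested against the 1-in-10 chance level for 10 classes.
p_value = binomial_tail(157, 640, 0.1)
print(p_value < 1e-10)
```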
Beyond machine learning...
THEN look at the MIS-CLASSIFICATIONS and build a CONFUSION MATRIX
Classify 640 brain data samples s1, s2, ..., s640 into 10 classes ω1, ω2, ..., ω10 of the finite set A.
M = (m_iq) is the confusion matrix for a given classification, where m_iq is the number of test samples from class ωi classified as belonging to class ωq.
           London  Moscow   Paris   north   south    east    west Germany  Poland  Russia
London          8      14      11       7       6       3       3      10       8      10
Moscow          8      24      14       6       2       6       4       4       7      15
Paris           6      18      12       4       3       5       8       6      11       7
north           4       2       5      11       9       7      10       1       8       3
south           1       4       3      14      14       9      11       4       7       3
east            4       3       5       9      12      12       7       1       4       3
west            4       3       2      12      13      11      10       2       8       5
Germany         2       2       3       2       2       1       0       9      11       8
Poland          2       3       4       0       4       1       4       9       9       4
Russia          7       7       5       1       5       1       1       8       4      11
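A confusion matrix of this kind can be tallied directly from the true and predicted labels of the test samples. The sketch below is a minimal illustration; the three-word label lists are made up for the example.

```python
def confusion_matrix(true_labels, predicted_labels, classes):
    """m[i][q] counts test samples of class i that were classified as class q."""
    index = {label: position for position, label in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for truth, predicted in zip(true_labels, predicted_labels):
        matrix[index[truth]][index[predicted]] += 1
    return matrix


# Made-up test results for three of the geography words.
classes = ["London", "Paris", "north"]
true_labels = ["London", "London", "Paris", "north", "north", "Paris"]
predicted = ["London", "Paris", "Paris", "north", "London", "London"]
m = confusion_matrix(true_labels, predicted, classes)
for label, row in zip(classes, m):
    print(label, row)
```

Rows are the true classes and columns the predicted classes, matching the layout of the matrix above.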
The relative frequencies $m_{iq} / \sum_{q'} m_{iq'}$ are an N-by-N estimate for the conditional probability densities, designated by the matrix $P = (p_{iq})$, that a randomly chosen test sample from class ωi will be classified as belonging to class ωq.
Conditional probability density estimates from the classification of brain wave data for London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia
Rows: ωi. Columns: ωq (predicted, classified as).
           London  Moscow   Paris   north   south    east    west Germany  Poland  Russia
London       0.25   0.163   0.138   0.075   0.038    0.05   0.063   0.088   0.075   0.063
Moscow      0.144   0.333   0.133   0.033   0.011   0.056   0.056   0.089   0.067   0.078
Paris       0.175   0.188   0.125   0.038   0.038   0.038    0.05     0.1   0.138   0.113
north       0.067   0.017   0.017   0.283   0.167    0.15     0.1     0.1   0.067   0.033
south       0.014   0.014   0.071   0.114   0.271   0.171   0.157   0.029   0.086   0.071
east         0.05       0       0   0.133    0.15   0.383    0.15    0.05       0   0.083
west        0.057   0.043       0   0.171   0.114   0.143   0.229   0.086     0.1   0.057
Germany     0.075   0.175     0.1    0.05   0.125   0.025    0.05   0.275   0.025     0.1
Poland       0.15     0.1   0.125    0.05    0.05       0   0.025    0.15    0.25     0.1
Russia       0.16    0.04    0.08    0.04     0.1    0.02    0.12     0.1    0.08    0.26
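Turning counts into conditional probability estimates amounts to dividing each row of the confusion matrix by its row total. The sketch below does this for the London row of the count matrix shown earlier; the function name is an assumption, and the counts and the probability table in the slides need not come from the same classification run.

```python
def row_normalize(confusion):
    """Divide each row by its total so that every non-empty row sums to 1."""
    probabilities = []
    for row in confusion:
        total = sum(row)
        # An all-zero row has no test samples; leave it as zeros.
        probabilities.append([count / total if total else 0.0 for count in row])
    return probabilities


# London row of the confusion matrix of counts (80 test samples in total).
london_counts = [8, 14, 11, 7, 6, 3, 3, 10, 8, 10]
london_probabilities = row_normalize([london_counts])[0]
print([round(p, 3) for p in london_probabilities])
```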
[Figure: hierarchical cluster tree (similarity tree) computed from the conditional probability density estimates for the classification of 640 brain wave samples for London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia. Leaf order: London, Paris, Moscow, Germany, Poland, Russia, north, south, west, east. Axis ticks: 0.2 to 0.6.]
LANGUAGE
WordNet: a lexical database of English organized around sets of cognitive synonyms, restricted to the senses that are relevant to the geography sentences.
Latent Semantic Analysis (LSA): a statistical method to extract measures of word similarity from selected sets of documents such as novels, newspaper articles, textbooks.
BRAIN
WORDNET
WordNet is a human-annotated lexical database of English in which nouns, verbs, adjectives, and adverbs are grouped into sets of cognitive synonyms (called synsets), each synset expressing a distinct concept.
See Miller (1995), Fellbaum (1998), and http://wordnet.princeton.edu
The synsets are related to each other primarily through the hypernymy and hyponymy relations for nouns. Other relations in WordNet are part-whole (holonym), member of (meronym), has instance, and so on.
Hyponymy, often referred to as the is-a relation in computational discussions, is defined as follows: a concept represented by a lexical item Li is said to be a hyponym of the concept represented by a lexical item Lk if native speakers of English accept sentences of the form "An Li is a kind of Lk". Conversely, Lk is the hypernym of Li. A hypernym is therefore a more general concept and a hyponym a more specific concept.
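The is-a relation can be pictured as a chain of parent links. The sketch below uses a small hand-made hierarchy, which is hypothetical and not taken from the WordNet database, to show how "Li is a kind of Lk" reduces to following hypernym links upward.

```python
# Hypothetical toy hierarchy: each concept maps to its direct hypernym.
HYPERNYM = {
    "national capital": "city",
    "city": "municipality",
    "municipality": "region",
    "region": "location",
    "location": None,  # top of this toy hierarchy
}


def hypernym_chain(concept):
    """List every more general concept above `concept`, nearest first."""
    chain = []
    parent = HYPERNYM[concept]
    while parent is not None:
        chain.append(parent)
        parent = HYPERNYM[parent]
    return chain


def is_kind_of(specific, general):
    """True if `specific` is a (possibly indirect) hyponym of `general`."""
    return general in hypernym_chain(specific)


print(hypernym_chain("national capital"))
print(is_kind_of("national capital", "region"))
```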
WORDNET
Multiple word senses
Organized into sets of cognitive synonyms called "synsets"
(n) Paris#1, City of Light#1, French capital#1, capital of France#1 (the capital and largest city of France and international center of culture and commerce)
(n) Paris#2, genus Paris#1 (sometimes placed in subfamily Trilliaceae)
(n) Paris#3 ((Greek mythology) the prince of Troy who abducted Helen from her husband Menelaus and provoked the Trojan War)
(n) Paris#4 (a town in northeastern Texas)
(n) city#1, metropolis#1, urban center#1 (a large and densely populated urban area; may include several independent administrative districts) [direct hyponym / full hyponym / part meronym / has instance]
(n) city#2 (an incorporated administrative district established by state charter)
(n) city#3, metropolis#2 (people living in a large densely populated municipality)
(n) east#1, due east#1, eastward#1, E#3 (the cardinal compass point that is at 90 degrees)
(n) East#2, Orient#1 (the countries of Asia)
(n) East#3, eastern United States#1 (the region of the United States lying to the north of the Ohio River and to the east of the Mississippi River)
(n) east#4 (the direction corresponding to the eastward cardinal compass point)
(n) east#5 (a location in the eastern part of a country, region, or city)
LATENT SEMANTIC ANALYSIS (LSA)
LSA is a statistical technique for extracting, from large collections of documents, a measure of how similar two words are to each other in terms of patterns of their co-occurrences within those documents. See Deerwester et al. 1990, Landauer and Dumais 1997, Landauer et al. 1998.
The underlying idea is that if, for each word, you take into account all the contexts in which it does and does not appear, you get for all the words a set of mutual constraints that represent how similar any two words are to each other.
The similarity judgments produced by latent semantic analysis have been shown to correspond to some extent to human judgments of similarity. After training on about 2000 pages of English text, it scored as well as average test-takers on the synonym portion of the Test of English as a Foreign Language. After training on a psychology textbook, it achieved a passing score on a multiple-choice exam.
We used the application at http://lsa.colorado.edu to compute similarity matrices in term space for our set of words. The computation was based on ~38000 college-level texts (novels, newspaper articles, ...). A maximum of 300 factors was permitted in the analysis.
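The final comparison step in this family of methods is commonly a cosine between word vectors. The sketch below shows only that step on raw co-occurrence counts; it deliberately omits the dimensionality reduction (singular value decomposition to a limited number of factors) that LSA proper applies first, and the counts are invented for the example.

```python
from math import sqrt


def cosine(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Invented word-by-document counts: one row per word, one column per document.
counts = {
    "north": [3, 0, 2, 0],
    "south": [2, 0, 3, 0],
    "Poland": [0, 4, 0, 1],
}
print(round(cosine(counts["north"], counts["south"]), 3))
print(cosine(counts["north"], counts["Poland"]))
```

Words that occur in the same documents get a cosine near 1; words that never share a document get 0.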
Semantic similarity matrix derived from LSA for the set of words London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia
           London  Moscow   Paris   north   south    east    west Germany  Poland  Russia
London          1    0.17    0.37    0.13    0.12    0.12    0.14    0.17    0.16    0.16
Moscow       0.17       1    0.18    0.13    0.08    0.22    0.14    0.37    0.65    0.69
Paris        0.37    0.18       1     0.1    0.04    0.08    0.09     0.5    0.31    0.36
north        0.13    0.13     0.1       1    0.89     0.6    0.61    0.07    0.09    0.14
south        0.12    0.08    0.04    0.89       1     0.5    0.55    0.02    0.05    0.05
east         0.12    0.22    0.08     0.6     0.5       1    0.85    0.22    0.29     0.3
west         0.14    0.14    0.09    0.61    0.55    0.85       1    0.24    0.26    0.23
Germany      0.17    0.37     0.5    0.07    0.02    0.22    0.24       1    0.85    0.81
Poland       0.16    0.65    0.31    0.09    0.05    0.29    0.26    0.85       1    0.87
Russia       0.16    0.69    0.36    0.14    0.05     0.3    0.23    0.81    0.87       1
[Figure: hierarchical cluster tree computed from the pair-wise Latent Semantic Analysis (LSA) scores of similarity for London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia, based on ~38000 college-level texts (novels, newspaper articles, ...). Leaf order: east, west, north, south, London, Paris, Moscow, Germany, Poland, Russia. Axis ticks: 0.1 to 0.6.]
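A similarity tree of this kind can be built by agglomerative clustering. The slides do not say which linkage rule or distance was used, so the sketch below assumes average linkage and distance = 1 − similarity; the four similarity values are taken from the LSA matrix for north, south, Poland, and Russia.

```python
def average_linkage(labels, similarity):
    """Merge the two closest clusters repeatedly; return the merge history."""
    clusters = [(label,) for label in labels]
    merges = []

    def distance(c1, c2):
        # Average of (1 - similarity) over all cross-cluster word pairs.
        pairs = [(a, b) for a in c1 for b in c2]
        return sum(1.0 - similarity[frozenset(p)] for p in pairs) / len(pairs)

    while len(clusters) > 1:
        d, i, j = min(
            (distance(c1, c2), i, j)
            for i, c1 in enumerate(clusters)
            for j, c2 in enumerate(clusters)
            if i < j
        )
        merges.append((clusters[i], clusters[j], round(d, 4)))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return merges


# Pairwise LSA similarities copied from the matrix for four of the words.
lsa = {
    frozenset(("north", "south")): 0.89,
    frozenset(("north", "Poland")): 0.09,
    frozenset(("north", "Russia")): 0.14,
    frozenset(("south", "Poland")): 0.05,
    frozenset(("south", "Russia")): 0.05,
    frozenset(("Poland", "Russia")): 0.87,
}
merges = average_linkage(["north", "south", "Poland", "Russia"], lsa)
for left, right, height in merges:
    print(left, right, height)
```

The directions pair up first, then the countries, and the two pairs join last, which matches the grouping visible in the tree.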
LSA provides a straightforward measure of similarity between words.
For WORDNET, several different measures of similarity have been devised, e.g.:
Path length between synsets.
Information content: a corpus-based measure of the specificity of a concept, measured in terms of the frequency of occurrence of the concept in the corpus (the human-annotated, sense-tagged corpus SemCor (Miller et al. 1993), which links every word in the Brown Corpus to its appropriate WordNet sense). Scaled various ways.
Vector-space models: work by forming second-order co-occurrence vectors from the WordNet definitions of concepts, known as glosses.
We used five measures in our computations of similarity and took the average score, using each of the relevant senses.
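As an illustration of a path-based measure, the sketch below computes the Wu and Palmer (1994) score, 2·depth(lcs) / (depth(a) + depth(b)), where lcs is the deepest concept that lies above both a and b. The taxonomy is a hypothetical toy tree, not WordNet itself, and depth is counted with the root at depth 1.

```python
# Hypothetical toy taxonomy: each concept maps to its parent.
PARENT = {
    "entity": None,
    "location": "entity",
    "region": "location",
    "city": "region",
    "country": "region",
    "Paris": "city",
    "London": "city",
    "Germany": "country",
}


def path_to_root(concept):
    """Return [concept, parent, ..., root]."""
    path = []
    while concept is not None:
        path.append(concept)
        concept = PARENT[concept]
    return path


def wu_palmer(a, b):
    """Wu-Palmer similarity on the toy tree (root has depth 1)."""
    path_a, path_b = path_to_root(a), path_to_root(b)
    ancestors_b = set(path_b)
    # First shared node walking up from a is the deepest common subsumer.
    lcs = next(node for node in path_a if node in ancestors_b)
    return 2 * len(path_to_root(lcs)) / (len(path_a) + len(path_b))


print(wu_palmer("Paris", "London"))
print(wu_palmer("Paris", "Germany"))
```

Two cities share a deeper common ancestor than a city and a country, so the first score is higher than the second.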
Semantic similarity matrix derived from WordNet for the set of words London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia, using senses relevant to the geography of Europe and five measures of similarity: wup (path length), lin and jcn (information content), and gv and pgv (vector space measures)
           London  Moscow   Paris   north   south    east    west Germany  Poland  Russia
London          1   0.396   0.466   0.106   0.103   0.076   0.078   0.322   0.299   0.303
Moscow      0.396       1   0.393   0.095   0.094   0.062    0.07   0.286   0.281   0.288
Paris       0.466   0.393       1   0.106   0.104   0.074   0.077   0.327   0.308   0.307
north       0.106   0.095   0.106       1   0.228   0.179    0.21   0.123   0.132   0.111
south       0.103   0.094   0.104   0.228       1   0.172   0.212   0.115   0.107   0.109
east        0.076   0.062   0.074   0.179   0.172       1   0.216   0.093    0.08   0.077
west        0.078    0.07   0.077    0.21   0.212   0.216       1   0.087   0.082   0.083
Germany     0.322   0.286   0.327   0.123   0.115   0.093   0.087       1   0.589   0.409
Poland      0.299   0.281   0.308   0.132   0.107    0.08   0.082   0.589       1   0.403
Russia      0.303   0.288   0.307   0.111   0.109   0.077   0.083   0.409   0.403       1
[Figure: hierarchical cluster tree computed from pairwise WordNet-based semantic similarity scores for London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia, restricted to senses related to the geography of Europe. Leaf order: Germany, Poland, Russia, London, Paris, Moscow, north, south, west, east. Axis ticks: 0.2 to 0.6.]
BACK TO THE BRAIN...
London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia
Now let's see how to compare the EEG data and the language data...
Some of the similarity trees show remarkable congruence between the brain and semantic data.
Where exactly does that congruence lie?
Can we devise a quantitative measure of the nature and strength of that congruence?
[Figure: two similarity trees, axis ticks 0.2 to 0.6. LANGUAGE DATA, leaf order: Germany, Poland, Russia, London, Paris, Moscow, north, south, west, east. BRAIN DATA, leaf order: London, Paris, Moscow, Germany, Poland, Russia, north, south, west, east.]
[Figure: WordNet-based semantic similarities and EEG conditional probability estimates for London relative to London (L), Moscow (M), Paris (P), north (n), south (s), east (e), west (w), Germany (G), Poland (Po), and Russia (R).]
The Spearman rank correlation for the two sequences in the figure is 0.99, with one-sided significance of 1.84E-10.
BRAIN DATA
For each word ω, we compute from the conditional probability density estimates a ternary relation R such that R(ω, ω1, ω2) if and only if, with respect to word ω, the conditional probability for word ω1 is smaller than the conditional probability for word ω2; that is, if and only if ω1's similarity difference with ω is smaller than ω2's similarity difference with ω.
R is an ordinal relation of similarity differences, a partial order that is irreflexive, asymmetric, and transitive.
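For a fixed word ω, the relation R is simply a strict comparison of two entries in that word's row of the probability matrix. The sketch below uses the London row from the conditional probability table and checks the three stated properties by brute force; the function and variable names are assumptions.

```python
from itertools import permutations

# London row of the conditional probability estimates (brain data).
LONDON_ROW = {
    "London": 0.25, "Moscow": 0.163, "Paris": 0.138, "north": 0.075,
    "south": 0.038, "east": 0.05, "west": 0.063, "Germany": 0.088,
    "Poland": 0.075, "Russia": 0.063,
}


def R(row, w1, w2):
    """R(ω, ω1, ω2): the probability for ω1 is smaller than that for ω2."""
    return row[w1] < row[w2]


words = list(LONDON_ROW)
irreflexive = all(not R(LONDON_ROW, w, w) for w in words)
asymmetric = all(
    not (R(LONDON_ROW, a, b) and R(LONDON_ROW, b, a))
    for a, b in permutations(words, 2)
)
transitive = all(
    R(LONDON_ROW, a, c)
    for a, b, c in permutations(words, 3)
    if R(LONDON_ROW, a, b) and R(LONDON_ROW, b, c)
)
print(irreflexive, asymmetric, transitive)
```

Tied entries, such as north and Poland at 0.075, are unrelated in either direction, which is why the result is a partial order rather than a total one.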
LANGUAGE DATA
For each word ω, we compute from the semantic similarity matrix a ternary relation R such that R(ω, ω1, ω2) if and only if the similarity difference of ω1 with ω is smaller than the similarity difference of ω2 with ω; that is, ω1 is more similar to ω than is ω2.
R is an ordinal relation of similarity differences, a partial order that is irreflexive, asymmetric, and transitive.
London
Language data        Brain data
London    1.000      London    0.275
Paris     0.466      Paris     0.133
Moscow    0.396      Moscow    0.108
Germany   0.322      Germany   0.075
Russia    0.303      north     0.042
Poland    0.299      Russia    0.033
north     0.106      Poland    0.025
south     0.103      east      0.008
west      0.078      south     0.000
east      0.076      west      0.000
Partial orders for London derived from the WordNet semantic similarities of Table 2 and the conditional probability estimates for the brain data of Table 5.
[Figure: diagrams of the two partial orders.]
Following the approach described in Suppes (1974) for the axiomatization of the theory of differences in utility preference or the theory of differences in psychological intensity.
Suppes P (1974). The axiomatic method in the empirical sciences. In L. Henkin et al. (Eds.), Proceedings of the Tarski Symposium, Proceedings of Symposia in Pure Mathematics 25. Providence, RI: American Mathematical Society, pp. 465-479.
Brain data: the relational structure (A, R) constructed from R and the finite set A of classes ω1, ω2, ..., ωN, together with the N partial orders constructed from the N-by-N estimate for the conditional probability densities.
Language data: the relational structure (A, R) constructed from R and the finite set A of classes ω1, ω2, ..., ωN, together with the N partial orders constructed from the N-by-N similarity matrix.
For each ωi, we compare the partial order of the brain data with the partial order of the language data using Spearman's rank correlation coefficient, which we interpret in the usual way to determine if we have a statistically significant correlation or not.
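Spearman's coefficient is the Pearson correlation of the ranks, with tied values sharing their average rank. The sketch below applies it to the London rows of the WordNet similarity matrix and of the conditional probability table; it is a minimal illustration and does not attempt to reproduce the specific values reported in the slides, which come from particular classification runs.

```python
def average_ranks(values):
    """Rank from 1 (smallest); tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks


def spearman(x, y):
    """Pearson correlation computed on the average ranks of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# London rows, in the order London, Moscow, Paris, north, south, east,
# west, Germany, Poland, Russia.
wordnet_london = [1, 0.396, 0.466, 0.106, 0.103, 0.076, 0.078, 0.322, 0.299, 0.303]
brain_london = [0.25, 0.163, 0.138, 0.075, 0.038, 0.05, 0.063, 0.088, 0.075, 0.063]
rho = spearman(wordnet_london, brain_london)
print(round(rho, 2))
```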
For those instances in which the brain and language partial orders are significantly correlated, we find the partial order that is invariant with respect to the brain and language data. Here are two more examples.
[Figure: invariant partial order; levels as extracted: London Moscow | Paris | north | south east | west | Germany Poland | Russia] Significant Invariance - Paris - Spearman 0.88 (p = 6.6795e-004)
[Figure: invariant partial order; levels as extracted: London Moscow Paris | north | south east | west | Germany Poland Russia] Significant Invariance - Paris - Spearman 0.90 (p = 3.8716e-004)
[Figure: another hierarchical cluster tree (similarity tree) computed from the conditional probability density estimates for the classification of 640 brain wave samples for London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia. Leaf order: London, Paris, Moscow, Germany, Poland, Russia, north, south, west, east. Axis ticks: 0.2 to 0.6.]
Every single-trial classification produces its own conditional probability density estimates, giving rise to its own similarity tree...
[Figure: brain data similarity tree. Leaf order: north, east, south, west, London, Paris, Germany, Moscow, Russia, Poland. Axis ticks: 0.1 to 0.5.]
1. We compute M (=30) single-trial classifications of the data (640 data samples for our 10 words) using random resampling with replacement.
2. For each classification, we find, for each word, the partial orders of the brain and language data that are significantly correlated.
3. For each of these highly correlated partial order pairs, we find the partial order invariant with respect to both.
We performed 60 classifications; that is, we recomputed the classifications of the brain data using random resampling with replacement.
For half of these 60 classifications we compared the brain data to the WordNet data, and for the other half we compared the brain data to the LSA data.
And we plotted the results.
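Random resampling with replacement can be sketched as drawing, for each of the M classifications, a fresh list of 640 sample indices in which repeats are allowed. How each resample is then split into training and test sets is not described in the slides, so the sketch stops at the index lists; the function name and the seed are assumptions.

```python
import random


def bootstrap_indices(n_samples, n_resamples, seed=0):
    """Draw `n_resamples` index lists, each sampled with replacement."""
    rng = random.Random(seed)  # fixed seed so the example is repeatable
    return [
        rng.choices(range(n_samples), k=n_samples)
        for _ in range(n_resamples)
    ]


# M = 30 resamples of the 640 brain data samples.
resamples = bootstrap_indices(640, 30)
print(len(resamples), len(resamples[0]))
# With replacement, a resample typically repeats some indices and omits others.
print(len(set(resamples[0])) <= 640)
```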
[Figure: bar chart, "Partial Orders of Similarity Differences Invariant between EEG-recorded Brain Data and Semantic Representations of Language". Vertical axis: Significant Invariant Partial Orders (0 to 300). Horizontal axis: Test Participants (S10, S24, S25, S27, S16, S26, S12, S13, S18). Series: WordNet, LSA.]
A higher number of significant structural similarities was found between the brain data and the WordNet data than the LSA data.
This stronger structural similarity between the brain data and the WordNet-derived data supports the contention that, during language comprehension for the complex cognitive task of assessing truth or falsity, the representation of words in the brain has a WordNet-like quality.
This stronger structural similarity is the first evidence we have seen of WordNet-like representations in the brain.
Banerjee S, Pedersen T (2003). Extended Gloss Overlaps as a Measure of Semantic Relatedness. In Proceedings of the Eighteenth International Joint Conference on Artificial Intelligence, pp. 805-810, August 9-15, 2003, Acapulco, Mexico.
Banerjee S, Pedersen T (2002). An Adapted Lesk Algorithm for Word Sense Disambiguation using WordNet. In Proceedings of the Third International Conference on Intelligent Text Processing and Computational Linguistics, pp. 136-145, February 17-23, 2002, Mexico City.
Baroni M, Murphy B, Barbu E, Poesio M (2010). Strudel: A Corpus-Based Semantic Model Based on Properties and Types. Cognitive Science 34(2), 222-254. DOI 10.1111/j.1551-6709.2009.01068.x
Blei D, Ng A, Jordan M (2003). Latent Dirichlet allocation. Journal of Machine Learning Research 3, 993-1022.
Brants T, Franz A (2006). www.ldc.upenn.edu/Catalog/CatalogEntry.jsp?catalogId=LDC2006T13. Linguistic Data Consortium, Philadelphia.
Budanitsky A, Hirst G (2001). Semantic distance in WordNet: an experimental, application-oriented evaluation of five measures. In Proceedings of the NAACL 2001 Workshop on WordNet and Other Lexical Resources, Pittsburgh, June 2001. http://www.seas.smu.edu/~rada/mwnw/papers/WNW-NAACL-101.pdf.gz
Collins AM, Loftus EF (1975). A spreading-activation theory of semantic processing. Psychological Review 82(6), 407-428.
Crestani F (1997). Application of Spreading Activation Techniques in Information Retrieval. Artificial Intelligence Review 11, 453-482.
Deerwester S, Dumais ST, Furnas GW, Landauer TK, Harshman R (1990). Indexing By Latent Semantic Analysis. Journal of the American Society for Information Science 41, 391-407.
Devereux B, Kelly C, Korhonen A (2010). Using fMRI activation to conceptual stimuli to evaluate methods for extracting conceptual representations from corpora. Proceedings of the NAACL HLT 2010 First Workshop on Computational Neurolinguistics, June 2010, Los Angeles, USA. Association for Computational Linguistics, 70-78.
Fellbaum C (Ed.) (1998). WordNet: An Electronic Lexical Database. Cambridge, MA: MIT Press.
Hauk O, Davis MH, Ford M, Pulvermüller F, Marslen-Wilson WD (2006). The time course of visual word recognition as revealed by linear regression analysis of ERP data. Neuroimage 2006 May 1;30(4):1383-400. Epub 2006 Feb 7.
Hirst G (1988). Resolving lexical ambiguity computationally with spreading activation and Polaroid Words. In Small SL, Cottrell GW, Tanenhaus MK (Eds.), Lexical ambiguity resolution: Perspectives from psycholinguistics, neuropsychology, and artificial intelligence. San Mateo, CA: Morgan Kaufmann Publishers, 1988, 73-107.
Jelodar AB, Alizadeh M, Khadivi S (2010). WordNet Based Features for Predicting Brain Activity associated with meanings of noun. Proceedings of the NAACL HLT 2010 First Workshop on Computational Neurolinguistics, June 2010, Los Angeles, USA. Association for Computational Linguistics, 18-26.
Jiang J, Conrath D (1997). Semantic similarity based on corpus statistics and lexical taxonomy. In Proceedings of International Conference on Research in Computational Linguistics, Taiwan, 19-33.
Just MA, Cherkassky VL, Aryal S, Mitchell TM (2010). A Neurosemantic Theory of Concrete Noun Representation Based on the Underlying Brain Codes. PLoS ONE 5(1): e8622. doi:10.1371/journal.pone.0008622
Karp RM (1972). Reducibility among combinatorial problems. In Miller RE, Thatcher JW (Eds.), Complexity of Computer Computations (Proc. Sympos. IBM Thomas J. Watson Res. Center, Yorktown Heights, NY). New York: Plenum, pp. 85-103.
Kelly C, Devereux B, Korhonen A (2010). Acquiring Human-like Feature-Based Conceptual Representations from Corpora. Proceedings of the NAACL HLT 2010 First Workshop on Computational Neurolinguistics, June 2010, Los Angeles, USA. Association for Computational Linguistics, 61-69.
Kriegeskorte N, Mur M, Bandettini PA (2008). Representational similarity analysis: connecting the branches of systems neuroscience. Frontiers in Systems Neuroscience. doi:10.3389/neuro.06.004.2008
Kutas M, Hillyard SA (1984). Brain potentials during reading reflect word expectancy and semantic association. Nature 307, 161-163.
Kutas M, Federmeier KD (2011). Thirty years and counting: Finding meaning in the N400 component of the event related brain potential (ERP). Annual Review of Psychology 62, 621-647.
Landauer TK, Dumais ST (1997). A solution to Plato's problem: The Latent Semantic Analysis theory of the acquisition, induction, and representation of knowledge. Psychological Review 104, 211-240.
Landauer TK, Foltz PW, Laham D (1998). Introduction to Latent Semantic Analysis. Discourse Processes 25, 259-284.
Leacock C, Chodorow M (1998). Combining local context and WordNet similarity for word sense identification. In Fellbaum C (Ed.), WordNet: An Electronic Lexical Database. Cambridge, MA: MIT Press, pp. 265-283.
Leech G, Rayson P, Wilson A (2001). Word Frequencies in Written and Spoken English: based on the British National Corpus. Longman, London.
Lin D (1998). An information-theoretic definition of similarity. In Proc. 15th International Conf. on Machine Learning. Morgan Kaufmann, San Francisco, CA, pp. 296-304.
Luce RD (1956). Semiorders and a theory of utility discrimination. Econometrica 24, 178-191. MR0078632. http://www.jstor.org/stable/1905751
Matsunaga T, Yonemori C, Tomita E, Muramatsu M (2009). Clique-based data mining for related genes in a biomedical database. BMC Bioinformatics 2009 Jul 1;10:205.
Martin Ph (2003). Correction and Extension of WordNet 1.7. ICCS 2003, 11th International Conference on Conceptual Structures (© Springer Verlag, LNAI 2746, pp. 160-173), Dresden, Germany, July 21-25, 2003.
Miller GA, Nicely P (1955). An analysis of perceptual confusions among some English consonants. J. Acoust. Soc. Am. 27, 2.
Miller GA, Leacock C, Tengi R, Bunker RT (1993). A Semantic Concordance. In Proceedings of the 3 DARPA Workshop on Human Language Technology.
Miller GA (1995). WordNet: A Lexical Database for English. Communications of the ACM 38(11), 39-41.
Mitchell TM, Shinkareva SV, Carlson A, Chang KM, Malave VL, Mason RA, Just MA (2008). Predicting Human Brain Activity Associated with the Meanings of Nouns. Science 320, 1191, May 30, 2008. DOI 10.1126/science.1152876
Murphy B, Baroni M, Poesio M (2009). EEG responds to conceptual stimuli and corpus semantics. In Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing, pages 619-627, Singapore, 6-7 August 2009. © 2009 ACL and AFNLP.
Murphy B, Poesio M (2010). Detecting semantic category in simultaneous EEG/MEG recordings. In First Workshop on Computational Neurolinguistics, NAACL HLT 2010 (pp. 36-44). Los Angeles: Association for Computational Linguistics.
Murphy B, Poesio M, Bovolo F, Bruzzone L, Dalponte M, Lakany H (2011). EEG decoding of semantic category reveals distributed representations for single concepts. Brain Lang. 2011 Apr;117(1):12-22. doi:10.1016/j.bandl.2010.09.013
Pedersen T, Patwardhan S, Michelizzi J (2004). WordNet::Similarity - Measuring the Relatedness of Concepts. In Proceedings of Fifth Annual Meeting of the North American Chapter of the Association for Computational Linguistics (NAACL-2004), pp. 38-41, Boston, May 2004.
Pereira F, Botvinick M, Detre G (2010). Learning semantic features for fMRI data from definitional text. Proceedings of the NAACL HLT 2010 First Workshop on Computational Neurolinguistics, June 2010, Los Angeles, USA. Association for Computational Linguistics, 1-9. http://www.aclweb.org/anthology/W10-0601
Perreau-Guimaraes M, Wong DK, Uy ET, Grosenick L, Suppes P (2007). Single-trial classification of MEG recordings. IEEE Transactions on Biomedical Engineering 54, 436-443.
Rabinovitch I (1977). The Scott-Suppes theorem on semiorders. J. Mathematical Psychology 15(2), 209-212. MR0437404.
Resnik P (1995). Using information content to evaluate semantic similarity. In Proceedings of the 14th International Joint Conference on Artificial Intelligence, pages 448-453, Montreal.
Roach BJ, Mathalon DH (2008). Event-related EEG time-frequency analysis: an overview of measures and an analysis of early gamma band phase locking in schizophrenia. Schizophr Bull. 2008 Sep;34(5):907-26. Epub 2008 Aug 6.
Scott D, Suppes P (1958). Foundational aspects of theories of measurement. The Journal of Symbolic Logic 23, 113-128. MR0115919.
Suppes P (1972). Axiomatic Set Theory. New York: Dover.
Suppes P (1974). The axiomatic method in the empirical sciences. In L. Henkin et al. (Eds.), Proceedings of the Tarski Symposium, Proceedings of Symposia in Pure Mathematics 25. Providence, RI: American Mathematical Society, pp. 465-479.
Suppes P, Lu Z-L, Han B (1997). Brain wave recognition of words. Proceedings of the National Academy of Sciences 94, 14965-14969.
Suppes P, Han B, Lu Z-L (1998). Brain-wave recognition of sentences. Proceedings of the National Academy of Sciences 95, 15861-15866.
Suppes P, Han B, Epelboim J, Lu Z-L (1999a). Invariance between subjects of brain wave representations of language. Proceedings of the National Academy of Sciences 96, 12953-12958.
Suppes P, Han B, Epelboim J, Lu Z-L (1999b). Invariance of brain-wave representations of simple visual images and their names. Proceedings of the National Academy of Sciences 96, 14658-14663.
Suppes P, Han B (2000). Brain-wave representation of words by superposition of a few sine waves. Proceedings of the National Academy of Sciences 97, 8738-8743.
Suppes P, Perreau-Guimaraes M, Wong DK (2009). Partial Orders of Similarity Differences Invariant Between EEG-Recorded Brain and Perceptual Representations of Language. Neural Computation 21, 3228-3269.
Suppes P de Barros JA Oas G (2012) Phase-Oscillator computations as neural models of stimulus-response conditioning and response selection Journal of Mathematical Psychology 56 95-117 Turney P (2001) Mining the Web for Synonyms PMI-IR versus LSA on TOEFL In L De Raedt amp P Flach (Eds) Proceedings of the Twelfth European Conference on Machine Learning (ECML-2001) (pp 491-502) Freiburg Germany Vassilieva E Pinto G de Barros JA Suppes P (2011) Learning Pattern Recognition Through Quasi-Synchronization of Phase Oscillators IEEE Transactions on Neural Networks Vol 22 No 1 January 2011 Vigliocco G Warren J Siri S Arciuli J Scott S Wise R (2006) The role of semantics and grammatical class in the neural representation of words Cereb Cortex 2006 Dec16(12)1790-6 Epub 2006 Jan 18 Wong DK Perreau-Guimaraes M Uy ET Suppes P (2004) Classification of individual trials based on the best independent component of EEG-recorded sentences Neurocomputing 61 479-484 Wong DK Uy ET Perreau-Guimaraes M Yang W Suppes P (2006) Interpretation of perceptron weights as constructed time series for EEG classification Neurocomputing 70 373-383 Wong DK Grosenick L Uy ET Perreau-Guimaraes M Carvalhaes CG Desain P Suppes P (2008) Quantifying inter-subject agreement in brain-imaging analyses NeuroImage 39 1051ndash1063 Woods W (1975) Whatrsquos in a link Foundations for semantic networks In Bobrow D Collins A eds Representation and understanding New York Academic Press 197535-82 Wu Z Palmer M (1994) Verb Semantics and Lexical Selection In Proceedings of the 32nd Annual Meeting of the Association for Computational Linguistics Las Cruces New Mexico pp 133--138
46 copy Colleen E Crangle 2013
Electroencephalography (EEG) measures the electric potential generated by the synchronous activity of thousands or millions of neurons that have similar spatial orientation, allowing brain wave activity to be detected. EEG has very high temporal resolution, on the order of milliseconds, with a sampling rate of 500 to 1000 Hz common in neuro-cognitive studies; that is, 500 to 1000 readings a second.
Fifteen seconds of EEG data. From http://sccn.ucsd.edu
15-22-128 channels of information
Simple Experiment
one, two, three, four, five, six, seven, eight, nine, ten, left, right, yes, no
Repeated visually or auditorily many times while EEG recordings are made...
Data
Extract segments of brain waveforms corresponding to the time at which each individual word was heard (or read).
Question
Can we tell which brain data sample is associated with which word? That is, can we train a classifier to correctly classify the brain data samples?
Can we train a classifier of EEG data so that we can predict which word / sentence / word within a sentence the participant is seeing or hearing?
1997: Brain-wave recognition of words
1998: Brain-wave recognition of sentences
Can we train the classifier on some participants and use it to make predictions for other participants?
1999: Invariance between subjects of brain wave representations of language
Can we train the classifier using visually presented words and use that classifier to make predictions for words presented auditorily (and vice versa)?
2004: Classification of individual trials based on the best independent component of EEG-recorded sentences
2005: Recognition of Words from the EEG Laplacian
2006: Multichannel classification of single EEG trials with independent component analysis
2007: Single-trial classification of MEG recordings
Can we train the classifier using words and use that classifier to make predictions for pictures depicting what the word refers to (and vice versa)?
1999: Invariance of brain-wave representations of simple visual images and their names
Answering yes to these questions establishes that we are recognizing the meaning of the word, or the idea or concept behind the word, and not, for instance, the sound of the word, its orthography, or its idiosyncratic use by one person.
EXPERIMENTAL SETUP for recognizing words within sentences
A computer presented 48 spoken sentences to each of 9 participants in 10 randomized blocks, with all 48 sentences in each block, so there were 480 trials for each participant.
The sentences were about the geography of Europe. Half were true, half false; half positive, half negative.
The capital of Italy is Paris. (F)
Paris is not east of Berlin. (T)
Spain is west of Russia. (T)
Participants were asked to determine the truth or falsity of each statement while EEG recordings were made.
X ∈ {Berlin, London, Moscow, Paris, Rome, Warsaw, Madrid, Vienna, Athens}
Y ∈ {France, Germany, Italy, Poland, Russia, Austria, Greece, Spain}
W ∈ {the capital, the largest city}
Z ∈ {north, south, east, west}
Sentence forms:
W of Y is [not] X
X is [not] W of Y
X is [not] Z of X
Y is [not] Z of Y
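The sentence forms above can be enumerated mechanically. The following is a minimal sketch, not the authors' stimulus generator: the function name `generate_sentences` is hypothetical, and it assumes that in the forms "X is [not] Z of X" and "Y is [not] Z of Y" the two places are distinct. It lists every sentence the grammar allows; the experiment itself used only 48 of them, and the slides do not say how those were chosen.

```python
from itertools import permutations, product

CITIES = ["Berlin", "London", "Moscow", "Paris", "Rome",
          "Warsaw", "Madrid", "Vienna", "Athens"]
COUNTRIES = ["France", "Germany", "Italy", "Poland",
             "Russia", "Austria", "Greece", "Spain"]
ROLES = ["the capital", "the largest city"]
DIRECTIONS = ["north", "south", "east", "west"]
NEGATIONS = ["", "not "]  # "[not]" is optional in every form


def capitalize_first(text):
    """Upper-case only the first character, leaving proper nouns untouched."""
    return text[0].upper() + text[1:]


def generate_sentences():
    """Enumerate every sentence licensed by the four sentence forms."""
    sentences = []
    # Forms "W of Y is [not] X" and "X is [not] W of Y".
    for w, y, neg, x in product(ROLES, COUNTRIES, NEGATIONS, CITIES):
        sentences.append(capitalize_first(f"{w} of {y} is {neg}{x}"))
        sentences.append(f"{x} is {neg}{w} of {y}")
    # Forms "X is [not] Z of X" and "Y is [not] Z of Y".
    # Assumption: the two places compared are distinct.
    for places in (CITIES, COUNTRIES):
        for (a, b), neg, z in product(permutations(places, 2), NEGATIONS, DIRECTIONS):
            sentences.append(f"{a} is {neg}{z} of {b}")
    return sentences


sentences = generate_sentences()
print(len(sentences), "sentences; for example:", sentences[0])
```

Truth values are not assigned here; that would need a separate table of geographic facts.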
[Image from http://www2.le.ac.uk]
LANGUAGE
Paris is north of Berlin
The capital of Germany is Berlin
Spain is east of France
London is not the capital of France
Vienna is east of Moscow
The largest city of Poland is Athens
BRAIN
Fifteen seconds of EEG data. From http://sccn.ucsd.edu
What our work does...
Compares the brain data and the language data and finds structural similarities between them.
HOW DO WE REPRESENT THE EEG DATA?
HOW DO WE REPRESENT THE LANGUAGE DATA?
HOW DO WE COMPARE THE TWO?
Berlin, London, Moscow, Paris, Rome, Warsaw, Madrid, Vienna, Athens, France, Germany, Italy, Poland, Russia, Austria, Greece, Spain, north, south, east, west
Paris is north of Berlin
Spain is east of France
The capital of Germany is Berlin
Paris
east of
Germany
Machine learning approach to the study of brain and language
For the 10 geography words (London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia) we have 640 EEG data samples for each participant.
We want to classify these 640 samples into the correct 10 classes.
Use 580 brain samples and their associated words to train the classifier.
Test the remaining 60 samples.
Do this many times, each time using a different set of training samples.
Find the average classification rate; make sure it is statistically significant.
Obtained classification rates in the range 25% to 29%, with a mean classification rate of around 24.5% (p < 10E-10). Significantly higher than chance (10%).
We used a 5-fold linear discriminant model, with principal component analysis for blind source separation, to classify the segments of EEG data obtained from the individual trials.
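The train, test, and repeat procedure above can be sketched as follows. This is an illustration under loud assumptions: a simple nearest-centroid classifier stands in for the linear discriminant model with principal component analysis, and the 640 samples are synthetic two-feature points, not EEG segments. The function names are hypothetical.

```python
import random


def fit_centroids(samples, labels):
    """Stand-in classifier (assumption): one mean vector per class."""
    sums, counts = {}, {}
    for x, y in zip(samples, labels):
        acc = sums.setdefault(y, [0.0] * len(x))
        for k, value in enumerate(x):
            acc[k] += value
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}


def predict(centroids, x):
    """Assign x to the class whose centroid is nearest (squared distance)."""
    def sqdist(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda y: sqdist(centroids[y]))


def mean_classification_rate(samples, labels, n_test=60, n_repeats=20, seed=0):
    """Repeat: hold out n_test samples, train on the rest, score the held-out set."""
    rng = random.Random(seed)
    indices = list(range(len(samples)))
    rates = []
    for _ in range(n_repeats):
        rng.shuffle(indices)
        test_idx, train_idx = indices[:n_test], indices[n_test:]
        model = fit_centroids([samples[i] for i in train_idx],
                              [labels[i] for i in train_idx])
        hits = sum(predict(model, samples[i]) == labels[i] for i in test_idx)
        rates.append(hits / n_test)
    return sum(rates) / len(rates)


# Synthetic data (assumption): 640 samples, 10 classes, noisy class-dependent feature.
data_rng = random.Random(1)
labels = [i % 10 for i in range(640)]
samples = [[y + data_rng.gauss(0, 1.0), data_rng.gauss(0, 1.0)] for y in labels]

rate = mean_classification_rate(samples, labels)
print(f"mean classification rate: {rate:.3f} (chance is 0.100)")
```

A significance test against the 10% chance level would follow as a separate step; it is not sketched here.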
Beyond machine learning ...
Classify the 640 EEG samples for each participant into the 10 word classes as before, THEN look at the MIS-CLASSIFICATIONS and build a CONFUSION MATRIX.
Classify 640 brain data samples s1, s2, ..., s640 into 10 classes ω1, ω2, ..., ω10 of the finite set A.
M = (m_iq) is the confusion matrix for a given classification, where m_iq is the number of test samples from class ωi classified as belonging to class ωq.
Confusion matrix (rows: true class ωi; columns: class ωq assigned by the classifier)
           London  Moscow   Paris   north   south    east    west Germany  Poland  Russia
London          8      14      11       7       6       3       3      10       8      10
Moscow          8      24      14       6       2       6       4       4       7      15
Paris           6      18      12       4       3       5       8       6      11       7
north           4       2       5      11       9       7      10       1       8       3
south           1       4       3      14      14       9      11       4       7       3
east            4       3       5       9      12      12       7       1       4       3
west            4       3       2      12      13      11      10       2       8       5
Germany         2       2       3       2       2       1       0       9      11       8
Poland          2       3       4       0       4       1       4       9       9       4
Russia          7       7       5       1       5       1       1       8       4      11
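As a small illustration of the definition of M = (m_iq), the sketch below counts how often each true class is assigned to each predicted class. The function name `confusion_matrix` and the six toy trials are assumptions made for the example; they are not data from the study.

```python
CLASSES = ["London", "Moscow", "Paris", "north", "south",
           "east", "west", "Germany", "Poland", "Russia"]


def confusion_matrix(true_labels, predicted_labels, classes=CLASSES):
    """m[i][q] = number of test samples from class i classified as class q."""
    index = {name: k for k, name in enumerate(classes)}
    m = [[0] * len(classes) for _ in classes]
    for truth, guess in zip(true_labels, predicted_labels):
        m[index[truth]][index[guess]] += 1
    return m


# Toy trials (hypothetical), only to show the bookkeeping.
truth = ["London", "London", "Paris", "north", "north", "east"]
guess = ["London", "Moscow", "London", "north", "south", "east"]
M = confusion_matrix(truth, guess)

for name, row in zip(CLASSES, M):
    print(f"{name:<8}", row)
```

The diagonal holds the correct classifications; everything off the diagonal is a misclassification.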
The relative frequencies of the entries m_iq within each row are an N-by-N estimate for the conditional probability densities, designated by the matrix P = (p_iq), that a randomly chosen test sample from class ωi will be classified as belonging to class ωq:
p_{iq} = \frac{m_{iq}}{\sum_{q'=1}^{N} m_{iq'}}
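A minimal sketch of this row normalization, applied to the counts of the confusion matrix shown earlier. The function name `conditional_probabilities` is hypothetical. The output is not expected to reproduce any other table in these slides, since each classification run gives its own estimates.

```python
CLASSES = ["London", "Moscow", "Paris", "north", "south",
           "east", "west", "Germany", "Poland", "Russia"]

# Counts copied from the confusion matrix shown earlier in the slides.
M = [
    [8, 14, 11, 7, 6, 3, 3, 10, 8, 10],
    [8, 24, 14, 6, 2, 6, 4, 4, 7, 15],
    [6, 18, 12, 4, 3, 5, 8, 6, 11, 7],
    [4, 2, 5, 11, 9, 7, 10, 1, 8, 3],
    [1, 4, 3, 14, 14, 9, 11, 4, 7, 3],
    [4, 3, 5, 9, 12, 12, 7, 1, 4, 3],
    [4, 3, 2, 12, 13, 11, 10, 2, 8, 5],
    [2, 2, 3, 2, 2, 1, 0, 9, 11, 8],
    [2, 3, 4, 0, 4, 1, 4, 9, 9, 4],
    [7, 7, 5, 1, 5, 1, 1, 8, 4, 11],
]


def conditional_probabilities(m):
    """Divide each row by its total: p[i][q] = m[i][q] / sum over q' of m[i][q']."""
    return [[count / sum(row) for count in row] for row in m]


P = conditional_probabilities(M)
for name, row in zip(CLASSES, P):
    print(f"{name:<8}", " ".join(f"{p:.3f}" for p in row))
```

Each row of P sums to 1, so row i can be read as a distribution over the classes that samples of word ωi were assigned to.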
Conditional probability density estimates from the classification of brain wave data for London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia (rows: true class ωi; columns: ωq, predicted / classified as)
           London  Moscow   Paris   north   south    east    west Germany  Poland  Russia
London       0.25   0.163   0.138   0.075   0.038    0.05   0.063   0.088   0.075   0.063
Moscow      0.144   0.333   0.133   0.033   0.011   0.056   0.056   0.089   0.067   0.078
Paris       0.175   0.188   0.125   0.038   0.038   0.038    0.05     0.1   0.138   0.113
north       0.067   0.017   0.017   0.283   0.167    0.15     0.1     0.1   0.067   0.033
south       0.014   0.014   0.071   0.114   0.271   0.171   0.157   0.029   0.086   0.071
east         0.05       0       0   0.133    0.15   0.383    0.15    0.05       0   0.083
west        0.057   0.043       0   0.171   0.114   0.143   0.229   0.086     0.1   0.057
Germany     0.075   0.175     0.1    0.05   0.125   0.025    0.05   0.275   0.025     0.1
Poland       0.15     0.1   0.125    0.05    0.05       0   0.025    0.15    0.25     0.1
Russia       0.16    0.04    0.08    0.04     0.1    0.02    0.12     0.1    0.08    0.26
[Figure: Hierarchical cluster tree (similarity tree) computed from the conditional probability density estimates for the classification of 640 brain wave samples for London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia. Axis from 0.2 to 0.6. Leaves in order: London, Paris, Moscow, Germany, Poland, Russia, north, south, west, east.]
LANGUAGE
WordNet: a lexical database of English organized around sets of cognitive synonyms, restricted here to the senses that are relevant to the geography sentences.
Latent Semantic Analysis (LSA): a statistical method to extract measures of word similarity from selected sets of documents such as novels, newspaper articles, and textbooks.
BRAIN
The conditional probability density estimates from the classification of the brain wave data (the table shown earlier).
WORDNET
WordNet is a human-annotated lexical database of English in which nouns, verbs, adjectives, and adverbs are grouped into sets of cognitive synonyms (called synsets), each synset expressing a distinct concept. See Miller (1995), Fellbaum (1998), and http://wordnet.princeton.edu
The synsets are related to each other primarily through the hypernymy and hyponymy relations for nouns. Other relations in WordNet are part-whole (holonym), member of (meronym), has instance, and so on.
Hyponymy, often referred to as the is-a relation in computational discussions, is defined as follows: a concept represented by a lexical item Li is said to be a hyponym of the concept represented by a lexical item Lk if native speakers of English accept sentences of the form "An Li is a kind of Lk". Conversely, Lk is the hypernym of Li. A hypernym is therefore a more general concept and a hyponym a more specific concept.
WORDNET
Multiple word senses, organized into sets of cognitive synonyms called "synsets"
(n) Paris#1, City of Light#1, French capital#1, capital of France#1 (the capital and largest city of France; and international center of culture and commerce)
(n) Paris#2, genus Paris#1 (sometimes placed in subfamily Trilliaceae)
(n) Paris#3 ((Greek mythology) the prince of Troy who abducted Helen from her husband Menelaus and provoked the Trojan War)
(n) Paris#4 (a town in northeastern Texas)
(n) city#1, metropolis#1, urban center#1 (a large and densely populated urban area; may include several independent administrative districts) [relations: direct hyponym / full hyponym, part meronym, has instance]
(n) city#2 (an incorporated administrative district established by state charter)
(n) city#3, metropolis#2 (people living in a large densely populated municipality)
(n) east#1, due east#1, eastward#1, E#3 (the cardinal compass point that is at 90 degrees)
(n) East#2, Orient#1 (the countries of Asia)
(n) East#3, eastern United States#1 (the region of the United States lying to the north of the Ohio River and to the east of the Mississippi River)
(n) east#4 (the direction corresponding to the eastward cardinal compass point)
(n) east#5 (a location in the eastern part of a country, region, or city)
LATENT SEMANTIC ANALYSIS (LSA)
LSA is a statistical technique for extracting, from large collections of documents, a measure of how similar two words are to each other in terms of patterns of their co-occurrences within those documents. See Deerwester et al. 1990, Landauer and Dumais 1997, Landauer et al. 1998.
The underlying idea is that if, for each word, you take into account all the contexts in which it does and does not appear, you get for all the words a set of mutual constraints that represent how similar any two words are to each other.
The similarity judgments produced by latent semantic analysis have been shown to correspond to some extent to human judgments of similarity. After training on about 2000 pages of English text, it scored as well as average test-takers on the synonym portion of the Test of English as a Foreign Language. After training on a psychology textbook, it achieved a passing score on a multiple-choice exam.
We used the application at http://lsa.colorado.edu to compute similarity matrices in term space for our set of words. The computation was based on ~38,000 college-level texts (novels, newspaper articles, ...). A maximum of 300 factors was permitted in the analysis.
           London  Moscow   Paris   north   south    east    west Germany  Poland  Russia
London          1    0.17    0.37    0.13    0.12    0.12    0.14    0.17    0.16    0.16
Moscow       0.17       1    0.18    0.13    0.08    0.22    0.14    0.37    0.65    0.69
Paris        0.37    0.18       1     0.1    0.04    0.08    0.09     0.5    0.31    0.36
north        0.13    0.13     0.1       1    0.89     0.6    0.61    0.07    0.09    0.14
south        0.12    0.08    0.04    0.89       1     0.5    0.55    0.02    0.05    0.05
east         0.12    0.22    0.08     0.6     0.5       1    0.85    0.22    0.29     0.3
west         0.14    0.14    0.09    0.61    0.55    0.85       1    0.24    0.26    0.23
Germany      0.17    0.37     0.5    0.07    0.02    0.22    0.24       1    0.85    0.81
Poland       0.16    0.65    0.31    0.09    0.05    0.29    0.26    0.85       1    0.87
Russia       0.16    0.69    0.36    0.14    0.05     0.3    0.23    0.81    0.87       1
Semantic similarity matrix derived from LSA for the set of words London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia
[Figure: Hierarchical cluster tree computed from the pair-wise Latent Semantic Analysis (LSA) scores of similarity for London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia, based on ~38,000 college-level texts (novels, newspaper articles, ...). Axis from 0.1 to 0.6. Leaves in order: east, west, north, south, London, Paris, Moscow, Germany, Poland, Russia.]
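A cluster tree of this kind can be built by agglomerative clustering. The sketch below is illustrative only: the slides do not state which linkage method or distance was used, so average linkage and distance = 1 - similarity are assumptions, as is the function name `average_linkage`. It uses six of the ten words, with scores copied from the LSA similarity matrix.

```python
NAMES = ["north", "south", "east", "west", "Poland", "Russia"]

# LSA similarity scores for these six words, copied from the LSA matrix.
SIMILARITY = [
    [1.00, 0.89, 0.60, 0.61, 0.09, 0.14],
    [0.89, 1.00, 0.50, 0.55, 0.05, 0.05],
    [0.60, 0.50, 1.00, 0.85, 0.29, 0.30],
    [0.61, 0.55, 0.85, 1.00, 0.26, 0.23],
    [0.09, 0.05, 0.29, 0.26, 1.00, 0.87],
    [0.14, 0.05, 0.30, 0.23, 0.87, 1.00],
]


def average_linkage(names, similarity):
    """Repeatedly merge the two clusters with the smallest mean pairwise distance.

    Assumption: distance between two words is 1 - similarity.
    Returns the merges in order as (cluster_a, cluster_b, distance).
    """
    index = {name: k for k, name in enumerate(names)}
    clusters = [(name,) for name in names]

    def distance(a, b):
        total = sum(1.0 - similarity[index[x]][index[y]] for x in a for y in b)
        return total / (len(a) * len(b))

    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = distance(clusters[i], clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((clusters[i], clusters[j], round(d, 3)))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return merges


merges = average_linkage(NAMES, SIMILARITY)
for a, b, d in merges:
    print(f"merge {a} + {b} at distance {d}")
```

With these scores the compass words and the country names form separate branches before everything joins at the root.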
LSA provides a straightforward measure of similarity between words.
For WORDNET, several different measures of similarity have been devised, e.g.:
Path length between synsets.
Information content: a corpus-based measure of the specificity of a concept, measured in terms of the frequency of occurrence of the concept in the corpus. The corpus is the human-annotated, sense-tagged corpus SemCor (Miller et al. 1993), which links every word in the Brown Corpus to its appropriate WordNet sense. Scaled various ways.
Vector-space models: these work by forming second-order co-occurrence vectors from the WordNet definitions of concepts, known as glosses.
We used five measures in our computations of similarity and took the average score, using each of the relevant senses.
Semantic similarity matrix derived from WordNet for the set of words London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia, using senses relevant to the geography of Europe and five measures of similarity: wup (path length), lin and jcn (information content), and gv and pgv (vector space measures)
           London  Moscow   Paris   north   south    east    west Germany  Poland  Russia
London          1   0.396   0.466   0.106   0.103   0.076   0.078   0.322   0.299   0.303
Moscow      0.396       1   0.393   0.095   0.094   0.062    0.07   0.286   0.281   0.288
Paris       0.466   0.393       1   0.106   0.104   0.074   0.077   0.327   0.308   0.307
north       0.106   0.095   0.106       1   0.228   0.179    0.21   0.123   0.132   0.111
south       0.103   0.094   0.104   0.228       1   0.172   0.212   0.115   0.107   0.109
east        0.076   0.062   0.074   0.179   0.172       1   0.216   0.093    0.08   0.077
west        0.078    0.07   0.077    0.21   0.212   0.216       1   0.087   0.082   0.083
Germany     0.322   0.286   0.327   0.123   0.115   0.093   0.087       1   0.589   0.409
Poland      0.299   0.281   0.308   0.132   0.107    0.08   0.082   0.589       1   0.403
Russia      0.303   0.288   0.307   0.111   0.109   0.077   0.083   0.409   0.403       1
[Figure: Hierarchical cluster tree computed from pairwise WordNet-based semantic similarity scores for London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia, restricted to senses related to the geography of Europe. Axis from 0.2 to 0.6. Leaves in order: Germany, Poland, Russia, London, Paris, Moscow, north, south, west, east.]
BACK TO THE BRAIN ...
London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia
Now let's see how to compare the EEG data and the language data...
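Some of the similarity trees show remarkable congruence between the brain and semantic data.
Where exactly does that congruence lie?
Can we devise a quantitative measure of the nature and strength of that congruence?
[Figure: two similarity trees side by side, each with axis from 0.2 to 0.6, labeled LANGUAGE DATA and BRAIN DATA. One tree has leaves in the order London, Paris, Moscow, Germany, Poland, Russia, north, south, west, east (the leaf order of the brain-data tree shown earlier); the other has leaves in the order Germany, Poland, Russia, London, Paris, Moscow, north, south, west, east (the leaf order of the WordNet tree shown earlier).]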
[Figure: WordNet-based semantic similarities and EEG conditional probability estimates for London relative to London (L), Moscow (M), Paris (P), north (n), south (s), east (e), west (w), Germany (G), Poland (Po), and Russia (R).]
The Spearman rank correlation for the two sequences in the figure is 0.99, with one-sided significance of 1.84E-10.
BRAIN DATA
For each word ω, we compute from the conditional probability density estimates a ternary relation R such that R(ω, ω1, ω2) if and only if, with respect to word ω, the conditional probability for word ω1 is smaller than the conditional probability for word ω2; that is, if and only if ω1's similarity difference with ω is smaller than ω2's similarity difference with ω.
R is an ordinal relation of similarity differences: a partial order that is irreflexive, asymmetric, and transitive.
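For a fixed word ω, the relation R(ω, ω1, ω2) can be stored as a set of pairs (ω1, ω2). The sketch below builds that set for ω = London from the London row of the conditional probability table shown earlier, and checks the three stated properties. The function names are hypothetical, and treating tied values as unordered is an assumption the slides do not spell out.

```python
from itertools import permutations

# London row of the conditional probability table shown earlier.
LONDON_ROW = {
    "London": 0.25, "Moscow": 0.163, "Paris": 0.138, "north": 0.075,
    "south": 0.038, "east": 0.05, "west": 0.063, "Germany": 0.088,
    "Poland": 0.075, "Russia": 0.063,
}


def similarity_order(row):
    """Pairs (w1, w2) such that the value for w1 is strictly smaller than for w2.

    Assumption: words with equal values are left unordered.
    """
    return {(w1, w2) for w1, w2 in permutations(row, 2) if row[w1] < row[w2]}


def is_strict_partial_order(pairs):
    """Check irreflexivity, asymmetry, and transitivity of a set of pairs."""
    irreflexive = all(a != b for a, b in pairs)
    asymmetric = all((b, a) not in pairs for a, b in pairs)
    transitive = all((a, d) in pairs
                     for a, b in pairs for c, d in pairs if b == c)
    return irreflexive and asymmetric and transitive


R_london = similarity_order(LONDON_ROW)
print(len(R_london), "ordered pairs; strict partial order:",
      is_strict_partial_order(R_london))
```

The same function applies unchanged to a row of a semantic similarity matrix, which gives the corresponding relation for the language data.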
LANGUAGE DATA
For each word ω, we compute from the semantic similarity matrix a ternary relation R such that R(ω, ω1, ω2) if and only if the similarity difference of ω1 with ω is smaller than the similarity difference of ω2 with ω; that is, ω1 is more similar to ω than is ω2.
R is an ordinal relation of similarity differences: a partial order that is irreflexive, asymmetric, and transitive.
London
Language data        Brain data
London   1.000       London   0.275
Paris    0.466       Paris    0.133
Moscow   0.396       Moscow   0.108
Germany  0.322       Germany  0.075
Russia   0.303       north    0.042
Poland   0.299       Russia   0.033
north    0.106       Poland   0.025
south    0.103       east     0.008
west     0.078       south    0.000
east     0.076       west     0.000
Partial orders for London derived from the WordNet semantic similarities of Table 2 and the conditional probability estimates for the brain data of Table 5
[Figure: the two partial orders drawn as diagrams; surviving node labels are Poland, north, south, west, east.]
Following the approach described in Suppes (1974) for the axiomatization of the theory of differences in utility preference, or the theory of differences in psychological intensity.
Suppes P (1974) The axiomatic method in the empirical sciences. In L Henkin et al (Eds), Proceedings of the Tarski Symposium, Proceedings of Symposia in Pure Mathematics 25. Providence, RI: American Mathematical Society, pp 465-479.
Brain data: the relational structure (A, R) constructed from R and the finite set A of classes ω1, ω2, ..., ωN, together with the N partial orders constructed from the N-by-N estimate for the conditional probability densities.
Language data: the relational structure (A, R) constructed from R and the finite set A of classes ω1, ω2, ..., ωN, together with the N partial orders constructed from the N-by-N similarity matrix.
For each word ωi, we compare the partial order of the brain data with the partial order of the language data using Spearman's rank correlation coefficient, which we interpret in the usual way to determine if we have a statistically significant correlation or not.
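The comparison step can be illustrated with a small Spearman computation on the London values listed on the partial-orders slide (WordNet similarities and brain-data estimates). This is a sketch, not the authors' code: the helper names are hypothetical, ties receive average ranks, and no significance test is included.

```python
WORDS = ["London", "Moscow", "Paris", "north", "south",
         "east", "west", "Germany", "Poland", "Russia"]

# Values for London from the partial-orders slide, in the order of WORDS.
LANGUAGE = [1.000, 0.396, 0.466, 0.106, 0.103, 0.076, 0.078, 0.322, 0.299, 0.303]
BRAIN = [0.275, 0.108, 0.133, 0.042, 0.000, 0.008, 0.000, 0.075, 0.025, 0.033]


def average_ranks(values):
    """1-based ranks in ascending order; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of positions start+1 .. end+1
        for k in order[start:end + 1]:
            ranks[k] = shared
        start = end + 1
    return ranks


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the two rank sequences."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


rho = spearman(LANGUAGE, BRAIN)
print(f"Spearman rank correlation for London: {rho:.3f}")
```

Deciding whether a given rho is statistically significant for N = 10 items is a separate step, using the usual tables or a library routine.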
For those instances in which the brain and language partial orders are significantly correlated, we find the partial order that is invariant with respect to the brain and language data. Here are two more examples.
[Figure: Significant Invariance - Paris - Spearman 0.88 (p = 6.6795e-004). Diagram of the invariant partial order over London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia.]
[Figure: Significant Invariance - Paris - Spearman 0.90 (p = 3.8716e-004). Diagram of the invariant partial order over London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia.]
[Figure: Another hierarchical cluster tree (similarity tree) computed from the conditional probability density estimates for the classification of 640 brain wave samples for London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia. Axis from 0.2 to 0.6. Leaves in order: London, Paris, Moscow, Germany, Poland, Russia, north, south, west, east.]
Every single-trial classification produces its own conditional probability density estimates, giving rise to its own similarity tree ...
[Figure: Similarity tree from the brain data for a different classification. Axis from 0.1 to 0.5. Leaves in order: north, east, south, west, London, Paris, Germany, Moscow, Russia, Poland.]
1. We compute M (= 30) single-trial classifications of the data (640 data samples for our 10 words), using random resampling with replacement.
2. For each classification, we find for each word the partial orders of the brain and language data that are significantly correlated.
3. For each of these highly correlated partial-order pairs, we find the partial order invariant with respect to both.
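One natural reading of step 3 is to keep only the ordered pairs on which the brain and language orders agree. The sketch below does this for London with the values from the partial-orders slide. It is an assumption that the invariant partial order is this intersection, and that both relations are built in the same direction (smaller score first); the slides do not give the construction, and the function names are hypothetical.

```python
from itertools import permutations

# Values for London from the partial-orders slide.
LANGUAGE = {"London": 1.000, "Paris": 0.466, "Moscow": 0.396, "Germany": 0.322,
            "Russia": 0.303, "Poland": 0.299, "north": 0.106, "south": 0.103,
            "west": 0.078, "east": 0.076}
BRAIN = {"London": 0.275, "Paris": 0.133, "Moscow": 0.108, "Germany": 0.075,
         "north": 0.042, "Russia": 0.033, "Poland": 0.025, "east": 0.008,
         "south": 0.000, "west": 0.000}


def similarity_order(row):
    """Pairs (w1, w2) with a strictly smaller score for w1 than for w2."""
    return {(w1, w2) for w1, w2 in permutations(row, 2) if row[w1] < row[w2]}


def invariant_order(brain_row, language_row):
    """Assumed construction: pairs ordered the same way in both data sets."""
    return similarity_order(brain_row) & similarity_order(language_row)


invariant = invariant_order(BRAIN, LANGUAGE)
disputed = similarity_order(LANGUAGE) - invariant
print(len(invariant), "pairs agree;", len(disputed), "language pairs are not invariant")
```

The intersection of two strict partial orders is again a strict partial order, so the result can be drawn as a diagram in the same way as the originals.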
We performed 60 classifications; that is, we recomputed the classifications of the brain data using random resampling with replacement.
For half of these 60 classifications we compared the brain data to the WordNet data, and for the other half we compared the brain data to the LSA data.
And we plotted the results.
[Figure: Bar chart, "Partial Orders of Similarity Differences Invariant between EEG-recorded Brain Data and Semantic Representations of Language". Vertical axis: Significant Invariant Partial Orders (0 to 300). Horizontal axis: Test Participants (S10, S24, S25, S27, S16, S26, S12, S13, S18). Series: WordNet, LSA.]
A higher number of significant structural similarities was found between the brain data and the WordNet data than between the brain data and the LSA data.
This stronger structural similarity between the brain data and the WordNet-derived data supports the contention that, during language comprehension for the complex cognitive task of assessing truth or falsity, the representation of words in the brain has a WordNet-like quality.
This stronger structural similarity is the first evidence we have seen of WordNet-like representations in the brain.
Banerjee S, Pedersen T (2003) Extended Gloss Overlaps as a Measure of Semantic Relatedness. In Proceedings of the Eighteenth International Joint Conference on Artificial Intelligence, pp 805-810, August 9-15, 2003, Acapulco, Mexico.
Banerjee S, Pedersen T (2002) An Adapted Lesk Algorithm for Word Sense Disambiguation using WordNet. In Proceedings of the Third International Conference on Intelligent Text Processing and Computational Linguistics, pp 136-145, February 17-23, 2002, Mexico City.
Baroni M, Murphy B, Barbu E, Poesio M (2010) Strudel: A Corpus-Based Semantic Model Based on Properties and Types. Cognitive Science, vol 34, no 2, pp 222-254. DOI 10.1111/j.1551-6709.2009.01068.x
Blei D, Ng A, Jordan M (2003) Latent Dirichlet allocation. Journal of Machine Learning Research 3, 993-1022.
Brants T, Franz A (2006) www.ldc.upenn.edu/Catalog/CatalogEntry.jsp?catalogId=LDC2006T13. Linguistic Data Consortium, Philadelphia.
Budanitsky A, Hirst G (2001) Semantic distance in WordNet: an experimental, application-oriented evaluation of five measures. In Proceedings of the NAACL 2001 Workshop on WordNet and Other Lexical Resources, Pittsburgh, June 2001. http://www.seas.smu.edu/~rada/mwnw/papers/WNW-NAACL-101.pdf.gz
Collins AM, Loftus EF (1975) A spreading-activation theory of semantic processing. Psychological Review, 1975 Nov, Vol 82(6), 407-428.
Crestani F (1997) Application of Spreading Activation Techniques in Information Retrieval. Artificial Intelligence Review 11, 453-482.
Deerwester S, Dumais ST, Furnas GW, Landauer TK, Harshman R (1990) Indexing By Latent Semantic Analysis. Journal of the American Society for Information Science 41, 391-407.
Devereux B, Kelly C, Korhonen A (2010) Using fMRI activation to conceptual stimuli to evaluate methods for extracting conceptual representations from corpora. Proceedings of the NAACL HLT 2010 First Workshop on Computational Neurolinguistics, June 2010, Los Angeles, USA. Association for Computational Linguistics, 70-78.
Fellbaum C (Ed) (1998) WordNet: An Electronic Lexical Database. Cambridge, MA: MIT Press.
Hauk O, Davis MH, Ford M, Pulvermüller F, Marslen-Wilson WD (2006) The time course of visual word recognition as revealed by linear regression analysis of ERP data. Neuroimage 2006 May 1;30(4):1383-400. Epub 2006 Feb 7.
Hirst G (1988) Resolving lexical ambiguity computationally with spreading activation and Polaroid Words. In Small SL, Cottrell GW, Tanenhaus MK (editors), Lexical ambiguity resolution: Perspectives from psycholinguistics, neuropsychology, and artificial intelligence. San Mateo, CA: Morgan Kaufmann Publishers, 1988, 73-107.
Jelodar AB, Alizadeh M, Khadivi S (2010) WordNet Based Features for Predicting Brain Activity associated with meanings of noun. Proceedings of the NAACL HLT 2010 First Workshop on Computational Neurolinguistics, June 2010, Los Angeles, USA. Association for Computational Linguistics, 18-26.
Jiang J, Conrath D (1997) Semantic similarity based on corpus statistics and lexical taxonomy. In Proceedings of International Conference on Research in Computational Linguistics, Taiwan, 19-33.
Just MA, Cherkassky VL, Aryal S, Mitchell TM (2010) A Neurosemantic Theory of Concrete Noun Representation Based on the Underlying Brain Codes. PLoS ONE 5(1): e8622. doi:10.1371/journal.pone.0008622
Karp RM (1972) Reducibility among combinatorial problems. In Complexity of Computer Computations, Proc Sympos IBM Thomas J Watson Res Center, Yorktown Heights, NY (Ed R E Miller and J W Thatcher). New York: Plenum, pp 85-103.
Kelly C, Devereux B, Korhonen A (2010) Acquiring Human-like Feature-Based Conceptual Representations from Corpora. Proceedings of the NAACL HLT 2010 First Workshop on Computational Neurolinguistics, June 2010, Los Angeles, USA. Association for Computational Linguistics, 61-69.
Kriegeskorte N, Mur M, Bandettini PA (2008) Representational similarity analysis - connecting the branches of systems neuroscience. Frontiers in Systems Neuroscience. doi:10.3389/neuro.06.004.2008
Kutas M, Hillyard SA (1984) Brain potentials during reading reflect word expectancy and semantic association. Nature 307, 161-163.
Kutas M, Federmeier KD (2011) Thirty years and counting: Finding meaning in the N400 component of the event related brain potential (ERP). Annual Review of Psychology 2011, 62, pp 621-647.
Landauer TK, Dumais ST (1997) A solution to Plato's problem: The Latent Semantic Analysis theory of the acquisition, induction, and representation of knowledge. Psychological Review 104, 211-240.
Landauer TK, Foltz PW, Laham D (1998) Introduction to Latent Semantic Analysis. Discourse Processes 25, 259-284.
Leacock C Chodorow M (1998) Combining local context and WordNet similarity for word sense identification In Fellbaum (Ed) WordNet An Electronic Lexical Database Cambridge MA MIT Press pp 265-283 Leech G Rayson P Wilson A (2001) Word Frequencies in Written and Spoken English based on the British National Corpus Longman London Lin D (1998) An information-theoretic definition of similarity In Proc 15th International Conf on Machine Learning Morgan Kaufmann San Francisco CA p 296--304 Luce RD (1956) Semiorders and a theory of utility discrimination Econometrica 24 178ndash191 MR0078632 httpwwwjstororgstable1905751 Matsunaga T Yonemori C Tomita E Muramatsu M (2009) Clique-based data mining for related genes in a biomedical database BMC Bioinformatics 2009 Jul 110205 Martin Ph (2003) Correction and Extension of WordNet 17 ICCS 2003 11th International Conference on Conceptual Structures (copy Springer Verlag LNAI 2746 pp 160-173) Dresden Germany July 21-25 2003 Miller GA Nicely P (1955) An analysis of perceptual confusions among some English consonants JAcoustSocAm 272 Miller GA Leacock C Tengi R Bunker RT (1993) A Semantic Concordance In Proceedings of the 3 DARPA Workshop on Human Language Technology Miller GA (1995) WordNet A Lexical Database for English Communications of the ACM Vol 38 No 11 39-41 Mitchell TM SV Shinkareva A Carlson KM Chang VL Malave R A Mason and M A Just (2008) Predicting Human Brain Activity Associated with the Meanings of Nouns Science 320 1191 May 30 2008 DOI 101126science1152876 Murphy B Baroni M amp Poesio M (2009) EEG responds to conceptual stimuli and corpus semantics In Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing pages 619ndash627 Singapore 6-7 August 2009 c 2009 ACL and AFNLP Murphy B amp Poesio M (2010) Detecting semantic category in simultaneous EEGMEG recordings In First workshop on computational neurolinguistics NAACL HLT 2010 (pp 36ndash44) Los Angeles Association for Computational 
Linguistics Murphy B Poesio M Bovolo F Bruzzone L Dalponte M Lakany H (2011) EEG decoding of semantic category reveals distributed representations for single concepts Brain Lang 2011 Apr117(1)12-22 doi 101016jbandl201009013
Pedersen T, Patwardhan S, Michelizzi J (2004) WordNet::Similarity - Measuring the Relatedness of Concepts. In Proceedings of Fifth Annual Meeting of the North American Chapter of the Association for Computational Linguistics (NAACL-2004), pp 38-41, Boston, May 2004.
Pereira F, Botvinick M, Detre G (2010) Learning semantic features for fMRI data from definitional text. Proceedings of the NAACL HLT 2010 First Workshop on Computational Neurolinguistics, June 2010, Los Angeles, USA. Association for Computational Linguistics, 1-9. http://www.aclweb.org/anthology/W10-0601
Perreau-Guimaraes M, Wong DK, Uy ET, Grosenick L, Suppes P (2007) Single-trial classification of MEG recordings. IEEE Transactions on Biomedical Engineering 54: 436-443.
Rabinovitch I (1977) The Scott-Suppes theorem on semiorders. J Mathematical Psychology 15(2): 209-212. MR0437404.
Resnik P (1995) Using information content to evaluate semantic similarity. In Proceedings of the 14th International Joint Conference on Artificial Intelligence, pages 448-453, Montreal.
Roach BJ, Mathalon DH (2008) Event-related EEG time-frequency analysis: an overview of measures and an analysis of early gamma band phase locking in schizophrenia. Schizophr Bull 2008 Sep;34(5):907-26. Epub 2008 Aug 6.
Scott D, Suppes P (1958) Foundational aspects of theories of measurement. The Journal of Symbolic Logic 23: 113-128. MR0115919.
Suppes P (1972) Axiomatic Set Theory. New York: Dover.
Suppes P (1974) The axiomatic method in the empirical sciences. In L Henkin et al (Eds) Proceedings of the Tarski Symposium, Proceedings of Symposia in Pure Mathematics 25. Providence, RI: American Mathematical Society, pp 465-479.
Suppes P, Lu Z-L, Han B (1997) Brain wave recognition of words. Proceedings of the National Academy of Sciences 94: 14965-14969.
Suppes P, Han B, Lu Z-L (1998) Brain-wave recognition of sentences. Proceedings of the National Academy of Sciences 95: 15861-15866.
Suppes P, Han B, Epelboim J, Lu Z-L (1999a) Invariance between subjects of brain wave representations of language. Proceedings of the National Academy of Sciences 96: 12953-12958.
Suppes P, Han B, Epelboim J, Lu Z-L (1999b) Invariance of brain-wave representations of simple visual images and their names. Proceedings of the National Academy of Sciences 96: 14658-14663.
Suppes P, Han B (2000) Brain-wave representation of words by superposition of a few sine waves. Proceedings of the National Academy of Sciences 97: 8738-8743.
Suppes P, Perreau-Guimaraes M, Wong DK (2009) Partial Orders of Similarity Differences Invariant Between EEG-Recorded Brain and Perceptual Representations of Language. Neural Computation 21: 3228-3269.
Suppes P, de Barros JA, Oas G (2012) Phase-Oscillator computations as neural models of stimulus-response conditioning and response selection. Journal of Mathematical Psychology 56: 95-117.
Turney P (2001) Mining the Web for Synonyms: PMI-IR versus LSA on TOEFL. In L De Raedt & P Flach (Eds) Proceedings of the Twelfth European Conference on Machine Learning (ECML-2001) (pp 491-502). Freiburg, Germany.
Vassilieva E, Pinto G, de Barros JA, Suppes P (2011) Learning Pattern Recognition Through Quasi-Synchronization of Phase Oscillators. IEEE Transactions on Neural Networks Vol 22, No 1, January 2011.
Vigliocco G, Warren J, Siri S, Arciuli J, Scott S, Wise R (2006) The role of semantics and grammatical class in the neural representation of words. Cereb Cortex 2006 Dec;16(12):1790-6. Epub 2006 Jan 18.
Wong DK, Perreau-Guimaraes M, Uy ET, Suppes P (2004) Classification of individual trials based on the best independent component of EEG-recorded sentences. Neurocomputing 61: 479-484.
Wong DK, Uy ET, Perreau-Guimaraes M, Yang W, Suppes P (2006) Interpretation of perceptron weights as constructed time series for EEG classification. Neurocomputing 70: 373-383.
Wong DK, Grosenick L, Uy ET, Perreau-Guimaraes M, Carvalhaes CG, Desain P, Suppes P (2008) Quantifying inter-subject agreement in brain-imaging analyses. NeuroImage 39: 1051-1063.
Woods W (1975) What's in a link: Foundations for semantic networks. In Bobrow D, Collins A, eds. Representation and understanding. New York: Academic Press, 1975: 35-82.
Wu Z, Palmer M (1994) Verb Semantics and Lexical Selection. In Proceedings of the 32nd Annual Meeting of the Association for Computational Linguistics, Las Cruces, New Mexico, pp 133-138.
Can we train a classifier of EEG data so that we can predict which word, sentence, or word within a sentence the participant is seeing or hearing?
1997: Brain-wave recognition of words
1998: Brain-wave recognition of sentences
Can we train the classifier on some participants and use it to make predictions for other participants?
1999: Invariance between subjects of brain-wave representations of language
Can we train the classifier using visually presented words and use that classifier to make predictions for words presented auditorily (and vice versa)?
2004: Classification of individual trials based on the best independent component of EEG-recorded sentences
2005: Recognition of Words from the EEG Laplacian
2006: Multichannel classification of single EEG trials with independent component analysis
2007: Single-trial classification of MEG recordings
Can we train the classifier using words and use that classifier to make predictions for pictures depicting what the word refers to (and vice versa)?
1999: Invariance of brain-wave representations of simple visual images and their names
Answering yes to these questions establishes that we are recognizing the meaning of the word, or the idea or concept behind the word, and not, for instance, the sound of the word, its orthography, or its idiosyncratic use by one person.
EXPERIMENTAL SETUP for recognizing words within sentences
A computer presented 48 spoken sentences to each of 9 participants in 10 randomized blocks, with all 48 sentences in each block, so there were 480 trials for each participant.
The sentences were about the geography of Europe. Half were true and half false; half were positive and half negative.
The capital of Italy is Paris. (F)
Paris is not east of Berlin. (T)
Spain is west of Russia. (T)
Participants were asked to determine the truth or falsity of each statement while EEG recordings were made.
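X ∈ {Berlin, London, Moscow, Paris, Rome, Warsaw, Madrid, Vienna, Athens}
Y ∈ {France, Germany, Italy, Poland, Russia, Austria, Greece, Spain}
W ∈ {the capital, the largest city}
Z ∈ {north, south, east, west}
Sentence forms:
W of Y is [not] X
X is [not] W of Y
X is [not] Z of X (two cities, e.g. Paris is north of Berlin)
Y is [not] Z of Y (two countries, e.g. Spain is east of France)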
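The sentence forms above can be enumerated mechanically. The Python sketch below is illustrative only: the function and constant names are made up, and the rule that the two places in a direction sentence must differ is an assumption rather than something the slide states. It lists every sentence the forms allow, which is far more than the 48 used in the experiment.

```python
from itertools import permutations, product

CITIES = ["Berlin", "London", "Moscow", "Paris", "Rome",
          "Warsaw", "Madrid", "Vienna", "Athens"]
COUNTRIES = ["France", "Germany", "Italy", "Poland",
             "Russia", "Austria", "Greece", "Spain"]
ROLES = ["the capital", "the largest city"]
DIRECTIONS = ["north", "south", "east", "west"]
NEGATIONS = ["", "not "]  # positive and negative variants


def generate_sentences():
    """Enumerate all sentences allowed by the four sentence forms."""
    sentences = []
    for w, y, x, neg in product(ROLES, COUNTRIES, CITIES, NEGATIONS):
        # Form: W of Y is [not] X
        sentences.append(f"{w[0].upper()}{w[1:]} of {y} is {neg}{x}")
        # Form: X is [not] W of Y
        sentences.append(f"{x} is {neg}{w} of {y}")
    for places in (CITIES, COUNTRIES):
        # Forms: X is [not] Z of X, Y is [not] Z of Y
        # (assumption: the two places are always different)
        for (a, b), z, neg in product(permutations(places, 2), DIRECTIONS, NEGATIONS):
            sentences.append(f"{a} is {neg}{z} of {b}")
    return sentences


sentences = generate_sentences()
```

The 48 experimental sentences would be a hand-picked subset of this list, balanced for truth value and polarity; the selection procedure is not given in the slides.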
LANGUAGE
From http://www2.le.ac.uk
Paris is north of Berlin
The capital of Germany is Berlin
Spain is east of France
London is not the capital of France
Vienna is east of Moscow
The largest city of Poland is Athens
BRAIN
Fifteen seconds of EEG data. From http://sccn.ucsd.edu
What our work does...
Compares the brain data and the language data and finds structural similarities between them.
HOW DO WE REPRESENT THE EEG DATA?
HOW DO WE REPRESENT THE LANGUAGE DATA?
HOW DO WE COMPARE THE TWO?
Words: Berlin, London, Moscow, Paris, Rome, Warsaw, Madrid, Vienna, Athens, France, Germany, Italy, Poland, Russia, Austria, Greece, Spain, north, south, east, west
Paris is north of Berlin
Spain is east of France
The capital of Germany is Berlin
Paris
east of
Germany
Machine learning approach to the study of brain and language
For the 10 geography words London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia, we have 640 EEG data samples for each participant.
We want to classify these 640 samples into the correct 10 classes.
Use 580 brain samples and their associated words to train the classifier.
Test the remaining 60 samples.
Do this many times, each time using a different set of training samples.
Find the average classification rate and make sure it is statistically significant.
Obtained classification rates in the range 25% to 29%, with a mean classification rate of around 24.5% (p < 10E-10): significantly higher than chance (10%).
We used a 5-fold linear discriminant model, with principal component analysis for blind source separation, to classify the segments of EEG data obtained from the individual trials.
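The train/test procedure above can be sketched in a few lines of Python. This is not the authors' pipeline: the data are synthetic Gaussian blobs standing in for EEG feature vectors, and a nearest-class-mean classifier is swapped in for the linear discriminant model with PCA, purely to keep the example dependency-free. All function names are hypothetical.

```python
import random


def synthetic_samples(n_classes=10, per_class=64, dim=5, seed=1):
    """Made-up stand-in for EEG feature vectors: one Gaussian blob per word class."""
    rng = random.Random(seed)
    centers = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_classes)]
    return [
        ([c + rng.gauss(0, 1) for c in centers[label]], label)
        for label in range(n_classes)
        for _ in range(per_class)
    ]


def fit_nearest_mean(train):
    """Nearest-class-mean classifier (a simple stand-in, not the LDA + PCA model)."""
    sums, counts = {}, {}
    for features, label in train:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    means = {label: [v / counts[label] for v in acc] for label, acc in sums.items()}

    def predict(features):
        return min(
            means,
            key=lambda label: sum((a - b) ** 2 for a, b in zip(features, means[label])),
        )

    return predict


def mean_classification_rate(samples, n_test=60, n_rounds=20, seed=0):
    """Average accuracy over repeated random train/test splits (580/60 for 640 samples)."""
    rng = random.Random(seed)
    rates = []
    for _ in range(n_rounds):
        shuffled = samples[:]
        rng.shuffle(shuffled)
        test, train = shuffled[:n_test], shuffled[n_test:]
        predict = fit_nearest_mean(train)
        correct = sum(predict(features) == label for features, label in test)
        rates.append(correct / len(test))
    return sum(rates) / len(rates)


samples = synthetic_samples()        # 640 samples, 10 classes
rate = mean_classification_rate(samples)
```

The number of rounds (20) is an arbitrary choice for the sketch; the slides say only that the split is repeated "many times". A significance test against the 10% chance rate would follow as a separate step.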
Beyond machine learning ...
THEN look at the MIS-CLASSIFICATIONS and build a CONFUSION MATRIX.
Classify 640 brain data samples s1, s2, ..., s640 into 10 classes ω1, ω2, ..., ω10 of the finite set A.
M = (m_iq) is the confusion matrix for a given classification, where m_iq is the number of test samples from class ωi classified as belonging to class ωq.
(rows: true class ωi; columns: assigned class ωq)
          London  Moscow  Paris  north  south  east  west  Germany  Poland  Russia
London         8      14     11      7      6     3     3       10       8      10
Moscow         8      24     14      6      2     6     4        4       7      15
Paris          6      18     12      4      3     5     8        6      11       7
north          4       2      5     11      9     7    10        1       8       3
south          1       4      3     14     14     9    11        4       7       3
east           4       3      5      9     12    12     7        1       4       3
west           4       3      2     12     13    11    10        2       8       5
Germany        2       2      3      2      2     1     0        9      11       8
Poland         2       3      4      0      4     1     4        9       9       4
Russia         7       7      5      1      5     1     1        8       4      11
The relative frequencies computed from the m_iq are an N-by-N estimate for the conditional probability densities, designated by the matrix P = (p_iq), that a randomly chosen test sample from class ωi will be classified as belonging to class ωq:
$$p_{iq} \approx \frac{m_{iq}}{\sum_{q'=1}^{N} m_{iq'}}$$
Conditional probability density estimates from the classification of brain wave data for London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia
(rows: ωi; columns: ωq, predicted, i.e. "classified as")
          London  Moscow  Paris  north  south  east   west   Germany  Poland  Russia
London    0.25    0.163   0.138  0.075  0.038  0.05   0.063  0.088    0.075   0.063
Moscow    0.144   0.333   0.133  0.033  0.011  0.056  0.056  0.089    0.067   0.078
Paris     0.175   0.188   0.125  0.038  0.038  0.038  0.05   0.1      0.138   0.113
north     0.067   0.017   0.017  0.283  0.167  0.15   0.1    0.1      0.067   0.033
south     0.014   0.014   0.071  0.114  0.271  0.171  0.157  0.029    0.086   0.071
east      0.05    0       0      0.133  0.15   0.383  0.15   0.05     0       0.083
west      0.057   0.043   0      0.171  0.114  0.143  0.229  0.086    0.1     0.057
Germany   0.075   0.175   0.1    0.05   0.125  0.025  0.05   0.275    0.025   0.1
Poland    0.15    0.1     0.125  0.05   0.05   0      0.025  0.15     0.25    0.1
Russia    0.16    0.04    0.08   0.04   0.1    0.02   0.12   0.1      0.08    0.26
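Turning a confusion matrix into conditional probability estimates is a row normalization. The Python sketch below applies it to the confusion-matrix counts shown on the earlier slide; the function name is made up. The resulting values need not match the probability table above, which appears to come from a different classification run.

```python
WORDS = ["London", "Moscow", "Paris", "north", "south",
         "east", "west", "Germany", "Poland", "Russia"]

# Confusion-matrix counts from the slide: row = true class, column = assigned class.
CONFUSION = [
    [8, 14, 11, 7, 6, 3, 3, 10, 8, 10],
    [8, 24, 14, 6, 2, 6, 4, 4, 7, 15],
    [6, 18, 12, 4, 3, 5, 8, 6, 11, 7],
    [4, 2, 5, 11, 9, 7, 10, 1, 8, 3],
    [1, 4, 3, 14, 14, 9, 11, 4, 7, 3],
    [4, 3, 5, 9, 12, 12, 7, 1, 4, 3],
    [4, 3, 2, 12, 13, 11, 10, 2, 8, 5],
    [2, 2, 3, 2, 2, 1, 0, 9, 11, 8],
    [2, 3, 4, 0, 4, 1, 4, 9, 9, 4],
    [7, 7, 5, 1, 5, 1, 1, 8, 4, 11],
]


def conditional_probabilities(confusion):
    """Divide each row by its total: p_iq = m_iq / sum over q' of m_iq'."""
    probabilities = []
    for row in confusion:
        total = sum(row)
        probabilities.append([count / total if total else 0.0 for count in row])
    return probabilities


P = conditional_probabilities(CONFUSION)
```

Each row of P sums to 1, so row i can be read as the estimated distribution over assigned classes for samples whose true class is ωi.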
[Figure: hierarchical cluster tree. Axis ticks from 0.2 to 0.6; leaves, in order: London, Paris, Moscow, Germany, Poland, Russia, north, south, west, east.]
Hierarchical cluster tree (similarity tree) computed from the conditional probability density estimates for the classification of 640 brain wave samples for London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia.
LANGUAGE
WordNet: a lexical database of English organized around sets of cognitive synonyms, restricted here to the senses that are relevant to the geography sentences.
Latent Semantic Analysis (LSA): a statistical method to extract measures of word similarity from selected sets of documents such as novels, newspaper articles, and textbooks.
BRAIN
The conditional probability density estimates from the classification of brain wave data (the 10-by-10 table for London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia shown earlier).
WORDNET
WordNet is a human-annotated lexical database of English in which nouns, verbs, adjectives, and adverbs are grouped into sets of cognitive synonyms (called synsets), each synset expressing a distinct concept. See Miller (1995), Fellbaum (1998), and http://wordnet.princeton.edu.
The synsets are related to each other primarily through the hypernymy and hyponymy relations for nouns. Other relations in WordNet are part-whole (holonym), member of (meronym), has instance, and so on.
Hyponymy, often referred to as the is-a relation in computational discussions, is defined as follows: a concept represented by a lexical item Li is said to be a hyponym of the concept represented by a lexical item Lk if native speakers of English accept sentences of the form "An Li is a kind of Lk". Conversely, Lk is the hypernym of Li. A hypernym is therefore a more general concept and a hyponym a more specific concept.
WORDNET
Multiple word senses, organized into sets of cognitive synonyms called "synsets":
(n) Paris#1, City of Light#1, French capital#1, capital of France#1 (the capital and largest city of France; and international center of culture and commerce)
(n) Paris#2, genus Paris#1 (sometimes placed in subfamily Trilliaceae)
(n) Paris#3 ((Greek mythology) the prince of Troy who abducted Helen from her husband Menelaus and provoked the Trojan War)
(n) Paris#4 (a town in northeastern Texas)
(n) city#1, metropolis#1, urban center#1 (a large and densely populated urban area; may include several independent administrative districts) [relations listed: direct hyponym, full hyponym, part meronym, has instance]
(n) city#2 (an incorporated administrative district established by state charter)
(n) city#3, metropolis#2 (people living in a large densely populated municipality)
(n) east#1, due east#1, eastward#1, E#3 (the cardinal compass point that is at 90 degrees)
(n) East#2, Orient#1 (the countries of Asia)
(n) East#3, eastern United States#1 (the region of the United States lying to the north of the Ohio River and to the east of the Mississippi River)
(n) east#4 (the direction corresponding to the eastward cardinal compass point)
(n) east#5 (a location in the eastern part of a country, region, or city)
LATENT SEMANTIC ANALYSIS (LSA)
LSA is a statistical technique for extracting, from large collections of documents, a measure of how similar two words are to each other in terms of the patterns of their co-occurrences within those documents. See Deerwester et al. (1990), Landauer and Dumais (1997), Landauer et al. (1998).
The underlying idea is that if, for each word, you take into account all the contexts in which it does and does not appear, you get for all the words a set of mutual constraints that represent how similar any two words are to each other.
The similarity judgments produced by latent semantic analysis have been shown to correspond to some extent to human judgments of similarity. After training on about 2000 pages of English text, it scored as well as average test-takers on the synonym portion of the Test of English as a Foreign Language. After training on a psychology textbook, it achieved a passing score on a multiple-choice exam.
We used the application at http://lsa.colorado.edu to compute similarity matrices in term space for our set of words. The computation was based on about 38,000 college-level texts (novels, newspaper articles, ...). A maximum of 300 factors was permitted in the analysis.
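As a rough illustration of what a "similarity matrix in term space" is, the Python sketch below computes pairwise cosine similarities between word vectors. It covers only the final comparison step: real LSA first builds a word-by-document matrix from a corpus and reduces it with a singular value decomposition (here, to at most 300 factors), which is omitted. The three-dimensional vectors are invented for the example and are not taken from the LSA application.

```python
import math


def cosine(u, v):
    """Cosine of the angle between two vectors (0.0 if either vector is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Hypothetical reduced-space vectors, made up for illustration only.
VECTORS = {
    "north": [0.9, 0.1, 0.0],
    "south": [0.8, 0.2, 0.1],
    "Paris": [0.1, 0.9, 0.3],
}


def similarity_matrix(vectors):
    """Pairwise cosine similarities, returned as a nested dict keyed by word."""
    return {a: {b: cosine(va, vb) for b, vb in vectors.items()}
            for a, va in vectors.items()}


S = similarity_matrix(VECTORS)
```

With these made-up vectors, north and south come out far more similar to each other than either is to Paris, which is the qualitative pattern the LSA scores in the slides show.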
Semantic similarity matrix derived from LSA for the set of words London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia
          London  Moscow  Paris  north  south  east  west  Germany  Poland  Russia
London    1       0.17    0.37   0.13   0.12   0.12  0.14  0.17     0.16    0.16
Moscow    0.17    1       0.18   0.13   0.08   0.22  0.14  0.37     0.65    0.69
Paris     0.37    0.18    1      0.1    0.04   0.08  0.09  0.5      0.31    0.36
north     0.13    0.13    0.1    1      0.89   0.6   0.61  0.07     0.09    0.14
south     0.12    0.08    0.04   0.89   1      0.5   0.55  0.02     0.05    0.05
east      0.12    0.22    0.08   0.6    0.5    1     0.85  0.22     0.29    0.3
west      0.14    0.14    0.09   0.61   0.55   0.85  1     0.24     0.26    0.23
Germany   0.17    0.37    0.5    0.07   0.02   0.22  0.24  1        0.85    0.81
Poland    0.16    0.65    0.31   0.09   0.05   0.29  0.26  0.85     1       0.87
Russia    0.16    0.69    0.36   0.14   0.05   0.3   0.23  0.81     0.87    1
[Figure: hierarchical cluster tree. Axis ticks from 0.1 to 0.6; leaves, in order: east, west, north, south, London, Paris, Moscow, Germany, Poland, Russia.]
Hierarchical cluster tree computed from the pair-wise Latent Semantic Analysis (LSA) scores of similarity for London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia, based on about 38,000 college-level texts (novels, newspaper articles, ...).
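A cluster tree of this kind can be built by agglomerative clustering on distances derived from the similarity scores. The Python sketch below uses the LSA matrix from the slides, with two assumptions that the slides do not confirm: distance is taken as 1 minus similarity, and clusters are joined by average linkage. The function name is made up.

```python
WORDS = ["London", "Moscow", "Paris", "north", "south",
         "east", "west", "Germany", "Poland", "Russia"]

# LSA similarity matrix from the slides (rows and columns in WORDS order).
LSA = [
    [1, 0.17, 0.37, 0.13, 0.12, 0.12, 0.14, 0.17, 0.16, 0.16],
    [0.17, 1, 0.18, 0.13, 0.08, 0.22, 0.14, 0.37, 0.65, 0.69],
    [0.37, 0.18, 1, 0.1, 0.04, 0.08, 0.09, 0.5, 0.31, 0.36],
    [0.13, 0.13, 0.1, 1, 0.89, 0.6, 0.61, 0.07, 0.09, 0.14],
    [0.12, 0.08, 0.04, 0.89, 1, 0.5, 0.55, 0.02, 0.05, 0.05],
    [0.12, 0.22, 0.08, 0.6, 0.5, 1, 0.85, 0.22, 0.29, 0.3],
    [0.14, 0.14, 0.09, 0.61, 0.55, 0.85, 1, 0.24, 0.26, 0.23],
    [0.17, 0.37, 0.5, 0.07, 0.02, 0.22, 0.24, 1, 0.85, 0.81],
    [0.16, 0.65, 0.31, 0.09, 0.05, 0.29, 0.26, 0.85, 1, 0.87],
    [0.16, 0.69, 0.36, 0.14, 0.05, 0.3, 0.23, 0.81, 0.87, 1],
]


def average_linkage(names, similarity):
    """Agglomerative clustering; returns the merges as (distance, left, right)."""
    clusters = [[i] for i in range(len(names))]
    merges = []
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Average of (1 - similarity) over all cross-cluster pairs.
                pairs = [(i, j) for i in clusters[a] for j in clusters[b]]
                d = sum(1.0 - similarity[i][j] for i, j in pairs) / len(pairs)
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        merges.append((d,
                       sorted(names[i] for i in clusters[a]),
                       sorted(names[i] for i in clusters[b])))
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return merges


merges = average_linkage(WORDS, LSA)
```

Under these assumptions the first merge joins north and south (the most similar pair, 0.89), and the final merge joins the four compass directions with the six place names, in line with the leaf grouping recorded for the figure.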
LSA provides a straightforward measure of similarity between words.
For WORDNET, several different measures of similarity have been devised, e.g.:
Path length between synsets.
Information content: a corpus-based measure of the specificity of a concept, measured in terms of the frequency of occurrence of the concept in the corpus; the corpus is the human-annotated, sense-tagged SemCor (Miller et al. 1993), which links every word in the Brown Corpus to its appropriate WordNet sense. Scaled various ways.
Vector-space models: these work by forming second-order co-occurrence vectors from the WordNet definitions of concepts, known as glosses.
We used five measures in our computations of similarity and took the average score, using each of the relevant senses.
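To make one of these measures concrete, the Python sketch below computes the Wu and Palmer (1994) score, 2 x depth(lcs) / (depth(a) + depth(b)), where lcs is the deepest concept that subsumes both a and b and the root has depth 1. The tiny taxonomy is invented for illustration and is not WordNet's actual hierarchy; the averaging helper mirrors the "average score" step only in outline.

```python
# Hypothetical is-a hierarchy (child -> parent); NOT the real WordNet taxonomy.
PARENT = {
    "entity": None,
    "location": "entity",
    "region": "location",
    "city": "region",
    "country": "region",
    "Paris": "city",
    "London": "city",
    "Germany": "country",
}


def path_to_root(node):
    """Return [node, parent, ..., root]."""
    path = []
    while node is not None:
        path.append(node)
        node = PARENT[node]
    return path


def depth(node):
    """Depth in the hierarchy, counting the root as depth 1."""
    return len(path_to_root(node))


def wup(a, b):
    """Wu-Palmer similarity: 2 * depth(lcs) / (depth(a) + depth(b))."""
    ancestors_of_a = set(path_to_root(a))
    # Walking up from b, the first node shared with a is the deepest common subsumer.
    lcs = next(node for node in path_to_root(b) if node in ancestors_of_a)
    return 2 * depth(lcs) / (depth(a) + depth(b))


def average_score(scores):
    """Plain mean, e.g. over several measures or several sense pairs."""
    return sum(scores) / len(scores)


city_pair = wup("Paris", "London")       # share the parent "city"
mixed_pair = wup("Paris", "Germany")     # share only "region"
```

In this toy hierarchy two cities score higher than a city and a country, which is the same direction as the WordNet-based scores in the slides (London-Paris above London-Germany).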
Semantic similarity matrix derived from WordNet for the set of words London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia, using senses relevant to the geography of Europe and five measures of similarity: wup (path length), lin and jcn (information content), and gv and pgv (vector space measures)
          London  Moscow  Paris  north  south  east   west   Germany  Poland  Russia
London    1       0.396   0.466  0.106  0.103  0.076  0.078  0.322    0.299   0.303
Moscow    0.396   1       0.393  0.095  0.094  0.062  0.07   0.286    0.281   0.288
Paris     0.466   0.393   1      0.106  0.104  0.074  0.077  0.327    0.308   0.307
north     0.106   0.095   0.106  1      0.228  0.179  0.21   0.123    0.132   0.111
south     0.103   0.094   0.104  0.228  1      0.172  0.212  0.115    0.107   0.109
east      0.076   0.062   0.074  0.179  0.172  1      0.216  0.093    0.08    0.077
west      0.078   0.07    0.077  0.21   0.212  0.216  1      0.087    0.082   0.083
Germany   0.322   0.286   0.327  0.123  0.115  0.093  0.087  1        0.589   0.409
Poland    0.299   0.281   0.308  0.132  0.107  0.08   0.082  0.589    1       0.403
Russia    0.303   0.288   0.307  0.111  0.109  0.077  0.083  0.409    0.403   1
[Figure: hierarchical cluster tree. Axis ticks from 0.2 to 0.6; leaves, in order: Germany, Poland, Russia, London, Paris, Moscow, north, south, west, east.]
Hierarchical cluster tree computed from pairwise WordNet-based semantic similarity scores for London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia, restricted to senses related to the geography of Europe.
BACK TO THE BRAIN ...
London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia
Now let's see how to compare the EEG data and the language data...
Some of the similarity trees show remarkable congruence between the brain and semantic data.
Where exactly does that congruence lie?
Can we devise a quantitative measure of the nature and strength of that congruence?
[Figure: two hierarchical cluster trees, labeled LANGUAGE DATA and BRAIN DATA, each with axis ticks from 0.2 to 0.6. Leaf order of the first tree: London, Paris, Moscow, Germany, Poland, Russia, north, south, west, east. Leaf order of the second tree: Germany, Poland, Russia, London, Paris, Moscow, north, south, west, east.]
WordNet-based semantic similarities and EEG conditional probability estimates for London relative to London (L), Moscow (M), Paris (P), north (n), south (s), east (e), west (w), Germany (G), Poland (Po), and Russia (R).
The Spearman rank correlation for the two sequences in the figure is 0.99, with one-sided significance of 1.84E-10.
BRAIN DATA
For each word ω, we compute from the conditional probability density estimates a ternary relation R such that R(ω, ω1, ω2) if and only if, with respect to word ω, the conditional probability for word ω1 is smaller than the conditional probability for word ω2; that is, if and only if ω1's similarity difference with ω is smaller than ω2's similarity difference with ω.
R is an ordinal relation of similarity differences: a partial order that is irreflexive, asymmetric, and transitive.
LANGUAGE DATA
For each word ω, we compute from the semantic similarity matrix a ternary relation R such that R(ω, ω1, ω2) if and only if the similarity difference of ω1 with ω is smaller than the similarity difference of ω2 with ω; that is, ω1 is more similar to ω than is ω2.
R is an ordinal relation of similarity differences: a partial order that is irreflexive, asymmetric, and transitive.
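The language-data relation can be written out directly. The Python sketch below builds R from a three-word excerpt of the WordNet similarity matrix in the slides, reading "ω1 is more similar to ω than is ω2" as a strictly larger similarity score, and then checks the three stated properties for each reference word. Function names are made up.

```python
# Excerpt of the WordNet-based similarity matrix from the slides.
SIMILARITY = {
    "London": {"London": 1.0, "Paris": 0.466, "north": 0.106},
    "Paris": {"London": 0.466, "Paris": 1.0, "north": 0.106},
    "north": {"London": 0.106, "Paris": 0.106, "north": 1.0},
}


def ternary_relation(similarity):
    """All triples (w, w1, w2) such that w1 is strictly more similar to w than w2 is."""
    return {
        (w, w1, w2)
        for w, row in similarity.items()
        for w1 in row
        for w2 in row
        if row[w1] > row[w2]
    }


def is_strict_partial_order(triples, w, words):
    """Check that R(w, ., .) is irreflexive, asymmetric, and transitive."""
    pairs = {(a, b) for (x, a, b) in triples if x == w}
    irreflexive = all((a, a) not in pairs for a in words)
    asymmetric = all((b, a) not in pairs for (a, b) in pairs)
    transitive = all((a, d) in pairs
                     for (a, b) in pairs for (c, d) in pairs if b == c)
    return irreflexive and asymmetric and transitive


R = ternary_relation(SIMILARITY)
ALL_VALID = all(is_strict_partial_order(R, w, list(SIMILARITY)) for w in SIMILARITY)
```

Relative to north, London and Paris have equal scores (0.106), so R orders neither before the other; ties like this are why the result is a partial order rather than a total one.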
London
Language data        Brain data
London   1.000       London   0.275
Paris    0.466       Paris    0.133
Moscow   0.396       Moscow   0.108
Germany  0.322       Germany  0.075
Russia   0.303       north    0.042
Poland   0.299       Russia   0.033
north    0.106       Poland   0.025
south    0.103       east     0.008
west     0.078       south    0.000
east     0.076       west     0.000
Partial orders for London derived from the WordNet semantic similarities of Table 2 and the conditional probability estimates for the brain data of Table 5.
[Figure: the two partial orders drawn as diagrams. Node labels as extracted: Poland, north, south, west, east; and north, Poland, east, south, west.]
We follow the approach described in Suppes (1974) for the axiomatization of the theory of differences in utility preference or the theory of differences in psychological intensity.
Suppes P (1974) The axiomatic method in the empirical sciences. In L Henkin et al (Eds) Proceedings of the Tarski Symposium, Proceedings of Symposia in Pure Mathematics 25. Providence, RI: American Mathematical Society, pp 465-479.
Brain data: the relational structure (A, R) constructed from R and the finite set A of classes ω1, ω2, ..., ωN, together with the N partial orders constructed from the N-by-N estimate for the conditional probability densities.
Language data: the relational structure (A, R) constructed from R and the finite set A of classes ω1, ω2, ..., ωN, together with the N partial orders constructed from the N-by-N similarity matrix.
For each ωi, we compare the partial order of the brain data with the partial order of the language data using Spearman's rank correlation coefficient, which we interpret in the usual way to determine whether or not we have a statistically significant correlation.
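The comparison step can be illustrated with the London columns listed on the earlier slide. The Python sketch below computes Spearman's coefficient as the Pearson correlation of the ranks, giving tied values their average rank; the significance test is left out. Function names are made up, and the value obtained here need not match the 0.99 reported for the figure on another slide, which may rest on different estimates.

```python
WORDS = ["London", "Paris", "Moscow", "Germany", "Russia",
         "Poland", "north", "south", "west", "east"]
# London's scores from the slide, in WORDS order.
LANGUAGE = [1.000, 0.466, 0.396, 0.322, 0.303, 0.299, 0.106, 0.103, 0.078, 0.076]
BRAIN = [0.275, 0.133, 0.108, 0.075, 0.033, 0.025, 0.042, 0.000, 0.000, 0.008]


def average_ranks(values):
    """1-based ranks in ascending order; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1  # mean of positions pos+1 .. end+1
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mean_x, mean_y = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    spread_x = sum((a - mean_x) ** 2 for a in rx) ** 0.5
    spread_y = sum((b - mean_y) ** 2 for b in ry) ** 0.5
    return cov / (spread_x * spread_y)


rho = spearman(LANGUAGE, BRAIN)  # roughly 0.92 for these two columns
```

A coefficient this high over ten items would normally be judged significant, but the threshold and test the authors used are not stated on the slide.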
For those instances in which the brain and language partial orders are significantly correlated, we find the partial order that is invariant with respect to the brain and language data. Here are two more examples.
[Figure: invariant partial order over the nodes London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia.] Significant Invariance - Paris - Spearman 0.88 (p = 6.6795e-004)
[Figure: invariant partial order over the nodes London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia.] Significant Invariance - Paris - Spearman 0.90 (p = 3.8716e-004)
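One plausible reading of "the partial order that is invariant with respect to the brain and language data" is the set of orderings on which the two data sets agree, that is, the intersection of the two strict orders. The slides do not spell out the construction, so the Python sketch below is an assumption, as is the convention that a larger score places a word closer to the reference word. It uses the London columns from the earlier slide; function names are made up.

```python
# London's scores from the earlier slide (WordNet similarities and EEG estimates).
LANGUAGE = {"London": 1.000, "Paris": 0.466, "Moscow": 0.396, "Germany": 0.322,
            "Russia": 0.303, "Poland": 0.299, "north": 0.106, "south": 0.103,
            "west": 0.078, "east": 0.076}
BRAIN = {"London": 0.275, "Paris": 0.133, "Moscow": 0.108, "Germany": 0.075,
         "north": 0.042, "Russia": 0.033, "Poland": 0.025, "east": 0.008,
         "south": 0.000, "west": 0.000}


def strict_order(scores):
    """Pairs (a, b) where a has a strictly larger score than b."""
    return {(a, b) for a in scores for b in scores if scores[a] > scores[b]}


def invariant_order(first, second):
    """Orderings shared by both data sets (assumed meaning of 'invariant')."""
    return strict_order(first) & strict_order(second)


INVARIANT = invariant_order(LANGUAGE, BRAIN)
```

For London, the two sources agree that Paris comes before Moscow, but they disagree about north versus Russia and about east versus south and west, so those pairs drop out of the invariant order. The intersection of two strict partial orders is again irreflexive, asymmetric, and transitive.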
Brain data
[Figure: hierarchical cluster tree. Axis ticks from 0.2 to 0.6; leaves, in order: London, Paris, Moscow, Germany, Poland, Russia, north, south, west, east.]
Another hierarchical cluster tree (similarity tree) computed from the conditional probability density estimates for the classification of 640 brain wave samples for London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia.
Every single-trial classification produces its own conditional probability density estimates, giving rise to its own similarity tree ...
[Figure: hierarchical cluster tree. Axis ticks from 0.1 to 0.5; leaves, in order: north, east, south, west, London, Paris, Germany, Moscow, Russia, Poland.]
1. We compute M (= 30) single-trial classifications of the data (640 data samples for our 10 words) using random resampling with replacement.
2. For each classification, we find, for each word, the partial orders of the brain and language data that are significantly correlated.
3. For each of these highly correlated partial-order pairs, we find the partial order invariant with respect to both.
We performed 60 classifications; that is, we recomputed the classifications of the brain data using random resampling with replacement.
For half of these 60 classifications we compared the brain data to the WordNet data, and for the other half we compared the brain data to the LSA data.
And we plotted the results.
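Random resampling with replacement, as used in step 1, can be sketched as follows. The Python example draws M = 30 resampled index sets over 640 samples; whether the authors drew exactly 640 items per round is an assumption, and the function name is made up. Each index set would then feed one classification run.

```python
import random


def resampled_index_sets(n_samples=640, n_rounds=30, seed=0):
    """Draw n_rounds index lists, each sampled with replacement from the full set."""
    rng = random.Random(seed)
    return [rng.choices(range(n_samples), k=n_samples) for _ in range(n_rounds)]


index_sets = resampled_index_sets()
# Because sampling is with replacement, some samples repeat and others are left out.
distinct_counts = [len(set(indices)) for indices in index_sets]
```

Running this twice with different seeds would give the two batches of 30 classifications, one compared against the WordNet data and one against the LSA data.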
[Figure: bar chart titled "Partial Orders of Similarity Differences Invariant between EEG-recorded Brain Data and Semantic Representations of Language". Vertical axis: Significant Invariant Partial Orders (0 to 300). Horizontal axis: Test Participants S10, S24, S25, S27, S16, S26, S12, S13, S18. Series: WordNet, LSA.]
A higher number of significant structural similarities was found between the brain data and the WordNet data than between the brain data and the LSA data.
This stronger structural similarity between the brain data and the WordNet-derived data supports the contention that, during language comprehension for the complex cognitive task of assessing truth or falsity, the representation of words in the brain has a WordNet-like quality.
This stronger structural similarity is the first evidence we have seen of WordNet-like representations in the brain.
Banerjee S, Pedersen T (2003) Extended Gloss Overlaps as a Measure of Semantic Relatedness. In Proceedings of the Eighteenth International Joint Conference on Artificial Intelligence, pp 805-810, August 9-15, 2003, Acapulco, Mexico.
Banerjee S, Pedersen T (2002) An Adapted Lesk Algorithm for Word Sense Disambiguation using WordNet. In Proceedings of the Third International Conference on Intelligent Text Processing and Computational Linguistics, pp 136-145, February 17-23, 2002, Mexico City.
Baroni M, Murphy B, Barbu E, Poesio M (2010) Strudel: A Corpus-Based Semantic Model Based on Properties and Types. Cognitive Science - COGSCI, vol 34, no 2, pp 222-254, 2010. DOI 10.1111/j.1551-6709.2009.01068.x
Blei D, Ng A, Jordan M (2003) Latent Dirichlet allocation. Journal of Machine Learning Research 3: 993-1022.
Brants T, Franz A (2006) www.ldc.upenn.edu/Catalog/CatalogEntry.jsp?catalogId=LDC2006T13. Linguistic Data Consortium, Philadelphia.
Budanitsky A, Hirst G (2001) Semantic distance in WordNet: an experimental, application-oriented evaluation of five measures. In Proceedings of the NAACL 2001 Workshop on WordNet and Other Lexical Resources, Pittsburgh, June 2001. http://www.seas.smu.edu/~rada/mwnw/papers/WNW-NAACL-101.pdf.gz
Collins AM, Loftus EF (1975) A spreading-activation theory of semantic processing. Psychological Review 1975 Nov, Vol 82(6): 407-428.
Crestani F (1997) Application of Spreading Activation Techniques in Information Retrieval. Artificial Intelligence Review 11: 453-482.
Deerwester S, Dumais ST, Furnas GW, Landauer TK, Harshman R (1990) Indexing By Latent Semantic Analysis. Journal of the American Society for Information Science 41: 391-407.
Devereux B, Kelly C, Korhonen A (2010) Using fMRI activation to conceptual stimuli to evaluate methods for extracting conceptual representations from corpora. Proceedings of the NAACL HLT 2010 First Workshop on Computational Neurolinguistics, June 2010, Los Angeles, USA. Association for Computational Linguistics, 70-78.
Fellbaum C (Ed) (1998) WordNet: An Electronic Lexical Database. Cambridge, MA: MIT Press.
Hauk O, Davis MH, Ford M, Pulvermüller F, Marslen-Wilson WD (2006) The time course of visual word recognition as revealed by linear regression analysis of ERP data. Neuroimage 2006 May 1;30(4):1383-400. Epub 2006 Feb 7.
Hirst G (1988) Resolving lexical ambiguity computationally with spreading activation and Polaroid Words. In Small SL, Cottrell GW, Tanenhaus MK (editors) Lexical ambiguity resolution: Perspectives from psycholinguistics, neuropsychology and artificial intelligence. San Mateo, CA: Morgan Kaufmann Publishers, 1988, 73-107.
Jelodar AB, Alizadeh M, Khadivi S (2010) WordNet Based Features for Predicting Brain Activity associated with meanings of noun. Proceedings of the NAACL HLT 2010 First Workshop on Computational Neurolinguistics, June 2010, Los Angeles, USA. Association for Computational Linguistics, 18-26.
Jiang J, Conrath D (1997) Semantic similarity based on corpus statistics and lexical taxonomy. In Proceedings of International Conference on Research in Computational Linguistics, Taiwan, 19-33.
Just MA, Cherkassky VL, Aryal S, Mitchell TM (2010) A Neurosemantic Theory of Concrete Noun Representation Based on the Underlying Brain Codes. PLoS ONE 5(1): e8622. doi:10.1371/journal.pone.0008622
Karp RM (1972) Reducibility among combinatorial problems. In Complexity of Computer Computations, Proc Sympos IBM Thomas J Watson Res Center, Yorktown Heights, NY (Ed RE Miller and JW Thatcher). New York: Plenum, pp 85-103.
Kelly C, Devereux B, Korhonen A (2010) Acquiring Human-like Feature-Based Conceptual Representations from Corpora. Proceedings of the NAACL HLT 2010 First Workshop on Computational Neurolinguistics, June 2010, Los Angeles, USA. Association for Computational Linguistics, 61-69.
Kriegeskorte N, Mur M, Bandettini PA (2008) Representational similarity analysis - connecting the branches of systems neuroscience. Frontiers in Systems Neuroscience. doi:10.3389/neuro.06.004.2008
Kutas M, Hillyard SA (1984) Brain potentials during reading reflect word expectancy and semantic association. Nature 307: 161-163.
Kutas M, Federmeier KD (2011) Thirty years and counting: Finding meaning in the N400 component of the event related brain potential (ERP). Annual Review of Psychology 2011, 62, pp 621-647.
Landauer TK, Dumais ST (1997) A solution to Plato's problem: The Latent Semantic Analysis theory of the acquisition, induction, and representation of knowledge. Psychological Review 104: 211-240.
Landauer TK, Foltz PW, Laham D (1998) Introduction to Latent Semantic Analysis. Discourse Processes 25: 259-284.
Leacock C, Chodorow M (1998). Combining local context and WordNet similarity for word sense identification. In Fellbaum C (Ed), WordNet: An Electronic Lexical Database. Cambridge, MA: MIT Press, pp 265–283.
Leech G, Rayson P, Wilson A (2001). Word Frequencies in Written and Spoken English: based on the British National Corpus. London: Longman.
Lin D (1998). An information-theoretic definition of similarity. In Proc 15th International Conf on Machine Learning. San Francisco, CA: Morgan Kaufmann, pp 296–304.
Luce RD (1956). Semiorders and a theory of utility discrimination. Econometrica 24, 178–191. MR0078632. http://www.jstor.org/stable/1905751
Matsunaga T, Yonemori C, Tomita E, Muramatsu M (2009). Clique-based data mining for related genes in a biomedical database. BMC Bioinformatics 2009 Jul 1; 10:205.
Martin Ph (2003). Correction and Extension of WordNet 1.7. ICCS 2003, 11th International Conference on Conceptual Structures, Dresden, Germany, July 21-25 2003 (Springer Verlag, LNAI 2746, pp 160–173).
Miller GA, Nicely P (1955). An analysis of perceptual confusions among some English consonants. J Acoust Soc Am 27(2).
Miller GA, Leacock C, Tengi R, Bunker RT (1993). A Semantic Concordance. In Proceedings of the 3rd DARPA Workshop on Human Language Technology.
Miller GA (1995). WordNet: A Lexical Database for English. Communications of the ACM 38(11), 39–41.
Mitchell TM, Shinkareva SV, Carlson A, Chang KM, Malave VL, Mason RA, Just MA (2008). Predicting Human Brain Activity Associated with the Meanings of Nouns. Science 320, 1191, May 30 2008. doi:10.1126/science.1152876
Murphy B, Baroni M, Poesio M (2009). EEG responds to conceptual stimuli and corpus semantics. In Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing, pp 619–627, Singapore, 6-7 August 2009. ACL and AFNLP.
Murphy B, Poesio M (2010). Detecting semantic category in simultaneous EEG/MEG recordings. In First Workshop on Computational Neurolinguistics, NAACL HLT 2010, pp 36–44. Los Angeles: Association for Computational Linguistics.
Murphy B, Poesio M, Bovolo F, Bruzzone L, Dalponte M, Lakany H (2011). EEG decoding of semantic category reveals distributed representations for single concepts. Brain Lang 2011 Apr; 117(1): 12–22. doi:10.1016/j.bandl.2010.09.013
Pedersen T, Patwardhan S, Michelizzi J (2004). WordNet::Similarity - Measuring the Relatedness of Concepts. In Proceedings of Fifth Annual Meeting of the North American Chapter of the Association for Computational Linguistics (NAACL-2004), pp 38–41, Boston, May 2004.
Pereira F, Botvinick M, Detre G (2010). Learning semantic features for fMRI data from definitional text. Proceedings of the NAACL HLT 2010 First Workshop on Computational Neurolinguistics, June 2010, Los Angeles, USA. Association for Computational Linguistics, 1–9. http://www.aclweb.org/anthology/W10-0601
Perreau-Guimaraes M, Wong DK, Uy ET, Grosenick L, Suppes P (2007). Single-trial classification of MEG recordings. IEEE Transactions on Biomedical Engineering 54, 436–443.
Rabinovitch I (1977). The Scott-Suppes theorem on semiorders. J Mathematical Psychology 15(2), 209–212. MR0437404.
Resnik P (1995). Using information content to evaluate semantic similarity. In Proceedings of the 14th International Joint Conference on Artificial Intelligence, pp 448–453, Montreal.
Roach BJ, Mathalon DH (2008). Event-related EEG time-frequency analysis: an overview of measures and an analysis of early gamma band phase locking in schizophrenia. Schizophr Bull 2008 Sep; 34(5): 907–26. Epub 2008 Aug 6.
Scott D, Suppes P (1958). Foundational aspects of theories of measurement. The Journal of Symbolic Logic 23, 113–128. MR0115919.
Suppes P (1972). Axiomatic Set Theory. New York: Dover.
Suppes P (1974). The axiomatic method in the empirical sciences. In Henkin L et al (Eds), Proceedings of the Tarski Symposium, Proceedings of Symposia in Pure Mathematics 25. Providence, RI: American Mathematical Society, pp 465–479.
Suppes P, Lu Z-L, Han B (1997). Brain wave recognition of words. Proceedings of the National Academy of Sciences 94, 14965–14969.
Suppes P, Han B, Lu Z-L (1998). Brain-wave recognition of sentences. Proceedings of the National Academy of Sciences 95, 15861–15866.
Suppes P, Han B, Epelboim J, Lu Z-L (1999a). Invariance between subjects of brain wave representations of language. Proceedings of the National Academy of Sciences 96, 12953–12958.
Suppes P, Han B, Epelboim J, Lu Z-L (1999b). Invariance of brain-wave representations of simple visual images and their names. Proceedings of the National Academy of Sciences 96, 14658–14663.
Suppes P, Han B (2000). Brain-wave representation of words by superposition of a few sine waves. Proceedings of the National Academy of Sciences 97, 8738–8743.
Suppes P, Perreau-Guimaraes M, Wong DK (2009). Partial Orders of Similarity Differences Invariant Between EEG-Recorded Brain and Perceptual Representations of Language. Neural Computation 21, 3228–3269.
Suppes P, de Barros JA, Oas G (2012). Phase-Oscillator computations as neural models of stimulus-response conditioning and response selection. Journal of Mathematical Psychology 56, 95–117.
Turney P (2001). Mining the Web for Synonyms: PMI-IR versus LSA on TOEFL. In De Raedt L, Flach P (Eds), Proceedings of the Twelfth European Conference on Machine Learning (ECML-2001), pp 491–502. Freiburg, Germany.
Vassilieva E, Pinto G, de Barros JA, Suppes P (2011). Learning Pattern Recognition Through Quasi-Synchronization of Phase Oscillators. IEEE Transactions on Neural Networks 22(1), January 2011.
Vigliocco G, Warren J, Siri S, Arciuli J, Scott S, Wise R (2006). The role of semantics and grammatical class in the neural representation of words. Cereb Cortex 2006 Dec; 16(12): 1790–6. Epub 2006 Jan 18.
Wong DK, Perreau-Guimaraes M, Uy ET, Suppes P (2004). Classification of individual trials based on the best independent component of EEG-recorded sentences. Neurocomputing 61, 479–484.
Wong DK, Uy ET, Perreau-Guimaraes M, Yang W, Suppes P (2006). Interpretation of perceptron weights as constructed time series for EEG classification. Neurocomputing 70, 373–383.
Wong DK, Grosenick L, Uy ET, Perreau-Guimaraes M, Carvalhaes CG, Desain P, Suppes P (2008). Quantifying inter-subject agreement in brain-imaging analyses. NeuroImage 39, 1051–1063.
Woods W (1975). What's in a link: Foundations for semantic networks. In Bobrow D, Collins A (Eds), Representation and understanding. New York: Academic Press, 35–82.
Wu Z, Palmer M (1994). Verb Semantics and Lexical Selection. In Proceedings of the 32nd Annual Meeting of the Association for Computational Linguistics, Las Cruces, New Mexico, pp 133–138.
Can we train a classifier of EEG data so that we can predict which word, sentence, or word within a sentence the participant is seeing or hearing?
1997: Brain-wave recognition of words
1998: Brain-wave recognition of sentences
Can we train the classifier on some participants and use it to make predictions for other participants?
1999: Invariance between subjects of brain wave representations of language
Can we train the classifier using visually presented words and use that classifier to make predictions for words presented auditorily (and vice versa)?
2004: Classification of individual trials based on the best independent component of EEG-recorded sentences
2005: Recognition of Words from the EEG Laplacian
2006: Multichannel classification of single EEG trials with independent component analysis
2007: Single-trial classification of MEG recordings
Can we train the classifier using words and use that classifier to make predictions for pictures depicting what the word refers to (and vice versa)?
1999: Invariance of brain-wave representations of simple visual images and their names
Answering yes to these questions establishes that we are recognizing the meaning of the word, or the idea or concept behind the word, and not, for instance, the sound of the word, its orthography, or its idiosyncratic use by one person.
EXPERIMENTAL SETUP for recognizing words within sentences
A computer presented 48 spoken sentences to each of 9 participants in 10 randomized blocks, with all 48 sentences in each block, so there were 480 trials for each participant.
The sentences were about the geography of Europe. Half were true, half false; half positive, half negative.
The capital of Italy is Paris. (F)
Paris is not east of Berlin. (T)
Spain is west of Russia. (T)
Participants were asked to determine the truth or falsity of each statement while EEG recordings were made.
X ∈ {Berlin, London, Moscow, Paris, Rome, Warsaw, Madrid, Vienna, Athens}
Y ∈ {France, Germany, Italy, Poland, Russia, Austria, Greece, Spain}
W ∈ {the capital, the largest city}
Z ∈ {north, south, east, west}
Sentence forms:
W of Y is [not] X
X is [not] W of Y
X is [not] Z of X
Y is [not] Z of Y
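The sentence forms above can be expanded mechanically. The following Python sketch enumerates the candidate sentences they license. The function name `generate_sentences` is hypothetical, and skipping sentences that relate a place to itself is an assumption made for illustration, not a detail stated in the slides.

```python
from itertools import product

CITIES = ["Berlin", "London", "Moscow", "Paris", "Rome",
          "Warsaw", "Madrid", "Vienna", "Athens"]             # X
COUNTRIES = ["France", "Germany", "Italy", "Poland", "Russia",
             "Austria", "Greece", "Spain"]                    # Y
ROLES = ["the capital", "the largest city"]                   # W
DIRECTIONS = ["north", "south", "east", "west"]               # Z


def generate_sentences():
    """Enumerate sentences licensed by the four forms (hypothetical helper)."""
    sentences = []
    for neg in ("", "not "):
        for w, y, x in product(ROLES, COUNTRIES, CITIES):
            # W of Y is [not] X
            sentences.append(f"{w.capitalize()} of {y} is {neg}{x}")
            # X is [not] W of Y
            sentences.append(f"{x} is {neg}{w} of {y}")
        # X is [not] Z of X, and Y is [not] Z of Y
        for places in (CITIES, COUNTRIES):
            for a, z, b in product(places, DIRECTIONS, places):
                if a != b:  # assumption: a place is not compared with itself
                    sentences.append(f"{a} is {neg}{z} of {b}")
    return sentences


sentences = generate_sentences()
print(len(sentences), "candidate sentences")
```

The experiment used 48 sentences, so the stimuli were a selection from a much larger space of this kind; how that selection was made is not stated here.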
LANGUAGE
Paris is north of Berlin
The capital of Germany is Berlin
Spain is east of France
London is not the capital of France
Vienna is east of Moscow
The largest city of Poland is Athens
BRAIN
[Figure: fifteen seconds of EEG data, from http://sccn.ucsd.edu; accompanying image from http://www2.le.ac.uk]
What our work does…
Compares the brain data and the language data and finds structural similarities between them.
HOW DO WE REPRESENT THE EEG DATA?
HOW DO WE REPRESENT THE LANGUAGE DATA?
HOW DO WE COMPARE THE TWO?
Berlin, London, Moscow, Paris, Rome, Warsaw, Madrid, Vienna, Athens, France, Germany, Italy, Poland, Russia, Austria, Greece, Spain, north, south, east, west
Paris is north of Berlin
Spain is east of France
The capital of Germany is Berlin
Paris, east of, Germany
Machine learning approach to the study of brain and language
For the 10 geography words London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia we have 640 EEG data samples for each participant.
We want to classify these 640 samples into the correct 10 classes.
640 EEG samples for each participant
Use 580 brain samples and their associated words to train the classifier.
Test the remaining 60 samples.
Do this many times, each time using a different set of training samples.
Find the average classification rate; make sure it is statistically significant.
Obtained classification rates in the range 25% to 29%, with a mean classification rate of around 24.5% (p < 10E-10). Significantly higher than chance (10%).
We used a 5-fold linear discriminant model, with principal component analysis for blind source separation, to classify the segments of EEG data obtained from the individual trials.
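One common way to check that a classification rate beats chance is a one-sided binomial test. The slides do not say which test was used, so the following Python sketch is only an illustration under two assumptions: the test samples are independent, and all 10 classes are equally likely by chance. The helper `binomial_tail` and the count of 157 correct samples are hypothetical and are not figures from the study.

```python
from math import comb


def binomial_tail(successes, trials, chance):
    """P(X >= successes) for X ~ Binomial(trials, chance)."""
    return sum(
        comb(trials, k) * chance**k * (1 - chance) ** (trials - k)
        for k in range(successes, trials + 1)
    )


# Illustrative numbers only: 640 test samples, 10 equally likely classes,
# and a hypothetical count of correctly classified samples.
n_samples = 640
chance_rate = 1 / 10
n_correct = 157  # hypothetical; roughly a 24.5% classification rate

p_value = binomial_tail(n_correct, n_samples, chance_rate)
print(f"rate = {n_correct / n_samples:.3f}, one-sided p = {p_value:.3e}")
```

An exact tail sum is used here because it needs only the standard library; a normal approximation would serve equally well at this sample size.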
Beyond machine learning …
THEN look at the MIS-CLASSIFICATIONS and build a CONFUSION MATRIX
Classify 640 brain data samples $s_1, s_2, \ldots, s_{640}$ into 10 classes $\omega_1, \omega_2, \ldots, \omega_{10}$ of the finite set $A$.
$M = (m_{iq})$ is the confusion matrix for a given classification, where $m_{iq}$ is the number of test samples from class $\omega_i$ classified as belonging to class $\omega_q$.
class     London  Moscow   Paris   north   south    east    west Germany  Poland  Russia
London         8      14      11       7       6       3       3      10       8      10
Moscow         8      24      14       6       2       6       4       4       7      15
Paris          6      18      12       4       3       5       8       6      11       7
north          4       2       5      11       9       7      10       1       8       3
south          1       4       3      14      14       9      11       4       7       3
east           4       3       5       9      12      12       7       1       4       3
west           4       3       2      12      13      11      10       2       8       5
Germany        2       2       3       2       2       1       0       9      11       8
Poland         2       3       4       0       4       1       4       9       9       4
Russia         7       7       5       1       5       1       1       8       4      11
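A confusion matrix of this kind is simply a tally of (true class, assigned class) pairs. The Python sketch below shows the bookkeeping on a small invented example. The function name `confusion_matrix` and the toy labels are illustrative assumptions, not data from the experiment.

```python
def confusion_matrix(true_labels, predicted_labels, classes):
    """Return M where M[i][q] counts samples of class i classified as class q."""
    index = {label: k for k, label in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for truth, guess in zip(true_labels, predicted_labels):
        matrix[index[truth]][index[guess]] += 1
    return matrix


# Toy data, invented for illustration (not taken from the experiment).
classes = ["London", "Moscow", "north"]
truth = ["London", "London", "Moscow", "north", "north", "Moscow"]
guess = ["London", "Moscow", "Moscow", "north", "London", "Moscow"]

M = confusion_matrix(truth, guess, classes)
for label, row in zip(classes, M):
    print(f"{label:>8}", row)
```

Each row sums to the number of test samples of that class, which is why the rows of the 10-word matrix have different totals.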
The relative frequencies $m_{iq} / \sum_{j} m_{ij}$ are an N-by-N estimate for the conditional probability densities, designated by the matrix $P = (p_{iq})$, that a randomly chosen test sample from class $\omega_i$ will be classified as belonging to class $\omega_q$.
Conditional probability density estimates from the classification of brain wave data for London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia.
Rows: $\omega_i$. Columns: $\omega_q$, predicted (classified as).
class     London  Moscow   Paris   north   south    east    west Germany  Poland  Russia
London      0.25   0.163   0.138   0.075   0.038    0.05   0.063   0.088   0.075   0.063
Moscow     0.144   0.333   0.133   0.033   0.011   0.056   0.056   0.089   0.067   0.078
Paris      0.175   0.188   0.125   0.038   0.038   0.038    0.05     0.1   0.138   0.113
north      0.067   0.017   0.017   0.283   0.167    0.15     0.1     0.1   0.067   0.033
south      0.014   0.014   0.071   0.114   0.271   0.171   0.157   0.029   0.086   0.071
east        0.05       0       0   0.133    0.15   0.383    0.15    0.05       0   0.083
west       0.057   0.043       0   0.171   0.114   0.143   0.229   0.086     0.1   0.057
Germany    0.075   0.175     0.1    0.05   0.125   0.025    0.05   0.275   0.025     0.1
Poland      0.15     0.1   0.125    0.05    0.05       0   0.025    0.15    0.25     0.1
Russia      0.16    0.04    0.08    0.04     0.1    0.02    0.12     0.1    0.08    0.26
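Turning counts into conditional probability estimates is a row normalization: each entry is divided by its row total. The Python sketch below applies this to the confusion matrix shown on the earlier slide. The function name `conditional_probabilities` is hypothetical, and the printed values need not match the table above, which may come from a different classification run.

```python
WORDS = ["London", "Moscow", "Paris", "north", "south",
         "east", "west", "Germany", "Poland", "Russia"]

# Confusion matrix M = (m_iq) from the earlier slide: row i is the true
# class, column q is the class assigned by the classifier.
M = [
    [8, 14, 11, 7, 6, 3, 3, 10, 8, 10],
    [8, 24, 14, 6, 2, 6, 4, 4, 7, 15],
    [6, 18, 12, 4, 3, 5, 8, 6, 11, 7],
    [4, 2, 5, 11, 9, 7, 10, 1, 8, 3],
    [1, 4, 3, 14, 14, 9, 11, 4, 7, 3],
    [4, 3, 5, 9, 12, 12, 7, 1, 4, 3],
    [4, 3, 2, 12, 13, 11, 10, 2, 8, 5],
    [2, 2, 3, 2, 2, 1, 0, 9, 11, 8],
    [2, 3, 4, 0, 4, 1, 4, 9, 9, 4],
    [7, 7, 5, 1, 5, 1, 1, 8, 4, 11],
]


def conditional_probabilities(matrix):
    """Row-normalize counts: p_iq = m_iq / (sum of row i)."""
    result = []
    for row in matrix:
        total = sum(row)
        # Guard against an empty class so the sketch never divides by zero.
        result.append([count / total if total else 0.0 for count in row])
    return result


P = conditional_probabilities(M)
for word, row in zip(WORDS, P):
    print(f"{word:>8}", " ".join(f"{p:.3f}" for p in row))
```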
[Figure: hierarchical cluster tree (similarity tree) computed from the conditional probability density estimates for the classification of 640 brain wave samples for London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia. Leaves in plotted order: London, Paris, Moscow, Germany, Poland, Russia, north, south, west, east; axis ticks 0.2 to 0.6.]
WordNet: a lexical database of English organized around sets of cognitive synonyms, restricted to the senses that are relevant to the geography sentences.
Latent Semantic Analysis (LSA): a statistical method to extract measures of word similarity from selected sets of documents such as novels, newspaper articles, textbooks.
LANGUAGE BRAIN
WordNet is a human-annotated lexical database of English in which nouns, verbs, adjectives and adverbs are grouped into sets of cognitive synonyms (called synsets), each synset expressing a distinct concept. See Miller (1995), Fellbaum (1998) and http://wordnet.princeton.edu
The synsets are related to each other primarily through the hypernymy and hyponymy relations for nouns. Other relations in WordNet are part-whole (holonym), member of (meronym), has instance, and so on.
Hyponymy, often referred to as the is-a relation in computational discussions, is defined as follows: a concept represented by a lexical item Li is said to be a hyponym of the concept represented by a lexical item Lk if native speakers of English accept sentences of the form "An Li is a kind of Lk". Conversely, Lk is the hypernym of Li. A hypernym is therefore a more general concept and a hyponym a more specific concept.
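The hypernym/hyponym relation can be pictured as links from each concept to a more general one. The Python sketch below follows such links in a tiny invented hierarchy. The entries in `HYPERNYM` are made up for illustration and are not copied from WordNet, and the helper names are hypothetical.

```python
# Toy hypernym links (child -> parent), invented for illustration;
# these are not entries copied from WordNet.
HYPERNYM = {
    "national capital": "city",
    "city": "municipality",
    "municipality": "region",
    "country": "region",
    "region": "location",
}


def hypernym_chain(concept):
    """List the successively more general concepts above `concept`."""
    chain = []
    while concept in HYPERNYM:
        concept = HYPERNYM[concept]
        chain.append(concept)
    return chain


def is_hyponym_of(specific, general):
    """True if `general` is reachable from `specific` via hypernym links."""
    return general in hypernym_chain(specific)


print(hypernym_chain("national capital"))
print(is_hyponym_of("city", "location"), is_hyponym_of("location", "city"))
```

The relation is directional: "a city is a kind of location" holds in this toy hierarchy, while the reverse does not.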
WORDNET
WORDNET
(n) Paris#1, City of Light#1, French capital#1, capital of France#1 (the capital and largest city of France, and international center of culture and commerce)
(n) Paris#2, genus Paris#1 (sometimes placed in subfamily Trilliaceae)
(n) Paris#3 ((Greek mythology) the prince of Troy who abducted Helen from her husband Menelaus and provoked the Trojan War)
(n) Paris#4 (a town in northeastern Texas)
Multiple word senses
Organized into sets of cognitive synonyms called "synsets"
(n) city#1, metropolis#1, urban center#1 (a large and densely populated urban area; may include several independent administrative districts) [direct hyponym, full hyponym, part meronym, has instance]
(n) city#2 (an incorporated administrative district established by state charter)
(n) city#3, metropolis#2 (people living in a large densely populated municipality)
(n) east#1, due east#1, eastward#1, E#3 (the cardinal compass point that is at 90 degrees)
(n) East#2, Orient#1 (the countries of Asia)
(n) East#3, eastern United States#1 (the region of the United States lying to the north of the Ohio River and to the east of the Mississippi River)
(n) east#4 (the direction corresponding to the eastward cardinal compass point)
(n) east#5 (a location in the eastern part of a country, region, or city)
LATENT SEMANTIC ANALYSIS (LSA)
LSA is a statistical technique for extracting, from large collections of documents, a measure of how similar two words are to each other in terms of the patterns of their co-occurrences within those documents. See Deerwester et al. (1990), Landauer and Dumais (1997), Landauer et al. (1998).
The underlying idea is that if, for each word, you take into account all the contexts in which it does and does not appear, you get for all the words a set of mutual constraints that represent how similar any two words are to each other.
The similarity judgments produced by latent semantic analysis have been shown to correspond to some extent to human judgments of similarity. After training on about 2000 pages of English text, it scored as well as average test-takers on the synonym portion of the Test of English as a Foreign Language. After training on a psychology textbook, it achieved a passing score on a multiple-choice exam.
We used the application at http://lsa.colorado.edu to compute similarity matrices in term space for our set of words. The computation was based on ~38000 college-level texts (novels, newspaper articles, …). A maximum of 300 factors was permitted in the analysis.
word      London  Moscow   Paris   north   south    east    west Germany  Poland  Russia
London         1    0.17    0.37    0.13    0.12    0.12    0.14    0.17    0.16    0.16
Moscow      0.17       1    0.18    0.13    0.08    0.22    0.14    0.37    0.65    0.69
Paris       0.37    0.18       1     0.1    0.04    0.08    0.09     0.5    0.31    0.36
north       0.13    0.13     0.1       1    0.89     0.6    0.61    0.07    0.09    0.14
south       0.12    0.08    0.04    0.89       1     0.5    0.55    0.02    0.05    0.05
east        0.12    0.22    0.08     0.6     0.5       1    0.85    0.22    0.29     0.3
west        0.14    0.14    0.09    0.61    0.55    0.85       1    0.24    0.26    0.23
Germany     0.17    0.37     0.5    0.07    0.02    0.22    0.24       1    0.85    0.81
Poland      0.16    0.65    0.31    0.09    0.05    0.29    0.26    0.85       1    0.87
Russia      0.16    0.69    0.36    0.14    0.05     0.3    0.23    0.81    0.87       1
Semantic similarity matrix derived from LSA for the set of words London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia.
[Figure: hierarchical cluster tree computed from the pair-wise Latent Semantic Analysis (LSA) scores of similarity for London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia, based on ~38000 college-level texts (novels, newspaper articles, …). Leaves in plotted order: east, west, north, south, London, Paris, Moscow, Germany, Poland, Russia; axis ticks 0.1 to 0.6.]
LSA provides a straightforward measure of similarity between words.
For WORDNET, several different measures of similarity have been devised, e.g.:
Path length between synsets.
Information content: a corpus-based measure of the specificity of a concept, measured in terms of the frequency of occurrence of the concept in the corpus (the human-annotated, sense-tagged corpus SemCor (Miller et al 1993), which links every word in the Brown Corpus to its appropriate WordNet sense). Scaled various ways.
Vector-space models: these work by forming second-order co-occurrence vectors from the WordNet definitions of concepts, known as glosses.
We used five measures in our computations of similarity and took the average score, using each of the relevant senses.
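To make the path-length idea concrete, here is a Python sketch of the Wu and Palmer (1994) score, which compares the depth of the deepest concept shared by two concepts with their own depths. The taxonomy in `PARENT` is a toy tree invented for illustration, not WordNet itself, and the helper names are hypothetical. The sketch also assumes every concept has a single parent and that all concepts share one root.

```python
# Toy taxonomy (child -> parent), invented for illustration only.
PARENT = {
    "location": "entity",
    "region": "location",
    "city": "region",
    "country": "region",
    "direction": "entity",
    "cardinal point": "direction",
}


def path_to_root(concept):
    """Return [concept, parent, ..., root]."""
    path = [concept]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path


def depth(concept):
    """Depth of a concept, counting the root as depth 1."""
    return len(path_to_root(concept))


def wup_similarity(a, b):
    """Wu-Palmer score: 2 * depth(lcs) / (depth(a) + depth(b))."""
    ancestors_of_a = path_to_root(a)
    # Walking up from b, the first concept also above a is the deepest shared one.
    lcs = next(c for c in path_to_root(b) if c in ancestors_of_a)
    return 2 * depth(lcs) / (depth(a) + depth(b))


print(wup_similarity("city", "country"))
print(wup_similarity("city", "cardinal point"))
```

Concepts that share a deep common ancestor score close to 1, while concepts that meet only at the root score low, which is the pattern the path-based measures reward.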
Semantic similarity matrix derived from WordNet for the set of words London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia, using senses relevant to the geography of Europe and five measures of similarity: wup (path length), lin and jcn (information content), and gv and pgv (vector space measures).
word      London  Moscow   Paris   north   south    east    west Germany  Poland  Russia
London         1   0.396   0.466   0.106   0.103   0.076   0.078   0.322   0.299   0.303
Moscow     0.396       1   0.393   0.095   0.094   0.062    0.07   0.286   0.281   0.288
Paris      0.466   0.393       1   0.106   0.104   0.074   0.077   0.327   0.308   0.307
north      0.106   0.095   0.106       1   0.228   0.179    0.21   0.123   0.132   0.111
south      0.103   0.094   0.104   0.228       1   0.172   0.212   0.115   0.107   0.109
east       0.076   0.062   0.074   0.179   0.172       1   0.216   0.093    0.08   0.077
west       0.078    0.07   0.077    0.21   0.212   0.216       1   0.087   0.082   0.083
Germany    0.322   0.286   0.327   0.123   0.115   0.093   0.087       1   0.589   0.409
Poland     0.299   0.281   0.308   0.132   0.107    0.08   0.082   0.589       1   0.403
Russia     0.303   0.288   0.307   0.111   0.109   0.077   0.083   0.409   0.403       1
[Figure: hierarchical cluster tree computed from pairwise WordNet-based semantic similarity scores for London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia, restricted to senses related to the geography of Europe. Leaves in plotted order: Germany, Poland, Russia, London, Paris, Moscow, north, south, west, east; axis ticks 0.2 to 0.6.]
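A similarity tree of this kind can be built by repeatedly merging the two closest clusters. The Python sketch below does this for six of the ten words, using scores from the WordNet-based similarity matrix. The slides do not state which linkage rule or distance was used, so average linkage with distance = 1 − similarity is an assumption, and the function names are hypothetical.

```python
from itertools import combinations

# Subset of the WordNet-based similarity scores from the matrix shown earlier.
SIMILARITY = {
    ("London", "Paris"): 0.466, ("London", "Germany"): 0.322,
    ("London", "Poland"): 0.299, ("London", "north"): 0.106,
    ("London", "south"): 0.103, ("Paris", "Germany"): 0.327,
    ("Paris", "Poland"): 0.308, ("Paris", "north"): 0.106,
    ("Paris", "south"): 0.104, ("Germany", "Poland"): 0.589,
    ("Germany", "north"): 0.123, ("Germany", "south"): 0.115,
    ("Poland", "north"): 0.132, ("Poland", "south"): 0.107,
    ("north", "south"): 0.228,
}


def similarity(a, b):
    """Look up a symmetric similarity score."""
    return SIMILARITY.get((a, b), SIMILARITY.get((b, a)))


def average_linkage(words):
    """Agglomerative clustering; returns the merges in the order they occur."""
    clusters = [(w,) for w in words]
    merges = []
    while len(clusters) > 1:
        def distance(pair):
            left, right = pair
            # Assumed distance: 1 - similarity, averaged over all cross pairs.
            scores = [1 - similarity(a, b) for a in left for b in right]
            return sum(scores) / len(scores)

        left, right = min(combinations(clusters, 2), key=distance)
        merges.append((left, right, round(distance((left, right)), 3)))
        clusters = [c for c in clusters if c not in (left, right)]
        clusters.append(left + right)
    return merges


merges = average_linkage(["London", "Paris", "Germany", "Poland", "north", "south"])
for left, right, height in merges:
    print(left, "+", right, "at distance", height)
```

In practice a library routine such as `scipy.cluster.hierarchy.linkage` would normally be used; the hand-written loop is only meant to show the merging logic.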
BACK TO THE BRAIN …
London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia
Now let's see how to compare the EEG data and the language data…
Some of the similarity trees show remarkable congruence between the brain and semantic data.
Where exactly does that congruence lie?
Can we devise a quantitative measure of the nature and strength of that congruence?
[Figure: two similarity trees, labeled LANGUAGE DATA and BRAIN DATA, each with axis ticks 0.2 to 0.6. Leaf orders as plotted: London, Paris, Moscow, Germany, Poland, Russia, north, south, west, east; and Germany, Poland, Russia, London, Paris, Moscow, north, south, west, east.]
[Figure: WordNet-based semantic similarities and EEG conditional probability estimates for London relative to London (L), Moscow (M), Paris (P), north (n), south (s), east (e), west (w), Germany (G), Poland (Po) and Russia (R).]
The Spearman rank correlation for the two sequences in the figure is 0.99, with one-sided significance of 1.84E-10.
For each word $\omega$ we compute from the conditional probability density estimates a ternary relation $R$ such that $R(\omega, \omega_1, \omega_2)$ if and only if, with respect to word $\omega$, the conditional probability for word $\omega_1$ is smaller than the conditional probability for word $\omega_2$, that is, if and only if $\omega_1$'s similarity difference with $\omega$ is smaller than $\omega_2$'s similarity difference with $\omega$.
$R$ is an ordinal relation of similarity differences: a partial order that is irreflexive, asymmetric and transitive.
BRAIN DATA
For each word $\omega$ we compute from the semantic similarity matrix a ternary relation $R$ such that $R(\omega, \omega_1, \omega_2)$ if and only if the similarity difference of $\omega_1$ with $\omega$ is smaller than the similarity difference of $\omega_2$ with $\omega$, that is, $\omega_1$ is more similar to $\omega$ than is $\omega_2$.
$R$ is an ordinal relation of similarity differences: a partial order that is irreflexive, asymmetric and transitive.
LANGUAGE DATA
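For the language data, the relation R can be read directly off one row of a similarity matrix: with respect to a reference word, one word precedes another when it is more similar to the reference. The Python sketch below uses London's row of the WordNet-based similarity matrix shown earlier. The helper names `holds` and `ordered_pairs` are hypothetical.

```python
# London's row of the WordNet-based similarity matrix shown earlier
# (London itself is left out of the comparison).
LONDON_ROW = {
    "Moscow": 0.396, "Paris": 0.466, "north": 0.106, "south": 0.103,
    "east": 0.076, "west": 0.078, "Germany": 0.322, "Poland": 0.299,
    "Russia": 0.303,
}


def holds(row, w1, w2):
    """R(w, w1, w2): w1 is more similar to the reference word w than w2 is."""
    return row[w1] > row[w2]


def ordered_pairs(row):
    """All pairs (w1, w2) for which R(w, w1, w2) holds."""
    return {(a, b) for a in row for b in row if holds(row, a, b)}


pairs = ordered_pairs(LONDON_ROW)
ranking = sorted(LONDON_ROW, key=LONDON_ROW.get, reverse=True)
print(ranking)
print(len(pairs), "ordered pairs")
```

Because the comparison is strict, no word is related to itself and no pair is related in both directions, which is what makes the relation irreflexive and asymmetric; tied scores would simply leave a pair unordered.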
London
Language data        Brain data
London   1.000       London   0.275
Paris    0.466       Paris    0.133
Moscow   0.396       Moscow   0.108
Germany  0.322       Germany  0.075
Russia   0.303       north    0.042
Poland   0.299       Russia   0.033
north    0.106       Poland   0.025
south    0.103       east     0.008
west     0.078       south    0.000
east     0.076       west     0.000
Partial orders for London derived from the WordNet semantic similarities of Table 2 and the conditional probability estimates for the brain data of Table 5.
[Figure: diagrams of the two partial orders; recoverable node labels: Poland, north, south, west, east (language data) and north, Poland, east, south, west (brain data).]
Following the approach described in Suppes (1974) for the axiomatization of the theory of differences in utility preference or the theory of differences in psychological intensity:
Brain data: the relational structure $(A, R)$ constructed from $R$ and the finite set $A$ of classes $\omega_1, \omega_2, \ldots, \omega_N$, together with the N partial orders constructed from the N-by-N estimate for the conditional probability densities.
Language data: the relational structure $(A, R)$ constructed from $R$ and the finite set $A$ of classes $\omega_1, \omega_2, \ldots, \omega_N$, together with the N partial orders constructed from the N-by-N similarity matrix.
For each $\omega_1$ we compare the partial order of the brain data with the partial order of the language data using Spearman's rank correlation coefficient, which we interpret in the usual way to determine if we have a statistically significant correlation or not.
Suppes P (1974). The axiomatic method in the empirical sciences. In Henkin L et al (Eds), Proceedings of the Tarski Symposium, Proceedings of Symposia in Pure Mathematics 25. Providence, RI: American Mathematical Society, pp 465–479.
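Spearman's coefficient is the ordinary correlation computed on ranks rather than raw values. The Python sketch below computes it for the London language and brain values listed on the earlier slide, giving tied values their average rank. The function names are hypothetical, the significance test is not included, and nothing here is claimed to reproduce the authors' exact computation.

```python
def average_ranks(values):
    """Rank values from 1 (smallest); tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Values relative to London, in the word order London, Paris, Moscow,
# Germany, Russia, Poland, north, south, west, east (from the earlier slide).
language = [1.000, 0.466, 0.396, 0.322, 0.303, 0.299, 0.106, 0.103, 0.078, 0.076]
brain = [0.275, 0.133, 0.108, 0.075, 0.033, 0.025, 0.042, 0.000, 0.000, 0.008]

rho = spearman(language, brain)
print(f"Spearman rho = {rho:.3f}")
```

A library routine such as `scipy.stats.spearmanr` returns the coefficient together with a p-value, which is what a significance decision would require.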
For those instances in which the brain and language partial orders are significantly correlated, we find the partial order that is invariant with respect to the brain and language data. Here are two more examples:
[Figure: invariant partial order over London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia. Significant Invariance - Paris - Spearman 0.88 (p = 6.6795e-004)]
[Figure: invariant partial order over London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia. Significant Invariance - Paris - Spearman 0.90 (p = 3.8716e-004)]
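One plausible reading of "the partial order that is invariant" is the set of pairs ordered the same way in both data sets, which is the intersection of the two strict orders. The Python sketch below illustrates that reading with the London values from the earlier slide. This interpretation, and the names `strict_order` and `invariant_order`, are assumptions for illustration; the slides do not spell out the construction.

```python
def strict_order(scores):
    """Pairs (a, b) with a scored strictly higher than b."""
    return {(a, b) for a in scores for b in scores if scores[a] > scores[b]}


def invariant_order(language_scores, brain_scores):
    """Pairs ordered the same way in both data sets (assumed reading)."""
    return strict_order(language_scores) & strict_order(brain_scores)


# Values relative to London, from the earlier slide.
language = {"Paris": 0.466, "Moscow": 0.396, "Germany": 0.322, "Russia": 0.303,
            "Poland": 0.299, "north": 0.106, "south": 0.103, "west": 0.078,
            "east": 0.076}
brain = {"Paris": 0.133, "Moscow": 0.108, "Germany": 0.075, "north": 0.042,
         "Russia": 0.033, "Poland": 0.025, "east": 0.008, "south": 0.000,
         "west": 0.000}

common = invariant_order(language, brain)
print(len(common), "pairs are ordered the same way in both data sets")
```

Pairs on which the two data sets disagree, or which are tied in either one, are left unordered, so the result is a partial order rather than a full ranking.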
[Figure: another hierarchical cluster tree (similarity tree) computed from the conditional probability density estimates for the classification of 640 brain wave samples for London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia. Leaves in plotted order: London, Paris, Moscow, Germany, Poland, Russia, north, south, west, east; axis ticks 0.2 to 0.6.]
Every single-trial classification produces its own conditional probability density estimates, giving rise to its own similarity tree …
[Figure: brain data similarity tree from a different classification. Leaves in plotted order: north, east, south, west, London, Paris, Germany, Moscow, Russia, Poland; axis ticks 0.1 to 0.5.]
1. We compute M (=30) single-trial classifications of the data (640 data samples for our 10 words) using random resampling with replacement.
2. For each classification, we find, for each word, the partial orders of the brain and language data that are significantly correlated.
3. For each of these highly correlated partial order pairs, we find the partial order invariant with respect to both.
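Random resampling with replacement, as in step 1, can be sketched as drawing sample indices uniformly at random, so that some samples appear more than once and others not at all. The Python sketch below shows only that drawing step. How the resampled sets were then split for training and testing is not stated in the slides, and the function name is hypothetical.

```python
import random


def resample_with_replacement(n_samples, rng):
    """Draw n_samples indices uniformly at random, with replacement."""
    return [rng.randrange(n_samples) for _ in range(n_samples)]


rng = random.Random(0)       # fixed seed so the sketch is repeatable
n_samples = 640              # EEG samples per participant (from the slides)
n_classifications = 30       # M in step 1

resamples = [resample_with_replacement(n_samples, rng)
             for _ in range(n_classifications)]
print(len(set(resamples[0])), "distinct samples in the first resample")
```

Each resample would feed one classification run, giving one confusion matrix and hence one set of partial orders per run.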
We performed 60 classifications; that is, we recomputed the classifications of the brain data using random resampling with replacement.
For half of these 60 classifications we compared the brain data to the WordNet data, and for the other half we compared the brain data to the LSA data.
And we plotted the results.
[Figure: bar chart, "Partial Orders of Similarity Differences Invariant between EEG-recorded Brain Data and Semantic Representations of Language". Vertical axis: Significant Invariant Partial Orders (0 to 300). Horizontal axis: Test Participants S10, S24, S25, S27, S16, S26, S12, S13, S18. Series: WordNet, LSA.]
A higher number of significant structural similarities was found between the brain data and the WordNet data than between the brain data and the LSA data.
This stronger structural similarity between the brain data and the WordNet-derived data supports the contention that during language comprehension, for the complex cognitive task of assessing truth or falsity, the representation of words in the brain has a WordNet-like quality.
This stronger structural similarity is the first evidence we have seen of WordNet-like representations in the brain.
Banerjee S, Pedersen T (2003). Extended Gloss Overlaps as a Measure of Semantic Relatedness. In Proceedings of the Eighteenth International Joint Conference on Artificial Intelligence, pp 805–810, August 9-15 2003, Acapulco, Mexico.
Banerjee S, Pedersen T (2002). An Adapted Lesk Algorithm for Word Sense Disambiguation using WordNet. In Proceedings of the Third International Conference on Intelligent Text Processing and Computational Linguistics, pp 136–145, February 17-23 2002, Mexico City.
Baroni M, Murphy B, Barbu E, Poesio M (2010). Strudel: A Corpus-Based Semantic Model Based on Properties and Types. Cognitive Science 34(2), 222–254. doi:10.1111/j.1551-6709.2009.01068.x
Blei D, Ng A, Jordan M (2003). Latent Dirichlet allocation. Journal of Machine Learning Research 3, 993–1022.
Brants T, Franz A (2006). www.ldc.upenn.edu/Catalog/CatalogEntry.jsp?catalogId=LDC2006T13. Linguistic Data Consortium, Philadelphia.
Budanitsky A, Hirst G (2001). Semantic distance in WordNet: an experimental, application-oriented evaluation of five measures. In Proceedings of the NAACL 2001 Workshop on WordNet and Other Lexical Resources, Pittsburgh, June 2001. http://www.seas.smu.edu/~rada/mwnw/papers/WNW-NAACL-101.pdf.gz
Collins AM, Loftus EF (1975). A spreading-activation theory of semantic processing. Psychological Review 82(6), 407–428.
Crestani F (1997). Application of Spreading Activation Techniques in Information Retrieval. Artificial Intelligence Review 11, 453–482.
Deerwester S, Dumais ST, Furnas GW, Landauer TK, Harshman R (1990). Indexing By Latent Semantic Analysis. Journal of the American Society for Information Science 41, 391–407.
Devereux B, Kelly C, Korhonen A (2010). Using fMRI activation to conceptual stimuli to evaluate methods for extracting conceptual representations from corpora. Proceedings of the NAACL HLT 2010 First Workshop on Computational Neurolinguistics, June 2010, Los Angeles, USA. Association for Computational Linguistics, 70–78.
Fellbaum C (Ed) (1998). WordNet: An Electronic Lexical Database. Cambridge, MA: MIT Press.
Hauk O, Davis MH, Ford M, Pulvermüller F, Marslen-Wilson WD (2006). The time course of visual word recognition as revealed by linear regression analysis of ERP data. Neuroimage 2006 May 1; 30(4): 1383–400. Epub 2006 Feb 7.
42 copy Colleen E Crangle 2013
Hirst G (1988) Resolving lexical ambiguity computationally with spreading activation and Polaroid Words. In Small SL, Cottrell GW, Tanenhaus MK (Eds) Lexical ambiguity resolution: Perspectives from psycholinguistics, neuropsychology, and artificial intelligence. San Mateo, CA: Morgan Kaufmann Publishers, 73-107.
Jelodar AB, Alizadeh M, Khadivi S (2010) WordNet Based Features for Predicting Brain Activity associated with meanings of noun. Proceedings of the NAACL HLT 2010 First Workshop on Computational Neurolinguistics, June 2010, Los Angeles, USA. Association for Computational Linguistics, 18-26.
Jiang J, Conrath D (1997) Semantic similarity based on corpus statistics and lexical taxonomy. In Proceedings of International Conference on Research in Computational Linguistics, Taiwan, 19-33.
Just MA, Cherkassky VL, Aryal S, Mitchell TM (2010) A Neurosemantic Theory of Concrete Noun Representation Based on the Underlying Brain Codes. PLoS ONE 5(1): e8622. doi:10.1371/journal.pone.0008622
Karp RM (1972) Reducibility among combinatorial problems. In Miller RE, Thatcher JW (Eds) Complexity of Computer Computations (Proc Sympos IBM Thomas J Watson Res Center, Yorktown Heights, NY). New York: Plenum, pp 85-103.
Kelly C, Devereux B, Korhonen A (2010) Acquiring Human-like Feature-Based Conceptual Representations from Corpora. Proceedings of the NAACL HLT 2010 First Workshop on Computational Neurolinguistics, June 2010, Los Angeles, USA. Association for Computational Linguistics, 61-69.
Kriegeskorte N, Mur M, Bandettini PA (2008) Representational similarity analysis - connecting the branches of systems neuroscience. Frontiers in Systems Neuroscience. doi:10.3389/neuro.06.004.2008
Kutas M, Hillyard SA (1984) Brain potentials during reading reflect word expectancy and semantic association. Nature 307: 161-163.
Kutas M, Federmeier KD (2011) Thirty years and counting: Finding meaning in the N400 component of the event related brain potential (ERP). Annual Review of Psychology 62: 621-647.
Landauer TK, Dumais ST (1997) A solution to Plato's problem: The Latent Semantic Analysis theory of the acquisition, induction, and representation of knowledge. Psychological Review 104: 211-240.
Landauer TK, Foltz PW, Laham D (1998) Introduction to Latent Semantic Analysis. Discourse Processes 25: 259-284.
Leacock C, Chodorow M (1998) Combining local context and WordNet similarity for word sense identification. In Fellbaum C (Ed) WordNet: An Electronic Lexical Database. Cambridge, MA: MIT Press, pp 265-283.
Leech G, Rayson P, Wilson A (2001) Word Frequencies in Written and Spoken English: based on the British National Corpus. London: Longman.
Lin D (1998) An information-theoretic definition of similarity. In Proc 15th International Conf on Machine Learning. Morgan Kaufmann, San Francisco, CA, pp 296-304.
Luce RD (1956) Semiorders and a theory of utility discrimination. Econometrica 24: 178-191. MR0078632. http://www.jstor.org/stable/1905751
Matsunaga T, Yonemori C, Tomita E, Muramatsu M (2009) Clique-based data mining for related genes in a biomedical database. BMC Bioinformatics 2009 Jul 1;10:205.
Martin Ph (2003) Correction and Extension of WordNet 1.7. ICCS 2003, 11th International Conference on Conceptual Structures (Springer Verlag, LNAI 2746, pp 160-173), Dresden, Germany, July 21-25 2003.
Miller GA, Nicely P (1955) An analysis of perceptual confusions among some English consonants. J Acoust Soc Am 27(2).
Miller GA, Leacock C, Tengi R, Bunker RT (1993) A Semantic Concordance. In Proceedings of the 3rd DARPA Workshop on Human Language Technology.
Miller GA (1995) WordNet: A Lexical Database for English. Communications of the ACM 38(11): 39-41.
Mitchell TM, Shinkareva SV, Carlson A, Chang KM, Malave VL, Mason RA, Just MA (2008) Predicting Human Brain Activity Associated with the Meanings of Nouns. Science 320: 1191, May 30 2008. DOI 10.1126/science.1152876
Murphy B, Baroni M, Poesio M (2009) EEG responds to conceptual stimuli and corpus semantics. In Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing, pp 619-627, Singapore, 6-7 August 2009. ACL and AFNLP.
Murphy B, Poesio M (2010) Detecting semantic category in simultaneous EEG/MEG recordings. In First Workshop on Computational Neurolinguistics, NAACL HLT 2010 (pp 36-44). Los Angeles: Association for Computational Linguistics.
Murphy B, Poesio M, Bovolo F, Bruzzone L, Dalponte M, Lakany H (2011) EEG decoding of semantic category reveals distributed representations for single concepts. Brain Lang 2011 Apr;117(1):12-22. doi:10.1016/j.bandl.2010.09.013
Pedersen T, Patwardhan S, Michelizzi J (2004) WordNet::Similarity - Measuring the Relatedness of Concepts. In Proceedings of Fifth Annual Meeting of the North American Chapter of the Association for Computational Linguistics (NAACL-2004), pp 38-41, Boston, May 2004.
Pereira F, Botvinick M, Detre G (2010) Learning semantic features for fMRI data from definitional text. Proceedings of the NAACL HLT 2010 First Workshop on Computational Neurolinguistics, June 2010, Los Angeles, USA. Association for Computational Linguistics, 1-9. http://www.aclweb.org/anthology/W10-0601
Perreau-Guimaraes M, Wong DK, Uy ET, Grosenick L, Suppes P (2007) Single-trial classification of MEG recordings. IEEE Transactions on Biomedical Engineering 54: 436-443.
Rabinovitch I (1977) The Scott-Suppes theorem on semiorders. J Mathematical Psychology 15(2): 209-212. MR0437404
Resnik P (1995) Using information content to evaluate semantic similarity. In Proceedings of the 14th International Joint Conference on Artificial Intelligence, pp 448-453, Montreal.
Roach BJ, Mathalon DH (2008) Event-related EEG time-frequency analysis: an overview of measures and an analysis of early gamma band phase locking in schizophrenia. Schizophr Bull 2008 Sep;34(5):907-26. Epub 2008 Aug 6.
Scott D, Suppes P (1958) Foundational aspects of theories of measurement. The Journal of Symbolic Logic 23: 113-128. MR0115919
Suppes P (1972) Axiomatic Set Theory. New York: Dover.
Suppes P (1974) The axiomatic method in the empirical sciences. In Henkin L et al (Eds) Proceedings of the Tarski Symposium, Proceedings of Symposia in Pure Mathematics 25. Providence, RI: American Mathematical Society, pp 465-479.
Suppes P, Lu Z-L, Han B (1997) Brain wave recognition of words. Proceedings of the National Academy of Sciences 95: 14965-14969.
Suppes P, Han B, Lu Z-L (1998) Brain-wave recognition of sentences. Proceedings of the National Academy of Sciences 95: 15861-15866.
Suppes P, Han B, Epelboim J, Lu Z-L (1999a) Invariance between subjects of brain wave representations of language. Proceedings of the National Academy of Sciences 96: 12953-12958.
Suppes P, Han B, Epelboim J, Lu Z-L (1999b) Invariance of brain-wave representations of simple visual images and their names. Proceedings of the National Academy of Sciences 96: 14658-14663.
Suppes P, Han B (2000) Brain-wave representation of words by superposition of a few sine waves. Proceedings of the National Academy of Sciences 97: 8738-8743.
Suppes P, Perreau-Guimaraes M, Wong DK (2009) Partial Orders of Similarity Differences Invariant Between EEG-Recorded Brain and Perceptual Representations of Language. Neural Computation 21: 3228-3269.
Suppes P, de Barros JA, Oas G (2012) Phase-Oscillator computations as neural models of stimulus-response conditioning and response selection. Journal of Mathematical Psychology 56: 95-117.
Turney P (2001) Mining the Web for Synonyms: PMI-IR versus LSA on TOEFL. In De Raedt L, Flach P (Eds) Proceedings of the Twelfth European Conference on Machine Learning (ECML-2001), pp 491-502. Freiburg, Germany.
Vassilieva E, Pinto G, de Barros JA, Suppes P (2011) Learning Pattern Recognition Through Quasi-Synchronization of Phase Oscillators. IEEE Transactions on Neural Networks 22(1), January 2011.
Vigliocco G, Warren J, Siri S, Arciuli J, Scott S, Wise R (2006) The role of semantics and grammatical class in the neural representation of words. Cereb Cortex 2006 Dec;16(12):1790-6. Epub 2006 Jan 18.
Wong DK, Perreau-Guimaraes M, Uy ET, Suppes P (2004) Classification of individual trials based on the best independent component of EEG-recorded sentences. Neurocomputing 61: 479-484.
Wong DK, Uy ET, Perreau-Guimaraes M, Yang W, Suppes P (2006) Interpretation of perceptron weights as constructed time series for EEG classification. Neurocomputing 70: 373-383.
Wong DK, Grosenick L, Uy ET, Perreau-Guimaraes M, Carvalhaes CG, Desain P, Suppes P (2008) Quantifying inter-subject agreement in brain-imaging analyses. NeuroImage 39: 1051-1063.
Woods W (1975) What's in a link: Foundations for semantic networks. In Bobrow D, Collins A (Eds) Representation and understanding. New York: Academic Press, 1975, 35-82.
Wu Z, Palmer M (1994) Verb Semantics and Lexical Selection. In Proceedings of the 32nd Annual Meeting of the Association for Computational Linguistics, Las Cruces, New Mexico, pp 133-138.
Can we train the classifier on some participants and use it to make predictions for other participants?
1999: Invariance between subjects of brain wave representations of language
Can we train the classifier using visually presented words and use that classifier to make predictions for words presented auditorily (and vice versa)?
2004: Classification of individual trials based on the best independent component of EEG-recorded sentences
2005: Recognition of Words from the EEG Laplacian
2006: Multichannel classification of single EEG trials with independent component analysis
2007: Single-trial classification of MEG recordings
Can we train the classifier using words and use that classifier to make predictions for pictures depicting what the word refers to (and vice versa)?
1999: Invariance of brain-wave representations of simple visual images and their names
Answering yes to these questions establishes that we are recognizing the meaning of the word, or the idea or concept behind the word, and not, for instance, the sound of the word, its orthography, or its idiosyncratic use by one person.
EXPERIMENTAL SETUP for recognizing words within sentences
A computer presented 48 spoken sentences to each of 9 participants in 10 randomized blocks, with all 48 sentences in each block, so there were 480 trials for each participant.
The sentences were about the geography of Europe. Half were true, half false; half positive, half negative.
The capital of Italy is Paris. (F)
Paris is not east of Berlin. (T)
Spain is west of Russia. (T)
Participants were asked to determine the truth or falsity of each statement while EEG recordings were made.
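The block design described above (10 blocks, each containing all 48 sentences in random order) can be sketched as follows. This is only an illustration under stated assumptions: the function name `build_trial_sequence`, the fixed seed, and the placeholder sentence strings are hypothetical, and the slides do not say how the randomization was actually carried out.

```python
import random

def build_trial_sequence(sentences, n_blocks=10, seed=0):
    """Return a list of (block_index, sentence) trials.

    Every block contains each sentence exactly once, in an order that is
    shuffled independently for each block.
    """
    rng = random.Random(seed)  # fixed seed only so the sketch is repeatable
    trials = []
    for block in range(n_blocks):
        order = list(sentences)
        rng.shuffle(order)
        trials.extend((block, sentence) for sentence in order)
    return trials

# Placeholder stimuli: the actual 48 sentences are not listed in the slides.
sentences = [f"sentence {i}" for i in range(48)]
trials = build_trial_sequence(sentences)
print(len(trials))  # 48 sentences x 10 blocks
```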
X ∈ {Berlin, London, Moscow, Paris, Rome, Warsaw, Madrid, Vienna, Athens}
Y ∈ {France, Germany, Italy, Poland, Russia, Austria, Greece, Spain}
W ∈ {the capital, the largest city}
Z ∈ {north, south, east, west}
Sentence forms:
W of Y is [not] X
X is [not] W of Y
X is [not] Z of X
Y is [not] Z of Y
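As a rough sketch of how the four sentence forms expand, the following enumerates every sentence the templates allow. It assumes that the two cities (or two countries) in a directional sentence are different; the slides do not state this, and the experiment itself used only 48 sentences, so this is the candidate space, not the stimulus set. All names in the code are illustrative.

```python
from itertools import permutations, product

CITIES = ["Berlin", "London", "Moscow", "Paris", "Rome",
          "Warsaw", "Madrid", "Vienna", "Athens"]            # X
COUNTRIES = ["France", "Germany", "Italy", "Poland",
             "Russia", "Austria", "Greece", "Spain"]         # Y
ROLES = ["the capital", "the largest city"]                  # W
DIRECTIONS = ["north", "south", "east", "west"]              # Z

def capitalize_first(text):
    return text[0].upper() + text[1:]

def generate_sentences():
    """Enumerate all sentences licensed by the four templates."""
    sentences = []
    for neg in ("", "not "):
        for w, y, x in product(ROLES, COUNTRIES, CITIES):
            sentences.append(capitalize_first(f"{w} of {y} is {neg}{x}"))
            sentences.append(f"{x} is {neg}{w} of {y}")
        # Assumption: the two places in a directional sentence differ.
        for places in (CITIES, COUNTRIES):
            for a, b in permutations(places, 2):
                for z in DIRECTIONS:
                    sentences.append(f"{a} is {neg}{z} of {b}")
    return sentences

all_sentences = generate_sentences()
print(len(all_sentences))
```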
LANGUAGE | BRAIN
From http://www2.le.ac.uk
Paris is north of Berlin
The capital of Germany is Berlin
Spain is east of France
London is not the capital of France
Vienna is east of Moscow
The largest city of Poland is Athens
Fifteen seconds of EEG data. From http://sccn.ucsd.edu
What our work does...
Compares the brain data and the language data and finds structural similarities between them.
HOW DO WE REPRESENT THE EEG DATA?
HOW DO WE REPRESENT THE LANGUAGE DATA?
HOW DO WE COMPARE THE TWO?
Berlin, London, Moscow, Paris, Rome, Warsaw, Madrid, Vienna, Athens, France, Germany, Italy, Poland, Russia, Austria, Greece, Spain, north, south, east, west
Paris is north of Berlin
Spain is east of France
The capital of Germany is Berlin
Paris, east of, Germany
Machine learning approach to the study of brain and language
For the 10 geography words London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia we have 640 EEG data samples for each participant.
We want to classify these 640 samples into the correct 10 classes.
640 EEG samples for each participant:
Use 580 brain samples and their associated words to train the classifier.
Test the remaining 60 samples.
Do this many times, each time using a different set of training samples.
Find the average classification rate; make sure it is statistically significant.
Obtained classification rates in the range 25 to 29, with a mean classification rate of around 24.5% (p < 10E-10). Significantly higher than chance (10%).
We used a 5-fold linear discriminant model with principal component analysis for blind source separation to classify the segments of EEG data obtained from the individual trials.
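The train/test procedure listed above (hold out 60 of the 640 samples, train on the other 580, repeat with different splits, average the rate) can be sketched as below. Note the substitutions: this is not the linear discriminant model with PCA used in the study. A simple nearest-class-mean classifier stands in for it, and the data are synthetic random vectors, so the resulting rate says nothing about EEG. All function names and parameters are illustrative assumptions.

```python
import random
from collections import defaultdict

WORDS = ["London", "Moscow", "Paris", "north", "south",
         "east", "west", "Germany", "Poland", "Russia"]

def fit_nearest_mean(train):
    """train: list of (feature_vector, label). Returns {label: mean vector}."""
    grouped = defaultdict(list)
    for x, label in train:
        grouped[label].append(x)
    return {label: [sum(col) / len(col) for col in zip(*xs)]
            for label, xs in grouped.items()}

def predict_nearest_mean(means, x):
    def sq_dist(m):
        return sum((a - b) ** 2 for a, b in zip(x, m))
    return min(means, key=lambda label: sq_dist(means[label]))

def mean_classification_rate(samples, n_test=60, n_repeats=20, seed=0):
    """Average accuracy over repeated random train/test splits."""
    rng = random.Random(seed)
    rates = []
    for _ in range(n_repeats):
        shuffled = samples[:]
        rng.shuffle(shuffled)
        test, train = shuffled[:n_test], shuffled[n_test:]
        means = fit_nearest_mean(train)
        correct = sum(predict_nearest_mean(means, x) == y for x, y in test)
        rates.append(correct / n_test)
    return sum(rates) / len(rates)

# Synthetic stand-in data: 640 samples, 10 classes, 5 noisy features each.
data_rng = random.Random(1)
centers = {w: [data_rng.gauss(0, 1) for _ in range(5)] for w in WORDS}
samples = [([c + data_rng.gauss(0, 1.5) for c in centers[w]], w)
           for w in WORDS for _ in range(64)]

rate = mean_classification_rate(samples)
print(f"mean classification rate: {rate:.3f} (chance = {1 / len(WORDS):.3f})")
```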
Beyond machine learning ...
THEN look at the MIS-CLASSIFICATIONS and build a CONFUSION MATRIX.
Classify 640 brain data samples s1, s2, ..., s640 into 10 classes ω1, ω2, ..., ω10 of the finite set A.
M = (m_iq) is the confusion matrix for a given classification, where m_iq is the number of test samples from class ωi classified as belonging to class ωq.
Rows: true class ωi. Columns: predicted class ωq.
          London   Moscow    Paris    north    south     east     west  Germany   Poland   Russia
London         8       14       11        7        6        3        3       10        8       10
Moscow         8       24       14        6        2        6        4        4        7       15
Paris          6       18       12        4        3        5        8        6       11        7
north          4        2        5       11        9        7       10        1        8        3
south          1        4        3       14       14        9       11        4        7        3
east           4        3        5        9       12       12        7        1        4        3
west           4        3        2       12       13       11       10        2        8        5
Germany        2        2        3        2        2        1        0        9       11        8
Poland         2        3        4        0        4        1        4        9        9        4
Russia         7        7        5        1        5        1        1        8        4       11
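A minimal sketch of how a confusion matrix like the one above is tallied from pairs of true and predicted labels. The short label lists are hypothetical toy data, not results from the study, and `confusion_matrix` is an illustrative helper name.

```python
WORDS = ["London", "Moscow", "Paris", "north", "south",
         "east", "west", "Germany", "Poland", "Russia"]

def confusion_matrix(true_labels, predicted_labels, classes):
    """m[i][q] = number of samples of class i classified as class q."""
    index = {c: k for k, c in enumerate(classes)}
    m = [[0] * len(classes) for _ in classes]
    for t, p in zip(true_labels, predicted_labels):
        m[index[t]][index[p]] += 1
    return m

# Hypothetical toy predictions, for illustration only.
true_labels = ["London", "London", "Paris", "north", "north", "east"]
predicted = ["London", "Moscow", "Paris", "south", "north", "east"]
M = confusion_matrix(true_labels, predicted, WORDS)
print(M[0])  # row for true class "London"
```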
The relative frequencies $m_{iq} / \sum_{q} m_{iq}$ are an N-by-N estimate for the conditional probability densities, designated by the matrix P = (p_iq), that a randomly chosen test sample from class ωi will be classified as belonging to class ωq.
Conditional probability density estimates from the classification of brain wave data for London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia.
Rows: true class ωi. Columns: ωq, predicted (classified as).
          London   Moscow    Paris    north    south     east     west  Germany   Poland   Russia
London     0.250    0.163    0.138    0.075    0.038    0.050    0.063    0.088    0.075    0.063
Moscow     0.144    0.333    0.133    0.033    0.011    0.056    0.056    0.089    0.067    0.078
Paris      0.175    0.188    0.125    0.038    0.038    0.038    0.050    0.100    0.138    0.113
north      0.067    0.017    0.017    0.283    0.167    0.150    0.100    0.100    0.067    0.033
south      0.014    0.014    0.071    0.114    0.271    0.171    0.157    0.029    0.086    0.071
east       0.050    0.000    0.000    0.133    0.150    0.383    0.150    0.050    0.000    0.083
west       0.057    0.043    0.000    0.171    0.114    0.143    0.229    0.086    0.100    0.057
Germany    0.075    0.175    0.100    0.050    0.125    0.025    0.050    0.275    0.025    0.100
Poland     0.150    0.100    0.125    0.050    0.050    0.000    0.025    0.150    0.250    0.100
Russia     0.160    0.040    0.080    0.040    0.100    0.020    0.120    0.100    0.080    0.260
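The step from counts to conditional probability estimates is a row normalization: each count is divided by its row total. The sketch below applies it to the London row of counts from the confusion matrix shown earlier. The result does not reproduce the London row of the probability table above, which appears to come from a different classification run; the helper name is illustrative.

```python
def conditional_probabilities(m):
    """Row-normalize a confusion matrix: p[i][q] = m[i][q] / sum_q m[i][q]."""
    p = []
    for row in m:
        total = sum(row)
        p.append([count / total if total else 0.0 for count in row])
    return p

# Counts for true class "London" from the confusion matrix shown earlier.
london_counts = [8, 14, 11, 7, 6, 3, 3, 10, 8, 10]
P = conditional_probabilities([london_counts])
print([round(v, 3) for v in P[0]])
```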
[Figure: Hierarchical cluster tree (similarity tree) computed from the conditional probability density estimates for the classification of 640 brain wave samples for London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia. Leaf order: London, Paris, Moscow, Germany, Poland, Russia, north, south, west, east. Axis ticks: 0.2 to 0.6.]
LANGUAGE
WordNet: a lexical database of English organized around sets of cognitive synonyms, restricted to the senses that are relevant to the geography sentences.
Latent Semantic Analysis (LSA): a statistical method to extract measures of word similarity from selected sets of documents such as novels, newspaper articles, textbooks.
BRAIN
The conditional probability density estimates from the classification of brain wave data for London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia (the matrix P given earlier).
WORDNET
WordNet is a human-annotated lexical database of English in which nouns, verbs, adjectives, and adverbs are grouped into sets of cognitive synonyms (called synsets), each synset expressing a distinct concept.
See Miller (1995), Fellbaum (1998), and http://wordnet.princeton.edu
The synsets are related to each other primarily through the hypernymy and hyponymy relations for nouns. Other relations in WordNet are part-whole (holonym), member of (meronym), has instance, and so on.
Hyponymy, often referred to as the is-a relation in computational discussions, is defined as follows: a concept represented by a lexical item Li is said to be a hyponym of the concept represented by a lexical item Lk if native speakers of English accept sentences of the form "An Li is a kind of Lk." Conversely, Lk is the hypernym of Li. A hypernym is therefore a more general concept and a hyponym a more specific concept.
WORDNET
Multiple word senses
(n) Paris#1, City of Light#1, French capital#1, capital of France#1 (the capital and largest city of France and international center of culture and commerce)
(n) Paris#2, genus Paris#1 (sometimes placed in subfamily Trilliaceae)
(n) Paris#3 ((Greek mythology) the prince of Troy who abducted Helen from her husband Menelaus and provoked the Trojan War)
(n) Paris#4 (a town in northeastern Texas)
Organized into sets of cognitive synonyms called "synsets"
(n) city#1, metropolis#1, urban center#1 (a large and densely populated urban area; may include several independent administrative districts) [relation links: direct hyponym, full hyponym, part meronym, has instance]
(n) city#2 (an incorporated administrative district established by state charter)
(n) city#3, metropolis#2 (people living in a large densely populated municipality)
(n) east#1, due east#1, eastward#1, E#3 (the cardinal compass point that is at 90 degrees)
(n) East#2, Orient#1 (the countries of Asia)
(n) East#3, eastern United States#1 (the region of the United States lying to the north of the Ohio River and to the east of the Mississippi River)
(n) east#4 (the direction corresponding to the eastward cardinal compass point)
(n) east#5 (a location in the eastern part of a country, region, or city)
LATENT SEMANTIC ANALYSIS (LSA)
LSA is a statistical technique for extracting, from large collections of documents, a measure of how similar two words are to each other in terms of the patterns of their co-occurrences within those documents. See Deerwester et al. (1990), Landauer and Dumais (1997), Landauer et al. (1998).
The underlying idea is that if, for each word, you take into account all the contexts in which it does and does not appear, you get for all the words a set of mutual constraints that represent how similar any two words are to each other.
The similarity judgments produced by latent semantic analysis have been shown to correspond to some extent to human judgments of similarity. After training on about 2000 pages of English text, it scored as well as average test-takers on the synonym portion of the Test of English as a Foreign Language. After training on a psychology textbook, it achieved a passing score on a multiple-choice exam.
We used the application at http://lsa.colorado.edu to compute similarity matrices in term space for our set of words. The computation was based on ~38,000 college-level texts (novels, newspaper articles, ...). A maximum of 300 factors was permitted in the analysis.
Semantic similarity matrix derived from LSA for the set of words London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia.
          London   Moscow    Paris    north    south     east     west  Germany   Poland   Russia
London      1.00     0.17     0.37     0.13     0.12     0.12     0.14     0.17     0.16     0.16
Moscow      0.17     1.00     0.18     0.13     0.08     0.22     0.14     0.37     0.65     0.69
Paris       0.37     0.18     1.00     0.10     0.04     0.08     0.09     0.50     0.31     0.36
north       0.13     0.13     0.10     1.00     0.89     0.60     0.61     0.07     0.09     0.14
south       0.12     0.08     0.04     0.89     1.00     0.50     0.55     0.02     0.05     0.05
east        0.12     0.22     0.08     0.60     0.50     1.00     0.85     0.22     0.29     0.30
west        0.14     0.14     0.09     0.61     0.55     0.85     1.00     0.24     0.26     0.23
Germany     0.17     0.37     0.50     0.07     0.02     0.22     0.24     1.00     0.85     0.81
Poland      0.16     0.65     0.31     0.09     0.05     0.29     0.26     0.85     1.00     0.87
Russia      0.16     0.69     0.36     0.14     0.05     0.30     0.23     0.81     0.87     1.00
[Figure: Hierarchical cluster tree computed from the pair-wise Latent Semantic Analysis (LSA) scores of similarity for London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia, based on ~38,000 college-level texts (novels, newspaper articles, ...). Leaf order: east, west, north, south, London, Paris, Moscow, Germany, Poland, Russia. Axis ticks: 0.1 to 0.6.]
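To illustrate how a similarity tree can be built from pairwise scores, here is a small agglomerative clustering sketch over seven of the ten words, using the LSA similarities from the matrix above. Average linkage is an assumption made for this example; the slides do not say which linkage or software produced the trees, so the merge heights will not necessarily match the figure.

```python
from itertools import combinations

WORDS = ["north", "south", "east", "west", "Germany", "Poland", "Russia"]

# Pairwise LSA similarities copied from the matrix above (7-word subset).
SIM = {
    ("north", "south"): 0.89, ("north", "east"): 0.60, ("north", "west"): 0.61,
    ("south", "east"): 0.50, ("south", "west"): 0.55, ("east", "west"): 0.85,
    ("Germany", "Poland"): 0.85, ("Germany", "Russia"): 0.81,
    ("Poland", "Russia"): 0.87,
    ("north", "Germany"): 0.07, ("north", "Poland"): 0.09, ("north", "Russia"): 0.14,
    ("south", "Germany"): 0.02, ("south", "Poland"): 0.05, ("south", "Russia"): 0.05,
    ("east", "Germany"): 0.22, ("east", "Poland"): 0.29, ("east", "Russia"): 0.30,
    ("west", "Germany"): 0.24, ("west", "Poland"): 0.26, ("west", "Russia"): 0.23,
}

def sim(a, b):
    return SIM[(a, b)] if (a, b) in SIM else SIM[(b, a)]

def average_linkage(c1, c2):
    """Mean pairwise similarity between the members of two clusters."""
    return sum(sim(a, b) for a in c1 for b in c2) / (len(c1) * len(c2))

def agglomerate(words):
    """Repeatedly merge the two most similar clusters; return the merge list."""
    clusters = [frozenset([w]) for w in words]
    merges = []
    while len(clusters) > 1:
        c1, c2 = max(combinations(clusters, 2),
                     key=lambda pair: average_linkage(*pair))
        merges.append((c1, c2, average_linkage(c1, c2)))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 | c2]
    return merges

merges = agglomerate(WORDS)
for c1, c2, s in merges:
    print(sorted(c1), "+", sorted(c2), f"at similarity {s:.3f}")
```

In this subset the compass directions and the countries each form their own cluster before the two groups are finally joined, which mirrors the grouping visible in the tree.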
LSA provides a straightforward measure of similarity between words.
For WORDNET, several different measures of similarity have been devised, e.g.:
Path length between synsets.
Information content: a corpus-based measure of the specificity of a concept, measured in terms of the frequency of occurrence of the concept in a corpus, here the human-annotated, sense-tagged corpus SemCor (Miller et al. 1993), which links every word in the Brown Corpus to its appropriate WordNet sense. Scaled various ways.
Vector-space models: these work by forming second-order co-occurrence vectors from the WordNet definitions of concepts, known as glosses.
We used five measures in our computations of similarity and took the average score, using each of the relevant senses.
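For reference, three widely used WordNet measures from the papers in the reference list (Wu and Palmer 1994, Lin 1998, Jiang and Conrath 1997) are usually written as below. These are the standard textbook forms, given here as background; the slides do not state the exact formulas or scaling applied in the study, so treat the notation as an assumption.

```latex
% depth(c): depth of concept c in the hypernym hierarchy
% lcs(c_1, c_2): least common subsumer (most specific shared ancestor)
% IC(c) = -\log P(c): information content, with P(c) estimated from a corpus
\begin{align*}
\mathrm{sim}_{wup}(c_1, c_2) &= \frac{2\,\mathrm{depth}(\mathrm{lcs}(c_1, c_2))}{\mathrm{depth}(c_1) + \mathrm{depth}(c_2)} \\
\mathrm{sim}_{lin}(c_1, c_2) &= \frac{2\,\mathrm{IC}(\mathrm{lcs}(c_1, c_2))}{\mathrm{IC}(c_1) + \mathrm{IC}(c_2)} \\
\mathrm{sim}_{jcn}(c_1, c_2) &= \frac{1}{\mathrm{IC}(c_1) + \mathrm{IC}(c_2) - 2\,\mathrm{IC}(\mathrm{lcs}(c_1, c_2))}
\end{align*}
```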
Semantic similarity matrix derived from WordNet for the set of words London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia, using senses relevant to the geography of Europe and five measures of similarity: wup (path length), lin and jcn (information content), and gv and pgv (vector space measures).
          London   Moscow    Paris    north    south     east     west  Germany   Poland   Russia
London     1.000    0.396    0.466    0.106    0.103    0.076    0.078    0.322    0.299    0.303
Moscow     0.396    1.000    0.393    0.095    0.094    0.062    0.070    0.286    0.281    0.288
Paris      0.466    0.393    1.000    0.106    0.104    0.074    0.077    0.327    0.308    0.307
north      0.106    0.095    0.106    1.000    0.228    0.179    0.210    0.123    0.132    0.111
south      0.103    0.094    0.104    0.228    1.000    0.172    0.212    0.115    0.107    0.109
east       0.076    0.062    0.074    0.179    0.172    1.000    0.216    0.093    0.080    0.077
west       0.078    0.070    0.077    0.210    0.212    0.216    1.000    0.087    0.082    0.083
Germany    0.322    0.286    0.327    0.123    0.115    0.093    0.087    1.000    0.589    0.409
Poland     0.299    0.281    0.308    0.132    0.107    0.080    0.082    0.589    1.000    0.403
Russia     0.303    0.288    0.307    0.111    0.109    0.077    0.083    0.409    0.403    1.000
[Figure: Hierarchical cluster tree computed from pairwise WordNet-based semantic similarity scores for London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia, restricted to senses related to the geography of Europe. Leaf order: Germany, Poland, Russia, London, Paris, Moscow, north, south, west, east. Axis ticks: 0.2 to 0.6.]
BACK TO THE BRAIN ...
London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia
Now let's see how to compare the EEG data and the language data ...
Some of the similarity trees show remarkable congruence between the brain and semantic data.
Where exactly does that congruence lie?
Can we devise a quantitative measure of the nature and strength of that congruence?
[Figure: two similarity trees shown together, both with axis ticks 0.2 to 0.6. BRAIN DATA tree, leaf order: London, Paris, Moscow, Germany, Poland, Russia, north, south, west, east. LANGUAGE DATA tree, leaf order: Germany, Poland, Russia, London, Paris, Moscow, north, south, west, east.]
[Figure: WordNet-based semantic similarities and EEG conditional probability estimates for London relative to London (L), Moscow (M), Paris (P), north (n), south (s), east (e), west (w), Germany (G), Poland (Po), and Russia (R).]
The Spearman rank correlation for the two sequences in the figure is 0.99, with one-sided significance of 1.84E-10.
BRAIN DATA
For each word ω we compute from the conditional probability density estimates a ternary relation R such that R(ω, ω1, ω2) if and only if, with respect to word ω, the conditional probability for word ω1 is smaller than the conditional probability for word ω2; that is, if and only if ω1's similarity difference with ω is smaller than ω2's similarity difference with ω.
R is an ordinal relation of similarity differences, a partial order that is irreflexive, asymmetric, and transitive.
LANGUAGE DATA
For each word ω we compute from the semantic similarity matrix a ternary relation R such that R(ω, ω1, ω2) if and only if the similarity difference of ω1 with ω is smaller than the similarity difference of ω2 with ω; that is, ω1 is more similar to ω than is ω2.
R is an ordinal relation of similarity differences, a partial order that is irreflexive, asymmetric, and transitive.
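A small sketch of the language-data relation: for a fixed reference word ω, R(ω, ω1, ω2) is represented as the set of pairs (ω1, ω2) in which ω1 is more similar to ω than ω2 is. The example uses the WordNet-based similarities to London from the matrix given earlier, and also checks the three stated properties. The function names are illustrative, and leaving tied words unordered is an assumption.

```python
def similarity_order(row):
    """row: {word: similarity to the reference word}.

    Returns the set of pairs (w1, w2) such that w1 is strictly more
    similar to the reference word than w2. Tied words stay unordered.
    """
    return {(w1, w2) for w1 in row for w2 in row if row[w1] > row[w2]}

def is_strict_partial_order(pairs):
    irreflexive = all(a != b for a, b in pairs)
    asymmetric = all((b, a) not in pairs for a, b in pairs)
    transitive = all((a, d) in pairs
                     for a, b in pairs for c, d in pairs if b == c)
    return irreflexive and asymmetric and transitive

# WordNet-based similarities to "London", from the matrix given earlier.
london = {"Moscow": 0.396, "Paris": 0.466, "north": 0.106, "south": 0.103,
          "east": 0.076, "west": 0.078, "Germany": 0.322, "Poland": 0.299,
          "Russia": 0.303}

R_london = similarity_order(london)
print(("Paris", "Moscow") in R_london)   # Paris is closer to London than Moscow
print(is_strict_partial_order(R_london))
```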
London
Language data          Brain data
London   1.000         London   0.275
Paris    0.466         Paris    0.133
Moscow   0.396         Moscow   0.108
Germany  0.322         Germany  0.075
Russia   0.303         north    0.042
Poland   0.299         Russia   0.033
north    0.106         Poland   0.025
south    0.103         east     0.008
west     0.078         south    0.000
east     0.076         west     0.000
Partial orders for London derived from the WordNet semantic similarities of Table 2 and the conditional probability estimates for the brain data of Table 5.
[Figure: partial-order diagrams for London. Node labels recovered from the language diagram: Poland, north, south, west, east. Node labels recovered from the brain diagram: north, Poland, east, south, west.]
Following the approach described in Suppes (1974) for the axiomatization of the theory of differences in utility preference or the theory of differences in psychological intensity.
Suppes P (1974) The axiomatic method in the empirical sciences. In Henkin L et al (Eds) Proceedings of the Tarski Symposium, Proceedings of Symposia in Pure Mathematics 25. Providence, RI: American Mathematical Society, pp 465-479.
Brain data: the relational structure (A, R) constructed from R and the finite set A of classes ω1, ω2, ..., ωN, together with the N partial orders constructed from the N-by-N estimate for the conditional probability densities.
Language data: the relational structure (A, R) constructed from R and the finite set A of classes ω1, ω2, ..., ωN, together with the N partial orders constructed from the N-by-N similarity matrix.
For each word ωi we compare the partial order of the brain data with the partial order of the language data using Spearman's rank correlation coefficient, which we interpret in the usual way to determine whether or not we have a statistically significant correlation.
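As an illustration of the comparison step, the sketch below computes Spearman's rank correlation (with average ranks for ties) between the language and brain values for London listed earlier. It is a plain-Python stand-in for whatever statistics package was actually used, it does not compute a p-value, and it is not meant to reproduce the 0.99 reported for the figure, which is based on a different pair of sequences.

```python
def average_ranks(values):
    """Rank values from largest (rank 1) to smallest, averaging tied ranks."""
    order = sorted(range(len(values)), key=lambda i: -values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of 1-based positions i..j
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5

# Values for London, in the word order: London, Paris, Moscow, Germany,
# Russia, Poland, north, south, west, east (from the table given earlier).
language = [1.000, 0.466, 0.396, 0.322, 0.303, 0.299, 0.106, 0.103, 0.078, 0.076]
brain = [0.275, 0.133, 0.108, 0.075, 0.033, 0.025, 0.042, 0.000, 0.000, 0.008]

rho = spearman(language, brain)
print(f"Spearman rho = {rho:.3f}")
```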
For those instances in which the brain and language partial orders are significantly correlated, we find the partial order that is invariant with respect to the brain and language data. Here are two more examples.
[Figure: two invariant partial-order diagrams over London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia.]
Significant Invariance - Paris - Spearman 0.88 (p = 6.6795e-004)
Significant Invariance - Paris - Spearman 0.90 (p = 3.8716e-004)
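The slides do not spell out how the invariant partial order is constructed. One natural reading, used here purely as an assumption, is to keep only those ordered pairs on which the brain order and the language order agree, that is, the intersection of the two relations. The sketch applies this to the London values listed earlier.

```python
def strict_order(scores):
    """Pairs (a, b) where a scores strictly higher than b; ties are unordered."""
    return {(a, b) for a in scores for b in scores if scores[a] > scores[b]}

# Values for London from the language/brain table given earlier.
language = {"London": 1.000, "Paris": 0.466, "Moscow": 0.396, "Germany": 0.322,
            "Russia": 0.303, "Poland": 0.299, "north": 0.106, "south": 0.103,
            "west": 0.078, "east": 0.076}
brain = {"London": 0.275, "Paris": 0.133, "Moscow": 0.108, "Germany": 0.075,
         "north": 0.042, "Russia": 0.033, "Poland": 0.025, "east": 0.008,
         "south": 0.000, "west": 0.000}

# Assumed definition: the invariant order keeps only pairs ordered the same
# way in both data sets.
invariant = strict_order(language) & strict_order(brain)

print(("Paris", "Moscow") in invariant)  # both data sets agree on this pair
print(("Russia", "north") in invariant)  # the two orders disagree here
```

Under this reading, pairs such as Russia and north, which the two data sets order differently, simply drop out and are left unordered in the invariant structure.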
Brain data
[Figure: Another hierarchical cluster tree (similarity tree) computed from the conditional probability density estimates for the classification of 640 brain wave samples for London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia. Leaf order: London, Paris, Moscow, Germany, Poland, Russia, north, south, west, east. Axis ticks: 0.2 to 0.6.]
Every single-trial classification produces its own conditional probability density estimates, giving rise to its own similarity tree ...
[Figure: a second brain-data similarity tree. Leaf order: north, east, south, west, London, Paris, Germany, Moscow, Russia, Poland. Axis ticks: 0.1 to 0.5.]
1. We compute M (= 30) single-trial classifications of the data (640 data samples for our 10 words) using random resampling with replacement.
2. For each classification, we find, for each word, the partial orders of the brain and language data that are significantly correlated.
3. For each of these highly correlated partial-order pairs, we find the partial order invariant with respect to both.
We performed 60 classifications; that is, we recomputed the classifications of the brain data using random resampling with replacement.
For half of these 60 classifications we compared the brain data to the WordNet data, and for the other half we compared the brain data to the LSA data.
And we plotted the results.
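Random resampling with replacement, as used in step 1, can be sketched as drawing 640 sample indices per classification, each drawn independently from the full set. The helper name, the fixed seed, and the choice to resample indices are assumptions for illustration; the slides do not give the exact resampling scheme.

```python
import random

def bootstrap_indices(n_samples, n_resamples, seed=0):
    """Return n_resamples lists of indices, each drawn with replacement."""
    rng = random.Random(seed)  # fixed seed only so the sketch is repeatable
    return [[rng.randrange(n_samples) for _ in range(n_samples)]
            for _ in range(n_resamples)]

resamples = bootstrap_indices(n_samples=640, n_resamples=30)

# Because draws are with replacement, a resample repeats some samples
# and leaves others out.
print(len(resamples), len(resamples[0]), len(set(resamples[0])))
```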
[Figure: Partial Orders of Similarity Differences Invariant between EEG-recorded Brain Data and Semantic Representations of Language. Bar chart. Y-axis: Significant Invariant Partial Orders (0 to 300). X-axis: Test Participants (S10, S24, S25, S27, S16, S26, S12, S13, S18). Series: WordNet, LSA.]
A higher number of significant structural similarities was found between the brain data and the WordNet data than between the brain data and the LSA data.
This stronger structural similarity between the brain data and the WordNet-derived data supports the contention that, during language comprehension for the complex cognitive task of assessing truth or falsity, the representation of words in the brain has a WordNet-like quality.
This stronger structural similarity is the first evidence we have seen of WordNet-like representations in the brain.
Banerjee, S., and Pedersen, T. (2003). Extended Gloss Overlaps as a Measure of Semantic Relatedness. In Proceedings of the Eighteenth International Joint Conference on Artificial Intelligence, pp. 805-810, August 9-15, 2003, Acapulco, Mexico.
Banerjee, S., and Pedersen, T. (2002). An Adapted Lesk Algorithm for Word Sense Disambiguation using WordNet. In Proceedings of the Third International Conference on Intelligent Text Processing and Computational Linguistics, pp. 136-145, February 17-23, 2002, Mexico City.
Baroni, M., Murphy, B., Barbu, E., Poesio, M. (2010). Strudel: A Corpus-Based Semantic Model Based on Properties and Types. Cognitive Science, vol. 34, no. 2, pp. 222-254. DOI: 10.1111/j.1551-6709.2009.01068.x
Blei, D., Ng, A., & Jordan, M. (2003). Latent Dirichlet allocation. Journal of Machine Learning Research, 3, 993-1022.
Brants, T., and Franz, A. (2006). www.ldc.upenn.edu/Catalog/CatalogEntry.jsp?catalogId=LDC2006T13. Linguistic Data Consortium, Philadelphia.
Budanitsky, A., Hirst, G. (2001). Semantic distance in WordNet: an experimental, application-oriented evaluation of five measures. In Proceedings of the NAACL 2001 Workshop on WordNet and Other Lexical Resources, Pittsburgh, June 2001. http://www.seas.smu.edu/~rada/mwnw/papers/WNW-NAACL-101.pdf.gz
Collins, A.M., Loftus, E.F. (1975). A spreading-activation theory of semantic processing. Psychological Review, 1975 Nov, Vol. 82(6), 407-428.
Crestani, F. (1997). Application of Spreading Activation Techniques in Information Retrieval. Artificial Intelligence Review, 11, 453-482.
Deerwester, S., Dumais, S.T., Furnas, G.W., Landauer, T.K., Harshman, R. (1990). Indexing By Latent Semantic Analysis. Journal of the American Society for Information Science, 41, 391-407.
Devereux, B., Kelly, C., Korhonen, A. (2010). Using fMRI activation to conceptual stimuli to evaluate methods for extracting conceptual representations from corpora. Proceedings of the NAACL HLT 2010 First Workshop on Computational Neurolinguistics, June 2010, Los Angeles, USA. Association for Computational Linguistics, 70-78.
Fellbaum, C. (Ed.) (1998). WordNet: An Electronic Lexical Database. Cambridge, MA: MIT Press.
Hauk, O., Davis, M.H., Ford, M., Pulvermüller, F., Marslen-Wilson, W.D. (2006). The time course of visual word recognition as revealed by linear regression analysis of ERP data. NeuroImage, 2006 May 1; 30(4): 1383-400. Epub 2006 Feb 7.
Hirst, G. (1988). Resolving lexical ambiguity computationally with spreading activation and Polaroid Words. In Small, S.L., Cottrell, G.W., Tanenhaus, M.K. (editors), Lexical ambiguity resolution: Perspectives from psycholinguistics, neuropsychology, and artificial intelligence. San Mateo, CA: Morgan Kaufmann Publishers, 1988, 73-107.
Jelodar, A.B., Alizadeh, M., Khadivi, S. (2010). WordNet Based Features for Predicting Brain Activity associated with meanings of noun. Proceedings of the NAACL HLT 2010 First Workshop on Computational Neurolinguistics, June 2010, Los Angeles, USA. Association for Computational Linguistics, 18-26.
Jiang, J., and Conrath, D. (1997). Semantic similarity based on corpus statistics and lexical taxonomy. In Proceedings of International Conference on Research in Computational Linguistics, Taiwan, 19-33.
Just, M.A., Cherkassky, V.L., Aryal, S., Mitchell, T.M. (2010). A Neurosemantic Theory of Concrete Noun Representation Based on the Underlying Brain Codes. PLoS ONE 5(1): e8622. doi:10.1371/journal.pone.0008622
Karp, R.M. (1972). Reducibility among combinatorial problems. In Complexity of Computer Computations, Proc. Sympos. IBM Thomas J. Watson Res. Center, Yorktown Heights, NY (Ed. R.E. Miller and J.W. Thatcher). New York: Plenum, pp. 85-103.
Kelly, C., Devereux, B., Korhonen, A. (2010). Acquiring Human-like Feature-Based Conceptual Representations from Corpora. Proceedings of the NAACL HLT 2010 First Workshop on Computational Neurolinguistics, June 2010, Los Angeles, USA. Association for Computational Linguistics, 61-69.
Kriegeskorte, N., Mur, M., Bandettini, P.A. (2008). Representational similarity analysis - connecting the branches of systems neuroscience. Frontiers in Systems Neuroscience. doi:10.3389/neuro.06.004.2008
Kutas, M., Hillyard, S.A. (1984). Brain potentials during reading reflect word expectancy and semantic association. Nature, 307, 161-163.
Kutas, M., Federmeier, K.D. (2011). Thirty years and counting: Finding meaning in the N400 component of the event related brain potential (ERP). Annual Review of Psychology, 2011, 62, pp. 621-647.
Landauer, T.K., Dumais, S.T. (1997). A solution to Plato's problem: The Latent Semantic Analysis theory of the acquisition, induction, and representation of knowledge. Psychological Review, 104, 211-240.
Landauer, T.K., Foltz, P.W., Laham, D. (1998). Introduction to Latent Semantic Analysis. Discourse Processes, 25, 259-284.
Leacock, C., Chodorow, M. (1998). Combining local context and WordNet similarity for word sense identification. In Fellbaum (Ed.), WordNet: An Electronic Lexical Database. Cambridge, MA: MIT Press, pp. 265-283.
Leech, G., Rayson, P., Wilson, A. (2001). Word Frequencies in Written and Spoken English: based on the British National Corpus. Longman, London.
Lin, D. (1998). An information-theoretic definition of similarity. In Proc. 15th International Conf. on Machine Learning. Morgan Kaufmann, San Francisco, CA, pp. 296-304.
Luce, R.D. (1956). Semiorders and a theory of utility discrimination. Econometrica, 24, 178-191. MR0078632. http://www.jstor.org/stable/1905751
Matsunaga, T., Yonemori, C., Tomita, E., Muramatsu, M. (2009). Clique-based data mining for related genes in a biomedical database. BMC Bioinformatics, 2009 Jul 1; 10: 205.
Martin, Ph. (2003). Correction and Extension of WordNet 1.7. ICCS 2003, 11th International Conference on Conceptual Structures (© Springer Verlag, LNAI 2746, pp. 160-173), Dresden, Germany, July 21-25, 2003.
Miller, G.A., Nicely, P. (1955). An analysis of perceptual confusions among some English consonants. J. Acoust. Soc. Am. 27(2).
Miller, G.A., Leacock, C., Tengi, R., Bunker, R.T. (1993). A Semantic Concordance. In Proceedings of the 3 DARPA Workshop on Human Language Technology.
Miller, G.A. (1995). WordNet: A Lexical Database for English. Communications of the ACM, Vol. 38, No. 11, 39-41.
Mitchell, T.M., Shinkareva, S.V., Carlson, A., Chang, K.M., Malave, V.L., Mason, R.A., and Just, M.A. (2008). Predicting Human Brain Activity Associated with the Meanings of Nouns. Science, 320, 1191, May 30, 2008. DOI: 10.1126/science.1152876
Murphy, B., Baroni, M., & Poesio, M. (2009). EEG responds to conceptual stimuli and corpus semantics. In Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing, pages 619-627, Singapore, 6-7 August 2009. © 2009 ACL and AFNLP.
Murphy, B., & Poesio, M. (2010). Detecting semantic category in simultaneous EEG/MEG recordings. In First workshop on computational neurolinguistics, NAACL HLT 2010 (pp. 36-44). Los Angeles: Association for Computational Linguistics.
Murphy, B., Poesio, M., Bovolo, F., Bruzzone, L., Dalponte, M., Lakany, H. (2011). EEG decoding of semantic category reveals distributed representations for single concepts. Brain Lang. 2011 Apr; 117(1): 12-22. doi: 10.1016/j.bandl.2010.09.013
Pedersen, T., Patwardhan, S., Michelizzi, J. (2004). WordNet::Similarity - Measuring the Relatedness of Concepts. In Proceedings of Fifth Annual Meeting of the North American Chapter of the Association for Computational Linguistics (NAACL-2004), pp. 38-41, Boston, May 2004.
Pereira, F., Botvinick, M., Detre, G. (2010). Learning semantic features for fMRI data from definitional text. Proceedings of the NAACL HLT 2010 First Workshop on Computational Neurolinguistics, June 2010, Los Angeles, USA. Association for Computational Linguistics, 1-9. http://www.aclweb.org/anthology/W10-0601
Perreau-Guimaraes, M., Wong, D.K., Uy, E.T., Grosenick, L., Suppes, P. (2007). Single-trial classification of MEG recordings. IEEE Transactions on Biomedical Engineering, 54, 436-443.
Rabinovitch, I. (1977). The Scott-Suppes theorem on semiorders. J. Mathematical Psychology, 15(2), 209-212. MR0437404.
Resnik, P. (1995). Using information content to evaluate semantic similarity. In Proceedings of the 14th International Joint Conference on Artificial Intelligence, pages 448-453, Montreal.
Roach, B.J., Mathalon, D.H. (2008). Event-related EEG time-frequency analysis: an overview of measures and an analysis of early gamma band phase locking in schizophrenia. Schizophr. Bull. 2008 Sep; 34(5): 907-26. Epub 2008 Aug 6.
Scott, D., Suppes, P. (1958). Foundational aspects of theories of measurement. The Journal of Symbolic Logic, 23, 113-128. MR0115919.
Suppes, P. (1972). Axiomatic Set Theory. New York: Dover.
Suppes, P. (1974). The axiomatic method in the empirical sciences. In L. Henkin et al. (Eds.), Proceedings of the Tarski Symposium, Proceedings of Symposia in Pure Mathematics, 25. Providence, RI: American Mathematical Society, pp. 465-479.
Suppes, P., Lu, Z.-L., Han, B. (1997). Brain wave recognition of words. Proceedings of the National Academy of Sciences, 94, 14965-14969.
Suppes, P., Han, B., Lu, Z.-L. (1998). Brain-wave recognition of sentences. Proceedings of the National Academy of Sciences, 95, 15861-15866.
Suppes, P., Han, B., Epelboim, J., Lu, Z.-L. (1999a). Invariance between subjects of brain wave representations of language. Proceedings of the National Academy of Sciences, 96, 12953-12958.
Suppes, P., Han, B., Epelboim, J., Lu, Z.-L. (1999b). Invariance of brain-wave representations of simple visual images and their names. Proceedings of the National Academy of Sciences, 96, 14658-14663.
Suppes, P., Han, B. (2000). Brain-wave representation of words by superposition of a few sine waves. Proceedings of the National Academy of Sciences, 97, 8738-8743.
Suppes, P., Perreau-Guimaraes, M., Wong, D.K. (2009). Partial Orders of Similarity Differences Invariant Between EEG-Recorded Brain and Perceptual Representations of Language. Neural Computation, 21, 3228-3269.
Suppes, P., de Barros, J.A., Oas, G. (2012). Phase-Oscillator computations as neural models of stimulus-response conditioning and response selection. Journal of Mathematical Psychology, 56, 95-117.
Turney, P. (2001). Mining the Web for Synonyms: PMI-IR versus LSA on TOEFL. In L. De Raedt & P. Flach (Eds.), Proceedings of the Twelfth European Conference on Machine Learning (ECML-2001) (pp. 491-502). Freiburg, Germany.
Vassilieva, E., Pinto, G., de Barros, J.A., Suppes, P. (2011). Learning Pattern Recognition Through Quasi-Synchronization of Phase Oscillators. IEEE Transactions on Neural Networks, Vol. 22, No. 1, January 2011.
Vigliocco, G., Warren, J., Siri, S., Arciuli, J., Scott, S., Wise, R. (2006). The role of semantics and grammatical class in the neural representation of words. Cereb. Cortex, 2006 Dec; 16(12): 1790-6. Epub 2006 Jan 18.
Wong, D.K., Perreau-Guimaraes, M., Uy, E.T., Suppes, P. (2004). Classification of individual trials based on the best independent component of EEG-recorded sentences. Neurocomputing, 61, 479-484.
Wong, D.K., Uy, E.T., Perreau-Guimaraes, M., Yang, W., Suppes, P. (2006). Interpretation of perceptron weights as constructed time series for EEG classification. Neurocomputing, 70, 373-383.
Wong, D.K., Grosenick, L., Uy, E.T., Perreau-Guimaraes, M., Carvalhaes, C.G., Desain, P., Suppes, P. (2008). Quantifying inter-subject agreement in brain-imaging analyses. NeuroImage, 39, 1051-1063.
Woods, W. (1975). What's in a link: Foundations for semantic networks. In Bobrow, D., Collins, A. (eds.), Representation and understanding. New York: Academic Press, 1975, 35-82.
Wu, Z., Palmer, M. (1994). Verb Semantics and Lexical Selection. In Proceedings of the 32nd Annual Meeting of the Association for Computational Linguistics, Las Cruces, New Mexico, pp. 133-138.
Can we train a classifier of EEG data so that we can predict which word, sentence, or word within a sentence the participant is seeing or hearing?
1997: Brain-wave recognition of words
1998: Brain-wave recognition of sentences
Can we train the classifier on some participants and use it to make predictions for other participants?
1999: Invariance between subjects of brain wave representations of language
Can we train the classifier using visually presented words and use that classifier to make predictions for words presented auditorily (and vice versa)?
2004: Classification of individual trials based on the best independent component of EEG-recorded sentences
2005: Recognition of Words from the EEG Laplacian
2006: Multichannel classification of single EEG trials with independent component analysis
2007: Single-trial classification of MEG recordings
Can we train the classifier using words and use that classifier to make predictions for pictures depicting what the word refers to (and vice versa)?
1999: Invariance of brain-wave representations of simple visual images and their names
Answering yes to these questions establishes that we are recognizing the meaning of the word, or the idea or concept behind the word, and not, for instance, the sound of the word, its orthography, or its idiosyncratic use by one person.
EXPERIMENTAL SETUP for recognizing words within sentences
A computer presented 48 spoken sentences to each of 9 participants in 10 randomized blocks, with all 48 sentences in each block, so there were 480 trials for each participant.
The sentences were about the geography of Europe. Half were true and half false; half were positive and half negative.
The capital of Italy is Paris. (F)
Paris is not east of Berlin. (T)
Spain is west of Russia. (T)
Participants were asked to determine the truth or falsity of each statement while EEG recordings were made.
X ∈ {Berlin, London, Moscow, Paris, Rome, Warsaw, Madrid, Vienna, Athens}
Y ∈ {France, Germany, Italy, Poland, Russia, Austria, Greece, Spain}
W ∈ {the capital, the largest city}
Z ∈ {north, south, east, west}
Sentence forms:
W of Y is [not] X
X is [not] W of Y
X is [not] Z of X
Y is [not] Z of Y
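The four sentence forms above can be enumerated mechanically. The sketch below is only an illustration: the constant and function names are made up here, and it assumes that the two places in a "Z of" sentence are distinct. The 48 sentences used in the experiment would be a small selection from this space.

```python
from itertools import permutations, product

CITIES = ["Berlin", "London", "Moscow", "Paris", "Rome",
          "Warsaw", "Madrid", "Vienna", "Athens"]
COUNTRIES = ["France", "Germany", "Italy", "Poland",
             "Russia", "Austria", "Greece", "Spain"]
ROLES = ["the capital", "the largest city"]
DIRECTIONS = ["north", "south", "east", "west"]
NEGATIONS = ["", "not "]  # positive and negative variants


def sentence_forms():
    """Yield every sentence the four templates can produce."""
    # "W of Y is [not] X" and "X is [not] W of Y"
    for role, country, city, neg in product(ROLES, COUNTRIES, CITIES, NEGATIONS):
        yield f"{role.capitalize()} of {country} is {neg}{city}"
        yield f"{city} is {neg}{role} of {country}"
    # "X is [not] Z of X" and "Y is [not] Z of Y" (assumed: two distinct places)
    for places in (CITIES, COUNTRIES):
        for (a, b), direction, neg in product(
                permutations(places, 2), DIRECTIONS, NEGATIONS):
            yield f"{a} is {neg}{direction} of {b}"


sentences = list(sentence_forms())
print(len(sentences))
```

Truth values are not part of the templates; they would have to come from a separate table of geographic facts.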
From http://www2.le.ac.uk
LANGUAGE
Paris is north of Berlin
The capital of Germany is Berlin
Spain is east of France
London is not the capital of France
Vienna is east of Moscow
The largest city of Poland is Athens
BRAIN
Fifteen seconds of EEG data. From http://sccn.ucsd.edu
What our work does...
Compares the brain data and the language data and finds structural similarities between them.
HOW DO WE REPRESENT THE EEG DATA?
HOW DO WE REPRESENT THE LANGUAGE DATA?
HOW DO WE COMPARE THE TWO?
Berlin, London, Moscow, Paris, Rome, Warsaw, Madrid, Vienna, Athens, France, Germany, Italy, Poland, Russia, Austria, Greece, Spain, north, south, east, west
Paris is north of Berlin
Spain is east of France
The capital of Germany is Berlin
Paris
east of
Germany
Machine learning approach to the study of brain and language
For the 10 geography words London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia, we have 640 EEG data samples for each participant.
We want to classify these 640 samples into the correct 10 classes.
640 EEG samples for each participant:
Use 580 brain samples and their associated words to train the classifier.
Test the remaining 60 samples.
Do this many times, each time using a different set of training samples.
Find the average classification rate and make sure it is statistically significant.
Obtained classification rates in the range 25 to 29, with a mean classification rate of around 24.5 (p < 10E-10). Significantly higher than chance (10%).
We used a 5-fold linear discriminant model, with principal component analysis for blind source separation, to classify the segments of EEG data obtained from the individual trials.
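One simple way to check that a classification rate is above the 10% chance level is an exact binomial tail probability. This is a sketch under stated assumptions, not necessarily the test the authors used: the function name and the count of 15 correct out of 60 are hypothetical.

```python
from math import comb


def binomial_tail(n: int, k: int, p: float) -> float:
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k, n + 1))


N_TEST = 60      # test samples per split, as on the slide
CHANCE = 0.10    # 10 equally likely classes
correct = 15     # hypothetical number of correctly classified test samples

# Probability of doing at least this well by guessing alone.
p_value = binomial_tail(N_TEST, correct, CHANCE)
print(f"{correct}/{N_TEST} correct, p = {p_value:.2e}")
```

Pooling the test results of many splits makes n much larger, which is how very small p-values can arise even at modest classification rates.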
Beyond machine learning...
THEN look at the MIS-CLASSIFICATIONS and build a CONFUSION MATRIX
Classify 640 brain data samples $s_1, s_2, \ldots, s_{640}$ into 10 classes $\omega_1, \omega_2, \ldots, \omega_{10}$ of the finite set A.
$M = (m_{iq})$ is the confusion matrix for a given classification, where $m_{iq}$ is the number of test samples from class $\omega_i$ classified as belonging to class $\omega_q$.
ωi \ ωq   London  Moscow   Paris   north   south    east    west Germany  Poland  Russia
London         8      14      11       7       6       3       3      10       8      10
Moscow         8      24      14       6       2       6       4       4       7      15
Paris          6      18      12       4       3       5       8       6      11       7
north          4       2       5      11       9       7      10       1       8       3
south          1       4       3      14      14       9      11       4       7       3
east           4       3       5       9      12      12       7       1       4       3
west           4       3       2      12      13      11      10       2       8       5
Germany        2       2       3       2       2       1       0       9      11       8
Poland         2       3       4       0       4       1       4       9       9       4
Russia         7       7       5       1       5       1       1       8       4      11
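A confusion matrix of this kind is just a tally of (actual, predicted) pairs. The sketch below uses a handful of invented test results to show the bookkeeping; the function name and the sample labels are illustrative, not taken from the study.

```python
from collections import Counter

WORDS = ["London", "Moscow", "Paris", "north", "south",
         "east", "west", "Germany", "Poland", "Russia"]


def confusion_matrix(actual, predicted, classes):
    """Return M where M[i][q] counts samples of class i classified as class q."""
    counts = Counter(zip(actual, predicted))
    return [[counts[(a, p)] for p in classes] for a in classes]


# Hypothetical test results: actual word and the classifier's prediction.
actual = ["London", "London", "Paris", "north", "north", "east"]
predicted = ["London", "Moscow", "London", "north", "south", "east"]

M = confusion_matrix(actual, predicted, WORDS)
print(M[WORDS.index("London")])  # the London row
```

The diagonal holds the correct classifications; everything off the diagonal is a misclassification.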
The relative frequencies $m_{iq} / \sum_{q'} m_{iq'}$ are an N-by-N estimate for the conditional probability densities, designated by the matrix $P = (p_{iq})$, that a randomly chosen test sample from class $\omega_i$ will be classified as belonging to class $\omega_q$.
Conditional probability density estimates from the classification of brain wave data for London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia.
Rows: $\omega_i$. Columns: $\omega_q$ (predicted, i.e. "classified as").
ωi \ ωq   London  Moscow   Paris   north   south    east    west Germany  Poland  Russia
London      0.25   0.163   0.138   0.075   0.038    0.05   0.063   0.088   0.075   0.063
Moscow     0.144   0.333   0.133   0.033   0.011   0.056   0.056   0.089   0.067   0.078
Paris      0.175   0.188   0.125   0.038   0.038   0.038    0.05     0.1   0.138   0.113
north      0.067   0.017   0.017   0.283   0.167    0.15     0.1     0.1   0.067   0.033
south      0.014   0.014   0.071   0.114   0.271   0.171   0.157   0.029   0.086   0.071
east        0.05       0       0   0.133    0.15   0.383    0.15    0.05       0   0.083
west       0.057   0.043       0   0.171   0.114   0.143   0.229   0.086     0.1   0.057
Germany    0.075   0.175     0.1    0.05   0.125   0.025    0.05   0.275   0.025     0.1
Poland      0.15     0.1   0.125    0.05    0.05       0   0.025    0.15    0.25     0.1
Russia      0.16    0.04    0.08    0.04     0.1    0.02    0.12     0.1    0.08    0.26
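Each row of such a table is a row of the confusion matrix divided by its row total. A minimal sketch, with an illustrative function name; the counts are the London row of the confusion matrix shown earlier, and the probability table above appears to come from a different classification run, so the values are not expected to match it.

```python
def row_normalize(matrix):
    """Turn counts m_iq into estimates p_iq = m_iq / sum over q' of m_iq'."""
    probabilities = []
    for row in matrix:
        total = sum(row)
        # An empty row (no test samples of that class) stays all zeros.
        probabilities.append([count / total if total else 0.0 for count in row])
    return probabilities


london_counts = [8, 14, 11, 7, 6, 3, 3, 10, 8, 10]  # 80 test samples in total
P = row_normalize([london_counts])
print([round(p, 3) for p in P[0]])
```

Because each row is normalized separately, the resulting matrix is generally not symmetric.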
[Figure: similarity tree, axis ticks 0.2 to 0.6. Leaf order: London, Paris, Moscow, Germany, Poland, Russia, north, south, west, east.]
Hierarchical cluster tree (similarity tree) computed from the conditional probability density estimates for the classification of 640 brain wave samples for London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia.
LANGUAGE
WordNet: a lexical database of English organized around sets of cognitive synonyms, restricted to the senses that are relevant to the geography sentences.
Latent Semantic Analysis (LSA): a statistical method to extract measures of word similarity from selected sets of documents such as novels, newspaper articles, and textbooks.
BRAIN
Conditional probability density estimates from the classification of the brain wave data.
WORDNET
WordNet is a human-annotated lexical database of English in which nouns, verbs, adjectives, and adverbs are grouped into sets of cognitive synonyms (called synsets), each synset expressing a distinct concept. See Miller (1995), Fellbaum (1998), and http://wordnet.princeton.edu.
The synsets are related to each other primarily through the hypernymy and hyponymy relations for nouns. Other relations in WordNet are part-whole (holonym), member of (meronym), has instance, and so on.
Hyponymy, often referred to as the is-a relation in computational discussions, is defined as follows: a concept represented by a lexical item $L_i$ is said to be a hyponym of the concept represented by a lexical item $L_k$ if native speakers of English accept sentences of the form "An $L_i$ is a kind of $L_k$". Conversely, $L_k$ is the hypernym of $L_i$. A hypernym is therefore a more general concept and a hyponym a more specific concept.
WORDNET
Multiple word senses, organized into sets of cognitive synonyms called "synsets":
(n) Paris#1, City of Light#1, French capital#1, capital of France#1 (the capital and largest city of France and international center of culture and commerce)
(n) Paris#2, genus Paris#1 (sometimes placed in subfamily Trilliaceae)
(n) Paris#3 ((Greek mythology) the prince of Troy who abducted Helen from her husband Menelaus and provoked the Trojan War)
(n) Paris#4 (a town in northeastern Texas)
(n) city#1, metropolis#1, urban center#1 (a large and densely populated urban area; may include several independent administrative districts) [relations: direct hyponym, full hyponym, part meronym, has instance]
(n) city#2 (an incorporated administrative district established by state charter)
(n) city#3, metropolis#2 (people living in a large densely populated municipality)
(n) east#1, due east#1, eastward#1, E#3 (the cardinal compass point that is at 90 degrees)
(n) East#2, Orient#1 (the countries of Asia)
(n) East#3, eastern United States#1 (the region of the United States lying to the north of the Ohio River and to the east of the Mississippi River)
(n) east#4 (the direction corresponding to the eastward cardinal compass point)
(n) east#5 (a location in the eastern part of a country, region, or city)
LATENT SEMANTIC ANALYSIS (LSA)
LSA is a statistical technique for extracting from large collections of documents a measure of how similar two words are to each other in terms of patterns of their co-occurrences within those documents. See Deerwester et al. (1990), Landauer and Dumais (1997), Landauer et al. (1998).
The underlying idea is that if, for each word, you take into account all the contexts in which it does and does not appear, you get for all the words a set of mutual constraints that represent how similar any two words are to each other.
The similarity judgments produced by latent semantic analysis have been shown to correspond to some extent to human judgments of similarity. After training on about 2000 pages of English text, it scored as well as average test-takers on the synonym portion of the Test of English as a Foreign Language. After training on a psychology textbook, it achieved a passing score on a multiple-choice exam.
We used the application at http://lsa.colorado.edu to compute similarity matrices in term space for our set of words. The computation was based on ~38,000 college-level texts (novels, newspaper articles, ...). A maximum of 300 factors was permitted in the analysis.
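Term-to-term similarity in a vector space is commonly measured as the cosine between term vectors. The sketch below shows only that last step on invented co-occurrence counts; real LSA first reduces the term-by-document matrix with a truncated singular value decomposition (up to 300 factors here), which is omitted. All names and numbers are hypothetical.

```python
from math import sqrt


def cosine(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Hypothetical counts of each term in four documents (not real corpus data).
term_vectors = {
    "north": [3, 0, 2, 0],
    "south": [2, 0, 3, 0],
    "Paris": [0, 4, 0, 1],
}

similarity = {
    (a, b): cosine(term_vectors[a], term_vectors[b])
    for a in term_vectors
    for b in term_vectors
}
print(round(similarity[("north", "south")], 3))
```

Words that occur in the same documents get a cosine near 1; words that never share a document get 0.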
word      London  Moscow   Paris   north   south    east    west Germany  Poland  Russia
London         1    0.17    0.37    0.13    0.12    0.12    0.14    0.17    0.16    0.16
Moscow      0.17       1    0.18    0.13    0.08    0.22    0.14    0.37    0.65    0.69
Paris       0.37    0.18       1     0.1    0.04    0.08    0.09     0.5    0.31    0.36
north       0.13    0.13     0.1       1    0.89     0.6    0.61    0.07    0.09    0.14
south       0.12    0.08    0.04    0.89       1     0.5    0.55    0.02    0.05    0.05
east        0.12    0.22    0.08     0.6     0.5       1    0.85    0.22    0.29     0.3
west        0.14    0.14    0.09    0.61    0.55    0.85       1    0.24    0.26    0.23
Germany     0.17    0.37     0.5    0.07    0.02    0.22    0.24       1    0.85    0.81
Poland      0.16    0.65    0.31    0.09    0.05    0.29    0.26    0.85       1    0.87
Russia      0.16    0.69    0.36    0.14    0.05     0.3    0.23    0.81    0.87       1
Semantic similarity matrix derived from LSA for the set of words London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia.
[Figure: similarity tree, axis ticks 0.1 to 0.6. Leaf order: east, west, north, south, London, Paris, Moscow, Germany, Poland, Russia.]
Hierarchical cluster tree computed from the pair-wise Latent Semantic Analysis (LSA) scores of similarity for London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia, based on ~38,000 college-level texts (novels, newspaper articles, ...).
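A similarity tree can be built by repeatedly merging the two closest clusters. The slides do not say which linkage rule was used, so the sketch below assumes average linkage with distance taken as 1 minus similarity; the function name is illustrative. The input is a six-word subset of the LSA matrix above.

```python
LABELS = ["north", "south", "east", "west", "Poland", "Russia"]
SIM = [  # LSA similarities for the six words, copied from the matrix above
    [1.00, 0.89, 0.60, 0.61, 0.09, 0.14],
    [0.89, 1.00, 0.50, 0.55, 0.05, 0.05],
    [0.60, 0.50, 1.00, 0.85, 0.29, 0.30],
    [0.61, 0.55, 0.85, 1.00, 0.26, 0.23],
    [0.09, 0.05, 0.29, 0.26, 1.00, 0.87],
    [0.14, 0.05, 0.30, 0.23, 0.87, 1.00],
]


def average_linkage(labels, similarity):
    """Return the merge history as (cluster_a, cluster_b, distance) tuples."""
    index = {word: i for i, word in enumerate(labels)}
    clusters = [(word,) for word in labels]

    def distance(a, b):
        # Mean of (1 - similarity) over all cross-cluster word pairs.
        pairs = [(x, y) for x in a for y in b]
        return sum(1.0 - similarity[index[x]][index[y]] for x, y in pairs) / len(pairs)

    merges = []
    while len(clusters) > 1:
        a, b = min(
            ((a, b) for i, a in enumerate(clusters) for b in clusters[i + 1:]),
            key=lambda pair: distance(*pair),
        )
        merges.append((a, b, distance(a, b)))
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
    return merges


merges = average_linkage(LABELS, SIM)
for a, b, d in merges:
    print(a, "+", b, f"at distance {d:.3f}")
```

On this subset the compass directions pair up first and join each other before they join the two countries, which agrees with the grouping in the LSA tree.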
LSA provides a straightforward measure of similarity between words.
For WORDNET several different measures of similarity have been devised, e.g.:
Path length between synsets.
Information content: a corpus-based measure of the specificity of a concept, measured in terms of the frequency of occurrence of the concept in the corpus. The corpus is the human-annotated, sense-tagged corpus SemCor (Miller et al. 1993), which links every word in the Brown Corpus to its appropriate WordNet sense. Scaled various ways.
Vector-space models: these work by forming second-order co-occurrence vectors from the WordNet definitions of concepts, known as glosses.
We used five measures in our computations of similarity and took the average score using each of the relevant senses.
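As an illustration of a path-based measure, the Wu and Palmer (1994) score is 2 * depth(LCS) / (depth(a) + depth(b)), where LCS is the deepest concept that subsumes both. The sketch below applies it to a tiny invented taxonomy, not the real WordNet hierarchy; the node names and depths are assumptions made for the example.

```python
# Hypothetical toy taxonomy, child -> parent (not actual WordNet structure).
PARENT = {
    "location": "entity", "direction": "entity",
    "city": "location", "country": "location",
    "London": "city", "Paris": "city", "Germany": "country",
    "north": "direction", "east": "direction",
}


def path_to_root(node):
    """Return the chain from a node up to the root, node first."""
    path = [node]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path


def depth(node):
    """Depth counted in nodes, so the root has depth 1."""
    return len(path_to_root(node))


def wup(a, b):
    """Wu-Palmer similarity: 2 * depth(LCS) / (depth(a) + depth(b))."""
    ancestors_a = set(path_to_root(a))
    # Walking up from b, the first shared node is the deepest common subsumer.
    lcs = next(node for node in path_to_root(b) if node in ancestors_a)
    return 2 * depth(lcs) / (depth(a) + depth(b))


print(wup("London", "Paris"), wup("London", "Germany"), wup("London", "north"))
```

Two cities share a deep ancestor and score high; a city and a compass direction meet only at the root and score low.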
Semantic similarity matrix derived from WordNet for the set of words London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia, using senses relevant to the geography of Europe and five measures of similarity: wup (path length), lin and jcn (information content), and gv and pgv (vector space measures).
word      London  Moscow   Paris   north   south    east    west Germany  Poland  Russia
London         1   0.396   0.466   0.106   0.103   0.076   0.078   0.322   0.299   0.303
Moscow     0.396       1   0.393   0.095   0.094   0.062    0.07   0.286   0.281   0.288
Paris      0.466   0.393       1   0.106   0.104   0.074   0.077   0.327   0.308   0.307
north      0.106   0.095   0.106       1   0.228   0.179    0.21   0.123   0.132   0.111
south      0.103   0.094   0.104   0.228       1   0.172   0.212   0.115   0.107   0.109
east       0.076   0.062   0.074   0.179   0.172       1   0.216   0.093    0.08   0.077
west       0.078    0.07   0.077    0.21   0.212   0.216       1   0.087   0.082   0.083
Germany    0.322   0.286   0.327   0.123   0.115   0.093   0.087       1   0.589   0.409
Poland     0.299   0.281   0.308   0.132   0.107    0.08   0.082   0.589       1   0.403
Russia     0.303   0.288   0.307   0.111   0.109   0.077   0.083   0.409   0.403       1
[Figure: similarity tree, axis ticks 0.2 to 0.6. Leaf order: Germany, Poland, Russia, London, Paris, Moscow, north, south, west, east.]
Hierarchical cluster tree computed from pairwise WordNet-based semantic similarity scores for London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia, restricted to senses related to the geography of Europe.
BACK TO THE BRAIN...
London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia
Now let's see how to compare the EEG data and the language data...
Some of the similarity trees show remarkable congruence between the brain and semantic data.
Where exactly does that congruence lie?
Can we devise a quantitative measure of the nature and strength of that congruence?
[Figure: two similarity trees, each with axis ticks 0.2 to 0.6, labeled LANGUAGE DATA and BRAIN DATA. Leaf order of one tree: London, Paris, Moscow, Germany, Poland, Russia, north, south, west, east. Leaf order of the other: Germany, Poland, Russia, London, Paris, Moscow, north, south, west, east.]
WordNet-based semantic similarities and EEG conditional probability estimates for London relative to London (L), Moscow (M), Paris (P), north (n), south (s), east (e), west (w), Germany (G), Poland (Po), and Russia (R).
The Spearman rank correlation for the two sequences in the figure is 0.99, with one-sided significance of 1.84E-10.
BRAIN DATA
For each word $\omega$ we compute from the conditional probability density estimates a ternary relation R such that $R(\omega, \omega_1, \omega_2)$ if and only if, with respect to word $\omega$, the conditional probability for word $\omega_1$ is smaller than the conditional probability for word $\omega_2$; that is, if and only if $\omega_1$'s similarity difference with $\omega$ is smaller than $\omega_2$'s similarity difference with $\omega$.
R is an ordinal relation of similarity differences, a partial order that is irreflexive, asymmetric, and transitive.
LANGUAGE DATA
For each word $\omega$ we compute from the semantic similarity matrix a ternary relation R such that $R(\omega, \omega_1, \omega_2)$ if and only if the similarity difference of $\omega_1$ with $\omega$ is smaller than the similarity difference of $\omega_2$ with $\omega$; that is, $\omega_1$ is more similar to $\omega$ than is $\omega_2$.
R is an ordinal relation of similarity differences, a partial order that is irreflexive, asymmetric, and transitive.
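For a fixed word ω, the relation can be stored as the set of ordered pairs (ω1, ω2) for which ω1 is more similar to ω than ω2 is. The sketch below builds that set for London from the WordNet similarity matrix and checks the three stated properties. The function names are illustrative, and leaving the reference word out of its own ordering is an assumption.

```python
from itertools import permutations


def similarity_order(scores, reference):
    """Pairs (a, b) such that a is more similar to `reference` than b is."""
    others = [word for word in scores if word != reference]
    return {(a, b) for a, b in permutations(others, 2) if scores[a] > scores[b]}


def is_strict_partial_order(relation):
    """Check irreflexivity, asymmetry, and transitivity of a set of pairs."""
    irreflexive = all(a != b for a, b in relation)
    asymmetric = all((b, a) not in relation for a, b in relation)
    transitive = all((a, d) in relation
                     for a, b in relation for c, d in relation if b == c)
    return irreflexive and asymmetric and transitive


# WordNet-based similarities to London, from the London row of the matrix.
london = {"London": 1.0, "Moscow": 0.396, "Paris": 0.466, "north": 0.106,
          "south": 0.103, "east": 0.076, "west": 0.078,
          "Germany": 0.322, "Poland": 0.299, "Russia": 0.303}

R = similarity_order(london, "London")
print(("Paris", "Moscow") in R, is_strict_partial_order(R))
```

Tied scores produce no pair in either direction, which is why the result is in general a partial rather than a total order.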
London
Language data      Brain data
London   1.000     London   0.275
Paris    0.466     Paris    0.133
Moscow   0.396     Moscow   0.108
Germany  0.322     Germany  0.075
Russia   0.303     north    0.042
Poland   0.299     Russia   0.033
north    0.106     Poland   0.025
south    0.103     east     0.008
west     0.078     south    0.000
east     0.076     west     0.000
Partial orders for London derived from the WordNet semantic similarities of Table 2 and the conditional probability estimates for the brain data of Table 5.
[Figure: the two partial orders drawn as diagrams. Recovered node labels: Poland, north, south, west, east (language data) and north, Poland, east, south, west (brain data).]
Following the approach described in Suppes (1974) for the axiomatization of the theory of differences in utility preference or the theory of differences in psychological intensity.
Suppes, P. (1974). The axiomatic method in the empirical sciences. In L. Henkin et al. (Eds.), Proceedings of the Tarski Symposium, Proceedings of Symposia in Pure Mathematics, 25. Providence, RI: American Mathematical Society, pp. 465-479.
Brain data: the relational structure (A, R) constructed from R and the finite set A of classes $\omega_1, \omega_2, \ldots, \omega_N$, together with the N partial orders constructed from the N-by-N estimate for the conditional probability densities.
Language data: the relational structure (A, R) constructed from R and the finite set A of classes $\omega_1, \omega_2, \ldots, \omega_N$, together with the N partial orders constructed from the N-by-N similarity matrix.
For each $\omega_i$ we compare the partial order of the brain data with the partial order of the language data using Spearman's rank correlation coefficient, which we interpret in the usual way to determine whether or not we have a statistically significant correlation.
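Spearman's coefficient is the Pearson correlation of the two rank sequences, with tied values sharing their average rank. The sketch below applies it to the London language and brain values listed earlier, leaving out London itself. The function names are illustrative, the significance test is omitted, and the result is not meant to reproduce the figures quoted in the slides, which may rest on different data.

```python
def average_ranks(values):
    """Rank values from 1 upward; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1  # mean of ranks i+1 .. j+1
        i = j + 1
    return ranks


def spearman(x, y):
    """Pearson correlation of the ranks (undefined if either input is constant)."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


WORDS = ["Paris", "Moscow", "Germany", "Russia", "Poland",
         "north", "south", "west", "east"]
language = [0.466, 0.396, 0.322, 0.303, 0.299, 0.106, 0.103, 0.078, 0.076]
brain = [0.133, 0.108, 0.075, 0.033, 0.025, 0.042, 0.000, 0.000, 0.008]

rho = spearman(language, brain)
print(f"Spearman rho = {rho:.3f}")
```

Only the ordering of the values matters, so the two data sets can be compared even though one holds similarity scores and the other probabilities.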
For those instances in which the brain and language partial orders are significantly correlated, we find the partial order that is invariant with respect to the brain and language data.
Here are two more examples.
[Figure: invariant partial order. Node labels as laid out: London, Moscow / Paris / north / south, east / west / Germany, Poland / Russia.]
Significant Invariance - Paris - Spearman 0.88 (p = 6.6795e-004)
[Figure: invariant partial order. Node labels as laid out: London, Moscow, Paris / north / south, east / west / Germany, Poland, Russia.]
Significant Invariance - Paris - Spearman 0.90 (p = 3.8716e-004)
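The slides do not spell out how the invariant partial order is constructed. One plausible reading, assumed here, is that it keeps exactly those ordered pairs on which the brain order and the language order agree, that is, the intersection of the two relations. The sketch uses the London values listed earlier; the function names are illustrative.

```python
def order_from_scores(scores):
    """Pairs (a, b) such that a has a strictly higher score than b."""
    return {(a, b) for a in scores for b in scores if scores[a] > scores[b]}


def invariant_order(order_a, order_b):
    """Keep only the ordered pairs present in both relations (assumed reading)."""
    return order_a & order_b


language = {"Paris": 0.466, "Moscow": 0.396, "Germany": 0.322,
            "Russia": 0.303, "Poland": 0.299, "north": 0.106,
            "south": 0.103, "west": 0.078, "east": 0.076}
brain = {"Paris": 0.133, "Moscow": 0.108, "Germany": 0.075,
         "Russia": 0.033, "Poland": 0.025, "north": 0.042,
         "south": 0.000, "west": 0.000, "east": 0.008}

invariant = invariant_order(order_from_scores(language), order_from_scores(brain))
print(len(invariant), "ordered pairs agree")
```

Pairs on which the two data sets disagree, such as north and Russia, or on which one of them is tied, such as south and west in the brain data, drop out. The intersection of two strict partial orders is again a strict partial order.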
[Figure: similarity tree, axis ticks 0.2 to 0.6. Leaf order: London, Paris, Moscow, Germany, Poland, Russia, north, south, west, east.]
Another hierarchical cluster tree (similarity tree) computed from the conditional probability density estimates for the classification of 640 brain wave samples for London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia.
Every single-trial classification produces its own conditional probability density estimates, giving rise to its own similarity tree...
[Figure: similarity tree for the brain data, axis ticks 0.1 to 0.5. Leaf order: north, east, south, west, London, Paris, Germany, Moscow, Russia, Poland.]
1. We compute M (= 30) single-trial classifications of the data (640 data samples for our 10 words) using random resampling with replacement.
2. For each classification we find, for each word, the partial orders of the brain and language data that are significantly correlated.
3. For each of these highly correlated partial order pairs, we find the partial order invariant with respect to both.
We performed 60 classifications; that is, we recomputed the classifications of the brain data using random resampling with replacement.
For half of these 60 classifications we compared the brain data to the WordNet data, and for the other half we compared the brain data to the LSA data.
And we plotted the results.
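Random resampling with replacement can be sketched as drawing, for each of the M runs, as many sample indices as there are samples, with repeats allowed. How each resample was divided into training and test data is not stated in the slides, so that part is left out; the names and the fixed seed are illustrative.

```python
import random

N_SAMPLES = 640   # EEG samples per participant
N_RUNS = 30       # M classifications per comparison


def resample_indices(n_samples, rng):
    """Draw n_samples indices uniformly at random, with replacement."""
    return [rng.randrange(n_samples) for _ in range(n_samples)]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
runs = [resample_indices(N_SAMPLES, rng) for _ in range(N_RUNS)]

# Each run reuses some samples and leaves others out.
print(len(runs), "runs;", len(set(runs[0])), "distinct samples in the first run")
```

Each resample feeds one classification, which in turn yields one confusion matrix and one set of partial orders.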
[Figure: bar chart. Title: Partial Orders of Similarity Differences Invariant between EEG-recorded Brain Data and Semantic Representations of Language. Vertical axis: Significant Invariant Partial Orders (0 to 300). Horizontal axis: Test Participants (S10, S24, S25, S27, S16, S26, S12, S13, S18). Series: WordNet, LSA.]
A higher number of significant structural similarities was found between the brain data and the WordNet data than between the brain data and the LSA data.
This stronger structural similarity between the brain data and the WordNet-derived data supports the contention that, during language comprehension for the complex cognitive task of assessing truth or falsity, the representation of words in the brain has a WordNet-like quality.
This stronger structural similarity is the first evidence we have seen of WordNet-like representations in the brain.
(1997) A solution to Platos problem The Latent Semantic Analysis theory of the acquisition induction and representation of knowledge Psychological Review 104 211-240 Landauer TK Foltz PW Laham D (1998) Introduction to Latent Semantic Analysis Discourse Processes 25 259-284
43 copy Colleen E Crangle 2013
Leacock C Chodorow M (1998) Combining local context and WordNet similarity for word sense identification In Fellbaum (Ed) WordNet An Electronic Lexical Database Cambridge MA MIT Press pp 265-283 Leech G Rayson P Wilson A (2001) Word Frequencies in Written and Spoken English based on the British National Corpus Longman London Lin D (1998) An information-theoretic definition of similarity In Proc 15th International Conf on Machine Learning Morgan Kaufmann San Francisco CA p 296--304 Luce RD (1956) Semiorders and a theory of utility discrimination Econometrica 24 178ndash191 MR0078632 httpwwwjstororgstable1905751 Matsunaga T Yonemori C Tomita E Muramatsu M (2009) Clique-based data mining for related genes in a biomedical database BMC Bioinformatics 2009 Jul 110205 Martin Ph (2003) Correction and Extension of WordNet 17 ICCS 2003 11th International Conference on Conceptual Structures (copy Springer Verlag LNAI 2746 pp 160-173) Dresden Germany July 21-25 2003 Miller GA Nicely P (1955) An analysis of perceptual confusions among some English consonants JAcoustSocAm 272 Miller GA Leacock C Tengi R Bunker RT (1993) A Semantic Concordance In Proceedings of the 3 DARPA Workshop on Human Language Technology Miller GA (1995) WordNet A Lexical Database for English Communications of the ACM Vol 38 No 11 39-41 Mitchell TM SV Shinkareva A Carlson KM Chang VL Malave R A Mason and M A Just (2008) Predicting Human Brain Activity Associated with the Meanings of Nouns Science 320 1191 May 30 2008 DOI 101126science1152876 Murphy B Baroni M amp Poesio M (2009) EEG responds to conceptual stimuli and corpus semantics In Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing pages 619ndash627 Singapore 6-7 August 2009 c 2009 ACL and AFNLP Murphy B amp Poesio M (2010) Detecting semantic category in simultaneous EEGMEG recordings In First workshop on computational neurolinguistics NAACL HLT 2010 (pp 36ndash44) Los Angeles Association for Computational 
Linguistics Murphy B Poesio M Bovolo F Bruzzone L Dalponte M Lakany H (2011) EEG decoding of semantic category reveals distributed representations for single concepts Brain Lang 2011 Apr117(1)12-22 doi 101016jbandl201009013
44 copy Colleen E Crangle 2013
Pedersen T Patwardhan S Michelizzi J (2004) WordNetSimilarity - Measuring the Relatedness of Concepts In Proceedings of Fifth Annual Meeting of the North American Chapter of the Association for Computational Linguistics (NAACL-2004) pp 38-41 Boston May 2004 Pereira F M Botvinick G Detre (2010) Learning semantic features for fMRI data from definitional text Proceedings of the NAACL HLT 2010 First Workshop on Computational Neurolinguistics June 2010Los Angeles USA Association for Computational Linguistics 1mdash9 httpwwwaclweborganthologyW10-0601 Perreau-Guimaraes M Wong DK Uy ET Grosenick L Suppes P (2007) Single-trial classification of MEG recordings IEEE Transactions on Biomedical Engineering 54 436ndash443 Rabinovitch I (1977) The Scott-Suppes theorem on semiorders J Mathematical Psychology 15 (2) 209ndash212 MR0437404 Resnik P (1995) Using information content to evaluate semantic similarity In Proceedings of the 14th International Joint Conference on Artificial Intelligence pages 448-453 Montreal Roach BJ Mathalon DH (2008) Event-related EEG time-frequency analysis an overview of measures and an analysis of early gamma band phase locking in schizophrenia Schizophr Bull 2008 Sep34(5)907-26 Epub 2008 Aug 6 Scott D Suppes P (1958) Foundational aspects of theories of measurement The Journal of Symbolic Logic 23 113ndash128 MR0115919 Suppes P (1972) Axiomatic Set Theory New York Dover Suppes P (1974) The axiomatic method in the empirical sciences L Henkin et al (Eds) Proceedings of the Tarski Symposium Proceedings of Symposia in Pure Mathematics 25 Providence RI American Mathematical Society pp 465-479 Suppes P Lu Z-L Han B (1997) Brain wave recognition of words Proceedings of the National Academy of Sciences 95 14965ndash14969 Suppes P Han B Lu Z-L (1998) Brain-wave recognition of sentences Proceedings of the National Academy of Sciences 95 15861ndash15866 Suppes P Han B Epelboim J Lu Z-L (1999a) Invariance between subjects of brain wave representations of language 
Proceedings of the National Academy of Sciences 96 12953ndash12958 Suppes P Han B Epelboim J Lu Z-L (1999b) Invariance of brain-wave representations of simple visual images and their names Proceedings of the National Academy of Sciences 96 14658-14663 Suppes P Han B (2000) Brain-wave representation of words by superposition of a few sine waves Proceedings of the National Academy of Sciences 97 8738ndash8743 Suppes P Perreau-Guimaraes M Wong DK (2009) Partial Orders of Similarity Differences Invariant Between EEG-Recorded Brain and Perceptual Representations of Language Neural Computation 21 3228ndash3269
45 copy Colleen E Crangle 2013
Suppes P de Barros JA Oas G (2012) Phase-Oscillator computations as neural models of stimulus-response conditioning and response selection Journal of Mathematical Psychology 56 95-117 Turney P (2001) Mining the Web for Synonyms PMI-IR versus LSA on TOEFL In L De Raedt amp P Flach (Eds) Proceedings of the Twelfth European Conference on Machine Learning (ECML-2001) (pp 491-502) Freiburg Germany Vassilieva E Pinto G de Barros JA Suppes P (2011) Learning Pattern Recognition Through Quasi-Synchronization of Phase Oscillators IEEE Transactions on Neural Networks Vol 22 No 1 January 2011 Vigliocco G Warren J Siri S Arciuli J Scott S Wise R (2006) The role of semantics and grammatical class in the neural representation of words Cereb Cortex 2006 Dec16(12)1790-6 Epub 2006 Jan 18 Wong DK Perreau-Guimaraes M Uy ET Suppes P (2004) Classification of individual trials based on the best independent component of EEG-recorded sentences Neurocomputing 61 479-484 Wong DK Uy ET Perreau-Guimaraes M Yang W Suppes P (2006) Interpretation of perceptron weights as constructed time series for EEG classification Neurocomputing 70 373-383 Wong DK Grosenick L Uy ET Perreau-Guimaraes M Carvalhaes CG Desain P Suppes P (2008) Quantifying inter-subject agreement in brain-imaging analyses NeuroImage 39 1051ndash1063 Woods W (1975) Whatrsquos in a link Foundations for semantic networks In Bobrow D Collins A eds Representation and understanding New York Academic Press 197535-82 Wu Z Palmer M (1994) Verb Semantics and Lexical Selection In Proceedings of the 32nd Annual Meeting of the Association for Computational Linguistics Las Cruces New Mexico pp 133--138
46 copy Colleen E Crangle 2013
Can we train a classifier of EEG data so that we can predict which word, sentence, or word within a sentence the participant is seeing or hearing?
1997: Brain-wave recognition of words
1998: Brain-wave recognition of sentences
Can we train the classifier on some participants and use it to make predictions for other participants?
1999: Invariance between subjects of brain wave representations of language
Can we train the classifier using visually presented words and use that classifier to make predictions for words presented auditorily (and vice versa)?
2004: Classification of individual trials based on the best independent component of EEG-recorded sentences
2005: Recognition of Words from the EEG Laplacian
2006: Multichannel classification of single EEG trials with independent component analysis
2007: Single-trial classification of MEG recordings
Can we train the classifier using words and use that classifier to make predictions for pictures depicting what the word refers to (and vice versa)?
1999: Invariance of brain-wave representations of simple visual images and their names
Answering yes to these questions establishes that we are recognizing the meaning of the word, or the idea or concept behind the word, and not, for instance, the sound of the word, its orthography, or its idiosyncratic use by one person.
EXPERIMENTAL SETUP for recognizing words within sentences
A computer presented 48 spoken sentences to each of 9 participants in 10 randomized blocks, with all 48 sentences in each block, so there were 480 trials for each participant.
The sentences were about the geography of Europe. Half were true, half false; half positive, half negative.
The capital of Italy is Paris. (F)
Paris is not east of Berlin. (T)
Spain is west of Russia. (T)
Participants were asked to determine the truth or falsity of each statement while EEG recordings were made.
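X ∈ {Berlin, London, Moscow, Paris, Rome, Warsaw, Madrid, Vienna, Athens}
Y ∈ {France, Germany, Italy, Poland, Russia, Austria, Greece, Spain}
W ∈ {the capital, the largest city}
Z ∈ {north, south, east, west}
Sentence forms:
W of Y is [not] X
X is [not] W of Y
X is [not] Z of X
Y is [not] Z of Y
In the last two forms the two occurrences of X (or of Y) stand for two different cities (or countries), as in "Paris is not east of Berlin".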
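The sentence forms above can be enumerated mechanically. The following is a minimal sketch, not the authors' stimulus generator: the function name `all_sentences` is hypothetical, and it produces every sentence the four forms allow, whereas the experiment used a selected set of 48.

```python
from itertools import permutations, product

CITIES = ["Berlin", "London", "Moscow", "Paris", "Rome",
          "Warsaw", "Madrid", "Vienna", "Athens"]            # X
COUNTRIES = ["France", "Germany", "Italy", "Poland",
             "Russia", "Austria", "Greece", "Spain"]         # Y
ROLES = ["the capital", "the largest city"]                  # W
DIRECTIONS = ["north", "south", "east", "west"]              # Z
NEGATION = ["", "not "]                                      # the optional [not]


def all_sentences():
    """Enumerate every sentence licensed by the four forms (hypothetical helper)."""
    out = []
    # "W of Y is [not] X" and "X is [not] W of Y"
    for w, y, x, neg in product(ROLES, COUNTRIES, CITIES, NEGATION):
        out.append(f"{w.capitalize()} of {y} is {neg}{x}")
        out.append(f"{x} is {neg}{w} of {y}")
    # "X is [not] Z of X" and "Y is [not] Z of Y", with two different places
    for places in (CITIES, COUNTRIES):
        for (a, b), z, neg in product(permutations(places, 2), DIRECTIONS, NEGATION):
            out.append(f"{a} is {neg}{z} of {b}")
    return out


sentences = all_sentences()
print(len(sentences))
```

Truth values are not assigned here; that would need geographic facts that the slides do not list.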
[Figure: example sentences (LANGUAGE) shown with an image from http://www2.le.ac.uk, alongside fifteen seconds of EEG data (BRAIN) from https://sccn.ucsd.edu]
Paris is north of Berlin
The capital of Germany is Berlin
Spain is east of France
London is not the capital of France
Vienna is east of Moscow
The largest city of Poland is Athens
What our work does…
Compares the brain data and the language data and finds structural similarities between them.
HOW DO WE REPRESENT THE EEG DATA?
HOW DO WE REPRESENT THE LANGUAGE DATA?
HOW DO WE COMPARE THE TWO?
Berlin, London, Moscow, Paris, Rome, Warsaw, Madrid, Vienna, Athens, France, Germany, Italy, Poland, Russia, Austria, Greece, Spain, north, south, east, west
Paris is north of Berlin
Spain is east of France
The capital of Germany is Berlin
Paris, east of, Germany
Machine learning approach to the study of brain and language
For the 10 geography words (London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia) we have 640 EEG data samples for each participant.
We want to classify these 640 samples into the correct 10 classes.
Use 580 brain samples and their associated words to train the classifier.
Test the remaining 60 samples.
Do this many times, each time using a different set of training samples.
Find the average classification rate and make sure it is statistically significant.
Obtained classification rates in the range 25% to 29%, with a mean classification rate of around 24.5% (p < 10E-10). Significantly higher than chance (10%).
We used a 5-fold linear discriminant model, with principal component analysis for blind source separation, to classify the segments of EEG data obtained from the individual trials.
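To give a feel for what "significantly higher than chance" means with 10 classes, here is a minimal sketch of a one-sided binomial test on a single 60-sample test set. This is an assumption for illustration only: the slides do not say which significance test was used, and the reported p-value comes from the authors' own analysis over many repetitions, not from this calculation.

```python
from math import comb


def binomial_tail(successes, trials, chance):
    """P(X >= successes) when X ~ Binomial(trials, chance)."""
    return sum(
        comb(trials, k) * chance**k * (1 - chance) ** (trials - k)
        for k in range(successes, trials + 1)
    )


# Illustrative case: 15 of 60 test samples correct (25%) against 10% chance.
p_value = binomial_tail(15, 60, 0.10)
print(f"one-sided p = {p_value:.2e}")
```

Averaging over many train/test splits, as the slide describes, tightens the estimate of the classification rate well beyond what one 60-sample split can show.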
Beyond machine learning…
THEN look at the MIS-CLASSIFICATIONS and build a CONFUSION MATRIX.
Classify 640 brain data samples s1, s2, …, s640 into 10 classes ω1, ω2, …, ω10 of the finite set A.
M = (m_iq) is the confusion matrix for a given classification, where m_iq is the number of test samples from class ωi classified as belonging to class ωq.
Confusion matrix (rows: true class ωi; columns: classified as ωq)
ωi \ ωq   London  Moscow   Paris   north   south    east    west Germany  Poland  Russia
London         8      14      11       7       6       3       3      10       8      10
Moscow         8      24      14       6       2       6       4       4       7      15
Paris          6      18      12       4       3       5       8       6      11       7
north          4       2       5      11       9       7      10       1       8       3
south          1       4       3      14      14       9      11       4       7       3
east           4       3       5       9      12      12       7       1       4       3
west           4       3       2      12      13      11      10       2       8       5
Germany        2       2       3       2       2       1       0       9      11       8
Poland         2       3       4       0       4       1       4       9       9       4
Russia         7       7       5       1       5       1       1       8       4      11
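As a minimal sketch of how such a matrix is used, the snippet below takes the counts from the confusion matrix above, divides each row by its total to get relative frequencies, and computes the overall classification rate for this one matrix. The names `row_normalize`, `P` and `accuracy` are illustrative, not taken from the talk.

```python
WORDS = ["London", "Moscow", "Paris", "north", "south",
         "east", "west", "Germany", "Poland", "Russia"]

# Rows: true class; columns: class assigned by the classifier.
M = [
    [8, 14, 11, 7, 6, 3, 3, 10, 8, 10],
    [8, 24, 14, 6, 2, 6, 4, 4, 7, 15],
    [6, 18, 12, 4, 3, 5, 8, 6, 11, 7],
    [4, 2, 5, 11, 9, 7, 10, 1, 8, 3],
    [1, 4, 3, 14, 14, 9, 11, 4, 7, 3],
    [4, 3, 5, 9, 12, 12, 7, 1, 4, 3],
    [4, 3, 2, 12, 13, 11, 10, 2, 8, 5],
    [2, 2, 3, 2, 2, 1, 0, 9, 11, 8],
    [2, 3, 4, 0, 4, 1, 4, 9, 9, 4],
    [7, 7, 5, 1, 5, 1, 1, 8, 4, 11],
]


def row_normalize(matrix):
    """Turn each row of counts into relative frequencies that sum to 1."""
    return [[count / sum(row) for count in row] for row in matrix]


P = row_normalize(M)
# Fraction of all samples that landed on the diagonal (correctly classified).
accuracy = sum(M[i][i] for i in range(len(M))) / sum(map(sum, M))

print(dict(zip(WORDS, (round(p, 3) for p in P[0]))))
print(f"overall classification rate: {accuracy:.4f}")
```

Each classification run yields its own matrix, so the probabilities printed here need not match tables computed from other runs.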
The relative frequencies $m_{iq} / \sum_{q} m_{iq}$ are an N-by-N estimate for the conditional probability densities, designated by the matrix P = (p_iq), that a randomly chosen test sample from class ωi will be classified as belonging to class ωq.
Conditional probability density estimates from the classification of brain wave data for London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia (rows: true class ωi; columns: ωq, predicted, i.e. classified as)
ωi \ ωq   London  Moscow   Paris   north   south    east    west Germany  Poland  Russia
London     0.250   0.163   0.138   0.075   0.038   0.050   0.063   0.088   0.075   0.063
Moscow     0.144   0.333   0.133   0.033   0.011   0.056   0.056   0.089   0.067   0.078
Paris      0.175   0.188   0.125   0.038   0.038   0.038   0.050   0.100   0.138   0.113
north      0.067   0.017   0.017   0.283   0.167   0.150   0.100   0.100   0.067   0.033
south      0.014   0.014   0.071   0.114   0.271   0.171   0.157   0.029   0.086   0.071
east       0.050   0.000   0.000   0.133   0.150   0.383   0.150   0.050   0.000   0.083
west       0.057   0.043   0.000   0.171   0.114   0.143   0.229   0.086   0.100   0.057
Germany    0.075   0.175   0.100   0.050   0.125   0.025   0.050   0.275   0.025   0.100
Poland     0.150   0.100   0.125   0.050   0.050   0.000   0.025   0.150   0.250   0.100
Russia     0.160   0.040   0.080   0.040   0.100   0.020   0.120   0.100   0.080   0.260
[Figure: hierarchical cluster tree; horizontal axis from 0.2 to 0.6; leaves in order: London, Paris, Moscow, Germany, Poland, Russia, north, south, west, east]
Hierarchical cluster tree (similarity tree) computed from the conditional probability density estimates for the classification of 640 brain wave samples for London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia.
WordNet: a lexical database of English organized around sets of cognitive synonyms, restricted to the senses that are relevant to the geography sentences.
Latent Semantic Analysis (LSA): a statistical method to extract measures of word similarity from selected sets of documents such as novels, newspaper articles, textbooks.
LANGUAGE / BRAIN
WORDNET
WordNet is a human-annotated lexical database of English in which nouns, verbs, adjectives, and adverbs are grouped into sets of cognitive synonyms (called synsets), each synset expressing a distinct concept.
See Miller (1995), Fellbaum (1998), and http://wordnet.princeton.edu
The synsets are related to each other primarily through the hypernymy and hyponymy relations for nouns. Other relations in WordNet are part-whole (holonym), member of (meronym), has instance, and so on.
Hyponymy, often referred to as the is-a relation in computational discussions, is defined as follows: a concept represented by a lexical item Li is said to be a hyponym of the concept represented by a lexical item Lk if native speakers of English accept sentences of the form "An Li is a kind of Lk". Conversely, Lk is the hypernym of Li. A hypernym is therefore a more general concept and a hyponym a more specific concept.
WORDNET
Multiple word senses, organized into sets of cognitive synonyms called "synsets"
(n) Paris#1, City of Light#1, French capital#1, capital of France#1 (the capital and largest city of France; and international center of culture and commerce)
(n) Paris#2, genus Paris#1 (sometimes placed in subfamily Trilliaceae)
(n) Paris#3 ((Greek mythology) the prince of Troy who abducted Helen from her husband Menelaus and provoked the Trojan War)
(n) Paris#4 (a town in northeastern Texas)
(n) city#1, metropolis#1, urban center#1 (a large and densely populated urban area; may include several independent administrative districts) [links: direct hyponym, full hyponym, part meronym, has instance]
(n) city#2 (an incorporated administrative district established by state charter)
(n) city#3, metropolis#2 (people living in a large densely populated municipality)
(n) east#1, due east#1, eastward#1, E#3 (the cardinal compass point that is at 90 degrees)
(n) East#2, Orient#1 (the countries of Asia)
(n) East#3, eastern United States#1 (the region of the United States lying to the north of the Ohio River and to the east of the Mississippi River)
(n) east#4 (the direction corresponding to the eastward cardinal compass point)
(n) east#5 (a location in the eastern part of a country, region, or city)
LATENT SEMANTIC ANALYSIS (LSA)
LSA is a statistical technique for extracting, from large collections of documents, a measure of how similar two words are to each other in terms of patterns of their co-occurrences within those documents. See Deerwester et al. 1990, Landauer and Dumais 1997, Landauer et al. 1998.
The underlying idea is that if, for each word, you take into account all the contexts in which it does and does not appear, you get for all the words a set of mutual constraints that represent how similar any two words are to each other.
The similarity judgments produced by latent semantic analysis have been shown to correspond to some extent to human judgments of similarity. After training on about 2000 pages of English text, it scored as well as average test-takers on the synonym portion of the Test of English as a Foreign Language. After training on a psychology textbook, it achieved a passing score on a multiple-choice exam.
We used the application at http://lsa.colorado.edu to compute similarity matrices in term space for our set of words. The computation was based on ~38000 college-level texts (novels, newspaper articles…). A maximum of 300 factors was permitted in the analysis.
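The comparison step behind such similarity scores can be illustrated with cosine similarity between term vectors. This is a sketch under loud assumptions: the vectors in `TERM_VECTORS` are invented toy counts, and the dimension-reduction step (the singular value decomposition that gives LSA its "latent" factors) is left out entirely, so this is not the computation performed by the application named above.

```python
from math import sqrt


def cosine(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Hypothetical counts of each word in four made-up documents.
TERM_VECTORS = {
    "north": [3, 0, 2, 0],
    "south": [2, 0, 3, 0],
    "Paris": [0, 4, 0, 1],
}

print(cosine(TERM_VECTORS["north"], TERM_VECTORS["south"]))  # words sharing contexts
print(cosine(TERM_VECTORS["north"], TERM_VECTORS["Paris"]))  # words with no shared context
```

In full LSA the same cosine is taken between vectors in the reduced factor space, which lets words that never co-occur directly still come out as similar.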
Semantic similarity matrix derived from LSA for the set of words London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia
word      London  Moscow   Paris   north   south    east    west Germany  Poland  Russia
London      1.00    0.17    0.37    0.13    0.12    0.12    0.14    0.17    0.16    0.16
Moscow      0.17    1.00    0.18    0.13    0.08    0.22    0.14    0.37    0.65    0.69
Paris       0.37    0.18    1.00    0.10    0.04    0.08    0.09    0.50    0.31    0.36
north       0.13    0.13    0.10    1.00    0.89    0.60    0.61    0.07    0.09    0.14
south       0.12    0.08    0.04    0.89    1.00    0.50    0.55    0.02    0.05    0.05
east        0.12    0.22    0.08    0.60    0.50    1.00    0.85    0.22    0.29    0.30
west        0.14    0.14    0.09    0.61    0.55    0.85    1.00    0.24    0.26    0.23
Germany     0.17    0.37    0.50    0.07    0.02    0.22    0.24    1.00    0.85    0.81
Poland      0.16    0.65    0.31    0.09    0.05    0.29    0.26    0.85    1.00    0.87
Russia      0.16    0.69    0.36    0.14    0.05    0.30    0.23    0.81    0.87    1.00
[Figure: hierarchical cluster tree; horizontal axis from 0.1 to 0.6; leaves in order: east, west, north, south, London, Paris, Moscow, Germany, Poland, Russia]
Hierarchical cluster tree computed from the pair-wise Latent Semantic Analysis (LSA) scores of similarity for London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia, based on ~38000 college-level texts (novels, newspaper articles…).
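A similarity tree of this kind can be sketched with plain agglomerative clustering on the LSA similarity scores. The slides do not state which linkage or distance was used, so average linkage and distance = 1 - similarity are assumptions here, and the resulting tree may differ in detail from the one in the figure.

```python
WORDS = ["London", "Moscow", "Paris", "north", "south",
         "east", "west", "Germany", "Poland", "Russia"]

# LSA similarity scores for the ten geography words.
LSA = [
    [1.00, 0.17, 0.37, 0.13, 0.12, 0.12, 0.14, 0.17, 0.16, 0.16],
    [0.17, 1.00, 0.18, 0.13, 0.08, 0.22, 0.14, 0.37, 0.65, 0.69],
    [0.37, 0.18, 1.00, 0.10, 0.04, 0.08, 0.09, 0.50, 0.31, 0.36],
    [0.13, 0.13, 0.10, 1.00, 0.89, 0.60, 0.61, 0.07, 0.09, 0.14],
    [0.12, 0.08, 0.04, 0.89, 1.00, 0.50, 0.55, 0.02, 0.05, 0.05],
    [0.12, 0.22, 0.08, 0.60, 0.50, 1.00, 0.85, 0.22, 0.29, 0.30],
    [0.14, 0.14, 0.09, 0.61, 0.55, 0.85, 1.00, 0.24, 0.26, 0.23],
    [0.17, 0.37, 0.50, 0.07, 0.02, 0.22, 0.24, 1.00, 0.85, 0.81],
    [0.16, 0.65, 0.31, 0.09, 0.05, 0.29, 0.26, 0.85, 1.00, 0.87],
    [0.16, 0.69, 0.36, 0.14, 0.05, 0.30, 0.23, 0.81, 0.87, 1.00],
]


def average_linkage(sim, labels):
    """Merge the two closest clusters repeatedly; return (members, distance) per merge."""
    clusters = [[i] for i in range(len(labels))]
    merges = []
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Average of (1 - similarity) over all cross-cluster pairs.
                pairs = [(i, j) for i in clusters[a] for j in clusters[b]]
                dist = sum(1 - sim[i][j] for i, j in pairs) / len(pairs)
                if best is None or dist < best[0]:
                    best = (dist, a, b)
        dist, a, b = best
        merged = clusters[a] + clusters[b]
        merges.append((sorted(labels[i] for i in merged), round(dist, 3)))
        clusters[a] = merged
        del clusters[b]
    return merges


for members, dist in average_linkage(LSA, WORDS):
    print(dist, members)
```

The earliest merges pair the words with the highest LSA scores, which is why the compass directions and the countries form tight groups.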
LSA provides a straightforward measure of similarity between words.
For WORDNET, several different measures of similarity have been devised, e.g.:
Path length between synsets.
Information content: a corpus-based measure of the specificity of a concept, measured in terms of the frequency of occurrence of the concept in the corpus, here the human-annotated sense-tagged corpus SemCor (Miller et al. 1993), which links every word in the Brown Corpus to its appropriate WordNet sense. Scaled various ways.
Vector-space models: these work by forming second-order co-occurrence vectors from the WordNet definitions of concepts, known as glosses.
We used five measures in our computations of similarity and took the average score, using each of the relevant senses.
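A path-based measure of the kind listed above can be sketched with the Wu and Palmer (1994) formula, 2 x depth(lcs) / (depth(a) + depth(b)), where lcs is the deepest ancestor the two concepts share. The taxonomy in `PARENT` is a hypothetical toy hierarchy invented for illustration; it is not WordNet's actual hypernym structure, and real scores depend on the senses chosen.

```python
# Hypothetical toy is-a hierarchy: child -> parent (None marks the root).
PARENT = {
    "entity": None,
    "location": "entity",
    "region": "location",
    "direction": "location",
    "city": "region",
    "country": "region",
    "Paris": "city",
    "London": "city",
    "Germany": "country",
    "east": "direction",
}


def path_to_root(node):
    """List of nodes from `node` up to the root, inclusive."""
    path = [node]
    while PARENT[path[-1]] is not None:
        path.append(PARENT[path[-1]])
    return path


def depth(node):
    """Depth counted in nodes, so the root has depth 1."""
    return len(path_to_root(node))


def lowest_common_subsumer(a, b):
    """Deepest node that is an ancestor of (or equal to) both a and b."""
    ancestors = set(path_to_root(a))
    for node in path_to_root(b):
        if node in ancestors:
            return node
    raise ValueError("nodes do not share a root")


def wup(a, b):
    """Wu-Palmer similarity on the toy hierarchy."""
    return 2 * depth(lowest_common_subsumer(a, b)) / (depth(a) + depth(b))


print(wup("Paris", "London"), wup("Paris", "Germany"), wup("Paris", "east"))
```

Two cities share a deeper common ancestor than a city and a compass direction do, so they score higher, which mirrors the block structure of the WordNet-derived similarity matrix.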
Semantic similarity matrix derived from WordNet for the set of words London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia, using senses relevant to the geography of Europe and five measures of similarity: wup (path length), lin and jcn (information content), and gv and pgv (vector space measures)
word      London  Moscow   Paris   north   south    east    west Germany  Poland  Russia
London     1.000   0.396   0.466   0.106   0.103   0.076   0.078   0.322   0.299   0.303
Moscow     0.396   1.000   0.393   0.095   0.094   0.062   0.070   0.286   0.281   0.288
Paris      0.466   0.393   1.000   0.106   0.104   0.074   0.077   0.327   0.308   0.307
north      0.106   0.095   0.106   1.000   0.228   0.179   0.210   0.123   0.132   0.111
south      0.103   0.094   0.104   0.228   1.000   0.172   0.212   0.115   0.107   0.109
east       0.076   0.062   0.074   0.179   0.172   1.000   0.216   0.093   0.080   0.077
west       0.078   0.070   0.077   0.210   0.212   0.216   1.000   0.087   0.082   0.083
Germany    0.322   0.286   0.327   0.123   0.115   0.093   0.087   1.000   0.589   0.409
Poland     0.299   0.281   0.308   0.132   0.107   0.080   0.082   0.589   1.000   0.403
Russia     0.303   0.288   0.307   0.111   0.109   0.077   0.083   0.409   0.403   1.000
[Figure: hierarchical cluster tree; horizontal axis from 0.2 to 0.6; leaves in order: Germany, Poland, Russia, London, Paris, Moscow, north, south, west, east]
Hierarchical cluster tree computed from pairwise WordNet-based semantic similarity scores for London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia, restricted to senses related to the geography of Europe.
BACK TO THE BRAIN…
London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia
Now let's see how to compare the EEG data and the language data…
Some of the similarity trees show remarkable congruence between the brain and semantic data.
Where exactly does that congruence lie?
Can we devise a quantitative measure of the nature and strength of that congruence?
[Figure, BRAIN DATA: hierarchical cluster tree; axis from 0.2 to 0.6; leaves in order: London, Paris, Moscow, Germany, Poland, Russia, north, south, west, east]
[Figure, LANGUAGE DATA: hierarchical cluster tree; axis from 0.2 to 0.6; leaves in order: Germany, Poland, Russia, London, Paris, Moscow, north, south, west, east]
[Figure: WordNet-based semantic similarities and EEG conditional probability estimates for London relative to London (L), Moscow (M), Paris (P), north (n), south (s), east (e), west (w), Germany (G), Poland (Po), and Russia (R)]
The Spearman rank correlation for the two sequences in the figure is 0.99, with one-sided significance of 1.84E-10.
BRAIN DATA
For each word ω we compute, from the conditional probability density estimates, a ternary relation R such that R(ω, ω1, ω2) if and only if, with respect to word ω, the conditional probability for word ω1 is smaller than the conditional probability for word ω2; that is, if and only if ω1's similarity difference with ω is smaller than ω2's similarity difference with ω.
R is an ordinal relation of similarity differences, a partial order that is irreflexive, asymmetric, and transitive.
LANGUAGE DATA
For each word ω we compute, from the semantic similarity matrix, a ternary relation R such that R(ω, ω1, ω2) if and only if the similarity difference of ω1 with ω is smaller than the similarity difference of ω2 with ω; that is, ω1 is more similar to ω than is ω2.
R is an ordinal relation of similarity differences, a partial order that is irreflexive, asymmetric, and transitive.
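For a fixed reference word, the relation R reduces to a set of ordered pairs. The sketch below builds it for London from the WordNet-based similarity scores, following the language-data definition (ω1 precedes ω2 when ω1 is more similar to the reference word), and checks the three stated properties. The function name `similarity_order` is illustrative.

```python
from itertools import permutations

# WordNet-based similarity of each word to the reference word London.
LONDON_SCORES = {
    "Moscow": 0.396, "Paris": 0.466, "north": 0.106, "south": 0.103,
    "east": 0.076, "west": 0.078, "Germany": 0.322, "Poland": 0.299,
    "Russia": 0.303,
}


def similarity_order(scores):
    """Pairs (w1, w2) such that w1 is more similar to the reference word than w2."""
    return {(w1, w2) for w1, w2 in permutations(scores, 2)
            if scores[w1] > scores[w2]}


R = similarity_order(LONDON_SCORES)

irreflexive = all(a != b for a, b in R)
asymmetric = all((b, a) not in R for a, b in R)
transitive = all((a, d) in R for a, b in R for c, d in R if b == c)
print(len(R), irreflexive, asymmetric, transitive)
```

Words with equal scores would simply be left unordered relative to each other, which is what makes the result a partial rather than a total order.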
London
Language data        Brain data
London   1.000       London   0.275
Paris    0.466       Paris    0.133
Moscow   0.396       Moscow   0.108
Germany  0.322       Germany  0.075
Russia   0.303       north    0.042
Poland   0.299       Russia   0.033
north    0.106       Poland   0.025
south    0.103       east     0.008
west     0.078       south    0.000
east     0.076       west     0.000
Partial orders for London derived from the WordNet semantic similarities of Table 2 and the conditional probability estimates for the brain data of Table 5.
[Figure: diagrams of the two partial orders; the lower nodes read Poland, north, south, west, east (language data) and north, Poland, east, south, west (brain data)]
Following the approach described in Suppes (1974) for the axiomatization of the theory of differences in utility preference or the theory of differences in psychological intensity.
Suppes P (1974) The axiomatic method in the empirical sciences. In L Henkin et al (Eds), Proceedings of the Tarski Symposium, Proceedings of Symposia in Pure Mathematics 25. Providence RI: American Mathematical Society, pp 465-479.
Brain data: the relational structure (A, R) constructed from R and the finite set A of classes ω1, ω2, …, ωN, together with the N partial orders constructed from the N-by-N estimate for the conditional probability densities.
Language data: the relational structure (A, R) constructed from R and the finite set A of classes ω1, ω2, …, ωN, together with the N partial orders constructed from the N-by-N similarity matrix.
For each ωi we compare the partial order of the brain data with the partial order of the language data using Spearman's rank correlation coefficient, which we interpret in the usual way to determine if we have a statistically significant correlation or not.
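The comparison step can be sketched with a small stdlib implementation of Spearman's rank correlation that averages tied ranks. The two lists hold the language and brain values for London shown with its partial orders, aligned word by word. The significance test is omitted, and the helper names are illustrative rather than the authors' code.

```python
WORDS = ["London", "Paris", "Moscow", "Germany", "Russia",
         "Poland", "north", "south", "west", "east"]
LANGUAGE = [1.000, 0.466, 0.396, 0.322, 0.303, 0.299, 0.106, 0.103, 0.078, 0.076]
BRAIN = [0.275, 0.133, 0.108, 0.075, 0.033, 0.025, 0.042, 0.000, 0.000, 0.008]


def ranks(values):
    """Ascending ranks starting at 1; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        average_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            result[order[k]] = average_rank
        start = end + 1
    return result


def spearman(x, y):
    """Pearson correlation of the two rank sequences."""
    rx, ry = ranks(x), ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


print(round(spearman(LANGUAGE, BRAIN), 3))
```

A coefficient near 1 means the two data sets order the other words almost identically around London, even though the raw numbers are on different scales.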
For those instances in which the brain and language partial orders are significantly correlated, we find the partial order that is invariant with respect to the brain and language data.
Here are two more examples:
[Figure: invariant partial order over London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia. Significant Invariance - Paris - Spearman 0.88 (p = 6.6795e-004)]
[Figure: invariant partial order over London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia. Significant Invariance - Paris - Spearman 0.90 (p = 3.8716e-004)]
Brain data
[Figure: hierarchical cluster tree; axis from 0.2 to 0.6; leaves in order: London, Paris, Moscow, Germany, Poland, Russia, north, south, west, east]
Another hierarchical cluster tree (similarity tree) computed from the conditional probability density estimates for the classification of 640 brain wave samples for London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia.
Every single-trial classification produces its own conditional probability density estimates, giving rise to its own similarity tree…
[Figure: hierarchical cluster tree; axis from 0.1 to 0.5; leaves in order: north, east, south, west, London, Paris, Germany, Moscow, Russia, Poland]
1. We compute M (=30) single-trial classifications of the data (640 data samples for our 10 words) using random resampling with replacement.
2. For each classification we find, for each word, the partial orders of the brain and language data that are significantly correlated.
3. For each of these highly correlated partial order pairs we find the partial order invariant with respect to both.
We performed 60 classifications; that is, we recomputed the classifications of the brain data using random resampling with replacement.
For half of these 60 classifications we compared the brain data to the WordNet data, and for the other half we compared the brain data to the LSA data.
And we plotted the results.
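The resampling step in the procedure above can be sketched as follows. Only the drawing of sample indices with replacement is shown; the slides do not say exactly which samples are redrawn for each run or how the classifier is refit, so the function `resample_with_replacement`, the fixed seed and the way runs are split between WordNet and LSA are assumptions for illustration.

```python
import random

N_SAMPLES = 640   # EEG samples per participant for the 10 words
N_RUNS = 60       # total classifications described in the text


def resample_with_replacement(n_samples, rng):
    """Draw n_samples indices uniformly at random, allowing repeats."""
    return [rng.randrange(n_samples) for _ in range(n_samples)]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
runs = [resample_with_replacement(N_SAMPLES, rng) for _ in range(N_RUNS)]

# Half of the runs are compared with WordNet, the other half with LSA.
wordnet_runs, lsa_runs = runs[: N_RUNS // 2], runs[N_RUNS // 2:]
print(len(wordnet_runs), len(lsa_runs), len(set(runs[0])))
```

Because indices repeat, each run sees a slightly different data set, which is why every run produces its own confusion matrix and its own set of partial orders.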
[Figure: bar chart titled "Partial Orders of Similarity Differences Invariant between EEG-recorded Brain Data and Semantic Representations of Language". Vertical axis: Significant Invariant Partial Orders (0 to 300). Horizontal axis: Test Participants (S10, S24, S25, S27, S16, S26, S12, S13, S18). Series: WordNet, LSA]
A higher number of significant structural similarities was found between the brain data and the WordNet data than the LSA data.
This stronger structural similarity between the brain data and the WordNet-derived data supports the contention that, during language comprehension for the complex cognitive task of assessing truth or falsity, the representation of words in the brain has a WordNet-like quality.
This stronger structural similarity is the first evidence we have seen of WordNet-like representations in the brain.
Banerjee S and Pedersen T (2003) Extended Gloss Overlaps as a Measure of Semantic Relatedness. In Proceedings of the Eighteenth International Joint Conference on Artificial Intelligence, pp 805-810, August 9-15 2003, Acapulco, Mexico.
Banerjee S and Pedersen T (2002) An Adapted Lesk Algorithm for Word Sense Disambiguation using WordNet. In Proceedings of the Third International Conference on Intelligent Text Processing and Computational Linguistics, pp 136-145, February 17-23 2002, Mexico City.
Baroni M, Murphy B, Barbu E, Poesio M (2010) Strudel: A Corpus-Based Semantic Model Based on Properties and Types. Cognitive Science 34(2), pp 222-254. DOI 10.1111/j.1551-6709.2009.01068.x
Blei D, Ng A & Jordan M (2003) Latent Dirichlet allocation. Journal of Machine Learning Research 3, 993-1022.
Brants T and Franz A (2006) www.ldc.upenn.edu/Catalog/CatalogEntry.jsp?catalogId=LDC2006T13. Linguistic Data Consortium, Philadelphia.
Budanitsky A, Hirst G (2001) Semantic distance in WordNet: an experimental, application-oriented evaluation of five measures. In Proceedings of the NAACL 2001 Workshop on WordNet and Other Lexical Resources, Pittsburgh, June 2001. http://www.seas.smu.edu/~rada/mwnw/papers/WNW-NAACL-101.pdf.gz
Collins AM, Loftus EF (1975) A spreading-activation theory of semantic processing. Psychological Review 82(6), 407-428.
Crestani F (1997) Application of Spreading Activation Techniques in Information Retrieval. Artificial Intelligence Review 11, 453-482.
Deerwester S, Dumais ST, Furnas GW, Landauer TK, Harshman R (1990) Indexing By Latent Semantic Analysis. Journal of the American Society for Information Science 41, 391-407.
Devereux B, Kelly C, Korhonen A (2010) Using fMRI activation to conceptual stimuli to evaluate methods for extracting conceptual representations from corpora. Proceedings of the NAACL HLT 2010 First Workshop on Computational Neurolinguistics, June 2010, Los Angeles, USA. Association for Computational Linguistics, 70-78.
Fellbaum C (Ed) (1998) WordNet: An Electronic Lexical Database. Cambridge MA: MIT Press.
Hauk O, Davis MH, Ford M, Pulvermüller F, Marslen-Wilson WD (2006) The time course of visual word recognition as revealed by linear regression analysis of ERP data. Neuroimage 2006 May 1;30(4):1383-400. Epub 2006 Feb 7.
Hirst G (1988) Resolving lexical ambiguity computationally with spreading activation and Polaroid Words. In Small SL, Cottrell GW, Tanenhaus MK (editors), Lexical ambiguity resolution: Perspectives from psycholinguistics, neuropsychology, and artificial intelligence. San Mateo CA: Morgan Kaufmann Publishers, 1988, 73-107.
Jelodar AB, Alizadeh M, Khadivi S (2010) WordNet Based Features for Predicting Brain Activity associated with meanings of noun. Proceedings of the NAACL HLT 2010 First Workshop on Computational Neurolinguistics, June 2010, Los Angeles, USA. Association for Computational Linguistics, 18-26.
Jiang J and Conrath D (1997) Semantic similarity based on corpus statistics and lexical taxonomy. In Proceedings of International Conference on Research in Computational Linguistics, Taiwan, 19-33.
Just MA, Cherkassky VL, Aryal S, Mitchell TM (2010) A Neurosemantic Theory of Concrete Noun Representation Based on the Underlying Brain Codes. PLoS ONE 5(1): e8622. doi:10.1371/journal.pone.0008622
Karp RM (1972) Reducibility among combinatorial problems. In Complexity of Computer Computations, Proc. Sympos. IBM Thomas J. Watson Res. Center, Yorktown Heights NY (Ed. RE Miller and JW Thatcher). New York: Plenum, pp 85-103.
Kelly C, Devereux B, Korhonen A (2010) Acquiring Human-like Feature-Based Conceptual Representations from Corpora. Proceedings of the NAACL HLT 2010 First Workshop on Computational Neurolinguistics, June 2010, Los Angeles, USA. Association for Computational Linguistics, 61-69.
Kriegeskorte N, Mur M, Bandettini PA (2008) Representational similarity analysis - connecting the branches of systems neuroscience. Frontiers in Systems Neuroscience. doi:10.3389/neuro.06.004.2008
Kutas M, Hillyard SA (1984) Brain potentials during reading reflect word expectancy and semantic association. Nature 307, 161-163.
Kutas M, Federmeier KD (2011) Thirty years and counting: Finding meaning in the N400 component of the event related brain potential (ERP). Annual Review of Psychology 62, pp 621-647.
Landauer TK, Dumais ST
(1997) A solution to Platos problem The Latent Semantic Analysis theory of the acquisition induction and representation of knowledge Psychological Review 104 211-240 Landauer TK Foltz PW Laham D (1998) Introduction to Latent Semantic Analysis Discourse Processes 25 259-284
43 copy Colleen E Crangle 2013
Leacock C Chodorow M (1998) Combining local context and WordNet similarity for word sense identification In Fellbaum (Ed) WordNet An Electronic Lexical Database Cambridge MA MIT Press pp 265-283 Leech G Rayson P Wilson A (2001) Word Frequencies in Written and Spoken English based on the British National Corpus Longman London Lin D (1998) An information-theoretic definition of similarity In Proc 15th International Conf on Machine Learning Morgan Kaufmann San Francisco CA p 296--304 Luce RD (1956) Semiorders and a theory of utility discrimination Econometrica 24 178ndash191 MR0078632 httpwwwjstororgstable1905751 Matsunaga T Yonemori C Tomita E Muramatsu M (2009) Clique-based data mining for related genes in a biomedical database BMC Bioinformatics 2009 Jul 110205 Martin Ph (2003) Correction and Extension of WordNet 17 ICCS 2003 11th International Conference on Conceptual Structures (copy Springer Verlag LNAI 2746 pp 160-173) Dresden Germany July 21-25 2003 Miller GA Nicely P (1955) An analysis of perceptual confusions among some English consonants JAcoustSocAm 272 Miller GA Leacock C Tengi R Bunker RT (1993) A Semantic Concordance In Proceedings of the 3 DARPA Workshop on Human Language Technology Miller GA (1995) WordNet A Lexical Database for English Communications of the ACM Vol 38 No 11 39-41 Mitchell TM SV Shinkareva A Carlson KM Chang VL Malave R A Mason and M A Just (2008) Predicting Human Brain Activity Associated with the Meanings of Nouns Science 320 1191 May 30 2008 DOI 101126science1152876 Murphy B Baroni M amp Poesio M (2009) EEG responds to conceptual stimuli and corpus semantics In Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing pages 619ndash627 Singapore 6-7 August 2009 c 2009 ACL and AFNLP Murphy B amp Poesio M (2010) Detecting semantic category in simultaneous EEGMEG recordings In First workshop on computational neurolinguistics NAACL HLT 2010 (pp 36ndash44) Los Angeles Association for Computational 
Linguistics Murphy B Poesio M Bovolo F Bruzzone L Dalponte M Lakany H (2011) EEG decoding of semantic category reveals distributed representations for single concepts Brain Lang 2011 Apr117(1)12-22 doi 101016jbandl201009013
44 copy Colleen E Crangle 2013
Pedersen T Patwardhan S Michelizzi J (2004) WordNetSimilarity - Measuring the Relatedness of Concepts In Proceedings of Fifth Annual Meeting of the North American Chapter of the Association for Computational Linguistics (NAACL-2004) pp 38-41 Boston May 2004 Pereira F M Botvinick G Detre (2010) Learning semantic features for fMRI data from definitional text Proceedings of the NAACL HLT 2010 First Workshop on Computational Neurolinguistics June 2010Los Angeles USA Association for Computational Linguistics 1mdash9 httpwwwaclweborganthologyW10-0601 Perreau-Guimaraes M Wong DK Uy ET Grosenick L Suppes P (2007) Single-trial classification of MEG recordings IEEE Transactions on Biomedical Engineering 54 436ndash443 Rabinovitch I (1977) The Scott-Suppes theorem on semiorders J Mathematical Psychology 15 (2) 209ndash212 MR0437404 Resnik P (1995) Using information content to evaluate semantic similarity In Proceedings of the 14th International Joint Conference on Artificial Intelligence pages 448-453 Montreal Roach BJ Mathalon DH (2008) Event-related EEG time-frequency analysis an overview of measures and an analysis of early gamma band phase locking in schizophrenia Schizophr Bull 2008 Sep34(5)907-26 Epub 2008 Aug 6 Scott D Suppes P (1958) Foundational aspects of theories of measurement The Journal of Symbolic Logic 23 113ndash128 MR0115919 Suppes P (1972) Axiomatic Set Theory New York Dover Suppes P (1974) The axiomatic method in the empirical sciences L Henkin et al (Eds) Proceedings of the Tarski Symposium Proceedings of Symposia in Pure Mathematics 25 Providence RI American Mathematical Society pp 465-479 Suppes P Lu Z-L Han B (1997) Brain wave recognition of words Proceedings of the National Academy of Sciences 95 14965ndash14969 Suppes P Han B Lu Z-L (1998) Brain-wave recognition of sentences Proceedings of the National Academy of Sciences 95 15861ndash15866 Suppes P Han B Epelboim J Lu Z-L (1999a) Invariance between subjects of brain wave representations of language 
Proceedings of the National Academy of Sciences 96 12953ndash12958 Suppes P Han B Epelboim J Lu Z-L (1999b) Invariance of brain-wave representations of simple visual images and their names Proceedings of the National Academy of Sciences 96 14658-14663 Suppes P Han B (2000) Brain-wave representation of words by superposition of a few sine waves Proceedings of the National Academy of Sciences 97 8738ndash8743 Suppes P Perreau-Guimaraes M Wong DK (2009) Partial Orders of Similarity Differences Invariant Between EEG-Recorded Brain and Perceptual Representations of Language Neural Computation 21 3228ndash3269
45 copy Colleen E Crangle 2013
Suppes P de Barros JA Oas G (2012) Phase-Oscillator computations as neural models of stimulus-response conditioning and response selection Journal of Mathematical Psychology 56 95-117 Turney P (2001) Mining the Web for Synonyms PMI-IR versus LSA on TOEFL In L De Raedt amp P Flach (Eds) Proceedings of the Twelfth European Conference on Machine Learning (ECML-2001) (pp 491-502) Freiburg Germany Vassilieva E Pinto G de Barros JA Suppes P (2011) Learning Pattern Recognition Through Quasi-Synchronization of Phase Oscillators IEEE Transactions on Neural Networks Vol 22 No 1 January 2011 Vigliocco G Warren J Siri S Arciuli J Scott S Wise R (2006) The role of semantics and grammatical class in the neural representation of words Cereb Cortex 2006 Dec16(12)1790-6 Epub 2006 Jan 18 Wong DK Perreau-Guimaraes M Uy ET Suppes P (2004) Classification of individual trials based on the best independent component of EEG-recorded sentences Neurocomputing 61 479-484 Wong DK Uy ET Perreau-Guimaraes M Yang W Suppes P (2006) Interpretation of perceptron weights as constructed time series for EEG classification Neurocomputing 70 373-383 Wong DK Grosenick L Uy ET Perreau-Guimaraes M Carvalhaes CG Desain P Suppes P (2008) Quantifying inter-subject agreement in brain-imaging analyses NeuroImage 39 1051ndash1063 Woods W (1975) Whatrsquos in a link Foundations for semantic networks In Bobrow D Collins A eds Representation and understanding New York Academic Press 197535-82 Wu Z Palmer M (1994) Verb Semantics and Lexical Selection In Proceedings of the 32nd Annual Meeting of the Association for Computational Linguistics Las Cruces New Mexico pp 133--138
46 copy Colleen E Crangle 2013
Can we train a classifier of EEG data so that we can predict which word, sentence, or word within a sentence the participant is seeing or hearing?
1997: Brain-wave recognition of words
1998: Brain-wave recognition of sentences
Can we train the classifier on some participants and use it to make predictions for other participants?
1999: Invariance between subjects of brain wave representations of language
Can we train the classifier using visually presented words and use that classifier to make predictions for words presented auditorily (and vice versa)?
2004: Classification of individual trials based on the best independent component of EEG-recorded sentences
2005: Recognition of Words from the EEG Laplacian
2006: Multichannel classification of single EEG trials with independent component analysis
2007: Single-trial classification of MEG recordings
Can we train the classifier using words and use that classifier to make predictions for pictures depicting what the word refers to (and vice versa)?
1999: Invariance of brain-wave representations of simple visual images and their names
Answering yes to these questions establishes that we are recognizing the meaning of the word, or the idea or concept behind the word, and not, for instance, the sound of the word, its orthography, or its idiosyncratic use by one person.
EXPERIMENTAL SETUP for recognizing words within sentences
A computer presented 48 spoken sentences to each of 9 participants in 10 randomized blocks, with all 48 sentences in each block, so there were 480 trials for each participant.
The sentences were about the geography of Europe. Half were true and half false; half were positive and half negative.
The capital of Italy is Paris. (F)
Paris is not east of Berlin. (T)
Spain is west of Russia. (T)
Participants were asked to determine the truth or falsity of each statement while EEG recordings were made.
X ∈ {Berlin, London, Moscow, Paris, Rome, Warsaw, Madrid, Vienna, Athens}
Y ∈ {France, Germany, Italy, Poland, Russia, Austria, Greece, Spain}
W ∈ {the capital, the largest city}
Z ∈ {north, south, east, west}
Sentence forms:
W of Y is [not] X
X is [not] W of Y
X is [not] Z of X
Y is [not] Z of Y
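The four sentence forms can be enumerated mechanically. The following is a minimal sketch, assuming that [not] is optional and that the last two forms relate two different cities or two different countries. The name `generate_sentences` is illustrative only. The experiment itself used a selected set of 48 sentences, not the full enumeration, and truth values are not computed here.

```python
from itertools import permutations, product

CITIES = ["Berlin", "London", "Moscow", "Paris", "Rome",
          "Warsaw", "Madrid", "Vienna", "Athens"]
COUNTRIES = ["France", "Germany", "Italy", "Poland",
             "Russia", "Austria", "Greece", "Spain"]
ROLES = ["the capital", "the largest city"]
DIRECTIONS = ["north", "south", "east", "west"]


def generate_sentences():
    """Yield every sentence licensed by the four forms (hypothetical helper)."""
    for neg in ("", "not "):
        # W of Y is [not] X   and   X is [not] W of Y
        for w, y, x in product(ROLES, COUNTRIES, CITIES):
            yield f"{w.capitalize()} of {y} is {neg}{x}"
            yield f"{x} is {neg}{w} of {y}"
        # X is [not] Z of X   and   Y is [not] Z of Y (assumed: two distinct places)
        for places in (CITIES, COUNTRIES):
            for a, b in permutations(places, 2):
                for z in DIRECTIONS:
                    yield f"{a} is {neg}{z} of {b}"


sentences = list(generate_sentences())
print(len(sentences))
print(sentences[0])
```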
[Image from http://www2.le.ac.uk]
LANGUAGE
Paris is north of Berlin
The capital of Germany is Berlin
Spain is east of France
London is not the capital of France
Vienna is east of Moscow
The largest city of Poland is Athens
BRAIN
[Figure: Fifteen seconds of EEG data. From http://sccn.ucsd.edu]
What our work does ...
Compares the brain data and the language data and finds structural similarities between them.
HOW DO WE REPRESENT THE EEG DATA?
HOW DO WE REPRESENT THE LANGUAGE DATA?
HOW DO WE COMPARE THE TWO?
Words: Berlin, London, Moscow, Paris, Rome, Warsaw, Madrid, Vienna, Athens, France, Germany, Italy, Poland, Russia, Austria, Greece, Spain, north, south, east, west
Paris is north of Berlin
Spain is east of France
The capital of Germany is Berlin
Paris
east of
Germany
Machine learning approach to the study of brain and language
For the 10 geography words London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia we have 640 EEG data samples for each participant.
We want to classify these 640 samples into the correct 10 classes.
640 EEG samples for each participant:
Use 580 brain samples and their associated words to train the classifier.
Test the remaining 60 samples.
Do this many times, each time using a different set of training samples.
Find the average classification rate and make sure it is statistically significant.
Obtained classification rates in the range 25% to 29%, with a mean classification rate of around 24.5% (p < 10E-10). Significantly higher than chance (10%).
We used a 5-fold linear discriminant model, with principal component analysis for blind source separation, to classify the segments of EEG data obtained from the individual trials.
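The train/test protocol above can be sketched as follows. This is only an illustration under stated assumptions: a nearest-centroid rule stands in for the linear discriminant model with principal component analysis, the data are synthetic, and all function names are hypothetical. The part that mirrors the slide is the loop, which repeatedly holds out 60 of 640 samples and averages the classification rate.

```python
import random
from collections import defaultdict


def nearest_centroid_fit(train):
    """Stand-in classifier (NOT the LDA/PCA model of the study): per-class means."""
    sums, counts = {}, defaultdict(int)
    for features, label in train:
        acc = sums.setdefault(label, [0.0] * len(features))
        for k, value in enumerate(features):
            acc[k] += value
        counts[label] += 1
    return {lab: [v / counts[lab] for v in acc] for lab, acc in sums.items()}


def nearest_centroid_predict(centroids, features):
    """Return the label whose centroid is closest in squared Euclidean distance."""
    def sq_dist(center):
        return sum((a - b) ** 2 for a, b in zip(features, center))
    return min(centroids, key=lambda lab: sq_dist(centroids[lab]))


def mean_classification_rate(data, n_test, n_repeats, seed=0):
    """Average accuracy over repeated random train/test splits."""
    rng = random.Random(seed)
    rates = []
    for _ in range(n_repeats):
        shuffled = data[:]
        rng.shuffle(shuffled)
        test, train = shuffled[:n_test], shuffled[n_test:]
        centroids = nearest_centroid_fit(train)
        hits = sum(nearest_centroid_predict(centroids, f) == lab for f, lab in test)
        rates.append(hits / n_test)
    return sum(rates) / len(rates)


# Synthetic stand-in for 640 EEG samples in 10 classes (illustrative only).
rng = random.Random(1)
data = [([rng.gauss(label, 3.0), rng.gauss(-label, 3.0)], label)
        for label in range(10) for _ in range(64)]
rate = mean_classification_rate(data, n_test=60, n_repeats=20)
print(f"mean rate {rate:.3f} vs chance 0.100")
```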
Beyond machine learning ...
THEN look at the MISCLASSIFICATIONS and build a CONFUSION MATRIX.
Classify 640 brain data samples s1, s2, ..., s640 into 10 classes ω1, ω2, ..., ω10 of the finite set A.
M = (m_iq) is the confusion matrix for a given classification, where m_iq is the number of test samples from class ωi classified as belonging to class ωq.
Rows: actual class ωi. Columns: class ωq the sample was classified as.
          London  Moscow   Paris   north   south    east    west Germany  Poland  Russia
London         8      14      11       7       6       3       3      10       8      10
Moscow         8      24      14       6       2       6       4       4       7      15
Paris          6      18      12       4       3       5       8       6      11       7
north          4       2       5      11       9       7      10       1       8       3
south          1       4       3      14      14       9      11       4       7       3
east           4       3       5       9      12      12       7       1       4       3
west           4       3       2      12      13      11      10       2       8       5
Germany        2       2       3       2       2       1       0       9      11       8
Poland         2       3       4       0       4       1       4       9       9       4
Russia         7       7       5       1       5       1       1       8       4      11
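As an illustration of the definition of M, the sketch below counts (actual, predicted) pairs into a matrix. The function name and the six toy predictions are hypothetical and are not taken from the EEG data.

```python
def confusion_matrix(true_labels, predicted_labels, classes):
    """Return M where M[i][q] counts test samples of class i classified as class q."""
    index = {c: k for k, c in enumerate(classes)}
    m = [[0] * len(classes) for _ in classes]
    for actual, predicted in zip(true_labels, predicted_labels):
        m[index[actual]][index[predicted]] += 1
    return m


# Toy example with three classes (hypothetical predictions).
classes = ["London", "Moscow", "Paris"]
actual = ["London", "London", "Moscow", "Paris", "Paris", "Moscow"]
predicted = ["London", "Moscow", "Moscow", "London", "Paris", "Moscow"]
M = confusion_matrix(actual, predicted, classes)
for name, row in zip(classes, M):
    print(name, row)
```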
The relative frequencies $m_{iq} / \sum_{q'} m_{iq'}$ are an N-by-N estimate of the conditional probability densities, designated by the matrix $P = (p_{iq})$, that a randomly chosen test sample from class ωi will be classified as belonging to class ωq:
$$p_{iq} = \frac{m_{iq}}{\sum_{q'=1}^{N} m_{iq'}}$$
Conditional probability density estimates from the classification of brain wave data for London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia.
Rows: actual class ωi. Columns: ωq, the predicted class ("classified as").
          London  Moscow   Paris   north   south    east    west Germany  Poland  Russia
London     0.250   0.163   0.138   0.075   0.038   0.050   0.063   0.088   0.075   0.063
Moscow     0.144   0.333   0.133   0.033   0.011   0.056   0.056   0.089   0.067   0.078
Paris      0.175   0.188   0.125   0.038   0.038   0.038   0.050   0.100   0.138   0.113
north      0.067   0.017   0.017   0.283   0.167   0.150   0.100   0.100   0.067   0.033
south      0.014   0.014   0.071   0.114   0.271   0.171   0.157   0.029   0.086   0.071
east       0.050   0.000   0.000   0.133   0.150   0.383   0.150   0.050   0.000   0.083
west       0.057   0.043   0.000   0.171   0.114   0.143   0.229   0.086   0.100   0.057
Germany    0.075   0.175   0.100   0.050   0.125   0.025   0.050   0.275   0.025   0.100
Poland     0.150   0.100   0.125   0.050   0.050   0.000   0.025   0.150   0.250   0.100
Russia     0.160   0.040   0.080   0.040   0.100   0.020   0.120   0.100   0.080   0.260
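A minimal sketch of how such estimates can be obtained from confusion counts, assuming each row of counts is divided by its row total so that every row sums to 1 (as the rows of the table do, up to rounding). The counts below are hypothetical, and the function name is illustrative.

```python
def conditional_probabilities(m):
    """Row-normalize confusion counts: p[i][q] = m[i][q] / sum over q of m[i][q]."""
    p = []
    for row in m:
        total = sum(row)
        # A class with no test samples gets a row of zeros (an assumption).
        p.append([count / total if total else 0.0 for count in row])
    return p


# Hypothetical 3-class counts, for illustration only.
M = [[6, 3, 1],
     [2, 7, 1],
     [1, 1, 8]]
P = conditional_probabilities(M)
print([round(x, 3) for x in P[0]])
```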
[Figure: Hierarchical cluster tree (similarity tree) computed from the conditional probability density estimates for the classification of 640 brain wave samples for London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia. Leaves, in order: London, Paris, Moscow, Germany, Poland, Russia, north, south, west, east. Axis ticks: 0.2 to 0.6.]
LANGUAGE
WordNet: a lexical database of English organized around sets of cognitive synonyms, here restricted to the senses that are relevant to the geography sentences.
Latent Semantic Analysis (LSA): a statistical method to extract measures of word similarity from selected sets of documents, such as novels, newspaper articles, and textbooks.
BRAIN
The conditional probability density estimates from the classification of brain wave data (the matrix shown above).
WORDNET
WordNet is a human-annotated lexical database of English in which nouns, verbs, adjectives, and adverbs are grouped into sets of cognitive synonyms (called synsets), each synset expressing a distinct concept. See Miller (1995), Fellbaum (1998), and http://wordnet.princeton.edu.
The synsets are related to each other primarily through the hypernymy and hyponymy relations for nouns. Other relations in WordNet are part-whole (holonym), member of (meronym), has instance, and so on.
Hyponymy, often referred to as the is-a relation in computational discussions, is defined as follows: a concept represented by a lexical item Li is said to be a hyponym of the concept represented by a lexical item Lk if native speakers of English accept sentences of the form "An Li is a kind of Lk". Conversely, Lk is the hypernym of Li. A hypernym is therefore a more general concept and a hyponym a more specific concept.
WORDNET
(n) Paris#1, City of Light#1, French capital#1, capital of France#1 (the capital and largest city of France; and international center of culture and commerce)
(n) Paris#2, genus Paris#1 (sometimes placed in subfamily Trilliaceae)
(n) Paris#3 ((Greek mythology) the prince of Troy who abducted Helen from her husband Menelaus and provoked the Trojan War)
(n) Paris#4 (a town in northeastern Texas)
Multiple word senses
Organized into sets of cognitive synonyms called "synsets"
(n) city#1, metropolis#1, urban center#1 (a large and densely populated urban area; may include several independent administrative districts) [direct hyponym / full hyponym / part meronym / has instance]
(n) city#2 (an incorporated administrative district established by state charter)
(n) city#3, metropolis#2 (people living in a large densely populated municipality)
(n) east#1, due east#1, eastward#1, E#3 (the cardinal compass point that is at 90 degrees)
(n) East#2, Orient#1 (the countries of Asia)
(n) East#3, eastern United States#1 (the region of the United States lying to the north of the Ohio River and to the east of the Mississippi River)
(n) east#4 (the direction corresponding to the eastward cardinal compass point)
(n) east#5 (a location in the eastern part of a country, region, or city)
LATENT SEMANTIC ANALYSIS (LSA)
LSA is a statistical technique for extracting, from large collections of documents, a measure of how similar two words are to each other in terms of the patterns of their co-occurrences within those documents. See Deerwester et al. (1990), Landauer and Dumais (1997), and Landauer et al. (1998).
The underlying idea is that if, for each word, you take into account all the contexts in which it does and does not appear, you get for all the words a set of mutual constraints that represent how similar any two words are to each other.
The similarity judgments produced by latent semantic analysis have been shown to correspond to some extent to human judgments of similarity. After training on about 2000 pages of English text, it scored as well as average test-takers on the synonym portion of the Test of English as a Foreign Language. After training on a psychology textbook, it achieved a passing score on a multiple-choice exam.
We used the application at http://lsa.colorado.edu to compute similarity matrices in term space for our set of words. The computation was based on ~38,000 college-level texts (novels, newspaper articles, ...). A maximum of 300 factors was permitted in the analysis.
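Once each word has a vector in the reduced factor space, a term-to-term similarity is commonly taken as the cosine between the two vectors. The sketch below shows only that final step, assuming cosine similarity. The three-factor vectors are hypothetical, and the singular value decomposition that produces real LSA vectors is not shown.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two term vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Hypothetical 3-factor term vectors; the analysis in the text allowed up to 300 factors.
term_vectors = {
    "north": [0.9, 0.1, 0.0],
    "south": [0.8, 0.2, 0.1],
    "Paris": [0.1, 0.9, 0.3],
}
print(round(cosine_similarity(term_vectors["north"], term_vectors["south"]), 3))
print(round(cosine_similarity(term_vectors["north"], term_vectors["Paris"]), 3))
```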
          London  Moscow   Paris   north   south    east    west Germany  Poland  Russia
London     1.000   0.170   0.370   0.130   0.120   0.120   0.140   0.170   0.160   0.160
Moscow     0.170   1.000   0.180   0.130   0.080   0.220   0.140   0.370   0.650   0.690
Paris      0.370   0.180   1.000   0.100   0.040   0.080   0.090   0.500   0.310   0.360
north      0.130   0.130   0.100   1.000   0.890   0.600   0.610   0.070   0.090   0.140
south      0.120   0.080   0.040   0.890   1.000   0.500   0.550   0.020   0.050   0.050
east       0.120   0.220   0.080   0.600   0.500   1.000   0.850   0.220   0.290   0.300
west       0.140   0.140   0.090   0.610   0.550   0.850   1.000   0.240   0.260   0.230
Germany    0.170   0.370   0.500   0.070   0.020   0.220   0.240   1.000   0.850   0.810
Poland     0.160   0.650   0.310   0.090   0.050   0.290   0.260   0.850   1.000   0.870
Russia     0.160   0.690   0.360   0.140   0.050   0.300   0.230   0.810   0.870   1.000
Semantic similarity matrix derived from LSA for the set of words London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia.
[Figure: Hierarchical cluster tree computed from the pairwise Latent Semantic Analysis (LSA) scores of similarity for London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia, based on ~38,000 college-level texts (novels, newspaper articles, ...). Leaves, in order: east, west, north, south, London, Paris, Moscow, Germany, Poland, Russia. Axis ticks: 0.1 to 0.6.]
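A cluster tree of this kind can be built by agglomerative clustering. The sketch below is one possible construction, not necessarily the one used for the figure: it assumes the distance between two words is 1 minus their similarity and uses average linkage, neither of which is stated on the slide. The similarities are the LSA scores for six of the ten words, and the function name is illustrative.

```python
def average_linkage(names, sim):
    """Agglomerative clustering on distance 1 - similarity (average linkage).

    Returns a list of merges: (members_a, members_b, distance).
    """
    clusters = [(name,) for name in names]
    merges = []

    def dist(a, b):
        pairs = [(x, y) for x in a for y in b]
        return sum(1.0 - sim[frozenset(p)] for p in pairs) / len(pairs)

    while len(clusters) > 1:
        # Pick the closest pair of current clusters and merge it.
        a, b = min(
            ((a, b) for i, a in enumerate(clusters) for b in clusters[i + 1:]),
            key=lambda pair: dist(*pair),
        )
        merges.append((a, b, dist(a, b)))
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
    return merges


# LSA similarities for six of the ten words, taken from the LSA similarity matrix.
names = ["north", "south", "east", "west", "Poland", "Russia"]
values = {
    ("north", "south"): 0.89, ("north", "east"): 0.60, ("north", "west"): 0.61,
    ("north", "Poland"): 0.09, ("north", "Russia"): 0.14, ("south", "east"): 0.50,
    ("south", "west"): 0.55, ("south", "Poland"): 0.05, ("south", "Russia"): 0.05,
    ("east", "west"): 0.85, ("east", "Poland"): 0.29, ("east", "Russia"): 0.30,
    ("west", "Poland"): 0.26, ("west", "Russia"): 0.23, ("Poland", "Russia"): 0.87,
}
sim = {frozenset(k): v for k, v in values.items()}
merges = average_linkage(names, sim)
for a, b, d in merges:
    print(a, "+", b, f"at {d:.3f}")
```

In this subset the compass directions merge with each other before either pair joins the two countries, which matches the grouping visible in the leaf order of the tree.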
LSA provides a straightforward measure of similarity between words.
For WORDNET, several different measures of similarity have been devised, e.g.:
Path length between synsets.
Information content: a corpus-based measure of the specificity of a concept, measured in terms of the frequency of occurrence of the concept in the corpus. The corpus is the human-annotated, sense-tagged SemCor (Miller et al. 1993), which links every word in the Brown Corpus to its appropriate WordNet sense. Scaled various ways.
Vector-space models: these work by forming second-order co-occurrence vectors from the WordNet definitions of concepts, known as glosses.
We used five measures in our computations of similarity and took the average score, using each of the relevant senses.
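To make the path-based family concrete, here is a minimal sketch of the Wu-Palmer (wup) measure, 2 * depth(lcs) / (depth(a) + depth(b)), where lcs is the deepest concept that subsumes both. The tiny taxonomy below is hypothetical and is not actual WordNet data. Depth is counted in nodes, so the root has depth 1, which is one common convention.

```python
# Hypothetical hypernym links (child -> parent); not actual WordNet data.
PARENT = {
    "Paris": "national capital", "London": "national capital",
    "national capital": "city", "city": "location",
    "Germany": "country", "country": "location",
    "location": "entity",
}


def path_to_root(concept):
    """Return [concept, parent, ..., root]."""
    path = [concept]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path


def depth(concept):
    """Depth counted in nodes, so the root has depth 1."""
    return len(path_to_root(concept))


def wup_similarity(a, b):
    """Wu-Palmer: 2 * depth(lcs) / (depth(a) + depth(b))."""
    ancestors_of_a = path_to_root(a)
    # The first ancestor of b that also subsumes a is the deepest common subsumer.
    lcs = next(c for c in path_to_root(b) if c in ancestors_of_a)
    return 2 * depth(lcs) / (depth(a) + depth(b))


print(wup_similarity("Paris", "London"))
print(wup_similarity("Paris", "Germany"))
```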
Semantic similarity matrix derived from WordNet for the set of words London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia, using senses relevant to the geography of Europe and five measures of similarity: wup (path length), lin and jcn (information content), and gv and pgv (vector space measures).
          London  Moscow   Paris   north   south    east    west Germany  Poland  Russia
London     1.000   0.396   0.466   0.106   0.103   0.076   0.078   0.322   0.299   0.303
Moscow     0.396   1.000   0.393   0.095   0.094   0.062   0.070   0.286   0.281   0.288
Paris      0.466   0.393   1.000   0.106   0.104   0.074   0.077   0.327   0.308   0.307
north      0.106   0.095   0.106   1.000   0.228   0.179   0.210   0.123   0.132   0.111
south      0.103   0.094   0.104   0.228   1.000   0.172   0.212   0.115   0.107   0.109
east       0.076   0.062   0.074   0.179   0.172   1.000   0.216   0.093   0.080   0.077
west       0.078   0.070   0.077   0.210   0.212   0.216   1.000   0.087   0.082   0.083
Germany    0.322   0.286   0.327   0.123   0.115   0.093   0.087   1.000   0.589   0.409
Poland     0.299   0.281   0.308   0.132   0.107   0.080   0.082   0.589   1.000   0.403
Russia     0.303   0.288   0.307   0.111   0.109   0.077   0.083   0.409   0.403   1.000
[Figure: Hierarchical cluster tree computed from pairwise WordNet-based semantic similarity scores for London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia, restricted to senses related to the geography of Europe. Leaves, in order: Germany, Poland, Russia, London, Paris, Moscow, north, south, west, east. Axis ticks: 0.2 to 0.6.]
BACK TO THE BRAIN ...
London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia
Now let's see how to compare the EEG data and the language data ...
Some of the similarity trees show remarkable congruence between the brain and semantic data.
Where exactly does that congruence lie?
Can we devise a quantitative measure of the nature and strength of that congruence?
[Figure: two similarity trees, labeled LANGUAGE DATA and BRAIN DATA, each with axis ticks 0.2 to 0.6. Leaf orders: London, Paris, Moscow, Germany, Poland, Russia, north, south, west, east; and Germany, Poland, Russia, London, Paris, Moscow, north, south, west, east.]
WordNet-based semantic similarities and EEG conditional probability estimates for London relative to London (L), Moscow (M), Paris (P), north (n), south (s), east (e), west (w), Germany (G), Poland (Po), and Russia (R).
The Spearman rank correlation for the two sequences in the figure is 0.99, with one-sided significance of 1.84E-10.
BRAIN DATA
For each word ω we compute from the conditional probability density estimates a ternary relation R such that R(ω, ω1, ω2) if and only if, with respect to word ω, the conditional probability for word ω1 is larger than the conditional probability for word ω2, that is, if and only if ω1's similarity difference with ω is smaller than ω2's similarity difference with ω.
R is an ordinal relation of similarity differences, a partial order that is irreflexive, asymmetric, and transitive.
LANGUAGE DATA
For each word ω we compute from the semantic similarity matrix a ternary relation R such that R(ω, ω1, ω2) if and only if the similarity difference of ω1 with ω is smaller than the similarity difference of ω2 with ω, that is, ω1 is more similar to ω than is ω2.
R is an ordinal relation of similarity differences, a partial order that is irreflexive, asymmetric, and transitive.
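For the language data, the relation R for one word can be read directly off that word's row of the similarity matrix. The sketch below is an illustration only: the function name is hypothetical, and the row holds the WordNet-based similarities of London to five of the ten words.

```python
from itertools import permutations


def similarity_order(word, similarity_row):
    """Return all triples (word, w1, w2) such that w1 is more similar to word than w2."""
    return {
        (word, w1, w2)
        for w1, w2 in permutations(similarity_row, 2)
        if similarity_row[w1] > similarity_row[w2]
    }


# WordNet-based similarities of London to five of the words.
london = {"London": 1.0, "Paris": 0.466, "Moscow": 0.396,
          "Germany": 0.322, "north": 0.106}
R = similarity_order("London", london)
print(("London", "Paris", "north") in R)
print(len(R))
```

Because the comparison is strict, the resulting relation is irreflexive and asymmetric, and it inherits transitivity from the ordering of the numbers.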
London
Language data       Brain data
London   1.000      London   0.275
Paris    0.466      Paris    0.133
Moscow   0.396      Moscow   0.108
Germany  0.322      Germany  0.075
Russia   0.303      north    0.042
Poland   0.299      Russia   0.033
north    0.106      Poland   0.025
south    0.103      east     0.008
west     0.078      south    0.000
east     0.076      west     0.000
Partial orders for London derived from the WordNet semantic similarities of Table 2 and the conditional probability estimates for the brain data of Table 5.
[Figure: partial-order diagrams for the two data sets. Legible node labels: Poland, north, south, west, east in one diagram; north, Poland, east, south, west in the other.]
We follow the approach described in Suppes (1974) for the axiomatization of the theory of differences in utility preference or the theory of differences in psychological intensity.
Suppes P (1974). The axiomatic method in the empirical sciences. In L Henkin et al (Eds), Proceedings of the Tarski Symposium, Proceedings of Symposia in Pure Mathematics 25. Providence, RI: American Mathematical Society, pp. 465-479.
Brain data: the relational structure (A, R) constructed from R and the finite set A of classes ω1, ω2, ..., ωN, together with the N partial orders constructed from the N-by-N estimate for the conditional probability densities.
Language data: the relational structure (A, R) constructed from R and the finite set A of classes ω1, ω2, ..., ωN, together with the N partial orders constructed from the N-by-N similarity matrix.
For each word ω we compare the partial order of the brain data with the partial order of the language data using Spearman's rank correlation coefficient, which we interpret in the usual way to determine whether or not we have a statistically significant correlation.
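A minimal sketch of the rank correlation step, using the language and brain values for London listed earlier. It assumes tied values share their average rank and computes the coefficient as the Pearson correlation of the two rank vectors. The function names are illustrative, and the significance test is not shown.

```python
def average_ranks(values):
    """Rank values from largest (rank 1) to smallest; ties share the mean rank."""
    order = sorted(range(len(values)), key=lambda k: -values[k])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        for k in order[start:end + 1]:
            ranks[k] = (start + end) / 2 + 1
        start = end + 1
    return ranks


def spearman(x, y):
    """Pearson correlation of the rank vectors (handles ties)."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Values for London, aligned word by word.
words = ["London", "Paris", "Moscow", "Germany", "Russia",
         "Poland", "north", "south", "west", "east"]
language = [1.000, 0.466, 0.396, 0.322, 0.303, 0.299, 0.106, 0.103, 0.078, 0.076]
brain = [0.275, 0.133, 0.108, 0.075, 0.033, 0.025, 0.042, 0.000, 0.000, 0.008]
rho = spearman(language, brain)
print(round(rho, 3))
```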
For those instances in which the brain and language partial orders are significantly correlated, we find the partial order that is invariant with respect to the brain and language data. Here are two more examples.
[Figure: invariant partial order for Paris over the nodes London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia. Significant invariance, Spearman 0.88 (p = 6.6795e-004).]
[Figure: invariant partial order for Paris over the same nodes. Significant invariance, Spearman 0.90 (p = 3.8716e-004).]
[Figure: Another hierarchical cluster tree (similarity tree) computed from the conditional probability density estimates for the classification of 640 brain wave samples for London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia. Leaves, in order: London, Paris, Moscow, Germany, Poland, Russia, north, south, west, east. Axis ticks: 0.2 to 0.6.]
Every single-trial classification produces its own conditional probability density estimates, giving rise to its own similarity tree ...
[Figure: a second similarity tree, labeled Brain data. Leaves, in order: north, east, south, west, London, Paris, Germany, Moscow, Russia, Poland. Axis ticks: 0.1 to 0.5.]
1. We compute M (= 30) single-trial classifications of the data (640 data samples for our 10 words) using random resampling with replacement.
2. For each classification we find, for each word, the partial orders of the brain and language data that are significantly correlated.
3. For each of these highly correlated partial order pairs we find the partial order invariant with respect to both.
We performed 60 classifications; that is, we recomputed the classifications of the brain data using random resampling with replacement. For half of these 60 classifications we compared the brain data to the WordNet data, and for the other half we compared the brain data to the LSA data. And we plotted the results.
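Random resampling with replacement, as used in step 1, can be sketched as below. How each resampled set was then divided into training and test samples is not stated here, so the sketch covers only the drawing of the resampled sets. The function names and the integer stand-ins for EEG samples are hypothetical.

```python
import random


def resample_with_replacement(samples, rng):
    """Draw len(samples) items at random, with replacement (a bootstrap sample)."""
    return rng.choices(samples, k=len(samples))


def repeated_resamples(samples, n_classifications=30, seed=0):
    """Yield one resampled data set per classification run."""
    rng = random.Random(seed)
    for _ in range(n_classifications):
        yield resample_with_replacement(samples, rng)


samples = list(range(640))  # stand-ins for the 640 EEG samples
runs = list(repeated_resamples(samples))
print(len(runs), len(runs[0]), len(set(runs[0])))
```

Because sampling is with replacement, each run usually contains some samples more than once and omits others, which is why every run yields its own confusion matrix.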
[Figure: bar chart titled "Partial Orders of Similarity Differences Invariant between EEG-recorded Brain Data and Semantic Representations of Language". Vertical axis: Significant Invariant Partial Orders (0 to 300). Horizontal axis: Test Participants (S10, S24, S25, S27, S16, S26, S12, S13, S18). Series: WordNet, LSA.]
A higher number of significant structural similarities was found between the brain data and the WordNet data than between the brain data and the LSA data.
This stronger structural similarity between the brain data and the WordNet-derived data supports the contention that, during language comprehension for the complex cognitive task of assessing truth or falsity, the representation of words in the brain has a WordNet-like quality.
This stronger structural similarity is the first evidence we have seen of WordNet-like representations in the brain.
Banerjee S, Pedersen T (2003). Extended Gloss Overlaps as a Measure of Semantic Relatedness. In Proceedings of the Eighteenth International Joint Conference on Artificial Intelligence, pp. 805-810, August 9-15, 2003, Acapulco, Mexico.
Banerjee S, Pedersen T (2002). An Adapted Lesk Algorithm for Word Sense Disambiguation using WordNet. In Proceedings of the Third International Conference on Intelligent Text Processing and Computational Linguistics, pp. 136-145, February 17-23, 2002, Mexico City.
Baroni M, Murphy B, Barbu E, Poesio M (2010). Strudel: A Corpus-Based Semantic Model Based on Properties and Types. Cognitive Science, vol. 34, no. 2, pp. 222-254. DOI: 10.1111/j.1551-6709.2009.01068.x
Blei D, Ng A, Jordan M (2003). Latent Dirichlet allocation. Journal of Machine Learning Research, 3, 993-1022.
Brants T, Franz A (2006). www.ldc.upenn.edu/Catalog/CatalogEntry.jsp?catalogId=LDC2006T13. Linguistic Data Consortium, Philadelphia.
Budanitsky A, Hirst G (2001). Semantic distance in WordNet: an experimental, application-oriented evaluation of five measures. In Proceedings of the NAACL 2001 Workshop on WordNet and Other Lexical Resources, Pittsburgh, June 2001. http://www.seas.smu.edu/~rada/mwnw/papers/WNW-NAACL-101.pdf.gz
Collins AM, Loftus EF (1975). A spreading-activation theory of semantic processing. Psychological Review, 82(6), 407-428.
Crestani F (1997). Application of Spreading Activation Techniques in Information Retrieval. Artificial Intelligence Review, 11, 453-482.
Deerwester S, Dumais ST, Furnas GW, Landauer TK, Harshman R (1990). Indexing By Latent Semantic Analysis. Journal of the American Society for Information Science, 41, 391-407.
Devereux B, Kelly C, Korhonen A (2010). Using fMRI activation to conceptual stimuli to evaluate methods for extracting conceptual representations from corpora. In Proceedings of the NAACL HLT 2010 First Workshop on Computational Neurolinguistics, June 2010, Los Angeles, USA. Association for Computational Linguistics, 70-78.
Fellbaum C (Ed) (1998). WordNet: An Electronic Lexical Database. Cambridge, MA: MIT Press.
Hauk O, Davis MH, Ford M, Pulvermüller F, Marslen-Wilson WD (2006). The time course of visual word recognition as revealed by linear regression analysis of ERP data. NeuroImage, 2006 May 1, 30(4), 1383-400. Epub 2006 Feb 7.
Hirst G (1988). Resolving lexical ambiguity computationally with spreading activation and Polaroid Words. In Small SL, Cottrell GW, Tanenhaus MK (Eds), Lexical ambiguity resolution: Perspectives from psycholinguistics, neuropsychology, and artificial intelligence. San Mateo, CA: Morgan Kaufmann Publishers, 73-107.
Jelodar AB, Alizadeh M, Khadivi S (2010). WordNet Based Features for Predicting Brain Activity associated with meanings of noun. In Proceedings of the NAACL HLT 2010 First Workshop on Computational Neurolinguistics, June 2010, Los Angeles, USA. Association for Computational Linguistics, 18-26.
Jiang J, Conrath D (1997). Semantic similarity based on corpus statistics and lexical taxonomy. In Proceedings of International Conference on Research in Computational Linguistics, Taiwan, 19-33.
Just MA, Cherkassky VL, Aryal S, Mitchell TM (2010). A Neurosemantic Theory of Concrete Noun Representation Based on the Underlying Brain Codes. PLoS ONE, 5(1), e8622. doi:10.1371/journal.pone.0008622
Karp RM (1972). Reducibility among combinatorial problems. In Miller RE, Thatcher JW (Eds), Complexity of Computer Computations (Proc. Sympos. IBM Thomas J. Watson Res. Center, Yorktown Heights, NY). New York: Plenum, pp. 85-103.
Kelly C, Devereux B, Korhonen A (2010). Acquiring Human-like Feature-Based Conceptual Representations from Corpora. In Proceedings of the NAACL HLT 2010 First Workshop on Computational Neurolinguistics, June 2010, Los Angeles, USA. Association for Computational Linguistics, 61-69.
Kriegeskorte N, Mur M, Bandettini PA (2008). Representational similarity analysis - connecting the branches of systems neuroscience. Frontiers in Systems Neuroscience. doi:10.3389/neuro.06.004.2008
Kutas M, Hillyard SA (1984). Brain potentials during reading reflect word expectancy and semantic association. Nature, 307, 161-163.
Kutas M, Federmeier KD (2011). Thirty years and counting: Finding meaning in the N400 component of the event related brain potential (ERP). Annual Review of Psychology, 62, 621-647.
Landauer TK, Dumais ST (1997). A solution to Plato's problem: The Latent Semantic Analysis theory of the acquisition, induction, and representation of knowledge. Psychological Review, 104, 211-240.
Landauer TK, Foltz PW, Laham D (1998). Introduction to Latent Semantic Analysis. Discourse Processes, 25, 259-284.
Leacock C Chodorow M (1998) Combining local context and WordNet similarity for word sense identification In Fellbaum (Ed) WordNet An Electronic Lexical Database Cambridge MA MIT Press pp 265-283 Leech G Rayson P Wilson A (2001) Word Frequencies in Written and Spoken English based on the British National Corpus Longman London Lin D (1998) An information-theoretic definition of similarity In Proc 15th International Conf on Machine Learning Morgan Kaufmann San Francisco CA p 296--304 Luce RD (1956) Semiorders and a theory of utility discrimination Econometrica 24 178ndash191 MR0078632 httpwwwjstororgstable1905751 Matsunaga T Yonemori C Tomita E Muramatsu M (2009) Clique-based data mining for related genes in a biomedical database BMC Bioinformatics 2009 Jul 110205 Martin Ph (2003) Correction and Extension of WordNet 17 ICCS 2003 11th International Conference on Conceptual Structures (copy Springer Verlag LNAI 2746 pp 160-173) Dresden Germany July 21-25 2003 Miller GA Nicely P (1955) An analysis of perceptual confusions among some English consonants JAcoustSocAm 272 Miller GA Leacock C Tengi R Bunker RT (1993) A Semantic Concordance In Proceedings of the 3 DARPA Workshop on Human Language Technology Miller GA (1995) WordNet A Lexical Database for English Communications of the ACM Vol 38 No 11 39-41 Mitchell TM SV Shinkareva A Carlson KM Chang VL Malave R A Mason and M A Just (2008) Predicting Human Brain Activity Associated with the Meanings of Nouns Science 320 1191 May 30 2008 DOI 101126science1152876 Murphy B Baroni M amp Poesio M (2009) EEG responds to conceptual stimuli and corpus semantics In Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing pages 619ndash627 Singapore 6-7 August 2009 c 2009 ACL and AFNLP Murphy B amp Poesio M (2010) Detecting semantic category in simultaneous EEGMEG recordings In First workshop on computational neurolinguistics NAACL HLT 2010 (pp 36ndash44) Los Angeles Association for Computational 
Linguistics Murphy B Poesio M Bovolo F Bruzzone L Dalponte M Lakany H (2011) EEG decoding of semantic category reveals distributed representations for single concepts Brain Lang 2011 Apr117(1)12-22 doi 101016jbandl201009013
44 copy Colleen E Crangle 2013
Pedersen T, Patwardhan S, Michelizzi J (2004). WordNet::Similarity - Measuring the Relatedness of Concepts. In Proceedings of Fifth Annual Meeting of the North American Chapter of the Association for Computational Linguistics (NAACL-2004), pp. 38-41. Boston, May 2004.
Pereira F, Botvinick M, Detre G (2010). Learning semantic features for fMRI data from definitional text. Proceedings of the NAACL HLT 2010 First Workshop on Computational Neurolinguistics, June 2010, Los Angeles, USA. Association for Computational Linguistics, 1-9. http://www.aclweb.org/anthology/W10-0601
Perreau-Guimaraes M, Wong DK, Uy ET, Grosenick L, Suppes P (2007). Single-trial classification of MEG recordings. IEEE Transactions on Biomedical Engineering 54, 436-443.
Rabinovitch I (1977). The Scott-Suppes theorem on semiorders. J Mathematical Psychology 15(2), 209-212. MR0437404.
Resnik P (1995). Using information content to evaluate semantic similarity. In Proceedings of the 14th International Joint Conference on Artificial Intelligence, pages 448-453, Montreal.
Roach BJ, Mathalon DH (2008). Event-related EEG time-frequency analysis: an overview of measures and an analysis of early gamma band phase locking in schizophrenia. Schizophr Bull 2008 Sep;34(5):907-26. Epub 2008 Aug 6.
Scott D, Suppes P (1958). Foundational aspects of theories of measurement. The Journal of Symbolic Logic 23, 113-128. MR0115919.
Suppes P (1972). Axiomatic Set Theory. New York: Dover.
Suppes P (1974). The axiomatic method in the empirical sciences. In L Henkin et al (Eds.), Proceedings of the Tarski Symposium, Proceedings of Symposia in Pure Mathematics 25. Providence, RI: American Mathematical Society, pp. 465-479.
Suppes P, Lu Z-L, Han B (1997). Brain wave recognition of words. Proceedings of the National Academy of Sciences 95, 14965-14969.
Suppes P, Han B, Lu Z-L (1998). Brain-wave recognition of sentences. Proceedings of the National Academy of Sciences 95, 15861-15866.
Suppes P, Han B, Epelboim J, Lu Z-L (1999a). Invariance between subjects of brain wave representations of language. Proceedings of the National Academy of Sciences 96, 12953-12958.
Suppes P, Han B, Epelboim J, Lu Z-L (1999b). Invariance of brain-wave representations of simple visual images and their names. Proceedings of the National Academy of Sciences 96, 14658-14663.
Suppes P, Han B (2000). Brain-wave representation of words by superposition of a few sine waves. Proceedings of the National Academy of Sciences 97, 8738-8743.
Suppes P, Perreau-Guimaraes M, Wong DK (2009). Partial Orders of Similarity Differences Invariant Between EEG-Recorded Brain and Perceptual Representations of Language. Neural Computation 21, 3228-3269.
Suppes P, de Barros JA, Oas G (2012). Phase-Oscillator computations as neural models of stimulus-response conditioning and response selection. Journal of Mathematical Psychology 56, 95-117.
Turney P (2001). Mining the Web for Synonyms: PMI-IR versus LSA on TOEFL. In L De Raedt & P Flach (Eds.), Proceedings of the Twelfth European Conference on Machine Learning (ECML-2001) (pp. 491-502). Freiburg, Germany.
Vassilieva E, Pinto G, de Barros JA, Suppes P (2011). Learning Pattern Recognition Through Quasi-Synchronization of Phase Oscillators. IEEE Transactions on Neural Networks, Vol. 22, No. 1, January 2011.
Vigliocco G, Warren J, Siri S, Arciuli J, Scott S, Wise R (2006). The role of semantics and grammatical class in the neural representation of words. Cereb Cortex 2006 Dec;16(12):1790-6. Epub 2006 Jan 18.
Wong DK, Perreau-Guimaraes M, Uy ET, Suppes P (2004). Classification of individual trials based on the best independent component of EEG-recorded sentences. Neurocomputing 61, 479-484.
Wong DK, Uy ET, Perreau-Guimaraes M, Yang W, Suppes P (2006). Interpretation of perceptron weights as constructed time series for EEG classification. Neurocomputing 70, 373-383.
Wong DK, Grosenick L, Uy ET, Perreau-Guimaraes M, Carvalhaes CG, Desain P, Suppes P (2008). Quantifying inter-subject agreement in brain-imaging analyses. NeuroImage 39, 1051-1063.
Woods W (1975). What's in a link: Foundations for semantic networks. In Bobrow D, Collins A (Eds.), Representation and understanding. New York: Academic Press, 1975, 35-82.
Wu Z, Palmer M (1994). Verb Semantics and Lexical Selection. In Proceedings of the 32nd Annual Meeting of the Association for Computational Linguistics, Las Cruces, New Mexico, pp. 133-138.
EXPERIMENTAL SETUP for recognizing words within sentences
A computer presented 48 spoken sentences to each of 9 participants in 10 randomized blocks, with all 48 sentences in each block. So there were 480 trials for each participant.
The sentences were about the geography of Europe. Half were true, half false; half positive, half negative.
The capital of Italy is Paris. (F)
Paris is not east of Berlin. (T)
Spain is west of Russia. (T)
Participants were asked to determine the truth or falsity of each statement while EEG recordings were made.
X ∈ {Berlin, London, Moscow, Paris, Rome, Warsaw, Madrid, Vienna, Athens}
Y ∈ {France, Germany, Italy, Poland, Russia, Austria, Greece, Spain}
W ∈ {the capital, the largest city}
Z ∈ {north, south, east, west}
W of Y is [not] X
X is [not] W of Y
X is [not] Z of X
Y is [not] Z of Y
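The four sentence forms above can be read as templates over the sets X, Y, W and Z. The following is a minimal sketch of that reading, not the authors' stimulus generator. It assumes that "X is Z of X" and "Y is Z of Y" relate two different cities or two different countries, and the function name `all_sentences` is made up for illustration. The experiment itself used only 48 sentences from this space.

```python
import itertools

CITIES = ["Berlin", "London", "Moscow", "Paris", "Rome", "Warsaw", "Madrid", "Vienna", "Athens"]
COUNTRIES = ["France", "Germany", "Italy", "Poland", "Russia", "Austria", "Greece", "Spain"]
ROLES = ["the capital", "the largest city"]
DIRECTIONS = ["north", "south", "east", "west"]

def all_sentences():
    """Yield every sentence licensed by the four forms, with and without 'not'."""
    for neg in ("", "not "):
        for w, y, x in itertools.product(ROLES, COUNTRIES, CITIES):
            yield f"{w.capitalize()} of {y} is {neg}{x}"   # W of Y is [not] X
            yield f"{x} is {neg}{w} of {y}"                # X is [not] W of Y
        for z in DIRECTIONS:
            # Assumption: the two places in a direction sentence are distinct.
            for a, b in itertools.permutations(CITIES, 2):
                yield f"{a} is {neg}{z} of {b}"            # X is [not] Z of X
            for a, b in itertools.permutations(COUNTRIES, 2):
                yield f"{a} is {neg}{z} of {b}"            # Y is [not] Z of Y

sentences = list(all_sentences())
print(len(sentences), "candidate sentences")
```

Whether each candidate is true or false would have to come from a separate table of geographic facts, which this sketch does not include.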
LANGUAGE
[Image. From http://www2.le.ac.uk]
Paris is north of Berlin.
The capital of Germany is Berlin.
Spain is east of France.
London is not the capital of France.
Vienna is east of Moscow.
The largest city of Poland is Athens.
BRAIN
[Figure: Fifteen seconds of EEG data. From http://sccn.ucsd.edu]
What our work does...
Compares the brain data and the language data and finds structural similarities between them.
HOW DO WE REPRESENT THE EEG DATA?
HOW DO WE REPRESENT THE LANGUAGE DATA?
HOW DO WE COMPARE THE TWO?
Berlin, London, Moscow, Paris, Rome, Warsaw, Madrid, Vienna, Athens, France, Germany, Italy, Poland, Russia, Austria, Greece, Spain, north, south, east, west
Paris is north of Berlin.
Spain is east of France.
The capital of Germany is Berlin.
Paris / east of / Germany
Machine learning approach to the study of brain and language
For the 10 geography words (London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia) we have 640 EEG data samples for each participant.
We want to classify these 640 samples into the correct 10 classes.
640 EEG samples for each participant:
Use 580 brain samples and their associated words to train the classifier.
Test the remaining 60 samples.
Do this many times, each time using a different set of training samples.
Find the average classification rate; make sure it is statistically significant.
Obtained classification rates in the range 25% to 29%, with a mean classification rate of around 24.5% (p < 10E-10). Significantly higher than chance (10%).
We used a 5-fold linear discriminant model, with principal component analysis for blind source separation, to classify the segments of EEG data obtained from the individual trials.
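The train/test procedure on this slide (train on 580 samples, test on the remaining 60, repeat, then average) can be sketched as follows. This is only an illustration under stated assumptions. The slide names a linear discriminant model with principal component analysis; to stay dependency-free, the sketch swaps in a nearest-centroid classifier. The data are synthetic vectors rather than EEG, and the even split of 64 samples per class, the number of rounds and all function names are made up.

```python
import random

def nearest_centroid_fit(X, y):
    """Return {label: mean vector} for the training samples."""
    sums, counts = {}, {}
    for vec, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(vec))
        for k, v in enumerate(vec):
            acc[k] += v
        counts[label] = counts.get(label, 0) + 1
    return {lab: [s / counts[lab] for s in acc] for lab, acc in sums.items()}

def nearest_centroid_predict(centroids, vec):
    """Pick the label whose centroid is closest in squared Euclidean distance."""
    return min(centroids,
               key=lambda lab: sum((a - b) ** 2 for a, b in zip(vec, centroids[lab])))

def mean_classification_rate(X, y, n_train=580, n_rounds=20, seed=0):
    """Average test accuracy over repeated random 580/60 splits."""
    rng = random.Random(seed)
    idx = list(range(len(X)))
    rates = []
    for _ in range(n_rounds):
        rng.shuffle(idx)
        train, test = idx[:n_train], idx[n_train:]
        model = nearest_centroid_fit([X[i] for i in train], [y[i] for i in train])
        hits = sum(nearest_centroid_predict(model, X[i]) == y[i] for i in test)
        rates.append(hits / len(test))
    return sum(rates) / len(rates)

# Synthetic stand-in for 640 feature vectors in 10 classes (64 per class, an assumption).
rng = random.Random(1)
y = [c for c in range(10) for _ in range(64)]
X = [[rng.gauss(0.3 * c, 1.0) for _ in range(8)] for c in y]
rate = mean_classification_rate(X, y)
print(f"mean classification rate: {rate:.3f} (chance = 0.100)")
```

The comparison against chance (0.1 for 10 classes) mirrors the slide; a significance test on the rate is left out of the sketch.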
Beyond machine learning ...
THEN look at the MIS-CLASSIFICATIONS and build a CONFUSION MATRIX
Classify 640 brain data samples s1, s2, ..., s640 into 10 classes ω1, ω2, ..., ω10 of the finite set A.
M = (m_iq) is the confusion matrix for a given classification, where m_iq is the number of test samples from class ωi classified as belonging to class ωq.
          London  Moscow   Paris   north   south    east    west Germany  Poland  Russia
London         8      14      11       7       6       3       3      10       8      10
Moscow         8      24      14       6       2       6       4       4       7      15
Paris          6      18      12       4       3       5       8       6      11       7
north          4       2       5      11       9       7      10       1       8       3
south          1       4       3      14      14       9      11       4       7       3
east           4       3       5       9      12      12       7       1       4       3
west           4       3       2      12      13      11      10       2       8       5
Germany        2       2       3       2       2       1       0       9      11       8
Poland         2       3       4       0       4       1       4       9       9       4
Russia         7       7       5       1       5       1       1       8       4      11
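As a small illustration of the definition of M, the sketch below counts (presented word, predicted word) pairs into a matrix. The three-word label lists are made-up toy data, not results from the study, and `confusion_matrix` is an illustrative name.

```python
from collections import Counter

def confusion_matrix(true_labels, predicted_labels, classes):
    """m[i][q] = number of test samples from classes[i] classified as classes[q]."""
    counts = Counter(zip(true_labels, predicted_labels))
    return [[counts[(t, p)] for p in classes] for t in classes]

# Toy data, for illustration only.
classes = ["London", "Moscow", "Paris"]
true_labels = ["London", "London", "Moscow", "Paris", "Paris", "Moscow"]
predicted = ["London", "Moscow", "Moscow", "London", "Paris", "Moscow"]

M = confusion_matrix(true_labels, predicted, classes)
for name, row in zip(classes, M):
    print(f"{name:8s}", row)
```

Off-diagonal cells are the misclassifications; those are the entries the rest of the analysis builds on.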
The relative frequencies m_iq / Σ_q' m_iq' are an N-by-N estimate for the conditional probability densities, designated by the matrix P = (p_iq), that a randomly chosen test sample from class ωi will be classified as belonging to class ωq:
$p_{iq} = m_{iq} / \sum_{q'} m_{iq'}$
Conditional probability density estimates from the classification of brain wave data for London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia. Rows: ωi. Columns: ωq (predicted, classified as).
          London  Moscow   Paris   north   south    east    west Germany  Poland  Russia
London      0.25   0.163   0.138   0.075   0.038    0.05   0.063   0.088   0.075   0.063
Moscow     0.144   0.333   0.133   0.033   0.011   0.056   0.056   0.089   0.067   0.078
Paris      0.175   0.188   0.125   0.038   0.038   0.038    0.05     0.1   0.138   0.113
north      0.067   0.017   0.017   0.283   0.167    0.15     0.1     0.1   0.067   0.033
south      0.014   0.014   0.071   0.114   0.271   0.171   0.157   0.029   0.086   0.071
east        0.05       0       0   0.133    0.15   0.383    0.15    0.05       0   0.083
west       0.057   0.043       0   0.171   0.114   0.143   0.229   0.086     0.1   0.057
Germany    0.075   0.175     0.1    0.05   0.125   0.025    0.05   0.275   0.025     0.1
Poland      0.15     0.1   0.125    0.05    0.05       0   0.025    0.15    0.25     0.1
Russia      0.16    0.04    0.08    0.04     0.1    0.02    0.12     0.1    0.08    0.26
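Turning a confusion matrix into conditional probability estimates is a row normalization: each count is divided by its row total. The sketch below applies that to the confusion matrix shown earlier in the deck. The resulting values need not match the probability table above, which may come from a different classification run. The function name is illustrative.

```python
WORDS = ["London", "Moscow", "Paris", "north", "south",
         "east", "west", "Germany", "Poland", "Russia"]

# Confusion matrix from the earlier slide: rows = presented word, columns = predicted word.
M = [
    [8, 14, 11, 7, 6, 3, 3, 10, 8, 10],
    [8, 24, 14, 6, 2, 6, 4, 4, 7, 15],
    [6, 18, 12, 4, 3, 5, 8, 6, 11, 7],
    [4, 2, 5, 11, 9, 7, 10, 1, 8, 3],
    [1, 4, 3, 14, 14, 9, 11, 4, 7, 3],
    [4, 3, 5, 9, 12, 12, 7, 1, 4, 3],
    [4, 3, 2, 12, 13, 11, 10, 2, 8, 5],
    [2, 2, 3, 2, 2, 1, 0, 9, 11, 8],
    [2, 3, 4, 0, 4, 1, 4, 9, 9, 4],
    [7, 7, 5, 1, 5, 1, 1, 8, 4, 11],
]

def conditional_probabilities(m):
    """Divide each row by its total, so row i estimates P(classified as q | class i)."""
    result = []
    for row in m:
        total = sum(row)
        result.append([cell / total if total else 0.0 for cell in row])
    return result

P = conditional_probabilities(M)
for word, row in zip(WORDS, P):
    print(f"{word:8s}", " ".join(f"{p:.3f}" for p in row))
```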
[Figure: Hierarchical cluster tree (similarity tree) computed from the conditional probability density estimates for the classification of 640 brain wave samples for London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia. Axis ticks: 0.2 to 0.6. Leaf order: London, Paris, Moscow, Germany, Poland, Russia, north, south, west, east.]
LANGUAGE
WordNet: a lexical database of English organized around sets of cognitive synonyms, restricted to the senses that are relevant to the geography sentences.
Latent Semantic Analysis (LSA): a statistical method to extract measures of word similarity from selected sets of documents, such as novels, newspaper articles, textbooks.
BRAIN
Conditional probability density estimates from the classification of brain wave data for London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia (the same 10-by-10 matrix as on the preceding slide).
WORDNET
WordNet is a human-annotated lexical database of English in which nouns, verbs, adjectives, and adverbs are grouped into sets of cognitive synonyms (called synsets), each synset expressing a distinct concept.
See Miller (1995), Fellbaum (1998), and http://wordnet.princeton.edu
The synsets are related to each other primarily through the hypernymy and hyponymy relations for nouns. Other relations in WordNet are part-whole (holonym), member of (meronym), has instance, and so on.
Hyponymy, often referred to as the is-a relation in computational discussions, is defined as follows: a concept represented by a lexical item Li is said to be a hyponym of the concept represented by a lexical item Lk if native speakers of English accept sentences of the form "An Li is a kind of Lk". Conversely, Lk is the hypernym of Li. A hypernym is therefore a more general concept and a hyponym a more specific concept.
WORDNET
Multiple word senses, organized into sets of cognitive synonyms called "synsets".
(n) Paris#1, City of Light#1, French capital#1, capital of France#1 (the capital and largest city of France and international center of culture and commerce)
(n) Paris#2, genus Paris#1 (sometimes placed in subfamily Trilliaceae)
(n) Paris#3 ((Greek mythology) the prince of Troy who abducted Helen from her husband Menelaus and provoked the Trojan War)
(n) Paris#4 (a town in northeastern Texas)
(n) city#1, metropolis#1, urban center#1 (a large and densely populated urban area; may include several independent administrative districts) [relations listed: direct hyponym, full hyponym, part meronym, has instance]
(n) city#2 (an incorporated administrative district established by state charter)
(n) city#3, metropolis#2 (people living in a large densely populated municipality)
(n) east#1, due east#1, eastward#1, E#3 (the cardinal compass point that is at 90 degrees)
(n) East#2, Orient#1 (the countries of Asia)
(n) East#3, eastern United States#1 (the region of the United States lying to the north of the Ohio River and to the east of the Mississippi River)
(n) east#4 (the direction corresponding to the eastward cardinal compass point)
(n) east#5 (a location in the eastern part of a country, region, or city)
LATENT SEMANTIC ANALYSIS (LSA)
LSA is a statistical technique for extracting, from large collections of documents, a measure of how similar two words are to each other in terms of the patterns of their co-occurrences within those documents. See Deerwester et al. (1990), Landauer and Dumais (1997), Landauer et al. (1998).
The underlying idea is that if, for each word, you take into account all the contexts in which it does and does not appear, you get for all the words a set of mutual constraints that represent how similar any two words are to each other.
The similarity judgments produced by latent semantic analysis have been shown to correspond to some extent to human judgments of similarity. After training on about 2000 pages of English text, it scored as well as average test-takers on the synonym portion of the Test of English as a Foreign Language. After training on a psychology textbook, it achieved a passing score on a multiple-choice exam.
We used the application at http://lsa.colorado.edu to compute similarity matrices in term space for our set of words. The computation was based on ~38,000 college-level texts (novels, newspaper articles, ...). A maximum of 300 factors was permitted in the analysis.
Semantic similarity matrix derived from LSA for the set of words London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia.
          London  Moscow   Paris   north   south    east    west Germany  Poland  Russia
London         1    0.17    0.37    0.13    0.12    0.12    0.14    0.17    0.16    0.16
Moscow      0.17       1    0.18    0.13    0.08    0.22    0.14    0.37    0.65    0.69
Paris       0.37    0.18       1     0.1    0.04    0.08    0.09     0.5    0.31    0.36
north       0.13    0.13     0.1       1    0.89     0.6    0.61    0.07    0.09    0.14
south       0.12    0.08    0.04    0.89       1     0.5    0.55    0.02    0.05    0.05
east        0.12    0.22    0.08     0.6     0.5       1    0.85    0.22    0.29     0.3
west        0.14    0.14    0.09    0.61    0.55    0.85       1    0.24    0.26    0.23
Germany     0.17    0.37     0.5    0.07    0.02    0.22    0.24       1    0.85    0.81
Poland      0.16    0.65    0.31    0.09    0.05    0.29    0.26    0.85       1    0.87
Russia      0.16    0.69    0.36    0.14    0.05     0.3    0.23    0.81    0.87       1
[Figure: Hierarchical cluster tree computed from the pair-wise Latent Semantic Analysis (LSA) scores of similarity for London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia, based on ~38,000 college-level texts (novels, newspaper articles, ...). Axis ticks: 0.1 to 0.6. Leaf order: east, west, north, south, London, Paris, Moscow, Germany, Poland, Russia.]
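A cluster tree of this kind can be built by agglomerative clustering on the LSA similarity matrix. The slides do not say which linkage rule or distance was used, so the sketch below assumes average linkage on distance = 1 - similarity, and `average_linkage` is an illustrative name. It records the merge history rather than drawing the tree.

```python
WORDS = ["London", "Moscow", "Paris", "north", "south",
         "east", "west", "Germany", "Poland", "Russia"]

# LSA similarity matrix from the slides (symmetric, 1 on the diagonal).
LSA = [
    [1, 0.17, 0.37, 0.13, 0.12, 0.12, 0.14, 0.17, 0.16, 0.16],
    [0.17, 1, 0.18, 0.13, 0.08, 0.22, 0.14, 0.37, 0.65, 0.69],
    [0.37, 0.18, 1, 0.1, 0.04, 0.08, 0.09, 0.5, 0.31, 0.36],
    [0.13, 0.13, 0.1, 1, 0.89, 0.6, 0.61, 0.07, 0.09, 0.14],
    [0.12, 0.08, 0.04, 0.89, 1, 0.5, 0.55, 0.02, 0.05, 0.05],
    [0.12, 0.22, 0.08, 0.6, 0.5, 1, 0.85, 0.22, 0.29, 0.3],
    [0.14, 0.14, 0.09, 0.61, 0.55, 0.85, 1, 0.24, 0.26, 0.23],
    [0.17, 0.37, 0.5, 0.07, 0.02, 0.22, 0.24, 1, 0.85, 0.81],
    [0.16, 0.65, 0.31, 0.09, 0.05, 0.29, 0.26, 0.85, 1, 0.87],
    [0.16, 0.69, 0.36, 0.14, 0.05, 0.3, 0.23, 0.81, 0.87, 1],
]

def average_linkage(names, sim):
    """Agglomerative clustering on distance = 1 - similarity; returns the merge history."""
    clusters = [[i] for i in range(len(names))]
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                pairs = [(a, b) for a in clusters[i] for b in clusters[j]]
                d = sum(1.0 - sim[a][b] for a, b in pairs) / len(pairs)
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((sorted(names[k] for k in clusters[i]),
                       sorted(names[k] for k in clusters[j]),
                       round(d, 3)))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return merges

merges = average_linkage(WORDS, LSA)
for left, right, height in merges:
    print(height, left, "+", right)
```

A different linkage rule (single, complete) could give a different tree shape from the same matrix.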
LSA provides a straightforward measure of similarity between words.
For WORDNET several different measures of similarity have been devised, e.g.:
Path length between synsets.
Information content: a corpus-based measure of the specificity of a concept, measured in terms of the frequency of occurrence of the concept in a corpus. The corpus is the human-annotated, sense-tagged corpus SemCor (Miller et al. 1993), which links every word in the Brown Corpus to its appropriate WordNet sense. Scaled various ways.
Vector-space models: these work by forming second-order co-occurrence vectors from the WordNet definitions of concepts (known as glosses).
We used five measures in our computations of similarity and took the average score using each of the relevant senses.
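To make the path-length idea concrete, here is a minimal sketch of the Wu-Palmer (wup) measure, 2 * depth(lcs) / (depth(a) + depth(b)), where lcs is the deepest concept that both words share as an ancestor. The hypernym links below are a made-up toy taxonomy, not WordNet's actual hierarchy, so the scores are illustrative only and do not reproduce the values used in the study.

```python
# Hypothetical toy hypernym links (child -> parent); not WordNet's real hierarchy.
PARENT = {
    "entity": None,
    "location": "entity",
    "direction": "entity",
    "city": "location",
    "country": "location",
    "Paris": "city",
    "London": "city",
    "Germany": "country",
    "east": "direction",
}

def path_to_root(node):
    """Return the chain node -> ... -> root, node first."""
    path = []
    while node is not None:
        path.append(node)
        node = PARENT[node]
    return path

def depth(node):
    """Depth counted in nodes, so the root has depth 1."""
    return len(path_to_root(node))

def wup(a, b):
    """Wu-Palmer similarity on the toy taxonomy."""
    ancestors_of_b = set(path_to_root(b))
    lcs = next(n for n in path_to_root(a) if n in ancestors_of_b)  # deepest shared ancestor
    return 2 * depth(lcs) / (depth(a) + depth(b))

print(wup("Paris", "London"))   # two cities share "city"
print(wup("Paris", "Germany"))  # city and country share "location"
print(wup("Paris", "east"))     # only the root is shared
```

Averaging several such measures over the relevant senses, as the slide describes, would be a further step on top of this.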
Semantic similarity matrix derived from WordNet for the set of words London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia, using senses relevant to the geography of Europe and five measures of similarity: wup (path length), lin and jcn (information content), and gv and pgv (vector space measures).
          London  Moscow   Paris   north   south    east    west Germany  Poland  Russia
London         1   0.396   0.466   0.106   0.103   0.076   0.078   0.322   0.299   0.303
Moscow     0.396       1   0.393   0.095   0.094   0.062    0.07   0.286   0.281   0.288
Paris      0.466   0.393       1   0.106   0.104   0.074   0.077   0.327   0.308   0.307
north      0.106   0.095   0.106       1   0.228   0.179    0.21   0.123   0.132   0.111
south      0.103   0.094   0.104   0.228       1   0.172   0.212   0.115   0.107   0.109
east       0.076   0.062   0.074   0.179   0.172       1   0.216   0.093    0.08   0.077
west       0.078    0.07   0.077    0.21   0.212   0.216       1   0.087   0.082   0.083
Germany    0.322   0.286   0.327   0.123   0.115   0.093   0.087       1   0.589   0.409
Poland     0.299   0.281   0.308   0.132   0.107    0.08   0.082   0.589       1   0.403
Russia     0.303   0.288   0.307   0.111   0.109   0.077   0.083   0.409   0.403       1
[Figure: Hierarchical cluster tree computed from pairwise WordNet-based semantic similarity scores for London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia, restricted to senses related to the geography of Europe. Axis ticks: 0.2 to 0.6. Leaf order: Germany, Poland, Russia, London, Paris, Moscow, north, south, west, east.]
BACK TO THE BRAIN ...
London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia
Now let's see how to compare the EEG data and the language data...
Some of the similarity trees show remarkable congruence between the brain and semantic data.
Where exactly does that congruence lie?
Can we devise a quantitative measure of the nature and strength of that congruence?
[Figure: two hierarchical cluster trees, axis ticks 0.2 to 0.6. LANGUAGE DATA, leaf order: Germany, Poland, Russia, London, Paris, Moscow, north, south, west, east. BRAIN DATA, leaf order: London, Paris, Moscow, Germany, Poland, Russia, north, south, west, east.]
[Figure: WordNet-based semantic similarities and EEG conditional probability estimates for London relative to London (L), Moscow (M), Paris (P), north (n), south (s), east (e), west (w), Germany (G), Poland (Po), and Russia (R).]
The Spearman rank correlation for the two sequences in the figure is 0.99, with one-sided significance of 1.84E-10.
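Spearman's rank correlation is the Pearson correlation of the two rank sequences, with tied values sharing the average of their ranks. The sketch below computes it for the London rows of the WordNet and brain tables shown earlier in the deck. It is not claimed to reproduce the value quoted for the figure, whose underlying data may differ, and no significance test is included. Function names are illustrative.

```python
def average_ranks(values):
    """Rank values from 1..n; tied values get the mean of the positions they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman's rho as the Pearson correlation of the rank sequences."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)

# London rows, in the order London, Moscow, Paris, north, south, east, west,
# Germany, Poland, Russia (values taken from the tables earlier in the deck).
wordnet_london = [1, 0.396, 0.466, 0.106, 0.103, 0.076, 0.078, 0.322, 0.299, 0.303]
brain_london = [0.25, 0.163, 0.138, 0.075, 0.038, 0.05, 0.063, 0.088, 0.075, 0.063]

rho = spearman(wordnet_london, brain_london)
print(f"Spearman rho for London: {rho:.3f}")
```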
BRAIN DATA
For each word ω we compute from the conditional probability density estimates a ternary relation R such that R(ω, ω1, ω2) if and only if, with respect to word ω, the conditional probability for word ω1 is smaller than the conditional probability for word ω2; that is, if and only if ω1's similarity difference with ω is smaller than ω2's similarity difference with ω.
R is an ordinal relation of similarity differences, a partial order that is irreflexive, asymmetric, and transitive.
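For a fixed reference word, the relation R reduces to a set of ordered pairs (ω1, ω2). The sketch below builds that set from one row of values using the "smaller than" comparison stated on the slide, then checks the three properties named there. The row is the London row of the brain table shown earlier in the deck; function names are illustrative.

```python
WORDS = ["London", "Moscow", "Paris", "north", "south",
         "east", "west", "Germany", "Poland", "Russia"]

# Conditional probability estimates relative to London (from the earlier table).
LONDON_ROW = dict(zip(WORDS, [0.25, 0.163, 0.138, 0.075, 0.038,
                              0.05, 0.063, 0.088, 0.075, 0.063]))

def ternary_relation(words, row):
    """Pairs (w1, w2) with row[w1] < row[w2]; each stands for R(reference, w1, w2)."""
    return {(a, b) for a in words for b in words if row[a] < row[b]}

def is_strict_partial_order(rel):
    """Check irreflexivity, asymmetry and transitivity of a set of pairs."""
    irreflexive = all(a != b for a, b in rel)
    asymmetric = all((b, a) not in rel for a, b in rel)
    transitive = all((a, d) in rel for a, b in rel for c, d in rel if b == c)
    return irreflexive and asymmetric and transitive

R = ternary_relation(WORDS, LONDON_ROW)
print(len(R), "ordered pairs;", "strict partial order:", is_strict_partial_order(R))
```

Words with equal values (here north and Poland, both 0.075) are left unordered, which is why the result is a partial rather than a total order.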
LANGUAGE DATA
For each word ω we compute from the semantic similarity matrix a ternary relation R such that R(ω, ω1, ω2) if and only if the similarity difference of ω1 with ω is smaller than the similarity difference of ω2 with ω; that is, ω1 is more similar to ω than is ω2.
R is an ordinal relation of similarity differences, a partial order that is irreflexive, asymmetric, and transitive.
London
Language data        Brain data
London   1.000       London   0.275
Paris    0.466       Paris    0.133
Moscow   0.396       Moscow   0.108
Germany  0.322       Germany  0.075
Russia   0.303       north    0.042
Poland   0.299       Russia   0.033
north    0.106       Poland   0.025
south    0.103       east     0.008
west     0.078       south    0.000
east     0.076       west     0.000
Partial orders for London derived from the WordNet semantic similarities of Table 2 and the conditional probability estimates for the brain data of Table 5.
[Diagram: partial-order graphs for London. Node labels for the language data: Poland, north, south, west, east. Node labels for the brain data: north, Poland, east, south, west.]
Following the approach described in Suppes (1974) for the axiomatization of the theory of differences in utility preference or the theory of differences in psychological intensity.
Suppes P (1974). The axiomatic method in the empirical sciences. In L Henkin et al (Eds.), Proceedings of the Tarski Symposium, Proceedings of Symposia in Pure Mathematics 25. Providence, RI: American Mathematical Society, pp. 465-479.
Brain data: the relational structure (A, R) constructed from R and the finite set A of classes ω1, ω2, ..., ωN, together with the N partial orders constructed from the N-by-N estimate for the conditional probability densities.
Language data: the relational structure (A, R) constructed from R and the finite set A of classes ω1, ω2, ..., ωN, together with the N partial orders constructed from the N-by-N similarity matrix.
For each ωi we compare the partial order of the brain data with the partial order of the language data using Spearman's rank correlation coefficient, which we interpret in the usual way to determine if we have a statistically significant correlation or not.
For those instances in which the brain and language partial orders are significantly correlated, we find the partial order that is invariant with respect to the brain and language data. Here are two more examples.
[Diagram: Significant Invariance - Paris - Spearman 0.88 (p = 6.6795e-004). Node labels as extracted: London, Moscow; Paris; north; south, east; west; Germany, Poland; Russia.]
[Diagram: Significant Invariance - Paris - Spearman 0.90 (p = 3.8716e-004). Node labels as extracted: London, Moscow, Paris; north; south, east; west; Germany, Poland, Russia.]
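One natural reading of "the partial order that is invariant with respect to the brain and language data" is the set of ordered pairs on which both orders agree, that is, the intersection of the two relations. That reading is an assumption here, not a statement of the authors' procedure. The sketch uses four of the London values from the language/brain table earlier in the deck.

```python
def strict_order(row):
    """All pairs (a, b) with row[a] < row[b]."""
    return {(a, b) for a in row for b in row if row[a] < row[b]}

def invariant_order(brain_row, language_row):
    """Pairs ordered the same way in both data sets (set intersection)."""
    return strict_order(brain_row) & strict_order(language_row)

# Values relative to London for four words, taken from the earlier London table.
language = {"Germany": 0.322, "Russia": 0.303, "Poland": 0.299, "north": 0.106}
brain = {"Germany": 0.075, "north": 0.042, "Russia": 0.033, "Poland": 0.025}

invariant = invariant_order(brain, language)
for lower, higher in sorted(invariant):
    print(f"{lower} < {higher} in both brain and language data")
```

Pairs that the two data sets order differently (here north against Poland and against Russia) drop out, so the invariant order can be sparser than either input.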
Every single-trial classification produces its own conditional probability density estimates, giving rise to its own similarity tree ...
Brain data
[Figure: Another hierarchical cluster tree (similarity tree) computed from the conditional probability density estimates for the classification of 640 brain wave samples for London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia. Axis ticks: 0.2 to 0.6. Leaf order: London, Paris, Moscow, Germany, Poland, Russia, north, south, west, east.]
[Figure: a further similarity tree. Axis ticks: 0.1 to 0.5. Leaf order: north, east, south, west, London, Paris, Germany, Moscow, Russia, Poland.]
1. We compute M (= 30) single-trial classifications of the data (640 data samples for our 10 words) using random resampling with replacement.
2. For each classification, we find for each word the partial orders of the brain and language data that are significantly correlated.
3. For each of these highly correlated partial order pairs, we find the partial order invariant with respect to both.
We performed 60 classifications; that is, we recomputed the classifications of the brain data using random resampling with replacement.
For half of these 60 classifications we compared the brain data to the WordNet data, and for the other half we compared the brain data to the LSA data.
And we plotted the results.
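Random resampling with replacement can be sketched as drawing, for each classification, as many sample indices as there are samples, allowing repeats. The numbers 640 and 60 come from the slides. The seed, the function name, and the rule that assigns the first 30 runs to WordNet and the rest to LSA are assumptions made only for illustration.

```python
import random

def resample_with_replacement(n_samples, rng):
    """Draw n_samples indices uniformly at random, allowing repeats."""
    return [rng.randrange(n_samples) for _ in range(n_samples)]

rng = random.Random(2013)   # arbitrary seed, only for repeatability
N_SAMPLES = 640             # EEG samples per participant (from the slides)
N_CLASSIFICATIONS = 60      # half compared with WordNet, half with LSA (from the slides)

runs = []
for run in range(N_CLASSIFICATIONS):
    indices = resample_with_replacement(N_SAMPLES, rng)
    # Assumption: which runs go with which semantic source is not stated on the slide.
    semantic_source = "WordNet" if run < N_CLASSIFICATIONS // 2 else "LSA"
    runs.append((semantic_source, indices))

print(len(runs), "resampled data sets;",
      sum(1 for source, _ in runs if source == "WordNet"), "paired with WordNet")
```

Each resampled index list would then feed one classification, one confusion matrix, and one round of partial-order comparisons.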
[Bar chart: Partial Orders of Similarity Differences Invariant between EEG-recorded Brain Data and Semantic Representations of Language. Vertical axis: Significant Invariant Partial Orders (0 to 300). Horizontal axis: Test Participants (S10, S24, S25, S27, S16, S26, S12, S13, S18). Series: WordNet, LSA.]
A higher number of significant structural similarities was found between the brain data and the WordNet data than between the brain data and the LSA data.
This stronger structural similarity between the brain data and the WordNet-derived data supports the contention that, during language comprehension for the complex cognitive task of assessing truth or falsity, the representation of words in the brain has a WordNet-like quality.
This stronger structural similarity is the first evidence we have seen of WordNet-like representations in the brain.
Banerjee S, Pedersen T (2003). Extended Gloss Overlaps as a Measure of Semantic Relatedness. In Proceedings of the Eighteenth International Joint Conference on Artificial Intelligence, pp. 805-810. August 9-15, 2003, Acapulco, Mexico.
Banerjee S, Pedersen T (2002). An Adapted Lesk Algorithm for Word Sense Disambiguation using WordNet. In Proceedings of the Third International Conference on Intelligent Text Processing and Computational Linguistics, pp. 136-145. February 17-23, 2002, Mexico City.
Baroni M, Murphy B, Barbu E, Poesio M (2010). Strudel: A Corpus-Based Semantic Model Based on Properties and Types. Cognitive Science, vol. 34, no. 2, pp. 222-254. DOI: 10.1111/j.1551-6709.2009.01068.x
Blei D, Ng A, Jordan M (2003). Latent Dirichlet allocation. Journal of Machine Learning Research 3, 993-1022.
Brants T, Franz A (2006). www.ldc.upenn.edu/Catalog/CatalogEntry.jsp?catalogId=LDC2006T13. Linguistic Data Consortium, Philadelphia.
Budanitsky A, Hirst G (2001). Semantic distance in WordNet: an experimental, application-oriented evaluation of five measures. In Proceedings of the NAACL 2001 Workshop on WordNet and Other Lexical Resources, Pittsburgh, June 2001. http://www.seas.smu.edu/~rada/mwnw/papers/WNW-NAACL-101.pdf.gz
Collins AM, Loftus EF (1975). A spreading-activation theory of semantic processing. Psychological Review 1975 Nov, Vol. 82(6), 407-428.
Crestani F (1997). Application of Spreading Activation Techniques in Information Retrieval. Artificial Intelligence Review 11, 453-482.
Deerwester S, Dumais ST, Furnas GW, Landauer TK, Harshman R (1990). Indexing By Latent Semantic Analysis. Journal of the American Society for Information Science 41, 391-407.
Devereux B, Kelly C, Korhonen A (2010). Using fMRI activation to conceptual stimuli to evaluate methods for extracting conceptual representations from corpora. Proceedings of the NAACL HLT 2010 First Workshop on Computational Neurolinguistics, June 2010, Los Angeles, USA. Association for Computational Linguistics, 70-78.
Fellbaum C (Ed.) (1998). WordNet: An Electronic Lexical Database. Cambridge, MA: MIT Press.
Hauk O, Davis MH, Ford M, Pulvermüller F, Marslen-Wilson WD (2006). The time course of visual word recognition as revealed by linear regression analysis of ERP data. Neuroimage 2006 May 1;30(4):1383-400. Epub 2006 Feb 7.
Hirst G (1988). Resolving lexical ambiguity computationally with spreading activation and Polaroid Words. In Small SL, Cottrell GW, Tanenhaus MK (Eds.), Lexical ambiguity resolution: Perspectives from psycholinguistics, neuropsychology, and artificial intelligence. San Mateo, CA: Morgan Kaufmann Publishers, 1988, 73-107.
Jelodar AB, Alizadeh M, Khadivi S (2010). WordNet Based Features for Predicting Brain Activity associated with meanings of noun. Proceedings of the NAACL HLT 2010 First Workshop on Computational Neurolinguistics, June 2010, Los Angeles, USA. Association for Computational Linguistics, 18-26.
Jiang J, Conrath D (1997). Semantic similarity based on corpus statistics and lexical taxonomy. In Proceedings of International Conference on Research in Computational Linguistics, Taiwan, 19-33.
Just MA, Cherkassky VL, Aryal S, Mitchell TM (2010). A Neurosemantic Theory of Concrete Noun Representation Based on the Underlying Brain Codes. PLoS ONE 5(1): e8622. doi:10.1371/journal.pone.0008622
Karp RM (1972). Reducibility among combinatorial problems. In Complexity of Computer Computations, Proc. Sympos. IBM Thomas J. Watson Res. Center, Yorktown Heights, NY (Ed. RE Miller and JW Thatcher). New York: Plenum, pp. 85-103.
Kelly C, Devereux B, Korhonen A (2010). Acquiring Human-like Feature-Based Conceptual Representations from Corpora. Proceedings of the NAACL HLT 2010 First Workshop on Computational Neurolinguistics, June 2010, Los Angeles, USA. Association for Computational Linguistics, 61-69.
Kriegeskorte N, Mur M, Bandettini PA (2008). Representational similarity analysis - connecting the branches of systems neuroscience. Frontiers in Systems Neuroscience. doi:10.3389/neuro.06.004.2008
Kutas M, Hillyard SA (1984). Brain potentials during reading reflect word expectancy and semantic association. Nature 307, 161-163.
Kutas M, Federmeier KD (2011). Thirty years and counting: Finding meaning in the N400 component of the event related brain potential (ERP). Annual Review of Psychology 2011, 62, pp. 621-647.
Landauer TK, Dumais ST (1997). A solution to Plato's problem: The Latent Semantic Analysis theory of the acquisition, induction, and representation of knowledge. Psychological Review 104, 211-240.
Landauer TK, Foltz PW, Laham D (1998). Introduction to Latent Semantic Analysis. Discourse Processes 25, 259-284.
Leacock C, Chodorow M (1998). Combining local context and WordNet similarity for word sense identification. In Fellbaum (Ed.), WordNet: An Electronic Lexical Database. Cambridge, MA: MIT Press, pp. 265-283.
Leech G, Rayson P, Wilson A (2001). Word Frequencies in Written and Spoken English: based on the British National Corpus. Longman, London.
Lin D (1998). An information-theoretic definition of similarity. In Proc. 15th International Conf. on Machine Learning. Morgan Kaufmann, San Francisco, CA, pp. 296-304.
Luce RD (1956). Semiorders and a theory of utility discrimination. Econometrica 24, 178-191. MR0078632. http://www.jstor.org/stable/1905751
Matsunaga T, Yonemori C, Tomita E, Muramatsu M (2009). Clique-based data mining for related genes in a biomedical database. BMC Bioinformatics 2009 Jul 1;10:205.
Martin Ph (2003). Correction and Extension of WordNet 1.7. ICCS 2003, 11th International Conference on Conceptual Structures (© Springer Verlag, LNAI 2746, pp. 160-173), Dresden, Germany, July 21-25, 2003.
Miller GA, Nicely P (1955). An analysis of perceptual confusions among some English consonants. J Acoust Soc Am 27(2).
Miller GA, Leacock C, Tengi R, Bunker RT (1993). A Semantic Concordance. In Proceedings of the 3rd DARPA Workshop on Human Language Technology.
Miller GA (1995). WordNet: A Lexical Database for English. Communications of the ACM, Vol. 38, No. 11, 39-41.
Mitchell TM, Shinkareva SV, Carlson A, Chang KM, Malave VL, Mason RA, Just MA (2008). Predicting Human Brain Activity Associated with the Meanings of Nouns. Science 320, 1191, May 30, 2008. DOI: 10.1126/science.1152876
Murphy B, Baroni M, Poesio M (2009). EEG responds to conceptual stimuli and corpus semantics. In Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing, pages 619-627, Singapore, 6-7 August 2009. © 2009 ACL and AFNLP.
Murphy B, Poesio M (2010). Detecting semantic category in simultaneous EEG/MEG recordings. In First workshop on computational neurolinguistics, NAACL HLT 2010 (pp. 36-44). Los Angeles: Association for Computational Linguistics.
Murphy B, Poesio M, Bovolo F, Bruzzone L, Dalponte M, Lakany H (2011). EEG decoding of semantic category reveals distributed representations for single concepts. Brain Lang 2011 Apr;117(1):12-22. doi:10.1016/j.bandl.2010.09.013
Pedersen T Patwardhan S Michelizzi J (2004) WordNetSimilarity - Measuring the Relatedness of Concepts In Proceedings of Fifth Annual Meeting of the North American Chapter of the Association for Computational Linguistics (NAACL-2004) pp 38-41 Boston May 2004 Pereira F M Botvinick G Detre (2010) Learning semantic features for fMRI data from definitional text Proceedings of the NAACL HLT 2010 First Workshop on Computational Neurolinguistics June 2010Los Angeles USA Association for Computational Linguistics 1mdash9 httpwwwaclweborganthologyW10-0601 Perreau-Guimaraes M Wong DK Uy ET Grosenick L Suppes P (2007) Single-trial classification of MEG recordings IEEE Transactions on Biomedical Engineering 54 436ndash443 Rabinovitch I (1977) The Scott-Suppes theorem on semiorders J Mathematical Psychology 15 (2) 209ndash212 MR0437404 Resnik P (1995) Using information content to evaluate semantic similarity In Proceedings of the 14th International Joint Conference on Artificial Intelligence pages 448-453 Montreal Roach BJ Mathalon DH (2008) Event-related EEG time-frequency analysis an overview of measures and an analysis of early gamma band phase locking in schizophrenia Schizophr Bull 2008 Sep34(5)907-26 Epub 2008 Aug 6 Scott D Suppes P (1958) Foundational aspects of theories of measurement The Journal of Symbolic Logic 23 113ndash128 MR0115919 Suppes P (1972) Axiomatic Set Theory New York Dover Suppes P (1974) The axiomatic method in the empirical sciences L Henkin et al (Eds) Proceedings of the Tarski Symposium Proceedings of Symposia in Pure Mathematics 25 Providence RI American Mathematical Society pp 465-479 Suppes P Lu Z-L Han B (1997) Brain wave recognition of words Proceedings of the National Academy of Sciences 95 14965ndash14969 Suppes P Han B Lu Z-L (1998) Brain-wave recognition of sentences Proceedings of the National Academy of Sciences 95 15861ndash15866 Suppes P Han B Epelboim J Lu Z-L (1999a) Invariance between subjects of brain wave representations of language 
Proceedings of the National Academy of Sciences 96 12953ndash12958 Suppes P Han B Epelboim J Lu Z-L (1999b) Invariance of brain-wave representations of simple visual images and their names Proceedings of the National Academy of Sciences 96 14658-14663 Suppes P Han B (2000) Brain-wave representation of words by superposition of a few sine waves Proceedings of the National Academy of Sciences 97 8738ndash8743 Suppes P Perreau-Guimaraes M Wong DK (2009) Partial Orders of Similarity Differences Invariant Between EEG-Recorded Brain and Perceptual Representations of Language Neural Computation 21 3228ndash3269
45 copy Colleen E Crangle 2013
Suppes P de Barros JA Oas G (2012) Phase-Oscillator computations as neural models of stimulus-response conditioning and response selection Journal of Mathematical Psychology 56 95-117 Turney P (2001) Mining the Web for Synonyms PMI-IR versus LSA on TOEFL In L De Raedt amp P Flach (Eds) Proceedings of the Twelfth European Conference on Machine Learning (ECML-2001) (pp 491-502) Freiburg Germany Vassilieva E Pinto G de Barros JA Suppes P (2011) Learning Pattern Recognition Through Quasi-Synchronization of Phase Oscillators IEEE Transactions on Neural Networks Vol 22 No 1 January 2011 Vigliocco G Warren J Siri S Arciuli J Scott S Wise R (2006) The role of semantics and grammatical class in the neural representation of words Cereb Cortex 2006 Dec16(12)1790-6 Epub 2006 Jan 18 Wong DK Perreau-Guimaraes M Uy ET Suppes P (2004) Classification of individual trials based on the best independent component of EEG-recorded sentences Neurocomputing 61 479-484 Wong DK Uy ET Perreau-Guimaraes M Yang W Suppes P (2006) Interpretation of perceptron weights as constructed time series for EEG classification Neurocomputing 70 373-383 Wong DK Grosenick L Uy ET Perreau-Guimaraes M Carvalhaes CG Desain P Suppes P (2008) Quantifying inter-subject agreement in brain-imaging analyses NeuroImage 39 1051ndash1063 Woods W (1975) Whatrsquos in a link Foundations for semantic networks In Bobrow D Collins A eds Representation and understanding New York Academic Press 197535-82 Wu Z Palmer M (1994) Verb Semantics and Lexical Selection In Proceedings of the 32nd Annual Meeting of the Association for Computational Linguistics Las Cruces New Mexico pp 133--138
46 copy Colleen E Crangle 2013
X ∈ {Berlin, London, Moscow, Paris, Rome, Warsaw, Madrid, Vienna, Athens}
Y ∈ {France, Germany, Italy, Poland, Russia, Austria, Greece, Spain}
W ∈ {the capital, the largest city}
Z ∈ {north, south, east, west}
Sentence forms: "W of Y is [not] X", "X is [not] W of Y", "X is [not] Z of X", "Y is [not] Z of Y"
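A minimal sketch of how the sentence forms above could be enumerated. The function and constant names are hypothetical, and it assumes that the two places in a direction sentence ("X is [not] Z of X") are different members of the same set, which the examples suggest but the slide does not state.

```python
from itertools import permutations, product

CITIES = ["Berlin", "London", "Moscow", "Paris", "Rome",
          "Warsaw", "Madrid", "Vienna", "Athens"]          # X
COUNTRIES = ["France", "Germany", "Italy", "Poland", "Russia",
             "Austria", "Greece", "Spain"]                 # Y
ROLES = ["the capital", "the largest city"]                # W
DIRECTIONS = ["north", "south", "east", "west"]            # Z
NEGATIONS = ["", "not "]                                   # the optional [not]


def role_sentences():
    """Yield 'W of Y is [not] X' and 'X is [not] W of Y'."""
    for role, country, city, neg in product(ROLES, COUNTRIES, CITIES, NEGATIONS):
        yield f"{role.capitalize()} of {country} is {neg}{city}"
        yield f"{city} is {neg}{role} of {country}"


def direction_sentences(places):
    """Yield 'A is [not] Z of B' for two different places (an assumption)."""
    pairs = permutations(places, 2)
    for (a, b), direction, neg in product(pairs, DIRECTIONS, NEGATIONS):
        yield f"{a} is {neg}{direction} of {b}"


def all_sentences():
    """Collect every sentence the four forms can produce."""
    sentences = list(role_sentences())
    sentences += direction_sentences(CITIES)
    sentences += direction_sentences(COUNTRIES)
    return sentences
```

Under these assumptions the forms generate both true and false statements, which fits a task in which participants judge truth or falsity.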
From http://www2.le.ac.uk
Paris is north of Berlin
The capital of Germany is Berlin
Spain is east of France
London is not the capital of France
Vienna is east of Moscow
The largest city of Poland is Athens
Fifteen seconds of EEG data. From http://sccn.ucsd.edu
LANGUAGE BRAIN
What our work does …
Compares the brain data and the language data and finds structural similarities between them.
HOW DO WE REPRESENT THE EEG DATA?
HOW DO WE REPRESENT THE LANGUAGE DATA?
HOW DO WE COMPARE THE TWO?
Berlin, London, Moscow, Paris, Rome, Warsaw, Madrid, Vienna, Athens
France, Germany, Italy, Poland, Russia, Austria, Greece, Spain
north, south, east, west
Paris is north of Berlin
Spain is east of France
The capital of Germany is Berlin
Paris
east of
Germany
Machine learning approach to the study of brain and language
For the 10 geography words (London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia) we have 640 EEG data samples for each participant.
We want to classify these 640 samples into the correct 10 classes.
Use 580 brain samples and their associated words to train the classifier.
Test the remaining 60 samples.
Do this many times, each time using a different set of training samples.
Find the average classification rate and make sure it is statistically significant.
Obtained classification rates in the range 25% to 29%, with a mean classification rate of around 24.5% (p < 10E−10). Significantly higher than chance (10%).
We used a 5-fold linear discriminant model, with principal component analysis for blind source separation, to classify the segments of EEG data obtained from the individual trials.
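The train/test procedure above can be sketched as follows. This is only an illustration under stated assumptions: the classifier itself (principal component analysis plus a linear discriminant model) is not shown, the function names are hypothetical, and the one-sided binomial test against the 10% chance level is a stand-in, since the slides do not say which significance test was used.

```python
import math
import random


def random_split(n_samples=640, n_test=60, rng=random):
    """Shuffle sample indices and split them into training and test sets."""
    indices = list(range(n_samples))
    rng.shuffle(indices)
    return indices[n_test:], indices[:n_test]


def binomial_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p), summed in log space to avoid underflow."""
    total = 0.0
    for j in range(k, n + 1):
        log_pmf = (math.lgamma(n + 1) - math.lgamma(j + 1) - math.lgamma(n - j + 1)
                   + j * math.log(p) + (n - j) * math.log(1 - p))
        total += math.exp(log_pmf)
    return total


rng = random.Random(0)                   # fixed seed so the split is repeatable
train, test = random_split(rng=rng)      # 580 training indices, 60 test indices

# Hypothetical outcome: 157 of 640 samples correct (about 24.5%) against 10% chance.
p_value = binomial_tail(640, 157, 0.10)
```

Calling `random_split` repeatedly with the same `rng` gives a different set of training samples each time; the classification rates from those runs would then be averaged.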
Beyond machine learning …
THEN look at the MIS-CLASSIFICATIONS and build a CONFUSION MATRIX
Classify 640 brain data samples s_1, s_2, …, s_640 into 10 classes ω_1, ω_2, …, ω_10 of the finite set A.
M = (m_iq) is the confusion matrix for a given classification, where m_iq is the number of test samples from class ω_i classified as belonging to class ω_q.
          London  Moscow   Paris   north   south    east    west Germany  Poland  Russia
London         8      14      11       7       6       3       3      10       8      10
Moscow         8      24      14       6       2       6       4       4       7      15
Paris          6      18      12       4       3       5       8       6      11       7
north          4       2       5      11       9       7      10       1       8       3
south          1       4       3      14      14       9      11       4       7       3
east           4       3       5       9      12      12       7       1       4       3
west           4       3       2      12      13      11      10       2       8       5
Germany        2       2       3       2       2       1       0       9      11       8
Poland         2       3       4       0       4       1       4       9       9       4
Russia         7       7       5       1       5       1       1       8       4      11
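As an illustration of the definition of M, here is a small sketch that builds a confusion matrix from paired lists of actual and predicted labels. The function name and the six toy labels are hypothetical; they are not data from the study.

```python
def confusion_matrix(actual, predicted, classes):
    """Return M where M[i][q] counts samples of classes[i] classified as classes[q]."""
    index = {label: position for position, label in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for true_label, guess in zip(actual, predicted):
        matrix[index[true_label]][index[guess]] += 1
    return matrix


# Toy example with three classes and six test samples (made-up labels).
classes = ["London", "Moscow", "Paris"]
actual = ["London", "London", "Moscow", "Paris", "Paris", "Moscow"]
predicted = ["London", "Moscow", "Moscow", "London", "Paris", "Moscow"]
M = confusion_matrix(actual, predicted, classes)
```

Rows index the actual class ω_i and columns the class ω_q the sample was assigned to, so correct classifications accumulate on the diagonal.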
The relative frequencies m_iq / Σ_j m_ij are an N-by-N estimate for the conditional probability densities, designated by the matrix P = (p_iq), that a randomly chosen test sample from class ω_i will be classified as belonging to class ω_q:
p_iq = m_iq / Σ_j m_ij
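A short sketch of this row normalization, applied to the confusion matrix counts shown earlier. The function name is hypothetical. The estimates tabulated in the slides do not match these particular counts, so they presumably come from a different classification run.

```python
CLASSES = ["London", "Moscow", "Paris", "north", "south",
           "east", "west", "Germany", "Poland", "Russia"]

# Confusion matrix counts from the slides (rows: actual class, columns: classified as).
M = [
    [8, 14, 11, 7, 6, 3, 3, 10, 8, 10],
    [8, 24, 14, 6, 2, 6, 4, 4, 7, 15],
    [6, 18, 12, 4, 3, 5, 8, 6, 11, 7],
    [4, 2, 5, 11, 9, 7, 10, 1, 8, 3],
    [1, 4, 3, 14, 14, 9, 11, 4, 7, 3],
    [4, 3, 5, 9, 12, 12, 7, 1, 4, 3],
    [4, 3, 2, 12, 13, 11, 10, 2, 8, 5],
    [2, 2, 3, 2, 2, 1, 0, 9, 11, 8],
    [2, 3, 4, 0, 4, 1, 4, 9, 9, 4],
    [7, 7, 5, 1, 5, 1, 1, 8, 4, 11],
]


def conditional_probabilities(matrix):
    """Divide each row by its total: p_iq = m_iq / sum_j m_ij."""
    probabilities = []
    for row in matrix:
        total = sum(row)
        probabilities.append([count / total if total else 0.0 for count in row])
    return probabilities


P = conditional_probabilities(M)
```

Each row of `P` sums to 1, so row i can be read as the estimated distribution over predicted classes for samples whose actual class is ω_i.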
Conditional probability density estimates from the classification of brain wave data for London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia
Rows: ω_i. Columns: ω_q (predicted, classified as).
          London  Moscow   Paris   north   south    east    west Germany  Poland  Russia
London      0.25   0.163   0.138   0.075   0.038    0.05   0.063   0.088   0.075   0.063
Moscow     0.144   0.333   0.133   0.033   0.011   0.056   0.056   0.089   0.067   0.078
Paris      0.175   0.188   0.125   0.038   0.038   0.038    0.05     0.1   0.138   0.113
north      0.067   0.017   0.017   0.283   0.167    0.15     0.1     0.1   0.067   0.033
south      0.014   0.014   0.071   0.114   0.271   0.171   0.157   0.029   0.086   0.071
east        0.05       0       0   0.133    0.15   0.383    0.15    0.05       0   0.083
west       0.057   0.043       0   0.171   0.114   0.143   0.229   0.086     0.1   0.057
Germany    0.075   0.175     0.1    0.05   0.125   0.025    0.05   0.275   0.025     0.1
Poland      0.15     0.1   0.125    0.05    0.05       0   0.025    0.15    0.25     0.1
Russia      0.16    0.04    0.08    0.04     0.1    0.02    0.12     0.1    0.08    0.26
Figure: Hierarchical cluster tree (similarity tree) computed from the conditional probability density estimates for the classification of 640 brain wave samples for London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia. Leaf order: London, Paris, Moscow, Germany, Poland, Russia, north, south, west, east. Axis ticks: 0.2 to 0.6.
WordNet: a lexical database of English organized around sets of cognitive synonyms, restricted to the senses that are relevant to the geography sentences.
Latent Semantic Analysis (LSA): a statistical method to extract measures of word similarity from selected sets of documents such as novels, newspaper articles, textbooks.
LANGUAGE BRAIN
WORDNET
WordNet is a human-annotated lexical database of English in which nouns, verbs, adjectives and adverbs are grouped into sets of cognitive synonyms (called synsets), each synset expressing a distinct concept.
See Miller (1995), Fellbaum (1998) and http://wordnet.princeton.edu
The synsets are related to each other primarily through the hypernymy and hyponymy relations for nouns. Other relations in WordNet are part-whole (holonym), member of (meronym), has instance, and so on.
Hyponymy, often referred to as the is–a relation in computational discussions, is defined as follows: a concept represented by a lexical item L_i is said to be a hyponym of the concept represented by a lexical item L_k if native speakers of English accept sentences of the form "An L_i is a kind of L_k". Conversely, L_k is the hypernym of L_i. A hypernym is therefore a more general concept and a hyponym a more specific concept.
WORDNET
(n) Paris#1, City of Light#1, French capital#1, capital of France#1 (the capital and largest city of France and international center of culture and commerce)
(n) Paris#2, genus Paris#1 (sometimes placed in subfamily Trilliaceae)
(n) Paris#3 ((Greek mythology) the prince of Troy who abducted Helen from her husband Menelaus and provoked the Trojan War)
(n) Paris#4 (a town in northeastern Texas)
Multiple word senses
Organized into sets of cognitive synonyms called “synsets”
(n) city#1, metropolis#1, urban center#1 (a large and densely populated urban area; may include several independent administrative districts)
Links shown for city#1: direct hyponym, full hyponym, part meronym, has instance
(n) city#2 (an incorporated administrative district established by state charter)
(n) city#3, metropolis#2 (people living in a large densely populated municipality)
(n) east#1, due east#1, eastward#1, E#3 (the cardinal compass point that is at 90 degrees)
(n) East#2, Orient#1 (the countries of Asia)
(n) East#3, eastern United States#1 (the region of the United States lying to the north of the Ohio River and to the east of the Mississippi River)
(n) east#4 (the direction corresponding to the eastward cardinal compass point)
(n) east#5 (a location in the eastern part of a country, region or city)
LATENT SEMANTIC ANALYSIS (LSA)
LSA is a statistical technique for extracting, from large collections of documents, a measure of how similar two words are to each other in terms of the patterns of their co-occurrences within those documents. See Deerwester et al. (1990), Landauer and Dumais (1997), Landauer et al. (1998).
The underlying idea is that if, for each word, you take into account all the contexts in which it does and does not appear, you get for all the words a set of mutual constraints that represent how similar any two words are to each other.
The similarity judgments produced by latent semantic analysis have been shown to correspond to some extent to human judgments of similarity. After training on about 2000 pages of English text, it scored as well as average test-takers on the synonym portion of the Test of English as a Foreign Language. After training on a psychology textbook, it achieved a passing score on a multiple-choice exam.
We used the application at http://lsa.colorado.edu to compute similarity matrices in term space for our set of words. The computation was based on ~38000 college-level texts (novels, newspaper articles, …). A maximum of 300 factors was permitted in the analysis.
Semantic similarity matrix derived from LSA for the set of words London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia
          London  Moscow   Paris   north   south    east    west Germany  Poland  Russia
London         1    0.17    0.37    0.13    0.12    0.12    0.14    0.17    0.16    0.16
Moscow      0.17       1    0.18    0.13    0.08    0.22    0.14    0.37    0.65    0.69
Paris       0.37    0.18       1     0.1    0.04    0.08    0.09     0.5    0.31    0.36
north       0.13    0.13     0.1       1    0.89     0.6    0.61    0.07    0.09    0.14
south       0.12    0.08    0.04    0.89       1     0.5    0.55    0.02    0.05    0.05
east        0.12    0.22    0.08     0.6     0.5       1    0.85    0.22    0.29     0.3
west        0.14    0.14    0.09    0.61    0.55    0.85       1    0.24    0.26    0.23
Germany     0.17    0.37     0.5    0.07    0.02    0.22    0.24       1    0.85    0.81
Poland      0.16    0.65    0.31    0.09    0.05    0.29    0.26    0.85       1    0.87
Russia      0.16    0.69    0.36    0.14    0.05     0.3    0.23    0.81    0.87       1
Figure: Hierarchical cluster tree computed from the pair-wise Latent Semantic Analysis (LSA) scores of similarity for London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia, based on ~38000 college-level texts (novels, newspaper articles, …). Leaf order: east, west, north, south, London, Paris, Moscow, Germany, Poland, Russia. Axis ticks: 0.1 to 0.6.
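One way a similarity tree of this kind could be computed from the LSA matrix is sketched below. The slides do not state the distance or the linkage rule, so both choices here (distance = 1 − similarity, average linkage) are assumptions, and the function name is hypothetical.

```python
from itertools import combinations

WORDS = ["London", "Moscow", "Paris", "north", "south",
         "east", "west", "Germany", "Poland", "Russia"]

# LSA similarity matrix from the slides, rows and columns in the order of WORDS.
LSA = [
    [1, 0.17, 0.37, 0.13, 0.12, 0.12, 0.14, 0.17, 0.16, 0.16],
    [0.17, 1, 0.18, 0.13, 0.08, 0.22, 0.14, 0.37, 0.65, 0.69],
    [0.37, 0.18, 1, 0.1, 0.04, 0.08, 0.09, 0.5, 0.31, 0.36],
    [0.13, 0.13, 0.1, 1, 0.89, 0.6, 0.61, 0.07, 0.09, 0.14],
    [0.12, 0.08, 0.04, 0.89, 1, 0.5, 0.55, 0.02, 0.05, 0.05],
    [0.12, 0.22, 0.08, 0.6, 0.5, 1, 0.85, 0.22, 0.29, 0.3],
    [0.14, 0.14, 0.09, 0.61, 0.55, 0.85, 1, 0.24, 0.26, 0.23],
    [0.17, 0.37, 0.5, 0.07, 0.02, 0.22, 0.24, 1, 0.85, 0.81],
    [0.16, 0.65, 0.31, 0.09, 0.05, 0.29, 0.26, 0.85, 1, 0.87],
    [0.16, 0.69, 0.36, 0.14, 0.05, 0.3, 0.23, 0.81, 0.87, 1],
]


def average_linkage(labels, similarity):
    """Agglomerative clustering; returns the merges as (cluster_a, cluster_b, distance)."""
    index = {word: i for i, word in enumerate(labels)}
    clusters = [(word,) for word in labels]
    merges = []

    def distance(a, b):
        # Mean of (1 - similarity) over all cross-cluster word pairs.
        total = sum(1.0 - similarity[index[x]][index[y]] for x in a for y in b)
        return total / (len(a) * len(b))

    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2), key=lambda pair: distance(*pair))
        merges.append((a, b, distance(a, b)))
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
    return merges


merges = average_linkage(WORDS, LSA)
```

The list of merges, read in order, describes the tree from the leaves upward; with these assumptions the closest pairs are joined first and the final merge joins the last two clusters.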
LSA provides a straightforward measure of similarity between words.
For WORDNET, several different measures of similarity have been devised, e.g.:
Path length between synsets.
Information content: a corpus-based measure of the specificity of a concept, measured in terms of the frequency of occurrence of the concept in the corpus, here the human-annotated sense-tagged corpus SemCor (Miller et al. 1993), which links every word in the Brown Corpus to its appropriate WordNet sense. Scaled various ways.
Vector-space models: these work by forming second-order co-occurrence vectors from the WordNet definitions of concepts, known as glosses.
We used five measures in our computations of similarity and took the average score, using each of the relevant senses.
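To illustrate a path-based measure, here is a sketch of the Wu and Palmer (1994) score, 2 · depth(lcs) / (depth(a) + depth(b)), where lcs is the deepest concept that subsumes both a and b. The taxonomy below is a hypothetical toy hierarchy invented for the example, not the WordNet hierarchy, so the resulting numbers are not comparable to the values reported in the slides.

```python
# Hypothetical toy taxonomy (child -> parent); real WordNet hierarchies are much deeper.
PARENT = {
    "entity": None,
    "location": "entity",
    "region": "location",
    "city": "region",
    "country": "region",
    "Paris": "city",
    "London": "city",
    "Germany": "country",
    "direction": "entity",
    "north": "direction",
}


def path_to_root(node):
    """Return the chain node, parent, grandparent, ... up to the root."""
    path = []
    while node is not None:
        path.append(node)
        node = PARENT[node]
    return path


def depth(node):
    """Depth counted in nodes, so the root has depth 1."""
    return len(path_to_root(node))


def lowest_common_subsumer(a, b):
    """Deepest node that is an ancestor of (or equal to) both a and b."""
    ancestors_of_a = set(path_to_root(a))
    for node in path_to_root(b):
        if node in ancestors_of_a:
            return node
    raise ValueError("the two nodes share no ancestor")


def wup(a, b):
    """Wu-Palmer similarity on the toy taxonomy."""
    lcs = lowest_common_subsumer(a, b)
    return 2 * depth(lcs) / (depth(a) + depth(b))
```

In this toy hierarchy two cities score higher with each other than a city does with a country, and a city scores lowest against a compass direction, which is the qualitative pattern visible in the WordNet-based matrix.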
Semantic similarity matrix derived from WordNet for the set of words London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia, using senses relevant to the geography of Europe and five measures of similarity: wup (path length), lin and jcn (information content), and gv and pgv (vector space measures)
          London  Moscow   Paris   north   south    east    west Germany  Poland  Russia
London         1   0.396   0.466   0.106   0.103   0.076   0.078   0.322   0.299   0.303
Moscow     0.396       1   0.393   0.095   0.094   0.062    0.07   0.286   0.281   0.288
Paris      0.466   0.393       1   0.106   0.104   0.074   0.077   0.327   0.308   0.307
north      0.106   0.095   0.106       1   0.228   0.179    0.21   0.123   0.132   0.111
south      0.103   0.094   0.104   0.228       1   0.172   0.212   0.115   0.107   0.109
east       0.076   0.062   0.074   0.179   0.172       1   0.216   0.093    0.08   0.077
west       0.078    0.07   0.077    0.21   0.212   0.216       1   0.087   0.082   0.083
Germany    0.322   0.286   0.327   0.123   0.115   0.093   0.087       1   0.589   0.409
Poland     0.299   0.281   0.308   0.132   0.107    0.08   0.082   0.589       1   0.403
Russia     0.303   0.288   0.307   0.111   0.109   0.077   0.083   0.409   0.403       1
Figure: Hierarchical cluster tree computed from pairwise WordNet-based semantic similarity scores for London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia, restricted to senses related to the geography of Europe. Leaf order: Germany, Poland, Russia, London, Paris, Moscow, north, south, west, east. Axis ticks: 0.2 to 0.6.
BACK TO THE BRAIN …
London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia
Now let's see how to compare the EEG data and the language data …
Some of the similarity trees show remarkable congruence between the brain and semantic data.
Where exactly does that congruence lie?
Can we devise a quantitative measure of the nature and strength of that congruence?
Figure: two similarity trees, labeled LANGUAGE DATA and BRAIN DATA, each with axis ticks 0.2 to 0.6. Leaf orders: (London, Paris, Moscow, Germany, Poland, Russia, north, south, west, east) and (Germany, Poland, Russia, London, Paris, Moscow, north, south, west, east).
Figure: WordNet-based semantic similarities and EEG conditional probability estimates for London relative to London (L), Moscow (M), Paris (P), north (n), south (s), east (e), west (w), Germany (G), Poland (Po) and Russia (R).
The Spearman rank correlation for the two sequences in the figure is 0.99, with one-sided significance of 1.84E-10.
BRAIN DATA
For each word ω we compute, from the conditional probability density estimates, a ternary relation R such that R(ω, ω_1, ω_2) if and only if, with respect to word ω, the conditional probability for word ω_1 is smaller than the conditional probability for word ω_2, that is, if and only if ω_1's similarity difference with ω is smaller than ω_2's similarity difference with ω.
R is an ordinal relation of similarity differences, a partial order that is irreflexive, asymmetric and transitive.
LANGUAGE DATA
For each word ω we compute, from the semantic similarity matrix, a ternary relation R such that R(ω, ω_1, ω_2) if and only if the similarity difference of ω_1 with ω is smaller than the similarity difference of ω_2 with ω, that is, ω_1 is more similar to ω than is ω_2.
R is an ordinal relation of similarity differences, a partial order that is irreflexive, asymmetric and transitive.
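A minimal sketch of the relation for one reference word, using the WordNet-based similarities to London from the matrix above. It follows the reading "ω_1 is more similar to ω than is ω_2" and treats it as a strictly greater similarity score; that reading, the treatment of equal scores as unordered, and the function names are assumptions made for the example.

```python
from itertools import product

# WordNet-based similarities to London, taken from the similarity matrix in the slides.
LONDON = {"Paris": 0.466, "Moscow": 0.396, "Germany": 0.322, "Russia": 0.303,
          "Poland": 0.299, "north": 0.106, "south": 0.103, "west": 0.078,
          "east": 0.076}


def similarity_order(scores):
    """Pairs (w1, w2) with w1 strictly more similar to the reference word than w2."""
    return {(w1, w2) for w1, w2 in product(scores, repeat=2)
            if scores[w1] > scores[w2]}


def is_strict_partial_order(relation):
    """Check that the relation is irreflexive, asymmetric and transitive."""
    irreflexive = all(a != b for a, b in relation)
    asymmetric = all((b, a) not in relation for a, b in relation)
    transitive = all((a, d) in relation
                     for a, b in relation for c, d in relation if b == c)
    return irreflexive and asymmetric and transitive


R_london = similarity_order(LONDON)
```

With London fixed as ω, each pair in `R_london` stands for one instance R(London, ω_1, ω_2); repeating the construction for every word gives the N partial orders.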
London
Language data        Brain data
London   1.000       London   0.275
Paris    0.466       Paris    0.133
Moscow   0.396       Moscow   0.108
Germany  0.322       Germany  0.075
Russia   0.303       north    0.042
Poland   0.299       Russia   0.033
north    0.106       Poland   0.025
south    0.103       east     0.008
west     0.078       south    0.000
east     0.076       west     0.000
Partial orders for London derived from the WordNet semantic similarities of Table 2 and the conditional probability estimates for the brain data of Table 5
Following the approach described in Suppes (1974) for the axiomatization of the theory of differences in utility preference or the theory of differences in psychological intensity.
Suppes P (1974). The axiomatic method in the empirical sciences. In L. Henkin et al. (Eds.), Proceedings of the Tarski Symposium, Proceedings of Symposia in Pure Mathematics 25. Providence, RI: American Mathematical Society, pp. 465-479.
Brain data: the relational structure (A, R) constructed from R and the finite set A of classes ω_1, ω_2, …, ω_N, together with the N partial orders constructed from the N-by-N estimate for the conditional probability densities.
Language data: the relational structure (A, R) constructed from R and the finite set A of classes ω_1, ω_2, …, ω_N, together with the N partial orders constructed from the N-by-N similarity matrix.
For each ω_i we compare the partial order of the brain data with the partial order of the language data using Spearman's rank correlation coefficient, which we interpret in the usual way to determine whether we have a statistically significant correlation.
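A small sketch of the rank comparison for one word, using the London values listed above. It computes Spearman's coefficient as the Pearson correlation of the rank vectors, with tied values sharing their mean rank; the function names are hypothetical, and the significance test is not shown.

```python
def average_ranks(values):
    """Rank values from largest (rank 1) to smallest; tied values share the mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1          # mean of the 1-based positions
        for position in range(start, end + 1):
            ranks[order[position]] = shared
        start = end + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mean_x, mean_y = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


WORDS = ["London", "Paris", "Moscow", "Germany", "Russia",
         "Poland", "north", "south", "west", "east"]
# Values for London from the slides, listed in the order of WORDS.
language = [1.000, 0.466, 0.396, 0.322, 0.303, 0.299, 0.106, 0.103, 0.078, 0.076]
brain = [0.275, 0.133, 0.108, 0.075, 0.033, 0.025, 0.042, 0.000, 0.000, 0.008]

rho = spearman(language, brain)
```

A coefficient near 1 means the two data sets order the other words in nearly the same way relative to London.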
For those instances in which the brain and language partial orders are significantly correlated, we find the partial order that is invariant with respect to the brain and language data.
Here are two more examples:
Figure: invariant partial order over London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia. Significant Invariance - Paris - Spearman 0.88 (p = 6.6795e-004)
Figure: invariant partial order over London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia. Significant Invariance - Paris - Spearman 0.90 (p = 3.8716e-004)
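One plausible reading of "invariant" is the set of ordered pairs on which the brain order and the language order agree, that is, the intersection of the two relations. The sketch below applies that reading to the London values shown earlier; the reading itself and the function name are assumptions, not a statement of the authors' procedure.

```python
from itertools import product

# Values for London from the slides.
LANGUAGE = {"London": 1.000, "Paris": 0.466, "Moscow": 0.396, "Germany": 0.322,
            "Russia": 0.303, "Poland": 0.299, "north": 0.106, "south": 0.103,
            "west": 0.078, "east": 0.076}
BRAIN = {"London": 0.275, "Paris": 0.133, "Moscow": 0.108, "Germany": 0.075,
         "north": 0.042, "Russia": 0.033, "Poland": 0.025, "east": 0.008,
         "south": 0.000, "west": 0.000}


def strict_order(scores):
    """Pairs (w1, w2) in which w1 has a strictly higher score than w2."""
    return {(w1, w2) for w1, w2 in product(scores, repeat=2)
            if scores[w1] > scores[w2]}


# Pairs ordered the same way by both data sets (assumed meaning of "invariant").
invariant = strict_order(LANGUAGE) & strict_order(BRAIN)
```

The intersection of two strict partial orders is again a strict partial order, so the result can be drawn as a single diagram; pairs that the two data sets order differently, or that one of them leaves tied, simply drop out.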
Another hierarchical cluster tree (similarity tree) computed from the conditional probability density estimates for the classification of 640 brain wave samples for London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia.
Every single-trial classification produces its own conditional probability density estimates, giving rise to its own similarity tree …
Figure (Brain data): two similarity trees. Leaf order of the first (axis ticks 0.2 to 0.6): London, Paris, Moscow, Germany, Poland, Russia, north, south, west, east. Leaf order of the second (axis ticks 0.1 to 0.5): north, east, south, west, London, Paris, Germany, Moscow, Russia, Poland.
1. We compute M (=30) single-trial classifications of the data (640 data samples for our 10 words) using random resampling with replacement.
2. For each classification, we find for each word the partial orders of the brain and language data that are significantly correlated.
3. For each of these highly correlated partial order pairs, we find the partial order invariant with respect to both.
We performed 60 classifications; that is, we recomputed the classifications of the brain data using random resampling with replacement.
For half of these 60 classifications we compared the brain data to the WordNet data, and for the other half we compared the brain data to the LSA data.
And we plotted the results.
Figure: Partial Orders of Similarity Differences Invariant between EEG-recorded Brain Data and Semantic Representations of Language. Y axis: Significant Invariant Partial Orders (ticks 0 to 300 in steps of 50). X axis: Test Participants (S10, S24, S25, S27, S16, S26, S12, S13, S18). Series: WordNet, LSA.
A higher number of significant structural similarities was found between the brain data and the WordNet data than between the brain data and the LSA data.
This stronger structural similarity between the brain data and the WordNet-derived data supports the contention that, during language comprehension for the complex cognitive task of assessing truth or falsity, the representation of words in the brain has a WordNet-like quality.
This stronger structural similarity is the first evidence we have seen of WordNet-like representations in the brain.
Banerjee S, Pedersen T (2003). Extended Gloss Overlaps as a Measure of Semantic Relatedness. In Proceedings of the Eighteenth International Joint Conference on Artificial Intelligence, pp. 805-810, August 9-15, 2003, Acapulco, Mexico.
Banerjee S, Pedersen T (2002). An Adapted Lesk Algorithm for Word Sense Disambiguation using WordNet. In Proceedings of the Third International Conference on Intelligent Text Processing and Computational Linguistics, pp. 136-145, February 17-23, 2002, Mexico City.
Baroni M, Murphy B, Barbu E, Poesio M (2010). Strudel: A Corpus-Based Semantic Model Based on Properties and Types. Cognitive Science, vol. 34, no. 2, pp. 222-254. DOI 10.1111/j.1551-6709.2009.01068.x
Blei D, Ng A, Jordan M (2003). Latent Dirichlet allocation. Journal of Machine Learning Research 3, 993–1022.
Brants T, Franz A (2006). www.ldc.upenn.edu/Catalog/CatalogEntry.jsp?catalogId=LDC2006T13. Linguistic Data Consortium, Philadelphia.
Budanitsky A, Hirst G (2001). Semantic distance in WordNet: an experimental, application-oriented evaluation of five measures. In Proceedings of the NAACL 2001 Workshop on WordNet and Other Lexical Resources, Pittsburgh, June 2001. http://www.seas.smu.edu/~rada/mwnw/papers/WNW-NAACL-101.pdf.gz
Collins AM, Loftus EF (1975). A spreading-activation theory of semantic processing. Psychological Review, 1975 Nov, Vol. 82(6), 407-428.
Crestani F (1997). Application of Spreading Activation Techniques in Information Retrieval. Artificial Intelligence Review 11, 453–482.
Deerwester S, Dumais ST, Furnas GW, Landauer TK, Harshman R (1990). Indexing By Latent Semantic Analysis. Journal of the American Society for Information Science 41, 391-407.
Devereux B, Kelly C, Korhonen A (2010). Using fMRI activation to conceptual stimuli to evaluate methods for extracting conceptual representations from corpora. Proceedings of the NAACL HLT 2010 First Workshop on Computational Neurolinguistics, June 2010, Los Angeles, USA. Association for Computational Linguistics, 70–78.
Fellbaum C (Ed.) (1998). WordNet: An Electronic Lexical Database. Cambridge, MA: MIT Press.
Hauk O, Davis MH, Ford M, Pulvermüller F, Marslen-Wilson WD (2006). The time course of visual word recognition as revealed by linear regression analysis of ERP data. Neuroimage, 2006 May 1;30(4):1383-400. Epub 2006 Feb 7.
Hirst G (1988). Resolving lexical ambiguity computationally with spreading activation and Polaroid Words. In Small SL, Cottrell GW, Tanenhaus MK (Eds.), Lexical ambiguity resolution: Perspectives from psycholinguistics, neuropsychology, and artificial intelligence. San Mateo, CA: Morgan Kaufmann Publishers, 1988, 73–107.
Jelodar AB, Alizadeh M, Khadivi S (2010). WordNet Based Features for Predicting Brain Activity associated with meanings of noun. Proceedings of the NAACL HLT 2010 First Workshop on Computational Neurolinguistics, June 2010, Los Angeles, USA. Association for Computational Linguistics, 18–26.
Jiang J, Conrath D (1997). Semantic similarity based on corpus statistics and lexical taxonomy. In Proceedings of International Conference on Research in Computational Linguistics, Taiwan, 19-33.
Just MA, Cherkassky VL, Aryal S, Mitchell TM (2010). A Neurosemantic Theory of Concrete Noun Representation Based on the Underlying Brain Codes. PLoS ONE 5(1): e8622. doi:10.1371/journal.pone.0008622
Karp RM (1972). Reducibility among combinatorial problems. In Complexity of Computer Computations, Proc. Sympos. IBM Thomas J. Watson Res. Center, Yorktown Heights, NY (Ed. R. E. Miller and J. W. Thatcher). New York: Plenum, pp. 85-103.
Kelly C, Devereux B, Korhonen A (2010). Acquiring Human-like Feature-Based Conceptual Representations from Corpora. Proceedings of the NAACL HLT 2010 First Workshop on Computational Neurolinguistics, June 2010, Los Angeles, USA. Association for Computational Linguistics, 61–69.
Kriegeskorte N, Mur M, Bandettini PA (2008). Representational similarity analysis – connecting the branches of systems neuroscience. Frontiers in Systems Neuroscience. doi:10.3389/neuro.06.004.2008
Kutas M, Hillyard SA (1984). Brain potentials during reading reflect word expectancy and semantic association. Nature 307, 161-163.
Kutas M, Federmeier KD (2011). Thirty years and counting: Finding meaning in the N400 component of the event related brain potential (ERP). Annual Review of Psychology, 2011, 62, pp. 621-647.
Landauer TK, Dumais ST (1997). A solution to Plato's problem: The Latent Semantic Analysis theory of the acquisition, induction, and representation of knowledge. Psychological Review 104, 211-240.
Landauer TK, Foltz PW, Laham D (1998). Introduction to Latent Semantic Analysis. Discourse Processes 25, 259-284.
Leacock C, Chodorow M (1998). Combining local context and WordNet similarity for word sense identification. In Fellbaum (Ed.), WordNet: An Electronic Lexical Database. Cambridge, MA: MIT Press, pp. 265-283.
Leech G, Rayson P, Wilson A (2001). Word Frequencies in Written and Spoken English: based on the British National Corpus. Longman, London.
Lin D (1998). An information-theoretic definition of similarity. In Proc. 15th International Conf. on Machine Learning. Morgan Kaufmann, San Francisco, CA, pp. 296-304.
Luce RD (1956). Semiorders and a theory of utility discrimination. Econometrica 24, 178–191. MR0078632. http://www.jstor.org/stable/1905751
Matsunaga T, Yonemori C, Tomita E, Muramatsu M (2009). Clique-based data mining for related genes in a biomedical database. BMC Bioinformatics, 2009 Jul 1;10:205.
Martin Ph (2003). Correction and Extension of WordNet 1.7. ICCS 2003, 11th International Conference on Conceptual Structures (© Springer Verlag, LNAI 2746, pp. 160-173), Dresden, Germany, July 21-25, 2003.
Miller GA, Nicely P (1955). An analysis of perceptual confusions among some English consonants. J. Acoust. Soc. Am. 27(2).
Miller GA, Leacock C, Tengi R, Bunker RT (1993). A Semantic Concordance. In Proceedings of the 3rd DARPA Workshop on Human Language Technology.
Miller GA (1995). WordNet: A Lexical Database for English. Communications of the ACM, Vol. 38, No. 11, 39-41.
Mitchell TM, Shinkareva SV, Carlson A, Chang KM, Malave VL, Mason RA, Just MA (2008). Predicting Human Brain Activity Associated with the Meanings of Nouns. Science 320, 1191, May 30, 2008. DOI 10.1126/science.1152876
Murphy B, Baroni M, Poesio M (2009). EEG responds to conceptual stimuli and corpus semantics. In Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing, pages 619–627, Singapore, 6-7 August 2009. © 2009 ACL and AFNLP.
Murphy B, Poesio M (2010). Detecting semantic category in simultaneous EEG/MEG recordings. In First workshop on computational neurolinguistics, NAACL HLT 2010 (pp. 36–44). Los Angeles: Association for Computational Linguistics.
Murphy B, Poesio M, Bovolo F, Bruzzone L, Dalponte M, Lakany H (2011). EEG decoding of semantic category reveals distributed representations for single concepts. Brain Lang, 2011 Apr;117(1):12-22. doi:10.1016/j.bandl.2010.09.013
Pedersen T, Patwardhan S, Michelizzi J (2004). WordNet::Similarity - Measuring the Relatedness of Concepts. In Proceedings of Fifth Annual Meeting of the North American Chapter of the Association for Computational Linguistics (NAACL-2004), pp. 38-41, Boston, May 2004.
Pereira F, Botvinick M, Detre G (2010). Learning semantic features for fMRI data from definitional text. Proceedings of the NAACL HLT 2010 First Workshop on Computational Neurolinguistics, June 2010, Los Angeles, USA. Association for Computational Linguistics, 1–9. http://www.aclweb.org/anthology/W10-0601
Perreau-Guimaraes M, Wong DK, Uy ET, Grosenick L, Suppes P (2007). Single-trial classification of MEG recordings. IEEE Transactions on Biomedical Engineering 54, 436–443.
Rabinovitch I (1977). The Scott-Suppes theorem on semiorders. J. Mathematical Psychology 15(2), 209–212. MR0437404.
Resnik P (1995). Using information content to evaluate semantic similarity. In Proceedings of the 14th International Joint Conference on Artificial Intelligence, pages 448-453, Montreal.
Roach BJ, Mathalon DH (2008). Event-related EEG time-frequency analysis: an overview of measures and an analysis of early gamma band phase locking in schizophrenia. Schizophr Bull, 2008 Sep;34(5):907-26. Epub 2008 Aug 6.
Scott D, Suppes P (1958). Foundational aspects of theories of measurement. The Journal of Symbolic Logic 23, 113–128. MR0115919.
Suppes P (1972). Axiomatic Set Theory. New York: Dover.
Suppes P (1974). The axiomatic method in the empirical sciences. In L. Henkin et al. (Eds.), Proceedings of the Tarski Symposium, Proceedings of Symposia in Pure Mathematics 25. Providence, RI: American Mathematical Society, pp. 465-479.
Suppes P, Lu Z-L, Han B (1997). Brain wave recognition of words. Proceedings of the National Academy of Sciences 94, 14965–14969.
Suppes P, Han B, Lu Z-L (1998). Brain-wave recognition of sentences. Proceedings of the National Academy of Sciences 95, 15861–15866.
Suppes P, Han B, Epelboim J, Lu Z-L (1999a). Invariance between subjects of brain wave representations of language. Proceedings of the National Academy of Sciences 96, 12953–12958.
Suppes P, Han B, Epelboim J, Lu Z-L (1999b). Invariance of brain-wave representations of simple visual images and their names. Proceedings of the National Academy of Sciences 96, 14658–14663.
Suppes P, Han B (2000). Brain-wave representation of words by superposition of a few sine waves. Proceedings of the National Academy of Sciences 97, 8738–8743.
Suppes P, Perreau-Guimaraes M, Wong DK (2009). Partial Orders of Similarity Differences Invariant Between EEG-Recorded Brain and Perceptual Representations of Language. Neural Computation 21, 3228–3269.
Suppes P, de Barros JA, Oas G (2012). Phase-Oscillator computations as neural models of stimulus-response conditioning and response selection. Journal of Mathematical Psychology 56, 95-117.
Turney P (2001). Mining the Web for Synonyms: PMI-IR versus LSA on TOEFL. In L. De Raedt & P. Flach (Eds.), Proceedings of the Twelfth European Conference on Machine Learning (ECML-2001) (pp. 491-502). Freiburg, Germany.
Vassilieva E, Pinto G, de Barros JA, Suppes P (2011). Learning Pattern Recognition Through Quasi-Synchronization of Phase Oscillators. IEEE Transactions on Neural Networks, Vol. 22, No. 1, January 2011.
Vigliocco G, Warren J, Siri S, Arciuli J, Scott S, Wise R (2006). The role of semantics and grammatical class in the neural representation of words. Cereb Cortex, 2006 Dec;16(12):1790-6. Epub 2006 Jan 18.
Wong DK, Perreau-Guimaraes M, Uy ET, Suppes P (2004). Classification of individual trials based on the best independent component of EEG-recorded sentences. Neurocomputing 61, 479-484.
Wong DK, Uy ET, Perreau-Guimaraes M, Yang W, Suppes P (2006). Interpretation of perceptron weights as constructed time series for EEG classification. Neurocomputing 70, 373-383.
Wong DK, Grosenick L, Uy ET, Perreau-Guimaraes M, Carvalhaes CG, Desain P, Suppes P (2008). Quantifying inter-subject agreement in brain-imaging analyses. NeuroImage 39, 1051–1063.
Woods W (1975). What's in a link: Foundations for semantic networks. In Bobrow D, Collins A (Eds.), Representation and understanding. New York: Academic Press, 1975:35-82.
Wu Z, Palmer M (1994). Verb Semantics and Lexical Selection. In Proceedings of the 32nd Annual Meeting of the Association for Computational Linguistics, Las Cruces, New Mexico, pp. 133-138.
From http://www2.le.ac.uk
Paris is north of Berlin
The capital of Germany is Berlin
Spain is east of France
London is not the capital of France
Vienna is east of Moscow
The largest city of Poland is Athens
Fifteen seconds of EEG data. From http://sccn.ucsd.edu
LANGUAGE BRAIN
What our work does…
Compares the brain data and the language data and finds structural similarities between them.
HOW DO WE REPRESENT THE EEG DATA?
HOW DO WE REPRESENT THE LANGUAGE DATA?
HOW DO WE COMPARE THE TWO?
Berlin London Moscow Paris
Rome Warsaw Madrid Vienna
Athens France Germany Italy
Poland Russia Austria Greece
Spain north south east west
Paris is north of Berlin
Spain is east of France
The capital of Germany is Berlin
Paris
east of
Germany
Machine learning approach to the study of brain and language
For the 10 geography words (London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia) we have 640 EEG data samples for each participant.
We want to classify these 640 samples into the correct 10 classes.
Use 580 brain samples and their associated words to train the classifier.
Test the remaining 60 samples.
Do this many times, each time using a different set of training samples.
Find the average classification rate and make sure it is statistically significant.
Obtained classification rates in the range 25% to 29%, with a mean classification rate of around 24.5% (p < 10E-10). Significantly higher than chance (10%).
We used a 5-fold linear discriminant model, with principal component analysis for blind source separation, to classify the segments of EEG data obtained from the individual trials.
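The train/test procedure above can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the data are synthetic Gaussian blobs standing in for EEG feature vectors, the classes are balanced at 64 samples each, and a nearest-class-mean classifier is swapped in for the linear discriminant model with PCA that the slide describes. All function names are invented for this sketch.

```python
import random


def make_synthetic_samples(n_classes=10, per_class=64, n_features=8, seed=0):
    """Hypothetical stand-in for EEG feature vectors: one Gaussian blob per word."""
    rng = random.Random(seed)
    samples = []
    for label in range(n_classes):
        centre = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
        for _ in range(per_class):
            features = [c + rng.gauss(0.0, 1.0) for c in centre]
            samples.append((features, label))
    return samples


def class_means(train):
    """Mean feature vector of each class in the training set."""
    sums, counts = {}, {}
    for features, label in train:
        acc = sums.setdefault(label, [0.0] * len(features))
        for k, value in enumerate(features):
            acc[k] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}


def predict(means, features):
    """Assign the class whose mean is closest (assumption: not the authors' model)."""
    def sq_dist(mean):
        return sum((a - b) ** 2 for a, b in zip(features, mean))
    return min(means, key=lambda label: sq_dist(means[label]))


def mean_classification_rate(samples, n_train=580, n_repeats=20, seed=1):
    """Repeat a random 580/60 split and average the test classification rate."""
    rng = random.Random(seed)
    rates = []
    for _ in range(n_repeats):
        shuffled = samples[:]
        rng.shuffle(shuffled)
        train, test = shuffled[:n_train], shuffled[n_train:]
        means = class_means(train)
        correct = sum(predict(means, f) == label for f, label in test)
        rates.append(correct / len(test))
    return sum(rates) / len(rates)


samples = make_synthetic_samples()
rate = mean_classification_rate(samples)
print("mean classification rate:", round(rate, 3), "(chance is 0.1)")
```

The synthetic classes are far easier to separate than single-trial EEG, so the rate printed here says nothing about the rates reported on the slide; only the resampling loop is the point.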
Beyond machine learning …
THEN look at the MIS-CLASSIFICATIONS and build a CONFUSION MATRIX.
Classify 640 brain data samples $s_1, s_2, \ldots, s_{640}$ into 10 classes $\omega_1, \omega_2, \ldots, \omega_{10}$ of the finite set $A$.
$M = (m_{iq})$ is the confusion matrix for a given classification, where $m_{iq}$ is the number of test samples from class $\omega_i$ classified as belonging to class $\omega_q$.
Rows: class $\omega_i$ of the sample. Columns: class $\omega_q$ assigned by the classifier.
          London   Moscow    Paris    north    south     east     west  Germany   Poland   Russia
London         8       14       11        7        6        3        3       10        8       10
Moscow         8       24       14        6        2        6        4        4        7       15
Paris          6       18       12        4        3        5        8        6       11        7
north          4        2        5       11        9        7       10        1        8        3
south          1        4        3       14       14        9       11        4        7        3
east           4        3        5        9       12       12        7        1        4        3
west           4        3        2       12       13       11       10        2        8        5
Germany        2        2        3        2        2        1        0        9       11        8
Poland         2        3        4        0        4        1        4        9        9        4
Russia         7        7        5        1        5        1        1        8        4       11
The relative frequencies $m_{iq} / \sum_{j} m_{ij}$ are an N-by-N estimate for the conditional probability densities, designated by the matrix $P = (p_{iq})$, that a randomly chosen test sample from class $\omega_i$ will be classified as belonging to class $\omega_q$:
$$p_{iq} = \frac{m_{iq}}{\sum_{j=1}^{N} m_{ij}}$$
Conditional probability density estimates from the classification of brain wave data for London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia.
Rows: $\omega_i$. Columns: $\omega_q$, predicted (classified as).
          London   Moscow    Paris    north    south     east     west  Germany   Poland   Russia
London      0.25    0.163    0.138    0.075    0.038     0.05    0.063    0.088    0.075    0.063
Moscow     0.144    0.333    0.133    0.033    0.011    0.056    0.056    0.089    0.067    0.078
Paris      0.175    0.188    0.125    0.038    0.038    0.038     0.05      0.1    0.138    0.113
north      0.067    0.017    0.017    0.283    0.167     0.15      0.1      0.1    0.067    0.033
south      0.014    0.014    0.071    0.114    0.271    0.171    0.157    0.029    0.086    0.071
east        0.05        0        0    0.133     0.15    0.383     0.15     0.05        0    0.083
west       0.057    0.043        0    0.171    0.114    0.143    0.229    0.086      0.1    0.057
Germany    0.075    0.175      0.1     0.05    0.125    0.025     0.05    0.275    0.025      0.1
Poland      0.15      0.1    0.125     0.05     0.05        0    0.025     0.15     0.25      0.1
Russia      0.16     0.04     0.08     0.04      0.1     0.02     0.12      0.1     0.08     0.26
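The step from confusion counts to conditional probability estimates is a row normalization. The sketch below applies it to the counts shown on the confusion-matrix slide; the function name is invented for illustration. Note that the estimates tabulated above do not correspond to these particular counts (8/80 = 0.1 for London against 0.25 in the table), so they presumably come from a different classification run.

```python
WORDS = ["London", "Moscow", "Paris", "north", "south",
         "east", "west", "Germany", "Poland", "Russia"]

# Confusion counts m_iq copied from the confusion-matrix slide
# (row = class of the sample, column = class assigned by the classifier).
M = [
    [8, 14, 11, 7, 6, 3, 3, 10, 8, 10],
    [8, 24, 14, 6, 2, 6, 4, 4, 7, 15],
    [6, 18, 12, 4, 3, 5, 8, 6, 11, 7],
    [4, 2, 5, 11, 9, 7, 10, 1, 8, 3],
    [1, 4, 3, 14, 14, 9, 11, 4, 7, 3],
    [4, 3, 5, 9, 12, 12, 7, 1, 4, 3],
    [4, 3, 2, 12, 13, 11, 10, 2, 8, 5],
    [2, 2, 3, 2, 2, 1, 0, 9, 11, 8],
    [2, 3, 4, 0, 4, 1, 4, 9, 9, 4],
    [7, 7, 5, 1, 5, 1, 1, 8, 4, 11],
]


def row_normalize(counts):
    """Divide each count by its row total: p_iq = m_iq / sum_j m_ij."""
    result = []
    for row in counts:
        total = sum(row)
        result.append([c / total if total else 0.0 for c in row])
    return result


P = row_normalize(M)
n_samples = sum(sum(row) for row in M)
accuracy = sum(M[i][i] for i in range(len(M))) / n_samples

print("samples:", n_samples)
print("overall classification rate:", accuracy)
for word, row in zip(WORDS, P):
    print(word.ljust(8), " ".join(f"{p:.3f}" for p in row))
```

Each row of P sums to 1, so a row can be read as the distribution of labels the classifier gives to samples of one word.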
Figure: Hierarchical cluster tree (similarity tree) computed from the conditional probability density estimates for the classification of 640 brain wave samples for London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia. Axis ticks from 0.2 to 0.6. Leaf order: London, Paris, Moscow, Germany, Poland, Russia, north, south, west, east.
WordNet: a lexical database of English organized around sets of cognitive synonyms, restricted to the senses that are relevant to the geography sentences.
Latent Semantic Analysis (LSA): a statistical method to extract measures of word similarity from selected sets of documents such as novels, newspaper articles, and textbooks.
WORDNET
WordNet is a human-annotated lexical database of English in which nouns, verbs, adjectives, and adverbs are grouped into sets of cognitive synonyms (called synsets), each synset expressing a distinct concept.
See Miller (1995), Fellbaum (1998), and http://wordnet.princeton.edu
The synsets are related to each other primarily through the hypernymy and hyponymy relations for nouns. Other relations in WordNet are part-whole (holonym), member-of (meronym), has-instance, and so on.
Hyponymy, often referred to as the is-a relation in computational discussions, is defined as follows: a concept represented by a lexical item $L_i$ is said to be a hyponym of the concept represented by a lexical item $L_k$ if native speakers of English accept sentences of the form "An $L_i$ is a kind of $L_k$". Conversely, $L_k$ is the hypernym of $L_i$. A hypernym is therefore a more general concept and a hyponym a more specific concept.
WORDNET
(n) Paris#1, City of Light#1, French capital#1, capital of France#1 (the capital and largest city of France and international center of culture and commerce)
(n) Paris#2, genus Paris#1 (sometimes placed in subfamily Trilliaceae)
(n) Paris#3 ((Greek mythology) the prince of Troy who abducted Helen from her husband Menelaus and provoked the Trojan War)
(n) Paris#4 (a town in northeastern Texas)
Multiple word senses
Organized into sets of cognitive synonyms called "synsets"
(n) city#1, metropolis#1, urban center#1 (a large and densely populated urban area; may include several independent administrative districts)
    Relations listed: direct hyponym, full hyponym, part meronym, has instance
(n) city#2 (an incorporated administrative district established by state charter)
(n) city#3, metropolis#2 (people living in a large densely populated municipality)
(n) east#1, due east#1, eastward#1, E#3 (the cardinal compass point that is at 90 degrees)
(n) East#2, Orient#1 (the countries of Asia)
(n) East#3, eastern United States#1 (the region of the United States lying to the north of the Ohio River and to the east of the Mississippi River)
(n) east#4 (the direction corresponding to the eastward cardinal compass point)
(n) east#5 (a location in the eastern part of a country, region, or city)
LATENT SEMANTIC ANALYSIS (LSA)
LSA is a statistical technique for extracting, from large collections of documents, a measure of how similar two words are to each other in terms of the patterns of their co-occurrences within those documents. See Deerwester et al. (1990), Landauer and Dumais (1997), Landauer et al. (1998).
The underlying idea is that if, for each word, you take into account all the contexts in which it does and does not appear, you get for all the words a set of mutual constraints that represent how similar any two words are to each other.
The similarity judgments produced by latent semantic analysis have been shown to correspond to some extent to human judgments of similarity. After training on about 2000 pages of English text, it scored as well as average test-takers on the synonym portion of the Test of English as a Foreign Language. After training on a psychology textbook, it achieved a passing score on a multiple-choice exam.
We used the application at http://lsa.colorado.edu to compute similarity matrices in term space for our set of words. The computation was based on about 38,000 college-level texts (novels, newspaper articles, …). A maximum of 300 factors was permitted in the analysis.
          London   Moscow    Paris    north    south     east     west  Germany   Poland   Russia
London         1     0.17     0.37     0.13     0.12     0.12     0.14     0.17     0.16     0.16
Moscow      0.17        1     0.18     0.13     0.08     0.22     0.14     0.37     0.65     0.69
Paris       0.37     0.18        1      0.1     0.04     0.08     0.09      0.5     0.31     0.36
north       0.13     0.13      0.1        1     0.89      0.6     0.61     0.07     0.09     0.14
south       0.12     0.08     0.04     0.89        1      0.5     0.55     0.02     0.05     0.05
east        0.12     0.22     0.08      0.6      0.5        1     0.85     0.22     0.29      0.3
west        0.14     0.14     0.09     0.61     0.55     0.85        1     0.24     0.26     0.23
Germany     0.17     0.37      0.5     0.07     0.02     0.22     0.24        1     0.85     0.81
Poland      0.16     0.65     0.31     0.09     0.05     0.29     0.26     0.85        1     0.87
Russia      0.16     0.69     0.36     0.14     0.05      0.3     0.23     0.81     0.87        1
Semantic similarity matrix derived from LSA for the set of words London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia.
Figure: Hierarchical cluster tree computed from the pair-wise Latent Semantic Analysis (LSA) scores of similarity for London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia, based on about 38,000 college-level texts (novels, newspaper articles, …). Axis ticks from 0.1 to 0.6. Leaf order: east, west, north, south, London, Paris, Moscow, Germany, Poland, Russia.
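A similarity tree of this kind can be built by agglomerative clustering of the similarity matrix. The sketch below is one plausible construction, not necessarily the one used for the figure: it assumes distance = 1 - similarity and average linkage, neither of which is stated on the slides. The function name is invented for illustration; the scores are the LSA values tabulated above.

```python
from itertools import combinations

WORDS = ["London", "Moscow", "Paris", "north", "south",
         "east", "west", "Germany", "Poland", "Russia"]

# LSA similarity scores copied from the LSA similarity-matrix slide.
LSA = [
    [1, 0.17, 0.37, 0.13, 0.12, 0.12, 0.14, 0.17, 0.16, 0.16],
    [0.17, 1, 0.18, 0.13, 0.08, 0.22, 0.14, 0.37, 0.65, 0.69],
    [0.37, 0.18, 1, 0.1, 0.04, 0.08, 0.09, 0.5, 0.31, 0.36],
    [0.13, 0.13, 0.1, 1, 0.89, 0.6, 0.61, 0.07, 0.09, 0.14],
    [0.12, 0.08, 0.04, 0.89, 1, 0.5, 0.55, 0.02, 0.05, 0.05],
    [0.12, 0.22, 0.08, 0.6, 0.5, 1, 0.85, 0.22, 0.29, 0.3],
    [0.14, 0.14, 0.09, 0.61, 0.55, 0.85, 1, 0.24, 0.26, 0.23],
    [0.17, 0.37, 0.5, 0.07, 0.02, 0.22, 0.24, 1, 0.85, 0.81],
    [0.16, 0.65, 0.31, 0.09, 0.05, 0.29, 0.26, 0.85, 1, 0.87],
    [0.16, 0.69, 0.36, 0.14, 0.05, 0.3, 0.23, 0.81, 0.87, 1],
]


def average_linkage(names, similarity):
    """Agglomerative clustering; distance between two words is 1 - similarity.

    Assumption: average linkage (mean pairwise distance between clusters).
    Returns the list of merges as (cluster_a, cluster_b, distance).
    """
    index = {name: k for k, name in enumerate(names)}
    clusters = [frozenset([name]) for name in names]
    merges = []

    def distance(a, b):
        pairs = [(x, y) for x in a for y in b]
        total = sum(1.0 - similarity[index[x]][index[y]] for x, y in pairs)
        return total / len(pairs)

    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2), key=lambda pair: distance(*pair))
        merges.append((a, b, distance(a, b)))
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
    return merges


merges = average_linkage(WORDS, LSA)
for a, b, d in merges:
    print(sorted(a), "+", sorted(b), "at distance", round(d, 3))
```

On these scores the first merge is north with south, and the last merge joins the four compass words with the six place names. Other linkage rules could give a different tree.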
LSA provides a straightforward measure of similarity between words.
For WordNet, several different measures of similarity have been devised, e.g.:
Path length between synsets.
Information content: a corpus-based measure of the specificity of a concept, measured in terms of the frequency of occurrence of the concept in a corpus, here the human-annotated sense-tagged corpus SemCor (Miller et al. 1993), which links every word in the Brown Corpus to its appropriate WordNet sense. Scaled in various ways.
Vector-space models: these work by forming second-order co-occurrence vectors from the WordNet definitions of concepts, known as glosses.
We used five measures in our computations of similarity and took the average score using each of the relevant senses.
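To make the path-based family concrete, here is a minimal sketch of the Wu-Palmer score (wup), computed as 2 * depth(lcs) / (depth(a) + depth(b)), where lcs is the deepest concept that subsumes both words. The taxonomy below is a hypothetical toy hierarchy written for this example, not the real WordNet hierarchy, and depth is counted with the root at depth 1 (conventions vary), so the numbers are illustrative only.

```python
# Hypothetical toy taxonomy (child -> parent); NOT the real WordNet hierarchy.
PARENT = {
    "location": "entity",
    "region": "location",
    "city": "region",
    "country": "region",
    "direction": "entity",
    "London": "city", "Paris": "city", "Moscow": "city",
    "Germany": "country", "Poland": "country", "Russia": "country",
    "north": "direction", "south": "direction",
    "east": "direction", "west": "direction",
}


def path_to_root(node):
    """List of nodes from the given node up to the root."""
    path = [node]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path


def depth(node):
    """Depth of a node, counting the root as depth 1 (an assumed convention)."""
    return len(path_to_root(node))


def lowest_common_subsumer(a, b):
    """Deepest node that is an ancestor of (or equal to) both a and b."""
    ancestors_of_a = set(path_to_root(a))
    for node in path_to_root(b):
        if node in ancestors_of_a:
            return node
    raise ValueError("no common ancestor")


def wup(a, b):
    """Wu-Palmer similarity on the toy taxonomy."""
    lcs = lowest_common_subsumer(a, b)
    return 2.0 * depth(lcs) / (depth(a) + depth(b))


for pair in [("London", "Paris"), ("London", "Germany"), ("London", "north")]:
    print(pair, round(wup(*pair), 3))
```

Two cities share a deep common ancestor and score high; a city and a compass direction meet only at the root and score low, which is the same qualitative pattern as in the WordNet-derived matrix.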
Semantic similarity matrix derived from WordNet for the set of words London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia, using senses relevant to the geography of Europe and five measures of similarity: wup (path length), lin and jcn (information content), and gv and pgv (vector space measures).
          London   Moscow    Paris    north    south     east     west  Germany   Poland   Russia
London         1    0.396    0.466    0.106    0.103    0.076    0.078    0.322    0.299    0.303
Moscow     0.396        1    0.393    0.095    0.094    0.062     0.07    0.286    0.281    0.288
Paris      0.466    0.393        1    0.106    0.104    0.074    0.077    0.327    0.308    0.307
north      0.106    0.095    0.106        1    0.228    0.179     0.21    0.123    0.132    0.111
south      0.103    0.094    0.104    0.228        1    0.172    0.212    0.115    0.107    0.109
east       0.076    0.062    0.074    0.179    0.172        1    0.216    0.093     0.08    0.077
west       0.078     0.07    0.077     0.21    0.212    0.216        1    0.087    0.082    0.083
Germany    0.322    0.286    0.327    0.123    0.115    0.093    0.087        1    0.589    0.409
Poland     0.299    0.281    0.308    0.132    0.107     0.08    0.082    0.589        1    0.403
Russia     0.303    0.288    0.307    0.111    0.109    0.077    0.083    0.409    0.403        1
Figure: Hierarchical cluster tree computed from pairwise WordNet-based semantic similarity scores for London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia, restricted to senses related to the geography of Europe. Axis ticks from 0.2 to 0.6. Leaf order: Germany, Poland, Russia, London, Paris, Moscow, north, south, west, east.
BACK TO THE BRAIN …
London Moscow Paris north south
east west Germany Poland Russia
Now let's see how to compare the EEG data and the language data…
Some of the similarity trees show remarkable congruence between the brain and semantic data.
Where exactly does that congruence lie?
Can we devise a quantitative measure of the nature and strength of that congruence?
Figure: two similarity trees shown side by side, labelled LANGUAGE DATA and BRAIN DATA, each with axis ticks from 0.2 to 0.6. Leaf order of the first tree: London, Paris, Moscow, Germany, Poland, Russia, north, south, west, east. Leaf order of the second tree: Germany, Poland, Russia, London, Paris, Moscow, north, south, west, east.
Figure: WordNet-based semantic similarities and EEG conditional probability estimates for London relative to London (L), Moscow (M), Paris (P), north (n), south (s), east (e), west (w), Germany (G), Poland (Po), and Russia (R).
The Spearman rank correlation for the two sequences in the figure is 0.99, with one-sided significance of 1.84E-10.
BRAIN DATA
For each word $\omega$ we compute from the conditional probability density estimates a ternary relation $R$ such that $R(\omega, \omega_1, \omega_2)$ if and only if, with respect to word $\omega$, the conditional probability for word $\omega_1$ is smaller than the conditional probability for word $\omega_2$, that is, if and only if $\omega_1$'s similarity difference with $\omega$ is smaller than $\omega_2$'s similarity difference with $\omega$.
$R$ is an ordinal relation of similarity differences: a partial order that is irreflexive, asymmetric, and transitive.
LANGUAGE DATA
For each word $\omega$ we compute from the semantic similarity matrix a ternary relation $R$ such that $R(\omega, \omega_1, \omega_2)$ if and only if the similarity difference of $\omega_1$ with $\omega$ is smaller than the similarity difference of $\omega_2$ with $\omega$, that is, $\omega_1$ is more similar to $\omega$ than is $\omega_2$.
$R$ is an ordinal relation of similarity differences: a partial order that is irreflexive, asymmetric, and transitive.
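For a fixed reference word, the relation just described reduces to a set of ordered pairs. The sketch below builds that set for London from the WordNet-derived similarities and checks the three stated properties. It reads "more similar" as "higher similarity score", treats equal scores as incomparable, and uses function names invented for this example.

```python
# WordNet-derived similarities of each word to London (London row of the matrix).
LONDON_ROW = {
    "London": 1.0, "Moscow": 0.396, "Paris": 0.466, "north": 0.106,
    "south": 0.103, "east": 0.076, "west": 0.078,
    "Germany": 0.322, "Poland": 0.299, "Russia": 0.303,
}


def more_similar_pairs(scores):
    """Pairs (w1, w2) such that w1 is strictly more similar to the reference word than w2."""
    return {(w1, w2) for w1 in scores for w2 in scores if scores[w1] > scores[w2]}


def is_strict_partial_order(relation, items):
    """Check irreflexivity, asymmetry, and transitivity of a set of ordered pairs."""
    irreflexive = all((x, x) not in relation for x in items)
    asymmetric = all((y, x) not in relation for (x, y) in relation)
    transitive = all((x, z) in relation
                     for (x, y) in relation
                     for (y2, z) in relation if y == y2)
    return irreflexive and asymmetric and transitive


R_london = more_similar_pairs(LONDON_ROW)
print(len(R_london), "ordered pairs")
print("strict partial order:", is_strict_partial_order(R_london, LONDON_ROW))
```

Because the ten scores in this row are all distinct, every pair of words is ordered one way or the other, giving 45 pairs. Ties in the brain data would leave some pairs unordered.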
London
Language data        Brain data
London   1.000       London   0.275
Paris    0.466       Paris    0.133
Moscow   0.396       Moscow   0.108
Germany  0.322       Germany  0.075
Russia   0.303       north    0.042
Poland   0.299       Russia   0.033
north    0.106       Poland   0.025
south    0.103       east     0.008
west     0.078       south    0.000
east     0.076       west     0.000
Figure: Partial orders for London derived from the WordNet semantic similarities of Table 2 and the conditional probability estimates for the brain data of Table 5.
Following the approach described in Suppes (1974) for the axiomatization of the theory of differences in utility preference, or the theory of differences in psychological intensity.
Suppes P (1974) The axiomatic method in the empirical sciences. In: Henkin L et al. (Eds.), Proceedings of the Tarski Symposium, Proceedings of Symposia in Pure Mathematics 25. Providence, RI: American Mathematical Society, pp. 465-479.
Brain data: the relational structure $(A, R)$ constructed from $R$ and the finite set $A$ of classes $\omega_1, \omega_2, \ldots, \omega_N$, together with the N partial orders constructed from the N-by-N estimate for the conditional probability densities.
Language data: the relational structure $(A, R)$ constructed from $R$ and the finite set $A$ of classes $\omega_1, \omega_2, \ldots, \omega_N$, together with the N partial orders constructed from the N-by-N similarity matrix.
For each $\omega_i$ we compare the partial order of the brain data with the partial order of the language data using Spearman's rank correlation coefficient, which we interpret in the usual way to determine whether or not we have a statistically significant correlation.
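The comparison step can be illustrated with the two London columns shown earlier. The sketch below computes Spearman's rank correlation as the Pearson correlation of average ranks (ties share the mean of their positions). It omits the significance test, and the function names are invented for this example.

```python
WORDS = ["London", "Paris", "Moscow", "Germany", "Russia",
         "Poland", "north", "south", "west", "east"]
# Values for London copied from the language-data and brain-data columns.
LANGUAGE = [1.000, 0.466, 0.396, 0.322, 0.303, 0.299, 0.106, 0.103, 0.078, 0.076]
BRAIN = [0.275, 0.133, 0.108, 0.075, 0.033, 0.025, 0.042, 0.000, 0.000, 0.008]


def average_ranks(values):
    """Ascending ranks starting at 1; tied values receive their average rank."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2.0 + 1.0
        for k in order[start:end + 1]:
            ranks[k] = shared
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


rho = spearman(LANGUAGE, BRAIN)
print("Spearman rho for London:", round(rho, 3))
```

For these two columns rho comes out at roughly 0.92. The two orders agree on the first four words and differ mainly in where north and east fall.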
For those instances in which the brain and language partial orders are significantly correlated, we find the partial order that is invariant with respect to the brain and language data.
Here are two more examples.
Figure: invariant partial order over London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia. Significant Invariance - Paris - Spearman 0.88 (p=6.6795e-004)
Figure: invariant partial order over London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia. Significant Invariance - Paris - Spearman 0.90 (p=3.8716e-004)
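One simple reading of "the partial order that is invariant with respect to both" is the set of ordered pairs on which the two orders agree, that is, their intersection. The sketch below computes it for the London columns shown earlier. This reading is an assumption, and the authors' exact construction may differ. The function name is invented for this example.

```python
# Values for London copied from the language-data and brain-data columns.
LANGUAGE = {"London": 1.000, "Paris": 0.466, "Moscow": 0.396, "Germany": 0.322,
            "Russia": 0.303, "Poland": 0.299, "north": 0.106, "south": 0.103,
            "west": 0.078, "east": 0.076}
BRAIN = {"London": 0.275, "Paris": 0.133, "Moscow": 0.108, "Germany": 0.075,
         "north": 0.042, "Russia": 0.033, "Poland": 0.025, "east": 0.008,
         "south": 0.000, "west": 0.000}


def strict_order(scores):
    """Ordered pairs (a, b) with a scoring strictly higher than b."""
    return {(a, b) for a in scores for b in scores if scores[a] > scores[b]}


# Assumption: the invariant order keeps only pairs ordered the same way in both.
invariant = strict_order(LANGUAGE) & strict_order(BRAIN)

print(len(invariant), "pairs ordered the same way in both data sets")
dropped = strict_order(LANGUAGE) - invariant
print("language-only orderings:", sorted(dropped))
```

The intersection of two strict partial orders is again a strict partial order, so the result can be drawn as a single diagram. Here the pairs involving north against Russia or Poland, east against south or west, and the tied pair south and west drop out.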
Brain data
Figure: Another hierarchical cluster tree (similarity tree) computed from the conditional probability density estimates for the classification of 640 brain wave samples for London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia. Axis ticks from 0.2 to 0.6. Leaf order: London, Paris, Moscow, Germany, Poland, Russia, north, south, west, east.
Every single-trial classification produces its own conditional probability density estimates, giving rise to its own similarity tree …
Figure: a second similarity tree. Axis ticks from 0.1 to 0.5. Leaf order: north, east, south, west, London, Paris, Germany, Moscow, Russia, Poland.
1. We compute M (= 30) single-trial classifications of the data (640 data samples for our 10 words) using random resampling with replacement.
2. For each classification, we find for each word the partial orders of the brain and language data that are significantly correlated.
3. For each of these highly correlated partial order pairs, we find the partial order invariant with respect to both.
We performed 60 classifications; that is, we recomputed the classifications of the brain data using random resampling with replacement.
For half of these 60 classifications we compared the brain data to the WordNet data, and for the other half we compared the brain data to the LSA data.
And we plotted the results.
Figure: Partial Orders of Similarity Differences Invariant between EEG-recorded Brain Data and Semantic Representations of Language. Bar chart with two series, WordNet and LSA. Vertical axis: Significant Invariant Partial Orders (0 to 300). Horizontal axis: Test Participants (S10, S24, S25, S27, S16, S26, S12, S13, S18).
A higher number of significant structural similarities was found between the brain data and the WordNet data than between the brain data and the LSA data.
This stronger structural similarity between the brain data and the WordNet-derived data supports the contention that, during language comprehension for the complex cognitive task of assessing truth or falsity, the representation of words in the brain has a WordNet-like quality.
This stronger structural similarity is the first evidence we have seen of WordNet-like representations in the brain.
Banerjee S, Pedersen T (2003) Extended Gloss Overlaps as a Measure of Semantic Relatedness. In: Proceedings of the Eighteenth International Joint Conference on Artificial Intelligence, pp. 805-810, August 9-15, 2003, Acapulco, Mexico.
Banerjee S, Pedersen T (2002) An Adapted Lesk Algorithm for Word Sense Disambiguation using WordNet. In: Proceedings of the Third International Conference on Intelligent Text Processing and Computational Linguistics, pp. 136-145, February 17-23, 2002, Mexico City.
Baroni M, Murphy B, Barbu E, Poesio M (2010) Strudel: A Corpus-Based Semantic Model Based on Properties and Types. Cognitive Science 34(2): 222-254. DOI: 10.1111/j.1551-6709.2009.01068.x
Blei D, Ng A, Jordan M (2003) Latent Dirichlet allocation. Journal of Machine Learning Research 3: 993-1022.
Brants T, Franz A (2006) www.ldc.upenn.edu/Catalog/CatalogEntry.jsp?catalogId=LDC2006T13. Linguistic Data Consortium, Philadelphia.
Budanitsky A, Hirst G (2001) Semantic distance in WordNet: an experimental, application-oriented evaluation of five measures. In: Proceedings of the NAACL 2001 Workshop on WordNet and Other Lexical Resources, Pittsburgh, June 2001. http://www.seas.smu.edu/~rada/mwnw/papers/WNW-NAACL-101.pdf.gz
Collins AM, Loftus EF (1975) A spreading-activation theory of semantic processing. Psychological Review 82(6): 407-428.
Crestani F (1997) Application of Spreading Activation Techniques in Information Retrieval. Artificial Intelligence Review 11: 453-482.
Deerwester S, Dumais ST, Furnas GW, Landauer TK, Harshman R (1990) Indexing By Latent Semantic Analysis. Journal of the American Society for Information Science 41: 391-407.
Devereux B, Kelly C, Korhonen A (2010) Using fMRI activation to conceptual stimuli to evaluate methods for extracting conceptual representations from corpora. In: Proceedings of the NAACL HLT 2010 First Workshop on Computational Neurolinguistics, June 2010, Los Angeles, USA. Association for Computational Linguistics, pp. 70-78.
Fellbaum C (Ed.) (1998) WordNet: An Electronic Lexical Database. Cambridge, MA: MIT Press.
Hauk O, Davis MH, Ford M, Pulvermüller F, Marslen-Wilson WD (2006) The time course of visual word recognition as revealed by linear regression analysis of ERP data. Neuroimage 2006 May 1; 30(4): 1383-400. Epub 2006 Feb 7.
Hirst G (1988) Resolving lexical ambiguity computationally with spreading activation and Polaroid Words. In: Small SL, Cottrell GW, Tanenhaus MK (Eds.), Lexical ambiguity resolution: Perspectives from psycholinguistics, neuropsychology, and artificial intelligence. San Mateo, CA: Morgan Kaufmann Publishers, pp. 73-107.
Jelodar AB, Alizadeh M, Khadivi S (2010) WordNet Based Features for Predicting Brain Activity associated with meanings of noun. In: Proceedings of the NAACL HLT 2010 First Workshop on Computational Neurolinguistics, June 2010, Los Angeles, USA. Association for Computational Linguistics, pp. 18-26.
Jiang J, Conrath D (1997) Semantic similarity based on corpus statistics and lexical taxonomy. In: Proceedings of International Conference on Research in Computational Linguistics, Taiwan, pp. 19-33.
Just MA, Cherkassky VL, Aryal S, Mitchell TM (2010) A Neurosemantic Theory of Concrete Noun Representation Based on the Underlying Brain Codes. PLoS ONE 5(1): e8622. doi:10.1371/journal.pone.0008622
Karp RM (1972) Reducibility among combinatorial problems. In: Miller RE, Thatcher JW (Eds.), Complexity of Computer Computations (Proc. Sympos. IBM Thomas J. Watson Res. Center, Yorktown Heights, NY). New York: Plenum, pp. 85-103.
Kelly C, Devereux B, Korhonen A (2010) Acquiring Human-like Feature-Based Conceptual Representations from Corpora. In: Proceedings of the NAACL HLT 2010 First Workshop on Computational Neurolinguistics, June 2010, Los Angeles, USA. Association for Computational Linguistics, pp. 61-69.
Kriegeskorte N, Mur M, Bandettini PA (2008) Representational similarity analysis – connecting the branches of systems neuroscience. Frontiers in Systems Neuroscience. doi:10.3389/neuro.06.004.2008
Kutas M, Hillyard SA (1984) Brain potentials during reading reflect word expectancy and semantic association. Nature 307: 161-163.
Kutas M, Federmeier KD (2011) Thirty years and counting: Finding meaning in the N400 component of the event related brain potential (ERP). Annual Review of Psychology 62: 621-647.
Landauer TK, Dumais ST (1997) A solution to Plato's problem: The Latent Semantic Analysis theory of the acquisition, induction, and representation of knowledge. Psychological Review 104: 211-240.
Landauer TK, Foltz PW, Laham D (1998) Introduction to Latent Semantic Analysis. Discourse Processes 25: 259-284.
Leacock C, Chodorow M (1998) Combining local context and WordNet similarity for word sense identification. In: Fellbaum C (Ed.), WordNet: An Electronic Lexical Database. Cambridge, MA: MIT Press, pp. 265-283.
Leech G, Rayson P, Wilson A (2001) Word Frequencies in Written and Spoken English: based on the British National Corpus. London: Longman.
Lin D (1998) An information-theoretic definition of similarity. In: Proc. 15th International Conf. on Machine Learning. San Francisco, CA: Morgan Kaufmann, pp. 296-304.
Luce RD (1956) Semiorders and a theory of utility discrimination. Econometrica 24: 178-191. MR0078632. http://www.jstor.org/stable/1905751
Matsunaga T, Yonemori C, Tomita E, Muramatsu M (2009) Clique-based data mining for related genes in a biomedical database. BMC Bioinformatics 2009 Jul 1; 10: 205.
Martin Ph (2003) Correction and Extension of WordNet 1.7. In: ICCS 2003, 11th International Conference on Conceptual Structures (© Springer Verlag, LNAI 2746, pp. 160-173), Dresden, Germany, July 21-25, 2003.
Miller GA, Nicely P (1955) An analysis of perceptual confusions among some English consonants. J. Acoust. Soc. Am. 27(2).
Miller GA, Leacock C, Tengi R, Bunker RT (1993) A Semantic Concordance. In: Proceedings of the 3rd DARPA Workshop on Human Language Technology.
Miller GA (1995) WordNet: A Lexical Database for English. Communications of the ACM 38(11): 39-41.
Mitchell TM, Shinkareva SV, Carlson A, Chang KM, Malave VL, Mason RA, Just MA (2008) Predicting Human Brain Activity Associated with the Meanings of Nouns. Science 320: 1191, May 30, 2008. DOI: 10.1126/science.1152876
Murphy B, Baroni M, Poesio M (2009) EEG responds to conceptual stimuli and corpus semantics. In: Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing, pp. 619-627, Singapore, 6-7 August 2009. © 2009 ACL and AFNLP.
Murphy B, Poesio M (2010) Detecting semantic category in simultaneous EEG/MEG recordings. In: First Workshop on Computational Neurolinguistics, NAACL HLT 2010, pp. 36-44. Los Angeles: Association for Computational Linguistics.
Murphy B, Poesio M, Bovolo F, Bruzzone L, Dalponte M, Lakany H (2011) EEG decoding of semantic category reveals distributed representations for single concepts. Brain Lang 2011 Apr; 117(1): 12-22. doi:10.1016/j.bandl.2010.09.013
Pedersen T, Patwardhan S, Michelizzi J (2004) WordNet::Similarity - Measuring the Relatedness of Concepts. In: Proceedings of Fifth Annual Meeting of the North American Chapter of the Association for Computational Linguistics (NAACL-2004), pp. 38-41, Boston, May 2004.
Pereira F, Botvinick M, Detre G (2010) Learning semantic features for fMRI data from definitional text. In: Proceedings of the NAACL HLT 2010 First Workshop on Computational Neurolinguistics, June 2010, Los Angeles, USA. Association for Computational Linguistics, pp. 1-9. http://www.aclweb.org/anthology/W10-0601
Perreau-Guimaraes M, Wong DK, Uy ET, Grosenick L, Suppes P (2007) Single-trial classification of MEG recordings. IEEE Transactions on Biomedical Engineering 54: 436-443.
Rabinovitch I (1977) The Scott-Suppes theorem on semiorders. J. Mathematical Psychology 15(2): 209-212. MR0437404.
Resnik P (1995) Using information content to evaluate semantic similarity. In: Proceedings of the 14th International Joint Conference on Artificial Intelligence, pp. 448-453, Montreal.
Roach BJ, Mathalon DH (2008) Event-related EEG time-frequency analysis: an overview of measures and an analysis of early gamma band phase locking in schizophrenia. Schizophr Bull 2008 Sep; 34(5): 907-26. Epub 2008 Aug 6.
Scott D, Suppes P (1958) Foundational aspects of theories of measurement. The Journal of Symbolic Logic 23: 113-128. MR0115919.
Suppes P (1972) Axiomatic Set Theory. New York: Dover.
Suppes P (1974) The axiomatic method in the empirical sciences. In: Henkin L et al. (Eds.), Proceedings of the Tarski Symposium, Proceedings of Symposia in Pure Mathematics 25. Providence, RI: American Mathematical Society, pp. 465-479.
Suppes P, Lu Z-L, Han B (1997) Brain wave recognition of words. Proceedings of the National Academy of Sciences 94: 14965-14969.
Suppes P, Han B, Lu Z-L (1998) Brain-wave recognition of sentences. Proceedings of the National Academy of Sciences 95: 15861-15866.
Suppes P, Han B, Epelboim J, Lu Z-L (1999a) Invariance between subjects of brain wave representations of language. Proceedings of the National Academy of Sciences 96: 12953-12958.
Suppes P, Han B, Epelboim J, Lu Z-L (1999b) Invariance of brain-wave representations of simple visual images and their names. Proceedings of the National Academy of Sciences 96: 14658-14663.
Suppes P, Han B (2000) Brain-wave representation of words by superposition of a few sine waves. Proceedings of the National Academy of Sciences 97: 8738-8743.
Suppes P, Perreau-Guimaraes M, Wong DK (2009) Partial Orders of Similarity Differences Invariant Between EEG-Recorded Brain and Perceptual Representations of Language. Neural Computation 21: 3228-3269.
Suppes P, de Barros JA, Oas G (2012) Phase-Oscillator computations as neural models of stimulus-response conditioning and response selection. Journal of Mathematical Psychology 56: 95-117.
Turney P (2001) Mining the Web for Synonyms: PMI-IR versus LSA on TOEFL. In: De Raedt L, Flach P (Eds.), Proceedings of the Twelfth European Conference on Machine Learning (ECML-2001), pp. 491-502. Freiburg, Germany.
Vassilieva E, Pinto G, de Barros JA, Suppes P (2011) Learning Pattern Recognition Through Quasi-Synchronization of Phase Oscillators. IEEE Transactions on Neural Networks 22(1), January 2011.
Vigliocco G, Warren J, Siri S, Arciuli J, Scott S, Wise R (2006) The role of semantics and grammatical class in the neural representation of words. Cereb Cortex 2006 Dec; 16(12): 1790-6. Epub 2006 Jan 18.
Wong DK, Perreau-Guimaraes M, Uy ET, Suppes P (2004) Classification of individual trials based on the best independent component of EEG-recorded sentences. Neurocomputing 61: 479-484.
Wong DK, Uy ET, Perreau-Guimaraes M, Yang W, Suppes P (2006) Interpretation of perceptron weights as constructed time series for EEG classification. Neurocomputing 70: 373-383.
Wong DK, Grosenick L, Uy ET, Perreau-Guimaraes M, Carvalhaes CG, Desain P, Suppes P (2008) Quantifying inter-subject agreement in brain-imaging analyses. NeuroImage 39: 1051-1063.
Woods W (1975) What's in a link: Foundations for semantic networks. In: Bobrow D, Collins A (Eds.), Representation and understanding. New York: Academic Press, 1975: 35-82.
Wu Z, Palmer M (1994) Verb Semantics and Lexical Selection. In: Proceedings of the 32nd Annual Meeting of the Association for Computational Linguistics, Las Cruces, New Mexico, pp. 133-138.
What our work doeshellip
Compares the brain data and the language data and finds structural similarities between them
HOW DO WE REPRESENT THE EEG DATA
HOW DO WE REPRESENT THE LANGUAGE DATA
HOW DO WE COMPARE THE TWO
Berlin London Moscow Paris
Rome Warsaw Madrid Vienna
Athens France Germany Italy
Poland Russia Austria Greece
Spain north south east west
12 copy Colleen E Crangle 2013
Paris is north of Berlin
Spain is east of
France
The capital of
Germany is
Berlin
13 copy Colleen E Crangle 2013
Paris
east of
Germany
14 copy Colleen E Crangle 2013
Machine learning approach to the study of brain and language
For the 10 geography words London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia, we have 640 EEG data samples for each participant.
We want to classify these 640 samples into the correct 10 classes.
Use 580 brain samples and their associated words to train the classifier.
Test the remaining 60 samples.
Do this many times, each time using a different set of training samples.
Find the average classification rate and make sure it is statistically significant.
Obtained classification rates in the range 25% to 29%, with a mean classification rate of around 24.5% (p < 10E−10). Significantly higher than chance (10%).
We used a 5-fold linear discriminant model, with principal component analysis for blind source separation, to classify the segments of EEG data obtained from the individual trials.
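The slides do not say which significance test was used. As a minimal sketch, assuming a one-sided binomial test against the 10% chance level, the check "is this rate significantly above chance" could look like the following. The function name `binomial_tail` and the count of 15 correct test samples are illustrative assumptions, not values from the study.

```python
from math import comb


def binomial_tail(n_trials: int, n_correct: int, p_chance: float) -> float:
    """One-sided P(X >= n_correct) for X ~ Binomial(n_trials, p_chance)."""
    return sum(
        comb(n_trials, k) * p_chance**k * (1.0 - p_chance) ** (n_trials - k)
        for k in range(n_correct, n_trials + 1)
    )


n_test = 60        # test samples per run, as on the slide
chance = 1 / 10    # 10 classes, so chance is 10%
hits = 15          # hypothetical: 15 of 60 correct, i.e. 25%

p_value = binomial_tail(n_test, hits, chance)
print(f"rate = {hits / n_test:.3f}, chance = {chance:.3f}, p = {p_value:.2e}")
```

Pooling the test samples over many runs, as the slide describes, makes the tail probability far smaller than for a single run of 60.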
Beyond machine learning …
THEN look at the MIS-CLASSIFICATIONS and build a CONFUSION MATRIX
Classify 640 brain data samples $s_1, s_2, \ldots, s_{640}$ into 10 classes $\omega_1, \omega_2, \ldots, \omega_{10}$ of the finite set $A$.
$M = (m_{iq})$ is the confusion matrix for a given classification, where $m_{iq}$ is the number of test samples from class $\omega_i$ classified as belonging to class $\omega_q$.
(Rows: true class $\omega_i$. Columns: class $\omega_q$ assigned by the classifier.)
         London  Moscow  Paris   north   south   east    west    Germany Poland  Russia
London   8       14      11      7       6       3       3       10      8       10
Moscow   8       24      14      6       2       6       4       4       7       15
Paris    6       18      12      4       3       5       8       6       11      7
north    4       2       5       11      9       7       10      1       8       3
south    1       4       3       14      14      9       11      4       7       3
east     4       3       5       9       12      12      7       1       4       3
west     4       3       2       12      13      11      10      2       8       5
Germany  2       2       3       2       2       1       0       9       11      8
Poland   2       3       4       0       4       1       4       9       9       4
Russia   7       7       5       1       5       1       1       8       4       11
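The definition of the confusion matrix above can be sketched in a few lines. This is an illustration only: the function name `confusion_matrix` and the six (true, predicted) pairs are hypothetical, not data from the experiment.

```python
WORDS = ["London", "Moscow", "Paris", "north", "south",
         "east", "west", "Germany", "Poland", "Russia"]


def confusion_matrix(true_labels, predicted_labels, classes):
    """Return M where M[i][q] counts samples of class i classified as class q."""
    index = {label: i for i, label in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for true, predicted in zip(true_labels, predicted_labels):
        matrix[index[true]][index[predicted]] += 1
    return matrix


# Hypothetical test outcomes, for illustration only.
true_labels = ["London", "London", "Moscow", "north", "north", "east"]
predicted_labels = ["London", "Moscow", "Moscow", "south", "north", "east"]

M = confusion_matrix(true_labels, predicted_labels, WORDS)
for word, row in zip(WORDS, M):
    print(f"{word:8s}", row)
```

The diagonal holds the correct classifications; the off-diagonal entries are the misclassifications that the rest of the analysis builds on.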
The relative frequencies of the $m_{iq}$ are an $N$-by-$N$ estimate for the conditional probability densities, designated by the matrix $P = (p_{iq})$, that a randomly chosen test sample from class $\omega_i$ will be classified as belonging to class $\omega_q$:
$$p_{iq} = \frac{m_{iq}}{\sum_{q'} m_{iq'}}$$
Conditional probability density estimates from the classification of brain wave data for London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia.
(Rows: $\omega_i$. Columns: $\omega_q$, predicted, i.e. classified as.)
         London  Moscow  Paris   north   south   east    west    Germany Poland  Russia
London   0.25    0.163   0.138   0.075   0.038   0.05    0.063   0.088   0.075   0.063
Moscow   0.144   0.333   0.133   0.033   0.011   0.056   0.056   0.089   0.067   0.078
Paris    0.175   0.188   0.125   0.038   0.038   0.038   0.05    0.1     0.138   0.113
north    0.067   0.017   0.017   0.283   0.167   0.15    0.1     0.1     0.067   0.033
south    0.014   0.014   0.071   0.114   0.271   0.171   0.157   0.029   0.086   0.071
east     0.05    0       0       0.133   0.15    0.383   0.15    0.05    0       0.083
west     0.057   0.043   0       0.171   0.114   0.143   0.229   0.086   0.1     0.057
Germany  0.075   0.175   0.1     0.05    0.125   0.025   0.05    0.275   0.025   0.1
Poland   0.15    0.1     0.125   0.05    0.05    0       0.025   0.15    0.25    0.1
Russia   0.16    0.04    0.08    0.04    0.1     0.02    0.12    0.1     0.08    0.26
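Each row of the table above sums to approximately 1, which is what dividing each confusion-matrix count by its row total produces. A minimal sketch of that step follows; the helper name `row_normalize` is an assumption. The London row used here is taken from the confusion matrix shown earlier, which appears to come from a different classification run than this probability table, so the resulting numbers are not expected to match the table.

```python
def row_normalize(matrix):
    """Turn confusion counts into row-wise relative frequencies."""
    estimates = []
    for row in matrix:
        total = sum(row)
        # An all-zero row (class never tested) is left as zeros.
        estimates.append([count / total if total else 0.0 for count in row])
    return estimates


# London row of the confusion matrix from the earlier slide.
london_counts = [8, 14, 11, 7, 6, 3, 3, 10, 8, 10]

P = row_normalize([london_counts])
print([round(p, 3) for p in P[0]])
```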
[Figure: hierarchical cluster tree (similarity tree) computed from the conditional probability density estimates for the classification of 640 brain wave samples for London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia. Leaf order: London, Paris, Moscow, Germany, Poland, Russia, north, south, west, east. Horizontal axis from 0.2 to 0.6.]
LANGUAGE
WordNet: a lexical database of English organized around sets of cognitive synonyms, restricted to the senses that are relevant to the geography sentences.
Latent Semantic Analysis (LSA): a statistical method to extract measures of word similarity from selected sets of documents such as novels, newspaper articles, textbooks.
BRAIN
The conditional probability density estimates from the classification of the brain wave data.
WORDNET
WordNet is a human-annotated lexical database of English in which nouns, verbs, adjectives, and adverbs are grouped into sets of cognitive synonyms (called synsets), each synset expressing a distinct concept. See Miller (1995), Fellbaum (1998), and http://wordnet.princeton.edu.
The synsets are related to each other primarily through the hypernymy and hyponymy relations for nouns. Other relations in WordNet are part-whole (holonym), member of (meronym), has instance, and so on.
Hyponymy, often referred to as the is-a relation in computational discussions, is defined as follows: a concept represented by a lexical item Li is said to be a hyponym of the concept represented by a lexical item Lk if native speakers of English accept sentences of the form "An Li is a kind of Lk". Conversely, Lk is the hypernym of Li. A hypernym is therefore a more general concept and a hyponym a more specific concept.
WORDNET
Multiple word senses, organized into sets of cognitive synonyms called "synsets".
(n) Paris#1, City of Light#1, French capital#1, capital of France#1 (the capital and largest city of France, and international center of culture and commerce)
(n) Paris#2, genus Paris#1 (sometimes placed in subfamily Trilliaceae)
(n) Paris#3 ((Greek mythology) the prince of Troy who abducted Helen from her husband Menelaus and provoked the Trojan War)
(n) Paris#4 (a town in northeastern Texas)
(n) city#1, metropolis#1, urban center#1 (a large and densely populated urban area; may include several independent administrative districts)
    relations listed: direct hyponym, full hyponym, part meronym, has instance
(n) city#2 (an incorporated administrative district established by state charter)
(n) city#3, metropolis#2 (people living in a large densely populated municipality)
(n) east#1, due east#1, eastward#1, E#3 (the cardinal compass point that is at 90 degrees)
(n) East#2, Orient#1 (the countries of Asia)
(n) East#3, eastern United States#1 (the region of the United States lying to the north of the Ohio River and to the east of the Mississippi River)
(n) east#4 (the direction corresponding to the eastward cardinal compass point)
(n) east#5 (a location in the eastern part of a country, region, or city)
LATENT SEMANTIC ANALYSIS (LSA)
LSA is a statistical technique for extracting, from large collections of documents, a measure of how similar two words are to each other in terms of the patterns of their co-occurrences within those documents. See Deerwester et al. (1990), Landauer and Dumais (1997), Landauer et al. (1998).
The underlying idea is that if, for each word, you take into account all the contexts in which it does and does not appear, you get for all the words a set of mutual constraints that represent how similar any two words are to each other.
The similarity judgments produced by latent semantic analysis have been shown to correspond to some extent to human judgments of similarity. After training on about 2000 pages of English text, it scored as well as average test-takers on the synonym portion of the Test of English as a Foreign Language. After training on a psychology textbook, it achieved a passing score on a multiple-choice exam.
We used the application at http://lsa.colorado.edu to compute similarity matrices in term space for our set of words. The computation was based on ~38000 college-level texts (novels, newspaper articles, …). A maximum of 300 factors was permitted in the analysis.
         London  Moscow  Paris   north   south   east    west    Germany Poland  Russia
London   1       0.17    0.37    0.13    0.12    0.12    0.14    0.17    0.16    0.16
Moscow   0.17    1       0.18    0.13    0.08    0.22    0.14    0.37    0.65    0.69
Paris    0.37    0.18    1       0.1     0.04    0.08    0.09    0.5     0.31    0.36
north    0.13    0.13    0.1     1       0.89    0.6     0.61    0.07    0.09    0.14
south    0.12    0.08    0.04    0.89    1       0.5     0.55    0.02    0.05    0.05
east     0.12    0.22    0.08    0.6     0.5     1       0.85    0.22    0.29    0.3
west     0.14    0.14    0.09    0.61    0.55    0.85    1       0.24    0.26    0.23
Germany  0.17    0.37    0.5     0.07    0.02    0.22    0.24    1       0.85    0.81
Poland   0.16    0.65    0.31    0.09    0.05    0.29    0.26    0.85    1       0.87
Russia   0.16    0.69    0.36    0.14    0.05    0.3     0.23    0.81    0.87    1
Semantic similarity matrix derived from LSA for the set of words London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia.
[Figure: hierarchical cluster tree computed from the pair-wise Latent Semantic Analysis (LSA) scores of similarity for London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia, based on ~38000 college-level texts (novels, newspaper articles, …). Leaf order: east, west, north, south, London, Paris, Moscow, Germany, Poland, Russia. Horizontal axis from 0.1 to 0.6.]
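The slides do not state how the cluster trees were computed. As a rough sketch, assuming distance = 1 − similarity and average linkage (both assumptions), agglomerative clustering over a six-word subset of the LSA matrix can be written as follows. The function names are illustrative.

```python
from itertools import combinations

WORDS = ["north", "south", "east", "west", "Poland", "Russia"]

# LSA similarities copied from the matrix above (six-word subset).
SIM = {
    ("north", "south"): 0.89, ("north", "east"): 0.60, ("north", "west"): 0.61,
    ("north", "Poland"): 0.09, ("north", "Russia"): 0.14,
    ("south", "east"): 0.50, ("south", "west"): 0.55,
    ("south", "Poland"): 0.05, ("south", "Russia"): 0.05,
    ("east", "west"): 0.85, ("east", "Poland"): 0.29, ("east", "Russia"): 0.30,
    ("west", "Poland"): 0.26, ("west", "Russia"): 0.23,
    ("Poland", "Russia"): 0.87,
}


def distance(a, b):
    """Assumed conversion: distance = 1 - similarity."""
    sim = SIM[(a, b)] if (a, b) in SIM else SIM[(b, a)]
    return 1.0 - sim


def average_linkage(c1, c2):
    """Mean pairwise distance between two clusters."""
    return sum(distance(a, b) for a in c1 for b in c2) / (len(c1) * len(c2))


def cluster(words):
    """Merge the two closest clusters until one remains; return the merges."""
    clusters = [frozenset([w]) for w in words]
    merges = []
    while len(clusters) > 1:
        c1, c2 = min(combinations(clusters, 2),
                     key=lambda pair: average_linkage(*pair))
        merges.append((set(c1), set(c2), average_linkage(c1, c2)))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 | c2]
    return merges


merges = cluster(WORDS)
for left, right, height in merges:
    print(sorted(left), "+", sorted(right), f"at {height:.3f}")
```

On this subset the compass directions join each other before either joins the countries, which agrees with the grouping visible in the leaf order of the tree.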
LSA provides a straightforward measure of similarity between words.
For WORDNET, several different measures of similarity have been devised, e.g.:
Path length between synsets.
Information content: a corpus-based measure of the specificity of a concept, measured in terms of the frequency of occurrence of the concept in the corpus, the human-annotated sense-tagged corpus SemCor (Miller et al. 1993), which links every word in the Brown Corpus to its appropriate WordNet sense. Scaled various ways.
Vector-space models: these work by forming second-order co-occurrence vectors from the WordNet definitions of concepts, known as glosses.
We used five measures in our computations of similarity and took the average score, using each of the relevant senses.
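To make the path-length idea concrete, here is a small sketch of the Wu and Palmer (1994) measure, which scores two concepts by the depth of their lowest common subsumer relative to their own depths. The taxonomy below is a hypothetical toy hierarchy invented for illustration; it is not the actual WordNet hypernym structure, and the resulting scores are not the values reported in this talk.

```python
# Hypothetical toy hypernym tree (child -> parent); NOT real WordNet data.
PARENT = {
    "location": "entity",
    "region": "location",
    "city": "region",
    "country": "region",
    "London": "city",
    "Paris": "city",
    "Germany": "country",
    "Poland": "country",
}


def path_to_root(node):
    """List of nodes from `node` up to the root, inclusive."""
    path = [node]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path


def depth(node):
    """Depth counted in nodes, so the root has depth 1."""
    return len(path_to_root(node))


def lowest_common_subsumer(a, b):
    """Deepest node that is an ancestor of (or equal to) both a and b."""
    ancestors_of_a = set(path_to_root(a))
    for node in path_to_root(b):
        if node in ancestors_of_a:
            return node
    raise ValueError("no common ancestor")


def wup(a, b):
    """Wu-Palmer similarity: 2 * depth(lcs) / (depth(a) + depth(b))."""
    lcs = lowest_common_subsumer(a, b)
    return 2 * depth(lcs) / (depth(a) + depth(b))


print(wup("London", "Paris"), wup("London", "Germany"))
```

Two cities share a deeper common ancestor than a city and a country, so they score higher.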
Semantic similarity matrix derived from WordNet for the set of words London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia, using senses relevant to the geography of Europe and five measures of similarity: wup (path length), lin and jcn (information content), and gv and pgv (vector space measures).
         London  Moscow  Paris   north   south   east    west    Germany Poland  Russia
London   1       0.396   0.466   0.106   0.103   0.076   0.078   0.322   0.299   0.303
Moscow   0.396   1       0.393   0.095   0.094   0.062   0.07    0.286   0.281   0.288
Paris    0.466   0.393   1       0.106   0.104   0.074   0.077   0.327   0.308   0.307
north    0.106   0.095   0.106   1       0.228   0.179   0.21    0.123   0.132   0.111
south    0.103   0.094   0.104   0.228   1       0.172   0.212   0.115   0.107   0.109
east     0.076   0.062   0.074   0.179   0.172   1       0.216   0.093   0.08    0.077
west     0.078   0.07    0.077   0.21    0.212   0.216   1       0.087   0.082   0.083
Germany  0.322   0.286   0.327   0.123   0.115   0.093   0.087   1       0.589   0.409
Poland   0.299   0.281   0.308   0.132   0.107   0.08    0.082   0.589   1       0.403
Russia   0.303   0.288   0.307   0.111   0.109   0.077   0.083   0.409   0.403   1
[Figure: hierarchical cluster tree computed from pairwise WordNet-based semantic similarity scores for London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia, restricted to senses related to the geography of Europe. Leaf order: Germany, Poland, Russia, London, Paris, Moscow, north, south, west, east. Horizontal axis from 0.2 to 0.6.]
BACK TO THE BRAIN …
London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia
Now let's see how to compare the EEG data and the language data…
Some of the similarity trees show remarkable congruence between the brain and semantic data.
Where exactly does that congruence lie?
Can we devise a quantitative measure of the nature and strength of that congruence?
[Figure: two hierarchical cluster trees, labeled LANGUAGE DATA and BRAIN DATA, each with a horizontal axis from 0.2 to 0.6. One has leaf order London, Paris, Moscow, Germany, Poland, Russia, north, south, west, east (the order of the brain-data tree shown earlier); the other has leaf order Germany, Poland, Russia, London, Paris, Moscow, north, south, west, east (the order of the WordNet tree shown earlier).]
[Figure: WordNet-based semantic similarities and EEG conditional probability estimates for London relative to London (L), Moscow (M), Paris (P), north (n), south (s), east (e), west (w), Germany (G), Poland (Po), and Russia (R).]
The Spearman rank correlation for the two sequences in the figure is 0.99, with one-sided significance of 1.84E-10.
BRAIN DATA
For each word $\omega$ we compute, from the conditional probability density estimates, a ternary relation $R$ such that $R(\omega, \omega_1, \omega_2)$ if and only if, with respect to word $\omega$, the conditional probability for word $\omega_1$ is smaller than the conditional probability for word $\omega_2$; that is, if and only if $\omega_1$'s similarity difference with $\omega$ is smaller than $\omega_2$'s similarity difference with $\omega$.
$R$ is an ordinal relation of similarity differences: a partial order that is irreflexive, asymmetric, and transitive.
LANGUAGE DATA
For each word $\omega$ we compute, from the semantic similarity matrix, a ternary relation $R$ such that $R(\omega, \omega_1, \omega_2)$ if and only if the similarity difference of $\omega_1$ with $\omega$ is smaller than the similarity difference of $\omega_2$ with $\omega$; that is, $\omega_1$ is more similar to $\omega$ than is $\omega_2$.
$R$ is an ordinal relation of similarity differences: a partial order that is irreflexive, asymmetric, and transitive.
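As an illustration of the language-data definition, the sketch below fixes the reference word London and builds the set of pairs (ω1, ω2) for which R(London, ω1, ω2) holds, reading "more similar" as "higher similarity score". It then checks the three stated properties. The similarity values are the London row of the WordNet matrix above; the function names are assumptions made for this example.

```python
# WordNet-based similarity of each word to the reference word London.
SIM_TO_LONDON = {
    "Moscow": 0.396, "Paris": 0.466, "north": 0.106, "south": 0.103,
    "east": 0.076, "west": 0.078, "Germany": 0.322, "Poland": 0.299,
    "Russia": 0.303,
}


def relation_for(scores):
    """Pairs (w1, w2) such that w1 is more similar to the reference than w2."""
    return {(w1, w2) for w1 in scores for w2 in scores
            if scores[w1] > scores[w2]}


def is_irreflexive(rel):
    return all(a != b for a, b in rel)


def is_asymmetric(rel):
    return all((b, a) not in rel for a, b in rel)


def is_transitive(rel):
    return all((a, d) in rel
               for a, b in rel for c, d in rel if b == c)


R_london = relation_for(SIM_TO_LONDON)
print(("Paris", "Moscow") in R_london)   # Paris is closer to London than Moscow
print(is_irreflexive(R_london), is_asymmetric(R_london), is_transitive(R_london))
```

Words with equal scores are simply left unrelated, which is why the result is a partial order rather than a total one.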
London
Language data        Brain data
London   1.000       London   0.275
Paris    0.466       Paris    0.133
Moscow   0.396       Moscow   0.108
Germany  0.322       Germany  0.075
Russia   0.303       north    0.042
Poland   0.299       Russia   0.033
north    0.106       Poland   0.025
south    0.103       east     0.008
west     0.078       south    0.000
east     0.076       west     0.000
Partial orders for London derived from the WordNet semantic similarities of Table 2 and the conditional probability estimates for the brain data of Table 5.
[Figure: the two partial orders drawn as diagrams. Lower node labels on the language side: Poland, north, south, west, east. Lower node labels on the brain side: north, Poland, east, south, west.]
Following the approach described in Suppes (1974) for the axiomatization of the theory of differences in utility preference or the theory of differences in psychological intensity.
Suppes P (1974) The axiomatic method in the empirical sciences. In L Henkin et al. (Eds), Proceedings of the Tarski Symposium, Proceedings of Symposia in Pure Mathematics 25, Providence RI, American Mathematical Society, pp 465-479.
Brain data: the relational structure $(A, R)$ constructed from $R$ and the finite set $A$ of classes $\omega_1, \omega_2, \ldots, \omega_N$, together with the $N$ partial orders constructed from the $N$-by-$N$ estimate for the conditional probability densities.
Language data: the relational structure $(A, R)$ constructed from $R$ and the finite set $A$ of classes $\omega_1, \omega_2, \ldots, \omega_N$, together with the $N$ partial orders constructed from the $N$-by-$N$ similarity matrix.
For each $\omega_i$, we compare the partial order of the brain data with the partial order of the language data using Spearman's rank correlation coefficient, which we interpret in the usual way to determine whether or not the correlation is statistically significant.
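The comparison step can be sketched with a plain-Python Spearman coefficient: rank each sequence (ties get the average rank), then take the Pearson correlation of the ranks. The two sequences are the London values listed on the earlier slide; the helper names are assumptions, and the significance test that the talk also applies is omitted here.

```python
from math import sqrt

WORDS = ["London", "Moscow", "Paris", "north", "south",
         "east", "west", "Germany", "Poland", "Russia"]
LANGUAGE = [1.000, 0.396, 0.466, 0.106, 0.103, 0.076, 0.078, 0.322, 0.299, 0.303]
BRAIN = [0.275, 0.108, 0.133, 0.042, 0.000, 0.008, 0.000, 0.075, 0.025, 0.033]


def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Pearson correlation of the rank-transformed sequences."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / sqrt(var_x * var_y)


rho = spearman(LANGUAGE, BRAIN)
print(f"Spearman rho for London: {rho:.3f}")
```

A coefficient near 1 means the brain data and the language data order the other words almost identically with respect to the reference word.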
For those instances in which the brain and language partial orders are significantly correlated, we find the partial order that is invariant with respect to the brain and language data.
Here are two more examples:
[Figure: invariant partial order for Paris. Significant invariance, Spearman 0.88 (p = 6.6795e-004). Node labels: London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia.]
[Figure: invariant partial order for Paris. Significant invariance, Spearman 0.90 (p = 3.8716e-004). Node labels: London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia.]
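The slides do not spell out how the invariant partial order is constructed. One simple reading, offered here only as an assumption, is to keep exactly those ordered pairs on which the brain order and the language order agree, that is, the intersection of the two relations. The sketch uses the London values from the earlier slide; `strict_order` is a hypothetical helper name.

```python
LANGUAGE = {
    "London": 1.000, "Paris": 0.466, "Moscow": 0.396, "Germany": 0.322,
    "Russia": 0.303, "Poland": 0.299, "north": 0.106, "south": 0.103,
    "west": 0.078, "east": 0.076,
}
BRAIN = {
    "London": 0.275, "Paris": 0.133, "Moscow": 0.108, "Germany": 0.075,
    "north": 0.042, "Russia": 0.033, "Poland": 0.025, "east": 0.008,
    "south": 0.000, "west": 0.000,
}


def strict_order(scores):
    """Pairs (a, b) where a ranks strictly above b."""
    return {(a, b) for a in scores for b in scores if scores[a] > scores[b]}


# Assumed construction: pairs ordered the same way in both data sets.
invariant = strict_order(LANGUAGE) & strict_order(BRAIN)

# Pairs on which the two orders disagree (or one side has a tie) drop out.
words = sorted(LANGUAGE)
incomparable = [(a, b) for i, a in enumerate(words) for b in words[i + 1:]
                if (a, b) not in invariant and (b, a) not in invariant]
print(len(invariant), "agreed pairs;", "incomparable:", incomparable)
```

The intersection of two strict partial orders is again a strict partial order, so the result can be drawn as a diagram in which disagreeing words, such as north and Poland here, sit side by side.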
Brain data
[Figure: another hierarchical cluster tree (similarity tree) computed from the conditional probability density estimates for the classification of 640 brain wave samples for London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia. Leaf order: London, Paris, Moscow, Germany, Poland, Russia, north, south, west, east. Horizontal axis from 0.2 to 0.6.]
Every single-trial classification produces its own conditional probability density estimates, giving rise to its own similarity tree …
[Figure: a similarity tree from a different classification. Leaf order: north, east, south, west, London, Paris, Germany, Moscow, Russia, Poland. Horizontal axis from 0.1 to 0.5.]
1. We compute M (= 30) single-trial classifications of the data (640 data samples for our 10 words) using random resampling with replacement.
2. For each classification, we find for each word the partial orders of the brain and language data that are significantly correlated.
3. For each of these highly correlated partial order pairs, we find the partial order invariant with respect to both.
We performed 60 classifications; that is, we recomputed the classifications of the brain data using random resampling with replacement.
For half of these 60 classifications we compared the brain data to the WordNet data, and for the other half we compared the brain data to the LSA data.
And we plotted the results.
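The resampling in step 1 can be sketched as drawing 640 sample indices with replacement for each of the 60 runs and assigning half of the runs to each language model. How each resampled set was then split into training and test samples is not stated on the slide, so that part is left out. The function names and the fixed seed are assumptions made so that the example is reproducible.

```python
import random


def bootstrap_indices(n_samples, rng):
    """Draw n_samples indices uniformly with replacement."""
    return [rng.randrange(n_samples) for _ in range(n_samples)]


def resampled_runs(n_samples, n_runs, seed=0):
    """One list of resampled indices per classification run."""
    rng = random.Random(seed)  # fixed seed, for reproducibility only
    return [bootstrap_indices(n_samples, rng) for _ in range(n_runs)]


runs = resampled_runs(n_samples=640, n_runs=60)

# Half of the runs are compared with WordNet, the other half with LSA.
wordnet_runs, lsa_runs = runs[:30], runs[30:]
print(len(wordnet_runs), len(lsa_runs), len(set(runs[0])), "distinct in run 0")
```

Because sampling is with replacement, each run sees some samples more than once and misses others, which is why every run yields its own confusion matrix and its own similarity tree.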
[Figure: bar chart, "Partial Orders of Similarity Differences Invariant between EEG-recorded Brain Data and Semantic Representations of Language". Vertical axis: Significant Invariant Partial Orders (0 to 300). Horizontal axis: Test Participants (S10, S24, S25, S27, S16, S26, S12, S13, S18). Series: WordNet, LSA.]
A higher number of significant structural similarities was found between the brain data and the WordNet data than between the brain data and the LSA data.
This stronger structural similarity between the brain data and the WordNet-derived data supports the contention that, during language comprehension for the complex cognitive task of assessing truth or falsity, the representation of words in the brain has a WordNet-like quality.
This stronger structural similarity is the first evidence we have seen of WordNet-like representations in the brain.
Banerjee S and Pedersen T (2003) Extended Gloss Overlaps as a Measure of Semantic Relatedness. In Proceedings of the Eighteenth International Joint Conference on Artificial Intelligence, pp 805-810, August 9-15 2003, Acapulco, Mexico.
Banerjee S and Pedersen T (2002) An Adapted Lesk Algorithm for Word Sense Disambiguation using WordNet. In Proceedings of the Third International Conference on Intelligent Text Processing and Computational Linguistics, pp 136-145, February 17-23 2002, Mexico City.
Baroni M, B Murphy, E Barbu, M Poesio (2010) Strudel: A Corpus-Based Semantic Model Based on Properties and Types. Cognitive Science vol 34 no 2, pp 222-254. DOI 10.1111/j.1551-6709.2009.01068.x
Blei D, Ng A & Jordan M (2003) Latent Dirichlet allocation. Journal of Machine Learning Research 3, 993–1022.
Brants T and Franz A (2006) www.ldc.upenn.edu/Catalog/CatalogEntry.jsp?catalogId=LDC2006T13 Linguistic Data Consortium, Philadelphia.
Budanitsky A, Hirst G (2001) Semantic distance in WordNet: an experimental, application-oriented evaluation of five measures. In Proceedings of the NAACL 2001 Workshop on WordNet and Other Lexical Resources, Pittsburgh, June 2001. http://www.seas.smu.edu/~rada/mwnw/papers/WNW-NAACL-101.pdf.gz
Collins AM, Loftus EF (1975) A spreading-activation theory of semantic processing. Psychological Review 1975 Nov, Vol 82(6), 407-428.
Crestani F (1997) Application of Spreading Activation Techniques in Information Retrieval. Artificial Intelligence Review 11, 453–482.
Deerwester S, Dumais ST, Furnas GW, Landauer TK, Harshman R (1990) Indexing By Latent Semantic Analysis. Journal of the American Society for Information Science 41, 391-407.
Devereux B, C Kelly, A Korhonen (2010) Using fMRI activation to conceptual stimuli to evaluate methods for extracting conceptual representations from corpora. Proceedings of the NAACL HLT 2010 First Workshop on Computational Neurolinguistics, June 2010, Los Angeles USA, Association for Computational Linguistics, 70–78.
Fellbaum C (Ed) (1998) WordNet: An Electronic Lexical Database. Cambridge MA: MIT Press.
Hauk O, Davis MH, Ford M, Pulvermüller F, Marslen-Wilson WD (2006) The time course of visual word recognition as revealed by linear regression analysis of ERP data. Neuroimage 2006 May 1;30(4):1383-400. Epub 2006 Feb 7.
Hirst G (1988) Resolving lexical ambiguity computationally with spreading activation and Polaroid Words. In Small SL, Cottrell GW, Tanenhaus MK (editors), Lexical ambiguity resolution: Perspectives from psycholinguistics, neuropsychology, and artificial intelligence. San Mateo CA: Morgan Kaufmann Publishers, 1988, 73–107.
Jelodar AB, M Alizadeh, S Khadivi (2010) WordNet Based Features for Predicting Brain Activity associated with meanings of noun. Proceedings of the NAACL HLT 2010 First Workshop on Computational Neurolinguistics, June 2010, Los Angeles USA, Association for Computational Linguistics, 18–26.
Jiang J and Conrath D (1997) Semantic similarity based on corpus statistics and lexical taxonomy. In Proceedings of International Conference on Research in Computational Linguistics, Taiwan, 19-33.
Just MA, Cherkassky VL, Aryal S, Mitchell TM (2010) A Neurosemantic Theory of Concrete Noun Representation Based on the Underlying Brain Codes. PLoS ONE 5(1): e8622. doi:10.1371/journal.pone.0008622
Karp RM (1972) Reducibility among combinatorial problems. In Complexity of Computer Computations, Proc Sympos IBM Thomas J Watson Res Center, Yorktown Heights NY (Ed RE Miller and JW Thatcher). New York: Plenum, pp 85-103.
Kelly C, B Devereux, A Korhonen (2010) Acquiring Human-like Feature-Based Conceptual Representations from Corpora. Proceedings of the NAACL HLT 2010 First Workshop on Computational Neurolinguistics, June 2010, Los Angeles USA, Association for Computational Linguistics, 61–69.
Kriegeskorte N, Mur M, Bandettini PA (2008) Representational similarity analysis – connecting the branches of systems neuroscience. Frontiers in Systems Neuroscience. doi:10.3389/neuro.06.004.2008
Kutas M, Hillyard SA (1984) Brain potentials during reading reflect word expectancy and semantic association. Nature 307, 161-163.
Kutas M, Federmeier KD (2011) Thirty years and counting: Finding meaning in the N400 component of the event related brain potential (ERP). Annual Review of Psychology 2011, 62, pp 621-647.
Landauer TK, Dumais ST (1997) A solution to Plato's problem: The Latent Semantic Analysis theory of the acquisition, induction, and representation of knowledge. Psychological Review 104, 211-240.
Landauer TK, Foltz PW, Laham D (1998) Introduction to Latent Semantic Analysis. Discourse Processes 25, 259-284.
Leacock C, Chodorow M (1998) Combining local context and WordNet similarity for word sense identification. In Fellbaum (Ed), WordNet: An Electronic Lexical Database. Cambridge MA: MIT Press, pp 265-283.
Leech G, Rayson P, Wilson A (2001) Word Frequencies in Written and Spoken English: based on the British National Corpus. Longman, London.
Lin D (1998) An information-theoretic definition of similarity. In Proc 15th International Conf on Machine Learning. Morgan Kaufmann, San Francisco CA, pp 296-304.
Luce RD (1956) Semiorders and a theory of utility discrimination. Econometrica 24, 178–191. MR0078632. http://www.jstor.org/stable/1905751
Matsunaga T, Yonemori C, Tomita E, Muramatsu M (2009) Clique-based data mining for related genes in a biomedical database. BMC Bioinformatics 2009 Jul 1;10:205.
Martin Ph (2003) Correction and Extension of WordNet 1.7. ICCS 2003, 11th International Conference on Conceptual Structures (© Springer Verlag, LNAI 2746, pp 160-173), Dresden, Germany, July 21-25 2003.
Miller GA, Nicely P (1955) An analysis of perceptual confusions among some English consonants. J Acoust Soc Am 27:2.
Miller GA, Leacock C, Tengi R, Bunker RT (1993) A Semantic Concordance. In Proceedings of the 3 DARPA Workshop on Human Language Technology.
Miller GA (1995) WordNet: A Lexical Database for English. Communications of the ACM Vol 38 No 11, 39-41.
Mitchell TM, SV Shinkareva, A Carlson, KM Chang, VL Malave, RA Mason, and MA Just (2008) Predicting Human Brain Activity Associated with the Meanings of Nouns. Science 320, 1191, May 30 2008. DOI 10.1126/science.1152876
Murphy B, Baroni M & Poesio M (2009) EEG responds to conceptual stimuli and corpus semantics. In Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing, pages 619–627, Singapore, 6-7 August 2009. © 2009 ACL and AFNLP.
Murphy B & Poesio M (2010) Detecting semantic category in simultaneous EEG/MEG recordings. In First workshop on computational neurolinguistics, NAACL HLT 2010 (pp 36–44). Los Angeles: Association for Computational Linguistics.
Murphy B, Poesio M, Bovolo F, Bruzzone L, Dalponte M, Lakany H (2011) EEG decoding of semantic category reveals distributed representations for single concepts. Brain Lang 2011 Apr;117(1):12-22. doi:10.1016/j.bandl.2010.09.013
Pedersen T, Patwardhan S, Michelizzi J (2004) WordNet::Similarity - Measuring the Relatedness of Concepts. In Proceedings of Fifth Annual Meeting of the North American Chapter of the Association for Computational Linguistics (NAACL-2004), pp 38-41, Boston, May 2004.
Pereira F, M Botvinick, G Detre (2010) Learning semantic features for fMRI data from definitional text. Proceedings of the NAACL HLT 2010 First Workshop on Computational Neurolinguistics, June 2010, Los Angeles USA, Association for Computational Linguistics, 1–9. http://www.aclweb.org/anthology/W10-0601
Perreau-Guimaraes M, Wong DK, Uy ET, Grosenick L, Suppes P (2007) Single-trial classification of MEG recordings. IEEE Transactions on Biomedical Engineering 54, 436–443.
Rabinovitch I (1977) The Scott-Suppes theorem on semiorders. J Mathematical Psychology 15 (2), 209–212. MR0437404.
Resnik P (1995) Using information content to evaluate semantic similarity. In Proceedings of the 14th International Joint Conference on Artificial Intelligence, pages 448-453, Montreal.
Roach BJ, Mathalon DH (2008) Event-related EEG time-frequency analysis: an overview of measures and an analysis of early gamma band phase locking in schizophrenia. Schizophr Bull 2008 Sep;34(5):907-26. Epub 2008 Aug 6.
Scott D, Suppes P (1958) Foundational aspects of theories of measurement. The Journal of Symbolic Logic 23, 113–128. MR0115919.
Suppes P (1972) Axiomatic Set Theory. New York: Dover.
Suppes P (1974) The axiomatic method in the empirical sciences. In L Henkin et al. (Eds), Proceedings of the Tarski Symposium, Proceedings of Symposia in Pure Mathematics 25, Providence RI, American Mathematical Society, pp 465-479.
Suppes P, Lu Z-L, Han B (1997) Brain wave recognition of words. Proceedings of the National Academy of Sciences 95, 14965–14969.
Suppes P, Han B, Lu Z-L (1998) Brain-wave recognition of sentences. Proceedings of the National Academy of Sciences 95, 15861–15866.
Suppes P, Han B, Epelboim J, Lu Z-L (1999a) Invariance between subjects of brain wave representations of language. Proceedings of the National Academy of Sciences 96, 12953–12958.
Suppes P, Han B, Epelboim J, Lu Z-L (1999b) Invariance of brain-wave representations of simple visual images and their names. Proceedings of the National Academy of Sciences 96, 14658-14663.
Suppes P, Han B (2000) Brain-wave representation of words by superposition of a few sine waves. Proceedings of the National Academy of Sciences 97, 8738–8743.
Suppes P, Perreau-Guimaraes M, Wong DK (2009) Partial Orders of Similarity Differences Invariant Between EEG-Recorded Brain and Perceptual Representations of Language. Neural Computation 21, 3228–3269.
Suppes P, de Barros JA, Oas G (2012) Phase-Oscillator computations as neural models of stimulus-response conditioning and response selection. Journal of Mathematical Psychology 56, 95-117.
Turney P (2001) Mining the Web for Synonyms: PMI-IR versus LSA on TOEFL. In L De Raedt & P Flach (Eds), Proceedings of the Twelfth European Conference on Machine Learning (ECML-2001) (pp 491-502), Freiburg, Germany.
Vassilieva E, Pinto G, de Barros JA, Suppes P (2011) Learning Pattern Recognition Through Quasi-Synchronization of Phase Oscillators. IEEE Transactions on Neural Networks Vol 22 No 1, January 2011.
Vigliocco G, Warren J, Siri S, Arciuli J, Scott S, Wise R (2006) The role of semantics and grammatical class in the neural representation of words. Cereb Cortex 2006 Dec;16(12):1790-6. Epub 2006 Jan 18.
Wong DK, Perreau-Guimaraes M, Uy ET, Suppes P (2004) Classification of individual trials based on the best independent component of EEG-recorded sentences. Neurocomputing 61, 479-484.
Wong DK, Uy ET, Perreau-Guimaraes M, Yang W, Suppes P (2006) Interpretation of perceptron weights as constructed time series for EEG classification. Neurocomputing 70, 373-383.
Wong DK, Grosenick L, Uy ET, Perreau-Guimaraes M, Carvalhaes CG, Desain P, Suppes P (2008) Quantifying inter-subject agreement in brain-imaging analyses. NeuroImage 39, 1051–1063.
Woods W (1975) What's in a link: Foundations for semantic networks. In Bobrow D, Collins A, eds, Representation and understanding. New York: Academic Press, 1975:35-82.
Wu Z, Palmer M (1994) Verb Semantics and Lexical Selection. In Proceedings of the 32nd Annual Meeting of the Association for Computational Linguistics, Las Cruces, New Mexico, pp 133-138.
Paris is north of Berlin
Spain is east of
France
The capital of
Germany is
Berlin
13 copy Colleen E Crangle 2013
Paris
east of
Germany
14 copy Colleen E Crangle 2013
Machine learning approach to the study of
brain and language
For the10 geography words London Moscow Paris north south east west Germany Poland Russia we have 640 EEG data samples for each participant
We want to classify these 640 samples into the correct 10 classes
640 EEG samples for each participant
Use 580 brain samples and their associated words to train the classifier
Test the remaining 60 samples
Do this many times each time using a different set of training samples
Find the average classification rate make sure it is statistically significant
Obtained classification rates in the range 25 to 29 with a mean classification rate of around 245 p lt 10Eminus10 Significantly higher than chance (10)
We used a 5-fold linear discriminant model with principal component analysis for blind source separation to classify the segments of EEG data obtained from the individual trials
15 copy Colleen E Crangle 2013
Beyond machine learning hellip
For the10 geography words London Moscow Paris north south east west Germany Poland Russia we have 640 EEG data samples for each participant
We want to classify these 640 samples into the correct 10 classes
640 EEG samples for each participant
Use 580 brain samples and their associated words to train the classifier
Test the remaining 60 samples
Do this many times each time using a different set of training samples
Find the average classification rate make sure it is statistically significant
Obtained classification rates in the range 25 to 29 with a mean classification rate of around 245 p lt 10Eminus10 Significantly higher than chance (10)
THEN
look at the MIS-CLASSIFICATIONS and build a
CONFUSION MATRIX
16 copy Colleen E Crangle 2013
Classify 640 brain data samples s1 s2 hellip s640 into 10 classes ω1 ω2
hellip ω10 of the finite set A
M = (miq) is the confusion matrix for a given classification where miq
is the number of test samples from class ωi classified as belonging to
class ωq
London Moscow Paris north south east west Germany Poland Russia
London 8 14 11 7 6 3 3 10 8 10
Moscow 8 24 14 6 2 6 4 4 7 15
Paris 6 18 12 4 3 5 8 6 11 7
north 4 2 5 11 9 7 10 1 8 3
south 1 4 3 14 14 9 11 4 7 3
east 4 3 5 9 12 12 7 1 4 3
west 4 3 2 12 13 11 10 2 8 5
Germany 2 2 3 2 2 1 0 9 11 8
Poland 2 3 4 0 4 1 4 9 9 4
Russia 7 7 5 1 5 1 1 8 4 11
17 copy Colleen E Crangle 2013
The relative frequencies miq are an N-by-N estimate for the
conditional probability densities minus designated by the matrix P = (piq) minus
that a randomly chosen test sample from class ωi will be classified as
belonging to class ωq
119846119842119850119850
Conditional probability density estimates from the classification of brain wave data for London Moscow Paris north south east west Germany Poland Russia
ωi
ωq ndash predicted classified as
London Moscow Paris north south east west Germany Poland Russia
London 025 0163 0138 0075 0038 005 0063 0088 0075 0063
Moscow 0144 0333 0133 0033 0011 0056 0056 0089 0067 0078
Paris 0175 0188 0125 0038 0038 0038 005 01 0138 0113
north 0067 0017 0017 0283 0167 015 01 01 0067 0033
south 0014 0014 0071 0114 0271 0171 0157 0029 0086 0071
east 005 0 0 0133 015 0383 015 005 0 0083
west 0057 0043 0 0171 0114 0143 0229 0086 01 0057
Germany 0075 0175 01 005 0125 0025 005 0275 0025 01
Poland 015 01 0125 005 005 0 0025 015 025 01
Russia 016 004 008 004 01 002 012 01 008 026
18 copy Colleen E Crangle 2013
02 03 04 05 06
London
Paris
Moscow
Germany
Poland
Russia
north
south
west
east
Hierarchical cluster tree (similarity tree) computed from the conditional probability
density estimates for the classification of 640 brain wave samples for London Moscow Paris north south east west Germany Poland Russia
19 copy Colleen E Crangle 2013
LANGUAGE
WordNet: a lexical database of English organized around sets of cognitive synonyms, restricted to the senses that are relevant to the geography sentences.
Latent Semantic Analysis (LSA): a statistical method to extract measures of word similarity from selected sets of documents such as novels, newspaper articles, textbooks.
BRAIN
The conditional probability density estimates from the classification of the brain wave data.
WORDNET
WordNet is a human-annotated lexical database of English in which nouns, verbs, adjectives and adverbs are grouped into sets of cognitive synonyms (called synsets), each synset expressing a distinct concept. See Miller (1995), Fellbaum (1998) and http://wordnet.princeton.edu.
The synsets are related to each other primarily through the hypernymy and hyponymy relations for nouns. Other relations in WordNet are part-whole (holonym), member of (meronym), has instance, and so on.
Hyponymy, often referred to as the is-a relation in computational discussions, is defined as follows: a concept represented by a lexical item L_i is said to be a hyponym of the concept represented by a lexical item L_k if native speakers of English accept sentences of the form "An L_i is a kind of L_k". Conversely, L_k is the hypernym of L_i. A hypernym is therefore a more general concept and a hyponym a more specific concept.
WORDNET
Multiple word senses, organized into sets of cognitive synonyms called "synsets"
(n) Paris#1, City of Light#1, French capital#1, capital of France#1 (the capital and largest city of France; and international center of culture and commerce)
(n) Paris#2, genus Paris#1 (sometimes placed in subfamily Trilliaceae)
(n) Paris#3 ((Greek mythology) the prince of Troy who abducted Helen from her husband Menelaus and provoked the Trojan War)
(n) Paris#4 (a town in northeastern Texas)
(n) city#1, metropolis#1, urban center#1 (a large and densely populated urban area; may include several independent administrative districts). Relations listed: direct hyponym, full hyponym, part meronym, has instance.
(n) city#2 (an incorporated administrative district established by state charter)
(n) city#3, metropolis#2 (people living in a large densely populated municipality)
(n) east#1, due east#1, eastward#1, E#3 (the cardinal compass point that is at 90 degrees)
(n) East#2, Orient#1 (the countries of Asia)
(n) East#3, eastern United States#1 (the region of the United States lying to the north of the Ohio River and to the east of the Mississippi River)
(n) east#4 (the direction corresponding to the eastward cardinal compass point)
(n) east#5 (a location in the eastern part of a country, region or city)
LATENT SEMANTIC ANALYSIS (LSA)
LSA is a statistical technique for extracting, from large collections of documents, a measure of how similar two words are to each other in terms of the patterns of their co-occurrence within those documents. See Deerwester et al. (1990), Landauer and Dumais (1997), Landauer et al. (1998).
The underlying idea is that if, for each word, you take into account all the contexts in which it does and does not appear, you get for all the words a set of mutual constraints that represent how similar any two words are to each other.
The similarity judgments produced by latent semantic analysis have been shown to correspond to some extent to human judgments of similarity. After training on about 2,000 pages of English text, it scored as well as average test-takers on the synonym portion of the Test of English as a Foreign Language. After training on a psychology textbook, it achieved a passing score on a multiple-choice exam.
We used the application at http://lsa.colorado.edu to compute similarity matrices in term space for our set of words. The computation was based on ~38,000 college-level texts (novels, newspaper articles, ...). A maximum of 300 factors was permitted in the analysis.
Semantic similarity matrix derived from LSA for the set of words London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia.
        London  Moscow  Paris   north   south   east    west    Germany Poland  Russia
London  1       0.17    0.37    0.13    0.12    0.12    0.14    0.17    0.16    0.16
Moscow  0.17    1       0.18    0.13    0.08    0.22    0.14    0.37    0.65    0.69
Paris   0.37    0.18    1       0.1     0.04    0.08    0.09    0.5     0.31    0.36
north   0.13    0.13    0.1     1       0.89    0.6     0.61    0.07    0.09    0.14
south   0.12    0.08    0.04    0.89    1       0.5     0.55    0.02    0.05    0.05
east    0.12    0.22    0.08    0.6     0.5     1       0.85    0.22    0.29    0.3
west    0.14    0.14    0.09    0.61    0.55    0.85    1       0.24    0.26    0.23
Germany 0.17    0.37    0.5     0.07    0.02    0.22    0.24    1       0.85    0.81
Poland  0.16    0.65    0.31    0.09    0.05    0.29    0.26    0.85    1       0.87
Russia  0.16    0.69    0.36    0.14    0.05    0.3     0.23    0.81    0.87    1
[Figure: hierarchical cluster tree computed from the pairwise Latent Semantic Analysis (LSA) scores of similarity for London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia, based on ~38,000 college-level texts (novels, newspaper articles, ...). Leaf order: east, west, north, south, London, Paris, Moscow, Germany, Poland, Russia. Axis ticks: 0.1 to 0.6.]
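One way a similarity tree of this kind might be built from the LSA matrix is sketched below. The slides do not state the linkage method or how similarities were turned into distances, so both choices here (average linkage, distance = 1 - similarity) are assumptions made for illustration, as is the function name `average_linkage`. The matrix is the LSA similarity matrix from the preceding slide.

```python
WORDS = ["London", "Moscow", "Paris", "north", "south",
         "east", "west", "Germany", "Poland", "Russia"]

# LSA similarity matrix copied from the preceding slide.
LSA = [
    [1, 0.17, 0.37, 0.13, 0.12, 0.12, 0.14, 0.17, 0.16, 0.16],
    [0.17, 1, 0.18, 0.13, 0.08, 0.22, 0.14, 0.37, 0.65, 0.69],
    [0.37, 0.18, 1, 0.1, 0.04, 0.08, 0.09, 0.5, 0.31, 0.36],
    [0.13, 0.13, 0.1, 1, 0.89, 0.6, 0.61, 0.07, 0.09, 0.14],
    [0.12, 0.08, 0.04, 0.89, 1, 0.5, 0.55, 0.02, 0.05, 0.05],
    [0.12, 0.22, 0.08, 0.6, 0.5, 1, 0.85, 0.22, 0.29, 0.3],
    [0.14, 0.14, 0.09, 0.61, 0.55, 0.85, 1, 0.24, 0.26, 0.23],
    [0.17, 0.37, 0.5, 0.07, 0.02, 0.22, 0.24, 1, 0.85, 0.81],
    [0.16, 0.65, 0.31, 0.09, 0.05, 0.29, 0.26, 0.85, 1, 0.87],
    [0.16, 0.69, 0.36, 0.14, 0.05, 0.3, 0.23, 0.81, 0.87, 1],
]


def average_linkage(names, sim):
    """Agglomerative clustering with average linkage.

    Distances are taken to be 1 - similarity (an assumption).
    Returns the merges in order as (cluster_a, cluster_b, distance).
    """
    n = len(names)
    dist = [[1.0 - sim[i][j] for j in range(n)] for i in range(n)]
    clusters = [(i,) for i in range(n)]  # each cluster is a tuple of indices
    merges = []
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                pairs = [dist[i][j] for i in clusters[a] for j in clusters[b]]
                d = sum(pairs) / len(pairs)
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        merges.append((tuple(names[i] for i in clusters[a]),
                       tuple(names[i] for i in clusters[b]), d))
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return merges


merges = average_linkage(WORDS, LSA)
for left, right, d in merges:
    print(f"{d:.3f}  {left} + {right}")
```

Under these assumptions the first merge joins north and south at a distance of 0.11, the pair with the highest LSA similarity (0.89). A different linkage rule could change the later merges.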
LSA provides a straightforward measure of similarity between words.
For WordNet, several different measures of similarity have been devised, e.g.:
Path length between synsets.
Information content: a corpus-based measure of the specificity of a concept, measured in terms of the frequency of occurrence of the concept in a corpus, here the human-annotated, sense-tagged corpus SemCor (Miller et al. 1993), which links every word in the Brown Corpus to its appropriate WordNet sense. Scaled in various ways.
Vector-space models: these work by forming second-order co-occurrence vectors from the WordNet definitions of concepts (known as glosses).
We used five measures in our computations of similarity and took the average score, using each of the relevant senses.
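As an illustration of a path-based measure, the sketch below computes the Wu-Palmer score (Wu and Palmer 1994), defined as twice the depth of the lowest common subsumer divided by the sum of the depths of the two concepts. The taxonomy used here is a small hypothetical one invented for this example, not the actual WordNet hierarchy, and the function names are illustrative.

```python
# Hypothetical toy taxonomy (child -> parent). This is NOT the real WordNet
# hierarchy; it only shows how a depth-based measure is computed.
PARENT = {
    "location": "entity",
    "region": "location",
    "city": "region",
    "country": "region",
    "London": "city",
    "Paris": "city",
    "Germany": "country",
}


def path_to_root(node):
    """Return the list [node, parent, grandparent, ..., root]."""
    path = [node]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path


def depth(node):
    """Depth counted in nodes, so the root has depth 1."""
    return len(path_to_root(node))


def lowest_common_subsumer(a, b):
    """Deepest node that is an ancestor of (or equal to) both a and b."""
    ancestors_a = set(path_to_root(a))
    for node in path_to_root(b):
        if node in ancestors_a:
            return node
    return None


def wup(a, b):
    """Wu-Palmer similarity: 2 * depth(lcs) / (depth(a) + depth(b))."""
    lcs = lowest_common_subsumer(a, b)
    return 2.0 * depth(lcs) / (depth(a) + depth(b))


print(wup("London", "Paris"))
print(wup("London", "Germany"))
```

In this toy tree two cities share a deeper common ancestor than a city and a country do, so the city pair scores higher. In the procedure described above, a score of this kind would be one of the five values averaged for each word pair.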
Semantic similarity matrix derived from WordNet for the set of words London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia, using senses relevant to the geography of Europe and five measures of similarity: wup (path length), lin and jcn (information content), and gv and pgv (vector space measures).
        London  Moscow  Paris   north   south   east    west    Germany Poland  Russia
London  1       0.396   0.466   0.106   0.103   0.076   0.078   0.322   0.299   0.303
Moscow  0.396   1       0.393   0.095   0.094   0.062   0.07    0.286   0.281   0.288
Paris   0.466   0.393   1       0.106   0.104   0.074   0.077   0.327   0.308   0.307
north   0.106   0.095   0.106   1       0.228   0.179   0.21    0.123   0.132   0.111
south   0.103   0.094   0.104   0.228   1       0.172   0.212   0.115   0.107   0.109
east    0.076   0.062   0.074   0.179   0.172   1       0.216   0.093   0.08    0.077
west    0.078   0.07    0.077   0.21    0.212   0.216   1       0.087   0.082   0.083
Germany 0.322   0.286   0.327   0.123   0.115   0.093   0.087   1       0.589   0.409
Poland  0.299   0.281   0.308   0.132   0.107   0.08    0.082   0.589   1       0.403
Russia  0.303   0.288   0.307   0.111   0.109   0.077   0.083   0.409   0.403   1
[Figure: hierarchical cluster tree computed from pairwise WordNet-based semantic similarity scores for London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia, restricted to senses related to the geography of Europe. Leaf order: Germany, Poland, Russia, London, Paris, Moscow, north, south, west, east. Axis ticks: 0.2 to 0.6.]
BACK TO THE BRAIN ...
London Moscow Paris north south east west Germany Poland Russia
Now let's see how to compare the EEG data and the language data ...
Some of the similarity trees show remarkable congruence between the brain and semantic data.
Where exactly does that congruence lie?
Can we devise a quantitative measure of the nature and strength of that congruence?
[Figure: two similarity trees side by side, each with axis ticks 0.2 to 0.6. BRAIN DATA leaf order: London, Paris, Moscow, Germany, Poland, Russia, north, south, west, east. LANGUAGE DATA (WordNet) leaf order: Germany, Poland, Russia, London, Paris, Moscow, north, south, west, east.]
[Figure: WordNet-based semantic similarities and EEG conditional probability estimates for London relative to London (L), Moscow (M), Paris (P), north (n), south (s), east (e), west (w), Germany (G), Poland (Po) and Russia (R).]
The Spearman rank correlation for the two sequences in the figure is 0.99, with one-sided significance of 1.84E-10.
BRAIN DATA
For each word ω, we compute from the conditional probability density estimates a ternary relation R such that R(ω, ω1, ω2) if and only if, with respect to word ω, the conditional probability for word ω1 is smaller than the conditional probability for word ω2; that is, if and only if ω1's similarity difference with ω is smaller than ω2's similarity difference with ω.
R is an ordinal relation of similarity differences: a partial order that is irreflexive, asymmetric and transitive.
LANGUAGE DATA
For each word ω, we compute from the semantic similarity matrix a ternary relation R such that R(ω, ω1, ω2) if and only if the similarity difference of ω1 with ω is smaller than the similarity difference of ω2 with ω; that is, ω1 is more similar to ω than is ω2.
R is an ordinal relation of similarity differences: a partial order that is irreflexive, asymmetric and transitive.
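For the language data, the relation R for a fixed reference word can be written out directly as a set of ordered pairs. The sketch below does this for London, using its row of the WordNet similarity matrix given earlier, and checks the three stated properties. The function names are illustrative, and treating tied similarities as incomparable is an assumption made here.

```python
from itertools import permutations

# WordNet-based similarities of each word to London, copied from the London
# row of the WordNet similarity matrix given earlier.
SIM_TO_LONDON = {
    "Moscow": 0.396, "Paris": 0.466, "north": 0.106, "south": 0.103,
    "east": 0.076, "west": 0.078, "Germany": 0.322, "Poland": 0.299,
    "Russia": 0.303,
}


def similarity_order(sim_to_ref):
    """Pairs (w1, w2) such that w1 is strictly more similar to the
    reference word than w2 is. Tied words are left incomparable."""
    return {(w1, w2) for w1, w2 in permutations(sim_to_ref, 2)
            if sim_to_ref[w1] > sim_to_ref[w2]}


def is_strict_partial_order(rel):
    """Check that rel is irreflexive, asymmetric and transitive."""
    irreflexive = all(a != b for a, b in rel)
    asymmetric = all((b, a) not in rel for a, b in rel)
    transitive = all((a, d) in rel
                     for a, b in rel for c, d in rel if b == c)
    return irreflexive and asymmetric and transitive


R_london = similarity_order(SIM_TO_LONDON)
print(("Paris", "Moscow") in R_london)
print(is_strict_partial_order(R_london))
```

Here the pair (Paris, Moscow) belongs to the relation because Paris is more similar to London (0.466) than Moscow is (0.396).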
London
Language data        Brain data
London   1.000       London   0.275
Paris    0.466       Paris    0.133
Moscow   0.396       Moscow   0.108
Germany  0.322       Germany  0.075
Russia   0.303       north    0.042
Poland   0.299       Russia   0.033
north    0.106       Poland   0.025
south    0.103       east     0.008
west     0.078       south    0.000
east     0.076       west     0.000
Partial orders for London derived from the WordNet semantic similarities of Table 2 and the conditional probability estimates for the brain data of Table 5.
Following the approach described in Suppes (1974) for the axiomatization of the theory of differences in utility preference, or the theory of differences in psychological intensity.
Suppes P (1974) The axiomatic method in the empirical sciences. In L Henkin et al. (Eds) Proceedings of the Tarski Symposium, Proceedings of Symposia in Pure Mathematics 25. Providence, RI: American Mathematical Society, pp 465-479.
Brain data: the relational structure (A, R) constructed from R and the finite set A of classes ω1, ω2, ..., ωN, together with the N partial orders constructed from the N-by-N estimate for the conditional probability densities.
Language data: the relational structure (A, R) constructed from R and the finite set A of classes ω1, ω2, ..., ωN, together with the N partial orders constructed from the N-by-N similarity matrix.
For each ωi, we compare the partial order of the brain data with the partial order of the language data using Spearman's rank correlation coefficient, which we interpret in the usual way to determine whether or not we have a statistically significant correlation.
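The rank comparison can be sketched without any external library. The example below computes Spearman's coefficient for the London language and brain values listed on the earlier slide, using mean ranks for ties (the brain data has two zeros). It is a minimal illustration with names chosen here, and it computes only the coefficient, not its significance.

```python
def average_ranks(values):
    """Rank values from 1 (smallest) upward; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while (end + 1 < len(order)
               and values[order[end + 1]] == values[order[pos]]):
            end += 1
        mean_rank = (pos + end) / 2.0 + 1.0
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


# Values for London, in the word order London, Paris, Moscow, Germany,
# Russia, Poland, north, south, west, east (from the earlier slide).
language = [1.000, 0.466, 0.396, 0.322, 0.303, 0.299, 0.106, 0.103, 0.078, 0.076]
brain = [0.275, 0.133, 0.108, 0.075, 0.033, 0.025, 0.042, 0.000, 0.000, 0.008]

rho = spearman(language, brain)
print(f"Spearman rho for London: {rho:.3f}")
```

The two value lists must be aligned word by word before ranking, which is why the brain values are given in the same word order as the language values rather than in their own sorted order.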
For those instances in which the brain and language partial orders are significantly correlated, we find the partial order that is invariant with respect to the brain and language data.
Here are two more examples.
[Figure: invariant partial order diagram. Node rows as extracted: London Moscow / Paris / north / south east / west / Germany Poland / Russia. Caption: Significant Invariance - Paris - Spearman 0.88 (p = 6.6795e-004)]
[Figure: invariant partial order diagram. Node rows as extracted: London Moscow Paris / north / south east / west / Germany Poland Russia. Caption: Significant Invariance - Paris - Spearman 0.90 (p = 3.8716e-004)]
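The slides do not spell out how the invariant partial order is constructed. One natural reading, taken here purely as an assumption, is to keep exactly those ordered pairs on which the brain and language orders agree, that is, the intersection of the two relations. The sketch below applies this to the London values from the earlier slide; the function names are illustrative.

```python
from itertools import permutations

# Values for London relative to the other nine words, from the earlier slide.
LANGUAGE = {"Paris": 0.466, "Moscow": 0.396, "Germany": 0.322,
            "Russia": 0.303, "Poland": 0.299, "north": 0.106,
            "south": 0.103, "west": 0.078, "east": 0.076}
BRAIN = {"Paris": 0.133, "Moscow": 0.108, "Germany": 0.075,
         "north": 0.042, "Russia": 0.033, "Poland": 0.025,
         "east": 0.008, "south": 0.000, "west": 0.000}


def strict_order(scores):
    """Pairs (a, b) where a scores strictly higher than b."""
    return {(a, b) for a, b in permutations(scores, 2)
            if scores[a] > scores[b]}


def invariant_order(scores_1, scores_2):
    """Pairs ordered the same way in both data sets (assumed reading of
    'invariant': the intersection of the two strict orders)."""
    return strict_order(scores_1) & strict_order(scores_2)


common = invariant_order(LANGUAGE, BRAIN)
print(len(common), "pairs are ordered the same way in both data sets")
print(("Paris", "Moscow") in common)
```

The intersection of two strict partial orders is again irreflexive, asymmetric and transitive. Pairs on which the two orders disagree, such as Russia and north here, simply become incomparable in the result.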
Brain data
[Figure: another hierarchical cluster tree (similarity tree) computed from the conditional probability density estimates for the classification of 640 brain wave samples for London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia. Leaf order: London, Paris, Moscow, Germany, Poland, Russia, north, south, west, east. Axis ticks: 0.2 to 0.6.]
Every single-trial classification produces its own conditional probability density estimates, giving rise to its own similarity tree ...
[Figure: a second similarity tree. Leaf order: north, east, south, west, London, Paris, Germany, Moscow, Russia, Poland. Axis ticks: 0.1 to 0.5.]
1 We compute M (= 30) single-trial classifications of the data (640 data samples for our 10 words) using random resampling with replacement.
2 For each classification, we find for each word the partial orders of the brain and language data that are significantly correlated.
3 For each of these highly correlated partial order pairs, we find the partial order invariant with respect to both.
We performed 60 classifications; that is, we recomputed the classifications of the brain data using random resampling with replacement.
For half of these 60 classifications we compared the brain data to the WordNet data, and for the other half we compared the brain data to the LSA data.
And we plotted the results.
[Figure: bar chart titled "Partial Orders of Similarity Differences Invariant between EEG-recorded Brain Data and Semantic Representations of Language". Vertical axis: Significant Invariant Partial Orders (0 to 300). Horizontal axis: Test Participants (S10, S24, S25, S27, S16, S26, S12, S13, S18). Series: WordNet, LSA.]
A higher number of significant structural similarities was found between the brain data and the WordNet data than between the brain data and the LSA data.
This stronger structural similarity between the brain data and the WordNet-derived data supports the contention that, during language comprehension for the complex cognitive task of assessing truth or falsity, the representation of words in the brain has a WordNet-like quality.
This stronger structural similarity is the first evidence we have seen of WordNet-like representations in the brain.
Banerjee S, Pedersen T (2003) Extended Gloss Overlaps as a Measure of Semantic Relatedness. In Proceedings of the Eighteenth International Joint Conference on Artificial Intelligence, pp 805-810, August 9-15 2003, Acapulco, Mexico.
Banerjee S, Pedersen T (2002) An Adapted Lesk Algorithm for Word Sense Disambiguation using WordNet. In Proceedings of the Third International Conference on Intelligent Text Processing and Computational Linguistics, pp 136-145, February 17-23 2002, Mexico City.
Baroni M, Murphy B, Barbu E, Poesio M (2010) Strudel: A Corpus-Based Semantic Model Based on Properties and Types. Cognitive Science 34(2), pp 222-254. DOI 10.1111/j.1551-6709.2009.01068.x
Blei D, Ng A, Jordan M (2003) Latent Dirichlet allocation. Journal of Machine Learning Research 3, 993-1022.
Brants T, Franz A (2006) www.ldc.upenn.edu/Catalog/CatalogEntry.jsp?catalogId=LDC2006T13. Linguistic Data Consortium, Philadelphia.
Budanitsky A, Hirst G (2001) Semantic distance in WordNet: an experimental, application-oriented evaluation of five measures. In Proceedings of the NAACL 2001 Workshop on WordNet and Other Lexical Resources, Pittsburgh, June 2001. http://www.seas.smu.edu/~rada/mwnw/papers/WNW-NAACL-101.pdf.gz
Collins AM, Loftus EF (1975) A spreading-activation theory of semantic processing. Psychological Review 82(6), 407-428.
Crestani F (1997) Application of Spreading Activation Techniques in Information Retrieval. Artificial Intelligence Review 11, 453-482.
Deerwester S, Dumais ST, Furnas GW, Landauer TK, Harshman R (1990) Indexing By Latent Semantic Analysis. Journal of the American Society for Information Science 41, 391-407.
Devereux B, Kelly C, Korhonen A (2010) Using fMRI activation to conceptual stimuli to evaluate methods for extracting conceptual representations from corpora. Proceedings of the NAACL HLT 2010 First Workshop on Computational Neurolinguistics, June 2010, Los Angeles, USA. Association for Computational Linguistics, 70-78.
Fellbaum C (Ed) (1998) WordNet: An Electronic Lexical Database. Cambridge, MA: MIT Press.
Hauk O, Davis MH, Ford M, Pulvermüller F, Marslen-Wilson WD (2006) The time course of visual word recognition as revealed by linear regression analysis of ERP data. Neuroimage 2006 May 1; 30(4): 1383-400. Epub 2006 Feb 7.
Hirst G (1988) Resolving lexical ambiguity computationally with spreading activation and Polaroid Words. In Small SL, Cottrell GW, Tanenhaus MK (editors) Lexical ambiguity resolution: Perspectives from psycholinguistics, neuropsychology and artificial intelligence. San Mateo, CA: Morgan Kaufmann Publishers, 1988, 73-107.
Jelodar AB, Alizadeh M, Khadivi S (2010) WordNet Based Features for Predicting Brain Activity associated with meanings of noun. Proceedings of the NAACL HLT 2010 First Workshop on Computational Neurolinguistics, June 2010, Los Angeles, USA. Association for Computational Linguistics, 18-26.
Jiang J, Conrath D (1997) Semantic similarity based on corpus statistics and lexical taxonomy. In Proceedings of International Conference on Research in Computational Linguistics, Taiwan, 19-33.
Just MA, Cherkassky VL, Aryal S, Mitchell TM (2010) A Neurosemantic Theory of Concrete Noun Representation Based on the Underlying Brain Codes. PLoS ONE 5(1): e8622. doi:10.1371/journal.pone.0008622
Karp RM (1972) Reducibility among combinatorial problems. In Complexity of Computer Computations, Proc Sympos IBM Thomas J Watson Res Center, Yorktown Heights, NY (Ed RE Miller and JW Thatcher). New York: Plenum, pp 85-103.
Kelly C, Devereux B, Korhonen A (2010) Acquiring Human-like Feature-Based Conceptual Representations from Corpora. Proceedings of the NAACL HLT 2010 First Workshop on Computational Neurolinguistics, June 2010, Los Angeles, USA. Association for Computational Linguistics, 61-69.
Kriegeskorte N, Mur M, Bandettini PA (2008) Representational similarity analysis - connecting the branches of systems neuroscience. Frontiers in Systems Neuroscience. doi:10.3389/neuro.06.004.2008
Kutas M, Hillyard SA (1984) Brain potentials during reading reflect word expectancy and semantic association. Nature 307, 161-163.
Kutas M, Federmeier KD (2011) Thirty years and counting: Finding meaning in the N400 component of the event related brain potential (ERP). Annual Review of Psychology 62, pp 621-647.
Landauer TK, Dumais ST (1997) A solution to Plato's problem: The Latent Semantic Analysis theory of the acquisition, induction and representation of knowledge. Psychological Review 104, 211-240.
Landauer TK, Foltz PW, Laham D (1998) Introduction to Latent Semantic Analysis. Discourse Processes 25, 259-284.
Leacock C, Chodorow M (1998) Combining local context and WordNet similarity for word sense identification. In Fellbaum (Ed) WordNet: An Electronic Lexical Database. Cambridge, MA: MIT Press, pp 265-283.
Leech G, Rayson P, Wilson A (2001) Word Frequencies in Written and Spoken English: based on the British National Corpus. Longman, London.
Lin D (1998) An information-theoretic definition of similarity. In Proc 15th International Conf on Machine Learning. Morgan Kaufmann, San Francisco, CA, pp 296-304.
Luce RD (1956) Semiorders and a theory of utility discrimination. Econometrica 24, 178-191. MR0078632. http://www.jstor.org/stable/1905751
Matsunaga T, Yonemori C, Tomita E, Muramatsu M (2009) Clique-based data mining for related genes in a biomedical database. BMC Bioinformatics 2009 Jul 1; 10: 205.
Martin Ph (2003) Correction and Extension of WordNet 1.7. ICCS 2003, 11th International Conference on Conceptual Structures (© Springer Verlag, LNAI 2746, pp 160-173), Dresden, Germany, July 21-25 2003.
Miller GA, Nicely P (1955) An analysis of perceptual confusions among some English consonants. J Acoust Soc Am 27(2).
Miller GA, Leacock C, Tengi R, Bunker RT (1993) A Semantic Concordance. In Proceedings of the 3 DARPA Workshop on Human Language Technology.
Miller GA (1995) WordNet: A Lexical Database for English. Communications of the ACM 38(11), 39-41.
Mitchell TM, Shinkareva SV, Carlson A, Chang KM, Malave VL, Mason RA, Just MA (2008) Predicting Human Brain Activity Associated with the Meanings of Nouns. Science 320, 1191, May 30 2008. DOI 10.1126/science.1152876
Murphy B, Baroni M, Poesio M (2009) EEG responds to conceptual stimuli and corpus semantics. In Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing, pages 619-627, Singapore, 6-7 August 2009. ACL and AFNLP.
Murphy B, Poesio M (2010) Detecting semantic category in simultaneous EEG/MEG recordings. In First Workshop on Computational Neurolinguistics, NAACL HLT 2010 (pp 36-44). Los Angeles: Association for Computational Linguistics.
Murphy B, Poesio M, Bovolo F, Bruzzone L, Dalponte M, Lakany H (2011) EEG decoding of semantic category reveals distributed representations for single concepts. Brain Lang 2011 Apr; 117(1): 12-22. doi:10.1016/j.bandl.2010.09.013
Pedersen T, Patwardhan S, Michelizzi J (2004) WordNet::Similarity - Measuring the Relatedness of Concepts. In Proceedings of Fifth Annual Meeting of the North American Chapter of the Association for Computational Linguistics (NAACL-2004), pp 38-41, Boston, May 2004.
Pereira F, Botvinick M, Detre G (2010) Learning semantic features for fMRI data from definitional text. Proceedings of the NAACL HLT 2010 First Workshop on Computational Neurolinguistics, June 2010, Los Angeles, USA. Association for Computational Linguistics, 1-9. http://www.aclweb.org/anthology/W10-0601
Perreau-Guimaraes M, Wong DK, Uy ET, Grosenick L, Suppes P (2007) Single-trial classification of MEG recordings. IEEE Transactions on Biomedical Engineering 54, 436-443.
Rabinovitch I (1977) The Scott-Suppes theorem on semiorders. J Mathematical Psychology 15(2), 209-212. MR0437404.
Resnik P (1995) Using information content to evaluate semantic similarity. In Proceedings of the 14th International Joint Conference on Artificial Intelligence, pages 448-453, Montreal.
Roach BJ, Mathalon DH (2008) Event-related EEG time-frequency analysis: an overview of measures and an analysis of early gamma band phase locking in schizophrenia. Schizophr Bull 2008 Sep; 34(5): 907-26. Epub 2008 Aug 6.
Scott D, Suppes P (1958) Foundational aspects of theories of measurement. The Journal of Symbolic Logic 23, 113-128. MR0115919.
Suppes P (1972) Axiomatic Set Theory. New York: Dover.
Suppes P (1974) The axiomatic method in the empirical sciences. In L Henkin et al. (Eds) Proceedings of the Tarski Symposium, Proceedings of Symposia in Pure Mathematics 25. Providence, RI: American Mathematical Society, pp 465-479.
Suppes P, Lu Z-L, Han B (1997) Brain wave recognition of words. Proceedings of the National Academy of Sciences 94, 14965-14969.
Suppes P, Han B, Lu Z-L (1998) Brain-wave recognition of sentences. Proceedings of the National Academy of Sciences 95, 15861-15866.
Suppes P, Han B, Epelboim J, Lu Z-L (1999a) Invariance between subjects of brain wave representations of language. Proceedings of the National Academy of Sciences 96, 12953-12958.
Suppes P, Han B, Epelboim J, Lu Z-L (1999b) Invariance of brain-wave representations of simple visual images and their names. Proceedings of the National Academy of Sciences 96, 14658-14663.
Suppes P, Han B (2000) Brain-wave representation of words by superposition of a few sine waves. Proceedings of the National Academy of Sciences 97, 8738-8743.
Suppes P, Perreau-Guimaraes M, Wong DK (2009) Partial Orders of Similarity Differences Invariant Between EEG-Recorded Brain and Perceptual Representations of Language. Neural Computation 21, 3228-3269.
Suppes P, de Barros JA, Oas G (2012) Phase-Oscillator computations as neural models of stimulus-response conditioning and response selection. Journal of Mathematical Psychology 56, 95-117.
Turney P (2001) Mining the Web for Synonyms: PMI-IR versus LSA on TOEFL. In L De Raedt & P Flach (Eds) Proceedings of the Twelfth European Conference on Machine Learning (ECML-2001) (pp 491-502). Freiburg, Germany.
Vassilieva E, Pinto G, de Barros JA, Suppes P (2011) Learning Pattern Recognition Through Quasi-Synchronization of Phase Oscillators. IEEE Transactions on Neural Networks 22(1), January 2011.
Vigliocco G, Warren J, Siri S, Arciuli J, Scott S, Wise R (2006) The role of semantics and grammatical class in the neural representation of words. Cereb Cortex 2006 Dec; 16(12): 1790-6. Epub 2006 Jan 18.
Wong DK, Perreau-Guimaraes M, Uy ET, Suppes P (2004) Classification of individual trials based on the best independent component of EEG-recorded sentences. Neurocomputing 61, 479-484.
Wong DK, Uy ET, Perreau-Guimaraes M, Yang W, Suppes P (2006) Interpretation of perceptron weights as constructed time series for EEG classification. Neurocomputing 70, 373-383.
Wong DK, Grosenick L, Uy ET, Perreau-Guimaraes M, Carvalhaes CG, Desain P, Suppes P (2008) Quantifying inter-subject agreement in brain-imaging analyses. NeuroImage 39, 1051-1063.
Woods W (1975) What's in a link: Foundations for semantic networks. In Bobrow D, Collins A, eds, Representation and understanding. New York: Academic Press, 1975, 35-82.
Wu Z, Palmer M (1994) Verb Semantics and Lexical Selection. In Proceedings of the 32nd Annual Meeting of the Association for Computational Linguistics, Las Cruces, New Mexico, pp 133-138.
Paris
east of
Germany
Machine learning approach to the study of brain and language
For the 10 geography words London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia, we have 640 EEG data samples for each participant.
We want to classify these 640 samples into the correct 10 classes.
640 EEG samples for each participant
Use 580 brain samples and their associated words to train the classifier.
Test the remaining 60 samples.
Do this many times, each time using a different set of training samples.
Find the average classification rate; make sure it is statistically significant.
Obtained classification rates in the range 25 to 29%, with a mean classification rate of around 24.5%, p < 10E-10. Significantly higher than chance (10%).
We used a 5-fold linear discriminant model, with principal component analysis for blind source separation, to classify the segments of EEG data obtained from the individual trials.
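The slide does not say which significance test was used. As one plausible check, the sketch below computes an exact one-sided binomial tail probability against the 10% chance level. It assumes, for illustration only, that all 640 samples count as independent trials and that the mean rate on the slide reads as 24.5%. The function name `binomial_tail` is chosen here.

```python
import math


def binomial_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p), with each term computed in log space."""
    log_p, log_q = math.log(p), math.log(1.0 - p)
    total = 0.0
    for j in range(k, n + 1):
        log_term = (math.lgamma(n + 1) - math.lgamma(j + 1)
                    - math.lgamma(n - j + 1)
                    + j * log_p + (n - j) * log_q)
        total += math.exp(log_term)
    return total


n_samples = 640                        # EEG samples per participant
chance = 1.0 / 10                      # ten equally likely classes
n_correct = round(0.245 * n_samples)   # assumed mean rate of 24.5%

p_value = binomial_tail(n_samples, n_correct, chance)
print(f"{n_correct} of {n_samples} correct, one-sided p = {p_value:.3g}")
```

Under these assumptions the tail probability is far below the 10E-10 bound quoted on the slide, which fits the claim that the classification rate is significantly above chance.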
Beyond machine learning ...
For the 10 geography words London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia, we have 640 EEG data samples for each participant.
We want to classify these 640 samples into the correct 10 classes.
640 EEG samples for each participant
Use 580 brain samples and their associated words to train the classifier.
Test the remaining 60 samples.
Do this many times, each time using a different set of training samples.
Find the average classification rate; make sure it is statistically significant.
Obtained classification rates in the range 25 to 29%, with a mean classification rate of around 24.5%, p < 10E-10. Significantly higher than chance (10%).
THEN look at the MIS-CLASSIFICATIONS and build a CONFUSION MATRIX
Classify 640 brain data samples s1, s2, ..., s640 into 10 classes ω1, ω2, ..., ω10 of the finite set A.
M = (m_iq) is the confusion matrix for a given classification, where m_iq is the number of test samples from class ωi classified as belonging to class ωq.
        London  Moscow  Paris   north   south   east    west    Germany Poland  Russia
London  8       14      11      7       6       3       3       10      8       10
Moscow  8       24      14      6       2       6       4       4       7       15
Paris   6       18      12      4       3       5       8       6       11      7
north   4       2       5       11      9       7       10      1       8       3
south   1       4       3       14      14      9       11      4       7       3
east    4       3       5       9       12      12      7       1       4       3
west    4       3       2       12      13      11      10      2       8       5
Germany 2       2       3       2       2       1       0       9       11      8
Poland  2       3       4       0       4       1       4       9       9       4
Russia  7       7       5       1       5       1       1       8       4       11
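Building M from a classifier's output amounts to counting (actual, predicted) pairs. The sketch below shows this with a handful of made-up labels. The label lists are hypothetical and are not taken from the experiment, and the function name is illustrative.

```python
WORDS = ["London", "Moscow", "Paris", "north", "south",
         "east", "west", "Germany", "Poland", "Russia"]


def confusion_matrix(actual, predicted, classes):
    """m[i][q] = number of samples of class i that were classified as class q."""
    index = {c: i for i, c in enumerate(classes)}
    m = [[0] * len(classes) for _ in classes]
    for a, p in zip(actual, predicted):
        m[index[a]][index[p]] += 1
    return m


# Hypothetical test labels, for illustration only.
actual = ["London", "London", "Paris", "north", "north"]
predicted = ["London", "Moscow", "London", "north", "south"]

M = confusion_matrix(actual, predicted, WORDS)
print(M[0][:3])   # counts for actual London vs predicted London, Moscow, Paris
```

The diagonal entries count correct classifications. The off-diagonal entries, the misclassifications, are what the rest of the analysis treats as evidence of similarity between words.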
17 copy Colleen E Crangle 2013
The relative frequencies miq are an N-by-N estimate for the
conditional probability densities minus designated by the matrix P = (piq) minus
that a randomly chosen test sample from class ωi will be classified as
belonging to class ωq
119846119842119850119850
Conditional probability density estimates from the classification of brain wave data for London Moscow Paris north south east west Germany Poland Russia
ωi
ωq ndash predicted classified as
London Moscow Paris north south east west Germany Poland Russia
London 025 0163 0138 0075 0038 005 0063 0088 0075 0063
Moscow 0144 0333 0133 0033 0011 0056 0056 0089 0067 0078
Paris 0175 0188 0125 0038 0038 0038 005 01 0138 0113
north 0067 0017 0017 0283 0167 015 01 01 0067 0033
south 0014 0014 0071 0114 0271 0171 0157 0029 0086 0071
east 005 0 0 0133 015 0383 015 005 0 0083
west 0057 0043 0 0171 0114 0143 0229 0086 01 0057
Germany 0075 0175 01 005 0125 0025 005 0275 0025 01
Poland 015 01 0125 005 005 0 0025 015 025 01
Russia 016 004 008 004 01 002 012 01 008 026
18 copy Colleen E Crangle 2013
02 03 04 05 06
London
Paris
Moscow
Germany
Poland
Russia
north
south
west
east
Hierarchical cluster tree (similarity tree) computed from the conditional probability
density estimates for the classification of 640 brain wave samples for London Moscow Paris north south east west Germany Poland Russia
19 copy Colleen E Crangle 2013
WordNet a lexical database of English organized around sets of cognitive synonyms restricted to the senses that are relevant to the geography sentences Latent Semantic Analysis (LSA) a statistical method to extract measures of word similarity from selected sets of documents such as novels newspaper articles textbooks
LANGUAGE BRAIN
London Moscow Paris north south east west Germany Poland Russia
London 025 0163 0138 0075 0038 005 0063 0088 0075 0063
Moscow 0144 0333 0133 0033 0011 0056 0056 0089 0067 0078
Paris 0175 0188 0125 0038 0038 0038 005 01 0138 0113
north 0067 0017 0017 0283 0167 015 01 01 0067 0033
south 0014 0014 0071 0114 0271 0171 0157 0029 0086 0071
east 005 0 0 0133 015 0383 015 005 0 0083
west 0057 0043 0 0171 0114 0143 0229 0086 01 0057
Germany 0075 0175 01 005 0125 0025 005 0275 0025 01
Poland 015 01 0125 005 005 0 0025 015 025 01
Russia 016 004 008 004 01 002 012 01 008 026
20 copy Colleen E Crangle 2013
WordNet is a human-annotated lexical database of English in which nouns
verbs adjectives and adverbs are grouped into sets of cognitive synonyms
(called synsets) each synset expressing a distinct concept
See Miller (1995) Fellbaum (1997) and httpwordnetprincetonedu
The synsets are related to each other primarily through the hypernymy and
hyponymy relations for nouns Other relations in WordNet are part-whole
(holonym) member of (meronym) has instance and so on
Hyponymy often referred to as the isndasha relation in computational discussions --
is defined as follows a concept represented by a lexical item Li is said to be a
hyponym of the concept represented by a lexical item Lk if native speakers of
English accept sentences of the form An Li is a kind of Lk Conversely Lk is the
hypernym of Li A hypernym is therefore a more general concept and a hyponym
a more specific concept
WORDNET
21 copy Colleen E Crangle 2013
WORDNET
(n) Paris1 City of Light1 French capital1 capital of France1 (the capital and largest city of France and international center of culture and commerce)
(n) Paris2 genus Paris1 (sometimes placed in subfamily Trilliaceae)
(n) Paris3 ((Greek mythology) the prince of Troy who abducted Helen from her husband Menelaus and provoked the Trojan War)
(n) Paris4 (a town in northeastern Texas)
Multiple word senses
Organized into sets of cognitive synonyms called ldquosynsetsrdquo
(n) city1 metropolis1 urban center1 (a large and densely populated urban area may include several independent administrative districts) direct hyponym full hyponym part meronym has instance (n) city2 (an incorporated administrative district established by state charter) (n) city3 metropolis2 (people living in a large densely populated municipality)
(n) east1 due east1 eastward1 E3 (the cardinal compass point that is at 90 degrees) (n) East2 Orient1 (the countries of Asia) (n) East3 eastern United States1 (the region of the United States lying to the north of the Ohio River and to the east of the Mississippi River) (n) east4 (the direction corresponding to the eastward cardinal compass point) (n) east5 (a location in the eastern part of a country region or city)
22 copy Colleen E Crangle 2013
LATENT SEMANTIC ANALYSIS (LSA)
LSA is a statistical technique for extracting, from large collections of documents, a measure of how similar two words are to each other in terms of the patterns of their co-occurrence within those documents. See Deerwester et al. (1990), Landauer and Dumais (1997), Landauer et al. (1998).
The underlying idea is that if, for each word, you take into account all the contexts in which it does and does not appear, you get for all the words a set of mutual constraints that represent how similar any two words are to each other.
The similarity judgments produced by latent semantic analysis have been shown to correspond to some extent to human judgments of similarity. After training on about 2,000 pages of English text, LSA scored as well as average test-takers on the synonym portion of the Test of English as a Foreign Language. After training on a psychology textbook, it achieved a passing score on a multiple-choice exam.
We used the application at http://lsa.colorado.edu to compute similarity matrices in term space for our set of words. The computation was based on ~38,000 college-level texts (novels, newspaper articles, …). A maximum of 300 factors was permitted in the analysis.
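The core of LSA can be sketched in a few lines. This is only a minimal illustration, not the lsa.colorado.edu application: the four-document corpus, the choice of k = 2 factors, and the helper name lsa_similarity are all assumptions made for the example. It builds a term-by-document count matrix, keeps the top k factors of its singular value decomposition, and compares words by the cosine of their reduced vectors.

```python
import numpy as np

# Hypothetical toy corpus (not from the slides): each string is one "document".
docs = [
    "london paris city capital",
    "paris london capital europe",
    "north south east west",
    "east west north direction",
]
vocab = sorted({w for d in docs for w in d.split()})
index = {w: i for i, w in enumerate(vocab)}

# Term-by-document count matrix.
X = np.zeros((len(vocab), len(docs)))
for j, d in enumerate(docs):
    for w in d.split():
        X[index[w], j] += 1

# Truncated SVD: keep only k latent factors (k = 2 here; the slides allowed up to 300).
k = 2
U, s, Vt = np.linalg.svd(X, full_matrices=False)
term_vecs = U[:, :k] * s[:k]  # one k-dimensional vector per word

def lsa_similarity(w1, w2):
    """Cosine similarity between two words in the reduced term space."""
    a, b = term_vecs[index[w1]], term_vecs[index[w2]]
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
```

In this toy corpus, words that share contexts (london, paris) come out far more similar than words that never co-occur (london, north), which is the qualitative behaviour the slide describes.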
         London   Moscow   Paris    north    south    east     west     Germany  Poland   Russia
London   1.00     0.17     0.37     0.13     0.12     0.12     0.14     0.17     0.16     0.16
Moscow   0.17     1.00     0.18     0.13     0.08     0.22     0.14     0.37     0.65     0.69
Paris    0.37     0.18     1.00     0.10     0.04     0.08     0.09     0.50     0.31     0.36
north    0.13     0.13     0.10     1.00     0.89     0.60     0.61     0.07     0.09     0.14
south    0.12     0.08     0.04     0.89     1.00     0.50     0.55     0.02     0.05     0.05
east     0.12     0.22     0.08     0.60     0.50     1.00     0.85     0.22     0.29     0.30
west     0.14     0.14     0.09     0.61     0.55     0.85     1.00     0.24     0.26     0.23
Germany  0.17     0.37     0.50     0.07     0.02     0.22     0.24     1.00     0.85     0.81
Poland   0.16     0.65     0.31     0.09     0.05     0.29     0.26     0.85     1.00     0.87
Russia   0.16     0.69     0.36     0.14     0.05     0.30     0.23     0.81     0.87     1.00
Semantic similarity matrix derived from LSA for the set of words London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia
[Figure: hierarchical cluster tree; axis from 0.1 to 0.6; leaves in order: east, west, north, south, London, Paris, Moscow, Germany, Poland, Russia]
Hierarchical cluster tree computed from the pair-wise Latent Semantic Analysis (LSA) scores of similarity for London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia, based on ~38,000 college-level texts (novels, newspaper articles, …)
LSA provides a straightforward measure of similarity between words.
For WORDNET, several different measures of similarity have been devised, e.g.:
    Path length between synsets.
    Information content: a corpus-based measure of the specificity of a concept, measured in terms of the frequency of occurrence of the concept in a corpus, here the human-annotated, sense-tagged corpus SemCor (Miller et al. 1993), which links every word in the Brown Corpus to its appropriate WordNet sense. Scaled in various ways.
    Vector-space models: these work by forming second-order co-occurrence vectors from the WordNet definitions of concepts, known as glosses.
We used five measures in our computations of similarity and took the average score using each of the relevant senses.
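As an illustration of a path-based measure, the Wu-Palmer score (Wu and Palmer 1994) of two concepts is twice the depth of their lowest common subsumer divided by the sum of their own depths. The sketch below applies that formula to a small made-up is-a hierarchy; the taxonomy, the node names, and the helper functions are assumptions for the example and do not reproduce the real WordNet hierarchy or the scores in the slides.

```python
# Hypothetical toy is-a taxonomy (child -> parent); NOT the real WordNet hierarchy.
PARENT = {
    "entity": None,
    "location": "entity",
    "region": "location",
    "city": "region",
    "country": "region",
    "London": "city",
    "Paris": "city",
    "Germany": "country",
}

def path_to_root(node):
    """List of nodes from `node` up to the root, inclusive."""
    path = []
    while node is not None:
        path.append(node)
        node = PARENT[node]
    return path

def depth(node):
    """Depth counted in nodes, so the root has depth 1."""
    return len(path_to_root(node))

def wup(a, b):
    """Wu-Palmer similarity: 2 * depth(lcs) / (depth(a) + depth(b))."""
    ancestors_a = set(path_to_root(a))
    # Walking up from b, the first node that also subsumes a is the lowest common subsumer.
    lcs = next(n for n in path_to_root(b) if n in ancestors_a)
    return 2 * depth(lcs) / (depth(a) + depth(b))
```

Two cities that share the parent "city" score higher than a city and a country, which only meet at "region".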
Semantic similarity matrix derived from WordNet for the set of words London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia, using senses relevant to the geography of Europe and five measures of similarity: wup (path length), lin and jcn (information content), and gv and pgv (vector space measures)
         London   Moscow   Paris    north    south    east     west     Germany  Poland   Russia
London   1.000    0.396    0.466    0.106    0.103    0.076    0.078    0.322    0.299    0.303
Moscow   0.396    1.000    0.393    0.095    0.094    0.062    0.070    0.286    0.281    0.288
Paris    0.466    0.393    1.000    0.106    0.104    0.074    0.077    0.327    0.308    0.307
north    0.106    0.095    0.106    1.000    0.228    0.179    0.210    0.123    0.132    0.111
south    0.103    0.094    0.104    0.228    1.000    0.172    0.212    0.115    0.107    0.109
east     0.076    0.062    0.074    0.179    0.172    1.000    0.216    0.093    0.080    0.077
west     0.078    0.070    0.077    0.210    0.212    0.216    1.000    0.087    0.082    0.083
Germany  0.322    0.286    0.327    0.123    0.115    0.093    0.087    1.000    0.589    0.409
Poland   0.299    0.281    0.308    0.132    0.107    0.080    0.082    0.589    1.000    0.403
Russia   0.303    0.288    0.307    0.111    0.109    0.077    0.083    0.409    0.403    1.000
[Figure: hierarchical cluster tree; axis from 0.2 to 0.6; leaves in order: Germany, Poland, Russia, London, Paris, Moscow, north, south, west, east]
Hierarchical cluster tree computed from pairwise WordNet-based semantic similarity scores for London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia, restricted to senses related to the geography of Europe
BACK TO THE BRAIN …
London Moscow Paris north south east west Germany Poland Russia
Now let's see how to compare the EEG data and the language data …
Some of the similarity trees show remarkable congruence between the brain and semantic data.
Where exactly does that congruence lie?
Can we devise a quantitative measure of the nature and strength of that congruence?
[Figure: two hierarchical cluster trees side by side, each with axis from 0.2 to 0.6]
BRAIN DATA leaves in order: London, Paris, Moscow, Germany, Poland, Russia, north, south, west, east
LANGUAGE DATA leaves in order: Germany, Poland, Russia, London, Paris, Moscow, north, south, west, east
WordNet-based semantic similarities and EEG conditional probability estimates for London relative to London (L), Moscow (M), Paris (P), north (n), south (s), east (e), west (w), Germany (G), Poland (Po), and Russia (R).
The Spearman rank correlation for the two sequences in the figure is 0.99, with one-sided significance of 1.84E-10.
BRAIN DATA
For each word ω we compute, from the conditional probability density estimates, a ternary relation R such that R(ω, ω1, ω2) if and only if, with respect to word ω, the conditional probability for word ω1 is smaller than the conditional probability for word ω2; that is, if and only if ω1's similarity difference with ω is smaller than ω2's similarity difference with ω.
R is an ordinal relation of similarity differences: a partial order that is irreflexive, asymmetric, and transitive.
LANGUAGE DATA
For each word ω we compute, from the semantic similarity matrix, a ternary relation R such that R(ω, ω1, ω2) if and only if the similarity difference of ω1 with ω is smaller than the similarity difference of ω2 with ω; that is, ω1 is more similar to ω than is ω2.
R is an ordinal relation of similarity differences: a partial order that is irreflexive, asymmetric, and transitive.
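For a fixed reference word, the relation R can be represented as a set of ordered pairs. The sketch below does this for London, using the London row of the WordNet similarity matrix shown in the slides; the function name similarity_order and the set-of-pairs representation are illustrative assumptions, not the authors' implementation.

```python
# WordNet-based similarity of each word to London (London row of the slide's matrix).
london_row = {
    "Moscow": 0.396, "Paris": 0.466, "north": 0.106, "south": 0.103,
    "east": 0.076, "west": 0.078, "Germany": 0.322, "Poland": 0.299,
    "Russia": 0.303,
}

def similarity_order(row):
    """Pairs (w1, w2) such that w1 is strictly more similar to the reference
    word than w2, i.e. R(reference, w1, w2) holds. Ties produce no pair."""
    return {(w1, w2) for w1 in row for w2 in row if row[w1] > row[w2]}

R_london = similarity_order(london_row)
```

Because the pairs come from a strict numerical comparison, the resulting relation is automatically irreflexive, asymmetric, and transitive, as the slide requires.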
London
Language data         Brain data
London    1.000       London    0.275
Paris     0.466       Paris     0.133
Moscow    0.396       Moscow    0.108
Germany   0.322       Germany   0.075
Russia    0.303       north     0.042
Poland    0.299       Russia    0.033
north     0.106       Poland    0.025
south     0.103       east      0.008
west      0.078       south     0.000
east      0.076       west      0.000
Partial orders for London derived from the WordNet semantic similarities of Table 2 and the conditional probability estimates for the brain data of Table 5
[Figure: partial-order diagrams for the two columns; legible node labels: Poland, north, south, west, east (language data) and north, Poland, east, south, west (brain data)]
We follow the approach described in Suppes (1974) for the axiomatization of the theory of differences in utility preference or the theory of differences in psychological intensity.
Suppes, P. (1974). The axiomatic method in the empirical sciences. In L. Henkin et al. (Eds.), Proceedings of the Tarski Symposium, Proceedings of Symposia in Pure Mathematics 25. Providence, RI: American Mathematical Society, pp. 465-479.
Brain data: the relational structure (A, R) constructed from R and the finite set A of classes ω1, ω2, …, ωN, together with the N partial orders constructed from the N-by-N estimate for the conditional probability densities.
Language data: the relational structure (A, R) constructed from R and the finite set A of classes ω1, ω2, …, ωN, together with the N partial orders constructed from the N-by-N similarity matrix.
For each ωi we compare the partial order of the brain data with the partial order of the language data using Spearman's rank correlation coefficient, which we interpret in the usual way to determine whether or not we have a statistically significant correlation.
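The comparison can be sketched with a plain-Python Spearman coefficient (Pearson correlation of the ranks, with tied values given their average rank). The input values are the London language and brain columns listed in the slides; the function names are assumptions, and the significance test is left out of this sketch.

```python
def average_ranks(values):
    """Rank from largest (rank 1) to smallest; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: -values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation computed on the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)

# Values for London, in the order: London, Paris, Moscow, Germany, Russia,
# Poland, north, south, west, east (taken from the London slide).
language = [1.000, 0.466, 0.396, 0.322, 0.303, 0.299, 0.106, 0.103, 0.078, 0.076]
brain = [0.275, 0.133, 0.108, 0.075, 0.033, 0.025, 0.042, 0.000, 0.000, 0.008]
rho = spearman(language, brain)
```

A value of rho close to 1 means the two data sets order the words in nearly the same way relative to London.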
For those instances in which the brain and language partial orders are significantly correlated, we find the partial order that is invariant with respect to the brain and language data.
Here are two more examples.
[Figure: invariant partial order over London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia] Significant Invariance - Paris - Spearman 0.88 (p = 6.6795e-004)
[Figure: invariant partial order over London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia] Significant Invariance - Paris - Spearman 0.90 (p = 3.8716e-004)
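The slides do not spell out how the invariant partial order is constructed. One simple reading, assumed here purely for illustration, is to keep exactly those ordered pairs on which the brain-derived and language-derived orders agree, that is, the intersection of the two relations. The values are the London columns from the earlier slide, with London itself left out.

```python
def strict_order(scores):
    """Pairs (a, b) such that a scores strictly higher than b."""
    return {(a, b) for a in scores for b in scores if scores[a] > scores[b]}

# London-relative values from the slides (language = WordNet, brain = EEG).
language = {"Paris": 0.466, "Moscow": 0.396, "Germany": 0.322, "Russia": 0.303,
            "Poland": 0.299, "north": 0.106, "south": 0.103, "west": 0.078,
            "east": 0.076}
brain = {"Paris": 0.133, "Moscow": 0.108, "Germany": 0.075, "north": 0.042,
         "Russia": 0.033, "Poland": 0.025, "east": 0.008, "south": 0.000,
         "west": 0.000}

# Assumed construction: the invariant order keeps only the pairs both sources agree on.
invariant = strict_order(language) & strict_order(brain)
```

The intersection of two strict partial orders is again a strict partial order, so the result can be drawn as a diagram in the same way as the originals. Pairs on which the two sources disagree, such as Russia versus north, simply drop out.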
Brain data
Another hierarchical cluster tree (similarity tree) computed from the conditional probability density estimates for the classification of 640 brain wave samples for London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia.
[Figure: hierarchical cluster tree; axis from 0.2 to 0.6; leaves in order: London, Paris, Moscow, Germany, Poland, Russia, north, south, west, east]
Every single-trial classification produces its own conditional probability density estimates, giving rise to its own similarity tree …
[Figure: hierarchical cluster tree; axis from 0.1 to 0.5; leaves in order: north, east, south, west, London, Paris, Germany, Moscow, Russia, Poland]
1. We compute M (= 30) single-trial classifications of the data (640 data samples for our 10 words) using random resampling with replacement.
2. For each classification we find, for each word, the partial orders of the brain and language data that are significantly correlated.
3. For each of these highly correlated partial order pairs, we find the partial order invariant with respect to both.
We performed 60 classifications; that is, we recomputed the classifications of the brain data using random resampling with replacement.
For half of these 60 classifications we compared the brain data to the WordNet data, and for the other half we compared the brain data to the LSA data.
And we plotted the results.
[Figure: bar chart. Title: Partial Orders of Similarity Differences Invariant between EEG-recorded Brain Data and Semantic Representations of Language. Vertical axis: Significant Invariant Partial Orders (0 to 300). Horizontal axis: Test Participants S10, S24, S25, S27, S16, S26, S12, S13, S18. Series: WordNet, LSA.]
A higher number of significant structural similarities was found between the brain data and the WordNet data than between the brain data and the LSA data.
This stronger structural similarity between the brain data and the WordNet-derived data supports the contention that, during language comprehension for the complex cognitive task of assessing truth or falsity, the representation of words in the brain has a WordNet-like quality.
This stronger structural similarity is the first evidence we have seen of WordNet-like representations in the brain.
Banerjee, S., and Pedersen, T. (2003). Extended Gloss Overlaps as a Measure of Semantic Relatedness. In Proceedings of the Eighteenth International Joint Conference on Artificial Intelligence, pp. 805-810, August 9-15, 2003, Acapulco, Mexico.
Banerjee, S., and Pedersen, T. (2002). An Adapted Lesk Algorithm for Word Sense Disambiguation using WordNet. In Proceedings of the Third International Conference on Intelligent Text Processing and Computational Linguistics, pp. 136-145, February 17-23, 2002, Mexico City.
Baroni, M., Murphy, B., Barbu, E., Poesio, M. (2010). Strudel: A Corpus-Based Semantic Model Based on Properties and Types. Cognitive Science 34(2), 222-254. DOI: 10.1111/j.1551-6709.2009.01068.x
Blei, D., Ng, A., & Jordan, M. (2003). Latent Dirichlet allocation. Journal of Machine Learning Research 3, 993-1022.
Brants, T., and Franz, A. (2006). www.ldc.upenn.edu/Catalog/CatalogEntry.jsp?catalogId=LDC2006T13. Linguistic Data Consortium, Philadelphia.
Budanitsky, A., Hirst, G. (2001). Semantic distance in WordNet: an experimental, application-oriented evaluation of five measures. In Proceedings of the NAACL 2001 Workshop on WordNet and Other Lexical Resources, Pittsburgh, June 2001. http://www.seas.smu.edu/~rada/mwnw/papers/WNW-NAACL-101.pdf.gz
Collins, A.M., Loftus, E.F. (1975). A spreading-activation theory of semantic processing. Psychological Review 82(6), 407-428.
Crestani, F. (1997). Application of Spreading Activation Techniques in Information Retrieval. Artificial Intelligence Review 11, 453-482.
Deerwester, S., Dumais, S.T., Furnas, G.W., Landauer, T.K., Harshman, R. (1990). Indexing By Latent Semantic Analysis. Journal of the American Society for Information Science 41, 391-407.
Devereux, B., Kelly, C., Korhonen, A. (2010). Using fMRI activation to conceptual stimuli to evaluate methods for extracting conceptual representations from corpora. Proceedings of the NAACL HLT 2010 First Workshop on Computational Neurolinguistics, June 2010, Los Angeles, USA. Association for Computational Linguistics, 70-78.
Fellbaum, C. (Ed.) (1998). WordNet: An Electronic Lexical Database. Cambridge, MA: MIT Press.
Hauk, O., Davis, M.H., Ford, M., Pulvermüller, F., Marslen-Wilson, W.D. (2006). The time course of visual word recognition as revealed by linear regression analysis of ERP data. Neuroimage 2006 May 1; 30(4): 1383-400. Epub 2006 Feb 7.
Hirst, G. (1988). Resolving lexical ambiguity computationally with spreading activation and Polaroid Words. In Small, S.L., Cottrell, G.W., Tanenhaus, M.K. (Eds.), Lexical ambiguity resolution: Perspectives from psycholinguistics, neuropsychology, and artificial intelligence. San Mateo, CA: Morgan Kaufmann Publishers, 1988, 73-107.
Jelodar, A.B., Alizadeh, M., Khadivi, S. (2010). WordNet Based Features for Predicting Brain Activity associated with meanings of noun. Proceedings of the NAACL HLT 2010 First Workshop on Computational Neurolinguistics, June 2010, Los Angeles, USA. Association for Computational Linguistics, 18-26.
Jiang, J., and Conrath, D. (1997). Semantic similarity based on corpus statistics and lexical taxonomy. In Proceedings of International Conference on Research in Computational Linguistics, Taiwan, 19-33.
Just, M.A., Cherkassky, V.L., Aryal, S., Mitchell, T.M. (2010). A Neurosemantic Theory of Concrete Noun Representation Based on the Underlying Brain Codes. PLoS ONE 5(1): e8622. doi:10.1371/journal.pone.0008622
Karp, R.M. (1972). Reducibility among combinatorial problems. In Complexity of Computer Computations, Proc. Sympos. IBM Thomas J. Watson Res. Center, Yorktown Heights, NY (Ed. R.E. Miller and J.W. Thatcher). New York: Plenum, pp. 85-103.
Kelly, C., Devereux, B., Korhonen, A. (2010). Acquiring Human-like Feature-Based Conceptual Representations from Corpora. Proceedings of the NAACL HLT 2010 First Workshop on Computational Neurolinguistics, June 2010, Los Angeles, USA. Association for Computational Linguistics, 61-69.
Kriegeskorte, N., Mur, M., Bandettini, P.A. (2008). Representational similarity analysis - connecting the branches of systems neuroscience. Frontiers in Systems Neuroscience. doi:10.3389/neuro.06.004.2008
Kutas, M., Hillyard, S.A. (1984). Brain potentials during reading reflect word expectancy and semantic association. Nature 307, 161-163.
Kutas, M., Federmeier, K.D. (2011). Thirty years and counting: Finding meaning in the N400 component of the event related brain potential (ERP). Annual Review of Psychology 62, 621-647.
Landauer, T.K., Dumais, S.T. (1997). A solution to Plato's problem: The Latent Semantic Analysis theory of the acquisition, induction, and representation of knowledge. Psychological Review 104, 211-240.
Landauer, T.K., Foltz, P.W., Laham, D. (1998). Introduction to Latent Semantic Analysis. Discourse Processes 25, 259-284.
Leacock, C., Chodorow, M. (1998). Combining local context and WordNet similarity for word sense identification. In Fellbaum (Ed.), WordNet: An Electronic Lexical Database. Cambridge, MA: MIT Press, pp. 265-283.
Leech, G., Rayson, P., Wilson, A. (2001). Word Frequencies in Written and Spoken English: based on the British National Corpus. Longman, London.
Lin, D. (1998). An information-theoretic definition of similarity. In Proc. 15th International Conf. on Machine Learning. Morgan Kaufmann, San Francisco, CA, pp. 296-304.
Luce, R.D. (1956). Semiorders and a theory of utility discrimination. Econometrica 24, 178-191. MR0078632. http://www.jstor.org/stable/1905751
Matsunaga, T., Yonemori, C., Tomita, E., Muramatsu, M. (2009). Clique-based data mining for related genes in a biomedical database. BMC Bioinformatics 2009 Jul 1; 10: 205.
Martin, Ph. (2003). Correction and Extension of WordNet 1.7. ICCS 2003, 11th International Conference on Conceptual Structures (Springer Verlag, LNAI 2746, pp. 160-173), Dresden, Germany, July 21-25, 2003.
Miller, G.A., Nicely, P. (1955). An analysis of perceptual confusions among some English consonants. J. Acoust. Soc. Am. 27(2).
Miller, G.A., Leacock, C., Tengi, R., Bunker, R.T. (1993). A Semantic Concordance. In Proceedings of the 3rd DARPA Workshop on Human Language Technology.
Miller, G.A. (1995). WordNet: A Lexical Database for English. Communications of the ACM 38(11), 39-41.
Mitchell, T.M., Shinkareva, S.V., Carlson, A., Chang, K.M., Malave, V.L., Mason, R.A., Just, M.A. (2008). Predicting Human Brain Activity Associated with the Meanings of Nouns. Science 320, 1191, May 30, 2008. DOI: 10.1126/science.1152876
Murphy, B., Baroni, M., & Poesio, M. (2009). EEG responds to conceptual stimuli and corpus semantics. In Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing, pp. 619-627, Singapore, 6-7 August 2009. ACL and AFNLP.
Murphy, B., & Poesio, M. (2010). Detecting semantic category in simultaneous EEG/MEG recordings. In First Workshop on Computational Neurolinguistics, NAACL HLT 2010 (pp. 36-44). Los Angeles: Association for Computational Linguistics.
Murphy, B., Poesio, M., Bovolo, F., Bruzzone, L., Dalponte, M., Lakany, H. (2011). EEG decoding of semantic category reveals distributed representations for single concepts. Brain Lang. 2011 Apr; 117(1): 12-22. doi:10.1016/j.bandl.2010.09.013
Pedersen, T., Patwardhan, S., Michelizzi, J. (2004). WordNet::Similarity - Measuring the Relatedness of Concepts. In Proceedings of Fifth Annual Meeting of the North American Chapter of the Association for Computational Linguistics (NAACL-2004), pp. 38-41, Boston, May 2004.
Pereira, F., Botvinick, M., Detre, G. (2010). Learning semantic features for fMRI data from definitional text. Proceedings of the NAACL HLT 2010 First Workshop on Computational Neurolinguistics, June 2010, Los Angeles, USA. Association for Computational Linguistics, 1-9. http://www.aclweb.org/anthology/W10-0601
Perreau-Guimaraes, M., Wong, D.K., Uy, E.T., Grosenick, L., Suppes, P. (2007). Single-trial classification of MEG recordings. IEEE Transactions on Biomedical Engineering 54, 436-443.
Rabinovitch, I. (1977). The Scott-Suppes theorem on semiorders. J. Mathematical Psychology 15(2), 209-212. MR0437404.
Resnik, P. (1995). Using information content to evaluate semantic similarity. In Proceedings of the 14th International Joint Conference on Artificial Intelligence, pp. 448-453, Montreal.
Roach, B.J., Mathalon, D.H. (2008). Event-related EEG time-frequency analysis: an overview of measures and an analysis of early gamma band phase locking in schizophrenia. Schizophr Bull. 2008 Sep; 34(5): 907-26. Epub 2008 Aug 6.
Scott, D., Suppes, P. (1958). Foundational aspects of theories of measurement. The Journal of Symbolic Logic 23, 113-128. MR0115919.
Suppes, P. (1972). Axiomatic Set Theory. New York: Dover.
Suppes, P. (1974). The axiomatic method in the empirical sciences. In L. Henkin et al. (Eds.), Proceedings of the Tarski Symposium, Proceedings of Symposia in Pure Mathematics 25. Providence, RI: American Mathematical Society, pp. 465-479.
Suppes, P., Lu, Z.-L., Han, B. (1997). Brain wave recognition of words. Proceedings of the National Academy of Sciences 94, 14965-14969.
Suppes, P., Han, B., Lu, Z.-L. (1998). Brain-wave recognition of sentences. Proceedings of the National Academy of Sciences 95, 15861-15866.
Suppes, P., Han, B., Epelboim, J., Lu, Z.-L. (1999a). Invariance between subjects of brain wave representations of language. Proceedings of the National Academy of Sciences 96, 12953-12958.
Suppes, P., Han, B., Epelboim, J., Lu, Z.-L. (1999b). Invariance of brain-wave representations of simple visual images and their names. Proceedings of the National Academy of Sciences 96, 14658-14663.
Suppes, P., Han, B. (2000). Brain-wave representation of words by superposition of a few sine waves. Proceedings of the National Academy of Sciences 97, 8738-8743.
Suppes, P., Perreau-Guimaraes, M., Wong, D.K. (2009). Partial Orders of Similarity Differences Invariant Between EEG-Recorded Brain and Perceptual Representations of Language. Neural Computation 21, 3228-3269.
Suppes, P., de Barros, J.A., Oas, G. (2012). Phase-Oscillator computations as neural models of stimulus-response conditioning and response selection. Journal of Mathematical Psychology 56, 95-117.
Turney, P. (2001). Mining the Web for Synonyms: PMI-IR versus LSA on TOEFL. In L. De Raedt & P. Flach (Eds.), Proceedings of the Twelfth European Conference on Machine Learning (ECML-2001) (pp. 491-502). Freiburg, Germany.
Vassilieva, E., Pinto, G., de Barros, J.A., Suppes, P. (2011). Learning Pattern Recognition Through Quasi-Synchronization of Phase Oscillators. IEEE Transactions on Neural Networks 22(1), January 2011.
Vigliocco, G., Warren, J., Siri, S., Arciuli, J., Scott, S., Wise, R. (2006). The role of semantics and grammatical class in the neural representation of words. Cereb Cortex 2006 Dec; 16(12): 1790-6. Epub 2006 Jan 18.
Wong, D.K., Perreau-Guimaraes, M., Uy, E.T., Suppes, P. (2004). Classification of individual trials based on the best independent component of EEG-recorded sentences. Neurocomputing 61, 479-484.
Wong, D.K., Uy, E.T., Perreau-Guimaraes, M., Yang, W., Suppes, P. (2006). Interpretation of perceptron weights as constructed time series for EEG classification. Neurocomputing 70, 373-383.
Wong, D.K., Grosenick, L., Uy, E.T., Perreau-Guimaraes, M., Carvalhaes, C.G., Desain, P., Suppes, P. (2008). Quantifying inter-subject agreement in brain-imaging analyses. NeuroImage 39, 1051-1063.
Woods, W. (1975). What's in a link: Foundations for semantic networks. In Bobrow, D., Collins, A. (Eds.), Representation and understanding. New York: Academic Press, 1975, 35-82.
Wu, Z., Palmer, M. (1994). Verb Semantics and Lexical Selection. In Proceedings of the 32nd Annual Meeting of the Association for Computational Linguistics, Las Cruces, New Mexico, pp. 133-138.
Machine learning approach to the study of brain and language
For the 10 geography words London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia we have 640 EEG data samples for each participant.
We want to classify these 640 samples into the correct 10 classes.
640 EEG samples for each participant:
Use 580 brain samples and their associated words to train the classifier.
Test the remaining 60 samples.
Do this many times, each time using a different set of training samples.
Find the average classification rate; make sure it is statistically significant.
Obtained classification rates in the range 25% to 29%, with a mean classification rate of around 24.5%, p < 10E−10. Significantly higher than chance (10%).
We used a 5-fold linear discriminant model with principal component analysis for blind source separation to classify the segments of EEG data obtained from the individual trials.
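The repeated train/test procedure can be sketched as follows. This shows only the splitting and averaging steps; the classifier itself (linear discriminant with principal component analysis) is not reproduced. The function names, the number of repetitions (30), and the fixed seed are assumptions made for the example.

```python
import random

def holdout_splits(n_samples=640, n_test=60, n_repeats=30, seed=0):
    """Yield (train_indices, test_indices) pairs, each from a fresh random shuffle.

    n_repeats and seed are illustrative choices, not values from the slides.
    """
    rng = random.Random(seed)
    for _ in range(n_repeats):
        idx = list(range(n_samples))
        rng.shuffle(idx)
        yield idx[n_test:], idx[:n_test]  # 580 for training, 60 for testing

def mean_rate(rates):
    """Average classification rate over all repetitions."""
    return sum(rates) / len(rates)

splits = list(holdout_splits())
```

Each repetition would train on the 580 training indices, classify the 60 held-out samples, and record the fraction classified correctly; mean_rate then gives the figure that is compared against the 10% chance level.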
Beyond machine learning …
THEN look at the MIS-CLASSIFICATIONS and build a CONFUSION MATRIX
Classify 640 brain data samples s1, s2, …, s640 into 10 classes ω1, ω2, …, ω10 of the finite set A.
M = (m_iq) is the confusion matrix for a given classification, where m_iq is the number of test samples from class ωi classified as belonging to class ωq.
         London   Moscow   Paris    north    south    east     west     Germany  Poland   Russia
London   8        14       11       7        6        3        3        10       8        10
Moscow   8        24       14       6        2        6        4        4        7        15
Paris    6        18       12       4        3        5        8        6        11       7
north    4        2        5        11       9        7        10       1        8        3
south    1        4        3        14       14       9        11       4        7        3
east     4        3        5        9        12       12       7        1        4        3
west     4        3        2        12       13       11       10       2        8        5
Germany  2        2        3        2        2        1        0        9        11       8
Poland   2        3        4        0        4        1        4        9        9        4
Russia   7        7        5        1        5        1        1        8        4        11
The relative frequencies of the m_iq are an N-by-N estimate for the conditional probability densities, designated by the matrix P = (p_iq), that a randomly chosen test sample from class ωi will be classified as belonging to class ωq:
$$p_{iq} = \frac{m_{iq}}{\sum_{q'=1}^{N} m_{iq'}}$$
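Row normalization of a confusion matrix can be sketched as below, using the London row of the confusion matrix shown in the slides. The function name row_normalize is an assumption. The probability table printed in the slides appears to come from a different classification run, so the numbers here are not expected to match it.

```python
labels = ["London", "Moscow", "Paris", "north", "south",
          "east", "west", "Germany", "Poland", "Russia"]

# London row of the confusion matrix from the slides: counts of London test
# samples classified as each of the ten words.
london_counts = [8, 14, 11, 7, 6, 3, 3, 10, 8, 10]

def row_normalize(counts):
    """Turn one row of confusion counts into relative frequencies p_iq."""
    total = sum(counts)
    return [c / total for c in counts]

p_london = dict(zip(labels, row_normalize(london_counts)))
```

Each normalized row sums to 1, so it can be read as an estimate of the probability that a London sample is classified as each word.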
Conditional probability density estimates from the classification of brain wave data for London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia
Rows: ωi (actual class). Columns: ωq (predicted, "classified as").
         London   Moscow   Paris    north    south    east     west     Germany  Poland   Russia
London   0.250    0.163    0.138    0.075    0.038    0.050    0.063    0.088    0.075    0.063
Moscow   0.144    0.333    0.133    0.033    0.011    0.056    0.056    0.089    0.067    0.078
Paris    0.175    0.188    0.125    0.038    0.038    0.038    0.050    0.100    0.138    0.113
north    0.067    0.017    0.017    0.283    0.167    0.150    0.100    0.100    0.067    0.033
south    0.014    0.014    0.071    0.114    0.271    0.171    0.157    0.029    0.086    0.071
east     0.050    0.000    0.000    0.133    0.150    0.383    0.150    0.050    0.000    0.083
west     0.057    0.043    0.000    0.171    0.114    0.143    0.229    0.086    0.100    0.057
Germany  0.075    0.175    0.100    0.050    0.125    0.025    0.050    0.275    0.025    0.100
Poland   0.150    0.100    0.125    0.050    0.050    0.000    0.025    0.150    0.250    0.100
Russia   0.160    0.040    0.080    0.040    0.100    0.020    0.120    0.100    0.080    0.260
[Figure: hierarchical cluster tree; axis from 0.2 to 0.6; leaves in order: London, Paris, Moscow, Germany, Poland, Russia, north, south, west, east]
Hierarchical cluster tree (similarity tree) computed from the conditional probability density estimates for the classification of 640 brain wave samples for London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia
LANGUAGE
WordNet: a lexical database of English organized around sets of cognitive synonyms, restricted to the senses that are relevant to the geography sentences.
Latent Semantic Analysis (LSA): a statistical method to extract measures of word similarity from selected sets of documents, such as novels, newspaper articles, textbooks.
BRAIN: the conditional probability density estimates from the classification of the brain wave data.
WORDNET
WordNet is a human-annotated lexical database of English in which nouns, verbs, adjectives, and adverbs are grouped into sets of cognitive synonyms (called synsets), each synset expressing a distinct concept.
See Miller (1995), Fellbaum (1998), and http://wordnet.princeton.edu
The synsets are related to each other primarily through the hypernymy and hyponymy relations for nouns. Other relations in WordNet are part-whole (holonym), member of (meronym), has instance, and so on.
Hyponymy, often referred to as the is-a relation in computational discussions, is defined as follows: a concept represented by a lexical item Li is said to be a hyponym of the concept represented by a lexical item Lk if native speakers of English accept sentences of the form "An Li is a kind of Lk". Conversely, Lk is the hypernym of Li. A hypernym is therefore a more general concept and a hyponym a more specific concept.
WORDNET
Multiple word senses
(n) Paris#1, City of Light#1, French capital#1, capital of France#1 (the capital and largest city of France; and international center of culture and commerce)
(n) Paris#2, genus Paris#1 (sometimes placed in subfamily Trilliaceae)
(n) Paris#3 ((Greek mythology) the prince of Troy who abducted Helen from her husband Menelaus and provoked the Trojan War)
(n) Paris#4 (a town in northeastern Texas)
Organized into sets of cognitive synonyms called ldquosynsetsrdquo
(n) city1 metropolis1 urban center1 (a large and densely populated urban area may include several independent administrative districts) direct hyponym full hyponym part meronym has instance (n) city2 (an incorporated administrative district established by state charter) (n) city3 metropolis2 (people living in a large densely populated municipality)
(n) east1 due east1 eastward1 E3 (the cardinal compass point that is at 90 degrees) (n) East2 Orient1 (the countries of Asia) (n) East3 eastern United States1 (the region of the United States lying to the north of the Ohio River and to the east of the Mississippi River) (n) east4 (the direction corresponding to the eastward cardinal compass point) (n) east5 (a location in the eastern part of a country region or city)
22 copy Colleen E Crangle 2013
LATENT SEMANTIC ANALYSIS (LSA)
LSA is a statistical technique for extracting from large collections of documents a measure of how similar two words are to each other in terms of patterns of their co-occurrences within those documents See Deerwester et al 1990 Landauer and Dumais 1997 Landauer et al 1998
The underlying idea is that if for each word you take into account all the contexts in which it does and does not appear
you get for all the words a set of mutual constraints that represent how similar any two words are to each other
The similarity judgments produced by latent semantic analysis have been shown to correspond to some extent to human judgments of similarity After training on about 2000 pages of English text it scored as well as average test-takers on the synonym portion of the Test of English as a Foreign Language After training on a psychology textbook it achieved a passing score on a multiple-choice exam We used the application at httplsacoloradoedu to compute similarity matrices in term space for our set of words The computation was based on ~ 38000 college-level texts (novels newspaper articleshellip) A maximum of 300 factors was permitted in the analysis
23 copy Colleen E Crangle 2013
London Moscow Paris north south east west Germany Poland Russia
London 1 017 037 013 012 012 014 017 016 016
Moscow 017 1 018 013 008 022 014 037 065 069
Paris 037 018 1 01 004 008 009 05 031 036
north 013 013 01 1 089 06 061 007 009 014
south 012 008 004 089 1 05 055 002 005 005
east 012 022 008 06 05 1 085 022 029 03
west 014 014 009 061 055 085 1 024 026 023
Germany 017 037 05 007 002 022 024 1 085 081
Poland 016 065 031 009 005 029 026 085 1 087
Russia 016 069 036 014 005 03 023 081 087 1
Semantic similarity matrix derived from LSA for the set of words
London Moscow Paris north south east west Germany
Poland Russia
24 copy Colleen E Crangle 2013
01 02 03 04 05 06
east
west
north
south
London
Paris
Moscow
Germany
Poland
Russia
Hierarchical cluster tree computed from the pair-wise Latent Semantic
Analysis (LSA) scores of similarity for London Moscow Paris north
south east west Germany Poland Russia based on ~ 38000 college-
level texts (novels newspaper articleshellip)
25 copy Colleen E Crangle 2013
LSA provides straightforward measure of similarity between words
For WORDNET several different measures of similarity have been devised Eg Path length between synsets Information content a corpusndashbased measure of the specificity of a concept measured in terms of the frequency of occurrence of the concept in the corpus the human-annotated sensendashtagged corpus SemCor (Miller et al 1993) which links every word in the Brown Corpus to its appropriate WordNet sense Scaled various ways Vector-space models -- works by forming second-order co-occurrence vectors from the WordNet definitionsof concepts known as glosses We used five measures in our computations of similarity and took the average score using each of the relevant senses
26 copy Colleen E Crangle 2013
Semantic similarity matrix derived from WordNet for the set of
words London Moscow Paris north south east west Germany
Poland Russia using senses relevant to the geography of Europe
and five measures of similarity wup (path length) lin and jcn
(information content) and gv and pgv (vector space measures)
London Moscow Paris north south east west Germany Poland Russia
London 1 0396 0466 0106 0103 0076 0078 0322 0299 0303
Moscow 0396 1 0393 0095 0094 0062 007 0286 0281 0288
Paris 0466 0393 1 0106 0104 0074 0077 0327 0308 0307
north 0106 0095 0106 1 0228 0179 021 0123 0132 0111
south 0103 0094 0104 0228 1 0172 0212 0115 0107 0109
east 0076 0062 0074 0179 0172 1 0216 0093 008 0077
west 0078 007 0077 021 0212 0216 1 0087 0082 0083
Germany 0322 0286 0327 0123 0115 0093 0087 1 0589 0409
Poland 0299 0281 0308 0132 0107 008 0082 0589 1 0403
Russia 0303 0288 0307 0111 0109 0077 0083 0409 0403 1
27 copy Colleen E Crangle 2013
02 03 04 05 06
Germany
Poland
Russia
London
Paris
Moscow
north
south
west
east
Hierarchical cluster tree computed from pairwise WordNet-based semantic
similarity scores for London Moscow Paris north south east west
Germany Poland Russia restricted to senses related to the geography of
Europe
28 copy Colleen E Crangle 2013
BACK TO THE BRAIN hellip
London Moscow Paris north south
east west Germany Poland Russia
29 copy Colleen E Crangle 2013
Now letrsquos see how to compare the EEG data
and the language datahellip
30 copy Colleen E Crangle 2013
Some of the similarity trees show remarkable congruence between the brain and semantic data.
Where exactly does that congruence lie?
Can we devise a quantitative measure of the nature and strength of that congruence?
[Figure: two similarity trees, labeled LANGUAGE DATA and BRAIN DATA, each with axis ticks 0.2 to 0.6. Leaf orders as extracted: London, Paris, Moscow, Germany, Poland, Russia, north, south, west, east; and Germany, Poland, Russia, London, Paris, Moscow, north, south, west, east.]
[Figure: WordNet-based semantic similarities and EEG conditional probability estimates for London relative to London (L), Moscow (M), Paris (P), north (n), south (s), east (e), west (w), Germany (G), Poland (Po), and Russia (R).]
The Spearman rank correlation for the two sequences in the figure is 0.99, with one-sided significance of 1.84E-10.
BRAIN DATA
For each word ω we compute from the conditional probability density estimates a ternary relation R such that R(ω, ω1, ω2) if and only if, with respect to word ω, the conditional probability for word ω1 is smaller than the conditional probability for word ω2; that is, if and only if ω1's similarity difference with ω is smaller than ω2's similarity difference with ω.
R is an ordinal relation of similarity differences: a partial order that is irreflexive, asymmetric, and transitive.
LANGUAGE DATA
For each word ω we compute from the semantic similarity matrix a ternary relation R such that R(ω, ω1, ω2) if and only if the similarity difference of ω1 with ω is smaller than the similarity difference of ω2 with ω; that is, ω1 is more similar to ω than is ω2.
R is an ordinal relation of similarity differences: a partial order that is irreflexive, asymmetric, and transitive.
London
Language data        Brain data
London   1.000       London   0.275
Paris    0.466       Paris    0.133
Moscow   0.396       Moscow   0.108
Germany  0.322       Germany  0.075
Russia   0.303       north    0.042
Poland   0.299       Russia   0.033
north    0.106       Poland   0.025
south    0.103       east     0.008
west     0.078       south    0.000
east     0.076       west     0.000
Partial orders for London derived from the WordNet semantic similarities of Table 2 and the conditional probability estimates for the brain data of Table 5.
Following the approach described in Suppes (1974) for the axiomatization of the theory of differences in utility preference or the theory of differences in psychological intensity.
Suppes P (1974) The axiomatic method in the empirical sciences. In L Henkin et al (Eds) Proceedings of the Tarski Symposium, Proceedings of Symposia in Pure Mathematics 25. Providence, RI: American Mathematical Society, pp 465-479.
Brain data: the relational structure (A, R) constructed from R and the finite set A of classes ω1, ω2, …, ωN, together with the N partial orders constructed from the N-by-N estimate for the conditional probability densities.
Language data: the relational structure (A, R) constructed from R and the finite set A of classes ω1, ω2, …, ωN, together with the N partial orders constructed from the N-by-N similarity matrix.
For each word ωi we compare the partial order of the brain data with the partial order of the language data using Spearman's rank correlation coefficient, which we interpret in the usual way to determine whether or not we have a statistically significant correlation.
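The comparison step can be sketched in plain Python as a rank correlation with tied values sharing their mean rank. The input values are the London language and brain scores listed on the earlier slide; the function names are illustrative, the significance test is omitted, and the coefficient printed is simply what these ten pairs yield, with no claim that it reproduces the correlation quoted for the figure.

```python
def average_ranks(values):
    """Rank values from largest (rank 1) to smallest; ties share the mean rank."""
    order = sorted(range(len(values)), key=lambda i: -values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks

def spearman(x, y):
    """Spearman's rho computed as the Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5

# Word order: London, Paris, Moscow, Germany, Russia, Poland, north, south, west, east
language = [1.000, 0.466, 0.396, 0.322, 0.303, 0.299, 0.106, 0.103, 0.078, 0.076]
brain = [0.275, 0.133, 0.108, 0.075, 0.033, 0.025, 0.042, 0.000, 0.000, 0.008]

rho = spearman(language, brain)
print(round(rho, 3))
```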
For those instances in which the brain and language partial orders are significantly correlated, we find the partial order that is invariant with respect to the brain and language data. Here are two more examples:
[Figure: invariant partial order over London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia. Significant Invariance - Paris - Spearman 0.88 (p = 6.6795e-004)]
[Figure: invariant partial order over London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia. Significant Invariance - Paris - Spearman 0.90 (p = 3.8716e-004)]
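One simple reading of "invariant" is the set of ordered pairs on which both orderings agree. The sketch below takes that reading as an assumption; the construction actually used is the one in Suppes et al. (2009) and may differ. The scores are the London values from the earlier slide (with London itself left out), and the helper names are illustrative.

```python
from itertools import permutations

def strict_order(scores):
    """Pairs (a, b) such that a scores strictly higher than b."""
    return {(a, b) for a, b in permutations(scores, 2) if scores[a] > scores[b]}

def invariant_pairs(scores_1, scores_2):
    """Ordered pairs on which both score tables agree."""
    return strict_order(scores_1) & strict_order(scores_2)

language = {"Paris": 0.466, "Moscow": 0.396, "Germany": 0.322, "Russia": 0.303,
            "Poland": 0.299, "north": 0.106, "south": 0.103, "west": 0.078,
            "east": 0.076}
brain = {"Paris": 0.133, "Moscow": 0.108, "Germany": 0.075, "north": 0.042,
         "Russia": 0.033, "Poland": 0.025, "east": 0.008, "south": 0.000,
         "west": 0.000}

common = invariant_pairs(language, brain)
print(("Paris", "Moscow") in common)   # both tables rank Paris above Moscow
print(("Russia", "north") in common)   # the tables disagree on this pair
```

Because the intersection of two strict orders is itself irreflexive, asymmetric, and transitive, the result is again a partial order; pairs on which the tables disagree, or which are tied in either table, are simply left unordered.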
[Figure: Another hierarchical cluster tree (similarity tree) computed from the conditional probability density estimates for the classification of 640 brain wave samples for London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia. Leaf order: London, Paris, Moscow, Germany, Poland, Russia, north, south, west, east. Axis ticks: 0.2 to 0.6.]
Every single-trial classification produces its own conditional probability density estimates, giving rise to its own similarity tree …
[Figure: Brain data similarity tree. Leaf order: north, east, south, west, London, Paris, Germany, Moscow, Russia, Poland. Axis ticks: 0.1 to 0.5.]
1. We compute M (= 30) single-trial classifications of the data (640 data samples for our 10 words) using random resampling with replacement.
2. For each classification we find, for each word, the partial orders of the brain and language data that are significantly correlated.
3. For each of these highly correlated partial-order pairs we find the partial order invariant with respect to both.
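Step 1 relies on random resampling with replacement. A minimal sketch of drawing such resamples is shown below, assuming the samples are addressed by index; the function name and seed are illustrative, and exactly how the resampled sets fed into training and testing is not specified on the slides.

```python
import random

def bootstrap_indices(n_samples, n_resamples, seed=0):
    """Draw n_resamples index lists, each sampled with replacement."""
    rng = random.Random(seed)  # fixed seed so the draws are repeatable
    return [rng.choices(range(n_samples), k=n_samples) for _ in range(n_resamples)]

# 30 resamples of the 640 EEG samples, as in step 1.
resamples = bootstrap_indices(n_samples=640, n_resamples=30)
print(len(resamples), len(resamples[0]), len(set(resamples[0])))
```

Each resample has the same size as the original data set but, because indices can repeat, typically contains fewer distinct samples; every classification run therefore sees a slightly different data set.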
We performed 60 classifications; that is, we recomputed the classifications of the brain data using random resampling with replacement.
For half of these 60 classifications we compared the brain data to the WordNet data, and for the other half we compared the brain data to the LSA data.
And we plotted the results.
[Figure: Partial Orders of Similarity Differences Invariant between EEG-recorded Brain Data and Semantic Representations of Language. Vertical axis: Significant Invariant Partial Orders (0 to 300). Horizontal axis: Test Participants (S10, S24, S25, S27, S16, S26, S12, S13, S18). Series: WordNet, LSA.]
A higher number of significant structural similarities was found between the brain data and the WordNet data than between the brain data and the LSA data.
This stronger structural similarity between the brain data and the WordNet-derived data supports the contention that, during language comprehension for the complex cognitive task of assessing truth or falsity, the representation of words in the brain has a WordNet-like quality.
This stronger structural similarity is the first evidence we have seen of WordNet-like representations in the brain.
Banerjee S and Pedersen T (2003) Extended Gloss Overlaps as a Measure of Semantic Relatedness. In Proceedings of the Eighteenth International Joint Conference on Artificial Intelligence, pp 805-810, August 9-15 2003, Acapulco, Mexico.
Banerjee S and Pedersen T (2002) An Adapted Lesk Algorithm for Word Sense Disambiguation using WordNet. In Proceedings of the Third International Conference on Intelligent Text Processing and Computational Linguistics, pp 136-145, February 17-23 2002, Mexico City.
Baroni M, Murphy B, Barbu E, Poesio M (2010) Strudel: A Corpus-Based Semantic Model Based on Properties and Types. Cognitive Science 34(2), pp 222-254. DOI 10.1111/j.1551-6709.2009.01068.x
Blei D, Ng A & Jordan M (2003) Latent Dirichlet allocation. Journal of Machine Learning Research 3, 993–1022.
Brants T and Franz A (2006) www.ldc.upenn.edu/Catalog/CatalogEntry.jsp?catalogId=LDC2006T13 Linguistic Data Consortium, Philadelphia.
Budanitsky A, Hirst G (2001) Semantic distance in WordNet: an experimental, application-oriented evaluation of five measures. In Proceedings of the NAACL 2001 Workshop on WordNet and Other Lexical Resources, Pittsburgh, June 2001. http://www.seas.smu.edu/~rada/mwnw/papers/WNW-NAACL-101.pdf.gz
Collins AM, Loftus EF (1975) A spreading-activation theory of semantic processing. Psychological Review 82(6), 407-428.
Crestani F (1997) Application of Spreading Activation Techniques in Information Retrieval. Artificial Intelligence Review 11, 453–482.
Deerwester S, Dumais ST, Furnas GW, Landauer TK, Harshman R (1990) Indexing By Latent Semantic Analysis. Journal of the American Society for Information Science 41, 391-407.
Devereux B, Kelly C, Korhonen A (2010) Using fMRI activation to conceptual stimuli to evaluate methods for extracting conceptual representations from corpora. Proceedings of the NAACL HLT 2010 First Workshop on Computational Neurolinguistics, June 2010, Los Angeles, USA. Association for Computational Linguistics, 70–78.
Fellbaum C (Ed) (1998) WordNet: An Electronic Lexical Database. Cambridge, MA: MIT Press.
Hauk O, Davis MH, Ford M, Pulvermüller F, Marslen-Wilson WD (2006) The time course of visual word recognition as revealed by linear regression analysis of ERP data. Neuroimage 2006 May 1;30(4):1383-400. Epub 2006 Feb 7.
Hirst G (1988) Resolving lexical ambiguity computationally with spreading activation and Polaroid Words. In Small SL, Cottrell GW, Tanenhaus MK (editors) Lexical ambiguity resolution: Perspectives from psycholinguistics, neuropsychology, and artificial intelligence. San Mateo, CA: Morgan Kaufmann Publishers, 1988, 73–107.
Jelodar AB, Alizadeh M, Khadivi S (2010) WordNet Based Features for Predicting Brain Activity associated with meanings of noun. Proceedings of the NAACL HLT 2010 First Workshop on Computational Neurolinguistics, June 2010, Los Angeles, USA. Association for Computational Linguistics, 18–26.
Jiang J and Conrath D (1997) Semantic similarity based on corpus statistics and lexical taxonomy. In Proceedings of International Conference on Research in Computational Linguistics, Taiwan, 19-33.
Just MA, Cherkassky VL, Aryal S, Mitchell TM (2010) A Neurosemantic Theory of Concrete Noun Representation Based on the Underlying Brain Codes. PLoS ONE 5(1): e8622. doi:10.1371/journal.pone.0008622
Karp RM (1972) Reducibility among combinatorial problems. In Complexity of Computer Computations, Proc Sympos IBM Thomas J Watson Res Center, Yorktown Heights, NY (Ed R E Miller and J W Thatcher). New York: Plenum, pp 85-103.
Kelly C, Devereux B, Korhonen A (2010) Acquiring Human-like Feature-Based Conceptual Representations from Corpora. Proceedings of the NAACL HLT 2010 First Workshop on Computational Neurolinguistics, June 2010, Los Angeles, USA. Association for Computational Linguistics, 61–69.
Kriegeskorte N, Mur M, Bandettini PA (2008) Representational similarity analysis – connecting the branches of systems neuroscience. Frontiers in Systems Neuroscience. doi:10.3389/neuro.06.004.2008
Kutas M, Hillyard SA (1984) Brain potentials during reading reflect word expectancy and semantic association. Nature 307, 161-163.
Kutas M, Federmeier KD (2011) Thirty years and counting: Finding meaning in the N400 component of the event related brain potential (ERP). Annual Review of Psychology 62, pp 621-647.
Landauer TK, Dumais ST (1997) A solution to Plato's problem: The Latent Semantic Analysis theory of the acquisition, induction, and representation of knowledge. Psychological Review 104, 211-240.
Landauer TK, Foltz PW, Laham D (1998) Introduction to Latent Semantic Analysis. Discourse Processes 25, 259-284.
Leacock C, Chodorow M (1998) Combining local context and WordNet similarity for word sense identification. In Fellbaum (Ed) WordNet: An Electronic Lexical Database. Cambridge, MA: MIT Press, pp 265-283.
Leech G, Rayson P, Wilson A (2001) Word Frequencies in Written and Spoken English: based on the British National Corpus. Longman, London.
Lin D (1998) An information-theoretic definition of similarity. In Proc 15th International Conf on Machine Learning. Morgan Kaufmann, San Francisco, CA, pp 296-304.
Luce RD (1956) Semiorders and a theory of utility discrimination. Econometrica 24, 178–191. MR0078632. http://www.jstor.org/stable/1905751
Matsunaga T, Yonemori C, Tomita E, Muramatsu M (2009) Clique-based data mining for related genes in a biomedical database. BMC Bioinformatics 2009 Jul 1;10:205.
Martin Ph (2003) Correction and Extension of WordNet 1.7. ICCS 2003, 11th International Conference on Conceptual Structures (© Springer Verlag, LNAI 2746, pp 160-173), Dresden, Germany, July 21-25 2003.
Miller GA, Nicely P (1955) An analysis of perceptual confusions among some English consonants. J Acoust Soc Am 27(2).
Miller GA, Leacock C, Tengi R, Bunker RT (1993) A Semantic Concordance. In Proceedings of the 3 DARPA Workshop on Human Language Technology.
Miller GA (1995) WordNet: A Lexical Database for English. Communications of the ACM 38(11), 39-41.
Mitchell TM, Shinkareva SV, Carlson A, Chang KM, Malave VL, Mason RA, Just MA (2008) Predicting Human Brain Activity Associated with the Meanings of Nouns. Science 320, 1191, May 30 2008. DOI 10.1126/science.1152876
Murphy B, Baroni M & Poesio M (2009) EEG responds to conceptual stimuli and corpus semantics. In Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing, pages 619–627, Singapore, 6-7 August 2009. ACL and AFNLP.
Murphy B & Poesio M (2010) Detecting semantic category in simultaneous EEG/MEG recordings. In First workshop on computational neurolinguistics, NAACL HLT 2010 (pp 36–44). Los Angeles: Association for Computational Linguistics.
Murphy B, Poesio M, Bovolo F, Bruzzone L, Dalponte M, Lakany H (2011) EEG decoding of semantic category reveals distributed representations for single concepts. Brain Lang 2011 Apr;117(1):12-22. doi:10.1016/j.bandl.2010.09.013
Pedersen T, Patwardhan S, Michelizzi J (2004) WordNet::Similarity - Measuring the Relatedness of Concepts. In Proceedings of Fifth Annual Meeting of the North American Chapter of the Association for Computational Linguistics (NAACL-2004), pp 38-41, Boston, May 2004.
Pereira F, Botvinick M, Detre G (2010) Learning semantic features for fMRI data from definitional text. Proceedings of the NAACL HLT 2010 First Workshop on Computational Neurolinguistics, June 2010, Los Angeles, USA. Association for Computational Linguistics, 1–9. http://www.aclweb.org/anthology/W10-0601
Perreau-Guimaraes M, Wong DK, Uy ET, Grosenick L, Suppes P (2007) Single-trial classification of MEG recordings. IEEE Transactions on Biomedical Engineering 54, 436–443.
Rabinovitch I (1977) The Scott-Suppes theorem on semiorders. J Mathematical Psychology 15(2), 209–212. MR0437404.
Resnik P (1995) Using information content to evaluate semantic similarity. In Proceedings of the 14th International Joint Conference on Artificial Intelligence, pages 448-453, Montreal.
Roach BJ, Mathalon DH (2008) Event-related EEG time-frequency analysis: an overview of measures and an analysis of early gamma band phase locking in schizophrenia. Schizophr Bull 2008 Sep;34(5):907-26. Epub 2008 Aug 6.
Scott D, Suppes P (1958) Foundational aspects of theories of measurement. The Journal of Symbolic Logic 23, 113–128. MR0115919.
Suppes P (1972) Axiomatic Set Theory. New York: Dover.
Suppes P (1974) The axiomatic method in the empirical sciences. In L Henkin et al (Eds) Proceedings of the Tarski Symposium, Proceedings of Symposia in Pure Mathematics 25. Providence, RI: American Mathematical Society, pp 465-479.
Suppes P, Lu Z-L, Han B (1997) Brain wave recognition of words. Proceedings of the National Academy of Sciences 94, 14965–14969.
Suppes P, Han B, Lu Z-L (1998) Brain-wave recognition of sentences. Proceedings of the National Academy of Sciences 95, 15861–15866.
Suppes P, Han B, Epelboim J, Lu Z-L (1999a) Invariance between subjects of brain wave representations of language. Proceedings of the National Academy of Sciences 96, 12953–12958.
Suppes P, Han B, Epelboim J, Lu Z-L (1999b) Invariance of brain-wave representations of simple visual images and their names. Proceedings of the National Academy of Sciences 96, 14658-14663.
Suppes P, Han B (2000) Brain-wave representation of words by superposition of a few sine waves. Proceedings of the National Academy of Sciences 97, 8738–8743.
Suppes P, Perreau-Guimaraes M, Wong DK (2009) Partial Orders of Similarity Differences Invariant Between EEG-Recorded Brain and Perceptual Representations of Language. Neural Computation 21, 3228–3269.
Suppes P, de Barros JA, Oas G (2012) Phase-Oscillator computations as neural models of stimulus-response conditioning and response selection. Journal of Mathematical Psychology 56, 95-117.
Turney P (2001) Mining the Web for Synonyms: PMI-IR versus LSA on TOEFL. In L De Raedt & P Flach (Eds) Proceedings of the Twelfth European Conference on Machine Learning (ECML-2001) (pp 491-502). Freiburg, Germany.
Vassilieva E, Pinto G, de Barros JA, Suppes P (2011) Learning Pattern Recognition Through Quasi-Synchronization of Phase Oscillators. IEEE Transactions on Neural Networks 22(1), January 2011.
Vigliocco G, Warren J, Siri S, Arciuli J, Scott S, Wise R (2006) The role of semantics and grammatical class in the neural representation of words. Cereb Cortex 2006 Dec;16(12):1790-6. Epub 2006 Jan 18.
Wong DK, Perreau-Guimaraes M, Uy ET, Suppes P (2004) Classification of individual trials based on the best independent component of EEG-recorded sentences. Neurocomputing 61, 479-484.
Wong DK, Uy ET, Perreau-Guimaraes M, Yang W, Suppes P (2006) Interpretation of perceptron weights as constructed time series for EEG classification. Neurocomputing 70, 373-383.
Wong DK, Grosenick L, Uy ET, Perreau-Guimaraes M, Carvalhaes CG, Desain P, Suppes P (2008) Quantifying inter-subject agreement in brain-imaging analyses. NeuroImage 39, 1051–1063.
Woods W (1975) What's in a link: Foundations for semantic networks. In Bobrow D, Collins A, eds. Representation and understanding. New York: Academic Press, 1975, 35-82.
Wu Z, Palmer M (1994) Verb Semantics and Lexical Selection. In Proceedings of the 32nd Annual Meeting of the Association for Computational Linguistics, Las Cruces, New Mexico, pp 133-138.
Beyond machine learning …
For the 10 geography words London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia we have 640 EEG data samples for each participant.
We want to classify these 640 samples into the correct 10 classes.
Use 580 brain samples and their associated words to train the classifier.
Test the remaining 60 samples.
Do this many times, each time using a different set of training samples.
Find the average classification rate; make sure it is statistically significant.
Obtained classification rates in the range 25% to 29%, with a mean classification rate of around 24.5% (p < 10E−10). Significantly higher than chance (10%).
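One way to check that a classification rate is "significantly higher than chance" is an exact binomial tail probability against the 10% chance level for 10 classes. The sketch below is only that: the choice of a binomial test is an assumption (the slides do not name the test used), and the counts in the example are hypothetical.

```python
from math import comb

def binomial_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k, n + 1))

# Hypothetical example: 15 of 60 test samples classified correctly,
# with 10 equally likely classes (chance level p = 0.1).
p_value = binomial_tail(n=60, k=15, p=0.1)
print(p_value)
```

A small tail probability means that so many correct classifications would be unlikely if the classifier were guessing among the 10 words.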
THEN look at the MIS-CLASSIFICATIONS and build a CONFUSION MATRIX.
Classify 640 brain data samples s1, s2, …, s640 into 10 classes ω1, ω2, …, ω10 of the finite set A.
M = (miq) is the confusion matrix for a given classification, where miq is the number of test samples from class ωi classified as belonging to class ωq.
         London  Moscow  Paris   north   south   east    west    Germany Poland  Russia
London   8       14      11      7       6       3       3       10      8       10
Moscow   8       24      14      6       2       6       4       4       7       15
Paris    6       18      12      4       3       5       8       6       11      7
north    4       2       5       11      9       7       10      1       8       3
south    1       4       3       14      14      9       11      4       7       3
east     4       3       5       9       12      12      7       1       4       3
west     4       3       2       12      13      11      10      2       8       5
Germany  2       2       3       2       2       1       0       9       11      8
Poland   2       3       4       0       4       1       4       9       9       4
Russia   7       7       5       1       5       1       1       8       4       11
The relative frequencies of the miq are an N-by-N estimate for the conditional probability densities, designated by the matrix P = (piq), that a randomly chosen test sample from class ωi will be classified as belonging to class ωq:
p_{iq} = m_{iq} / \sum_{q'} m_{iq'}
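The step from counts to relative frequencies amounts to dividing each row of the confusion matrix by its row total. The sketch below applies this to the confusion matrix shown on the earlier slide; the function name is illustrative, and the resulting values need not match the probability table on the next slide, which appears to come from a different classification run.

```python
# Confusion matrix from the slide: rows = class of the test sample,
# columns = class assigned by the classifier.
CONFUSION = [
    [8, 14, 11, 7, 6, 3, 3, 10, 8, 10],
    [8, 24, 14, 6, 2, 6, 4, 4, 7, 15],
    [6, 18, 12, 4, 3, 5, 8, 6, 11, 7],
    [4, 2, 5, 11, 9, 7, 10, 1, 8, 3],
    [1, 4, 3, 14, 14, 9, 11, 4, 7, 3],
    [4, 3, 5, 9, 12, 12, 7, 1, 4, 3],
    [4, 3, 2, 12, 13, 11, 10, 2, 8, 5],
    [2, 2, 3, 2, 2, 1, 0, 9, 11, 8],
    [2, 3, 4, 0, 4, 1, 4, 9, 9, 4],
    [7, 7, 5, 1, 5, 1, 1, 8, 4, 11],
]

def row_normalize(counts):
    """Turn each row of counts into relative frequencies that sum to 1."""
    probs = []
    for row in counts:
        total = sum(row)
        probs.append([c / total if total else 0.0 for c in row])
    return probs

P = row_normalize(CONFUSION)
print([round(p, 3) for p in P[0]])  # estimated P(classified as q | sample is London)
```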
Conditional probability density estimates from the classification of brain wave data for London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia.
Rows: ωi. Columns: ωq (predicted, "classified as").
         London  Moscow  Paris   north   south   east    west    Germany Poland  Russia
London   0.25    0.163   0.138   0.075   0.038   0.05    0.063   0.088   0.075   0.063
Moscow   0.144   0.333   0.133   0.033   0.011   0.056   0.056   0.089   0.067   0.078
Paris    0.175   0.188   0.125   0.038   0.038   0.038   0.05    0.1     0.138   0.113
north    0.067   0.017   0.017   0.283   0.167   0.15    0.1     0.1     0.067   0.033
south    0.014   0.014   0.071   0.114   0.271   0.171   0.157   0.029   0.086   0.071
east     0.05    0       0       0.133   0.15    0.383   0.15    0.05    0       0.083
west     0.057   0.043   0       0.171   0.114   0.143   0.229   0.086   0.1     0.057
Germany  0.075   0.175   0.1     0.05    0.125   0.025   0.05    0.275   0.025   0.1
Poland   0.15    0.1     0.125   0.05    0.05    0       0.025   0.15    0.25    0.1
Russia   0.16    0.04    0.08    0.04    0.1     0.02    0.12    0.1     0.08    0.26
[Figure: Hierarchical cluster tree (similarity tree) computed from the conditional probability density estimates for the classification of 640 brain wave samples for London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia. Leaf order: London, Paris, Moscow, Germany, Poland, Russia, north, south, west, east. Axis ticks: 0.2 to 0.6.]
LANGUAGE
WordNet: a lexical database of English organized around sets of cognitive synonyms, restricted to the senses that are relevant to the geography sentences.
Latent Semantic Analysis (LSA): a statistical method to extract measures of word similarity from selected sets of documents such as novels, newspaper articles, textbooks.
BRAIN
The conditional probability density estimates from the classification of brain wave data for London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia (the same table as shown earlier).
WORDNET
WordNet is a human-annotated lexical database of English in which nouns, verbs, adjectives, and adverbs are grouped into sets of cognitive synonyms (called synsets), each synset expressing a distinct concept.
See Miller (1995), Fellbaum (1998), and http://wordnet.princeton.edu
The synsets are related to each other primarily through the hypernymy and hyponymy relations for nouns. Other relations in WordNet are part-whole (holonym), member of (meronym), has instance, and so on.
Hyponymy, often referred to as the is-a relation in computational discussions, is defined as follows: a concept represented by a lexical item Li is said to be a hyponym of the concept represented by a lexical item Lk if native speakers of English accept sentences of the form "An Li is a kind of Lk". Conversely, Lk is the hypernym of Li. A hypernym is therefore a more general concept and a hyponym a more specific concept.
WORDNET
Multiple word senses, organized into sets of cognitive synonyms called "synsets":
(n) Paris#1, City of Light#1, French capital#1, capital of France#1 (the capital and largest city of France; and international center of culture and commerce)
(n) Paris#2, genus Paris#1 (sometimes placed in subfamily Trilliaceae)
(n) Paris#3 ((Greek mythology) the prince of Troy who abducted Helen from her husband Menelaus and provoked the Trojan War)
(n) Paris#4 (a town in northeastern Texas)
(n) city#1, metropolis#1, urban center#1 (a large and densely populated urban area; may include several independent administrative districts) [direct hyponym / full hyponym / part meronym / has instance]
(n) city#2 (an incorporated administrative district established by state charter)
(n) city#3, metropolis#2 (people living in a large densely populated municipality)
(n) east#1, due east#1, eastward#1, E#3 (the cardinal compass point that is at 90 degrees)
(n) East#2, Orient#1 (the countries of Asia)
(n) East#3, eastern United States#1 (the region of the United States lying to the north of the Ohio River and to the east of the Mississippi River)
(n) east#4 (the direction corresponding to the eastward cardinal compass point)
(n) east#5 (a location in the eastern part of a country, region, or city)
LATENT SEMANTIC ANALYSIS (LSA)
LSA is a statistical technique for extracting from large collections of documents a measure of how similar two words are to each other in terms of the patterns of their co-occurrences within those documents. See Deerwester et al. 1990, Landauer and Dumais 1997, Landauer et al. 1998.
The underlying idea is that if, for each word, you take into account all the contexts in which it does and does not appear, you get for all the words a set of mutual constraints that represent how similar any two words are to each other.
The similarity judgments produced by latent semantic analysis have been shown to correspond to some extent to human judgments of similarity. After training on about 2000 pages of English text, it scored as well as average test-takers on the synonym portion of the Test of English as a Foreign Language. After training on a psychology textbook, it achieved a passing score on a multiple-choice exam.
We used the application at http://lsa.colorado.edu to compute similarity matrices in term space for our set of words. The computation was based on ~38,000 college-level texts (novels, newspaper articles, …). A maximum of 300 factors was permitted in the analysis.
Semantic similarity matrix derived from LSA for the set of words London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia.
         London  Moscow  Paris   north   south   east    west    Germany Poland  Russia
London   1       0.17    0.37    0.13    0.12    0.12    0.14    0.17    0.16    0.16
Moscow   0.17    1       0.18    0.13    0.08    0.22    0.14    0.37    0.65    0.69
Paris    0.37    0.18    1       0.1     0.04    0.08    0.09    0.5     0.31    0.36
north    0.13    0.13    0.1     1       0.89    0.6     0.61    0.07    0.09    0.14
south    0.12    0.08    0.04    0.89    1       0.5     0.55    0.02    0.05    0.05
east     0.12    0.22    0.08    0.6     0.5     1       0.85    0.22    0.29    0.3
west     0.14    0.14    0.09    0.61    0.55    0.85    1       0.24    0.26    0.23
Germany  0.17    0.37    0.5     0.07    0.02    0.22    0.24    1       0.85    0.81
Poland   0.16    0.65    0.31    0.09    0.05    0.29    0.26    0.85    1       0.87
Russia   0.16    0.69    0.36    0.14    0.05    0.3     0.23    0.81    0.87    1
[Figure: Hierarchical cluster tree computed from the pairwise Latent Semantic Analysis (LSA) scores of similarity for London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia, based on ~38,000 college-level texts (novels, newspaper articles, …). Leaf order: east, west, north, south, London, Paris, Moscow, Germany, Poland, Russia. Axis ticks: 0.1 to 0.6.]
LSA provides a straightforward measure of similarity between words.
Banerjee, S., and Pedersen, T. (2003). Extended Gloss Overlaps as a Measure of Semantic Relatedness. In Proceedings of the Eighteenth International Joint Conference on Artificial Intelligence, pp. 805-810, August 9-15, 2003, Acapulco, Mexico.
Banerjee, S., and Pedersen, T. (2002). An Adapted Lesk Algorithm for Word Sense Disambiguation using WordNet. In Proceedings of the Third International Conference on Intelligent Text Processing and Computational Linguistics, pp. 136-145, February 17-23, 2002, Mexico City.
Baroni, M., B. Murphy, E. Barbu, M. Poesio (2010). Strudel: A Corpus-Based Semantic Model Based on Properties and Types. Cognitive Science, vol. 34, no. 2, pp. 222-254, 2010. DOI 10.1111/j.1551-6709.2009.01068.x
Blei, D., Ng, A., & Jordan, M. (2003). Latent Dirichlet allocation. Journal of Machine Learning Research 3, 993–1022.
Brants, T., and Franz, A. (2006). www.ldc.upenn.edu/Catalog/CatalogEntry.jsp?catalogId=LDC2006T13. Linguistic Data Consortium, Philadelphia.
Budanitsky, A., Hirst, G. (2001). Semantic distance in WordNet: an experimental, application-oriented evaluation of five measures. In Proceedings of the NAACL 2001 Workshop on WordNet and Other Lexical Resources, Pittsburgh, June 2001. http://www.seas.smu.edu/~rada/mwnw/papers/WNW-NAACL-101.pdf.gz
Collins, A.M., Loftus, E.F. (1975). A spreading-activation theory of semantic processing. Psychological Review, 1975 Nov, Vol. 82(6), 407-428.
Crestani, F. (1997). Application of Spreading Activation Techniques in Information Retrieval. Artificial Intelligence Review 11, 453–482.
Deerwester, S., Dumais, S.T., Furnas, G.W., Landauer, T.K., Harshman, R. (1990). Indexing By Latent Semantic Analysis. Journal of the American Society for Information Science 41, 391-407.
Devereux, B., C. Kelly, A. Korhonen (2010). Using fMRI activation to conceptual stimuli to evaluate methods for extracting conceptual representations from corpora. Proceedings of the NAACL HLT 2010 First Workshop on Computational Neurolinguistics, June 2010, Los Angeles, USA. Association for Computational Linguistics, 70–78.
Fellbaum, C. (Ed.) (1998). WordNet: An Electronic Lexical Database. Cambridge, MA: MIT Press.
Hauk, O., Davis, M.H., Ford, M., Pulvermüller, F., Marslen-Wilson, W.D. (2006). The time course of visual word recognition as revealed by linear regression analysis of ERP data. Neuroimage 2006 May 1;30(4):1383-400. Epub 2006 Feb 7.
Hirst, G. (1988). Resolving lexical ambiguity computationally with spreading activation and Polaroid Words. In Small, S.L., Cottrell, G.W., Tanenhaus, M.K. (editors), Lexical ambiguity resolution: Perspectives from psycholinguistics, neuropsychology, and artificial intelligence. San Mateo, CA: Morgan Kaufmann Publishers, 1988, 73–107.
Jelodar, A.B., M. Alizadeh, S. Khadivi (2010). WordNet Based Features for Predicting Brain Activity associated with meanings of noun. Proceedings of the NAACL HLT 2010 First Workshop on Computational Neurolinguistics, June 2010, Los Angeles, USA. Association for Computational Linguistics, 18–26.
Jiang, J., and Conrath, D. (1997). Semantic similarity based on corpus statistics and lexical taxonomy. In Proceedings of International Conference on Research in Computational Linguistics, Taiwan, 19-33.
Just, M.A., Cherkassky, V.L., Aryal, S., Mitchell, T.M. (2010). A Neurosemantic Theory of Concrete Noun Representation Based on the Underlying Brain Codes. PLoS ONE 5(1): e8622. doi:10.1371/journal.pone.0008622
Karp, R.M. (1972). Reducibility among combinatorial problems. In Complexity of Computer Computations, Proc. Sympos. IBM Thomas J. Watson Res. Center, Yorktown Heights, NY (Ed. R.E. Miller and J.W. Thatcher). New York: Plenum, pp. 85-103.
Kelly, C., B. Devereux, A. Korhonen (2010). Acquiring Human-like Feature-Based Conceptual Representations from Corpora. Proceedings of the NAACL HLT 2010 First Workshop on Computational Neurolinguistics, June 2010, Los Angeles, USA. Association for Computational Linguistics, 61–69.
Kriegeskorte, N., Mur, M., Bandettini, P.A. (2008). Representational similarity analysis – connecting the branches of systems neuroscience. Frontiers in Systems Neuroscience. doi:10.3389/neuro.06.004.2008
Kutas, M., Hillyard, S.A. (1984). Brain potentials during reading reflect word expectancy and semantic association. Nature 307, 161-163.
Kutas, M., Federmeier, K.D. (2011). Thirty years and counting: Finding meaning in the N400 component of the event related brain potential (ERP). Annual Review of Psychology 2011, 62, pp. 621-647.
Landauer, T.K., Dumais, S.T. (1997). A solution to Plato's problem: The Latent Semantic Analysis theory of the acquisition, induction, and representation of knowledge. Psychological Review 104, 211-240.
Landauer, T.K., Foltz, P.W., Laham, D. (1998). Introduction to Latent Semantic Analysis. Discourse Processes 25, 259-284.
Leacock, C., Chodorow, M. (1998). Combining local context and WordNet similarity for word sense identification. In Fellbaum (Ed.), WordNet: An Electronic Lexical Database. Cambridge, MA: MIT Press, pp. 265-283.
Leech, G., Rayson, P., Wilson, A. (2001). Word Frequencies in Written and Spoken English: based on the British National Corpus. Longman, London.
Lin, D. (1998). An information-theoretic definition of similarity. In Proc. 15th International Conf. on Machine Learning. Morgan Kaufmann, San Francisco, CA, pp. 296-304.
Luce, R.D. (1956). Semiorders and a theory of utility discrimination. Econometrica 24, 178–191. MR0078632. http://www.jstor.org/stable/1905751
Matsunaga, T., Yonemori, C., Tomita, E., Muramatsu, M. (2009). Clique-based data mining for related genes in a biomedical database. BMC Bioinformatics 2009 Jul 1;10:205.
Martin, Ph. (2003). Correction and Extension of WordNet 1.7. ICCS 2003, 11th International Conference on Conceptual Structures (© Springer Verlag, LNAI 2746, pp. 160-173), Dresden, Germany, July 21-25, 2003.
Miller, G.A., Nicely, P. (1955). An analysis of perceptual confusions among some English consonants. J. Acoust. Soc. Am. 27(2).
Miller, G.A., Leacock, C., Tengi, R., Bunker, R.T. (1993). A Semantic Concordance. In Proceedings of the 3rd DARPA Workshop on Human Language Technology.
Miller, G.A. (1995). WordNet: A Lexical Database for English. Communications of the ACM, Vol. 38, No. 11, 39-41.
Mitchell, T.M., S.V. Shinkareva, A. Carlson, K.M. Chang, V.L. Malave, R.A. Mason, and M.A. Just (2008). Predicting Human Brain Activity Associated with the Meanings of Nouns. Science 320, 1191, May 30, 2008. DOI 10.1126/science.1152876
Murphy, B., Baroni, M., & Poesio, M. (2009). EEG responds to conceptual stimuli and corpus semantics. In Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing, pages 619–627, Singapore, 6-7 August 2009. © 2009 ACL and AFNLP.
Murphy, B., & Poesio, M. (2010). Detecting semantic category in simultaneous EEG/MEG recordings. In First workshop on computational neurolinguistics, NAACL HLT 2010 (pp. 36–44). Los Angeles: Association for Computational Linguistics.
Murphy, B., Poesio, M., Bovolo, F., Bruzzone, L., Dalponte, M., Lakany, H. (2011). EEG decoding of semantic category reveals distributed representations for single concepts. Brain Lang. 2011 Apr;117(1):12-22. doi:10.1016/j.bandl.2010.09.013
Pedersen, T., Patwardhan, S., Michelizzi, J. (2004). WordNet::Similarity - Measuring the Relatedness of Concepts. In Proceedings of Fifth Annual Meeting of the North American Chapter of the Association for Computational Linguistics (NAACL-2004), pp. 38-41, Boston, May 2004.
Pereira, F., M. Botvinick, G. Detre (2010). Learning semantic features for fMRI data from definitional text. Proceedings of the NAACL HLT 2010 First Workshop on Computational Neurolinguistics, June 2010, Los Angeles, USA. Association for Computational Linguistics, 1–9. http://www.aclweb.org/anthology/W10-0601
Perreau-Guimaraes, M., Wong, D.K., Uy, E.T., Grosenick, L., Suppes, P. (2007). Single-trial classification of MEG recordings. IEEE Transactions on Biomedical Engineering 54, 436–443.
Rabinovitch, I. (1977). The Scott-Suppes theorem on semiorders. J. Mathematical Psychology 15 (2), 209–212. MR0437404.
Resnik, P. (1995). Using information content to evaluate semantic similarity. In Proceedings of the 14th International Joint Conference on Artificial Intelligence, pages 448-453, Montreal.
Roach, B.J., Mathalon, D.H. (2008). Event-related EEG time-frequency analysis: an overview of measures and an analysis of early gamma band phase locking in schizophrenia. Schizophr Bull. 2008 Sep;34(5):907-26. Epub 2008 Aug 6.
Scott, D., Suppes, P. (1958). Foundational aspects of theories of measurement. The Journal of Symbolic Logic 23, 113–128. MR0115919.
Suppes, P. (1972). Axiomatic Set Theory. New York: Dover.
Suppes, P. (1974). The axiomatic method in the empirical sciences. L. Henkin et al. (Eds.), Proceedings of the Tarski Symposium, Proceedings of Symposia in Pure Mathematics 25. Providence, RI: American Mathematical Society, pp. 465-479.
Suppes, P., Lu, Z.-L., Han, B. (1997). Brain wave recognition of words. Proceedings of the National Academy of Sciences 94, 14965–14969.
Suppes, P., Han, B., Lu, Z.-L. (1998). Brain-wave recognition of sentences. Proceedings of the National Academy of Sciences 95, 15861–15866.
Suppes, P., Han, B., Epelboim, J., Lu, Z.-L. (1999a). Invariance between subjects of brain wave representations of language. Proceedings of the National Academy of Sciences 96, 12953–12958.
Suppes, P., Han, B., Epelboim, J., Lu, Z.-L. (1999b). Invariance of brain-wave representations of simple visual images and their names. Proceedings of the National Academy of Sciences 96, 14658-14663.
Suppes, P., Han, B. (2000). Brain-wave representation of words by superposition of a few sine waves. Proceedings of the National Academy of Sciences 97, 8738–8743.
Suppes, P., Perreau-Guimaraes, M., Wong, D.K. (2009). Partial Orders of Similarity Differences Invariant Between EEG-Recorded Brain and Perceptual Representations of Language. Neural Computation 21, 3228–3269.
Suppes, P., de Barros, J.A., Oas, G. (2012). Phase-Oscillator computations as neural models of stimulus-response conditioning and response selection. Journal of Mathematical Psychology 56, 95-117.
Turney, P. (2001). Mining the Web for Synonyms: PMI-IR versus LSA on TOEFL. In L. De Raedt & P. Flach (Eds.), Proceedings of the Twelfth European Conference on Machine Learning (ECML-2001) (pp. 491-502). Freiburg, Germany.
Vassilieva, E., Pinto, G., de Barros, J.A., Suppes, P. (2011). Learning Pattern Recognition Through Quasi-Synchronization of Phase Oscillators. IEEE Transactions on Neural Networks, Vol. 22, No. 1, January 2011.
Vigliocco, G., Warren, J., Siri, S., Arciuli, J., Scott, S., Wise, R. (2006). The role of semantics and grammatical class in the neural representation of words. Cereb Cortex. 2006 Dec;16(12):1790-6. Epub 2006 Jan 18.
Wong, D.K., Perreau-Guimaraes, M., Uy, E.T., Suppes, P. (2004). Classification of individual trials based on the best independent component of EEG-recorded sentences. Neurocomputing 61, 479-484.
Wong, D.K., Uy, E.T., Perreau-Guimaraes, M., Yang, W., Suppes, P. (2006). Interpretation of perceptron weights as constructed time series for EEG classification. Neurocomputing 70, 373-383.
Wong, D.K., Grosenick, L., Uy, E.T., Perreau-Guimaraes, M., Carvalhaes, C.G., Desain, P., Suppes, P. (2008). Quantifying inter-subject agreement in brain-imaging analyses. NeuroImage 39, 1051–1063.
Woods, W. (1975). What's in a link: Foundations for semantic networks. In Bobrow, D., Collins, A., eds., Representation and understanding. New York: Academic Press, 1975, 35-82.
Wu, Z., Palmer, M. (1994). Verb Semantics and Lexical Selection. In Proceedings of the 32nd Annual Meeting of the Association for Computational Linguistics, Las Cruces, New Mexico, pp. 133-138.
Classify 640 brain data samples $s_1, s_2, \ldots, s_{640}$ into 10 classes $\omega_1, \omega_2, \ldots, \omega_{10}$ of the finite set $A$.
$M = (m_{iq})$ is the confusion matrix for a given classification, where $m_{iq}$ is the number of test samples from class $\omega_i$ (rows) classified as belonging to class $\omega_q$ (columns).
          London  Moscow   Paris   north   south    east    west Germany  Poland  Russia
London         8      14      11       7       6       3       3      10       8      10
Moscow         8      24      14       6       2       6       4       4       7      15
Paris          6      18      12       4       3       5       8       6      11       7
north          4       2       5      11       9       7      10       1       8       3
south          1       4       3      14      14       9      11       4       7       3
east           4       3       5       9      12      12       7       1       4       3
west           4       3       2      12      13      11      10       2       8       5
Germany        2       2       3       2       2       1       0       9      11       8
Poland         2       3       4       0       4       1       4       9       9       4
Russia         7       7       5       1       5       1       1       8       4      11
The relative frequencies $m_{iq} / \sum_{j} m_{ij}$ are an N-by-N estimate for the conditional probability densities, designated by the matrix $P = (p_{iq})$, that a randomly chosen test sample from class $\omega_i$ will be classified as belonging to class $\omega_q$:
$$p_{iq} = \frac{m_{iq}}{\sum_{j=1}^{N} m_{ij}}$$
Conditional probability density estimates from the classification of brain wave data for London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia. Rows: $\omega_i$ (true class). Columns: $\omega_q$ (predicted, "classified as").
          London  Moscow   Paris   north   south    east    west Germany  Poland  Russia
London      0.25   0.163   0.138   0.075   0.038    0.05   0.063   0.088   0.075   0.063
Moscow     0.144   0.333   0.133   0.033   0.011   0.056   0.056   0.089   0.067   0.078
Paris      0.175   0.188   0.125   0.038   0.038   0.038    0.05     0.1   0.138   0.113
north      0.067   0.017   0.017   0.283   0.167    0.15     0.1     0.1   0.067   0.033
south      0.014   0.014   0.071   0.114   0.271   0.171   0.157   0.029   0.086   0.071
east        0.05       0       0   0.133    0.15   0.383    0.15    0.05       0   0.083
west       0.057   0.043       0   0.171   0.114   0.143   0.229   0.086     0.1   0.057
Germany    0.075   0.175     0.1    0.05   0.125   0.025    0.05   0.275   0.025     0.1
Poland      0.15     0.1   0.125    0.05    0.05       0   0.025    0.15    0.25     0.1
Russia      0.16    0.04    0.08    0.04     0.1    0.02    0.12     0.1    0.08    0.26
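As a minimal sketch of the row normalization described here, the following Python takes the confusion matrix given for the ten words and divides each row by its total. The function name `conditional_probabilities` is an illustrative choice, not something from the talk. The values it produces differ from the probability table shown on this slide, which appears to come from a different classification run.

```python
# Confusion matrix for the ten words: rows are true classes, columns are predicted classes.
WORDS = ["London", "Moscow", "Paris", "north", "south",
         "east", "west", "Germany", "Poland", "Russia"]

CONFUSION = [
    [8, 14, 11, 7, 6, 3, 3, 10, 8, 10],
    [8, 24, 14, 6, 2, 6, 4, 4, 7, 15],
    [6, 18, 12, 4, 3, 5, 8, 6, 11, 7],
    [4, 2, 5, 11, 9, 7, 10, 1, 8, 3],
    [1, 4, 3, 14, 14, 9, 11, 4, 7, 3],
    [4, 3, 5, 9, 12, 12, 7, 1, 4, 3],
    [4, 3, 2, 12, 13, 11, 10, 2, 8, 5],
    [2, 2, 3, 2, 2, 1, 0, 9, 11, 8],
    [2, 3, 4, 0, 4, 1, 4, 9, 9, 4],
    [7, 7, 5, 1, 5, 1, 1, 8, 4, 11],
]


def conditional_probabilities(confusion):
    """Row-normalize a confusion matrix: p[i][q] = m[i][q] / sum_j m[i][j]."""
    probs = []
    for row in confusion:
        total = sum(row)
        # Guard against an empty class so the sketch never divides by zero.
        probs.append([count / total if total else 0.0 for count in row])
    return probs


P = conditional_probabilities(CONFUSION)
```

Each row of `P` sums to 1, so row i can be read as an estimate of how often a sample of word i is classified as each of the ten words.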
Figure: Hierarchical cluster tree (similarity tree) computed from the conditional probability density estimates for the classification of 640 brain wave samples for London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia. Leaf order: London, Paris, Moscow, Germany, Poland, Russia, north, south, west, east. Axis ticks: 0.2 to 0.6.
WordNet: a lexical database of English organized around sets of cognitive synonyms, restricted to the senses that are relevant to the geography sentences.
Latent Semantic Analysis (LSA): a statistical method to extract measures of word similarity from selected sets of documents such as novels, newspaper articles, and textbooks.
Figure: LANGUAGE (WordNet, LSA) set against BRAIN (the matrix of conditional probability density estimates for London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia, repeated from the brain data table).
WORDNET
WordNet is a human-annotated lexical database of English in which nouns, verbs, adjectives, and adverbs are grouped into sets of cognitive synonyms (called synsets), each synset expressing a distinct concept. See Miller (1995), Fellbaum (1998), and http://wordnet.princeton.edu
The synsets are related to each other primarily through the hypernymy and hyponymy relations for nouns. Other relations in WordNet are part-whole (holonym), member of (meronym), has instance, and so on.
Hyponymy, often referred to as the is-a relation in computational discussions, is defined as follows: a concept represented by a lexical item Li is said to be a hyponym of the concept represented by a lexical item Lk if native speakers of English accept sentences of the form "An Li is a kind of Lk". Conversely, Lk is the hypernym of Li. A hypernym is therefore a more general concept and a hyponym a more specific concept.
WORDNET
Multiple word senses, organized into sets of cognitive synonyms called "synsets":
(n) Paris#1, City of Light#1, French capital#1, capital of France#1 (the capital and largest city of France, and international center of culture and commerce)
(n) Paris#2, genus Paris#1 (sometimes placed in subfamily Trilliaceae)
(n) Paris#3 ((Greek mythology) the prince of Troy who abducted Helen from her husband Menelaus and provoked the Trojan War)
(n) Paris#4 (a town in northeastern Texas)
(n) city#1, metropolis#1, urban center#1 (a large and densely populated urban area; may include several independent administrative districts) [relations listed: direct hyponym, full hyponym, part meronym, has instance]
(n) city#2 (an incorporated administrative district established by state charter)
(n) city#3, metropolis#2 (people living in a large densely populated municipality)
(n) east#1, due east#1, eastward#1, E#3 (the cardinal compass point that is at 90 degrees)
(n) East#2, Orient#1 (the countries of Asia)
(n) East#3, eastern United States#1 (the region of the United States lying to the north of the Ohio River and to the east of the Mississippi River)
(n) east#4 (the direction corresponding to the eastward cardinal compass point)
(n) east#5 (a location in the eastern part of a country, region, or city)
LATENT SEMANTIC ANALYSIS (LSA)
LSA is a statistical technique for extracting, from large collections of documents, a measure of how similar two words are to each other in terms of the patterns of their co-occurrences within those documents. See Deerwester et al. (1990), Landauer and Dumais (1997), Landauer et al. (1998).
The underlying idea is that if, for each word, you take into account all the contexts in which it does and does not appear, you get for all the words a set of mutual constraints that represent how similar any two words are to each other.
The similarity judgments produced by latent semantic analysis have been shown to correspond to some extent to human judgments of similarity. After training on about 2000 pages of English text, it scored as well as average test-takers on the synonym portion of the Test of English as a Foreign Language. After training on a psychology textbook, it achieved a passing score on a multiple-choice exam.
We used the application at http://lsa.colorado.edu to compute similarity matrices in term space for our set of words. The computation was based on ~38,000 college-level texts (novels, newspaper articles, …). A maximum of 300 factors was permitted in the analysis.
Semantic similarity matrix derived from LSA for the set of words London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia.
          London  Moscow   Paris   north   south    east    west Germany  Poland  Russia
London         1    0.17    0.37    0.13    0.12    0.12    0.14    0.17    0.16    0.16
Moscow      0.17       1    0.18    0.13    0.08    0.22    0.14    0.37    0.65    0.69
Paris       0.37    0.18       1     0.1    0.04    0.08    0.09     0.5    0.31    0.36
north       0.13    0.13     0.1       1    0.89     0.6    0.61    0.07    0.09    0.14
south       0.12    0.08    0.04    0.89       1     0.5    0.55    0.02    0.05    0.05
east        0.12    0.22    0.08     0.6     0.5       1    0.85    0.22    0.29     0.3
west        0.14    0.14    0.09    0.61    0.55    0.85       1    0.24    0.26    0.23
Germany     0.17    0.37     0.5    0.07    0.02    0.22    0.24       1    0.85    0.81
Poland      0.16    0.65    0.31    0.09    0.05    0.29    0.26    0.85       1    0.87
Russia      0.16    0.69    0.36    0.14    0.05     0.3    0.23    0.81    0.87       1
Figure: Hierarchical cluster tree computed from the pair-wise Latent Semantic Analysis (LSA) scores of similarity for London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia, based on ~38,000 college-level texts (novels, newspaper articles, …). Leaf order: east, west, north, south, London, Paris, Moscow, Germany, Poland, Russia. Axis ticks: 0.1 to 0.6.
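To illustrate how a similarity tree can be grown from a pairwise similarity matrix, here is a small pure-Python sketch of agglomerative clustering applied to the LSA matrix for the ten words. The talk does not state which linkage rule or distance was used, so two choices here are assumptions made only for illustration: distance is taken as 1 minus similarity, and clusters are merged by average linkage. The function name `average_linkage` is likewise hypothetical.

```python
WORDS = ["London", "Moscow", "Paris", "north", "south",
         "east", "west", "Germany", "Poland", "Russia"]

# LSA similarity matrix for the ten words, in the order given by WORDS.
LSA = [
    [1, 0.17, 0.37, 0.13, 0.12, 0.12, 0.14, 0.17, 0.16, 0.16],
    [0.17, 1, 0.18, 0.13, 0.08, 0.22, 0.14, 0.37, 0.65, 0.69],
    [0.37, 0.18, 1, 0.1, 0.04, 0.08, 0.09, 0.5, 0.31, 0.36],
    [0.13, 0.13, 0.1, 1, 0.89, 0.6, 0.61, 0.07, 0.09, 0.14],
    [0.12, 0.08, 0.04, 0.89, 1, 0.5, 0.55, 0.02, 0.05, 0.05],
    [0.12, 0.22, 0.08, 0.6, 0.5, 1, 0.85, 0.22, 0.29, 0.3],
    [0.14, 0.14, 0.09, 0.61, 0.55, 0.85, 1, 0.24, 0.26, 0.23],
    [0.17, 0.37, 0.5, 0.07, 0.02, 0.22, 0.24, 1, 0.85, 0.81],
    [0.16, 0.65, 0.31, 0.09, 0.05, 0.29, 0.26, 0.85, 1, 0.87],
    [0.16, 0.69, 0.36, 0.14, 0.05, 0.3, 0.23, 0.81, 0.87, 1],
]


def average_linkage(names, sim):
    """Merge the two closest clusters until one remains; return the merge history."""
    clusters = [[i] for i in range(len(names))]
    merges = []
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Assumed distance: 1 - similarity, averaged over all cross-cluster pairs.
                total = sum(1.0 - sim[i][j] for i in clusters[a] for j in clusters[b])
                d = total / (len(clusters[a]) * len(clusters[b]))
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        merges.append((sorted(names[i] for i in clusters[a]),
                       sorted(names[i] for i in clusters[b]),
                       round(d, 3)))
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return merges


merges = average_linkage(WORDS, LSA)
```

Under these assumptions the first two merges join north with south and Poland with Russia, the two most similar pairs in the matrix. A different linkage rule could produce a different tree higher up.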
LSA provides a straightforward measure of similarity between words.
For WORDNET, several different measures of similarity have been devised, e.g.:
Path length between synsets.
Information content: a corpus-based measure of the specificity of a concept, measured in terms of the frequency of occurrence of the concept in the corpus (the human-annotated, sense-tagged corpus SemCor (Miller et al. 1993), which links every word in the Brown Corpus to its appropriate WordNet sense). Scaled in various ways.
Vector-space models: these work by forming second-order co-occurrence vectors from the WordNet definitions of concepts, known as glosses.
We used five measures in our computations of similarity and took the average score, using each of the relevant senses.
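To make the path-length idea concrete, the sketch below computes the Wu-Palmer (wup) score, 2 x depth(LCS) / (depth(a) + depth(b)), where LCS is the deepest common ancestor of the two concepts. The hierarchy in `PARENT` is a toy taxonomy invented for illustration. It is not WordNet's actual hypernym structure, and the scores it yields are not the ones used in the study.

```python
# Hypothetical toy taxonomy (child -> parent); NOT WordNet's real hierarchy.
PARENT = {
    "entity": None,
    "location": "entity",
    "direction": "entity",
    "city": "location",
    "country": "location",
    "London": "city",
    "Paris": "city",
    "Germany": "country",
    "north": "direction",
    "east": "direction",
}


def path_to_root(node):
    """Return the chain of nodes from `node` up to the root, node first."""
    path = [node]
    while PARENT[node] is not None:
        node = PARENT[node]
        path.append(node)
    return path


def depth(node):
    """Depth counted in nodes, so the root has depth 1."""
    return len(path_to_root(node))


def wup(a, b):
    """Wu-Palmer similarity: 2 * depth(LCS) / (depth(a) + depth(b))."""
    ancestors_b = set(path_to_root(b))
    # The first shared node on a's path to the root is the deepest common ancestor.
    lcs = next(n for n in path_to_root(a) if n in ancestors_b)
    return 2.0 * depth(lcs) / (depth(a) + depth(b))
```

In this toy tree two cities score higher with each other than a city does with a country, and a city scores lowest with a compass direction, which is the qualitative pattern a path-based measure is meant to capture.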
Semantic similarity matrix derived from WordNet for the set of words London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia, using senses relevant to the geography of Europe and five measures of similarity: wup (path length), lin and jcn (information content), and gv and pgv (vector space measures).
          London  Moscow   Paris   north   south    east    west Germany  Poland  Russia
London         1   0.396   0.466   0.106   0.103   0.076   0.078   0.322   0.299   0.303
Moscow     0.396       1   0.393   0.095   0.094   0.062    0.07   0.286   0.281   0.288
Paris      0.466   0.393       1   0.106   0.104   0.074   0.077   0.327   0.308   0.307
north      0.106   0.095   0.106       1   0.228   0.179    0.21   0.123   0.132   0.111
south      0.103   0.094   0.104   0.228       1   0.172   0.212   0.115   0.107   0.109
east       0.076   0.062   0.074   0.179   0.172       1   0.216   0.093    0.08   0.077
west       0.078    0.07   0.077    0.21   0.212   0.216       1   0.087   0.082   0.083
Germany    0.322   0.286   0.327   0.123   0.115   0.093   0.087       1   0.589   0.409
Poland     0.299   0.281   0.308   0.132   0.107    0.08   0.082   0.589       1   0.403
Russia     0.303   0.288   0.307   0.111   0.109   0.077   0.083   0.409   0.403       1
Figure: Hierarchical cluster tree computed from pairwise WordNet-based semantic similarity scores for London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia, restricted to senses related to the geography of Europe. Leaf order: Germany, Poland, Russia, London, Paris, Moscow, north, south, west, east. Axis ticks: 0.2 to 0.6.
BACK TO THE BRAIN …
London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia
Now let's see how to compare the EEG data and the language data …
Some of the similarity trees show remarkable congruence between the brain and semantic data.
Where exactly does that congruence lie?
Can we devise a quantitative measure of the nature and strength of that congruence?
Figure: two similarity trees side by side, labeled LANGUAGE DATA and BRAIN DATA, each with axis ticks 0.2 to 0.6. Leaf orders: London, Paris, Moscow, Germany, Poland, Russia, north, south, west, east (as in the brain data tree) and Germany, Poland, Russia, London, Paris, Moscow, north, south, west, east (as in the WordNet tree).
Figure: WordNet-based semantic similarities and EEG conditional probability estimates for London relative to London (L), Moscow (M), Paris (P), north (n), south (s), east (e), west (w), Germany (G), Poland (Po), and Russia (R).
The Spearman rank correlation for the two sequences in the figure is 0.99, with one-sided significance of 1.84E-10.
BRAIN DATA
For each word $\omega$ we compute from the conditional probability density estimates a ternary relation $R$ such that $R(\omega, \omega_1, \omega_2)$ if and only if, with respect to word $\omega$, the conditional probability for word $\omega_1$ is smaller than the conditional probability for word $\omega_2$; that is, if and only if $\omega_1$'s similarity difference with $\omega$ is smaller than $\omega_2$'s similarity difference with $\omega$.
$R$ is an ordinal relation of similarity differences: a partial order that is irreflexive, asymmetric, and transitive.
LANGUAGE DATA
For each word $\omega$ we compute from the semantic similarity matrix a ternary relation $R$ such that $R(\omega, \omega_1, \omega_2)$ if and only if the similarity difference of $\omega_1$ with $\omega$ is smaller than the similarity difference of $\omega_2$ with $\omega$; that is, $\omega_1$ is more similar to $\omega$ than is $\omega_2$.
$R$ is an ordinal relation of similarity differences: a partial order that is irreflexive, asymmetric, and transitive.
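A minimal sketch of the language-data relation for one reference word, using the London row of the WordNet similarity matrix. Following the reading stated on the slide ("ω1 is more similar to ω than is ω2"), a pair (ω1, ω2) is placed in the relation when ω1's score with London is strictly larger than ω2's. The function names are illustrative only, and the final check simply confirms the three properties the slide lists.

```python
WORDS = ["London", "Moscow", "Paris", "north", "south",
         "east", "west", "Germany", "Poland", "Russia"]

# London row of the WordNet-based similarity matrix.
WORDNET_LONDON = [1, 0.396, 0.466, 0.106, 0.103, 0.076, 0.078, 0.322, 0.299, 0.303]


def similarity_order(names, scores):
    """Pairs (w1, w2) such that w1 is strictly more similar to the reference word than w2."""
    return {(w1, w2)
            for w1, s1 in zip(names, scores)
            for w2, s2 in zip(names, scores)
            if s1 > s2}


def is_strict_partial_order(pairs):
    """Check irreflexivity, asymmetry, and transitivity of a set of ordered pairs."""
    irreflexive = all(a != b for a, b in pairs)
    asymmetric = all((b, a) not in pairs for a, b in pairs)
    transitive = all((a, d) in pairs
                     for a, b in pairs for c, d in pairs if b == c)
    return irreflexive and asymmetric and transitive


R_LONDON = similarity_order(WORDS, WORDNET_LONDON)
```

Because all ten scores in this row are distinct, every pair of different words is ordered one way or the other. Tied scores would simply leave a pair unordered.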
London
Language data        Brain data
London   1.000       London   0.275
Paris    0.466       Paris    0.133
Moscow   0.396       Moscow   0.108
Germany  0.322       Germany  0.075
Russia   0.303       north    0.042
Poland   0.299       Russia   0.033
north    0.106       Poland   0.025
south    0.103       east     0.008
west     0.078       south    0.000
east     0.076       west     0.000
Partial orders for London derived from the WordNet semantic similarities of Table 2 and the conditional probability estimates for the brain data of Table 5.
Figure: partial-order diagrams for London, one for the language data and one for the brain data.
We follow the approach described in Suppes (1974) for the axiomatization of the theory of differences in utility preference or the theory of differences in psychological intensity.
Suppes, P. (1974). The axiomatic method in the empirical sciences. L. Henkin et al. (Eds.), Proceedings of the Tarski Symposium, Proceedings of Symposia in Pure Mathematics 25. Providence, RI: American Mathematical Society, pp. 465-479.
Brain data: the relational structure $(A, R)$ constructed from $R$ and the finite set $A$ of classes $\omega_1, \omega_2, \ldots, \omega_N$, together with the N partial orders constructed from the N-by-N estimate for the conditional probability densities.
Language data: the relational structure $(A, R)$ constructed from $R$ and the finite set $A$ of classes $\omega_1, \omega_2, \ldots, \omega_N$, together with the N partial orders constructed from the N-by-N similarity matrix.
For each $\omega_i$ we compare the partial order of the brain data with the partial order of the language data using Spearman's rank correlation coefficient, which we interpret in the usual way to determine whether or not we have a statistically significant correlation.
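The comparison step can be sketched as follows, using the two London columns from the language data / brain data table (WordNet similarities and brain conditional probability estimates). This is a plain-Python Spearman coefficient computed as the Pearson correlation of average ranks, so tied values such as the two 0.000 entries share a rank. It does not compute a p-value, and the helper names are illustrative.

```python
# Values for London, both listed in the same word order.
WORDS = ["London", "Moscow", "Paris", "north", "south",
         "east", "west", "Germany", "Poland", "Russia"]
LANGUAGE = [1.000, 0.396, 0.466, 0.106, 0.103, 0.076, 0.078, 0.322, 0.299, 0.303]
BRAIN = [0.275, 0.108, 0.133, 0.042, 0.000, 0.008, 0.000, 0.075, 0.025, 0.033]


def average_ranks(values):
    """Rank values from 1 (smallest); tied values receive the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


rho = spearman(LANGUAGE, BRAIN)  # roughly 0.92 for these two columns
```

Ranking in ascending or descending order gives the same coefficient, as long as both sequences are ranked the same way.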
For those instances in which the brain and language partial orders are significantly correlated, we find the partial order that is invariant with respect to the brain and language data. Here are two more examples.
Figure: invariant partial order for Paris over London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia. Significant Invariance - Paris - Spearman 0.88 (p = 6.6795e-004).
Figure: invariant partial order for Paris over London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia. Significant Invariance - Paris - Spearman 0.90 (p = 3.8716e-004).
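The talk does not spell out how the invariant partial order is computed. One simple reading, offered here purely as an assumption, is to keep exactly those ordered pairs on which the brain order and the language order agree, that is, the intersection of the two strict orders (which is itself a strict partial order). The sketch uses the London columns from the language data / brain data table; the function names are illustrative.

```python
WORDS = ["London", "Moscow", "Paris", "north", "south",
         "east", "west", "Germany", "Poland", "Russia"]
LANGUAGE = [1.000, 0.396, 0.466, 0.106, 0.103, 0.076, 0.078, 0.322, 0.299, 0.303]
BRAIN = [0.275, 0.108, 0.133, 0.042, 0.000, 0.008, 0.000, 0.075, 0.025, 0.033]


def strict_order(names, scores):
    """Pairs (a, b) where a has a strictly higher score than b."""
    return {(a, b)
            for a, sa in zip(names, scores)
            for b, sb in zip(names, scores)
            if sa > sb}


def invariant_order(order_a, order_b):
    """Assumed definition: the pairs ordered the same way in both relations."""
    return order_a & order_b


INVARIANT = invariant_order(strict_order(WORDS, LANGUAGE),
                            strict_order(WORDS, BRAIN))
```

For London, the pairs that drop out are those involving north against Russia and Poland, east against south and west, and the south/west pair that is tied in the brain data.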
Figure (Brain data): Another hierarchical cluster tree (similarity tree) computed from the conditional probability density estimates for the classification of 640 brain wave samples for London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia. Leaf order: London, Paris, Moscow, Germany, Poland, Russia, north, south, west, east. Axis ticks: 0.2 to 0.6.
Every single-trial classification produces its own conditional probability density estimates, giving rise to its own similarity tree …
Figure (Brain data): similarity tree from a different classification. Leaf order: north, east, south, west, London, Paris, Germany, Moscow, Russia, Poland. Axis ticks: 0.1 to 0.5.
1. We compute M (= 30) single-trial classifications of the data (640 data samples for our 10 words) using random resampling with replacement.
2. For each classification, we find for each word the partial orders of the brain and language data that are significantly correlated.
3. For each of these highly correlated partial order pairs, we find the partial order invariant with respect to both.
We performed 60 classifications; that is, we recomputed the classifications of the brain data using random resampling with replacement.
For half of these 60 classifications we compared the brain data to the WordNet data, and for the other half we compared the brain data to the LSA data.
And we plotted the results.
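The resampling loop in the steps above might be organized along these lines. This is only a skeleton under stated assumptions: how each resample is split into training and test data and which classifier is run are not described in the talk, so the per-round work is left to a caller-supplied function. `bootstrap_rounds`, `total_invariant_orders`, and `count_for_round` are hypothetical names.

```python
import random


def bootstrap_rounds(n_samples=640, n_rounds=30, seed=0):
    """Draw n_rounds index lists, each of size n_samples, sampling with replacement."""
    rng = random.Random(seed)
    return [[rng.randrange(n_samples) for _ in range(n_samples)]
            for _ in range(n_rounds)]


def total_invariant_orders(rounds, count_for_round):
    """Sum a per-round count supplied by the caller.

    count_for_round(indices) is a hypothetical callback that would classify the
    resampled data, compare brain and language partial orders for each word, and
    return how many significant invariant partial orders were found.
    """
    return sum(count_for_round(indices) for indices in rounds)


rounds = bootstrap_rounds()
```

Running it once with a WordNet-based callback and once with an LSA-based callback would mirror the two sets of 30 classifications described on the slide.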
Figure: Partial Orders of Similarity Differences Invariant between EEG-recorded Brain Data and Semantic Representations of Language. Bar chart with one series for WordNet and one for LSA. Vertical axis: Significant Invariant Partial Orders (0 to 300). Horizontal axis: Test Participants (S10, S24, S25, S27, S16, S26, S12, S13, S18).
A higher number of significant structural similarities was found between the brain data and the WordNet data than between the brain data and the LSA data.
This stronger structural similarity between the brain data and the WordNet-derived data supports the contention that, during language comprehension for the complex cognitive task of assessing truth or falsity, the representation of words in the brain has a WordNet-like quality.
This stronger structural similarity is the first evidence we have seen of WordNet-like representations in the brain.
Leacock C Chodorow M (1998) Combining local context and WordNet similarity for word sense identification In Fellbaum (Ed) WordNet An Electronic Lexical Database Cambridge MA MIT Press pp 265-283 Leech G Rayson P Wilson A (2001) Word Frequencies in Written and Spoken English based on the British National Corpus Longman London Lin D (1998) An information-theoretic definition of similarity In Proc 15th International Conf on Machine Learning Morgan Kaufmann San Francisco CA p 296--304 Luce RD (1956) Semiorders and a theory of utility discrimination Econometrica 24 178ndash191 MR0078632 httpwwwjstororgstable1905751 Matsunaga T Yonemori C Tomita E Muramatsu M (2009) Clique-based data mining for related genes in a biomedical database BMC Bioinformatics 2009 Jul 110205 Martin Ph (2003) Correction and Extension of WordNet 17 ICCS 2003 11th International Conference on Conceptual Structures (copy Springer Verlag LNAI 2746 pp 160-173) Dresden Germany July 21-25 2003 Miller GA Nicely P (1955) An analysis of perceptual confusions among some English consonants JAcoustSocAm 272 Miller GA Leacock C Tengi R Bunker RT (1993) A Semantic Concordance In Proceedings of the 3 DARPA Workshop on Human Language Technology Miller GA (1995) WordNet A Lexical Database for English Communications of the ACM Vol 38 No 11 39-41 Mitchell TM SV Shinkareva A Carlson KM Chang VL Malave R A Mason and M A Just (2008) Predicting Human Brain Activity Associated with the Meanings of Nouns Science 320 1191 May 30 2008 DOI 101126science1152876 Murphy B Baroni M amp Poesio M (2009) EEG responds to conceptual stimuli and corpus semantics In Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing pages 619ndash627 Singapore 6-7 August 2009 c 2009 ACL and AFNLP Murphy B amp Poesio M (2010) Detecting semantic category in simultaneous EEGMEG recordings In First workshop on computational neurolinguistics NAACL HLT 2010 (pp 36ndash44) Los Angeles Association for Computational 
Linguistics Murphy B Poesio M Bovolo F Bruzzone L Dalponte M Lakany H (2011) EEG decoding of semantic category reveals distributed representations for single concepts Brain Lang 2011 Apr117(1)12-22 doi 101016jbandl201009013
44 copy Colleen E Crangle 2013
Pedersen T Patwardhan S Michelizzi J (2004) WordNetSimilarity - Measuring the Relatedness of Concepts In Proceedings of Fifth Annual Meeting of the North American Chapter of the Association for Computational Linguistics (NAACL-2004) pp 38-41 Boston May 2004 Pereira F M Botvinick G Detre (2010) Learning semantic features for fMRI data from definitional text Proceedings of the NAACL HLT 2010 First Workshop on Computational Neurolinguistics June 2010Los Angeles USA Association for Computational Linguistics 1mdash9 httpwwwaclweborganthologyW10-0601 Perreau-Guimaraes M Wong DK Uy ET Grosenick L Suppes P (2007) Single-trial classification of MEG recordings IEEE Transactions on Biomedical Engineering 54 436ndash443 Rabinovitch I (1977) The Scott-Suppes theorem on semiorders J Mathematical Psychology 15 (2) 209ndash212 MR0437404 Resnik P (1995) Using information content to evaluate semantic similarity In Proceedings of the 14th International Joint Conference on Artificial Intelligence pages 448-453 Montreal Roach BJ Mathalon DH (2008) Event-related EEG time-frequency analysis an overview of measures and an analysis of early gamma band phase locking in schizophrenia Schizophr Bull 2008 Sep34(5)907-26 Epub 2008 Aug 6 Scott D Suppes P (1958) Foundational aspects of theories of measurement The Journal of Symbolic Logic 23 113ndash128 MR0115919 Suppes P (1972) Axiomatic Set Theory New York Dover Suppes P (1974) The axiomatic method in the empirical sciences L Henkin et al (Eds) Proceedings of the Tarski Symposium Proceedings of Symposia in Pure Mathematics 25 Providence RI American Mathematical Society pp 465-479 Suppes P Lu Z-L Han B (1997) Brain wave recognition of words Proceedings of the National Academy of Sciences 95 14965ndash14969 Suppes P Han B Lu Z-L (1998) Brain-wave recognition of sentences Proceedings of the National Academy of Sciences 95 15861ndash15866 Suppes P Han B Epelboim J Lu Z-L (1999a) Invariance between subjects of brain wave representations of language 
Proceedings of the National Academy of Sciences 96 12953ndash12958 Suppes P Han B Epelboim J Lu Z-L (1999b) Invariance of brain-wave representations of simple visual images and their names Proceedings of the National Academy of Sciences 96 14658-14663 Suppes P Han B (2000) Brain-wave representation of words by superposition of a few sine waves Proceedings of the National Academy of Sciences 97 8738ndash8743 Suppes P Perreau-Guimaraes M Wong DK (2009) Partial Orders of Similarity Differences Invariant Between EEG-Recorded Brain and Perceptual Representations of Language Neural Computation 21 3228ndash3269
45 copy Colleen E Crangle 2013
Suppes P de Barros JA Oas G (2012) Phase-Oscillator computations as neural models of stimulus-response conditioning and response selection Journal of Mathematical Psychology 56 95-117 Turney P (2001) Mining the Web for Synonyms PMI-IR versus LSA on TOEFL In L De Raedt amp P Flach (Eds) Proceedings of the Twelfth European Conference on Machine Learning (ECML-2001) (pp 491-502) Freiburg Germany Vassilieva E Pinto G de Barros JA Suppes P (2011) Learning Pattern Recognition Through Quasi-Synchronization of Phase Oscillators IEEE Transactions on Neural Networks Vol 22 No 1 January 2011 Vigliocco G Warren J Siri S Arciuli J Scott S Wise R (2006) The role of semantics and grammatical class in the neural representation of words Cereb Cortex 2006 Dec16(12)1790-6 Epub 2006 Jan 18 Wong DK Perreau-Guimaraes M Uy ET Suppes P (2004) Classification of individual trials based on the best independent component of EEG-recorded sentences Neurocomputing 61 479-484 Wong DK Uy ET Perreau-Guimaraes M Yang W Suppes P (2006) Interpretation of perceptron weights as constructed time series for EEG classification Neurocomputing 70 373-383 Wong DK Grosenick L Uy ET Perreau-Guimaraes M Carvalhaes CG Desain P Suppes P (2008) Quantifying inter-subject agreement in brain-imaging analyses NeuroImage 39 1051ndash1063 Woods W (1975) Whatrsquos in a link Foundations for semantic networks In Bobrow D Collins A eds Representation and understanding New York Academic Press 197535-82 Wu Z Palmer M (1994) Verb Semantics and Lexical Selection In Proceedings of the 32nd Annual Meeting of the Association for Computational Linguistics Las Cruces New Mexico pp 133--138
46 copy Colleen E Crangle 2013
The relative frequencies m_iq form an N-by-N estimate of the conditional probability densities, designated by the matrix P = (p_iq), that a randomly chosen test sample from class ω_i will be classified as belonging to class ω_q.
Conditional probability density estimates from the classification of brain wave data for London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia.
Rows: ω_i (class of the test sample). Columns: ω_q (predicted, "classified as").

         London  Moscow  Paris   north   south   east    west    Germany Poland  Russia
London   0.25    0.163   0.138   0.075   0.038   0.05    0.063   0.088   0.075   0.063
Moscow   0.144   0.333   0.133   0.033   0.011   0.056   0.056   0.089   0.067   0.078
Paris    0.175   0.188   0.125   0.038   0.038   0.038   0.05    0.1     0.138   0.113
north    0.067   0.017   0.017   0.283   0.167   0.15    0.1     0.1     0.067   0.033
south    0.014   0.014   0.071   0.114   0.271   0.171   0.157   0.029   0.086   0.071
east     0.05    0       0       0.133   0.15    0.383   0.15    0.05    0       0.083
west     0.057   0.043   0       0.171   0.114   0.143   0.229   0.086   0.1     0.057
Germany  0.075   0.175   0.1     0.05    0.125   0.025   0.05    0.275   0.025   0.1
Poland   0.15    0.1     0.125   0.05    0.05    0       0.025   0.15    0.25    0.1
Russia   0.16    0.04    0.08    0.04    0.1     0.02    0.12    0.1     0.08    0.26
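As a minimal sketch of how such estimates arise, each row of a confusion-count matrix can be divided by its row total to give the relative frequencies m_iq. The counts below are an assumption: they are integer counts that reproduce the London row of the table if 80 London test samples are assumed. The function name is illustrative and not taken from the talk.

```python
def relative_frequencies(counts):
    """Turn one row of confusion counts into relative frequencies m_iq."""
    total = sum(counts)
    if total == 0:
        raise ValueError("row has no test samples")
    return [c / total for c in counts]

WORDS = ["London", "Moscow", "Paris", "north", "south",
         "east", "west", "Germany", "Poland", "Russia"]

# Assumed counts: how often a London sample was classified as each word.
london_counts = [20, 13, 11, 6, 3, 4, 5, 7, 6, 5]

london_row = relative_frequencies(london_counts)
for word, m in zip(WORDS, london_row):
    print(f"{word:8s} {m:.3f}")
```

Each row of the estimate sums to 1 by construction, which is why the rows of the table add up to 1 within rounding.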
[Figure: Hierarchical cluster tree (similarity tree) computed from the conditional probability density estimates for the classification of 640 brain wave samples for London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia. Axis ticks from 0.2 to 0.6. Leaf order: London, Paris, Moscow, Germany, Poland, Russia, north, south, west, east.]
LANGUAGE
WordNet: a lexical database of English organized around sets of cognitive synonyms, restricted to the senses that are relevant to the geography sentences.
Latent Semantic Analysis (LSA): a statistical method to extract measures of word similarity from selected sets of documents such as novels, newspaper articles and textbooks.
BRAIN
The 10-by-10 table of conditional probability density estimates for London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia (same values as the table given earlier).
WORDNET
WordNet is a human-annotated lexical database of English in which nouns, verbs, adjectives and adverbs are grouped into sets of cognitive synonyms (called synsets), each synset expressing a distinct concept. See Miller (1995), Fellbaum (1998) and http://wordnet.princeton.edu.
The synsets are related to each other primarily through the hypernymy and hyponymy relations for nouns. Other relations in WordNet are part-whole (holonym), member of (meronym), has instance, and so on.
Hyponymy, often referred to as the is-a relation in computational discussions, is defined as follows: a concept represented by a lexical item Li is said to be a hyponym of the concept represented by a lexical item Lk if native speakers of English accept sentences of the form "An Li is a kind of Lk". Conversely, Lk is the hypernym of Li. A hypernym is therefore a more general concept and a hyponym a more specific concept.
WORDNET
Multiple word senses, organized into sets of cognitive synonyms called "synsets". In the examples, the number after # is the sense number.
(n) Paris#1, City of Light#1, French capital#1, capital of France#1 (the capital and largest city of France; and international center of culture and commerce)
(n) Paris#2, genus Paris#1 (sometimes placed in subfamily Trilliaceae)
(n) Paris#3 ((Greek mythology) the prince of Troy who abducted Helen from her husband Menelaus and provoked the Trojan War)
(n) Paris#4 (a town in northeastern Texas)

(n) city#1, metropolis#1, urban center#1 (a large and densely populated urban area; may include several independent administrative districts) [relations listed: direct hyponym, full hyponym, part meronym, has instance]
(n) city#2 (an incorporated administrative district established by state charter)
(n) city#3, metropolis#2 (people living in a large densely populated municipality)

(n) east#1, due east#1, eastward#1, E#3 (the cardinal compass point that is at 90 degrees)
(n) East#2, Orient#1 (the countries of Asia)
(n) East#3, eastern United States#1 (the region of the United States lying to the north of the Ohio River and to the east of the Mississippi River)
(n) east#4 (the direction corresponding to the eastward cardinal compass point)
(n) east#5 (a location in the eastern part of a country, region or city)
LATENT SEMANTIC ANALYSIS (LSA)
LSA is a statistical technique for extracting, from large collections of documents, a measure of how similar two words are to each other in terms of the patterns of their co-occurrences within those documents. See Deerwester et al. 1990, Landauer and Dumais 1997, Landauer et al. 1998.
The underlying idea is that if, for each word, you take into account all the contexts in which it does and does not appear, you get for all the words a set of mutual constraints that represent how similar any two words are to each other.
The similarity judgments produced by latent semantic analysis have been shown to correspond to some extent to human judgments of similarity. After training on about 2000 pages of English text, it scored as well as average test-takers on the synonym portion of the Test of English as a Foreign Language. After training on a psychology textbook, it achieved a passing score on a multiple-choice exam.
We used the application at http://lsa.colorado.edu to compute similarity matrices in term space for our set of words. The computation was based on about 38000 college-level texts (novels, newspaper articles, ...). A maximum of 300 factors was permitted in the analysis.
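The comparison step behind such similarity scores can be illustrated with a small sketch. It computes the cosine similarity between word vectors of co-occurrence counts; full LSA would first reduce these vectors with a singular value decomposition, and that step is omitted here. The counts and the five "documents" are invented for illustration and are not the data used in the talk.

```python
import math

# Hypothetical word-by-document counts (one row per word, 5 documents).
COUNTS = {
    "north":  [4, 3, 0, 0, 1],
    "south":  [3, 4, 0, 1, 0],
    "Paris":  [0, 0, 5, 2, 0],
    "London": [0, 1, 3, 4, 0],
}

def cosine(u, v):
    """Cosine of the angle between two count vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def similarity(w1, w2):
    """Similarity of two words from their (unreduced) count vectors."""
    return cosine(COUNTS[w1], COUNTS[w2])

print(round(similarity("north", "south"), 2))
print(round(similarity("north", "Paris"), 2))
```

Words that appear in the same documents get a score near 1, and words that never share a document get 0.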
         London  Moscow  Paris   north   south   east    west    Germany Poland  Russia
London   1       0.17    0.37    0.13    0.12    0.12    0.14    0.17    0.16    0.16
Moscow   0.17    1       0.18    0.13    0.08    0.22    0.14    0.37    0.65    0.69
Paris    0.37    0.18    1       0.1     0.04    0.08    0.09    0.5     0.31    0.36
north    0.13    0.13    0.1     1       0.89    0.6     0.61    0.07    0.09    0.14
south    0.12    0.08    0.04    0.89    1       0.5     0.55    0.02    0.05    0.05
east     0.12    0.22    0.08    0.6     0.5     1       0.85    0.22    0.29    0.3
west     0.14    0.14    0.09    0.61    0.55    0.85    1       0.24    0.26    0.23
Germany  0.17    0.37    0.5     0.07    0.02    0.22    0.24    1       0.85    0.81
Poland   0.16    0.65    0.31    0.09    0.05    0.29    0.26    0.85    1       0.87
Russia   0.16    0.69    0.36    0.14    0.05    0.3     0.23    0.81    0.87    1

Semantic similarity matrix derived from LSA for the set of words London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia.
[Figure: Hierarchical cluster tree computed from the pair-wise Latent Semantic Analysis (LSA) scores of similarity for London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia, based on about 38000 college-level texts (novels, newspaper articles, ...). Axis ticks from 0.1 to 0.6. Leaf order: east, west, north, south, London, Paris, Moscow, Germany, Poland, Russia.]
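A tree of this kind can be sketched with a simple agglomerative clustering of the LSA similarity matrix. Two things here are assumptions, since the talk does not state them: the distance between two words is taken as 1 minus their similarity, and clusters are merged by average linkage. The function name is illustrative.

```python
from itertools import combinations

WORDS = ["London", "Moscow", "Paris", "north", "south",
         "east", "west", "Germany", "Poland", "Russia"]

# LSA similarity matrix from the talk (rows and columns in WORDS order).
LSA = [
    [1.00, 0.17, 0.37, 0.13, 0.12, 0.12, 0.14, 0.17, 0.16, 0.16],
    [0.17, 1.00, 0.18, 0.13, 0.08, 0.22, 0.14, 0.37, 0.65, 0.69],
    [0.37, 0.18, 1.00, 0.10, 0.04, 0.08, 0.09, 0.50, 0.31, 0.36],
    [0.13, 0.13, 0.10, 1.00, 0.89, 0.60, 0.61, 0.07, 0.09, 0.14],
    [0.12, 0.08, 0.04, 0.89, 1.00, 0.50, 0.55, 0.02, 0.05, 0.05],
    [0.12, 0.22, 0.08, 0.60, 0.50, 1.00, 0.85, 0.22, 0.29, 0.30],
    [0.14, 0.14, 0.09, 0.61, 0.55, 0.85, 1.00, 0.24, 0.26, 0.23],
    [0.17, 0.37, 0.50, 0.07, 0.02, 0.22, 0.24, 1.00, 0.85, 0.81],
    [0.16, 0.65, 0.31, 0.09, 0.05, 0.29, 0.26, 0.85, 1.00, 0.87],
    [0.16, 0.69, 0.36, 0.14, 0.05, 0.30, 0.23, 0.81, 0.87, 1.00],
]

def average_linkage(names, sim):
    """Agglomerative clustering; distance = 1 - similarity (an assumption)."""
    def dist(a, b):
        pairs = [(i, j) for i in a for j in b]
        return sum(1.0 - sim[i][j] for i, j in pairs) / len(pairs)

    clusters = [(i,) for i in range(len(names))]
    merges = []
    while len(clusters) > 1:
        # Merge the two clusters with the smallest average distance.
        a, b = min(combinations(clusters, 2), key=lambda p: dist(*p))
        merges.append(([names[i] for i in a], [names[i] for i in b],
                       round(dist(a, b), 3)))
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
    return merges

merges = average_linkage(WORDS, LSA)
for left, right, height in merges:
    print(f"{height:.3f}  {left} + {right}")
```

Each printed line is one merge, with the height at which the two branches would join in the tree. A different linkage rule would give different heights and possibly a different tree.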
LSA provides a straightforward measure of similarity between words.
For WordNet, several different measures of similarity have been devised, e.g.:
Path length between synsets.
Information content: a corpus-based measure of the specificity of a concept, measured in terms of the frequency of occurrence of the concept in a corpus. The corpus used is the human-annotated, sense-tagged corpus SemCor (Miller et al. 1993), which links every word in the Brown Corpus to its appropriate WordNet sense. Scaled in various ways.
Vector-space models: these work by forming second-order co-occurrence vectors from the WordNet definitions of concepts, known as glosses.
We used five measures in our computations of similarity and took the average score, using each of the relevant senses.
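To make the path-based idea concrete, here is a minimal sketch of the Wu-Palmer (wup) score, 2 * depth(lcs) / (depth(a) + depth(b)), where lcs is the deepest concept that subsumes both words. The taxonomy below is a hypothetical toy hierarchy, not the real WordNet graph, and the helper names are illustrative. The last function shows the averaging of several measures for one word pair.

```python
# Hypothetical toy taxonomy (child -> parent); not the real WordNet hierarchy.
PARENT = {
    "location": "entity", "direction": "entity",
    "city": "location", "country": "location",
    "London": "city", "Paris": "city",
    "Germany": "country", "Poland": "country",
    "north": "direction", "south": "direction",
}

def path_to_root(node):
    """Return [node, parent, ..., root]."""
    path = [node]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path

def depth(node):
    """Depth counted in nodes, so the root has depth 1."""
    return len(path_to_root(node))

def wup(a, b):
    """Wu-Palmer similarity: 2 * depth(lcs) / (depth(a) + depth(b))."""
    ancestors_a = set(path_to_root(a))
    # Walking up from b, the first shared node is the deepest common subsumer.
    lcs = next(n for n in path_to_root(b) if n in ancestors_a)
    return 2 * depth(lcs) / (depth(a) + depth(b))

def average_score(scores):
    """Average several similarity scores for one word pair."""
    return sum(scores) / len(scores)

print(wup("London", "Paris"))
print(wup("London", "Germany"))
print(wup("London", "north"))
```

In this toy hierarchy two cities score higher than a city and a country, which in turn score higher than a city and a compass direction. The WordNet-based matrix in the talk shows the same qualitative pattern.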
Semantic similarity matrix derived from WordNet for the set of words London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia, using senses relevant to the geography of Europe and five measures of similarity: wup (path length), lin and jcn (information content), and gv and pgv (vector space measures).

         London  Moscow  Paris   north   south   east    west    Germany Poland  Russia
London   1       0.396   0.466   0.106   0.103   0.076   0.078   0.322   0.299   0.303
Moscow   0.396   1       0.393   0.095   0.094   0.062   0.07    0.286   0.281   0.288
Paris    0.466   0.393   1       0.106   0.104   0.074   0.077   0.327   0.308   0.307
north    0.106   0.095   0.106   1       0.228   0.179   0.21    0.123   0.132   0.111
south    0.103   0.094   0.104   0.228   1       0.172   0.212   0.115   0.107   0.109
east     0.076   0.062   0.074   0.179   0.172   1       0.216   0.093   0.08    0.077
west     0.078   0.07    0.077   0.21    0.212   0.216   1       0.087   0.082   0.083
Germany  0.322   0.286   0.327   0.123   0.115   0.093   0.087   1       0.589   0.409
Poland   0.299   0.281   0.308   0.132   0.107   0.08    0.082   0.589   1       0.403
Russia   0.303   0.288   0.307   0.111   0.109   0.077   0.083   0.409   0.403   1
[Figure: Hierarchical cluster tree computed from pairwise WordNet-based semantic similarity scores for London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia, restricted to senses related to the geography of Europe. Axis ticks from 0.2 to 0.6. Leaf order: Germany, Poland, Russia, London, Paris, Moscow, north, south, west, east.]
BACK TO THE BRAIN ...
London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia
Now let's see how to compare the EEG data and the language data ...
Some of the similarity trees show remarkable congruence between the brain and semantic data.
Where exactly does that congruence lie?
Can we devise a quantitative measure of the nature and strength of that congruence?
[Figure: two similarity trees side by side, both with axis ticks from 0.2 to 0.6. BRAIN DATA tree, leaf order: London, Paris, Moscow, Germany, Poland, Russia, north, south, west, east. LANGUAGE DATA (WordNet-based) tree, leaf order: Germany, Poland, Russia, London, Paris, Moscow, north, south, west, east.]
[Figure: WordNet-based semantic similarities and EEG conditional probability estimates for London relative to London (L), Moscow (M), Paris (P), north (n), south (s), east (e), west (w), Germany (G), Poland (Po) and Russia (R).]
The Spearman rank correlation for the two sequences in the figure is 0.99, with one-sided significance of 1.84E-10.
BRAIN DATA
For each word ω we compute from the conditional probability density estimates a ternary relation R such that R(ω, ω1, ω2) if and only if, with respect to word ω, the conditional probability for word ω1 is larger than the conditional probability for word ω2; that is, if and only if ω1's similarity difference with ω is smaller than ω2's similarity difference with ω.
R is an ordinal relation of similarity differences, a partial order that is irreflexive, asymmetric and transitive.
LANGUAGE DATA
For each word ω we compute from the semantic similarity matrix a ternary relation R such that R(ω, ω1, ω2) if and only if the similarity difference of ω1 with ω is smaller than the similarity difference of ω2 with ω; that is, ω1 is more similar to ω than is ω2.
R is an ordinal relation of similarity differences, a partial order that is irreflexive, asymmetric and transitive.
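The construction of R for one reference word can be sketched as follows, using the language-data reading in which R(ω, ω1, ω2) holds when ω1 is more similar to ω than ω2 is. The scores are the London row of the WordNet-based similarity matrix; the function names are illustrative, and pairs with equal scores are simply left unordered.

```python
from itertools import permutations

WORDS = ["London", "Moscow", "Paris", "north", "south",
         "east", "west", "Germany", "Poland", "Russia"]

# WordNet-based similarity of each word to London (London row of the matrix).
LONDON_ROW = [1, 0.396, 0.466, 0.106, 0.103, 0.076, 0.078, 0.322, 0.299, 0.303]

def similarity_order(words, row):
    """Pairs (w1, w2) such that w1 is strictly more similar to the
    reference word than w2; ties are left unordered."""
    score = dict(zip(words, row))
    return {(w1, w2) for w1, w2 in permutations(words, 2)
            if score[w1] > score[w2]}

def is_strict_partial_order(relation):
    """Check that a set of pairs is irreflexive, asymmetric and transitive."""
    irreflexive = all(a != b for a, b in relation)
    asymmetric = all((b, a) not in relation for a, b in relation)
    transitive = all((a, d) in relation
                     for a, b in relation for c, d in relation if b == c)
    return irreflexive and asymmetric and transitive

R_london = similarity_order(WORDS, LONDON_ROW)
print(("Paris", "Moscow") in R_london)
print(is_strict_partial_order(R_london))
```

The same function applies to a row of conditional probability estimates, with larger probability playing the role of greater similarity.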
London

Language data       Brain data
London   1.000      London   0.275
Paris    0.466      Paris    0.133
Moscow   0.396      Moscow   0.108
Germany  0.322      Germany  0.075
Russia   0.303      north    0.042
Poland   0.299      Russia   0.033
north    0.106      Poland   0.025
south    0.103      east     0.008
west     0.078      south    0.000
east     0.076      west     0.000

Partial orders for London derived from the WordNet semantic similarities of Table 2 and the conditional probability estimates for the brain data of Table 5.
[Figure: diagrams of the two partial orders. Surviving node labels: Poland, north, south, west, east (language data) and north, Poland, east, south, west (brain data).]
We follow the approach described in Suppes (1974) for the axiomatization of the theory of differences in utility preference or the theory of differences in psychological intensity.
Suppes P (1974). The axiomatic method in the empirical sciences. In L Henkin et al (Eds), Proceedings of the Tarski Symposium, Proceedings of Symposia in Pure Mathematics 25. Providence, RI: American Mathematical Society, pp 465-479.
Brain data: the relational structure (A, R) constructed from R and the finite set A of classes ω1, ω2, ..., ωN, together with the N partial orders constructed from the N-by-N estimate for the conditional probability densities.
Language data: the relational structure (A, R) constructed from R and the finite set A of classes ω1, ω2, ..., ωN, together with the N partial orders constructed from the N-by-N similarity matrix.
For each word ω we compare the partial order of the brain data with the partial order of the language data using Spearman's rank correlation coefficient, which we interpret in the usual way to determine whether or not the correlation is statistically significant.
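The comparison can be sketched in a few lines. The code below computes Spearman's rank correlation as the Pearson correlation of the two rank vectors, giving tied values their mean rank, and applies it to the London language and brain scores listed for the partial-order example. It is a plain-Python illustration rather than the authors' implementation, and it does not compute the significance level.

```python
def average_ranks(values):
    """Rank from largest (rank 1) to smallest; ties share the mean rank."""
    order = sorted(range(len(values)), key=lambda i: -values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while (end + 1 < len(order)
               and values[order[end + 1]] == values[order[start]]):
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks

def spearman(x, y):
    """Spearman's rho: Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5

# Scores relative to London, both lists in the same word order.
WORDS = ["London", "Paris", "Moscow", "Germany", "Russia",
         "Poland", "north", "south", "west", "east"]
language = [1.000, 0.466, 0.396, 0.322, 0.303, 0.299, 0.106, 0.103, 0.078, 0.076]
brain = [0.275, 0.133, 0.108, 0.075, 0.033, 0.025, 0.042, 0.000, 0.000, 0.008]

rho = spearman(language, brain)
print(round(rho, 3))
```

A value close to 1 means the two data sets order the other words in nearly the same way relative to London.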
For those instances in which the brain and language partial orders are significantly correlated, we find the partial order that is invariant with respect to the brain and language data.
Here are two more examples.
[Figure: Significant Invariance - Paris - Spearman 0.88 (p = 6.6795e-004). Node labels in extraction order: London, Moscow; Paris; north; south, east; west; Germany, Poland; Russia.]
[Figure: Significant Invariance - Paris - Spearman 0.90 (p = 3.8716e-004). Node labels in extraction order: London, Moscow, Paris; north; south, east; west; Germany, Poland, Russia.]
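One way to read "invariant" is sketched below, under the assumption that the invariant partial order keeps exactly those ordered pairs on which the brain and language orders agree, that is, the intersection of the two relations. The talk does not spell out the construction, so this is an illustration only. The scores are the London values from the earlier partial-order example, and the function name is illustrative.

```python
from itertools import permutations

def strict_order(scores):
    """Pairs (w1, w2) where w1 has a strictly higher score than w2."""
    return {(a, b) for a, b in permutations(scores, 2)
            if scores[a] > scores[b]}

language = {"London": 1.000, "Paris": 0.466, "Moscow": 0.396,
            "Germany": 0.322, "Russia": 0.303, "Poland": 0.299,
            "north": 0.106, "south": 0.103, "west": 0.078, "east": 0.076}
brain = {"London": 0.275, "Paris": 0.133, "Moscow": 0.108,
         "Germany": 0.075, "north": 0.042, "Russia": 0.033,
         "Poland": 0.025, "east": 0.008, "south": 0.000, "west": 0.000}

# Assumption: keep only the pairs ordered the same way in both data sets.
invariant = strict_order(language) & strict_order(brain)

print(len(strict_order(language)), len(invariant))
print(("Russia", "north") in invariant, ("north", "Russia") in invariant)
```

Pairs on which the two data sets disagree, such as Russia and north, drop out in both directions, so the result is again a strict partial order but no longer a complete ranking.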
Brain data
[Figure: Another hierarchical cluster tree (similarity tree) computed from the conditional probability density estimates for the classification of 640 brain wave samples for London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia. Axis ticks from 0.2 to 0.6. Leaf order: London, Paris, Moscow, Germany, Poland, Russia, north, south, west, east.]
Every single-trial classification produces its own conditional probability density estimates, giving rise to its own similarity tree ...
[Figure: a second brain-data similarity tree. Axis ticks from 0.1 to 0.5. Leaf order: north, east, south, west, London, Paris, Germany, Moscow, Russia, Poland.]
1. We compute M (= 30) single-trial classifications of the data (640 data samples for our 10 words) using random resampling with replacement.
2. For each classification we find, for each word, the partial orders of the brain and language data that are significantly correlated.
3. For each of these significantly correlated partial order pairs we find the partial order invariant with respect to both.
In total we performed 60 classifications; that is, we recomputed the classifications of the brain data using random resampling with replacement.
For half of these 60 classifications we compared the brain data to the WordNet data, and for the other half we compared the brain data to the LSA data.
And we plotted the results.
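The resampling step in the procedure above can be sketched as drawing, for each of the M rounds, 640 sample indices with replacement. How each resample is then split for training and testing the classifier is not stated in the talk, so only the index drawing is shown; the function name and the fixed seed are illustrative choices.

```python
import random

def bootstrap_indices(n_samples, n_rounds, seed=0):
    """Draw n_rounds index lists, each sampled with replacement
    from range(n_samples)."""
    rng = random.Random(seed)
    return [rng.choices(range(n_samples), k=n_samples)
            for _ in range(n_rounds)]

rounds = bootstrap_indices(n_samples=640, n_rounds=30)
for indices in rounds[:3]:
    # Total draws versus distinct samples actually selected.
    print(len(indices), len(set(indices)))
```

Because sampling is with replacement, each round repeats some samples and omits others, so every round gives a somewhat different confusion matrix and hence a different similarity tree.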
[Figure: bar chart titled "Partial Orders of Similarity Differences Invariant between EEG-recorded Brain Data and Semantic Representations of Language". Vertical axis: Significant Invariant Partial Orders (0 to 300). Horizontal axis: Test Participants (S10, S24, S25, S27, S16, S26, S12, S13, S18). Series: WordNet, LSA.]
A higher number of significant structural similarities was found between the brain data and the WordNet data than between the brain data and the LSA data.
This stronger structural similarity between the brain data and the WordNet-derived data supports the contention that, during language comprehension for the complex cognitive task of assessing truth or falsity, the representation of words in the brain has a WordNet-like quality.
This stronger structural similarity is the first evidence we have seen of WordNet-like representations in the brain.
Banerjee S, Pedersen T (2003). Extended Gloss Overlaps as a Measure of Semantic Relatedness. In Proceedings of the Eighteenth International Joint Conference on Artificial Intelligence, pp 805-810, August 9-15, 2003, Acapulco, Mexico.
Banerjee S, Pedersen T (2002). An Adapted Lesk Algorithm for Word Sense Disambiguation using WordNet. In Proceedings of the Third International Conference on Intelligent Text Processing and Computational Linguistics, pp 136-145, February 17-23, 2002, Mexico City.
Baroni M, Murphy B, Barbu E, Poesio M (2010). Strudel: A Corpus-Based Semantic Model Based on Properties and Types. Cognitive Science 34(2), pp 222-254. DOI: 10.1111/j.1551-6709.2009.01068.x
Blei D, Ng A, Jordan M (2003). Latent Dirichlet allocation. Journal of Machine Learning Research 3, 993–1022.
Brants T, Franz A (2006). www.ldc.upenn.edu/Catalog/CatalogEntry.jsp?catalogId=LDC2006T13. Linguistic Data Consortium, Philadelphia.
Budanitsky A, Hirst G (2001). Semantic distance in WordNet: an experimental, application-oriented evaluation of five measures. In Proceedings of the NAACL 2001 Workshop on WordNet and Other Lexical Resources, Pittsburgh, June 2001. http://www.seas.smu.edu/~rada/mwnw/papers/WNW-NAACL-101.pdf.gz
Collins AM, Loftus EF (1975). A spreading-activation theory of semantic processing. Psychological Review 82(6), 407-428.
Crestani F (1997). Application of Spreading Activation Techniques in Information Retrieval. Artificial Intelligence Review 11, 453–482.
Deerwester S, Dumais ST, Furnas GW, Landauer TK, Harshman R (1990). Indexing By Latent Semantic Analysis. Journal of the American Society for Information Science 41, 391-407.
Devereux B, Kelly C, Korhonen A (2010). Using fMRI activation to conceptual stimuli to evaluate methods for extracting conceptual representations from corpora. Proceedings of the NAACL HLT 2010 First Workshop on Computational Neurolinguistics, June 2010, Los Angeles, USA. Association for Computational Linguistics, 70–78.
Fellbaum C (Ed) (1998). WordNet: An Electronic Lexical Database. Cambridge, MA: MIT Press.
Hauk O, Davis MH, Ford M, Pulvermüller F, Marslen-Wilson WD (2006). The time course of visual word recognition as revealed by linear regression analysis of ERP data. Neuroimage 2006 May 1;30(4):1383-400. Epub 2006 Feb 7.
Hirst G (1988). Resolving lexical ambiguity computationally with spreading activation and Polaroid Words. In Small SL, Cottrell GW, Tanenhaus MK (editors), Lexical ambiguity resolution: Perspectives from psycholinguistics, neuropsychology and artificial intelligence. San Mateo, CA: Morgan Kaufmann Publishers, 73–107.
Jelodar AB, Alizadeh M, Khadivi S (2010). WordNet Based Features for Predicting Brain Activity associated with meanings of noun. Proceedings of the NAACL HLT 2010 First Workshop on Computational Neurolinguistics, June 2010, Los Angeles, USA. Association for Computational Linguistics, 18–26.
Jiang J, Conrath D (1997). Semantic similarity based on corpus statistics and lexical taxonomy. In Proceedings of International Conference on Research in Computational Linguistics, Taiwan, 19-33.
Just MA, Cherkassky VL, Aryal S, Mitchell TM (2010). A Neurosemantic Theory of Concrete Noun Representation Based on the Underlying Brain Codes. PLoS ONE 5(1): e8622. doi:10.1371/journal.pone.0008622
Karp RM (1972). Reducibility among combinatorial problems. In Complexity of Computer Computations, Proc. Sympos. IBM Thomas J. Watson Res. Center, Yorktown Heights, NY (Ed. RE Miller and JW Thatcher). New York: Plenum, pp 85-103.
Kelly C, Devereux B, Korhonen A (2010). Acquiring Human-like Feature-Based Conceptual Representations from Corpora. Proceedings of the NAACL HLT 2010 First Workshop on Computational Neurolinguistics, June 2010, Los Angeles, USA. Association for Computational Linguistics, 61–69.
Kriegeskorte N, Mur M, Bandettini PA (2008). Representational similarity analysis – connecting the branches of systems neuroscience. Frontiers in Systems Neuroscience. doi:10.3389/neuro.06.004.2008
Kutas M, Hillyard SA (1984). Brain potentials during reading reflect word expectancy and semantic association. Nature 307, 161-163.
Kutas M, Federmeier KD (2011). Thirty years and counting: Finding meaning in the N400 component of the event related brain potential (ERP). Annual Review of Psychology 62, pp 621-647.
Landauer TK, Dumais ST (1997). A solution to Plato's problem: The Latent Semantic Analysis theory of the acquisition, induction, and representation of knowledge. Psychological Review 104, 211-240.
Landauer TK, Foltz PW, Laham D (1998). Introduction to Latent Semantic Analysis. Discourse Processes 25, 259-284.
Leacock C, Chodorow M (1998). Combining local context and WordNet similarity for word sense identification. In Fellbaum (Ed), WordNet: An Electronic Lexical Database. Cambridge, MA: MIT Press, pp 265-283.
Leech G, Rayson P, Wilson A (2001). Word Frequencies in Written and Spoken English: based on the British National Corpus. Longman, London.
Lin D (1998). An information-theoretic definition of similarity. In Proc. 15th International Conf. on Machine Learning. Morgan Kaufmann, San Francisco, CA, pp 296-304.
Luce RD (1956). Semiorders and a theory of utility discrimination. Econometrica 24, 178–191. MR0078632. http://www.jstor.org/stable/1905751
Matsunaga T, Yonemori C, Tomita E, Muramatsu M (2009). Clique-based data mining for related genes in a biomedical database. BMC Bioinformatics 2009 Jul 1;10:205.
Martin Ph (2003). Correction and Extension of WordNet 1.7. ICCS 2003, 11th International Conference on Conceptual Structures (© Springer Verlag, LNAI 2746, pp 160-173), Dresden, Germany, July 21-25, 2003.
Miller GA, Nicely P (1955). An analysis of perceptual confusions among some English consonants. J. Acoust. Soc. Am. 27(2).
Miller GA, Leacock C, Tengi R, Bunker RT (1993). A Semantic Concordance. In Proceedings of the 3rd DARPA Workshop on Human Language Technology.
Miller GA (1995). WordNet: A Lexical Database for English. Communications of the ACM 38(11), 39-41.
Mitchell TM, Shinkareva SV, Carlson A, Chang KM, Malave VL, Mason RA, Just MA (2008). Predicting Human Brain Activity Associated with the Meanings of Nouns. Science 320, 1191, May 30, 2008. DOI: 10.1126/science.1152876
Murphy B, Baroni M, Poesio M (2009). EEG responds to conceptual stimuli and corpus semantics. In Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing, pages 619–627, Singapore, 6-7 August 2009. ACL and AFNLP.
Murphy B, Poesio M (2010). Detecting semantic category in simultaneous EEG/MEG recordings. In First Workshop on Computational Neurolinguistics, NAACL HLT 2010 (pp 36–44). Los Angeles: Association for Computational Linguistics.
Murphy B, Poesio M, Bovolo F, Bruzzone L, Dalponte M, Lakany H (2011). EEG decoding of semantic category reveals distributed representations for single concepts. Brain Lang 2011 Apr;117(1):12-22. doi:10.1016/j.bandl.2010.09.013
Pedersen T, Patwardhan S, Michelizzi J (2004). WordNet::Similarity - Measuring the Relatedness of Concepts. In Proceedings of Fifth Annual Meeting of the North American Chapter of the Association for Computational Linguistics (NAACL-2004), pp 38-41, Boston, May 2004.
Pereira F, Botvinick M, Detre G (2010). Learning semantic features for fMRI data from definitional text. Proceedings of the NAACL HLT 2010 First Workshop on Computational Neurolinguistics, June 2010, Los Angeles, USA. Association for Computational Linguistics, 1–9. http://www.aclweb.org/anthology/W10-0601
Perreau-Guimaraes M, Wong DK, Uy ET, Grosenick L, Suppes P (2007). Single-trial classification of MEG recordings. IEEE Transactions on Biomedical Engineering 54, 436–443.
Rabinovitch I (1977). The Scott-Suppes theorem on semiorders. J. Mathematical Psychology 15(2), 209–212. MR0437404.
Resnik P (1995). Using information content to evaluate semantic similarity. In Proceedings of the 14th International Joint Conference on Artificial Intelligence, pages 448-453, Montreal.
Roach BJ, Mathalon DH (2008). Event-related EEG time-frequency analysis: an overview of measures and an analysis of early gamma band phase locking in schizophrenia. Schizophr Bull 2008 Sep;34(5):907-26. Epub 2008 Aug 6.
Scott D, Suppes P (1958). Foundational aspects of theories of measurement. The Journal of Symbolic Logic 23, 113–128. MR0115919.
Suppes P (1972). Axiomatic Set Theory. New York: Dover.
Suppes P (1974). The axiomatic method in the empirical sciences. In L Henkin et al (Eds), Proceedings of the Tarski Symposium, Proceedings of Symposia in Pure Mathematics 25. Providence, RI: American Mathematical Society, pp 465-479.
Suppes P, Lu Z-L, Han B (1997). Brain wave recognition of words. Proceedings of the National Academy of Sciences 94, 14965–14969.
Suppes P, Han B, Lu Z-L (1998). Brain-wave recognition of sentences. Proceedings of the National Academy of Sciences 95, 15861–15866.
Suppes P, Han B, Epelboim J, Lu Z-L (1999a). Invariance between subjects of brain wave representations of language. Proceedings of the National Academy of Sciences 96, 12953–12958.
Suppes P, Han B, Epelboim J, Lu Z-L (1999b). Invariance of brain-wave representations of simple visual images and their names. Proceedings of the National Academy of Sciences 96, 14658-14663.
Suppes P, Han B (2000). Brain-wave representation of words by superposition of a few sine waves. Proceedings of the National Academy of Sciences 97, 8738–8743.
Suppes P, Perreau-Guimaraes M, Wong DK (2009). Partial Orders of Similarity Differences Invariant Between EEG-Recorded Brain and Perceptual Representations of Language. Neural Computation 21, 3228–3269.
Suppes P, de Barros JA, Oas G (2012). Phase-Oscillator computations as neural models of stimulus-response conditioning and response selection. Journal of Mathematical Psychology 56, 95-117.
Turney P (2001). Mining the Web for Synonyms: PMI-IR versus LSA on TOEFL. In L De Raedt & P Flach (Eds), Proceedings of the Twelfth European Conference on Machine Learning (ECML-2001) (pp 491-502). Freiburg, Germany.
Vassilieva E, Pinto G, de Barros JA, Suppes P (2011). Learning Pattern Recognition Through Quasi-Synchronization of Phase Oscillators. IEEE Transactions on Neural Networks 22(1), January 2011.
Vigliocco G, Warren J, Siri S, Arciuli J, Scott S, Wise R (2006). The role of semantics and grammatical class in the neural representation of words. Cereb Cortex 2006 Dec;16(12):1790-6. Epub 2006 Jan 18.
Wong DK, Perreau-Guimaraes M, Uy ET, Suppes P (2004). Classification of individual trials based on the best independent component of EEG-recorded sentences. Neurocomputing 61, 479-484.
Wong DK, Uy ET, Perreau-Guimaraes M, Yang W, Suppes P (2006). Interpretation of perceptron weights as constructed time series for EEG classification. Neurocomputing 70, 373-383.
Wong DK, Grosenick L, Uy ET, Perreau-Guimaraes M, Carvalhaes CG, Desain P, Suppes P (2008). Quantifying inter-subject agreement in brain-imaging analyses. NeuroImage 39, 1051–1063.
Woods W (1975). What's in a link: Foundations for semantic networks. In Bobrow D, Collins A (eds), Representation and understanding. New York: Academic Press, 35-82.
Wu Z, Palmer M (1994). Verb Semantics and Lexical Selection. In Proceedings of the 32nd Annual Meeting of the Association for Computational Linguistics, Las Cruces, New Mexico, pp 133-138.
[Figure: Hierarchical cluster tree (similarity tree) computed from the conditional probability density estimates for the classification of 640 brain-wave samples for London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia. Leaf order: London, Paris, Moscow, Germany, Poland, Russia, north, south, west, east. Distance axis: 0.2 to 0.6.]
LANGUAGE
WordNet: a lexical database of English organized around sets of cognitive synonyms, restricted here to the senses that are relevant to the geography sentences.
Latent Semantic Analysis (LSA): a statistical method to extract measures of word similarity from selected sets of documents, such as novels, newspaper articles, and textbooks.
BRAIN
         London   Moscow   Paris    north    south    east     west     Germany  Poland   Russia
London   0.250    0.163    0.138    0.075    0.038    0.050    0.063    0.088    0.075    0.063
Moscow   0.144    0.333    0.133    0.033    0.011    0.056    0.056    0.089    0.067    0.078
Paris    0.175    0.188    0.125    0.038    0.038    0.038    0.050    0.100    0.138    0.113
north    0.067    0.017    0.017    0.283    0.167    0.150    0.100    0.100    0.067    0.033
south    0.014    0.014    0.071    0.114    0.271    0.171    0.157    0.029    0.086    0.071
east     0.050    0.000    0.000    0.133    0.150    0.383    0.150    0.050    0.000    0.083
west     0.057    0.043    0.000    0.171    0.114    0.143    0.229    0.086    0.100    0.057
Germany  0.075    0.175    0.100    0.050    0.125    0.025    0.050    0.275    0.025    0.100
Poland   0.150    0.100    0.125    0.050    0.050    0.000    0.025    0.150    0.250    0.100
Russia   0.160    0.040    0.080    0.040    0.100    0.020    0.120    0.100    0.080    0.260
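Each row of the brain matrix sums to roughly 1, which is what you would expect if a row holds the classifier's confusion counts for one word divided by that word's number of samples. The sketch below shows that normalization. The counts are hypothetical: they were inferred by assuming 80 London samples, a figure the slides do not state, and the function name is illustrative.

```python
def row_normalize(counts):
    """Turn each row of confusion counts into conditional probability estimates."""
    probs = []
    for row in counts:
        total = sum(row)
        # Guard against an empty row so the sketch never divides by zero.
        probs.append([c / total if total else 0.0 for c in row])
    return probs


# Hypothetical counts for the London row (assumes 80 London samples).
# Column order: London, Moscow, Paris, north, south, east, west,
# Germany, Poland, Russia.
london_counts = [20, 13, 11, 6, 3, 4, 5, 7, 6, 5]
london_probs = row_normalize([london_counts])[0]
print([round(p, 3) for p in london_probs])
```

Under that assumption the normalized values agree with the London row of the matrix to within rounding.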
WORDNET
WordNet is a human-annotated lexical database of English in which nouns, verbs, adjectives, and adverbs are grouped into sets of cognitive synonyms (called synsets), each synset expressing a distinct concept. See Miller (1995), Fellbaum (1998), and http://wordnet.princeton.edu.
The synsets are related to each other primarily through the hypernymy and hyponymy relations for nouns. Other relations in WordNet are part-whole (holonym), member-of (meronym), has-instance, and so on.
Hyponymy, often referred to as the is-a relation in computational discussions, is defined as follows: a concept represented by a lexical item Li is said to be a hyponym of the concept represented by a lexical item Lk if native speakers of English accept sentences of the form "An Li is a kind of Lk". Conversely, Lk is the hypernym of Li. A hypernym is therefore a more general concept and a hyponym a more specific concept.
WORDNET
Multiple word senses, organized into sets of cognitive synonyms called "synsets"
(n) Paris#1, City of Light#1, French capital#1, capital of France#1 (the capital and largest city of France; and international center of culture and commerce)
(n) Paris#2, genus Paris#1 (sometimes placed in subfamily Trilliaceae)
(n) Paris#3 ((Greek mythology) the prince of Troy who abducted Helen from her husband Menelaus and provoked the Trojan War)
(n) Paris#4 (a town in northeastern Texas)
(n) city#1, metropolis#1, urban center#1 (a large and densely populated urban area; may include several independent administrative districts) [relations listed: direct hyponym, full hyponym, part meronym, has instance]
(n) city#2 (an incorporated administrative district established by state charter)
(n) city#3, metropolis#2 (people living in a large densely populated municipality)
(n) east#1, due east#1, eastward#1, E#3 (the cardinal compass point that is at 90 degrees)
(n) East#2, Orient#1 (the countries of Asia)
(n) East#3, eastern United States#1 (the region of the United States lying to the north of the Ohio River and to the east of the Mississippi River)
(n) east#4 (the direction corresponding to the eastward cardinal compass point)
(n) east#5 (a location in the eastern part of a country, region, or city)
LATENT SEMANTIC ANALYSIS (LSA)
LSA is a statistical technique for extracting, from large collections of documents, a measure of how similar two words are to each other in terms of the patterns of their co-occurrence within those documents. See Deerwester et al. (1990), Landauer and Dumais (1997), Landauer et al. (1998).
The underlying idea is that if, for each word, you take into account all the contexts in which it does and does not appear, you get for all the words a set of mutual constraints that represent how similar any two words are to each other.
The similarity judgments produced by latent semantic analysis have been shown to correspond to some extent to human judgments of similarity. After training on about 2,000 pages of English text, it scored as well as average test-takers on the synonym portion of the Test of English as a Foreign Language. After training on a psychology textbook, it achieved a passing score on a multiple-choice exam.
We used the application at http://lsa.colorado.edu to compute similarity matrices in term space for our set of words. The computation was based on approximately 38,000 college-level texts (novels, newspaper articles, …). A maximum of 300 factors was permitted in the analysis.
         London   Moscow   Paris    north    south    east     west     Germany  Poland   Russia
London   1.00     0.17     0.37     0.13     0.12     0.12     0.14     0.17     0.16     0.16
Moscow   0.17     1.00     0.18     0.13     0.08     0.22     0.14     0.37     0.65     0.69
Paris    0.37     0.18     1.00     0.10     0.04     0.08     0.09     0.50     0.31     0.36
north    0.13     0.13     0.10     1.00     0.89     0.60     0.61     0.07     0.09     0.14
south    0.12     0.08     0.04     0.89     1.00     0.50     0.55     0.02     0.05     0.05
east     0.12     0.22     0.08     0.60     0.50     1.00     0.85     0.22     0.29     0.30
west     0.14     0.14     0.09     0.61     0.55     0.85     1.00     0.24     0.26     0.23
Germany  0.17     0.37     0.50     0.07     0.02     0.22     0.24     1.00     0.85     0.81
Poland   0.16     0.65     0.31     0.09     0.05     0.29     0.26     0.85     1.00     0.87
Russia   0.16     0.69     0.36     0.14     0.05     0.30     0.23     0.81     0.87     1.00
Semantic similarity matrix derived from LSA for the set of words London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia
[Figure: Hierarchical cluster tree computed from the pairwise Latent Semantic Analysis (LSA) scores of similarity for London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia, based on approximately 38,000 college-level texts (novels, newspaper articles, …). Leaf order: east, west, north, south, London, Paris, Moscow, Germany, Poland, Russia. Distance axis: 0.1 to 0.6.]
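A cluster tree of this kind can be built by repeatedly merging the two closest groups of words. The slides do not say which linkage rule or distance was used, so the sketch below makes two assumptions: distance is taken as 1 minus similarity, and groups are compared by average linkage. It uses six of the LSA scores from the similarity matrix; the function name is illustrative.

```python
from itertools import combinations


def average_linkage(labels, sim):
    """Agglomerative clustering on distance = 1 - similarity.

    Returns the merges in order as (cluster_a, cluster_b, distance).
    """
    def dist(a, b):
        key = (a, b) if (a, b) in sim else (b, a)
        return 1.0 - sim[key]

    clusters = [(label,) for label in labels]
    merges = []
    while len(clusters) > 1:
        best = None
        for ca, cb in combinations(clusters, 2):
            # Average linkage: mean distance over all cross-cluster pairs.
            d = sum(dist(a, b) for a in ca for b in cb) / (len(ca) * len(cb))
            if best is None or d < best[0]:
                best = (d, ca, cb)
        d, ca, cb = best
        clusters = [c for c in clusters if c not in (ca, cb)]
        clusters.append(ca + cb)
        merges.append((ca, cb, round(d, 3)))
    return merges


# LSA similarity scores for a six-word subset of the slide's matrix.
LSA = {
    ("north", "south"): 0.89, ("north", "east"): 0.60, ("north", "west"): 0.61,
    ("south", "east"): 0.50, ("south", "west"): 0.55, ("east", "west"): 0.85,
    ("London", "Paris"): 0.37,
    ("London", "north"): 0.13, ("London", "south"): 0.12,
    ("London", "east"): 0.12, ("London", "west"): 0.14,
    ("Paris", "north"): 0.10, ("Paris", "south"): 0.04,
    ("Paris", "east"): 0.08, ("Paris", "west"): 0.09,
}
WORDS = ["north", "south", "east", "west", "London", "Paris"]

for cluster_a, cluster_b, distance in average_linkage(WORDS, LSA):
    print(cluster_a, "+", cluster_b, "at", distance)
```

On this subset the compass terms group together before the two cities join, which matches the broad shape of the LSA tree.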
LSA provides a straightforward measure of similarity between words.
For WORDNET, several different measures of similarity have been devised, e.g.:
Path length between synsets.
Information content: a corpus-based measure of the specificity of a concept, measured in terms of the frequency of occurrence of the concept in a corpus, here the human-annotated, sense-tagged corpus SemCor (Miller et al. 1993), which links every word in the Brown Corpus to its appropriate WordNet sense. Scaled in various ways.
Vector-space models: these work by forming second-order co-occurrence vectors from the WordNet definitions of concepts, known as glosses.
We used five measures in our computations of similarity and took the average score, using each of the relevant senses.
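As an illustration of a path-based measure, the sketch below computes the Wu-Palmer score (Wu and Palmer 1994, the "wup" measure), defined as twice the depth of the deepest shared ancestor divided by the sum of the two concepts' depths. The taxonomy here is a hypothetical toy hierarchy, not WordNet's actual hypernym structure, so the scores will not match those on the slides.

```python
# Hypothetical toy hypernym tree (child -> parent); "entity" is the root.
PARENT = {
    "location": "entity",
    "region": "location",
    "city": "region",
    "country": "region",
    "London": "city",
    "Paris": "city",
    "Germany": "country",
    "Poland": "country",
}


def path_to_root(node):
    """Return the node followed by its ancestors up to the root."""
    path = [node]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path


def depth(node):
    """Depth counted in nodes, so the root has depth 1."""
    return len(path_to_root(node))


def wup(a, b):
    """Wu-Palmer similarity: 2 * depth(lcs) / (depth(a) + depth(b))."""
    ancestors_b = set(path_to_root(b))
    # The first ancestor of a that is also an ancestor of b is the deepest one.
    lcs = next(n for n in path_to_root(a) if n in ancestors_b)
    return 2 * depth(lcs) / (depth(a) + depth(b))


print(wup("London", "Paris"))    # shares the ancestor "city"
print(wup("London", "Germany"))  # shares only the ancestor "region"
```

Two cities score higher than a city and a country because their shared ancestor sits deeper in the tree.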
Semantic similarity matrix derived from WordNet for the set of words London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia, using senses relevant to the geography of Europe and five measures of similarity: wup (path length), lin and jcn (information content), and gv and pgv (vector-space measures)
         London   Moscow   Paris    north    south    east     west     Germany  Poland   Russia
London   1.000    0.396    0.466    0.106    0.103    0.076    0.078    0.322    0.299    0.303
Moscow   0.396    1.000    0.393    0.095    0.094    0.062    0.070    0.286    0.281    0.288
Paris    0.466    0.393    1.000    0.106    0.104    0.074    0.077    0.327    0.308    0.307
north    0.106    0.095    0.106    1.000    0.228    0.179    0.210    0.123    0.132    0.111
south    0.103    0.094    0.104    0.228    1.000    0.172    0.212    0.115    0.107    0.109
east     0.076    0.062    0.074    0.179    0.172    1.000    0.216    0.093    0.080    0.077
west     0.078    0.070    0.077    0.210    0.212    0.216    1.000    0.087    0.082    0.083
Germany  0.322    0.286    0.327    0.123    0.115    0.093    0.087    1.000    0.589    0.409
Poland   0.299    0.281    0.308    0.132    0.107    0.080    0.082    0.589    1.000    0.403
Russia   0.303    0.288    0.307    0.111    0.109    0.077    0.083    0.409    0.403    1.000
[Figure: Hierarchical cluster tree computed from pairwise WordNet-based semantic similarity scores for London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia, restricted to senses related to the geography of Europe. Leaf order: Germany, Poland, Russia, London, Paris, Moscow, north, south, west, east. Distance axis: 0.2 to 0.6.]
BACK TO THE BRAIN …
London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia
Now let's see how to compare the EEG data and the language data …
Some of the similarity trees show remarkable congruence between the brain and semantic data.
Where exactly does that congruence lie?
Can we devise a quantitative measure of the nature and strength of that congruence?
[Figure: two similarity trees, labeled LANGUAGE DATA and BRAIN DATA, each with a distance axis from 0.2 to 0.6. One has leaf order London, Paris, Moscow, Germany, Poland, Russia, north, south, west, east, matching the brain-data tree shown earlier. The other has leaf order Germany, Poland, Russia, London, Paris, Moscow, north, south, west, east, matching the WordNet tree.]
[Figure: WordNet-based semantic similarities and EEG conditional probability estimates for London relative to London (L), Moscow (M), Paris (P), north (n), south (s), east (e), west (w), Germany (G), Poland (Po), and Russia (R).]
The Spearman rank correlation for the two sequences in the figure is 0.99, with a one-sided significance of 1.84E-10.
BRAIN DATA
For each word ω, we compute from the conditional probability density estimates a ternary relation R such that R(ω, ω1, ω2) if and only if, with respect to word ω, the conditional probability for word ω1 is smaller than the conditional probability for word ω2; that is, if and only if ω1's similarity difference with ω is smaller than ω2's similarity difference with ω.
R is an ordinal relation of similarity differences: a partial order that is irreflexive, asymmetric, and transitive.
LANGUAGE DATA
For each word ω, we compute from the semantic similarity matrix a ternary relation R such that R(ω, ω1, ω2) if and only if the similarity difference of ω1 with ω is smaller than the similarity difference of ω2 with ω; that is, ω1 is more similar to ω than is ω2.
R is an ordinal relation of similarity differences: a partial order that is irreflexive, asymmetric, and transitive.
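For the language data, the relation R for a fixed word can be written out directly from one row of the similarity matrix. The sketch below does this for London, using five WordNet-based scores from the similarity matrix, and checks the three properties named above. The function names are illustrative, and reading "smaller similarity difference" as "higher similarity score" follows the slide's gloss for the language data.

```python
from itertools import permutations

# WordNet-based similarity of each word to London (subset of the matrix row).
SIM_TO_LONDON = {
    "London": 1.0,
    "Paris": 0.466,
    "Moscow": 0.396,
    "Germany": 0.322,
    "north": 0.106,
}


def relation_for(sim_to_word):
    """R(w, w1, w2) for a fixed w, as the set of pairs (w1, w2)
    in which w1 is more similar to w than w2 is."""
    return {
        (w1, w2)
        for w1, w2 in permutations(sim_to_word, 2)
        if sim_to_word[w1] > sim_to_word[w2]
    }


def is_strict_partial_order(pairs, items):
    """Check irreflexivity, asymmetry, and transitivity of a set of pairs."""
    irreflexive = all((x, x) not in pairs for x in items)
    asymmetric = all((b, a) not in pairs for a, b in pairs)
    transitive = all(
        (a, d) in pairs for a, b in pairs for c, d in pairs if b == c
    )
    return irreflexive and asymmetric and transitive


R_london = relation_for(SIM_TO_LONDON)
print(("Paris", "Moscow") in R_london)
print(is_strict_partial_order(R_london, SIM_TO_LONDON))
```

Words with equal scores would simply be left unordered, which is why the result is a partial rather than a total order.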
London
Partial orders for London derived from the WordNet semantic similarities of Table 2 and the conditional probability estimates for the brain data of Table 5
Language data        Brain data
London   1.000       London   0.275
Paris    0.466       Paris    0.133
Moscow   0.396       Moscow   0.108
Germany  0.322       Germany  0.075
Russia   0.303       north    0.042
Poland   0.299       Russia   0.033
north    0.106       Poland   0.025
south    0.103       east     0.008
west     0.078       south    0.000
east     0.076       west     0.000
[Diagram: the two partial orders for London. Node labels surviving extraction: Poland, north, south, west, east; and north, Poland, east, south, west.]
Following the approach described in Suppes (1974) for the axiomatization of the theory of differences in utility preference or the theory of differences in psychological intensity, we form a relational structure for each data set.
Suppes P (1974) The axiomatic method in the empirical sciences. In L Henkin et al. (Eds) Proceedings of the Tarski Symposium, Proceedings of Symposia in Pure Mathematics 25. Providence, RI: American Mathematical Society, pp. 465–479.
Brain data: the relational structure (A, R) constructed from R and the finite set A of classes ω1, ω2, …, ωN, together with the N partial orders constructed from the N-by-N estimate of the conditional probability densities.
Language data: the relational structure (A, R) constructed from R and the finite set A of classes ω1, ω2, …, ωN, together with the N partial orders constructed from the N-by-N similarity matrix.
For each ωi, we compare the partial order of the brain data with the partial order of the language data using Spearman's rank correlation coefficient, which we interpret in the usual way to determine whether or not the correlation is statistically significant.
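The comparison step can be sketched with a plain implementation of Spearman's rank correlation, computed as the Pearson correlation of the ranks with tied values sharing their average rank. The two lists are the London values from the partial-order table on the preceding slide, ordered London, Paris, Moscow, Germany, Russia, Poland, north, south, west, east. The tie handling is an assumption, since the slides do not say how ties were treated, so the result need not reproduce any coefficient quoted on the slides.

```python
def average_ranks(values):
    """Rank from largest (rank 1) to smallest; tied values share the mean rank."""
    order = sorted(range(len(values)), key=lambda i: -values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho as the Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


language = [1.000, 0.466, 0.396, 0.322, 0.303, 0.299, 0.106, 0.103, 0.078, 0.076]
brain = [0.275, 0.133, 0.108, 0.075, 0.033, 0.025, 0.042, 0.000, 0.000, 0.008]
print(round(spearman(language, brain), 3))
```

A significance test would then be applied to the coefficient; that step is omitted here.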
For those instances in which the brain and language partial orders are significantly correlated, we find the partial order that is invariant with respect to the brain and language data. Here are two more examples.
[Diagram: invariant partial order. Node labels as extracted: London, Moscow; Paris; north; south, east; west; Germany, Poland; Russia.]
Significant invariance, Paris: Spearman 0.88 (p = 6.6795e-004)
[Diagram: invariant partial order. Node labels as extracted: London, Moscow, Paris; north; south, east; west; Germany, Poland, Russia.]
Significant invariance, Paris: Spearman 0.90 (p = 3.8716e-004)
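The slides do not spell out how the invariant partial order is constructed. One plausible reading, assumed in the sketch below, is to keep exactly those ordered pairs on which the brain and language orders agree, that is, the intersection of the two strict orders. The scores are four of the London values from the partial-order table; the function names are illustrative.

```python
def strict_order(scores):
    """All ordered pairs (a, b) in which a scores strictly higher than b."""
    return {(a, b) for a in scores for b in scores if scores[a] > scores[b]}


def invariant_order(language_scores, brain_scores):
    """Pairs ordered the same way in both data sets (assumed construction)."""
    return strict_order(language_scores) & strict_order(brain_scores)


# London-relative values for four words, from the partial-order table.
language = {"Paris": 0.466, "Moscow": 0.396, "Russia": 0.303, "north": 0.106}
brain = {"Paris": 0.133, "Moscow": 0.108, "Russia": 0.033, "north": 0.042}

common = invariant_order(language, brain)
for higher, lower in sorted(common):
    print(higher, ">", lower)
```

Russia and north are ordered one way in the language data and the other way in the brain data, so under this reading they are left unordered relative to each other in the invariant order.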
Brain data
[Figure: Another hierarchical cluster tree (similarity tree) computed from the conditional probability density estimates for the classification of 640 brain-wave samples for London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia. Leaf order: London, Paris, Moscow, Germany, Poland, Russia, north, south, west, east. Distance axis: 0.2 to 0.6.]
Every single-trial classification produces its own conditional probability density estimates, giving rise to its own similarity tree …
[Figure: similarity tree from a different classification. Leaf order: north, east, south, west, London, Paris, Germany, Moscow, Russia, Poland. Distance axis: 0.1 to 0.5.]
1. We compute M (= 30) single-trial classifications of the data (640 data samples for our 10 words) using random resampling with replacement.
2. For each classification, we find for each word the partial orders of the brain and language data that are significantly correlated.
3. For each of these highly correlated partial-order pairs, we find the partial order invariant with respect to both.
We performed 60 classifications; that is, we recomputed the classifications of the brain data using random resampling with replacement. For half of these 60 classifications we compared the brain data to the WordNet data, and for the other half we compared the brain data to the LSA data. And we plotted the results.
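Random resampling with replacement can be sketched as drawing, for each round, as many sample indices as there are samples, with repeats allowed. The sketch below produces 30 such rounds over 640 samples. How the resampled data were split for training and testing the classifier is not described on the slides, so that part is left out; the function name and the fixed seed are illustrative.

```python
import random


def bootstrap_indices(n_samples, n_rounds, seed=0):
    """Return n_rounds index lists, each drawn with replacement from range(n_samples)."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    return [rng.choices(range(n_samples), k=n_samples) for _ in range(n_rounds)]


rounds = bootstrap_indices(n_samples=640, n_rounds=30)
print(len(rounds), "rounds of", len(rounds[0]), "indices each")
print("distinct samples in the first round:", len(set(rounds[0])))
```

Because indices repeat within a round, each round leaves some samples out, which is what makes every classification yield slightly different probability estimates.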
[Figure: Partial Orders of Similarity Differences Invariant between EEG-recorded Brain Data and Semantic Representations of Language. Bar chart. Vertical axis: significant invariant partial orders (0 to 300). Horizontal axis: test participants S10, S24, S25, S27, S16, S26, S12, S13, S18. Series: WordNet and LSA.]
A higher number of significant structural similarities was found between the brain data and the WordNet data than between the brain data and the LSA data.
This stronger structural similarity between the brain data and the WordNet-derived data supports the contention that, during language comprehension for the complex cognitive task of assessing truth or falsity, the representation of words in the brain has a WordNet-like quality.
This stronger structural similarity is the first evidence we have seen of WordNet-like representations in the brain.
Banerjee S, Pedersen T (2003) Extended Gloss Overlaps as a Measure of Semantic Relatedness. In Proceedings of the Eighteenth International Joint Conference on Artificial Intelligence, pp. 805–810, August 9–15, 2003, Acapulco, Mexico.
Banerjee S, Pedersen T (2002) An Adapted Lesk Algorithm for Word Sense Disambiguation using WordNet. In Proceedings of the Third International Conference on Intelligent Text Processing and Computational Linguistics, pp. 136–145, February 17–23, 2002, Mexico City.
Baroni M, Murphy B, Barbu E, Poesio M (2010) Strudel: A Corpus-Based Semantic Model Based on Properties and Types. Cognitive Science 34(2), 222–254. doi:10.1111/j.1551-6709.2009.01068.x
Blei D, Ng A, Jordan M (2003) Latent Dirichlet allocation. Journal of Machine Learning Research 3, 993–1022.
Brants T, Franz A (2006) www.ldc.upenn.edu/Catalog/CatalogEntry.jsp?catalogId=LDC2006T13. Linguistic Data Consortium, Philadelphia.
Budanitsky A, Hirst G (2001) Semantic distance in WordNet: an experimental, application-oriented evaluation of five measures. In Proceedings of the NAACL 2001 Workshop on WordNet and Other Lexical Resources, Pittsburgh, June 2001. http://www.seas.smu.edu/~rada/mwnw/papers/WNW-NAACL-101.pdf.gz
Collins AM, Loftus EF (1975) A spreading-activation theory of semantic processing. Psychological Review 82(6), 407–428.
Crestani F (1997) Application of Spreading Activation Techniques in Information Retrieval. Artificial Intelligence Review 11, 453–482.
Deerwester S, Dumais ST, Furnas GW, Landauer TK, Harshman R (1990) Indexing By Latent Semantic Analysis. Journal of the American Society for Information Science 41, 391–407.
Devereux B, Kelly C, Korhonen A (2010) Using fMRI activation to conceptual stimuli to evaluate methods for extracting conceptual representations from corpora. Proceedings of the NAACL HLT 2010 First Workshop on Computational Neurolinguistics, June 2010, Los Angeles, USA. Association for Computational Linguistics, 70–78.
Fellbaum C (Ed) (1998) WordNet: An Electronic Lexical Database. Cambridge, MA: MIT Press.
Hauk O, Davis MH, Ford M, Pulvermüller F, Marslen-Wilson WD (2006) The time course of visual word recognition as revealed by linear regression analysis of ERP data. Neuroimage 2006 May 1;30(4):1383–400. Epub 2006 Feb 7.
Hirst G (1988) Resolving lexical ambiguity computationally with spreading activation and Polaroid Words. In Small SL, Cottrell GW, Tanenhaus MK (Eds) Lexical ambiguity resolution: Perspectives from psycholinguistics, neuropsychology, and artificial intelligence. San Mateo, CA: Morgan Kaufmann Publishers, 1988, 73–107.
Jelodar AB, Alizadeh M, Khadivi S (2010) WordNet Based Features for Predicting Brain Activity associated with meanings of noun. Proceedings of the NAACL HLT 2010 First Workshop on Computational Neurolinguistics, June 2010, Los Angeles, USA. Association for Computational Linguistics, 18–26.
Jiang J, Conrath D (1997) Semantic similarity based on corpus statistics and lexical taxonomy. In Proceedings of International Conference on Research in Computational Linguistics, Taiwan, 19–33.
Just MA, Cherkassky VL, Aryal S, Mitchell TM (2010) A Neurosemantic Theory of Concrete Noun Representation Based on the Underlying Brain Codes. PLoS ONE 5(1): e8622. doi:10.1371/journal.pone.0008622
Karp RM (1972) Reducibility among combinatorial problems. In Complexity of Computer Computations, Proc. Sympos. IBM Thomas J. Watson Res. Center, Yorktown Heights, NY (Ed. RE Miller and JW Thatcher). New York: Plenum, pp. 85–103.
Kelly C, Devereux B, Korhonen A (2010) Acquiring Human-like Feature-Based Conceptual Representations from Corpora. Proceedings of the NAACL HLT 2010 First Workshop on Computational Neurolinguistics, June 2010, Los Angeles, USA. Association for Computational Linguistics, 61–69.
Kriegeskorte N, Mur M, Bandettini PA (2008) Representational similarity analysis – connecting the branches of systems neuroscience. Frontiers in Systems Neuroscience. doi:10.3389/neuro.06.004.2008
Kutas M, Hillyard SA (1984) Brain potentials during reading reflect word expectancy and semantic association. Nature 307, 161–163.
Kutas M, Federmeier KD (2011) Thirty years and counting: Finding meaning in the N400 component of the event related brain potential (ERP). Annual Review of Psychology 62, 621–647.
Landauer TK, Dumais ST (1997) A solution to Plato's problem: The Latent Semantic Analysis theory of the acquisition, induction, and representation of knowledge. Psychological Review 104, 211–240.
Landauer TK, Foltz PW, Laham D (1998) Introduction to Latent Semantic Analysis. Discourse Processes 25, 259–284.
Leacock C, Chodorow M (1998) Combining local context and WordNet similarity for word sense identification. In Fellbaum C (Ed) WordNet: An Electronic Lexical Database. Cambridge, MA: MIT Press, pp. 265–283.
Leech G, Rayson P, Wilson A (2001) Word Frequencies in Written and Spoken English: based on the British National Corpus. Longman, London.
Lin D (1998) An information-theoretic definition of similarity. In Proc. 15th International Conf. on Machine Learning. Morgan Kaufmann, San Francisco, CA, pp. 296–304.
Luce RD (1956) Semiorders and a theory of utility discrimination. Econometrica 24, 178–191. MR0078632. http://www.jstor.org/stable/1905751
Matsunaga T, Yonemori C, Tomita E, Muramatsu M (2009) Clique-based data mining for related genes in a biomedical database. BMC Bioinformatics 2009 Jul 1;10:205.
Martin Ph (2003) Correction and Extension of WordNet 1.7. ICCS 2003, 11th International Conference on Conceptual Structures (© Springer Verlag, LNAI 2746, pp. 160–173), Dresden, Germany, July 21–25, 2003.
Miller GA, Nicely P (1955) An analysis of perceptual confusions among some English consonants. J. Acoust. Soc. Am. 27(2).
Miller GA, Leacock C, Tengi R, Bunker RT (1993) A Semantic Concordance. In Proceedings of the 3rd DARPA Workshop on Human Language Technology.
Miller GA (1995) WordNet: A Lexical Database for English. Communications of the ACM, Vol. 38, No. 11, 39–41.
Mitchell TM, Shinkareva SV, Carlson A, Chang KM, Malave VL, Mason RA, Just MA (2008) Predicting Human Brain Activity Associated with the Meanings of Nouns. Science 320, 1191, May 30, 2008. doi:10.1126/science.1152876
Murphy B, Baroni M, Poesio M (2009) EEG responds to conceptual stimuli and corpus semantics. In Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing, pages 619–627, Singapore, 6–7 August 2009. © 2009 ACL and AFNLP.
Murphy B, Poesio M (2010) Detecting semantic category in simultaneous EEG/MEG recordings. In First Workshop on Computational Neurolinguistics, NAACL HLT 2010 (pp. 36–44). Los Angeles: Association for Computational Linguistics.
Murphy B, Poesio M, Bovolo F, Bruzzone L, Dalponte M, Lakany H (2011) EEG decoding of semantic category reveals distributed representations for single concepts. Brain Lang 2011 Apr;117(1):12–22. doi:10.1016/j.bandl.2010.09.013
Pedersen T, Patwardhan S, Michelizzi J (2004) WordNet::Similarity - Measuring the Relatedness of Concepts. In Proceedings of Fifth Annual Meeting of the North American Chapter of the Association for Computational Linguistics (NAACL-2004), pp. 38–41, Boston, May 2004.
Pereira F, Botvinick M, Detre G (2010) Learning semantic features for fMRI data from definitional text. Proceedings of the NAACL HLT 2010 First Workshop on Computational Neurolinguistics, June 2010, Los Angeles, USA. Association for Computational Linguistics, 1–9. http://www.aclweb.org/anthology/W10-0601
Perreau-Guimaraes M, Wong DK, Uy ET, Grosenick L, Suppes P (2007) Single-trial classification of MEG recordings. IEEE Transactions on Biomedical Engineering 54, 436–443.
Rabinovitch I (1977) The Scott-Suppes theorem on semiorders. J. Mathematical Psychology 15(2), 209–212. MR0437404.
Resnik P (1995) Using information content to evaluate semantic similarity. In Proceedings of the 14th International Joint Conference on Artificial Intelligence, pages 448–453, Montreal.
Roach BJ, Mathalon DH (2008) Event-related EEG time-frequency analysis: an overview of measures and an analysis of early gamma band phase locking in schizophrenia. Schizophr Bull 2008 Sep;34(5):907–26. Epub 2008 Aug 6.
Scott D, Suppes P (1958) Foundational aspects of theories of measurement. The Journal of Symbolic Logic 23, 113–128. MR0115919.
Suppes P (1972) Axiomatic Set Theory. New York: Dover.
Suppes P (1974) The axiomatic method in the empirical sciences. In L Henkin et al. (Eds) Proceedings of the Tarski Symposium, Proceedings of Symposia in Pure Mathematics 25. Providence, RI: American Mathematical Society, pp. 465–479.
Suppes P, Lu Z-L, Han B (1997) Brain wave recognition of words. Proceedings of the National Academy of Sciences 94, 14965–14969.
Suppes P, Han B, Lu Z-L (1998) Brain-wave recognition of sentences. Proceedings of the National Academy of Sciences 95, 15861–15866.
Suppes P, Han B, Epelboim J, Lu Z-L (1999a) Invariance between subjects of brain wave representations of language. Proceedings of the National Academy of Sciences 96, 12953–12958.
Suppes P, Han B, Epelboim J, Lu Z-L (1999b) Invariance of brain-wave representations of simple visual images and their names. Proceedings of the National Academy of Sciences 96, 14658–14663.
Suppes P, Han B (2000) Brain-wave representation of words by superposition of a few sine waves. Proceedings of the National Academy of Sciences 97, 8738–8743.
Suppes P, Perreau-Guimaraes M, Wong DK (2009) Partial Orders of Similarity Differences Invariant Between EEG-Recorded Brain and Perceptual Representations of Language. Neural Computation 21, 3228–3269.
Suppes P, de Barros JA, Oas G (2012) Phase-Oscillator computations as neural models of stimulus-response conditioning and response selection. Journal of Mathematical Psychology 56, 95–117.
Turney P (2001) Mining the Web for Synonyms: PMI-IR versus LSA on TOEFL. In L De Raedt & P Flach (Eds) Proceedings of the Twelfth European Conference on Machine Learning (ECML-2001) (pp. 491–502). Freiburg, Germany.
Vassilieva E, Pinto G, de Barros JA, Suppes P (2011) Learning Pattern Recognition Through Quasi-Synchronization of Phase Oscillators. IEEE Transactions on Neural Networks, Vol. 22, No. 1, January 2011.
Vigliocco G, Warren J, Siri S, Arciuli J, Scott S, Wise R (2006) The role of semantics and grammatical class in the neural representation of words. Cereb Cortex 2006 Dec;16(12):1790–6. Epub 2006 Jan 18.
Wong DK, Perreau-Guimaraes M, Uy ET, Suppes P (2004) Classification of individual trials based on the best independent component of EEG-recorded sentences. Neurocomputing 61, 479–484.
Wong DK, Uy ET, Perreau-Guimaraes M, Yang W, Suppes P (2006) Interpretation of perceptron weights as constructed time series for EEG classification. Neurocomputing 70, 373–383.
Wong DK, Grosenick L, Uy ET, Perreau-Guimaraes M, Carvalhaes CG, Desain P, Suppes P (2008) Quantifying inter-subject agreement in brain-imaging analyses. NeuroImage 39, 1051–1063.
Woods W (1975) What's in a link: Foundations for semantic networks. In Bobrow D, Collins A (Eds) Representation and understanding. New York: Academic Press, 1975: 35–82.
Wu Z, Palmer M (1994) Verb Semantics and Lexical Selection. In Proceedings of the 32nd Annual Meeting of the Association for Computational Linguistics, Las Cruces, New Mexico, pp. 133–138.
WordNet a lexical database of English organized around sets of cognitive synonyms restricted to the senses that are relevant to the geography sentences Latent Semantic Analysis (LSA) a statistical method to extract measures of word similarity from selected sets of documents such as novels newspaper articles textbooks
LANGUAGE BRAIN
London Moscow Paris north south east west Germany Poland Russia
London 025 0163 0138 0075 0038 005 0063 0088 0075 0063
Moscow 0144 0333 0133 0033 0011 0056 0056 0089 0067 0078
Paris 0175 0188 0125 0038 0038 0038 005 01 0138 0113
north 0067 0017 0017 0283 0167 015 01 01 0067 0033
south 0014 0014 0071 0114 0271 0171 0157 0029 0086 0071
east 005 0 0 0133 015 0383 015 005 0 0083
west 0057 0043 0 0171 0114 0143 0229 0086 01 0057
Germany 0075 0175 01 005 0125 0025 005 0275 0025 01
Poland 015 01 0125 005 005 0 0025 015 025 01
Russia 016 004 008 004 01 002 012 01 008 026
20 copy Colleen E Crangle 2013
WordNet is a human-annotated lexical database of English in which nouns
verbs adjectives and adverbs are grouped into sets of cognitive synonyms
(called synsets) each synset expressing a distinct concept
See Miller (1995) Fellbaum (1997) and httpwordnetprincetonedu
The synsets are related to each other primarily through the hypernymy and
hyponymy relations for nouns Other relations in WordNet are part-whole
(holonym) member of (meronym) has instance and so on
Hyponymy often referred to as the isndasha relation in computational discussions --
is defined as follows a concept represented by a lexical item Li is said to be a
hyponym of the concept represented by a lexical item Lk if native speakers of
English accept sentences of the form An Li is a kind of Lk Conversely Lk is the
hypernym of Li A hypernym is therefore a more general concept and a hyponym
a more specific concept
WORDNET
21 copy Colleen E Crangle 2013
WORDNET
(n) Paris1 City of Light1 French capital1 capital of France1 (the capital and largest city of France and international center of culture and commerce)
(n) Paris2 genus Paris1 (sometimes placed in subfamily Trilliaceae)
(n) Paris3 ((Greek mythology) the prince of Troy who abducted Helen from her husband Menelaus and provoked the Trojan War)
(n) Paris4 (a town in northeastern Texas)
Multiple word senses
Organized into sets of cognitive synonyms called ldquosynsetsrdquo
(n) city1 metropolis1 urban center1 (a large and densely populated urban area may include several independent administrative districts) direct hyponym full hyponym part meronym has instance (n) city2 (an incorporated administrative district established by state charter) (n) city3 metropolis2 (people living in a large densely populated municipality)
(n) east1 due east1 eastward1 E3 (the cardinal compass point that is at 90 degrees) (n) East2 Orient1 (the countries of Asia) (n) East3 eastern United States1 (the region of the United States lying to the north of the Ohio River and to the east of the Mississippi River) (n) east4 (the direction corresponding to the eastward cardinal compass point) (n) east5 (a location in the eastern part of a country region or city)
22 copy Colleen E Crangle 2013
LATENT SEMANTIC ANALYSIS (LSA)
LSA is a statistical technique for extracting from large collections of documents a measure of how similar two words are to each other in terms of patterns of their co-occurrences within those documents See Deerwester et al 1990 Landauer and Dumais 1997 Landauer et al 1998
The underlying idea is that if for each word you take into account all the contexts in which it does and does not appear
you get for all the words a set of mutual constraints that represent how similar any two words are to each other
The similarity judgments produced by latent semantic analysis have been shown to correspond to some extent to human judgments of similarity After training on about 2000 pages of English text it scored as well as average test-takers on the synonym portion of the Test of English as a Foreign Language After training on a psychology textbook it achieved a passing score on a multiple-choice exam We used the application at httplsacoloradoedu to compute similarity matrices in term space for our set of words The computation was based on ~ 38000 college-level texts (novels newspaper articleshellip) A maximum of 300 factors was permitted in the analysis
23 copy Colleen E Crangle 2013
London Moscow Paris north south east west Germany Poland Russia
London 1 017 037 013 012 012 014 017 016 016
Moscow 017 1 018 013 008 022 014 037 065 069
Paris 037 018 1 01 004 008 009 05 031 036
north 013 013 01 1 089 06 061 007 009 014
south 012 008 004 089 1 05 055 002 005 005
east 012 022 008 06 05 1 085 022 029 03
west 014 014 009 061 055 085 1 024 026 023
Germany 017 037 05 007 002 022 024 1 085 081
Poland 016 065 031 009 005 029 026 085 1 087
Russia 016 069 036 014 005 03 023 081 087 1
Semantic similarity matrix derived from LSA for the set of words
London Moscow Paris north south east west Germany
Poland Russia
24 copy Colleen E Crangle 2013
01 02 03 04 05 06
east
west
north
south
London
Paris
Moscow
Germany
Poland
Russia
Hierarchical cluster tree computed from the pair-wise Latent Semantic
Analysis (LSA) scores of similarity for London Moscow Paris north
south east west Germany Poland Russia based on ~ 38000 college-
level texts (novels newspaper articleshellip)
25 copy Colleen E Crangle 2013
LSA provides straightforward measure of similarity between words
For WORDNET several different measures of similarity have been devised Eg Path length between synsets Information content a corpusndashbased measure of the specificity of a concept measured in terms of the frequency of occurrence of the concept in the corpus the human-annotated sensendashtagged corpus SemCor (Miller et al 1993) which links every word in the Brown Corpus to its appropriate WordNet sense Scaled various ways Vector-space models -- works by forming second-order co-occurrence vectors from the WordNet definitionsof concepts known as glosses We used five measures in our computations of similarity and took the average score using each of the relevant senses
26 copy Colleen E Crangle 2013
Semantic similarity matrix derived from WordNet for the set of words London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia, using senses relevant to the geography of Europe and five measures of similarity: wup (path length), lin and jcn (information content), and gv and pgv (vector space measures)
         London  Moscow  Paris   north   south   east    west    Germany Poland  Russia
London   1       0.396   0.466   0.106   0.103   0.076   0.078   0.322   0.299   0.303
Moscow   0.396   1       0.393   0.095   0.094   0.062   0.07    0.286   0.281   0.288
Paris    0.466   0.393   1       0.106   0.104   0.074   0.077   0.327   0.308   0.307
north    0.106   0.095   0.106   1       0.228   0.179   0.21    0.123   0.132   0.111
south    0.103   0.094   0.104   0.228   1       0.172   0.212   0.115   0.107   0.109
east     0.076   0.062   0.074   0.179   0.172   1       0.216   0.093   0.08    0.077
west     0.078   0.07    0.077   0.21    0.212   0.216   1       0.087   0.082   0.083
Germany  0.322   0.286   0.327   0.123   0.115   0.093   0.087   1       0.589   0.409
Poland   0.299   0.281   0.308   0.132   0.107   0.08    0.082   0.589   1       0.403
Russia   0.303   0.288   0.307   0.111   0.109   0.077   0.083   0.409   0.403   1
[Figure: hierarchical cluster tree, axis 0.2 to 0.6; leaf order: Germany, Poland, Russia, London, Paris, Moscow, north, south, west, east]
Hierarchical cluster tree computed from pairwise WordNet-based semantic similarity scores for London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia, restricted to senses related to the geography of Europe
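A cluster tree of this kind can be sketched with plain agglomerative clustering on the WordNet-based similarity matrix. The slides do not say which linkage rule was used, so average linkage is an assumption here, and the merge order printed below need not reproduce the figure exactly. The function name average_linkage is illustrative.

```python
WORDS = ["London", "Moscow", "Paris", "north", "south",
         "east", "west", "Germany", "Poland", "Russia"]

# WordNet-based similarity matrix from the slide above (rows follow WORDS).
SIM = [
    [1, 0.396, 0.466, 0.106, 0.103, 0.076, 0.078, 0.322, 0.299, 0.303],
    [0.396, 1, 0.393, 0.095, 0.094, 0.062, 0.07, 0.286, 0.281, 0.288],
    [0.466, 0.393, 1, 0.106, 0.104, 0.074, 0.077, 0.327, 0.308, 0.307],
    [0.106, 0.095, 0.106, 1, 0.228, 0.179, 0.21, 0.123, 0.132, 0.111],
    [0.103, 0.094, 0.104, 0.228, 1, 0.172, 0.212, 0.115, 0.107, 0.109],
    [0.076, 0.062, 0.074, 0.179, 0.172, 1, 0.216, 0.093, 0.08, 0.077],
    [0.078, 0.07, 0.077, 0.21, 0.212, 0.216, 1, 0.087, 0.082, 0.083],
    [0.322, 0.286, 0.327, 0.123, 0.115, 0.093, 0.087, 1, 0.589, 0.409],
    [0.299, 0.281, 0.308, 0.132, 0.107, 0.08, 0.082, 0.589, 1, 0.403],
    [0.303, 0.288, 0.307, 0.111, 0.109, 0.077, 0.083, 0.409, 0.403, 1],
]


def average_linkage(words, sim):
    """Agglomerative clustering with average linkage (an assumed choice).

    Returns the merges in order as (cluster_a, cluster_b, mean_similarity).
    """
    clusters = [(i,) for i in range(len(words))]
    merges = []
    while len(clusters) > 1:
        best = None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                pairs = [(i, j) for i in clusters[x] for j in clusters[y]]
                score = sum(sim[i][j] for i, j in pairs) / len(pairs)
                if best is None or score > best[0]:
                    best = (score, x, y)
        score, x, y = best
        a, b = clusters[x], clusters[y]
        merges.append((tuple(words[i] for i in a),
                       tuple(words[i] for i in b), score))
        clusters = [c for k, c in enumerate(clusters) if k not in (x, y)]
        clusters.append(a + b)
    return merges


MERGES = average_linkage(WORDS, SIM)
for left, right, score in MERGES:
    print(f"{score:.3f}  {left} + {right}")
```

The most similar pair is merged first, and each later merge joins the two clusters with the highest mean pairwise similarity, so the last merge shows the coarsest split in the data.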
BACK TO THE BRAIN …
London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia
Now let's see how to compare the EEG data and the language data …
Some of the similarity trees show remarkable congruence between the brain and semantic data.
Where exactly does that congruence lie?
Can we devise a quantitative measure of the nature and strength of that congruence?
[Figure: two similarity trees, panels LANGUAGE DATA and BRAIN DATA, each with axis 0.2 to 0.6. Leaf order in one tree: London, Paris, Moscow, Germany, Poland, Russia, north, south, west, east. Leaf order in the other: Germany, Poland, Russia, London, Paris, Moscow, north, south, west, east.]
[Figure] WordNet-based semantic similarities and EEG conditional probability estimates for London relative to London (L), Moscow (M), Paris (P), north (n), south (s), east (e), west (w), Germany (G), Poland (Po) and Russia (R).
The Spearman rank correlation for the two sequences in the figure is 0.99, with one-sided significance of 1.84E-10.
BRAIN DATA
For each word ω we compute, from the conditional probability density estimates, a ternary relation R such that R(ω, ω1, ω2) if and only if, with respect to word ω, the conditional probability for word ω1 is smaller than the conditional probability for word ω2; that is, if and only if ω1's similarity difference with ω is smaller than ω2's similarity difference with ω.
R is an ordinal relation of similarity differences: a partial order that is irreflexive, asymmetric and transitive.
LANGUAGE DATA
For each word ω we compute, from the semantic similarity matrix, a ternary relation R such that R(ω, ω1, ω2) if and only if the similarity difference of ω1 with ω is smaller than the similarity difference of ω2 with ω; that is, ω1 is more similar to ω than is ω2.
R is an ordinal relation of similarity differences: a partial order that is irreflexive, asymmetric and transitive.
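For a fixed word ω, the ternary relation reduces to a binary ordering of the other words. A minimal sketch for ω = London follows, using the WordNet-based similarities from the matrix above and the reading "ω1 is more similar to ω than is ω2". The function names are illustrative, and leaving tied pairs unordered is an assumption.

```python
from itertools import permutations

# WordNet-based similarities to London, taken from the matrix above.
SIM_TO_LONDON = {
    "Moscow": 0.396, "Paris": 0.466, "north": 0.106, "south": 0.103,
    "east": 0.076, "west": 0.078, "Germany": 0.322, "Poland": 0.299,
    "Russia": 0.303,
}


def ordering_relation(sim_to_w):
    """Pairs (w1, w2) such that w1 is strictly more similar to w than w2."""
    return {(w1, w2) for w1, w2 in permutations(sim_to_w, 2)
            if sim_to_w[w1] > sim_to_w[w2]}


def is_strict_partial_order(rel):
    """Check that a set of pairs is irreflexive, asymmetric and transitive."""
    irreflexive = all(a != b for a, b in rel)
    asymmetric = all((b, a) not in rel for a, b in rel)
    transitive = all((a, d) in rel
                     for a, b in rel for c, d in rel if b == c)
    return irreflexive and asymmetric and transitive


R_LONDON = ordering_relation(SIM_TO_LONDON)
print(len(R_LONDON), is_strict_partial_order(R_LONDON))
```

Because ties are left unordered, the result is in general a partial order rather than a full ranking, which matters for the brain data, where some conditional probability estimates are equal.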
London
Language data        Brain data
London   1.000       London   0.275
Paris    0.466       Paris    0.133
Moscow   0.396       Moscow   0.108
Germany  0.322       Germany  0.075
Russia   0.303       north    0.042
Poland   0.299       Russia   0.033
north    0.106       Poland   0.025
south    0.103       east     0.008
west     0.078       south    0.000
east     0.076       west     0.000
Partial orders for London derived from the WordNet semantic similarities of Table 2 and the conditional probability estimates for the brain data of Table 5
[Figure: partial-order diagrams for London; node labels recovered from one diagram: Poland, north, south, west, east; from the other: north, Poland, east, south, west]
Following the approach described in Suppes (1974) for the axiomatization of the theory of differences in utility preference or the theory of differences in psychological intensity, we consider two relational structures:
Brain data: the relational structure (A, R) constructed from R and the finite set A of classes ω1, ω2, …, ωN, together with the N partial orders constructed from the N-by-N estimate for the conditional probability densities.
Language data: the relational structure (A, R) constructed from R and the finite set A of classes ω1, ω2, …, ωN, together with the N partial orders constructed from the N-by-N similarity matrix.
For each ωi we compare the partial order of the brain data with the partial order of the language data using Spearman's rank correlation coefficient, which we interpret in the usual way to determine whether the correlation is statistically significant.
Suppes P (1974) The axiomatic method in the empirical sciences. In L Henkin et al (Eds), Proceedings of the Tarski Symposium, Proceedings of Symposia in Pure Mathematics 25. Providence, RI: American Mathematical Society, pp 465-479.
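The comparison step can be sketched as a Spearman rank correlation between the language and brain values for one word. The example uses the London values from the table above (excluding London itself, which is an assumption) and gives tied values their average rank. The significance test is omitted, and the result is not expected to reproduce the coefficients quoted on the slides, which come from the authors' own computations.

```python
def average_ranks(values):
    """Rank values from 1 (smallest), giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while (end + 1 < len(order)
               and values[order[end + 1]] == values[order[start]]):
            end += 1
        shared = (start + end) / 2 + 1  # mean of 1-based positions start..end
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the two rank sequences."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Values for London from the table above, in the same word order.
WORDS = ["Paris", "Moscow", "Germany", "Russia", "Poland",
         "north", "south", "west", "east"]
LANGUAGE = [0.466, 0.396, 0.322, 0.303, 0.299, 0.106, 0.103, 0.078, 0.076]
BRAIN = [0.133, 0.108, 0.075, 0.033, 0.025, 0.042, 0.000, 0.000, 0.008]

rho = spearman(LANGUAGE, BRAIN)
print(round(rho, 3))
```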
For those instances in which the brain and language partial orders are significantly correlated, we find the partial order that is invariant with respect to the brain and language data.
Here are two more examples:
[Figure: invariant partial order; node labels by row: London, Moscow | Paris | north | south, east | west | Germany, Poland | Russia]
Significant Invariance - Paris - Spearman 0.88 (p = 6.6795e-004)
[Figure: invariant partial order; node labels by row: London, Moscow, Paris | north | south, east | west | Germany, Poland, Russia]
Significant Invariance - Paris - Spearman 0.90 (p = 3.8716e-004)
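The slides do not spell out how the invariant partial order is computed. One simple reading, assumed here purely for illustration, is to keep only the ordered pairs on which the brain ordering and the language ordering agree, that is, the intersection of the two strict orders. The sketch uses the London values from the table shown earlier; the function names are hypothetical.

```python
from itertools import permutations

# London values from the earlier table (London itself left out).
LANGUAGE = {"Paris": 0.466, "Moscow": 0.396, "Germany": 0.322,
            "Russia": 0.303, "Poland": 0.299, "north": 0.106,
            "south": 0.103, "west": 0.078, "east": 0.076}
BRAIN = {"Paris": 0.133, "Moscow": 0.108, "Germany": 0.075,
         "north": 0.042, "Russia": 0.033, "Poland": 0.025,
         "east": 0.008, "south": 0.000, "west": 0.000}


def strict_order(scores):
    """Pairs (a, b) where a scores strictly higher than b; ties are unordered."""
    return {(a, b) for a, b in permutations(scores, 2)
            if scores[a] > scores[b]}


def invariant_order(scores_a, scores_b):
    """Pairs ordered the same way by both score tables (set intersection)."""
    return strict_order(scores_a) & strict_order(scores_b)


INVARIANT = invariant_order(LANGUAGE, BRAIN)
print(("Paris", "Moscow") in INVARIANT, ("Russia", "north") in INVARIANT)
```

The intersection of two strict partial orders is again irreflexive, asymmetric and transitive, so the result is a partial order; pairs on which the two sources disagree, or which are tied in either source, simply drop out.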
Brain data
[Figure: similarity tree, axis 0.2 to 0.6; leaf order: London, Paris, Moscow, Germany, Poland, Russia, north, south, west, east]
Another hierarchical cluster tree (similarity tree) computed from the conditional probability density estimates for the classification of 640 brain wave samples for London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia.
Every single-trial classification produces its own conditional probability density estimates, giving rise to its own similarity tree …
[Figure: similarity tree, axis 0.1 to 0.5; leaf order: north, east, south, west, London, Paris, Germany, Moscow, Russia, Poland]
1. We compute M (=30) single-trial classifications of the data (640 data samples for our 10 words) using random resampling with replacement.
2. For each classification we find, for each word, the partial orders of the brain and language data that are significantly correlated.
3. For each of these highly correlated partial order pairs, we find the partial order invariant with respect to both.
We performed 60 classifications – that is, we recomputed the classifications of the brain data using random resampling with replacement.
For half of these 60 classifications we compared the brain data to the WordNet data, and for the other half we compared the brain data to the LSA data.
And we plotted the results.
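The resampling step can be sketched as follows. Only the index-drawing is shown; the classifier itself is not described on these slides and is not reproduced here. Drawing a full-size resample of the 640 sample indices, and the fixed seed, are assumptions made for the illustration.

```python
import random

N_SAMPLES = 640  # brain-wave samples, as stated on the slide
M = 30           # resampled classifications per comparison


def bootstrap_indices(n, rng):
    """Draw n sample indices uniformly at random, with replacement."""
    return [rng.randrange(n) for _ in range(n)]


rng = random.Random(0)  # fixed seed so the sketch is repeatable (assumed)
RESAMPLES = [bootstrap_indices(N_SAMPLES, rng) for _ in range(M)]

# Each resample would then be passed to the classifier, giving M sets of
# conditional probability estimates and hence M sets of partial orders.
print(len(RESAMPLES), len(RESAMPLES[0]), len(set(RESAMPLES[0])))
```

Because the draw is with replacement, each resample repeats some samples and omits others, which is why every classification yields slightly different estimates and its own similarity tree.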
[Figure: bar chart, "Partial Orders of Similarity Differences Invariant between EEG-recorded Brain Data and Semantic Representations of Language". Vertical axis: Significant Invariant Partial Orders (0 to 300). Horizontal axis: Test Participants (S10, S24, S25, S27, S16, S26, S12, S13, S18). Series: WordNet, LSA.]
A higher number of significant structural similarities was found between the brain data and the WordNet data than between the brain data and the LSA data.
This stronger structural similarity between the brain data and the WordNet-derived data supports the contention that, during language comprehension for the complex cognitive task of assessing truth or falsity, the representation of words in the brain has a WordNet-like quality.
This stronger structural similarity is the first evidence we have seen of WordNet-like representations in the brain.
Banerjee S, Pedersen T (2003) Extended Gloss Overlaps as a Measure of Semantic Relatedness. In Proceedings of the Eighteenth International Joint Conference on Artificial Intelligence, pp 805-810, August 9-15, 2003, Acapulco, Mexico.
Banerjee S, Pedersen T (2002) An Adapted Lesk Algorithm for Word Sense Disambiguation using WordNet. In Proceedings of the Third International Conference on Intelligent Text Processing and Computational Linguistics, pp 136-145, February 17-23, 2002, Mexico City.
Baroni M, Murphy B, Barbu E, Poesio M (2010) Strudel: A Corpus-Based Semantic Model Based on Properties and Types. Cognitive Science 34(2), pp 222-254. DOI 10.1111/j.1551-6709.2009.01068.x
Blei D, Ng A, Jordan M (2003) Latent Dirichlet allocation. Journal of Machine Learning Research 3, 993–1022.
Brants T, Franz A (2006) www.ldc.upenn.edu/Catalog/CatalogEntry.jsp?catalogId=LDC2006T13. Linguistic Data Consortium, Philadelphia.
Budanitsky A, Hirst G (2001) Semantic distance in WordNet: an experimental, application-oriented evaluation of five measures. In Proceedings of the NAACL 2001 Workshop on WordNet and Other Lexical Resources, Pittsburgh, June 2001. http://www.seas.smu.edu/~rada/mwnw/papers/WNW-NAACL-101.pdf.gz
Collins AM, Loftus EF (1975) A spreading-activation theory of semantic processing. Psychological Review 82(6), 407-428.
Crestani F (1997) Application of Spreading Activation Techniques in Information Retrieval. Artificial Intelligence Review 11, 453–482.
Deerwester S, Dumais ST, Furnas GW, Landauer TK, Harshman R (1990) Indexing By Latent Semantic Analysis. Journal of the American Society for Information Science 41, 391-407.
Devereux B, Kelly C, Korhonen A (2010) Using fMRI activation to conceptual stimuli to evaluate methods for extracting conceptual representations from corpora. Proceedings of the NAACL HLT 2010 First Workshop on Computational Neurolinguistics, June 2010, Los Angeles, USA. Association for Computational Linguistics, 70–78.
Fellbaum C (Ed) (1998) WordNet: An Electronic Lexical Database. Cambridge, MA: MIT Press.
Hauk O, Davis MH, Ford M, Pulvermüller F, Marslen-Wilson WD (2006) The time course of visual word recognition as revealed by linear regression analysis of ERP data. Neuroimage 2006 May 1;30(4):1383-400. Epub 2006 Feb 7.
Hirst G (1988) Resolving lexical ambiguity computationally with spreading activation and Polaroid Words. In Small SL, Cottrell GW, Tanenhaus MK (Eds), Lexical ambiguity resolution: Perspectives from psycholinguistics, neuropsychology, and artificial intelligence. San Mateo, CA: Morgan Kaufmann Publishers, 73–107.
Jelodar AB, Alizadeh M, Khadivi S (2010) WordNet Based Features for Predicting Brain Activity associated with meanings of noun. Proceedings of the NAACL HLT 2010 First Workshop on Computational Neurolinguistics, June 2010, Los Angeles, USA. Association for Computational Linguistics, 18–26.
Jiang J, Conrath D (1997) Semantic similarity based on corpus statistics and lexical taxonomy. In Proceedings of International Conference on Research in Computational Linguistics, Taiwan, 19-33.
Just MA, Cherkassky VL, Aryal S, Mitchell TM (2010) A Neurosemantic Theory of Concrete Noun Representation Based on the Underlying Brain Codes. PLoS ONE 5(1): e8622. doi:10.1371/journal.pone.0008622
Karp RM (1972) Reducibility among combinatorial problems. In Complexity of Computer Computations, Proc Sympos IBM Thomas J Watson Res Center, Yorktown Heights, NY (Ed RE Miller and JW Thatcher). New York: Plenum, pp 85-103.
Kelly C, Devereux B, Korhonen A (2010) Acquiring Human-like Feature-Based Conceptual Representations from Corpora. Proceedings of the NAACL HLT 2010 First Workshop on Computational Neurolinguistics, June 2010, Los Angeles, USA. Association for Computational Linguistics, 61–69.
Kriegeskorte N, Mur M, Bandettini PA (2008) Representational similarity analysis – connecting the branches of systems neuroscience. Frontiers in Systems Neuroscience. doi:10.3389/neuro.06.004.2008
Kutas M, Hillyard SA (1984) Brain potentials during reading reflect word expectancy and semantic association. Nature 307, 161-163.
Kutas M, Federmeier KD (2011) Thirty years and counting: Finding meaning in the N400 component of the event related brain potential (ERP). Annual Review of Psychology 62, pp 621-647.
Landauer TK, Dumais ST (1997) A solution to Plato's problem: The Latent Semantic Analysis theory of the acquisition, induction, and representation of knowledge. Psychological Review 104, 211-240.
Landauer TK, Foltz PW, Laham D (1998) Introduction to Latent Semantic Analysis. Discourse Processes 25, 259-284.
Leacock C, Chodorow M (1998) Combining local context and WordNet similarity for word sense identification. In Fellbaum (Ed), WordNet: An Electronic Lexical Database. Cambridge, MA: MIT Press, pp 265-283.
Leech G, Rayson P, Wilson A (2001) Word Frequencies in Written and Spoken English: based on the British National Corpus. Longman, London.
Lin D (1998) An information-theoretic definition of similarity. In Proc 15th International Conf on Machine Learning. Morgan Kaufmann, San Francisco, CA, pp 296-304.
Luce RD (1956) Semiorders and a theory of utility discrimination. Econometrica 24, 178–191. MR0078632. http://www.jstor.org/stable/1905751
Matsunaga T, Yonemori C, Tomita E, Muramatsu M (2009) Clique-based data mining for related genes in a biomedical database. BMC Bioinformatics 2009 Jul 1;10:205.
Martin Ph (2003) Correction and Extension of WordNet 1.7. ICCS 2003, 11th International Conference on Conceptual Structures (© Springer Verlag, LNAI 2746, pp 160-173), Dresden, Germany, July 21-25, 2003.
Miller GA, Nicely P (1955) An analysis of perceptual confusions among some English consonants. J Acoust Soc Am 27(2).
Miller GA, Leacock C, Tengi R, Bunker RT (1993) A Semantic Concordance. In Proceedings of the 3 DARPA Workshop on Human Language Technology.
Miller GA (1995) WordNet: A Lexical Database for English. Communications of the ACM 38(11), 39-41.
Mitchell TM, Shinkareva SV, Carlson A, Chang KM, Malave VL, Mason RA, Just MA (2008) Predicting Human Brain Activity Associated with the Meanings of Nouns. Science 320, 1191, May 30, 2008. DOI 10.1126/science.1152876
Murphy B, Baroni M, Poesio M (2009) EEG responds to conceptual stimuli and corpus semantics. In Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing, pages 619–627, Singapore, 6-7 August 2009. © 2009 ACL and AFNLP.
Murphy B, Poesio M (2010) Detecting semantic category in simultaneous EEG/MEG recordings. In First workshop on computational neurolinguistics, NAACL HLT 2010 (pp 36–44). Los Angeles: Association for Computational Linguistics.
Murphy B, Poesio M, Bovolo F, Bruzzone L, Dalponte M, Lakany H (2011) EEG decoding of semantic category reveals distributed representations for single concepts. Brain Lang 2011 Apr;117(1):12-22. doi:10.1016/j.bandl.2010.09.013
Pedersen T, Patwardhan S, Michelizzi J (2004) WordNet::Similarity - Measuring the Relatedness of Concepts. In Proceedings of Fifth Annual Meeting of the North American Chapter of the Association for Computational Linguistics (NAACL-2004), pp 38-41, Boston, May 2004.
Pereira F, Botvinick M, Detre G (2010) Learning semantic features for fMRI data from definitional text. Proceedings of the NAACL HLT 2010 First Workshop on Computational Neurolinguistics, June 2010, Los Angeles, USA. Association for Computational Linguistics, 1–9. http://www.aclweb.org/anthology/W10-0601
Perreau-Guimaraes M, Wong DK, Uy ET, Grosenick L, Suppes P (2007) Single-trial classification of MEG recordings. IEEE Transactions on Biomedical Engineering 54, 436–443.
Rabinovitch I (1977) The Scott-Suppes theorem on semiorders. J Mathematical Psychology 15(2), 209–212. MR0437404.
Resnik P (1995) Using information content to evaluate semantic similarity. In Proceedings of the 14th International Joint Conference on Artificial Intelligence, pages 448-453, Montreal.
Roach BJ, Mathalon DH (2008) Event-related EEG time-frequency analysis: an overview of measures and an analysis of early gamma band phase locking in schizophrenia. Schizophr Bull 2008 Sep;34(5):907-26. Epub 2008 Aug 6.
Scott D, Suppes P (1958) Foundational aspects of theories of measurement. The Journal of Symbolic Logic 23, 113–128. MR0115919.
Suppes P (1972) Axiomatic Set Theory. New York: Dover.
Suppes P (1974) The axiomatic method in the empirical sciences. In L Henkin et al (Eds), Proceedings of the Tarski Symposium, Proceedings of Symposia in Pure Mathematics 25. Providence, RI: American Mathematical Society, pp 465-479.
Suppes P, Lu Z-L, Han B (1997) Brain wave recognition of words. Proceedings of the National Academy of Sciences 94, 14965–14969.
Suppes P, Han B, Lu Z-L (1998) Brain-wave recognition of sentences. Proceedings of the National Academy of Sciences 95, 15861–15866.
Suppes P, Han B, Epelboim J, Lu Z-L (1999a) Invariance between subjects of brain wave representations of language. Proceedings of the National Academy of Sciences 96, 12953–12958.
Suppes P, Han B, Epelboim J, Lu Z-L (1999b) Invariance of brain-wave representations of simple visual images and their names. Proceedings of the National Academy of Sciences 96, 14658-14663.
Suppes P, Han B (2000) Brain-wave representation of words by superposition of a few sine waves. Proceedings of the National Academy of Sciences 97, 8738–8743.
Suppes P, Perreau-Guimaraes M, Wong DK (2009) Partial Orders of Similarity Differences Invariant Between EEG-Recorded Brain and Perceptual Representations of Language. Neural Computation 21, 3228–3269.
Suppes P, de Barros JA, Oas G (2012) Phase-Oscillator computations as neural models of stimulus-response conditioning and response selection. Journal of Mathematical Psychology 56, 95-117.
Turney P (2001) Mining the Web for Synonyms: PMI-IR versus LSA on TOEFL. In L De Raedt & P Flach (Eds), Proceedings of the Twelfth European Conference on Machine Learning (ECML-2001) (pp 491-502). Freiburg, Germany.
Vassilieva E, Pinto G, de Barros JA, Suppes P (2011) Learning Pattern Recognition Through Quasi-Synchronization of Phase Oscillators. IEEE Transactions on Neural Networks 22(1), January 2011.
Vigliocco G, Warren J, Siri S, Arciuli J, Scott S, Wise R (2006) The role of semantics and grammatical class in the neural representation of words. Cereb Cortex 2006 Dec;16(12):1790-6. Epub 2006 Jan 18.
Wong DK, Perreau-Guimaraes M, Uy ET, Suppes P (2004) Classification of individual trials based on the best independent component of EEG-recorded sentences. Neurocomputing 61, 479-484.
Wong DK, Uy ET, Perreau-Guimaraes M, Yang W, Suppes P (2006) Interpretation of perceptron weights as constructed time series for EEG classification. Neurocomputing 70, 373-383.
Wong DK, Grosenick L, Uy ET, Perreau-Guimaraes M, Carvalhaes CG, Desain P, Suppes P (2008) Quantifying inter-subject agreement in brain-imaging analyses. NeuroImage 39, 1051–1063.
Woods W (1975) What's in a link: Foundations for semantic networks. In Bobrow D, Collins A (Eds), Representation and understanding. New York: Academic Press, 1975, 35-82.
Wu Z, Palmer M (1994) Verb Semantics and Lexical Selection. In Proceedings of the 32nd Annual Meeting of the Association for Computational Linguistics, Las Cruces, New Mexico, pp 133-138.
WORDNET
WordNet is a human-annotated lexical database of English in which nouns, verbs, adjectives and adverbs are grouped into sets of cognitive synonyms (called synsets), each synset expressing a distinct concept.
See Miller (1995), Fellbaum (1998) and http://wordnet.princeton.edu
The synsets are related to each other primarily through the hypernymy and hyponymy relations for nouns. Other relations in WordNet are part-whole (holonym), member of (meronym), has instance, and so on.
Hyponymy, often referred to as the is-a relation in computational discussions, is defined as follows: a concept represented by a lexical item Li is said to be a hyponym of the concept represented by a lexical item Lk if native speakers of English accept sentences of the form "An Li is a kind of Lk". Conversely, Lk is the hypernym of Li. A hypernym is therefore a more general concept and a hyponym a more specific concept.
WORDNET
Multiple word senses, organized into sets of cognitive synonyms called "synsets"
(n) Paris1, City of Light1, French capital1, capital of France1 (the capital and largest city of France, and international center of culture and commerce)
(n) Paris2, genus Paris1 (sometimes placed in subfamily Trilliaceae)
(n) Paris3 ((Greek mythology) the prince of Troy who abducted Helen from her husband Menelaus and provoked the Trojan War)
(n) Paris4 (a town in northeastern Texas)
(n) city1, metropolis1, urban center1 (a large and densely populated urban area; may include several independent administrative districts) [relations listed: direct hyponym, full hyponym, part meronym, has instance]
(n) city2 (an incorporated administrative district established by state charter)
(n) city3, metropolis2 (people living in a large densely populated municipality)
(n) east1, due east1, eastward1, E3 (the cardinal compass point that is at 90 degrees)
(n) East2, Orient1 (the countries of Asia)
(n) East3, eastern United States1 (the region of the United States lying to the north of the Ohio River and to the east of the Mississippi River)
(n) east4 (the direction corresponding to the eastward cardinal compass point)
(n) east5 (a location in the eastern part of a country, region or city)
LATENT SEMANTIC ANALYSIS (LSA)
LSA is a statistical technique for extracting, from large collections of documents, a measure of how similar two words are to each other in terms of the patterns of their co-occurrences within those documents. See Deerwester et al 1990, Landauer and Dumais 1997, Landauer et al 1998.
The underlying idea is that if, for each word, you take into account all the contexts in which it does and does not appear, you get for all the words a set of mutual constraints that represent how similar any two words are to each other.
The similarity judgments produced by latent semantic analysis have been shown to correspond to some extent to human judgments of similarity. After training on about 2000 pages of English text, it scored as well as average test-takers on the synonym portion of the Test of English as a Foreign Language. After training on a psychology textbook, it achieved a passing score on a multiple-choice exam.
We used the application at http://lsa.colorado.edu to compute similarity matrices in term space for our set of words. The computation was based on ~38,000 college-level texts (novels, newspaper articles, …). A maximum of 300 factors was permitted in the analysis.
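Once each word has a vector in the reduced LSA space, a term-by-term similarity matrix is commonly obtained by taking the cosine between vectors. The sketch below shows only that last step, assuming cosine is the comparison used. The three-dimensional vectors are made up for illustration; real LSA vectors come from a truncated singular value decomposition of a term-by-document matrix and here would have up to 300 dimensions.

```python
import math

# Hypothetical term vectors, invented for this example only.
TERM_VECTORS = {
    "north": [0.9, 0.1, 0.2],
    "south": [0.8, 0.2, 0.1],
    "Paris": [0.1, 0.9, 0.3],
}


def cosine(u, v):
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def similarity_matrix(vectors):
    """Map every ordered pair of words to the cosine of their vectors."""
    words = list(vectors)
    return {(a, b): cosine(vectors[a], vectors[b])
            for a in words for b in words}


S = similarity_matrix(TERM_VECTORS)
for pair in [("north", "south"), ("north", "Paris")]:
    print(pair, round(S[pair], 3))
```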
         London  Moscow  Paris   north   south   east    west    Germany Poland  Russia
London   1       0.17    0.37    0.13    0.12    0.12    0.14    0.17    0.16    0.16
Moscow   0.17    1       0.18    0.13    0.08    0.22    0.14    0.37    0.65    0.69
Paris    0.37    0.18    1       0.1     0.04    0.08    0.09    0.5     0.31    0.36
north    0.13    0.13    0.1     1       0.89    0.6     0.61    0.07    0.09    0.14
south    0.12    0.08    0.04    0.89    1       0.5     0.55    0.02    0.05    0.05
east     0.12    0.22    0.08    0.6     0.5     1       0.85    0.22    0.29    0.3
west     0.14    0.14    0.09    0.61    0.55    0.85    1       0.24    0.26    0.23
Germany  0.17    0.37    0.5     0.07    0.02    0.22    0.24    1       0.85    0.81
Poland   0.16    0.65    0.31    0.09    0.05    0.29    0.26    0.85    1       0.87
Russia   0.16    0.69    0.36    0.14    0.05    0.3     0.23    0.81    0.87    1
Semantic similarity matrix derived from LSA for the set of words London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia
24 copy Colleen E Crangle 2013
01 02 03 04 05 06
east
west
north
south
London
Paris
Moscow
Germany
Poland
Russia
Hierarchical cluster tree computed from the pair-wise Latent Semantic
Analysis (LSA) scores of similarity for London Moscow Paris north
south east west Germany Poland Russia based on ~ 38000 college-
level texts (novels newspaper articleshellip)
25 copy Colleen E Crangle 2013
LSA provides straightforward measure of similarity between words
For WORDNET several different measures of similarity have been devised Eg Path length between synsets Information content a corpusndashbased measure of the specificity of a concept measured in terms of the frequency of occurrence of the concept in the corpus the human-annotated sensendashtagged corpus SemCor (Miller et al 1993) which links every word in the Brown Corpus to its appropriate WordNet sense Scaled various ways Vector-space models -- works by forming second-order co-occurrence vectors from the WordNet definitionsof concepts known as glosses We used five measures in our computations of similarity and took the average score using each of the relevant senses
26 copy Colleen E Crangle 2013
Semantic similarity matrix derived from WordNet for the set of
words London Moscow Paris north south east west Germany
Poland Russia using senses relevant to the geography of Europe
and five measures of similarity wup (path length) lin and jcn
(information content) and gv and pgv (vector space measures)
London Moscow Paris north south east west Germany Poland Russia
London 1 0396 0466 0106 0103 0076 0078 0322 0299 0303
Moscow 0396 1 0393 0095 0094 0062 007 0286 0281 0288
Paris 0466 0393 1 0106 0104 0074 0077 0327 0308 0307
north 0106 0095 0106 1 0228 0179 021 0123 0132 0111
south 0103 0094 0104 0228 1 0172 0212 0115 0107 0109
east 0076 0062 0074 0179 0172 1 0216 0093 008 0077
west 0078 007 0077 021 0212 0216 1 0087 0082 0083
Germany 0322 0286 0327 0123 0115 0093 0087 1 0589 0409
Poland 0299 0281 0308 0132 0107 008 0082 0589 1 0403
Russia 0303 0288 0307 0111 0109 0077 0083 0409 0403 1
27 copy Colleen E Crangle 2013
02 03 04 05 06
Germany
Poland
Russia
London
Paris
Moscow
north
south
west
east
Hierarchical cluster tree computed from pairwise WordNet-based semantic
similarity scores for London Moscow Paris north south east west
Germany Poland Russia restricted to senses related to the geography of
Europe
28 copy Colleen E Crangle 2013
BACK TO THE BRAIN hellip
London Moscow Paris north south
east west Germany Poland Russia
29 copy Colleen E Crangle 2013
Now letrsquos see how to compare the EEG data
and the language datahellip
30 copy Colleen E Crangle 2013
Some of the similarity trees show remarkable congruence
between the brain and semantic data
Where exactly does that congruence lie
Can we devise a quantitative measure of the nature and
strength of that congruence
02 03 04 05 06
London Paris Moscow Germany Poland Russia north south west east
02 03 04 05 06
Germany Poland Russia London Paris Moscow north south west east
LANGUAGE DATA
BRAIN DATA
31 copy Colleen E Crangle 2013
WordNet-based semantic similarities and EEG conditional probability
estimates for London relative to London (L) Moscow (M) Paris (P) north
(n) south (s) east (e) west (w) Germany (G) Poland (Po) and Russia reg
The Spearman rank correlation for the two sequences in the figure is 099 with one-sided significance of 184E-10
32 copy Colleen E Crangle 2013
For each word ω we compute from the conditional probability
density estimates a ternary relation R such that R( ω ω1 ω2 ) if
and only if with respect to word ω the conditional probability for word
ω1 is smaller than the conditional probability for word ω2 that is if and
only if ω1s similarity difference with ω is smaller than ω2s similarity
difference with ω
R is an ordinal relation of similarity differences a partial order
that is irreflexive asymmetric and transitive
BRAIN DATA
33 copy Colleen E Crangle 2013
For each word ω we compute from the semantic similarity matrix
a ternary relation R such that R ( ω ω1 ω2 ) if and only the
similarity difference of ω1 with ω is smaller than the similarity
difference of ω2 with ω that is ω1 is more similar to ω than is ω2
R is an ordinal relation of similarity differences a partial order
that is irreflexive asymmetric and transitive
LANGUAGE DATA
34 copy Colleen E Crangle 2013
London
Language data Brain data London 1000 London 0275 Paris 0466 Paris 0133 Moscow 0396 Moscow 0108 Germany 0322 Germany 0075 Russia 0303 north 0042 Poland 0299 Russia 0033 north 0106 Poland 0025 south 0103 east 0008 west 0078 south 0000
east 0076 west 0000
Partial orders for London derived from the WordNet
semantic similarities of Table 2 and the conditional
probability estimates for the brain data of Table 5
Poland
north
south
west
east
north
Poland
east
south
west
35 copy Colleen E Crangle 2013
Following the approach described in Suppes (1974) for the axiomatization of
the theory of differences in utility preference or the theory of differences in
psychological intensity Suppes P (1974) The axiomatic method in the empirical sciences L Henkin et
al (Eds) Proceedings of the Tarski Symposium Proceedings of Symposia in
Pure Mathematics 25 Providence RI American Mathematical Society pp 465-
479
The relational structure (A R)
constructed from R and the finite
set A of classes ω1 ω2 hellip ωN
together with the N partial orders
constructed from the N-by-N
estimate for the conditional
probability densities
The relational structure (A R) constructed from R and the finite set A of classes ω1 ω2 hellip ωN together with the N partial orders constructed from the N-by-N similarity matrix
Brain data
Language data
For each ω1 we compare the partial order of the brain data with the partial order of the language data using Spearmanrsquos rank correlation coefficient which we interpret in the usual way to determine if we have a statistically significant correlation or not
36 copy Colleen E Crangle 2013
London Moscow
Paris
north
south east
west
Germany Poland
Russia
Significant Invariance - Paris - Spearman 088 (p=66795e-004)
London Moscow Paris
north
south east
west
Germany Poland Russia
Significant Invariance - Paris - Spearman 090 (p=38716e-004)
For those instances in which the brain
and language partial orders are
significantly correlated we find the
partial order that is invariant with
respect to the brain and language data
Here are two more examples
37 copy Colleen E Crangle 2013
02 03 04 05 06
London Paris Moscow Germany Poland Russia north south west east
Another hierarchical cluster tree (similarity tree) computed from the conditional probability density estimates for the classification of 640 brain wave samples for London Moscow Paris north south east west Germany Poland Russia
Every single-trial classification produces its own conditional probability density estimates giving rise to its own similarity tree hellip
01 02 03 04 05
north east south west London Paris Germany Moscow Russia Poland
Brain data
38 copy Colleen E Crangle 2013
1 We compute M (=30) single-trial classifications of the data (640 data samples for our 10 words) using random
resampling with replacement
2 For each classification we find for each word the partial orders of the brain and language data that are significantly
correlated
3 For each of these highly correlated partial order pairs we find the partial order invariant with respect to both
We performed 60 classifications ndash that is we recomputed the classifications of the brain data using random resampling with
replacement
For half of these 60 classifications we compared the brain data to the
WordNet data
and for the other half we compared the brain data to the
LSA data
And we plotted the results
39 copy Colleen E Crangle 2013
0
50
100
150
200
250
300
S10 S24 S25 S27 S16 S26 S12 S13 S18
WordNet
LSA
Partial Orders of Similarity Differences Invariant between EEG-
recorded Brain Data and Semantic Representations of Language
Significant
Invariant
Partial
Orders
Test Participants
40 copy Colleen E Crangle 2013
A higher number of significant structural similarities was
found between the brain data and the WordNet data than
the LSA data
This stronger structural similarity between the
brain data and the WordNet-derived data
supports the contention that during language
comprehension for the complex cognitive task
of assessing truth or falsity the representation
of words in the brain has a WordNet-like quality
This stronger structural similarity is the first
evidence we have seen of WordNet-like
representations in the brain
Banerjee S, Pedersen T (2003). Extended Gloss Overlaps as a Measure of Semantic Relatedness. In Proceedings of the Eighteenth International Joint Conference on Artificial Intelligence, pp. 805-810, August 9-15, 2003, Acapulco, Mexico.
Banerjee S, Pedersen T (2002). An Adapted Lesk Algorithm for Word Sense Disambiguation using WordNet. In Proceedings of the Third International Conference on Intelligent Text Processing and Computational Linguistics, pp. 136-145, February 17-23, 2002, Mexico City.
Baroni M, Murphy B, Barbu E, Poesio M (2010). Strudel: A Corpus-Based Semantic Model Based on Properties and Types. Cognitive Science 34(2), 222-254. DOI: 10.1111/j.1551-6709.2009.01068.x
Blei D, Ng A, Jordan M (2003). Latent Dirichlet allocation. Journal of Machine Learning Research 3, 993–1022.
Brants T, Franz A (2006). www.ldc.upenn.edu/Catalog/CatalogEntry.jsp?catalogId=LDC2006T13. Linguistic Data Consortium, Philadelphia.
Budanitsky A, Hirst G (2001). Semantic distance in WordNet: an experimental, application-oriented evaluation of five measures. In Proceedings of the NAACL 2001 Workshop on WordNet and Other Lexical Resources, Pittsburgh, June 2001. http://www.seas.smu.edu/~rada/mwnw/papers/WNW-NAACL-101.pdf.gz
Collins AM, Loftus EF (1975). A spreading-activation theory of semantic processing. Psychological Review 82(6), 407-428.
Crestani F (1997). Application of Spreading Activation Techniques in Information Retrieval. Artificial Intelligence Review 11, 453–482.
Deerwester S, Dumais ST, Furnas GW, Landauer TK, Harshman R (1990). Indexing By Latent Semantic Analysis. Journal of the American Society for Information Science 41, 391-407.
Devereux B, Kelly C, Korhonen A (2010). Using fMRI activation to conceptual stimuli to evaluate methods for extracting conceptual representations from corpora. Proceedings of the NAACL HLT 2010 First Workshop on Computational Neurolinguistics, June 2010, Los Angeles, USA. Association for Computational Linguistics, 70–78.
Fellbaum C (Ed) (1998). WordNet: An Electronic Lexical Database. Cambridge, MA: MIT Press.
Hauk O, Davis MH, Ford M, Pulvermüller F, Marslen-Wilson WD (2006). The time course of visual word recognition as revealed by linear regression analysis of ERP data. Neuroimage 2006 May 1;30(4):1383-400. Epub 2006 Feb 7.
Hirst G (1988). Resolving lexical ambiguity computationally with spreading activation and Polaroid Words. In Small SL, Cottrell GW, Tanenhaus MK (editors), Lexical ambiguity resolution: Perspectives from psycholinguistics, neuropsychology, and artificial intelligence. San Mateo, CA: Morgan Kaufmann Publishers, 1988, 73–107.
Jelodar AB, Alizadeh M, Khadivi S (2010). WordNet Based Features for Predicting Brain Activity associated with meanings of noun. Proceedings of the NAACL HLT 2010 First Workshop on Computational Neurolinguistics, June 2010, Los Angeles, USA. Association for Computational Linguistics, 18–26.
Jiang J, Conrath D (1997). Semantic similarity based on corpus statistics and lexical taxonomy. In Proceedings of International Conference on Research in Computational Linguistics, Taiwan, 19-33.
Just MA, Cherkassky VL, Aryal S, Mitchell TM (2010). A Neurosemantic Theory of Concrete Noun Representation Based on the Underlying Brain Codes. PLoS ONE 5(1): e8622. doi:10.1371/journal.pone.0008622
Karp RM (1972). Reducibility among combinatorial problems. In Complexity of Computer Computations, Proc. Sympos. IBM Thomas J. Watson Res. Center, Yorktown Heights, NY (Ed. R. E. Miller and J. W. Thatcher). New York: Plenum, pp. 85-103.
Kelly C, Devereux B, Korhonen A (2010). Acquiring Human-like Feature-Based Conceptual Representations from Corpora. Proceedings of the NAACL HLT 2010 First Workshop on Computational Neurolinguistics, June 2010, Los Angeles, USA. Association for Computational Linguistics, 61–69.
Kriegeskorte N, Mur M, Bandettini PA (2008). Representational similarity analysis – connecting the branches of systems neuroscience. Frontiers in Systems Neuroscience. doi:10.3389/neuro.06.004.2008
Kutas M, Hillyard SA (1984). Brain potentials during reading reflect word expectancy and semantic association. Nature 307, 161-163.
Kutas M, Federmeier KD (2011). Thirty years and counting: Finding meaning in the N400 component of the event related brain potential (ERP). Annual Review of Psychology 62, 621-647.
Landauer TK, Dumais ST (1997). A solution to Plato's problem: The Latent Semantic Analysis theory of the acquisition, induction, and representation of knowledge. Psychological Review 104, 211-240.
Landauer TK, Foltz PW, Laham D (1998). Introduction to Latent Semantic Analysis. Discourse Processes 25, 259-284.
Leacock C, Chodorow M (1998). Combining local context and WordNet similarity for word sense identification. In Fellbaum (Ed), WordNet: An Electronic Lexical Database. Cambridge, MA: MIT Press, pp. 265-283.
Leech G, Rayson P, Wilson A (2001). Word Frequencies in Written and Spoken English: based on the British National Corpus. Longman, London.
Lin D (1998). An information-theoretic definition of similarity. In Proc. 15th International Conf. on Machine Learning. Morgan Kaufmann, San Francisco, CA, pp. 296-304.
Luce RD (1956). Semiorders and a theory of utility discrimination. Econometrica 24, 178–191. MR0078632. http://www.jstor.org/stable/1905751
Matsunaga T, Yonemori C, Tomita E, Muramatsu M (2009). Clique-based data mining for related genes in a biomedical database. BMC Bioinformatics 2009 Jul 1;10:205.
Martin Ph (2003). Correction and Extension of WordNet 1.7. ICCS 2003, 11th International Conference on Conceptual Structures (© Springer Verlag, LNAI 2746, pp. 160-173), Dresden, Germany, July 21-25, 2003.
Miller GA, Nicely P (1955). An analysis of perceptual confusions among some English consonants. J. Acoust. Soc. Am. 27:2.
Miller GA, Leacock C, Tengi R, Bunker RT (1993). A Semantic Concordance. In Proceedings of the 3 DARPA Workshop on Human Language Technology.
Miller GA (1995). WordNet: A Lexical Database for English. Communications of the ACM 38(11), 39-41.
Mitchell TM, Shinkareva SV, Carlson A, Chang KM, Malave VL, Mason RA, Just MA (2008). Predicting Human Brain Activity Associated with the Meanings of Nouns. Science 320, 1191, May 30, 2008. DOI: 10.1126/science.1152876
Murphy B, Baroni M, Poesio M (2009). EEG responds to conceptual stimuli and corpus semantics. In Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing, pages 619–627, Singapore, 6-7 August 2009. © 2009 ACL and AFNLP.
Murphy B, Poesio M (2010). Detecting semantic category in simultaneous EEG/MEG recordings. In First workshop on computational neurolinguistics, NAACL HLT 2010 (pp. 36–44). Los Angeles: Association for Computational Linguistics.
Murphy B, Poesio M, Bovolo F, Bruzzone L, Dalponte M, Lakany H (2011). EEG decoding of semantic category reveals distributed representations for single concepts. Brain Lang 2011 Apr;117(1):12-22. doi: 10.1016/j.bandl.2010.09.013
Pedersen T, Patwardhan S, Michelizzi J (2004). WordNet::Similarity - Measuring the Relatedness of Concepts. In Proceedings of Fifth Annual Meeting of the North American Chapter of the Association for Computational Linguistics (NAACL-2004), pp. 38-41, Boston, May 2004.
Pereira F, Botvinick M, Detre G (2010). Learning semantic features for fMRI data from definitional text. Proceedings of the NAACL HLT 2010 First Workshop on Computational Neurolinguistics, June 2010, Los Angeles, USA. Association for Computational Linguistics, 1–9. http://www.aclweb.org/anthology/W10-0601
Perreau-Guimaraes M, Wong DK, Uy ET, Grosenick L, Suppes P (2007). Single-trial classification of MEG recordings. IEEE Transactions on Biomedical Engineering 54, 436–443.
Rabinovitch I (1977). The Scott-Suppes theorem on semiorders. J. Mathematical Psychology 15(2), 209–212. MR0437404.
Resnik P (1995). Using information content to evaluate semantic similarity. In Proceedings of the 14th International Joint Conference on Artificial Intelligence, pages 448-453, Montreal.
Roach BJ, Mathalon DH (2008). Event-related EEG time-frequency analysis: an overview of measures and an analysis of early gamma band phase locking in schizophrenia. Schizophr Bull 2008 Sep;34(5):907-26. Epub 2008 Aug 6.
Scott D, Suppes P (1958). Foundational aspects of theories of measurement. The Journal of Symbolic Logic 23, 113–128. MR0115919.
Suppes P (1972). Axiomatic Set Theory. New York: Dover.
Suppes P (1974). The axiomatic method in the empirical sciences. L Henkin et al (Eds), Proceedings of the Tarski Symposium, Proceedings of Symposia in Pure Mathematics 25. Providence, RI: American Mathematical Society, pp. 465-479.
Suppes P, Lu Z-L, Han B (1997). Brain wave recognition of words. Proceedings of the National Academy of Sciences 95, 14965–14969.
Suppes P, Han B, Lu Z-L (1998). Brain-wave recognition of sentences. Proceedings of the National Academy of Sciences 95, 15861–15866.
Suppes P, Han B, Epelboim J, Lu Z-L (1999a). Invariance between subjects of brain wave representations of language. Proceedings of the National Academy of Sciences 96, 12953–12958.
Suppes P, Han B, Epelboim J, Lu Z-L (1999b). Invariance of brain-wave representations of simple visual images and their names. Proceedings of the National Academy of Sciences 96, 14658-14663.
Suppes P, Han B (2000). Brain-wave representation of words by superposition of a few sine waves. Proceedings of the National Academy of Sciences 97, 8738–8743.
Suppes P, Perreau-Guimaraes M, Wong DK (2009). Partial Orders of Similarity Differences Invariant Between EEG-Recorded Brain and Perceptual Representations of Language. Neural Computation 21, 3228–3269.
Suppes P, de Barros JA, Oas G (2012). Phase-Oscillator computations as neural models of stimulus-response conditioning and response selection. Journal of Mathematical Psychology 56, 95-117.
Turney P (2001). Mining the Web for Synonyms: PMI-IR versus LSA on TOEFL. In L De Raedt & P Flach (Eds), Proceedings of the Twelfth European Conference on Machine Learning (ECML-2001) (pp. 491-502). Freiburg, Germany.
Vassilieva E, Pinto G, de Barros JA, Suppes P (2011). Learning Pattern Recognition Through Quasi-Synchronization of Phase Oscillators. IEEE Transactions on Neural Networks 22(1), January 2011.
Vigliocco G, Warren J, Siri S, Arciuli J, Scott S, Wise R (2006). The role of semantics and grammatical class in the neural representation of words. Cereb Cortex 2006 Dec;16(12):1790-6. Epub 2006 Jan 18.
Wong DK, Perreau-Guimaraes M, Uy ET, Suppes P (2004). Classification of individual trials based on the best independent component of EEG-recorded sentences. Neurocomputing 61, 479-484.
Wong DK, Uy ET, Perreau-Guimaraes M, Yang W, Suppes P (2006). Interpretation of perceptron weights as constructed time series for EEG classification. Neurocomputing 70, 373-383.
Wong DK, Grosenick L, Uy ET, Perreau-Guimaraes M, Carvalhaes CG, Desain P, Suppes P (2008). Quantifying inter-subject agreement in brain-imaging analyses. NeuroImage 39, 1051–1063.
Woods W (1975). What's in a link: Foundations for semantic networks. In Bobrow D, Collins A, eds, Representation and understanding. New York: Academic Press, 1975: 35-82.
Wu Z, Palmer M (1994). Verb Semantics and Lexical Selection. In Proceedings of the 32nd Annual Meeting of the Association for Computational Linguistics, Las Cruces, New Mexico, pp. 133-138.
WORDNET
(n) Paris#1, City of Light#1, French capital#1, capital of France#1 (the capital and largest city of France and international center of culture and commerce)
(n) Paris#2, genus Paris#1 (sometimes placed in subfamily Trilliaceae)
(n) Paris#3 ((Greek mythology) the prince of Troy who abducted Helen from her husband Menelaus and provoked the Trojan War)
(n) Paris#4 (a town in northeastern Texas)

Multiple word senses
Organized into sets of cognitive synonyms called "synsets"

(n) city#1, metropolis#1, urban center#1 (a large and densely populated urban area; may include several independent administrative districts)
(n) city#2 (an incorporated administrative district established by state charter)
(n) city#3, metropolis#2 (people living in a large densely populated municipality)

(n) east#1, due east#1, eastward#1, E#3 (the cardinal compass point that is at 90 degrees)
(n) East#2, Orient#1 (the countries of Asia)
(n) East#3, eastern United States#1 (the region of the United States lying to the north of the Ohio River and to the east of the Mississippi River)
(n) east#4 (the direction corresponding to the eastward cardinal compass point)
(n) east#5 (a location in the eastern part of a country, region, or city)
LATENT SEMANTIC ANALYSIS (LSA)
LSA is a statistical technique for extracting, from large collections of documents, a measure of how similar two words are to each other in terms of the patterns of their co-occurrence within those documents. See Deerwester et al. (1990), Landauer and Dumais (1997), and Landauer et al. (1998).
The underlying idea is that if, for each word, you take into account all the contexts in which it does and does not appear, you get for all the words a set of mutual constraints that represent how similar any two words are to each other.
The similarity judgments produced by latent semantic analysis have been shown to correspond to some extent to human judgments of similarity. After training on about 2,000 pages of English text, it scored as well as average test-takers on the synonym portion of the Test of English as a Foreign Language. After training on a psychology textbook, it achieved a passing score on a multiple-choice exam.
We used the application at http://lsa.colorado.edu to compute similarity matrices in term space for our set of words. The computation was based on ~38,000 college-level texts (novels, newspaper articles, …). A maximum of 300 factors was permitted in the analysis.
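To make the LSA idea concrete, here is a minimal sketch of term-space similarity, assuming NumPy is available. The five toy "documents", the choice of k = 2 factors, and the function name lsa_similarity are illustrative assumptions; this is not the lsa.colorado.edu application, which the talk used with up to 300 factors on a far larger corpus.

```python
import numpy as np

# Hypothetical toy corpus; each string stands in for one document.
docs = [
    "london paris city",
    "paris london capital",
    "north south east",
    "south north west",
    "north south",
]

# Build the term-by-document count matrix.
vocab = sorted({w for d in docs for w in d.split()})
index = {w: i for i, w in enumerate(vocab)}
X = np.zeros((len(vocab), len(docs)))
for j, d in enumerate(docs):
    for w in d.split():
        X[index[w], j] += 1

# Truncated SVD: keep the k largest factors (k = 2 is an assumption).
U, s, Vt = np.linalg.svd(X, full_matrices=False)
k = 2
term_vecs = U[:, :k] * s[:k]  # one k-dimensional vector per term

def lsa_similarity(a, b):
    """Cosine similarity of two terms in the reduced term space."""
    va, vb = term_vecs[index[a]], term_vecs[index[b]]
    return float(va @ vb / (np.linalg.norm(va) * np.linalg.norm(vb)))

print(lsa_similarity("london", "paris"), lsa_similarity("london", "north"))
```

Words that share contexts end up close together in the reduced space, while words that never share a context end up nearly orthogonal.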
          London  Moscow  Paris   north   south   east    west    Germany Poland  Russia
London    1.00    0.17    0.37    0.13    0.12    0.12    0.14    0.17    0.16    0.16
Moscow    0.17    1.00    0.18    0.13    0.08    0.22    0.14    0.37    0.65    0.69
Paris     0.37    0.18    1.00    0.10    0.04    0.08    0.09    0.50    0.31    0.36
north     0.13    0.13    0.10    1.00    0.89    0.60    0.61    0.07    0.09    0.14
south     0.12    0.08    0.04    0.89    1.00    0.50    0.55    0.02    0.05    0.05
east      0.12    0.22    0.08    0.60    0.50    1.00    0.85    0.22    0.29    0.30
west      0.14    0.14    0.09    0.61    0.55    0.85    1.00    0.24    0.26    0.23
Germany   0.17    0.37    0.50    0.07    0.02    0.22    0.24    1.00    0.85    0.81
Poland    0.16    0.65    0.31    0.09    0.05    0.29    0.26    0.85    1.00    0.87
Russia    0.16    0.69    0.36    0.14    0.05    0.30    0.23    0.81    0.87    1.00
Semantic similarity matrix derived from LSA for the set of words London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia.
[Figure: hierarchical cluster tree; axis ticks 0.1 to 0.6; leaves in order: east, west, north, south, London, Paris, Moscow, Germany, Poland, Russia]
Hierarchical cluster tree computed from the pairwise Latent Semantic Analysis (LSA) scores of similarity for London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia, based on ~38,000 college-level texts (novels, newspaper articles, …).
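As an illustration of how a cluster tree can be grown from a similarity matrix, the sketch below applies greedy average-linkage agglomeration to the LSA matrix from the slide (decimal points restored). The linkage rule is an assumption, since the talk does not say which one was used, and average_linkage is a hypothetical helper name.

```python
WORDS = ["London", "Moscow", "Paris", "north", "south",
         "east", "west", "Germany", "Poland", "Russia"]
SIM = [
    [1.00, 0.17, 0.37, 0.13, 0.12, 0.12, 0.14, 0.17, 0.16, 0.16],
    [0.17, 1.00, 0.18, 0.13, 0.08, 0.22, 0.14, 0.37, 0.65, 0.69],
    [0.37, 0.18, 1.00, 0.10, 0.04, 0.08, 0.09, 0.50, 0.31, 0.36],
    [0.13, 0.13, 0.10, 1.00, 0.89, 0.60, 0.61, 0.07, 0.09, 0.14],
    [0.12, 0.08, 0.04, 0.89, 1.00, 0.50, 0.55, 0.02, 0.05, 0.05],
    [0.12, 0.22, 0.08, 0.60, 0.50, 1.00, 0.85, 0.22, 0.29, 0.30],
    [0.14, 0.14, 0.09, 0.61, 0.55, 0.85, 1.00, 0.24, 0.26, 0.23],
    [0.17, 0.37, 0.50, 0.07, 0.02, 0.22, 0.24, 1.00, 0.85, 0.81],
    [0.16, 0.65, 0.31, 0.09, 0.05, 0.29, 0.26, 0.85, 1.00, 0.87],
    [0.16, 0.69, 0.36, 0.14, 0.05, 0.30, 0.23, 0.81, 0.87, 1.00],
]

def average_linkage(names, sim):
    """Repeatedly merge the two clusters with the highest mean pairwise
    similarity; return the merge history as (members, mean similarity)."""
    clusters = [[i] for i in range(len(names))]
    history = []
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                total = sum(sim[i][j] for i in clusters[a] for j in clusters[b])
                mean = total / (len(clusters[a]) * len(clusters[b]))
                if best is None or mean > best[0]:
                    best = (mean, a, b)
        mean, a, b = best
        merged = clusters[a] + clusters[b]
        history.append((sorted(names[i] for i in merged), round(mean, 3)))
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return history

HISTORY = average_linkage(WORDS, SIM)
for members, mean in HISTORY:
    print(mean, members)
```

Each entry in the merge history corresponds to one internal node of the tree, with the mean similarity at which its two branches join.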
LSA provides a straightforward measure of similarity between words.
For WORDNET, several different measures of similarity have been devised, e.g.:
Path length between synsets.
Information content: a corpus-based measure of the specificity of a concept, measured in terms of the frequency of occurrence of the concept in the corpus. The corpus is the human-annotated, sense-tagged corpus SemCor (Miller et al. 1993), which links every word in the Brown Corpus to its appropriate WordNet sense. Scaled in various ways.
Vector-space models: these work by forming second-order co-occurrence vectors from the WordNet definitions of concepts, known as glosses.
We used five measures in our computations of similarity and took the average score, using each of the relevant senses.
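A path-length measure can be sketched on a toy taxonomy. The example below uses the standard Wu-Palmer formula, 2·depth(lcs) / (depth(a) + depth(b)), where lcs is the deepest shared ancestor. The PARENT tree is a hypothetical stand-in and not WordNet's actual hypernym hierarchy, and the helper names are assumptions.

```python
# Hypothetical toy taxonomy: child -> parent. "entity" is the root.
PARENT = {
    "location": "entity", "direction": "entity",
    "city": "location", "country": "location",
    "London": "city", "Paris": "city",
    "Germany": "country", "Poland": "country",
    "north": "direction", "south": "direction",
}

def path_to_root(node):
    """Return the chain node, parent, ..., root."""
    path = [node]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path

def depth(node):
    """Depth counted in nodes, so the root has depth 1."""
    return len(path_to_root(node))

def wup(a, b):
    """Wu-Palmer similarity on the toy taxonomy."""
    ancestors_b = set(path_to_root(b))
    # First ancestor of a (walking upward) that is also an ancestor of b.
    lcs = next(n for n in path_to_root(a) if n in ancestors_b)
    return 2 * depth(lcs) / (depth(a) + depth(b))

for pair in [("London", "Paris"), ("London", "Germany"), ("London", "north")]:
    print(pair, round(wup(*pair), 3))
```

Averaging several such measures, as the slide describes, would then amount to taking the mean of the per-measure scores for each word pair.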
Semantic similarity matrix derived from WordNet for the set of words London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia, using senses relevant to the geography of Europe and five measures of similarity: wup (path length), lin and jcn (information content), and gv and pgv (vector-space measures).
          London  Moscow  Paris   north   south   east    west    Germany Poland  Russia
London    1.000   0.396   0.466   0.106   0.103   0.076   0.078   0.322   0.299   0.303
Moscow    0.396   1.000   0.393   0.095   0.094   0.062   0.070   0.286   0.281   0.288
Paris     0.466   0.393   1.000   0.106   0.104   0.074   0.077   0.327   0.308   0.307
north     0.106   0.095   0.106   1.000   0.228   0.179   0.210   0.123   0.132   0.111
south     0.103   0.094   0.104   0.228   1.000   0.172   0.212   0.115   0.107   0.109
east      0.076   0.062   0.074   0.179   0.172   1.000   0.216   0.093   0.080   0.077
west      0.078   0.070   0.077   0.210   0.212   0.216   1.000   0.087   0.082   0.083
Germany   0.322   0.286   0.327   0.123   0.115   0.093   0.087   1.000   0.589   0.409
Poland    0.299   0.281   0.308   0.132   0.107   0.080   0.082   0.589   1.000   0.403
Russia    0.303   0.288   0.307   0.111   0.109   0.077   0.083   0.409   0.403   1.000
[Figure: hierarchical cluster tree; axis ticks 0.2 to 0.6; leaves in order: Germany, Poland, Russia, London, Paris, Moscow, north, south, west, east]
Hierarchical cluster tree computed from pairwise WordNet-based semantic similarity scores for London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia, restricted to senses related to the geography of Europe.
BACK TO THE BRAIN …
London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia
Now let's see how to compare the EEG data and the language data …
Some of the similarity trees show remarkable congruence between the brain and semantic data.
Where exactly does that congruence lie?
Can we devise a quantitative measure of the nature and strength of that congruence?
[Figure: two similarity trees, with panels labeled LANGUAGE DATA and BRAIN DATA, each with axis ticks 0.2 to 0.6. Leaf orders as extracted: (London, Paris, Moscow, Germany, Poland, Russia, north, south, west, east) and (Germany, Poland, Russia, London, Paris, Moscow, north, south, west, east).]
WordNet-based semantic similarities and EEG conditional probability estimates for London relative to London (L), Moscow (M), Paris (P), north (n), south (s), east (e), west (w), Germany (G), Poland (Po), and Russia (R).
The Spearman rank correlation for the two sequences in the figure is 0.99, with one-sided significance of 1.84E-10.
For each word ω, we compute from the conditional probability density estimates a ternary relation R such that R(ω, ω1, ω2) if and only if, with respect to word ω, the conditional probability for word ω1 is smaller than the conditional probability for word ω2; that is, if and only if ω1's similarity difference with ω is smaller than ω2's similarity difference with ω.
R is an ordinal relation of similarity differences: a partial order that is irreflexive, asymmetric, and transitive.
BRAIN DATA
For each word ω, we compute from the semantic similarity matrix a ternary relation R such that R(ω, ω1, ω2) if and only if the similarity difference of ω1 with ω is smaller than the similarity difference of ω2 with ω; that is, ω1 is more similar to ω than is ω2.
R is an ordinal relation of similarity differences: a partial order that is irreflexive, asymmetric, and transitive.
LANGUAGE DATA
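The language-data relation can be sketched directly from one row of a similarity matrix. The example below fixes ω = London, uses the WordNet-based London row (decimal points restored), and checks the three stated order properties. The function names are illustrative assumptions.

```python
# WordNet-based similarities of each word to London (the fixed word ω).
LONDON_ROW = {
    "London": 1.000, "Moscow": 0.396, "Paris": 0.466, "north": 0.106,
    "south": 0.103, "east": 0.076, "west": 0.078,
    "Germany": 0.322, "Poland": 0.299, "Russia": 0.303,
}

def ternary_relation(sim_row):
    """Pairs (w1, w2) such that R(ω, w1, w2) holds, i.e. w1 is more
    similar to the fixed word ω than w2 is."""
    return {(a, b) for a in sim_row for b in sim_row if sim_row[a] > sim_row[b]}

def is_strict_partial_order(rel, items):
    """Check irreflexivity, asymmetry, and transitivity."""
    irreflexive = all((x, x) not in rel for x in items)
    asymmetric = all((b, a) not in rel for (a, b) in rel)
    transitive = all((a, d) in rel
                     for (a, b) in rel for (c, d) in rel if b == c)
    return irreflexive and asymmetric and transitive

R_LONDON = ternary_relation(LONDON_ROW)
print(("Paris", "Moscow") in R_LONDON)
print(is_strict_partial_order(R_LONDON, LONDON_ROW))
```

Words with equal scores are simply left unrelated, which is why the result is a partial order rather than a full ranking.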
London
Language data        Brain data
London   1.000       London   0.275
Paris    0.466       Paris    0.133
Moscow   0.396       Moscow   0.108
Germany  0.322       Germany  0.075
Russia   0.303       north    0.042
Poland   0.299       Russia   0.033
north    0.106       Poland   0.025
south    0.103       east     0.008
west     0.078       south    0.000
east     0.076       west     0.000
Partial orders for London derived from the WordNet semantic similarities of Table 2 and the conditional probability estimates for the brain data of Table 5.
[Figure: diagrams of the two partial orders; surviving node labels as extracted: (Poland, north, south, west, east) and (north, Poland, east, south, west)]
Following the approach described in Suppes (1974) for the axiomatization of the theory of differences in utility preference or the theory of differences in psychological intensity.
Suppes P (1974). The axiomatic method in the empirical sciences. L Henkin et al (Eds), Proceedings of the Tarski Symposium, Proceedings of Symposia in Pure Mathematics 25. Providence, RI: American Mathematical Society, pp. 465-479.
Brain data: the relational structure (A, R) constructed from R and the finite set A of classes ω1, ω2, …, ωN, together with the N partial orders constructed from the N-by-N estimate for the conditional probability densities.
Language data: the relational structure (A, R) constructed from R and the finite set A of classes ω1, ω2, …, ωN, together with the N partial orders constructed from the N-by-N similarity matrix.
For each word ω in A, we compare the partial order of the brain data with the partial order of the language data using Spearman's rank correlation coefficient, which we interpret in the usual way to determine whether or not we have a statistically significant correlation.
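The comparison step can be sketched as a Spearman rank correlation computed as the Pearson correlation of average ranks, which handles ties. The two sequences are the London language and brain values from the table in this talk (decimal points restored); they are not necessarily the pair behind any specific coefficient quoted on the slides, and the significance test is omitted here.

```python
WORDS = ["London", "Paris", "Moscow", "Germany", "Russia",
         "Poland", "north", "south", "west", "east"]
LANGUAGE = [1.000, 0.466, 0.396, 0.322, 0.303, 0.299, 0.106, 0.103, 0.078, 0.076]
BRAIN = [0.275, 0.133, 0.108, 0.075, 0.033, 0.025, 0.042, 0.000, 0.000, 0.008]

def average_ranks(values):
    """Rank from 1 (smallest) upward; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman's rho as the Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5

RHO = spearman(LANGUAGE, BRAIN)
print(round(RHO, 3))
```

Because only ranks enter the computation, the coefficient compares the two orderings and ignores the very different scales of the language and brain values.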
For those instances in which the brain and language partial orders are significantly correlated, we find the partial order that is invariant with respect to the brain and language data.
Here are two more examples.
[Figure: partial order diagram over London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia. Panel title: Significant Invariance - Paris - Spearman 0.88 (p = 6.6795e-004)]
[Figure: partial order diagram over London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia. Panel title: Significant Invariance - Paris - Spearman 0.90 (p = 3.8716e-004)]
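One natural reading of "the partial order that is invariant with respect to the brain and language data" is the set of ordered pairs on which both partial orders agree, that is, their intersection. That reading is an assumption of this sketch and not a statement of the authors' procedure; the values are the London language and brain data from the talk (decimal points restored).

```python
LANGUAGE = {"London": 1.000, "Paris": 0.466, "Moscow": 0.396, "Germany": 0.322,
            "Russia": 0.303, "Poland": 0.299, "north": 0.106, "south": 0.103,
            "west": 0.078, "east": 0.076}
BRAIN = {"London": 0.275, "Paris": 0.133, "Moscow": 0.108, "Germany": 0.075,
         "north": 0.042, "Russia": 0.033, "Poland": 0.025, "east": 0.008,
         "south": 0.000, "west": 0.000}

def strict_order(scores):
    """Pairs (a, b) where a scores strictly higher than b."""
    return {(a, b) for a in scores for b in scores if scores[a] > scores[b]}

# Assumed definition: keep only the pairs ordered the same way in both data sets.
INVARIANT = strict_order(LANGUAGE) & strict_order(BRAIN)

print(len(INVARIANT), "agreed pairs")
print(("Russia", "Poland") in INVARIANT, ("Russia", "north") in INVARIANT)
```

The intersection of two strict partial orders is again a strict partial order, so pairs on which the two data sets disagree, or where one has a tie, simply become unrelated.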
[Figure: hierarchical cluster tree; axis ticks 0.2 to 0.6; leaves in order: London, Paris, Moscow, Germany, Poland, Russia, north, south, west, east]
Another hierarchical cluster tree (similarity tree) computed from the conditional probability density estimates for the classification of 640 brain wave samples for London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia.
Every single-trial classification produces its own conditional probability density estimates, giving rise to its own similarity tree …
[Figure: hierarchical cluster tree; axis ticks 0.1 to 0.5; leaves in order: north, east, south, west, London, Paris, Germany, Moscow, Russia, Poland]
Brain data
1. We compute M (=30) single-trial classifications of the data (640 data samples for our 10 words) using random resampling with replacement.
2. For each classification, we find for each word the partial orders of the brain and language data that are significantly correlated.
3. For each of these highly correlated partial order pairs, we find the partial order invariant with respect to both.
We performed 60 classifications; that is, we recomputed the classifications of the brain data using random resampling with replacement.
For half of these 60 classifications we compared the brain data to the WordNet data, and for the other half we compared the brain data to the LSA data.
And we plotted the results.
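The resampling in step 1 can be sketched as drawing sample indices with replacement. Only the index draw is shown: the classifier and the EEG samples are not part of this transcript, and the function name and seed are illustrative assumptions.

```python
import random

def bootstrap_indices(n_samples, n_rounds, seed=0):
    """For each round, draw n_samples indices with replacement."""
    rng = random.Random(seed)  # fixed seed so the draws are reproducible
    return [rng.choices(range(n_samples), k=n_samples) for _ in range(n_rounds)]

# 30 resampled data sets of 640 samples each, as in step 1.
ROUNDS = bootstrap_indices(640, 30)

# Sampling with replacement repeats some samples and leaves others out.
print(len(ROUNDS), len(ROUNDS[0]), len(set(ROUNDS[0])))
```

Each index list would select the samples used to train and test one classification, so the 30 rounds yield 30 sets of conditional probability estimates.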
[Figure: bar chart, "Partial Orders of Similarity Differences Invariant between EEG-recorded Brain Data and Semantic Representations of Language". Vertical axis: Significant Invariant Partial Orders (0 to 300). Horizontal axis: Test Participants (S10, S24, S25, S27, S16, S26, S12, S13, S18). Series: WordNet, LSA.]
A higher number of significant structural similarities was found between the brain data and the WordNet data than between the brain data and the LSA data.
This stronger structural similarity between the brain data and the WordNet-derived data supports the contention that, during language comprehension for the complex cognitive task of assessing truth or falsity, the representation of words in the brain has a WordNet-like quality.
This stronger structural similarity is the first evidence we have seen of WordNet-like representations in the brain.
41 copy Colleen E Crangle 2013
Banerjee S and Pedersen T (2003) Extended Gloss Overlaps as a Measure of Semantic Relatedness In Proceedings of the Eighteenth International Joint Conference on Artificial Intelligence pp 805-810 August 9-15 2003 Acapulco Mexico Banerjee S and Pedersen T (2002) An Adapted Lesk Algorithm for Word Sense Disambiguation using WordNet In Proceedings of the Third International Conference on Intelligent Text Processing and Computational Linguistics pp 136-145 February 17-23 2002 Mexico City Baroni M B Murphy E Barbu M Poesio (2010) Strudel A Corpus-Based Semantic Model Based on Properties and Types Cognitive Science - COGSCI vol 34 no 2 pp 222-254 2010 DOI 101111j1551-6709200901068x Blei D Ng A amp Jordan M (2003) Latent Dirichlet allocation Journal of Machine Learning Research 3 993ndash1022 Brants T and Franz A (2006) wwwldcupenneduCatalogCatalogEntryjspcatalogId=LDC2006T13 Linguistic Data Consortium Philadelphia Budanitsky A Hirst G (2001) Semantic distance in WordNet an experimental application-oriented evaluation of five measures In Proceedings of the NAACL 2001 Workshop on WordNet and Other Lexical Resources Pittsburgh June 2001 httpwwwseassmuedu~radamwnwpapersWNW-NAACL-101pdfgz Collins AM Loftus EF (1975) A spreading-activation theory of semantic processing Psychological Review 1975 Nov Vol 82(6) 407-428 Crestani F (1997) Application of Spreading Activation Techniques in Information Retrieval Artificial Intelligence Review 11 453ndash482 Deerwester S Dumais ST Furnas GW Landauer TK Harshman R (1990) Indexing By Latent Semantic Analysis Journal of the American Society for Information Science 41 391-407 Devereux B C Kelly A Korhonen (2010) Using fMRI activation to conceptual stimuli to evaluate methods for extracting conceptual representations from corpora Proceedings of the NAACL HLT 2010 First Workshop on Computational Neurolinguistics June 2010Los Angeles USA Association for Computational Linguistics 70mdash78 Fellbaum C (Ed) (1998) WordNet An Electronic Lexical 
Database Cambridge MA MIT Press Hauk O Davis MH Ford M Pulvermuumlller F Marslen-Wilson WD (2006) The time course of visual word recognition as revealed by linear regression analysis of ERP data Neuroimage 2006 May 130(4)1383-400 Epub 2006 Feb 7
42 copy Colleen E Crangle 2013
Hirst G (1988) Resolving lexical ambiguity computationally with spreading activation and Polaroid Words In Small S L Cottrell G W Tanenhaus M K (editors) Lexical ambiguity resolution Perspectives from psycholinguistics neuropsychology and artificial intelligence San Mateo CA Morgan Kaufmann Publishers 1988 73ndash107 Jelodar A B M Alizadeh S Khadivi (2010) WordNet Based Features for Predicting Brain Activity associated with meanings of noun Proceedings of the NAACL HLT 2010 First Workshop on Computational Neurolinguistics June 2010Los Angeles USA Association for Computational Linguistics 18mdash26 Jiang J and Conrath D (1997) Semantic similarity based on corpus statistics and lexical taxonomy In Proceedings of International Conference on Research in Computational Linguistics Taiwan 19-33 Just MA Cherkassky VL Aryal S Mitchell TM (2010) A Neurosemantic Theory of Concrete Noun Representation Based on the Underlying Brain Codes PLoS ONE 5(1) e8622 doi101371journalpone0008622 Karp RM (1972) Reducibility among combinatorial problems In Complexity of Computer Computations Proc Sympos IBM Thomas J Watson Res Center Yorktown Heights NY (Ed R E Miller and J W Thatcher) New York Plenum pp 85-103 Kelly C B Devereux A Korhonen (2010) Acquiring Human-like Feature-Based Conceptual Representations from Corpora Proceedings of the NAACL HLT 2010 First Workshop on Computational Neurolinguistics June 2010Los Angeles USA Association for Computational Linguistics 61mdash69 Kriegeskorte N Mur M Bandettini PA (2008) Representational similarity analysis ndash connecting the branches of systems neuroscience Frontiers in Systems Neuroscience doi103389neuro060042008 Kutas M Hillyard SA (1984) Brain potentials during reading reflect word expectancy and semantic association Nature 307 161-163 Kutas M Federmeier KD (2011) Thirty years and counting Finding meaning in the N400 component of the event related brain potential (ERP) Annual Review of Psychology 2011 62 pp 621-647 Landauer TK Dumais ST 
(1997) A solution to Platos problem The Latent Semantic Analysis theory of the acquisition induction and representation of knowledge Psychological Review 104 211-240 Landauer TK Foltz PW Laham D (1998) Introduction to Latent Semantic Analysis Discourse Processes 25 259-284
43 copy Colleen E Crangle 2013
Leacock C Chodorow M (1998) Combining local context and WordNet similarity for word sense identification In Fellbaum (Ed) WordNet An Electronic Lexical Database Cambridge MA MIT Press pp 265-283 Leech G Rayson P Wilson A (2001) Word Frequencies in Written and Spoken English based on the British National Corpus Longman London Lin D (1998) An information-theoretic definition of similarity In Proc 15th International Conf on Machine Learning Morgan Kaufmann San Francisco CA p 296--304 Luce RD (1956) Semiorders and a theory of utility discrimination Econometrica 24 178ndash191 MR0078632 httpwwwjstororgstable1905751 Matsunaga T Yonemori C Tomita E Muramatsu M (2009) Clique-based data mining for related genes in a biomedical database BMC Bioinformatics 2009 Jul 110205 Martin Ph (2003) Correction and Extension of WordNet 17 ICCS 2003 11th International Conference on Conceptual Structures (copy Springer Verlag LNAI 2746 pp 160-173) Dresden Germany July 21-25 2003 Miller GA Nicely P (1955) An analysis of perceptual confusions among some English consonants JAcoustSocAm 272 Miller GA Leacock C Tengi R Bunker RT (1993) A Semantic Concordance In Proceedings of the 3 DARPA Workshop on Human Language Technology Miller GA (1995) WordNet A Lexical Database for English Communications of the ACM Vol 38 No 11 39-41 Mitchell TM SV Shinkareva A Carlson KM Chang VL Malave R A Mason and M A Just (2008) Predicting Human Brain Activity Associated with the Meanings of Nouns Science 320 1191 May 30 2008 DOI 101126science1152876 Murphy B Baroni M amp Poesio M (2009) EEG responds to conceptual stimuli and corpus semantics In Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing pages 619ndash627 Singapore 6-7 August 2009 c 2009 ACL and AFNLP Murphy B amp Poesio M (2010) Detecting semantic category in simultaneous EEGMEG recordings In First workshop on computational neurolinguistics NAACL HLT 2010 (pp 36ndash44) Los Angeles Association for Computational 
Linguistics Murphy B Poesio M Bovolo F Bruzzone L Dalponte M Lakany H (2011) EEG decoding of semantic category reveals distributed representations for single concepts Brain Lang 2011 Apr117(1)12-22 doi 101016jbandl201009013
44 copy Colleen E Crangle 2013
Pedersen T Patwardhan S Michelizzi J (2004) WordNetSimilarity - Measuring the Relatedness of Concepts In Proceedings of Fifth Annual Meeting of the North American Chapter of the Association for Computational Linguistics (NAACL-2004) pp 38-41 Boston May 2004 Pereira F M Botvinick G Detre (2010) Learning semantic features for fMRI data from definitional text Proceedings of the NAACL HLT 2010 First Workshop on Computational Neurolinguistics June 2010Los Angeles USA Association for Computational Linguistics 1mdash9 httpwwwaclweborganthologyW10-0601 Perreau-Guimaraes M Wong DK Uy ET Grosenick L Suppes P (2007) Single-trial classification of MEG recordings IEEE Transactions on Biomedical Engineering 54 436ndash443 Rabinovitch I (1977) The Scott-Suppes theorem on semiorders J Mathematical Psychology 15 (2) 209ndash212 MR0437404 Resnik P (1995) Using information content to evaluate semantic similarity In Proceedings of the 14th International Joint Conference on Artificial Intelligence pages 448-453 Montreal Roach BJ Mathalon DH (2008) Event-related EEG time-frequency analysis an overview of measures and an analysis of early gamma band phase locking in schizophrenia Schizophr Bull 2008 Sep34(5)907-26 Epub 2008 Aug 6 Scott D Suppes P (1958) Foundational aspects of theories of measurement The Journal of Symbolic Logic 23 113ndash128 MR0115919 Suppes P (1972) Axiomatic Set Theory New York Dover Suppes P (1974) The axiomatic method in the empirical sciences L Henkin et al (Eds) Proceedings of the Tarski Symposium Proceedings of Symposia in Pure Mathematics 25 Providence RI American Mathematical Society pp 465-479 Suppes P Lu Z-L Han B (1997) Brain wave recognition of words Proceedings of the National Academy of Sciences 95 14965ndash14969 Suppes P Han B Lu Z-L (1998) Brain-wave recognition of sentences Proceedings of the National Academy of Sciences 95 15861ndash15866 Suppes P Han B Epelboim J Lu Z-L (1999a) Invariance between subjects of brain wave representations of language 
Proceedings of the National Academy of Sciences 96 12953ndash12958 Suppes P Han B Epelboim J Lu Z-L (1999b) Invariance of brain-wave representations of simple visual images and their names Proceedings of the National Academy of Sciences 96 14658-14663 Suppes P Han B (2000) Brain-wave representation of words by superposition of a few sine waves Proceedings of the National Academy of Sciences 97 8738ndash8743 Suppes P Perreau-Guimaraes M Wong DK (2009) Partial Orders of Similarity Differences Invariant Between EEG-Recorded Brain and Perceptual Representations of Language Neural Computation 21 3228ndash3269
45 copy Colleen E Crangle 2013
Suppes P de Barros JA Oas G (2012) Phase-Oscillator computations as neural models of stimulus-response conditioning and response selection Journal of Mathematical Psychology 56 95-117 Turney P (2001) Mining the Web for Synonyms PMI-IR versus LSA on TOEFL In L De Raedt amp P Flach (Eds) Proceedings of the Twelfth European Conference on Machine Learning (ECML-2001) (pp 491-502) Freiburg Germany Vassilieva E Pinto G de Barros JA Suppes P (2011) Learning Pattern Recognition Through Quasi-Synchronization of Phase Oscillators IEEE Transactions on Neural Networks Vol 22 No 1 January 2011 Vigliocco G Warren J Siri S Arciuli J Scott S Wise R (2006) The role of semantics and grammatical class in the neural representation of words Cereb Cortex 2006 Dec16(12)1790-6 Epub 2006 Jan 18 Wong DK Perreau-Guimaraes M Uy ET Suppes P (2004) Classification of individual trials based on the best independent component of EEG-recorded sentences Neurocomputing 61 479-484 Wong DK Uy ET Perreau-Guimaraes M Yang W Suppes P (2006) Interpretation of perceptron weights as constructed time series for EEG classification Neurocomputing 70 373-383 Wong DK Grosenick L Uy ET Perreau-Guimaraes M Carvalhaes CG Desain P Suppes P (2008) Quantifying inter-subject agreement in brain-imaging analyses NeuroImage 39 1051ndash1063 Woods W (1975) Whatrsquos in a link Foundations for semantic networks In Bobrow D Collins A eds Representation and understanding New York Academic Press 197535-82 Wu Z Palmer M (1994) Verb Semantics and Lexical Selection In Proceedings of the 32nd Annual Meeting of the Association for Computational Linguistics Las Cruces New Mexico pp 133--138
46 copy Colleen E Crangle 2013
LATENT SEMANTIC ANALYSIS (LSA)
LSA is a statistical technique for extracting, from large collections of documents, a measure of how similar two words are to each other in terms of the patterns of their co-occurrence within those documents. See Deerwester et al. (1990), Landauer and Dumais (1997), and Landauer et al. (1998).
The underlying idea is that if, for each word, you take into account all the contexts in which it does and does not appear, you get for all the words a set of mutual constraints that represent how similar any two words are to each other.
The similarity judgments produced by latent semantic analysis have been shown to correspond, to some extent, to human judgments of similarity. After training on about 2,000 pages of English text, it scored as well as average test-takers on the synonym portion of the Test of English as a Foreign Language. After training on a psychology textbook, it achieved a passing score on a multiple-choice exam.
We used the application at http://lsa.colorado.edu to compute similarity matrices in term space for our set of words. The computation was based on about 38,000 college-level texts (novels, newspaper articles, ...). A maximum of 300 factors was permitted in the analysis.
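As a rough guide to what a term-space similarity means, the standard textbook formulation of LSA can be written as follows. The notation (X for the weighted term-by-document matrix, k for the number of retained factors, t_i for the reduced vector of word i) is our assumption for illustration; the slides do not spell out the exact computation used by the online application.

```latex
% Truncated singular value decomposition of the term-by-document matrix X,
% keeping at most k factors (k <= 300 in the analysis described above).
\[
  X \;\approx\; X_k \;=\; U_k \,\Sigma_k\, V_k^{\top}, \qquad k \le 300 .
\]
% Each word i is represented by the i-th row t_i of U_k \Sigma_k, and the
% similarity of two words is the cosine of the angle between their vectors.
\[
  \operatorname{sim}(w_i, w_j) \;=\;
  \frac{t_i \cdot t_j}{\lVert t_i \rVert \, \lVert t_j \rVert} .
\]
```

Under this reading, each entry of a similarity matrix is one such cosine, and the diagonal is 1 because every word vector is compared with itself.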
          London  Moscow  Paris   north   south   east    west    Germany Poland  Russia
London    1.00    0.17    0.37    0.13    0.12    0.12    0.14    0.17    0.16    0.16
Moscow    0.17    1.00    0.18    0.13    0.08    0.22    0.14    0.37    0.65    0.69
Paris     0.37    0.18    1.00    0.10    0.04    0.08    0.09    0.50    0.31    0.36
north     0.13    0.13    0.10    1.00    0.89    0.60    0.61    0.07    0.09    0.14
south     0.12    0.08    0.04    0.89    1.00    0.50    0.55    0.02    0.05    0.05
east      0.12    0.22    0.08    0.60    0.50    1.00    0.85    0.22    0.29    0.30
west      0.14    0.14    0.09    0.61    0.55    0.85    1.00    0.24    0.26    0.23
Germany   0.17    0.37    0.50    0.07    0.02    0.22    0.24    1.00    0.85    0.81
Poland    0.16    0.65    0.31    0.09    0.05    0.29    0.26    0.85    1.00    0.87
Russia    0.16    0.69    0.36    0.14    0.05    0.30    0.23    0.81    0.87    1.00
Semantic similarity matrix derived from LSA for the set of words London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia.
[Figure: hierarchical cluster tree; axis ticks 0.1 to 0.6; leaves in order: east, west, north, south, London, Paris, Moscow, Germany, Poland, Russia.]
Hierarchical cluster tree computed from the pair-wise Latent Semantic Analysis (LSA) scores of similarity for London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia, based on about 38,000 college-level texts (novels, newspaper articles, ...).
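A cluster tree of this kind can be sketched from the LSA similarity matrix with a few lines of agglomerative clustering. The slides do not say which linkage rule or distance was used, so the choices here (average linkage, distance = 1 - similarity) and the function name `average_linkage` are assumptions for illustration only.

```python
from itertools import combinations

WORDS = ["London", "Moscow", "Paris", "north", "south",
         "east", "west", "Germany", "Poland", "Russia"]

# LSA similarity matrix from the talk (rows and columns in WORDS order).
LSA = [
    [1.00, 0.17, 0.37, 0.13, 0.12, 0.12, 0.14, 0.17, 0.16, 0.16],
    [0.17, 1.00, 0.18, 0.13, 0.08, 0.22, 0.14, 0.37, 0.65, 0.69],
    [0.37, 0.18, 1.00, 0.10, 0.04, 0.08, 0.09, 0.50, 0.31, 0.36],
    [0.13, 0.13, 0.10, 1.00, 0.89, 0.60, 0.61, 0.07, 0.09, 0.14],
    [0.12, 0.08, 0.04, 0.89, 1.00, 0.50, 0.55, 0.02, 0.05, 0.05],
    [0.12, 0.22, 0.08, 0.60, 0.50, 1.00, 0.85, 0.22, 0.29, 0.30],
    [0.14, 0.14, 0.09, 0.61, 0.55, 0.85, 1.00, 0.24, 0.26, 0.23],
    [0.17, 0.37, 0.50, 0.07, 0.02, 0.22, 0.24, 1.00, 0.85, 0.81],
    [0.16, 0.65, 0.31, 0.09, 0.05, 0.29, 0.26, 0.85, 1.00, 0.87],
    [0.16, 0.69, 0.36, 0.14, 0.05, 0.30, 0.23, 0.81, 0.87, 1.00],
]


def average_linkage(words, sim):
    """Agglomerative clustering; assumed distance is 1 - similarity."""
    index = {w: i for i, w in enumerate(words)}
    clusters = [(w,) for w in words]
    merges = []

    def dist(a, b):
        # Mean distance over all cross-cluster word pairs (average linkage).
        pairs = [(index[x], index[y]) for x in a for y in b]
        return sum(1.0 - sim[i][j] for i, j in pairs) / len(pairs)

    while len(clusters) > 1:
        # Merge the two closest clusters and record the merge height.
        a, b = min(combinations(clusters, 2), key=lambda p: dist(*p))
        merges.append((a, b, round(dist(a, b), 3)))
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
    return merges


MERGES = average_linkage(WORDS, LSA)
for left, right, height in MERGES:
    print(left, "+", right, "at distance", height)
```

The printed merge sequence is the tree read from the leaves upward; a different linkage rule would generally give different merge heights and possibly a different tree.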
LSA provides a straightforward measure of similarity between words.
For WordNet, several different measures of similarity have been devised, e.g.:
Path length between synsets.
Information content: a corpus-based measure of the specificity of a concept, measured in terms of the frequency of occurrence of the concept in a corpus, here the human-annotated, sense-tagged corpus SemCor (Miller et al., 1993), which links every word in the Brown Corpus to its appropriate WordNet sense. Scaled in various ways.
Vector-space models: these work by forming second-order co-occurrence vectors from the WordNet definitions of concepts, known as glosses.
We used five measures in our computations of similarity and took the average score, using each of the relevant senses.
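To make the path-based family concrete, here is a minimal sketch of the Wu-Palmer measure (the "wup" measure named elsewhere in this talk) on a tiny made-up taxonomy. The taxonomy in `PARENT` is hypothetical and is not the real WordNet hierarchy; the helper names are ours. The formula used is the usual one, 2 * depth(lcs) / (depth(a) + depth(b)), where lcs is the deepest concept that subsumes both a and b.

```python
# Hypothetical toy taxonomy (child -> parent); NOT the real WordNet hierarchy.
PARENT = {
    "location": "entity",
    "region": "location",
    "city": "region",
    "country": "region",
    "London": "city",
    "Paris": "city",
    "Germany": "country",
    "Poland": "country",
}


def path_to_root(node):
    """Return [node, parent, ..., root]."""
    path = [node]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path


def depth(node):
    """Depth counted in nodes, so the root has depth 1."""
    return len(path_to_root(node))


def wup(a, b):
    """Wu-Palmer similarity on the toy taxonomy."""
    ancestors_of_a = set(path_to_root(a))
    # Walking upward from b, the first node shared with a is the deepest
    # common subsumer in a tree-shaped taxonomy.
    lcs = next(n for n in path_to_root(b) if n in ancestors_of_a)
    return 2 * depth(lcs) / (depth(a) + depth(b))


print("Germany vs Poland:", wup("Germany", "Poland"))
print("Germany vs London:", wup("Germany", "London"))
```

In this toy tree two countries share a deeper common ancestor than a country and a city do, so they score higher; the real measures operate on WordNet synsets and depend on which senses are selected.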
Semantic similarity matrix derived from WordNet for the set of words London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia, using senses relevant to the geography of Europe and five measures of similarity: wup (path length), lin and jcn (information content), and gv and pgv (vector space measures).
          London  Moscow  Paris   north   south   east    west    Germany Poland  Russia
London    1.000   0.396   0.466   0.106   0.103   0.076   0.078   0.322   0.299   0.303
Moscow    0.396   1.000   0.393   0.095   0.094   0.062   0.070   0.286   0.281   0.288
Paris     0.466   0.393   1.000   0.106   0.104   0.074   0.077   0.327   0.308   0.307
north     0.106   0.095   0.106   1.000   0.228   0.179   0.210   0.123   0.132   0.111
south     0.103   0.094   0.104   0.228   1.000   0.172   0.212   0.115   0.107   0.109
east      0.076   0.062   0.074   0.179   0.172   1.000   0.216   0.093   0.080   0.077
west      0.078   0.070   0.077   0.210   0.212   0.216   1.000   0.087   0.082   0.083
Germany   0.322   0.286   0.327   0.123   0.115   0.093   0.087   1.000   0.589   0.409
Poland    0.299   0.281   0.308   0.132   0.107   0.080   0.082   0.589   1.000   0.403
Russia    0.303   0.288   0.307   0.111   0.109   0.077   0.083   0.409   0.403   1.000
[Figure: hierarchical cluster tree; axis ticks 0.2 to 0.6; leaves in order: Germany, Poland, Russia, London, Paris, Moscow, north, south, west, east.]
Hierarchical cluster tree computed from pairwise WordNet-based semantic similarity scores for London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia, restricted to senses related to the geography of Europe.
BACK TO THE BRAIN ...
London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia
Now let's see how to compare the EEG data and the language data ...
Some of the similarity trees show remarkable congruence between the brain and semantic data.
Where exactly does that congruence lie?
Can we devise a quantitative measure of the nature and strength of that congruence?
[Figure: two similarity trees, labelled LANGUAGE DATA and BRAIN DATA, each with axis ticks 0.2 to 0.6. Leaf orders as extracted: London, Paris, Moscow, Germany, Poland, Russia, north, south, west, east; and Germany, Poland, Russia, London, Paris, Moscow, north, south, west, east.]
[Figure: WordNet-based semantic similarities and EEG conditional probability estimates for London relative to London (L), Moscow (M), Paris (P), north (n), south (s), east (e), west (w), Germany (G), Poland (Po), and Russia (R).]
The Spearman rank correlation for the two sequences in the figure is 0.99, with a one-sided significance of 1.84E-10.
BRAIN DATA
For each word ω, we compute from the conditional probability density estimates a ternary relation R such that R(ω, ω1, ω2) if and only if, with respect to word ω, the conditional probability for word ω1 is smaller than the conditional probability for word ω2; that is, if and only if ω1's similarity difference with ω is smaller than ω2's similarity difference with ω.
R is an ordinal relation of similarity differences: a partial order that is irreflexive, asymmetric, and transitive.
LANGUAGE DATA
For each word ω, we compute from the semantic similarity matrix a ternary relation R such that R(ω, ω1, ω2) if and only if the similarity difference of ω1 with ω is smaller than the similarity difference of ω2 with ω; that is, ω1 is more similar to ω than is ω2.
R is an ordinal relation of similarity differences: a partial order that is irreflexive, asymmetric, and transitive.
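The language-data relation can be sketched directly from one row of a similarity matrix. The example below uses the WordNet-based similarities to London given in this talk and encodes R(ω, ω1, ω2) as "ω1 is strictly more similar to ω than ω2 is". The function name `ternary_relation` and the set-of-triples representation are our own choices for illustration, not the authors' implementation.

```python
# WordNet-based similarities of each word to London, taken from the talk.
SIM_TO_LONDON = {
    "Moscow": 0.396, "Paris": 0.466, "north": 0.106,
    "south": 0.103, "east": 0.076, "west": 0.078,
    "Germany": 0.322, "Poland": 0.299, "Russia": 0.303,
}


def ternary_relation(target, scores):
    """Triples (target, w1, w2) where w1 is strictly more similar to target than w2."""
    return {
        (target, w1, w2)
        for w1 in scores
        for w2 in scores
        if scores[w1] > scores[w2]
    }


R = ternary_relation("London", SIM_TO_LONDON)

# Check the three properties stated on the slide for this finite example.
irreflexive = all(w1 != w2 for _, w1, w2 in R)
asymmetric = all((t, w2, w1) not in R for t, w1, w2 in R)
transitive = all(
    (t, a, c) in R
    for t, a, b in R
    for _, b2, c in R
    if b == b2
)
print(len(R), irreflexive, asymmetric, transitive)
```

Because a strict comparison is used, words with equal scores are simply left unordered, which is what makes the result a partial rather than a total order.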
London
Language data        Brain data
London   1.000       London   0.275
Paris    0.466       Paris    0.133
Moscow   0.396       Moscow   0.108
Germany  0.322       Germany  0.075
Russia   0.303       north    0.042
Poland   0.299       Russia   0.033
north    0.106       Poland   0.025
south    0.103       east     0.008
west     0.078       south    0.000
east     0.076       west     0.000
Partial orders for London derived from the WordNet semantic similarities of Table 2 and the conditional probability estimates for the brain data of Table 5.
[Figure: diagrams of the two partial orders; node labels recovered from the extraction: Poland, north, south, west, east (first diagram) and north, Poland, east, south, west (second diagram).]
Following the approach described in Suppes (1974) for the axiomatization of the theory of differences in utility preference, or the theory of differences in psychological intensity:
Suppes P (1974) The axiomatic method in the empirical sciences. In L. Henkin et al. (Eds.), Proceedings of the Tarski Symposium, Proceedings of Symposia in Pure Mathematics 25. Providence, RI: American Mathematical Society, pp. 465-479.
Brain data: the relational structure (A, R) constructed from R and the finite set A of classes ω1, ω2, ..., ωN, together with the N partial orders constructed from the N-by-N estimate for the conditional probability densities.
Language data: the relational structure (A, R) constructed from R and the finite set A of classes ω1, ω2, ..., ωN, together with the N partial orders constructed from the N-by-N similarity matrix.
For each word ω, we compare the partial order of the brain data with the partial order of the language data using Spearman's rank correlation coefficient, which we interpret in the usual way to determine whether or not we have a statistically significant correlation.
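The rank comparison can be sketched without any external library. The example below computes Spearman's coefficient as the Pearson correlation of average ranks (so tied values share a rank) for the London values listed in the language-data and brain-data table of this talk. The function names are ours, and the significance test is omitted. The value printed for these table entries is not expected to reproduce the 0.99 quoted for the earlier figure, which may rest on different estimates.

```python
from math import sqrt


def average_ranks(values):
    """Rank values in ascending order; tied values get the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1  # positions are 1-based
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def spearman(x, y):
    """Spearman's rho as the Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / sqrt(var_x * var_y)


# Values for London, in the order: London, Moscow, Paris, north, south,
# east, west, Germany, Poland, Russia.
LANGUAGE = [1.000, 0.396, 0.466, 0.106, 0.103, 0.076, 0.078, 0.322, 0.299, 0.303]
BRAIN = [0.275, 0.108, 0.133, 0.042, 0.000, 0.008, 0.000, 0.075, 0.025, 0.033]

RHO = spearman(LANGUAGE, BRAIN)
print("Spearman rho for London:", round(RHO, 3))
```

Only the ordering of the values matters here, so rescaling either column would leave the coefficient unchanged.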
For those instances in which the brain and language partial orders are significantly correlated, we find the partial order that is invariant with respect to the brain and language data.
Here are two more examples.
[Figure: invariant partial order; node labels in extraction order: London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia.]
Significant Invariance - Paris - Spearman 0.88 (p = 6.6795e-004)
[Figure: invariant partial order; node labels in extraction order: London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia.]
Significant Invariance - Paris - Spearman 0.90 (p = 3.8716e-004)
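One natural reading of "the partial order invariant with respect to both" is the set of orderings on which the two partial orders agree, that is, their intersection. The slides do not give the construction, so the sketch below is an assumption for illustration; it uses the London values from the language-data and brain-data table, and the helper name `strict_order` is ours.

```python
LANGUAGE = {
    "Paris": 0.466, "Moscow": 0.396, "Germany": 0.322, "Russia": 0.303,
    "Poland": 0.299, "north": 0.106, "south": 0.103, "west": 0.078,
    "east": 0.076,
}
BRAIN = {
    "Paris": 0.133, "Moscow": 0.108, "Germany": 0.075, "north": 0.042,
    "Russia": 0.033, "Poland": 0.025, "east": 0.008, "south": 0.000,
    "west": 0.000,
}


def strict_order(scores):
    """Pairs (w1, w2) such that w1 scores strictly higher than w2."""
    return {(a, b) for a in scores for b in scores if scores[a] > scores[b]}


# Assumed construction: keep only the orderings present in both data sets.
INVARIANT = strict_order(LANGUAGE) & strict_order(BRAIN)

print(len(INVARIANT), "ordered pairs survive in both orders")
print(("Paris", "Moscow") in INVARIANT)   # both data sets rank Paris above Moscow
print(("Russia", "north") in INVARIANT)   # the two data sets disagree on this pair
```

The intersection of two strict partial orders is again irreflexive, asymmetric, and transitive, so the result is itself a partial order; pairs on which the data sets disagree, or which are tied in either one, are left unordered.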
Brain data
[Figure: similarity tree; axis ticks 0.2 to 0.6; leaves in order: London, Paris, Moscow, Germany, Poland, Russia, north, south, west, east.]
Another hierarchical cluster tree (similarity tree) computed from the conditional probability density estimates for the classification of 640 brain-wave samples for London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia.
Every single-trial classification produces its own conditional probability density estimates, giving rise to its own similarity tree ...
[Figure: a second similarity tree; axis ticks 0.1 to 0.5; leaves in order: north, east, south, west, London, Paris, Germany, Moscow, Russia, Poland.]
1. We compute M (= 30) single-trial classifications of the data (640 data samples for our 10 words) using random resampling with replacement.
2. For each classification we find, for each word, the partial orders of the brain and language data that are significantly correlated.
3. For each of these highly correlated partial-order pairs, we find the partial order invariant with respect to both.
We performed 60 classifications; that is, we recomputed the classifications of the brain data using random resampling with replacement. For half of these 60 classifications we compared the brain data to the WordNet data, and for the other half we compared the brain data to the LSA data.
And we plotted the results.
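The resampling in step 1 can be sketched as drawing, for each of the M rounds, a fresh set of sample indices with replacement. How each resample feeds the classifier is not described on the slide, so the sketch stops at the index sets; the function name `bootstrap_indices` and the fixed seed are assumptions for illustration.

```python
import random


def bootstrap_indices(n_samples, n_rounds, seed=0):
    """For each round, draw n_samples indices uniformly with replacement."""
    rng = random.Random(seed)  # seeded only so the sketch is repeatable
    return [
        [rng.randrange(n_samples) for _ in range(n_samples)]
        for _ in range(n_rounds)
    ]


# 640 data samples and M = 30 rounds, as stated in the talk.
ROUNDS = bootstrap_indices(n_samples=640, n_rounds=30)

for number, indices in enumerate(ROUNDS[:3], start=1):
    distinct = len(set(indices))
    print("round", number, "uses", distinct, "distinct samples out of", len(indices))
```

Because indices are drawn with replacement, each round typically repeats some samples and omits others, which is why each round can yield its own conditional probability estimates and its own similarity tree.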
[Figure: bar chart, "Partial Orders of Similarity Differences Invariant between EEG-recorded Brain Data and Semantic Representations of Language". Axes: Significant Invariant Partial Orders (ticks 0 to 300) against Test Participants (S10, S24, S25, S27, S16, S26, S12, S13, S18). Series: WordNet, LSA.]
A higher number of significant structural similarities was found between the brain data and the WordNet data than between the brain data and the LSA data.
This stronger structural similarity between the brain data and the WordNet-derived data supports the contention that, during language comprehension for the complex cognitive task of assessing truth or falsity, the representation of words in the brain has a WordNet-like quality.
This stronger structural similarity is the first evidence we have seen of WordNet-like representations in the brain.
Banerjee S, Pedersen T (2003) Extended Gloss Overlaps as a Measure of Semantic Relatedness. In Proceedings of the Eighteenth International Joint Conference on Artificial Intelligence, pp. 805-810, August 9-15, 2003, Acapulco, Mexico.
Banerjee S, Pedersen T (2002) An Adapted Lesk Algorithm for Word Sense Disambiguation using WordNet. In Proceedings of the Third International Conference on Intelligent Text Processing and Computational Linguistics, pp. 136-145, February 17-23, 2002, Mexico City.
Baroni M, Murphy B, Barbu E, Poesio M (2010) Strudel: A Corpus-Based Semantic Model Based on Properties and Types. Cognitive Science 34(2), pp. 222-254. DOI 10.1111/j.1551-6709.2009.01068.x
Blei D, Ng A, Jordan M (2003) Latent Dirichlet allocation. Journal of Machine Learning Research 3, 993-1022.
Brants T, Franz A (2006) www.ldc.upenn.edu/Catalog/CatalogEntry.jsp?catalogId=LDC2006T13. Linguistic Data Consortium, Philadelphia.
Budanitsky A, Hirst G (2001) Semantic distance in WordNet: an experimental, application-oriented evaluation of five measures. In Proceedings of the NAACL 2001 Workshop on WordNet and Other Lexical Resources, Pittsburgh, June 2001. http://www.seas.smu.edu/~rada/mwnw/papers/WNW-NAACL-101.pdf.gz
Collins AM, Loftus EF (1975) A spreading-activation theory of semantic processing. Psychological Review 82(6), 407-428.
Crestani F (1997) Application of Spreading Activation Techniques in Information Retrieval. Artificial Intelligence Review 11, 453-482.
Deerwester S, Dumais ST, Furnas GW, Landauer TK, Harshman R (1990) Indexing By Latent Semantic Analysis. Journal of the American Society for Information Science 41, 391-407.
Devereux B, Kelly C, Korhonen A (2010) Using fMRI activation to conceptual stimuli to evaluate methods for extracting conceptual representations from corpora. Proceedings of the NAACL HLT 2010 First Workshop on Computational Neurolinguistics, June 2010, Los Angeles, USA. Association for Computational Linguistics, 70-78.
Fellbaum C (Ed.) (1998) WordNet: An Electronic Lexical Database. Cambridge, MA: MIT Press.
Hauk O, Davis MH, Ford M, Pulvermüller F, Marslen-Wilson WD (2006) The time course of visual word recognition as revealed by linear regression analysis of ERP data. Neuroimage 2006 May 1;30(4):1383-400. Epub 2006 Feb 7.
Hirst G (1988) Resolving lexical ambiguity computationally with spreading activation and Polaroid Words. In Small SL, Cottrell GW, Tanenhaus MK (Eds.), Lexical ambiguity resolution: Perspectives from psycholinguistics, neuropsychology, and artificial intelligence. San Mateo, CA: Morgan Kaufmann Publishers, 73-107.
Jelodar AB, Alizadeh M, Khadivi S (2010) WordNet Based Features for Predicting Brain Activity associated with meanings of noun. Proceedings of the NAACL HLT 2010 First Workshop on Computational Neurolinguistics, June 2010, Los Angeles, USA. Association for Computational Linguistics, 18-26.
Jiang J, Conrath D (1997) Semantic similarity based on corpus statistics and lexical taxonomy. In Proceedings of International Conference on Research in Computational Linguistics, Taiwan, 19-33.
Just MA, Cherkassky VL, Aryal S, Mitchell TM (2010) A Neurosemantic Theory of Concrete Noun Representation Based on the Underlying Brain Codes. PLoS ONE 5(1): e8622. doi:10.1371/journal.pone.0008622
Karp RM (1972) Reducibility among combinatorial problems. In Miller RE, Thatcher JW (Eds.), Complexity of Computer Computations (Proc. Sympos. IBM Thomas J. Watson Res. Center, Yorktown Heights, NY). New York: Plenum, pp. 85-103.
Kelly C, Devereux B, Korhonen A (2010) Acquiring Human-like Feature-Based Conceptual Representations from Corpora. Proceedings of the NAACL HLT 2010 First Workshop on Computational Neurolinguistics, June 2010, Los Angeles, USA. Association for Computational Linguistics, 61-69.
Kriegeskorte N, Mur M, Bandettini PA (2008) Representational similarity analysis - connecting the branches of systems neuroscience. Frontiers in Systems Neuroscience. doi:10.3389/neuro.06.004.2008
Kutas M, Hillyard SA (1984) Brain potentials during reading reflect word expectancy and semantic association. Nature 307, 161-163.
Kutas M, Federmeier KD (2011) Thirty years and counting: Finding meaning in the N400 component of the event related brain potential (ERP). Annual Review of Psychology 62, pp. 621-647.
Landauer TK, Dumais ST (1997) A solution to Plato's problem: The Latent Semantic Analysis theory of the acquisition, induction, and representation of knowledge. Psychological Review 104, 211-240.
Landauer TK, Foltz PW, Laham D (1998) Introduction to Latent Semantic Analysis. Discourse Processes 25, 259-284.
Leacock C, Chodorow M (1998) Combining local context and WordNet similarity for word sense identification. In Fellbaum C (Ed.), WordNet: An Electronic Lexical Database. Cambridge, MA: MIT Press, pp. 265-283.
Leech G, Rayson P, Wilson A (2001) Word Frequencies in Written and Spoken English: based on the British National Corpus. London: Longman.
Lin D (1998) An information-theoretic definition of similarity. In Proc. 15th International Conf. on Machine Learning. San Francisco, CA: Morgan Kaufmann, pp. 296-304.
Luce RD (1956) Semiorders and a theory of utility discrimination. Econometrica 24, 178-191. MR0078632. http://www.jstor.org/stable/1905751
Matsunaga T, Yonemori C, Tomita E, Muramatsu M (2009) Clique-based data mining for related genes in a biomedical database. BMC Bioinformatics 2009 Jul 1;10:205.
Martin Ph (2003) Correction and Extension of WordNet 1.7. ICCS 2003, 11th International Conference on Conceptual Structures (© Springer Verlag, LNAI 2746, pp. 160-173), Dresden, Germany, July 21-25, 2003.
Miller GA, Nicely P (1955) An analysis of perceptual confusions among some English consonants. J. Acoust. Soc. Am. 27(2).
Miller GA, Leacock C, Tengi R, Bunker RT (1993) A Semantic Concordance. In Proceedings of the 3rd DARPA Workshop on Human Language Technology.
Miller GA (1995) WordNet: A Lexical Database for English. Communications of the ACM 38(11), 39-41.
Mitchell TM, Shinkareva SV, Carlson A, Chang KM, Malave VL, Mason RA, Just MA (2008) Predicting Human Brain Activity Associated with the Meanings of Nouns. Science 320, 1191, May 30, 2008. DOI 10.1126/science.1152876
Murphy B, Baroni M, Poesio M (2009) EEG responds to conceptual stimuli and corpus semantics. In Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing, pages 619-627, Singapore, 6-7 August 2009. © 2009 ACL and AFNLP.
Murphy B, Poesio M (2010) Detecting semantic category in simultaneous EEG/MEG recordings. In First Workshop on Computational Neurolinguistics, NAACL HLT 2010 (pp. 36-44). Los Angeles: Association for Computational Linguistics.
Murphy B, Poesio M, Bovolo F, Bruzzone L, Dalponte M, Lakany H (2011) EEG decoding of semantic category reveals distributed representations for single concepts. Brain Lang 2011 Apr;117(1):12-22. doi:10.1016/j.bandl.2010.09.013
Pedersen T, Patwardhan S, Michelizzi J (2004) WordNet::Similarity - Measuring the Relatedness of Concepts. In Proceedings of Fifth Annual Meeting of the North American Chapter of the Association for Computational Linguistics (NAACL-2004), pp. 38-41, Boston, May 2004.
Pereira F, Botvinick M, Detre G (2010) Learning semantic features for fMRI data from definitional text. Proceedings of the NAACL HLT 2010 First Workshop on Computational Neurolinguistics, June 2010, Los Angeles, USA. Association for Computational Linguistics, 1-9. http://www.aclweb.org/anthology/W10-0601
Perreau-Guimaraes M, Wong DK, Uy ET, Grosenick L, Suppes P (2007) Single-trial classification of MEG recordings. IEEE Transactions on Biomedical Engineering 54, 436-443.
Rabinovitch I (1977) The Scott-Suppes theorem on semiorders. J. Mathematical Psychology 15(2), 209-212. MR0437404.
Resnik P (1995) Using information content to evaluate semantic similarity. In Proceedings of the 14th International Joint Conference on Artificial Intelligence, pages 448-453, Montreal.
Roach BJ, Mathalon DH (2008) Event-related EEG time-frequency analysis: an overview of measures and an analysis of early gamma band phase locking in schizophrenia. Schizophr Bull 2008 Sep;34(5):907-26. Epub 2008 Aug 6.
Scott D, Suppes P (1958) Foundational aspects of theories of measurement. The Journal of Symbolic Logic 23, 113-128. MR0115919.
Suppes P (1972) Axiomatic Set Theory. New York: Dover.
Suppes P (1974) The axiomatic method in the empirical sciences. In L. Henkin et al. (Eds.), Proceedings of the Tarski Symposium, Proceedings of Symposia in Pure Mathematics 25. Providence, RI: American Mathematical Society, pp. 465-479.
Suppes P, Lu Z-L, Han B (1997) Brain wave recognition of words. Proceedings of the National Academy of Sciences 94, 14965-14969.
Suppes P, Han B, Lu Z-L (1998) Brain-wave recognition of sentences. Proceedings of the National Academy of Sciences 95, 15861-15866.
Suppes P, Han B, Epelboim J, Lu Z-L (1999a) Invariance between subjects of brain wave representations of language. Proceedings of the National Academy of Sciences 96, 12953-12958.
Suppes P, Han B, Epelboim J, Lu Z-L (1999b) Invariance of brain-wave representations of simple visual images and their names. Proceedings of the National Academy of Sciences 96, 14658-14663.
Suppes P, Han B (2000) Brain-wave representation of words by superposition of a few sine waves. Proceedings of the National Academy of Sciences 97, 8738-8743.
Suppes P, Perreau-Guimaraes M, Wong DK (2009) Partial Orders of Similarity Differences Invariant Between EEG-Recorded Brain and Perceptual Representations of Language. Neural Computation 21, 3228-3269.
Suppes P, de Barros JA, Oas G (2012) Phase-Oscillator computations as neural models of stimulus-response conditioning and response selection. Journal of Mathematical Psychology 56, 95-117.
Turney P (2001) Mining the Web for Synonyms: PMI-IR versus LSA on TOEFL. In L. De Raedt & P. Flach (Eds.), Proceedings of the Twelfth European Conference on Machine Learning (ECML-2001) (pp. 491-502). Freiburg, Germany.
Vassilieva E, Pinto G, de Barros JA, Suppes P (2011) Learning Pattern Recognition Through Quasi-Synchronization of Phase Oscillators. IEEE Transactions on Neural Networks, Vol. 22, No. 1, January 2011.
Vigliocco G, Warren J, Siri S, Arciuli J, Scott S, Wise R (2006) The role of semantics and grammatical class in the neural representation of words. Cereb Cortex 2006 Dec;16(12):1790-6. Epub 2006 Jan 18.
Wong DK, Perreau-Guimaraes M, Uy ET, Suppes P (2004) Classification of individual trials based on the best independent component of EEG-recorded sentences. Neurocomputing 61, 479-484.
Wong DK, Uy ET, Perreau-Guimaraes M, Yang W, Suppes P (2006) Interpretation of perceptron weights as constructed time series for EEG classification. Neurocomputing 70, 373-383.
Wong DK, Grosenick L, Uy ET, Perreau-Guimaraes M, Carvalhaes CG, Desain P, Suppes P (2008) Quantifying inter-subject agreement in brain-imaging analyses. NeuroImage 39, 1051-1063.
Woods W (1975) What's in a link: Foundations for semantic networks. In Bobrow D, Collins A (Eds.), Representation and understanding. New York: Academic Press, 1975:35-82.
Wu Z, Palmer M (1994) Verb Semantics and Lexical Selection. In Proceedings of the 32nd Annual Meeting of the Association for Computational Linguistics, Las Cruces, New Mexico, pp. 133-138.
(1997) A solution to Platos problem The Latent Semantic Analysis theory of the acquisition induction and representation of knowledge Psychological Review 104 211-240 Landauer TK Foltz PW Laham D (1998) Introduction to Latent Semantic Analysis Discourse Processes 25 259-284
43 copy Colleen E Crangle 2013
Leacock C Chodorow M (1998) Combining local context and WordNet similarity for word sense identification In Fellbaum (Ed) WordNet An Electronic Lexical Database Cambridge MA MIT Press pp 265-283 Leech G Rayson P Wilson A (2001) Word Frequencies in Written and Spoken English based on the British National Corpus Longman London Lin D (1998) An information-theoretic definition of similarity In Proc 15th International Conf on Machine Learning Morgan Kaufmann San Francisco CA p 296--304 Luce RD (1956) Semiorders and a theory of utility discrimination Econometrica 24 178ndash191 MR0078632 httpwwwjstororgstable1905751 Matsunaga T Yonemori C Tomita E Muramatsu M (2009) Clique-based data mining for related genes in a biomedical database BMC Bioinformatics 2009 Jul 110205 Martin Ph (2003) Correction and Extension of WordNet 17 ICCS 2003 11th International Conference on Conceptual Structures (copy Springer Verlag LNAI 2746 pp 160-173) Dresden Germany July 21-25 2003 Miller GA Nicely P (1955) An analysis of perceptual confusions among some English consonants JAcoustSocAm 272 Miller GA Leacock C Tengi R Bunker RT (1993) A Semantic Concordance In Proceedings of the 3 DARPA Workshop on Human Language Technology Miller GA (1995) WordNet A Lexical Database for English Communications of the ACM Vol 38 No 11 39-41 Mitchell TM SV Shinkareva A Carlson KM Chang VL Malave R A Mason and M A Just (2008) Predicting Human Brain Activity Associated with the Meanings of Nouns Science 320 1191 May 30 2008 DOI 101126science1152876 Murphy B Baroni M amp Poesio M (2009) EEG responds to conceptual stimuli and corpus semantics In Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing pages 619ndash627 Singapore 6-7 August 2009 c 2009 ACL and AFNLP Murphy B amp Poesio M (2010) Detecting semantic category in simultaneous EEGMEG recordings In First workshop on computational neurolinguistics NAACL HLT 2010 (pp 36ndash44) Los Angeles Association for Computational 
Linguistics Murphy B Poesio M Bovolo F Bruzzone L Dalponte M Lakany H (2011) EEG decoding of semantic category reveals distributed representations for single concepts Brain Lang 2011 Apr117(1)12-22 doi 101016jbandl201009013
44 copy Colleen E Crangle 2013
Pedersen T Patwardhan S Michelizzi J (2004) WordNetSimilarity - Measuring the Relatedness of Concepts In Proceedings of Fifth Annual Meeting of the North American Chapter of the Association for Computational Linguistics (NAACL-2004) pp 38-41 Boston May 2004 Pereira F M Botvinick G Detre (2010) Learning semantic features for fMRI data from definitional text Proceedings of the NAACL HLT 2010 First Workshop on Computational Neurolinguistics June 2010Los Angeles USA Association for Computational Linguistics 1mdash9 httpwwwaclweborganthologyW10-0601 Perreau-Guimaraes M Wong DK Uy ET Grosenick L Suppes P (2007) Single-trial classification of MEG recordings IEEE Transactions on Biomedical Engineering 54 436ndash443 Rabinovitch I (1977) The Scott-Suppes theorem on semiorders J Mathematical Psychology 15 (2) 209ndash212 MR0437404 Resnik P (1995) Using information content to evaluate semantic similarity In Proceedings of the 14th International Joint Conference on Artificial Intelligence pages 448-453 Montreal Roach BJ Mathalon DH (2008) Event-related EEG time-frequency analysis an overview of measures and an analysis of early gamma band phase locking in schizophrenia Schizophr Bull 2008 Sep34(5)907-26 Epub 2008 Aug 6 Scott D Suppes P (1958) Foundational aspects of theories of measurement The Journal of Symbolic Logic 23 113ndash128 MR0115919 Suppes P (1972) Axiomatic Set Theory New York Dover Suppes P (1974) The axiomatic method in the empirical sciences L Henkin et al (Eds) Proceedings of the Tarski Symposium Proceedings of Symposia in Pure Mathematics 25 Providence RI American Mathematical Society pp 465-479 Suppes P Lu Z-L Han B (1997) Brain wave recognition of words Proceedings of the National Academy of Sciences 95 14965ndash14969 Suppes P Han B Lu Z-L (1998) Brain-wave recognition of sentences Proceedings of the National Academy of Sciences 95 15861ndash15866 Suppes P Han B Epelboim J Lu Z-L (1999a) Invariance between subjects of brain wave representations of language 
Proceedings of the National Academy of Sciences 96 12953ndash12958 Suppes P Han B Epelboim J Lu Z-L (1999b) Invariance of brain-wave representations of simple visual images and their names Proceedings of the National Academy of Sciences 96 14658-14663 Suppes P Han B (2000) Brain-wave representation of words by superposition of a few sine waves Proceedings of the National Academy of Sciences 97 8738ndash8743 Suppes P Perreau-Guimaraes M Wong DK (2009) Partial Orders of Similarity Differences Invariant Between EEG-Recorded Brain and Perceptual Representations of Language Neural Computation 21 3228ndash3269
45 copy Colleen E Crangle 2013
Suppes P de Barros JA Oas G (2012) Phase-Oscillator computations as neural models of stimulus-response conditioning and response selection Journal of Mathematical Psychology 56 95-117 Turney P (2001) Mining the Web for Synonyms PMI-IR versus LSA on TOEFL In L De Raedt amp P Flach (Eds) Proceedings of the Twelfth European Conference on Machine Learning (ECML-2001) (pp 491-502) Freiburg Germany Vassilieva E Pinto G de Barros JA Suppes P (2011) Learning Pattern Recognition Through Quasi-Synchronization of Phase Oscillators IEEE Transactions on Neural Networks Vol 22 No 1 January 2011 Vigliocco G Warren J Siri S Arciuli J Scott S Wise R (2006) The role of semantics and grammatical class in the neural representation of words Cereb Cortex 2006 Dec16(12)1790-6 Epub 2006 Jan 18 Wong DK Perreau-Guimaraes M Uy ET Suppes P (2004) Classification of individual trials based on the best independent component of EEG-recorded sentences Neurocomputing 61 479-484 Wong DK Uy ET Perreau-Guimaraes M Yang W Suppes P (2006) Interpretation of perceptron weights as constructed time series for EEG classification Neurocomputing 70 373-383 Wong DK Grosenick L Uy ET Perreau-Guimaraes M Carvalhaes CG Desain P Suppes P (2008) Quantifying inter-subject agreement in brain-imaging analyses NeuroImage 39 1051ndash1063 Woods W (1975) Whatrsquos in a link Foundations for semantic networks In Bobrow D Collins A eds Representation and understanding New York Academic Press 197535-82 Wu Z Palmer M (1994) Verb Semantics and Lexical Selection In Proceedings of the 32nd Annual Meeting of the Association for Computational Linguistics Las Cruces New Mexico pp 133--138
46 copy Colleen E Crangle 2013
[Figure: hierarchical cluster tree. Leaf order: east, west, north, south, London, Paris, Moscow, Germany, Poland, Russia. Axis ticks: 0.1 to 0.6.]
Hierarchical cluster tree computed from the pairwise Latent Semantic Analysis (LSA) scores of similarity for London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia, based on ~38,000 college-level texts (novels, newspaper articles, …)
LSA provides a straightforward measure of similarity between words.
For WordNet, several different measures of similarity have been devised, e.g.:
Path length between synsets.
Information content: a corpus-based measure of the specificity of a concept, measured in terms of the frequency of occurrence of the concept in the corpus, the human-annotated sense-tagged corpus SemCor (Miller et al 1993), which links every word in the Brown Corpus to its appropriate WordNet sense. Scaled various ways.
Vector-space models: these work by forming second-order co-occurrence vectors from the WordNet definitions of concepts, known as glosses.
We used five measures in our computations of similarity and took the average score using each of the relevant senses.
Semantic similarity matrix derived from WordNet for the set of words London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia, using senses relevant to the geography of Europe and five measures of similarity: wup (path length), lin and jcn (information content), and gv and pgv (vector space measures).

         London  Moscow  Paris   north   south   east    west    Germany Poland  Russia
London   1       0.396   0.466   0.106   0.103   0.076   0.078   0.322   0.299   0.303
Moscow   0.396   1       0.393   0.095   0.094   0.062   0.070   0.286   0.281   0.288
Paris    0.466   0.393   1       0.106   0.104   0.074   0.077   0.327   0.308   0.307
north    0.106   0.095   0.106   1       0.228   0.179   0.210   0.123   0.132   0.111
south    0.103   0.094   0.104   0.228   1       0.172   0.212   0.115   0.107   0.109
east     0.076   0.062   0.074   0.179   0.172   1       0.216   0.093   0.080   0.077
west     0.078   0.070   0.077   0.210   0.212   0.216   1       0.087   0.082   0.083
Germany  0.322   0.286   0.327   0.123   0.115   0.093   0.087   1       0.589   0.409
Poland   0.299   0.281   0.308   0.132   0.107   0.080   0.082   0.589   1       0.403
Russia   0.303   0.288   0.307   0.111   0.109   0.077   0.083   0.409   0.403   1
[Figure: hierarchical cluster tree. Leaf order: Germany, Poland, Russia, London, Paris, Moscow, north, south, west, east. Axis ticks: 0.2 to 0.6.]
Hierarchical cluster tree computed from pairwise WordNet-based semantic similarity scores for London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia, restricted to senses related to the geography of Europe.
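As a rough illustration of how such a tree can be built from the WordNet similarity matrix, here is a minimal sketch in plain Python. The slides do not say which linkage rule was used, so average linkage is an assumption here, and the helper names (`average_similarity`, `agglomerate`, `names`) are made up for this example. The decimal points in the scores are restored from the slide values.

```python
from itertools import combinations

WORDS = ["London", "Moscow", "Paris", "north", "south",
         "east", "west", "Germany", "Poland", "Russia"]

# WordNet-based similarity scores from the matrix (decimal points restored).
SIM = [
    [1.000, 0.396, 0.466, 0.106, 0.103, 0.076, 0.078, 0.322, 0.299, 0.303],
    [0.396, 1.000, 0.393, 0.095, 0.094, 0.062, 0.070, 0.286, 0.281, 0.288],
    [0.466, 0.393, 1.000, 0.106, 0.104, 0.074, 0.077, 0.327, 0.308, 0.307],
    [0.106, 0.095, 0.106, 1.000, 0.228, 0.179, 0.210, 0.123, 0.132, 0.111],
    [0.103, 0.094, 0.104, 0.228, 1.000, 0.172, 0.212, 0.115, 0.107, 0.109],
    [0.076, 0.062, 0.074, 0.179, 0.172, 1.000, 0.216, 0.093, 0.080, 0.077],
    [0.078, 0.070, 0.077, 0.210, 0.212, 0.216, 1.000, 0.087, 0.082, 0.083],
    [0.322, 0.286, 0.327, 0.123, 0.115, 0.093, 0.087, 1.000, 0.589, 0.409],
    [0.299, 0.281, 0.308, 0.132, 0.107, 0.080, 0.082, 0.589, 1.000, 0.403],
    [0.303, 0.288, 0.307, 0.111, 0.109, 0.077, 0.083, 0.409, 0.403, 1.000],
]


def average_similarity(a, b, sim):
    """Mean pairwise similarity between two clusters of word indices."""
    return sum(sim[i][j] for i in a for j in b) / (len(a) * len(b))


def agglomerate(sim, stop_at=1):
    """Average-linkage clustering (an assumed linkage rule).

    Repeatedly merges the two most similar clusters until `stop_at`
    clusters remain. Returns (clusters, merge_history).
    """
    clusters = [frozenset([i]) for i in range(len(sim))]
    history = []
    while len(clusters) > stop_at:
        a, b = max(combinations(clusters, 2),
                   key=lambda pair: average_similarity(pair[0], pair[1], sim))
        history.append((a, b, average_similarity(a, b, sim)))
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
    return clusters, history


def names(cluster):
    """Sorted word labels for a cluster of indices."""
    return sorted(WORDS[i] for i in cluster)


if __name__ == "__main__":
    final, merges = agglomerate(SIM, stop_at=2)
    for a, b, score in merges:
        print(names(a), "+", names(b), round(score, 3))
    print([names(c) for c in final])
```

Under this assumed linkage, the merge order gives the tree its shape: the most similar pair is joined first, and stopping at two clusters separates the compass directions from the place names.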
BACK TO THE BRAIN …
London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia
Now let's see how to compare the EEG data and the language data …
Some of the similarity trees show remarkable congruence between the brain and semantic data.
Where exactly does that congruence lie?
Can we devise a quantitative measure of the nature and strength of that congruence?
[Figure: two similarity trees, labelled LANGUAGE DATA and BRAIN DATA. Axis ticks: 0.2 to 0.6 on each. Leaf orders: London, Paris, Moscow, Germany, Poland, Russia, north, south, west, east; and Germany, Poland, Russia, London, Paris, Moscow, north, south, west, east.]
WordNet-based semantic similarities and EEG conditional probability estimates for London relative to London (L), Moscow (M), Paris (P), north (n), south (s), east (e), west (w), Germany (G), Poland (Po), and Russia (R).
The Spearman rank correlation for the two sequences in the figure is 0.99, with one-sided significance of 1.84E-10.
BRAIN DATA
For each word ω, we compute from the conditional probability density estimates a ternary relation R such that R(ω, ω1, ω2) if and only if, with respect to word ω, the conditional probability for word ω1 is smaller than the conditional probability for word ω2; that is, if and only if ω1's similarity difference with ω is smaller than ω2's similarity difference with ω.
R is an ordinal relation of similarity differences: a partial order that is irreflexive, asymmetric, and transitive.
LANGUAGE DATA
For each word ω, we compute from the semantic similarity matrix a ternary relation R such that R(ω, ω1, ω2) if and only if the similarity difference of ω1 with ω is smaller than the similarity difference of ω2 with ω; that is, ω1 is more similar to ω than is ω2.
R is an ordinal relation of similarity differences: a partial order that is irreflexive, asymmetric, and transitive.
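To make the construction concrete, the sketch below builds a ternary relation from a square table of scores and checks the three order properties. It uses the "smaller score" comparison stated for the brain data; applying the same comparison to similarity scores is an assumption, as are the function names. The three-word table is a sub-block of the WordNet matrix (London, Paris, north), chosen because London and Paris tie with respect to north.

```python
from itertools import permutations

WORDS = ["London", "Paris", "north"]

# Sub-block of the WordNet similarity matrix (decimal points restored).
SCORES = [
    [1.000, 0.466, 0.106],
    [0.466, 1.000, 0.106],
    [0.106, 0.106, 1.000],
]


def ternary_relation(scores):
    """Set of triples (w, w1, w2) with scores[w][w1] < scores[w][w2]."""
    n = len(scores)
    return {(w, w1, w2)
            for w in range(n)
            for w1, w2 in permutations(range(n), 2)
            if scores[w][w1] < scores[w][w2]}


def is_strict_partial_order(relation, w, n):
    """Check that R(w, ., .) is irreflexive, asymmetric and transitive."""
    pairs = {(a, b) for (x, a, b) in relation if x == w}
    irreflexive = all((a, a) not in pairs for a in range(n))
    asymmetric = all((b, a) not in pairs for (a, b) in pairs)
    transitive = all((a, d) in pairs
                     for (a, b) in pairs for (c, d) in pairs if b == c)
    return irreflexive and asymmetric and transitive


if __name__ == "__main__":
    R = ternary_relation(SCORES)
    for w, w1, w2 in sorted(R):
        print(f"R({WORDS[w]}, {WORDS[w1]}, {WORDS[w2]})")
    print(all(is_strict_partial_order(R, w, len(WORDS))
              for w in range(len(WORDS))))
```

Because tied scores leave a pair unordered, the result is in general a partial rather than a total order, which is the point of the north row in this small example.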
London

Language data        Brain data
London   1.000       London   0.275
Paris    0.466       Paris    0.133
Moscow   0.396       Moscow   0.108
Germany  0.322       Germany  0.075
Russia   0.303       north    0.042
Poland   0.299       Russia   0.033
north    0.106       Poland   0.025
south    0.103       east     0.008
west     0.078       south    0.000
east     0.076       west     0.000

Partial orders for London derived from the WordNet semantic similarities of Table 2 and the conditional probability estimates for the brain data of Table 5.
[Figure: the two partial orders for London drawn as diagrams.]
We follow the approach described in Suppes (1974) for the axiomatization of the theory of differences in utility preference or the theory of differences in psychological intensity.
Suppes P (1974) The axiomatic method in the empirical sciences. In L Henkin et al (Eds) Proceedings of the Tarski Symposium, Proceedings of Symposia in Pure Mathematics 25. Providence, RI: American Mathematical Society, pp 465-479.
Brain data: the relational structure (A, R) constructed from R and the finite set A of classes ω1, ω2, …, ωN, together with the N partial orders constructed from the N-by-N estimate for the conditional probability densities.
Language data: the relational structure (A, R) constructed from R and the finite set A of classes ω1, ω2, …, ωN, together with the N partial orders constructed from the N-by-N similarity matrix.
For each ωi, we compare the partial order of the brain data with the partial order of the language data using Spearman's rank correlation coefficient, which we interpret in the usual way to determine whether or not we have a statistically significant correlation.
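The comparison step can be sketched as follows, using the London values tabulated earlier (language scores from WordNet, brain scores from the conditional probability estimates). This is only an illustration: the function names are made up, ties are handled by average ranks (an assumed convention), and no significance test is included.

```python
def average_ranks(values):
    """Rank values from 1 (smallest); tied values share the mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Word order: London, Paris, Moscow, Germany, Russia, Poland,
# north, south, west, east (decimal points restored from the slide).
LANGUAGE = [1.000, 0.466, 0.396, 0.322, 0.303, 0.299, 0.106, 0.103, 0.078, 0.076]
BRAIN = [0.275, 0.133, 0.108, 0.075, 0.033, 0.025, 0.042, 0.000, 0.000, 0.008]

if __name__ == "__main__":
    print(round(spearman(LANGUAGE, BRAIN), 3))
```

A full analysis would add a one-sided significance test on the coefficient before counting the pair of partial orders as significantly correlated.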
For those instances in which the brain and language partial orders are significantly correlated, we find the partial order that is invariant with respect to the brain and language data.
Here are two more examples.
[Figure: invariant partial order diagram. Panel title: Significant Invariance - Paris - Spearman 0.88 (p = 6.6795e-004)]
[Figure: invariant partial order diagram. Panel title: Significant Invariance - Paris - Spearman 0.90 (p = 3.8716e-004)]
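One natural reading of "invariant with respect to both" is the set of ordered pairs on which the two partial orders agree, that is, their intersection. The slides do not spell out the construction, so the sketch below is an assumption, with made-up function names, applied to the London values tabulated earlier.

```python
def strict_order(scores):
    """Pairs (a, b) such that a's score is strictly smaller than b's."""
    return {(a, b) for a in scores for b in scores if scores[a] < scores[b]}


def invariant_order(scores_1, scores_2):
    """Pairs ordered the same way by both score tables (set intersection)."""
    return strict_order(scores_1) & strict_order(scores_2)


# London rows (decimal points restored from the slide).
LANGUAGE = {"London": 1.000, "Paris": 0.466, "Moscow": 0.396,
            "Germany": 0.322, "Russia": 0.303, "Poland": 0.299,
            "north": 0.106, "south": 0.103, "west": 0.078, "east": 0.076}
BRAIN = {"London": 0.275, "Paris": 0.133, "Moscow": 0.108,
         "Germany": 0.075, "north": 0.042, "Russia": 0.033,
         "Poland": 0.025, "east": 0.008, "south": 0.000, "west": 0.000}

if __name__ == "__main__":
    common = invariant_order(LANGUAGE, BRAIN)
    # Pairs on which the two data sets disagree, or one ties, drop out.
    print(("Paris", "London") in common)
    print(("north", "Russia") in common, ("Russia", "north") in common)
```

The intersection of two strict partial orders is again a strict partial order, so the result can be drawn as a diagram in the same way as the originals.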
[Figure: similarity tree for brain data. Leaf order: London, Paris, Moscow, Germany, Poland, Russia, north, south, west, east. Axis ticks: 0.2 to 0.6.]
Another hierarchical cluster tree (similarity tree) computed from the conditional probability density estimates for the classification of 640 brain wave samples for London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia.
Every single-trial classification produces its own conditional probability density estimates, giving rise to its own similarity tree …
[Figure: similarity tree for brain data. Leaf order: north, east, south, west, London, Paris, Germany, Moscow, Russia, Poland. Axis ticks: 0.1 to 0.5.]
1. We compute M (=30) single-trial classifications of the data (640 data samples for our 10 words) using random resampling with replacement.
2. For each classification, we find, for each word, the partial orders of the brain and language data that are significantly correlated.
3. For each of these highly correlated partial order pairs, we find the partial order invariant with respect to both.
We performed 60 classifications; that is, we recomputed the classifications of the brain data using random resampling with replacement.
For half of these 60 classifications we compared the brain data to the WordNet data, and for the other half we compared the brain data to the LSA data.
And we plotted the results.
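The resampling in step 1 can be sketched as drawing sample indices with replacement. The slides give only the counts (640 samples, 30 classifications per comparison), so the resample size of 640, the seed, and the function name are assumptions; the classifier itself is not shown.

```python
import random


def bootstrap_indices(n_samples, n_rounds, seed=0):
    """Draw `n_rounds` resamples of `n_samples` indices, with replacement."""
    rng = random.Random(seed)
    return [[rng.randrange(n_samples) for _ in range(n_samples)]
            for _ in range(n_rounds)]


if __name__ == "__main__":
    # 640 brain-wave samples, 30 resampled classifications.
    rounds = bootstrap_indices(n_samples=640, n_rounds=30)
    for r in rounds[:3]:
        # Sampling with replacement leaves some samples out of each round.
        print(len(r), "draws,", len(set(r)), "distinct samples")
```

Each list of indices would select the data samples used for one classification run, and each run then yields its own conditional probability estimates and partial orders.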
[Figure: bar chart. Title: Partial Orders of Similarity Differences Invariant between EEG-recorded Brain Data and Semantic Representations of Language. Y axis: Significant Invariant Partial Orders (0 to 300). X axis: Test Participants (S10, S24, S25, S27, S16, S26, S12, S13, S18). Series: WordNet, LSA.]
A higher number of significant structural similarities was found between the brain data and the WordNet data than between the brain data and the LSA data.
This stronger structural similarity between the brain data and the WordNet-derived data supports the contention that, during language comprehension for the complex cognitive task of assessing truth or falsity, the representation of words in the brain has a WordNet-like quality.
This stronger structural similarity is the first evidence we have seen of WordNet-like representations in the brain.
LSA provides straightforward measure of similarity between words
For WORDNET several different measures of similarity have been devised Eg Path length between synsets Information content a corpusndashbased measure of the specificity of a concept measured in terms of the frequency of occurrence of the concept in the corpus the human-annotated sensendashtagged corpus SemCor (Miller et al 1993) which links every word in the Brown Corpus to its appropriate WordNet sense Scaled various ways Vector-space models -- works by forming second-order co-occurrence vectors from the WordNet definitionsof concepts known as glosses We used five measures in our computations of similarity and took the average score using each of the relevant senses
26 copy Colleen E Crangle 2013
Semantic similarity matrix derived from WordNet for the set of
words London Moscow Paris north south east west Germany
Poland Russia using senses relevant to the geography of Europe
and five measures of similarity wup (path length) lin and jcn
(information content) and gv and pgv (vector space measures)
London Moscow Paris north south east west Germany Poland Russia
London 1 0396 0466 0106 0103 0076 0078 0322 0299 0303
Moscow 0396 1 0393 0095 0094 0062 007 0286 0281 0288
Paris 0466 0393 1 0106 0104 0074 0077 0327 0308 0307
north 0106 0095 0106 1 0228 0179 021 0123 0132 0111
south 0103 0094 0104 0228 1 0172 0212 0115 0107 0109
east 0076 0062 0074 0179 0172 1 0216 0093 008 0077
west 0078 007 0077 021 0212 0216 1 0087 0082 0083
Germany 0322 0286 0327 0123 0115 0093 0087 1 0589 0409
Poland 0299 0281 0308 0132 0107 008 0082 0589 1 0403
Russia 0303 0288 0307 0111 0109 0077 0083 0409 0403 1
27 copy Colleen E Crangle 2013
02 03 04 05 06
Germany
Poland
Russia
London
Paris
Moscow
north
south
west
east
Hierarchical cluster tree computed from pairwise WordNet-based semantic
similarity scores for London Moscow Paris north south east west
Germany Poland Russia restricted to senses related to the geography of
Europe
28 copy Colleen E Crangle 2013
BACK TO THE BRAIN hellip
London Moscow Paris north south
east west Germany Poland Russia
29 copy Colleen E Crangle 2013
Now letrsquos see how to compare the EEG data
and the language datahellip
30 copy Colleen E Crangle 2013
Some of the similarity trees show remarkable congruence
between the brain and semantic data
Where exactly does that congruence lie
Can we devise a quantitative measure of the nature and
strength of that congruence
02 03 04 05 06
London Paris Moscow Germany Poland Russia north south west east
02 03 04 05 06
Germany Poland Russia London Paris Moscow north south west east
LANGUAGE DATA
BRAIN DATA
31 copy Colleen E Crangle 2013
WordNet-based semantic similarities and EEG conditional probability
estimates for London relative to London (L) Moscow (M) Paris (P) north
(n) south (s) east (e) west (w) Germany (G) Poland (Po) and Russia reg
The Spearman rank correlation for the two sequences in the figure is 099 with one-sided significance of 184E-10
32 copy Colleen E Crangle 2013
BRAIN DATA
For each word ω, we compute from the conditional probability density estimates a ternary relation R such that R(ω, ω1, ω2) if and only if, with respect to word ω, the conditional probability for word ω1 is smaller than the conditional probability for word ω2; that is, if and only if ω1's similarity difference with ω is smaller than ω2's similarity difference with ω.
R is an ordinal relation of similarity differences: a partial order that is irreflexive, asymmetric, and transitive.

LANGUAGE DATA
For each word ω, we compute from the semantic similarity matrix a ternary relation R such that R(ω, ω1, ω2) if and only if the similarity difference of ω1 with ω is smaller than the similarity difference of ω2 with ω; that is, ω1 is more similar to ω than is ω2.
R is an ordinal relation of similarity differences: a partial order that is irreflexive, asymmetric, and transitive.
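
As a small illustration of the language-data relation, the sketch below fixes ω = London and builds the set of pairs (ω1, ω2) for which ω1 is more similar to London than ω2 is, using the London row of the similarity matrix. The function names are hypothetical, and treating "more similar" as a strictly larger similarity score is an assumption about how ties are handled.

```python
from itertools import permutations

# WordNet-based similarity of each word to London (London row of the matrix).
SIM_TO_LONDON = {
    "Moscow": 0.396, "Paris": 0.466, "north": 0.106, "south": 0.103,
    "east": 0.076, "west": 0.078, "Germany": 0.322, "Poland": 0.299,
    "Russia": 0.303,
}


def ordinal_relation(sim_to_ref):
    """Pairs (w1, w2) such that w1 is strictly more similar to the reference word."""
    return {(w1, w2) for w1, w2 in permutations(sim_to_ref, 2)
            if sim_to_ref[w1] > sim_to_ref[w2]}


def is_strict_partial_order(relation, items):
    """Check the three properties named on the slide."""
    irreflexive = all((w, w) not in relation for w in items)
    asymmetric = all((b, a) not in relation for a, b in relation)
    transitive = all((a, d) in relation
                     for a, b in relation for c, d in relation if b == c)
    return irreflexive and asymmetric and transitive


R_london = ordinal_relation(SIM_TO_LONDON)
print(("Paris", "Moscow") in R_london)   # Paris is closer to London than Moscow is
print(is_strict_partial_order(R_london, SIM_TO_LONDON))
```

The same construction applies to the brain data by swapping in a row of conditional probability estimates for the similarity row.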

London
Language data      Brain data
London   1.000     London   0.275
Paris    0.466     Paris    0.133
Moscow   0.396     Moscow   0.108
Germany  0.322     Germany  0.075
Russia   0.303     north    0.042
Poland   0.299     Russia   0.033
north    0.106     Poland   0.025
south    0.103     east     0.008
west     0.078     south    0.000
east     0.076     west     0.000
Partial orders for London derived from the WordNet semantic similarities of Table 2 and the conditional probability estimates for the brain data of Table 5.
[Figure: partial-order diagrams for the two columns; surviving node labels: Poland, north, south, west, east (language data) and north, Poland, east, south, west (brain data)]

Following the approach described in Suppes (1974) for the axiomatization of the theory of differences in utility preference or the theory of differences in psychological intensity.
Suppes, P. (1974). The axiomatic method in the empirical sciences. In L. Henkin et al. (Eds.), Proceedings of the Tarski Symposium, Proceedings of Symposia in Pure Mathematics 25. Providence, RI: American Mathematical Society, pp. 465-479.
Brain data: the relational structure (A, R) constructed from R and the finite set A of classes ω1, ω2, ..., ωN, together with the N partial orders constructed from the N-by-N estimate for the conditional probability densities.
Language data: the relational structure (A, R) constructed from R and the finite set A of classes ω1, ω2, ..., ωN, together with the N partial orders constructed from the N-by-N similarity matrix.
For each ω1, we compare the partial order of the brain data with the partial order of the language data using Spearman's rank correlation coefficient, which we interpret in the usual way to determine whether we have a statistically significant correlation.
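
A minimal sketch of the rank-correlation step, using the London language and brain values listed in the partial-order table for London. It ranks each sequence (tied values share their mean rank, an assumption about tie handling) and takes the Pearson correlation of the ranks. The helper names are illustrative, no significance test is included, and the result for these particular values need not equal a correlation reported for a different figure or classification run.

```python
def average_ranks(values):
    """Rank from largest (rank 1) to smallest; tied values share the mean rank."""
    order = sorted(range(len(values)), key=lambda i: -values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Order: London, Paris, Moscow, Germany, Russia, Poland, north, south, west, east
language = [1.000, 0.466, 0.396, 0.322, 0.303, 0.299, 0.106, 0.103, 0.078, 0.076]
brain = [0.275, 0.133, 0.108, 0.075, 0.033, 0.025, 0.042, 0.000, 0.000, 0.008]

rho = spearman(language, brain)
print(round(rho, 3))
```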

For those instances in which the brain and language partial orders are significantly correlated, we find the partial order that is invariant with respect to the brain and language data.
Here are two more examples:
[Figure: Significant Invariance - Paris - Spearman 0.88 (p = 6.6795e-004); nodes London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia]
[Figure: Significant Invariance - Paris - Spearman 0.90 (p = 3.8716e-004); nodes London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia]
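
One way to read "invariant" is as the set of orderings on which both sources agree, that is, the intersection of the brain and language orders. That reading is an assumption, not something the slides spell out, and the sketch below uses the London values from the partial-order table rather than the Paris data behind the figures. The function names are hypothetical.

```python
from itertools import permutations

# Similarity of each word to London (language) and conditional probability
# estimates for London (brain), as listed in the partial-order table.
LANGUAGE = {"Paris": 0.466, "Moscow": 0.396, "Germany": 0.322, "Russia": 0.303,
            "Poland": 0.299, "north": 0.106, "south": 0.103, "west": 0.078,
            "east": 0.076}
BRAIN = {"Paris": 0.133, "Moscow": 0.108, "Germany": 0.075, "north": 0.042,
         "Russia": 0.033, "Poland": 0.025, "east": 0.008, "south": 0.000,
         "west": 0.000}


def strict_order(scores):
    """Pairs (a, b) where a scores strictly higher than b."""
    return {(a, b) for a, b in permutations(scores, 2) if scores[a] > scores[b]}


def invariant_order(scores_a, scores_b):
    """Orderings shared by both sources (assumed meaning of 'invariant')."""
    return strict_order(scores_a) & strict_order(scores_b)


invariant = invariant_order(LANGUAGE, BRAIN)
print(("Paris", "Moscow") in invariant)   # both sources rank Paris above Moscow
print(("Russia", "north") in invariant)   # the two sources disagree on this pair
```

The intersection of two strict partial orders is again irreflexive, asymmetric and transitive, so the result stays within the same kind of structure as its inputs.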

Brain data
[Figure: similarity tree; leaf order London, Paris, Moscow, Germany, Poland, Russia, north, south, west, east; axis ticks 0.2 to 0.6]
Another hierarchical cluster tree (similarity tree) computed from the conditional probability density estimates for the classification of 640 brain wave samples for London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia.
Every single-trial classification produces its own conditional probability density estimates, giving rise to its own similarity tree ...
[Figure: similarity tree; leaf order north, east, south, west, London, Paris, Germany, Moscow, Russia, Poland; axis ticks 0.1 to 0.5]

1. We compute M (=30) single-trial classifications of the data (640 data samples for our 10 words) using random resampling with replacement.
2. For each classification, we find, for each word, the partial orders of the brain and language data that are significantly correlated.
3. For each of these highly correlated partial-order pairs, we find the partial order invariant with respect to both.
We performed 60 classifications; that is, we recomputed the classifications of the brain data using random resampling with replacement.
For half of these 60 classifications we compared the brain data to the WordNet data, and for the other half we compared the brain data to the LSA data.
And we plotted the results.
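
The resampling in step 1 can be sketched as follows. The slides do not describe the exact scheme, so drawing a resample of the same size as the data (640 indices, with replacement) for each of the M = 30 classifications is an assumption, and `bootstrap_indices` is a hypothetical helper; the classifier itself is not shown.

```python
import random


def bootstrap_indices(n_samples, n_rounds, seed=0):
    """For each round, draw n_samples indices uniformly with replacement."""
    rng = random.Random(seed)  # fixed seed so the resamples are reproducible
    return [[rng.randrange(n_samples) for _ in range(n_samples)]
            for _ in range(n_rounds)]


# 640 brain-wave samples, M = 30 classifications per semantic model.
rounds = bootstrap_indices(n_samples=640, n_rounds=30)
print(len(rounds), len(rounds[0]))
```

Each index list would select the samples fed to one single-trial classification, whose conditional probability estimates then yield one set of partial orders.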

[Figure: bar chart titled "Partial Orders of Similarity Differences Invariant between EEG-recorded Brain Data and Semantic Representations of Language"; vertical axis: Significant Invariant Partial Orders (0 to 300); horizontal axis: Test Participants (S10, S24, S25, S27, S16, S26, S12, S13, S18); series: WordNet, LSA]

A higher number of significant structural similarities was found between the brain data and the WordNet data than between the brain data and the LSA data.
This stronger structural similarity between the brain data and the WordNet-derived data supports the contention that, during language comprehension for the complex cognitive task of assessing truth or falsity, the representation of words in the brain has a WordNet-like quality.
This stronger structural similarity is the first evidence we have seen of WordNet-like representations in the brain.

Banerjee, S. and Pedersen, T. (2003). Extended Gloss Overlaps as a Measure of Semantic Relatedness. In Proceedings of the Eighteenth International Joint Conference on Artificial Intelligence, pp. 805-810, August 9-15, 2003, Acapulco, Mexico.
Banerjee, S. and Pedersen, T. (2002). An Adapted Lesk Algorithm for Word Sense Disambiguation using WordNet. In Proceedings of the Third International Conference on Intelligent Text Processing and Computational Linguistics, pp. 136-145, February 17-23, 2002, Mexico City.
Baroni, M., Murphy, B., Barbu, E., Poesio, M. (2010). Strudel: A Corpus-Based Semantic Model Based on Properties and Types. Cognitive Science, vol. 34, no. 2, pp. 222-254. DOI 10.1111/j.1551-6709.2009.01068.x
Blei, D., Ng, A., & Jordan, M. (2003). Latent Dirichlet allocation. Journal of Machine Learning Research 3, 993-1022.
Brants, T. and Franz, A. (2006). www.ldc.upenn.edu/Catalog/CatalogEntry.jsp?catalogId=LDC2006T13. Linguistic Data Consortium, Philadelphia.
Budanitsky, A., Hirst, G. (2001). Semantic distance in WordNet: an experimental, application-oriented evaluation of five measures. In Proceedings of the NAACL 2001 Workshop on WordNet and Other Lexical Resources, Pittsburgh, June 2001. http://www.seas.smu.edu/~rada/mwnw/papers/WNW-NAACL-101.pdf.gz
Collins, A.M., Loftus, E.F. (1975). A spreading-activation theory of semantic processing. Psychological Review, Vol. 82(6), 407-428.
Crestani, F. (1997). Application of Spreading Activation Techniques in Information Retrieval. Artificial Intelligence Review 11, 453-482.
Deerwester, S., Dumais, S.T., Furnas, G.W., Landauer, T.K., Harshman, R. (1990). Indexing By Latent Semantic Analysis. Journal of the American Society for Information Science 41, 391-407.
Devereux, B., Kelly, C., Korhonen, A. (2010). Using fMRI activation to conceptual stimuli to evaluate methods for extracting conceptual representations from corpora. Proceedings of the NAACL HLT 2010 First Workshop on Computational Neurolinguistics, June 2010, Los Angeles, USA. Association for Computational Linguistics, 70-78.
Fellbaum, C. (Ed.) (1998). WordNet: An Electronic Lexical Database. Cambridge, MA: MIT Press.
Hauk, O., Davis, M.H., Ford, M., Pulvermüller, F., Marslen-Wilson, W.D. (2006). The time course of visual word recognition as revealed by linear regression analysis of ERP data. Neuroimage, 2006 May 1; 30(4): 1383-400. Epub 2006 Feb 7.
Hirst, G. (1988). Resolving lexical ambiguity computationally with spreading activation and Polaroid Words. In Small, S.L., Cottrell, G.W., Tanenhaus, M.K. (editors), Lexical ambiguity resolution: Perspectives from psycholinguistics, neuropsychology, and artificial intelligence. San Mateo, CA: Morgan Kaufmann Publishers, 73-107.
Jelodar, A.B., Alizadeh, M., Khadivi, S. (2010). WordNet Based Features for Predicting Brain Activity associated with meanings of noun. Proceedings of the NAACL HLT 2010 First Workshop on Computational Neurolinguistics, June 2010, Los Angeles, USA. Association for Computational Linguistics, 18-26.
Jiang, J. and Conrath, D. (1997). Semantic similarity based on corpus statistics and lexical taxonomy. In Proceedings of International Conference on Research in Computational Linguistics, Taiwan, 19-33.
Just, M.A., Cherkassky, V.L., Aryal, S., Mitchell, T.M. (2010). A Neurosemantic Theory of Concrete Noun Representation Based on the Underlying Brain Codes. PLoS ONE 5(1): e8622. doi:10.1371/journal.pone.0008622
Karp, R.M. (1972). Reducibility among combinatorial problems. In Complexity of Computer Computations, Proc. Sympos. IBM Thomas J. Watson Res. Center, Yorktown Heights, NY (Ed. R.E. Miller and J.W. Thatcher). New York: Plenum, pp. 85-103.
Kelly, C., Devereux, B., Korhonen, A. (2010). Acquiring Human-like Feature-Based Conceptual Representations from Corpora. Proceedings of the NAACL HLT 2010 First Workshop on Computational Neurolinguistics, June 2010, Los Angeles, USA. Association for Computational Linguistics, 61-69.
Kriegeskorte, N., Mur, M., Bandettini, P.A. (2008). Representational similarity analysis - connecting the branches of systems neuroscience. Frontiers in Systems Neuroscience. doi:10.3389/neuro.06.004.2008
Kutas, M., Hillyard, S.A. (1984). Brain potentials during reading reflect word expectancy and semantic association. Nature 307, 161-163.
Kutas, M., Federmeier, K.D. (2011). Thirty years and counting: Finding meaning in the N400 component of the event related brain potential (ERP). Annual Review of Psychology 62, pp. 621-647.
Landauer, T.K., Dumais, S.T. (1997). A solution to Plato's problem: The Latent Semantic Analysis theory of the acquisition, induction, and representation of knowledge. Psychological Review 104, 211-240.
Landauer, T.K., Foltz, P.W., Laham, D. (1998). Introduction to Latent Semantic Analysis. Discourse Processes 25, 259-284.
Leacock, C., Chodorow, M. (1998). Combining local context and WordNet similarity for word sense identification. In Fellbaum (Ed.), WordNet: An Electronic Lexical Database. Cambridge, MA: MIT Press, pp. 265-283.
Leech, G., Rayson, P., Wilson, A. (2001). Word Frequencies in Written and Spoken English: based on the British National Corpus. Longman, London.
Lin, D. (1998). An information-theoretic definition of similarity. In Proc. 15th International Conf. on Machine Learning. Morgan Kaufmann, San Francisco, CA, pp. 296-304.
Luce, R.D. (1956). Semiorders and a theory of utility discrimination. Econometrica 24, 178-191. MR0078632. http://www.jstor.org/stable/1905751
Matsunaga, T., Yonemori, C., Tomita, E., Muramatsu, M. (2009). Clique-based data mining for related genes in a biomedical database. BMC Bioinformatics, 2009 Jul 1; 10: 205.
Martin, Ph. (2003). Correction and Extension of WordNet 1.7. ICCS 2003, 11th International Conference on Conceptual Structures (© Springer Verlag, LNAI 2746, pp. 160-173), Dresden, Germany, July 21-25, 2003.
Miller, G.A., Nicely, P. (1955). An analysis of perceptual confusions among some English consonants. J. Acoust. Soc. Am. 27(2).
Miller, G.A., Leacock, C., Tengi, R., Bunker, R.T. (1993). A Semantic Concordance. In Proceedings of the 3rd DARPA Workshop on Human Language Technology.
Miller, G.A. (1995). WordNet: A Lexical Database for English. Communications of the ACM, Vol. 38, No. 11, 39-41.
Mitchell, T.M., Shinkareva, S.V., Carlson, A., Chang, K.M., Malave, V.L., Mason, R.A., and Just, M.A. (2008). Predicting Human Brain Activity Associated with the Meanings of Nouns. Science 320, 1191, May 30, 2008. DOI 10.1126/science.1152876
Murphy, B., Baroni, M., & Poesio, M. (2009). EEG responds to conceptual stimuli and corpus semantics. In Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing, pages 619-627, Singapore, 6-7 August 2009. © 2009 ACL and AFNLP.
Murphy, B., & Poesio, M. (2010). Detecting semantic category in simultaneous EEG/MEG recordings. In First workshop on computational neurolinguistics, NAACL HLT 2010 (pp. 36-44). Los Angeles: Association for Computational Linguistics.
Murphy, B., Poesio, M., Bovolo, F., Bruzzone, L., Dalponte, M., Lakany, H. (2011). EEG decoding of semantic category reveals distributed representations for single concepts. Brain Lang., 2011 Apr; 117(1): 12-22. doi:10.1016/j.bandl.2010.09.013
Pedersen, T., Patwardhan, S., Michelizzi, J. (2004). WordNet::Similarity - Measuring the Relatedness of Concepts. In Proceedings of Fifth Annual Meeting of the North American Chapter of the Association for Computational Linguistics (NAACL-2004), pp. 38-41, Boston, May 2004.
Pereira, F., Botvinick, M., Detre, G. (2010). Learning semantic features for fMRI data from definitional text. Proceedings of the NAACL HLT 2010 First Workshop on Computational Neurolinguistics, June 2010, Los Angeles, USA. Association for Computational Linguistics, 1-9. http://www.aclweb.org/anthology/W10-0601
Perreau-Guimaraes, M., Wong, D.K., Uy, E.T., Grosenick, L., Suppes, P. (2007). Single-trial classification of MEG recordings. IEEE Transactions on Biomedical Engineering 54, 436-443.
Rabinovitch, I. (1977). The Scott-Suppes theorem on semiorders. J. Mathematical Psychology 15(2), 209-212. MR0437404.
Resnik, P. (1995). Using information content to evaluate semantic similarity. In Proceedings of the 14th International Joint Conference on Artificial Intelligence, pages 448-453, Montreal.
Roach, B.J., Mathalon, D.H. (2008). Event-related EEG time-frequency analysis: an overview of measures and an analysis of early gamma band phase locking in schizophrenia. Schizophr. Bull., 2008 Sep; 34(5): 907-26. Epub 2008 Aug 6.
Scott, D., Suppes, P. (1958). Foundational aspects of theories of measurement. The Journal of Symbolic Logic 23, 113-128. MR0115919.
Suppes, P. (1972). Axiomatic Set Theory. New York: Dover.
Suppes, P. (1974). The axiomatic method in the empirical sciences. In L. Henkin et al. (Eds.), Proceedings of the Tarski Symposium, Proceedings of Symposia in Pure Mathematics 25. Providence, RI: American Mathematical Society, pp. 465-479.
Suppes, P., Lu, Z.-L., Han, B. (1997). Brain wave recognition of words. Proceedings of the National Academy of Sciences 94, 14965-14969.
Suppes, P., Han, B., Lu, Z.-L. (1998). Brain-wave recognition of sentences. Proceedings of the National Academy of Sciences 95, 15861-15866.
Suppes, P., Han, B., Epelboim, J., Lu, Z.-L. (1999a). Invariance between subjects of brain wave representations of language. Proceedings of the National Academy of Sciences 96, 12953-12958.
Suppes, P., Han, B., Epelboim, J., Lu, Z.-L. (1999b). Invariance of brain-wave representations of simple visual images and their names. Proceedings of the National Academy of Sciences 96, 14658-14663.
Suppes, P., Han, B. (2000). Brain-wave representation of words by superposition of a few sine waves. Proceedings of the National Academy of Sciences 97, 8738-8743.
Suppes, P., Perreau-Guimaraes, M., Wong, D.K. (2009). Partial Orders of Similarity Differences Invariant Between EEG-Recorded Brain and Perceptual Representations of Language. Neural Computation 21, 3228-3269.
Suppes, P., de Barros, J.A., Oas, G. (2012). Phase-Oscillator computations as neural models of stimulus-response conditioning and response selection. Journal of Mathematical Psychology 56, 95-117.
Turney, P. (2001). Mining the Web for Synonyms: PMI-IR versus LSA on TOEFL. In L. De Raedt & P. Flach (Eds.), Proceedings of the Twelfth European Conference on Machine Learning (ECML-2001) (pp. 491-502). Freiburg, Germany.
Vassilieva, E., Pinto, G., de Barros, J.A., Suppes, P. (2011). Learning Pattern Recognition Through Quasi-Synchronization of Phase Oscillators. IEEE Transactions on Neural Networks, Vol. 22, No. 1, January 2011.
Vigliocco, G., Warren, J., Siri, S., Arciuli, J., Scott, S., Wise, R. (2006). The role of semantics and grammatical class in the neural representation of words. Cereb. Cortex, 2006 Dec; 16(12): 1790-6. Epub 2006 Jan 18.
Wong, D.K., Perreau-Guimaraes, M., Uy, E.T., Suppes, P. (2004). Classification of individual trials based on the best independent component of EEG-recorded sentences. Neurocomputing 61, 479-484.
Wong, D.K., Uy, E.T., Perreau-Guimaraes, M., Yang, W., Suppes, P. (2006). Interpretation of perceptron weights as constructed time series for EEG classification. Neurocomputing 70, 373-383.
Wong, D.K., Grosenick, L., Uy, E.T., Perreau-Guimaraes, M., Carvalhaes, C.G., Desain, P., Suppes, P. (2008). Quantifying inter-subject agreement in brain-imaging analyses. NeuroImage 39, 1051-1063.
Woods, W. (1975). What's in a link: Foundations for semantic networks. In Bobrow, D., Collins, A., eds., Representation and understanding. New York: Academic Press, 1975: 35-82.
Wu, Z., Palmer, M. (1994). Verb Semantics and Lexical Selection. In Proceedings of the 32nd Annual Meeting of the Association for Computational Linguistics, Las Cruces, New Mexico, pp. 133-138.
46 copy Colleen E Crangle 2013
[Figure: hierarchical cluster tree. Leaves, in order: Germany, Poland, Russia, London, Paris, Moscow, north, south, west, east. Axis ticks: 0.2 to 0.6.]
Hierarchical cluster tree computed from pairwise WordNet-based semantic similarity scores for London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia, restricted to senses related to the geography of Europe.
BACK TO THE BRAIN …
London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia
Now let's see how to compare the EEG data and the language data …
Some of the similarity trees show remarkable congruence between the brain and semantic data.
Where exactly does that congruence lie?
Can we devise a quantitative measure of the nature and strength of that congruence?
[Figure: two hierarchical cluster trees, panels labeled LANGUAGE DATA and BRAIN DATA, each with axis ticks 0.2 to 0.6. Leaf orders shown: London, Paris, Moscow, Germany, Poland, Russia, north, south, west, east; and Germany, Poland, Russia, London, Paris, Moscow, north, south, west, east.]
[Figure: WordNet-based semantic similarities and EEG conditional probability estimates for London relative to London (L), Moscow (M), Paris (P), north (n), south (s), east (e), west (w), Germany (G), Poland (Po), and Russia (R).]
The Spearman rank correlation for the two sequences in the figure is 0.99, with one-sided significance of 1.84E-10.
BRAIN DATA
For each word ω, we compute from the conditional probability density estimates a ternary relation R such that R(ω, ω1, ω2) if and only if, with respect to word ω, the conditional probability for word ω1 is smaller than the conditional probability for word ω2; that is, if and only if ω1's similarity difference with ω is smaller than ω2's similarity difference with ω.
R is an ordinal relation of similarity differences: for each fixed ω, a partial order that is irreflexive, asymmetric, and transitive.
LANGUAGE DATA
For each word ω, we compute from the semantic similarity matrix a ternary relation R such that R(ω, ω1, ω2) if and only if the similarity difference of ω1 with ω is smaller than the similarity difference of ω2 with ω; that is, ω1 is more similar to ω than is ω2.
R is an ordinal relation of similarity differences: for each fixed ω, a partial order that is irreflexive, asymmetric, and transitive.
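As a minimal sketch of the ternary relation for the language data, the Python below builds, for one reference word, the set of ordered pairs (ω1, ω2) for which R(ω, ω1, ω2) holds. The function name `ternary_relation` is hypothetical, and the scores are the WordNet similarities to London from the partial-order slide, with decimal points restored on the assumption that they were lost in extraction.

```python
def ternary_relation(similarity_to_ref):
    """Pairs (w1, w2) with R(ref, w1, w2): w1 is strictly more similar to the
    reference word than w2 is. Tied scores yield no pair in either direction."""
    return {
        (w1, w2)
        for w1, s1 in similarity_to_ref.items()
        for w2, s2 in similarity_to_ref.items()
        if s1 > s2
    }

# Assumed values: WordNet similarity of each word to London
# (decimal points restored from the slide's flattened table).
london_language = {
    "Paris": 0.466, "Moscow": 0.396, "Germany": 0.322, "Russia": 0.303,
    "Poland": 0.299, "north": 0.106, "south": 0.103, "west": 0.078,
    "east": 0.076,
}

R_london = ternary_relation(london_language)

# Check the three properties named on the slide.
irreflexive = all(a != b for a, b in R_london)
asymmetric = all((b, a) not in R_london for a, b in R_london)
transitive = all(
    (a, d) in R_london
    for a, b in R_london
    for c, d in R_london
    if b == c
)
print(irreflexive, asymmetric, transitive)
```

The three checks hold for any real-valued scores, because strict numeric comparison is itself irreflexive, asymmetric, and transitive; ties are what make the order partial rather than total.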
London

Language data        Brain data
London   1.000       London   0.275
Paris    0.466       Paris    0.133
Moscow   0.396       Moscow   0.108
Germany  0.322       Germany  0.075
Russia   0.303       north    0.042
Poland   0.299       Russia   0.033
north    0.106       Poland   0.025
south    0.103       east     0.008
west     0.078       south    0.000
east     0.076       west     0.000

Partial orders for London derived from the WordNet semantic similarities of Table 2 and the conditional probability estimates for the brain data of Table 5.
[Figure: partial-order diagrams for the language data and the brain data. Surviving node labels: Poland, north, south, west, east; and north, Poland, east, south, west.]
Following the approach described in Suppes (1974) for the axiomatization of the theory of differences in utility preference or the theory of differences in psychological intensity:
Suppes P (1974) The axiomatic method in the empirical sciences. In L Henkin et al (Eds) Proceedings of the Tarski Symposium, Proceedings of Symposia in Pure Mathematics 25. Providence, RI: American Mathematical Society, pp 465-479.
Brain data: the relational structure (A, R) constructed from R and the finite set A of classes ω1, ω2, …, ωN, together with the N partial orders constructed from the N-by-N estimate for the conditional probability densities.
Language data: the relational structure (A, R) constructed from R and the finite set A of classes ω1, ω2, …, ωN, together with the N partial orders constructed from the N-by-N similarity matrix.
For each word ω, we compare the partial order of the brain data with the partial order of the language data using Spearman's rank correlation coefficient, interpreted in the usual way to determine whether the correlation is statistically significant.
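The comparison step can be sketched in plain Python as a Spearman rank correlation with average ranks for ties. This is an illustration only, not the authors' code: the helper names `average_ranks` and `spearman` are hypothetical, the input values are the London columns from the partial-order slide with decimal points restored by assumption, and the significance test mentioned on the slide is not implemented here.

```python
from math import sqrt

def average_ranks(values):
    """Rank values in ascending order (1-based); tied values share the mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the two rank sequences."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / sqrt(var_x * var_y)

# Assumed values for London, in a common word order:
# London, Paris, Moscow, Germany, Russia, Poland, north, south, west, east
language = [1.000, 0.466, 0.396, 0.322, 0.303, 0.299, 0.106, 0.103, 0.078, 0.076]
brain = [0.275, 0.133, 0.108, 0.075, 0.033, 0.025, 0.042, 0.000, 0.000, 0.008]

rho = spearman(language, brain)
print(f"Spearman rho for London: {rho:.3f}")
```

Because these inputs come from one table on one slide, the resulting coefficient need not match the values quoted for other figures in the talk.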
For those instances in which the brain and language partial orders are significantly correlated, we find the partial order that is invariant with respect to the brain and language data. Here are two more examples:
[Figure: partial-order diagram over London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia. Panel title: Significant Invariance - Paris - Spearman 0.88 (p = 6.6795e-004).]
[Figure: partial-order diagram over London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia. Panel title: Significant Invariance - Paris - Spearman 0.90 (p = 3.8716e-004).]
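One plausible reading of the invariant partial order, assumed here rather than taken from the slides, is the set of ordered pairs on which the brain order and the language order agree, that is, the intersection of the two relations. The sketch below uses the London values from the partial-order slide (decimal points restored by assumption) and treats a higher value in either column as meaning closer to London; `strict_order` is a hypothetical helper name.

```python
def strict_order(scores):
    """Pairs (a, b) such that a scores strictly higher than b. Ties give no pair."""
    return {(a, b) for a in scores for b in scores if scores[a] > scores[b]}

# Assumed values for London (decimal points restored from the slide).
language = {
    "London": 1.000, "Paris": 0.466, "Moscow": 0.396, "Germany": 0.322,
    "Russia": 0.303, "Poland": 0.299, "north": 0.106, "south": 0.103,
    "west": 0.078, "east": 0.076,
}
brain = {
    "London": 0.275, "Paris": 0.133, "Moscow": 0.108, "Germany": 0.075,
    "north": 0.042, "Russia": 0.033, "Poland": 0.025, "east": 0.008,
    "south": 0.000, "west": 0.000,
}

# Assumption: the invariant partial order is the intersection of the two orders.
invariant = strict_order(language) & strict_order(brain)

# Pairs ordered by the language data that the brain data reverses or leaves tied.
disputed = strict_order(language) - strict_order(brain)
print(len(invariant), sorted(disputed))
```

The intersection of two strict partial orders is again irreflexive, asymmetric, and transitive, so the result can be drawn as a diagram of the same kind as the two inputs.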
[Figure: hierarchical cluster tree, brain data. Leaves, in order: London, Paris, Moscow, Germany, Poland, Russia, north, south, west, east. Axis ticks: 0.2 to 0.6.]
Another hierarchical cluster tree (similarity tree) computed from the conditional probability density estimates for the classification of 640 brain wave samples for London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia.
Every single-trial classification produces its own conditional probability density estimates, giving rise to its own similarity tree …
[Figure: hierarchical cluster tree, brain data. Leaves, in order: north, east, south, west, London, Paris, Germany, Moscow, Russia, Poland. Axis ticks: 0.1 to 0.5.]
1. We compute M (= 30) single-trial classifications of the data (640 data samples for our 10 words) using random resampling with replacement.
2. For each classification, we find, for each word, the partial orders of the brain and language data that are significantly correlated.
3. For each of these highly correlated partial-order pairs, we find the partial order invariant with respect to both.
We performed 60 classifications; that is, we recomputed the classifications of the brain data using random resampling with replacement. For half of these 60 classifications we compared the brain data to the WordNet data, and for the other half we compared the brain data to the LSA data. And we plotted the results.
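The resampling in step 1 can be sketched as follows. This shows only how the sample indices for each round might be drawn; the classifier itself is not reproduced. It assumes that each round draws as many samples as the data set contains (640), which the slide does not state, and `bootstrap_rounds` is a hypothetical name.

```python
import random

def bootstrap_rounds(n_samples, n_rounds, seed=0):
    """Draw n_rounds index lists, each sampling n_samples indices with replacement."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    return [
        [rng.randrange(n_samples) for _ in range(n_samples)]
        for _ in range(n_rounds)
    ]

# 30 rounds over 640 brain-wave samples, as on the slide.
rounds = bootstrap_rounds(n_samples=640, n_rounds=30)

# With replacement, each round repeats some samples and omits others.
distinct_per_round = [len(set(r)) for r in rounds]
print(min(distinct_per_round), max(distinct_per_round))
```

Each index list would then select the samples used for one single-trial classification, giving one set of conditional probability estimates and hence one similarity tree per round.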
[Figure: bar chart. Title: Partial Orders of Similarity Differences Invariant between EEG-recorded Brain Data and Semantic Representations of Language. Vertical axis: Significant Invariant Partial Orders (0 to 300). Horizontal axis: Test Participants (S10, S24, S25, S27, S16, S26, S12, S13, S18). Series: WordNet, LSA.]
A higher number of significant structural similarities was found between the brain data and the WordNet data than between the brain data and the LSA data.
This stronger structural similarity between the brain data and the WordNet-derived data supports the contention that, during language comprehension for the complex cognitive task of assessing truth or falsity, the representation of words in the brain has a WordNet-like quality.
This stronger structural similarity is the first evidence we have seen of WordNet-like representations in the brain.
Banerjee S, Pedersen T (2003) Extended Gloss Overlaps as a Measure of Semantic Relatedness. In Proceedings of the Eighteenth International Joint Conference on Artificial Intelligence, pp 805-810, August 9-15 2003, Acapulco, Mexico.
Banerjee S, Pedersen T (2002) An Adapted Lesk Algorithm for Word Sense Disambiguation using WordNet. In Proceedings of the Third International Conference on Intelligent Text Processing and Computational Linguistics, pp 136-145, February 17-23 2002, Mexico City.
Baroni M, Murphy B, Barbu E, Poesio M (2010) Strudel: A Corpus-Based Semantic Model Based on Properties and Types. Cognitive Science 34(2): 222-254. doi:10.1111/j.1551-6709.2009.01068.x
Blei D, Ng A, Jordan M (2003) Latent Dirichlet allocation. Journal of Machine Learning Research 3: 993–1022.
Brants T, Franz A (2006) www.ldc.upenn.edu/Catalog/CatalogEntry.jsp?catalogId=LDC2006T13. Linguistic Data Consortium, Philadelphia.
Budanitsky A, Hirst G (2001) Semantic distance in WordNet: an experimental, application-oriented evaluation of five measures. In Proceedings of the NAACL 2001 Workshop on WordNet and Other Lexical Resources, Pittsburgh, June 2001. http://www.seas.smu.edu/~rada/mwnw/papers/WNW-NAACL-101.pdf.gz
Collins AM, Loftus EF (1975) A spreading-activation theory of semantic processing. Psychological Review 82(6): 407-428.
Crestani F (1997) Application of Spreading Activation Techniques in Information Retrieval. Artificial Intelligence Review 11: 453–482.
Deerwester S, Dumais ST, Furnas GW, Landauer TK, Harshman R (1990) Indexing By Latent Semantic Analysis. Journal of the American Society for Information Science 41: 391-407.
Devereux B, Kelly C, Korhonen A (2010) Using fMRI activation to conceptual stimuli to evaluate methods for extracting conceptual representations from corpora. Proceedings of the NAACL HLT 2010 First Workshop on Computational Neurolinguistics, June 2010, Los Angeles, USA. Association for Computational Linguistics, 70–78.
Fellbaum C (Ed) (1998) WordNet: An Electronic Lexical Database. Cambridge, MA: MIT Press.
Hauk O, Davis MH, Ford M, Pulvermüller F, Marslen-Wilson WD (2006) The time course of visual word recognition as revealed by linear regression analysis of ERP data. Neuroimage 2006 May 1;30(4):1383-400. Epub 2006 Feb 7.
Hirst G (1988) Resolving lexical ambiguity computationally with spreading activation and Polaroid Words. In Small SL, Cottrell GW, Tanenhaus MK (Eds) Lexical ambiguity resolution: Perspectives from psycholinguistics, neuropsychology, and artificial intelligence. San Mateo, CA: Morgan Kaufmann Publishers, 73–107.
Jelodar AB, Alizadeh M, Khadivi S (2010) WordNet Based Features for Predicting Brain Activity associated with meanings of noun. Proceedings of the NAACL HLT 2010 First Workshop on Computational Neurolinguistics, June 2010, Los Angeles, USA. Association for Computational Linguistics, 18–26.
Jiang J, Conrath D (1997) Semantic similarity based on corpus statistics and lexical taxonomy. In Proceedings of International Conference on Research in Computational Linguistics, Taiwan, 19-33.
Just MA, Cherkassky VL, Aryal S, Mitchell TM (2010) A Neurosemantic Theory of Concrete Noun Representation Based on the Underlying Brain Codes. PLoS ONE 5(1): e8622. doi:10.1371/journal.pone.0008622
Karp RM (1972) Reducibility among combinatorial problems. In Miller RE, Thatcher JW (Eds) Complexity of Computer Computations (Proc Sympos IBM Thomas J Watson Res Center, Yorktown Heights, NY). New York: Plenum, pp 85-103.
Kelly C, Devereux B, Korhonen A (2010) Acquiring Human-like Feature-Based Conceptual Representations from Corpora. Proceedings of the NAACL HLT 2010 First Workshop on Computational Neurolinguistics, June 2010, Los Angeles, USA. Association for Computational Linguistics, 61–69.
Kriegeskorte N, Mur M, Bandettini PA (2008) Representational similarity analysis – connecting the branches of systems neuroscience. Frontiers in Systems Neuroscience. doi:10.3389/neuro.06.004.2008
Kutas M, Hillyard SA (1984) Brain potentials during reading reflect word expectancy and semantic association. Nature 307: 161-163.
Kutas M, Federmeier KD (2011) Thirty years and counting: Finding meaning in the N400 component of the event related brain potential (ERP). Annual Review of Psychology 62: 621-647.
Landauer TK, Dumais ST (1997) A solution to Plato's problem: The Latent Semantic Analysis theory of the acquisition, induction, and representation of knowledge. Psychological Review 104: 211-240.
Landauer TK, Foltz PW, Laham D (1998) Introduction to Latent Semantic Analysis. Discourse Processes 25: 259-284.
Leacock C, Chodorow M (1998) Combining local context and WordNet similarity for word sense identification. In Fellbaum (Ed) WordNet: An Electronic Lexical Database. Cambridge, MA: MIT Press, pp 265-283.
Leech G, Rayson P, Wilson A (2001) Word Frequencies in Written and Spoken English: based on the British National Corpus. Longman, London.
Lin D (1998) An information-theoretic definition of similarity. In Proc 15th International Conf on Machine Learning. Morgan Kaufmann, San Francisco, CA, pp 296-304.
Luce RD (1956) Semiorders and a theory of utility discrimination. Econometrica 24: 178–191. MR0078632. http://www.jstor.org/stable/1905751
Matsunaga T, Yonemori C, Tomita E, Muramatsu M (2009) Clique-based data mining for related genes in a biomedical database. BMC Bioinformatics 2009 Jul 1;10:205.
Martin Ph (2003) Correction and Extension of WordNet 1.7. ICCS 2003, 11th International Conference on Conceptual Structures (© Springer Verlag, LNAI 2746, pp 160-173), Dresden, Germany, July 21-25 2003.
Miller GA, Nicely P (1955) An analysis of perceptual confusions among some English consonants. J Acoust Soc Am 27(2).
Miller GA, Leacock C, Tengi R, Bunker RT (1993) A Semantic Concordance. In Proceedings of the 3 DARPA Workshop on Human Language Technology.
Miller GA (1995) WordNet: A Lexical Database for English. Communications of the ACM 38(11): 39-41.
Mitchell TM, Shinkareva SV, Carlson A, Chang KM, Malave VL, Mason RA, Just MA (2008) Predicting Human Brain Activity Associated with the Meanings of Nouns. Science 320: 1191, May 30 2008. doi:10.1126/science.1152876
Murphy B, Baroni M, Poesio M (2009) EEG responds to conceptual stimuli and corpus semantics. In Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing, pages 619–627, Singapore, 6-7 August 2009. © 2009 ACL and AFNLP.
Murphy B, Poesio M (2010) Detecting semantic category in simultaneous EEG/MEG recordings. In First Workshop on Computational Neurolinguistics, NAACL HLT 2010 (pp 36–44). Los Angeles: Association for Computational Linguistics.
Murphy B, Poesio M, Bovolo F, Bruzzone L, Dalponte M, Lakany H (2011) EEG decoding of semantic category reveals distributed representations for single concepts. Brain Lang 2011 Apr;117(1):12-22. doi:10.1016/j.bandl.2010.09.013
Pedersen T, Patwardhan S, Michelizzi J (2004) WordNet::Similarity - Measuring the Relatedness of Concepts. In Proceedings of Fifth Annual Meeting of the North American Chapter of the Association for Computational Linguistics (NAACL-2004), pp 38-41, Boston, May 2004.
Pereira F, Botvinick M, Detre G (2010) Learning semantic features for fMRI data from definitional text. Proceedings of the NAACL HLT 2010 First Workshop on Computational Neurolinguistics, June 2010, Los Angeles, USA. Association for Computational Linguistics, 1–9. http://www.aclweb.org/anthology/W10-0601
Perreau-Guimaraes M, Wong DK, Uy ET, Grosenick L, Suppes P (2007) Single-trial classification of MEG recordings. IEEE Transactions on Biomedical Engineering 54: 436–443.
Rabinovitch I (1977) The Scott-Suppes theorem on semiorders. J Mathematical Psychology 15(2): 209–212. MR0437404.
Resnik P (1995) Using information content to evaluate semantic similarity. In Proceedings of the 14th International Joint Conference on Artificial Intelligence, pages 448-453, Montreal.
Roach BJ, Mathalon DH (2008) Event-related EEG time-frequency analysis: an overview of measures and an analysis of early gamma band phase locking in schizophrenia. Schizophr Bull 2008 Sep;34(5):907-26. Epub 2008 Aug 6.
Scott D, Suppes P (1958) Foundational aspects of theories of measurement. The Journal of Symbolic Logic 23: 113–128. MR0115919.
Suppes P (1972) Axiomatic Set Theory. New York: Dover.
Suppes P (1974) The axiomatic method in the empirical sciences. In L Henkin et al (Eds) Proceedings of the Tarski Symposium, Proceedings of Symposia in Pure Mathematics 25. Providence, RI: American Mathematical Society, pp 465-479.
Suppes P, Lu Z-L, Han B (1997) Brain wave recognition of words. Proceedings of the National Academy of Sciences 94: 14965–14969.
Suppes P, Han B, Lu Z-L (1998) Brain-wave recognition of sentences. Proceedings of the National Academy of Sciences 95: 15861–15866.
Suppes P, Han B, Epelboim J, Lu Z-L (1999a) Invariance between subjects of brain wave representations of language. Proceedings of the National Academy of Sciences 96: 12953–12958.
Suppes P, Han B, Epelboim J, Lu Z-L (1999b) Invariance of brain-wave representations of simple visual images and their names. Proceedings of the National Academy of Sciences 96: 14658–14663.
Suppes P, Han B (2000) Brain-wave representation of words by superposition of a few sine waves. Proceedings of the National Academy of Sciences 97: 8738–8743.
Suppes P, Perreau-Guimaraes M, Wong DK (2009) Partial Orders of Similarity Differences Invariant Between EEG-Recorded Brain and Perceptual Representations of Language. Neural Computation 21: 3228–3269.
Suppes P, de Barros JA, Oas G (2012) Phase-Oscillator computations as neural models of stimulus-response conditioning and response selection. Journal of Mathematical Psychology 56: 95-117.
Turney P (2001) Mining the Web for Synonyms: PMI-IR versus LSA on TOEFL. In L De Raedt & P Flach (Eds) Proceedings of the Twelfth European Conference on Machine Learning (ECML-2001) (pp 491-502). Freiburg, Germany.
Vassilieva E, Pinto G, de Barros JA, Suppes P (2011) Learning Pattern Recognition Through Quasi-Synchronization of Phase Oscillators. IEEE Transactions on Neural Networks 22(1), January 2011.
Vigliocco G, Warren J, Siri S, Arciuli J, Scott S, Wise R (2006) The role of semantics and grammatical class in the neural representation of words. Cereb Cortex 2006 Dec;16(12):1790-6. Epub 2006 Jan 18.
Wong DK, Perreau-Guimaraes M, Uy ET, Suppes P (2004) Classification of individual trials based on the best independent component of EEG-recorded sentences. Neurocomputing 61: 479-484.
Wong DK, Uy ET, Perreau-Guimaraes M, Yang W, Suppes P (2006) Interpretation of perceptron weights as constructed time series for EEG classification. Neurocomputing 70: 373-383.
Wong DK, Grosenick L, Uy ET, Perreau-Guimaraes M, Carvalhaes CG, Desain P, Suppes P (2008) Quantifying inter-subject agreement in brain-imaging analyses. NeuroImage 39: 1051–1063.
Woods W (1975) What's in a link: Foundations for semantic networks. In Bobrow D, Collins A (Eds) Representation and understanding. New York: Academic Press, 35-82.
Wu Z, Palmer M (1994) Verb Semantics and Lexical Selection. In Proceedings of the 32nd Annual Meeting of the Association for Computational Linguistics, Las Cruces, New Mexico, pp 133-138.
Proceedings of the National Academy of Sciences 96 12953ndash12958 Suppes P Han B Epelboim J Lu Z-L (1999b) Invariance of brain-wave representations of simple visual images and their names Proceedings of the National Academy of Sciences 96 14658-14663 Suppes P Han B (2000) Brain-wave representation of words by superposition of a few sine waves Proceedings of the National Academy of Sciences 97 8738ndash8743 Suppes P Perreau-Guimaraes M Wong DK (2009) Partial Orders of Similarity Differences Invariant Between EEG-Recorded Brain and Perceptual Representations of Language Neural Computation 21 3228ndash3269
45 copy Colleen E Crangle 2013
Suppes P de Barros JA Oas G (2012) Phase-Oscillator computations as neural models of stimulus-response conditioning and response selection Journal of Mathematical Psychology 56 95-117 Turney P (2001) Mining the Web for Synonyms PMI-IR versus LSA on TOEFL In L De Raedt amp P Flach (Eds) Proceedings of the Twelfth European Conference on Machine Learning (ECML-2001) (pp 491-502) Freiburg Germany Vassilieva E Pinto G de Barros JA Suppes P (2011) Learning Pattern Recognition Through Quasi-Synchronization of Phase Oscillators IEEE Transactions on Neural Networks Vol 22 No 1 January 2011 Vigliocco G Warren J Siri S Arciuli J Scott S Wise R (2006) The role of semantics and grammatical class in the neural representation of words Cereb Cortex 2006 Dec16(12)1790-6 Epub 2006 Jan 18 Wong DK Perreau-Guimaraes M Uy ET Suppes P (2004) Classification of individual trials based on the best independent component of EEG-recorded sentences Neurocomputing 61 479-484 Wong DK Uy ET Perreau-Guimaraes M Yang W Suppes P (2006) Interpretation of perceptron weights as constructed time series for EEG classification Neurocomputing 70 373-383 Wong DK Grosenick L Uy ET Perreau-Guimaraes M Carvalhaes CG Desain P Suppes P (2008) Quantifying inter-subject agreement in brain-imaging analyses NeuroImage 39 1051ndash1063 Woods W (1975) Whatrsquos in a link Foundations for semantic networks In Bobrow D Collins A eds Representation and understanding New York Academic Press 197535-82 Wu Z Palmer M (1994) Verb Semantics and Lexical Selection In Proceedings of the 32nd Annual Meeting of the Association for Computational Linguistics Las Cruces New Mexico pp 133--138
46 copy Colleen E Crangle 2013
Now let's see how to compare the EEG data and the language data…
Some of the similarity trees show remarkable congruence between the brain and semantic data.
Where exactly does that congruence lie?
Can we devise a quantitative measure of the nature and strength of that congruence?
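[Figure: two similarity trees (axis ticks 0.2 to 0.6), labelled LANGUAGE DATA and BRAIN DATA. Leaf order of the first tree: London, Paris, Moscow, Germany, Poland, Russia, north, south, west, east. Leaf order of the second tree: Germany, Poland, Russia, London, Paris, Moscow, north, south, west, east.]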
WordNet-based semantic similarities and EEG conditional probability estimates for London relative to London (L), Moscow (M), Paris (P), north (n), south (s), east (e), west (w), Germany (G), Poland (Po), and Russia (R).
The Spearman rank correlation for the two sequences in the figure is 0.99, with one-sided significance of 1.84E-10.
For each word ω we compute, from the conditional probability density estimates, a ternary relation R such that R(ω, ω1, ω2) if and only if, with respect to word ω, the conditional probability for word ω1 is smaller than the conditional probability for word ω2; that is, if and only if ω1's similarity difference with ω is smaller than ω2's similarity difference with ω.
R is an ordinal relation of similarity differences: a partial order that is irreflexive, asymmetric, and transitive.
BRAIN DATA
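As an illustration of the definition on this slide, the following Python sketch builds the relation for one fixed word and checks the three stated properties. The values are the brain-data entries for London taken from the London table in this talk. The function names `ternary_relation` and `is_strict_partial_order` are hypothetical names introduced here, not the authors' software, and the direction of the comparison simply follows the slide's wording (smaller value first).

```python
# Conditional probability estimates for London (brain data), copied from the
# London table in this talk.
brain_london = {
    "London": 0.275, "Paris": 0.133, "Moscow": 0.108, "Germany": 0.075,
    "north": 0.042, "Russia": 0.033, "Poland": 0.025, "east": 0.008,
    "south": 0.000, "west": 0.000,
}


def ternary_relation(row):
    """Return all pairs (w1, w2) for which R(w, w1, w2) holds, w being fixed.

    Taken literally from the slide: the pair is included when the value for
    w1 is strictly smaller than the value for w2.
    """
    return {(w1, w2) for w1 in row for w2 in row if row[w1] < row[w2]}


def is_strict_partial_order(pairs):
    """Check irreflexivity, asymmetry and transitivity of a set of pairs."""
    irreflexive = all(a != b for a, b in pairs)
    asymmetric = all((b, a) not in pairs for a, b in pairs)
    transitive = all((a, d) in pairs
                     for a, b in pairs for c, d in pairs if b == c)
    return irreflexive and asymmetric and transitive


R_london = ternary_relation(brain_london)
print(("Paris", "London") in R_london)    # True: 0.133 < 0.275
print(("south", "west") in R_london)      # False: both are 0.000
print(is_strict_partial_order(R_london))  # True
```

Tied values (south and west here) are left unordered, which is why the result is a partial order and not a total one.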
For each word ω we compute, from the semantic similarity matrix, a ternary relation R such that R(ω, ω1, ω2) if and only if the similarity difference of ω1 with ω is smaller than the similarity difference of ω2 with ω; that is, ω1 is more similar to ω than is ω2.
R is an ordinal relation of similarity differences: a partial order that is irreflexive, asymmetric, and transitive.
LANGUAGE DATA
London

Language data        Brain data
London   1.000       London   0.275
Paris    0.466       Paris    0.133
Moscow   0.396       Moscow   0.108
Germany  0.322       Germany  0.075
Russia   0.303       north    0.042
Poland   0.299       Russia   0.033
north    0.106       Poland   0.025
south    0.103       east     0.008
west     0.078       south    0.000
east     0.076       west     0.000

Partial orders for London derived from the WordNet semantic similarities of Table 2 and the conditional probability estimates for the brain data of Table 5.
[Figure: partial-order diagrams for London, one for the language data and one for the brain data.]
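The two orderings for London can be read off the table values with a few lines of Python. This is only a sketch: the helper name `levels` is hypothetical, and grouping tied values into a shared level is an assumption about how ties are drawn in the diagrams.

```python
from itertools import groupby

# Values copied from the London table on this slide.
language_london = {
    "London": 1.000, "Paris": 0.466, "Moscow": 0.396, "Germany": 0.322,
    "Russia": 0.303, "Poland": 0.299, "north": 0.106, "south": 0.103,
    "west": 0.078, "east": 0.076,
}
brain_london = {
    "London": 0.275, "Paris": 0.133, "Moscow": 0.108, "Germany": 0.075,
    "north": 0.042, "Russia": 0.033, "Poland": 0.025, "east": 0.008,
    "south": 0.000, "west": 0.000,
}


def levels(row):
    """Group words into levels, highest value first; ties share a level."""
    ordered = sorted(row.items(), key=lambda kv: kv[1], reverse=True)
    return [sorted(word for word, _ in group)
            for _, group in groupby(ordered, key=lambda kv: kv[1])]


for name, row in (("language", language_london), ("brain", brain_london)):
    print(name, levels(row))
```

The language data give ten singleton levels, while the brain data give nine, because south and west are tied at 0.000 and therefore share the bottom level.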
Following the approach described in Suppes (1974) for the axiomatization of the theory of differences in utility preference or the theory of differences in psychological intensity.
Suppes P (1974). The axiomatic method in the empirical sciences. In L. Henkin et al. (Eds.), Proceedings of the Tarski Symposium, Proceedings of Symposia in Pure Mathematics 25. Providence, RI: American Mathematical Society, pp. 465-479.
Brain data: the relational structure (A, R) constructed from R and the finite set A of classes ω1, ω2, …, ωN, together with the N partial orders constructed from the N-by-N estimate for the conditional probability densities.
Language data: the relational structure (A, R) constructed from R and the finite set A of classes ω1, ω2, …, ωN, together with the N partial orders constructed from the N-by-N similarity matrix.
For each word ωi we compare the partial order of the brain data with the partial order of the language data using Spearman's rank correlation coefficient, which we interpret in the usual way to determine whether or not the correlation is statistically significant.
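A minimal sketch of the comparison step, using only the Python standard library. It computes Spearman's coefficient as the Pearson correlation of average ranks, which is the standard treatment of ties; whether the authors handled ties this way is an assumption. The data are the London values from the table in this talk, and the result for these rounded entries need not coincide with correlations quoted on other slides, which may rest on different estimates.

```python
import math

words = ["London", "Paris", "Moscow", "Germany", "Russia",
         "Poland", "north", "south", "west", "east"]
# Values for London copied from the London table, in the order of `words`.
language = [1.000, 0.466, 0.396, 0.322, 0.303, 0.299, 0.106, 0.103, 0.078, 0.076]
brain = [0.275, 0.133, 0.108, 0.075, 0.033, 0.025, 0.042, 0.000, 0.000, 0.008]


def average_ranks(values):
    """Rank values from 1 upward; tied values receive their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


rho = spearman(language, brain)
print(round(rho, 3))
```

The sketch stops at the coefficient. A significance level would come from a t approximation or a permutation test; a library routine such as `scipy.stats.spearmanr`, which returns both the coefficient and a p-value, is the usual shortcut.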
For those instances in which the brain and language partial orders are significantly correlated, we find the partial order that is invariant with respect to the brain and language data. Here are two more examples.
[Figure: invariant partial order over London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia. Significant invariance, Paris: Spearman 0.88 (p = 6.6795e-004).]
[Figure: invariant partial order over London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia. Significant invariance, Paris: Spearman 0.90 (p = 3.8716e-004).]
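One way to picture the invariant partial order is as the set of ordered pairs on which both data sources agree, that is, the intersection of the two strict orders. Treating invariance as this intersection is an assumption made for the sketch below, and the names `strict_order`, `invariant_order` and `covering_pairs` are hypothetical. The Paris values behind the figures are not given in the slides, so the sketch uses the London values from the table in this talk.

```python
# Values for London copied from the London table in this talk.
language_london = {
    "London": 1.000, "Paris": 0.466, "Moscow": 0.396, "Germany": 0.322,
    "Russia": 0.303, "Poland": 0.299, "north": 0.106, "south": 0.103,
    "west": 0.078, "east": 0.076,
}
brain_london = {
    "London": 0.275, "Paris": 0.133, "Moscow": 0.108, "Germany": 0.075,
    "north": 0.042, "Russia": 0.033, "Poland": 0.025, "east": 0.008,
    "south": 0.000, "west": 0.000,
}


def strict_order(row):
    """All pairs (a, b) whose value for a is strictly below that for b."""
    return {(a, b) for a in row for b in row if row[a] < row[b]}


def invariant_order(row_a, row_b):
    """Pairs ordered the same way in both sources (assumed definition)."""
    return strict_order(row_a) & strict_order(row_b)


def covering_pairs(order):
    """Edges of the order diagram: pairs with no element strictly between."""
    items = {x for pair in order for x in pair}
    return {(a, b) for a, b in order
            if not any((a, c) in order and (c, b) in order for c in items)}


invariant = invariant_order(language_london, brain_london)
print(("Paris", "London") in invariant)  # True: both sources agree
print(("north", "Russia") in invariant)  # False: the sources disagree
print(sorted(covering_pairs(invariant)))
```

Pairs on which the two sources disagree, such as north and Russia, simply become incomparable in the invariant order. The intersection of two strict partial orders is again a strict partial order, so the result can be drawn as a diagram using the covering pairs.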
Brain data
[Figure: similarity tree (axis ticks 0.2 to 0.6). Leaf order: London, Paris, Moscow, Germany, Poland, Russia, north, south, west, east.]
Another hierarchical cluster tree (similarity tree), computed from the conditional probability density estimates for the classification of 640 brain wave samples for London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia.
Every single-trial classification produces its own conditional probability density estimates, giving rise to its own similarity tree…
[Figure: similarity tree (axis ticks 0.1 to 0.5). Leaf order: north, east, south, west, London, Paris, Germany, Moscow, Russia, Poland.]
1. We compute M (= 30) single-trial classifications of the data (640 data samples for our 10 words) using random resampling with replacement.
2. For each classification we find, for each word, the partial orders of the brain and language data that are significantly correlated.
3. For each of these highly correlated partial order pairs we find the partial order invariant with respect to both.
We performed 60 classifications; that is, we recomputed the classifications of the brain data using random resampling with replacement. For half of these 60 classifications we compared the brain data to the WordNet data, and for the other half we compared the brain data to the LSA data.
And we plotted the results.
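The resampling in step 1 can be sketched as follows. Only the drawing of sample indices with replacement is shown. The classifier itself is not described on this slide, so `classify` is a hypothetical callback, and the stand-in used in the demonstration merely counts distinct samples.

```python
import random


def resample_indices(n_samples, rng):
    """Draw n_samples indices uniformly at random, with replacement."""
    return [rng.randrange(n_samples) for _ in range(n_samples)]


def run_resampled_classifications(n_samples, n_runs, classify, seed=0):
    """Call classify(indices) once per resample and collect the results.

    classify is a placeholder for whatever single-trial classifier is used;
    in the setting of the talk it would return that run's conditional
    probability estimates.
    """
    rng = random.Random(seed)
    return [classify(resample_indices(n_samples, rng)) for _ in range(n_runs)]


def count_distinct(indices):
    """Demonstration stand-in only: how many distinct samples were drawn."""
    return len(set(indices))


# 640 data samples and M = 30 runs, as stated on the slide.
distinct_counts = run_resampled_classifications(640, 30, count_distinct)
print(len(distinct_counts))                        # 30
print(all(0 < c <= 640 for c in distinct_counts))  # True
```

Because sampling is with replacement, each run sees some samples more than once and misses others, which is what makes the resulting conditional probability estimates, and hence the partial orders, vary from run to run.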
[Figure: bar chart titled "Partial Orders of Similarity Differences Invariant between EEG-recorded Brain Data and Semantic Representations of Language". Vertical axis: Significant Invariant Partial Orders (0 to 300). Horizontal axis: Test Participants (S10, S24, S25, S27, S16, S26, S12, S13, S18). Series: WordNet, LSA.]
A higher number of significant structural similarities was found between the brain data and the WordNet data than between the brain data and the LSA data.
This stronger structural similarity between the brain data and the WordNet-derived data supports the contention that, during language comprehension for the complex cognitive task of assessing truth or falsity, the representation of words in the brain has a WordNet-like quality.
This stronger structural similarity is the first evidence we have seen of WordNet-like representations in the brain.
Banerjee S, Pedersen T (2003). Extended Gloss Overlaps as a Measure of Semantic Relatedness. In Proceedings of the Eighteenth International Joint Conference on Artificial Intelligence, pp. 805-810, August 9-15, 2003, Acapulco, Mexico.
Banerjee S, Pedersen T (2002). An Adapted Lesk Algorithm for Word Sense Disambiguation using WordNet. In Proceedings of the Third International Conference on Intelligent Text Processing and Computational Linguistics, pp. 136-145, February 17-23, 2002, Mexico City.
Baroni M, Murphy B, Barbu E, Poesio M (2010). Strudel: A Corpus-Based Semantic Model Based on Properties and Types. Cognitive Science 34(2), pp. 222-254. DOI: 10.1111/j.1551-6709.2009.01068.x
Blei D, Ng A, Jordan M (2003). Latent Dirichlet allocation. Journal of Machine Learning Research 3, 993-1022.
Brants T, Franz A (2006). www.ldc.upenn.edu/Catalog/CatalogEntry.jsp?catalogId=LDC2006T13. Linguistic Data Consortium, Philadelphia.
Budanitsky A, Hirst G (2001). Semantic distance in WordNet: an experimental, application-oriented evaluation of five measures. In Proceedings of the NAACL 2001 Workshop on WordNet and Other Lexical Resources, Pittsburgh, June 2001. http://www.seas.smu.edu/~rada/mwnw/papers/WNW-NAACL-101.pdf.gz
Collins AM, Loftus EF (1975). A spreading-activation theory of semantic processing. Psychological Review 82(6), 407-428.
Crestani F (1997). Application of Spreading Activation Techniques in Information Retrieval. Artificial Intelligence Review 11, 453-482.
Deerwester S, Dumais ST, Furnas GW, Landauer TK, Harshman R (1990). Indexing By Latent Semantic Analysis. Journal of the American Society for Information Science 41, 391-407.
Devereux B, Kelly C, Korhonen A (2010). Using fMRI activation to conceptual stimuli to evaluate methods for extracting conceptual representations from corpora. Proceedings of the NAACL HLT 2010 First Workshop on Computational Neurolinguistics, June 2010, Los Angeles, USA. Association for Computational Linguistics, 70-78.
Fellbaum C (Ed.) (1998). WordNet: An Electronic Lexical Database. Cambridge, MA: MIT Press.
Hauk O, Davis MH, Ford M, Pulvermüller F, Marslen-Wilson WD (2006). The time course of visual word recognition as revealed by linear regression analysis of ERP data. Neuroimage 2006 May 1;30(4):1383-400. Epub 2006 Feb 7.
Hirst G (1988). Resolving lexical ambiguity computationally with spreading activation and Polaroid Words. In Small SL, Cottrell GW, Tanenhaus MK (Eds.), Lexical ambiguity resolution: Perspectives from psycholinguistics, neuropsychology, and artificial intelligence. San Mateo, CA: Morgan Kaufmann Publishers, 73-107.
Jelodar AB, Alizadeh M, Khadivi S (2010). WordNet Based Features for Predicting Brain Activity associated with meanings of noun. Proceedings of the NAACL HLT 2010 First Workshop on Computational Neurolinguistics, June 2010, Los Angeles, USA. Association for Computational Linguistics, 18-26.
Jiang J, Conrath D (1997). Semantic similarity based on corpus statistics and lexical taxonomy. In Proceedings of International Conference on Research in Computational Linguistics, Taiwan, 19-33.
Just MA, Cherkassky VL, Aryal S, Mitchell TM (2010). A Neurosemantic Theory of Concrete Noun Representation Based on the Underlying Brain Codes. PLoS ONE 5(1): e8622. doi:10.1371/journal.pone.0008622
Karp RM (1972). Reducibility among combinatorial problems. In Complexity of Computer Computations, Proc. Sympos. IBM Thomas J. Watson Res. Center, Yorktown Heights, NY (Eds. R. E. Miller and J. W. Thatcher). New York: Plenum, pp. 85-103.
Kelly C, Devereux B, Korhonen A (2010). Acquiring Human-like Feature-Based Conceptual Representations from Corpora. Proceedings of the NAACL HLT 2010 First Workshop on Computational Neurolinguistics, June 2010, Los Angeles, USA. Association for Computational Linguistics, 61-69.
Kriegeskorte N, Mur M, Bandettini PA (2008). Representational similarity analysis - connecting the branches of systems neuroscience. Frontiers in Systems Neuroscience. doi:10.3389/neuro.06.004.2008
Kutas M, Hillyard SA (1984). Brain potentials during reading reflect word expectancy and semantic association. Nature 307, 161-163.
Kutas M, Federmeier KD (2011). Thirty years and counting: Finding meaning in the N400 component of the event related brain potential (ERP). Annual Review of Psychology 62, pp. 621-647.
Landauer TK, Dumais ST (1997). A solution to Plato's problem: The Latent Semantic Analysis theory of the acquisition, induction, and representation of knowledge. Psychological Review 104, 211-240.
Landauer TK, Foltz PW, Laham D (1998). Introduction to Latent Semantic Analysis. Discourse Processes 25, 259-284.
Leacock C, Chodorow M (1998). Combining local context and WordNet similarity for word sense identification. In Fellbaum (Ed.), WordNet: An Electronic Lexical Database. Cambridge, MA: MIT Press, pp. 265-283.
Leech G, Rayson P, Wilson A (2001). Word Frequencies in Written and Spoken English: based on the British National Corpus. London: Longman.
Lin D (1998). An information-theoretic definition of similarity. In Proc. 15th International Conf. on Machine Learning. Morgan Kaufmann, San Francisco, CA, pp. 296-304.
Luce RD (1956). Semiorders and a theory of utility discrimination. Econometrica 24, 178-191. MR0078632. http://www.jstor.org/stable/1905751
Matsunaga T, Yonemori C, Tomita E, Muramatsu M (2009). Clique-based data mining for related genes in a biomedical database. BMC Bioinformatics 2009 Jul 1;10:205.
Martin Ph (2003). Correction and Extension of WordNet 1.7. ICCS 2003, 11th International Conference on Conceptual Structures (© Springer Verlag, LNAI 2746, pp. 160-173), Dresden, Germany, July 21-25, 2003.
Miller GA, Nicely P (1955). An analysis of perceptual confusions among some English consonants. J. Acoust. Soc. Am. 27(2).
Miller GA, Leacock C, Tengi R, Bunker RT (1993). A Semantic Concordance. In Proceedings of the 3rd DARPA Workshop on Human Language Technology.
Miller GA (1995). WordNet: A Lexical Database for English. Communications of the ACM 38(11), 39-41.
Mitchell TM, Shinkareva SV, Carlson A, Chang KM, Malave VL, Mason RA, Just MA (2008). Predicting Human Brain Activity Associated with the Meanings of Nouns. Science 320, 1191, May 30, 2008. DOI: 10.1126/science.1152876
Murphy B, Baroni M, Poesio M (2009). EEG responds to conceptual stimuli and corpus semantics. In Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing, pages 619-627, Singapore, 6-7 August 2009. © 2009 ACL and AFNLP.
Murphy B, Poesio M (2010). Detecting semantic category in simultaneous EEG/MEG recordings. In First Workshop on Computational Neurolinguistics, NAACL HLT 2010 (pp. 36-44). Los Angeles: Association for Computational Linguistics.
Murphy B, Poesio M, Bovolo F, Bruzzone L, Dalponte M, Lakany H (2011). EEG decoding of semantic category reveals distributed representations for single concepts. Brain Lang 2011 Apr;117(1):12-22. doi:10.1016/j.bandl.2010.09.013
Pedersen T, Patwardhan S, Michelizzi J (2004). WordNet::Similarity - Measuring the Relatedness of Concepts. In Proceedings of Fifth Annual Meeting of the North American Chapter of the Association for Computational Linguistics (NAACL-2004), pp. 38-41, Boston, May 2004.
Pereira F, Botvinick M, Detre G (2010). Learning semantic features for fMRI data from definitional text. Proceedings of the NAACL HLT 2010 First Workshop on Computational Neurolinguistics, June 2010, Los Angeles, USA. Association for Computational Linguistics, 1-9. http://www.aclweb.org/anthology/W10-0601
Perreau-Guimaraes M, Wong DK, Uy ET, Grosenick L, Suppes P (2007). Single-trial classification of MEG recordings. IEEE Transactions on Biomedical Engineering 54, 436-443.
Rabinovitch I (1977). The Scott-Suppes theorem on semiorders. J. Mathematical Psychology 15(2), 209-212. MR0437404.
Resnik P (1995). Using information content to evaluate semantic similarity. In Proceedings of the 14th International Joint Conference on Artificial Intelligence, pages 448-453, Montreal.
Roach BJ, Mathalon DH (2008). Event-related EEG time-frequency analysis: an overview of measures and an analysis of early gamma band phase locking in schizophrenia. Schizophr Bull 2008 Sep;34(5):907-26. Epub 2008 Aug 6.
Scott D, Suppes P (1958). Foundational aspects of theories of measurement. The Journal of Symbolic Logic 23, 113-128. MR0115919.
Suppes P (1972). Axiomatic Set Theory. New York: Dover.
Suppes P (1974). The axiomatic method in the empirical sciences. In L. Henkin et al. (Eds.), Proceedings of the Tarski Symposium, Proceedings of Symposia in Pure Mathematics 25. Providence, RI: American Mathematical Society, pp. 465-479.
Suppes P, Lu Z-L, Han B (1997). Brain wave recognition of words. Proceedings of the National Academy of Sciences 94, 14965-14969.
Suppes P, Han B, Lu Z-L (1998). Brain-wave recognition of sentences. Proceedings of the National Academy of Sciences 95, 15861-15866.
Suppes P, Han B, Epelboim J, Lu Z-L (1999a). Invariance between subjects of brain wave representations of language. Proceedings of the National Academy of Sciences 96, 12953-12958.
Suppes P, Han B, Epelboim J, Lu Z-L (1999b). Invariance of brain-wave representations of simple visual images and their names. Proceedings of the National Academy of Sciences 96, 14658-14663.
Suppes P, Han B (2000). Brain-wave representation of words by superposition of a few sine waves. Proceedings of the National Academy of Sciences 97, 8738-8743.
Suppes P, Perreau-Guimaraes M, Wong DK (2009). Partial Orders of Similarity Differences Invariant Between EEG-Recorded Brain and Perceptual Representations of Language. Neural Computation 21, 3228-3269.
Suppes P, de Barros JA, Oas G (2012). Phase-Oscillator computations as neural models of stimulus-response conditioning and response selection. Journal of Mathematical Psychology 56, 95-117.
Turney P (2001). Mining the Web for Synonyms: PMI-IR versus LSA on TOEFL. In L. De Raedt & P. Flach (Eds.), Proceedings of the Twelfth European Conference on Machine Learning (ECML-2001) (pp. 491-502). Freiburg, Germany.
Vassilieva E, Pinto G, de Barros JA, Suppes P (2011). Learning Pattern Recognition Through Quasi-Synchronization of Phase Oscillators. IEEE Transactions on Neural Networks 22(1), January 2011.
Vigliocco G, Warren J, Siri S, Arciuli J, Scott S, Wise R (2006). The role of semantics and grammatical class in the neural representation of words. Cereb Cortex 2006 Dec;16(12):1790-6. Epub 2006 Jan 18.
Wong DK, Perreau-Guimaraes M, Uy ET, Suppes P (2004). Classification of individual trials based on the best independent component of EEG-recorded sentences. Neurocomputing 61, 479-484.
Wong DK, Uy ET, Perreau-Guimaraes M, Yang W, Suppes P (2006). Interpretation of perceptron weights as constructed time series for EEG classification. Neurocomputing 70, 373-383.
Wong DK, Grosenick L, Uy ET, Perreau-Guimaraes M, Carvalhaes CG, Desain P, Suppes P (2008). Quantifying inter-subject agreement in brain-imaging analyses. NeuroImage 39, 1051-1063.
Woods W (1975). What's in a link: Foundations for semantic networks. In Bobrow D, Collins A (Eds.), Representation and understanding. New York: Academic Press, 1975:35-82.
Wu Z, Palmer M (1994). Verb Semantics and Lexical Selection. In Proceedings of the 32nd Annual Meeting of the Association for Computational Linguistics, Las Cruces, New Mexico, pp. 133-138.
Linguistics Murphy B Poesio M Bovolo F Bruzzone L Dalponte M Lakany H (2011) EEG decoding of semantic category reveals distributed representations for single concepts Brain Lang 2011 Apr117(1)12-22 doi 101016jbandl201009013
44 copy Colleen E Crangle 2013
Pedersen T Patwardhan S Michelizzi J (2004) WordNetSimilarity - Measuring the Relatedness of Concepts In Proceedings of Fifth Annual Meeting of the North American Chapter of the Association for Computational Linguistics (NAACL-2004) pp 38-41 Boston May 2004 Pereira F M Botvinick G Detre (2010) Learning semantic features for fMRI data from definitional text Proceedings of the NAACL HLT 2010 First Workshop on Computational Neurolinguistics June 2010Los Angeles USA Association for Computational Linguistics 1mdash9 httpwwwaclweborganthologyW10-0601 Perreau-Guimaraes M Wong DK Uy ET Grosenick L Suppes P (2007) Single-trial classification of MEG recordings IEEE Transactions on Biomedical Engineering 54 436ndash443 Rabinovitch I (1977) The Scott-Suppes theorem on semiorders J Mathematical Psychology 15 (2) 209ndash212 MR0437404 Resnik P (1995) Using information content to evaluate semantic similarity In Proceedings of the 14th International Joint Conference on Artificial Intelligence pages 448-453 Montreal Roach BJ Mathalon DH (2008) Event-related EEG time-frequency analysis an overview of measures and an analysis of early gamma band phase locking in schizophrenia Schizophr Bull 2008 Sep34(5)907-26 Epub 2008 Aug 6 Scott D Suppes P (1958) Foundational aspects of theories of measurement The Journal of Symbolic Logic 23 113ndash128 MR0115919 Suppes P (1972) Axiomatic Set Theory New York Dover Suppes P (1974) The axiomatic method in the empirical sciences L Henkin et al (Eds) Proceedings of the Tarski Symposium Proceedings of Symposia in Pure Mathematics 25 Providence RI American Mathematical Society pp 465-479 Suppes P Lu Z-L Han B (1997) Brain wave recognition of words Proceedings of the National Academy of Sciences 95 14965ndash14969 Suppes P Han B Lu Z-L (1998) Brain-wave recognition of sentences Proceedings of the National Academy of Sciences 95 15861ndash15866 Suppes P Han B Epelboim J Lu Z-L (1999a) Invariance between subjects of brain wave representations of language 
Proceedings of the National Academy of Sciences 96 12953ndash12958 Suppes P Han B Epelboim J Lu Z-L (1999b) Invariance of brain-wave representations of simple visual images and their names Proceedings of the National Academy of Sciences 96 14658-14663 Suppes P Han B (2000) Brain-wave representation of words by superposition of a few sine waves Proceedings of the National Academy of Sciences 97 8738ndash8743 Suppes P Perreau-Guimaraes M Wong DK (2009) Partial Orders of Similarity Differences Invariant Between EEG-Recorded Brain and Perceptual Representations of Language Neural Computation 21 3228ndash3269
45 copy Colleen E Crangle 2013
Suppes P de Barros JA Oas G (2012) Phase-Oscillator computations as neural models of stimulus-response conditioning and response selection Journal of Mathematical Psychology 56 95-117 Turney P (2001) Mining the Web for Synonyms PMI-IR versus LSA on TOEFL In L De Raedt amp P Flach (Eds) Proceedings of the Twelfth European Conference on Machine Learning (ECML-2001) (pp 491-502) Freiburg Germany Vassilieva E Pinto G de Barros JA Suppes P (2011) Learning Pattern Recognition Through Quasi-Synchronization of Phase Oscillators IEEE Transactions on Neural Networks Vol 22 No 1 January 2011 Vigliocco G Warren J Siri S Arciuli J Scott S Wise R (2006) The role of semantics and grammatical class in the neural representation of words Cereb Cortex 2006 Dec16(12)1790-6 Epub 2006 Jan 18 Wong DK Perreau-Guimaraes M Uy ET Suppes P (2004) Classification of individual trials based on the best independent component of EEG-recorded sentences Neurocomputing 61 479-484 Wong DK Uy ET Perreau-Guimaraes M Yang W Suppes P (2006) Interpretation of perceptron weights as constructed time series for EEG classification Neurocomputing 70 373-383 Wong DK Grosenick L Uy ET Perreau-Guimaraes M Carvalhaes CG Desain P Suppes P (2008) Quantifying inter-subject agreement in brain-imaging analyses NeuroImage 39 1051ndash1063 Woods W (1975) Whatrsquos in a link Foundations for semantic networks In Bobrow D Collins A eds Representation and understanding New York Academic Press 197535-82 Wu Z Palmer M (1994) Verb Semantics and Lexical Selection In Proceedings of the 32nd Annual Meeting of the Association for Computational Linguistics Las Cruces New Mexico pp 133--138
46 copy Colleen E Crangle 2013
WordNet-based semantic similarities and EEG conditional probability estimates for London relative to London (L), Moscow (M), Paris (P), north (n), south (s), east (e), west (w), Germany (G), Poland (Po) and Russia (R).
The Spearman rank correlation for the two sequences in the figure is 0.99, with a one-sided significance of 1.84E-10.
BRAIN DATA
For each word ω, we compute from the conditional probability density estimates a ternary relation R such that R(ω, ω1, ω2) if and only if, with respect to word ω, the conditional probability for word ω1 is smaller than the conditional probability for word ω2; that is, if and only if ω1's similarity difference with ω is smaller than ω2's similarity difference with ω.
R is an ordinal relation of similarity differences: a partial order that is irreflexive, asymmetric, and transitive.
LANGUAGE DATA
For each word ω, we compute from the semantic similarity matrix a ternary relation R such that R(ω, ω1, ω2) if and only if the similarity difference of ω1 with ω is smaller than the similarity difference of ω2 with ω; that is, ω1 is more similar to ω than ω2 is.
R is an ordinal relation of similarity differences: a partial order that is irreflexive, asymmetric, and transitive.
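The language-data relation can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `ternary_relation` and `is_strict_partial_order` are hypothetical, the reference word ω is fixed to London, and the three similarity values are the London entries for Paris, Moscow and Germany from the London table in this talk, read as decimals.

```python
from itertools import permutations


def ternary_relation(sim_to_ref):
    """Return the pairs (w1, w2) for which R(ref, w1, w2) holds,
    i.e. w1 is more similar to the reference word than w2 is.

    sim_to_ref maps each word to its similarity with the reference word.
    """
    return {
        (w1, w2)
        for w1, w2 in permutations(sim_to_ref, 2)
        if sim_to_ref[w1] > sim_to_ref[w2]
    }


def is_strict_partial_order(pairs):
    """Check that a set of ordered pairs is irreflexive, asymmetric and transitive."""
    irreflexive = all(a != b for a, b in pairs)
    asymmetric = all((b, a) not in pairs for a, b in pairs)
    transitive = all(
        (a, d) in pairs
        for a, b in pairs
        for c, d in pairs
        if b == c
    )
    return irreflexive and asymmetric and transitive


# Illustrative values: similarity of each word to London (assumed reading of the table).
london_similarity = {"Paris": 0.466, "Moscow": 0.396, "Germany": 0.322}
relation = ternary_relation(london_similarity)
print(sorted(relation))
print(is_strict_partial_order(relation))
```

Because the relation is built from a strict comparison of numbers, words with equal similarity to ω are simply left unordered, which is why the result is a partial order and not necessarily a total one.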
London

Language data        Brain data
London   1.000       London   0.275
Paris    0.466       Paris    0.133
Moscow   0.396       Moscow   0.108
Germany  0.322       Germany  0.075
Russia   0.303       north    0.042
Poland   0.299       Russia   0.033
north    0.106       Poland   0.025
south    0.103       east     0.008
west     0.078       south    0.000
east     0.076       west     0.000

Partial orders for London derived from the WordNet semantic similarities of Table 2 and the conditional probability estimates for the brain data of Table 5.
[Figure: the two partial-order diagrams for London. Node labels recovered for the language data: Poland, north, south, west, east; for the brain data: north, Poland, east, south, west]
We follow the approach described in Suppes (1974) for the axiomatization of the theory of differences in utility preference or the theory of differences in psychological intensity.
Suppes P (1974). The axiomatic method in the empirical sciences. In L Henkin et al. (Eds), Proceedings of the Tarski Symposium, Proceedings of Symposia in Pure Mathematics 25. Providence, RI: American Mathematical Society, pp. 465–479.
Brain data: the relational structure (A, R) constructed from R and the finite set A of classes ω1, ω2, …, ωN, together with the N partial orders constructed from the N-by-N estimate for the conditional probability densities.
Language data: the relational structure (A, R) constructed from R and the finite set A of classes ω1, ω2, …, ωN, together with the N partial orders constructed from the N-by-N similarity matrix.
For each word ωi, we compare the partial order of the brain data with the partial order of the language data using Spearman's rank correlation coefficient, which we interpret in the usual way to determine whether the correlation is statistically significant.
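The rank comparison can be illustrated with a small, dependency-free Python sketch. It is only an approximation of the procedure described here: the helper names `average_ranks` and `spearman` are hypothetical, tied values are given their average rank, the input values are the London entries from the London table read as decimals, and the significance test is left out.

```python
def average_ranks(values):
    """Rank values from smallest (rank 1) to largest, giving ties their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: the Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Word order: London, Paris, Moscow, Germany, Russia, Poland, north, south, west, east
language = [1.000, 0.466, 0.396, 0.322, 0.303, 0.299, 0.106, 0.103, 0.078, 0.076]
brain = [0.275, 0.133, 0.108, 0.075, 0.033, 0.025, 0.042, 0.000, 0.000, 0.008]

rho = spearman(language, brain)
print(round(rho, 3))
```

In practice a statistics library would normally be used so that a p-value comes with the coefficient; the hand-written version is shown only to make the ranking step explicit.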
For those instances in which the brain and language partial orders are significantly correlated, we find the partial order that is invariant with respect to the brain and language data. Here are two more examples.
[Figure: invariant partial order over London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia] Significant Invariance - Paris - Spearman 0.88 (p = 6.6795e-004)
[Figure: invariant partial order over London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia] Significant Invariance - Paris - Spearman 0.90 (p = 3.8716e-004)
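One simple way to picture an invariant partial order is as the set of orderings on which both data sources agree. The sketch below makes that assumption explicit: it treats the invariant order as the intersection of the two strict orders, which may differ from the construction actually used in the talk. The names `strict_order` and `invariant_order` are hypothetical, and the values are the London entries from the London table read as decimals.

```python
from itertools import permutations


def strict_order(scores):
    """All pairs (a, b) where a scores strictly higher than b."""
    return {(a, b) for a, b in permutations(scores, 2) if scores[a] > scores[b]}


def invariant_order(language_scores, brain_scores):
    """Pairs ordered the same way by both sources (assumed definition of invariance)."""
    return strict_order(language_scores) & strict_order(brain_scores)


language_scores = {
    "London": 1.000, "Paris": 0.466, "Moscow": 0.396, "Germany": 0.322,
    "Russia": 0.303, "Poland": 0.299, "north": 0.106, "south": 0.103,
    "west": 0.078, "east": 0.076,
}
brain_scores = {
    "London": 0.275, "Paris": 0.133, "Moscow": 0.108, "Germany": 0.075,
    "north": 0.042, "Russia": 0.033, "Poland": 0.025, "east": 0.008,
    "south": 0.000, "west": 0.000,
}

invariant = invariant_order(language_scores, brain_scores)
print(("Paris", "Moscow") in invariant)   # both sources place Paris above Moscow
print(("Russia", "north") in invariant)   # the sources disagree on this pair
```

The intersection of two strict partial orders is again irreflexive, asymmetric and transitive, so the result can be drawn as a partial-order diagram in which disputed pairs are simply left unconnected.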
Brain data
[Figure: hierarchical cluster tree; axis ticks 0.2 to 0.6; leaf order: London, Paris, Moscow, Germany, Poland, Russia, north, south, west, east]
Another hierarchical cluster tree (similarity tree) computed from the conditional probability density estimates for the classification of 640 brain wave samples for London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia.
Every single-trial classification produces its own conditional probability density estimates, giving rise to its own similarity tree …
[Figure: hierarchical cluster tree; axis ticks 0.1 to 0.5; leaf order: north, east, south, west, London, Paris, Germany, Moscow, Russia, Poland]
1. We compute M (= 30) single-trial classifications of the data (640 data samples for our 10 words) using random resampling with replacement.
2. For each classification, we find for each word the partial orders of the brain and language data that are significantly correlated.
3. For each of these highly correlated partial order pairs, we find the partial order invariant with respect to both.
We performed 60 classifications; that is, we recomputed the classifications of the brain data using random resampling with replacement.
For half of these 60 classifications we compared the brain data to the WordNet data, and for the other half we compared the brain data to the LSA data.
And we plotted the results.
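The resampling loop in the three steps above can be outlined as follows. This is a sketch under assumptions, not the pipeline used in the study: `bootstrap_rounds` and `count_significant` are hypothetical names, the classifier and the significance test are replaced by a caller-supplied function `is_significant(indices, word)`, and only the figures 640 samples, 10 words and M = 30 come from the slides.

```python
import random

WORDS = ["London", "Moscow", "Paris", "north", "south",
         "east", "west", "Germany", "Poland", "Russia"]


def bootstrap_rounds(n_samples, n_rounds, seed=0):
    """Draw n_rounds index lists, each sampling n_samples items with replacement."""
    rng = random.Random(seed)
    return [rng.choices(range(n_samples), k=n_samples) for _ in range(n_rounds)]


def count_significant(rounds, words, is_significant):
    """Count (round, word) pairs whose brain and language orders correlate.

    is_significant(indices, word) is a hypothetical stand-in for: classify the
    resampled brain data, derive the partial order for `word`, and test its
    Spearman correlation with the language-data partial order.
    """
    return sum(1 for indices in rounds for word in words if is_significant(indices, word))


rounds = bootstrap_rounds(n_samples=640, n_rounds=30)

# Placeholder test that accepts everything, to show the upper bound on the count.
upper_bound = count_significant(rounds, WORDS, lambda indices, word: True)
print(upper_bound)
```

With 30 classifications and 10 words there are at most 300 word-level comparisons per semantic resource, so a count produced this way for one participant cannot exceed 300.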
[Bar chart: Partial Orders of Similarity Differences Invariant between EEG-recorded Brain Data and Semantic Representations of Language. Vertical axis: Significant Invariant Partial Orders (0 to 300). Horizontal axis: Test Participants (S10, S24, S25, S27, S16, S26, S12, S13, S18). Series: WordNet, LSA]
A higher number of significant structural similarities was found between the brain data and the WordNet data than between the brain data and the LSA data.
This stronger structural similarity between the brain data and the WordNet-derived data supports the contention that, during language comprehension for the complex cognitive task of assessing truth or falsity, the representation of words in the brain has a WordNet-like quality.
This stronger structural similarity is the first evidence we have seen of WordNet-like representations in the brain.
Banerjee S, Pedersen T (2003). Extended Gloss Overlaps as a Measure of Semantic Relatedness. In Proceedings of the Eighteenth International Joint Conference on Artificial Intelligence, pp. 805–810, August 9–15, 2003, Acapulco, Mexico.
Banerjee S, Pedersen T (2002). An Adapted Lesk Algorithm for Word Sense Disambiguation using WordNet. In Proceedings of the Third International Conference on Intelligent Text Processing and Computational Linguistics, pp. 136–145, February 17–23, 2002, Mexico City.
Baroni M, Murphy B, Barbu E, Poesio M (2010). Strudel: A Corpus-Based Semantic Model Based on Properties and Types. Cognitive Science, vol. 34, no. 2, pp. 222–254. DOI: 10.1111/j.1551-6709.2009.01068.x
Blei D, Ng A, Jordan M (2003). Latent Dirichlet allocation. Journal of Machine Learning Research 3, 993–1022.
Brants T, Franz A (2006). www.ldc.upenn.edu/Catalog/CatalogEntry.jsp?catalogId=LDC2006T13. Linguistic Data Consortium, Philadelphia.
Budanitsky A, Hirst G (2001). Semantic distance in WordNet: an experimental, application-oriented evaluation of five measures. In Proceedings of the NAACL 2001 Workshop on WordNet and Other Lexical Resources, Pittsburgh, June 2001. http://www.seas.smu.edu/~rada/mwnw/papers/WNW-NAACL-101.pdf.gz
Collins AM, Loftus EF (1975). A spreading-activation theory of semantic processing. Psychological Review, 1975 Nov, Vol. 82(6), 407–428.
Crestani F (1997). Application of Spreading Activation Techniques in Information Retrieval. Artificial Intelligence Review 11, 453–482.
Deerwester S, Dumais ST, Furnas GW, Landauer TK, Harshman R (1990). Indexing By Latent Semantic Analysis. Journal of the American Society for Information Science 41, 391–407.
Devereux B, Kelly C, Korhonen A (2010). Using fMRI activation to conceptual stimuli to evaluate methods for extracting conceptual representations from corpora. Proceedings of the NAACL HLT 2010 First Workshop on Computational Neurolinguistics, June 2010, Los Angeles, USA. Association for Computational Linguistics, 70–78.
Fellbaum C (Ed) (1998). WordNet: An Electronic Lexical Database. Cambridge, MA: MIT Press.
Hauk O, Davis MH, Ford M, Pulvermüller F, Marslen-Wilson WD (2006). The time course of visual word recognition as revealed by linear regression analysis of ERP data. Neuroimage 2006 May 1;30(4):1383–400. Epub 2006 Feb 7.
Hirst G (1988). Resolving lexical ambiguity computationally with spreading activation and Polaroid Words. In Small SL, Cottrell GW, Tanenhaus MK (Eds), Lexical ambiguity resolution: Perspectives from psycholinguistics, neuropsychology, and artificial intelligence. San Mateo, CA: Morgan Kaufmann Publishers, 1988, 73–107.
Jelodar AB, Alizadeh M, Khadivi S (2010). WordNet Based Features for Predicting Brain Activity associated with meanings of noun. Proceedings of the NAACL HLT 2010 First Workshop on Computational Neurolinguistics, June 2010, Los Angeles, USA. Association for Computational Linguistics, 18–26.
Jiang J, Conrath D (1997). Semantic similarity based on corpus statistics and lexical taxonomy. In Proceedings of International Conference on Research in Computational Linguistics, Taiwan, 19–33.
Just MA, Cherkassky VL, Aryal S, Mitchell TM (2010). A Neurosemantic Theory of Concrete Noun Representation Based on the Underlying Brain Codes. PLoS ONE 5(1): e8622. doi:10.1371/journal.pone.0008622
Karp RM (1972). Reducibility among combinatorial problems. In Complexity of Computer Computations, Proc. Sympos. IBM Thomas J. Watson Res. Center, Yorktown Heights, NY (Ed. R. E. Miller and J. W. Thatcher). New York: Plenum, pp. 85–103.
Kelly C, Devereux B, Korhonen A (2010). Acquiring Human-like Feature-Based Conceptual Representations from Corpora. Proceedings of the NAACL HLT 2010 First Workshop on Computational Neurolinguistics, June 2010, Los Angeles, USA. Association for Computational Linguistics, 61–69.
Kriegeskorte N, Mur M, Bandettini PA (2008). Representational similarity analysis – connecting the branches of systems neuroscience. Frontiers in Systems Neuroscience. doi:10.3389/neuro.06.004.2008
Kutas M, Hillyard SA (1984). Brain potentials during reading reflect word expectancy and semantic association. Nature 307, 161–163.
Kutas M, Federmeier KD (2011). Thirty years and counting: Finding meaning in the N400 component of the event related brain potential (ERP). Annual Review of Psychology 2011, 62, pp. 621–647.
Landauer TK, Dumais ST (1997). A solution to Plato's problem: The Latent Semantic Analysis theory of the acquisition, induction, and representation of knowledge. Psychological Review 104, 211–240.
Landauer TK, Foltz PW, Laham D (1998). Introduction to Latent Semantic Analysis. Discourse Processes 25, 259–284.
Leacock C, Chodorow M (1998). Combining local context and WordNet similarity for word sense identification. In Fellbaum (Ed), WordNet: An Electronic Lexical Database. Cambridge, MA: MIT Press, pp. 265–283.
Leech G, Rayson P, Wilson A (2001). Word Frequencies in Written and Spoken English: based on the British National Corpus. Longman, London.
Lin D (1998). An information-theoretic definition of similarity. In Proc. 15th International Conf. on Machine Learning. Morgan Kaufmann, San Francisco, CA, pp. 296–304.
Luce RD (1956). Semiorders and a theory of utility discrimination. Econometrica 24, 178–191. MR0078632. http://www.jstor.org/stable/1905751
Matsunaga T, Yonemori C, Tomita E, Muramatsu M (2009). Clique-based data mining for related genes in a biomedical database. BMC Bioinformatics 2009 Jul 1;10:205.
Martin Ph (2003). Correction and Extension of WordNet 1.7. ICCS 2003, 11th International Conference on Conceptual Structures (© Springer Verlag, LNAI 2746, pp. 160–173), Dresden, Germany, July 21–25, 2003.
Miller GA, Nicely P (1955). An analysis of perceptual confusions among some English consonants. J. Acoust. Soc. Am. 27(2).
Miller GA, Leacock C, Tengi R, Bunker RT (1993). A Semantic Concordance. In Proceedings of the 3 DARPA Workshop on Human Language Technology.
Miller GA (1995). WordNet: A Lexical Database for English. Communications of the ACM, Vol. 38, No. 11, 39–41.
Mitchell TM, Shinkareva SV, Carlson A, Chang KM, Malave VL, Mason RA, Just MA (2008). Predicting Human Brain Activity Associated with the Meanings of Nouns. Science 320, 1191, May 30, 2008. DOI: 10.1126/science.1152876
Murphy B, Baroni M, Poesio M (2009). EEG responds to conceptual stimuli and corpus semantics. In Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing, pages 619–627, Singapore, 6–7 August 2009. © 2009 ACL and AFNLP.
Murphy B, Poesio M (2010). Detecting semantic category in simultaneous EEG/MEG recordings. In First workshop on computational neurolinguistics, NAACL HLT 2010 (pp. 36–44). Los Angeles: Association for Computational Linguistics.
Murphy B, Poesio M, Bovolo F, Bruzzone L, Dalponte M, Lakany H (2011). EEG decoding of semantic category reveals distributed representations for single concepts. Brain Lang 2011 Apr;117(1):12–22. doi:10.1016/j.bandl.2010.09.013
Pedersen T, Patwardhan S, Michelizzi J (2004). WordNet::Similarity - Measuring the Relatedness of Concepts. In Proceedings of Fifth Annual Meeting of the North American Chapter of the Association for Computational Linguistics (NAACL-2004), pp. 38–41, Boston, May 2004.
Pereira F, Botvinick M, Detre G (2010). Learning semantic features for fMRI data from definitional text. Proceedings of the NAACL HLT 2010 First Workshop on Computational Neurolinguistics, June 2010, Los Angeles, USA. Association for Computational Linguistics, 1–9. http://www.aclweb.org/anthology/W10-0601
Perreau-Guimaraes M, Wong DK, Uy ET, Grosenick L, Suppes P (2007). Single-trial classification of MEG recordings. IEEE Transactions on Biomedical Engineering 54, 436–443.
Rabinovitch I (1977). The Scott-Suppes theorem on semiorders. J. Mathematical Psychology 15(2), 209–212. MR0437404.
Resnik P (1995). Using information content to evaluate semantic similarity. In Proceedings of the 14th International Joint Conference on Artificial Intelligence, pages 448–453, Montreal.
Roach BJ, Mathalon DH (2008). Event-related EEG time-frequency analysis: an overview of measures and an analysis of early gamma band phase locking in schizophrenia. Schizophr Bull 2008 Sep;34(5):907–26. Epub 2008 Aug 6.
Scott D, Suppes P (1958). Foundational aspects of theories of measurement. The Journal of Symbolic Logic 23, 113–128. MR0115919.
Suppes P (1972). Axiomatic Set Theory. New York: Dover.
Suppes P (1974). The axiomatic method in the empirical sciences. In L Henkin et al. (Eds), Proceedings of the Tarski Symposium, Proceedings of Symposia in Pure Mathematics 25. Providence, RI: American Mathematical Society, pp. 465–479.
Suppes P, Lu Z-L, Han B (1997). Brain wave recognition of words. Proceedings of the National Academy of Sciences 94, 14965–14969.
Suppes P, Han B, Lu Z-L (1998). Brain-wave recognition of sentences. Proceedings of the National Academy of Sciences 95, 15861–15866.
Suppes P, Han B, Epelboim J, Lu Z-L (1999a). Invariance between subjects of brain wave representations of language. Proceedings of the National Academy of Sciences 96, 12953–12958.
Suppes P, Han B, Epelboim J, Lu Z-L (1999b). Invariance of brain-wave representations of simple visual images and their names. Proceedings of the National Academy of Sciences 96, 14658–14663.
Suppes P, Han B (2000). Brain-wave representation of words by superposition of a few sine waves. Proceedings of the National Academy of Sciences 97, 8738–8743.
Suppes P, Perreau-Guimaraes M, Wong DK (2009). Partial Orders of Similarity Differences Invariant Between EEG-Recorded Brain and Perceptual Representations of Language. Neural Computation 21, 3228–3269.
Suppes P, de Barros JA, Oas G (2012). Phase-Oscillator computations as neural models of stimulus-response conditioning and response selection. Journal of Mathematical Psychology 56, 95–117.
Turney P (2001). Mining the Web for Synonyms: PMI-IR versus LSA on TOEFL. In L De Raedt & P Flach (Eds), Proceedings of the Twelfth European Conference on Machine Learning (ECML-2001) (pp. 491–502). Freiburg, Germany.
Vassilieva E, Pinto G, de Barros JA, Suppes P (2011). Learning Pattern Recognition Through Quasi-Synchronization of Phase Oscillators. IEEE Transactions on Neural Networks, Vol. 22, No. 1, January 2011.
Vigliocco G, Warren J, Siri S, Arciuli J, Scott S, Wise R (2006). The role of semantics and grammatical class in the neural representation of words. Cereb Cortex 2006 Dec;16(12):1790–6. Epub 2006 Jan 18.
Wong DK, Perreau-Guimaraes M, Uy ET, Suppes P (2004). Classification of individual trials based on the best independent component of EEG-recorded sentences. Neurocomputing 61, 479–484.
Wong DK, Uy ET, Perreau-Guimaraes M, Yang W, Suppes P (2006). Interpretation of perceptron weights as constructed time series for EEG classification. Neurocomputing 70, 373–383.
Wong DK, Grosenick L, Uy ET, Perreau-Guimaraes M, Carvalhaes CG, Desain P, Suppes P (2008). Quantifying inter-subject agreement in brain-imaging analyses. NeuroImage 39, 1051–1063.
Woods W (1975). What's in a link: Foundations for semantic networks. In Bobrow D, Collins A (Eds), Representation and understanding. New York: Academic Press, 1975, 35–82.
Wu Z, Palmer M (1994). Verb Semantics and Lexical Selection. In Proceedings of the 32nd Annual Meeting of the Association for Computational Linguistics, Las Cruces, New Mexico, pp. 133–138.
Linguistics Murphy B Poesio M Bovolo F Bruzzone L Dalponte M Lakany H (2011) EEG decoding of semantic category reveals distributed representations for single concepts Brain Lang 2011 Apr117(1)12-22 doi 101016jbandl201009013
44 copy Colleen E Crangle 2013
Pedersen T Patwardhan S Michelizzi J (2004) WordNetSimilarity - Measuring the Relatedness of Concepts In Proceedings of Fifth Annual Meeting of the North American Chapter of the Association for Computational Linguistics (NAACL-2004) pp 38-41 Boston May 2004 Pereira F M Botvinick G Detre (2010) Learning semantic features for fMRI data from definitional text Proceedings of the NAACL HLT 2010 First Workshop on Computational Neurolinguistics June 2010Los Angeles USA Association for Computational Linguistics 1mdash9 httpwwwaclweborganthologyW10-0601 Perreau-Guimaraes M Wong DK Uy ET Grosenick L Suppes P (2007) Single-trial classification of MEG recordings IEEE Transactions on Biomedical Engineering 54 436ndash443 Rabinovitch I (1977) The Scott-Suppes theorem on semiorders J Mathematical Psychology 15 (2) 209ndash212 MR0437404 Resnik P (1995) Using information content to evaluate semantic similarity In Proceedings of the 14th International Joint Conference on Artificial Intelligence pages 448-453 Montreal Roach BJ Mathalon DH (2008) Event-related EEG time-frequency analysis an overview of measures and an analysis of early gamma band phase locking in schizophrenia Schizophr Bull 2008 Sep34(5)907-26 Epub 2008 Aug 6 Scott D Suppes P (1958) Foundational aspects of theories of measurement The Journal of Symbolic Logic 23 113ndash128 MR0115919 Suppes P (1972) Axiomatic Set Theory New York Dover Suppes P (1974) The axiomatic method in the empirical sciences L Henkin et al (Eds) Proceedings of the Tarski Symposium Proceedings of Symposia in Pure Mathematics 25 Providence RI American Mathematical Society pp 465-479 Suppes P Lu Z-L Han B (1997) Brain wave recognition of words Proceedings of the National Academy of Sciences 95 14965ndash14969 Suppes P Han B Lu Z-L (1998) Brain-wave recognition of sentences Proceedings of the National Academy of Sciences 95 15861ndash15866 Suppes P Han B Epelboim J Lu Z-L (1999a) Invariance between subjects of brain wave representations of language 
Proceedings of the National Academy of Sciences 96 12953ndash12958 Suppes P Han B Epelboim J Lu Z-L (1999b) Invariance of brain-wave representations of simple visual images and their names Proceedings of the National Academy of Sciences 96 14658-14663 Suppes P Han B (2000) Brain-wave representation of words by superposition of a few sine waves Proceedings of the National Academy of Sciences 97 8738ndash8743 Suppes P Perreau-Guimaraes M Wong DK (2009) Partial Orders of Similarity Differences Invariant Between EEG-Recorded Brain and Perceptual Representations of Language Neural Computation 21 3228ndash3269
45 copy Colleen E Crangle 2013
Suppes P de Barros JA Oas G (2012) Phase-Oscillator computations as neural models of stimulus-response conditioning and response selection Journal of Mathematical Psychology 56 95-117 Turney P (2001) Mining the Web for Synonyms PMI-IR versus LSA on TOEFL In L De Raedt amp P Flach (Eds) Proceedings of the Twelfth European Conference on Machine Learning (ECML-2001) (pp 491-502) Freiburg Germany Vassilieva E Pinto G de Barros JA Suppes P (2011) Learning Pattern Recognition Through Quasi-Synchronization of Phase Oscillators IEEE Transactions on Neural Networks Vol 22 No 1 January 2011 Vigliocco G Warren J Siri S Arciuli J Scott S Wise R (2006) The role of semantics and grammatical class in the neural representation of words Cereb Cortex 2006 Dec16(12)1790-6 Epub 2006 Jan 18 Wong DK Perreau-Guimaraes M Uy ET Suppes P (2004) Classification of individual trials based on the best independent component of EEG-recorded sentences Neurocomputing 61 479-484 Wong DK Uy ET Perreau-Guimaraes M Yang W Suppes P (2006) Interpretation of perceptron weights as constructed time series for EEG classification Neurocomputing 70 373-383 Wong DK Grosenick L Uy ET Perreau-Guimaraes M Carvalhaes CG Desain P Suppes P (2008) Quantifying inter-subject agreement in brain-imaging analyses NeuroImage 39 1051ndash1063 Woods W (1975) Whatrsquos in a link Foundations for semantic networks In Bobrow D Collins A eds Representation and understanding New York Academic Press 197535-82 Wu Z Palmer M (1994) Verb Semantics and Lexical Selection In Proceedings of the 32nd Annual Meeting of the Association for Computational Linguistics Las Cruces New Mexico pp 133--138
46 copy Colleen E Crangle 2013
For each word ω, we compute from the semantic similarity matrix a ternary relation R such that R(ω, ω1, ω2) if and only if the similarity difference of ω1 with ω is smaller than the similarity difference of ω2 with ω; that is, ω1 is more similar to ω than ω2 is.
R is an ordinal relation of similarity differences: a partial order that is irreflexive, asymmetric, and transitive.
LANGUAGE DATA
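A minimal sketch of how the relation R might be read off one row of a similarity matrix. It assumes that a smaller similarity difference corresponds to a larger similarity value; the function name `ternary_relation` and the dictionary layout are illustrative choices, not taken from the talk. The four values are the London language-data similarities quoted in this talk.

```python
from itertools import permutations

def ternary_relation(sim, w):
    """Return all pairs (w1, w2) such that R(w, w1, w2) holds,
    i.e. w1 is strictly more similar to w than w2 is.
    Assumption: larger similarity value = smaller similarity difference."""
    others = [x for x in sim[w] if x != w]
    return {(w1, w2) for w1, w2 in permutations(others, 2)
            if sim[w][w1] > sim[w][w2]}

# One row of a similarity matrix (London against a few other words).
sim = {"London": {"London": 1.000, "Paris": 0.466, "Moscow": 0.396, "north": 0.106}}

R_london = ternary_relation(sim, "London")
for w1, w2 in sorted(R_london):
    print(f"R(London, {w1}, {w2})")
```

Because the comparison is strict, the resulting relation is irreflexive and asymmetric by construction, and transitivity follows from the ordering of the numbers.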
London

Language data          Brain data
London    1.000        London    0.275
Paris     0.466        Paris     0.133
Moscow    0.396        Moscow    0.108
Germany   0.322        Germany   0.075
Russia    0.303        north     0.042
Poland    0.299        Russia    0.033
north     0.106        Poland    0.025
south     0.103        east      0.008
west      0.078        south     0.000
east      0.076        west      0.000

Partial orders for London derived from the WordNet semantic similarities of Table 2 and the conditional probability estimates for the brain data of Table 5.

[Figure: partial-order diagrams for London, one for the language data and one for the brain data]
Following the approach described in Suppes (1974) for the axiomatization of the theory of differences in utility preference or the theory of differences in psychological intensity, we work with two relational structures.

Brain data: the relational structure (A, R) constructed from R and the finite set A of classes ω1, ω2, …, ωN, together with the N partial orders constructed from the N-by-N estimate for the conditional probability densities.

Language data: the relational structure (A, R) constructed from R and the finite set A of classes ω1, ω2, …, ωN, together with the N partial orders constructed from the N-by-N similarity matrix.

For each ωi, we compare the partial order of the brain data with the partial order of the language data using Spearman's rank correlation coefficient, which we interpret in the usual way to determine whether or not the correlation is statistically significant.

Suppes P (1974) The axiomatic method in the empirical sciences. In L Henkin et al (Eds), Proceedings of the Tarski Symposium, Proceedings of Symposia in Pure Mathematics 25. Providence, RI: American Mathematical Society, pp 465-479.
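The comparison step can be sketched as follows, using the London language and brain values listed in this talk. This is an illustration under assumptions: it uses only the standard library, gives tied values their average rank, and includes all ten words (the slides do not say whether the target word itself is included). The helper names are hypothetical. It computes the coefficient only; a significance test would additionally need a p-value, which a statistics package such as SciPy's `scipy.stats.spearmanr` reports.

```python
from math import sqrt

def average_ranks(values):
    """Rank values in descending order (1 = largest); tied values share the average rank."""
    order = sorted(range(len(values)), key=lambda i: -values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / sqrt(vx * vy)

words = ["London", "Paris", "Moscow", "Germany", "Russia",
         "Poland", "north", "south", "west", "east"]
language = [1.000, 0.466, 0.396, 0.322, 0.303, 0.299, 0.106, 0.103, 0.078, 0.076]
brain = [0.275, 0.133, 0.108, 0.075, 0.033, 0.025, 0.042, 0.000, 0.000, 0.008]

rho = spearman(language, brain)
print(f"Spearman rho for London: {rho:.3f}")
```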
For those instances in which the brain and language partial orders are significantly correlated, we find the partial order that is invariant with respect to the brain and language data. Here are two more examples.

[Figure: invariant partial order over London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia]
Significant Invariance - Paris - Spearman 0.88 (p = 6.6795e-004)

[Figure: invariant partial order over London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia]
Significant Invariance - Paris - Spearman 0.90 (p = 3.8716e-004)
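The slides do not spell out how the invariant partial order is constructed. One plausible reading, assumed here purely for illustration, is to keep exactly those comparisons on which the brain and language orders agree, that is, the intersection of the two strict orders. The sketch below applies that reading to the London values listed in this talk; `strict_order` is a hypothetical helper name.

```python
from itertools import permutations

def strict_order(scores):
    """All pairs (a, b) where a scores strictly higher than b."""
    return {(a, b) for a, b in permutations(scores, 2) if scores[a] > scores[b]}

# Similarity of each word to London (target word omitted).
language = {"Paris": 0.466, "Moscow": 0.396, "Germany": 0.322, "Russia": 0.303,
            "Poland": 0.299, "north": 0.106, "south": 0.103, "west": 0.078,
            "east": 0.076}
brain = {"Paris": 0.133, "Moscow": 0.108, "Germany": 0.075, "Russia": 0.033,
         "Poland": 0.025, "north": 0.042, "south": 0.000, "west": 0.000,
         "east": 0.008}

# Assumed construction: comparisons that hold in both data sets.
invariant = strict_order(language) & strict_order(brain)

print(("Paris", "Moscow") in invariant)   # both data sets rank Paris above Moscow
print(("Russia", "north") in invariant)   # the data sets disagree on this pair
```

The intersection of two strict partial orders is again irreflexive, asymmetric, and transitive, so the result is itself a partial order; pairs on which the two data sets disagree, or that are tied in either one, are simply left unordered.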
Brain data

[Figure: hierarchical cluster tree; leaf order London, Paris, Moscow, Germany, Poland, Russia, north, south, west, east; axis from 0.2 to 0.6]
Another hierarchical cluster tree (similarity tree) computed from the conditional probability density estimates for the classification of 640 brain wave samples for London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia.

Every single-trial classification produces its own conditional probability density estimates, giving rise to its own similarity tree …

[Figure: hierarchical cluster tree; leaf order north, east, south, west, London, Paris, Germany, Moscow, Russia, Poland; axis from 0.1 to 0.5]
1. We compute M (= 30) single-trial classifications of the data (640 data samples for our 10 words) using random resampling with replacement.
2. For each classification, we find for each word the partial orders of the brain and language data that are significantly correlated.
3. For each of these highly correlated partial order pairs, we find the partial order invariant with respect to both.

We performed 60 classifications; that is, we recomputed the classifications of the brain data using random resampling with replacement. For half of these 60 classifications we compared the brain data to the WordNet data, and for the other half we compared the brain data to the LSA data. And we plotted the results.
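The resampling in step 1 can be sketched as below. The slides give only the counts (30 classifications, 640 samples), so the details here are assumptions: each resample is taken to have the same size as the data set, the seed is arbitrary, and the classifier that would be trained on each resample is left out entirely. `bootstrap_indices` is a hypothetical helper name.

```python
import random

def bootstrap_indices(n_samples, n_rounds, seed=0):
    """Draw n_rounds index lists, each sampling n_samples indices with replacement."""
    rng = random.Random(seed)  # fixed seed so the draws are reproducible
    return [[rng.randrange(n_samples) for _ in range(n_samples)]
            for _ in range(n_rounds)]

# 30 resamples of the 640 brain-wave samples; each list selects the
# trials that one single-trial classification would be computed from.
rounds = bootstrap_indices(n_samples=640, n_rounds=30)
print(len(rounds), len(rounds[0]))
```

Because sampling is with replacement, a given trial can appear several times in one resample and not at all in another, which is why each classification yields its own conditional probability estimates.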
[Figure: Partial Orders of Similarity Differences Invariant between EEG-recorded Brain Data and Semantic Representations of Language. Y-axis: Significant Invariant Partial Orders (0 to 300). X-axis: Test Participants (S10, S24, S25, S27, S16, S26, S12, S13, S18). Series: WordNet, LSA.]
A higher number of significant structural similarities was found between the brain data and the WordNet data than between the brain data and the LSA data.

This stronger structural similarity between the brain data and the WordNet-derived data supports the contention that, during language comprehension for the complex cognitive task of assessing truth or falsity, the representation of words in the brain has a WordNet-like quality.

This stronger structural similarity is the first evidence we have seen of WordNet-like representations in the brain.
Banerjee S, Pedersen T (2003) Extended Gloss Overlaps as a Measure of Semantic Relatedness. In Proceedings of the Eighteenth International Joint Conference on Artificial Intelligence, pp 805-810, August 9-15, 2003, Acapulco, Mexico.
Banerjee S, Pedersen T (2002) An Adapted Lesk Algorithm for Word Sense Disambiguation using WordNet. In Proceedings of the Third International Conference on Intelligent Text Processing and Computational Linguistics, pp 136-145, February 17-23, 2002, Mexico City.
Baroni M, Murphy B, Barbu E, Poesio M (2010) Strudel: A Corpus-Based Semantic Model Based on Properties and Types. Cognitive Science, vol 34, no 2, pp 222-254. DOI 10.1111/j.1551-6709.2009.01068.x
Blei D, Ng A, Jordan M (2003) Latent Dirichlet allocation. Journal of Machine Learning Research 3, 993-1022.
Brants T, Franz A (2006) www.ldc.upenn.edu/Catalog/CatalogEntry.jsp?catalogId=LDC2006T13. Linguistic Data Consortium, Philadelphia.
Budanitsky A, Hirst G (2001) Semantic distance in WordNet: an experimental, application-oriented evaluation of five measures. In Proceedings of the NAACL 2001 Workshop on WordNet and Other Lexical Resources, Pittsburgh, June 2001. http://www.seas.smu.edu/~rada/mwnw/papers/WNW-NAACL-101.pdf.gz
Collins AM, Loftus EF (1975) A spreading-activation theory of semantic processing. Psychological Review 82(6), 407-428.
Crestani F (1997) Application of Spreading Activation Techniques in Information Retrieval. Artificial Intelligence Review 11, 453-482.
Deerwester S, Dumais ST, Furnas GW, Landauer TK, Harshman R (1990) Indexing By Latent Semantic Analysis. Journal of the American Society for Information Science 41, 391-407.
Devereux B, Kelly C, Korhonen A (2010) Using fMRI activation to conceptual stimuli to evaluate methods for extracting conceptual representations from corpora. Proceedings of the NAACL HLT 2010 First Workshop on Computational Neurolinguistics, June 2010, Los Angeles, USA. Association for Computational Linguistics, 70-78.
Fellbaum C (Ed) (1998) WordNet: An Electronic Lexical Database. Cambridge, MA: MIT Press.
Hauk O, Davis MH, Ford M, Pulvermüller F, Marslen-Wilson WD (2006) The time course of visual word recognition as revealed by linear regression analysis of ERP data. Neuroimage 2006 May 1;30(4):1383-400. Epub 2006 Feb 7.
Hirst G (1988) Resolving lexical ambiguity computationally with spreading activation and Polaroid Words. In Small SL, Cottrell GW, Tanenhaus MK (editors), Lexical ambiguity resolution: Perspectives from psycholinguistics, neuropsychology, and artificial intelligence. San Mateo, CA: Morgan Kaufmann Publishers, 1988, 73-107.
Jelodar AB, Alizadeh M, Khadivi S (2010) WordNet Based Features for Predicting Brain Activity associated with meanings of noun. Proceedings of the NAACL HLT 2010 First Workshop on Computational Neurolinguistics, June 2010, Los Angeles, USA. Association for Computational Linguistics, 18-26.
Jiang J, Conrath D (1997) Semantic similarity based on corpus statistics and lexical taxonomy. In Proceedings of International Conference on Research in Computational Linguistics, Taiwan, 19-33.
Just MA, Cherkassky VL, Aryal S, Mitchell TM (2010) A Neurosemantic Theory of Concrete Noun Representation Based on the Underlying Brain Codes. PLoS ONE 5(1): e8622. doi:10.1371/journal.pone.0008622
Karp RM (1972) Reducibility among combinatorial problems. In Complexity of Computer Computations, Proc Sympos IBM Thomas J Watson Res Center, Yorktown Heights, NY (Ed RE Miller and JW Thatcher). New York: Plenum, pp 85-103.
Kelly C, Devereux B, Korhonen A (2010) Acquiring Human-like Feature-Based Conceptual Representations from Corpora. Proceedings of the NAACL HLT 2010 First Workshop on Computational Neurolinguistics, June 2010, Los Angeles, USA. Association for Computational Linguistics, 61-69.
Kriegeskorte N, Mur M, Bandettini PA (2008) Representational similarity analysis - connecting the branches of systems neuroscience. Frontiers in Systems Neuroscience. doi:10.3389/neuro.06.004.2008
Kutas M, Hillyard SA (1984) Brain potentials during reading reflect word expectancy and semantic association. Nature 307, 161-163.
Kutas M, Federmeier KD (2011) Thirty years and counting: Finding meaning in the N400 component of the event related brain potential (ERP). Annual Review of Psychology 62, pp 621-647.
Landauer TK, Dumais ST (1997) A solution to Plato's problem: The Latent Semantic Analysis theory of the acquisition, induction, and representation of knowledge. Psychological Review 104, 211-240.
Landauer TK, Foltz PW, Laham D (1998) Introduction to Latent Semantic Analysis. Discourse Processes 25, 259-284.
Leacock C, Chodorow M (1998) Combining local context and WordNet similarity for word sense identification. In Fellbaum (Ed), WordNet: An Electronic Lexical Database. Cambridge, MA: MIT Press, pp 265-283.
Leech G, Rayson P, Wilson A (2001) Word Frequencies in Written and Spoken English: based on the British National Corpus. Longman, London.
Lin D (1998) An information-theoretic definition of similarity. In Proc 15th International Conf on Machine Learning. Morgan Kaufmann, San Francisco, CA, pp 296-304.
Luce RD (1956) Semiorders and a theory of utility discrimination. Econometrica 24, 178-191. MR0078632. http://www.jstor.org/stable/1905751
Matsunaga T, Yonemori C, Tomita E, Muramatsu M (2009) Clique-based data mining for related genes in a biomedical database. BMC Bioinformatics 2009 Jul 1;10:205.
Martin Ph (2003) Correction and Extension of WordNet 1.7. ICCS 2003, 11th International Conference on Conceptual Structures (Springer Verlag, LNAI 2746, pp 160-173), Dresden, Germany, July 21-25, 2003.
Miller GA, Nicely P (1955) An analysis of perceptual confusions among some English consonants. J Acoust Soc Am 27(2).
Miller GA, Leacock C, Tengi R, Bunker RT (1993) A Semantic Concordance. In Proceedings of the 3 DARPA Workshop on Human Language Technology.
Miller GA (1995) WordNet: A Lexical Database for English. Communications of the ACM, Vol 38, No 11, 39-41.
Mitchell TM, Shinkareva SV, Carlson A, Chang KM, Malave VL, Mason RA, Just MA (2008) Predicting Human Brain Activity Associated with the Meanings of Nouns. Science 320, 1191, May 30, 2008. DOI 10.1126/science.1152876
Murphy B, Baroni M, Poesio M (2009) EEG responds to conceptual stimuli and corpus semantics. In Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing, pages 619-627, Singapore, 6-7 August 2009. ACL and AFNLP.
Murphy B, Poesio M (2010) Detecting semantic category in simultaneous EEG/MEG recordings. In First workshop on computational neurolinguistics, NAACL HLT 2010 (pp 36-44). Los Angeles: Association for Computational Linguistics.
Murphy B, Poesio M, Bovolo F, Bruzzone L, Dalponte M, Lakany H (2011) EEG decoding of semantic category reveals distributed representations for single concepts. Brain Lang 2011 Apr;117(1):12-22. doi:10.1016/j.bandl.2010.09.013
Pedersen T, Patwardhan S, Michelizzi J (2004) WordNet::Similarity - Measuring the Relatedness of Concepts. In Proceedings of Fifth Annual Meeting of the North American Chapter of the Association for Computational Linguistics (NAACL-2004), pp 38-41, Boston, May 2004.
Pereira F, Botvinick M, Detre G (2010) Learning semantic features for fMRI data from definitional text. Proceedings of the NAACL HLT 2010 First Workshop on Computational Neurolinguistics, June 2010, Los Angeles, USA. Association for Computational Linguistics, 1-9. http://www.aclweb.org/anthology/W10-0601
Perreau-Guimaraes M, Wong DK, Uy ET, Grosenick L, Suppes P (2007) Single-trial classification of MEG recordings. IEEE Transactions on Biomedical Engineering 54, 436-443.
Rabinovitch I (1977) The Scott-Suppes theorem on semiorders. J Mathematical Psychology 15(2), 209-212. MR0437404.
Resnik P (1995) Using information content to evaluate semantic similarity. In Proceedings of the 14th International Joint Conference on Artificial Intelligence, pages 448-453, Montreal.
Roach BJ, Mathalon DH (2008) Event-related EEG time-frequency analysis: an overview of measures and an analysis of early gamma band phase locking in schizophrenia. Schizophr Bull 2008 Sep;34(5):907-26. Epub 2008 Aug 6.
Scott D, Suppes P (1958) Foundational aspects of theories of measurement. The Journal of Symbolic Logic 23, 113-128. MR0115919.
Suppes P (1972) Axiomatic Set Theory. New York: Dover.
Suppes P (1974) The axiomatic method in the empirical sciences. In L Henkin et al (Eds), Proceedings of the Tarski Symposium, Proceedings of Symposia in Pure Mathematics 25. Providence, RI: American Mathematical Society, pp 465-479.
Suppes P, Lu Z-L, Han B (1997) Brain wave recognition of words. Proceedings of the National Academy of Sciences 95, 14965-14969.
Suppes P, Han B, Lu Z-L (1998) Brain-wave recognition of sentences. Proceedings of the National Academy of Sciences 95, 15861-15866.
Suppes P, Han B, Epelboim J, Lu Z-L (1999a) Invariance between subjects of brain wave representations of language. Proceedings of the National Academy of Sciences 96, 12953-12958.
Suppes P, Han B, Epelboim J, Lu Z-L (1999b) Invariance of brain-wave representations of simple visual images and their names. Proceedings of the National Academy of Sciences 96, 14658-14663.
Suppes P, Han B (2000) Brain-wave representation of words by superposition of a few sine waves. Proceedings of the National Academy of Sciences 97, 8738-8743.
Suppes P, Perreau-Guimaraes M, Wong DK (2009) Partial Orders of Similarity Differences Invariant Between EEG-Recorded Brain and Perceptual Representations of Language. Neural Computation 21, 3228-3269.
Suppes P, de Barros JA, Oas G (2012) Phase-Oscillator computations as neural models of stimulus-response conditioning and response selection. Journal of Mathematical Psychology 56, 95-117.
Turney P (2001) Mining the Web for Synonyms: PMI-IR versus LSA on TOEFL. In L De Raedt & P Flach (Eds), Proceedings of the Twelfth European Conference on Machine Learning (ECML-2001) (pp 491-502). Freiburg, Germany.
Vassilieva E, Pinto G, de Barros JA, Suppes P (2011) Learning Pattern Recognition Through Quasi-Synchronization of Phase Oscillators. IEEE Transactions on Neural Networks, Vol 22, No 1, January 2011.
Vigliocco G, Warren J, Siri S, Arciuli J, Scott S, Wise R (2006) The role of semantics and grammatical class in the neural representation of words. Cereb Cortex 2006 Dec;16(12):1790-6. Epub 2006 Jan 18.
Wong DK, Perreau-Guimaraes M, Uy ET, Suppes P (2004) Classification of individual trials based on the best independent component of EEG-recorded sentences. Neurocomputing 61, 479-484.
Wong DK, Uy ET, Perreau-Guimaraes M, Yang W, Suppes P (2006) Interpretation of perceptron weights as constructed time series for EEG classification. Neurocomputing 70, 373-383.
Wong DK, Grosenick L, Uy ET, Perreau-Guimaraes M, Carvalhaes CG, Desain P, Suppes P (2008) Quantifying inter-subject agreement in brain-imaging analyses. NeuroImage 39, 1051-1063.
Woods W (1975) What's in a link: Foundations for semantic networks. In Bobrow D, Collins A, eds. Representation and understanding. New York: Academic Press, 1975: 35-82.
Wu Z, Palmer M (1994) Verb Semantics and Lexical Selection. In Proceedings of the 32nd Annual Meeting of the Association for Computational Linguistics, Las Cruces, New Mexico, pp 133-138.
Proceedings of the National Academy of Sciences 96 12953ndash12958 Suppes P Han B Epelboim J Lu Z-L (1999b) Invariance of brain-wave representations of simple visual images and their names Proceedings of the National Academy of Sciences 96 14658-14663 Suppes P Han B (2000) Brain-wave representation of words by superposition of a few sine waves Proceedings of the National Academy of Sciences 97 8738ndash8743 Suppes P Perreau-Guimaraes M Wong DK (2009) Partial Orders of Similarity Differences Invariant Between EEG-Recorded Brain and Perceptual Representations of Language Neural Computation 21 3228ndash3269
45 copy Colleen E Crangle 2013
Suppes P de Barros JA Oas G (2012) Phase-Oscillator computations as neural models of stimulus-response conditioning and response selection Journal of Mathematical Psychology 56 95-117 Turney P (2001) Mining the Web for Synonyms PMI-IR versus LSA on TOEFL In L De Raedt amp P Flach (Eds) Proceedings of the Twelfth European Conference on Machine Learning (ECML-2001) (pp 491-502) Freiburg Germany Vassilieva E Pinto G de Barros JA Suppes P (2011) Learning Pattern Recognition Through Quasi-Synchronization of Phase Oscillators IEEE Transactions on Neural Networks Vol 22 No 1 January 2011 Vigliocco G Warren J Siri S Arciuli J Scott S Wise R (2006) The role of semantics and grammatical class in the neural representation of words Cereb Cortex 2006 Dec16(12)1790-6 Epub 2006 Jan 18 Wong DK Perreau-Guimaraes M Uy ET Suppes P (2004) Classification of individual trials based on the best independent component of EEG-recorded sentences Neurocomputing 61 479-484 Wong DK Uy ET Perreau-Guimaraes M Yang W Suppes P (2006) Interpretation of perceptron weights as constructed time series for EEG classification Neurocomputing 70 373-383 Wong DK Grosenick L Uy ET Perreau-Guimaraes M Carvalhaes CG Desain P Suppes P (2008) Quantifying inter-subject agreement in brain-imaging analyses NeuroImage 39 1051ndash1063 Woods W (1975) Whatrsquos in a link Foundations for semantic networks In Bobrow D Collins A eds Representation and understanding New York Academic Press 197535-82 Wu Z Palmer M (1994) Verb Semantics and Lexical Selection In Proceedings of the 32nd Annual Meeting of the Association for Computational Linguistics Las Cruces New Mexico pp 133--138
46 copy Colleen E Crangle 2013
Following the approach described in Suppes (1974) for the axiomatization of the theory of differences in utility preference or the theory of differences in psychological intensity:
Suppes P (1974). The axiomatic method in the empirical sciences. In L Henkin et al. (Eds.), Proceedings of the Tarski Symposium, Proceedings of Symposia in Pure Mathematics 25. Providence, RI: American Mathematical Society, pp. 465-479.
Brain data: the relational structure (A, R) constructed from R and the finite set A of classes ω1, ω2, …, ωN, together with the N partial orders constructed from the N-by-N estimate of the conditional probability densities.
Language data: the relational structure (A, R) constructed from R and the finite set A of classes ω1, ω2, …, ωN, together with the N partial orders constructed from the N-by-N similarity matrix.
For each ωi we compare the partial order of the brain data with the partial order of the language data using Spearman's rank correlation coefficient, which we interpret in the usual way to determine whether or not the correlation is statistically significant.
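The comparison just described can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' code: the two rows of scores are made-up numbers standing in for one word's row of the brain and language matrices, the function names are hypothetical, and the permutation test is an assumed stand-in for whatever significance test was actually used.

```python
import random

def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation; assumes neither input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5

def spearman(x, y):
    """Spearman's rho is the Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

def permutation_p_value(x, y, n_perm=2000, seed=0):
    """Two-sided Monte Carlo permutation p-value for Spearman's rho."""
    rng = random.Random(seed)
    observed = abs(spearman(x, y))
    y_perm = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(y_perm)
        if abs(spearman(x, y_perm)) >= observed - 1e-12:
            hits += 1
    return (hits + 1) / (n_perm + 1)

# Hypothetical scores of ten words relative to one reference word.
brain_row = [0.91, 0.62, 0.58, 0.12, 0.10, 0.15, 0.09, 0.40, 0.35, 0.44]
language_row = [0.95, 0.70, 0.66, 0.05, 0.08, 0.07, 0.11, 0.30, 0.33, 0.52]

rho = spearman(brain_row, language_row)
p = permutation_p_value(brain_row, language_row)
print(f"Spearman rho = {rho:.2f}, permutation p = {p:.4f}")
```

A rank-based coefficient suits this setting because only the ordering of the similarity scores matters, not their scale, so brain-derived and language-derived scores can be compared directly.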
For those instances in which the brain and language partial orders are significantly correlated, we find the partial order that is invariant with respect to the brain and language data. Here are two more examples.
[Figure: partial-order diagram. Node labels as laid out: London, Moscow / Paris / north / south, east / west / Germany, Poland / Russia. Caption: Significant Invariance - Paris - Spearman 0.88 (p = 6.6795e-004)]
[Figure: partial-order diagram. Node labels as laid out: London, Moscow, Paris / north / south, east / west / Germany, Poland, Russia. Caption: Significant Invariance - Paris - Spearman 0.90 (p = 3.8716e-004)]
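One way to picture an invariant partial order is as the set of ordered pairs on which both orderings agree. The sketch below rests on that assumption; the slides do not spell out the construction. The scores, the threshold parameter, and the function names are all hypothetical. A positive threshold treats nearly equal scores as indistinguishable, in the spirit of a semiorder.

```python
def strict_order(scores, threshold=0.0):
    """Pairs (a, b) where a's score exceeds b's by more than threshold."""
    return {(a, b) for a in scores for b in scores
            if scores[a] > scores[b] + threshold}

def invariant_order(brain_scores, language_scores, threshold=0.0):
    """Pairs ordered the same way in both data sets (set intersection)."""
    return (strict_order(brain_scores, threshold)
            & strict_order(language_scores, threshold))

def covering_pairs(order):
    """Drop pairs implied by transitivity; what remains draws the diagram."""
    items = {x for pair in order for x in pair}
    return {(a, b) for (a, b) in order
            if not any((a, c) in order and (c, b) in order for c in items)}

# Hypothetical similarity-to-Paris scores for four words.
brain = {"London": 0.9, "Moscow": 0.8, "Russia": 0.5, "north": 0.2}
language = {"London": 0.8, "Moscow": 0.85, "Russia": 0.4, "north": 0.1}

common = invariant_order(brain, language)
edges = covering_pairs(common)
for upper, lower in sorted(edges):
    print(f"{upper} is above {lower}")
```

In this toy example London and Moscow are ordered differently by the two data sets, so neither pair survives the intersection and the two words sit side by side at the top of the diagram.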
Another hierarchical cluster tree (similarity tree), computed from the conditional probability density estimates for the classification of 640 brain wave samples for London, Moscow, Paris, north, south, east, west, Germany, Poland, Russia.
[Figure: similarity tree, brain data. Axis ticks 0.2 to 0.6. Leaf order: London, Paris, Moscow, Germany, Poland, Russia, north, south, west, east]
Every single-trial classification produces its own conditional probability density estimates, giving rise to its own similarity tree …
[Figure: similarity tree, brain data. Axis ticks 0.1 to 0.5. Leaf order: north, east, south, west, London, Paris, Germany, Moscow, Russia, Poland]
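A similarity tree of this kind can be built by agglomerative clustering. The sketch below is a minimal illustration under stated assumptions: it uses average linkage and takes distance to be 1 minus similarity, neither of which the slides confirm, and the four-word similarity values are invented.

```python
def average_linkage(labels, sim):
    """Agglomerative clustering with average linkage.

    sim[a][b] is a symmetric similarity in [0, 1]; the distance is
    assumed to be 1 - similarity. Returns the merges in order as
    (cluster_a, cluster_b, distance) tuples.
    """
    clusters = [(label,) for label in labels]
    merges = []

    def distance(c1, c2):
        total = sum(1.0 - sim[a][b] for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = distance(clusters[i], clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return merges

# Hypothetical pairwise similarities for four of the ten words.
labels = ["London", "Paris", "north", "south"]
pairs = {("London", "Paris"): 0.8, ("north", "south"): 0.7,
         ("London", "north"): 0.2, ("London", "south"): 0.2,
         ("Paris", "north"): 0.2, ("Paris", "south"): 0.2}
sim = {a: {} for a in labels}
for (a, b), s in pairs.items():
    sim[a][b] = s
    sim[b][a] = s

tree = average_linkage(labels, sim)
for left, right, d in tree:
    print(left, "+", right, f"at distance {d:.2f}")
```

Because each resampled classification yields different estimates, rerunning a routine like this on each one would give a different tree, which is the point the slide makes.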
1. We compute M (= 30) single-trial classifications of the data (640 data samples for our 10 words) using random resampling with replacement.
2. For each classification we find, for each word, the partial orders of the brain and language data that are significantly correlated.
3. For each of these highly correlated partial order pairs, we find the partial order invariant with respect to both.
We performed 60 classifications – that is, we recomputed the classifications of the brain data using random resampling with replacement. For half of these 60 classifications we compared the brain data to the WordNet data, and for the other half we compared the brain data to the LSA data. And we plotted the results.
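The resampling loop in steps 1 and 2 can be sketched as follows. This is an outline under assumptions rather than the actual pipeline: `classify` and `is_significant` are hypothetical callbacks standing in for the single-trial classifier and the correlation test, and the stubs at the bottom exist only so the sketch runs.

```python
import random

def run_resampled_comparisons(samples, words, classify, is_significant,
                              n_rounds=30, seed=0):
    """Count significant brain/language matches over resampled rounds.

    classify(resample) is a hypothetical callback returning brain-derived
    similarity data; is_significant(word, brain_data) is a hypothetical
    callback reporting whether that word's brain and language partial
    orders are significantly correlated.
    """
    rng = random.Random(seed)
    n = len(samples)
    count = 0
    for _ in range(n_rounds):
        # Draw n samples with replacement from the original n samples.
        resample = [samples[rng.randrange(n)] for _ in range(n)]
        brain_data = classify(resample)
        for word in words:
            if is_significant(word, brain_data):
                count += 1
    return count

words = ["London", "Moscow", "Paris", "north", "south",
         "east", "west", "Germany", "Poland", "Russia"]
samples = list(range(640))  # placeholders for the 640 brain wave samples

# Stub callbacks: every comparison counts as significant.
upper_bound = run_resampled_comparisons(
    samples, words,
    classify=lambda resample: {"size": len(resample)},
    is_significant=lambda word, brain_data: True)
print(upper_bound)
```

With 30 rounds and 10 words, at most 300 word-level comparisons can be significant for one participant and one language model.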
[Figure: bar chart titled "Partial Orders of Similarity Differences Invariant between EEG-recorded Brain Data and Semantic Representations of Language". Vertical axis: Significant Invariant Partial Orders (0 to 300). Horizontal axis: Test Participants (S10, S24, S25, S27, S16, S26, S12, S13, S18). Series: WordNet, LSA]
A higher number of significant structural similarities was found between the brain data and the WordNet data than between the brain data and the LSA data.
This stronger structural similarity between the brain data and the WordNet-derived data supports the contention that, during language comprehension for the complex cognitive task of assessing truth or falsity, the representation of words in the brain has a WordNet-like quality.
This stronger structural similarity is the first evidence we have seen of WordNet-like representations in the brain.
Banerjee S, Pedersen T (2003). Extended Gloss Overlaps as a Measure of Semantic Relatedness. In Proceedings of the Eighteenth International Joint Conference on Artificial Intelligence, pp. 805-810, August 9-15, 2003, Acapulco, Mexico.
Banerjee S, Pedersen T (2002). An Adapted Lesk Algorithm for Word Sense Disambiguation using WordNet. In Proceedings of the Third International Conference on Intelligent Text Processing and Computational Linguistics, pp. 136-145, February 17-23, 2002, Mexico City.
Baroni M, Murphy B, Barbu E, Poesio M (2010). Strudel: A Corpus-Based Semantic Model Based on Properties and Types. Cognitive Science 34(2), pp. 222-254. DOI 10.1111/j.1551-6709.2009.01068.x
Blei D, Ng A, Jordan M (2003). Latent Dirichlet allocation. Journal of Machine Learning Research 3, 993–1022.
Brants T, Franz A (2006). www.ldc.upenn.edu/Catalog/CatalogEntry.jsp?catalogId=LDC2006T13. Linguistic Data Consortium, Philadelphia.
Budanitsky A, Hirst G (2001). Semantic distance in WordNet: an experimental, application-oriented evaluation of five measures. In Proceedings of the NAACL 2001 Workshop on WordNet and Other Lexical Resources, Pittsburgh, June 2001. http://www.seas.smu.edu/~rada/mwnw/papers/WNW-NAACL-101.pdf.gz
Collins AM, Loftus EF (1975). A spreading-activation theory of semantic processing. Psychological Review 82(6), 407-428.
Crestani F (1997). Application of Spreading Activation Techniques in Information Retrieval. Artificial Intelligence Review 11, 453–482.
Deerwester S, Dumais ST, Furnas GW, Landauer TK, Harshman R (1990). Indexing By Latent Semantic Analysis. Journal of the American Society for Information Science 41, 391-407.
Devereux B, Kelly C, Korhonen A (2010). Using fMRI activation to conceptual stimuli to evaluate methods for extracting conceptual representations from corpora. Proceedings of the NAACL HLT 2010 First Workshop on Computational Neurolinguistics, June 2010, Los Angeles, USA. Association for Computational Linguistics, 70–78.
Fellbaum C (Ed.) (1998). WordNet: An Electronic Lexical Database. Cambridge, MA: MIT Press.
Hauk O, Davis MH, Ford M, Pulvermüller F, Marslen-Wilson WD (2006). The time course of visual word recognition as revealed by linear regression analysis of ERP data. Neuroimage 2006 May 1;30(4):1383-400. Epub 2006 Feb 7.
Hirst G (1988). Resolving lexical ambiguity computationally with spreading activation and Polaroid Words. In Small SL, Cottrell GW, Tanenhaus MK (Eds.), Lexical ambiguity resolution: Perspectives from psycholinguistics, neuropsychology, and artificial intelligence. San Mateo, CA: Morgan Kaufmann Publishers, 73–107.
Jelodar AB, Alizadeh M, Khadivi S (2010). WordNet Based Features for Predicting Brain Activity associated with meanings of noun. Proceedings of the NAACL HLT 2010 First Workshop on Computational Neurolinguistics, June 2010, Los Angeles, USA. Association for Computational Linguistics, 18–26.
Jiang J, Conrath D (1997). Semantic similarity based on corpus statistics and lexical taxonomy. In Proceedings of International Conference on Research in Computational Linguistics, Taiwan, 19-33.
Just MA, Cherkassky VL, Aryal S, Mitchell TM (2010). A Neurosemantic Theory of Concrete Noun Representation Based on the Underlying Brain Codes. PLoS ONE 5(1): e8622. doi:10.1371/journal.pone.0008622
Karp RM (1972). Reducibility among combinatorial problems. In Miller RE, Thatcher JW (Eds.), Complexity of Computer Computations, Proc. Sympos. IBM Thomas J. Watson Res. Center, Yorktown Heights, NY. New York: Plenum, pp. 85-103.
Kelly C, Devereux B, Korhonen A (2010). Acquiring Human-like Feature-Based Conceptual Representations from Corpora. Proceedings of the NAACL HLT 2010 First Workshop on Computational Neurolinguistics, June 2010, Los Angeles, USA. Association for Computational Linguistics, 61–69.
Kriegeskorte N, Mur M, Bandettini PA (2008). Representational similarity analysis – connecting the branches of systems neuroscience. Frontiers in Systems Neuroscience. doi:10.3389/neuro.06.004.2008
Kutas M, Hillyard SA (1984). Brain potentials during reading reflect word expectancy and semantic association. Nature 307, 161-163.
Kutas M, Federmeier KD (2011). Thirty years and counting: Finding meaning in the N400 component of the event related brain potential (ERP). Annual Review of Psychology 62, pp. 621-647.
Landauer TK, Dumais ST (1997). A solution to Plato's problem: The Latent Semantic Analysis theory of the acquisition, induction, and representation of knowledge. Psychological Review 104, 211-240.
Landauer TK, Foltz PW, Laham D (1998). Introduction to Latent Semantic Analysis. Discourse Processes 25, 259-284.
Leacock C, Chodorow M (1998). Combining local context and WordNet similarity for word sense identification. In Fellbaum (Ed.), WordNet: An Electronic Lexical Database. Cambridge, MA: MIT Press, pp. 265-283.
Leech G, Rayson P, Wilson A (2001). Word Frequencies in Written and Spoken English: based on the British National Corpus. Longman, London.
Lin D (1998). An information-theoretic definition of similarity. In Proc. 15th International Conf. on Machine Learning. Morgan Kaufmann, San Francisco, CA, pp. 296-304.
Luce RD (1956). Semiorders and a theory of utility discrimination. Econometrica 24, 178–191. MR0078632. http://www.jstor.org/stable/1905751
Matsunaga T, Yonemori C, Tomita E, Muramatsu M (2009). Clique-based data mining for related genes in a biomedical database. BMC Bioinformatics 2009 Jul 1;10:205.
Martin Ph (2003). Correction and Extension of WordNet 1.7. ICCS 2003, 11th International Conference on Conceptual Structures (© Springer Verlag, LNAI 2746, pp. 160-173), Dresden, Germany, July 21-25, 2003.
Miller GA, Nicely P (1955). An analysis of perceptual confusions among some English consonants. J. Acoust. Soc. Am. 27(2).
Miller GA, Leacock C, Tengi R, Bunker RT (1993). A Semantic Concordance. In Proceedings of the 3rd DARPA Workshop on Human Language Technology.
Miller GA (1995). WordNet: A Lexical Database for English. Communications of the ACM 38(11), 39-41.
Mitchell TM, Shinkareva SV, Carlson A, Chang KM, Malave VL, Mason RA, Just MA (2008). Predicting Human Brain Activity Associated with the Meanings of Nouns. Science 320, 1191, May 30, 2008. DOI 10.1126/science.1152876
Murphy B, Baroni M, Poesio M (2009). EEG responds to conceptual stimuli and corpus semantics. In Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing, pages 619–627, Singapore, 6-7 August 2009. ACL and AFNLP.
Murphy B, Poesio M (2010). Detecting semantic category in simultaneous EEG/MEG recordings. In First Workshop on Computational Neurolinguistics, NAACL HLT 2010 (pp. 36–44). Los Angeles: Association for Computational Linguistics.
Murphy B, Poesio M, Bovolo F, Bruzzone L, Dalponte M, Lakany H (2011). EEG decoding of semantic category reveals distributed representations for single concepts. Brain Lang. 2011 Apr;117(1):12-22. doi:10.1016/j.bandl.2010.09.013
Pedersen T, Patwardhan S, Michelizzi J (2004). WordNet::Similarity - Measuring the Relatedness of Concepts. In Proceedings of Fifth Annual Meeting of the North American Chapter of the Association for Computational Linguistics (NAACL-2004), pp. 38-41, Boston, May 2004.
Pereira F, Botvinick M, Detre G (2010). Learning semantic features for fMRI data from definitional text. Proceedings of the NAACL HLT 2010 First Workshop on Computational Neurolinguistics, June 2010, Los Angeles, USA. Association for Computational Linguistics, 1–9. http://www.aclweb.org/anthology/W10-0601
Perreau-Guimaraes M, Wong DK, Uy ET, Grosenick L, Suppes P (2007). Single-trial classification of MEG recordings. IEEE Transactions on Biomedical Engineering 54, 436–443.
Rabinovitch I (1977). The Scott-Suppes theorem on semiorders. J. Mathematical Psychology 15(2), 209–212. MR0437404.
Resnik P (1995). Using information content to evaluate semantic similarity. In Proceedings of the 14th International Joint Conference on Artificial Intelligence, pages 448-453, Montreal.
Roach BJ, Mathalon DH (2008). Event-related EEG time-frequency analysis: an overview of measures and an analysis of early gamma band phase locking in schizophrenia. Schizophr Bull. 2008 Sep;34(5):907-26. Epub 2008 Aug 6.
Scott D, Suppes P (1958). Foundational aspects of theories of measurement. The Journal of Symbolic Logic 23, 113–128. MR0115919.
Suppes P (1972). Axiomatic Set Theory. New York: Dover.
Suppes P (1974). The axiomatic method in the empirical sciences. In L Henkin et al. (Eds.), Proceedings of the Tarski Symposium, Proceedings of Symposia in Pure Mathematics 25. Providence, RI: American Mathematical Society, pp. 465-479.
Suppes P, Lu Z-L, Han B (1997). Brain wave recognition of words. Proceedings of the National Academy of Sciences 94, 14965–14969.
Suppes P, Han B, Lu Z-L (1998). Brain-wave recognition of sentences. Proceedings of the National Academy of Sciences 95, 15861–15866.
Suppes P, Han B, Epelboim J, Lu Z-L (1999a). Invariance between subjects of brain wave representations of language. Proceedings of the National Academy of Sciences 96, 12953–12958.
Suppes P, Han B, Epelboim J, Lu Z-L (1999b). Invariance of brain-wave representations of simple visual images and their names. Proceedings of the National Academy of Sciences 96, 14658-14663.
Suppes P, Han B (2000). Brain-wave representation of words by superposition of a few sine waves. Proceedings of the National Academy of Sciences 97, 8738–8743.
Suppes P, Perreau-Guimaraes M, Wong DK (2009). Partial Orders of Similarity Differences Invariant Between EEG-Recorded Brain and Perceptual Representations of Language. Neural Computation 21, 3228–3269.
Suppes P, de Barros JA, Oas G (2012). Phase-Oscillator computations as neural models of stimulus-response conditioning and response selection. Journal of Mathematical Psychology 56, 95-117.
Turney P (2001). Mining the Web for Synonyms: PMI-IR versus LSA on TOEFL. In L De Raedt & P Flach (Eds.), Proceedings of the Twelfth European Conference on Machine Learning (ECML-2001) (pp. 491-502). Freiburg, Germany.
Vassilieva E, Pinto G, de Barros JA, Suppes P (2011). Learning Pattern Recognition Through Quasi-Synchronization of Phase Oscillators. IEEE Transactions on Neural Networks 22(1), January 2011.
Vigliocco G, Warren J, Siri S, Arciuli J, Scott S, Wise R (2006). The role of semantics and grammatical class in the neural representation of words. Cereb Cortex. 2006 Dec;16(12):1790-6. Epub 2006 Jan 18.
Wong DK, Perreau-Guimaraes M, Uy ET, Suppes P (2004). Classification of individual trials based on the best independent component of EEG-recorded sentences. Neurocomputing 61, 479-484.
Wong DK, Uy ET, Perreau-Guimaraes M, Yang W, Suppes P (2006). Interpretation of perceptron weights as constructed time series for EEG classification. Neurocomputing 70, 373-383.
Wong DK, Grosenick L, Uy ET, Perreau-Guimaraes M, Carvalhaes CG, Desain P, Suppes P (2008). Quantifying inter-subject agreement in brain-imaging analyses. NeuroImage 39, 1051–1063.
Woods W (1975). What's in a link: Foundations for semantic networks. In Bobrow D, Collins A (Eds.), Representation and understanding. New York: Academic Press, 1975: 35-82.
Wu Z, Palmer M (1994). Verb Semantics and Lexical Selection. In Proceedings of the 32nd Annual Meeting of the Association for Computational Linguistics, Las Cruces, New Mexico, pp. 133-138.
[Figure: hierarchical cluster tree (similarity tree) of the brain data. Leaf order: London, Paris, Moscow, Germany, Poland, Russia, north, south, west, east.]
Another hierarchical cluster tree (similarity tree), computed from the conditional probability density estimates for the classification of 640 brain-wave samples for London, Moscow, Paris, north, south, east, west, Germany, Poland and Russia.
Every single-trial classification produces its own conditional probability density estimates, giving rise to its own similarity tree...
[Figure: a further similarity tree of the brain data. Leaf order: north, east, south, west, London, Paris, Germany, Moscow, Russia, Poland.]
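To make the idea of a similarity tree concrete, here is a minimal Python sketch of how a tree could be built from a table of conditional probability estimates. It is an illustration under stated assumptions, not the procedure used in the study: the 4 x 4 table is invented, the symmetrized similarity and the average-linkage rule are assumed choices, and the name `similarity_tree` is hypothetical.

```python
from itertools import combinations


def similarity_tree(labels, p):
    """Average-linkage agglomerative clustering (illustrative only).

    p[i][j] is an estimate of P(classified as j | stimulus i).
    By assumption, the similarity of i and j is the mean of p[i][j]
    and p[j][i], and the distance is 1 minus that similarity.
    Returns the merge tree as nested tuples of labels.
    """
    n = len(labels)
    dist = {(i, j): 1.0 - 0.5 * (p[i][j] + p[j][i])
            for i, j in combinations(range(n), 2)}
    # Each cluster is a pair: (subtree, list of member indices).
    clusters = [(labels[i], [i]) for i in range(n)]

    def linkage(a, b):
        # Mean pairwise distance between the members of two clusters.
        pairs = [dist[min(i, j), max(i, j)] for i in a for j in b]
        return sum(pairs) / len(pairs)

    while len(clusters) > 1:
        # Merge the two clusters with the smallest average distance.
        x, y = min(combinations(range(len(clusters)), 2),
                   key=lambda xy: linkage(clusters[xy[0]][1],
                                          clusters[xy[1]][1]))
        merged = ((clusters[x][0], clusters[y][0]),
                  clusters[x][1] + clusters[y][1])
        clusters = [c for k, c in enumerate(clusters) if k not in (x, y)]
        clusters.append(merged)
    return clusters[0][0]


# Invented estimates for four of the ten words (each row sums to 1).
labels = ["London", "Paris", "north", "south"]
p = [
    [0.50, 0.30, 0.10, 0.10],
    [0.30, 0.50, 0.10, 0.10],
    [0.10, 0.10, 0.45, 0.35],
    [0.10, 0.10, 0.35, 0.45],
]
tree = similarity_tree(labels, p)
print(tree)
```

In this toy table the two compass words are confused with each other most often, so they merge first, followed by the two cities. Each resampled classification yields a different table and therefore, potentially, a different tree.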
1. We compute M (= 30) single-trial classifications of the data (640 data samples for our 10 words), using random resampling with replacement.
2. For each classification we find, for each word, the partial orders of the brain and language data that are significantly correlated.
3. For each of these highly correlated partial-order pairs, we find the partial order that is invariant with respect to both.
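Steps 2 and 3 can be pictured with a small Python sketch. It rests on assumptions that go beyond the slide: similarities of one reference word to the other words are given as plain numbers (invented here), a pair is ordered only when its similarity difference exceeds a threshold, and the invariant partial order is taken to be the set of pairs ordered the same way in both data sets. The significance test on the correlation is omitted, and the function names are hypothetical.

```python
from itertools import permutations


def strict_partial_order(similarity, threshold):
    """Return the set of pairs (a, b) with a ranked strictly above b.

    a is placed above b only when similarity[a] exceeds similarity[b]
    by more than `threshold`; smaller differences stay unordered.
    """
    return {(a, b) for a, b in permutations(similarity, 2)
            if similarity[a] - similarity[b] > threshold}


def invariant_order(order_a, order_b):
    """Pairs ordered the same way in both relations (set intersection)."""
    return order_a & order_b


# Invented similarities of the other words to the reference word "London".
brain = {"Paris": 0.30, "Moscow": 0.25, "Russia": 0.12, "north": 0.08}
language = {"Paris": 0.80, "Moscow": 0.75, "Russia": 0.50, "north": 0.20}

brain_order = strict_partial_order(brain, threshold=0.03)
language_order = strict_partial_order(language, threshold=0.10)
invariant = invariant_order(brain_order, language_order)
print(sorted(invariant))
```

With these numbers the brain data order Paris above Moscow, but the language data leave that pair unordered, so the pair drops out of the invariant order. The intersection of two strict partial orders is itself a strict partial order, which is why a plain set intersection suffices in this sketch.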
We performed 60 classifications; that is, we recomputed the classifications of the brain data using random resampling with replacement.
For half of these 60 classifications we compared the brain data to the WordNet data, and for the other half we compared the brain data to the LSA data.
We then plotted the results.
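The resampling design can be sketched in a few lines of Python. This is only a sketch under assumptions: each run is taken to draw 640 sample indices with replacement, the first 30 runs are assigned to WordNet and the last 30 to LSA (the slide does not say how the halves were chosen), and the classifier itself is left out. The seed and all names are hypothetical.

```python
import random


def bootstrap_indices(n_samples, rng):
    """Draw n_samples indices uniformly at random, with replacement."""
    return [rng.randrange(n_samples) for _ in range(n_samples)]


rng = random.Random(0)   # fixed seed so the sketch is repeatable
N_SAMPLES = 640          # brain-wave samples for the 10 words
N_RUNS = 60              # total number of resampled classifications

runs = []
for run in range(N_RUNS):
    runs.append({
        "indices": bootstrap_indices(N_SAMPLES, rng),
        # Assumed assignment: first half WordNet, second half LSA.
        "model": "WordNet" if run < N_RUNS // 2 else "LSA",
    })

counts = {m: sum(r["model"] == m for r in runs) for m in ("WordNet", "LSA")}
print(counts)
```

Because the draws are made with replacement, some samples appear more than once in a run and others not at all, so each run trains and tests the classifier on a slightly different version of the data.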
[Figure: Partial Orders of Similarity Differences Invariant between EEG-recorded Brain Data and Semantic Representations of Language. Vertical axis: Significant Invariant Partial Orders (0 to 300). Horizontal axis: Test Participants (S10, S24, S25, S27, S16, S26, S12, S13, S18). Series: WordNet, LSA.]
A higher number of significant structural similarities was found between the brain data and the WordNet data than between the brain data and the LSA data.
This stronger structural similarity between the brain data and the WordNet-derived data supports the contention that, during language comprehension for the complex cognitive task of assessing truth or falsity, the representation of words in the brain has a WordNet-like quality.
This stronger structural similarity is the first evidence we have seen of WordNet-like representations in the brain.
Proceedings of the National Academy of Sciences 96 12953ndash12958 Suppes P Han B Epelboim J Lu Z-L (1999b) Invariance of brain-wave representations of simple visual images and their names Proceedings of the National Academy of Sciences 96 14658-14663 Suppes P Han B (2000) Brain-wave representation of words by superposition of a few sine waves Proceedings of the National Academy of Sciences 97 8738ndash8743 Suppes P Perreau-Guimaraes M Wong DK (2009) Partial Orders of Similarity Differences Invariant Between EEG-Recorded Brain and Perceptual Representations of Language Neural Computation 21 3228ndash3269
45 copy Colleen E Crangle 2013
Suppes P de Barros JA Oas G (2012) Phase-Oscillator computations as neural models of stimulus-response conditioning and response selection Journal of Mathematical Psychology 56 95-117 Turney P (2001) Mining the Web for Synonyms PMI-IR versus LSA on TOEFL In L De Raedt amp P Flach (Eds) Proceedings of the Twelfth European Conference on Machine Learning (ECML-2001) (pp 491-502) Freiburg Germany Vassilieva E Pinto G de Barros JA Suppes P (2011) Learning Pattern Recognition Through Quasi-Synchronization of Phase Oscillators IEEE Transactions on Neural Networks Vol 22 No 1 January 2011 Vigliocco G Warren J Siri S Arciuli J Scott S Wise R (2006) The role of semantics and grammatical class in the neural representation of words Cereb Cortex 2006 Dec16(12)1790-6 Epub 2006 Jan 18 Wong DK Perreau-Guimaraes M Uy ET Suppes P (2004) Classification of individual trials based on the best independent component of EEG-recorded sentences Neurocomputing 61 479-484 Wong DK Uy ET Perreau-Guimaraes M Yang W Suppes P (2006) Interpretation of perceptron weights as constructed time series for EEG classification Neurocomputing 70 373-383 Wong DK Grosenick L Uy ET Perreau-Guimaraes M Carvalhaes CG Desain P Suppes P (2008) Quantifying inter-subject agreement in brain-imaging analyses NeuroImage 39 1051ndash1063 Woods W (1975) Whatrsquos in a link Foundations for semantic networks In Bobrow D Collins A eds Representation and understanding New York Academic Press 197535-82 Wu Z Palmer M (1994) Verb Semantics and Lexical Selection In Proceedings of the 32nd Annual Meeting of the Association for Computational Linguistics Las Cruces New Mexico pp 133--138
46 copy Colleen E Crangle 2013
A higher number of significant structural similarities was
found between the brain data and the WordNet data than
between the brain data and the LSA data.
This stronger structural similarity between the
brain data and the WordNet-derived data
supports the contention that, during language
comprehension for the complex cognitive task
of assessing truth or falsity, the representation
of words in the brain has a WordNet-like quality.
This stronger structural similarity is the first
evidence we have seen of WordNet-like
representations in the brain.
Banerjee S, Pedersen T (2003). Extended Gloss Overlaps as a Measure of Semantic Relatedness. In Proceedings of the Eighteenth International Joint Conference on Artificial Intelligence, pp. 805-810, August 9-15, 2003, Acapulco, Mexico.
Banerjee S, Pedersen T (2002). An Adapted Lesk Algorithm for Word Sense Disambiguation using WordNet. In Proceedings of the Third International Conference on Intelligent Text Processing and Computational Linguistics, pp. 136-145, February 17-23, 2002, Mexico City.
Baroni M, Murphy B, Barbu E, Poesio M (2010). Strudel: A Corpus-Based Semantic Model Based on Properties and Types. Cognitive Science, vol. 34, no. 2, pp. 222-254. DOI: 10.1111/j.1551-6709.2009.01068.x
Blei D, Ng A, Jordan M (2003). Latent Dirichlet allocation. Journal of Machine Learning Research 3, 993-1022.
Brants T, Franz A (2006). www.ldc.upenn.edu/Catalog/CatalogEntry.jsp?catalogId=LDC2006T13. Linguistic Data Consortium, Philadelphia.
Budanitsky A, Hirst G (2001). Semantic distance in WordNet: an experimental, application-oriented evaluation of five measures. In Proceedings of the NAACL 2001 Workshop on WordNet and Other Lexical Resources, Pittsburgh, June 2001. http://www.seas.smu.edu/~rada/mwnw/papers/WNW-NAACL-101.pdf.gz
Collins AM, Loftus EF (1975). A spreading-activation theory of semantic processing. Psychological Review, 1975 Nov, Vol. 82(6), 407-428.
Crestani F (1997). Application of Spreading Activation Techniques in Information Retrieval. Artificial Intelligence Review 11, 453-482.
Deerwester S, Dumais ST, Furnas GW, Landauer TK, Harshman R (1990). Indexing By Latent Semantic Analysis. Journal of the American Society for Information Science 41, 391-407.
Devereux B, Kelly C, Korhonen A (2010). Using fMRI activation to conceptual stimuli to evaluate methods for extracting conceptual representations from corpora. Proceedings of the NAACL HLT 2010 First Workshop on Computational Neurolinguistics, June 2010, Los Angeles, USA. Association for Computational Linguistics, 70-78.
Fellbaum C (Ed.) (1998). WordNet: An Electronic Lexical Database. Cambridge, MA: MIT Press.
Hauk O, Davis MH, Ford M, Pulvermüller F, Marslen-Wilson WD (2006). The time course of visual word recognition as revealed by linear regression analysis of ERP data. Neuroimage 2006 May 1;30(4):1383-400. Epub 2006 Feb 7.
Hirst G (1988). Resolving lexical ambiguity computationally with spreading activation and Polaroid Words. In Small SL, Cottrell GW, Tanenhaus MK (Eds.), Lexical ambiguity resolution: Perspectives from psycholinguistics, neuropsychology, and artificial intelligence. San Mateo, CA: Morgan Kaufmann Publishers, 1988, 73-107.
Jelodar AB, Alizadeh M, Khadivi S (2010). WordNet Based Features for Predicting Brain Activity associated with meanings of noun. Proceedings of the NAACL HLT 2010 First Workshop on Computational Neurolinguistics, June 2010, Los Angeles, USA. Association for Computational Linguistics, 18-26.
Jiang J, Conrath D (1997). Semantic similarity based on corpus statistics and lexical taxonomy. In Proceedings of International Conference on Research in Computational Linguistics, Taiwan, 19-33.
Just MA, Cherkassky VL, Aryal S, Mitchell TM (2010). A Neurosemantic Theory of Concrete Noun Representation Based on the Underlying Brain Codes. PLoS ONE 5(1): e8622. doi:10.1371/journal.pone.0008622
Karp RM (1972). Reducibility among combinatorial problems. In Complexity of Computer Computations, Proc. Sympos. IBM Thomas J. Watson Res. Center, Yorktown Heights, NY (Ed. R. E. Miller and J. W. Thatcher). New York: Plenum, pp. 85-103.
Kelly C, Devereux B, Korhonen A (2010). Acquiring Human-like Feature-Based Conceptual Representations from Corpora. Proceedings of the NAACL HLT 2010 First Workshop on Computational Neurolinguistics, June 2010, Los Angeles, USA. Association for Computational Linguistics, 61-69.
Kriegeskorte N, Mur M, Bandettini PA (2008). Representational similarity analysis - connecting the branches of systems neuroscience. Frontiers in Systems Neuroscience. doi:10.3389/neuro.06.004.2008
Kutas M, Hillyard SA (1984). Brain potentials during reading reflect word expectancy and semantic association. Nature 307, 161-163.
Kutas M, Federmeier KD (2011). Thirty years and counting: Finding meaning in the N400 component of the event related brain potential (ERP). Annual Review of Psychology 2011, 62, pp. 621-647.
Landauer TK, Dumais ST (1997). A solution to Plato's problem: The Latent Semantic Analysis theory of the acquisition, induction, and representation of knowledge. Psychological Review 104, 211-240.
Landauer TK, Foltz PW, Laham D (1998). Introduction to Latent Semantic Analysis. Discourse Processes 25, 259-284.
Leacock C, Chodorow M (1998). Combining local context and WordNet similarity for word sense identification. In Fellbaum (Ed.), WordNet: An Electronic Lexical Database. Cambridge, MA: MIT Press, pp. 265-283.
Leech G, Rayson P, Wilson A (2001). Word Frequencies in Written and Spoken English: based on the British National Corpus. Longman, London.
Lin D (1998). An information-theoretic definition of similarity. In Proc. 15th International Conf. on Machine Learning. Morgan Kaufmann, San Francisco, CA, pp. 296-304.
Luce RD (1956). Semiorders and a theory of utility discrimination. Econometrica 24, 178-191. MR0078632. http://www.jstor.org/stable/1905751
Matsunaga T, Yonemori C, Tomita E, Muramatsu M (2009). Clique-based data mining for related genes in a biomedical database. BMC Bioinformatics 2009 Jul 1;10:205.
Martin Ph (2003). Correction and Extension of WordNet 1.7. ICCS 2003, 11th International Conference on Conceptual Structures (© Springer Verlag, LNAI 2746, pp. 160-173), Dresden, Germany, July 21-25, 2003.
Miller GA, Nicely P (1955). An analysis of perceptual confusions among some English consonants. J. Acoust. Soc. Am. 27(2).
Miller GA, Leacock C, Tengi R, Bunker RT (1993). A Semantic Concordance. In Proceedings of the 3 DARPA Workshop on Human Language Technology.
Miller GA (1995). WordNet: A Lexical Database for English. Communications of the ACM, Vol. 38, No. 11, 39-41.
Mitchell TM, Shinkareva SV, Carlson A, Chang KM, Malave VL, Mason RA, Just MA (2008). Predicting Human Brain Activity Associated with the Meanings of Nouns. Science 320, 1191, May 30, 2008. DOI: 10.1126/science.1152876
Murphy B, Baroni M, Poesio M (2009). EEG responds to conceptual stimuli and corpus semantics. In Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing, pages 619-627, Singapore, 6-7 August 2009. © 2009 ACL and AFNLP.
Murphy B, Poesio M (2010). Detecting semantic category in simultaneous EEG/MEG recordings. In First workshop on computational neurolinguistics, NAACL HLT 2010 (pp. 36-44). Los Angeles: Association for Computational Linguistics.
Murphy B, Poesio M, Bovolo F, Bruzzone L, Dalponte M, Lakany H (2011). EEG decoding of semantic category reveals distributed representations for single concepts. Brain Lang. 2011 Apr;117(1):12-22. doi: 10.1016/j.bandl.2010.09.013
Pedersen T, Patwardhan S, Michelizzi J (2004). WordNet::Similarity - Measuring the Relatedness of Concepts. In Proceedings of Fifth Annual Meeting of the North American Chapter of the Association for Computational Linguistics (NAACL-2004), pp. 38-41, Boston, May 2004.
Pereira F, Botvinick M, Detre G (2010). Learning semantic features for fMRI data from definitional text. Proceedings of the NAACL HLT 2010 First Workshop on Computational Neurolinguistics, June 2010, Los Angeles, USA. Association for Computational Linguistics, 1-9. http://www.aclweb.org/anthology/W10-0601
Perreau-Guimaraes M, Wong DK, Uy ET, Grosenick L, Suppes P (2007). Single-trial classification of MEG recordings. IEEE Transactions on Biomedical Engineering 54, 436-443.
Rabinovitch I (1977). The Scott-Suppes theorem on semiorders. J. Mathematical Psychology 15(2), 209-212. MR0437404.
Resnik P (1995). Using information content to evaluate semantic similarity. In Proceedings of the 14th International Joint Conference on Artificial Intelligence, pages 448-453, Montreal.
Roach BJ, Mathalon DH (2008). Event-related EEG time-frequency analysis: an overview of measures and an analysis of early gamma band phase locking in schizophrenia. Schizophr Bull. 2008 Sep;34(5):907-26. Epub 2008 Aug 6.
Scott D, Suppes P (1958). Foundational aspects of theories of measurement. The Journal of Symbolic Logic 23, 113-128. MR0115919.
Suppes P (1972). Axiomatic Set Theory. New York: Dover.
Suppes P (1974). The axiomatic method in the empirical sciences. In L. Henkin et al. (Eds.), Proceedings of the Tarski Symposium, Proceedings of Symposia in Pure Mathematics 25. Providence, RI: American Mathematical Society, pp. 465-479.
Suppes P, Lu Z-L, Han B (1997). Brain wave recognition of words. Proceedings of the National Academy of Sciences 94, 14965-14969.
Suppes P, Han B, Lu Z-L (1998). Brain-wave recognition of sentences. Proceedings of the National Academy of Sciences 95, 15861-15866.
Suppes P, Han B, Epelboim J, Lu Z-L (1999a). Invariance between subjects of brain wave representations of language. Proceedings of the National Academy of Sciences 96, 12953-12958.
Suppes P, Han B, Epelboim J, Lu Z-L (1999b). Invariance of brain-wave representations of simple visual images and their names. Proceedings of the National Academy of Sciences 96, 14658-14663.
Suppes P, Han B (2000). Brain-wave representation of words by superposition of a few sine waves. Proceedings of the National Academy of Sciences 97, 8738-8743.
Suppes P, Perreau-Guimaraes M, Wong DK (2009). Partial Orders of Similarity Differences Invariant Between EEG-Recorded Brain and Perceptual Representations of Language. Neural Computation 21, 3228-3269.
Suppes P, de Barros JA, Oas G (2012). Phase-Oscillator computations as neural models of stimulus-response conditioning and response selection. Journal of Mathematical Psychology 56, 95-117.
Turney P (2001). Mining the Web for Synonyms: PMI-IR versus LSA on TOEFL. In L. De Raedt & P. Flach (Eds.), Proceedings of the Twelfth European Conference on Machine Learning (ECML-2001) (pp. 491-502). Freiburg, Germany.
Vassilieva E, Pinto G, de Barros JA, Suppes P (2011). Learning Pattern Recognition Through Quasi-Synchronization of Phase Oscillators. IEEE Transactions on Neural Networks, Vol. 22, No. 1, January 2011.
Vigliocco G, Warren J, Siri S, Arciuli J, Scott S, Wise R (2006). The role of semantics and grammatical class in the neural representation of words. Cereb Cortex. 2006 Dec;16(12):1790-6. Epub 2006 Jan 18.
Wong DK, Perreau-Guimaraes M, Uy ET, Suppes P (2004). Classification of individual trials based on the best independent component of EEG-recorded sentences. Neurocomputing 61, 479-484.
Wong DK, Uy ET, Perreau-Guimaraes M, Yang W, Suppes P (2006). Interpretation of perceptron weights as constructed time series for EEG classification. Neurocomputing 70, 373-383.
Wong DK, Grosenick L, Uy ET, Perreau-Guimaraes M, Carvalhaes CG, Desain P, Suppes P (2008). Quantifying inter-subject agreement in brain-imaging analyses. NeuroImage 39, 1051-1063.
Woods W (1975). What's in a link: Foundations for semantic networks. In Bobrow D, Collins A (Eds.), Representation and understanding. New York: Academic Press, 1975: 35-82.
Wu Z, Palmer M (1994). Verb Semantics and Lexical Selection. In Proceedings of the 32nd Annual Meeting of the Association for Computational Linguistics, Las Cruces, New Mexico, pp. 133-138.
46 copy Colleen E Crangle 2013
Banerjee S and Pedersen T (2003) Extended Gloss Overlaps as a Measure of Semantic Relatedness In Proceedings of the Eighteenth International Joint Conference on Artificial Intelligence pp 805-810 August 9-15 2003 Acapulco Mexico Banerjee S and Pedersen T (2002) An Adapted Lesk Algorithm for Word Sense Disambiguation using WordNet In Proceedings of the Third International Conference on Intelligent Text Processing and Computational Linguistics pp 136-145 February 17-23 2002 Mexico City Baroni M B Murphy E Barbu M Poesio (2010) Strudel A Corpus-Based Semantic Model Based on Properties and Types Cognitive Science - COGSCI vol 34 no 2 pp 222-254 2010 DOI 101111j1551-6709200901068x Blei D Ng A amp Jordan M (2003) Latent Dirichlet allocation Journal of Machine Learning Research 3 993ndash1022 Brants T and Franz A (2006) wwwldcupenneduCatalogCatalogEntryjspcatalogId=LDC2006T13 Linguistic Data Consortium Philadelphia Budanitsky A Hirst G (2001) Semantic distance in WordNet an experimental application-oriented evaluation of five measures In Proceedings of the NAACL 2001 Workshop on WordNet and Other Lexical Resources Pittsburgh June 2001 httpwwwseassmuedu~radamwnwpapersWNW-NAACL-101pdfgz Collins AM Loftus EF (1975) A spreading-activation theory of semantic processing Psychological Review 1975 Nov Vol 82(6) 407-428 Crestani F (1997) Application of Spreading Activation Techniques in Information Retrieval Artificial Intelligence Review 11 453ndash482 Deerwester S Dumais ST Furnas GW Landauer TK Harshman R (1990) Indexing By Latent Semantic Analysis Journal of the American Society for Information Science 41 391-407 Devereux B C Kelly A Korhonen (2010) Using fMRI activation to conceptual stimuli to evaluate methods for extracting conceptual representations from corpora Proceedings of the NAACL HLT 2010 First Workshop on Computational Neurolinguistics June 2010Los Angeles USA Association for Computational Linguistics 70mdash78 Fellbaum C (Ed) (1998) WordNet An Electronic Lexical 
Database Cambridge MA MIT Press Hauk O Davis MH Ford M Pulvermuumlller F Marslen-Wilson WD (2006) The time course of visual word recognition as revealed by linear regression analysis of ERP data Neuroimage 2006 May 130(4)1383-400 Epub 2006 Feb 7
42 copy Colleen E Crangle 2013
Hirst G (1988) Resolving lexical ambiguity computationally with spreading activation and Polaroid Words In Small S L Cottrell G W Tanenhaus M K (editors) Lexical ambiguity resolution Perspectives from psycholinguistics neuropsychology and artificial intelligence San Mateo CA Morgan Kaufmann Publishers 1988 73ndash107 Jelodar A B M Alizadeh S Khadivi (2010) WordNet Based Features for Predicting Brain Activity associated with meanings of noun Proceedings of the NAACL HLT 2010 First Workshop on Computational Neurolinguistics June 2010Los Angeles USA Association for Computational Linguistics 18mdash26 Jiang J and Conrath D (1997) Semantic similarity based on corpus statistics and lexical taxonomy In Proceedings of International Conference on Research in Computational Linguistics Taiwan 19-33 Just MA Cherkassky VL Aryal S Mitchell TM (2010) A Neurosemantic Theory of Concrete Noun Representation Based on the Underlying Brain Codes PLoS ONE 5(1) e8622 doi101371journalpone0008622 Karp RM (1972) Reducibility among combinatorial problems In Complexity of Computer Computations Proc Sympos IBM Thomas J Watson Res Center Yorktown Heights NY (Ed R E Miller and J W Thatcher) New York Plenum pp 85-103 Kelly C B Devereux A Korhonen (2010) Acquiring Human-like Feature-Based Conceptual Representations from Corpora Proceedings of the NAACL HLT 2010 First Workshop on Computational Neurolinguistics June 2010Los Angeles USA Association for Computational Linguistics 61mdash69 Kriegeskorte N Mur M Bandettini PA (2008) Representational similarity analysis ndash connecting the branches of systems neuroscience Frontiers in Systems Neuroscience doi103389neuro060042008 Kutas M Hillyard SA (1984) Brain potentials during reading reflect word expectancy and semantic association Nature 307 161-163 Kutas M Federmeier KD (2011) Thirty years and counting Finding meaning in the N400 component of the event related brain potential (ERP) Annual Review of Psychology 2011 62 pp 621-647 Landauer TK Dumais ST 
(1997) A solution to Platos problem The Latent Semantic Analysis theory of the acquisition induction and representation of knowledge Psychological Review 104 211-240 Landauer TK Foltz PW Laham D (1998) Introduction to Latent Semantic Analysis Discourse Processes 25 259-284
43 copy Colleen E Crangle 2013
Leacock C Chodorow M (1998) Combining local context and WordNet similarity for word sense identification In Fellbaum (Ed) WordNet An Electronic Lexical Database Cambridge MA MIT Press pp 265-283 Leech G Rayson P Wilson A (2001) Word Frequencies in Written and Spoken English based on the British National Corpus Longman London Lin D (1998) An information-theoretic definition of similarity In Proc 15th International Conf on Machine Learning Morgan Kaufmann San Francisco CA p 296--304 Luce RD (1956) Semiorders and a theory of utility discrimination Econometrica 24 178ndash191 MR0078632 httpwwwjstororgstable1905751 Matsunaga T Yonemori C Tomita E Muramatsu M (2009) Clique-based data mining for related genes in a biomedical database BMC Bioinformatics 2009 Jul 110205 Martin Ph (2003) Correction and Extension of WordNet 17 ICCS 2003 11th International Conference on Conceptual Structures (copy Springer Verlag LNAI 2746 pp 160-173) Dresden Germany July 21-25 2003 Miller GA Nicely P (1955) An analysis of perceptual confusions among some English consonants JAcoustSocAm 272 Miller GA Leacock C Tengi R Bunker RT (1993) A Semantic Concordance In Proceedings of the 3 DARPA Workshop on Human Language Technology Miller GA (1995) WordNet A Lexical Database for English Communications of the ACM Vol 38 No 11 39-41 Mitchell TM SV Shinkareva A Carlson KM Chang VL Malave R A Mason and M A Just (2008) Predicting Human Brain Activity Associated with the Meanings of Nouns Science 320 1191 May 30 2008 DOI 101126science1152876 Murphy B Baroni M amp Poesio M (2009) EEG responds to conceptual stimuli and corpus semantics In Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing pages 619ndash627 Singapore 6-7 August 2009 c 2009 ACL and AFNLP Murphy B amp Poesio M (2010) Detecting semantic category in simultaneous EEGMEG recordings In First workshop on computational neurolinguistics NAACL HLT 2010 (pp 36ndash44) Los Angeles Association for Computational 
Linguistics Murphy B Poesio M Bovolo F Bruzzone L Dalponte M Lakany H (2011) EEG decoding of semantic category reveals distributed representations for single concepts Brain Lang 2011 Apr117(1)12-22 doi 101016jbandl201009013
44 copy Colleen E Crangle 2013
Pedersen T Patwardhan S Michelizzi J (2004) WordNetSimilarity - Measuring the Relatedness of Concepts In Proceedings of Fifth Annual Meeting of the North American Chapter of the Association for Computational Linguistics (NAACL-2004) pp 38-41 Boston May 2004 Pereira F M Botvinick G Detre (2010) Learning semantic features for fMRI data from definitional text Proceedings of the NAACL HLT 2010 First Workshop on Computational Neurolinguistics June 2010Los Angeles USA Association for Computational Linguistics 1mdash9 httpwwwaclweborganthologyW10-0601 Perreau-Guimaraes M Wong DK Uy ET Grosenick L Suppes P (2007) Single-trial classification of MEG recordings IEEE Transactions on Biomedical Engineering 54 436ndash443 Rabinovitch I (1977) The Scott-Suppes theorem on semiorders J Mathematical Psychology 15 (2) 209ndash212 MR0437404 Resnik P (1995) Using information content to evaluate semantic similarity In Proceedings of the 14th International Joint Conference on Artificial Intelligence pages 448-453 Montreal Roach BJ Mathalon DH (2008) Event-related EEG time-frequency analysis an overview of measures and an analysis of early gamma band phase locking in schizophrenia Schizophr Bull 2008 Sep34(5)907-26 Epub 2008 Aug 6 Scott D Suppes P (1958) Foundational aspects of theories of measurement The Journal of Symbolic Logic 23 113ndash128 MR0115919 Suppes P (1972) Axiomatic Set Theory New York Dover Suppes P (1974) The axiomatic method in the empirical sciences L Henkin et al (Eds) Proceedings of the Tarski Symposium Proceedings of Symposia in Pure Mathematics 25 Providence RI American Mathematical Society pp 465-479 Suppes P Lu Z-L Han B (1997) Brain wave recognition of words Proceedings of the National Academy of Sciences 95 14965ndash14969 Suppes P Han B Lu Z-L (1998) Brain-wave recognition of sentences Proceedings of the National Academy of Sciences 95 15861ndash15866 Suppes P Han B Epelboim J Lu Z-L (1999a) Invariance between subjects of brain wave representations of language 
Proceedings of the National Academy of Sciences 96 12953ndash12958 Suppes P Han B Epelboim J Lu Z-L (1999b) Invariance of brain-wave representations of simple visual images and their names Proceedings of the National Academy of Sciences 96 14658-14663 Suppes P Han B (2000) Brain-wave representation of words by superposition of a few sine waves Proceedings of the National Academy of Sciences 97 8738ndash8743 Suppes P Perreau-Guimaraes M Wong DK (2009) Partial Orders of Similarity Differences Invariant Between EEG-Recorded Brain and Perceptual Representations of Language Neural Computation 21 3228ndash3269
45 copy Colleen E Crangle 2013
Suppes P de Barros JA Oas G (2012) Phase-Oscillator computations as neural models of stimulus-response conditioning and response selection Journal of Mathematical Psychology 56 95-117 Turney P (2001) Mining the Web for Synonyms PMI-IR versus LSA on TOEFL In L De Raedt amp P Flach (Eds) Proceedings of the Twelfth European Conference on Machine Learning (ECML-2001) (pp 491-502) Freiburg Germany Vassilieva E Pinto G de Barros JA Suppes P (2011) Learning Pattern Recognition Through Quasi-Synchronization of Phase Oscillators IEEE Transactions on Neural Networks Vol 22 No 1 January 2011 Vigliocco G Warren J Siri S Arciuli J Scott S Wise R (2006) The role of semantics and grammatical class in the neural representation of words Cereb Cortex 2006 Dec16(12)1790-6 Epub 2006 Jan 18 Wong DK Perreau-Guimaraes M Uy ET Suppes P (2004) Classification of individual trials based on the best independent component of EEG-recorded sentences Neurocomputing 61 479-484 Wong DK Uy ET Perreau-Guimaraes M Yang W Suppes P (2006) Interpretation of perceptron weights as constructed time series for EEG classification Neurocomputing 70 373-383 Wong DK Grosenick L Uy ET Perreau-Guimaraes M Carvalhaes CG Desain P Suppes P (2008) Quantifying inter-subject agreement in brain-imaging analyses NeuroImage 39 1051ndash1063 Woods W (1975) Whatrsquos in a link Foundations for semantic networks In Bobrow D Collins A eds Representation and understanding New York Academic Press 197535-82 Wu Z Palmer M (1994) Verb Semantics and Lexical Selection In Proceedings of the 32nd Annual Meeting of the Association for Computational Linguistics Las Cruces New Mexico pp 133--138
46 copy Colleen E Crangle 2013
Hirst G (1988) Resolving lexical ambiguity computationally with spreading activation and Polaroid Words In Small S L Cottrell G W Tanenhaus M K (editors) Lexical ambiguity resolution Perspectives from psycholinguistics neuropsychology and artificial intelligence San Mateo CA Morgan Kaufmann Publishers 1988 73ndash107 Jelodar A B M Alizadeh S Khadivi (2010) WordNet Based Features for Predicting Brain Activity associated with meanings of noun Proceedings of the NAACL HLT 2010 First Workshop on Computational Neurolinguistics June 2010Los Angeles USA Association for Computational Linguistics 18mdash26 Jiang J and Conrath D (1997) Semantic similarity based on corpus statistics and lexical taxonomy In Proceedings of International Conference on Research in Computational Linguistics Taiwan 19-33 Just MA Cherkassky VL Aryal S Mitchell TM (2010) A Neurosemantic Theory of Concrete Noun Representation Based on the Underlying Brain Codes PLoS ONE 5(1) e8622 doi101371journalpone0008622 Karp RM (1972) Reducibility among combinatorial problems In Complexity of Computer Computations Proc Sympos IBM Thomas J Watson Res Center Yorktown Heights NY (Ed R E Miller and J W Thatcher) New York Plenum pp 85-103 Kelly C B Devereux A Korhonen (2010) Acquiring Human-like Feature-Based Conceptual Representations from Corpora Proceedings of the NAACL HLT 2010 First Workshop on Computational Neurolinguistics June 2010Los Angeles USA Association for Computational Linguistics 61mdash69 Kriegeskorte N Mur M Bandettini PA (2008) Representational similarity analysis ndash connecting the branches of systems neuroscience Frontiers in Systems Neuroscience doi103389neuro060042008 Kutas M Hillyard SA (1984) Brain potentials during reading reflect word expectancy and semantic association Nature 307 161-163 Kutas M Federmeier KD (2011) Thirty years and counting Finding meaning in the N400 component of the event related brain potential (ERP) Annual Review of Psychology 2011 62 pp 621-647 Landauer TK Dumais ST 
(1997) A solution to Platos problem The Latent Semantic Analysis theory of the acquisition induction and representation of knowledge Psychological Review 104 211-240 Landauer TK Foltz PW Laham D (1998) Introduction to Latent Semantic Analysis Discourse Processes 25 259-284
43 copy Colleen E Crangle 2013
Leacock C Chodorow M (1998) Combining local context and WordNet similarity for word sense identification In Fellbaum (Ed) WordNet An Electronic Lexical Database Cambridge MA MIT Press pp 265-283 Leech G Rayson P Wilson A (2001) Word Frequencies in Written and Spoken English based on the British National Corpus Longman London Lin D (1998) An information-theoretic definition of similarity In Proc 15th International Conf on Machine Learning Morgan Kaufmann San Francisco CA p 296--304 Luce RD (1956) Semiorders and a theory of utility discrimination Econometrica 24 178ndash191 MR0078632 httpwwwjstororgstable1905751 Matsunaga T Yonemori C Tomita E Muramatsu M (2009) Clique-based data mining for related genes in a biomedical database BMC Bioinformatics 2009 Jul 110205 Martin Ph (2003) Correction and Extension of WordNet 17 ICCS 2003 11th International Conference on Conceptual Structures (copy Springer Verlag LNAI 2746 pp 160-173) Dresden Germany July 21-25 2003 Miller GA Nicely P (1955) An analysis of perceptual confusions among some English consonants JAcoustSocAm 272 Miller GA Leacock C Tengi R Bunker RT (1993) A Semantic Concordance In Proceedings of the 3 DARPA Workshop on Human Language Technology Miller GA (1995) WordNet A Lexical Database for English Communications of the ACM Vol 38 No 11 39-41 Mitchell TM SV Shinkareva A Carlson KM Chang VL Malave R A Mason and M A Just (2008) Predicting Human Brain Activity Associated with the Meanings of Nouns Science 320 1191 May 30 2008 DOI 101126science1152876 Murphy B Baroni M amp Poesio M (2009) EEG responds to conceptual stimuli and corpus semantics In Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing pages 619ndash627 Singapore 6-7 August 2009 c 2009 ACL and AFNLP Murphy B amp Poesio M (2010) Detecting semantic category in simultaneous EEGMEG recordings In First workshop on computational neurolinguistics NAACL HLT 2010 (pp 36ndash44) Los Angeles Association for Computational 
Linguistics Murphy B Poesio M Bovolo F Bruzzone L Dalponte M Lakany H (2011) EEG decoding of semantic category reveals distributed representations for single concepts Brain Lang 2011 Apr117(1)12-22 doi 101016jbandl201009013
44 copy Colleen E Crangle 2013
Pedersen T, Patwardhan S, Michelizzi J (2004) WordNet::Similarity - Measuring the Relatedness of Concepts. In Proceedings of Fifth Annual Meeting of the North American Chapter of the Association for Computational Linguistics (NAACL-2004), pp. 38–41. Boston, May 2004.
Pereira F, Botvinick M, Detre G (2010) Learning semantic features for fMRI data from definitional text. Proceedings of the NAACL HLT 2010 First Workshop on Computational Neurolinguistics, June 2010, Los Angeles, USA. Association for Computational Linguistics, 1–9. http://www.aclweb.org/anthology/W10-0601
Perreau-Guimaraes M, Wong DK, Uy ET, Grosenick L, Suppes P (2007) Single-trial classification of MEG recordings. IEEE Transactions on Biomedical Engineering 54, 436–443.
Rabinovitch I (1977) The Scott-Suppes theorem on semiorders. J Mathematical Psychology 15(2), 209–212. MR0437404
Resnik P (1995) Using information content to evaluate semantic similarity. In Proceedings of the 14th International Joint Conference on Artificial Intelligence, pages 448–453, Montreal.
Roach BJ, Mathalon DH (2008) Event-related EEG time-frequency analysis: an overview of measures and an analysis of early gamma band phase locking in schizophrenia. Schizophr Bull 2008 Sep;34(5):907–26. Epub 2008 Aug 6.
Scott D, Suppes P (1958) Foundational aspects of theories of measurement. The Journal of Symbolic Logic 23, 113–128. MR0115919
Suppes P (1972) Axiomatic Set Theory. New York: Dover.
Suppes P (1974) The axiomatic method in the empirical sciences. In L Henkin et al. (Eds.), Proceedings of the Tarski Symposium, Proceedings of Symposia in Pure Mathematics 25. Providence, RI: American Mathematical Society, pp. 465–479.
Suppes P, Lu Z-L, Han B (1997) Brain wave recognition of words. Proceedings of the National Academy of Sciences 94, 14965–14969.
Suppes P, Han B, Lu Z-L (1998) Brain-wave recognition of sentences. Proceedings of the National Academy of Sciences 95, 15861–15866.
Suppes P, Han B, Epelboim J, Lu Z-L (1999a) Invariance between subjects of brain wave representations of language. Proceedings of the National Academy of Sciences 96, 12953–12958.
Suppes P, Han B, Epelboim J, Lu Z-L (1999b) Invariance of brain-wave representations of simple visual images and their names. Proceedings of the National Academy of Sciences 96, 14658–14663.
Suppes P, Han B (2000) Brain-wave representation of words by superposition of a few sine waves. Proceedings of the National Academy of Sciences 97, 8738–8743.
Suppes P, Perreau-Guimaraes M, Wong DK (2009) Partial Orders of Similarity Differences Invariant Between EEG-Recorded Brain and Perceptual Representations of Language. Neural Computation 21, 3228–3269.
Suppes P, de Barros JA, Oas G (2012) Phase-Oscillator computations as neural models of stimulus-response conditioning and response selection. Journal of Mathematical Psychology 56, 95–117.
Turney P (2001) Mining the Web for Synonyms: PMI-IR versus LSA on TOEFL. In L De Raedt & P Flach (Eds.), Proceedings of the Twelfth European Conference on Machine Learning (ECML-2001) (pp. 491–502). Freiburg, Germany.
Vassilieva E, Pinto G, de Barros JA, Suppes P (2011) Learning Pattern Recognition Through Quasi-Synchronization of Phase Oscillators. IEEE Transactions on Neural Networks, Vol. 22, No. 1, January 2011.
Vigliocco G, Warren J, Siri S, Arciuli J, Scott S, Wise R (2006) The role of semantics and grammatical class in the neural representation of words. Cereb Cortex 2006 Dec;16(12):1790–6. Epub 2006 Jan 18.
Wong DK, Perreau-Guimaraes M, Uy ET, Suppes P (2004) Classification of individual trials based on the best independent component of EEG-recorded sentences. Neurocomputing 61, 479–484.
Wong DK, Uy ET, Perreau-Guimaraes M, Yang W, Suppes P (2006) Interpretation of perceptron weights as constructed time series for EEG classification. Neurocomputing 70, 373–383.
Wong DK, Grosenick L, Uy ET, Perreau-Guimaraes M, Carvalhaes CG, Desain P, Suppes P (2008) Quantifying inter-subject agreement in brain-imaging analyses. NeuroImage 39, 1051–1063.
Woods W (1975) What's in a link: Foundations for semantic networks. In Bobrow D, Collins A (eds.), Representation and understanding. New York: Academic Press, 1975, 35–82.
Wu Z, Palmer M (1994) Verb Semantics and Lexical Selection. In Proceedings of the 32nd Annual Meeting of the Association for Computational Linguistics, Las Cruces, New Mexico, pp. 133–138.
Leacock C, Chodorow M (1998) Combining local context and WordNet similarity for word sense identification. In Fellbaum (Ed.), WordNet: An Electronic Lexical Database. Cambridge, MA: MIT Press, pp. 265–283.
Leech G, Rayson P, Wilson A (2001) Word Frequencies in Written and Spoken English: based on the British National Corpus. Longman, London.
Lin D (1998) An information-theoretic definition of similarity. In Proc. 15th International Conf. on Machine Learning. Morgan Kaufmann, San Francisco, CA, pp. 296–304.
Luce RD (1956) Semiorders and a theory of utility discrimination. Econometrica 24, 178–191. MR0078632. http://www.jstor.org/stable/1905751
Matsunaga T, Yonemori C, Tomita E, Muramatsu M (2009) Clique-based data mining for related genes in a biomedical database. BMC Bioinformatics 2009 Jul 1;10:205.
Martin Ph (2003) Correction and Extension of WordNet 1.7. ICCS 2003, 11th International Conference on Conceptual Structures (© Springer Verlag, LNAI 2746, pp. 160–173), Dresden, Germany, July 21–25, 2003.
Miller GA, Nicely P (1955) An analysis of perceptual confusions among some English consonants. J. Acoust. Soc. Am. 27(2).
Miller GA, Leacock C, Tengi R, Bunker RT (1993) A Semantic Concordance. In Proceedings of the 3rd DARPA Workshop on Human Language Technology.
Miller GA (1995) WordNet: A Lexical Database for English. Communications of the ACM, Vol. 38, No. 11, 39–41.
Mitchell TM, Shinkareva SV, Carlson A, Chang KM, Malave VL, Mason RA, Just MA (2008) Predicting Human Brain Activity Associated with the Meanings of Nouns. Science 320, 1191, May 30, 2008. DOI: 10.1126/science.1152876
Murphy B, Baroni M, Poesio M (2009) EEG responds to conceptual stimuli and corpus semantics. In Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing, pages 619–627, Singapore, 6–7 August 2009. © 2009 ACL and AFNLP.
Murphy B, Poesio M (2010) Detecting semantic category in simultaneous EEG/MEG recordings. In First workshop on computational neurolinguistics, NAACL HLT 2010 (pp. 36–44). Los Angeles: Association for Computational Linguistics.
Murphy B, Poesio M, Bovolo F, Bruzzone L, Dalponte M, Lakany H (2011) EEG decoding of semantic category reveals distributed representations for single concepts. Brain Lang 2011 Apr;117(1):12–22. doi: 10.1016/j.bandl.2010.09.013