Language Technology Enhanced Learning

DESCRIPTION
Fridolin Wild, Gaston Burek, Adriana Berlanga

TRANSCRIPT
Language Technology Enhanced Learning
Fridolin Wild, The Open University, UK
Gaston Burek, University of Tübingen
Adriana Berlanga, Open University, NL
Workshop Outline

1 | Deep Introduction: Latent Semantic Analysis (LSA)
2 | Quick Introduction: Working with R
3 | Experiment: Simple Content-Based Feedback
4 | Experiment: Topic Proxy
Latent Semantic Analysis (LSA)
• Assumption: language utterances have a semantic structure
• However, this structure is obscured by word usage (noise, synonymy, polysemy, …)
• Proposed LSA solution:
  – map the document-term matrix
  – to conceptual indices
  – derived statistically (via truncated SVD)
  – and make similarity comparisons, e.g. using angles
Input (e.g., documents)
{ M } =
Deerwester, Dumais, Furnas, Landauer, and Harshman (1990): Indexing by Latent Semantic Analysis, In: Journal of the American Society for Information Science, 41(6):391-407
Only the red terms appear in more than one document, so strip the rest.
term = feature
vocabulary = ordered set of features
TEXTMATRIX
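A term-by-document matrix of this kind can be sketched in plain base R. The two documents below are made up for illustration; the lsa package's textmatrix() builds the same structure directly from a directory of text files.

```r
# Toy term-by-document matrix: rows are terms (features),
# columns are documents; the vocabulary is the ordered set of features.
docs <- list(m1 = c("human", "computer", "interface"),
             m4 = c("graph", "minors", "survey"))
vocab <- sort(unique(unlist(docs)))   # vocabulary = ordered set of features
tm <- sapply(docs, function(d) table(factor(d, levels = vocab)))
tm   # 6 terms x 2 documents, cells hold term frequencies
```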
Singular Value Decomposition

M = TSD'

Truncated SVD
… multiplying the truncated factors back together, we will get a different matrix (different values, but still of the same format as M).
latent-semantic space
Reconstructed, Reduced Matrix
m4: Graph minors: A survey
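The decomposition and truncation can be sketched with base R's svd(). The matrix below is a toy stand-in, not the Deerwester data:

```r
# Truncated SVD: decompose a toy 5x4 term-by-document matrix M,
# keep only the first k singular values, and multiply back.
M <- matrix(c(1,0,1,0,
              0,1,1,0,
              1,1,0,1,
              0,0,1,1,
              1,0,0,1), nrow = 5, byrow = TRUE)
s <- svd(M)      # M = U diag(d) V'  (T, S, D' in the slides' notation)
k <- 2           # number of factors to keep
Mk <- s$u[, 1:k] %*% diag(s$d[1:k]) %*% t(s$v[, 1:k])
dim(Mk)          # same 5 x 4 format as M, but different values
```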
Similarity in a Latent-Semantic Space
(Landauer, 2007)
cos(a, b) = Σᵢ aᵢbᵢ / ( √(Σᵢ aᵢ²) · √(Σᵢ bᵢ²) )

[Figure: a query vector and two target vectors plotted in a two-dimensional space (X dimension, Y dimension); Angle 1 and Angle 2 between the query and Target 1 / Target 2 illustrate similarity as the cosine of the angle.]
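The angle comparison above amounts to the cosine between two vectors. A minimal sketch in R with illustrative two-dimensional vectors (the lsa package ships a cosine() function; a hand-rolled version is shown here for clarity):

```r
# Cosine similarity between a query and two targets (toy vectors).
cosine_sim <- function(a, b) sum(a * b) / (sqrt(sum(a^2)) * sqrt(sum(b^2)))
query   <- c(1, 2)
target1 <- c(2, 1)
target2 <- c(1, 3)
cosine_sim(query, target1)   # 0.8
cosine_sim(query, target2)   # higher: target2 points in a closer direction
```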
doc2doc similarities

• Unreduced = pure vector space model
  – based on M = TSD'
  – Pearson correlation over document vectors
• Reduced
  – based on M2 = TS2D'
  – Pearson correlation over document vectors
The meaning of "life" = a long vector of coordinates in the latent-semantic space, e.g.:

0.0465 -0.0453 -0.0275 -0.0428 0.0166 -0.0142 -0.0094 0.0685 …

(Landauer, 2007)
Configurations
4 x 12 x 7 x 2 x 3 = 2016 Combinations
Updating: Folding-In

• SVD factor stability
  – Different texts – different factors
  – Challenge: avoid unwanted factor changes (e.g., from bad essays)
  – Solution: folding-in instead of recalculating
• SVD is computationally expensive
  – 14 seconds (300-document textbase)
  – 10 minutes (3500-document textbase)
  – … and rising!
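Folding-in can be sketched in base R. Assuming a space built from a toy matrix M (not the workshop data), a new document vector over the same vocabulary is projected into the k-dimensional space with the truncated term matrix and singular values, without recomputing the SVD:

```r
# Folding-in: project a new document vector v into an existing space
# via v_hat = v' Tk Sk^-1, leaving the factors untouched.
M <- matrix(c(1,0,1,0,
              0,1,1,0,
              1,1,0,1,
              0,0,1,1,
              1,0,0,1), nrow = 5, byrow = TRUE)
s <- svd(M); k <- 2
Tk <- s$u[, 1:k]          # truncated term matrix
Sk <- diag(s$d[1:k])      # truncated singular values
v  <- c(1, 0, 1, 0, 1)    # new document over the 5 terms
v_hat <- t(v) %*% Tk %*% solve(Sk)   # its coordinates in the space
```

Folding in an existing column of M reproduces that document's coordinates, which is a handy sanity check.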
The Statistical Language and Environment R
Help
> ?'+'
> ?kmeans
> help.search("correlation")
http://www.r-project.org
=> site search
=> documentation
Mailinglist r-help
Task View NLP: http://cran.r-project.org/ -> Task Views -> NLP
Installation & Configuration
install.packages("lsa", repos="http://cran.r-project.org")
install.packages("tm", repos="http://cran.r-project.org")
install.packages("network", repos="http://cran.r-project.org")
library(lsa)                    # load the package
setwd("d:/denkhalde/workshop")  # set the working directory
dir()                           # list files in that directory
ls()                            # list objects in the workspace
quit()                          # leave R
The lsa Package

• Available via CRAN, e.g.: http://cran.at.r-project.org/src/contrib/Descriptions/lsa.html
• Higher-level abstraction to ease use
• Five core methods:
textmatrix() / query()
lsa()
fold_in()
as.textmatrix()
– Supporting methods for term weighting, dimensionality calculation, correlation measurement, triple binding
Core Processing Workflow

tm = textmatrix("dir/")                # create a term-by-document matrix
tm = lw_logtf(tm) * gw_idf(tm)         # apply local and global weighting
space = lsa(tm, dims=dimcalc_share())  # calculate the latent-semantic space
tm3 = fold_in(tm, space)               # fold documents into the space
as.textmatrix(space)                   # convert the space back to textmatrix format
A Simple Evaluation of Students' Writings
Feedback
Evaluating Student Writings
(Landauer, 2007)
External Validation? Compare to Human Judgements!
How to do it...

library( "lsa" ) # load package
# load training texts
trm = textmatrix( "trainingtexts/" )
trm = lw_bintf( trm ) * gw_idf( trm ) # weighting
space = lsa( trm ) # create an LSA space
# fold-in essays to be tested (including gold standard text)
tem = textmatrix( "testessays/", vocabulary=rownames(trm) )
tem = lw_bintf( tem ) * gw_idf( trm ) # weighting (reuse global weights from training)
tem_red = fold_in( tem, space )
# score an essay by comparing with
# gold standard text (very simple method!)
cor( tem_red[,"goldstandard.txt"], tem_red[,"E1.txt"] )
=> 0.7
Evaluating Effectiveness
• Compare Machine Scores with Human Scores
• Human-to-Human Correlation
  – Usually around .6
  – Increased by familiarity between assessors, tighter assessment schemes, …
  – Scores vary even more strongly with decreasing subject familiarity (.8 at high familiarity, worst test -.07)
• Test Collection: 43 German essays, scored from 0 to 5 points (ratio scaled), average length: 56.4 words
• Training Collection: 3 'golden essays', plus 302 documents from a marketing glossary, average length: 56.1 words
(Positive) Evaluation Results

LSA machine scores:
  Spearman's rank correlation rho
  data: humanscores[names(machinescores), ] and machinescores
  S = 914.5772, p-value = 0.0001049
  alternative hypothesis: true rho is not equal to 0
  sample estimates: rho = 0.687324

Pure vector space model:
  Spearman's rank correlation rho
  data: humanscores[names(machinescores), ] and machinescores
  S = 1616.007, p-value = 0.02188
  alternative hypothesis: true rho is not equal to 0
  sample estimates: rho = 0.4475188
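Output of this shape comes from base R's cor.test(). The scores below are hypothetical stand-ins, not the workshop data:

```r
# Spearman rank correlation between human and machine essay scores
# (made-up values for illustration only).
human   <- c(3, 5, 1, 4, 2, 0)
machine <- c(0.4, 0.9, 0.2, 0.3, 0.7, 0.1)
res <- cor.test(human, machine, method = "spearman")
res$estimate   # rho, the rank correlation
res$p.value    # significance of the correlation
```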
Concept-Focused Evaluation
(using http://eczemablog.blogspot.com/feeds/posts/default?alt=rss)
Visualising Lexical Semantics

Topic Proxy

Network Visualisation
• Term-to-term distance matrix

        t1     t2     t3    t4
  t1    1
  t2   -0.2    1
  t3    0.5    0.7    1
  t4    0.05  -0.5    0.3   1

= Graph
Classical Landauer Example

tl = landauerSpace$tk %*% diag(landauerSpace$sk)  # rescale term vectors
dl = landauerSpace$dk %*% diag(landauerSpace$sk)  # rescale document vectors
dtl = rbind(tl, dl)                               # stack terms and documents

s = cosine(t(dtl))       # pairwise cosine similarities
s[which(s < 0.8)] = 0    # threshold: keep only strong links

plot( network(s), displaylabels=T, vertex.col = c(rep(2,12), rep(3,9)) )
Divisive Clustering (Diana)

ED-MEDIA Terminology
Code Sample

d2000 = cosine(t(dtm2000))
dianac2000 = diana(d2000, diss=T)
clustersc2000 = cutree(as.hclust(dianac2000), h=0.2)
plot(dianac2000, which.plot=2, cex=.1) # dendrogramme
winc = clustersc2000[which(clustersc2000==1)] # filter for cluster 1
wincn = names(winc)
d = d2000[wincn,wincn]
d[which(d<0)] = 0 # set negative similarities to zero
btw = betweenness(d, cmode="undirected") # for nodes size calc
btwmax = colnames(d)[which(btw==max(btw))]
btwcex = (btw/max(btw))+1
plot(network(d), displayisolates=F, displaylabels=T, boxed.labels=F, edge.col="gray", main=paste("cluster",i), usearrows=F, vertex.border="darkgray", label.col="darkgray", vertex.cex=btwcex*3, vertex.col=8-(colnames(d) %in% btwmax))
Permutating

Permutation

Permutation Test
• Non-parametric: does not assume that the data have a particular probability distribution.
• Suppose the following ranking of elements of two categories X and Y.
• Actual data to be evaluated: (x_1, x_2, y_1) = (1, 9, 3).
• Let T(x_1, x_2, y_1) = abs(mean X - mean Y) = 2.
Permutation
• Usually, it is not practical to evaluate all N! permutations.
• We can approximate the p-value by sampling randomly from the set of permutations.
The permutations are:

  permutation   value of T
  ------------------------
  (1,9,3)       2   (actual data)
  (9,1,3)       2
  (1,3,9)       7
  (3,1,9)       7
  (3,9,1)       5
  (9,3,1)       5
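Sampling from the permutations, as described above, can be sketched in R. This is a Monte-Carlo approximation; with only 3! = 6 permutations the exact answer is easy to verify by hand:

```r
# Approximate permutation test for x = (1, 9), y = (3),
# with statistic T = |mean(x) - mean(y)|.
set.seed(42)
obs    <- c(1, 9, 3)
t_stat <- function(v) abs(mean(v[1:2]) - v[3])
t_obs  <- t_stat(obs)                            # 2 (actual data)
t_perm <- replicate(1000, t_stat(sample(obs)))   # random permutations
p_val  <- mean(t_perm >= t_obs)   # share of permutations at least as extreme
```

Here every permutation yields T >= 2, so the approximated p-value is 1: the observed difference is not at all unusual under relabelling.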
Some Results

• Students' discussions on safe prescribing
• Classified according to expected learning outcomes and related subtopics: A=7, B=12, C=53, D=4, E=40, F=7
• Graded: poor, fair, good, excellent
• Methodology used:
  – LSA
  – Bag of words / Maximal Repeated Phrases
  – Permutation test
Challenging Questions

Discussion

Questions
• Dangers of using Language Technology?
• Ontologies = Neat? NLP = Nasty?
• Other possible application areas?
• Corpus Collection?
• What is good effectiveness? When can we say that an algorithm works well?
• Other aspects not evaluated…
Questions?

#eof.