![Page 1: Semantic Textual Similarity & more on Alignment · Semantic Textual Similarity & more on Alignment CMSC 723 / LING 723 / INST 725 MARINE CARPUAT marine@cs.umd.edu](https://reader035.vdocument.in/reader035/viewer/2022062506/5f06af8a7e708231d4193a3f/html5/thumbnails/1.jpg)
Semantic Textual Similarity
& more on Alignment
CMSC 723 / LING 723 / INST 725
MARINE CARPUAT
![Page 2: Semantic Textual Similarity & more on Alignment · Semantic Textual Similarity & more on Alignment CMSC 723 / LING 723 / INST 725 MARINE CARPUAT marine@cs.umd.edu](https://reader035.vdocument.in/reader035/viewer/2022062506/5f06af8a7e708231d4193a3f/html5/thumbnails/2.jpg)
2 topics today
• P3 task: Semantic Textual Similarity
– Including Monolingual alignment
• Beyond IBM word alignment
– Synchronous CFGs
![Page 3: Semantic Textual Similarity & more on Alignment · Semantic Textual Similarity & more on Alignment CMSC 723 / LING 723 / INST 725 MARINE CARPUAT marine@cs.umd.edu](https://reader035.vdocument.in/reader035/viewer/2022062506/5f06af8a7e708231d4193a3f/html5/thumbnails/3.jpg)
Semantic Textual Similarity
A series of shared tasks at the International Workshop on
Semantic Evaluation (SemEval), since 2012
http://alt.qcri.org/semeval2017/task1/
![Page 4: Semantic Textual Similarity & more on Alignment · Semantic Textual Similarity & more on Alignment CMSC 723 / LING 723 / INST 725 MARINE CARPUAT marine@cs.umd.edu](https://reader035.vdocument.in/reader035/viewer/2022062506/5f06af8a7e708231d4193a3f/html5/thumbnails/4.jpg)
What is Semantic Textual Similarity?
Semantic Similarity
[Figure: snippets of text in several languages, surrounded by filler characters. Legible snippets, translated where needed:]
• Arabic (roughly): "Come on over and don't worry about it, you'll have the time of your life"
• English: "Welcome to my world, trust me you will never be disappointed"
• English: "Yo! Come over here, you will be pleasantly surprised"
• Russian: "Welcome to my world, trust me you will never be disappointed"
• Korean (roughly): "Hello, I called you but it was no use ... I hope you were enjoying your time"
• Quantitative graded similarity score
• Confidence score
• Principled interpretability: which semantic components/features led to the results (hopefully leading to a better understanding of semantics)
![Page 5: Semantic Textual Similarity & more on Alignment · Semantic Textual Similarity & more on Alignment CMSC 723 / LING 723 / INST 725 MARINE CARPUAT marine@cs.umd.edu](https://reader035.vdocument.in/reader035/viewer/2022062506/5f06af8a7e708231d4193a3f/html5/thumbnails/5.jpg)
Why Semantic Textual Similarity?
• Most NLP applications need some notion of semantic
similarity to overcome brittleness and sparseness
• Provides evaluation beyond surface text processing
• A hub for semantic processing as a black box in
applications beyond NLP
• Lends itself to an extrinsic evaluation of scattered
semantic components
![Page 6: Semantic Textual Similarity & more on Alignment · Semantic Textual Similarity & more on Alignment CMSC 723 / LING 723 / INST 725 MARINE CARPUAT marine@cs.umd.edu](https://reader035.vdocument.in/reader035/viewer/2022062506/5f06af8a7e708231d4193a3f/html5/thumbnails/6.jpg)
What is STS?
• The graded process by which two snippets of text
(t1 and t2) are deemed equivalent semantically, i.e.
bear the same meaning
• An STS system will quantifiably inform us on how
similar t1 and t2 are, resulting in a similarity score
• An STS system will tell us why t1 and t2 are similar
giving a nuanced interpretation of similarity based
on semantic components’ contributions
![Page 7: Semantic Textual Similarity & more on Alignment · Semantic Textual Similarity & more on Alignment CMSC 723 / LING 723 / INST 725 MARINE CARPUAT marine@cs.umd.edu](https://reader035.vdocument.in/reader035/viewer/2022062506/5f06af8a7e708231d4193a3f/html5/thumbnails/7.jpg)
What is STS?
• Word similarity has been relatively well studied
– For example according to WN
word 1      word 2      similarity (higher = more similar)
cord        smile       0.02
rooster     voyage      0.04
noon        string      0.04
fruit       furnace     0.05
...
hill        woodland    1.48
car         journey     1.55
cemetery    mound       1.69
...
cemetery    graveyard   3.88
automobile  car         3.92
![Page 8: Semantic Textual Similarity & more on Alignment · Semantic Textual Similarity & more on Alignment CMSC 723 / LING 723 / INST 725 MARINE CARPUAT marine@cs.umd.edu](https://reader035.vdocument.in/reader035/viewer/2022062506/5f06af8a7e708231d4193a3f/html5/thumbnails/8.jpg)
What is STS?
• Fewer datasets for similarity between
sentences
A forest is a large area where trees grow close
together.
VS.
The coast is an area of land that is next to the sea.
[0.25]
![Page 9: Semantic Textual Similarity & more on Alignment · Semantic Textual Similarity & more on Alignment CMSC 723 / LING 723 / INST 725 MARINE CARPUAT marine@cs.umd.edu](https://reader035.vdocument.in/reader035/viewer/2022062506/5f06af8a7e708231d4193a3f/html5/thumbnails/9.jpg)
What is STS?
• Fewer datasets for similarity between
sentences
A forest is a large area where trees grow close
together.
VS.
Woodland is land with a lot of trees.
[2.51]
![Page 10: Semantic Textual Similarity & more on Alignment · Semantic Textual Similarity & more on Alignment CMSC 723 / LING 723 / INST 725 MARINE CARPUAT marine@cs.umd.edu](https://reader035.vdocument.in/reader035/viewer/2022062506/5f06af8a7e708231d4193a3f/html5/thumbnails/10.jpg)
What is STS?
• Fewer datasets for similarity between
sentences
Once there was a Czar who had three lovely
daughters.
VS.
There were three beautiful girls, whose father was a
Czar.
[4.3]
![Page 11: Semantic Textual Similarity & more on Alignment · Semantic Textual Similarity & more on Alignment CMSC 723 / LING 723 / INST 725 MARINE CARPUAT marine@cs.umd.edu](https://reader035.vdocument.in/reader035/viewer/2022062506/5f06af8a7e708231d4193a3f/html5/thumbnails/11.jpg)
Related tasks
• Paraphrase detection
– Are 2 sentences equivalent in meaning?
• Textual Entailment
– Does premise P entail hypothesis H?
• STS provides graded similarity
judgments
![Page 12: Semantic Textual Similarity & more on Alignment · Semantic Textual Similarity & more on Alignment CMSC 723 / LING 723 / INST 725 MARINE CARPUAT marine@cs.umd.edu](https://reader035.vdocument.in/reader035/viewer/2022062506/5f06af8a7e708231d4193a3f/html5/thumbnails/12.jpg)
![Page 13: Semantic Textual Similarity & more on Alignment · Semantic Textual Similarity & more on Alignment CMSC 723 / LING 723 / INST 725 MARINE CARPUAT marine@cs.umd.edu](https://reader035.vdocument.in/reader035/viewer/2022062506/5f06af8a7e708231d4193a3f/html5/thumbnails/13.jpg)
Annotation: crowd-sourcing
![Page 14: Semantic Textual Similarity & more on Alignment · Semantic Textual Similarity & more on Alignment CMSC 723 / LING 723 / INST 725 MARINE CARPUAT marine@cs.umd.edu](https://reader035.vdocument.in/reader035/viewer/2022062506/5f06af8a7e708231d4193a3f/html5/thumbnails/14.jpg)
Annotation: crowd-sourcing
• English annotation process
– Pairs annotated in batches of 20
– Annotators paid $1 per batch
– 5 annotations per pair
– Workers need to have the MTurk Master qualification
• Defining gold standard judgments
– Median value of annotations
– After filtering low-quality annotators (<0.80
correlation with leave-one-out gold & <0.20 Kappa)
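The gold-standard procedure above can be sketched in a few lines. This is only an illustration under stated assumptions: the annotator names and scores are invented, the leave-one-out gold for an annotator is taken to be the per-pair median of the other annotators, and the Kappa check mentioned on the slide is left out.

```python
import math
from statistics import median

def pearson(xs, ys):
    """Pearson correlation between two equally long score lists."""
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def gold_standard(annotations, min_corr=0.80):
    """Drop annotators who disagree with the leave-one-out gold, then take medians.

    annotations maps an annotator name to a list of scores, one per sentence pair.
    Assumption: leave-one-out gold = per-pair median of all *other* annotators.
    """
    names = sorted(annotations)
    n_pairs = len(annotations[names[0]])
    kept = []
    for name in names:
        others = [annotations[o] for o in names if o != name]
        loo_gold = [median(scores[p] for scores in others) for p in range(n_pairs)]
        if pearson(annotations[name], loo_gold) >= min_corr:
            kept.append(name)
    gold = [median(annotations[name][p] for name in kept) for p in range(n_pairs)]
    return kept, gold

# Hypothetical batch: 5 annotators, 4 sentence pairs, scores on a 0-5 scale.
annotations = {
    "a1": [5, 4, 1, 0],
    "a2": [5, 3, 1, 0],
    "a3": [4, 4, 2, 0],
    "a4": [5, 4, 1, 1],
    "a5": [0, 1, 5, 4],  # disagrees with everyone else
}
kept, gold = gold_standard(annotations)
print(kept, gold)
```

In this toy batch the fifth annotator correlates negatively with the others and is filtered out before the medians are taken.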
![Page 15: Semantic Textual Similarity & more on Alignment · Semantic Textual Similarity & more on Alignment CMSC 723 / LING 723 / INST 725 MARINE CARPUAT marine@cs.umd.edu](https://reader035.vdocument.in/reader035/viewer/2022062506/5f06af8a7e708231d4193a3f/html5/thumbnails/15.jpg)
Diverse data sources
![Page 16: Semantic Textual Similarity & more on Alignment · Semantic Textual Similarity & more on Alignment CMSC 723 / LING 723 / INST 725 MARINE CARPUAT marine@cs.umd.edu](https://reader035.vdocument.in/reader035/viewer/2022062506/5f06af8a7e708231d4193a3f/html5/thumbnails/16.jpg)
Evaluation: a shared task
Subset of 2016 results
(Score: Pearson correlation)
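As a reminder of what the score measures, here is a minimal sketch of Pearson correlation between system outputs and gold judgments. The gold values are the three sentence-pair scores shown earlier in the deck; the system scores are made up for illustration.

```python
import math

def pearson(xs, ys):
    """Pearson correlation: covariance divided by the product of standard deviations."""
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

gold = [0.25, 2.51, 4.3]    # gold scores from the example slides
system = [1.0, 3.0, 4.0]    # hypothetical system predictions
r = pearson(system, gold)
print(round(r, 3))

# Pearson is invariant to linear rescaling, so a system may score on any scale.
rescaled = [s / 5 for s in system]
print(math.isclose(pearson(rescaled, gold), r))
```

Because only the linear relationship matters, a system is not penalized for using a different range than the 0-5 gold scale.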
![Page 17: Semantic Textual Similarity & more on Alignment · Semantic Textual Similarity & more on Alignment CMSC 723 / LING 723 / INST 725 MARINE CARPUAT marine@cs.umd.edu](https://reader035.vdocument.in/reader035/viewer/2022062506/5f06af8a7e708231d4193a3f/html5/thumbnails/17.jpg)
STS models
from word to sentence vectors
• Can we perform STS by comparing sentence vector
representations?
• This approach works well for word level similarity
• But can we capture the meaning of a sentence in a single
vector?
![Page 18: Semantic Textual Similarity & more on Alignment · Semantic Textual Similarity & more on Alignment CMSC 723 / LING 723 / INST 725 MARINE CARPUAT marine@cs.umd.edu](https://reader035.vdocument.in/reader035/viewer/2022062506/5f06af8a7e708231d4193a3f/html5/thumbnails/18.jpg)
“Composing” by averaging
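$$g(\text{``shots fired at residence''}) = \frac{1}{4}\left(\mathbf{v}_{\text{shots}} + \mathbf{v}_{\text{fired}} + \mathbf{v}_{\text{at}} + \mathbf{v}_{\text{residence}}\right)$$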
[Tai et al. 2015, Wieting et al. 2016]
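A minimal sketch of composing by averaging, followed by a cosine comparison of the resulting sentence vectors. The 3-dimensional word vectors below are invented for illustration; real systems would use learned embeddings, and this is not the training setup of the cited papers.

```python
import math

# Hypothetical toy embeddings (not from any trained model).
VECTORS = {
    "shots": [1.0, 0.0, 0.0], "fired": [0.9, 0.1, 0.0],
    "at": [0.0, 0.0, 1.0], "residence": [0.0, 1.0, 0.0],
    "gunfire": [1.0, 0.1, 0.0], "home": [0.1, 1.0, 0.0],
    "cats": [-1.0, 0.2, 0.0], "sleep": [-0.8, -0.5, 0.1],
}

def compose(sentence):
    """Sentence vector = element-wise average of the word vectors."""
    words = sentence.split()
    dims = zip(*(VECTORS[w] for w in words))
    return [sum(d) / len(words) for d in dims]

def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

a = compose("shots fired at residence")
b = compose("gunfire at home")
c = compose("cats sleep")
print(cosine(a, b) > cosine(a, c))
```

Averaging ignores word order, which is one reason to ask whether a single vector can capture sentence meaning.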
![Page 19: Semantic Textual Similarity & more on Alignment · Semantic Textual Similarity & more on Alignment CMSC 723 / LING 723 / INST 725 MARINE CARPUAT marine@cs.umd.edu](https://reader035.vdocument.in/reader035/viewer/2022062506/5f06af8a7e708231d4193a3f/html5/thumbnails/19.jpg)
How can we induce word vectors
for composition?
Sources of pairs (x1, x2):
• English paraphrases [Wieting et al. 2016]
– x1: By our fellow members
– x2: By our colleagues
• Bilingual sentence pairs [Hermann & Blunsom 2014]
– x1: Thus in fact … by our fellow members
– x2: Así que podríamos … nuestra colega disputado
• Bilingual phrase pairs
– x1: by our fellow member
– x2: de nuestra colega
![Page 20: Semantic Textual Similarity & more on Alignment · Semantic Textual Similarity & more on Alignment CMSC 723 / LING 723 / INST 725 MARINE CARPUAT marine@cs.umd.edu](https://reader035.vdocument.in/reader035/viewer/2022062506/5f06af8a7e708231d4193a3f/html5/thumbnails/20.jpg)
STS models:
monolingual alignment
![Page 21: Semantic Textual Similarity & more on Alignment · Semantic Textual Similarity & more on Alignment CMSC 723 / LING 723 / INST 725 MARINE CARPUAT marine@cs.umd.edu](https://reader035.vdocument.in/reader035/viewer/2022062506/5f06af8a7e708231d4193a3f/html5/thumbnails/21.jpg)
One (of many) approaches to
monolingual alignment
Idea
• Exploit not only similarity
between words
• But also similarity
between their contexts
See Sultan et al. 2013
https://github.com/ma-sultan/
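To make the idea concrete, here is a deliberately simplified sketch: candidate word pairs are scored by a weighted mix of word similarity and the similarity of their neighbouring words, aligned greedily, and the proportion of aligned words is used as an STS score. This is not the aligner of Sultan et al.; the synonym list, the weight `w`, the window size and the scoring formula are all assumptions made for illustration.

```python
# Hypothetical synonym lexicon standing in for a real lexical resource.
SYNONYMS = {frozenset(("lovely", "beautiful")), frozenset(("daughters", "girls"))}

def word_sim(a, b):
    """1.0 for identical words or listed synonyms, else 0.0."""
    return 1.0 if a == b or frozenset((a, b)) in SYNONYMS else 0.0

def context(tokens, i, window=1):
    """Neighbouring words of position i."""
    return tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window]

def context_sim(t1, i, t2, j):
    """Fraction of t1[i]'s neighbours that have a similar neighbour around t2[j]."""
    c1, c2 = context(t1, i), context(t2, j)
    if not c1 or not c2:
        return 0.0
    return sum(1 for a in c1 if any(word_sim(a, b) for b in c2)) / len(c1)

def align(t1, t2, w=0.8):
    """Greedy one-to-one alignment, best-scoring candidate pairs first."""
    candidates = []
    for i, a in enumerate(t1):
        for j, b in enumerate(t2):
            if word_sim(a, b) > 0:
                score = w * word_sim(a, b) + (1 - w) * context_sim(t1, i, t2, j)
                candidates.append((score, i, j))
    candidates.sort(key=lambda c: (-c[0], c[1], c[2]))
    used1, used2, pairs = set(), set(), []
    for _, i, j in candidates:
        if i not in used1 and j not in used2:
            pairs.append((i, j))
            used1.add(i)
            used2.add(j)
    return sorted(pairs)

def sts_score(t1, t2):
    """Proportion of words in both sentences that take part in the alignment."""
    return 2 * len(align(t1, t2)) / (len(t1) + len(t2))

s1 = "once there was a czar who had three lovely daughters".split()
s2 = "there were three beautiful girls whose father was a czar".split()
print(align(s1, s2), sts_score(s1, s2))
```

Context only changes the outcome when a word has several candidate partners. For "the dog bit the man" against "the man bit the dog", the two occurrences of "the" are disambiguated by their neighbours.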
![Page 22: Semantic Textual Similarity & more on Alignment · Semantic Textual Similarity & more on Alignment CMSC 723 / LING 723 / INST 725 MARINE CARPUAT marine@cs.umd.edu](https://reader035.vdocument.in/reader035/viewer/2022062506/5f06af8a7e708231d4193a3f/html5/thumbnails/22.jpg)
2 topics today
• P3 task: Semantic Textual Similarity
– Including Monolingual alignment
• Beyond IBM word alignment
– Synchronous CFGs
![Page 23: Semantic Textual Similarity & more on Alignment · Semantic Textual Similarity & more on Alignment CMSC 723 / LING 723 / INST 725 MARINE CARPUAT marine@cs.umd.edu](https://reader035.vdocument.in/reader035/viewer/2022062506/5f06af8a7e708231d4193a3f/html5/thumbnails/23.jpg)
Aligning words & constituents
• Alignment: mapping between spans of text in
lang1 and spans of text in lang2
– Sentences in document pairs
– Words in sentence pairs
– Syntactic constituents in sentence pairs
• Today: 2 methods for aligning constituents
– Parse and match
– biparse
![Page 24: Semantic Textual Similarity & more on Alignment · Semantic Textual Similarity & more on Alignment CMSC 723 / LING 723 / INST 725 MARINE CARPUAT marine@cs.umd.edu](https://reader035.vdocument.in/reader035/viewer/2022062506/5f06af8a7e708231d4193a3f/html5/thumbnails/24.jpg)
Parse
&
Match
![Page 25: Semantic Textual Similarity & more on Alignment · Semantic Textual Similarity & more on Alignment CMSC 723 / LING 723 / INST 725 MARINE CARPUAT marine@cs.umd.edu](https://reader035.vdocument.in/reader035/viewer/2022062506/5f06af8a7e708231d4193a3f/html5/thumbnails/25.jpg)
Parse(-Parse)-Match
• Idea
– Align spans that are consistent with existing
structure
• Pros
– Builds on existing NLP tools
• Cons
– Assumes availability of lots of resources
– Assumes that representations can be matched
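One way to read "align spans that are consistent with existing structure" is sketched below: given constituent spans from two monolingual parsers and a word alignment, pair up the constituents whose alignment links stay inside the span pair. The consistency criterion, the 0-based spans and the example links are assumptions for illustration, not a method specified on the slide.

```python
def consistent(src_span, tgt_span, links):
    """True if some link falls inside the span pair and no link crosses its border."""
    (i, j), (k, l) = src_span, tgt_span
    inside = [(s, t) for s, t in links if i <= s <= j and k <= t <= l]
    crossing = [(s, t) for s, t in links if (i <= s <= j) != (k <= t <= l)]
    return bool(inside) and not crossing

def match_constituents(src_spans, tgt_spans, links):
    """Pair every source constituent with every target constituent it is consistent with."""
    return [(a, b) for a in src_spans for b in tgt_spans if consistent(a, b, links)]

# Hypothetical inputs for "fat cats eats" / "gatos gordos comen" (0-based positions).
links = [(0, 1), (1, 0), (2, 2)]        # fat-gordos, cats-gatos, eats-comen
src_spans = [(0, 1), (2, 2), (0, 2)]    # NP, VP, S from the source parser
tgt_spans = [(0, 1), (2, 2), (0, 2)]    # NP, VP, S from the target parser
matches = match_constituents(src_spans, tgt_spans, links)
print(matches)
```

The sketch needs two parsers and a word aligner up front, and it only works when the two parses expose comparable constituents.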
![Page 26: Semantic Textual Similarity & more on Alignment · Semantic Textual Similarity & more on Alignment CMSC 723 / LING 723 / INST 725 MARINE CARPUAT marine@cs.umd.edu](https://reader035.vdocument.in/reader035/viewer/2022062506/5f06af8a7e708231d4193a3f/html5/thumbnails/26.jpg)
Aligning words & constituents
2 methods for aligning constituents:
• Parse and match
– assume existing parses and alignment
• Biparse
– alignment = structure
![Page 27: Semantic Textual Similarity & more on Alignment · Semantic Textual Similarity & more on Alignment CMSC 723 / LING 723 / INST 725 MARINE CARPUAT marine@cs.umd.edu](https://reader035.vdocument.in/reader035/viewer/2022062506/5f06af8a7e708231d4193a3f/html5/thumbnails/27.jpg)
A “straw man” hypothesis:
All languages have same grammar
![Page 28: Semantic Textual Similarity & more on Alignment · Semantic Textual Similarity & more on Alignment CMSC 723 / LING 723 / INST 725 MARINE CARPUAT marine@cs.umd.edu](https://reader035.vdocument.in/reader035/viewer/2022062506/5f06af8a7e708231d4193a3f/html5/thumbnails/28.jpg)
A “straw man” hypothesis:
All languages have same grammar
![Page 29: Semantic Textual Similarity & more on Alignment · Semantic Textual Similarity & more on Alignment CMSC 723 / LING 723 / INST 725 MARINE CARPUAT marine@cs.umd.edu](https://reader035.vdocument.in/reader035/viewer/2022062506/5f06af8a7e708231d4193a3f/html5/thumbnails/29.jpg)
A “straw man” hypothesis:
All languages have same grammar
![Page 30: Semantic Textual Similarity & more on Alignment · Semantic Textual Similarity & more on Alignment CMSC 723 / LING 723 / INST 725 MARINE CARPUAT marine@cs.umd.edu](https://reader035.vdocument.in/reader035/viewer/2022062506/5f06af8a7e708231d4193a3f/html5/thumbnails/30.jpg)
A “straw man” hypothesis:
All languages have same grammar
![Page 31: Semantic Textual Similarity & more on Alignment · Semantic Textual Similarity & more on Alignment CMSC 723 / LING 723 / INST 725 MARINE CARPUAT marine@cs.umd.edu](https://reader035.vdocument.in/reader035/viewer/2022062506/5f06af8a7e708231d4193a3f/html5/thumbnails/31.jpg)
The biparsing hypothesis:
All languages have nearly the same grammar
![Page 32: Semantic Textual Similarity & more on Alignment · Semantic Textual Similarity & more on Alignment CMSC 723 / LING 723 / INST 725 MARINE CARPUAT marine@cs.umd.edu](https://reader035.vdocument.in/reader035/viewer/2022062506/5f06af8a7e708231d4193a3f/html5/thumbnails/32.jpg)
The biparsing hypothesis:
All languages have nearly the same grammar
![Page 33: Semantic Textual Similarity & more on Alignment · Semantic Textual Similarity & more on Alignment CMSC 723 / LING 723 / INST 725 MARINE CARPUAT marine@cs.umd.edu](https://reader035.vdocument.in/reader035/viewer/2022062506/5f06af8a7e708231d4193a3f/html5/thumbnails/33.jpg)
Example for the biparsing hypothesis:
All languages have nearly the same grammar
![Page 34: Semantic Textual Similarity & more on Alignment · Semantic Textual Similarity & more on Alignment CMSC 723 / LING 723 / INST 725 MARINE CARPUAT marine@cs.umd.edu](https://reader035.vdocument.in/reader035/viewer/2022062506/5f06af8a7e708231d4193a3f/html5/thumbnails/34.jpg)
The biparsing hypothesis:
All languages have nearly the same grammar
![Page 35: Semantic Textual Similarity & more on Alignment · Semantic Textual Similarity & more on Alignment CMSC 723 / LING 723 / INST 725 MARINE CARPUAT marine@cs.umd.edu](https://reader035.vdocument.in/reader035/viewer/2022062506/5f06af8a7e708231d4193a3f/html5/thumbnails/35.jpg)
Dekai Wu and Pascale Fung, IJCNLP-2005
HKUST Human Language Technology Center
The biparsing hypothesis:
All languages have nearly the same grammar
![Page 36: Semantic Textual Similarity & more on Alignment · Semantic Textual Similarity & more on Alignment CMSC 723 / LING 723 / INST 725 MARINE CARPUAT marine@cs.umd.edu](https://reader035.vdocument.in/reader035/viewer/2022062506/5f06af8a7e708231d4193a3f/html5/thumbnails/36.jpg)
The biparsing hypothesis:
All languages have nearly the same grammar
![Page 37: Semantic Textual Similarity & more on Alignment · Semantic Textual Similarity & more on Alignment CMSC 723 / LING 723 / INST 725 MARINE CARPUAT marine@cs.umd.edu](https://reader035.vdocument.in/reader035/viewer/2022062506/5f06af8a7e708231d4193a3f/html5/thumbnails/37.jpg)
The biparsing hypothesis:
All languages have nearly the same grammar
![Page 38: Semantic Textual Similarity & more on Alignment · Semantic Textual Similarity & more on Alignment CMSC 723 / LING 723 / INST 725 MARINE CARPUAT marine@cs.umd.edu](https://reader035.vdocument.in/reader035/viewer/2022062506/5f06af8a7e708231d4193a3f/html5/thumbnails/38.jpg)
The biparsing hypothesis:
All languages have nearly the same grammar
ITG shorthand
VP -> [ VV PP ]
VP -> ⟨ VV PP ⟩
SDTG/SCFG notation
VP -> VV PP , VV PP
VP -> VV PP , PP VV
Indexed SDTG/SCFG notation
VP -> VV(1) PP(2) , VV(1) PP(2)
VP -> VV(1) PP(2) , PP(2) VV(1)
Permuted SDTG/SCFG notation
VP -> VV PP ; 1 2
VP -> VV PP ; 2 1
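The notations are interchangeable. As a small sketch, the helper below (a hypothetical function, not part of any toolkit) renders a rule given in permuted form, that is, a right-hand side plus the target-side order, in indexed notation, and reports whether the order is straight or inverted.

```python
def to_indexed(lhs, rhs, order):
    """Render a permuted rule such as ('VP', ['VV', 'PP'], [2, 1]) in indexed notation."""
    source = [f"{sym}({k})" for k, sym in enumerate(rhs, 1)]
    target = [f"{rhs[k - 1]}({k})" for k in order]
    return f"{lhs} -> {' '.join(source)} , {' '.join(target)}"

def orientation(order):
    """'straight' if the target keeps the source order, 'inverted' if it reverses it."""
    if list(order) == sorted(order):
        return "straight"
    if list(order) == sorted(order, reverse=True):
        return "inverted"
    return "other"

print(to_indexed("VP", ["VV", "PP"], [1, 2]), "|", orientation([1, 2]))
print(to_indexed("VP", ["VV", "PP"], [2, 1]), "|", orientation([2, 1]))
```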
![Page 39: Semantic Textual Similarity & more on Alignment · Semantic Textual Similarity & more on Alignment CMSC 723 / LING 723 / INST 725 MARINE CARPUAT marine@cs.umd.edu](https://reader035.vdocument.in/reader035/viewer/2022062506/5f06af8a7e708231d4193a3f/html5/thumbnails/39.jpg)
Synchronous
Context Free Grammars
• Context free grammars (CFG)
– Common way of representing syntax in (monolingual)
NLP
• Synchronous context free grammars (SCFG)
– Generate pairs of strings
– Align sentences by parsing them
– Translate sentences by parsing them
• Key algorithm: how to parse with SCFGs?
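To illustrate "generate pairs of strings", here is a minimal sketch that enumerates every sentence pair derivable from a tiny SCFG. The grammar is the fat-cats example by Matt Post used in the biparsing slides; the data layout (dictionaries of lexical and binary rules, with a 0-based target-side order) is an assumption of this sketch.

```python
from itertools import product

# Lexical rules: nonterminal -> list of (source word, target word).
LEXICAL = {
    "A": [("fat", "gordos"), ("thin", "delgados")],
    "N": [("cats", "gatos")],
    "VP": [("eats", "comen")],
}
# Non-lexical rules: nonterminal -> list of (children, target-side order of the children).
BINARY = {
    "NP": [(("A", "N"), (1, 0))],    # NP -> A(1) N(2), N(2) A(1)
    "S": [(("NP", "VP"), (0, 1))],   # S -> NP(1) VP(2), NP(1) VP(2)
}

def generate(symbol):
    """Yield every (source tokens, target tokens) pair derivable from symbol."""
    for src, tgt in LEXICAL.get(symbol, []):
        yield (src,), (tgt,)
    for children, order in BINARY.get(symbol, []):
        for parts in product(*(list(generate(c)) for c in children)):
            src = sum((p[0] for p in parts), ())             # source keeps rule order
            tgt = sum((parts[k][1] for k in order), ())      # target follows the permutation
            yield src, tgt

pairs = {(" ".join(s), " ".join(t)) for s, t in generate("S")}
print(sorted(pairs))
```

Each derivation expands both sides in lockstep, so the resulting strings come with their constituent alignment for free.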
![Page 40: Semantic Textual Similarity & more on Alignment · Semantic Textual Similarity & more on Alignment CMSC 723 / LING 723 / INST 725 MARINE CARPUAT marine@cs.umd.edu](https://reader035.vdocument.in/reader035/viewer/2022062506/5f06af8a7e708231d4193a3f/html5/thumbnails/40.jpg)
SCFG trade off
• Expressiveness
– SCFGs cannot represent all sentence
pairs in all languages
• Efficiency
– SCFGs let us view alignment as parsing
& benefit from well-studied formalism
![Page 41: Semantic Textual Similarity & more on Alignment · Semantic Textual Similarity & more on Alignment CMSC 723 / LING 723 / INST 725 MARINE CARPUAT marine@cs.umd.edu](https://reader035.vdocument.in/reader035/viewer/2022062506/5f06af8a7e708231d4193a3f/html5/thumbnails/41.jpg)
Synchronous parsing cannot
represent all sentence pairs
![Page 42: Semantic Textual Similarity & more on Alignment · Semantic Textual Similarity & more on Alignment CMSC 723 / LING 723 / INST 725 MARINE CARPUAT marine@cs.umd.edu](https://reader035.vdocument.in/reader035/viewer/2022062506/5f06af8a7e708231d4193a3f/html5/thumbnails/42.jpg)
Synchronous parsing cannot
represent all sentence pairs
![Page 43: Semantic Textual Similarity & more on Alignment · Semantic Textual Similarity & more on Alignment CMSC 723 / LING 723 / INST 725 MARINE CARPUAT marine@cs.umd.edu](https://reader035.vdocument.in/reader035/viewer/2022062506/5f06af8a7e708231d4193a3f/html5/thumbnails/43.jpg)
Synchronous parsing cannot
represent all sentence pairs
![Page 44: Semantic Textual Similarity & more on Alignment · Semantic Textual Similarity & more on Alignment CMSC 723 / LING 723 / INST 725 MARINE CARPUAT marine@cs.umd.edu](https://reader035.vdocument.in/reader035/viewer/2022062506/5f06af8a7e708231d4193a3f/html5/thumbnails/44.jpg)
A subclass of SCFGs:
Inversion Transduction Grammars
• ITGs are the subclass of SDTGs/SCFGs defined by any one of three equivalent restrictions:
– only straight and inverted transduction rules
– only transduction rules of rank ≤ 2
– only transduction rules of rank ≤ 3
• ITGs are context-free (like SCFGs).
![Page 45: Semantic Textual Similarity & more on Alignment · Semantic Textual Similarity & more on Alignment CMSC 723 / LING 723 / INST 725 MARINE CARPUAT marine@cs.umd.edu](https://reader035.vdocument.in/reader035/viewer/2022062506/5f06af8a7e708231d4193a3f/html5/thumbnails/45.jpg)
For length-4 phrases (or frames),
ITGs can express 22 out of 24 permutations!
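The count can be checked by brute force. The sketch below treats a permutation as ITG-expressible if it can be split into two contiguous blocks that are combined in straight or inverted order, with each block recursively expressible; the function names are illustrative.

```python
from itertools import permutations

def itg_expressible(perm):
    """True if perm can be built from binary straight/inverted combinations only."""
    if len(perm) <= 1:
        return True
    for k in range(1, len(perm)):
        left, right = perm[:k], perm[k:]
        straight = max(left) < min(right)   # left block entirely precedes right block
        inverted = min(left) > max(right)   # left block entirely follows right block
        if (straight or inverted) and itg_expressible(left) and itg_expressible(right):
            return True
    return False

def count_expressible(n):
    """Number of length-n permutations an ITG can express."""
    return sum(itg_expressible(p) for p in permutations(range(1, n + 1)))

excluded = [p for p in permutations(range(1, 5)) if not itg_expressible(p)]
print(count_expressible(4), excluded)
# prints: 22 [(2, 4, 1, 3), (3, 1, 4, 2)]
```

The two excluded orders are the "inside-out" permutations 2413 and 3142; all 6 permutations of length 3 are expressible.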
![Page 46: Semantic Textual Similarity & more on Alignment · Semantic Textual Similarity & more on Alignment CMSC 723 / LING 723 / INST 725 MARINE CARPUAT marine@cs.umd.edu](https://reader035.vdocument.in/reader035/viewer/2022062506/5f06af8a7e708231d4193a3f/html5/thumbnails/46.jpg)
ITGs enable efficient DP algorithms [Wu 1995]
e0 e1 e2 e3 e4 e5 e6 e7
c0 c1 c2 c3 c4 c5 c6
![Page 47: Semantic Textual Similarity & more on Alignment · Semantic Textual Similarity & more on Alignment CMSC 723 / LING 723 / INST 725 MARINE CARPUAT marine@cs.umd.edu](https://reader035.vdocument.in/reader035/viewer/2022062506/5f06af8a7e708231d4193a3f/html5/thumbnails/47.jpg)
ITGs enable efficient DP algorithms [Wu 1995]
e0 e1 e2 e3 e4 e5 e6 e7
c0 c1 c2 c3 c4 c5 c6
![Page 48: Semantic Textual Similarity & more on Alignment · Semantic Textual Similarity & more on Alignment CMSC 723 / LING 723 / INST 725 MARINE CARPUAT marine@cs.umd.edu](https://reader035.vdocument.in/reader035/viewer/2022062506/5f06af8a7e708231d4193a3f/html5/thumbnails/48.jpg)
ITGs enable efficient DP algorithms [Wu 1995]
e0 e1 e2 e3 e4 e5 e6 e7
c0 c1 c2 c3 c4 c5 c6
![Page 49: Semantic Textual Similarity & more on Alignment · Semantic Textual Similarity & more on Alignment CMSC 723 / LING 723 / INST 725 MARINE CARPUAT marine@cs.umd.edu](https://reader035.vdocument.in/reader035/viewer/2022062506/5f06af8a7e708231d4193a3f/html5/thumbnails/49.jpg)
ITGs enable efficient DP algorithms [Wu 1995]
e0 e1 e2 e3 e4 e5 e6 e7
c0 c1 c2 c3 c4 c5 c6
![Page 50: Semantic Textual Similarity & more on Alignment · Semantic Textual Similarity & more on Alignment CMSC 723 / LING 723 / INST 725 MARINE CARPUAT marine@cs.umd.edu](https://reader035.vdocument.in/reader035/viewer/2022062506/5f06af8a7e708231d4193a3f/html5/thumbnails/50.jpg)
ITGs enable efficient DP algorithms [Wu 1995]
e0 e1 e2 e3 e4 e5 e6 e7
c0 c1 c2 c3 c4 c5 c6
![Page 51: Semantic Textual Similarity & more on Alignment · Semantic Textual Similarity & more on Alignment CMSC 723 / LING 723 / INST 725 MARINE CARPUAT marine@cs.umd.edu](https://reader035.vdocument.in/reader035/viewer/2022062506/5f06af8a7e708231d4193a3f/html5/thumbnails/51.jpg)
ITGs enable efficient DP algorithms [Wu 1995]
e0 e1 e2 e3 e4 e5 e6 e7
c0 c1 c2 c3 c4 c5 c6
![Page 52: Semantic Textual Similarity & more on Alignment · Semantic Textual Similarity & more on Alignment CMSC 723 / LING 723 / INST 725 MARINE CARPUAT marine@cs.umd.edu](https://reader035.vdocument.in/reader035/viewer/2022062506/5f06af8a7e708231d4193a3f/html5/thumbnails/52.jpg)
ITGs enable efficient DP algorithms [Wu 1995]
e0 e1 e2 e3 e4 e5 e6 e7
c0 c1 c2 c3 c4 c5 c6
![Page 53: Semantic Textual Similarity & more on Alignment · Semantic Textual Similarity & more on Alignment CMSC 723 / LING 723 / INST 725 MARINE CARPUAT marine@cs.umd.edu](https://reader035.vdocument.in/reader035/viewer/2022062506/5f06af8a7e708231d4193a3f/html5/thumbnails/53.jpg)
Biparsing with CKY
• Given the following SCFG
A -> fat, gordos
A -> thin, delgados
N -> cats, gatos
VP -> eats, comen
NP -> A(1) N(2), N(2) A(1)
S -> NP(1) VP(2), NP(1) VP(2)
• Let’s parse a sentence pair
fat cats eat
gatos gordos comen
Example by Matt Post (JHU)
![Page 54: Semantic Textual Similarity & more on Alignment · Semantic Textual Similarity & more on Alignment CMSC 723 / LING 723 / INST 725 MARINE CARPUAT marine@cs.umd.edu](https://reader035.vdocument.in/reader035/viewer/2022062506/5f06af8a7e708231d4193a3f/html5/thumbnails/54.jpg)
Biparsing with CKY
A -> fat, gordos
A -> thin, delgados
N -> cats, gatos
VP -> eats, comen
NP -> A(1)N(2),N(2)A(1)
S -> NP(1)VP(2), NP(1)VP(2)
3 comen
2 gordos
1 gatos
fat cats eats
1 2 3
Chart now enumerates pairs of spans
![Page 55: Semantic Textual Similarity & more on Alignment · Semantic Textual Similarity & more on Alignment CMSC 723 / LING 723 / INST 725 MARINE CARPUAT marine@cs.umd.edu](https://reader035.vdocument.in/reader035/viewer/2022062506/5f06af8a7e708231d4193a3f/html5/thumbnails/55.jpg)
Biparsing with CKY
A -> fat, gordos
A -> thin, delgados
N -> cats, gatos
VP -> eats, comen
NP -> A(1)N(2),N(2)A(1)
S -> NP(1)VP(2), NP(1)VP(2)
3 comen
2 gordos
1 gatos
fat cats eats
1 2 3
A((1,1),(2,2))
N((2,2),(1,1))
VP((3,3),(3,3))
Apply lexical rules
![Page 56: Semantic Textual Similarity & more on Alignment · Semantic Textual Similarity & more on Alignment CMSC 723 / LING 723 / INST 725 MARINE CARPUAT marine@cs.umd.edu](https://reader035.vdocument.in/reader035/viewer/2022062506/5f06af8a7e708231d4193a3f/html5/thumbnails/56.jpg)
Biparsing with CKY
A -> fat, gordos
A -> thin, delgados
N -> cats, gatos
VP -> eats, comen
NP -> A(1)N(2),N(2)A(1)
S -> NP(1)VP(2), NP(1)VP(2)
3 comen
2 gordos
1 gatos
fat cats eats
1 2 3
A((1,1),(2,2))
N((2,2),(1,1))
NP((1,2),(1,2))
VP((3,3),(3,3))
S((1,3),(1,3))
For each block, apply straight & inverted rules
![Page 57: Semantic Textual Similarity & more on Alignment · Semantic Textual Similarity & more on Alignment CMSC 723 / LING 723 / INST 725 MARINE CARPUAT marine@cs.umd.edu](https://reader035.vdocument.in/reader035/viewer/2022062506/5f06af8a7e708231d4193a3f/html5/thumbnails/57.jpg)
Biparsing with CKY
3 comen
2 gordos
1 gatos
fat cats eats
1 2 3
$O(G N^3 M^3)$, where $G$ is the grammar size and $N$, $M$ are the lengths of the two sentences
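The chart walk-through can be sketched as a small bitext CKY. This is a minimal illustration for the toy grammar of this example, not a general SCFG parser: it handles only one-to-one lexical rules and binary straight/inverted rules, uses 1-indexed inclusive spans as on the slides, and parses the English side as labelled on the chart axis ("fat cats eats"). The nested loops over two spans and two split points are where the $N^3 M^3$ factor comes from.

```python
# Lexical rules: (source word, target word) -> nonterminal.
LEXICAL = {("fat", "gordos"): "A", ("thin", "delgados"): "A",
           ("cats", "gatos"): "N", ("eats", "comen"): "VP"}
# Binary rules: (parent, left child, right child, inverted on the target side?).
BINARY = [("NP", "A", "N", True), ("S", "NP", "VP", False)]

def spans(length):
    """All spans of at least two words, shortest first, as 1-indexed (start, end)."""
    return [(s, s + size - 1) for size in range(2, length + 1)
            for s in range(1, length - size + 2)]

def biparse(src, tgt):
    """Return the set of chart items (label, source span, target span)."""
    chart = set()
    # Apply lexical rules to every pair of single words.
    for i, s_word in enumerate(src, 1):
        for k, t_word in enumerate(tgt, 1):
            label = LEXICAL.get((s_word, t_word))
            if label:
                chart.add((label, (i, i), (k, k)))
    # For each block (pair of spans), try every pair of split points and every rule.
    for i, j in spans(len(src)):
        for k, l in spans(len(tgt)):
            for s_mid in range(i, j):
                for t_mid in range(k, l):
                    for parent, left, right, inverted in BINARY:
                        if inverted:   # left child takes the later target block
                            left_t, right_t = (t_mid + 1, l), (k, t_mid)
                        else:          # straight: same order on both sides
                            left_t, right_t = (k, t_mid), (t_mid + 1, l)
                        if ((left, (i, s_mid), left_t) in chart
                                and (right, (s_mid + 1, j), right_t) in chart):
                            chart.add((parent, (i, j), (k, l)))
    return chart

chart = biparse("fat cats eats".split(), "gatos gordos comen".split())
for item in sorted(chart, key=lambda it: (it[1], it[0])):
    print(item)
```

A sentence pair is accepted when the chart contains an S item covering both full sentences; the items below it give the constituent alignment.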
![Page 58: Semantic Textual Similarity & more on Alignment · Semantic Textual Similarity & more on Alignment CMSC 723 / LING 723 / INST 725 MARINE CARPUAT marine@cs.umd.edu](https://reader035.vdocument.in/reader035/viewer/2022062506/5f06af8a7e708231d4193a3f/html5/thumbnails/58.jpg)
Aligning words & constituents
2 different ways of looking at this problem:
• parse-parse-match
– assume existing parses and alignment
• biparse
– alignment = structure