
Evaluation of pairwise string alignment methods

Martijn Wieling, Jelena Prokić and John Nerbonne

Department of Computational Linguistics, University of Groningen

Feb. 20, 2009, Kampala

Martijn Wieling, Jelena Prokić and John Nerbonne 1/20


Overview

Introduction
Dataset and gold standard
Algorithms
Evaluation method
Results
Discussion


Introduction

There are many string-similarity measures based on pairwise string alignment (PWA)
Evaluations at the aggregate level show almost no performance difference between PWA methods (Heeringa et al., 2006; Wieling et al., 2007)

More sensitive evaluation techniques are needed to determine the effect at the alignment level


Dataset

Bulgarian dialect data
Transcriptions of 152 words in 197 sites
98 phonetic types


Gold standard

Automatically generated from manually corrected multiple alignment

L1: j "A - - - -
L2: - "A s - - -
L3: - "6 s - - -
L4: j "A s - - -
L5: j "A z e k a

Each transcription in the gold standard is aligned with all others (L1:L2, L1:L3, ...): 3.5 million pairwise alignments
Gap-gap alignments are removed

L2: "A s
L3: "6 s
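A minimal sketch of how pairwise alignments might be read off a multiple alignment, dropping gap-gap columns. The function name and the dictionary layout are assumptions made for illustration; the slides do not show the actual implementation.

```python
from itertools import combinations

GAP = "-"

def pairwise_from_multiple(rows):
    """Return {(name_a, name_b): (aligned_a, aligned_b)} for every pair of rows.

    rows maps a site label to its row of the multiple alignment (a token list).
    Columns in which both rows have a gap are removed.
    """
    result = {}
    for a, b in combinations(rows, 2):
        columns = [(x, y) for x, y in zip(rows[a], rows[b])
                   if not (x == GAP and y == GAP)]
        result[(a, b)] = ([x for x, _ in columns], [y for _, y in columns])
    return result

# The multiple alignment shown on the slide
rows = {
    "L1": ['j', '"A', '-', '-', '-', '-'],
    "L2": ['-', '"A', 's', '-', '-', '-'],
    "L3": ['-', '"6', 's', '-', '-', '-'],
    "L4": ['j', '"A', 's', '-', '-', '-'],
    "L5": ['j', '"A', 'z', 'e', 'k', 'a'],
}
pairs = pairwise_from_multiple(rows)
```

Here `pairs[("L2", "L3")]` gives the two-column alignment of L2 and L3 shown above.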


Pairwise alignment algorithms

Evaluated algorithms:
Regular Levenshtein algorithm
Levenshtein algorithm with swap operation
Levenshtein algorithm with PMI-generated segment distances
Pair Hidden Markov Model (Viterbi algorithm)


Regular Levenshtein algorithm

One of the most popular pairwise string alignment methods
Vowel-consonant alignment restriction

j"As     delete j      1
"As      subst. s/z    1
"Az      insert i      1
"Azi
                       3

j  "A  s  -
-  "A  z  i
1      1  1
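A sketch of the Levenshtein distance with the vowel-consonant restriction, under stated assumptions: the vowel inventory below is a small hypothetical set in the slides' ASCII notation, and the restriction is modelled by giving vowel-consonant substitutions infinite cost. Neither detail is taken from the authors' code.

```python
import math

# Assumption: a hypothetical vowel inventory; the real dataset has 98 phonetic types
VOWELS = set("aeiouAE67@")

def is_vowel(token):
    # Strip the stress mark (") before looking the symbol up
    return token.lstrip('"') in VOWELS

def sub_cost(x, y):
    if x == y:
        return 0
    if is_vowel(x) != is_vowel(y):
        return math.inf  # a vowel may not be aligned with a consonant
    return 1

def levenshtein_vc(a, b):
    """Levenshtein distance over token lists with unit indel cost."""
    n, m = len(a), len(b)
    d = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        d[i][0] = i
    for j in range(1, m + 1):
        d[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d[i][j] = min(d[i - 1][j - 1] + sub_cost(a[i - 1], b[j - 1]),
                          d[i - 1][j] + 1,   # deletion
                          d[i][j - 1] + 1)   # insertion
    return d[n][m]

print(levenshtein_vc(['j', '"A', 's'], ['"A', 'z', 'i']))
```

For the slide's example the distance is 3: delete j, substitute s/z, insert i.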


Levenshtein with swap-operation

Bulgarian dialect data often contain metathesis

Implementation: the swap operation is used whenever possible

v  r   "7
v  "7  r
   ><       1

But only when exactly the same symbols are involved

v  r   "7
v  "a  r
   1   1
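One way to sketch the swap operation is the "optimal string alignment" variant of Damerau-Levenshtein, where two adjacent tokens may be exchanged at cost 1 if exactly the same symbols are involved. This is an illustrative stand-in, not the authors' implementation, and it leaves out the vowel-consonant restriction for brevity.

```python
def levenshtein_swap(a, b):
    """Levenshtein distance over token lists with an adjacent-swap operation."""
    n, m = len(a), len(b)
    d = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        d[i][0] = i
    for j in range(1, m + 1):
        d[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j - 1] + cost,  # match or substitution
                          d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1)         # insertion
            # Swap: the last two tokens of each string are the same symbols, crossed
            if (i > 1 and j > 1 and a[i - 1] == b[j - 2]
                    and a[i - 2] == b[j - 1] and a[i - 1] != a[i - 2]):
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[n][m]

print(levenshtein_swap(['v', 'r', '"7'], ['v', '"7', 'r']))  # swap applies
print(levenshtein_swap(['v', 'r', '"7'], ['v', '"a', 'r']))  # symbols differ, no swap
```

The first call costs 1 (one swap); the second costs 2 because "7 and "a are not the same symbol.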


Levenshtein with PMI segment distances (1)

Pointwise Mutual Information (PMI): assesses the degree of statistical dependence between aligned segments (x and y)

\[ \mathrm{PMI}(x, y) = \log_2 \left( \frac{p(x, y)}{p(x)\, p(y)} \right) \]

p(x, y): relative occurrence of the aligned segments x and y in the whole dataset
p(x) and p(y): relative occurrence of x or y in the whole dataset

The greater the PMI value, the more segments tend to co-occur in correspondences
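A small sketch of the PMI computation from a list of aligned segment pairs. How the relative occurrences are counted is an assumption here: pairs are treated as unordered, and p(x) is estimated from all segments that take part in the aligned pairs.

```python
import math
from collections import Counter

def pmi_table(aligned_pairs):
    """Return {(x, y): PMI(x, y)} with each key stored in sorted order."""
    pair_counts = Counter()
    seg_counts = Counter()
    for x, y in aligned_pairs:
        pair_counts[tuple(sorted((x, y)))] += 1
        seg_counts[x] += 1
        seg_counts[y] += 1
    n_pairs = sum(pair_counts.values())
    n_segs = sum(seg_counts.values())
    table = {}
    for (x, y), count in pair_counts.items():
        p_xy = count / n_pairs
        p_x = seg_counts[x] / n_segs
        p_y = seg_counts[y] / n_segs
        table[(x, y)] = math.log2(p_xy / (p_x * p_y))
    return table

# Toy data: a always corresponds to b, and c always corresponds to d
toy = [('a', 'b'), ('a', 'b'), ('c', 'd'), ('c', 'd')]
print(pmi_table(toy))
```

In the toy data p(a, b) = 0.5 and p(a) = p(b) = 0.25, so PMI(a, b) = log2(0.5 / 0.0625) = 3.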


Levenshtein with PMI segment distances (2)

Algorithm:
Initially all string pairs are aligned using the Levenshtein algorithm
The distance between tokens x and y is set to: 0 - PMI(x, y)
Strings are repeatedly aligned with the Levenshtein algorithm using the token distances until the alignments remain constant

Advantage: the second alignment is no longer generated

v "7 n
v "7 n’ k @
1 1 1

v "7 n
v "7 n’ k @
1 1 1
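The iterative procedure can be sketched as follows. Several choices are assumptions made only to keep the sketch short and well behaved: the gap is treated as an ordinary segment, distances are shifted so that the smallest is 0, unseen pairs get a default cost, the vowel-consonant restriction is omitted, and a `max_iter` cap guards against non-convergence.

```python
import math
from collections import Counter

GAP = "-"

def align(a, b, cost):
    """Weighted Levenshtein alignment; returns a list of (x, y) token pairs."""
    n, m = len(a), len(b)
    d = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        d[i][0] = d[i - 1][0] + cost(a[i - 1], GAP)
    for j in range(1, m + 1):
        d[0][j] = d[0][j - 1] + cost(GAP, b[j - 1])
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d[i][j] = min(d[i - 1][j - 1] + cost(a[i - 1], b[j - 1]),
                          d[i - 1][j] + cost(a[i - 1], GAP),
                          d[i][j - 1] + cost(GAP, b[j - 1]))
    pairs, i, j = [], n, m
    while i > 0 or j > 0:  # trace one optimal path back to the origin
        if i > 0 and j > 0 and d[i][j] == d[i - 1][j - 1] + cost(a[i - 1], b[j - 1]):
            pairs.append((a[i - 1], b[j - 1])); i -= 1; j -= 1
        elif i > 0 and d[i][j] == d[i - 1][j] + cost(a[i - 1], GAP):
            pairs.append((a[i - 1], GAP)); i -= 1
        else:
            pairs.append((GAP, b[j - 1])); j -= 1
    return pairs[::-1]

def pmi_distances(alignments):
    """Distances 0 - PMI(x, y), shifted to be non-negative (sketch assumption)."""
    pair_counts, seg_counts = Counter(), Counter()
    for pairs in alignments:
        for x, y in pairs:
            pair_counts[tuple(sorted((x, y)))] += 1
            seg_counts[x] += 1
            seg_counts[y] += 1
    n_pairs, n_segs = sum(pair_counts.values()), sum(seg_counts.values())
    raw = {}
    for (x, y), c in pair_counts.items():
        pmi = math.log2((c / n_pairs) / ((seg_counts[x] / n_segs) * (seg_counts[y] / n_segs)))
        raw[(x, y)] = 0 - pmi
    lowest = min(raw.values())
    return {pair: value - lowest for pair, value in raw.items()}

def pmi_levenshtein(string_pairs, max_iter=20):
    unit = lambda x, y: 0.0 if x == y else 1.0
    alignments = [align(a, b, unit) for a, b in string_pairs]
    for _ in range(max_iter):
        dist = pmi_distances(alignments)
        default = max(dist.values()) + 1.0  # assumed cost for unseen pairs
        cost = lambda x, y: dist.get(tuple(sorted((x, y))), default)
        new = [align(a, b, cost) for a, b in string_pairs]
        if new == alignments:  # alignments remained constant: stop
            break
        alignments = new
    return alignments
```

Calling `pmi_levenshtein` on a list of `(tokens_a, tokens_b)` pairs returns one alignment per pair; with realistic amounts of data, frequent correspondences become cheap and are preferred in later rounds.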


Pair Hidden Markov Model

Adapted Hidden Markov Model: 2 parallel output streams
Introduced to linguistics by Mackay and Kondrak (2005)
Large number of probabilities to be estimated in training
Probabilities are linguistically sensible (Wieling et al., 2007)
After training, the Viterbi algorithm yields the most probable alignment
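A simplified sketch of Viterbi alignment in a three-state pair HMM (M emits an aligned pair, X a segment from the first string only, Y a segment from the second only). It is an assumption-laden illustration: there is no explicit end state, the model is taken to start as if coming from M, and the toy parameters are invented rather than trained.

```python
import math

NEG_INF = float("-inf")
STATES = ("M", "X", "Y")

def viterbi_align(a, b, log_trans, log_emit_pair, log_emit_gap):
    """Most probable alignment of token lists a and b as (x, y) pairs."""
    n, m = len(a), len(b)
    # v[s][i][j]: best log-probability of emitting a[:i], b[:j] and ending in s
    v = {s: [[NEG_INF] * (m + 1) for _ in range(n + 1)] for s in STATES}
    back = {s: [[None] * (m + 1) for _ in range(n + 1)] for s in STATES}
    v["M"][0][0] = 0.0
    for i in range(n + 1):
        for j in range(m + 1):
            if i == 0 and j == 0:
                continue
            for s in STATES:
                if s == "M":
                    if i == 0 or j == 0:
                        continue
                    pi, pj, emit = i - 1, j - 1, log_emit_pair(a[i - 1], b[j - 1])
                elif s == "X":
                    if i == 0:
                        continue
                    pi, pj, emit = i - 1, j, log_emit_gap(a[i - 1])
                else:
                    if j == 0:
                        continue
                    pi, pj, emit = i, j - 1, log_emit_gap(b[j - 1])
                prev = max(STATES, key=lambda p: v[p][pi][pj] + log_trans[p][s])
                v[s][i][j] = v[prev][pi][pj] + log_trans[prev][s] + emit
                back[s][i][j] = prev
    # Follow the back-pointers from the best final state
    s = max(STATES, key=lambda p: v[p][n][m])
    pairs, i, j = [], n, m
    while i > 0 or j > 0:
        prev = back[s][i][j]
        if s == "M":
            pairs.append((a[i - 1], b[j - 1])); i -= 1; j -= 1
        elif s == "X":
            pairs.append((a[i - 1], "-")); i -= 1
        else:
            pairs.append(("-", b[j - 1])); j -= 1
        s = prev
    return pairs[::-1]

# Toy, untrained parameters (hypothetical values, not normalized)
TOY_TRANS = {p: {s: math.log(0.6 if s == "M" else 0.2) for s in STATES} for p in STATES}
toy_pair = lambda x, y: math.log(0.2 if x == y else 0.01)
toy_gap = lambda x: math.log(0.05)

print(viterbi_align(['v', 'r'], ['v', 'r'], TOY_TRANS, toy_pair, toy_gap))
```

In a trained model the transition and emission probabilities would come from training on the dataset; only the dynamic-programming step is illustrated here.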


Evaluation method (1)

All pairwise alignments are generated for every algorithm
Insertion-deletion sequences are standardized

v "i A        v "i A
v "i j        v "i j

Two-to-one mappings are standardized

v "ô" x       v "ô" x
v "A r x      v "A r x


Evaluation method (2)

Each token alignment is converted to a single symbol

v  l   "7  k        v  l  "7  -  k
v  "7  l   k        v  -  "7  l  k

Generated strings:

v/v l/"7 "7/l k/k
v/v l/- "7/"7 -/l k/k
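The conversion can be sketched in a few lines; `to_symbols` is a hypothetical helper name, and alignments are assumed to be two equally long token lists with "-" marking a gap.

```python
def to_symbols(top, bottom):
    """Turn each aligned token pair into one symbol of the form 'x/y'."""
    if len(top) != len(bottom):
        raise ValueError("aligned rows must have the same length")
    return [f"{x}/{y}" for x, y in zip(top, bottom)]

print(to_symbols(['v', 'l', '"7', 'k'], ['v', '"7', 'l', 'k']))
print(to_symbols(['v', 'l', '"7', '-', 'k'], ['v', '-', '"7', 'l', 'k']))
```

The two calls reproduce the two generated strings above.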


Evaluation method (3)

The generated strings can be aligned to determine their distance

v/v l/"7 "7/l k/k
v/v l/- "7/"7 -/l k/k
1 1 1

For every algorithm, the generated strings are aligned with the generated strings of the gold standard (GS)
The distance between an algorithm and the GS is simply the sum of all generated string distances
Baseline: Hamming alignments (only substitutions)
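A sketch of the comparison step, assuming each generated string is a list of "x/y" symbols and that a plain unit-cost Levenshtein distance is used between them. The function names are illustrative.

```python
def levenshtein(a, b):
    """Unit-cost Levenshtein distance between two symbol lists."""
    n, m = len(a), len(b)
    d = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        d[i][0] = i
    for j in range(1, m + 1):
        d[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j - 1] + cost, d[i - 1][j] + 1, d[i][j - 1] + 1)
    return d[n][m]

def distance_to_gold(algorithm_strings, gold_strings):
    """Sum of distances between corresponding generated strings."""
    return sum(levenshtein(x, y) for x, y in zip(algorithm_strings, gold_strings))

generated = 'v/v l/"7 "7/l k/k'.split()
gold = 'v/v l/- "7/"7 -/l k/k'.split()
print(levenshtein(generated, gold))
```

For the example pair the distance is 3, matching the three 1s on the slide.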


Quantitative results: segment distances

PMI distances: D(a, a) < D(V, V) (t < −13, p < .001)

PHMM substitution probabilities: P(a, a) > P(V, V) > P(V, C) (t's > 9, p < .001)

PMI versus PHMM log-odds transformed substitution scores: Spearman's ρ = −.965, p < .001 (indels: Spearman's ρ = −.736, p < .001)


Quantitative results: alignments

MS: number of misaligned segments
Error rate (E): MS / 15898147 (aligned segments in GS)
0 ≤ E ≤ 2
IA: number of incorrect alignments with respect to the GS

                    MS (E)               IA (%)
Hamming             2510094 (0.1579)     726844 (20.92%)
Levenshtein          490703 (0.0309)     191674 (5.52%)
Levenshtein PMI      399216 (0.0251)     156440 (4.50%)
Levenshtein swap     392345 (0.0247)     161834 (4.66%)
PHMM Viterbi         362423 (0.0228)     160896 (4.63%)
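The error rates in the table follow directly from the definition E = MS / 15898147; the short check below recomputes them from the MS column. Variable names are illustrative.

```python
ALIGNED_SEGMENTS_IN_GS = 15898147

# MS values copied from the table above
misaligned = {
    "Hamming": 2510094,
    "Levenshtein": 490703,
    "Levenshtein PMI": 399216,
    "Levenshtein swap": 392345,
    "PHMM Viterbi": 362423,
}

def error_rate(ms, total=ALIGNED_SEGMENTS_IN_GS):
    """Error rate E: misaligned segments divided by aligned segments in the GS."""
    return ms / total

rates = {name: round(error_rate(ms), 4) for name, ms in misaligned.items()}
for name, rate in rates.items():
    print(f"{name:18s} {rate:.4f}")
```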


Qualitative results (1)

No perfect performance possible:

p "ô" v j 7 t
p "ô" v n i o

p "ô" v n i j 7 t
p "ô" v j 7 t
p "ô" v n i o


Qualitative results (2)

Problems of Levenshtein (and PMI):
1. Detecting correct alignments of a vowel with a consonant
2. Detecting metathesis
3. Aligning one consonant with either of two other consonants

Problems of Levenshtein swap:
Problems 1 and 3 of Levenshtein
Applying metathesis too often

b r @ nj "e
bj @ r "A n i
1 >< 1 1 1 1

Problems of PHMM:
Segment distances cause wrong alignments of vowels with consonants which appear often in swaps


Discussion

PHMM performs best at segment level
Slow to train: multiple hours

Quicker, clearer and also good alignments: Levenshtein PMI / Swap

Interesting further research:
Combine PMI and Swap Levenshtein methods
Verify results against a gold standard of another dataset


Any questions?

Thank You!
