Download - Unordered Trees

Transcript

8/8/2019 Unordered Trees

http://slidepdf.com/reader/full/unordered-trees 1/18

Algorithmica (1996) 15:205-222

Algorithmica9 1996Springer-VerlagNew YorkInc.

A C o n s t r a i n e d E d i t D i s t a n c e B e t w e e nU n o r d e r e d L a b e l e d T r ee s 1

K a i z h o n g Z h a n g 2

Abs trac t . This paper considers the problem of computing a constrained edit distance between unordered

labeled trees. The problem of approximate nnordered tree matching is also considered. We present dynamic

programming algorithms solving these problems in sequential time O (IT ll • IT21 • (deg(T1) + deg(T2)) •

l og2(deg (T l ) q - deg (T2) ) ) . Our previous result shows that computing the edit distance between unordered

labeled trees is NP-complete.

Key Words. Unordered trees, Constrained edit distance, Approximate tree matching.

1 . I n t r o d u c t i o n . U n o r d e r e d r o o t e d l a b e le d tr e e s a r e t re e s w h o s e n o d e s a r e l a b e le d

and i n w h i ch on l y ances t o r r e l a ti onsh i ps a r e s i gn i fi can t ( t he l e ft - to - r i gh t o rde r am ong

sibl ings i s no t s igni ficant ) . Su ch t rees ar i se na tura l ly in ma ny f ie lds : gene alogica l s tudies ,

e . g ., de t e rmi n i ng gene t i c d i s eases based on ances t ry t r ee pa t t e rns ; co mp ut e r v i s i on , e .g . ,

ob j ec t r ep resen t a t i on an d r ec ogn i t i on [7 ], [ 8 ] ; chemi s t ry , e . g ., s t udy i ng t he r e l a t ionsh i p

be t w een t he s t r uc tu r e o f chem i ca l com pou nds and t he i r p rope r t i e s [ 10 ], [ 4 ] ; fi le sys t em,

e . g ., i n fo rm a t i on r e t ri eva l in a un i x f i l e sys tem. F or ma ny such app l i ca ti ons , i t w ou l d b e

use fu l to com pare uno rde red l abe l ed tr ees by som e mean i ngfu l d i s tance me t r i c .

Th e ed i t d i s t ance me t r i c , u sed w i t h succes s [ 6] f o r o rde red l abe l ed t r ees [ 17 ] , [ 18 ], i s

one such na t u r a l me t r ic . A p rev i ous r e su l t has show n t ha t com put i ng t he ed i t d i s tance

fo r uno rde red l abe l ed t rees is N P-c om pl e t e [ 19] ( i n fac t M A X SN P-ha rd [16 ]) . S i nce

unord e red t r ees a r e i mpo r t an t in m any app l i ca t ions , on e w o u l d l i ke to f i nd d i st ances t ha t

can be e f f i c i en t ly com put ed .

Mot i va t ed by t hese r e su l t s , t h i s pape r cons i de r s a cons t r a i ned ed i t d i s t ance me t r i c

be t w een unorde red l abe l ed t r ees . The cons t r a i n t w e add i s a na t u r a l one , namel y , d i s -

j o i n t sub t r ees be m app ed t o d i s j o i n t sub tr ees . A c t ua l l y some ap p l i ca ti ons w ou l d r equ i r et h is co ns t r a i n t ( e .g . , c la s s i fi ca t ion t r ee com par i son) [ 11 ]. W e p resen t a po l yn om i a l - t i me

a l gor i t hm t o co mp ut e t h i s cons t r a i ned ed i t d is t ance me t r i c .

I n S e c t i o n 2 w e r e v i e w th e p r e v i o u s N P - c o m p l e t e n e s s a n d M A X S N P - h a r d r es u lt s

on t he ed i t d i s tance m e t r i c be t w een un orde re d t rees . In S ec t i on 3 w e i n t roduc e the new

di s t ance me t r i c be t w e en t r ees , w h i ch w e ca l l t he cons t r a i ned ed i t d is t ance . I n Sec t i on 4

w e i nves t i ga t e t he p rope r t i e s o f t he cons t r a i ned ed i t d i s t ance me t r i c . I n Sec t i on 5 w e

p r e s e n t a d y n a m i c p r o g r a m m i n g a l g o r i t h m t o c o m p u t e t h e c o n s t r a i n e d e d i t d i s t a n c e

This research was supported by the Natural Sciences and Engineering Research Council of Canada under

Grant No. OGP0046373.2 Department of Computer Science, University of Western Ontario, London, Ontario, Canada, N6A 5B7.

[email protected].

Received October 16, 1993; revised May 18, 1994. Communicated by H. N. Gabow.

8/8/2019 Unordered Trees

http://slidepdf.com/reader/full/unordered-trees 2/18

206 Kaizhong Zhang

m e t r i c a n d a n a l y s e t he c o m p l e x i ty . I n S e c ti o n 6 w e c o n s i d e r t h e a p p r o x i m a t e u n o r d e r e d

t r e e - m a t c h i n g p r o b l e m .

2 . P r e l i m i n a r i e s . I n t h is s ec t io n w e f i rs t i n t r o d u c e s o m e b a s i c de f i n it io n s a n d th e n

r e v i e w t h e p r e v i o u s r e s u l t s o n u n o r d e r e d t r e e s .

2.1 . Trees. A f re e t ree i s a c o n n e c t e d , a c y c l i c , u n d i r e c t e d g r a p h . A r oo t e d t r e e i s a

f r e e t re e i n w h i c h o n e o f th e v e r t i c e s is d i s t i n g u i s h e d f r o m t h e o t h e r s . T h e d i s t i n g u i s h e d

v e r t e x i s c a l l e d t h e r o o t o f t h e t r e e . W e r e f e r t o a v e r t e x o f a r o o t e d t r e e a s a n o d e o f t h e

t r e e . A n unor de r e d t r ee i s j u s t a r o o t e d t r e e . W e u s e t h e t e r m u n o r d e r e d t r e e t o d i s t i n g u i s h

i t f r o m t h e r o o t e d , o r d e r e d t r e e d e f i n e d b e l o w . A n or de r e d t r e e i s a r o o t e d t r e e i n w h i c h

t h e c h i l d r e n o f e a c h n o d e a r e o r d e r e d . T h a t i s, i f a n o d e h a s k c h i l d r e n , t h e n w e c a n

d e s i g n a t e t h e m a s th e f ir s t c h i l d , t h e s e c o n d c h i l d . . . . a n d t h e k t h c h i ld .

U n l e s s o t h e r w i s e s t a t e d , a l l t r e e s w e c o n s i d e r i n th e p a p e r a r e u n o r d e r e d l a b e l e d tr e e s .

2 . 2 . Edi t i ng O pe r a t i ons . W e c o n s i d e r th r e e k i n d s o f o p e ra t io n s . C h a n g i n g a n o d e n

m e a n s c h a n g i n g t h e la b e l o n n . D e l e t i ng a n o d e n m e a n s m a k i n g t h e c h i l d r e n o f n

b e c o m e t h e c h i ld r e n o f t h e p a r e n t o f n a n d t h e n r e m o v i n g n . I n s e r t i ng i s t he c o m p l e m e n t

o f d e l e t i n g . T h i s m e a n s t h a t i n s e r t i n g n a s t h e c h i l d o f m w i l l m a k e n t h e p a r e n t o f a

s u b s e t ( a s o p p o s e d t o a c o n s e c u t i v e s u b s e q u e n c e [ 1 7 ] ) o f th e c u r r e n t c h i l d r e n o f m .

S u p p o s e e a c h n o d e l a b e l i s a s y m b o l c h o s e n f r o m a n a l p h a b e t E . L e t )~, a u n i q u es y m b o l n o t in E , d e n o t e t h e n u l l s y m b o l . F o l l o w i n g [ 9] , [1 7 ] , a n d [ 1 9 ] , w e r e p r e s e n t a n

e d i t o p e r a t i o n a s a ~ b , w h e r e a i s e i t h e r )~ o r a l a b e l o f a n o d e i n t r e e 7"1 a n d b i s e i t h e r

)~ o r a l a b e l o f a n o d e i n t r e e T 2. W e c a l l a ~ b a c h a n g e o p e r a t i o n i f a r ~ a n d b r )~; a

d e l e t e o p e r a t i o n i f b = )~; a n d a n i n s e r t o p e r a t i o n i f a - - )~. L e t T 2 b e t h e t r e e t h a t r e s u l t s

f r o m t h e a p p l i c a t i o n o f a n e d i t o p e r a t i o n a -- + b t o t r e e T 1; t h i s is w r i t t e n T1 --+ T 2 v i a

a - + b .

L e t S b e a s e q u e n c e s l . . . . . s k o f e d i t o p e r a t i o n s . A n S - d e r i v a t i o n f r o m t r e e A t o t r e e

B i s a s e q u e n c e o f t r e e s A o . . . . . A k su c h th a t A = A o , B = A k , a n d A i - 1 ~ A i v i a si

f o r 1 < i < k . L e t F b e a c o s t f u n c t i o n w h i c h a s s i g n s to e a c h e d i t o p e r a t i o n a ~ b a

n o n n e g a t i v e r e a l n u m b e r y ( a ~ b ) .

W e c o n s t r a i n F t o b e a d i s t a n c e m e t r i c . T h a t i s:

( i ) ~, ( a - -+ b ) > 0 , y ( a ~ a ) = 0 .

( i i) y ( a - + b ) = ~ , ( b -- + a ) .

( i i i ) y (a - -+ c ) < ~ ' (a - -+ b ) + y (b - -+ c ) .

X-~IslW e e x t e n d F t o th e s e q u e n c e o f e d i t o p e r a t i o n s S b y l e t t i n g F ( S ) = ~-.~i= 1 F ( s i ) .

2 . 3 . E d i t i n g D i s t a n c e a n d E d i t D i s t a n c e M a p p i n g . T h e r e s u l t s i n t h is s u b s e c t i o n a r e

f r o m [ 19 ]. W e o m i t th e p r o o f s .

2 . 3 . 1 . E d i t i n g D i s t a n c e . T h e e d i t d i s ta n c e b e t w e e n t w o t r e e s i s d e f i n e d [ 1 9 ] b y c o n -

s i d e r i n g th e m i n i m u m c o s t e d i t o p e r a t i o n s s e q u e n c e t h a t t r a n s f o r m s o n e t r e e t o t h e o th e r .

8/8/2019 Unordered Trees

http://slidepdf.com/reader/full/unordered-trees 3/18

A C o n s t r a i n e d E d i t D i s t a n c e B e t w e e n U n o r d e r e d L a b e l e d T r e e s 2 0 7

F o r m a l l y t h e e d i t d i s t a n c e b e t w e e n T1 a n d T 2 i s d e f i n e d a s

De(T1 , T 2 ) = m ~ n { y ( S ) I S i s a n e d i t o p e r a t i o n s e q u e n c e t a k i n g 7"1 t o T 2 }.

2 . 3 . 2 . E d i t in g D i s ta n c e M a p p i n g s . T h e e d i t o p e ra t i o n s g i v e r i se t o a m a p p i n g w h i c h i s

a g r a p h i c a l s p e c i f ic a t io n o f w h a t e d i t o p e r a t i o n s a p p l y t o e a c h n o d e i n t h e t w o u n o r d e r e d

l a b e l e d t re e s .

G i v e n a tr e e , i t i s u s u a l l y c o n v e n i e n t t o u s e a n u m b e r i n g t o r e f e r t o t h e n o d e s o f th e

t re e . F o r a n o r d e r e d t r e e T , t h e l e f t - t o - r ig h t p o s t o r d e r n u m b e r i n g o r l e f t - t o - ri g h t p r e o r d e r

n u m b e r i n g a r e o f te n u s e d t o n u m b e r t h e n o d e s o f T f r o m 1 t o I T I, t h e s i z e o f t re e T .

S i n c e w e a r e c o n c e r n e d w i t h u n o r d e r e d t re e s , w e c a n f i x a n a r b i tr a r y o r d e r f o r e a c h o f th e

n o d e i n th e t r e e a n d t h e n u s e l e f t - t o - ri g h t p o s t o r d e r n u m b e r i n g o r le f t - t o - r i g h t p re o i~ d er

n u m b e r i n g . W e r e f e r t o t h e a b o v e a s a n u m b e r i n g o f t h e t r e e .

S u p p o s e t h a t w e h a v e a n u m b e r i n g f o r e a c h t re e . L e t t[i] b e t h e i t h n o d e o f tr e e T

i n t h e g i v e n n u m b e r i n g . F o r m a l l y w e d e f i n e a t r i p l e (Me, T~, T 2 ) t o b e a n edi t d i s tance

m a p p i n g f r o m T 1 t o T2, w h e r e M e i s a n y s e t o f o r d e r e d p a i r s o f i n t e g e r s ( i, j ) s a t i s fy i n g :

(0 ) 1 < i < I / ' 1 1 , 1 < j < I T 2I ; w h e r e I T kl i s t h e s i z e o f t r e e T k , k = 1 , 2 .

( 1 ) F o r a n y p a i r o f ( il , j l ) a n d ( i2 , j 2 ) i n M e:

( a ) i l = i2 i f f j l = j 2 ( o n e - t o - o n e ) .

(b ) q [ i l ] i s a n a n c e s t o r o f t1 [i 2] i f f t 2 [ j l ] i s a n a n c e s t o r o f t 2 [j 2 ] ( a n c e s t o r o r d e r

p r e s e r v e d ) .

W e u se M e i n s te a d o f (Me, T1, T 2) i f t h e r e i s n o c o n f u s i o n . L e t M e b e a n e d i t d i s t a n c e

m a p p i n g f r o m T 1 t o T 2. T h e n w e c a n d e f i n e t h e c o s t o f M e :

) / ( M e ) = ~ y ( t l [ i ] ~ t 2 [ j ] ) + y ( t l [ i ] ..-.-> )~)

( i , j ) c M ~ { i l ~ j s. t . ( i , j )E M e }

V()~ ~ t2[j]) .

{j[~[ i s.t . ( i , j ) c M e }

T h e r e l a t i o n b e t w e e n a n e d i t d is t a n c e m a p p i n g a n d a s e q u e n c e o f e d i t o p e r a t i o n s i s

a s f o l l o w s :

L E M M A 1 . Given S , a s equence s 1 . . . . s k o f edi t op erat ions f rom T1 to T2, an ed i tdi s tance ma pp ing Me f ro m Ta to T2 ex i s t s such that V (Me) < y ( S) . Conversely , fo r a ny

ed i t d i s t ance ma pping Me , a s equence o f ed i t opera t i ons ex i st s such t ha t • ( S ) = y (M e) .

B a s e d o n t h e l e m m a , t h e f o l l o w i n g t h e o r e m s t a t e s t h e r e l a t i o n b e t w e e n t h e e d i t d i s :

t a n c e a n d t h e e d i t d is t a n c e m a p p i n g s . T h i s i s w h e y w e c a l l t hi s k in d o f m a p p i n g e d i t

d i s t a n c e m a p p i n g .

T H E O R E M 1 .

De(T1 , T 2) = m i n { v ( M e ) ] Me i s an ed i t d is t ance m apping from T1 to T2}.Me

O n e o f t h e r e s u l t s i n [ 1 9 ] i s t h a t f i n d i n g De(T1,T2) i s N P - c o m p l e t e e v e n i f t h e t re e s

a r e b i n a r y t r e e s w i t h a la b e l a l p h a b e t o f s iz e tw o . K i l p e l a in e n a n d M a n n i l a [3 ] s h o w e d

8/8/2019 Unordered Trees

http://slidepdf.com/reader/full/unordered-trees 4/18

208 Kaizhong Zhang

t h a t e v e n t h e i n c lu s i o n p r o b l e m f o r u n o r d e r e d t re e s i n N P - c o m p l e t e . I n fa c t w e r e c e n t l y

p r o v e d a s t ro n g e r r e s u l t [ 1 6] : th a t th e p r o b l e m o f f in d i n g t h e m i n i m u m c o s t e d it d i s ta n c e

m a p p i n g ( e d it d i st a n c e ) b e t w e e n t w o u n o r d e r e d t r ee s a n d t h e p r o b l e m o f fi n d in g t h e

l a rg e s t c o m m o n s u b tr e e o f t w o u n o r d e r e d t re e s ar e b o t h M A X S N P - h a rd . T h i s m e a n s

t h at t he r e is n o p o l y n o m i a l - t i m e a p p r o x i m a t i o n s c h e m e ( P T A S ) f o r th e s e p r o b l e m s u n le s s

P = N P [ 1 ] . S i n c e u n o r d e r e d t r e e s a r e im p o r t a n t in m a n y a p p l i c a t i o n s , i t i s d e s i r e d t o

f i n d d i s t a n c e s t h a t c a n b e e f f i c i e n t l y c o m p u t e d .

3 . A C o n s t r a i n e d E d i t D i s t a n c e M e t r i c B e t w e e n U n o r d e r e d T r e es . O u r n e w d is -

t a n c e m e t r i c is b a s e d o n a c o n s t r a i n t o f t h e m a p p i n g s a l l o w e d b e t w e e n t w o t re e s . T h e

i n t u it i v e i d e a i s t h a t t w o s e p a r a t e s u b t r e e s o f 7"1 s h o u l d b e m a p p e d t o t w o s e p a r a t e s u b -

t r e e s o f T 2. T h i s i d e a w a s p r o p o s e d b y T a n a k a a n d T a n a k a [ 1 1 ] in th e i r d e f in i t i o n f o r

a s tr u c tu r e p r e s e r v i n g m a p p i n g b e t w e e n t w o o r d e r e d l a b e l e d t re e s a l t h o u g h t h e d ef i ni -t i o n in [ 1 1 ] d o e s n o t c a p t u r e t h e i d e a p r e c i s e l y . T a n a k a a n d T a n a k a [ 1 1 ] a ls o s h o w e d

t h a t in s o m e a p p l i c a t i o n s ( e .g . , c l a s s i f i c a t i o n t re e c o m p a r i s o n ) t h is k i n d o f m a p p i n g i s

m o r e m e a n i n g f u l t h a n e d it d is t a n c e m a p p i n g . I n a s e p a r a t e re s u l t [1 5 ], w e d e v e l o p e d a n

a l g o r i t h m f o r th e c o n s t r a in e d e d i t d i s t a n c e b e t w e e n o r d e r e d l a b e l e d t re e s w i t h t i m e c o m -

p l e x i t y O([TI[[T2[). T h i s p a p e r e x t e n d s t h e d e f in i ti o n f r o m o r d e r e d t r e e s t o u n o r d e r e d

t r e e s .

3.1. Constrained Edi t Distance Mappings. S u p p o s e a g a i n t h a t w e h a v e a n u m b e r i n g

f o r e a c h t r e e . L e t t[i] b e t h e i t h n o d e o f t re e T i n th e g i v e n n u m b e r i n g . L e t T[i] b e t h e

s u b t r e e r o o t e d a t t [ i] a n d l e t F [ i ] b e t h e u n o r d e r e d f o r e s t o b t a i n e d b y d e l e t i n g t [ i] f r o m

T[i].

F o r m a l l y w e d e f i n e a t r ip l e ( M , / ' 1 , T 2) t o b e a constrained ed it distance map pin g

f r o m T 1 t o T 2 , w h e r e M i s a n y s e t o f o r d e r e d p a i r s o f i n t e g e r s ( i, j ) s a t i s f y i n g :

( 1 ) M i s a n e d i t d i s t a n c e m a p p i n g .

( 2 ) F o r a n y t r i p l e ( i l , j l ) , ( i2 , j 2 ) , a n d ( i 3 , j 3 ) in M l e t q[1] b e th e l c a ( q [ i l ] , q [ i 2 ]) a n d

l e t t 2 [ J ] b e t h e l c a ( t 2 [ j l ] , t 2 [ j 2 ]) , w h e r e l c a r e p r e s e n t s l e a s t c o m m o n a n c e s t o r , t l [ I ]

i s a p r o p e r a n c e s t o r o f tl [ i3 ] i f f t2[J] i s a p r o p e r a n c e s t o r o f t z [ j3 ] .

T h i s d e f i n i t io n a d d s c o n d i t i o n ( 2 ) to t h e d e f i n i ti o n o f t h e e d i t d i s t a n c e m a p p i n g .W e e x a m i n e t h e m e a n i n g o f t h e c o n s tr a i n e d e d i t d i s t a n c e m a p p i n g s . S u p p o s e t l [ I ] i s a

d e s c e n d a n t o f t l [ i3 ] , t h e n t l [ i l ] a n d t l [i2 ] m u s t b e d e s c e n d a n t s o f t l [ i3 ] . B y c o n d i t i o n

( 1 ) t 2 [ j l ] a n d t z [ j2 ] m u s t b e d e s c e n d a n t s o f t2 [ j3 ] . T h i s i n t u r n m e a n s t h a t t 2 [ J ] i s a

d e s c e n d a n t o f T 2 [j 3 ]. T h e r e f o r e c o n d i t i o n ( 1 ) i m p l i e s t h a t t l [ I ] i s a d e s c e n d a n t o f t l [i3 ]

i f a n d o n l y i f t z [ J ] i s a d e s c e n d a n t o f tz [ j3 ] . C o n d i t i o n ( 2 ) a d d s t h e c o n s t r a i n t t h a t tl [ I ]

i s a p r o p e r a n c e s t o r o f t1 [i 3] i f a n d o n l y i f t z [ J ] i s a p r o p e r a n c e s t o r o f t z [j 3 ]. S u p p o s e

n o w t h a t n e i t h e r q [ I ] n o r q [ i 3 ] i s a n a n c e s t o r o f t h e o th e r , t h e n b y ( 1 ) a n d ( 2 ) n e i t h e r

t z [ J ] n o r t 2 [ j3 ] i s a n a n c e s t o r o f t h e o th e r .

C o n s i d e r t w o s e p a r a t e s u b t r e e s T 1 [ i l ] a n d T 1 [ i2 ] s u c h t h a t n e i t h e r t l [ i l ] n o r t l [i2 ] i s

a n a n c e s t o r o f t h e o th e r. G i v e n a m a p p i n g M , l e t

M k = { n I ( m , n ) c M a n d t l [ m ] i s a n o d e o f T1 [ik ]}

w h e r e k = 1 , 2 . L e t t z [ j k ] = l e a ( M k ) f o r k = 1 , 2 . T z [ j k ] c a n b e c o n s i d e r e d a s t h e

8/8/2019 Unordered Trees

http://slidepdf.com/reader/full/unordered-trees 5/18

A Constrained Edi t Dis tance Between Unordered Labeled Trees 209

i m a g e o f T l [i k ] u n d e r m a p p i n g M . W i t h o u r d e f in i ti o n o f t h e c o n s t ra i n e d e d i t d i s ta n c e

m a p p i n g , n e i t h e r t 2 [ j l ] n o r t2[j2] i s an ances to r o f t he o the r . Th i s co inc ide s w i th t he

i n tu i ti v e i d e a t h a t s e p a r a te s u b t r e e s o f T l s h o u l d b e m a p p e d t o s e p a r a t e s u b t re e s o f / ' 2 .

W e u s e M i n s te a d o f ( M , T l , T 2 ) i f t h e r e is n o c o n f u si o n . L e t M b e a c o n s t r a in e d e d i td i s t ance m ap p in g f r om T1 to T2 ; t he cos t o f M, F ( M ) , i s w e l l de f ined s i nce M i s a ls o an

e d i t d is t a n c e m a p p i n g .

C o n s t r a i n e d e d i t d is t a n c e m a p p i n g s c a n b e c o m p o s e d . L e t M l b e a c o n s tr a i n e d e d i t

d i s ta n c e m a p p i n g f r o m T1 a n d T 2, a n d l e t M 2 b e a c o n s t r a in e d e d i t d is t a n c e m a p p i n g

f ro m T2 to T3 . D ef ine

M1 o 342 = {(i, j ) I 3k s . t . ( i , k) c M1 and (k , j ) c M e } .

L E M M A 2 .

(1) M1 o M2 is a con straine d edit dis tance m app ing betw een 7"1 an d T3.

( 2 ) ) / ( M 1 o M 2 ) < y ( M 1 ) + y ( M 2 ) .

PRO OF. ( 1 ) f o l l ow s f r om the de f in i ti on o f cons t r a ine d ed i t d i s t ance m app ings . We che ck

cond i t i on ( 2 ) on ly . Le t ( i l , j l ) , ( i 2 , j 2 ) , and ( i3 , J3) be in M1 o ME. By the def in i t ion of

M1 o M2, there a re k l , k2 , and k3 such tha t ( i l , k l ) , ( i2 , k2) , and ( i3 , k3) a re in M1 and

( k l , j l ) , ( k 2 , j 2 ) , and (k3 , j3 ) a re in M2. Le t I be the lca( i l , i2 ) , l e t K be the lca(k l , k2) ,

and l e t J b e t he l c a ( j l , j 2 ) . By t he de f in i ti on o f M1 and M2 , I i s a p r op e r ances to r o f i3

i f f K i s a p r o pe r ance s to r o f k3. M or eov e r , K i s a p r o pe r ance s to r o f k3 i f f J i s a p r ope r

ances to r o f j 3 . The r e f o r e I i s a p r o pe r an ces to r o f i3 if f J i s a p r o pe r ance s to r o f J 3.

( 2 ) Le t MI be t he cons t r a ined ed i t d i s t ance mapp ing f r om T1 to T2 , l e t / I / / 2 be t he

cons t r a ined ed i t d i s t ance mapp ing f r om T2 to T3 , and l e t M1 o M2 be t he compos ed

cons t r a ined ed i t d i s tance m app ing f r o m T1 to T3. Th r ee gene r a l s i t ua ti ons occu r : ( i , j )

M1 oM 2, i ~ M1 , o r j ( M2 . I n each ca s e th i s co r r e s p ond s t o an ed i t ope r a t i on F ( x --+ y )

w h e r e x and y m ay be n odes o r ma y be )~. I n a l l s uch ca s e s , t he t r iang l e i nequa l i t y on

the d is tan ce m et r ic y ensu res tha t F (x ~ y) < F (x - -~ z ) + F (z ~ y) . [ ]

3.2. A Cons t r a ined Ed i t D is tance Be tween Trees . W e c a n n o w d e f in e a d i s s i m i l a ri t y

m eas u r e be tw een T1 and T2 a s :

D ( T 1 , T2) = m in{ g ( M ) [ M i s a cons t r a ine d ed i t d i s t ance ma pp ing f r om T1 to T2}.M

I n f ac t th i s d i s s imi l a r i t y mea s u r e i s a d i s t ance me t r i c .

T H E O R E M 2 .

(1 ) D ( T , T ) = O .

(2 ) D ( T 1 , T2) = D(T2, T~) .

(3 ) D(T~, T3) < D(T~, T2) + D(T2, T3).

PRO OF. ( 1 ) and ( 2 ) f o l l ow d i r ec t l y f r om the de f in i t i on o f t he cons t r a ined ed i t d i s t ance

m a p p i n g . F o r ( 3 ) , c o n s i d e r t h e m i n i m u m c o s t c o n s t r a i n e d e d i t d i s t a n c e m a p p i n g s M 1

8/8/2019 Unordered Trees

http://slidepdf.com/reader/full/unordered-trees 6/18

2 1 0 K a iz h o n g Z h a n g

b e t w e e n T1 a n d T 2 a n d M 2 b e t w e e n T 2 a n d / ' 3 . I t i s e a s y t o s e e t h e f o ll o w i n g :

D(T1, T 3) _< y ( M 1 o M 2 ) 5 y ( M 1 ) + z ( M 2 ) = D(T~, T2) + D(T2, T3) . [ ]

W e c a ll t h i s d i s t a n c e m e t r i c t h e c o n s t r a i n e d e d i t d i s t a n c e . T h e r e l a ti o n b e t w e e n t h e

c o n s t r a i n e d e d i t d i s t a n c e m e t r i c D a n d th e e d i t d i s t a n c e m e t r i c D e i s De(T1 , T2) _<

D(T1, T 2 ). T h e r e a s o n i s t h a t a n y c o n s t r a in e d e d i t d i s t a n c e m a p p i n g i s a l w a y s a n e d i t

d i s t a n c e m a p p i n g .

4 , Proper t ie s o f the Co ns tra ined Ed i t D is tance . I n th i s s e c t i o n w e p r e s e n t s e v e r a l

l e m m a s w h i c h a r e t h e b a s i s fo r th e a l g o r i t h m i n t h e n e x t s e ct io n . W e u s e 0 t o r e p r e s e n t

t h e e m p t y t r ee .

L E M M A 3. Let h[ i l ] , t l [i 2 ] . . . . . ta[ini] be the children o f t l [ i ] and let t 2 [ j l ] , t 2 [ j 2 ] ,

. . . . tz[jnj] be the children of tz[j]. Then

D(O , O) =

D (FI[i] , O) =

D(O, F 2 [ j ] ) =

,

ni

Z D(T l[ ik] , 0) ,k = l

nj

Z D ( O , T 2 [ j k ] ) ,k = l

D(TI[ i ] , O )= D(F I[ i ] , O) + y ( t l [ i ] - + ~ . ) ,

D(O, T 2 [ j ] ) : - D ( O , F 2 [ j ] ) + y () ~ ~ t 2 [ j ] ) .

P R O O F . T r i v i a . [ ]

L E M M A 4 . Letq[ ia] , t 1 [i 2] . . . . . t l[ i, i] beth ec hi ld ren ofq [i] and let t2[j l ], t 2 [j 2 ] . . . . .

t2[Jnj] be the children o f t2[j]. Then

D(TI[ i ] , T 2 [ j ] ) = m i n

D(O, T 2 [ j ] ) + m i n {D (TI [i], T2[jt]) - D(O , T2[jt])},l<t<nj

D (TI[ i], O) + m i n {D(Tl[is], T 2 [ j ] ) - D(TI[is], 0 ) } ,l<s<n i

D(FI[ i ] , F 2 [ j ] ) + ~ ) ( t l [i ] ~ t 2 [ j ] ) .

P R O O F. C o n s i d e r th e m i n i m u m c o s t c o n s t ra i n e d e d it d i s ta n c e m a p p i n g M b e t w e e n

T I [ i ] a n d T 2 [ j ]. T h e r e a r e f o u r c a s e s :

( 1 ) i e M and j qg M .

( 2 ) i C M a n d j e M .

( 3 ) i e M a n d j 6 M .

( 4 ) i C M and j C M .

Case 1 . L e t ( i, t ) i n M . S i n c e j r M , t m u s t b e a n o d e in F 2 [ j ] . L e t t2[jt] b e t h e

c h i l d o f t 2 [ j ] o n t h e p a t h f r o m t z [ t] t o t2 [ j ] . T h u s D(TI[i] , T 2 [ j ] ) = D(T1 [ i ] , T2 [ j t ] ) - 4 -

8/8/2019 Unordered Trees

http://slidepdf.com/reader/full/unordered-trees 7/18

A Constrained Edit Distance Betwee n Unordered Labeled Trees 2 11

D ( O , T 2 [ j l ] ) + - . . + D ( O , T2 [ j t -1 ] ) + D( O , T2 [ j t + I ] ) + . . . + D( O , T2 [j nj ]) + y ( L , t 2 [ j ] ) .

S i n c e D ( O , T 2 [ j ] ) ----- y ( / k , t 2 [ j ] ) + ~ : - - 1 D( O, Tz [ j k ] ) , w e c a n r e w r i t e t h e r i g h t - h a n d

s i d e o f th e f o r m u l a a s D ( O , T 2 [ j ] ) + D (T I[ i ] , T2[ jt ]) - D (O, T2[ j t ] ). S i n c e t h e ra n g e o f

k i s f r o m 1 t o n j , w e t a k e t h e m i n i m u m o f th e s e c o r r e s p o n d i n g c o s ts .

C a s e 2 . S i m i l a r t o c a s e 1 .

C a s e 3 . S i n c e i ~ M a n d j c M , b y t h e c o n d i t i o n o f c o n s t r a i n e d e d i t d i s t a n c e m a p p i n g ,

( i, j ) m u s t b e in M . S i n c e M - { (i , j ) } i s a c o n s t r a i n e d e d i t d i s t a n c e m a p p i n g b e t w e e n

F t [ i ] a n d F2 [ j ] , a n d f o r a n y c o n s tr a i n e d e d i t d i s ta n c e m a p p i n g M ' b e t w e e n F 1 [ i] a n d

F z [ j ] , ( i, j ) tO M ' i s a c o n s t r a i n e d e d i t d i s ta n c e m a p p i n g b e t w e e n T I [ i] a n d T2[ j ] , w e

k n o w D ( T I [ i ] , Tg_[j]) = D ( F I [ i ] , F 2 [ j ] ) + y ( q [ i ] , tv_[j]).

C a s e 4 . S i m i l a r t o C a s e 3. T h e f o r m u l a w o u l d b e D ( T I [ i ] , T z [ j ] ) = D ( F 1 [ i ] , F z [ j ] ) +

y ( t l [ i ] , ~ ) + y 0 ~ , t 2 [ j ] ) . S i n c e ~ ' ( q [ i ] , t 2 [ j ] ) _< ~' ( q [ i ] , ~ .) + y ( ~ ., t2 [ j ] ) , w e d o n o t h a v et o i n c l u d e t h is c a s e i n o u r f in a l f o r m u l a . [ ]

B e f o r e w e p r o c e e d t o t h e n e x t l e m m a w e n e e d t h e f o l l o w i n g d e f in i ti o n.

A r e st ri ct e d m a p p i n g R M ( i , j ) b e t w e e n F I [ i] a n d F 2 [ j ] i s d e f i n e d a s f o l l o w s :

( 1 ) R M ( i , j ) i s a c o n s t r a i n e d e d i t d i s t a n c e m a p p i n g b e t w e e n F 1 [ i] a n d F 2 [ j ] .

( 2 ) I f ( l , k ) i s i n R M ( i , j ) , q [ l ] i s i n Tl[is], a n d t 2 [ k ] i s i n T2[jt], t h e n , f o r a n y ( l l, k l)

i n R M ( i , j ) , t l [ l l ] i s i n T~[is] i f a n d o n l y i f t 2 [ k l] i s in T2[jt].

S i n c e a r e s t ri c t e d m a p p i n g i s a c o n s t r a i n e d e d i t d is t a n c e m a p p i n g , t h e c o s t o f ar e s tr i c te d m a p p i n g i s w e l l d e f in e d . T h e m e a n i n g o f th e r e s tr i ct i o n i s t h a t n o d e s f r o m a

s u b t r e e T 1 [i~ ] c a n o n l y m a p t o n o d e s o f o n e s u b t r e e T2[jt] a n d v i c e v e r s a . I t t u r n s o u t t h a t,

a l t h o u g h t h e r e a r e e x c e p t i o n s , t h i s is t h e g e n e r a l c a s e w h e n w e c o n s i d e r t h e c o n s t r a i n e d

e d i t d i s t a n c e m a p p i n g b e t w e e n / 7 1 [i ] a n d F 2 [ j ] .

L E M M A 5 . L e t q [ i l ] , t t[ i 2 ] . . . . . q [ in , ] b e t h e c h i l d r e n o f q [ i ] a n d l e t t 2 [ j l ] , t 2 [ j 2 ] ,

. . . . t 2[ jn j] b e t h e c h i l d r e n o f t 2 [ j] . Th e n

D ( F I [ i ] , F 2 [ j ] ) = m i n

D ( O , F z [ j ] ) + l m in n {D ( Fl [ i ] , F 2 [ j t] ) - D( O , Fz [ j t ] ) } ,

D ( F I [ i ] , O ) + m i n {D( F~[ i s ] , 0 ) } ,l<s<ni F 2 [ j ] ) - D ( V l [ i s ] ,

m i n RM(i ,j) y ( R M ( i , j ) ) .

P R O O F . W e c o n s i d e r t h e m i n i m u m - c o s t c o n s t r a in e d e d i t d i s t a n c e m a p p i n g M b e t w e e n

Ft[ i ] a n d F 2 [ j ] . T h e r e a r e f o u r c as e s .

C a s e 1 . T h e r e i s a 1 < t < n j s u c h t h a t i f ( k , l ) ~ M , t h e n t 2 [l ] i s a n o d e i n s u b t r e e

T2[jt]; a n d t h e r e a r e ( k t , l l ) a n d ( k 2, 12 ) i n M s u c h t h a t q [ k t ] i s a n o d e i n T l [ i s l ] a n d

q [ k 2 ] i s a n o d e i n T t [i s 2] , w h e r e 1 < Sl ~ s2 < ni. N o t e t h a t i n t h i s c a s e j t c a n n o t b e

i n M . T h i s i s s i m i l a r t o c a s e 1 i n L e m m a 4 , a n d h e n c e w e h a v e t h e f o l l o w i n g f o r m u l a :

D ( F t [ i ] , F 2 [ j ] ) = D ( O , F 2 [ j ] ) + minl<_t<_nj{D(Fl[ i], F2 [j t] ) - D (O , F2 [j t] )}.

8/8/2019 Unordered Trees

http://slidepdf.com/reader/full/unordered-trees 8/18

212 KmzhongZhang

C a s e 2 . T he re i s a 1 < s < n i s u c h t h a t i f ( k , l ) ~ M, t h e n q [ k ] i s a n o d e i n s u b t r e e

7"1 [ is ] ; an d th ere are (k l , l l ) an d (k2, l z ) in M suc h tha t t2 [ /1 ] is a nod e in T2[j t , ] an d t2[12]

i s a n o d e i n T 2 [ j t : ] , w h e r e 1 < t l ~ t 2 < n j . T h i s i s s i m i l a r t o c a s e 1 .

C a s e 3 . T h e r e a r e s a n d t s u c h t h a t i f ( k , l ) 6 M , t h e n t l [ k ] i s n o d e i n Tl [ i s ] a n d t 2 [l ]i s n o d e i n Tz[ i t ] . I n th i s c a s e M i s a r e s t r ic t e d m a p p i n g a n d t h e r e f o r e D ( F 1 [ i ] , F 2 [ j ] ) =

m i nR M ( i, j ) y ( R M ( i , j ) ) .

C a s e 4 . T he re a re (k l , l l ) , (k2, 12 ) , (x l , Y l ) , (x2, Y2) in M suc h tha t q [ k l ] i s a n o d e i n

T1 [ is , ] , t l [k2] is a n o de in T1 [ is~] , t2 [y l] is a no d e in T2[it, ] , a n d t 2 [ y 2 ] is a n o d e i n T2[it2],

w h e r e 1 < s l ~ s2 <_ ni an d 1 < t l ~ t2 < n j . W e s h o w t h a t i n t h is c a s e t h e c o n s t r a i n e d

e d i t d is t a n c e m a p p i n g M i s a r e s t r ic t e d m a p p i n g b e t w e e n F 1 [ i ] a n d F 2 [ j ] .

S u p p o s e t h is i s n o t t ru e . T h e n , w i t h o u t l o s s o f g e n e r a l i ty , w e a s s u m e t h a t th e r e a r e

( a l , b l ) a n d ( a 2, b 2 ) i n M s u c h t h a t q [ a l ] a n d t l [a 2 ] b e l o n g t o t h e s a m e s u b t r e e a n d

t 2 [ b l ] a n d t 2 [b z ] b e l o n g t o d i f f e r e n t su b t r e e s . L e t ( a 3, b 3 ) e M s u c h t h a t q [ a 3 ] a n dt l [ a l ] b e l o n g t o d i f f e r e n t s u b t r e e s o f F 1 [ i ] . N o w c o n s i d e r l c a ( q [ a 1 ], f i [ a 2 ]) a n d t l [ a 3 ] ,

S i n c e q [ a l ] a n d t l [ a2 ] b e l o n g t o t h e s a m e s u b t r e e w h i c h i s d i ff e r e n t f r o m t h e s u b t r e e

t l [ a 3 ] b e l o n g s t o , w e k n o w t h a t l c a ( t l [ a l ] , t l [ a 2 ]) a n d t l [ a 3 ] d o n o t h a v e t h e a n c e s t o r o r

d e s c e n d a n t r e l a t i o n s h i p . H o w e v e r , i f w e c o n s i d e r l c a ( t2 [ b l ] , t 2 [ b 2 ] ) a n d t2 [ b3 ] , i t is e a s y

t o s e e t h a t l c a ( t 2 [ b l ], t 2 [b 2 ]) = t 2 [ j ] , w h i c h i s a n a n c e s t o r o f t 2[ b3 ]. T h i s m e a n s t h a t M

i s n o t a v a l i d c o n s t r a i n e d e d i t d i s t a n c e m a p p i n g . C o n t r a d i c t i o n . [ ]

F r o m L e m m a 5 , i n o r d e r t o c o m p u t e D ( F I [ i ] , F 2 [ j ] ) w e h a v e t o c o m p u t e

minRM(i,j) y ( R M ( i , j ) ) . F r o m t h e d e fi n it i on o f th e r e s tr i ct e d m a p p i n g , w e k n o w t h a t

n o d e s o f o n e s u b t re e o f F 1 [ i ] c a n o n l y m a p t o n o d e s o f o n e s u b t re e o f F 2 [ j ] . T h i s m e a n s

t h a t i f Tl [ i s ] i s i n v o l v e d i n t h e m a p p i n g , t h e n i t w i l l m a p t o T 2 [ j t] f o r s o m e 1 < t < n j ;

i f i t i s n o t i n v o l v e d i n t h e m a p p i n g , t h e n i t w i l l m a p t o a n e m p t y t r e e. T h i s g i v e s a

k i n d o f m a t c h i n g b e t w e e n s u b t re e s o f F I [ i ] a n d s u b t re e s o f F 2 [ j ] . T h e n e x t t w o l em -

m a s e s t a b l i s h t h e r e l a t i o n s h i p b e t w e e n m i n R M ( i, j) y ( R M ( i , j ) ) a n d D ( T 1 [is ], T 2[ j t] ) ,

w h e r e 1 < s < n i a n d 1 < t < r / j . W e n e e d t h e f o l l o w i n g d e f i n i t i o n s i n t h e n e x t t w o

l e m m a s .

L e t I = { i l , i2 . . . . . in~}, J = { j l , j2 . . . . . j n j } , A = { a l , a 2 . . . . . a n j }, a n d B =

{ b l , b z . . . . . b n, }. W e u s e A a n d B t o r e p r e s e n t e m p t y t r e e s . L e t L = I U A a n d R = J t3 B ,

w e d e f i n e a w e i g h t e d b i p a r t i t e g r a p h G ( i , j ) = ( V , E ) a s f o l lo w s :

(1 ) V = L U R .

( 2 ) E = L • R , w h e r e • s t a n d s f o r t h e C a r t e s i a n p r o d u c t .

(3 ) w ( u , v ) = D ( T I [ u ] , T 2 [ v ]) i f u 6 I a n d v 6 J ,

w ( u , v ) =- D ( T l [ u ] , O ) i f u c I and v c B ,

w ( u , v) = D ( O , T 2 [ v ] ) i f u 6 A a n d v 6 J ,

w ( u , v ) = O i f u E A a n d v 6 B .

A m a t c h i n g o n G ( i , j ) i s a s u b s e t o f e d g e s M C E s u c h t h a t, f o r a l l v e r t i c e s v 6 V ,

a t m o s t o n e e d g e o f M is i n ci d e n t o n v . T h e s i z e o f M , I M h i s th e n u m b e r o f e d g e s i n

M . A m a x i m u m m a t c h i n g i s a m a t c h i n g w i t h m a x i m u m c a r di n a li ty , th a t is , a m a t c h i n g

M s u c h t ha t, f o r a n y m a t c h i n g M ' , w e h a v e I M I > I M ' I . L e t M M ( i , j ) b e a m a x i m u m

m a t c h i n g o n G ( i , j ) , t h e n I M M ( i , J ) l = n i + n j .

8/8/2019 Unordered Trees

http://slidepdf.com/reader/full/unordered-trees 9/18

A C o n s t r a i n e d E d i t D i s t a n c e B e t w e e n U n o r d e r e d L a b e l e d T r e e s 2 1 3

L e t M M ( i, j ) b e a m a x i m u m m a t c h in g o n G ( i , j ) . T h e c o s t o f M M ( i , j ) , t h e s u m -

m a t i o n o f t h e w e i g h t s o f th e e d g e s i n M M ( i, j ) , i s d e f in e d a s f o l l o w s :

y ( M M ( i , j ) ) = y~ w(u, v)(u, v) ~M M(i, j )

n~d~ D(T1 [ u l , T 2 [ v ] )(u,v)EM M a E1 a n d v E J

+ Z D(TI[u] , O)

(u,v)6MM a n d uel and oeB

Z D(O, T 2 [ v ] ) .

(u.v)EMM and ueA an d vcJ

L E M M A 6 . minRM( i, j) y (R M ( i , j ) ) = minMM(i,j) y (M M ( i , j ) ) .

P R O O F . G i v e n a r e s t r i c t e d m a p p i n g R M ( i , j ) , w e d e f i n e a m a x i m u m m a t c h i n g M M ( i , j )

o n G ( i , j ) a s f o l lo w s :

M M ( i, j ) = H M M U IM M U J M M U K M M ,

w h e r e

HM M : {(is, jr) [ t h e r e i s (k, l) 9 R M ( i , j )

s . t . q[k] ( t 2 [ / ] ) i s a n o d e i n Tl[is] (T2[jt])},

IMM = { ( i s , b s ) I t h e r e i s n o j t s . t . (is, jr) 9 HMM},

JMM : { ( a t , j r ) [ t h e r e i s n o i s s .t . ( i s , jr ) E HMM},

KMM = { ( a s , bt) I (it, js) E HMM }.

B y t h e d ef i n it io n o f a r e s tr i c te d m a p p i n g t h is i s i n d e e d a m a x i m u m m a t c h i n g . F u r t h e r-

m o r e , i t is e a s y t o s e e t h a t y ( M M ( i , j ) ) < y ( R M ( i , j ) ) .O n t h e o t h er h an d , g i v en a m a x i m u m m a t c h i n g M M (i, j ) , w e c a n c o n s t r u c t a r e s t r i c t e d

m a p p i n g R M ( i , j ) a s f o l l o w s :

R M ( i , j ) = ( k , I )

t h e r e e x i s t s ( i s , Jr) 9 M M ( i , j ) s . t . ( k , l ) i s i n t h e

m i n i m u m c o s t c o n s t r a in e d e d i t d is t an c e m a p p i n g

b e t w e e n Tl[is] a n d T 2 [ j t ]

I t i s c l e a r t h a t t h i s i s i n d e e d a r e s t r i c t e d m a p p i n g . T h e r e f o r e ,

y ( m m ( i , j ) ) .H e n c e m i n R M ( i , j ) v ( R M ( i , j ) ) = m i n M M ( i , j ) y ( M M ( i , j) ) .

y ( R M ( i , j )) <

[ ]

C o m b i n i n g L e m m a s 5 a n d 6 , w e h a v e p r o v e d t he f o l lo w i n g l e m m a .

8/8/2019 Unordered Trees

http://slidepdf.com/reader/full/unordered-trees 10/18

214 Kaizhong Zhang

L E M M A 7 . L e t q [ i l ] , q [ i2 ] . . . . . t l[ im ] b e t h e c h il d re n o f t l[ i ] a n d l e t t 2 [ j l ] , t 2 [ J 2 ] ,

. . . . t 2[ jn i] be the ch i ldren o f t 2 [ j ]. T hen

D ( O , F 2 [ j ] ) + m i n D ( F l [ i ] , F e [ j t ] ) - D ( O , F z [ j t ] ) ,1<t <nj

D ( F l [ i ] , F 2 [ j ] ) = m i n D ( F l [ i ] , O ) + m i n D ( F l [ i s ] , F 2 [ j ] ) - D(F 1[ i s ] , 0 ) ,1<s <ni

m i n y ( M M ( i , j ) ) .MM(i,j)

5 . A l g o r i t h m a n d C o m p l e x i t y . W e f ir st c o n s i de r h o w t o c o m p u t e

m i n y ( M M ( i , j ) )MM(i,j)

a n d t h e n p r e s e n t o u r s i m p l e a l g o r i th m .

5 . 1 . A l g o r i t h m . F r o m t h e d e f in i t i o n o f M M ( i, j ) a n d y ( M M ( i , j ) ) , i t i s c l e a r t h a t t h i s

p r o b l e m i s t he m i n i m u m c o s t m a x i m u m b i pa r ti te m a t c h i n g p ro b l e m . T h e r e f o r e w e c a n

u s e th e w e i g h t e d m a x i m u m m a t c h i n g a l g o r i th m t o so l v e t hi s p ro b l e m . H o w e v e r , s i n c e

w e h a v e m a n y e m p t y t re e s in g r a p h G ( i , j ) , t h i s w i l l r e s u l t i n r e d u n d a n t c o m p u t a t i o n .

I n st e ad , w e r e d u c e t h is p r o b l e m d i re c t l y to t he m i n i m u m c o s t m a x i m u m f lo w p r o b l e m

b y a d d i n g o n l y t w o e m p t y t r e e s , o n e t o F 1 [ i] a n d t h e o t h e r t o F e [ j ] .G i v e n F l[ i ] a n d F 2 [ j ] , l e t I = { i l , i2 . . . . . . in ,} a n d J = { j l , j2 . . . . . j n j } , w h e r e i k,

1 < k < hi, r e p r e s e n t s t r e e Tl[ik], a n d j l , 1 < l < n j , r e p r e s e n t s t r e e T 2 [ j d .

W e c o n s tr u c t a g r a p h G = ( V , E ) a s f o l lo w s :

v e r t e x s e t: V = {s, t, e l, ej } t.J I U J, w h e r e s i s t h e s o u r c e , t i s t h e s i n k ,

a n d e i a n d e j r e p r e s e n t t w o e m p t y t r e e s ;

e d g e s e t : [ s , i k ] , [ s , eli , [ j l , t] , [ej , t] w i t h c o s t z e r o , [ i k, j l ] w i t h c o s t

D ( T l [ i k ] , T 2 [ j t ] ) , [el, jt] w i t h c o s t D ( O , T 2 [ j l ] ) , [ i k , ej] w i t h c o s t

D(Tl[ ik] , 0) , and [e l , e j ] w i t h c o s t z e r o . A l l th e e d g e s h a v e c a p a c i t y o n e

e x c e p t [ s , el], [ei, ej], a n d [ej, t] w h o s e c a p a c i t i e s a r e n j , m a x { n / , n j } -m i n { n i , n j } , an d h i , r e s p e c t i v e l y .

G i s a n e t w o r k w i t h i n te g e r ca p a c i t ie s , n o n n e g a t i v e c o s t s, a n d th e m a x i m u m f l ow f * =

n i -1- n j ( s e e F i g u r e 1 ) . L e t c o s t ( G ) b e t he c o s t o f t h e m i n i m u m c o s t o n m a x i m u m f lo w s

o f G , w h e r e o n l y t h e c o s t s o n p o s i ti v e f lo w s a r e c o n s id e r e d . T h e f o l l o w i n g l e m m a s t a te s

t h e r e l a t i o n b e t w e e n c o s t ( G ) an d min MM( i , j ) y ( M M ( i , j ) ) , w h i c h e n a b l e s u s t o u s e t h e

m i n i m u m c o s t f lo w a l g o r i t h m t o c o m p u t e m in M M (i.j) g ( M M ( i , j ) ) .

L E M M A 8 . c o s t ( G ) = m in M M ( i,j ) y ( M M ( i , j ) ) .

P R O O F . G i v e n a n M M ( i , j ) , y ( M M ( i , j ) ) r e p r e s e n t s t h e c o s t o f t h e f o l l o w i n g m a x i m u m

f l o w O n G : f o r a n y ( is , j r ) c M M ( i , j ) t h e f l o w o n e d g e [ is , Jt] i s o n e ; f o r a n y ( i s , b t ) E

8/8/2019 Unordered Trees

http://slidepdf.com/reader/full/unordered-trees 11/18

A C o n s t r a i n e d E d i t D i s t a n c e B e t w e e n U n o r d e r e d L a b e l e d T r e es

iJ

2 1 5

i k I , D O t [ i t ] , T 2 [j d) j l

1 ,

2,0

F i g . 1 . R e d u c t i o n t o t h e m i n i m u m c o s t f l ow p r o b l e m .

M M ( i , j ) t h e f l o w o n e d g e [ i s , e j] i s o n e ; f o r a n y ( a s , j r ) ~ M M ( i , j ) t h e f l o w o n e d g e

[ei, j t ] i s o n e ; t h e f l o w o n e d g e

[el, e j ] = I { ( u , V ) I (U , V ) E M M ( i , j ) a n d u E A a n d v c B } I ;

t h e f l o w s o n e d g e I s , i k] a n d e d g e [ j l , t] a r e o n e ; t h e f l o w o n e d g e [ s , el] i s n j; t h e f l o wo n e d g e [ej , t] i s n i; a n d a l l t h e f l o w s o n t h e o t h e r e d g e s a r e z e r o .

O n t h e o th e r h a n d , g i v en a m a x i m u m f lo w o n G , c o s t ( G ) r e p r e s e n t s t h e c o s t o f th e

f o ll ow i n g m a x i m u m m a t c h in g o n G ( i , j ) :

M M ( i , j ) = { ( L , j t ) , (a t , b s ) [ t h e f l o w o n e d g e [ i s , j t ] i s o n e }

U { ( is , b s ) [ t h e f l o w o n e d g e [is, ej] i s o n e }

U { ( a t , j r ) [ t h e f lo w o n e d g e [ei, jr] i s o n e } .

T h e r e f o r e t h e c o s t o f t h e m i n i m u m c o s t m a x i m u m f lo w is e x a ct lym i n M M ( i , j ) y ( M M ( i , j ) ) . [ ]

W e a r e n o w r e a d y t o g i v e o u r a lg o r i t h m .

8/8/2019 Unordered Trees

http://slidepdf.com/reader/full/unordered-trees 12/18

216 Kaizhong Zhang

Input: T1 and T2.

Output: D(TI[ i] , T2 [ j ] ) , wh ere 1 < i < iTl l and 1 < j < IT21.

D(O, 0) = 0 ;f o r / = 1 to I T s ]

n l

D (F l[i] , O) -= ~ D (Tl[ ik] , O)k = l

D (TI[i] , O) ---- D (FI [i] , O )+ y( t a [ i ] ~ )0

fo r j = 1 to IT21n j

D(O, F 2 [ j ] ) = Z D(O, T2[jk])k = l

D(O, T2[j]) ----D(O, F2 [ j ] ) + ) /(~ . -+ t2[ j ] )

f o r i = 1 t o I T l l

f o r j = 1 t o IT 2 1

D(O, F2 [ j ] ) + r a i n {D (F~ [i l , F2[j t]) - D(O, F2[ j , I )} ,l<t<nj

D ( F I [ i ] , F2 [ j ] ) ---- ra in D (F~ [ i ] , 0) + ra in { D ( F l t i s ] , F z t j ] ) - D ( F l [ i s ] , 0)} ,l<s<n i

ra in y ( M M ( i , j ) ) .M M (i , j )

D ( T l [ i ] , T2 [ j ] ) = ra in

D(O, T2 [ j ] ) + mi n {D(TI[ i ] , T2[ j t ] ) - D(O , Tz[j t])},l < t <n j

D(TI[i] , O) + rain {D(T l [ i s ] , T2[ j ] ) - O(Tl[ i s ] , 0)} ,l<_s<ni

D ( F l [ i ] , F2[ j ] ) + y ( t l [ i ] - -~ t 2 [ j ] ) .

5 .2 . Complexity. T h e c o m p l e x i t y o f c o m p u t i n g D(TI[ i ] , T 2 [ j ] ) i s , b y L e m m a 4 ,

b o u n d e d b y O ( n i q '- n j ) . T h e c o m p l e x i ty o f c o m p u t i n g D(Fa[i], F 2 [ j ] ) i s b o u n d e d

b y O ( h i - } - n j ) p l u s th e c o m p l e x i t y o f t h e m i n i m u m c o s t m a x i m u m f lo w c o m p u t a t io n .

O u r g r aph G i s a g r aph w i t h i n t ege r capac i t ie s , nonnega t i ve edge cos t s, and max i mu mflow f * ~ -- n i "[- n j. T h e c o m p l e x i t y o f f in d i ng m i n i m u m c o s t m a x i m u m f lo w f o r su c h

a g r aph w i t h n ve r t i ce s and m edg es i s O ( m l f * l log(2+mt,) n ) < O ( m L f * l log2 n) [12] .

F o r o u r g r a p h , n = n i + n j + 4 and m = ni x n j + 2n i + 2n j + 3 ; t he r e fo r e t he comp l ex i t y

is b o u n d e d b y O ( n i x n j x ( h i -t- n j ) x log2(ni -4-n j ) ) .

H e n c e f o r a n y p a i r i a n d j , t h e c o m p l e x i t y o f c o m p u t i n g D(TI[ i ] , T2[ j ] ) and

D ( F I [ i ] , F E [ j ] ) i s b o u n d e d b y O ( n i x n j x (n i -1- n j) x log2(n i + n j ) ) . T h e r e f o r e

t h e c o m p l e x i t y o f o u r a l g o r it h m i s

ITd IT2~

E E O(n i x n j • (n i + n j ) • log2(ni + n j ) )i = 1 j = l

I/'11 t/'21

< ~ _ ~ O (ni • n j • (deg(T1) -4-deg(T2)) x log2(deg(T1) -4-deg(T2)))

i = 1 j = l

8/8/2019 Unordered Trees

http://slidepdf.com/reader/full/unordered-trees 13/18

A Constrained Edit Distance Between Unordered Labeled Trees

<_ 0 ( d e g (T 1 ) + d e g ( T 2 )) x l o g 2 ( d e g ( T 1 ) + d e g ( T 2 ) ) x E n i x n ji=1 j=l

_< O (I T ll • IT21 • ( d e g ( T 1 ) + d e g ( T 2 ) ) • l o g 2 ( d e g ( T t ) + d e g ( T 2 ) ) ) .

217

6. Approxim ate Unordered Tree M atching. A p p r o x i m a t e u n o r d e r e d t r ee m a t c h in g

i s a na tu ra l ex t ens ion o f app rox im a te s t ri ng ma tch ing [5 ] , [13 ] , [2 ] and ap pro x im a te

o rde re d t r ee m a tch in g [ 17 ].

T h e a p p r o x i m a t e - s t r i n g - m a t c h i n g p r o b l e m i s a s fo l l o w s . G i v e n t w o s t r in g s T E X T a n d

PAT, the p r o b l e m i s t o c o m p u t e , f o r e a c h i , S D [ i , P A T ] = m i n j { D ( T E X T [ j . . i] , P A T ) },

w he re 1 _< j _< i + 1 an d S D i s the s t r i ng d i s t ance m e t r i c . W i th t he ed i t d i s t ance m e t r i c ,

t h e p r o b l e m i s to c o m p u t e , f o r ea c h i , t h e m i n i m u m n u m b e r o f e d it o p e r a ti o n s b e t w e e n

" pa t t e rn" s t r i ng PAT[ 1 .. I PA TI] a n d t h e " t e x t " s u b s t r in g T E X T [ 1 . . i ] , where any p re f ix

c a n b e r e m o v e d f r o m TEXT[1 . . i ] . In tu i t i ve ly t he p rob lem i s t he f ind the " occu r r enc e" i n

T E X T t h at m o s t c l o s e ly m a t c h e s P A T .

G i v e n p a t t e r n t r e e P a n d d a t a t re e T , i n o r d e r t o a ll o w P t o m a t c h o n l y a p a r t o f T ,

w e m u s t g e n e r a l i z e t h e n o t i o n o f a s u b s t r i n g a n d t h e n o t i o n o f r e m o v i n g a p r e f ix . F o r u s ,

a subs t r i ng mea ns a sub t r ee and a p re f ix m eans a co l l ec t i on o f sub t rees .

A s s u m e a n u m b e r i n g o f tr e e T , w e u s e t[i] t o re p r e s e n t t h e i t h n o d e o f t re e T , T[i] to

r ep rese n t t he sub t r ee roo ted a t t [ i ] , and Tf [ i] t o r ep resen t t he fo re s t ob t a ined by de l e t i ng

t[i] f r o m T[ i ] . Simi l a r ly , f o r pa t t e rn t r ee P w e can de f ine p[ i ] , P [ i ] , a n d P f [ i ] .

W e f i rs t de f ine t he ope ra t ion o f r em ov ing a t a nod e [17 ] .Rem oving a t node t [ i] m e a n s r e m o v i n g t h e s u b t r e e r o o t e d a t t[i]. G i v e n t r e e T , w e

def ine S ( T ) as a s e t o f nodes o f T sa t i s fy ing

t [ i ] , t [ j ] ~ S ( T ) imp l i e s t ha t ne i the r is an ance s to r o f t he o the r .

S ( T ) r e p r e s e n t s a s e t o f n o n o v e f l a p p i n g s u b t re e s o f T .

D e f i n e R ( T , S ( T ) ) t o b e t h e t r e e o b t a i n e d b y removing a t a l l t he node s i n S ( T ) f r o m

t ree T .

N o w w e c a n g i v e t he d e f i n it io n o f a p p r o x i m a t e u n o r d e r e d t r e e m a t c h i n g . G i v e n p a t t e r n

t r ee P and da t a t r ee T , fo r each sub t r ee T[ i ] , w e w a n t t o c o m p u t e

D r ( P , T [ i ] ) = m } n { D ( P , R(T[ i ] , S (T[ i ] ) ) ) } .

T h e m i n i m u m h e r e i s o v e r a ll p o s s i b l e s u b t r e e s e t s S ( T [ i ] ) .

The in tu i t i ve mean ing o f t he above de f in i t i on i s a s fo l l ows . G iven a pa t t e rn t r ee P

a n d a d a t a t re e T , w i t h a p p r o x i m a t e tr e e m a t c h i n g w e w a n t t o fi n d p a r t o f T , s a y T e, t h a t

m o s t c l o s e ly m a t c h e s P . W e a s s u m e t ha t Te i s connec t ed , t he re fo re Tp mus t be a t r ee .

S i n c e w e c o n s i d e r r o o t e d t r ee s , Tp has a roo t . Le t t [ i ] be t he roo t o f Tp ; i t is c l ea r t ha t w e

c a n o b t a i n T p b y r e m o v i n g s o m e s u b t re e s , s a y S ( T [ i ] ) , f r o m T[i] s ince Te i s conne c t ed .

H e n c e , m i n i D r ( P , T [ i ] ) w i l l g i v e u s t h e d is t a n c e b e t w e e n P a n d Tp.

I n t h e f o l lo w i n g w e u s e ni to r e p r e s e n t t he n u m b e r o f c h i ld r e n o f t h e n o d e p[i] a n d

n j t o r e p r e s e n t t h e n u m b e r o f c h i l d re n o f th e n o d e t [ j ] .

8/8/2019 Unordered Trees

http://slidepdf.com/reader/full/unordered-trees 14/18

218 KaizhongZhang

LEM M A 9 . L e t p [ i l ] , p [ i2 ] . . . . . p[ in i] be t he ch i ld ren o f p [ i ] a nd l e t t [ j l ] , t [ j 2 ] ,

. . . . t [ jn j ] b e th e c h i ld r e n o f t [ j ] . T h e n

Dr(O, O) =-- O,

Dr (O, T f [ j ] ) = 0 ,

n i

D r ( P f [ i ], O ) = Z D r ( e [ i k ] , 0 ) ,k = l

Dr(O, T [ j l ) = 0 ,

D r(P [ i ] , O) = D r (P f [ i ] , O) -t- y ( p [ i ] --+ L ) .

P R O O F . T r i v i a l . [ ]

LEM M A 1 0 . L e t p [ i l ] , p [ i2 ] . . . . . p[ in ,] be t he ch i ld ren o f p [ i ] a nd l e t t [ j l ] , t [ j 2 ] ,

. . . . t [ jn j ] b e t h e c h i ld r e n o f t [ j ] . T h e n

D r ( P [ i ] , 0 ) ,

l y 0 ~ ~ t [ j ] ) + m i n D r ( P [ i ] , T [ j t ] ) ,I l < t <n j

D r ( P [ i ] , T [ j l ) = m i n I D r ( P [ i ] , O ) + r a i n D r ( P [ i s ] , T [ j ] ) - D r ( P [ i s ] , 0 ) ,I 1 s < n i

[ D r (P f [ i ] , T f [ j ] ) -+ -y (p [ i ] -'+ t [ j ] ) .

P R O O F . W e c o n s i d e r t h e b e s t m a p p i n g b e t w e e n P [ i ] a n d T [ j ] a f t e r p e r f o r m i n g a n

o p t i m a l r e m o v a l o f s u b tr e e s o f T [ j ] . I f th e w h o l e t r ee T [ j ] i s r e m o v e d , t h e n t h e d i s -

t a n c e s h o u l d b e D r ( P [ i ] , 0 ) . O t h e r w i s e w e h a v e t h e s a m e f o u r c a s e s a s i n L e m m a 4 .

I n c a se 1 p[ i ] i s i n t h e b e s t m a p p i n g b u t t [ j ] i s n o t . S u p p o s e t h a t p[ i ] i s m a p p e d t o

t[k] w h i c h i s a n o d e i n T [ j t ] . T h e n s ub tr ee s T [ j l ] . . . . . T [ j t - 1 ] , T [ j t+ l ] . . . . . T [ j~ j ]

s h o u l d h a v e b e e n r e m o v e d i n t h e o p ti m a l r e m o v a l o f s u b tr e es o f T [ j ] . T h e r e f o r e t h e

d i st a n ce i s y ( L - , t [ j ] ) + D r ( P [ i ] , T [ j t ] ) . T h e o t h e r t h r e e c a s e s a r e t h e s a m e a s i n

L e m m a 4 . [ ]

I n th e f o ll o w i n g l e m m a w e a g a in u s e t he c o n c e p t o f a m a x i m u m m a t c h i n g o n a

w e i g h t e d b i p a r t i t e g r a p h . H o w e v e r , w e h a v e t o m o d i f y t h e d e f i n i ti o n o f t h e b ip a r t i t e

g r a p h : Le t I = { i1 , i 2 . . . . . in ~} , J = { j l , j 2 . . . . . j n j} , a n d B = { b l , b 2 . . . . b i n } . W e ,

a g a i n , u s e B t o r e p r e s e n t e m p t y t re e s. L e t L - - I a n d R = J U B ; w e d e f in e a w e i g h t e d

b i p a r t i t e g r a p h G r ( i , j ) = ( V , E ) a s f o l l o w s :

(1 ) V = L U R .

( 2 ) E = L x R , w h e r e x s t a n d s f o r C a r t e s i a n p r o d u c t .

(3 ) w ( u , v ) = D ( T I [ u l , T 2 [ v l) i f u e I a n d v ~ J ,

w ( u , v ) = D ( T l [ u ] , O ) i f u 6 I a n d v 6 B .

W e u s e M M r ( i , j ) t o r e p r e se n t a m a x i m u m m a t c h i n g o n g r a p h G r ( i , j ) .

8/8/2019 Unordered Trees

http://slidepdf.com/reader/full/unordered-trees 15/18

A Constrained Edit Distance Between Unordered Labeled Trees 219

L E M M A 1 1 . L e t p [ i l ] , P [ i2 ] . . . . . p [ i , i] b e t h e c h il d r e n o f p [ i] a n d l e t t [ j l ] , t [ j2] ,

. . . . t [ jn j ] b e t h e c h i l d r en o f t [ j ] . T h e n

[ lminnj D r( P f [ i ] , T f [ j t] ) q - yO ~ --> t [ j t ] ) ,

II D r ( P f [ i] , O ) + l_s<_n,minD , ( P f [ i s ] , T f [ j ] ) - D ~( P f [ i j ] , 0 ) ,O r ( P f [ i ] , T f [ j ] ) = m i n //

I m i n y ( M M r ( i , j ) ) .I~MMr(i,j)

P R O O F . A g a i n w e c o n s i d e r t he b e s t m a p p i n g a f t e r p e r f o r m i n g t h e o p t i m a l r e m o v a l .

I f T f [ j ] i s r e m o v e d , t h e n t h e d i s ta n c e s h o u l d b e D r ( e l [ i ] , 0 ) . H o w e v e r , t h i s is h a n d l e d

b y m in M M r (i,j) y ( M M ~ ( i , j ) ) s i n c e y ( M M r ( i , j ) ) = D ~ ( P f[ i] , O ) w h e n M M r ( i, j ) ={( iv , bs) [ 1 < s < n i } .

S u p p o s e t h a t T f [ j] i s n o t r e m o v e d , t h e n w e h a v e f o u r c a s es a s in L e m m a 5 . C a s e 1 i s

s i m i l a r to t h e c a s e 1 i n L e m m a 5 . T h e o n l y d i f f e re n c e i s th a t w e d o n o t n e e d t o c o n s i d e r

t h e c o s t s o f i n s e r t i n g s u b t r e e s o f T f [ j ] w h i c h a r e n o t in th e m a p p i n g s i n c e t h e y s h o u l d

h a v e b e e n r e m o v e d . C a s e 2 is t h e s a m e a s c a s e 2 in L e m m a 5 . C a s e 3 is a s p e c ia l c a s e

f o r m i nM M r ( i,j) y ( M M r ( i , j ) ) , w h e r e M M r ( i, j ) c o n t a i n s o n l y o n e e l e m e n t f r o m I • J .

C a s e 4 i s s i m i l a r t o c a s e 4 o f L e m m a 5 . G r (i , j ) a n d M M r ( i, j ) ) a r e d e f in e d t o d e a l w i t h

t h is c a s e w h e r e w e d o n o t n e e d t o c o n s i d e r t h o s e s u b t r e e s o f T f [ j ] t h a t a re n o t i n t h e b e s t

m a p p i n g . [ ]

W e a g a i n r e d u c e t h e c o m p u t a t i o n o f minMMr ( i , j ) y ( M M r ( i , j ) ) t o t he m i n i m u m c o s t

f lo w p r o b l e m .

G i v e n P[ i ] a n d T [ j ] , l e t 1 = { i l , i2 . . . . . in l} a n d J = { j l , j 2 . . . . . jn ~ }, w h e r e i k,

1 < k < h i , r e p r e s e n t s t r e e P[ i k ] , a n d j r , 1 < l < n j , r e p r e s e n t s t r e e T[ j l ] .

W e c o n s t r u c t a g r a p h G r = ( V , E ) a s f o l l o w s :

v e r t e x s e t : V = { s , t , e , f } tA I U J , w h e r e s i s t h e s o u r c e , t is t h e s i n k , a n d e

r e p r e s e n t s a n e m p t y t r ee ;e d g e s e t : [ s , i k ], [ j l, f ] , [ e , f ] , [ f , t ] w i t h C O S t z e r o , [ ik , j r ] w i t h c o s t

D r ( P [ i k ] , T [ j d ) , a n d [ i k, e ] w i t h c o s t D r ( P [ i k ] , 0 ) . A l l t h e e d g e s h a v e

c a p a c i t y o n e e x c e p t [ e, f ] a n d [ f , t ] , w h o s e c a p a c i t i e s a re h i .

G r i s a n e t w o r k w i t h i n t e g e r c a p a c i t i e s , n o n n e g a t i v e c o s t s , a n d t h e m a x i m u m f l o w

f * = n i ( s e e F i g u r e 2 ) .

W e c a n p r o v e t h a t co s t ( G r ) = m i nM M r( i, j) y ( M M r ( i, j ) ) . T h e p r o o f i s s i m i l a r t o t h e

p r o o f o f L e m m a 8 .

W e a r e n o w r e a d y t o g i v e a n a l g o r i th m f o r a p p r o x i m a t e u n o r d e r e d t r ee m a t c h i n g . T h e

t i m e c o m p l e x i t y o f t h e a l g o r i t h m i s O ( I P I x I T I x d e g ( P ) • l o g 2 ( d e g (P ) + d e g ( T ) ) )s i n c e f * f o r G r i s r / i i n s t e a d o f ni -[- nj.

8/8/2019 Unordered Trees

http://slidepdf.com/reader/full/unordered-trees 16/18

2 2 0 K a i z h o n g Z h a n g

Tt i ]

s t

l , l~-'P[i, l , 19

F i g . 2 . R e d u c t i o n t o t h e m i n i m u m c o s t fl o w p r o b le m .

Input: P a n d T .

Output : D~(P , T [ i ] ) , w h e r e 1 < i < } T t.

D r ( 0 , 0 ) = 0 ;

f o r j = 1 t o I T I

D~(O,T , [ j ] ) = 0

Dr(O, T [ j ] ) = 0

f o r / = 1 t o I P [ni

o~(Pf[il, o ) = ~ D A P [ i k ] , O )

k = l

D ~(P [i], O) = Dr(Pf[i] , O) + y ( p [ i ] - -~ ; 0

f o r i = l t o I P I

f o r j : 1 t o IT ]

Dr(Pf[ i l , T f [ j ] ) = m i n

m i n Dr(Pe[ i l , ~ [ j , ] ) + y ( L - -+ t [ j t ] ) ,l<t<ny

D r( Pf[ i] , O) q- m i n O r( Pf[ is], T f [ j ] ) - D r( Pf[ is], 0 ) ,l <s<n i

m i n v ( M M ~ ( i , j ) ) .MM~(i,j)

8/8/2019 Unordered Trees

http://slidepdf.com/reader/full/unordered-trees 17/18

A Constrained Edit Distance Betwe en Unordered Labeled Trees 221

D r ( P [ i ] , T [ j ] ) = m i n

D r ( P [ i ] , O ) ,

Y 0 ~ ---> t [ j ] ) + m i n D r ( P [ i ] , T [ j t ] ) ,l<t<nj

D r ( P [ i ] , O ) q - m i n D r ( P [ i s ] : T [ j ] ) - D r ( P [ i s ] , O , ) ,l<s<ni

D r ( P f [ i ] , T f [ j ] ) + y ( p [ i l - - - > t [ j l ) .

7 . C o n c l u s i o n , M o t i v a te d b y t h e N P - c o m p l e t e [3 ], [ 19 ] a n d M A X S N P - h a r d r e su l ts

[ 1 6 ], w e h a v e d e f i n e d a c o n s t r a i n e d e d i t d i s t a n c e m e t r i c b e t w e e n u n o r d e r e d l a b e l e d t r ee s .

W e p r e s e n t a n a l g o r i th m f o r c o m p u t i n g t h is d i s t a n c e m e t r ic b a s e d o n a r e d u c t i o n t o th e

m i n i m u m c o s t m a x i m u m f lo w p r o b le m . O u r a l g o r i t h m i s g e n e r a l iz a b l e w i t h th e s a m e

c o m p l e x i t y to t h e a p p r o x i m a t e u n o r d e r e d - t r e e - m a t c h i n g p r o b l e m .

T h e w o r k p r e s e n t e d i s p a r t o f a p r o j e c t t o d e v e l o p a c o m p r e h e n s i v e t o o l f o r a p p r o x i -

m a t e t r ee p a t t e rn m a t c h i n g [ 1 4] . T h e p r o p o s e d a l g o r i t h m s a n d t h e i r i m p l e m e n t a t i o n a re

c u r r e n t l y i n t e g r a t e d i n t o t h i s t o o l.

Acknowledgments . T h e a u t h o r w o u l d l i k e t o t h a n k t h e a n o n y m o u s r e f e re e s f o r t h e ir

m a n y h e l p f u l s u g g e s t io n s .

References

[ 1] A. Arora, C. Lund, R. Motw ani, M . Sudan, and M. Szegedy, Proof verification and hardness of approx-

imation problems, Proc. 33rd IEEE Symp. on the Foundation of Computer Science, 1992, pp. 14-23.

[2] G .M . Landau and U. Vishkiu, Fast parallel and serial approximate string matching, J. Algorithms, 10(1989), 157-169.

[3] P. Kilpelainen and H. M annila, The tree inclusion problem, Proc. Internat. Joint C onf. on the Theory

and Practice o f Software Development (CAAP '91), 1991, Vol. 1, pp. 202-214.

[4] S. M asuyam a, Y. Takahashi, T. Oku yama, and S. Sasaki, On the largest comm on subgraph problem,

Algorithms an d C omputing Theory, RIM S, Kokyuroku (Kyoto University), 1990, pp. 195-201.

[5] P.H . Sellers, The theory and computation of evolutionary distances, J. Algorithms, 1 (1980), 359-373.[6] B. Shapiro and K. Zhang, Com paring multiple RNA secondary structures using tree comparisons,

Com put. Appl. B iosci., 6(4) (1990), 309-318.

[7] E Y. Shih, Object representation and recognition usin g mathem atical morpho logy model, J. System

Integration, 1 (1991), 235-256.

[8] E Y. Shih and O. R. Mitchell, Threshold decomp osition of grayscale morphology into bina~'ymorphol-

ogy, IEEE Trans. Pattern Anal. M ach. Intell., 11 (1989), 31-42 .

[9] K .C . Tai, The tree-to-tree correction problem, J. Assoc. Comput. Mach., 26 (1979), 422-433.

[10] Y. Takahash i, Y. Satoh, H. Suzuki, and S. Sasaki, Recogn ition of largest com mon structural fragment

amo ng a variety of chemical structures, Anal. Sci., 3 (1987), 23-28.

[ 11] E. Tanaka and K. Tanaka, The tree-to-tree editing problem, Internat. J. Pattern Recog. Artificial Intell.,

2(2) (1988), 221-240.

[12] R. E. Tar jan , Data Structures and Network Algorithms, CBMS-NSF Regional Conference Series inApplied Mathematics, CBMS, W ashington, ~C , 1983.

[13] E. Ukkonen, Find ing approximate pattern~in str ings, J. Algorithms, 6 (1985), 132-137.

[14 ] J. T. L . Wang, Kaizhong Zhang, Karpjoo Jeong, and D. Shasha, ATB E: a system for approximate tree

matching, IEEE Trans. Knowledge Data Engrg, 6(4) (1994), 559-571.

8/8/2019 Unordered Trees

http://slidepdf.com/reader/full/unordered-trees 18/18

2 2 2 K a i z h o n g Z h a n g

[15] Ka izho ng Zhang , Algor i thm s fo r the Cons t r a ined Edi t ing Dis tance Be tw een Orde red Labe led Trees and

Re la ted Prob lem s , Technica l Repor t No . 361 , Depa r tm ent o f Com pute r Sc ience , U nive r s i ty o f Wes te rn

Ontario, 1993.

[16] Ka izho ng Zhang and Tao Jiang , Some M AX SNP -ha rd r e su l t s conce rn ing unorde red labe led t ree s ,

Inform. P rocess. Lett., 49 (1994) , 249-254.[17] Ka izho ng Zhang and D. Shasha , S imple f a s t a lgor i thms fo r the ed i t ing d is tance be twe en t ree s and re la ted

prob lems , SIAMJ. Comput . , 18(6 ) (1989), 12 45-1262.

[18] Ka izho ng Zhang , D . Shasha , and J . Wang , Approx im a te t r ee ma tch ing in the p re sence o f va r iab le leng th

do n ' t c a res , J . Algorithms, 16 (1994) , 33-66.

[19] Ka izho ng Zhan g , R. Sta tman and D. Shasha , On the ed i t ing d is tance be twe en unorde red labe led t r ee s ,

Inform. P rocess. Lett., 42 (1992) , 133-13 9.


Top Related