Format Independent Change Detection & Propagation (FCDP) in Support of Mobile Computing

Michael Lanham, Ajay Kang, Joachim Hammer, Abdelsalam Helal, Joseph Wilson


8 Oct 02 2 of 22

Agenda

• Motivation & Problem Statement

• Related Research

• FCDP Implementation

• Experimental Results

• Future Research

• Conclusions

• Questions

Sponsors

• Research sponsored in part by the National Science Foundation through grant number OR-0100770.

• Research sponsored in part by the United States Military Academy

• Part of the Ubiquitous Data Access (UbiData) Project
– Lead Investigators: Dr. Helal and Dr. Hammer
– Various Ph.D. and Master's candidates

Motivation

vdiff

Problem Statement

• How to identify meaningful changes between two XML documents that do not have identical element tag and attribute sets?

• How to apply those changes on a copy of the source XML document?

Related Research (1 of 3)

• Bandwidth Adaptation
– Puppeteer [Rice U]
   • Dynamic adaptation to network conditions (MS Office only)
   • MS-OLE DOM and Puppeteer Intermediate Form
– Odyssey [CMU]
   • Actively adapts to network conditions
   • Requires modification of applications
– Neither supports disconnected operations

Related Research (2 of 3)

• Text- and Byte-based Difference Detection
– (line based) diff (GNU diff)
– Xdelta: binary delta based file system from UCB
– Rsync: allows binary diff on data on separate machines

(None can support B/W adaptation or cross-application diff/patch)

Related Research (3 of 3)

• XML specific algorithms
– Sun [diffmk], IBM Alphaworks [XMLdiff and treediff]
   • Open API, closed source
   • No ‘move’ support for IBM, no ‘update’ for either
– UMD [laDiff] and INRIA Rocquencourt, France [XyDiff]
   • Open source
   • Supports ‘move’ and ‘update’
   • XyDiff uses XIDs (aka eXternal IDentifiers)

(Neither can cope with non-identical tag/attribute sets)

FCDP Design Goals

• Minimize connect time and bandwidth requirements when in a mobile mode

• Application transparent bandwidth adaptation

• Application transparent data accessibility

• Cross-Application difference detection

Architectural Overview

[Figure: architecture diagram. Laptop and Palm mobile devices, each with an M-Mem component, exchange documents with the FCDP Server (F-Mem). Labeled components: XML Converter, XSLT Convertor, Custom Splitter, Target App Conversion Rules, Avail Bandwidth Sensor, Target App Sensor. Labeled data flows: Specific Format Document, XML Encoded Data, Limited Content Doc, Request for import including Target App, Publish Document, Application and B/W filtered XML / Target App document.]

[Figure: change detection and propagation data flow between the Mobile Device (M-Mem) and the FCDP Server (F-Mem). Labeled components: AbiWord XML to Text Converter, Text (ver 1(-)’) to AbiWord XML(-) Converter, Text diff util, patch util, XML vdiff util, XML deltaApply util. Labeled artifacts: AbiWord Generated XML Doc v0, Text Document v0(-), Text Doc v1(-), Text patch script, XML delta script, AbiWord XML Doc v1.]

Content Conversion Management

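[Figure: overlapping tag sets of the v0 Document and the v1 Document (Tag 1 through Tag n). The overlap is labeled Intersection Map; the non-overlapping regions are labeled Symmetric Difference Map.]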

• Lossless or lossy conversion?

• If lossy, must track omitted data

• Track via Intersection or Symmetric Difference Maps
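The two maps named above can be illustrated with plain set operations on the element tags of each version. The following is a minimal sketch, not the FCDP implementation: it assumes the maps cover element tags only (attributes are ignored), and the helper names and sample documents are hypothetical.

```python
import xml.etree.ElementTree as ET


def tag_set(xml_text: str) -> set[str]:
    """Collect every element tag that appears in an XML document."""
    return {elem.tag for elem in ET.fromstring(xml_text).iter()}


def conversion_maps(v0_xml: str, v1_xml: str) -> tuple[set[str], set[str]]:
    """Return (intersection, symmetric difference) of the two tag sets."""
    v0_tags, v1_tags = tag_set(v0_xml), tag_set(v1_xml)
    return v0_tags & v1_tags, v0_tags ^ v1_tags


# Hypothetical documents: V0 is the full-format version, V1 a reduced one.
V0 = "<doc><section><p>Hello</p><image/></section></doc>"
V1 = "<doc><p>Hello</p><note/></doc>"

shared, unshared = conversion_maps(V0, V1)
print(sorted(shared))    # tags present in both versions
print(sorted(unshared))  # tags a lossy conversion would have to track
```

Either set is enough to recover the other once both tag sets are known, which is presumably why the slide offers the two maps as alternatives.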

vdiff()

• 15 line algorithm (specifics in the paper)

• Of 13 modules inherited from INRIA’s XyDiff, we have modified 5 and added 2 more.
– The modifications included complete rewrites of PeepHoleOptimization
– Brand new modules for parsing the StructuralMapInfo and for when we need to AdjustForUnSharedChildren
– Entire project’s code base now sits at a little more than 16K lines; XyDiff was just over 8.6K lines

Generation II vdiff()

• Match text nodes using LCS algorithms and partial string matching to create as many text node matchings as possible

• Adjust structure of v1minusDOM so it is isomorphic to v0

• Optimize Matches

• Construct Delta Script
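The first step above, matching text nodes by LCS and then by partial string matching, might look roughly like the sketch below. It is an illustration under assumptions, not the vdiff2() code: text nodes are modeled as plain strings in document order, the similarity threshold of 0.6 is arbitrary, and `difflib.SequenceMatcher.ratio()` stands in for whatever partial string matcher the authors use.

```python
from difflib import SequenceMatcher


def lcs_pairs(a: list[str], b: list[str]) -> list[tuple[int, int]]:
    """Return index pairs (i, j) of one longest common subsequence."""
    n, m = len(a), len(b)
    # table[i][j] holds the LCS length of a[i:] and b[j:].
    table = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            if a[i] == b[j]:
                table[i][j] = table[i + 1][j + 1] + 1
            else:
                table[i][j] = max(table[i + 1][j], table[i][j + 1])
    pairs, i, j = [], 0, 0
    while i < n and j < m:
        if a[i] == b[j]:
            pairs.append((i, j))
            i, j = i + 1, j + 1
        elif table[i + 1][j] >= table[i][j + 1]:
            i += 1
        else:
            j += 1
    return pairs


def match_text_nodes(
    v0: list[str], v1: list[str], threshold: float = 0.6
) -> dict[int, int]:
    """Match identical nodes via LCS, then pair leftovers by similarity."""
    matches = dict(lcs_pairs(v0, v1))
    used = set(matches.values())
    for i, text in enumerate(v0):
        if i in matches:
            continue
        best_j, best_score = None, threshold
        for j, candidate in enumerate(v1):
            if j in used:
                continue
            score = SequenceMatcher(None, text, candidate).ratio()
            if score >= best_score:
                best_j, best_score = j, score
        if best_j is not None:
            matches[i] = best_j
            used.add(best_j)
    return matches


# Hypothetical text nodes of a v0 document and an edited v1(-) document.
V0_TEXT = ["Introduction", "The quick brown fox", "Conclusion"]
V1_TEXT = ["Introduction", "The quick brown fox jumps", "A new paragraph", "Conclusion"]

matches = match_text_nodes(V0_TEXT, V1_TEXT)
print(matches)
```

Unchanged nodes are anchored first, so the fuzzy pass only has to consider the nodes that were edited, inserted, or deleted.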

Experimental Setup

[Figure: experimental pipeline. An XML document version n and a Text document version n (linked by the Text 2 AbiW Converter) are modified by inserting, deleting, moving, and updating 0-50% of version n’s paragraphs, changing 10-40% of paragraph content, to produce Text document version n+i and Primitive XML document version n+i. XyDiff and vdiff / vdiff2() each produce a Diff script; a DiffApplyProgram applies each script to yield XML document version n+1, which is opened in AbiWord. A final stage performs Statistics Capture, Parse, and Analysis.]

Experimental Results: ASCII (v0(-)) vs AbiWord (v0) in the TestSet

Experimental Results: Diff versus Value Shipping Text Documents

[Figure: chart of Size (bytes), 0 to 16,000, for two series, v1(-) and diff script; x-axis: Test Cases in Increasing Size Order.]
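The comparison on this slide, shipping a diff script versus shipping the whole new text document (value shipping), comes down to a size check. A minimal sketch follows, assuming a line-based unified diff from Python's `difflib` as a stand-in for the text diff utility; the function name `choose_payload` and the sample documents are hypothetical.

```python
from difflib import unified_diff


def choose_payload(old_text: str, new_text: str) -> tuple[str, str]:
    """Return ("diff", script) or ("value", new_text), whichever is smaller."""
    script = "".join(
        unified_diff(
            old_text.splitlines(keepends=True),
            new_text.splitlines(keepends=True),
            fromfile="v0",
            tofile="v1",
            n=0,  # no context lines, to keep the script small
        )
    )
    if len(script.encode("utf-8")) < len(new_text.encode("utf-8")):
        return "diff", script
    return "value", new_text


# Hypothetical 50-paragraph document with a single edited paragraph.
old_doc = "".join(f"paragraph {i}\n" for i in range(50))
new_doc = old_doc.replace("paragraph 7\n", "paragraph seven\n")

kind, payload = choose_payload(old_doc, new_doc)
print(kind, len(payload), len(new_doc))
```

For very small documents the diff header alone can exceed the document size, in which case the sketch falls back to value shipping.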

Experimental Results: Cumulative Diff Script Sizes After 960 Test Cases

[Figure: chart of Size (bytes), 0 to 45,000, for two series, XyDiff Delta Size and Vdiff2 Delta Size; x-axis: Source Documents in Increasing Size Order.]

Experimental Results: Missing node counts for vdiff2() and XyDiff against the reference modified documents

[Figure: chart of Node count, -10 to 130, for two series, XyDiff missing nodes and vdiff2 missing nodes; x-axis: List of documents ordered by size (in bytes), documents 1 to 946.]

Future Research (1 of 2)

• Peephole Optimization
– Customized
– Q-grams
– Partial String Matching

• Child Reordering
– Solve with placeholders?
– Use stylized markup in text only docs as placeholder
– Non-provably correct otherwise

• Expand classes of XML documents and applications supported for further logic / implementation verification
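As background for the Q-grams item above, one common way to score partial string matches is the overlap between the sets of q-character substrings of two strings. The sketch below is a generic illustration, not part of FCDP: the padding character `#`, the default `q = 3`, and the use of Jaccard overlap are all assumptions.

```python
def qgrams(text: str, q: int = 3) -> set[str]:
    """Return the set of q-character substrings of text, padded at both ends."""
    padded = "#" * (q - 1) + text + "#" * (q - 1)
    return {padded[i:i + q] for i in range(len(padded) - q + 1)}


def qgram_similarity(a: str, b: str, q: int = 3) -> float:
    """Jaccard overlap of the q-gram sets of a and b, in [0, 1]."""
    grams_a, grams_b = qgrams(a, q), qgrams(b, q)
    if not grams_a and not grams_b:
        return 1.0
    return len(grams_a & grams_b) / len(grams_a | grams_b)


# A lightly edited paragraph should score higher than an unrelated one.
near = qgram_similarity("The quick brown fox", "The quick brown fax")
far = qgram_similarity("The quick brown fox", "Completely different")
print(near > far)
```

Because q-gram sets can be computed once per text node and compared with set operations, this kind of score is cheaper than running a full alignment on every candidate pair.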

Conclusions

• We have improved the current state of the art in mobile computing in the following ways:
– Content transformation
– Content reduction
– Change detection between reduced & transformed documents and canonical XML documents
– Set the stage for expansion and refinement of the above

Questions