Don’t Scrap It, Wrap It! A Wrapper Architecture for Legacy Data Sources
Mary Tork Roth, Peter Schwarz
IBM Almaden
Road Map
• Motivation
• Garlic Overview
• Wrapper Architecture
  – Data Definition
  – Query Planning
  – Query Execution
• Good, Bad, and Ugly
Motivation
• “Real Companies”
• Heavy investment in legacy
  – Data management wares
  – Application woes
• Need an integrated view of heterogeneous data sources
  – Leverage existing query facilities
  – Work around idiosyncrasies
Garlic Architecture
[Figure: Garlic architecture. Labels: Clients; Query Processor; Garlic Metadata; four Wrappers; Relational DB, Object DB, Image Archive, Complex Objects.]
Wrapper Goals
• Small start-up cost
  – Wizards are not the only ones writing wrappers
• Incremental growth
  – Wrappers must be able to evolve
  – Add new sources without disturbing existing ones
• Must be able to optimize queries
  – Enable participation, not delegation
Wrapper Overview
[Figure: wrapper overview. Labels: Data Source; Wrapper; Garlic Objects; Method Invocation; Planning (Work Request, Wrapper Plan, Query Plan); Execution (Execution Plan, Iterator).]
Modeling Data
• Object Data Model
  – Interface and Implementation
  – GDL variant of ODMG-ODL
• Wrapper assigns IDs to objects
  – OID = IID + key
• Methods
  – default accessor methods
  – stub and generic dispatch
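The rule OID = IID + key can be illustrated with a small Python sketch. It reads IID as an implementation identifier and is not Garlic's actual encoding: the `OID` class, the `wrapper_for` helper, the registry contents, and the key `"H-1027"` are all hypothetical. The point is only that an identifier built from an implementation ID plus a wrapper-chosen key lets a reference be routed back to the wrapper that issued it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OID:
    """Hypothetical object identifier: implementation ID plus wrapper-assigned key."""
    iid: str  # identifies the implementation, and hence the owning wrapper
    key: str  # opaque to everyone except that wrapper


# Hypothetical registry mapping implementation IDs to wrapper names.
WRAPPERS = {"hotel_impl": "web_wrapper", "city_impl": "relational_wrapper"}


def wrapper_for(oid: OID) -> str:
    """Route an object reference to the wrapper that assigned its key."""
    return WRAPPERS[oid.iid]


oid = OID(iid="hotel_impl", key="H-1027")
print(wrapper_for(oid))
```

Because the key is opaque outside the wrapper, a wrapper is free to use whatever its source already provides (a primary key, a file name, a URL) without coordinating with other wrappers.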
Modeling Data Example
    interface Country {
        attribute string name;
        attribute string airlines_served;
        attribute boolean visa_required;
        attribute Image scene;
    }
    interface Image {
        attribute readonly string file_name;
        double matches(in string file_name);
        void display(in string device_name);
    }
Query Planning
• Like System-R, bottom-up dynamic programming
• Wrapper tells what it can do through methods
  – plan_access() for single collections
  – plan_join() for multi-way joins
  – plan_bind() for inner streams of joins
• Input: work request
• Output: set of plans, cost, cardinalities?
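The three planning methods can be pictured as an abstract interface. The sketch below is in Python and is an assumption throughout: only the method names plan_access(), plan_join(), and plan_bind() and the Wrapper_Plan idea come from the slides; the parameter lists, the `WorkRequest` and `WrapperPlan` field names, and `ScanOnlyWrapper` are hypothetical.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from typing import Any


@dataclass
class WorkRequest:
    project: list[str]  # attributes the optimizer wants back
    preds: list[str]    # predicates it would like the wrapper to apply


@dataclass
class WrapperPlan:
    project: list[str]  # attributes this plan will produce
    preds: list[str]    # predicates this plan agrees to evaluate
    cost: float
    private: dict[str, Any] = field(default_factory=dict)  # wrapper-only data


class Wrapper(ABC):
    @abstractmethod
    def plan_access(self, request: WorkRequest) -> list[WrapperPlan]:
        """Plans for a single collection."""

    def plan_join(self, left, right, join_preds) -> list[WrapperPlan]:
        return []  # assumed default: this wrapper offers no join plans

    def plan_bind(self, request, bound_attrs) -> list[WrapperPlan]:
        return []  # assumed default: cannot serve as the inner stream of a join


class ScanOnlyWrapper(Wrapper):
    """Hypothetical minimal wrapper: scans its collection, applies no predicates."""

    def plan_access(self, request: WorkRequest) -> list[WrapperPlan]:
        return [WrapperPlan(request.project, preds=[], cost=100.0,
                            private={"scan": "full"})]


plans = ScanOnlyWrapper().plan_access(
    WorkRequest(["H.OID", "H.name"], ["H.class = 5"]))
print(plans[0].preds, plans[0].cost)
```

Returning an empty list from the join and bind methods is one way to model a small start-up cost: a wrapper author implements only plan_access() at first and adds the others as the wrapper evolves.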
Single Collections
• Work Request
  – Attributes to project upon
  – Selections, and methods to invoke
• Wrapper response
  – Which projections, selections it supports
  – Cost of plan
  – Instances of Wrapper_Plan class
  – Include private data for plan execution
  – Execute a plan which subsumes request?
Single Collection Access Plan
Query:
    select H.name, H.city, H.daily_rate
    from Hotels H
    where H.class = 5 and H.loc = ‘beach’

[Figure: the Garlic Optimizer sends a work request to the Web Wrapper (Hotel Repository) and receives a wrapper access plan.]

Work Request
  Project: H.OID, H.name, H.city, H.daily_rate, H.class, H.loc
  Preds: H.class = 5, H.loc = ‘beach’

Wrapper Access Plan - Wrapper_Plan class
  Properties
    Project: H.OID, H.name, H.city, H.daily_rate, H.class, H.loc
    Preds: H.class = 5
    Cost: <access cost>
  Plan details (private)
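In the example above the work request carries two predicates, but the wrapper's plan lists only H.class = 5. The Python sketch below shows one way to read that: the wrapper accepts what its source can evaluate, and the optimizer applies the rest itself. The predicate triples, `SUPPORTED_ATTRS`, the helper names, and the two hotel rows are hypothetical and chosen only for illustration.

```python
# Predicates as (attribute, operator, value) triples; attribute names follow the slide.
requested = [("H.class", "=", 5), ("H.loc", "=", "beach")]

# Assumption: this wrapper's source can only filter on H.class.
SUPPORTED_ATTRS = {"H.class"}


def plan_access(preds):
    """Return the predicates the wrapper agrees to evaluate."""
    return [p for p in preds if p[0] in SUPPORTED_ATTRS]


def residual(preds, accepted):
    """Predicates the optimizer must still apply above the wrapper plan."""
    return [p for p in preds if p not in accepted]


def holds(row, pred):
    """Evaluate one predicate on a row; only equality is handled in this sketch."""
    attr, op, value = pred
    if op != "=":
        raise ValueError(f"unsupported operator: {op}")
    return row[attr] == value


accepted = plan_access(requested)
leftover = residual(requested, accepted)

# Made-up rows, as the wrapper might return them after applying H.class = 5.
rows = [
    {"H.name": "Hotel A", "H.class": 5, "H.loc": "beach"},
    {"H.name": "Hotel B", "H.class": 5, "H.loc": "city"},
]
result = [r for r in rows if all(holds(r, p) for p in leftover)]
print(leftover)
print([r["H.name"] for r in result])
```

Under this reading, the extra attributes in the projection list let the optimizer evaluate predicates the wrapper declined, here H.loc.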
Join Plans
• Request
  – Plans to join
  – Join Predicate
• Wrapper response
  – Join plan with supported predicates
  – Cost of join
Join Plans
Query:
    select I.name
    from Countries C, Cities I
    where C.name = ‘Greece’ and I.pop < 500 and I.country = C.OID

[Figure: the Garlic Optimizer sends a work request to the Relational Wrapper (Relational DB) and receives a wrapper join plan.]

Work Request
  Input Plans
    Wrapper Access Plan
      Project: C.OID, C.name
      Preds: C.name = ‘Greece’
      Cost: <xx>
      Plan details (private)
    Wrapper Access Plan
      Project: I.OID, I.name...
      Preds: I.pop < 500
      Cost: <xx>
      Plan details (private)
  Join pred: I.country = C.OID

Wrapper Join Plan - Countries, Cities
  Project: C.OID, C.name, I.OID, I.name, I.pop, I.country
  Preds: C.name = ‘Greece’, I.pop < 500, I.country = C.OID
  Cost: <join cost>
  Plan details (private)
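The join request and reply above can be mimicked in a few lines of Python. This is a sketch under stated assumptions, not the Garlic interface: the `Plan` class and the `plan_join` signature are hypothetical, and the cost rule (sum of the input costs) and the numeric costs are arbitrary placeholders standing in for the wrapper's own estimate.

```python
from dataclasses import dataclass


@dataclass
class Plan:
    project: list
    preds: list
    cost: float


def plan_join(left: Plan, right: Plan, join_pred: str, can_join: bool = True):
    """Combine two of the wrapper's own access plans into a join plan.

    Returns an empty list when the source cannot evaluate the join,
    leaving the optimizer to join the two streams itself.
    """
    if not can_join:
        return []
    return [Plan(
        project=left.project + right.project,
        preds=left.preds + right.preds + [join_pred],
        cost=left.cost + right.cost,  # placeholder cost rule, not a real model
    )]


countries = Plan(["C.OID", "C.name"], ["C.name = 'Greece'"], cost=10.0)
cities = Plan(["I.OID", "I.name", "I.pop", "I.country"], ["I.pop < 500"], cost=40.0)

join_plans = plan_join(countries, cities, "I.country = C.OID")
print(join_plans[0].preds)
```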
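Inter Site Joins
Query:
    select C.pop, H.name
    from Cities C, Hotels H
    where C.name = H.loc

Site A: Cities - C
Site B: Hotels - H

[Figure: three diagrams of sites A and B and Garlic, showing alternative ways to evaluate the join. Surviving labels: (1) H; H C. (2) H C; H C. (3) Hsub; Hsub.loc; Hsub C.]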
Bind Plans
• Inter-wrapper join
• Fetch matches
  – Values produced by outer node
  – Inner node invoked for each value or set of values
  – Like a semi-join or filter join
• Same request and reply pairs
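A bind join of this kind can be sketched in Python as a generator that probes the inner source once per outer value. The function name `bind_join`, the lookup callback, and all rows and population figures below are hypothetical; the join condition C.name = H.loc is borrowed from the inter-site join example.

```python
def bind_join(outer_rows, outer_col, inner_lookup):
    """For each outer row, invoke the inner source with the outer value
    and emit one combined row per match."""
    for outer in outer_rows:
        for inner in inner_lookup(outer[outer_col]):
            yield {**outer, **inner}


# Made-up outer stream (Hotels) and inner source (Cities keyed by name).
hotels = [
    {"H.name": "Hotel A", "H.loc": "Athens"},
    {"H.name": "Hotel B", "H.loc": "Sparta"},
    {"H.name": "Hotel C", "H.loc": "Nowhere"},
]
cities_by_name = {
    "Athens": [{"C.name": "Athens", "C.pop": 100}],
    "Sparta": [{"C.name": "Sparta", "C.pop": 20}],
}


def lookup_city(name):
    """Stand-in for the inner wrapper: fetch the cities matching one bound value."""
    return cities_by_name.get(name, [])


joined = list(bind_join(hotels, "H.loc", lookup_city))
print([(row["H.name"], row["C.pop"]) for row in joined])
```

Only matching inner rows cross the wire, which is what makes this resemble a semi-join; batching several outer values into one inner call would correspond to the "set of values" variant.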
Query Execution
• Garlic plan looks like a tree with wrapper plans as leaves
• Wrapper exports iterator interface
  – Translate plan into iterator
  – Methods supported
    • reset()
    • advance()
    • bind()
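The iterator interface can be sketched in Python as follows. Only the method names reset(), advance(), and bind() come from the slide; their semantics here are assumptions (advance() returns the next row or None, bind() supplies a value from an outer node and restarts the scan), and `ListIterator` with its in-memory rows is hypothetical.

```python
class ListIterator:
    """Hypothetical wrapper iterator over an in-memory list of rows."""

    def __init__(self, rows, bind_col=None):
        self._rows = rows
        self._bind_col = bind_col  # column compared against the bound value
        self._bound = None
        self._pos = 0

    def bind(self, value):
        """Accept a value from the outer node and restart the scan."""
        self._bound = value
        self.reset()

    def reset(self):
        """Reposition at the start of the result."""
        self._pos = 0

    def advance(self):
        """Return the next qualifying row, or None when exhausted."""
        while self._pos < len(self._rows):
            row = self._rows[self._pos]
            self._pos += 1
            if self._bind_col is None or row[self._bind_col] == self._bound:
                return row
        return None


it = ListIterator(
    [{"name": "Athens", "pop": 100}, {"name": "Sparta", "pop": 20}],
    bind_col="name",
)
it.bind("Sparta")
first = it.advance()
second = it.advance()
print(first, second)
```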
Wrapper Details
• Interface files include the GDL
• Environment files include parameters specific to wrappers
• Libraries
  – Core, shared among several wrappers
  – Implementation, specific to repositories
• Dynamically loaded code
• Same address space as Garlic
Odds and Ends
• How easy is it to write a wrapper?
  – Summer student, chemist, and many wrappers written.
• Related Work
  – TSIMMIS
    • Uses QDTL, a declarative spec for supported queries
  – DISCO
    • Language for describing capabilities
    • Partial queries
Good and Bad
• Good
  – Leverages existing query facilities
  – Handles idiosyncrasies
  – Graceful growth and evolution
• Bad
  – How easy is it to write wrappers?
  – How unstructured can my repository be?
  – Optimization
    • Centralized vs. Local
    • Selectivity estimation?
The Ugly
• Cost model for diverse set of sources
• Handling failures
  – Unavailable sources
  – Wrappers are buggy and often wrong
  – Want graceful degradation on failures
• Replication