Exporting Databases in XML DTD: A Conceptual and Generic Approach

TU/e eindhoven university of technology /faculty of mathematics and informatics Exporting Databases in XML DTD A Conceptual and Generic Approach Philippe Thiran Computer Science Department Technische Universiteit Eindhoven The Netherlands


Page 1: Exporting Databases in XML DTD A Conceptual and Generic Approach

TU/e eindhoven university of technology

/faculty of mathematics and informatics

Exporting Databases in XML DTD

A Conceptual and Generic Approach

Philippe Thiran
Computer Science Department
Technische Universiteit Eindhoven
The Netherlands

Page 2: Exporting Databases in XML DTD A Conceptual and Generic Approach

Exporting Databases in XML

• Current Situation
  – XML as the standard for publishing and exchanging data over the Web
  – Data recorded and maintained in existing databases
    • Heterogeneous databases: different data models
    • Limitation of database models
      – Database schema incompleteness (implicit/hidden structures)
      – Explicit and implicit interconnections among entities (no primary and foreign keys)

[Figure: Oracle V5 Model. The schema as declared in the database, and the same schema once its implicit constraints are made explicit.]

Declared schema:
Product (Reference, Label[0-1], UnitPrice, Supplier)
Order (OrderID, Customer, Date, Total[0-1])
Detail (OrderID, Reference, Quantity, Amount)

Schema with implicit constraints:
Product (Reference, Label[0-1], UnitPrice, Supplier[1-5]); id: Reference
Order (OrderID, Customer, Date, Total[0-1]); id: OrderID
Detail (OrderID, Reference, Quantity, Amount); id: Reference, OrderID; ref: Reference; ref: OrderID

Page 3: Exporting Databases in XML DTD A Conceptual and Generic Approach

Exporting Databases in XML

• Migrating existing databases to XML
  – Principle
    • XML description in DTD
    • Bottom-up approach
    • Exploiting as much as possible the meaning of source data
  – Method and Tool
    • Method
      – Not limited to any specific database model
      – Capturing the explicit and implicit structures and interconnections of the database schema
    • Tool for supporting the method

Page 4: Exporting Databases in XML DTD A Conceptual and Generic Approach

Exporting Databases in XML

Schema Representation: Database models and DTD

Schema Manipulation: Database schemas and DTD

Page 5: Exporting Databases in XML DTD A Conceptual and Generic Approach

Exporting Databases in XML

• Schema Representation
  – Expressing database schemas and XML in terms of GER
    • Extended object-entity relationship data model
    • One rich and expressive model able to express data schemas whatever their operational data models
      – Operational database models like IMS, Relational, OO
      – XML-family models: XML DTD or XML Schema

Page 6: Exporting Databases in XML DTD A Conceptual and Generic Approach

Exporting Databases in XML

• Schema Representation
  – Expressing XML in terms of GER
    • DTD expressed in terms of GER
      – DTD concepts
      – Hierarchical organization
      – Sequence organization

DTD Concepts | GER Interpretation
Element types | Entity types
Hierarchy of element types | (Root) entity types, relationship types, father roles
Content type ELEMENT | Relationship types
Sequence organization (order of elements in the sequence) | Seq groups
Occurrence operators on sub-elements (?, *, +) | Role cardinalities
IDREF, GID attributes | IDREF, GID groups
Attribute modifiers | Attribute cardinalities
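The row that pairs occurrence operators with role cardinalities can be illustrated with a small Python sketch. It relies on the standard DTD reading of the operators (? is 0-1, * is 0-N, + is 1-N, no operator is 1-1). The function names `occurrence_operator` and `content_particle` are hypothetical and are not taken from the slides or from DB-MAIN.

```python
# Exact correspondences between GER role cardinalities and DTD occurrence operators.
OPERATORS = {"1-1": "", "0-1": "?", "0-N": "*", "1-N": "+"}


def occurrence_operator(cardinality: str) -> str:
    """Return the DTD occurrence operator for a role cardinality such as "0-N"."""
    try:
        return OPERATORS[cardinality]
    except KeyError:
        # For example "1-5": a DTD has no bounded repetition, so such a
        # cardinality has to be relaxed to one of the four forms above first.
        raise ValueError(f"{cardinality} has no exact DTD operator") from None


def content_particle(element: str, cardinality: str) -> str:
    """Build one item of a DTD content model, e.g. "Detail+"."""
    return element + occurrence_operator(cardinality)


print(content_particle("Total", "0-1"))
print(content_particle("Detail", "1-N"))
```

A cardinality such as 1-5 is rejected on purpose: it has no DTD counterpart, so the sketch makes the loss of precision visible instead of hiding it.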

Page 7: Exporting Databases in XML DTD A Conceptual and Generic Approach

Exporting Databases in XML

• Schema Representation
  – Expressing XML in terms of GER

[Figure: GER representation of the Catalog DTD. Every father/child relationship type has a 1-1 child role and a father role (f) with cardinality 1-1, 0-1, 0-N or 1-N.]

Catalog; seq: .Order[*], .Product[*]
Order (OrderID); seq: .Customer, .Date, .Total, .Detail[*]; gid: OrderID
Customer: #any
Date: #pcdata
Total: #pcdata
Detail (Product); idref: Product; seq: .Quantity, .Amount
Quantity: #pcdata
Amount: #pcdata
Product (Reference, Label[0-1], UnitPrice); gid: Reference; seq: .Supplier[*]
Supplier: #any

<!ELEMENT Catalog (Order*, Product*)>
<!ELEMENT Order (Customer, Date, Total?, Detail+)>
<!ATTLIST Order OrderID ID #REQUIRED>
<!ELEMENT Customer ANY>
<!ELEMENT Date (#PCDATA)>
<!ELEMENT Total (#PCDATA)>
<!ELEMENT Detail (Quantity, Amount)>
<!ATTLIST Detail Product IDREF #REQUIRED>
<!ELEMENT Quantity (#PCDATA)>
<!ELEMENT Amount (#PCDATA)>
<!ELEMENT Product (Supplier+)>
<!ATTLIST Product Reference ID #REQUIRED
                  Label CDATA #IMPLIED
                  UnitPrice CDATA #REQUIRED>
<!ELEMENT Supplier ANY>

Page 8: Exporting Databases in XML DTD A Conceptual and Generic Approach

Exporting Databases in XML

• Schema Manipulation
  – Transforming XML DTD within GER
    • Schema transformations defined on GER
      – Reverse transformations, semantics-preserving transformations
      – Transformation operators
    • Standard transformations
      – For manipulating schemas expressed in operational database models
      – Example: transforming an entity type into an attribute
    • DTD-specific transformations

Page 9: Exporting Databases in XML DTD A Conceptual and Generic Approach

Exporting Databases in XML

• Schema Manipulation
  – Transforming XML DTD within GER
    • Standard transformations
      – For manipulating schemas expressed in classical structured models
      – Example of a semantics-preserving transformation: transforming a relationship type into an entity type

RT-ET: Transforming a relationship type into an entity type.
Inverse: ET-RT

[Figure: before, a relationship type R links A (attribute A1, role 0-N) and B (attribute B1, role 0-N). After, R is an entity type linked to A by rA and to B by rB; R plays a 1-1 role in both, A and B keep their 0-N roles, and R is identified by id: rB.B, rA.A.]
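The RT-ET operator and its inverse can be sketched on a toy schema held in plain dictionaries. This is only an illustration under assumptions: the dictionary layout, the function names `rt_to_et` and `et_to_rt`, and the rule that names each new relationship type "r" plus the entity name (rA, rB, as in the figure) are hypothetical, not the DB-MAIN implementation.

```python
import copy


def rt_to_et(schema, rel_name):
    """RT-ET: replace relationship type `rel_name` by an entity type of the same name."""
    result = copy.deepcopy(schema)
    roles = result["rels"].pop(rel_name)          # e.g. [("A", "0-N"), ("B", "0-N")]
    identifier = []
    for entity, cardinality in roles:
        link = "r" + entity                       # assumed naming rule: rA, rB
        # The new entity type plays a 1-1 role; the old role keeps its cardinality.
        result["rels"][link] = [(rel_name, "1-1"), (entity, cardinality)]
        identifier.append(f"{link}.{entity}")
    result["entities"][rel_name] = {"attrs": [], "id": identifier}
    return result


def et_to_rt(schema, entity_name):
    """ET-RT: inverse operator, valid for an entity type produced by rt_to_et."""
    result = copy.deepcopy(schema)
    del result["entities"][entity_name]
    links = [name for name, roles in result["rels"].items() if roles[0][0] == entity_name]
    # Each link contributes the role of the entity type it connected to.
    result["rels"][entity_name] = [result["rels"].pop(link)[1] for link in links]
    return result


schema = {
    "entities": {"A": {"attrs": ["A1"], "id": []}, "B": {"attrs": ["B1"], "id": []}},
    "rels": {"R": [("A", "0-N"), ("B", "0-N")]},
}
converted = rt_to_et(schema, "R")
restored = et_to_rt(converted, "R")
```

Applying the inverse gives back the starting schema, which is what makes the pair a candidate for a semantics-preserving transformation in this toy setting.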

Page 10: Exporting Databases in XML DTD A Conceptual and Generic Approach

Exporting Databases in XML

• Schema Manipulation
  – Transforming XML DTD within GER
    • DTD-specific transformations (example)
      – Suited to derive a DTD from a structured data schema

DTD-RT-to-HIER: Transforms a one-to-many (or one-to-one) binary relationship type into a hierarchical relation. The 1-1 role becomes the child role.
Inverse: DTD-HIER-to-RT

Create-SEQ-GROUP: Adds a seq group to an entity type. That group contains the child roles played by its children (in an arbitrary order).
Inverse: Del-SEQ-GROUP

[Figure: before and after diagrams for both operators. DTD-RT-to-HIER: relationship type R between A and B with roles 1-1 and 0-N, one role becoming the father role (f). Create-SEQ-GROUP: entity type R, father of A through R1 (0-N) and of B through R2 (0-1), receives the group seq: R1.A[*], R2.B.]
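A rough Python sketch of the two operators follows. The data layout (a relationship type as two (entity, cardinality) pairs) and the function names are assumptions made for the example. When both roles are 1-1, the sketch simply takes the first one as the child role, which is one possible design choice among others.

```python
def dtd_rt_to_hier(roles):
    """DTD-RT-to-HIER for a binary relationship type given as two (entity, cardinality) pairs.

    The entity playing the 1-1 role becomes the child; the other one becomes the father.
    """
    if len(roles) != 2:
        raise ValueError("only binary relationship types are handled")
    for index, (entity, cardinality) in enumerate(roles):
        if cardinality == "1-1":
            father, father_cardinality = roles[1 - index]
            return {"father": father, "father_card": father_cardinality, "child": entity}
    raise ValueError("no 1-1 role: not a one-to-many or one-to-one relationship type")


def create_seq_group(hierarchies, entity):
    """Create-SEQ-GROUP: list the children of `entity`, here in the order they are found."""
    return [h["child"] for h in hierarchies if h["father"] == entity]


consists = dtd_rt_to_hier([("Order", "1-N"), ("Detail", "1-1")])
supplied = dtd_rt_to_hier([("Supplier", "1-1"), ("Product", "1-5")])
order_group = create_seq_group([consists, supplied], "Order")
```

The order returned by `create_seq_group` has no particular meaning at this stage; fixing it is a separate decision.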

Page 11: Exporting Databases in XML DTD A Conceptual and Generic Approach

Exporting Databases in XML

Converting (legacy) databases into DTD
Exploiting as much as possible the meaning of source data
Capturing the explicit and implicit structures and interconnections

Page 12: Exporting Databases in XML DTD A Conceptual and Generic Approach

Exporting Databases in XML

• Exporting Databases
  – Bottom-up approach (from the source to the target)
  – Semi-automated 4-step method
    • Extraction of the database schema (automated)
      – Extraction of the explicit structures and constraints
    • Semantics recovery (semi-automated)
      – Recovery of the implicit structures and constraints
    • Model translation (semi-automated)
      – Translation of a schema expressed in the GER into a schema expressed in the GER DTD
      – Use of the relations among entities
    • DTD exportation (automated)
      – Generation of the DTD document

Page 13: Exporting Databases in XML DTD A Conceptual and Generic Approach

Exporting Databases in XML

• Exporting XML
  – Reverse Engineering
    • Recovery of the conceptual schema of an existing database
      – Augmentation of the knowledge about the data semantics
      – Database reverse engineering process (DB-MAIN)
      – Elicitation of hidden structures and constraints

[Figure: schema transformations lead from the database schema to the conceptual schema.]

Database Schema (file Catalog, containing Product, Order and Detail):
Product (Reference, Label[0-1], UnitPrice, Supplier); acc: Reference
Order (OrderID, Customer, Date, Total[0-1]); acc: OrderID
Detail (OrderID, Reference, Quantity, Amount); acc: Reference, OrderID

Conceptual Schema:
Product (Reference, Label[0-1], UnitPrice, Supplier[1-5]); id: Reference
Order (OrderID, Customer, Date, Total[0-1]); id: OrderID
Detail (Quantity, Amount): relationship type with roles 0-N and 1-N

Page 14: Exporting Databases in XML DTD A Conceptual and Generic Approach

Exporting Databases in XML

• Exporting XML
  – Model Translation
    • DTD-specific transformation
    • Non-deterministic process
      – It requires some design choices
      – The user inputs might have consequences on the properties and the semantics of the resulting schema
    • 5-step transformation process
      – Schema preparation
      – Hierarchy structure creation
      – Constraint relaxation
      – Attribute representation
      – Ordering definition

Page 15: Exporting Databases in XML DTD A Conceptual and Generic Approach

Exporting Databases in XML

• Exporting XML
  – Model Translation
    • Schema preparation
      – Removing invalid constructs
        » Multivalued/compound attributes
        » Complex relationship types

[Figure: the conceptual schema before and after schema preparation.]

Conceptual Schema (before):
Product (Reference, Label[0-1], UnitPrice, Supplier[1-5]); id: Reference
Order (OrderID, Customer, Date, Total[0-1]); id: OrderID
Detail (Quantity, Amount): relationship type with roles 0-N and 1-N

Prepared schema (after):
Product (Reference, Label[0-1], UnitPrice); id: Reference
Supplier (Supplier); id: supplied.Product, Supplier
Order (OrderID, Customer, Date, Total[0-1]); id: OrderID
Detail (Quantity, Amount); id: of.Product, consists.Order
supplied: Supplier 1-1, Product 1-5
of: Detail 1-1, Product 0-N
consists: Detail 1-1, Order 1-N

1. Schema preparation
2. Hierarchy structure creation
3. Constraint relaxation
4. Attribute representation
5. Ordering definition
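The removal of a multivalued attribute can be sketched as follows, using the Supplier[1-5] attribute of Product from the slide. The representation (attribute lists as strings with an optional [min-max] suffix) and the function names are assumptions for illustration; the relationship type name "supplied" is passed in by hand, as in the figure.

```python
import re

# Matches attributes written with an explicit cardinality, e.g. "Supplier[1-5]".
CARDINALITY = re.compile(r"^(\w+)\[(\d+)-(\d+|N)\]$")


def is_multivalued(attribute):
    """True when the attribute may carry more than one value."""
    match = CARDINALITY.match(attribute)
    return bool(match) and (match.group(3) == "N" or int(match.group(3)) > 1)


def attribute_to_entity(entities, rels, owner, attribute, rel_name):
    """Turn a multivalued attribute of `owner` into an entity type plus a relationship type."""
    match = CARDINALITY.match(attribute)
    if match is None or not is_multivalued(attribute):
        raise ValueError(f"{attribute} is not a multivalued attribute")
    name, low, high = match.groups()
    entities[owner].remove(attribute)
    entities[name] = [name]                       # new entity type with a single attribute
    # The owner keeps the attribute's cardinality; each new entity belongs to one owner.
    rels[rel_name] = [(owner, f"{low}-{high}"), (name, "1-1")]


entities = {"Product": ["Reference", "Label[0-1]", "UnitPrice", "Supplier[1-5]"]}
rels = {}
attribute_to_entity(entities, rels, "Product", "Supplier[1-5]", "supplied")
```

Label[0-1] is left alone because an optional single-valued attribute is not one of the constructs this step removes.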

Page 16: Exporting Databases in XML DTD A Conceptual and Generic Approach

Exporting Databases in XML

• Exporting XML
  – Model Translation
    • Hierarchical structure creation

Entity types and relationship types are transformed into a tree
• by electing natural roots (significant concepts)
• by resolving father conflicts
• by breaking cycles
• by (possibly) adding a unique root

[Figure: the prepared schema (before) and the hierarchical schema (after). Every relationship type now has a 1-1 child role and a father role (f).]

Hierarchical schema (after):
Catalog: father of Order (0-N) and Product (0-N)
Product (Reference, Label[0-1], UnitPrice); id: Reference; father of Supplier (1-5)
Supplier (Supplier); id: Supplier, .f
Order (OrderID, Customer, Date, Total[0-1]); id: OrderID; father of Detail (1-N)
Detail (Reference, Quantity, Amount); id: Reference, .f; ref: Reference
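Two of the actions listed on this slide, resolving father conflicts and adding a unique root, can be sketched in Python. The function name `build_hierarchy`, the `preferred_father` argument (standing in for the user's design choice) and the edge representation are assumptions. Cycle breaking is not covered by this sketch.

```python
def build_hierarchy(edges, preferred_father, root_name):
    """Turn (father, child) candidate edges into a tree.

    Returns the chosen father of each entity type and the list of
    (child, former father) pairs that must become references instead.
    """
    father_of = {}
    references = []
    for father, child in edges:
        if child not in father_of:
            father_of[child] = father
        elif preferred_father.get(child) == father:
            # Father conflict: the preferred father wins, the earlier one becomes a reference.
            references.append((child, father_of[child]))
            father_of[child] = father
        else:
            references.append((child, father))
    entity_types = {name for edge in edges for name in edge}
    roots = sorted(entity_types - set(father_of))
    if len(roots) > 1:
        # Several natural roots: gather them under one added root.
        for root in roots:
            father_of[root] = root_name
    return father_of, references


edges = [("Product", "Supplier"), ("Product", "Detail"), ("Order", "Detail")]
father_of, references = build_hierarchy(edges, {"Detail": "Order"}, "Catalog")
```

With these inputs Detail ends up under Order and keeps only a reference to Product, and Order and Product are placed under the added root Catalog, which matches the shape of the hierarchical schema on the slide.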

Page 17: Exporting Databases in XML DTD A Conceptual and Generic Approach

Exporting Databases in XML

• Exporting XML
  – Model Translation
    • Constraint relaxation
      – Role cardinalities extension
      – Gid and idref groups creation

[Figure: the hierarchical schema before and after constraint relaxation.]

Changes shown in the figure:
Father role of Product towards Supplier: 1-5 becomes 1-N
Supplier: id: Supplier, .f becomes gid: Supplier, .f
Product: id: Reference becomes gid: Reference
Order: id: OrderID becomes gid: OrderID
Detail: ref: Reference becomes idref: Reference; id: Reference, .f is unchanged
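The role cardinalities extension can be sketched as a function that widens a cardinality to the nearest form a DTD can express. The slide only shows 1-5 becoming 1-N; lowering a minimum above 1 down to 1 is an assumption of this sketch, and `relax_cardinality` is a hypothetical name.

```python
def relax_cardinality(cardinality: str) -> str:
    """Widen a role cardinality to one of 0-1, 1-1, 0-N or 1-N."""
    low, high = cardinality.split("-")
    relaxed_low = "0" if low == "0" else "1"      # assumption: any minimum above 0 becomes 1
    relaxed_high = "1" if high == "1" else "N"    # any bounded maximum above 1 becomes N
    return f"{relaxed_low}-{relaxed_high}"


for before in ("1-5", "0-1", "1-1", "0-N"):
    print(before, "->", relax_cardinality(before))
```

The relaxed schema accepts more instances than the original one, so a constraint such as "at most five suppliers" is no longer expressed after this step.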

Page 18: Exporting Databases in XML DTD A Conceptual and Generic Approach

Exporting Databases in XML

• Exporting XML
  – Model Translation
    • Attribute representation
    • Ordering definition

[Figure: the resulting GER DTD schema. Catalog contains Order[*] and Product[*]; Order (gid: OrderID) contains Customer, Date, Total and Detail[*]; Detail contains Quantity and Amount and refers to Product (idref: Product); Product (gid: Reference, attributes Label[0-1] and UnitPrice) contains Supplier[*]; Customer and Supplier are #any, the other leaf elements are #pcdata.]
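Once attributes and ordering are fixed, writing the DTD declarations is mostly string assembly. The sketch below shows one possible way to do it in Python; `element_decl` and `attlist_decl` are hypothetical helpers, not functions of DB-MAIN, and the content models are passed in already carrying their occurrence operators.

```python
def element_decl(name, content):
    """Build an <!ELEMENT ...> declaration.

    `content` is either a ready-made content spec ("ANY", "(#PCDATA)")
    or the ordered list of child items, e.g. ["Customer", "Total?"].
    """
    body = content if isinstance(content, str) else "(" + ", ".join(content) + ")"
    return f"<!ELEMENT {name} {body}>"


def attlist_decl(name, attributes):
    """Build an <!ATTLIST ...> declaration from (attribute, type, default) triples."""
    parts = " ".join(f"{attr} {kind} {default}" for attr, kind, default in attributes)
    return f"<!ATTLIST {name} {parts}>"


declarations = [
    element_decl("Order", ["Customer", "Date", "Total?", "Detail+"]),
    attlist_decl("Order", [("OrderID", "ID", "#REQUIRED")]),
    element_decl("Customer", "ANY"),
    element_decl("Date", "(#PCDATA)"),
]
print("\n".join(declarations))
```

The order of the list given to `element_decl` plays the part of the seq group: changing it changes the set of documents the DTD accepts.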

Page 19: Exporting Databases in XML DTD A Conceptual and Generic Approach

Exporting Databases in XML

CASE Support – DB-MAIN

Model Expression: Database models and DTD

Model Translation: DTD-specific transformation

Page 20: Exporting Databases in XML DTD A Conceptual and Generic Approach

Exporting Databases in XML

• CASE Support – DB-MAIN
  – Basic Features
    • Dedicated to database application engineering
    • Based on the GER
    • Includes transformation operators, reverse engineering processors and schema analysis tools
    • Extraction facilities (SQL, Codasyl, RPG, IMS, etc.)

Page 21: Exporting Databases in XML DTD A Conceptual and Generic Approach

Exporting Databases in XML

• CASE Support
  – *-to-DTD Transformation
    • DTD-specific transformations
    • Assistant

Page 22: Exporting Databases in XML DTD A Conceptual and Generic Approach

Exporting Databases in XML

• Conclusions
  – Rich and expressive data model
    • Translating semantics of both database and XML models
  – Non-deterministic aspect of the model translation
    • The same database schema can lead to a large set of equivalent XML structures
  – CASE Support (application)
    • Automatic production of XML documents
      – that comply with the DTD that has been computed
      – based on the schema transformations used to convert the database schema into XML DTD
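As an illustration of the last point, the sketch below builds an XML document shaped like the Catalog DTD of the running example, using Python's standard `xml.etree.ElementTree`. The input dictionaries, their keys and the sample values are invented for the example, and `export_catalog` is a hypothetical function. ElementTree does not validate against a DTD, so compliance here rests only on the order in which elements are created. Values of ID attributes must be XML names, which is why the sample keys start with a letter.

```python
import xml.etree.ElementTree as ET


def export_catalog(orders, products):
    """Build a Catalog element following (Order*, Product*) and the child orders of the DTD."""
    catalog = ET.Element("Catalog")
    for o in orders:
        order = ET.SubElement(catalog, "Order", OrderID=o["id"])
        ET.SubElement(order, "Customer").text = o["customer"]
        ET.SubElement(order, "Date").text = o["date"]
        if o.get("total") is not None:            # Total? is optional
            ET.SubElement(order, "Total").text = str(o["total"])
        for d in o["details"]:                    # Detail+ : at least one expected
            detail = ET.SubElement(order, "Detail", Product=d["product"])
            ET.SubElement(detail, "Quantity").text = str(d["quantity"])
            ET.SubElement(detail, "Amount").text = str(d["amount"])
    for p in products:
        attrs = {"Reference": p["reference"], "UnitPrice": str(p["unit_price"])}
        if p.get("label"):                        # Label is #IMPLIED
            attrs["Label"] = p["label"]
        product = ET.SubElement(catalog, "Product", attrs)
        for supplier in p["suppliers"]:           # Supplier+
            ET.SubElement(product, "Supplier").text = supplier
    return catalog


# Hypothetical sample rows, not taken from the slides.
SAMPLE_ORDERS = [{"id": "O1", "customer": "ACME", "date": "2016-01-01", "total": None,
                  "details": [{"product": "P1", "quantity": 2, "amount": 20}]}]
SAMPLE_PRODUCTS = [{"reference": "P1", "label": None, "unit_price": 10,
                    "suppliers": ["S-One"]}]

catalog = export_catalog(SAMPLE_ORDERS, SAMPLE_PRODUCTS)
print(ET.tostring(catalog, encoding="unicode"))
```

A validating parser would still be needed to confirm that the output complies with the DTD, including the ID/IDREF constraint between Detail and Product.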