DELTA*
A tool for database refactoring

Patrick Stünkel

Western Norway University of Applied Sciences

*Supported by the "Modern Refactoring" bilateral SIU/CAPES project

November 21, 2018

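Outline

- Introduction: About Me, Motivation
- Theory: Refactoring Process, Refactoring Catalogue, Migration Strategies
- Application: Case study
- Related Work
- Conclusion and Outlook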

About Me

- B.Sc. and M.Sc. from FHDW Hannover in Germany
  - DELTA was developed there.
- 3 years in industry as a Software Engineer
- Since September 2017: PhD research fellow at Western Norway University of Applied Sciences
  - Topic: Interoperability in Model Driven Software Engineering (MDSE)
  - Areas: MDSE, Bidirectional Transformations (BX), Co-Evolution


Motivation

- Software refactoring is widely adopted and is performed automatically...
- ... whereas database refactoring is not.
  - Ambler/Sadalage, "Database Refactoring", 2006 [1]

Definition 1 (Ambler): Database refactoring

A simple change to a database schema that improves its design while retaining both its behavioural and informational semantics.


Distinction: code refactoring - database refactoring

What makes database refactoring different?

- It deals with behavioural and informational aspects.
- It affects actual stored data.
- It requires manual effort.
- In general, a database schema is shared between many different applications.

Requirement 1: Transition periods

After a refactoring, the database has to be accessible through both the new and the old schema for a certain transition period, because the dependent applications need time to adopt the changed schema.


Further requirements

Requirement 2: Generated Migrations

Because developing migration procedures by hand is error-prone and migrations follow recurring patterns, the migration code itself shall be generated.

Requirement 3: Revertible Migrations

Because a refactoring must neither change the behaviour of the data nor cause information loss, it can be reverted. Therefore we require an undo feature for our migrations.

Requirement 4: Avoid downtimes

Because long downtimes have a negative impact on productivity and sales, they shall be avoided while applying refactorings.
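Requirements 2 and 3 can be illustrated with a small generator that emits both the forward and the undo statement for one recurring pattern, Rename Table. This is only a sketch under stated assumptions: the `RenameTable` class, its method names, and the table names are hypothetical and not taken from DELTA, and SQLite stands in for Oracle so that the example is self-contained.

```python
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class RenameTable:
    """Hypothetical description of one 'Rename Table' refactoring."""
    old: str
    new: str

    def forward_sql(self) -> str:
        # Generated migration (Requirement 2).
        return f'ALTER TABLE "{self.old}" RENAME TO "{self.new}"'

    def undo_sql(self) -> str:
        # Generated inverse migration (Requirement 3).
        return f'ALTER TABLE "{self.new}" RENAME TO "{self.old}"'


def table_names(conn):
    # sqlite_master lists all schema objects of the database.
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO customer (name) VALUES ('Ada')")

refactoring = RenameTable("customer", "client")
conn.execute(refactoring.forward_sql())
after_forward = table_names(conn)

conn.execute(refactoring.undo_sql())
after_undo = table_names(conn)
rows_after_undo = conn.execute("SELECT id, name FROM customer").fetchall()
```

Applying the forward statement and then the undo statement leaves both the schema and the stored row as they were, which is the property Requirement 3 asks for.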


Maintaining compatibility

Triggers: Old and new schema run in parallel; triggers replicate changes back and forth.

Views: Views representing the old schema provide backward compatibility.

Batch jobs: Old and new schema run in parallel; batch jobs replicate changes back and forth on a regular basis.


- Ambler and Sadalage suggest triggers.
- We chose views with INSTEAD OF triggers.
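The chosen strategy can be sketched as follows: a view presents the old schema, and INSTEAD OF triggers translate writes against that view into writes against the refactored table. The example assumes a hypothetical Rename Column refactoring (`surname` to `last_name`); the names `person` and `person_old` are invented for illustration, and SQLite is used in place of Oracle so that the snippet runs on its own.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- New schema after a hypothetical 'Rename Column' (surname -> last_name).
CREATE TABLE person (id INTEGER PRIMARY KEY, last_name TEXT NOT NULL);

-- View that presents the old schema to applications not yet adapted.
CREATE VIEW person_old AS
    SELECT id, last_name AS surname FROM person;

-- Views are read-only in SQLite, so INSTEAD OF triggers translate writes.
CREATE TRIGGER person_old_insert INSTEAD OF INSERT ON person_old
BEGIN
    INSERT INTO person (id, last_name) VALUES (NEW.id, NEW.surname);
END;

CREATE TRIGGER person_old_update INSTEAD OF UPDATE ON person_old
BEGIN
    UPDATE person SET last_name = NEW.surname WHERE id = OLD.id;
END;

CREATE TRIGGER person_old_delete INSTEAD OF DELETE ON person_old
BEGIN
    DELETE FROM person WHERE id = OLD.id;
END;
""")

# An old application writes through the view ...
conn.execute("INSERT INTO person_old (id, surname) VALUES (1, 'Sadalage')")
conn.execute("UPDATE person_old SET surname = 'Ambler' WHERE id = 1")
# ... and a new application sees the change in the refactored table.
new_rows = conn.execute("SELECT id, last_name FROM person").fetchall()

# A write through the new schema is likewise visible through the old view.
conn.execute("INSERT INTO person (id, last_name) VALUES (2, 'Fowler')")
old_rows = conn.execute(
    "SELECT id, surname FROM person_old ORDER BY id").fetchall()
```

Because the view holds no data of its own, there is only one copy of each row and nothing has to be replicated back and forth, unlike in the trigger and batch-job variants.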


Assumptions

- The database schema represents an object-oriented model, i.e. a set of related entities.
- Object-relational mapping:
  - EntityType → Table with surrogate ID
  - Attribute → Column
  - *:1-Association → Foreign-key constraint
- ⇒ The schema satisfies 2NF.
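The mapping rules above can be made concrete with a small DDL generator. This is a sketch under assumptions: the `EntityType` class, the `to_ddl` function, the `<association>_id` naming convention, and the example entities are hypothetical, and SQLite replaces Oracle for the sake of a runnable example.

```python
import sqlite3
from dataclasses import dataclass, field


@dataclass
class EntityType:
    """Hypothetical, minimal description of an entity type."""
    name: str
    attributes: dict = field(default_factory=dict)   # attribute name -> SQL type
    many_to_one: dict = field(default_factory=dict)  # association name -> target entity


def to_ddl(entity: EntityType) -> str:
    # EntityType -> table with a surrogate ID as its only key column.
    columns = ["id INTEGER PRIMARY KEY"]
    # Attribute -> column.
    columns += [f"{name} {sql_type}" for name, sql_type in entity.attributes.items()]
    # *:1 association -> foreign-key column referencing the target's surrogate ID.
    columns += [
        f"{name}_id INTEGER REFERENCES {target}(id)"
        for name, target in entity.many_to_one.items()
    ]
    return f"CREATE TABLE {entity.name} ({', '.join(columns)})"


address = EntityType("address", {"street": "TEXT", "city": "TEXT"})
person = EntityType("person", {"name": "TEXT"}, {"address": "address"})

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only on request
for entity in (address, person):
    conn.execute(to_ddl(entity))

conn.execute("INSERT INTO address (id, street, city) VALUES (1, 'Main Street 1', 'Bergen')")
conn.execute("INSERT INTO person (name, address_id) VALUES ('Ada', 1)")
try:
    # A person pointing at a non-existent address violates the constraint.
    conn.execute("INSERT INTO person (name, address_id) VALUES ('Bob', 99)")
    dangling_rejected = False
except sqlite3.IntegrityError:
    dangling_rejected = True
person_ddl = to_ddl(person)
```

Since every generated table has a single-column surrogate key, no non-key column can depend on only part of the key, which is why 2NF follows once the columns are atomic.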


Catalogue

- Rename Table
- Rename Column
- Introduce Calculated Column
- Merge and Split Columns
- Spin-Off Empty Table
- Move Column
- Merge and Split Table
- Transcode Foreign Key
- Compose Foreign Keys

[Seven slides follow, one per refactoring: Introduce Calculated Column, Merge and Split Columns, Spin-Off Empty Table, Move Column, Merge and Split Table, Transcode Foreign Key, and Compose Foreign Keys. Only their titles survive in this transcript.]

Offline Batch-Migration

- Shut down database.
- Apply schema changes and migrate data.
- Restart database.

Advantages: Easy to implement.

Disadvantages: May cause long downtimes on large data sets.


Stepwise, Transactional Migration

- Create new schema (empty).
- Copy data to the new schema on access or per batch.
- A view layer on the new schema aggregates data from the old and the new schema.

Advantages: Long downtimes avoided.

Disadvantages: Higher complexity due to aggregation; temporary trigger required.
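A minimal sketch of the stepwise variant, restricted to copying per batch, might look as follows. Copying on access, which is where the temporary trigger mentioned above would come in, is not shown. The table, column, and view names and the `migrate_batch` helper are assumptions made for illustration, and SQLite stands in for Oracle.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE person_old (id INTEGER PRIMARY KEY, surname TEXT);
INSERT INTO person_old (id, surname)
    VALUES (1, 'A'), (2, 'B'), (3, 'C'), (4, 'D'), (5, 'E');

-- Step 1: the new schema starts out empty.
CREATE TABLE person_new (id INTEGER PRIMARY KEY, last_name TEXT);

-- Step 3: the view layer aggregates migrated and not-yet-migrated rows.
CREATE VIEW person AS
    SELECT id, last_name FROM person_new
    UNION ALL
    SELECT id, surname AS last_name FROM person_old
    WHERE id NOT IN (SELECT id FROM person_new);
""")


def migrate_batch(conn, batch_size):
    """Step 2: copy up to batch_size rows in one transaction; return rows copied."""
    with conn:  # commits on success, rolls back on error
        ids = [row[0] for row in conn.execute(
            "SELECT id FROM person_old "
            "WHERE id NOT IN (SELECT id FROM person_new) "
            "ORDER BY id LIMIT ?", (batch_size,))]
        conn.executemany(
            "INSERT INTO person_new (id, last_name) "
            "SELECT id, surname FROM person_old WHERE id = ?",
            [(row_id,) for row_id in ids])
    return len(ids)


visible_before = conn.execute("SELECT COUNT(*) FROM person").fetchone()[0]
moved_first = migrate_batch(conn, 2)
migrated_during = conn.execute("SELECT COUNT(*) FROM person_new").fetchone()[0]
visible_during = conn.execute(
    "SELECT id, last_name FROM person ORDER BY id").fetchall()

while migrate_batch(conn, 2):  # keep going until nothing is left to copy
    pass
migrated_after = conn.execute("SELECT COUNT(*) FROM person_new").fetchone()[0]
visible_after = conn.execute("SELECT COUNT(*) FROM person").fetchone()[0]
```

Readers of the `person` view see all five rows before, during, and after the migration, although each batch only holds a short transaction.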


Two-Phase Migration

- Copy database.
- Apply schema changes to one copy (incl. data migration).
- The original schema receives triggers, which log every data manipulation.
- The logged actions are applied to the new schema copy bit by bit.
- Eventually all data is migrated to the new schema.

Advantages: Long downtimes avoided; concurrent migration.

Disadvantages: Higher complexity due to merge; temporary trigger for logging required.
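The steps above can be sketched as a change log that temporary triggers fill on the original table and that a replay function drains in small portions. Everything named here (`change_log`, `person_v2`, the `replay` helper, the one-letter operation codes) is a hypothetical illustration, both copies live in one SQLite database for brevity, and updates of the key column are not handled.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE person (id INTEGER PRIMARY KEY, surname TEXT);
INSERT INTO person (id, surname) VALUES (1, 'A'), (2, 'B');

-- Hypothetical log table filled by temporary triggers on the original schema.
CREATE TABLE change_log (
    seq INTEGER PRIMARY KEY AUTOINCREMENT,
    op TEXT NOT NULL,        -- 'I', 'U' or 'D'
    id INTEGER NOT NULL,
    surname TEXT
);
CREATE TRIGGER log_insert AFTER INSERT ON person
BEGIN INSERT INTO change_log (op, id, surname) VALUES ('I', NEW.id, NEW.surname); END;
CREATE TRIGGER log_update AFTER UPDATE ON person
BEGIN INSERT INTO change_log (op, id, surname) VALUES ('U', NEW.id, NEW.surname); END;
CREATE TRIGGER log_delete AFTER DELETE ON person
BEGIN INSERT INTO change_log (op, id, surname) VALUES ('D', OLD.id, NULL); END;

-- Copy with the schema change applied (surname -> last_name), incl. data migration.
CREATE TABLE person_v2 (id INTEGER PRIMARY KEY, last_name TEXT);
INSERT INTO person_v2 (id, last_name) SELECT id, surname FROM person;
""")


def replay(conn, limit):
    """Apply up to `limit` logged actions to the new copy; return how many were applied."""
    with conn:  # one transaction per portion
        entries = conn.execute(
            "SELECT seq, op, id, surname FROM change_log ORDER BY seq LIMIT ?",
            (limit,)).fetchall()
        for seq, op, row_id, surname in entries:
            if op == "I":
                conn.execute(
                    "INSERT INTO person_v2 (id, last_name) VALUES (?, ?)",
                    (row_id, surname))
            elif op == "U":
                conn.execute(
                    "UPDATE person_v2 SET last_name = ? WHERE id = ?",
                    (surname, row_id))
            else:
                conn.execute("DELETE FROM person_v2 WHERE id = ?", (row_id,))
            conn.execute("DELETE FROM change_log WHERE seq = ?", (seq,))
    return len(entries)


# Old applications keep working on the original schema during the migration.
conn.execute("INSERT INTO person (id, surname) VALUES (3, 'C')")
conn.execute("UPDATE person SET surname = 'B2' WHERE id = 2")
conn.execute("DELETE FROM person WHERE id = 1")
conn.commit()

applied = []
while True:
    count = replay(conn, 2)
    if count == 0:
        break
    applied.append(count)
new_rows = conn.execute(
    "SELECT id, last_name FROM person_v2 ORDER BY id").fetchall()
```

Once the log is empty, the new copy reflects every manipulation made on the original in the meantime, at the price of the extra logging triggers and the merge logic in `replay`.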


Case study

Design Goals

- Independent, encapsulated refactorings
- A refactoring is composed of Deltas
- Virtual preview of schema modification
- Goal in the long run: Merge of refactorings!


DELTA:

- Tool for database migration (currently Oracle)
- Java 8
- JavaFX GUI
- Three-layer architecture
- 50 Deltas (9 "real" refactorings)
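The slides do not spell out what a Delta looks like internally. The following sketch therefore only illustrates the stated design goals (refactorings composed of Deltas, virtual preview) under the assumption that a Delta is a small, invertible change to an in-memory schema model. All class and method names are hypothetical and not DELTA's actual API, and the model covers the schema only, not the stored data.

```python
from copy import deepcopy
from dataclasses import dataclass
from typing import Dict, List

Schema = Dict[str, List[str]]  # table name -> column names


@dataclass(frozen=True)
class AddColumn:
    table: str
    column: str

    def apply(self, schema: Schema) -> None:
        schema[self.table].append(self.column)

    def inverse(self) -> "DropColumn":
        return DropColumn(self.table, self.column)


@dataclass(frozen=True)
class DropColumn:
    table: str
    column: str

    def apply(self, schema: Schema) -> None:
        schema[self.table].remove(self.column)

    def inverse(self) -> "AddColumn":
        return AddColumn(self.table, self.column)


@dataclass(frozen=True)
class Refactoring:
    name: str
    deltas: tuple

    def preview(self, schema: Schema) -> Schema:
        """Return the schema as it would look afterwards; the input is not modified."""
        result = deepcopy(schema)
        for delta in self.deltas:
            delta.apply(result)
        return result

    def inverse(self) -> "Refactoring":
        # Undo = the inverse Deltas in reverse order.
        return Refactoring(
            f"undo {self.name}",
            tuple(delta.inverse() for delta in reversed(self.deltas)))


schema = {"person": ["id", "name", "city"], "address": ["id", "street"]}

# 'Move Column' expressed as a composition of two Deltas.
move_city = Refactoring("move city", (
    AddColumn("address", "city"),
    DropColumn("person", "city"),
))

previewed = move_city.preview(schema)
restored = move_city.inverse().preview(previewed)
```

Because `preview` works on a copy of the model, the schema modification can be inspected without touching a database, and inverting the Deltas in reverse order yields the undo refactoring.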


Starting point

- Address is part of Person


Step 1: Spin-Off the Address

- transition phase
- no distinction between legal and natural persons
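The schema diagrams of the case study are not part of this transcript, so the following is only a guess at what Step 1 could look like: the address columns are spun off into their own table, and during the transition phase a view with the old name and shape keeps old applications working. All column names, the `person_v2` table, and the trigger are assumptions, and SQLite is used instead of Oracle.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Starting point (hypothetical columns): the address is part of person.
CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT, street TEXT, city TEXT);
INSERT INTO person VALUES (1, 'Ada', 'Main Street 1', 'Bergen');

-- Spin-off: address becomes its own table, person keeps a foreign key.
CREATE TABLE address (id INTEGER PRIMARY KEY, street TEXT, city TEXT);
CREATE TABLE person_v2 (
    id INTEGER PRIMARY KEY,
    name TEXT,
    address_id INTEGER REFERENCES address(id)
);
INSERT INTO address (id, street, city) SELECT id, street, city FROM person;
INSERT INTO person_v2 (id, name, address_id) SELECT id, name, id FROM person;
DROP TABLE person;

-- Transition phase: a view with the old name and shape serves old applications.
CREATE VIEW person AS
    SELECT p.id, p.name, a.street, a.city
    FROM person_v2 AS p LEFT JOIN address AS a ON a.id = p.address_id;

-- Inserts through the old shape are split across the two new tables.
CREATE TRIGGER person_insert INSTEAD OF INSERT ON person
BEGIN
    INSERT INTO address (street, city) VALUES (NEW.street, NEW.city);
    INSERT INTO person_v2 (id, name, address_id)
        VALUES (NEW.id, NEW.name, last_insert_rowid());
END;
""")

# An old application still inserts a person together with its address.
conn.execute(
    "INSERT INTO person (id, name, street, city) "
    "VALUES (2, 'Bob', 'Side Road 2', 'Oslo')")
old_view = conn.execute(
    "SELECT id, name, street, city FROM person ORDER BY id").fetchall()
addresses = conn.execute(
    "SELECT id, street, city FROM address ORDER BY id").fetchall()
```

Old applications read and insert through `person` as before, while new applications can already work with `person_v2` and `address` directly.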


Step 2: Split into natural and legal persons

- no support for multiple tenants


Step 3: Implement Multi-Tenancy

- transition phase


Company Merger

- bad design: insuranceId and liabilityInsuranceId are disjoint


Step 4: Concrete-Table-Inheritance → Class-Table-Inheritance


Case Study: Wrap-Up

- Extract entities: transformation of associations 1:1 → 1:* → *:*
- Split entities: Single-Table-Inheritance → Class-Table-Inheritance
- Extend models (multi-tenancy) with calculated columns
- Transformation of models (Merge, "Transcode", etc.): Concrete-Table-Inheritance → Class-Table-Inheritance


Existing tools

SQL Prompt: Visual Studio plug-in, supports renaming and split-table.

ApexSQL: MS SQL Server only, supports renaming, split-table and introduce association-table.

Flyway: Version control and management of migration scripts, no implemented refactorings.

Liquibase: Same as Flyway but with SQL abstraction and some built-in refactorings (rename, split-table).


Related research

- I. Skoulis et al., "Growing up with stability: How open-source relational databases evolve" [4]: study of the VCS history to analyse patterns in the development of the database model.
- C. Curino et al., "Graceful database schema evolution: the PRISM workbench" [5]: similar approach to DELTA, based on SQL rewriting.
- M. Pereira et al., "Evolution of databases using Petri nets" [6]: more general approach on how to check dependencies between refactorings.


Conclusion

- Catalogue of database refactorings
- Backward compatibility with views and INSTEAD OF triggers
- Presentation of the different migration scenarios
- Open-source tool: DELTA


Outlook

- Evaluation of the prototype
- Support of the different migration scenarios
- Composition of refactorings
- Reordering of refactorings

THANK YOU FOR YOUR ATTENTION!


[1] Scott W. Ambler and Pramod J. Sadalage. Refactoring Databases: Evolutionary Database Design. Pearson Education, 2006.

[2] Martin Fowler. Refactoring: Improving the Design of Existing Code. Addison-Wesley, 1999.

[3] Michael Löwe. Refactoring information systems - association folding and unfolding. FHDW Hannover, 2013.

[4] Ioannis Skoulis, Panos Vassiliadis, Apostolos V. Zarras. Growing up with stability: How open-source relational databases evolve. Information Systems, 53:363-385, 2015.

[5] Carlo A. Curino, Hyun J. Moon, Carlo Zaniolo. Graceful database schema evolution: the PRISM workbench. Proceedings of the VLDB Endowment, 1:761-772, 2008.


[6] Marcia Beatriz Carvalho Pereira, Jorge Rady de Almeida Junior, Jose Reinaldo Silva. Evolution of databases using Petri nets. Anais de XIX Congresso Brasileiro de Automatica, CBA 2012.
