Model-Driven Cloud Data Storage

Juan Castrejón, Genoveva Vargas-Solar, Christine Collet, Rafael Lozano. Université de Grenoble, CNRS, Grenoble INP, Tecnológico de Monterrey. CloudMDE 2012. Model-Driven Cloud Data Storage

Upload: jccastrejon

Post on 22-Apr-2015


Page 1: Model-Driven Cloud Data Storage

Juan Castrejón, Genoveva Vargas-Solar, Christine Collet, Rafael Lozano Université de Grenoble, CNRS, Grenoble INP, Tecnológico de Monterrey

CloudMDE 2012

Model-Driven Cloud Data Storage

Page 2: Model-Driven Cloud Data Storage

Background
• Cloud computing (NIST-2011)
  • Utility computing model for enabling ubiquitous, convenient, on-demand network access to a shared pool of configurable resources
• Cloud data storage (Ruiz-2011, Armbrust-2009)
  • Store, retrieve and manage large amounts of data, using highly scalable distributed infrastructures
• Polyglot persistence (Fowler-2011)
  • Different data storage technologies for different kinds of data
  • Each storage mechanism introduces a new interface to be learned
  • To get decent performance, you have to understand a lot about how the technology works

Page 3: Model-Driven Cloud Data Storage

Background
• Variety of data storage models and implementations (Cattell-2011, Edlich-2012)
  • Models: key-value, document, extensible record, graph, blob, object, queue, XML, relational
  • Implementations: Redis, Voldemort, MongoDB, CouchDB, Cassandra, Neo4j, db4o, eXist-db, etc. (over 120 options as of 2012)
• Cloud deployment environments (Ruiz-2011)
  • Different combinations of pricing, support, service level agreements, and management APIs
  • Public providers (Amazon, Windows Azure, Xeround, etc.)
  • Private providers (Eucalyptus, OpenNebula, etc.)

Page 4: Model-Driven Cloud Data Storage

(Katsov-2012)

Use the right tool for the right job…

How do I know which is the right tool for the right job?


Page 5: Model-Driven Cloud Data Storage

Problem
• How to specify data requirements for cloud environments?
• For a set of data requirements, how to choose an appropriate combination of cloud storage system implementation and deployment provider?
• How to generate and manage everything required to work with the selected combination?

Page 6: Model-Driven Cloud Data Storage

Existing solutions
• Integration of cloud storage platforms (Livenson-2011)
  • Cloud Data Management Interface (CDMI) (SNIA-2011) proxy to integrate blob and queue data stores
• Data integration over NoSQL stores (Curé-2011)
  • Integration of relational and NoSQL databases (document, column)
  • Focus on efficient answering of queries
• Storage provider selection (Ruiz-2011, Ruiz-2012)
  • Characterize the features of storage providers (e.g., performance, cost)
  • Specify requirements for application datasets (e.g., expected size, access latency, concurrent clients)
  • Based on the previous information, an assignment of datasets to different storage systems is proposed
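To make the storage provider selection idea concrete, here is a minimal Python sketch of assigning datasets to providers. It is not the decision procedure of the cited work: it simply picks, for each dataset, the cheapest provider that satisfies its latency and concurrency requirements. All provider names, numbers and field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Provider:
    name: str
    cost_per_gb: float   # hypothetical unit: currency per GB per month
    latency_ms: float    # typical access latency
    max_clients: int     # concurrent clients supported


@dataclass(frozen=True)
class Dataset:
    name: str
    size_gb: float            # expected size
    max_latency_ms: float     # required access latency
    concurrent_clients: int   # expected concurrent clients


def assign(datasets, providers):
    """Map each dataset name to the cheapest feasible provider name, or None."""
    assignment = {}
    for ds in datasets:
        # Keep only providers that meet the dataset's requirements.
        feasible = [
            p for p in providers
            if p.latency_ms <= ds.max_latency_ms
            and p.max_clients >= ds.concurrent_clients
        ]
        # Greedy choice: lowest storage cost for this dataset's size.
        best = min(feasible, key=lambda p: p.cost_per_gb * ds.size_gb, default=None)
        assignment[ds.name] = best.name if best else None
    return assignment


# Hypothetical catalogue and requirements, for illustration only.
PROVIDERS = [
    Provider("provider-a", cost_per_gb=0.10, latency_ms=120.0, max_clients=100),
    Provider("provider-b", cost_per_gb=0.25, latency_ms=20.0, max_clients=1000),
]
DATASETS = [
    Dataset("logs", size_gb=500.0, max_latency_ms=200.0, concurrent_clients=10),
    Dataset("sessions", size_gb=5.0, max_latency_ms=50.0, concurrent_clients=500),
    Dataset("realtime", size_gb=1.0, max_latency_ms=5.0, concurrent_clients=10),
]
RESULT = assign(DATASETS, PROVIDERS)
```

A dataset that no provider can satisfy maps to `None`, which signals that its requirements need to be relaxed or the provider catalogue extended.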

Page 7: Model-Driven Cloud Data Storage

Existing solutions
• Modeling as a Service (Bruneliere-2010)
  • Deploy and execute model-driven services over the Internet (SaaS)
• Design and deploy applications in the cloud (Peidro-2011)
  • Promotes graphical models to capture cloud requirements
  • Models automatically deployed to PaaS and IaaS environments
• Application design/execution in multiple clouds (Ardagna-2012)
  • MDE quality-driven method for design, development and operation
  • Monitoring and feedback system

Page 8: Model-Driven Cloud Data Storage

Limitations of existing solutions
• Support for a limited set of cloud storage interfaces
• Data integration often relies heavily on the relational model
• Limited information for the selection of data storage systems
• Consideration for high-level cloud models (SaaS) but limited support for low-level models (PaaS and IaaS)

Page 9: Model-Driven Cloud Data Storage

Objectives
1. Provide adequate notations and environments to characterize cloud data storage requirements
2. Select cloud data storage implementations and deployment providers
3. Manage the artifacts required to work with different combinations of cloud storage implementations and providers

Page 10: Model-Driven Cloud Data Storage

Objectives
[Figure. Labels: Conceptual models; Cloud requirements; High level of abstraction (conceptual models and environments); Low level of abstraction (storage implementations and providers); Logical model (×3); Physical model (×3); Selection process; Artifacts management]

Page 11: Model-Driven Cloud Data Storage

Proposed solution
• Rely on Model-Driven Engineering (MDE) (Kent-2002) to:
  • Characterize cloud storage requirements
  • Encapsulate selection, administration and use of cloud data storage implementations
• Why MDE?
  • Avoid dependencies between high-level (data models) and low-level abstractions (storage implementations and providers)
  • Emphasis on relying on different levels of modeling notations
  • Generation of low-level abstractions by using automatic transformation procedures

Page 12: Model-Driven Cloud Data Storage

Objective 1: Data requirements for the cloud
• Do traditional modeling notations (ER and UML diagrams) make sense for data storage in the cloud?
  • Define or extend notations and environments for cloud data modeling
• What requirements should a cloud data storage notation consider?
  • Rely on quality standards (ISO/IEC SQuaRE, S-Cube) to guide this analysis. Examples: performance, efficiency, portability, etc.
• How to characterize the proposed requirements?
  • Associate quality metrics relevant to (cloud) scenarios, based on the characteristics of the reference standard (Jureta-2010)
  • Validate currently proposed metrics. For example: throughput, cost, access latency, etc.
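One way to picture the association between quality characteristics and metrics is as annotations attached to an entity of the data model. The Python sketch below is only an illustration under that assumption: the class names, the `kind` convention ("max" or "min") and the example values are hypothetical and do not come from the proposed notation.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class MetricTarget:
    metric: str    # e.g. "throughput", "cost", "access_latency"
    unit: str
    bound: float
    kind: str      # "max": observed must not exceed bound; "min": must not fall below it

    def __post_init__(self):
        if self.kind not in ("max", "min"):
            raise ValueError(f"unknown kind: {self.kind!r}")

    def satisfied_by(self, observed: float) -> bool:
        if self.kind == "max":
            return observed <= self.bound
        return observed >= self.bound


@dataclass
class EntityRequirements:
    entity: str            # entity of the data model, e.g. a UML class
    characteristic: str    # quality characteristic, e.g. "performance"
    targets: list = field(default_factory=list)

    def unmet(self, observed: dict) -> list:
        """Return the metrics that are missing from `observed` or violate their target."""
        return [
            t.metric for t in self.targets
            if t.metric not in observed or not t.satisfied_by(observed[t.metric])
        ]


# Hypothetical requirements for an "Order" entity.
order = EntityRequirements(
    entity="Order",
    characteristic="performance",
    targets=[
        MetricTarget("access_latency", "ms", 50.0, "max"),
        MetricTarget("throughput", "ops/s", 1000.0, "min"),
    ],
)
UNMET = order.unmet({"access_latency": 35.0, "throughput": 800.0})
```

Keeping the targets as plain data makes it possible to check them later against measured values, which is what the selection step needs.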

Page 13: Model-Driven Cloud Data Storage

Objective 2: Data storage selection
• Based on the analysis of historic data and usage patterns
  • Both in test applications and within systems generated in our modeling environment
• Monitoring data is gathered in a non-intrusive manner
  • AOP monitoring
  • Monitor the behaviour of the selected implementation/providers, based on the metrics specified in the modeling environment
  • Compare expected values and actual performance
• Monitoring data is shared in an open, collaborative manner
  • Used by our decision process
  • Available for external users
• Users could work, at the same time, with multiple combinations of storage implementations and providers
  • Test the performance of the different combinations
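The non-intrusive monitoring described above can be approximated in Python with a decorator, which plays a role similar to "around" advice in AOP: the storage code itself is not modified, only wrapped. This is a sketch under that assumption and not the monitoring aspects of the actual tool. The names `monitored`, `compare`, `InMemoryStore` and the latency limits are hypothetical.

```python
import functools
import time

OBSERVATIONS = []  # (operation name, elapsed seconds) pairs


def monitored(operation, clock=time.perf_counter):
    """Wrap a function so that every call records its elapsed time."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            start = clock()
            try:
                return func(*args, **kwargs)
            finally:
                # Recorded even if the wrapped call raises.
                OBSERVATIONS.append((operation, clock() - start))
        return wrapper
    return decorator


def compare(expected_max_seconds):
    """Compare the mean observed latency of each operation with its expected limit."""
    report = {}
    for operation, limit in expected_max_seconds.items():
        samples = [t for name, t in OBSERVATIONS if name == operation]
        if samples:
            mean = sum(samples) / len(samples)
            report[operation] = {"mean": mean, "limit": limit, "ok": mean <= limit}
    return report


class InMemoryStore:
    """Stand-in for a real storage implementation."""

    def __init__(self):
        self._data = {}

    @monitored("store.put")
    def put(self, key, value):
        self._data[key] = value

    @monitored("store.get")
    def get(self, key):
        return self._data.get(key)


store = InMemoryStore()
store.put("k", "v")
VALUE = store.get("k")
REPORT = compare({"store.put": 1.0, "store.get": 1.0})
```

Running the same wrapped operations against several store implementations would yield one report per combination, which is the kind of data a comparison of expected and actual performance could use.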

Page 14: Model-Driven Cloud Data Storage

Objective 3: Cloud artifacts management
• Generate the low-level artifacts to work with data storage implementations and deployment providers
  • Configuration files for deployment providers
  • Data management interfaces (CDMI, Spring Data, etc.)
• Different levels of transformation procedures
  • From the high-level data model to an intermediate Domain Specific Language (DSL) (Liu-2010, SpringRoo-2012)
  • From the intermediate DSL to configuration files, AOP monitoring aspects and data management interfaces (SpringData-2012)
• MDE transformation techniques
  • Model-to-Model (M2M), Model-to-Text (M2T)
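The two transformation levels can be sketched as a small Python pipeline: an M2M step that turns a nested data model into a flat list of intermediate statements, followed by an M2T step that renders those statements as text. The model layout and the command syntax are assumptions for illustration. The output is only loosely modeled on Spring Roo shell scripts and is not verified Roo syntax, which depends on the Roo version and needs further options.

```python
# Hypothetical high-level data model (e.g. extracted from a UML class diagram).
MODEL = {
    "package": "~.domain",
    "entities": [
        {"name": "Customer", "fields": [("name", "string"), ("age", "number")]},
        {"name": "Order", "fields": [("placedOn", "date")]},
    ],
}


def to_dsl(model):
    """M2M step: flatten the nested model into intermediate DSL statements (tuples)."""
    statements = []
    for entity in model["entities"]:
        statements.append(("entity", f'{model["package"]}.{entity["name"]}'))
        for field_name, field_type in entity["fields"]:
            statements.append(("field", field_type, field_name))
    return statements


def to_text(statements):
    """M2T step: render each intermediate statement as one line of a Roo-like script."""
    lines = []
    for statement in statements:
        if statement[0] == "entity":
            lines.append(f"entity --class {statement[1]}")
        else:
            _, field_type, field_name = statement
            lines.append(f"field {field_type} --fieldName {field_name}")
    return "\n".join(lines)


SCRIPT = to_text(to_dsl(MODEL))
```

Keeping the intermediate statements separate from the rendering means a second M2T function could emit a different artifact, such as a provider configuration file, from the same statements.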

Page 15: Model-Driven Cloud Data Storage

Proof of concept
• Extension - Model2Roo (http://code.google.com/p/model2roo/)
[Figure. Labels: High-level abstractions (1); Low-level abstractions (2); UML class diagram; Spring Roo; Java web app; Spring Data; Work in progress…; Graph database; Relational database]

Page 16: Model-Driven Cloud Data Storage

Preliminary results
• Castrejón, J., Vargas-Solar, G., Collet, C., Lozano, R.: “Model-Driven Cloud Data Storage”. In: First International Workshop on Model-Driven Engineering on and for the Cloud (CloudMDE 2012). Co-located with ECMFA ’12. July 2012
• Castrejón, J., Vargas-Solar, G., Lozano, R.: “Model2Roo: Web Application Development based on the Eclipse Modeling Framework and Spring Roo”. In: First Workshop on Academics Modeling with Eclipse (ACME 2012). Co-located with ECMFA ’12. July 2012

Page 17: Model-Driven Cloud Data Storage

Demonstration / Questions


Contact: [email protected]

Page 18: Model-Driven Cloud Data Storage

References
• Ardagna, D., Di Nitto, E., Casale, G., et al.: MODACLOUDS, A Model-Driven Approach for the Design and Execution of Applications on Multiple Clouds. In: Models in Software Engineering Workshop (MiSE 2012). Co-located with ICSE ’12 (2012)
• Armbrust, M., Fox, A., Griffith, R., Joseph, A.D., et al.: Above the Clouds: A Berkeley View of Cloud Computing (2009)
• Bruneliere, H., Cabot, J., Jouault, F.: Combining model-driven engineering and cloud computing. In: Modeling, Design, and Analysis for the Service Cloud Workshop. MDA4ServiceCloud ’10 (2010)
• Cattell, R.: Scalable SQL and NoSQL data stores. SIGMOD Rec. 39, 12–27 (May 2011)
• Curé, O., Hecht, R., Le Duc, C., Lamolle, M.: Data Integration over NoSQL Stores Using Access Path Based Mappings. In: Proceedings of the 22nd International Conference on Database and Expert Systems Applications (DEXA 2011). A. Hameurlain et al. (Eds.), Part I, LNCS 6860, pp. 481–495 (2011)
• Edlich, S.: List of NoSQL databases. http://nosqldatabase.org/ (March 2012)
• Fowler, M.: Polyglot persistence. http://martinfowler.com/bliki/PolyglotPersistence.html (November 2011)
• Jureta, I., Borgida, A., Ernst, N., Mylopoulos, J.: Techne: Towards a New Generation of Requirements Modeling Languages with Goals, Preferences, and Inconsistency Handling. In: Proceedings of the 18th IEEE International Requirements Engineering Conference. pp. 115–124. RE 2010. IEEE Computer Society (2010)
• Katsov, I.: NoSQL data modeling techniques. http://highlyscalable.wordpress.com/2012/03/01/nosql-data-modeling-techniques/ (March 2012)

Page 19: Model-Driven Cloud Data Storage

References
• Kent, S.: Model driven engineering. In: Butler, M., Petre, L., Sere, K. (eds.) Integrated Formal Methods, LNCS, vol. 2335, pp. 286–298. Springer Berlin (2002)
• Lenzerini, M.: Data integration is harder than you thought. In: Proceedings of the 9th International Conference on Cooperative Information Systems. pp. 22–26. CoopIS ’01, Springer-Verlag, London, UK (2001)
• Livenson, I., Laure, E.: Towards Transparent Integration of Heterogeneous Cloud Storage Platforms. In: Fourth International Workshop on Data Intensive Distributed Computing. DIDC ’11. Co-located with HPDC ’11 (2011)
• Liu, D., Zic, J.: Cloud#: A specification language for modeling cloud. In: Proceedings of the 2011 IEEE 4th International Conference on Cloud Computing. pp. 533–540. CLOUD ’11, IEEE Computer Society, Washington, DC, USA (2011)
• Peidro, J.E., Muñoz-Escoí, F.D.: Towards the next generation of model driven cloud platforms. In: 1st International Conference on Cloud Computing and Services Science. pp. 494–500. CLOSER ’11 (2011)
• Ruiz-Alvarez, A., Humphrey, M.: An automated approach to cloud storage service selection. In: Proceedings of the 2nd International Workshop on Scientific Cloud Computing. pp. 39–48. ScienceCloud ’11, ACM, New York, NY, USA (2011)
• Ruiz-Alvarez, A., Humphrey, M.: A model and decision procedure for data storage in cloud computing. In: Proceedings of the IEEE/ACM International Symposium on Cluster, Cloud, and Grid Computing. CCGrid ’12 (2012)
• Storage Networking Industry Association (SNIA): Cloud data management interface (CDMI). http://www.snia.org/cdmi (September 2011)
• SpringSource: Spring Data projects. http://www.springsource.org/spring-data (March 2012)
• SpringSource: Spring Roo. http://www.springsource.org/spring-roo (March 2012)