2010.04.09 domain specific cloud components for general availability in the research

15 April, 2010 - slide 1 MOSE University of Trieste Domain Specific Cloud Components for General Availability in the Research Marco Parenzan Tenure - Web Service Programming Computer Engineering University of Trieste Researcher Methodologies and Tools MOSE Laboratory University of Trieste Maurizio Fermeglia Full Professor Chemical Engineering MOSE Laboratory University of Trieste

Upload: marco-parenzan

Post on 04-Jul-2015


Category:

Technology



DESCRIPTION

This work deals with the availability of cloud computing to computational research labs. We focus on the concept of availability, which has two different interpretations:

“Available” as an accessible resource, always, from everywhere

“Available” as the ability to consume a service (as a client or as the publisher)

This paper focuses on the second interpretation: a cloud service is “available” if it is easy for anyone in the academic community (and beyond) to consume the cloud. Indeed, the cloud allows sharing “knowledge” in the form of components or data to be “executed” in the cloud. The challenge is to make it possible for researchers, not necessarily expert in programming and computer science, to make their knowledge available in the form of components and data tables.

The solution we propose is based on Domain Specific Languages (DSLs), by which a researcher expresses the components in her/his specific language, which is user-friendly since it is directly related to the particular research field. In this framework, cloud components are expressed in terms of a generic mathematical model rather than a software component. This vision is quite common in computing thanks to the availability of many tools that simplify the development of DSLs, such as dynamic languages like IronRuby or revolutionary data access with SQL Server Modeling.

The objective of this work is to present a model of a general “Domain Specific Cloud Component” (DSCC) that can be expressed, published and consumed by the research community using tools that allow an easy and direct implementation of the mathematical algorithms developed by the scientists. The general concept will be applied to specific examples by developing frameworks customized to share a specific “DSCC”.
Examples will be taken in the area of multiscale molecular modeling for the design of nanostructured polymer systems (nanotechnology) and the estimation of the environmental impact of a production process (sustainability).

TRANSCRIPT

Page 1: 2010.04.09   domain specific cloud components for general availability in the research

15 April, 2010 - slide 1, MOSE – University of Trieste

Domain Specific Cloud Components for General Availability in the Research

Marco Parenzan

• Tenure - Web Service Programming, Computer Engineering – University of Trieste

• Researcher, Methodologies and Tools, MOSE Laboratory – University of Trieste

Maurizio Fermeglia

• Full Professor, Chemical Engineering, MOSE Laboratory – University of Trieste

Page 2

Vision

Multi-Scale Molecular Modeling will revolutionize the world of research and industrial production in the coming years by strongly accelerating the development of new products.

Mission

Material Sciences: thermophysical properties for materials, polymer technology and nanoscience/nanotechnology

Life Sciences: drug-receptor interactions, drug-design, QSAR, drug-delivery…

Process simulation: process synthesis, design, modeling for chemical, biochemical, energy production

MOSE: Molecular Simulation Engineering


Page 3

Multiscale Molecular Modeling

[Figure: modeling scales plotted by characteristic length (1 nm, 1 μm, 1 mm, 1 m) against characteristic time (femtoseconds, picoseconds, nanoseconds, microseconds, seconds, minutes, hours, years). Scales shown: Quantum Mechanics (electrons), Molecular Mechanics (atoms), Mesoscale modeling (segments), Process Simulation, FEM, Engineering design.]

Page 4

Message Passing Multiscale Molecular Modeling

[Figure: Quantum Mechanics (electrons), Molecular Mechanics (atoms), Mesoscale modeling (segments), Process Simulation, FEM and Engineering design, linked by message passing.]

Page 5

Cloud-based Message Passing for Multiscale Molecular Modeling

[Figure: Quantum Mechanics (electrons), Molecular Mechanics (atoms), Mesoscale modeling (segments), Process Simulation, FEM and Engineering design, with the message passing between them hosted in the cloud.]
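As a rough illustration of the message passing between scales named on these slides, the sketch below chains per-scale steps so that the output message of one scale becomes the input of the next. The step functions, field names and numeric factors are invented placeholders for illustration only; they are not MOSE models.

```python
from typing import Callable, Dict, List

Message = Dict[str, float]          # a message is a plain bag of named values
Step = Callable[[Message], Message]


def quantum_mechanics(msg: Message) -> Message:
    # Placeholder: derive a force-field parameter from an electronic property.
    return {**msg, "force_field": msg["charge"] * 2.0}


def molecular_mechanics(msg: Message) -> Message:
    # Placeholder: derive an interaction parameter from the force field.
    return {**msg, "interaction": msg["force_field"] + 1.0}


def mesoscale(msg: Message) -> Message:
    # Placeholder: derive a bulk property from the interaction parameter.
    return {**msg, "bulk_property": msg["interaction"] * 10.0}


def run_pipeline(steps: List[Step], first: Message) -> Message:
    """Pass the message through every scale, in order."""
    msg = first
    for step in steps:
        msg = step(msg)
    return msg


if __name__ == "__main__":
    result = run_pipeline(
        [quantum_mechanics, molecular_mechanics, mesoscale],
        {"charge": 0.5},
    )
    print(result)
```

Each step only reads and extends the message, so a step could in principle run anywhere as long as it receives the previous message.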

Page 6

MOSE in the cloud…

…no. Why?

Because it heavily depends on software (molecular simulation, process simulation) that is not on the Cloud

Does MOSE need the Cloud?

Yes. People need to share knowledge based on simulation and, more generally, on computation

Can MOSE access the Cloud “alone”? No, not at the moment

The actors:

Chemists, Chemical Engineers, Materials Engineers, Biologists, Medical Doctors …

Only “Computer Science” classes in the first two years of the Engineering curriculum (some C/C++, no VB(A) or .NET, some Matlab)

But they need programs to solve their problems…

…and sometimes they try to write them!

Page 7

Objectives of this Research

Move MOSE to the Cloud!

We cannot wait for software companies

Computer engineers can “simply” write this code

But they need to speak with (non-computer) engineers about the details (Analysis, Specifications, “DOMAIN”)

Why don't we enable (non-computer) scientists to write their own code?

Simplifying (programming) tools to consume the Cloud

What does “consume” mean?

Write Algorithms

Write plug-ins for existing apps

Write Custom Programs

Page 8

Simplification development path

We are 'still @ C++' (some apps need C++ plug-ins/custom code)

We have already stepped into the CLR world. Example: our development in CAPE-OPEN (http://co-lan.org)

The next step is dynamic languages such as Python or Ruby

Then the world of DSLs for data (custom data texts)

Win32 | CLR | DLR | DSL

C/C++ | C#/VB | Python/Ruby | Custom DSL

Page 9

Dynamic Languages on .NET

[Figure: IronPython, IronRuby, C#, VB.NET and other languages sit on top of the Dynamic Language Runtime (Expression Trees, Dynamic Dispatch, Call Site Caching), which uses the Python, Ruby, COM, JavaScript and Object binders.]

Page 10

Why should we care then?

More languages, more options

DLR gives apps instant scripting abilities

C# has moved in that direction too!

LINQ

Lambda expressions

Parallel extensions (C# 4.0)

'dynamic' (C# 4.0) and 'var' keywords

C# 1.0: Managed Code

C# 2.0: Generics

C# 3.0: Language Integrated Query

C# 4.0: Dynamic Programming
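To illustrate what “instant scripting abilities” can mean in practice, the sketch below shows a host application running a user-supplied script against the values it chooses to expose. It uses plain Python's built-in `exec` as a stand-in for a DLR script host; the function name `run_user_script` and the variable names are assumptions made for illustration, and the actual DLR hosting API is not shown.

```python
def run_user_script(source: str, inputs: dict) -> dict:
    """Run a user-written script and return the variables it defined.

    The host decides which names the script can see (`inputs`);
    everything the script assigns ends up in the returned dictionary.
    """
    scope = dict(inputs)           # copy so the caller's dict is untouched
    exec(source, {}, scope)        # empty globals; the script works in `scope`
    return {k: v for k, v in scope.items() if k not in inputs}


if __name__ == "__main__":
    script = "total = sum(values)\nmean = total / len(values)"
    print(run_user_script(script, {"values": [1.0, 2.0, 3.0]}))
```

The scientist writes only the two-line script; the surrounding application supplies the data and collects the results. Running untrusted scripts this way is not sandboxed, so a real host would need additional safeguards.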

Page 11

What is "Oslo"?

THE PLATFORM FOR MODEL-DRIVEN APPLICATIONS

“M”: the language for authoring models & DSLs

“Quadrant”: the tool for interacting with models & DSLs

Repository: the database for storing & sharing models

Page 12

"Oslo" Architecture

[Figure: architecture diagram. Labels shown include: LANGUAGE FRAMEWORK (MSchema, MGrammar, MGraph, [Your Visual DSL], [Your Textual DSL]); “QUADRANT” EDITOR FRAMEWORK (Composition, Generic Viewers, Dataflow); REPOSITORY on SQL SERVER ([Your Models], Base Models, .Net Models, Repository Models, “M” Runtime); RUNTIMES ([Your Runtime], “Dublin”, ASP.NET, WF, WCF, SQL/EDM, Windows, Other ISV Runtimes); OTHER TOOLS (VSTS, EXCEL, …); ADO .NET; XML, Custom Formats, …]

Page 13

GOAL: SCALABLE, DURABLE STORAGE

Windows Azure storage is an application managed by the Fabric Controller

Windows Azure applications can use native storage or SQL Azure

Application state is kept in storage services, so worker roles can replicate as needed

Blobs: large, unstructured data (audio, video, etc)

Tables: simply structured data, accessed using ADO.NET Data Services

Queues: serially accessed messages or requests, allowing web-roles and worker-roles to interact

Storage in Windows Azure
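The division of labour described on this slide (a web role enqueues requests, a worker role processes them, and all state lives in storage) can be sketched with in-memory stand-ins. Nothing here calls Windows Azure: `queue.Queue` plays the queue service and two dictionaries play blob and table storage. All function and key names are hypothetical.

```python
import queue

blobs = {}                 # stand-in for blob storage: name -> raw bytes
table = {}                 # stand-in for table storage: key -> simple record
requests = queue.Queue()   # stand-in for the queue between the two roles


def web_role_submit(job_id: str, payload: bytes) -> None:
    """Front end: store the input, record the job, enqueue a request."""
    blobs[job_id + "/input"] = payload
    table[job_id] = {"status": "queued"}
    requests.put(job_id)


def worker_role_step() -> bool:
    """Background worker: process one queued request, if there is one."""
    try:
        job_id = requests.get_nowait()
    except queue.Empty:
        return False
    data = blobs[job_id + "/input"]
    blobs[job_id + "/output"] = data.upper()   # placeholder computation
    table[job_id] = {"status": "done"}
    return True


if __name__ == "__main__":
    web_role_submit("job-1", b"hello")
    while worker_role_step():
        pass
    print(table["job-1"], blobs["job-1/output"])
```

Because the worker keeps no state of its own, several copies of it could drain the same queue, which is the point the slide makes about replication.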

Page 14

Simplification steps

1. Write apps running on cloud

   1. Windows Azure

   2. (ASP.NET MVC2) Web Role for the front-end

   3. Worker Role for background processing

   4. Table, Blob and Queue for “unstructured”, but easy, storage

2. Use Dynamic Languages to do the processing

   1. Simplified deployment

   2. Simplified “code” model

   3. Simplified type management (dynamic typing, no variable declaration)

   4. Now fully integrated in .NET with DLR and IronPython and IronRuby

3. Input and Output as structured text

   1. “M” (in “Oslo”, now SQL Server Modeling) gives us a generic schema language (more general than XSD) that is more “readable” than XML

   2. This gives structure and metadata to the Azure Storage data (as requested by Ed Lazowska in his wonderful keynote yesterday)
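The third step, input and output as structured text with a schema attached, can be illustrated with a small sketch. It does not use “M” or its syntax; it assumes a hypothetical `name = value` text format and a schema given as a Python dictionary, only to show how a schema turns flat text into typed, checked data.

```python
SCHEMA = {"label": str, "rows": int, "size_nm": float}   # hypothetical fields


def parse_message(text: str, schema: dict) -> dict:
    """Parse `name = value` lines and convert each value using the schema."""
    record = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue                      # skip blank lines and comments
        name, _, raw = line.partition("=")
        name, raw = name.strip(), raw.strip()
        if name not in schema:
            raise KeyError(f"unknown field: {name}")
        record[name] = schema[name](raw.strip('"'))
    missing = set(schema) - set(record)
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    return record


if __name__ == "__main__":
    text = 'label = "Input Vector"\nrows = 3\nsize_nm = 100'
    print(parse_message(text, SCHEMA))
```

The text stays readable for a non-programmer, while the schema carries the metadata that a plain flat file lacks.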

Page 15

Domain Specific Cloud Components for General Availability in the Research

Demo

Page 16

The matrix was too simple?

This is a two-dimensional matrix of three-dimensional vectors

Size of cube is: 100 nanometers

Page 17

Domain Specific Cloud Components for General Availability in the Research

Conclusions

Page 18

The results

1. Write apps running on cloud

   1. Windows Azure

   2. (ASP.NET MVC2) Web Role for the front-end

   3. Worker Role for background processing

2. Use Dynamic Languages to do the processing

   1. Simplified deployment

   2. Simplified “code” model

   3. Simplified type management (dynamic typing, no variable declaration)

   4. Now fully integrated in .NET with DLR and IronPython and IronRuby

3. Input and Output as structured text

   1. Oslo (now SQL Server Modeling) gives us a generic schema language (more general than XSD) that is more “readable” than XML

   2. Structured text as data sources

Page 19

Conclusions

Why does MOSE need the cloud?

To build a platform to orchestrate the message passing in Multiscale Molecular Modeling activity

To empower our research team with a flexible scientific platform that drives efficiency, collaboration and innovation

In the demo we have seen

The “creation” and the execution (invocation) of a single step of the process

The input and the output are the “messages” that walk through the scales

The code:

Definition of a library for a generic cloud component

Usage of Dynamic Languages (IronPython)

A new opportunity in .NET development

More productive (PLLs, as told by Armando Fox yesterday)

Simpler for non-programmers

Application of DSLs (Oslo) for the definition of simple input/output messages

More familiar to scientific people

Simpler to implement than a graphical UI

It gives metadata/schema to flat files (as requested by Ed Lazowska in his wonderful keynote yesterday)

What's next?

Page 20

What is next?

Continue with the project

The definition of a process (an orchestration)

Did you see the session from Paul Watson yesterday? (“Cloud Computing from chemical Property Prediction”)

The users in the process

Collaboration in the process

Again, as Paul said, we agree on a structure like a “social science community”, a Web 2.0 application

Security, Confidentiality

Verticalization on the domain

Remove all the nitty-gritty details that lower the experience

Define custom component Languages

Page 21

Simplification steps

1. Write apps running on cloud

   1. Windows Azure

   2. (ASP.NET MVC2) Web Role for the front-end

   3. Worker Role for background processing

   4. Table, Blob and Queue for “unstructured”, but easy, storage

2. Use Dynamic Languages to do the processing

   1. Simplified deployment

   2. Simplified “code” model

   3. Simplified type management (dynamic typing, no variable declaration)

   4. Now fully integrated in .NET with DLR and IronPython and IronRuby

3. Input and Output as structured text

   1. “M” (in “Oslo”, now SQL Server Modeling) gives us a generic schema language (more general than XSD) that is more “readable” than XML

   2. This gives structure and metadata to the Azure Storage data

4. Write DSL

Page 22

Writing a Custom DSL

(Supposed) needs of the “non-programmer”

Libraries

Integrated functionalities

No “include”

Data Access as Libraries

Connect Command Execute LINQ

Define Datasource (Metadata), no SQL schema

All-in-one

One Component, one “file” (as much as possible)

Simplifying deployment

Needs of the programmer

Not so (much) imperative, not so (much) functional, not so (much) object oriented

State is not so bad

Lambdas are cool (no functions, all lambdas)

Escape to power (if DSL is “poor”)

Backend of a full language, totally integrated

DLR, (Iron)Python, (Iron)Ruby, (Iron)JS (Javascript) and so on

cloud component
  # naming part (entry point)
  Name = "test 0004"
  # declarative part
  # sections like cobol
  input
    i(label = "Input Vector")
  data
    # static declaration
    m(name = "matrix01" # this is the "query"
      label = "multiplication matrix")
  output
    o(label = "Output Vector")
  # coding part
  # dynamic like python (and vb)
  # verbose like visual basic
  code "this is the main"
    # alternative syntax of query from storage
    # calculated
    m = lookup in Matrici
        for NomeMatrice
    ### multiline
    comment ###
    assign 0 to r
    while r is less than m.rows do
      assign 0 to c
      assign 0nm to a
      while c is less than m.cols do
        #a = a + m(r,c) * i(c)
        increment a by m(r,c) * i(c)
        # python has no matrix, but jagged arrays
        increment c by 1
      end do
      assign a to o
      increment r by 1
    end do
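For comparison, the computation that the component above appears to describe, multiplying the stored matrix `m` by the input vector `i` to produce the output vector `o`, can be written in plain Python as follows. This is one reading of the DSL, not the authors' generated code: it assumes each row's accumulated value `a` becomes one element of `o`, and it ignores the `nm` unit suffix.

```python
def multiply(m, i):
    """Return o where o[r] is the sum over c of m[r][c] * i[c].

    `m` is a list of rows (a jagged array, as the DSL comment notes)
    and `i` is the input vector.
    """
    o = []
    for row in m:
        if len(row) != len(i):
            raise ValueError("row length must match input vector length")
        a = 0
        for c, value in enumerate(row):
            a += value * i[c]
        o.append(a)
    return o


if __name__ == "__main__":
    matrix01 = [[1, 2], [3, 4]]
    print(multiply(matrix01, [10, 1]))
```

The DSL version is longer but reads closer to a verbal description of the algorithm, which is the trade-off the slide argues for.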

Page 23

Domain Specific Cloud Components for General Availability in the Research

Q&A

Marco Parenzan web presence

Blog: http://blog.codeisvalue.com/

E-mail: [email protected]

Facebook: parenzan.marco

Twitter: marco_parenzan

Skype: marco.parenzan

Live: [email protected]

Slides: http://www.slideshare.com/marco.parenzan

Page 24

MOSE: Molecular Simulation Engineering

Department of Materials and Natural Resources (DMRN)

University of Trieste

Piazzale Europa 1, 34127 Trieste (Italy)

http://www.mose.units.it/

Maurizio Fermeglia

[email protected]