2010.04.09 domain specific cloud components for general availability in the research
DESCRIPTION
This work deals with the availability of cloud computing to computational research labs. We focus on the concept of availability, which has two different interpretations:
• “Available” as an accessible resource, always and from everywhere
• “Available” as the ability to consume a service (as a client or as the publisher)
This paper focuses on the second interpretation: a cloud service is “available” if it is easy for anyone in the academic community (and beyond) to consume the cloud. Indeed, the cloud allows sharing “knowledge” in the form of components or data to be “executed” in the cloud. The challenge is to make it possible for researchers, who are not necessarily experts in programming and computer science, to make their knowledge available in the form of components and data tables.
The solution we propose is based on Domain Specific Languages (DSLs), by which researchers express components in their own specific language, which is user-friendly because it is directly related to their particular research field. In this framework, cloud components are expressed in terms of a generic mathematical model rather than a software component. This vision is quite common in computing thanks to the availability of many tools that simplify the development of DSLs, such as dynamic languages like IronRuby or revolutionary data access with SQL Server Modeling.
The objective of this work is to present a model of a general “Domain Specific Cloud Component” (DSCC) that can be expressed, published and consumed by the research community using tools that allow an easy and direct implementation of the mathematical algorithms developed by scientists. The general concept will be applied to specific examples by developing frameworks customized to share a specific “DSCC”.
Examples will be taken from the areas of multiscale molecular modeling for the design of nanostructured polymer systems (nanotechnology) and the estimation of the environmental impact of a production process (sustainability).
TRANSCRIPT
15 April, 2010 - slide 1MOSE – University of Trieste
Domain Specific Cloud Components for General Availability in the Research
Marco Parenzan
• Tenure - Web Service Programming, Computer Engineering – University of Trieste
• Researcher, Methodologies and Tools, MOSE Laboratory – University of Trieste
Maurizio Fermeglia
• Full Professor, Chemical Engineering, MOSE Laboratory – University of Trieste
15 April, 2010 - slide 2MOSE – University of Trieste
Vision
Multiscale Molecular Modeling will revolutionize the world of research and industrial production in the coming years by strongly accelerating the development of new products.
Mission
Material Sciences: thermophysical properties for materials, polymer technology and nanoscience/nanotechnology
Life Sciences: drug-receptor interactions, drug-design, QSAR, drug-delivery…
Process simulation: process synthesis, design, modeling for chemical, biochemical, energy production
MOSE: Molecular Simulation Engineering
15 April, 2010 - slide 3MOSE – University of Trieste
Multiscale Molecular Modeling
Figure: modeling methods arranged by characteristic length (1 Å, 1 nm, 1 μm, 1 mm, 1 m) and characteristic time (femtoseconds, picoseconds, nanoseconds, microseconds, seconds, minutes, hours, years): Quantum Mechanics (electrons), Molecular Mechanics (atoms), Mesoscale modeling (segments), Process Simulation, FEM, Engineering design
15 April, 2010 - slide 4MOSE – University of Trieste
Message Passing Multiscale Molecular Modeling
Figure: Quantum Mechanics (electrons), Molecular Mechanics (atoms), Mesoscale modeling (segments), Process Simulation, FEM, Engineering design
15 April, 2010 - slide 5MOSE – University of Trieste
Cloud-based Message Passing for Multiscale Molecular Modeling
Figure: Quantum Mechanics (electrons), Molecular Mechanics (atoms), Mesoscale modeling (segments), Process Simulation, FEM, Engineering design
15 April, 2010 - slide 6MOSE – University of Trieste
MOSE in the cloud…
…no. Why?
Because it depends heavily on software (molecular simulation, process simulation) that is not on the Cloud
Does MOSE need the Cloud?
Yes. People need to share knowledge based on simulation and, more generally, on computation
Can MOSE access the Cloud “alone”? No, not at the moment
The actors:
Chemists, Chemical Engineers, Materials Engineers, Biologists, Medical Doctors …
Only “Computer Science” classes in the first two years of the Engineering curriculum (some C/C++, no VB(A) or .NET, some Matlab)
But they need programs to solve their problems…
…and sometimes they try to write them!
15 April, 2010 - slide 7MOSE – University of Trieste
Objectives of this Research
Move MOSE to the Cloud!
We cannot wait for software companies
Computer engineers can “simplify” writing this code
But they need to talk with (non-computer) engineers about the details (Analysis, Specifications, “DOMAIN”)
Why don't we enable (non-computer) scientists to write their own code?
Simplifying (programming) tools to consume the Cloud
What does “consume” mean?
Write Algorithms
Write Plug ins for existing apps
Write Custom Programs
15 April, 2010 - slide 8MOSE – University of Trieste
Simplification development path
We are “still @ C++” (some apps need C++ plug-ins/custom code)
We have already stepped into the CLR world. Example: our development in CAPE-OPEN (http://co-lan.org)
The next step is dynamic languages such as Python or Ruby
The DSL world for data (custom data texts)
Win32    CLR      DLR           DSL
C/C++    C#/VB    Python/Ruby   Custom DSL
15 April, 2010 - slide 9MOSE – University of Trieste
Dynamic Languages on .NET
IronPython, IronRuby, C#, VB.NET, Others…
Dynamic Language Runtime: Expression Trees, Dynamic Dispatch, Call Site Caching
Python binder, Ruby binder, COM binder, JavaScript binder, Object binder
15 April, 2010 - slide 10MOSE – University of Trieste
Why should we care then?
More languages, more options
DLR gives apps instant scripting abilities
C# has moved in that direction too!
LINQ
Lambda expressions
Parallel extensions (C# 4.0)
“dynamic” (C# 4.0) and “var” keywords
C# 1.0: Managed Code
C# 2.0: Generics
C# 3.0: Language Integrated Query
C# 4.0: Dynamic Programming
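The "instant scripting" point on this slide can be pictured with a small sketch. It uses plain Python's built-in `exec` as a stand-in for a hosted dynamic-language engine; the function `run_component`, the script text and the variable names `inputs` and `outputs` are assumptions made for illustration and are not taken from the original demo.

```python
# Minimal sketch of an application that lets a researcher supply the
# processing step as a script. Plain `exec` stands in for a hosted
# dynamic-language engine; all names here are hypothetical.

RESEARCHER_SCRIPT = """
# Written by the domain expert: scale every input value by a factor.
factor = 2.5
outputs = [factor * x for x in inputs]
"""


def run_component(script, inputs):
    """Run a user-supplied script with `inputs` bound and return `outputs`."""
    namespace = {"inputs": list(inputs)}
    exec(script, namespace)  # the script reads `inputs` and sets `outputs`
    if "outputs" not in namespace:
        raise ValueError("script did not define 'outputs'")
    return namespace["outputs"]


result = run_component(RESEARCHER_SCRIPT, [1.0, 2.0, 4.0])
print(result)
```

Since `exec` runs arbitrary code, a real cloud service would need to isolate such scripts; the sketch only shows how little host code is needed to accept a script.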
15 April, 2010 - slide 11MOSE – University of Trieste
What is "Oslo"?
THE PLATFORM FOR MODEL-DRIVEN APPLICATIONS
“M”: the language for authoring models & DSLs
“Quadrant”: the tool for interacting with models & DSLs
Repository: the database for storing & sharing models
15 April, 2010 - slide 12MOSE – University of Trieste
"Oslo" Architecture
Figure: architecture diagram with these blocks:
LANGUAGE FRAMEWORK: MSchema, MGrammar, MGraph, [Your Visual DSL], [Your Textual DSL]
“QUADRANT” EDITOR FRAMEWORK: Composition, Generic Viewers, Dataflow
REPOSITORY (SQL SERVER): [Your Models], Base Models, .Net Models, Repository Models, “M” Runtime
RUNTIMES: [Your Runtime], “Dublin”, ASP.NET, WF, WCF, SQL/EDM, Windows, Other ISV Runtimes
OTHER TOOLS (VSTS, EXCEL, …)
ADO .NET; XML, Custom Formats, …
15 April, 2010 - slide 13MOSE – University of Trieste
Storage in Windows Azure
GOAL: SCALABLE, DURABLE STORAGE
Windows Azure storage is an application managed by the Fabric Controller
Windows Azure applications can use native storage or SQL Azure
Application state is kept in storage services, so worker roles can replicate as needed
Blobs: large, unstructured data (audio, video, etc.)
Tables: simply structured data, accessed using ADO.NET Data Services
Queues: serially accessed messages or requests, allowing web roles and worker roles to interact
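The interplay of the three storage types can be sketched as follows. This is an in-memory illustration only: it does not use any Windows Azure SDK, and the names `web_role_submit`, `worker_role_step`, `blobs`, `table` and `requests` are hypothetical stand-ins for a front end, a background worker and the storage services.

```python
# In-memory stand-ins for the three storage abstractions named above.
# This does NOT use the Windows Azure SDK; every name is hypothetical.
import queue

blobs = {}                 # blob name -> bytes (large, unstructured data)
table = {}                 # (partition key, row key) -> dict of properties
requests = queue.Queue()   # messages from the web role to the worker role


def web_role_submit(job_id, payload):
    """Front end: store the payload as a blob and enqueue a request."""
    blobs[f"input/{job_id}"] = payload
    table[("jobs", job_id)] = {"status": "queued"}
    requests.put(job_id)


def worker_role_step():
    """Background worker: take one request, process it, record the result."""
    job_id = requests.get()
    data = blobs[f"input/{job_id}"]
    blobs[f"output/{job_id}"] = data.upper()   # placeholder computation
    table[("jobs", job_id)] = {"status": "done", "size": len(data)}
    requests.task_done()


web_role_submit("job-1", b"c2h6o")
worker_role_step()
print(table[("jobs", "job-1")])
print(blobs["output/job-1"])
```

Because all state lives in the storage stand-ins rather than in the worker, any number of workers could run `worker_role_step`, which is the point the slide makes about replication.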
15 April, 2010 - slide 14MOSE – University of Trieste
Simplification steps
1. Write apps running on cloud
   1. Windows Azure
   2. (ASP.NET MVC2) Web Role for the front-end
   3. Worker Role for background processing
   4. Table, Blob and Queue for “unstructured”, but easy, storage
2. Use Dynamic Languages to do the processing
   1. Simplified deployment
   2. Simplified “code” model
   3. Simplified type management (dynamic typing, no variable declaration)
   4. Now fully integrated in .NET with DLR and IronPython and IronRuby
3. Input and Output as structured text
   1. “M” (in “Oslo”, now SQL Server Modeling) gives us a generic schema language that is more general than XSD and more “readable” than XML
   2. This gives structure and metadata to the Azure Storage data (as requested by Ed Lazowska in his wonderful keynote yesterday)
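Step 3 can be illustrated with a small sketch of a schema applied to a flat text message. The `key = value` format, the `SCHEMA` dictionary and the field names are invented for this example; they are not “M” syntax and not the format used in the demo.

```python
# Sketch: a schema gives structure and metadata to a flat text message.
# The "key = value" format and the schema dict are hypothetical.

SCHEMA = {
    "name": str,
    "temperature_K": float,
    "steps": int,
}

MESSAGE = """
name = polymer run 7
temperature_K = 298.15
steps = 5000
"""


def parse_message(text, schema):
    """Parse 'key = value' lines and convert each value per the schema."""
    record = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, raw = line.partition("=")
        key = key.strip()
        if key not in schema:
            raise KeyError(f"unexpected field: {key}")
        record[key] = schema[key](raw.strip())
    missing = set(schema) - set(record)
    if missing:
        raise KeyError(f"missing fields: {sorted(missing)}")
    return record


record = parse_message(MESSAGE, SCHEMA)
print(record)
```

The message stays readable for a scientist, while the schema supplies the types that untyped storage would otherwise lack.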
15 April, 2010 - slide 15MOSE – University of Trieste
Domain Specific Cloud Components for General Availability in the Research
Demo
15 April, 2010 - slide 16MOSE – University of Trieste
Was the matrix too simple?
This is a two-dimensional matrix of three-dimensional vectors
Size of cube: 100 nanometers
15 April, 2010 - slide 17MOSE – University of Trieste
Domain Specific Cloud Components for General Availability in the Research
Conclusions
15 April, 2010 - slide 18MOSE – University of Trieste
The results
1. Write apps running on cloud
   1. Windows Azure
   2. (ASP.NET MVC2) Web Role for the front-end
   3. Web Role for background processing
2. Use Dynamic Languages to do the processing
   1. Simplified deployment
   2. Simplified “code” model
   3. Simplified type management (dynamic typing, no variable declaration)
   4. Now fully integrated in .NET with DLR and IronPython and IronRuby
3. Input and Output as structured text
   1. Oslo (now SQL Server Modeling) gives us a generic schema language that is more general than XSD and more “readable” than XML
   2. Structured text as data sources
15 April, 2010 - slide 19MOSE – University of Trieste
Conclusions
Why does MOSE need the cloud?
   To build a platform to orchestrate the message passing in Multiscale Molecular Modeling activity
   To empower our research team with a flexible scientific platform that drives efficiency, collaboration and innovation
In the demo we have seen
   The “creation” and the execution (invocation) of a single step of the process
   The input and the output are the “messages” that walk through the scales
The code:
   Definition of a library for a generic cloud component
   Usage of Dynamic Languages (IronPython)
      A new opportunity in .NET development
      More productive (PLLs, as Armando Fox said yesterday)
      Simpler for non-programmers
   Application of DSLs (Oslo) for the definition of simple input/output messages
      More comfortable for scientists
      Simpler to implement than a graphical UI
      It gives metadata/schema to flat files (as requested by Ed Lazowska in his wonderful keynote yesterday)
What's next?
15 April, 2010 - slide 20MOSE – University of Trieste
What is next?
Continue with the project
The definition of a process (an orchestration)
Did you see Paul Watson's session yesterday? (“Cloud Computing for Chemical Property Prediction”)
The users in the process
Collaboration in the process
Again, as Paul said, we agree on a structure like a “social science community”, a Web 2.0 application
Security, Confidentiality
Verticalization on the domain
Remove all the nitty-gritty details that lower the experience
Define custom component Languages
15 April, 2010 - slide 21MOSE – University of Trieste
Simplification steps
1. Write apps running on cloud
   1. Windows Azure
   2. (ASP.NET MVC2) Web Role for the front-end
   3. Worker Role for background processing
   4. Table, Blob and Queue for “unstructured”, but easy, storage
2. Use Dynamic Languages to do the processing
   1. Simplified deployment
   2. Simplified “code” model
   3. Simplified type management (dynamic typing, no variable declaration)
   4. Now fully integrated in .NET with DLR and IronPython and IronRuby
3. Input and Output as structured text
   1. “M” (in “Oslo”, now SQL Server Modeling) gives us a generic schema language that is more general than XSD and more “readable” than XML
   2. This gives structure and metadata to the Azure Storage data
4. Write DSL
15 April, 2010 - slide 22MOSE – University of Trieste
Writing a Custom DSL
(Supposed) needs of the “non-programmer”
   Libraries
      Integrated functionalities
      No “include”
   Data Access as Libraries
      Connect Command Execute LINQ
      Define Datasource (Metadata), no SQL schema
   All-in-one
      One Component, one “file” (as much as possible)
      Simplifying deployment
Needs of the programmer
   Not so (much) imperative, not so (much) functional, not so (much) object oriented
      State is not so bad
      Lambdas are cool (no functions, all lambdas)
   Escape to power (if DSL is “poor”)
      Backend of a full language, totally integrated
      DLR, (Iron)Python, (Iron)Ruby, (Iron)JS (Javascript) and so on
cloud component
    #naming part (entry point)
    Name = "test 0004"
    # declarative part
    # sections like cobol
    input
        i(label = "Input Vector")
    data
        # static declaration
        m(name = "matrix01" # this is the "query"
          label = "multiplication matrix")
    output
        o(label = "Output Vector")
    # coding part
    # dynamic like python (and vb)
    # verbose like visual basic
    code "this is the main"
        # alternative syntax of query from storage
        # calculated
        m = lookup in Matrici
            for NomeMatrice
        ### multiline
        comment ###
        assign 0 to r
        while r is less then m.rows do
            assign 0 to c
            assign 0nm to a
            while c is less then m.cols do
                #a = a + m(r,c) * i(c)
                increment a by m(r,c) * i(c)
                # python has no matrix, but jagged arrays
                increment c by 1
            end do
            assign a to o
            increment r by 1
        end do
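For comparison, the loop in the DSL sample above reads roughly as the following plain Python. This is one interpretation, not the authors' implementation: it assumes that `assign a to o` stores the accumulated value as the next element of the output vector, and the contents of `matrix01` are made up in place of the storage lookup.

```python
# Plain-Python reading of the DSL sample: multiply a stored matrix by the
# input vector. The matrix values stand in for the storage lookup.

def multiply(m, i):
    """Return the output vector o with o[r] = sum over c of m[r][c] * i[c]."""
    o = []
    r = 0
    while r < len(m):              # outer loop over m.rows
        c = 0
        a = 0.0                    # accumulator (the DSL writes 0nm)
        while c < len(m[r]):       # inner loop over m.cols
            a += m[r][c] * i[c]    # increment a by m(r,c) * i(c)
            c += 1
        o.append(a)                # assumed meaning of "assign a to o"
        r += 1
    return o


matrix01 = [[1.0, 2.0],
            [3.0, 4.0]]            # hypothetical contents of "matrix01"
print(multiply(matrix01, [10.0, 1.0]))
```

The nested lists are the "jagged arrays" mentioned in the DSL comment; the unit suffix in `0nm` has no direct Python counterpart and is dropped here.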
15 April, 2010 - slide 23MOSE – University of Trieste
Domain Specific Cloud Components for General Availability in the Research
Q&A
Marco Parenzan web presence
Blog: http://blog.codeisvalue.com/
E-mail: [email protected]
Facebook: parenzan.marco
Twitter: marco_parenzan
Skype: marco.parenzan
Live: [email protected]
Slides: http://www.slideshare.com/marco.parenzan
15 April, 2010 - slide 24MOSE – University of Trieste
MOSE: Molecular Simulation Engineering
Department of Materials and Natural Resources (DMRN)
University of Trieste
Piazzale Europa 1, 34127 Trieste (Italy)
http://www.mose.units.it/
Maurizio Fermeglia