Environment from the Molecular Level:
An e-science project for modelling the atomistic processes involved in environmental issues
(funded by NERC)
Molecular Environmental Issues
Rocks and Mineral Structures
Radioactive waste disposal
Crystal growth and scale inhibition
Pollution: molecules and atoms on mineral surfaces
Crystal dissolution and weathering
The “Grand Challenge”.
[Figure: the problem space along three axes]
Level of theory: Quantum Monte Carlo; large empirical models; linear-scaling quantum mechanics
Adsorbing surface: clays, micas; aluminosilicates; natural organic matter; phosphates; carbonates; oxides/hydroxides; sulphides
Contaminant: organic molecules; halogens; metallic elements
Requires scientists to work together in teams - a Virtual Organisation
Design
Approach taken:
– Over approximately 3 years we have held many workshops, tutorials and prototyping sessions with developers and users, teaching users what e-Science can “do for them”, including security.
• Cooperation between CCLRC and NIEeS in Cambridge.
– Planned to integrate some tools which had already been developed or prototyped at CCLRC, UCL and Reading.
• A service-oriented approach is used for certain aspects: Grid, data management, user interfaces, metadata management. Workflow was found to be important to users, e.g. for combinatorial studies.
• Several iterations of the software have enabled some usability issues to be addressed.
– Originally envisaged an “Integrated Portal Architecture” linking HPCPortal, DataPortal and visualisation services.
• We thought we knew what users would like, but actually they preferred a simpler incremental approach;
• Workflow scripting was preferred to a single portal. There are now several separate tools in use.
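To illustrate why workflow scripting suits combinatorial studies, here is a minimal sketch in Python of how a script might enumerate every combination of input parameters and give each run a unique label. It is not taken from the e-Minerals tools: the function names, the parameter names and the values are all hypothetical.

```python
from itertools import product


def build_sweep(parameters):
    """Return one dict per combination of the given parameter values."""
    names = sorted(parameters)
    combos = product(*(parameters[name] for name in names))
    return [dict(zip(names, values)) for values in combos]


def job_name(prefix, settings):
    """Build a deterministic, filesystem-friendly label for one job."""
    parts = [f"{key}-{settings[key]}" for key in sorted(settings)]
    return "_".join([prefix] + parts)


# Hypothetical sweep: the parameter names and values are illustrative only.
sweep = build_sweep({"temperature": [300, 600], "pressure": [0, 5, 10]})
names = [job_name("run", settings) for settings in sweep]

for name in names:
    print(name)
```

Each entry in `sweep` could then be written into a job description and submitted. The submission step is left out here because it depends on the local Grid setup.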
Technical Strategy
• Technology considerations:
– Considered: Globus GT2, SRB, Harness, CCF, Portal, Web services, visualisation tools
• Various tool sets were tried and the users “voted with their feet”
– Used: Globus, Condor, SRB, AG, MAST, RCommands, Metadata Editor, Workflow scripts, Web services, XML/ RDF/ OWL for data interoperability.
• Infrastructure
– The e-Minerals “mini-Grid” was a great success, based on earlier work at Daresbury and Manchester on Grid evaluation. The mini-Grid focuses the resources of the e-Minerals VO and includes large campus Condor pools and parallel computers. It uses Globus, Condor and GSI. Data are managed using SRB.
• Collaboration tools
– Access Grid, MAST, Wiki
Integrated Portal Architecture
Generic portal design using Globus and Web Services:
[Diagram components: Visualisation, DataPortal and HPCPortal, each with Web Services; HPC Systems; Data Systems; Globus, GSI, GridFTP]
Working with GGF Grid Computing Environments Research Group
Development Issues
• Constraints and other issues:
– Project divided from outset into:
• development team;
• application team;
• science team.
– All teams work together and collaborate on papers
– Tools written in C to integrate with existing “heritage” applications, e.g. from the Collaborative Computational Projects (CCPs)
– Other interoperability issues addressed using Web services, e.g. gSOAP (client) +AXIS (server), XML-based data models and Semantic Grid technologies RDF+OWL
– Constraints: short term goals, no prior experience of e-Science, new technology must not disrupt current work.
– High requirements on computing resources for simulation studies
• This led to a focus on workflows for repeated calculations, data management for storing and retrieving results, and semantic Web technologies for data interoperability between codes
Evaluation
• Papers presented at All Hands 2005 included:
– E-Science Usability: the e-Minerals Experience (paper 425)
– The e-Minerals Project: Developing the Concept of the Virtual Organisation to support Collaborative Work on Molecular-scale Environmental Simulations (paper 518)
• User engagement and evaluation:
– Looked at the Usability Task Force metrics.
– Our approach did not readily map onto them, but there are overlaps
– Key: understand the science users, their needs, and their natural ways of working.
– Good and bad points summarised on next slides
Lessons Learnt
What was usable?
– Keep it simple – use effective lightweight tools for the job
– Condor and Globus – Condor job scripts were accepted readily. Condor-G and DAGMan now used. RSL also embedded in scripts.
– SRB – required little training and was found to be useful, SCommands in scripts.
– Resource Management – Globus-based resource-monitoring tool was developed (in the Portal). A meta-scheduler is being developed.
– Security – GSI proved “easy for users to work with”. The Portal uses MyProxy to ensure pervasive access. Certificates were not a problem – we offered training from Day 1.
– Collaboration tools – desktop use of AG enables ad hoc meetings + MAST (Multi-cast Application Sharing Tool). Wiki and Instant Messaging also used.
– Semantic technologies – CML was initially used with XSLT and SVG. This is now extended in the AgentX toolkit.
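The scripted workflows mentioned above rest on a simple idea: each step runs only after the steps it depends on. The sketch below shows that idea in Python using the standard-library `graphlib` module (Python 3.9 or later). It is not DAGMan syntax and does not call Condor; the step names are hypothetical, chosen to mirror a fetch, run, store, annotate sequence.

```python
from graphlib import TopologicalSorter

# Hypothetical workflow: each step maps to the set of steps it depends on.
workflow = {
    "fetch_input": set(),
    "run_simulation": {"fetch_input"},
    "store_output": {"run_simulation"},
    "record_metadata": {"store_output"},
}


def execution_order(dependencies):
    """Return the steps in an order that respects every dependency."""
    return list(TopologicalSorter(dependencies).static_order())


order = execution_order(workflow)
print(order)
```

A real workflow manager would also handle retries and run independent steps in parallel; this sketch only covers the ordering.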
Lessons Learnt
What was not usable?
– Client tools * – installation has caused difficulties, e.g. Globus. Initially used “submit machines”. Solutions investigated include:
• Portal – hides the complexity behind a Web interface, user doesn’t install anything;
• Web service interfaces – for Condor (Chapman et al.), GROWL for Globus and SRB (Allan et al.);
• BPEL interface – work at UCL/ OMII – plug-in for Eclipse.
– Firewall issues – for both users and infrastructure – changes to rules lead to instability. Portal and Web services solve this problem for users.
– Meta-data – tools are available, but automatic harvesting is required to avoid mistakes. RCommands were developed to improve this and can be linked into the workflow scripts.
* A recent workshop “Lightweight Grid Computing” was held 2-3/5/06 at Losehill Hall. Attendees from GROWL, RealityGrid, Imperial College, e-Minerals, e-CCP… Transcript of discussions on usability issues is available giving more detailed information.
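As an illustration of the automatic meta-data harvesting point above, the following Python sketch pulls name, value and unit triples out of a simulation output file instead of relying on the user to type them in. The output format, the sample text and the function name are assumptions made for this example; they do not describe any particular simulation code or the RCommands themselves.

```python
import re

# Matches lines such as "total energy = -152.37 eV" (a hypothetical format).
LINE = re.compile(r"^\s*([A-Za-z][A-Za-z ]*?)\s*=\s*(-?\d+(?:\.\d+)?)\s*(\S*)\s*$")


def harvest_metadata(text):
    """Collect (value, unit) pairs from lines of the form 'name = number unit'."""
    harvested = {}
    for line in text.splitlines():
        match = LINE.match(line)
        if match:
            name, value, unit = match.groups()
            key = name.strip().lower().replace(" ", "_")
            harvested[key] = (float(value), unit)
    return harvested


# Hypothetical output text, for illustration only.
sample_output = """\
Run started
total energy = -152.37 eV
temperature = 300 K
number of atoms = 64
Run finished
"""

metadata = harvest_metadata(sample_output)
for key, (value, unit) in metadata.items():
    print(key, value, unit)
```

Lines that do not match the pattern are skipped, so free-text log messages do not end up as meta-data.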
Future Plans
Current and future development plans:
– New tools are being developed; for instance, the meta-data editor and RCommands were recently added to the suite.
– AgentX data-interoperability tools have been added from e-CCP extending the use of CML. Such work is now timely and illustrates how existing large codes, e.g. Siesta and GULP from CCP5 can be integrated easily with visualisation tools.
– Development staff also work on other projects and with other developers. E-Minerals tools are now being evaluated in other areas, e.g. Integrative Biology and e-CCP. There are key synergies and critical mass, sharing of experiences and code/ services.
– Full integration via a portal interface was not initially wanted, and also could not be achieved at the start of the project as the technology was not adequate (we tried PHP, now have JSR-168). This is now being re-visited as it provides a good solution to many of the problems highlighted.
– Portlet-based tools from the NGS Portal can be re-used; this has already been done for Integrative Biology and other projects. They can be combined with Wiki etc.
Some following slides show more details of some of the tools.
Blatant advert: Portals and Portlets 2006 http://www.nesc.ac.uk/esi/events/686/
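[Diagram: ontology terms, their fragment identifiers, and the data they locate]
MOLECULE: “Mol_frag_id” (locator)
ATOM: “Atom_frag_id” (locator)
xCoordinate: “xCoor_frag_id” (locator)
Data (element and three coordinates per atom):
O 0.000 0.000 0.000
H 0.000 0.757 0.587
H 0.000 -0.757 0.587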
AgentX Framework - Overview
Specify how to locate data (XML, CML, XLink) with a particular meaning
Applications can use tools (AgentX library) that work with the specification to obtain information
Classes and properties of entities are specified in an ontology (OWL, RDF/ XML)
Mappings (RDF/ XML) associate classes and properties with fragment identifiers (XPointer)
Fragment identifiers can be used to locate logical collections (classes) and data items (properties)
Ontology Mappings Data
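The idea of the overview above is that applications ask for data by meaning and a separate mapping says where to find it. This can be sketched in a few lines of Python. The sketch does not use the AgentX library or its API: the mappings are plain dictionaries rather than RDF, the locators are ElementTree path expressions standing in for XPointer fragment identifiers, and the XML is a simplified CML-like document rather than schema-valid CML. The concept names and helper functions are hypothetical.

```python
import xml.etree.ElementTree as ET

# Simplified, CML-like document (an assumption, not schema-valid CML).
DOCUMENT = """
<molecule id="water">
  <atom element="O" x="0.000" y="0.000" z="0.000"/>
  <atom element="H" x="0.000" y="0.757" z="0.587"/>
  <atom element="H" x="0.000" y="-0.757" z="0.587"/>
</molecule>
"""

# Hypothetical mappings: class name -> path locating that logical collection.
CLASS_MAPPINGS = {"Molecule": ".", "Atom": "./atom"}

# Hypothetical mappings: property name -> attribute holding that data item.
PROPERTY_MAPPINGS = {
    "elementType": "element",
    "xCoordinate": "x",
    "yCoordinate": "y",
    "zCoordinate": "z",
}


def select(root, class_name):
    """Return every element the mapping associates with the given class."""
    return root.findall(CLASS_MAPPINGS[class_name])


def get_property(element, property_name):
    """Return the raw string value of a mapped property of one element."""
    return element.get(PROPERTY_MAPPINGS[property_name])


root = ET.fromstring(DOCUMENT)
atoms = select(root, "Atom")
for atom in atoms:
    print(get_property(atom, "elementType"), float(get_property(atom, "yCoordinate")))
```

Because the application only names `Atom` and `yCoordinate`, a different file format could be supported by changing the two mapping dictionaries and leaving the calling code alone.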
AgentX Framework - Example
[Diagram: DL_POLY3 (CCP5) integrated with CCP1 GUI]
DL_POLY3: CONTROL, CONFIG.xml, Mappings; AgentX core with Fortran wrapper
Shared: Standard Ontology, Standard Mappings
CCP1 GUI: REVCON.xml, Mappings; AgentX core with Python wrapper
AgentX
- Core library written in C
- Wrappers for Python, Perl and Fortran
- Hides the complexities of dealing with XML
- Simple API
- Enables straightforward exchange of information
RCommands
• RCommands are shell tools and associated Web services for meta-data manipulation
• The primary use case for RCommands is within the e-Minerals workflow, i.e. to allow automatic insertion of meta-data as a post-processing action
Function Domain: RCommand
Authentication / Session: Rinit, Rexit, Rpasswd
Entity Operations: Rls, Rcreate, Rrm
Parameter Operations: Rannotate, Rsearch
Permissions: Rchmod
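To make the entity and parameter operations in the table more concrete, here is a toy in-memory model in Python. It is only a sketch of the concepts (create an entity, attach name-value pairs, search on them). It is not the RCommands implementation, does not reproduce their command-line options or Web service interface, and leaves out authentication and permissions; the class, method and entity names are hypothetical.

```python
class MetadataStore:
    """Toy model: named entities, each carrying name-value pairs."""

    def __init__(self):
        self._entities = {}

    def create(self, name):
        """Register a new entity with no parameters yet."""
        if name in self._entities:
            raise ValueError(f"entity already exists: {name}")
        self._entities[name] = {}

    def remove(self, name):
        """Delete an entity and its parameters."""
        del self._entities[name]

    def list(self):
        """Return all entity names in sorted order."""
        return sorted(self._entities)

    def annotate(self, name, key, value):
        """Attach or overwrite one name-value pair on an entity."""
        self._entities[name][key] = value

    def search(self, key, value):
        """Return the names of entities whose parameter matches exactly."""
        return sorted(
            name
            for name, params in self._entities.items()
            if params.get(key) == value
        )


# Hypothetical usage as a post-processing step of a workflow.
store = MetadataStore()
store.create("run_001")
store.create("run_002")
store.annotate("run_001", "temperature", 300)
store.annotate("run_002", "temperature", 600)
print(store.search("temperature", 300))
```

In a workflow script, the `annotate` calls would typically be fed with values taken from the finished job rather than typed by hand.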
RCommands Service-based Architecture
[Diagram components]
Client Side: RCommands (gSOAP); BPEL Engine
SOAP
Server Side: Axis; RCommand Server Code; JDBC; Relational Database
Link into workflows
Subset of Schema
Name Value Pairs
• Title • Description • Notes • Start / End Dates • Originator
• Name • Description
• Name • URI