![Page 1: Chris Martin 1,2 , Mo Haji 2 , Peter Dew 2 , Peter Jimack 2, Mike Pilling 1](https://reader035.vdocument.in/reader035/viewer/2022062323/56815a11550346895dc75a20/html5/thumbnails/1.jpg)
Semantically Enhanced Model Experiment Evaluation Process (SeMEEP)
within the Atmospheric Chemistry Community
• Chris Martin 1,2, Mo Haji 2, Peter Dew 2, Peter Jimack 2, Mike Pilling 1
• 1 School of Chemistry, University of Leeds
• 2 School of Computing, University of Leeds
Outline of the Presentation
• Introduction
• Atmospheric community
• SeMEEP
• ELN Provenance capture
• Conclusion and next stage
Section 1 Overview
• Application domain – atmospheric community
– Reliance on computational models to evaluate data
• Motivation
– Study how to transition from today's ad-hoc processes and practices
– Sustainable process of
• Gathering, community evaluation and sharing data & models between scientists
• Minimising changes to the scientist's proven working practices
• Within world-wide co-laboratories
Related projects
• CombeChem
– Experimental organic chemistry
– From source to long-term data preservation (knowledge)
– Semantically-enabled ELN
– Data-driven workflow
• Collaboratory for Multi-Scale Chemical Science
– Multi-layer chemical model
• myGrid
– Bioinformatics and related areas (semantic pattern matching)
– Reusable semantic workflow using SMD (semantic metadata)
– Data quality
• Karma2
– Weather forecasting: computational modelling
– Data-driven workflow
[Figure: diagram labels "Add", "Sample", "chem1", "chem2"; multi-scale chemical model layers: Quantum Chemistry, Thermochemistry, Kinetics, Mechanism, Reacting Flow Simulation]
Section 2 Atmospheric Chemistry
• Seeks to understand the chemical processes (reactions) taking place in the lower atmosphere (e.g. smoke)
• It has significant implications for both:
– Air Quality
– Climate Change
The Master Chemical Mechanism (MCM)
• Data repository of elementary chemical reactions & rate constants
• The mechanism is described by a computational model that is evaluated against experimental data
– Chamber experiments
– Field experiments
[Figure: Methyl glyoxal, 27.11.06. MGLYOX / ppbv (0 to 140) against time / s (0 to 40000), comparing MCMv3.1 with measured values (calibrated using isoprene)]
Section 3 SeMEEP
• Today
– Within the atmospheric chemistry community, provenance is typically recorded in an ad-hoc, unstructured fashion, using a combination of traditional lab-books, word-processing documents and spreadsheets.
• Move to a more sustainable evaluation process that supports the gathering, evaluation and sharing of data and models
• Using semantic metadata
SeMEEP Vision
[Figure: SeMEEP vision. Labels: Scientist(s) with personal ELN; Laboratory Database(s); Shared Community Semantic Database; Public Database(s); Community Evaluation (people); Community data manager; Data manager (twice); SeMEEP]
• SeMEEP: semantically-enabled MEEP
– Supports the organisation of information but, critically, records its provenance (for example, to recover secondary data)
Mike Pilling: “SeMEEP approach will radically enhance the effectiveness of a research community to deliver new science”
Requirements for metadata capture for elementary reactions
[Figure: provenance chain for elementary reaction data. Labels: Physical Experiment; Raw Data (with metadata); Analysis Process; Process Data, e.g. k(T, p); Publication (with metadata); ELN; Historical Data; Theory (e.g. quantum mechanics); From other labs; IUPAC (kinetic; International Union of Pure and Applied Chemistry); Community evaluation (subjective; may be partial information)]
• Only published data
• Rate constants from several labs
• No access to the raw data
• No access to secondary data
• SeMEEP will provide this.
Current Evaluation Processes for the MCM
Envisioned Evaluation Processes
[Figure: envisioned evaluation process. Labels: Data sources; Inputs to the modelling process (benchmark data, model parameter sets, etc.); ELN Capture of the Model Development Provenance (Model Development, Model Execution, Analysis); Links to experimental data and provenance generation processes; Scientist’s Personal ELN Archive; Workgroup database; Laboratory Archive; Community Semantic Database; Community Evaluation (subjective); SeMEEP; Semantic-enabled ELN]
Section 4 Electronic Lab Notebooks (ELNs)
• ELNs address the limitations of the current methods of provenance capture.
• Southampton ELN for organic chemistry experiments.
• Benefits to the modeller
– Modelling process can be automatically captured
– Searchable
– Remote access is possible
– Provenance is structured
– Possible to use resolvable references to resources
Will users attach quality metadata?
• Motivate users:
– By demonstrating the value of provenance in their day-to-day work
• Writing publications
• Managing their data
• Reinterpreting the data
– Management
– Publishers
The Modelling Process - A Three Layer Mapping
[Figure: three-layer mapping of the modelling process]
• Experiment Layer: Experiment Plan, Experiment, Experiment Conclusions
• Modelling Iteration Layer: Iteration Plan, Modelling Iteration, Iteration Conclusion / Plan for Iteration n + 1
• Modelling Layer: Model Development, Model Execution, Analysis
– Inputs and outputs shown: Iteration Plan, Model Parameters, Model Output, model source code, model output data from previous iterations, external data sources, …
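The three layers can be read as a nested data structure. The following is a minimal sketch under that assumption, not the prototype's actual schema; the class and field names (`Experiment`, `ModellingIteration`, `ModellingStep`, `next_iteration`) are hypothetical and chosen only to mirror the labels on the slide.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ModellingStep:
    """Modelling layer: one development, execution or analysis step."""
    stage: str  # "development", "execution" or "analysis" (assumed vocabulary)
    inputs: List[str] = field(default_factory=list)
    outputs: List[str] = field(default_factory=list)


@dataclass
class ModellingIteration:
    """Modelling iteration layer: a plan, its steps and a conclusion."""
    plan: str
    steps: List[ModellingStep] = field(default_factory=list)
    conclusion: str = ""  # also serves as the plan for iteration n + 1


@dataclass
class Experiment:
    """Experiment layer: overall plan, iterations and conclusions."""
    plan: str
    iterations: List[ModellingIteration] = field(default_factory=list)
    conclusions: str = ""

    def next_iteration(self) -> ModellingIteration:
        # The conclusion of iteration n seeds the plan of iteration n + 1;
        # the first iteration falls back to the experiment plan.
        plan = self.iterations[-1].conclusion if self.iterations else self.plan
        iteration = ModellingIteration(plan=plan)
        self.iterations.append(iteration)
        return iteration
```

Linking each iteration's conclusion to the next plan is one way to keep the chain of reasoning recoverable when a scientist later reviews how a model evolved.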
MCM Mechanism being investigated
ELN Process
[Figure: ELN process. Labels: Planning the Scientific Process (Modelling Plan); Scientific Process (Mechanism Editing, Model Execution, Model Output Analysis); Automatic Metadata Capture (compare mechanism version n with mechanism version n-1 to generate metadata; capture metadata at run time); User Annotation; Ontology; Metadata Storage]
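The "compare to generate metadata" step, in which mechanism version n is compared with version n-1, can be sketched as a simple diff. This is an illustrative sketch only: it assumes a mechanism can be represented as a mapping from a reaction string to a rate-constant string, and the function name `diff_mechanisms` and the record fields are hypothetical rather than taken from the prototype.

```python
from typing import Dict, List, Optional

# Assumed representation: {reaction equation: rate constant expression}.
Mechanism = Dict[str, str]


def diff_mechanisms(previous: Mechanism, current: Mechanism) -> List[dict]:
    """Return one metadata record per reaction added, removed or edited."""
    records = []
    for reaction in sorted(set(previous) | set(current)):
        old: Optional[str] = previous.get(reaction)
        new: Optional[str] = current.get(reaction)
        if old == new:
            continue  # unchanged reactions generate no metadata
        if old is None:
            change = "added"
        elif new is None:
            change = "removed"
        else:
            change = "edited"
        records.append({
            "reaction": reaction,
            "change": change,
            "old_rate": old,
            "new_rate": new,
            "annotation": None,  # placeholder for the user's explanation
        })
    return records


# Placeholder reactions and values, not real chemistry data.
version_n_minus_1 = {"A + B -> C": "1.0e-12", "C -> D": "2.0e-3"}
version_n = {"A + B -> C": "1.5e-12", "D -> E": "4.0e-5"}
for record in diff_mechanisms(version_n_minus_1, version_n):
    print(record["change"], record["reaction"])
```

Each record leaves the `annotation` field empty so that a user prompt can ask why the change was made, which matches the split between automatic capture and user annotation on the slide.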
ELN Screenshots
• Prompts displayed when changing the chemical mechanism:
– Editing a reaction
– Adding a new reaction
ELN Screenshots
ELN Modelling SMD Architecture
[Figure: ELN modelling SMD architecture. Labels: User Interface (Workflow constructor, Annotation interface, Database Query & Retrieval); Semantic Metadata level: SMD Modelling sub-system, SMD creation (e.g. data-driven workflow), Context ontology (e.g. materials/process), SMD Middleware Services (e.g. ontology services, query, etc.), DL-based reasoner; 3-level scientific services (model dev; execution; analysis); Data Storage (SMD, Model Output & Analysis); Grid Fabrics; Simulation server]
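To make the "Database Query & Retrieval" path over semantic metadata concrete, the sketch below stores provenance as subject-predicate-object triples and answers wildcard queries. It is an in-memory stand-in written under stated assumptions: the class `MetadataStore`, the predicate names and the identifiers are all hypothetical, and no ontology or DL-based reasoning is attempted.

```python
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


class MetadataStore:
    """In-memory stand-in for a semantic metadata (SMD) store."""

    def __init__(self) -> None:
        self._triples: List[Triple] = []

    def add(self, subject: str, predicate: str, obj: str) -> None:
        self._triples.append((subject, predicate, obj))

    def query(self, subject: Optional[str] = None,
              predicate: Optional[str] = None,
              obj: Optional[str] = None) -> List[Triple]:
        """Return triples matching the pattern; None acts as a wildcard."""
        return [
            t for t in self._triples
            if (subject is None or t[0] == subject)
            and (predicate is None or t[1] == predicate)
            and (obj is None or t[2] == obj)
        ]


# Hypothetical identifiers and predicates, for illustration only.
store = MetadataStore()
store.add("run-1", "usedMechanism", "mechanism-v1")
store.add("run-2", "usedMechanism", "mechanism-v2")
store.add("run-2", "producedOutput", "output-2")

# Which runs used mechanism-v2?
print(store.query(predicate="usedMechanism", obj="mechanism-v2"))
```

A production system would more plausibly sit on an RDF store with an ontology-aware query layer; the triple pattern is kept here only because it shows how structured provenance becomes searchable.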
Evaluation Methodology
• In-depth interviews with members of the atmospheric chemistry model group here at Leeds, covering:
– Demonstration of the prototype
– User testing of the prototype
– Discussion of scenarios involving the use of the prototype (e.g. )
• Analysis
– Interviews recorded and transcribed
– Analysed using techniques from grounded theory
Evaluation
Barriers to adoption:
– Effort required at modelling time for provenance capture
• “[in] your lab book you can write down what ever you want [but with an ELN] it is going to take time to go through the different protocol steps”.
– When asked whether they would use an ELN requiring a similar amount of user input to the prototype, the response was positive:
• “Yeah, I think it would be a good thing. I don’t think it is too much extra … work.”
– Rather than viewing the prompts for user annotation as an interruption to their normal work, the user recognised the value of being prompted:
• “is a good way to do it because otherwise you won’t [record the provenance].”
Evaluation
• Users intuitively grasped the benefits of recording provenance with an ELN and that the benefits would be realised after the time of modelling by a number of stakeholders:
– “if someone else wants to look at … [your provenance], that’s great because the person can see exactly what you have done, where you have been and where to go next. And for yourself, if you are writing up a PhD ... [you can] … see exactly what you’ve done whereas currently you have to rifle through lab-books to see exactly what you have done.”
Section 5 Conclusions and future work
• Outlined SeMEEP and ELN
– User evaluated proposed modelling ELN
• Addressed case studies
– IUPAC
– MCM
• Developing a case study with the Geomagnetic community
• User and System issues
– Application of activity theory to capture requirements and user evaluation
– Querying and inference
– Address QoS issues (e.g. security, scalability, dynamic role-based access control)
Questions