matt jones software-interoperability
Matthew B. Jones
National Center for Ecological Analysis and Synthesis (NCEAS)
University of California Santa Barbara
Advancing Software for Ecological Forecasting
March 25, 2014
Software for Ecological Synthesis
Ocean Health Index (OHI)
Halpern et al. 2012
The “long-tail” of science
Heidorn, P. 2008. doi:10.1353/lib.0.0036
https://goa.nceas.ucsb.edu
https://knb.ecoinformatics.org/
Data Heterogeneity
[Figure: heterogeneity (high to low) plotted against data set size (low to high)]
• Tight coupling
• Simple subsetting
• Explicit semantics

• Loose coupling
• Hard subsetting
• Limited semantics
Diverse Analysis and Modeling
• Wide variety of analyses used in ecology and environmental sciences
– Statistical analyses and trends
– Rule-based models
– Dynamic models (e.g., continuous time)
– Individual-based models (agent-based)
– Many others
• Implemented in many frameworks
– R, Matlab, SAS, SPSS, JMP, C, Python, Fortran
Kepler
DMP-Tool
Software & Data Interoperability
Plan
Collect
Assure
Describe
Preserve
Discover
Integrate
Analyze
• Produce an open-source scientific workflow system
– Design, share, and execute scientific workflows
• Support scientists in a variety of disciplines
– e.g., biology, ecology, oceanography, astronomy
• Features
– Data access
– Cross analytical packages
– Documentation
– Provenance tracking
– Model archiving and sharing
Scientific workflows promote interoperability
Why workflows?
• Executability
• Replicability
• Reproducibility
• Transparency
• Modularity
• Reusability
• Provenance
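To make modularity and provenance concrete, here is a minimal Python sketch of a workflow runner. It is not Kepler's API or any tool named in these slides: `run_workflow`, `fingerprint`, and the two example steps are hypothetical names chosen for illustration. Each step is an ordinary function, and the runner records a hash of every step's input and output so a run can be checked and repeated later.

```python
import hashlib
import json
from datetime import datetime, timezone


def fingerprint(value):
    """Return a short, stable hash of a JSON-serializable value."""
    payload = json.dumps(value, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()[:12]


def run_workflow(steps, data):
    """Run each (name, function) step in order and log its provenance."""
    provenance = []
    for name, func in steps:
        result = func(data)
        provenance.append({
            "step": name,
            "input": fingerprint(data),
            "output": fingerprint(result),
            "finished": datetime.now(timezone.utc).isoformat(),
        })
        data = result  # the output of one step feeds the next
    return data, provenance


# Hypothetical steps: drop missing values, then average the rest.
def drop_missing(values):
    return [v for v in values if v is not None]


def mean(values):
    return sum(values) / len(values)


steps = [("drop_missing", drop_missing), ("mean", mean)]
result, provenance = run_workflow(steps, [2.0, None, 4.0, 6.0])
print(result)
for record in provenance:
    print(record["step"], record["input"], "->", record["output"])
```

Because steps are plain functions, they can be reused or swapped independently, and the matching output and input hashes between consecutive steps give a simple audit trail.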
How do we harness the long tail?
• Efficient data federation
• Interoperable software workflows
• Central search for discovery
• Just-in-time data integration
– Loose coupling
– Schema-less storage
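One way to picture just-in-time integration over schema-less storage is the Python sketch below. The records, field names, and the `to_common` helper are all hypothetical, not taken from any repository mentioned in the talk. Records are stored exactly as contributed, and a mapping to common fields is applied only when an analysis asks for it.

```python
# Hypothetical records from two independently designed data sets.
site_a = [{"site": "A1", "temp_c": 14.2, "observer": "kl"}]
site_b = [{"station": "B7", "temperature_f": 59.0}]

# Schema-less storage: keep every record as-is, with no shared schema.
store = site_a + site_b


def to_common(record):
    """Map one stored record to the fields this analysis needs."""
    site = record.get("site") or record.get("station")
    if "temp_c" in record:
        temp_c = record["temp_c"]
    elif "temperature_f" in record:
        # Convert Fahrenheit to Celsius at read time.
        temp_c = (record["temperature_f"] - 32) * 5 / 9
    else:
        temp_c = None
    return {"site": site, "temp_c": temp_c}


# Integration happens at analysis time, not at ingest.
integrated = [to_common(r) for r in store]
for row in integrated:
    print(row)
```

The coupling stays loose because a new data set only needs another branch in the mapping; nothing already stored has to be rewritten.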