Ecosystem Data and TERN: Genes to Geosciences Workshop, 19 May 2014

Post on 24-Dec-2014


DESCRIPTION

PowerPoint presentation used to support the 'Ecosystem data and TERN' workshop on 19 May 2014, held at Macquarie University in Sydney as part of the Genes to Geosciences seminar series.

TRANSCRIPT

Ecosystem Data and Australia’s TERN:

Making the most of TERN to benefit your research and data management!

A workshop for the “Genes to Geosciences” Series
Macquarie University, May 19, 2014: 1000 – 1500 hrs

Contents
1. Welcome and Introductions
2. TERN and the Research Cycle and Data Cycle
3. Australian Ecosystem Data
• what’s available
• data discovery
• evaluation of data – is it suitable for my needs?
• download and appropriate re-use
4. eMAST Example - New possibilities with ecosystem data
5. Data Management and Publishing
• why does it matter and how can it help you
• data management plans
• data publishing – what are your options and why does it matter
• data publishers – a continuum of approaches
• data publishing options with TERN
6. Wrap-up and Exit Survey

Who are we?
To understand your current practices and topics of interest we did a survey beforehand.

Have you previously searched for and accessed data from a public repository? Yes: 7, No: 5

Do you have a data management plan? Yes: 4, No: 8

Have you published data? Yes: 6, No: 6

Survey – your prior knowledge, experience, and requests for today

• To explain and demonstrate options available to the ecosystem science research community to use online resources for searching, evaluating, downloading, publishing and managing ecosystem data sets.

• Focus on activity and learning-by-doing, rather than too much talking

• To recognise the different needs of researchers in different positions and at different stages of their research careers.

1. Aims and outcomes

• What will you walk away with?

- Better understanding of the national research infrastructure available to you – TERN

- Sense of the kinds of ecosystem data that are available, and how you can get them

- Experience searching, assessing and downloading data for your research

- Understanding the principles of good data management and the benefits for you

- Appreciation of the options for data management

- Introduction to tools for managing your data, including TERN infrastructure


2. What is TERN?

• Infrastructure and networks to support coordinated, collaborative ecosystem science community

• Enabling sustained, long-term collection, storage, synthesis and sharing of ecosystem data

• Connecting science with policy and management

• TERN’s infrastructure for ecosystem science

[Figure labels: Instruments + Sensors; Collection Methods; Processing + Analysis; Data Storage; Data Curation + Publishing; Data Sharing; Data Searching; Modelling; Analysis + Synthesis; Policy + Management]

[Figure: Research lifecycle. Ecosystem Science cycle with stages: knowledge gap (research questions); proposal and planning; data collection, verification, quality assurance and control; data analysis, integration and synthesis; storage, preservation and discoverability of data (data + metadata, licensing); research output (new data and publications). Benefits shown: efficiency gain; increased effectiveness; enables large-scale and coordinated data collection, sharing and multiple re-uses; enhanced ability to revise, question and expand knowledge.]

[Figure: Research lifecycle diagram, shown again.]

This morning

3. Australian Ecosystem Data
• Learning Objectives: To identify the following resources for Australian ecosystem science applications:
- ecosystem data stores
- meta-data portals
- data publishers

• Sections:
• 1030 – 1040 Data discovery
• 1040 – 1055 Data discovery - exercise
• 1055 – 1125 Evaluation of data – is it suitable for my needs?
• 1125 – 1145 Download and appropriate re-use
• 1145 – 1215 eMAST Possibilities

Data Discovery
Learning objectives:
To understand how to approach data discovery through systematic use of ecosystem data stores, portals and data journals.

• National infrastructure for Australian ecosystem data


TERN’s data portals and meta-data structure:

[Figure labels: AusCover; OzFlux; AusPlots and Transects; Coasts; Soils; SuperSites Network and LTERN; eMAST; AeKOS (Eco-informatics); TERN Data Discovery Portal]

TERN Data: TERN facility, kind of data available, and where to access [+ submit] data (TDDP = TERN Data Discovery Portal)

AusCover
- Kind of data available: Remote sensing data and derived products covering: land cover; ecosystem variables; fire; surface radiation, meteorology; base satellite data and inputs to satellite processing; site-based datasets.
- Access: Via TDDP or AusCover portal: www.auscover.org.au/data/product-list
- Submit: matt.paget@csiro.au

AusPlots
- Kind of data available: Vegetation and soil surveys and samples; photopoints. Over 330 sites sampled so far. As at March 2014: data from ~130 rangelands sites available, with more coming soon.
- Access: Via AEKOS data portal www.aekos.org.au or Soils to Satellites soils2sat.ala.org.au/ (in future will also be searchable from TDDP)
- Specimens (vegetation voucher samples and soils): ian@ausplots.org.au
- Photopoints: Contact ben@ausplots.org.au

ACEAS (Australian Centre for Ecological Analysis and Synthesis)
- Kind of data available: Synthesised data products from ACEAS working groups.
- Access: Via TDDP or ACEAS portal: aceas-data.science.uq.edu.au/portal/
- Submit: s.guru@uq.edu.au

TERN Data: TERN facility, kind of data available, and where to access [+ submit] data (continued)

ACEF (Australian Coastal Ecosystems Facility)
- Kind of data available: Key datasets include coastal bathymetry, coastal habitats, water quality, beach morphology, turtle distribution and habitat.
- Access: Via TDDP or ACEF portal: acef.tern.org.au/portal/
- Submit: jonathan.hodge@csiro.au

Australian SuperSite Network (ASN)
- Kind of data available: Vegetation composition, structure and cover; fauna surveys; soil properties; gas and energy flux (see OzFlux below); meteorology; surface, ground and soil water.
- Access: Via TDDP or ASN portal: www.tern-supersites.net.au/knb/
- Submit: shiela.lloyd@jcu.edu.au

Australian Transect Network (ATN)
- Kind of data available: Vegetation and soil surveys, including specimens.
- Access: Via AEKOS data portal www.aekos.org.au or Soils to Satellites soils2sat.ala.org.au/ (in future will also be searchable from TDDP)
- Specimens (vegetation voucher samples and soils): stefan.caddy-retalic@adelaide.edu.au

Eco-Informatics
- Kind of data available: Ecological data from individual sites, and from broadscale surveys. Data from AusPlots and the Australian Transect Network, alongside key data from State and Federal partners. See AEKOS data publication schedule for more detail.
- Access: www.aekos.org.au (metadata submission to TDDP in progress)
- Submit: www.aekos.org.au/access_shared

TERN Data: TERN facility, kind of data available, and where to access [+ submit] data (continued)

eMAST (Ecosystem Modelling and Scaling Infrastructure)
- Kind of data available: Modelled climate and land surface data derived from surface observations.
- Access: Partially available via eMAST: www.tern.org.au/e-MAST-Data-Products-pg26355.html (metadata submission to TDDP in progress)
- Submit: bradley.evans@mq.edu.au

LTERN (Long-Term Ecological Research Network)
- Kind of data available: Vegetation composition, structure and cover; fauna surveys; surface, ground and soil water.
- Access: Via TDDP or LTERN portal: www.ltern.org.au/knb/
- Contact: emma.burns@anu.edu.au

OzFlux
- Kind of data available: CO2 and other gas concentration and fluxes; evapotranspiration; surface energy balance; carbon and water cycles.
- Access: Via TDDP or OzFlux portal: ozflux.its.monash.edu.au/ecosystem/home
- Submit: pisaac.ozflux@gmail.com

Soil and Landscape Grid of Australia
- Kind of data available: Functional soil attributes and key landscape features.
- Access: Under development. Best available data products via TDDP: http://portal.tern.org.au/search#!/q=soils/p=1/tab=collection/group=Soils/num=10
- Submit: mike.grundy@csiro.au

• Other data stores and sources?


Data Discovery - Exercise

Exercise:
• Using the TERN Data Discovery Portal: http://portal.tern.org.au
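For repeatable searches, a portal search link can be assembled in code rather than typed by hand. The sketch below is only an illustration under stated assumptions: it copies the pattern of the single TDDP search link that appears in these slides (`search#!/q=soils/p=1/tab=collection/group=Soils/num=10`). The function name and its parameters are hypothetical, and the portal may not accept other combinations of values.

```python
from urllib.parse import quote

# Base address taken from the slides; the fragment layout below is inferred
# from one example link only and is therefore an assumption.
PORTAL_SEARCH = "http://portal.tern.org.au/search"


def tddp_search_url(query, group=None, page=1, per_page=10):
    """Build a search link shaped like the TDDP link shown in the slides."""
    parts = [f"q={quote(query, safe='')}", f"p={page}", "tab=collection"]
    if group is not None:
        parts.append(f"group={quote(group, safe='')}")
    parts.append(f"num={per_page}")
    return PORTAL_SEARCH + "#!/" + "/".join(parts)


soils_url = tddp_search_url("soils", group="Soils")
print(soils_url)
```

Keeping the search terms in a script makes it easier to record exactly how a dataset was found, which is useful later when writing up methods.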

Data Download and Evaluation

Learning objective
To understand how to effectively search, download and critically assess ecosystem data sets for use in your own work from: ecosystem data stores, portals and data journals.

Evaluation of data – is it suitable for my needs? Exercise

Exercise:
• Evaluating your chosen dataset:
• What is the metadata?
• What do different parts of the metadata mean?
• Is this going to be useful for you?
• Criteria to use for evaluation?
- Data format(s)
- Data currency
- Data collection methods
- Data QA/QC
- Data licence
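The evaluation criteria in this exercise (data format, currency, collection methods, QA/QC and licence) can be treated as a checklist applied to each candidate dataset. A minimal sketch, assuming the metadata record has already been reduced to a plain dictionary; the field names and the example values are hypothetical and do not correspond to any particular TERN portal schema.

```python
# Criteria from the exercise, mapped to hypothetical metadata field names.
CRITERIA = {
    "Data format(s)": "format",
    "Data currency": "currency",
    "Data collection methods": "methods",
    "Data QA/QC": "qa_qc",
    "Data licence": "licence",
}


def missing_criteria(record):
    """Return the criteria that the metadata record says nothing about."""
    return [
        label
        for label, field in CRITERIA.items()
        if not str(record.get(field) or "").strip()
    ]


# Made-up record for illustration: the methods field is present but empty.
example = {"format": "NetCDF", "licence": "CC BY", "methods": ""}
gaps = missing_criteria(example)
print(gaps)
```

An empty result does not mean the dataset is suitable, only that the metadata gives you enough to judge each criterion.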

Download and Appropriate Re-use of Data

Learning Objective:

To understand what data “licensing” is from the research producer, user and owner’s points of view.

What do licences mean?

If you download data with a licence, what are your obligations for re-use?

TERN’s Data Licences: http://www.tern.org.au/datalicence

Licensing for Australian Data: www.ausgoal.gov.au

ecosystem Modelling And Scaling infrasTructure (eMAST)

Integrating multiple data sets

Presentation by Brad Evans based on contributions by Colin Prentice, Michael Hutchinson, Gab Abramowitz, Ben Evans, Rhys Whitley, Julie Pauwels

Land surface 101: Energy balance

Source: IPCC

Land surface 101: Carbon cycle

Source: NASA

eMAST Domain

Research domain: Impacts of rising CO2
Thus the ecosystem modeller seeks to:
1. Understand the effects of CO2 increases on ecosystems
2. Quantify negative feedbacks – the impact of rising CO2, land surface warming and extreme events on ecosystems

$$6\,\mathrm{CO_2} + 6\,\mathrm{H_2O} \xrightarrow[\text{chlorophyll + nutrients}]{\text{light energy}} \mathrm{C_6H_{12}O_6} + 6\,\mathrm{O_2}$$

IPCC Consensus: CO2 Fertilization

[Diagram labels: WUE; NPP; N & P]

$$\mathrm{WUE} = \frac{\mathrm{GPP}}{\mathrm{ET}}$$

$$\mathrm{NPP} = \mathrm{GPP} - R$$
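The two relations on this slide, NPP = GPP - R and WUE = GPP / ET, are simple enough to check by hand. A small sketch with made-up numbers; the units in the comments are an assumption for illustration, since the slide does not state any.

```python
def npp(gpp, respiration):
    """Net primary production: NPP = GPP - R."""
    return gpp - respiration


def wue(gpp, et):
    """Water-use efficiency: WUE = GPP / ET."""
    if et <= 0:
        raise ValueError("ET must be positive")
    return gpp / et


# Illustrative values only, not measurements.
# Assumed units: carbon fluxes in gC m-2 yr-1, ET in mm yr-1.
gpp_value, r_value, et_value = 1200.0, 600.0, 400.0
npp_value = npp(gpp_value, r_value)
wue_value = wue(gpp_value, et_value)
print(npp_value, wue_value)
```

With these numbers half of the carbon fixed is respired, and each millimetre of water lost corresponds to 3 gC of gross production.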

Land Surface Models-> Coupled to Climate Models

Other approaches

Observations, models and policy

(1) MORE Observations

(2) BETTER models are developed

(3) Models evaluated against observations

(4) EVEN BETTER Models

(5) BETTER Policy

A virtuous cycle

Unifying principles for ecosystem modellers

# 1: Observations, Models and Understanding: Integrating empirical science and modelling improves scientific understanding.

# 2: Transparency, Evaluation, Confidence: Reproducible models, evaluated with observations, enhance model efficacy.

# 3: Innovation, Standards, Simplicity: Innovate continuously, use standards, and avoid unnecessary complexity.

eMAST Observations and Models

[Figure labels. Observations: OzFlux (CO2 and water fluxes); Plot Networks (vegetation observations via AeKOS and others); AusCover (remote sensing: satellite, in-situ and obs.); Bureau of Meteorology and Geoscience Australia; Soils (properties of soil). Models: Land Surface Models. Infrastructure: dap.nci.org.au; geonetwork; TERN TDDP (tern.org.au); RDSI; VMs; raijin@nci; INTERSECT; NeCTAR; PALS (evaluation); NeCTAR Virtual Labs.]

eMAST Delivers in 2014-2015: 1 of 3

Simple land surface process models
• eMAST R-Package: MQ & ANU. Bioclimate indices and surface processes
• eMAST Earth System Model Connex (C++ & FORTRAN): MQ & ANU. Bioclimate indices and surface processes coupled to ACCESS and other Earth System Models
• ePiSaT R-Package: Continental Gross Primary Production (data model fusion)
• Community R-Packages: Hutchinson Drought & BoM Heatwave – in kind from Ivan Hanigan (ANU)
• pyeMAST: Python version of eMAST tools including big data services (connectivity with SPEDDEXES).

Statistical land surface models
• Data Assimilation: Ensemble Kalman Filter coupled to process based land surface model (Renzullo, CSIRO)
• Fubaar: Machine learning land surface model (in-kind MQ – Keenan)

Open Source!

Tools

eMAST Delivers in 2014-2015: 2 of 3

Observation assimilation into Models
• eMAST Ecosystem Model Parameters Database (EMP DB).
• NCAR’s Data Assimilation Research Testbed (DART)
- DART-CESM: In collaboration with NEON, Inc. (USA)
- DART-CABLE: In collaboration with the NCI, NCAR and CSIRO
• Assimilation of: fluxes, leaf properties, plot network observations

Modelled Data discovery and ACCESS Tools
• SPEDDEXES: A community based solution to (a) publishing big data (b) sharing big data (c) discovering big data and (d) programmatic access to big data on Australia’s eResearch infrastructure.
• SPEDDEXES@NeCTAR-VL’s: Collaborative extension of the SPEDDEXES tools to the NeCTAR Virtual Laboratories – embedding in the Climate and Weather Laboratory

Benchmarking and Evaluation
• eMAST@PALS: Development of the PALS system for eMAST and TERN data streams
• eMAST BENCH: International collaboration on benchmarking

Tools

eMAST Delivers in 2014-2015: 3 of 3

NEXT Generation of Ecosystem Models
• ARC DP on Australian Tropical Savannas: Past, Present and Future: Enhancing ecosystem models for Tropical Savannas
• ARC DP on the Next Generation of Ecosystem Models: Using plant trait observations to inform a new approach to ecosystem modelling.
• GePiSaT: Global version of the ePiSaT model (eMAST and Imperial College London)
• CAMELS: Coupling ACCESS with Models of Ecosystems and the Land Surface: Next generation approach to ecosystem and land surface modelling

Datasets from eMAST
• ANUClimate: An extension of past methods for gridding Climate and Weather for the Australian continent.
• eMAST Bioclimate
• eMAST Land Surface Modelling

Tools & Data

Climate and Bioclimate data: resolution 0.01 degrees (nominally 1 km); T, P, R and 50+ indices
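As a check on the "nominally 1 km" figure: a degree of latitude spans roughly 111 km, so 0.01 degrees is about 1.1 km north-south, while the east-west width of a cell shrinks with the cosine of latitude. The sketch below uses a spherical Earth with a mean radius of 6371 km, which is an assumption for illustration and not something stated in the slides; the example latitudes are arbitrary values within the Australian range.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean radius, spherical approximation (assumption)


def cell_size_km(res_deg, lat_deg):
    """Approximate (north-south, east-west) size of a grid cell in km."""
    km_per_degree = math.pi * EARTH_RADIUS_KM / 180.0
    north_south = res_deg * km_per_degree
    east_west = north_south * math.cos(math.radians(lat_deg))
    return north_south, east_west


for lat in (-10.0, -25.0, -40.0):
    ns, ew = cell_size_km(0.01, lat)
    print(f"lat {lat:6.1f}: {ns:.2f} km x {ew:.2f} km")
```

So "1 km" is a convenient label: cells are slightly taller than 1 km everywhere and become narrower towards the south of the continent.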

New approach for Big Data

It is no longer practical, let alone affordable, to continue to do data-intensive ecosystem science in the copy-and-work paradigm. A new approach to working with Big Data is required.

Think about network data access, not file downloads…

Cross-disciplinary use of file formats and services…

Open-source server technology and file formats…

Work with big data in a high performance facility
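One widely used way to get "network data access, not file downloads" is OPeNDAP, where the client asks a server for only the slice of an array it needs. The slides do not say which protocol is meant here, so treat that choice as an assumption. The sketch builds a DAP2 constraint-expression URL, in which each dimension is written as `[start:stride:stop]` with inclusive index bounds; the server address, file path and variable name are hypothetical placeholders.

```python
def dap2_subset_url(dataset_url, variable, slices, response="ascii"):
    """Build an OPeNDAP (DAP2) URL requesting a hyperslab of one variable.

    slices holds one (start, stop) or (start, stride, stop) tuple per
    dimension, using inclusive index bounds.
    """
    parts = []
    for bounds in slices:
        if len(bounds) == 2:
            start, stop = bounds
            stride = 1
        else:
            start, stride, stop = bounds
        if start < 0 or stop < start or stride < 1:
            raise ValueError(f"bad slice {bounds!r}")
        parts.append(f"[{start}:{stride}:{stop}]")
    return f"{dataset_url}.{response}?{variable}{''.join(parts)}"


# Hypothetical dataset: first time step, a 100 x 100 block of grid cells.
subset_url = dap2_subset_url(
    "http://example.org/thredds/dodsC/climate/tmax.nc",
    "tmax",
    [(0, 0), (1000, 1099), (2000, 2099)],
)
print(subset_url)
```

The point of the design is that the request describes 10,000 values, so only those values cross the network, however large the file on the server is. Some HTTP clients require the square brackets to be percent-encoded before the request is sent.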

Big Data: eMAST’s collections

[Figure: Scientific Data for Research (NCI RDSI node) by 2015. Bar chart of data volumes (TB, log scale from 10 to 10000) for five collections: Climate/Weather; Earth & Marine Observations; Geoscience Collections; Terrestrial Ecosystem; Water Mgmt, Hydrology. Bar value labels as extracted (run together): 54191928 326176 140.]

Three eMAST projects

1. Observations: The Ecosystem Model Parameters Database

2. Models: Ecosystem Production in Space and Time

3. Observations in Models: CABLE-DART Data assimilation on the NCI

Example One: Observations. The Ecosystem Model Parameters Database (EMP DB)

• Originally set up to generate continental scale surfaces of leaf properties (nitrogen, phosphorus etc.) using ANNs

• Adapted in April 2014 for use with data assimilation

• Focal point for ecosystem scientists and plot networks to contribute observations for use in models

Example Two: eMAST data assimilation

Collaborative ‘Community’ approach: Work with international experts (Fox, NEON, and Hoar, NCAR) and local champions Renzullo (CSIRO) and Evans. Open to community participation (Wang, Haverd and Trudinger, CSIRO).

Data assimilation: NEON Leaf Carbon

Fox et al. 2012


Example Three: Ecosystem Production in Space and Time (ePiSaT)

Data filtering: removal of outliers etc. Gap filling of PAR (PPFD) for GPP.

Rectangular hyperbola, 3 parameter:

$$F_C = R - \frac{A_{max}\,\Phi I}{A_{max} + \Phi I}$$

Parameters:
1. Respiration, $R$
2. Quantum efficiency, $\Phi$
3. Assimilation, $A_{max}$
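To make the three-parameter rectangular hyperbola concrete, the sketch below evaluates the usual form of that light response, F_C = R - (A_max Φ I) / (A_max + Φ I). The slide's own layout was lost in extraction, so this form is an assumption based on the terms that survive (R, Φ I, A_max), as is reading I as incident light (PAR/PPFD) and treating negative F_C as net uptake. The parameter values are illustrative, not fitted to any OzFlux site.

```python
def carbon_flux(light, respiration, quantum_eff, a_max):
    """Rectangular-hyperbola light response.

    F_C = R - (A_max * phi * I) / (A_max + phi * I)
    Assumed sign convention: negative values mean net uptake.
    """
    uptake = (a_max * quantum_eff * light) / (a_max + quantum_eff * light)
    return respiration - uptake


# Illustrative parameter values only.
R, PHI, AMAX = 2.0, 0.05, 20.0
for ppfd in (0, 200, 1000, 2000):
    print(ppfd, round(carbon_flux(ppfd, R, PHI, AMAX), 2))
```

Two limits show what each parameter controls: in darkness the flux equals R, and under very bright light it levels off towards R - A_max, with Φ setting how quickly the curve bends between the two.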

How does gross primary productivity (GPP) vary in space and time across Australia?

How can we ‘simply’ estimate GPP across Australia?

What data does TERN provide that might be useful for addressing this research question?

Ecosystem Production in Space and Time (ePiSaT)

1. Choose the ePiSaT model from emast.org.au, TDDP or SPEDDEXES
2. Obtain OzFlux data via the TERN/OzFlux portals
3. Run the ePiSaT model: generate estimates of ecosystem parameters and evaluate them
4. Obtain climate (eMAST) and satellite data (AusCover) to scale the ePiSaT parameters
5. Produce continental scale estimates of GPP and evaluate them

This project is supported by the Australian National Data Service (ANDS). ANDS is supported by the Australian Government through the National Collaborative Research Infrastructure Strategy Program and the Education Investment Fund (EIF) Super Science Initiative. For more information visit the ANDS website ands.org.au and Research Data Australia services.ands.org.au.

Closing thoughts on data sharing…

Lunch

[Figure: Research lifecycle diagram, shown again.]

This afternoon

5. Data Management & Publishing
• Learning Objectives:
To understand recognised best practice in “data management” for ecosystem science data sets.

To understand what is required for “data publishing” in appropriate storage sites, portals and journals for specific research purposes – and to understand the diversity of options.

• Sections:
• 1305 – 1315 Why does data management + publishing matter and how can it help you?
• 1315 – 1330 Data management plans - exercise
• 1330 – 1340 Data publishing – your options and why does it matter
• 1340 – 1350 Data publishers – a continuum of approaches
• 1350 – 1430 Data publishing options with TERN

Data Management
Learning Objectives:
To understand recognised best practice in “data management” for ecosystem science data sets.

- Why is good data management beneficial?
- What is good data management?

Poor Data Management

Unusable Lost Re-collected

Image credits: www.shutterstock.com (54240301); http://360digest.com/2006/02/25/messy-office-contest/; TERN AusPlots

Personal Drivers

Increase efficiency of research

Guarantee the quality and authenticity of data

Enable exposure of research outcomes via collaborations and dissemination (40%)

Provide reproducibility of experimental and computational outcomes

Facilitate the validation and verification of results

Source: UQL-050112 – Research Data Management Fact Sheet 2

Survey on research data management 2012:
• 63% aware of Australian Code of Conduct
• 70% understand their data management responsibilities
• 70% don’t do data management plans
• 70% don’t keep a registry of research data collections

From Miller, C (2012). “Responses to interviews: University of Adelaide research data repository and metadata store”

• 82% agree data should be available to other researchers
• 81% would re-use another’s data
• 29% supported public access to their data

Data Management Plans - Exercise
Exercise:
Design of a “data management plan” to meet Australian Research Council requirements.

ARC Proposal Guidelines – Under “Project Description”
“MANAGEMENT OF DATA: Outline plans for the management of data produced as a result of the proposed research, including but not limited to storage, access and re-use arrangements.”

Data Publishing
Learning Objectives:
To understand what is required for “data publishing” in appropriate storage sites, portals and journals for specific research purposes – and to understand the diversity of options available.

To understand the different levels of publishing possible under the “data publishing continuum.”

Why should I publish data?

• replication and verification of work;

• formal and measurable recognition of data as a research output;

• a reduction in the duplication of data collection;

• re-use of data in multi- and interdisciplinary research;

• greater transparency in the research process.

High quality, well-described ecological data for 1000s of species occurring at 25,000 sites, with another 67,000 coming soon

Successful data publishers get noticed

Correlation between archived or open access data to copies of published articles and citation impact (“Sharing detailed research data is associated with increased citation rate”: Piwowar et al., 2007)

Adopting good science practice

• Data are well-described and reproducible
• Apply NHMRC and ARC research ethics

• NHMRC Open Access policy came into effect from 1 July 2012

http://www.nhmrc.gov.au/grants/policy/dissemination-research-findings

• ARC Open Access policy came into effect from 1 January 2013.

http://www.arc.gov.au/applicants/open_access.htm

“A11.5.2. Researchers and institutions have an obligation to care for and maintain research data in accordance with the Australian Code for the Responsible Conduct of Research (2007). The ARC considers data management planning an important part of the responsible conduct of research and strongly encourages the depositing of data arising from a Project in an appropriate publically accessible subject and/or institutional repository. “

When not to publish data, or when to place restrictions on it

• Patent application

• Confidential human/individual details

• Confidential data due to commercial sponsorship arrangements

• Sensitive species declared by governments

• Sensitive location declared by governments

http://www.tern.org.au/Data-publishing-pg26249.html

Data Publishers – A Continuum

Data Publishing - Exercise
Exercise:
Identification and review of potential data publishers.

We will divide you into small groups to assess the approach to data publishing of a given data publisher in terms of:
- submission and review process;
- attributes required for re-use;
- capacity for re-use;
- costs; and
- ability to measure output and re-use.

Data Publishing with TERN

Learning Objectives: Identification of current and planned data publishing options in TERN.

To understand how you can publish your data with TERN

TERN’s data portals and meta-data structure:

[Figure labels: AusCover; OzFlux; AusPlots and Transects; Coasts; Soils; SuperSites Network and LTERN; eMAST; AeKOS (Eco-informatics); TERN Data Discovery Portal]

Data Publication in TERN - SHaRED

- Metadata complying with the ISO 19115 and 19139 international standards; specifically the ANZLIC Profile of those standards

- Easy to use
- Base template which can accommodate in-depth details if needed
- *.xml format

Tool developed by ANZLIC, the Spatial Information Council
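To give a feel for what "*.xml format" metadata in the ISO 19139 encoding looks like, the sketch below writes a very small fragment with Python's standard library. It is not a complete or schema-valid record and does not implement the ANZLIC Profile: only the `gmd` and `gco` namespaces and a handful of element names from ISO 19139 are used, and the identifier, title and abstract are placeholder values.

```python
import xml.etree.ElementTree as ET

# Namespaces used by the ISO 19139 XML encoding of ISO 19115 metadata.
GMD = "http://www.isotc211.org/2005/gmd"
GCO = "http://www.isotc211.org/2005/gco"
ET.register_namespace("gmd", GMD)
ET.register_namespace("gco", GCO)


def text_element(parent, tag, value):
    """Add <gmd:tag><gco:CharacterString>value</...></...> under parent."""
    node = ET.SubElement(parent, f"{{{GMD}}}{tag}")
    ET.SubElement(node, f"{{{GCO}}}CharacterString").text = value
    return node


def minimal_record(identifier, title, abstract):
    """Build a small, NOT schema-complete, ISO 19139-style fragment."""
    root = ET.Element(f"{{{GMD}}}MD_Metadata")
    text_element(root, "fileIdentifier", identifier)
    info = ET.SubElement(root, f"{{{GMD}}}identificationInfo")
    data_id = ET.SubElement(info, f"{{{GMD}}}MD_DataIdentification")
    citation = ET.SubElement(data_id, f"{{{GMD}}}citation")
    ci_citation = ET.SubElement(citation, f"{{{GMD}}}CI_Citation")
    text_element(ci_citation, "title", title)
    text_element(data_id, "abstract", abstract)
    return root


# Placeholder values for illustration only.
record = minimal_record("example-0001", "Example vegetation survey", "Placeholder abstract.")
xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```

A real record carries many more mandatory elements (contacts, dates, extents, constraints), which is why template-driven tools are normally used instead of hand-written XML.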

Data Publication in TERN - ACEF using ANZMet Lite

http://spatial.gov.au/sites/default/files/legacy/osdm.gov.au/Metadata/ANZLIC%2Bmetadata%2Bresources/default.html


Data Publication in TERN - Morpho

https://knb.ecoinformatics.org/#tools

Questions?

6. Wrap up
Outcomes?

- Better understanding of the national research infrastructure available to you – including TERN

- Knowledge of the kinds of ecosystem data that are available, and how you can get them

- Experience searching, assessing and downloading data for your research

- Understanding the principles of good data management and the benefits for you

- Appreciation of the options for data management

- Introduction to tools for managing your data, including TERN infrastructure


• Email exit survey tomorrow

• Presentations and materials online and links sent to you

• Please contact us with any questions or follow up items

International Partners

TERN is supported by the Australian Government through the National Collaborative Research Infrastructure Strategy and the Super Science Initiative

More Questions?

Prof Stuart Phinn: s.phinn@uq.edu.au

Dr Bek Christensen: r.christensen@uq.edu.au

www.tern.org.au
