Data management in deWELopment: experiences from other EU projects and recommendations

Polsko-Norweski Fundusz Badań Naukowych / Polish-Norwegian Research Fund. Data management in deWELopment: experiences from other EU projects and recommendations. Jannicke Moe (NIVA). deWELopment project meeting, 01.02.2010, Warsaw.

TRANSCRIPT

Page 1: Data management in deWELopment:  experiences from other EU projects and recommendations

Polsko-Norweski Fundusz Badań Naukowych / Polish-Norwegian Research Fund

Data management in deWELopment: experiences from other EU projects and recommendations

Jannicke Moe (NIVA)

deWELopment project meeting

01.02.2010, Warsaw

Page 2: Data management in deWELopment:  experiences from other EU projects and recommendations

Outline

1. Experiences from data management in EU project WISER
– Is data management in WISER relevant for deWELopment?
– Data management within WPs vs. central project database

2. Recommendations for deWELopment
– Central project database or not?
– If yes: raw values or calculated metrics?

Page 3: Data management in deWELopment:  experiences from other EU projects and recommendations

1. Experiences from data management in EU project WISER

Page 4: Data management in deWELopment:  experiences from other EU projects and recommendations

Data management in WISER builds upon data management in EU FP6 project REBECCA

WISER Data Service team

Page 5: Data management in deWELopment:  experiences from other EU projects and recommendations

Is data management in WISER relevant for deWELopment?

Commonalities
• Similar aims and data types
– BQEs in rivers and lakes
• Must combine many BQEs
– different sampling methods etc.
• Biological data to be combined with common abiotic data
– for metric analysis
• Biological data (or metrics) to be combined for different BQEs
– for complete waterbody status assessment

Differences
• WISER is more complex:
– Freshwater data + coastal data
– New field data + old data
– Some WPs depending on data collected by other WPs
– 25 partners in ~20 countries
– Data and results must be delivered to external groups (GIGs - intercalibration process)
• Data management in deWELopment can be simpler

Page 6: Data management in deWELopment:  experiences from other EU projects and recommendations

Combining BQEs (Biological Quality Elements)

[Figure: labels recoverable from the slide are Macroinvertebrates, Phytobenthos, Hydrology, Acidification and Organic.]

Page 7: Data management in deWELopment:  experiences from other EU projects and recommendations

From raw data - via database - to data analysis

These steps must be done regardless of how you choose to store the data, but it is more practical to do these steps in Access than in Excel.

Page 8: Data management in deWELopment:  experiences from other EU projects and recommendations

Recommended way to store data: MS Access database with related tables
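
A minimal sketch of what "related tables" could look like, using Python's built-in sqlite3 module as a stand-in for MS Access. The table names, column names and the station and waterbody codes are illustrative assumptions, not the WISER structure; the two measured values are borrowed from the list-format example later in this presentation.

```python
import sqlite3

# In-memory database; SQLite stands in for MS Access in this sketch.
con = sqlite3.connect(":memory:")
con.execute("PRAGMA foreign_keys = ON")

# Hypothetical schema: one table per entity, linked by codes (not names).
con.executescript("""
CREATE TABLE station (
    station_code   TEXT PRIMARY KEY,
    waterbody_code TEXT NOT NULL
);
CREATE TABLE sample (
    sample_id    INTEGER PRIMARY KEY,
    station_code TEXT NOT NULL REFERENCES station(station_code),
    sday INTEGER, smonth INTEGER, syear INTEGER
);
CREATE TABLE measurement (
    sample_id   INTEGER NOT NULL REFERENCES sample(sample_id),
    determinand TEXT NOT NULL,
    conc        REAL NOT NULL,
    unit        TEXT
);
""")

con.execute("INSERT INTO station VALUES ('ST001', 'WB001')")  # made-up codes
con.execute("INSERT INTO sample VALUES (1, 'ST001', 18, 4, 2002)")
con.executemany(
    "INSERT INTO measurement VALUES (?, ?, ?, ?)",
    [(1, "pH", 5.14, None), (1, "Conductivity", 73, "mS/cm")],
)

# A join across the related tables yields one analysis-ready row per value.
rows = con.execute("""
    SELECT st.waterbody_code, s.syear, m.determinand, m.conc
    FROM measurement m
    JOIN sample s   ON s.sample_id = m.sample_id
    JOIN station st ON st.station_code = s.station_code
    ORDER BY m.determinand
""").fetchall()
```

Station information is stored once and referenced by code from every sample, which is the main practical difference from keeping everything in one spreadsheet.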

Page 9: Data management in deWELopment:  experiences from other EU projects and recommendations

Dilemma for data management: in each WP or in central project database?

Solution for WISER:

Step 1: Data compilation per BQE - within each WP
• Data Service provides template for recommended DB structure
• WPs do data cleaning, standardisation of taxonomy and units, etc.
• Data analyses will reveal errors => further data corrections

Step 2: Combination of data across BQEs - by Data Service team
• Necessary for making assessment with >1 BQE
• BQE-specific databases should now have common features which make it possible to combine them into a Central project DB
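
The two steps can be sketched roughly as follows, assuming each BQE-specific database can be exported as records that already share station codes and column names. All names and values here (`fish`, `phytoplankton`, `combine`, the codes) are hypothetical and only illustrate the idea.

```python
from collections import defaultdict

# Hypothetical extracts from two BQE-specific databases that already share
# the common features from Step 1 (same station codes, same column names).
fish = [
    {"station_code": "ST001", "syear": 2009, "taxon_code": "T01", "value": 12},
    {"station_code": "ST002", "syear": 2009, "taxon_code": "T02", "value": 3},
]
phytoplankton = [
    {"station_code": "ST001", "syear": 2009, "taxon_code": "T10", "value": 450},
]

def combine(datasets):
    """Stack BQE-specific records into one list, tagging each row with its BQE."""
    central = []
    for bqe, records in datasets.items():
        for rec in records:
            central.append({"bqe": bqe, **rec})
    return central

def stations_with_multiple_bqes(central):
    """Return station codes that have data for more than one BQE."""
    bqes_per_station = defaultdict(set)
    for rec in central:
        bqes_per_station[rec["station_code"]].add(rec["bqe"])
    return sorted(s for s, bqes in bqes_per_station.items() if len(bqes) > 1)

central = combine({"fish": fish, "phytoplankton": phytoplankton})
multi = stations_with_multiple_bqes(central)
```

Here only station ST001 has data for more than one BQE, so it is the only station where a multi-BQE assessment would be possible. The combination works only because the station codes were standardised in Step 1.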

Page 10: Data management in deWELopment:  experiences from other EU projects and recommendations

Data flow - general idea and example

[Diagram: labels recoverable from the slide are Data sources (datasets 51-L-C, 53-LR-C, 78-L-NC), New data, Central database, Meta-database (metadata for each dataset), and the work packages WP3.1-WP3.4, WP4.1-WP4.4, WP5.1-WP5.3 and WP6.1-WP6.4.]

Example: WP3.4 needs data for fish in lakes ...

... WP5.2 needs the same data + data from other lake BQEs

Page 11: Data management in deWELopment:  experiences from other EU projects and recommendations

Lessons from WISER data management

• Different WPs have very different data types, experience with data management, needs for assistance etc.
• Difficult to develop one common database structure which fits all needs of all WPs
• Current status in WISER:
– Data Service team has developed ”WISER common database structure”
– For WPs experienced with data management (e.g. lake fish):
  • Step 1: WP uses their own existing DBs also for storing new WISER data, and takes care of all data management themselves
  • Step 2: WISER Data Service transfers the WP’s data (subset) to Central DB
– For WPs not experienced with data management (e.g. lake phytoplankton):
  • Step 1: WP stores new WISER data in a new DB with ”common structure”, and is offered assistance/tools from WISER Data Service
  • Step 2: WP’s database can be imported directly into the Central DB

Page 12: Data management in deWELopment:  experiences from other EU projects and recommendations

2. Recommendations for deWELopment

Page 13: Data management in deWELopment:  experiences from other EU projects and recommendations

Should biological values be stored in project DB as calculated metrics?

BENEFITS
• DB can have simpler structure
• Easier to compile
• Easier to extract tables for data analysis
• NIVA experience from developing EEA DBs for biological metrics

COSTS
• Difficult / impossible to quality-check biological metrics after import to DB
• Original project data are not easily available

Possible compromise for central project DB: BQE groups provide raw data + instructions for calculation of metrics. Data manager imports raw data, calculates metrics and stores them as metrics.

The best solution depends on the various needs of this project...
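
A rough sketch of the compromise, in which the data manager receives raw data plus a calculation instruction and stores the result as a metric. Taxon richness is used here only as a stand-in instruction; the metric, field names and values are assumptions for illustration, not metrics specified by the project.

```python
from collections import defaultdict

# Hypothetical raw biological data in list format: one row per taxon per sample.
raw = [
    {"sample_id": 1, "taxon_code": "T01", "abundance": 12},
    {"sample_id": 1, "taxon_code": "T02", "abundance": 0},
    {"sample_id": 1, "taxon_code": "T03", "abundance": 5},
    {"sample_id": 2, "taxon_code": "T01", "abundance": 7},
]

def taxon_richness(rows):
    """Count taxa with abundance > 0 per sample (a stand-in 'instruction')."""
    taxa = defaultdict(set)
    for row in rows:
        if row["abundance"] > 0:
            taxa[row["sample_id"]].add(row["taxon_code"])
    return {sample_id: len(codes) for sample_id, codes in taxa.items()}

# Metrics are stored in the same list format as other values.
metrics = [
    {"sample_id": sid, "metric": "taxon_richness", "value": n}
    for sid, n in sorted(taxon_richness(raw).items())
]
```

Because the raw rows are imported before the metric is calculated, the metric can be quality-checked or recalculated later, which addresses the first cost listed above.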

Page 14: Data management in deWELopment:  experiences from other EU projects and recommendations

Questions regarding data management to be considered during this meeting

• Needs
– Combination of biological data from different BQEs: Do you need to use raw biological data, or only biological metrics?
– Evaluation of uncertainty: Do you need to use raw biological data, or only biological metrics?
– ...

• Responsibilities
– Who is overall responsible for deWELopment data management?
– For each BQE group, who is responsible / contact person for data management?
– What should be the role of NIVA in data management? (Give recommendations? Practical assistance? Some responsibility?)

• Ownership of data
– Who owns the raw data? (The BQE group leader? The data collectors? The project consortium?)
– Who makes final decisions regarding storage and use of raw data? (Project leader? Already regulated by project contract?)

Page 15: Data management in deWELopment:  experiences from other EU projects and recommendations

Who is responsible for which parts of data management within a BQE group? - Suggestions

Roles: Data provider, ”Data manager”, Data user

Data cleaning, quality checking
– Data provider: Responsible
– ”Data manager”: Provide guidelines
– Data user: Contribute during analysis

Standardisation of taxonomy, parameters, units
– Data provider: Responsible
– ”Data manager”: Provide standards and guidelines

Standardisation of station and waterbody codes
– Data provider: Responsible
– ”Data manager”: Provide code system and guidelines

Reorganisation of data for import to database
– Responsible (?) [role column not recoverable from the transcript]

Compilation, storing and maintenance
– Data provider: Correct and resubmit data when required
– ”Data manager”: Responsible

Extraction of tables for data analysis
– ”Data manager”: Provide examples and guidelines
– Data user: Responsible

Page 16: Data management in deWELopment:  experiences from other EU projects and recommendations

THANK YOU FOR YOUR ATTENTION!

Page 17: Data management in deWELopment:  experiences from other EU projects and recommendations

”List format”
• Required for databases
• Easy to read for machines
• Difficult to do summaries and plotting directly in spreadsheet
• Suitable for all types of data
• Easy to transform to table format (e.g. in Excel or Access)
• OK for data analyses in R

”Table format” (cross table)
• Common for storing own data
• Easy to read for humans
• Can make summaries and figures directly in spreadsheet
– ... which is not recommended!
• Suitable for data with few "gaps"
– Phys/chemistry: OK
– Species: many empty cells
• Difficult to transform to list format
– requires macro
• OK for data analysis in R
– but can only import 1 header row; must replace empty cells, ...

Example, table format:

day  month  year  pH    Conductivity  Alkalinity  Colour  Secchi  etc.
                        mS/cm         meq/L       mg/L    m
18   4      2002  5.14  73            -0.42       61      0.9
2    5      2001  4.77  90            0.898       186     0.5

Example, list format:

SDAY SMONTH SYEAR CONC QUAL DETERMINAND UNIT
18 4 2002 5.14 pH
18 4 2002 73 Conductivity mS/cm
18 4 2002 -0.424 Alkalinity meq/L
18 4 2002 61 Colour mg/L
18 4 2002 0.9 Secchi m
18 4 2002 6.2 Chl a µg/L
18 4 2002 18.11 TP µg/L
18 4 2002 10 <$ PO4-P mg l-1 (spring)
18 4 2002 0.3 TN mg l-1 (Spring)
18 4 2002 0.01 <$ NH3-N mg l-1 (Spring)
18 4 2002 0.01 <$ TON-N mg l-1 (Spring)
18 4 2002 0.2 <$ SiO2 mg l-1 (Spring)
18 4 2002 4.1 Station Depth (m) (Spring)
18 4 2002 10 Surface temperature oC (Spring)
18 4 2002 10.62 Surface O2 mg l-1 (Spring)

Transforming list format to table format: Easy

Transforming table format to list format: Difficult
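
The "difficult" direction (table format to list format) does not have to rely on a spreadsheet macro. The following is a minimal sketch in plain Python, using the phys/chemistry values as read from the slide's example; the function name `table_to_list` and the in-memory representation are assumptions made for illustration.

```python
# Table ("cross table") format: one row per sampling date, one column per
# determinand, with a second header row for units, as in the slide's example.
header = ["day", "month", "year", "pH", "Conductivity", "Alkalinity", "Colour", "Secchi"]
units = [None, None, None, None, "mS/cm", "meq/L", "mg/L", "m"]
table = [
    [18, 4, 2002, 5.14, 73, -0.42, 61, 0.9],
    [2, 5, 2001, 4.77, 90, 0.898, 186, 0.5],
]

KEY_COLUMNS = 3  # day, month, year identify the sample

def table_to_list(header, units, table, n_keys=KEY_COLUMNS):
    """Unpivot a cross table into list format: one row per measured value."""
    rows = []
    for record in table:
        keys = record[:n_keys]
        for name, unit, value in zip(header[n_keys:], units[n_keys:], record[n_keys:]):
            if value is None:  # skip empty cells instead of storing gaps
                continue
            rows.append((*keys, value, name, unit))
    return rows

list_format = table_to_list(header, units, table)
```

Each output row has the form (day, month, year, value, determinand, unit), which corresponds to the SDAY, SMONTH, SYEAR, CONC, DETERMINAND and UNIT columns of the list-format example. Empty cells are skipped, so sparse species tables do not produce rows of missing values.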

Page 18: Data management in deWELopment:  experiences from other EU projects and recommendations

Role of data holders (WPs) vs. Data Service Team ("Module 2")

Page 19: Data management in deWELopment:  experiences from other EU projects and recommendations

NEED TO HAVE for data manager

• Waterbody codes (not names) and station codes (not names)

• Taxonomic codes (not names)

• ”Cleaned” data

NICE TO HAVE for data manager

• Database software (not Excel)– Recommended: MS Access

• Standardised phys/chemical determinands and units

• Geographic coordinates

• Data reorganised into "list format"

This applies regardless of how the data are managed (centrally or within each WP)
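
As an illustration of why codes matter more than names, a data manager might check incoming records against agreed code lists before import. This is only a sketch: the code lists, field names and the `check_record` helper are hypothetical.

```python
# Hypothetical code lists maintained by the data manager.
KNOWN_STATIONS = {"ST001", "ST002"}
KNOWN_TAXA = {"T01", "T02", "T10"}

def check_record(rec):
    """Return a list of problems found in one incoming record."""
    problems = []
    if rec.get("station_code") not in KNOWN_STATIONS:
        problems.append("unknown station code")
    if rec.get("taxon_code") not in KNOWN_TAXA:
        problems.append("unknown taxon code")
    value = rec.get("value")
    if not isinstance(value, (int, float)) or value < 0:
        problems.append("value missing or negative")
    return problems

incoming = [
    {"station_code": "ST001", "taxon_code": "T01", "value": 12},
    {"station_code": "Lake Example", "taxon_code": "T02", "value": 3},  # name, not code
    {"station_code": "ST002", "taxon_code": "T99", "value": None},
]
report = {i: check_record(r) for i, r in enumerate(incoming)}
```

Records with an empty problem list can be imported directly; the others would go back to the data provider for correction and resubmission.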