Data management in deWELopment: experiences from other EU projects and recommendations
Post on 31-Jan-2016
Polsko-Norweski Fundusz Badań Naukowych / Polish-Norwegian Research Fund
Data management in deWELopment: experiences from other EU projects
and recommendations
Jannicke Moe (NIVA)
deWELopment project meeting
01.02.2010, Warsaw
Outline
1. Experiences from data management in EU project WISER
– Is data management in WISER relevant for deWELopment?
– Data management within WPs vs. central project database
2. Recommendations for deWELopment
– Central project database or not?
– If yes: raw values or calculated metrics?
1. Experiences from data management in EU project WISER
Data management in WISER builds upon data management in EU FP6 project REBECCA
WISER Data Service team
Is data management in WISER relevant for deWELopment?
Commonalities
• Similar aims and data types
– BQEs in rivers and lakes
• Must combine many BQEs
– different sampling methods etc.
• Biological data to be combined with common abiotic data
– for metric analysis
• Biological data (or metrics) to be combined for different BQEs
– for complete waterbody status assessment
Differences
• WISER is more complex:
– Freshwater data + coastal data
– New field data + old data
– Some WPs depending on data collected by other WPs
– 25 partners in ~20 countries
– Data and results must be delivered to external groups (GIGs - intercalibration process)
• Data management in deWELopment can be simpler
Combining BQEs (Biological Quality Elements)
[Figure labels: Macroinvertebrates, Phytobenthos; Hydrology, Acidification, Organic]
From raw data - via database - to data analysis
These steps must be done regardless of how you choose to store the data, but it is more practical to do these steps in Access than in Excel.
Recommended way to store data: MS Access database with related tables
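What "related tables" means can be sketched in a few lines. The example below is only an illustration: it uses Python's built-in sqlite3 module as a stand-in for MS Access, and the table and column names (station, sample, measurement) are hypothetical, not the WISER structure.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MS Access (assumption made
# only so that the sketch runs without Access installed).
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enforce relations between tables

# Hypothetical related tables: station -> sample -> measurement.
conn.executescript("""
CREATE TABLE station (
    station_code   TEXT PRIMARY KEY,
    waterbody_code TEXT NOT NULL
);
CREATE TABLE sample (
    sample_id    INTEGER PRIMARY KEY,
    station_code TEXT NOT NULL REFERENCES station(station_code),
    sample_date  TEXT NOT NULL
);
CREATE TABLE measurement (
    sample_id   INTEGER NOT NULL REFERENCES sample(sample_id),
    determinand TEXT NOT NULL,
    conc        REAL NOT NULL,
    unit        TEXT
);
""")

conn.execute("INSERT INTO station VALUES ('ST01', 'WB01')")
conn.execute("INSERT INTO sample VALUES (1, 'ST01', '2002-04-18')")
conn.executemany(
    "INSERT INTO measurement VALUES (?, ?, ?, ?)",
    [(1, "pH", 5.14, None), (1, "Conductivity", 73, "mS/cm")],
)

# Extract a table for data analysis by joining the related tables.
rows = conn.execute("""
    SELECT st.station_code, sa.sample_date, m.determinand, m.conc
    FROM station st
    JOIN sample sa ON sa.station_code = st.station_code
    JOIN measurement m ON m.sample_id = sa.sample_id
    ORDER BY m.determinand
""").fetchall()

# A sample that refers to an unknown station code is rejected by the database.
try:
    conn.execute("INSERT INTO sample VALUES (2, 'UNKNOWN', '2002-04-18')")
    fk_rejected = False
except sqlite3.IntegrityError:
    fk_rejected = True
```

The point of the relations is the last step: errors such as a misspelled station code are caught at import, which a spreadsheet would not do.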
Dilemma for data management: in each WP or in a central project database?
Solution for WISER:
Step 1: Data compilation per BQE - within each WP
• Data Service provides template for recommended DB structure
• WPs do data cleaning, standardisation of taxonomy and units, etc.
• Data analyses will reveal errors => further data corrections
Step 2: Combination of data across BQEs - by Data Service team
• Necessary for making assessment with >1 BQE
• BQE-specific databases should now have common features which make it possible to combine them into a central project DB
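Step 2 relies on the BQE-specific databases sharing common features, such as station codes. A minimal sketch of that combination, assuming each BQE database can export one record per station; the station codes, metric names and values are made up for illustration:

```python
# Hypothetical exports from two BQE-specific databases, keyed by a shared
# station code (the "common feature" that makes combination possible).
fish = {"ST01": {"fish_metric": 0.62}, "ST02": {"fish_metric": 0.80}}
phytoplankton = {"ST01": {"phyto_metric": 0.55}, "ST03": {"phyto_metric": 0.91}}


def combine_by_station(*bqe_tables):
    """Merge per-BQE records that share a station code into one record."""
    combined = {}
    for table in bqe_tables:
        for station_code, record in table.items():
            combined.setdefault(station_code, {}).update(record)
    return combined


central = combine_by_station(fish, phytoplankton)

# Stations with data for more than one BQE can be assessed jointly.
multi_bqe = sorted(code for code, rec in central.items() if len(rec) > 1)
```

If the WPs had used station names or differing code systems instead, the merge key would not match and the combination would fail silently, which is why the standardisation in Step 1 comes first.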
Data flow - general idea and example
[Figure: data flow diagram. Labels: Data sources (51-L-C, 53-LR-C, 78-L-NC); New data; WP3.1-WP3.4; WP4.1-WP4.4; Central database; Meta-database (metadata for each dataset); WP5.1-WP5.3; WP6.1-WP6.4]
Example: WP3.4 needs data for fish in lakes ... WP5.2 needs the same data + data from other lake BQEs
Lessons from WISER data management
• Different WPs have very different data types, experience with data management, needs for assistance etc.
• Difficult to develop one common database structure which fits all needs of all WPs
• Current status in WISER:
– Data Service team has developed a "WISER common database structure"
– For WPs experienced with data management (e.g. lake fish):
  • Step 1: WP uses its own existing DBs also for storing new WISER data, and takes care of all data management itself
  • Step 2: WISER Data Service transfers the WP's data (subset) to the central DB
– For WPs not experienced with data management (e.g. lake phytoplankton):
  • Step 1: WP stores new WISER data in a new DB with the "common structure", and is offered assistance/tools from WISER Data Service
  • Step 2: WP's database can be imported directly into the central DB
2. Recommendations for deWELopment
Should biological values be stored in project DB as calculated metrics?
BENEFITS
• DB can have simpler structure
• Easier to compile
• Easier to extract tables for data analysis
• NIVA experience from developing EEA DBs for biological metrics
COSTS
• Difficult / impossible to quality-check biological metrics after import to DB
• Original project data are not easily available
Possible compromise for central project DB: BQE groups provide raw data + instructions for calculation of metrics. Data manager imports raw data, calculates metrics and stores them as metrics.
The best solution depends on the various needs of this project...
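The compromise (raw data in, calculated metrics stored) could look roughly like the sketch below. Taxon richness and the Shannon index are used only as familiar examples of biological metrics; they are not the metrics of this project, and the sample and taxon codes are hypothetical.

```python
import math
from collections import defaultdict

# Hypothetical raw data in list format: (sample_id, taxon_code, abundance).
raw = [
    ("S1", "T001", 10),
    ("S1", "T002", 10),
    ("S2", "T001", 5),
]


def calculate_metrics(records):
    """Return taxon richness and Shannon index per sample from raw abundances."""
    counts = defaultdict(dict)
    for sample_id, taxon_code, abundance in records:
        previous = counts[sample_id].get(taxon_code, 0)
        counts[sample_id][taxon_code] = previous + abundance

    metrics = {}
    for sample_id, taxa in counts.items():
        present = [n for n in taxa.values() if n > 0]
        total = sum(present)
        # Shannon index: sum over taxa of -p * ln(p), with p = n / total.
        shannon = sum(-(n / total) * math.log(n / total) for n in present)
        metrics[sample_id] = {"richness": len(present), "shannon": shannon}
    return metrics


metrics = calculate_metrics(raw)
```

Because the raw records are kept alongside the stored metrics, a metric can be recalculated or quality-checked later, which addresses the two costs listed above.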
Questions regarding data management to be considered during this meeting
• Needs
– Combination of biological data from different BQEs: Do you need to use raw biological data, or only biological metrics?
– Evaluation of uncertainty: Do you need to use raw biological data, or only biological metrics?
– ...
• Responsibilities
– Who is overall responsible for deWELopment data management?
– For each BQE group, who is responsible / contact person for data management?
– What should be the role of NIVA in data management? (Give recommendations? Practical assistance? Some responsibility?)
• Ownership of data
– Who owns the raw data? (The BQE group leader? The data collectors? The project consortium?)
– Who makes final decisions regarding storage and use of raw data? (Project leader? Already regulated by project contract?)
Who is responsible for which parts of data management within a BQE group? - Suggestions
Roles: Data provider, "Data manager", Data user
• Data cleaning, quality checking
– Data provider: Responsible
– Data manager: Provide guidelines
– Data user: Contribute during analysis
• Standardisation of taxonomy, parameters, units
– Data provider: Responsible
– Data manager: Provide standards and guidelines
• Standardisation of station and waterbody codes
– Data provider: Responsible
– Data manager: Provide code system and guidelines
• Reorganisation of data for import to database
– Responsible (?)
• Compilation, storing and maintenance
– Data provider: Correct and resubmit data when required
– Data manager: Responsible
• Extraction of tables for data analysis
– Data manager: Provide examples and guidelines
– Data user: Responsible
THANK YOU FOR YOUR ATTENTION!
"List format"
• Required for databases
• Easy to read for machines
• Difficult to do summaries and plotting directly in spreadsheet
• Suitable for all types of data
• Easy to transform to table format (e.g. in Excel or Access)
• OK for data analyses in R
"Table format" (cross table)
• Common for storing own data
• Easy to read for humans
• Can make summaries and figures directly in spreadsheet
– ... which is not recommended!
• Suitable for data with few "gaps"
– Phys/chemistry: OK
– Species: many empty cells
• Difficult to transform to list format
– requires macro
• OK for data analysis in R
– but can only import 1 header row; must replace empty cells, ...
Example, table format:
day  month  year  pH    Conductivity  Alkalinity  Colour  Secchi  etc.
                        mS/cm         meq/L       mg/L    m
18   4      2002  5.14  73            -0.42       61      0.9
2    5      2001  4.77  90            0.898       186     0.5
Example, list format:
SDAY  SMONTH  SYEAR  CONC    QUAL  DETERMINAND                      UNIT
18    4       2002   5.14          pH
18    4       2002   73            Conductivity                     mS/cm
18    4       2002   -0.424        Alkalinity                       meq/L
18    4       2002   61            Colour                           mg/L
18    4       2002   0.9           Secchi                           m
18    4       2002   6.2           Chl a                            µg/L
18    4       2002   18.11         TP                               µg/L
18    4       2002   10      <$    PO4-P mg l-1 (spring)
18    4       2002   0.3           TN mg l-1 (Spring)
18    4       2002   0.01    <$    NH3-N mg l-1 (Spring)
18    4       2002   0.01    <$    TON-N mg l-1 (Spring)
18    4       2002   0.2     <$    SiO2 mg l-1 (Spring)
18    4       2002   4.1           Station Depth (m) (Spring)
18    4       2002   10            Surface temperature °C (Spring)
18    4       2002   10.62         Surface O2 mg l-1 (Spring)
List format to table format: Easy
Table format to list format: Difficult
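The two formats can be converted into each other programmatically. The sketch below is one possible way to do it in plain Python, using three of the values from the slide's example; the function names are hypothetical, and the record layout (day, month, year, determinand, value) is a simplification of the list-format columns.

```python
# List-format records: (day, month, year, determinand, value).
records = [
    (18, 4, 2002, "pH", 5.14),
    (18, 4, 2002, "Conductivity", 73),
    (18, 4, 2002, "Alkalinity", -0.424),
]


def list_to_table(records):
    """Pivot list format into one row per date, one column per determinand."""
    table = {}
    for day, month, year, determinand, value in records:
        table.setdefault((day, month, year), {})[determinand] = value
    return table


def table_to_list(table):
    """Unpivot a cross table back to list format, skipping empty cells."""
    return [
        (day, month, year, determinand, value)
        for (day, month, year), row in table.items()
        for determinand, value in row.items()
        if value is not None
    ]


table = list_to_table(records)
round_trip = table_to_list(table)
```

In a spreadsheet the second direction is the hard one; in code both are short once the data have a single header row and explicit empty cells.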
Role of data holders (WPs) vs. Data Service Team ("Module 2")
NEED TO HAVE for data manager
• Waterbody codes (not names) and station codes (not names)
• Taxonomic codes (not names)
• "Cleaned" data
NICE TO HAVE for data manager
• Database software (not Excel)
– Recommended: MS Access
• Standardised phys/chemical determinands and units
• Geographic coordinates
• Data reorganised into "list format"
This applies regardless of how the data are managed (centrally or within each WP)
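As an illustration of "standardised phys/chemical determinands and units", the sketch below converts reported values to one standard unit per determinand. The determinand list, the choice of standard units and the conversion table are assumptions for the example, not a project standard.

```python
# Hypothetical standard unit per determinand.
STANDARD_UNIT = {"TP": "µg/L", "TN": "mg/L"}

# Hypothetical conversion factors: (determinand, reported unit) -> factor.
TO_STANDARD = {
    ("TP", "µg/L"): 1.0,
    ("TP", "mg/L"): 1000.0,
    ("TN", "mg/L"): 1.0,
    ("TN", "µg/L"): 0.001,
}


def standardise(determinand, value, unit):
    """Return (value, unit) expressed in the determinand's standard unit."""
    try:
        factor = TO_STANDARD[(determinand, unit)]
    except KeyError:
        # Unknown combinations are flagged instead of being guessed.
        raise ValueError(f"no conversion for {determinand!r} in {unit!r}")
    return value * factor, STANDARD_UNIT[determinand]


converted = standardise("TP", 0.5, "mg/L")
```

Raising an error for unknown units sends the record back to the data provider for correction, in line with the responsibilities suggested earlier.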