BioData: a new bioassessment database for the USGS. Briefing for the CDI, 2011.06.08. http://aquatic.biodata.usgs.gov


Page 1: BioData a new bioassessment database for the USGS Briefing for the CDI 2011.06.08

BioData: a new bioassessment database for the USGS

Briefing for the CDI 2011.06.08

http://aquatic.biodata.usgs.gov

Page 2: BioData a new bioassessment database for the USGS Briefing for the CDI 2011.06.08

Today

• What is BioData?
• Why Did We Build It?
• Current Capabilities
• Future Possibilities
• Data Integration/Interoperability Challenges

Page 3: BioData a new bioassessment database for the USGS Briefing for the CDI 2011.06.08

What is BioData? – in a nutshell

A data management, storage, and distribution system for aquatic bioassessment data.

• data capture
• data curation
• data publication

Page 4: BioData a new bioassessment database for the USGS Briefing for the CDI 2011.06.08

Why We Built It - A Brief History

1992 – National Water-Quality Assessment Program (NAWQA) began collecting bioassessment data (macroinvertebrates, fish, algae, stream habitat)

Page 5: BioData a new bioassessment database for the USGS Briefing for the CDI 2011.06.08

NAWQA Study Units

Page 6: BioData a new bioassessment database for the USGS Briefing for the CDI 2011.06.08

Why We Built It - A Brief History

1992 – National Water-Quality Assessment Program (NAWQA) began collecting bioassessment data (macroinvertebrates, fish, algae, stream habitat)

1992 – 1999: Local data management and national data aggregations

1999 – NAWQA national bioassessment database – (BioTDB)

Page 7: BioData a new bioassessment database for the USGS Briefing for the CDI 2011.06.08

WRD Needs Assessment (2006)

Surveyed WRD Science Centers to find out:
• How much aquatic ecology data is being collected outside the NAWQA Program?
• What kinds?
• What methods?
• Where and how are data being stored?

Page 8: BioData a new bioassessment database for the USGS Briefing for the CDI 2011.06.08

What We Discovered

Water collaborative projects with other agencies, states, localities, and partners are producing as much data as the NAWQA Program
• 80% of WSCs reported projects collecting aquatic ecology data
• 120 projects had a macroinvertebrate, fish, algae, or habitat component (2000 – 2005)
• Approximately 15,000 samples

The majority of samples are being collected using NAWQA and USEPA national stream bioassessment protocols

Samples are being sent to a variety of taxonomic labs

Page 9: BioData a new bioassessment database for the USGS Briefing for the CDI 2011.06.08

What We Discovered

The data are stored electronically, but are very difficult to discover, access, and integrate
• 47% in Excel
• 13% in EPA databases
• 19% in home-grown relational databases
• Together: 79%

Page 10: BioData a new bioassessment database for the USGS Briefing for the CDI 2011.06.08

U.S. Department of the Interior
U.S. Geological Survey

BioData: a new bioassessment database for the USGS

briefing for the USGS GCMRC 5/9/2011

http://aquatic.biodata.usgs.gov

Page 11: BioData a new bioassessment database for the USGS Briefing for the CDI 2011.06.08

What Should We Do?

1. Do nothing?

2. Implement a federated system?

3. Incrementally refurbish existing NAWQA database?

4. Redesign and “re-build” using modern, web-enabled, extensible architecture? (BioData)

Page 12: BioData a new bioassessment database for the USGS Briefing for the CDI 2011.06.08

BioData – Version 1 Objective

A data storage, retrieval, and distribution system for aquatic bioassessment data most commonly produced by USGS WRD projects.

Page 13: BioData a new bioassessment database for the USGS Briefing for the CDI 2011.06.08

“Most Commonly Produced”

• Project Objectives: Bioassessment and monitoring
• Setting: Streams and rivers
• Types of Data: Macroinvertebrates, Fish, Algae, Study reach habitat
• Sampling Protocols: NAWQA, USEPA

Page 14: BioData a new bioassessment database for the USGS Briefing for the CDI 2011.06.08

Additional Characteristics

• An internet application
• Available to any USGS ecologist
• Designed to be adapted and extended
• Support scientific workflow
• Serve as an online data archive
• Curate taxonomic nomenclature – map it forward and harmonize it across all the data
• Support biologist lab data exchange
• Readily add web data services

Page 15: BioData a new bioassessment database for the USGS Briefing for the CDI 2011.06.08

[Diagram: BioData system components]

BioDataInput – project data management
• Inputs: field data, lab data
  • field data input
  • data exchange with labs
  • data review
• External data
  • NAWQA legacy data

BioDataRetrieval (DWH) – data distribution
• public web site
• web data services
• application-specific output

Page 16: BioData a new bioassessment database for the USGS Briefing for the CDI 2011.06.08

Data Retrieval Features

https://aquatic.biodata.usgs.gov

Real-time feedback on how many samples your query will return

Save the query to your desktop – then email to friends for them to run

Variety of file formats

Multiple data sets downloaded in one step
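The "save the query, then email it" feature implies that a query can be written to a small, portable file and read back by someone else. The following is a minimal sketch of that idea only, assuming a JSON file format; the `SavedQuery` class and all of its field names are hypothetical and are not BioData's actual query schema.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List, Optional


@dataclass
class SavedQuery:
    # Hypothetical fields, chosen only for illustration.
    community: str                                    # e.g. "fish" or "algae"
    states: List[str] = field(default_factory=list)   # two-letter state codes
    start_year: Optional[int] = None
    end_year: Optional[int] = None


def dump_query(query: SavedQuery) -> str:
    """Serialize a query to JSON text that can be saved or emailed."""
    return json.dumps(asdict(query), sort_keys=True, indent=2)


def load_query(text: str) -> SavedQuery:
    """Rebuild a query from JSON text produced by dump_query."""
    return SavedQuery(**json.loads(text))
```

A recipient would call `load_query` on the file contents and rerun the query against the current database, so the results reflect the data at the time of the rerun rather than at the time the query was saved.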

Page 17: BioData a new bioassessment database for the USGS Briefing for the CDI 2011.06.08

Data Retrieval Demo

https://aquatic.biodata.usgs.gov

Page 18: BioData a new bioassessment database for the USGS Briefing for the CDI 2011.06.08

[Diagram: BioData system components]

BioDataInput – project data management
• Inputs: field data, lab data
  • field data input
  • data exchange with labs
  • data review
• External data
  • NAWQA legacy data

BioDataRetrieval (DWH) – data distribution
• public web site
• web data services
• application-specific output

Page 19: BioData a new bioassessment database for the USGS Briefing for the CDI 2011.06.08

Data Input/Management Features

• Retrieve restricted (unreleased) data
• Manage and organize data by project
• Project control over rights to enter and edit data
• Built-in help and data validation checks
• Auto-saving
• Data entry screens tailored to field sheets
• Send electronic orders to labs

Page 20: BioData a new bioassessment database for the USGS Briefing for the CDI 2011.06.08

Data Input/Management Demo

Page 21: BioData a new bioassessment database for the USGS Briefing for the CDI 2011.06.08

Data integration – touchpoints

• First challenge – find the data
• Second challenge – compatible methods?

Page 22: BioData a new bioassessment database for the USGS Briefing for the CDI 2011.06.08

Data integration – touchpoints

• First challenge – find the data
• Second challenge – compatible methods?
• Third challenge – get the data
  • We need to pick a data exchange standard

Page 23: BioData a new bioassessment database for the USGS Briefing for the CDI 2011.06.08

Data integration – touchpoints

• First challenge – find the data
• Second challenge – compatible methods?
• Third challenge – get the data
• Fourth challenge – harmonize taxonomy
  • Does “Thienemannimyia group” = “Thienemannimyia gr.”?
  • Does ITIS solve this?
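One way to approach the "group" versus "gr." question is to normalize qualifier abbreviations before comparing names. The sketch below assumes a small, hand-written abbreviation table; the table contents and the function names are illustrative only and do not describe how BioData or ITIS actually resolves names.

```python
# Hypothetical abbreviation table; a curated list would come from taxonomists.
QUALIFIERS = {"gr": "group", "grp": "group", "group": "group"}


def normalize_taxon_name(name: str) -> str:
    """Expand known qualifier abbreviations and collapse extra whitespace."""
    tokens = name.split()
    normalized = []
    for index, token in enumerate(tokens):
        key = token.rstrip(".").lower()
        # Never rewrite the first token: it is the taxon name itself.
        if index > 0 and key in QUALIFIERS:
            normalized.append(QUALIFIERS[key])
        else:
            normalized.append(token)
    return " ".join(normalized)


def same_taxon(a: str, b: str) -> bool:
    """Compare two reported names after normalization, ignoring case."""
    return normalize_taxon_name(a).casefold() == normalize_taxon_name(b).casefold()
```

With this table, `same_taxon("Thienemannimyia group", "Thienemannimyia gr.")` evaluates to `True`, while a bare "Thienemannimyia" stays distinct from the conditional "group" name. String normalization only catches spelling variants; deciding whether two labs mean the same set of organisms still needs taxonomic review.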

Page 24: BioData a new bioassessment database for the USGS Briefing for the CDI 2011.06.08

ITIS

Page 25: BioData a new bioassessment database for the USGS Briefing for the CDI 2011.06.08
Page 26: BioData a new bioassessment database for the USGS Briefing for the CDI 2011.06.08

ITIS

• Only handles published names
• We have to handle unpublished names
  • Provisional = new taxon claimed but not “officially” published
  • Conditional = uncertain or indeterminate identification, e.g. “Thienemannimyia group”
• ITIS is not complete for all groups
  • Fish – good, we can integrate tightly with it
  • Macroinvertebrates – doable
  • Algae – ITIS not ready yet

Page 27: BioData a new bioassessment database for the USGS Briefing for the CDI 2011.06.08

Data integration – touchpoints

• First challenge – find the data
• Second challenge – compatible methods?
• Third challenge – get the data
• Fourth challenge – harmonize taxonomy
  • Does “Thienemannimyia group” = “Thienemannimyia gr.”?
• Fifth challenge – integrate with physio-chemical and ancillary data
  • Common geospatial framework would help

Page 28: BioData a new bioassessment database for the USGS Briefing for the CDI 2011.06.08

NHD

• Which NHD?
• NHD “snap to” service with APIs that developers could use in their application(s)?
• Service to translate NHD address to other versions of NHD (and future)
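To make the "snap to" idea concrete, the sketch below finds the nearest point on a set of straight reach segments and reports how far along the reach it falls. It is an illustration under simplifying assumptions: coordinates are planar (already projected), each reach is a single segment, and the reach identifiers are made up. It is not an NHD API.

```python
from math import hypot


def snap_to_segment(p, a, b):
    """Return (closest point on segment a-b, fraction along it, distance)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0:
        t = 0.0  # degenerate segment: both endpoints coincide
    else:
        # Project p onto the line, then clamp to the segment's endpoints.
        t = ((px - ax) * dx + (py - ay) * dy) / length_sq
        t = max(0.0, min(1.0, t))
    cx, cy = ax + t * dx, ay + t * dy
    return (cx, cy), t, hypot(px - cx, py - cy)


def snap_to_network(p, reaches):
    """reaches maps a (hypothetical) reach id to its (start, end) points.

    Returns (reach_id, snapped_point, fraction, distance) for the nearest
    reach, or None if the network is empty.
    """
    best = None
    for reach_id, (a, b) in reaches.items():
        point, t, dist = snap_to_segment(p, a, b)
        if best is None or dist < best[3]:
            best = (reach_id, point, t, dist)
    return best
```

The returned reach id and fraction together act as a simple "address" on the network. Translating such an address between versions of the hydrography would additionally need a crosswalk between old and new reach identifiers, which this sketch does not attempt.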

Page 29: BioData a new bioassessment database for the USGS Briefing for the CDI 2011.06.08

http://aquatic.biodata.usgs.gov

BioData

For more information contact: Pete [email protected]