The Long Tail of Hydroinformatics: Implementing Environmental Flows Information
in Hydrologic Information Systems
Eric Hersh, PhD, PE - Environmental Science Institute
David Maidment, PhD, PE - Center for Research in Water Resources
The University of Texas at Austin
AWRA E-Flows, June 25, 2013
• Supported by the NSF, 2004-2017, $12 million total for HIS
• CUAHSI has successfully designed and is deploying a national prototype Hydrologic Information System
• But CUAHSI HIS efforts to date have largely excluded biological observations of the water environment and have largely been constrained to terrestrial hydrologic systems
http://his.cuahsi.org/
Hydroinformatics
Informatics – the science of information,
the practice of information processing, and
the engineering of information systems
Hydroinformatics – applied to water -or-
the study of the flow of information
related to the flow of water (and the entire
water environment in general)
HIS – a “services-oriented architecture for
water information”
“This effort is relatively nascent. A thorough literature review was performed as part of this research encompassing dozens of systems, tools, projects, and efforts; not one of them existed just ten years ago.”
Information Science
Water Information Value Ladder (Vertessy 2010)
Data-Information-Knowledge-Wisdom Pyramid (Rowley 2007)
Data - the raw facts
Information - gives meaning to Data
Knowledge - analyzing and synthesizing Information
Wisdom - using Knowledge to establish and achieve goals
The Water Environment
Surveys, sensors, continuous, series-based (phys/chem)
vs.
Studies, samples, discrete, collections-based (biol)
• Physical data describe the movement of water and its properties (hundreds of physical parameters)
• Chemical data describe the constituents moving through the water (thousands of chemical constituents)
• Biological data describe the organisms inhabiting the water environment (millions of biological species)
The 3-D Data Cube (Maidment 2002)
D = f {S,T,V}
A data value, D, is located on three axes:
Space, S (s) - “Where”
Time, T (t) - “When”
Variables, V (Vi) - “What”
CUAHSI ODM (Horsburgh et al. 2008)
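The data cube idea, D = f {S,T,V}, can be sketched as a lookup keyed by space, time, and variable. This is a minimal illustration only, not the ODM schema: the site names, variable names, and values below are hypothetical.

```python
from datetime import datetime

# Hypothetical observations keyed by (space, time, variable).
# Site names, variable names, and values are illustrative only.
cube = {
    ("site_A", datetime(2012, 7, 27, 12, 0), "streamflow_cfs"): 125.0,
    ("site_A", datetime(2012, 7, 27, 12, 0), "water_temp_C"): 24.5,
    ("site_B", datetime(2012, 7, 27, 12, 0), "streamflow_cfs"): 310.0,
}

def data_value(s, t, v):
    """Return D = f(S, T, V), or None if nothing was observed there."""
    return cube.get((s, t, v))

def time_series(s, v):
    """Slice the cube along T for a fixed site and variable."""
    return sorted((t, d) for (si, t, vi), d in cube.items()
                  if si == s and vi == v)
```

Fixing two of the three coordinates and walking the third gives the familiar products: a time series (fixed S and V), a spatial snapshot (fixed T and V), or a site profile (fixed S and T).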
KINGDOM
PHYLUM
CLASS
ORDER
SUBORDER
INFRAORDER
FAMILY
GENUS
SPECIES
SUBSPECIES
Level      Classification    Description
Kingdom    Animalia          animals
Phylum     Chordata          chordates (possessing a nerve cord)
Subphylum  Vertebrata        vertebrates (with backbone and spinal column)
Class      Mammalia          mammals
Order      Cetacea           whales and dolphins
Suborder   Mysticeti         baleen whales (possessing baleen plates instead of teeth)
Family     Balaenopteridae   rorquals (possessing pleated throat grooves)
Genus      Balaenoptera      finback whales
Species    B. musculus       blue whale
Taxonomic Classification
Length = 98 feet
Mass = 400,000 pounds
…and Traits
D = f {S,T,V}
taxonomy
trait: measurements and characteristics, such as length, mass, sex, or count
The 4-D Data Cube
traits have species & species have traits
database efficiency & query simplicity
S 62.08 °N, 19.62 °W
T July 27, 2012
V? Balaenoptera musculus (blue whale)
V? Length = 98 feet
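One way to picture the fourth axis is to split the variable into species and trait, so the blue whale example becomes a single record. The sketch below is an assumption about layout, not the BioODM implementation; the `BioValue` record and `query` helper are hypothetical names.

```python
from collections import namedtuple

# One biological observation: where, when, which taxon, which trait.
# The record layout is hypothetical, not the BioODM table design.
BioValue = namedtuple("BioValue", "space time species trait value units")

# The single example from the slide (62.08 N, 19.62 W).
observations = [
    BioValue((62.08, -19.62), "2012-07-27",
             "Balaenoptera musculus", "length", 98.0, "ft"),
]

def query(obs, species=None, trait=None):
    """Filter on the species axis, the trait axis, or both."""
    return [o for o in obs
            if (species is None or o.species == species)
            and (trait is None or o.trait == trait)]
```

Keeping species and trait as separate axes means "all lengths for any species" and "all traits for one species" are both simple filters, which is the query simplicity the slide refers to.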
Semantic Mediation
(Arizona Republic, 11/19/2012)
http://www.itis.gov/
reservoir discharge
streamflow
Addressing:
- how data is formatted = syntactic mediation
(ODM, WaterML, OGC)
- how data is described = semantic mediation
(Ontology, CV)
… across multiple systems
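Semantic mediation can be sketched as a lookup from each system's own term to a shared concept. The mapping below is hypothetical and assumes, for illustration only, that the two terms on the slide are treated as the same concept; a real system would draw on the ontology and controlled vocabulary.

```python
# Hypothetical mapping from source-specific terms to one shared concept.
# Treating these two terms as equivalent is an assumption for illustration.
CONCEPT_FOR = {
    "reservoir discharge": "discharge",
    "streamflow": "discharge",
}

def mediate(source_term):
    """Return the shared concept for a term, or the normalized term itself."""
    key = source_term.strip().lower()
    return CONCEPT_FOR.get(key, key)
```

A search for one concept can then return series from systems that label the same quantity differently.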
Ontologies
http://water.sdsc.edu/hiscentral/startree.aspx
Biological Taxa
Indicator Organisms
Biological Domain
Biological Community
a formal description of concepts and relationships
BioODM v.1.2 (Eric S. Hersh, UT-CRWR)
[Schema diagram]
Tables: Organism, Group, Taxonomy, Sample, Method, Document, Source, Domain, Site, Habitat, Weight, Units, Datum, Size
Annotations: (has traits, ID); (characteristics, statistics); (link to pdf, geography); (specific location); (general location); (substrate, cover, hydraulics, characteristics); (provenance, contact info)
BioODM
133 additions and 17 edits to the CUAHSI Controlled Vocabulary
CV changes Taxonomy table
ODM
“…issues have arisen with regard to the lack of sufficient site-specific scientific data and analyses describing the essential relationships between environmental flows and the actual needs of aquatic organisms in those systems.”
- SB3 Science Advisory Committee to BBASCs and BBESTs, 2/17/2010
Problem:
Bring bio data into a structured format to stand on the same footing as hydrology
Solution:
An Environmental Flows Information System for Texas
• Six information types
– Observations data (WaterML/ODM)
– Geographic (SHP, KML, WFS)
– Documents (DSpace)
– Tables (conservation status, guilds)
– Tools (CaLF, HydroExcel)
– Links (Fishes of Texas, IHA, SAC)
• Four access types
– Web Page
– Interactive Map Viewer
– Digital Library
– HydroPortal
The Calculator for Low Flows (CaLF)
A Services-Oriented Architecture…
…for the communication of data
…via a standard language
…over the internet
Not just data stored locally…
…but accessible via web services for specific analysis
Here: 7Q2 and Lyons Flow, using USGS streamflow data published in WaterML
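The two statistics can be sketched as follows. This is not the CaLF code. It assumes daily flows have already been parsed out of WaterML into plain lists, estimates 7Q2 empirically as the median of the annual 7-day minimum flows (practice often fits a frequency distribution instead), and uses the Lyons fractions as commonly described: 40% of the monthly median flow for October through February and 60% for March through September. All function names are hypothetical.

```python
from statistics import mean, median

def seven_day_minimum(daily_flows):
    """Lowest 7-day moving-average flow within one year of daily values."""
    windows = (daily_flows[i:i + 7] for i in range(len(daily_flows) - 6))
    return min(mean(w) for w in windows)

def seven_q_two(flows_by_year):
    """Empirical 7Q2: median of the annual 7-day minimum flows.

    Assumption: the median stands in for the 2-year recurrence value.
    """
    return median(seven_day_minimum(f) for f in flows_by_year.values())

def lyons_flow(month, monthly_median_flow):
    """Lyons target: 40% of the monthly median for Oct-Feb, 60% for Mar-Sep."""
    fraction = 0.4 if month in (10, 11, 12, 1, 2) else 0.6
    return fraction * monthly_median_flow
```

Each year passed to `seven_q_two` needs at least seven daily values; how years are delimited (calendar, water, or climatic year) is left to the caller.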
Water Information System of Systems
WISOS:
HIS - observations data
GIS - geographic data
DL - digital assets
A combination of tools – multiple means of data storage, multiple
avenues of data access, water web services, maps, documents, and
databases – all aggregated into a shared data portal.
• How does this schema apply to Texas rivers?
• Purpose: (1) show the information used, the questions posed, and the analyses conducted to aid in environmental flow determinations; (2) demonstrate how an improved HIS can help.
• Both taxonomy and traits are included – fish genus and species plus count and length.
Lower Sabine River Case Study
Blacktail shiner (Cyprinella venusta) Bullhead minnow (Pimephales vigilax)
Bay anchovy (Anchoa mitchilli) Spotted bass (Micropterus punctulatus)
Sabine shiner (Notropis sabinae)
Biological             Physical                Sampling Effort
scientific name        water depth             seine length
common name            velocity at 20% depth   haul length
count                  velocity at 60% depth   shock distance
minimum total length   velocity at 80% depth   shock time
maximum total length   substrate type
                       habitat type
                       cover type
                       embeddedness
5,831 fish observed representing 58 species.
On average, 729±433 fish were collected per study reach.
The average reach had 24±8 species represented.
The five most abundant species accounted for ~70% of all fishes.
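Summary statistics of this kind follow directly from per-species counts. The sketch below shows how species richness and the share held by the most abundant species could be computed; the counts are made-up placeholders, not the Sabine River data, and the function names are hypothetical.

```python
from collections import Counter

# Placeholder counts per species; NOT the case-study data.
toy_counts = {"a": 50, "b": 30, "c": 10, "d": 5, "e": 3, "f": 2, "g": 0}

def richness(counts):
    """Number of species with at least one individual observed."""
    return sum(1 for n in counts.values() if n > 0)

def top_share(counts, n=5):
    """Fraction of all individuals belonging to the n most abundant species."""
    total = sum(counts.values())
    top = sum(c for _, c in Counter(counts).most_common(n))
    return top / total
```

Run per study reach, the same two functions would give the per-reach richness and dominance figures that feed averages like those above.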
Toledo Bend Dam is having a negative impact on fish abundance and species richness…
…and the inland silverside has had some, but limited, success in invasion.
Environmental flows considerations: (1) how the hydropower dam is operated; (2) freshwater inflows to Sabine Lake
35% visited EFIS more than once and 10% visited more than 10 times!
Impact To-Date
Data Portal      # of Visits   # Unique Visitors   Time (Years)
EFIS             1,603         1,603               < 3
COMIDA CAB       2,868         1,508               < 2.5
Texas seagrass   1,285         675                 < 2
TOTAL            5,756         3,786
http://texasseagrass.org/ http://comidacab.org/ http://efis.crwr.utexas.edu/
• Improved understanding of how to deal with collections of biological data stored alongside sensor-based physical data
• New 4-D data cube for biological observations data with axes of space, time, species, and trait
• Reworking and expansion of the biological domain ontology
Contributions to CUAHSI HIS
• Water Information System of Systems vision for next-generation Hydrologic Information Systems
1. Geographic data stored in a geodatabase in a GIS
2. Observations data stored in a relational database in an HIS
3. Documents stored in a digital repository in a Digital Library
Contributions to HIS
Acknowledgements
thanks.