Hydrologic Information System for the Nation
I. Zaslavsky (SDSC) & The CUAHSI HIS Project
his.cuahsi.org, hiscentral.cuahsi.org
Consortium of Universities for the Advancement of Hydrologic Science, Inc.
An organization representing more than one hundred United States universities. It receives support from the National Science Foundation to develop infrastructure and services for the advancement of hydrologic science and education in the U.S.
http://www.cuahsi.org/
122 US universities as of July 2008
Databases Analysis
Models
CUAHSI Hydrologic Information System
Goal: Enhance hydrologic science by facilitating user access to more and better data for testing hypotheses and analyzing processes
• Advancement of water science is critically dependent on integration of water information
– Querying nation’s repository of water data
– Linking small integrated research sites (<100 km²) with global and continental models
– Integrating data from multiple disciplines to understand controls on hydrologic cycle
• It is as important to represent hydrologic environments precisely with data as it is to represent hydrologic processes with equations
Rainfall & Snow
Water quantity and quality
Remote sensing
Meteorology
Soil water
What is the CUAHSI HIS?
An Internet-based system for sharing hydrologic data. It comprises databases connected over the Internet through web services, plus software for data discovery, access, and publication.
Project co-PI in Phase 2
Collaborator in Phase I
CUAHSI HIS Partner Institutions
HIS
WATERS Testbed
CUAHSI Hydrologic Information System (HIS)
NSF has funded work at 11 testbed sites, each with its own science agenda. HIS supplies the
common information system
Supercomputer Centers: NCSA, TACC
Domain Sciences: Unidata, NCAR, LTER, GEON
Government: USGS, EPA, NCDC, USDA
Industry: ESRI, Kisters, OpenMI
HIS Team
WATERS Testbed
WATERS Network Information System
CUAHSI HIS
International Partners
CSIRO Land and Water Resources: Water Resources Observations Network (WRON)
European Commission: Water database design and model integration (HarmonIT and OpenMI)
Observation Stations
Ameriflux Towers (NASA & DOE)
NOAA Automated Surface Observing System
USGS National Water Information System
NOAA Climate Reference Network
Map for the US
Build a common window on water data using web services
Water Data Web Sites
NWISWeb site output
# agency_cd Agency Code
# site_no USGS station number
# dv_dt date of daily mean streamflow
# dv_va daily mean streamflow value, in cubic-feet per-second
# dv_cd daily mean streamflow value qualification code
#
# Sites in this file include:
# USGS 02087500 NEUSE RIVER NEAR CLAYTON, NC
#
agency_cd site_no dv_dt dv_va dv_cd
USGS 02087500 2003-09-01 1190
USGS 02087500 2003-09-02 649
USGS 02087500 2003-09-03 525
USGS 02087500 2003-09-04 486
USGS 02087500 2003-09-05 733
USGS 02087500 2003-09-06 585
USGS 02087500 2003-09-07 485
USGS 02087500 2003-09-08 463
USGS 02087500 2003-09-09 673
USGS 02087500 2003-09-10 517
USGS 02087500 2003-09-11 454
Time series of streamflow at a gaging station
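A minimal sketch of how a client might read the NWISWeb listing above into records. It assumes whitespace-separated columns as they appear on the slide, and treats the first non-comment line as the column header; actual NWIS downloads may differ in delimiter and may carry extra header lines. The function name `parse_daily_values` is hypothetical.

```python
from datetime import date

# A few rows copied from the NWISWeb listing; '#' lines are comments.
SAMPLE = """\
# agency_cd Agency Code
# site_no USGS station number
#
agency_cd site_no dv_dt dv_va dv_cd
USGS 02087500 2003-09-01 1190
USGS 02087500 2003-09-02 649
USGS 02087500 2003-09-03 525
"""


def parse_daily_values(text):
    """Return a list of (site_no, date, flow_cfs, qualifier) tuples."""
    records = []
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split()
        if header is None:
            header = fields  # first non-comment line names the columns
            continue
        # dv_cd (the qualification code) may be absent, so pad short rows
        fields += [""] * (len(header) - len(fields))
        row = dict(zip(header, fields))
        records.append(
            (
                row["site_no"],
                date.fromisoformat(row["dv_dt"]),
                float(row["dv_va"]),
                row["dv_cd"],
            )
        )
    return records


records = parse_daily_values(SAMPLE)
```

Every agency site has its own layout of this kind, which is the motivation for putting a common web service interface in front of them.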
USGS has committed to supporting CUAHSI’s GetValues function
Point Observations Information Model
• A data source operates an observation network
• A network is a set of observation sites
• A site is a point location where one or more variables are measured
• A variable is a property describing the flow or quality of water
• An observation series is an array of observations at a given site, for a given variable, with start time and end time
• A value is an observation of a variable at a particular time
• A qualifier is a symbol that provides additional information about the value
Data Source
Network
Sites
ObservationSeries
Values
{Value, Time, Qualifier}
USGS
Streamflow gages
Neuse River near Clayton, NC
Discharge, stage, start, end (Daily or instantaneous)
206 cfs, 13 August 2006
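The hierarchy above (data source, network, site, observation series, value) can be sketched as nested data classes. This is an illustrative sketch only: the class and field names are assumptions chosen to mirror the slide, not the ODM schema, and the `units` field is an added convenience.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional


@dataclass
class Value:
    value: float
    time: datetime
    qualifier: Optional[str] = None  # symbol giving extra information


@dataclass
class ObservationSeries:
    variable: str                    # e.g. "Discharge"
    units: str                       # assumed field, not named on the slide
    values: List[Value] = field(default_factory=list)

    @property
    def start(self) -> Optional[datetime]:
        return min((v.time for v in self.values), default=None)

    @property
    def end(self) -> Optional[datetime]:
        return max((v.time for v in self.values), default=None)


@dataclass
class Site:
    name: str
    series: List[ObservationSeries] = field(default_factory=list)


@dataclass
class Network:
    name: str
    sites: List[Site] = field(default_factory=list)


@dataclass
class DataSource:
    name: str
    networks: List[Network] = field(default_factory=list)


# The example from the slide: USGS > streamflow gages > Neuse River > discharge
discharge = ObservationSeries(
    "Discharge", "cfs", [Value(206.0, datetime(2006, 8, 13))]
)
clayton = Site("Neuse River near Clayton, NC", [discharge])
usgs = DataSource("USGS", [Network("Streamflow gages", [clayton])])
```

Deriving the start and end times from the stored values keeps a series' period of record consistent with its data.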
Return network information, and variable information within the network
Return site information, including a series catalog of variables measured at a site with their periods of record
Return time series of values
http://his.cuahsi.org/odmdatabases.html
CUAHSI Observations Data Model
WaterML design principles
• Driven largely by hydrologists; the goal is to capture the semantics of hydrologic observations discovery and retrieval
• Relies to a large extent on the information model as in ODM (Observations Data Model), and terms are aligned as much as possible
– Several community reviews since 2005
• Driven by data served by USGS NWIS, EPA STORET, multiple individual PI-collected observations
• Is no more than an exchange schema for CUAHSI web services
• A fairly simple and rigid schema tuned to the current implementation; the lowest barrier to adoption by hydrologists
• Conformance with OGC specs not in the initial scope, but working with OGC on this (OGC Discussion Paper 07-041)
Water Data Services
• Set of query functions
• Returns data in WaterML
NWIS Daily Values (discharge), NWIS Ground Water, NWIS Unit Values (real time), NWIS Instantaneous Irregular Data, EPA STORET, NCDC ASOS, DAYMET, MODIS, NAM12K, USGS SNOTEL, ODM (multiple sites)
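To illustrate what "returns data in WaterML" means for a client, the sketch below parses a deliberately simplified, WaterML-like XML fragment. The fragment is hypothetical: its element and attribute names are assumptions for illustration, and real WaterML responses use XML namespaces and carry considerably more metadata. The values are taken from the NWISWeb example.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified response; not a verbatim WaterML document.
SAMPLE_RESPONSE = """\
<timeSeriesResponse>
  <timeSeries>
    <sourceInfo>
      <siteName>NEUSE RIVER NEAR CLAYTON, NC</siteName>
    </sourceInfo>
    <variable>
      <variableName>Discharge</variableName>
      <units>cfs</units>
    </variable>
    <values>
      <value dateTime="2003-09-01T00:00:00">1190</value>
      <value dateTime="2003-09-02T00:00:00">649</value>
    </values>
  </timeSeries>
</timeSeriesResponse>
"""


def parse_time_series(xml_text):
    """Return (site name, variable name, [(timestamp, value), ...])."""
    root = ET.fromstring(xml_text)
    series = root.find("timeSeries")
    site = series.findtext("sourceInfo/siteName")
    variable = series.findtext("variable/variableName")
    observations = [
        (node.get("dateTime"), float(node.text))
        for node in series.findall("values/value")
    ]
    return site, variable, observations


site, variable, observations = parse_time_series(SAMPLE_RESPONSE)
```

Because every service in the list returns the same structure, one parser of this kind can serve all of them.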
Test bed HIS Servers
Central HIS servers
ArcGIS
Matlab
IDL, R
MapWindow
Excel
Programming (Fortran, C, VB)
Desktop clients
Customizable web interface (DASH)
HTML - XML
WSDL - SOAP
Modeling (OpenMI)
Global search (Hydroseek)
WaterOneFlow Web Services, WaterML
HIS Lite Servers
External data providers
Deployment to test beds
Other popular online clients
ODM DataLoader
Streaming Data Loading
Ontology tagging (Hydrotagger)
WSDL and ODM registration
Data publishing
ODMTools
Server config tools
HIS Central Registry & Harvester
Hydrologic Information System Service Oriented Architecture
Hydrologic Information Server
Microsoft SQLServer Relational Database
Observations Data & Catalogs Geospatial Data
GetSites
GetSiteInfo
GetVariables
GetVariableInfo
GetValues
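The five functions listed above can be pictured with a toy in-memory stand-in. This is a sketch under stated assumptions: the method names follow the slide, but the argument lists, return types, site code and variable code are hypothetical, and the real services are SOAP web services described by WSDL rather than Python objects.

```python
class InMemoryWaterService:
    """Toy stand-in for a water data service (not the real interface)."""

    def __init__(self, sites, variables, values):
        self._sites = sites          # site code -> site name
        self._variables = variables  # variable code -> description
        self._values = values        # (site, variable) -> [(ISO date, value)]

    def GetSites(self):
        return sorted(self._sites)

    def GetSiteInfo(self, site_code):
        # Series catalog: each variable measured here and its period of record
        catalog = {}
        for (site, variable), obs in self._values.items():
            if site == site_code and obs:
                dates = [d for d, _ in obs]
                catalog[variable] = (min(dates), max(dates))
        return {
            "code": site_code,
            "name": self._sites[site_code],
            "series": catalog,
        }

    def GetVariables(self):
        return sorted(self._variables)

    def GetVariableInfo(self, variable_code):
        return self._variables[variable_code]

    def GetValues(self, site_code, variable_code, start, end):
        # ISO dates sort lexicographically, so string comparison is enough
        obs = self._values.get((site_code, variable_code), [])
        return [(d, v) for d, v in obs if start <= d <= end]


service = InMemoryWaterService(
    sites={"02087500": "NEUSE RIVER NEAR CLAYTON, NC"},
    variables={"discharge": "daily mean streamflow, cubic feet per second"},
    values={
        ("02087500", "discharge"): [
            ("2003-09-01", 1190.0),
            ("2003-09-02", 649.0),
            ("2003-09-03", 525.0),
            ("2003-09-04", 486.0),
        ]
    },
)
subset = service.GetValues("02087500", "discharge", "2003-09-02", "2003-09-03")
```

A client would typically call GetSites and GetSiteInfo first to discover what exists, then GetValues for the series and time window of interest.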
DASH – data access system for hydrology
WaterOneFlow services
ArcGIS Server
SQL Server
ODMs and catalogs. All instances exposed as ODM (i.e. have standard ODM tables or views: Sites, Variables, SeriesCatalog, etc.)
NWIS-IID
NWIS-DV
ASOS
STORET
TCEQ
BearRiver
. . .
Spatial store
Geodatabase or collection of shapefiles or both
NWIS-IID points
NWIS-DV points
ASOS points
STORET points
TCEQ points
BearRiver points
. . .
My new ODM
My new points
More databases
More synced layers
DASH Web Application
Background layers
(can be in the same or separate spatial store)
WOF services
Web services from a common template
NWIS-IID WS
NWIS-DV WS
ASOS WS
STORET WS
TCEQ WS
BearRiver WS
. . .
My new WS
More WS from ODM-WS template
USGS
NCDC
EPA
TCEQ
Web Configuration file: stores information about registered networks
MXD: stores information about layers
WSDLs, web service URLs
Connection strings
Layer info,
symbology, etc.
ODMDataLoader
Workgroup HIS Server Organization
Steps for Registering Observation Data (numbered 1 to 6 in the diagram)
Central HIS Data
Services
Catalog
Against the NIH Syndrome
2006:
► CUAHSI HIS web services are discussed on the BASINS mailing list as a new way to access hydrologic data. The list is mostly used by hydrologists and developers outside academia;
► NCDC develops ASOS web services following WaterML
2007:
► MOU with USGS; USGS is developing WaterML-compliant GetValues service;
► GLEON uses an early version of ODM to develop their own database schema (VEGA);
► Phoenix LTER is developing ODM (in MySQL) and WaterML web services (in Java);
► A Google Earth-based client for CUAHSI web services is developed at CSIRO, Australia;
► Deployment to 11 hydrologic observatory test beds, + CBEO (CEOP project)
2008:
► KISTERS develops WaterML-compliant web services over their database, for a client;
► MapWindow open source GIS develops WaterOneFlow parsers;
► Florida, Texas and Idaho use ODM and WaterOneFlow web services to provide access to state data repositories; New Jersey is considering the same;
► Another CEOP project, at UC-Davis, is implementing ODM (in Postgres) and web services (in Java);
► More, which we don’t know about…
Hydroseek
http://www.hydroseek.net
Supports search by location and type of data across multiple observation networks, including NWIS, STORET, and academic data
Semantic Tagging of Harvested Variables
CUAHSI HIS as a mediator across multiple agency and PI data
– Maintains integrated metadata catalog and services registry
– Keeps identifiers for sites, variables, etc. across observation networks
– Manages and publishes controlled vocabularies, and provides vocabulary/ontology management and update tools
– Provides common structural definitions for data interchange
– Provides a sample protocol implementation
– Governance framework: a consortium of universities, MOUs with federal agencies, collaboration with key commercial partners, NSF support for core development and test beds
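The mediator role described above, in particular the use of controlled vocabularies to search across networks, can be sketched as a lookup from each network's own variable label to a shared concept. Everything in this sketch is hypothetical: the variable labels, concept names, site identifiers and function names are invented for illustration and are not taken from the HIS catalog or its ontology.

```python
# (network, source variable label) -> shared concept; all entries hypothetical
VOCABULARY = {
    ("NWIS", "Discharge, cubic feet per second"): "Streamflow",
    ("STORET", "Flow"): "Streamflow",
    ("ASOS", "Precipitation, hourly"): "Precipitation",
}

# A tiny stand-in for a harvested metadata catalog (hypothetical entries)
CATALOG = [
    {"network": "NWIS", "site": "site-A",
     "variable": "Discharge, cubic feet per second"},
    {"network": "STORET", "site": "site-B", "variable": "Flow"},
    {"network": "ASOS", "site": "site-C",
     "variable": "Precipitation, hourly"},
]


def tag(network, variable_label):
    """Map a source-specific variable label to a shared concept, if known."""
    return VOCABULARY.get((network, variable_label))


def find_series(catalog, concept):
    """Return catalog entries whose variable maps to the given concept."""
    return [
        entry
        for entry in catalog
        if tag(entry["network"], entry["variable"]) == concept
    ]


streamflow_entries = find_series(CATALOG, "Streamflow")
```

With the mapping in place, a single query for a concept returns series from several networks even though each network names the variable differently.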
HIS Scalability
• Adding…
– …data types and datasets; processing models and services; servers; users and roles
– …shall not create unmanageable bottlenecks that require system re-engineering
• Designing for scalability:
– Distilling a generic set of web service signatures; resolving semantic and structural heterogeneities
– Using ODM as a common generic format for time series data, for ease of coding and uniform search interfaces
– DASH GUI design to abstract specifics of disparate repositories
– Leveraging common CI components developed in GEON
– Working with agencies to remove web service bottlenecks
• 11 WATERS Network test bed projects
• 16 ODM instances (some test beds have more than one ODM instance)
• Data from 1246 sites; of these, 167 sites are operated by WATERS investigators
National Hydrologic Information Server
San Diego Supercomputer Center
HIS Deployment
Water Quality in Moreton Bay, Brisbane, Australia (Jane Hunter)
National Water Metadata Catalog
Synthesis and communication of the nation’s water data http://his.cuahsi.org
Hydroseek WaterML
Government Water Data
Academic Water Data
Accomplishments
• Generic method for managing and publishing observational data
– Supports many types of point observational data
– Overcomes syntactic and semantic heterogeneity using a standard data model and controlled vocabularies
– Supports a national network of observatory test beds but can grow!
• WaterML is a common language for water observations data from academic and government sources
• Point Observations Data from Agencies and Academic Investigators can be consistently communicated using web services
• Point Observations Data can be archived in a relational database
• National Water Metadata Catalog is the most comprehensive index of the nation’s water observations presently existing
HIS Overview Report
• Summarizes the conceptual framework, methodology, and application tools for HIS version 1.1
• Shows how to develop and publish a CUAHSI Water Data Service
• Available at:
http://his.cuahsi.org/documents/HISOverview.pdf