nsf marine geoscience databases · 2016-08-05 · dms oversight committee - july 19, 2004 2...
TRANSCRIPT
NSF Marine Geoscience Databases
Morning
• Overview of integrated projects - Ridge2000, Margins DB, SeismicDMS Field DB, RidgeMBS, AntarcticMBS - one database
Afternoon
• PetDB, SeismicDMS - separate databases
• Discussion - Focus on interop, visualization, and community contribution
Focus of Projects
Ridge2000 and Margins
• Provide access to basic cruise metadata, data, and sample inventories for all R2K/Margins expeditions.
• Diverse data types - geological, fluid, biological, rock, and sediment samples; seismic, temperature, sonar, photo imagery
• Multi-resolution and time-series
Focus (cont.)
AntarcticMBS
• Southern Ocean bathymetry, primarily from the NB Palmer - ~10-year legacy
• As of 2002, no designated archive; data resided with PIs and data techs
Focus (cont.)
SeismicDMS Field DB
• Central database for MCS field data from the Ewing
• Rescue older seismic data in Lamont holdings
RidgeMBS
• Multibeam bathymetry from mid-ocean ridges - shaded relief images, grids, ping files
Guiding principles
• Need to handle large and diverse datasets
• Need to serve a diverse user community - nonspecialist and specialist access
• Global coverage
DB design needs
• Relational database
• Standardize basic cruise metadata collection
• Easy to use tools to access data
• Capability for visual exploration
• Link to existing repositories to enhance content
Implementation
• Common Cruise Metadata Catalog - SuzanneO/Bob
• Metadata Forms - Dale
• Server and client side tools for access - SuzanneO/Bob/Bill
  Data Link, GeoMapApp, Create Maps & Grids
Project initiation
• Ridge2000, Margins, SeismicReflectionFieldDS - Sept/Oct 2003
• AntarcticMBS - Jan 2003
• RidgeMBS - 1993 with renewal in 2000
Status for past 18 months
• Metadata forms constructed and distributed - culture of data collection
• Developed DB schema and built RDB - query capability needed
• Populated with bathymetry, mag and grav, navigation and cruise metadata, proprietary holds until PI release
Status (cont.)
• Assembled multi-resolution global gridded topography
• All tool development since Jan 2003
• Began developing links with other DBs - PetDB and UTIG seismic
• New Web site
Outreach - past year
• Meetings: Interoperability I and II, Marine Metadata workshop, Cyberinfrastructure for geochemistry, Cyberinfrastructure for the Geosciences, Sediment DB workshop, AGU, R2K Community Workshop, R2K and Margins STCOM
• R2K Events - RMBS report, AMBS announcement, Margins Newsletter
Outreach - upcoming
• Ridge2000 Teachers Workshop - Kim
• SCAR - plans for IBCSO
• GTF, AGU Town Hall
Upcoming year
• Continue adding content - historical and recently collected data
• Community Data Products
• Enhance cruise metadata catalog (beyond R2K and Margins)
Upcoming year (cont.)
• Develop links to other national databases to access other data types (e.g. SIO, NGDC, IRIS)
• Proposals/new projects - IBCSO?, ITR-Global gridded bathymetry, Marine Metadata
Data Collection
• Researching what data was collected during the event
• Identifying point of contact for each data set
• Collecting all levels of data
– Field data
– Processed data
– Metadata (hardest to collect after a cruise)
• Evaluating and cleaning data
• Inserting data into database
• Educating community about how to document their data for the best future use
RVIB NB Palmer Data Collection Status
Data holdings
Antarctic
• 80% of all multibeam acquired on the Palmer
• Trackline geophysical data for the Gould and Palmer
• Ice-penetrating radar from Lake Vostok
• Single channel seismic data from Southern Ocean NGDC-LDEO rescue project
Data holdings
RidgeMBS
• Bathymetry from almost half the global MOR system
• Doubled data holdings in past 3 years
Data holdings
Ridge2000
• Bathymetry from all study sites
• Cruise metadata for 2004 cruises retrieved from PIs and partially ingested
• Historical data from 9N and Lau data sites partially migrated
Data holdings
Margins
• Land topography data from all Focus Sites (SRTM)
• Publicly available bathymetry from Focus Sites (primarily older swath data from NGDC)
• Basic cruise metadata for Ewing cruises within FS
[Figure: OBJECTS and DATA VOLUME by project (RODES, RMBS, MARGINS, LDEO, AMBS); axis from 0 to 50000]
total # of cruises = 1149
total track distance = 8239760 km = 4449115 nm
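The track-distance conversion can be checked with a short sketch. It assumes only the international definition of the nautical mile (1 nm = 1.852 km exactly); the function and variable names are illustrative. With that factor the slide's two figures agree to within about one nautical mile.

```python
# Exact by international definition: 1 nautical mile = 1.852 km.
KM_PER_NAUTICAL_MILE = 1.852


def km_to_nautical_miles(km: float) -> float:
    """Convert a distance in kilometres to nautical miles."""
    return km / KM_PER_NAUTICAL_MILE


track_km = 8_239_760          # total track distance quoted on the slide
slide_nm = 4_449_115          # nautical-mile figure quoted on the slide
track_nm = km_to_nautical_miles(track_km)

# Difference between the computed value and the slide's value, in nm.
difference_nm = abs(track_nm - slide_nm)
print(f"{track_km} km is about {track_nm:,.1f} nm (slide differs by {difference_nm:.2f} nm)")
```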
Storing Data locally versus Accessing it from Remote Locations
• Security ( Controlling proprietary data, Remote database access )
• Consistency ( Stability of websites, database tables and exchange information )
• Redundancy ( Multiple copies of data in various databases )
• Interoperability ( Levels for Accessing Remote Data )
– HTTP Linking
– Exchange of XML Meta Data
– Exchange of data via Web Coverage and Feature Services
Data Holdings
Data Access Related Topics
• Usage Statistics for the DMS Website
• Recognizing development partners
• Providing credit to data contributors
DMS Oversight Committee - July 19, 2004 1
TECH INFRASTRUCTURE
• UNIFIED HARDWARE/SOFTWARE BACK-END
• UNIFIED DATABASE SCHEMA
• LEVERAGE EXISTING TECHNOLOGY
METADATA: PostgreSQL™ relational database
DATA OBJECTS: RedHat™ Linux local filesystem
DEEP ARCHIVE: StorageTek Powderhorn™ Mass Store
WEB INTERFACE: Apache™ HTTP server + PHP scripting
DATABASE SCHEMA
ENTRY (cruise / flight / traverse) -> DIVE (daughter platform deployment) -> LINE (survey transect) -> STATION (sampling location) -> ADO (arbitrary digital object)
ALIAS (entry ID at another repository)
INSTITUTION (funding agency / academic institution / operating center)
PLATFORM (survey platform)
ATTACHMENT (related documents)
PARAMETER (data types collected; maps 1 ADO to N parameters)
INITIATIVE (funding umbrella)
SENSOR
SENSOR_HISTORY (deployment / service / (re)configuration / recovery)
PERSONNEL (science / technical)
AWARD (funding instrument)
LOCATION (place name / physiographic feature / survey target)
REFERENCE (related publications)
HISTORY (internal tracking)
ADO (arbitrary digital object): unique ID, version, file name, URL, size, format, access control list, quality, sensor ID, award ID
ADO_[TYPE] (complex data type) -> inherits from ADO
(Arko + O’Hara)
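The hierarchy on this slide can be sketched in code. What follows is only an illustration in Python dataclasses, not the PostgreSQL schema itself: the class and field names are adapted from the slide, while the field types, the `GridADO` subtype, and the example values are assumptions made for the sketch.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional


@dataclass
class ADO:
    """Arbitrary digital object; fields follow the attribute list on the slide."""
    unique_id: str
    version: int
    file_name: str
    url: str
    size: int                      # unit (bytes) is an assumption
    format: str
    access_control_list: List[str] = field(default_factory=list)
    quality: Optional[str] = None
    sensor_id: Optional[str] = None
    award_id: Optional[str] = None
    parameters: List[str] = field(default_factory=list)  # PARAMETER: 1 ADO to N parameters


@dataclass
class GridADO(ADO):
    """Hypothetical ADO_[TYPE]: a complex data type that inherits from ADO."""
    cell_size_m: Optional[float] = None


@dataclass
class Station:
    name: str
    ados: List[ADO] = field(default_factory=list)


@dataclass
class Line:
    name: str
    stations: List[Station] = field(default_factory=list)


@dataclass
class Dive:
    name: str
    lines: List[Line] = field(default_factory=list)


@dataclass
class Entry:
    """Cruise, flight, or traverse at the top of the hierarchy."""
    entry_id: str
    platform: str
    dives: List[Dive] = field(default_factory=list)

    def all_ados(self) -> Iterator[ADO]:
        """Walk ENTRY -> DIVE -> LINE -> STATION -> ADO."""
        for dive in self.dives:
            for line in dive.lines:
                for station in line.stations:
                    yield from station.ados


# Hypothetical example values, for illustration only.
grid = GridADO("ado-1", 1, "grid.grd", "http://example.org/grid.grd", 1024, "grd",
               parameters=["bathymetry"], cell_size_m=100.0)
entry = Entry("EX0001", "ExampleShip",
              dives=[Dive("d1", [Line("l1", [Station("s1", [grid])])])])
ado_ids = [ado.unique_id for ado in entry.all_ados()]
```

In a relational implementation each class would correspond to a table with foreign keys to its parent; the nesting here only mirrors the arrows on the slide.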
DATA LINK
UNIFIED WEB-BASED SEARCH INTERFACE FOR ENTIRE REPOSITORY.
SEARCH OPTIONS:
• KEY FIELDS (platform, operator, investigator, etc)
• DATA TYPES
• SPATIAL (study/focus site, lat-lon region)
• TEMPORAL
http://www.marine-geo.org/link/
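How the four search options combine can be sketched as an in-memory filter. This is not the Data Link implementation or its URL interface, neither of which is described in the slides; the record fields, function name, and sample records are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import FrozenSet, Iterable, List, Optional, Tuple


@dataclass(frozen=True)
class CruiseRecord:
    """Hypothetical flattened view of one cruise for searching."""
    entry_id: str
    platform: str
    data_types: FrozenSet[str]
    lat: float
    lon: float
    start: date
    end: date


def search(records: Iterable[CruiseRecord],
           platform: Optional[str] = None,
           data_type: Optional[str] = None,
           bbox: Optional[Tuple[float, float, float, float]] = None,
           date_range: Optional[Tuple[date, date]] = None) -> List[CruiseRecord]:
    """Return records matching every criterion that was supplied.

    bbox is (south, west, north, east) in degrees; this simple test does not
    handle regions that cross the antimeridian.
    """
    hits = []
    for r in records:
        if platform is not None and r.platform != platform:
            continue                                   # key field
        if data_type is not None and data_type not in r.data_types:
            continue                                   # data type
        if bbox is not None:
            south, west, north, east = bbox
            if not (south <= r.lat <= north and west <= r.lon <= east):
                continue                               # spatial
        if date_range is not None:
            first, last = date_range
            if r.end < first or r.start > last:
                continue                               # temporal overlap
        hits.append(r)
    return hits


# Hypothetical records, for illustration only.
records = [
    CruiseRecord("EX0001", "ShipA", frozenset({"bathymetry", "seismic"}),
                 9.8, -104.3, date(2002, 7, 1), date(2002, 8, 1)),
    CruiseRecord("EX0002", "ShipB", frozenset({"bathymetry"}),
                 -65.0, -60.0, date(2003, 1, 5), date(2003, 2, 20)),
]
seismic_hits = search(records, data_type="seismic", bbox=(0.0, -110.0, 20.0, -100.0))
hit_ids = [r.entry_id for r in seismic_hits]
```

Leaving a criterion as `None` means "no constraint", which matches a search form where each option can be left blank.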
EXAMPLE #1: SEARCH
EXAMPLE #1: RESULT
EXAMPLE #1: DETAILS
EXAMPLE #1: INTEROP
EXAMPLE #2: SEARCH
EXAMPLE #2: RESULT
EXAMPLE #2: DETAILS
EXAMPLE #2: INTEROP
NEXT STEPS
• EXTEND DATABASE SCHEMA
(data types, access control)
• EXTEND SEARCH OPTIONS
(physiographic features, free text)
• “SCALE UP” BACK-END
(redundant hardware + software)
• ADD DIRECTORY SERVER
• EXTEND INTEROPERABILITY
-> new collaborative proposal: “Marine Metadata Interoperability”
• REVISE INVENTORY FORMS
July 19, 2004 Oversight Committee Kerstin Lehnert, LDEO
A Ridge Petrological Relational Database Served over the World Wide Web
C. Langmuir (Harvard)
K. Lehnert (LDEO)
Funding Status
1st Grant Period: Sept 1996 to Aug 2001
– 1997: Start of actual development
– 1999: PetDB on-line
– 2001: Data entry up to date; move to CIESIN
2nd Grant Period: Sept 2002 to Aug 2007
The Lamont Team
LDEO (Lamont-Doherty Earth Observatory): K. Lehnert
CIESIN (Center for International Earth Science Information Network): C. Lenhardt, S. Vinayagamoorthy, B. Diapic, A. Gerard, N. Celo
• contains ALL ‘raw’ geochemical data (major & trace elements, radiogenic & stable isotopes, U-series, noble gases …)
• for rocks generated at mid-ocean ridges incl. BAB, seamounts, old oceanic crust (volcanic, plutonic, & mantle rocks)
• measured on rocks (wr & glass), minerals, & melt inclusions
740,000 chemical values for >31,000 samples from 1,050 cruises/dives
Earliest samples from: R/V Mabahiss, John Murray Expedition 1936
Objective
Maximize the application of the geochemical data set
Data is dispersed in literature/on web, often not in electronic form
Compilations by investigators are time-consuming, redundant, often incomplete
Missing links among related data
Data is lost due to incomplete publication
Data is not properly documented
Concept
• Create an integrated dataset of all ‘raw’ geochemical data
• Include all essential metadata about the samples and analytical procedure for searching and data evaluation
• Build a web-based interactive user interface that allows extraction of any subset of the data and facilitates data analysis
Data Cataloging vs. Data Integration
The Data Model
1991
The Data Model
1997
The Data Model
LEHNERT et al.: A Global Geochemical Database Structure for rocks. G3, 2000
• Relational
• Data fully integrated
• Accommodates all essential metadata
• Generally applicable for sample-based petrological and chemical data for rocks
• Each value linked to original publication or producer
Metadata
Application Metadata
• Unit
• Method
• Lab
• Precision
• Standard deviation
• Normalization
• Averaged or individual values
• Standard values
• Material (GL, WR, etc.)
• Mineral metadata: mineral type, grain type, generation, …
• Inclusion metadata: host mineral, heating, …
Cataloguing Metadata
• Rock type
• Location (lat/long)
• Precision of location
• Depth
• Location names
• Sampling technique
• Expedition
• Sampling date
• Tectonic setting
• Age / Eruption date
• Rock class
• Texture
• Alteration
• Modal composition
• Archive
• Authors
• Title
• Journal
• Volume Number
• Publication Year
Color key on the slide: Red = mandatory metadata, White = essential metadata, Blue = desirable metadata
Data Model
Current Applications:
• PetDB
• GEOROC
• NAVDAT
• PaleoStrat
Planned Applications:
• ‘MetPetDB’ (Spear)
• Experimental Petrology database (Ghiorso, Hirschmann, Grove)
Recommended for:
• SedDB (Sediment Geochemistry Workshop, JOI, Washington DC, June 2004)
Data Entry
• Collect data
• Collect metadata
• Identify samples
• Transfer data into electronic form
• Extract metadata from papers/solicit from authors
• Establish relations among data
• Load data into database
• Check data integrity
• Link to existing information
Data Entry
• Missing metadata
• No unique sample identification
• Missing standards for data presentation (e.g. units)
• Unavailable data files (supplements)
• Errors in original data tables
• Lack of cooperation from authors
Data Entry Procedure
ROCK DATA: ENTER OXIDES/ELEMENTS/ISOTOPE RATIOS ANALYZE
Form sections: SAMPLE DATA, ANALYSES, CHEMISTRY DATA
Column notes:
• SAMPLE NAME: as used in "SAMPLES"
• NUMBER OF REPLICATES: blank in all rows shown
• CALC_AVGE (if average): Y = can be averaged, N = can’t be, A = it is an average
• MATERIAL: G = GLASS, WR = WHOLE ROCK
• METHOD_CODE (as used in "METHODS"): 1 for SiO2, TiO2, Al2O3, FeOT
• UNIT: WT% for SiO2, TiO2, Al2O3, FeOT

NO.  TAB_IN_REF  SAMPLE NAME  CALC_AVGE  MATERIAL  SiO2   TiO2  Al2O3  FeOT
1    1           D1-2C        N          GL        50.98  1.79  15.17  9.55
2    1           D7           N          GL        50.38  1.73  14.92  10.39
3    1           D4           N          GL        51.08  1.6   15.14  9.06
4    1           D6-1A        N          GL        50.83  1.52  15.56  8.7
5    1           D6-1B        N          GL        50.8   1.5   15.55  8.7
6    1           D5-1D        N          GL        50.88  1.72  15.55  9.19
7    1           D5-1E        N          GL        50.92  1.72  15.56  9.2
8    1           D3-1G        N          GL        50.92  1.48  15.66  8.67
9    1           D3-1G        N          WR        50.82  1.47  15.46  8.74
10   1           D3-1I        N          GL        50.69  1.52  15.75  8.64
11   1           D3-1I        N          WR        50.91  1.47  15.35  8.64
12   1           D3-1J        N          GL        50.84  1.48  15.63  8.69
13   1           D1-1K        N          GL        50.95  1.78  15.37  9.48
14   1           2568-1053    N          WR        52.19  1.76  15.85  8.27
15   1           2568-1152    N          WR        49.73  1.4   16.43  8.2
16   1           2568-1343    N          WR        50.79  1.11  15.82  7.81
17   1           2568-1410    N          WR        50.67  1.36  15.85  8.27
18   1           2568-1428    N          WR        49.96  1.68  18.23  8.78
19   1           2571-1109    N          WR        50.35  1.34  16.65  7.9
20   1           2571-1114    N          WR        51.03  1.78  14.96  9.37
21   1           2571-1152    N          WR        50.9   1.68  15.1   9.9
22   1           2571-1252    N          WR        50.76  1.41  15.84  8.09
23   1           2572-1015    N          GL        51.03  1.62  15.76  8.82
24   1           2572-1020    N          WR        50.49  1.55  15.67  9.43
25   1           2572-1033    N          GL        50.77  1.52  15.74  8.67
26   1           2572-1132    N          WR        50.77  1.46  16.25  8.16
27   1           2572-1151    N          GL        50.92  1.77  15.26  9.49
28   1           2572-1200    N          WR        50.9   1.53  15.55  8.58
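Rows of the data entry form can be read into records along the following lines. This is a minimal sketch, not the PetDB loading scripts: it assumes whitespace-separated rows with the NUMBER OF REPLICATES column left blank, and it copies four rows from the slide; the function and key names are illustrative.

```python
from statistics import mean

# Four rows copied from the data entry form on the slide (rows 1, 9, 14, 23).
ROWS = """\
1 1 D1-2C N GL 50.98 1.79 15.17 9.55
9 1 D3-1G N WR 50.82 1.47 15.46 8.74
14 1 2568-1053 N WR 52.19 1.76 15.85 8.27
23 1 2572-1015 N GL 51.03 1.62 15.76 8.82
"""

OXIDES = ("SiO2", "TiO2", "Al2O3", "FeOT")  # all reported in WT%


def parse_row(line: str) -> dict:
    """Split one form row into sample fields and oxide values."""
    no, tab_in_ref, sample, calc_avge, material, *values = line.split()
    if len(values) != len(OXIDES):
        raise ValueError(f"expected {len(OXIDES)} oxide values, got {len(values)}")
    record = {
        "no": int(no),
        "tab_in_ref": int(tab_in_ref),
        "sample": sample,
        "calc_avge": calc_avge,   # Y, N, or A
        "material": material,     # GL or WR in the rows shown
    }
    record.update({oxide: float(v) for oxide, v in zip(OXIDES, values)})
    return record


records = [parse_row(line) for line in ROWS.splitlines() if line.strip()]

# Example query: mean SiO2 of the glass analyses among the parsed rows.
glass_sio2 = mean(r["SiO2"] for r in records if r["material"] == "GL")
```

Keeping `material` and `calc_avge` as separate fields is what later lets a query separate glass from whole-rock analyses of the same sample, such as D3-1G.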
User Interface
• Dynamic, interactive, web-based
• Allow user to select, filter, view, download customized data sets
• Allow user to explore metadata
Data Integration
Precompiled Data
Sample Identification
101 samples in PetDB are sample #1 from cruise or dive station #5: “5-1”
Different names for sample ARGAMPH-001 (Dredge 1 from the cruise AMPHITRITE on R/V Argo):
The Terrestrial World
• 60 samples from Hawaii are named “1”
• the name “ML19” was used for samples from
– Malaita (Solomon Islands)
– Mauna Kea (Hawaii)
– Mauna Loa (Hawaii)
– Mount Jefferson (Cascades)
– Medicine Lake (Cascades)
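The naming problem on this slide can be made concrete with a short sketch. The five "ML19" localities are taken from the slide; the grouping code and the registry identifier format (`HYP` plus a counter) are purely hypothetical and do not represent any real sample-numbering syntax.

```python
from collections import defaultdict

# The name "ML19" as used for samples from five different localities (from the slide).
samples = [
    ("ML19", "Malaita (Solomon Islands)"),
    ("ML19", "Mauna Kea (Hawaii)"),
    ("ML19", "Mauna Loa (Hawaii)"),
    ("ML19", "Mount Jefferson (Cascades)"),
    ("ML19", "Medicine Lake (Cascades)"),
]

# Group by sample name alone: any name with more than one locality is ambiguous.
by_name = defaultdict(list)
for name, locality in samples:
    by_name[name].append(locality)
ambiguous = {name: locs for name, locs in by_name.items() if len(locs) > 1}


def assign_ids(items, prefix="HYP"):
    """Give every sample its own registry identifier (hypothetical format)."""
    return {f"{prefix}{i:06d}": item for i, item in enumerate(items, start=1)}


registry = assign_ids(samples)
```

Looking a sample up by its registry identifier returns exactly one (name, locality) pair, whereas looking it up by name returns five candidates.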
Unique Sample Identification
IGSN: International Geo Sample Number
SESAR: Solid Earth Sample Registry
Establish a system that provides unique identifiers for solid earth samples to allow global sharing, linking, and integration of information and data about these samples.
NSF SGER Award July 2004
K. Lehnert, S. Goldstein, C. Lenhardt, S. Vinayagamoorthy
Utility of PetDB
• Plank, T.: Constraints from Thorium/Lanthanum on Sediment Recycling at Subduction Zones and the Evolution of the Continents. Journal of Petrology, 2004 (in press).
• V. Salters & A. Stracke: Composition of the depleted mantle. G3, 2004
• Kellogg, J. B., Jacobsen, S. B., O’Connell, R. J.: Modeling the distribution of isotopic ratios in geochemical reservoirs, Earth Planet. Sci. Letters 217, 2004.
• S. Weyer et al.: Nb/Ta, Zr/Hf and REE in the depleted mantle: implications for the differentiation history of the crust-mantle system. EPSL 205, 2003
• M. Hirschmann et al.: Alkalic magmas generated by partial melting of garnet pyroxenite. Geology 31, 2003
• P. van Keken et al.: Mantle Mixing: The generation, preservation, and destruction of chemical heterogeneity. Annual Reviews of Earth & Plan Sci 30, 2002
• A. Saal et al.: Vapour undersaturation in primitive mid-ocean-ridge basalt and the volatile content of Earth’s upper mantle. Nature 419, 2002
… and another >40 papers that cite PetDB
Utility of PetDB
V. Salters & A. Stracke: Composition of the depleted mantle. G3, 2004
“To achieve an average isotopic composition of the DM, we used the PetDB database (http://petdb.ldeo.columbia.edu/petdb/query.asp). We compiled all eruptive products from mid-ocean spreading centers and filtered the data to include only samples that were erupted in water depths in excess of 2000 m and excluded samples that had more than 55 wt.% SiO2.”
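The filter described in the quotation can be sketched as follows. This is an illustration of the two stated criteria only (eruption depth greater than 2000 m, SiO2 no more than 55 wt.%), not the authors' actual workflow or a PetDB query; the field names and the sample values are hypothetical.

```python
def select_samples(samples, min_depth_m=2000.0, max_sio2_wt=55.0):
    """Keep samples erupted deeper than min_depth_m with SiO2 <= max_sio2_wt."""
    return [
        s for s in samples
        if s["depth_m"] > min_depth_m and s["SiO2"] <= max_sio2_wt
    ]


# Hypothetical samples, for illustration only.
samples = [
    {"name": "A", "depth_m": 2500.0, "SiO2": 50.9},   # kept
    {"name": "B", "depth_m": 1800.0, "SiO2": 50.5},   # too shallow
    {"name": "C", "depth_m": 3100.0, "SiO2": 57.2},   # too silica-rich
]
kept_names = [s["name"] for s in select_samples(samples)]
```

Both thresholds are parameters so the same function can reproduce other published selections that use different cut-offs.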
User Community
[Figure: time series from 08/00 to 11/03 (ticks at 08/00, 12/00, 12/01, 12/02, 11/03); vertical axis 0 to 2500]
Report Range: 05/01/2004 00:00:00 - 06/01/2004 00:00:00
Generated On Monday July 19 2004 - 11:23:03
Hits
– Entire Site (Successful): 30558
– Average Per Day: 985
– Home Page: 3048
Page Views
– Page Views (Impressions): 13488
– Average Per Day: 435
– Document Views: 11464
Visits
– Visits: 6205
– Average Per Day: 200
– Average Visit Length: 00:09:17
– Median Visit Length: 00:02:49
– International Visits: 20.01%
– Visits of Unknown Origin: 36.09%
– Visits from United States: 43.88%
Visitors
– Unique Visitors: 4697
– Visitors Who Visited Once: 4297
– Visitors Who Visited More Than Once: 400
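The per-day averages in the report can be reproduced with a few lines of arithmetic. The sketch assumes the report range covers 31 days (May 1 to June 1, 2004) and that the reporting tool truncates rather than rounds, which is an inference from the numbers and not something the report states.

```python
from datetime import date

# Report range from the usage statistics: 05/01/2004 to 06/01/2004.
start, end = date(2004, 5, 1), date(2004, 6, 1)
days = (end - start).days

totals = {"hits": 30558, "page_views": 13488, "visits": 6205}

# Integer division truncates, e.g. 30558 / 31 is about 985.7, reported as 985.
per_day = {name: total // days for name, total in totals.items()}

# Visitor counts should add up: once-only plus repeat visitors.
unique_visitors = 4297 + 400
```

The three truncated averages match the reported 985, 435, and 200, and the visitor counts sum to the reported 4697 unique visitors.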
Utility
Education
By Austin J. Roelofs, Graduate Student, Department of Geological Sciences, The University of North Carolina at Chapel Hill
Achievements 2002/2003
I. Transfer to ORACLE/Solaris (completed)
II. Schema revision / data clean-up (ongoing)
III. Web-based administration site (started)
IV. New JSP web interface on-line (beta site completed, testing ongoing)
V. Data preparation/entry (ongoing)
VI. Interoperability with other data systems (ongoing)
VII. Integration into the broader Geoscience Cyberinfrastructure (ongoing)
I. Transfer to ORACLE
PetDB’s Operational Architecture: Until August 2002
Platform: Windows NT
• xls spreadsheet (data entry form)
• VBScripts: loading, validation
• Entry DB (ACCESS)
• VBScripts: denormalization, compilation
• Web DB (ACCESS)
• Web Server (ASP): user queries, dynamic page content
• Web client
I. Transfer to ORACLE
PetDB’s Operational Architecture: August 2002 - August 2003
Platforms: Solaris, Windows NT
• xls spreadsheet (data entry form)
• VBScripts: loading, validation (marked “////” on the slide)
• PL/SQL: loading, validation
• Entry DB (ORACLE)
• VBScripts: denormalization, compilation
• Web DB (ORACLE)
• Web Server (ASP): user queries, dynamic page content
• Web client
Dates marked on the diagram: August 2002, June 2003
I. Transfer to ORACLE
PetDB’s Operational Architecture: October 2003
Platform: Solaris
• xls spreadsheet (data entry form)
• PL/SQL: loading, validation
• Entry DB (ORACLE)
• Web Server (Java SP): user queries, dynamic page content
• Web client
Beta Site public: March 2004
II. Schema Revision
• Better enforcement of key constraints
• “Constrained lists” (constrained vocabulary)
• Version coding
• Data clean-up
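A constrained list can be sketched as a set of allowed codes checked at load time. This is only an illustration of the idea, not the PetDB schema: the codes GL, WR and Y, N, A are those that appear on the data entry form, while the dictionary layout and function name are assumptions.

```python
# Allowed codes per field, taken from the values shown on the data entry form.
CONSTRAINED_LISTS = {
    "material": {"GL", "WR"},        # glass, whole rock
    "calc_avge": {"Y", "N", "A"},    # can be averaged, can't be, is an average
}


def validate(record: dict) -> list:
    """Return a list of messages for values outside the constrained lists."""
    errors = []
    for field_name, allowed in CONSTRAINED_LISTS.items():
        value = record.get(field_name)
        if value not in allowed:
            errors.append(f"{field_name}={value!r} not in {sorted(allowed)}")
    return errors


good = {"sample": "D1-2C", "material": "GL", "calc_avge": "N"}
bad = {"sample": "D7", "material": "glass", "calc_avge": "N"}   # free-text material
good_errors = validate(good)
bad_errors = validate(bad)
```

In a relational database the same rule is usually expressed as a lookup table with a foreign key, so the check is enforced by the database rather than by loading code.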
III. Web-based Admin Site
IV. New JSP Web Interface
Phase 1:
• No size limits for query results
• Improved performance
• Consolidated queries (fewer pages/clicks)
• Pop-up windows for setting criteria
• Criteria of current query always visible
• Ability to save queries (for a session)
• Query by data availability
• New look & feel
1. Beta site released for internal testing by Oct 15, 2003
2. Beta site released for general public April 1, 2004
V. Data Entry
Data entry into EXCEL forms
– Ca. 90% of literature since 2001 done
Data loading
– Ca. 30% of data entry forms loaded
– Improved data quality, but more error corrections
New data available through new interface
Delays caused by
– Change in architecture
– Personnel
– Metadata collection
VI. Interoperability
VII. Integration with Geoscience CI
• Geochemical Cycles Workshop, San Antonio, June 2004
• Sediment Geochemistry Workshop (organizer), Washington DC, June 2004
• Sample Curation Workshop, Houston, May 2004
• Metamorphic Petrology Database, Montreal, May 2004
• Forum on Sediment Geology, Dallas, April 2004
• Annual NAVDAT Workshop, Boulder, February 2004
• Interoperability Workshop, San Diego, December 2003
• EarthChem exhibit AGU, San Francisco, December 2003
• ISES-CI Forum, Seattle, November 2003
• CI for Solid Earth Geochemistry Workshop (co-organizer), Washington DC, October 2003
• ISES-CI Workshop, Lawrence KS, March 2003
VII. Integration with Geoscience CI
D. Walker, R. Carlson, A. Glazner, L. Farmer, R. Black, T. Bowers
Kansas University, Kansas GS
B. Sarbas, U. Nohl, A. W. Hofmann
MPI fuer Chemie, Mainz, Germany
EarthChem
VII. Integration with Geoscience CI
EarthChem
A Consortia for On-Line Access to Geologic, Geochemical and Geochronologic Data
K. Lehnert, R. Carlson, A. Hofmann
• Create an Integrated Information System for Solid Earth Geochemistry
• Take advantage of the synergies among the individual projects
EarthChem
Benefits for the community:
• Easier access to the data
• Common standards
• Generally applicable tools
• Improved systems
EarthChem Challenges
Technology, Implementation, Culture Change
• Update and maintain content of databases
• Improve functionality of interfaces
• Improve database capabilities and interoperability
• Ensure long-term stability of data systems
EarthChem: Tools
EarthChem Proposal
“Facility Support - EarthChem and Construction of a Cyber Infrastructure for Solid Earth Geochemistry”
K. A. Lehnert, D. W. Walker, A. Grunow, D. Elliot
Submitted to NSF/EAR-IF July 19, 2004
EarthChem: Goals
Each goal is followed by its proposed activities.
Interoperability: Ensure that users can retrieve and integrate data from distributed sources
• ‘One-stop-shop’ for geochemical data (web services: SOAP/XML/WSDL, OAI, OGC)
• Standardize metadata (ISO19115, OGC-GML)
• Systematize nomenclature & vocabulary
• Implement unique sample identification
Interfaces: Optimize interaction with data for users at all levels of expertise
• Map-based integration of geologic, geodetic, and geophysical data, map-based queries
• Tools: interactive discriminant plots, P/T calculators, link to MELTS
Content: Expand data coverage in time, space, and tectonic setting
• Ophiolites, mantle xenoliths, orogenic peridotites
• Precambrian mantle-derived igneous rocks
• Antarctic (Byrd Polar Research Center)
• Geochronology
Data Entry: Make population of databases more efficient
• Develop interactive data submission capability
• Facilitate incorporation of data compilations
Next Steps
Phase 2 of User Interface Improvements
– User registration
– Store queries (for registered users only)
– Pre-set queries (instead of pre-compiled data sets)
Implementation of Harvard products
Continue data loading
Improve data entry process
– Extend functionality of web admin site
– Build data submission capabilities for users
Interoperability
– Improve interoperability with Marine Geoscience DMS
– Build new links: SIOExplorer, NGDC Marine Sample Catalog, Janus/ODP, EarthChem, SedDB
LDEO Component
• Establish Field Archive for MCS
• Contribute LDEO historical holdings to UTIG Seismic Processed Data Center
Field Archive Plan
• Establish procedures for routine archiving of all future MCS data
• Incorporate older Ewing field data at LDEO (14) & 5 high res
• Incorporate older field data from Conrad on an as requested/can do basis (digital streamer).
Field Archive (cont.)
• Plan to serve basic cruise metadata immediately
• Option for 2 yr proprietary hold on seismic metadata (shot nav, seismic line logs, acquisition parameters)
• Field data held with proprietary hold until PI release
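The three access rules above can be sketched in a few lines of Python. This is only an illustration, not the archive's actual logic: the `CruiseHoldings` record, the `public_items` function, the cruise identifier, and the dates are all hypothetical, and the choice to start the 2 yr clock at the cruise end date is an assumption (the slide does not say when the hold begins).

```python
from dataclasses import dataclass
from datetime import date

HOLD_YEARS = 2  # the optional hold on seismic metadata named on the slide


@dataclass
class CruiseHoldings:
    cruise_id: str                 # hypothetical identifier
    end_date: date                 # ASSUMPTION: hold clock starts at cruise end
    metadata_hold_requested: bool  # PI chose the optional 2 yr hold
    pi_released_field_data: bool   # field data open only after PI release


def add_years(d: date, years: int) -> date:
    """Shift a date by whole years, mapping 29 Feb to 28 Feb if needed."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        return d.replace(year=d.year + years, day=28)


def public_items(c: CruiseHoldings, today: date) -> dict:
    """Return which classes of holdings are openly served on a given day."""
    hold_active = (
        c.metadata_hold_requested
        and today < add_years(c.end_date, HOLD_YEARS)
    )
    return {
        "basic_cruise_metadata": True,           # served immediately
        "seismic_metadata": not hold_active,     # shot nav, line logs, acquisition parameters
        "field_data": c.pi_released_field_data,  # held until PI release
    }


# Made-up cruise, for illustration only.
example = CruiseHoldings("XX0000", date(2003, 10, 1), True, False)
status = public_items(example, date(2004, 7, 19))
```

Keeping the three classes as separate flags mirrors the slide: basic cruise metadata is never gated, seismic metadata is gated by time, and field data is gated by an explicit PI decision.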
Status
• Onboard data copying procedures using SEISNET
• Tested data copy for MCS cruises since fall 2003 (Holbrook - Storegga and Levander - Venezuela)
• Developed tables for seismic acquisition parameters and populated for cruise (EW0207)
Status (cont.)
• Basic cruise metadata for all LDEO cruises currently served
• Incorporating old analog single-channel seismic images (Conrad, Vema, Eltanin) into DB
Processed Data Archive Plan
• Recover processed MCS data (stacked and migrated sections) from LDEO holdings (63 cruises, most on 9-track tape)
• Assemble navigation, seismic metadata for incorporation into UTIG system
• Provide capability to visually browse through GeoMapApp
Status
• Recovered stacks for 36 cruises off 9-track tape
• Seismic metadata, navigation, and stacks assembled and sent to UT for 3 cruises
GeoMapApp
• Single channel seismic images from Atlantic now accessible
• Capability to digitize horizons and save
LDEO Year 2 Plans
• Continue to assemble metadata and organize processed data for incorporation in UT system
• Develop search interface for field archive on acquisition parameters
• Fully incorporate Ewing cruises from 2004
• Begin to incorporate older field data
LDEO Year 2 Plans (cont.)
• UTIG-LDEO interoperability:
– Link from cruise metadata page to UT for all processed data currently in system
– Incorporate unrestricted-access seismic images into GeoMapApp with link to UT for download
– LDAP server for user registration
Marine Geosciences Seismic Data Management System
Access for education and research
Field Data Center (LDEO)
Processed Data Center (UTIG)
16 July 2004
Outline
• How effort developed
• What data is out there
• What our system looks like
• What we have been doing this year
• What we are planning to do next year
• Demo UTIG Data Center
• (Lamont Data Center--Carbotte)
Digital Seismic Reflection Data
• UTIG and LDEO since 1974 (~200 projects)
• Other data reside at:
– USGS (particularly Marine Geology)
– Antarctic Seismic Data Library (~100 projects)
– IRIS (~80 terrestrial and marine)
– Scripps Institution of Oceanography (~45 projects)
– NGDC (~38 projects)
– U.S. scientists (unknown, but perhaps ~100 projects)
• Other countries
• Commercial
• Individual scientists
Acquiring Metadata and Data
• Ownership (Institution and PI; funding source)
• Intellectual investment and future use by PI
• DMS response
– Access restrictions wholly data provider’s decision
– DMS provides citation information linked to data and images
– DMS provides information on downloading to provider to foster connection to data use and collaborations
UTIG Holdings ~122 Cruises
• A concentration in the Gulf of Mexico, Caribbean and eastern Pacific
• ~2000 lines, 180,000 km, four 3-D volumes
• ~5000 processed line segments and versions
• ~1500 active-source OBS stations
• Approximately half is proprietary (mostly Gulf of Mexico)
LDEO Holdings ~63 Cruises
• Widely distributed data sets, including North Atlantic, North Pacific, Mediterranean, eastern Pacific, western Pacific, Chinese and Australian margins
• Approximately 150,000 km
Basic Metadata
• Descriptions of acquisition and processing, cruise reports, personnel and citations
– searchable and suitable for other data mining efforts
• Trace location ASCII files suitable for plotting
• SEG-Y shot and processed data (or links to external archives), with identifying header information, including trace positions and how the positions relate to the navigation (e.g., what offsets were applied?).
Trace Position Issues
Minimum Requirements
• Metadata
– Navigation and small raster image must be open access (viewable and downloadable without restriction)
   • Information relating locations to seismic traces
– Descriptive information about acquisition, geometry, processing sequence, provenance
   • More is better than less, but high variability is common
• Data (if included)
– SEG-Y format (rev. 0)
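As a rough illustration of what a SEG-Y (rev. 0) intake check might look at, the sketch below reads three fields from the 400-byte binary header that follows the 3200-byte text header: sample interval (file bytes 3217-3218), samples per trace (3221-3222), and data sample format code (3225-3226), all big-endian 16-bit integers. The function name `read_binary_header` and the returned dictionary keys are hypothetical. The sketch assumes fixed-length traces and is not the DMS intake code.

```python
import struct

TEXT_HEADER_BYTES = 3200    # EBCDIC card-image text header
BINARY_HEADER_BYTES = 400   # binary reel header
TRACE_HEADER_BYTES = 240    # per-trace header

# Data sample format codes from the original (rev. 0) standard:
# code -> (description, bytes per sample)
REV0_FORMATS = {
    1: ("32-bit IBM floating point", 4),
    2: ("32-bit fixed point", 4),
    3: ("16-bit fixed point", 2),
    4: ("32-bit fixed point with gain code", 4),
}


def read_binary_header(buf: bytes) -> dict:
    """Extract a few binary-header fields from the start of a SEG-Y file."""
    needed = TEXT_HEADER_BYTES + BINARY_HEADER_BYTES
    if len(buf) < needed:
        raise ValueError(f"need at least {needed} bytes, got {len(buf)}")
    bh = buf[TEXT_HEADER_BYTES:needed]
    # Offsets below are relative to the start of the binary header.
    (interval_us,) = struct.unpack(">h", bh[16:18])  # file bytes 3217-3218
    (n_samples,) = struct.unpack(">h", bh[20:22])    # file bytes 3221-3222
    (fmt_code,) = struct.unpack(">h", bh[24:26])     # file bytes 3225-3226
    fmt_name, width = REV0_FORMATS.get(fmt_code, ("unknown", None))
    # Size of one trace on disk, ASSUMING every trace has n_samples samples.
    trace_bytes = None if width is None else TRACE_HEADER_BYTES + n_samples * width
    return {
        "sample_interval_us": interval_us,
        "samples_per_trace": n_samples,
        "format_code": fmt_code,
        "format_name": fmt_name,
        "trace_bytes": trace_bytes,
    }
```

In practice only the first 3600 bytes of a file need to be read for this check, for example `read_binary_header(open(path, "rb").read(3600))` with `path` pointing at a local SEG-Y file.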
Basic Tools
• View, select and download SEG-Y binary data, images, navigation, other metadata using
– map-based search
– query-based search
• Create custom seismic images
• Allow external access to metadata
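A map-based search over the trace location ASCII files could, in its simplest form, be a bounding-box filter. The sketch below assumes a hypothetical file layout of one `line_id lon lat` record per row, with `#` comment lines; the actual layout of the DMS files is not given here. The function names are illustrative, and boxes that cross the antimeridian are not handled.

```python
from typing import List, Tuple

TracePoint = Tuple[str, float, float]  # (line_id, lon, lat)


def parse_trace_locations(text: str) -> List[TracePoint]:
    """Parse whitespace-separated 'line_id lon lat' rows (assumed layout)."""
    points = []
    for raw in text.splitlines():
        row = raw.strip()
        if not row or row.startswith("#"):
            continue  # skip blank and comment lines
        line_id, lon, lat = row.split()[:3]
        points.append((line_id, float(lon), float(lat)))
    return points


def lines_in_box(points: List[TracePoint], west: float, east: float,
                 south: float, north: float) -> List[str]:
    """Return ids of lines with at least one trace inside the box,
    in order of first appearance."""
    hits: List[str] = []
    for line_id, lon, lat in points:
        inside = west <= lon <= east and south <= lat <= north
        if inside and line_id not in hits:
            hits.append(line_id)
    return hits
```

A query-based search would filter the same records on metadata fields instead of coordinates; the two can share the parsing step.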
Processed SDB Usage
• Old UTIG archive activity of the order of one request per month
• From January thru June 2004 (6 months)
– ~30,000 hits and ~1500 unique hosts each month
– 9 new registered users/month
– ~116 files downloaded/month (segy, nav, large gifs)
– ~50% of users are non-U.S.
– download purpose
   • course-work preparation by professors
   • student assignments/classroom
   • data for student research M.S. and Ph.D.
   • academic research
   • sample ‘data’ for visualization, processing development
– commercial users not monitored properly
UTIG Year 1 Progress
• Formalized metadata and “congruency” with Lamont efforts
• Back-filled metadata
• Completed (most) old UTIG data uploading
• Redesigned user interface
– for new metadata
– for new tools
• Moved to SAM-FS/Disk/Tape system
Remaining Data Intake CY2004
• 1983 Moore OBS Gulf of Mexico
• 1990 Ewing 9009 MCS (Lamont)
• 1991 Atlantis II NOBEL MAR
• 1992 Gyre OBS/SCS Gulf of Mexico
• 1995 Ewing Caribbean MCS (Lamont)
• 1996 Longhorn OBS Chicxulub
• 1997 Ewing Iberia MCS/OBS (Rice)*
• 1998 Wecoma stacks Eel River (Lamont)
• 1998 Ewing stacks Lesser Antilles
• 2000 Thompson MCS/OBS Hydrate Ridge
• 2000 Ewing additional stacks Nicaragua
• 2002 Ewing OBS Hydrate Ridge
• 2003 Ewing MCS/OBS Hess Deep
UTIG Year 2 Plans
• LDEO-UTIG interoperability
– explore other major data archive/DB links
• In-take the rest of Lamont processed data
• Establish relationships with other data producers
– using direct appeals to U.S. investigators, MARGINS and RIDGE offices
• Establish links with data users
– EOS article, AGU town meeting
• International opportunities - Japan, France and Germany
Non-technical Interoperability Issues
• Three classes of access restrictions to data
– metadata is completely unrestricted
• Present UT terms and conditions
– Data are provided with the express understanding that they will not be sold or given to third parties. Data are not provided for inclusion with other commercial databases or any web sites without prior written approval of the UTIG Director or cognizant data provider
– Appropriate acknowledgment must be given to primary data gatherers and to the web site
UTIG System
• Personnel: (total time commitment < 12 months)
– Lisa Gahagan (manager, data intake)
– Kevin Johnson (systems)
– Marcy Davis (data intake)
– and me
• Server-side Netscape 7+ and Explorer 5.5+ compatible
• MySQL, Apache, PHP
• Presently limited to single T-1 line for UTIG
– expect higher speed service by 15 Nov 2004
• Back-end is 3-TB RAID disk and 18 TB robotic tape system controlled with SAM-FS (2 copies on tape)
• Offsite (tar) storage of entire library every six months