page 1 object oriented data technology for space science data archiving and retrieval: potential...

48
Page 1 Page 1 Object Oriented Data Technology for Space Science Data Archiving and Retrieval: Potential Applications in Biomedical Research March 21, 2001 Dan Crichton Steve Hughes Jet Propulsion Labs California Institute of Technology National Aeronautics and Space Administration

Upload: milo-manning

Post on 18-Jan-2016

212 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Page 1 Object Oriented Data Technology for Space Science Data Archiving and Retrieval: Potential Applications in Biomedical Research March 21, 2001 Dan

Page 1

Page 1

Object Oriented Data Technology for Space Science Data Archiving and Retrieval: Potential Applications in Biomedical Research

March 21, 2001

Dan CrichtonSteve Hughes

Jet Propulsion LabsCalifornia Institute of Technology

National Aeronautics and Space Administration

Page 2: Page 1 Object Oriented Data Technology for Space Science Data Archiving and Retrieval: Potential Applications in Biomedical Research March 21, 2001 Dan

Page 2

Page 2

JPL is a federally funded research and development center (FFRDC) run by Caltech for NASA

JPL is NASA’s lead center for robotic exploration of the universe

JPL has an enormous amount of data that it needs to manage from scientific data, to engineering, to institutional

We represent several efforts in both the research and enterprise side of JPL that is addressing enterprise architectures for integrating data at both JPL and NASA. Such efforts include

Planetary Data System - NASA Planetary Archive

Enterprise Data Architecture and Applications

JPL IT Data Infrastructure

Knowledge Management - JPL/NASA KM Initiative

Object Oriented Data Technology - I.T. Data System Research

Who are we?

Page 3: Page 1 Object Oriented Data Technology for Space Science Data Archiving and Retrieval: Potential Applications in Biomedical Research March 21, 2001 Dan

Page 3

Page 3

Challenges to Data ManagementArchiving, Search, Retrieval and Integration

Space scientists cannot easily locate or use data across the hundreds if not thousands of autonomous, heterogeneous, and distributed data systems currently in the Space Science community.

Heterogeneous Systems Data Management - RDBMS, ODBMS, HomeGrownDBMS, BinaryFiles Platforms - UNIX, LINUX, WIN3.x/9x/NT, Mac, VMS, … Interfaces - Web, Windows, Command Line Data Formats - HDF, CDF, NetCDF, PDS, FITS, VICR, ASCII, ... Data Volume - KiloBytes to TeraBytes

Heterogeneous Disciplines Moving targets and stationary targets Multiple coordinate systems Multiple data object types (images, cubes, time series, spectrum, tables,

binary, document) Multiple interpretations of single object types Multiple software solutions to same problem Incompatible and/or missing metadata

Page 4: Page 1 Object Oriented Data Technology for Space Science Data Archiving and Retrieval: Potential Applications in Biomedical Research March 21, 2001 Dan

Page 4

Page 4

Evolution of Data Systems(Trying to make order out of entropy)

Data System Maturity

Basic Data Infrastructure - Data Acquisition - Databases - Data Analysis Tools - Homogeneous Computing

Data Archiving

- Catalog Systems - Data sets - Data Products - Metadata

Data Location

- Metadata - Distributed Data sets - Distributed Services

Data Product Exchange/Interoperability - Heterogeneous Servers - Data Interchange - Data Sharing - Distributed Architectures

Centralized Data Distributed Data

Page 5: Page 1 Object Oriented Data Technology for Space Science Data Archiving and Retrieval: Potential Applications in Biomedical Research March 21, 2001 Dan

Page 5

Page 5

Independent Access to PDS Archives

MARIE EDR,PDS PPI

GRS RDRPDS GEO

THEMIS EDRASU

DOCUMENTS& ANCILLARYFILES

GRS EDRUofA

MGSMOLA

MGSTES

MGSMOC

Viking MDIM

VikingEDR

Mars Odyssey

Mars GlobalSurveyor Viking

AncillaryFiles

MOC Interface

TES Interface

MOLAInterface

Viking Interface

Odyssey Interfaces

Page 6: Page 1 Object Oriented Data Technology for Space Science Data Archiving and Retrieval: Potential Applications in Biomedical Research March 21, 2001 Dan

Page 6

Page 6

PDS Nodes and Institutions (Silos)

NAIF/JPL

Small Bodies/UMD

Atmospheres/New Mexico State

Geosciences/Washington University

Planetary Plasma/UCLA

Rings/Ames

Radio Science/Stanford

Central Node/JPL

Imaging/USGS

Imaging/JPL

Page 7: Page 1 Object Oriented Data Technology for Space Science Data Archiving and Retrieval: Potential Applications in Biomedical Research March 21, 2001 Dan

Page 7

Page 7

Managing Software Interfaces(# of Systems vs # of Interfaces)

N Interfaces

0

5000

10000

15000

20000

25000

30000

35000

40000

45000

50000

10 20 50 100 200 300

N Interfaces

# of interfaces = (n2 - n) / 2 where the number of systems is n

# of systems

# of interfaces

Page 8: Page 1 Object Oriented Data Technology for Space Science Data Archiving and Retrieval: Potential Applications in Biomedical Research March 21, 2001 Dan

Page 8

Page 8

DIS Approach (circa 1998)

• PDS Distributed Inventory System (DIS)

• Identified all resources available across the PDS• Used Object Description Language (keyword/value)

labels to describe data and non-data resources• Included URL/FTP links to all online resources• Used cgi-script engine to search label database• Displayed results in an HTML page

Page 9: Page 1 Object Oriented Data Technology for Space Science Data Archiving and Retrieval: Potential Applications in Biomedical Research March 21, 2001 Dan

Page 9

Page 9

Product Label

OBJECT = DATA_SET ; DATA_SET_NAME = VO1/VO2 MARS VISUAL IMAGING SUBSYSTEM DIGITAL IMAGING MODEL; DATA_SET_ID = VO1/VO2-M-VIS-5-DIM-V1.0; DATA_SET_DESC = http://pds.jpl.nasa.gov/cgi-bin/remote.pl?ds:VO1/VO2%2BMARS%…; DATA_SET_TERSE_DESC = http://pds.jpl.nasa.gov/cgi-bin/remote.pl?ds:VO1/VO2%2BM…; DATA_SET_RELEASE_DATE = 1991; LINK = http://www-pdsimage.wr.usgs.gov/PDS/public/mapmaker/mapmkr.htm; ARCHIVE_STATUS = ARCHIVED; ARCHIVE_STATUS_NOTE = Passed peer review with all liens resolved. Available through PDS and NSSDC; ARCHIVE_SCHEDULE_NOTE = 1999-08-18:ARCHIVED; DATA_OBJECT_TYPE = IMAGE; START_TIME = N/A; STOP_TIME = N/A; NODE_NAME = IMAGING; CURATING_NODE_ID = IMAGING; DATA_ENGINEER_ID = N/A; TARGET_NAME = MARS; TARGET_TYPE = PLANET; MISSION_NAME = VIKING; INSTRUMENT_HOST_NAME = VIKING ORBITER 1, VIKING ORBITER 2; INSTRUMENT_HOST_TYPE = SPACECRAFT; INSTRUMENT_NAME = VISUAL IMAGING SUBSYSTEM - CAMERA A, VISUAL IMAGING SUBSYSTEM - CAMERA B; INSTRUMENT_TYPE = VIDICON CAMERA; NSSDC_DATA_SET_ID = 75-075A-01F, 75-083A-01C; VOLUME_ID = VO_2001, VO_2002, …; VOLUME_LINK = ftp://pdsimage.wr.usgs.gov/cdroms/vo_2001/, ftp://pdsimage.wr.usgs.gov/cdroms/vo_2002/, … KEYVALUES = CAMERA, DIGITAL, DIM, IMAGING, MARS, MODEL, ORBITER, PLANET, SPACECRAFT, SUBSYSTEM,

TABLE, VIDICON, VIKING, VIS, VISUAL, VO1, VO2, VO_2001, VO_2002, …; LABEL_REVISION_NOTE = 2000-09-19 C:CN S:CN LN:AY:DN;END_OBJECT;

Page 10: Page 1 Object Oriented Data Technology for Space Science Data Archiving and Retrieval: Potential Applications in Biomedical Research March 21, 2001 Dan

Page 10

Page 10

DIS Links to Mars Archives

MARIE EDR,PDS PPI

GRS RDRPDS GEO

THEMIS EDRASU

DOCUMENTS& ANCILLARYFILES

GRS EDRUofA

MGSMOLA

MGSTES

MGSMOC

Viking MDIM

VikingEDR

Mars Odyssey

Mars GlobalSurveyor

Viking AncillaryFiles

DIS

DataMiningData

Catalogs

Distributed Archives

Page 11: Page 1 Object Oriented Data Technology for Space Science Data Archiving and Retrieval: Potential Applications in Biomedical Research March 21, 2001 Dan

Page 11

Page 11

Lessons Learned

• Identified metadata as key to the solution• Identifies and describes data and resources• Provides search attributes• Helps determine whether query can be resolved by

resource

• However• ODL is not a universal interchange language• Resource heterogeneity is still visible• Navigation and access is limited to http links• System interoperability is not supported

Page 12: Page 1 Object Oriented Data Technology for Space Science Data Archiving and Retrieval: Potential Applications in Biomedical Research March 21, 2001 Dan

Page 12

Page 12

OODT, The Next Generation XML/JAVA/CORBA

• Started in 1998 as a research and development task funded at JPL by the Office of Space Science to address

• Application of Information Technology to Space Science• Research methods for interoperability, knowledge

management and knowledge discovery• Develop software frameworks for data management to reuse

software, manage risk, reduce cost and leverage IT experience

• OODT Initial focus• Data archiving – Manage heterogeneous data products and

resources in a distributed, metadata-driven environment• Data location – Locate data products across multiple

archives, catalogs and data systems• Data retrieval – Retrieve diverse data products from

distributed data sources and integrate

Page 13: Page 1 Object Oriented Data Technology for Space Science Data Archiving and Retrieval: Potential Applications in Biomedical Research March 21, 2001 Dan

Page 13

Page 13

OODT System Design Goals

Encapsulate individual data systems to hide uniqueness

Provide data system location independence Require that communication between distributed

systems use metadata Define a standard data dictionary structure and

approach for describing systems and resources Provide a scalable and extensible solution Provide a mechanism for data product exchange Allow systems using different data dictionaries and

metadata implementations to be integrated Define an architecture that can leverage off of open

standard approaches

Page 14: Page 1 Object Oriented Data Technology for Space Science Data Archiving and Retrieval: Potential Applications in Biomedical Research March 21, 2001 Dan

Page 14

Page 14

OODT Technology Focus

Focus on building middleware components Focus on creating metadata “profiles” about data system resources Provide sufficient layers of abstraction in the architecture to isolate

technology choices from architecture choices XML (Extensible Markup Language) for the data content CORBA (Common Object Request Broker Architecture) for the data

transport Research technologies for implementing a distributed data architecture

Distributed Object Computing (CORBA, DCOM, etc) Database Technology (RDBMS, ODBMS) Data Access Technologies (O/JDBC, XML, etc) Directory Implementations (LDAP) Data Interchange (XML) Communication Technologies (Web/HTTP, MOM, RPC, etc)

Page 15: Page 1 Object Oriented Data Technology for Space Science Data Archiving and Retrieval: Potential Applications in Biomedical Research March 21, 2001 Dan

Page 15

Page 15

Motivation for Middleware

Middleware allows for the encapsulation of individual data systems

Hide uniqueness by introducing the data architecture layer

Allows for data abstraction Provide common client interfaces to heterogeneous systems Manage risk associated with technical decisions. Systems evolve

independent of the clients.

Enable reuse and promote standards

Allow for incompatible systems to be tied together by introducing a middleware layer

Page 16: Page 1 Object Oriented Data Technology for Space Science Data Archiving and Retrieval: Potential Applications in Biomedical Research March 21, 2001 Dan

Page 16

Page 16

Middleware Data Encapsulation

Middleware can tie application, data, and user interfaces together and hide the unique interfaces

Applications User Interface

Data

Middleware

Page 17: Page 1 Object Oriented Data Technology for Space Science Data Archiving and Retrieval: Potential Applications in Biomedical Research March 21, 2001 Dan

Page 17

Page 17

OODT Component Framework

Java based software middleware component architecture that provides a software framework for archiving, search and retrieval, and data product exchange

Archive Component Provides centralized data archiving and cataloging of data products Distributed

A Search and Retrieval Component Manage metadata associated with resources Locate resources across geographically distributed data systems Distributed

Data Product Exchange Component Support interchange (data sharing) of data products Support heterogeneous implementations and systems Distributed

Query Service Component Ties search and product exchange services together Distributed

Page 18: Page 1 Object Oriented Data Technology for Space Science Data Archiving and Retrieval: Potential Applications in Biomedical Research March 21, 2001 Dan

Page 18

Page 18

Component Framework for OODT

OODT/Science Web Tools

ArchiveClient

OBJECT ORIENTED DATA TECHNOLOGY FRAMEWORK

ProfileXMLData

NavigationService

Data System

2

Data System

2

Other Service 1

Other Service 2

QueryService

ProductService

ProfileService

ArchiveService

Bridge to External Services

Page 19: Page 1 Object Oriented Data Technology for Space Science Data Archiving and Retrieval: Potential Applications in Biomedical Research March 21, 2001 Dan

Page 19

Page 19

Data Archiving Goals

Provide basic functions Transport and management of data sets and products Identification of products using metadata Event driven processing associated with data sets Ability to add, get and delete products from the archive Management of large, complex datasets and products

Provide extensible data management approach Database is dynamically generated and extended based on

metadata content Separate the metadata and the archive. Metadata can be exported

to other components of the OODT architecture.

Provide a n-tier, client/server architecture for archiving at remote locations

Build a service that is accessible via clients using common programming languages (Java, C++, etc)

Page 20: Page 1 Object Oriented Data Technology for Space Science Data Archiving and Retrieval: Potential Applications in Biomedical Research March 21, 2001 Dan

Page 20

Page 20

Archive Management

Page 21: Page 1 Object Oriented Data Technology for Space Science Data Archiving and Retrieval: Potential Applications in Biomedical Research March 21, 2001 Dan

Page 21

Page 21

OODT Metadata Development

Develop methods for managing the semantics of data that are shared within and between domains

Terminology Base – Domain specific name space Data Dictionary – Inventory of domain terms with definitions and

other distinguishing attributes. Ontology – A set of concepts, their relationships and constraints, all

within the scope of a domain. XML for metadata registry and communication

Several I.T. efforts have shown the criticality of metadata in enabling data sharing and system interoperability

Use standards where appropriate ISO/IEC 11179 – A framework for the Specification and

Standardization of Data Elements Dublin Core – A metadata element set intended to facilitate discovery

of electronic resources.

Page 22: Page 1 Object Oriented Data Technology for Space Science Data Archiving and Retrieval: Potential Applications in Biomedical Research March 21, 2001 Dan

Page 22

Page 22

Why XML?

XML doesn’t provide a “silver bullet,” but it does allow us to refocus the problem on metadata Metadata is a key to interoperability (http://www.cio.gov/docs/metadata.htm)

XML is language neutral

Allows the designer to separate the data and the transport (re: CORBA vs XML-over-CORBA) Transport mechanism and data are not tied together

Could be XML/HTTP

Simpler deployments

Simpler interfaces

Allows technologies to grow and change independently

Real value of XML is the process of describing the data

Page 23: Page 1 Object Oriented Data Technology for Space Science Data Archiving and Retrieval: Potential Applications in Biomedical Research March 21, 2001 Dan

Page 23

Page 23

What is a profile?

A profile is a set of metadata resource definitions implemented in XML for data products residing in one or more distributed systems

Profile servers are CORBA servers that manage XML profile definitions

Profile servers communicate via XML-over-CORBA

Developed Java classes that map XML profiles to a Java object

Profile Distributed Node Architecture

XML/CORBA XML/CORBA

XML/CORBA XML/CORBA

XML/CORBAXML/CORBA

Page 24: Page 1 Object Oriented Data Technology for Space Science Data Archiving and Retrieval: Potential Applications in Biomedical Research March 21, 2001 Dan

Page 24

Page 24

Solutions to Data Search

Build metadata “profiles” that describe data system resources Encapsulate individual data systems resources (Hide uniqueness)

Communicate using metadata (Provide metadata with data)

Enable interoperability based on metadata compatibility

Refocus problem on metadata development

Provide a core framework of software components to interconnect distributed data systems

Define profiles using standard industry approaches Use XML to describe profiles ISO/IEC 11179 – A framework for the Specification and Standardization

of Data Elements Dublin Core – A metadata element set intended to facilitate discovery of

electronic resources.

Page 25: Page 1 Object Oriented Data Technology for Space Science Data Archiving and Retrieval: Potential Applications in Biomedical Research March 21, 2001 Dan

Page 25

Page 25

Resource Profile Classifications

profileserver

granule

datasetanalysistool

productserver

catalog

interface

datasetcollection

datadictionary

SpaceScience

Astrophysics SpacePhysics

PlanetaryScience

NationalInstitute ofHealth

EarlyDetectionNetwork

NationalCancerInstitute

Resource Classes

MetadataDataApplicationSystem

Resource Context(Discipline )

Medicine Space Science

Page 26: Page 1 Object Oriented Data Technology for Space Science Data Archiving and Retrieval: Potential Applications in Biomedical Research March 21, 2001 Dan

Page 26

Page 26

Profile DTD

<!ELEMENT profiles (profile+)>

<!ELEMENT profile (profAttributes, resAttributes, profElement*)>

<!ELEMENT profAttributes (profId, profVersion*, profTitle*, profDesc*, profType*, profStatusId*, profSecurityType*, profParentId*, profChildId*, profRegAuthority*, profRevisionNote*, profDataDictId*)>

<!ELEMENT resAttributes (Identifier, Title*, Format*, Description*, Creator*, Subject*, Publisher*, Contributor*, Date*, Type*, Source*, Language*, Relation*, Coverage*, Rights*, resContext*, resAggregation*, resClass*, resLocation*)>

<!ELEMENT profElement (elemId*, elemName, elemDesc*, elemType*, elemUnit*, elemEnumFlag*, (elemValue | (elemMinValue, elemMaxValue))*, elemSynonym*, elemObligation*, elemMaxOccurrence*, elemComment*)>

Page 27: Page 1 Object Oriented Data Technology for Space Science Data Archiving and Retrieval: Potential Applications in Biomedical Research March 21, 2001 Dan

Page 27

Page 27

XML Profile Example (1 of 2)

<profile> <profAttributes> <profId>OODT_PDS_DATA_SET_INV_82</profId><profDataDictId>OODT_PDS_DATA_SET_DD_V1.0</profDataDictId> </profAttributes> <resAttributes> <Identifier>VO1/VO2-M-VIS-5-DIM-V1.0</Identifier> <Title>VO1/VO2 MARS VISUAL IMAGING SUBSYSTEM DIGITAL …</Title> <Format>text/html</Format> <Language>en</Language> <resContext>PDS</resContext> <resAggregation>dataSet</resAggregation> <resClass>data.dataSet</resClass> <resLocation>http://pds.jpl.nasa.gov/cgi-bin/pdsserv.pl?…</resLocation> </resAttributes>

Page 28: Page 1 Object Oriented Data Technology for Space Science Data Archiving and Retrieval: Potential Applications in Biomedical Research March 21, 2001 Dan

Page 28

Page 28

XML Profile Example (2 of 2)

<profElement> <elemId>ARCHIVE_STATUS</elemId> <elemName>ARCHIVE_STATUS</elemName> <elemType>ENUMERATION</elemType> <elemEnumFlag>T</elemEnumFlag> <elemValue>ARCHIVED</elemValue> </profElement> <profElement> <elemId>TARGET_NAME</elemId> <elemName>TARGET_NAME</elemName> <elemType>ENUMERATION</elemType> <elemEnumFlag>T</elemEnumFlag> <elemValue>MARS</elemValue> </profElement></profile>

Page 29: Page 1 Object Oriented Data Technology for Space Science Data Archiving and Retrieval: Potential Applications in Biomedical Research March 21, 2001 Dan

Page 29

Page 29

OODT Profile Service Component

Profiles are managed by profile servers

Profile servers are written in Java

OODT currently has three different registry methods for managing profiles which are configurable at run time Flat File

RDBMS via JDBC (Oracle)

LDAP (OpenLDAP)

Page 30: Page 1 Object Oriented Data Technology for Space Science Data Archiving and Retrieval: Potential Applications in Biomedical Research March 21, 2001 Dan

Page 30

Page 30

Data Product Exchange

Exchanging products requires access to each data system (RDBMS/OODBMS, Flat file, etc) which is difficult Different vendor products

Non-standard interfaces

Different implementations (data model, home grown, COTS,etc)

Representations of data are different

Heterogeneous Platforms

Heterogeneous O/S

etc

Page 31: Page 1 Object Oriented Data Technology for Space Science Data Archiving and Retrieval: Potential Applications in Biomedical Research March 21, 2001 Dan

Page 31

Page 31

Solutions to Data Product Exchange

Extend framework to support common access to distributed data systems by creating a “Product Service Component”

Product Servers - Middleware that negotiates the interfaces between the data system implementations

Design the component to leverage off of Consistent metadata and data dictionary Consistent data interchange methods and protocols

Provide data abstraction Data and information hiding Location hiding and independence

Provide a standard language for communication Use the OODT XML Query language for data interchange

Support “rich” query description including data elements and constraints Support “rich” query results that include results in many different formats

Page 32: Page 1 Object Oriented Data Technology for Space Science Data Archiving and Retrieval: Potential Applications in Biomedical Research March 21, 2001 Dan

Page 32

Page 32

OODT Product Server Component

The Product Server plugs into the OODT framework and manages the “handshake” between the data system and the OODT system.

Extensible by dynamically loading objects at runtime which are specific to the data system model

Queries and results are passed using an OODT XML Query structure Encapsulates one or more data sources for standardized access

Product Server

Flat file accessor

Other accessor

Query: XML

Result: XML

Generic Server Implementation Class

FileSys

Database

Page 33: Page 1 Object Oriented Data Technology for Space Science Data Archiving and Retrieval: Potential Applications in Biomedical Research March 21, 2001 Dan

Page 33

Page 33

OODT Query Service Component

Manages all queries for the identification and retrieval of data products

All components are identified by a unique name and managed in a CORBA name server

Queries to multiple profile or product servers occur concurrently

Queries are described using the OODT XML Query structure

Ties together the profile and product server components for the OODT framework

Page 34: Page 1 Object Oriented Data Technology for Space Science Data Archiving and Retrieval: Potential Applications in Biomedical Research March 21, 2001 Dan

Page 34

Page 34

XML Query Example (1 of 2)

<query> <queryAttributes> <queryId>OODT_XML_QUERY_V0.1</queryId> <queryTitle>OODT_XML_QUERY - PDS DIS Query Example</queryTitle> <queryDesc>PDS DIS Query for TARGET_NAME = MARS</queryDesc> <queryType>QUERY</queryType> <queryStatusId>ACTIVE</queryStatusId> <querySecurityType>UNKNOWN</querySecurityType> <queryRevisionNote>2000-05-12 JSH V1.2 Updated for new prof.dtd</queryRevisionNote> <queryDataDictId>OODT_PDS_DATA_SET_DD_V1.0</queryDataDictId> </queryAttributes> <queryResultModeId>ATTRIBUTE</queryResultModeId> <queryPropogationType>BROADCAST</queryPropogationType> <queryPropogationLevels>N/A</queryPropogationLevels> <queryMaxResults>100</queryMaxResults><queryResults>0</queryResults> <queryKWQString>TARGET_NAME = MARS</queryKWQString>

Page 35: Page 1 Object Oriented Data Technology for Space Science Data Archiving and Retrieval: Potential Applications in Biomedical Research March 21, 2001 Dan

Page 35

Page 35

XML Query Example (2 of 2)

<querySelectSet></querySelectSet> <queryFromSet></queryFromSet> <queryWhereSet> <queryElement> <tokenRole>elemName</tokenRole> <tokenValue>TARGET_NAME</tokenValue> </queryElement> <queryElement> <tokenRole>LITERAL</tokenRole> <tokenValue>MARS</tokenValue> </queryElement> <queryElement> <tokenRole>RELOP</tokenRole> <tokenValue>EQ</tokenValue> </queryElement> </queryWhereSet> <queryResultSet></queryResultSet></query>

Page 36: Page 1 Object Oriented Data Technology for Space Science Data Archiving and Retrieval: Potential Applications in Biomedical Research March 21, 2001 Dan

Page 36

Page 36

Insertion of OODT into PDS

MARIE EDR,PDS PPI

GRS RDRPDS GEO

THEMIS EDRASU

DOCUMENTS& ANCILLARYFILES

GRS EDRUofA

MGSMOLA

MGSTES

MGSMOC

Viking MDIM

VikingEDR

BrowserInterfacesBrowser

Interfaces

DataVisualizationData Mining &

Visualization

DataMiningData

Catalogs

OODT MiddlewareDataMiningProduct

Servers

DataMiningProduct

Servers

Distributed Archives

Page 37: Page 1 Object Oriented Data Technology for Space Science Data Archiving and Retrieval: Potential Applications in Biomedical Research March 21, 2001 Dan

Page 37

Page 37

OODT Query Flow Example

Query Server

Profile Serverjpl

Product Serverjpl.pti

Product Serverjpl.pds

Product Serverjpl.pds.mola

Userquery

XSL(profiles ordata productsformatted)

XMLQuery/IIOP(no results)

XMLQuery/IIOP(profiles ordata resultsas requested)

XMLQuery/IIOP(no results)

XMLQuery/IIOP(profiles of resources to handle query)

XM

LQ

ue

ry/I

IOP

(d

ata

re

sults

)

XM

LQ

ue

ry/I

IOP

(p

rod

uct

se

arc

h)

Search Web Page

Profile DB

PTI Repository

PDS DVD Jukebox

PDS MOLA Oracle DB

Que

ryC

lien

t

Web

se

rver

sear

ch.js

p

Page 38: Page 1 Object Oriented Data Technology for Space Science Data Archiving and Retrieval: Potential Applications in Biomedical Research March 21, 2001 Dan

Page 38

Page 38

MGS Coverage Plot

Page 39: Page 1 Object Oriented Data Technology for Space Science Data Archiving and Retrieval: Potential Applications in Biomedical Research March 21, 2001 Dan

Page 39

Page 39

Query Results

Page 40: Page 1 Object Oriented Data Technology for Space Science Data Archiving and Retrieval: Potential Applications in Biomedical Research March 21, 2001 Dan

Page 40

Page 40

Efforts in Bioinformatics

September 2000, the Office of Science Policy at NIH and JPL entered into a joint inter-agency agreement to

Explore knowledge environments to support the management and exchange of biomarkers and biomakers databases

Insert technologies developed to support interoperability between biomedical databases

Develop methodologies to are mutually beneficial to both biomedical and space science data systems

A series of meetings lead to an initial effort that focused on interoperability within the Early Detection Research Network, a network within the National Cancer Institute (NCI) consisting of 18 laboratories and hospitals

Two initial epidemiological centers were selected for locating cancer specimens

Moffitt Cancer Research Center University of San Antonio, Texas

A “mock up” data architecture for EDRN was presented on January 21st at the 3rd EDRN Steering Committee meeting in San Antonio, Texas

Page 41: Page 1 Object Oriented Data Technology for Space Science Data Archiving and Retrieval: Potential Applications in Biomedical Research March 21, 2001 Dan

Page 41

Page 41

EDRN Mock Query Interface

Page 42: Page 1 Object Oriented Data Technology for Space Science Data Archiving and Retrieval: Potential Applications in Biomedical Research March 21, 2001 Dan

Page 42

Page 42

QueryManager

EDRN Knowledge ArchitectureMockup Implementation at JPL

San Antonio

Moffitt

MetadataProfiles

EDRN Mock Databases Hosted at JPL

San Antonio ProductExchange Server

Moffitt ProductExchange Server

In:QueryOut::Identified Resources

In:QueryOut::Data Products

In:QueryOut::Data Products

OODTMiddleware: Hosted at JPL

EDRN “Mock” Query Interface

In:QueryOut::Data Products

Page 43: Page 1 Object Oriented Data Technology for Space Science Data Archiving and Retrieval: Potential Applications in Biomedical Research March 21, 2001 Dan

Page 43

Page 43

Component Based Mars Distributed Archive Architecture 2001 Mars Odyssey Orbiter Example

VolumeServer

Data ProductServer(s)

PI TeamClient Interface

QueryManager

MARIE EDR,RDR ArchivePDS PPI

GRS RDRArchivePDS GEO

THEMIS EDR,RDR ArchiveASU

Documents andAncillary FilesPDS CN

GRS EDRArchiveUofA

Code-S ScientistsClient Interface

General PublicClient Interface

In:QueryOut::Data Products,Volumes

In:QueryOut::Identified Resources

In:QueryOut::Data Products Data Products

Virtual Volume

OODTMiddleware Virtual Volume

Physical Volume

Data Productsand Descriptions

Data Productand ResourceDescriptions

DVDVolume

SupportingDocuments

Data Productsand Descriptions

Data Productsand Descriptions

Data Productsand Descriptions

In:QueryOut::Data Products,Volumes

In:QueryOut::Data Products,Volumes

Page 44: Page 1 Object Oriented Data Technology for Space Science Data Archiving and Retrieval: Potential Applications in Biomedical Research March 21, 2001 Dan

Page 44

Page 44

More Information and References

OODT Papers (http://oodt.jpl.nasa.gov/doc/papers) “Science Search and Retrieval using XML” by OODT Team. Presented at Second National Conference

on Scientific and Technical Data, National Academy of Sciences, Washington D.C. March 2000. “A Distributed Component Framework for Science Data Product Interoperability” by OODT Team.

Presented at the 17th Annual International CODATA conference. Baveno, Italy. October 2000.

OODT Presentations (http://oodt.jpl.nasa.gov/doc/presentations)

JPL Enterprise Data Architecture White Paper (e-mail: [email protected])

Planetary Data System

http://pds.jpl.nasa.gov

Dublin Core

http://purl.oclc.org/dc

Extensible Markup Language

http://www.w3c.org/XML

ISO/IEC 11179: Specification and Standardization of Data Elements

Federal CIO Statement on Metadata

http://www.cio.gov/docs/metadata.htm

Page 45: Page 1 Object Oriented Data Technology for Space Science Data Archiving and Retrieval: Potential Applications in Biomedical Research March 21, 2001 Dan

Page 45

Page 45

Backup Slides

Page 46: Page 1 Object Oriented Data Technology for Space Science Data Archiving and Retrieval: Potential Applications in Biomedical Research March 21, 2001 Dan

Page 46

Page 46

OODT Prototype Environment

- XML parser: Apache/IBM Xerces 1.0.3

http://xml.apache.org

- XSLT: Apache Xalan 1.0.0

http://xml.apache.org

- CORBA: Orbacus 4.0.3

http://www.ooc.com

- Database: Oracle 8.1.5

http://www.oracle.com

- LDAP server: OpenLDAP 1.2.11

http://www.openldap.org

- Development language: Java 1.2

http://java.sun.com

- Web server: iPlanet Fasttrack 4.1

http://www.iplanet.com

- Server operating system: RedHat Linux 6.2

http://www.redhat.com

- Version control system: CVS 1.10.5

http://www.cvshome.org

Page 47: Page 1 Object Oriented Data Technology for Space Science Data Archiving and Retrieval: Potential Applications in Biomedical Research March 21, 2001 Dan

Page 47

Page 47

Profile Server Node

Page 48: Page 1 Object Oriented Data Technology for Space Science Data Archiving and Retrieval: Potential Applications in Biomedical Research March 21, 2001 Dan

Page 48

Page 48

The Use of Metadata in the PDS

Mission

Data Set Collection

Data SetInstrument

Spacecraft

Target

Document Image Time Series Spectrum

Planetary ScienceModel

Locate and Use Data- Use context to find data- Use context to understand data

System Interoperability- Use context to share data

Correlative Science- Use context to find new relationships between data

ModelAttributes

Label

Data