NSF Workshop on Cyberinfrastructure for Environmental Observatories: Introduction to CI Topics

SAN DIEGO SUPERCOMPUTER CENTER NSF Workshop on CI for Environmental Observatories, Dec 6-7, 2004 NSF Workshop on Cyberinfrastructure for Environmental Observatories Introduction to CI Topics Chaitan Baru, SDSC/NLADR Bertram Ludaescher, UC Davis/SDSC Michael Welge, NCSA/NLADR


Page 1: NSF Workshop on Cyberinfrastructure for Environmental Observatories Introduction to CI Topics

SAN DIEGO SUPERCOMPUTER CENTER NSF Workshop on CI for Environmental Observatories, Dec 6-7, 2004

NSF Workshop on Cyberinfrastructure for Environmental Observatories

Introduction to CI Topics

Chaitan Baru, SDSC/NLADR

Bertram Ludaescher, UC Davis/SDSC

Michael Welge, NCSA/NLADR

Page 2: NSF Workshop on Cyberinfrastructure for Environmental Observatories Introduction to CI Topics

Outline

• A nexus of CI projects
• CI project “principles”
• CI technical focus areas/topics
• CI organizational issues

Page 3: NSF Workshop on Cyberinfrastructure for Environmental Observatories Introduction to CI Topics

CI Projects

• Biomedical
  • BIRN-CC (Ellisman PI, Papadopoulos, Gupta, Baru, …)
  • National Biomedical Computational Resource, NBCR (Arzberger PI, Ellisman, Papadopoulos, Gupta, Baru, …) ...
• Geosciences
  • GEON (Baru PI, Ludaescher, Papadopoulos, Helly, …)
  • SCEC (Jordan PI, Moore, …)
  • LEAD (Droegemeier PI, Wilhelmson, Welge, …)
  • Chronos (Cervato PI, Baru…)
  • CUAHSI-HIS (Maidment PI, Helly, Zaslavsky, …)
  • LOOKING (Smarr/Orcutt PI, Welge, Fountain, …) ...
• Bio/Eco/Environmental
  • SEEK (Michener PI, Ludaescher, Jones, Rajasekar, …)
  • LTER (Michener PI, SDSC partner (Arzberger, Baru, Fountain, Rajasekar)…)
  • NEON (Hayden/Michener Lead PIs, Krishtalka, Baru, Welge…)
  • ROADNet (Orcutt PI, Vernon, Rajasekar, Ludaescher, Fountain, …)
  • NSF/BDI Lake Metabolism (Arzberger/Kratz PIs, Fountain, …) ...
• Engineering
  • Monitoring Health of Civil Infrastructure (El Gamal PI, Fountain, …)
  • CLEANER (Minsker, Welge, Zaslavsky, Fountain, Pancake, …)
• CISE
  • OptIPuter (Smarr PI, Ellisman, Orcutt, Papadopoulos, Welge, …)
  • NMI, GRIDs Center
  • Data Intensive Grid Benchmarking (Baru PI, Snavely, Casanova)
• MPS
  • NVO, GriPhyN, …

Page 4: NSF Workshop on Cyberinfrastructure for Environmental Observatories Introduction to CI Topics

CI Project Principles

• Use IT state-of-the-art, and develop advanced IT where needed, to support the “day-to-day” conduct of science (e-science)
  • (not just “hero” computations)
  • Based on a Web/Grid services-based distributed environment
• The “two-tier” approach
  • Use best practices, including commercial tools,
  • while developing advanced technology in open source, and doing CS research
• An equal partnership
  • IT works in close conjunction with science, to create CI, i.e., the best practices, data sharing frameworks, useful and usable capabilities and tools
• Create the “science IT infrastructure”
  • Online databases with advanced search engines
  • Robust tools and applications, etc.
• Leverage from other intersecting projects
  • Much commonality in the technologies, regardless of science disciplines
  • Constantly work towards eliminating (or, at least, minimizing) the “NIH” (not-invented-here) syndrome
  • And, importantly, try not to reinvent what industry already knows how to do…

Page 5: NSF Workshop on Cyberinfrastructure for Environmental Observatories Introduction to CI Topics

Important Focus Areas / Topics

• Security
  • Authentication, access control, controls for data publication…
• Grid middleware
  • WSRF implementations, architecting “core” services (e.g. for metadata management, versioning, …)
• Data integration and ontologies
  • Data interoperability, schema and semantic integration
• Workflow systems
  • “system-level” and science workflows (ingestion and analysis)
• Sensor network and sensor data management
  • Extensible, scalable, autonomic software; intelligent sensor management
• Data mining
  • Online analysis, large-scale data, novel algorithms, advanced triggering and notification
• Visualization
  • Large-scale, multi-model (data viz, GIS, info viz)

Page 6: NSF Workshop on Cyberinfrastructure for Environmental Observatories Introduction to CI Topics

Example: GEON “Software Stack”

[Figure: layered software stack. Recoverable labels:]
• Portal (myGEON); other service “consumers”
• Registration, GEONsearch, GEONworkbench
• Service interfaces
• Registration Services; Data Integration Services; GIS Mapping Services; Computational and Modeling Services
• Core Grid Services: Data, Metadata, Indexing, Logging, Other Systems Services
• “Physical” Grid: RedHat Linux, ROCKS, OGSI, Internet, I2, OptIPuter (planned)

Page 7: NSF Workshop on Cyberinfrastructure for Environmental Observatories Introduction to CI Topics

Antelope WSRF Extensions

Courtesy: Tony Fountain, SDSC and LOOKING project

[Figure: architecture diagram. Recoverable labels:]
• Field digitizers; Field Interface Module; Object Ring Buffer
• ORB Operations: Orb Import, Orb Export, Processing, Archiving
• Databases; Antelope Executive Module
• Antelope Web Services: ORBcommander, ORBManager, ORBMonitor, Lookup Service, Service Invoker, Services Subscriber, Database operator, Event Coordinator, Other Services
• WS-Resource (repeated); WSRF Authentication & Authorization
• Proxy Repository: certs, username, password, others
• Services Repository: name, definition, others
• Portal; Data Analyzer; SOAP/HTTP
• Soap Request: Soap Header (Proxy Cert), Soap Body (Request, Params)

Page 8: NSF Workshop on Cyberinfrastructure for Environmental Observatories Introduction to CI Topics

CI Organizational Issues

• How to foster development of common infrastructure (based upon science needs/input), across multiple science domains
  • Not just at hardware level (e.g. supercomputers, high-speed networks) or OS and system services level
  • But, at the database, data integration, data mining levels
• How to deal with the continuum of activities from basic CS research to production IT systems
• NLADR – created with above issues in mind
  • Prototype for a CI organization

Page 9: NSF Workshop on Cyberinfrastructure for Environmental Observatories Introduction to CI Topics


NLADR—National Lab for Advanced Data Research

• Joint activity between SDSC and NCSA, started October 1, 2004

• Formed based on NSF’s requirement that SDSC and NCSA collaborate on CI activities

• Collaborative R&D activity focused on advanced data technologies
  • Guided by real applications from science communities
  • …to assemble expertise and a “knowledge base” of data technologies
  • And, also develop a broad data architecture framework
  • …within which to develop, integrate, test, and benchmark data-related technologies
  • …in the context of national-scale physical infrastructure

Page 10: NSF Workshop on Cyberinfrastructure for Environmental Observatories Introduction to CI Topics

NLADR Services Architecture

[Figure: layered architecture diagram. Recoverable labels:]
• Applications: NSF – LEAD, GEON, LTERGrid, CLEANER, LOOKING; NIH/NCRR – BIRN; NASA – Space & Earth Sciences; Strategic Industrial Partners -- …
• nladrSearch; DataWorkbench
• NLADR Query, Analysis, and Visualization Services: Data Registration and Indexing; Database Federation & Integration; Workflow Authoring, Execution; Data and Information Visualization; Data Analysis and Mining; Collaboration; Benchmarking
• NLADR Data Management Services: management and archiving of large simulation outputs, streaming data, databases, data collections
• Grid and Web Middleware – (Globus/WSRF/WebServices/J2EE)
• Node Operating Systems (Linux, …)
• Internet2, LambdaGrids, SDSC/NCSA testbed, OptIPuter

Page 11: NSF Workshop on Cyberinfrastructure for Environmental Observatories Introduction to CI Topics

Some Core IT Areas

• Data integration and ontologies
  • Data interoperability, schema and semantic integration
• Scientific Workflows

Page 12: NSF Workshop on Cyberinfrastructure for Environmental Observatories Introduction to CI Topics

Schema Integration (“registering” local schemas to a global schema)

[Figure: state geologic map sources registered to an integration schema. Recoverable labels:]
• Sources: Arizona, Colorado, Utah, Nevada, Wyoming, New Mexico, Montana E., Idaho, Montana West
• Per-source attributes: Formation, Age (some sources also list Composition, Fabric, Texture)
• Local column names: ABBREV, PERIOD, NAME, TYPE, TIME_UNIT, FMATN, FORMATION, LITHOLOGY, AGE
• Example values: andesitic sandstone; Livingston formation; Tertiary-Cretaceous
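The idea of “registering” local schemas to a global schema can be sketched in a few lines of Python. This is only an illustration, not the GEON implementation: the source names (`source_a`, ...), the pairing of local columns with global attributes, and the function names are assumptions. The column names and example values are taken from the slide.

```python
# Global (integrated) schema that every source registers against.
GLOBAL_SCHEMA = ("formation", "age", "lithology")

# Hypothetical registrations: local column name -> global attribute.
# Source names and pairings are invented for illustration.
REGISTRATIONS = {
    "source_a": {"FMATN": "formation", "PERIOD": "age"},
    "source_b": {"FORMATION": "formation", "AGE": "age", "LITHOLOGY": "lithology"},
    "source_c": {"NAME": "formation", "TIME_UNIT": "age", "TYPE": "lithology"},
}


def to_global(source, record):
    """Rewrite one local record into the global schema; unmapped attributes stay None."""
    mapping = REGISTRATIONS[source]
    unified = {attr: None for attr in GLOBAL_SCHEMA}
    for local_name, value in record.items():
        if local_name in mapping:
            unified[mapping[local_name]] = value
    unified["source"] = source
    return unified


def integrate(records_by_source):
    """Flatten all registered sources into one list of global-schema records."""
    return [
        to_global(source, record)
        for source, records in records_by_source.items()
        for record in records
    ]


local_data = {
    "source_a": [{"FMATN": "Livingston formation", "PERIOD": "Tertiary-Cretaceous"}],
    "source_b": [{"FORMATION": "Livingston formation", "AGE": "Tertiary-Cretaceous",
                  "LITHOLOGY": "andesitic sandstone"}],
}

unified = integrate(local_data)
# A single query over the global schema now reaches both sources.
hits = [r["source"] for r in unified if r["formation"] == "Livingston formation"]
```

Once each source has declared its mapping, queries are written once against the global attribute names and no longer need to know that one source calls the column `FMATN` and another `FORMATION`.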

Page 13: NSF Workshop on Cyberinfrastructure for Environmental Observatories Introduction to CI Topics

Multihierarchical Rock Classification “Ontology” (Taxonomies) for “Thematic Queries” (GSC)

[Figure: four classification hierarchies: Composition, Genesis, Fabric, Texture]

Page 14: NSF Workshop on Cyberinfrastructure for Environmental Observatories Introduction to CI Topics

Ontology-Enabled Application Example: Geologic Map Integration

[Figure: two views of the Nevada geologic map for the same query]
• Show formations where AGE = ‘Paleozoic’ (without age ontology)
• Show formations where AGE = ‘Paleozoic’ (with age ontology)

[Figure labels: domain knowledge; knowledge representation; Geologic Age ONTOLOGY; +/- a few hundred million years]
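The difference between the two queries can be illustrated with a small Python sketch. It assumes a toy fragment of a geologic age ontology and invented formation records. The mediator's actual ontology and query machinery are not shown in the slides, so the names here are hypothetical.

```python
# Toy fragment of a geologic age ontology: term -> direct sub-terms.
AGE_ONTOLOGY = {
    "Paleozoic": ["Cambrian", "Ordovician", "Silurian",
                  "Devonian", "Carboniferous", "Permian"],
    "Carboniferous": ["Mississippian", "Pennsylvanian"],
}


def expand(term):
    """Return the term together with all of its descendants in the ontology."""
    terms = {term}
    for child in AGE_ONTOLOGY.get(term, []):
        terms |= expand(child)
    return terms


# Invented map units; real sources label ages at different granularities.
FORMATIONS = [
    {"name": "F1", "age": "Paleozoic"},
    {"name": "F2", "age": "Devonian"},
    {"name": "F3", "age": "Pennsylvanian"},
    {"name": "F4", "age": "Cretaceous"},
]


def formations_where_age(age, use_ontology):
    """Match the literal age only, or the age plus everything it subsumes."""
    accepted = expand(age) if use_ontology else {age}
    return [f["name"] for f in FORMATIONS if f["age"] in accepted]


without_ontology = formations_where_age("Paleozoic", use_ontology=False)
with_ontology = formations_where_age("Paleozoic", use_ontology=True)
```

Without the ontology only units labeled literally “Paleozoic” match. With it, units labeled with any period or sub-period of the Paleozoic are returned as well.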

Page 15: NSF Workshop on Cyberinfrastructure for Environmental Observatories Introduction to CI Topics


Different views on State Geological Maps

Page 16: NSF Workshop on Cyberinfrastructure for Environmental Observatories Introduction to CI Topics


Sedimentary Rocks: BGS Ontology

Page 17: NSF Workshop on Cyberinfrastructure for Environmental Observatories Introduction to CI Topics


Sedimentary Rocks: GSC Ontology

Page 18: NSF Workshop on Cyberinfrastructure for Environmental Observatories Introduction to CI Topics

Example: Domain Knowledge to “glue” SYNAPSE & NCMIR Data

Domain expert knowledge:

Purkinje cells and Pyramidal cells have dendrites that have higher-order branches that contain spines. Dendritic spines are ion (calcium) regulating components. Spines have ion binding proteins. Neurotransmission involves ionic activity (release). Ion-binding proteins control ion activity (propagation) in a cell. Ion-regulating components of cells affect ionic activity (release).

Formalized as domain map/ontology

Made usable for the system using Description Logic
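As a rough illustration of what such a formalization could look like, a few of the expert statements might be written as description logic axioms. The concept and role names below are assumptions chosen for readability, not the vocabulary actually used by the mediator.

```latex
\begin{align*}
\mathit{PurkinjeCell} \sqcup \mathit{PyramidalCell}
  &\sqsubseteq \exists \mathit{has}.\bigl(\mathit{Dendrite} \sqcap
     \exists \mathit{has}.(\mathit{Branch} \sqcap \exists \mathit{contains}.\mathit{Spine})\bigr) \\
\mathit{Spine} &\sqsubseteq \mathit{IonRegulatingComponent} \\
\mathit{Spine} &\sqsubseteq \exists \mathit{has}.\mathit{IonBindingProtein} \\
\mathit{IonBindingProtein} &\sqsubseteq \exists \mathit{controls}.\mathit{IonActivity} \\
\mathit{IonRegulatingComponent} &\sqsubseteq \exists \mathit{affects}.\mathit{IonActivity}
\end{align*}
```

Axioms of this form let a reasoner connect data registered under different concepts, for example spine measurements in one source and ion-binding protein data in another.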

Page 19: NSF Workshop on Cyberinfrastructure for Environmental Observatories Introduction to CI Topics


“Semantic Source Browsing”: Domain Maps/Ontologies (left) & conceptually linked data (right)

Page 20: NSF Workshop on Cyberinfrastructure for Environmental Observatories Introduction to CI Topics


A Semantic Mediation Result View

Page 21: NSF Workshop on Cyberinfrastructure for Environmental Observatories Introduction to CI Topics

Source Contextualization through Ontology Refinement

In addition to registering (“hanging off”) data relative to existing concepts, a source may also refine the mediator’s domain map...

• sources can register new concepts at the mediator ...
• increase your data usability
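A minimal Python sketch of this refinement step follows. The `DomainMap` class, its methods, and the concept and data item names are hypothetical. The sketch only shows how data registered at a newly added, more specific concept still answers queries posed at an existing, more general one.

```python
class DomainMap:
    """Toy mediator-side concept hierarchy with data registered at concepts."""

    def __init__(self):
        self.parent = {}  # concept -> parent concept (None for roots)
        self.data = {}    # concept -> list of (source, item) registrations

    def add_concept(self, concept, parent=None):
        """Register a new concept, optionally refining an existing one."""
        if parent is not None and parent not in self.parent:
            raise KeyError(f"unknown parent concept: {parent}")
        self.parent[concept] = parent
        self.data.setdefault(concept, [])

    def register(self, source, concept, item):
        """Hang a data item off an existing concept."""
        if concept not in self.parent:
            raise KeyError(f"unknown concept: {concept}")
        self.data[concept].append((source, item))

    def is_a(self, concept, ancestor):
        """True if concept equals ancestor or lies below it in the hierarchy."""
        while concept is not None:
            if concept == ancestor:
                return True
            concept = self.parent[concept]
        return False

    def lookup(self, concept):
        """Return all items registered at the concept or any refinement of it."""
        return [
            item
            for c, registrations in self.data.items()
            if self.is_a(c, concept)
            for _, item in registrations
        ]


dm = DomainMap()
dm.add_concept("Sedimentary rock")                       # existing mediator concept
dm.register("source_1", "Sedimentary rock", "map_unit_1")

dm.add_concept("Sandstone", parent="Sedimentary rock")   # refinement by a source
dm.register("source_2", "Sandstone", "map_unit_2")
```

Here `dm.lookup("Sedimentary rock")` finds both items, while `dm.lookup("Sandstone")` finds only the one registered at the refined concept.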

Page 22: NSF Workshop on Cyberinfrastructure for Environmental Observatories Introduction to CI Topics

What is a Scientific Workflow (SWF)?

• Aims:
  • automate a scientist’s repetitive data management and analysis tasks
  • typical phases:
    • data access, scheduling, generation, transformation, aggregation, analysis, mining, visualization
  • design, test, share, deploy, execute, reuse, … SWFs
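The phases listed above can be pictured as a chain of reusable steps. The Python sketch below is not Kepler's actor API. The step functions, the sample readings, and `run_workflow` are assumptions used only to show how a repetitive access, transform, aggregate, analyze sequence becomes one shareable object.

```python
from functools import reduce
from statistics import mean


def access(_):
    """Data access: stand-in for reading from a database or service (values invented)."""
    return [
        {"site": "A", "temp_c": "12.5"},
        {"site": "A", "temp_c": "13.5"},
        {"site": "B", "temp_c": "20.0"},
    ]


def transform(rows):
    """Transformation: convert text fields to numbers."""
    return [{"site": r["site"], "temp_c": float(r["temp_c"])} for r in rows]


def aggregate(rows):
    """Aggregation: group readings by site."""
    groups = {}
    for r in rows:
        groups.setdefault(r["site"], []).append(r["temp_c"])
    return groups


def analyze(groups):
    """Analysis: mean temperature per site."""
    return {site: mean(values) for site, values in groups.items()}


def run_workflow(steps, initial=None):
    """Execute the steps in order, feeding each step's output to the next."""
    return reduce(lambda data, step: step(data), steps, initial)


WORKFLOW = [access, transform, aggregate, analyze]
result = run_workflow(WORKFLOW)
```

Because the workflow is just a list of steps, it can be stored, shared, and rerun, and a single step can be swapped out without touching the others.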

Page 23: NSF Workshop on Cyberinfrastructure for Environmental Observatories Introduction to CI Topics

Promoter Identification Workflow

Source: Matt Coleman (LLNL)

Page 24: NSF Workshop on Cyberinfrastructure for Environmental Observatories Introduction to CI Topics

KEPLER/CSP: Contributors, Sponsors, Projects
(or loosely coupled Communicating Sequential Persons ;-)

Ilkay Altintas: SDM, Resurgence
Kim Baldridge: Resurgence, NMI
Chad Berkley: SEEK
Shawn Bowers: SEEK
Terence Critchlow: SDM
Tobin Fricke: ROADNet
Jeffrey Grethe: BIRN
Christopher H. Brooks: Ptolemy II
Zhengang Cheng: SDM
Dan Higgins: SEEK
Efrat Jaeger: GEON
Matt Jones: SEEK
Werner Krebs: EOL
Edward A. Lee: Ptolemy II
Kai Lin: GEON
Bertram Ludaescher: SDM, SEEK, GEON, BIRN, ROADNet
Mark Miller: EOL
Steve Mock: NMI
Steve Neuendorffer: Ptolemy II
Jing Tao: SEEK
Mladen Vouk: SDM
Xiaowen Xin: SDM
Yang Zhao: Ptolemy II
Bing Zhu: SEEK
•••

Ptolemy II

www.kepler-project.org

Page 25: NSF Workshop on Cyberinfrastructure for Environmental Observatories Introduction to CI Topics

Scientific Workflows as a Melting Pot
Example: The Kepler SWF System

• A grass-roots project
  • collaboration at the level of developers
• Intra-project links
  • e.g. in SEEK: AMS SMS EcoGrid
• Inter-project links
  • SEEK ITR, GEON ITR, ROADNet ITRs, DOE SciDAC SDM, Ptolemy II, NIH BIRN (coming we hope …), UK eScience myGrid, …
• Inter-technology links
  • Globus, SRB, JDBC, web services, soaplab services, command line tools, R, GRASS, XSLT, …
• Interdisciplinary links
  • CS, IT, domain sciences, …

Page 26: NSF Workshop on Cyberinfrastructure for Environmental Observatories Introduction to CI Topics

Promoter Identification Workflow in KEPLER

Page 27: NSF Workshop on Cyberinfrastructure for Environmental Observatories Introduction to CI Topics

Promoter Identification Workflow in KEPLER

Page 28: NSF Workshop on Cyberinfrastructure for Environmental Observatories Introduction to CI Topics

Web Services Actors (WS Harvester)

[Figure: screenshot with numbered steps 1 to 4]

• “Minute-made” (MM) WS-based application integration
• Similarly: MM workflow design & sharing w/o implemented components

Page 29: NSF Workshop on Cyberinfrastructure for Environmental Observatories Introduction to CI Topics

Job Management (here: NIMROD)

• Job management infrastructure in place
• Results database: under development
• Goal: 1000’s of GAMESS jobs (quantum mechanics)

Page 30: NSF Workshop on Cyberinfrastructure for Environmental Observatories Introduction to CI Topics


Some Recent Actor Additions

Page 31: NSF Workshop on Cyberinfrastructure for Environmental Observatories Introduction to CI Topics

in KEPLER (w/ editable script)

Source: Dan Higgins, Kepler/SEEK

Page 32: NSF Workshop on Cyberinfrastructure for Environmental Observatories Introduction to CI Topics


Blurring Design (ToDo) and Execution

Page 33: NSF Workshop on Cyberinfrastructure for Environmental Observatories Introduction to CI Topics

Towards Real-time Analysis Pipelines: Combining Simulations, Models, and Observations

Page 34: NSF Workshop on Cyberinfrastructure for Environmental Observatories Introduction to CI Topics

A Briefing On Data Mining to the NSF Planning Meeting Discussion Group on Cyberinfrastructure For Environmental Observatories

December 6 & 7, Arlington, VA

Michael Welge
University of Illinois

Page 35: NSF Workshop on Cyberinfrastructure for Environmental Observatories Introduction to CI Topics

Modern Discovery and Problem Solving

• Team-oriented and collaborative
• Information-based, decision focused
• Requires large-scale data fusion and analysis
• Not all data is under the user’s control
• Geographically distributed experts
• Geographically distributed data and applications
• Multiple stakeholders – multiple objectives

Page 36: NSF Workshop on Cyberinfrastructure for Environmental Observatories Introduction to CI Topics

Enabling Scientist

Scientists, Engineers, Decision Makers, Policy Makers, Media and Citizens

Engaging in discovery, analysis, discussion, deliberation, decisions, policy formulation and communication

Collaboration Framework facilitates Idea and Knowledge Sharing, eLearning and Multi-Objective Decision Support Processes

Analysis Framework facilitates Data and Model Discovery, Exploration, and Analysis; via the Collaboration Framework

Data Management Framework builds logical maps of distributed, heterogeneous information resources (data, models, tools, etc.) and facilitates their use via the Analysis and Collaboration Frameworks

Physical Infrastructure

Page 37: NSF Workshop on Cyberinfrastructure for Environmental Observatories Introduction to CI Topics

Data Streams – large number of applications

• Sensor networks
• Massive simulation data sets (stored but random access is too expensive)
• Monitoring & surveillance: video streams
• Network monitoring and traffic engineering
• Text based systems
• RFID tags
• Web logs and Web page click streams
• Credit card transaction flows
• Telecommunication calling records
• Engineering & industrial processes: power supply & manufacturing

Page 38: NSF Workshop on Cyberinfrastructure for Environmental Observatories Introduction to CI Topics

Support For Large Data Driven Problems

• Streaming Data
  • Continuous, unbounded, rapid, time-varying
  • Huge volumes of continuous data, possibly infinite
  • Unpredictable arrival
  • Fast changing and requires real-time response
  • Random access is expensive so an application can only have one look at the data
  • May require methods to detect rare events
• Large Static Data
  • Databases involving many terabytes can exceed reasonable processing capacity
  • Thousands of files create problems of management and version control
  • Thousands of fields create problems with model building
  • May require auxiliary models to support data quality issues
  • May require methods to detect rare events
  • Distributed data store necessary for some application domains
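The “one look at the data” constraint can be illustrated with a one-pass algorithm. The Python sketch below uses Welford's running mean and variance, a standard technique chosen here for illustration rather than a method named in the slides, and flags readings that lie far from the running mean. The threshold, warm-up length, and sample readings are assumptions.

```python
import math


class OnlineStats:
    """Welford's one-pass running mean and variance: O(1) memory, no revisiting data."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    @property
    def std(self):
        """Sample standard deviation of the values seen so far."""
        return math.sqrt(self.m2 / (self.n - 1)) if self.n > 1 else 0.0


def rare_events(stream, threshold=3.0, warmup=5):
    """Yield values more than `threshold` standard deviations from the running mean."""
    stats = OnlineStats()
    for x in stream:
        if (stats.n >= warmup and stats.std > 0
                and abs(x - stats.mean) > threshold * stats.std):
            yield x
        stats.update(x)  # each value is seen exactly once


readings = [10.0, 10.2, 9.8, 10.1, 9.9, 10.0, 50.0, 10.1]  # invented sensor values
flagged = list(rare_events(readings))
```

The stream is never stored or re-read, so memory use stays constant regardless of how long the stream runs.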

Page 39: NSF Workshop on Cyberinfrastructure for Environmental Observatories Introduction to CI Topics

Managing and Mining Data Streams

Page 40: NSF Workshop on Cyberinfrastructure for Environmental Observatories Introduction to CI Topics

Event Federation I

• Connect with data sources.

• Parse source data to form (composite) events according to type definitions.

• Collect and stage events for retrieval.

[Figure: event collection diagram. Recoverable labels: Data Sources 1, 2, … N; Event Interface; Type Info; Event Collector (Parse and Compose, EQL, Persistence, Buffering); Stream Clients]
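The three steps on this slide (connect, parse according to type definitions, collect and stage) can be sketched as follows. This is not the actual event federation code: the `Event` and `EventCollector` classes, the comma-separated wire format, and the `TYPE_INFO` table are assumptions made for illustration.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    type: str
    source: str
    payload: dict


# Hypothetical type definitions: event type -> ordered (field name, parser) pairs.
TYPE_INFO = {
    "temperature": (("station", str), ("celsius", float)),
}


class EventCollector:
    """Parses raw source lines into typed events and stages them for retrieval."""

    def __init__(self, maxlen=1000):
        self.buffer = deque(maxlen=maxlen)  # bounded staging buffer

    def ingest(self, source, line):
        """Parse one raw line of the form 'type,value1,value2,...'."""
        type_name, *values = line.strip().split(",")
        fields = TYPE_INFO[type_name]
        if len(values) != len(fields):
            raise ValueError(f"malformed {type_name} event: {line!r}")
        payload = {name: parse(v) for (name, parse), v in zip(fields, values)}
        self.buffer.append(Event(type_name, source, payload))

    def retrieve(self):
        """Hand staged events to a stream client in arrival order."""
        while self.buffer:
            yield self.buffer.popleft()


collector = EventCollector()
collector.ingest("sensor-1", "temperature,S1,18.5")
collector.ingest("sensor-2", "temperature,S2,21.0")
events = list(collector.retrieve())
```

The bounded buffer is one simple way to stage events. A persistent store, as suggested by the “Persistence” label in the figure, would replace it in a real deployment.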

Page 41: NSF Workshop on Cyberinfrastructure for Environmental Observatories Introduction to CI Topics

Event Federation 2

• Monitors are event expression recognition agents.
  • Recognize Event
  • Evaluate Conditions
  • Act
• EQL (Event Query Language) implements a compositional semantics for event expressions.
  • Composite events are “first order” events.
  • Monitors can monitor monitors.
• Clock events are part of the language implementation.
  • Easy to write queries with temporal constraints.

[Figure: EventWorks Event Router; Streams; Monitor 1 … Monitor N, each with EQL; New Events]

Monitors are generated by users or programmatically.
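The recognize, evaluate, act cycle and the idea that monitors can monitor monitors can be sketched in Python. EQL itself is not shown in the slides, so plain predicate functions stand in for event expressions. The `Monitor` and `Router` classes and the event types are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Monitor:
    name: str
    recognizes: Callable[[dict], bool]         # Recognize Event
    condition: Callable[[dict], bool]          # Evaluate Conditions
    action: Callable[[dict], Optional[dict]]   # Act; may emit a new event


@dataclass
class Router:
    monitors: list = field(default_factory=list)
    log: list = field(default_factory=list)

    def publish(self, event):
        """Deliver an event to every monitor; emitted events are routed like any other."""
        self.log.append(event)
        for monitor in self.monitors:
            if monitor.recognizes(event) and monitor.condition(event):
                new_event = monitor.action(event)
                if new_event is not None:
                    self.publish(new_event)


alerts = []


def escalate(event):
    """Emit a composite event once two alerts have been seen."""
    alerts.append(event)
    if len(alerts) == 2:
        return {"type": "heat_wave", "count": 2}
    return None


router = Router()
router.monitors.append(Monitor(
    name="hot",
    recognizes=lambda e: e["type"] == "temperature",
    condition=lambda e: e["celsius"] > 30,
    action=lambda e: {"type": "heat_alert", "celsius": e["celsius"]},
))
# This monitor watches the output of the first one.
router.monitors.append(Monitor(
    name="escalate",
    recognizes=lambda e: e["type"] == "heat_alert",
    condition=lambda e: True,
    action=escalate,
))

for celsius in (25, 31, 33):
    router.publish({"type": "temperature", "celsius": celsius})

event_types = [e["type"] for e in router.log]
```

Because emitted events go back through the same router, a composite event such as `heat_wave` is handled exactly like a primitive one, which is what allows monitors to be layered.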

Page 42: NSF Workshop on Cyberinfrastructure for Environmental Observatories Introduction to CI Topics

D2K : A Framework For Building Data-Driven Apps – Persistent Stream Data Analytics Foundation

Designed for Building and Maintaining Complex Persistent and Stream Data-Driven Applications

http://alg.ncsa.uiuc.edu

Page 43: NSF Workshop on Cyberinfrastructure for Environmental Observatories Introduction to CI Topics

D2K/T2K/I2K: Data, Text, and Image Analysis

http://alg.ncsa.uiuc.edu

Page 44: NSF Workshop on Cyberinfrastructure for Environmental Observatories Introduction to CI Topics

LOOKING: Stream Data Analytics/Information Visualization scientific “dashboard”

• Online Stream Query Engine: uses novel methods to do real-time stream data analysis.
• Online Frequent Pattern Mining: discovers association and correlation rules in data stream environment.
• Online Stream Classification: adaptable to the changes and evolution of data streams.
• Online Clustering of Data Streams: detects outliers and finds evolution of clusters in data streams.
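To give a flavor of online pattern mining under a one-pass constraint, the Python sketch below implements the Misra-Gries frequent-items summary. This is a simpler, well-known stand-in chosen for illustration: it finds frequent single items, not association rules, and it is not the method used in LOOKING. The sample stream is invented.

```python
def misra_gries(stream, k):
    """One-pass summary using at most k - 1 counters.

    Every item that occurs more than n / k times in a stream of length n
    is guaranteed to be among the returned keys (counts are lower bounds).
    """
    counters = {}
    for item in stream:
        if item in counters:
            counters[item] += 1
        elif len(counters) < k - 1:
            counters[item] = 1
        else:
            # No free counter: decrement all, dropping those that reach zero.
            for key in list(counters):
                counters[key] -= 1
                if counters[key] == 0:
                    del counters[key]
    return counters


stream = list("abacabada")           # 'a' occurs 5 times out of 9
candidates = misra_gries(stream, k=2)
```

With `k=2` the summary keeps a single counter and surfaces any item that makes up more than half of the stream, here `a`. A second pass, where one is affordable, would confirm exact counts.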

Page 45: NSF Workshop on Cyberinfrastructure for Environmental Observatories Introduction to CI Topics

CI Issues Architecture – NEON/CLEANER

Page 46: NSF Workshop on Cyberinfrastructure for Environmental Observatories Introduction to CI Topics

Real-time Visualization of RFID people location sensors: Supercomputing IntelliBadge™

Page 47: NSF Workshop on Cyberinfrastructure for Environmental Observatories Introduction to CI Topics

Atmospheric Science: Analytic Feature Extraction Scientific Visualization Techniques

Page 48: NSF Workshop on Cyberinfrastructure for Environmental Observatories Introduction to CI Topics

LOOKING: Scientist Analytical/Spatial-temporal Visualization Techniques

Page 49: NSF Workshop on Cyberinfrastructure for Environmental Observatories Introduction to CI Topics

LOOKING/OptIPuter/Planetary Collaboratory

1024 Processor Altix

3 TB Shared Memory

>300 TeraBytes Disk

8 X 8 Processor 4 Pipe, 16 gig Memory each, Prisms coupled with InfiniBand for On-demand, Interactive

U of W

Page 50: NSF Workshop on Cyberinfrastructure for Environmental Observatories Introduction to CI Topics

NLADR Tier 1 Architecture

Page 51: NSF Workshop on Cyberinfrastructure for Environmental Observatories Introduction to CI Topics

… Data-Driven Science

• Collaboration
  • Information Gathering (experiments, simulation, observation – calendar of upcoming activities)
• Data Management
  • Generation and Publishing of Data (experiments, simulation, or observation)
  • Persistent Data Stores (Distributed Data Management)
  • Stream Data Management (Event Management)
• Detection
  • Mining of new types of data, such as large static data stores (>>1TB), streams, networks, ...
  • Behavior Characterization (atypical, surprising, normal)
• Discovery
  • Hypothesis Generation
• Collaboration
  • Focusing results for testing and validation