Page 1: Japanese & UK N+N Data, Data everywhere and … Prof. Malcolm Atkinson Director nesc.ac.uk

Japanese & UK N+N

Data, Data everywhere and …

Prof. Malcolm Atkinson, Director

www.nesc.ac.uk

3rd October 2003

Page 2

Discovery is a wonderful thing

Page 3

Web Hits - Domain

[Pie chart of web hits by domain. Legend: .ac.uk, .uk (other), unresolved, .ibm.com, .com (other), .net, .edu, .jp, .de, other. Slice labels: 47%, 4%, 15%, 17%, 4%, 9%.]

Page 4

Our job: Make the Party a Success every time

Computing Science: Systems, Notations & Formal Foundation → Process & Trust
Theory: Models & Simulations → Shared Data
Experiment & Advanced Data Collection → Shared Data

Multi-national, Multi-discipline, Computer-enabled Consortia, Cultures & Societies
Requires Much Engineering, Much Innovation
Changes Culture, New Mores, New Behaviours

Page 5

Integration is our Focus

Supporting Collaboration
  Bring together disciplines
  Bring together people engaged in shared challenge
  Inject initial energy
  Invent methods that work

Supporting Collaborative Research
  Integrate compute, storage and communications
  Deliver and sustain integrated software stack
  Operate dependable infrastructure service
  Integrate multiple data sources
  Integrate data and computation
  Integrate experiment with simulation
  Integrate visualisation and analysis

High-level tools and automation essential
Fundamental research as a foundation

Page 6

Derived from Ian Foster’s slide at SSDBM, July 2003

It’s Easy to Forget How Different 2003 is From 1993

Enormous quantities of data: Petabytes
  For an increasing number of communities
  Gating step is not collection but analysis

Ubiquitous Internet: >100 million hosts
  Collaboration & resource sharing the norm
  Security and Trust are crucial issues

Ultra-high-speed networks: >10 Gb/s
  Global optical networks
  Bottlenecks: last kilometre & firewalls

Huge quantities of computing: >100 Top/s
  Moore’s law gives us all supercomputers
  Ubiquitous computing

(Moore’s law)² everywhere
  Instruments, detectors, sensors, scanners, …

Page 7

Tera → Peta Bytes

                      Terabyte                  Petabyte
RAM time to move      15 minutes                2 months
1 Gb WAN move time    10 hours ($1000)          14 months ($1 million)
Disk Cost             7 disks = $5000 (SCSI)    6800 disks + 490 units + 32 racks = $7 million
Disk Power            100 Watts                 100 Kilowatts
Disk Weight           5.6 kg                    33 Tonnes
Disk Footprint        Inside machine            60 m²

Approximately correct as of May 2003

See also: Jim Gray, Distributed Computing Economics, Microsoft Research, MSR-TR-2003-24
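The WAN figures on this slide scale linearly with data size. A minimal sketch of that arithmetic follows, assuming decimal units (1 terabyte = 10^12 bytes, 1 petabyte = 1000 terabytes) and an average month of 730 hours. The function names are illustrative and not from the talk. It also derives the sustained throughput that "10 hours per terabyte" would imply, which is well below the nominal 1 Gb/s link rate.

```python
# Assumption: an average month is 365 * 24 / 12 = 730 hours.
HOURS_PER_MONTH = 365 * 24 / 12


def scaled_move_hours(hours_per_terabyte: float, terabytes: float) -> float:
    """Scale a measured per-terabyte transfer time linearly to a larger volume."""
    return hours_per_terabyte * terabytes


def implied_throughput_mbps(terabytes: float, hours: float) -> float:
    """Sustained throughput (megabits per second) implied by moving
    `terabytes` of data in `hours`, using decimal terabytes."""
    bits = terabytes * 1e12 * 8
    return bits / (hours * 3600) / 1e6


# Slide figure: 1 terabyte over the WAN takes about 10 hours.
petabyte_hours = scaled_move_hours(hours_per_terabyte=10, terabytes=1000)
petabyte_months = petabyte_hours / HOURS_PER_MONTH
sustained_mbps = implied_throughput_mbps(terabytes=1, hours=10)

print(f"1 PB: {petabyte_hours:.0f} hours, about {petabyte_months:.1f} months")
print(f"Implied sustained throughput: {sustained_mbps:.0f} Mb/s")
```

Under these assumptions the petabyte transfer comes out at 10,000 hours, which rounds to the 14 months quoted on the slide.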

Page 8

Dynamically Move computation to the data

Assumption: code size << data size
Develop the database philosophy for this?
  Queries are dynamically re-organised & bound
Develop the storage architecture for this?
  Compute closer to disk? System on a Chip using free space in the on-disk controller
  Data Cutter a step in this direction
Develop the sensor & simulation architectures for this?
Safe hosting of arbitrary computation
  Proof-carrying code for data and compute intensive tasks + robust hosting environments
Provision combined storage & compute resources
Decomposition of applications
  To ship behaviour-bounded sub-computations to data
Co-scheduling & co-optimisation
  Data & Code (movement), Code execution
Recovery and compensation

Dave Patterson, Seattle, SIGMOD 98
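The "code size << data size" assumption can be illustrated with a toy comparison. This is only a sketch: `DataNode`, `fetch_all`, `run` and `mean_temperature` are hypothetical names invented here, pickled byte counts stand in for network traffic, and nothing below addresses the safe-hosting or proof-carrying-code questions the slide raises.

```python
import pickle


class DataNode:
    """Hypothetical storage node that can run a caller-supplied function
    next to the data it holds."""

    def __init__(self, records):
        self._records = list(records)

    def fetch_all(self) -> bytes:
        # Move data to the computation: every record crosses the network.
        return pickle.dumps(self._records)

    def run(self, func) -> bytes:
        # Move computation to the data: only the result crosses the network.
        return pickle.dumps(func(self._records))


def mean_temperature(records):
    """Example sub-computation shipped to the node."""
    return sum(r["temp"] for r in records) / len(records)


# Synthetic data set held on the node (illustrative values only).
node = DataNode({"id": i, "temp": 10.0 + (i % 5)} for i in range(10_000))

bytes_if_data_moves = len(node.fetch_all())
reply = node.run(mean_temperature)
bytes_if_code_moves = len(reply)
result = pickle.loads(reply)

print(f"data shipped: {bytes_if_data_moves} bytes")
print(f"result shipped: {bytes_if_code_moves} bytes, mean = {result}")
```

The gap between the two byte counts grows with the size of the data set, while the cost of shipping the function stays roughly constant.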

Page 9

[Figure: layered architecture diagram. Labels recovered from the figure:]

Data Intensive X Scientists
Data Intensive Applications for Science X
Simulation, Analysis & Integration Technology for Science X
Virtual Integration Architecture
  Generic Virtual Data Access and Integration Layer
OGSA
  Structured Data Integration, Structured Data Access, Transformation
  Registry, Job Submission, Data Transport, Resource Usage, Banking, Brokering, Workflow, Authorisation
Infrastructure Architecture
  OGSI: Interface to Grid Infrastructure
  Compute, Data & Storage Resources (Distributed)
Structured Data: Relational, XML, Semi-structured

Page 10

Data Access & Integration Services

1a. Request to Registry for sources of data about “x”
1b. Registry responds with Factory handle
2a. Request to Factory for access to database
2b. Factory creates GridDataService to manage access
2c. Factory returns handle of GDS to client
3a. Client queries GDS with XPath, SQL, etc
3b. GDS interacts with database
3c. Results of query returned to client as XML

[Diagram components: Client, Registry, Factory, Grid Data Service, XML / Relational database. Legend: SOAP/HTTP, service creation, API interactions.]
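The Registry → Factory → Grid Data Service sequence above can be sketched in a few lines. This is an in-process illustration only, not the OGSA-DAI API: the class and method names (`Registry.find_factory`, `Factory.create_service`, `GridDataService.perform`) are hypothetical, plain Python object references stand in for service handles and SOAP/HTTP calls, and an in-memory SQLite table with made-up rows stands in for the database.

```python
import sqlite3
import xml.etree.ElementTree as ET


class GridDataService:
    """Hypothetical stand-in for a GDS: manages access to one database."""

    def __init__(self, connection):
        self._connection = connection

    def perform(self, sql, params=()):
        # 3b. The GDS interacts with the database on the client's behalf.
        cursor = self._connection.execute(sql, params)
        columns = [description[0] for description in cursor.description]
        # 3c. Results of the query are returned to the client as XML.
        root = ET.Element("results")
        for row in cursor:
            row_element = ET.SubElement(root, "row")
            for name, value in zip(columns, row):
                ET.SubElement(row_element, name).text = str(value)
        return ET.tostring(root, encoding="unicode")


class Factory:
    """Hypothetical factory: creates a GDS per request (steps 2b and 2c)."""

    def __init__(self, connect):
        self._connect = connect

    def create_service(self):
        return GridDataService(self._connect())


class Registry:
    """Hypothetical registry: maps a topic to a factory (steps 1a and 1b)."""

    def __init__(self):
        self._factories = {}

    def register(self, topic, factory):
        self._factories[topic] = factory

    def find_factory(self, topic):
        return self._factories[topic]


def open_demo_database():
    # Illustrative data only.
    connection = sqlite3.connect(":memory:")
    connection.execute("CREATE TABLE stars (name TEXT, magnitude REAL)")
    connection.executemany(
        "INSERT INTO stars VALUES (?, ?)", [("Sirius", -1.46), ("Vega", 0.03)]
    )
    return connection


registry = Registry()
registry.register("x", Factory(open_demo_database))

factory = registry.find_factory("x")   # 1a, 1b
gds = factory.create_service()         # 2a, 2b, 2c
xml_result = gds.perform(              # 3a
    "SELECT name, magnitude FROM stars WHERE magnitude < ?", (0,)
)
print(xml_result)
```

Creating a fresh service per request keeps each client's database session separate, which is one plausible reading of why a factory sits between the registry and the data service.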

Page 11

Future DAI Services

1a. Request to Registry for sources of data about “x” & “y”
1b. Registry responds with Factory handle
2a. Request to Factory for access and integration from resources Sx and Sy
2b. Factory creates GridDataServices network
2c. Factory returns handle of GDS to client
3a. Client submits sequence of scripts, each with a set of queries to GDS with XPath, SQL, etc
3b. Client tells analyst
3c. Sequences of result sets returned to analyst as formatted binary described in a standard XML notation

[Diagram components: Client, Analyst, Data Registry, Data Access & Integration master, GDS1, GDS2, GDS3, GDTS1, GDTS2, Sx, Sy, XML database, Relational database, Problem Solving Environment, Semantic Meta data, Application Code. Annotations: “scientific” Application coding, scientific insights. Legend: SOAP/HTTP, service creation, API interactions.]
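Step 3c mentions result sets returned as formatted binary described in an XML notation. The sketch below shows one way such a pairing could look. The descriptor format (`dataset` and `field` elements, struct type codes) is invented here for illustration and is not the standard notation the slide refers to.

```python
import struct
import xml.etree.ElementTree as ET

# Hypothetical result-set layout: (field name, struct type code).
FIELDS = [("id", "i"), ("value", "d")]
BYTE_ORDER_PREFIX = {"little": "<", "big": ">"}


def describe(fields, row_count, byte_order="little"):
    """Build an XML description of the binary layout."""
    root = ET.Element("dataset", byteOrder=byte_order, rows=str(row_count))
    for name, code in fields:
        ET.SubElement(root, "field", name=name, type=code)
    return ET.tostring(root, encoding="unicode")


def encode(rows, fields, byte_order="little"):
    """Pack rows into a compact binary payload with no padding."""
    fmt = BYTE_ORDER_PREFIX[byte_order] + "".join(code for _, code in fields)
    return b"".join(struct.pack(fmt, *row) for row in rows)


def decode(payload, description):
    """Recover rows from the payload using only the XML description."""
    root = ET.fromstring(description)
    field_elements = root.findall("field")
    names = [element.get("name") for element in field_elements]
    fmt = BYTE_ORDER_PREFIX[root.get("byteOrder")] + "".join(
        element.get("type") for element in field_elements
    )
    return [dict(zip(names, values)) for values in struct.iter_unpack(fmt, payload)]


rows = [(1, 0.5), (2, 1.25)]
description = describe(FIELDS, len(rows))
payload = encode(rows, FIELDS)
decoded = decode(payload, description)
print(description)
print(len(payload), decoded)
```

The analyst's side needs only the description and the payload, so the binary stays compact while remaining self-describing.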

Page 12

A New World

What Architecture will Enable Data & Computation Integration?
  Common Conceptual Models
  Common Planning & Optimisation
  Common Enactment of Workflows
  Common Debugging
  …

What Fundamental CS is needed?
  Trustworthy code & Trustworthy evaluators
  Decomposition and Recomposition of Applications
  …

Is there an evolutionary path?

Page 13

Take Home Message

Information Grids
  Support for collaboration
  Support for computation and data grids
  Structured data fundamental
    Relations, XML, semi-structured, files, …
  Integrated strategies & technologies needed

OGSA-DAI is here now
  A first step
  Try it
  Tell us what is needed to make it better
  Join in making better DAI services & standards

Page 14

NeSC in the UK

[Map of the UK. Labels: Cambridge, Newcastle, Edinburgh, Oxford, Glasgow, Manchester, Cardiff, Southampton, London, Belfast, Daresbury Lab, RAL, Hinxton, National e-Science Centre, HPC(x).]

Directors’ Forum
  Helped build a community
Engineering Task Force
Grid Support Centre
Architecture Task Force
  UK Adoption of OGSA
  OGSA Grid Market
  Workflow Management
Database Task Force
  OGSA-DAI
  GGF DAIS-WG
GridNet
e-Storm
Globus Alliance

Page 15

www.nesc.ac.uk