
Software Defined Services (SDS) For High Performance Large Scale Science Data Streams Across 100 Gbps WANs

Joe Mambretti, Director (j-mambretti@northwestern.edu)
International Center for Advanced Internet Research (www.icair.org)
Northwestern University
Director, Metropolitan Research and Education Network (www.mren.org)
Co-Director, StarLight; PI StarLight SDX; PI-iGENI; PI-OMNINet
(www.startap.net/starlight)

The International Conference for High Performance Computing, Networking, Storage and Analysis (SC16)
Salt Lake City, Utah
November 13-17, 2016

• Sloan Digital Sky Survey (www.sdss.org)
• Globus Alliance (www.globus.org)
• LIGO (www.ligo.org)
• TeraGrid (www.teragrid.org)
• ALMA: Atacama Large Millimeter Array (www.alma.nrao.edu)
• CAMERA metagenomics (camera.calit2.net)
• Comprehensive Large-Array Stewardship System (www.class.noaa.gov)
• DØ (DZero) (www-d0.fnal.gov)
• ISS: International Space Station (www.nasa.gov/station)
• IVOA: International Virtual Observatory (www.ivoa.net)
• BIRN: Biomedical Informatics Research Network (www.nbirn.net)
• GEON: Geosciences Network (www.geongrid.org)
• ANDRILL: Antarctic Geological Drilling (www.andrill.org)
• GLEON: Global Lake Ecological Observatory Network (www.gleon.org)
• Pacific Rim Applications and Grid Middleware Assembly (www.pragma-grid.net)
• CineGrid (www.cinegrid.org)
• Carbon Tracker (www.esrl.noaa.gov/gmd/ccgg/carbontracker)
• XSEDE (www.xsede.org)
• LHCONE (www.lhcone.net)
• WLCG (lcg.web.cern.ch/LCG/public/)
• OOI-CI (ci.oceanobservatories.org)
• OSG (www.opensciencegrid.org)
• SKA (www.skatelescope.org)
• NG Digital Sky Survey
• ATLAS

Compilation By Maxine Brown

Macro Network Science Themes

• Transition From Legacy Networks To Networks That Take Full Advantage of IT Architecture and Technology
• Extremely Large Capacity (Multi-Tbps Streams)
• High Degrees of Communication Services Customization
• Highly Programmable Networks
• Network Facilities As Enabling Platforms for Any Type of Service
• Network Virtualization
• Highly Distributed Processes

National Science Foundation’s Global Environment for Network Innovations (GENI)

• GENI Is Funded By The National Science Foundation’s Directorate for Computer and Information Science and Engineering (CISE).
• GENI Is a Virtual Laboratory For Exploring Future Internets At Scale.
• GENI Is Similar To Instruments Used By Other Science Disciplines, e.g., Astronomers’ Telescopes, HEP Synchrotrons.
• GENI Creates Major Opportunities To Understand, Innovate and Transform Global Networks and Their Interactions with Society.
• GENI Is Dynamic and Adaptive.
• GENI Opens Up New Areas of Research at the Frontiers of Network Science and Engineering, and Increases the Opportunity for Significant Socio-Economic Impact.

Global Environment for Network Innovations (GENI)

• GENI:
– Supports At-Scale Experimentation on Shared, Heterogeneous, Highly Instrumented Infrastructure
– Enables Deep Programmability Throughout the Network
– Promotes Innovations in Network Science, Security, Technologies, Services and Applications
– Provides Collaborative and Exploratory Environments for Academia, Industry and the Public to Catalyze Groundbreaking Discoveries and Innovation

Future Cyberinfrastructure

• Large Scale Highly Distributed Infrastructure That Can Support Multiple Empirical Research Testbeds At Scale
• Next Generation GENI, Edge Clouds, IoT, US Ignite, Platform for Advanced Wireless Research (PAWR) and Many Others
• Currently Being Planned – Will Be Designed, Implemented and Operated By Researchers for Researchers

National Science Foundation Global Environment for Network Innovations

International 40G and 100G ExoGENI Testbed

ORCA Semantics

Chapter: Creating a Worldwide Network For The Global Environment for Network Innovations (GENI) and Related Experimental Environments

iGENI: The International GENI

• The iGENI Initiative Will Design, Develop, Implement, and Operate a Major New National and International Distributed Infrastructure.

• iGENI Will Place the “G” in GENI, Making GENI Truly Global.

• iGENI Will Be a Unique Distributed Infrastructure Supporting Research and Development for Next-Generation Network Communication Services and Technologies.

• This Infrastructure Will Be Integrated With Current and Planned GENI Resources, and Operated for Use by GENI Researchers Conducting Experiments that Involve Multiple Aggregates At Multiple Sites.

• iGENI Infrastructure Will Connect Its Resources With Current GENI National Backbone Transport Resources, With Current and Planned GENI Regional Transport Resources, and With International Research Networks and Projects.

The Global Lambda Integrated Facility: A Global Programmable Resource

SDX: StarLight–NetherLight
Ronald van der Pol, Joe Mambretti, Jim Chen, John Shillington

SCinet Tbps Topology, By Azher Mughal, Caltech

DTN Flows @ 100 Gbps => Compute Canada – CANARIE – StarLight – SC16

SCinet WAN Topology, By Azher Mughal, Caltech

Precision Medicine

• Precisely match treatments to patients and their specific disease.
• Genomic data promises optimal matching.
• 1.7 million cancer cases are diagnosed in America each year.
• A single RNA-seq file is 10-20 GB; whole genome raw data files are > 100 GB.
• Analysis has become the bottleneck, and data size is an issue.
• 2,000,000 genomes ≈ 1 Exabyte (1,000,000,000,000 MB); see the sketch after this list.
• Cost to sequence 1 genome: less than $5,000 and falling fast.
• Cost to analyze 1 genome: approx. $100,000 and rising.
• A key step toward algorithm-assisted personalized medicine is building Data Commons/Cloud analytics and *Programmable* Networks & Communication Exchanges (SDXs) for high performance, flexible data transport.
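As a back-of-the-envelope check of the arithmetic above, a minimal sketch in Python. The ~500 GB average per genome is an assumption inferred from the 2,000,000-genomes ≈ 1 EB figure (the slide states only that raw whole-genome files exceed 100 GB); the transfer-time line is added to connect the data volume to the 100 Gbps WANs this talk is about.

# Back-of-the-envelope check of the storage figures on this slide, plus the
# time to move that data over a single 100 Gbps WAN path. The ~500 GB/genome
# average is an assumption inferred from "2,000,000 genomes ≈ 1 Exabyte";
# it is not stated explicitly in the source.

GENOMES = 2_000_000
GB_PER_GENOME = 500                 # assumed average (raw WGS is > 100 GB)
BYTES_PER_GB = 10**9                # decimal units, matching the slide

total_bytes = GENOMES * GB_PER_GENOME * BYTES_PER_GB
print(f"Total: {total_bytes / 10**18:.1f} EB")                      # 1.0 EB

link_bps = 100 * 10**9              # one 100 Gbps path at ideal throughput
seconds = total_bytes * 8 / link_bps
print(f"Transfer at 100 Gbps: {seconds / 86400 / 365:.1f} years")   # ~2.5 years

Even at an ideal, uncontended 100 Gbps, moving a full exabyte takes years, which is why the talk emphasizes moving computation to data commons rather than shipping all the data.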

[Diagram: patients, hospitals and doctors feed a Genomic Data Commons backed by cloud computation; the output is data-aware, analytics-informed diagnosis, prognosis and optimal treatment.]

Future: The Virtual Comprehensive Cancer Center

Genomic Data Commons Data Transfer

BI Data Flow Visualization (Inbound-Outbound), From SDSC To UoC

NCI Genomic Data Commons

• Harmonization and Storage for the Nation’s Cancer Genomic Data: the GDC Holds 1.6 PB of Cancer Genomic Data and Associated Clinical Data.

• Precision Medicine Enabled By Precision Networking

Bionimbus Protected Data Cloud

• Petabyte-scale, secure, compliant biomedical cloud that interoperates with dbGaP controlled-access data at NIH.

Biomedical Data Commons:
Flow Orchestration: Control Plane + Data Plane

[Diagram: a control plane orchestrates data-plane flows among Data Repository A (West Coast), Data Repository B (South), Data Repository C (Asia) and Data Repository D (Europe), compute engines (Midwest, North America) and visualization engines.]

Parcel Based Collaboration

[Diagram: object storage and local compute at the “local” location (University of Chicago, US) exchange data with remote compute at the “remote” location (ASGC, Taiwan); parcel-tcp2udt and parcel-udt2tcp proxies translate between TCP at the client and server endpoints and UDT across the WAN, with remote stats and visualizations reported back. A minimal sketch of this proxy pattern follows below.]

• Client identifies and retrieves data with ARK ids.
• Compute is performed remotely.
• Results are pushed back to remote storage and an ARK id is assigned.
• Visualizations are produced locally.

Source: Kevin P. Keegan, Bob Grossman
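The parcel-tcp2udt / parcel-udt2tcp pair above is a relay pattern: applications speak ordinary TCP at each end, while the long-haul leg runs UDT, which tolerates high bandwidth-delay paths far better than stock TCP. Below is a minimal sketch of that relay structure, not the actual parcel implementation: it uses plain TCP on both legs (standard Python has no UDT binding) and illustrative host names and ports.

# Minimal sketch of the parcel proxy pattern: a local relay accepts TCP from
# an application and splices the byte stream onto a WAN leg toward a far-side
# relay, which hands it to the destination over TCP. In the real
# parcel-tcp2udt / parcel-udt2tcp tools the WAN leg runs UDT, not TCP; plain
# TCP is used here only because standard Python has no UDT binding.
# All hosts and ports below are illustrative.
import socket
import threading

def pipe(src: socket.socket, dst: socket.socket) -> None:
    """Copy bytes from src to dst until EOF, then half-close dst."""
    while chunk := src.recv(65536):
        dst.sendall(chunk)
    dst.shutdown(socket.SHUT_WR)

def relay(listen_port: int, wan_host: str, wan_port: int) -> None:
    """Accept local TCP connections and splice each onto the WAN leg."""
    srv = socket.create_server(("", listen_port))
    while True:
        local, _ = srv.accept()
        wan = socket.create_connection((wan_host, wan_port))
        # One thread per direction keeps the splice full-duplex.
        threading.Thread(target=pipe, args=(local, wan), daemon=True).start()
        threading.Thread(target=pipe, args=(wan, local), daemon=True).start()

if __name__ == "__main__":
    # e.g., applications connect to localhost:9000; a matching relay at the
    # remote site forwards the stream to the destination service there.
    relay(9000, "relay.remote.example.org", 9000)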

Software Defined Networking Exchanges (SDXs): Motivations

• With the Increasing Deployment of SDN In Production Networks, the Need for an SDN Exchange (SDX) Has Been Recognized.
• Current SDN Architectures/Protocols/Technologies Are Single-Domain, Centralized-Controller Oriented.
• SDXs Are Required To Interconnect Increasing Numbers Of SDN “Islands.”
• SDXs Provide Highly Granulated Views Into (and Control Over) All Flows Within the Exchange, i.e., Much Enhanced Traffic Engineering and Optimization.
• Options for Many New Types of Services and Capabilities, e.g., E2E Encryption, Ultra High Resolution Digital Media, Support for Data Intensive Science
• WH Office of Science and Technology Policy – Large Scale Science Instrumentation
• Democratization Of Exchange Facilities – Options for Edge Control Over Exchange Flows (a hypothetical policy sketch follows this list)
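To make “edge control over exchange flows” concrete, here is a purely hypothetical sketch of the kind of per-participant flow-policy table an SDX might expose: each edge network registers match rules for its own flows, and the exchange resolves a flow to an action such as path selection or rate limiting. The class, fields, rule syntax and addresses are illustrative and are not taken from any real SDX controller.

# Hypothetical per-participant flow policy at an SDX. Names, fields and
# action labels are illustrative only; addresses are RFC 5737 documentation
# prefixes.
from dataclasses import dataclass
from ipaddress import IPv4Network, ip_address, ip_network

@dataclass
class Rule:
    src: IPv4Network     # prefix the participant controls
    dst: IPv4Network
    action: str          # e.g., a path selection or a rate limit

RULES = [
    Rule(ip_network("192.0.2.0/24"), ip_network("198.51.100.0/24"),
         "path:transatlantic-100g"),
    Rule(ip_network("192.0.2.0/24"), ip_network("0.0.0.0/0"), "rate:10g"),
]

def resolve(src_ip: str, dst_ip: str) -> str:
    """Return the first matching participant action, else default forwarding."""
    for rule in RULES:
        if ip_address(src_ip) in rule.src and ip_address(dst_ip) in rule.dst:
            return rule.action
    return "default-forwarding"

print(resolve("192.0.2.10", "198.51.100.7"))  # path:transatlantic-100g
print(resolve("192.0.2.10", "203.0.113.5"))   # rate:10g

The point of the sketch is the delegation model: policy is authored at the edge, per participant, rather than only by the exchange operator.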

E2E Services Based On Open Architecture For Petascale Sciences (OAPS)

• Integrating Science Workflows With Foundation Infrastructure Workflows (Orchestration)
• Providing Built-In Preconfigured Examples/Templates To Establish Infrastructure Foundation Workflows (see the sketch after this list)
• Providing Zero-Touch “Playbooks” For Different Segments of Infrastructure Foundation Workflows After Running The 1st Suite
• Supporting Interactive Control Over Running Workflows
• Providing Portability for Different Infrastructure Foundation Workflows
• Providing Capabilities for Experiment Reproducibility
• Providing Options For Real Time Scientific Visualization With Standard Tools
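As a purely illustrative sketch of the “preconfigured templates” and “zero-touch playbooks” ideas above: a workflow template expressed as data, executed step by step without operator intervention. The step names and parameters below are hypothetical and are not drawn from OAPS itself.

# Illustrative sketch only: a preconfigured infrastructure-workflow template
# in the spirit of the OAPS bullets above. Step names and fields are
# hypothetical, not from OAPS.
TEMPLATE = {
    "name": "100g-dtn-transfer",
    "steps": [
        {"step": "provision-path", "params": {"bandwidth_gbps": 100}},
        {"step": "configure-dtn",  "params": {"tuning": "science-dmz"}},
        {"step": "run-transfer",   "params": {"tool": "parcel"}},
        {"step": "teardown-path",  "params": {}},
    ],
}

def run(template: dict) -> None:
    """Zero-touch execution: walk the template's steps in order."""
    for step in template["steps"]:
        print(f"executing {step['step']} with {step['params']}")

run(TEMPLATE)

Keeping the workflow as data is what makes the other bullets possible: the same template can be ported across infrastructures, re-run for reproducibility, and inspected or modified interactively while running.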

IRNC: RXP: StarLight SDX: A Software Defined Networking Exchange for Global Science Research and Education

Joe Mambretti, Director (j-mambretti@northwestern.edu)
International Center for Advanced Internet Research (www.icair.org)
Northwestern University
Director, Metropolitan Research and Education Network (www.mren.org)
Co-Director, StarLight (www.startap.net/starlight)
PI IRNC: RXP: StarLight SDX

Co-PI Tom DeFanti, Research Scientist (tdefanti@soe.ucsd.edu)
California Institute for Telecommunications and Information Technology (Calit2), University of California, San Diego
Co-Director, StarLight

Co-PI Maxine Brown, Director (maxine@uic.edu)
Electronic Visualization Laboratory, University of Illinois at Chicago
Co-Director, StarLight

Co-PI Jim Chen, Associate Director, International Center for Advanced Internet Research, Northwestern University

National Science Foundation International Research Network Connections Program Workshop
Chicago, Illinois
May 15, 2015

Another SDX Opportunity! An Experimental Testbed For Computer Science Research

[Diagram: Planned US SDX Interoperable Fabric, interconnecting multiple IRNC SDXs, the GENI SDX and the UoM SDX; the fabric will be contiguous to the StarLight SDX.]

Global Research Platform

• An Emerging International Fabric
• A Specialized Globally Distributed Platform For Science Discovery and Innovation
• Based On State-Of-The-Art Clouds
• Interconnected With Computational Grids, Supercomputing Centers, Specialized Instruments, et al.
• Also Based On World-Wide 100 Gbps Networks
• Leveraging Advanced Architectural Concepts, e.g., SDN/SDX/SDI – Science DMZs
• Ref: Demonstrations @ SC15, Austin, Texas, November 2015
• New => Global Research Platform 100 Gbps Network (GRPnet) On Private Optical Fiber Between Pacific Wave and StarLight via the PNWGP

www.startap.net/starlight

Thanks to the NSF, DOE, DARPA, Universities, National Labs, International Partners, and Other Supporters
