
“The Pacific Research Platform”

CISE Distinguished Lecture, National Science Foundation

Arlington, VA, January 28, 2016

Dr. Larry Smarr
Director, California Institute for Telecommunications and Information Technology
Harry E. Gruber Professor, Dept. of Computer Science and Engineering
Jacobs School of Engineering, UCSD

http://lsmarr.calit2.net

Abstract

Research in data-intensive fields is increasingly multi-investigator and multi-campus, depending on ever more rapid access to ultra-large heterogeneous and widely distributed datasets. The Pacific Research Platform (PRP) is a multi-institutional extensible deployment that establishes a science-driven high-capacity data-centric “freeway system.” The PRP spans all 10 campuses of the University of California, as well as the major California private research universities, four supercomputer centers, and several universities outside California. Fifteen multi-campus data-intensive application teams act as drivers of the PRP, providing feedback to the technical design staff over the five-year project. These application areas include particle physics, astronomy/astrophysics, earth sciences, biomedicine, and scalable multimedia, providing models for many other applications. The PRP partnership extends the NSF-funded campus cyberinfrastructure awards to a regional model that allows high-speed data-intensive networking, facilitating researchers in moving data between their labs and their collaborators’ sites, supercomputer centers, or data repositories, and enabling that data to traverse multiple heterogeneous networks without performance degradation over campus, regional, national, and international distances.

Vision: Creating a West Coast “Big Data Freeway” Connected by CENIC/Pacific Wave to Internet2 & GLIF

Use Lightpaths to Connect All Data Generators and Consumers, Creating a “Big Data” Freeway Integrated with High-Performance Global Networks

“The Bisection Bandwidth of a Cluster Interconnect, but Deployed on a 20-Campus Scale.”

This Vision Has Been Building for 25 Years

Launching the Nation’s Information Infrastructure: NSFnet Supernetwork and the Six NSF Supercomputer Centers

[Map: NSFNET 56 kb/s backbone (1986-88), linking NCSA, PSC, NCAR, CTC, JVNC, and SDSC]

Supernetwork Backbone: 56 kbps is 50 Times Faster than a 1200 bps PC Modem!

Interactive Supercomputing End-to-End Prototype: Using Analog Communications to Prototype the Fiber Optic Future

“We’re using satellite technology…to demo what it might be like to have high-speed fiber-optic links between advanced computers in two different geographic locations.” ― Al Gore, Senator

Chair, US Senate Subcommittee on Science, Technology and Space

[Demo sites: Illinois and Boston]

SIGGRAPH 1989
“What we really have to do is eliminate distance between individuals who want to interact with other people and with other computers.” ― Larry Smarr, Director, NCSA

NSF’s OptIPuter Project: Using Supernetworks to Meet the Needs of Data-Intensive Researchers

OptIPortal – Termination Device for the OptIPuter Global Backplane

Calit2 (UCSD, UCI), SDSC, and UIC Leads — Larry Smarr, PI
Univ. Partners: NCSA, USC, SDSU, NW, TA&M, UvA, SARA, KISTI, AIST

Industry: IBM, Sun, Telcordia, Chiaro, Calient, Glimmerglass, Lucent

2003-2009 $13,500,000

In August 2003, Jason Leigh and his students used RBUDP to blast data from NCSA to SDSC over the TeraGrid DTFnet, achieving 18 Gbps file transfer out of the available 20 Gbps.

LS Slide 2005
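RBUDP (Reliable Blast UDP) reaches such rates by decoupling bulk transport from reliability: the sender streams UDP datagrams at a fixed rate while the receiver tracks which sequence numbers arrived, and a TCP side channel carries the missing-sequence list back for retransmission. A minimal Python sketch of the receiver-side bookkeeping (hostname, ports, and payload size are illustrative assumptions, not the EVL implementation):

```python
import socket

PAYLOAD = 8192   # file bytes per datagram (illustrative)
SEQ_BYTES = 4    # sequence number prefixed to each datagram

def receive_blast(udp_port, tcp_port, total_packets):
    """One RBUDP-style round: collect UDP datagrams, then report
    missing sequence numbers back over a reliable TCP side channel."""
    udp = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    udp.bind(("", udp_port))
    udp.settimeout(1.0)              # a quiet line marks the end of the blast

    received = {}                    # seq -> payload
    while len(received) < total_packets:
        try:
            data, _ = udp.recvfrom(SEQ_BYTES + PAYLOAD)
        except socket.timeout:
            break                    # sender finished this round
        seq = int.from_bytes(data[:SEQ_BYTES], "big")
        received[seq] = data[SEQ_BYTES:]

    # Request retransmission of lost datagrams; real RBUDP repeats
    # blast/report rounds until nothing is missing.
    missing = [s for s in range(total_packets) if s not in received]
    tcp = socket.create_connection(("sender.example.org", tcp_port))  # hypothetical host
    tcp.sendall((",".join(map(str, missing)) if missing else "DONE").encode())
    tcp.close()
    return received, missing
```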

Integrated “OptIPlatform” Cyberinfrastructure System: A 10Gbps Lightpath Cloud

[Diagram: a 10G lightpath cloud over National LambdaRail connecting campus optical switches, data repositories & clusters, HPC, HD/4K video cameras and images, HD/4K telepresence, instruments, and end-user OptIPortals]

LS 2009 Slide

Two New Calit2 Buildings Provide New Laboratories for “Living in the Future”

• “Convergence” Laboratory Facilities
  – Nanotech, BioMEMS, Chips, Radio, Photonics
  – Virtual Reality, Digital Cinema, HDTV, Gaming
• Over 1000 Researchers in Two Buildings
  – Linked via Dedicated Optical Networks

UC Irvine
www.calit2.net

Preparing for a World in Which Distance is Eliminated…

So Why Don’t We Have a National Big Data Cyberinfrastructure?

“Research is being stalled by ‘information overload,’ Mr. Bement said, because data from digital instruments are piling up far faster than researchers can study them. In particular, he said, campus networks need to be improved. High-speed data lines crossing the nation are the equivalent of six-lane superhighways, he said. But networks at colleges and universities are not so capable. ‘Those massive conduits are reduced to two-lane roads at most college and university campuses,’ he said. Improving cyberinfrastructure, he said, ‘will transform the capabilities of campus-based scientists.’”
– Arden Bement, Director of the National Science Foundation, May 2005

The “Golden Spike” UCSD Experimental Optical Core: NSF Quartzite Grant

NSF Quartzite Grant 2004-2007, Phil Papadopoulos, PI

The “Golden Spike” UCSD Experimental Optical Core: Ready to Couple Users to CENIC L1, L2, L3 Services

Source: Phil Papadopoulos, SDSC/Calit2 (Quartzite PI, OptIPuter co-PI)

Funded by NSF MRI Grant

[Diagram: Lucent and Glimmerglass optical switches, a Force10 packet switch, and a Cisco 6509, with the OptIPuter border router connecting to CENIC L1, L2 services]

Goals by 2008:
• >= 60 endpoints at 10 GigE
• >= 30 packet switched
• >= 30 switched wavelengths
• >= 400 connected endpoints

Approximately 0.5 Tbps Arrives at the “Optical” Center of the Hybrid Campus Switch

Rapid Evolution of 10GbE Port Prices Makes Campus-Scale 10Gbps CI Affordable

2005: $80K/port, Chiaro (60 max)
2007: $5K/port, Force10 (40 max)
2009: $500/port, Arista (48 ports); ~$1,000/port (300+ max)
2010: $400/port, Arista (48 ports)

• Port Pricing is Falling
• Density is Rising – Dramatically
• Cost of 10GbE Approaching Cluster HPC Interconnects

Source: Philip Papadopoulos, SDSC/Calit2

DOE ESnet’s Science DMZ: A Scalable Network Design Model for Optimizing Science Data Transfers

• A Science DMZ integrates 4 key concepts into a unified whole:
  – A network architecture designed for high-performance applications, with the science network distinct from the general-purpose network
  – The use of dedicated systems for data transfer
  – Performance measurement and network testing systems that are regularly used to characterize and troubleshoot the network (see the sketch below)
  – Security policies and enforcement mechanisms that are tailored for high-performance science environments

http://fasterdata.es.net/science-dmz/ (“Science DMZ” coined 2010)
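In practice the measurement role is filled by dedicated tools such as perfSONAR and iperf3 running on the data transfer nodes; the toy Python sketch below shows only the underlying idea of a memory-to-memory throughput probe (port number is an arbitrary choice here, and this is not the ESnet tooling):

```python
import socket, time

CHUNK = 4 * 1024 * 1024        # 4 MiB buffer per send/recv

def probe_server(port=5201):
    """Sink bytes from one client and report achieved throughput."""
    srv = socket.socket()
    srv.bind(("", port))
    srv.listen(1)
    conn, addr = srv.accept()
    total, start = 0, time.monotonic()
    while True:
        data = conn.recv(CHUNK)
        if not data:               # client closed: test over
            break
        total += len(data)
    secs = time.monotonic() - start
    print(f"{addr[0]}: {total * 8 / secs / 1e9:.2f} Gbit/s over {secs:.1f}s")

def probe_client(host, seconds=10, port=5201):
    """Blast zero-filled buffers at the server for a fixed interval."""
    buf = bytes(CHUNK)
    sock = socket.create_connection((host, port))
    end = time.monotonic() + seconds
    while time.monotonic() < end:
        sock.sendall(buf)
    sock.close()
```

Memory-to-memory tests like this isolate the network path from disk I/O, which is why Science DMZ troubleshooting typically starts with them.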

The DOE ESnet Science DMZ and the NSF “Campus Bridging” Taskforce Report Formed the Basis for the NSF Campus Cyberinfrastructure Network Infrastructure and Engineering (CC-NIE) Program

Based on Community Input and on ESnet’s Science DMZ Concept, NSF Has Funded Over 100 Campuses to Build Local Big Data Freeways

Map legend: Red = 2012 CC-NIE Awardees; Yellow = 2013 CC-NIE Awardees; Green = 2014 CC*IIE Awardees; Blue = 2015 CC*DNI Awardees; Purple = Multiple-Time Awardees

Source: NSF

The Pacific Research Platform Creates a Regional End-to-End Science-Driven “Big Data Freeway System”

NSF CC*DNI Grant, $5M, 10/2015-10/2020

PI: Larry Smarr, UC San Diego Calit2
Co-PIs:
• Camille Crittenden, UC Berkeley CITRIS
• Tom DeFanti, UC San Diego Calit2
• Philip Papadopoulos, UC San Diego SDSC
• Frank Wuerthwein, UC San Diego Physics and SDSC

Creating a “Big Data” Freeway on Campus: NSF-Funded CC-NIE Grants Prism@UCSD and CHERuB

Prism@UCSD: Phil Papadopoulos, SDSC/Calit2, PI (2013-15)
CHERuB: Mike Norman, SDSC, PI

Big Data Compute/Storage Facility – Interconnected at Over 1 Trillion Bits Per Second

[Diagram: CHERuB links the SDSC facility; the Comet (2 PF) and Gordon supercomputers and the 6,000 TB Data Oasis store (> 800 Gbps) attach over parallel 10 Gbps optical light paths, 128 x 10 Gbps ≈ 1.3 Tbps to the SDSC supercomputers; an Arista router can switch 576 10 Gbps light paths]

Source: Philip Papadopoulos, SDSC/Calit2

FIONA – Flash I/O Network Appliance: Linux PCs Optimized for Big Data

FIONAs Are Science DMZ Data Transfer Nodes & Optical Network Termination Devices

UCSD CC-NIE Prism Award & UCOP: Phil Papadopoulos & Tom DeFanti
UCOP rack-mount build: Joe Keefe & John Graham

Specs for the two builds ($8,000 / $20,000):
• CPU (Intel Xeon Haswell multicore): E5-1650 v3 6-core / 2x E5-2697 v3 14-core
• RAM: 128 GB / 256 GB
• SSD: SATA 3.8 TB / SATA 3.8 TB
• Network interface: 10/40GbE Mellanox / 2x 40GbE Chelsio+Mellanox
• GPU: NVIDIA Tesla K80 (larger build)
• RAID drives: 0 to 112 TB (add ~$100/TB)

John Graham, Calit2’s QI

How Prism@UCSD Transforms Big Data Microbiome Science – PRP Does This on a Sub-National Scale

[Diagram: a FIONA (12 cores/GPU, 128 GB RAM, 3.5 TB SSD, 48 TB disk, 10 Gbps NIC) in the Knight Lab connects through Prism@UCSD to Gordon, the Knight 1024 cluster in the SDSC co-lo, Data Oasis (7.5 PB, 200 GB/s), CHERuB (100 Gbps), and Emperor and other vis tools on a 64-Mpixel data analysis wall; link rates of 10, 40, and 120 Gbps, with a 1.3 Tbps core]

FIONAs as Uniform DTN End Points

[Map, as of October 2015: existing DTNs and FIONA DTNs across PRP sites; UC FIONAs funded by UCOP “Momentum” grant]

Ten-Week Sprint to Demonstrate the West Coast Big Data Freeway System: PRPv0

Presented at CENIC 2015, March 9, 2015

FIONA DTNs Now Deployed to All UC Campuses and Most PRP Sites

PRP Allows for Multiple Secure Independent Cooperating Research Groups

• Any Particular Science Driver Comprises Scientists and Resources at a Subset of Campuses and Resource Centers

• We Term These Science Teams, Together with the Resources and Instruments They Access, Cooperating Research Groups (CRGs)

• Members of a Specific CRG Trust One Another, But They Do Not Necessarily Trust Other CRGs

PRP Timeline

• PRPv1 (Years 1 and 2)
  – A Layer 3 System, Completed in 2 Years
  – Tested, Measured, Optimized, with Multi-Domain Science Data
  – Bring Many of Our Science Teams Up
  – Each Community Thus Will Have Its Own Certificate-Based Access to Its Specific Federated Data Infrastructure
• PRPv2 (Years 3 to 5)
  – Advanced IPv6-Only Version with Robust Security Features, e.g. Trusted Platform Module Hardware and SDN/SDX Software
  – Support Rates up to 100 Gb/s in Bursts and Streams
  – Develop Means to Operate a Shared Federation of Caches

Pacific Research Platform Multi-Campus Science Driver Teams

• Jupyter Hub
• Biomedical
  – Cancer Genomics Hub/Browser
  – Microbiome and Integrative ‘Omics
  – Integrative Structural Biology
• Earth Sciences
  – Data Analysis and Simulation for Earthquakes and Natural Disasters
  – Climate Modeling: NCAR/UCAR
  – California/Nevada Regional Climate Data Analysis
  – CO2 Subsurface Modeling
• Particle Physics
• Astronomy and Astrophysics
  – Telescope Surveys
  – Galaxy Evolution
  – Gravitational Wave Astronomy
• Scalable Visualization, Virtual Reality, and Ultra-Resolution Video

PRP First Application: Distributed IPython/Jupyter Notebooks: Cross-Platform, Browser-Based Application Interleaves Code, Text, & Images

Available kernels include IJulia, IHaskell, IFSharp, IRuby, IGo, IScala, IMathics, Ialdor, LuaJIT/Torch, Lua Kernel, IRKernel (for the R language), IErlang, IOCaml, IForth, IPerl, IPerl6, IOctave, and the Calico Project (kernels implemented in Mono, including Java, IronPython, Boo, Logo, BASIC, and many others), as well as IScilab, IMatlab, ICSharp, Bash, Clojure Kernel, Hy Kernel, Redis Kernel, jove (a kernel for io.js), IJavascript, Calysto Scheme, Calysto Processing, idl_kernel, Mochi Kernel, Lua (used in Splash), Spark Kernel, Skulpt Python Kernel, MetaKernel Bash, MetaKernel Python, Brython Kernel, and IVisual VPython Kernel.

Source: John Graham, QI
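To give a flavor of the interleaving the slide describes, a single notebook cell under the standard Python kernel can mix computation with inline graphics, with narrative text carried in surrounding Markdown cells. A minimal sketch, assuming numpy and matplotlib are installed:

```python
# In a Jupyter notebook cell: compute and render inline.
%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt

t = np.linspace(0, 4 * np.pi, 500)
plt.plot(t, np.sin(t) * np.exp(-t / 8))   # damped sine, drawn inline
plt.title("Code, text, and images in one document")
plt.show()
```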

GPU JupyterHub: 2 x 14-core CPUs, 256 GB RAM, 1.2 TB FLASH, 3.8 TB SSD, Nvidia K80 GPU, dual 40GbE NICs, and a Trusted Platform Module; connected at 40 Gbps

GPU JupyterHub: 1 x 18-core CPU, 128 GB RAM, 3.8 TB SSD, Nvidia K80 GPU, dual 40GbE NICs, and a Trusted Platform Module

PRP UC-JupyterHub Backbone Between UCB and UCSD – Next Step: Deploy Across PRP

Source: John Graham, Calit2

OSG Federates Clusters in 40 of 50 States: A Major XSEDE Resource

Source: Miron Livny, Frank Wuerthwein, OSG

Open Science Grid Has Had Huge Growth Over the Last Decade – Currently Federating Over 130 Clusters

• Crossed 100 Million Core-Hours/Month in Dec 2015
• Over 1 Billion Data Transfers Moved 200 Petabytes in 2015
• Supported Over 200 Million Jobs in 2015

Source: Miron Livny, Frank Wuerthwein, OSG

[Images: ATLAS and CMS]

PRP Prototype of LambdaGrid Aggregation of OSG Software & Services Across California Universities in a Regional DMZ

• Aggregate Petabytes of Disk Space & PetaFLOPs of Compute, Connected at 10-100 Gbps

• Transparently Compute on Data at Their Home Institutions & Systems at SLAC, NERSC, Caltech, UCSD, & SDSC

[Map: SLAC, Caltech, UCSD & SDSC, UCSB, UCSC, UCD, UCR, CSU Fresno, and UCI]

Source: Frank Wuerthwein, UCSD Physics; SDSC; co-PI PRP

PRP Builds on SDSC’s LHC-UC Project

[Pie chart: OSG Hours 2015 by Science Domain – ATLAS, CMS, other physics, life sciences, other sciences]

Status of LHC Science on PRP

• Established Data & Compute Federations on PRP for ATLAS/CMS
  – UCI ATLAS Researchers First to Do Science on PRP
  – UCSD Shipped FIONAs that Implement This Functionality and are Being Integrated into Local Infrastructure at UCI, UCR, UCSC
• Integrated Comet as Overflow Resource for LHC Science on PRP
  – Received XRAC Allocation
  – Operating via OSG Interface to Comet Virtual Cluster
• Plans for 2016
  – Ship FIONAs to UCSB, UCD
  – Work with UCR, UCSB, UCSC, UCD to Reach Same Routine Science Use as UCI
  – Deploy Inbound Data Cache in Front of Comet

Source: Frank Wuerthwein, UCSD Physics; SDSC; co-PI PRP

Two Automated Telescope Surveys Creating Huge Datasets Will Drive PRP

Survey 1: 300 images per night at 100 MB per raw image – 30 GB per night, growing to 120 GB per night when processed at NERSC (increased by 4x)
Survey 2: 250 images per night at 530 MB per raw image – 150 GB per night, growing to 800 GB per night when processed at NERSC

Source: Peter Nugent, Division Deputy for Scientific Engagement, LBL; Professor of Astronomy, UC Berkeley
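The raw nightly volumes follow directly from the quoted image counts and sizes (the 150 GB figure is evidently rounded up from the product):

```latex
% Nightly raw volumes implied by the quoted image counts and sizes
\[
300 \times 100\,\mathrm{MB} = 30\,\mathrm{GB/night},
\qquad
250 \times 530\,\mathrm{MB} \approx 133\,\mathrm{GB/night}.
\]
```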

Precursors to LSST and NCSA

PRP Allows Researchers to Bring Datasets from NERSC to Their Local Clusters for In-Depth Science Analysis


Dark Energy Spectroscopic Instrument (DESI): Goals + Technologies

• 1-meter diameter corrector
• Army of 5,000 fiber robots
• 200,000 meters of fiber optics
• 10 spectrographs x 3 cameras

Target survey covers 14,000 sq. deg. of sky and is 10 times deeper than SDSS; ~1 PB of data.

Collaborators throughout the UC system (and the world) will use these data for DESI target selection, as well as for many other interesting science topics, including the hunt for Planet 9.

DESI Goals: 35 Million Galaxy + Quasar Redshift Survey

• 4 Million Luminous Red Galaxies (LRGs)
• 17 Million Emission-Line Galaxies (ELGs)
• 0.7 Million Lyman-α QSOs + 1.7 Million QSOs
• 10 Million Bright Galaxy Sample (BGS) Galaxies

By contrast, SDSS-III/BOSS has mapped only 1.5 million galaxies + 160,000 quasars.

Source: Peter Nugent, Division Deputy for Scientific Engagement, LBL; Professor of Astronomy, UC Berkeley
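As a consistency check, the target classes sum to roughly the headline figure, which the slide rounds to 35 million:

```latex
% Sum of DESI target classes, in millions of redshifts
\[
\underbrace{4}_{\mathrm{LRG}} + \underbrace{17}_{\mathrm{ELG}}
+ \underbrace{0.7}_{\text{Ly}\alpha\ \text{QSO}} + \underbrace{1.7}_{\mathrm{QSO}}
+ \underbrace{10}_{\mathrm{BGS}} = 33.4\ \text{million}
\]
```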

Global Scientific Instruments Will Produce Ultralarge Datasets Continuously, Requiring Dedicated Optical Fiber and Supercomputers

Square Kilometer Array / Large Synoptic Survey Telescope

https://tnc15.terena.org/getfile/1939
www.lsst.org/sites/default/files/documents/DM%20Introduction%20-%20Kantor.pdf

LSST Tracks ~40B Objects, Creates 10M Alerts/Night Within 1 Minute of Observing (2x40 Gb/s)

Dan Cayan, USGS Water Resources Discipline and Scripps Institution of Oceanography, UC San Diego, with much support from Mary Tyree, Mike Dettinger, Guido Franco, and other colleagues

NCAR Upgrading to 10Gbps Link from Wyoming and Boulder to CENIC/PRP

Sponsors: California Energy Commission; NOAA RISA program; California DWR; DOE; NSF

Planning for climate change in California: substantial shifts on top of already high climate variability

SIO Campus Climate Researchers Need to Download Results from NCAR Remote Supercomputer Simulations to Make Regional Climate Change Forecasts

HPWREN Users and Public Safety Clients Gain Redundancy and Resilience from PRP Upgrade

[Map: San Diego countywide sensor and camera resources (UCSD & SDSU), data & compute resources (UCSD), and UCI & UCR data replication and PRP FIONA anchors as HPWREN expands northward]

10X Increase During Wildfires

• PRP 10G Link UCSD to SDSU
  – DTN FIONA Endpoints
  – Data Redundancy
  – Disaster Recovery
  – High Availability
  – Network Redundancy

Data From Hans-Werner Braun

Source: Frank Vernon, Greg Hidley, UCSD

PRP Backbone Sets Stage for 2016 Expansion of HPWREN into Orange and Riverside Counties

• Anchor to CENIC at UCI
  – PRP FIONA Connects to CalREN-HPR Network
  – Potential Data Replication Site
• Potential Future UCR Anchor
• Camera and Relay Sites at:
  – Santiago Peak
  – Sierra Peak
  – Lake View
  – Bolero Peak
  – Modjeska Peak
  – Elsinore Peak
  – Sitton Peak
  – Via Marconi

Collaborations through COAST – County of Orange Safety Task Force

[Map: UCR, UCI, UCSD, and SDSU]

Source: Frank Vernon, Greg Hidley, UCSD

Collaboration Between EVL’s CAVE2 and Calit2’s VROOM Over 10Gb Wavelength – Extend Across PRP

[Photos: CAVE2 at EVL; VROOM at Calit2]

Source: NTT-Sponsored ON*VECTOR Workshop at Calit2, March 6, 2013

Optical Fibers Link Australian and US Big Data Researchers – Also Korea, Japan, and the Netherlands

The Future of Supercomputing Will Need More Than von Neumann Processors

“High Performance Computing Will Evolve Towards a Hybrid Model, Integrating Emerging Non-von Neumann Architectures, with Huge Potential in Pattern Recognition, Streaming Data Analysis, and Unpredictable New Applications.”
– Horst Simon, Deputy Director, U.S. Department of Energy’s Lawrence Berkeley National Laboratory

Calit2’s Qualcomm Institute Has Established a Pattern Recognition Lab on the PRP for Machine Learning on Non-von Neumann Processors

“On the drawing board are collections of 64, 256, 1024, and 4096 chips. ‘It’s only limited by money, not imagination,’ Modha says.”

Source: Dr. Dharmendra Modha, Founding Director, IBM Cognitive Computing Group, August 8, 2014

Is the Release of Google’s TensorFlow as Transformative as the Release of C?

https://exponential.singularityu.org/medicine/big-data-machine-learning-with-jeremy-howard/

From Programming Computers Step by Step to Achieve a Goal, to Showing the Computer Some Examples of What You Want It to Achieve and Then Letting the Computer Figure It Out on Its Own.
– Jeremy Howard, Singularity Univ., 2015
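In that spirit, here is a minimal TensorFlow sketch of “showing the computer examples”: fitting a line to sample points rather than hand-coding the slope. (Written against the current tf.keras API, which postdates this 2016 lecture; the data are made up for illustration.)

```python
import numpy as np
import tensorflow as tf

# Examples of the behavior we want: points near y = 3x - 1 (made-up data).
x = np.random.uniform(-1, 1, size=(256, 1)).astype("float32")
y = 3.0 * x - 1.0 + np.random.normal(scale=0.05, size=x.shape).astype("float32")

# One linear neuron; the "program" (slope and intercept) is learned, not written.
model = tf.keras.Sequential([tf.keras.layers.Dense(1, input_shape=(1,))])
model.compile(optimizer="sgd", loss="mse")
model.fit(x, y, epochs=200, verbose=0)

w, b = model.layers[0].get_weights()
print(f"learned y ~= {w[0, 0]:.2f} * x + {b[0]:.2f}")  # close to 3 and -1
```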

AI Is Advancing at an Amazing Pace: Deep Learning Algorithms Working on Massive Datasets

1.5 Years!

Training on 30M Moves, Then Playing Against Itself

PRP Is Bringing Together a Wide Range of Stakeholders

The Pacific Research Platform Creates a Regional End-to-End Science-Driven “Big Data Freeway System”

Opportunities for Collaboration with Many Other Regional Systems