How Researchers Will Benefit from Canada’s National Data Cyberinfrastructure

Presentation to the DDN User Group, November 14, 2016: Compute Canada's National Data Cyberinfrastructure

Upload: insidehpc

Post on 08-Jan-2017


TRANSCRIPT

Page 1: How Researchers Will Benefit from Canada’s National Data Cyberinfrastructure

Presentation to the DDN User Group November 14, 2016

Compute Canada's National Data Cyberinfrastructure

Page 2

Today’s presentation

1.  About Compute Canada
2.  Technology Refresh: Challenge 2 Stage 1 (& beyond)
3.  Compute Canada’s new National Data Cyberinfrastructure
4.  Software defined storage and storage building blocks
5.  The role of object storage
6.  Visions of data availability, resiliency and usability

Page 3

Abstract

Compute Canada is the national platform for Advanced Research Computing, serving essentially all academic disciplines with computational or storage needs “beyond the desktop.” Member institutions include research universities and institutes, and more than 3,000 active research projects use the national platform. As a result of a new federal funding program, matched by provinces and member institutions, an ambitious technology refresh program is underway. A cornerstone of the updated platform is a new national data cyberinfrastructure (NDC). The NDC is deploying robust, highly available, large-scale storage to the hosting sites. Building on concepts of software defined storage and commodity storage building blocks, the NDC is delivering backup and nearline services, persistent filesystem-based storage, and object storage.

Page 4

About Compute Canada

Page 5

Technology Refresh: Challenge 2 Stage 1

Columns: System | RFP Issued | RFP Closed | Delivered | In Production

National Data Cyberinfrastructure: (Ongoing delivery), Fall 2016
GP1 - UVIC Cloud: Fall 2016
GP2 - SFU General Purpose: Early 2017
GP3 - Waterloo General Purpose: Spring 2017
LP - UofT Large Parallel: Late 2017

Federal funding: $30M, total value of $75M with matching and in-kind. Project time span: 2016-2018.

Page 6

Technology Refresh: Challenge 2 Stage 2

Proposal submitted, outcomes not yet public

System/service type | CFI capital | Notes
Deep storage | $2,500,000 | One additional deep storage site, plus additional capacity for the current two sites.
Experimental systems | $750,000 | Small experimental systems at some Stage 2 sites; modest investment in commercial cloud.
Services infrastructure | $250,000 | 1 FTE for 2 years, plus small purchases of existing software and/or services.
Elastic secure cloud (ESC) | $750,000 | One standalone ESC site.
Expand LP | - | No expansion of LP.
GPx | $15,750,000 | Expansion of one or more GPx systems, and addition of one or more new GPx systems. All GPx systems will have ESC partitions.
TOTAL | $20,000,000 |
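As a quick arithmetic check, the line items in the Stage 2 table can be summed and compared with the stated total. This is a minimal sketch: the figures are copied from the table, the variable names are illustrative, and treating the "-" entry for "Expand LP" as zero is an assumption.

```python
# Stage 2 CFI capital figures, copied from the table (dollars).
stage2_capital = {
    "Deep storage": 2_500_000,
    "Experimental systems": 750_000,
    "Services infrastructure": 250_000,
    "Elastic secure cloud (ESC)": 750_000,
    "Expand LP": 0,  # assumption: the "-" in the table is treated as zero
    "GPx": 15_750_000,
}

# Sum the line items and compute each item's share of the total.
total = sum(stage2_capital.values())
shares = {name: amount / total for name, amount in stage2_capital.items()}

for name, amount in stage2_capital.items():
    print(f"{name:<28} ${amount:>12,}  {shares[name]:6.1%}")
print(f"{'TOTAL':<28} ${total:>12,}")
```

The line items add up to the stated $20,000,000, with the GPx line accounting for most of the request.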

Details and descriptions of system/service types are in the “Cyberinfrastructure Initiative Challenge 2” proposal, online at: https://www.computecanada.ca/publications/

Page 7

Major Elements to Date of Compute Canada’s new National Data Cyberinfrastructure

1.  Storage Building Blocks (SBBs). Commodity storage systems that are flexible, configurable, and will evolve over time as technology improves.
    a.  Provider: Scalar Decisions, Inc. (Toronto).
    b.  Technologies: SBB systems from Dell and Seagate.
    c.  Configurations to be provided: Mult. performance tiers & capacities.

2.  Object Storage Software. Automated, efficient data replication across the wide-area network, S3-compatible interface to data objects, and POSIX-style access to object storage.
    a.  Provider: DDN Storage
    b.  Technologies: WOS
    c.  Configurations to be provided: Software at Stage 1 sites & beyond

3.  Backup capabilities. To provide cost-efficient bulk storage of data copies, including archives and nearline storage.
    a.  Provider: IBM Canada
    b.  Technologies: Spectrum Protect software; TS3500 tape silos and LTO7 tapes+drives; supporting infrastructure systems
    c.  Configurations to be provided: Multi-site redundant backups to SFU & uWaterloo; other configurations and uses as needed.

RFP evaluation criteria focused on total cost of ownership, for desired capabilities and capacities.

Page 8

Software Defined Storage and Storage Building Blocks

Software Defined Storage (SDS): Compute Canada anticipates that increasingly, the software layer will present storage features, irrespective of hardware. This will often occur with flexible, interchangeable, and vendor-agnostic underlying hardware layers. Key features of software defined storage include:
●  Incorporation of different performance layers;
●  Multiple access points and/or modalities to the same data items;
●  Ease of expansion.

Storage building blocks (SBBs): Compute Canada is focused on cost-effective technology deployment and growth. Total cost of ownership (TCO) calculations for solutions are intended to include capital costs, operating costs, and all aspects of support. Storage building blocks help to control TCO by:
●  Obtaining the needed level of performance and other features;
●  Controlling costs, by emphasizing commodity-based solutions;
●  Expanding capacity as-needed, to take advantage of price/performance improvements over time;
●  Avoiding proprietary solutions and vendor lock-in.
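The TCO description above (capital costs plus operating costs plus support) can be sketched as a small calculation. This is only an illustration under stated assumptions: the class, the function names, and every dollar figure are hypothetical placeholders, not Compute Canada's evaluation model or real vendor pricing.

```python
from dataclasses import dataclass


@dataclass
class StorageOption:
    name: str
    capital: float             # up-front purchase cost
    operating_per_year: float  # e.g. power, cooling, space, staff time
    support_per_year: float    # e.g. maintenance and support contracts


def total_cost_of_ownership(option: StorageOption, years: int) -> float:
    """Capital cost plus recurring operating and support costs over `years`."""
    recurring = option.operating_per_year + option.support_per_year
    return option.capital + years * recurring


def cost_per_tb(option: StorageOption, years: int, usable_tb: float) -> float:
    """TCO divided by usable capacity, for comparing options of different size."""
    return total_cost_of_ownership(option, years) / usable_tb


# Hypothetical placeholder figures, chosen only to exercise the functions.
commodity = StorageOption("commodity building block", 400_000, 50_000, 30_000)
proprietary = StorageOption("proprietary appliance", 600_000, 45_000, 60_000)

for option in (commodity, proprietary):
    tco = total_cost_of_ownership(option, years=5)
    per_tb = cost_per_tb(option, years=5, usable_tb=1_000)
    print(f"{option.name}: 5-year TCO ${tco:,.0f}, ${per_tb:,.0f} per usable TB")
```

Comparing options on TCO per usable terabyte, rather than on purchase price alone, is one way to make the recurring support and operating costs visible in an evaluation.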

Page 9

The Role of Object Storage

Compute Canada engages in continuous assessment of current and future needs of the user community (see https://www.computecanada.ca/research-portal/sparc2/ for the early 2016 activities). Indications are that object storage will address several key current and future needs:
●  Modernizing and modularizing research platforms and portals, by providing an object storage interface to data;
●  Providing easy and cost-effective replication of data, including replication over the wide-area network;
●  Adding a compatibility layer for users seeking to employ commercial cloud services;
●  Adding an interoperability layer for data access via POSIX or S3;
●  Enabling diverse metadata;
●  Access control mechanisms, including public sharing of data.

The Storage RFP included solicitation of bids for object storage software, which was to be software defined storage capable of running on commodity storage building blocks.
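Several of the object storage properties listed above (replication across sites, free-form metadata, and access control with public sharing) can be illustrated with a toy in-memory model. This is a hedged sketch only: `ToyObjectStore`, `StoredObject`, the site names, and the placement rule are all hypothetical, and none of it represents the WOS or S3 API.

```python
import zlib
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Optional


@dataclass
class StoredObject:
    data: bytes
    owner: str
    metadata: Dict[str, str] = field(default_factory=dict)  # free-form key/value pairs
    public: bool = False  # public objects are readable by anyone


class ToyObjectStore:
    """In-memory stand-in for a multi-site object store (hypothetical, not a product API)."""

    def __init__(self, site_names: List[str], replicas: int = 2) -> None:
        if not 1 <= replicas <= len(site_names):
            raise ValueError("replicas must be between 1 and the number of sites")
        self.sites: Dict[str, Dict[str, StoredObject]] = {n: {} for n in site_names}
        self.replicas = replicas

    def put(self, key: str, data: bytes, owner: str,
            metadata: Optional[Dict[str, str]] = None,
            public: bool = False) -> List[str]:
        """Store copies of the object at `replicas` distinct sites; return their names."""
        names = list(self.sites)
        # Assumed placement rule: start at a site chosen by a checksum of the key.
        start = zlib.crc32(key.encode("utf-8")) % len(names)
        placed = [names[(start + i) % len(names)] for i in range(self.replicas)]
        for name in placed:
            self.sites[name][key] = StoredObject(data, owner, dict(metadata or {}), public)
        return placed

    def get(self, key: str, user: Optional[str] = None,
            unavailable: Iterable[str] = ()) -> StoredObject:
        """Return the first reachable copy, enforcing owner-or-public access."""
        down = set(unavailable)
        for name, objects in self.sites.items():
            if name in down or key not in objects:
                continue
            obj = objects[key]
            if not obj.public and user != obj.owner:
                raise PermissionError(f"{user!r} may not read {key!r}")
            return obj
        raise KeyError(key)


store = ToyObjectStore(["site-a", "site-b", "site-c"], replicas=2)
placed = store.put("project/run1/output.dat", b"results", owner="alice",
                   metadata={"discipline": "example"}, public=True)
# The object is still readable when one of its two hosting sites is unreachable.
copy = store.get("project/run1/output.dat", unavailable={placed[0]})
print(placed, copy.metadata)
```

In this model a read succeeds as long as at least one replica site is reachable, which is the availability property that wide-area replication is meant to provide.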

Page 10

WOS Status, Hopes and Plans

Access via S3 or POSIX bridge (Lustre or GPFS).

Page 11

Visions of Data Availability, Resiliency and Usability

Step 1: Science DMZ + Persistent storage (object, filesystem, backups)

Step 2 (2017): Integrate with HPC systems

Step 3 (2017-2018): Integrate with research data management systems, research platforms and portals

Page 12

Welcome, ARBUTUS (“GP1”) to UVic

Installed and Operational September 2016

Page 13

Your Presenter

Dr. Greg Newby is Chief Technology Officer of Compute Canada. He has a passion for enabling diverse scientific, social and educational opportunities. He has devoted his professional career to advanced research computing. Born in Montreal, Dr. Newby received his doctorate in Information Transfer from Syracuse University and most recently completed an M.B.A. in Sustainable Systems from the Bainbridge Graduate Institute of Presidio. Dr. Newby also obtained a Master's in Communications from University at Albany, State University of New York. Author of several books and numerous publications, Dr. Newby was a faculty member at two major US universities where he developed and taught courses in information systems, information security, and computer technology. His most recent roles include Manager of the Supercomputing Core Laboratory at King Abdullah University of Science and Technology in Saudi Arabia. Dr. Newby was Director of the Arctic Region Supercomputing Center at the University of Alaska Fairbanks, where he also served as a faculty member for 11 years.

Page 14

Questions, Discussion and Closing Thoughts

Visit Compute Canada at SC16 booth #4430.

Find Compute Canada online at www.computecanada.ca

Twitter: @ComputeCanada

Email: [email protected]