How Researchers Will Benefit from Canada’s National Data Cyberinfrastructure
TRANSCRIPT
Presentation to the DDN User Group November 14, 2016
Compute Canada's National Data Cyberinfrastructure
Today’s presentation
1. About Compute Canada
2. Technology Refresh: Challenge 2 Stage 1 (& beyond)
3. Compute Canada’s new National Data Cyberinfrastructure
4. Software defined storage and storage building blocks
5. The role of object storage
6. Visions of data availability, resiliency and usability
Abstract
Compute Canada is the national platform for Advanced Research Computing, serving essentially all academic disciplines with computational or storage needs “beyond the desktop.” Member institutions include research universities and institutes, and there are more than 3,000 active research projects that use the national platform. As a result of a new federal funding program, matched by provinces and member institutions, an ambitious technology refresh program is underway. A cornerstone of the updated platform is a new national data cyberinfrastructure (NDC). The NDC is deploying robust, highly available, large-scale storage to the hosting sites. Building on concepts of software defined storage and commodity storage building blocks, the NDC is delivering backup and nearline services, persistent filesystem-based storage, and object storage.
About Compute Canada
Technology Refresh: Challenge 2 Stage 1
Milestones per system: RFP Issued, RFP Closed, Delivered, In Production
National Data Cyberinfrastructure (ongoing delivery): Fall 2016
GP1 - UVic Cloud: Fall 2016
GP2 - SFU General Purpose: Early 2017
GP3 - Waterloo General Purpose: Spring 2017
LP - UofT Large Parallel: Late 2017
Federal funding: $30M, total value of $75M with matching and in-kind. Project time span: 2016-2018.
Technology Refresh: Challenge 2 Stage 2
Proposal submitted, outcomes not yet public
System/service type | CFI capital | Notes
Deep storage | $2,500,000 | One additional deep storage site, plus additional capacity for the current two sites.
Experimental systems | $750,000 | Small experimental systems at some Stage 2 sites; modest investment in commercial cloud.
Services infrastructure | $250,000 | 1 FTE for 2 years, plus small purchases of existing software and/or services.
Elastic secure cloud (ESC) | $750,000 | One standalone ESC site.
Expand LP | - | No expansion of LP.
GPx | $15,750,000 | Expansion of one or more GPx systems, and addition of one or more new GPx systems. All GPx systems will have ESC partitions.
TOTAL | $20,000,000 |
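As a quick arithmetic check, the line items in the table above can be summed to confirm the stated total and to see how the proposal is weighted. This is only a sketch: the dollar figures are copied from the table, while the dictionary and function names are illustrative and not part of the proposal.

```python
# Stage 2 CFI capital line items, copied from the table above (in dollars).
# "Expand LP" carries no allocation, so it is recorded here as 0.
STAGE2_CFI_CAPITAL = {
    "Deep storage": 2_500_000,
    "Experimental systems": 750_000,
    "Services infrastructure": 250_000,
    "Elastic secure cloud (ESC)": 750_000,
    "Expand LP": 0,
    "GPx": 15_750_000,
}


def total_capital(items):
    """Sum all line items."""
    return sum(items.values())


def share_of_total(items):
    """Return each line item as a fraction of the overall total."""
    total = total_capital(items)
    return {name: amount / total for name, amount in items.items()}


if __name__ == "__main__":
    print(f"Total: ${total_capital(STAGE2_CFI_CAPITAL):,}")
    for name, share in share_of_total(STAGE2_CFI_CAPITAL).items():
        print(f"{name:30s} {share:6.1%}")
```

The line items add up to the $20,000,000 total, with the GPx expansion accounting for the large majority of the requested capital.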
Details and descriptions of system/service types are in the “Cyberinfrastructure Initiative Challenge 2” proposal, online at: https://www.computecanada.ca/publications/
Major Elements to Date of Compute Canada’s new National Data Cyberinfrastructure
1. Storage Building Blocks (SBBs). Commodity storage systems that are flexible, configurable, and will evolve over time as technology improves.
   a. Provider: Scalar Decisions, Inc. (Toronto)
   b. Technologies: SBB systems from Dell and Seagate
   c. Configurations to be provided: Multiple performance tiers and capacities
2. Object Storage Software. Automated, efficient data replication across the wide-area network, S3-compatible interface to data objects, and POSIX-style access to object storage.
   a. Provider: DDN Storage
   b. Technologies: WOS
   c. Configurations to be provided: Software at Stage 1 sites and beyond
3. Backup capabilities. To provide cost-efficient bulk storage of data copies, including archives and nearline storage.
   a. Provider: IBM Canada
   b. Technologies: Spectrum Protect software; TS3500 tape silos and LTO7 tapes and drives; supporting infrastructure systems
   c. Configurations to be provided: Multi-site redundant backups to SFU & uWaterloo; other configurations and uses as needed
RFP evaluation criteria focused on total cost of ownership, for desired capabilities and capacities.
Software Defined Storage and Storage Building Blocks
Software Defined Storage (SDS): Compute Canada anticipates that increasingly, the software layer will present storage features, irrespective of hardware. This will often occur with flexible, interchangeable, and vendor-agnostic underlying hardware layers. Key features of software defined storage include:
● Incorporation of different performance layers;
● Multiple access points and/or modalities to the same data items;
● Ease of expansion.
Storage building blocks (SBBs): Compute Canada is focused on cost-effective technology deployment and growth. Total cost of ownership (TCO) calculations for solutions are intended to include capital costs, operating costs, and all aspects of support. Storage building blocks help to control TCO by:
● Obtaining the needed level of performance and other features;
● Controlling costs, by emphasizing commodity-based solutions;
● Expanding capacity as needed, to take advantage of price/performance improvements over time;
● Avoiding proprietary solutions and vendor lock-in.
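The TCO components named above (capital costs, operating costs, and support) can be combined into a simple comparison. The sketch below is a minimal illustration, not Compute Canada's actual evaluation model: the class, its field names, and all dollar and capacity figures are hypothetical and chosen only to show the arithmetic.

```python
from dataclasses import dataclass


@dataclass
class StorageOption:
    """One candidate storage solution. All fields are illustrative assumptions."""
    name: str
    capital_cost: float           # up-front purchase price
    annual_operating_cost: float  # e.g. power, cooling, floor space, staff time
    annual_support_cost: float    # e.g. maintenance and support contracts
    usable_capacity_tb: float

    def tco(self, years: int) -> float:
        """Capital cost plus recurring operating and support costs over `years`."""
        recurring = self.annual_operating_cost + self.annual_support_cost
        return self.capital_cost + years * recurring

    def tco_per_tb(self, years: int) -> float:
        """TCO normalized by usable capacity, for comparing unlike systems."""
        return self.tco(years) / self.usable_capacity_tb


# Hypothetical numbers, invented for this example only.
commodity = StorageOption("commodity SBB", 400_000, 30_000, 20_000, 2_000)
proprietary = StorageOption("proprietary array", 600_000, 30_000, 60_000, 2_000)

if __name__ == "__main__":
    for option in (commodity, proprietary):
        print(f"{option.name}: ${option.tco(5):,.0f} over 5 years, "
              f"${option.tco_per_tb(5):,.2f} per TB")
```

Normalizing by usable capacity is one possible design choice; an evaluation could equally normalize by delivered bandwidth or IOPS when comparing performance tiers.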
The Role of Object Storage
Compute Canada engages in continuous assessment of current and future needs of the user community (see https://www.computecanada.ca/research-portal/sparc2/ for the early 2016 activities). Indications are that object storage will address several key current and future needs:
● Modernizing and modularizing research platforms and portals, by providing an object storage interface to data;
● Providing easy and cost-effective replication of data, including replication over the wide-area network;
● Adding a compatibility layer for users seeking to employ commercial cloud services;
● Adding an interoperability layer for data access via POSIX or S3;
● Enabling diverse metadata;
● Providing access control mechanisms, including public sharing of data.
The Storage RFP included solicitation of bids for object storage software, which was to be software defined storage capable of running on commodity storage building blocks.
WOS Status, Hopes and Plans
Access via S3 or POSIX bridge (Lustre or GPFS).
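To make the S3 access path concrete, the following sketch shows how a researcher might talk to any S3-compatible object store from Python using the third-party boto3 library. It is a generic illustration under stated assumptions: the endpoint URL, bucket, key, and credentials are hypothetical, the helper names are invented for this example, and nothing here is specific to WOS or confirmed as the configuration deployed by Compute Canada.

```python
def s3_client_kwargs(endpoint_url, access_key, secret_key):
    """Build keyword arguments for boto3.client against a non-AWS endpoint.

    `endpoint_url` points the client at an S3-compatible service instead of
    Amazon's own endpoints.
    """
    return {
        "service_name": "s3",
        "endpoint_url": endpoint_url,
        "aws_access_key_id": access_key,
        "aws_secret_access_key": secret_key,
    }


def make_client(endpoint_url, access_key, secret_key):
    """Create an S3 client. boto3 is imported lazily because it is third-party."""
    import boto3
    return boto3.client(**s3_client_kwargs(endpoint_url, access_key, secret_key))


def store_bytes(client, bucket, key, data, metadata):
    """Upload an object together with user-defined metadata.

    S3 user metadata is a flat mapping of string keys to string values.
    """
    client.put_object(Bucket=bucket, Key=key, Body=data, Metadata=metadata)


def sharing_link(client, bucket, key, expires_seconds=3600):
    """Return a time-limited URL that lets others download the object."""
    return client.generate_presigned_url(
        "get_object",
        Params={"Bucket": bucket, "Key": key},
        ExpiresIn=expires_seconds,
    )


# Example usage (hypothetical endpoint and names; requires boto3 and network access):
#   client = make_client("https://objects.example.org", "ACCESS_KEY", "SECRET_KEY")
#   store_bytes(client, "my-project", "runs/001/output.dat", b"...",
#               {"instrument": "demo", "run": "001"})
#   print(sharing_link(client, "my-project", "runs/001/output.dat"))
```

The same calls work against commercial cloud storage by changing only the endpoint and credentials, which is the compatibility benefit an S3 interface is meant to provide.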
Visions of Data Availability, Resiliency and Usability
Step 1: Science DMZ + Persistent storage (object, filesystem, backups)
Step 2 (2017): Integrate with HPC systems
Step 3 (2017-2018): Integrate with research data management systems, research platforms and portals
Welcome, ARBUTUS (“GP1”) to UVic
Installed and Operational September 2016
Your Presenter
Dr. Greg Newby is Chief Technology Officer of Compute Canada. He has a passion for enabling diverse scientific, social and educational opportunities. He has devoted his professional career to advanced research computing. Born in Montreal, Dr. Newby received his doctorate in Information Transfer from Syracuse University and most recently completed an M.B.A. in Sustainable Systems from the Bainbridge Graduate Institute of Presidio. Dr. Newby also obtained a Masters in Communications from University at Albany, State University of New York. Author of several books and numerous publications, Dr. Newby was a faculty member at two major US universities where he developed and taught courses in information systems, information security, and computer technology. His most recent roles include Manager of the Supercomputing Core Laboratory at King Abdullah University of Science and Technology in Saudi Arabia. Dr. Newby was Director of the Arctic Region Supercomputing Center at the University of Alaska Fairbanks, where he also served as a faculty member for 11 years.
Questions, Discussion and Closing Thoughts
Visit Compute Canada at SC16 booth #4430.
Find Compute Canada online at www.computecanada.ca
Twitter: @ComputeCanada
Email: [email protected]