build hybrid storage architectures


Post on 28-Feb-2022


Page 1: Build hybrid storage architectures
Page 2: Build hybrid storage architectures

© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.

Build hybrid storage architectures with AWS Storage Gateway

S T G 3 0 5

Asa Kalavade

AWS Storage Gateway General Manager

Paul Reed

AWS Storage Gateway Principal Product Manager

Mohammad Shaikh

Director of Research Computing, Bristol-Myers Squibb

Oleg Moiseyenko

Sr. Cloud Architect, Bristol-Myers Squibb

Page 3: Build hybrid storage architectures

Are you faced with these on-premises storage challenges?

Growing backup infrastructure costs

Storage capacity limits

Limited access to in-cloud data

… then you’ve come to the right session

Page 4: Build hybrid storage architectures


Storage Gateway overview

Use cases

New features deep dive

Customer case study - BMS

Summary

Page 5: Build hybrid storage architectures

AWS Storage Gateway

Provides on-premises access to virtually unlimited cloud storage …

… regardless of cloud adoption stage

Move on-premises backups

to the cloud

Provide low latency access for

on-premises applications to

cloud data

Shift on-premises storage to

cloud-backed file shares

Page 6: Build hybrid storage architectures

Tens of thousands of customers

PBs ingested

every day

Average 96% reduction of on-premises storage

100s of PBs managed in-cloud

AWS Storage Gateway

Managing rapidly growing customer datasets …

… and serving more customers every day

Page 7: Build hybrid storage architectures

Some AWS Storage Gateway customers

Page 8: Build hybrid storage architectures

Integrated with AWS Identity and Access Management

(IAM), AWS Key Management Service (AWS KMS),

AWS CloudTrail, Amazon CloudWatch services

AWS Storage Gateway

Configuration: VMware ESXi, Microsoft Hyper-V, Amazon Elastic Compute Cloud (Amazon EC2), Hardware Appliance

[Diagram: on the customer premises, applications use Files (NFS/SMB), Volumes (iSCSI), and Tapes (iSCSI VTL) presented by a Storage Gateway, which connects over HTTPS to the Storage Gateway service in the AWS Cloud. Storage services shown: Amazon S3, Amazon S3 Glacier, Amazon S3 Glacier Deep Archive, Amazon Elastic Block Store (Amazon EBS), and AWS Backup.]

Page 9: Build hybrid storage architectures

File Gateway: Store and access objects in Amazon S3 from file-based applications with local caching

Features

• Low latency cached access to data in Amazon S3

• Support for NFS (POSIX) and SMB file shares (Windows ACLs)

• One-to-one mapping between files and objects in S3

[Diagram: an on-premises application connects over NFS & SMB to the File Gateway, which connects over HTTPS to the Storage Gateway service and an Amazon S3 bucket]

Page 10: Build hybrid storage architectures

Volume Gateway: Block storage on-premises backed by cloud storage

Features

• Presents block storage over iSCSI in cached mode (recently accessed data) or stored mode (full volume)

• Cost-efficient incremental Amazon EBS snapshots of volumes managed through AWS Backup

• Compresses data between gateway and cloud to minimize storage charges

[Diagram: an on-premises application connects over iSCSI to the Volume Gateway, which connects over HTTPS to the Storage Gateway service; volumes are backed up as Amazon EBS snapshots]

Page 11: Build hybrid storage architectures

Tape Gateway

Features

• Emulates physical tape library through iSCSI-VTL protocol

• Compatible with most major backup applications

• Archive virtual tapes in S3 Glacier Deep Archive, lowest cost cloud storage, or S3 Glacier

[Diagram: an on-premises application connects over iSCSI VTL to the Tape Gateway, which connects over HTTPS to the Storage Gateway service; the tape library is stored in Amazon S3 and the tape shelf in S3 Glacier Deep Archive or S3 Glacier]

Learn more … STG217 – Shift your tape backups to AWS to save time and money

Tuesday, Dec 3, 5:30 PM - 6:30 PM

Page 12: Build hybrid storage architectures

File Gateway

Volume Gateway

Tape Gateway

What’s new since re:Invent 2018

NEW!

NEW!

NEW!

Page 13: Build hybrid storage architectures

What’s new since re:Invent 2018

Hardware appliance Enterprise features


Regions

• Currently available in 20 regions, including China (Beijing) and GovCloud (US-West)

NEW!

NEW!

NEW!

Page 14: Build hybrid storage architectures

Limited time incentive for Hardware Appliance

Cyber Monday

Page 15: Build hybrid storage architectures


Page 16: Build hybrid storage architectures

AWS Storage Gateway

Provides on-premises access to virtually unlimited cloud storage …

… regardless of cloud adoption stage

Move on-premises backups

to the cloud

Provide low latency access

for on-premises applications

to cloud data

Shift on-premises storage to

cloud-backed file shares

Page 17: Build hybrid storage architectures

Move on-premises backups to the cloud

[Diagram: on-premises database, application, and backup servers connect over NFS/SMB, iSCSI, and iSCSI VTL to File, Volume, and Tape Gateways. Each gateway connects over HTTPS to the Storage Gateway managed service in the AWS Cloud. Storage shown: Amazon S3 with lifecycle to any S3 storage class; Amazon S3, Amazon EBS, and AWS Backup; a tape library (Amazon S3) with eject to a tape archive (S3 Glacier / S3 Glacier Deep Archive).]

Maintain your backup workflows while reducing your backup infrastructure on-premises

Page 18: Build hybrid storage architectures

File Gateway for on-premises backup: Move database and file backups into the cloud and free up on-premises storage capacity

Features

• NFS/SMB protocol support, mount shares directly on database and application servers

• Files stored durably in Amazon S3, lifecycle to any S3 storage class

• Local cache for accessing recent backups

• Windows ACL support to control access to backup files

• Support for S3 Object Lock

• Bandwidth-optimized, only changes are transferred

Benefits

• Reduce on-premises storage for backups

• Easily integrates with SAP, SQL Server, Oracle, HDFS, and other applications

• Restore backups on-premises or in the cloud on EC2 or RDS

[Diagram: an on-premises app/DB server connects over NFS/SMB to the File Gateway, which connects over HTTPS to Amazon S3 in the AWS Cloud, with lifecycle to any S3 storage class]

Page 19: Build hybrid storage architectures

Volume Gateway for on-premises backup: Enable faster application recovery in-cloud or on-premises

Features

• Present cloud-based iSCSI block storage volumes to on-premises applications

• On-premises cache of recently accessed data

• Back up volumes as EBS snapshots

• Integrates with AWS Backup to coordinate volume backup and retention

Benefits

• Store volume backups securely and reliably

• Restore backups on-premises or in the cloud as EBS volumes

[Diagram: an on-premises application server connects over iSCSI to the Volume Gateway, which connects over HTTPS to the AWS Cloud; services shown are Amazon S3, Amazon EBS, and AWS Backup]

Page 20: Build hybrid storage architectures

Tape Gateway for on-premises backup: Replace physical tape infrastructure with virtual tape workflows

Features

• iSCSI VTL interface compatible with leading backup applications

• Active tapes stored in Amazon S3

• Ejected tapes stored in S3 Glacier or S3 Glacier Deep Archive

• Automatic fixity checking

• Data compressed and encrypted, in-transit and at-rest

Benefits

• Drop-in replacement for tape libraries, tape media, and archiving services

• Maintain existing backup workflows

• Eliminate the hassles of physical tape

• Store archived tapes durably and reliably in Amazon S3 Glacier Deep Archive for $1/TB/month
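
As a rough worked example of the $1/TB/month figure above, the sketch below estimates storage-only cost for archived tapes. The function name and the sample workload are illustrative assumptions; only the rate comes from the slide, and retrieval, request, and early-deletion charges are not modeled.

```python
# Rate quoted on the slide for S3 Glacier Deep Archive; actual pricing may differ by Region.
DEEP_ARCHIVE_USD_PER_TB_MONTH = 1.0


def archive_storage_cost(tb_archived, months, rate=DEEP_ARCHIVE_USD_PER_TB_MONTH):
    """Return the storage-only cost in USD of keeping tb_archived TB for `months` months."""
    if tb_archived < 0 or months < 0:
        raise ValueError("size and duration must be non-negative")
    return tb_archived * months * rate


if __name__ == "__main__":
    # Hypothetical workload: 500 TB of ejected tapes retained for 7 years.
    print(archive_storage_cost(500, 7 * 12))
```

Under these assumptions the cost scales linearly with both capacity and retention time, which makes it easy to compare against a per-tape off-site storage contract.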

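[Diagram: an on-premises backup server connects over iSCSI VTL to the Tape Gateway, which connects over HTTPS to the AWS Cloud; the tape library is in Amazon S3 and ejected tapes move to the tape archive in S3 Glacier / S3 Glacier Deep Archive]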

Page 21: Build hybrid storage architectures

Backing up to physical tapes, sent off-site

Lengthy, unreliable recovery of data from tapes

No new backup budget approved

Couldn’t disrupt their existing operations

Problem

Solution

Outcome

EMC Networker connected to Tape Gateway

Backups stored in Virtual Tape Library (VTL)

on Amazon S3

Archive to Amazon S3 Glacier

No change in backup workflow

50% cost reduction

Parallel backups for one year, then turned off physical tape

Phased out off-site archive in 3 months

Analog Devices is a world leader in the design, manufacture, and marketing of a

broad portfolio of high performance analog, mixed-signal, and digital signal

processing (DSP) integrated circuits (ICs) used in virtually all types of electronic

equipment

Page 22: Build hybrid storage architectures

The world's leading cereal company, 2nd largest producer of cookies, crackers, and savory snacks, and leading North American frozen foods company

Problem

• Migrating datacenters & applications to AWS

• Many on-premises databases and assets to migrate, back up & archive

• High backup costs with commercial software

Solution

• Install File Gateways for backup of SAP on Oracle environments, hybrid backups, and archives of SQL databases, Hadoop clusters, and other applications

• Keep on-premises access to in-cloud data

Outcome

• ~90% reduction in backup costs, eliminating backup software

• With a few TB of storage on premises, get access to 100s of TB of storage and backups in cloud

Page 23: Build hybrid storage architectures

Shift on-premises storage to cloud-backed file shares: Access virtually unlimited, highly durable cloud storage using common file protocols

Features

• Supports NFS and SMB protocols; no application changes required

• Files stored durably in Amazon S3

• SMB shares integrate with Active Directory

• Amazon CloudWatch events for automated workflows

Benefits

• Reduce costs by moving storage to Amazon S3 and accessing on-premises

• Virtually unlimited cloud storage; no more running out of capacity

• Eliminate expensive hardware refresh cycles

[Diagram: on-premises application and NAS storage; the application connects over NFS/SMB to the File Gateway, which connects over HTTPS to Amazon S3 in the AWS Cloud]

Page 24: Build hybrid storage architectures

With more than 40,000 auto dealer clients across five continents, we strive to understand your needs by pairing our insights and research with your business goals – delivering inspired results to bridge the gap between consumers, manufacturers, dealers and lenders at every stage of the automotive experience

Problem

• Stacks of disk arrays on-premises were expensive and required a lot of space

• Complex architecture and cache hierarchy

• Many readers via NFS

Solution

• AWS DataSync to transfer bulk data and active datasets to cloud

• File Gateway for local access to cloud data

• Active/active multi-region and versioning with lifecycles

Outcome

• $1M bandwidth cost savings

• Saved ~85% on storage, per location

• Storage engineers focused on high-value activities

Learn more … STG354 – Large-scale file migrations with AWS DataSync

Thursday, Dec 5, 3:15 PM - 4:15 PM

Page 25: Build hybrid storage architectures

Low-latency access for on-premises applications to cloud data: Access files quickly from distributed locations and scale capacity as needed

Features

• Generate data in-cloud or ingest from on-premises using AWS DataSync or AWS Snowball

• Up to 16 TB local cache per gateway

• Fully-managed gateway cache provides low-latency access to data

• Refresh cache at the bucket or prefix level

Benefits

• Access cloud storage from any on-premises location

• Process data in the cloud and refresh gateway cache for up-to-date results

• Data stored cost effectively and centrally in the cloud
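
To make "refresh cache at the bucket or prefix level" concrete, here is a toy in-memory model of a gateway's view of a bucket. It is a sketch under stated assumptions, not the service's implementation: the class `GatewayCacheModel`, its `refresh` method, and the sample keys are all hypothetical, and a plain dict stands in for the S3 bucket.

```python
class GatewayCacheModel:
    """Toy model of what on-premises clients see through a file gateway."""

    def __init__(self, bucket):
        self.bucket = bucket      # dict: object key -> content, stands in for S3
        self.view = dict(bucket)  # snapshot of the keys the gateway currently exposes

    def refresh(self, prefix=""):
        """Re-sync the local view for keys under `prefix` ('' means the whole bucket)."""
        # Drop stale local entries under the prefix, then re-read them from the bucket.
        for key in [k for k in self.view if k.startswith(prefix)]:
            del self.view[key]
        for key, value in self.bucket.items():
            if key.startswith(prefix):
                self.view[key] = value
        return sorted(k for k in self.view if k.startswith(prefix))


if __name__ == "__main__":
    bucket = {"raw/run1.dat": b"..."}
    gateway = GatewayCacheModel(bucket)
    # In-cloud processing writes new objects that the gateway has not seen yet.
    bucket["results/run1.csv"] = b"..."
    bucket["raw/run2.dat"] = b"..."
    print(gateway.refresh("results/"))  # only the results/ prefix is re-synced
    print("raw/run2.dat" in gateway.view)
```

Refreshing a single prefix keeps the operation cheap when only one output folder has changed; refreshing with an empty prefix models a bucket-level refresh.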

[Diagram: applications at two on-premises locations each connect over NFS/SMB to a local File Gateway; each gateway connects over HTTPS to the AWS Cloud and supports cache refresh. Data sources shown: in-cloud processing, AWS DataSync, and AWS Snowball.]

Page 26: Build hybrid storage architectures

The world's leading and most diverse derivatives marketplace

Problem

• Data stored on premises for regulatory and performance reasons

• Moved application data to Amazon S3 but developers still need file-based access

• Required high level of security, encryption, and scalability

Solution

• Deployed multiple file gateways to manage ready access to cloud data

• Use gateways for granular control over data stored in Amazon S3

Outcome

• Preserve developer access to frequently used data

• Use native tools with no proprietary formats

• No coding required; works with existing protocols and OS-level commands

Page 27: Build hybrid storage architectures


Page 28: Build hybrid storage architectures

New features deep dive

Customers asked to Feature we delivered

Page 29: Build hybrid storage architectures

High availability on VMware: Feature overview and benefits

For VMware-based gateways running on premises or in VMware Cloud on AWS

What is it

• High availability for all gateway types running on VMware

• Gateway health checks integrated with VMware provide application level monitoring including:

• NFS/SMB file share availability

• iSCSI availability

• Configuration errors; e.g., read-only root disks

• Gateway restarts on service interruption

What are its benefits

• Enterprise workloads operate uninterrupted

• VMware HA protects workloads against hardware, hypervisor, and network errors

• Gateway automatically recovers from most service interruptions in under 60 seconds and maintains its local cache

Page 30: Build hybrid storage architectures

How does it work: Gateway recovery for software, hardware, and datacenter failure scenarios

[Diagram: three scenarios. Software failure: one VMware host. Hardware failure: two VMware hosts. Datacenter failure: VMware hosts in the corporate datacenter and the DR datacenter.]

Page 31: Build hybrid storage architectures

For all environments

• Real-time visibility into cache utilization, gateway access patterns, and throughput and I/O metrics through CloudWatch integration

• Administrators can monitor performance and cache metrics to tune resources based on application needs

• High “Cache Percent Dirty” can prompt an increase in network allocation

• High “Cloud Traffic” can prompt an increase in cache size
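
The two tuning rules on this slide can be sketched as a small decision function. This is an illustration only: the function name, the units, and both threshold values are assumptions chosen for the example, not AWS recommendations, and the inputs would come from whatever CloudWatch readings an administrator collects.

```python
def tuning_hints(cache_percent_dirty, cloud_traffic_mbps,
                 dirty_threshold=80.0, traffic_threshold=500.0):
    """Map the two slide rules onto readings; both thresholds are hypothetical."""
    hints = []
    # A high share of dirty (not yet uploaded) cache suggests uploads are lagging.
    if cache_percent_dirty >= dirty_threshold:
        hints.append("increase network allocation")
    # Heavy traffic to the cloud suggests the working set does not fit in cache.
    if cloud_traffic_mbps >= traffic_threshold:
        hints.append("increase cache size")
    return hints


if __name__ == "__main__":
    print(tuning_hints(cache_percent_dirty=92.0, cloud_traffic_mbps=120.0))
    print(tuning_hints(cache_percent_dirty=15.0, cloud_traffic_mbps=800.0))
```

In practice the same logic would more likely live in a CloudWatch alarm than in application code; the function only makes the cause-and-action pairing explicit.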

Page 32: Build hybrid storage architectures

Monitor all of your gateways from the console

Page 33: Build hybrid storage architectures

CloudWatch integration For all environments

Trigger actions and notifications based on events and metrics

Corporate datacenter AWS Cloud

NEW!

NEW!

Page 34: Build hybrid storage architectures

Additional maintenance window options

For all environments

Gateway software updates are managed automatically for customers

Granular control over maintenance windows to meet the uptime requirements of enterprise-wide applications that need to operate without interruption:

• Day of the week: available now

• Day of the month: available now

• Day of the week of the month: coming soon

• Day of every # weeks: coming soon
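
For planning around the two options marked as available, the sketch below computes when the next window would start. The helper names are hypothetical, weekdays use Python's convention (Monday is 0), and limiting the day of the month to 1 through 28 is an assumption made here so that every month contains the chosen day.

```python
from datetime import datetime, timedelta


def next_weekly_window(now, day_of_week, hour, minute):
    """Next start of a 'day of the week' window (day_of_week: Monday=0 .. Sunday=6)."""
    candidate = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
    candidate += timedelta(days=(day_of_week - now.weekday()) % 7)
    if candidate <= now:
        # This week's slot has already passed, so take next week's.
        candidate += timedelta(days=7)
    return candidate


def next_monthly_window(now, day_of_month, hour, minute):
    """Next start of a 'day of the month' window (day_of_month assumed 1..28)."""
    if not 1 <= day_of_month <= 28:
        raise ValueError("day_of_month must be between 1 and 28")
    candidate = now.replace(day=day_of_month, hour=hour, minute=minute,
                            second=0, microsecond=0)
    if candidate <= now:
        # Roll over to the same day in the following month.
        if now.month == 12:
            candidate = candidate.replace(year=now.year + 1, month=1)
        else:
            candidate = candidate.replace(month=now.month + 1)
    return candidate


if __name__ == "__main__":
    now = datetime(2019, 12, 3, 17, 30)          # a Tuesday
    print(next_weekly_window(now, 6, 2, 0))      # next Sunday at 02:00
    print(next_monthly_window(now, 1, 3, 0))     # the 1st of next month at 03:00
```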

Page 35: Build hybrid storage architectures


Customer case study: Bristol-Myers Squibb

Storage Gateway applications in life sciences

Mohammad Shaikh

Director of Research Computing, Bristol-Myers Squibb

Oleg Moiseyenko

Sr. Cloud Architect, Bristol-Myers Squibb

Page 36: Build hybrid storage architectures

To discover, develop, and deliver

innovative medicines that help

patients prevail over serious diseases

Our mission

Scientific Computing Services

Page 37: Build hybrid storage architectures

Major data sources

• Raw data from labs

• Scratch space

• Results data

• External collaborations

• Public & government agencies

• R&D

It’s all about data, Big Data

From GBs to PBs scale

Exponential growth

(Tens of PBs)

Scientific data sets

• NGS data

• Proteomics

• Flow Cytometry

• Imaging data

• High-throughput screening

• Mass spectrometry

• Databases

2016 2017 2018 2019 2020

Page 38: Build hybrid storage architectures

Our data sources

High-velocity and continuous sources

• Illumina sequencers (Genomics data)

• Nuclear Magnetic Resonance (NMR)

• Many others

High-volume sources

• High-resolution mass spectrometer (Proteomics)

• AT2 tissue microscope (Histology)

• High content screening

Intermediate storage

• NAS drive, NFS-based metadata

• POSIX metadata captured only

• Business metadata: Relationships need to be enriched on S3

Page 39: Build hybrid storage architectures

Hybrid file use cases: Data transfer, analytics, ML

Lab to Cloud (NMR, Histology, NGS)

• Instrument data

• Metadata catalog in the cloud

• Downstream analysis

Machine Learning (ML) analysis in cloud/visualization in Labs (Flow Cytometry)

• Instrument data to cloud

• ML-based analytics, unsupervised learning models

• Visualization of scientific data

Image management analysis in cloud

• Specialized scientific data formats

• Data enrichment

• Downstream analytics

Page 40: Build hybrid storage architectures

1. Instruments write raw data into a File Gateway file share

2. File Gateway transfers files to S3 buckets

3. Data Management system scans S3 buckets regularly

4. Applications request data via the Data Management system meta catalog
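
Steps 3 and 4 above can be sketched as a small catalog that is refreshed by periodic scans. Everything here is illustrative: the `MetadataCatalog` class, the `<instrument>/<file>` key layout, and the in-memory dict standing in for an S3 bucket listing are assumptions, not a description of the BMS Data Management system.

```python
from dataclasses import dataclass, field


@dataclass
class MetadataCatalog:
    """Hypothetical metadata catalog keyed by S3 object key."""
    entries: dict = field(default_factory=dict)  # object key -> metadata dict

    def scan(self, bucket_listing):
        """Step 3: register keys not yet cataloged and return the new keys."""
        new_keys = []
        for key, size in sorted(bucket_listing.items()):
            if key not in self.entries:
                # Assumed layout: the first path component names the instrument.
                instrument = key.split("/", 1)[0]
                self.entries[key] = {"size": size, "instrument": instrument}
                new_keys.append(key)
        return new_keys

    def find(self, instrument):
        """Step 4: applications look data up through the catalog, not by listing S3."""
        return sorted(k for k, meta in self.entries.items()
                      if meta["instrument"] == instrument)


if __name__ == "__main__":
    catalog = MetadataCatalog()
    listing = {"nmr/run1.fid": 1024, "ngs/sample1.fastq": 4096}
    print(catalog.scan(listing))            # first scan registers both objects
    listing["nmr/run2.fid"] = 2048          # gateway uploads another file (steps 1-2)
    print(catalog.scan(listing))            # second scan picks up only the new key
    print(catalog.find("nmr"))
```

Because File Gateway keeps a one-to-one mapping between files and objects, a scan like this can derive catalog entries directly from object keys.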

Typical data flow diagram

[Diagram: BMS scientific instruments (1) write to the File Gateway, which (2) sends files over AWS Direct Connect (10 Gb/s) to S3 buckets; (3) the Data Management System scans the buckets and (4) serves applications]

Page 41: Build hybrid storage architectures

AWS Storage Gateway in Image Discovery

[Diagram labels: On premises: scientific instruments, scientists, images on local servers (NFS), local storage layer, image metadata database, Storage Gateway, hardware appliance, AWS Snowball. AWS Direct Connect (10 Gb/s). BMS AWS Cloud: S3 object store (S3 buckets 1, 2, 3, A, B, N, N+1), Data Management System (Metadata Catalog), image analysis tools, image transformation, S3 bucket for transformed images. Collaborator’s AWS Cloud.]

Page 42: Build hybrid storage architectures

Outcomes for BMS

Tech

Integration across standard protocols

Low-latency

Efficient data transfer

Easy to deploy: Virtual and hardware storage gateways

Data replication

Encryption in transit

Business

Cost and elasticity

Support many old and new applications

Overall simplicity

Effective workflow automation

Secure data sharing

Page 43: Build hybrid storage architectures

Plan Storage Gateway deployment

Preparing for Storage Gateway

• S3 buckets

• Access policies

• File shares

• Mounting instructions

• Data transfers

Preparing for metadata catalog

• Collection names

• Directory names

• Data sources, daily volumes,

formats

• Business data tags and rules

• Access requirements

• Shared directory needs

• Data scan frequency

• Access to metadata catalog

Page 44: Build hybrid storage architectures

AWS Storage Gateway hardware appliance

Appliance details

The hardware appliance comes with AWS Storage

Gateway software pre-installed on a validated

configuration of a Dell EMC PowerEdge R640XL server:

• 2 x Intel Xeon Silver 4114 2.20 GHz

processors with 10 cores each

• 128 GB DDR4 RAM

• 5 TB of usable enterprise SSD storage, with the

option to add 7 TB of usable enterprise SSD

storage for a total of 12 TB

• 4-port 10 Gigabit copper network card, with

the option to purchase and use a 4-port 10

Gigabit fiber-optic network card

• 3 years of hardware support from Dell—

accessed and coordinated through your

normal AWS support channels


Page 45: Build hybrid storage architectures

Hardware appliance

Facts:

• You own it!

• Secure local installation

• Low latency

• Data compression

• Suitable for legacy applications

• Provide local applications access to S3 storage

• Price range: $12K–$16K USD

Current limitations:

• One gateway type per appliance

• 5 TB usable storage (extendable up to 12 TB)

• Software RAID

• Intel X710 4-port 10 Gigabit fiber optic network card

• AWS Direct Connect is recommended

• Local proxy servers


Page 46: Build hybrid storage architectures

Lessons learned

• Optimizing AWS Gateway: compute, storage, cache size

• Do not oversubscribe the CPUs of the host server (4-16-24 vCPUs)

• Don’t mix upload buffer disks and cache storage

• Use high-performing RAID configuration for data store disks

• Cache disk configuration: Proxy server vs. Direct connect

• IP addresses, ports, and firewall rules

• Live test from actual scientific instruments

• Caution while sharing same S3 bucket through different AWS Storage Gateways

• Software-based RAID (no hardware RAID option?)

• Direct Connect links

• Storage Gateway, data governance and reliability

• Support channels and security

Page 47: Build hybrid storage architectures

Preventing multiple file shares from writing to an S3 bucket

When you create a file share, we

recommend that you configure your

Amazon S3 bucket so that only one

file share can write to it

If you configure your S3 bucket

to be written to by multiple file

shares, unpredictable results

can occur

To prevent this, create an S3 bucket

policy that denies all roles except the

role used for the file share to put or

delete objects in the bucket

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "DenyMultiWrite",
      "Effect": "Deny",
      "Principal": "*",
      "Action": [
        "s3:DeleteObject",
        "s3:PutObject"
      ],
      "Resource": "arn:aws:s3:::TestBucket/*",
      "Condition": {
        "StringNotLike": {
          "aws:userid": "TestUser:*"
        }
      }
    }
  ]
}
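
As a sketch of how this policy could be generated per bucket, the code below builds the same document from two parameters and checks the condition locally. The function names are hypothetical, `TestBucket` and `TestUser:*` are the placeholders from the slide, and `is_write_denied` is a deliberately simplified stand-in for IAM evaluation (it only models the single `StringNotLike` condition using shell-style wildcards).

```python
import json
from fnmatch import fnmatchcase


def deny_multi_write_policy(bucket, writer_userid_pattern):
    """Build the slide's bucket policy for a given bucket and allowed writer pattern."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Sid": "DenyMultiWrite",
            "Effect": "Deny",
            "Principal": "*",
            "Action": ["s3:DeleteObject", "s3:PutObject"],
            "Resource": f"arn:aws:s3:::{bucket}/*",
            "Condition": {"StringNotLike": {"aws:userid": writer_userid_pattern}},
        }],
    }


def is_write_denied(policy, userid):
    """Simplified local check: the Deny applies when userid does NOT match the pattern."""
    pattern = policy["Statement"][0]["Condition"]["StringNotLike"]["aws:userid"]
    return not fnmatchcase(userid, pattern)


if __name__ == "__main__":
    policy = deny_multi_write_policy("TestBucket", "TestUser:*")
    print(json.dumps(policy, indent=2))
    print(is_write_denied(policy, "TestUser:gateway-session"))   # the file share role
    print(is_write_denied(policy, "OtherRole:some-session"))     # any other principal
```

The JSON string produced by `json.dumps` is the form a bucket policy is submitted in; attaching it to a real bucket is left out of the sketch.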

Page 48: Build hybrid storage architectures


Page 49: Build hybrid storage architectures


Question Time

Asa Kalavade

AWS Storage Gateway General Manager

Paul Reed

AWS Storage Gateway Principal Product Manager

Mohammad Shaikh

Director of Research Computing, Bristol-Myers Squibb

Oleg Moiseyenko

Sr. Cloud Architect, Bristol-Myers Squibb

Page 50: Build hybrid storage architectures

Take action

Deploy a Storage

Gateway VM

Learn more … aws.amazon.com/storagegateway

Start using cloud

storage on-premises

Try it out

File

(NFS/SMB)

Volume

(iSCSI)

Tape

(iSCSI VTL)

Choose your

Gateway Type

With Amazon S3, Amazon S3

Glacier, Amazon S3 Glacier

Deep Archive, and Amazon EBS

Page 51: Build hybrid storage architectures

Learn more about hybrid cloud storage in these sessions

• STG231 Lift and shift your tape-based backup workflows to AWS

• STG226 Hands-on with hybrid block storage using a Volume Gateway

• STG217 Shift your tape backups to AWS to save time and money

• STG213 Storage for hybrid cloud and edge computing: Bring AWS to you

• STG313 Hybrid architectures for database backups & file migrations

• STG336 Using hybrid cloud storage to close a data center and migrate

Page 52: Build hybrid storage architectures

Thank you!


Paul Reed

Asa Kalavade

Page 53: Build hybrid storage architectures
