StorageGRID Webscale Technical Overview September 2014
© 2014 NetApp, Inc. All rights reserved. NetApp Proprietary – Limited Use Only
Agenda
What is Object Storage
StorageGRID Webscale Introduction
Technical Overview
Key Concepts
Data flow
Object storage growth driven by macro trends…
Unstructured data continues to grow
• New sources of unstructured data growth: media, mobile, and applications
• Data continues to be retained for long periods (archival, compliance, etc.)
Unstructured data profile is changing
• Simultaneous access to the same data is rarely required
• Most data is accessed a few times initially and then rarely accessed
Data access is changing
• Geographically dispersed access
• Applications accessing data don't care about POSIX semantics or file locking
Highly cost-sensitive, petabyte-scale repositories
• Driving tradeoffs between $/GB, latency, throughput, and data protection
Storage being managed in a cloud ecosystem
• Unified management & orchestration
• Growth in cloud-hosted applications that leverage object storage
Introducing StorageGRID Webscale
A new variant of StorageGRID
• Targets object store (cloud, archive, media) use cases at massive scale
New features
• Protocols: native support for the S3 API
• Scalability: 100 billion objects, 70 PB
• Simplicity: modular, scalable, resilient architecture; simplified deployment
Proven track record for reliability and innovation
What is Object Storage?
Different ways to address data
Block: specific location on disks / memory
• Tracks
• Sectors
File: specific folder in fixed logical order
• File path
• File name
• Date
Object: flexible container size
• Data and metadata
• Unique ID
Object Storage Example: File vs. Object
File based: parking garage
• Daily Garage 1, Floor 4, Row N, Space 53
• /users/jsmith/car/garage1/floor4/rown/space53.file
• C:\Users\jsmith\Garage1\Floor4\RowN\Space53.file
Object based: valet
• Object UID 317
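The garage-versus-valet contrast can be sketched in a few lines of Python. This is only an illustration: `file_store`, `object_store`, `put_object`, and `get_object` are hypothetical names, and the in-memory dictionaries stand in for real storage.

```python
import uuid

# File-based addressing: the caller must know the full hierarchical path.
file_store = {
    "/users/jsmith/car/garage1/floor4/rown/space53.file": b"car",
}

# Object-based addressing (hypothetical in-memory store): the store hands
# back an opaque ID, like a valet ticket, and keeps metadata with the data.
object_store = {}

def put_object(data, metadata):
    """Store data plus metadata and return a unique object ID."""
    object_id = uuid.uuid4().hex
    object_store[object_id] = {"data": data, "metadata": dict(metadata)}
    return object_id

def get_object(object_id):
    """Retrieve the data using only the ID; no location is needed."""
    return object_store[object_id]["data"]

ticket = put_object(b"car", {"owner": "jsmith"})
```

The caller keeps only `ticket`; where the data physically lives is the store's concern.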
Why Object storage?
Massive scale
• Billions of objects
• Petabytes of data
• Global namespace
• Explosive growth
Respond to compliance and retention requirements
Cost effectively meet SLAs with intelligent data placement
Access from anywhere
Control access, security, and data integrity
Object storage used across different workloads
Emerging object storage segments in the hybrid cloud
Service Providers (XaaS)
Media Repository (Media Redistribution)
• Large object sizes (250 MB+)
• Distributed repositories
• High data rates for redistribution
• Time-to-first-byte latency < 50 ms
Secure multi-tenancy (billing, isolation, authentication, & self-service)
Control plane APIs and workflow automation
New apps requiring RESTful interfaces (S3, Swift)
Web Data Repositories
• Small object (~KB) performance
• Extremely high transaction load
• Searchable, scalable metadata
• High object counts
Data Archives
• Long access latency tolerance
• Integration with tape / Glacier
• Long retention periods
• Erasure coding
Common requirements
• Scale seamlessly
• Ease of install & management
• Global namespace
• Cost
StorageGRID Webscale
Object-Enabled Data Management
NetApp StorageGRID Webscale
Software-defined object storage
• Billions of objects and multi-petabyte
• Architected for massive scale
Built for the hybrid cloud
• Global, always-on data availability and durability
• Support for cloud applications (S3, CDMI)
• 10th-generation object store proven with product deployments
Dynamic policy engine
• Intelligent, policy-driven data management for optimal availability, performance, and cost over the life cycle of data
StorageGRID
Platform for Distributed Content Repositories
StorageGRID Object Storage Software + E-Series Storage Array
MULTIPLE: APPLICATIONS + SITES + PROTOCOLS
MULTIPLE: TARGETS + TIERS
MULTIPLE: TENANTS + POLICIES + ADMINISTRATORS
[Diagram: applications at Site 1, Site 2, Site 3, … Site N; storage targets: NetApp E-Series and tape]
StorageGRID Webscale features
Manageability, Reliability, Scalability
• Hardware obsolescence protection
• Non-disruptive operations
• Object integrity and security
• Multi-tenancy
• Global object namespace
• Services automation
• ILM, metadata-driven policies
• Seamless scaling
• Audit & reporting
Technical Overview
StorageGRID Webscale
Solution Topology
• Clients can access the global object namespace via Gateway Nodes or Storage Nodes
• StorageGRID Webscale nodes run in VMware hosts
• E2760 block-based storage
• SSDs for read cache and VM datastores
• Mixed disk types (SSD, SAS, NL-SAS) for tiered storage pools
Per-node resource requirement
• Storage: 100 GB VMDK, 8 vCPU, 24 GB RAM
• Gateway: 100 GB VMDK, 8 vCPU, 24 GB RAM
• Admin: 300 GB VMDK, 8 vCPU, 24 GB RAM
[Diagram: administrators and REST API clients connect over S3 and CDMI (HTTPS) through a 10GbE LAN and WAN router; E2760 arrays with DE6600 shelves attach through a 16Gb FC switch]
StorageGRID Architecture
Design the grid to scale for performance, capacity & resiliency
• Admin Nodes: management services (configuration, monitoring, audit, and logging)
• Storage Nodes: manage object storage, including replication
• API Gateway Nodes: load-balancing interface through which applications connect to the system
• Archive Nodes: interface to archive media storage such as tape
[Diagram: App1 and App2 connect through a load balancer to API, Storage, Admin, and Archive nodes across Data Center 1, DC2, and DC3]
Centralized Deployment & Rolling Upgrades
Design, deploy, and maintain configuration control
• Use NetApp StorageGRID Webscale Designer to architect the grid to your requirements
• Quickly deploy the grid via NetApp StorageGRID Webscale Installer
• Adapt your grid to changing requirements: add sites, add nodes and capacity, support rolling upgrades
[Diagram: DC1, DC2, DC3, DC4]
Data Management Key Concepts
• Client Connections
• Storage Pools
• Metadata
• Object Identifier
• ILM Policy
Client Connections, Object Identifiers & Metadata
Client Connections
• Client reads and writes over S3 or CDMI (HTTPS) to the StorageGRID Webscale system
Object Identifier
Metadata
• Object type: JPG
• Date modified: 07/21/2014
• GPS coordinates: Lat, Long
• Location: DC @ Seattle
Extensive metadata management
Metadata: why it matters
Scale beyond traditional application metadata to enable new capabilities
Flexibility
• Metadata is application defined
• Up to 4096 fields can be created as requirements change (CDMI)
• No lock-in to a predefined schema
ILM engine evaluates the metadata and applies policies
• S3 metadata is available to the policy engine
Metadata is distributed throughout the grid
• Increased scalability and resiliency
• Faster retrieval and efficient ILM policy evaluation
Example object
Object Identifier: 00006FFD00192A1200555FFEE12039468EBF622D9402C4F962
Locations:
• Location 1: Data Center 1/DC1-S1/LDR
• Location 2: Data Center 2/DC2-S3/LDR
Metadata:
• CDMI/CVTE: 0
• CDMI/META: {"application":"finance","doctype":"contract","project":"45667"}
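As a rough sketch of how application-defined metadata could drive policy decisions, the snippet below loads metadata values like those in the example and checks them against a set of criteria. The dictionary layout and the `matches` helper are assumptions made for illustration, not StorageGRID's internal representation.

```python
import json

# Record mirroring the slide's example; the dict layout is an assumption.
record = {
    "object_id": "00006FFD00192A1200555FFEE12039468EBF622D9402C4F962",
    "locations": ["Data Center 1/DC1-S1/LDR", "Data Center 2/DC2-S3/LDR"],
    "meta": json.loads(
        '{"application":"finance","doctype":"contract","project":"45667"}'
    ),
}

def matches(metadata, criteria):
    """Return True when every key/value in criteria is present in metadata."""
    return all(metadata.get(k) == v for k, v in criteria.items())

is_finance_contract = matches(
    record["meta"], {"application": "finance", "doctype": "contract"}
)
```

Because the schema is not predefined, a new field needs no change to `matches`; it only needs to appear in the criteria.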
Extensive metadata management
Metadata Management
• Metadata includes object size, user metadata, bucket name, and account ID
• CDMI supports a large number of user metadata fields (up to 4096)
Policy support
• Metadata can be used in policies (CDMI & S3)
• No pre-configuration is required for metadata; it is immediately available for policy use
Storage Pools
StorageGRID Data Management
Simple configuration of SLO-based storage across multiple sites and storage grades
• Sites (Link Cost Groups): Munich, Germany; San Francisco, USA; Vancouver, CA
• Storage grades: TAPE, SAS, FLASH
• Storage pools: Tape Storage Pool, EU Storage Pool, Fast Storage Pool
[Diagram: Storage Nodes T1, T2, T3, S1, F1, F2, and F3 distributed across the three sites and grouped into the storage pools]
Customer-Defined Storage Grades
StorageGRID Data Management
Customers can configure their own storage grades and associate them with specific nodes
• Creating storage grades
• Associating storage nodes with storage grades
Site configuration: not just a site
StorageGRID Data Management
Customers can model network costs, creating powerful configurations
Possible examples
• Configure two sites to be treated as one
• Specify which sites are connected by high-bandwidth and low-bandwidth pipes
• Prefer traffic to go in a particular direction
Dynamic Policy Engine: An Overview
Data management key concepts
Manage policies, not objects
Evaluate objects based on metadata such as:
• Custom user / application metadata
• Method of ingest (S3 or CDMI)
• Size of object
• Last access time
Apply ILM rules to set:
• Geography: placement of an object
• Storage grade: type of storage used to store an object
• Replication: number of copies stored
• Retention: set time during which an object cannot be purged
[Diagram: App1 writing to Site1, Site2, and Site3]
Policy Management
Data Management Key Concepts
ILM Policy
When
• At ingest
• Objects at rest (already ingested)
• After a read (enabling caching)
If
• Metadata matches specific criteria
Then
• Move/copy to one or more storage pools
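The when / if / then structure can be modeled as a small rule object. The sketch below is a hypothetical model, not the product's rule format: `IlmRule`, `evaluate`, the event names, and the first-match-wins behavior are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

@dataclass
class IlmRule:
    when: str                                    # "ingest", "at_rest", or "after_read"
    condition: Callable[[Dict[str, str]], bool]  # the "if" part
    target_pools: List[str]                      # the "then" part

def evaluate(rules, event, metadata):
    """Return the storage pools of the first rule matching event and metadata."""
    for rule in rules:
        if rule.when == event and rule.condition(metadata):
            return rule.target_pools
    return []

rules = [
    IlmRule("ingest", lambda m: m.get("application") == "finance",
            ["Fast Storage Pool", "EU Storage Pool"]),
    IlmRule("ingest", lambda m: True, ["Tape Storage Pool"]),  # catch-all
]
```

Ordering the rules from most to least specific lets a catch-all rule act as the default placement.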
ILM Rule: Specify the Conditions
Data Management Key Concepts
When / If
Customers designate when a rule will be applied and what conditions must be met to trigger the action for placement and retention.
ILM Rule: Defining data placement and retention
Data Management Key Concepts
Then
Simple configuration of complex placement rules, which are graphically displayed for easy understanding
Combine ILM rules
Data Management Key Concepts
Highly complex business logic can be implemented by combining multiple rules into a policy
ILM Policy Example
• Application writes to the grid via S3
• Metadata is evaluated (BucketName = ClientX)
• Store objects with S3 metadata "Bucket Name = ClientX" on ingest at DC1 on SSD and at DC2 on SATA for 90 days: 1 x copy DC1\SSD, 1 x copy DC2\SATA
• After 90 days, store at DC1 on SATA and at DC3 on tape: 1 x copy DC1\SATA, 1 x copy DC3\Tape
[Diagram: DC1, DC2, DC3]
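The ClientX policy amounts to a function from bucket name and object age to a set of placements. The sketch below assumes the switch happens once the object is 90 days old; the function name `placements` and that exact boundary are illustrative assumptions.

```python
def placements(bucket_name, age_days):
    """Return (data center, storage) pairs for an object of the given age."""
    if bucket_name != "ClientX":
        return []  # other buckets would be covered by other rules
    if age_days < 90:
        # On ingest and for the first 90 days
        return [("DC1", "SSD"), ("DC2", "SATA")]
    # After 90 days
    return [("DC1", "SATA"), ("DC3", "Tape")]
```

For example, `placements("ClientX", 0)` yields the SSD and SATA copies, while `placements("ClientX", 120)` yields the SATA and tape copies.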
ILM Policy Example
What if requirements change?
• A new data center is brought online and now we must store a copy at DC4
• We can edit the policy and apply it not just to new data, but also re-evaluate existing data and create new replicas as needed
• Without impact to the performance of the grid
[Diagram: DC1, DC2, DC3, DC4]
SLA Example
Set custom service levels via metadata
Create ILM rules to set customized service levels
• GOLD = 2 x copies on SSD, 1 x copy on SATA
• SILVER = 1 x copy on SAS, 2 x copies on SATA
• BRONZE = 1 x copy on SATA, 1 x copy on tape
Example metadata: {"sla":"gold"}
[Diagram: DC1, DC2, DC3]
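A minimal sketch of mapping the "sla" metadata value to copy counts follows. The copy counts come from the service levels above; the `SERVICE_LEVELS` table and the `copies_for` helper are hypothetical names, and returning an empty mapping for an unknown level is an assumption.

```python
import json

# Copy counts per storage type, taken from the service levels on the slide.
SERVICE_LEVELS = {
    "gold": {"SSD": 2, "SATA": 1},
    "silver": {"SAS": 1, "SATA": 2},
    "bronze": {"SATA": 1, "tape": 1},
}

def copies_for(metadata_json):
    """Map an object's "sla" metadata value to its required copies."""
    sla = json.loads(metadata_json).get("sla", "").lower()
    return SERVICE_LEVELS.get(sla, {})

gold = copies_for('{"sla":"gold"}')
```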
Built-in Object Integrity Verification and Self-Healing
Durability at the object level
[Diagram: object ID 41E85A1D, with data and metadata covered by a fingerprint]
• A digital fingerprint is calculated per object upon ingest
• Interlocking layers of object-wide and sub-object-level integrity protection: object hash value, content hash value, CRC checksum, HMAC message authentication digest
• Continuous verification: on ingest, retrieval, replication, migration, and at rest
• An object failing an integrity test is automatically replaced with another copy
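The layered checks and the self-healing step can be sketched with Python's standard library. The specific algorithms (SHA-256, CRC-32, HMAC-SHA-256), the `SECRET` key, and the helper names are assumptions; the slides name the kinds of checks but not how they are implemented.

```python
import hashlib
import hmac
import zlib

SECRET = b"hypothetical-grid-key"  # assumption: a key known only to the grid

def fingerprint(data):
    """Compute three independent checks over the object content."""
    return {
        "sha256": hashlib.sha256(data).hexdigest(),
        "crc32": zlib.crc32(data),
        "hmac": hmac.new(SECRET, data, hashlib.sha256).hexdigest(),
    }

def is_intact(data, expected):
    """A copy is intact only if every check matches the stored fingerprint."""
    return fingerprint(data) == expected

def heal(copies, expected):
    """Overwrite every failing copy with the first copy that verifies.
    Raises StopIteration if no copy passes verification."""
    good = next(c for c in copies.values() if is_intact(c, expected))
    return {loc: (c if is_intact(c, expected) else good)
            for loc, c in copies.items()}

original = b"patient-scan"
expected = fingerprint(original)        # computed once, at ingest
copies = {"DC1": original, "DC2": b"patient-scam"}  # DC2 is corrupted
healed = heal(copies, expected)
```

Healing depends on at least one replica still verifying, which is one reason replica count and integrity checking work together.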
Continuous and active monitoring
Audit and Reporting
• A real-time audit feed can be used to create custom reports
• Audit logs are created in an open format, which allows the use of partner products such as Splunk
Comprehensive audit feed for:
• Chargeback and billing
• Search integration
• Custom reporting
• Security diagnostics
• Compliance events
• Validating performance for SLAs
System Performance Analysis
Perform deep performance analysis to fine-tune your infrastructure
Analyze system performance and activity at every level
• Grid wide
• Site specific
• Service level
Data Flow
Understanding how data flows through a StorageGRID Webscale system for different operations
Data flow overview
Basic topology and business rules
Three departments
• Sales
• Finance
• Marketing
Two pools per department
• Capacity pool
• Performance pool
Three sites
• Two data center sites
• One remote office
Policy: Finance
• Ingest into local pool
• After 1 day, create a copy in the performance pool
• After 30 days, move the copy to the capacity pool
[Diagram: a Satellite Office with a Local Storage Pool connected over the WAN to Data Center 1 and Data Center 2, which hold the Sales, Finance, and Marketing Performance and Capacity Storage Pools]
Object Ingest and Replication
Transmitting objects from client to StorageGRID Webscale
• Receives an object write request from the client along with custom metadata
• Returns an object ID to the client (for example, 0x05DFF4338ADCE6F5) and forms the file payload into an object: packetization, digital fingerprint, compression (optional), encryption (optional)
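The ingest steps can be approximated as a small pipeline. Everything here is a sketch under stated assumptions: the chunk size, the SHA-256 fingerprint, the ID derived from the digest, and the `ingest` / `reassemble` helpers are hypothetical, and the optional encryption step is omitted because the standard library has no cipher for it.

```python
import hashlib
import zlib

CHUNK_SIZE = 4  # tiny for illustration; a real segment size would be far larger

def ingest(payload, metadata, compress=False):
    """Form a payload into an object record and return (object_id, record)."""
    digest = hashlib.sha256(payload).hexdigest()       # digital fingerprint
    body = zlib.compress(payload) if compress else payload
    chunks = [body[i:i + CHUNK_SIZE]                    # packetization
              for i in range(0, len(body), CHUNK_SIZE)]
    object_id = "0x" + digest[:16].upper()              # hypothetical ID scheme
    record = {"chunks": chunks, "fingerprint": digest,
              "compressed": compress, "metadata": dict(metadata)}
    return object_id, record

def reassemble(record):
    """Rebuild the original payload from the stored chunks."""
    body = b"".join(record["chunks"])
    return zlib.decompress(body) if record["compressed"] else body

object_id, record = ingest(b"quarterly report", {"department": "finance"},
                           compress=True)
```

The fingerprint is taken over the original payload before compression, so it can be re-checked after reassembly regardless of which optional steps were applied.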
Object Ingest and Replication - Continued
• By default, creates a semi-synchronous local copy for immediate redundancy
• Metadata is stored and replicated
• Replicates the object as per ILM policy
Finance policy example
• Ingest into local pool
• After 1 day, create a copy in the performance pool
• After 30 days, move the copy to the capacity pool
Object Replication
Metadata-driven ILM with optimal resource utilization
• Optimal resources in target storage pools are selected for the replication destination
• While honoring ILM, the grid considers network costs, server utilization, and storage utilization
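One way to picture destination selection is a weighted score over the three factors. The scoring formula, the equal default weights, and the node names below are assumptions for illustration; the slides do not describe the actual selection algorithm.

```python
def pick_destination(candidates, weights=(1.0, 1.0, 1.0)):
    """Return the name of the candidate node with the lowest weighted score.
    Each candidate has network_cost, cpu_util, and disk_util in [0, 1]."""
    w_net, w_cpu, w_disk = weights

    def score(node):
        return (w_net * node["network_cost"] + w_cpu * node["cpu_util"]
                + w_disk * node["disk_util"])

    return min(candidates, key=score)["name"]

# Hypothetical nodes in the target storage pool
nodes = [
    {"name": "DC1-S1", "network_cost": 0.1, "cpu_util": 0.9, "disk_util": 0.8},
    {"name": "DC1-S2", "network_cost": 0.1, "cpu_util": 0.3, "disk_util": 0.4},
    {"name": "DC2-S3", "network_cost": 0.7, "cpu_util": 0.2, "disk_util": 0.2},
]
```

With equal weights the lightly loaded local node wins; setting the network weight to zero shifts the choice to the least utilized node regardless of distance.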
Object Retrieval
Requesting objects by the client from StorageGRID Webscale
• Receives an object read request from the client
• Determines the optimal object location relative to the request location
• Streams a copy of the object to the client and verifies the integrity of the object on the fly
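Choosing the optimal location relative to the requester can be sketched as a lookup over link costs between sites. The cost values, the `LINK_COST` table, and the helper names are hypothetical; only the site names come from the data flow topology.

```python
# Hypothetical link costs between sites; lower means cheaper to reach.
LINK_COST = {
    ("Satellite Office", "Data Center 1"): 10,
    ("Satellite Office", "Data Center 2"): 25,
    ("Data Center 1", "Data Center 2"): 5,
}

def link_cost(a, b):
    """Cost between two sites; zero within a site, infinite if unknown."""
    if a == b:
        return 0
    return LINK_COST.get((a, b), LINK_COST.get((b, a), float("inf")))

def best_location(request_site, object_sites):
    """Pick the site holding a copy that is cheapest to reach."""
    return min(object_sites, key=lambda s: link_cost(request_site, s))
```

A request from the Satellite Office for an object stored in both data centers would be served from Data Center 1 under these costs.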
Thank You
Management Services
Admin Node
• NMS (Network Management System): Provides the administrative interface for configuration and monitoring of the grid.
• CMN (Configuration Management Node): Manages system-wide configurations such as connection profiles, grid tasks, and system configuration options.
• AMS (Audit Management System): Keeps logs of system activity and events.
Object, Metadata Storage and Replication
Storage Nodes
• LDR (Local Distribution Router): Stores, moves, verifies, and retrieves object data stored on disks.
• DDS (Distributed Data Store): Stores, replicates, and protects metadata in the key-value store.
• CMS (Content Management System): Manages object placement and replication based on ILM rules.
• ADC (Administrative Domain Controller): Maintains topology information and provides authentication services.
Load Balancing and Client Connectivity
API Gateway Nodes
• CLB (Connection Load Balancer): Acts as a switchboard for connecting clients to the most efficient LDR service for ingest and retrieval.
Ports
• 8081: CDMI
• 8082: S3
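A client-side helper for picking the right port might look like the sketch below. The port numbers come from the slide; the `endpoint` function and the host name `gateway.example.com` are hypothetical, and HTTPS is assumed based on the client connection slides.

```python
API_PORTS = {"cdmi": 8081, "s3": 8082}  # port numbers from the slide

def endpoint(gateway_host, protocol):
    """Build the HTTPS base URL for a client protocol on a gateway node."""
    port = API_PORTS[protocol.lower()]
    return f"https://{gateway_host}:{port}"

s3_url = endpoint("gateway.example.com", "S3")
```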
Data storage on archive media
Archive Nodes
• ARC (Archive): Communicates with archiving middleware to store and retrieve data to and from archive media such as tape.