AB Periasamy | CTO Gluster, Inc.
Thu, June 9, 2011
Future of Cloud Storage
Petascale Cloud Filesystem 2
Storage Transforming to Reflect Compute
Storage must support the public and private cloud environment
• Storage is the Achilles heel of full data center virtualization
• Storage performance, availability, capacity, and interface are the Achilles heel of public cloud
• Big Data, data migration, and multi-site consistency are the Achilles heel of hybrid cloud
Storage will look like the computing environment
• Storage should be a commoditized, virtualized, centrally managed pool
• Monolithic proprietary systems now challenged by nimble and shared network storage
• Storage becomes a software problem
• The “Google model” of storage
• Multi-tenant / shared
• Virtualized
• Automated
• Commoditized
• Standardized
• Scale on Demand
• In the Cloud
• Scale Out
• Free Software / Open Source
Disk Systems & Storage Interconnects
FC (FCP)
InfiniBand (SRP, iSER, NFSoRDMA)
Ethernet (iSCSI, FCoE, AoE, NBD/DRBD, NFS, HTTP)
DAS vs JBOD vs SAN
SATA vs SAS vs SSD
Filesystems
Ext3/4, XFS, Btrfs, ZFS
NFS, OCFS2, Lustre, GFS, GPFS
MogileFS, VMFS, SheepDog, Ceph
GlusterFS
Future of Cloud Storage
Filesystems vs Object Storage vs Big Data vs NoSQL
GlusterFS towards Unified Storage
Unified Multi-Protocol Storage
NAS + Objects + Big Data + SAN
GlusterFS Simple Commands
# gluster peer probe HOSTNAME
# gluster volume info
# gluster volume create VOLNAME [stripe COUNT] [replica COUNT] [transport tcp | rdma] BRICK …
# gluster volume delete VOLNAME
# gluster volume add-brick VOLNAME NEW-BRICK ...
# gluster volume rebalance VOLNAME start
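As an illustration of the "volume create" syntax on this slide, here is a minimal Python sketch that assembles the argument list in the order the slide gives. The helper name volume_create_argv, the volume name, and the server1/server2 brick paths are hypothetical. The check that the brick count is a multiple of stripe × replica is an assumption about how the command validates its input, not something stated on the slide.

```python
from typing import List, Optional


def volume_create_argv(volname: str,
                       bricks: List[str],
                       stripe: Optional[int] = None,
                       replica: Optional[int] = None,
                       transport: Optional[str] = None) -> List[str]:
    """Build argv for:
    gluster volume create VOLNAME [stripe COUNT] [replica COUNT]
                                  [transport tcp | rdma] BRICK ...
    (hypothetical helper; it only composes the command, it does not run it)
    """
    if not bricks:
        raise ValueError("at least one brick is required")
    if transport not in (None, "tcp", "rdma"):
        raise ValueError("transport must be 'tcp' or 'rdma'")

    # Assumption: bricks are consumed in groups of stripe * replica.
    group = (stripe or 1) * (replica or 1)
    if len(bricks) % group != 0:
        raise ValueError("brick count must be a multiple of stripe * replica")

    argv = ["gluster", "volume", "create", volname]
    if stripe:
        argv += ["stripe", str(stripe)]
    if replica:
        argv += ["replica", str(replica)]
    if transport:
        argv += ["transport", transport]
    return argv + list(bricks)


# Hypothetical two-way replicated volume across two servers.
argv = volume_create_argv(
    "vol0",
    ["server1:/export/brick1", "server2:/export/brick1"],
    replica=2,
    transport="tcp",
)
print(" ".join(argv))
```

The resulting list could be handed to subprocess.run on a host where the gluster CLI is installed; the sketch itself never executes it.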
Gluster Architecture Advantages
Software only
No metadata server
• Fully distributed architecture, no bottleneck
• Gluster Elastic Hash
High performance global namespace
• Scale out with linear performance
• Hundreds of petabytes
• 1 GbE, 10 GbE
High availability
• Replication to survive hardware failure
• Self-healing
• Data stored in NFS-like native format
Stackable userspace design
• No kernel dependencies, simple install
• Match specific workload profiles
• Early maturity and rich functionality
‘Google Storage’ for Everyone
• Intelligence in the SW
• Leverage commodity HW
• Scale-out elastically
• Replication for reliability
• Software enables virtualization
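The "no metadata server" point can be illustrated with a small hash-placement sketch in Python. This is not Gluster's Elastic Hash algorithm. It is a simplified stand-in that assumes SHA-256 as the hash and equal-sized slices of a 32-bit hash space, one slice per brick. The function names and the brick and file names are hypothetical. What it shows is that every client can compute a file's location from its name alone, with no central lookup.

```python
import hashlib
from typing import List


def hash32(name: str) -> int:
    """Map a file name to a 32-bit integer (SHA-256 is a stand-in hash)."""
    digest = hashlib.sha256(name.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


def locate(name: str, bricks: List[str]) -> str:
    """Return the brick whose slice of the 32-bit hash space holds the name."""
    if not bricks:
        raise ValueError("need at least one brick")
    # Scale the hash into the range [0, len(bricks)) without a lookup table.
    index = (hash32(name) * len(bricks)) >> 32
    return bricks[index]


# Hypothetical bricks and file names, for illustration only.
bricks = [
    "server1:/export/brick1",
    "server2:/export/brick1",
    "server3:/export/brick1",
]
files = ["song-001.mp3", "song-002.mp3", "vm-image.qcow2", "report.pdf"]
placement = {name: locate(name, bricks) for name in files}

for name, brick in placement.items():
    print(name, "->", brick)
```

In this simplified scheme, adding a brick changes the slices, so some existing files hash to a different brick afterwards. That is the kind of data movement a rebalance step has to take care of.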
Evolution of GlusterFS
2006-2009 GlusterFS v1.0 – v3.0
Distributed Filesystem capabilities with self-healing, synchronous replication, stripe, distribute (global name space)
2010 GlusterFS v3.1
Elastic Cloud capabilities
2011 Q2 GlusterFS v3.2
Geographic replication, enhanced monitoring, directory-level quotas (also works as cloud usage billing APIs)
2011 Q3/Q4
Hadoop HDFS drop-in replacement, Unified File and Object Storage (Amazon S3 compatible) and Near CDP.
Story of Gluster
1st meeting room
1st Office US
1st Office Bengalooru
1st Office Bengalooru
1000s of Community Deployments
Fast Growing Commercial Deployments
Thank You
www.gluster.org
Gluster Deployment
Private Cloud
Public Cloud
GlusterFS & OpenStack
VM Image Storage – Answer to VMware VMFS
Unified File & Object Storage – Application Data
GeoReplication – Enable Hybrid Clouds
Partners Healthcare
Problem
• Capacity growth from 144 TB to 1+ PB
• Multiple distributed users/departments
• Multi-OS access: Windows, Linux and Unix
Solution
• GlusterFS Cluster
• Solaris/ZFS/x4500 w/ InfiniBand
• Native CIFS/NFS access
Benefits
• Capacity on demand / pay as you grow
• Centralized management
• Higher reliability
• OPEX decreased by 10X
• Over 500 TB
• 9 Sun “Thumper” systems in cluster
Private Cloud: Centralized Storage as a Service
Pandora Internet Radio
Problem
• Explosive user & title growth
• As many as 12 file formats for each song
• ‘Hot’ content and long tail
Solution
• Three data centers, each with a six-node GlusterFS cluster
• Replication for high availability
• 250+ TB total capacity
Benefits
• Easily scale capacity
• Centralized management; one administrator to manage day-to-day operations
• No changes to application
• Higher reliability
• 1.2 PB of audio served per week
• 13 million files
• Over 50 GB/sec peak traffic
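A back-of-the-envelope check puts the Pandora figures in proportion. The sketch below assumes decimal units (1 PB = 10^15 bytes, 1 GB = 10^9 bytes) and that the weekly volume and the peak rate describe the same traffic. Neither assumption is stated on the slide.

```python
# Figures quoted on the slide.
WEEKLY_VOLUME_PB = 1.2      # audio served per week
PEAK_GB_PER_SEC = 50        # quoted peak traffic

# Assumed decimal units.
BYTES_PER_PB = 10**15
BYTES_PER_GB = 10**9
SECONDS_PER_WEEK = 7 * 24 * 60 * 60

# Average rate if the weekly volume were spread evenly over the week.
average_gb_per_sec = (
    WEEKLY_VOLUME_PB * BYTES_PER_PB / SECONDS_PER_WEEK / BYTES_PER_GB
)

# How far the quoted peak sits above that average.
peak_to_average = PEAK_GB_PER_SEC / average_gb_per_sec

print(f"average: {average_gb_per_sec:.2f} GB/sec")
print(f"peak / average: {peak_to_average:.1f}x")
```

Under these assumptions the average works out to roughly 2 GB/sec, so the quoted peak is about 25 times the average.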
Cincinnati Bell Technology Solutions
Problem
• Host a dedicated enterprise cloud solution
• Large scale VMware environment
• Need high availability
Solution
• Gluster for VM storage, NFS to clients
• SAS drives on back-end
• Replication for high availability
Benefits
• Storage provisioning from 6 weeks to 15 minutes
• Vendor agnostic storage
• Low cost of service delivery
• Elastic growth
• Large scale VM storage
• Low cost service delivery for enterprise customer
• Drastic reduction in provisioning time
Envoy Media
Problem
• Limited scalability
• Slow response to demand spikes
• Manual data management
Solution
• Four EBS volumes under Gluster global namespace
• Replication for high availability
• EC2 for compute; S3 for backup
Benefits
• No change to application
• Content immediately available to all servers
• Automatic resource allocation
• Lower cost (vs. colo and proprietary options)
• Targeted media serving
• 100% AWS hosted
• Unpredictable traffic
Public Cloud: Media Serving on AWS