SolidFire and NetApp All-Flash FAS Architectural Comparison

JUNE 2015

Be fueled by SolidFire. Do what’s never been done.

This document provides an overview of NetApp’s All-Flash FAS architecture as it compares to SolidFire. It is not intended to be exhaustive; it covers where the solutions differ and presents the impact on overall suitability for the needs of a next generation data center.



Overview

NetApp Overview

NetApp’s All-Flash FAS solution is a traditional NetApp FAS cluster configured with enterprise multi-level cell (eMLC) flash drives running as a scale-out cluster via NetApp’s Clustered Data ONTAP 8.3 operating system.

NetApp’s scale-out clusters are built by taking traditional NetApp dual-controller or dual-node pairs, now called high availability (HA) pairs, and deploying them into a single cluster via a 10GbE interconnect. NetApp’s scale-out NAS configurations can scale to 12 HA pairs (24 nodes), whereas SAN configurations are limited to 4 HA pairs (8 nodes). Each node pair is configured as an active/active pair that can be non-disruptively upgraded and non-disruptively balanced by moving data to other node pairs in the cluster.

NetApp all-flash clusters claim impressive capacity and performance scale. However, while most of the cluster capacity can be virtualized into a single namespace, performance is tied directly to the controller pair to which the drives are attached.

SolidFire Overview

SolidFire provides a scale-out all-flash storage platform designed to deliver guaranteed storage performance to thousands of application workloads side by side, allowing consolidation under a single storage platform. SolidFire systems scale out over a 10Gb Ethernet or 8/16Gb FC network in clusters ranging from 4 to 100 nodes.

Each SolidFire storage node is a completely self-contained, all-SSD storage appliance that, when clustered together via a 10Gbps iSCSI interconnect, makes up the building blocks of the SolidFire scale-out system. As a clustered scale-out architecture, the independent node resources are aggregated together as a cluster where capacity and performance are virtualized into separate pools and scale linearly with the addition of each node to the system.

A cluster is managed by the SolidFire Element OS, which is distributed across each storage node. The Element OS is responsible for Quality of Service (performance SLAs), data efficiency (compression, deduplication, and thin provisioning), data replication (for resilience), and overall management of the system.

SolidFire.com

Findings:

NetApp’s All-Flash FAS offering certainly delivers significant scale and performance for large-scale infrastructures. However, as this solution was built on a traditional, active/passive dual-controller architecture, it is not particularly well-suited for a next generation data center architecture.

Specifically:

■ Scale and agility - NetApp’s Clustered ONTAP OS enables the platform to avoid the traditional scale-out limitations of a dual-controller solution. However, instead of a truly distributed scale-out cluster, the result is an extremely large cluster of traditional controller pairs. NetApp’s scale-out capabilities require capacity and performance to be scaled separately. Performance can only be scaled by adding controller pairs to the cluster, whereas capacity is scaled by adding disk shelves to controller pairs. With no ability to virtualize cluster performance, NetApp users must ensure disk shelves are balanced across the controller pairs, otherwise stranded performance or additional unplanned capacity may result. In contrast, SolidFire’s distributed scale-out architecture enables predictable, granular scale with much denser capacity and performance per node, resulting in a lower cost per GB. SolidFire’s architecture also allows for the non-disruptive addition or removal of any single 1U cluster node, while maintaining application-specific quality of service (QoS) independent of capacity. SolidFire also supports the mixing of different capacity and performance nodes within a single cluster for fine-grain scale-out and better long term customization capability.

■ Guaranteed performance - A key requirement of a next generation data center is an environment with repeatable, predictable performance capable of supporting tens, hundreds, or thousands of applications running efficiently in parallel. NetApp performance is susceptible to runaway applications, also known as “noisy neighbors,” because its QoS capability relies on workload rate limiting and lacks fine-grain performance controls such as minimum and burst I/O settings. Because of this, NetApp is not able to provide a performance guarantee to all workloads. Often the only solution for workloads that need more performance in a NetApp environment is to add more controller pairs or additional FlashCache cards. Uniquely, SolidFire enables service providers and enterprises to specify and guarantee minimum, maximum, and burst IOPS for individual storage volumes dynamically and independently of capacity, eliminating the “noisy neighbor” problem in mixed workload environments.

■ Automated management - Both SolidFire and NetApp have REST APIs; however, NetApp requires a workflow engine to provide a fully automated infrastructure. SolidFire provides a simple, native API set with very few logical entities to automate. NetApp clusters carry a great deal of management overhead that, for the most part, has been reduced or automated within SolidFire clusters. NetApp storage administrators spend a lot of time performing management tasks such as RAID group management, dealing with spare drives, tracking down performance issues, developing caching and prioritization schemes, and constantly worrying whether they will have enough performance or capacity to serve all workloads. SolidFire’s REST-based API is integrated at the very core of the system, enabling automation of every aspect of storage provisioning, management, monitoring, and reporting. So whether you have rolled your own management framework or are using an off-the-shelf management stack, the SolidFire API makes automating storage management simple and straightforward.
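To make the small-API-surface point concrete, here is a hedged sketch of building a volume-creation request. SolidFire’s Element API is JSON-RPC over HTTPS; the method and field names below (CreateVolume, accountID, qos with minIOPS/maxIOPS/burstIOPS) follow the Element API’s documented style but are used illustratively here and should be verified against the API reference before use:

```python
# Illustrative sketch, not an official SolidFire client. The Element API
# is JSON-RPC over HTTPS; method and parameter names below follow its
# documented style but should be checked against the API reference.
import json

def create_volume_request(name, account_id, size_bytes,
                          min_iops, max_iops, burst_iops):
    """Build a JSON-RPC payload provisioning a volume with per-volume QoS."""
    return {
        "method": "CreateVolume",
        "params": {
            "name": name,
            "accountID": account_id,
            "totalSize": size_bytes,
            "qos": {                      # QoS set at creation time:
                "minIOPS": min_iops,      # guaranteed floor
                "maxIOPS": max_iops,      # sustained ceiling
                "burstIOPS": burst_iops,  # short-term burst ceiling
            },
        },
        "id": 1,
    }

# A single POST of this body to the cluster endpoint would create the
# volume and its performance guarantee in one call; no workflow engine.
payload = create_volume_request("sql-data", 7, 1 << 40, 1000, 5000, 8000)
body = json.dumps(payload)
```

The point of the sketch is the small surface area: one call carries both the provisioning request and the performance SLA.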


SolidFire vs. NetApp

QoS

NetApp’s All-Flash FAS does offer a soft form of QoS. However, unlike SolidFire, NetApp QoS is based on rate limiting: the entire volume is assigned a throughput maximum, and IOPS are throttled to ensure throughput stays within the set limit. The drawback to this soft form of QoS is that, without a minimum performance setting, the performance of a rate-limited volume will likely diminish as more volumes and workloads are added to the cluster. Simply stated, NetApp’s QoS has no way of guaranteeing that all workloads within the cluster will always have the performance they need.

SolidFire allows true QoS “performance virtualization” of resources. This patented SolidFire technology permits the management of storage performance independent from storage capacity. SolidFire’s architecture offers fine-grain QoS control, including settings for minimum, maximum, and burst IOPS on a per volume basis. Because performance and capacity are managed independently, SolidFire clusters are able to guarantee predictable storage performance to thousands of applications within a shared infrastructure.
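As a rough illustration of how the three settings can interact, the toy burst-credit model below (an assumption for illustration, not SolidFire’s actual scheduler) caps sustained demand at maxIOPS, banks unused headroom as burst credit up to burstIOPS, and records minIOPS as the floor a cluster scheduler would protect under contention (contention itself is not modeled here):

```python
# A minimal sketch (not SolidFire's implementation) of min/max/burst
# IOPS semantics: sustained demand is capped at max_iops, headroom left
# unused in a given second accrues as burst credit up to burst_iops,
# and min_iops records the floor a scheduler would protect under
# cluster-wide contention (contention is not modeled here).
class VolumeQoS:
    def __init__(self, min_iops, max_iops, burst_iops):
        self.min_iops = min_iops      # guaranteed floor (informational here)
        self.max_iops = max_iops      # sustained ceiling
        self.burst_iops = burst_iops  # absolute short-term ceiling
        self.credits = 0              # accumulated burst credit, in IOs

    def allowed_this_second(self, demand):
        # Under the sustained cap: grant everything, bank the headroom.
        if demand <= self.max_iops:
            self.credits = min(self.credits + (self.max_iops - demand),
                               self.burst_iops)
            return demand
        # Over the cap: draw down credit, but never exceed burst_iops.
        grant = min(demand, self.max_iops + self.credits, self.burst_iops)
        self.credits -= grant - self.max_iops
        return grant

qos = VolumeQoS(min_iops=500, max_iops=1000, burst_iops=2000)
qos.allowed_this_second(400)          # quiet second banks 600 credits
print(qos.allowed_this_second(1500))  # prints 1500: burst above max
```

A real scheduler would also arbitrate between volumes so that every volume receives at least its minimum; the sketch only shows the per-volume accounting.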

Scale

NetApp and SolidFire clusters are both built to scale out to large capacities and significant performance. A major drawback to NetApp’s approach to scale-out clusters is that it is very difficult to scale both performance and capacity predictably.

NetApp’s use of clustered controller pairs means customers have to scale performance and storage separately. Capacity is added through the addition of disk shelves, provided there is enough controller performance to handle the added capacity and application workload; performance is added through additional controller pairs. Each controller pair has varying performance and capacity limits based on the number and type of drives and disk shelves attached. In addition, most storage efficiencies and management functions that run inline in a SolidFire cluster occur partially or completely post-process in a NetApp cluster, rather than globally throughout it.

This prevents a NetApp deployment from scaling capacity or performance predictably or linearly, which results in over-provisioning storage and/or controllers in order to ensure enough performance and capacity to meet workload demands. This results in higher storage hardware, software licensing and support costs. SolidFire’s cluster mesh architecture enables the linear scaling of available capacity and IOPS, because nodes and capacity are added in a very linear fashion without any RAID penalties. At any point during or after deployment, nodes can be added, removed, or replaced to increase capacity and/or performance without impacting existing workloads.

As nodes are added, their capacity and IOPS are aggregated into the total provisionable capacity and performance available for assignment to any existing or new volumes. Conversely, if a node fails or is removed from the cluster, SolidFire Helix (patent-pending data protection architecture), automatically initiates a rebuild, restoring redundancy for any copies that were residing on the unavailable node and eliminating any threat of a single point of failure.

Figure 1: SolidFire QoS

SolidFire architecture allows users to set minimum, maximum, and burst IOPS on a per volume basis.


SolidFire’s 100% scale-out architecture provides customers with granular, predictable, and linear scale of both capacity and performance. SolidFire enables customers to start clusters as small as four nodes and granularly scale out as little as a single node at a time up to 100 nodes, offering up to an incredible 3.4PB of capacity and 7.5M IOPS. SolidFire’s predictable scale, ability to non-disruptively add or remove nodes from a cluster, and support for mixed-node clusters provide customers with the ultimate in scale-out flexibility at a 76% lower total cost of ownership over five years compared to traditional storage systems (source: ESG, 2015).

Data durability

NetApp utilizes RAID for data durability and usually employs a very wide 26+2 RAID 6 stripe. NetApp’s traditional RAID architecture means that a drive failure results in a recovery period that can stretch into hours or even days while the drive rebuilds. During this recovery period, performance is significantly impacted and the cluster is vulnerable to further faults and even possible data loss. Further, the use of RAID provides no protection or ability to heal from a controller failure.

Instead of traditional RAID, SolidFire utilizes its Helix data protection architecture to protect the entire cluster from failure. SolidFire’s Helix maintains redundant copies of data across a global cluster, which not only enables a SolidFire cluster to automatically recover from a failed drive but also to automatically recover from a failed node. In the event of a node or drive failure, all data is automatically redistributed across the cluster in a matter of minutes, and the failed node or drive is automatically disassociated from the cluster.

Helix not only automatically removes faulted components from the cluster, but it also ensures the cluster does not remain in a single-point-of-failure state until the failed component is physically removed. Helix’s dual-copy architecture and automated rebalancing capability are also what enable customers to add or remove nodes to or from a cluster non-disruptively, without any manual configuration.
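The healing behavior can be sketched as follows. This is a conceptual model under stated assumptions (two replicas per block, round-robin placement), not SolidFire’s internal placement logic:

```python
# Conceptual sketch of Helix-style dual-copy protection (assumptions,
# not SolidFire internals): every block has two replicas on distinct
# nodes; when a node fails, each block that lost a replica is
# re-replicated to another surviving node, restoring redundancy
# without a RAID rebuild.
import itertools

def place(blocks, nodes):
    """Assign each block two replicas on distinct nodes (round-robin)."""
    pairs = itertools.cycle(itertools.combinations(nodes, 2))
    return {b: set(next(pairs)) for b in blocks}

def heal(placement, failed, nodes):
    """Re-replicate any block that lost a copy on the failed node."""
    survivors = [n for n in nodes if n != failed]
    for block, replicas in placement.items():
        if failed in replicas:
            replicas.discard(failed)
            (survivor,) = replicas  # the one remaining copy
            # pick any other surviving node for the new second copy
            target = next(n for n in survivors if n != survivor)
            replicas.add(target)
    return placement

nodes = ["n1", "n2", "n3", "n4"]
placement = place(range(8), nodes)
heal(placement, "n2", nodes)  # "n2" fails; redundancy is restored
assert all(len(r) == 2 and "n2" not in r for r in placement.values())
```

Because only the blocks that lost a replica are copied, and the copies come from many surviving nodes at once, recovery time scales with the failed node’s data rather than with a full RAID stripe rebuild.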

Inline efficiencies

Inline efficiencies are another area where NetApp’s architecture limits the cluster’s capabilities. The traditional design of NetApp’s architecture and its ONTAP operating system make it very difficult for NetApp to provide inline deduplication or compression without a severe performance penalty. NetApp’s past approach has been to provide these services after the data is written, or “post-process.” However, with All-Flash FAS, NetApp now offers partial inline deduplication and compression. Inline deduplication consists of a simple inline zero-detection capability that identifies blocks of zeros in the incoming data and makes those blocks available for re-write. The actual byte-to-byte comparisons are still done post-process.

Figure 2: SolidFire mixed-node scale-out

At any point during or after deployment, nodes can be added, removed, or replaced to increase capacity and/or performance without impacting existing workloads. As nodes are added, their capacity and IOPS are aggregated into the total provisionable capacity and performance available for assignment to any existing or new volume.


In reality, NetApp is still not offering true inline deduplication. NetApp does enable inline compression if the customer wants to take the performance hit, which can be significant. In an attempt to offer inline compression with less performance impact, NetApp offers a compression detection capability. This enables NetApp customers to set up inline compression on a volume, and the compression detection algorithm will quickly deduce whether or not a file is easily compressible. If the file is compressible with minimal performance impact, it is compressed. Otherwise, the file is flagged and compression occurs post-process. Customers must also remember that post-process functions carry a performance overhead and must be scheduled off-hours so they have minimal impact on user workloads.

SolidFire’s inline deduplication and compression are always on, operate in real time, and enable immediate capacity efficiencies with virtually no performance impact. In addition, SolidFire’s deduplication is cluster-wide rather than volume-based, as is the case with NetApp deduplication. Compared to cluster-wide deduplication, volume-based deduplication is not nearly as efficient because it only deduplicates the data in a single volume, meaning duplicate data could still easily reside in other volumes within the cluster.
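The gap between the two scopes is easy to demonstrate with content hashing, a common deduplication technique; the sketch below is conceptual and does not represent SolidFire’s actual on-disk format:

```python
# Simplified sketch of cluster-wide vs. per-volume deduplication:
# identical blocks written to different volumes are stored once under a
# global content hash, whereas per-volume dedup keeps one copy per
# volume. Content-addressing via hashing is a common technique; this is
# not SolidFire's on-disk format.
import hashlib

def stored_blocks(writes, scope):
    """Count unique stored blocks for a sequence of (volume, data) writes.
    scope='cluster': dedupe by content hash globally.
    scope='volume' : dedupe by (volume, content hash) only."""
    seen = set()
    for volume, data in writes:
        digest = hashlib.sha256(data).hexdigest()
        key = digest if scope == "cluster" else (volume, digest)
        seen.add(key)
    return len(seen)

# The same OS-image block written to three volumes, plus one unique block:
writes = [("vol1", b"os-image-block"), ("vol2", b"os-image-block"),
          ("vol3", b"os-image-block"), ("vol1", b"app-data-block")]
print(stored_blocks(writes, "cluster"))  # prints 2: one shared + one unique
print(stored_blocks(writes, "volume"))   # prints 4: one copy per volume
```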

Snapshot data protection

Both NetApp and SolidFire utilize space-efficient snapshots for local replication. NetApp pioneered the use of redirect-on-write snapshots, which, like SolidFire’s snapshots, only consume storage when data changes. NetApp’s snapshots are not metadata based, however, which gives SolidFire a slight edge: SolidFire’s snapshots are inherently thin, deduplicated, compressed, and do not require a pre-allocated snapshot reserve.
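Redirect-on-write can be sketched in a few lines. This is a conceptual model only, not either vendor’s implementation: a snapshot freezes the current block map, and because every new write is redirected to a fresh location, the snapshot consumes extra space only for blocks overwritten after it was taken:

```python
# Conceptual model of redirect-on-write snapshots (neither vendor's
# actual implementation): a snapshot is a frozen copy of the block map;
# writes always go to new locations and update only the live map, so a
# snapshot holds extra space only for blocks changed after it was taken.
class Volume:
    def __init__(self):
        self.store = {}     # physical location -> data
        self.live = {}      # logical block -> physical location
        self.next_loc = 0
        self.snapshots = []

    def write(self, block, data):
        self.store[self.next_loc] = data  # redirect: always a new location
        self.live[block] = self.next_loc
        self.next_loc += 1

    def snapshot(self):
        self.snapshots.append(dict(self.live))  # freeze the block map

    def snapshot_unique_locations(self, i):
        """Locations held only by snapshot i, i.e. its extra space cost."""
        return set(self.snapshots[i].values()) - set(self.live.values())

vol = Volume()
vol.write(0, b"A")
vol.write(1, b"B")
vol.snapshot()
vol.write(0, b"A2")  # overwrite is redirected to a new location
print(len(vol.snapshot_unique_locations(0)))  # prints 1: one changed block
```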

The bottom line

While NetApp seems to have finally brought a scale-out, all-flash version of their traditional FAS filer to market, it is important to remember what lies under the covers. NetApp’s large footprint is not conducive to a scale-out environment, and its inability to virtualize capacity and performance separately hampers its ability to scale out linearly and predictably.

The lack of basic inline efficiencies, storage automation, and QoS also limits NetApp clusters to specific single-workload point solutions rather than next generation data center applications, including large-scale, multiple/mixed workload, and IT as a Service (ITaaS) deployments.

SolidFire’s scale-out architecture makes it ideally suited for large-scale, mixed-workload service provider and enterprise deployments. The ability to non-disruptively mix different nodes within clusters, scale out performance and capacity linearly at any time, and guarantee IOPS to every workload means deployments can start and grow as needed without worry of ever stranding capacity or performance.

Adding all of this up, SolidFire’s architecture means IT organizations can consolidate multiple applications and workloads onto an agile, scalable, predictable, automated infrastructure, thereby saving space, time, and resources, and making a positive impact on the bottom line.

Figure 3: SolidFire Inline Efficiencies

Don’t settle for bolt-on efficiencies. SolidFire was built with native, always-on inline data reduction.
