1
Surfing Technology Curves
Steve Kleiman, CTO
Network Appliance Inc.
2
Book Plug
The Innovator’s Dilemma: When New Technologies Cause Great Firms to Fail
Clayton M. Christensen
3
About NetApp
Two product lines:
  Network Attached File Servers (a.k.a. filers)
  Web proxy caches: NetCache
Founded in 1992
>$1B revenue run rate
>70% CAGR since founding
>120% last year
4
Filers: Fast, Simple, Reliable and Multi-protocol
System           Ops/FS  CPUs  Result  Overall Resp.  Resp. @ Max  Ops per SpecRate  RAID
Sun E 3500/4500  340     2     8,165   3.04           23.8         20.4              no
HP-9000 N4000    318     4     15,270  1.91           3.7          10.4              yes
NetApp 840       15,235  1     15,235  1.54           3.6          46                yes
5
Filers: Fast, Simple, Reliable and Multi-protocol
Disk management
  Filer finds disks and organizes them into RAID groups and spares automatically
  Simple addition of storage
  Automatic RAID reconstruction
Data management
  Snapshots
  SnapRestore
  SnapMirror
Simple upgrade
Small command set
6
Filers: Fast, Simple, Reliable and Multi-protocol
Built-in RAID
Easy hardware maintenance
  Hot plug disk, power, fans
  Low MTTR
Cluster Failover
Autosupport
>99.995% measured field availability
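To put the availability figure in perspective, it can be converted into a downtime budget. The sketch below is plain arithmetic on the 99.995% number from the slide; the function name and the 365-day year are assumptions made for illustration.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years (assumption)

def max_downtime_minutes(availability_pct):
    """Downtime per year that is still compatible with the given availability."""
    return (1 - availability_pct / 100) * MINUTES_PER_YEAR

downtime = max_downtime_minutes(99.995)
print(f"{downtime:.1f} minutes of downtime per year")
```

Under these assumptions, 99.995% availability leaves a budget of roughly 26 minutes of downtime per year.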
7
Filers: Fast, Simple, Reliable and Multi-protocol
NFS
CIFS
  CIFS and NFS attributes
HTTP
FTP
DAFS
Internet Cache
  FTP
  Streaming media
8
Wave 1: Networks, Appliances and Software
9
Network and Storage Bandwidth
Year  Storage     Network  Penalty
1992  10 MB       0.1 MB   100-to-1
1994  20 MB       1 MB     20-to-1
1996  40 MB       10 MB    4-to-1
1998  100 MB      100 MB   1-to-1
2001  200-400 MB  1000 MB  .2-to-1
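The Penalty column appears to be the ratio of storage bandwidth to network bandwidth. A minimal sketch that recomputes it from the slide's figures follows; the function name is made up for illustration, and for 2001 the low end of the 200-400 MB range is assumed because it matches the slide's .2-to-1.

```python
# Bandwidth figures copied from the slide (units as given there).
# 2001 uses the low end of the 200-400 MB storage range (assumption).
rows = [
    (1992, 10, 0.1),
    (1994, 20, 1),
    (1996, 40, 10),
    (1998, 100, 100),
    (2001, 200, 1000),
]

def penalty(storage_mb, network_mb):
    """Ratio of storage bandwidth to network bandwidth."""
    return storage_mb / network_mb

ratios = {year: penalty(storage, network) for year, storage, network in rows}
for year, ratio in ratios.items():
    print(f"{year}: {ratio:g}-to-1")
```

A ratio above 1 means the network is the bottleneck for remote storage; by 2001 the ratio has dropped below 1, so the network is no longer the slower side.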
10
The Appliance Revolution
1980s (General Purpose)
  UNIX: Application, File Service, Routing, ...
1990s (Appliance Based)
  Router/Switch
  Filer
  Windows/NT
  UNIX
  Printer
  ...
11
Appliance philosophy
Appliance philosophy breeds focus
  External simplicity, internal simplicity
  RISC argument
Don’t have to be all things to all people
  Limited compatibility constraints
Interfaces are bits on wire
  Think different!
Can innovate with both software and hardware
12
Filer Architecture
Commercial off-the-shelf chips
  Any appropriate architecture
    i486, Pentium, Alpha ‘064, Alpha ‘164, PIII
Board level integration
  1 or more CPUs (4)
  1 or more PCI busses (4)
  High bandwidth switches
  Multiple memory banks
  Integrated I/O
[Diagram labels: CPU, Mem, PCI, NVRAM]
13
Roads Not Taken
No “unobtainium”
  Minimalist infrastructure
  No special purpose busses
  No big MPs
    Motherboards only: no cache coherent backplanes
No functionally distributed computers
No special purpose networks (e.g. HIPPI)
No block access protocols
14
DataOnTap Architecture
[Diagram labels]
WAFL, RAID, Disk
FCAL, SCSI
ATM, GbE, FDDI, 100BT
TCP/IP
NFS, CIFS, HTTP
SK, Lib
Daemons, Shells, Commands
Java Virtual Machine
VI NIC*, VIPL, DAFS
* VI supported on FC (future: GbE, Infiniband)
15
DataOnTap
Simple Kernel
  Message passing
  Non-preemptive
Sample optimizations
  Checksum caching
  Suspend/Resume
  Cache hit pass through
16
WAFL: Write Anywhere File Layout
Log-like write throughput
  No segment cleaning (unlike LFS)
  Write data allocated to optimize RAID performance
  Delayed write allocation
Active data is never overwritten (shadow paging)
  On-disk data is always consistent
  File system state is changed atomically
    Every 10 sec, by default
Client modification requests are logged to NVRAM
  NVRAM log is replayed only on reboot
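The interplay of shadow paging, atomic state changes and the NVRAM log can be illustrated with a toy model. This is a sketch only, not NetApp code: the class `ToyWafl`, its method names and the flat offset-to-block map are all invented for illustration, and the periodic (10 second) trigger is replaced by an explicit call.

```python
class ToyWafl:
    """Toy model of shadow paging with an NVRAM request log (hypothetical)."""

    def __init__(self):
        self.disk = {}        # block number -> data; written blocks are never overwritten
        self.root = {}        # on-disk root: file offset -> block number
        self.nvram_log = []   # client requests since the last consistency point
        self.dirty = {}       # in-memory modifications not yet on disk
        self.next_block = 0

    def write(self, offset, data):
        self.nvram_log.append((offset, data))  # log to NVRAM before acknowledging
        self.dirty[offset] = data

    def consistency_point(self):
        new_root = dict(self.root)
        for offset, data in self.dirty.items():
            self.disk[self.next_block] = data  # always a fresh block
            new_root[offset] = self.next_block
            self.next_block += 1
        self.root = new_root                   # one atomic switch to the new state
        self.dirty.clear()
        self.nvram_log.clear()

    def read(self, offset):
        if offset in self.dirty:
            return self.dirty[offset]
        return self.disk[self.root[offset]]

    def crash_and_reboot(self):
        self.dirty.clear()                     # volatile memory is lost
        for offset, data in self.nvram_log:    # replay the surviving NVRAM log
            self.dirty[offset] = data


fs = ToyWafl()
fs.write(0, "v1")
fs.consistency_point()
old_block = fs.root[0]
fs.write(0, "v2")          # logged, but not yet on disk
fs.crash_and_reboot()      # on-disk state is still the consistent "v1" image
after_replay = fs.read(0)  # the replayed log restores "v2"
fs.consistency_point()
```

After the second consistency point the block that held "v1" is still intact on the toy disk, which is what makes the old image usable until the root switches.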
17
Wave 2: Memory-to-Memory Interconnects
(a.k.a. NUMA, NORMA)
18
Problem:
Remove single points of failure
Without doubling hardware
Minimizing performance overhead
Without decreasing reliability
19
Clustered Failover Architecture
[Diagram: Filer 1 and Filer 2, each with NVRAM; labels: Network, ServerNet, Fibre Channel (two links)]
20
Memory-to-Memory Interconnects
Efficient transfer model
  Allows minimal overhead on receiver
Scalable bandwidth
  High speed ASIC based switching
  Gigabit technology
Open architecture
  PCI, not coherent bus interface
  Incorporate multiple technologies
Relatively inexpensive
21
Mirroring NVRAM
NVRAM is split into local and partner regions
Data is assembled in NVRAM
Data is DMAed from NVRAM to equivalent offset in remote node
Client reply is sent when log entry DMA completes
[Diagram labels: CPU, NVRAM, ServerNet, PCI Bus; DMA to partner NVRAM; NVRAM data from partner]
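The mirroring steps on this slide can be sketched as follows. This is an illustrative model, not the actual implementation: the `Nvram` class, the half-and-half region layout and the function name are assumptions, and a byte copy stands in for the DMA over the interconnect.

```python
class Nvram:
    """Toy NVRAM: first half is the local region, second half holds the partner's log."""

    def __init__(self, size):
        if size % 2:
            raise ValueError("size must be even")
        self.half = size // 2
        self.mem = bytearray(size)


def log_and_mirror(local, partner, offset, entry):
    """Assemble a log entry locally, then copy it to the same offset in the
    partner region of the other node. Returns True once the copy is done,
    which is the point at which the client reply would be sent."""
    end = offset + len(entry)
    if end > local.half:
        raise ValueError("entry does not fit in the local region")
    local.mem[offset:end] = entry  # data is assembled in local NVRAM
    # Stand-in for the DMA transfer to the partner's NVRAM.
    partner.mem[partner.half + offset:partner.half + end] = local.mem[offset:end]
    return True


filer1, filer2 = Nvram(64), Nvram(64)
replied = log_and_mirror(filer1, filer2, 8, b"write#1")
mirrored = bytes(filer2.mem[filer2.half + 8:filer2.half + 8 + 7])
```

Because the entry lands at the equivalent offset on the partner, the surviving node can replay the failed node's log without any translation step.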
22
Leveraged Components
Memory-to-memory interconnects
  Low overhead, high-bandwidth, cheap
WAFL
  Always consistent file system
  Built-in NVRAM logging/replay
Fibre Channel disks
  Two independent ports
Single function appliance software
  Simple, low-overhead failover
23
Wave 3: The Internet
24
The Consequences of Higher-speed Internet Access
200K-400K home cable head-end
  Requires 1.5-3 Gbps access capability
  30% subscription rate, 20% online
  Minimum 128 Kbps BW
Enterprise
  Remote sites still connected by slow links
  Require high-quality access to content
  Overloaded web servers
ISP
  Require distribution and caching of large media files
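The 1.5-3 Gbps head-end figure follows from the other numbers on the slide. A quick check, assuming that "20% online" means 20% of subscribers and that 1 Kbps is 1,000 bits per second (the function and parameter names are illustrative):

```python
def head_end_bps(homes, subscription_rate=0.30, online_fraction=0.20,
                 per_user_bps=128_000):
    """Aggregate bandwidth needed at a cable head-end, in bits per second."""
    return homes * subscription_rate * online_fraction * per_user_bps

low_gbps = head_end_bps(200_000) / 1e9
high_gbps = head_end_bps(400_000) / 1e9
print(f"{low_gbps:.2f} to {high_gbps:.2f} Gbps")
```

With these assumptions the result is about 1.5 Gbps for 200K homes and about 3.1 Gbps for 400K homes, in line with the range quoted on the slide.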
25
Yet Another Appliance
Cisco
NetApp
26
NetCache
HTTP/FTP proxy cache appliance
  Highly deployable
  Forward and reverse proxy
Transparency
Filtering
iCAP
  Enables value added services
  Virus scanning, transcoding, ad insertion, …
Stream splitting
Stream caching
Content distribution
27
Cacheable Content
[Chart: Cacheable Content over Time; series: Static Content, Dynamic Content, Streaming Media]
28
Wave 4: The Death of Tapes
29
Using Tapes for Disaster Recovery
Year  Drive Capacity  # Drives  Capacity  Tapes Required  # Tape drives to restore in 8 hours
1999  36G             168       6TB       172             21
2000  72G             216       16TB      160             28
2001  144G            500 (a)   72TB      360             63
a: with SAN
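The table can be cross-checked with simple arithmetic. The sketch below recomputes the total capacity and derives two figures that are implied by the slide but not stated on it: the capacity per tape and the sustained rate each tape drive would need for an 8-hour restore. Decimal units (1 TB = 1,000 GB) and all names are assumptions.

```python
# Rows copied from the slide: (year, drive capacity in GB, number of disk
# drives, tapes required, tape drives needed to restore in 8 hours).
rows = [
    (1999, 36, 168, 172, 21),
    (2000, 72, 216, 160, 28),
    (2001, 144, 500, 360, 63),
]
RESTORE_SECONDS = 8 * 3600

def derive(drive_gb, drives, tapes, tape_drives):
    total_gb = drive_gb * drives
    return {
        "total_tb": total_gb / 1000,      # decimal units assumed
        "gb_per_tape": total_gb / tapes,  # implied cartridge capacity
        "mb_per_s_per_tape_drive": total_gb * 1000 / (tape_drives * RESTORE_SECONDS),
    }

derived = {year: derive(*rest) for year, *rest in rows}
for year, values in derived.items():
    print(year, {name: round(value, 1) for name, value in values.items()})
```

The totals come out at about 6, 15.6 and 72 TB, matching the Capacity column after rounding, and the implied per-drive restore rate rises from about 10 MB/s in 1999 to about 40 MB/s in 2001.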
30
SnapMirror
Remote asynchronous mirroring
  Continuous incremental update
  Only allocated blocks are transmitted
  Automatic resynchronization after disconnect
  Destination is always a consistent “snapshot” of source
[Diagram: Filer, WAN, Filer]
31
Creating a Snapshot
Disk Blocks
Before Snapshot
  Active FS: A B C D
After Snapshot
  Snapshot: A B C D
  Active FS: A B C D
After Block Update (new block C’)
  Snapshot: A B C D
  Active FS: A B C’ D
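The three stages in this diagram can be reproduced with a small copy-on-write model. It is a sketch under simplifying assumptions: a flat pointer table stands in for the file system tree, and the class and method names are invented for illustration.

```python
class ToyVolume:
    """Toy copy-on-write volume: snapshots share blocks with the active FS."""

    def __init__(self, contents):
        self.blocks = list(contents)              # disk blocks, never modified in place
        self.active = list(range(len(contents)))  # active FS: position -> block index
        self.snapshots = {}

    def snapshot(self, name):
        self.snapshots[name] = list(self.active)  # copies pointers, not data

    def update(self, position, data):
        self.blocks.append(data)                  # write goes to a new block
        self.active[position] = len(self.blocks) - 1

    def read(self, position, snapshot=None):
        table = self.active if snapshot is None else self.snapshots[snapshot]
        return self.blocks[table[position]]


vol = ToyVolume(["A", "B", "C", "D"])
vol.snapshot("s1")   # after snapshot: both views point at A B C D
vol.update(2, "C'")  # after block update: only the active FS sees C'
active_view = [vol.read(i) for i in range(4)]
snapshot_view = [vol.read(i, "s1") for i in range(4)]
```

Taking the snapshot costs no data copies; extra space is consumed only when the active file system later diverges from it.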
32
WAFL: Block Map File
Multiple bits per 4KB block
  Column for allocated block in the active file system
  Columns for allocated blocks in snapshots
Taking a Snapshot
  Copy root inode
[Diagram: block map with columns S1, S2, S3, FS and rows Block 1 through Block 8]
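A block map with one bit column per snapshot plus one for the active file system could look roughly like this. The bit layout, class and method names are assumptions for illustration, and the snapshot columns are filled eagerly here for simplicity, whereas the slide says that taking a snapshot only copies the root inode.

```python
ACTIVE_FS = 0  # bit 0 = active file system; bits 1..N = snapshots (assumed layout)


class BlockMap:
    """Toy block map: one integer of flag bits per 4KB block."""

    def __init__(self, nblocks):
        self.entries = [0] * nblocks

    def allocate(self, block):
        self.entries[block] |= 1 << ACTIVE_FS

    def free_in_active(self, block):
        self.entries[block] &= ~(1 << ACTIVE_FS)

    def take_snapshot(self, snap_id):
        # Every block in use by the active FS is now also held by the snapshot.
        for block, entry in enumerate(self.entries):
            if entry & (1 << ACTIVE_FS):
                self.entries[block] = entry | (1 << snap_id)

    def delete_snapshot(self, snap_id):
        for block in range(len(self.entries)):
            self.entries[block] &= ~(1 << snap_id)

    def is_free(self, block):
        return self.entries[block] == 0  # no column references the block


bmap = BlockMap(8)
bmap.allocate(2)
bmap.take_snapshot(1)
bmap.free_in_active(2)
held = not bmap.is_free(2)   # still referenced by snapshot 1
bmap.delete_snapshot(1)
reusable = bmap.is_free(2)   # no references left
```

A block becomes reusable only when every column is clear, which is why deleting a file does not free space while a snapshot still references its blocks.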
33
Consistent Image Propagation
[Diagram: the source passes through consistent images 1 to 6.
  Fast Network or Slow Modification Rate: the destination shows images 1, 4, 5, 6
  Slow Network or High Modification Rate: the destination shows images 1, 4]
34
Wave 5: Local File Sharing and Virtual Interface Architecture
35
ISPs: Scalable Services
Scalability
  Scale compute power and storage independently
Resiliency
Cost
  Commodity hardware and Open Systems standards
[Diagram labels: Internet or Intranet; Load Balancing Switch; Application Servers; Gigabit Switch; File Servers (three F760); Data Center]
36
Database
Better Manageability
  Offline backup with snapshots
  Replication
  Recovery from snapshots
  Easy storage management
Equal or better performance
  Less retuning
[Diagram label: F760]
37
Local File Sharing
Geographically constrained
  1 or 2 machine rooms
Mostly homogeneous clients
  Can be large or small
  1 - 100 machines
Single administrative control
High performance applications
  Web service, Cache
  Email, News
  Database, GIS
38
Local File Sharing Architecture Characteristics
Applications tend to avoid OS
  e.g. No virtual memory
Applications tend to have OS adaptation layer
Different access protocol requirements
  e.g. high-performance locking, recovery, streaming
39
What is VI?
Virtual Interface (VI) Architecture
  VI architecture organization
  Promoted by Intel, Compaq and Microsoft
  VI Developer’s Forum
Standard capabilities
  Send/receive message, remote DMA read/write
  Multiple channels with send/completion queues
  Data transfer bypasses kernel
  Memory pre-registration
40
VI Architecture
[Diagram: VI architecture layers]
User: Application, VIPL Library
Kernel: Kernel KVIPL client, KVIPL Module, VI compliant NIC driver
Hardware: VI compliant NIC
Paths: Data, Control
41
VI-compliant implementations
Fibre channel (FC-VI draft standard)
  e.g. Troika, Emulex
Giganet
Servernet II
Infiniband
  Enables 1U MP heads
Future: VI over TCP/IP
42
How VI Improves Data Transfer
No fragmentation, reassembly and realignment data copies
No user/kernel boundary crossing
No user/kernel data copies
  Data transfer direct to application buffers
43
Direct Access File System
[Diagram: Direct Access File System layers]
User: Application, Buffers, File Access API, DAFS, VIPL, VIPL* API
Kernel: VI NIC Driver
Hardware: NIC, Memory
Paths: Data, Control
* VI Provider Layer specification maintained by the VI Developers Forum
44
DAFS Benefits
File access protocol with implicit data sharing
Direct application access
  File data transfers directly to application buffers
  Bypasses Operating System
  File semantics
Optimized for high throughput and low latency
  Consistent high speed locking
  Graceful recovery/failover of clients and servers
  Fencing
  Enhanced data recovery
  Leverages VI for transport independence
45
DAFS vs. SAN
Protocols \ Wires  Direct (direct transfer to memory)  Network (TCP/IP)
Block              Local Attached, SAN                 SCSI over IP
File               DAFS                                NAS
46
Summary
Wave 1: Filers
  Technology: Fast networks, commodity servers
  Environment: Appliance-ization
Wave 2: Failover
  Technology: Memory-to-memory interconnects, dual ported FC disks
  Environment: 24x7 requirements
Wave 3: NetCache
  Technology: Internet, HTTP
  Environment: High BW requirements, POP deployability
47
Summary
Wave 4: SnapMirror
  Technology: Disk areal density, Fibre Channel, fast networks
  Environment: Cost of downtime for recovery
Wave 5: DAFS
  Technology: VI architecture
  Environment: Local file sharing