ce202 storage
![Page 1: Ce202 Storage](https://reader036.vdocument.in/reader036/viewer/2022062405/55814596d8b42a9b098b52c4/html5/thumbnails/1.jpg)
Storage Architecture
CE202
December 2, 2003
David Pease
![Page 2: Ce202 Storage](https://reader036.vdocument.in/reader036/viewer/2022062405/55814596d8b42a9b098b52c4/html5/thumbnails/2.jpg)
Hierarchy of Storage
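[Figure: storage hierarchy, from top to bottom: Cache, RAM, Disk, Optical, Tape. Toward the top, storage is faster, smaller in capacity, and higher in cost; toward the bottom, it is slower, larger in capacity, and lower in cost.]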
![Page 3: Ce202 Storage](https://reader036.vdocument.in/reader036/viewer/2022062405/55814596d8b42a9b098b52c4/html5/thumbnails/3.jpg)
Storage System Components
• Application
• I/O Library
• File System
• Device Driver
• Host Bus Adapter
• Interconnect
• Storage Controller
• Devices
[Diagram label: I/O Context]
![Page 4: Ce202 Storage](https://reader036.vdocument.in/reader036/viewer/2022062405/55814596d8b42a9b098b52c4/html5/thumbnails/4.jpg)
Disks
![Page 5: Ce202 Storage](https://reader036.vdocument.in/reader036/viewer/2022062405/55814596d8b42a9b098b52c4/html5/thumbnails/5.jpg)
Disk Drives
• “Workhorse” of modern storage systems
• Capacity increasing, raw price dropping
  – can buy 1TB for only $1000!
  – bandwidth not keeping pace
  – reliability is actually decreasing
    • massive systems can mean even lower availability
• Majority of cost of ownership in administration, not purchase price
  – backup, configuration, failure recovery
![Page 6: Ce202 Storage](https://reader036.vdocument.in/reader036/viewer/2022062405/55814596d8b42a9b098b52c4/html5/thumbnails/6.jpg)
Disk Architecture
[Figure: disk drive structure, labeling the platters, spindle, tracks, sectors, cylinders, arms with read/write heads, and direction of rotation.]
![Page 7: Ce202 Storage](https://reader036.vdocument.in/reader036/viewer/2022062405/55814596d8b42a9b098b52c4/html5/thumbnails/7.jpg)
Disk Storage Density
![Page 8: Ce202 Storage](https://reader036.vdocument.in/reader036/viewer/2022062405/55814596d8b42a9b098b52c4/html5/thumbnails/8.jpg)
Disk Capacity Growth
![Page 9: Ce202 Storage](https://reader036.vdocument.in/reader036/viewer/2022062405/55814596d8b42a9b098b52c4/html5/thumbnails/9.jpg)
IBM Disk Storage Roadmap
![Page 10: Ce202 Storage](https://reader036.vdocument.in/reader036/viewer/2022062405/55814596d8b42a9b098b52c4/html5/thumbnails/10.jpg)
Storage Costs
![Page 11: Ce202 Storage](https://reader036.vdocument.in/reader036/viewer/2022062405/55814596d8b42a9b098b52c4/html5/thumbnails/11.jpg)
RAID
• Redundant Arrays of Inexpensive Disks
• Two orthogonal concepts:
  – data striping for performance
  – redundancy for reliability
• Striped arrays can increase performance, but at the cost of reliability (next page)
  – redundancy can give arrays better reliability than an individual disk
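Data striping can be sketched as a mapping from a logical block number to a (disk, block-on-disk) pair. The round-robin layout and the names `locate_block` and `stripe_unit_blocks` are illustrative assumptions; the slides do not specify a layout.

```python
def locate_block(logical_block: int, num_disks: int,
                 stripe_unit_blocks: int = 1) -> tuple[int, int]:
    """Map a logical block to (disk index, block offset on that disk).

    Assumes a simple round-robin layout: consecutive stripe units go to
    consecutive disks, wrapping around after the last disk.
    """
    if logical_block < 0 or num_disks < 1 or stripe_unit_blocks < 1:
        raise ValueError("arguments must be non-negative / positive")
    stripe_unit = logical_block // stripe_unit_blocks  # which stripe unit overall
    within_unit = logical_block % stripe_unit_blocks   # position inside that unit
    disk = stripe_unit % num_disks                     # round-robin across disks
    row = stripe_unit // num_disks                     # stripe (row) number
    return disk, row * stripe_unit_blocks + within_unit


# Four disks, one block per stripe unit: blocks 0..3 land on disks 0..3,
# so a large sequential read keeps all four disks busy at once.
placements = [locate_block(b, num_disks=4) for b in range(8)]
```

With this layout a large sequential transfer is spread over every disk, which is where the performance gain comes from; nothing in the mapping adds redundancy.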
![Page 12: Ce202 Storage](https://reader036.vdocument.in/reader036/viewer/2022062405/55814596d8b42a9b098b52c4/html5/thumbnails/12.jpg)
Reliability of Striped Array
[Figure: system reliability (0 to 1) plotted against number of disks (1 to 16).]
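Why reliability drops as disks are added can be sketched with a simple model: a non-redundant striped array survives only if every disk survives. The sketch assumes independent failures; the per-disk reliability of 0.95 is an illustrative value, not one taken from the slide.

```python
def striped_array_reliability(disk_reliability: float, num_disks: int) -> float:
    """Probability that a non-redundant striped array loses no data.

    Assumes disks fail independently, so all `num_disks` must survive.
    """
    if not 0.0 <= disk_reliability <= 1.0:
        raise ValueError("disk_reliability must be in [0, 1]")
    if num_disks < 1:
        raise ValueError("num_disks must be at least 1")
    return disk_reliability ** num_disks


# Illustrative per-disk reliability (an assumption, not from the slide).
for n in (1, 2, 4, 8, 16):
    print(n, round(striped_array_reliability(0.95, n), 3))
```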
![Page 13: Ce202 Storage](https://reader036.vdocument.in/reader036/viewer/2022062405/55814596d8b42a9b098b52c4/html5/thumbnails/13.jpg)
One-month Trace of Hardware Failures
Trace collected from the Internet Archive (March 2003) (thanks Kelly Gottlib)
– Over 100 terabytes of compressed data
– 30 disk failures out of total 70 hardware problems
[Figure: pie chart of hardware problems: disk failure 42%, others 26%, disk subsystem 10%, disk error 10%, power supply 6%, FS error 6%.]
![Page 14: Ce202 Storage](https://reader036.vdocument.in/reader036/viewer/2022062405/55814596d8b42a9b098b52c4/html5/thumbnails/14.jpg)
RAID Levels
| Level | Description | Additional Disks | Failures Tolerated |
|---|---|---|---|
| 0 | Non-redundant striping | 0 | 0 |
| 1 | Mirrored | n | 1 |
| 2 | Memory-style ECC | 1+lg n | 1 |
| 3 | Bit-Interleaved Parity | 1 | 1 |
| 4 | Block-Interleaved Parity | 1 | 1 |
| 5 | Block-Interleaved, Distributed Parity | 1 | 1 |
| 6 | P+Q Redundancy | 2 | 2 |
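For the single-parity levels (3, 4, and 5), tolerating one failure with one extra disk follows from XOR: the parity block is the XOR of the data blocks, so any one missing block equals the XOR of the survivors. A minimal sketch, assuming equal-sized blocks; `xor_blocks` is a hypothetical helper name.

```python
from functools import reduce


def xor_blocks(blocks: list[bytes]) -> bytes:
    """Byte-wise XOR of equal-length blocks."""
    if not blocks:
        raise ValueError("need at least one block")
    size = len(blocks[0])
    if any(len(b) != size for b in blocks):
        raise ValueError("all blocks must have the same length")
    # zip(*blocks) yields one tuple of byte values per byte position.
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


data = [b"\x01\x02", b"\x10\x20", b"\xff\x00"]  # three data disks
parity = xor_blocks(data)                        # one parity disk

# Simulate losing the second data disk and rebuilding it from the rest.
rebuilt = xor_blocks([data[0], data[2], parity])
```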
![Page 15: Ce202 Storage](https://reader036.vdocument.in/reader036/viewer/2022062405/55814596d8b42a9b098b52c4/html5/thumbnails/15.jpg)
RAID Levels
[Figure: diagrams of RAID levels 0 through 6.]
![Page 16: Ce202 Storage](https://reader036.vdocument.in/reader036/viewer/2022062405/55814596d8b42a9b098b52c4/html5/thumbnails/16.jpg)
RAID: 4x Small Write Penalty
[Figure: a small data write to a parity-protected array, drawn as numbered steps 1 through 5 with an xor operation.]
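The four I/Os behind the penalty can be sketched as a read-modify-write of parity: new parity = old parity XOR old data XOR new data. The disks are modeled as in-memory dicts, and `small_write` and `xor_bytes` are hypothetical names used only for illustration.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length blocks."""
    if len(a) != len(b):
        raise ValueError("blocks must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


def small_write(data_disk: dict, parity_disk: dict,
                block: int, new_data: bytes) -> int:
    """Overwrite one data block and update its parity; return the I/O count."""
    old_data = data_disk[block]        # I/O 1: read old data
    old_parity = parity_disk[block]    # I/O 2: read old parity
    # Remove the old data's contribution to parity, then add the new data's.
    new_parity = xor_bytes(xor_bytes(old_parity, old_data), new_data)
    data_disk[block] = new_data        # I/O 3: write new data
    parity_disk[block] = new_parity    # I/O 4: write new parity
    return 4


d0 = {0: b"\x0f"}                          # data disk being updated
d1 = {0: b"\xf0"}                          # another data disk, left untouched
parity = {0: xor_bytes(d0[0], d1[0])}      # parity over both data disks
ios = small_write(d0, parity, 0, b"\xaa")
```

The other data disks are never read: the old data and old parity are enough to compute the new parity, which is why the cost is four I/Os regardless of array width.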
![Page 17: Ce202 Storage](https://reader036.vdocument.in/reader036/viewer/2022062405/55814596d8b42a9b098b52c4/html5/thumbnails/17.jpg)
Log-Structured File Systems
• Based on assumption that disk traffic will become dominated by writes
• Always writes disk data sequentially, into next available location on disk
  – no seeks on write
• Eliminates problem of 4x write penalty
  – all writes are “new”, no need to read old data or parity
• However, almost no examples in industry file systems
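The "always write to the next available location" idea can be sketched as an append-only log plus an index of where the latest version of each block lives. This is a toy in-memory model with hypothetical names, not the design from the Rosenblum & Ousterhout paper; in particular it omits the cleaning of superseded blocks.

```python
class LogStructuredStore:
    """Toy append-only store: every write goes to the end of the log."""

    def __init__(self) -> None:
        self.log: list[bytes] = []        # stands in for the sequential disk log
        self.index: dict[int, int] = {}   # block id -> log position of latest version

    def write(self, block_id: int, data: bytes) -> int:
        """Append the block and return the log position it was written to."""
        self.log.append(data)             # never overwrites in place
        position = len(self.log) - 1
        self.index[block_id] = position   # older versions become garbage
        return position

    def read(self, block_id: int) -> bytes:
        """Return the most recently written version of the block."""
        return self.log[self.index[block_id]]


store = LogStructuredStore()
positions = [store.write(7, b"v1"), store.write(3, b"x"), store.write(7, b"v2")]
```

Rewriting block 7 does not touch its old copy; the log simply grows, so no old data or parity has to be read first.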
![Page 18: Ce202 Storage](https://reader036.vdocument.in/reader036/viewer/2022062405/55814596d8b42a9b098b52c4/html5/thumbnails/18.jpg)
Tape
![Page 19: Ce202 Storage](https://reader036.vdocument.in/reader036/viewer/2022062405/55814596d8b42a9b098b52c4/html5/thumbnails/19.jpg)
Tape Media
• Inherently sequential
  – long time to first byte
  – no random I/O
• Subject to mechanical stress
  – number of read-write cycles lower than disk
• Problems as an archival medium:
  – readers go away after some years
    • most rapidly in recent years
  – tapes (with data) remain in a salt mine
![Page 20: Ce202 Storage](https://reader036.vdocument.in/reader036/viewer/2022062405/55814596d8b42a9b098b52c4/html5/thumbnails/20.jpg)
Tape Media
• Density will always trail that of disk
  – Tape stretches, more difficult to get higher density
• Alignment also an issue
  – once it’s past the head, it’s gone
  – more conservative techniques required
• Bottom line: mechanical engineering issues for tape are the difficult ones
![Page 21: Ce202 Storage](https://reader036.vdocument.in/reader036/viewer/2022062405/55814596d8b42a9b098b52c4/html5/thumbnails/21.jpg)
Optical
• CD, CD-R/RW, DVD, DVD-R/RW
  – Capacities:
    • CD: ~700MB (huge 20 years ago!)
    • DVD:
      – single sided, single layer: 5GB
      – single sided, double layer: 9GB
      – double sided, single layer: 10GB
      – double sided, double layer: 18GB
• Size of cell limited by wavelength of light
  – current lasers are red
  – blue lasers are under development, then UV, ...
![Page 22: Ce202 Storage](https://reader036.vdocument.in/reader036/viewer/2022062405/55814596d8b42a9b098b52c4/html5/thumbnails/22.jpg)
Optical
• Magneto-optical (HAMR)
  – heat from laser makes changing direction of magnetization easier (so cell is smaller)
![Page 23: Ce202 Storage](https://reader036.vdocument.in/reader036/viewer/2022062405/55814596d8b42a9b098b52c4/html5/thumbnails/23.jpg)
MEMS
• MicroElectroMechanical Systems
  – 6-10 times faster than disk
  – cost and capacity issues
![Page 24: Ce202 Storage](https://reader036.vdocument.in/reader036/viewer/2022062405/55814596d8b42a9b098b52c4/html5/thumbnails/24.jpg)
Magnetic RAM (MRAM)
• Stores each bit in a magnetic cell rather than a capacitor or flip-flop
  – data is persistent
• Can be read and written very quickly
  – Read and write times 0.5 – 10 µs or less
  – Individual bits are writeable (no block erase)
• Density & cost comparable to DRAM
  – may require density/speed tradeoffs
  – denser MRAM may have to run slower because of heat dissipation on writes
![Page 25: Ce202 Storage](https://reader036.vdocument.in/reader036/viewer/2022062405/55814596d8b42a9b098b52c4/html5/thumbnails/25.jpg)
Magnetic RAM (MRAM)
• Several companies have announced partnerships to produce products ~2003
• Ideas for use of MRAM in storage:
  – Persistent cache
    • Hot data in MRAM, cold data to disk
    • No need to flush write cache to avoid data loss
  – HeRMES
    • all metadata in MRAM
    • enough file data in MRAM to hide disk latency for first access to a file
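The persistent-cache idea can be sketched as a two-tier store in which writes land in MRAM and only cold blocks migrate to disk. The LRU eviction policy and the class name `PersistentCache` are assumptions made for illustration; the slide only says hot data stays in MRAM and cold data goes to disk.

```python
from collections import OrderedDict


class PersistentCache:
    """Toy two-tier store: a small persistent tier in front of a disk."""

    def __init__(self, capacity: int) -> None:
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self.mram = OrderedDict()  # hot tier, assumed persistent; oldest entry first
        self.disk = {}             # cold tier

    def write(self, block_id: int, data: bytes) -> None:
        # The write is durable once it is in MRAM, so there is no flush step.
        self.mram[block_id] = data
        self.mram.move_to_end(block_id)
        self._evict_cold()

    def read(self, block_id: int) -> bytes:
        if block_id in self.mram:
            self.mram.move_to_end(block_id)  # mark as recently used
            return self.mram[block_id]
        data = self.disk[block_id]           # miss: fetch from disk
        self.mram[block_id] = data           # promote to the hot tier
        self._evict_cold()
        return data

    def _evict_cold(self) -> None:
        # Move least recently used blocks to disk until MRAM fits (assumed LRU).
        while len(self.mram) > self.capacity:
            cold_id, cold_data = self.mram.popitem(last=False)
            self.disk[cold_id] = cold_data


cache = PersistentCache(capacity=2)
for block_id in (1, 2, 3):
    cache.write(block_id, bytes([block_id]))
first = cache.read(1)
```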
![Page 26: Ce202 Storage](https://reader036.vdocument.in/reader036/viewer/2022062405/55814596d8b42a9b098b52c4/html5/thumbnails/26.jpg)
Peripheral Buses
• SCSI
• IDE/ATA
• HIPPI (High Performance Parallel Interface)
• IEEE 1394 (FireWire)
• FibreChannel (FCP)
• IP (e.g., iSCSI)
• InfiniBand
• Serial ATA
![Page 27: Ce202 Storage](https://reader036.vdocument.in/reader036/viewer/2022062405/55814596d8b42a9b098b52c4/html5/thumbnails/27.jpg)
Peripheral Buses
• Parallel
  – SCSI, most printers, IBM Channels
  – 1 or more bytes per clock
  – Skew problems at high speeds
• Serial
  – FC, RS232, IEEE1394 (FireWire)
  – 1 bit per clock, self clocking
  – can be run at much higher speeds than parallel bus
![Page 28: Ce202 Storage](https://reader036.vdocument.in/reader036/viewer/2022062405/55814596d8b42a9b098b52c4/html5/thumbnails/28.jpg)
Networked Storage
• Storage attached by general-purpose or dedicated network (e.g., FibreChannel)
• Motivations:
  – homogeneous and heterogeneous file sharing
  – centralized administration
  – better resource utilization (shared storage resources, pooling)
• Dedicated Networks:
  – Fibre-Channel: FCP (SCSI over FC)
  – iSCSI: SCSI over IP
  – InfiniBand
![Page 29: Ce202 Storage](https://reader036.vdocument.in/reader036/viewer/2022062405/55814596d8b42a9b098b52c4/html5/thumbnails/29.jpg)
Networked Storage
• Can mean many things:
  – NAS (Network-Attached Storage): file server appliances serving NFS and/or CIFS (for example, Network Appliance)
  – NASD (Network-Attached Secure Disk): intelligent, network-attached drives w/ security features (also, Network-Attached Storage Device)
  – SAN (Storage Area Network): network for attaching disks and computers, usually dedicated only to storage operations
• OBSD (Object-Based Storage Device): similar to NASD
![Page 30: Ce202 Storage](https://reader036.vdocument.in/reader036/viewer/2022062405/55814596d8b42a9b098b52c4/html5/thumbnails/30.jpg)
A SAN File System
[Figure: clients running Win2K, AIX, Solaris, and Linux, each with an IFS w/cache, attach to a SAN. Meta-data flows between the clients and a group of meta-data servers over a control network (IP), while data moves over the SAN. Also shown: a storage management server for HSM and backup, access via NFS, CIFS, FTP, and HTTP, and security assists.]
![Page 31: Ce202 Storage](https://reader036.vdocument.in/reader036/viewer/2022062405/55814596d8b42a9b098b52c4/html5/thumbnails/31.jpg)
Additional Reading
• Hennessy & Patterson: Chapter 6
• Chen, Lee, Gibson, Katz, & Patterson: RAID: high performance, reliable secondary storage. ACM Computing Surveys 26, June 1994, 145-185
• Rosenblum & Ousterhout: The design and implementation of a log-structured file system. ACM Transactions on Computer Systems, Feb. 1992, 26-52
• Gibson, Nagle, et al.: A cost-effective, high-bandwidth storage architecture. Proceedings of the Eighth Conference on Architectural Support for Programming Languages and Operating Systems, 1998
• http://www.almaden.ibm.com/cs/storagesystems/stortank/