The TickerTAIP Parallel RAID Architecture


Page 1: The TickerTAIP Parallel RAID Architecture

The TickerTAIP Parallel RAID Architecture

P. Cao, S. B. Lim, S. Venkatraman, J. Wilkes

HP Labs

Page 2: The TickerTAIP Parallel RAID Architecture

RAID Architectures

• Traditional RAID architectures have:
  – A central RAID controller interfacing to the host and processing all I/O requests
  – Disk drives organized in strings
  – One disk controller per disk string (mostly SCSI)

Page 3: The TickerTAIP Parallel RAID Architecture

Limitations

• Capabilities of the RAID controller are crucial to the performance of the RAID:
  – Can become memory-bound
  – Presents a single point of failure
  – Can become a bottleneck

• Having a spare controller is an expensive proposition

Page 4: The TickerTAIP Parallel RAID Architecture

Our Solution

• Have a cooperating set of array controller nodes

• Major benefits are:
  – Fault-tolerance
  – Scalability
  – Smooth incremental growth
  – Flexibility: can mix and match components

Page 5: The TickerTAIP Parallel RAID Architecture

TickerTAIP

[Figure: TickerTAIP array – hosts connected through host interconnects to the controller nodes]

Page 6: The TickerTAIP Parallel RAID Architecture

TickerTAIP (I)

A TickerTAIP array consists of:
• Worker nodes connected with one or more local disks through a bus
• Originator nodes interfacing with host computer clients
• A high-performance small-area network:
  – Mesh-based switching network (Datamesh)
  – PCI backplanes for small networks

Page 7: The TickerTAIP Parallel RAID Architecture

TickerTAIP (II)

• Can combine or separate worker and originator nodes

• Parity calculations are done in a decentralized fashion:
  – The bottleneck is memory bandwidth, not CPU speed
  – Cheaper than having faster paths to a dedicated parity engine

Page 8: The TickerTAIP Parallel RAID Architecture

Design Issues (I)

• Normal-mode reads are trivial to implement
• Normal-mode writes:
  – Three ways to calculate the new parity (see the sketch below):
    • Full stripe: calculate parity from the new data alone
    • Small stripe: requires at least four I/Os (read old data and old parity, write new data and new parity)
    • Large stripe: if we rewrite more than half a stripe, we compute the parity by reading the unmodified data blocks
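A minimal Python sketch of the three parity-update methods (illustrative only; the function and variable names are mine, not the paper's code). Byte-wise XOR stands in for the array's parity computation:

def xor_blocks(*blocks):
    # XOR together any number of equal-sized blocks.
    result = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            result[i] ^= byte
    return bytes(result)

def full_stripe_parity(new_data_blocks):
    # Full stripe: every data block is rewritten, so the new parity is
    # just the XOR of the new data; no reads are needed.
    return xor_blocks(*new_data_blocks)

def small_stripe_parity(new_block, old_block, old_parity):
    # Small stripe (read-modify-write): read the old data and old parity,
    # XOR out the old data and XOR in the new; with the two writes this
    # costs at least four I/Os.
    return xor_blocks(old_parity, old_block, new_block)

def large_stripe_parity(new_data_blocks, unmodified_blocks):
    # Large stripe (reconstruct-write): more than half the stripe changes,
    # so read the unmodified blocks and recompute parity over the whole stripe.
    return xor_blocks(*new_data_blocks, *unmodified_blocks)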

Page 9: The TickerTAIP Parallel RAID Architecture

Design Issues (II)

• Parity can be calculated:
  – At originator: at the originator node
  – Solely parity: at the parity node for the stripe
    • Must ship all involved blocks to the parity node
  – At parity: same as solely parity, but partial results for small-stripe writes are computed at the worker nodes and shipped to the parity node (sketch below)
    • Occasions less traffic than solely parity
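A sketch (again with invented names, reusing the xor_blocks helper idea from the previous page) of why at-parity ships less data than solely-parity for a small-stripe write: each worker pre-XORs its old and new block locally and ships one partial result instead of two blocks:

def xor_blocks(*blocks):
    # Same helper as in the previous sketch.
    result = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            result[i] ^= byte
    return bytes(result)

def worker_partial_result(old_block, new_block):
    # At-parity: each worker XORs its old and new data locally and ships
    # only this one partial-result block per worker.
    return xor_blocks(old_block, new_block)

def solely_parity_new_parity(old_parity, shipped_pairs):
    # Solely parity: every writing worker ships BOTH its old and its new
    # block to the parity node (two blocks per worker on the links).
    blocks = [old_parity]
    for old_block, new_block in shipped_pairs:
        blocks += [old_block, new_block]
    return xor_blocks(*blocks)

def at_parity_new_parity(old_parity, partial_results):
    # At-parity: only the partial results travel, yet the result is the same.
    return xor_blocks(old_parity, *partial_results)

# Both schemes produce the same new parity for the stripe:
old_a, new_a = b"\x01" * 4, b"\x0f" * 4
old_b, new_b = b"\x02" * 4, b"\xf0" * 4
old_p = xor_blocks(old_a, old_b)
assert solely_parity_new_parity(old_p, [(old_a, new_a), (old_b, new_b)]) == \
       at_parity_new_parity(old_p, [worker_partial_result(old_a, new_a),
                                    worker_partial_result(old_b, new_b)])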

Page 10: The TickerTAIP Parallel RAID Architecture

Handling single failures (I)

• TickerTAIP must provide request atomicity
• Disk failures are treated as in standard RAID
• Worker failures:
  – Treated like disk failures
  – Detected by time-outs (assuming fail-silent nodes)
  – A distributed consensus algorithm reaches consensus among the remaining nodes (see the sketch below)
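A toy sketch of time-out based failure detection under the fail-silent assumption. This is not the paper's algorithm: the timeout value is hypothetical and the distributed consensus round is only stubbed out.

import time

TIMEOUT_S = 2.0   # hypothetical timeout; not a value from the paper

class FailureDetector:
    def __init__(self, node_ids):
        now = time.monotonic()
        self.last_heard = {n: now for n in node_ids}

    def heard_from(self, node_id):
        # Record any message or heartbeat from a worker node.
        self.last_heard[node_id] = time.monotonic()

    def suspected_failures(self):
        # Fail-silent assumption: a node that has gone quiet for too long
        # is presumed dead and its disks are treated as failed.
        now = time.monotonic()
        return {n for n, t in self.last_heard.items() if now - t > TIMEOUT_S}

def agree_on_membership(my_suspects, peers_suspects):
    # Stand-in for the consensus round: here the survivors simply agree on
    # the nodes that every participant suspects.
    agreed = set(my_suspects)
    for suspects in peers_suspects:
        agreed &= suspects
    return agreed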

Page 11: The TickerTAIP Parallel RAID Architecture

Handling single failures (II)

• Originator failures:
  – Worst case is the failure of an originator/worker node during a write
  – TickerTAIP uses a two-phase commit protocol
  – Two options:
    • Late commit
    • Early commit

Page 12: The TickerTAIP Parallel RAID Architecture

Late commit/Early commit

• Late commit only commits after the parity has been computed
  – Only the writes must still be performed
• Early commit commits as soon as the new data and the old data have been replicated (see the sketch below)
  – Somewhat faster
  – Harder to implement
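An illustrative ordering of the two variants for a single write request; the function bodies are placeholders rather than the paper's implementation, and only the position of the commit point follows the slides.

def read_old_data_and_parity(req):   print(req, "read old data + old parity")
def replicate_old_and_new_data(req): print(req, "replicate old and new data")
def compute_new_parity(req):         print(req, "compute new parity")
def write_data_and_parity(req):      print(req, "write new data + new parity")
def commit(req):                     print(req, "COMMIT POINT")

def late_commit_write(req):
    # Late commit: commit only after the new parity is known, so after the
    # commit point only the disk writes remain to be done.
    read_old_data_and_parity(req)
    compute_new_parity(req)
    commit(req)
    write_data_and_parity(req)

def early_commit_write(req):
    # Early commit: commit as soon as the new and old data have been
    # replicated; somewhat faster, but harder to implement because parity
    # work may still be outstanding at the commit point.
    read_old_data_and_parity(req)
    replicate_old_and_new_data(req)
    commit(req)
    compute_new_parity(req)
    write_data_and_parity(req)

late_commit_write("req-42")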

Page 13: The TickerTAIP Parallel RAID Architecture

Handling multiple failures

• Power failures during writes can corrupt the stripe being written:
  – Use a UPS to eliminate them
• Must guarantee that some specific requests will always be executed in a given order:
  – Cannot write data blocks before updating the i-nodes containing their block addresses
  – Uses request sequencing to achieve partial write ordering

Page 14: The TickerTAIP Parallel RAID Architecture

Request sequencing (I)

• Each request:
  – Is given a unique identifier
  – Can specify one or more requests on whose previous completion it depends (explicit dependencies)
• TickerTAIP adds enough implicit dependencies to prevent concurrent execution of overlapping requests (see the sketch below)
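A small sketch of the sequencer's bookkeeping, assuming (hypothetically) that each request is described by a set of block numbers: explicit dependencies are supplied by the caller, and implicit ones are added for any overlapping request still in flight.

import itertools

class Sequencer:
    def __init__(self):
        self._ids = itertools.count(1)
        self._in_flight = []            # (request_id, block_range) pairs

    def submit(self, block_range, explicit_deps=()):
        request_id = next(self._ids)    # unique identifier for the request
        deps = set(explicit_deps)       # explicit dependencies from the caller
        # Implicit dependencies: any earlier, still-outstanding request whose
        # block range overlaps this one must complete first, so overlapping
        # requests are never executed concurrently.
        for other_id, other_range in self._in_flight:
            if block_range & other_range:
                deps.add(other_id)
        self._in_flight.append((request_id, block_range))
        return request_id, deps

    def complete(self, request_id):
        self._in_flight = [(i, r) for i, r in self._in_flight if i != request_id]

# Usage: a data-block write made to depend explicitly on the i-node update
# that records its address (the partial write ordering from the previous page).
seq = Sequencer()
inode_req, _ = seq.submit(block_range=set(range(0, 8)))
data_req, deps = seq.submit(block_range=set(range(100, 108)),
                            explicit_deps=[inode_req])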

Page 15: The TickerTAIP Parallel RAID Architecture

Request sequencing (II)

• Sequencing is performed by a centralized sequencer
  – Several distributed solutions were considered but not selected because of the complexity of the recovery protocols they would require

Page 16: The TickerTAIP Parallel RAID Architecture

Disk Scheduling

• Policies considered (sketches below):
  – First come first served (FCFS): implemented in the working prototype
  – Shortest seek time first (SSTF)
  – Shortest access time first (SATF): considers both seek time and rotation time
  – Batched nearest neighbor (BNN): runs SATF on all requests in the queue

Not discussed in class in Fall 2005
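Toy scheduler sketches for comparison. This is my own simplification: only seek distance is modeled, whereas real SATF/BNN also account for rotational position.

def fcfs(queue, head):
    # First come first served: service requests in arrival order.
    return list(queue)

def sstf(queue, head):
    # Shortest seek time first: always pick the request whose track is
    # closest to the current head position.
    order, remaining = [], list(queue)
    while remaining:
        nxt = min(remaining, key=lambda track: abs(track - head))
        remaining.remove(nxt)
        order.append(nxt)
        head = nxt
    return order

def bnn(queue, head):
    # Batched nearest neighbour: freeze the current queue as one batch and
    # schedule the whole batch greedily (seek distance standing in for full
    # access time); new arrivals wait for the next batch.
    return sstf(tuple(queue), head)

print(fcfs([95, 10, 60], head=50))   # [95, 10, 60]
print(bnn([95, 10, 60], head=50))    # [60, 95, 10]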

Page 17: The TickerTAIP Parallel RAID Architecture

Evaluation (I)

• Based upon:
  – A working prototype
    • Used seven relatively slow Parsytec cards, each with its own disk drive
  – An event-driven simulator used to test other configurations:
    • Results were always within 6% of prototype measurements

Page 18: The TickerTAIP Parallel RAID Architecture

Evaluation (II)

• Read performance:
  – 1 MB/s links are enough unless the request sizes exceed 1 MB

Page 19: The TickerTAIP Parallel RAID Architecture

Evaluation (III)

• Write performance:
  – The large-stripe policy always results in a slight improvement
  – At-parity is significantly better than at-originator, especially for link speeds below 10 MB/s
  – The late commit protocol reduces throughput by at most 2% but can increase response time by up to 20%
  – The early commit protocol is not much better

Page 20: The TickerTAIP Parallel RAID Architecture

Evaluation (IV)

• TickerTAIP always outperforms a comparable centralized RAID architecture

• Best disk scheduling policy is Batched Nearest Neighbor (BNN)

Page 21: The TickerTAIP Parallel RAID Architecture

Conclusion

• Can use physical redundancy to eliminate single points of failure

• Can use eleven 5-MIPS processors instead of a single 50-MIPS one

• Can use off-the-shelf processors for parity computations

• Disk drives remain the bottleneck for small request sizes