Hugo Caçote @ HEPiX Fall 2005
Testing High Performance Tape Drives
HEPiX Fall 2005, Data Services Section
Motivation
LHC, 2007: ~15 Petabytes/year
Current Model
[Diagram: CASTOR HSM, with RFIO DATA transfers and several parallel DATA streams]
Devices Tested
IBM 3584 Library (117 carts, 12 drives) + IBM 3592JA Tape Drive (40 MB/sec, 300 GB / 60 GB)
STK SL8500 (1,448 carts, 64 drives) + HP LTO-3 (80 MB/sec, 400 GB)
Future Generation Tape Drives >100 MB/s: Testing in 2006
Test Infrastructure
Tape Library
Tape Drive
Fiber
Fiber Channel HBA
Tape Server
Functionality Tests
• Go through the set of commands available in the SCSI standard
• Check returned information, timing, command acceptance
SCSI COMMANDS:
Change Definition, Compare, Copy, Copy and Verify, Display Message, Erase, Format Medium, Inquiry, Load/Unload, Locate, Log Select, Log Sense, Mode Select (6), Mode Select (10), Mode Sense (6), Mode Sense (10), Persistent Reserve In, Persistent Reserve Out, Prevent/Allow Medium Removal, Read, Read Attribute, Read Block Limits, Read Buffer, Read Position, Read Reverse, Receive Diagnostic Results, Recover Buffered Data, Release Unit (6), Release Unit (10), Report Density Support, Report LUNs, Request Sense, Reserve Unit (6), Reserve Unit (10), Rewind, Send Diagnostic, Set Capacity, Space, Test Unit Ready, Verify, Write, Write Attribute, Write Buffer, Write Filemarks
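A sweep like the one described above can be sketched as a loop that issues each command, then records the returned status and the elapsed time. This is only an illustration: `sweep`, `Result` and `fake_drive` are hypothetical names, the transport is a stand-in callable rather than a real SCSI pass-through, and all parameter bytes are left at zero.

```python
import time
from typing import Callable, Dict, List, NamedTuple

# Opcodes of a few six-byte commands from the list above.
OPCODES: Dict[str, int] = {
    "Test Unit Ready": 0x00,
    "Rewind": 0x01,
    "Request Sense": 0x03,
    "Read Block Limits": 0x05,
    "Inquiry": 0x12,
    "Mode Sense (6)": 0x1A,
}

GOOD = 0x00             # SCSI status: command completed
CHECK_CONDITION = 0x02  # SCSI status: sense data is available


class Result(NamedTuple):
    name: str
    status: int
    seconds: float
    accepted: bool


def sweep(send: Callable[[bytes], int], names: List[str]) -> List[Result]:
    """Issue each named command once; record status, timing and acceptance."""
    results = []
    for name in names:
        # Bare six-byte command block: opcode followed by zeroed parameters.
        cdb = bytes([OPCODES[name], 0, 0, 0, 0, 0])
        start = time.perf_counter()
        status = send(cdb)
        elapsed = time.perf_counter() - start
        results.append(Result(name, status, elapsed, status == GOOD))
    return results


def fake_drive(cdb: bytes) -> int:
    """Hypothetical stand-in for a drive: rejects Rewind, accepts the rest."""
    return CHECK_CONDITION if cdb[0] == 0x01 else GOOD


if __name__ == "__main__":
    for r in sweep(fake_drive, list(OPCODES)):
        print(f"{r.name:20s} status=0x{r.status:02X} accepted={r.accepted}")
```

In a real test the `send` callable would wrap whatever pass-through mechanism the tape server offers; the loop itself stays the same.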
Functionality Tests
Fibre Channel analyzer for verifying SCSI commands
Test Scenarios
Linux tape driver and cernTapeTestUtil (interactive/command line mode)
Analysis of Results
Mechanical Tests
IBM 3592:
• Over 125,000 mount/dismount cycles performed, no errors
• Test mechanical reliability of drive / media: some cartridges now mounted > 4000 times, no errors
• Random file reads on selected tapes and media: superseded by CASTOR operation in data challenges
HP LTO-3:
• Over 125,000 mount/dismount cycles performed, no errors
• Test mechanical reliability of drive / media: some cartridges now mounted > 5000 times, no errors
Performance Tests
Use of native Linux commands (mt/dd) for data transfers:
• read / write
• compression / no compression
• blocksize
• filesize
• position “labeled” files
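A block-size scan of this kind boils down to timing a fixed amount of data written in blocks of a given size. The following is a minimal sketch, assuming a plain file path as the target so that it can run anywhere; `measure_write_rate` is a hypothetical helper and is not part of mt, dd or cernTapeTestUtil. Random data is used so that drive compression, if enabled, would have little effect.

```python
import os
import time


def measure_write_rate(path: str, block_size: int, total_bytes: int) -> float:
    """Write total_bytes to path in block_size chunks; return MB/s (1 MB = 10**6 bytes)."""
    block = os.urandom(block_size)       # incompressible payload
    blocks = total_bytes // block_size
    written = 0
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        start = time.perf_counter()
        for _ in range(blocks):
            written += os.write(fd, block)
        try:
            os.fsync(fd)                 # include the flush in the timing
        except OSError:
            pass                         # some device nodes may not support fsync
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
    return written / max(elapsed, 1e-9) / 1e6


if __name__ == "__main__":
    # Scan a few block sizes (in KiB) against a scratch file.
    for kib in (32, 64, 128, 256):
        rate = measure_write_rate("scratch.bin", kib * 1024, 64 * 1024 * 1024)
        print(f"{kib:4d} KiB blocks: {rate:8.1f} MB/s")
    os.remove("scratch.bin")
```

Against a regular file the numbers mostly reflect the page cache and disk; the point is only the shape of the measurement loop.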
Performance
[Chart: LTO-3 Data Transfer Rate, write, no compression. MBytes/s (0 to 80) vs. block size (K), tape L30099]
Performance
[Chart: LTO-3 Data Transfer Rate, read, no compression. MBytes/s (0 to 80) vs. block size (K), tape L30100]
Performance
[Chart: LTO-3 Locate File Timing, skipping 40 MB files. Skip and rewind times in seconds (0 to 180) vs. groups of 33 labelled files]
[Chart: LTO-3 Locate Record Timing, skipping 512 MB steps. Skip and rewind times in seconds (0 to 140) vs. number of skips]
Algorithm problem(?) seen on older LTO-1 drives, just HP(?)
ANSI Labels
[Diagram: layout of one labelled file on tape: Header 1, Header 2, Header 3 (80 Bytes each), Tapemark, Data (0-? Bytes), Tapemark, Trailer 1, Trailer 2, Trailer 3 (80 Bytes each), Tapemark, Sync]
Headers: Filename, block size, HSM version, time of writing …
Trailers: Number of blocks, non standard data …
Tapemark: Special records on tape used by the drive, immediate bit = 0/1
Sync: Flush buffer
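The tapemark and sync settings described here plausibly map onto the SCSI WRITE FILEMARKS (6) command, which carries an immediate bit and a filemark count; a count of zero with the immediate bit cleared asks the drive to flush its buffer. The sketch below only builds the six-byte command block. `write_filemarks_cdb` is a hypothetical helper, and the mapping to the Imm/Tpmrk/Sync settings used in the tests is an assumption, not something the slides state.

```python
LABEL_BYTES = 80          # size of each header/trailer label record
LABELS_PER_FILE = 3 + 3   # three headers plus three trailers


def write_filemarks_cdb(count: int, immed: bool) -> bytes:
    """Build a WRITE FILEMARKS (6) command block.

    Byte 0: opcode 0x10
    Byte 1: bit 0 is the immediate bit
    Bytes 2-4: filemark count, most significant byte first
    Byte 5: control byte
    """
    if not 0 <= count <= 0xFFFFFF:
        raise ValueError("filemark count must fit in 24 bits")
    return bytes([0x10, 0x01 if immed else 0x00]) + count.to_bytes(3, "big") + b"\x00"


if __name__ == "__main__":
    # One tapemark, drive reports completion only after it is on tape.
    print(write_filemarks_cdb(1, immed=False).hex())
    # One tapemark, drive may report completion before it is on tape.
    print(write_filemarks_cdb(1, immed=True).hex())
    # Zero tapemarks with immediate cleared: a buffer flush ("sync").
    print(write_filemarks_cdb(0, immed=False).hex())
    print("label bytes per file:", LABEL_BYTES * LABELS_PER_FILE)
```

The label records themselves add only 480 bytes per file; any performance cost would come from the tapemarks and flushes, which can interrupt streaming.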
Labels vs Performance
[Chart: Average Effective Write Data Transfer, HP LTO-3. Data transfer (MBytes/s, 0 to 70) vs. file size (MBytes, 0 to 2000). Series: Imm=1 Tpmrk=3 Sync=1; Imm=0 Tpmrk=3 Sync=1; Imm=0 Tpmrk=1 Sync=1; Imm=0 Tpmrk=0 Sync=1; Imm=0 Tpmrk=0 Sync=0. Annotations: minimum overhead, maximum overhead]
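One simple way to reason about why the effective rate depends on file size is a fixed per-file overhead added to the streaming time. The model below is an assumption for illustration, not the analysis behind the chart, and the numbers in the usage example are made up rather than measured; `effective_rate` is a hypothetical name.

```python
def effective_rate(file_mb: float, streaming_mb_s: float, overhead_s: float) -> float:
    """Effective MB/s for one file: size / (streaming time + fixed per-file overhead)."""
    if file_mb <= 0 or streaming_mb_s <= 0 or overhead_s < 0:
        raise ValueError("sizes and rates must be positive, overhead non-negative")
    return file_mb / (file_mb / streaming_mb_s + overhead_s)


if __name__ == "__main__":
    # Illustrative values only: 60 MB/s streaming, 5 s overhead per file.
    for size in (10, 100, 500, 2000):
        print(f"{size:5d} MB -> {effective_rate(size, 60.0, 5.0):5.1f} MB/s")
```

Under this model small files are dominated by the overhead term, and the effective rate approaches the streaming rate only as files grow large.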
Labels vs Performance
[Chart: Average Effective Write Data Transfer, HP LTO-3 vs IBM 3592JA. Data transfer (MBytes/s, 0 to 70) vs. file size (MBytes, 0 to 2000). Series, plotted for both LTO3 and 3592: Imm=1 Tpmrk=3 Sync=1; Imm=0 Tpmrk=3 Sync=1; Imm=0 Tpmrk=1 Sync=1; Imm=0 Tpmrk=0 Sync=1; Imm=0 Tpmrk=0 Sync=0. Annotations: HP-LTO3, IBM 3592JA]
Labels vs Performance
[Chart: Average Effective Write Data Transfer, HP LTO-3 vs IBM 3592JA. Data transfer (MBytes/s, 0 to 70) vs. file size (MBytes, 0.25 to 2024, with a finer scale for small files). Series, plotted for both LTO3 and 3592: Imm=1 Tpmrk=3 Sync=1; Imm=0 Tpmrk=3 Sync=1; Imm=0 Tpmrk=1 Sync=1; Imm=0 Tpmrk=0 Sync=1; Imm=0 Tpmrk=0 Sync=0. Annotation: IBM 3592, small files]
HSM Integration
Functionality + Mechanical + Performance
• tape_up: tape unit standard testing for production utility
• tplabel: tape labelling utility
• dumptape: tape dumping (scanning) utility
• stagein: tape reading utility
• stagewrt: tape writing utility
• repack: move CASTOR file from tape and reclaim utilities
OK
Drive integration in HSM system
Operations
IBM 3592 / HP LTO-3: SNMP Agent:
• Error Counters
• Tape Alerts
• Drive and Media: number of mounts/loads/…
IBM 3592: Statistical Analysis and Reporting System (SARS):
• Byte 62: SARS Drive Relative Quality; X'00' is unknown, best X'01' -> worst X'FF'
• Byte 63: SARS Media Relative Quality; X'00' is unknown, best X'01' -> worst X'FF'
No Vendor: Perfect tool for monitoring all types of drives and for a large number of drives
Request Sense
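Reading the two SARS quality indicators out of a Request Sense buffer could look like the sketch below. It assumes that the two fields sit at byte offsets 62 and 63 of the returned sense data and that each is a single byte, which is how the value range X'00' to X'FF' reads; `sars_quality` and `describe` are hypothetical helpers, and obtaining the sense buffer from the drive is left out.

```python
from typing import Dict

DRIVE_QUALITY_OFFSET = 62  # assumed byte offset of SARS Drive Relative Quality
MEDIA_QUALITY_OFFSET = 63  # assumed byte offset of SARS Media Relative Quality


def sars_quality(sense: bytes) -> Dict[str, int]:
    """Extract the two relative-quality bytes from a Request Sense buffer."""
    if len(sense) <= MEDIA_QUALITY_OFFSET:
        raise ValueError("sense buffer too short to hold the SARS fields")
    return {
        "drive": sense[DRIVE_QUALITY_OFFSET],
        "media": sense[MEDIA_QUALITY_OFFSET],
    }


def describe(value: int) -> str:
    """X'00' is unknown; otherwise X'01' is best and X'FF' is worst."""
    if value == 0x00:
        return "unknown"
    return f"X'{value:02X}' (X'01' best, X'FF' worst)"


if __name__ == "__main__":
    # Synthetic buffer for demonstration; not real drive output.
    sense = bytearray(96)
    sense[DRIVE_QUALITY_OFFSET] = 0x01
    sense[MEDIA_QUALITY_OFFSET] = 0x00
    for name, value in sars_quality(bytes(sense)).items():
        print(name, describe(value))
```

Polling these two bytes across many drives is one way such indicators could feed a monitoring tool.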
More Tests
[Diagram: CASTOR HSM, RFIO, SAN (?? $$)]