kinetic basho public
Seagate Kinetic Open Storage Platform
James Hughes …and many others
Storage is a Price Elastic Market
Price elasticity of demand
• Alfred Marshall (1890)
As the price of Storage approaches $0
• Demand for storage will approach infinity
If the price of a Cisco router approaches $0
• Demand for routers will not approach infinity - Storage is different
http://en.wikipedia.org/wiki/Alfred_Marshall
Seagate Confidential: Subject to NDA No. 77103, effective Jan. 18, 2009,
and all applicable supplements
Areal Density Growth
[Figure: areal density in gigabit/in2 (log scale, 0.1 to 100000) by year, 1989 to 2019. Plot labels: Single particle superparamagnetic limit (estimated); Charap’s limit (broken); Inductive Writing & Reading; Inductive Writing/MR reading; Inductive Writing/GMR reading; Perpendicular Writing & GMR; HAMR; HAMR+BPM; growth rates 29%, 100%, 40%.]
• Late 1990s – superparamagnetic limit demonstrated through modeling
• Perpendicular expected to extend to 0.5-1 Tb/in2
• Additional innovations required at that point
• heat-assisted recording
• bit patterned media recording
• Areal Density CAGR 40%
• Transfer Rate CAGR 20%
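To make the compound growth rates on this slide concrete, here is a minimal sketch of the arithmetic. The function names are illustrative and not from the talk; the only inputs taken from the slide are the 40% and 20% CAGR figures.

```python
import math


def doubling_time_years(cagr: float) -> float:
    """Years needed to double at a compound annual growth rate (0.40 means 40%)."""
    return math.log(2) / math.log(1 + cagr)


def growth_factor(cagr: float, years: float) -> float:
    """Total multiplication after the given number of years of compound growth."""
    return (1 + cagr) ** years


# CAGR figures quoted on the slide.
areal_density_doubling = doubling_time_years(0.40)
transfer_rate_doubling = doubling_time_years(0.20)
areal_density_per_decade = growth_factor(0.40, 10)
```

Under these figures areal density doubles roughly every 2.06 years (about 29x per decade), while transfer rate doubles roughly every 3.8 years, so capacity outgrows the rate at which a drive can be read or written.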
Jevons Paradox
• Cloud Computing increases the efficiency of computing....
Cloud Computing will increase this trend
Improved technology doubles the amount of Information produced with a given amount of Storage
• Demand for Storage rises
http://en.wikipedia.org/wiki/Jevons_paradox
Technology Trends
Shingled Disks
• Write head larger than read head
• Turns Disk into a sequentially written medium
• All updates to data and metadata are written sequentially to a continuous stream, called a log
• Disk API of sectors is no longer “natural”
http://www.ssrc.ucsc.edu/Papers/amer-ieeetm11.pdf
Log Structured Storage
How much is erased on a reposition?
• Tape - the remainder of the tape
• Shingled disk - the remainder of the track group
• Flash - the entire page
All persistent Storage systems do/will implement log structure
• e.g. “NoSQL Database of sectors”
Does it make sense to layer a database on top of a database?
• Could we use the log structure of the media to provide a more natural storage system, not mimicking an antique paradigm?
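The log-structure idea on this slide can be sketched as a toy append-only key/value store. This is an illustration under stated assumptions, not Kinetic's on-media format: the class name `AppendOnlyLog`, the in-memory `bytearray` standing in for the medium, and the `compact` step are all hypothetical.

```python
class AppendOnlyLog:
    """Toy log-structured key/value store: every write appends, nothing is updated in place."""

    def __init__(self):
        self.log = bytearray()  # stands in for the sequentially written medium
        self.index = {}         # key -> (offset, length) of the newest value

    def put(self, key: bytes, value: bytes) -> None:
        offset = len(self.log)
        self.log += value                       # append at the tail of the log
        self.index[key] = (offset, len(value))  # older copies become garbage

    def get(self, key: bytes) -> bytes:
        offset, length = self.index[key]
        return bytes(self.log[offset:offset + length])

    def delete(self, key: bytes) -> None:
        self.index.pop(key, None)  # space is reclaimed later by compact()

    def compact(self) -> None:
        """Copy live values into a fresh log, dropping superseded data."""
        new_log = bytearray()
        new_index = {}
        for key, (offset, length) in self.index.items():
            new_index[key] = (len(new_log), length)
            new_log += self.log[offset:offset + length]
        self.log, self.index = new_log, new_index
```

Overwriting a key leaves the old bytes in the log until `compact()` runs, which mirrors why tape, shingled disk, and flash all need some form of garbage collection behind a sector interface.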
"8
Leading to disaggregation of servers
Single System Performance Trend
http://web.eecs.umich.edu/~twenisch/papers/isca09-disaggregate.pdf
Scaling Storage
Distributed Hash Table
• Key/Value Store
• RAM: Memcached
• Flash: FAWN
• Disk: Riak
http://en.wikipedia.org/wiki/Distributed_hash_table
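One common way to spread keys over many nodes in a distributed hash table is consistent hashing. The sketch below is a generic illustration, not how Memcached, FAWN, or Riak are implemented; `HashRing`, the `drive-N` names, and the choice of 64 virtual nodes are assumptions made for the example.

```python
import bisect
import hashlib


def _hash(data: bytes) -> int:
    """Map bytes to a 64-bit position on the ring."""
    return int.from_bytes(hashlib.sha1(data).digest()[:8], "big")


class HashRing:
    """Consistent-hash ring: each node owns the arcs ending at its virtual points."""

    def __init__(self, nodes, vnodes=64):
        self._points = sorted(
            (_hash(f"{node}#{i}".encode()), node)
            for node in nodes
            for i in range(vnodes)
        )
        self._hashes = [h for h, _ in self._points]

    def node_for(self, key: bytes) -> str:
        # First virtual point clockwise from the key's hash, wrapping at the end.
        i = bisect.bisect_right(self._hashes, _hash(key)) % len(self._points)
        return self._points[i][1]


ring = HashRing(["drive-0", "drive-1", "drive-2"])
owner = ring.node_for(b"user:42")
```

The useful property is that removing a node only remaps the keys that node owned; keys on the remaining nodes stay where they are.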
"10
Metadata and Metadata Servers are Evil
Required by traditional file systems (POSIX) to translate names to sectors
• Hard to scale, heavy HA requirements, expensive
Can we use a name as a key?
• Place the data into a scaled key value store?
• Eliminate costly metadata servers?
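A minimal sketch of the "name as a key" question, assuming a single ordered key/value store: the object name is the key, so no separate name-to-sector table is consulted, and a directory-style listing becomes a prefix scan over sorted keys. `NameKeyedStore` and its methods are hypothetical names for illustration only.

```python
import bisect


class NameKeyedStore:
    """Flat store keyed directly by object name; names are kept in sorted order."""

    def __init__(self):
        self._names = []   # sorted list of names, used for prefix scans
        self._values = {}  # name -> data

    def put(self, name: str, data: bytes) -> None:
        if name not in self._values:
            bisect.insort(self._names, name)
        self._values[name] = data

    def get(self, name: str) -> bytes:
        return self._values[name]

    def list_prefix(self, prefix: str) -> list:
        """Return all names starting with prefix, in sorted order."""
        start = bisect.bisect_left(self._names, prefix)
        matches = []
        for name in self._names[start:]:
            if not name.startswith(prefix):
                break
            matches.append(name)
        return matches
```

For example, after storing `/photos/a.jpg`, `/photos/b.jpg`, and `/logs/x`, calling `list_prefix("/photos/")` returns the two photo names without any metadata server in the path.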
"11
Cumulative operations ordered by length
[Figure: cumulative percentage (0% to 100%) of operations and of data versus Length (KB), log scale from 1.00 to 100000.00. Annotations at 32KB: 92% of the operations, 0.5% of the data.]
Map of Operations
[Figure: operations plotted by Time (minutes), Location (TB), and Length.]
Seagate Kinetic Open Storage Platform
Dis-intermediates applications to drive
–Goes around file systems, volume managers, drivers
Enable ecosystem of value added software
–Partners (like Basho) can create their own system value
Lower TCO
–Eliminates complexity
Seagate Kinetic Open Storage Platform
[Figure: architecture diagram shown in successive builds. Apps sit on LibKinetic (C++, Java, Python, Erlang, DIY) and reach the drives over ProtoBuf TCP/IP/GbE. Layer labels: Application, Clustering, Management; Storage; Interconnect. Ownership labels: Proprietary to Seagate; GPL; Standard; Proprietary to System Vendor.]
System Hardware
Typical JBOD architecture
• Does not require a server, just JBODs to the ToR Switch
• 10 JBODs × 60 drives × 4TB = 2.4PB/Rack
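The rack capacity quoted on this slide can be checked with a few lines of arithmetic. The function name is illustrative, and decimal units (1 PB = 1000 TB) are assumed.

```python
def rack_capacity_tb(jbods: int, drives_per_jbod: int, tb_per_drive: float) -> float:
    """Raw capacity of one rack in terabytes, before any replication or erasure coding."""
    return jbods * drives_per_jbod * tb_per_drive


# Figures from the slide: 10 JBODs, 60 drives each, 4 TB per drive.
drive_count = 10 * 60
capacity_tb = rack_capacity_tb(10, 60, 4)
capacity_pb = capacity_tb / 1000  # decimal petabytes
```

That is 600 network-attached drives and 2400 TB (2.4 PB) of raw capacity per rack; usable capacity depends on whatever redundancy the clustering layer applies.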
"18
Features
Provides RPC to Key/Value database
• Data is pre-indexed
• Compression and other added value are easy and transparent
P2P (Drive to Drive) copy of key ranges
Communicate using existing Data Center Plumbing (TCP/IP)
Multiple masters - Data sharing between machines
Configurable caching per command
• Async, Sync, Flush
Local space management
Kinetic Systems
Clustering (performance, reliability, management)
Compatibility with large scale applications (S3, etc.)
Centralized Management
• Reliability, availability, durability
Lower TCO
Elimination of server layers
Fewer human requirements
Reduced mistakes
Disaggregate storage from servers
Power management
Goals of API
Data movement
• Get/put/delete/getnext/getprevious
• Versioned (== for success), options
Range operations
Multiple masters
• Authentication/Integrity/Authorization
Cluster-able
• Simple cluster configuration version enforcement
3rd party copy
Management
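The data-movement operations listed here can be illustrated with an in-memory mock. This is not LibKinetic and does not reproduce its signatures; `MockDrive`, `VersionMismatch`, and every method name below are hypothetical. "Versioned (== for success)" is read here as: a write succeeds only when the caller's expected version equals the stored one.

```python
import bisect


class VersionMismatch(Exception):
    """Raised when the caller's expected version differs from the stored version."""


class MockDrive:
    def __init__(self):
        self._keys = []     # sorted keys, for getnext/getprevious/range
        self._entries = {}  # key -> (version, value)

    def put(self, key, value, new_version, expected_version=None):
        current = self._entries.get(key)
        current_version = current[0] if current else None
        if current_version != expected_version:
            raise VersionMismatch(key)
        if current is None:
            bisect.insort(self._keys, key)
        self._entries[key] = (new_version, value)

    def get(self, key):
        return self._entries.get(key)  # (version, value) or None

    def delete(self, key, expected_version):
        current = self._entries.get(key)
        if current is None or current[0] != expected_version:
            raise VersionMismatch(key)
        del self._entries[key]
        self._keys.remove(key)

    def get_next(self, key):
        i = bisect.bisect_right(self._keys, key)
        return self._keys[i] if i < len(self._keys) else None

    def get_previous(self, key):
        i = bisect.bisect_left(self._keys, key)
        return self._keys[i - 1] if i > 0 else None

    def get_key_range(self, start, end):
        """Keys in the inclusive range [start, end], in sorted order."""
        lo = bisect.bisect_left(self._keys, start)
        hi = bisect.bisect_right(self._keys, end)
        return self._keys[lo:hi]
```

The version check is what lets multiple masters share a drive: two writers racing on the same key cannot both succeed, because the second one presents a version that is no longer current.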
"22
Management (System Vendor)
Configures the drive
• Network
• authorized clients
Monitors
• Health
• Statistics
• Logs
Initiates recovery
• Change cluster version
• 3rdPartyCopy
Data formats
Key Structure
• Variable number of octets (0-4KB)
Data Structure (Serialized to a byte stream)
• KeyOf
• Version
• E2E Data Integrity
–Algorithm name
• Data: variable length (0-xMB)
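A rough sketch of the record fields listed on this slide, with an end-to-end integrity tag computed by a named algorithm. The field names, the 4096-octet key check (reading "4KB" as 4096), and the use of Python's `hashlib` are assumptions for illustration; the actual wire format is a serialized byte stream that this sketch does not reproduce.

```python
import hashlib
from dataclasses import dataclass

MAX_KEY_OCTETS = 4096  # assumed reading of the slide's "0-4KB" key limit


@dataclass(frozen=True)
class Record:
    key: bytes       # "KeyOf" on the slide
    version: bytes
    algorithm: str   # name of the integrity algorithm, e.g. "sha1"
    tag: bytes       # integrity value computed over the data
    data: bytes


def make_record(key: bytes, version: bytes, data: bytes, algorithm: str = "sha1") -> Record:
    if len(key) > MAX_KEY_OCTETS:
        raise ValueError("key longer than 4096 octets")
    tag = hashlib.new(algorithm, data).digest()
    return Record(key, version, algorithm, tag, data)


def verify(record: Record) -> bool:
    """Recompute the tag with the named algorithm and compare it to the stored one."""
    return hashlib.new(record.algorithm, record.data).digest() == record.tag
```

Carrying the algorithm name with the tag lets the writer and any later reader agree on how to check the data, which is the point of end-to-end integrity: corruption anywhere between them shows up as a failed `verify`.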
"24
Performance Metrics
Same normal performance expectations
• Sequential Write
• Random Write
• Sequential Read
• Random Read
Iometer for key/value
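An Iometer-style measurement for a key/value interface could look like the sketch below. It is not the tool used in the talk; `measure_put_rate` and its parameters are hypothetical, and it times whatever `put` callable it is given (here a plain dict, so the numbers say nothing about a real drive).

```python
import os
import random
import time


def measure_put_rate(put, value_size: int, count: int, order: str = "sequential") -> dict:
    """Time `count` puts of `value_size` bytes and report puts/s and MB/s."""
    value = os.urandom(value_size)
    # Big-endian integer keys sort in insertion order, giving a sequential key pattern.
    keys = [i.to_bytes(8, "big") for i in range(count)]
    if order == "random":
        random.Random(0).shuffle(keys)  # fixed seed keeps runs comparable
    start = time.perf_counter()
    for key in keys:
        put(key, value)
    elapsed = max(time.perf_counter() - start, 1e-9)  # guard against a zero interval
    return {
        "puts_per_s": count / elapsed,
        "mb_per_s": count * value_size / elapsed / 1e6,
    }


store = {}
small = measure_put_rate(store.__setitem__, value_size=1024, count=1000)
```

Running the same function with large values (for example 1 MB) and with `order="random"` gives the key/value counterparts of the sequential and random write tests listed above.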
"25
Demo Time!
Performance Results
[Figure: two put-rate charts, each with an x-axis from 0 to 8. 1MB values put rate (MB/s), y-axis 0 to 120. 1KB values put rate (Puts/s), y-axis 0 to 1000.]
Conclusion
Deliver more value to Seagate, Partners and Customers
• Dis-intermediates cloud applications to drive
• Enable innovation in hardware and software ecosystem
• Lower TCO
Open Source Software
–Basho Riak, Swift, HDFS
More information
• http://seagate.com/www/kinetic
• https://developers.seagate.com
• http://github.com/Seagate