2010
Project Voldemort
When a relational database is not enough (too much?)
Tomasz Nurkiewicz
• Graduated from Warsaw University of Technology (Faculty of Electronics and Information Technology)
• SCJP, SCJD, SCWCD, SCBCD
• Brown belt on JavaBlackBelt.com
• A few commits in:
– Commons Lang, Logback, Mule ESB
• Currently working as a developer at JAVART
• http://nurkiewicz.blogspot.com
What is Voldemort not?
• Relational database
• Alternative API to RDB
• Distributed cache
...so?
• Key-value database
• "Persistent HashMap"
• Keywords:
– Cluster ready
– Scalability (no master)
– Efficient (partitioning)
– High availability
• "Less than" a relational database
Voldemort vs. HashMap
• Partitions vs. hash buckets
• Where is my stuff? Client-side routing
[Diagram: partitions 0 to 19 assigned to four nodes]
node1: 0 4 8 12 16
node2: 1 5 9 13 17
node3: 2 6 10 14 18
node4: 3 7 11 15 19
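
The lookup behind client-side routing can be sketched in a few lines: hash the key to a partition, then map the partition to the node that owns it. This is only an illustration of the idea under simplifying assumptions. The class name, the hash function and the modulo mapping are hypothetical and are not Voldemort's actual routing code; the 20 partitions and 4 nodes mirror the layout on the slide.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical sketch: key -> partition -> node, for 20 partitions on 4 nodes.
final class PartitionRoutingSketch {
    static final int PARTITIONS = 20; // partitions 0..19
    static final int NODES = 4;       // node1..node4

    // Assumption: a plain byte-array hash stands in for the real hash function.
    static int partitionFor(String key) {
        int hash = Arrays.hashCode(key.getBytes(StandardCharsets.UTF_8));
        return Math.floorMod(hash, PARTITIONS); // floorMod keeps the result non-negative
    }

    // In the slide layout, partition p is owned by node (p mod 4) + 1.
    static int nodeFor(int partition) {
        return (partition % NODES) + 1;
    }

    public static void main(String[] args) {
        String key = "user:42";
        int partition = partitionFor(key);
        System.out.println(key + " -> partition " + partition + " -> node" + nodeFor(partition));
    }
}
```

Because every client can compute this mapping locally from the cluster layout, no central master has to be asked where a key lives.
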
API – StoreClient<K, V>
• getValue(K key): V
• getAll(Iterable<K> keys): Map<K, Versioned<V>>
• put(K key, V value)
• delete(K key): boolean
• getResponsibleNodes(K key): List<Node>
• …
• extends java.util.Map<K, V>?
Voldemort cluster
DEMO
node1: localhost:6101
node2: localhost:6102
node3: localhost:6103
node4: localhost:6104
Connection (One to rule them all)
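import voldemort.client.ClientConfig;
import voldemort.client.SocketStoreClientFactory;
import voldemort.client.StoreClient;

ClientConfig config = new ClientConfig();
config.setBootstrapUrls("tcp://server:6101"); // any one node is enough to bootstrap
SocketStoreClientFactory factory = new SocketStoreClientFactory(config);
StoreClient<K, V> client = factory.getStoreClient("users"); // "users" is the store name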
• No replication – efficient, but risky
Store configuration (stores.xml)
• N (<replication-factor>)
• R (<required-reads>)
• W (<required-writes>)
• Serialization
– JSON: only primitives, dates and collections
– protobuf (Google)
– String (e.g. XML)
– Java (Serializable)
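
How N, R and W interact can be checked with simple arithmetic. The sketch below assumes the usual quorum rule of thumb: if R + W > N, every read quorum overlaps every write quorum, so a read reaches at least one replica holding the latest successful write. The class and method names are hypothetical and are not part of Voldemort.

```java
// Hypothetical helper for reasoning about replication settings.
final class QuorumSketch {

    // True when any R replicas read must include at least one of the W replicas written.
    static boolean readsOverlapWrites(int n, int r, int w) {
        if (n < 1 || r < 1 || w < 1 || r > n || w > n) {
            throw new IllegalArgumentException("need 1 <= R <= N and 1 <= W <= N");
        }
        return r + w > n;
    }

    // Number of replicas that may be down while a write can still collect W acknowledgements.
    static int tolerableWriteFailures(int n, int w) {
        return n - w;
    }

    public static void main(String[] args) {
        System.out.println("N=3, R=2, W=2 overlap: " + readsOverlapWrites(3, 2, 2));
        System.out.println("N=3, R=1, W=1 overlap: " + readsOverlapWrites(3, 1, 1));
    }
}
```

With N=3, R=2, W=2 the quorums overlap and writes survive one failed replica; with N=3, R=1, W=1 reads and writes are cheaper, but a read may hit a replica that has not seen the latest write yet.
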
Cluster configuration (cluster.xml)
<cluster>
  <name>User cluster</name>
  <server>
    <id>1</id>
    <host>192.168.0.17</host>
    <socket-port>6101</socket-port>
    <partitions>0, 4, 8, 12, 16</partitions>
  </server>
• Persistence: MySQL, Berkeley DB, RAM
Rebalancing
[Diagram: partitions 0 to 19 held by node1 to node4, with node5 added to the cluster]
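
One way to picture rebalancing is as moving whole partitions to the new node until it holds a fair share, so keys never need to be rehashed. The sketch below assumes the simplest possible policy (repeatedly take the last partition from whichever node currently holds the most); it is a hypothetical illustration, not the algorithm used by Voldemort's rebalancing tool, and all names are made up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: hand partitions to a new node until it holds its fair share.
final class RebalanceSketch {

    // The four-node layout used in this deck: partition p lives on node (p mod 4) + 1.
    static Map<String, List<Integer>> slideLayout() {
        Map<String, List<Integer>> layout = new LinkedHashMap<>();
        layout.put("node1", Arrays.asList(0, 4, 8, 12, 16));
        layout.put("node2", Arrays.asList(1, 5, 9, 13, 17));
        layout.put("node3", Arrays.asList(2, 6, 10, 14, 18));
        layout.put("node4", Arrays.asList(3, 7, 11, 15, 19));
        return layout;
    }

    // Returns a new layout; the input map is left untouched.
    static Map<String, List<Integer>> addNode(Map<String, List<Integer>> layout, String newNode) {
        Map<String, List<Integer>> result = new LinkedHashMap<>();
        for (Map.Entry<String, List<Integer>> e : layout.entrySet()) {
            result.put(e.getKey(), new ArrayList<>(e.getValue()));
        }
        List<Integer> moved = new ArrayList<>();
        result.put(newNode, moved);

        int total = 0;
        for (List<Integer> parts : result.values()) {
            total += parts.size();
        }
        int target = total / result.size(); // fair share, rounded down

        while (moved.size() < target) {
            // Donor: the existing node that currently holds the most partitions.
            List<Integer> donor = null;
            for (List<Integer> parts : result.values()) {
                if (parts != moved && (donor == null || parts.size() > donor.size())) {
                    donor = parts;
                }
            }
            moved.add(donor.remove(donor.size() - 1));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(addNode(slideLayout(), "node5"));
    }
}
```

Under this policy node5 ends up with partitions 16, 17, 18 and 19, and every node holds four partitions. Which partitions a real cluster would move is decided by the operator or the rebalancing tool.
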
Facts of life (LinkedIn and Gilt)
             GET        PUT
Throughput   19K/s      16K/s
Median       0.015 ms   0.040 ms
99.99%       0.227 ms   2.551 ms
• Uptime measured in months (Gilt Groupe)
• "Last Thursday we lost a server and no one noticed" (also Gilt Groupe)
Transactional/consistent?
• Logical time (vector clocks)
• Conflict resolution during read
• Optimistic locking
– putIfNotObsolete(K key, Versioned<V> versioned)
– applyUpdate(UpdateAction<K, V> action)
– ObsoleteVersionException
• R + W > N
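
The idea behind logical time can be shown with a minimal vector clock: each version carries one counter per node, and two versions are compared counter by counter. If neither version dominates the other, they were written concurrently and the conflict has to be resolved when the value is read. The class below is a hypothetical, simplified sketch written for this explanation; it is not Voldemort's own versioning code.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical, simplified vector clock: one counter per node id.
final class VectorClockSketch {
    enum Order { BEFORE, AFTER, EQUAL, CONCURRENT }

    private final Map<Integer, Long> counters = new HashMap<>();

    // Returns a new clock with the given node's counter advanced by one.
    VectorClockSketch incremented(int nodeId) {
        VectorClockSketch copy = new VectorClockSketch();
        copy.counters.putAll(counters);
        copy.counters.merge(nodeId, 1L, Long::sum);
        return copy;
    }

    // Compares this clock with another one, treating missing counters as zero.
    Order compare(VectorClockSketch other) {
        boolean thisAhead = false;
        boolean otherAhead = false;
        Set<Integer> nodes = new HashSet<>(counters.keySet());
        nodes.addAll(other.counters.keySet());
        for (int node : nodes) {
            long mine = counters.getOrDefault(node, 0L);
            long theirs = other.counters.getOrDefault(node, 0L);
            if (mine > theirs) thisAhead = true;
            if (mine < theirs) otherAhead = true;
        }
        if (thisAhead && otherAhead) return Order.CONCURRENT; // conflict: neither dominates
        if (thisAhead) return Order.AFTER;
        if (otherAhead) return Order.BEFORE;
        return Order.EQUAL;
    }

    public static void main(String[] args) {
        VectorClockSketch base = new VectorClockSketch().incremented(1);
        VectorClockSketch left = base.incremented(2);  // update coordinated by node 2
        VectorClockSketch right = base.incremented(3); // independent update via node 3
        System.out.println(base.compare(left));
        System.out.println(left.compare(right));
    }
}
```

Optimistic locking builds on the same comparison: a write that carries a version older than the stored one is rejected, and the client is expected to re-read and retry.
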
What else?
• JMX (both client and server)
• Views (non-materialized)
• HTTP interface (not REST though)
• Hinted handoff
Happy clustering!
• http://project-voldemort.com
• http://www.infoq.com/presentations/Project-Voldemort-Scaling-Simple-Storage
• http://www.infoq.com/presentations/Project-Voldemort-at-Gilt-Groupe
• http://behemoth.strlen.net/~alex/voldemort-nosql_live.ppt
• http://www.slideshare.net/hadoopusergroup/hadoop-and-voldemort-linkedin
• http://www.vineetgupta.com/2010/01/nosql-databases-part-1-landscape.html
• http://www.metabrew.com/article/anti-rdbms-a-list-of-distributed-key-value-stores