"Clusterpoint Inside-Out" by Jurģis Orups at NoSQL focused XXVIII DevClub.lv event

Posted on 20-Jul-2015

Page 1: "Clusterpoint Inside-Out" by Jurģis Orups at NoSQL focused XXVIII DevClub.lv event

Clusterpoint Inside-Out

Jurģis Orups

Page 2:

Development stages
- Planning – Idea
- Infant – Minimum Viable Product
- Child – Trial and error
- Teenager – Pivot & Execute
- Grown ups – ... soon (:

Page 3:

Inspiration (2001)
- Full-text search (FTS) for Sybase & FoxPro
- First distributed design & implementation – trying to bite Google (:
- Folk song search portal www.dainuskapis.lv

Page 4:

Long long time ago

Page 5:

Talk is cheap. Show me the code. (c) Linus Torvalds

Inverted Index
- Problem – real time updates to index

Pierpaolo Basile, Information Access with Lucene, Slideshare.net

Page 6:

Inverted Index

Pierpaolo Basile, Information Access with Lucene, Slideshare.net

Page 7:

Inverted Index

Pierpaolo Basile, Information Access with Lucene, Slideshare.net

Page 8:

Infant (2006)
- Clusterpoint (2006) – first startup in Latvia
- Seeded by Imprimatur Capital
- Team of 2.5 developers and 0.5 CEO
- 6 months of wicked C/C++ coding
- Biting Google again – search appliance vertical
  - “didn't go well”

Page 9:

Inverted index
- Two types of FTS indices:
  - Memory (mutable)
  - Disk based (immutable)
- Dump memory index when full
- Merge dumps
- Problem solved – real time updates!
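The two-index scheme on this slide can be sketched in a few lines of Python. This is only an illustration of the idea, not Clusterpoint's code: the class name, the `memory_limit` threshold, and the use of in-process dictionaries as a stand-in for on-disk files are all assumptions.

```python
from collections import defaultdict


class RealTimeIndex:
    """Mutable in-memory index plus immutable dumped segments (hypothetical names)."""

    def __init__(self, memory_limit=2):
        self.memory_limit = memory_limit   # max documents buffered before a dump
        self.memory = defaultdict(set)     # term -> doc ids, mutable
        self.buffered_docs = 0
        self.segments = []                 # immutable dumps; stand-in for disk files

    def add(self, doc_id, text):
        for term in text.lower().split():
            self.memory[term].add(doc_id)
        self.buffered_docs += 1
        if self.buffered_docs >= self.memory_limit:
            self.dump()

    def dump(self):
        """Freeze the memory index into an immutable segment and start a fresh one."""
        if self.memory:
            self.segments.append(
                {term: frozenset(ids) for term, ids in self.memory.items()}
            )
        self.memory = defaultdict(set)
        self.buffered_docs = 0

    def merge(self):
        """Combine all dumped segments into one to keep the segment count low."""
        merged = defaultdict(set)
        for segment in self.segments:
            for term, ids in segment.items():
                merged[term] |= ids
        self.segments = (
            [{term: frozenset(ids) for term, ids in merged.items()}] if merged else []
        )

    def search(self, term):
        """Look in the memory index and in every segment, so new documents are visible at once."""
        term = term.lower()
        hits = set(self.memory.get(term, ()))
        for segment in self.segments:
            hits |= segment.get(term, frozenset())
        return hits


index = RealTimeIndex(memory_limit=2)
index.add(1, "js developer Dublin")
index.add(2, "java developer Riga")   # second document fills memory and triggers a dump
index.add(3, "js tester Dublin")      # stays in memory, still searchable
index.merge()
```

Searches consult both the small mutable index and the frozen segments, which is why an update becomes visible without rewriting anything already dumped.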

Page 10:

Query language
- Simple query:
  js developer Dublin
- Advanced query:
  js developer <sex>="female"</sex> <salary>2000 .. 5000</salary> <place>="Dublin"</place>
- Aggregation (SQL like):
  SELECT sex, count(sex) GROUP BY sex limit 1
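To make the intent of the advanced query and the aggregation concrete, here is a plain-Python reading of what they appear to ask for. It does not use or reproduce the Clusterpoint query engine. The sample documents are invented, and the semantics (all terms must match, range bounds are inclusive) are assumptions.

```python
from collections import Counter

# Hypothetical sample documents; field names follow the slide's query.
documents = [
    {"text": "senior js developer", "sex": "female", "salary": 3000, "place": "Dublin"},
    {"text": "js developer", "sex": "male", "salary": 2500, "place": "Dublin"},
    {"text": "js developer", "sex": "female", "salary": 6000, "place": "Dublin"},
    {"text": "java developer", "sex": "female", "salary": 4000, "place": "Riga"},
]


def matches(doc, terms, equals=None, ranges=None):
    """True if doc contains all terms and satisfies every field condition."""
    words = doc["text"].lower().split()
    if not all(term.lower() in words for term in terms):
        return False
    for field, value in (equals or {}).items():
        if doc.get(field) != value:
            return False
    for field, (low, high) in (ranges or {}).items():
        if field not in doc or not (low <= doc[field] <= high):
            return False
    return True


# Assumed meaning of the advanced query on the slide.
advanced_hits = [
    doc for doc in documents
    if matches(
        doc,
        ["js", "developer"],
        equals={"sex": "female", "place": "Dublin"},
        ranges={"salary": (2000, 5000)},
    )
]

# Assumed meaning of "SELECT sex, count(sex) GROUP BY sex limit 1": a single
# (value, count) group. most_common() is used only to pick a group
# deterministically; the slide does not say which group is returned.
counts_by_sex = Counter(doc["sex"] for doc in documents)
first_group = counts_by_sex.most_common(1)
```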

Page 11:

Lookup tables (column-stores)
- Associative array/hash map
- Constant access/modify time
- Memory mapped
- Append only
- Perfect when accessing data by column
  - i.e. aggregation, faceting, filtering
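A minimal sketch of such a lookup table, assuming one table per field: values are only ever appended, and a hash map points each document at its newest slot. The class and method names are hypothetical, and the memory-mapping aspect is left out to keep the example short.

```python
from collections import Counter


class ColumnLookup:
    """Append-only column with a hash map from document id to its latest slot."""

    def __init__(self):
        self.values = []     # append-only storage; slots are never rewritten
        self.slot_of = {}    # doc id -> index of the newest value, O(1) on average

    def put(self, doc_id, value):
        self.values.append(value)                  # an update appends a new slot
        self.slot_of[doc_id] = len(self.values) - 1

    def get(self, doc_id):
        return self.values[self.slot_of[doc_id]]

    def facet(self):
        """Count documents per current value (aggregation/faceting by column)."""
        return Counter(self.values[slot] for slot in self.slot_of.values())

    def filter(self, predicate):
        """Ids of documents whose current value satisfies the predicate."""
        return {
            doc_id
            for doc_id, slot in self.slot_of.items()
            if predicate(self.values[slot])
        }


place = ColumnLookup()
place.put(1, "Dublin")
place.put(2, "Riga")
place.put(3, "Dublin")
place.put(2, "Dublin")   # update: the old "Riga" slot stays, the map points to the new one
```

Because a query such as a facet count touches only this one column, it never has to load whole documents.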

Page 12:

Child (2008)
- Trust in enterprise sales model
- First commercial customers
  (directories, portals, e-shops, public sector)
- Positioning as database challenging
- NoSQL – heard nothing about it ... mhm maybe we are NoSQL?!

The San Francisco NOSQL Meetup on June 11, 2009 was important to the trend's development. (Wikipedia)

Page 13:

“Family”

Page 14:

Market Trends

Page 15:

Teenager (these days)
- Less trust in enterprise model
- Shift to free software & Cloud
- Grow customer base
- Innovate
- Develop for developers

Page 16:

Transactions – for what?
- ATM cash withdrawal
- Checkout
- Transfer of goods (monies, credits, lives :)
- Booking

Page 17:

Transactions – example
- Begin
- Retrieve value for A1
- Retrieve value for A2
- Check
- Update value for A1
- Update value for A2
- Commit

Page 18:

Transactions – behind the scenes
- Begin – fix the “view of the world”
- Retrieve A1 (version v1)
- Retrieve A2 (version v2)
- Check
- Update A1: if v1' != v1 then rollback else continue
- Update A2: if v2' != v2 then rollback else continue
- Commit (save final versions in transaction log)
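The version checks listed on this slide can be sketched as follows. It is a single-process illustration under stated assumptions: `Store`, `Transaction`, `Conflict` and `transfer` are hypothetical names, versions are plain integers, and a real system would have to make the final check and the write atomic across shards.

```python
class Conflict(Exception):
    """Raised when a record changed after the transaction retrieved it."""


class Store:
    def __init__(self, initial):
        self.data = {key: (1, value) for key, value in initial.items()}  # key -> (version, value)
        self.log = []   # one entry per committed transaction: {key: final version}


class Transaction:
    def __init__(self, store):
        self.store = store
        self.seen = {}     # versions observed at retrieve time (v1, v2)
        self.writes = {}   # buffered updates, applied only at commit

    def retrieve(self, key):
        version, value = self.store.data[key]
        self.seen[key] = version
        return value

    def _check(self, key):
        current_version, _ = self.store.data[key]   # v1' in the slide's notation
        if current_version != self.seen.get(key):
            self.writes = {}                        # rollback: drop buffered updates
            raise Conflict(key)

    def update(self, key, value):
        self._check(key)                            # "if v1' != v1 then rollback"
        self.writes[key] = value

    def commit(self):
        for key in list(self.writes):
            self._check(key)                        # re-validate just before applying
        final_versions = {}
        for key, value in self.writes.items():
            final_versions[key] = self.seen[key] + 1
            self.store.data[key] = (final_versions[key], value)
        self.store.log.append(final_versions)       # save final versions in the log
        return final_versions


def transfer(store, amount):
    tx = Transaction(store)
    a1 = tx.retrieve("A1")
    a2 = tx.retrieve("A2")
    if a1 < amount:                                 # the "Check" step
        raise ValueError("insufficient funds")
    tx.update("A1", a1 - amount)
    tx.update("A2", a2 + amount)
    return tx.commit()


store = Store({"A1": 100, "A2": 50})
stale = Transaction(store)
stale.retrieve("A1")       # remembers version 1 of A1
transfer(store, 30)        # commits first and bumps A1 and A2 to version 2
```

At this point a call such as `stale.update("A1", 0)` raises `Conflict`, because the version it retrieved is no longer current. No lock is held while a transaction is only reading.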

Page 19:

Transactions – “view of the world”

Shard1
- D1: TID1, TID6
- D2: TID2
- D3: TID3, TID8
- D4: TID4
- D5: TID5

Shard2
- D6: TID1, TID6
- D7: TID2, TID8
- D8: TID3
- D9: TID4
- D10: TID5

Transaction Log
- TID1: D1, D6
- TID2: D2, D7
- TID3: D3, D8
- TID4: D4, D9
- TID5: D5, D10
- TID6: D1, D6
- TID7: D9, D8
- TID8: D3, D7

Diagram steps: 1. / 2. Retrieve / 3.
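One way to read this diagram is as a snapshot lookup: each document keeps the ids of the transactions (TIDs) that wrote it, and a transaction that fixed its view at a given TID sees the newest version at or below that TID. The sketch below uses the data from the slide. The reading itself is an assumption, as are the ideas that TIDs are ordered and that the snapshot is a single TID cutoff, and the function names are hypothetical.

```python
# Data copied from the slide: which transaction ids (TIDs) touched each document.
shard1 = {"D1": [1, 6], "D2": [2], "D3": [3, 8], "D4": [4], "D5": [5]}
shard2 = {"D6": [1, 6], "D7": [2, 8], "D8": [3], "D9": [4], "D10": [5]}
transaction_log = {
    1: ["D1", "D6"], 2: ["D2", "D7"], 3: ["D3", "D8"], 4: ["D4", "D9"],
    5: ["D5", "D10"], 6: ["D1", "D6"], 7: ["D9", "D8"], 8: ["D3", "D7"],
}


def visible_version(doc_tids, snapshot_tid):
    """Newest TID of a document that is not newer than the snapshot, or None."""
    candidates = [tid for tid in doc_tids if tid <= snapshot_tid]
    return max(candidates) if candidates else None


def applied_tids(*shards):
    """TIDs that appear on at least one document in the given shards."""
    return {tid for shard in shards for tids in shard.values() for tid in tids}


snapshot = 6   # assumed: the highest TID known when the transaction began
view = {
    doc: visible_version(tids, snapshot)
    for shard in (shard1, shard2)
    for doc, tids in shard.items()
}

# Log entries whose TID does not appear on any document in the slide's data.
not_on_shards = sorted(set(transaction_log) - applied_tids(shard1, shard2))
```

With the snapshot fixed at TID6, D3 and D7 are seen as written by TID3 and TID2, and the later write by TID8 stays invisible to this transaction. The last line only reports which log entries have no matching TID on a document in the slide's data; the slide does not say why.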

Page 20:

Transactions – behind the scenes (same steps as on Page 18)

Page 21:

Transactions - distributed
- Tough because of sharding & replication
- Transaction log – no SPOF and it scales via sharding & replication
- Optimistic locking – high concurrency
- Isolation – phantom reads

Page 22:

Transactions API

Page 23:

Benchmarks (single shard)
- Ingestion (structured) – 25'000 ops
- Ingestion (text) – 1'800 ops
- Query (fts) – 4'700 ops
- Transactions (2r + 2w) – 3'500 ops

Page 24:

Cloud
- 6 months of stacking & racking & wiring
- 800 CPU Cores/250TB Storage/3TB RAM
- Real on-demand resources
- Pay per use model

Page 25:

Lots of hardware

Page 26:

How does it work?

Once a database is stored in Clusterpoint Cloud, it is broken up into many shards and distributed among many servers.
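The slide does not say how documents are assigned to shards or shards to servers. A common approach, shown here purely as an assumption, is to hash the document id to pick a shard and to place each shard on several servers round-robin. The function names, host names, and replica count are all hypothetical.

```python
import hashlib


def shard_for(doc_id, shard_count):
    """Map a document id to a shard number using a stable hash."""
    digest = hashlib.sha256(doc_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


def place_shards(shard_count, servers, replicas=2):
    """Assign each shard to `replicas` distinct servers, round-robin."""
    if replicas > len(servers):
        raise ValueError("need at least as many servers as replicas")
    return {
        shard: [servers[(shard + r) % len(servers)] for r in range(replicas)]
        for shard in range(shard_count)
    }


servers = ["server-a", "server-b", "server-c"]   # hypothetical host names
layout = place_shards(shard_count=4, servers=servers)
shard = shard_for("doc-42", shard_count=4)
```

A stable hash (rather than Python's built-in `hash`, which varies between runs for strings) keeps the same document on the same shard across restarts.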

Page 27:

Try it
- Sign up for Cloud: http://cloud.clusterpoint.com
- Attendees: 3 months of free access, up to 100GB storage
- Be part of the community
- Have fun!

Page 28:

twitter.com/clusterpoint