hbase feed aggregator wurbe 25

feed aggregator powered by hbase & python Andrei Savu wurbe #25

Upload: andrei-savu

Post on 06-May-2015



TRANSCRIPT

Page 1: HBase Feed Aggregator Wurbe 25

feed aggregator powered by hbase & python

Andrei Savu
wurbe #25

Page 2: HBase Feed Aggregator Wurbe 25

Objectives

Highly scalable feed aggregator
Play with python & thrift
Provide some sample code
Provide detailed install instructions
Learn new stuff

Page 3: HBase Feed Aggregator Wurbe 25

Table Structure

3 tables: Feeds, Urls, UrlsIndex

Feeds: all feeds
Urls: data extracted from feeds
UrlsIndex: index table

Page 4: HBase Feed Aggregator Wurbe 25

Source code

http://github.com/andreisavu/feedaggregator

detailed install instructions

Page 5: HBase Feed Aggregator Wurbe 25

Lessons learned

Page 6: HBase Feed Aggregator Wurbe 25

Lesson #1: HBase Game Rules

Not relational
No joins
No sophisticated query engine
No column typing
No transactions
No secondary indices

... all done in application code
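
To make "all done in application code" concrete, here is a minimal sketch of an application-maintained index and an application-side join. It uses plain Python dicts as stand-ins for the Urls and UrlsIndex tables, not a real HBase client; the function names and the "title" column are hypothetical and not taken from the project source.

```python
# In-memory stand-ins for two tables (assumption: not a real HBase client).
urls = {}        # row key: url       -> {column: value}
urls_index = {}  # row key: index key -> url

def save_url(url, title, index_key):
    """Write the data row, then the index row, as two separate writes."""
    urls[url] = {"title": title}
    # No transactions: if the process dies between the two writes,
    # the application has to detect and repair the missing index row.
    urls_index[index_key] = url

def latest(prefix, limit):
    """Emulate a prefix scan over the index, newest first, then 'join' by hand."""
    matches = sorted(k for k in urls_index if k.startswith(prefix))
    newest_first = reversed(matches)
    return [urls[urls_index[k]]["title"] for k in newest_first][:limit]

save_url("http://example.com/a", "First", "tech/2009-02-14T18:00:00Z")
save_url("http://example.com/b", "Second", "tech/2009-11-03T09:30:00Z")
save_url("http://example.com/c", "Other", "news/2009-07-01T12:00:00Z")

titles = latest("tech/", 10)
print(titles)
```

The lookup in `latest` is the join: the index row only holds a pointer, and the application fetches the data row itself.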

Page 7: HBase Feed Aggregator Wurbe 25

Lesson #2: Design your index

<cat>/<w3c_timestamp>

time sorting = lexicographic sorting
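
A short sketch of why this key layout works: rows are ordered by the bytes of their keys, so a fixed-width timestamp after the category prefix keeps each category's rows in time order. The helper name `index_key` is hypothetical, and normalising every timestamp to UTC is an assumption added here, since W3C timestamps with different offsets would not compare correctly as strings.

```python
from datetime import datetime, timezone

def index_key(category, published):
    """Build a '<cat>/<w3c_timestamp>' row key (hypothetical helper)."""
    # Fixed-width UTC format, so byte order matches chronological order.
    stamp = published.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return f"{category}/{stamp}".encode("utf-8")

keys = [
    index_key("tech", datetime(2009, 11, 3, 9, 30, tzinfo=timezone.utc)),
    index_key("tech", datetime(2009, 2, 14, 18, 0, tzinfo=timezone.utc)),
    index_key("news", datetime(2009, 7, 1, 12, 0, tzinfo=timezone.utc)),
]

# Plain byte-wise sorting groups by category, then orders by time.
ordered = sorted(keys)
for key in ordered:
    print(key.decode("utf-8"))
```

A scan that starts at the `tech/` prefix would then return that category's entries oldest first, with no sorting step in the application.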

Page 8: HBase Feed Aggregator Wurbe 25

Lesson #3: No charsets

convert everything to bytes

... but store the original charset
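
One possible reading of this lesson, sketched in Python 3: every cell value is written as bytes, and the charset name is stored next to it so the text can be decoded again on read. The column names `content:raw` and `content:charset` are hypothetical, and a dict stands in for a stored row.

```python
def to_cells(text, charset):
    """Encode text for storage and record which charset was used."""
    return {
        b"content:raw": text.encode(charset),
        b"content:charset": charset.encode("ascii"),
    }

def from_cells(cells):
    """Decode stored bytes using the charset saved alongside them."""
    charset = cells[b"content:charset"].decode("ascii")
    return cells[b"content:raw"].decode(charset)

cells = to_cells("café", "latin-1")
restored = from_cells(cells)
print(cells[b"content:raw"], restored)
```

Without the stored charset, the same bytes could be decoded under the wrong encoding and silently produce different text.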

Page 9: HBase Feed Aggregator Wurbe 25

Questions?

http://www.andreisavu.ro

http://twitter.com/andreisavu
