hbase feed aggregator wurbe 25

Post on 06-May-2015


feed aggregator powered by hbase & python

Andrei Savu, wurbe #25

Objectives

Highly scalable feed aggregator
Play with python & thrift
Provide some sample code
Provide detailed install instructions
Learn new stuff

Table Structure

3 tables: Feeds, Urls, UrlsIndex

Feeds: all feeds
Urls: data extracted from feeds
UrlsIndex: index table

Source code

http://github.com/andreisavu/feedaggregator

detailed install instructions

Lessons learned

Lesson #1: HBase Game Rules

Not relational
No joins
No sophisticated query engine
No column typing
No transactions
No secondary indices

... all done in application code
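
As a rough illustration of "all done in application code", here is a minimal sketch of a secondary index maintained by hand. Plain Python dicts stand in for the Urls and UrlsIndex tables; nothing here talks to HBase. The function names, the column names and the choice of the URL as row key are assumptions made for the example, not taken from the project.

```python
# In-memory stand-ins for the Urls and UrlsIndex tables (plain dicts, not HBase).
urls = {}        # row key (assumed: the URL) -> {column: value}
urls_index = {}  # index key "<cat>/<w3c_timestamp>" -> row key in `urls`


def add_url(url, category, timestamp, title):
    """Write the data row, then the index row that points at it.

    There is no transaction around the two writes: if the second one fails,
    the application has to notice and repair the index itself.
    """
    urls[url] = {"title": title, "category": category, "timestamp": timestamp}
    urls_index[f"{category}/{timestamp}"] = url


def latest_in_category(category, limit=10):
    """Emulate a prefix scan over the index, newest entries first."""
    prefix = category + "/"
    keys = sorted(k for k in urls_index if k.startswith(prefix))
    # The "join" between index and data happens here, in application code.
    return [urls[urls_index[k]] for k in reversed(keys)][:limit]
```

Two entries with the same category and timestamp would overwrite each other in this sketch; a real key would need an extra unique suffix.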

Lesson #2: Design your index

<cat>/<w3c_timestamp>

time sorting = lexicographic sorting
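
A small sketch of why this key layout works, assuming timestamps are normalized to UTC and written with fixed-width fields. The helper name index_key is hypothetical.

```python
from datetime import datetime, timezone


def index_key(category, when):
    """Build '<cat>/<w3c_timestamp>' from a timezone-aware datetime.

    The timestamp is converted to UTC and zero-padded, so comparing keys
    as strings gives the same order as comparing the datetimes.
    """
    stamp = when.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return f"{category}/{stamp}"


# Usage: keys within one category sort chronologically as plain strings.
older = index_key("tech", datetime(2009, 3, 9, 8, 30, tzinfo=timezone.utc))
newer = index_key("tech", datetime(2009, 11, 21, 17, 5, tzinfo=timezone.utc))
print(sorted([newer, older]))
```

The property breaks if timestamps keep mixed timezone offsets or drop the zero padding, which is why the sketch normalizes everything first.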

Lesson #3: No charsets

convert everything to bytes

... but store the original charset
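
One way to read this lesson, sketched under assumptions: the store only sees byte strings, so the application encodes text before writing and keeps the charset name in a neighbouring cell so the bytes can be decoded later. The cell names "content" and "charset" and the helper names are hypothetical.

```python
def pack(text, charset="utf-8"):
    """Turn text into the byte cells that would be written to the store."""
    return {
        "content": text.encode(charset),
        # The charset name is stored as bytes too, since every cell is bytes.
        "charset": charset.encode("ascii"),
    }


def unpack(cells):
    """Decode the stored bytes using the charset recorded next to them."""
    charset = cells["charset"].decode("ascii")
    return cells["content"].decode(charset)


# Usage: a title taken from a Latin-1 encoded feed survives the round trip.
cells = pack("café", "iso-8859-1")
print(cells["content"], unpack(cells))
```

Without the stored charset, the byte 0xE9 in this example could not be told apart from invalid UTF-8.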

Questions?

http://www.andreisavu.ro

http://twitter.com/andreisavu

contact@andreisavu.ro
