How We Use Bottle and Elasticsearch
An Adventure in Sinar Project
Background of Malaysian Bill Watcher
● The big idea is to let citizens know what is happening in parliament: which bills are being debated and passed.
● No, it doesn't know about bills that have not been debated yet.
● It is scraper based, because the parliament site tends to stand alone and nobody bothers to go there.
● And it cheats by using Twitter for notifications.
The result
● Didn't quite work out.
● That is not the point of this talk.
● You can talk to us later about this.
What do we use?
● We use the Bottle micro framework.
● Elasticsearch via pyes
● SQLAlchemy for DB abstraction (seriously thinking of moving to MongoDB instead, ideas?)
● We use Bootstrap for CSS and its JS plugins
● with jQuery
● BeautifulSoup for scraping (that is our data source)
We are only going to cover pyes and Bottle (talk to me later about everything else).
Bottle
● A micro framework
● Similar to Flask!!!
● Fewer features though
● But it is OKAY.
Bottle: a view
● https://github.com/Sinar/Malaysian-Bill-Watcher/blob/master/billwatcher/pages.py
What happens?
● Bottle can return a dict, and it automatically outputs JSON if the objects in the dict are JSON-compatible
● Or, with the view decorator, the dict can be fed to a template
Bottle: the template
● https://github.com/Sinar/Malaysian-Bill-Watcher/blob/master/billwatcher/views/list.tpl
What happens?
● Bottle renders the template from a dict/object,
● like most templating languages do.
● It has some logic like the others: different syntax, same idea
● It doesn't have good inheritance though
Elasticsearch!!!
● OK I lied, this is written in Java
● But hey, it has a RESTful API
● and pyes already does a lot of the work for us
Before indexing
● Before we do that:
● we don't need to create a schema
● But we do it anyway, because it lets us control the search priority
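One way such a schema could look, sketched as the JSON body of a put-mapping REST call. This assumes the search priority is expressed as a per-field `boost`, and an Elasticsearch release from the era of this talk (later releases changed mapping syntax). The index name, type name, field names and boost values are all hypothetical and not taken from indexer.py.

```python
import json

# Hypothetical fields; a higher boost makes matches in that field rank higher.
BILL_MAPPING = {
    "properties": {
        "long_name": {"type": "string", "boost": 2.0},
        "description": {"type": "string", "boost": 1.0},
        "year": {"type": "integer"},
        "status": {"type": "string", "index": "not_analyzed"},
    }
}


def mapping_request(index, doc_type, mapping):
    """Build (method, path, body) for the put-mapping REST call."""
    path = "/{0}/{1}/_mapping".format(index, doc_type)
    # The body is keyed by the document type name.
    body = json.dumps({doc_type: mapping}, sort_keys=True)
    return "PUT", path, body


method, path, body = mapping_request("billwatcher", "bill", BILL_MAPPING)
print(method, path)
print(body)
```

pyes wraps this call, so in practice the mapping dict is passed to the client rather than sent by hand.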
Indexer
● https://github.com/Sinar/Malaysian-Bill-Watcher/blob/master/billwatcher/indexer.py
Then we index
● First flatten your result
● convert it to a dict
● index it
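The three steps above can be sketched as follows. A nested dict stands in for the database result, and the `flatten` and `index_request` helpers, the index name and the field names are assumptions for illustration; the project's own code lives in indexer.py and goes through pyes rather than building requests by hand.

```python
import json


def flatten(record, parent_key="", sep="_"):
    """Turn a nested dict into a single-level dict with joined keys."""
    flat = {}
    for key, value in record.items():
        full_key = parent_key + sep + key if parent_key else key
        if isinstance(value, dict):
            flat.update(flatten(value, full_key, sep))
        else:
            flat[full_key] = value
    return flat


def index_request(index, doc_type, doc_id, document):
    """Build (method, path, body) for indexing one document over REST."""
    path = "/{0}/{1}/{2}".format(index, doc_type, doc_id)
    return "PUT", path, json.dumps(document, sort_keys=True)


# Hypothetical scraped record with one nested level.
scraped = {
    "id": 7,
    "name": "Example Bill A",
    "revision": {"year": 2012, "status": "debated"},
}

doc = flatten(scraped)
request = index_request("billwatcher", "bill", doc["id"], doc)
print(request)
```

Flattening first keeps every searchable value at the top level of the document, so each one maps to a single field in the index.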
The indexer
● https://github.com/Sinar/Malaysian-Bill-Watcher/blob/master/billwatcher/indexer.py
● https://github.com/Sinar/Malaysian-Bill-Watcher/blob/master/billwatcher/loader.py
Now we search!!!!
● There are some fields inside the search result
● https://github.com/Sinar/Malaysian-Bill-Watcher/blob/master/billwatcher/pages.py
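A rough sketch of what "some fields inside the search result" means, using the REST search API directly. The helper names, the index name and the sample response are made up for illustration; the response only mimics the usual shape (`hits`, `_id`, `_score`, `_source`) and is not real output from the project.

```python
import json


def search_request(index, doc_type, text, size=10):
    """Build (method, path, body) for a query_string search."""
    path = "/{0}/{1}/_search".format(index, doc_type)
    body = {"query": {"query_string": {"query": text}}, "size": size}
    return "POST", path, json.dumps(body, sort_keys=True)


def extract_hits(response):
    """Pull the id, score and stored document out of each hit."""
    return [
        {"id": hit["_id"], "score": hit["_score"], "doc": hit["_source"]}
        for hit in response["hits"]["hits"]
    ]


# Illustrative response shape only.
sample_response = {
    "hits": {
        "total": 1,
        "hits": [
            {"_id": "7", "_score": 1.5, "_source": {"name": "Example Bill A"}},
        ],
    }
}

results = extract_hits(sample_response)
print(results)
```

The view then only has to pass the extracted documents to the template; the score is useful mainly for ordering or debugging.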
The future
● Do localization
● Find new data sources
● Move the templates to Jinja
● Seriously thinking of moving to MongoDB
● Seriously still trying to find a better way to scrape websites