MongoDB, our Swiss Army Knife database
Our Swiss Army Knife Database
MongoDB at fotopedia
• Context
• Wikipedia data storage
• Metacache
Fotopedia
• Fotonauts, an American/French company
• Photo — Encyclopedia
• Heavily interconnected system: Flickr, Facebook, Wikipedia, Picasa, Twitter…
• MongoDB in production since last October
• main store lives in MySQL… for now
First contact
• Wikipedia imported data
Wikipedia queries
• wikilinks from one article
• links to one article
• geo coordinates
• redirect
• why not use the Wikipedia API ?
Download ~5.7GB gzipped XML → Geo / Redirect / Backlink / Related → ~12GB tabular data (document-shape sketch below)
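The four query types listed above all boil down to key/value lookups on a per-article record. A purely illustrative sketch of the shape such a record could take once the dump is processed (field names are invented, not the actual fotopedia schema):

    # Hypothetical per-article record; every query from the list above
    # becomes a single-key lookup on a document like this.
    article = {
        "_id": "Paris",                                # article title as the key
        "links": ["Seine", "France", "Louvre"],        # wikilinks from this article
        "backlinks": ["2nd_arrondissement_of_Paris"],  # links to this article
        "geo": {"lat": 48.8567, "lon": 2.3508},        # geo coordinates
        "redirect": None,                              # or the canonical title
    }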
Problem
Load ~12GB into a K/V store
CouchDB 0.9 attempt
• CouchDB had no dedicated import tool
• need to go through the HTTP/REST API
“DATA LOADING”
LOADING!
(obviously hijacked from xkcd.com)
Problem, rephrased
Load ~12GB into any K/V store
in hours, not days
Hadoop HBase ?
• as we were already using Hadoop Map/Reduce for preparation
• bulk load was just emerging at that time, requiring us to code against HBase private APIs, generate the data in an ad-hoc binary format, ...
photo by neural.it on Flickr
Problem, re-rephrased
Load ~12GB into any K/V store
in hours, not days
without wasting a week on development
and another week on setup
and several months on tuning
please ?
MongoDB attempt
• Transforming the tabular data into a JSON form: about half an hour of code, 45 minutes of Hadoop parallel processing (see the sketch after this list)
• setup mongo server : 15 minutes
• mongoimport : 3 minutes to start it, 90 minutes to run
• plug RoR app on mongo : minutes
• prototype was done in a day
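A minimal sketch of that pipeline, assuming a tab-separated input and a made-up field layout; the real transform ran as a Hadoop job, and the load used stock mongoimport:

    # Emit one JSON document per line, the format mongoimport consumes
    # directly (file and field names here are assumptions).
    import json

    with open("wikilinks.tsv") as tsv, open("wikilinks.json", "w") as out:
        for line in tsv:
            source, target = line.rstrip("\n").split("\t")
            out.write(json.dumps({"from": source, "to": target}) + "\n")

    # then, on the idle MongoDB instance:
    #   mongoimport --db wikipedia --collection wikilinks --file wikilinks.json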
Download ~5.7GB gzip → Geo / Redirect / Backlink / Related → ~12GB, 12M docs → batch synchronous import → Ruby on Rails
Hot swap ?
• Indexing was locking everything.
• Just run two instances of MongoDB.
• One instance is servicing the web app
• One instance is asleep or loading data
• A third instance knows the status of the other two (sketch below).
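A hedged sketch of how such a swap could be coordinated, assuming three local mongod instances on made-up ports; database and field names are also invented, not the actual setup:

    # Two data-bearing mongod instances plus a tiny "status" instance that
    # only records which one the web app should talk to.
    from pymongo import MongoClient

    STATUS = MongoClient("mongodb://localhost:27019")["status"]["instances"]

    def active_port():
        # the third instance knows which of the two data instances is live
        return STATUS.find_one({"_id": "wikipedia"})["active"]

    def swap_to(port):
        # called once mongoimport and indexing have finished on the idle instance
        STATUS.update_one({"_id": "wikipedia"}, {"$set": {"active": port}})

    # web app side: connect to whichever instance is currently serving
    db = MongoClient("mongodb://localhost:%d" % active_port())["wikipedia"]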
We loved:
• JSON import format
• efficiency of mongoimport
• simple and flexible installation
• just one cumbersome dependency
• easy to start (we use runit)
• easy to have several instances on one box
Second contact
• itʼs just all about graphs, anyway.
• wikilinks
• people following people
• related community albums
• and soon, interlanguage links
all about graphs...
• ... and itʼs also all about cache.
• The application needs to “feel” faster, letʼs cache more.
• The application needs to “feel” right, so letʼs cache less.
• or — big sigh — invalidate.
Page fragment caching
RoR application
Varnish HTTP cache
Nginx SSI
photo by Mykl Roventine on Flickr
photo by Aires Dos Santos
photo by Leslie Chatfield on Flickr
There are only two hard things in Computer Science: cache invalidation and naming things.
Phil Karlton
Haiku ?
Naming things
• REST has been a strong design principle at fotopedia since the early days, and the effort is paying off.
/en/2nd_arrondissement_of_Paris
/en/Paris/fragment/left_col
/en/Paris/fragment/related
/users/john/fragment/contrib
Invalidating
• REST allows us to invalidate by URL prefix.
• When the Paris album changes, we have to invalidate /en/Paris.*
Varnish invalidation
• The Varnish built-in regexp-based invalidation is not designed for intensive, fine-grained invalidation.
• We need to invalidate URLs individually.
/en/Paris.*
/en/Paris
/en/Paris/fragment/left_col
/en/Paris/photos.json?skip=0&number=20
/en/Paris/photos.json?skip=13&number=27
Metacache workflow
RoR application
Varnish HTTP cache
Nginx SSI
metacache feeder
varnish log
invalidation worker
/en/Paris
/en/Paris/fragment/left_col
/en/Paris/photos.json?skip=0&number=20
/en/Paris/photos.json?skip=13&number=27
/en/Paris/fragment/left_col
/en/Paris.*
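A sketch of the two moving parts in that workflow, under stated assumptions (collection name, log field position and the Varnish purge setup are all made up; this only illustrates the mechanism): the feeder remembers every URL Varnish serves, and the worker turns a prefix like /en/Paris.* into individual purges.

    import re
    import requests
    from pymongo import MongoClient

    urls = MongoClient()["metacache"]["urls"]

    def feed(logline):
        # metacache feeder: record each URL seen in the varnish log
        # (column index depends on the actual log format)
        url = logline.split()[6]
        urls.replace_one({"_id": url}, {"_id": url}, upsert=True)

    def invalidate(prefix):
        # invalidation worker: expand "/en/Paris.*" into per-URL purges,
        # assuming Varnish is configured to accept PURGE requests
        pattern = "^" + re.escape(prefix.rstrip(".*"))
        for doc in urls.find({"_id": {"$regex": pattern}}):
            requests.request("PURGE", "http://varnish.internal" + doc["_id"])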
Wow.
• This time we are actually using MongoDB as a BTree. Impressive.
• The metacache has been running fine for several months, and we want to go further.
Invalidate less
• We need to be more specific as to what we invalidate.
• Today, if somebody votes on a photo in the Paris album, we invalidate the whole /en/Paris prefix, even though most of it is unchanged.
• We will move towards a more clever metacache.
Metacache reloaded
• Pub/Sub metacache
• Have the backend send a specific header, to be caught by the metacache-feeder, containing a “subscribe” message.
• This header will be a JSON document, to be pushed to the metacache.
• The purge commands will be mongo search queries.
{url:/en/Paris, observe:[summary,links]}
{url:/en/Paris/fragment/left_col, observe: [cover]}
{url:/en/Paris/photos.json?skip=0&number=20, observe:[photos]}
{url:/en/Paris/photos.json?skip=13&number=27, observe:[photos]}
when somebody votes → { url: /en/Paris.*, observe: photos }
when the summary changes → { url: /en/Paris.*, observe: summary }
when a new link is created → { url: /en/Paris.*, observe: links }
(see the query sketch below)
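Under those assumptions, a purge command really is just a mongo search query over the stored subscription documents; a minimal sketch (collection name assumed):

    import re
    from pymongo import MongoClient

    subs = MongoClient()["metacache"]["subscriptions"]

    def urls_to_purge(prefix, topic):
        # e.g. a vote triggers urls_to_purge("/en/Paris", "photos"):
        # only subscriptions observing "photos" match, so /en/Paris itself
        # (observing summary and links) is left untouched
        query = {"url": {"$regex": "^" + re.escape(prefix)}, "observe": topic}
        return [doc["url"] for doc in subs.find(query)]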
Other use cases
• Timeline activities storage: just one more BTree usage.
• Moderation workflow data: tiny dataset, but more complex queries and map/reduce (sketch below).
• Suspended experimentation around log collection and analysis
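For the moderation data, a hedged sketch of what a map/reduce over such a collection could look like, in the pymongo style of that era (collection and field names are invented):

    from pymongo import MongoClient
    from bson.code import Code

    moderation = MongoClient()["fotopedia"]["moderation"]

    mapper = Code("function () { emit(this.state, 1); }")
    reducer = Code("function (key, values) { return Array.sum(values); }")

    # count items per workflow state; results land in a small output collection
    counts = moderation.map_reduce(mapper, reducer, "moderation_counts")
    for doc in counts.find():
        print(doc["_id"], doc["value"])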
Current situation
• MySQL: main data store
• CouchDB: old timelines (+ chef)
• MongoDB: metacache, wikipedia, moderation, new timelines
• Redis: raw data cache for counters, recent activity (+ resque)
What about the main store ?
• albums are a good fit for documents (see the sketch below)
• votes and score may be more tricky
• recent introduction of resque
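A purely illustrative sketch of why albums map well onto documents while votes and scores are trickier: the album is one self-contained document, but a vote has to be an atomic in-place update (names and fields below are invented, not the actual schema):

    from pymongo import MongoClient

    albums = MongoClient()["fotopedia"]["albums"]

    # the album as a single document, photos embedded
    albums.insert_one({
        "_id": "en/Paris",
        "title": "Paris",
        "photos": [{"id": 123, "score": 42}, {"id": 456, "score": 7}],
    })

    # a vote becomes an atomic positional increment instead of a full rewrite
    albums.update_one(
        {"_id": "en/Paris", "photos.id": 123},
        {"$inc": {"photos.$.score": 1}},
    )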
In short
• Simple, fast.
• Hackable: in a language most can read.
• Clear roadmap.
• Very helpful and efficient team.
• Designed with application developer needs in mind.