Using MongoDB for the Art Genome Project (Mongo Boston 2011)

Happiness is MongoDB. Monday, October 3rd, 2011. Daniel Doubrovkine (dB.) [email protected] @dblockdotorg http://code.dblock.org 902 Broadway, 4th Fl. New York, NY

Upload: daniel-doubrovkine

Post on 18-Dec-2014


DESCRIPTION

Using MongoDB for the Art Genome Project, presented 10/3/2011 at Mongo Boston.

TRANSCRIPT

Page 1: Using MongoDB for the Art Genome Project (Mongo Boston 2011)

Happiness is MongoDB

Monday, October 3rd, 2011

Daniel Doubrovkine (dB.)

[email protected]

@dblockdotorg

http://code.dblock.org

902 Broadway, 4th Fl.

New York, NY

Page 2: Using MongoDB for the Art Genome Project (Mongo Boston 2011)

» Claude Monet 

» Mark Grotjahn

Demo

Page 3: Using MongoDB for the Art Genome Project (Mongo Boston 2011)

Kasimir Malevich – “Self Portrait”

- vs. -

William Beckman – “Self Portrait”

Art Genome Project

Page 4: Using MongoDB for the Art Genome Project (Mongo Boston 2011)

= 42

Portrait  Contemporary  Realist  Conceptual
100       100           0        75

Portrait  Contemporary  Realist  Conceptual
100       50            70       20

Euclidean Distance
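As a rough illustration of the idea on this slide, here is a minimal sketch of a plain, unnormalized Euclidean distance between the two genomes listed above. The method name `euclidean` matches the call on the next slide, but its body is an assumption, as is the choice to treat a missing gene as 0; the scaling used in the talk is not specified.

```ruby
# Plain Euclidean distance between two genomes, each a Hash of
# gene name => score. Assumption: a gene missing from one genome counts as 0.
def euclidean(g1, g2)
  keys = g1.keys | g2.keys
  Math.sqrt(keys.sum { |k| (g1.fetch(k, 0) - g2.fetch(k, 0))**2 })
end

# The two genomes from the slide.
a = { "Portrait" => 100, "Contemporary" => 100, "Realist" => 0,  "Conceptual" => 75 }
b = { "Portrait" => 100, "Contemporary" => 50,  "Realist" => 70, "Conceptual" => 20 }

puts euclidean(a, b).round(2) # prints 102.1
```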

Page 5: Using MongoDB for the Art Genome Project (Mongo Boston 2011)

def similar(a1)
  # Pair every artwork with its distance to a1, then keep the 10 nearest.
  artworks.map { |a2|
    [a2, euclidean(a1, a2)]
  }.sort_by { |a, d|
    d
  }.take(10)
end

Fast Search in Ruby
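A self-contained variant of the search above might look like the following sketch. The `Artwork` struct, the sample titles, and the choice to skip the target artwork itself are all hypothetical; `min_by(n)` is used in place of a full sort followed by `take`, which is a swapped-in convenience rather than the approach shown on the slide.

```ruby
# Hypothetical in-memory artwork: a title plus a Hash of gene name => score.
Artwork = Struct.new(:title, :genes)

# Euclidean distance over the union of gene names; missing genes count as 0.
def euclidean(g1, g2)
  (g1.keys | g2.keys).sum { |k| (g1.fetch(k, 0) - g2.fetch(k, 0))**2 }**0.5
end

# Return up to `limit` artworks closest to `target`, nearest first,
# skipping the target itself.
def similar(target, artworks, limit = 10)
  artworks
    .reject { |a| a.equal?(target) }
    .min_by(limit) { |a| euclidean(target.genes, a.genes) }
end

target = Artwork.new("T", { "Portrait" => 100, "Realist" => 70 })
others = [
  Artwork.new("A", { "Portrait" => 100, "Realist" => 60 }),
  Artwork.new("B", { "Portrait" => 0,   "Realist" => 0 }),
  Artwork.new("C", { "Portrait" => 90,  "Realist" => 20 })
]

puts similar(target, others + [target], 2).map(&:title).inspect
```

This still scans every artwork on each call, so it only stays fast while the whole collection fits comfortably in memory.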

Page 6: Using MongoDB for the Art Genome Project (Mongo Boston 2011)

MySQL Prototype Schema

Page 7: Using MongoDB for the Art Genome Project (Mongo Boston 2011)

Need a sorted sparse vector on boot.

[ 100, 0, 20, … 60 ]

10K artworks: 5 minutes to startup

5 minutes to accomplish … nothing.

MySQL Prototype Schema

Page 8: Using MongoDB for the Art Genome Project (Mongo Boston 2011)

» Genome.genes – it’s a hash!

{ "Portrait" => 100, …, "Conceptual" => 20 }

» Genome, Embedded in Artwork

MongoDB “Schema”
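To connect the genes hash with the vector mentioned on the earlier slide, here is a small sketch of deriving a fixed-order vector from one artwork's hash. The `GENE_NAMES` list and the `to_vector` helper are hypothetical; the real gene list and the actual model code are not given in the slides.

```ruby
# Hypothetical fixed, sorted gene ordering shared by all artworks.
GENE_NAMES = ["Conceptual", "Contemporary", "Portrait", "Realist"].freeze

# Turn a sparse genes Hash into a dense vector in GENE_NAMES order,
# filling genes the artwork does not have with 0.
def to_vector(genes, gene_names = GENE_NAMES)
  gene_names.map { |name| genes.fetch(name, 0) }
end

genes = { "Portrait" => 100, "Conceptual" => 20 }
p to_vector(genes) # => [20, 0, 100, 0]
```

Because the hash travels with the artwork document, the vector can be built per artwork as it is loaded, with no joins across tables.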

Page 9: Using MongoDB for the Art Genome Project (Mongo Boston 2011)

» Something new? Got (far too) many years of experience with *SQL / DW

» @harryh uses it @ 4sq

» @eliothorowitz looks pretty smart

» db.startups.find({ location : { $near : GA }, category : 'nosql db vendor' }).first = 10gen

» install … ? … profit

» available on Heroku from MongoHQ

» continuous deployment friendly

Choosing MongoDB

Page 10: Using MongoDB for the Art Genome Project (Mongo Boston 2011)

» MongoDB retrieval by ID is fast, maybe faster than a Ruby Hash

» Using Rails + Rake and Mongo is safer than mongo shell db.collection.update({x: y})

» Shared Hosting is not Rubber, You Can’t Stretch It

» Map/reduce for live queries really doesn’t work, no really
mongoid_fulltext

» Read-secondary + Map/Reduce can be fun
read_secondary: <%= $rails_rake_task.nil? or !$rails_rake_task %>

» Collection names are limited in length if you use mongodump
https://jira.mongodb.org/browse/SERVER-2973, fixed in 2.0.0

» copyDatabase requires administrative privileges
https://jira.mongodb.org/browse/SERVER-2846

Using MongoDB
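The `read_secondary` line above is an ERB expression inside a YAML config. As a rough sketch of how it evaluates, the snippet below renders that one line with Ruby's standard `erb` library. It assumes `$rails_rake_task` is unset (nil) in a normal web process and true inside a Rake task; the `render_setting` helper is hypothetical and exists only for this demonstration.

```ruby
require "erb"

# The config line from the slide, rendered on its own.
LINE = "read_secondary: <%= $rails_rake_task.nil? or !$rails_rake_task %>"

# Hypothetical helper: set the global, render the line, then reset the global.
def render_setting(in_rake_task)
  $rails_rake_task = in_rake_task
  ERB.new(LINE).result
ensure
  $rails_rake_task = nil
end

puts render_setting(nil)  # prints "read_secondary: true"
puts render_setting(true) # prints "read_secondary: false"
```

Under that assumption, web requests would read from secondaries while Rake tasks, including map/reduce jobs, would go to the primary.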

Page 11: Using MongoDB for the Art Genome Project (Mongo Boston 2011)

» Mongo cursors aren’t snapshotted by default
Processing 5183 of 4012 …
http://www.mongodb.org/display/DOCS/How+to+do+Snapshotted+Queries+in+the+Mongo+Database

» Mongo interest is growing, RoR + Mongoid = GTD
http://code.dblock.org/ror-win-getting-things-done-with-mongodb-mongoid

» Mongoid Keeps Things Entertaining, Living on the Edge

Using MongoDB (continued …)
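The "Processing 5183 of 4012" symptom comes from iterating a collection that changes underneath the loop. The sketch below is not the snapshotted-query feature linked above; it swaps in a simpler workaround, capturing the ids once and then fetching each document by id, and it uses a plain Hash as a hypothetical stand-in for a collection.

```ruby
# Capture the ids up front, then look each document up by id. Documents
# added while the loop runs are not visited, so the processed count can
# never exceed the total that was captured.
def process_with_fixed_ids(collection)
  ids = collection.keys.dup
  processed = 0
  ids.each do |id|
    doc = collection[id]
    next if doc.nil? # deleted since the ids were captured
    yield id, doc
    processed += 1
  end
  [processed, ids.size]
end

# Hypothetical in-memory stand-in for a collection: _id => document.
collection = { 1 => { name: "a" }, 2 => { name: "b" }, 3 => { name: "c" } }

done, total = process_with_fixed_ids(collection) do |id, _doc|
  collection[id + 100] = { name: "added during processing" }
end
puts "Processed #{done} of #{total}"
```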

Page 12: Using MongoDB for the Art Genome Project (Mongo Boston 2011)

» MongoHQ Extensions via Heroku

» Production Directly w/MongoHQ

» A Few Hundred Bucks / mo.

» Mongo 1.8.1 w/ replica sets, 2 DBs and 1 arbiter

» Different Availability Zones

» Dedicated RAM, separate EBS, shared CPU

» Early Issues, Now Very Stable

» Jason McCay + other folks @ MongoHQ = Awesome

» Mongoid 2.0.2

» mongoid_slug, mongoid_fulltext, mongoid_history, delayed_job_mongoid

Deploying MongoDB

Page 13: Using MongoDB for the Art Genome Project (Mongo Boston 2011)

name: Daniel Doubrovkine (aka. dB.)

company: http://art.sy ^ work here

twitter: @dblockdotorg

blog: http://code.dblock.org ^ link to slides here

email: [email protected]

Thank you.