data modeling for performance
DESCRIPTION
My talk for Mongo Boulder on data modeling.
TRANSCRIPT
![Page 1: Data Modeling for Performance](https://reader031.vdocument.in/reader031/viewer/2022020721/559ecb5a1a28abd1338b4679/html5/thumbnails/1.jpg)
Data Modeling for Performance
Mongo Boulder
January 21, 2010
Michael Dwan
Snapjoy
![Page 2: Data Modeling for Performance](https://reader031.vdocument.in/reader031/viewer/2022020721/559ecb5a1a28abd1338b4679/html5/thumbnails/2.jpg)
i’m michael dwan
@michaeldwan on the twitter
![Page 3: Data Modeling for Performance](https://reader031.vdocument.in/reader031/viewer/2022020721/559ecb5a1a28abd1338b4679/html5/thumbnails/3.jpg)
the project
Company X
![Page 4: Data Modeling for Performance](https://reader031.vdocument.in/reader031/viewer/2022020721/559ecb5a1a28abd1338b4679/html5/thumbnails/4.jpg)
application spec
• find business details (web + api)
• search by category/keyword + geo (web + api)
• update (api)
![Page 5: Data Modeling for Performance](https://reader031.vdocument.in/reader031/viewer/2022020721/559ecb5a1a28abd1338b4679/html5/thumbnails/5.jpg)
why is this interesting?
15,000,000 businesses
30,000 partners
100,000 geo areas
2,300 categories
2,000,000 requests daily
24,000,000 urls in sitemap
100,000,000 tags
![Page 6: Data Modeling for Performance](https://reader031.vdocument.in/reader031/viewer/2022020721/559ecb5a1a28abd1338b4679/html5/thumbnails/6.jpg)
updates
• infrequent changes
• monthly updates w/ 12M monthly changes
• “zero downtime”
![Page 7: Data Modeling for Performance](https://reader031.vdocument.in/reader031/viewer/2022020721/559ecb5a1a28abd1338b4679/html5/thumbnails/7.jpg)
the problem
mo’ data, mo’ problems
![Page 8: Data Modeling for Performance](https://reader031.vdocument.in/reader031/viewer/2022020721/559ecb5a1a28abd1338b4679/html5/thumbnails/8.jpg)
complexity
![Page 9: Data Modeling for Performance](https://reader031.vdocument.in/reader031/viewer/2022020721/559ecb5a1a28abd1338b4679/html5/thumbnails/9.jpg)
businesses
phone_numbers
businesses_phone_numbers
cities
states
zips
neighborhoods
businesses_neighborhoods
tags
taggings
assets
users
categories
categorizations
providers mappings
![Page 10: Data Modeling for Performance](https://reader031.vdocument.in/reader031/viewer/2022020721/559ecb5a1a28abd1338b4679/html5/thumbnails/10.jpg)
architecture
![Page 11: Data Modeling for Performance](https://reader031.vdocument.in/reader031/viewer/2022020721/559ecb5a1a28abd1338b4679/html5/thumbnails/11.jpg)
read performance
![Page 12: Data Modeling for Performance](https://reader031.vdocument.in/reader031/viewer/2022020721/559ecb5a1a28abd1338b4679/html5/thumbnails/12.jpg)
solr
downtime
![Page 13: Data Modeling for Performance](https://reader031.vdocument.in/reader031/viewer/2022020721/559ecb5a1a28abd1338b4679/html5/thumbnails/13.jpg)
solr getting fussy
![Page 14: Data Modeling for Performance](https://reader031.vdocument.in/reader031/viewer/2022020721/559ecb5a1a28abd1338b4679/html5/thumbnails/14.jpg)
migrations
downtime
![Page 15: Data Modeling for Performance](https://reader031.vdocument.in/reader031/viewer/2022020721/559ecb5a1a28abd1338b4679/html5/thumbnails/15.jpg)
the solution
![Page 16: Data Modeling for Performance](https://reader031.vdocument.in/reader031/viewer/2022020721/559ecb5a1a28abd1338b4679/html5/thumbnails/16.jpg)
> gem install acts_as_web_scale
![Page 17: Data Modeling for Performance](https://reader031.vdocument.in/reader031/viewer/2022020721/559ecb5a1a28abd1338b4679/html5/thumbnails/17.jpg)
![Page 18: Data Modeling for Performance](https://reader031.vdocument.in/reader031/viewer/2022020721/559ecb5a1a28abd1338b4679/html5/thumbnails/18.jpg)
![Page 19: Data Modeling for Performance](https://reader031.vdocument.in/reader031/viewer/2022020721/559ecb5a1a28abd1338b4679/html5/thumbnails/19.jpg)
a business...
{
  "_id" : ObjectId("4ce838ef4a882579960001b9"),
  "name" : "Acme Glass Co",
  "tagline" : "Your trusty glass hole",
  "description" : "Glass repair...",
  "hours" : "Mon Fri 8 5",
  "url" : "http://acmeglasshole.biz"
}
![Page 21: Data Modeling for Performance](https://reader031.vdocument.in/reader031/viewer/2022020721/559ecb5a1a28abd1338b4679/html5/thumbnails/21.jpg)
a business... has many phone numbers
{
  "_id" : ObjectId("4ce838ef4a882579960001b9"),
  "name" : "Acme Glass Co",
  "tagline" : "Your trusty glass hole",
  "description" : "Glass repair...",
  "hours" : "Mon Fri 8 5",
  "url" : "http://acmeglasshole.biz",
  "phone_numbers" : [ "5035550091", "8005555456" ]
}
![Page 23: Data Modeling for Performance](https://reader031.vdocument.in/reader031/viewer/2022020721/559ecb5a1a28abd1338b4679/html5/thumbnails/23.jpg)
a business... has coordinates
{
  "_id" : ObjectId("4ce838ef4a882579960001b9"),
  "name" : "Acme Glass Co",
  "tagline" : "Your trusty glass hole",
  "description" : "Glass repair...",
  "hours" : "Mon Fri 8 5",
  "url" : "http://acmeglasshole.biz",
  "phone_numbers" : [ "5035550091", "8005555456" ],
  "coordinates" : [ 45.559294, -122.644053 ]
}
![Page 25: Data Modeling for Performance](https://reader031.vdocument.in/reader031/viewer/2022020721/559ecb5a1a28abd1338b4679/html5/thumbnails/25.jpg)
a business... has many tags
{
  "_id" : ObjectId("4ce838ef4a882579960001b9"),
  "name" : "Acme Glass Co",
  "tagline" : "Your trusty glass hole",
  "description" : "Glass repair...",
  "hours" : "Mon Fri 8 5",
  "url" : "http://acmeglasshole.biz",
  "phone_numbers" : [ "5035550091", "8005555456" ],
  "coordinates" : [ 45.559294, -122.644053 ],
  "tags" : [ "glass", "mirrors", "flat glass" ]
}
![Page 27: Data Modeling for Performance](https://reader031.vdocument.in/reader031/viewer/2022020721/559ecb5a1a28abd1338b4679/html5/thumbnails/27.jpg)
a business... has an address
{
  "_id" : ObjectId("4ce838ef4a882579960001b9"),
  "name" : "Acme Glass Co",
  "tagline" : "Your trusty glass hole",
  "description" : "Glass repair...",
  "hours" : "Mon Fri 8 5",
  "url" : "http://acmeglasshole.biz",
  "phone_numbers" : [ "5035550091", "8005555456" ],
  "coordinates" : [ 45.559294, -122.644053 ],
  "tags" : [ "glass", "mirrors", "flat glass" ],
  "location" : {
    "street_address" : "2035 NE Alberta St"
  }
}
![Page 28: Data Modeling for Performance](https://reader031.vdocument.in/reader031/viewer/2022020721/559ecb5a1a28abd1338b4679/html5/thumbnails/28.jpg)
belongs to?
![Page 29: Data Modeling for Performance](https://reader031.vdocument.in/reader031/viewer/2022020721/559ecb5a1a28abd1338b4679/html5/thumbnails/29.jpg)
a state
{
  "_id" : ObjectId("4ce82937961552247900000f"),
  "name" : "Illinois",
  "slug" : "il",
  ...
}
![Page 32: Data Modeling for Performance](https://reader031.vdocument.in/reader031/viewer/2022020721/559ecb5a1a28abd1338b4679/html5/thumbnails/32.jpg)
a business... belongs to a state
{
  "_id" : ObjectId("4ce838ef4a882579960001b9"),
  "name" : "Acme Glass Co",
  "tagline" : "Your trusty glass hole",
  "description" : "Glass repair...",
  "hours" : "Mon Fri 8 5",
  "url" : "http://acmeglasshole.biz",
  "phone_numbers" : [ "5035550091", "8005555456" ],
  "coordinates" : [ 45.559294, -122.644053 ],
  "tags" : [ "glass", "mirrors", "flat glass" ],
  "location" : {
    "street_address" : "2035 NE Alberta St",
    "state" : { "_id" : ObjectId("4ce829379615522479000026"), "meta" : { "slug" : "or" }, "display_name" : "Oregon" }
  }
}
![Page 34: Data Modeling for Performance](https://reader031.vdocument.in/reader031/viewer/2022020721/559ecb5a1a28abd1338b4679/html5/thumbnails/34.jpg)
a business... belongs to a city
{
  "_id" : ObjectId("4ce838ef4a882579960001b9"),
  "name" : "Acme Glass Co",
  "tagline" : "Your trusty glass hole",
  "description" : "Glass repair...",
  "hours" : "Mon Fri 8 5",
  "url" : "http://acmeglasshole.biz",
  "phone_numbers" : [ "5035550091", "8005555456" ],
  "coordinates" : [ 45.559294, -122.644053 ],
  "tags" : [ "glass", "mirrors", "flat glass" ],
  "location" : {
    "street_address" : "2035 NE Alberta St",
    "state" : { "_id" : ObjectId("4ce829379615522479000026"), "meta" : { "slug" : "or" }, "display_name" : "Oregon" },
    "city" : { "_id" : ObjectId("4ce82abdd3dfaa10f8006faa"), "meta" : { "slug" : "portland" }, "display_name" : "Portland, OR" }
  }
}
![Page 36: Data Modeling for Performance](https://reader031.vdocument.in/reader031/viewer/2022020721/559ecb5a1a28abd1338b4679/html5/thumbnails/36.jpg)
a business... belongs to a zip code
{
  "_id" : ObjectId("4ce838ef4a882579960001b9"),
  "name" : "Acme Glass Co",
  "tagline" : "Your trusty glass hole",
  "description" : "Glass repair...",
  "hours" : "Mon Fri 8 5",
  "url" : "http://acmeglasshole.biz",
  "phone_numbers" : [ "5035550091", "8005555456" ],
  "coordinates" : [ 45.559294, -122.644053 ],
  "tags" : [ "glass", "mirrors", "flat glass" ],
  "location" : {
    "street_address" : "2035 NE Alberta St",
    "state" : { "_id" : ObjectId("4ce829379615522479000026"), "meta" : { "slug" : "or" }, "display_name" : "Oregon" },
    "city" : { "_id" : ObjectId("4ce82abdd3dfaa10f8006faa"), "meta" : { "slug" : "portland" }, "display_name" : "Portland, OR" },
    "zip" : { "_id" : ObjectId("4ce82c29d3dfaa116b006dfa"), "display_name" : "97211" }
  }
}
![Page 37: Data Modeling for Performance](https://reader031.vdocument.in/reader031/viewer/2022020721/559ecb5a1a28abd1338b4679/html5/thumbnails/37.jpg)
many-to-many?
![Page 38: Data Modeling for Performance](https://reader031.vdocument.in/reader031/viewer/2022020721/559ecb5a1a28abd1338b4679/html5/thumbnails/38.jpg)
a category
{
  "_id" : ObjectId("4ce82e64d3dfaa16360014eb"),
  "name" : "Auto Glass",
  "slug" : "3063-auto-glass",
  "tags" : [ "windshields" ],
  ...
}
![Page 41: Data Modeling for Performance](https://reader031.vdocument.in/reader031/viewer/2022020721/559ecb5a1a28abd1338b4679/html5/thumbnails/41.jpg)
a business... belongs to many categories
{
  "_id" : ObjectId("4ce838ef4a882579960001b9"),
  "name" : "Acme Glass Co",
  "tagline" : "Your trusty glass hole",
  "description" : "Glass repair...",
  "hours" : "Mon Fri 8 5",
  "url" : "http://acmeglasshole.biz",
  "phone_numbers" : [ "5035550091", "8005555456" ],
  "coordinates" : [ 45.559294, -122.644053 ],
  "tags" : [ "glass", "mirrors", "flat glass" ],
  "location" : {
    "street_address" : "2035 NE Alberta St",
    "state" : { "_id" : ObjectId("4ce829379615522479000026"), "meta" : { "slug" : "or" }, "display_name" : "Oregon" },
    "city" : { "_id" : ObjectId("4ce82abdd3dfaa10f8006faa"), "meta" : { "slug" : "portland" }, "display_name" : "Portland, OR" },
    "zip" : { "_id" : ObjectId("4ce82c29d3dfaa116b006dfa"), "display_name" : "97211" }
  },
  "categories" : [
    { "_id" : ObjectId("4ce82e50d3dfaa16360004f2"), "meta" : { "slug" : "282-glass", "tags" : [ "windows" ] }, "display_name" : "Glass" },
    { "_id" : ObjectId("4ce82e64d3dfaa16360014eb"), "meta" : { "slug" : "3063-auto-glass", "tags" : [ "windshields" ] }, "display_name" : "Auto Glass" }
  ]
}
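The embedded state, city, zip, and category entries are small snapshots of documents that also live in their own collections. A minimal sketch of how such a snapshot might be built, assuming a hypothetical `toEmbedded` helper (not from the talk) and a plain string id in place of `ObjectId` so it runs outside the mongo shell:

```javascript
// Hypothetical helper (not from the talk): copy only the fields a business
// document needs from a full category or state document into a snapshot.
function toEmbedded(doc) {
  const meta = { slug: doc.slug };
  // Categories carry tags; the state document in the slides does not.
  if (Array.isArray(doc.tags)) meta.tags = doc.tags.slice();
  return { _id: doc._id, meta: meta, display_name: doc.name };
}

// Full category document, shaped like the "a category" slide.
// The string _id stands in for ObjectId("4ce82e64d3dfaa16360014eb").
const autoGlass = {
  _id: "4ce82e64d3dfaa16360014eb",
  name: "Auto Glass",
  slug: "3063-auto-glass",
  tags: ["windshields"],
};

const embedded = toEmbedded(autoGlass);
console.log(JSON.stringify(embedded));
```

The trade-off is that renaming a category means rewriting every business that embeds it, which fits the infrequent, batch-style updates described earlier.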
![Page 42: Data Modeling for Performance](https://reader031.vdocument.in/reader031/viewer/2022020721/559ecb5a1a28abd1338b4679/html5/thumbnails/42.jpg)
queries & indexes
know what you want
![Page 43: Data Modeling for Performance](https://reader031.vdocument.in/reader031/viewer/2022020721/559ecb5a1a28abd1338b4679/html5/thumbnails/43.jpg)
#1 find a business
I want *that* one
![Page 44: Data Modeling for Performance](https://reader031.vdocument.in/reader031/viewer/2022020721/559ecb5a1a28abd1338b4679/html5/thumbnails/44.jpg)
find a business
// single business
db.businesses.findOne({ _id: ObjectId("4ce838ef4a882579960001b9") })
![Page 45: Data Modeling for Performance](https://reader031.vdocument.in/reader031/viewer/2022020721/559ecb5a1a28abd1338b4679/html5/thumbnails/45.jpg)
#2 find by location
Businesses in San Francisco, CA
![Page 48: Data Modeling for Performance](https://reader031.vdocument.in/reader031/viewer/2022020721/559ecb5a1a28abd1338b4679/html5/thumbnails/48.jpg)
find businesses by state/city/zip
// find all within state
db.businesses.find({ "location.state._id": ObjectId("4ce82937961552247900000f") })
// find all within city
db.businesses.find({ "location.city._id": ObjectId("4ce82aa0d3dfaa10f8004a95") })
// find all within zip
db.businesses.find({ "location.zip._id": ObjectId("4ce82b5ed3dfaa116b0026f0") })
![Page 49: Data Modeling for Performance](https://reader031.vdocument.in/reader031/viewer/2022020721/559ecb5a1a28abd1338b4679/html5/thumbnails/49.jpg)
indexes
// the indexes
db.businesses.ensureIndex({"location.city._id": 1})
db.businesses.ensureIndex({"location.zip._id": 1})
skip “location.state._id” -- only 51 possibilities
1.5GB each
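A rough way to see why the state index is skipped: the fewer distinct values a field has, the more documents each value matches, so the index narrows the search very little. A back-of-the-envelope sketch using figures from the slides (15,000,000 businesses, 51 states, 35,000 cities), assuming an even spread that real data will not have:

```javascript
// Average number of businesses matched per distinct value, assuming a
// uniform distribution (an illustrative simplification).
function avgDocsPerValue(totalDocs, distinctValues) {
  return Math.round(totalDocs / distinctValues);
}

const BUSINESSES = 15000000;
const perState = avgDocsPerValue(BUSINESSES, 51);    // 294118
const perCity = avgDocsPerValue(BUSINESSES, 35000);  // 429

console.log({ perState, perCity });
```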
![Page 50: Data Modeling for Performance](https://reader031.vdocument.in/reader031/viewer/2022020721/559ecb5a1a28abd1338b4679/html5/thumbnails/50.jpg)
#3 find by category
Businesses in the Auto Repair category
![Page 51: Data Modeling for Performance](https://reader031.vdocument.in/reader031/viewer/2022020721/559ecb5a1a28abd1338b4679/html5/thumbnails/51.jpg)
businesses by category
// find by category id
db.businesses.find({ "categories._id": ObjectId("4ce82e50d3dfaa16360004f2") })
// the index
db.businesses.ensureIndex({ "categories._id": 1 })
![Page 52: Data Modeling for Performance](https://reader031.vdocument.in/reader031/viewer/2022020721/559ecb5a1a28abd1338b4679/html5/thumbnails/52.jpg)
#4 - find by category + location
Businesses in the Plumbing category in Chicago, IL
![Page 53: Data Modeling for Performance](https://reader031.vdocument.in/reader031/viewer/2022020721/559ecb5a1a28abd1338b4679/html5/thumbnails/53.jpg)
businesses by category + city
// find by city id and category id
db.businesses.find({
  "location.city._id": ObjectId("4ce82aa0d3dfaa10f8004a95"),
  "categories._id": ObjectId("4ce82e50d3dfaa16360004f2")
})
![Page 54: Data Modeling for Performance](https://reader031.vdocument.in/reader031/viewer/2022020721/559ecb5a1a28abd1338b4679/html5/thumbnails/54.jpg)
which index should we use?
// city id
{"location.city._id":1}
~ or ~
// category id
{"categories._id":1}
answer: both suck
we need a compound index
![Page 55: Data Modeling for Performance](https://reader031.vdocument.in/reader031/viewer/2022020721/559ecb5a1a28abd1338b4679/html5/thumbnails/55.jpg)
which order?
db.businesses.ensureIndex({ "location.city._id" : 1, "categories._id" : 1 })
~ or ~
db.businesses.ensureIndex({ "categories._id" : 1, "location.city._id" : 1 })
answer: cities → categories
35,000 cities & 2,500 categories
create one for zip codes and categories too!
![Page 56: Data Modeling for Performance](https://reader031.vdocument.in/reader031/viewer/2022020721/559ecb5a1a28abd1338b4679/html5/thumbnails/56.jpg)
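don’t we have 2 indexes on city id?
answer: yes
{"location.city._id" : 1}
{"location.city._id" : 1, "categories._id" : 1}
the compound index already serves city-only queries through its leading field, so drop the single-field one:
db.businesses.dropIndex("location.city._id_1")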
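The single-field city index is redundant because a query can use a compound index through its leading fields. A simplified model of that prefix rule, assuming equality-only queries and a hypothetical `usablePrefixLength` helper (the real query planner weighs much more than this):

```javascript
// Count how many leading fields of a compound index a query can use.
// Simplified model: equality filters only, no sorts or ranges.
function usablePrefixLength(indexFields, queryFields) {
  let n = 0;
  for (const field of indexFields) {
    if (!queryFields.includes(field)) break;
    n++;
  }
  return n;
}

const cityCategoryIndex = ["location.city._id", "categories._id"];

// City-only query: served by the first field of the compound index.
const cityOnly = usablePrefixLength(cityCategoryIndex, ["location.city._id"]);
// City + category query: uses both fields, whatever order the filter lists them in.
const cityAndCategory = usablePrefixLength(cityCategoryIndex, ["categories._id", "location.city._id"]);
// Category-only query: no usable prefix, so it still needs its own index.
const categoryOnly = usablePrefixLength(cityCategoryIndex, ["categories._id"]);

console.log(cityOnly, cityAndCategory, categoryOnly); // 1 2 0
```

Under this model the separate `{"categories._id": 1}` index from query #3 is still needed, while the city-only index is not.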
![Page 57: Data Modeling for Performance](https://reader031.vdocument.in/reader031/viewer/2022020721/559ecb5a1a28abd1338b4679/html5/thumbnails/57.jpg)
#5 - find by keyword
“something awesome” in Boulder, CO
![Page 58: Data Modeling for Performance](https://reader031.vdocument.in/reader031/viewer/2022020721/559ecb5a1a28abd1338b4679/html5/thumbnails/58.jpg)
find businesses in city by keyword
{
  "_id" : ObjectId("4ce838ef4a882579960001b9"),
  "name" : "Acme Glass Co",
  "keywords" : [ "glass", "repair", "acme", ... ]
}
db.businesses.ensureIndex({ "location.city._id": 1, "keywords": 1 })
db.businesses.find({
  "location.city._id": ObjectId("4ce82aa0d3dfaa10f8004a95"),
  "keywords": /glass/i
})
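The slides do not show how the `keywords` array is populated. One possible approach, sketched here with a hypothetical `extractKeywords` helper, is to lowercase and split a few text fields and store the unique tokens:

```javascript
// Hypothetical tokenizer: lowercase, split on non-alphanumerics, dedupe.
function extractKeywords(...texts) {
  const seen = new Set();
  for (const text of texts) {
    for (const token of text.toLowerCase().split(/[^a-z0-9]+/)) {
      if (token.length > 1) seen.add(token); // drop empties and single characters
    }
  }
  return Array.from(seen);
}

const keywords = extractKeywords("Acme Glass Co", "Glass repair...");
console.log(keywords); // [ 'acme', 'glass', 'co', 'repair' ]
```

With tokens stored in lowercase, an exact match such as `"keywords": "glass"` can use the compound index directly, whereas an unanchored, case-insensitive regex like `/glass/i` generally cannot use the index efficiently.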
![Page 59: Data Modeling for Performance](https://reader031.vdocument.in/reader031/viewer/2022020721/559ecb5a1a28abd1338b4679/html5/thumbnails/59.jpg)
chat with Kyle Banker
me: we’re switching from postgres+solr to mongo
kyle: oh wow, you can replace solr with mongo?
me: with some creativity
kyle: seems like it’d still be hard to get just right
me: it works well
kyle: gotcha
![Page 60: Data Modeling for Performance](https://reader031.vdocument.in/reader031/viewer/2022020721/559ecb5a1a28abd1338b4679/html5/thumbnails/60.jpg)
i was wrong, kyle was right
![Page 61: Data Modeling for Performance](https://reader031.vdocument.in/reader031/viewer/2022020721/559ecb5a1a28abd1338b4679/html5/thumbnails/61.jpg)
I’ll never leave you again
...until MongoDB supports full text later this year:)
![Page 62: Data Modeling for Performance](https://reader031.vdocument.in/reader031/viewer/2022020721/559ecb5a1a28abd1338b4679/html5/thumbnails/62.jpg)
aggregation
map/reduce to the rescue
![Page 63: Data Modeling for Performance](https://reader031.vdocument.in/reader031/viewer/2022020721/559ecb5a1a28abd1338b4679/html5/thumbnails/63.jpg)
sitemaps
big list of every url
![Page 64: Data Modeling for Performance](https://reader031.vdocument.in/reader031/viewer/2022020721/559ecb5a1a28abd1338b4679/html5/thumbnails/64.jpg)
sitemaps
• xml files containing each unique url ~ 24M
• 50,000 urls per file, about 500 files
• urls are generated from live data
• http://companyx.com/sitemaps/1.xml
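The file count follows directly from the url total and the per-file limit. A quick check of the arithmetic, using the approximate 24M figure from the slide:

```javascript
// How many sitemap files are needed at 50,000 urls per file?
const TOTAL_URLS = 24000000; // approximate total from the slide
const URLS_PER_FILE = 50000;

const fileCount = Math.ceil(TOTAL_URLS / URLS_PER_FILE);
console.log(fileCount); // 480, i.e. "about 500"
```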
![Page 65: Data Modeling for Performance](https://reader031.vdocument.in/reader031/viewer/2022020721/559ecb5a1a28abd1338b4679/html5/thumbnails/65.jpg)
partition by consistent hash
>> "hello!".hash % 6 #=> 5
>> "/ny/new-york/c/apartments".hash % 6 #=> 5
returns an integer from 0 up to, but not including, the number specified
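One caveat worth checking: in current Ruby versions `String#hash` is seeded per process, so the same string can land in a different partition after a restart. A sketch of a partition function built on a deterministic hash instead, here 32-bit FNV-1a (chosen for illustration, not necessarily what the talk used):

```javascript
// 32-bit FNV-1a over UTF-16 code units (equivalent to bytes for ASCII urls).
function fnv1a32(str) {
  let hash = 0x811c9dc5; // FNV offset basis
  for (let i = 0; i < str.length; i++) {
    hash ^= str.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0; // multiply by the FNV prime, keep 32 bits
  }
  return hash >>> 0;
}

// Map a url to a partition number in the range 0..partitions-1.
function partitionFor(url, partitions) {
  return fnv1a32(url) % partitions;
}

const p = partitionFor("/ny/new-york/c/apartments", 6);
console.log(p);
```

Because the result depends only on the url and the partition count, every run assigns a given url to the same sitemap file.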
![Page 66: Data Modeling for Performance](https://reader031.vdocument.in/reader031/viewer/2022020721/559ecb5a1a28abd1338b4679/html5/thumbnails/66.jpg)
map/reduce
1. map each url in the site to a partition
2. reduce all partitions to a single document containing all urls in that partition
3. save to a permanent collection
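The three steps above can be sketched in plain JavaScript. This simulates the flow in memory rather than calling MongoDB's `mapReduce`; the `partitionOf` lookup is a stand-in for the hash step, with partition numbers taken from the example urls in the slides:

```javascript
// Stand-in for the hash step: partition numbers as shown on the slides.
const partitionOf = {
  "/12347500-allstate-auto-insurance": 1,
  "/mn/savage/c/audio-visual-equipment": 1,
  "/ny/new-york/c/apartments": 1,
  "/il/chicago/c/pizza": 4,
};

// 1. map: emit one (partition, value) pair per url.
function map(url) {
  return { key: partitionOf[url], value: { total: 1, urls: [url] } };
}

// 2. reduce: merge the values for one partition. The output has the same
// shape as the input, so partial results can be reduced again.
function reduce(key, values) {
  const result = { total: 0, urls: [] };
  for (const v of values) {
    result.total += v.total;
    result.urls = result.urls.concat(v.urls);
  }
  result.urls.sort();
  return result;
}

// 3. group the emitted pairs by key, reduce each group, and collect the
// documents that would be saved to the permanent collection.
function runMapReduce(urls) {
  const groups = new Map();
  for (const { key, value } of urls.map((url) => map(url))) {
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(value);
  }
  return Array.from(groups, ([key, values]) => ({ _id: key, value: reduce(key, values) }));
}

const sitemaps = runMapReduce(Object.keys(partitionOf));
console.log(JSON.stringify(sitemaps));
```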
![Page 67: Data Modeling for Performance](https://reader031.vdocument.in/reader031/viewer/2022020721/559ecb5a1a28abd1338b4679/html5/thumbnails/67.jpg)
map
/il/chicago/c/pizza → 4
/ny/new-york/c/apartments → 1
/nd/rugby/c/apartments → 6
/14076500-bayside-marina → 2
/13401000-comtrak-logistics-inc → 3
/12347500-allstate-auto-insurance → 1
/il/downers-grove/c/computer-web-design → 6
/1009500-heidelberg-lodges → 5
/mn/redwood-falls/c/food-service → 4
/14077000-bank-of-america → 5
/mn/savage/c/audio-visual-equipment → 1
...
partitions: 1 2 3 4 5 6
![Page 68: Data Modeling for Performance](https://reader031.vdocument.in/reader031/viewer/2022020721/559ecb5a1a28abd1338b4679/html5/thumbnails/68.jpg)
reduce
{ "total" : 2, "urls" : [ "/12347500-allstate-auto-insurance", "/ny/new-york/c/apartments" ] }
{ "total" : 1, "urls" : [ "/mn/savage/c/audio-visual-equipment" ] }
{
  "_id" : 1,
  "value" : {
    "total" : 3,
    "urls" : [
      "/12347500-allstate-auto-insurance",
      "/mn/savage/c/audio-visual-equipment",
      "/ny/new-york/c/apartments"
    ]
  }
}
![Page 69: Data Modeling for Performance](https://reader031.vdocument.in/reader031/viewer/2022020721/559ecb5a1a28abd1338b4679/html5/thumbnails/69.jpg)
usage
db.sitemaps.findOne({_id:1}).value.urls
[ "/12347500-allstate-auto-insurance", "/mn/savage/c/audio-visual-equipment", "/ny/new-york/c/apartments"]
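Each partition's url list can then be rendered as one sitemap file. A minimal sketch, assuming the `http://companyx.com` host from the sitemaps slide and a hypothetical `renderSitemap` helper; the XML layout follows the sitemaps.org protocol:

```javascript
// Entity-escape the characters the sitemap protocol requires.
function escapeXml(s) {
  return s
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

// Hypothetical helper: render one partition's urls as a sitemap document.
function renderSitemap(baseUrl, paths) {
  const entries = paths
    .map((path) => "  <url><loc>" + escapeXml(baseUrl + path) + "</loc></url>")
    .join("\n");
  return (
    '<?xml version="1.0" encoding="UTF-8"?>\n' +
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n' +
    entries +
    "\n</urlset>\n"
  );
}

const xml = renderSitemap("http://companyx.com", [
  "/12347500-allstate-auto-insurance",
  "/ny/new-york/c/apartments",
]);
console.log(xml);
```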
![Page 70: Data Modeling for Performance](https://reader031.vdocument.in/reader031/viewer/2022020721/559ecb5a1a28abd1338b4679/html5/thumbnails/70.jpg)
wrap up
![Page 71: Data Modeling for Performance](https://reader031.vdocument.in/reader031/viewer/2022020721/559ecb5a1a28abd1338b4679/html5/thumbnails/71.jpg)
2 months later
115ms average response times
![Page 72: Data Modeling for Performance](https://reader031.vdocument.in/reader031/viewer/2022020721/559ecb5a1a28abd1338b4679/html5/thumbnails/72.jpg)
thank you
@michaeldwan