Enhance your APDEX.. naturally!
Copyright Dimelo SA www.dimelo.com
Proven methods to enhance your Rails performance when traffic
increases X times
ENHANCE YOUR APDEX.. NATURALLY!
Vlad ZLOTEANU
Software Engineer - Dimelo
@vladzloteanu
#ParisRB March 6, 2002
Be Warned!
Surprise coming up…at the end of this talk! ;)
Dimelo
Software editor, SaaS platforms, social media CR
Frontend platforms
Collaborative platforms, ‘forum/SO-like’, white-labeled, for big accounts (a kind of GetSatisfaction / UserVoice for big accounts)
Backend product
SocialAPIs: a kind of TweetDeck, but multi-channel, designed for multiple users/teams
Technical details (frontend product)
30+ average dynamic (Rails) req/s (web + api)
Peaks of 80 req/s
2M+ dynamic requests / day
700k+ unique visitors / day
Tools
Load/stress tests: ab, httperf, siege
htop, iftop, mytop
passenger-status, passenger-memory-stats
MySQL: EXPLAIN query, SHOW PROCESSLIST
Application logs
NewRelic
Demo env
Rails 3.2
REE + Passenger + Apache (3 workers)
MySQL 5.5.x + InnoDB tables
OSX Lion - MBPro 2010 8GB RAM
class Post < ActiveRecord::Base # 500K posts
  belongs_to :author
  has_and_belongs_to_many :categories
  # state -> [ moderation, published, answered ]
  # …
end

class Category < ActiveRecord::Base
  has_and_belongs_to_many :posts
end
A. External services: timeouts [DEMO]
# EventMachine app on port 8081
operation = proc do
  sleep 2 # simulate a long running request
  resp.status = 200
  resp.content = "Hello World!"
end
EM.defer(operation, callback)

# AggregatesController on main site (port 8080)
uri = URI('http://127.0.0.1:8081')
http = Net::HTTP.new(uri.host, uri.port)
@rss_feeds = http.get("/").body
A. External services: timeouts
Problem
Page depends on an external resource (e.g. RSS, Twitter API, FB API, auth servers, …)
The external resource responds very slowly, or the connection hangs
In Ruby, Net::HTTP’s default timeout is 60s!
Ruby 1.8: the Timeout library is not reliable
Solution
Move it to a background request
Put timeouts EVERYWHERE!
Enable timeouts on API clients
Cache the parts that involve external resources
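A minimal sketch of "put timeouts everywhere" for the demo call above, assuming a modern Ruby. The helper name `fetch_with_timeouts`, the timeout values and the `fallback` argument are illustrative assumptions, not part of the talk.

```ruby
require 'net/http'
require 'uri'

# Hypothetical helper: fetch a URL with explicit connect/read timeouts and
# return a fallback (e.g. a cached copy) instead of blocking a worker.
def fetch_with_timeouts(url, open_timeout: 1, read_timeout: 2, fallback: nil)
  uri = URI(url)
  http = Net::HTTP.new(uri.host, uri.port)
  http.open_timeout = open_timeout # max seconds to establish the connection
  http.read_timeout = read_timeout # max seconds to wait for each read
  http.get(uri.request_uri).body
rescue Timeout::Error, SystemCallError, SocketError
  # Net::OpenTimeout and Net::ReadTimeout are Timeout::Error subclasses;
  # refused or reset connections raise SystemCallError (Errno::*).
  fallback
end
```

In the AggregatesController example this could look like `@rss_feeds = fetch_with_timeouts('http://127.0.0.1:8081/', fallback: '')`, so a hanging service costs a couple of seconds at most instead of the 60s default.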
A. Internal services (2)
Same conditions, but this time two services on the same server/application call each other
Same problems, but risk of deadlock!
Problem
Solution
B. DB: Queries containing ‘OR’ conditions [Demo]
# Request: list also my posts (current user’s posts), even if they are not published
# Current index: on [state, created_at]
@posts = @posts.where("state = :state OR author_id = :author_id",
  {:state => 'published', :author_id => params[:author_id]})
B. DB: Queries containing ‘OR’ conditions
Problem
Queries containing “OR” conditions
E.g.: ‘visible_or_mine’ (state = published OR author_id = 42)
.. will make an index on [a, b, c] unusable for (a OR condition) AND b AND c
Solution
Don’t use it!
Cache the result
Put the index only on the sort column
For (a OR cond) AND b AND c, put the index on [b, c]
C. Filtering on HABTM relations [Demo]
# Request: Filter by one (or more) categories
# Model
@posts = @posts.joins(:categories).
  where(:categories => {:id => params[:having_categories]})

# OR: Create join model, use only one join
# Model
has_many :post_categorizations
has_many :categories, :through => :post_categorizations

# Controller
@posts = @posts.joins(:post_categorizations).
  where(:post_categorizations => {:category_id => params[:having_categories]})
C. Filtering on HABTM relations
Problem
Filtering on HABTM relations creates a double join
.. and double joins are (usually) expensive
Solution
Rewrite double joins
Use an intermediary model
Join on the intermediary model
D. DB: Pagination/count on large tables [Demo]
# Nothing fancy, just implement pagination
# Controller
@posts = @posts.paginate(:page => params[:page]).
  order('posts.created_at desc')

# View
<%= will_paginate @posts %>
D. DB: Pagination/count on large tables
Problem
Count queries are expensive on large tables
Each time a pagination is displayed, a count query is run
Displaying a distant page (aka using a big OFFSET) is very expensive
MyISAM: counts LOCK the TABLE!
D. DB: Pagination/count on large tables (2)
Solution
Cache the count result
.. and don’t display the ‘last’ pages
Limit the count
  Instead of: SELECT COUNT(*) FROM a_table WHERE some_conditions;
  Use: SELECT COUNT(*) FROM (SELECT 1 FROM a_table WHERE some_conditions LIMIT x) t;
Drop the isolation: NOLOCK / READ UNCOMMITTED
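As a rough illustration of the "limit count" idea, the sketch below builds the capped query and a display label. Both helper names are hypothetical, and the string interpolation is for illustration only: real code should pass conditions through the ORM or bound parameters.

```ruby
# Hypothetical helper: wrap a count in a LIMITed subquery so the database
# stops scanning after `cap` matching rows.
def capped_count_sql(table, conditions, cap)
  limit = Integer(cap) # reject non-numeric caps
  "SELECT COUNT(*) FROM (SELECT 1 FROM #{table} WHERE #{conditions} LIMIT #{limit}) t"
end

# Hypothetical helper: show "1000+" once the cap is reached, since the
# exact total is unknown beyond that point.
def count_label(count, cap)
  count >= cap ? "#{cap}+" : count.to_s
end
```

With a cap of 1000, the pagination can show at most 1000 results' worth of pages, which also avoids the very distant (big OFFSET) pages.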
E. Fragment caching: Thundering herd
# Let’s implement fragment caching, time-expired for displaying the previous page (no pagination optimisations were enabled)
<% cache_key = "posts::" + Digest::MD5.hexdigest(params.inspect)
   cache cache_key, :expires_in => 20.seconds do %>
  <h1>Posts#index</h1>
  ….
<% end %>
E. Fragment caching: Thundering herd
Problem
Using: fragment cache, time-expired
The cache for a resource-intensive page expires, so multiple processes try to recalculate the key
Effects: resource consumption peaks, Passenger worker pool starvation
[Timeline diagram along t: a "Cache validity" period followed by a "Cache unavailable; cache computation" period]
E. Fragment caching: Thundering herd (2)
E. Fragment caching: Thundering herd (3)
Solution
Backgrounded calculation/sweeping is hard/messy/buggy
Before the expiration time is reached (t - delta), obtain a lock and trigger cache recalculation
The next processes won’t obtain the lock and will serve the still-valid cache
Rails 2: github.com/nel/atomic_mem_cache_store
Rails 3: Implemented.
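The lock-before-expiry idea can be sketched with a small in-process cache. This is only an illustration under simplifying assumptions (a single process, a Hash instead of memcached, a hypothetical class name `EarlyRefreshCache`); it is not the code of atomic_mem_cache_store or of Rails.

```ruby
# Hypothetical in-memory cache: the first caller arriving within `delta`
# seconds of expiry takes a lock and recomputes; everyone else keeps
# serving the still-valid value.
class EarlyRefreshCache
  Entry = Struct.new(:value, :expires_at)

  def initialize(delta)
    @delta = delta
    @entries = {}
    @refreshing = {}
    @mutex = Mutex.new
  end

  def fetch(key, ttl, now = Time.now)
    entry, action = @mutex.synchronize { decide(key, now) }
    return entry.value if action == :serve

    begin
      value = yield # expensive computation, done outside the mutex
      @mutex.synchronize { @entries[key] = Entry.new(value, now + ttl) }
      value
    ensure
      @mutex.synchronize { @refreshing.delete(key) } if action == :refresh
    end
  end

  private

  def decide(key, now)
    entry = @entries[key]
    if entry.nil? || now >= entry.expires_at
      [nil, :compute] # cold or fully expired: nothing valid to serve
    elsif now >= entry.expires_at - @delta && !@refreshing[key]
      @refreshing[key] = true # this caller wins the refresh lock
      [entry, :refresh]
    else
      [entry, :serve] # fresh, or another caller is already refreshing
    end
  end
end
```

Note the limitation: on a completely cold cache every caller still computes. The sketch only protects the expiry window, which is the case the slide describes.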
F. API and Web on same server
Problem
API and Web don’t have complementary usage patterns
Web slows down APIs, which should respond fast
APIs are much more prone to peaks
Worker thread starvation
Solution
Put API and Web on different servers
Log/Throttle API calls
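One way to throttle API calls is a small Rack-style middleware. The sketch below is an assumption-laden illustration: the class name `ApiThrottle`, the `HTTP_X_API_KEY` header and the in-memory counters are all hypothetical, and with several Passenger workers the counters would need shared storage (e.g. memcached).

```ruby
# Hypothetical Rack-style middleware: allows at most `limit` calls per
# client within a sliding window of `window` seconds, else answers 429.
class ApiThrottle
  def initialize(app, limit:, window:, clock: -> { Time.now.to_f })
    @app = app
    @limit = limit
    @window = window
    @clock = clock # injectable clock, mainly for testing
    @hits = Hash.new { |hash, key| hash[key] = [] }
    @mutex = Mutex.new
  end

  def call(env)
    client = env['HTTP_X_API_KEY'] || env['REMOTE_ADDR']
    now = @clock.call
    allowed = @mutex.synchronize do
      hits = @hits[client]
      hits.reject! { |time| time <= now - @window } # drop calls outside the window
      if hits.size < @limit
        hits << now
        true
      else
        false
      end
    end
    return @app.call(env) if allowed

    [429, { 'Content-Type' => 'text/plain', 'Retry-After' => @window.to_s },
     ["Rate limit exceeded\n"]]
  end
end
```

In a `config.ru` this could be mounted with `use ApiThrottle, limit: 60, window: 60`; the rejected calls are also a natural place to log who is hammering the API.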
G. API: dynamic queries
Problem
REST APIs usually expose a proxy to your DB
Clients can make any available combination of filter + sort
And because they can, they will.
Solution
Don’t give them too many options
Use one DB per client (prepare to shard per client)
You will be able to: add custom indexes
Log/Throttle API calls
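"Don’t give them too many options" can be enforced with a whitelist in front of the query builder. A minimal sketch follows; the constant names, the allowed columns and the `sort` parameter are assumptions made up for the example.

```ruby
# Hypothetical whitelist: only these filters and sort orders are exposed,
# so every allowed combination can be backed by an index.
ALLOWED_FILTERS = %w[state author_id].freeze
ALLOWED_SORTS = {
  'newest' => 'posts.created_at DESC',
  'oldest' => 'posts.created_at ASC'
}.freeze

# Drops unknown filters and falls back to the default sort order.
def restrict_api_query(params)
  filters = params.select { |key, _value| ALLOWED_FILTERS.include?(key.to_s) }
  order = ALLOWED_SORTS.fetch(params['sort'].to_s, ALLOWED_SORTS['newest'])
  { filters: filters, order: order }
end
```

The result could then feed something like `@posts.where(query[:filters]).order(query[:order])`, keeping arbitrary client-supplied columns out of the SQL.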
H. Ruby GC: not adapted for Web frameworks
Problem
Ruby GC is not optimized for large web frameworks
On medium Rails apps, ~50% of time can be spent in GC
Solution
Use REE or Ruby 1.9.3
Activate GC.stats & benchmark your app
Tweak GC params
Trade memory for CPU
Previous conf: 40%+ speed for 20%+ memory
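To benchmark before and after tweaking GC params, a block can be measured with the GC profiler that ships with MRI. This is a sketch assuming Ruby 1.9.3 or later (REE exposes GC statistics through a different API); the helper name `measure_gc` is hypothetical.

```ruby
require 'benchmark'

# Hypothetical helper: runs the block and reports wall-clock time,
# number of GC runs and total time spent in GC (seconds).
def measure_gc
  GC::Profiler.enable
  GC::Profiler.clear
  runs_before = GC.count
  elapsed = Benchmark.realtime { yield }
  {
    elapsed: elapsed,
    gc_runs: GC.count - runs_before,
    gc_time: GC::Profiler.total_time
  }
ensure
  GC::Profiler.disable
end

# Example: allocate many short-lived strings, as a request would.
report = measure_gc { 200_000.times.map { |i| "post-#{i}" } }
puts report.inspect
```

Comparing `gc_time` against `elapsed` for the same workload under different GC settings shows how much CPU the extra memory actually buys.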
I. Other recommendations
Solution
Use MyISAM (on MySQL) unless you really need transactions
Design your models with sharding in mind (and shard the DB when it becomes the bottleneck)
Perf refactor: improve where it matters
Benchmark (before and after)
Btw.. Don’t make perf tests on OSX :P
The Dimelo Contest is back!
Write a Rack middleware to:
determine the URLs accessed via Rack,
compute the number of unique visitors,
in real time, aggregated over the last 5 minutes.
The prize!
.end
Thank you!
?