Alexander Dymo - RailsConf 2014 - Improve Performance: Optimize Memory and Upgrade to Ruby 2.1
Improve Performance Quick and Cheap: Optimize Memory and Upgrade to Ruby 2.1
http://www.slideshare.net/adymo/alexander-dymo-railsconf-2014-improve-performance
OK, let's talk about performance.
Can I have a show of hands? Who here thinks Ruby is fast? C'mon, only a few people? I disagree. Ruby is fast, especially the latest version, except for one thing: memory consumption and garbage collection make it slow.
Oh, most people here think it's fast? I do agree. Ruby is fast, until your program takes so much memory that it becomes slow.
Part 1: Why?
Why am I talking so much about memory? Here's why.
Memory optimization is the #1 thing that makes your Ruby application fast
Memory overhead + Slow GC algorithm =
Why? Two reasons:
A large memory overhead, where every object takes at least 40 bytes in memory, plus a slow GC algorithm, which was improved in 2.1, but not by that much, as we will see later.
That all equals not universal love and peace
Memory overhead + Slow GC algorithm = High memory consumption + Enormous time spent in GC
but high memory consumption and, because of that, an enormous amount of time that the app spends doing GC.
That is why memory optimization is so important. It saves you that GC time
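One way to see how much of a run goes to GC is Ruby's built-in GC::Profiler. The sketch below is only an illustration: the helper name `measure_gc` and the workload are assumptions, not something from the talk.

```ruby
# Hypothetical helper: runs a block and reports wall time and time spent in GC.
def measure_gc
  GC::Profiler.enable
  GC::Profiler.clear
  started = Time.now
  yield
  elapsed = Time.now - started
  gc_time = GC::Profiler.total_time   # seconds spent in GC while profiling was on
  { elapsed: elapsed, gc_time: gc_time }
ensure
  GC::Profiler.disable
end

report = measure_gc do
  # Made-up workload that allocates many short-lived strings.
  200_000.times.map { |i| "row #{i}" }
end

printf("total: %.3fs, GC: %.3fs\n", report[:elapsed], report[:gc_time])
```

The share of GC time will vary with the Ruby version and the workload, so treat the numbers as a rough indicator only.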
That's also why Ruby 2.1 is so important. It makes GC so much faster.
Memory Optimized Rails App (Ruby 1.8)
Same $1k/mo hardware all these years
Some examples from my own experience.
Rails App Upgraded from Ruby 1.9 to 2.1
Compare before/after
Here's another example. No memory optimization done, but Ruby upgraded from 1.9 to 2.1
Optimize Memory and Optionally Upgrade to Ruby 2.1
But here's another thing. If you can upgrade, fine. If not, you can still get the same or better performance by optimizing memory.
It's OK to Optimize Memory Only
                | Before | After | Optimization
----------------+--------+-------+--------------
Heavy rendering |   1.84 |  0.73 | 2.5x
Big data        |  17.06 |  3.63 | 4.7x
Charting        |   5.75 |  0.23 | 74.0x
Average request time, seconds
require "csv"

data = CSV.open("data.csv")

# Reads the whole file into memory and builds a new string for every column
output = data.readlines.map do |line|
  line.map do |col|
    col.downcase.gsub(/\b('?[a-z])/) { $1.capitalize }
  end
end

File.open("output.csv", "w+") do |f|
  f.write output.join("\n")
end
Unoptimized Program
Ruby 2.1 Is 40% Faster, Right?
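require "csv"

output = File.open("output.csv", "w+")

# Processes one line at a time and modifies each column in place
CSV.open("examples/data.csv", "r").each do |line|
  row = line.map do |col|
    col.downcase!
    col.gsub!(/\b('?[a-z])/) { $1.capitalize }
    col   # bang methods return nil when nothing changed, so return col itself
  end
  output.puts row.join(",")
end
output.close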
Memory Optimized Program
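The optimized program avoids copies by using in-place (bang) methods. A rough way to see the difference is to count object allocations around each variant. This is a sketch under assumptions: the helper names `allocated_total` and `allocations` are made up, and the GC.stat key is looked up under both its Ruby 2.1 and its later spelling.

```ruby
# Hypothetical helper: total number of objects allocated so far.
def allocated_total
  stat = GC.stat
  stat[:total_allocated_objects] || stat[:total_allocated_object]
end

# Hypothetical helper: counts objects allocated while the block runs.
def allocations
  was_disabled = GC.disable   # returns true if GC was already disabled
  before = allocated_total
  yield
  allocated_total - before
ensure
  GC.enable unless was_disabled
end

text = "HELLO WORLD " * 100

copying  = allocations { 1_000.times { text.downcase } }    # new string per call
in_place = allocations { 1_000.times { text.downcase! } }   # modifies text itself

puts "downcase:  #{copying} objects"
puts "downcase!: #{in_place} objects"
```

The measurement itself allocates a few objects, so compare the two counts with each other rather than reading them as exact.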
Ruby 2.1 Is NOT Faster...once your program is memory optimized
Takeaways
Ruby 2.1 is not a silver performance bullet
Memory optimized Ruby app performs the same in 1.9, 2.0 and 2.1
Ruby 2.1 merely makes performance adequate by default
Optimize memory to make a difference
Part 2: How?
5 Memory Optimization Strategies
Tune garbage collector
Do not allow Ruby instance to grow
Control GC manually
Write less Ruby
Avoid memory-intensive Ruby and Rails features
Strategy 1: Tune Ruby GC
Ruby GC Tuning Goal
Goal: balance the number of GC runs and peak memory usage
How to check:
> GC.stat[:minor_gc_count]
> GC.stat[:major_gc_count]
> `ps -o rss= -p #{Process.pid}`.chomp.to_i / 1024 # MB
How does tuning help? You can balance the number of GC runs against peak memory usage. By default, this balance favors more frequent GC and lower memory peaks. You can shift this balance.
Change GC settings and see how often GC is called and what your memory usage is.
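To compare settings, it helps to wrap the checks above around a representative piece of work. The following is a minimal sketch; `rss_mb`, `gc_report` and the workload are hypothetical names and data chosen for illustration.

```ruby
# Hypothetical helper: resident set size of this process in MB, read via ps.
def rss_mb
  `ps -o rss= -p #{Process.pid}`.chomp.to_i / 1024
rescue Errno::ENOENT
  nil   # ps is not available on this system
end

# Hypothetical helper: GC runs triggered by the block, plus RSS afterwards.
def gc_report
  before = GC.stat
  yield
  after = GC.stat
  {
    minor_gc_runs: after[:minor_gc_count] - before[:minor_gc_count],
    major_gc_runs: after[:major_gc_count] - before[:major_gc_count],
    rss_mb: rss_mb
  }
end

report = gc_report do
  # Made-up workload; replace with something typical for your application.
  100_000.times.map { |i| "item #{i}" * 10 }
end

puts report
```

Running the same workload under different GC settings and comparing the reports shows which way the balance moved.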
When Is Ruby GC Triggered?
Minor GC (faster, only new objects collected):
- not enough space on the Ruby heap to allocate new objects
- every 16MB-32MB of memory allocated in new objects
Major GC (slower, all objects collected):
- number of old or shady objects increases more than 2x
- every 16MB-128MB of memory allocated in old objects
Let's step back for a minute and look at when GC is triggered.
Environment Variables
Description                          | Variable                               | Default
-------------------------------------+----------------------------------------+--------
Initial number of slots on the heap  | RUBY_GC_HEAP_INIT_SLOTS                | 10000
Min number of slots that GC must free | RUBY_GC_HEAP_FREE_SLOTS               | 4096
Heap growth factor                   | RUBY_GC_HEAP_GROWTH_FACTOR             | 1.8
Maximum heap slots to add            | RUBY_GC_HEAP_GROWTH_MAX_SLOTS          | -
New generation malloc limit          | RUBY_GC_MALLOC_LIMIT                   | 16M
Maximum new generation malloc limit  | RUBY_GC_MALLOC_LIMIT_MAX               | 32M
New generation malloc growth factor  | RUBY_GC_MALLOC_LIMIT_GROWTH_FACTOR     | 1.4
Old generation malloc limit          | RUBY_GC_OLDMALLOC_LIMIT                | 16M
Maximum old generation malloc limit  | RUBY_GC_OLDMALLOC_LIMIT_MAX            | 128M
Old generation malloc growth factor  | RUBY_GC_OLDMALLOC_LIMIT_GROWTH_FACTOR  | 1.2
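These variables are read when the interpreter starts, so they have to be set in the environment of the Ruby process. The sketch below launches a child Ruby with and without an override and prints the malloc limit that the child reports. The helper name `malloc_limit_with` and the chosen values are assumptions for illustration; the GC.stat key is looked up under both its newer and its Ruby 2.1 name.

```ruby
require "rbconfig"

# Hypothetical helper: starts a child Ruby with extra environment variables
# and returns the malloc limit reported by the child's GC.stat.
def malloc_limit_with(env)
  script = "s = GC.stat; puts(s[:malloc_increase_bytes_limit] || s[:malloc_limit])"
  IO.popen([env, RbConfig.ruby, "-e", script], &:read).to_i
end

default = malloc_limit_with({})
tuned   = malloc_limit_with("RUBY_GC_MALLOC_LIMIT"     => "64000000",
                            "RUBY_GC_MALLOC_LIMIT_MAX" => "128000000")

puts "default malloc limit: #{default}"
puts "tuned malloc limit:   #{tuned}"
```

In production the same variables would normally be exported by whatever starts the application server, not set from Ruby code.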
When Is Ruby GC Triggered?
ruby-performance-book.com
http://samsaffron.com/archive/2013/11/22/demystifying-the-ruby-gc
http://thorstenball.com/blog/2014/03/12/watching-understanding-ruby-2.1-garbage-collector/
Strategy 2: Limit Growth
3 Layers of Memory Consumption Control
Internal
read `ps -o rss= -p #{Process.pid}`.chomp.to_i / 1024
or VmRSS from /proc/#{Process.pid}/status
and exit worker
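A minimal sketch of the internal layer, assuming some supervisor (for example the application server's master process) restarts workers that exit. The 512 MB limit and the names `rss_mb`, `over_memory_limit?` and `exit_if_bloated` are hypothetical.

```ruby
# Hypothetical limit; pick one that fits your workers.
MAX_RSS_MB = 512

# Resident set size in MB: VmRSS from /proc where available, ps otherwise.
def rss_mb
  status = "/proc/#{Process.pid}/status"
  if File.readable?(status)
    File.read(status)[/^VmRSS:\s+(\d+)/, 1].to_i / 1024   # value is in kB
  else
    `ps -o rss= -p #{Process.pid}`.chomp.to_i / 1024
  end
end

def over_memory_limit?(limit_mb = MAX_RSS_MB)
  rss_mb > limit_mb
end

# Call between units of work, e.g. after a request or a job has finished.
def exit_if_bloated(limit_mb = MAX_RSS_MB)
  return unless over_memory_limit?(limit_mb)
  warn "worker #{Process.pid} uses #{rss_mb} MB, exiting"
  exit
end

puts "current RSS: #{rss_mb} MB"
```

Checking between units of work, not in the middle of one, lets the worker finish what it is doing before it exits.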
External (software)
Heroku, Monit, God, etc.
External (OS kernel)
Process.setrlimit(Process::RLIMIT_AS, )
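As an illustration of the kernel-level layer, the sketch below caps the address space of a child process. The 4 GB value and the helper name `limit_address_space` are assumptions; the slide leaves the actual limit open. On Linux, allocations beyond the limit should then fail, which Ruby typically reports as NoMemoryError.

```ruby
# Hypothetical helper: caps the virtual address space of the current process.
def limit_address_space(max_bytes)
  _soft, hard = Process.getrlimit(Process::RLIMIT_AS)
  soft = [max_bytes, hard].min   # the soft limit may not exceed the hard limit
  Process.setrlimit(Process::RLIMIT_AS, soft, hard)
  soft
end

# Apply the limit in a child process so the parent stays unrestricted.
pid = fork do
  soft = limit_address_space(4 * 1024**3)   # 4 GB, an arbitrary example value
  current, _hard = Process.getrlimit(Process::RLIMIT_AS)
  exit!(current == soft ? 0 : 1)
end
Process.wait(pid)

limit_applied = $?.exitstatus == 0
puts "limit applied in child: #{limit_applied}"
```

Only the soft limit is lowered here, so the process could still raise it again up to the hard limit.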
What about Background Jobs?
Fork et Impera:
# setup background job
fork do
  # do something heavy
end
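A slightly fuller sketch of the same idea: do the heavy work in a child process and pass only the small result back through a pipe, so the memory allocated by the job is returned to the OS when the child exits. The helper name `run_in_fork` and the workload are hypothetical, and the sketch assumes the result can be serialized with Marshal and that the block does not raise.

```ruby
# Hypothetical helper: runs the block in a child process and returns its result.
def run_in_fork
  reader, writer = IO.pipe
  reader.binmode
  writer.binmode

  pid = fork do
    reader.close
    result = yield                        # memory allocated here dies with the child
    writer.write(Marshal.dump(result))
    writer.close
    exit!(0)                              # skip at_exit hooks inherited from the parent
  end

  writer.close
  payload = reader.read                   # read before waiting to avoid a full-pipe deadlock
  reader.close
  Process.wait(pid)
  Marshal.load(payload)
end

total = run_in_fork do
  rows = Array.new(500_000) { |i| "row #{i}" }   # made-up heavy work
  rows.map(&:size).reduce(:+)
end

puts total
```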
Strategy 3: Control GC Manually
GC Between Requests in Unicorn
OobGC for Ruby < 2.1
require 'unicorn/oob_gc'
use(Unicorn::OobGC, 1)

gctools for Ruby >= 2.1
https://github.com/tmm1/gctools
require 'gctools/oobgc'
use(GC::OOB::UnicornMiddleware)
Things to have in mind:
- make sure you have enough workers
- make sure CPU utilization < 50%
- this improves only perceived performance
- overall performance might be worse
- only effective for memory-intensive applications
Strategy 4: Write Less Ruby
Example: Group Rank
SELECT * FROM empsalary;
  depname  | empno | salary
-----------+-------+--------
 develop   |     6 |   6000
 develop   |     7 |   4500
 develop   |     5 |   4200
 personnel |     2 |   3900
 personnel |     4 |   3500
 sales     |     1 |   5000
 sales     |     3 |   4800
PostgreSQL Window Functions
SELECT depname, empno, salary, rank() OVER (PARTITION BY depname ORDER BY salary DESC) FROM empsalary;
  depname  | empno | salary | rank
-----------+-------+--------+------
 develop   |     6 |   6000 |    1
 develop   |     7 |   4500 |    2
 develop   |     5 |   4200 |    3
 personnel |     2 |   3900 |    1
 personnel |     4 |   3500 |    2
 sales     |     1 |   5000 |    1
 sales     |     3 |   4800 |    2
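For contrast, here is roughly what the same ranking takes when it is done in Ruby after loading every row. This is an illustrative sketch, not code from the talk: the rows are copied by hand from the table above and `rank_by_department` is a hypothetical helper.

```ruby
ROWS = [
  { depname: "develop",   empno: 6, salary: 6000 },
  { depname: "develop",   empno: 7, salary: 4500 },
  { depname: "develop",   empno: 5, salary: 4200 },
  { depname: "personnel", empno: 2, salary: 3900 },
  { depname: "personnel", empno: 4, salary: 3500 },
  { depname: "sales",     empno: 1, salary: 5000 },
  { depname: "sales",     empno: 3, salary: 4800 }
]

# Hypothetical helper: adds a per-department salary rank to each row.
def rank_by_department(rows)
  rows.group_by { |r| r[:depname] }.flat_map do |_dep, group|
    sorted = group.sort_by { |r| -r[:salary] }
    sorted.map do |row|
      # rank() semantics: equal salaries share the rank of their first occurrence
      rank = sorted.index { |r| r[:salary] == row[:salary] } + 1
      row.merge(rank: rank)
    end
  end
end

ranked = rank_by_department(ROWS)
ranked.each { |r| puts r.values.join(" | ") }
```

Every intermediate hash and array here is a Ruby object that the GC has to track, while the window function leaves that work to the database and returns only the final rows.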
There has been a sentiment in the Rails community that SQL is somehow bad and that you should avoid it at all costs. People invent more and more things to stay away from SQL, Arel to name just one.
Guys, I wholeheartedly disagree with this. Web frameworks come and go. SQL stays. We have had SQL for 40 years. It's not going away.
Finally Learn SQL
Strategy 5: Avoid Memory Hogs
Operations That Copy Data
String::gsub! instead of String::gsub and similar
String::
> GC.stat
=> {:count=>11, :minor_gc_count=>8, :major_gc_count=>3,
    :heap_used=>126, :heap_length=>130,
    :malloc_increase=>7848, :malloc_limit=>16777216,
    :oldmalloc_increase=>8296, :oldmalloc_limit=>16777216}
objspace.so
> ObjectSpace.count_objects
=> {:TOTAL=>51359, :FREE=>16314, :T_OBJECT=>1356 ...
> require 'objspace'
> ObjectSpace.memsize_of(Class)
=> 1096
> ObjectSpace.reachable_objects_from(Class)
=> [#, Class...
> ObjectSpace.trace_object_allocations_start
> str = "x" * 1024 * 1024 * 10
> ObjectSpace.allocation_generation(str)
=> 11
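Allocation tracing can also report where an object was created, which helps when hunting for the line that allocates a large object. A small sketch using the block form of tracing; the 1 MB string is a made-up example.

```ruby
require "objspace"

info = nil

ObjectSpace.trace_object_allocations do
  big = "x" * 1024 * 1024   # 1 MB string, allocated while tracing is on

  # Query inside the block, while the allocation records are certainly available.
  info = {
    bytes: ObjectSpace.memsize_of(big),             # memory held by the object
    file:  ObjectSpace.allocation_sourcefile(big),  # file that allocated it
    line:  ObjectSpace.allocation_sourceline(big)   # line that allocated it
  }
end

puts info
```

Tracing adds overhead to every allocation, so it is best limited to the code under investigation.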
objspace.so
http://tmm1.net/ruby21-objspace/
http://stackoverflow.com/q/20956401
GC.stat
http://samsaffron.com/archive/2013/11/22/demystifying-the-ruby-gc
RubyProf Memory Profiling
require 'ruby-prof'
RubyProf.measure_mode = RubyProf::MEMORY
RubyProf.start

str = 'x'*1024*1024*10

result = RubyProf.stop
printer = RubyProf::FlatPrinter.new(result)
printer.print(STDOUT)

This requires patched Ruby, will work only for 1.8 and 1.9
https://github.com/ruby-prof/ruby-prof/issues/86
Valgrind Memory Profiling
> valgrind --tool=massif `rbenv which irb`
==9395== Massif, a heap profiler
irb(main):001:0> x = "x"*1024*1024*10; nil
=> nil
==9395==
> ms_print massif.out.9395
> massif-visualizer massif.out.9395
http://valgrind.org
https://projects.kde.org/projects/extragear/sdk/massif-visualizer
Sign up for my upcoming book updates: ruby-performance-book.com
Ask me: [email protected] @alexander_dymo
AirPair with me: airpair.me/adymo
So, we are out of time. If you'd like to learn more about Ruby performance optimization, please sign up for my book mailing list updates. If you need help, just email me or AirPair with me. And thank you for listening.
Chart data: requests (millions) by year
2010: 4.9
2011: 7
2012: 8
2013: 11.3
2014: 20.1

Chart data: Ruby 1.9 & 2.0 vs. Ruby 2.1
Ruby 1.9 & 2.0: 23
Ruby 2.1: 16

Chart data: Ruby 1.9 & 2.0 vs. Ruby 2.1
Ruby 1.9 & 2.0: 13.5
Ruby 2.1: 13.1