Ruby Performance - The Last Mile - RubyConf India 2016
Ruby Performance: The Last Mile
Charles Oliver Nutter @headius
I've spent ten years building
I love Ruby
I want Ruby to succeed
I believe JRuby is the way
Do you want performance today?
Do you want concurrency today?
Do you know JRuby?
What you love from Ruby
• Latest Ruby language features
• Mostly-same Ruby standard library
• Pure-Ruby gems work great
• Native gems with JRuby support
• It walks like Ruby, talks like Ruby
• It is Ruby!
With the power of the JVM
• Fast JIT to native code
• Fully parallel threading
• Leading-edge garbage collectors
• Access to Java, Scala, Clojure, ...
• But it's still Ruby!
Do you know Jay?
Matz's Keynote
Performance
Concurrency
By 2020
But what about today?
Performance
What do you optimize for?
• Easy to develop with: short time until first deploy
• Fast startup: good response cycle at command line
• Straight-line performance, many operations per second
• Parallelism: utilize many cores to get more done
State of the Art: Production-quality Rubies
Production Quality?
• Support for 99%+ of Ruby language features
• Important parts of standard library
• Runs typical Ruby applications and libraries
• Healthy extension ecosystem
• CRuby, JRuby are the only real options right now
CRuby (MRI)
• Up until 1.9, AST interpreter
• YARV bytecode VM introduced for 1.9.0
• GC and performance improvements through 2.x series
• Ruby 2.3 is latest, released in December
• Future work on JIT, GC, happening now
JRuby
• Many redesigns since creation in 2001
• AST interpreter until 2007
• Simple AST-to-bytecode JIT until JRuby 9000
• Optimizing compiler with JIT for 9k
• JRuby 9.0.5 is current, 9.1 in a couple weeks
• Next-gen Truffle runtime in the works
Lies, damn lies, and benchmarks
[Chart: CRuby red/black tree benchmark across 1.8.7, 1.9.3, 2.0, 2.1, 2.2, 2.3]
[Chart: JRuby red/black tree benchmark, JRuby 9.1 vs CRuby 2.3]
4x FASTER
Ruby Code → Parser → JRuby AST → Compiler → JRuby IR → JIT → JVM Bytecode
What can we do with this?
Block Pass-through
def foo(&b)
  bar(&b)
end

def bar
  yield
end
Block Pass-through
require "benchmark"

loop {
  puts Benchmark.measure {
    i = 0
    while i < 1_000_000
      i += 1
      foo { }; foo { }; foo { }; foo { }; foo { }
    end
  }
}
[Chart: Block Passing, CRuby 2.3 vs JRuby 1.7.24 vs JRuby 9.1]
define_method
define_method :add do |a, b| a + b end
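A minimal sketch for trying the define_method comparison yourself. The class name, method names, and iteration count are made up for illustration and are not from the talk; the numbers will differ by Ruby implementation and version.

```ruby
require "benchmark"

class Calculator
  # Plain method definition.
  def add_def(a, b)
    a + b
  end

  # Same body defined from a block, as on the slide.
  define_method :add_dm do |a, b|
    a + b
  end
end

calc = Calculator.new
n = 200_000 # arbitrary iteration count

Benchmark.bm(14) do |x|
  x.report("def")           { n.times { calc.add_def(1, 2) } }
  x.report("define_method") { n.times { calc.add_dm(1, 2) } }
end
```

Run the same file under CRuby and JRuby to compare the gap between the two rows.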
[Chart: define_method, CRuby 2.3 vs JRuby 1.7.24 vs JRuby 9.1]
Postfix rescue
foo rescue nil
csv.rb Converters
Converters = {
  integer: lambda { |f|
    Integer(f.encode(ConverterEncoding)) rescue f
  },
  float: lambda { |f|
    Float(f.encode(ConverterEncoding)) rescue f
  },
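A rough sketch of why postfix rescue matters in converters like these: every non-numeric field raises and rescues an exception. The "check first" variant, the sample fields, and the iteration count are assumptions for illustration, not code from csv.rb or the talk, and the check only recognizes plain decimal integers.

```ruby
require "benchmark"

# Converter in the style of the csv.rb slide: parse, fall back to the input.
with_rescue = lambda { |f| Integer(f) rescue f }

# Hypothetical alternative: test the format first so that non-numeric
# fields never raise. Simplified: plain decimal integers only.
decimal = /\A[+-]?(0|[1-9]\d*)\z/
with_check = lambda { |f| f =~ decimal ? Integer(f) : f }

fields = %w[1 22 abc 4.5 hello 333] # made-up sample data
n = 20_000                          # arbitrary iteration count

Benchmark.bm(12) do |x|
  x.report("rescue")      { n.times { fields.each { |f| with_rescue.call(f) } } }
  x.report("check first") { n.times { fields.each { |f| with_check.call(f) } } }
end
```

Both lambdas return the same values for this sample data; only the cost of the failing path differs.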
[Chart: Postfix rescue, CRuby 2.3 vs JRuby 9.1]
CRuby starts up the fastest
JRuby runs the fastest
And we're getting faster
Concurrency
Parallelism
Concurrency? Parallelism?
• Parallelism happens on the hardware, e.g. multi-core
• Concurrency happens in the software, e.g. Thread API
• You can have concurrency without parallelism
• You can have both with JRuby
Parallelism in Ruby
• On CRuby, usually process-level
• Ruby threads prevented from running in parallel
• Extensions, IO can opt to release lock
• On JRuby, usually thread-level
• Ruby thread == JVM thread == OS thread
• Single-process, shared memory
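One way to see the difference described above is to time the same CPU-bound work serially and across threads. This is only a sketch: busy_sum, the work size, and the thread count are arbitrary stand-ins, not from the talk.

```ruby
require "benchmark"

# CPU-bound busy work with no IO, so the threads compete only for the CPU.
def busy_sum(n)
  sum = 0
  i = 0
  while i < n
    sum += i
    i += 1
  end
  sum
end

work = 500_000   # arbitrary amount of work per task
thread_count = 4 # arbitrary; try matching your core count

serial = Benchmark.realtime { thread_count.times { busy_sum(work) } }

threaded = Benchmark.realtime do
  thread_count.times.map { Thread.new { busy_sum(work) } }.each(&:join)
end

puts format("serial:   %.3fs", serial)
puts format("threaded: %.3fs", threaded)
```

If threads run in parallel, the threaded time should drop well below the serial time on a multi-core machine; where a global lock serializes Ruby code, the two should be roughly equal.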
A Mailing Queue
• A simple example of concurrency
• For each job, construct an email to send
• Some computation added to make processing heavier
• "Ruby Concurrency and Parallelism: A Practical Tutorial" https://www.toptal.com/ruby/ruby-concurrency-and-parallelism-a-practical-primer
require "./lib/mailer"
require "benchmark"

puts Benchmark.measure {
  # to_i so a count passed on the command line (a String) works too
  (ARGV[0] || 10_000).to_i.times do |i|
    Mailer.deliver do
      from    "eki_#{i}@eqbalq.com"
      to      "jill_#{i}@example.com"
      subject "Threading and Forking (#{i})"
      body    "Some content"
    end
  end
}
POOL_SIZE = (ARGV[0] || 10).to_i

jobs = Queue.new
(ARGV[1] || 10_000).to_i.times { |i| jobs.push i }

workers = (POOL_SIZE).times.map do
  Thread.new do
    begin
      # non-blocking pop raises ThreadError once the queue is empty
      while x = jobs.pop(true)
        Mailer.deliver do
          ...
        end
      end
    rescue ThreadError
    end
  end
end

workers.map(&:join)
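The worker-pool pattern on this slide depends on ./lib/mailer, which is not shown. Below is a self-contained sketch of the same pattern; build_message is a hypothetical stand-in for Mailer.deliver, and run_pool and the addresses are invented for illustration.

```ruby
require "thread" # Queue; needed on older Rubies, harmless on newer ones

# Hypothetical stand-in for Mailer.deliver: just builds a message hash.
def build_message(i)
  {
    from:    "sender_#{i}@example.com",
    to:      "recipient_#{i}@example.com",
    subject: "Threading and Forking (#{i})",
    body:    "Some content"
  }
end

# Fill a queue with job ids, then let pool_size threads drain it.
def run_pool(job_count, pool_size)
  jobs = Queue.new
  job_count.times { |i| jobs.push(i) }
  results = Queue.new

  workers = pool_size.times.map do
    Thread.new do
      begin
        # Non-blocking pop raises ThreadError once the queue is empty.
        while (i = jobs.pop(true))
          results.push(build_message(i))
        end
      rescue ThreadError
        # Queue drained; this worker is done.
      end
    end
  end

  workers.each(&:join)
  results.size
end

puts run_pool(1_000, 4)
```

Because Queue is thread-safe, each job is popped by exactly one worker, so the number of results equals the number of jobs regardless of pool size.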
[Chart: CRuby, mailer * 1000, time in seconds: synchronous vs 4 threads vs 4 forks]
[Chart: JRuby, mailer * 1000: synchronous vs 4 threads]
[Chart: JRuby vs MRI, times improvement: CRuby Forks vs JRuby Threads; bar labels 3.37x and 3.09x]
But Threads are bad, right?
Most users will never Thread.new
You'll deploy one Rails server for your entire site
You'll cut your instances ten times
Or maybe 100 times
Libraries and frameworks will Thread.new for you
And on JRuby, you'll have more efficient apps
So we're done?
Move to JRuby
Now your app is fast!
Right?
It is possible to write efficient Ruby code
But it's very easy to write inefficient Ruby code
Great Features, Hidden Costs
• Blocks are expensive to create, slower than method calls
• case/when is an O(n) cascade of calls
• Singleton classes/methods are costly and hurt method cache
• Literal arrays, hashes, strings have to be constructed, GCed
• Flow-control exceptions can be very expensive and hard to find
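To make the case/when point concrete, here is a small sketch that counts how many === calls a case statement makes before it finds a match. CountingMatcher, classify, and the hash-based classify_fast are all invented for illustration; the hash lookup is one possible alternative, not something the talk prescribes.

```ruby
# Probe that counts how often case/when invokes === on it.
class CountingMatcher
  class << self
    attr_accessor :calls
  end
  self.calls = 0

  def initialize(value)
    @value = value
  end

  def ===(other)
    self.class.calls += 1
    @value == other
  end
end

MATCH_A = CountingMatcher.new(:a)
MATCH_B = CountingMatcher.new(:b)
MATCH_C = CountingMatcher.new(:c)

# Each `when` is tried in order by calling === until one matches.
def classify(x)
  case x
  when MATCH_A then "first"
  when MATCH_B then "second"
  when MATCH_C then "third"
  else "other"
  end
end

# Hypothetical alternative: a single hash lookup instead of a cascade.
LABELS = { a: "first", b: "second", c: "third" }.freeze

def classify_fast(x)
  LABELS.fetch(x, "other")
end

CountingMatcher.calls = 0
classify(:c)
puts CountingMatcher.calls # 3: both earlier branches were tested first
```

The later the matching branch, the more calls are made, which is why ordering hot branches first or switching to a lookup table can help.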
What happens if your code does not run fast enough?
You need to know your app
You need good tools
And the JVM has great tools
CRuby Tooling
• Basic GC stats built in
• Simple profilers in standard library
• Some third-party tools
• stackprof, ruby-prof, perftools.rb, ...
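As a small illustration of the built-in GC stats, the sketch below measures how many objects a loop allocates. It assumes CRuby 2.2 or later: GC.stat keys are implementation-specific, and :total_allocated_objects may be missing elsewhere, hence the fallback to 0. The loop body is an arbitrary example.

```ruby
# Object allocation count before the loop (CRuby-specific key).
before = GC.stat[:total_allocated_objects] || 0

# Each pass builds new String objects, which later have to be collected.
10_000.times { "literal " + "string" }

after = GC.stat[:total_allocated_objects] || 0

puts "objects allocated: #{after - before}"
puts "GC runs so far:    #{GC.count}"
```

Wrapping a suspect method call in the same before/after pattern is a quick way to spot unexpected allocation before reaching for a full profiler.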
JVM tooling is JRuby tooling
JVM Tooling
• Wide range of GCs: parallel, concurrent, realtime, pauseless
• Built-in tools for analyzing GC, JIT, thread, IO, heap
• Built-in remote monitoring via JMX
• Dozens of tools out there for profiling, management, and more
VisualVM
• Graphical console into your application
• Monitor GC, threads, CPU usage
• Sampled or full profiling with GUI browser
• Live memory dumping, heap inspection
• Ships with every OpenJDK install
Java Mission Control
• Extremely low-overhead application recording
• Analyze results offline in JMC
• GC, CPU, heap events, IO operation all browsable
• Commercial feature, free for development use
More on GC
• JVM GCs are incredibly tunable with sensible defaults
• Tools like http://gceasy.io and JClarity give you a deeper view
• These are the best GCs and the best tools in the world
Finding Bottlenecks
Profiling
Profiling Tools
• Command line options: --profile, --sample
• JVM command-line profilers like prof
• Many graphical sampling/complete profiling options
• Flame graphs, stack profilers, you name it!
...and this is just the beginning
Wrapping Up
Ruby is alive and well
CRuby continues to improve
Ruby 3 is very exciting
JRuby is performance today
JRuby is concurrency today
JRuby has tools today
JRuby makes you happier!
Thank You!
Charles Oliver Nutter
@headius
http://jruby.org https://github.com/jruby/jruby