let's get to the rapids
Post on 12-Apr-2017
255 views
Embed Size (px)
TRANSCRIPT
Lets Get to the RapidsUnderstanding Java 8 Stream Performance
JavaDay KyivNovember [email protected]
@mauricenaftalin
Maurice Naftalin
Developer, designer, architect, teacher, learner, writer
@mauricenaftalin
Maurice Naftalin
Developer, designer, architect, teacher, learner, writer
JavaOne Rock Star
Java Champion
Repeat offender:
Maurice Naftalin
Repeat offender:
Maurice NaftalinJava 5
Repeat offender:
Maurice NaftalinJava 5 Java 8
@mauricenaftalin
The Lambda FAQ
www.lambdafaq.org
http://www.lambdafaq.org
Agenda
Background
Java 8 Streams
Parallelism
Microbenchmarking
Case study
Conclusions
Streams Why?
Bring functional style to Java Exploit hardware parallelism explicit but unobtrusive
Streams Why?
Intention: replace loops for aggregate operations
7
Streams Why?
Intention: replace loops for aggregate operations
List people = Set shortCities = new HashSet();
for (Person p : people) { City c = p.getCity(); if (c.getName().length() < 4 ) { shortCities.add(c); } }
instead of writing this:
7
Streams Why?
Intention: replace loops for aggregate operations more concise, more readable, composable operations, parallelizable
Set shortCities = new HashSet();
for (Person p : people) { City c = p.getCity(); if (c.getName().length() < 4 ) { shortCities.add(c); } }
instead of writing this:
List people = Set shortCities = people.stream() .map(Person::getCity) .filter(c -> c.getName().length() < 4) .collect(toSet());
8
were going to write this:
Streams Why?
Intention: replace loops for aggregate operations more concise, more readable, composable operations, parallelizable
Set shortCities = new HashSet();
for (Person p : people) { City c = p.getCity(); if (c.getName().length() < 4 ) { shortCities.add(c); } }
instead of writing this:
List people = Set shortCities = people.stream() .map(Person::getCity) .filter(c -> c.getName().length() < 4) .collect(toSet());
8
were going to write this:
Streams Why?
Intention: replace loops for aggregate operations more concise, more readable, composable operations, parallelizable
Set shortCities = new HashSet();
for (Person p : people) { City c = p.getCity(); if (c.getName().length() < 4 ) { shortCities.add(c); } }
instead of writing this:
List people = Set shortCities = people.parallelStream() .map(Person::getCity) .filter(c -> c.getName().length() < 4) .collect(toSet());
9
were going to write this:
Visualising Sequential Streams
x2x0 x1 x3x0 x1 x2 x3
Source Map Filter Reduction
Intermediate Operations
Terminal Operation
Values in Motion
Visualising Sequential Streams
x2x0 x1 x3x1 x2 x3
Source Map Filter Reduction
Intermediate Operations
Terminal Operation
Values in Motion
Visualising Sequential Streams
x2x0 x1 x3 x1x2 x3
Source Map Filter Reduction
Intermediate Operations
Terminal Operation
Values in Motion
Visualising Sequential Streams
x2x0 x1 x3 x1x2x3
Source Map Filter Reduction
Intermediate Operations
Terminal Operation
Values in Motion
Practical Benefits of Streams?
Practical Benefits of Streams?
Practical Benefits of Streams?
Functional style will affect (nearly) all collection processing
Practical Benefits of Streams?
Functional style will affect (nearly) all collection processingAutomatic parallelism is useful, in certain situations
Practical Benefits of Streams?
Functional style will affect (nearly) all collection processingAutomatic parallelism is useful, in certain situations- but everyone cares about performance!
Parallelism Why?
The Free Lunch Is Over
http://www.gotw.ca/publications/concurrency-ddj.htm
http://www.gotw.ca/publications/concurrency-ddj.htm
Intel Xeon E5 2600 10-core
x2
Visualizing Parallel Streams
x0
x1
x3
x0
x1x2x3
x2
Visualizing Parallel Streams
x0
x1
x3
x0
x1
x2x3
x2
Visualizing Parallel Streams
x1
x3
x0
x1
x3
x2
Visualizing Parallel Streams
x1 y3
x0
x1
x3
What to Measure?
What to Measure?
How do code changes affect system performance?
What to Measure?
How do code changes affect system performance?
What to Measure?
How do code changes affect system performance?
Controlled experiment, production conditions
What to Measure?
How do code changes affect system performance?
Controlled experiment, production conditions- difficult!
What to Measure?
How do code changes affect system performance?
Controlled experiment, production conditions- difficult!
What to Measure?
How do code changes affect system performance?
Controlled experiment, production conditions- difficult!
So: controlled experiment, lab conditions
What to Measure?
How do code changes affect system performance?
Controlled experiment, production conditions- difficult!
So: controlled experiment, lab conditions- beware the substitution effect!
Microbenchmarking
Really hard to get meaningful results from a dynamic runtime: timing methods are flawed
System.currentTimeMillis() and System.nanoTime() compilation can occur at any time garbage collection interferes runtime optimizes code after profiling it for some time then may deoptimize it
optimizations include dead code elimination
MicrobenchmarkingDont try to eliminate these effects yourself!
Use a benchmarking library Caliper JMH (Java Benchmarking Harness)
Ensure your results are statistically meaningful
Get your benchmarks peer-reviewed
Case Study: grep -b
grep -b: The offset in bytes of a matched pattern is displayed in front of the matched line.
Case Study: grep -b
The Moving Finger writes; and, having writ,Moves on: nor all thy Piety nor Wit
Shall bring it back to cancel half a LineNor all thy Tears wash out a Word of it.
rubai51.txt
grep -b: The offset in bytes of a matched pattern is displayed in front of the matched line.
Case Study: grep -b
The Moving Finger writes; and, having writ,Moves on: nor all thy Piety nor Wit
Shall bring it back to cancel half a LineNor all thy Tears wash out a Word of it.
rubai51.txt
grep -b: The offset in bytes of a matched pattern is displayed in front of the matched line.
$ grep -b 'W.*t' rubai51.txt 44:Moves on: nor all thy Piety nor Wit 122:Nor all thy Tears wash out a Word of it.
Because we dont have a problem
Why Shouldnt We Optimize Code?
Why Shouldnt We Optimize Code?
Because we dont have a problem- No performance target!
Because we dont have a problem- No performance target!
Else there is a problem, but not in our process
Why Shouldnt We Optimize Code?
Because we dont have a problem- No performance target!
Else there is a problem, but not in our process- The OS is struggling!
Why Shouldnt We Optimize Code?
Because we dont have a problem- No performance target!
Else there is a problem, but not in our process- The OS is struggling!
Else theres a problem in our process, but not in the code
Why Shouldnt We Optimize Code?
Because we dont have a problem- No performance target!
Else there is a problem, but not in our process- The OS is struggling!
Else theres a problem in our process, but not in the code- GC is using all the cycles!
Why Shouldnt We Optimize Code?
Because we dont have a problem- No performance target!
Else there is a problem, but not in our process- The OS is struggling!
Else theres a problem in our process, but not in the code- GC is using all the cycles!
Now we can consider the code
Else theres a problem in the code somewhere- now we can consider optimising!
122Nor Moves 44
grep -b: Collector combiner
0The [ , 80Shall ], ,
41 122Nor Moves 36 44
grep -b: Collector combiner
0The 44[ , 42 80Shall ], ,
41 122Nor Moves 36 44
grep -b: Collector combiner
0The 44[ , 42 80Shall ]
41 42Nor Moves 36 440The 44[ , 42 0Shall ]
, ,
,] [
41 122Nor Moves 36 44
grep -b: Collector combiner
0The 44[ , 42 80Shall ]
41 42Nor Moves 36 440The 44[ , 42 0Shall ]
, ,
,] [
Moves 36 0 41 0Nor 42 0Shall 0The 44[ ]] [[ [] ]
grep -b: Collector accumulator
44 0The moving writ,
Moves on: Wit
44 0The moving writ, 36 44Moves on: Wit
][
][ ,
[ ]
Supplier The moving writ,
accumulator
accumulator
41 122Nor Moves 36 44
grep -b: Collector solution
0The 44[ , 42 80Shall ]
41 42Nor Moves 36 440The 44[ , 42 0Shall ]
, ,
,] [