let's get to the rapids

Post on 12-Apr-2017

  • Let's Get to the Rapids: Understanding Java 8 Stream Performance

    JavaDay Kyiv, November [email protected]

  • @mauricenaftalin

    Maurice Naftalin

    Developer, designer, architect, teacher, learner, writer

    JavaOne Rock Star

    Java Champion

  • Repeat offender:

    Maurice Naftalin

    Java 5, Java 8

  • @mauricenaftalin

    The Lambda FAQ

    http://www.lambdafaq.org

  • Agenda

    Background

    Java 8 Streams

    Parallelism

    Microbenchmarking

    Case study

    Conclusions

  • Streams: Why?

    Bring functional style to Java

    Exploit hardware parallelism: explicit but unobtrusive

  • Streams: Why?

    Intention: replace loops for aggregate operations (more concise, more readable, composable operations, parallelizable)

    Instead of writing this:

    List<Person> people = ...
    Set<City> shortCities = new HashSet<>();

    for (Person p : people) {
        City c = p.getCity();
        if (c.getName().length() < 4) {
            shortCities.add(c);
        }
    }

    we're going to write this:

    List<Person> people = ...
    Set<City> shortCities = people.stream()
        .map(Person::getCity)
        .filter(c -> c.getName().length() < 4)
        .collect(toSet());

  • Streams: Why?

    Intention: replace loops for aggregate operations (more concise, more readable, composable operations, parallelizable)

    Instead of writing this:

    Set<City> shortCities = new HashSet<>();

    for (Person p : people) {
        City c = p.getCity();
        if (c.getName().length() < 4) {
            shortCities.add(c);
        }
    }

    we're going to write this:

    List<Person> people = ...
    Set<City> shortCities = people.parallelStream()
        .map(Person::getCity)
        .filter(c -> c.getName().length() < 4)
        .collect(toSet());

  • Visualising Sequential Streams

    [Animation: the values x0, x1, x2, x3 move one after another from the Source through Map and Filter to the Reduction]

    Source, Map, Filter, Reduction

    Intermediate Operations: Map, Filter

    Terminal Operation: Reduction

    Values in Motion
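The "values in motion" picture can be checked with a small trace. This is a minimal sketch, not taken from the talk: the class name `SequentialTrace`, the sample values and the lambdas are illustrative assumptions. It records the order in which each value reaches each stage of a sequential pipeline.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical demo class: traces how values travel through a sequential pipeline.
class SequentialTrace {

    // Returns the stage events in the order they happened, followed by the result.
    static List<String> trace() {
        List<String> events = new ArrayList<>();
        List<Integer> result = Stream.of(0, 1, 2, 3)                      // source
            .map(x -> { events.add("map " + x); return x * 10; })         // intermediate operation
            .filter(y -> { events.add("filter " + y); return y >= 10; })  // intermediate operation
            .collect(Collectors.toList());                                // terminal operation
        events.add("result " + result);
        return events;
    }

    public static void main(String[] args) {
        trace().forEach(System.out::println);
    }
}
```

In this sequential run each value passes through both map and filter before the next value leaves the source; no intermediate collection is built between the stages.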

  • Practical Benefits of Streams?

    Functional style will affect (nearly) all collection processing

    Automatic parallelism is useful, in certain situations
    - but everyone cares about performance!

  • Parallelism: Why?

    The Free Lunch Is Over

    http://www.gotw.ca/publications/concurrency-ddj.htm

  • Intel Xeon E5 2600 10-core

  • Visualizing Parallel Streams

    [Animation: the values x0, x1, x2, x3 move through the pipeline stages in parallel rather than one after another]
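A parallel stream should give the same result as its sequential counterpart while spreading the work over several threads. The sketch below is an illustration under assumptions: `ParallelTrace`, `squares` and the sample data are hypothetical names, not from the talk. It records which threads touched the elements.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical demo class: runs one pipeline sequentially or in parallel.
class ParallelTrace {

    // Squares every element; the names of the threads that did the work are added to 'threads'.
    static Set<Integer> squares(List<Integer> source, boolean parallel, Set<String> threads) {
        Stream<Integer> stream = parallel ? source.parallelStream() : source.stream();
        return stream
            .peek(x -> threads.add(Thread.currentThread().getName()))
            .map(x -> x * x)
            .collect(Collectors.toSet());
    }

    public static void main(String[] args) {
        List<Integer> source = Arrays.asList(0, 1, 2, 3, 4, 5, 6, 7);
        Set<String> threads = ConcurrentHashMap.newKeySet(); // thread-safe, since peek may run concurrently
        System.out.println(squares(source, true, threads));
        System.out.println("threads used: " + threads);
    }
}
```

How many threads appear depends on the machine and on the common fork/join pool, so the thread set is printed rather than assumed.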

  • What to Measure?

    How do code changes affect system performance?

    Controlled experiment, production conditions
    - difficult!

    So: controlled experiment, lab conditions
    - beware the substitution effect!

  • Microbenchmarking

    Really hard to get meaningful results from a dynamic runtime:

    - timing methods are flawed: System.currentTimeMillis() and System.nanoTime()
    - compilation can occur at any time
    - garbage collection interferes
    - runtime optimizes code after profiling it for some time, then may deoptimize it
    - optimizations include dead code elimination
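To see why hand-rolled timing is fragile, one can time the same workload several times in one JVM. The following is a deliberately naive sketch, not a recommended benchmark: `NaiveTiming`, `workload` and the round count are assumptions made for illustration. Early rounds typically include interpretation and compilation, and any round may include a garbage collection pause, so the figures vary from run to run.

```java
import java.util.stream.LongStream;

// Hypothetical demo class: naive timing with System.nanoTime(), shown only to expose its weaknesses.
class NaiveTiming {

    // Written to on every round so that the workload's result is used
    // and the computation is not an obvious candidate for dead code elimination.
    static volatile long sink;

    // An arbitrary CPU-bound workload.
    static long workload() {
        return LongStream.rangeClosed(1, 1_000_000).map(x -> (x * x) % 7).sum();
    }

    // Times 'rounds' consecutive executions of the workload, in nanoseconds.
    static long[] timeRounds(int rounds) {
        long[] nanos = new long[rounds];
        for (int i = 0; i < rounds; i++) {
            long start = System.nanoTime();
            sink = workload();
            nanos[i] = System.nanoTime() - start;
        }
        return nanos;
    }

    public static void main(String[] args) {
        long[] nanos = timeRounds(10);
        for (int i = 0; i < nanos.length; i++) {
            System.out.println("round " + i + ": " + nanos[i] / 1_000 + " us");
        }
    }
}
```

The spread between rounds is the point of the exercise: a single reading from code like this says little about steady-state performance.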

  • Microbenchmarking

    Don't try to eliminate these effects yourself!

    Use a benchmarking library: Caliper, JMH (Java Microbenchmark Harness)

    Ensure your results are statistically meaningful

    Get your benchmarks peer-reviewed

  • Case Study: grep -b

    rubai51.txt:

    The Moving Finger writes; and, having writ,
    Moves on: nor all thy Piety nor Wit
    Shall bring it back to cancel half a Line
    Nor all thy Tears wash out a Word of it.

    grep -b: The offset in bytes of a matched pattern is displayed in front of the matched line.

    $ grep -b 'W.*t' rubai51.txt
    44:Moves on: nor all thy Piety nor Wit
    122:Nor all thy Tears wash out a Word of it.
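Before looking at a stream version, the behaviour of grep -b can be written as a plain loop that carries a running byte offset. This is a sketch under assumptions: the class `SequentialGrepB` is hypothetical, lines are taken to end in a single '\n', and text is encoded as UTF-8.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical demo class: a loop-based equivalent of grep -b over lines already in memory.
class SequentialGrepB {

    // Returns "offset:line" for every line containing a match of 'regex'.
    static List<String> grepB(List<String> lines, String regex) {
        Pattern pattern = Pattern.compile(regex);
        List<String> matches = new ArrayList<>();
        long offset = 0; // byte offset of the start of the current line
        for (String line : lines) {
            if (pattern.matcher(line).find()) {
                matches.add(offset + ":" + line);
            }
            // Assumption: each line is terminated by one '\n' byte.
            offset += line.getBytes(StandardCharsets.UTF_8).length + 1;
        }
        return matches;
    }

    public static void main(String[] args) {
        List<String> rubai = Arrays.asList(
            "The Moving Finger writes; and, having writ,",
            "Moves on: nor all thy Piety nor Wit",
            "Shall bring it back to cancel half a Line",
            "Nor all thy Tears wash out a Word of it.");
        grepB(rubai, "W.*t").forEach(System.out::println);
    }
}
```

Each offset depends on every preceding line, which is what makes the problem awkward to parallelize.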

  • Why Shouldn't We Optimize Code?

    Because we don't have a problem
    - No performance target!

    Else there is a problem, but not in our process
    - The OS is struggling!

    Else there's a problem in our process, but not in the code
    - GC is using all the cycles!

  • Now we can consider the code

    Else there's a problem in the code somewhere
    - now we can consider optimising!

  • grep -b: Collector combiner

    [Diagram: each line is held with its byte offset and its length in bytes. The required result is [(0, The, 44), (44, Moves, 36), (80, Shall, 42), (122, Nor, 41)]. Computed separately, the two halves are [(0, The, 44), (44, Moves, 36)] and [(0, Shall, 42), (42, Nor, 41)]. Computed one line at a time, each line sits in its own list at offset 0: [(0, The, 44)], [(0, Moves, 36)], [(0, Shall, 42)], [(0, Nor, 41)].]

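  • grep -b: Collector accumulator

    [Diagram: the supplier creates an empty list [ ]. The accumulator adds the line beginning "The moving" to give [(0, The moving, 44)]. The accumulator then adds the line beginning "Moves on:" to give [(0, The moving, 44), (44, Moves on:, 36)].]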

  • grep -b: Collector solution

    [Diagram: the partial results [(0, The, 44), (44, Moves, 36)] and [(0, Shall, 42), (42, Nor, 41)] are combined into [(0, The, 44), (44, Moves, 36), (80, Shall, 42), (122, Nor, 41)].]
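The supplier, accumulator and combiner in the diagrams can be put together as a custom collector. What follows is a sketch under stated assumptions, not the code from the talk: the names `GrepBCollector`, `DispLine` and `toDispLines` are hypothetical, lines are assumed to end in a single '\n', and the combiner is assumed to shift every offset in the right-hand partial result by the total byte length of the left-hand one.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collector;
import java.util.stream.Collectors;

// Hypothetical demo class: grep -b over a parallel stream of lines.
class GrepBCollector {

    // A line with its byte offset ("displacement") and its length in bytes, newline included.
    static final class DispLine {
        final long disp;
        final String line;
        final int length;

        DispLine(long disp, String line) {
            this.disp = disp;
            this.line = line;
            this.length = line.getBytes(StandardCharsets.UTF_8).length + 1; // assumes '\n' endings
        }

        long end() { return disp + length; }
    }

    static Collector<String, Deque<DispLine>, Deque<DispLine>> toDispLines() {
        return Collector.of(
            ArrayDeque::new,                                   // supplier: empty partial result
            (deque, line) -> {                                 // accumulator: append after the last line
                long disp = deque.isEmpty() ? 0 : deque.getLast().end();
                deque.addLast(new DispLine(disp, line));
            },
            (left, right) -> {                                 // combiner: shift right-hand offsets
                if (left.isEmpty()) return right;
                long shift = left.getLast().end();
                for (DispLine d : right) {
                    left.addLast(new DispLine(d.disp + shift, d.line));
                }
                return left;
            });
    }

    // Returns "offset:line" for every line containing a match of 'regex'.
    static List<String> grepB(List<String> lines, String regex) {
        Pattern pattern = Pattern.compile(regex);
        return lines.parallelStream()
            .collect(toDispLines())
            .stream()
            .filter(d -> pattern.matcher(d.line).find())
            .map(d -> d.disp + ":" + d.line)
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> rubai = Arrays.asList(
            "The Moving Finger writes; and, having writ,",
            "Moves on: nor all thy Piety nor Wit",
            "Shall bring it back to cancel half a Line",
            "Nor all thy Tears wash out a Word of it.");
        grepB(rubai, "W.*t").forEach(System.out::println);
    }
}
```

Because the source list is ordered and the collector is not declared UNORDERED, partial results are combined in encounter order, so the shifted offsets match those of a sequential scan. The combiner copies every element of the right-hand deque; whether that cost is acceptable is a question for measurement.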