Java 8 Stream API: a different way to process collections

Post on 27-Aug-2014








A look at one of the features of Java 8 hidden behind the lambdas: a different way to iterate over Collections. You'll never see Collections the same way again. These are the slides I used in my talk at the "Tech Thursday" by Oracle in June in Madrid.


Java 8 Stream API
A different way to process collections
David Gómez

Streams? What’s that?

A Stream is… a convenience method to iterate over collections in a declarative way.

List<Integer> numbers = new ArrayList<Integer>();
for (int i = 0; i < 100; i++) {
    numbers.add(i);
}

List<Integer> evenNumbers = new ArrayList<>();
for (int i : numbers) {
    if (i % 2 == 0) {
        evenNumbers.add(i);
    }
}


With a Stream, the same filtering is written declaratively:

List<Integer> evenNumbers = numbers.stream()
    .filter(n -> n % 2 == 0)
    .collect(toList());
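To make the comparison concrete, here is a minimal runnable sketch that puts the loop version and the Stream version side by side. The class and method names (`EvenNumbersDemo`, `evensWithLoop`, `evensWithStream`) are made up for this illustration and are not part of the slides.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

class EvenNumbersDemo {

    // Imperative version: external iteration with an explicit loop.
    static List<Integer> evensWithLoop(List<Integer> numbers) {
        List<Integer> evens = new ArrayList<>();
        for (int n : numbers) {
            if (n % 2 == 0) {
                evens.add(n);
            }
        }
        return evens;
    }

    // Declarative version: the Stream drives the iteration.
    static List<Integer> evensWithStream(List<Integer> numbers) {
        return numbers.stream()
                .filter(n -> n % 2 == 0)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> numbers = new ArrayList<>();
        for (int i = 0; i < 100; i++) {
            numbers.add(i);
        }
        // Both approaches should produce the same list.
        System.out.println(evensWithLoop(numbers).equals(evensWithStream(numbers)));
    }
}
```

The slides use a static import of `Collectors.toList`; the sketch spells it out so that it compiles on its own.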


So… Streams are collections? Not really.

Collections                 Streams
--------------------------  --------------------------
Sequence of elements        Sequence of elements
Computed at construction    Computed at iteration
In-memory data structure    Traversable only once
External iteration          Internal iteration
Finite size                 Possibly infinite size
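The "traversable only once" row is easy to check. The following sketch (class and method names are hypothetical) consumes a Stream and then tries to traverse it again; the second attempt is expected to fail with an `IllegalStateException`, while the underlying collection can simply hand out a fresh Stream.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Stream;

class SingleUseStreamDemo {

    // Returns true if traversing the same Stream a second time fails.
    static boolean secondTraversalFails(List<String> source) {
        Stream<String> stream = source.stream();
        stream.forEach(s -> { });          // first traversal consumes the Stream
        try {
            stream.forEach(s -> { });      // second traversal is not allowed
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("a", "b", "c");
        System.out.println(secondTraversalFails(words));
        // The collection itself can be traversed again through a new Stream.
        System.out.println(words.stream().count());
    }
}
```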


Iterating a Collection

List<Integer> evenNumbers = new ArrayList<>();
for (int i : numbers) {
    if (i % 2 == 0) {
        evenNumbers.add(i);
    }
}

External Iteration
- Use forEach or Iterator
- Very verbose

Parallelism by manually using Threads
- Concurrency is hard to get right!
- Lots of contention and error-prone
- Thread-safety


Iterating a Stream

List<Integer> evenNumbers = numbers.stream()
    .filter(n -> n % 2 == 0)
    .collect(toList());

Internal Iteration
- No manual Iterator handling
- Concise
- Fluent API: chain sequence processing

Elements computed only when needed


Iterating a Stream

List<Integer> evenNumbers = numbers.parallelStream()
    .filter(n -> n % 2 == 0)
    .collect(toList());

Easy parallelism
- Concurrency is hard to get right!
- Uses ForkJoin
- Processing steps should be
  - stateless
  - independent
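As a rough illustration of "stateless and independent" steps, the sketch below runs a parallel pipeline in which every lambda depends only on its own argument, so the result should not depend on how work is split across threads. `ParallelEvenSquares` and `evenSquares` are names invented for this example.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class ParallelEvenSquares {

    // No step reads or writes shared mutable state, so the pipeline is safe to parallelize.
    static List<Integer> evenSquares(int upperBound) {
        return IntStream.range(0, upperBound)
                .parallel()
                .filter(n -> n % 2 == 0)
                .map(n -> n * n)
                .boxed()
                .collect(Collectors.toList());   // keeps encounter order for an ordered source
    }

    public static void main(String[] args) {
        System.out.println(evenSquares(10));
    }
}
```

Collecting with `Collectors.toList()` lets the library merge partial results; adding to a shared `ArrayList` from inside `forEach` would reintroduce the thread-safety problems the slide warns about.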


Lambdas & Method References


@FunctionalInterface
public interface Predicate<T> {
    boolean test(T t);
}

An interface with exactly one abstract method.




@FunctionalInterface
public interface Predicate<T> {
    boolean test(T t);

    default Predicate<T> negate() {
        return (t) -> !test(t);
    }
}

An interface with exactly one abstract method.
It can have default methods, though!
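A short sketch of what this means in practice: a lambda supplies the single abstract method `test`, and the default methods of `java.util.function.Predicate` (`negate`, `and`) build new predicates from it. The class and constant names are only illustrative.

```java
import java.util.function.Predicate;

class PredicateDemo {

    // The lambda implements the single abstract method, test(T).
    static final Predicate<Integer> IS_EVEN = n -> n % 2 == 0;

    // negate() and and() are default methods that derive new predicates.
    static final Predicate<Integer> IS_ODD = IS_EVEN.negate();
    static final Predicate<Integer> IS_EVEN_AND_POSITIVE = IS_EVEN.and(n -> n > 0);

    public static void main(String[] args) {
        System.out.println(IS_EVEN.test(4));
        System.out.println(IS_ODD.test(4));
        System.out.println(IS_EVEN_AND_POSITIVE.test(-2));
    }
}
```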


Lambda Types
Based on the abstract method signature of the @FunctionalInterface: (Arguments) -> <return type>

@FunctionalInterface
public interface Predicate<T> {
    boolean test(T t);
}

T -> boolean


Lambda Types (more examples, derived the same way)

@FunctionalInterface
public interface Runnable {
    void run();
}

() -> void


@FunctionalInterface
public interface Supplier<T> {
    T get();
}

() -> T


@FunctionalInterface
public interface BiFunction<T, U, R> {
    R apply(T t, U u);
}

(T, U) -> R


@FunctionalInterface
public interface Comparator<T> {
    int compare(T o1, T o2);
}

(T, T) -> int
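The shapes listed above can be tried out by assigning a lambda of each shape to the matching interface. This is only a sketch; the constant names are assumptions made for the example.

```java
import java.util.Comparator;
import java.util.function.BiFunction;
import java.util.function.Predicate;
import java.util.function.Supplier;

class LambdaTypesDemo {

    static final Predicate<String> IS_EMPTY = s -> s.isEmpty();                           // T -> boolean
    static final Supplier<String> GREETING = () -> "hello";                               // () -> T
    static final BiFunction<String, Integer, String> PREFIX = (s, n) -> s.substring(0, n); // (T, U) -> R
    static final Comparator<String> BY_LENGTH =
            (a, b) -> Integer.compare(a.length(), b.length());                            // (T, T) -> int

    public static void main(String[] args) {
        Runnable task = () -> System.out.println("running");                              // () -> void
        task.run();
        System.out.println(IS_EMPTY.test(""));
        System.out.println(GREETING.get());
        System.out.println(PREFIX.apply("stream", 3));
        System.out.println(BY_LENGTH.compare("a", "bb"));
    }
}
```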


Method References
Allow using a method name as a lambda. Usually more readable.

Syntax: <TargetReference>::<MethodName>

TargetReference: Instance or Class


Method References

Lambda                                 Method Reference
-------------------------------------  ----------------------
phoneCall -> phoneCall.getContact()    PhoneCall::getContact
() -> Thread.currentThread()           Thread::currentThread
(str, c) -> str.indexOf(c)             String::indexOf
(String s) -> System.out.println(s)    System.out::println
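A small sketch to check that a lambda and its method-reference form are interchangeable. The functional interface types chosen here (`Supplier`, `BiFunction`, `Consumer`) are one possible choice, and the constant names are invented for the example.

```java
import java.util.function.BiFunction;
import java.util.function.Consumer;
import java.util.function.Supplier;

class MethodReferenceDemo {

    // Static method: () -> Thread.currentThread()
    static final Supplier<Thread> CURRENT_AS_LAMBDA = () -> Thread.currentThread();
    static final Supplier<Thread> CURRENT_AS_REFERENCE = Thread::currentThread;

    // Instance method of the first argument: (str, c) -> str.indexOf(c)
    static final BiFunction<String, String, Integer> INDEX_AS_LAMBDA = (str, c) -> str.indexOf(c);
    static final BiFunction<String, String, Integer> INDEX_AS_REFERENCE = String::indexOf;

    public static void main(String[] args) {
        // Instance method of a particular object: s -> System.out.println(s)
        Consumer<String> printer = System.out::println;
        printer.accept("index of 'r' in 'stream': " + INDEX_AS_REFERENCE.apply("stream", "r"));
        System.out.println(CURRENT_AS_REFERENCE.get() == CURRENT_AS_LAMBDA.get());
    }
}
```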


From Collections to


Characteristics of a Stream

• Interface to a sequence of elements
• Focused on processing (not on storage)
• Elements computed on demand (or extracted from source)
• Can be traversed only once
• Internal iteration
• Parallel support
• Could be infinite


Anatomy of a Stream

[Diagram: a Stream pipeline, made of Intermediate Operations followed by a Final operation]

Anatomy of Stream Iteration
1. Start from the DataSource (usually a collection) and create the Stream

List<Integer> numbers = Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8, 9, 10);
Stream<Integer> numbersStream = numbers.stream();


Anatomy of Stream Iteration
2. Add a chain of intermediate operations (Stream Pipeline)

Stream<Integer> numbersStream = numbers.stream()
    .filter(new Predicate<Integer>() {
        @Override
        public boolean test(Integer number) {
            return number % 2 == 0;
        }
    })
    .map(new Function<Integer, Integer>() {
        @Override
        public Integer apply(Integer number) {
            return number * 2;
        }
    });


Anatomy of Stream Iteration
2. Add a chain of intermediate operations (Stream Pipeline) - better using lambdas

Stream<Integer> numbersStream = numbers.stream()
    .filter(number -> number % 2 == 0)
    .map(number -> number * 2);


Anatomy of Stream Iteration
3. Close with a Terminal Operation

List<Integer> numbersStream = numbers.stream()
    .filter(number -> number % 2 == 0)
    .map(number -> number * 2)
    .collect(Collectors.toList());

• The terminal operation triggers Stream iteration.
• Before that, nothing is computed.
• Depending on the terminal operation, the stream may or may not be fully traversed.
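The three points above can be observed with a counter inside the filter. This is a sketch for demonstration only (the counter is a deliberate side effect, and the class and method names are made up): nothing is inspected until a terminal operation runs, and a short-circuiting terminal operation such as `findFirst` stops early on a sequential Stream.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class LazyPipelineDemo {

    // Returns how many elements the filter inspected before and after the terminal operation.
    static int[] inspectedBeforeAndAfter(List<Integer> numbers) {
        AtomicInteger inspected = new AtomicInteger();
        Stream<Integer> pipeline = numbers.stream()
                .filter(n -> { inspected.incrementAndGet(); return n % 2 == 0; })
                .map(n -> n * 2);
        int before = inspected.get();             // pipeline declared, nothing has run yet
        pipeline.collect(Collectors.toList());    // terminal operation triggers the iteration
        return new int[] {before, inspected.get()};
    }

    // Returns how many elements the filter inspected when the pipeline ends with findFirst().
    static int inspectedByFindFirst(List<Integer> numbers) {
        AtomicInteger inspected = new AtomicInteger();
        numbers.stream()
                .filter(n -> { inspected.incrementAndGet(); return n % 2 == 0; })
                .findFirst();                     // stops at the first match
        return inspected.get();
    }

    public static void main(String[] args) {
        List<Integer> numbers = Arrays.asList(1, 2, 3, 4);
        System.out.println(Arrays.toString(inspectedBeforeAndAfter(numbers)));
        System.out.println(inspectedByFindFirst(numbers));
    }
}
```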


Stream operations

Operation Types

Intermediate operations
• Always return a Stream
• Chain as many as needed (Pipeline)
• Guide processing of data
• Do not start processing
• Can be Stateless or Stateful

Terminal operations
• Can return an object, a collection, or void
• Start the pipeline process
• After their execution, the Stream cannot be revisited

Intermediate Operations

// T -> boolean
Stream<T> filter(Predicate<? super T> predicate);

// T -> R
<R> Stream<R> map(Function<? super T, ? extends R> mapper);

// (T, T) -> int
Stream<T> sorted(Comparator<? super T> comparator);
Stream<T> sorted();

// T -> void
Stream<T> peek(Consumer<? super T> action);

Stream<T> distinct();
Stream<T> limit(long maxSize);
Stream<T> skip(long n);


Final Operations

Object[] toArray();

// T -> void
void forEach(Consumer<? super T> action);

<R, A> R collect(Collector<? super T, A, R> collector);


Final Operations (II)

// (T, T) -> T
Optional<T> reduce(BinaryOperator<T> accumulator);

// (T, T) -> int
Optional<T> min(Comparator<? super T> comparator);

// (T, T) -> int
Optional<T> max(Comparator<? super T> comparator);

long count();
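A brief sketch of these reductions in use. Note that `reduce` and `max` return an `Optional`, which is empty when the Stream has no elements. `ReductionDemo` and its method names are assumptions for the example.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

class ReductionDemo {

    // reduce: combines the elements pairwise with a (T, T) -> T function.
    static Optional<Integer> sum(List<Integer> numbers) {
        return numbers.stream().reduce(Integer::sum);
    }

    // max: picks the greatest element according to a Comparator.
    static Optional<String> longest(List<String> words) {
        return words.stream().max(Comparator.comparingInt(String::length));
    }

    public static void main(String[] args) {
        System.out.println(sum(Arrays.asList(1, 2, 3)));
        System.out.println(longest(Arrays.asList("a", "ccc", "bb")));
        System.out.println(Arrays.asList("a", "ccc", "bb").stream().count());
    }
}
```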


Final Operations (and III)

// T -> boolean
boolean anyMatch(Predicate<? super T> predicate);
boolean allMatch(Predicate<? super T> predicate);
boolean noneMatch(Predicate<? super T> predicate);
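For illustration, the three matching operations applied to a plain list of city names. The helper names are hypothetical and not taken from the slides.

```java
import java.util.Arrays;
import java.util.List;

class MatchDemo {

    static boolean anyFromCity(List<String> cities, String city) {
        return cities.stream().anyMatch(city::equals);
    }

    static boolean allNonEmpty(List<String> cities) {
        return cities.stream().allMatch(c -> !c.isEmpty());
    }

    static boolean noneFromCity(List<String> cities, String city) {
        return cities.stream().noneMatch(city::equals);
    }

    public static void main(String[] args) {
        List<String> cities = Arrays.asList("Madrid", "Santiago", "Chania", "Munich");
        System.out.println(anyFromCity(cities, "Madrid"));
        System.out.println(allNonEmpty(cities));
        System.out.println(noneFromCity(cities, "Paris"));
    }
}
```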


Usage examples - Context

public class Contact {
    private final String name;
    private final String city;
    private final String phoneNumber;
    private final LocalDate birth;

    public int getAge() {
        return Period.between(birth, LocalDate.now()).getYears();
    }

    //Constructor and getters omitted
}


Usage examples - Context

public class PhoneCall {
    private final Contact contact;
    private final LocalDate time;
    private final Duration duration;

    //Constructor and getters omitted
}

Contact me = new Contact("dgomezg", "Madrid", "555 55 55 55", LocalDate.of(1975, Month.MARCH, 26));
Contact martin = new Contact("Martin", "Santiago", "666 66 66 66", LocalDate.of(1978, Month.JANUARY, 17));
Contact roberto = new Contact("Roberto", "Santiago", "111 11 11 11", LocalDate.of(1973, Month.MAY, 11));
Contact heinz = new Contact("Heinz", "Chania", "444 44 44 44", LocalDate.of(1972, Month.APRIL, 29));
Contact michael = new Contact("michael", "Munich", "222 22 22 22", LocalDate.of(1976, Month.DECEMBER, 8));

List<PhoneCall> phoneCallLog = Arrays.asList(
    new PhoneCall(heinz, LocalDate.of(2014, Month.MAY, 28), Duration.ofSeconds(125)),
    new PhoneCall(martin, LocalDate.of(2014, Month.MAY, 30), Duration.ofMinutes(5)),
    new PhoneCall(roberto, LocalDate.of(2014, Month.MAY, 30), Duration.ofMinutes(12)),
    new PhoneCall(michael, LocalDate.of(2014, Month.MAY, 28), Duration.ofMinutes(3)),
    new PhoneCall(michael, LocalDate.of(2014, Month.MAY, 29), Duration.ofSeconds(90)),
    new PhoneCall(heinz, LocalDate.of(2014, Month.MAY, 30), Duration.ofSeconds(365)),
    new PhoneCall(heinz, LocalDate.of(2014, Month.JUNE, 1), Duration.ofMinutes(7)),
    new PhoneCall(martin, LocalDate.of(2014, Month.JUNE, 2), Duration.ofSeconds(315)));


People I phoned in June

phoneCallLog.stream()
    .filter(phoneCall -> phoneCall.getTime().getMonth() == Month.JUNE)
    .map(phoneCall -> phoneCall.getContact().getName())
    .distinct()
    .forEach(System.out::println);


Seconds I talked in May

Long total = phoneCallLog.stream()
    .filter(phoneCall -> phoneCall.getTime().getMonth() == Month.MAY)
    .map(PhoneCall::getDuration)
    .collect(summingLong(Duration::getSeconds));


Seconds I talked in May

Optional<Duration> total = phoneCallLog.stream()
    .filter(phoneCall -> phoneCall.getTime().getMonth() == Month.MAY)
    .map(PhoneCall::getDuration)
    .reduce(Duration::plus);
total.ifPresent(duration -> System.out.println(duration.getSeconds()));


Did I phone Paris?

boolean phonedToParis = phoneCallLog.stream()
    .anyMatch(phoneCall -> "Paris".equals(phoneCall.getContact().getCity()));


Give me the 3 longest phone calls

phoneCallLog.stream()
    .filter(phoneCall -> phoneCall.getTime().getMonth() == Month.MAY)
    .sorted(comparing(PhoneCall::getDuration).reversed())
    .limit(3)
    .forEach(System.out::println);


Give me the 3 shortest ones

phoneCallLog.stream()
    .filter(phoneCall -> phoneCall.getTime().getMonth() == Month.MAY)
    .sorted(comparing(PhoneCall::getDuration))
    .limit(3)
    .forEach(System.out::println);
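The usage examples can be run end to end with a cut-down model. The sketch below is not the `Contact`/`PhoneCall` code from the slides: `Call` is a simplified stand-in that stores only the contact's name, and `sampleLog`, `secondsInMonth` and `longestCallers` are names invented here. The data mirrors the call log shown earlier.

```java
import java.time.Duration;
import java.time.LocalDate;
import java.time.Month;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

class PhoneCallLogDemo {

    // Simplified stand-in for PhoneCall: the contact is reduced to a name.
    static final class Call {
        final String contactName;
        final LocalDate time;
        final Duration duration;

        Call(String contactName, LocalDate time, Duration duration) {
            this.contactName = contactName;
            this.time = time;
            this.duration = duration;
        }

        Duration getDuration() {
            return duration;
        }
    }

    static List<Call> sampleLog() {
        return Arrays.asList(
                new Call("Heinz", LocalDate.of(2014, Month.MAY, 28), Duration.ofSeconds(125)),
                new Call("Martin", LocalDate.of(2014, Month.MAY, 30), Duration.ofMinutes(5)),
                new Call("Roberto", LocalDate.of(2014, Month.MAY, 30), Duration.ofMinutes(12)),
                new Call("michael", LocalDate.of(2014, Month.MAY, 28), Duration.ofMinutes(3)),
                new Call("michael", LocalDate.of(2014, Month.MAY, 29), Duration.ofSeconds(90)),
                new Call("Heinz", LocalDate.of(2014, Month.MAY, 30), Duration.ofSeconds(365)),
                new Call("Heinz", LocalDate.of(2014, Month.JUNE, 1), Duration.ofMinutes(7)),
                new Call("Martin", LocalDate.of(2014, Month.JUNE, 2), Duration.ofSeconds(315)));
    }

    // Total seconds talked in the given month.
    static long secondsInMonth(List<Call> log, Month month) {
        return log.stream()
                .filter(call -> call.time.getMonth() == month)
                .map(Call::getDuration)
                .collect(Collectors.summingLong(Duration::getSeconds));
    }

    // Contact names of the longest calls, longest first.
    static List<String> longestCallers(List<Call> log, int howMany) {
        return log.stream()
                .sorted(Comparator.comparing(Call::getDuration).reversed())
                .limit(howMany)
                .map(call -> call.contactName)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(secondsInMonth(sampleLog(), Month.MAY));
        System.out.println(longestCallers(sampleLog(), 3));
    }
}
```

Sorting in descending order (`reversed()`) before `limit(3)` is what yields the longest calls; the natural ascending order yields the shortest.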


Creating Streams

Streams can be created from
• Collections
• Directly from values
• Generators (infinite Streams)
• Resources (like files)
• Stream ranges


From collections

use stream()

List<Integer> numbers = new ArrayList<>();
for (int i = 0; i < 10_000_000; i++) {
    numbers.add((int) Math.round(Math.random() * 100));
}

Stream<Integer> evenNumbers = numbers.stream();

or parallelStream()

Stream<Integer> evenNumbers = numbers.parallelStream();


Directly from Values & ranges

Stream.of("Using", "Stream", "API", "From", "Java8");

can be converted into a parallel Stream

Stream.of("Using", "Stream", "API", "From", "Java8")
    .parallel();
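The slide title mentions ranges without showing one. As a possible illustration (not taken from the slides), `IntStream.range` and `IntStream.rangeClosed` create Streams of consecutive integers; the wrapper names below are hypothetical.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class RangeDemo {

    // range: the upper bound is exclusive.
    static List<Integer> range(int from, int toExclusive) {
        return IntStream.range(from, toExclusive).boxed().collect(Collectors.toList());
    }

    // rangeClosed: the upper bound is inclusive.
    static List<Integer> rangeClosed(int from, int toInclusive) {
        return IntStream.rangeClosed(from, toInclusive).boxed().collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(range(0, 3));
        System.out.println(rangeClosed(1, 3));
    }
}
```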


Generators - Functions

Stream<Integer> integers = Stream.iterate(0, number -> number + 2);

This is an infinite Stream! It will never be exhausted!

Stream<int[]> fibonacci = Stream.iterate(new int[]{0, 1}, t -> new int[]{t[1], t[0] + t[1]});
fibonacci.limit(10)
    .map(t -> t[0])
    .forEach(System.out::println);
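Because an infinite Stream never ends on its own, it has to be cut with an operation such as `limit` before a terminal operation that visits every element. A minimal sketch, with hypothetical method names, that collects the first elements of the two generators above:

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class GeneratorDemo {

    // First `count` even numbers, starting at 0.
    static List<Integer> firstEvens(int count) {
        return Stream.iterate(0, number -> number + 2)
                .limit(count)                     // without limit, collect would never finish
                .collect(Collectors.toList());
    }

    // First `count` Fibonacci numbers; each element of the Stream is the pair {current, next}.
    static List<Integer> firstFibonacci(int count) {
        return Stream.iterate(new int[] {0, 1}, t -> new int[] {t[1], t[0] + t[1]})
                .limit(count)
                .map(t -> t[0])
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(firstEvens(4));
        System.out.println(firstFibonacci(10));
    }
}
```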




From Resources (Files)

Stream<String> fileContent = Files.lines(Paths.get("readme.txt"));

Count all distinct words in a file

Files.lines(Paths.get("readme.txt"))
    .flatMap(line -> Arrays.stream(line.split(" ")))
    .distinct()
    .count();
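A runnable variant of the word count, written as a sketch under a few assumptions: it writes its own temporary file instead of relying on a `readme.txt`, it skips empty tokens, and the helper names are invented for the example. The Stream returned by `Files.lines` holds the file open, so it is closed with try-with-resources.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Stream;

class DistinctWordsDemo {

    // Counts the distinct space-separated words in a file.
    static long countDistinctWords(Path file) throws IOException {
        try (Stream<String> lines = Files.lines(file)) {
            return lines.flatMap(line -> Arrays.stream(line.split(" ")))
                    .filter(word -> !word.isEmpty())
                    .distinct()
                    .count();
        }
    }

    // Helper for the demo: writes the lines to a temporary file and counts its distinct words.
    static long countInTempFile(List<String> content) {
        try {
            Path file = Files.createTempFile("readme", ".txt");
            try {
                Files.write(file, content, StandardCharsets.UTF_8);
                return countDistinctWords(file);
            } finally {
                Files.delete(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(countInTempFile(Arrays.asList("to be or not", "to be")));
    }
}
```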



Parallel Streams

use stream()

List<Integer> numbers = new ArrayList<>();
for (int i = 0; i < 10_000_000; i++) {
    numbers.add((int) Math.round(Math.random() * 100));
}

//This will use just a single thread
Stream<Integer> evenNumbers = numbers.stream();

or parallelStream()

//Runs on the common ForkJoinPool, sized by default from the number of available processors
Stream<Integer> evenNumbers = numbers.parallelStream();


Let’s test it

use stream()

for (int i = 0; i < 100; i++) {
    long start = System.currentTimeMillis();
    List<Integer> even = numbers.stream()
        .filter(n -> n % 2 == 0)
        .sorted()
        .collect(toList());
    System.out.printf(
        "%d elements computed in %5d msecs with %d threads\n",
        even.size(), System.currentTimeMillis() - start, Thread.activeCount());
}

5001983 elements computed in 828 msecs with 2 threads
5001983 elements computed in 843 msecs with 2 threads
5001983 elements computed in 675 msecs with 2 threads
5001983 elements computed in 795 msecs with 2 threads


Let’s test it

use parallelStream()

for (int i = 0; i < 100; i++) {
    long start = System.currentTimeMillis();
    List<Integer> even = numbers.parallelStream()
        .filter(n -> n % 2 == 0)
        .sorted()
        .collect(toList());
    System.out.printf(
        "%d elements computed in %5d msecs with %d threads\n",
        even.size(), System.currentTimeMillis() - start, Thread.activeCount());
}

4999299 elements computed in 225 msecs with 9 threads
4999299 elements computed in 230 msecs with 9 threads
4999299 elements computed in 250 msecs with 9 threads


Enough for now, but this is just the beginning.

Thank You.
