
Java 8 Parallel Streams Internals (Part 1)schmidt/cs891f/2018-PDFs/07-parallelstrea… · •Recall...

Date post: 25-Aug-2020
Java 8 Parallel Streams Internals (Part 1) Douglas C. Schmidt [email protected] www.dre.vanderbilt.edu/~schmidt Professor of Computer Science Institute for Software Integrated Systems Vanderbilt University Nashville, Tennessee, USA
Transcript
Page 1: Java 8 Parallel Streams Internals (Part 1)schmidt/cs891f/2018-PDFs/07-parallelstrea… · •Recall the 3 phases of a Java 8 parallel stream •Split–Uses a spliterator to partition

Java 8 Parallel Streams Internals (Part 1)

Douglas C. Schmidt
[email protected]
www.dre.vanderbilt.edu/~schmidt

Professor of Computer Science
Institute for Software Integrated Systems
Vanderbilt University
Nashville, Tennessee, USA


Learning Objectives in this Part of the Lesson

• Understand parallel stream internals

[Figure: a List<String> is split by trySplit() into List<String>1 & List<String>2, which are split again by trySplit() into List<String>1.1, 1.2, 2.1, & 2.2; each chunk is processed sequentially & the partial results are joined]

See www.ibm.com/developerworks/library/j-java-streams-3-brian-goetz


• Understand parallel stream internals, e.g.

  • Know what can change & what can’t

See en.wikipedia.org/wiki/Serenity_Prayer


Why Knowledge of Parallel Streams Matters


• Converting a Java 8 sequential stream to a parallel stream is usually quite straightforward

List<List<SearchResults>> processStream() {
  return getInput()
    .stream()
    .map(this::processInput)
    .collect(toList());
}

vs.

List<List<SearchResults>> processStream() {
  return getInput()
    .parallelStream()
    .map(this::processInput)
    .collect(toList());
}

Changing stream() calls to parallelStream() calls involves minuscule effort!

See “Java 8 SearchWithParallelStreams Example”
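The one-call difference above can be tried in isolation with a minimal sketch. The class name and the `processInput()` body below are hypothetical stand-ins (uppercasing a string), not the search logic from the SearchWithParallelStreams example; only the `stream()` vs. `parallelStream()` call differs between the two methods.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;

public class SequentialVsParallel {
    // Hypothetical stand-in for the lesson's processInput() method.
    static String processInput(String input) {
        return input.toUpperCase(Locale.ROOT);
    }

    // Sequential version: elements are processed by the calling thread.
    static List<String> processSequentially(List<String> inputs) {
        return inputs.stream()
                     .map(SequentialVsParallel::processInput)
                     .collect(Collectors.toList());
    }

    // Parallel version: the only change is stream() -> parallelStream().
    static List<String> processInParallel(List<String> inputs) {
        return inputs.parallelStream()
                     .map(SequentialVsParallel::processInput)
                     .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> inputs = Arrays.asList("do", "re", "mi", "fa");
        System.out.println(processSequentially(inputs));
        System.out.println(processInParallel(inputs));
    }
}
```

Both methods return the same list here because the source is an ordered List and collect(toList()) respects encounter order.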


• However, knowledge of parallel streams internals will make you a better Java 8 streams programmer!


See www.ibm.com/developerworks/library/j-java-streams-3-brian-goetz

When performance is critical, it's important to understand how streams work internally



• Recall the 3 phases of a Java 8 parallel stream

See docs.oracle.com/javase/tutorial/collections/streams/parallelism.html

[Figure: stream pipeline in which input x flows through a stream factory operation (), an intermediate operation (behavior f) with output f(x), an intermediate operation (behavior g) with output g(f(x)), & a terminal operation (reducer)]


• Split – Uses a spliterator to partition stream elements into multiple chunks



• Apply – Independently processes these chunks in the common fork-join pool



• Combine – Joins partial sub-results into a single result




It’s important to know which of these phases you can control & which you can’t!
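The split, apply, & combine phases can be sketched by hand with a spliterator and a fork-join task. This is only an illustration of the idea, not how the streams framework is implemented internally; the class names, the THRESHOLD value, and the summing of string lengths are all assumptions made for the example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Spliterator;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

public class SplitApplyCombine {
    // Hypothetical cut-off: chunks at or below this size are not split further.
    static final long THRESHOLD = 2;

    static final class SumLengths extends RecursiveTask<Integer> {
        private final Spliterator<String> chunk;

        SumLengths(Spliterator<String> chunk) { this.chunk = chunk; }

        @Override
        protected Integer compute() {
            // Split: try to carve a prefix off this chunk if it is large enough.
            Spliterator<String> left =
                chunk.estimateSize() > THRESHOLD ? chunk.trySplit() : null;

            if (left == null) {
                // Apply: process this chunk sequentially.
                int[] sum = {0};
                chunk.forEachRemaining(s -> sum[0] += s.length());
                return sum[0];
            }

            // Process the left part asynchronously & the rest in this thread.
            SumLengths leftTask = new SumLengths(left);
            leftTask.fork();
            int rightSum = new SumLengths(chunk).compute();

            // Combine: join the partial sub-results into a single result.
            return leftTask.join() + rightSum;
        }
    }

    static int totalLength(List<String> words) {
        return ForkJoinPool.commonPool()
                           .invoke(new SumLengths(words.spliterator()));
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("a", "bb", "ccc", "dddd", "eeeee");
        System.out.println(totalLength(words));
    }
}
```

In a real parallel stream the splitting policy and the pool are chosen by the framework, which is why it matters which of these phases a programmer can influence.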


Parallel Stream Splitting & Thread Pool Mechanisms


• A parallel stream’s splitting & thread pool mechanisms are often invisible

[Figure: stream pipeline in which input x flows through a stream factory operation (), an intermediate operation (behavior f) with output f(x), an intermediate operation (behavior g) with output g(f(x)), & a terminal operation (behavior h)]


• Java collections have predefined spliterators


See blog.logentries.com/2015/10/java-8-introduction-to-parallelism-and-spliterator

public interface Collection<E> {
  default Stream<E> stream() {
    return StreamSupport
      .stream(spliterator(), false);
  }

  default Spliterator<E> spliterator() {
    return Spliterators
      .spliterator(this, 0);
  }
}
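To see a predefined spliterator at work, the following minimal sketch asks an ArrayList for its spliterator and splits it once with trySplit(). The class and method names are hypothetical; the point is only that the two resulting spliterators together cover the original elements.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Spliterator;

public class ListSpliteratorDemo {
    // Drain whatever a spliterator still covers into a list.
    static <T> List<T> drain(Spliterator<T> spliterator) {
        List<T> out = new ArrayList<>();
        if (spliterator != null) {
            spliterator.forEachRemaining(out::add);
        }
        return out;
    }

    // Split the list's predefined spliterator once & return both parts.
    static <T> List<List<T>> splitOnce(List<T> list) {
        Spliterator<T> rest = list.spliterator();
        // trySplit() returns a spliterator over a prefix of the elements
        // (or null if it declines to split); 'rest' keeps the remainder.
        Spliterator<T> prefix = rest.trySplit();
        List<List<T>> parts = new ArrayList<>();
        parts.add(drain(prefix));
        parts.add(drain(rest));
        return parts;
    }

    public static void main(String[] args) {
        List<String> words =
            new ArrayList<>(Arrays.asList("a", "b", "c", "d"));
        System.out.println(splitOnce(words));
        System.out.println(words.spliterator()
            .hasCharacteristics(Spliterator.ORDERED));
    }
}
```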


• The common fork-join pool is used by default


See www.baeldung.com/java-fork-join


• However, programmers can customize the behavior of splitting & thread pools


public interface ManagedBlocker {
  boolean block()
    throws InterruptedException;

  boolean isReleasable();
}

public interface Spliterator<T> {
  boolean tryAdvance
    (Consumer<? super T> action);

  Spliterator<T> trySplit();

  long estimateSize();

  int characteristics();
}

See Parts 2 & 4 of this lesson on “Java 8 Parallel Stream Internals”
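As a rough sketch of the thread pool side of this customization, the code below wraps a blocking queue read in a ForkJoinPool.ManagedBlocker, following the pattern shown in the ForkJoinPool.ManagedBlocker Javadoc. The class name, the takeManaged() helper, and the choice of a queue are assumptions for illustration; later parts of the lesson may use a different example.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.LinkedBlockingQueue;

public class ManagedBlockerSketch {
    // Lets the fork-join pool know that a worker may block on queue.take().
    static final class QueueTaker<T> implements ForkJoinPool.ManagedBlocker {
        private final BlockingQueue<T> queue;
        private volatile T item;

        QueueTaker(BlockingQueue<T> queue) { this.queue = queue; }

        @Override
        public boolean block() throws InterruptedException {
            if (item == null) {
                item = queue.take();   // may block the calling thread
            }
            return true;               // no further blocking is needed
        }

        @Override
        public boolean isReleasable() {
            // True if an item is already available without blocking.
            return item != null || (item = queue.poll()) != null;
        }

        T getItem() { return item; }
    }

    // Hypothetical helper: take one item via ForkJoinPool.managedBlock().
    static <T> T takeManaged(BlockingQueue<T> queue) {
        QueueTaker<T> taker = new QueueTaker<>(queue);
        try {
            ForkJoinPool.managedBlock(taker);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return taker.getItem();
    }

    public static void main(String[] args) {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        queue.add("hello");
        System.out.println(takeManaged(queue));
    }
}
```

When managedBlock() is called from a fork-join worker thread, the pool may compensate for the blocked worker; when called from any other thread it simply blocks as needed.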


Parallel Stream Ordering


• The order in which chunks are processed is non-deterministic


See en.wikipedia.org/wiki/Nondeterministic_algorithm

The ordering can exhibit different behaviors on different runs even for the same input
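A small experiment can make this visible. The sketch below (hypothetical class and method names) records the order in which a parallel stream actually visits its elements by using peek() with a concurrent queue. The recorded order may differ from run to run, so no particular output is claimed for it, while the collected result stays the same.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.stream.Collectors;

public class ProcessingOrderDemo {
    // Doubles every element in parallel & records in 'seen' the order in
    // which elements were actually processed.
    static List<Integer> doubleAll(List<Integer> input, Queue<Integer> seen) {
        return input.parallelStream()
                    .peek(seen::add)      // side effect used only for tracing
                    .map(x -> x * 2)
                    .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> input = Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8);
        Queue<Integer> seen = new ConcurrentLinkedQueue<>();
        List<Integer> result = doubleAll(input, seen);

        // Processing order: may vary between runs.
        System.out.println("processed: " + seen);
        // Result order: follows the list's encounter order.
        System.out.println("result:    " + result);
    }
}
```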


• Programmers have little/no control over how chunks are processed



• Non-determinism is useful since it enables optimizations at multiple layers!

e.g., scheduling & execution of tasks via fork-join pool, JVM, hardware cores, etc.

[Figure: software layers, including Applications, Additional Frameworks & Languages, Threading & Synchronization Packages, Java Execution Environment (e.g., JVM), System Libraries, & Operating System Kernel]


• The results of the processing are more deterministic


See en.wikipedia.org/wiki/Deterministic_algorithm


• Programmers can control if results are presented in “encounter order” (EO)


See www.logicbig.com/tutorials/core-java-tutorial/java-util-stream/ordering

EO is the order in which the stream source makes its elements available


• Order is maintained if the source is ordered & the aggregate operations used are obliged to maintain order


See www.ibm.com/developerworks/library/j-java-streams-3-brian-goetz/index.html#eo

It doesn’t matter whether the stream is parallel or sequential


• Ordered spliterators, ordered collections, & static stream factory methods respect “encounter order”


See github.com/douglascraigschmidt/LiveLessons/tree/master/Java8/ex21

List<Integer> list =
  Arrays.asList(1, 2, ...);

Integer[] doubledList = list
  .parallelStream()
  .filter(x -> x % 2 == 0)
  .map(x -> x * 2)
  .toArray(Integer[]::new);

The encounter order is [1, 2, 3, 4, …] since list is ordered


The result must be [4, 8, …], i.e., the even elements [2, 4, …] doubled & kept in encounter order

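The ordered-source case can be checked with a concrete list. This sketch fills in the elided list with the values 1 through 8, which is an assumption made only so the example runs; the pipeline itself matches the one in the slide.

```java
import java.util.Arrays;
import java.util.List;

public class EncounterOrderDemo {
    // Keep the even elements & double them, using a parallel stream.
    static Integer[] doubleEvens(List<Integer> list) {
        return list
            .parallelStream()
            .filter(x -> x % 2 == 0)
            .map(x -> x * 2)
            .toArray(Integer[]::new);
    }

    public static void main(String[] args) {
        // Assumed sample data; the original slide elides the list contents.
        List<Integer> list = Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8);

        // The list is ordered, so the result follows encounter order
        // even though the chunks may be processed in any order.
        System.out.println(Arrays.toString(doubleEvens(list)));
        // prints [4, 8, 12, 16]
    }
}
```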


• Unordered collections don’t need to respect “encounter order”

Set<Integer> set = new
  HashSet<>
    (Arrays.asList(1, 2, ...));

Integer[] doubledSet = set
  .parallelStream()
  .filter(x -> x % 2 == 0)
  .map(x -> x * 2)
  .toArray(Integer[]::new);

See github.com/douglascraigschmidt/LiveLessons/tree/master/Java8/ex21

A HashSet is unordered


This code runs faster since encounter order need not be maintained

See github.com/douglascraigschmidt/LiveLessons/tree/master/Java8/ex21
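One way to observe the difference between the two sources is to ask their spliterators whether they report the ORDERED characteristic. The sketch below (hypothetical names, assumed sample values 1 through 8) does that and runs the same pipeline on a HashSet. It makes no claim about speed, and it treats the HashSet result as a set because its order is not guaranteed.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.Spliterator;

public class UnorderedSourceDemo {
    // Does this collection's spliterator promise an encounter order?
    static boolean isOrdered(Collection<Integer> source) {
        return source.spliterator()
                     .hasCharacteristics(Spliterator.ORDERED);
    }

    // Same pipeline as in the slide, for any collection.
    static Integer[] doubleEvens(Collection<Integer> source) {
        return source
            .parallelStream()
            .filter(x -> x % 2 == 0)
            .map(x -> x * 2)
            .toArray(Integer[]::new);
    }

    public static void main(String[] args) {
        // Assumed sample data; the slide elides the actual contents.
        List<Integer> list = Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8);
        Set<Integer> set = new HashSet<>(list);

        System.out.println("list ordered? " + isOrdered(list));
        System.out.println("set ordered?  " + isOrdered(set));

        // The elements are the same as for the list, but their order
        // in the array is not something the HashSet source guarantees.
        System.out.println(Arrays.toString(doubleEvens(set)));
    }
}
```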


• Certain intermediate operations affect ordering behavior



• e.g., sorted(), unordered(), skip(), & limit()


See github.com/douglascraigschmidt/LiveLessons/tree/master/Java8/ex21

The result must be [4, 8, …], but the code is slow due to the “stateful” semantics of limit() & distinct() in parallel streams

List<Integer> list =
  Arrays.asList(1, 2, ...);

Integer[] doubledList = list
  .parallelStream()
  .distinct()
  .filter(x -> x % 2 == 0)
  .map(x -> x * 2)
  .limit(sOutputLimit)
  .toArray(Integer[]::new);


List<Integer> list =
  Arrays.asList(1, 2, ...);

Integer[] doubledList = list
  .parallelStream()
  .unordered()
  .distinct()
  .filter(x -> x % 2 == 0)
  .map(x -> x * 2)
  .limit(sOutputLimit)
  .toArray(Integer[]::new);

This code runs faster since stream is unordered & thus limit() & distinct() incur less overhead

See github.com/douglascraigschmidt/LiveLessons/tree/master/Java8/ex21
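The semantic difference that unordered() makes to limit() can be sketched as follows. The input range 1 to 100 and the limit of 3 are assumed values, and the class and method names are hypothetical. The sketch shows only which results are permitted; it does not measure the performance difference described above.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;
import java.util.stream.Stream;

public class UnorderedLimitDemo {
    // Same pipeline as in the slides, with unordered() made optional.
    static List<Integer> doubledEvens(List<Integer> input,
                                      long limit,
                                      boolean keepOrder) {
        Stream<Integer> stream = input.parallelStream();
        if (!keepOrder) {
            // Drops the encounter-order constraint for downstream operations.
            stream = stream.unordered();
        }
        return stream
            .distinct()
            .filter(x -> x % 2 == 0)
            .map(x -> x * 2)
            .limit(limit)
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Assumed sample data: the integers 1 through 100.
        List<Integer> input = IntStream.rangeClosed(1, 100)
                                       .boxed()
                                       .collect(Collectors.toList());

        // Ordered: limit() must return the first 3 results in encounter order.
        System.out.println(doubledEvens(input, 3, true));

        // Unordered: limit() may return any 3 results, so this can vary.
        System.out.println(doubledEvens(input, 3, false));
    }
}
```

With the ordered stream, limit(3) is obliged to yield 4, 8, & 12; with the unordered stream, any three doubled even values satisfy the contract.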


• Certain terminal operations also affect ordering behavior



• e.g., forEachOrdered() & forEach()

List<Integer> list =
  Arrays.asList(1, 2, ...);

ConcurrentLinkedQueue<Integer> queue =
  new ConcurrentLinkedQueue<>();

list
  .parallelStream()
  .distinct()
  .filter(x -> x % 2 == 0)
  .map(x -> x * 2)
  .limit(sOutputLimit)
  .forEachOrdered(queue::add);

Ordered

See github.com/douglascraigschmidt/LiveLessons/tree/master/Java8/ex21


List<Integer> list =
  Arrays.asList(1, 2, ...);

ConcurrentLinkedQueue<Integer> queue =
  new ConcurrentLinkedQueue<>();

list
  .parallelStream()
  .distinct()
  .filter(x -> x % 2 == 0)
  .map(x -> x * 2)
  .limit(sOutputLimit)
  .forEach(queue::add);

Unordered
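The two terminal operations can be compared side by side in a minimal sketch. It uses assumed sample values 1 through 8 and hypothetical names, and it omits distinct() & limit() to keep the focus on the terminal operation.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.stream.Stream;

public class ForEachOrderingDemo {
    // Add the doubled even elements to a queue, either in encounter
    // order (forEachOrdered) or in whatever order they finish (forEach).
    static Queue<Integer> collectDoubledEvens(List<Integer> list,
                                              boolean ordered) {
        Queue<Integer> queue = new ConcurrentLinkedQueue<>();
        Stream<Integer> stream = list
            .parallelStream()
            .filter(x -> x % 2 == 0)
            .map(x -> x * 2);

        if (ordered) {
            stream.forEachOrdered(queue::add);
        } else {
            stream.forEach(queue::add);
        }
        return queue;
    }

    public static void main(String[] args) {
        // Assumed sample data; the slides elide the list contents.
        List<Integer> list = Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8);

        // Always in encounter order: [4, 8, 12, 16]
        System.out.println(collectDoubledEvens(list, true));

        // Same elements, but the order may vary between runs.
        System.out.println(collectDoubledEvens(list, false));
    }
}
```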


End of Java 8 Parallel Stream Internals (Part 1)

