In Java 8 streams, a pipeline is a sequence of operations performed on a stream. A pipeline consists of a source (such as a collection or an array), zero or more intermediate operations, such as filter, map, and sorted, and one terminal operation, such as forEach, toArray, count, or collect.

Intermediate operations

An intermediate operation is an operation that returns another stream, allowing you to chain multiple operations together. For example, the filter() method returns a new stream containing only the elements that match the given predicate, and the map() method returns a new stream containing the results of applying a given function to each element.

Intermediate operations are typically used to filter, transform, and reorder the elements of a stream. They are lazy: calling one does not process any data, it only describes the computation to be done. The computation is performed only when a terminal operation is called on the stream.
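
The laziness described above can be observed directly. The following is a minimal sketch, not a definitive implementation: the class name LazyPipelineDemo and the log list are hypothetical names introduced for illustration. It records when the filter predicate actually runs relative to the moment the pipeline is built.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical demo class; the name is ours, not part of any library.
class LazyPipelineDemo {

    // Returns a log of events in the order they happened.
    static List<String> run() {
        List<String> log = new ArrayList<>();

        // Building the pipeline: the predicate passed to filter() does not run here.
        Stream<Integer> pending = Arrays.asList(1, 2, 3).stream()
                .filter(n -> {
                    log.add("filter " + n);
                    return n > 1;
                });
        log.add("pipeline built");

        // The terminal operation collect() triggers the traversal.
        List<Integer> result = pending.collect(Collectors.toList());
        log.add("result = " + result);
        return log;
    }

    public static void main(String[] args) {
        // Prints "pipeline built" first, then the three filter calls, then the result.
        run().forEach(System.out::println);
    }
}
```

The entry "pipeline built" appears before any "filter" entry, which shows that the predicate is evaluated only once collect() is called.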

Examples of intermediate operations in Java 8 streams:

  • filter(Predicate<T> predicate): Returns a new stream containing only the elements that match the given predicate.
  • map(Function<T, R> mapper): Returns a new stream containing the results of applying a given function to each element.
  • flatMap(Function<T, Stream<R>> mapper): Returns a new stream containing the results of applying a given function to each element and then flattening the resulting streams.
  • distinct(): Returns a new stream containing only the distinct elements.
  • sorted(): Returns a new stream containing the elements sorted according to their natural order.
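  • peek(Consumer<T> action): Returns a new stream with the same elements that also performs the given action on each element as it is consumed from the resulting stream. It is mainly useful for debugging.
  • limit(long maxSize): Returns a new stream truncated to at most maxSize elements.
  • skip(long n): Returns a new stream that discards the first n elements and keeps the rest.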
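
Several of the operations listed above can be chained in one pipeline. The sketch below is illustrative only: the class name IntermediateOpsDemo and the sample data are assumptions made for this example. The comment next to each step shows the elements that leave it.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical demo class with made-up sample data.
class IntermediateOpsDemo {

    static List<String> run() {
        List<List<String>> batches = Arrays.asList(
                Arrays.asList("pear", "apple"),
                Arrays.asList("fig", "apple", "kiwi"));

        return batches.stream()
                .flatMap(List::stream)      // pear, apple, fig, apple, kiwi
                .distinct()                 // pear, apple, fig, kiwi
                .sorted()                   // apple, fig, kiwi, pear
                .skip(1)                    // fig, kiwi, pear
                .limit(2)                   // fig, kiwi
                .map(String::toUpperCase)   // FIG, KIWI
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints [FIG, KIWI]
    }
}
```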

Terminal operations

A terminal operation is an operation that produces a non-stream result or a side effect. Calling a terminal operation triggers execution of the pipeline and consumes the stream, so no further operations can be called on it; attempting to reuse the stream throws an IllegalStateException.

Examples of terminal operations in Java 8 streams:

  • forEach(Consumer<T> action): Performs the given action on each element in the stream.
  • toArray(): Returns an array containing the elements of the stream.
  • count(): Returns the number of elements in the stream.
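  • min(Comparator<T> comparator): Returns an Optional containing the minimum element according to the provided comparator, or an empty Optional if the stream is empty.
  • max(Comparator<T> comparator): Returns an Optional containing the maximum element according to the provided comparator, or an empty Optional if the stream is empty.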
  • findFirst(): Returns the first element in the stream, or an empty Optional if the stream is empty.
  • findAny(): Returns an arbitrary element from the stream, or an empty Optional if the stream is empty.
  • reduce(T identity, BinaryOperator<T> accumulator): Combines the elements of the stream into a single value, starting from the identity value and repeatedly applying the accumulator function. Returns the identity if the stream is empty.
  • collect(Collector<T, A, R> collector): Accumulates the elements of the stream into a container, such as a List, Set, or Map.
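
A few of the terminal operations above, side by side. This is a small sketch under stated assumptions: the class TerminalOpsDemo, its method names, and the sample numbers are hypothetical and exist only for illustration.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.stream.Collectors;

// Hypothetical demo class; each method builds a fresh stream because
// a stream cannot be reused after a terminal operation.
class TerminalOpsDemo {

    static final List<Integer> NUMBERS = Arrays.asList(4, 8, 15, 16, 23, 42);

    // reduce: folds all elements into one value, starting from the identity 0.
    static int sum() {
        return NUMBERS.stream().reduce(0, Integer::sum);
    }

    // max: returns an Optional because the stream might be empty.
    static Optional<Integer> largest() {
        return NUMBERS.stream().max(Comparator.naturalOrder());
    }

    // max on an empty stream yields an empty Optional.
    static Optional<Integer> largestAbove(int threshold) {
        return NUMBERS.stream().filter(n -> n > threshold).max(Comparator.naturalOrder());
    }

    // collect: groups the elements into a Map keyed by "is even".
    static Map<Boolean, List<Integer>> byParity() {
        return NUMBERS.stream().collect(Collectors.partitioningBy(n -> n % 2 == 0));
    }

    public static void main(String[] args) {
        System.out.println(sum());                           // 108
        System.out.println(largest().get());                 // 42
        System.out.println(largestAbove(100).isPresent());   // false
        System.out.println(byParity().get(true));            // [4, 8, 16, 42]
    }
}
```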

Terminal operations are typically used to produce a result or a side-effect from the stream, such as printing the elements, counting the number of elements, or reducing the elements to a single value.

It’s important to note that terminal operations are eager: they execute the pipeline and consume the stream, producing the final result or side effect. Because nothing runs until a terminal operation is called, work is done only when a result is actually needed, and the elements can typically be processed in a single pass without building an intermediate collection for each step.

Also, some terminal operations, such as findFirst, findAny, anyMatch, allMatch, and noneMatch, are short-circuiting: they stop processing the stream as soon as the result is determined. This makes them more efficient on large data sets and lets them terminate even on an infinite stream.
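
Short-circuiting is easiest to see on an infinite stream. The following is a minimal sketch: the class name ShortCircuitDemo, the method name, and the visited list are hypothetical, introduced only to record which elements were actually pulled from the source.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.stream.Stream;

// Hypothetical demo class showing that findFirst() stops early.
class ShortCircuitDemo {

    // Searches the infinite sequence 1, 2, 3, ... for the first multiple of 7.
    // Every element pulled from the source is appended to 'visited'.
    static Optional<Integer> firstMultipleOfSeven(List<Integer> visited) {
        return Stream.iterate(1, n -> n + 1)
                .peek(visited::add)
                .filter(n -> n % 7 == 0)
                .findFirst();
    }

    public static void main(String[] args) {
        List<Integer> visited = new ArrayList<>();
        System.out.println(firstMultipleOfSeven(visited).get()); // 7
        System.out.println(visited);                             // [1, 2, 3, 4, 5, 6, 7]
    }
}
```

With a terminal operation that is not short-circuiting, such as forEach, the same infinite source would never finish.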

An example of a pipeline:

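import java.util.Arrays;
import java.util.List;

List<Integer> numbers = Arrays.asList(1, 2, 3, 4, 5);
numbers.stream()
       .filter(n -> n > 2)             // keeps 3, 4, 5
       .map(n -> n * 2)                // produces 6, 8, 10
       .forEach(System.out::println);  // prints 6, 8, and 10, one per line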

In this example, we first create a stream of numbers. We then apply the intermediate operation filter() to keep only the numbers greater than 2, apply the intermediate operation map() to double the remaining numbers, and finally apply the terminal operation forEach() to print the results: 6, 8, and 10.

The pipeline is executed only when a terminal operation is called. Each element then passes through the intermediate operations in the order in which they were chained: in a sequential stream without stateful operations, one element travels through the whole chain before the next element is processed. This allows you to perform complex data processing tasks in a clear and readable way, without the need for explicit loops or nested conditional statements.
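
To see the order in which a sequential pipeline actually touches its elements, the example pipeline above can be instrumented. This is a sketch only: the class name TraversalOrderDemo and the log list are hypothetical additions for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class: same filter/map steps as the example pipeline,
// with every call recorded in a log.
class TraversalOrderDemo {

    static List<String> run() {
        List<String> log = new ArrayList<>();
        Arrays.asList(1, 2, 3, 4, 5).stream()
                .filter(n -> {
                    log.add("filter " + n);
                    return n > 2;
                })
                .map(n -> {
                    log.add("map " + n);
                    return n * 2;
                })
                .forEach(n -> log.add("forEach " + n));
        return log;
    }

    public static void main(String[] args) {
        // The log starts: filter 1, filter 2, filter 3, map 3, forEach 6, filter 4, ...
        run().forEach(System.out::println);
    }
}
```

The log shows that element 3 is filtered, mapped, and printed before element 4 is even examined; the stream does not run filter() over the whole list first.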

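It’s important to note that some intermediate operations, such as sorted() and distinct(), are stateful: they keep internal state about the elements seen so far. They are still lazy, so nothing runs before the terminal operation is called, but an operation like sorted() must see every element of its input before it can pass any result downstream.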
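
The behavior of a stateful operation can be checked with peek() on both sides of sorted(). The following is a minimal sketch for a sequential stream; the class name StatefulOpDemo and the log list are hypothetical names used only for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Stream;

// Hypothetical demo class: sorted() buffers its whole input before emitting.
class StatefulOpDemo {

    static List<String> run() {
        List<String> log = new ArrayList<>();

        Stream<Integer> pending = Arrays.asList(3, 1, 2).stream()
                .peek(n -> log.add("before sorted " + n))
                .sorted()
                .peek(n -> log.add("after sorted " + n));

        // No element has been processed at this point, stateful or not.
        log.add("nothing has run yet");

        // The terminal operation starts the traversal.
        pending.forEach(n -> { });
        return log;
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

In the resulting log, all three "before sorted" entries (3, 1, 2) come before the first "after sorted" entry (1, 2, 3), and all of them come after "nothing has run yet".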

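Pipelines can also be executed in parallel. Calling parallelStream() on a collection instead of stream(), or calling parallel() on an existing stream, lets the pipeline run on multiple threads. This can improve performance on large data sets, but the gain is not guaranteed: splitting the work and merging the results has a cost, and the functions passed to the pipeline must not depend on shared mutable state.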
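
As a rough illustration, the same pipeline can be run sequentially or in parallel by changing only how the stream is created. This is a sketch under stated assumptions: the class ParallelDemo and the method sumOfSquares are hypothetical, and the example makes no claim about which variant is faster on a given machine.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;
import java.util.stream.Stream;

// Hypothetical demo class comparing sequential and parallel execution.
class ParallelDemo {

    // The lambda is stateless and the reduction (sum) is associative,
    // so both modes produce the same result.
    static long sumOfSquares(List<Integer> values, boolean parallel) {
        Stream<Integer> stream = parallel ? values.parallelStream() : values.stream();
        return stream.mapToLong(n -> (long) n * n).sum();
    }

    public static void main(String[] args) {
        List<Integer> values = IntStream.rangeClosed(1, 1000)
                .boxed()
                .collect(Collectors.toList());

        System.out.println(sumOfSquares(values, false)); // 333833500
        System.out.println(sumOfSquares(values, true));  // 333833500
    }
}
```

For a data set this small, the parallel version is unlikely to be faster; measuring on realistic data is the only reliable way to decide.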
