Using Streams


Streams

A Stream represents a sequence of elements and supports different kinds of operations to perform computations upon those elements. Since Java 8, the Collection interface has two methods to generate a Stream: stream() and parallelStream(). Stream operations are either intermediate or terminal. Intermediate operations return a Stream, so multiple intermediate operations can be chained before a terminal operation ends the pipeline. Terminal operations are either void or return a non-stream result.
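As a rough illustration of the intermediate/terminal distinction described above, the following sketch builds one pipeline from stream() and one from parallelStream(). The class and method names (StreamBasicsSketch, evenSquares, countEvens) are made up for this example and are not part of any library.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper class, used only for illustration.
class StreamBasicsSketch {

    // Chains two intermediate operations and finishes with a terminal one.
    static List<Integer> evenSquares(List<Integer> numbers) {
        Stream<Integer> evens = numbers.stream()          // Collection.stream() creates the Stream
                .filter(n -> n % 2 == 0);                 // intermediate: returns a Stream
        Stream<Integer> squares = evens.map(n -> n * n);  // intermediate: returns a Stream
        return squares.collect(Collectors.toList());      // terminal: returns a List, not a Stream
    }

    // Same pipeline shape, started with parallelStream(); count() is the terminal operation.
    static long countEvens(List<Integer> numbers) {
        return numbers.parallelStream()
                .filter(n -> n % 2 == 0)
                .count();
    }

    public static void main(String[] args) {
        List<Integer> numbers = Arrays.asList(1, 2, 3, 4, 5, 6);
        System.out.println(evenSquares(numbers)); // [4, 16, 36]
        System.out.println(countEvens(numbers));  // 3
    }
}
```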


Using Streams

A Stream is a sequence of elements upon which sequential and parallel aggregate operations can be performed. Any given Stream can potentially have an unlimited amount of data flowing through it. As a result, data received from a Stream is processed element by element as it arrives, as opposed to batch processing all of the data at once. When combined with lambda expressions, Streams provide a concise way to perform operations on sequences of data using a functional approach.
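To make the "unlimited amount of data" point concrete, here is a minimal sketch that draws a few elements from an unbounded Stream created with Stream.iterate. The names InfiniteStreamSketch and firstPowersOfTwo are invented for this example.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper class, used only for illustration.
class InfiniteStreamSketch {

    // Returns the first `count` powers of two, taken from an unbounded Stream.
    static List<Integer> firstPowersOfTwo(int count) {
        return Stream.iterate(1, n -> n * 2)   // unbounded source: 1, 2, 4, 8, ...
                .limit(count)                  // only `count` elements are ever produced
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(firstPowersOfTwo(5)); // [1, 2, 4, 8, 16]
    }
}
```

Because elements are produced only as they are requested, the pipeline finishes even though the source itself never ends.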

Example:

Stream<String> fruitStream = Stream.of("apple", "banana", "pear", "kiwi", "orange");
fruitStream.filter(s -> s.contains("a"))
       .map(String::toUpperCase)
       .sorted()
       .forEach(System.out::println);

Output:

APPLE
BANANA
ORANGE
PEAR

The operations performed by the above code can be summarized as follows:

  • Create a sequential, ordered Stream<String> of fruit String elements using the static factory method Stream.of(values).
  • The filter() operation retains only elements that match a given predicate (the elements that when tested by the predicate return true). In this case, it retains the elements containing an "a". The predicate is given as a lambda expression.
  • The map() operation transforms each element using a given function, called a mapper. In this case, each fruit String is mapped to its uppercase String version using the method reference String::toUpperCase.
    • Note that the map() operation will return a stream with a different generic type if the mapping function returns a type different from its input parameter. For example, calling .map(String::isEmpty) on a Stream<String> returns a Stream<Boolean>.
  • The sorted() operation sorts the elements of the Stream according to their natural ordering (lexicographically, in the case of String).
  • Finally, the forEach(action) operation performs an action on each element of the Stream, passing it to a Consumer. In the example, each element is simply printed to the console. This is a terminal operation, so the Stream cannot be operated on again afterwards.
    • Note that the operations defined on the Stream are performed only because of the terminal operation. Without a terminal operation, the stream is not processed. Streams cannot be reused: once a terminal operation has been called, the Stream object becomes unusable.
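The notes in the list above can be checked with a small sketch. It shows map() changing the element type, a pipeline without a terminal operation processing nothing, and a second terminal operation failing with an IllegalStateException. All class and method names here are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper class, used only for illustration.
class StreamLifecycleSketch {

    // map() with String::isEmpty turns a Stream<String> into a Stream<Boolean>.
    static List<Boolean> emptiness(String... values) {
        Stream<Boolean> flags = Arrays.stream(values).map(String::isEmpty);
        return flags.collect(Collectors.toList());
    }

    // Counts how many elements peek() has seen when no terminal operation is called.
    static int elementsSeenWithoutTerminalOp() {
        List<String> seen = new ArrayList<>();
        Stream.of("apple", "banana").peek(seen::add); // pipeline is built but never run
        return seen.size();
    }

    // Returns true if a second terminal operation on the same Stream is rejected.
    static boolean reuseFails() {
        Stream<String> fruits = Stream.of("apple", "banana");
        fruits.forEach(s -> { });       // first terminal operation consumes the Stream
        try {
            fruits.forEach(s -> { });   // second terminal operation on the same object
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(emptiness("", "kiwi"));           // [true, false]
        System.out.println(elementsSeenWithoutTerminalOp()); // 0
        System.out.println(reuseFails());                    // true
    }
}
```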

Closing Streams

  • Note that a Stream generally does not have to be closed. It is only required to close streams that operate on IO channels. Most Stream types don't operate on resources and therefore don't require closing.

The Stream interface extends AutoCloseable. Streams can be closed by calling the close() method or by using a try-with-resources statement.

An example use case where a Stream should be closed is when you create a Stream of lines from a file:

try (Stream<String> lines = Files.lines(Paths.get("somePath"))) {
     lines.forEach(System.out::println);
}

The Stream interface also declares the Stream.onClose() method which allows you to register Runnable handlers which will be called when the stream is closed. An example use case is where code which produces a stream needs to know when it is consumed to perform some cleanup.

public Stream<String> streamAndDelete(Path path) throws IOException {
    return Files.lines(path).onClose(() -> someClass.deletePath(path));
}

The close handler will only execute if the close() method gets called, either explicitly or implicitly by a try-with-resources statement. Running a terminal operation does not by itself trigger it.
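The following sketch illustrates this behaviour without touching the file system: an in-memory Stream registers a handler with onClose(), and a flag records when the handler runs. The names OnCloseSketch and closeHandlerRuns are assumptions made for this example.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.stream.Stream;

// Hypothetical helper class, used only for illustration.
class OnCloseSketch {

    // Returns { handler ran right after the terminal operation, handler ran after close() }.
    static boolean[] closeHandlerRuns() {
        AtomicBoolean closed = new AtomicBoolean(false);
        boolean afterTerminalOp;
        try (Stream<String> letters = Stream.of("a", "b").onClose(() -> closed.set(true))) {
            letters.forEach(x -> { });        // terminal operation; does not close the Stream
            afterTerminalOp = closed.get();   // still false at this point
        }                                     // try-with-resources calls close() here
        return new boolean[] { afterTerminalOp, closed.get() };
    }

    public static void main(String[] args) {
        boolean[] result = closeHandlerRuns();
        System.out.println(result[0]); // false
        System.out.println(result[1]); // true
    }
}
```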

Processing Order

A Stream object's processing can be sequential or parallel.

In sequential mode, the elements are processed in the order of the source of the Stream. If the source is ordered (such as a List or the key set of a SortedMap), the processing is guaranteed to match the ordering of the source. In other cases, however, care should be taken not to depend on the ordering (see: "Is the Java HashMap keySet() iteration order consistent?").

Example:

List<Integer> integerList = Arrays.asList(0, 1, 2, 3, 42);
 
// sequential
long howManyOddNumbers = integerList.stream()
       .filter(e -> (e % 2) == 1)
       .count();
 
System.out.println(howManyOddNumbers); // Output: 2

Parallel mode allows the use of multiple threads on multiple cores but there is no guarantee of the order in which elements are processed.

If multiple methods are called on a sequential Stream, not every method has to be invoked. For example, if a Stream is filtered and the number of elements is reduced to one, a subsequent call to a method such as sorted() may not need to do any work. This can increase the performance of a sequential Stream, an optimization that is not possible with a parallel Stream.

Example:

// parallel
long howManyOddNumbersParallel = integerList.parallelStream()
      .filter(e -> (e % 2) == 1)
      .count();
 
System.out.println(howManyOddNumbersParallel); // Output: 2
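Although the order in which a parallel Stream processes elements is not guaranteed, the result can still follow the source's encounter order when the terminal operation respects it. The sketch below, using made-up names (ParallelOrderSketch, doubled, visitInEncounterOrder), relies on collect(Collectors.toList()) and forEachOrdered(), both of which preserve encounter order for an ordered source such as a List.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper class, used only for illustration.
class ParallelOrderSketch {

    // Elements may be mapped on different threads, but the collected List keeps source order.
    static List<Integer> doubled(List<Integer> source) {
        return source.parallelStream()
                .map(n -> n * 2)
                .collect(Collectors.toList());
    }

    // forEachOrdered() visits elements in encounter order even on a parallel Stream;
    // plain forEach() would give no such guarantee.
    static List<Integer> visitInEncounterOrder(List<Integer> source) {
        List<Integer> visited = Collections.synchronizedList(new ArrayList<>());
        source.parallelStream().forEachOrdered(visited::add);
        return visited;
    }

    public static void main(String[] args) {
        List<Integer> integerList = Arrays.asList(0, 1, 2, 3, 42);
        System.out.println(doubled(integerList));               // [0, 2, 4, 6, 84]
        System.out.println(visitInEncounterOrder(integerList)); // [0, 1, 2, 3, 42]
    }
}
```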

Differences from Containers (or Collections)

While some actions can be performed on both Containers and Streams, they ultimately serve different purposes and support different operations. Containers focus on how the elements are stored and how those elements can be accessed efficiently. A Stream, on the other hand, doesn't provide direct access to its elements or a way to manipulate them individually; it treats the group of objects as a collective entity and performs operations on that entity as a whole. Stream and Collection are separate high-level abstractions for these differing purposes.
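A small sketch of this contrast, with hypothetical names (CollectionVsStreamSketch, firstElement, upperCased): the List is accessed directly by index, while the Stream describes a computation over all elements and leaves the source collection untouched.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper class, used only for illustration.
class CollectionVsStreamSketch {

    // A Collection such as List offers direct access to individual elements.
    static String firstElement(List<String> fruits) {
        return fruits.get(0);
    }

    // A Stream has no get(index); it processes the elements as a whole
    // and produces a new result without modifying the source List.
    static List<String> upperCased(List<String> fruits) {
        return fruits.stream()
                .map(String::toUpperCase)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> fruits = Arrays.asList("apple", "kiwi");
        System.out.println(firstElement(fruits)); // apple
        System.out.println(upperCased(fruits));   // [APPLE, KIWI]
        System.out.println(fruits);               // [apple, kiwi]
    }
}
```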

Basic Programs