What Java 8 Stream.collect equivalents are available in the standard Kotlin library?

In Java 8, there is Stream.collect which allows aggregations on collections. In Kotlin, this does not exist in the same way, other than maybe as a collection of extension functions in the stdlib. But it isn't clear what the equivalences are for different use cases.

For example, at the top of the JavaDoc for Collectors are examples written for Java 8, and when porting them to Kotlin you can't use the Java 8 classes when on a different JDK version, so likely they should be written differently.

In terms of resources online showing examples of Kotlin collections, they are typically trivial and don't really compare to the same use cases. What are good examples that really match the cases such as documented for Java 8 Stream.collect? The list there is:

  • Accumulate names into a List
  • Accumulate names into a TreeSet
  • Convert elements to strings and concatenate them, separated by commas
  • Compute sum of salaries of employee
  • Group employees by department
  • Compute sum of salaries by department
  • Partition students into passing and failing

With details in the JavaDoc linked above.

Note: this question is intentionally written and answered by the author (Self-Answered Questions), so that the idiomatic answers to commonly asked Kotlin topics are present in SO. Also to clarify some really old answers written for alphas of Kotlin that are not accurate for current-day Kotlin.


Solution 1:

There are functions in the Kotlin stdlib for average, count, distinct, filtering, finding, grouping, joining, mapping, min, max, partitioning, slicing, sorting, summing, to/from arrays, to/from lists, to/from maps, union, co-iteration, all the functional paradigms, and more. So you can use those to create little one-liners, and there is no need to use the more complicated syntax of Java 8.

I think the only thing missing from the built-in Java 8 Collectors class is summarization (but in another answer to this question is a simple solution).

One thing missing from both is batching by count, which is seen in another Stack Overflow answer and has a simple answer as well. Another interesting case is this one also from Stack Overflow: Idiomatic way to spilt sequence into three lists using Kotlin. And if you want to create something like Stream.collect for another purpose, see Custom Stream.collect in Kotlin

EDIT 11.08.2017: Chunked/windowed collection operations were added in Kotlin 1.2 M2, see https://blog.jetbrains.com/kotlin/2017/08/kotlin-1-2-m2-is-out/
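As a rough illustration of those operations, here is a minimal sketch assuming Kotlin 1.2 or later. The helper names inBatchesOf and slidingWindowsOf are made up for this example; chunked and windowed are the stdlib functions.

```kotlin
// Hypothetical helper names; chunked() and windowed() come from the stdlib (Kotlin 1.2+).
fun <T> inBatchesOf(size: Int, items: List<T>): List<List<T>> =
    items.chunked(size)      // non-overlapping batches; the last one may be shorter

fun <T> slidingWindowsOf(size: Int, items: List<T>): List<List<T>> =
    items.windowed(size)     // overlapping windows, step 1, partial windows dropped

fun main() {
    val items = listOf(1, 2, 3, 4, 5)
    println(inBatchesOf(2, items))       // [[1, 2], [3, 4], [5]]
    println(slidingWindowsOf(3, items))  // [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
}
```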


It is always good to explore the API Reference for kotlin.collections as a whole before creating new functions that might already exist there.

Here are some conversions from Java 8 Stream.collect examples to the equivalent in Kotlin:

Accumulate names into a List

// Java:  
List<String> list = people.stream().map(Person::getName).collect(Collectors.toList());
// Kotlin:
val list = people.map { it.name }  // toList() not needed
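The question also lists "Accumulate names into a TreeSet", which has no conversion here. A minimal sketch of two ways it might be done, assuming a hypothetical Person class with a name property (not taken from the original examples):

```kotlin
import java.util.SortedSet
import java.util.TreeSet

// Hypothetical type for illustration only.
data class Person(val name: String)

// toSortedSet() returns a java.util.SortedSet ordered by natural order.
fun namesAsSortedSet(people: List<Person>): SortedSet<String> =
    people.map { it.name }.toSortedSet()

// mapTo() writes straight into a destination collection you choose.
fun namesAsTreeSet(people: List<Person>): TreeSet<String> =
    people.mapTo(TreeSet<String>()) { it.name }

fun main() {
    val people = listOf(Person("Max"), Person("Ann"), Person("Max"))
    println(namesAsSortedSet(people)) // [Ann, Max]
    println(namesAsTreeSet(people))   // [Ann, Max]
}
```

The mapTo variant skips the intermediate List, which is closer to what Collectors.toCollection(TreeSet::new) does in Java.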

Convert elements to strings and concatenate them, separated by commas

// Java:
String joined = things.stream()
                       .map(Object::toString)
                       .collect(Collectors.joining(", "));
// Kotlin:
val joined = things.joinToString(", ")

Compute sum of salaries of employee

// Java:
int total = employees.stream()
                      .collect(Collectors.summingInt(Employee::getSalary));
// Kotlin:
val total = employees.sumBy { it.salary }  // on Kotlin 1.4+, prefer sumOf { it.salary }; sumBy is deprecated since 1.5

Group employees by department

// Java:
Map<Department, List<Employee>> byDept
     = employees.stream()
                .collect(Collectors.groupingBy(Employee::getDepartment));
// Kotlin:
val byDept = employees.groupBy { it.department }

Compute sum of salaries by department

// Java:
Map<Department, Integer> totalByDept
     = employees.stream()
                .collect(Collectors.groupingBy(Employee::getDepartment,
                     Collectors.summingInt(Employee::getSalary)));
// Kotlin:
val totalByDept = employees.groupBy { it.department }.mapValues { it.value.sumBy { it.salary }}

Partition students into passing and failing

// Java:
Map<Boolean, List<Student>> passingFailing =
     students.stream()
             .collect(Collectors.partitioningBy(s -> s.getGrade() >= PASS_THRESHOLD));
// Kotlin:
val passingFailing = students.partition { it.grade >= PASS_THRESHOLD }
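Note that partition returns a Pair of lists rather than a Map<Boolean, List<Student>>, so the result is usually destructured. A minimal sketch, assuming a hypothetical Student class and a threshold of 60:

```kotlin
// Hypothetical type and threshold, for illustration only.
data class Student(val name: String, val grade: Int)
const val PASS_THRESHOLD = 60

// first = elements matching the predicate, second = the rest
fun splitByPassing(students: List<Student>): Pair<List<Student>, List<Student>> =
    students.partition { it.grade >= PASS_THRESHOLD }

fun main() {
    val students = listOf(Student("Ann", 90), Student("Bob", 40), Student("Cy", 60))
    val (passing, failing) = splitByPassing(students)
    println(passing.map { it.name }) // [Ann, Cy]
    println(failing.map { it.name }) // [Bob]
}
```

If a Map keyed by Boolean is really needed, students.groupBy { it.grade >= PASS_THRESHOLD } produces one, but a key is absent when no element maps to it.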

Names of male members

// Java:
List<String> namesOfMaleMembers = roster
    .stream()
    .filter(p -> p.getGender() == Person.Sex.MALE)
    .map(p -> p.getName())
    .collect(Collectors.toList());
// Kotlin:
val namesOfMaleMembers = roster.filter { it.gender == Person.Sex.MALE }.map { it.name }

Group names of members in roster by gender

// Java:
Map<Person.Sex, List<String>> namesByGender =
      roster.stream().collect(
        Collectors.groupingBy(
            Person::getGender,                      
            Collectors.mapping(
                Person::getName,
                Collectors.toList())));
// Kotlin:
val namesByGender = roster.groupBy { it.gender }.mapValues { it.value.map { it.name } }   

Filter a list to another list

// Java:
List<String> filtered = items.stream()
    .filter( item -> item.startsWith("o") )
    .collect(Collectors.toList());
// Kotlin:
val filtered = items.filter { it.startsWith('o') } 

Finding shortest string in a list

// Java:
String shortest = items.stream()
    .min(Comparator.comparing(item -> item.length()))
    .get();
// Kotlin:
val shortest = items.minBy { it.length }

Counting items in a list after filter is applied

// Java:
long count = items.stream().filter( item -> item.startsWith("t")).count();
// Kotlin:
val count = items.filter { it.startsWith('t') }.size
// but better to not filter, but count with a predicate
val count = items.count { it.startsWith('t') }

and on it goes... In all cases, no special fold, reduce, or other functionality was required to mimic Stream.collect. If you have further use cases, add them in comments and we can see!

About laziness

If you want to lazily process a chain, you can convert to a Sequence using asSequence() before the chain. At the end of the chain of functions, you usually end up with a Sequence as well. Then you can use toList(), toSet(), toMap() or some other function to materialize the Sequence at the end.

// switch to and from lazy
val someList = items.asSequence().filter { ... }.take(10).map { ... }.toList()

// switch to lazy; sorted() on a Sequence returns another Sequence, so materialize at the end here too
val someList = items.asSequence().filter { ... }.take(10).map { ... }.sorted().toList()
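To see what laziness buys, here is a small sketch that records which elements are actually pulled through the chain. The function name and the log parameter are invented for this illustration.

```kotlin
// Hypothetical helper: `log` collects every element the sequence actually reads.
fun firstTwoEvenSquares(items: List<Int>, log: MutableList<Int>): List<Int> =
    items.asSequence()
        .onEach { log.add(it) }     // side effect only, to observe evaluation
        .filter { it % 2 == 0 }
        .map { it * it }
        .take(2)                    // stops pulling once two results exist
        .toList()

fun main() {
    val log = mutableListOf<Int>()
    println(firstTwoEvenSquares((1..10).toList(), log)) // [4, 16]
    println(log)                                        // [1, 2, 3, 4]
}
```

Without asSequence(), filter and map would each walk all ten elements and build an intermediate list before take(2) runs.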

Why are there no Types?!?

You will notice the Kotlin examples do not specify the types. This is because Kotlin infers them and is still completely type safe at compile time, more so than Java, because it also has nullable types that can help prevent the dreaded NPE. So this in Kotlin:

val someList = people.filter { it.age <= 30 }.map { it.name }

is the same as:

val someList: List<String> = people.filter { it.age <= 30 }.map { it.name }

Because Kotlin knows what people is, and that people.age is Int therefore the filter expression only allows comparison to an Int, and that people.name is a String therefore the map step produces a List<String> (readonly List of String).

Now, if people were possibly null, as in a List<Person>?, then:

val someList = people?.filter { it.age <= 30 }?.map { it.name }

Returns a List<String>? that would need to be null checked (or use one of the other Kotlin operators for nullable values, see this Kotlin idiomatic way to deal with nullable values and also Idiomatic way of handling nullable or empty list in Kotlin)
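One common way to avoid carrying the nullable result forward is to normalize a null list to an empty one with orEmpty(). A minimal sketch, assuming a hypothetical Person class with name and age:

```kotlin
// Hypothetical type for illustration only.
data class Person(val name: String, val age: Int)

// orEmpty() turns a null list into an empty list, so the result is a non-null List<String>.
fun youngNames(people: List<Person>?): List<String> =
    people.orEmpty().filter { it.age <= 30 }.map { it.name }

fun main() {
    println(youngNames(null))                                          // []
    println(youngNames(listOf(Person("Max", 18), Person("Al", 45))))   // [Max]
}
```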

See also:

  • API Reference for extension functions for Iterable
  • API reference for extension functions for Array
  • API reference for extension functions for List
  • API reference for extension functions to Map

Solution 2:

For additional examples, here are all the samples from Java 8 Stream Tutorial converted to Kotlin. The title of each example is derived from the source article:

How streams work

// Java:
List<String> myList = Arrays.asList("a1", "a2", "b1", "c2", "c1");

myList.stream()
      .filter(s -> s.startsWith("c"))
      .map(String::toUpperCase)
      .sorted()
      .forEach(System.out::println);

// C1
// C2
// Kotlin:
val list = listOf("a1", "a2", "b1", "c2", "c1")
list.filter { it.startsWith('c') }.map(String::toUpperCase).sorted()
        .forEach(::println)

Different Kinds of Streams #1

// Java:
Arrays.asList("a1", "a2", "a3")
    .stream()
    .findFirst()
    .ifPresent(System.out::println);    
// Kotlin:
listOf("a1", "a2", "a3").firstOrNull()?.apply(::println)

or, create an extension function on String called ifPresent:

// Kotlin:
inline fun String?.ifPresent(thenDo: (String)->Unit) = this?.apply { thenDo(this) }

// now use the new extension function:
listOf("a1", "a2", "a3").firstOrNull().ifPresent(::println)

See also: apply() function

See also: Extension Functions

See also: ?. Safe Call operator, and in general nullability: In Kotlin, what is the idiomatic way to deal with nullable values, referencing or converting them

Different Kinds of Streams #2

// Java:
Stream.of("a1", "a2", "a3")
    .findFirst()
    .ifPresent(System.out::println);    
// Kotlin:
sequenceOf("a1", "a2", "a3").firstOrNull()?.apply(::println)

Different Kinds of Streams #3

// Java:
IntStream.range(1, 4).forEach(System.out::println);
// Kotlin:  (inclusive range)
(1..3).forEach(::println)

Different Kinds of Streams #4

// Java:
Arrays.stream(new int[] {1, 2, 3})
    .map(n -> 2 * n + 1)
    .average()
    .ifPresent(System.out::println); // 5.0    
// Kotlin:
arrayOf(1,2,3).map { 2 * it + 1}.average().apply(::println)

Different Kinds of Streams #5

// Java:
Stream.of("a1", "a2", "a3")
    .map(s -> s.substring(1))
    .mapToInt(Integer::parseInt)
    .max()
    .ifPresent(System.out::println);  // 3
// Kotlin:
sequenceOf("a1", "a2", "a3")
    .map { it.substring(1) }
    .map(String::toInt)
    .max().apply(::println)

Different Kinds of Streams #6

// Java:
IntStream.range(1, 4)
    .mapToObj(i -> "a" + i)
    .forEach(System.out::println);

// a1
// a2
// a3    
// Kotlin:  (inclusive range)
(1..3).map { "a$it" }.forEach(::println)

Different Kinds of Streams #7

// Java:
Stream.of(1.0, 2.0, 3.0)
    .mapToInt(Double::intValue)
    .mapToObj(i -> "a" + i)
    .forEach(System.out::println);

// a1
// a2
// a3
// Kotlin:
sequenceOf(1.0, 2.0, 3.0).map(Double::toInt).map { "a$it" }.forEach(::println)

Why Order Matters

This section of the Java 8 Stream Tutorial is the same for Kotlin and Java.

Reusing Streams

In Kotlin, whether a collection can be consumed more than once depends on its type. A Sequence generates a new iterator every time, and unless it asserts "use only once" it can reset to the start each time it is acted upon. Therefore the following fails with a Java 8 stream but works in Kotlin:

// Java:
Stream<String> stream =
Stream.of("d2", "a2", "b1", "b3", "c").filter(s -> s.startsWith("b"));

stream.anyMatch(s -> true);    // ok
stream.noneMatch(s -> true);   // exception
// Kotlin:  
val stream = listOf("d2", "a2", "b1", "b3", "c").asSequence().filter { it.startsWith('b' ) }

stream.forEach(::println) // b1, b3

println("Any B ${stream.any { it.startsWith('b') }}") // Any B true
println("Any C ${stream.any { it.startsWith('c') }}") // Any C false

stream.forEach(::println) // b1, b3

And in Java to get the same behavior:

// Java:
Supplier<Stream<String>> streamSupplier =
    () -> Stream.of("d2", "a2", "b1", "b3", "c")
          .filter(s -> s.startsWith("a"));

streamSupplier.get().anyMatch(s -> true);   // ok
streamSupplier.get().noneMatch(s -> true);  // ok

Therefore in Kotlin the provider of the data decides if it can reset back and provide a new iterator or not. But if you want to intentionally constrain a Sequence to one time iteration, you can use constrainOnce() function for Sequence as follows:

val stream = listOf("d2", "a2", "b1", "b3", "c").asSequence().filter { it.startsWith('b' ) }
        .constrainOnce()

stream.forEach(::println) // b1, b3
stream.forEach(::println) // Error:java.lang.IllegalStateException: This sequence can be consumed only once. 

Advanced Operations

Collect example #5 (yes, I skipped those already in the other answer)

// Java:
String phrase = persons
        .stream()
        .filter(p -> p.age >= 18)
        .map(p -> p.name)
        .collect(Collectors.joining(" and ", "In Germany ", " are of legal age."));

    System.out.println(phrase);
    // In Germany Max and Peter and Pamela are of legal age.    
// Kotlin:
val phrase = persons.filter { it.age >= 18 }.map { it.name }
        .joinToString(" and ", "In Germany ", " are of legal age.")

println(phrase)
// In Germany Max and Peter and Pamela are of legal age.

And as a side note, in Kotlin we can create simple data classes and instantiate the test data as follows:

// Kotlin:
// data class has equals, hashcode, toString, and copy methods automagically
data class Person(val name: String, val age: Int) 

val persons = listOf(Person("Tod", 5), Person("Max", 33), 
                     Person("Frank", 13), Person("Peter", 80),
                     Person("Pamela", 18))

Collect example #6

// Java:
Map<Integer, String> map = persons
        .stream()
        .collect(Collectors.toMap(
                p -> p.age,
                p -> p.name,
                (name1, name2) -> name1 + ";" + name2));

System.out.println(map);
// {18=Max, 23=Peter;Pamela, 12=David}    

Ok, a more interesting case here for Kotlin. First, the wrong answers, to explore variations of creating a Map from a collection/sequence:

// Kotlin:
val map1 = persons.map { it.age to it.name }.toMap()
println(map1)
// output: {18=Max, 23=Pamela, 12=David} 
// Result: duplicates overridden, no exception similar to Java 8

val map2 = persons.associateBy({ it.age }, { it.name })
println(map2)
// output: {18=Max, 23=Pamela, 12=David}
// Result: same as above, duplicates overridden

val map3 = persons.associateBy { it.age }
println(map3)
// output: {18=Person(name=Max, age=18), 23=Person(name=Pamela, age=23), 12=Person(name=David, age=12)}
// Result: duplicates overridden again

val map4 = persons.groupBy { it.age }
println(map4)
// output: {18=[Person(name=Max, age=18)], 23=[Person(name=Peter, age=23), Person(name=Pamela, age=23)], 12=[Person(name=David, age=12)]}
// Result: closer, but now have a Map<Int, List<Person>> instead of Map<Int, String>

val map5 = persons.groupBy { it.age }.mapValues { it.value.map { it.name } }
println(map5)
// output: {18=[Max], 23=[Peter, Pamela], 12=[David]}
// Result: closer, but now have a Map<Int, List<String>> instead of Map<Int, String>

And now for the correct answer:

// Kotlin:
val map6 = persons.groupBy { it.age }.mapValues { it.value.joinToString(";") { it.name } }

println(map6)
// output: {18=Max, 23=Peter;Pamela, 12=David}
// Result: YAY!!

We just needed to join the matching values to collapse the lists, and provide a transformer to joinToString to move from the Person instance to Person.name.

Collect example #7

Ok, this one can easily be done without a custom Collector, so let's solve it the Kotlin way, then contrive a new example that shows how to do a similar process for Collectors.summarizingInt, which does not natively exist in Kotlin.

// Java:
Collector<Person, StringJoiner, String> personNameCollector =
Collector.of(
        () -> new StringJoiner(" | "),          // supplier
        (j, p) -> j.add(p.name.toUpperCase()),  // accumulator
        (j1, j2) -> j1.merge(j2),               // combiner
        StringJoiner::toString);                // finisher

String names = persons
        .stream()
        .collect(personNameCollector);

System.out.println(names);  // MAX | PETER | PAMELA | DAVID    
// Kotlin:
val names = persons.map { it.name.toUpperCase() }.joinToString(" | ")

It's not my fault they picked a trivial example!!! Ok, here is a new summarizingInt method for Kotlin and a matching sample:

SummarizingInt Example

// Java:
IntSummaryStatistics ageSummary =
    persons.stream()
           .collect(Collectors.summarizingInt(p -> p.age));

System.out.println(ageSummary);
// IntSummaryStatistics{count=4, sum=76, min=12, average=19.000000, max=23}    
// Kotlin:

// something to hold the stats...
data class SummaryStatisticsInt(var count: Int = 0,  
                                var sum: Int = 0, 
                                var min: Int = Int.MAX_VALUE, 
                                var max: Int = Int.MIN_VALUE, 
                                var avg: Double = 0.0) {
    fun accumulate(newInt: Int): SummaryStatisticsInt {
        count++
        sum += newInt
        min = min.coerceAtMost(newInt)
        max = max.coerceAtLeast(newInt)
        avg = sum.toDouble() / count
        return this
    }
}

// Now manually doing a fold, since Stream.collect is really just a fold
val stats = persons.fold(SummaryStatisticsInt()) { stats, person -> stats.accumulate(person.age) }

println(stats)
// output: SummaryStatisticsInt(count=4, sum=76, min=12, max=23, avg=19.0)

But it is better to create an extension function (two, actually, to match the styles in the Kotlin stdlib):

// Kotlin:
inline fun Collection<Int>.summarizingInt(): SummaryStatisticsInt
        = this.fold(SummaryStatisticsInt()) { stats, num -> stats.accumulate(num) }

inline fun <T: Any> Collection<T>.summarizingInt(transform: (T)->Int): SummaryStatisticsInt =
        this.fold(SummaryStatisticsInt()) { stats, item -> stats.accumulate(transform(item)) }

Now you have two ways to use the new summarizingInt functions:

val stats2 = persons.map { it.age }.summarizingInt()

// or

val stats3 = persons.summarizingInt { it.age }

And all of these produce the same results. We can also create this extension to work on Sequence and for appropriate primitive types.
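A Sequence version might look like the sketch below. It repeats the SummaryStatisticsInt class from above so that it is self-contained; the Sequence overloads are an assumption about how one would extend the idea, not part of any library.

```kotlin
// Same holder class as above, repeated so this block compiles on its own.
data class SummaryStatisticsInt(var count: Int = 0,
                                var sum: Int = 0,
                                var min: Int = Int.MAX_VALUE,
                                var max: Int = Int.MIN_VALUE,
                                var avg: Double = 0.0) {
    fun accumulate(newInt: Int): SummaryStatisticsInt {
        count++
        sum += newInt
        min = min.coerceAtMost(newInt)
        max = max.coerceAtLeast(newInt)
        avg = sum.toDouble() / count
        return this
    }
}

// Hypothetical Sequence overloads; fold is a terminal operation, so these consume the sequence.
fun Sequence<Int>.summarizingInt(): SummaryStatisticsInt =
    fold(SummaryStatisticsInt()) { stats, num -> stats.accumulate(num) }

inline fun <T> Sequence<T>.summarizingInt(transform: (T) -> Int): SummaryStatisticsInt =
    fold(SummaryStatisticsInt()) { stats, item -> stats.accumulate(transform(item)) }

fun main() {
    println(sequenceOf(18, 23, 23, 12).summarizingInt())
    // SummaryStatisticsInt(count=4, sum=76, min=12, max=23, avg=19.0)
}
```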

For fun, compare the Java JDK code vs. Kotlin custom code required to implement this summarization.

Solution 3:

There are some cases where it is hard to avoid calling collect(Collectors.toList()) or similar. In those cases, you can more quickly change to a Kotlin equivalent using extension functions such as:

fun <T: Any> Stream<T>.toList(): List<T> = this.collect(Collectors.toList<T>())
fun <T: Any> Stream<T>.asSequence(): Sequence<T> = this.iterator().asSequence()

Then you can simply stream.toList() or stream.asSequence() to move back into the Kotlin API. A case such as Files.list(path) forces you into a Stream when you may not want it, and these extensions can help you to shift back into the standard collections and Kotlin API.
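On the JVM with Kotlin 1.2 or later, the stdlib also ships a similar bridge in the kotlin.streams package, so you may not need to declare your own. A minimal sketch using it; the function name startingWithA is made up for the example:

```kotlin
import java.util.stream.Stream
import kotlin.streams.asSequence

// Hypothetical helper: leave the Stream API as early as possible and continue with Kotlin operations.
fun startingWithA(stream: Stream<String>): List<String> =
    stream.asSequence()
        .filter { it.startsWith("a") }
        .toList()

fun main() {
    println(startingWithA(Stream.of("a1", "b1", "a2"))) // [a1, a2]
}
```

As with any Stream, the source can only be consumed once, so the resulting Sequence should be iterated a single time.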

Solution 4:

More on laziness

Let's take the example solution for "Compute sum of salaries by department" given by Jayson:

val totalByDept = employees.groupBy { it.dept }.mapValues { it.value.sumBy { it.salary }}

In order to make this lazy (i.e. avoid creating an intermediate map in the groupBy step), it is not possible to use asSequence(). Instead, we must use the groupingBy and fold operations:

val totalByDept = employees.groupingBy { it.dept }.fold(0) { acc, e -> acc + e.salary }

To some people this may even be more readable, since you're not dealing with map entries: the it.value part in the solution was confusing for me too at first.

Since this is a common case and we'd prefer not to write out the fold each time, it may be better to just provide a generic sumBy function on Grouping:

public inline fun <T, K> Grouping<T, K>.sumBy(
        selector: (T) -> Int
): Map<K, Int> = 
        fold(0) { acc, element -> acc + selector(element) }

so that we can simply write:

val totalByDept = employees.groupingBy { it.dept }.sumBy { it.salary }
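For reference, here is a self-contained sketch of the groupingBy approach that can be run as is. The Employee class and sample data are invented for the example; eachCount() is another stdlib operation on Grouping that follows the same pattern.

```kotlin
// Hypothetical type and data, for illustration only.
data class Employee(val name: String, val dept: String, val salary: Int)

// Sums salaries per department without building a Map<String, List<Employee>> first.
fun totalSalaryByDept(employees: List<Employee>): Map<String, Int> =
    employees.groupingBy { it.dept }.fold(0) { acc, e -> acc + e.salary }

// Counts employees per department with the stdlib's eachCount().
fun headcountByDept(employees: List<Employee>): Map<String, Int> =
    employees.groupingBy { it.dept }.eachCount()

fun main() {
    val employees = listOf(
        Employee("Ann", "Eng", 100),
        Employee("Bob", "Eng", 50),
        Employee("Cy", "Ops", 70)
    )
    println(totalSalaryByDept(employees)) // {Eng=150, Ops=70}
    println(headcountByDept(employees))   // {Eng=2, Ops=1}
}
```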