Flattening a collection

If you are on Java 8 and would rather not instantiate the List yourself, as the suggested (and accepted) solution does:

someMap.values().forEach(someList::addAll);

you can do it all with streams in a single statement:

List<String> someList = someMap.values().stream().flatMap(c -> c.stream()).collect(Collectors.toList());
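
For anyone who wants to try the streaming variant in isolation, here is a minimal, self-contained sketch. The class name FlatMapExample, the method name flatten and the sample data are made up for illustration; a LinkedHashMap is used only so that the order of the output is predictable.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class FlatMapExample {

    // Flattens all value lists of the map into one newly created list.
    static List<String> flatten(Map<String, List<String>> someMap) {
        return someMap.values().stream()
                .flatMap(c -> c.stream())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps insertion order, so the result order is stable.
        Map<String, List<String>> someMap = new LinkedHashMap<>();
        someMap.put("test", Arrays.asList("1", "2"));
        someMap.put("test2", Arrays.asList("10", "20"));
        System.out.println(flatten(someMap)); // prints [1, 2, 10, 20]
    }
}
```

With a plain HashMap the same code works, but the order of the flattened elements follows the map's iteration order and is not guaranteed.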

Incidentally, it may be interesting to know that on Java 8 the accepted version does indeed seem to be the fastest. It takes about as long as a

for (List<String> item : someMap.values()) ...

and is much faster than the pure streaming solution. Here is my little test code. I deliberately don't call it a benchmark, to avoid the ensuing discussion of benchmark flaws. ;) I run every test twice in the hope of measuring fully compiled code.

    Map<String, List<String>> map = new HashMap<>();
    long millis;

    map.put("test", Arrays.asList("1", "2", "3", "4"));
    map.put("test2", Arrays.asList("10", "20", "30", "40"));
    map.put("test3", Arrays.asList("100", "200", "300", "400"));

    int maxcounter = 1000000;
    
    System.out.println("1 stream flatmap");
    millis = System.currentTimeMillis();
    for (int i = 0; i < maxcounter; i++) {
        List<String> someList = map.values().stream().flatMap(c -> c.stream()).collect(Collectors.toList());
    }
    System.out.println(System.currentTimeMillis() - millis);
    
    System.out.println("1 parallel stream flatmap");
    millis = System.currentTimeMillis();
    for (int i = 0; i < maxcounter; i++) {
        List<String> someList = map.values().parallelStream().flatMap(c -> c.stream()).collect(Collectors.toList());
    }
    System.out.println(System.currentTimeMillis() - millis);

    System.out.println("1 foreach");
    millis = System.currentTimeMillis();
    for (int i = 0; i < maxcounter; i++) {
        List<String> mylist = new ArrayList<String>();
        map.values().forEach(mylist::addAll);
    }
    System.out.println(System.currentTimeMillis() - millis);        

    System.out.println("1 for");
    millis = System.currentTimeMillis();
    for (int i = 0; i < maxcounter; i++) {
        List<String> mylist = new ArrayList<String>();
        for (List<String> item : map.values()) {
            mylist.addAll(item);
        }
    }
    System.out.println(System.currentTimeMillis() - millis);
    
    
    System.out.println("2 stream flatmap");
    millis = System.currentTimeMillis();
    for (int i = 0; i < maxcounter; i++) {
        List<String> someList = map.values().stream().flatMap(c -> c.stream()).collect(Collectors.toList());
    }
    System.out.println(System.currentTimeMillis() - millis);
    
    System.out.println("2 parallel stream flatmap");
    millis = System.currentTimeMillis();
    for (int i = 0; i < maxcounter; i++) {
        List<String> someList = map.values().parallelStream().flatMap(c -> c.stream()).collect(Collectors.toList());
    }
    System.out.println(System.currentTimeMillis() - millis);
    
    System.out.println("2 foreach");
    millis = System.currentTimeMillis();
    for (int i = 0; i < maxcounter; i++) {
        List<String> mylist = new ArrayList<String>();
        map.values().forEach(mylist::addAll);
    }
    System.out.println(System.currentTimeMillis() - millis);        

    System.out.println("2 for");
    millis = System.currentTimeMillis();
    for (int i = 0; i < maxcounter; i++) {
        List<String> mylist = new ArrayList<String>();
        for (List<String> item : map.values()) {
            mylist.addAll(item);
        }
    }
    System.out.println(System.currentTimeMillis() - millis);

And here are the results (elapsed milliseconds for 1,000,000 iterations each):

1 stream flatmap
468
1 parallel stream flatmap
1529
1 foreach
140
1 for
172
2 stream flatmap
296
2 parallel stream flatmap
1482
2 foreach
156
2 for
141

Edit 2016-05-24 (two years later):

Running the same test with a current Java 8 version (U92) on the same machine:

1 stream flatmap
313
1 parallel stream flatmap
3257
1 foreach
109
1 for
141
2 stream flatmap
219
2 parallel stream flatmap
3830
2 foreach
125
2 for
140

It seems that sequential stream processing has become faster, while the overhead of parallel streams has grown even larger.

Edit 2018-10-18 (four years later):

Now using Java 10 (10.0.2) on the same machine:

1 stream flatmap
393
1 parallel stream flatmap
3683
1 foreach
157
1 for
175
2 stream flatmap
243
2 parallel stream flatmap
5945
2 foreach
128
2 for
187

The overhead of parallel streams seems to have grown further.

Edit 2020-05-22 (six years later):

Now using Java 14 (14.0.0.36) on a different machine:

1 stream flatmap
299
1 parallel stream flatmap
3209
1 foreach
202
1 for
170
2 stream flatmap
178
2 parallel stream flatmap
3270
2 foreach
138
2 for
167

Note that this run was done on a different (but, I think, comparable) machine. The parallel streaming overhead seems to be considerably smaller than before.
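
The timing loops in the test code differ only in the statement being measured, so they could be folded into one small helper. The following is just a sketch of that idea, not the code that produced the numbers above: the names TimingSketch, time and sink are hypothetical, System.nanoTime() is swapped in for System.currentTimeMillis(), and only two of the four variants are shown. It is still not a benchmark; routing every variant through a Supplier may itself influence the timings.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;
import java.util.stream.Collectors;

class TimingSketch {

    // Total size of all lists produced by the last call to time();
    // stored so that the results of the measured code are actually used.
    static long sink;

    // Runs the task `iterations` times and returns the elapsed wall-clock time in ms.
    static long time(int iterations, Supplier<List<String>> task) {
        long total = 0;
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            total += task.get().size();
        }
        long elapsedMillis = (System.nanoTime() - start) / 1_000_000;
        sink = total;
        return elapsedMillis;
    }

    public static void main(String[] args) {
        Map<String, List<String>> map = new HashMap<>();
        map.put("test", Arrays.asList("1", "2", "3", "4"));
        map.put("test2", Arrays.asList("10", "20", "30", "40"));
        map.put("test3", Arrays.asList("100", "200", "300", "400"));

        int maxcounter = 1000000;

        // Two rounds, as in the original test, so the second one runs on warmed-up code.
        for (int round = 1; round <= 2; round++) {
            System.out.println(round + " stream flatmap");
            System.out.println(time(maxcounter, () -> map.values().stream()
                    .flatMap(c -> c.stream()).collect(Collectors.toList())));

            System.out.println(round + " foreach");
            System.out.println(time(maxcounter, () -> {
                List<String> mylist = new ArrayList<>();
                map.values().forEach(mylist::addAll);
                return mylist;
            }));
        }
    }
}
```

For numbers that are meant to be defended, a harness such as JMH would be the more usual choice.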


If you are using Java 8, you could do something like this:

someMap.values().forEach(someList::addAll);
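
As a self-contained sketch of this one-liner in context (the class name ForEachAddAllExample, the method name flatten and the sample data are assumptions made for illustration):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class ForEachAddAllExample {

    // Appends every value list of the map to one target list.
    static List<String> flatten(Map<String, List<String>> someMap) {
        List<String> someList = new ArrayList<>();
        someMap.values().forEach(someList::addAll);
        return someList;
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps insertion order, so the result order is stable.
        Map<String, List<String>> someMap = new LinkedHashMap<>();
        someMap.put("test", Arrays.asList("1", "2"));
        someMap.put("test2", Arrays.asList("10", "20"));
        System.out.println(flatten(someMap)); // prints [1, 2, 10, 20]
    }
}
```

The target list has to exist before the call, which is exactly the step the streaming variants avoid.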

When searching for "java 8 flatten", this is the only hit, and it isn't about flattening a stream either. So, for the greater good, I'll just leave this here:

.flatMap(Collection::stream)
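
One way this fragment might be wrapped into a reusable helper is sketched below. The class name FlattenHelper and the generic flatten method are hypothetical, not part of any library; the signature is kept deliberately simple.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.stream.Collectors;

class FlattenHelper {

    // Flattens a collection of collections (lists, sets, ...) into one list.
    static <T> List<T> flatten(Collection<? extends Collection<T>> nested) {
        return nested.stream()
                .flatMap(Collection::stream)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<List<Integer>> nested = Arrays.asList(
                Arrays.asList(1, 2),
                Arrays.asList(3),
                Arrays.asList(4, 5));
        System.out.println(flatten(nested)); // prints [1, 2, 3, 4, 5]
    }
}
```

Because the parameter is a Collection of Collections, the result of someMap.values() can be passed in directly as well.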

I'm also surprised that no one has given a concurrent Java 8 answer to the original question, which is:

.collect(ArrayList::new, ArrayList::addAll, ArrayList::addAll);
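
Applied to the map from the question, the three-argument collect could look like the sketch below. The class name ParallelCollectExample and the use of parallelStream() are assumptions made for illustration. As far as I understand it, this form is safe on a parallel stream because each worker fills its own ArrayList (first and second argument) and the partial lists are merged afterwards (third argument).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class ParallelCollectExample {

    // Mutable reduction: supplier, accumulator and combiner are given explicitly.
    static List<String> flatten(Map<String, List<String>> someMap) {
        return someMap.values().parallelStream()
                .collect(ArrayList::new, ArrayList::addAll, ArrayList::addAll);
    }

    public static void main(String[] args) {
        // LinkedHashMap gives the stream a defined encounter order.
        Map<String, List<String>> someMap = new LinkedHashMap<>();
        someMap.put("test", Arrays.asList("1", "2"));
        someMap.put("test2", Arrays.asList("10", "20"));
        System.out.println(flatten(someMap));
    }
}
```

Given the timings reported earlier for parallel streams on small inputs, this is probably only worth considering for large collections.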

Suggested by a colleague:

listOfLists.stream().flatMap(e -> e.stream()).collect(Collectors.toList())

I like it better than forEach().


If you're using Eclipse Collections, you can use Iterate.flatten().

MutableMap<String, MutableList<String>> map = Maps.mutable.empty();
map.put("Even", Lists.mutable.with("0", "2", "4"));
map.put("Odd", Lists.mutable.with("1", "3", "5"));
MutableList<String> flattened = Iterate.flatten(map, Lists.mutable.empty());
Assert.assertEquals(
    Lists.immutable.with("0", "1", "2", "3", "4", "5"),
    flattened.toSortedList());

flatten() is a special case of the more general RichIterable.flatCollect().

MutableList<String> flattened = 
    map.flatCollect(x -> x, Lists.mutable.empty());

Note: I am a committer for Eclipse Collections.