Is there an elegant way to remove nulls while transforming a Collection using Guava?

I have a question about simplifying some Collection handling code, when using Google Collections (update: Guava).

I've got a bunch of "Computer" objects, and I want to end up with a Collection of their "resource id"s. This is done like so:

Collection<Computer> matchingComputers = findComputers();
Collection<String> resourceIds = 
    Lists.newArrayList(Iterables.transform(matchingComputers, new Function<Computer, String>() {
    public String apply(Computer from) {
        return from.getResourceId();
    }
}));

Now, getResourceId() may return null (and changing that is not an option right now), yet in this case I'd like to omit nulls from the resulting String collection.

Here's one way to filter nulls out:

Collections2.filter(resourceIds, new Predicate<String>() {
    @Override
    public boolean apply(String input) {
        return input != null;
    }
});

You could put all that together like this:

Collection<String> resourceIds = Collections2.filter(
Lists.newArrayList(Iterables.transform(matchingComputers, new Function<Computer, String>() {
    public String apply(Computer from) {
        return from.getResourceId();
    }
})), new Predicate<String>() {
    @Override
    public boolean apply(String input) {
        return input != null;
    }
});

But this is hardly elegant, let alone readable, for such a simple task! In fact, plain old Java code (with no fancy Predicate or Function stuff at all) would arguably be much cleaner:

Collection<String> resourceIds = Lists.newArrayList();
for (Computer computer : matchingComputers) {
    String resourceId = computer.getResourceId();
    if (resourceId != null) {
        resourceIds.add(resourceId);
    }
}

Using the above is certainly also an option, but out of curiosity (and desire to learn more of Google Collections), can you do the exact same thing in some shorter or more elegant way using Google Collections?


Solution 1:

There's already a predicate in Predicates that will help you here -- Predicates.notNull() -- and you can use Iterables.filter() and the fact that Lists.newArrayList() can take an Iterable to clean this up a little more.

Collection<String> resourceIds = Lists.newArrayList(
  Iterables.filter(
     Iterables.transform(matchingComputers, yourFunction),
     Predicates.notNull()
  )
);

If you don't actually need a Collection, just an Iterable, then the Lists.newArrayList() call can go away too and you're one step cleaner again!

I suspect you might find that the Function will come in handy again, and will be most useful declared as

public class Computer {
    // ...
    public static Function<Computer, String> TO_ID = ...;
}

which cleans this up even more (and will promote reuse).

Solution 2:

A bit "prettier" syntax with FluentIterable (since Guava 12):

ImmutableList<String> resourceIds = FluentIterable.from(matchingComputers)
    .transform(getResourceId)
    .filter(Predicates.notNull())
    .toList();

static final Function<Computer, String> getResourceId =
    new Function<Computer, String>() {
        @Override
        public String apply(Computer computer) {
            return computer.getResourceId();
        }
    };

Note that the returned list is an ImmutableList. However, you can use copyInto() method to pour the elements into an arbitrary collection.

Solution 3:

It took longer than @Jon Skeet expected, but Java 8 streams do make this simple:

List<String> resourceIds = computers.stream()
    .map(Computer::getResourceId)
    .filter(Objects::nonNull)
    .collect(Collectors.toList());

You can also use .filter(x -> x != null) if you like; the difference is very minor.

Solution 4:

Firstly, I'd create a constant filter somewhere:

public static final Predicate<Object> NULL_FILTER =  new Predicate<Object>() {
    @Override
    public boolean apply(Object input) {
            return input != null;
    }
}

Then you can use:

Iterable<String> ids = Iterables.transform(matchingComputers,
    new Function<Computer, String>() {
        public String apply(Computer from) {
             return from.getResourceId();
        }
    }));
Collection<String> resourceIds = Lists.newArrayList(
    Iterables.filter(ids, NULL_FILTER));

You can use the same null filter everywhere in your code.

If you use the same computing function elsewhere, you can make that a constant too, leaving just:

Collection<String> resourceIds = Lists.newArrayList(
    Iterables.filter(
        Iterables.transform(matchingComputers, RESOURCE_ID_PROJECTION),
        NULL_FILTER));

It's certainly not as nice as the C# equivalent would be, but this is all going to get a lot nicer in Java 7 with closures and extension methods :)

Solution 5:

You could write your own method like so. this will filter out nulls for any Function that returns null from the apply method.

   public static <F, T> Collection<T> transformAndFilterNulls(List<F> fromList, Function<? super F, ? extends T> function) {
        return Collections2.filter(Lists.transform(fromList, function), Predicates.<T>notNull());
    }

The method can then be called with the following code.

Collection c = transformAndFilterNulls(Lists.newArrayList("", "SD", "DDF"), new Function<String, Long>() {

    @Override
    public Long apply(String s) {
        return s.isEmpty() ? 20L : null;
    }
});
System.err.println(c);