Remove duplicates from a list of objects based on property in Java 8 [duplicate]

Solution 1:

You can get a stream from the List and put in in the TreeSet from which you provide a custom comparator that compares id uniquely.

Then if you really need a list you can put then back this collection into an ArrayList.

import static java.util.Comparator.comparingInt;
import static java.util.stream.Collectors.collectingAndThen;
import static java.util.stream.Collectors.toCollection;

...
List<Employee> unique = employee.stream()
                                .collect(collectingAndThen(toCollection(() -> new TreeSet<>(comparingInt(Employee::getId))),
                                                           ArrayList::new));

Given the example:

List<Employee> employee = Arrays.asList(new Employee(1, "John"), new Employee(1, "Bob"), new Employee(2, "Alice"));

It will output:

[Employee{id=1, name='John'}, Employee{id=2, name='Alice'}]

Another idea could be to use a wrapper that wraps an employee and have the equals and hashcode method based with its id:

class WrapperEmployee {
    private Employee e;

    public WrapperEmployee(Employee e) {
        this.e = e;
    }

    public Employee unwrap() {
        return this.e;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (o == null || getClass() != o.getClass()) return false;
        WrapperEmployee that = (WrapperEmployee) o;
        return Objects.equals(e.getId(), that.e.getId());
    }

    @Override
    public int hashCode() {
        return Objects.hash(e.getId());
    }
}

Then you wrap each instance, call distinct(), unwrap them and collect the result in a list.

List<Employee> unique = employee.stream()
                                .map(WrapperEmployee::new)
                                .distinct()
                                .map(WrapperEmployee::unwrap)
                                .collect(Collectors.toList());

In fact, I think you can make this wrapper generic by providing a function that will do the comparison:

public class Wrapper<T, U> {
    private T t;
    private Function<T, U> equalityFunction;

    public Wrapper(T t, Function<T, U> equalityFunction) {
        this.t = t;
        this.equalityFunction = equalityFunction;
    }

    public T unwrap() {
        return this.t;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (o == null || getClass() != o.getClass()) return false;
        @SuppressWarnings("unchecked")
        Wrapper<T, U> that = (Wrapper<T, U>) o;
        return Objects.equals(equalityFunction.apply(this.t), that.equalityFunction.apply(that.t));
    }

    @Override
    public int hashCode() {
        return Objects.hash(equalityFunction.apply(this.t));
    }
}

and the mapping will be:

.map(e -> new Wrapper<>(e, Employee::getId))

Solution 2:

The easiest way to do it directly in the list is

HashSet<Object> seen=new HashSet<>();
employee.removeIf(e->!seen.add(e.getID()));
  • removeIf will remove an element if it meets the specified criteria
  • Set.add will return false if it did not modify the Set, i.e. already contains the value
  • combining these two, it will remove all elements (employees) whose id has been encountered before

Of course, it only works if the list supports removal of elements.

Solution 3:

If you can make use of equals, then filter the list by using distinct within a stream (see answers above). If you can not or don't want to override the equals method, you can filter the stream in the following way for any property, e.g. for the property Name (the same for the property Id etc.):

Set<String> nameSet = new HashSet<>();
List<Employee> employeesDistinctByName = employees.stream()
            .filter(e -> nameSet.add(e.getName()))
            .collect(Collectors.toList());

Solution 4:

Another solution is to use a Predicate, then you can use this in any filter:

public static <T> Predicate<T> distinctBy(Function<? super T, ?> f) {
  Set<Object> objects = new ConcurrentHashSet<>();
  return t -> objects.add(f.apply(t));
}

Then simply reuse the predicate anywhere:

employees.stream().filter(distinctBy(e -> e.getId));

Note: in the JavaDoc of filter, which says it takes a stateless Predicte. Actually, this works fine even if the stream is parallel.


About other solutions:

1) Using .collect(Collectors.toConcurrentMap(..)).values() is a good solution, but it's annoying if you want to sort and keep the order.

2) stream.removeIf(e->!seen.add(e.getID())); is also another very good solution. But we need to make sure the collection implemented removeIf, for example it will throw exception if we construct the collection use Arrays.asList(..).

Solution 5:

Try this code:

Collection<Employee> nonDuplicatedEmployees = employees.stream()
   .<Map<Integer, Employee>> collect(HashMap::new,(m,e)->m.put(e.getId(), e), Map::putAll)
   .values();