Java List Sorting: Is there a way to keep a list permantly sorted automatically like TreeMap?
In Java you can build up an ArrayList
with items and then call:
Collections.sort(list, comparator);
Is there anyway to pass in the Comparator at the time of list, creation like you can do with TreeMap
?
The goal is to be able add an element to the list and instead of having it automatically appended to the end of the list, the list would keep itself sorted based on the Comparator
and insert the new element at the index determined by the Comparator
. So basically the list might have to re-sort upon every new element added.
Is there anyway to achieve this in this way with the Comparator
or by some other similar means?
Solution 1:
You can change the behaviour of ArrayList
List<MyType> list = new ArrayList<MyType>() {
public boolean add(MyType mt) {
super.add(mt);
Collections.sort(list, comparator);
return true;
}
};
Note: a PriorityQueue is NOT a List, if you didn't care what type of collection it was, the simplest would be to use a TreeSet, which is just like a TreeMap but is a collection. The only advantage PriorityQueue has is to allow duplicates.
Note: resorting is not very efficient for large collections, Using a binary search and inserting an entry would be faster. (but more complicated)
EDIT: A lot depends on what you need the "list" to do. I suggest you write a List wrapper for an ArrayList, LinkedList, PriorityQueue, TreeSet, or one of the other sorted collections and implement the methods which will actually be used. That way you have a good understanding of the requirements for the collection and you can make sure it works correctly for you.
EDIT(2): Since there was so much interest in using binarySearch instead. ;)
List<MyType> list = new ArrayList<MyType>() {
public boolean add(MyType mt) {
int index = Collections.binarySearch(this, mt);
if (index < 0) index = ~index;
super.add(index, mt);
return true;
}
};
Solution 2:
Everyone is suggesting PriorityQueue
. However, it is important to realize that if you iterate over the contents of a PriorityQueue
, the elements will not be in sorted order. You are only guaranteed to get the "minimum" element from the methods peek()
, poll()
, etc.
A TreeSet
seems to be a better fit. The caveats would be that, as a Set
, it can't contain duplicate elements, and it doesn't support random access with an index.
Solution 3:
Commentary
There is probably a good reason that there is no SortedList
implementation in the JDK. I personally can't think of a reason to have one auto-sort in the JDK.
It reeks of premature optimization gone wrong. If the list is not read from as often as it is inserted into, then you are wasting cycles sorting repeatedly for no reason. Sorting right before a read would be much more reactive and having a boolean
somewhere indicating that the list does or does not need to be sorted before reading it would be even better.
The thing is you only really care about order when traversing the list with an Iterator
or for each
loop, so calling Collections.sort()
before any code that iterates would probably be more performant than trying to keep the list sorted all the time on every insertion.
There are ambiguities with List
because of duplicates, how do you order duplicates deterministically? There is SortedSet
, and that makes sense because of the uniqueness. But sorting a List
can have more complications from the side effects of duplicates and other constraints like making every object Comparable
or as I show in my code having to have a Comparator
that can do the work instead.
Sorting on .add()
If you have some very special situation where a auto-sorting List
would be useful then one thing you might do is sub-class a List
implementation and over-ride .add()
to do a Collections.sort(this, comparator)
that you pass into a custom constructor. I used LinkedList
instead of ArrayList
for a reason, ArrayList
is a natural insertion sorted order List
to begin with. It also has the ability to .add()
at an index which is pretty useless if you want a constantly sorted List
, that would have to be handled in someway that would probably be less than ideal. According to the Javadoc;
void add(int index, Object element)
Inserts the specified element at the specified position in this list (optional operation).
So it just throwing UnSupportedOperationException
would be acceptable, or you could just ignore the index
and delegate to .add(Object element);
if you document it in a JavaDoc on the method.
Usually when you want to lots of inserts/removals and sorting you would use a LinkedList
because of better performance characteristics given the usage of the `List'.
Here is a quick example:
import java.util.Collections;
import java.util.Comparator;
import java.util.LinkedList;
public class SortedList<E> extends LinkedList<E>
{
private Comparator<E> comparator;
public SortedList(final Comparator<E> comparator)
{
this.comparator = comparator;
}
/**
* this ignores the index and delegates to .add()
* so it will be sorted into the correct place immediately.
*/
@Override
public void add(int index, Object element)
{
this.add(element);
}
@Override
public boolean add(final E e)
{
final boolean result = super.add(e);
Collections.sort(this, this.comparator);
return result;
}
}
Most Efficient Solution:
Alternatively you could only sort when getting the Iterator
and this would be more performance oriented if the sorted order was only really important when iterating over the List
. This would cover the use case of the client code not having to call, Collections.sort()
before every iteration and encapsulate that behavior into the class.
import java.util.Collections;
import java.util.Comparator;
import java.util.Iterator;
import java.util.LinkedList;
public class SortedList<E> extends LinkedList<E>
{
private Comparator<E> comparator;
public SortedList(final Comparator<E> comparator)
{
this.comparator = comparator;
}
@Override
public Iterator<E> iterator()
{
Collections.sort(this, this.comparator);
return super.iterator();
}
}
Of course there would need to be error checking and handling to see if the Comparator
was null
or not and what to do if that was the case, but this gives you the idea. You still don't have any deterministic way to deal with duplicates.
Guava Solution:
If you are using Guava and you should be, you can just use
Ordering.immutableSortedCopy()
only when you need to iterate and be done with it.
Solution 4:
Something like TreeSet (or TreeMultiset in case you need duplicates) with more efficient random access is possible, but I doubt it was implemented in Java. Making each node of the tree remembers the size of its left subtree allows accessing an element by index in time O(log(size))
which is not bad.
In order to implement it, you'd need to rewrite a good portion of the underlying TreeMap.
Solution 5:
The main difference between SortedSet and List is:
- SortedSet keeps it's element in the right order, but you can't really access a specific element by index.
- List allows indexed access, and arbitrary ordering of the elements. It also allows changing any element (by index or Iterator) to another element, without that the location changes.
You seem to want kind of a fusion of both: automatic sorting, and allowing (reasonable fast) index access. Depending on the size of data and how often indexed reading or adding new elements occur, these are my ideas:
- a wrapped ArrayList, where the add method used a ListIterator to find the insertion spot, then inserting the element there. This is O(n) for insertions, O(1) for indexed access.
- a wrapped LinkedList, where the add method used a ListIterator to find the insertion spot, then inserting the element there. (This still is O(n) for insertions (with sometimes quite smaller factor as ArrayList, sometimes even more), as well as indexed access.)
- a modified binary tree keeping track of the sizes of both halves on each level, thus enabling indexed access. (This would be O(log n) for each access, but needs some extra programming, as it is not yet included in the Java SE. Or you find some library that can this.)
In any case, the interfaces and contracts of SortedSet and List are not really compatible, so you'll want the List part be read-only (or read-and-delete-only), not allowing setting and adding, and having an extra object (maybe implementing the Collection interface) for adding Objects.