Collection Interface vs arrays

We are learning about the Collection interface, and I was wondering if you all have any good advice for its general use. What can you do with a Collection that you cannot do with an array? What can you do with an array that you cannot do with a Collection (besides allowing duplicates)?


Solution 1:

The easy way to think of it is: Collections beat object arrays in baaasically every single way. Consider:

  • A collection can be mutable or immutable. A nonempty array must always be mutable.
  • A collection can be thread-safe; even concurrent. An array is never safe to publish to multiple threads.
  • A collection can allow or disallow null elements. An array must always permit null elements.
  • A collection is type-safe; an array is not. Because arrays "fake" covariance, ArrayStoreException can result at runtime.
  • A collection can hold a non-reifiable type (e.g. List<Class<? extends E>> or List<Optional<T>>). An array will generate a warning for this.
  • A collection has a full-fledged API; an array has only set-at-index, get-at-index, length and clone.
  • A collection can have views (unmodifiable, subList...). No such luck for an array.
  • A list or set's equals, hashCode and toString methods do what users expect; those methods on an array do anything but what you expect -- a common source of bugs.
  • Type-use annotations like @Nullable are extremely confusing with arrays -- quick, what does String @Nullable [] mean? (Are you positive?)
  • Because of all the reasons above, third-party utility libraries should not bother adding much additional support for arrays, focusing only on collections, so you also have a network effect.
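
To make the type-safety and equals points concrete, here is a small sketch. The class and method names are made up for illustration. It relies only on standard behavior: arrays are covariant and inherit identity-based equals from Object, while lists compare element by element.

```java
import java.util.Arrays;
import java.util.List;

class ArrayPitfalls {

    // Arrays are covariant, so this compiles but fails at runtime.
    static boolean storingIntegerInStringArrayThrows() {
        Object[] objects = new String[1]; // legal: a String[] can be used as an Object[]
        try {
            objects[0] = Integer.valueOf(42); // not a String
            return false;
        } catch (ArrayStoreException e) {
            return true;
        }
    }

    // Arrays inherit identity-based equals from Object.
    static boolean arraysWithSameContentsAreEqual() {
        String[] a = {"x", "y"};
        String[] b = {"x", "y"};
        return a.equals(b);
    }

    // Lists compare element by element.
    static boolean listsWithSameContentsAreEqual() {
        List<String> a = Arrays.asList("x", "y");
        List<String> b = Arrays.asList("x", "y");
        return a.equals(b);
    }

    public static void main(String[] args) {
        System.out.println(storingIntegerInStringArrayThrows()); // true
        System.out.println(arraysWithSameContentsAreEqual());    // false
        System.out.println(listsWithSameContentsAreEqual());     // true
        // Printing an array directly shows a type tag and hash code, not the contents;
        // Arrays.toString is the usual workaround.
        System.out.println(Arrays.toString(new String[] {"x", "y"})); // [x, y]
    }
}
```

The list equivalent of the first method, assigning an ArrayList<String> to a List<Object> variable, is rejected by the compiler, so the mistake never reaches runtime.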

Object arrays will never be first-class citizens in Java.

A couple of the reasons above are covered -- but in much greater detail -- in Effective Java, Third Edition, Item 28, from page 126.

So, why would you ever use object arrays?

  • You have to interact with an API that uses them and you can't fix it
    • so convert to/from a List as close to that API as you can
  • Because varargs (but varargs is overused)
    • so ... same as previous
  • You have a good benchmark showing real performance gains you actually need
    • but benchmarks can lie
  • I can't think of any other reasons, they suck bad
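
"Convert as close to that API as you can" might look roughly like the sketch below. legacyNames and legacyCount are hypothetical stand-ins for an array-based API you cannot change; everything else is standard java.util.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ArrayBoundary {

    // Hypothetical stand-in for a legacy API that returns an array.
    static String[] legacyNames() {
        return new String[] {"ada", "grace"};
    }

    // Hypothetical stand-in for a legacy API that accepts an array.
    static int legacyCount(String[] names) {
        return names.length;
    }

    // Copy into an independent, resizable List right after the call.
    static List<String> names() {
        return new ArrayList<>(Arrays.asList(legacyNames()));
    }

    // Convert back to an array only at the moment the legacy API is called.
    static int count(List<String> names) {
        return legacyCount(names.toArray(new String[0]));
    }

    public static void main(String[] args) {
        List<String> names = names();
        names.add("barbara");
        System.out.println(names);        // [ada, grace, barbara]
        System.out.println(count(names)); // 3
    }
}
```

The rest of the code then only ever sees List<String>; the array exists for the duration of one call on each side.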

Solution 2:

It's basically a question of the desired level of abstraction.

Most collections can be implemented in terms of arrays, but they provide many more methods on top of them for your convenience. Most collection implementations I know of, for instance, can grow and shrink on demand or perform other "high-level" operations that basic arrays can't.

Suppose, for instance, that you're loading strings from a file. You don't know how many newline characters the file contains, so you don't know what size to use when allocating the array. An ArrayList, which grows as needed, is therefore a better choice.
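
A minimal sketch of that idea follows. LineLoader and readLines are names invented here, and a StringReader stands in for the file so the example is self-contained; with a real file you would pass a FileReader or use Files.readAllLines instead.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class LineLoader {

    // Reads every line from the source; the list grows as lines arrive,
    // so the line count never has to be known up front.
    static List<String> readLines(Reader source) {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return lines;
    }

    public static void main(String[] args) {
        List<String> lines = readLines(new StringReader("first\nsecond\nthird"));
        System.out.println(lines.size()); // 3
        System.out.println(lines);        // [first, second, third]
    }
}
```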

Solution 3:

The details are in the subinterfaces of Collection, like Set and List, and in the related Map interface (Map is part of the Collections Framework but does not itself extend Collection). Each of those types has its own semantics. A Set cannot contain duplicates and generally has no notion of order (although some implementations, such as LinkedHashSet and TreeSet, do), following the mathematical concept of a set. A List is closest to an array. A Map has specific behavior for put and get: you put an object under a key, and you retrieve it with the same key.
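
Here is a quick illustration of those semantics, using the standard ArrayList, HashSet and HashMap implementations (the wrapper class and method names are made up for the example):

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class CollectionSemantics {

    // A List keeps duplicates and insertion order.
    static List<String> listOfTwoEqualStrings() {
        List<String> list = new ArrayList<>();
        list.add("a");
        list.add("a");
        return list;
    }

    // A Set ignores the duplicate; the second add returns false.
    static Set<String> setOfTwoEqualStrings() {
        Set<String> set = new HashSet<>();
        set.add("a");
        set.add("a");
        return set;
    }

    // A Map stores a value under a key; putting the same key again replaces the value.
    static Map<String, Integer> mapWithOneKeyPutTwice() {
        Map<String, Integer> map = new HashMap<>();
        map.put("answer", 41);
        map.put("answer", 42);
        return map;
    }

    // Map lives alongside Collection in java.util but is not a Collection itself.
    static boolean mapIsACollection() {
        Object map = new HashMap<String, Integer>();
        return map instanceof Collection;
    }

    public static void main(String[] args) {
        System.out.println(listOfTwoEqualStrings());                // [a, a]
        System.out.println(setOfTwoEqualStrings());                 // [a]
        System.out.println(mapWithOneKeyPutTwice().get("answer"));  // 42
        System.out.println(mapIsACollection());                     // false
    }
}
```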

There are even more details in the implementations of each collection type. For example, the hash-based collections (e.g. HashSet, HashMap) rely on the hashCode() method that exists on every Java object, together with equals().
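
As a sketch of why that matters, compare two hypothetical value classes, one that leaves equals and hashCode as inherited from Object and one that overrides both. RawPoint and Point are invented for this example.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

class HashingDemo {

    // Hypothetical class that keeps Object's identity-based equals and hashCode.
    static final class RawPoint {
        final int x, y;
        RawPoint(int x, int y) { this.x = x; this.y = y; }
    }

    // Hypothetical class that overrides equals and hashCode consistently.
    static final class Point {
        final int x, y;
        Point(int x, int y) { this.x = x; this.y = y; }

        @Override public boolean equals(Object o) {
            if (!(o instanceof Point)) return false;
            Point p = (Point) o;
            return x == p.x && y == p.y;
        }

        @Override public int hashCode() { return Objects.hash(x, y); }
    }

    // A second instance with the same fields is a different object, so lookup fails.
    static boolean findsEquivalentRawPoint() {
        Set<RawPoint> set = new HashSet<>();
        set.add(new RawPoint(1, 2));
        return set.contains(new RawPoint(1, 2));
    }

    // With value-based equals and hashCode, lookup by an equal instance succeeds.
    static boolean findsEquivalentPoint() {
        Set<Point> set = new HashSet<>();
        set.add(new Point(1, 2));
        return set.contains(new Point(1, 2));
    }

    public static void main(String[] args) {
        System.out.println(findsEquivalentRawPoint()); // false
        System.out.println(findsEquivalentPoint());    // true
    }
}
```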

You could simulate the semantics of any collection type on top of an array, but you would have to write a lot of code to do it. For example, to back a Map with an array, you would need to write a method that puts each object entered into your Map into a specific bucket in the array, and you would need to handle duplicate keys. For an array simulating a Set, you would need to write code that rejects duplicates.
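
To give a feel for how much code even the simplest case takes, here is a deliberately naive sketch of a Set simulated with an array. ArrayBackedSet is a made-up class, it uses a linear scan rather than hashing, and it only supports add, contains and size; removal, iteration, equals and hashCode would all be extra work that HashSet already does for you.

```java
import java.util.Arrays;
import java.util.Objects;

class ArrayBackedSet {

    private Object[] elements = new Object[4];
    private int size = 0;

    // Linear scan: O(n) per lookup, unlike a hash-based set.
    boolean contains(Object candidate) {
        for (int i = 0; i < size; i++) {
            if (Objects.equals(elements[i], candidate)) {
                return true;
            }
        }
        return false;
    }

    // Rejects duplicates by hand and grows the backing array when it is full.
    boolean add(Object element) {
        if (contains(element)) {
            return false;
        }
        if (size == elements.length) {
            elements = Arrays.copyOf(elements, size * 2);
        }
        elements[size++] = element;
        return true;
    }

    int size() {
        return size;
    }

    public static void main(String[] args) {
        ArrayBackedSet set = new ArrayBackedSet();
        System.out.println(set.add("a")); // true
        System.out.println(set.add("a")); // false
        System.out.println(set.size());   // 1
    }
}
```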