What is the benefit of polymorphism when using the Collection interface to create an ArrayList object?

I studied polymorphism and understand that it allows dynamic method binding, as shown below.

Assume that class Animal is an abstract class.

public class AnimalReference
{
  public static void main(String[] args)
  {
    Animal ref;                    // set up var for an Animal
    Cow aCow = new Cow("Bossy");   // makes specific objects
    Dog aDog = new Dog("Rover");

    // now reference each as an Animal
    ref = aCow; ref.speak();
    ref = aDog; ref.speak();
  }
}

I used to create an instance of ArrayList like this:

ArrayList myList = new ArrayList();

But I noticed that people usually write:

Collection myList = new ArrayList();

So my confusion is: what is the benefit of declaring it as Collection? Also, I didn't know you could put "Collection" (which is an interface, not an abstract class) in front of "myList".

Why is it not good practice to just say:

ArrayList myList = new ArrayList();

I read the Java documentation for the Collection interface and ArrayList, as well as online tutorials, but it is still not really clear. Could anyone give me some explanation?


Solution 1:

If you declare myList as ArrayList, you fix its concrete type. Everyone using it will depend on this concrete type, and it is easy to (even inadvertently) call methods which are specific to ArrayList, such as ensureCapacity or trimToSize. If sometime later you decide to change it to e.g. LinkedList or CopyOnWriteArrayList, you need to recompile - and possibly even change - client code. Programming to interfaces avoids this risk.
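A minimal sketch of that difference. The class name and the describe helper are made up for illustration; the animal names are borrowed from the question:

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

class InterfaceDeclarationDemo {
    // Depends only on the List contract, so any List implementation works.
    static String describe(List<String> names) {
        return names.size() + " names, first is " + names.get(0);
    }

    public static void main(String[] args) {
        // Declared by concrete type: ArrayList-only methods are in reach,
        // and calling them ties this code to ArrayList.
        ArrayList<String> concrete = new ArrayList<String>();
        concrete.ensureCapacity(100); // exists on ArrayList, not on List
        concrete.add("Bossy");

        // Declared by interface: only the right-hand side names the implementation.
        List<String> names = new ArrayList<String>();
        names.add("Bossy");
        names.add("Rover");
        System.out.println(describe(names)); // prints 2 names, first is Bossy

        // Swapping the implementation touches this one line only.
        names = new LinkedList<String>(names);
        System.out.println(describe(names)); // prints 2 names, first is Bossy
    }
}
```

If names had been declared as ArrayList and describe had taken an ArrayList parameter, the swap to LinkedList would have forced changes in both places.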

Note that between Collection and ArrayList, there is another level of abstraction: the List interface. Typically the usage pattern of a list is very different from that of a map, set or queue. So the type of collection you need for a job is usually decided early on, and is not going to change. Declaring your variable as a List makes this decision clear, and gives its clients useful information regarding the contract this collection obeys. Collection, on the other hand, is usually not very useful for more than iterating through its elements.

Solution 2:

It's probably more common to write List<Something> myList = new ArrayList<Something>(); than to use Collection. Usually some aspects of it being a list are significant. Collection is vague about whether duplicate elements are accepted, because that depends on whether a set or a list (or whatever) sits underneath, and that vagueness can be a pain.
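To make the duplicate question concrete, here is a small sketch (the addNames helper and the class name are made up for illustration). The same code, handed the same three names, ends up with a different size depending on what sits underneath:

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;

class DuplicateDemo {
    // Adds three names, one of them twice, and reports the resulting size.
    static int addNames(Collection<String> target) {
        target.add("Bossy");
        target.add("Rover");
        target.add("Bossy"); // a list keeps this, a set silently rejects it
        return target.size();
    }

    public static void main(String[] args) {
        System.out.println(addNames(new ArrayList<String>())); // prints 3
        System.out.println(addNames(new HashSet<String>()));   // prints 2
    }
}
```

Declaring the parameter as List<String> or Set<String> instead would tell the reader which of the two behaviours to expect.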

That aside, the primary purpose is abstraction, or implementation independence. Do I actually care if the List I have is an ArrayList or a Vector? Probably not most of the time. My code is more flexible if it uses the most general interface that expresses what I need the object to do.

An easy example: suppose you write a program using all ArrayLists, and then later it needs to support multiple users, so for thread safety you want to change all your ArrayLists to Vectors. If you've been passing around references of type ArrayList, you have to change every usage everywhere. If you've been passing around references of type List, you only have to change the places where you create them.
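A rough sketch of that scenario, assuming a hypothetical NameStore class where list creation is kept in one place. Only the creation line names Vector; everything else is written against List:

```java
import java.util.List;
import java.util.Vector;

class NameStore {
    // The only line that names a concrete class. In this hypothetical
    // program it used to read "return new ArrayList<String>();".
    static List<String> newNameList() {
        return new Vector<String>();
    }

    // Written against List, so the switch to Vector leaves it untouched.
    static int register(List<String> names, String name) {
        names.add(name);
        return names.size();
    }

    public static void main(String[] args) {
        List<String> names = newNameList();
        register(names, "Bossy");
        System.out.println(register(names, "Rover")); // prints 2
    }
}
```

Collections.synchronizedList(new ArrayList<String>()) would be another candidate for the creation line, and callers would not notice that change either.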

Also, sometimes the implementing class might not be something you can or want to import and use. For example, when using a persistence provider like Hibernate, the actual class that implements the Set interface could be a highly specialized custom implementation unique to the framework, or it could be plain old HashSet, depending on how the object got created. You don't care about the difference; it's just a Set to you.
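You can see the same effect without any framework. In this sketch, loadTags is a made-up stand-in for a framework call; it returns the result of Collections.unmodifiableSet, whose implementation class is internal to the JDK and not something client code would declare a variable as:

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

class OpaqueImplementationDemo {
    // Hypothetical stand-in for a call that hands back "some Set".
    static Set<String> loadTags() {
        Set<String> tags = new HashSet<String>(Arrays.asList("farm", "pet"));
        return Collections.unmodifiableSet(tags);
    }

    public static void main(String[] args) {
        Set<String> tags = loadTags();
        // The Set contract is all the caller needs.
        System.out.println(tags.contains("pet")); // prints true
        // The runtime class is an implementation detail of the provider.
        System.out.println(tags.getClass().getName());
    }
}
```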

Solution 3:

If you're declaring an ArrayList, I wouldn't ever use ArrayList as the type on the left side. Instead, program to the interface, be it List or Collection.

Note that if you declare a method as taking a Collection, it can be passed a List or Set.
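For instance, a sketch with a made-up longest helper that only needs to iterate, so Collection is enough and both a List and a Set can be passed in:

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

class CollectionParameterDemo {
    // Only iterates, so the most general collection type is sufficient.
    static String longest(Collection<String> names) {
        String best = "";
        for (String name : names) {
            if (name.length() > best.length()) {
                best = name;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        List<String> list = Arrays.asList("Bossy", "Rex", "Clementine");
        Set<String> set = new TreeSet<String>(list);

        System.out.println(longest(list)); // prints Clementine
        System.out.println(longest(set));  // prints Clementine
    }
}
```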

As a side note, consider using generics.

Edit: Having said that, generics also introduce some gotchas.

A List<Animal> variable can refer to an ArrayList<Animal>, but not to an ArrayList<Cat>. You'd need List<? extends Animal> to refer to an ArrayList<Cat>.
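A small sketch of that gotcha. The Animal and Cat classes here are minimal placeholders nested in a demo class, and chorus is a made-up helper:

```java
import java.util.ArrayList;
import java.util.List;

class WildcardDemo {
    static abstract class Animal { abstract String speak(); }
    static class Cat extends Animal { String speak() { return "Meow"; } }

    // Accepts a List<Animal>, a List<Cat>, an ArrayList<Cat>, and so on.
    static String chorus(List<? extends Animal> animals) {
        StringBuilder sb = new StringBuilder();
        for (Animal a : animals) {
            sb.append(a.speak());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        ArrayList<Cat> cats = new ArrayList<Cat>();
        cats.add(new Cat());
        cats.add(new Cat());

        // List<Animal> animals = cats;        // does not compile
        List<? extends Animal> animals = cats; // fine
        // animals.add(new Cat());             // does not compile: element type is unknown

        System.out.println(chorus(animals));   // prints MeowMeow
    }
}
```

The price of the wildcard is that you can read Animals out of the list but cannot add to it through that reference.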

Solution 4:

First of all, there is a significant difference between inheritance and interfaces. A short trip back in history: in plain old C++ you were able to inherit from multiple classes. This had a negative consequence: if a "Transformer" class inherits from "Vehicle" and "Human", both of them implementing a method called "MoveForward", which function should the instance use if you call this method on the "Transformer" class? The "Human" or the "Vehicle" implementation? To solve this problem, interfaces were introduced by Java, C#, and others. Interfaces are a contract between your class and something else. You make a commitment to functionality, but the contract does NOT implement any logic for your class (to prevent the "Transformer.MoveForward" problem).

Polymorphism in general means that something can appear in different ways. A "Transformer" can be a "Vehicle" and a "Human". Depending on your language (I use C#), you can implement different behaviours for "MoveForward", depending on the contract you want to fulfil.

Using interfaces instead of a concrete implementation has several advantages. First, you can switch the implementation without changing the code (search for "Dependency Injection" ;). Second, you can test your code more easily with testing and mocking frameworks. Third, it is good practice to use the most general interface for data exchange (an enumerator instead of a list if you only want to iterate over the values).
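The first two points can be sketched in a few lines of Java. Clock, SystemClock and Greeter are hypothetical names invented for this example; the stub stands in for what a mocking framework would generate:

```java
import java.time.LocalTime;

class GreeterDemo {
    // Hypothetical contract: something that can report the hour of the day.
    interface Clock {
        int hour();
    }

    // Production implementation, backed by the system clock.
    static class SystemClock implements Clock {
        public int hour() {
            return LocalTime.now().getHour();
        }
    }

    // Depends on the contract only; the implementation is passed in.
    static class Greeter {
        private final Clock clock;

        Greeter(Clock clock) {
            this.clock = clock;
        }

        String greet() {
            return clock.hour() < 12 ? "Good morning" : "Good afternoon";
        }
    }

    public static void main(String[] args) {
        // Real clock: the output depends on when you run it.
        System.out.println(new Greeter(new SystemClock()).greet());

        // Hand-written stub with a fixed time, as a test would use.
        Clock nineAm = () -> 9;
        System.out.println(new Greeter(nineAm).greet()); // prints Good morning
    }
}
```

Greeter never changes when the clock implementation is swapped, and the stub makes its behaviour testable without waiting for the afternoon.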

Hope this helps a bit in understanding the advantages of interfaces.

Solution 5:

The type you use in a local variable declaration (as in your ArrayList example) is not usually that significant. All you have to ensure is that the type of myList (the word to the left of the name 'myList') is at least as specific as the type of any method parameter that takes myList.

Consider:

ArrayList words = new ArrayList();
sort(words);
removeNames(words);

public void sort(Collection c) { /* ... */ }

public void removeNames(List words) { /* ... */ }

I could have changed the type of 'words' to just be List. It doesn't make any difference to the readability or the behaviour of my program. I could not define 'words' to be Object though. That's too general, because neither method accepts an Object.

On a related note, when you define a public method, you should give careful consideration to the types of the method's parameters, since this has a direct effect on what the caller can pass in. If I defined sort differently:

ArrayList words = new ArrayList();

// this line will cause a compilation error.
sort(words);

public void sort(LinkedList c) { /* ... */ }

The definition of sort is now very restrictive. In the first example, the sort method allows any object as the parameter, so long as it implements Collection. In the second example, sort only allows a LinkedList; it won't accept anything else (ArrayLists, HashSets, TreeSets and many others). The scenarios in which the sort method can be used are now quite limited. This might be for good reason; the implementation of sort may rely on a feature of the LinkedList data structure. It is only bad to define sort this way if people using this code want a sort algorithm that works for things other than LinkedLists.
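Here is a compilable sketch of both variants side by side. The method names sortedCopy and firstSorted are my own, chosen so the two can live in one class; they are not the sort method from the snippets above:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.HashSet;
import java.util.LinkedList;
import java.util.List;

class ParameterTypeDemo {
    // General: any Collection is accepted, and a sorted copy is returned.
    static List<String> sortedCopy(Collection<String> c) {
        List<String> copy = new ArrayList<String>(c);
        Collections.sort(copy);
        return copy;
    }

    // Restrictive: only a LinkedList is accepted. It sorts in place
    // and then calls LinkedList's getFirst().
    static String firstSorted(LinkedList<String> list) {
        Collections.sort(list);
        return list.getFirst();
    }

    public static void main(String[] args) {
        ArrayList<String> words =
                new ArrayList<String>(Arrays.asList("pear", "apple", "fig"));

        System.out.println(sortedCopy(words));                      // prints [apple, fig, pear]
        System.out.println(sortedCopy(new HashSet<String>(words))); // prints [apple, fig, pear]

        // firstSorted(words); // does not compile: an ArrayList is not a LinkedList
        System.out.println(firstSorted(new LinkedList<String>(words))); // prints apple

        // Object o = words;
        // sortedCopy(o);      // does not compile: Object is too general
    }
}
```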

One of the main skills in writing Java libraries is deciding the types of method parameters. How general do you want to be?