Why Java Collection Framework doesn't contain Tree and Graph
I am familiar with Java Collection Framework which contains basic interfaces: Collection
and Map
. I am wondering why the Framework doesn't contain structures as Tree and Graph which are basic collections. Both can be regarded as sub types of Collection
.
By the way, I know TreeSet
is implemented by Red-Black Tree underlying. However, the TreeSet
is not a Tree but a Set
, so there's no real Tree in the framework.
I am wondering why the Framework doesn't contain structures as Tree and Graph which are basic collections. Both can be regarded as sub types of
Collection
.
This is a good question. I think it simply boils down to scoping. The core features that Collections API provides classes for are:
iteration order: Lists and sorted maps have specified iteration order, most sets don't.
duplicates: Lists allow duplicates, sets do not
index: List values are indexed by integers, map values are indexed by other objects.
This gets us very far and I assume Joshua Bloch et al argued that more feature rich collections (graphs and trees which require internal relationship between elements, sets with multiplicity, bi-directional maps, ...) can be implemented on top of these three core features, and are thus better off in libraries.
The java.util
package contains data structures to organize data of any kind. It deals basically with abstract data structures (like List
, Set
, Map
) which are defined via their methods and behavior (e.g. a Set does contain no elements twice, a List maintains order and allows duplicates, etc.).
You as a developer are free to choose which implementation of these data structures are best suited for the kind of data you deal with (HashSet vs. TreeSet / LinkedList vs. ArrayList / etc.). For example for Sets and Maps you may choose between hash-based implementations and tree-based implementations, which may or may not be suited for what you want to do (in most cases a hash-based implementation will be the best choice, while sometimes, when order is important, a tree may be better suited for your needs - See also HashSet vs TreeSet (here at Stackoverflow)).
If you are thinking of a Tree as a special kind of Graph (which it is), then you’re interested in specific properties which apply to graphs, not to collections in general (which, essentially, are lists, and are in turn used to implement things like graphs).
As mentioned in this thread, if you’re interested in modeling Graphs, there are plenty of choices for Graph libraries. Personally I can recommend JGraphT.
I don’t know why there is no graph library within the JDK (and I don’t know whether that’s a good thing to ask?), but I guess Sun decided to leave this to developers, since most applications that require graphs also require very unique implementations.
I suspect that the answer is that it is a combination of two things:
- A generic tree or graph interface would be "feature poor".
- It is easier and more efficient to implement a tree or graph using fields to represent child and (if you need them) parent pointers.
Note that neither Apache commons or Google commons have generic graph or tree support. However, I did come across a couple of generic tree/graph hierarchies:
- The jsfcompounds project on JavaNet has the "com.truchsess.util" graph and tree framework.
- The OpenJGraph project (inactive) on SourceForge includes tree and graph libraries.
Original answer
The main problem with graphs (and more concretely trees, as you know, a special case where there's a one only root, there can be no loops and all nodes are connected) is that each item in that supposed collection should be hold in another structure: the node. It's tough to make standard that kind of treatment as it requires a double layer of "containers".
If anyone ever happens to work with the TreeModel
for JTree
would notice that is almost impossible to avoid the fact that there are TreeNode
s behind (inside?) and you have to manipulate both, nodes and trees.
Anyway, I agree that the structure would be useful but it's very hard to make it standard, just notice that, for instance, neither Apache Commons Collections nor Google Guava, two big collection extensions for the API, don't have them either.
Update
According to the overview of the API:
The main design goal was to produce an API that was reasonably small, both in size, and, more importantly, in "conceptual weight." It was critical that the new functionality not seem alien to current Java programmers; it had to augment current facilities, rather than replacing them. At the same time, the new API had to be powerful enough to provide all the advantages described above.
So while graphs are proper collections and it'd be maybe possible to make it standard, the size and complexity of the implementation would be too big to fit this design goal, because of the previously explained need of a double layer of containers.
Also, Google Guava has added support for graphs. Apache had a sandbox (development in progress) for graphs but seems dormant since 2011.