In Java 8, why were Arrays not given the forEach method of Iterable?
I must be missing something here.
In Java 5, the "for-each loop" statement (also called the enhanced for loop) was introduced. It appears that it was introduced mainly to iterate through Collections. Any collection (or container) class that implements the Iterable interface is eligible for iteration using the "for-each loop". Perhaps for historic reasons, the Java arrays did not implement the Iterable interface. But since arrays were/are ubiquitous, javac would accept the use of the for-each loop on arrays (generating bytecode equivalent to a traditional for loop).
In Java 8, the forEach method was added to the Iterable interface as a default method. This made passing lambda expressions to collections (while iterating) possible (e.g. list.forEach(System.out::println)). But again, arrays don't enjoy this treatment. (I understand that there are workarounds.)
Are there technical reasons why javac couldn't be enhanced to accept arrays in forEach, just like it accepts them in the enhanced for loop? It appears that code generation would be possible without requiring that arrays implement Iterable. Am I being naive?
This is especially important for a newcomer to the language who rather naturally uses arrays because of their syntactical ease. It's hardly natural to switch to Lists and use Arrays.asList(1, 2, 3).
There are a bunch of special cases in the Java language and in the JVM for arrays. Arrays have an API, but it's barely visible. It is as if arrays are declared to have:
- implements Cloneable, Serializable
- public final int length
- public T[] clone(), where T is the array's component type
However, these declarations aren't visible in any source code anywhere. See JLS 4.10.3 and JLS 10.7 for explanations. Cloneable and Serializable are visible via reflection, and are returned by a call to
    Object[].class.getInterfaces()
Perhaps surprisingly, the length field and the clone() method aren't visible reflectively. The length field isn't a field at all; using it turns into a special arraylength bytecode. A call to clone() results in an actual virtual method call, but if the receiver is an array type, this is handled specially by the JVM.
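The reflective behaviour described above can be checked with a small program. This is only a sketch: the class name ArrayReflectionDemo and its helper methods are made up for illustration, while the reflection calls themselves (getInterfaces(), getFields(), getMethods()) are standard java.lang.Class methods.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Method;
import java.util.Arrays;

// Hypothetical demo class; only the java.lang.Class calls are real JDK API.
class ArrayReflectionDemo {
    // Names of the interfaces that an array class reports via reflection.
    static String[] interfaceNames(Class<?> arrayClass) {
        return Arrays.stream(arrayClass.getInterfaces())
                     .map(Class::getName)
                     .toArray(String[]::new);
    }

    // True if reflection exposes a public field named "length".
    static boolean hasLengthField(Class<?> arrayClass) {
        for (Field f : arrayClass.getFields()) {
            if (f.getName().equals("length")) {
                return true;
            }
        }
        return false;
    }

    // True if reflection lists clone() among the public methods.
    static boolean hasPublicClone(Class<?> arrayClass) {
        for (Method m : arrayClass.getMethods()) {
            if (m.getName().equals("clone")) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(interfaceNames(Object[].class)));
        System.out.println("length visible: " + hasLengthField(int[].class));
        System.out.println("clone visible:  " + hasPublicClone(int[].class));

        // Both members are nevertheless usable in source code.
        int[] original = {1, 2, 3};
        int[] copy = original.clone();
        System.out.println(copy.length);
    }
}
```

Even though reflection reports neither member, the last few lines compile and run, which is exactly the "barely visible API" situation.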
Notably, though, array classes do not implement the Iterable interface.
When the enhanced-for loop ("for-each") was added in Java SE 5, it supported two different cases for the right-hand-side expression: an Iterable or an array type (JLS 14.14.2). The reason is that Iterable instances and arrays are handled completely differently by the enhanced-for statement. That section of the JLS gives the full treatment, but put more simply, the situation is as follows.
For an Iterable<T> iterable, the code
    for (T t : iterable) {
        <loop body>
    }
is syntactic sugar for
    for (Iterator<T> iterator = iterable.iterator(); iterator.hasNext(); ) {
        T t = iterator.next();
        <loop body>
    }
For an array T[] array, the code
    for (T t : array) {
        <loop body>
    }
is syntactic sugar for
    int len = array.length;
    for (int i = 0; i < len; i++) {
        T t = array[i];
        <loop body>
    }
Now, why was it done this way? It would certainly be possible for arrays to implement Iterable, since they implement other interfaces already. It would also be possible for the compiler to synthesize an Iterator implementation that's backed by an array. (There is precedent for this. The compiler already synthesizes the static values() and valueOf() methods that are automatically added to every enum class, as described in JLS 8.9.3.)
But arrays are a very low-level construct, and accessing an array by an int value is expected to be an extremely inexpensive operation. It's quite idiomatic to run a loop index from 0 to an array's length, incrementing by one each time. The enhanced-for loop on an array does exactly that. If the enhanced-for loop over an array were implemented using the Iterable protocol, I think most people would be unpleasantly surprised to discover that looping over an array involved an initial method call and memory allocation (creating the Iterator), followed by two method calls per loop iteration.
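To make that cost concrete, here is a rough sketch of what an array-backed Iterable could look like if written by hand. The class ArrayIterableDemo and the method iterableOf are hypothetical names for illustration, not JDK API, and this is not what javac actually generates.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical demo class; nothing here is part of the JDK.
class ArrayIterableDemo {
    // Wraps an array in an Iterable. Each call to iterator() allocates a
    // new Iterator object, and every element costs a hasNext() call plus
    // a next() call.
    static <T> Iterable<T> iterableOf(T[] array) {
        return () -> new Iterator<T>() {
            private int index = 0;

            @Override
            public boolean hasNext() {
                return index < array.length;
            }

            @Override
            public T next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                return array[index++];
            }
        };
    }

    // Enhanced-for over the wrapper: uses the Iterator protocol.
    static int totalLengthViaIterable(String[] words) {
        int total = 0;
        for (String w : iterableOf(words)) {
            total += w.length();
        }
        return total;
    }

    // Enhanced-for directly over the array: a plain indexed loop.
    static int totalLengthViaArray(String[] words) {
        int total = 0;
        for (String w : words) {
            total += w.length();
        }
        return total;
    }

    public static void main(String[] args) {
        String[] words = {"a", "bb", "ccc"};
        System.out.println(totalLengthViaIterable(words));
        System.out.println(totalLengthViaArray(words));
        // The wrapper also picks up the default forEach method of Iterable.
        iterableOf(words).forEach(System.out::println);
    }
}
```

Both loops compute the same result. The difference is only in the work done per iteration, which a JIT compiler may or may not be able to optimize away.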
So when default methods were added to Iterable in Java 8, this didn't affect arrays at all.
As others have noted, if you have an array of int, long, double, or of reference type, it's possible to turn this into a stream using one of the Arrays.stream() calls. This provides access to map(), filter(), forEach(), etc.
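As a short illustration of that workaround, the sketch below streams an int[] and a String[]. The class name ArrayStreamDemo and the method evenSquares are invented for the example; Arrays.stream and the stream operations are standard JDK API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class showing Arrays.stream() on arrays.
class ArrayStreamDemo {
    // Collects the squares of the even values, in encounter order.
    static List<Integer> evenSquares(int[] values) {
        List<Integer> out = new ArrayList<>();
        Arrays.stream(values)              // IntStream over the array
              .filter(v -> v % 2 == 0)
              .map(v -> v * v)
              .forEach(v -> out.add(v));   // sequential stream, so order is kept
        return out;
    }

    public static void main(String[] args) {
        System.out.println(evenSquares(new int[] {1, 2, 3, 4}));

        // For reference types, Arrays.stream returns a Stream<T>.
        String[] names = {"alpha", "beta"};
        Arrays.stream(names).forEach(System.out::println);
    }
}
```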
It would be nice, though, if the special cases in the Java language and JVM for arrays were replaced by real constructs (along with fixing a bunch of other array-related problems, such as poor handling of 2+ dimensional arrays, the 2^31 length limitation, and so forth). This is the subject of the "Arrays 2.0" investigation being led by John Rose. See John's talk at JVMLS 2012 (video, slides). The ideas relevant to this discussion include introduction of an actual interface for arrays, to allow libraries to interpose element access, to support additional operations such as slicing and copying, and so forth.
Note that all of this is investigation and future work. There is nothing from these array enhancements that is committed in the Java roadmap for any release, as of this writing (2016-02-23).
Suppose special code were added to the Java compiler to handle forEach. Then many similar questions could be asked. Why can't we write myArray.fill(0)? Or myArray.copyOfRange(from, to)? Or myArray.sort()? myArray.binarySearch()? myArray.stream()? Practically every static method in the Arrays class could be converted into a corresponding method of the "array class". Why should the JDK developers stop at myArray.forEach()? Note, however, that every such method would have to be added not only to the class library specification but also to the Java Language Specification, which is far more stable and conservative. This would also mean that not only would the implementation of such methods become part of the specification, but classes like java.util.function.Consumer (the argument type of the proposed forEach method) would have to be mentioned explicitly in the JLS.
Also note that new consumer interfaces such as FloatConsumer, ByteConsumer, etc. would have to be added to the standard library for the corresponding array types.
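For illustration, this is roughly what such a helper would need for float[] today. The FloatConsumer interface below is hypothetical and declared locally, since java.util.function only provides IntConsumer, LongConsumer and DoubleConsumer among the primitive consumers. The class name FloatForEachDemo and its methods are likewise made up.

```java
// Hypothetical demo class; FloatConsumer does not exist in the JDK.
class FloatForEachDemo {
    // Hypothetical primitive consumer, modelled on java.util.function.IntConsumer.
    @FunctionalInterface
    interface FloatConsumer {
        void accept(float value);
    }

    // Static stand-in for the proposed myArray.forEach(...) on float[].
    static void forEach(float[] array, FloatConsumer action) {
        for (float f : array) {
            action.accept(f);
        }
    }

    // Sums the elements by passing a lambda to forEach.
    static float sum(float[] array) {
        float[] total = {0f}; // one-element array so the lambda can mutate it
        forEach(array, f -> total[0] += f);
        return total[0];
    }

    public static void main(String[] args) {
        System.out.println(sum(new float[] {1.5f, 2.5f}));
    }
}
```

A similar interface would be needed for each remaining primitive element type (byte, short, char, boolean), which is the growth in API surface the paragraph above refers to.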
Currently the JLS rarely refers to types outside of the java.lang package (with some notable exceptions like java.util.Iterator). This separation provides a layer of stability. The proposed change is too drastic for the Java language.
Also note that currently there is one method that can be called on arrays directly and whose implementation differs from that of java.lang.Object: the clone() method. It already adds some dirty parts to javac and even to the JVM, as it must be handled specially everywhere. This causes bugs (e.g. method references were handled incorrectly in Java 8, see JDK-8056051). Adding more complexity of this kind to javac may introduce even more similar bugs.
Such a feature will probably be implemented in the not-so-near future as part of the Arrays 2.0 initiative. The idea is to introduce a superclass for arrays that lives in the class library, so that new methods can be added by writing normal Java code, without tweaking javac or the JVM. However, this is also a very hard feature to implement, because arrays have always been treated specially in Java, and as far as I know it is not yet known whether or when it will be implemented.