What is the relation between Iterable and Iterator?
What is the difference between Iterator and Iterable in Scala?
I thought that Iterable represents a set that I can iterate through, and Iterator is a "pointer" to one of the items in the iterable set.
However, Iterator has functions like foreach, map, foldLeft. It can be converted to Iterable via toIterable. And, for example, scala.io.Source.getLines returns Iterator, not Iterable.
But I cannot do groupBy on Iterator and I can do it on Iterable.
So, what's the relation between those two, Iterator and Iterable?
In short: An Iterator does have state, whereas an Iterable does not.
See the API docs for both.
Iterable:
A base trait for iterable collections.
This is a base trait for all Scala collections that define an iterator method to step through one-by-one the collection's elements. [...] This trait implements Iterable's foreach method by stepping through all elements using iterator.
Iterator:
Iterators are data structures that allow to iterate over a sequence of elements. They have a hasNext method for checking if there is a next element available, and a next method which returns the next element and discards it from the iterator.
An iterator is mutable: most operations on it change its state. While it is often used to iterate through the elements of a collection, it can also be used without being backed by any collection (see constructors on the companion object).
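To illustrate the last point of the quote, here is a minimal sketch of iterators that are not backed by any collection. It uses Iterator.from and Iterator.continually from the companion object; the object name UnbackedIterators and the helper firstSquares are made up for this example.

```scala
object UnbackedIterators {
  // Iterator.from(1) is an infinite iterator; no collection stores its values.
  // Each intermediate iterator is used exactly once, so the chaining is safe.
  def firstSquares(n: Int): List[Int] =
    Iterator.from(1).map(i => i * i).take(n).toList

  def main(args: Array[String]): Unit = {
    println(firstSquares(4)) // List(1, 4, 9, 16)

    // Iterator.continually re-evaluates its argument on every call to next().
    var counter = 0
    val ticks = Iterator.continually { counter += 1; counter }
    println(ticks.next()) // 1
    println(ticks.next()) // 2
  }
}
```

The state lives entirely in the iterator (and, for ticks, in the captured counter), which is why there is nothing to "restart" from.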
With an Iterator you can stop an iteration and continue it later if you want. If you try to do this with an Iterable it will begin from the head again:
scala> val iterable: Iterable[Int] = 1 to 4
iterable: Iterable[Int] = Range(1, 2, 3, 4)
scala> iterable.take(2)
res8: Iterable[Int] = Range(1, 2)
scala> iterable.take(2)
res9: Iterable[Int] = Range(1, 2)
scala> val iterator = iterable.iterator
iterator: Iterator[Int] = non-empty iterator
scala> if (iterator.hasNext) iterator.next
res23: AnyVal = 1
scala> if (iterator.hasNext) iterator.next
res24: AnyVal = 2
scala> if (iterator.hasNext) iterator.next
res25: AnyVal = 3
scala> if (iterator.hasNext) iterator.next
res26: AnyVal = 4
scala> if (iterator.hasNext) iterator.next
res27: AnyVal = ()
Note that I didn't use take on Iterator. The reason for this is that it is tricky to use. hasNext and next are the only two methods that are guaranteed to work as expected on Iterator. See the Scaladoc again:
It is of particular importance to note that, unless stated otherwise, one should never use an iterator after calling a method on it. The two most important exceptions are also the sole abstract methods: next and hasNext.
Both these methods can be called any number of times without having to discard the iterator. Note that even hasNext may cause mutation -- such as when iterating from an input stream, where it will block until the stream is closed or some input becomes available.
Consider this example for safe and unsafe use:
def f[A](it: Iterator[A]) = {
  if (it.hasNext) {            // Safe to reuse "it" after "hasNext"
    it.next                    // Safe to reuse "it" after "next"
    val remainder = it.drop(2) // it is *not* safe to use "it" again after this line!
    remainder.take(2)          // it is *not* safe to use "remainder" after this line!
  } else it
}
Another explanation from Martin Odersky and Lex Spoon:
There's an important difference between the foreach method on iterators and the same method on traversable collections: When called on an iterator, foreach will leave the iterator at its end when it is done. So calling next again on the same iterator will fail with a NoSuchElementException. By contrast, when called on a collection, foreach leaves the number of elements in the collection unchanged (unless the passed function adds or removes elements, but this is discouraged, because it may lead to surprising results).
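The behaviour described in this quote can be sketched as follows. The object name ForeachExhaustion and the helpers drain and sumTwice are hypothetical, chosen only for this illustration.

```scala
object ForeachExhaustion {
  // Sums the elements with foreach, then reports whether anything is left.
  def drain(it: Iterator[Int]): (Int, Boolean) = {
    var sum = 0
    it.foreach(x => sum += x) // consumes every element of the iterator
    (sum, it.hasNext)         // hasNext is now false: the iterator is at its end
  }

  // Calling foreach twice on a collection visits all elements both times.
  def sumTwice(xs: List[Int]): Int = {
    var sum = 0
    xs.foreach(x => sum += x)
    xs.foreach(x => sum += x)
    sum
  }

  def main(args: Array[String]): Unit = {
    println(drain(Iterator(1, 2, 3))) // (6,false)
    println(sumTwice(List(1, 2, 3)))  // 12
  }
}
```

Calling next on the drained iterator would throw the NoSuchElementException mentioned in the quote, whereas the list can be traversed as often as needed.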
Source: http://www.scala-lang.org/docu/files/collections-api/collections_43.html
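Note also (thanks to Wei-Ching Lin for this tip) that Iterator extends the TraversableOnce trait directly, while Iterable does not: in Scala 2.12 and earlier it inherits TraversableOnce only indirectly, through Traversable.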