What is the relation between Iterable and Iterator?

What is the difference between Iterator and Iterable in Scala?

I thought that Iterable represents a set that I can iterate through, and Iterator is a "pointer" to one of the items in the iterable set.

However, Iterator has functions like foreach, map, foldLeft. It can be converted to an Iterable via toIterable. And, for example, scala.io.Source.getLines returns an Iterator, not an Iterable.

But I cannot do groupBy on Iterator and I can do it on Iterable.

So, what's the relation between those two, Iterator and Iterable?


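In short: an Iterator has state, whereas an Iterable does not. An Iterator remembers how far it has advanced and is consumed as you go; an Iterable just hands out a fresh Iterator each time you ask for one.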

See the API docs for both.

Iterable:

A base trait for iterable collections.

This is a base trait for all Scala collections that define an iterator method to step through one-by-one the collection's elements. [...] This trait implements Iterable's foreach method by stepping through all elements using iterator.

Iterator:

Iterators are data structures that allow to iterate over a sequence of elements. They have a hasNext method for checking if there is a next element available, and a next method which returns the next element and discards it from the iterator.

An iterator is mutable: most operations on it change its state. While it is often used to iterate through the elements of a collection, it can also be used without being backed by any collection (see constructors on the companion object).
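To illustrate the last point of that quote, here is a small sketch of an iterator that is not backed by any collection. It uses Iterator.iterate from the companion object; the name powers is just my own choice for the example. You can paste it into the REPL:

```scala
// An iterator with no backing collection: powers of two, computed on demand.
val powers: Iterator[Int] = Iterator.iterate(1)(_ * 2)

// Each call to next() advances the iterator's internal state.
val first  = powers.next()
val second = powers.next()
val third  = powers.next()
```

Because the elements are produced lazily, the iterator is conceptually infinite, so anything that tries to consume it completely (toList, foreach, size) would never finish.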

With an Iterator you can stop an iteration and continue it later if you want. If you try to do this with an Iterable it will begin from the head again:

scala> val iterable: Iterable[Int] = 1 to 4
iterable: Iterable[Int] = Range(1, 2, 3, 4)

scala> iterable.take(2)
res8: Iterable[Int] = Range(1, 2)

scala> iterable.take(2)
res9: Iterable[Int] = Range(1, 2)

scala> val iterator = iterable.iterator
iterator: Iterator[Int] = non-empty iterator

scala> if (iterator.hasNext) iterator.next
res23: AnyVal = 1

scala> if (iterator.hasNext) iterator.next
res24: AnyVal = 2

scala> if (iterator.hasNext) iterator.next
res25: AnyVal = 3

scala> if (iterator.hasNext) iterator.next
res26: AnyVal = 4

scala> if (iterator.hasNext) iterator.next
res27: AnyVal = ()

Note that I didn't use take on the Iterator. The reason is that it is tricky to use: hasNext and next are the only two methods that are guaranteed to work as expected on an Iterator. See the Scaladoc again:

It is of particular importance to note that, unless stated otherwise, one should never use an iterator after calling a method on it. The two most important exceptions are also the sole abstract methods: next and hasNext.

Both these methods can be called any number of times without having to discard the iterator. Note that even hasNext may cause mutation -- such as when iterating from an input stream, where it will block until the stream is closed or some input becomes available.

Consider this example for safe and unsafe use:

def f[A](it: Iterator[A]) = {
  if (it.hasNext) {            // Safe to reuse "it" after "hasNext"
    it.next                    // Safe to reuse "it" after "next"
    val remainder = it.drop(2) // it is *not* safe to use "it" again after this line!
    remainder.take(2)          // it is *not* safe to use "remainder" after this line!
  } else it
}
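One way to stay on the safe side is to keep working only with the iterator returned by the most recent call and to materialise it once at the end. The following is a minimal sketch of that pattern; firstAfterHeader is a hypothetical helper I made up for this example, not something from the standard library:

```scala
// Hypothetical helper: skip one leading element, then collect up to n more.
def firstAfterHeader[A](it: Iterator[A], n: Int): List[A] = {
  if (it.hasNext) it.next()   // safe: hasNext/next may be called repeatedly
  val limited = it.take(n)    // from here on, use only `limited`, never `it`
  limited.toList              // materialise once; `limited` is spent afterwards
}

val result = firstAfterHeader(Iterator("header", "a", "b", "c"), 2)
val none   = firstAfterHeader(Iterator[String](), 2)
```

The caller should treat the iterator it passed in as used up, since its state after take is not specified.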

Another explanation from Martin Odersky and Lex Spoon:

There's an important difference between the foreach method on iterators and the same method on traversable collections: When called on an iterator, foreach will leave the iterator at its end when it is done. So calling next again on the same iterator will fail with a NoSuchElementException. By contrast, when called on a collection, foreach leaves the number of elements in the collection unchanged (unless the passed function adds or removes elements, but this is discouraged, because it may lead to surprising results).

Source: http://www.scala-lang.org/docu/files/collections-api/collections_43.html
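A short sketch of that difference, with variable names of my own choosing. The collection can be traversed with foreach as often as you like, while the iterator is empty after the first pass:

```scala
val list = List(1, 2, 3)

// A collection can be traversed any number of times.
var listSum = 0
list.foreach(listSum += _)
list.foreach(listSum += _)

// An iterator is consumed by the first traversal.
val it = list.iterator
var itSum = 0
it.foreach(itSum += _)
val exhausted = !it.hasNext
```

Here I only check hasNext afterwards; calling next at this point is what would throw the NoSuchElementException mentioned in the quote.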

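Note also (thanks to Wei-Ching Lin for this tip) that in Scala 2.12 and earlier Iterator extends the TraversableOnce trait directly, whereas Iterable inherits it only indirectly through Traversable. Since Scala 2.13 both extend IterableOnce instead.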
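As for the groupBy mentioned in the question: one option, assuming the data fits in memory, is to materialise the iterator into a collection first and group that. A minimal sketch with made-up sample data:

```scala
// Sample data standing in for something like the result of getLines.
val words: Iterator[String] = Iterator("apple", "avocado", "banana")

// toList consumes the iterator; groupBy then runs on the resulting List.
val byInitial: Map[Char, List[String]] = words.toList.groupBy(_.head)
```

After toList the iterator is exhausted, so keep working with the list or the map rather than with words.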