What is most idiomatic way to create an iterator in Go?

One option is to use channels. Channels are like iterators in a way and you can iterate over them using range keyword. But when you find out you can't break out of this loop without leaking goroutine the usage becomes limited.

What is the idiomatic way to create iterator pattern in go?

Edit:

The fundamental problem with channels is they are a push model. Iterator is is a pull model. You don't have to tell iterator to stop. I'm looking for a way to iterate over collections in a nice expressive way. I would also like to chain iterators (map, filter, fold alternatives).


Solution 1:

Channels are useful, but closures are often more suitable.

package main

import "fmt"

func main() {
    gen := newEven()
    fmt.Println(gen())
    fmt.Println(gen())
    fmt.Println(gen())
    gen = nil // release for garbage collection
}

func newEven() func() int {
    n := 0
    // closure captures variable n
    return func() int {
        n += 2
        return n
    }
}

Playground: http://play.golang.org/p/W7pG_HUOzw

Don't like closures either? Use a named type with a method:

package main

import "fmt"

func main() {
    gen := even(0)
    fmt.Println(gen.next())
    fmt.Println(gen.next())
    fmt.Println(gen.next())
}

type even int

func (e *even) next() int {
    *e += 2
    return int(*e)
}

Playground: http://play.golang.org/p/o0lerLcAh3

There are tradeoffs among the three techniques so you can't nominate one as idiomatic. Use whatever best meets your needs.

Chaining is easy because functions are first class objects. Here's an extension of the closure example. I added a type intGen for integer generator which makes it clear where generator functions are used as arguments and return values. mapInt is defined in a general way to map any integer function to an integer generator. Other functions such as filter and fold could be defined similarly.

package main

import "fmt"

func main() {
    gen := mapInt(newEven(), square)
    fmt.Println(gen())
    fmt.Println(gen())
    fmt.Println(gen())
    gen = nil // release for garbage collection
}

type intGen func() int

func newEven() intGen {
    n := 0
    return func() int {
        n += 2
        return n
    }
}

func mapInt(g intGen, f func(int) int) intGen {
    return func() int {
        return f(g())
    }
}

func square(i int) int {
    return i * i
}

Playground: http://play.golang.org/p/L1OFm6JuX0

Solution 2:

TL;DR: Forget closures and channels, too slow. If the individual elements of your collection are accessible by index, go for the classic C iteration over an array-like type. If not, implement a stateful iterator.

I needed to iterate over some collection type for which the exact storage implementation is not set in stone yet. This, plus the zillions other reasons to abstract the implementation details from the client, lead me to do some testing with various iteration methods. Full code here, including some implementations that make use of errors as values. Here are the benchmark results:

  • classic C iteration over an array-like structure. The type provides the methods ValueAt() and Len():

    l := Len(collection)
    for i := 0; i < l; i++ { value := collection.ValueAt(i) }
    // benchmark result: 2492641 ns/op
    
  • Closure style iterator. The collection's Iterator method returns a next() function (a closure over the collection and cursor) and a hasNext boolean. next() returns the next value and a hasNext boolean. Note that this runs much faster than using separate next() and hasNext() closures returning single values:

    for next, hasNext := collection.Iterator(); hasNext; {
        value, hasNext = next()
    }
    // benchmark result: 7966233 ns/op !!!
    
  • Stateful iterator. A simple struct with two data fields, the collection and a cursor, and two methods: Next() and HasNext(). This time the Iterator() method of the collection returns a pointer to a properly initialized iterator structure:

    for iter := collection.Iterator(); iter.HasNext(); {
        value := iter.Next()
    }
    // benchmark result: 4010607 ns/op
    

As much as I like closures, performance wise it's a no-Go. As for design patterns, well, Gophers prefer the term "idiomatic way to do" stuff for good reason. Also grep the go source tree for iterators: with so few files that mention the name, iterators are definitely not a Go thing.

Also check out this page: http://ewencp.org/blog/golang-iterators/

Anyhow, interfaces do not help in any way here, unless you want to define some Iterable interface, but this is a completely different subject.

Solution 3:

TL;DR: Iterators are not idiomatic in Go. Leave them to other languages.

In depth then, the Wikipedia entry "Iterator pattern" begins, "In object-oriented programming, the iterator pattern is a design pattern..." Two red flags there: First, object oriented programming concepts often don't translate well into Go, and second, many Go programmers don't think much of design patterns. That first paragraph also includes "The iterator pattern decouples algorithms from containers", but only after stating "an iterator [accesses] the container's elements. Well which is it? If an algorithm is accessing the container's elements it can hardly claim to be decoupled. The answer in many languages involves some kind of generics that allows the language to generalize over similar data structures. The answer in Go is interfaces. Interfaces enforce a stricter decoupling of algorithms and objects by denying access to structure and requiring all interaction be based on behavior. Behavior means capabilites expressed through methods on the data.

For a minimal iterator type the needed capability is a Next method. A Go interface can represent an iterator object by simply specifying this single method signature. If you want a container type to be iterable, it must satisfy the iterator interface by implementing all of the methods of the interface. (We only have one here, and in fact it is common for interfaces to have only a single method.)

A minimal working example:

package main

import "fmt"

// IntIterator is an iterator object.
// yes, it's just an interface.
type intIterator interface {
    Next() (value int, ok bool)
}

// IterableSlice is a container data structure
// that supports iteration.
// That is, it satisfies intIterator.
type iterableSlice struct {
    x int
    s []int
}

// iterableSlice.Next implements intIterator.Next,
// satisfying the interface.
func (s *iterableSlice) Next() (value int, ok bool) {
    s.x++
    if s.x >= len(s.s) {
        return 0, false
    }
    return s.s[s.x], true
}

// newSlice is a constructor that constructs an iterable
// container object from the native Go slice type.
func newSlice(s []int) *iterableSlice {
    return &iterableSlice{-1, s}
}

func main() {
    // Ds is just intIterator type.
    // It has no access to any data structure.
    var ds intIterator

    // Construct.  Assign the concrete result from newSlice
    // to the interface ds.  ds has a non-nil value now,
    // but still has no access to the structure of the
    // concrete type.
    ds = newSlice([]int{3, 1, 4})

    // iterate
    for {
        // Use behavior only.  Next returns values
        // but without insight as to how the values
        // might have been represented or might have
        // been computed.
        v, ok := ds.Next()
        if !ok {
            break
        }
        fmt.Println(v)
    }
}

Playground: http://play.golang.org/p/AFZzA7PRDR

This is the basic idea of interfaces, but it's absurd overkill for iterating over a slice. In many cases where you would reach for an iterator in other languages, you write Go code using built in language primitives that iterate directly over basic types. Your code stays clear and concise. Where that gets complicated, consider what features you really need. Do you need to emit results from random places in some function? Channels provide a yield-like capability that allows that. Do you need infinite lists or lazy evaluation? Closures work great. Do you have different data types and you need them to transparently support the same operations? Interfaces deliver. With channels, functions, and interfaces all first class objects, these techniques are all easily composable. So what then is the most idiomatic way? It is to experiment with different techniques, get comfortable with them, and use whatever meets your needs in the simplest way possible. Iterators, in the object oriented sense anyway, are almost never simplest.

Solution 4:

You can break out without leaking by giving your goroutines a second channel for control messages. In the simplest case it's just a chan bool. When you want the goroutine to stop, you send on this channel. Inside the goroutine, you put the iterator's channel send and the listen on the control channel inside a select.

Here's an example.

You can take this further by allowing different control messages, such as "skip".

Your question is pretty abstract, so to say more, a concrete example would be helpful.

Solution 5:

Looking to the container/list package it looks like there is no way to do it. C-like way should be used if you iterate over object.

Something like this.

type Foo struct {
...
}

func (f *Foo) Next() int {
...
}

foo := Foo(10)

for f := foo.Next(); f >= 0; f = foo.Next() {
...
}