Are there equivalents to slice::chunks/windows for iterators to loop over pairs, triplets etc?
It can be useful to iterate over multiple variables at once, overlapping (slice::windows
), or not (slice::chunks
).
This only works for slices; is it possible to do this for iterators, using tuples for convenience?
Something like the following could be written:
for (prev, next) in some_iter.windows(2) {
...
}
If not, could it be implemented as a trait on existing iterators?
It's possible to take chunks of an iterator using Itertools::tuples
, up to a 4-tuple:
use itertools::Itertools; // 0.9.0
fn main() {
let some_iter = vec![1, 2, 3, 4, 5, 6].into_iter();
for (prev, next) in some_iter.tuples() {
println!("{}--{}", prev, next);
}
}
(playground)
1--2
3--4
5--6
If you don't know that your iterator exactly fits into the chunks, you can use Tuples::into_buffer
to access any leftovers:
use itertools::Itertools; // 0.9.0
fn main() {
let some_iter = vec![1, 2, 3, 4, 5].into_iter();
let mut t = some_iter.tuples();
for (prev, next) in t.by_ref() {
println!("{}--{}", prev, next);
}
for leftover in t.into_buffer() {
println!("{}", leftover);
}
}
(playground)
1--2
3--4
5
It's also possible to take up to 4-tuple windows with Itertools::tuple_windows
:
use itertools::Itertools; // 0.9.0
fn main() {
let some_iter = vec![1, 2, 3, 4, 5, 6].into_iter();
for (prev, next) in some_iter.tuple_windows() {
println!("{}--{}", prev, next);
}
}
(playground)
1--2
2--3
3--4
4--5
5--6
If you need to get partial chunks / windows, you can get
TL;DR: The best way to have chunks
and windows
on an arbitrary iterator/collection is to first collect
it into a Vec
and iterate over that.
The exact syntax requested is impossible in Rust.
The issue is that in Rust, a function's signature is depending on types, not values, and while Dependent Typing exists, there are few languages that implement it (it's hard).
This is why chunks
and windows
return a sub-slice by the way; the number of elements in a &[T]
is not part of the type and therefore can be decided at run-time.
Let's pretend you asked for: for slice in some_iter.windows(2)
instead then.
Where would the storage backing this slice live?
It cannot live:
- in the original collection because a
LinkedList
doesn't have a contiguous storage - in the iterator because of the definition of
Iterator::Item
, there is no lifetime available
So, unfortunately, slices can only be used when the backing storage is a slice.
If dynamic allocations are accepted, then it is possible to use Vec<Iterator::Item>
as the Item
of the chunking iterator.
struct Chunks<I: Iterator> {
elements: Vec<<I as Iterator>::Item>,
underlying: I,
}
impl<I: Iterator> Chunks<I> {
fn new(iterator: I, size: usize) -> Chunks<I> {
assert!(size > 0);
let mut result = Chunks {
underlying: iterator, elements: Vec::with_capacity(size)
};
result.refill(size);
result
}
fn refill(&mut self, size: usize) {
assert!(self.elements.is_empty());
for _ in 0..size {
match self.underlying.next() {
Some(item) => self.elements.push(item),
None => break,
}
}
}
}
impl<I: Iterator> Iterator for Chunks<I> {
type Item = Vec<<I as Iterator>::Item>;
fn next(&mut self) -> Option<Self::Item> {
if self.elements.is_empty() {
return None;
}
let new_elements = Vec::with_capacity(self.elements.len());
let result = std::mem::replace(&mut self.elements, new_elements);
self.refill(result.len());
Some(result)
}
}
fn main() {
let v = vec!(1, 2, 3, 4, 5);
for slice in Chunks::new(v.iter(), 2) {
println!("{:?}", slice);
}
}
Will return:
[1, 2] [3, 4] [5]
The canny reader will realize that I surreptitiously switched from windows
to chunks
.
windows
is more difficult, because it returns the same element multiple times which require that the element be Clone
. Also, since it needs returning a full Vec
each time, it will need internally to keep a Vec<Vec<Iterator::Item>>
.
This is left as an exercise to the reader.
Finally, a note on performance: all those allocations are gonna hurt (especially in the windows
case).
The best allocation strategy is generally to allocate a single chunk of memory and then live off that (unless the amount is really massive, in which case streaming is required).
It's called collect::<Vec<_>>()
in Rust.
And since the Vec
has a chunks
and windows
methods (by virtue of implementing Deref<Target=[T]>
), you can then use that instead:
for slice in v.iter().collect::<Vec<_>>().chunks(2) {
println!("{:?}", slice);
}
for slice in v.iter().collect::<Vec<_>>().windows(2) {
println!("{:?}", slice);
}
Sometimes the best solutions are the simplest.