Iterators are a powerful tool that allow for efficient iteration over data structures, and they are implemented in many programming languages - including Rust. However, Rust's unique ownership system gives rise to interesting differences in how iterators are implemented and used. In this article, we will explore these differences and delve into Rust's features by creating and implementing the most commonly used iterator traits - Iterator
and IntoIterator
.
Imagine you have a structure like this:
pub struct Todos {
pub list: Vec<Todo>,
}
The struct above is a Vec<Todo>
, which is a vector of Todos
, which have this structure:
pub struct Todo {
pub message: String,
pub done: bool,
}
If we wish to iterate over each Todo
in this Vec
, we could simply use its list
property and iterate over its items, right?
But what if we wanted to iterate over Todos
itself, without exposing its internal properties?
That's where things start to get interesting...
What is an iterator?
An iterator is a pattern that allows you to go through each element in an iterable sequentially. Iterators are very useful in loops, allowing you to iterate over items in a neat way.
Let's see a loop without and with iterators, in Javascript:
let array = [1, 2, 3, 4];
// without iterators
for (var i = 0; i < array.length; i++) {
console.log(array[i]);
}
// with iterators; "item" is the iterator
for (item of array) {
console.log(item)
}
As you can see, they are present in a lot of other languages, as Javascript, Python and so on.
Iterators in Rust
In Rust, similarly with languages such as Python, iterators are lazy. It means that they are effectless unless you iterate over them (a.k.a consume them).
let numbers = vec![1, 2, 3];
let numbers_iter = numbers.iter();
The code above creates an iterator - and does nothing with it.
To use the iterator, we should create a for loop, for example as it follows:
let numbers = vec![1, 2, 3];
let numbers_iter = numbers.iter();
for number in numbers {
println!("{}", number)
}
There are also other ways which we can use to create iterators. For example, every time we use map()
in Rust we are creating an iterator.
Iterators vs Iterables
A vector, as seen before, is an iterable.
It means that we can iterate over them; but more precisely, it means that we can use a Vec to create an Iterator. This is the definition of an iterable - some object with which we can create an Iterator. For example, a Vec<u32>
can produce an iterator - but is not an iterator itself.
Creating an Iterator
Okay, let's go back to our Todos
struct and see how could we create a way to iterate over its items.
Todos
has a field list
, which is a Vec<Todos>
.
In Rust, Iterator is a trait which has some methods in it, such as next()
, which is responsible for taking the next element of the collection and returning it, or returning None
if we've already reached the end of the collection. Its implemention looks roughly like this:
trait Iterator {
type Item;
fn next(&mut self) -> Option<Self::Item>;
}
Fine. We could, then, create a way to iterate over the elements of the list
field of our Todos
, writing a custom next()
function. The logic for this is simple - we could do something like this, in pseudocode:
function next() {
if index < list.len() {
let todo = Some(list[index])
self.index += 1
return todo
} else {
return None
}
}
In the logic above, we check for the index of the element and return it, given its index is less than the list length.
But we have a problem here. Where to store index
?
Storing state
That's when the difference between an iterator and an iterable comes into hand.
Translating the pseudocode above, one could implement the next()
function in Todos
like this, implementing a method from the Iterator
trait,
impl<'a> Iterator for Todos {
type Item = &'a Todo;
fn next(&mut self) -> Option<Self::Item> {
if self.index < self.list.len() {
let result = Some(&self.list[self.index]);
self.index += 1;
result
} else {
None
}
}
}
But it would require us to change the structure of our Todos
: we would need to add a index
field in it to use every time we call next()
- which is, every time we iterate over its items through an iterator.
pub struct Todos {
pub list: Vec<Todo>,
index: usize,
}
In this logic, we would modify the constructor function in order to always create the struct with index: 0
.
pub fn new(list: Vec<Todo>) -> Self {
Todos { list, index: 0 }
}
However, this approach simply does not feel right...
Storing an index field in your struct is not ideal; the index is used for storing the current state in an iterator, so it should be part of the iterator, not of the struct. This is why we are going to create an iterator type - which is going to store the index property - and implement the Iterator
trait for this type.
The iterator is a type that abstracts the logic of iterating over an iterable; it also stores the current state of iteration, such as the index property.
TodoIterator
Let's create our iterator then!
First of all, we need to create the type of the iterator:
struct TodosIterator<'a> {
todos: &'a Todos,
index: usize,
}
Note the lifetime annotations here.
TodosIterator
has a field todos
, which references a Todos
. As we are dealing with a reference, we need to ensure that this field points to something valid - and here lifetime parameters come at hand. The struct TodosIterator
is generic over this lifetime, 'a'
.
Basically, we are using this notation to specify that the todos
field of the iterator needs to have the same lifetime as the data it is pointing to. This way, we ensure that it cannot reference a Todos
struct that has been dropped.
Implementing Iterator for TodosIterator
// ps: it could be done using .get() as well, however I wanted to use this approach to make it a bit more clear how Options are returned in this context
impl<'a> Iterator for TodosIterator<'a> {
type Item = &'a Todo;
fn next(&mut self) -> Option<Self::Item> {
if self.index < self.todos.list.len() {
let result = Some(&self.todos.list[self.index]);
self.index += 1;
result
} else {
None
}
}
}
Now, we are doing it right.
In the implementation of this trait, we specify the same lifetime used for the struct. It means that the implementation is tied to the lifetime of the struct - which is necessary because we are returning a reference in the next()
method.
How to use this iterator?
Now we have already created the TodoIterator
struct as well as implemented the Iterator
trait for this struct, how do we use it to iterate over Todos
?
The answer lies on the iter()
method. This method takes a reference of our Todos
and uses it to create an iterator, which we can use to iterate!
impl Todos {
pub fn iter(&self) -> TodosIterator {
TodosIterator {
todos: self,
index: 0,
}
}
}
// Now we can iterate:
for todo in todos.iter() {
println!("{}", todo); // each todo is a &Todo, and is immutable
}
IntoIterator
IntoIterator is a bit different from the Iterator
trait. Its single method, into_iter()
, returns an iterator over some data and makes it possible to iterate over data in a generic way, regardless of its actual type. It happens because all types that implement IntoIterator
can be converted into an iterator.
Let's understand its implementation:
pub trait IntoIterator {
type Item;
type IntoIter: Iterator<Item = Self::Item>;
fn into_iter(self) -> Self::IntoIter;
}
Some important points here:
1 - The Item
type parameter is the type of the elements that the iterator will yield.
2 - The IntoIter
type parameter is the type of the iterator that will be returned by the into_iter
method. This type must implement the Iterator
trait, and the type of its Item
must be the same as the Item
of IntoIterator
.
3 - The into_iter
method takes self
as an argument, meaning it consumes the original object and returns an iterator over its elements.
Ok, how to implement it?
You may think we could reuse the TodosIterator
- however, we can't since it takes a &Todos
, and we need to take ownership of the iterable. So let's create another iterator that does it:
pub struct TodosIntoIterator {
todos: Todos
}
The difference between TodosIntoIterator
and TodosIterator
is that here we are not using a reference, but taking ownership and returning each item itself - that why we also don't need the lifetime annotations anymore. And also there's no index to keep the state, we soon are going to see why.
Following the definition of the trait IntoIterator
, we can implement it for Todos
:
impl IntoIterator for Todos {
type Item = Todo;
type IntoIter = TodosIntoIterator;
fn into_iter(self) -> TodosIntoIterator {
TodosIntoIterator { todos: self }
}
}
However, before doing it, we need to implement Iterator
for TodosIntoIterator
(remember the type parameters?) to describe how we are going to iterate over it.
impl Iterator for TodosIntoIterator {
type Item = Todo;
fn next(&mut self) -> Option<Self::Item> {
if self.todos.list.len() == 0 {
return None;
}
let result = self.todos.list.remove(0);
Some(result)
}
}
This implementation is a bit different from what we've done for TodosIterator
. We are taking advantage of the remove()
method that exists for Vec
s in Rust. This method removes the item at the position n
and returns it to us, giving the ownership (which is necessary to return a Todo
instead of a &Todo
). Due to how this method works, we can always use "0" to return the first element, instead of storing a state and increasing it sequentially.
Using remove
also guarantees that we are going to yield each value exactly once while also not keeping values already returned in the state of the iterator.
And now, we are done! These two implementations make it possible for us to iterate over a Todos
in two different ways:
- By reference (&Todos)
for todo in todo_list.iter() {
println!("{}", todo);// todo is a &Todo
}
// we can reuse todo_list, since it's not consumed
- Taking ownership:
for todo in todo_list {
println!("{}", todo); // todo is a Todo
}
// we cannot reuse todo_list, since the for loop consumes it
Top comments (0)