The next JavaScript standard, ES2018, is here and it comes with a big new feature: asynchronous iteration. It is an enormously useful feature and I want to share with you one super simple example of how we can use it in real life.
In this post I am NOT going to explain what iterators or async iterators are. You can get those explanations here or here.
The problem
We want to fetch paginated data from an API and do stuff with every page. For example, we want to fetch all the commits of a GitHub repo and do some stuff with that data.
We want to separate the logic of "fetching commits" and "doing stuff", so we are going to use two functions. In a Real Life™ scenario, fetchCommits would probably be in a different module, and the "do stuff" part would call fetchCommits somehow:
// Imagine that this function is in a different module...
function fetchCommits(repo) {}

function doStuff() {
  const commits = fetchCommits('facebook/react')
  // do something with `commits`
}
Now, the GitHub API returns commits paginated (like most REST APIs), so we will fetch the commits "in batches". We want to implement this "pagination" logic somehow in fetchCommits. However, we don't want fetchCommits to return all the commits at once; we want to run some logic for each page as it comes, and implement that logic in the "do stuff" part.
Solution without async iteration
Before async iteration, to do this we were more or less forced to use callbacks:
// Here we "do stuff"
fetchCommits('facebook/react', commits => {
  // do something with `commits`
})
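For reference, a callback-based fetchCommits could be sketched like this (onPage is a hypothetical callback parameter, and the page count is assumed to be known in advance just to keep the sketch short):

// A sketch of a callback-based fetchCommits: `onPage` is a
// hypothetical callback invoked once per fetched page.
function fetchCommits(repo, onPage) {
  const url = `https://api.github.com/repos/${repo}/commits?per_page=10`
  const lastPage = 3 // assumed known in advance, just for this sketch
  let page = 1
  function next() {
    if (page > lastPage) return
    fetch(url + '&page=' + page++)
      .then(res => res.json())
      .then(commits => {
        onPage(commits) // hand each page to the caller as it arrives
        next() // then move on to the following page
      })
  }
  next()
}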
Can we use promises? Well, not in this way, because a promise resolves only once, so we would get either just one page or the whole thing:
function doStuff() {
  fetchCommits('facebook/react').then(commits => {
    // do something
  })
}
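To see why, here is a sketch of the only two shapes a promise-based fetchCommits could reasonably take (both hypothetical): resolving with just the first page, or accumulating every page and resolving once at the very end:

// Option 1 (hypothetical): resolve with only the first page
function fetchCommits(repo) {
  const url = `https://api.github.com/repos/${repo}/commits?per_page=10`
  return fetch(url).then(res => res.json())
}

// Option 2 (hypothetical): accumulate all pages, resolve once at the end
async function fetchAllCommits(repo) {
  const url = `https://api.github.com/repos/${repo}/commits?per_page=10`
  const all = []
  let page = 1
  let batch
  do {
    const res = await fetch(url + '&page=' + page++)
    batch = await res.json()
    all.push(...batch)
  } while (batch.length > 0)
  return all // the caller only sees the commits when *everything* is done
}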
Can we use sync generators? Well... we could yield a Promise from the generator and resolve that promise outside of it:
// fetchCommits is a sync generator that yields promises,
// so this loop must run inside an async function
async function doStuff() {
  for (let commitsPromise of fetchCommits('facebook/react')) {
    const commits = await commitsPromise
    // do something
  }
}
This is actually a clean solution, but what does the implementation of the fetchCommits generator look like?
function* fetchCommits(repo) {
  const lastPage = 30 // Must be a known value
  const url = `https://api.github.com/repos/${repo}/commits?per_page=10`
  let currentPage = 1
  while (currentPage <= lastPage) {
    // `fetch` returns a Promise; the generator just yields it
    // (parsed as JSON so the consumer gets commits, not a raw Response)
    yield fetch(url + '&page=' + currentPage).then(res => res.json())
    currentPage++
  }
}
Not a bad solution, but we have one big issue here: the lastPage value must be known in advance. That is often not possible, since the value only arrives in the headers of the first request.
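For reference, that pagination info arrives in the link response header, which looks roughly like this (an illustrative value, not an actual API response):

link: <https://api.github.com/repos/facebook/react/commits?per_page=10&page=2>; rel="next",
      <https://api.github.com/repos/facebook/react/commits?per_page=10&page=35>; rel="last"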
If we still want to use generators, then we can use an async function to get that value and return a sync generator...
// Assumes parseLinkHeader comes from the 'parse-link-header' package
async function fetchCommits (repo) {
  const url = `https://api.github.com/repos/${repo}/commits?per_page=10`
  const response = await fetch(url)
  // Here we are calculating the last page...
  // (with `fetch`, headers are a Headers object, hence `get('link')`)
  const last = parseLinkHeader(response.headers.get('link')).last.url
  const lastPage = parseInt(
    last.split('?')[1].split('&').filter(q => q.indexOf('page') === 0)[0].split('=')[1]
  )
  // And this is the actual generator
  return function* () {
    let currentPage = 1
    while (currentPage <= lastPage) {
      // And this looks harmless, but we are hard-coding URLs!!
      yield fetch(url + '&page=' + currentPage).then(res => res.json())
      currentPage++
    }
  }
}
This is not a good solution since we are literally hard-coding the "next" URL.
Also the usage of this could be a bit confusing...
async function doStuff() {
  // Calling a function to get...
  const getIterator = await fetchCommits('facebook/react')
  // ... a function that returns an iterator???
  for (const commitsPromise of getIterator()) {
    const value = await commitsPromise
    // Do stuff...
  }
}
Optimally, we want to obtain the "next" URL after every request, and that involves putting asynchronous logic in the generator but outside of the yielded value.
Async generators (async function*) and for await loops
Now, async generators and asynchronous iteration allow us to iterate through structures where all the logic outside of the yielded value is also calculated asynchronously. It means that, for every API call, we can obtain the "next URL" from the headers and also check whether we have reached the end.
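Before the real implementation, here is a minimal self-contained sketch (with a hypothetical countTo generator) showing the two new pieces of syntax working together:

// A hypothetical async generator: `await` and `yield` can be combined,
// so asynchronous work can happen *outside* of the yielded value
async function* countTo(limit) {
  for (let i = 1; i <= limit; i++) {
    await new Promise(resolve => setTimeout(resolve, 100)) // any async work
    yield i
  }
}

async function main() {
  // for-await-of drives the async iterator, one value at a time
  for await (const n of countTo(3)) {
    console.log(n) // prints 1, 2, 3, one line every ~100ms
  }
}

main()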
In fact, this could be a real implementation (the example works in Node >= 10):
const request = require('request-promise')
const parseLinkHeader = require('parse-link-header')

async function* fetchCommits (repo) {
  let url = `https://api.github.com/repos/${repo}/commits?per_page=10`
  while (url) {
    const response = await request(url, {
      headers: {'User-Agent': 'example.com'},
      json: true,
      resolveWithFullResponse: true
    })
    // We obtain the "next" url by looking at the "link" header,
    // and we need an async generator because the header is part of the response.
    const linkHeader = parseLinkHeader(response.headers.link)
    // If the "link" header is not present or doesn't have a "next" value,
    // "url" will be falsy and the loop will finish
    url = linkHeader && linkHeader.next && linkHeader.next.url
    yield response.body
  }
}
And the logic of the calling function also gets really simple:
async function start () {
  let total = 0
  const iterator = fetchCommits('facebook/react')
  // Here is the "for-await-of"
  for await (const commits of iterator) {
    // Do stuff with "commits" like printing the "total"
    total += commits.length
    console.log(total)
    // Or maybe throwing errors
    if (total > 100) {
      throw new Error('Manual Stop!')
    }
  }
  console.log('End')
}

// catch the "Manual Stop!" so it doesn't become an unhandled rejection
start().catch(err => console.error(err))
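As a side note, if you prefer not to depend on request-promise, the same generator could be sketched with the native fetch (available in browsers and built into Node >= 18). Note that fetch exposes headers through a Headers object, so the link header is read with headers.get('link'):

const parseLinkHeader = require('parse-link-header')

async function* fetchCommits (repo) {
  let url = `https://api.github.com/repos/${repo}/commits?per_page=10`
  while (url) {
    const response = await fetch(url, {
      headers: { 'User-Agent': 'example.com' }
    })
    const body = await response.json()
    // fetch exposes headers through a Headers object, hence `get('link')`
    const linkHeader = parseLinkHeader(response.headers.get('link'))
    url = linkHeader && linkHeader.next && linkHeader.next.url
    yield body
  }
}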
Do you have any other examples of how to use async generators?
Top comments (4)
why aren't you using the link header, which the GitHub API returns precisely for this purpose, to know the next page to load instead?
It is exactly what I'm doing :)
you are right, I actually somehow skipped the last url assignment. One possible improvement, then, would be to fetch the headers once, use URLSearchParams to get/set pages, and load all the pages at once in parallel, returning the results with Promise.all(...) (see the sketch below).
That would be N pages at once, instead of N pages one after the other ;-)
Edit: my suggestion is based on the fact that GitHub returns the last page too, but I guess for your article what you are doing is already good enough as an example.
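For the curious, that suggestion could be sketched like this (a hypothetical fetchAllPages helper; it assumes the link header is present and is parsed with the parse-link-header package):

const parseLinkHeader = require('parse-link-header')

async function fetchAllPages (repo) {
  const base = `https://api.github.com/repos/${repo}/commits?per_page=10`
  // The first request gives us page 1 *and* the link header with the last page
  const first = await fetch(base + '&page=1')
  const linkHeader = parseLinkHeader(first.headers.get('link'))
  const lastPage = linkHeader ? parseInt(linkHeader.last.page, 10) : 1
  // Build the URLs for pages 2..lastPage using URLSearchParams
  const urls = []
  for (let page = 2; page <= lastPage; page++) {
    const url = new URL(base)
    url.searchParams.set('page', String(page))
    urls.push(url.toString())
  }
  // Fire all remaining requests at once: N pages in parallel
  const rest = await Promise.all(
    urls.map(u => fetch(u).then(res => res.json()))
  )
  return [await first.json(), ...rest]
}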
Thanks for the suggestion! Your solution would work perfectly :)
I don't think my solution is valid for all scenarios, but it might be good sometimes. For example: