moga

Firestore cursor-based pagination on the API server

Assumptions

  • Firestore is used on your API server
  • We want to use cursor-based pagination instead of offset-based pagination on the API server
    • The client sends a cursor string to the API server, which retrieves the next page starting from that cursor and returns it

What I want to do

  • I want to modularize pagination on the API server, with Firestore as the data source.
  • The module takes a cursor, a limit, and a Firestore Query such as firestore().collection('posts').where(...).orderBy(...)
  • At a minimum, it returns the following:
    • An array of the retrieved documents
    • Whether there is more data (hasNextPage)
    • The cursor of the last document

Difficulties

The startAfter/startAt methods that Firestore uses for pagination accept either "the values of the fields specified by orderBy" or "a snapshot of a document".

In the former case, the type of the cursor depends on which orderBy is specified. In TypeScript it can be string, number, firestore.Timestamp, and so on, but a string is more practical for returning the cursor to the API client (the pagination cursor in GraphQL's Relay specification is a string). A shared pagination module would then have to know, for every query, "the cursor in this query is of type XX, so it must be converted like this", which gets complicated.
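
To make this difficulty concrete, here is a minimal sketch of what a field-value cursor might need. The CursorValue type and both helper names are hypothetical and not part of Firestore; the point is only that a type tag has to travel with the value, because once stringified, the number 42 and the string '42' look the same.

```typescript
// Hypothetical tagged representation of an orderBy value (not a Firestore type).
type CursorValue =
  | { t: 'string'; v: string }
  | { t: 'number'; v: number }
  | { t: 'timestamp'; seconds: number; nanoseconds: number }

// Serialize the tagged value into an opaque string for the API client.
const encodeValueCursor = (value: CursorValue): string =>
  Buffer.from(JSON.stringify(value), 'utf8').toString('base64')

// Parse the string back. Only the tag is checked here; a real implementation
// would also validate the payload fields.
const decodeValueCursor = (cursor: string): CursorValue => {
  const parsed = JSON.parse(Buffer.from(cursor, 'base64').toString('utf8'))
  switch (parsed.t) {
    case 'string':
    case 'number':
    case 'timestamp':
      return parsed as CursorValue
    default:
      throw new Error(`unknown cursor type: ${String(parsed.t)}`)
  }
}
```

Every query with a different orderBy would need its own mapping between these tagged values and the values passed to startAfter, which is the per-query conversion a shared module would have to handle.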

In the latter case (document snapshot), the snapshot is an object and cannot be turned directly into a string cursor. So I came up with the idea of using the document's path as the cursor.

How to do it

The following is an example in TypeScript (roughly equivalent to Relay-style cursor pagination in GraphQL).

import { firestore } from 'firebase-admin'

// base64 encode the snapshot's path
const encodeCursor = (snapshot: firestore.DocumentSnapshot | firestore.QueryDocumentSnapshot) => {
  return Buffer.from(snapshot.ref.path).toString('base64')
}

const decodeCursor = (cursor: string) => {
  return Buffer.from(cursor, 'base64').toString('utf8')
}

type Connection = {
  nodes: { id: string }[]
  pageInfo: {
    hasNextPage: boolean
    endCursor?: string | null
  }
}

export const paginateFirestore = async (query: firestore.Query, limit: number, cursor?: string | null): Promise<Connection> => {
  // get one more item for hasNextPage
  let q = query.limit(limit + 1)

  if (cursor) {
    // If a cursor is passed, convert it to a path and get a snapshot of the document
    const path = decodeCursor(cursor)
    const snap = await firestore().doc(path).get()

    if (!snap.exists) {
      return { nodes: [], pageInfo: { hasNextPage: false } }
    }

    // pass to startAfter
    q = q.startAfter(snap)
  }

  const snapshot = await q.get()
  const hasNextPage = snapshot.size > limit
  const docs = snapshot.docs.slice(0, limit)

  // make the path of the last document a cursor
  const endCursor = hasNextPage ? encodeCursor(docs[docs.length - 1]) : null

  return {
    nodes: docs.map(doc => ({ id: doc.id, ...doc.data() })),
    pageInfo: {
      hasNextPage,
      endCursor,
    },
  }
}
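
Because the cursor comes from the client, the decoded path could be anything. Below is a minimal sketch of a guard, assuming the paginated query targets a single known collection (it would not fit collection group queries). decodeCursorSafely and its collectionPath parameter are hypothetical names, not part of the code above or of firebase-admin. As far as I know, doc() rejects a path that does not point to a document, so a check like this may also avoid an unhandled error on a malformed cursor.

```typescript
// Hypothetical guard: returns the decoded path only if it looks like a document
// path directly inside the expected collection; otherwise returns null.
const decodeCursorSafely = (cursor: string, collectionPath: string): string | null => {
  const path = Buffer.from(cursor, 'base64').toString('utf8')
  const segments = path.split('/')

  // Document paths alternate collection/document, so they have an even number
  // of segments, none of which is empty.
  const wellFormed = segments.length % 2 === 0 && segments.every(s => s.length > 0)

  // The document must be a direct child of the collection being paginated.
  const parent = segments.slice(0, -1).join('/')

  return wellFormed && parent === collectionPath ? path : null
}
```

A caller could treat a null result the same way as a missing snapshot and return an empty page.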

The usage is as follows.

const query = firestore().collection('posts').orderBy('createdAt', 'desc')
const connection = await paginateFirestore(query, 100, args.cursor)
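
To show how a client would walk through the pages, here is an in-memory sketch that mirrors the same "fetch limit + 1" logic without touching Firestore. paginateArray and collectAll are hypothetical helpers for illustration only; each item's id stands in for the document path, the array is assumed to be sorted already, and limit is assumed to be at least 1.

```typescript
type Page<T> = { nodes: T[]; hasNextPage: boolean; endCursor: string | null }

// In-memory stand-in for paginateFirestore.
const paginateArray = <T extends { id: string }>(
  items: T[],
  limit: number,
  cursor?: string | null,
): Page<T> => {
  let start = 0
  if (cursor) {
    const index = items.findIndex(item => item.id === cursor)
    // Unknown cursor (for example, the item was deleted): return an empty page,
    // mirroring the `!snap.exists` branch.
    if (index === -1) return { nodes: [], hasNextPage: false, endCursor: null }
    start = index + 1
  }
  // Take one more item than requested to learn whether another page exists.
  const fetched = items.slice(start, start + limit + 1)
  const hasNextPage = fetched.length > limit
  const nodes = fetched.slice(0, limit)
  const endCursor = hasNextPage ? nodes[nodes.length - 1].id : null
  return { nodes, hasNextPage, endCursor }
}

// Walk every page the way an API client would, passing each endCursor back.
const collectAll = <T extends { id: string }>(items: T[], limit: number): T[][] => {
  const pages: T[][] = []
  let cursor: string | null = null
  do {
    const page: Page<T> = paginateArray(items, limit, cursor)
    pages.push(page.nodes)
    cursor = page.endCursor
  } while (cursor)
  return pages
}
```

Because endCursor is null on the last page, the loop stops without issuing a request that would come back empty.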

Advantages and disadvantages

Advantages

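  • The process is simpler because the cursor is always a document path.
  • Clients only need to pass back the generated cursor.
  • Pagination by snapshot is more accurate than pagination by orderBy field value: a snapshot identifies one specific document, whereas a field value is ambiguous when several documents share it.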
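
The accuracy point can be illustrated with a simplified in-memory model (hypothetical data and helper names; it imitates the idea, not Firestore's actual ordering rules). Two posts share the same createdAt, and the first page of two items ends on one of them.

```typescript
type Post = { id: string; createdAt: number }

// Sorted by createdAt descending; 'b' and 'c' share a timestamp.
const posts: Post[] = [
  { id: 'a', createdAt: 300 },
  { id: 'b', createdAt: 200 },
  { id: 'c', createdAt: 200 },
  { id: 'd', createdAt: 100 },
]

// Field-value cursor: in descending order, "start after 200" can only mean
// "everything with createdAt below 200", so every item tied at 200 is skipped.
const afterValue = (items: Post[], createdAt: number): Post[] =>
  items.filter(p => p.createdAt < createdAt)

// Snapshot-style cursor: it names one exact item, so the position in the
// sorted list is unambiguous.
const afterItem = (items: Post[], last: Post): Post[] =>
  items.slice(items.findIndex(p => p.id === last.id) + 1)

const lastOfFirstPage = posts[1] // 'b'
const byValue = afterValue(posts, lastOfFirstPage.createdAt).map(p => p.id) // 'c' is lost
const bySnapshot = afterItem(posts, lastOfFirstPage).map(p => p.id) // 'c' is kept
```

With the field-value cursor, post 'c' never appears on any page, while the snapshot-style cursor resumes exactly after 'b'.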

Disadvantages

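  • Overhead from extra reads: the cursor's document is fetched before the page query runs, and the query itself requests one item beyond limit.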

What do you think?

I thought that the overhead of retrieving one extra item would not have a big impact on gRPC, so I implemented it this way, prioritizing the simplicity of the code. If you have any thoughts on this, I'd love to hear your feedback!

Top comments (3)

Rogério W. Carvalho • Edited

In this approach, does it mean that for every page you have to fetch twice? Two round trips, being one to fetch the cursor document and another to fetch the actual page?

Meaning this would always incur an extra read, I guess.

moga

Sorry for the late reply. You are completely right.

Andy Wu

Unfortunately, the approach will not work if the last document of the page gets deleted before requesting the next page since it cannot get the document snapshot of a deleted path.