DEV Community

Cover image for Pagination in raw SQL with OFFSET

Pagination in raw SQL with OFFSET

Update

Stop using offset for pagination - Why it's grossly inefficient
by u/fyzic in programming

Although I shared this link here, I strongly to read this post to the end so that when you're reading the aforementioned post, it feels more relevant.


Original post

How you ever wondered how pagination works, without having to learn it in the context of web development, or a specific programming language? If yes and you're like me, someone who is an admirer of PostgreSQL, you came to the right place. Here we're gonna check and see how this magic called pagination works in SQL.

Sit tight; we're gonna kick things off and move as quickly as possible from here on out.

Some assumptions I am making so that we can move to the more interesting parts of this post:

  • We have a table called "news_articles":

    column name type
    id uuid
    title Timestamp
    created_at Timestamp
    updated_at Timestamp
  • We wanna fetch part of the store data according to the WHERE clause, OFFSET, and LIMIT.

  • I added the prisma part at the end of the post. So you can find it under "Prisma" section.


Breaking the problem into two half

  1. How can I do it in two separate queries (super simple).
  2. How can I combine them into one query, I do not like the sound of a two I/O whereas I can do it in one.

Separate queries

  • Selecting data:

    SELECT *
    FROM public.news_articles
    WHERE title LIKE '%something%'
    ORDER BY created_at ASC
    OFFSET 0
    LIMIT 10;
    

    CAUTION

    • It is important to add OFFSET and LIMIT after everything else. Otherwise your query will fail when you execute it.
    • Do not omit ORDER BY when you have used LIMIT if you want to be able to predict what subset of data you're gonna receive.

      Query optimizer takes LIMIT into account when generating query plans, so you are very likely to get different plans (yielding different row orders) depending on what you give for LIMIT and OFFSET.

      Using different LIMIT/OFFSET values to select different subsets will give inconsistent results unless you enforce a predictable result ordering with ORDER BY.

      This is not a bug; it is an inherent consequence of the fact that SQL does not promise to deliver the results of a query in any particular order unless ORDER BY is used.
      Ref.

  • Counting all the records, so that we can calculate next page and previous page number if any:

    SELECT COUNT(id) as "total"
    FROM public.news_articles;
    

    Here we are using COUNT(id) so that we are sure we are counting all the records. As you know this aggregate function won't count null values. Thus id is the best choice.
    Ref

  • And then we can just calculate the previous page and next page like this:

    SELECT total / 10::double precision AS "totalPage";
    

    Then if totalPage is bigger than the current page ((limit + offset) / limit):

    • I just increase the page by one and previous page will be the current page.
    • Otherwise there is not next page. But our previous page would be the current page minus one.

    Learn more about casting. In this example we:

    • Invoked a cast through its construct: i.e. 10::double precision.
    • But we could do it also explicitly: i.e. CAST(10 AS double precision).

Smashing and combining all of these

  • Nested queries: For that to happen we need to write a subquery within our main query. So that we can get everything in one fell swoop.

    Read about more here, and I could not find something more elaborate. Dunno if they missed it or simply I was not able to find the doc related to subqueries. But most likely it is scattered in their website.

  • We also need to utilize some of the builtin functions of PSQL:

    • TO_JSONB converts all columns of "news_articles" into a jsonb or json datatype (ref).
      • Alternatively we can use JSONB_BUILD_OBJECT where we should alternating keys and values.
    • JSON_AGG to aggregating json data (ref).
SELECT *, (total / 10::double precision)::int AS "totalPage"
FROM (
  SELECT
    (SELECT COUNT(id) FROM public.news_articles) AS "total",
    (
      SELECT JSON_AGG(TO_JSONB(filtered_news_articles))
      FROM (
            SELECT *
            FROM public.news_articles
            WHERE title LIKE '%something%'
            ORDER BY created_at ASC
            OFFSET 0
            LIMIT 10
        ) as "filtered_news_articles"
      ) AS "data"
);
Enter fullscreen mode Exit fullscreen mode

This query will return something like this if you wanted to see it in plain JSON:

{
  "total": 50,
  "data": [
    {
      "id": "9b050c4f-e0dc-4c19-9e02-844957a67522",
      "title": "A title with something inside it!"
      // ...
    },
    {
      "id": "b5c5c3c9-75c9-4495-908f-47e42abc92a9",
      "title": "Is something ready?"
      // ...
    },
    // ...
  ],
  "totalPage": 5
}
Enter fullscreen mode Exit fullscreen mode

IMPORTANT:

  • In a real world app we usually tend to use dynamic values for limit & offset. That's why I used a cast operator to convert limit into double precision. Otherwise it would performed an integer operation and that could lead to not seeing last page's data.
  • Here we are converting the totalPage back to int again after it is calculated.
  • Calculating everything in SQL can become cumbersome if you over do it. Just look at how much harder it is to read it just because we wanted to have the totalPage calculate inside SQL. But instead we could do it in our codebase.

Prisma

For that I actually have a repo but since ATM it is ready I just copy and paste the code here:

prismaClient.$queryRaw<{
  data: {id: string, title: string}[];
  total: number;
}>`SELECT
    (SELECT COUNT(id) FROM public.news_articles) AS "total",
    (
      SELECT JSON_AGG(TO_JSONB(filtered_news_articles))
      FROM (
        SELECT *
        FROM public.news_articles
        WHERE title LIKE '%something%'
        ORDER BY created_at ASC
        OFFSET 0
        LIMIT 10
      ) as "filtered_news_articles"
    )`
Enter fullscreen mode Exit fullscreen mode

NOTE:

  • I have decided to calculate totalPage in my Typescript app.
  • We are importing News interface from @prisma/client, it is generated automatically for us.

If this was helpful to you, consider giving my SQL repo a star, or you simply like this post and share your thoughts about how I could rewrite this query so that would be more efficient and maintainable.

GitHub logo kasir-barati / sql

A place where I keep track of what I know about PostgreSQL + ORMs

SQL

Originally this was a repo I've created while I was working at Spad just to test operator precedence in Prisma. But then I decided it should be where I share my know-how in SQL and this lovely ORM called Prisma. But again I changed my mind and from now on it is gonna be place for my future SQL course.







You can also find me on:

Top comments (0)