DEV Community

Optimizing SQL Queries by 23x!!!

Navneet Kumar Singh on January 15, 2024

So I have been into web dev for about 3 years now and professionally for more than a year, This was the first time I had handled an issue that was ...

Read full post

Mike Talbot ⭐ • Jan 15 • Edited

What a really great article, fascinating and so well put together!

I've stopped using ORMs, every time I find a project with abysmal performance its because assumptions in the way an ORM is used, or assumptions on usage style by the ORM's designer. I'll use something to convert JSON to SQL perhaps, but at the raw metal level.

Arber Braja • Jan 15

If you only want to kind of convert json to SQL or kind of having a capability to send to the backend a json looking object regarding what data to fetch and let the backend do its "magic" you could give GraphQL a try. However, with complex queries you could suffer even in GraphQL land.

Mike Talbot ⭐ • Jan 15 • Edited

Good point, and we do use GraphQL. The JSON to SQL is just on the back end to make writing the queries a bit easier, we have an adapted mongodb query parser that support relational joins etc, but we are doing them explicitly.

// Resolver for a GraphQL end point
async function maintenanceReports(_, { regime, group, fromDate, toDate, search, done, notDone, take = 10, skip = 0 }) {
    const client = getClient()
    const $or = []
    const $orDone = []
    const $and = [{ $or }, { $or: $orDone }]
    const query = {
        regime,
        $and,
    }
    if (done) {
        $orDone.push({ closedDate: { $ne: null } })
    }
    if (notDone) {
        $orDone.push({ closedDate: { $eq: null } })
    }
    if (group) {
        $and.push({ group })
    }
    if (fromDate) {
        $and.push({ reportDate: { $gt: Date.create(fromDate).toISOString() } })
    }
    if (toDate) {
        $and.push({ reportDate: { $lt: Date.create(toDate).toISOString() } })
    }
    if (search) {
        $or.push({ _nc_report: { $like: `%${search}%` } })
        $or.push({
            $regimeJoin: {
                $joinsTo: {
                    table: Tree.source,
                    where: { $data: { $like: `%${search}%` }, client },
                },
            },
        })
        $or.push({
            $scheduleJoin: {
                $joinsTo: {
                    table: Schedules.source,
                    where: {
                        $data: { $like: `%${search}%` },
                        $or: [{ client }, { client: "SYSTEM" }, { client: "" }, { client: null }],
                    },
                },
            },
        })
    }

    const list = await MaintenanceReports.list(query, { orderStmt: ["-created"] })

    return {
        total: list.length,
        reports: list.slice(skip, skip + take),
    }
}

Eduard • Jan 15

Yes, there's a whole myriad of issues that could be summarised as "stop using orms".

Navneet Kumar Singh • Jan 16 • Edited

rather than stop using them we should start analysing them and stop trusting them blindly..

Eduard • Jan 16

I agree that they can be useful sometimes, the issue in my opinion is that it pushes new programmers to just never learn SQL, and keep them trapped in ORM land, where they don't even know how to begin to analyze performance or alternatives.

Navneet Kumar Singh • Jan 16

Yep agreed!

Navneet Kumar Singh • Jan 15

Agreed!

ChristopherHoffman • Jan 16

Your custom sql query is not the same as what the ORM provided. You have a bug in your code, due to using pagination with LEFT JOINs. Because of the left joins, if those joins are not one-to-one, then the DB will return multiple rows for the Cartesian product of those joins. That breaks a basic limit/offset for pagination. Your ORM was trying to fix that for you.

For example. If you have a Person table, and a Department table, and a PersonDepartment to allow for a many to many join. If Bob is in three departments, then when you do a left join from person to department, you'll get 3 rows:

Id, name, deptId, deptName
1, Bob, 1, Math
1, Bob, 2, Science,
1, Bob, 3, Engineering

Your ORM would then take that and construct a single person object with an array of departments.

Now imagine you had 10 people, each with 10 departments. Your query, with a limit of 10 and offset of 0, would return the first 10 rows. But because of the left join, it those first 10 rows would probably just be Bob with his 10 departments, when really what you wanted was the first 10 people, with all 100 departments.

That's why your ORM was doing two queries. The first was trying to get the first 10 people's id's that match the where and inner joins. Then, it takes those 10 id's, and has to do a second query and use those id's to grab all the data for those 10 records. That second query might return 100s or thousands of rows, depending on how many one-to-many joins there are.

Prashanth Ragam • Jan 16 • Edited

What would the JSON query(active record pattern) be if we intend to get a cartesian product @christopherhoffman ? In this article the required output is to get all submissions, so if a user Bob has 10 submissions, the correct output is to get 10 records, rather than one record of bob, containing an array of 10 submissions.

Navneet Kumar Singh • Jan 16

right, its use case base we wanted it that way so we went with that

Dave A • Jan 15

Really great article. Thanks for sharing. I've only ever used classic ado.net, entity framework and dapper.net (all for mssql).
I'd add that all can be really effective IF you are knowledgeable and disciplined in how you applied them.
There are certainly more "gotchas" with particular orm frameworks but by and large, they are just tools for you to use as you wish.
I personally found dapper.net to be the best to work with but depends on the developer/lead Dev's preferences and familiarity with raw sql .
Great breakdown of potential query optimizations btw 😎👍

Navneet Kumar Singh • Jan 15

Thanks!

Vladas Saulis • Jan 18 • Edited

SQL sucked, sucks and will suck until the last using it dies hard...
1000 times I tell all - hold relations in your head, not in SQL. And one day you realize
that relation many to many is very simple and takes only 2 tables.
I even won't say here about joins (brrr).

Navneet Kumar Singh • Jan 18

😿

Vitalii Samofal • Jan 18 • Edited

can you have multiple "problems" with your "submissions"? if yes, then your limit, offset doesn't really work

you can change skip, take to offset limit and it will do the same thing as you did

Navneet Kumar Singh • Jan 18 • Edited

No, we can't a submission is tied to a single problem furthermore I thought what you are saying should make a lot of sense but sadly there are no options to provide a limit and offset that way.

Lev N. • Jan 15

ORMs are a disaster, I have seen so many queries butchered by ORMs it's really fascinating to me why developers keep using them.

Navneet Kumar Singh • Jan 15

true

Arvin • Jan 18

Anyone has experience moving away from Prisma and see better performance?

Navneet Kumar Singh • Jan 18

raw sql will always be better for perf

Arvin • Jan 18

No doubt. But sometimes the performance increase isn't worth the migration. I'm interested to see if someone has had a significant increase when migrating in a large database.

Navneet Kumar Singh • Jan 18 • Edited

I haven't we use a mixture of orm and raw sql. Wherever we see an improvement is needed we make it.

Derek • Jan 17

ORM's are the suck

Clinton Rocha • Feb 24

Great article, thanks for the information and congratulations on the content!

Navneet Kumar Singh • Feb 25

Thank you 😄

Jose E Saura • Jan 24

Loved the article! It's interesting that ORM f**k things up:/

Navneet Kumar Singh • Jan 29

Yeah sometimes :(

dreic • Jan 16

Using ORM is the only fail you needed :). In business apps ORM should never be used as down the road it will give you big issues.

Navneet Kumar Singh • Jan 16

Harsh but somewhat true 😅

Sacha Wharton • Jan 18

Really great article, thank you for putting in the time to write this up.

Navneet Kumar Singh • Jan 18

Thank you soo much!

Amrita • Jan 15

Perfect 👌

Ricardo • Jan 16

very cool article.
What is the matter with the ORM to perform such a bad performing query ?
I would also suggest to do partition the table by date or so if the table may be interesting too.
good job!

Navneet Kumar Singh • Jan 16 • Edited

Not sure why the orm would do such a thing as I am not familiar with it's implementation.. and yeah we will do partitioning once we come to need it thanks it is a good suggestion

WilliamAnthony • Jan 16

Sorry I have to ask, if ORM is that bad, what should I use if I want to build database independent application. I mean I can switch from mysql to postresql or other relational db

Navneet Kumar Singh • Jan 16 • Edited

Its not that ORM's are bad, it's just that sometimes they mess up as they are a very abstracted version of what we really want to interact with and sometimes those abstractions comes at a cost of perf and memory, In my case we were hitting a limit only on this one api of our so we investigated and found out that the query which was generated by ORM was inefficient and not useful, So we just went with raw sql, we still use typeorm a lot in out codebase but its just that where we need more perf in complex read operations we generally now write it in raw sql.