DEV Community

Cover image for Optimizing SQL Queries by 23x!!!

Optimizing SQL Queries by 23x!!!

Navneet Kumar Singh on January 15, 2024

So I have been into web dev for about 3 years now and professionally for more than a year, This was the first time I had handled an issue that was ...
Collapse
 
miketalbot profile image
Mike Talbot ⭐ • Edited

What a really great article, fascinating and so well put together!

I've stopped using ORMs, every time I find a project with abysmal performance its because assumptions in the way an ORM is used, or assumptions on usage style by the ORM's designer. I'll use something to convert JSON to SQL perhaps, but at the raw metal level.

Collapse
 
arberbr profile image
Arber Braja

If you only want to kind of convert json to SQL or kind of having a capability to send to the backend a json looking object regarding what data to fetch and let the backend do its "magic" you could give GraphQL a try. However, with complex queries you could suffer even in GraphQL land.

Collapse
 
miketalbot profile image
Mike Talbot ⭐ • Edited

Good point, and we do use GraphQL. The JSON to SQL is just on the back end to make writing the queries a bit easier, we have an adapted mongodb query parser that support relational joins etc, but we are doing them explicitly.

// Resolver for a GraphQL end point
async function maintenanceReports(_, { regime, group, fromDate, toDate, search, done, notDone, take = 10, skip = 0 }) {
    const client = getClient()
    const $or = []
    const $orDone = []
    const $and = [{ $or }, { $or: $orDone }]
    const query = {
        regime,
        $and,
    }
    if (done) {
        $orDone.push({ closedDate: { $ne: null } })
    }
    if (notDone) {
        $orDone.push({ closedDate: { $eq: null } })
    }
    if (group) {
        $and.push({ group })
    }
    if (fromDate) {
        $and.push({ reportDate: { $gt: Date.create(fromDate).toISOString() } })
    }
    if (toDate) {
        $and.push({ reportDate: { $lt: Date.create(toDate).toISOString() } })
    }
    if (search) {
        $or.push({ _nc_report: { $like: `%${search}%` } })
        $or.push({
            $regimeJoin: {
                $joinsTo: {
                    table: Tree.source,
                    where: { $data: { $like: `%${search}%` }, client },
                },
            },
        })
        $or.push({
            $scheduleJoin: {
                $joinsTo: {
                    table: Schedules.source,
                    where: {
                        $data: { $like: `%${search}%` },
                        $or: [{ client }, { client: "SYSTEM" }, { client: "" }, { client: null }],
                    },
                },
            },
        })
    }

    const list = await MaintenanceReports.list(query, { orderStmt: ["-created"] })

    return {
        total: list.length,
        reports: list.slice(skip, skip + take),
    }
}
Enter fullscreen mode Exit fullscreen mode
Collapse
 
viiik profile image
Eduard

Yes, there's a whole myriad of issues that could be summarised as "stop using orms".

Collapse
 
navneet7716 profile image
Navneet Kumar Singh • Edited

rather than stop using them we should start analysing them and stop trusting them blindly..

Thread Thread
 
viiik profile image
Eduard

I agree that they can be useful sometimes, the issue in my opinion is that it pushes new programmers to just never learn SQL, and keep them trapped in ORM land, where they don't even know how to begin to analyze performance or alternatives.

Thread Thread
 
navneet7716 profile image
Navneet Kumar Singh

Yep agreed!

Collapse
 
navneet7716 profile image
Navneet Kumar Singh

Agreed!

Collapse
 
christopherhoffman profile image
ChristopherHoffman

Your custom sql query is not the same as what the ORM provided. You have a bug in your code, due to using pagination with LEFT JOINs. Because of the left joins, if those joins are not one-to-one, then the DB will return multiple rows for the Cartesian product of those joins. That breaks a basic limit/offset for pagination. Your ORM was trying to fix that for you.

For example. If you have a Person table, and a Department table, and a PersonDepartment to allow for a many to many join. If Bob is in three departments, then when you do a left join from person to department, you'll get 3 rows:

Id, name, deptId, deptName
1, Bob, 1, Math
1, Bob, 2, Science,
1, Bob, 3, Engineering

Your ORM would then take that and construct a single person object with an array of departments.

Now imagine you had 10 people, each with 10 departments. Your query, with a limit of 10 and offset of 0, would return the first 10 rows. But because of the left join, it those first 10 rows would probably just be Bob with his 10 departments, when really what you wanted was the first 10 people, with all 100 departments.

That's why your ORM was doing two queries. The first was trying to get the first 10 people's id's that match the where and inner joins. Then, it takes those 10 id's, and has to do a second query and use those id's to grab all the data for those 10 records. That second query might return 100s or thousands of rows, depending on how many one-to-many joins there are.

Collapse
 
prashanth-does-dev profile image
Prashanth Ragam • Edited

What would the JSON query(active record pattern) be if we intend to get a cartesian product @christopherhoffman ? In this article the required output is to get all submissions, so if a user Bob has 10 submissions, the correct output is to get 10 records, rather than one record of bob, containing an array of 10 submissions.

Collapse
 
navneet7716 profile image
Navneet Kumar Singh

right, its use case base we wanted it that way so we went with that

Collapse
 
dyloneus profile image
Dave A

Really great article. Thanks for sharing. I've only ever used classic ado.net, entity framework and dapper.net (all for mssql).
I'd add that all can be really effective IF you are knowledgeable and disciplined in how you applied them.
There are certainly more "gotchas" with particular orm frameworks but by and large, they are just tools for you to use as you wish.
I personally found dapper.net to be the best to work with but depends on the developer/lead Dev's preferences and familiarity with raw sql .
Great breakdown of potential query optimizations btw 😎👍

Collapse
 
navneet7716 profile image
Navneet Kumar Singh

Thanks!

Collapse
 
vsaulis profile image
Vladas Saulis • Edited

SQL sucked, sucks and will suck until the last using it dies hard...
1000 times I tell all - hold relations in your head, not in SQL. And one day you realize
that relation many to many is very simple and takes only 2 tables.
I even won't say here about joins (brrr).

Collapse
 
navneet7716 profile image
Navneet Kumar Singh

😿

Collapse
 
vsamofal profile image
Vitalii Samofal • Edited

can you have multiple "problems" with your "submissions"? if yes, then your limit, offset doesn't really work

you can change skip, take to offset limit and it will do the same thing as you did

Image description

Collapse
 
navneet7716 profile image
Navneet Kumar Singh • Edited

No, we can't a submission is tied to a single problem furthermore I thought what you are saying should make a lot of sense but sadly there are no options to provide a limit and offset that way.

Collapse
 
lnahrf profile image
Lev N.

ORMs are a disaster, I have seen so many queries butchered by ORMs it's really fascinating to me why developers keep using them.

Collapse
 
navneet7716 profile image
Navneet Kumar Singh

true

Collapse
 
arvin profile image
Arvin

Anyone has experience moving away from Prisma and see better performance?

Collapse
 
navneet7716 profile image
Navneet Kumar Singh

raw sql will always be better for perf

Collapse
 
arvin profile image
Arvin

No doubt. But sometimes the performance increase isn't worth the migration. I'm interested to see if someone has had a significant increase when migrating in a large database.

Thread Thread
 
navneet7716 profile image
Navneet Kumar Singh • Edited

I haven't we use a mixture of orm and raw sql. Wherever we see an improvement is needed we make it.

Collapse
 
dev-i-am profile image
Derek

ORM's are the suck

Collapse
 
clintonrocha98 profile image
Clinton Rocha

Great article, thanks for the information and congratulations on the content!

Collapse
 
navneet7716 profile image
Navneet Kumar Singh

Thank you 😄

Collapse
 
eddsaura profile image
Jose E Saura

Loved the article! It's interesting that ORM f**k things up:/

Collapse
 
navneet7716 profile image
Navneet Kumar Singh

Yeah sometimes :(

Collapse
 
dreic profile image
dreic

Using ORM is the only fail you needed :). In business apps ORM should never be used as down the road it will give you big issues.

Collapse
 
navneet7716 profile image
Navneet Kumar Singh

Harsh but somewhat true 😅

Collapse
 
sachajw profile image
Sacha Wharton

Really great article, thank you for putting in the time to write this up.

Collapse
 
navneet7716 profile image
Navneet Kumar Singh

Thank you soo much!

Collapse
 
amrita88 profile image
Amrita

Perfect 👌

Collapse
 
lucke01 profile image
Ricardo

very cool article.
What is the matter with the ORM to perform such a bad performing query ?
I would also suggest to do partition the table by date or so if the table may be interesting too.
good job!

Collapse
 
navneet7716 profile image
Navneet Kumar Singh • Edited

Not sure why the orm would do such a thing as I am not familiar with it's implementation.. and yeah we will do partitioning once we come to need it thanks it is a good suggestion

Collapse
 
williamanthony profile image
WilliamAnthony

Sorry I have to ask, if ORM is that bad, what should I use if I want to build database independent application. I mean I can switch from mysql to postresql or other relational db

Collapse
 
navneet7716 profile image
Navneet Kumar Singh • Edited

Its not that ORM's are bad, it's just that sometimes they mess up as they are a very abstracted version of what we really want to interact with and sometimes those abstractions comes at a cost of perf and memory, In my case we were hitting a limit only on this one api of our so we investigated and found out that the query which was generated by ORM was inefficient and not useful, So we just went with raw sql, we still use typeorm a lot in out codebase but its just that where we need more perf in complex read operations we generally now write it in raw sql.