Optimizing queries for performance is at the top of the mind of every DBA - no matter if you’re junior or senior, optimizing your queries is something that you must learn. How do you build your queries so that they provide the best ROI for both you and your team? Find out in this blog.
Every seasoned DBA knows the importance of performance – it is the hidden backbone of applications: after all, it is one of the main reasons why our applications load quickly, why we‘re redirected from one page to another without even noticing a thing, and why business ventures bring revenue. Think about it: would you stay on a website if it was as slow as a turtle? Most likely not. Slow performance = complaining customers = less revenue. Not good!
Optimizing the performance of queries is one of the main tasks of a modern-day DBA. If there‘s no DBA on board, this task is usually undertaken by senior developers, but the issues they face remain the same – and fixing them may not be the easiest of tasks. In this blog, we will help you figure out how best to ensure that your queries are optimized for peak performance at all times.
Application Performance – the Basics
Before we tell you how best to optimize queries that are running within your database, you should decide on the path you’re taking. People usually turn to the performance side of their queries whenever their applications are slowing down, however, not everything is query-related. Here are some general things you should keep in mind:
What kind of queries are you optimizing? Each type of query (
INSERT
,SELECT
,UPDATE
,DELETE
) requires a different approach – e.g. read (SELECT
) queries usually benefit from indexes whileINSERT
,UPDATE
, andDELETE
queries are usually disturbed by them, etc.How much data is inside the database? Optimizing queries while having >100M records inside of the database is way more difficult than optimizing them while working on a small data set.
What kind of project are you working on? This question is very closely related to the second question because if your database is supporting a simple forum, it’s likely that there’s not that much data to work with; on the other hand, if you’re running a big data project like a data breach search engine with billions of records, the toll on your database will be much higher.
What do you want to achieve? Are you aiming to improve the performance of your application as a whole, or do you wish to speed up a specific part of your website?
Answering these questions will help craft a path through which you should move before starting to optimize the performance of your queries. Finally, another very important thing to keep in mind is the fact that your application can be slow due to a multitude of factors including DDoS and other kinds of attacks (DDoS attacks aim to overwhelm the infrastructure of an application effectively grinding it down to a halt), so before working on the performance of your database queries, make sure to take good care of security measures as well – directing traffic through a CDN provided by CloudFlare or other vendors should help.
Once that’s out of the way and you still feel like the problem is within the database itself, read on and we will walk you through everything you need to know.
Query Performance – the Basics
Once you feel that the problem is within your database, first decide what kind of queries are you going to optimize – think of the CRUD (Create Read Update Delete) principle and break it down, then choose the option to go with: the first method we are going to walk you through will help optimize queries that read data (SELECT
queries), and the second method will help with INSERT
, UPDATE
and DELETE
query optimization.
However, while we will split the advice we give you into two aforementioned parts, some parts of the article will apply to both types of queries, so make sure to familiarize yourself with all information before proceeding to optimize queries on your infrastructure – take a backup of your data, and let’s go!
Optimizing SELECT Operations
The optimization of SELECT
queries is the hottest topic of any database performance discussion, and, consequently, the most problematic one as well. In most cases, once our SELECT
queries slow down, the entire application backed by our database noticeably slows down as well. Sometimes, such performance decreases mean customer churn, and as a result – revenue is lost.
Before optimizing any query, though, it would be helpful to familiarize ourselves with their internals – for that, MySQL can offer the SHOW PROFILE
command. The command is a community contribution by Jeremy Cole and it’s included in MySQL by default provided we run MySQL 5.1 or above. The reason profiling is so powerful is because queries are tasks that are comprised of smaller tasks – if we understand how those smaller tasks work, we can improve their performance, and as a result, improve the performance of our query as a whole.
To make use of query profiling in MySQL, perform the following steps:
1. Issue a SET profiling = 1;
query to enable profiling (if you’re using DbVisualizer to run queries, there’s no need to issue a USE
query to select a database – it will be selected for you in a field above the query – DbVisualizer will also provide you the list of the contents of your database server on the left side):
2. Once you’ve enabled profiling, feel free to run your queries (note that DbVisualizer also comes with the capability to limit the result count in the top right corner – such a feature is especially useful when running LIKE
queries with wildcards because wildcards before and after the query mean “any character” – too many results returned may crash the server):
3. Issue a SHOW PROFILES; query to figure out the ID of the query you need to profile (in our case, the ID is 45):
4. Provide the ID in a SHOW PROFILE query and observe the results (see example below):
In our case, everything’s pretty quick (the sum of the duration doesn’t exceed a couple of seconds), but when a really slow SELECT
query is being profiled, everything will probably look differently – most SELECT
queries that need optimizing take their time through the “executing” and “Sending data” phases to both execute and make the data visible to the user. There are multiple things we can do to make them perform faster, though:
1. Make sure MySQL is only scanning through data that is necessary – a frequent downfall in the performance space is the usage of SELECT
* queries – when such queries are in use, the database has to scan through all rows, and if there are more than a million rows in the table, speed can be an issue. To overcome this problem, select only the columns that are necessary for the result set and make use of the LIMIT
clause at the end of the query to limit the number of results. Database management software such as DbVisualizer can make that easier for you by enabling you to limit the number of rows that can be returned (use -1 to disable row limiting) – if a SELECT
* query is a necessity for your application, run it at the beginning of the script and make use of the results of it (via a loop) in multiple places:
2. If you find yourself using wildcards (%), avoid using them at the beginning of your query – as MySQL needs to search through anything (%) and then your string, it will take more time. Consider only using a wildcard at the end of the search string:
1 SELECT * FROM emails WHERE email LIKE 'yahoo.com%'; -- Good!
2 SELECT * FROM emails WHERE email LIKE '%yahoo.com%'; -- Bad!
3. Consider eliminating UNION
and DISTINCT
operators if they’re not necessary for your query to execute – both of those operators cause sorting operations to be completed, which in turn slows down their performance on bigger data sets.
4. Consider indexing and partitioning your tables – each index and every partition has its toll on a database from a size perspective, but each index makes a SELECT
query searching for a specific result faster, and partitions split data down into invisible “tables” making the search faster to complete. However, there’s a caveat – in turn, both indexes and partitions slow down INSERT
, DELETE
, and UPDATE
queries because:
a)INSERT
queries are slowed down because the index and partition have to be updated together with the data itself.
b)The more indexes your table has, the slower DELETE
operations become (source: “Use the Index, Luke!”):
c) When UPDATE queries are in use, MySQL needs to update the index together with the data itself.
5. If possible, switch wildcards to full-text indexes in MySQL: such indexes are usually faster.
6. Utilize the slow query log provided by MySQL. Include the following three variables inside of your my.cnf file and all queries lasting longer than specified in the long_query_time parameter will be added to the file specified in the slow_query_log_file parameter.
These steps should provide you with a good enough foundation to work with – however, if you see that your reading processes are fast enough and you need to optimize other types of queries instead, you might want to read on.
Optimizing INSERT, UPDATE, and DELETE Queries
Most developers know their way around SELECT
query performance, however, when it comes to other types of queries, they find themselves going through the docs again just to figure out they’re really basic to work with. Actually, it’s the SELECT
queries that should take up most of your time – the rest of them are a piece of cake. Here’s how to work with them:
1. To optimize the performance of INSERT
queries, either load the data in by using VALUES ()
to insert multiple values at a time or make use of the LOAD DATA INFILE
capability if you’re working with bigger data sets: the capabilities provided by LOAD DATA INFILE
make MySQL capable to skip multiple operations and load data in bulk after making use of the InnoDB buffer pool. If you find yourself using LOAD DATA INFILE
, though, make sure to familiarize yourself with the docs because the statement comes with additional bells and whistles such as column termination via a character (see image #9), etc.:
2. To speed up UPDATE
queries, lock the table (issue a LOCK TABLES
statement), run multiple UPDATE
queries one after another (consider using a LIMIT
clause to limit the number of rows that are updated at once), then issue an UNLOCK TABLES
statement.
3. Finally, there’s no incredible magic that speeds up DELETEs
either – increase the size of the InnoDB buffer pool (read more), or, for quick deletion of the data in the whole table, use a TRUNCATE
query like so – TRUNCATEs
are always much faster than DELETE
operations:
Summary
In this blog, we have walked you through how you can go about optimizing your queries in MySQL infrastructure for performance. One of the key takeaways we’d want you to remember is that optimizing SELECT
queries for speed will decrease the speed of INSERT
, UPDATE
, and DELETE
queries, and vice versa – other than that, keep in mind at least a couple of things mentioned in this article when optimizing your queries for performance, and you should be good to go.
We hope you’ve enjoyed reading this blog – attentive readers might have noticed that throughout this blog, we’ve kept mentioning how the capabilities of DbVisualizer could help in performance optimization as well; we’re also pleased to offer a free 21-day trial for evaluation purposes, so make sure to grab the deal, share this article with your friends if you think they could make use of the advice in this article, and until next time!
About the author
Lukas Vileikis is an ethical hacker and a frequent conference speaker. He runs one of the biggest & fastest data breach search engines in the world - BreachDirectory.com, frequently speaks at conferences and blogs in multiple places including his blog over at lukasvileikis.com.
Top comments (1)