When I delete data in CockroachDB (CRDB), the delete is slow. How can I speed it up?
This is a topic that comes up occasionally, and it's not super intuitive, so I think it would be worth explaining in more detail.
Setup
To explore this, let's create some data we can play with.
First, let's create a database and then configure it to wait only 60 seconds after a delete before actually removing the data from the storage layer in CRDB (this will be helpful if we need to delete the data multiple times and re-run everything):
CREATE DATABASE deletes;
ALTER DATABASE deletes CONFIGURE ZONE USING gc.ttlseconds = 60;
USE deletes;
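If you want to double-check that the setting took effect, you can ask CRDB to show the zone configuration for the database (just a quick sanity check; the exact output columns vary by version):
-- Confirm the reduced GC TTL on the deletes database
SHOW ZONE CONFIGURATION FROM DATABASE deletes;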
Then, let's create a table with a few columns of various cardinality and 100 million rows. There is a helpful function in CRDB called generate_series which is great for this kind of data creation:
CREATE TABLE deletes.order_items ( id PRIMARY KEY, customer_id, order_id, product_id, order_date )
AS
SELECT
g.i AS id,
g.i % 77777 AS customer_id,
g.i % 1000000 as order_id,
g.i % 11111 as product_id,
g.i::TIMESTAMP as order_date
FROM generate_series(1578003200, 1678003200, 1) g(i);
-- 100,000,000 records
-- between 2020-01-02 22:13:20 (i.e., 1578003200)
-- and 2023-03-05 08:00:00 (i.e., 1678003200)
I'm leveraging a few handy techniques here:
- I'm using CREATE TABLE AS (or CTAS as we call it) to generate a new table. This is much faster in CRDB than creating the table first and then doing INSERT INTO ... SELECT because it bypasses all the consistency-checking algorithms.
- I'm using the g.i value as my "seed value" and using it directly as my id field.
- For some of the other fields, I'm using the modulo operator (i.e., %) against the g.i field to create some low-cardinality values.
- I'm generating values in the range between 1578003200 and 1678003200 so that I can easily convert these values into timestamps (see the sanity-check query after this list).
- Tip: To figure out which values you want to use as your start and end values, run SELECT '2023-01-15 2:00 PM'::TIMESTAMP::INT; with your desired date values.
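As a quick sanity check on those boundary values, you can run the cast in the other direction (this mirrors the g.i::TIMESTAMP cast used in the table definition):
-- Convert the boundary seed values back into timestamps
SELECT 1578003200::TIMESTAMP AS range_start,
       1678003200::TIMESTAMP AS range_end;
-- range_start: 2020-01-02 22:13:20
-- range_end:   2023-03-05 08:00:00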
Once we have our table created, let's create indexes on our other fields in order to mimic what a realistic table might do:
CREATE INDEX ON order_items ( customer_id );
CREATE INDEX ON order_items ( order_id );
CREATE INDEX ON order_items ( product_id );
CREATE INDEX ON order_items ( order_date );
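If you want to confirm what was created (and see the default index names CRDB generates, which we'll see again in the explain plans below), you can list the table's indexes:
-- List all indexes on order_items, including the auto-generated names
SHOW INDEXES FROM order_items;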
To get a sense of the cardinality of our fields, we can run a query:
SELECT
COUNT(*) AS rec_count,
COUNT(DISTINCT customer_id) AS cust_count,
COUNT(DISTINCT order_id) AS ord_count,
COUNT(DISTINCT product_id) AS prd_count,
COUNT(DISTINCT order_date) AS date_count
FROM order_items;
Our results:
rec_count | cust_count | ord_count | prd_count | date_count
------------+------------+-----------+-----------+-------------
100000001 | 77777 | 1000000 | 11111 | 100000001
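One side note: the explain plans below lean on table statistics for their row-count estimates. After a bulk load like this, it may be worth refreshing statistics manually rather than waiting for automatic collection (the statistics name here is arbitrary):
-- Refresh optimizer statistics so estimates reflect the freshly loaded data
CREATE STATISTICS order_items_stats FROM order_items;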
Looking at explain plans
Now that we have some data to play with, we can start to understand what happens when we delete data in this table.
Let's start by selecting a single record by the PK value:
SELECT *
FROM order_items
WHERE id = 1600000000;
id | customer_id | order_id | product_id | order_date
-------------+-------------+----------+------------+----------------------
1600000000 | 49333 | 0 | 4889 | 2020-09-13 12:26:40
Now, let's prepend "EXPLAIN" to our query to look at the execution plan.
EXPLAIN
SELECT *
FROM order_items
WHERE id = 1600000000;
info
---------------------------------------------------------------------------------------------------------------------
distribution: local
vectorized: true
• scan
estimated row count: 1 (<0.01% of the table; stats collected 3 hours ago; using stats forecast for 3 hours ago)
table: order_items@order_items_pkey
spans: [/1600000000 - /1600000000]
This is about as simple an execution plan as you can get. It's doing a scan of one span in the primary index (i.e., the table itself).
Now, let's query by the order_date field, which is also a very high-cardinality value with an index on it:
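As an aside, if you want actual execution numbers rather than estimates, you can use EXPLAIN ANALYZE instead -- just keep in mind that it really runs the statement, so be careful using it on DELETEs:
-- EXPLAIN ANALYZE executes the query and reports actual row counts and timing
EXPLAIN ANALYZE
SELECT *
FROM order_items
WHERE id = 1600000000;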
SELECT *
FROM order_items
WHERE order_date = 1600000000::timestamp;
id | customer_id | order_id | product_id | order_date
-------------+-------------+----------+------------+----------------------
1600000000 | 49333 | 0 | 4889 | 2020-09-13 12:26:40
We get the exact same data back. Let's look at the explain plan:
EXPLAIN
SELECT *
FROM order_items
WHERE order_date = 1600000000::timestamp;
info
-------------------------------------------------------------------------------------------------------------------------------------------------------------
distribution: local
vectorized: true
• index join
│ estimated row count: 1
│ table: order_items@order_items_pkey
│
└── • scan
estimated row count: 1 (<0.01% of the table; stats collected 3 hours ago; using stats forecast for 3 hours ago)
table: order_items@order_items_order_date_idx
spans: [/'2020-09-13 12:26:40' - /'2020-09-13 12:26:40']
index recommendations: 1
1. type: index replacement
SQL commands: CREATE INDEX ON order_items (order_date) STORING (customer_id, order_id, product_id); DROP INDEX order_items@order_items_order_date_idx;
Notice a few differences between this explain plan and the one above:
- We do a scan of the order_items_order_date_idx this time.
- We then do an index join back to the primary index order_items_pkey.
- CRDB has helpfully given us some suggestions on how we can alter our index to perform better.
It's very efficient to use the order_date index since our predicate is on order_date. But the order_date index only contains the field we're indexing (order_date) and a reference back to the PK (id). In order to satisfy our query's SELECT * clause, we need to join back to the primary index to get the rest of the field values in this record. The index suggestion is telling us that if we include/store all the other fields from the record in the order_date index, we can skip the index join since we'll already have all the data points necessary to satisfy our query.
You can prove this by changing the SELECT query to only return the id and order_date fields:
EXPLAIN
SELECT id, order_date
FROM order_items
WHERE order_date = 1600000000::timestamp;
info
---------------------------------------------------------------------------------------------------------------------
distribution: local
vectorized: true
• scan
estimated row count: 1 (<0.01% of the table; stats collected 3 hours ago; using stats forecast for 3 hours ago)
table: order_items@order_items_order_date_idx
spans: [/'2020-09-13 12:26:40' - /'2020-09-13 12:26:40']
Moving on to deletes
Now that we have a sense of how to interpret query plans on our table, let's think about what happens when we delete a record in CRDB.
When we delete a record by the PK value, our execution plan is very simple -- find the record and delete it:
EXPLAIN
DELETE FROM order_items
WHERE id = 1600000000;
info
-------------------------------------------------------------------------------------------------------------------------
distribution: local
vectorized: true
• delete
│ from: order_items
│ auto commit
│
└── • scan
estimated row count: 1 (<0.01% of the table; stats collected 3 hours ago; using stats forecast for 3 hours ago)
table: order_items@order_items_pkey
spans: [/1600000000 - /1600000000]
If we do a similar DELETE query using the order_date field instead of the PK, we see that we once again have an index join:
EXPLAIN DELETE FROM order_items
WHERE order_date = 1600000000::timestamp;
info
-------------------------------------------------------------------------------------------------------------------------------------------------------------
distribution: local
vectorized: true
• delete
│ from: order_items
│ auto commit
│
└── • index join
│ estimated row count: 1
│ table: order_items@order_items_pkey
│
└── • scan
estimated row count: 1 (<0.01% of the table; stats collected 3 hours ago; using stats forecast for 3 hours ago)
table: order_items@order_items_order_date_idx
spans: [/'2020-09-13 12:26:40' - /'2020-09-13 12:26:40']
index recommendations: 1
1. type: index replacement
SQL commands: CREATE INDEX ON order_items (order_date) STORING (customer_id, order_id, product_id); DROP INDEX order_items@order_items_order_date_idx;
The need for this index join is pretty obvious in the SELECT query, but why is it needed for the DELETE query? The answer is that when CRDB deletes a record, it needs to delete the entry (or entries) from the primary index, but it also needs to delete the corresponding entries from all of the secondary indexes. And, in order to delete those index entries, it needs the index keys.
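One way to picture this: deleting a single row has to remove one entry from the primary index plus one entry from each secondary index, and each secondary-index entry is keyed by the indexed value plus the PK. Roughly (illustrative only -- not the real CRDB key encoding, and the index names are the default generated ones):
-- Deleting id = 1600000000 conceptually removes one entry per index:
--   order_items_pkey                (1600000000)
--   order_items_customer_id_idx     (49333, 1600000000)
--   order_items_order_id_idx        (0, 1600000000)
--   order_items_product_id_idx      (4889, 1600000000)
--   order_items_order_date_idx      ('2020-09-13 12:26:40', 1600000000)
-- Building those keys requires the column values, which the order_date index
-- alone doesn't have -- hence the index join back to the primary index.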
Let's follow the optimizer's advice and create this new "covering" index:
CREATE INDEX ON order_items (order_date) STORING (customer_id, order_id, product_id);
DROP INDEX order_items@order_items_order_date_idx;
If we run our DELETE statement again with this new index in place, we can see that the need for the index join has been eliminated.
EXPLAIN DELETE FROM order_items
WHERE order_date = 1600000000::timestamp;
info
-------------------------------------------------------------------------------------------------------------------------
distribution: local
vectorized: true
• delete
│ from: order_items
│ auto commit
│
└── • scan
estimated row count: 1 (<0.01% of the table; stats collected 3 hours ago; using stats forecast for 3 hours ago)
table: order_items@order_items_order_date_idx1
spans: [/'2020-09-13 12:26:40' - /'2020-09-13 12:26:40']
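With the covering index in place, a common way to put it to work is to delete old data in small batches so each statement stays cheap. A minimal sketch (the cutoff date and batch size are illustrative -- tune them for your workload, and repeat until the statement reports 0 rows deleted):
-- Delete old rows in chunks, driven by the covering order_date index
DELETE FROM order_items
WHERE order_date < '2021-01-01'::TIMESTAMP
LIMIT 10000;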
Warning: This is a great technique to help make deletes efficient. If you're doing a regular batch delete of old data or cleaning up records that have met some "deletable" state, this can help a lot -- especially on really big tables. However, be aware that if you later add another index on the order_items table, you'll want to remember to include that newly-indexed field in the index that drives your deletes -- otherwise, this index join will creep back into your plan (a sketch of what that adjustment might look like follows below).
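For example (a hypothetical follow-on -- the warehouse_id column doesn't exist in the walkthrough above), folding a newly-indexed field into the delete-driving index would look something like this:
-- Hypothetical: a new column gets added and indexed later
ALTER TABLE order_items ADD COLUMN warehouse_id INT;
CREATE INDEX ON order_items (warehouse_id);
-- Rebuild the covering index so the DELETE plan stays free of index joins
CREATE INDEX ON order_items (order_date)
    STORING (customer_id, order_id, product_id, warehouse_id);
DROP INDEX order_items@order_items_order_date_idx1;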
Summary
Covering indexes are a great tool in CRDB and other DBMSs for making specific queries more efficient. It can be relatively easy to identify when they should be used to help SELECT queries, but it's a little unintuitive to realize that they can also help with DELETEs. But now you know, so happy deleting!!