DEV Community

Franck Pachot for AWS Heroes

Posted on

You Probably Don't Need Serializable Isolation

In my previous post, I provided an example of a serializable transaction. I discussed how Oracle Database, PostgreSQL, and YugabyteDB handle it when the transaction's Serializable isolation level is set.
In a reply to comments, I mentioned that it is perfectly fine to run such a use case with a database that doesn't provide a serializable isolation level.

Before presenting an example, let's clarify that, even if Oracle does not fully implement the ANSI/ISO isolation level, it is not an issue as long as the developers know it. Oracle Database powers critical OLTP applications in hospitals and banks, and it has proven to be reliable for decades. These applications use the default Read Committed isolation level without anomalies because the application developers have built the proper logic with explicit locking or other techniques. PostgreSQL-compatible databases provide a serializable isolation level, but most applications use Read Committed, including some critical OLTP applications.

Let's use the same example as in the previous post. The table was:

create table demo (
 id int generated always as identity primary key
 , message varchar(80)
);
Enter fullscreen mode Exit fullscreen mode

The transaction was:

select * from demo;
insert into demo (message) values ('I am the first row');
Enter fullscreen mode Exit fullscreen mode

To ensure that no one inserts a row between your SELECT and INSERT statements, the SQL standard provides only one solution:

  • use the SERIALIZABLE isolation level.

You should remember that the SQL standard does not define explicit locking, such as SELECT FOR UPDATE or LOCK TABLE, and it also does not define CREATE INDEX. But all databases have these features.

The simplest solution to avoid anomalies with such a transaction in the absence of a serializable isolation level is implicit locking:

  • lock the table in SHARE before the SELECT.

Locking a table is a scalability issue. In this example, you must block any insertion to determine whether you are in the first row.

There is an alternative scalable solution that operates in the Read Committed mode and employs an optimistic approach:

  • protect from anomalies with integrity constraints.

An optimistic approach detects failures and raises errors instead of locking. In our transaction, if we write 'I am the first row' and there should only be one row with this message, we can ensure integrity using a UNIQUE INDEX.

Here is an example that works in PostgreSQL or YugabyteDB:

create unique index demo_only_one_first_row
 on demo( message )
 where message = 'I am the first row'
;
Enter fullscreen mode Exit fullscreen mode

This unique index ensures that only one message with 'I am the first row' can be inserted.
In case of conflict, one session will wait and fail if the other commits the same message:

yugabyte=# insert into demo (message) values ('I am the first row');
ERROR:  duplicate key value violates unique constraint "demo_only_one_first_row"
DETAIL:  Key (message)=(I am the first row) already exists.
Enter fullscreen mode Exit fullscreen mode

Thanks to the partial index, additional rows can be added without incurring any verification overhead.
This can be confirmed in YugabyteDB with the dist option of explain that shows the number of read and write requests:

yugabyte=# explain (analyze, dist, costs off, summary off)
           insert into demo (message) values ('I am the first row');

                        QUERY PLAN
----------------------------------------------------------
 Insert on demo (actual time=6.660..6.661 rows=0 loops=1)
   ->  Result (actual time=4.287..4.289 rows=1 loops=1)
         Storage Table Write Requests: 1
         Storage Index Write Requests: 1

yugabyte=# explain (analyze, dist, costs off, summary off)
           insert into demo (message) values ('I am the next row');

                        QUERY PLAN
----------------------------------------------------------
 Insert on demo (actual time=0.050..0.050 rows=0 loops=1)
   ->  Result (actual time=0.004..0.005 rows=1 loops=1)
         Storage Table Write Requests: 1

yugabyte=# explain (analyze, dist, costs off, summary off)
           insert into demo (message) values ('I am the next row');

                        QUERY PLAN
----------------------------------------------------------
 Insert on demo (actual time=0.062..0.062 rows=0 loops=1)
   ->  Result (actual time=0.005..0.006 rows=1 loops=1)
         Storage Table Write Requests: 1
Enter fullscreen mode Exit fullscreen mode

The first-row insertion resulted in two write requests, one to the table and one to the index. Subsequent row insertions only required one write request to the table.

Partial indexes are not available in the Oracle Database. However, this can be achieved using another specificity of this database: null entries are not indexed:

create unique index demo_only_one_first_row
on demo ( case message when 'I am the first row' then message end );
Enter fullscreen mode Exit fullscreen mode

This is similar to the partial index I used in YugabyteDB. It only stores an entry for the 'I am the first row' message, guaranteeing its uniqueness.

Those solutions may seem like workarounds but are valid if properly documented. The main logic is simple: use a SELECT statement before the INSERT, and the unique index is only there to protect against an unlikely race condition without any overhead in the most common scenario. It might be surprising to see an index used to make transactions serializable without table locks. However, that's how Oracle ensures referential integrity without a serializable isolation level by either locking the entire table or using an index on the foreign key.

The simple example demonstrates that you shouldn't depend solely on ANSI/ISO definitions for SQL, particularly regarding isolation levels. Every database is unique and offers enough SQL features to allow you to create efficient and reliable applications. The key is to understand the peculiarities of your specific database implementation and tailor your logic accordingly. It's essential to recognize how race conditions can arise in typical workloads and exceptional cases to ensure your solution remains scalable.

The SQL ANSI/ISO Serializable level is convenient because it prevents race conditions. However, it may not be available in all databases, and even when it is, its performance may not be optimal for your specific use case. Most SQL applications use Read Committed or Repeatable Read (essentially Snapshot Isolation at the statement or transaction level) and rely on implicit locking and integrity constraints to ensure the database's consistency.

Top comments (0)