SQL databases store the current state and enough information to read about a previous state with Multi-Version Concurrency Control. Keeping all change records would not be scalable, so we only keep enough history for the oldest ongoing transaction. There are two possibilities when long ongoing transactions are running:
- Let history grow. During VACUUM, PostgreSQL keeps old records that are more recent than the database transaction horizon, allowing bloat to persist.
- Fail long transactions after a time limit, like Oracle
undo_retention
or YugabyteDBtimestamp_history_retention_interval_sec
The second solution avoids runaway queries causing problems with other transactions. A growing MVCC history can impact performance and operations, affecting reads, memory, storage, and backups. Therefore, it's best to prevent MVCC history from growing uncontrollably. However, such an error must give enough information to understand the reason and fix the runaway query, and you should be able to understand it.
An example
I start YugabyteDB in a Docker container:
-bash-4.2# docker run --rm -it yugabytedb/yugabyte bash
[root@066127d97d21 yugabyte]#
[root@066127d97d21 yugabyte]# yugabyted start
Starting yugabyted...
✅ YugabyteDB Started
✅ UI ready
✅ Data placement constraint successfully verified
...
+---------------------------------------------------------------------------------------------------------+
| yugabyted |
+---------------------------------------------------------------------------------------------------------+
| Status : Running. |
| Replication Factor : 1 |
| YugabyteDB UI : http://172.17.0.3:15433 |
| JDBC : jdbc:postgresql://172.17.0.3:5433/yugabyte?user=yugabyte&password=yugabyte |
| YSQL : bin/ysqlsh -h 172.17.0.3 -U yugabyte -d yugabyte |
| YCQL : bin/ycqlsh 172.17.0.3 9042 -u cassandra |
| Data Dir : /root/var/data |
| Log Dir : /root/var/logs |
| Universe UUID : 640b90ac-c720-418a-bfd4-03f3eb106bab |
+---------------------------------------------------------------------------------------------------------+
...
[root@066127d97d21 yugabyte]#
[root@066127d97d21 yugabyte]# ysqlsh -h $(hostname)
ysqlsh (11.2-YB-2.21.0.0-b0)
Type "help" for help.
yugabyte=#
I create a demo table with one row:
yugabyte=# create table demo
( id bigserial primary key , value text )
;
CREATE TABLE
yugabyte=# insert into demo(value)
select 'Hello World' from generate_series(1,1)
;
INSERT 0 1
I start a transaction, read from my table, and also show some time information, in Epoch, about the transaction time and the current time:
yugabyte=# begin transaction isolation level repeatable read
;
BEGIN
yugabyte=# select * from demo
;
id | value
----+-------------
1 | Hello World
(1 row)
yugabyte=# select
current_setting('yb_effective_transaction_isolation_level')
, extract(epoch from transaction_timestamp()) as tx
, extract(epoch from clock_timestamp()) as clock
;
current_setting | tx | clock
-----------------+------------------+------------------
repeatable read | 1713905945.54437 | 1713905960.63649
(1 row)
Still in the transaction, I wait a few minutes and recheck the time. For this demo, I don't want to wait 15 minutes, which is the default, so I set it temporarily to 60 seconds (with timestamp_history_retention_interval_sec
) and run a full compaction:
yugabyte=# select
current_setting('yb_effective_transaction_isolation_level')
, extract(epoch from transaction_timestamp()) as tx
, extract(epoch from clock_timestamp()) as clock
;
current_setting | tx | clock
-----------------+------------------+------------------
repeatable read | 1713905945.54437 | 1713906114.97461
(1 row)
yugabyte=# \! yb-ts-cli --server_address=$(hostname) set_flag --force timestamp_history_retention_interval_sec 60
yugabyte=# \! yb-ts-cli --server_address=$(hostname) compact_all_tablets
Successfully compacted all tablets
yugabyte=# \! yb-ts-cli --server_address=$(hostname) set_flag --force timestamp_history_retention_interval_sec 900
I did not make any updates, but the database doesn't know. Had I made any modifications, the versions from before the compaction time minus 60 seconds would have been deleted. The database cannot guarantee a consistent read since my transaction's read time is from before that time. The snapshot required for the reading is too old.
yugabyte=# select
current_setting('yb_effective_transaction_isolation_level')
, extract(epoch from transaction_timestamp()) as tx
, extract(epoch from clock_timestamp()) as clock
;
current_setting | tx | clock
-----------------+------------------+------------------
repeatable read | 1713905945.54437 | 1713906169.19518
(1 row)
yugabyte=# select * from demo;
ERROR: Snapshot too old. Read point: { physical: 1713905951675923 }, earliest read time allowed: { physical: 1713906057639729 }, delta (usec): 105963806: kSnapshotTooOld
The Snapshot too old
message gives essential information: the read point (the start of my transaction as I'm at a Repeatable Read isolation level) and the earliest time allowed (the retention time when compaction occurred). The delta is the difference between the two and gives an idea of the minimum retention that would have been required to run this query.
Here is the timeline with the numbers above:
1713905945.54437 Transaction time
1713905951.675923 Read point (snapshot) <-----------------+ Snapshot
1713905960.63649 Clock at transaction start | Too Old
1713906057.639729 Earliest read time allowed <-----+ |
1713906114.97461 Clock before compaction | |
(timestamp_history_retention_interval_sec) ---+ |
1713906169.19518 Clock before SELECT error |
(transaction read time) --------+
When you encounter the "Snapshot Too Old" error, you should check if the duration of the statement/transaction is expected and then increase the MVCC retention. If it is the query that has a problem, you should fix it first.
yb_read_time
Note that in the latest version, you can set the read time yourself with yb_read_time
. It is similar to Oracle's flashback query. Be careful, and it should be used cautiously for specific use cases, like recovery from errors.
Here is an example with my table:
yugabyte=# select * from demo;
id | value
----+-------------
1 | Hello World
I get the current time as an Epoch (in microseconds):
yugabyte=# select (1000000*extract(epoch from now()))::bigint now;
\gset
now
------------------
1713965275094928
(1 row)
yugabyte=# \gset
I use this variable to set the read time for this flashback query
yugabyte=# set yb_read_time=:now;
NOTICE: 00000: yb_read_time should be set with caution.
DETAIL: No DDL operations should be performed while it is set and it should not be set to a timestamp before a
DDL operation has been performed. It does not have well defined semantics for normal transactions and is only to
be used after consultation
SET
A warning says it does not follow the SQL transaction semantics. That is because we cannot apply the current transaction changes to a past state.
I insert a new row and query my table:
yugabyte=# insert into demo(value)
select 'Hello Again' from generate_series(1,1)
;
INSERT 0 1
yugabyte=# select * from demo;
id | value
----+-------------
1 | Hello World
(1 row)
The new row is invisible because my read point is before the insert.
I can see my row if I revert to the standard SQL behavior where the read time is the beginning of the transaction or statement:
yugabyte=# set yb_read_time=0;
NOTICE: 00000: yb_read_time should be set with caution.
DETAIL: No DDL operations should be performed while it is set and it should not be set to a timestamp before a
DDL operation has been performed. It does not have well defined semantics for normal transactions and is only to
be used after consultation
SET
yugabyte=# select * from demo;
id | value
----+-------------
2 | Hello Again
1 | Hello World
(2 rows)
The warning also says that it should not be set to before DDL. If I go too far in time, there are no rows displayed because the table didn't exist at that time (but the query is still parsed with the current catalog):
yugabyte=# set yb_read_time=1713879482182794;
NOTICE: 00000: yb_read_time should be set with caution.
DETAIL: No DDL operations should be performed while it is set and it should not be set to a timestamp before a DDL operation has been performed. It does not have well defined semantics for normal transactions and is only to be used after consultation
SET
yugabyte=# select * from demo;
id | value
----+-------
(0 rows)
I'll let you try a date in the future. It will show the current version because the database is unaware of future changes.
To summarize
"Snapshot Too Old" error is expected for queries that run more than the MVCC retention. It defaults to 15 minutes, but can be increased with timestamp_history_retention_interval_sec
. Note that this retention value can be internally increased to support database snapshots, to allow for Point In Time Recovery to any time between two snapshots. Additionally, the read point can be set to query as-of a past point-in-time. This can be used to recover from errors, in the current database or a clone.
Top comments (0)