TECH SCHOOL

Posted on Sep 2, 2020 • Edited on Jun 26, 2021

How to avoid deadlock in DB transaction? Queries order matter!

#go #database #tutorial #webdev

One of the hardest thing when working with database transaction is locking and handling deadlock.

From my experience, the best way to deal with deadlock is to avoid it. By that I mean we should fine-tune our queries in the transaction so that deadlock won’t have a chance to occur, or at least minimize its chance of occurrence.

And that’s exactly what I’m gonna show you in this article.

Here's:

Link to the full series playlist on Youtube
And its Github repository

Alright, let’s dive in!

A potential deadlock scenario

This is the money transfer transaction code that we’ve implemented in the previous lecture.



func (store *Store) TransferTx(ctx context.Context, arg TransferTxParams) (TransferTxResult, error) {
    var result TransferTxResult

    err := store.execTx(ctx, func(q *Queries) error {
        var err error

        result.Transfer, err = q.CreateTransfer(ctx, CreateTransferParams{
            FromAccountID: arg.FromAccountID,
            ToAccountID:   arg.ToAccountID,
            Amount:        arg.Amount,
        })
        if err != nil {
            return err
        }

        result.FromEntry, err = q.CreateEntry(ctx, CreateEntryParams{
            AccountID: arg.FromAccountID,
            Amount:    -arg.Amount,
        })
        if err != nil {
            return err
        }

        result.ToEntry, err = q.CreateEntry(ctx, CreateEntryParams{
            AccountID: arg.ToAccountID,
            Amount:    arg.Amount,
        })
        if err != nil {
            return err
        }

        result.FromAccount, err = q.AddAccountBalance(ctx, AddAccountBalanceParams{
            ID:     arg.FromAccountID,
            Amount: -arg.Amount,
        })
        if err != nil {
            return err
        }

        result.ToAccount, err = q.AddAccountBalance(ctx, AddAccountBalanceParams{
            ID:     arg.ToAccountID,
            Amount: arg.Amount,
        })
        if err != nil {
            return err
        }

        return nil
    })

    return result, err
}

Basically we’ve fixed the deadlock issue caused by the foreign key constraints. However, if we look at the code carefully, we can see a potential deadlock scenario.

In this transaction, we’re updating the balance of the fromAccount and the toAccount. And we know that they both require an exclusive lock to perform the operation. So if there are 2 concurrent transactions involving the same pair of accounts, there might be a potential deadlock.

But we already have a test that runs 5 concurrent transfer transactions with the same pair of accounts, but deadlock doesn’t occur, right?



func TestTransferTx(t *testing.T) {
    store := NewStore(testDB)

    account1 := createRandomAccount(t)
    account2 := createRandomAccount(t)
    fmt.Println(">> before:", account1.Balance, account2.Balance)

    n := 5
    amount := int64(10)

    errs := make(chan error)
    results := make(chan TransferTxResult)

    // run n concurrent transfer transaction
    for i := 0; i < n; i++ {
        go func() {
            result, err := store.TransferTx(context.Background(), TransferTxParams{
                FromAccountID: account1.ID,
                ToAccountID:   account2.ID,
                Amount:        amount,
            })

            errs <- err
            results <- result
        }()
    }

    // check results
    existed := make(map[int]bool)

    for i := 0; i < n; i++ {
        err := <-errs
        require.NoError(t, err)

        result := <-results
        require.NotEmpty(t, result)

        // check transfer
        transfer := result.Transfer
        require.NotEmpty(t, transfer)
        require.Equal(t, account1.ID, transfer.FromAccountID)
        require.Equal(t, account2.ID, transfer.ToAccountID)
        require.Equal(t, amount, transfer.Amount)
        require.NotZero(t, transfer.ID)
        require.NotZero(t, transfer.CreatedAt)

        _, err = store.GetTransfer(context.Background(), transfer.ID)
        require.NoError(t, err)

        // check entries
        fromEntry := result.FromEntry
        require.NotEmpty(t, fromEntry)
        require.Equal(t, account1.ID, fromEntry.AccountID)
        require.Equal(t, -amount, fromEntry.Amount)
        require.NotZero(t, fromEntry.ID)
        require.NotZero(t, fromEntry.CreatedAt)

        _, err = store.GetEntry(context.Background(), fromEntry.ID)
        require.NoError(t, err)

        toEntry := result.ToEntry
        require.NotEmpty(t, toEntry)
        require.Equal(t, account2.ID, toEntry.AccountID)
        require.Equal(t, amount, toEntry.Amount)
        require.NotZero(t, toEntry.ID)
        require.NotZero(t, toEntry.CreatedAt)

        _, err = store.GetEntry(context.Background(), toEntry.ID)
        require.NoError(t, err)

        // check accounts
        fromAccount := result.FromAccount
        require.NotEmpty(t, fromAccount)
        require.Equal(t, account1.ID, fromAccount.ID)

        toAccount := result.ToAccount
        require.NotEmpty(t, toAccount)
        require.Equal(t, account2.ID, toAccount.ID)

        // check balances
        fmt.Println(">> tx:", fromAccount.Balance, toAccount.Balance)

        diff1 := account1.Balance - fromAccount.Balance
        diff2 := toAccount.Balance - account2.Balance
        require.Equal(t, diff1, diff2)
        require.True(t, diff1 > 0)
        require.True(t, diff1%amount == 0) // 1 * amount, 2 * amount, 3 * amount, ..., n * amount

        k := int(diff1 / amount)
        require.True(t, k >= 1 && k <= n)
        require.NotContains(t, existed, k)
        existed[k] = true
    }

    // check the final updated balance
    updatedAccount1, err := store.GetAccount(context.Background(), account1.ID)
    require.NoError(t, err)

    updatedAccount2, err := store.GetAccount(context.Background(), account2.ID)
    require.NoError(t, err)

    fmt.Println(">> after:", updatedAccount1.Balance, updatedAccount2.Balance)

    require.Equal(t, account1.Balance-int64(n)*amount, updatedAccount1.Balance)
    require.Equal(t, account2.Balance+int64(n)*amount, updatedAccount2.Balance)
}

That’s correct! However, the transactions in our existing test all do the same thing: transfer money from account 1 to account 2. What if some of them transfer money from account 2 to account 1?

To illustrate how deadlock might occur in this scenario, I have prepared 2 transactions’ in TablePlus:



-- Tx1: transfer $10 from account 1 to account 2
BEGIN;

UPDATE accounts SET balance = balance - 10 WHERE id = 1 RETURNING *;
UPDATE accounts SET balance = balance + 10 WHERE id = 2 RETURNING *;

ROLLBACK;


-- Tx2: transfer $10 from account 2 to account 1
BEGIN;

UPDATE accounts SET balance = balance - 10 WHERE id = 2 RETURNING *;
UPDATE accounts SET balance = balance + 10 WHERE id = 1 RETURNING *;

ROLLBACK;

The 1st transaction will transfer 10 dollars from account 1 to account 2, by first subtracting 10 from the the balance of account 1, and then adding 10 to the balance of account 2.

The 2nd transaction will do the reverse work: transfer 10 dollars from account 2 to account 1. First it subtracts 10 from the balance of account 2. Then it adds 10 to the balance of account 1.

Now let’s open the terminal to run these transactions in 2 parallel psql console.

First, I will start the first psql console and BEGIN the 1st transaction. Then I’m gonna run its 1st query to subtract 10 from account 1’s balance.

The account is updated instantly. Now let’s open another tab, start a new psql console, and BEGIN the 2nd transaction. Then let’s run its 1st query to subtract 10 from account 2’s balance.

This query also returns immediately. Now back to the 1st transaction and run its 2nd query to update account 2’s balance.

This time, the query is blocked because the 2nd transaction is also updating the same account 2.

If we go back to TablePlus and run this query to list all the locks:

We can see that this update account 2 query of transaction 1 is trying to acquire a ShareLock on transaction ID 911, but it is not granted yet

That's because transaction 2 is already holding an ExclusiveLock on the same transaction ID. Therefore, transaction 1 must wait for transaction 2 to finish before continue.

Now if we continue running the 2nd query of transaction 2 to update account 1’s balance:

We will get a deadlock because this account 1 is being updated by transaction 1, thus transaction 2 also needs to wait for transaction 1 to finish before getting the result of this query. Deadlock occurs because these 2 concurrent transactions both need to wait for the other.

OK now let’s rollback these 2 transactions then go back to our simple bank project to replicate this scenario in a test.

Replicate deadlock scenario in a test

It’s gonna be very similar to the test that we’ve written in the last lecture, so I will just duplicate that TestTransferTx function, and change its name to TestTransferTxDeadlock.

Here, let’s say we’re gonna run n = 10 concurrent transactions. The idea is to have 5 transactions that send money from account 1 to account 2, and another 5 transactions that send money in reverse direction, from account 2 to account 1.



func TestTransferTxDeadlock(t *testing.T) {
    store := NewStore(testDB)

    account1 := createRandomAccount(t)
    account2 := createRandomAccount(t)
    fmt.Println(">> before:", account1.Balance, account2.Balance)

    n := 10
    amount := int64(10)
    errs := make(chan error)

    ...
}

In this scenario, we only need to check for deadlock error, we don’t need to care about the result because it has already been checked in the other test. So I have removed the results channel, and just keep the errs channel.

Now inside the for loop, let’s define 2 new variables: fromAccountID will be account1.ID, and toAccountID will be account2.ID.

But since we want half of the transaction to send money from account 2 to account 1, I will check if the counter i is an odd number (i % 2 = 1), then fromAccountID should be account2.ID and toAccountID should be account1.ID instead.



func TestTransferTxDeadlock(t *testing.T) {
    ...

    for i := 0; i < n; i++ {
        fromAccountID := account1.ID
        toAccountID := account2.ID

        if i%2 == 1 {
            fromAccountID = account2.ID
            toAccountID = account1.ID
        }

        go func() {
            _, err := store.TransferTx(context.Background(), TransferTxParams{
                FromAccountID: fromAccountID,
                ToAccountID:   toAccountID,
                Amount:        amount,
            })

            errs <- err
        }()
    }
}

Now inside the go routine, we should set the fields of TransferTxParams to fromAccountID and toAccountID. Then remove the results <- result statement because we don’t care about the result anymore.

OK, now the check errors part. Let’s delete the existed map, and everything inside the for loop, except the error checking statements.



func TestTransferTxDeadlock(t *testing.T) {
    ...

    for i := 0; i < n; i++ {
        err := <-errs
        require.NoError(t, err)
    }

    ...
}

We also want to check the final updated balance of the 2 accounts. In this case, there are 10 transactions that move the same amount of money between account 1 and account 2. But because of this if i%2 == 1 condition, 5 of them will move money from account 1 to account 2, and the other 5 will move money from account 2 back to account 1.

Therefore, we expect that in the end, the balance of both account 1 and account 2 should be the same as they were before the transactions.



func TestTransferTxDeadlock(t *testing.T) {
    ...

    // check the final updated balance
    updatedAccount1, err := store.GetAccount(context.Background(), account1.ID)
    require.NoError(t, err)

    updatedAccount2, err := store.GetAccount(context.Background(), account2.ID)
    require.NoError(t, err)

    fmt.Println(">> after:", updatedAccount1.Balance, updatedAccount2.Balance)
    require.Equal(t, account1.Balance, updatedAccount1.Balance)
    require.Equal(t, account2.Balance, updatedAccount2.Balance)
}

So here, updatedAccount1.Balance should equal to account1.Balance, and updatedAccount2.Balance should equal to account2.Balance.

OK let’s run this test!

We’ve got a deadlock error as expected. Let’s learn how to fix it!

Fix the deadlock issue

As you’ve already seen in the example that we ran in psql console, the reason deadlock occurs is because of the different order in which 2 concurrent transactions update the accounts’ balance, where transaction 1 update account 1 before account 2, while the other transaction 2 update account 2 before account 1.

So this gives us an idea of how deadlock can be avoided by making both transactions update the accounts balance in the same order. Let’s say in this transaction 2, we just move the update account 1 query up, and keep everything else the same.



-- Tx1: transfer $10 from account 1 to account 2
BEGIN;

UPDATE accounts SET balance = balance - 10 WHERE id = 1 RETURNING *;
UPDATE accounts SET balance = balance + 10 WHERE id = 2 RETURNING *;

ROLLBACK;


-- Tx2: transfer $10 from account 2 to account 1
BEGIN;

UPDATE accounts SET balance = balance + 10 WHERE id = 1 RETURNING *; -- moved up
UPDATE accounts SET balance = balance - 10 WHERE id = 2 RETURNING *;

ROLLBACK;

So now both transaction 1 and transaction 2 will always update account 1 before account 2. Let’s try to run them in the psql console to see what will happen!

First, begin transaction 1 and run its 1st query to update account 1.

Then switch to the other console and begin transaction 2. Also run its 1st query to update account 1.

Now unlike before, this time the query is blocked right away, because transaction 1 is already holding an exclusive lock to update the same account 1. So let’s go back to transaction 1 and run its 2nd query to update account 2.

The result is returned immediately, and transaction 2 is still blocked. So we just COMMIT this transaction 1 to release the lock. Then go to transaction 2.

We can see that it is unblocked instantly, and the balance has been updated to the new value.

We can run the 2nd query to update account 2, then COMMIT transaction 2 successfully with no deadlocks.

Alright, so now we understand that the best defense against deadlocks is to avoid them by making sure that our application always acquire locks in a consistent order.

For example, in our case, we can easily change our code so that it always updates the account with smaller ID first.

Here we check if arg.FromAccountID is less than arg.ToAccountID then the fromAccount should be updated before the toAccount. Else, the toAccount should be updated before the fromAccount.



func (store *Store) TransferTx(ctx context.Context, arg TransferTxParams) (TransferTxResult, error) {
    var result TransferTxResult

    err := store.execTx(ctx, func(q *Queries) error {
        ...

        if arg.FromAccountID < arg.ToAccountID {
            result.FromAccount, err = q.AddAccountBalance(ctx, AddAccountBalanceParams{
                ID:     arg.FromAccountID,
                Amount: -arg.Amount,
            })
            if err != nil {
                return err
            }

            result.ToAccount, err = q.AddAccountBalance(ctx, AddAccountBalanceParams{
                ID:     arg.ToAccountID,
                Amount: arg.Amount,
            })
            if err != nil {
                return err
            }
        } else {
            result.ToAccount, err = q.AddAccountBalance(ctx, AddAccountBalanceParams{
                ID:     arg.ToAccountID,
                Amount: arg.Amount,
            })
            if err != nil {
                return err
            }

            result.FromAccount, err = q.AddAccountBalance(ctx, AddAccountBalanceParams{
                ID:     arg.FromAccountID,
                Amount: -arg.Amount,
            })
            if err != nil {
                return err
            }
        }

        return nil
    })

    return result, err
}

OK, now after this change, we expect that the deadlock should be gone. Let’s rerun our test!

It passed! In the logs, we can see the balances are the same before and after the transactions. Perfect!

Refactor the code

Before we finish, let’s refactor the code a bit, because now it looks quite long and somewhat duplicated. To do this, I’m gonna define a new addMoney() function to add money to 2 accounts.

It will takes several inputs: the context, the queries object, the ID of the 1st account, the amount of money that should be added to that 1st account, the ID of the 2nd account, and the amount of money that should be added to that 2nd account.

The function will return 3 values: the 1st account object, the 2nd account object after updated, and a potential error.



func addMoney(
    ctx context.Context,
    q *Queries,
    accountID1 int64,
    amount1 int64,
    accountID2 int64,
    amount2 int64,
) (account1 Account, account2 Account, err error) {
    account1, err = q.AddAccountBalance(ctx, AddAccountBalanceParams{
        ID:     accountID1,
        Amount: amount1,
    })
    if err != nil {
        return
    }

    account2, err = q.AddAccountBalance(ctx, AddAccountBalanceParams{
        ID:     accountID2,
        Amount: amount2,
    })
    return
}

Inside this function, we first call q.AddAcountBalance() to add amount1 to account1’s balance. So the ID should be accountID1, and Amount should be amount1. We save the results to the output account1 and err variables.

Then we check if err is not nil, simply return. Here because we’re using named return variables, so this return with no parameters is basically the same as if we write return account1, account2, err. This is a cool syntax feature of Go that makes the code more concise.

We do similar thing to add amount2 to account2. Now with this addMoney function in hand, we can refactor our transfer transaction:



func (store *Store) TransferTx(ctx context.Context, arg TransferTxParams) (TransferTxResult, error) {
    var result TransferTxResult

    err := store.execTx(ctx, func(q *Queries) error {
        ...

        if arg.FromAccountID < arg.ToAccountID {
            result.FromAccount, result.ToAccount, err = addMoney(ctx, q, arg.FromAccountID, -arg.Amount, arg.ToAccountID, arg.Amount)
        } else {
            result.ToAccount, result.FromAccount, err = addMoney(ctx, q, arg.ToAccountID, arg.Amount, arg.FromAccountID, -arg.Amount)
        }

        return err
    })

    return result, err
}

If fromAccountID is less than toAccountID, we want to update fromAccount before toAccount. So here, we call addMoney(), pass in the context ctx, query q, arg.FromAccountID, -arg.Amount because money is moving out, then arg.ToAccountID, and finally arg.Amount because money is moving in.

The output of this function call should be assigned to result.FromAccount, result.ToAccount, and err.

Else, in case toAccountID is smaller, we want to make sure that toAccount is updated before fromAccount. So we just duplicate the previous command, but change it a bit to reverse the accounts order.

And that’s it! The refactoring is done. Let’s rerun the TestTransferTxDeadlock!

It passed! Excellent! Let’s also run the normal TestTransferTx:

Also passed! And finally rerun the whole package test:

All passed!

So everything is now working properly. Deadlock is no longer a threat to our application.

And that’s the end of today’s lecture. I hope it’s useful for you.

Thanks a lot for reading! Happy coding and see you in the next one!

If you like the article, please subscribe to our Youtube channel and follow us on Twitter for more tutorials in the future.

If you want to join me on my current amazing team at Voodoo, check out our job openings here. Remote or onsite in Paris/Amsterdam/London/Berlin/Barcelona with visa sponsorship.