Django's transaction.atomic()

It may not be as atomic as you think: it wasn't for me

Saturday October 10th, 2020

I have a confession: I assumed things about Django's transaction.atomic() that are not true, at least not true by default in PostgreSQL. I assumed that in a transaction.atomic() context as below, database statements are protected from any race conditions, and everything will Just Work™.

with transaction.atomic():
    # Database statements

But that's really not true.

Enter the world of transaction isolation levels: None with autocommit, Read committed, Repeatable read, Serializable, and "do it yourself".

Django ⬄ SQL mapping

Firstly, this post discusses transactions mostly in terms of SQL. As such, it's good to have a reasonable understanding of what Django does in terms of SQL queries.

Django	SQL
Fetch from a QuerySet	SELECT
Existing model `save`, or QuerySet `update`	UPDATE
Enter an `atomic` context	START TRANSACTION
Exit an `atomic` context without exception	COMMIT
Exit an `atomic` context with an exception	ROLLBACK

No explicit transaction, with autocommit

For completeness, without transaction.atomic(), by default Django runs in autocommit mode, which means every statement runs in its own Read committed transaction and committed immediately. While tempting to translate "auto" as "don't worry", even in autocommit mode UPDATEs can block , deadlock , or miss rows .

Read committed

The default isolation level of a PostgreSQL transaction, and so also of transaction.atomic(), is Read committed. Such a transaction has 4 important properties.

Each of its SELECTs behaves as though a snapshot of the database is taken at the start of the SELECT, containing only committed data from other transactions .
Each of its UPDATEs can block if concurrent transactions have UPDATEd the same rows, and block until those transactions COMMIT or ROLLBACK .
This blocking can lead to deadlock if concurrent UPDATEs on the same rows are done in different orders .
Its COMMIT won't fail due to concurrent transactions COMMITting the same rows .

This leaves a lot of room for race conditions, which can be roughly split into two classes.

Between multiple SELECTs in a Read committed transaction, other concurrent transactions can COMMIT, and so their changes become visible . Consider a Sock model, where each sock is either white or black.
```
with transaction.atomic():
    black_socks = list(Sock.objects.filter(colour='black'))  # A SELECT query
    white_socks = list(Sock.objects.filter(colour='white'))  # A SELECT query
```
You might be tempted to assume that due to the transaction.atomic(), each sock will be in exactly one of black_socks or white_socks. However, since each SELECT uses a different snapshot of database state, a given sock can be in both, or neither, of these lists.
UPDATEs with WHERE clauses, have more complex semantics, effectively using mutiple snapshots over their lifetimes. Similar to SELECT, they effectively take a snapshot at start to initially choose the rows. However, they then block until the COMMIT of concurrent transactions that are updating those rows; and once unblocked, take snapshots again to re-evaluate the WHERE clause; and, if it passes for any given row, performs the update on it. This re-evaluation on new snapshots means that such statements may "miss" rows .

What are typically safe from this class of race-conditions are UPDATEs where the WHERE clause is based on columns that do not change, such as primary keys. From experience, this is the vast majority of UPDATEs.

Repeatable read

A Repeatable read transaction

with transaction.atomic():
    cursor = connection.cursor()
    cursor.execute('SET TRANSACTION ISOLATION LEVEL REPEATABLE READ')
    # ...

has 4 important properties.

It behaves as though a snapshot of the database is taken at the first non transaction-control statement [e.g. a SELECT, UPDATE], which is used throughout the entire transaction, and data it sees is not affected by other transactions COMMITting modifications .
Each of its UPDATEs can block if concurrent transactions UPDATE the same rows, until those transactions COMMIT or ROLLBACK . This is the same as Read committed.
This blocking can lead to deadlock if UPDATEs are not ordered . This is the same as Read committed.
It will fail on COMMIT if a concurrent transaction committed updates to the same rows .

While attractive, this isn't enough to avoid race conditions. Multiple concurrent transactions can use the same snapshot , violate assumptions of your code, and cause bugs.

Serializable

Serializable is the strongest isolation level. It's similar to Repeatable read in that a snapshot is taken at the beginning of the first non transaction-control statement , and it will fail if concurrent transactions attempt to commit updates to the same rows .

However, it will fail in more situations than Repeatable read. Specifically, if at COMMIT the database cannot determine that the transaction could have been performed serially with respect to the read/writes of other transactions, then it will fail with a django.db.DatabaseError. This can happen even if they updated different rows .

PostgreSQL is somewhat overcautious doing this. For example, the following faily simple example can fail if performed concurrently .

with transaction.atomic():
    cursor = connection.cursor()
    cursor.execute('SET TRANSACTION ISOLATION LEVEL SERIALIZABLE')
    Sock.objects.all().count()
    Sock.objects.get(id=request.POST['sock__id']).save()

Do it yourself: select_for_update

You can "add to" isolation inside transactions using select_for_update. This blocks until the current transaction can obtain the same locks as an UPDATE would on the matching rows, which are then kept until the end of the transaction .

A use for this is in a Read committed transaction to enforce serializability, without the risk of commits failing as they could in a true Serializable transaction, but at the cost of the time of blocking, and the risk of deadlock .

At the time of writing this appears to be the only technique that is a first-class Django citizen: no need to write explit SQL.

Summary

There is no magic or one-size-fits-all approach to database transactions, and you can't trust Django to always do the right thing, even with a transaction.atomic() or select_for_update.

Thank you to a colleague of mine who basically told me what I thought I knew was wrong. Thanks also to the PostgreSQL isolation levels documentation and to the PostgreSQL Serializable Snapshot Isolation (SSI) documentation on which some of the above examples are based. These pages are good for further reading for more details on INSERT, DELETE, and other forms of SELECT which are ommitted in the above for brevity.