An introduction to database generated columns, using SQLite and the new GeneratedField added in Django 5.0.
⚡ TL;DR
GeneratedField
is a new Django 5.0 feature and has these characteristics:
- its value is calculated entirely by the database
- it works in all supported backend databases
Here is a model example with a GeneratedField
to use with SQLite as the database backend:
from django.db import models
from django.db.models import F
class Rectangle(models.Model):
base = models.FloatField()
height = models.FloatField()
area = models.GeneratedField(
expression=F("base") * F("height"),
output_field=models.FloatField(),
db_persist=True,
)
🚀 Introduction
I've been thinking for a long time that database generated columns would be a killer feature for Django and now they will finally be available in Django 5.0 with the new GeneratedField
.
Given my great interest in this functionality, I proposed it to the Django developer mailing list, I tried to help in the revision phase of the Pull Request on the Django code, but above all I tried to test them as much as possible after the merge, trying to highlight any bugs before the release of Django 5.0 as a stable version.
In this article, I want to share with you the work I have done to allow you to experiment with GeneratedField
using common model field types using SQLite as a database backend.
🤖 Generated Field
The GeneratedField
are full-fledged fields that can be used in queries, displayed and even indexed but their values cannot be set or modified because they are automatically calculated by the database itself whenever the other fields of the same row are modified.
For this reason, they are very useful for having immediately available values calculated starting from the other fields of the same model as long as these are in the same database table.
The definition of the GeneratedField
class requires specifying the expression
, output_field
, and db_persist
attributes.
class GeneratedField(
expression, output_field, db_persist=None, **kwargs
):
...
🕶️ Attributes
db_persist
In this article, we're going to use SQLite as a database backend with persisted columns, specified with db_persist=True
, in that the database column will occupy storage as if it were a real column.
output_field
We have to specify an explicit model field instance to define the field’s data type and its attributes.
expression
The expression
attributes is the one we're going to play more with, using various example of Expression
that the database will use to automatically set the field value each time the model is changed.
↔️ Migrations
The SQL code generated from the migrations produced in our examples will all have the GENERATED ALWAYS SQL (...) STORED
syntax, varying only in the internal logic of the expression.
💻 Set Up
The requirements to experiment with Django GeneratedField
are:
🐍 Python
A stable and supported version of Python 3 (tested with Python 3.11-3.12):
$ python3 --version
Python 3.12.0
⚗️ Virtual environment
A Python virtual environment:
$ python3 -m venv ~/.generatedfields
$ source ~/.generatedfields/bin/activate
🦄 Django
The latest version of Django (tested with Django 5.0):
$ python3 -m pip install django==5.0
🗃️ The generatedfields
project
To create the generatedfields
project I'll switch to my projects
directory:
$ cd ~/projects
and then use the startproject
Django command:
$ python3 -m django startproject generatedfields
The basic files of our project will be created in the generatedfields
directory:
$ tree --noreport generatedfields/
generatedfields/
├── manage.py
└── generatedfields
├── asgi.py
├── __init__.py
├── settings.py
├── urls.py
└── wsgi.py
🗂️ The samples
app
After switching to the generatedfields
directory:
$ cd generatedfields
We can create our samples
app with the Django startapp
command:
$ python3 -m django startapp samples
Again, all the necessary files will be created for us in the samples
directory:
$ tree --noreport samples/
samples/
├── admin.py
├── apps.py
├── __init__.py
├── migrations
│ └── __init__.py
├── models.py
├── tests.py
└── views.py
✅ Activating the samples
app
Now, we have to activate our samples
application by inserting its name in the list of the INSTALLED_APPS
in the generatedfields
settings file.
generatedfields/generatedfields/settings.py
INSTALLED_APPS = [
"django.contrib.admin",
"django.contrib.auth",
"django.contrib.contenttypes",
"django.contrib.sessions",
"django.contrib.messages",
"django.contrib.staticfiles",
"samples",
]
📚 Sample models
The example models I have experimented with have expressions with complexity that progressively increase from example to example.
We start by combining two fields of the model to create the value for the generated one and then using database functions, combining them, using them with conditions and even generating multiple fields from the same starting field.
These experiments have allowed us to identify some bugs and fix them during this Django 5.0 testing period, but I leave off some edge cases to take into account which we'll mention at the end of the article.
🔧 Useful commands
In the following examples, we will use some commands that we show here to avoid repetition.
makemigrations
Generates a new database migration to apply it to our database:
$ python3 -m manage makemigrations --no-header
sqlmigrate
See the SQL code that our migration will apply to the database:
$ python3 -m manage sqlmigrate samples <migration-id>
migrate
Apply migration to our database with:
$ python3 -m manage migrate
shell
Start an interactive Django shell:
$ python3 -m manage shell
✖️ A calculated product field
The first generated field experiment I thought of is the one in which to use the model fields by combining them, for example, the Rectangle
model, in which we have two float fields, base
and height
, and a generated field, the area
, which is automatically calculated by the database by multiplying the other two fields together.
generatedfields/samples/models.py
from django.db import models
from django.db.models import F
class Rectangle(models.Model):
base = models.FloatField()
height = models.FloatField()
area = models.GeneratedField(
expression=F("base") * F("height"),
output_field=models.FloatField(),
db_persist=True,
)
def __str__(self):
return (
f"{self.base}×{self.height}="
f"{self.area}"
)
After running makemigrations
, use sqlmigrate
and see the SQL code that will be applied to the database:
BEGIN;
--
-- Create model Rectangle
--
CREATE TABLE "samples_rectangle" (
"id" integer NOT NULL PRIMARY KEY AUTOINCREMENT,
"base" real NOT NULL,
"height" real NOT NULL,
"area" real GENERATED ALWAYS AS (
("base" * "height")
) STORED
);
COMMIT;
In the generated SQL code you see the notation for the generated columns GENERATED ALWAYS AS
, the specific formula "base" * "height"
and the notation specifying that the column is of type STORED
.
After running migrate
, open a shell
and check the model behaves as intended:
>>> from samples.models import Rectangle
>>> Rectangle.objects.create(base=6, height=7)
<Rectangle: 6×7=42.0>
✳️ A calculated exponentiation field
The second example model is the Square
, from its single side
field, the value of the area
generated field is calculated, using the Power
database function, made available by Django's ORM.
generatedfields/samples/models.py
from django.db import models
from django.db.models.functions import (
Power,
)
class Square(models.Model):
side = models.FloatField()
area = models.GeneratedField(
expression=Power("side", 2),
output_field=models.FloatField(),
db_persist=True,
)
def __str__(self):
return f"{self.side}²={self.area}"
After running makemigrations
, use sqlmigrate
and see the SQL code that will be applied to the database:
BEGIN;
--
-- Create model Square
--
CREATE TABLE "samples_square" (
"id" integer NOT NULL PRIMARY KEY AUTOINCREMENT,
"side" real NOT NULL,
"area" real GENERATED ALWAYS AS (
POWER("side", 2)
) STORED
);
COMMIT;
The SQL code generated for the SQLite database uses the built-in POWER
mathematical function to calculate the value of the calculated column.
After running migrate
, open a shell
and check the model behaves as intended:
>>> from samples.models import Square
>>> Square.objects.create(side=3)
<Square: 3²=9.0>
⭕ A calculated geometric formula field
The Circle
model is very similar to that of the Square
but the expression for calculating the value of the area
generated field uses the database function Pi
, and the result of multiplication with the latter by the value of the radius
field, is rounded, with the other database function Power
.
generatedfields/samples/models.py
from django.db import models
from django.db.models.functions import (
Pi, Power, Round
)
class Circle(models.Model):
radius = models.FloatField()
area = models.GeneratedField(
expression=Round(
Power("radius", 2) * Pi(),
precision=2,
),
output_field=models.FloatField(),
db_persist=True,
)
def __str__(self):
return (
f"{self.radius}²×π="
f"{self.area}"
)
After running makemigrations
, use sqlmigrate
and see the SQL code that will be applied to the database:
BEGIN;
--
-- Create model Circle
--
CREATE TABLE "samples_circle" (
"id" integer NOT NULL PRIMARY KEY AUTOINCREMENT,
"radius" real NOT NULL,
"area" real GENERATED ALWAYS AS (
ROUND((POWER("radius", 2) * PI()), 2)
) STORED
);
COMMIT;
The SQL code of the calculated column is more complex than that of the previous example, in line with the increased complexity of the model expression but, the functions used have a one-to-one correspondence with those made available by the Django ORM.
After running migrate
, open a shell
and check the model behaves as intended:
>>> from samples.models import Circle
>>> Circle.objects.create(radius=3.1415)
<Circle: 3.1415²×π=31.0>
📐 A calculated trigonometric functions field
The RightTriangle
model is the last geometric example and in this one a fairly complex expression is used to calculate the generated field area
, starting from the hypotenuse
and angle
fields, using the trigonometric functions Sin
and Cos
made available by the Django ORM in addition to those used in the previous examples.
generatedfields/samples/models.py
from django.db import models
from django.db.models.functions import (
Cos, Power, Radians, Round, Sin
)
class RightTriangle(models.Model):
hypotenuse = models.FloatField()
angle = models.FloatField()
area = models.GeneratedField(
expression=Round(
(
Power("hypotenuse", 2)
* Sin(Radians("angle"))
* Cos(Radians("angle"))
)
/ 2,
precision=2,
),
output_field=models.FloatField(),
db_persist=True,
)
def __str__(self):
return (
f"{self.hypotenuse}²×"
f"sin({self.angle}°)×"
f"cos({self.angle}°)÷"
f"2={self.area}"
)
After running makemigrations
, use sqlmigrate
and see the SQL code that will be applied to the database:
BEGIN;
--
-- Create model RightTriangle
--
CREATE TABLE "samples_righttriangle" (
"id" integer NOT NULL PRIMARY KEY AUTOINCREMENT,
"hypotenuse" real NOT NULL,
"angle" real NOT NULL,
"area" real GENERATED ALWAYS AS (
ROUND(
(
(
(
POWER("hypotenuse", 2)
* SIN(RADIANS("angle"))
) * COS(RADIANS("angle"))
) / 2
),
2
)
) STORED
);
COMMIT;
Despite the increased complexity of the expression, the SQL code of the database generated column expression, for SQLite, is clear, demonstrating the power of SQL for these types of queries.
After running migrate
, open a shell
and check the model behaves as intended:
>>> from samples.models import RightTriangle
>>> RightTriangle.objects.create(angle=45, hypotenuse=5)
<RightTriangle: 5²×sin(45°)×cos(45°)÷2=6.25>
🧾 A calculated total price field
In the Item
model the expression of the generated field total_price
is a multiplication of the two fields price
and quantity
to which the Round
function is applied. However, the integer quantity
field has a default value of 1
, but this default value is declared via the new db_default
parameter, which sets a database-computed default value, also introduced in Django 5.0.
generatedfields/samples/models.py
from django.db import models
from django.db.models import F, Value
class Item(models.Model):
price = models.DecimalField(
max_digits=6, decimal_places=2
)
quantity = models.PositiveSmallIntegerField(
db_default=Value(1)
)
total_price = models.GeneratedField(
expression=F("price") * F("quantity"),
output_field=models.DecimalField(
max_digits=11, decimal_places=2
),
db_persist=True,
)
def __str__(self):
return (
f"{self.price}×"
f"{self.quantity}="
f"{self.total_price}"
)
After running makemigrations
, use sqlmigrate
and see the SQL code that will be applied to the database:
BEGIN;
--
-- Create model Item
--
CREATE TABLE "samples_item" (
"id" integer NOT NULL PRIMARY KEY AUTOINCREMENT,
"price" decimal NOT NULL,
"quantity" smallint unsigned DEFAULT 1 NOT NULL CHECK (
"quantity" >= 0
),
"total_price" decimal GENERATED ALWAYS AS (
CAST(("price" * "quantity") AS NUMERIC)
) STORED
);
COMMIT;
The SQL code generated for SQLite contains an expression similar to a previous example, but the presence of the DEFAULT
in the quantity
column means that the calculation logic of the generated column total_price
is entirely contained in the database.
After running migrate
, open a shell
and check the model behaves as intended:
>>> from samples.models import Item
>>> Item.objects.create(price=9.99)
<Item: 9.99×1=9.99>
>>> Item.objects.create(price=4.99, quantity=2)
<Item: 4.99×2=9.98>
🚦 A calculated status field
In the Order
model we see an example of using a conditional expression in the generated status
field. The presence or absence of the creation
and payment
values permits the selection of the correct status
value directly in the SQLite database.
generatedfields/samples/models.py
from django.db import models
from django.db.models import (
Case, Value, When
)
class Order(models.Model):
creation = models.DateTimeField()
payment = models.DateTimeField(null=True)
status = models.GeneratedField(
expression=Case(
When(
payment__isnull=False,
then=Value("paid"),
),
default=Value("created"),
),
output_field=models.TextField(),
db_persist=True,
)
def __str__(self):
return (
f"[{self.status}] "
f"{self.payment or self.creation}"
)
After running makemigrations
, use sqlmigrate
and see the SQL code that will be applied to the database:
BEGIN;
--
-- Create model Order
--
CREATE TABLE "samples_order" (
"id" integer NOT NULL PRIMARY KEY AUTOINCREMENT,
"creation" datetime NOT NULL,
"payment" datetime NULL,
"status" text GENERATED ALWAYS AS (
CASE
WHEN "payment" IS NOT NULL THEN 'paid'
ELSE 'created'
END
) STORED
);
COMMIT;
The SQL generated by SQLite uses the CASE
expression to populate the value of the generated column status
based on whether or not there is a datetime
value in the payment
column.
After running migrate
, open a shell
and check the model behaves as intended:
>>> from samples.models import Order
>>> Order.objects.create(creation="2023-01-01 12:00Z")
<Order: [created] 2023-01-01 12:00Z>
>>> Order.objects.create(
... creation="2023-01-02 00:00Z",
... payment="2023-01-03 06:30Z"
... )
<Order: [paid] 2023-01-02 6:30Z>
🗓️ Calculated date and duration fields
In the Event
model there are three fields generated compared to only two basic DateTimeField
, the start
field is required and the end
field is optional. The generated fields start_date
and end_date
have an expression that truncates the date of the corresponding fields. In the third generated field duration
the expression performs the subtraction between the value of the end
and start
field. If the basic fields are empty, the expression of the generated fields returns a null value.
generatedfields/samples/models.py
from django.db import models
from django.db.models import F
from django.db.models.functions import (
TruncDate,
)
class Event(models.Model):
start = models.DateTimeField()
start_date = models.GeneratedField(
expression=TruncDate("start"),
output_field=models.DateField(),
db_persist=True,
)
end = models.DateTimeField(null=True)
end_date = models.GeneratedField(
expression=TruncDate("end"),
output_field=models.DateField(),
db_persist=True,
)
duration = models.GeneratedField(
expression=F("end") - F("start"),
output_field=models.DurationField(),
db_persist=True,
)
def __str__(self):
return (
f"[{self.duration or '∞'}] "
f"{self.start_date}…"
f"{self.end_date or ''}"
)
After running makemigrations
, use sqlmigrate
and see the SQL code that will be applied to the database:
BEGIN;
--
-- Create model Event
--
CREATE TABLE "samples_event" (
"id" integer NOT NULL PRIMARY KEY AUTOINCREMENT,
"start" datetime NOT NULL,
"start_date" date GENERATED ALWAYS AS (
django_datetime_cast_date(
"start", 'UTC', 'UTC'
)
) STORED,
"end" datetime NULL,
"end_date" date GENERATED ALWAYS AS (
django_datetime_cast_date(
"end", 'UTC', 'UTC'
)
) STORED,
"duration" bigint GENERATED ALWAYS AS (
django_timestamp_diff(
"end", "start"
)
) STORED
);
COMMIT;
In the generated SQL code the expressions do not have complex logic, but we see the presence of the deterministic functions django_datetime_cast_date
and django_timestamp_diff
defined in the Django database backend for SQLite.
After running migrate
, open a shell
and check the model behaves as intended:
>>> from samples.models import Event
>>> Event.objects.create(start="2023-1-1 12:00Z")
<Event: [∞] 2023-01-01…>
>>> Event.objects.create(
... start="2023-1-1 11:45Z",
... end="2023-1-9 00:00Z"
... )
<Event: [7 days, 12:15:00] 2023-01-01…2023-01-09>
🔑 A calculated JSON key field
In the Package
model there is the slug
field and the data
JSONfield in which we want to store the JSON payload of the Python package, corresponding to the slug
value, as it is returned to us by PyPi
via its endpoint https://pypi.org/pypi/
. The generated field version
will contain the latest version of the package, extracted from the JSONfield data
.
generatedfields/samples/models.py
from django.db import models
from django.db.models import F
class Package(models.Model):
slug = models.SlugField()
data = models.JSONField()
version = models.GeneratedField(
expression=F("data__info__version"),
output_field=models.TextField(),
db_persist=True,
)
def __str__(self):
return f"{self.slug} {self.version}"
After running makemigrations
, use sqlmigrate
and see the SQL code that will be applied to the database:
BEGIN;
--
-- Create model Package
--
CREATE TABLE "samples_package" (
"id" integer NOT NULL PRIMARY KEY AUTOINCREMENT,
"slug" varchar(50) NOT NULL,
"data" text NOT NULL CHECK (
(
JSON_VALID("data")
OR "data" IS NULL
)
),
"version" text GENERATED ALWAYS AS (
(
CASE
WHEN JSON_TYPE(
"data", '$."info"."version"'
) IN (
'false', 'true', 'null'
) THEN JSON_TYPE(
"data", '$."info"."version"'
)
ELSE JSON_EXTRACT(
"data", '$."info"."version"'
)
END
)
) STORED
);
CREATE INDEX "samples_package_slug_idx"
ON "samples_package" ("slug");
COMMIT;
The SQL code generated for SQLite is a little more complex and verbose than what we might have using another database with advanced JSON type support (for example PostgreSQL).
After running migrate
, open a shell
and check the model behaves as intended:
>>> import json
>>> from urllib.request import urlopen
>>> from samples.models import Package
>>> slug = "django"
>>> URL = f"https://pypi.org/pypi/{slug}/json"
>>> data = json.loads(urlopen(URL).read())
>>> Package.objects.create(slug=slug, data=data)
<Package: django 4.2.7>
🔔 Note
At first glance, the case presented in this example might seem not very useful because it is limited to a simple extraction of the value corresponding to a specific key from a JSON, which in many databases is rather quick to do at run time.
In reality, if we assume that we have a large number of instances of the Package
model, the execution of a query in which the version value is already present in its field greatly lightens its execution by simplifying the logic, and also alleviating indexes.
Furthermore, if the version were the only interesting field to extract from the JSON in the data
field, we could defer the data
field from our queries, speeding up them and drastically reducing the data transferred from the database (e.g. in the change list pages, in the Django admin).
🖇️ A calculated concatenated field
The last User
model partially follows the one already present in Django itself, but we only have the first_name
and the last_name
fields. The generated field full_name
is calculated starting from the first two on which is performed a concatenation with the addition of a space, using the appropriate Concat
function.
generatedfields/samples/models.py
from django.db import models
from django.db.models import Value
from django.db.models.functions import (
Concat,
)
class User(models.Model):
first_name = models.CharField(max_length=150)
last_name = models.CharField(max_length=150)
full_name = models.GeneratedField(
expression=Concat(
"first_name", Value(" "), "last_name",
),
output_field=models.TextField(),
db_persist=True,
)
def __str__(self):
return self.full_name
After running makemigrations
, use sqlmigrate
and see the SQL code that will be applied to the database:
BEGIN;
--
-- Create model User
--
CREATE TABLE "samples_user" (
"id" integer NOT NULL PRIMARY KEY AUTOINCREMENT,
"first_name" varchar(150) NOT NULL,
"last_name" varchar(150) NOT NULL,
"full_name" text GENERATED ALWAYS AS (
COALESCE("first_name", '') ||
COALESCE(COALESCE(' ', '') ||
COALESCE("last_name", ''), '')
) STORED
);
COMMIT;
The SQL code generated for SQLite uses the concatenation operator ||
to join column values together with space and the COALESCE
function in case one of the values is empty.
After running migrate
, open a shell
and check the model behaves as intended:
>>> from samples.models import User
>>> User.objects.create(first_name="Jane", last_name="Doe")
<User: Jane Doe>
🔔 Note
This concatenation example is one of the most used to present the potential of the generated fields. I wanted to present it only at the end because, contrary to what it may seem at first glance, it contains pitfalls when used with other databases but I will delve into this in the next article of this series of articles on database generated columns.
🙋 Get involved
Thank you for getting this far with reading this article.
I hope you found it interesting to read the examples I proposed, as it was for me to formulate them and report them here. The motivation behind this work and this article is to collaborate with all Django developers to churn out features that are as solid and stable as possible.
As I have repeatedly invited you to do in my articles and talks, I invite you to get involved.
Try Django 5.0 and experiment with its new features, but above all report any strange behavior or bugs so that they end up being better for everyone.
🐙 GitHub
I have published the code presented in this article in this GitHub repository of mine to allow you to carry out experiments.
🥠 Conclusion
As we have seen, the generated fields of Django 5.0 are a very powerful tool, which lend themselves to solving a wide variety of problems.
I think they also have the merit of bringing a lot of logic back into the database, which in most projects is the lowest layer of the project stack.
Together with the database-computed default values they also make it possible to simplify the collaboration of different applications on the same database.
In future articles, I'll try to show other examples of using Django's generated fields with other Django-supported database backends (e.g. PostgreSQL) and with other Django modules (e.g. GeoDjango).
Stay tuned.
— Paolo
⚠️ Disclaimer
This code is for demonstration purposes only and should not be used in production as is. However, the code is released without any guarantee from the author and no liability can be attributed. Use at your own risk.
Top comments (0)