Django Doctor audits and auto fixes Django anti-patterns. We audited the Sentry codebase for problems hindering maintainability to see how we can help, and ultimately created a Pull Request to fix some of them. In this post we will break down the most impactful issues, and how they can avoided.
Missing reverse migration
Django data migrations have two aspects: forwards and backwards. If a developer wants to undo migrations then backwards must be specified. However, many projects have migrations that miss this because by default the skeleton data migration generated by Django only shows the forwards handler, and hence like 22% of the Django codebases we audited, Sentry has some data migrations that do not support undoing the migration.
Unfortunately it only takes one missing reverse to make all migrations fail when attempting to undo the migrations. This does not leave the developer many options if they want to reverse migrations due to discovering a bug in prod that requires rolling back to the last good release.
Thus in all the affected data migrations in the Sentry Pull Request we suggested the following:
These migration changes will allow a developer to migrate backwards without Django throwing an exception, and one day may save a developer great headache when they are trying to recovery from a disaster in prod that requires rolling back and undoing a migration.
Checking queryset truthiness:
Querysets are lazily evaluated - meaning the records are not read from the database until code interact with the data. That's why we can do queryset.all()
without downloading every records from the database.
However, if some code does if queryset
then the queryset is evaluated immediately: all records in the queryset are read from the database. Perhaps tens, perhaps thousands, perhaps more. Checking if a queryset is truthy/falsey is much less efficient than checking queryset.exists()
.
Like 50% of the codebases we checked, Sentry suffers this anti-pattern. However, the fix is simple once the problem is identifier: in the Sentry Pull Request we suggested the following:
Also noteworthy is there were cases when the queryset was being evaluated for performance gains: it's perfectly valid to evaluate the queryset if the data in the queryset will be consumed by some code later on: no need to perform two database calls then one can suffice.
URL anti-patterns
Like 40% of the Django projects we checked, Sentry have some URL related anti-patterns:
- duplicate URL names read more
- hard-coded URLs in templates instead of using url template tag read more
- hard-coded static asset URLs instead of using the static template tag read more
Duplicate URL name:
Django provides a very convenient method of generating URLs: giving each URL a name and then allowing the developer to use the {% url
template tag or reverse
to generate the URL.
But what happens if two URLs share the same name? For example here and here both have the name 'sentry-project-event-redirect'
It will mean half of the time templates will be linking to the wrong place: if a template tries to link to the login view via {% url "sentry-project-event-redirect"
can you be sure it's going to the right place? No.
How did this happen? Well it's easily missed when reading the code. The urls.py is over 600 lines of very dense code. Easy to not see the needle in the haystack without tools to help.
Hard-coded URL
Hard-coding URLs in templates makes changing URLs harder, and makes linking to the wrong page easier - so harms maintainability. Sentry had one case of hard-coding a URL, which is very good: only one case in the huge codebase and that only was was pointing to the homepage, which is unlikely to change.
Hard-coded static asset URLs
Django documentation suggests not hard-coding static URLs, and instead using the {% static
template tag. {% static ...
returns the path the browser can use to request the file. At it's simplest, that would return a path that looks up the file on your local file system. That's fine for your local dev environment but in prod we will likely use third-party libraries such as whitenoise or django-storages to improve performance of the production web server.
We saw three instances of hard-coded static URLs, all of them in the same file. These hinder maintainability in a few ways. The key one is if the STORAGE_BACKEND
is changed to one optimizing for speed of static file delivery then the following can happen:
- File revving can change the filename
- Serving from S3 (or similar) can change the domain the file is served from
Therefore hard-coding the static URL would break if the STORAGE_BACKEND
was changed to e.g, whitenoise or django-storages
This issue is discussed on greater detail here.
Does your Django codebase have these issues?
Over time it's easy for tech debt to slip into your codebase. I can check that for you at django.doctor, or can review your GitHub PRs:
Or try out Django refactor challenge.
Top comments (0)