SQLAlchemy is a widely used database toolkit for Python, providing a SQL abstraction layer covering most, if not all, your relational database needs. It's often used together with popular frameworks such as Flask and FastAPI, using either its Core library and/or its ORM components.
This article will take a look at how to set up a basic, standalone SQLAlchemy application, meaning we'll take a look into SQLAlchemy itself and skip other frameworks like the ones we just mentioned. We'll explore some of SQAlchemy's foundational concepts, so we better understand how it works and have some insight about how other frameworks may be using it under their hood.
We will also focus on SQLAlchemy's ORM component. As we'll learn along this article, SQLALchemy is divided into a Core and an ORM component, that represent different ways to achieve the same result, that being interacting with a database. This article is ORM-centered. I suggest checking the documentation for a full introduction to the library, and there is also a comprehensive unified tutorial provided by the SQLAlchemy team, which explores both sides in-depth.
Finally, we'll see some very brief CRUD examples, just so we have a basis on data manipulation. Overall the article works as a simple setup guide with some refreshers about the framework. It does not intend to analyze concepts in great depth, nor a full-fledged tutorial on relational databases or data operations for the sake of brevity. I'll leave the final code available in this GitHub repository
With all that being said, let's start with a few key concepts to familiarize ourselves with SQLAlchemy.
1. Key concepts
The first thing to understand about SQLAlchemy is that it is divided into two major parts: the Object Relational Mapper (ORM) and the Core. Basically, they are different ways to implement the same thing: database interactions. While developers tend to choose and stick to one of them when creating applications, it's not impossible to find codebases with both out in the wild. It's a good idea to understand both so you can make educated implementation choices and be able to comprehend the choices made in pre-existing applications.
The Core component contains the foundation for the whole SQLAlchemy's framework, with most of the library's database services and integrations. It's most significant part is the SQL Expression Language, which in fact works as a query builder system. In other words, a toolkit to construct SQL expressions through composable objects, which can then be executed against the database.
The ORM is built on top of this Core foundation. It expands upon it through a domain object model that is mapped to a database schema. This means you can have classes that are mapped to the database tables and columns, and implement different operations in an object-oriented way. This can be further expanded with external tools such as Alembic, which is a migration tool, or marshmallow, which is a serialization tool.
While using an ORM or a query builder can be quite different, both are able to implement the same operations, and SQLAlchemy's ORM is an extension of the Core. As such, both will rely on a few identical fundamental pieces, such as the database connection. We'll begin our exploration by them.
Our starting point will be the engine and the aforementioned connection.
2. Getting started
Let's get some initial setup out of the way, starting by an empty directory and a virtual environment. Create a new directory with any name you like, and a virtual environment with your preferred tool. I personally use Pyenv. It is worth noting that this tutorial was created with Python 3.11.3, and SQLAlchemy version 2.0.27.
pyenv virtualenv 3.11.3 standalone-sqlalchemy
pyenv activate standalone-sqlalchemy
With the virtual environment activated, install SQLAlchemy.
pyenv exec pip install SQLAlchemy
pyenv exec pip freeze > requirements.txt
Create a database
subdirectory with an empty __init__.py
file, which will contain our files as we progress along the article. The directory tree will look like this:
.
├── database
│ └── __init__.py
└── requirements.txt
3. Creating an engine
The Engine is the starting point of a SQLAlchemy application. It is the central source of connections to the database and holds the application's connection pool. It's independent from either the Core or the ORM and will be used in any and all SQLAlchemy applications.
The usual implementation is to have the engine object created only once in a particular database server. For that, we can use a factory function provided by SQLAlchemy, where we can create and tune database connections. It's the create_engine() function. We use it by passing a database URL as the first positional argument, followed by keyword arguments containing the configuration options we wish to have in place.
The database URL is a usual connection URL. It indicates the database dialect to be used and other required connection parameters. Here's an example:
"dialect+driver://username:password@host:port/database"
The dialect
part tells SQLAlchemy what kind of database it will be interacting with. The driver
part is pointing to a specific DBAPI implementation, tuned to the database type being used. The DBAPI, in short, is a widely used Python specification providing a low-level system to talk to particular databases. Together, the dialect and driver combination ensures the Python code we write is translated to the correct SQL expressions before they're sent to the database. Changing the dialect changes the SQL translation to other database types without having to modify our Python code.
Refer to this documentation if you want to explore the dialects in detail, including current support level for different database types.
Let's see some examples. We could connect to an in-memory SQLite database:
from sqlalchemy import create_engine
engine = create_engine("sqlite+pysqlite:///:memory:")
Or to one in the filesystem:
from sqlalchemy import create_engine
engine = create_engine("sqlite+pysqlite:///dbname.db")
On a local port:
from sqlalchemy import create_engine
engine = create_engine("sqlite+pysqlite:///127.0.0.1:3307/dbname")
And so on. It's worth knowing that you may also create database URLs programmatically through the URL.create()
constructor. Hardcoding credentials is generally a bad idea, and this constructor is handy for defining credentials through environment variables, for example. The resulting object can be passed to create_engine()
directly:
from sqlalchemy import create_engine
from sqlalchemy import URL
url_object = URL.create(
"postgresql+pg8000",
username="dbadmin",
password="zjjk@4863/v",
host="pghost10",
database="dbname",
)
engine = create_engine(url_object, echo=True)
This documentation section goes in-depth about the engine options we have available. For the purposes of this article we'll set everything to default, save for an echo=True
parameter that causes SQL generated by our code to be logged to stdout. It'll be handy to demonstrate our code being translated to SQL.
4. Connecting
Remember that the engine is independent from both the Core and the ORM, but is used by both implementations. However, there are slight differences in how these implementations use the engine object to establish a connection.
On the Core implementation we usually work with Connection
and Result
objects. Put simply, the Connection
object interacts with the engine to connect to the database. The Result
objects are simply a set of database results, generated whenever we perform any given database operation.
The way we tend to use Connection
objects is to have them represent a specific resource open against the database (usually a table), such that the scope of operations we want to perform is kept well-defined. We do the by means of Python's with
statements, which also leaves the code organized and easy to read. Here's an example:
from sqlalchemy import text
from database.engine import engine
with engine.connect() as conn:
result = conn.execute(text("SELECT 'hello world'"))
print(result.all())
Here used a connection call wrapped into the engine object itself and defined a connection context within the with block
. This also leads the connection to be automatically closed. We then used the text()
construct to write textual SQL, just so we can demonstrate this connection. We perform a SELECT statement and bind the results to the result
variable.
Remember we configured our engine with the echo=True
setting so we can see the emitted SQL logged to stdout. Here's what running this code results in:
BEGIN (implicit)
select 'hello world'
[...]
[('hello world',)]
ROLLBACK
The SELECT statement is executed, giving the [('hello world',)]
result you see above. SQLAlchemy then retrieves this data and returns it as Result
objects to the result
variable in our code above. SQLAlchemy did all the heavy lift of opening and closing the connection, translating the correct SQL, executing it and parsing the results.
There's another thing of notice on the translated SQL snippet above: the ROLLBACK statement. The reason it's there is because of one major aspect of SQLAlchemy and the underlying Python DBAPI: transactions, by default, are not committed automatically. If you want to persist changes, you must explicitly use the Connection.commit()
method:
from sqlalchemy import text
with engine.connect() as conn:
result = conn.execute(text("select 'hello world'"))
print(result.all())
conn.commit()
BEGIN (implicit)
select 'hello world'
[...]
[('hello world',)]
COMMIT
Now, while the Connection
pattern I just showed you is typically used on Core-centric implementations, the patterns we used are very similar to what is done on the ORM. The difference is that we'll use Session
objects instead of Connection
.
The Session
is the ORM's fundamental interface for database interaction. It holds the connection and uses with
statements just like we saw, but it also has additional features to manage operations for ORM objects. In order to effectively demonstrate this, however, we first need to understand what these ORM objects are. We'll do this in the next section.
5. Setting up mapped models
ORM models, ORM classes, mapped classes or mapped objects are, in short, objects that correspond to a particular database table. Each column is mapped to a class attribute, and each class attribute is implemented in a way that describes the column's properties and constraints. Thus the "mapped objects" designation.
The ORM implements this via Declarative Mapping. Its most common pattern is to define a Base
class that inherits from the built-in DeclarativeBase
superclass. We then subclass this Base
class for each of our database tables. The underlying inheritance chain causes each subclass to acquire ORM mappings and functionalities.
We'll demonstrate this with two simple classes, Artist
and Track
. First, let's also organize some code into dedicated modules. On the database
directory we created earlier, create a engine.py
, a models.py
, a utils.py
and a __init__py
file. Our project's directory tree will look like this:
.
├── db
│ ├── __init__.py
│ ├── engine.py
│ ├── models.py
│ └── utils.py
└── requirements.txt
Add the code below to engine.py
:
import os
from sqlalchemy import create_engine
current_path = os.path.dirname(os.path.realpath(__file__))
database_url = f"sqlite+pysqlite:///{current_path}/database.db"
engine = create_engine(database_url, echo=True)
This will point the application to the database.bd
file, in the same directory as the engine.py
file, where we'll keep a SQLite database. Next, add the following bits to the models.py
file. I'll explain the classes shortly.
from typing import List
from sqlalchemy import ForeignKey
from sqlalchemy import String
from sqlalchemy.orm import DeclarativeBase
from sqlalchemy.orm import Mapped
from sqlalchemy.orm import mapped_column
from sqlalchemy.orm import relationship
class Base(DeclarativeBase):
pass
class Artist(Base):
__tablename__ = "artist"
id: Mapped[int] = mapped_column(primary_key=True)
name: Mapped[str] = mapped_column(String(30))
tracks: Mapped[List["Track"]] = relationship(
back_populates="artist", cascade="all, delete-orphan"
)
def __repr__(self) -> str:
return f"Artist(id={self.id!r}, name={self.name!r})"
class Track(Base):
__tablename__ = "track"
id: Mapped[int] = mapped_column(primary_key=True)
title: Mapped[str]
artist_id: Mapped[int] = mapped_column(ForeignKey("artist.id"))
artist: Mapped["Artist"] = relationship(back_populates="tracks")
def __repr__(self) -> str:
return f"Track(id={self.id!r}, title={self.title!r})"
Here we're defining a Base
class that inherits from DeclarativeBase
, and then defining the Artist
and Track
subclasses. As discussed earlier, this informs SQLAlchemy that Artist
and Track
are mapped ORM objects, providing them with all instrumentation needed to interact with the database.
Because of this, we are also able to interact with the table schemas themselves. This is exemplified by the __tablename__
class-level attribute, which defines table names for the respective ORM classes.
The other class attributes are each mapped to a table column. This mapping, more specifically, is established by the mapped_column()
directive. Alongside it, the Mapped
typing is used by SQLAlchemy to derive the corresponding SQL types and their nullability constraints.
The Column.primary_key
parameter, as the name implies, identifies the columns defined as the table's primary key. Table relationships can be defined through the the relationship()
construct. We won't go in depth, but relationships and mapped columns have quite a lot of options and arguments that help define every aspect of the underlying relationship. Be sure to check the documentation if you need more detail on this.
Finally, we come to the concept of table metadata. In this context, table metadata is a SQLAlchemy construct defined by our mapped classes, which describes what each underlying database table looks like. This construct is defined by both class-level attributes such as __tablename__
and the attributes that are mapped to table columns. The metadata enables us to interact with the database in some interesting ways. Let's demonstrate some by adding the helper functions below to the utils.py
file:
from database import models
from database.engine import engine
def create_database() -> None:
models.Base.metadata.create_all(engine)
def drop_database() -> None:
models.Base.metadata.drop_all(engine)
def reset_database() -> None:
drop_database()
create_database()
create_database()
will create all tables described in the mapped classes in the models
module. If the file pointed to in the engine does not yet exist, it will be automatically created. drop_database()
will do the opposite, leaving us with an empty database (but still keep the file). reset_database()
will drop and re-create the database leaving us with empty tables, which is useful if you want to play around with the mapped objects. Try opening a Python shell in your virtual environment and executing the first function. We'll be able to visualize the emitted SQL due to the echo=True
property we set on the engine:
PRAGMA main.table_info("artist")
...
PRAGMA temp.table_info("track")
...
CREATE TABLE artist (
id INTEGER NOT NULL,
name VARCHAR(30) NOT NULL,
PRIMARY KEY (id)
)
...
CREATE TABLE track (
id INTEGER NOT NULL,
title VARCHAR NOT NULL,
artist_id INTEGER NOT NULL,
PRIMARY KEY (id),
FOREIGN KEY(artist_id) REFERENCES artist (id)
)
...
COMMIT
Note that handling the schema directly like will will cause changes to be auto-committed. The following statements are logged to stdout as result from running the drop function, showing the reverse operation.
BEGIN (implicit)
PRAGMA main.table_info("artist")
...
PRAGMA main.table_info("track")
...
DROP TABLE track
...
DROP TABLE artist
...
COMMIT
With functional mapped classes, we can return to the matter of using Session
objects for out connection. We'll see this in practice, as we go over some basic CRUD operations on the next section, where we'll also get the hang of data manipulation.
6. Basic CRUD examples
Going forth, we will be adding, fetching, updating and deleting records from the database by means of Session
objects. Let us revise some key points about them:
- Sessions are how the SQLAlchemy ORM connects to the database and performs transactions.
- A
Session
object connects to the database via theEngine
object, where we configure various connection parameters. - Transactions are to be executed within a session context, which is usually done through Python's
with
statements. - Transactions are not committed automatically, requiring us to call
Session.commit()
to persist changes.
There's a lot more detail we could talk about sessions, such as how the engine connection is handled or how to create session factories with sessionmaker
. We'll stick to the basics for brevity, but you may start exploring these topics in this part of the documentation.
With that being said, let's dig in.
Create
To add rows to a table, in this example, we'll be using session contexts to create instances of our mapped objects. Then, we use Session.add()
and Session.commit()
to save them. Let's demonstrate this with our Artist
model. In the database
directory, create a operations.py
file:
.
├── database
│ ├── __init__.py
│ ├── engine.py
│ ├── models.py
│ ├── operations.py
│ └── utils.py
└── requirements.txt
And add the following bits:
from sqlalchemy.orm import Session
from database.engine import engine
from database.models import Artist, Track
# from database.utils import reset_database
def create_artist(name: str) -> None:
with Session(engine) as session:
artist = Artist(name=name)
session.add(artist)
session.commit()
def populate_artists() -> None:
create_artist("David Bowie")
create_artist("Queen")
create_artist("Iggy Pop")
The commented-out reset_database
method above can be used to reset the database if at any point you want to return it to a blank slate with empty tables.
We can create mapped object instances simply by passing named arguments to our mapped classes. By doing this within a session context, and then committing the changes, will lead ORM to do all the work of adding new database records for each of the instances we created. Adding data is as simple as that. This can be verified by looking at the SQL statements logged to stdout. Here's an excerpt for the snippet above:
INSERT INTO artist (name) VALUES ('David Bowie')
[...] ('David Bowie',)
COMMIT
We'll discuss reading data on the next section, but here's a sneak peek so I can show you some additional details into one of the rows we just added:
from sqlalchemy import select
def find_bowie():
with Session(engine) as session:
query = select(Artist).filter_by(name="David Bowie")
result = session.execute(query)
print(result)
Here's the print statement output:
[(Artist(id=1, name='David Bowie'),)]
We can confirm the David Bowie
row has been saved successfully, and also that a primary key was automatically assigned to it. By default, primary key columns will be auto-incrementing integers starting at 1
, unless configured otherwise.
Also, not the contents of the result
variable. It contains a list with a tuple-like data structure. This tuple is actually a SQLAlchemy data structure that we'll come back to in the next section.
Let's shift up a gear. Replace the find_bowie
function with the code below:
from sqlalchemy import Row
from typing import Tuple
def find_artist_by_name(name: str) -> Row[Tuple[Artist]] | None:
with Session(engine) as session:
query = select(Artist).filter_by(name=name)
result = session.execute(query)
return result.first()
def find_track_by_title(title: str) -> (Track | None):
with Session(engine) as session:
query = select(Track).filter_by(title=title)
result = session.scalars(query)
return result.first()
def create_track(title: str, artist: Artist) -> None:
with Session(engine) as session:
track = Track(title=title, artist=artist)
session.add(track)
session.commit()
def populate_bowie_tracks() -> None:
result = find_artist_by_name("David Bowie")
if result is None:
return
bowie = result[0]
with Session(engine) as session:
track1 = Track(title="Under Pressure", artist=bowie)
track2 = Track(title="Starman", artist=bowie)
track3 = Track(title="Heroes", artist=bowie)
track4 = Track(title="Space Oddity", artist=bowie)
track5 = Track(title="Rebel Rebel", artist=bowie)
session.add(track1)
session.add(track2)
session.add(track3)
session.add(track4)
session.add(track5)
session.commit()
def populate_queen_tracks() -> None:
result = find_artist_by_name("Queen")
if result is None:
return
queen = result[0]
with Session(engine) as session:
track1 = Track(title="Bohemian Rhapsody", artist=queen)
track2 = Track(title="Don't Stop Me Now", artist=queen)
track3 = Track(title="We Will Rock You", artist=queen)
track4 = Track(title="We Are The Champions", artist=queen)
track5 = Track(title="I Want To Break Free", artist=queen)
session.add(track1)
session.add(track2)
session.add(track3)
session.add(track4)
session.add(track5)
session.commit()
def populate_iggy_tracks() -> None:
result = find_artist_by_name("Iggy Pop")
if result is None:
return
iggy = result[0]
with Session(engine) as session:
track1 = Track(title="The Passenger", artist=iggy)
track2 = Track(title="Lust For Life", artist=iggy)
track3 = Track(title="Candy", artist=iggy)
track4 = Track(title="Livin' On The Edge Of The Night", artist=iggy)
track5 = Track(title="Hold The Line", artist=iggy)
session.add(track1)
session.add(track2)
session.add(track3)
session.add(track4)
session.add(track5)
session.commit()
def populate_tracks() -> None:
populate_bowie_tracks()
populate_queen_tracks()
populate_iggy_tracks()
This gives us a proper artist search function which returns either the searched artist or None
if it is not found. There's also a track creation function, and some utility functions to populate our tables with some data.
There's also a relationship creation example in there. You can create one by passing an existing mapped object instance, to another mapped object instance, to establish a relationship. Note that this is done specifically in the attribute defined as a relationship, or in this case the artist
attribute in the Track
class.
Alternatively, we could achieve the same by passing an instance's primary key to an attribute defined as a foreign key:
Track(title="Ziggy Stardust", artist_id=bowie.id)
Now that we have added some data, let's explore querying.
Read
Querying is a wide topic, with many strategies available in SQLAlchemy, such as filtering, ordering, joins and others. Whichever the case, we'll usually use a combination of Sessions, mapped objects and the SQL Expression Language system.
Once again, we'll stick to the basics for brevity. To start, add this bit to operations.py
:
def find_artists() -> None:
with Session(engine) as session:
query = select(Artist)
result = session.execute(query)
print(result.all())
Here's the printed statement:
[
(Artist(id=1, name='David Bowie'),),
(Artist(id=2, name='Queen'),),
(Artist(id=3, name='Iggy Pop'),)
]
Remember that whenever we execute a SQL statement against the database (in this case a SELECT statement through the select
function), it generates a Result
object. It represents a set of database results for the performed operation, from which we can retrieve the resulting data.
There are a few ways retrieving this data. One is calling the .all()
method like we did in the snippet above. It returns a list of Row
objects. The Row
object represents a single database row, that is, a table record.
We can also use the .fist()
method to extract just the first row from the Result
object, or None
if it doesn't exists:
print(find_artist_by_name("Lou Reed"))
# Output:
None
Row
and Result
objects are used on both Core or ORM implementations. However, when we are using mapped ORM objects, rows may be mapped to one of our ORM classes. For example, let's check data types at different levels: the Row
object itself, and the data within.
def find_artists() -> None:
with Session(engine) as session:
query = select(Artist)
result = session.execute(query)
first_artist = result.first()
print(type(first_artist))
print(type(first_artist[0]))
print(first_artist[0])
# First print output:
<class 'sqlalchemy.engine.row.Row'>
# Second print output:
<class 'database.models.Artist'>
# Third print output:
Artist(id=1, name='David Bowie')
Better explaining the outputs above:
- When we perform any given operation in the database, we get
Result
objects wrapping the obtained data. We got one on theresult
variable. - When we call
.first()
,.all()
or any such methods of theResult
object, we get access to one or moreRow
objects contained in it. These objects usually represent table rows. - Because we're working with the ORM, the data within the
Row
objects was transformed to an instance of the mapped ORM class we just queried upon.
Because the first_artist[0]
is a mapped object instance, we can access any of its attributes to get the corresponding data:
print(first_artist[0].id)
# Output:
1
print(first_artist[0].name)
# Output:
"David Bowie"
Remember that we defined a relationship between artists and tracks. Try accessing the tracks
attribute:
print(first_artist[0].tracks)
# Output:
[
Track(id=1, title='Under Pressure'),
Track(id=2, title='Starman'),
Track(id=3, title='Heroes'),
Track(id=4, title='Space Oddity'),
Track(id=5, title='Rebel Rebel')
]
If you look back at how we defined this relationship in the Artist
and Track
classes you'll see that, in addition to a foreign key attribute in the Track
class, we added relationship attributes with a back_populate
parameter in both of them. The effect of this is that relationships are kept synchronized and easily accessible, which is a nice ORM feature.
The next thing I'd like to introduce to you is the concept of scalars. No doubt you noticed we needed to use some intricate index syntax above (first_artist[0]
), to access the mapped ORM instance from the Row
object. You might have foreseen even worse situations, like this:
all_artists = result.all()
desired_record = all_artists[1][0]
Scalars are a way to avoid this by bypassing the Row
objects entirely, getting direct access to the mapped instances. We can do this in two ways. Either call a .scalars()
method in a result object:
def find_artists() -> None:
with Session(engine) as session:
query = select(Artist)
result = session.execute(query)
all_artists = result.scalars().all()
print(all_artists)
Or use Session.scalars()
to execute the operation, which is often preferred:
def find_artists() -> None:
with Session(engine) as session:
query = select(Artist)
result = session.scalars(query)
all_artists = result.all()
print(all_artists)
Both will lead to the same result:
[
Artist(id=1, name='David Bowie'),
Artist(id=2, name='Queen'),
Artist(id=3, name='Iggy Pop')
]
As you can see, we skipped the Row
objects entirely and obtained mapped objects instances directly. This is a very common ORM pattern and reduces code complexity by a fair amount.
Still, you may be asking whether Rows
still have use cases. The answer is mostly yes, although it's more common outside the ORM. That's because scalars only work like with queries that return ORM objects. You may encounter scenarios where this is not the case, however. A good example of this is selecting specific table columns, common when doing joins. The snippet below demonstrates this. It uses Row
objects to pick data coming from from multiple tables:
def find_tracks() -> None:
with Session(engine) as session:
query = select(Artist.name, Track.title).join(Artist.tracks)
result = session.execute(query)
for row in result:
print(f"Track: {row.title}, by {row.name}")
Track: Under Pressure, by David Bowie
Track: Starman, by David Bowie
Track: Heroes, by David Bowie
Track: Space Oddity, by David Bowie
Track: Rebel Rebel, by David Bowie
Track: Bohemian Rhapsody, by Queen
Track: Don't Stop Me Now, by Queen
Track: We Will Rock You, by Queen
Track: We Are The Champions, by Queen
Track: I Want To Break Free, by Queen
Track: The Passenger, by Iggy Pop
Track: Lust For Life, by Iggy Pop
Track: Candy, by Iggy Pop
Track: Livin' On The Edge Of The Night, by Iggy Pop
Track: Hold The Line, by Iggy Pop
If we call session.scalars()
instead of the session.execute()
in the code above, it runs but the Row
object casting does not work as expected. It erroneously become a string:
def find_tracks() -> None:
with Session(engine) as session:
query = select(Artist.name, Track.title).join(Artist.tracks)
result = session.scalar(query)
print(result)
# Output:
"David Bowie"
So we lost ORM capabilities in exchange for constructing a tailored dataset. While we could still get the same data while sticking to the ORM, this example serves to demonstrate the usability of Row
objects. Nevertheless, it's always a good idea to keep the codebase concise and reduce complexity to the minimum. It's better to choose a paradigm and stick to it unless the need really arises.
TO wrap this section up, we saw that Result
, Row
and Scalar
objects are the base blocks we need to understand how querying works in SQLAlchemy. Knowing this makes it easier to read and dive deeper into the ORM querying guide and other related documentation. There's still a lot of querying topics such as joins, set operations, pagination, ordering, subqueries and others that we won't have the time to explore.
Finally before we move on, let's tidy up our operations.py
file and make use of scalars in our functions:
from sqlalchemy import select
from sqlalchemy.orm import Session
from typing import Sequence
from database.engine import engine
from database.models import Artist, Track
def find_artists() -> Sequence[Artist]:
with Session(engine) as session:
query = select(Artist)
result = session.scalars(query)
return result.all()
def find_artist_by_name(name: str) -> (Artist | None):
with Session(engine) as session:
query = select(Artist).filter_by(name=name)
result = session.scalars(query)
return result.first()
def find_tracks() -> Sequence[Track]:
with Session(engine) as session:
query = select(Track).where(Track.id < 10)
result = session.scalars(query)
return result.all()
def find_track_by_title(title: str) -> (Track | None):
with Session(engine) as session:
query = select(Track).filter_by(title=title)
result = session.scalars(query)
return result.first()
def create_artist(name: str) -> None:
with Session(engine) as session:
artist = Artist(name=name)
session.add(artist)
session.commit()
def create_track(title: str, artist: Artist) -> None:
with Session(engine) as session:
track = Track(title=title, artist=artist)
session.add(track)
session.commit()
def populate_artists() -> None:
create_artist("David Bowie")
create_artist("Queen")
create_artist("Iggy Pop")
def populate_bowie_tracks() -> None:
bowie = find_artist_by_name("David Bowie")
if bowie is None:
return
with Session(engine) as session:
track1 = Track(title="Under Pressure", artist=bowie)
track2 = Track(title="Starman", artist=bowie)
track3 = Track(title="Heroes", artist=bowie)
track4 = Track(title="Space Oddity", artist=bowie)
track5 = Track(title="Rebel Rebel", artist=bowie)
session.add(track1)
session.add(track2)
session.add(track3)
session.add(track4)
session.add(track5)
session.commit()
def populate_queen_tracks() -> None:
queen = find_artist_by_name("Queen")
if queen is None:
return
with Session(engine) as session:
track1 = Track(title="Bohemian Rhapsody", artist=queen)
track2 = Track(title="Don't Stop Me Now", artist=queen)
track3 = Track(title="We Will Rock You", artist=queen)
track4 = Track(title="We Are The Champions", artist=queen)
track5 = Track(title="I Want To Break Free", artist=queen)
session.add(track1)
session.add(track2)
session.add(track3)
session.add(track4)
session.add(track5)
session.commit()
def populate_iggy_tracks() -> None:
iggy = find_artist_by_name("Iggy Pop")
if iggy is None:
return
with Session(engine) as session:
track1 = Track(title="The Passenger", artist=iggy)
track2 = Track(title="Lust For Life", artist=iggy)
track3 = Track(title="Candy", artist=iggy)
track4 = Track(title="Livin' On The Edge Of The Night", artist=iggy)
track5 = Track(title="Hold The Line", artist=iggy)
session.add(track1)
session.add(track2)
session.add(track3)
session.add(track4)
session.add(track5)
session.commit()
def populate_tracks() -> None:
populate_bowie_tracks()
populate_queen_tracks()
populate_iggy_tracks()
Update
Updating records is really easy. All we need is a mapped object instance, usually obtained after querying the database in any way. If we change one of its attributes within a session context and commit the changes, the ORM will save them in the database. To demonstrate, let's create two update functions for our classes:
def update_artist(artist: Artist, name: str) -> None:
with Session(engine) as session:
artist.name = name
session.add(artist)
session.commit()
def update_track(track: Track, title: str, artist_id: int) -> None:
with Session(engine) as session:
track.title = title
track.artist_id = artist_id
session.add(track)
session.commit()
These methods take a mapped instance of one of the models and their corresponding attributes as arguments. Now, assume we create an artist with a typo, for example:
def create_m83():
with Session(engine) as session:
create_artist("m83")
m83 = find_artist_by_name("m83")
print(m83)
# Output:
# Artist(id=4, name='m83')
All we need is to pass the instance and updated data to the update function:
def fix_m83():
with Session(engine) as session:
m83 = find_artist_by_name("m83")
print(m83)
update_artist(m83, "M83")
m83 = find_artist_by_name("M83")
print(m83)
# First output:
# Artist(id=4, name='m83')
# Second output:
# Artist(id=4, name='M83')
It's similar if we want to update a relationship:
def create_midnight_city():
# Here we create Midnight City with a typo and the wrong artist
with Session(engine) as session:
artist = find_artist_by_name("David Bowie")
create_track("Midnight CIty", artist)
# Here we query the track and its artist to show data currently in the database
with Session(engine) as session:
track = find_track_by_title("Midnight CIty")
artist = session.scalars(
select(Artist).where(Artist.id == track.artist_id)
).first()
print(track)
print(artist)
# First output:
# Track(id=16, title='Midnight CIty')
# Second output:
# Artist(id=1, name='David Bowie')
def fix_midnight_city():
with Session(engine) as session:
artist = find_artist_by_name("M83")
track = find_track_by_title("Midnight CIty")
update_track(track, "Midnight City", artist_id=artist.id)
track = find_track_by_title("Midnight City")
artist = session.scalars(
select(Artist).where(Artist.id == track.artist_id)
).first()
print(track)
print(artist)
# First output:
# Track(id=16, title='Midnight City')
# Second output:
# Artist(id=4, name='M83')
Surely, these methods are just for demonstration purposes and there are better implementations. This is a good opportunity to play around with them to practice querying and modifying data. Save for the update methods, they will be discarded from the final code.
Delete
Like updating, deletion operations will require a mapped object instance and a session context. Then you call session.delete()
and session.commit()
:
def delete_artist(artist: Artist) -> None:
with Session(engine) as session:
session.delete(artist)
session.commit()
def delete_track(track: Track) -> None:
with Session(engine) as session:
session.delete(track)
session.commit()
Let's try deleting an artist:
with Session(engine) as session:
m83 = find_artist_by_name("M83")
print(m83)
delete_artist(m83)
session.commit()
# Output:
# Artist(id=4, name='m83')
Now if you try finding the same artist again:
with Session(engine) as session:
m83 = find_artist_by_name("M83")
print(m83)
# Output:
# None
Now try running the find_tracks
function. You'll notice any tracks from the deleted artist are no longer present in the database. This is because of the deletion constraints we defined in our Artist
class, specifically:
tracks: Mapped[List["Track"]] = relationship(
back_populates="artist", cascade="all, delete-orphan"
)
Which defines an ON DELETE CASCADE
behavior when deleting from a parent table. For more details on cascades, check this documentation.
This covers the basic CRUD operations for our application. Next, let's wrap up.
7. Wrapping up
This article has explored the basics of SQLAlchemy and leaves you with a very simple ORM application. From here you're able to head to the documentation to analyze the concepts we discussed in more detail, as well as delve into more advanced topics. I'll leave you with a list of ideas to try and learn more, in no particular order. Write an application of your own, or feel free to reuse the code we have created. It's available here. Best of luck!
- Learn more about sessions. Learn about
sessionmaker
, try refactoring the application around it. - Add more mapped ORM classes with different column definitions. Learn the available data types and constraints, and how to define them.
- Try creating different kinds of relationships and testing their behavior.
- Write more complex queries. Try sorting, filtering and pagination. Try performing searches that involve relationships.
- Head over to Alembic and learn about migrations. Think how you could implement them.
- Try writing serializers to convert instances or you mapped objects to simple Python dictionaries. For a whole framework dedicated to this purpose, head over to marshmallow.
Top comments (2)
This is a fantastic guide! How would you recommend handling large-scale migrations with SQLAlchemy for complex applications?
Hey there! Specific approaches tend to vary with each database, so I'll be better able to provide some guidelines.
Backups are the most important, especially with anything large-scale and on databases with pre-existing data. Ensure data is safe and recoverable before applying any extensive operations.
Keep development and production environments separate. Ideally, we would be working with a separate database instance, in complete isolation from production data. This could be a remote database on another URL or a local instance using tools like Docker Compose to replicate the production environment. This way, you have somewhere to run and test migrations without worry of breaking things.
Having the development database full of data can be very useful to visualize the end results of your changes. I've worked in companies that provided developers with a database dump to populate their local instances with, which was great. However, this is not always an option. A workaround is to create seeding scripts that populate tables with dummy data for you, similar to what we did in the article. Some codebases may already have seeding scripts ready to use.
Finally, we come to handling the migrations themselves. As mentioned, the details will depend on your specific database, but some common approaches I find useful are: