I've recently picked up the sequel-activerecord_connection gem again to make some reliability improvements around database transactions. For context, this gem extends Sequel with the ability to reuse Active Record's database connection, which should lower the barrier for trying out Sequel in apps that use Active Record.
After pushing some fixes, I was thinking how working on this gem has greatly increased my familiarity with the internals of database transaction implementations of both Active Record and Sequel. Since there aren't any existing articles on this topic, I thought it would be useful to share the knowledge I gathered over the past few months.
This article will compare the transaction API implementation between Active Record and Sequel, and assumes the reader is already familiar with Active Record's transaction API usage. As the title suggests, I will be critical of Active Record's implementation. I know some will perceive this as "not nice", but I think it's important to be aware of the internal complexity of libraries we're using every day (myself included).
Model vs Database
Active Record
Active Record transactions are typically called on the model, which is shown in the official docs as well. I think this can be misleading to novice developers, as it suggests that database transactions are tied to specific database tables, when in fact they're applied to any queries made by the current database connection.
# opens a connection-wide transaction (unrelated to the `accounts` table)
Account.transaction do
balance.save!
account.save!
end
Active Record provides transaction callbacks as part of a model's lifecycle, allowing you to execute code after the transaction commits or rolls back. This, for example, allows you to spawn a background job after a record is persisted, but wait until the transaction commits to ensure the record is up-to-date when the background job is picked up.
class Account < ActiveRecord::Base
after_create_commit :send_welcome_email
private
def send_welcome_email
AccountMailer.welcome(self).deliver_later
end
end
In my opinion, this approach has several issues. For one, it encourages putting business logic into your Active Record models, and generally increases complexity of the model lifecycle. It's unfortunately not trivial to use transaction callbacks outside of models, because they're coupled to models (although there are gems that work around that).
Transaction callbacks can also negatively impact memory usage if you're allocating many model instances within a transaction, as references to these model instances are held until the transaction is committed or rolled back, which prevents Ruby from garbage collecting them beforehand. Active Record will do this for any model that has any transaction callbacks defined.
class Comment < ActiveRecord::Base
after_commit :deliver_new_mentions, on: [:create, :update], if: :body_changed?
private
def deliver_new_mentions
MentionNotificationJob.deliver_later(self)
end
end
ActiveRecord::Base.transaction do
author.comments.find_each do |comment|
# Even though we're not triggering the `after_commit` callback here, Active
# Record will still keep references to these model instances, preventing
# Ruby from garbage collecting them until the transaction is closed.
comment.update(author: new_author)
end
end
Sequel
In Sequel, the transaction API is implemented on the database object, which is completely decoupled from models.
DB = Sequel.connect(adapter: "postgresql", database: "myapp") #=> #<Sequel::Database ...>
# calling #transaction on the database object communicates it's connection-wide
DB.transaction do
balance.save
account.save
end
Sequel also has transaction hooks, but they too are defined on the database object, and aren't tied to models in any way – they're just blocks of code that get executed after the transaction is committed or rolled back. This makes them possible to use in business logic that lives outside of models (of course, in that case one can also just move the code outside of the transaction block).
class CreateAccount
def call(attributes)
DB.transaction do
account = Account.create(attributes)
send_welcome_email(account)
account.update(api_key: SecureRandom.hex)
end
end
private
def send_welcome_email(account)
# queue email delivery after the enclosing transaction commits
DB.after_commit do
AccountMailer.welcome(account).deliver_later
end
end
end
And if you really want to register transaction hooks on the model level, you can do that inside regular model lifecycle hooks:
class Account < Sequel::Model
def after_create
db.after_commit { AccountMailer.welcome(self).deliver_later }
end
end
By giving us the ability to compose APIs this way, the Sequel::Model
class was able to remain unaware of the existence of transaction hooks (which keeps it simpler), but we were still able to achieve the same functionality as we have with Active Record.
Note that the examples above will still keep references to the model instances until the transaction is closed. However, Sequel's API gives us the necessary control to change that. For example, we can choose to register a transaction hook only if a certain condition holds (useful in use cases like file attachments):
class Comment < Sequel::Model
def after_save
if column_changed?(:body)
db.after_commit { MentionNotificationJob.deliver_later(self) }
end
end
end
DB.transaction do
author.comments_dataset.paged_each do |comment|
# The transaction hooks aren't registered, so Ruby can garbage collect
# these model instances while the loop is running.
comment.update(author: new_author)
end
end
We can also register a transaction hook in a way where it will only keep the reference to the record id instead of the whole record instance:
class MentionNotifications
def self.enqueue(comment_id)
db.after_commit { MentionNotificationJob.deliver_later(comment_id) }
end
end
class Comment < Sequel::Model
def after_save
NotificationMentions.enqueue(id)
end
end
Transaction state
Active Record
Active Record maintains transaction state on the connection level, but a lot of transaction-related state is also maintained at the model level. While the transaction manager is implemented pretty decently, the ActiveRecord::Transactions
module is incredibly complex, and has been the source of numerous issues.
The reason for this complexity is that every new incoming bug has generally been solved by adding yet another tweak, yet another conditional, yet another instance variable. And some of these instance variables even leak outside of the ActiveRecord::Transactions
module, which indicates a leaky abstraction.
Honestly, for me this reached a state where I don't consider Active Record's transaction callbacks to be safe enough for production, and I try to avoid them whenever possible.
Sequel
Sequel stores all the transaction state in a single @transactions
instance variable on the database object. Models don't have access to the transaction state, which keeps transactions fully decoupled from models.
DB.transaction do |conn|
DB.after_commit { ... }
DB.transaction(savepoint: true) do
DB.instance_variable_get(:@transactions)[conn] #=>
# {
# after_commit: [
# <Proc...> # the block we've registered above
# ],
# savepoints: [
# { ... }, # transaction data
# { ... } # savepoint data
# ]
# }
end
end
If you're reading Sequel's transaction code, you'll notice that all of it is contained in a single file and single context (including transaction hooks). In my experience this made the logic much easier to grok.
Lazy transactions
Active Record
In version 6.0, Active Record introduced a performance optimization that makes transactions lazy. What this means is that Active Record will issue BEGIN/COMMIT queries only if there was at least one query exected inside the transaction block.
ActiveRecord::Base.transaction do
ActiveRecord::Base.connection.execute "SELECT 1"
end
# BEGIN
# SELECT 1
# COMMIT
ActiveRecord::Base.transaction do
end
# (no queries were executed)
The main use case behind this addition seems to be saving lots of records whose attributes didn't change, where each attempted update would execute empty BEGIN/COMMIT statements (even though no UPDATE was issued), which didn't perform well. A workaround at the time would be to call record.save if record.changed?
instead.
article = Article.find(id)
article.published #=> true
article.update(published: true) # executed empty BEGIN/COMMIT prior to Active Record 6.0
However, as Sean Griffin had pointed out in the pull request review, this added significant complexity for very little gain. In addition to requiring additional transaction state, each Active Record adapter is now also responsible for materializing transactions when necessary.
Sequel
In Sequel, opening a transaction will always execute BEGIN/COMMIT statements (if the transaction commits), regardless of whether any queries were made inside the block or not.
DB.transaction do
end
# BEGIN
# COMMIT
Sequel::Model#save
behaves differently than ActiveRecord::Base#save
, in terms that it always executes an UPDATE statement for an existing record (updating all columns). To update only changed attributes, you would use Sequel::Model#save_changes
, which doesn't execute UPDATE if no attributes have changed. And Sequel::Model#update
calls #save_changes
under the hood:
article = Article.find(id)
article.published #=> true
article.update(published: true) # no queries executed
Unlike ActiveRecord::Base#save
, Sequel::Model#save_changes
doesn't open a transaction if it won't execute the UPDATE statement. This seems like a much more elegant solution to the problem Active Record's lazy transactions intended to solve, but with none of the complexity.
Final words
I really care that libraries I'm using at work have sufficiently straightforward internals that I can understand when debugging an issue. When it comes to database transactions, Active Record's internal complexity is just too overwhelming for me (and that's coming from someone who contributes to open source on a daily basis).
On the other hand, the Sequel's transaction implementation was fairly straightforward to understand, which is all the more impressive considering that it's more feature-rich compared to Active Record (see the docs). And this is not an exception – I regularly see this pattern whenever I'm reading Sequel's source code 😉
Hopefully this article will add another point towards Sequel for people starting new Ruby/Rails projects.
Top comments (2)
Great article, as usual!
This kind of bothers me. Isn't it something a developer should take care of, knowing the tradeoffs of the framework?
record.save if record.changed?
doesn't sound like a workaround for me but a proper programming approach to the problem of saving many objects.Thanks! Yes, that shouldn't be difficult to do, but I suppose Active Record wants to prevent you from running into that performance problem to begin with. Personally, with Active Record it wouldn't come to my mind to add
if record.changed?
, because I assumerecord.save
will not do anything if no columns changed. But instead of makingActiveRecord::Base#save
not open a transaction in this case, which would avoid that problem, it was somehow better to implement lazy transactions? 🤷♂️