Introduction
Enum - is a user-defined data type representing a set of named values. In general, the idea of enums aims to provide 2 main purposes:
- to make codebase more readable and human-friendly
- to add extra type safety by limiting possible assignable value to the variable of enum type
Enum is based on integer type data type, what is definitely good for performance. If we are going to deal with a lot of data.
The "Old school" way
The most common way of using "enum" is the definite state of some object. In spite of having no enums in pure Ruby, we are able to emulate them with constants or symbols. Here is the sample of old-styled Rails 3 (legacy) data-model source code:
class Account < ActiveRecord::Base
PAID_STATUS = 'paid'
OVERDUE_STATUS = 'ovedue'
BLOCKED_STATUS = 'blocked'
STATUSES = [PAID_STATUS, OVERDUE_STATUS, BLOCKED_STATUS].freeze
validates :status, presence: true, inclusion: {in: STATUSES}
def paid?
self.status == PAID_STATUS
end
end
# everywhere we work with Account instance - it is much easier
# to deal with helper predicate instead of comparing values "in place"
if account.paid?
# do something special for the user who pays
end
It gives more readability to the codebase and makes it looks friendly for a new developer! But it is overloaded with boilerplates - constants definition and values checking. This checks spread all over the codebase and make maintenance more difficult.
The "Rails way"
Staring from Rails 4.1 Active Record provides easier way to use enums:
class Conversation < ActiveRecord::Base
enum status: [ :active, :archived ]
end
and we've got some useful mutation methods, check value methods and scopes out of the box, g.e.:
conversation.active! # sets the active status for object
conversation.active? # checks if the object have active status
Conversation.action.find_each { |c| puts c.status } # provides scope for searching
We've got source code to be more readable and maintainable.
Validation comes out of the box, and it raises the ArgumentError
if a value is not defined.
conversation.status = :unknown
=> ArgumentError: 'unknown' is not a valid channel
Default behavior supposes to use numeric as the underlying type of the column:
create_table :conversations do |t|
t.integer :status
end
And Active Record will assign values for your named constants:
[1] pry(main)> Conversation.statuses
=> {"active"=>0, "archived"=>1}
Everything is great until we will need to add value to your enumeration (and make it first):
class Conversation < ActiveRecord::Base
enum status: [ :initial, :active, :archived ]
end
This change will force Active Record to re-enumerate values starting from zero. And values will mix up and crash your application logic:
> Conversation.statuses
=> {"initial"=>0, "active"=>1, "archived"=>2}
The "Rails way" with Postgres Enums
We have to points to do it:
- define enum values explicitly with model definition
- introduce Postgres enum type to declare underlying database column data type
Here is the model with channel
attribute having enumerated values:
class UserNotification < ActiveRecord::Base
# define rails enum and the underlying values to use for every enum-value
enum channel: {
email: 'email',
webpush: 'webpush',
telegram: 'telegram'
}
end
It means that we will have email
, webpush
and telegram
- as values for column channel
.
Write migration to create Postgres enum-type user_notification_channel
:
class CreateUserNotifications < ActiveRecord::Migration
def up
execute <<-DDL
CREATE TYPE user_notification_channel AS ENUM (
'email', 'webpush', 'telegram'
);
DDL
create_table :user_notifications do |t|
...
t.column :channel, :user_notification_channel
end
end
def down
drop_column :channel, :user_notification_channel
execute "DROP type user_notification_channel;"
end
end
This trick lets us have 'string' value in Ruby:
notification.channel
=> "email"
but occupies only 4 bytes in a database row.
Main benefits of this complex "Rails&Postgres" way:
- readable and user-friendly source code
- string representation of enum in Ruby with optimal data storage in the database
- more type safety on database level
- meaningful and clear database structure out of Rails-application context
How to examples
Migrating existing text column to Postgres Enum
The reasons for doing this are:
- reduce table size (fixed 4 bytes enum vs string length)
- increase type safety and database data consistency
- make database values more meaningful and friendly (out of models context)
class CreateUserNotifications < ActiveRecord::Migration
def up
execute <<-DDL
ALTER TABLE user_notifications
ALTER COLUMN channel TYPE user_notification_channel
USING channel::user_notification_channel;
DDL
end
def down
# moves column back to have text data-type
execute <<-DDL
ALTER TABLE user_notifications
ALTER COLUMN channel TYPE text
USING channel::text;
DDL
end
end
This works well without breaking any changes!
Add value to existing enum migration:
class ExpandValueOfUserNotificationChannels < ActiveRecord::Migration
def up
execute <<-DDL
ALTER TYPE user_notification_channel ADD VALUE 'post_pigeon';
DDL
end
end
Drop value from existing enum
Postgres does not allow to drop value from existing enum, but here is some trick to drop value:
- migrate the column to text
- re-create the new enum type
- migrate back to the new enum
class RemoveTelegramFromUserNotificationChannel < ActiveRecord::Migration
def up
execute <<-DDL
ALTER TABLE user_notifications
ALTER COLUMN channel TYPE text
USING channel::text;
DROP TYPE user_notification_channel;
CREATE TYPE user_notification_channel AS ENUM (
'email', 'webpush'
);
ALTER TABLE user_notifications
ALTER COLUMN channel TYPE user_notification_channel
USING channel::user_notification_channel;
end
end
Migrate from numeric Rails enum to Postgres Enum
Imagine we have Rails enums and the integer channel column in database:
class UserNotification < ActiveRecord::Base
enum channel: [:email, :webpush, :telegram]
end
Numeric values of each Rails enum value could be gotten by the following code:
[1] pry(main)> UserNotification.channels
=> {"email"=>0, "webpush"=>1, "telegram"=>3}
Just alter column by changing type and transform the value from numeric to the enum:
class MigrateNotificationsChannelToPostgresEnum < ActiveRecord::Migration
def up
execute <<-DDL
ALTER TABLE user_notifications
ALTER COLUMN channel TYPE user_notification_channel
USING CASE channel
WHEN 0 THEN 'email'::user_notification_channel
WHEN 1 THEN 'webpush'::user_notification_channel
WHEN 2 THEN 'telegram'::user_notification_channel
ELSE RAISE 'UNKNOWN VALUE'
END
DDL
end
end
This migration will transform numeric values to enums. Profit! 🙂
Conclusion
Postgres enums are great and helpful!
They are very easy to integrate with Rails!
They help to ensure data consistency for the definite set of values. The increase human-readability of the data out of the Rails application context.
They also have better performance (comparing to text values).
Thanks for reading.
Top comments (9)
Thanks for sharing this knowledge!
I noticed a mistake on the migration:
Every value is 0, I think it should be 0, 1 and 2.
Nice post @dsalahutdinov , apart from what @antico5 says I see another detail here, is that the last one should be
instead of
@johand , thanks, fixed
unfortunately the db/schema.rb file is messed up with this approach
Could not dump table "users" because of following StandardError
Unknown type 'xyz_setting' for column 'xyz'
stackoverflow.com/questions/646689...
Hi! Good point. To have the "native" schema.rb support of postgres enum you could use this ruby gem github.com/bibendi/activerecord-po...
Thanks for the article. I was already looking to Postgres Enums for a solution to my data validation and now you've given me the path to my proposed solution. Happy days.
Another approach: jetrockets.pro/blog/ruby-string-li...
thanx for sharing! should be remove_column instead of drop_column in migration.
Michael, thanks for your comment!
This is great idea, which comes after the fail without having explicit values first :)