If you store large objects such as JSON in the database, for example data for big reports, they can take up a lot of space. To reduce the size of the data, you can compress it and store it in binary form.
PostgreSQL has the bytea column type for storing such data. You can add a bytea column in Rails with a migration:
add_column :reports, :data, :binary
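A complete migration for this column might look like the following sketch (the migration class name, timestamp, and Rails version are placeholders, not taken from the original setup):

# db/migrate/xxxxxxxxxxxxxx_add_data_to_reports.rb
class AddDataToReports < ActiveRecord::Migration[7.0]
  def change
    # Rails maps the :binary type to bytea in PostgreSQL
    add_column :reports, :data, :binary
  end
end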
To work with the binary field transparently, you can use the Rails Attributes API and add a new BinaryHash data type:
# app/types/binary_hash.rb
class BinaryHash < ActiveRecord::Type::Binary
  # Compress the JSON representation before the binary type writes it
  def serialize(value)
    super(value_to_binary(value.to_json))
  end

  # Decompress back into a hash; the raw value may arrive as nil,
  # as an ActiveModel binary wrapper, or as an escaped bytea string
  def deserialize(value)
    decoded =
      case value
      when NilClass
        {}
      when ActiveModel::Type::Binary::Data
        value_to_hash(value.to_s)
      else
        value_to_hash(PG::Connection.unescape_bytea(value))
      end

    super(decoded)
  end

  private

  def value_to_hash(value)
    JSON.parse(
      ActiveSupport::Gzip.decompress(value),
      symbolize_names: true
    ) || {}
  end

  def value_to_binary(value)
    ActiveSupport::Gzip.compress(value)
  end
end
Register the new type in an initializer:
# config/initializers/types.rb
ActiveRecord::Type.register(:binary_hash, BinaryHash)
Then declare the attribute with the new type in the model:
# app/models/report.rb
class Report < ApplicationRecord
  attribute :data, :binary_hash
end
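Once the attribute is declared, compression and decompression happen transparently; a console session with made-up data would look roughly like this:

report = Report.create!(data: { rows: [1, 2, 3], total: 42 })
report.reload.data          # => { rows: [1, 2, 3], total: 42 }
report.reload.data[:total]  # => 42, decompressed and symbolized on read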
A quick benchmark shows that the stored data is almost three times smaller. Run times for a JSON document of about 100,000 characters:
                      user     system      total        real
Compress JSON     0.008671   0.001535   0.010206 (  0.010885)
Decompress JSON   0.001357   0.000095   0.001452 (  0.001509)

json size   95450 bytes
binary size 33868 bytes
~ 2.82 times compression
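Numbers like these can be reproduced with a small script along the following lines (the payload below is invented, so the absolute figures will differ; the exact ratio depends heavily on how repetitive the JSON is):

require "active_support"
require "active_support/gzip"
require "benchmark"
require "json"

json = { rows: Array.new(1_000) { |i| { id: i, note: "x" * 80 } } }.to_json
binary = ActiveSupport::Gzip.compress(json)

Benchmark.bm(16) do |x|
  x.report("Compress JSON")   { ActiveSupport::Gzip.compress(json) }
  x.report("Decompress JSON") { ActiveSupport::Gzip.decompress(binary) }
end

puts "json size   #{json.bytesize} bytes"
puts "binary size #{binary.bytesize} bytes"
puts "~ #{(json.bytesize.to_f / binary.bytesize).round(2)} times compression"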
Top comments (2)
Is this even necessary with postgres and Rails having the jsonb field? I would imagine you're just compressing the whitespace out of the json, which shouldn't be needed with jsonb. Also, jsonb would allow for indexing and searching.
Nice guide!