Julien Camblan for Tymate

Posted on Feb 27, 2020 • Edited on Jul 3, 2020

First dive into GraphQL Ruby

#graphql #ruby #rails #beginners

At Tymate, the question of using GraphQL for our APIs has been around for a long time. Since most of our desktop applications are built with React, it was obviously a recurring request from our front-end developers.

My name is Julien, I'm 32 and I've been a Ruby developer for a little more than a year, after a professional retraining. This article is a loose transcription of an internal workshop at Tymate, during which I shared my feedback on my first use of the GraphQL Ruby gem.

The following assumes basic knowledge of how GraphQL works.

Data structure

The first thing to know when you start working with GraphQL is that everything is typed. Every piece of data you use or want to serialize must be clearly defined. It's not something you're naturally used to with Ruby, but it quickly becomes very comfortable.

Types

So we will naturally start by defining types.

First, we will create types for the models of our API that we will need for our GraphQL queries. The definition of GraphQL types is very similar to the logic found in serializers on our classic REST APIs: the definition of the data that we will be able to consult and the form in which this data is made available. But this obviously goes further by integrating typing, validations, possibly rights management...

Below is an example of a GraphQL type corresponding to a model of our API:

module Types
  class ProviderType < Types::BaseObject
    # These lines allow you to integrate the specifics of Relay to the type
    # we'll come back to this later in the article
    implements GraphQL::Relay::Node.interface
    global_id_field :id

    # Here, we define the fields that we will be able to request on our object
    # it can be basic fields, methods from the model
    # or methods directly defined in the type.
    # For each field, we must specify the type
    # and the possibility of being null.
    field :display_name, String, null: true
    field :phone_number, String, null: true
    field :siret, String, null: true
    field :email, String, null: true
    field :iban, String, null: true
    field :bic, String, null: true
    field :address, Types::AddressType, null: true
    field :created_at, GraphQL::Types::ISO8601DateTime, null: true
    field :updated_at, GraphQL::Types::ISO8601DateTime, null: true

    # Here, we choose to override the model's address relationship.
    # Simply declaring the address field would be enough
    # to return the correct items.
    # But by overriding it, we can use the BatchLoader gem
    # to eliminate the N+1 queries from our request.
    def address
      # Batch on the provider ID so that each address can be matched
      # back to the provider that requested it.
      BatchLoader::GraphQL.for(object.id).batch do |provider_ids, loader|
        Address.where(
          addressable_type: 'Provider', addressable_id: provider_ids
        ).each { |address| loader.call(address.addressable_id, address) }
      end
    end
  end
end
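
To make the batching idea behind the address method concrete without the gem, here is a minimal plain-Ruby sketch. It is not how BatchLoader is implemented: TinyBatch, ADDRESSES and QUERY_LOG are hypothetical names invented for this illustration. Each "resolver" registers the ID it needs and receives a lazy handle, and a single lookup runs the first time any value is actually read.

```ruby
# Hypothetical in-memory "table" standing in for the addresses table.
ADDRESSES = [
  { addressable_id: 1, city: "Lille" },
  { addressable_id: 2, city: "Paris" },
  { addressable_id: 3, city: "Lyon" }
].freeze

# Records every simulated query so that we can count them.
QUERY_LOG = []

# Minimal batching helper (illustrative only, not the BatchLoader API).
class TinyBatch
  def initialize(&fetch)
    @fetch = fetch   # block that loads many ids at once, returns id => value
    @pending = []    # ids requested but not loaded yet
    @loaded = {}     # id => loaded value
  end

  # Register an id and return a lazy handle instead of a value.
  def for(id)
    @pending << id
    -> { resolve(id) }
  end

  private

  # Load every pending id in one call, then serve values from memory.
  def resolve(id)
    unless @pending.empty?
      @loaded.merge!(@fetch.call(@pending.uniq))
      @pending.clear
    end
    @loaded[id]
  end
end

loader = TinyBatch.new do |ids|
  QUERY_LOG << ids
  ADDRESSES
    .select { |address| ids.include?(address[:addressable_id]) }
    .to_h { |address| [address[:addressable_id], address] }
end

# Three "resolvers" ask for their address; nothing is loaded yet.
handles = [1, 2, 3].map { |provider_id| loader.for(provider_id) }

# Reading the values triggers a single batched lookup.
cities = handles.map { |handle| handle.call[:city] }
```

The key point is that each result is keyed by the ID that was requested, which is what lets every resolver get its own record back from one shared query.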

Queries

Once we have defined the types we need, they do appear in the documentation automatically generated by GraphiQL, but API consumers cannot access them yet.

For this, it is necessary to create queries. A query is the equivalent of a GET request in REST. The consumer asks to consult either a specific object or a collection of objects (which is called a connection in the Relay methodology; I will come back to this later).

Display an isolated object

Queries are defined in the QueryType class:

module Types
  class QueryType < Types::BaseObject
    # Here, the field is the name of the query.
    # The arguments passed in the block will allow
    # to find the object we're looking for
    field :item, Types::ItemType, null: false do
      argument :id, ID, required: true
    end

    # As opposed to the definition of model types seen above,
    # For Queries the field doesn't stand on its own
    # It is necessary to define in a method of the same name
    # the logic that will solve the query
    def item(id:)
      Item.find(id)
    end
  end
end

In its behavior, a query is very close to what you do in a REST controller. We define the query in a field, to which we associate the type of the object(s) we are going to return, and arguments that will allow us to retrieve the desired object, for example an ID.

In the item method, we retrieve the arguments passed in the field, and we look for the corresponding object.

Lighten Queries

As soon as you have a lot of different models that you want to expose through a query, you will be faced with a lot of code duplication. A bit of metaprogramming can lighten our query_type.rb:

module Types
  class QueryType < Types::BaseObject
    # First we define a common method that finds an object
    # from a unique ID provided by Relay
    def item(id:)
      ApiSchema.object_from_id(id, context)
    end

    # And we create the fields that call the above method for all types
    # for which we need a query.
    %i[
      something
      something_else
      attachment
      user
      identity
      drink
      food
      provider
    ].each do |method|
      field method,
            "Types::#{method.to_s.camelize}Type".constantize,
            null: false do
        argument :id, ID, required: true
      end

      alias_method method, :item
    end
  end
end
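
For readers less familiar with this kind of metaprogramming, the sketch below reproduces the same loop in plain Ruby, outside of Rails and graphql-ruby. DemoTypes, DemoQuery, FIELDS and the camelize helper are hypothetical stand-ins: Object#const_get plays the role of ActiveSupport's constantize, and a simple hash replaces the field declaration.

```ruby
# Hypothetical stand-ins for the real GraphQL types.
module DemoTypes
  class UserType; end
  class SomethingElseType; end
end

# Plain-Ruby approximation of ActiveSupport's `camelize`.
def camelize(name)
  name.to_s.split("_").map(&:capitalize).join
end

class DemoQuery
  # Registry filled by the loop below: query name => type class.
  FIELDS = {}

  # Single lookup method shared by every generated query.
  def item(id:)
    "found #{id}"
  end

  %i[user something_else].each do |method|
    # Resolve the type class from the query name (:user => UserType).
    FIELDS[method] = DemoTypes.const_get("#{camelize(method)}Type")
    # Expose the shared lookup under the query's own name.
    alias_method method, :item
  end
end

result = DemoQuery.new.something_else(id: "abc")
```

Every name in the list gets both a declared type and a method of the same name, without writing either by hand.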

Display a collection of objects

By default, there is not much difference between a query that returns an object and one that returns a collection of objects. We simply have to specify that the query result will be an array by wrapping the type in []:

module Types
  class QueryType < Types::BaseObject
    field :items, [Types::ItemType], null: false do
      # you no longer search for an item by its ID, but by its container.
      argument :category_id, ID, required: true
    end

    def items(category_id:)
      Category.find(category_id).items
    end
  end
end

We call the query like this:

query items($categoryId: ID!) {
  items(categoryId: $categoryId) {
    id
    displayName
  }
}

And we get the following JSON in response:

{
  "data": {
    "items": [
      {
        "id": "1",
        "displayName": "toto"
      },
      {
        "id": "2",
        "displayName": "tata"
      }
    ]
  }
}

Relay & connections

This plain array format could work for very basic needs, but it quickly becomes limiting.

For better structured requests, Facebook created Relay, a GraphQL overlay (supported natively by the gem) that introduces two very practical paradigms:

we work with global IDs (strings obtained by Base64-encoding "TypeName-ID", so "Place-352" becomes "UGxhY2UtMzUy")
a very specific nomenclature to organize and consume collections of objects: the connections.

Edit: I wrote the initial workshop based on version 1.9 of the graphql-ruby gem. In version 1.10, the notion of connection was extracted from Relay to become the default format for collections.

Global IDs replace classic IDs first and foremost for the benefit of the JS applications that consume the API: an ID that is unique across all types can always be used as a key when looping over objects. From the API's point of view, working with IDs that are unique regardless of the type of object is also very practical.
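
As an illustration, the IDs in the sample response further down decode to strings such as "Place-352". The sketch below reproduces that encoding with core Ruby only; encode_global_id and decode_global_id are hypothetical helpers written for this example, not the gem's API, and the exact Base64 variant may differ between gem versions.

```ruby
# Separator between the type name and the database ID (assumed to be "-",
# based on the decoded sample IDs).
SEPARATOR = "-"

# Build a global ID from a type name and a database ID.
def encode_global_id(type_name, id)
  # pack("m0") produces Base64 without line breaks.
  ["#{type_name}#{SEPARATOR}#{id}"].pack("m0")
end

# Return [type_name, id] from a global ID.
def decode_global_id(global_id)
  # Split on the first separator only, so IDs containing "-" survive.
  global_id.unpack1("m0").split(SEPARATOR, 2)
end

place_id = encode_global_id("Place", 352)
type_name, database_id = decode_global_id(place_id)
```

Note that the decoded ID comes back as a string, which is why server-side code usually converts it to an integer before querying the database.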

Nomenclature

As for connections, here is what our previous query looks like once adapted to this format:

{
  items(first: 3) {
    totalCount
    pageInfo {
      hasNextPage
      hasPreviousPage
      endCursor
      startCursor
    }
    edges {
      cursor
      node {
        id
        displayName
      }
    }
  }
}

And the corresponding response:

{
  "data": {
    "items": {
      "totalCount": 351,
      "pageInfo": {
        "hasNextPage": true,
        "hasPreviousPage": false,
        "endCursor": "Mw",
        "startCursor": "MQ"
      },
      "edges": [
        {
          "cursor": "MQ",
          "node": {
            "id": "UGxhY2UtMzUy",
            "displayName": "Redbeard"
          }
        },
        {
          "cursor": "Mg",
          "node": {
            "id": "UGxhY2UtMzUx",
            "displayName": "Frey of Riverrun"
          }
        },
        {
          "cursor": "Mw",
          "node": {
            "id": "QmlsbGVyLTI=",
            "displayName": "Something Else"
          }
        }
      ]
    }
  }
}

The connection gives us access to additional information about our collection. By default, the gem generates the pageInfo used for cursor pagination, but we can also write custom fields, such as the totalCount added here to support a more traditional numbered pagination.

The edges are intended to hold information about the relationship between the object and its collection. By default, an edge contains the cursor, which represents the position of its node within the collection. It is also possible to define custom fields on it.

The nodes are basically the objects of the collection.
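
To visualize how these three levels fit together, here is a plain-Ruby sketch that builds the same shape from an ordinary array. build_connection and cursor_for are hypothetical helpers and the provider names are sample data; the gem builds this structure for you, and this sketch only handles the first argument.

```ruby
# Cursor for a 1-based position: Base64 without padding ("1" => "MQ").
def cursor_for(position)
  [position.to_s].pack("m0").delete("=")
end

# Build a Relay-style connection hash from a plain array.
def build_connection(items, first:)
  page = items.first(first)

  # One edge per node, carrying the node's position in the collection.
  edges = page.each_with_index.map do |item, index|
    { cursor: cursor_for(index + 1), node: item }
  end

  {
    totalCount: items.size,
    pageInfo: {
      hasNextPage: items.size > page.size,
      hasPreviousPage: false, # this sketch always starts at the beginning
      startCursor: edges.first&.fetch(:cursor),
      endCursor: edges.last&.fetch(:cursor)
    },
    edges: edges
  }
end

providers = %w[Alpha Bravo Charlie Delta].map { |name| { displayName: name } }
connection = build_connection(providers, first: 3)
```

The connection carries data about the whole collection, each edge carries data about one position in it, and the node is the object itself.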

Integration

Relay brings valuable data to queries, but it requires more verbosity in their definition. Concretely, instead of just one query, we will have to define 3 types:

the query
the connection
the edge

ConnectionType

The connection type definition includes 3 components:

the specification of the EdgeType to be used
the parameters that can be applied to this connection
custom fields that can be requested in return

class Types::Connections::ProvidersConnectionType < Types::BaseConnection
  # We start by calling the class of the EdgeType we want to associate
  # with the connection
  edge_type(Types::Edges::ProviderEdge)

  # By defining Input types here, we may then call them in the queries
  # associated with this connection.
  # I will come back to the `eq/ne/in` arguments below.
  class Types::ProviderApprovedInput < ::Types::BaseInputObject
    argument :eq, Boolean, required: false
    argument :ne, Boolean, required: false
    argument :in, [Boolean], required: false
  end

  class Types::ProviderFilterInput < ::Types::BaseInputObject
    argument :approved, Types::ProviderApprovedInput, required: false
  end

  # Missing by default, we can incorporate a counter of the number of objects
  # in the collection by creating a field and resolving it
  # in the ConnectionType
  field :total_count, Integer, null: false

  def total_count
    skipped = object.arguments[:skip] || 0
    object.nodes.size + skipped
  end
end

EdgeType

The edge can be thought of as a linking table, which acts as a bridge between the connection and the nodes it contains. By default, we need to set the node_type to identify the type of object returned by our connection. But it is also possible to define custom methods.
I haven't yet encountered a use case for this feature, however.

class Types::Edges::ProviderEdge < GraphQL::Types::Relay::BaseEdge
  node_type(Types::ProviderType)
end

QueryType

Lastly, once the connection is defined, it must be used:

either in a specific query
or in the type of a parent object

To do this, instead of associating the field with the type of the returned final object, we associate it with the type of the connection.

module Types
  class QueryType < Types::BaseObject
    field :providers, Types::Connections::ProvidersConnectionType, null: true

    def providers
      Provider.all
    end
  end
end

module Types
  class ParentType < Types::BaseObject
    # ...

    global_id_field :id
    field :display_name, String, null: true

    # ...

    field :providers, Types::Connections::ProvidersConnectionType, null: true

    def providers
      BatchLoader.for(object.id).batch(default_value: []) do |ids, loader|
        Provider.where(parent_id: ids).each do |rk|
          loader.call(rk.parent_id) { |i| i << rk }
        end
      end
    end
  end
end

By default, you can pass all the basic paging arguments to the query (first, after, before, last...). If necessary, we can declare additional arguments to refine our query:

module Types
  class QueryType < Types::BaseObject
    field :providers, Types::Connections::ProvidersConnectionType, null: true do
      # filter calls ProvidersConnectionType specific InputTypes
      # which we defined above
      argument :filter, Types::ProviderFilterInput, required: false
    end

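    # The resolver method must accept the arguments declared on the field
    def providers(filter: nil)
      # `filter` is available here; how it is applied is covered
      # in the section on sorting and filtering below
      Provider.all
    end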
  end
end

Queries extraction

All the queries we want to expose on our API must be declared in the query_type.rb file shown above. As complexity grows, this file quickly becomes overloaded. Fortunately, the query logic can be extracted into separate files called resolvers.

The query_type.rb file will look like this:

module Types
  class QueryType < Types::BaseObject
    # ...

    field :all_providers_connection, resolver: Queries::AllProvidersConnection

    # ...
  end
end

The query logic will be in a separate file:

module Queries
  class AllProvidersConnection < Queries::BaseQuery
    description 'list all providers'

    type Types::Connections::ProvidersConnectionType, null: false

    argument :filter, Types::ProviderFilterInput, required: false
    argument :search, String, required: false
    argument :skip, Int, required: false
    argument :order, Types::Order, required: false

    def resolve(**args)
      # Forward the arguments as keywords to match the helper's signature
      res = connection_with_arguments(Provider.all, **args)
      apply_filter(res, args[:filter])
    end
  end
end

The custom methods connection_with_arguments and apply_filter are defined in BaseQuery.

Sorting and filtering connections

connection_with_arguments allows me to integrate sorting arguments and numbered paging into my queries.

def connection_with_arguments(res, **args)
  order = args[:order] || { by: :id, direction: :desc }
  res = res.filter(args[:search]) if args[:search]
  res = res.offset(args[:skip]) if args[:skip]
  # the specification of the table name is necessary to enable
  # sorting when the initial SQL query contains joins
  res = res.order(
    "#{res.model.table_name}.#{order[:by]} #{order[:direction]}"
  )
  res
end
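
Because the ORDER BY fragment above is built by string interpolation, it is worth making sure that only known values can reach it. The sketch below shows one possible guard, assuming an allowlist of sortable columns. order_clause, ALLOWED_COLUMNS and ALLOWED_DIRECTIONS are hypothetical names, and if Types::Order already restricts both values through enums, this check may be redundant.

```ruby
# Hypothetical allowlists: adapt them to the columns you actually expose.
ALLOWED_COLUMNS = %w[id display_name created_at].freeze
ALLOWED_DIRECTIONS = %w[asc desc].freeze

# Build a "table.column direction" fragment from validated input only.
def order_clause(table_name, order)
  order ||= { by: :id, direction: :desc } # same default as above
  by = order[:by].to_s
  direction = order[:direction].to_s.downcase

  unless ALLOWED_COLUMNS.include?(by)
    raise ArgumentError, "unknown column: #{by}"
  end
  unless ALLOWED_DIRECTIONS.include?(direction)
    raise ArgumentError, "unknown direction: #{direction}"
  end

  "#{table_name}.#{by} #{direction}"
end

default_clause = order_clause("providers", nil)
custom_clause = order_clause("providers", { by: :display_name, direction: :ASC })
```

Anything outside the allowlists raises before a query is built, instead of ending up inside the SQL string.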

apply_filter provides global logic to all filter arguments in the API.

Usually, in our REST APIs, we use scopes to let users filter the results of a query, but these filters remain pretty basic. While designing this first GraphQL API with my fellow React developer, who is also the API's consumer, we wanted to go a little further. Since GraphQL makes it possible to choose precisely which data to receive, we might as well make it possible to filter that data precisely too.

So we looked for an existing formatting structure for the filters and decided to use the standard proposed in the GatsbyJS documentation as a starting point.

Complete list of possible operators, as described in the GatsbyJS documentation:

eq : short for equal, must match the given data exactly

ne : short for not equal, must be different from the given data

regex : short for regular expression, must match the given pattern. Note that backslashes need to be escaped twice, so /\w+/ needs to be written as "/\\\\w+/".

glob : short for global, allows to use wildcard * which acts as a placeholder for any non-empty string

in : short for in array, must be an element of the array

nin : short for not in array, must NOT be an element of the array

gt : short for greater than, must be greater than given value

gte : short for greater than or equal, must be greater than or equal to given value

lt : short for less than, must be less than given value

lte : short for less than or equal, must be less than or equal to given value

elemMatch : short for element match, this indicates that the field you are filtering will return an array of elements, on which you can apply a filter using the previous operators

The current integration looks like this:

def apply_filter(scope, filters)
  return scope unless filters

  filters.keys.each do |filter_key|
    filters[filter_key].keys.each do |arg_key|
      value = filters[filter_key][arg_key]

      # Here we translate the global IDs we get from the API consumer
      # in classic IDs recognized by Postgresql
      if filter_key.match?(/([a-z])_id/)
        value = [filters[filter_key][arg_key]].flatten.map do |id|
          GraphQL::Schema::UniqueWithinType.decode(id)[1].to_i
        end
      end

      scope = case arg_key
              when :eq, :in
                scope.where(filter_key => value)
              when :ne, :nin
                scope.where.not(filter_key => value)
              when :start_with
                scope.where("#{filter_key} ILIKE ?", "#{value}%")
              else
                scope
              end
    end
  end

  scope
end

It's still a work in progress and the code is rudimentary, but it's enough to filter our first queries.
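
To make the semantics of the operators concrete independently of ActiveRecord, here is a plain-Ruby sketch that applies the same filter structure to an in-memory array. OPERATORS, filter_rows, the sample PROVIDERS and the rating field are all hypothetical; only the comparison operators from the list above are covered (no regex, glob or elemMatch).

```ruby
# Each operator is a predicate taking (actual_value, expected_value).
OPERATORS = {
  eq:  ->(actual, expected) { actual == expected },
  ne:  ->(actual, expected) { actual != expected },
  in:  ->(actual, expected) { expected.include?(actual) },
  nin: ->(actual, expected) { !expected.include?(actual) },
  gt:  ->(actual, expected) { actual > expected },
  gte: ->(actual, expected) { actual >= expected },
  lt:  ->(actual, expected) { actual < expected },
  lte: ->(actual, expected) { actual <= expected }
}.freeze

# Keep the rows matching every condition of every filtered field.
# `filters` has the shape { field => { operator => expected_value } }.
def filter_rows(rows, filters)
  return rows unless filters

  rows.select do |row|
    filters.all? do |field, conditions|
      conditions.all? do |operator, expected|
        # fetch raises KeyError on an unsupported operator.
        OPERATORS.fetch(operator).call(row[field], expected)
      end
    end
  end
end

# Sample data invented for the example.
PROVIDERS = [
  { name: "toto", approved: true,  rating: 4 },
  { name: "tata", approved: false, rating: 2 },
  { name: "titi", approved: true,  rating: 5 }
].freeze

approved_top = filter_rows(PROVIDERS, { approved: { eq: true }, rating: { gte: 5 } })
not_toto = filter_rows(PROVIDERS, { name: { nin: ["toto"] } })
```

Conditions on different fields, and several operators on the same field, are combined with a logical AND, which matches the way successive where calls narrow a scope.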

Pagination

On this first application, the screens were still designed with numbered pagination, so I had to add default fields to all connections:

class Types::BaseConnection < GraphQL::Types::Relay::BaseConnection
  field :total_count, Integer, null: false
  field :total_pages, Integer, null: false
  field :current_page, Integer, null: false

  def total_count
    return @total_count if @total_count.present?

    skipped = object.arguments[:skip] || 0
    @total_count = object.nodes.size + skipped
  end

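  def total_pages
    # Ceiling division, so that a total that is an exact multiple of
    # the page size does not produce an extra empty page.
    # There is always at least one page.
    [(total_count.to_f / page_size).ceil, 1].max
  end

  def current_page
    skipped = object.arguments[:skip] || 0
    (skipped / page_size) + 1
  end

  private

  # Requested page size, falling back to the connection's maximum
  def page_size
    size = object.first if object.respond_to?(:first)
    size || object.max_page_size
  end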
end
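
The page arithmetic behind total_pages and current_page can be checked in isolation. The sketch below uses hypothetical standalone functions that take plain integers instead of a connection object, and it uses ceiling division for the page count so that a total that is an exact multiple of the page size does not yield an extra empty page.

```ruby
# Number of pages needed to show `total_count` items, never less than 1.
def total_pages(total_count, page_size)
  [(total_count.to_f / page_size).ceil, 1].max
end

# 1-based page number reached after skipping `skipped` items.
def current_page(skipped, page_size)
  (skipped / page_size) + 1
end

pages_for_351 = total_pages(351, 50) # 351 items, 50 per page
page_after_100 = current_page(100, 50)
```

With 351 items and pages of 50, the last page holds a single item; with exactly 100 items, two pages are enough.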

It works, but it is a pity: GraphQL is really designed for cursor pagination, so we end up reinventing the wheel in several places while a turnkey, and certainly more powerful, mechanism is already at our disposal.

We would certainly gain a lot in performance by switching to cursor pagination wherever we can do without the total number of pages, the page number and the number of items in the collection.

So that's good?

Our first feature using GraphQL will go into production next week, so it's still a little early to make a full assessment of the use of this API language.

Nevertheless, the benefits for our front-end developers are immediate, as both React and GraphQL are designed to be used together.
However, while this first project is only consumed by a desktop application, we will soon have to look into GraphQL libraries for mobile, especially for Flutter, test them and hope that they offer the same ease of use.

As for the back-end, I enjoyed working on fully typed classes. Although it's a bit verbose, it imposes a certain rigor which, once the reflex is acquired, becomes particularly comfortable (fewer surprise bugs, when it breaks we know more quickly why).

Note: This motivated me to take a renewed interest in Sorbet, perhaps the subject of a future article?

Of course, you have to relearn the whole way you build your API services, but I think it's for the best. Even without considering a switch from REST to GraphQL, knowing and trying out both paradigms can only improve the way we work.

The question of performance will remain, in particular the fight against N+1 requests... To be continued!

Top comments (4)

Bryan Müller • Feb 27 '20

We've been using GraphQL Ruby Pro in production for a couple years now. A tool that we've used to help prevent N+1 queries is JIT Preloader github.com/clio/jit_preloader

You can use it in a field resolver like so:

class Types::UserType < Types::BaseType

# stuff
#
# 

  field :partners, [Types::PartnerType, null: true], null: true do
    argument :limit, Integer, description: "Limit records (default: 50, max: 50).", required: false
    argument :offset, Integer, description: "Offset by number of records, exceeding total record count will return 0 records.", required: false
    argument :order, String, description: "Order records by (created/updated_at_asc/desc).", required: false
  end

  def partners(**args)
    Resolvers::PartnersResolver.call(object.partners.jit_preload, args, context)
  end
end

JIT Preload will then figure out which associations to load down the query resolver chain. For more details I would check out their docs. It has been pretty handy and a great way for us to address the N+1 query problem with ActiveRecord.

It's also handy because it can be applied outside the context of GraphQL as well in somewhere like the service layer.

Julien Camblan • Feb 28 '20

Thanks Bryan for the advice! JIT Preloader sounds really nice. I've browsed through the documentation and it seems much less verbose than BatchLoader which I'm using at the moment anyway. I'm going to start trying it out in preparation for the next version of our GraphQL API. If it answers my concerns, it should be much more convenient to maintain.

Antoine Matyja • Apr 17 '20

Hi 🙂
We are starting to explore GraphQL too and I found your post looking for tips to implement numbered pagination. (we work in the same building 🙃 well not these days obviously)

Thank you for this useful post, I will share it to my colleagues.

For numbered pagination, I ended up using this gist from the graphql gem author: gist.github.com/rmosolgo/da1dd95c2...
It implements numbered pagination with custom page types, like connection types but independently.

Darío Barrionuevo • Jan 29 '21

Hi! How well did this scale after almost one year? I recently joined a project where this gem is implemented, and there are lots of complaints about performance. We can fetch no more than 50 records at the same time, which could be a humble number for some listings, and even then my best avg response times are around 500/600ms with a query complexity of 65 (measured using GraphQL::Analysis::AST::QueryComplexity). If the page size is increased to 100 records, avg response times would scale up to around 1300ms which is ridiculous for that amount of records.

Of course I've tried everything I could find online plus my own findings. After trying 3 different batch loaders I even replaced Postgres queries in favor of ElasticSearch to remove query processing overhead, and my results did get better, but still not great.

So could you please share your thoughts about performance? Thanks so much!
My stack: Rails 5.2, Ruby 2.7.2, hosted on EC2 c4.xlarge instances
