At Lifen, a healthcare communication SaaS, we rely on a micro-services architecture with more than 50 services written in 4 different languages (Ruby, Node, Java & Python), and therefore have a bunch of Github repos to monitor.
I've been using Ruby for quite some time now, and last year I decided to learn Elixir for multiple reasons: mainly the functional paradigm and curiosity. Then I fell in love with pattern matching, but that's another story...
In my opinion, the best (and maybe only) way to learn a new language is to build something that goes to production as soon as possible. So I decided to build a Phoenix app to support a Github App which handles webhooks and provides an overview of key data about our repos.
This article covers two potential strategies to handle Github App authentication from an Elixir app.
Let's consider the following use case: retrieve and store the CODEOWNERS content, which enables the "bot" to link each repo to a team. The objective is to expose an endpoint to trigger the retrieval.
Github Authentication Flow
The whole flow is described in the Github API documentation. The key aspect is that you need to obtain a temporary (1-hour) access token per organization on which your application is installed, and then use it to authenticate your API calls:
As this process requires "extra" API calls, we would like to cache the provided token.
Let's persist it in the database!
The first strategy is to store the access token in the database and check on each call whether a valid one is available. If not, the whole token-generation flow is triggered and the new token is stored:
defmodule Github.JwtToken do
  use Joken.Config

  def signed_jwt do
    current_timestamp = DateTime.utc_now() |> DateTime.to_unix()
    {github_app_id, _remainder} = System.get_env("GITHUB_APP_ID") |> Integer.parse()

    extra_claims = %{
      "iat" => current_timestamp,
      "exp" => current_timestamp + 10 * 60,
      "iss" => github_app_id
    }

    generate_and_sign!(extra_claims)
  end
end
Github.JwtToken.signed_jwt generates the self-signed JWT which is then used to authenticate against the /app/ Github API endpoints.
defmodule Github.InstallationToken do
  use Ecto.Schema
  import Ecto.Changeset
  import Ecto.Query, warn: false
  alias MyApplication.Repo
  alias Github.InstallationToken

  schema "installation_tokens" do
    field :expires_at, :utc_datetime
    field :token, :string
    field :organization, :string

    timestamps()
  end

  @doc false
  def changeset(installation_token, attrs) do
    installation_token
    |> cast(attrs, [:token, :organization, :expires_at])
    |> validate_required([:token, :organization, :expires_at])
  end

  def create_installation_token(attrs \\ %{}) do
    %InstallationToken{}
    |> InstallationToken.changeset(attrs)
    |> Repo.insert()
  end

  def valid_token_for(organization) do
    installation_token = find_or_create_installation_token(organization)
    installation_token.token
  end

  defp find_installation_id(client, organization) do
    {200, installations, _response} = Tentacat.App.Installations.list_mine(client)

    matched_installation =
      Enum.find(installations, fn installation ->
        installation["account"]["login"] == organization
      end)

    matched_installation["id"]
  end

  defp generate_installation_token(organization) do
    jwt = Github.JwtToken.signed_jwt()
    client = Tentacat.Client.new(%{jwt: jwt})
    installation_id = find_installation_id(client, organization)

    {201, access_tokens, _response} = Tentacat.App.Installations.token(client, installation_id)

    %{
      token: access_tokens["token"],
      expires_at: access_tokens["expires_at"],
      organization: organization
    }
  end

  defp find_or_create_installation_token(organization) do
    current_datetime = DateTime.utc_now()

    query =
      from i in InstallationToken,
        where: ^current_datetime < i.expires_at and i.organization == ^organization

    query |> first() |> Repo.one() |> eventually_create_installation_token(organization)
  end

  defp eventually_create_installation_token(nil, organization) do
    {:ok, installation_token} =
      generate_installation_token(organization) |> create_installation_token()

    installation_token
  end

  defp eventually_create_installation_token(%InstallationToken{} = installation_token, _organization) do
    installation_token
  end
end
The token cache strategy discussed in this section is implemented in Github.InstallationToken, which groups both the schema and the Github logic.
defmodule Github.Content do
  def sync_codeowners(application) do
    token = Github.InstallationToken.valid_token_for(application.organization)
    client = Tentacat.Client.new(%{access_token: token})

    # `application.repo` is assumed to hold the repository name
    {200, %{"content" => encoded_content}, _response} =
      Tentacat.Repositories.Contents.content(
        client,
        application.organization,
        application.repo,
        ".github/CODEOWNERS"
      )

    attrs = %{raw_codeowners: :base64.mime_decode(encoded_content)}
    application |> MyApplication.Main.update_application(attrs)
  end
end
defmodule MyApplicationWeb.ApplicationController do
  use MyApplicationWeb, :controller

  def sync_codeowners(conn, %{"id" => id}) do
    id
    |> MyApplication.Main.get_application!()
    |> Github.Content.sync_codeowners()

    conn
    |> put_flash(:info, "Sync done")
    |> redirect(to: Routes.application_path(conn, :index))
  end
end
We then simply have to call MyApplicationWeb.ApplicationController.sync_codeowners to trigger the desired action.
This code could be greatly improved by running Github.Content.sync_codeowners in a separate process (i.e. an asynchronous job) using Task.start(fn -> slow_method() end).
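As a minimal sketch of that fire & forget pattern, with FireAndForget and the sleep as hypothetical stand-ins for the controller and the slow Github sync:

```elixir
defmodule FireAndForget do
  # Run a slow job without blocking the caller. The caller gets
  # {:ok, pid} back immediately, and a crash inside the task will
  # not take the caller down (unlike Task.start_link/1).
  def run_async(fun) do
    Task.start(fun)
  end
end

# Usage: the caller returns right away while the job runs.
{:ok, _pid} =
  FireAndForget.run_async(fn ->
    Process.sleep(100) # stand-in for the slow Github sync
  end)
```

Note that Task.start/1 gives no supervision or retry; for anything beyond fire & forget, Task.Supervisor is the usual next step.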
This approach is quite standard and could easily have been written in other frameworks, such as Rails. It is quick to implement and, thanks to the "fire & forget" background-job strategy provided by Task, it can handle high volumes of requests. There are two potential bottlenecks, though:
- the Github API for Apps has a rate limit of 5,000 requests per hour per organization (with a higher limit for organizations with 20+ members)
- if multiple requests are triggered simultaneously with an invalid token, there is no locking mechanism to prevent requesting multiple access tokens at once
This approach also lacks a strategy for cleaning up expired tokens.
What about a supervised GenServer strategy?
After reading Elixir in Action by Saša Jurić, a lot of what I had been reading about Elixir, Erlang, the BEAM and the OTP philosophy started to make sense. I strongly recommend it! (I also strongly recommend Elixir for Programmers by Dave Thomas.)
Let's put this knowledge into practice with our use case by creating a separate process per organization (processes are super cheap in the BEAM world) using the GenServer approach. Each process will hold its own state, meaning we no longer have to persist it to the database.
Also, as we will only have one process per organization, we should be able to handle both the rate limit and the concurrent token refresh issue.
defmodule MyApplication.Application do
  use Application

  def start(_type, _args) do
    children = [
      MyApplication.Repo,
      MyApplicationWeb.Endpoint,
      Github.System
    ]

    opts = [strategy: :one_for_one, name: MyApplication.Supervisor]
    Supervisor.start_link(children, opts)
  end
end
When we start our Phoenix server, we simply add an additional process (Github.System) to be supervised by the "main" supervisor.
defmodule Github.System do
  def start_link do
    Supervisor.start_link(
      [
        Github.ProcessRegistry,
        Github.Cache
      ],
      strategy: :one_for_one
    )
  end

  def child_spec(_arg) do
    %{
      id: __MODULE__,
      start: {__MODULE__, :start_link, []},
      type: :supervisor
    }
  end
end
This process is responsible for supervising 2 child processes:
- Github.ProcessRegistry: a registry to register our per-organization processes
- Github.Cache: which acts as a switch to start or select our dynamic processes
defmodule Github.ProcessRegistry do
  def start_link do
    Registry.start_link(keys: :unique, name: __MODULE__)
  end

  def via_tuple(key) do
    {:via, Registry, {__MODULE__, key}}
  end

  def child_spec(_) do
    Supervisor.child_spec(
      Registry,
      id: __MODULE__,
      start: {__MODULE__, :start_link, []}
    )
  end
end
defmodule Github.Cache do
  def start_link do
    DynamicSupervisor.start_link(name: __MODULE__, strategy: :one_for_one)
  end

  def child_spec(_arg) do
    %{
      id: __MODULE__,
      start: {__MODULE__, :start_link, []},
      type: :supervisor
    }
  end

  def organization_process(organization_name) do
    case start_child(organization_name) do
      {:ok, pid} -> pid
      {:error, {:already_started, pid}} -> pid
    end
  end

  defp start_child(organization_name) do
    DynamicSupervisor.start_child(__MODULE__, {Github.Client, organization_name})
  end
end
We can then simply call Github.Cache.organization_process("lifen") to either start or retrieve the process in charge of the Lifen organization.
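The start-or-fetch behaviour can be sketched in isolation with the standard library alone; CacheDemo and its Agent-based children below are hypothetical stand-ins for Github.Cache and Github.Client:

```elixir
defmodule CacheDemo do
  # Minimal stand-in for Github.Cache: a Registry for unique names
  # plus a DynamicSupervisor that starts one Agent per key.
  # Calling organization_process/1 twice with the same key returns
  # the same pid, via the {:error, {:already_started, pid}} branch.
  def start do
    {:ok, _} = Registry.start_link(keys: :unique, name: CacheDemo.Registry)
    {:ok, _} = DynamicSupervisor.start_link(name: CacheDemo.Sup, strategy: :one_for_one)
    :ok
  end

  def organization_process(name) do
    spec = %{
      id: name,
      start: {Agent, :start_link, [fn -> %{} end, [name: via(name)]]}
    }

    case DynamicSupervisor.start_child(CacheDemo.Sup, spec) do
      {:ok, pid} -> pid
      {:error, {:already_started, pid}} -> pid
    end
  end

  # {:via, Registry, ...} naming, as in Github.ProcessRegistry.via_tuple/1
  defp via(name), do: {:via, Registry, {CacheDemo.Registry, name}}
end
```

Because the Registry enforces unique keys, two concurrent callers asking for the same organization always end up talking to the same process.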
defmodule Github.Client do
  use GenServer

  def start_link(name) do
    GenServer.start_link(Github.Client, name, name: via_tuple(name))
  end

  def extract_codeowners(organization_process, application) do
    GenServer.cast(organization_process, {:extract_codeowners, application})
  end

  defp via_tuple(name) do
    Github.ProcessRegistry.via_tuple({__MODULE__, name})
  end

  @impl GenServer
  def init(organization) do
    {:ok, set_token(organization)}
  end

  @impl GenServer
  def handle_cast({:extract_codeowners, application}, state) do
    state = eventually_refresh_token(state)
    state.token |> set_raw_codeowners(application)
    MyApplicationWeb.Endpoint.broadcast("applications", "updated_application", %{})
    {:noreply, state}
  end

  defp set_raw_codeowners(token, %MyApplication.Main.Application{} = application) do
    client = Tentacat.Client.new(%{access_token: token})

    # `application.repo` is assumed to hold the repository name
    {200, %{"content" => encoded_content}, _response} =
      Tentacat.Repositories.Contents.content(
        client,
        application.organization,
        application.repo,
        ".github/CODEOWNERS"
      )

    attrs = %{raw_codeowners: :base64.mime_decode(encoded_content)}
    application |> MyApplication.Main.update_application(attrs)
  end

  # Refresh only when the cached token has expired
  defp eventually_refresh_token(state) do
    if state.expires_at <= DateTime.to_unix(DateTime.utc_now()) do
      set_token(state.organization)
    else
      state
    end
  end

  defp set_token(organization) do
    Github.InstallationToken.generate_installation_token(organization)
  end
end
Finally, once we have the process PID, we can send it the :extract_codeowners message via Github.Client.extract_codeowners(process, application).
The main dashboard relies on Phoenix LiveView, so we simply have to broadcast a message to refresh it via MyApplicationWeb.Endpoint.broadcast("applications", "updated_application", %{}).
Also, eventually_refresh_token contains the only if .. else .. end in the whole application, thanks to pattern matching everywhere else!
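The refresh decision itself can be sketched with plain maps and unix timestamps; TokenFreshness is a hypothetical stand-in, and the real set_token/1 call is replaced by an inline map update:

```elixir
defmodule TokenFreshness do
  # A token is stale once its unix expires_at is in the past:
  # refresh then, otherwise keep the cached state untouched.
  def eventually_refresh(state, now \\ DateTime.to_unix(DateTime.utc_now())) do
    if state.expires_at <= now do
      # stand-in for set_token/1: pretend we fetched a fresh token
      %{state | token: "refreshed", expires_at: now + 3600}
    else
      state
    end
  end
end
```

Note the direction of the comparison: the token is refreshed when expires_at is at or before "now", never when it is still in the future.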
defmodule Github.JwtToken do
  use Joken.Config

  def signed_jwt do
    current_timestamp = DateTime.utc_now() |> DateTime.to_unix()
    {github_app_id, _remainder} = System.get_env("GITHUB_APP_ID") |> Integer.parse()

    extra_claims = %{
      "iat" => current_timestamp,
      "exp" => current_timestamp + 10 * 60,
      "iss" => github_app_id
    }

    generate_and_sign!(extra_claims)
  end
end
defmodule Github.InstallationToken do
  def generate_installation_token(organization) do
    jwt = Github.JwtToken.signed_jwt()
    client = Tentacat.Client.new(%{jwt: jwt})
    installation_id = find_installation_id(client, organization)
    create_access_token(client, organization, installation_id)
  end

  def find_installation_id(client, organization) do
    {200, installations, _response} = Tentacat.App.Installations.list_mine(client)

    matched_installation =
      Enum.find(installations, fn installation ->
        installation["account"]["login"] == organization
      end)

    matched_installation["id"]
  end

  def create_access_token(client, organization, installation_id) do
    {201, access_tokens, _response} = Tentacat.App.Installations.token(client, installation_id)

    %{
      token: access_tokens["token"],
      expires_at: unix_expires_at(access_tokens["expires_at"]),
      organization: organization
    }
  end

  defp unix_expires_at(expires_at) do
    {:ok, naive_expires_at} = NaiveDateTime.from_iso8601(expires_at)
    {:ok, expires_at} = DateTime.from_naive(naive_expires_at, "Etc/UTC")
    DateTime.to_unix(expires_at)
  end
end
defmodule MyApplicationWeb.ApplicationController do
  use MyApplicationWeb, :controller

  def sync_codeowners(conn, %{"id" => id}) do
    application = MyApplication.Main.get_application!(id)

    application.organization
    |> Github.Cache.organization_process()
    |> Github.Client.extract_codeowners(application)

    conn
    |> put_flash(:info, "Sync in progress")
    |> redirect(to: Routes.application_path(conn, :index))
  end
end
This strategy took a bit more effort to implement, as it is more advanced and specific to Elixir.
It is important to note that we only have one process per organization, meaning a single "queue" per organization. If we needed to handle a heavier load, we could spawn worker processes from it.
Also, an interesting optimization for Github.Client would be to avoid calling the API during the initialization of the process (the solution to this problem is in Saša's book).
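One common OTP pattern for this (not necessarily the book's exact solution) is to return {:continue, ...} from init/1 and do the slow call in handle_continue/2, so the supervisor is not blocked while the process boots. LazyInitClient and fetch_token/1 below are hypothetical stand-ins for Github.Client and the real token call:

```elixir
defmodule LazyInitClient do
  use GenServer

  def start_link(organization) do
    GenServer.start_link(__MODULE__, organization)
  end

  @impl GenServer
  def init(organization) do
    # Return immediately; the token is fetched right after init,
    # before any other message is processed.
    {:ok, %{organization: organization, token: nil}, {:continue, :fetch_token}}
  end

  @impl GenServer
  def handle_continue(:fetch_token, state) do
    {:noreply, %{state | token: fetch_token(state.organization)}}
  end

  def token(pid), do: GenServer.call(pid, :token)

  @impl GenServer
  def handle_call(:token, _from, state), do: {:reply, state.token, state}

  # Stand-in for Github.InstallationToken.generate_installation_token/1
  defp fetch_token(_organization), do: "ghs_fake_token"
end
```

Because handle_continue runs before the mailbox is drained, callers can never observe the half-initialized token: nil state.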