Žiga Patačko Koderman for zerodays

How Our Infrastructure Supports Last-Minute Studying

Over the past few weeks, one of our clients, Astra AI, an AI-powered math tutor, has recorded a steep increase in traffic. This makes sense: in June, students were frantically studying to improve their final grades before the school year ended and preparing for the Matura, the Slovenian national high school final exam.

Although it’s been years since our teachers warned us against cramming in the last days before an exam, the data shows that students still do exactly that. Little has changed since our own study days (not that we listened to our teachers back then either, of course).

This time, however, we are experiencing the phenomenon from a completely new perspective: intense studying in the last few days before an exam means a sudden increase in traffic. The chart below shows the number of OpenAI tokens used by Astra per day:

Chart showing a big spike in token usage.

We won’t be sharing the absolute numbers, of course, but suffice it to say that we believe this to be one of the highest usage volumes in our region.

This was Astra’s first time going through such an (admittedly expected) load increase. Still, we were fairly confident in our infrastructure and did nothing special in advance to handle it. In fact, no technical issues arose.

Since the peak, we’ve analyzed the logs, reviewed the data, and attributed the smooth run to a few key factors:

  • Some clever load balancing between multiple API keys in order to climb OpenAI's tier ladder.
  • Great choice of hosting providers allowing for easy elastic scaling, namely:
    • Vercel for the Next.js frontend and part of the backend. The killer features for us include autoscaling, CDN and instant rollbacks out of the box 📦.
    • Railway.app for the background machinery handling the more complicated requests.
  • Great choice of monitoring and analytics tools (Sentry, Axiom, PostHog and Uptime Kuma), coupled with amazing Slack integrations, that allowed us to iron out any issues well before the traffic spike, while the troubling features were still fresh from the oven.
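
The post does not describe how the load balancing between API keys actually works, so the following is only a minimal sketch of one plausible approach: round-robin rotation over a pool of keys, with a temporary cooldown for any key that has just been rate limited. The `ApiKeyPool` class, its method names, and the placeholder key strings are all hypothetical and are not taken from Astra's codebase or from any OpenAI SDK.

```typescript
// Hypothetical sketch: round-robin rotation over several API keys.
// Nothing here comes from the original post; all names are illustrative.

interface KeyState {
  key: string;
  coolingDownUntil: number; // epoch milliseconds; 0 means "available now"
}

class ApiKeyPool {
  private readonly keys: KeyState[];
  private cursor = 0;

  constructor(keys: string[]) {
    if (keys.length === 0) {
      throw new Error("ApiKeyPool needs at least one key");
    }
    this.keys = keys.map((key) => ({ key, coolingDownUntil: 0 }));
  }

  // Returns the next available key in round-robin order,
  // or null when every key is still cooling down.
  next(now: number = Date.now()): string | null {
    for (let i = 0; i < this.keys.length; i++) {
      const index = (this.cursor + i) % this.keys.length;
      const state = this.keys[index];
      if (state.coolingDownUntil <= now) {
        this.cursor = (index + 1) % this.keys.length;
        return state.key;
      }
    }
    return null;
  }

  // Takes a key out of rotation for cooldownMs, for example after
  // the upstream API answered with an HTTP 429 for that key.
  markRateLimited(key: string, cooldownMs: number, now: number = Date.now()): void {
    const state = this.keys.find((k) => k.key === key);
    if (state) {
      state.coolingDownUntil = now + cooldownMs;
    }
  }
}

// Usage with placeholder key names and an explicit clock for clarity.
const demoPool = new ApiKeyPool(["key-a", "key-b", "key-c"]);
console.log(demoPool.next(0)); // "key-a"
console.log(demoPool.next(0)); // "key-b"
demoPool.markRateLimited("key-c", 1000, 0);
console.log(demoPool.next(0)); // "key-a" (key-c is skipped while cooling down)
```

In a real service the caller would presumably take a key from `next()`, send the request, and call `markRateLimited()` when the provider reports a rate limit, so that traffic shifts to the remaining keys instead of failing. Passing the clock in as a parameter is a design choice made here purely to keep the sketch deterministic and easy to test.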

We could talk about our infrastructure choices for days, so we decided to keep this post short and simple. We are planning a deeper dive in a future post, so keep an eye out for that.

This blog post was written by the zerodays.dev team.
