I am investigating how to scale a Python Flask API.
The solutions I have seen around (appropriate for enterprise use) are:
- Gunicorn with a few workers
- Celery or a similar technology for tasks that take non-trivial time to finish
- Multiprocessing with a load balancer
- A combination of the above
Would you mind suggesting any other approaches you have seen, or commenting on the above?
For simplicity, let's assume there is no binding to a specific cloud vendor.
First, make sure your server can be scaled horizontally: that is, you can run multiple instances of it and all will work without issues or speed reduction. An occasionally overlooked step.
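The usual thing that breaks horizontal scaling is in-process state. A minimal sketch of the idea (the names and the dict-as-shared-store are illustrative stand-ins, not a real API):

```python
# Illustration only: why in-process state breaks horizontal scaling.

# BAD: lives only inside this one process; a second instance of the
# server would hold its own, diverging copy.
local_sessions = {}

# BETTER: keep shared state in an external store (Redis, a database, ...).
# A plain dict stands in for that shared store here.
shared_store = {}

def save_session(store, session_id, data):
    # In production this would be e.g. a Redis SET call.
    store[session_id] = data

def load_session(store, session_id):
    return store.get(session_id)

save_session(shared_store, "abc123", {"user": "alice"})
print(load_session(shared_store, "abc123"))  # {'user': 'alice'}
```

With all instances reading and writing the same external store, any of them can serve any request.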
Then, instead of running Flask directly, use a WSGI server such as Gunicorn. If you are feeling up to the challenge, you can try asynchronous frameworks like Responder, Starlette or Bocadillo. Both approaches are trying to solve the same problem: handling as many requests as possible, as quickly as possible.
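A Gunicorn config file is plain Python. A minimal sketch, assuming your Flask app object is `app` in `app.py` (the `2 * cores + 1` worker count is the commonly suggested starting point from Gunicorn's docs, not a hard rule):

```python
# gunicorn_conf.py -- minimal Gunicorn configuration (plain Python).
# Start the server with:  gunicorn -c gunicorn_conf.py app:app
import multiprocessing

bind = "0.0.0.0:8000"
# Common starting point: 2 * CPU cores + 1 workers.
workers = multiprocessing.cpu_count() * 2 + 1
# Threads per worker can help with I/O-bound request handlers.
threads = 2
timeout = 30
```

Tune `workers` and `threads` by load-testing your actual app rather than trusting the formula.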
Celery can be used for long-running or non-time-sensitive background tasks, like computing daily metrics and the like.
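Celery itself needs a message broker, so as a stdlib-only sketch of the same pattern it implements: the request handler enqueues work and returns immediately, while a separate worker drains the queue (Celery adds a real broker, retries, scheduling, and workers on other machines):

```python
# Producer/worker pattern that Celery generalizes, using only the stdlib.
import queue
import threading

tasks = queue.Queue()
results = []

def worker():
    while True:
        job = tasks.get()
        if job is None:      # sentinel: shut down
            break
        results.append(job * 2)  # stand-in for the slow work
        tasks.task_done()

t = threading.Thread(target=worker, daemon=True)
t.start()

# "Request handler": enqueue and return immediately.
for n in (1, 2, 3):
    tasks.put(n)

tasks.join()   # only for this demo; a real handler would not block here
tasks.put(None)
t.join()
print(results)  # [2, 4, 6]
```

The point is the shape: the caller never waits on the slow work, only on putting it in the queue.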
You may also need to look at database-level concerns, like scaling Postgres or using Redis for faster reads. This will be much more dependent on your app and I would look at it last, but it is worth keeping in mind as you scale up.
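The common Redis pattern here is a read-through cache. A sketch with a dict standing in for Redis (with redis-py you would use `get`/`setex` instead; the function names are illustrative):

```python
# Read-through cache sketch: check the cache first, fall back to the
# database, then populate the cache for subsequent reads.
cache = {}
db_reads = 0

def query_db(key):
    global db_reads
    db_reads += 1              # count expensive reads for the demo
    return f"value-for-{key}"  # stand-in for a real DB query

def get(key):
    if key in cache:
        return cache[key]
    value = query_db(key)
    cache[key] = value         # with real Redis: setex(key, ttl, value)
    return value

get("user:1")
get("user:1")
print(db_reads)  # 1 -- the second read was served from the cache
```

With a real Redis, always set a TTL so stale data eventually expires.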
This will already be enough to handle quite a number of users, but for extra reliability and throughput I recommend some sort of load balancer. These come in many flavors, from a simple nginx host to a mighty Kubernetes cluster.
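On the simple end, an nginx load balancer is a short config fragment. A sketch (hostnames and ports are placeholders):

```nginx
# Round-robin load balancing across two app instances.
upstream flask_app {
    server 10.0.0.1:8000;
    server 10.0.0.2:8000;
}

server {
    listen 80;
    location / {
        proxy_pass http://flask_app;
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
}
```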
Whatever you choose, be sure to automate it; DevOps is invaluable at all sizes.
Thanks for the great tips. Are you suggesting not to use Celery for plain requests that, for example, write a few KB to the DB? Is ASGI (which Bocadillo etc. are based on) stable enough?
If that write to the DB is required before you can return a response to the user, then yes. I would check out the new asynchronous database tools like Tortoise ORM or the one by the Starlette guys. Regardless, with enough workers it shouldn't slow down your service.
I would say they are pretty stable, but I would look around before choosing one for the long term. FastAPI and Responder were my top two.
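The idea behind those async tools is keeping the event loop free while the DB call runs. A stdlib-only sketch of that pattern, using sqlite3 pushed onto a thread in place of a real async driver (function names are illustrative):

```python
# Keep the event loop unblocked during a DB write, stdlib only.
import asyncio
import sqlite3

def write_row(conn, name):
    # Blocking sqlite3 call; run off the event loop below.
    conn.execute("INSERT INTO users (name) VALUES (?)", (name,))
    conn.commit()

async def handle_request(conn, name):
    # asyncio.to_thread (Python 3.9+) frees the loop for other requests.
    await asyncio.to_thread(write_row, conn, name)
    return {"status": "ok", "user": name}

conn = sqlite3.connect(":memory:", check_same_thread=False)
conn.execute("CREATE TABLE users (name TEXT)")
result = asyncio.run(handle_request(conn, "alice"))
print(result)  # {'status': 'ok', 'user': 'alice'}
count = conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
print(count)   # 1
```

Async drivers like Tortoise or `databases` do this natively instead of via a thread pool.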
I would start by writing down the answers to a few questions, for example about your baseline, your issues and your goals.
@miniscruff gives you a good general set of ideas, but if you can write down your baseline, your issues and your "goals", it's going to be easier to scale. As said, definitely don't run Flask in production with the development server; see Flask's deployment options to change that.
In general: you should measure. Yes, you can throw money at the problem, but where should that money go?
The answers you found all make sense.
Actually, that would be good information to know :D
Thank you for your insight, it all makes sense. Regarding the last point: there are scaling capabilities from cloud vendors of course, but I want to rely as little as possible on vendor-specific stuff (at least for now).
Let me know if you need more pointers, it really depends on what you're trying to accomplish.
I understand the need not to be locked in too much, but compare it to the amount of time you would have to spend on DIY versus managed services.