Why Scaling?
Scalability is the ability of your application to cope with increasing numbers of users interacting with it simultaneously. Ultimately, you want it to grow and handle more and more requests per minute (RPM). A number of factors play a part in ensuring scalability, and each of them is worth taking into consideration.
Requirements:
I am using Docker to wrap up all the necessary tools and Django apps in containers. Of course, you can skip Docker and install the required tools independently; it's up to you how you go about it. I won't go into much detail or explanation here, so please help yourself:
- django-rest-framework
- Nginx
- Redis
- Postgres
- Poetry (an alternative to pip or pipenv)
Quickstart
Feeling lazy? Clone the boilerplate repo and run the commands below:
$ python3 -m venv env # create virtual environment
$ source env/bin/activate
$ poetry install # make sure you have installed Poetry on your machine
OR
$ mkdir scale && cd scale
$ python3 -m venv env # create virtual environment
$ source env/bin/activate
$ poetry init # initialize Poetry and generate pyproject.toml
$ poetry add djangorestframework psycopg2-binary Faker django-redis gunicorn
$ django-admin startproject config .
$ python manage.py startapp products
$ touch Dockerfile
$ touch docker-compose.yml
Project structure:
─── scale
    ├── config
    │   ├── __init__.py
    │   ├── asgi.py
    │   ├── settings
    │   │   ├── __init__.py
    │   │   ├── base.py
    │   │   ├── dev.py
    │   │   └── prod.py
    │   ├── urls.py
    │   └── wsgi.py
    ├── products
    ├── .env
    ├── manage.py
    ├── docker-compose.yml
    └── Dockerfile
Note: in the structure above, I have broken the settings down into base.py, dev.py, and prod.py. Split them up yourself, or grab them from the boilerplate repo.
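The post doesn't show how the split settings get loaded; one common way (a sketch, assuming a DJANGO_ENV variable, e.g. set in .env) is to select the module in config/settings/__init__.py:

# config/settings/__init__.py — a minimal sketch, not from the original
# post. Assumes a DJANGO_ENV environment variable selects the settings
# module, defaulting to the development settings.
import os

if os.environ.get("DJANGO_ENV") == "production":
    from .prod import *  # noqa: F401,F403
else:
    from .dev import *  # noqa: F401,F403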
Let's start with Docker.
Dockerfile
FROM python:3.8.5-alpine
# prevents Python from generating .pyc files in the container
ENV PYTHONDONTWRITEBYTECODE 1
# Turns off buffering for easier container logging
ENV PYTHONUNBUFFERED 1
RUN \
apk add --no-cache curl
# install psycopg2 dependencies
RUN apk update \
&& apk add postgresql-dev gcc python3-dev musl-dev
# Install poetry
RUN pip install -U pip \
&& curl -sSL https://raw.githubusercontent.com/python-poetry/poetry/master/get-poetry.py | python
ENV PATH="${PATH}:/root/.poetry/bin"
RUN mkdir /code
RUN mkdir /code/staticfiles
RUN mkdir /code/mediafiles
WORKDIR /code
COPY . /code
RUN poetry config virtualenvs.create false \
&& poetry install --no-interaction --no-ansi
docker-compose.yml
version: "3.9"
services:
scale:
restart: always
build: .
command: python manage.py runserver 0.0.0.0
volumes:
- .:/code
ports:
- 8000:8000
env_file:
- ./.env
depends_on:
- db
db:
image: "postgres:11"
volumes:
- postgres_data:/var/lib/postgresql/data/
ports:
- 54322:5432
environment:
- POSTGRES_USER=scale
- POSTGRES_PASSWORD=scale
- POSTGRES_DB=scale
volumes:
postgres_data:
Above we created the Dockerfile and the docker-compose.yml file:
- we used an Alpine-based image
- we installed the dependencies for Postgres and set up Poetry
- we created two services, scale and db
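Note that the scale service reads a ./.env file the post never shows; a minimal sketch (the variable names are my assumptions) might look like:

# .env — contents assumed, not shown in the original post
SECRET_KEY=change-me
DEBUG=1
DJANGO_ENV=development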
Run the command:
docker-compose up
You will get an error saying the database does not exist, so let's create it:
$ docker container ls
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
78ac4d15bcd8 postgres:11 "docker-entrypoint.s…" 2 hours ago Up 31 seconds 0.0.0.0:54322->5432/tcp, :::54322->5432/tcp scale_db_1
Copy the CONTAINER ID value:
$ docker exec -it 78ac4d15bcd8 bash
:/#
:/# psql --username=postgres
psql (11.12 (Debian 11.12-1.pgdg90+1))
Type "help" for help.
postgres=# CREATE DATABASE scale;
postgres=# CREATE USER scale WITH PASSWORD 'scale';
postgres=# ALTER ROLE scale SET client_encoding TO 'utf8';
postgres=# ALTER ROLE scale SET default_transaction_isolation TO 'read committed';
postgres=# ALTER ROLE scale SET timezone TO 'UTC';
postgres=# ALTER ROLE scale SUPERUSER;
postgres=# GRANT ALL PRIVILEGES ON DATABASE scale TO scale;
postgres=# \q
Make sure your settings/dev.py has a config like this (or your own credentials), and change the host from localhost to db:
from .base import *  # import the shared base settings (including BASE_DIR)

DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.postgresql_psycopg2",
        "ATOMIC_REQUESTS": True,
        "NAME": "scale",
        "USER": "scale",
        "PASSWORD": "scale",
        "HOST": "db",  # the Compose service name, not localhost
        "PORT": "5432",
    }
}

# REDIS CONFIG
CACHES = {
    "default": {
        "BACKEND": "django_redis.cache.RedisCache",
        "LOCATION": "redis://redis:6379/0",  # "redis" is the Compose service name
        "OPTIONS": {"CLIENT_CLASS": "django_redis.client.DefaultClient"},
    }
}

STATIC_URL = "/staticfiles/"  # matches the Nginx location and volume paths below
STATIC_ROOT = BASE_DIR.parent / "staticfiles"  # where collectstatic puts files
MEDIA_URL = "/mediafiles/"
MEDIA_ROOT = BASE_DIR.parent / "mediafiles"
Nginx Setup
Next, we set up Redis, Nginx, and Gunicorn in Docker:
docker-compose.yml
version: "3.9"
services:
scale:
restart: always
build: .
command: gunicorn config.wsgi:application --bind 0.0.0.0:8000
volumes:
- .:/code
- static_volume:/code/staticfiles
- media_volume:/code/mediafiles
expose:
- 8000
env_file:
- ./.env
depends_on:
- db
- redis
db:
image: "postgres:11"
volumes:
- postgres_data:/var/lib/postgresql/data/
ports:
- 54322:5432
environment:
- POSTGRES_USER=scale
- POSTGRES_PASSWORD=scale
- POSTGRES_DB=scale
redis:
image: redis
ports:
- 63799:6379
restart: on-failure
nginx:
build: ./nginx
restart: always
volumes:
- static_volume:/code/staticfiles
- media_volume:/code/mediafiles
ports:
- 2000:80
depends_on:
- scale
volumes:
postgres_data:
static_volume:
media_volume:
So, above we added two services, redis and nginx, and switched to Gunicorn instead of our regular runserver command. Next, we create an nginx directory in the project root containing a Dockerfile and an nginx.conf.
nginx/Dockerfile
FROM nginx:latest
RUN rm /etc/nginx/conf.d/default.conf
COPY nginx.conf /etc/nginx/conf.d
nginx/nginx.conf
upstream core {
    server scale:8000;
}

server {
    listen 80;

    location / {
        proxy_pass http://core;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header Host $host;
        proxy_redirect off;
        client_max_body_size 100M;
    }

    location /staticfiles/ {
        alias /code/staticfiles/;
    }

    location /mediafiles/ {
        alias /code/mediafiles/;
    }
}
Above, we created a Dockerfile that builds our Nginx image, and an nginx.conf that proxies requests to our app and serves the static and media files.
Let's run the Compose file:
docker-compose up --build
Then navigate to http://localhost:2000/ in your browser.
Note: in the docker-compose.yml above, the nginx service maps ports 2000:80, so our server will run on port 2000.
Caching
Products
First, let's try without caching. Let's create the models for our products app.
products/models.py
from django.db import models
from django.utils.translation import gettext_lazy as _


class Category(models.Model):
    name = models.CharField(_("Category Name"), max_length=255, unique=True)
    description = models.TextField(null=True)

    class Meta:
        ordering = ("name",)
        verbose_name = _("Category")
        verbose_name_plural = _("Categories")

    def __str__(self) -> str:
        return self.name


class Product(models.Model):
    name = models.CharField(_("Product Name"), max_length=255)
    category = models.ForeignKey(Category, on_delete=models.DO_NOTHING)
    description = models.TextField()
    price = models.DecimalField(decimal_places=2, max_digits=10)
    quantity = models.IntegerField(default=0)
    discount = models.DecimalField(decimal_places=2, max_digits=10)
    image = models.URLField(max_length=255)

    class Meta:
        ordering = ("id",)
        verbose_name = _("Product")
        verbose_name_plural = _("Products")

    def __str__(self):
        return self.name
Moving forward, let's create some dummy data using custom management commands. Create a management directory inside the products app:
── products
   └── management
       ├── __init__.py
       └── commands
           ├── __init__.py
           ├── category_seed.py
           └── product_seed.py
category_seed.py
from django.core.management import BaseCommand
from products.models import Category
from faker import Faker


class Command(BaseCommand):
    def handle(self, *args, **kwargs):
        faker = Faker()
        for _ in range(30):
            Category.objects.create(
                name=faker.name(),
                description=faker.text(200),
            )
product_seed.py
from django.core.management import BaseCommand
from products.models import Category, Product
from random import randrange, randint
from faker import Faker


class Command(BaseCommand):
    def handle(self, *args, **kwargs):
        faker = Faker()
        for _ in range(5000):
            price = randrange(10, 100)
            quantity = randrange(1, 5)
            cat_id = randint(1, 30)
            category = Category.objects.get(id=cat_id)
            Product.objects.create(
                name=faker.name(),
                category=category,
                description=faker.text(200),
                price=price,
                discount=100,
                quantity=quantity,
                image=faker.image_url(),
            )
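Seeding 5,000 rows one INSERT at a time is slow; as an alternative (my own sketch, not from the original post), bulk_create batches the inserts into far fewer queries:

# A faster variant of product_seed.py using bulk_create. It fetches the
# categories once and issues batched INSERTs instead of 5,000 single ones.
from django.core.management import BaseCommand
from products.models import Category, Product
from random import randrange, choice
from faker import Faker


class Command(BaseCommand):
    def handle(self, *args, **kwargs):
        faker = Faker()
        categories = list(Category.objects.all())  # one query, reused below
        Product.objects.bulk_create(
            [
                Product(
                    name=faker.name(),
                    category=choice(categories),
                    description=faker.text(200),
                    price=randrange(10, 100),
                    discount=100,
                    quantity=randrange(1, 5),
                    image=faker.image_url(),
                )
                for _ in range(5000)
            ],
            batch_size=500,  # keep each INSERT statement a reasonable size
        )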
So, I will create 5,000 products and 30 categories:
$ docker-compose exec scale sh
/code # python manage.py makemigrations
/code # python manage.py migrate
/code # python manage.py createsuperuser
/code # python manage.py collectstatic --no-input
/code # python manage.py category_seed
/code # python manage.py product_seed # takes a while to create 5000 rows
You can check in pgAdmin or the admin dashboard whether the data has loaded.
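If nothing shows up in the admin dashboard, note that the post never shows an admin.py; a minimal products/admin.py registering the models would be:

# products/admin.py — not shown in the original post; registering the
# models is what makes the seeded data visible in the admin dashboard.
from django.contrib import admin
from .models import Category, Product

admin.site.register(Category)
admin.site.register(Product)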
After creating the dummy data, let's create the serializers and views.
serializers.py
from rest_framework import serializers
from .models import Product, Category


class CategorySerializers(serializers.ModelSerializer):
    class Meta:
        model = Category
        fields = "__all__"


class CategoryRelatedField(serializers.StringRelatedField):
    def to_representation(self, value):
        return CategorySerializers(value).data

    def to_internal_value(self, data):
        return data


class ProductSerializers(serializers.ModelSerializer):
    class Meta:
        model = Product
        fields = "__all__"


class ReadProductSerializer(serializers.ModelSerializer):
    category = serializers.StringRelatedField(read_only=True)
    # category = CategoryRelatedField()
    # category = CategorySerializers()

    class Meta:
        model = Product
        fields = "__all__"
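The comparison table below also measures a PrimaryKeyRelatedField variant that isn't shown above; as a sketch, it would swap the category field in the same serializers.py like so:

class ReadProductSerializer(serializers.ModelSerializer):
    # primary-key-only variant from the comparison table below;
    # returns the category id instead of a serialized object
    category = serializers.PrimaryKeyRelatedField(read_only=True)

    class Meta:
        model = Product
        fields = "__all__"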
views.py
from products.models import Product
from rest_framework import viewsets, status
from rest_framework.response import Response
from .serializers import ProductSerializers, ReadProductSerializer


class ProductViewSet(viewsets.ViewSet):
    def list(self, request):
        serializer = ReadProductSerializer(Product.objects.all(), many=True)
        return Response(serializer.data)

    def create(self, request):
        serializer = ProductSerializers(data=request.data)
        serializer.is_valid(raise_exception=True)
        serializer.save()
        return Response(serializer.data, status=status.HTTP_201_CREATED)

    def retrieve(self, request, pk=None):
        product = Product.objects.get(id=pk)
        serializer = ReadProductSerializer(product)
        return Response(serializer.data)

    def update(self, request, pk=None):
        product = Product.objects.get(id=pk)
        serializer = ProductSerializers(
            instance=product, data=request.data, partial=True)
        serializer.is_valid(raise_exception=True)
        serializer.save()
        return Response(serializer.data, status=status.HTTP_202_ACCEPTED)

    def destroy(self, request, pk=None):
        product = Product.objects.get(id=pk)
        product.delete()
        return Response(status=status.HTTP_204_NO_CONTENT)
urls.py
from django.urls import path
from .views import ProductViewSet

urlpatterns = [
    path("product", ProductViewSet.as_view(
        {"get": "list", "post": "create"})),
    path(
        "product/<str:pk>",
        ProductViewSet.as_view(
            {"get": "retrieve", "put": "update", "delete": "destroy"}),
    ),
]
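The project-level config/urls.py isn't shown in the post; judging from the http://localhost:2000/api/v1/products URL used below, the products URLs are presumably mounted like this (a sketch):

# config/urls.py — assumed wiring, not shown in the original post;
# mounts the products app under the api/v1/ prefix used in the tests.
from django.contrib import admin
from django.urls import include, path

urlpatterns = [
    path("admin/", admin.site.urls),
    path("api/v1/", include("products.urls")),
]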
So, we created a view using viewsets. Let's try it with Postman, using the different serializers on the viewset, to list the 5K products:
http://localhost:2000/api/v1/products
| Serializer | Time |
| --- | --- |
| ReadProductSerializer (StringRelatedField) | 6.42 s |
| ReadProductSerializer (CategoryRelatedField) | 7.05 s |
| ReadProductSerializer (nested) | 6.49 s |
| ReadProductSerializer (PrimaryKeyRelatedField) | 681 ms |
| ReadProductSerializer (without any) | 674 ms |

Note: response times may vary depending on your system.
Let's get the data using caching:
views.py
from rest_framework.views import APIView
from products.models import Product
from rest_framework import viewsets, status
from rest_framework.pagination import PageNumberPagination
from rest_framework.response import Response
from django.core.cache import cache
from .serializers import ProductSerializers, ReadProductSerializer


class ProductListApiView(APIView):
    def get(self, request):
        paginator = PageNumberPagination()
        paginator.page_size = 10
        # get the products from the cache if they exist
        products = cache.get("products_data")
        # if the products are not in the cache yet, query and cache them
        if not products:
            products = list(Product.objects.select_related("category"))
            cache.set("products_data", products, timeout=60 * 60)
        # paginate the cached products
        result = paginator.paginate_queryset(products, request)
        serializer = ReadProductSerializer(result, many=True)
        return paginator.get_paginated_response(serializer.data)


class ProductViewSet(viewsets.ViewSet):
    def create(self, request):
        serializer = ProductSerializers(data=request.data)
        serializer.is_valid(raise_exception=True)
        serializer.save()
        # invalidate the cached product list so the next read is fresh
        cache.delete("products_data")
        return Response(serializer.data, status=status.HTTP_201_CREATED)

    def retrieve(self, request, pk=None):
        product = Product.objects.get(id=pk)
        serializer = ReadProductSerializer(product)
        return Response(serializer.data)

    def update(self, request, pk=None):
        product = Product.objects.get(id=pk)
        serializer = ProductSerializers(
            instance=product, data=request.data, partial=True)
        serializer.is_valid(raise_exception=True)
        serializer.save()
        cache.delete("products_data")
        return Response(serializer.data, status=status.HTTP_202_ACCEPTED)

    def destroy(self, request, pk=None):
        product = Product.objects.get(id=pk)
        product.delete()
        cache.delete("products_data")
        return Response(status=status.HTTP_204_NO_CONTENT)
So, I have created a separate APIView and removed the list method from the viewset. The new view fetches the data from the cache and returns a paginated response.
Update your products/urls.py:
from django.urls import path
from .views import ProductListApiView, ProductViewSet

urlpatterns = [
    path("products", ProductListApiView.as_view()),
    path("product", ProductViewSet.as_view(
        {"post": "create"})),
    path(
        "product/<str:pk>",
        ProductViewSet.as_view(
            {"get": "retrieve", "put": "update", "delete": "destroy"}),
    ),
]
So, try it again with Postman with the different serializers. You will get results between 90 and 200 ms, depending on your machine.
Note: in the APIView above I have used select_related. Try removing it and running the request again with Postman; you will find different results.
To learn more about querysets (i.e. select_related and prefetch_related), click this link: N+1 Queries Problem
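For a quick feel of what select_related saves here, this is a rough sketch of the difference (my own illustration):

# Without select_related: one query for the products, plus one query
# per product when its category is accessed — the classic N+1 pattern.
from products.models import Product

for product in Product.objects.all():
    print(product.category.name)  # each access hits the database

# With select_related: a single JOINed query fetches the categories
# up front, so the loop triggers no extra queries.
for product in Product.objects.select_related("category"):
    print(product.category.name)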
Final words:
There is still plenty of room for improvement; it depends on how, where, for what, and for how many.
Hope you guys liked it... ciao 👋👋