⚖️Scaling Django⚖️

Why Scaling?

Scalability is your application's capacity to cope with a growing number of users interacting with it simultaneously. Ultimately, you want it to grow and handle more and more requests per minute (RPM). A number of factors play a part in ensuring scalability, and each of them is worth taking into consideration.

Requirements:

Well, I am using Docker to wrap up all the necessary tools and Django apps in containers. You can of course skip Docker and install the required tools independently; it's up to you how you go about it.

I won't go into much detail and explanation here, so please help yourself along the way.

Quickstart

Feeling lazy?

$ python3 -m venv env # create virtual environment
$ source env/bin/activate 
$ poetry install # make sure you have installed poetry on your machine

OR

$ mkdir scale && cd scale
$ python3 -m venv env # create virtual environment
$ source env/bin/activate
$ poetry init # poetry initialization; generates pyproject.toml
$ poetry add djangorestframework psycopg2-binary Faker django-redis gunicorn
$ django-admin startproject config .
$ python manage.py startapp products
$ touch Dockerfile
$ touch docker-compose.yml

Project structure:

─── scale
    ├── config
    │   ├── __init__.py
    │   ├── asgi.py
    │   ├── settings
    │   │   ├── __init__.py
    │   │   ├── base.py
    │   │   ├── dev.py
    │   │   └── prod.py
    │   ├── urls.py
    │   └── wsgi.py
    ├── products
    ├── .env
    ├── manage.py
    ├── docker-compose.yml
    └── Dockerfile

Note: in the structure above I have broken the settings down into base.py, dev.py, and prod.py. Feel free to split them the same way, or start from a boilerplate.
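One common way to wire up such a split (a sketch; the helper name below is mine, not part of the project) is to have dev.py and prod.py star-import base.py and then pick which module Django loads via an environment flag:

```python
# Hypothetical helper: pick a settings module from an environment flag.
# In practice this is usually just DJANGO_SETTINGS_MODULE set in .env or
# manage.py; the function form only makes the selection rule explicit.
def settings_module(environ):
    env = environ.get("DJANGO_ENV", "dev")
    if env not in ("dev", "prod"):
        raise ValueError(f"unknown DJANGO_ENV: {env}")
    return f"config.settings.{env}"

# dev.py / prod.py would then start with:
#     from .base import *   # noqa: F401,F403
# and override only what differs (DEBUG, DATABASES, CACHES, ...).
```

With that convention, running locally defaults to config.settings.dev, and setting DJANGO_ENV=prod in .env flips the container to config.settings.prod.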

Let's start with Docker.

Dockerfile

FROM python:3.8.5-alpine

# prevents Python from generating .pyc files in the container
ENV PYTHONDONTWRITEBYTECODE 1
# Turns off buffering for easier container logging
ENV PYTHONUNBUFFERED 1

RUN \
    apk add --no-cache curl

# install psycopg2 dependencies
RUN apk update \
    && apk add postgresql-dev gcc python3-dev musl-dev


# Install poetry
RUN pip install -U pip \
    && curl -sSL https://install.python-poetry.org | python -
ENV PATH="${PATH}:/root/.local/bin"


RUN mkdir /code
RUN mkdir /code/staticfiles
RUN mkdir /code/mediafiles

WORKDIR /code
COPY . /code

RUN poetry config virtualenvs.create false \
    && poetry install --no-interaction --no-ansi

docker-compose.yaml

version: "3.9"

services:
  scale:
    restart: always
    build: .
    command: python manage.py runserver 0.0.0.0:8000
    volumes:
      - .:/code
    ports:
      - 8000:8000
    env_file:
      - ./.env
    depends_on:
      - db
  db:
    image: "postgres:11"
    volumes:
      - postgres_data:/var/lib/postgresql/data/
    ports:
      - 54322:5432
    environment:
      - POSTGRES_USER=scale
      - POSTGRES_PASSWORD=scale
      - POSTGRES_DB=scale

volumes:
  postgres_data:

Above, we created the Dockerfile and docker-compose.yaml:

  • we used an Alpine-based image
  • installed the dependencies for Postgres and set up Poetry
  • created two services, scale and db

Run the command:

docker-compose up

If Django starts before the database is ready (or the database was never created on your volume), you will get an error like database "scale" does not exist.

Let's create the database manually:

$ docker container ls
CONTAINER ID   IMAGE         COMMAND                  CREATED          STATUS          PORTS                                         NAMES

78ac4d15bcd8   postgres:11   "docker-entrypoint.s…"   2 hours ago      Up 31 seconds   0.0.0.0:54322->5432/tcp, :::54322->5432/tcp   scale_db_1

Copy the CONTAINER ID value:

$ docker exec -it 78ac4d15bcd8 bash
 :/#
 :/# psql --username=postgres
 psql (11.12 (Debian 11.12-1.pgdg90+1))
 Type "help" for help.

 postgres=# CREATE DATABASE scale;
 postgres=# CREATE USER scale WITH PASSWORD 'scale';
 postgres=# ALTER ROLE scale SET client_encoding TO 'utf8';
 postgres=# ALTER ROLE scale SET default_transaction_isolation TO 'read committed';
 postgres=# ALTER ROLE scale SET timezone TO 'UTC';
 postgres=# ALTER ROLE scale SUPERUSER;
 postgres=# GRANT ALL PRIVILEGES ON DATABASE scale TO scale;
 postgres=# \q

Make sure your settings/dev.py has a configuration like the one below (or your own credentials), and change the host from localhost to db:

from config.settings import BASE_DIR

DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.postgresql_psycopg2",
        "ATOMIC_REQUESTS": True,
        "NAME": "scale",
        "USER": "scale",
        "PASSWORD": "scale",
        "HOST": "db",
        "PORT": "5432",
    }
}

# REDIS CONFIG
CACHES = {
    "default": {
        "BACKEND": "django_redis.cache.RedisCache",
        "LOCATION": "redis://redis:6379/0",
        "OPTIONS": {"CLIENT_CLASS": "django_redis.client.DefaultClient"},
    }
}

STATIC_URL = '/static/'
STATIC_ROOT = BASE_DIR.parent / "staticfiles"  # for collect static

MEDIA_ROOT = BASE_DIR.parent / "media"
MEDIA_URL = "/media/"
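As a side note, the credentials above are hard-coded for simplicity. Since docker-compose already loads a .env file, one alternative (a sketch; the variable names DB_NAME, DB_USER, etc. are illustrative, not something the project defines) is to read them from the environment with sensible fallbacks:

```python
import os

# Illustrative sketch: build the DATABASES dict from environment
# variables, falling back to the compose defaults used in this tutorial.
def database_config(environ=os.environ):
    return {
        "default": {
            "ENGINE": "django.db.backends.postgresql",
            "ATOMIC_REQUESTS": True,
            "NAME": environ.get("DB_NAME", "scale"),
            "USER": environ.get("DB_USER", "scale"),
            "PASSWORD": environ.get("DB_PASSWORD", "scale"),
            "HOST": environ.get("DB_HOST", "db"),
            "PORT": environ.get("DB_PORT", "5432"),
        }
    }

# in dev.py: DATABASES = database_config()
```

This keeps secrets out of version control and lets the same settings file work both inside the container (DB_HOST=db) and on the host (DB_HOST=localhost).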

Nginx Setup

Next, we setup redis and nginx and gunicorn on docker:
docker-compose.yaml

version: "3.9"

services:
  scale:
    restart: always
    build: .
    command: gunicorn config.wsgi:application --bind 0.0.0.0:8000
    volumes:
      - .:/code
      - static_volume:/code/staticfiles
      - media_volume:/code/mediafiles
    expose:
      - 8000
    env_file:
      - ./.env
    depends_on:
      - db
      - redis
  db:
    image: "postgres:11"
    volumes:
      - postgres_data:/var/lib/postgresql/data/
    ports:
      - 54322:5432
    environment:
      - POSTGRES_USER=scale
      - POSTGRES_PASSWORD=scale
      - POSTGRES_DB=scale
  redis:
    image: redis
    ports:
      - 63799:6379
    restart: on-failure

  nginx:
    build: ./nginx
    restart: always
    volumes:
      - static_volume:/code/staticfiles
      - media_volume:/code/mediafiles
    ports:
      - 2000:80
    depends_on:
      - scale

volumes:
  postgres_data:
  static_volume:
  media_volume:

So, above we added two services, redis and nginx, and switched the scale service's command to gunicorn instead of runserver. Next, create an nginx directory in the project root with a Dockerfile and an nginx.conf:

nginx/Dockerfile

FROM nginx:latest

RUN rm /etc/nginx/conf.d/default.conf
COPY nginx.conf /etc/nginx/conf.d

nginx/nginx.conf

upstream core {
    server scale:8000;
}

server {

    listen 80;

    location / {
        proxy_pass http://core;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header Host $host;
        proxy_redirect off;
        client_max_body_size 100M;
    }

     location /staticfiles/ {
        alias /code/staticfiles/;
    }
      location /mediafiles/ {
        alias /code/mediafiles/;
    }

}

Above, we created a Dockerfile that builds our nginx image, and an nginx.conf that proxies requests to the app and serves the static and media files.

Let's run the compose file:

docker-compose up --build

Navigate to http://localhost:2000/ in your browser.

Note: in the docker-compose.yaml above, the nginx service maps port 2000 on the host to port 80 in the container, so the server is reachable on port 2000.

Caching Products

First, let's try it without caching.

Now, let's create a model for our products app.

products/models.py

from django.db import models
from django.utils.translation import gettext_lazy as _


class Category(models.Model):
    name = models.CharField(_("Category Name"), max_length=255, unique=True)
    description = models.TextField(null=True)

    class Meta:
        ordering = ("name",)
        verbose_name = _("Category")
        verbose_name_plural = _("Categories")

    def __str__(self) -> str:
        return self.name


class Product(models.Model):
    name = models.CharField(_("Product Name"), max_length=255)
    category = models.ForeignKey(
        Category, on_delete=models.DO_NOTHING)

    description = models.TextField()
    price = models.DecimalField(decimal_places=2, max_digits=10)
    quantity = models.IntegerField(default=0)
    discount = models.DecimalField(decimal_places=2, max_digits=10)
    image = models.URLField(max_length=255)

    class Meta:
        ordering = ("id",)
        verbose_name = _("Product")
        verbose_name_plural = _("Products")

    def __str__(self):
        return self.name

Moving forward, let's create some dummy data using custom management commands.
Create a management directory inside the products app:

── products
   └── management
       ├── __init__.py
       └── commands
           ├── __init__.py
           ├── category_seed.py
           └── product_seed.py

category_seed.py

from django.core.management import BaseCommand
from products.models import Category

from faker import Faker

class Command(BaseCommand):
    def handle(self, *args, **kwargs):
        faker = Faker()

        for _ in range(30):
            Category.objects.create(
                name=faker.name(),
                description=faker.text(200)
            )

product_seed.py

from django.core.management import BaseCommand
from products.models import Category, Product
from random import randrange, randint

from faker import Faker


class Command(BaseCommand):
    def handle(self, *args, **kwargs):
        faker = Faker()

        for _ in range(5000):
            price = randrange(10, 100)
            quantity = randrange(1, 5)
            cat_id = randint(1, 30)
            category = Category.objects.get(id=cat_id)
            Product.objects.create(
                name=faker.name(),
                category=category,
                description=faker.text(200),
                price=price,
                discount=100,
                quantity=quantity,
                image=faker.image_url(),
            )

So, we will create 5,000 products and 30 categories:

$ docker-compose exec scale sh
/code # python manage.py makemigrations
/code # python manage.py migrate
/code # python manage.py createsuperuser
/code # python manage.py collectstatic --no-input
/code # python manage.py category_seed
/code # python manage.py product_seed # takes a while to create 5000 rows
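Seeding 5,000 products one at a time issues 5,000 separate INSERTs, which is why product_seed takes a while. A common speed-up (a sketch, not part of the tutorial code) is Django's bulk_create in batches; the batching helper below is plain Python:

```python
from itertools import islice

# Generic batching helper: yield lists of at most `size` items.
def batched(iterable, size):
    it = iter(iterable)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch

# In product_seed.py this might look like (Django sketch, untested here):
#     products = (Product(name=faker.name(), ...) for _ in range(5000))
#     for chunk in batched(products, 500):
#         Product.objects.bulk_create(chunk)
```

With 500-row batches, 5,000 INSERT round trips collapse into about ten, which is usually the difference between minutes and seconds on a seeder like this.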

You can check in pgAdmin or the admin dashboard whether the data has loaded.

After creating the dummy data, let's create the serializers and views.

serializers.py

from rest_framework import serializers

from .models import Product, Category


class CategorySerializers(serializers.ModelSerializer):
    class Meta:
        model = Category
        fields = "__all__"


class CategoryRelatedField(serializers.StringRelatedField):
    def to_representation(self, value):
        return CategorySerializers(value).data

    def to_internal_value(self, data):
        return data


class ProductSerializers(serializers.ModelSerializer):

    class Meta:
        model = Product
        fields = "__all__"


class ReadProductSerializer(serializers.ModelSerializer):

    category = serializers.StringRelatedField(read_only=True)
    # category = CategoryRelatedField()
    # category = CategorySerializers()

    class Meta:
        model = Product
        fields = "__all__"

views.py

from products.models import Product
from rest_framework import (
    viewsets,
    status,
)

import time
from .serializers import ProductSerializers, ReadProductSerializer

from rest_framework.response import Response


class ProductViewSet(viewsets.ViewSet):

    def list(self, request):
        serializer = ReadProductSerializer(Product.objects.all(), many=True)
        return Response(serializer.data)

    def create(self, request):
        serializer = ProductSerializers(data=request.data)
        serializer.is_valid(raise_exception=True)
        serializer.save()
        return Response(
            serializer.data, status=status.HTTP_201_CREATED)

    def retrieve(self, request, pk=None,):
        products = Product.objects.get(id=pk)
        serializer = ReadProductSerializer(products)
        return Response(
            serializer.data
        )

    def update(self, request, pk=None):
        products = Product.objects.get(id=pk)
        serializer = ProductSerializers(
            instance=products, data=request.data, partial=True)
        serializer.is_valid(raise_exception=True)
        serializer.save()
        return Response(
            serializer.data, status=status.HTTP_202_ACCEPTED)

    def destroy(self, request, pk=None):
        products = Product.objects.get(id=pk)
        products.delete()
        return Response(
            status=status.HTTP_204_NO_CONTENT
        )

urls.py

from django.urls import path

from .views import ProductViewSet

urlpatterns = [
    path("product", ProductViewSet.as_view(
        {"get": "list", "post": "create"})),
    path(
        "product/<str:pk>",
        ProductViewSet.as_view(
            {"get": "retrieve", "put": "update", "delete": "destroy"}),
    ),
]

So, we created a view using viewsets.

Let's hit it with Postman using the different serializers on the viewset to fetch the list of 5K records:

http://localhost:2000/api/v1/product

Serializer                                       Time
ReadProductSerializer (StringRelatedField)       6.42 s
ReadProductSerializer (CategoryRelatedField)     7.05 s
ReadProductSerializer (nested)                   6.49 s
ReadProductSerializer (PrimaryKeyRelatedField)   681 ms
ReadProductSerializer (without any)              674 ms

Note: response times may vary depending on your system.

Let's fetch the data using caching:

views.py

from rest_framework.views import APIView
from products.models import Category, Product
from rest_framework import (
    viewsets,
    status,
)
from rest_framework.pagination import PageNumberPagination
import time
from .serializers import CategorySerializers, ProductSerializers, ReadProductSerializer

from rest_framework.response import Response

from django.core.cache import cache

class ProductListApiView(APIView):

    def get(self, request):
        paginator = PageNumberPagination()
        paginator.page_size = 10

        # get products from cache if exists
        products = cache.get('products_data')

        #  if products does not exists on cache create it
        if not products:
            products = list(Product.objects.select_related('category'))
            cache.set('products_data', products, timeout=60 * 60)

        # paginating cache products
        result = paginator.paginate_queryset(products, request)

        serializer = ReadProductSerializer(result, many=True)
        return paginator.get_paginated_response(serializer.data)


class ProductViewSet(viewsets.ViewSet):

    def create(self, request):
        serializer = ProductSerializers(data=request.data)
        serializer.is_valid(raise_exception=True)
        serializer.save()

        # invalidate the cached product list so the next
        # read repopulates it with fresh data
        cache.delete("products_data")

        return Response(
            serializer.data, status=status.HTTP_201_CREATED)

    def retrieve(self, request, pk=None,):
        products = Product.objects.get(id=pk)
        serializer = ReadProductSerializer(products)

        return Response(
            serializer.data
        )

    def update(self, request, pk=None):
        products = Product.objects.get(id=pk)
        serializer = ProductSerializers(
            instance=products, data=request.data, partial=True)
        serializer.is_valid(raise_exception=True)
        serializer.save()
        # invalidate the cached product list
        cache.delete("products_data")
        return Response(
            serializer.data, status=status.HTTP_202_ACCEPTED)

    def destroy(self, request, pk=None):
        products = Product.objects.get(id=pk)
        products.delete()
        # invalidate the cached product list
        cache.delete("products_data")
        return Response(
            status=status.HTTP_204_NO_CONTENT
        )

So, I created a separate APIView and removed the list function from the viewset; it fetches data from the cache and returns a paginated response.
Change your products/urls.py:

from django.urls import path

from .views import ProductListApiView, ProductViewSet

urlpatterns = [

    path('products', ProductListApiView.as_view()),

    path("product", ProductViewSet.as_view(
        {"post": "create"})),
    path(
        "product/<str:pk>",
        ProductViewSet.as_view(
            {"get": "retrieve", "put": "update", "delete": "destroy"}),
    ),
]
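The ProductListApiView above follows the cache-aside pattern: read through the cache, fall back to the database, and invalidate the key on writes. Stripped of Django and Redis, the pattern looks like this (SimpleCache is a hypothetical dict-backed stand-in for the Redis cache):

```python
import time

class SimpleCache:
    """Dict-backed stand-in for Redis: get/set with a TTL, plus delete."""
    def __init__(self):
        self._store = {}

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() > expires_at:
            del self._store[key]  # expired entry: drop it
            return None
        return value

    def set(self, key, value, timeout):
        self._store[key] = (value, time.monotonic() + timeout)

    def delete(self, key):
        self._store.pop(key, None)


def get_products(cache, load_from_db):
    # Cache-aside read: try the cache first, fall back to the database
    # and repopulate the cache on a miss.
    products = cache.get("products_data")
    if products is None:
        products = load_from_db()
        cache.set("products_data", products, timeout=60 * 60)
    return products
```

On create/update/delete, cache.delete("products_data") (as in the viewset) forces the next read to repopulate, so clients never see a stale product list for longer than one request.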

So, try it again with Postman with the different serializers.
You will get results between 90 and 200 ms, depending on your machine.

Note: in the APIView above I used select_related. Try removing it and run the request again with Postman; you will see quite different results.

To learn more about queryset optimization (i.e. `select_related`, `prefetch_related`), read up on the N+1 queries problem.
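The timing gap in the serializer table comes from exactly this N+1 problem: serializing each product's category lazily triggers one extra query per row. A toy simulation (no Django involved; the class and numbers are illustrative) makes the arithmetic concrete:

```python
class CountingDB:
    """Toy database that just counts the queries issued against it."""
    def __init__(self, n_products):
        self.n_products = n_products
        self.queries = 0

    def fetch_products(self):
        self.queries += 1                      # SELECT * FROM products
        return list(range(self.n_products))

    def fetch_category(self, product_id):
        self.queries += 1                      # one SELECT per product
        return f"category-{product_id % 30}"

    def fetch_products_with_categories(self):
        self.queries += 1                      # single JOIN, like select_related
        return [(p, f"category-{p % 30}") for p in range(self.n_products)]


def naive_queries(db):
    # Lazy access: 1 query for products + 1 per category = N + 1 total.
    for p in db.fetch_products():
        db.fetch_category(p)
    return db.queries


def joined_queries(db):
    # select_related-style JOIN: everything in a single query.
    db.fetch_products_with_categories()
    return db.queries
```

With 5,000 products, the lazy path issues 5,001 queries while the joined path issues 1, which is why StringRelatedField takes seconds and select_related milliseconds.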

Final words:

There is still plenty of room to improve; it all depends on how, where, for what, and for how many.

Hope you liked it... ciao 👋👋
