⚖️Scaling Django⚖️

Why Scaling?
Scalability is the potential of your application to cope with an increasing number of users interacting with it simultaneously. Ultimately, you want it to grow and be able to handle more and more requests per minute (RPM). A number of factors play a part in ensuring scalability, and it's worth taking each of them into consideration.
Requirements:
I am using Docker to wrap all the necessary tools and the Django app in containers. You can skip Docker, but then you have to install the required tools independently; it is up to you how you go about it.

I will not go into much detail or explanation here, so please help yourself.

  • django-rest-framework
  • Nginx
  • Redis
  • Postgres
  • Poetry (an alternative for pip or pipenv)
Quickstart
Feeling lazy?
  • clone the repo: boilerplate
  • and run the commands below:
    $ python3 -m venv env # create virtual environment
    $ source env/bin/activate
    $ poetry install # make sure you have installed poetry on your machine
    OR
    $ mkdir scale && cd scale
    $ python3 -m venv env # create virtual environment
    $ source env/bin/activate
    $ poetry init # initializes poetry and generates the pyproject.toml file
    $ poetry add djangorestframework psycopg2-binary Faker django-redis gunicorn
    $ django-admin startproject config .
    $ python manage.py startapp products
    $ touch Dockerfile
    $ touch docker-compose.yml
    Project structure:
    ─── scale
        ├── config
        │   ├── __init__.py
        │   ├── asgi.py
        │   ├── settings
        │   │   ├── __init__.py
        │   │   ├── base.py
        │   │   ├── dev.py
        │   │   └── prod.py
        │   ├── urls.py
        │   └── wsgi.py
        ├── products
        ├── .env
        ├── manage.py
        ├── docker-compose.yml
        └── Dockerfile

    Note: in the structure above I have split the settings into base.py, dev.py, and prod.py. Split them yourself, or get them from the boilerplate repo.
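One way to choose between the split settings files is to pick the module from an environment variable. The snippet below is only a sketch of that idea: DJANGO_ENV is an assumed variable name (not part of the boilerplate), while DJANGO_SETTINGS_MODULE is the variable Django itself reads.

```python
import os

# Hypothetical mapping from an environment name to the split settings modules
# shown in the project structure (config/settings/dev.py and prod.py).
SETTINGS_BY_ENV = {
    "dev": "config.settings.dev",
    "prod": "config.settings.prod",
}


def settings_module(env_name: str) -> str:
    """Return the dotted settings path for an environment name (dev by default)."""
    return SETTINGS_BY_ENV.get(env_name.lower(), SETTINGS_BY_ENV["dev"])


def configure(environ=os.environ) -> str:
    """Set DJANGO_SETTINGS_MODULE from DJANGO_ENV unless it is already set.

    DJANGO_ENV is an assumed name used only for this sketch.
    """
    chosen = settings_module(environ.get("DJANGO_ENV", "dev"))
    environ.setdefault("DJANGO_SETTINGS_MODULE", chosen)
    return environ["DJANGO_SETTINGS_MODULE"]
```

A call such as configure() could sit at the top of manage.py and wsgi.py, before Django loads its settings.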

    Let's start with docker.
    Dockerfile
    FROM python:3.8.5-alpine
    
    # prevents Python from generating .pyc files in the container
    ENV PYTHONDONTWRITEBYTECODE 1
    # Turns off buffering for easier container logging
    ENV PYTHONUNBUFFERED 1
    
    RUN \
        apk add --no-cache curl
    
    # install psycopg2 dependencies
    RUN apk update \
        && apk add postgresql-dev gcc python3-dev musl-dev
    
    
    # Install poetry
    RUN pip install -U pip \
        && curl -sSL https://raw.githubusercontent.com/python-poetry/poetry/master/get-poetry.py | python
    ENV PATH="${PATH}:/root/.poetry/bin"
    
    
    RUN mkdir /code
    RUN mkdir /code/staticfiles
    RUN mkdir /code/mediafiles
    
    WORKDIR /code
    COPY . /code
    
    RUN poetry config virtualenvs.create false \
        && poetry install --no-interaction --no-ansi
    docker-compose.yaml
    version: "3.9"
    
    services:
      scale:
        restart: always
        build: .
        command: python manage.py runserver 0.0.0.0:8000
        volumes:
          - .:/code
        ports:
          - 8000:8000
        env_file:
          - ./.env
        depends_on:
          - db
      db:
        image: "postgres:11"
        volumes:
          - postgres_data:/var/lib/postgresql/data/
        ports:
          - 54322:5432
        environment:
          - POSTGRES_USER=scale
          - POSTGRES_PASSWORD=scale
          - POSTGRES_DB=scale
    
    volumes:
      postgres_data:
    Above we created the Dockerfile and the docker-compose.yaml file.
  • we used an Alpine-based image
  • installed the dependencies for Postgres and set up Poetry
  • created two services named scale and db
    Run the command:
    docker-compose up
    You may get an error saying the database does not exist.
    In that case, let's create the database:
    $ docker container ls
    CONTAINER ID   IMAGE         COMMAND                  CREATED          STATUS          PORTS                                         NAMES
    
    78ac4d15bcd8   postgres:11   "docker-entrypoint.s…"   2 hours ago      Up 31 seconds   0.0.0.0:54322->5432/tcp, :::54322->5432/tcp   scale_db_1
    Copy the CONTAINER ID value:
    $ docker exec -it 78ac4d15bcd8 bash
     :/#
     :/# psql --username=postgres
     psql (11.12 (Debian 11.12-1.pgdg90+1))
     Type "help" for help.
    
     postgres=# CREATE DATABASE scale;
     postgres=# CREATE USER scale WITH PASSWORD 'scale';
     postgres=# ALTER ROLE scale SET client_encoding TO 'utf8';
     postgres=# ALTER ROLE scale SET default_transaction_isolation TO 'read committed';
     postgres=# ALTER ROLE scale SET timezone TO 'UTC';
     postgres=# ALTER ROLE scale SUPERUSER;
     postgres=# GRANT ALL PRIVILEGES ON DATABASE scale TO scale;
     postgres=# \q
    Make sure your settings/dev.py has a config like this (or uses your own credentials), and change the host from localhost to db:
    from config.settings import BASE_DIR
    
    DATABASES = {
        "default": {
            "ENGINE": "django.db.backends.postgresql",
            "ATOMIC_REQUESTS": True,
            "NAME": "scale",
            "USER": "scale",
            "PASSWORD": "scale",
            "HOST": "db",
            "PORT": "5432",
        }
    }
    
    # REDIS CONFIG
    CACHES = {
        "default": {
            "BACKEND": "django_redis.cache.RedisCache",
            "LOCATION": "redis://redis:6379/0",
            "OPTIONS": {"CLIENT_CLASS": "django_redis.client.DefaultClient"},
        }
    }
    
    STATIC_URL = '/static/'
    STATIC_ROOT = BASE_DIR.parent / "staticfiles"  # for collect static
    
    MEDIA_ROOT = BASE_DIR.parent / "mediafiles"  # matches the media volume mounted at /code/mediafiles
    MEDIA_URL = "/media/"
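Since docker-compose already loads a .env file into the scale service, the database credentials do not have to be hard-coded. The following is a minimal sketch, assuming the POSTGRES_* variables are also placed in .env; DB_HOST and DB_PORT are assumed names introduced for this example.

```python
import os


def database_config(environ=os.environ) -> dict:
    """Build a Django-style DATABASES dict from environment variables.

    POSTGRES_DB, POSTGRES_USER and POSTGRES_PASSWORD mirror the names used in
    docker-compose; DB_HOST and DB_PORT are assumed names for this sketch.
    The defaults match the values used in this article.
    """
    return {
        "default": {
            "ENGINE": "django.db.backends.postgresql",
            "ATOMIC_REQUESTS": True,
            "NAME": environ.get("POSTGRES_DB", "scale"),
            "USER": environ.get("POSTGRES_USER", "scale"),
            "PASSWORD": environ.get("POSTGRES_PASSWORD", "scale"),
            "HOST": environ.get("DB_HOST", "db"),
            "PORT": environ.get("DB_PORT", "5432"),
        }
    }


# In settings/dev.py this could be used as: DATABASES = database_config()
DATABASES = database_config()
```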
    Nginx Setup
    Next, we set up Redis, Nginx, and Gunicorn in Docker:
    docker-compose.yaml
    version: "3.9"
    
    services:
      scale:
        restart: always
        build: .
        command: gunicorn config.wsgi:application --bind 0.0.0.0:8000
        volumes:
          - .:/code
          - static_volume:/code/staticfiles
          - media_volume:/code/mediafiles
        expose:
          - 8000
        env_file:
          - ./.env
        depends_on:
          - db
          - redis
      db:
        image: "postgres:11"
        volumes:
          - postgres_data:/var/lib/postgresql/data/
        ports:
          - 54322:5432
        environment:
          - POSTGRES_USER=scale
          - POSTGRES_PASSWORD=scale
          - POSTGRES_DB=scale
      redis:
        image: redis
        ports:
          - 63799:6379
        restart: on-failure
    
      nginx:
        build: ./nginx
        restart: always
        volumes:
          - static_volume:/code/staticfiles
          - media_volume:/code/mediafiles
        ports:
          - 2000:80
        depends_on:
          - scale
    
    volumes:
      postgres_data:
      static_volume:
      media_volume:
    So, above we added two services, redis and nginx, and started Gunicorn instead of our regular runserver command. Next, we create an nginx directory in the project root containing a Dockerfile and nginx.conf.
    nginx/Dockerfile
    FROM nginx:latest
    
    RUN rm /etc/nginx/conf.d/default.conf
    COPY nginx.conf /etc/nginx/conf.d
    nginx/nginx.conf
    upstream core {
        server scale:8000;
    }
    
    server {
    
        listen 80;
    
        location / {
            proxy_pass http://core;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
            proxy_set_header Host $host;
            proxy_redirect off;
            client_max_body_size 100M;
        }
    
        # match STATIC_URL ('/static/') and MEDIA_URL ('/media/') from the Django settings
        location /static/ {
            alias /code/staticfiles/;
        }

        location /media/ {
            alias /code/mediafiles/;
        }
    
    }
    Above, we created a Dockerfile that builds our Nginx image, and an nginx.conf that proxies requests to our app and serves the static and media files.
    Let's run the docker-compose file:
    docker-compose up --build
    Open http://localhost:2000/ in your browser.

    Note: in the docker-compose.yaml file above, the nginx service maps ports 2000:80,
    so our server is reachable on port 2000.
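The gunicorn command above starts with Gunicorn's default worker count. As a starting point, the Gunicorn documentation suggests (2 x cores) + 1 workers. The sketch below shows how that could look in a config file; the file name gunicorn.conf.py and the helper suggested_workers are assumptions for illustration and are not part of the original setup.

```python
import multiprocessing


def suggested_workers(cpu_count: int) -> int:
    """Starting point suggested by the Gunicorn docs: (2 x cores) + 1."""
    return 2 * cpu_count + 1


# Module-level names below are settings Gunicorn reads from its config file.
bind = "0.0.0.0:8000"
workers = suggested_workers(multiprocessing.cpu_count())
```

With such a file in place, the command in docker-compose.yaml could become gunicorn config.wsgi:application -c gunicorn.conf.py. Treat the number as a first guess and tune it under real load.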

    Caching Products
    First, let's try without caching.
    Now, let's create the models for our products app.
    products/models.py
    from django.db import models
    from django.utils.translation import gettext_lazy as _
    
    
    class Category(models.Model):
        name = models.CharField(_("Category Name"), max_length=255, unique=True)
        description = models.TextField(null=True)
    
        class Meta:
            ordering = ("name",)
            verbose_name = _("Category")
            verbose_name_plural = _("Categories")
    
        def __str__(self) -> str:
            return self.name
    
    
    class Product(models.Model):
        name = models.CharField(_("Product Name"), max_length=255)
        category = models.ForeignKey(
            Category, on_delete=models.DO_NOTHING)
    
        description = models.TextField()
        price = models.DecimalField(decimal_places=2, max_digits=10)
        quantity = models.IntegerField(default=0)
        discount = models.DecimalField(decimal_places=2, max_digits=10)
        image = models.URLField(max_length=255)
    
        class Meta:
            ordering = ("id",)
            verbose_name = _("Product")
            verbose_name_plural = _("Products")
    
        def __str__(self):
            return self.name
    Moving forward, let's create some dummy data using custom management commands.
    Create a management directory inside the products app.
    ── products
       └── management
           ├── __init__.py
           └── commands
               ├── __init__.py
               ├── category_seed.py
               └── product_seed.py
    category_seed.py
    from django.core.management import BaseCommand
    from django.db import connections
    from django.db.utils import OperationalError
    from products.models import Category
    
    from faker import Faker
    
    
    class Command(BaseCommand):
        def handle(self, *args, **kwargs):
            faker = Faker()
    
            for _ in range(30):
                Category.objects.create(
                    name=faker.name(),
                    description=faker.text(200)
                )
    product_seed.py
    from django.core.management import BaseCommand
    from django.db import connections
    from django.db.utils import OperationalError
    from products.models import Category, Product
    from random import randrange, randint
    
    from faker import Faker
    
    
    class Command(BaseCommand):
        def handle(self, *args, **kwargs):
            faker = Faker()
    
            for _ in range(5000):
                price = randrange(10, 100)
                quantity = randrange(1, 5)
                # pick one of the 30 seeded categories (assumes their ids are 1..30)
                cat_id = randint(1, 30)
                category = Category.objects.get(id=cat_id)
                Product.objects.create(
                    name=faker.name(),
                    category=category,
                    description=faker.text(200),
                    price=price,
                    discount=100,
                    quantity=quantity,
                    image=faker.image_url(),
                )
    So, this will create 5000 products and 30 categories.
    $ docker-compose exec scale sh
    /code # python manage.py makemigrations
    /code # python manage.py migrate
    /code # python manage.py createsuperuser
    /code # python manage.py collectstatic --no-input
    /code # python manage.py category_seed
    /code # python manage.py product_seed # takes a while to create 5000 rows
    You can check in pgAdmin or the admin dashboard whether the data was loaded.
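The product seed is slow mainly because every row is a separate INSERT (plus a category lookup). Django's bulk_create accepts a list of unsaved instances, so the rows could be inserted in batches instead. The snippet below only sketches the batching part in plain Python; the batched helper and the stand-in rows are hypothetical and not part of the original commands.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def batched(items: Iterable[T], size: int) -> Iterator[List[T]]:
    """Yield consecutive lists of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


# Stand-in rows; in the seed command these would be unsaved Product(...) instances.
rows = [{"name": f"product-{i}"} for i in range(5000)]
batches = list(batched(rows, 500))
# In Django, each batch could then be passed to Product.objects.bulk_create(batch).
```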
    After creating the dummy data, let's create the serializers and views.
    serializers.py
    from rest_framework import serializers
    
    from .models import Product, Category
    
    
    class CategorySerializers(serializers.ModelSerializer):
        class Meta:
            model = Category
            fields = "__all__"
    
    
    class CategoryRelatedField(serializers.StringRelatedField):
        def to_representation(self, value):
            return CategorySerializers(value).data
    
        def to_internal_value(self, data):
            return data
    
    
    class ProductSerializers(serializers.ModelSerializer):
    
        class Meta:
            model = Product
            fields = "__all__"
    
    
    class ReadProductSerializer(serializers.ModelSerializer):
    
        category = serializers.StringRelatedField(read_only=True)
        # category = CategoryRelatedField()
        # category = CategorySerializers()
    
        class Meta:
            model = Product
            fields = "__all__"
    views.py
    from products.models import Product
    from rest_framework import (
        viewsets,
        status,
    )
    
    import time
    from .serializers import ProductSerializers, ReadProductSerializer
    
    from rest_framework.response import Response
    
    
    class ProductViewSet(viewsets.ViewSet):
    
        def list(self, request):
            serializer = ReadProductSerializer(Product.objects.all(), many=True)
            return Response(serializer.data)
    
        def create(self, request):
            serializer = ProductSerializers(data=request.data)
            serializer.is_valid(raise_exception=True)
            serializer.save()
            return Response(
                serializer.data, status=status.HTTP_201_CREATED)
    
        def retrieve(self, request, pk=None,):
            products = Product.objects.get(id=pk)
            serializer = ReadProductSerializer(products)
            return Response(
                serializer.data
            )
    
        def update(self, request, pk=None):
            products = Product.objects.get(id=pk)
            serializer = ProductSerializers(
                instance=products, data=request.data, partial=True)
            serializer.is_valid(raise_exception=True)
            serializer.save()
            return Response(
                serializer.data, status=status.HTTP_202_ACCEPTED)
    
        def destroy(self, request, pk=None):
            products = Product.objects.get(id=pk)
            products.delete()
            return Response(
                status=status.HTTP_204_NO_CONTENT
            )
    urls.py
    from django.urls import path
    
    from .views import ProductViewSet
    
    urlpatterns = [
        path("product", ProductViewSet.as_view(
            {"get": "list", "post": "create"})),
        path(
            "product/<str:pk>",
            ProductViewSet.as_view(
                {"get": "retrieve", "put": "update", "delete": "destroy"}),
        ),
    ]
    So, we created the views using viewsets.
    Let's try with Postman, using different serializers in the viewset, to fetch the list of 5K products.
    http://localhost:2000/api/v1/product
    Serializer                                         Time
    ReadProductSerializer (StringRelatedField)         6.42 s
    ReadProductSerializer (CategoryRelatedField)       7.05 s
    ReadProductSerializer (nested)                     6.49 s
    ReadProductSerializer (PrimaryKeyRelatedField)     681 ms
    ReadProductSerializer (without any)                674 ms

    Note: response times may vary depending on your system.
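If you want numbers that are less sensitive to a single slow request than one Postman run, you can repeat the call and look at the spread. This is a small sketch using only the standard library; measure_ms and summarize are hypothetical helper names.

```python
import statistics
import time
from typing import Callable, Dict, List


def measure_ms(call: Callable[[], object], repeats: int = 5) -> List[float]:
    """Run `call` repeatedly and return each duration in milliseconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        call()
        durations.append((time.perf_counter() - start) * 1000)
    return durations


def summarize(durations: List[float]) -> Dict[str, float]:
    """Reduce a list of durations to min, median and max."""
    return {
        "min": min(durations),
        "median": statistics.median(durations),
        "max": max(durations),
    }
```

To time the API, pass a callable that performs the request, for example one that calls urllib.request.urlopen on the list URL and reads the response body.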

    Let's get the data using caching:
    views.py
    from rest_framework.views import APIView
    from products.models import Category, Product
    from rest_framework import (
        viewsets,
        status,
    )
    from rest_framework.pagination import PageNumberPagination
    import time
    from .serializers import CategorySerializers, ProductSerializers, ReadProductSerializer
    
    from rest_framework.response import Response
    
    from django.core.cache import cache
    
    class ProductListApiView(APIView):
    
        def get(self, request):
            paginator = PageNumberPagination()
            paginator.page_size = 10
    
            # get products from the cache if they exist
            products = cache.get('products_data')
    
            # if the products are not in the cache, load them and cache them for one hour
            if products is None:
                products = list(Product.objects.select_related('category'))
                cache.set('products_data', products, timeout=60 * 60)
    
            # paginate the cached products
            result = paginator.paginate_queryset(products, request)
    
            serializer = ReadProductSerializer(result, many=True)
            return paginator.get_paginated_response(serializer.data)
    
    
    class ProductViewSet(viewsets.ViewSet):
    
        def create(self, request):
            serializer = ProductSerializers(data=request.data)
            serializer.is_valid(raise_exception=True)
            serializer.save()
    
            # get cache of products
            #  if exists
            #  delete cache
            for key in cache.keys('*'):
                if 'products_data' in key:
                    cache.delete(key)
            cache.delete("products_data")
    
            return Response(
                serializer.data, status=status.HTTP_201_CREATED)
    
        def retrieve(self, request, pk=None,):
            products = Product.objects.get(id=pk)
            serializer = ReadProductSerializer(products)
    
            return Response(
                serializer.data
            )
    
        def update(self, request, pk=None):
            products = Product.objects.get(id=pk)
            serializer = ProductSerializers(
                instance=products, data=request.data, partial=True)
            serializer.is_valid(raise_exception=True)
            serializer.save()
            for key in cache.keys('*'):
                if 'products_data' in key:
                    cache.delete(key)
            cache.delete("products_data")
            return Response(
                serializer.data, status=status.HTTP_202_ACCEPTED)
    
        def destroy(self, request, pk=None):
            products = Product.objects.get(id=pk)
            products.delete()
            for key in cache.keys('*'):
                if 'products_data' in key:
                    cache.delete(key)
            cache.delete("products_data")
            return Response(
                status=status.HTTP_204_NO_CONTENT
            )
    So, I have created a separate APIView and removed the list function from the viewset. The new view fetches the data from the cache and returns a paginated response.
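The pattern in these views is often called cache-aside: read from the cache, fall back to the database on a miss, and delete the cached entry whenever the data changes. The sketch below reproduces that flow in plain Python so it can be run without Redis; TTLCache is a hypothetical stand-in for Django's cache object, and load_from_db stands in for the queryset.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple


class TTLCache:
    """Tiny in-memory stand-in for a cache with get / set / delete and a timeout."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._store: Dict[str, Tuple[Any, float]] = {}
        self._clock = clock

    def get(self, key: str) -> Optional[Any]:
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            # entry is too old: drop it and report a miss
            del self._store[key]
            return None
        return value

    def set(self, key: str, value: Any, timeout: float) -> None:
        self._store[key] = (value, self._clock() + timeout)

    def delete(self, key: str) -> None:
        self._store.pop(key, None)


def get_products(cache: TTLCache, load_from_db: Callable[[], list]) -> list:
    """Return the cached products, loading and caching them for one hour on a miss."""
    products = cache.get("products_data")
    if products is None:
        products = load_from_db()
        cache.set("products_data", products, timeout=60 * 60)
    return products
```

Writes (create, update, destroy) then only need to call cache.delete("products_data"), so the next read repopulates the cache from the database.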
    Change your products/urls.py:
    from django.urls import path
    
    from .views import ProductListApiView, ProductViewSet
    
    urlpatterns = [
    
        path('products', ProductListApiView.as_view()),
    
        path("product", ProductViewSet.as_view(
            {"post": "create"})),
        path(
            "product/<str:pk>",
            ProductViewSet.as_view(
                {"get": "retrieve", "put": "update", "delete": "destroy"}),
        ),
    ]
    So, try it again with Postman with the different serializers.
    You will get results between 90 and 200 ms, depending on your machine.

    Note: in the APIView above I have used select_related. Try removing it and run the request again with Postman; you will see different results.

    To learn more about queryset methods such as select_related and prefetch_related, see this link: N+1 Queries Problem
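To see why select_related matters, here is a small self-contained illustration of the N+1 pattern. It uses the standard-library sqlite3 module in place of Postgres, and the table names and the CountingConnection helper are hypothetical. Looking up each product's category separately costs one query per product, while a single JOIN (which is what select_related issues) needs only one.

```python
import sqlite3


def build_db() -> sqlite3.Connection:
    """Create an in-memory database with 3 categories and 10 products."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE category (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute(
        "CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT, category_id INTEGER)"
    )
    conn.executemany(
        "INSERT INTO category VALUES (?, ?)",
        [(i, f"cat-{i}") for i in range(1, 4)],
    )
    conn.executemany(
        "INSERT INTO product VALUES (?, ?, ?)",
        [(i, f"prod-{i}", (i % 3) + 1) for i in range(1, 11)],
    )
    conn.commit()
    return conn


class CountingConnection:
    """Wrap a connection and count how many queries are sent through it."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self.conn = conn
        self.queries = 0

    def query(self, sql: str, params: tuple = ()) -> list:
        self.queries += 1
        return self.conn.execute(sql, params).fetchall()


def names_n_plus_one(db: CountingConnection) -> list:
    """One query for the products, then one more per product for its category."""
    result = []
    for name, category_id in db.query("SELECT name, category_id FROM product ORDER BY id"):
        category_name = db.query(
            "SELECT name FROM category WHERE id = ?", (category_id,)
        )[0][0]
        result.append((name, category_name))
    return result


def names_with_join(db: CountingConnection) -> list:
    """A single JOIN query, comparable to what select_related produces."""
    return db.query(
        "SELECT p.name, c.name FROM product p "
        "JOIN category c ON c.id = p.category_id ORDER BY p.id"
    )
```

With 10 products, the first function sends 11 queries and the second sends 1, for the same result. With 5000 products the gap grows accordingly, which is in line with the slower timings measured earlier for the serializers that read the category.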
    Final words:
    There is still a lot of room for improvement; it depends on how, where, for what, and how many.
    Hope you liked it... ciao 👋👋
