⚖️Scaling Django⚖️

2022-10-19 last update

23 minutes reading nginx django docker redis

Why scaling?



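Scalability is an application's potential to cope with a growing number of users interacting with it at the same time. Ultimately, you want it to grow and handle more and more requests per minute (RPM). Several factors play a role in ensuring scalability, and each of them is worth considering.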

Contents


  • Nginx
  • Caching

  • Requirements:



    Well, I am using Docker to wrap up all the required tools and the Django app in Docker containers. You can of course skip Docker, but then you need to install the required tools independently.

    I won't go into much detail or explanation here, so please help yourself.


  • django-rest-framework
  • Nginx
  • Redis
  • Postgres

  • Poetry (or pip or pipenv)

  • Quickstart



    Feeling lazy?
  • Clone the repository: boilerplate
  • Run the commands below.

    $ python3 -m venv env # create virtual environment
    $ source env/bin/activate
    $ poetry install # make sure you have installed poetry on your machine
    



    Or



    $ mkdir scale && cd scale
    $ python3 -m venv env # create virtual environment
    $ source env/bin/activate
    $ poetry init # poetry initialization and generates *.toml file
    $ poetry add djangorestframework psycopg2-binary Faker django-redis gunicorn
    $ django-admin startproject config .
    $ python manage.py startapp products
    $ touch Dockerfile
    $ touch docker-compose.yml
    

    Project structure:



    ─── scale
        ├── config
        │   ├── __init__.py
        │   ├── asgi.py
        │   ├── settings
        │   │   ├── __init__.py
        │   │   ├── base.py
        │   │   ├── dev.py
        │   │   └── prod.py
        │   ├── urls.py
        │   └── wsgi.py
        ├── products
        ├── .env
        ├── manage.py
        ├── docker-compose.yml
        └── Dockerfile
    
    

    Note: in the structure above, I have split the settings into base.py, dev.py, and prod.py. Feel free to split them yourself, or grab them from the boilerplate.
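
With settings split into dev.py and prod.py, something has to decide which module Django loads. The following is a minimal sketch of one way to do that, assuming an environment variable named DJANGO_ENV (a hypothetical name, not something Django reads on its own) and the config.settings.dev / config.settings.prod paths from the structure above.

```python
import os

# Hypothetical mapping from an environment name to a settings module.
SETTINGS_BY_ENV = {
    "dev": "config.settings.dev",
    "prod": "config.settings.prod",
}


def pick_settings_module(environ=os.environ):
    """Return the dotted settings path for the current environment.

    DJANGO_ENV is an assumed variable name used only for this sketch.
    """
    env_name = environ.get("DJANGO_ENV", "dev")
    try:
        return SETTINGS_BY_ENV[env_name]
    except KeyError:
        raise ValueError(f"unknown DJANGO_ENV: {env_name!r}") from None


if __name__ == "__main__":
    # manage.py and wsgi.py usually set DJANGO_SETTINGS_MODULE in this way.
    os.environ.setdefault("DJANGO_SETTINGS_MODULE", pick_settings_module())
    print(os.environ["DJANGO_SETTINGS_MODULE"])
```

Defaulting to dev keeps local work simple, while production only needs DJANGO_ENV=prod in its .env file.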



    Let's start with Docker.
    Dockerfile
    FROM python:3.8.5-alpine
    
    # prevents Python from generating .pyc files in the container
    ENV PYTHONDONTWRITEBYTECODE 1
    # Turns off buffering for easier container logging
    ENV PYTHONUNBUFFERED 1
    
    RUN \
        apk add --no-cache curl
    
    # install psycopg2 dependencies
    RUN apk update \
        && apk add postgresql-dev gcc python3-dev musl-dev
    
    
    # Install poetry
    RUN pip install -U pip \
        && curl -sSL https://raw.githubusercontent.com/python-poetry/poetry/master/get-poetry.py | python
    ENV PATH="${PATH}:/root/.poetry/bin"
    
    
    RUN mkdir /code
    RUN mkdir /code/staticfiles
    RUN mkdir /code/mediafiles
    
    WORKDIR /code
    COPY . /code
    
    RUN poetry config virtualenvs.create false \
        && poetry install --no-interaction --no-ansi
    
    
    docker-compose.yaml
    version: "3.9"
    
    services:
      scale:
        restart: always
        build: .
        command: python manage.py runserver 0.0.0.0:8000
        volumes:
          - .:/code
        ports:
          - 8000:8000
        env_file:
          - ./.env
        depends_on:
          - db
      db:
        image: "postgres:11"
        volumes:
          - postgres_data:/var/lib/postgresql/data/
        ports:
          - 54322:5432
        environment:
          - POSTGRES_USER=scale
          - POSTGRES_PASSWORD=scale
          - POSTGRES_DB=scale
    
    volumes:
      postgres_data:
    

    Above, we created the Dockerfile and docker-compose.yaml files:
  • We used an Alpine-based image
  • We installed the dependencies for postgres and set up poetry
  • We created two services named scale and db
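
One thing worth knowing about the compose file above: depends_on only controls the order in which containers start, it does not wait until Postgres actually accepts connections. Below is a small sketch of a wait helper using only the standard library. The function name and default values are my own assumptions, and an open TCP port is only a rough signal that the database is ready.

```python
import socket
import time


def wait_for_port(host, port, timeout=30.0, interval=0.5):
    """Return True once a TCP connection to host:port succeeds, else False."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # Opening and immediately closing the connection is enough here.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            time.sleep(interval)
    return False
```

Inside the scale container this could be called as wait_for_port("db", 5432) before running migrations.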

  • Run the following command:

    docker-compose up
    

    You will get the error: database does not exist.

    Let's create the database.

    $ docker container ls
    CONTAINER ID   IMAGE         COMMAND                  CREATED          STATUS          PORTS                                         NAMES
    
    78ac4d15bcd8   postgres:11   "docker-entrypoint.s…"   2 hours ago      Up 31 seconds   0.0.0.0:54322->5432/tcp, :::54322->5432/tcp   scale_db_1
    

    Copy the CONTAINER ID:

     $ docker exec -it 78ac4d15bcd8 bash
     :/#
     :/# psql --username=postgres
     psql (11.12 (Debian 11.12-1.pgdg90+1))
     Type "help" for help.
    
     postgres=# CREATE DATABASE scale;
     postgres=# CREATE USER scale WITH PASSWORD 'scale';
     postgres=# ALTER ROLE scale SET client_encoding TO 'utf8';
     postgres=# ALTER ROLE scale SET default_transaction_isolation TO 'read committed';
     postgres=# ALTER ROLE scale SET timezone TO 'UTC';
     postgres=# ALTER ROLE scale SUPERUSER;
     postgres=# GRANT ALL PRIVILEGES ON DATABASE scale TO scale;
     postgres=# \q
    
    Make sure settings/dev.py has a configuration like this, or, if you already have the credentials in place, change the host from localhost to db.

    from config.settings import BASE_DIR
    
    DATABASES = {
        "default": {
            "ENGINE": "django.db.backends.postgresql",
            "ATOMIC_REQUESTS": True,
            "NAME": "scale",
            "USER": "scale",
            "PASSWORD": "scale",
            "HOST": "db",
            "PORT": "5432",
        }
    }
    
    # REDIS CONFIG
    CACHES = {
        "default": {
            "BACKEND": "django_redis.cache.RedisCache",
            "LOCATION": "redis://redis:6379/0",
            "OPTIONS": {"CLIENT_CLASS": "django_redis.client.DefaultClient"},
        }
    }
    
    STATIC_URL = '/static/'
    STATIC_ROOT = BASE_DIR.parent / "staticfiles"  # for collect static
    
    MEDIA_ROOT = BASE_DIR.parent / "media"
    MEDIA_URL = "/media/"
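
The credentials above are hard-coded, while the compose file already passes a .env file to the scale service. As a hedged sketch, the same dictionary could be built from environment variables. The POSTGRES_DB, POSTGRES_USER and POSTGRES_PASSWORD names mirror docker-compose.yaml; POSTGRES_HOST and POSTGRES_PORT are assumed extras that would have to be added to .env.

```python
import os


def database_settings(environ=os.environ):
    """Build a Django DATABASES dict from environment variables.

    Every value falls back to the defaults used elsewhere in this post.
    """
    return {
        "default": {
            "ENGINE": "django.db.backends.postgresql",
            "ATOMIC_REQUESTS": True,
            "NAME": environ.get("POSTGRES_DB", "scale"),
            "USER": environ.get("POSTGRES_USER", "scale"),
            "PASSWORD": environ.get("POSTGRES_PASSWORD", "scale"),
            # "db" is the compose service name; use "localhost" outside Docker.
            "HOST": environ.get("POSTGRES_HOST", "db"),
            "PORT": environ.get("POSTGRES_PORT", "5432"),
        }
    }


DATABASES = database_settings()
```

This keeps dev.py and prod.py identical in shape, with only the environment differing.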
    
    
    

    Nginx setup

    What is Nginx?



    Next, let's set up redis, nginx, and gunicorn in Docker.
    docker-compose.yaml
    version: "3.9"
    
    services:
      scale:
        restart: always
        build: .
        command: gunicorn config.wsgi:application --bind 0.0.0.0:8000
        volumes:
          - .:/code
          - static_volume:/code/staticfiles
          - media_volume:/code/mediafiles
        expose:
          - 8000
        env_file:
          - ./.env
        depends_on:
          - db
          - redis
      db:
        image: "postgres:11"
        volumes:
          - postgres_data:/var/lib/postgresql/data/
        ports:
          - 54322:5432
        environment:
          - POSTGRES_USER=scale
          - POSTGRES_PASSWORD=scale
          - POSTGRES_DB=scale
      redis:
        image: redis
        ports:
          - 63799:6379
        restart: on-failure
    
      nginx:
        build: ./nginx
        restart: always
        volumes:
          - static_volume:/code/staticfiles
          - media_volume:/code/mediafiles
        ports:
          - 2000:80
        depends_on:
          - scale
    
    volumes:
      postgres_data:
      static_volume:
      media_volume:
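
The gunicorn command in the compose file above uses gunicorn's defaults, which means a single worker process. Gunicorn reads its configuration from a Python file, so here is a minimal sketch of one. The file name gunicorn.conf.py and the chosen values are assumptions to adapt, not settings taken from the boilerplate.

```python
# gunicorn.conf.py (assumed file name, passed with `gunicorn -c gunicorn.conf.py`)
import multiprocessing

# Address the app listens on inside the container; matches `expose: 8000`.
bind = "0.0.0.0:8000"

# Gunicorn's documentation suggests (2 x cores) + 1 as a starting point.
workers = multiprocessing.cpu_count() * 2 + 1

# Workers that stay silent longer than this many seconds are restarted.
timeout = 30
```

With this file in the project root, the compose command could become: gunicorn config.wsgi:application -c gunicorn.conf.py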
    

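    So above we added two services, redis and nginx, and started the app with gunicorn instead of the regular runserver command. Next, create an nginx directory in the project root containing a Dockerfile and an nginx.conf.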
    nginx/Dockerfile
    FROM nginx:latest
    
    RUN rm /etc/nginx/conf.d/default.conf
    COPY nginx.conf /etc/nginx/conf.d
    
    nginx/nginx.conf
    upstream core {
        server scale:8000;
    }
    
    server {
    
        listen 80;
    
        location / {
            proxy_pass http://core;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
            proxy_set_header Host $host;
            proxy_redirect off;
            client_max_body_size 100M;
        }
    
        # must match STATIC_URL in the Django settings
        location /static/ {
            alias /code/staticfiles/;
        }

        # must match MEDIA_URL in the Django settings
        location /media/ {
            alias /code/mediafiles/;
        }
    
    }
    
    

    Above, we created a Dockerfile that builds the nginx image and an nginx.conf that proxies requests to our app and serves the static and media files.
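
Because nginx now sits in front of gunicorn, the app sees nginx's container address as the remote address. The proxy_set_header X-Forwarded-For line in nginx.conf appends the connecting client's address to that header, and Django exposes it as HTTP_X_FORWARDED_FOR in request.META. A small sketch of reading it follows; the helper name is hypothetical, and it assumes nginx is the only proxy in front of the app.

```python
def client_ip(meta):
    """Return the client address from a Django-style META mapping.

    With a single trusted nginx proxy, the last X-Forwarded-For entry is
    the address nginx itself saw; earlier entries are client-supplied.
    """
    forwarded = meta.get("HTTP_X_FORWARDED_FOR", "")
    parts = [part.strip() for part in forwarded.split(",") if part.strip()]
    if parts:
        return parts[-1]
    # No proxy header present: fall back to the direct peer address.
    return meta.get("REMOTE_ADDR", "")
```

In a view this would be called as client_ip(request.META). With more proxies in the chain, the position of the trustworthy entry changes, so treat this as a starting point only.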
    Let's run the docker-compose file.

    docker-compose up --build
    

    Open this link in your browser: http://localhost:2000/

    Note: in the docker-compose.yaml file above, the nginx service maps port 2000:80,
    so our server will be reachable on port 2000.



    Caching products

    First, let's try without caching.

    Let's create the models for our products app.

    products/models.py

    from django.db import models
    from django.utils.translation import gettext_lazy as _
    
    
    class Category(models.Model):
        name = models.CharField(_("Category Name"), max_length=255, unique=True)
        description = models.TextField(null=True)
    
        class Meta:
            ordering = ("name",)
            verbose_name = _("Category")
            verbose_name_plural = _("Categories")
    
        def __str__(self) -> str:
            return self.name
    
    
    class Product(models.Model):
        name = models.CharField(_("Product Name"), max_length=255)
        category = models.ForeignKey(
            Category, on_delete=models.DO_NOTHING)
    
        description = models.TextField()
        price = models.DecimalField(decimal_places=2, max_digits=10)
        quantity = models.IntegerField(default=0)
        discount = models.DecimalField(decimal_places=2, max_digits=10)
        image = models.URLField(max_length=255)
    
        class Meta:
            ordering = ("id",)
            verbose_name = _("Product")
            verbose_name_plural = _("Products")
    
        def __str__(self):
            return self.name
    

    Moving on, let's create some dummy data using custom management commands.
    Create a management directory inside the products app.

    ── products
    │── management
    │ │── __init__.py
    │ │── commands
    │ │ │── __init__.py
    │ │ │── category_seed.py
    │ │ │── product_seed.py
    
    

    category_seed.py

    from django.core.management import BaseCommand
    from django.db import connections
    from django.db.utils import OperationalError
    from products.models import Category
    
    from faker import Faker
    
    
    class Command(BaseCommand):
        def handle(self, *args, **kwargs):
            faker = Faker()
    
            for _ in range(30):
                Category.objects.create(
                    name=faker.name(),
                    description=faker.text(200)
                )
    
    
    
    

    product_seed.py

    from django.core.management import BaseCommand
    from django.db import connections
    from django.db.utils import OperationalError
    from products.models import Category, Product
    from random import randrange, randint
    
    from faker import Faker
    
    
    class Command(BaseCommand):
        def handle(self, *args, **kwargs):
            faker = Faker()
    
            for _ in range(5000):
                price = randrange(10, 100)
                quantity = randrange(1, 5)
                cat_id = randint(1, 30)
                category = Category.objects.get(id=cat_id)
                Product.objects.create(
                    name=faker.name(),
                    category=category,
                    description=faker.text(200),
                    price=price,
                    discount=100,
                    quantity=quantity,
                    image=faker.image_url()
                )
    

    So, I will create 5000 products and 30 categories.

    $ docker-compose exec scale sh
    /code # python manage.py makemigrations
    /code # python manage.py migrate
    /code # python manage.py createsuperuser
    /code # python manage.py collectstatic --no-input
    /code # python manage.py category_seed
    /code # python manage.py product_seed # takes a while to create 5000 rows
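
The product seed is slow mainly because it sends one INSERT (plus one category lookup) per row. Django offers bulk_create for inserting many objects at once. The sketch below shows the underlying idea with the standard library's sqlite3 standing in for Postgres; the table layout is simplified and the function name is hypothetical.

```python
import sqlite3
from random import randrange


def seed_products(conn, count=5000):
    """Insert `count` fake rows with a single executemany call."""
    rows = [
        (f"product-{i}", randrange(10, 100), randrange(1, 5))
        for i in range(count)
    ]
    with conn:  # one transaction for the whole batch
        conn.executemany(
            "INSERT INTO product (name, price, quantity) VALUES (?, ?, ?)",
            rows,
        )
    return len(rows)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE product ("
    "id INTEGER PRIMARY KEY, name TEXT, price INTEGER, quantity INTEGER)"
)
seed_products(conn)
```

In the Django command, the equivalent would be to build a list of unsaved Product instances and pass it to Product.objects.bulk_create.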
    

    You can check in pgAdmin or the admin dashboard whether the data has been loaded.

    After creating the dummy data, let's create the serializers and views.

    serializers.py

    from rest_framework import serializers
    
    from .models import Product, Category
    
    
    class CategorySerializers(serializers.ModelSerializer):
        class Meta:
            model = Category
            fields = "__all__"
    
    
    class CategoryRelatedField(serializers.StringRelatedField):
        def to_representation(self, value):
            return CategorySerializers(value).data
    
        def to_internal_value(self, data):
            return data
    
    
    class ProductSerializers(serializers.ModelSerializer):
    
        class Meta:
            model = Product
            fields = "__all__"
    
    
    class ReadProductSerializer(serializers.ModelSerializer):
    
        category = serializers.StringRelatedField(read_only=True)
        # category = CategoryRelatedField()
        # category = CategorySerializers()
    
        class Meta:
            model = Product
            fields = "__all__"
    
    

    views.py

    from products.models import Product
    from rest_framework import (
        viewsets,
        status,
    )
    
    import time
    from .serializers import ProductSerializers, ReadProductSerializer
    
    from rest_framework.response import Response
    
    
    class ProductViewSet(viewsets.ViewSet):
    
        def list(self, request):
            serializer = ReadProductSerializer(Product.objects.all(), many=True)
            return Response(serializer.data)
    
        def create(self, request):
            serializer = ProductSerializers(data=request.data)
            serializer.is_valid(raise_exception=True)
            serializer.save()
            return Response(
                serializer.data, status=status.HTTP_201_CREATED)
    
        def retrieve(self, request, pk=None,):
            products = Product.objects.get(id=pk)
            serializer = ReadProductSerializer(products)
            return Response(
                serializer.data
            )
    
        def update(self, request, pk=None):
            products = Product.objects.get(id=pk)
            serializer = ProductSerializers(
                instance=products, data=request.data, partial=True)
            serializer.is_valid(raise_exception=True)
            serializer.save()
            return Response(
                serializer.data, status=status.HTTP_202_ACCEPTED)
    
        def destroy(self, request, pk=None):
            products = Product.objects.get(id=pk)
            products.delete()
            return Response(
                status=status.HTTP_204_NO_CONTENT
            )
    
    

    urls.py

    
    from django.urls import path
    
    from .views import ProductViewSet
    
    urlpatterns = [
        path("product", ProductViewSet.as_view(
            {"get": "list", "post": "create"})),
        path(
            "product/<str:pk>",
            ProductViewSet.as_view(
                {"get": "retrieve", "put": "update", "delete": "destroy"}),
        ),
    ]
    
    

    So, we created a view using viewsets.

    Let's try it with Postman, using different serializers in the viewset, to fetch the list of 5K products.

    http://localhost:2000/api/v1/product

    Serializer                                       Time
    ReadProductSerializer (StringRelatedField)       6.42 s
    ReadProductSerializer (CategoryRelatedField)     7.05 s
    ReadProductSerializer (nested)                   6.49 s
    ReadProductSerializer (PrimaryKeyRelatedField)   681 ms
    ReadProductSerializer (without any)              674 ms

    Note: response times may vary depending on your system.
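
Since single measurements vary from run to run, it can help to repeat a request a few times and look at the median. The helper below is a small standard-library sketch of that idea. The function name is hypothetical, and the workload in the demo is a stand-in for a call to the list endpoint.

```python
import statistics
import time


def median_seconds(func, repeats=5):
    """Call func() `repeats` times and return the median wall-clock time."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


if __name__ == "__main__":
    # Stand-in workload; in practice func would request the list endpoint.
    print(f"{median_seconds(lambda: sum(range(100_000))):.4f} s")
```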

    Let's fetch the data using caching:

    views.py

    from rest_framework.views import APIView
    from products.models import Category, Product
    from rest_framework import (
        viewsets,
        status,
    )
    from rest_framework.pagination import PageNumberPagination
    import time
    from .serializers import CategorySerializers, ProductSerializers, ReadProductSerializer
    
    from rest_framework.response import Response
    
    from django.core.cache import cache
    
    class ProductListApiView(APIView):
    
        def get(self, request):
            paginator = PageNumberPagination()
            paginator.page_size = 10
    
            # get products from cache if exists
            products = cache.get('products_data')
    
            #  if products does not exists on cache create it
            if not products:
                products = list(Product.objects.select_related('category'))
                cache.set('products_data', products, timeout=60 * 60)
    
            # paginating cache products
            result = paginator.paginate_queryset(products, request)
    
            serializer = ReadProductSerializer(result, many=True)
            return paginator.get_paginated_response(serializer.data)
    
    
    class ProductViewSet(viewsets.ViewSet):
    
        def create(self, request):
            serializer = ProductSerializers(data=request.data)
            serializer.is_valid(raise_exception=True)
            serializer.save()
    
            # get cache of products
            #  if exists
            #  delete cache
            for key in cache.keys('*'):
                if 'products_data' in key:
                    cache.delete(key)
            cache.delete("products_data")
    
            return Response(
                serializer.data, status=status.HTTP_201_CREATED)
    
        def retrieve(self, request, pk=None,):
            products = Product.objects.get(id=pk)
            serializer = ReadProductSerializer(products)
    
            return Response(
                serializer.data
            )
    
        def update(self, request, pk=None):
            products = Product.objects.get(id=pk)
            serializer = ProductSerializers(
                instance=products, data=request.data, partial=True)
            serializer.is_valid(raise_exception=True)
            serializer.save()
            for key in cache.keys('*'):
                if 'products_data' in key:
                    cache.delete(key)
            cache.delete("products_data")
            return Response(
                serializer.data, status=status.HTTP_202_ACCEPTED)
    
        def destroy(self, request, pk=None):
            products = Product.objects.get(id=pk)
            products.delete()
            for key in cache.keys('*'):
                if 'products_data' in key:
                    cache.delete(key)
            cache.delete("products_data")
            return Response(
                status=status.HTTP_204_NO_CONTENT
            )
    
    

    So, I have created a separate APIView and removed the list function from the viewset. The new view fetches the data from the cache and returns a paginated response.
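
The views above follow what is often called the cache-aside pattern: read from the cache, fall back to the database on a miss, and drop the cached entry whenever the data changes. Here is a self-contained sketch of that flow. TTLCache is a tiny in-memory stand-in for the Redis-backed cache, and the function names are hypothetical.

```python
import time


class TTLCache:
    """Tiny in-memory stand-in for a Redis-backed cache (illustration only)."""

    def __init__(self):
        self._store = {}

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            del self._store[key]  # expired: behave like a miss
            return None
        return value

    def set(self, key, value, timeout):
        self._store[key] = (value, time.monotonic() + timeout)

    def delete(self, key):
        self._store.pop(key, None)


def get_products(cache, load_from_db):
    """Read path: try the cache, fall back to the loader on a miss."""
    products = cache.get("products_data")
    if products is None:
        products = load_from_db()
        cache.set("products_data", products, timeout=60 * 60)
    return products


def save_product(cache, write_to_db):
    """Write path: change the database, then drop the stale cached list."""
    write_to_db()
    cache.delete("products_data")
```

Checking for None rather than truthiness means an empty product list is cached too, instead of hitting the database on every request.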
    Update your products/urls.py:

    
    from django.urls import path
    
    from .views import ProductListApiView, ProductViewSet
    
    urlpatterns = [
    
        path('products', ProductListApiView.as_view()),
    
        path("product", ProductViewSet.as_view(
            {"post": "create"})),
        path(
            "product/<str:pk>",
            ProductViewSet.as_view(
                {"get": "retrieve", "put": "update", "delete": "destroy"}),
        ),
    ]
    
    

    So, try it again with Postman using the different serializers.
    You should get response times between 90 and 200 ms, depending on your machine.
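
Part of the improvement probably comes from pagination itself: only 10 products are serialized per request instead of all 5000. The sketch below shows the slicing that page-number pagination boils down to. It is a simplified illustration with hypothetical names, not DRF's PageNumberPagination implementation.

```python
import math


def paginate(items, page, page_size=10):
    """Return one page of `items` plus simple paging metadata (1-based pages)."""
    total_pages = max(1, math.ceil(len(items) / page_size))
    if not 1 <= page <= total_pages:
        raise ValueError(f"page must be between 1 and {total_pages}")
    start = (page - 1) * page_size
    return {
        "count": len(items),
        "total_pages": total_pages,
        "results": items[start:start + page_size],
    }
```

With 5000 cached products and a page size of 10, page 2 holds the items at positions 10 to 19.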

    Note: in the APIView above I used select_related. Try removing it and run the request again in Postman; you will see different results.

    To learn more about queryset methods such as select_related and prefetch_related, see this link: N+1 Queries Problem
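
To make the N+1 problem concrete, here is a small standard-library sketch that counts queries. sqlite3 stands in for Postgres, and the table layout and helper names are simplified assumptions. The first function fetches each product's category separately, which is what happens when a related field is resolved row by row; the second uses a single JOIN, which is roughly what select_related("category") asks the database for.

```python
import sqlite3


class CountingConnection:
    """Wrap a sqlite3 connection and count the statements executed through it."""

    def __init__(self, conn):
        self.conn = conn
        self.queries = 0

    def execute(self, sql, params=()):
        self.queries += 1
        return self.conn.execute(sql, params)


def build_db(n_products=50, n_categories=5):
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE category (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute(
        "CREATE TABLE product ("
        "id INTEGER PRIMARY KEY, name TEXT, category_id INTEGER)"
    )
    conn.executemany(
        "INSERT INTO category (id, name) VALUES (?, ?)",
        [(i, f"cat-{i}") for i in range(1, n_categories + 1)],
    )
    conn.executemany(
        "INSERT INTO product (name, category_id) VALUES (?, ?)",
        [(f"prod-{i}", i % n_categories + 1) for i in range(n_products)],
    )
    conn.commit()
    return conn


def n_plus_one(db):
    # One query for the products, then one more per product for its category.
    rows = db.execute("SELECT name, category_id FROM product").fetchall()
    return [
        (name, db.execute(
            "SELECT name FROM category WHERE id = ?", (category_id,)
        ).fetchone()[0])
        for name, category_id in rows
    ]


def joined(db):
    # A single JOIN returns the same pairs in one round trip.
    return db.execute(
        "SELECT p.name, c.name FROM product p "
        "JOIN category c ON c.id = p.category_id ORDER BY p.id"
    ).fetchall()
```

Both functions return the same data; only the number of round trips to the database differs, and that gap grows with the number of products.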


    Lama / Scaling Django






    Final words:



    There is still plenty of room for improvement. It all depends on how, where, for what, and how much you use it.

    I hope you liked it... chao 👋👋