igny8/master-docs/50-infra/SCALING-AND-LOAD-BALANCING.md
IGNY8 VPS (Salman) 3cbed65601 revamps docs complete
2025-12-07 14:14:29 +00:00

Scaling and Load Balancing

Purpose

Outline how services can be scaled based on the existing Docker/Gunicorn/Celery configuration and external dependencies.

Code Locations (exact paths)

  • Compose stack: docker-compose.app.yml
  • Backend process model: docker-compose.app.yml (Gunicorn command), backend/Dockerfile
  • Celery worker/beat commands: docker-compose.app.yml
  • Celery settings: backend/igny8_core/settings.py (Celery config)

High-Level Responsibilities

  • Describe horizontal/vertical scaling levers for backend and workers.
  • Note reliance on external Postgres/Redis and shared network.

Detailed Behavior

  • The backend container runs Gunicorn with `--workers 4 --timeout 120` (from compose). Scaling options:
    • Increase the worker count via the compose command args.
    • Run additional backend containers on the same igny8_net behind a reverse proxy (the proxy is not defined in this repo; compose assumes an external Caddy/infra stack handles routing).
  • Celery:
    • Worker command: `celery -A igny8_core worker --loglevel=info --concurrency=4`; concurrency can be raised per container, or more worker replicas can be added.
    • Beat runs separately to schedule tasks.
    • Broker/backend: Redis from the external infra stack; settings enforce a prefetch multiplier of 1 and task soft/hard time limits of 25/30 minutes.
  • Frontend/marketing/sites:
    • Served via dev servers in the current compose; for production, build static assets (the frontend Dockerfile uses Caddy). Scale by running additional containers behind the proxy.
  • Network:
    • igny8_net is external; a load balancer (e.g., Caddy in the infra stack) is not defined in this repo but must front multiple backend/frontend replicas if they are added.
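
The vertical-scaling levers above can be sketched as a compose override. This is illustrative only: the exact Gunicorn invocation (WSGI module, bind address) and the service names are assumptions and must be taken from docker-compose.app.yml.

```yaml
# docker-compose.override.yml — illustrative sketch; service names and
# the Gunicorn command must match docker-compose.app.yml.
services:
  backend:
    # Vertical scaling: raise the Gunicorn worker count from 4 to 8,
    # keeping the existing 120s timeout.
    command: gunicorn igny8_core.wsgi:application --bind 0.0.0.0:8000 --workers 8 --timeout 120
  celery-worker:
    # Vertical scaling: more concurrent task slots per worker container.
    command: celery -A igny8_core worker --loglevel=info --concurrency=8
```

Apply with `docker compose -f docker-compose.app.yml -f docker-compose.override.yml up -d`; remember that each additional Gunicorn worker and Celery slot opens more Postgres/Redis connections.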

Data Structures / Models Involved (no code)

  • None; operational scaling only.

Execution Flow

  • Scaling out: start additional containers from the same images on igny8_net; configure the external proxy to round-robin across backend instances.
  • Scaling up: raise the Gunicorn worker count or Celery concurrency in compose overrides.

Cross-Module Interactions

  • The Celery workload includes automation/AI tasks; ensure Redis/Postgres are sized accordingly when increasing concurrency.
  • Request ID and resource-tracking middleware remain per-instance; logs aggregate by container.

State Transitions (if applicable)

  • New replicas join the network immediately; no shared session storage is configured (the APIs use stateless JWT auth), so backend replicas are safe behind a load balancer.

Error Handling

  • If the backend is not reachable, its healthcheck fails and depends_on blocks the frontend from starting.
  • Celery tasks exceeding the configured time limits are terminated per settings.
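
The healthcheck/depends_on gating described above typically looks like the following in compose. This is a sketch: the health endpoint path, port, and interval values are assumptions, not the values in docker-compose.app.yml.

```yaml
services:
  backend:
    healthcheck:
      # Assumed health endpoint; adjust to the backend's real route.
      test: ["CMD", "curl", "-f", "http://localhost:8000/health/"]
      interval: 30s
      timeout: 5s
      retries: 3
  frontend:
    depends_on:
      backend:
        # Frontend start is blocked until the backend reports healthy,
        # not merely until its container is started.
        condition: service_healthy
```

When scaling the backend out, each replica carries its own healthcheck, so the proxy (or orchestrator) can drop unhealthy instances individually.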

Tenancy Rules

  • Unchanged by scaling; tenancy enforced per request in each instance.

Billing Rules (if applicable)

  • None; credit deductions occur in application code regardless of scale.

Background Tasks / Schedulers (if applicable)

  • Celery beat should remain the single active scheduler; if multiple beat instances must run, add an external lock (not implemented) to avoid duplicate schedules.
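
One way to keep the scheduler singular is to pin the beat service to a single replica in compose (a sketch; the service name is an assumption):

```yaml
services:
  celery-beat:
    # Exactly one beat instance. Running more would enqueue every
    # scheduled task once per instance, since no external lock exists.
    deploy:
      replicas: 1
```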

Key Design Considerations

  • Ensure reverse proxy/ingress (outside this repo) balances across backend replicas and terminates TLS.
  • Keep Redis/Postgres highly available and sized for additional connections when scaling workers/backends.

How Developers Should Work With This Module

  • Use a compose override or Portainer to adjust worker counts; validate resource limits after changes.
  • Avoid running multiple Celery beat instances unless coordination is added.
  • When introducing production-ready load balancing, add the proxy config to the infra repo and keep ports consistent with compose.