igny8/docs/10-BACKEND/automation/SCHEDULER.md
IGNY8 VPS (Salman) 6a4f95c35a docs re-org
2025-12-09 13:26:35 +00:00

Automation Scheduler

Purpose

Describes how scheduled automation runs are detected, triggered, and resumed using Celery tasks and automation configs.

Code Locations (exact paths)

  • Celery tasks: backend/igny8_core/business/automation/tasks.py
  • Models: backend/igny8_core/business/automation/models.py
  • Service invoked: backend/igny8_core/business/automation/services/automation_service.py

High-Level Responsibilities

  • Periodically scan enabled automation configs to start scheduled runs.
  • Prevent overlapping runs per site via cache locks and active run checks.
  • Resume paused runs from their recorded stage.

Detailed Behavior

  • check_scheduled_automations (Celery, hourly):
    • Iterates AutomationConfig with is_enabled=True.
    • Frequency rules:
      • daily: run when current hour matches scheduled_time.hour.
      • weekly: run Mondays at the scheduled hour.
      • monthly: run on the 1st of the month at the scheduled hour.
    • Skips if last_run_at is within ~23 hours or if an AutomationRun with status='running' exists for the site.
    • On trigger: instantiates AutomationService(account, site), calls start_automation(trigger_type='scheduled'), updates last_run_at and next_run_at (via _calculate_next_run), saves config, and enqueues run_automation_task.delay(run_id).
    • Exceptions are logged per site; lock release is handled by the service on failure paths.
  • run_automation_task:
    • Loads service via from_run_id, then runs stages 1 through 7 sequentially.
    • On exception: marks run failed, records error/completed_at, and deletes site lock.
  • resume_automation_task / alias continue_automation_task:
    • Loads service via from_run_id, uses current_stage to continue remaining stages.
    • On exception: marks run failed, records error/completed_at.
  • _calculate_next_run:
    • Computes next run datetime based on frequency and scheduled_time, resetting seconds/microseconds; handles month rollover for monthly frequency.
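
The frequency rules and the next-run computation described above can be sketched in plain Python. This is an illustrative sketch, not the code in tasks.py: the function names is_due and calculate_next_run, the MIN_GAP constant, and the choice to pass now in as a parameter are assumptions made so the example runs without Django.

```python
from datetime import datetime, time, timedelta
from typing import Optional

MIN_GAP = timedelta(hours=23)  # the "~23 hours" guard; exact value is an assumption


def is_due(frequency: str, scheduled_time: time, now: datetime,
           last_run_at: Optional[datetime] = None) -> bool:
    """Return True when an hourly check at `now` should start a run."""
    # Skip when the previous scheduled run is too recent.
    if last_run_at is not None and now - last_run_at < MIN_GAP:
        return False
    # Every frequency requires the current hour to match the scheduled hour.
    if now.hour != scheduled_time.hour:
        return False
    if frequency == "daily":
        return True
    if frequency == "weekly":
        return now.weekday() == 0  # Monday
    if frequency == "monthly":
        return now.day == 1
    return False  # unknown frequency: never trigger


def calculate_next_run(frequency: str, scheduled_time: time,
                       now: datetime) -> datetime:
    """Next run strictly after `now`, with seconds and microseconds reset."""
    candidate = now.replace(hour=scheduled_time.hour,
                            minute=scheduled_time.minute,
                            second=0, microsecond=0)
    if frequency == "daily":
        if candidate <= now:
            candidate += timedelta(days=1)
        return candidate
    if frequency == "weekly":
        days_ahead = (0 - candidate.weekday()) % 7  # days until Monday
        candidate += timedelta(days=days_ahead)
        if candidate <= now:
            candidate += timedelta(days=7)
        return candidate
    if frequency == "monthly":
        candidate = candidate.replace(day=1)
        if candidate <= now:
            # Month rollover: December wraps to January of the next year.
            year = candidate.year + (candidate.month == 12)
            month = candidate.month % 12 + 1
            candidate = candidate.replace(year=year, month=month)
        return candidate
    raise ValueError(f"unsupported frequency: {frequency}")
```

Keeping the two functions side by side makes it easier to see that a new frequency needs a branch in both. The active-run check (an AutomationRun with status='running') is left out here because it needs database access.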

Data Structures / Models Involved (no code)

  • AutomationConfig: contains schedule fields (frequency, scheduled_time, last_run_at, next_run_at, is_enabled).
  • AutomationRun: records run status/stage used during resume/failure handling.

Execution Flow

  1. Celery beat (or cron) invokes check_scheduled_automations hourly.
  2. Eligible configs spawn new runs via AutomationService.start_automation (includes lock + credit check).
  3. run_automation_task executes the pipeline asynchronously.
  4. Paused runs can be resumed by enqueueing resume_automation_task/continue_automation_task, which restart at current_stage.
  5. Failures set run status to failed and release locks.
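
Steps 3 to 5 can be illustrated with a small stand-alone model of the stage loop. It is a sketch under stated assumptions, not the service code: the Run dataclass stands in for AutomationRun, the lock_key format and the set used as a lock store are hypothetical, and resetting status to "running" on entry is assumed rather than documented.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Callable, Dict, Optional, Set

TOTAL_STAGES = 7


def lock_key(site_id: int) -> str:
    return f"automation_lock_{site_id}"  # hypothetical key format


@dataclass
class Run:  # hypothetical stand-in for AutomationRun
    site_id: int
    status: str = "running"
    current_stage: int = 1
    error: Optional[str] = None
    completed_at: Optional[datetime] = None


def execute_remaining_stages(run: Run,
                             stages: Dict[int, Callable[[Run], None]],
                             locks: Set[str]) -> Run:
    """Run stages from run.current_stage through stage 7."""
    run.status = "running"  # assumed: a paused run becomes running again
    try:
        for number in range(run.current_stage, TOTAL_STAGES + 1):
            # Record the stage first so a later resume knows where to start.
            run.current_stage = number
            stages[number](run)
    except Exception as exc:
        # Failure path: mark failed, record the error, release the site lock.
        run.status = "failed"
        run.error = str(exc)
        run.completed_at = datetime.now(timezone.utc)
        locks.discard(lock_key(run.site_id))
    else:
        run.status = "completed"
    return run
```

A fresh run and a resumed run go through the same function; the only difference is the value of current_stage on entry. Lock handling on successful completion is not modelled because the text above only describes the failure path.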

Cross-Module Interactions

  • Uses planner/writer data inside the pipeline (see pipeline doc); billing/credits enforced at start.
  • Locking is done via Django cache, independent of other modules but prevents concurrent Celery runs per site.
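
A per-site lock of this kind is commonly built on Django's cache.add, which stores a key only if it is not already present and returns whether it was stored. The sketch below shows the pattern with a small in-memory stand-in so it runs without Django; the key format, the lock value, and LOCK_TIMEOUT_SECONDS are assumptions, not values taken from the service.

```python
import time
from typing import Any, Dict, Tuple


class InMemoryCache:
    """Stand-in offering add/delete with the semantics of Django's cache API."""

    def __init__(self) -> None:
        self._data: Dict[str, Tuple[Any, float]] = {}

    def add(self, key: str, value: Any, timeout: float) -> bool:
        # Store only if the key is absent or expired; report whether it was stored.
        entry = self._data.get(key)
        if entry is not None and entry[1] > time.monotonic():
            return False
        self._data[key] = (value, time.monotonic() + timeout)
        return True

    def delete(self, key: str) -> None:
        self._data.pop(key, None)


LOCK_TIMEOUT_SECONDS = 6 * 60 * 60  # assumed expiry so a crashed run cannot hold the lock forever


def site_lock_key(site_id: int) -> str:
    return f"automation_lock_{site_id}"  # hypothetical key format


def acquire_site_lock(cache: InMemoryCache, site_id: int) -> bool:
    """Return True if this caller now holds the lock for the site."""
    return cache.add(site_lock_key(site_id), "locked", LOCK_TIMEOUT_SECONDS)


def release_site_lock(cache: InMemoryCache, site_id: int) -> None:
    cache.delete(site_lock_key(site_id))
```

With Django installed, passing django.core.cache.cache in place of InMemoryCache should work unchanged, since it exposes add(key, value, timeout) and delete(key). Locks are keyed by site, so runs for different sites never block each other.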

State Transitions

  • Config timestamps (last_run_at, next_run_at) update on scheduled launch.
  • Run status changes to failed on task exceptions; to completed at stage 7; to paused/cancelled via API.

Error Handling

  • Scheduled start is skipped with log messages if recently run or already running.
  • Exceptions during run execution mark the run failed, record error message, set completed_at, and release the cache lock.

Tenancy Rules

  • Configs and runs are site- and account-scoped; scheduler uses stored account/site from the config; no cross-tenant scheduling.

Billing Rules

  • Start uses AutomationService.start_automation, which enforces credit sufficiency before scheduling the Celery execution.

Background Tasks / Schedulers

  • Hourly check_scheduled_automations plus the long-running run_automation_task and resume tasks run in Celery workers.
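
The hourly check has to be registered with Celery beat. A minimal settings fragment might look like the following, assuming Celery is configured from Django settings with the CELERY namespace; the dotted task name and the schedule entry name are guesses based on the file path listed under Code Locations and may differ from how the task is actually registered.

```python
from datetime import timedelta

# Hypothetical dotted task name; the real one depends on how the task is registered.
CHECK_TASK = "igny8_core.business.automation.tasks.check_scheduled_automations"

CELERY_BEAT_SCHEDULE = {
    "check-scheduled-automations": {  # entry name is arbitrary
        "task": CHECK_TASK,
        "schedule": timedelta(hours=1),  # one check per hour
    },
}
```

An interval schedule counts from when beat starts, so the check may fire at any minute of the hour. Using celery.schedules.crontab(minute=0) as the schedule instead would pin it to the top of each hour.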

Key Design Considerations

  • An hourly scan with coarse, hour-level matching keeps the implementation simple while honoring per-site schedules.
  • Cache lock and active-run checks prevent double-starts from overlapping schedules or manual triggers.
  • Resume task reuses the same stage methods to keep behavior consistent between fresh and resumed runs.

How Developers Should Work With This Module

  • When adding new frequencies, extend check_scheduled_automations and _calculate_next_run consistently.
  • Ensure Celery beat (or an equivalent scheduler) runs check_scheduled_automations hourly in production.
  • Preserve lock acquisition and failure handling when modifying task flows to avoid orphaned locks.