IGNY8 VPS (Salman) 6a4f95c35a docs re-org

2025-12-09 13:26:35 +00:00

Automation Module (Code-Sourced, Dec 2025)

Single canonical reference for IGNY8 automation (backend, frontend, and runtime behavior). Replaces all prior automation docs in this folder.

1) What Automation Does

Runs the 7-stage pipeline across Planner/Writer:
1. Keywords → Clusters (AI)
2. Clusters → Ideas (AI)
3. Ideas → Tasks (Local)
4. Tasks → Content (AI)
5. Content → Image Prompts (AI)
6. Image Prompts → Images (AI)
7. Manual Review Gate (Manual)
Per-site, per-account isolation. One run at a time per site; guarded by cache lock automation_lock_{site_id}.
Scheduling via Celery beat (automation.check_scheduled_automations); execution via Celery tasks (run_automation_task, resume_automation_task / continue_automation_task).
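The stage list and lock key above can be written down as plain data. This is only an illustrative sketch: the `Stage` and `StageKind` types and the `lock_key` helper are hypothetical names, not taken from the IGNY8 codebase; only the stage labels, their AI/Local/Manual kinds, and the `automation_lock_{site_id}` key format come from this document.

```python
from dataclasses import dataclass
from enum import Enum


class StageKind(Enum):
    AI = "ai"
    LOCAL = "local"
    MANUAL = "manual"


@dataclass(frozen=True)
class Stage:
    # Hypothetical container; the real code may model stages differently.
    number: int
    label: str
    kind: StageKind


PIPELINE = (
    Stage(1, "Keywords → Clusters", StageKind.AI),
    Stage(2, "Clusters → Ideas", StageKind.AI),
    Stage(3, "Ideas → Tasks", StageKind.LOCAL),
    Stage(4, "Tasks → Content", StageKind.AI),
    Stage(5, "Content → Image Prompts", StageKind.AI),
    Stage(6, "Image Prompts → Images", StageKind.AI),
    Stage(7, "Manual Review Gate", StageKind.MANUAL),
)


def lock_key(site_id: int) -> str:
    """Build the per-site cache lock key in the documented format."""
    return f"automation_lock_{site_id}"


def ai_stage_numbers() -> list:
    """Return the stage numbers that are marked (AI) in the list above."""
    return [stage.number for stage in PIPELINE if stage.kind is StageKind.AI]
```

For example, `lock_key(42)` yields `automation_lock_42`, so two sites never contend for the same lock.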

2) Backend API (behavior + payloads)

Base: /api/v1/automation/ (auth required; site must belong to user’s account).

GET config?site_id=: returns or creates config with enable flag, frequency (daily|weekly|monthly), scheduled_time, stage_1..6 batch sizes, delays (within_stage_delay, between_stage_delay), last_run_at, next_run_at.
PUT update_config?site_id=: same fields as above, updates in-place.
POST run_now?site_id=: starts a manual run; enqueues run_automation_task. Fails if a run is already active or lock exists.
GET current_run?site_id=: current running/paused run with status, current_stage, totals, and stage_1..7_result blobs (counts, credits, partial flags, skip reasons).
GET pipeline_overview?site_id=: per-stage status counts and “pending” numbers for UI cards.
GET current_processing?site_id=&run_id=: live processing snapshot for an active run; null if not running.
POST pause|resume|cancel?site_id=&run_id=: pause after current item; resume from saved current_stage; cancel after current item and stamp cancelled_at/completed_at.
GET history?site_id=: last 20 runs (id, status, trigger, timestamps, total_credits_used, current_stage).
GET logs?run_id=&lines=100: tail of the per-run activity log written by AutomationLogger.
GET estimate?site_id=: estimated_credits, current_balance, sufficient (balance >= 1.2x estimate).
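As a rough illustration of how a client might address these endpoints, the sketch below maps each endpoint name to its HTTP method and builds the request URL. The helper name `automation_request` and the host used in the usage note are assumptions; authentication and the actual HTTP call are deliberately left out, and the exact URL shape (for example trailing slashes) should be checked against the real router.

```python
from urllib.parse import urlencode, urljoin

BASE_PATH = "/api/v1/automation/"

# Endpoint name -> HTTP method, as listed in this section.
ENDPOINT_METHODS = {
    "config": "GET",
    "update_config": "PUT",
    "run_now": "POST",
    "current_run": "GET",
    "pipeline_overview": "GET",
    "current_processing": "GET",
    "pause": "POST",
    "resume": "POST",
    "cancel": "POST",
    "history": "GET",
    "logs": "GET",
    "estimate": "GET",
}


def automation_request(host: str, endpoint: str, **params) -> tuple:
    """Return (method, url) for an automation endpoint (hypothetical helper).

    Parameters with a value of None are dropped from the query string.
    """
    if endpoint not in ENDPOINT_METHODS:
        raise ValueError(f"unknown automation endpoint: {endpoint}")
    url = urljoin(host, BASE_PATH + endpoint)
    query = urlencode({k: v for k, v in params.items() if v is not None})
    return ENDPOINT_METHODS[endpoint], f"{url}?{query}" if query else url
```

With a placeholder host, `automation_request("https://example.test", "logs", run_id="abc", lines=100)` returns the method `GET` and a URL ending in `/api/v1/automation/logs?run_id=abc&lines=100`.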

Error behaviors:

Missing site_id/run_id → 400.
Site not in account → 404.
Run not found → 404 on run-specific endpoints.
Already running / lock held → 400 on run_now.

3) Data Model (runtime state)

AutomationConfig (one per site): enable flag, schedule (frequency, time), batch sizes per stage (1–6), delays (within-stage, between-stage), last_run_at, next_run_at.
AutomationRun: run_id, trigger_type (manual/scheduled), status (running/paused/cancelled/completed/failed), current_stage, timestamps (start/pause/resume/cancel/complete), total_credits_used, per-stage result JSON (stage_1_result … stage_7_result), error_message.
Activity logs: one file per run via AutomationLogger; streamed through the logs endpoint.
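The two records above could be pictured roughly as the dataclasses below. This is a simplified sketch, not the actual Django models: `is_enabled`, `batch_sizes`, `stage_results`, and `record_stage` are hypothetical names (the real model stores `stage_1_result` … `stage_7_result` as separate JSON fields), the delay unit is assumed to be seconds, and the timestamp fields on the run are omitted for brevity.

```python
from dataclasses import dataclass, field
from datetime import datetime, time
from enum import Enum
from typing import Any, Dict, Optional


class RunStatus(str, Enum):
    RUNNING = "running"
    PAUSED = "paused"
    CANCELLED = "cancelled"
    COMPLETED = "completed"
    FAILED = "failed"


@dataclass
class AutomationConfigSketch:
    site_id: int
    is_enabled: bool              # hypothetical name for the enable flag
    frequency: str                # "daily" | "weekly" | "monthly"
    scheduled_time: time
    batch_sizes: Dict[int, int]   # stage number (1..6) -> batch size
    within_stage_delay: int       # unit assumed to be seconds
    between_stage_delay: int      # unit assumed to be seconds
    last_run_at: Optional[datetime] = None
    next_run_at: Optional[datetime] = None


@dataclass
class AutomationRunSketch:
    run_id: str
    trigger_type: str             # "manual" | "scheduled"
    status: RunStatus = RunStatus.RUNNING
    current_stage: int = 1
    total_credits_used: int = 0
    stage_results: Dict[int, Dict[str, Any]] = field(default_factory=dict)
    error_message: Optional[str] = None

    def record_stage(self, stage: int, result: Dict[str, Any]) -> None:
        """Store a stage result blob and add its credits to the run total."""
        self.stage_results[stage] = result
        self.total_credits_used += result.get("credits_used", 0)
```

Stages that use no AI (such as stage 3) simply contribute no `credits_used` to the total in this sketch.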

4) How Execution Works (AutomationService)

Start: grabs cache lock automation_lock_{site_id}, estimates credits, enforces 1.2x balance check, creates AutomationRun and log file.
AI functions used: Stage 1 AutoClusterFunction; Stage 2 GenerateIdeasFunction; Stage 4 GenerateContentFunction; Stage 5 GenerateImagePromptsFunction; Stage 6 uses process_image_generation_queue (not the partial generate_images AI function).
Stage flow (per code):
- Stage 1 Keywords → Clusters: require ≥5 keywords (validate_minimum_keywords); batch by config; AIEngine clustering; records keywords_processed, clusters_created, batches, credits, time; skips if insufficient keywords.
- Stage 2 Clusters → Ideas: batch by config; AIEngine ideas; records ideas_created.
- Stage 3 Ideas → Tasks: local conversion of queued ideas to tasks; batches by config; no AI.
- Stage 4 Tasks → Content: batch by config; AIEngine content; records content count + word totals.
- Stage 5 Content → Image Prompts: batch by config; AIEngine generates image prompts and writes them to Images records (featured + in-article).
- Stage 6 Image Prompts → Images: uses process_image_generation_queue with provider/model from IntegrationSettings; updates Images status.
- Stage 7 Manual Review Gate: marks ready-for-review counts; no AI.
Control: each stage checks _check_should_stop (paused/cancelled); saves partial progress (counts, credits) before returning; resume continues from current_stage.
Credits: upfront estimate check (1.2x buffer) before starting; AIEngine per-call pre-checks and post-SAVE deductions; total_credits_used accumulates.
Locks: acquired on start; cleared on completion or failure; also cleared on fatal errors in tasks.
Errors: any unhandled exception marks run failed, sets error_message, logs error, clears lock; pipeline_overview/history reflect status.
Stage result fields (persisted):
- S1: keywords_processed, clusters_created, batches_run, credits_used, skipped/partial flags, time_elapsed.
- S2: clusters_processed, ideas_created, batches_run, credits_used.
- S3: ideas_processed, tasks_created, batches_run.
- S4: tasks_processed, content_created, total_words, batches_run, credits_used.
- S5: content_processed, prompts_created, batches_run, credits_used.
- S6: images_processed, images_generated, batches_run.
- S7: ready_for_review counts.
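The control behavior described above (check for a stop request before each item, keep partial counts, never abort mid-item) can be sketched as a small loop. The function and parameter names here are hypothetical and the real `_check_should_stop` reads the run's status rather than taking a callback; the result keys loosely mirror the documented stage result fields.

```python
from typing import Callable, Dict, List, Union


def run_stage(
    items: List[str],
    process_item: Callable[[str], int],
    should_stop: Callable[[], bool],
) -> Dict[str, Union[int, bool]]:
    """Process items one by one, stopping between items when asked.

    process_item returns the credits spent on that item. If should_stop()
    becomes true, the loop exits before the next item and the counts
    gathered so far are returned with partial=True.
    """
    processed = 0
    credits_used = 0
    partial = False
    for item in items:
        if should_stop():
            # Pause/cancel takes effect only between items.
            partial = True
            break
        credits_used += process_item(item)
        processed += 1
    return {"processed": processed, "credits_used": credits_used, "partial": partial}
```

Because the stop check happens only between items, an item that has already started always finishes, which matches the "no mid-item abort" limitation noted later in this document.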

Batching & delays:

Configurable per site: the stage_1..6 batch sizes control how many items go into each batch; within_stage_delay pauses between batches of the same stage; between_stage_delay pauses between stages.
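A minimal sketch of batching with a within-stage delay follows. It assumes the delay is expressed in seconds and is applied only between consecutive batches (not before the first or after the last); neither detail is confirmed by this document, and `chunked` and `process_in_batches` are hypothetical helper names.

```python
import time
from typing import Callable, Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def chunked(items: Sequence[T], batch_size: int) -> Iterator[List[T]]:
    """Yield consecutive slices of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield list(items[start:start + batch_size])


def process_in_batches(
    items: Sequence[T],
    batch_size: int,
    within_stage_delay: float,
    handle_batch: Callable[[List[T]], None],
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Handle items batch by batch, sleeping between batches.

    Returns the number of batches run. The sleep function is injectable
    so the pacing can be observed without actually waiting.
    """
    batches_run = 0
    for batch in chunked(items, batch_size):
        if batches_run:
            sleep(within_stage_delay)
        handle_batch(batch)
        batches_run += 1
    return batches_run
```

With five items and a batch size of 2, this runs three batches and sleeps twice.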

Scheduling:

check_scheduled_automations runs hourly. For each site it respects the configured frequency and scheduled time, skips the site if last_run_at falls within the ~23h guard window or if a run is already active, then sets next_run_at and starts run_automation_task.
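The due-check can be approximated as below for the daily case only. This is a sketch under assumptions: it treats the scheduled time as an hour of day matched against the hourly tick, interprets the ~23h guard as "do not start again if the last run began less than 23 hours ago", and does not model weekly or monthly rules because this document does not spell them out. `is_due` is a hypothetical name.

```python
from datetime import datetime, timedelta
from typing import Optional

# Assumed reading of the "~23h guard" on last_run_at.
MIN_GAP = timedelta(hours=23)


def is_due(
    now: datetime,
    enabled: bool,
    scheduled_hour: int,
    last_run_at: Optional[datetime],
    run_active: bool,
) -> bool:
    """Decide whether a daily automation should start on this hourly tick."""
    if not enabled or run_active:
        return False
    if now.hour != scheduled_hour:
        return False
    if last_run_at is not None and now - last_run_at < MIN_GAP:
        return False
    return True
```

A guard slightly shorter than 24 hours tolerates small drift in when the hourly task fires, so a run that started a few minutes late yesterday does not block today's run.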

Celery execution:

run_automation_task runs stages 1→7 sequentially for a run_id; failures mark run failed and clear lock.
resume_automation_task / continue_automation_task continue from saved current_stage.
Workers need access to cache (locks) and IntegrationSettings (models/providers).
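The lock and failure handling described here can be sketched with an in-memory stand-in for the cache. The `InMemoryCache` class and `run_pipeline` function are hypothetical; the real tasks use the project's cache backend and Celery, and whether they rely on an atomic add-if-absent operation is an assumption. The sketch only shows the shape: refuse to start when the lock is held, run stages in order, mark the run failed on any exception, and always clear the lock.

```python
from typing import Callable, Dict, List, Optional


class InMemoryCache:
    """Tiny stand-in for a cache backend with add-if-absent semantics."""

    def __init__(self) -> None:
        self._data: Dict[str, str] = {}

    def add(self, key: str, value: str) -> bool:
        # Store only if the key is absent; report whether it was stored.
        if key in self._data:
            return False
        self._data[key] = value
        return True

    def delete(self, key: str) -> None:
        self._data.pop(key, None)


def run_pipeline(
    cache: InMemoryCache, site_id: int, stages: List[Callable[[], None]]
) -> Dict[str, Optional[object]]:
    """Run stages sequentially under a per-site lock (illustrative only)."""
    key = f"automation_lock_{site_id}"
    if not cache.add(key, "locked"):
        return {"status": "rejected", "current_stage": None, "error_message": "lock held"}
    run: Dict[str, Optional[object]] = {
        "status": "running", "current_stage": 0, "error_message": None,
    }
    try:
        for number, stage in enumerate(stages, start=1):
            run["current_stage"] = number
            stage()
        run["status"] = "completed"
    except Exception as exc:
        # Any unhandled error marks the run failed and records the message.
        run["status"] = "failed"
        run["error_message"] = str(exc)
    finally:
        # The lock is cleared on completion and on failure alike.
        cache.delete(key)
    return run
```

Releasing the lock in `finally` is what prevents a crashed run from blocking the site's next `run_now` call.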

Image pipeline specifics:

Stage 5 writes prompts to Images (featured + ordered in-article).
Stage 6 generates images via the process_image_generation_queue helper; the generate_images AI function remains partial/broken and is not used by automation.

5) Frontend Behavior (AutomationPage)

Route: /automation.
What the user can do: run now, pause, resume, cancel; edit config (enable/schedule, batch sizes, delays); view activity log; view history; watch live processing card and pipeline cards update.
Polling: while a run is running or paused, the page polls current_run, pipeline_overview, metrics, and current_processing every ~5s; polling is lighter when idle.
Metrics: fetched via low-level endpoints (keywords/clusters/ideas/tasks/content/images) for authoritative counts.
States shown: running, paused, cancelled, failed, completed; processing card shown when a run exists; pipeline cards use “pending” counts from pipeline_overview.
Activity log: pulled from logs endpoint; shown in UI for live tailing.

6) Configuration & Dependencies

Needs IntegrationSettings for AI models and image providers (OpenAI/Runware).
Requires Celery beat and workers; cache backend required for locks.
Tenant scoping everywhere: site + account filtering on all automation queries.

7) Known Limitations and Gaps

generate_images AI function is partial/broken; automation uses queue helper instead.
Pause/Cancel stop after the current item; no mid-item abort.
Batch defaults are conservative (e.g., stage_2=1, stage_4=1); tune per site for throughput.
Stage 7 is manual; no automated review step.
No automated test suite observed for the automation pipeline (stage transitions, pause/resume/cancel, scheduling guards, credit estimation/deduction).
Enhancements to consider: fix or replace generate_images; add mid-item abort; surface lock status/owner; broaden batch defaults after validation; add operator-facing doc in app; add tests.

8) Field/Behavior Quick Tables

Pipeline “pending” definitions (pipeline_overview)

Stage 1: Keywords with status new, cluster is null, not disabled.
Stage 2: Clusters status new, not disabled, with no ideas.
Stage 3: ContentIdeas status new.
Stage 4: Tasks status queued.
Stage 5: Content status draft with zero images.
Stage 6: Images status pending.
Stage 7: Content status review.

Stage result fields (stored on AutomationRun)

S1: keywords_processed, clusters_created, batches_run, credits_used, skipped, partial, time_elapsed.
S2: clusters_processed, ideas_created, batches_run, credits_used.
S3: ideas_processed, tasks_created, batches_run.
S4: tasks_processed, content_created, total_words, batches_run, credits_used.
S5: content_processed, prompts_created, batches_run, credits_used.
S6: images_processed, images_generated, batches_run.
S7: ready_for_review.

Credit handling

Pre-run: estimate_credits * 1.2 vs account.credits (fails if insufficient).
Per AI call: AIEngine pre-check credits; post-SAVE deduction with cost/tokens tracked; total_credits_used aggregates deductions.
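The pre-run check is simple arithmetic: the balance must cover the estimate plus a 20% buffer. A small sketch, assuming credits are whole numbers and using `Decimal` to keep the 1.2 multiplier exact; the function name `check_credits` is hypothetical, while the returned keys follow the estimate endpoint's documented fields.

```python
from decimal import Decimal

BUFFER = Decimal("1.2")  # the documented 1.2x safety margin


def check_credits(estimated_credits: int, current_balance: int) -> dict:
    """Return the estimate payload with a sufficiency flag."""
    required = Decimal(estimated_credits) * BUFFER
    return {
        "estimated_credits": estimated_credits,
        "current_balance": current_balance,
        "sufficient": Decimal(current_balance) >= required,
    }
```

For an estimate of 100 credits the threshold is 120: a balance of 120 passes and a balance of 119 does not.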

Logging

Per-run log file via AutomationLogger; accessed with GET logs?run_id=&lines=; includes stage start/progress/errors and batch info.
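Returning the tail of a per-run log file, as the logs endpoint does with its `lines` parameter, could look like the sketch below. The `tail_lines` helper is hypothetical and says nothing about where AutomationLogger actually writes its files or how it encodes them (UTF-8 is assumed here).

```python
from collections import deque
from typing import List


def tail_lines(path: str, lines: int = 100) -> List[str]:
    """Return the last `lines` lines of a text file, without newlines.

    A bounded deque keeps only the most recent lines while streaming the
    file, so memory use does not grow with the log size.
    """
    with open(path, encoding="utf-8") as handle:
        return [line.rstrip("\n") for line in deque(handle, maxlen=lines)]
```

The default of 100 mirrors the `lines=100` shown in the endpoint listing.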

Polling (frontend)

Active run: ~5s cadence for current_run, pipeline_overview, metrics, current_processing, logs tail.
Idle: lighter polling (current_run/pipeline_overview) to show readiness and pending counts.
