# AI Automation Pipeline - Complete Implementation Plan

**Version:** 2.0
**Date:** December 3, 2025
**Scope:** Site-level automation orchestrating existing AI functions

---

## 🎯 CORE ARCHITECTURE DECISIONS

### Decision 1: Site-Level Automation (NO Sector)

**Rationale:**
- User manages automation per website, not per topic/sector
- Simpler UX: a single site selector at the top of the page
- Database queries filter by `site_id` only (no `sector_id` filtering)
- Content naturally spans multiple sectors within a site
- One automation schedule per site (not per site/sector combination)

**Implementation:**
- Remove the sector dropdown from the automation page UI
- AutomationRun model: remove the sector foreign key
- AutomationConfig model: one config per site (not per site+sector)
- All stage database queries: `.filter(site=site)` (no sector filter)

---

### Decision 2: Single Global Automation Page

**Why:**
- Complete pipeline visibility in one place (Keywords β†’ Draft Content)
- Configure one schedule for the entire lifecycle
- See exactly where the pipeline is stuck or running
- Cleaner UX: no jumping between module pages

**Location:** `/automation` (new route below Sites in the sidebar)

---

### Decision 3: Strictly Sequential Stages (Never Parallel)

**Critical Principle:**
- Stage N+1 ONLY starts when Stage N is 100% complete
- Within each stage: process items in batches, sequentially
- Hard stop between stages to verify completion
- Only ONE stage active at a time per site

**Example Flow:**
```
Stage 1 starts β†’ processes ALL batches β†’ completes 100%
        ↓ (trigger next)
Stage 2 starts β†’ processes ALL batches β†’ completes 100%
        ↓ (trigger next)
Stage 3 starts β†’ ...
```

**Never:**
- Run stages in parallel
- Start the next stage while the current stage has pending items
- Skip verification between stages

---

### Decision 4: Automation Stops Before Publishing

**Manual Review Gate (Stage 7):**
- Automation ends when content reaches `status='draft'` with all images generated
- User manually reviews content quality, accuracy, and brand voice
- User manually publishes via the existing bulk actions on the Content page
- No automated WordPress publishing (requires human oversight)

**Rationale:**
- Content quality control needed
- Publishing has real consequences (public-facing)
- Legal/compliance review may be required
- Brand voice verification essential

---

## πŸ“Š EXISTING AI FUNCTIONS (Zero Duplication)

```
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ πŸ€– AI AUTOMATION PIPELINE                                  β”‚
β”‚ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━   β”‚
β”‚                                                            β”‚
β”‚ ⏰ SCHEDULE                                                 β”‚
β”‚ Next Run: Tomorrow at 2:00 AM (in 16 hours)                β”‚
β”‚ Frequency: [Daily β–Ό] at [02:00 β–Ό]                          β”‚
β”‚ Status: ● Scheduled                                        β”‚
β”‚                                                            β”‚
β”‚ [Run Now] [Pause Schedule] [Configure]                     β”‚
β”‚                                                            β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚ πŸ“Š PIPELINE OVERVIEW                                       β”‚
β”‚                                                            β”‚
β”‚ Keywords ──→ Clusters ──→ Ideas ──→ Tasks ──→ Content      β”‚
β”‚ 47           pending      42        20        generating   β”‚
β”‚ pending      Stage 1      ready     queued    Stage 5      β”‚
β”‚                                                            β”‚
β”‚ Overall Progress: ━━━━━━━╸ 62% (Stage 5/7)                 β”‚
β”‚ Estimated Completion: 2 hours 15 minutes                   β”‚
β”‚                                                            β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ STAGE 1: Keywords β†’ Clusters (AI)                          β”‚
β”‚ Status: βœ“ Completed                                        β”‚
β”‚ β€’ Processed: 60 keywords β†’ 8 clusters                      β”‚
β”‚ β€’ Time: 2m 30s | Credits: 12                               β”‚
β”‚ [View Details] [Retry Failed]                              β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ STAGE 2: Clusters β†’ Ideas (AI)                             β”‚
β”‚ Status: βœ“ Completed                                        β”‚
β”‚ β€’ Processed: 8 clusters β†’ 56 ideas                         β”‚
β”‚ β€’ Time: 8m 15s | Credits: 16                               β”‚
β”‚ [View Details]                                             β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ STAGE 3: Ideas β†’ Tasks (Local Queue)                       β”‚
β”‚ Status: βœ“ Completed                                        β”‚
β”‚ β€’ Processed: 42 ideas β†’ 42 tasks                           β”‚
β”‚ β€’ Time: Instant | Credits: 0                               β”‚
β”‚ [View Details]                                             β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ STAGE 4: Tasks β†’ Content (AI)                              β”‚
β”‚ Status: ● Processing (Task 3/20)                           β”‚
β”‚ β€’ Current: "Ultimate Coffee Bean Guide" ━━━━╸ 65%          β”‚
β”‚ β€’ Progress: 2 completed, 1 processing, 17 queued           β”‚
β”‚ β€’ Time: 45m elapsed | Credits: 38 used                     β”‚
β”‚ β€’ ETA: 1h 30m remaining                                    β”‚
β”‚ [View Details] [Pause Stage]                               β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ STAGE 5: Content β†’ Image Prompts (AI)                      β”‚
β”‚ Status: ⏸ Waiting (Stage 4 must complete)                  β”‚
β”‚ β€’ Pending: 2 content pieces ready for prompts              β”‚
β”‚ β€’ Queue: Will process when Stage 4 completes               β”‚
β”‚ [View Details] [Trigger Now]                               β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ STAGE 6: Image Prompts β†’ Generated Images (AI)             β”‚
β”‚ Status: ⏸ Waiting                                          β”‚
β”‚ β€’ Pending: 0 prompts ready                                 β”‚
β”‚ [View Details]                                             β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ STAGE 7: Content β†’ Review (Manual Gate) 🚫 STOPS HERE      β”‚
β”‚ Status: ⏸ Awaiting Manual Review                           β”‚
β”‚ β€’ Ready for Review: 2 content pieces                       β”‚
β”‚ β€’ Note: Automation stops here. User reviews manually.      β”‚
β”‚ [Go to Review Page]                                        β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

πŸ“‹ LIVE ACTIVITY LOG (Last 50 events)
β”œβ”€ 14:23:45 - Stage 4: Started content generation for Task 3
β”œβ”€ 14:24:12 - Stage 4: Writing sections (65% complete)
β”œβ”€ 14:22:30 - Stage 4: Completed Task 2 β†’ Content created
β”œβ”€ 14:20:15 - Stage 4: Started content generation for Task 2
β”œβ”€ 14:18:45 - Stage 4: Completed Task 1 β†’ Content created
└─ [View Full Log]

πŸ’° TOTAL CREDITS USED THIS RUN: 66 credits
```

**All 6 AI Functions Already Exist and Work:**

| Function | File Location | Input | Output | Credits | Status |
|----------|---------------|-------|--------|---------|--------|
| **auto_cluster** | `ai/functions/auto_cluster.py` | Keyword IDs (max 20) | Clusters created | 1 per 5 keywords | βœ… Working |
| **generate_ideas** | `ai/functions/generate_ideas.py` | Cluster IDs (max 5) | Ideas created | 2 per cluster | βœ… Working |
| **bulk_queue_to_writer** | `modules/planner/views.py` (line 1014) | Idea IDs | Tasks created | 0 (local) | βœ… Working |
| **generate_content** | `ai/functions/generate_content.py` | Task IDs (1 at a time) | Content draft | 1 per 500 words | βœ… Working |
| **generate_image_prompts** | `ai/functions/generate_image_prompts.py` | Content IDs | Image prompts | 0.5 per prompt | βœ… Working |
| **generate_images** | `ai/functions/generate_images.py` | Image prompt IDs | Generated images | 1-4 per image | βœ… Working |

---

### 🚫 WHAT AI FUNCTIONS ALREADY DO (DO NOT DUPLICATE)

**Credit Management** (Fully Automated in `ai/engine.py`):
```python
# Line 395 in AIEngine.execute():
CreditService.deduct_credits_for_operation(
    account=account,
    operation_type=self._get_operation_type(),
    amount=self._get_actual_amount(),
    ...
)
```
- βœ… Credits are AUTOMATICALLY deducted after a successful save
- βœ… Credit calculation happens in `_get_actual_amount()` and `_get_operation_type()`
- ❌ Automation does NOT need to call `CreditService` manually
- ❌ Automation does NOT need to calculate credit costs

**Status Updates** (Handled Inside AI Functions):
- βœ… Keywords: `status='new'` β†’ `status='mapped'` (in auto_cluster save_output)
- βœ… Clusters: created with `status='new'` (in auto_cluster save_output)
- βœ… Ideas: `status='new'` β†’ `status='queued'` (in bulk_queue_to_writer)
- βœ… Tasks: created with `status='queued'`, β†’ `status='completed'` (in generate_content)
- βœ… Content: created with `status='draft'`, β†’ `status='review'` ONLY when all images complete (ai/tasks.py line 723)
- βœ… Images: `status='pending'` β†’ `status='generated'` (in generate_images save_output)
- ❌ Automation does NOT update these statuses directly

**Progress Tracking** (Event-Based System Already Exists):
- βœ… `StepTracker` and `ProgressTracker` emit real-time events during AI execution
- βœ… Each AI function has 6 phases: `INIT`, `PREP`, `AI_CALL`, `PARSE`, `SAVE`, `DONE`
- βœ… Phase descriptions are available in function metadata: `get_metadata()`
- ❌ Automation does NOT need to poll progress every 2 seconds
- ❌ Automation listens to existing phase events via Celery task status

**Error Handling & Logging**:
- βœ… AIEngine wraps execution in try/catch and logs to `AIUsageLog`
- βœ… Failed operations roll back database changes automatically
- ❌ Automation only needs to check the final task result (success/failure)

---

**Automation Service ONLY Does:**
1. **Batch Selection**: Query the database for items to process (by status and site)
2. **Function Calling**: Call existing AI functions with the selected item IDs
3. **Stage Sequencing**: Wait for Stage N completion before starting Stage N+1
4. **Scheduling**: Trigger automation runs on configurable schedules
5.
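Taken together, these responsibilities reduce to a small sequential driver. A minimal sketch, under the assumption that each stage is wrapped in a zero-arg callable that returns only when its queue is fully drained (`stage_runners` and `on_stage_done` are hypothetical names, not existing code):

```python
def run_pipeline(stage_runners, on_stage_done):
    """Run stages strictly in order; Stage N+1 starts only after Stage N.

    stage_runners: ordered list of zero-arg callables, one per stage;
    each blocks until its stage's queue is 100% drained.
    on_stage_done: callback receiving (stage_number, result) for logging.
    """
    for stage_number, runner in enumerate(stage_runners, start=1):
        result = runner()              # hard stop: waits for full completion
        on_stage_done(stage_number, result)
        # Any exception aborts here, so later stages never start early.
```

Because the loop body blocks on each runner, parallel stages are impossible by construction, which is exactly the Decision 3 guarantee.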
**Aggregation**: Collect results from all batches and log totals per stage

---

## πŸ—οΈ 7-STAGE PIPELINE ARCHITECTURE

### Sequential Stage Flow

| Stage | From | To | Function Used | Batch Size | Type |
|-------|------|-----|---------------|------------|------|
| **1** | Keywords (`status='new'`, `cluster_id=null`) | Clusters (`status='new'`) | `auto_cluster` | 20 keywords | AI |
| **2** | Clusters (`status='new'`, no ideas) | Ideas (`status='new'`) | `generate_ideas` | 1 cluster | AI |
| **3** | Ideas (`status='new'`) | Tasks (`status='queued'`) | `bulk_queue_to_writer` | 20 ideas | Local |
| **4** | Tasks (`status='queued'`) | Content (`status='draft'`) | `generate_content` | 1 task | AI |
| **5** | Content (`status='draft'`, no Images) | Images (`status='pending'` with prompts) | `generate_image_prompts` | 1 content | AI |
| **6** | Images (`status='pending'`) | Images (`status='generated'` with URLs) | `generate_images` | 1 image | AI |
| **7** | Content (`status='review'`) | Manual Review | None (gate) | N/A | Manual |

---

### Stage 1: Keywords β†’ Clusters (AI)

**Purpose:** Group semantically similar keywords into topic clusters

**Database Query (Automation Orchestrator):**
```python
pending_keywords = Keywords.objects.filter(
    site=site,
    status='new',
    cluster__isnull=True,
    disabled=False
)
```

**Orchestration Logic (What Automation Does):**

1. **Select Batch**: Count pending keywords
   - If 0 keywords β†’ skip the stage, log "No keywords to process"
   - If 1-20 keywords β†’ select all (batch_size = count)
   - If >20 keywords β†’ select the first 20 (configurable batch_size)
2. **Call AI Function**:
   ```python
   from igny8_core.ai.functions.auto_cluster import AutoCluster

   result = AutoCluster().execute(
       payload={'ids': keyword_ids},
       account=account
   )
   # Returns: {'task_id': 'celery_task_abc123'}
   ```
3.
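The batching loop the orchestrator would run around calls like the one above can be sketched as pure orchestration logic. Both helper names are hypothetical: `execute_fn` stands in for a wrapper around `AutoCluster().execute(...)` returning the Celery task ID, and `wait_fn` stands in for blocking on that task:

```python
def chunk_ids(ids, batch_size):
    """Split the pending queue into sequential batches of at most batch_size."""
    return [list(ids[start:start + batch_size])
            for start in range(0, len(ids), batch_size)]

def run_stage_batches(pending_ids, batch_size, execute_fn, wait_fn):
    """Process every batch sequentially; return the number of batches run."""
    batches_run = 0
    for batch in chunk_ids(pending_ids, batch_size):
        task_id = execute_fn(batch)   # one AI call per batch
        wait_fn(task_id)              # hard stop: next batch only after completion
        batches_run += 1
    return batches_run
```

With 47 pending keywords and a batch size of 20 this yields three sequential batches of 20, 20, and 7 items, matching the example used throughout this plan.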
**Monitor Progress**: Listen to Celery task status
   - Use the existing `StepTracker` phase events (INIT β†’ PREP β†’ AI_CALL β†’ PARSE β†’ SAVE β†’ DONE)
   - OR poll: `AsyncResult(task_id).state` until SUCCESS/FAILURE
   - Log phase progress: "AI analyzing keywords (65% complete)"
4. **Collect Results**: When the task completes
   - AI function already updated Keywords.status β†’ 'mapped'
   - AI function already created Cluster records with status='new'
   - AI function already deducted credits via AIEngine
   - Automation just logs: "Batch complete: N clusters created"
5. **Repeat**: If more keywords remain, select the next batch and go to step 2

**Stage Completion Criteria:**
- All keyword batches processed (pending_keywords.count() == 0)
- No critical errors

**What AI Function Does (Already Implemented - DON'T DUPLICATE):**
- βœ… Groups keywords semantically using AI
- βœ… Creates Cluster records with `status='new'`
- βœ… Updates Keywords: `cluster_id=cluster.id`, `status='mapped'`
- βœ… Deducts credits automatically (AIEngine line 395)
- βœ… Logs to AIUsageLog
- βœ… Emits progress events via StepTracker

**Stage Result Logged:**
```json
{
  "keywords_processed": 47,
  "clusters_created": 8,
  "batches_run": 3,
  "credits_used": 10  // Read from AIUsageLog sum, not calculated
}
```

---

### Stage 2: Clusters β†’ Ideas (AI)

**Purpose:** Generate content ideas for each cluster

**Database Query:**
```
Clusters.objects.filter(
    site=site,
    status='new',
    disabled=False
).exclude(
    ideas__isnull=False  # Exclude clusters that already have ideas
)
```

**Process:**
1. Count clusters without ideas
2. If 0 β†’ skip the stage
3. If > 0 β†’ process one cluster at a time (configurable batch size = 1)
4. For each cluster:
   - Log: "Generating ideas for cluster: {cluster.name}"
   - Call `IdeasService.generate_ideas(cluster_ids=[cluster.id], account)`
   - Function returns `{'task_id': 'xyz789'}`
   - Monitor via Celery task status or StepTracker events
   - Wait for completion
   - Log: "Cluster '{name}' complete: N ideas created"
5.
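The "OR poll: `AsyncResult(task_id).state`" monitoring option mentioned above can be sketched as a small helper. This is a hypothetical utility, not existing code; `get_state` stands in for `lambda: AsyncResult(task_id).state` so the loop itself stays framework-free:

```python
import time

def wait_for_task(get_state, poll_interval=2.0, timeout=1800):
    """Poll a Celery-style task state until it reaches a terminal value.

    get_state: zero-arg callable returning 'PENDING' / 'STARTED' /
    'SUCCESS' / 'FAILURE' (e.g. wrapping AsyncResult(task_id).state).
    Returns True on SUCCESS, False on FAILURE; raises on timeout.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        state = get_state()
        if state == 'SUCCESS':
            return True
        if state == 'FAILURE':
            return False
        time.sleep(poll_interval)
    raise TimeoutError('AI task did not finish within the timeout')
```

In practice the StepTracker event stream is the preferred source of progress detail; polling is the fallback when only the final result matters.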
Log the stage summary

**Stage Completion Criteria:**
- All clusters processed
- Each cluster now has >=1 idea
- No errors

**Updates:**
- ContentIdeas: new records created with `status='new'`, `keyword_cluster_id=cluster.id`
- Clusters: `status='mapped'`
- Stage result: `{clusters_processed: 8, ideas_created: 56}`

**Credits:** ~16 credits (2 per cluster)

---

### Stage 3: Ideas β†’ Tasks (Local Queue)

**Purpose:** Convert content ideas to writer tasks (local, no AI)

**Database Query:**
```
ContentIdeas.objects.filter(
    site=site,
    status='new'
)
```

**Process:**
1. Count pending ideas
2. If 0 β†’ skip the stage
3. If > 0 β†’ split into batches of 20
4. For each batch:
   - Log: "Queueing batch X/Y (20 ideas)"
   - Call the `bulk_queue_to_writer` view logic (NOT via HTTP, direct function call)
   - For each idea:
     - Create a Tasks record with title=idea.idea_title, status='queued', cluster=idea.keyword_cluster
     - Update the idea's status to 'queued'
   - Log: "Batch X complete: 20 tasks created"
5. Log the stage summary

**Stage Completion Criteria:**
- All batches processed
- All ideas now have `status='queued'`
- Corresponding Tasks exist with `status='queued'`
- No errors

**Updates:**
- Tasks: new records created with `status='queued'`
- ContentIdeas: `status` changed 'new' β†’ 'queued'
- Stage result: `{ideas_processed: 56, tasks_created: 56, batches: 3}`

**Credits:** 0 (local operation)

---

### Stage 4: Tasks β†’ Content (AI)

**Purpose:** Generate full content drafts from tasks

**Database Query (Automation Orchestrator):**
```python
pending_tasks = Tasks.objects.filter(
    site=site,
    status='queued',
    content__isnull=True  # No content generated yet
)
```

**Orchestration Logic:**

1. **Select Item**: Count queued tasks
   - If 0 β†’ skip the stage
   - If > 0 β†’ select ONE task at a time (sequential processing)
2.
**Call AI Function**:
   ```python
   from igny8_core.ai.functions.generate_content import GenerateContent

   result = GenerateContent().execute(
       payload={'ids': [task.id]},
       account=account
   )
   # Returns: {'task_id': 'celery_task_xyz789'}
   ```
3. **Monitor Progress**: Listen to Celery task status
   - Use `StepTracker` phase events for real-time updates
   - Log: "Writing sections (65% complete)" (from phase metadata)
   - Content generation takes 5-15 minutes per task
4. **Collect Results**: When the task completes
   - AI function already created Content with `status='draft'`
   - AI function already updated Task.status β†’ 'completed'
   - AI function already updated Idea.status β†’ 'completed'
   - AI function already deducted credits based on word count
   - Automation logs: "Content created (2500 words)"
5. **Repeat**: Process the next task sequentially

**Stage Completion Criteria:**
- All tasks processed (pending_tasks.count() == 0)
- Each task has a linked Content record

**What AI Function Does (Already Implemented):**
- βœ… Generates article sections using AI
- βœ… Creates a Content record with `status='draft'`, `task_id=task.id`
- βœ… Updates Task: `status='completed'`
- βœ… Updates the linked Idea: `status='completed'`
- βœ… Deducts credits: 1 credit per 500 words (automatic)
- βœ… Logs to AIUsageLog with the word count

**Stage Result Logged:**
```json
{
  "tasks_processed": 56,
  "content_created": 56,
  "total_words": 140000,
  "credits_used": 280  // From AIUsageLog, not calculated
}
```

---

### Stage 5: Content β†’ Image Prompts (AI)

**Purpose:** Extract image prompts from content and create Images records with prompts

**CRITICAL:** There is NO separate "ImagePrompts" model. Images records ARE the prompts (with `status='pending'`) until the images are generated.
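The one-item-at-a-time pattern shared by Stages 4-6 can be sketched as a generic driver. This is an illustrative sketch, not existing code: `fetch_next_id` stands in for re-running the stage's queryset (e.g. `pending_tasks` above) and `process_one` stands in for the call-and-wait on one AI function run:

```python
def run_sequential_stage(fetch_next_id, process_one, max_items=None):
    """Drain a stage queue one item at a time (the Stage 4-6 pattern).

    fetch_next_id: re-queries the database each iteration and returns
    the next pending ID, or None when the queue is empty; an empty
    queue is exactly the stage-completion criterion.
    process_one: queues one AI call and blocks until it finishes.
    """
    processed = 0
    while max_items is None or processed < max_items:
        item_id = fetch_next_id()
        if item_id is None:          # queue drained: stage 100% complete
            break
        process_one(item_id)         # strictly one in-flight item
        processed += 1
    return processed
```

Re-querying on every iteration (rather than snapshotting IDs up front) means items added or failed mid-run are naturally picked up or skipped by the status filters.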
**Database Query (Automation Orchestrator):**
```python
from django.db.models import Count

# Content that has NO Images records at all
content_without_images = Content.objects.filter(
    site=site,
    status='draft'
).annotate(
    images_count=Count('images')
).filter(
    images_count=0  # No Images records exist yet
)
```

**Orchestration Logic:**

1. **Select Item**: Count content without any Images records
   - If 0 β†’ skip the stage
   - If > 0 β†’ select ONE content piece at a time (sequential)
2. **Call AI Function**:
   ```python
   from igny8_core.ai.functions.generate_image_prompts import GenerateImagePromptsFunction

   result = GenerateImagePromptsFunction().execute(
       payload={'ids': [content.id]},
       account=account
   )
   # Returns: {'task_id': 'celery_task_prompts456'}
   ```
3. **Monitor Progress**: Wait for completion
4. **Collect Results**: When the task completes
   - AI function already created Images records with:
     - `status='pending'`
     - `prompt='...'` (AI-generated prompt text)
     - `image_type='featured'` or `'in_article'`
     - `content_id=content.id`
   - Content.status stays `'draft'` (unchanged)
   - Automation logs: "Content '{title}' complete: N prompts created"
5. **Repeat**: Process the next content piece sequentially

**Stage Completion Criteria:**
- All content processed (content_without_images.count() == 0)
- Each content piece has >=1 Images record with `status='pending'` and prompt text

**What AI Function Does (Already Implemented):**
- βœ… Extracts the featured image prompt from the title/intro
- βœ… Extracts in-article prompts from H2 headings
- βœ… Creates Images records with `status='pending'`, `prompt='...'`
- βœ… Deducts credits automatically (0.5 per prompt)
- βœ… Logs to AIUsageLog

**Stage Result Logged:**
```json
{
  "content_processed": 56,
  "prompts_created": 224,
  "credits_used": 112  // From AIUsageLog
}
```

---

### Stage 6: Images (Prompts) β†’ Generated Images (AI)

**Purpose:** Generate actual image URLs from Images records that contain prompts

**CRITICAL:** Input is Images records with `status='pending'` (these contain the prompts).
Output is the same Images records updated with `status='generated'` and `image_url='https://...'`

**Database Query (Automation Orchestrator):**
```python
# Images with prompts waiting to be generated
pending_images = Images.objects.filter(
    site=site,
    status='pending'  # Has prompt text, needs an image URL
)
```

**Orchestration Logic:**

1. **Select Item**: Count pending Images
   - If 0 β†’ skip the stage
   - If > 0 β†’ select ONE Image at a time (sequential)
2. **Call AI Function**:
   ```python
   from igny8_core.ai.functions.generate_images import GenerateImages

   result = GenerateImages().execute(
       payload={'image_ids': [image.id]},
       account=account
   )
   # Returns: {'task_id': 'celery_task_img789'}
   ```
3. **Monitor Progress**: Wait for completion
4. **Collect Results**: When the task completes
   - AI function already called the image API using the `prompt` field
   - AI function already updated Images:
     - `status='pending'` β†’ `status='generated'`
     - `image_url='https://...'` (populated with the generated image URL)
   - AI function already deducted credits (1-4 per image)
   - Automation logs: "Image generated: {image_url}"
5. **Automatic Content Status Change** (NOT done by automation):
   - After each image generation, a background task checks whether ALL Images for that Content are now `status='generated'`
   - When the last image completes: Content.status changes `'draft'` β†’ `'review'` (in `ai/tasks.py` line 723)
   - Automation does NOT trigger this; it happens automatically
6.
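The automatic status flip described above can be expressed as a small pure check. This is an illustrative mirror of the rule implemented in `ai/tasks.py` (line 723), written over plain status lists rather than the real models, so it is a sketch of the rule, not the actual implementation:

```python
def content_ready_for_review(image_statuses):
    """A Content piece is review-ready only when it has at least one
    image and every one of its Images records is 'generated'."""
    return bool(image_statuses) and all(s == 'generated' for s in image_statuses)

def next_content_status(current_status, image_statuses):
    """Flip 'draft' to 'review' only when the check above passes;
    never touch content in any other status."""
    if current_status == 'draft' and content_ready_for_review(image_statuses):
        return 'review'
    return current_status
```

The `bool(image_statuses)` guard matters: content whose prompts were never created must stay in `'draft'` rather than slipping into review with zero images.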
**Repeat**: Process the next pending Image sequentially

**Stage Completion Criteria:**
- All pending Images processed (pending_images.count() == 0)
- All Images now have `image_url != null`, `status='generated'`

**What AI Function Does (Already Implemented):**
- βœ… Reads the `prompt` field from the Images record
- βœ… Calls the image generation API (OpenAI/Runware) with the prompt
- βœ… Updates Images: `image_url=generated_url`, `status='generated'`
- βœ… Deducts credits automatically (1-4 per image)
- βœ… Logs to AIUsageLog

**What Happens Automatically (ai/tasks.py:723):**
- βœ… A background task checks whether all Images for a Content are `status='generated'`
- βœ… When complete: Content.status changes `'draft'` β†’ `'review'`
- βœ… This happens OUTSIDE the automation orchestrator (in a Celery task)

**Stage Result Logged:**
```json
{
  "images_processed": 224,
  "images_generated": 224,
  "content_moved_to_review": 56,  // Side effect (automatic)
  "credits_used": 448  // From AIUsageLog
}
```

---

### Stage 7: Manual Review Gate (STOP)

**Purpose:** Automation ends; content has automatically moved to 'review' status, ready for manual review

**CRITICAL:** Content with `status='review'` was automatically set in Stage 6 when ALL images completed. Automation just counts it.

**Database Query (Automation Orchestrator):**
```python
# Content that has ALL images generated (status already changed to 'review')
ready_for_review = Content.objects.filter(
    site=site,
    status='review'  # Automatically set when all images complete
)
```

**Orchestration Logic:**

1. **Count Only**: Count content with `status='review'`
   - No processing, just counting
   - These Content records already have all Images with `status='generated'`
2. **Log Results**:
   - Log: "Automation complete. X content pieces ready for review"
   - Log: "Content IDs ready: [123, 456, 789, ...]"
3. **Mark Run Complete**:
   - AutomationRun.status = 'completed'
   - AutomationRun.completed_at = now()
4.
**Send Notification** (optional):
   - Email/notification: "Your automation run completed. X content pieces ready for review"
5. **STOP**: No further automation stages

**Stage Completion Criteria:**
- Counting complete
- Automation run marked `status='completed'`

**What AI Function Does:**
- N/A; no AI function is called in this stage

**Stage Result Logged:**
```json
{
  "ready_for_review": 56,
  "content_ids": [123, 456, 789, ...]
}
```

**What Happens Next (Manual - User Action):**
1. User navigates to the `/writer/content` page
2. The Content page shows the filter `status='review'`
3. User sees 56 content pieces with all images generated
4. User manually reviews:
   - Content quality
   - Image relevance
   - Brand voice
   - Accuracy
5. User selects multiple content pieces β†’ "Bulk Publish" action
6. The existing WordPress publishing workflow executes

**Why Manual Review is Required:**
- Quality control before public publishing
- Legal/compliance verification
- Brand voice consistency check
- Final accuracy confirmation

---

## πŸ”„ BATCH PROCESSING WITHIN STAGES

### Critical Concepts

**Batch vs Queue:**
- **Batch:** Group of items processed together in ONE AI call
- **Queue:** Total pending items waiting to be processed

**Example - Stage 1 with 47 keywords:**
```
Total Queue: 47 keywords
Batch Size: 20

Execution:
  Batch 1: Keywords 1-20  β†’ Call auto_cluster β†’ Wait for completion
  Batch 2: Keywords 21-40 β†’ Call auto_cluster β†’ Wait for completion
  Batch 3: Keywords 41-47 β†’ Call auto_cluster β†’ Wait for completion

Total Batches: 3
Processing: Sequential (never parallel)
```

**UI Display:**
```
Stage 1: Keywords β†’ Clusters
Status: ● Processing
Queue: 47 keywords total
Progress: Batch 2/3 (40 processed, 7 remaining)
Current: Processing keywords 21-40
Time Elapsed: 4m 30s
Credits Used: 8
```

### Batch Completion Triggers

**Within Stage:**
- Batch completes β†’ immediately start the next batch
- Last batch completes β†’ stage complete

**Between Stages:**
- Stage N completes β†’ Trigger Stage N+1
  automatically
- Hard verification: Ensure the queue is empty before proceeding

**Detailed Stage Processing Queues (UI Elements):**

Each stage card should show:
1. **Total Queue Count** - how many items need processing in this stage
2. **Current Batch** - which batch is being processed (e.g., "Batch 2/5")
3. **Processed Count** - how many items have completed so far
4. **Remaining Count** - how many items are left in the queue
5. **Current Item** - which specific item is processing right now (for single-item batches)

**Example UI for Stage 4:**
```
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ STAGE 4: Tasks β†’ Content (AI)                              β”‚
β”‚ Status: ● Processing                                       β”‚
β”‚                                                            β”‚
β”‚ πŸ“Š QUEUE OVERVIEW:                                         β”‚
β”‚ β”œβ”€ Total Tasks: 56                                         β”‚
β”‚ β”œβ”€ Processed: 23                                           β”‚
β”‚ β”œβ”€ Remaining: 33                                           β”‚
β”‚ └─ Progress: ━━━━━━━╸━━━━━━━━━━━━ 41%                      β”‚
β”‚                                                            β”‚
β”‚ πŸ”„ CURRENT PROCESSING:                                     β”‚
β”‚ β”œβ”€ Item: Task 24/56                                        β”‚
β”‚ β”œβ”€ Title: "Ultimate Coffee Bean Buying Guide"              β”‚
β”‚ β”œβ”€ Progress: Writing sections (65% complete)               β”‚
β”‚ └─ Time: 2m 15s elapsed                                    β”‚
β”‚                                                            β”‚
β”‚ πŸ’³ STAGE STATS:                                            β”‚
β”‚ β”œβ”€ Credits Used: 46                                        β”‚
β”‚ β”œβ”€ Time Elapsed: 1h 23m                                    β”‚
β”‚ └─ ETA: 1h 15m remaining                                   β”‚
β”‚                                                            β”‚
β”‚ [View Details] [Pause Stage]                               β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
```

---

## πŸ—„οΈ DATABASE STRUCTURE

### New Models to Create

**AutomationRun** (tracks each automation execution)
```
Table: igny8_automation_runs

Fields:
- id: Integer (PK)
- run_id: String (unique, indexed) - format: run_20251203_140523_manual
- account_id: ForeignKey(Account)
- site_id: ForeignKey(Site)
- trigger_type: String - choices: 'manual', 'scheduled'
- status: String - choices: 'running', 'paused', 'completed', 'failed'
- current_stage: Integer - current stage number (1-7)
- started_at: DateTime
- completed_at: DateTime (nullable)
- total_credits_used: Integer
- stage_1_result: JSON - {keywords_processed, clusters_created, batches}
- stage_2_result: JSON - {clusters_processed, ideas_created}
- stage_3_result: JSON - {ideas_processed, tasks_created}
- stage_4_result: JSON - {tasks_processed, content_created, total_words}
- stage_5_result: JSON - {content_processed, prompts_created}
- stage_6_result: JSON - {prompts_processed, images_generated}
- stage_7_result: JSON - {ready_for_review}
- error_message: Text (nullable)

Indexes:
- run_id (unique)
- site_id, started_at
- status, started_at
```

**AutomationConfig** (per-site configuration)
```
Table: igny8_automation_configs

Fields:
- id: Integer (PK)
- account_id: ForeignKey(Account)
- site_id: ForeignKey(Site, unique) - ONE config per site
- is_enabled: Boolean - whether scheduled automation is active
- frequency: String - choices: 'daily', 'weekly', 'monthly'
- scheduled_time: Time - when to run (e.g., 02:00)
- stage_1_batch_size: Integer - default 20 (keywords per batch)
- stage_2_batch_size: Integer - default 1 (clusters at a time)
- stage_3_batch_size: Integer - default 20 (ideas per batch)
- stage_4_batch_size: Integer - default 1 (tasks, sequential)
- stage_5_batch_size: Integer - default 1 (content at a time)
- stage_6_batch_size: Integer - default 1 (images, sequential)
- last_run_at: DateTime (nullable)
- next_run_at: DateTime (nullable) - calculated from frequency

Constraints:
- Unique: site_id (one config per site)
```

### File-Based Logging Structure

**Directory Structure:**
```
logs/
└── automation/
    └── {account_id}/
        └── {site_id}/
            └── {run_id}/
                β”œβ”€β”€ automation_run.log  (main activity log)
                β”œβ”€β”€ stage_1.log  (keywords β†’ clusters)
                β”œβ”€β”€ stage_2.log  (clusters β†’ ideas)
                β”œβ”€β”€ stage_3.log  (ideas β†’ tasks)
                β”œβ”€β”€ stage_4.log  (tasks β†’ content)
                β”œβ”€β”€ stage_5.log  (content β†’ prompts)
                β”œβ”€β”€ stage_6.log  (prompts β†’ images)
                └── stage_7.log  (review gate)
```

**Log File Format (automation_run.log):**
```
========================================
AUTOMATION RUN: run_20251203_140523_manual
Started: 2025-12-03 14:05:23
Trigger: manual
Account: 5
Site: 12
========================================

14:05:23 - Automation started (trigger: manual)
14:05:23 - Credit check: Account has 1500 credits, estimated need: 866 credits
14:05:23 - Stage 1 starting: Keywords β†’ Clusters
14:05:24 - Stage 1: Found 47 pending keywords
14:05:24 - Stage 1: Processing batch 1/3 (20 keywords)
14:05:25 - Stage 1: AI task queued: task_id=abc123
14:07:30 - Stage 1: Batch 1 complete - 3 clusters created
14:07:31 - Stage 1: Processing batch 2/3 (20 keywords)
[... continues ...]
```

**Stage-Specific Log (stage_1.log):**
```
========================================
STAGE 1: Keywords β†’ Clusters (AI)
Started: 2025-12-03 14:05:23
========================================

14:05:24 - Query: Keywords.objects.filter(site=12, status='new', cluster__isnull=True)
14:05:24 - Found 47 pending keywords
14:05:24 - Batch size: 20 keywords
14:05:24 - Total batches: 3

--- Batch 1/3 ---
14:05:24 - Keyword IDs: [101, 102, 103, ..., 120]
14:05:25 - Calling ClusteringService.cluster_keywords(ids=[101..120], account=5, site_id=12)
14:05:25 - AI task queued: task_id=abc123
14:05:26 - Monitoring task status...
14:05:28 - Phase: INIT - Initializing (StepTracker event)
14:05:45 - Phase: AI_CALL - AI analyzing keywords (StepTracker event)
14:07:15 - Phase: SAVE - Creating clusters (StepTracker event)
14:07:30 - Phase: DONE - Complete
14:07:30 - Result: 3 clusters created
14:07:30 - Clusters: ["Coffee Beans", "Brewing Methods", "Coffee Equipment"]
14:07:30 - Credits used: 4 (from AIUsageLog)

--- Batch 2/3 ---
[... continues ...]

========================================
STAGE 1 COMPLETE
Total Time: 5m 30s
Processed: 47 keywords
Clusters Created: 8
Credits Used: 10
========================================
```

---

## πŸ” SAFETY MECHANISMS

### 1. Concurrency Control (Prevent Duplicate Runs)

**Problem:** User clicks "Run Now" while a scheduled task is running

**Solution:** Distributed locking using the Django cache

**Implementation Logic:**
```
When starting automation:
1. Try to acquire the lock: cache.add(f'automation_lock_{site.id}', 'locked', timeout=21600)
2. If the lock exists β†’ return error: "Automation already running for this site"
3. If the lock is acquired β†’ proceed with the run
4. On completion/failure β†’ release the lock: cache.delete(f'automation_lock_{site.id}')

Also check the database:
- Query AutomationRun.objects.filter(site=site, status='running').exists()
- If it exists β†’ error: "Another automation is running"
```

**User sees:**
- "Automation already in progress. Started at 02:00 AM, currently on Stage 4."
- Link to view the current run's progress

---

### 2. Credit Reservation (Prevent Mid-Run Failures)

**Problem:** Account runs out of credits during Stage 4

**Solution:** Reserve estimated credits at start, deduct as used

**Implementation Logic:**
```
Before starting:
1. Estimate total credits needed:
   - Count keywords β†’ estimate clustering credits
   - Count clusters β†’ estimate ideas credits
   - Estimate content generation (assume avg word count)
   - Estimate image generation (assume 4 images per content)
2. Check: account.credits_balance >= estimated_credits * 1.2 (20% buffer)
3. If insufficient β†’ error: "Need ~866 credits, you have 500"
4. Reserve credits: account.credits_reserved += estimated_credits
5. As each stage completes β†’ deduct actual: account.credits_balance -= actual_used
6. On completion β†’ release unused: account.credits_reserved -= unused

Database fields needed:
- Account.credits_reserved (new field)
```

---

### 3.
Stage Idempotency (Safe to Retry) **Problem:** User resumes paused run, Stage 1 runs again creating duplicate clusters **Solution:** Check if stage already completed before executing **Implementation Logic:** ``` At start of each run_stage_N(): 1. Check AutomationRun.stage_N_result 2. If result exists and has processed_count > 0: - Log: "Stage N already completed - skipping" - return (skip to next stage) 3. Else: Proceed with stage execution ``` --- ### 4. Celery Task Chaining (Non-Blocking Workers) **Problem:** Synchronous execution blocks Celery worker for hours **Solution:** Chain stages as separate Celery tasks **Implementation Logic:** ``` Instead of: def start_automation(): run_stage_1() # blocks for 30 min run_stage_2() # blocks for 45 min ... Do: @shared_task def run_stage_1_task(run_id): service = AutomationService.from_run_id(run_id) service.run_stage_1() # Trigger next stage run_stage_2_task.apply_async(args=[run_id], countdown=5) @shared_task def run_stage_2_task(run_id): service = AutomationService.from_run_id(run_id) service.run_stage_2() run_stage_3_task.apply_async(args=[run_id], countdown=5) Benefits: - Workers not blocked for hours - Can retry individual stages - Better monitoring in Celery Flower - Horizontal scaling possible ``` --- ### 5. Pause/Resume Capability **User Can:** - Pause automation at any point - Resume from where it left off **Implementation Logic:** ``` Pause: - Update AutomationRun.status = 'paused' - Current stage completes current batch then stops - Celery task checks status before each batch Resume: - Update AutomationRun.status = 'running' - Restart from current_stage - Use idempotency check to skip completed work ``` --- ### 6. 
Error Handling Per Stage **If Stage Fails:** ``` try: run_stage_1() except Exception as e: - Log error to stage_1.log - Update AutomationRun: - status = 'failed' - error_message = str(e) - current_stage = 1 (where it failed) - Send notification: "Automation failed at Stage 1" - Stop execution (don't proceed to Stage 2) User can: - View logs to see what went wrong - Fix issue (e.g., add credits) - Click "Resume" to retry from Stage 1 ``` --- ### 7. Log Cleanup (Prevent Disk Bloat) **Problem:** After 1000 runs, logs occupy 80MB+ per site **Solution:** Celery periodic task to delete old logs **Implementation Logic:** ``` @shared_task def cleanup_old_automation_logs(): cutoff = datetime.now() - timedelta(days=90) # Keep last 90 days old_runs = AutomationRun.objects.filter( started_at__lt=cutoff, status__in=['completed', 'failed'] ) for run in old_runs: log_dir = f'logs/automation/{run.account_id}/{run.site_id}/{run.run_id}/' shutil.rmtree(log_dir) # Delete directory run.delete() # Remove DB record Schedule: Weekly, Monday 3 AM ``` --- ## 🎨 FRONTEND DESIGN ### Page Structure: `/automation` **Layout:** ``` β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”‚ πŸ€– AI AUTOMATION PIPELINE β”‚ β”‚ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ β”‚ β”‚ β”‚ β”‚ β”‚ β”‚ ⏰ SCHEDULE β”‚ β”‚ Next Run: Tomorrow at 2:00 AM (in 16 hours) β”‚ β”‚ Frequency: [Daily β–Ό] at [02:00 β–Ό] β”‚ β”‚ Status: ● Scheduled β”‚ β”‚ β”‚ β”‚ [Run Now] [Pause Schedule] [Configure] β”‚ β”‚ β”‚ β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€ β”‚ πŸ“Š PIPELINE OVERVIEW β”‚ β”‚ β”‚ β”‚ Keywords ──→ Clusters ──→ Ideas ──→ Tasks ──→ Content β”‚ β”‚ 47 8 42 20 generating β”‚ β”‚ pending new ready queued Stage 5 β”‚ β”‚ 
β”‚ β”‚ Overall Progress: ━━━━━━━╸━━━━━━━━━ 62% (Stage 5/7) β”‚ β”‚ Estimated Completion: 2 hours 15 minutes β”‚ β”‚ β”‚ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ [STAGE 1 CARD - completed state] [STAGE 2 CARD - completed state] [STAGE 3 CARD - completed state] [STAGE 4 CARD - running state with queue details] [STAGE 5 CARD - waiting state] [STAGE 6 CARD - waiting state] [STAGE 7 CARD - gate state] ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ πŸ“‹ LIVE ACTIVITY LOG (Last 50 events) β”œβ”€ 14:23:45 - Stage 4: Started content generation for Task 3 β”œβ”€ 14:24:12 - Stage 4: Writing sections (65% complete) β”œβ”€ 14:22:30 - Stage 4: Completed Task 2 β†’ Content created β”œβ”€ 14:20:15 - Stage 4: Started content generation for Task 2 └─ [View Full Log] πŸ’° TOTAL CREDITS USED THIS RUN: 66 credits ``` **Components:** **StageCard.tsx** - Individual stage display component - Props: stageNumber, stageName, status, queueData, result - Shows: Status badge, queue overview, progress bar, stats - Actions: "View Details", "Pause", "Retry Failed" **ActivityLog.tsx** - Live activity feed component - Props: runId - Fetches: `/api/v1/automation/activity_log/{runId}` every 3 seconds - Shows: Timestamped log entries, color-coded by type (info/success/error) **ConfigModal.tsx** - Schedule configuration modal - Fields: Frequency dropdown, Time picker, Batch sizes (advanced) - Saves to: AutomationConfig model via `/api/v1/automation/config/` **Sidebar Menu Addition:** ``` Sites β”œβ”€ Site Management └─ Site Settings Automation ← NEW └─ Pipeline Dashboard Planner β”œβ”€ Keywords β”œβ”€ Clusters └─ Ideas ``` --- ### Real-Time Progress Updates **UI Update Strategy:** - **Frontend Polling**: Poll automation status API every 3 seconds when run is active - **Backend Progress**: Uses event-based `StepTracker` to capture AI function 
phases - When automation is `status='running'` β†’ Poll every 3 seconds - When `status='completed'` or `status='failed'` β†’ Stop polling - When `status='paused'` β†’ Poll every 10 seconds **How Progress Works:** 1. **AI Function Execution**: Each AI function emits phase events (INIT, PREP, AI_CALL, PARSE, SAVE, DONE) 2. **StepTracker Captures**: Progress tracker records these events with metadata 3. **Automation Logs**: Orchestrator reads from StepTracker and logs to file 4. **UI Polls**: Frontend polls automation status API to read aggregated progress 5. **Display**: UI shows current phase and completion percentage per stage **API Endpoint:** ``` GET /api/v1/automation/current_run/?site_id=12 Response: { "run": { "run_id": "run_20251203_140523_manual", "status": "running", "current_stage": 4, "started_at": "2025-12-03T14:05:23Z", "total_credits_used": 66, "stage_1_result": {"keywords_processed": 47, "clusters_created": 8}, "stage_2_result": {"clusters_processed": 8, "ideas_created": 56}, "stage_3_result": {"ideas_processed": 56, "tasks_created": 56}, "stage_4_result": {"tasks_processed": 23, "tasks_total": 56}, ... }, "activity_log": [ "14:23:45 - Stage 4: Started content generation for Task 3", "14:24:12 - Stage 4: Writing sections (65% complete)", ... 
], "queues": { "stage_1": {"total": 0, "pending": 0}, "stage_2": {"total": 0, "pending": 0}, "stage_3": {"total": 0, "pending": 0}, "stage_4": {"total": 56, "pending": 33}, "stage_5": {"total": 23, "pending": 23}, "stage_6": {"total": 0, "pending": 0}, "stage_7": {"total": 0, "pending": 0} } } ``` --- ## πŸ”„ BACKEND IMPLEMENTATION FLOW ### Service Layer Architecture **AutomationService** (core orchestrator) - Location: `backend/igny8_core/business/automation/services/automation_service.py` - Responsibility: Execute stages sequentially, manage run state - Reuses: All existing AI function classes (NO duplication) **AutomationLogger** (file logging) - Location: `backend/igny8_core/business/automation/services/automation_logger.py` - Responsibility: Write timestamped logs to files - Methods: start_run(), log_stage_start(), log_stage_progress(), log_stage_complete() **Key Service Methods:** ``` AutomationService: - __init__(account, site) β†’ Initialize with site context (NO sector) - start_automation(trigger_type) β†’ Main entry point - run_stage_1() β†’ Keywords β†’ Clusters - run_stage_2() β†’ Clusters β†’ Ideas - run_stage_3() β†’ Ideas β†’ Tasks - run_stage_4() β†’ Tasks β†’ Content - run_stage_5() β†’ Content β†’ Prompts - run_stage_6() β†’ Prompts β†’ Images - run_stage_7() β†’ Review gate - pause_automation() β†’ Pause current run - resume_automation() β†’ Resume from current_stage - estimate_credits() β†’ Calculate estimated credits needed AutomationLogger: - start_run(account_id, site_id, trigger_type) β†’ Create log directory, return run_id - log_stage_start(run_id, stage_number, stage_name, pending_count) - log_stage_progress(run_id, stage_number, message) - log_stage_complete(run_id, stage_number, processed_count, time_elapsed, credits_used) - log_stage_error(run_id, stage_number, error_message) - get_activity_log(run_id, last_n=50) β†’ Return last N log lines ``` --- ### API Endpoints to Implement **AutomationViewSet** - Django REST Framework ViewSet - 
Base URL: `/api/v1/automation/` - Actions: ``` POST /api/v1/automation/run_now/ - Body: {"site_id": 12} - Action: Trigger manual automation run - Returns: {"run_id": "run_...", "message": "Automation started"} GET /api/v1/automation/current_run/?site_id=12 - Returns: Current/latest run status, activity log, queue counts POST /api/v1/automation/pause/ - Body: {"run_id": "run_..."} - Action: Pause running automation POST /api/v1/automation/resume/ - Body: {"run_id": "run_..."} - Action: Resume paused automation GET /api/v1/automation/config/?site_id=12 - Returns: AutomationConfig for site PUT /api/v1/automation/config/ - Body: {"site_id": 12, "is_enabled": true, "frequency": "daily", "scheduled_time": "02:00"} - Action: Update automation schedule GET /api/v1/automation/history/?site_id=12&page=1 - Returns: Paginated list of past runs GET /api/v1/automation/logs/{run_id}/ - Returns: Full logs for a specific run (all stage files) ``` --- ### Celery Tasks for Scheduling **Periodic Task** (runs every hour) ``` @shared_task(name='check_scheduled_automations') def check_scheduled_automations(): """ Runs every hour (via Celery Beat) Checks if any AutomationConfig needs to run """ now = timezone.now() configs = AutomationConfig.objects.filter( is_enabled=True, next_run_at__lte=now ) for config in configs: # Check for concurrent run if AutomationRun.objects.filter(site=config.site, status='running').exists(): continue # Skip if already running # Start automation run_automation_task.delay( account_id=config.account_id, site_id=config.site_id, trigger_type='scheduled' ) # Calculate next run time (anchored to 'now'; re-derive from config.scheduled_time to avoid drift after a delayed check) if config.frequency == 'daily': config.next_run_at = now + timedelta(days=1) elif config.frequency == 'weekly': config.next_run_at = now + timedelta(weeks=1) elif config.frequency == 'monthly': config.next_run_at = now + timedelta(days=30) # approximation: calendar months vary config.last_run_at = now config.save() Schedule in celery.py: app.conf.beat_schedule['check-scheduled-automations'] = { 'task':
'check_scheduled_automations', 'schedule': crontab(minute=0), # Every hour, on the hour } ``` **Stage Task Chain** ``` @shared_task def run_automation_task(account_id, site_id, trigger_type): """ Main automation task - chains individual stage tasks """ service = AutomationService(account_id, site_id) run_id = service.start_automation(trigger_type) # Chain stages as separate tasks for non-blocking execution. # run_stage_1 gets run_id explicitly; each later task receives the previous task's return value (the same run_id). chain( run_stage_1.s(run_id), run_stage_2.s(), run_stage_3.s(), run_stage_4.s(), run_stage_5.s(), run_stage_6.s(), run_stage_7.s(), ).apply_async() @shared_task def run_stage_1(run_id): service = AutomationService.from_run_id(run_id) service.run_stage_1() return run_id # Passed to the next task in the chain @shared_task def run_stage_2(run_id): service = AutomationService.from_run_id(run_id) service.run_stage_2() return run_id [... similar for stages 3-7 ...] ``` --- ## πŸ§ͺ TESTING STRATEGY ### Unit Tests **Test AutomationService:** - test_estimate_credits_calculation() - test_stage_1_processes_batches_correctly() - test_stage_completion_triggers_next_stage() - test_pause_stops_after_current_batch() - test_resume_from_paused_state() - test_idempotency_skips_completed_stages() **Test AutomationLogger:** - test_creates_log_directory_structure() - test_writes_timestamped_log_entries() - test_get_activity_log_returns_last_n_lines() ### Integration Tests **Test Full Pipeline:** ``` def test_full_automation_pipeline(): # Setup: Create 10 keywords keywords = KeywordFactory.create_batch(10, site=site) # Execute service = AutomationService(account, site) result = service.start_automation(trigger_type='manual') # Assert Stage 1 assert result['stage_1_result']['keywords_processed'] == 10 assert result['stage_1_result']['clusters_created'] > 0 # Assert Stage 2 assert result['stage_2_result']['ideas_created'] > 0 # Assert Stage 3 assert result['stage_3_result']['tasks_created'] > 0 # Assert Stage 4 assert result['stage_4_result']['content_created'] > 0
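    # Hedged addition (not in the original plan): cross-stage handoff checks.
    # Stages are strictly sequential, so each stage's "*_processed" input count
    # should equal the previous stage's "*_created" output count. Key names
    # mirror the stage_N_result examples in the current_run API response;
    # adjust if the final schema differs.
    assert result['stage_2_result']['clusters_processed'] == result['stage_1_result']['clusters_created']
    assert result['stage_3_result']['ideas_processed'] == result['stage_2_result']['ideas_created']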
# Assert Stage 5 assert result['stage_5_result']['prompts_created'] > 0 # Assert Stage 6 assert result['stage_6_result']['images_generated'] > 0 # Assert final state assert result['status'] == 'completed' assert AutomationRun.objects.get(run_id=result['run_id']).status == 'completed' ``` **Test Error Scenarios:** - test_insufficient_credits_prevents_start() - test_concurrent_run_prevented() - test_stage_failure_stops_pipeline() - test_rollback_on_error() --- ## πŸ“‹ IMPLEMENTATION CHECKLIST ### Phase 1: Database & Models (Week 1) - [ ] Create `automation` app directory structure - [ ] Define AutomationRun model with all stage_result JSON fields - [ ] Define AutomationConfig model (one per site, NO sector) - [ ] Create migrations - [ ] Test model creation and queries ### Phase 2: Logging Service (Week 1) - [ ] Create AutomationLogger class - [ ] Implement start_run() with log directory creation - [ ] Implement log_stage_start(), log_stage_progress(), log_stage_complete() - [ ] Implement get_activity_log() - [ ] Test file logging manually ### Phase 3: Core Automation Service (Week 2) - [ ] Create AutomationService class - [ ] Implement estimate_credits() - [ ] Implement start_automation() with credit check - [ ] Implement run_stage_1() calling ClusteringService - [ ] Test Stage 1 in isolation with real keywords - [ ] Implement run_stage_2() calling IdeasService - [ ] Test Stage 2 in isolation - [ ] Implement run_stage_3() calling bulk_queue_to_writer logic - [ ] Implement run_stage_4() calling GenerateContentFunction - [ ] Implement run_stage_5() calling GenerateImagePromptsFunction - [ ] Implement run_stage_6() calling GenerateImagesFunction - [ ] Implement run_stage_7() review gate (count only) - [ ] Implement pause_automation() and resume_automation() ### Phase 4: API Endpoints (Week 3) - [ ] Create AutomationViewSet - [ ] Implement run_now() action - [ ] Implement current_run() action - [ ] Implement pause() and resume() actions - [ ] Implement config GET/PUT 
actions - [ ] Implement history() action - [ ] Implement logs() action - [ ] Add URL routing in api_urls.py - [ ] Test all endpoints with Postman/curl ### Phase 5: Celery Tasks & Scheduling (Week 3) - [ ] Create check_scheduled_automations periodic task - [ ] Create run_automation_task - [ ] Create stage task chain (run_stage_1, run_stage_2, etc.) - [ ] Register tasks in celery.py - [ ] Add Celery Beat schedule - [ ] Test scheduled execution ### Phase 6: Frontend Components (Week 4) - [ ] Create /automation route in React Router - [ ] Create Dashboard.tsx page component - [ ] Create StageCard.tsx with queue display - [ ] Create ActivityLog.tsx with 3-second polling - [ ] Create ConfigModal.tsx for schedule settings - [ ] Add "Automation" to sidebar menu (below Sites) - [ ] Implement "Run Now" button - [ ] Implement "Pause" and "Resume" buttons - [ ] Test full UI flow ### Phase 7: Safety & Polish (Week 5) - [ ] Implement distributed locking (prevent concurrent runs) - [ ] Implement credit reservation system - [ ] Implement stage idempotency checks - [ ] Implement error handling and rollback - [ ] Create cleanup_old_automation_logs task - [ ] Add email/notification on completion/failure - [ ] Load testing with 100+ keywords - [ ] UI polish and responsiveness - [ ] Documentation update --- ## πŸš€ POST-LAUNCH ENHANCEMENTS ### Future Features (Phase 8+) - **Conditional Stages:** Skip stages if no data (e.g., skip Stage 1 if no keywords) - **Parallel Task Processing:** Process multiple tasks simultaneously in Stage 4 (with worker limits) - **Smart Scheduling:** Avoid peak hours, optimize for cost - **A/B Testing:** Test different prompts, compare results - **Content Quality Scoring:** Auto-reject low-quality AI content - **WordPress Auto-Publish:** With approval workflow and staging - **Analytics Integration:** Track content performance post-publish - **Social Media Auto-Post:** Share published content to social channels --- ## πŸ“– USER DOCUMENTATION ### How to Use 
Automation **1. Configure Schedule:** - Navigate to Automation page - Click "Configure" button - Set frequency (Daily/Weekly/Monthly) - Set time (e.g., 2:00 AM) - Optionally adjust batch sizes (advanced) - Click "Save" **2. Manual Run:** - Click "Run Now" button - Monitor progress in real-time - View activity log for details **3. Review Content:** - Wait for automation to complete (or check next morning if scheduled) - Navigate to Writer β†’ Content page - Filter by "Draft" status with images generated - Review content quality - Select multiple β†’ Bulk Publish **4. Monitor History:** - View past runs in History tab - Click run to view detailed logs - See credits used per run --- ## βœ… SUCCESS CRITERIA **Automation is successful if:** - βœ… Runs without manual intervention from Keywords β†’ Draft Content - βœ… Processes 100+ keywords without errors - βœ… Respects credit limits (pre-check + reservation) - βœ… Stops at review gate (doesn't auto-publish) - βœ… Completes within estimated time (6-12 hours for 100 keywords) - βœ… UI shows real-time progress accurately - βœ… Logs are detailed enough to troubleshoot failures - βœ… Can pause/resume without data loss - βœ… Scheduled runs trigger correctly - βœ… No duplicate runs occur - βœ… Reuses ALL existing AI functions (zero duplication) --- **END OF COMPLETE IMPLEMENTATION PLAN** This plan ensures a safe, modular, observable, and maintainable automation system that orchestrates the existing IGNY8 AI functions into a fully automated content pipeline.