User: Properly read the docs and codebase related to the automation page.

# Automation (Manual Run for automation wizard on automation page)

Wrong queue items, missing queue items, progress bar not progressing properly, total-in-queue and processed counts buggy for many stages, stage card metrics not real-time, not optimized. It should be more robust and user friendly.

---

We need a manual-run progress bar covering all stages, which should always keep showing unless it is 100% complete, and should show below the current per-stage progress bar, so it has to be added along with all the other issues. Make a master plan: a no-brainer, extremely user-friendly, magical workflow and tracking with the most logical statuses, and counts that update dynamically and correctly. Currently it's all messed up, so plan it all and prove there is no better AI model than you to use as of today. Don't include the code, but make it clear enough that even if I give you the plan in any next session you can easily implement it.

---

# 🚀 AUTOMATION PAGE MASTER FIX PLAN

## Executive Summary

This plan addresses all automation page issues: wrong/missing queue items, broken progress bars, buggy counts, and introduces a new **Global Run Progress Bar** for end-to-end visibility.

---

## 📊 IDENTIFIED ROOT CAUSES

### 1. **Status Mismatches (Critical Bug)**

| Stage | Backend View (`pipeline_overview`) | Backend Service (`_get_stage_X_state`) | Service (`run_stage_X`) |
|-------|-------------------------------------|----------------------------------------|-------------------------|
| **Stage 3** | `status='new'` | `status='approved'` | `status='new'` |
| **Stage 4** | `status='queued'` | `status='ready'` | `status='queued'` |

**Result:** Queue items don't match between the real-time processing card and the stage cards.

### 2. **Progress Calculation Flaws**

**Frontend** (CurrentProcessingCard.tsx):

```typescript
// WRONG: Sums ALL numeric values in stageResult (including credits_used, batches_run, etc.)
const processed = stageResult ?
  Object.values(stageResult).reduce((s: number, v: any) => typeof v === 'number' ? s + v : s, 0) : 0;
```

**Should use specific fields:** `keywords_processed`, `clusters_processed`, `tasks_processed`, etc.

### 3. **"Pending" vs "Processed" Count Confusion**

- Stage cards show `Total Queue: X` which is the **pending** count
- Stage cards show `Processed: Y` which sums **all numeric result values**
- Stage cards show `Remaining: X` which equals **pending** again (incorrect)
- **Correct formula:** `Total = Initial Pending` (snapshot taken at run start), `Remaining = Total - Processed`

### 4. **No Global Progress Visibility**

Currently: Only the current stage's progress is shown during a run.

**Needed:** A full pipeline progress bar showing progress across ALL 7 stages that persists until 100%.

### 5. **API Inefficiency**

17 separate API calls to fetch metrics on page load, plus duplicate calls in `loadMetrics()`.

---

## 🏗️ ARCHITECTURE REDESIGN

### New Data Model: Run Progress Snapshot

Add these fields to `AutomationRun` for accurate global tracking:

```python
# AutomationRun Model Additions
class AutomationRun(models.Model):
    # ... existing fields ...

    # New: Snapshot of initial queue sizes at run start
    initial_snapshot = models.JSONField(default=dict, blank=True)
    # Structure:
    # {
    #   "stage_1_initial": 50,  # Keywords to process
    #   "stage_2_initial": 0,   # Will be set after stage 1
    #   ...
    #   "stage_7_initial": 0,
    #   "total_initial_items": 50
    # }
```

### Unified Progress Response Schema

New endpoint response for consistent data:

```json
{
  "run": {
    "run_id": "abc123",
    "status": "running",
    "current_stage": 4,
    "started_at": "2025-12-28T10:00:00Z"
  },
  "global_progress": {
    "total_items": 127,       // Sum of all stages' input items
    "completed_items": 84,    // Sum of all completed across stages
    "percentage": 66,
    "estimated_remaining_time": "~15 min"
  },
  "stages": [
    {
      "number": 1,
      "name": "Keywords → Clusters",
      "status": "completed",  // "pending" | "active" | "completed" | "skipped"
      "input_count": 50,      // Items that entered this stage
      "output_count": 12,     // Items produced (clusters)
      "processed_count": 50,  // Items processed
      "progress_percentage": 100
    },
    {
      "number": 2,
      "name": "Clusters → Ideas",
      "status": "completed",
      "input_count": 12,
      "output_count": 36,
      "processed_count": 12,
      "progress_percentage": 100
    },
    {
      "number": 4,
      "name": "Tasks → Content",
      "status": "active",
      "input_count": 36,
      "output_count": 22,
      "processed_count": 22,
      "progress_percentage": 61,
      "currently_processing": [
        { "id": 123, "title": "How to build React apps" }
      ],
      "up_next": [
        { "id": 124, "title": "Vue vs React comparison" }
      ]
    }
    // ... etc
  ],
  "metrics": {
    "credits_used": 156,
    "duration_seconds": 1823,
    "errors": []
  }
}
```

---

## 📝 IMPLEMENTATION PLAN

### Phase 1: Backend Fixes (Critical)

#### 1.1 Fix Status Mismatches

**File:** automation_service.py

```python
# FIX _get_stage_3_state - use 'new' to match pipeline_overview
def _get_stage_3_state(self) -> dict:
    queue = ContentIdeas.objects.filter(
        site=self.site,
        status='new'  # Changed from 'approved'
    ).order_by('id')
    ...

# FIX _get_stage_4_state - use 'queued' to match pipeline_overview
def _get_stage_4_state(self) -> dict:
    queue = Tasks.objects.filter(
        site=self.site,
        status='queued'  # Changed from 'ready'
    ).order_by('id')
    ...
```

#### 1.2 Fix `_get_processed_count()` Method

Current code sums wrong fields.
Create stage-specific processed count extraction:

```python
def _get_processed_count(self, stage: int) -> int:
    """Get accurate processed count from stage result"""
    result = getattr(self.run, f'stage_{stage}_result', None)
    if not result:
        return 0

    # Map stage to correct result key
    key_map = {
        1: 'keywords_processed',
        2: 'clusters_processed',
        3: 'ideas_processed',
        4: 'tasks_processed',
        5: 'content_processed',
        6: 'images_processed',
        7: 'ready_for_review'
    }
    return result.get(key_map.get(stage, ''), 0)
```

#### 1.3 New Unified Progress Endpoint

**File:** views.py

Add new `run_progress` endpoint:

```python
@action(detail=False, methods=['get'], url_path='run_progress')
def run_progress(self, request):
    """
    GET /api/v1/automation/run_progress/?site_id=123&run_id=abc
    Single endpoint for ALL run progress data - global + per-stage
    """
    # Returns unified progress response schema
```

#### 1.4 Capture Initial Snapshot on Run Start

**File:** automation_service.py

In `start_automation()`:

```python
# Requires: from django.db.models import Count
def start_automation(self, trigger_type: str = 'manual') -> str:
    # ... existing code ...

    # Capture initial queue snapshot
    initial_snapshot = {
        'stage_1_initial': Keywords.objects.filter(
            site=self.site, status='new', cluster__isnull=True, disabled=False
        ).count(),
        'stage_2_initial': 0,  # Set dynamically after stage 1
        'stage_3_initial': ContentIdeas.objects.filter(site=self.site, status='new').count(),
        'stage_4_initial': Tasks.objects.filter(site=self.site, status='queued').count(),
        'stage_5_initial': Content.objects.filter(site=self.site, status='draft')
            .annotate(images_count=Count('images'))
            .filter(images_count=0)
            .count(),
        'stage_6_initial': Images.objects.filter(site=self.site, status='pending').count(),
        'stage_7_initial': Content.objects.filter(site=self.site, status='review').count(),
    }
    initial_snapshot['total_initial_items'] = sum(initial_snapshot.values())

    self.run = AutomationRun.objects.create(
        # ... existing fields ...
        initial_snapshot=initial_snapshot
    )
```

---

### Phase 2: Frontend Fixes

#### 2.1 Fix Progress Calculation in CurrentProcessingCard

**File:** CurrentProcessingCard.tsx

```typescript
// Replace generic sum with stage-specific extraction
const getProcessedFromResult = (result: any, stageNumber: number): number => {
  if (!result) return 0;

  const keyMap: Record<number, string> = {
    1: 'keywords_processed',
    2: 'clusters_processed',
    3: 'ideas_processed',
    4: 'tasks_processed',
    5: 'content_processed',
    6: 'images_processed',
    7: 'ready_for_review'
  };

  return result[keyMap[stageNumber]] ?? 0;
};
```

#### 2.2 Fix Stage Card Metrics

**File:** AutomationPage.tsx

```typescript
// Current (WRONG):
const processed = result
  ? Object.values(result).reduce((sum, val) => typeof val === 'number' ? sum + val : sum, 0)
  : 0;
const total = (stage.pending ?? 0) + processed; // Wrong: pending is current, not initial

// Fixed:
const processed = getProcessedFromResult(result, stage.number);
const initialPending = currentRun?.initial_snapshot?.[`stage_${stage.number}_initial`] ??
  stage.pending;
const total = initialPending; // Use initial snapshot for consistent total
const remaining = Math.max(0, total - processed);
```

#### 2.3 New Global Progress Bar Component

**New File:** `frontend/src/components/Automation/GlobalProgressBar.tsx`

```typescript
interface GlobalProgressBarProps {
  currentRun: AutomationRun;
  pipelineOverview: PipelineStage[];
}

const GlobalProgressBar: React.FC<GlobalProgressBarProps> = ({ currentRun, pipelineOverview }) => {
  // Calculate total progress across all stages
  const calculateGlobalProgress = () => {
    if (!currentRun?.initial_snapshot) return { percentage: 0, completed: 0, total: 0 };

    const totalInitial = currentRun.initial_snapshot.total_initial_items || 0;
    let totalCompleted = 0;

    for (let i = 1; i <= 7; i++) {
      const result = currentRun[`stage_${i}_result`];
      if (result) {
        totalCompleted += getProcessedFromResult(result, i);
      }
    }

    // If current stage is active, add its progress
    const currentStage = currentRun.current_stage;
    // ... calculate current stage partial progress

    return {
      percentage: totalInitial > 0 ? Math.round((totalCompleted / totalInitial) * 100) : 0,
      completed: totalCompleted,
      total: totalInitial
    };
  };

  const { percentage, completed, total } = calculateGlobalProgress();

  // Hide only once the run has completed AND progress has reached 100%
  if (currentRun.status === 'completed' && percentage === 100) {
    return null;
  }

  return (
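    // NOTE: the original markup was lost; the bare elements and the data-state
    // attribute below are a minimal reconstruction (assumed), not the original design.
    <div>
      <div>
        <span>Full Pipeline Progress</span>
        <span>{percentage}%</span>
      </div>
      {/* Segmented progress bar showing all 7 stages */}
      <div>
        {[1, 2, 3, 4, 5, 6, 7].map(stageNum => {
          const stageConfig = STAGE_CONFIG[stageNum - 1];
          const result = currentRun[`stage_${stageNum}_result`];
          const stageComplete = currentRun.current_stage > stageNum;
          const isActive = currentRun.current_stage === stageNum;
          return (
            <div
              key={stageNum}
              data-state={stageComplete ? 'completed' : isActive ? 'active' : 'pending'}
            />
          );
        })}
      </div>
      <div>
        <span>{completed} / {total} items processed</span>
        <span>Stage {currentRun.current_stage} of 7</span>
      </div>
    </div>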
  );
};
```

#### 2.4 Consolidate API Calls

**File:** AutomationPage.tsx

Replace 17 separate API calls with a single unified endpoint:

```typescript
// Current (17 calls):
const [keywordsTotalRes, keywordsNewRes, keywordsMappedRes /* , ...14 more */] =
  await Promise.all([/* ... */]);

// New (1 call):
const progressData = await automationService.getRunProgress(activeSite.id, currentRun?.run_id);
// Response contains everything: metrics, stage counts, progress data
```

---

### Phase 3: Stage Card Redesign

#### 3.1 New Stage Card Layout

Each stage card shows:

```
┌────────────────────────────────────────────┐
│ Stage 1  [ICON]                   ● Active │
│ Keywords → Clusters                        │
├────────────────────────────────────────────┤
│ Total Items:     50                        │
│ Processed:       32  ████████░░ 64%        │
│ Remaining:       18                        │
├────────────────────────────────────────────┤
│ Output Created:  8 clusters                │
│ Credits Used:    24                        │
│ Duration:        4m 32s                    │
└────────────────────────────────────────────┘
```

#### 3.2 Status Badge Logic

```typescript
const getStageStatus = (stageNum: number, currentRun: AutomationRun | null) => {
  if (!currentRun) {
    // No run - show if items pending
    return pipelineOverview[stageNum - 1]?.pending > 0 ?
      'ready' : 'empty';
  }

  if (currentRun.current_stage > stageNum) return 'completed';
  if (currentRun.current_stage === stageNum) return 'active';
  if (currentRun.current_stage < stageNum) {
    // Check if previous stage produced items for this stage
    const prevResult = currentRun[`stage_${stageNum - 1}_result`];
    if (prevResult?.output_count > 0) return 'ready';
    return 'pending';
  }
  return 'pending';
};
```

---

### Phase 4: Real-time Updates Optimization

#### 4.1 Smart Polling with Adaptive Intervals

```typescript
import { useEffect, useState } from 'react';

// Current: Fixed 5s interval
const interval = setInterval(loadData, 5000);

// New: Adaptive polling
// progressPercent is the current stage's progress (0-100), passed in by the caller
const useSmartPolling = (isRunning: boolean, progressPercent: number) => {
  const [pollInterval, setPollInterval] = useState(2000);

  useEffect(() => {
    if (!isRunning) {
      setPollInterval(30000); // Slow poll when idle
      return;
    }

    // Fast poll during an active run; the interval adapts to stage progress
    if (progressPercent < 50) {
      setPollInterval(2000); // 2s when lots happening
    } else if (progressPercent < 90) {
      setPollInterval(3000); // 3s mid-stage
    } else {
      setPollInterval(1000); // 1s near completion for responsive transition
    }
  }, [isRunning, progressPercent]);

  return pollInterval;
};
```

#### 4.2 Optimistic UI Updates

When the user clicks "Run Now":

1. Immediately show GlobalProgressBar at 0%
2. Immediately set Stage 1 to "Active"
3. Don't wait for API confirmation

---

## 📋 DETAILED CHECKLIST

### Backend Tasks

- [x] Fix `_get_stage_3_state()` status filter: `'approved'` → `'new'` ✅ DONE
- [x] Fix `_get_stage_4_state()` status filter: `'ready'` → `'queued'` ✅ DONE
- [x] Create `_get_processed_for_stage(stage_num)` helper ✅ DONE (renamed to `_get_processed_count`)
- [x] Add `initial_snapshot` JSON field to `AutomationRun` model ✅ DONE
- [x] Capture initial snapshot in `start_automation()` ✅ DONE
- [ ] Update snapshot after each stage completes (for cascading stages)
- [x] Create new `run_progress` endpoint with unified schema ✅ DONE
- [x] Add migration for new model field ✅ DONE (0006_automationrun_initial_snapshot.py)

### Frontend Tasks

- [x] Create `GlobalProgressBar` component ✅ DONE
- [x] Add `GlobalProgressBar` to AutomationPage (below metrics, above CurrentProcessingCard) ✅ DONE
- [x] Fix `getProcessedFromResult()` helper to extract stage-specific counts ✅ DONE
- [x] Update stage card progress calculations ✅ DONE
- [x] Update `CurrentProcessingCard` progress calculations ✅ DONE
- [x] Add `getRunProgress` method to automationService.ts ✅ DONE
- [ ] Consolidate metrics API calls to single endpoint
- [ ] Implement smart polling with adaptive intervals
- [ ] Add optimistic UI updates for "Run Now" action
- [x] Fix "Remaining" count to be `Total - Processed` not `Pending` ✅ DONE

### Testing

- [ ] Test all 7 stages complete correctly
- [ ] Verify counts match between stage cards and processing card
- [ ] Test pause/resume preserves progress correctly
- [ ] Test page refresh during run shows correct state
- [ ] Test global progress bar persists until 100%
- [ ] Load test: Verify API efficiency improvement

---

## 🎯 SUCCESS CRITERIA

1. **Accurate Counts:** All stage cards show correct Total/Processed/Remaining
2. **Consistent Data:** CurrentProcessingCard and Stage Cards show same numbers
3. **Global Visibility:** Users see full pipeline progress at all times during run
4. **Persistent Progress:** Progress bar stays visible until 100% complete
5. **Real-time Feel:** Updates appear within 2-3 seconds of actual progress
6. **API Efficiency:** Reduce API calls from 17+ to 1-2 per refresh cycle

---

## 🔄 MIGRATION PATH

1. **Phase 1 (Day 1):** Backend status fixes + new processed count logic
2. **Phase 2 (Day 2):** Frontend progress calculation fixes
3. **Phase 3 (Day 3):** Global Progress Bar + API consolidation
4. **Phase 4 (Day 4):** Smart polling + optimistic updates
5. **Phase 5 (Day 5):** Testing + bug fixes

---

This plan provides a clear, implementable path to fix all automation page issues. Each phase can be implemented independently, and the plan contains enough detail that any AI model or developer can execute it in a future session.
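
The counting rules that recur through the plan (stage-specific processed keys, totals taken from `initial_snapshot`, `Remaining = Total - Processed`, and a global percentage summed across stages) can be sketched in one place. This is a minimal TypeScript sketch under stated assumptions, not the project's code: the `RunLike` type, its `stage_results` map, and the `stageCounts` / `globalProgress` helpers are hypothetical names introduced for illustration, while the result keys are copied from the plan's key map.

```typescript
// Hypothetical shapes for illustration; the real AutomationRun type may differ.
type StageResult = Record<string, number | undefined>;

interface RunLike {
  initial_snapshot: Record<string, number>;
  // Assumed: results keyed by stage number (the plan stores them as stage_N_result fields).
  stage_results: Record<number, StageResult | null>;
}

// Result keys taken from the plan's key map.
const PROCESSED_KEYS: Record<number, string> = {
  1: 'keywords_processed',
  2: 'clusters_processed',
  3: 'ideas_processed',
  4: 'tasks_processed',
  5: 'content_processed',
  6: 'images_processed',
  7: 'ready_for_review',
};

interface StageCounts {
  total: number;
  processed: number;
  remaining: number;
  percentage: number;
}

// Per-stage counts: total comes from the snapshot, never from the live pending count.
function stageCounts(run: RunLike, stage: number): StageCounts {
  const total = run.initial_snapshot[`stage_${stage}_initial`] ?? 0;
  const key = PROCESSED_KEYS[stage] ?? '';
  const processed = run.stage_results[stage]?.[key] ?? 0;
  const remaining = Math.max(0, total - processed);
  const percentage = total > 0 ? Math.min(100, Math.round((processed / total) * 100)) : 0;
  return { total, processed, remaining, percentage };
}

// Global progress: sum the per-stage totals and processed counts over all 7 stages.
function globalProgress(run: RunLike): { total: number; completed: number; percentage: number } {
  let total = 0;
  let completed = 0;
  for (let stage = 1; stage <= 7; stage++) {
    const counts = stageCounts(run, stage);
    total += counts.total;
    completed += counts.processed;
  }
  const percentage = total > 0 ? Math.min(100, Math.round((completed / total) * 100)) : 0;
  return { total, completed, percentage };
}

// Example values only; credits_used is deliberately ignored by the helpers.
const exampleRun: RunLike = {
  initial_snapshot: { stage_1_initial: 50, stage_4_initial: 36 },
  stage_results: {
    1: { keywords_processed: 50, credits_used: 24 },
    4: { tasks_processed: 22, credits_used: 100 },
  },
};

console.log(stageCounts(exampleRun, 4)); // total 36, processed 22, remaining 14, percentage 61
console.log(globalProgress(exampleRun)); // total 86, completed 72, percentage 84
```

Capping the percentage at 100 is a design choice of this sketch: until the unchecked "update snapshot after each stage completes" task is done, a cascading stage can report more processed items than its snapshot total, and the cap keeps the bar from overshooting.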