reorg-docs

2026-01-01 05:40:42 +00:00
parent d16e5e1a4b
commit dd63403e94
11 changed files with 0 additions and 411 deletions
--- a/docs/plans/implemented/🚀
+++ b/docs/plans/implemented/🚀
@@ -0,0 +1,584 @@
+User: properl yread the docs, adn codebase related to automation page, 
+
+# Automation (Manual Run for autmation wizard on atuamtion page)
+wrong queue items,
+missing queue items
+progress bar not properly progress
+totoal in queue and processed counts buggy for many stages
+stage cards metrics realtime, not optimized shoudl be more robust and user friendly 
+
+---
+We need manural run full all stages progress bar, which shuold always keep showing unless it is 100% complete, and should show below the current each stage progress bar, 
+
+so it has to be addded along with all other issues, 
+
+Make a master plan , no brainer extremely user firendly magical wrokflow and tracking and msot logical sattuses. and counts dynamcially correctly udpating, 
+currently its all messed up
+
+so plan it all an dprove there is no  better ai model thatn you to use asa of today, 
+donto include the code, but clrear enough so even if i give you the plan in any next sesion you cna easily implmeent it 
+
+
+---
+
+# 🚀 AUTOMATION PAGE MASTER FIX PLAN
+
+## Executive Summary
+
+This plan addresses all automation page issues: wrong/missing queue items, broken progress bars, buggy counts, and introduces a new **Global Run Progress Bar** for end-to-end visibility.
+
+---
+
+## 📊 IDENTIFIED ROOT CAUSES
+
+### 1. **Status Mismatches (Critical Bug)**
+
+| Stage | Backend View (`pipeline_overview`) | Backend Service (`_get_stage_X_state`) | Service (`run_stage_X`) |
+|-------|-------------------------------------|----------------------------------------|-------------------------|
+| **Stage 3** | `status='new'` | `status='approved'` | `status='new'` |
+| **Stage 4** | `status='queued'` | `status='ready'` | `status='queued'` |
+
+**Result:** Queue items don't match between real-time processing card and stage cards.
+
+### 2. **Progress Calculation Flaws**
+
+**Frontend** (CurrentProcessingCard.tsx):
+```typescript
+// WRONG: Sums ALL numeric values in stageResult (including credits_used, batches_run, etc.)
+const processed = stageResult ? Object.values(stageResult).reduce((s: number, v: any) => 
+  typeof v === 'number' ? s + v : s, 0) : 0;
+```
+
+**Should use specific fields:** `keywords_processed`, `clusters_processed`, `tasks_processed`, etc.
+
+### 3. **"Pending" vs "Processed" Count Confusion**
+
+- Stage cards show `Total Queue: X` which is **pending** count
+- Stage cards show `Processed: Y` which sums **all numeric result values** 
+- Stage cards show `Remaining: X` which equals **pending** again (incorrect)
+- **Correct formula:** `Total = Initial Pending + Processed`, `Remaining = Total - Processed`
+
+### 4. **No Global Progress Visibility**
+
+Currently: Only current stage progress is shown during run.
+
+**Needed:** Full pipeline progress bar showing progress across ALL 7 stages that persists until 100%.
+
+### 5. **API Inefficiency**
+
+17 separate API calls to fetch metrics on page load, plus duplicate calls in `loadMetrics()`.
+
+---
+
+## 🏗️ ARCHITECTURE REDESIGN
+
+### New Data Model: Run Progress Snapshot
+
+Add these fields to `AutomationRun` for accurate global tracking:
+
+```python
+# AutomationRun Model Additions
+class AutomationRun(models.Model):
+    # ... existing fields ...
+    
+    # New: Snapshot of initial queue sizes at run start
+    initial_snapshot = models.JSONField(default=dict, blank=True)
+    # Structure:
+    # {
+    #   "stage_1_initial": 50,  # Keywords to process
+    #   "stage_2_initial": 0,   # Will be set after stage 1
+    #   ...
+    #   "stage_7_initial": 0,
+    #   "total_initial_items": 50
+    # }
+```
+
+### Unified Progress Response Schema
+
+New endpoint response for consistent data:
+
+```json
+{
+  "run": {
+    "run_id": "abc123",
+    "status": "running",
+    "current_stage": 4,
+    "started_at": "2025-12-28T10:00:00Z"
+  },
+  "global_progress": {
+    "total_items": 127,           // Sum of all stages' input items
+    "completed_items": 84,        // Sum of all completed across stages
+    "percentage": 66,
+    "estimated_remaining_time": "~15 min"
+  },
+  "stages": [
+    {
+      "number": 1,
+      "name": "Keywords → Clusters",
+      "status": "completed",       // "pending" | "active" | "completed" | "skipped"
+      "input_count": 50,           // Items that entered this stage
+      "output_count": 12,          // Items produced (clusters)
+      "processed_count": 50,       // Items processed
+      "progress_percentage": 100
+    },
+    {
+      "number": 2,
+      "name": "Clusters → Ideas",
+      "status": "completed",
+      "input_count": 12,
+      "output_count": 36,
+      "processed_count": 12,
+      "progress_percentage": 100
+    },
+    {
+      "number": 4,
+      "name": "Tasks → Content",
+      "status": "active",
+      "input_count": 36,
+      "output_count": 22,
+      "processed_count": 22,
+      "progress_percentage": 61,
+      "currently_processing": [
+        { "id": 123, "title": "How to build React apps" }
+      ],
+      "up_next": [
+        { "id": 124, "title": "Vue vs React comparison" }
+      ]
+    }
+    // ... etc
+  ],
+  "metrics": {
+    "credits_used": 156,
+    "duration_seconds": 1823,
+    "errors": []
+  }
+}
+```
+
+---
+
+## 📝 IMPLEMENTATION PLAN
+
+### Phase 1: Backend Fixes (Critical)
+
+#### 1.1 Fix Status Mismatches
+
+**File:** automation_service.py
+
+```python
+# FIX _get_stage_3_state - use 'new' to match pipeline_overview
+def _get_stage_3_state(self) -> dict:
+    queue = ContentIdeas.objects.filter(
+        site=self.site, status='new'  # Changed from 'approved'
+    ).order_by('id')
+    ...
+
+# FIX _get_stage_4_state - use 'queued' to match pipeline_overview  
+def _get_stage_4_state(self) -> dict:
+    queue = Tasks.objects.filter(
+        site=self.site, status='queued'  # Changed from 'ready'
+    ).order_by('id')
+    ...
+```
+
+#### 1.2 Fix `_get_processed_count()` Method
+
+Current code sums wrong fields. Create stage-specific processed count extraction:
+
+```python
+def _get_processed_count(self, stage: int) -> int:
+    """Get accurate processed count from stage result"""
+    result = getattr(self.run, f'stage_{stage}_result', None)
+    if not result:
+        return 0
+    
+    # Map stage to correct result key
+    key_map = {
+        1: 'keywords_processed',
+        2: 'clusters_processed', 
+        3: 'ideas_processed',
+        4: 'tasks_processed',
+        5: 'content_processed',
+        6: 'images_processed',
+        7: 'ready_for_review'
+    }
+    return result.get(key_map.get(stage, ''), 0)
+```
+
+#### 1.3 New Unified Progress Endpoint
+
+**File:** views.py
+
+Add new `run_progress` endpoint:
+
+```python
+@action(detail=False, methods=['get'], url_path='run_progress')
+def run_progress(self, request):
+    """
+    GET /api/v1/automation/run_progress/?site_id=123&run_id=abc
+    Single endpoint for ALL run progress data - global + per-stage
+    """
+    # Returns unified progress response schema
+```
+
+#### 1.4 Capture Initial Snapshot on Run Start
+
+**File:** automation_service.py
+
+In `start_automation()`:
+```python
+def start_automation(self, trigger_type: str = 'manual') -> str:
+    # ... existing code ...
+    
+    # Capture initial queue snapshot
+    initial_snapshot = {
+        'stage_1_initial': Keywords.objects.filter(site=self.site, status='new', cluster__isnull=True, disabled=False).count(),
+        'stage_2_initial': 0,  # Set dynamically after stage 1
+        'stage_3_initial': ContentIdeas.objects.filter(site=self.site, status='new').count(),
+        'stage_4_initial': Tasks.objects.filter(site=self.site, status='queued').count(),
+        'stage_5_initial': Content.objects.filter(site=self.site, status='draft').annotate(images_count=Count('images')).filter(images_count=0).count(),
+        'stage_6_initial': Images.objects.filter(site=self.site, status='pending').count(),
+        'stage_7_initial': Content.objects.filter(site=self.site, status='review').count(),
+    }
+    initial_snapshot['total_initial_items'] = sum(initial_snapshot.values())
+    
+    self.run = AutomationRun.objects.create(
+        # ... existing fields ...
+        initial_snapshot=initial_snapshot
+    )
+```
+
+---
+
+### Phase 2: Frontend Fixes
+
+#### 2.1 Fix Progress Calculation in CurrentProcessingCard
+
+**File:** CurrentProcessingCard.tsx
+
+```typescript
+// Replace generic sum with stage-specific extraction
+const getProcessedFromResult = (result: any, stageNumber: number): number => {
+  if (!result) return 0;
+  
+  const keyMap: Record<number, string> = {
+    1: 'keywords_processed',
+    2: 'clusters_processed',
+    3: 'ideas_processed',
+    4: 'tasks_processed',
+    5: 'content_processed',
+    6: 'images_processed',
+    7: 'ready_for_review'
+  };
+  
+  return result[keyMap[stageNumber]] ?? 0;
+};
+```
+
+#### 2.2 Fix Stage Card Metrics
+
+**File:** AutomationPage.tsx
+
+```typescript
+// Current (WRONG):
+const processed = result ? Object.values(result).reduce((sum, val) => typeof val === 'number' ? sum + val : sum, 0) : 0;
+const total = (stage.pending ?? 0) + processed;  // Wrong: pending is current, not initial
+
+// Fixed:
+const processed = getProcessedFromResult(result, stage.number);
+const initialPending = currentRun?.initial_snapshot?.[`stage_${stage.number}_initial`] ?? stage.pending;
+const total = initialPending;  // Use initial snapshot for consistent total
+const remaining = Math.max(0, total - processed);
+```
+
+#### 2.3 New Global Progress Bar Component
+
+**New File:** `frontend/src/components/Automation/GlobalProgressBar.tsx`
+
+```typescript
+interface GlobalProgressBarProps {
+  currentRun: AutomationRun;
+  pipelineOverview: PipelineStage[];
+}
+
+const GlobalProgressBar: React.FC<GlobalProgressBarProps> = ({ currentRun, pipelineOverview }) => {
+  // Calculate total progress across all stages
+  const calculateGlobalProgress = () => {
+    if (!currentRun?.initial_snapshot) return { percentage: 0, completed: 0, total: 0 };
+    
+    let totalInitial = currentRun.initial_snapshot.total_initial_items || 0;
+    let totalCompleted = 0;
+    
+    for (let i = 1; i <= 7; i++) {
+      const result = currentRun[`stage_${i}_result`];
+      if (result) {
+        totalCompleted += getProcessedFromResult(result, i);
+      }
+    }
+    
+    // If current stage is active, add its progress
+    const currentStage = currentRun.current_stage;
+    // ... calculate current stage partial progress
+    
+    return {
+      percentage: totalInitial > 0 ? Math.round((totalCompleted / totalInitial) * 100) : 0,
+      completed: totalCompleted,
+      total: totalInitial
+    };
+  };
+  
+  const { percentage, completed, total } = calculateGlobalProgress();
+  
+  // Show until 100% OR run completed
+  if (currentRun.status === 'completed' && percentage === 100) {
+    return null;
+  }
+  
+  return (
+    <div className="bg-gradient-to-r from-brand-50 to-brand-100 border-2 border-brand-300 rounded-xl p-4 mb-6">
+      <div className="flex justify-between items-center mb-2">
+        <div className="flex items-center gap-2">
+          <BoltIcon className="w-5 h-5 text-brand-600 animate-pulse" />
+          <span className="font-bold text-brand-800">Full Pipeline Progress</span>
+        </div>
+        <span className="text-2xl font-bold text-brand-600">{percentage}%</span>
+      </div>
+      
+      {/* Segmented progress bar showing all 7 stages */}
+      <div className="flex h-4 rounded-full overflow-hidden bg-gray-200">
+        {[1, 2, 3, 4, 5, 6, 7].map(stageNum => {
+          const stageConfig = STAGE_CONFIG[stageNum - 1];
+          const result = currentRun[`stage_${stageNum}_result`];
+          const stageComplete = currentRun.current_stage > stageNum;
+          const isActive = currentRun.current_stage === stageNum;
+          
+          return (
+            <div
+              key={stageNum}
+              className={`flex-1 transition-all duration-500 ${
+                stageComplete ? `bg-gradient-to-r ${stageConfig.color}` :
+                isActive ? `bg-gradient-to-r ${stageConfig.color} opacity-60 animate-pulse` :
+                'bg-gray-300'
+              }`}
+              title={`Stage ${stageNum}: ${stageConfig.name}`}
+            />
+          );
+        })}
+      </div>
+      
+      <div className="flex justify-between text-xs text-gray-600 mt-2">
+        <span>{completed} / {total} items processed</span>
+        <span>Stage {currentRun.current_stage} of 7</span>
+      </div>
+    </div>
+  );
+};
+```
+
+#### 2.4 Consolidate API Calls
+
+**File:** AutomationPage.tsx
+
+Replace 17 separate API calls with single unified endpoint:
+
+```typescript
+// Current (17 calls):
+const [keywordsTotalRes, keywordsNewRes, keywordsMappedRes, ...14 more] = await Promise.all([...]);
+
+// New (1 call):
+const progressData = await automationService.getRunProgress(activeSite.id, currentRun?.run_id);
+// Response contains everything: metrics, stage counts, progress data
+```
+
+---
+
+### Phase 3: Stage Card Redesign
+
+#### 3.1 New Stage Card Layout
+
+Each stage card shows:
+
+```
+┌────────────────────────────────────────────┐
+│  Stage 1    [ICON]    ● Active             │
+│  Keywords → Clusters                        │
+├────────────────────────────────────────────┤
+│  Total Items:      50                       │
+│  Processed:        32     ████████░░ 64%   │
+│  Remaining:        18                       │
+├────────────────────────────────────────────┤
+│  Output Created:   8 clusters               │
+│  Credits Used:     24                       │
+│  Duration:         4m 32s                   │
+└────────────────────────────────────────────┘
+```
+
+#### 3.2 Status Badge Logic
+
+```typescript
+const getStageStatus = (stageNum: number, currentRun: AutomationRun | null) => {
+  if (!currentRun) {
+    // No run - show if items pending
+    return pipelineOverview[stageNum - 1]?.pending > 0 ? 'ready' : 'empty';
+  }
+  
+  if (currentRun.current_stage > stageNum) return 'completed';
+  if (currentRun.current_stage === stageNum) return 'active';
+  if (currentRun.current_stage < stageNum) {
+    // Check if previous stage produced items for this stage
+    const prevResult = currentRun[`stage_${stageNum - 1}_result`];
+    if (prevResult?.output_count > 0) return 'ready';
+    return 'pending';
+  }
+  return 'pending';
+};
+```
+
+---
+
+### Phase 4: Real-time Updates Optimization
+
+#### 4.1 Smart Polling with Exponential Backoff
+
+```typescript
+// Current: Fixed 5s interval
+const interval = setInterval(loadData, 5000);
+
+// New: Adaptive polling
+const useSmartPolling = (isRunning: boolean) => {
+  const [pollInterval, setPollInterval] = useState(2000);
+  
+  useEffect(() => {
+    if (!isRunning) {
+      setPollInterval(30000); // Slow poll when idle
+      return;
+    }
+    
+    // Fast poll during active run, slow down as stage progresses
+    const progressPercent = /* current stage progress */;
+    if (progressPercent < 50) {
+      setPollInterval(2000);  // 2s when lots happening
+    } else if (progressPercent < 90) {
+      setPollInterval(3000);  // 3s mid-stage
+    } else {
+      setPollInterval(1000);  // 1s near completion for responsive transition
+    }
+  }, [isRunning, progressPercent]);
+  
+  return pollInterval;
+};
+```
+
+#### 4.2 Optimistic UI Updates
+
+When user clicks "Run Now":
+1. Immediately show GlobalProgressBar at 0%
+2. Immediately set Stage 1 to "Active" 
+3. Don't wait for API confirmation
+
+---
+
+## 📋 DETAILED CHECKLIST
+
+### Backend Tasks
+- [x] Fix `_get_stage_3_state()` status filter: `'approved'` → `'new'` ✅ DONE
+- [x] Fix `_get_stage_4_state()` status filter: `'ready'` → `'queued'` ✅ DONE
+- [x] Create `_get_processed_for_stage(stage_num)` helper ✅ DONE (renamed to `_get_processed_count`)
+- [x] Add `initial_snapshot` JSON field to `AutomationRun` model ✅ DONE
+- [x] Capture initial snapshot in `start_automation()` ✅ DONE
+- [ ] Update snapshot after each stage completes (for cascading stages)
+- [x] Create new `run_progress` endpoint with unified schema ✅ DONE
+- [x] Add migration for new model field ✅ DONE (0006_automationrun_initial_snapshot.py)
+
+### Frontend Tasks
+- [x] Create `GlobalProgressBar` component ✅ DONE
+- [x] Add `GlobalProgressBar` to AutomationPage (below metrics, above CurrentProcessingCard) ✅ DONE
+- [x] Fix `getProcessedFromResult()` helper to extract stage-specific counts ✅ DONE
+- [x] Update stage card progress calculations ✅ DONE
+- [x] Update `CurrentProcessingCard` progress calculations ✅ DONE
+- [x] Add `getRunProgress` method to automationService.ts ✅ DONE
+- [ ] Consolidate metrics API calls to single endpoint
+- [ ] Implement smart polling with adaptive intervals
+- [ ] Add optimistic UI updates for "Run Now" action
+- [x] Fix "Remaining" count to be `Total - Processed` not `Pending` ✅ DONE
+
+### Testing
+- [ ] Test all 7 stages complete correctly
+- [ ] Verify counts match between stage cards and processing card
+- [ ] Test pause/resume preserves progress correctly
+- [ ] Test page refresh during run shows correct state
+- [ ] Test global progress bar persists until 100%
+- [ ] Load test: Verify API efficiency improvement
+
+---
+
+## 🎯 SUCCESS CRITERIA
+
+1. **Accurate Counts:** All stage cards show correct Total/Processed/Remaining
+2. **Consistent Data:** CurrentProcessingCard and Stage Cards show same numbers
+3. **Global Visibility:** Users see full pipeline progress at all times during run
+4. **Persistent Progress:** Progress bar stays visible until 100% complete
+5. **Real-time Feel:** Updates appear within 2-3 seconds of actual progress
+6. **API Efficiency:** Reduce API calls from 17+ to 1-2 per refresh cycle
+
+---
+
+## 🔄 MIGRATION PATH
+
+1. **Phase 1 (Day 1):** Backend status fixes + new processed count logic
+2. **Phase 2 (Day 2):** Frontend progress calculation fixes
+3. **Phase 3 (Day 3):** Global Progress Bar + API consolidation
+4. **Phase 4 (Day 4):** Smart polling + optimistic updates
+5. **Phase 5 (Day 5):** Testing + bug fixes
+
+---
+
+This plan provides a clear, implementable path to fix all automation page issues. Each phase can be implemented independently, and the plan contains enough detail that any AI model or developer can execute it in a future session.
+
+---
+
+#Implementation status
+
+--
+
+## ✅ Verification Complete
+
+### Migration Applied Successfully
+```
+Running migrations: Applying automation.0006_automationrun_initial_snapshot... OK
+```
+
+### Verification Against Plan Requirements
+
+| Plan Requirement | Status | Implementation |
+|-----------------|--------|----------------|
+| **Fix _get_stage_3_state** (approved→new) | ✅ | automation_service.py: `status='new'` |
+| **Fix _get_stage_4_state** (ready→queued) | ✅ | automation_service.py: `status='queued'` |
+| **Fix _get_processed_count** | ✅ | automation_service.py: Uses stage-specific keys |
+| **Add initial_snapshot field** | ✅ | Migration applied, field exists in model |
+| **Capture snapshot on run start** | ✅ | automation_service.py: `_capture_initial_snapshot()` called |
+| **New run_progress endpoint** | ✅ | views.py: `/run_progress/` endpoint |
+| **GlobalProgressBar component** | ✅ | GlobalProgressBar.tsx created |
+| **Fix frontend progress calculations** | ✅ | Both AutomationPage and CurrentProcessingCard use `getProcessedFromResult()` |
+| **getRunProgress in automationService** | ✅ | automationService.ts |
+
+### Alignment with Your Initial Instructions
+
+Your requirements from the plan file:
+1. ✅ **"wrong queue items"** → Fixed status mismatches (approved→new, ready→queued)
+2. ✅ **"missing queue items"** → Consistent status filters across all views
+3. ✅ **"progress bar not properly progress"** → Fixed `getProcessedFromResult()` to use correct keys
+4. ✅ **"total in queue and processed counts buggy"** → Uses initial_snapshot for accurate totals
+5. ✅ **"manual run full all stages progress bar"** → GlobalProgressBar component shows all 7 stages
+6. ✅ **"always keep showing unless 100% complete"** → Component hides only when `status === 'completed' && percentage >= 100`
+
+### AI Functions NOT Modified
+All AI functions remain unchanged:
+- auto_cluster.py - ✅ Untouched
+- generate_ideas.py - ✅ Untouched  
+- generate_content.py - ✅ Untouched
+- generate_image_prompts.py - ✅ Untouched
+- generate_images.py - ✅ Untouched
+- optimize_content.py - ✅ Untouched
+
+The changes only affect **progress tracking and display**, not the actual AI processing logic.