Automation Part 1

2025-12-03 08:07:43 +00:00
parent 5d96e1a2bd
commit b9774aafa2
23 changed files with 3444 additions and 1 deletions
--- a/AUTOMATION-IMPLEMENTATION-README.md
+++ b/AUTOMATION-IMPLEMENTATION-README.md
@@ -0,0 +1,383 @@
+# AI Automation Pipeline - Implementation Complete
+
+## Overview
+
+The IGNY8 AI Automation Pipeline orchestrates existing AI functions into a 7-stage pipeline that turns keywords into illustrated content without manual intervention. The final stage is a manual review gate: content stops at `status='review'`, and publishing is not part of the pipeline.
+
+## Architecture
+
+### Backend Components
+
+#### 1. Models (`/backend/igny8_core/business/automation/models.py`)
+
+**AutomationConfig**
+- Per-site configuration for automation
+- Fields: `is_enabled`, `frequency` (daily/weekly/monthly), `scheduled_time`, batch sizes for all 7 stages
+- OneToOne relationship with Site model
+
+**AutomationRun**
+- Tracks execution of automation runs
+- Fields: `run_id`, `status`, `current_stage`, `stage_1_result` through `stage_7_result` (JSON), `total_credits_used`
+- Status choices: running, paused, completed, failed
+
+#### 2. Services
+
+**AutomationLogger** (`services/automation_logger.py`)
+- File-based logging system
+- Log structure: `logs/automation/{account_id}/{site_id}/{run_id}/`
+- Files: `automation_run.log`, `stage_1.log` through `stage_7.log`
+- Methods: `start_run()`, `log_stage_start()`, `log_stage_progress()`, `log_stage_complete()`, `log_stage_error()`
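+
+The directory layout above can be sketched with the standard library alone. This is a minimal illustration, not the project's `AutomationLogger`: the class name `FileRunLogger`, the `tail()` helper, and the choice to mirror each stage entry into `automation_run.log` are assumptions made for the example.
+
+```python
+from datetime import datetime, timezone
+from pathlib import Path
+
+
+class FileRunLogger:
+    """Hypothetical stand-in for AutomationLogger (illustration only)."""
+
+    def __init__(self, base_dir, account_id, site_id, run_id):
+        # Mirrors logs/automation/{account_id}/{site_id}/{run_id}/
+        self.run_dir = Path(base_dir) / str(account_id) / str(site_id) / str(run_id)
+        self.run_dir.mkdir(parents=True, exist_ok=True)
+
+    def _append(self, filename, message):
+        stamp = datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M:%S")
+        with open(self.run_dir / filename, "a", encoding="utf-8") as handle:
+            handle.write(f"{stamp} {message}\n")
+
+    def log_stage_progress(self, stage, message):
+        # Assumption: each entry goes to the stage file and to the main run log.
+        self._append(f"stage_{stage}.log", message)
+        self._append("automation_run.log", f"[stage {stage}] {message}")
+
+    def tail(self, lines=100):
+        # Return the last `lines` entries of the main log.
+        path = self.run_dir / "automation_run.log"
+        if not path.exists():
+            return []
+        return path.read_text(encoding="utf-8").splitlines()[-lines:]
+```
+
+Keeping one directory per run means a whole run can be inspected or archived by path, without querying the database.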
+
+**AutomationService** (`services/automation_service.py`)
+- Core orchestrator for automation pipeline
+- Methods:
+  - `start_automation()` - Initialize new run with credit check
+  - `run_stage_1()` through `run_stage_7()` - Execute each pipeline stage
+  - `pause_automation()`, `resume_automation()` - Control run execution
+  - `estimate_credits()` - Pre-run credit estimation
+  - `from_run_id()` - Create service from existing run
+
+#### 3. API Endpoints (`views.py`)
+
+All endpoints at `/api/v1/automation/`:
+
+- `GET /config/?site_id=123` - Get automation configuration
+- `PUT /update_config/?site_id=123` - Update configuration
+- `POST /run_now/?site_id=123` - Trigger immediate run
+- `GET /current_run/?site_id=123` - Get active run status
+- `POST /pause/?run_id=abc` - Pause running automation
+- `POST /resume/?run_id=abc` - Resume paused automation
+- `GET /history/?site_id=123` - Get past runs (last 20)
+- `GET /logs/?run_id=abc&lines=100` - Get run logs
+- `GET /estimate/?site_id=123` - Estimate credits needed
+
+#### 4. Celery Tasks (`tasks.py`)
+
+**check_scheduled_automations**
+- Runs hourly via Celery Beat
+- Checks AutomationConfig records for scheduled runs
+- Triggers automation based on frequency and scheduled_time
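+
+The due-check performed by the hourly task might look like the sketch below. The function name `is_due` and the hour-level comparison are assumptions; the weekly (Monday) and monthly (1st) rules follow the frequency options described under Configuration.
+
+```python
+from datetime import datetime, time
+
+
+def is_due(frequency, scheduled_time, now):
+    """Return True if a run should start in the hour containing `now`."""
+    # Assumption: the check runs hourly, so only the hour is compared.
+    if now.hour != scheduled_time.hour:
+        return False
+    if frequency == "daily":
+        return True
+    if frequency == "weekly":
+        return now.weekday() == 0  # Monday
+    if frequency == "monthly":
+        return now.day == 1
+    return False
+
+
+# 2025-12-01 is a Monday and the 1st of the month.
+print(is_due("weekly", time(2, 0), datetime(2025, 12, 1, 2, 15)))
+```
+
+A real task would also need to guard against triggering twice within the same hour, for example by comparing against the last run timestamp stored on the configuration.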
+
+**run_automation_task**
+- Main background task that executes all 7 stages sequentially
+- Called by `run_now` API endpoint or scheduled trigger
+- Handles errors and updates AutomationRun status
+
+**resume_automation_task**
+- Resumes paused automation from `current_stage`
+- Called by `resume` API endpoint
+
+#### 5. Database Migration
+
+Located at `/backend/igny8_core/business/automation/migrations/0001_initial.py`
+
+Run with: `python manage.py migrate`
+
+### Frontend Components
+
+#### 1. Service (`/frontend/src/services/automationService.ts`)
+
+TypeScript API client with methods matching backend endpoints:
+- `getConfig()`, `updateConfig()`, `runNow()`, `getCurrentRun()`
+- `pause()`, `resume()`, `getHistory()`, `getLogs()`, `estimate()`
+
+#### 2. Pages
+
+**AutomationPage** (`pages/Automation/AutomationPage.tsx`)
+- Main dashboard at `/automation`
+- Displays current run status, stage progress, activity log, history
+- Polls for status every 5 seconds while a run is active
+- Controls: Run Now, Pause, Resume, Configure
+
+#### 3. Components
+
+**StageCard** (`components/Automation/StageCard.tsx`)
+- Visual representation of each stage (1-7)
+- Shows status: pending (⏳), active (🔄), complete (✅)
+- Displays stage results (items processed, credits used, etc.)
+
+**ActivityLog** (`components/Automation/ActivityLog.tsx`)
+- Real-time log viewer with terminal-style display
+- Auto-refreshes every 3 seconds
+- Configurable line count (50, 100, 200, 500)
+
+**ConfigModal** (`components/Automation/ConfigModal.tsx`)
+- Modal for editing automation settings
+- Fields: Enable/disable, frequency, scheduled time, batch sizes
+- Form validation and save
+
+**RunHistory** (`components/Automation/RunHistory.tsx`)
+- Table of past automation runs
+- Columns: run_id, status, trigger, started, completed, credits, stage
+- Status badges with color coding
+
+## 7-Stage Pipeline
+
+### Stage 1: Keywords → Clusters (AI)
+- **Query**: `Keywords` with `status='new'`, `cluster__isnull=True`, `disabled=False`
+- **Batch Size**: Default 20 keywords
+- **AI Function**: `AutoCluster().execute()`
+- **Output**: Creates `Clusters` records
+- **Credits**: ~1 per 5 keywords
+
+### Stage 2: Clusters → Ideas (AI)
+- **Query**: `Clusters` with `status='new'`, exclude those with existing ideas
+- **Batch Size**: Default 1 cluster
+- **AI Function**: `GenerateIdeas().execute()`
+- **Output**: Creates `ContentIdeas` records
+- **Credits**: ~2 per cluster
+
+### Stage 3: Ideas → Tasks (Local Queue)
+- **Query**: `ContentIdeas` with `status='new'`
+- **Batch Size**: Default 20 ideas
+- **Operation**: Local database creation (no AI)
+- **Output**: Creates `Tasks` records with status='queued'
+- **Credits**: 0 (local operation)
+
+### Stage 4: Tasks → Content (AI)
+- **Query**: `Tasks` with `status='queued'`, `content__isnull=True`
+- **Batch Size**: Default 1 task
+- **AI Function**: `GenerateContent().execute()`
+- **Output**: Creates `Content` records with status='draft'
+- **Credits**: ~5 per content (2500 words avg)
+
+### Stage 5: Content → Image Prompts (AI)
+- **Query**: `Content` with `status='draft'`, `images_count=0` (annotated)
+- **Batch Size**: Default 1 content
+- **AI Function**: `GenerateImagePromptsFunction().execute()`
+- **Output**: Creates `Images` records with status='pending' (contains prompts)
+- **Credits**: ~2 per content (4 prompts avg)
+
+### Stage 6: Image Prompts → Generated Images (AI)
+- **Query**: `Images` with `status='pending'`
+- **Batch Size**: Default 1 image
+- **AI Function**: `GenerateImages().execute()`
+- **Output**: Updates `Images` to status='generated' with `image_url`
+- **Side Effect**: Automatically sets `Content.status='review'` when all images complete (via `ai/tasks.py:723`)
+- **Credits**: ~2 per image
+
+### Stage 7: Manual Review Gate
+- **Query**: `Content` with `status='review'`
+- **Operation**: Count only, no processing
+- **Output**: Returns list of content IDs ready for review
+- **Credits**: 0
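+
+The sequencing of the seven stages, including resuming from `current_stage`, can be sketched as a plain loop. This is an illustration under assumptions: `run_pipeline`, the `should_pause` callback, and the shape of the returned dictionary are hypothetical, and checking for a pause only between stages is a simplification.
+
+```python
+def run_pipeline(stages, start_stage=1, should_pause=lambda: False):
+    """Run stage callables in order; `stages` maps stage number to a callable."""
+    results = {}
+    for number in sorted(stages):
+        if number < start_stage:
+            continue  # finished before the run was paused
+        if should_pause():
+            return {"status": "paused", "current_stage": number, "results": results}
+        try:
+            results[number] = stages[number]()
+        except Exception as exc:
+            # Record where the run failed so it can be investigated later.
+            return {
+                "status": "failed",
+                "current_stage": number,
+                "results": results,
+                "error_message": str(exc),
+            }
+    return {"status": "completed", "current_stage": max(stages), "results": results}
+
+
+# Seven placeholder stages; each returns a small result dictionary.
+demo_stages = {n: (lambda n=n: {"stage": n, "processed": 0}) for n in range(1, 8)}
+print(run_pipeline(demo_stages, start_stage=4)["status"])
+```
+
+Because each stage selects its own input by status (for example `status='queued'` for Stage 4), rerunning from a given stage picks up whatever the earlier stages left behind.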
+
+## Key Design Principles
+
+### 1. NO Duplication of AI Function Logic
+
+The automation system ONLY handles:
+- Batch selection and sequencing
+- Stage orchestration
+- Credit estimation and checking
+- Progress tracking and logging
+- Scheduling and triggers
+
+It does NOT handle:
+- Credit deduction (done by `AIEngine.execute()` at line 395)
+- Status updates (done within AI functions)
+- Step-level progress events inside each AI function (StepTracker emits these automatically; the automation only tracks stage-level progress)
+
+### 2. Correct Image Model Understanding
+
+- **NO separate ImagePrompts model**: the original plan assumed one, but it does not exist
+- `Images` model serves dual purpose:
+  - `status='pending'` = has prompt, needs image URL
+  - `status='generated'` = has image_url
+- Stage 5 creates Images records with prompts
+- Stage 6 updates same records with URLs
+
+### 3. Automatic Content Status Changes
+
+- `Content.status` changes from 'draft' to 'review' automatically
+- Happens in `ai/tasks.py:723` when all images complete
+- Automation does NOT manually update this status
+
+### 4. Distributed Locking
+
+- Uses Django cache with `automation_lock_{site.id}` key
+- The lock expires after 6 hours, so a crashed run cannot block the site indefinitely
+- Released on completion, pause, or failure
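+
+The locking pattern can be illustrated without Django. In the sketch below, `InMemoryCache` is a hypothetical stand-in that models only the add-if-absent behaviour of Django's `cache.add()` and `cache.delete()`; whether the project uses `add()` or another call to take the lock is an assumption.
+
+```python
+import time
+
+
+class InMemoryCache:
+    """Tiny stand-in for a cache backend; only add() and delete() are modelled."""
+
+    def __init__(self):
+        self._data = {}
+
+    def add(self, key, value, timeout):
+        # Store only if the key is absent or expired; report whether it was stored.
+        now = time.monotonic()
+        entry = self._data.get(key)
+        if entry is not None and entry[1] > now:
+            return False
+        self._data[key] = (value, now + timeout)
+        return True
+
+    def delete(self, key):
+        self._data.pop(key, None)
+
+
+LOCK_TIMEOUT_SECONDS = 6 * 60 * 60  # 6 hours, as described above
+
+
+def acquire_lock(cache, site_id):
+    """Return True if this caller now holds the lock for the site."""
+    return cache.add(f"automation_lock_{site_id}", "locked", LOCK_TIMEOUT_SECONDS)
+
+
+def release_lock(cache, site_id):
+    cache.delete(f"automation_lock_{site_id}")
+```
+
+An add-if-absent call takes the lock in one step, which avoids the race between a separate check and a separate set when two workers start at the same moment.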
+
+## Configuration
+
+### Schedule Configuration UI
+
+Located at `/automation` page → [Configure] button
+
+**Options:**
+- **Enable/Disable**: Toggle automation on/off
+- **Frequency**: Daily, Weekly (Mondays), Monthly (1st)
+- **Scheduled Time**: Time of day to run (24-hour format)
+- **Batch Sizes**: Per-stage item counts
+
+**Defaults:**
+- Stage 1: 20 keywords
+- Stage 2: 1 cluster
+- Stage 3: 20 ideas
+- Stage 4: 1 task
+- Stage 5: 1 content
+- Stage 6: 1 image
+
+### Credit Estimation
+
+Before starting a run, the system estimates credits per stage (Stages 3 and 7 cost no credits):
+- Stage 1: keywords_count / 5
+- Stage 2: clusters_count * 2
+- Stage 4: tasks_count * 5
+- Stage 5: content_count * 2
+- Stage 6: content_count * 8 (4 images * 2 credits avg)
+
+Requires 20% buffer: `account.credits_balance >= estimated * 1.2`
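+
+As a worked example of the arithmetic above, the sketch below computes the estimate and applies the 20% buffer. The function names are hypothetical, and rounding the Stage 1 figure up to a whole credit is an assumption; the per-stage rates are the ones listed above.
+
+```python
+import math
+
+
+def estimate_credits(keywords, clusters, tasks, content):
+    """Return (per-stage estimate, total) using the rates listed above."""
+    per_stage = {
+        1: math.ceil(keywords / 5),  # ~1 credit per 5 keywords (rounded up: assumption)
+        2: clusters * 2,
+        4: tasks * 5,
+        5: content * 2,
+        6: content * 8,  # 4 images * 2 credits on average
+    }
+    return per_stage, sum(per_stage.values())
+
+
+def has_enough_credits(balance, estimated, buffer=1.2):
+    """Apply the 20% buffer: balance >= estimated * 1.2."""
+    return balance >= estimated * buffer
+
+
+per_stage, total = estimate_credits(keywords=20, clusters=4, tasks=10, content=10)
+print(per_stage, total, has_enough_credits(200, total))
+```
+
+With these inputs the stages come to 4 + 8 + 50 + 20 + 80 = 162 credits, so the run needs a balance of at least 194.4 credits to start.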
+
+## Deployment Checklist
+
+### Backend
+
+1. ✅ Models created in `business/automation/models.py`
+2. ✅ Services created (`AutomationLogger`, `AutomationService`)
+3. ✅ Views created (`AutomationViewSet`)
+4. ✅ URLs registered in `igny8_core/urls.py`
+5. ✅ Celery tasks created (`check_scheduled_automations`, `run_automation_task`, `resume_automation_task`)
+6. ✅ Celery beat schedule updated in `celery.py`
+7. ⏳ Migration created (needs to run: `python manage.py migrate`)
+
+### Frontend
+
+8. ✅ API service created (`services/automationService.ts`)
+9. ✅ Main page created (`pages/Automation/AutomationPage.tsx`)
+10. ✅ Components created (`StageCard`, `ActivityLog`, `ConfigModal`, `RunHistory`)
+11. ⏳ Route registration (add to router: `/automation` → `AutomationPage`)
+
+### Infrastructure
+
+12. ⏳ Celery worker running (for background tasks)
+13. ⏳ Celery beat running (for scheduled checks)
+14. ⏳ Redis/cache backend configured (for distributed locks)
+15. ⏳ Log directory writable: `/data/app/igny8/backend/logs/automation/`
+
+## Usage
+
+### Manual Trigger
+
+1. Navigate to `/automation` page
+2. Verify credit balance is sufficient (shows in header)
+3. Click [Run Now] button
+4. Monitor progress in real-time:
+   - Stage cards show current progress
+   - Activity log shows detailed logs
+   - Credits used updates live
+
+### Scheduled Automation
+
+1. Navigate to `/automation` page
+2. Click [Configure] button
+3. Enable automation
+4. Set frequency and time
+5. Configure batch sizes
+6. Save configuration
+7. Automation will run automatically at scheduled time
+
+### Pause/Resume
+
+- During active run, click [Pause] to halt execution
+- Click [Resume] to continue from current stage
+- Useful for credit management or issue investigation
+
+### Viewing History
+
+- Run History table shows last 20 runs
+- Filter by status, date, trigger type
+- Click run_id to view detailed logs
+
+## Monitoring
+
+### Log Files
+
+Located at: `logs/automation/{account_id}/{site_id}/{run_id}/`
+
+- `automation_run.log` - Main activity log
+- `stage_1.log` through `stage_7.log` - Stage-specific logs
+
+### Database Records
+
+**AutomationRun** table tracks:
+- Current status and stage
+- Stage results (JSON)
+- Credits used
+- Error messages
+- Timestamps
+
+**AutomationConfig** table tracks:
+- Last run timestamp
+- Next scheduled run
+- Configuration changes
+
+## Troubleshooting
+
+### Run stuck in "running" status
+
+1. Check Celery worker logs: `docker logs <celery_container>`
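+2. Check for the cache lock. Django prefixes cache keys (by default `<KEY_PREFIX>:<VERSION>:<key>`), so a plain `redis-cli GET automation_lock_<site_id>` may return nothing; search for the key instead: `redis-cli --scan --pattern '*automation_lock_<site_id>'`
+3. Manually release the lock if needed, either with `redis-cli DEL <full_key>` or from `python manage.py shell`: `from django.core.cache import cache; cache.delete('automation_lock_<site_id>')`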
+4. Update run status: `AutomationRun.objects.filter(run_id='...').update(status='failed')`
+
+### Insufficient credits
+
+1. Check estimate: GET `/api/v1/automation/estimate/?site_id=123`
+2. Add credits via billing page
+3. Retry run
+
+### Stage failures
+
+1. View logs: GET `/api/v1/automation/logs/?run_id=...`
+2. Check `error_message` field in AutomationRun
+3. Verify AI function is working: test individually via existing UI
+4. Check credit balance mid-run
+
+## Future Enhancements
+
+1. Email notifications on completion/failure
+2. Slack/webhook integrations
+3. Per-stage retry logic
+4. Partial run resumption after failure
+5. Advanced scheduling (specific days, multiple times)
+6. Content preview before Stage 7
+7. Auto-publish to WordPress option
+8. Credit usage analytics and forecasting
+
+## File Locations Summary
+
+```
+backend/igny8_core/business/automation/
+├── __init__.py
+├── models.py                   # AutomationConfig, AutomationRun
+├── views.py                    # AutomationViewSet (API endpoints)
+├── tasks.py                    # Celery tasks
+├── urls.py                     # URL routing
+├── migrations/
+│   ├── __init__.py
+│   └── 0001_initial.py        # Database schema
+└── services/
+    ├── __init__.py
+    ├── automation_logger.py    # File logging service
+    └── automation_service.py   # Core orchestrator
+
+frontend/src/
+├── services/
+│   └── automationService.ts    # API client
+├── pages/Automation/
+│   └── AutomationPage.tsx      # Main dashboard
+└── components/Automation/
+    ├── StageCard.tsx           # Stage status display
+    ├── ActivityLog.tsx         # Log viewer
+    ├── ConfigModal.tsx         # Settings modal
+    └── RunHistory.tsx          # Past runs table
+```
+
+## Credits
+
+Implemented according to `automation-plan.md` with corrections for:
+- Image model structure (no separate ImagePrompts)
+- AI function internal logic (no duplication)
+- Content status changes (automatic in background)