igny8/docs/10-MODULES/AUTOMATION.md
IGNY8 VPS (Salman) f10916bfab Version 1.3.2
2026-01-03 09:35:43 +00:00
# Automation Module
- **Last Verified:** January 3, 2026
- **Version:** 1.3.2
- **Status:** ✅ Active
- **Backend Path:** `backend/igny8_core/business/automation/`
- **Frontend Path:** `frontend/src/pages/Automation/`
---
## Quick Reference
| What | File | Key Items |
|------|------|-----------|
| Models | `business/automation/models.py` | `AutomationConfig`, `AutomationRun` |
| Service | `business/automation/services/automation_service.py` | `AutomationService` |
| Logger | `business/automation/services/automation_logger.py` | `AutomationLogger` |
| Celery Tasks | `business/automation/tasks.py` | `run_automation_task`, `check_scheduled_automations` |
| Publishing Tasks | `igny8_core/tasks/publishing_scheduler.py` | Scheduled publishing (v1.3.2) |
| Frontend | `pages/Automation/AutomationPage.tsx` | Main automation UI |
| Progress Bar | `components/Automation/GlobalProgressBar.tsx` | Full pipeline progress |
| Processing Card | `components/Automation/CurrentProcessingCard.tsx` | Real-time progress |
---
## Purpose
The Automation module runs the complete 7-stage content pipeline automatically:
```
Keywords → Clusters → Ideas → Tasks → Content → Image Prompts → Images → Published
```
---
## 7-Stage Pipeline
| Stage | Name | AI Function | Credit Cost |
|-------|------|-------------|-------------|
| 1 | Keywords → Clusters | `AutoClusterFunction` | Per batch |
| 2 | Clusters → Ideas | `GenerateIdeasFunction` | Per idea |
| 3 | Ideas → Tasks | None (local) | None |
| 4 | Tasks → Content | `GenerateContentFunction` | Per 100 words |
| 5 | Content → Image Prompts | `GenerateImagePromptsFunction` | Per prompt |
| 6 | Image Prompts → Images | `process_image_generation_queue` | Per image |
| 7 | Review → Published | Publishing Scheduler (v1.3.2) | None |

**Note:** Stage 7 uses the Publishing Scheduler with `PublishingSettings` to auto-approve and schedule content for publication. See [PUBLISHER.md](PUBLISHER.md) for details.

---
## Data Models
### AutomationConfig
| Field | Type | Purpose |
|-------|------|---------|
| account | FK | Owner account |
| site | FK | Target site |
| enabled | Boolean | Enable/disable automation |
| frequency | CharField | daily/weekly/monthly |
| scheduled_time | TimeField | Time to run |
| stage_1_batch_size | Integer | Keywords per batch |
| stage_2_batch_size | Integer | Clusters per batch |
| stage_3_batch_size | Integer | Ideas per batch |
| stage_4_batch_size | Integer | Tasks per batch |
| stage_5_batch_size | Integer | Content per batch |
| stage_6_batch_size | Integer | Images per batch |
| within_stage_delay | Integer | Seconds between batches |
| between_stage_delay | Integer | Seconds between stages |
| last_run_at | DateTime | Last execution |
| next_run_at | DateTime | Next scheduled run |
### AutomationRun
| Field | Type | Purpose |
|-------|------|---------|
| config | FK | Parent config |
| trigger_type | CharField | manual/scheduled |
| status | CharField | running/paused/cancelled/completed/failed |
| current_stage | Integer | Current stage (1-7) |
| started_at | DateTime | Start time |
| paused_at | DateTime | Pause time (nullable) |
| resumed_at | DateTime | Resume time (nullable) |
| cancelled_at | DateTime | Cancel time (nullable) |
| completed_at | DateTime | Completion time (nullable) |
| total_credits_used | Decimal | Total credits consumed |
| **initial_snapshot** | JSON | **v1.3.0** Queue sizes at run start |
| stage_1_result | JSON | Stage 1 results |
| stage_2_result | JSON | Stage 2 results |
| stage_3_result | JSON | Stage 3 results |
| stage_4_result | JSON | Stage 4 results |
| stage_5_result | JSON | Stage 5 results |
| stage_6_result | JSON | Stage 6 results |
| stage_7_result | JSON | Stage 7 results |
| error_message | TextField | Error details (nullable) |
---
## API Endpoints
| Method | Path | Handler | Purpose |
|--------|------|---------|---------|
| GET | `/api/v1/automation/config/` | Get/create config | Get automation config |
| PUT | `/api/v1/automation/update_config/` | Update config | Update settings |
| POST | `/api/v1/automation/run_now/` | Start manual run | Start automation |
| GET | `/api/v1/automation/current_run/` | Get current run | Run status/progress |
| GET | `/api/v1/automation/pipeline_overview/` | Get pipeline | Stage status counts |
| GET | `/api/v1/automation/current_processing/` | Get processing | Live processing status |
| GET | `/api/v1/automation/run_progress/` | Get run progress | Unified progress data (v1.3.0) |
| POST | `/api/v1/automation/pause/` | Pause run | Pause after current item |
| POST | `/api/v1/automation/resume/` | Resume run | Resume from saved stage |
| POST | `/api/v1/automation/cancel/` | Cancel run | Cancel after current item |
| GET | `/api/v1/automation/history/` | Get history | Last 20 runs |
| GET | `/api/v1/automation/logs/` | Get logs | Activity log for run |
| GET | `/api/v1/automation/estimate/` | Get estimate | Credit estimate |

**Query Parameters:** All endpoints require `site_id`; run-specific endpoints also require `run_id`.
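
As a rough illustration of the query-parameter rule above, the sketch below builds request URLs from the path prefix shown in the table. The helper name `automation_url` is hypothetical and not part of the codebase; which endpoints count as run-specific is not listed here, so the caller decides when to pass `run_id`.

```python
from typing import Optional
from urllib.parse import urlencode

# Path prefix taken from the endpoint table above.
API_PREFIX = "/api/v1/automation"


def automation_url(action: str, site_id: int, run_id: Optional[str] = None) -> str:
    """Build an automation endpoint URL (hypothetical helper).

    site_id is always sent; run_id is appended only when the caller supplies it.
    """
    params = {"site_id": site_id}
    if run_id is not None:
        params["run_id"] = run_id
    return f"{API_PREFIX}/{action}/?{urlencode(params)}"
```

For example, `automation_url("run_now", 7)` yields `/api/v1/automation/run_now/?site_id=7`.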
### run_progress Endpoint (v1.3.0)
Returns unified progress data for frontend:
```json
{
"run": { "run_id": "...", "status": "running", "current_stage": 3 },
"global_progress": { "total_items": 100, "completed_items": 45, "percentage": 45 },
"stages": [
{ "number": 1, "status": "completed", "input_count": 50, "processed_count": 50 },
...
],
"metrics": { "credits_used": 120, "duration_seconds": 3600 },
"initial_snapshot": { "stage_1_initial": 50, ... }
}
```
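
A client could reduce this payload to a one-line status as sketched below. This is only an illustration: the sample is a trimmed copy of the response above with a placeholder `run_id`, and `summarize_progress` is a hypothetical name. The percentage is recomputed from the item counts rather than read from the `percentage` field, which is a design choice of this sketch and not documented behavior.

```python
import json

# Trimmed sample of the response shape shown above; "example-run" is a placeholder.
SAMPLE = """
{
  "run": {"run_id": "example-run", "status": "running", "current_stage": 3},
  "global_progress": {"total_items": 100, "completed_items": 45, "percentage": 45},
  "stages": [
    {"number": 1, "status": "completed", "input_count": 50, "processed_count": 50}
  ],
  "metrics": {"credits_used": 120, "duration_seconds": 3600}
}
"""


def summarize_progress(payload: dict) -> str:
    """Return a short human-readable status line for a run_progress payload."""
    run = payload["run"]
    progress = payload["global_progress"]
    total = progress["total_items"]
    # Guard against an empty queue to avoid dividing by zero.
    percent = round(100 * progress["completed_items"] / total) if total else 0
    return f"{run['status']}: stage {run['current_stage']}/7, {percent}% complete"


summary = summarize_progress(json.loads(SAMPLE))
print(summary)  # running: stage 3/7, 45% complete
```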
---
## Execution Flow
### Manual Run
1. User clicks "Run Now" on frontend
2. Frontend calls `POST /api/v1/automation/run_now/?site_id=X`
3. Backend acquires cache lock `automation_lock_{site_id}`
4. **v1.3.0:** Captures initial snapshot with `_capture_initial_snapshot()`
5. Estimates credits required (1.2x buffer)
6. Validates balance >= estimate
7. Creates `AutomationRun` record
8. Enqueues `run_automation_task` Celery task
9. Returns run ID immediately
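
The ordering of these steps can be sketched with plain Python stand-ins. Everything here is hypothetical: `run_now`, `RunRejected`, the in-memory `active_locks` set, and the injected `snapshot`, `buffered_estimate`, and `enqueue` callables only mimic the real cache lock, `_capture_initial_snapshot()`, credit estimator, and Celery task. Releasing the lock when a start is rejected is an assumption of this sketch, not something the list above states.

```python
import uuid
from decimal import Decimal


class RunRejected(Exception):
    """Raised when a manual run cannot be started."""


def run_now(site_id, active_locks, balance, snapshot, buffered_estimate, enqueue):
    """Start a manual run in the documented order and return its ID."""
    lock_key = f"automation_lock_{site_id}"
    if lock_key in active_locks:  # only one run per site
        raise RunRejected("Already running")
    active_locks.add(lock_key)
    try:
        initial_snapshot = snapshot(site_id)   # queue sizes at run start
        required = buffered_estimate(site_id)  # estimate with the 1.2x buffer applied
        if balance < required:
            raise RunRejected("Insufficient credits")
        run = {
            "run_id": str(uuid.uuid4()),
            "site_id": site_id,
            "trigger_type": "manual",
            "status": "running",
            "initial_snapshot": initial_snapshot,
        }
        enqueue(run)          # stand-in for enqueuing run_automation_task
        return run["run_id"]  # returned immediately; the lock stays held
    except Exception:
        # Assumption: a start that fails frees the lock again.
        active_locks.discard(lock_key)
        raise
```

On success the lock is deliberately left in place, since the worker is expected to release it when the run completes, fails, or is cancelled.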
### Stage Execution
For each stage (1-7):
1. Check `_check_should_stop()` (paused/cancelled?)
2. Load items for processing
3. Process in batches (respecting batch_size)
4. For AI stages: Call AIEngine function
5. Wait `within_stage_delay` between batches
6. Save stage result JSON
7. Wait `between_stage_delay` before next stage
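
The batching part of this loop might look roughly like the sketch below. It is a simplified stand-in, not the service's code: `run_stage` and its parameters are hypothetical names, the stop check is performed before every batch (the list above only places it at stage start), and the generic `items_processed` key stands in for stage-specific keys such as `keywords_processed`.

```python
import time


def run_stage(items, batch_size, process_batch, should_stop,
              within_stage_delay=0, sleep=time.sleep):
    """Process items in batches, stopping early if a pause/cancel is requested."""
    result = {"items_processed": 0, "batches_run": 0, "partial": False}
    for start in range(0, len(items), batch_size):
        if should_stop():
            # Stopped before all batches ran, so the stage result is partial.
            result["partial"] = True
            break
        batch = items[start:start + batch_size]
        process_batch(batch)  # for AI stages this would call the AI function
        result["items_processed"] += len(batch)
        result["batches_run"] += 1
        if start + batch_size < len(items):
            sleep(within_stage_delay)  # pause only between batches, not after the last
    return result
```

With 150 items and a batch size of 50, this returns `batches_run: 3`, matching the Stage 1 example in the next section. Passing `sleep` as a parameter lets tests skip the real delay.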
### Stage Result Fields
**Stage 1 (Clustering):**
```json
{
"keywords_processed": 150,
"clusters_created": 12,
"batches_run": 3,
"credits_used": 45,
"time_elapsed": 120,
"skipped": false,
"partial": false
}
```
**Stage 2 (Ideas):**
```json
{
"clusters_processed": 12,
"ideas_created": 36,
"batches_run": 2,
"credits_used": 72
}
```
**Stage 3 (Tasks):**
```json
{
"ideas_processed": 36,
"tasks_created": 36,
"batches_run": 4
}
```
**Stage 4 (Content):**
```json
{
"tasks_processed": 36,
"content_created": 36,
"total_words": 54000,
"batches_run": 6,
"credits_used": 540
}
```
**Stage 5 (Image Prompts):**
```json
{
"content_processed": 36,
"prompts_created": 180,
"batches_run": 4,
"credits_used": 36
}
```
**Stage 6 (Images):**
```json
{
"images_processed": 180,
"images_generated": 180,
"batches_run": 18
}
```
**Stage 7 (Review):**
```json
{
"ready_for_review": 36
}
```
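
Assuming `total_credits_used` is simply the sum of the per-stage `credits_used` values (stages without that key contribute nothing), the example results above add up as sketched here. `total_credits` is a hypothetical helper, and the dictionaries are trimmed copies of the samples.

```python
from decimal import Decimal

# credits_used values copied from the stage result examples above.
STAGE_RESULTS = {
    "stage_1_result": {"keywords_processed": 150, "credits_used": 45},
    "stage_2_result": {"clusters_processed": 12, "credits_used": 72},
    "stage_3_result": {"ideas_processed": 36},  # local stage, no credits
    "stage_4_result": {"tasks_processed": 36, "credits_used": 540},
    "stage_5_result": {"content_processed": 36, "credits_used": 36},
    "stage_6_result": {"images_processed": 180},
    "stage_7_result": {"ready_for_review": 36},
}


def total_credits(stage_results: dict) -> Decimal:
    """Sum credits_used across stage results, treating a missing key as zero."""
    return sum(
        (Decimal(str(result.get("credits_used", 0))) for result in stage_results.values()),
        Decimal(0),
    )


print(total_credits(STAGE_RESULTS))  # 693
```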
---
## Scheduling
- **Celery Beat Task:** `check_scheduled_automations`
- **Frequency:** Hourly

**Logic:**
1. Find configs where `enabled=True`
2. Check if `next_run_at <= now`
3. Check if no active run exists
4. Start `run_automation_task` for eligible configs
5. Update `next_run_at` based on frequency
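
The eligibility check and the `next_run_at` update could be sketched as follows. This is illustrative only: configs are plain dictionaries rather than `AutomationConfig` rows, `due_configs` and `advance_next_run` are hypothetical names, and treating "monthly" as a fixed 30 days is an assumption, since the exact interval arithmetic is not described here.

```python
from datetime import datetime, timedelta, timezone

# Assumption: fixed-length intervals; "monthly" is approximated as 30 days.
INTERVALS = {
    "daily": timedelta(days=1),
    "weekly": timedelta(weeks=1),
    "monthly": timedelta(days=30),
}


def due_configs(configs, active_site_ids, now):
    """Return enabled configs whose next_run_at has passed and that have no active run."""
    return [
        config for config in configs
        if config["enabled"]
        and config["next_run_at"] <= now
        and config["site_id"] not in active_site_ids
    ]


def advance_next_run(next_run_at, frequency, now):
    """Step next_run_at forward by whole intervals until it lies in the future."""
    step = INTERVALS[frequency]
    while next_run_at <= now:
        next_run_at += step
    return next_run_at
```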
---
## Lock Mechanism
- **Purpose:** Prevent concurrent runs for the same site
- **Key:** `automation_lock_{site_id}`
- **Storage:** Redis cache
- **Acquired:** On run start
- **Released:** On completion, failure, or cancellation
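
The acquire and release behavior can be illustrated with an in-memory stand-in. `DictCache` below is hypothetical and only imitates add-if-absent semantics (comparable to the `add` and `delete` methods of Django's cache API); whether the real service uses those calls, or sets an expiry on the key, is not stated here. `run_with_site_lock` is likewise a made-up name.

```python
class DictCache:
    """Tiny in-memory stand-in for a cache with add-if-absent semantics."""

    def __init__(self):
        self._data = {}

    def add(self, key, value):
        """Store value only if key is absent; return True when stored."""
        if key in self._data:
            return False
        self._data[key] = value
        return True

    def delete(self, key):
        self._data.pop(key, None)


def run_with_site_lock(cache, site_id, job):
    """Run job() while holding the per-site lock, releasing it on any outcome."""
    key = f"automation_lock_{site_id}"
    if not cache.add(key, "locked"):
        raise RuntimeError("Already running")
    try:
        return job()
    finally:
        # Released on completion and on failure alike.
        cache.delete(key)
```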
---
## Credit Validation
Before starting:
1. Calculate estimated credits for all stages
2. Apply 1.2x safety buffer
3. Compare with account balance
4. Reject the run if the balance is below the buffered estimate

During execution:
- Each AI stage checks credits before processing
- Deductions happen after successful AI calls
- `total_credits_used` accumulates across stages
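
The pre-run check reduces to a small piece of arithmetic, sketched here with `Decimal` to avoid floating-point rounding. The per-stage estimates are borrowed from the stage result examples earlier in this document purely for illustration; `required_credits` and `can_start` are hypothetical names.

```python
from decimal import Decimal

SAFETY_BUFFER = Decimal("1.2")  # the 1.2x buffer described above


def required_credits(stage_estimates) -> Decimal:
    """Total estimated credits across stages with the safety buffer applied."""
    return sum(stage_estimates, Decimal(0)) * SAFETY_BUFFER


def can_start(balance: Decimal, stage_estimates) -> bool:
    """A run may start only if the balance covers the buffered estimate."""
    return balance >= required_credits(stage_estimates)


# Illustrative estimates for the four credit-consuming stages in the examples.
estimates = [Decimal(45), Decimal(72), Decimal(540), Decimal(36)]
print(required_credits(estimates))            # 831.6
print(can_start(Decimal("800"), estimates))   # False
```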
---
## Frontend Integration
### AutomationPage Components
- **Config Panel:** Enable/disable, schedule settings
- **Pipeline Cards:** Stage-by-stage status with pending counts
- **Processing Card:** Live processing status during run
- **Control Buttons:** Run Now, Pause, Resume, Cancel
- **Activity Log:** Real-time log streaming
- **History Table:** Past 20 runs with status
### Polling
- Every ~5s while run is running/paused
- Fetches: `current_run`, `pipeline_overview`, `current_processing`
- Lighter polling when idle
---
## Common Issues
| Issue | Cause | Fix |
|-------|-------|-----|
| "Already running" error | Lock exists from previous run | Wait or check if stuck |
| Insufficient credits | Balance < 1.2x estimate | Add credits |
| Stage skipped | No items to process | Check previous stages |
| Run stuck | Worker crashed | Clear the `automation_lock_{site_id}` lock, then restart the run |
| Images not generating | Stage 5 didn't create prompts | Check stage 5 result |
---
## Planned Changes
| Feature | Status | Description |
|---------|--------|-------------|
| Progress bar accuracy | 🐛 Bug | Fix progress calculation based on actual stage |
| Completed count display | 🐛 Bug | Fix count display in UI |
| Stage skip configuration | 🔜 Planned | Allow skipping certain stages |
| Notification on complete | 🔜 Planned | Email/webhook when done |