diff --git a/AI_FUNCTIONS_AUDIT_REPORT.md b/AI_FUNCTIONS_AUDIT_REPORT.md new file mode 100644 index 00000000..6bd16884 --- /dev/null +++ b/AI_FUNCTIONS_AUDIT_REPORT.md @@ -0,0 +1,277 @@ +# AI Functions Deep Audit Report +## Clustering, Idea Generation, and Content Generation + +**Date:** 2025-01-XX +**Scope:** Complete audit of AI functions for clustering, idea generation, and content generation to identify unused code and files safe to remove. + +--- + +## Executive Summary + +This audit identifies **legacy code, deprecated functions, and unused implementations** that are safe to remove from the codebase. The current system uses a unified AI framework through `run_ai_task` → `AIEngine` → `BaseAIFunction` implementations. + +--- + +## 1. CURRENT ACTIVE ARCHITECTURE + +### 1.1 Active Flow (What's Actually Used) + +``` +Frontend API Call + ↓ +views.py (auto_cluster/auto_generate_ideas/auto_generate_content) + ↓ +run_ai_task (ai/tasks.py) - Unified Celery task entrypoint + ↓ +AIEngine (ai/engine.py) - Orchestrator + ↓ +BaseAIFunction implementations: + - AutoClusterFunction (ai/functions/auto_cluster.py) + - GenerateIdeasFunction (ai/functions/generate_ideas.py) + - GenerateContentFunction (ai/functions/generate_content.py) + ↓ +AICore (ai/ai_core.py) - Centralized AI request handler + ↓ +AIProvider (OpenAI/Runware) +``` + +### 1.2 Active Files (KEEP) + +#### Core Framework +- ✅ `backend/igny8_core/ai/tasks.py` - Unified Celery task (`run_ai_task`) +- ✅ `backend/igny8_core/ai/engine.py` - AIEngine orchestrator +- ✅ `backend/igny8_core/ai/base.py` - BaseAIFunction abstract class +- ✅ `backend/igny8_core/ai/ai_core.py` - AICore centralized handler +- ✅ `backend/igny8_core/ai/registry.py` - Function registry +- ✅ `backend/igny8_core/ai/prompts.py` - PromptRegistry +- ✅ `backend/igny8_core/ai/settings.py` - Model configurations +- ✅ `backend/igny8_core/ai/constants.py` - AI constants +- ✅ `backend/igny8_core/ai/tracker.py` - Progress tracking +- ✅ 
`backend/igny8_core/ai/validators.py` - Validation functions + +#### Function Implementations +- ✅ `backend/igny8_core/ai/functions/auto_cluster.py` - AutoClusterFunction +- ✅ `backend/igny8_core/ai/functions/generate_ideas.py` - GenerateIdeasFunction +- ✅ `backend/igny8_core/ai/functions/generate_content.py` - GenerateContentFunction +- ✅ `backend/igny8_core/ai/functions/__init__.py` - Exports + +#### API Endpoints +- ✅ `backend/igny8_core/modules/planner/views.py` - KeywordViewSet.auto_cluster(), ClusterViewSet.auto_generate_ideas() +- ✅ `backend/igny8_core/modules/writer/views.py` - TasksViewSet.auto_generate_content() + +--- + +## 2. DEPRECATED / UNUSED CODE (SAFE TO REMOVE) + +### 2.1 Legacy Wrapper Functions (NOT USED) + +#### ❌ `generate_ideas_core()` - **SAFE TO REMOVE** +- **Location:** `backend/igny8_core/ai/functions/generate_ideas.py:234` +- **Status:** Legacy wrapper function for backward compatibility +- **Usage:** ❌ **NOT CALLED ANYWHERE** (only in commented test code) +- **Purpose:** Was meant for direct calls without Celery, but all calls now go through `run_ai_task` +- **Action:** **REMOVE** - Function and export from `__init__.py` + +#### ❌ `generate_content_core()` - **SAFE TO REMOVE** +- **Location:** `backend/igny8_core/ai/functions/generate_content.py:303` +- **Status:** Legacy wrapper function for backward compatibility +- **Usage:** ❌ **NOT CALLED ANYWHERE** (only in commented test code) +- **Purpose:** Was meant for direct calls without Celery, but all calls now go through `run_ai_task` +- **Action:** **REMOVE** - Function and export from `__init__.py` + +### 2.2 AIProcessor Deprecated Methods (NOT USED) + +#### ❌ `AIProcessor.cluster_keywords()` - **SAFE TO REMOVE** +- **Location:** `backend/igny8_core/utils/ai_processor.py:1049-1282` +- **Status:** ⚠️ **DEPRECATED** (marked with deprecation warning) +- **Usage:** ❌ **NOT CALLED ANYWHERE** +- **Lines:** ~233 lines of code +- **Action:** **REMOVE** entire method + +#### ❌ 
`AIProcessor.generate_ideas()` - **SAFE TO REMOVE** +- **Location:** `backend/igny8_core/utils/ai_processor.py:1284-1363` +- **Status:** Legacy method +- **Usage:** ❌ **NOT CALLED ANYWHERE** +- **Lines:** ~80 lines of code +- **Action:** **REMOVE** entire method + +#### ❌ `AIProcessor.generate_content()` - **SAFE TO REMOVE** +- **Location:** `backend/igny8_core/utils/ai_processor.py:433-531` +- **Status:** Legacy method +- **Usage:** ❌ **NOT CALLED BY ACTIVE CODE** (only called internally by `extract_image_prompts`, which is itself unused) +- **Lines:** ~98 lines of code +- **Action:** **REMOVE** entire method + +#### ❌ `AIProcessor.extract_image_prompts()` - **SAFE TO REMOVE** +- **Location:** `backend/igny8_core/utils/ai_processor.py:471-580` +- **Status:** Legacy method +- **Usage:** ❌ **NOT CALLED ANYWHERE** (new framework uses `GenerateImagesFunction` which uses `AICore.run_ai_request` directly) +- **Lines:** ~110 lines of code +- **Action:** **REMOVE** entire method + +**Note:** `AIProcessor.generate_image()` is **STILL USED** in `integration_views.py` for image generation, so keep the class and that method. + +### 2.3 Unused Exports + +#### ❌ `generate_ideas_core` export - **SAFE TO REMOVE** +- **Location:** `backend/igny8_core/ai/functions/__init__.py:5,12` +- **Action:** Remove from imports and `__all__` + +#### ❌ `generate_content_core` export - **SAFE TO REMOVE** +- **Location:** `backend/igny8_core/ai/functions/__init__.py:6,14` +- **Action:** Remove from imports and `__all__` + +### 2.4 Test File Cleanup + +#### ⚠️ `backend/igny8_core/ai/tests/test_run.py` +- **Status:** Contains commented-out code referencing the functions slated for removal +- **Action:** Clean up commented code (lines 16, 63, 75, 126-127) + +--- + +## 3. 
AIProcessor - PARTIAL CLEANUP + +### 3.1 What to KEEP in AIProcessor + +✅ **KEEP:** +- `AIProcessor.__init__()` - Initialization +- `AIProcessor._get_api_key()` - API key retrieval +- `AIProcessor._get_model()` - Model retrieval +- `AIProcessor._call_openai()` - OpenAI API calls (used by generate_image) +- `AIProcessor._extract_json_from_response()` - JSON extraction +- `AIProcessor.generate_image()` - **STILL USED** in `integration_views.py` +- `AIProcessor.get_prompt()` - May be used by generate_image +- Constants imports (MODEL_RATES, etc.) - Used by AICore + +### 3.2 What to REMOVE from AIProcessor + +❌ **REMOVE:** +- `AIProcessor.cluster_keywords()` - ~233 lines +- `AIProcessor.generate_ideas()` - ~80 lines +- `AIProcessor.generate_content()` - ~98 lines +- `AIProcessor.extract_image_prompts()` - ~110 lines +- **Total:** ~521 lines of unused code + +--- + +## 4. FILES ALREADY DELETED (Good!) + +✅ **Already Removed:** +- `backend/igny8_core/modules/planner/tasks.py` - ✅ Already deleted +- `backend/igny8_core/modules/writer/tasks.py` - ✅ Already deleted + +--- + +## 5. 
SUMMARY OF REMOVALS + +### 5.1 Functions to Remove + +| Function | Location | Lines | Status | +|----------|----------|-------|--------| +| `generate_ideas_core()` | `ai/functions/generate_ideas.py:234` | ~100 | ❌ Remove | +| `generate_content_core()` | `ai/functions/generate_content.py:303` | ~85 | ❌ Remove | +| `AIProcessor.cluster_keywords()` | `utils/ai_processor.py:1049` | ~233 | ❌ Remove | +| `AIProcessor.generate_ideas()` | `utils/ai_processor.py:1284` | ~80 | ❌ Remove | +| `AIProcessor.generate_content()` | `utils/ai_processor.py:433` | ~98 | ❌ Remove | +| `AIProcessor.extract_image_prompts()` | `utils/ai_processor.py:471` | ~110 | ❌ Remove | + +**Total Lines to Remove:** ~706 lines + +### 5.2 Exports to Remove + +| Export | Location | +|--------|----------| +| `generate_ideas_core` | `ai/functions/__init__.py:5,12` | +| `generate_content_core` | `ai/functions/__init__.py:6,14` | + +### 5.3 Test Cleanup + +| File | Action | +|------|--------| +| `ai/tests/test_run.py` | Remove commented code (lines 16, 63, 75, 126-127) | + +--- + +## 6. VERIFICATION CHECKLIST + +Before removing, verify: + +- [ ] ✅ No imports of `generate_ideas_core` or `generate_content_core` anywhere +- [ ] ✅ No calls to `AIProcessor.cluster_keywords()`, `generate_ideas()`, or `generate_content()` +- [ ] ✅ All active code paths use `run_ai_task` → `AIEngine` → `BaseAIFunction` +- [ ] ✅ `AIProcessor.generate_image()` is still used (verify in `integration_views.py`) +- [ ] ✅ Constants from `ai_processor.py` are still imported by `AICore` (verify) + +--- + +## 7. RECOMMENDED REMOVAL ORDER + +1. **Phase 1: Remove Legacy Wrapper Functions** + - Remove `generate_ideas_core()` from `generate_ideas.py` + - Remove `generate_content_core()` from `generate_content.py` + - Remove exports from `__init__.py` + +2. 
**Phase 2: Remove AIProcessor Deprecated Methods** + - Remove `AIProcessor.cluster_keywords()` + - Remove `AIProcessor.generate_ideas()` + - Remove `AIProcessor.generate_content()` + - Remove `AIProcessor.extract_image_prompts()` (calls `generate_content()` internally, so remove both in the same change) + +3. **Phase 3: Cleanup Tests** + - Clean up commented code in `test_run.py` + +4. **Phase 4: Verification** + - Run full test suite + - Verify all AI functions still work + - Check for any broken imports + +--- + +## 8. IMPACT ANALYSIS + +### 8.1 No Breaking Changes Expected + +- ✅ All active code paths use the new framework +- ✅ No external dependencies on removed functions +- ✅ Views only use `run_ai_task` (verified) + +### 8.2 Benefits + +- 🎯 **~706 lines of dead code removed** +- 🎯 **Clearer codebase** - Only active code remains +- 🎯 **Easier maintenance** - No confusion about which path to use +- 🎯 **Reduced technical debt** + +--- + +## 9. FILES TO MODIFY + +### 9.1 Files to Edit + +1. `backend/igny8_core/ai/functions/generate_ideas.py` + - Remove `generate_ideas_core()` function (lines 234-333) + +2. `backend/igny8_core/ai/functions/generate_content.py` + - Remove `generate_content_core()` function (lines 303-386) + +3. `backend/igny8_core/ai/functions/__init__.py` + - Remove `generate_ideas_core` import and export + - Remove `generate_content_core` import and export + +4. `backend/igny8_core/utils/ai_processor.py` + - Remove `cluster_keywords()` method (lines 1049-1282) + - Remove `generate_ideas()` method (lines 1284-1363) + - Remove `generate_content()` method (lines 433-531) + - Remove `extract_image_prompts()` method (lines 471-580) + +5. `backend/igny8_core/ai/tests/test_run.py` + - Remove commented code referencing removed functions + +--- + +## 10. CONCLUSION + +This audit identifies **~706 lines of unused legacy code** that can be safely removed without impacting functionality. 
All active code paths use the unified AI framework (`run_ai_task` → `AIEngine` → `BaseAIFunction`), making these legacy functions obsolete. + +**Recommendation:** Proceed with removal in phases as outlined above, with thorough testing after each phase. + diff --git a/AI_MASTER_ARCHITECTURE.md b/AI_MASTER_ARCHITECTURE.md new file mode 100644 index 00000000..bfbeaff6 --- /dev/null +++ b/AI_MASTER_ARCHITECTURE.md @@ -0,0 +1,802 @@ +# AI Master Architecture Document +## Clustering, Idea Generation, and Content Generation + +**Version:** 1.0 +**Date:** 2025-01-XX +**Scope:** Complete architecture for 3 verified AI functions (clustering, idea generation, content generation) + +--- + +## Table of Contents + +1. [Common Architecture](#1-common-architecture) +2. [Auto Cluster Keywords](#2-auto-cluster-keywords) +3. [Generate Ideas](#3-generate-ideas) +4. [Generate Content](#4-generate-content) + +--- + +## 1. Common Architecture + +### 1.1 Core Framework Files + +#### Entry Point +- **File:** `backend/igny8_core/ai/tasks.py` +- **Function:** `run_ai_task` +- **Purpose:** Unified Celery task entrypoint for all AI functions +- **Parameters:** `function_name` (str), `payload` (dict), `account_id` (int) +- **Flow:** Loads function from registry → Creates AIEngine → Executes function + +#### Engine Orchestrator +- **File:** `backend/igny8_core/ai/engine.py` +- **Class:** `AIEngine` +- **Purpose:** Central orchestrator managing lifecycle, progress, logging, cost tracking +- **Methods:** + - `execute` - Main execution pipeline (6 phases: INIT, PREP, AI_CALL, PARSE, SAVE, DONE) + - `_handle_error` - Centralized error handling + - `_log_to_database` - Logs to AITaskLog model + - Helper methods: `_get_input_description`, `_build_validation_message`, `_get_prep_message`, `_get_ai_call_message`, `_get_parse_message`, `_get_parse_message_with_count`, `_get_save_message`, `_calculate_credits_for_clustering` + +#### Base Function Class +- **File:** `backend/igny8_core/ai/base.py` +- 
**Class:** `BaseAIFunction` +- **Purpose:** Abstract base class defining interface for all AI functions +- **Abstract Methods:** + - `get_name` - Returns function name (e.g., 'auto_cluster') + - `prepare` - Loads and prepares data + - `build_prompt` - Builds AI prompt + - `parse_response` - Parses AI response + - `save_output` - Saves results to database +- **Optional Methods:** + - `get_metadata` - Returns display name, description, phases + - `get_max_items` - Returns max items limit (or None) + - `validate` - Validates input payload (default: checks for 'ids') + - `get_model` - Returns model override (default: None, uses account default) + +#### Function Registry +- **File:** `backend/igny8_core/ai/registry.py` +- **Functions:** + - `register_function` - Registers function class + - `register_lazy_function` - Registers lazy loader + - `get_function` - Gets function class by name (lazy loads if needed) + - `get_function_instance` - Gets function instance by name + - `list_functions` - Lists all registered functions +- **Lazy Loaders:** + - `_load_auto_cluster` - Loads AutoClusterFunction + - `_load_generate_ideas` - Loads GenerateIdeasFunction + - `_load_generate_content` - Loads GenerateContentFunction + +#### AI Core Handler +- **File:** `backend/igny8_core/ai/ai_core.py` +- **Class:** `AICore` +- **Purpose:** Centralized AI request handler for all text generation +- **Methods:** + - `run_ai_request` - Makes API call to OpenAI/Runware + - `extract_json` - Extracts JSON from response (handles markdown code blocks) + +#### Prompt Registry +- **File:** `backend/igny8_core/ai/prompts.py` +- **Class:** `PromptRegistry` +- **Purpose:** Centralized prompt management with hierarchical resolution +- **Method:** `get_prompt` - Gets prompt with resolution order: + 1. Task-level prompt_override (if exists) + 2. DB prompt for (account, function) + 3. 
Default fallback from DEFAULT_PROMPTS registry +- **Prompt Types:** + - `clustering` - For auto_cluster function + - `ideas` - For generate_ideas function + - `content_generation` - For generate_content function +- **Context Placeholders:** + - `[IGNY8_KEYWORDS]` - Replaced with keyword list + - `[IGNY8_CLUSTERS]` - Replaced with cluster list + - `[IGNY8_CLUSTER_KEYWORDS]` - Replaced with cluster keywords + - `[IGNY8_IDEA]` - Replaced with idea data + - `[IGNY8_CLUSTER]` - Replaced with cluster data + - `[IGNY8_KEYWORDS]` - Replaced with keywords (for content) + +#### Model Settings +- **File:** `backend/igny8_core/ai/settings.py` +- **Constants:** + - `MODEL_CONFIG` - Model configurations per function (model, max_tokens, temperature, response_format) + - `FUNCTION_ALIASES` - Legacy function name mappings +- **Functions:** + - `get_model_config` - Gets model config for function (reads from IntegrationSettings if account provided) + - `get_model` - Gets model name for function + - `get_max_tokens` - Gets max tokens for function + - `get_temperature` - Gets temperature for function + +#### Validators +- **File:** `backend/igny8_core/ai/validators.py` +- **Functions:** + - `validate_ids` - Validates 'ids' array in payload + - `validate_keywords_exist` - Validates keywords exist in database + - `validate_cluster_exists` - Validates cluster exists + - `validate_tasks_exist` - Validates tasks exist + - `validate_cluster_limits` - Validates plan limits (currently disabled - always returns valid) + - `validate_api_key` - Validates API key is configured + - `validate_model` - Validates model is in supported list + - `validate_image_size` - Validates image size for model + +#### Progress Tracking +- **File:** `backend/igny8_core/ai/tracker.py` +- **Classes:** + - `StepTracker` - Tracks request/response steps + - `ProgressTracker` - Tracks Celery progress updates + - `CostTracker` - Tracks API costs and tokens + - `ConsoleStepTracker` - Console-based step logging + +#### 
Database Logging +- **File:** `backend/igny8_core/ai/models.py` +- **Model:** `AITaskLog` +- **Fields:** `task_id`, `function_name`, `account`, `phase`, `message`, `status`, `duration`, `cost`, `tokens`, `request_steps`, `response_steps`, `error`, `payload`, `result` + +### 1.2 Execution Flow (All Functions) + +``` +1. API Endpoint (views.py) + ↓ +2. run_ai_task (tasks.py) + - Gets account from account_id + - Gets function instance from registry + - Creates AIEngine + ↓ +3. AIEngine.execute (engine.py) + Phase 1: INIT (0-10%) + - Calls function.validate() + - Updates progress tracker + ↓ + Phase 2: PREP (10-25%) + - Calls function.prepare() + - Calls function.build_prompt() + - Updates progress tracker + ↓ + Phase 3: AI_CALL (25-70%) + - Gets model config from settings + - Calls AICore.run_ai_request() + - Tracks cost and tokens + - Updates progress tracker + ↓ + Phase 4: PARSE (70-85%) + - Calls function.parse_response() + - Updates progress tracker + ↓ + Phase 5: SAVE (85-98%) + - Calls function.save_output() + - Logs credit usage + - Updates progress tracker + ↓ + Phase 6: DONE (98-100%) + - Logs to AITaskLog + - Returns result +``` + +--- + +## 2. 
Auto Cluster Keywords + +### 2.1 Function Implementation + +- **File:** `backend/igny8_core/ai/functions/auto_cluster.py` +- **Class:** `AutoClusterFunction` +- **Inherits:** `BaseAIFunction` + +### 2.2 API Endpoint + +- **File:** `backend/igny8_core/modules/planner/views.py` +- **ViewSet:** `KeywordViewSet` +- **Action:** `auto_cluster` +- **Method:** POST +- **URL Path:** `/v1/planner/keywords/auto_cluster/` +- **Payload:** + - `ids` (list[int]) - Keyword IDs to cluster + - `sector_id` (int, optional) - Sector ID for filtering +- **Response:** + - `success` (bool) + - `task_id` (str) - Celery task ID if async + - `clusters_created` (int) - Number of clusters created + - `keywords_updated` (int) - Number of keywords updated + - `message` (str) + +### 2.3 Function Methods + +#### `get_name()` +- **Returns:** `'auto_cluster'` + +#### `get_metadata()` +- **Returns:** Dict with `display_name`, `description`, `phases` (INIT, PREP, AI_CALL, PARSE, SAVE, DONE) + +#### `get_max_items()` +- **Returns:** `None` (no limit) + +#### `validate(payload, account)` +- **Validates:** + - Calls `validate_ids` to check for 'ids' array + - Calls `validate_keywords_exist` to verify keywords exist +- **Returns:** Dict with `valid` (bool) and optional `error` (str) + +#### `prepare(payload, account)` +- **Loads:** + - Keywords from database (filters by `ids`, `account`, optional `sector_id`) + - Uses `select_related` for: `account`, `site`, `site__account`, `sector`, `sector__site` +- **Returns:** Dict with: + - `keywords` (list[Keyword objects]) + - `keyword_data` (list[dict]) - Formatted data with: `id`, `keyword`, `volume`, `difficulty`, `intent` + - `sector_id` (int, optional) + +#### `build_prompt(data, account)` +- **Gets Prompt:** + - Calls `PromptRegistry.get_prompt(function_name='auto_cluster', account, context)` + - Context includes: `KEYWORDS` (formatted keyword list), optional `SECTOR` (sector name) +- **Formatting:** + - Formats keywords as: `"- {keyword} (Volume: {volume}, 
Difficulty: {difficulty}, Intent: {intent})"` + - Replaces `[IGNY8_KEYWORDS]` placeholder + - Adds JSON mode instruction if not present +- **Returns:** Prompt string + +#### `parse_response(response, step_tracker)` +- **Parsing:** + - Tries direct JSON parse first + - Falls back to `AICore.extract_json()` if needed (handles markdown code blocks) +- **Extraction:** + - Extracts `clusters` array from JSON + - Handles both dict with 'clusters' key and direct array +- **Returns:** List[Dict] with cluster data: + - `name` (str) - Cluster name + - `description` (str) - Cluster description + - `keywords` (list[str]) - List of keyword strings + +#### `save_output(parsed, original_data, account, progress_tracker, step_tracker)` +- **Input:** + - `parsed` - List of cluster dicts from parse_response + - `original_data` - Dict from prepare() with `keywords` and `sector_id` +- **Process:** + - Gets account, site, sector from first keyword + - For each cluster in parsed: + - Gets or creates `Clusters` record: + - Fields: `name`, `description`, `account`, `site`, `sector`, `status='active'` + - Uses `get_or_create` with name + account + site + sector + - Matches keywords (case-insensitive): + - Normalizes cluster keywords and available keywords to lowercase + - Updates matched `Keywords` records: + - Sets `cluster` foreign key + - Sets `status='mapped'` + - Recalculates cluster metrics: + - `keywords_count` - Count of keywords in cluster + - `volume` - Sum of keyword volumes (uses `volume_override` if available, else `seed_keyword__volume`) +- **Returns:** Dict with: + - `count` (int) - Clusters created + - `clusters_created` (int) - Clusters created + - `keywords_updated` (int) - Keywords updated + +### 2.4 Database Models + +#### Keywords Model +- **File:** `backend/igny8_core/modules/planner/models.py` +- **Model:** `Keywords` +- **Fields Used:** + - `id` - Keyword ID + - `seed_keyword` (ForeignKey) - Reference to SeedKeyword + - `keyword` (property) - Gets keyword text from 
seed_keyword + - `volume` (property) - Gets volume from volume_override or seed_keyword + - `difficulty` (property) - Gets difficulty from difficulty_override or seed_keyword + - `intent` (property) - Gets intent from seed_keyword + - `cluster` (ForeignKey) - Assigned cluster + - `status` - Status ('active', 'pending', 'mapped', 'archived') + - `account`, `site`, `sector` - From SiteSectorBaseModel + +#### Clusters Model +- **File:** `backend/igny8_core/modules/planner/models.py` +- **Model:** `Clusters` +- **Fields Used:** + - `name` - Cluster name (unique) + - `description` - Cluster description + - `keywords_count` - Count of keywords (recalculated) + - `volume` - Sum of keyword volumes (recalculated) + - `status` - Status ('active') + - `account`, `site`, `sector` - From SiteSectorBaseModel + +### 2.5 AI Response Format + +**Expected JSON:** +```json +{ + "clusters": [ + { + "name": "Cluster Name", + "description": "Cluster description", + "keywords": ["keyword1", "keyword2", "keyword3"] + } + ] +} +``` + +### 2.6 Progress Messages + +- **INIT:** "Validating {keyword1}, {keyword2}, {keyword3} and {X} more keywords" (shows first 3, then count) +- **PREP:** "Loading {count} keyword(s)" +- **AI_CALL:** "Generating clusters with Igny8 Semantic SEO Model" +- **PARSE:** "{count} cluster(s) created" +- **SAVE:** "Saving {count} cluster(s)" + +--- + +## 3. 
Generate Ideas + +### 3.1 Function Implementation + +- **File:** `backend/igny8_core/ai/functions/generate_ideas.py` +- **Class:** `GenerateIdeasFunction` +- **Inherits:** `BaseAIFunction` + +### 3.2 API Endpoint + +- **File:** `backend/igny8_core/modules/planner/views.py` +- **ViewSet:** `ClusterViewSet` +- **Action:** `auto_generate_ideas` +- **Method:** POST +- **URL Path:** `/v1/planner/clusters/auto_generate_ideas/` +- **Payload:** + - `ids` (list[int]) - Cluster IDs (max 10) +- **Response:** + - `success` (bool) + - `task_id` (str) - Celery task ID if async + - `ideas_created` (int) - Number of ideas created + - `message` (str) + +### 3.3 Function Methods + +#### `get_name()` +- **Returns:** `'generate_ideas'` + +#### `get_metadata()` +- **Returns:** Dict with `display_name`, `description`, `phases` (INIT, PREP, AI_CALL, PARSE, SAVE, DONE) + +#### `get_max_items()` +- **Returns:** `10` (max clusters per generation) + +#### `validate(payload, account)` +- **Validates:** + - Calls `super().validate()` to check for 'ids' array and max_items limit + - Calls `validate_cluster_exists` for first cluster ID + - Calls `validate_cluster_limits` for plan limits (currently disabled) +- **Returns:** Dict with `valid` (bool) and optional `error` (str) + +#### `prepare(payload, account)` +- **Loads:** + - Clusters from database (filters by `ids`, `account`) + - Uses `select_related` for: `sector`, `account`, `site`, `sector__site` + - Uses `prefetch_related` for: `keywords` +- **Gets Keywords:** + - For each cluster, loads `Keywords` with `select_related('seed_keyword')` + - Extracts keyword text from `seed_keyword.keyword` +- **Returns:** Dict with: + - `clusters` (list[Cluster objects]) + - `cluster_data` (list[dict]) - Formatted data with: `id`, `name`, `description`, `keywords` (list[str]) + - `account` (Account object) + +#### `build_prompt(data, account)` +- **Gets Prompt:** + - Calls `PromptRegistry.get_prompt(function_name='generate_ideas', account, context)` + - 
Context includes: + - `CLUSTERS` - Formatted cluster list: `"Cluster ID: {id} | Name: {name} | Description: {description}"` + - `CLUSTER_KEYWORDS` - Formatted cluster keywords: `"Cluster ID: {id} | Name: {name} | Keywords: {keyword1}, {keyword2}"` +- **Replaces Placeholders:** + - `[IGNY8_CLUSTERS]` → clusters_text + - `[IGNY8_CLUSTER_KEYWORDS]` → cluster_keywords_text +- **Returns:** Prompt string + +#### `parse_response(response, step_tracker)` +- **Parsing:** + - Calls `AICore.extract_json()` to extract JSON from response + - Validates 'ideas' key exists in JSON +- **Returns:** List[Dict] with idea data: + - `title` (str) - Idea title + - `description` (str or dict) - Idea description (can be JSON string) + - `content_type` (str) - Content type ('blog_post', 'article', etc.) + - `content_structure` (str) - Content structure ('cluster_hub', 'supporting_page', etc.) + - `cluster_id` (int, optional) - Cluster ID reference + - `cluster_name` (str, optional) - Cluster name reference + - `estimated_word_count` (int) - Estimated word count + - `covered_keywords` or `target_keywords` (str) - Target keywords + +#### `save_output(parsed, original_data, account, progress_tracker, step_tracker)` +- **Input:** + - `parsed` - List of idea dicts from parse_response + - `original_data` - Dict from prepare() with `clusters` and `cluster_data` +- **Process:** + - For each idea in parsed: + - Matches cluster: + - First tries by `cluster_id` from AI response + - Falls back to `cluster_name` matching + - Last resort: position-based matching (first idea → first cluster) + - Gets site from cluster (or cluster.sector.site) + - Handles description: + - If dict, converts to JSON string + - If not string, converts to string + - Creates `ContentIdeas` record: + - Fields: + - `idea_title` - From `title` + - `description` - Processed description + - `content_type` - From `content_type` (default: 'blog_post') + - `content_structure` - From `content_structure` (default: 'supporting_page') + - 
`target_keywords` - From `covered_keywords` or `target_keywords` + - `keyword_cluster` - Matched cluster + - `estimated_word_count` - From `estimated_word_count` (default: 1500) + - `status` - 'new' + - `account`, `site`, `sector` - From cluster +- **Returns:** Dict with: + - `count` (int) - Ideas created + - `ideas_created` (int) - Ideas created + +### 3.4 Database Models + +#### Clusters Model +- **File:** `backend/igny8_core/modules/planner/models.py` +- **Model:** `Clusters` +- **Fields Used:** + - `id` - Cluster ID + - `name` - Cluster name + - `description` - Cluster description + - `keywords` (related_name) - Related Keywords + - `account`, `site`, `sector` - From SiteSectorBaseModel + +#### ContentIdeas Model +- **File:** `backend/igny8_core/modules/planner/models.py` +- **Model:** `ContentIdeas` +- **Fields Used:** + - `idea_title` - Idea title + - `description` - Idea description (can be JSON string) + - `content_type` - Content type ('blog_post', 'article', 'guide', 'tutorial') + - `content_structure` - Content structure ('cluster_hub', 'landing_page', 'pillar_page', 'supporting_page') + - `target_keywords` - Target keywords string + - `keyword_cluster` (ForeignKey) - Related cluster + - `estimated_word_count` - Estimated word count + - `status` - Status ('new', 'scheduled', 'published') + - `account`, `site`, `sector` - From SiteSectorBaseModel + +### 3.5 AI Response Format + +**Expected JSON:** +```json +{ + "ideas": [ + { + "title": "Idea Title", + "description": "Idea description or JSON structure", + "content_type": "blog_post", + "content_structure": "supporting_page", + "cluster_id": 1, + "cluster_name": "Cluster Name", + "estimated_word_count": 1500, + "covered_keywords": "keyword1, keyword2" + } + ] +} +``` + +### 3.6 Progress Messages + +- **INIT:** "Verifying cluster integrity" +- **PREP:** "Loading cluster keywords" +- **AI_CALL:** "Generating ideas with Igny8 Semantic AI" +- **PARSE:** "{count} high-opportunity idea(s) generated" +- 
**SAVE:** "Content Outline for Ideas generated" + +--- + +## 4. Generate Content + +### 4.1 Function Implementation + +- **File:** `backend/igny8_core/ai/functions/generate_content.py` +- **Class:** `GenerateContentFunction` +- **Inherits:** `BaseAIFunction` + +### 4.2 API Endpoint + +- **File:** `backend/igny8_core/modules/writer/views.py` +- **ViewSet:** `TasksViewSet` +- **Action:** `auto_generate_content` +- **Method:** POST +- **URL Path:** `/v1/writer/tasks/auto_generate_content/` +- **Payload:** + - `ids` (list[int]) - Task IDs (max 10) +- **Response:** + - `success` (bool) + - `task_id` (str) - Celery task ID if async + - `tasks_updated` (int) - Number of tasks updated + - `message` (str) + +### 4.3 Function Methods + +#### `get_name()` +- **Returns:** `'generate_content'` + +#### `get_metadata()` +- **Returns:** Dict with `display_name`, `description`, `phases` (INIT, PREP, AI_CALL, PARSE, SAVE, DONE) + +#### `get_max_items()` +- **Returns:** `50` (max tasks per batch) + +#### `validate(payload, account)` +- **Validates:** + - Calls `super().validate()` to check for 'ids' array and max_items limit + - Calls `validate_tasks_exist` to verify tasks exist +- **Returns:** Dict with `valid` (bool) and optional `error` (str) + +#### `prepare(payload, account)` +- **Loads:** + - Tasks from database (filters by `ids`, `account`) + - Uses `select_related` for: `account`, `site`, `sector`, `cluster`, `idea` +- **Returns:** List[Task objects] + +#### `build_prompt(data, account)` +- **Input:** Can be single Task or list[Task] (handles first task if list) +- **Builds Idea Data:** + - `title` - From task.title + - `description` - From task.description + - `outline` - From task.idea.description (handles JSON structure): + - If JSON, formats as: `"## {H2 heading}\n### {H3 subheading}\nContent Type: {type}\nDetails: {details}"` + - If plain text, uses as-is + - `structure` - From task.idea.content_structure or task.content_structure + - `type` - From task.idea.content_type 
or task.content_type + - `estimated_word_count` - From task.idea.estimated_word_count +- **Builds Cluster Data:** + - `cluster_name` - From task.cluster.name + - `description` - From task.cluster.description + - `status` - From task.cluster.status +- **Builds Keywords Data:** + - From task.keywords (legacy) or task.idea.target_keywords +- **Gets Prompt:** + - Calls `PromptRegistry.get_prompt(function_name='generate_content', account, task, context)` + - Context includes: + - `IDEA` - Formatted idea data string + - `CLUSTER` - Formatted cluster data string + - `KEYWORDS` - Keywords string +- **Returns:** Prompt string + +#### `parse_response(response, step_tracker)` +- **Parsing:** + - First tries JSON parse: + - If successful and dict, returns dict + - Falls back to plain text: + - Calls `normalize_content()` from `content_normalizer` to convert to HTML + - Returns dict with `content` field +- **Returns:** Dict with: + - **If JSON:** + - `content` (str) - HTML content + - `title` (str, optional) - Content title + - `meta_title` (str, optional) - Meta title + - `meta_description` (str, optional) - Meta description + - `word_count` (int, optional) - Word count + - `primary_keyword` (str, optional) - Primary keyword + - `secondary_keywords` (list, optional) - Secondary keywords + - `tags` (list, optional) - Tags + - `categories` (list, optional) - Categories + - **If Plain Text:** + - `content` (str) - Normalized HTML content + +#### `save_output(parsed, original_data, account, progress_tracker, step_tracker)` +- **Input:** + - `parsed` - Dict from parse_response + - `original_data` - Task object or list[Task] (handles first task if list) +- **Process:** + - Extracts content fields from parsed dict: + - `content_html` - From `content` field + - `title` - From `title` or task.title + - `meta_title` - From `meta_title` or task.meta_title or task.title + - `meta_description` - From `meta_description` or task.meta_description or task.description + - `word_count` - From 
`word_count` or calculated from content + - `primary_keyword` - From `primary_keyword` + - `secondary_keywords` - From `secondary_keywords` (converts to list if needed) + - `tags` - From `tags` (converts to list if needed) + - `categories` - From `categories` (converts to list if needed) + - Calculates word count if not provided: + - Strips HTML tags and counts words + - Gets or creates `Content` record: + - Uses `get_or_create` with `task` (OneToOne relationship) + - Defaults: `html_content`, `word_count`, `status='draft'`, `account`, `site`, `sector` + - Updates `Content` fields: + - `html_content` - Content HTML + - `word_count` - Word count + - `title` - Content title + - `meta_title` - Meta title + - `meta_description` - Meta description + - `primary_keyword` - Primary keyword + - `secondary_keywords` - Secondary keywords (JSONField) + - `tags` - Tags (JSONField) + - `categories` - Categories (JSONField) + - `status` - Always 'draft' for newly generated content + - `metadata` - Extra fields from parsed dict (excludes standard fields) + - `account`, `site`, `sector`, `task` - Aligned from task + - Updates `Tasks` record: + - Sets `status='completed'` + - Updates `updated_at` +- **Returns:** Dict with: + - `count` (int) - Tasks updated (always 1 per task) + - `tasks_updated` (int) - Tasks updated + - `word_count` (int) - Word count + +### 4.4 Database Models + +#### Tasks Model +- **File:** `backend/igny8_core/modules/writer/models.py` +- **Model:** `Tasks` +- **Fields Used:** + - `id` - Task ID + - `title` - Task title + - `description` - Task description + - `keywords` - Keywords string (legacy) + - `cluster` (ForeignKey) - Related cluster + - `idea` (ForeignKey) - Related ContentIdeas + - `content_structure` - Content structure + - `content_type` - Content type + - `status` - Status ('queued', 'completed') + - `meta_title` - Meta title + - `meta_description` - Meta description + - `account`, `site`, `sector` - From SiteSectorBaseModel + +#### Content Model +- 
**File:** `backend/igny8_core/modules/writer/models.py` +- **Model:** `Content` +- **Fields Used:** + - `task` (OneToOneField) - Related task + - `html_content` - HTML content + - `word_count` - Word count + - `title` - Content title + - `meta_title` - Meta title + - `meta_description` - Meta description + - `primary_keyword` - Primary keyword + - `secondary_keywords` (JSONField) - Secondary keywords list + - `tags` (JSONField) - Tags list + - `categories` (JSONField) - Categories list + - `status` - Status ('draft', 'review', 'published') + - `metadata` (JSONField) - Additional metadata + - `account`, `site`, `sector` - From SiteSectorBaseModel (auto-set from task) + +### 4.5 AI Response Format + +**Expected JSON:** +```json +{ + "content": "Content HTML", + "title": "Content Title", + "meta_title": "Meta Title", + "meta_description": "Meta description", + "word_count": 1500, + "primary_keyword": "primary keyword", + "secondary_keywords": ["keyword1", "keyword2"], + "tags": ["tag1", "tag2"], + "categories": ["category1"] +} +``` + +**Or Plain Text:** +``` +Plain text content that will be normalized to HTML +``` + +### 4.6 Progress Messages + +- **INIT:** "Validating task" +- **PREP:** "Preparing content idea" +- **AI_CALL:** "Writing article with Igny8 Semantic AI" +- **PARSE:** "{count} article(s) created" +- **SAVE:** "Saving article" + +--- + +## 5. 
Change Guide + +### 5.1 Where to Change Validation Logic + +- **File:** `backend/igny8_core/ai/validators.py` +- **Functions:** `validate_ids`, `validate_keywords_exist`, `validate_cluster_exists`, `validate_tasks_exist` +- **Or:** Override `validate()` method in function class + +### 5.2 Where to Change Data Loading + +- **File:** Function-specific file (e.g., `auto_cluster.py`) +- **Method:** `prepare()` +- **Change:** Modify queryset filters, select_related, prefetch_related + +### 5.3 Where to Change Prompts + +- **File:** `backend/igny8_core/ai/prompts.py` +- **Method:** `PromptRegistry.get_prompt()` +- **Change:** Modify `DEFAULT_PROMPTS` dict or update database prompts + +### 5.4 Where to Change Model Configuration + +- **File:** `backend/igny8_core/ai/settings.py` +- **Constant:** `MODEL_CONFIG` +- **Change:** Update model, max_tokens, temperature, response_format per function + +### 5.5 Where to Change Response Parsing + +- **File:** Function-specific file (e.g., `generate_content.py`) +- **Method:** `parse_response()` +- **Change:** Modify JSON extraction or plain text handling + +### 5.6 Where to Change Database Saving + +- **File:** Function-specific file (e.g., `auto_cluster.py`) +- **Method:** `save_output()` +- **Change:** Modify model creation/update logic, field mappings + +### 5.7 Where to Change Progress Messages + +- **File:** `backend/igny8_core/ai/engine.py` +- **Methods:** `_get_prep_message()`, `_get_ai_call_message()`, `_get_parse_message()`, `_get_save_message()` +- **Or:** Override in function class `get_metadata()` phases + +### 5.8 Where to Change Error Handling + +- **File:** `backend/igny8_core/ai/engine.py` +- **Method:** `_handle_error()` +- **Change:** Modify error logging, error response format + +--- + +## 6. 
Dependencies + +### 6.1 Function Dependencies + +- All functions depend on: `BaseAIFunction`, `AICore`, `PromptRegistry`, `get_model_config` +- Clustering depends on: `Keywords`, `Clusters` models +- Ideas depends on: `Clusters`, `ContentIdeas`, `Keywords` models +- Content depends on: `Tasks`, `Content`, `ContentIdeas`, `Clusters` models + +### 6.2 External Dependencies + +- **Celery:** For async task execution (`run_ai_task`) +- **OpenAI API:** For AI text generation (via `AICore.run_ai_request`) +- **Django ORM:** For database operations +- **IntegrationSettings:** For account-specific model configuration + +--- + +## 7. Key Relationships + +### 7.1 Clustering Flow +``` +Keywords → Clusters (many-to-one) +- Keywords.cluster (ForeignKey) +- Clusters.keywords (related_name) +``` + +### 7.2 Ideas Flow +``` +Clusters → ContentIdeas (one-to-many) +- ContentIdeas.keyword_cluster (ForeignKey) +- Clusters.ideas (related_name, if exists) +``` + +### 7.3 Content Flow +``` +Tasks → Content (one-to-one) +- Content.task (OneToOneField) +- Tasks.content_record (related_name) + +Tasks → ContentIdeas (many-to-one) +- Tasks.idea (ForeignKey) +- ContentIdeas.tasks (related_name) + +Tasks → Clusters (many-to-one) +- Tasks.cluster (ForeignKey) +- Clusters.tasks (related_name) +``` + +--- + +## 8. Notes + +- All functions use the same execution pipeline through `AIEngine.execute()` +- Progress tracking is handled automatically by `AIEngine` +- Cost tracking is handled automatically by `CostTracker` +- Database logging is handled automatically by `AITaskLog` +- Model configuration can be overridden per account via `IntegrationSettings` +- Prompts can be overridden per account via database prompts +- All functions support both async (Celery) and sync execution +- Error handling is centralized in `AIEngine._handle_error()` +
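
The JSON-first, plain-text-fallback parsing and the word-count fallback described for `GenerateContentFunction` in section 4.3 can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: `normalize_to_html` is a hypothetical stand-in for `normalize_content()` (whose real behavior is not shown in this document), and `count_words` is an assumed helper name for the "strip HTML tags and count words" step.

```python
import json
import re
from html import escape


def normalize_to_html(text: str) -> str:
    """Hypothetical stand-in for normalize_content(): wrap each
    blank-line-separated paragraph in <p> tags."""
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    return "".join(f"<p>{escape(p)}</p>" for p in paragraphs)


def count_words(html: str) -> int:
    """Assumed helper: strip HTML tags, then count whitespace-separated words."""
    text = re.sub(r"<[^>]+>", " ", html)
    return len(text.split())


def parse_response(response: str) -> dict:
    """Try JSON first; if the result is not a dict (or parsing fails),
    treat the response as plain text and normalize it to HTML."""
    try:
        parsed = json.loads(response)
        if isinstance(parsed, dict):
            return parsed
    except json.JSONDecodeError:
        pass
    return {"content": normalize_to_html(response)}


if __name__ == "__main__":
    structured = parse_response('{"content": "<p>Hi</p>", "title": "Demo"}')
    plain = parse_response("First paragraph.\n\nSecond one.")
    # Use the AI-provided word_count if present, otherwise calculate it.
    word_count = structured.get("word_count") or count_words(structured["content"])
    print(structured["title"], word_count)
    print(plain["content"], count_words(plain["content"]))
```

Returning the same dict shape from both branches means `save_output()` can read `content` without caring which path produced it; the optional fields are simply absent in the plain-text case.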