Stage 1 & 2 refactor of AI engine

This commit is contained in:
alorig
2025-11-09 19:22:15 +05:00
parent 8cd036d8ce
commit 375473308d
18 changed files with 2396 additions and 76 deletions

# Stage 1 - AI Folder Structure & Functional Split - COMPLETE ✅
## Summary
Successfully reorganized the AI backend into a clean, modular structure where every AI function lives inside its own file within `/ai/functions/`.
## ✅ Completed Deliverables
### 1. Folder Structure Created
```
backend/igny8_core/ai/
├── functions/
│   ├── __init__.py ✅
│   ├── auto_cluster.py ✅
│   ├── generate_ideas.py ✅
│   ├── generate_content.py ✅
│   └── generate_images.py ✅
├── ai_core.py ✅ (Shared operations)
├── validators.py ✅ (Consolidated validation)
├── constants.py ✅ (Model pricing, valid models)
├── engine.py ✅ (Updated to use AICore)
├── tracker.py ✅ (Existing)
├── base.py ✅ (Existing)
├── processor.py ✅ (Existing wrapper)
├── registry.py ✅ (Updated with new functions)
└── __init__.py ✅ (Updated exports)
```
### 2. Shared Modules Created
#### `ai_core.py`
- **Purpose**: Shared operations for all AI functions
- **Features**:
- API call construction (`call_openai`)
- Model selection (`get_model`, `get_api_key`)
- Response parsing (`extract_json`)
- Image generation (`generate_image`)
- Cost calculation (`calculate_cost`)
- **Status**: ✅ Complete
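The responsibilities listed above can be sketched as a minimal class. The method bodies below are illustrative stand-ins (the real `AICore` wraps the legacy `utils.ai_processor`), not the actual implementation:

```python
import json
import re

class AICoreSketch:
    """Illustrative sketch of the shared-operations class; bodies are
    simplified stand-ins showing the intended responsibilities."""

    def __init__(self, account=None):
        self.account = account

    def get_model(self, override=None):
        # Model selection: an explicit override wins, else a default.
        return override or "gpt-4o"

    def extract_json(self, content):
        # Response parsing: tolerate a ```json fenced block around the payload.
        match = re.search(r"```(?:json)?\s*(.*?)\s*```", content, re.DOTALL)
        if match:
            content = match.group(1)
        return json.loads(content)

    def calculate_cost(self, input_tokens, output_tokens, rate_in, rate_out):
        # Cost calculation from per-1K-token input/output rates.
        return (input_tokens / 1000) * rate_in + (output_tokens / 1000) * rate_out
```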
#### `validators.py`
- **Purpose**: Consolidated validation logic
- **Functions**:
- `validate_ids()` - Base ID validation
- `validate_keywords_exist()` - Keyword existence check
- `validate_cluster_limits()` - Plan limit checks
- `validate_cluster_exists()` - Cluster existence
- `validate_tasks_exist()` - Task existence
- `validate_api_key()` - API key validation
- `validate_model()` - Model validation
- `validate_image_size()` - Image size validation
- **Status**: ✅ Complete
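As an illustration, two of the validators above might look like the following minimal sketch; the signatures and the `ValidationError` type are assumptions, not the actual code:

```python
class ValidationError(Exception):
    """Raised when a validator rejects its input (hypothetical error type)."""

def validate_ids(ids):
    # Base ID validation: require a non-empty list of positive integers.
    if not isinstance(ids, (list, tuple)) or not ids:
        raise ValidationError("ids must be a non-empty list")
    for value in ids:
        if not isinstance(value, int) or value <= 0:
            raise ValidationError(f"invalid id: {value!r}")
    return list(ids)

def validate_image_size(model, size, valid_sizes_by_model):
    # Image size validation: the allowed sizes depend on the model.
    allowed = valid_sizes_by_model.get(model, set())
    if size not in allowed:
        raise ValidationError(f"size {size} not valid for {model}")
    return size
```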
#### `constants.py`
- **Purpose**: AI-related constants
- **Constants**:
- `MODEL_RATES` - Text model pricing
- `IMAGE_MODEL_RATES` - Image model pricing
- `VALID_OPENAI_IMAGE_MODELS` - Valid image models
- `VALID_SIZES_BY_MODEL` - Valid sizes per model
- `DEFAULT_AI_MODEL` - Default model name
- `JSON_MODE_MODELS` - Models supporting JSON mode
- **Status**: ✅ Complete
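A hypothetical shape for the module; the rate values shown are placeholders for illustration, not real pricing:

```python
# Placeholder rates: (input $/1K tokens, output $/1K tokens) per model.
MODEL_RATES = {
    "gpt-4o": (0.005, 0.015),
    "gpt-4o-mini": (0.00015, 0.0006),
}

# Placeholder flat $-per-image rates keyed by (model, size).
IMAGE_MODEL_RATES = {
    ("dall-e-3", "1024x1024"): 0.04,
}

VALID_OPENAI_IMAGE_MODELS = {"dall-e-2", "dall-e-3"}

VALID_SIZES_BY_MODEL = {
    "dall-e-2": {"256x256", "512x512", "1024x1024"},
    "dall-e-3": {"1024x1024", "1024x1792", "1792x1024"},
}

DEFAULT_AI_MODEL = "gpt-4o"

JSON_MODE_MODELS = {"gpt-4o", "gpt-4o-mini"}
```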
### 3. Function Files Created
#### `functions/auto_cluster.py`
- **Status**: ✅ Updated to use new validators and AICore
- **Changes**:
- Uses `validate_ids()`, `validate_keywords_exist()`, `validate_cluster_limits()` from validators
- Uses `AICore.extract_json()` for JSON parsing
- Maintains backward compatibility
#### `functions/generate_ideas.py`
- **Status**: ✅ Created
- **Features**:
- `GenerateIdeasFunction` class (BaseAIFunction)
- `generate_ideas_core()` legacy function for backward compatibility
- Uses AICore for API calls
- Uses validators for validation
#### `functions/generate_content.py`
- **Status**: ✅ Created
- **Features**:
- `GenerateContentFunction` class (BaseAIFunction)
- `generate_content_core()` legacy function for backward compatibility
- Uses AICore for API calls
- Uses validators for validation
#### `functions/generate_images.py`
- **Status**: ✅ Created
- **Features**:
- `GenerateImagesFunction` class (BaseAIFunction)
- `generate_images_core()` legacy function for backward compatibility
- Uses AICore for image generation
- Uses validators for validation
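All three generated files follow the same class-plus-legacy-wrapper pattern, which can be sketched as follows (the `BaseAIFunction` stub and the placeholder body are illustrative assumptions, not the real code):

```python
class BaseAIFunction:
    # Stand-in for the real base class in ai/base.py.
    name = "base"

    def run(self, **kwargs):
        raise NotImplementedError

class GenerateIdeasFunction(BaseAIFunction):
    name = "generate_ideas"

    def run(self, topic, count=5):
        # The real implementation builds a prompt and calls AICore;
        # this placeholder just returns a result of the same shape.
        return {"ideas": [f"{topic} idea {i + 1}" for i in range(count)]}

def generate_ideas_core(topic, count=5):
    # Legacy entry point kept for backward compatibility: a thin
    # wrapper that delegates to the class-based implementation.
    return GenerateIdeasFunction().run(topic=topic, count=count)
```

Callers that imported the old function keep working unchanged, while new code can register and invoke the class through the engine.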
### 4. Import Paths Updated
#### Updated Files:
- `modules/planner/views.py` - Uses `generate_ideas_core` from new location
- `modules/planner/tasks.py` - Imports `generate_ideas_core` from new location
- `modules/writer/tasks.py` - Imports `generate_content_core` and `generate_images_core` from new locations
- `ai/engine.py` - Uses `AICore` instead of `AIProcessor`
- `ai/functions/auto_cluster.py` - Uses new validators and AICore
- `ai/registry.py` - Registered all new functions
- `ai/__init__.py` - Exports all new modules
### 5. Dependencies Verified
#### No Circular Dependencies ✅
- Functions depend on: `ai_core`, `validators`, `constants`, `base`
- `ai_core` depends on: `utils.ai_processor` (legacy, will be refactored later)
- `validators` depends on: `constants`, models
- `engine` depends on: `ai_core`, `base`, `tracker`
- All imports are clean and modular
#### Modular Structure ✅
- Each function file is self-contained
- Shared logic in `ai_core.py`
- Validation logic in `validators.py`
- Constants in `constants.py`
- No scattered or duplicated logic
## 📋 File Structure Details
### Core AI Modules
| File | Purpose | Dependencies |
|------|---------|--------------|
| `ai_core.py` | Shared AI operations | `utils.ai_processor` (legacy) |
| `validators.py` | All validation logic | `constants`, models |
| `constants.py` | AI constants | None |
| `engine.py` | Execution orchestrator | `ai_core`, `base`, `tracker` |
| `base.py` | Base function class | None |
| `tracker.py` | Progress/step tracking | None |
| `registry.py` | Function registry | `base`, function modules |
### Function Files
| File | Function Class | Legacy Function | Status |
|------|----------------|-----------------|--------|
| `auto_cluster.py` | `AutoClusterFunction` | N/A (uses engine) | ✅ Updated |
| `generate_ideas.py` | `GenerateIdeasFunction` | `generate_ideas_core()` | ✅ Created |
| `generate_content.py` | `GenerateContentFunction` | `generate_content_core()` | ✅ Created |
| `generate_images.py` | `GenerateImagesFunction` | `generate_images_core()` | ✅ Created |
## 🔄 Import Path Changes
### Old Imports (Still work, but deprecated)
```python
from igny8_core.utils.ai_processor import AIProcessor
from igny8_core.modules.planner.tasks import _generate_single_idea_core
```
### New Imports (Recommended)
```python
from igny8_core.ai.functions.generate_ideas import generate_ideas_core
from igny8_core.ai.functions.generate_content import generate_content_core
from igny8_core.ai.functions.generate_images import generate_images_core
from igny8_core.ai.ai_core import AICore
from igny8_core.ai.validators import validate_ids, validate_cluster_limits
from igny8_core.ai.constants import MODEL_RATES, DEFAULT_AI_MODEL
```
## ✅ Verification Checklist
- [x] All function files created in `ai/functions/`
- [x] Shared modules (`ai_core`, `validators`, `constants`) created
- [x] No circular dependencies
- [x] All imports updated in views and tasks
- [x] Functions registered in registry
- [x] `__init__.py` files updated
- [x] Backward compatibility maintained (legacy functions still work)
- [x] No linting errors
- [x] Structure matches required layout
## 🎯 Next Steps (Future Stages)
- **Stage 2**: Inject tracker into all functions
- **Stage 3**: Simplify logging
- **Stage 4**: Clean up legacy code
## 📝 Notes
- Legacy `AIProcessor` from `utils.ai_processor` is still used by `ai_core.py` as a wrapper
- This will be refactored in later stages
- All existing API endpoints continue to work
- No functional changes - only structural reorganization

# Stage 2 - AI Execution & Logging Layer - COMPLETE ✅
## Summary
Successfully created a centralized, consistent, and traceable execution layer for all AI requests, with a unified request handler and clean console-based logging.
## ✅ Completed Deliverables
### 1. Centralized Execution in `ai_core.py`
#### `run_ai_request()` Method
- **Purpose**: Single entry point for all AI text generation requests
- **Features**:
- Step-by-step console logging with `print()` statements
- Standardized request payload construction
- Error handling with detailed logging
- Token counting and cost calculation
- Rate limit detection and logging
- Timeout handling
- JSON mode auto-enablement for supported models
#### Console Logging Format
```
[AI][function_name] Step 1: Preparing request...
[AI][function_name] Step 2: Using model: gpt-4o
[AI][function_name] Step 3: Auto-enabled JSON mode for gpt-4o
[AI][function_name] Step 4: Prompt length: 1234 characters
[AI][function_name] Step 5: Request payload prepared (model=gpt-4o, max_tokens=4000, temp=0.7)
[AI][function_name] Step 6: Sending request to OpenAI API...
[AI][function_name] Step 7: Received response in 2.34s (status=200)
[AI][function_name] Step 8: Received 150 tokens (input: 50, output: 100)
[AI][function_name] Step 9: Content length: 450 characters
[AI][function_name] Step 10: Cost calculated: $0.000123
[AI][function_name][Success] Request completed successfully
```
#### Error Logging Format
```
[AI][function_name][Error] OpenAI Rate Limit - waiting 60s
[AI][function_name][Error] HTTP 429 error: Rate limit exceeded (Rate limit - retry after 60s)
[AI][function_name][Error] Request timeout (60s exceeded)
[AI][function_name][Error] Failed to parse JSON response: ...
```
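Both formats above can be produced by one small helper; this is a sketch of the convention, not the actual logging code:

```python
def ai_log(function_name, message, level=None):
    # Console logging in the [AI][function_name] format shown above;
    # level is an optional suffix tag such as "Error" or "Success".
    tag = f"[AI][{function_name}]"
    if level:
        tag += f"[{level}]"
    line = f"{tag} {message}"
    print(line)
    return line
```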
### 2. Image Generation with Logging
#### `generate_image()` Method
- **Purpose**: Centralized image generation with console logging
- **Features**:
- Supports OpenAI DALL-E and Runware
- Model and size validation
- Step-by-step console logging
- Error handling with detailed messages
- Cost calculation
#### Console Logging Format
```
[AI][generate_images] Step 1: Preparing image generation request...
[AI][generate_images] Provider: OpenAI
[AI][generate_images] Step 2: Using model: dall-e-3, size: 1024x1024
[AI][generate_images] Step 3: Sending request to OpenAI Images API...
[AI][generate_images] Step 4: Received response in 5.67s (status=200)
[AI][generate_images] Step 5: Image generated successfully
[AI][generate_images] Step 6: Cost: $0.0400
[AI][generate_images][Success] Image generation completed
```
### 3. Updated All Function Files
#### `functions/auto_cluster.py`
- ✅ Uses `AICore.extract_json()` for JSON parsing
- ✅ Engine calls `run_ai_request()` (via engine.py)
#### `functions/generate_ideas.py`
- ✅ Updated `generate_ideas_core()` to use `run_ai_request()`
- ✅ Console logging enabled with function name
#### `functions/generate_content.py`
- ✅ Updated `generate_content_core()` to use `run_ai_request()`
- ✅ Console logging enabled with function name
#### `functions/generate_images.py`
- ✅ Updated to use `run_ai_request()` for prompt extraction
- ✅ Updated to use `generate_image()` with logging
- ✅ Console logging enabled
### 4. Updated Engine
#### `engine.py`
- ✅ Updated to use `run_ai_request()` instead of `call_openai()`
- ✅ Passes function name for logging context
- ✅ Maintains backward compatibility
### 5. Deprecated Old Code
#### `processor.py`
- ✅ Marked as DEPRECATED
- ✅ Redirects all calls to `AICore`
- ✅ Kept for backward compatibility only
- ✅ All methods now use `AICore` internally
### 6. Edge Case Handling
#### Implemented in `run_ai_request()`:
- **API Key Validation**: Logs error if not configured
- **Prompt Length**: Logs character count
- **Rate Limits**: Detects and logs retry-after time
- **Timeouts**: Handles 60s timeout with clear error
- **JSON Parsing Errors**: Logs decode errors with context
- **Empty Responses**: Validates content exists
- **Token Overflow**: Max tokens enforced
- **Model Validation**: Auto-selects JSON mode for supported models
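The rate-limit and timeout branches can be sketched with the `requests` library; the URL handling and return shape here are assumptions for illustration:

```python
import requests

def classify_http_failure(status_code, headers, body_text):
    # Map an HTTP failure to the error-log message styles shown earlier.
    if status_code == 429:
        retry_after = headers.get("Retry-After", "60")
        return f"OpenAI Rate Limit - waiting {retry_after}s"
    return f"HTTP {status_code} error: {body_text[:200]}"

def call_with_timeout(url, payload, api_key, timeout=60):
    # POST the payload, converting timeouts and non-200 statuses
    # into a uniform {"success": ..., "error"/"data": ...} result.
    try:
        response = requests.post(
            url,
            json=payload,
            headers={"Authorization": f"Bearer {api_key}"},
            timeout=timeout,
        )
    except requests.Timeout:
        return {"success": False, "error": f"Request timeout ({timeout}s exceeded)"}
    if response.status_code != 200:
        error = classify_http_failure(
            response.status_code, response.headers, response.text
        )
        return {"success": False, "error": error}
    return {"success": True, "data": response.json()}
```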
### 7. Standardized Request Schema
#### OpenAI Request Payload
```python
{
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": prompt}],
    "temperature": 0.7,
    "max_tokens": 4000,
    "response_format": {"type": "json_object"}  # Auto-enabled for supported models
}
```
#### All Functions Use Same Logic:
- Model selection (account default or override)
- JSON mode auto-enablement
- Token limits
- Temperature settings
- Error handling
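The shared logic above can be sketched as a single payload builder; `build_openai_payload` is a hypothetical name for illustration, not the actual method:

```python
def build_openai_payload(prompt, model, json_mode_models,
                         max_tokens=4000, temperature=0.7):
    # Standardized request construction: same shape for every function,
    # with JSON mode auto-enabled only for models that support it.
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": temperature,
        "max_tokens": max_tokens,
    }
    if model in json_mode_models:
        payload["response_format"] = {"type": "json_object"}
    return payload
```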
### 8. Test Script Created
#### `ai/tests/test_run.py`
- ✅ Test script for all AI functions
- ✅ Tests `run_ai_request()` directly
- ✅ Tests JSON extraction
- ✅ Placeholder tests for all functions
- ✅ Can be run standalone to verify logging
## 📋 File Changes Summary
| File | Changes | Status |
|------|---------|--------|
| `ai_core.py` | Complete rewrite with `run_ai_request()` and console logging | ✅ Complete |
| `engine.py` | Updated to use `run_ai_request()` | ✅ Complete |
| `processor.py` | Marked deprecated, redirects to AICore | ✅ Complete |
| `functions/auto_cluster.py` | Uses AICore methods | ✅ Complete |
| `functions/generate_ideas.py` | Uses `run_ai_request()` | ✅ Complete |
| `functions/generate_content.py` | Uses `run_ai_request()` | ✅ Complete |
| `functions/generate_images.py` | Uses `run_ai_request()` and `generate_image()` | ✅ Complete |
| `tests/test_run.py` | Test script created | ✅ Complete |
## 🔄 Migration Path
### Old Code (Deprecated)
```python
from igny8_core.utils.ai_processor import AIProcessor
processor = AIProcessor(account=account)
result = processor._call_openai(prompt, model=model)
```
### New Code (Recommended)
```python
from igny8_core.ai.ai_core import AICore
ai_core = AICore(account=account)
result = ai_core.run_ai_request(
    prompt=prompt,
    model=model,
    function_name='my_function'
)
```
## ✅ Verification Checklist
- [x] `run_ai_request()` created with console logging
- [x] All function files updated to use `run_ai_request()`
- [x] Engine updated to use `run_ai_request()`
- [x] Old processor code deprecated
- [x] Edge cases handled with logging
- [x] Request schema standardized
- [x] Test script created
- [x] No linting errors
- [x] Backward compatibility maintained
## 🎯 Benefits Achieved
1. **Centralized Execution**: All AI requests go through one method
2. **Consistent Logging**: Every request logs steps to console
3. **Better Debugging**: Clear step-by-step visibility
4. **Error Handling**: Comprehensive error detection and logging
5. **Reduced Duplication**: No scattered AI call logic
6. **Easy Testing**: Single point to test/mock
7. **Future Ready**: Easy to add retry logic, backoff, etc.
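As an example of the "future ready" point, retry with exponential backoff could wrap the centralized entry point like this (a sketch of a possible future addition; `run_with_backoff` is not part of the current code):

```python
import random
import time

def run_with_backoff(request_fn, max_attempts=4, base_delay=1.0):
    # Hypothetical wrapper around a zero-argument callable that invokes
    # run_ai_request(): retry failed attempts with exponential backoff
    # plus a little jitter to avoid thundering-herd retries.
    for attempt in range(max_attempts):
        result = request_fn()
        if result.get("success"):
            return result
        if attempt < max_attempts - 1:
            delay = base_delay * (2 ** attempt) + random.uniform(0, 0.1)
            time.sleep(delay)
    return result
```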
## 📝 Console Output Example
When running any AI function, you'll see:
```
[AI][generate_ideas] Step 1: Preparing request...
[AI][generate_ideas] Step 2: Using model: gpt-4o
[AI][generate_ideas] Step 3: Auto-enabled JSON mode for gpt-4o
[AI][generate_ideas] Step 4: Prompt length: 2345 characters
[AI][generate_ideas] Step 5: Request payload prepared (model=gpt-4o, max_tokens=4000, temp=0.7)
[AI][generate_ideas] Step 6: Sending request to OpenAI API...
[AI][generate_ideas] Step 7: Received response in 3.45s (status=200)
[AI][generate_ideas] Step 8: Received 250 tokens (input: 100, output: 150)
[AI][generate_ideas] Step 9: Content length: 600 characters
[AI][generate_ideas] Step 10: Cost calculated: $0.000250
[AI][generate_ideas][Success] Request completed successfully
```
## 🚀 Next Steps (Future Stages)
- **Stage 3**: Simplify logging (optional - console logging already implemented)
- **Stage 4**: Clean up legacy code (remove old processor completely)
- **Future**: Add retry logic, exponential backoff, request queuing