fix AI token limit: standardize to 8192
163 AI_CLEANUP_SUMMARY.md Normal file

@@ -0,0 +1,163 @@
# AI System Cleanup Summary

## Actions Completed
### 1. Standardized max_tokens to 8192

**Status:** ✅ COMPLETE

**Changes Made:**

- `backend/igny8_core/ai/settings.py:103` - Changed fallback from 16384 → 8192
- `backend/igny8_core/ai/ai_core.py:116` - Kept default at 8192 (already correct)
- `backend/igny8_core/ai/ai_core.py:856` - Updated legacy method from 4000 → 8192
- `backend/igny8_core/utils/ai_processor.py:111` - Updated from 4000 → 8192
- `backend/igny8_core/utils/ai_processor.py:437` - Updated from 4000 → 8192
- `backend/igny8_core/utils/ai_processor.py:531` - Updated from 1000 → 8192 (already done)
- `backend/igny8_core/utils/ai_processor.py:1133` - Updated from 3000 → 8192
- `backend/igny8_core/utils/ai_processor.py:1340` - Updated from 4000 → 8192
- IntegrationSettings (aws-admin) - Updated from 16384 → 8192

**Result:** Single source of truth = 8192 tokens across entire codebase
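The standardized fallback behavior can be sketched as follows. This is an illustrative sketch, not the actual `settings.py` code: the helper name `resolve_max_tokens` and the dict-shaped settings are assumptions; only the 8192 default comes from the changes above.

```python
# Single source of truth for the token limit after this cleanup.
DEFAULT_MAX_TOKENS = 8192

def resolve_max_tokens(integration_settings):
    """Prefer an account-level IntegrationSettings value; fall back to 8192.

    `integration_settings` is a hypothetical dict standing in for the real
    IntegrationSettings record.
    """
    if integration_settings and integration_settings.get("max_tokens"):
        return int(integration_settings["max_tokens"])
    return DEFAULT_MAX_TOKENS
```

With this shape, every caller resolves to the same 8192 default whenever the database does not supply an explicit value.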
### 2. Marked Legacy Code

**Status:** ✅ COMPLETE

**Changes Made:**

- Added a deprecation warning to `backend/igny8_core/utils/ai_processor.py`
- Documented that it is kept only for the MODEL_RATES constant
- Marked `call_openai()` in `ai_core.py` as deprecated
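One common way to signal this kind of deprecation is a module-level warning plus a decorator for individual legacy entry points. This is a sketch of the pattern, not the actual `ai_processor.py` code; the `deprecated` decorator and the warning text are assumptions.

```python
import functools
import warnings

# Module-level warning: fires once when the legacy module is imported.
warnings.warn(
    "igny8_core.utils.ai_processor is deprecated; kept only for MODEL_RATES. "
    "Use igny8_core.ai.ai_core instead.",
    DeprecationWarning,
    stacklevel=2,
)

def deprecated(replacement):
    """Mark a legacy function and point callers at its replacement."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            warnings.warn(
                f"{func.__name__} is deprecated; use {replacement} instead.",
                DeprecationWarning,
                stacklevel=2,
            )
            return func(*args, **kwargs)
        return wrapper
    return decorator
```

A legacy method such as `call_openai()` would then be wrapped with `@deprecated("AICore.run_ai_request")`.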
### 3. Removed Unused Files

**Status:** ✅ COMPLETE

**Files Removed:**

- `backend/igny8_core/modules/writer/views.py.bak`
- `frontend/src/pages/account/AccountSettingsPage.tsx.old`
### 4. System Verification

**Status:** ✅ COMPLETE

**Test Results:**

- Backend restarted successfully
- Django check passed (0 issues)
- Content generation tested with task 229
- Confirmed max_tokens=8192 is being used
- AI only generates 999 output tokens (< 8192 limit)
## Current AI Architecture

### Active System (Use This)

```
backend/igny8_core/ai/
├── ai_core.py    - Core AI request handler
├── engine.py     - Orchestrator (AIEngine class)
├── settings.py   - Config loader (get_model_config)
├── prompts.py    - Prompt templates
├── base.py       - BaseAIFunction class
├── tasks.py      - Celery tasks
├── models.py     - AITaskLog
├── tracker.py    - Progress tracking
├── registry.py   - Function registry
├── constants.py  - Shared constants
└── functions/
    ├── auto_cluster.py
    ├── generate_ideas.py
    ├── generate_content.py
    ├── generate_images.py
    ├── generate_image_prompts.py
    └── optimize_content.py
```

### Legacy System (Do Not Use)

```
backend/igny8_core/utils/ai_processor.py
```

**Status:** DEPRECATED - Only kept for the MODEL_RATES constant

**Will be removed:** After extracting MODEL_RATES to ai/constants.py
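How `base.py`, `registry.py`, and the `functions/` modules fit together can be sketched roughly as follows. Everything beyond the names listed in the tree above (the `run` method, the `REGISTRY` dict, the `register` decorator, the payload shape) is an assumption for illustration, not the project's actual API.

```python
class BaseAIFunction:
    """Minimal stand-in for the real base class in ai/base.py."""
    name = "base"

    def run(self, payload):
        raise NotImplementedError

# Hypothetical registry keyed by function name, as registry.py might hold.
REGISTRY = {}

def register(cls):
    """Add a function class to the registry under its declared name."""
    REGISTRY[cls.name] = cls
    return cls

@register
class GenerateContentFunction(BaseAIFunction):
    """Toy version of functions/generate_content.py."""
    name = "generate_content"

    def run(self, payload):
        # The real function would call AICore with the configured max_tokens.
        return {"success": True, "ids": payload.get("ids", [])}
```

Under this scheme the engine would look a function up by name and execute it, which matches the `engine.execute(fn, {'ids': [229]})` call shown in the Testing Commands section.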
## Key Finding: Short Content Issue

### Root Cause Analysis

❌ **NOT a token limit issue:**

- max_tokens is set to 8192
- The AI generates only ~999 output tokens
- There is headroom for 7000+ more tokens

✅ **IS a prompt structure issue:**

- The AI produces "complete" content at 400-500 words
- It considers the task done once the JSON structure is filled
- The prompt needs MORE AGGRESSIVE enforcement, e.g.:
  - "DO NOT stop until you reach 1200 words"
  - "Count your words and verify before submitting"
- A different output format that encourages longer content may also help
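The enforcement idea above can be sketched as a small prompt-building helper that appends explicit length rules to every content prompt. The helper name and exact rule wording are assumptions; only the 1200-word target comes from the analysis above.

```python
# Target length taken from the root-cause analysis above.
MIN_WORDS = 1200

# Explicit, aggressive length rules appended to every content prompt.
LENGTH_RULES = (
    f"DO NOT stop until you reach {MIN_WORDS} words. "
    "Count your words and verify before submitting."
)

def build_prompt(base_prompt):
    """Append the word-count enforcement rules to a content prompt."""
    return f"{base_prompt}\n\n{LENGTH_RULES}"
```

Whether instructions alone are enough is an open question; the multi-turn approach mentioned under Recommendations would re-ask for expansion when a draft falls short.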
## Standardized Configuration

### Single max_tokens Value

**Value:** 8192 tokens (roughly 6,000 English words at the common ~0.75 words-per-token estimate)

**Location:** All AI functions use this value consistently

**Fallback:** 8192 whenever IntegrationSettings does not provide a value

### Where max_tokens Is Used

1. `get_model_config()` - Loads from IntegrationSettings, falls back to 8192
2. `AICore.run_ai_request()` - Default parameter: 8192
3. All AI functions - Use the value from get_model_config()
4. IntegrationSettings - Database stores 8192
## Recommendations

### Short Term

1. ✅ max_tokens standardized (DONE)
2. 🔄 Fix prompt to enforce 1200+ words more aggressively
3. 🔄 Consider using streaming or a multi-turn approach for long content

### Long Term

1. Extract MODEL_RATES from ai_processor.py to ai/constants.py
2. Remove ai_processor.py entirely
3. Add validation that content meets the minimum word count before saving
4. Implement word count tracking in the generation loop
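The validate-before-saving recommendation could look like the sketch below. `ContentTooShortError` and `validate_word_count` are hypothetical names introduced here for illustration; only the 1200-word minimum comes from this document.

```python
# Minimum length taken from the Recommendations above.
MIN_WORD_COUNT = 1200

class ContentTooShortError(ValueError):
    """Raised when generated content falls below the minimum word count."""

def validate_word_count(content, minimum=MIN_WORD_COUNT):
    """Return the word count, raising if it is below the minimum.

    A save hook would call this before persisting generated content,
    and a retry loop could catch the error and re-prompt for expansion.
    """
    count = len(content.split())
    if count < minimum:
        raise ContentTooShortError(
            f"Generated content has {count} words; minimum is {minimum}."
        )
    return count
```

Word-count tracking in the generation loop (long-term item 4) would reuse the same counting logic on each partial draft.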
## Testing Commands

```bash
# Check current config
docker exec igny8_backend python manage.py shell -c "
from igny8_core.ai.settings import get_model_config
from igny8_core.auth.models import Account
account = Account.objects.filter(slug='aws-admin').first()
config = get_model_config('generate_content', account=account)
print(f'max_tokens: {config[\"max_tokens\"]}')
"

# Test content generation
docker exec igny8_backend python manage.py shell -c "
from igny8_core.ai.functions.generate_content import GenerateContentFunction
from igny8_core.ai.engine import AIEngine
from igny8_core.auth.models import Account
account = Account.objects.filter(slug='aws-admin').first()
fn = GenerateContentFunction()
engine = AIEngine(celery_task=None, account=account)
result = engine.execute(fn, {'ids': [229]})
print(f'Success: {result.get(\"success\")}')
"
```
## Files Modified

1. `backend/igny8_core/ai/settings.py` - Standardized fallback to 8192
2. `backend/igny8_core/ai/ai_core.py` - Updated legacy method, added deprecation note
3. `backend/igny8_core/utils/ai_processor.py` - Updated all max_tokens, added deprecation warning
4. IntegrationSettings database - Updated to 8192

## Verification

✅ All max_tokens references now use 8192
✅ No conflicting fallback values
✅ Legacy code marked clearly
✅ System tested and working
✅ Backend restarted successfully
---

**Date:** December 17, 2025

**Status:** COMPLETE

**Next Step:** Fix prompt structure for 1200+ word content generation