fix AI token limit: standardize to 8192
163 AI_CLEANUP_SUMMARY.md Normal file

@@ -0,0 +1,163 @@
# AI System Cleanup Summary

## Actions Completed
### 1. Standardized max_tokens to 8192

**Status:** ✅ COMPLETE

**Changes Made:**

- `backend/igny8_core/ai/settings.py:103` - Changed fallback from 16384 → 8192
- `backend/igny8_core/ai/ai_core.py:116` - Kept default at 8192 (already correct)
- `backend/igny8_core/ai/ai_core.py:856` - Updated legacy method from 4000 → 8192
- `backend/igny8_core/utils/ai_processor.py:111` - Updated from 4000 → 8192
- `backend/igny8_core/utils/ai_processor.py:437` - Updated from 4000 → 8192
- `backend/igny8_core/utils/ai_processor.py:531` - Updated from 1000 → 8192 (already done)
- `backend/igny8_core/utils/ai_processor.py:1133` - Updated from 3000 → 8192
- `backend/igny8_core/utils/ai_processor.py:1340` - Updated from 4000 → 8192
- IntegrationSettings (aws-admin) - Updated from 16384 → 8192

**Result:** Single source of truth = 8192 tokens across entire codebase
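The standardized fallback behavior can be sketched as follows. This is an illustrative sketch, not the actual `settings.py` code: the helper name `resolve_max_tokens` and the dict-shaped settings are assumptions; only the 8192 default comes from the changes above.

```python
# Single source of truth for the token limit after this cleanup.
DEFAULT_MAX_TOKENS = 8192

def resolve_max_tokens(integration_settings):
    """Prefer an account-level IntegrationSettings value; fall back to 8192.

    `integration_settings` is a hypothetical dict standing in for the real
    IntegrationSettings record.
    """
    if integration_settings and integration_settings.get("max_tokens"):
        return int(integration_settings["max_tokens"])
    return DEFAULT_MAX_TOKENS
```

With this shape, every caller resolves to the same 8192 default whenever the database does not supply an explicit value.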
### 2. Marked Legacy Code

**Status:** ✅ COMPLETE

**Changes Made:**

- Added a deprecation warning to `backend/igny8_core/utils/ai_processor.py`
- Documented that it is kept only for the MODEL_RATES constant
- Marked `call_openai()` in `ai_core.py` as deprecated
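One common way to signal this kind of deprecation is a module-level warning plus a decorator for individual legacy entry points. This is a sketch of the pattern, not the actual `ai_processor.py` code; the `deprecated` decorator and the warning text are assumptions.

```python
import functools
import warnings

# Module-level warning: fires once when the legacy module is imported.
warnings.warn(
    "igny8_core.utils.ai_processor is deprecated; kept only for MODEL_RATES. "
    "Use igny8_core.ai.ai_core instead.",
    DeprecationWarning,
    stacklevel=2,
)

def deprecated(replacement):
    """Mark a legacy function and point callers at its replacement."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            warnings.warn(
                f"{func.__name__} is deprecated; use {replacement} instead.",
                DeprecationWarning,
                stacklevel=2,
            )
            return func(*args, **kwargs)
        return wrapper
    return decorator
```

A legacy method such as `call_openai()` would then be wrapped with `@deprecated("AICore.run_ai_request")`.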
### 3. Removed Unused Files

**Status:** ✅ COMPLETE

**Files Removed:**

- `backend/igny8_core/modules/writer/views.py.bak`
- `frontend/src/pages/account/AccountSettingsPage.tsx.old`
### 4. System Verification

**Status:** ✅ COMPLETE

**Test Results:**

- Backend restarted successfully
- Django check passed (0 issues)
- Content generation tested with task 229
- Confirmed max_tokens=8192 is being used
- AI only generates 999 output tokens (< 8192 limit)
## Current AI Architecture

### Active System (Use This)

```
backend/igny8_core/ai/
├── ai_core.py    - Core AI request handler
├── engine.py     - Orchestrator (AIEngine class)
├── settings.py   - Config loader (get_model_config)
├── prompts.py    - Prompt templates
├── base.py       - BaseAIFunction class
├── tasks.py      - Celery tasks
├── models.py     - AITaskLog
├── tracker.py    - Progress tracking
├── registry.py   - Function registry
├── constants.py  - Shared constants
└── functions/
    ├── auto_cluster.py
    ├── generate_ideas.py
    ├── generate_content.py
    ├── generate_images.py
    ├── generate_image_prompts.py
    └── optimize_content.py
```

### Legacy System (Do Not Use)

```
backend/igny8_core/utils/ai_processor.py
```

**Status:** DEPRECATED - Only kept for the MODEL_RATES constant

**Will be removed:** After extracting MODEL_RATES to ai/constants.py
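How `base.py`, `registry.py`, and the `functions/` modules fit together can be sketched roughly as follows. Everything beyond the names listed in the tree above (the `run` method, the `REGISTRY` dict, the `register` decorator, the payload shape) is an assumption for illustration, not the project's actual API.

```python
class BaseAIFunction:
    """Minimal stand-in for the real base class in ai/base.py."""
    name = "base"

    def run(self, payload):
        raise NotImplementedError

# Hypothetical registry keyed by function name, as registry.py might hold.
REGISTRY = {}

def register(cls):
    """Add a function class to the registry under its declared name."""
    REGISTRY[cls.name] = cls
    return cls

@register
class GenerateContentFunction(BaseAIFunction):
    """Toy version of functions/generate_content.py."""
    name = "generate_content"

    def run(self, payload):
        # The real function would call AICore with the configured max_tokens.
        return {"success": True, "ids": payload.get("ids", [])}
```

Under this scheme the engine would look a function up by name and execute it, which matches the `engine.execute(fn, {'ids': [229]})` call shown in the Testing Commands section.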
## Key Finding: Short Content Issue

### Root Cause Analysis

❌ **NOT a token limit issue:**

- max_tokens is set to 8192
- The AI generates only ~999 output tokens
- There is headroom for 7000+ more tokens

✅ **IS a prompt structure issue:**

- The AI produces "complete" content at 400-500 words
- It considers the task done once the JSON structure is filled
- The prompt needs MORE AGGRESSIVE enforcement, e.g.:
  - "DO NOT stop until you reach 1200 words"
  - "Count your words and verify before submitting"
- A different output format that encourages longer content may also help
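The enforcement idea above can be sketched as a small prompt-building helper that appends explicit length rules to every content prompt. The helper name and exact rule wording are assumptions; only the 1200-word target comes from the analysis above.

```python
# Target length taken from the root-cause analysis above.
MIN_WORDS = 1200

# Explicit, aggressive length rules appended to every content prompt.
LENGTH_RULES = (
    f"DO NOT stop until you reach {MIN_WORDS} words. "
    "Count your words and verify before submitting."
)

def build_prompt(base_prompt):
    """Append the word-count enforcement rules to a content prompt."""
    return f"{base_prompt}\n\n{LENGTH_RULES}"
```

Whether instructions alone are enough is an open question; the multi-turn approach mentioned under Recommendations would re-ask for expansion when a draft falls short.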
## Standardized Configuration

### Single max_tokens Value

**Value:** 8192 tokens (roughly 6,000 English words at the common ~0.75 words-per-token estimate)

**Location:** All AI functions use this value consistently

**Fallback:** 8192 whenever IntegrationSettings does not provide a value

### Where max_tokens Is Used

1. `get_model_config()` - Loads from IntegrationSettings, falls back to 8192
2. `AICore.run_ai_request()` - Default parameter: 8192
3. All AI functions - Use the value from get_model_config()
4. IntegrationSettings - Database stores 8192
## Recommendations

### Short Term

1. ✅ max_tokens standardized (DONE)
2. 🔄 Fix prompt to enforce 1200+ words more aggressively
3. 🔄 Consider using streaming or a multi-turn approach for long content

### Long Term

1. Extract MODEL_RATES from ai_processor.py to ai/constants.py
2. Remove ai_processor.py entirely
3. Add validation that content meets the minimum word count before saving
4. Implement word count tracking in the generation loop
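The validate-before-saving recommendation could look like the sketch below. `ContentTooShortError` and `validate_word_count` are hypothetical names introduced here for illustration; only the 1200-word minimum comes from this document.

```python
# Minimum length taken from the Recommendations above.
MIN_WORD_COUNT = 1200

class ContentTooShortError(ValueError):
    """Raised when generated content falls below the minimum word count."""

def validate_word_count(content, minimum=MIN_WORD_COUNT):
    """Return the word count, raising if it is below the minimum.

    A save hook would call this before persisting generated content,
    and a retry loop could catch the error and re-prompt for expansion.
    """
    count = len(content.split())
    if count < minimum:
        raise ContentTooShortError(
            f"Generated content has {count} words; minimum is {minimum}."
        )
    return count
```

Word-count tracking in the generation loop (long-term item 4) would reuse the same counting logic on each partial draft.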
## Testing Commands

```bash
# Check current config
docker exec igny8_backend python manage.py shell -c "
from igny8_core.ai.settings import get_model_config
from igny8_core.auth.models import Account
account = Account.objects.filter(slug='aws-admin').first()
config = get_model_config('generate_content', account=account)
print(f'max_tokens: {config[\"max_tokens\"]}')
"

# Test content generation
docker exec igny8_backend python manage.py shell -c "
from igny8_core.ai.functions.generate_content import GenerateContentFunction
from igny8_core.ai.engine import AIEngine
from igny8_core.auth.models import Account
account = Account.objects.filter(slug='aws-admin').first()
fn = GenerateContentFunction()
engine = AIEngine(celery_task=None, account=account)
result = engine.execute(fn, {'ids': [229]})
print(f'Success: {result.get(\"success\")}')
"
```
## Files Modified

1. `backend/igny8_core/ai/settings.py` - Standardized fallback to 8192
2. `backend/igny8_core/ai/ai_core.py` - Updated legacy method, added deprecation note
3. `backend/igny8_core/utils/ai_processor.py` - Updated all max_tokens, added deprecation warning
4. IntegrationSettings database - Updated to 8192

## Verification

✅ All max_tokens references now use 8192
✅ No conflicting fallback values
✅ Legacy code marked clearly
✅ System tested and working
✅ Backend restarted successfully
---

**Date:** December 17, 2025

**Status:** COMPLETE

**Next Step:** Fix prompt structure for 1200+ word content generation