igny8/docs/plans/4th-jan-refactor/implementation-plan-for-ai-models-and-cost.md


## Complete End-to-End Analysis & Restructuring Plan

### Current State Summary (from actual database queries)

**AIModelConfig (6 records - all needed):**
| model_name | type | provider | active | default | cost | tokens_per_credit (text) or credits per image |
|------------|------|----------|--------|---------|------|-----------------------------------------------|
| gpt-4o-mini | text | openai | ✅ | ❌ | $0.15/$0.60 per 1M | 10,000 |
| gpt-4o | text | openai | ❌ | ❌ | $2.50/$10.00 per 1M | 1,000 |
| gpt-5.1 | text | openai | ✅ | ✅ | $1.25/$10.00 per 1M | 1,000 |
| runware:97@1 | image | runware | ✅ | ❌ | $0.012/image | 1 credit/image |
| dall-e-3 | image | openai | ✅ | ✅ | $0.04/image | 5 credits/image |
| google:4@2 | image | runware | ✅ | ❌ | $0.14/image | 15 credits/image |

---

### Current End-to-End Flow (Traced from Code)

```
┌─────────────────────────────────────────────────────────────────────────┐
│                        CURRENT ARCHITECTURE                              │
└─────────────────────────────────────────────────────────────────────────┘

1. CONFIGURATION LAYER
   ┌────────────────────────────────────────┐
   │  GlobalIntegrationSettings (singleton) │  ← API keys stored here
   │  - openai_api_key                      │
   │  - runware_api_key                     │
   │  - anthropic_api_key (unused)          │
   │  - bria_api_key (unused)               │
   │  - openai_model: gpt-4o-mini ❌        │  ← Should be gpt-5.1
   │  - runware_model: bria:10@1 ❌         │  ← Model doesn't exist!
   │  - HARDCODED CHOICES duplicating DB    │
   └────────────────────────────────────────┘
                     │
                     ▼
   ┌────────────────────────────────────────┐
   │  AIModelConfig (database)              │  ← Source of truth for models
   │  - model_name, provider, costs         │
   │  - is_active, is_default               │
   └────────────────────────────────────────┘
                     │
                     ▼
   ┌────────────────────────────────────────┐
   │  constants.py                          │  ← DUPLICATE/LEGACY
   │  - MODEL_RATES (hardcoded)             │
   │  - IMAGE_MODEL_RATES (hardcoded)       │
   └────────────────────────────────────────┘

2. SETTINGS RESOLUTION LAYER
   ┌────────────────────────────────────────┐
   │  settings.py                           │
   │  get_model_config(function, account)   │
   │  - Gets model from GlobalIntegration   │
   │  - Gets max_tokens from AIModelConfig  │
   │  - Allows IntegrationSettings override │
   └────────────────────────────────────────┘
                     │
                     ▼
   ┌────────────────────────────────────────┐
   │  model_registry.py                     │
   │  ModelRegistry.get_model(model_id)     │
   │  - Try DB (AIModelConfig) first        │
   │  - Fallback to constants.py ❌         │  ← Should only use DB
   └────────────────────────────────────────┘

3. AI EXECUTION LAYER
   ┌────────────────────────────────────────┐
   │  engine.py (AIEngine)                  │
   │  - Orchestrates all AI functions       │
   │  - Progress tracking, cost tracking    │
   └────────────────────────────────────────┘
                     │
                     ▼
   ┌────────────────────────────────────────┐
   │  ai_core.py (AICore)                   │
   │  - _load_account_settings()            │  ← Gets API keys from Global
   │  - run_ai_request() for text           │
   │  - generate_image() for images         │
   │  - Uses IMAGE_MODEL_RATES fallback ❌  │
   └────────────────────────────────────────┘
                     │
                     ▼
   ┌────────────────────────────────────────┐
   │  ai/functions/                         │
   │  - generate_images.py                  │
   │  - generate_content.py                 │
   │  - auto_cluster.py                     │
   │  - Each function uses AICore           │
   └────────────────────────────────────────┘

4. CREDIT CALCULATION LAYER
   ┌────────────────────────────────────────┐
   │  CreditCostConfig (database)           │
   │  - operation_type                      │
   │  - tokens_per_credit                   │
   │  - min_credits                         │
   │  - price_per_credit_usd                │
   │  - For images: 50 tokens/credit, min 5 │
   └────────────────────────────────────────┘
                     │
                     ▼
   ┌────────────────────────────────────────┐
   │  CreditService                         │
   │  - calculate_credits_from_tokens()     │
   │  - deduct_credits_for_operation()      │
   │  - Text: tokens → credits AFTER call   │
   │  - Images: ??? (not token-based)       │
   └────────────────────────────────────────┘

5. FRONTEND (Sites/Settings.tsx)
   ┌────────────────────────────────────────┐
   │  HARDCODED model choices ❌            │
   │  - QUALITY_TO_CONFIG                   │
   │  - RUNWARE_MODEL_CHOICES               │
   │  - DALLE_MODEL_CHOICES                 │
   │  - MODEL_LANDSCAPE_SIZES               │
   └────────────────────────────────────────┘
```

---

### Problems Identified

| # | Problem | Location | Impact |
|---|---------|----------|--------|
| 1 | GlobalIntegrationSettings has hardcoded model choices | global_settings_models.py | Duplicates AIModelConfig |
| 2 | runware_model = "bria:10@1" but model doesn't exist | GlobalIntegrationSettings | Broken fallback |
| 3 | API keys mixed with model config | GlobalIntegrationSettings | No separation of concerns |
| 4 | IMAGE_MODEL_RATES still used as fallback | ai_core.py, model_registry.py | Inconsistent pricing |
| 5 | Frontend hardcodes model choices | Settings.tsx | Not dynamic |
| 6 | Image credit calculation unclear | CreditService | Not based on cost_per_image |
| 7 | constants.py duplicates DB data | constants.py | Maintenance burden |

---

### Target Architecture

```
┌─────────────────────────────────────────────────────────────────────────┐
│                        TARGET ARCHITECTURE                               │
└─────────────────────────────────────────────────────────────────────────┘

1. NEW: IntegrationProvider Model (stores ALL 3rd party API keys)
   ┌────────────────────────────────────────┐
   │  IntegrationProvider                   │  ← Future-proof: ALL integrations
   │  - provider_id: str (primary key)      │
   │    Examples: openai, runware, google,  │
   │    resend, stripe, etc.                │
   │  - display_name: str                   │
   │  - provider_type: ai | email | payment │
   │  - api_key: encrypted str              │
   │  - api_endpoint: URL (optional)        │
   │  - is_active: bool                     │
   │  - config: JSON (rate limits, etc.)    │
   └────────────────────────────────────────┘
                     │
                     ▼
2. CLEANED: AIModelConfig (references IntegrationProvider)
   ┌────────────────────────────────────────┐
   │  AIModelConfig                         │
   │  - model_name: str                     │
   │  - display_name: str                   │
   │  - model_type: text | image            │
   │  - provider: str → IntegrationProvider │
   │  - cost fields (unchanged)             │
   │  + credits_per_image: int (NEW)        │  ← For image models
   │  + tokens_per_credit: int (NEW)        │  ← For text models
   │  + quality_tier: basic|quality|premium │  ← For UI display
   │  - is_default: bool                    │  ← Loads automatically
   └────────────────────────────────────────┘
                     │
                     ▼
3. SIMPLIFIED: GlobalIntegrationSettings
   ┌────────────────────────────────────────┐
   │  GlobalIntegrationSettings             │
   │  NO hardcoded model names              │
   │  NO API keys (in IntegrationProvider)  │
   │  Loads defaults from AIModelConfig     │
   │    where is_default=True               │
   │  - image_style: str                    │
   │  - max_in_article_images: int          │
   │  - image_quality: str                  │
   └────────────────────────────────────────┘
                     │
                     ▼
4. UNIFIED: Model Resolution
   ┌────────────────────────────────────────┐
   │  ModelRegistry                         │
   │  - get_default_model(type) → from DB   │
   │  - get_model(model_id) → AIModelConfig │
   │  - get_provider(id) → IntegrationProv  │
   │  - get_api_key(provider) → key         │
   │  - NO fallback to constants            │
   │  - NO hardcoded defaults               │
   └────────────────────────────────────────┘
                     │
                     ▼
5. DYNAMIC: Frontend API
   ┌────────────────────────────────────────┐
   │  GET /api/v1/system/ai-models/         │
   │  Returns models from DB with defaults  │
   │  marked, no hardcoding needed          │
   └────────────────────────────────────────┘
```

---

### Implementation Plan (Complete)

#### Phase 1: Database Schema Changes

**1.1 Create IntegrationProvider Model (Future-proof for ALL integrations)**

File: models.py

```python
class IntegrationProvider(models.Model):
    """
    Centralized 3rd party integration provider configuration.
    Single location for ALL external service API keys and configs.
    """
    PROVIDER_TYPE_CHOICES = [
        ('ai', 'AI Provider'),
        ('email', 'Email Service'),
        ('payment', 'Payment Gateway'),
        ('storage', 'Storage Service'),
        ('other', 'Other'),
    ]

    # primary_key=True already implies unique=True, so unique is not repeated here.
    provider_id = models.CharField(max_length=50, primary_key=True)
    # Examples: openai, runware, google, resend, stripe, aws_s3, etc.

    display_name = models.CharField(max_length=100)
    provider_type = models.CharField(max_length=20, choices=PROVIDER_TYPE_CHOICES, default='ai')
    api_key = models.CharField(max_length=500, blank=True)  # Should be encrypted
    api_secret = models.CharField(max_length=500, blank=True)  # For services needing secret
    api_endpoint = models.URLField(blank=True)  # Custom endpoint if needed
    webhook_secret = models.CharField(max_length=500, blank=True)  # For Stripe etc.
    is_active = models.BooleanField(default=True)
    config = models.JSONField(default=dict, blank=True)  # Rate limits, regions, etc.
    created_at = models.DateTimeField(auto_now_add=True)
    updated_at = models.DateTimeField(auto_now=True)

    class Meta:
        db_table = 'igny8_integration_providers'
        verbose_name = 'Integration Provider'
        verbose_name_plural = 'Integration Providers'
```

**1.2 Add fields to AIModelConfig**

```python
# Add to AIModelConfig for IMAGE models
credits_per_image = models.IntegerField(
    null=True, blank=True,
    help_text="Fixed credits per image generated. For image models only."
)

# Add to AIModelConfig for TEXT models
tokens_per_credit = models.IntegerField(
    null=True, blank=True,
    help_text="Number of tokens that equal 1 credit. For text models only."
)

# Add quality tier for UI display (image models)
quality_tier = models.CharField(
    max_length=20,
    choices=[('basic', 'Basic'), ('quality', 'Quality'), ('premium', 'Premium')],
    null=True, blank=True,
    help_text="Quality tier for frontend UI display"
)
```

**1.3 Migration to populate data**

Create IntegrationProvider records:
```
| provider_id | display_name      | provider_type | Notes                    |
|-------------|-------------------|---------------|--------------------------|
| openai      | OpenAI            | ai            | GPT models, DALL-E       |
| runware     | Runware           | ai            | Image generation         |
| google      | Google Cloud      | ai            | Future: Gemini, etc.     |
| resend      | Resend            | email         | Transactional email      |
| stripe      | Stripe            | payment       | Payment processing       |
```

Update AIModelConfig with credit/token values:
```
| model_name    | type  | tokens_per_credit | credits_per_image | quality_tier |
|---------------|-------|-------------------|-------------------|--------------|
| gpt-4o-mini   | text  | 10000             | -                 | -            |
| gpt-4o        | text  | 1000              | -                 | -            |
| gpt-5.1       | text  | 1000              | -                 | -            |
| runware:97@1  | image | -                 | 1                 | basic        |
| dall-e-3      | image | -                 | 5                 | quality      |
| google:4@2    | image | -                 | 15                | premium      |
```
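
The two tables above could be applied by a data migration along the following lines. This is a minimal sketch, not the project's actual migration: the app label `igny8_core` and the function name `populate` are assumptions, and the function is written against the historical-model registry that Django passes to `RunPython` callables.

```python
# Rows copied from the IntegrationProvider table above.
PROVIDERS = [
    ("openai", "OpenAI", "ai"),
    ("runware", "Runware", "ai"),
    ("google", "Google Cloud", "ai"),
    ("resend", "Resend", "email"),
    ("stripe", "Stripe", "payment"),
]

# Values copied from the AIModelConfig table above (model_name -> fields to set).
MODEL_CREDITS = {
    "gpt-4o-mini": {"tokens_per_credit": 10000},
    "gpt-4o": {"tokens_per_credit": 1000},
    "gpt-5.1": {"tokens_per_credit": 1000},
    "runware:97@1": {"credits_per_image": 1, "quality_tier": "basic"},
    "dall-e-3": {"credits_per_image": 5, "quality_tier": "quality"},
    "google:4@2": {"credits_per_image": 15, "quality_tier": "premium"},
}

APP_LABEL = "igny8_core"  # ASSUMPTION: replace with the app that owns these models


def populate(apps, schema_editor):
    """Forward step for migrations.RunPython; safe to run more than once."""
    IntegrationProvider = apps.get_model(APP_LABEL, "IntegrationProvider")
    AIModelConfig = apps.get_model(APP_LABEL, "AIModelConfig")

    # update_or_create keeps the step idempotent: existing rows are updated in place.
    for provider_id, display_name, provider_type in PROVIDERS:
        IntegrationProvider.objects.update_or_create(
            provider_id=provider_id,
            defaults={"display_name": display_name, "provider_type": provider_type},
        )

    # Only existing model rows are touched; no AIModelConfig rows are created here.
    for model_name, fields in MODEL_CREDITS.items():
        AIModelConfig.objects.filter(model_name=model_name).update(**fields)


# In the migration file this would be wired up as:
#   operations = [migrations.RunPython(populate, migrations.RunPython.noop)]
```

API keys are deliberately left out of the sketch; they would be entered through the admin or copied from GlobalIntegrationSettings in a separate step.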

#### Phase 2: Backend Code Changes

**2.1 Remove hardcoded constants**

File: constants.py
- Remove: MODEL_RATES, IMAGE_MODEL_RATES
- Keep: JSON_MODE_MODELS (or move to AIModelConfig.supports_json_mode check)

**2.2 Update ModelRegistry**

File: model_registry.py
- Remove fallback to constants.py
- Add: `get_default_model(model_type) → AIModelConfig where is_default=True`
- Add: `get_provider(provider_id) → IntegrationProvider`
- Add: `get_api_key(provider_id) → str`
- **NO hardcoded model names** - always query DB
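
The lookup rules listed above can be illustrated without a database. The sketch below is an in-memory stand-in, not the real `ModelRegistry`: the `ModelRecord`, `ProviderRecord`, `RegistryError`, and `InMemoryModelRegistry` names are hypothetical, and the real implementation would query `AIModelConfig` and `IntegrationProvider` through the ORM. The point it shows is that a missing default or a missing key raises an error instead of falling back to a constant.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelRecord:
    """Hypothetical stand-in for an AIModelConfig row."""
    model_name: str
    model_type: str  # "text" or "image"
    provider: str
    is_active: bool
    is_default: bool


@dataclass(frozen=True)
class ProviderRecord:
    """Hypothetical stand-in for an IntegrationProvider row."""
    provider_id: str
    api_key: str
    is_active: bool = True


class RegistryError(LookupError):
    """Raised when the stored configuration cannot answer a lookup."""


class InMemoryModelRegistry:
    def __init__(self, models, providers):
        self._models = list(models)
        self._providers = {p.provider_id: p for p in providers}

    def get_default_model(self, model_type):
        # Exactly one active default per type; anything else is a config error.
        matches = [
            m for m in self._models
            if m.model_type == model_type and m.is_active and m.is_default
        ]
        if len(matches) != 1:
            raise RegistryError(
                f"expected one active default {model_type!r} model, found {len(matches)}"
            )
        return matches[0]

    def get_api_key(self, provider_id):
        # No fallback: an unknown, inactive, or keyless provider is an error.
        provider = self._providers.get(provider_id)
        if provider is None or not provider.is_active or not provider.api_key:
            raise RegistryError(f"no usable API key for provider {provider_id!r}")
        return provider.api_key


# Rows mirror the AIModelConfig table in the current-state summary.
MODELS = [
    ModelRecord("gpt-4o-mini", "text", "openai", True, False),
    ModelRecord("gpt-4o", "text", "openai", False, False),
    ModelRecord("gpt-5.1", "text", "openai", True, True),
    ModelRecord("runware:97@1", "image", "runware", True, False),
    ModelRecord("dall-e-3", "image", "openai", True, True),
    ModelRecord("google:4@2", "image", "runware", True, False),
]
# The key value is a placeholder, not a real credential.
PROVIDERS = [ProviderRecord("openai", "placeholder-openai-key")]

registry = InMemoryModelRegistry(MODELS, PROVIDERS)
default_text = registry.get_default_model("text")
default_text_key = registry.get_api_key(default_text.provider)
```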

**2.3 Update AICore**

File: ai_core.py
- Change `_load_account_settings()` to use IntegrationProvider for API keys
- Remove IMAGE_MODEL_RATES import and usage
- Use `ModelRegistry.get_default_model('text')` instead of hardcoded model
- Use `ModelRegistry.calculate_cost()` exclusively

**2.4 Update CreditService**

File: credit_service.py

For IMAGE models:
```python
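def calculate_credits_for_image(model_name: str, num_images: int) -> int:
    """Calculate credits for image generation from AIModelConfig."""
    model = AIModelConfig.objects.get(model_name=model_name, is_active=True)
    if model.credits_per_image is None:
        # The field is nullable, so fail loudly instead of multiplying None.
        raise ValueError(f"credits_per_image is not set for model {model_name!r}")
    return model.credits_per_image * num_images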
```

For TEXT models:
```python
import math

def calculate_credits_from_tokens(model_name: str, total_tokens: int) -> int:
    """Calculate credits from token usage based on the model's tokens_per_credit."""
    model = AIModelConfig.objects.get(model_name=model_name, is_active=True)
    # Fall back to 1,000 tokens per credit when the model row leaves the field empty.
    tokens_per_credit = model.tokens_per_credit or 1000
    # Round up so any partial credit is charged as a full credit.
    return math.ceil(total_tokens / tokens_per_credit)
```

**2.5 Simplify GlobalIntegrationSettings**

File: global_settings_models.py
- Remove: All API key fields (moved to IntegrationProvider)
- Remove: All hardcoded CHOICES
- Remove: Model name fields (defaults loaded from AIModelConfig.is_default)
- Keep: image_style, max_in_article_images, image_quality
- Add: Helper methods to get defaults from AIModelConfig

#### Phase 3: API Endpoints

**3.1 New endpoint: GET /api/v1/system/ai-models/**

Returns all active models from the database, with defaults marked (no hardcoding):
```json
{
  "text_models": [
    {
      "model_name": "gpt-5.1",
      "display_name": "GPT-5.1 Premium",
      "is_default": true,
      "tokens_per_credit": 1000,
      "max_output_tokens": 8192
    },
    {
      "model_name": "gpt-4o-mini",
      "display_name": "GPT-4o Mini",
      "is_default": false,
      "tokens_per_credit": 10000,
      "max_output_tokens": 16000
    }
  ],
  "image_models": [
    {
      "model_name": "runware:97@1",
      "display_name": "Basic",
      "quality_tier": "basic",
      "is_default": false,
      "credits_per_image": 1,
      "valid_sizes": ["1024x1024", "1280x768"]
    },
    {
      "model_name": "dall-e-3",
      "display_name": "Quality",
      "quality_tier": "quality",
      "is_default": true,
      "credits_per_image": 5,
      "valid_sizes": ["1024x1024", "1792x1024"]
    },
    {
      "model_name": "google:4@2",
      "display_name": "Premium",
      "quality_tier": "premium",
      "is_default": false,
      "credits_per_image": 15,
      "valid_sizes": ["1024x1024", "1376x768"]
    }
  ],
  "image_settings": {
    "style": "photorealistic",
    "max_in_article_images": 4,
    "quality": "hd"
  }
}
```
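
One way to assemble that response shape is sketched below. It is a plain-Python illustration rather than the actual view: the function name `build_ai_models_payload` is hypothetical, and it assumes each model row is available as a dict that already carries `display_name`, `max_output_tokens`, and `valid_sizes`, which this plan does not list as AIModelConfig fields.

```python
def build_ai_models_payload(model_rows, image_settings):
    """Group active model rows by type into the response shape shown above."""
    payload = {
        "text_models": [],
        "image_models": [],
        "image_settings": dict(image_settings),
    }
    for row in model_rows:
        if not row["is_active"]:
            continue  # inactive models never reach the frontend
        entry = {
            "model_name": row["model_name"],
            "display_name": row["display_name"],
            "is_default": row["is_default"],
        }
        if row["model_type"] == "text":
            entry["tokens_per_credit"] = row["tokens_per_credit"]
            entry["max_output_tokens"] = row["max_output_tokens"]
            payload["text_models"].append(entry)
        elif row["model_type"] == "image":
            entry["quality_tier"] = row["quality_tier"]
            entry["credits_per_image"] = row["credits_per_image"]
            entry["valid_sizes"] = list(row["valid_sizes"])
            payload["image_models"].append(entry)
    return payload


# Two rows taken from the example response above.
example_rows = [
    {
        "model_name": "gpt-5.1", "display_name": "GPT-5.1 Premium",
        "model_type": "text", "is_active": True, "is_default": True,
        "tokens_per_credit": 1000, "max_output_tokens": 8192,
    },
    {
        "model_name": "dall-e-3", "display_name": "Quality",
        "model_type": "image", "is_active": True, "is_default": True,
        "quality_tier": "quality", "credits_per_image": 5,
        "valid_sizes": ["1024x1024", "1792x1024"],
    },
]
example_payload = build_ai_models_payload(
    example_rows,
    {"style": "photorealistic", "max_in_article_images": 4, "quality": "hd"},
)
```

Rows keep their input order, so the caller decides whether defaults are listed first.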

#### Phase 4: Frontend Changes

**4.1 Settings.tsx**

- Remove: QUALITY_TO_CONFIG, RUNWARE_MODEL_CHOICES, DALLE_MODEL_CHOICES hardcodes
- Remove: MODEL_LANDSCAPE_SIZES hardcodes
- Add: Fetch models from `/api/v1/system/ai-models/`
- Load valid_sizes from API response per model
- Display to user (no provider/model names visible):
  - **"Basic (1 credit/image)"**
  - **"Quality (5 credits/image)"**
  - **"Premium (15 credits/image)"**
- Default selection: model where `is_default=true` from API

#### Phase 5: Cleanup

**5.1 Files to clean/remove**
- Remove unused fields from GlobalIntegrationSettings: anthropic_*, bria_*, all API key fields, hardcoded model fields
- Remove deprecated methods from AICore
- Update all imports removing constants.py usage
- Remove CreditCostConfig dependency for image operations (use AIModelConfig.credits_per_image directly)

---

### Credit Calculation Summary

**Text Models (token-based):**
| Model | tokens_per_credit | Example: 5000 tokens |
|-------|-------------------|----------------------|
| gpt-5.1 | 1,000 | 5 credits |
| gpt-4o | 1,000 | 5 credits |
| gpt-4o-mini | 10,000 | 1 credit |

**Image Models (per-image):**
| Model | credits_per_image | quality_tier | Display |
|-------|-------------------|--------------|---------|
| runware:97@1 | 1 | basic | "Basic (1 credit/image)" |
| dall-e-3 | 5 | quality | "Quality (5 credits/image)" |
| google:4@2 | 15 | premium | "Premium (15 credits/image)" |
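
The arithmetic behind both tables can be checked with a few lines of plain Python. The helper names below are illustrative only and are not part of CreditService; the values are taken from the tables above.

```python
import math


def credits_for_tokens(total_tokens: int, tokens_per_credit: int) -> int:
    """Token-based pricing: round up so partial credits are charged in full."""
    return math.ceil(total_tokens / tokens_per_credit)


def credits_for_images(num_images: int, credits_per_image: int) -> int:
    """Per-image pricing: a fixed number of credits for each generated image."""
    return credits_per_image * num_images


def display_label(quality_tier: str, credits_per_image: int) -> str:
    """Build the user-facing label, e.g. 'Basic (1 credit/image)'."""
    unit = "credit" if credits_per_image == 1 else "credits"
    return f"{quality_tier.capitalize()} ({credits_per_image} {unit}/image)"


# Text models: the 5,000-token example from the first table.
text_examples = {
    "gpt-5.1": credits_for_tokens(5000, 1000),       # 5 credits
    "gpt-4o": credits_for_tokens(5000, 1000),        # 5 credits
    "gpt-4o-mini": credits_for_tokens(5000, 10000),  # 0.5 rounds up to 1 credit
}

# Image models: labels from the second table.
image_labels = [
    display_label("basic", 1),
    display_label("quality", 5),
    display_label("premium", 15),
]
```

Because of the rounding, a gpt-4o-mini call that uses 5,000 tokens costs the same single credit as one that uses 10,000.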

---

### Migration Order

1. Create IntegrationProvider model + migration
2. Add credits_per_image, tokens_per_credit, quality_tier to AIModelConfig + migration
3. Data migration: populate IntegrationProvider, update AIModelConfig with credit values
4. Update ModelRegistry (remove constants fallback, add get_default_model)
5. Update AICore (use IntegrationProvider for keys)
6. Update CreditService (model-based credit calculation)
7. Create API endpoint /api/v1/system/ai-models/
8. Update frontend (load from API, no hardcodes)
9. Cleanup GlobalIntegrationSettings (remove API keys, hardcoded choices)
10. Remove constants.py hardcoded rates

---

### Files Changed Summary

| File | Action |
|------|--------|
| models.py | Add IntegrationProvider, update AIModelConfig with credit fields |
| model_registry.py | Remove constants fallback, add get_default_model(), get_provider() |
| ai_core.py | Use IntegrationProvider for keys, ModelRegistry for defaults |
| constants.py | Remove MODEL_RATES, IMAGE_MODEL_RATES |
| credit_service.py | Model-based credit calculation for both text and images |
| global_settings_models.py | Remove API keys, hardcoded choices, model fields |
| backend/igny8_core/api/views/system.py | Add ai-models endpoint |
| Settings.tsx | Load models from API, remove all hardcodes |

---

### Key Principles

1. **No hardcoded model names** - GlobalIntegrationSettings loads defaults from AIModelConfig where is_default=True
2. **Single source of truth** - AIModelConfig is THE source for all model info including credit costs
3. **Future-proof** - IntegrationProvider handles ALL 3rd party integrations (AI, email, payment, etc.)
4. **Dynamic frontend** - All model choices loaded from API, not hardcoded
5. **Configurable credits** - Change credits_per_image or tokens_per_credit in admin, no code changes needed