igny8/docs/plans/flexible-model-configuration-plan.md
2025-12-29 01:41:36 +00:00


Flexible Model Configuration System Plan

Overview

This plan outlines how to implement a flexible model configuration system that allows:

  • Adding/removing/activating models dynamically
  • Configuring rates for each model
  • Supporting multiple providers (OpenAI, Anthropic, Runware)
  • Per-account model overrides

Current State

Model Rates (hardcoded in ai/constants.py)

MODEL_RATES = {
    'gpt-4.1': {'input': 2.00, 'output': 8.00},      # per 1M tokens
    'gpt-4o-mini': {'input': 0.15, 'output': 0.60},
    'gpt-4o': {'input': 2.50, 'output': 10.00},
    'gpt-5.1': {'input': 1.25, 'output': 10.00},
    'gpt-5.2': {'input': 1.75, 'output': 14.00},
}

IMAGE_MODEL_RATES = {
    'dall-e-3': 0.040,           # per image
    'dall-e-2': 0.020,
    'gpt-image-1': 0.042,
    'gpt-image-1-mini': 0.011,
}
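As the inline comments note, text rates are quoted per 1M tokens. A quick sketch of how these tables translate into a cost estimate (the helper name is illustrative, not from the codebase):

```python
# Illustrative helper (not from the codebase): a cost estimate from the table above.
MODEL_RATES = {'gpt-4o-mini': {'input': 0.15, 'output': 0.60}}  # excerpt

def estimate_text_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    rates = MODEL_RATES[model]  # raises KeyError for unconfigured models
    return (input_tokens / 1_000_000) * rates['input'] \
        + (output_tokens / 1_000_000) * rates['output']
```

For example, 10,000 input and 2,000 output tokens on gpt-4o-mini come to roughly $0.0027.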

Current Settings Architecture

  • GlobalIntegrationSettings (singleton) - Platform-wide API keys and defaults
  • IntegrationSettings (per-account) - Model/parameter overrides
  • GlobalAIPrompt - Platform-wide prompt templates
  • AIPrompt (per-account) - Custom prompt overrides

Proposed Changes

Phase 1: Database Model for AI Models

Create a new AIModel model to store model configurations:

# backend/igny8_core/modules/system/global_settings_models.py

class AIModel(models.Model):
    """
    Dynamic AI model configuration.
    Replaces hardcoded MODEL_RATES and IMAGE_MODEL_RATES.
    """
    PROVIDER_CHOICES = [
        ('openai', 'OpenAI'),
        ('anthropic', 'Anthropic'),
        ('runware', 'Runware'),
        ('google', 'Google AI'),
    ]

    MODEL_TYPE_CHOICES = [
        ('text', 'Text Generation'),
        ('image', 'Image Generation'),
        ('embedding', 'Embedding'),
    ]

    # Identification
    model_id = models.CharField(
        max_length=100,
        unique=True,
        help_text="Model identifier (e.g., 'gpt-4o-mini', 'claude-3-sonnet')"
    )
    display_name = models.CharField(
        max_length=200,
        help_text="User-friendly name (e.g., 'GPT-4o Mini')"
    )
    provider = models.CharField(max_length=50, choices=PROVIDER_CHOICES)
    model_type = models.CharField(max_length=20, choices=MODEL_TYPE_CHOICES)

    # Pricing (per 1M tokens for text, per image for image models)
    input_rate = models.DecimalField(
        max_digits=10,
        decimal_places=4,
        default=0,
        help_text="Cost per 1M input tokens (text) or per request (image)"
    )
    output_rate = models.DecimalField(
        max_digits=10,
        decimal_places=4,
        default=0,
        help_text="Cost per 1M output tokens (text only)"
    )

    # Capabilities
    max_tokens = models.IntegerField(
        default=8192,
        help_text="Maximum tokens for this model"
    )
    supports_json_mode = models.BooleanField(
        default=True,
        help_text="Whether model supports JSON response format"
    )
    supports_vision = models.BooleanField(
        default=False,
        help_text="Whether model supports image input"
    )

    # Status
    is_active = models.BooleanField(default=True)
    is_default = models.BooleanField(
        default=False,
        help_text="Use as default when no specific model is configured"
    )
    sort_order = models.IntegerField(default=0)

    # Metadata
    created_at = models.DateTimeField(auto_now_add=True)
    updated_at = models.DateTimeField(auto_now=True)

    class Meta:
        db_table = 'igny8_ai_models'
        ordering = ['sort_order', 'display_name']

    def __str__(self):
        return f"{self.display_name} ({self.model_id})"

Phase 2: Model Registry Service

Create a service layer to manage models:

# backend/igny8_core/ai/model_registry.py

from typing import List, Optional

class ModelRegistry:
    """
    Central registry for AI model configurations.
    Provides caching and fallback logic.
    """

    _cache = {}
    _cache_ttl = 300  # 5 minutes

    @classmethod
    def get_model(cls, model_id: str) -> Optional[dict]:
        """Get model configuration by ID"""
        # Check cache first
        # Fallback to database
        # Return dict with rates, capabilities, etc.
        pass

    @classmethod
    def get_models_by_type(cls, model_type: str) -> List[dict]:
        """Get all active models of a type"""
        pass

    @classmethod
    def get_default_model(cls, model_type: str = 'text') -> dict:
        """Get default model for a type"""
        pass

    @classmethod
    def calculate_cost(
        cls,
        model_id: str,
        input_tokens: int = 0,
        output_tokens: int = 0,
        image_count: int = 0
    ) -> float:
        """Calculate cost for an operation"""
        pass

    @classmethod
    def is_model_supported(cls, model_id: str) -> bool:
        """Check if a model is configured and active"""
        pass
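The stubs above leave the rate math open. A minimal sketch of the calculation behind calculate_cost, assuming get_model() returns a dict shaped like the AIModel fields, with image models billed per request through input_rate as the field help_text states:

```python
# Sketch of the math inside ModelRegistry.calculate_cost, operating on the
# dict that get_model() would return from an AIModel row.
def calculate_cost(model_info: dict, input_tokens: int = 0,
                   output_tokens: int = 0, image_count: int = 0) -> float:
    if model_info['model_type'] == 'image':
        # Image models bill per request via input_rate; output_rate stays 0.
        return image_count * model_info['input_rate']
    # Text models: rates are per 1M tokens.
    return (input_tokens / 1_000_000) * model_info['input_rate'] \
        + (output_tokens / 1_000_000) * model_info['output_rate']
```

The real method would first resolve model_id through get_model() (cache, then database, then the constants fallback) before applying this arithmetic.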

Phase 3: Update AICore to Use Registry

Modify ai_core.py to use the model registry:

# In run_ai_request()
from igny8_core.ai.model_registry import ModelRegistry

# Replace hardcoded MODEL_RATES check
if not ModelRegistry.is_model_supported(model):
    supported = ModelRegistry.get_models_by_type('text')
    error_msg = f"Model '{model}' is not supported. Available models: {[m['model_id'] for m in supported]}"
    # ...

# Replace hardcoded cost calculation
model_info = ModelRegistry.get_model(model)
if model_info:
    cost = ModelRegistry.calculate_cost(
        model_id=model,
        input_tokens=input_tokens,
        output_tokens=output_tokens
    )

Phase 4: Admin Interface

Add Django admin for managing models:

# backend/igny8_core/modules/system/admin.py

@admin.register(AIModel)
class AIModelAdmin(admin.ModelAdmin):
    list_display = ['model_id', 'display_name', 'provider', 'model_type', 'input_rate', 'output_rate', 'is_active', 'is_default']
    list_filter = ['provider', 'model_type', 'is_active', 'is_default']
    search_fields = ['model_id', 'display_name']
    ordering = ['sort_order', 'display_name']

    fieldsets = (
        ('Identification', {
            'fields': ('model_id', 'display_name', 'provider', 'model_type')
        }),
        ('Pricing', {
            'fields': ('input_rate', 'output_rate')
        }),
        ('Capabilities', {
            'fields': ('max_tokens', 'supports_json_mode', 'supports_vision')
        }),
        ('Status', {
            'fields': ('is_active', 'is_default', 'sort_order')
        }),
    )

Phase 5: Data Migration

Create a migration to seed initial models:

# Migration file
def seed_initial_models(apps, schema_editor):
    AIModel = apps.get_model('system', 'AIModel')

    models = [
        # OpenAI Text Models
        {'model_id': 'gpt-4o-mini', 'display_name': 'GPT-4o Mini', 'provider': 'openai', 'model_type': 'text', 'input_rate': 0.15, 'output_rate': 0.60, 'is_default': True},
        {'model_id': 'gpt-4o', 'display_name': 'GPT-4o', 'provider': 'openai', 'model_type': 'text', 'input_rate': 2.50, 'output_rate': 10.00},
        {'model_id': 'gpt-4.1', 'display_name': 'GPT-4.1', 'provider': 'openai', 'model_type': 'text', 'input_rate': 2.00, 'output_rate': 8.00},
        {'model_id': 'gpt-5.1', 'display_name': 'GPT-5.1', 'provider': 'openai', 'model_type': 'text', 'input_rate': 1.25, 'output_rate': 10.00, 'max_tokens': 16000},
        {'model_id': 'gpt-5.2', 'display_name': 'GPT-5.2', 'provider': 'openai', 'model_type': 'text', 'input_rate': 1.75, 'output_rate': 14.00, 'max_tokens': 16000},

        # Anthropic Text Models
        {'model_id': 'claude-3-sonnet', 'display_name': 'Claude 3 Sonnet', 'provider': 'anthropic', 'model_type': 'text', 'input_rate': 3.00, 'output_rate': 15.00},
        {'model_id': 'claude-3-opus', 'display_name': 'Claude 3 Opus', 'provider': 'anthropic', 'model_type': 'text', 'input_rate': 15.00, 'output_rate': 75.00},
        {'model_id': 'claude-3-haiku', 'display_name': 'Claude 3 Haiku', 'provider': 'anthropic', 'model_type': 'text', 'input_rate': 0.25, 'output_rate': 1.25},

        # OpenAI Image Models
        {'model_id': 'dall-e-3', 'display_name': 'DALL-E 3', 'provider': 'openai', 'model_type': 'image', 'input_rate': 0.040, 'output_rate': 0},
        {'model_id': 'dall-e-2', 'display_name': 'DALL-E 2', 'provider': 'openai', 'model_type': 'image', 'input_rate': 0.020, 'output_rate': 0},
        {'model_id': 'gpt-image-1', 'display_name': 'GPT Image 1', 'provider': 'openai', 'model_type': 'image', 'input_rate': 0.042, 'output_rate': 0},

        # Runware Image Models
        {'model_id': 'runware:97@1', 'display_name': 'Runware 97@1', 'provider': 'runware', 'model_type': 'image', 'input_rate': 0.009, 'output_rate': 0},
    ]

    for i, model in enumerate(models):
        AIModel.objects.create(sort_order=i, **model)

Phase 6: API Endpoints for Model Management

Add REST endpoints for managing models:

# GET /api/v1/admin/ai-models/ - List all models
# POST /api/v1/admin/ai-models/ - Create new model
# PUT /api/v1/admin/ai-models/{id}/ - Update model
# DELETE /api/v1/admin/ai-models/{id}/ - Delete model
# POST /api/v1/admin/ai-models/{id}/toggle-active/ - Toggle active status
# POST /api/v1/admin/ai-models/{id}/set-default/ - Set as default

Phase 7: Frontend Admin UI

Create admin UI for model management:

  • List view with filtering/sorting
  • Create/Edit form with validation
  • Quick toggle for active/default status
  • Price calculator preview

Implementation Order

  1. Week 1: Create AIModel model and migration
  2. Week 1: Create ModelRegistry service
  3. Week 2: Update ai_core.py to use registry
  4. Week 2: Update constants.py to load from database
  5. Week 3: Add Django admin interface
  6. Week 3: Add API endpoints
  7. Week 4: Create frontend admin UI
  8. Week 4: Testing and documentation

Backward Compatibility

  • Keep constants.py as a fallback if the database is empty
  • ModelRegistry.get_model() checks DB first, falls back to constants
  • No changes to existing GlobalIntegrationSettings or IntegrationSettings
  • Existing API calls continue to work unchanged
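The DB-first lookup with a constants.py fallback can be sketched as follows; the registry hook is passed in as a callable so the sketch stays framework-free, and names follow this plan:

```python
# Sketch of the DB-first lookup with a constants.py fallback.
MODEL_RATES = {'gpt-4o-mini': {'input': 0.15, 'output': 0.60}}  # excerpt of the hardcoded table

def resolve_rates(model_id, registry_get_model):
    """registry_get_model stands in for ModelRegistry.get_model; returns None on a miss."""
    info = registry_get_model(model_id)
    if info is not None:  # a database row wins when present
        return {'input': info['input_rate'], 'output': info['output_rate']}
    return MODEL_RATES.get(model_id)  # fall back to constants.py
```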

Benefits

  1. No Code Changes for New Models: Add models via admin UI
  2. Easy Price Updates: Update rates without deployment
  3. Provider Flexibility: Support any provider by adding models
  4. Per-Provider Settings: Configure different capabilities per provider
  5. Audit Trail: Track when models were added/modified
  6. A/B Testing: Easily enable/disable models for testing