# IGNY8 Phase 2: Video Creator (02I)
## AI Video Creation Pipeline — Stage 9
**Document Version:** 1.0
**Date:** 2026-03-23
**Phase:** IGNY8 Phase 2 — Feature Expansion
**Status:** Build Ready
**Source of Truth:** Codebase at `/data/app/igny8/`
**Audience:** Claude Code, Backend Developers, Architects
---
## 1. CURRENT STATE
### Video Today
There is **no** video creation capability in IGNY8. No TTS, no FFmpeg pipeline, no video publishing. Images exist (generated by pipeline Stages 5-6) and can feed into video as visual assets.
### What Exists
- `Images` model (writer app) — generated images from pipeline, usable as video visual assets
- `SocialAccount` model (02H) — provides OAuth connections to YouTube, Instagram, TikTok for video publishing
- Self-hosted AI infrastructure (Phase 0F) — provides GPU for TTS and AI image generation
- Content generation pipeline (01E) — content records provide source material for video scripts
- Celery infrastructure with multiple queues — supports dedicated `video` queue for long-running renders
### What Does Not Exist
- No video app or models
- No script generation from articles
- No TTS (text-to-speech) voiceover generation
- No FFmpeg/MoviePy video composition pipeline
- No subtitle generation
- No video publishing to platforms
- No Stage 9 pipeline integration
---
## 2. WHAT TO BUILD
### Overview
Build Stage 9 of the automation pipeline: an AI video creation system that converts published content into videos. The pipeline has 5 stages: script generation → voiceover → visual assets → composition → publishing. Videos publish to YouTube, Instagram Reels, and TikTok.
### 2.1 Video Types
| Type | Duration | Aspect Ratio | Primary Platform |
|------|----------|-------------|-----------------|
| **Short** | 30-90s | 9:16 vertical | YouTube Shorts, Instagram Reels, TikTok |
| **Medium** | 60-180s | 9:16 or 16:9 | TikTok, YouTube |
| **Long** | 5-15m | 16:9 horizontal | YouTube |
### 2.2 Platform Specs
| Platform | Max Duration | Resolution | Encoding | Max File Size |
|----------|-------------|------------|----------|--------------|
| YouTube Long | Up to 12h | 1920×1080 | MP4 H.264, AAC audio | 256GB |
| YouTube Shorts | ≤60s | 1080×1920 | MP4 H.264, AAC audio | 256GB |
| Instagram Reels | ≤90s | 1080×1920 | MP4 H.264, AAC audio | 650MB |
| TikTok | ≤10m | 1080×1920 | MP4 H.264, AAC audio | 72MB |
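The limits in the table can be checked programmatically before a publish is queued. A minimal sketch, with the limits taken from the table above and all names hypothetical (not from the codebase):

```python
# Hypothetical pre-publish validator. Limits mirror the platform spec
# table above; a failing check should block the upload attempt.
PLATFORM_LIMITS = {
    "youtube_long":   {"max_duration_s": 12 * 3600, "max_bytes": 256 * 1024**3},
    "youtube_short":  {"max_duration_s": 60,        "max_bytes": 256 * 1024**3},
    "instagram_reel": {"max_duration_s": 90,        "max_bytes": 650 * 1024**2},
    "tiktok":         {"max_duration_s": 600,       "max_bytes": 72 * 1024**2},
}

def validate_render(platform: str, duration_s: float, size_bytes: int) -> list[str]:
    """Return a list of limit violations (empty list == publishable)."""
    limits = PLATFORM_LIMITS[platform]
    errors = []
    if duration_s > limits["max_duration_s"]:
        errors.append(f"duration {duration_s:.0f}s exceeds {limits['max_duration_s']}s")
    if size_bytes > limits["max_bytes"]:
        errors.append(f"file size {size_bytes} exceeds {limits['max_bytes']} bytes")
    return errors
```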
### 2.3 Five-Stage Video Pipeline
#### Stage 1 — Script Generation (AI)
**Input:** Content record (title, content_html, meta_description, keywords, images)
AI extracts key points and produces:
```json
{
  "hook": "text (3-5 sec)",
  "intro": "text (10-15 sec)",
  "points": [
    {"text": "...", "duration_est": 20, "visual_cue": "show chart", "text_overlay": "Key stat"}
  ],
  "cta": "text (5-10 sec)",
  "chapter_markers": [{"time": 0, "title": "Intro"}],
  "total_estimated_duration": 120
}
```
SEO: AI generates platform-specific title, description, tags for each target platform.
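The `total_estimated_duration` can be sanity-checked against the project's duration window before moving on to voiceover. An illustrative sketch, with the per-section defaults (`hook_s`, `intro_s`, `cta_s`) as assumptions and the windows taken from Section 2.1:

```python
# Hypothetical validator: recompute duration from the script structure
# and check it against the project_type window from Section 2.1.
DURATION_WINDOWS = {"short": (30, 90), "medium": (60, 180), "long": (300, 900)}

def estimate_duration(script: dict, hook_s=4, intro_s=12, cta_s=8) -> int:
    """Sum fixed-section estimates plus per-point estimates, in seconds."""
    return hook_s + intro_s + cta_s + sum(p["duration_est"] for p in script["points"])

def fits_project_type(script: dict, project_type: str) -> bool:
    lo, hi = DURATION_WINDOWS[project_type]
    return lo <= estimate_duration(script) <= hi
```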
#### Stage 2 — Voiceover (TTS)
**Cloud Providers:**
| Provider | Cost | Quality | Features |
|----------|------|---------|----------|
| OpenAI TTS | $15-30/1M chars | High | Voices: alloy, echo, fable, onyx, nova, shimmer |
| ElevenLabs | Plan-based | Highest | Voice cloning, ultra-realistic |
**Self-Hosted (via 0F GPU):**
| Model | Quality | Speed | Notes |
|-------|---------|-------|-------|
| Coqui XTTS-v2 | Good | Medium | Multi-language, free |
| Bark | Expressive | Slow | Emotional speech |
| Piper TTS | Moderate | Fast | Lightweight |
**Features:** Voice selection, speed control, multi-language support
**Output:** WAV/MP3 audio file + word-level timestamps (for subtitle sync)
#### Stage 3 — Visual Assets
**Sources:**
- Article images from `Images` model (already generated by pipeline Stages 5-6)
- AI-generated scenes (Runware/DALL-E/Stable Diffusion via 0F)
- Stock footage APIs: Pexels, Pixabay (free, API key required)
- Text overlay frames (rendered via Pillow)
- Code snippet frames (via Pygments syntax highlighting)
**Effects:**
- Ken Burns effect on still images (zoom/pan animation)
- Transition effects between scenes (fade, slide, dissolve)
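The Ken Burns effect reduces to a per-frame crop schedule, which can be expressed as a pure function independent of FFmpeg/MoviePy. A sketch (function name and zoom rate are illustrative):

```python
# Ken Burns zoom schedule: given elapsed time, return the crop box to
# sample from the still image. Scaling the shrinking crop back to full
# frame produces a slow zoom-in. The renderer applies this per frame.
def ken_burns_crop(width: int, height: int, t: float, duration: float,
                   max_zoom: float = 1.2) -> tuple[int, int, int, int]:
    """Return (x, y, w, h) of a centered crop that shrinks linearly over time."""
    zoom = 1.0 + (max_zoom - 1.0) * (t / duration)   # 1.0 -> max_zoom
    w, h = int(width / zoom), int(height / zoom)
    x, y = (width - w) // 2, (height - h) // 2
    return x, y, w, h
```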
#### Stage 4 — Video Composition (FFmpeg + MoviePy)
**Libraries:** FFmpeg (encoding), MoviePy (high-level composition), Pillow (text overlays), pydub (audio processing)
**Process:**
1. Create visual timeline from script sections
2. Assign visuals to each section (image/video clip per point)
3. Add text overlays at specified timestamps
4. Mix voiceover audio with background music (royalty-free, 20% volume)
5. Apply transitions between sections
6. Render to target resolution/format
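Step 1 above can be sketched as a pure timeline builder that converts script sections into absolute `(start, end)` spans, which overlays and transitions then key off. Field names follow the `VideoScript.sections` JSON; the helper itself is hypothetical:

```python
# Minimal timeline builder for composition step 1: each script section
# gets an absolute start/end based on its duration estimate.
def build_timeline(sections: list[dict]) -> list[dict]:
    timeline, cursor = [], 0.0
    for i, section in enumerate(sections):
        dur = float(section["duration_est"])
        timeline.append({
            "index": i,
            "start": cursor,
            "end": cursor + dur,
            "visual_cue": section.get("visual_cue", ""),
            "text_overlay": section.get("text_overlay", ""),
        })
        cursor += dur
    return timeline
```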
**Render Presets:**
| Preset | Resolution | Duration Range | Encoding |
|--------|-----------|----------------|----------|
| `youtube_long` | 1920×1080 | 3-15m | H.264/AAC |
| `youtube_short` | 1080×1920 | 30-60s | H.264/AAC |
| `instagram_reel` | 1080×1920 | 30-90s | H.264/AAC |
| `tiktok` | 1080×1920 | 30-180s | H.264/AAC |
#### Stage 5 — SEO & Publishing
- Auto-generate SRT subtitle file from TTS word-level timestamps
- AI thumbnail: hero image with title text overlay
- Platform-specific metadata: title (optimized per platform), description (with timestamps for YouTube), tags, category
- Publishing via platform APIs (reuses OAuth from 02H SocialAccount)
- Confirmation logging with platform video ID
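SRT generation from the TTS word-level timestamps is mechanical: group words into short cues and emit numbered blocks in the SRT `HH:MM:SS,mmm` format. A hedged sketch (the 5-words-per-cue grouping is an assumption):

```python
# Sketch of SRT generation from word-level TTS timestamps.
def _srt_time(seconds: float) -> str:
    """Format seconds as the SRT timestamp HH:MM:SS,mmm."""
    ms = int(round(seconds * 1000))
    h, rem = divmod(ms, 3_600_000)
    m, rem = divmod(rem, 60_000)
    s, ms = divmod(rem, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"

def words_to_srt(words: list[dict], group: int = 5) -> str:
    """words: [{word, start, end}] -> SRT text with `group` words per cue."""
    blocks = []
    for n, i in enumerate(range(0, len(words), group), start=1):
        chunk = words[i:i + group]
        text = " ".join(w["word"] for w in chunk)
        blocks.append(
            f"{n}\n{_srt_time(chunk[0]['start'])} --> {_srt_time(chunk[-1]['end'])}\n{text}\n"
        )
    return "\n".join(blocks)
```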
### 2.4 User Flow
1. Select content → choose video type (short/medium/long) → target platforms
2. AI generates script → user reviews/edits script
3. Select voice → preview audio → approve
4. Auto-assign visuals → user can swap images → preview composition
5. Render video → preview final → approve
6. Publish to selected platforms → track performance
### 2.5 Dedicated Celery Queue
Video rendering is CPU/GPU intensive and requires isolation:
- **Dedicated `video` queue:** `celery -A igny8_core worker -Q video --concurrency=1`
- **Long-running tasks:** 5-30 minutes per video render
- **Progress tracking:** via Celery result backend (task status updates)
- **Temp file cleanup:** after publish, clean up intermediate files (audio, frames, raw renders)
---
## 3. DATA MODELS & APIS
### 3.1 New Models
All models in a new `video` app.
#### VideoProject (video app)
```python
class VideoProject(SiteSectorBaseModel):
    """
    Top-level container for a video creation project.
    Links to source content and tracks overall progress.
    """
    content = models.ForeignKey(
        'writer.Content',
        on_delete=models.SET_NULL,
        null=True,
        blank=True,
        related_name='video_projects',
        help_text='Source content (null for standalone video)'
    )
    project_type = models.CharField(
        max_length=10,
        choices=[
            ('short', 'Short (30-90s)'),
            ('medium', 'Medium (60-180s)'),
            ('long', 'Long (5-15m)'),
        ]
    )
    target_platforms = models.JSONField(
        default=list,
        help_text='List of target platform strings: youtube_long, youtube_short, instagram_reel, tiktok'
    )
    status = models.CharField(
        max_length=15,
        choices=[
            ('draft', 'Draft'),
            ('scripting', 'Script Generation'),
            ('voiceover', 'Voiceover Generation'),
            ('composing', 'Visual Composition'),
            ('rendering', 'Rendering'),
            ('review', 'Ready for Review'),
            ('published', 'Published'),
            ('failed', 'Failed'),
        ],
        default='draft'
    )
    settings = models.JSONField(
        default=dict,
        help_text='{voice_id, voice_provider, music_track, transition_style}'
    )

    class Meta:
        app_label = 'video'
        db_table = 'igny8_video_projects'
```
**PK:** BigAutoField (integer) — inherits from SiteSectorBaseModel
#### VideoScript (video app)
```python
class VideoScript(models.Model):
    """
    Script for a video project — generated by AI, editable by user.
    """
    project = models.OneToOneField(
        'video.VideoProject',
        on_delete=models.CASCADE,
        related_name='script'
    )
    script_text = models.TextField(help_text='Full narration text')
    sections = models.JSONField(
        default=list,
        help_text='[{text, duration_est, visual_cue, text_overlay}]'
    )
    hook = models.TextField(blank=True, default='')
    cta = models.TextField(blank=True, default='')
    chapter_markers = models.JSONField(
        default=list,
        help_text='[{time, title}]'
    )
    total_estimated_duration = models.IntegerField(
        default=0,
        help_text='Total estimated duration in seconds'
    )
    seo_metadata = models.JSONField(
        default=dict,
        help_text='{platform: {title, description, tags}}'
    )
    version = models.IntegerField(default=1)

    class Meta:
        app_label = 'video'
        db_table = 'igny8_video_scripts'
```
**PK:** BigAutoField (integer) — standard Django Model
#### VideoAsset (video app)
```python
class VideoAsset(models.Model):
    """
    Individual asset (image, footage, music, overlay, subtitle) for a video project.
    """
    project = models.ForeignKey(
        'video.VideoProject',
        on_delete=models.CASCADE,
        related_name='assets'
    )
    asset_type = models.CharField(
        max_length=15,
        choices=[
            ('image', 'Image'),
            ('footage', 'Footage'),
            ('music', 'Background Music'),
            ('overlay', 'Text Overlay'),
            ('subtitle', 'Subtitle File'),
        ]
    )
    source = models.CharField(
        max_length=20,
        choices=[
            ('article_image', 'Article Image'),
            ('ai_generated', 'AI Generated'),
            ('stock_pexels', 'Pexels Stock'),
            ('stock_pixabay', 'Pixabay Stock'),
            ('uploaded', 'User Uploaded'),
            ('rendered', 'Rendered'),
        ]
    )
    file_path = models.CharField(max_length=500, help_text='Path in media storage')
    file_url = models.URLField(blank=True, default='')
    duration = models.FloatField(
        null=True, blank=True,
        help_text='Duration in seconds (for video/audio assets)'
    )
    section_index = models.IntegerField(
        null=True, blank=True,
        help_text='Which script section this asset belongs to'
    )
    order = models.IntegerField(default=0)

    class Meta:
        app_label = 'video'
        db_table = 'igny8_video_assets'
```
**PK:** BigAutoField (integer) — standard Django Model
#### RenderedVideo (video app)
```python
class RenderedVideo(models.Model):
    """
    A rendered video file for a specific platform preset.
    One project can have multiple renders (one per target platform).
    """
    project = models.ForeignKey(
        'video.VideoProject',
        on_delete=models.CASCADE,
        related_name='rendered_videos'
    )
    preset = models.CharField(
        max_length=20,
        choices=[
            ('youtube_long', 'YouTube Long'),
            ('youtube_short', 'YouTube Short'),
            ('instagram_reel', 'Instagram Reel'),
            ('tiktok', 'TikTok'),
        ]
    )
    resolution = models.CharField(
        max_length=15,
        help_text='e.g. 1920x1080 or 1080x1920'
    )
    duration = models.FloatField(help_text='Duration in seconds')
    file_size = models.BigIntegerField(help_text='File size in bytes')
    file_path = models.CharField(max_length=500)
    file_url = models.URLField(blank=True, default='')
    subtitle_file_path = models.CharField(max_length=500, blank=True, default='')
    thumbnail_path = models.CharField(max_length=500, blank=True, default='')
    render_started_at = models.DateTimeField()
    render_completed_at = models.DateTimeField(null=True, blank=True)
    status = models.CharField(
        max_length=15,
        choices=[
            ('queued', 'Queued'),
            ('rendering', 'Rendering'),
            ('completed', 'Completed'),
            ('failed', 'Failed'),
        ],
        default='queued'
    )

    class Meta:
        app_label = 'video'
        db_table = 'igny8_rendered_videos'
```
**PK:** BigAutoField (integer) — standard Django Model
#### PublishedVideo (video app)
```python
class PublishedVideo(models.Model):
    """
    Tracks a rendered video published to a social platform.
    Uses SocialAccount from 02H for OAuth credentials.
    """
    rendered_video = models.ForeignKey(
        'video.RenderedVideo',
        on_delete=models.CASCADE,
        related_name='publications'
    )
    social_account = models.ForeignKey(
        'social.SocialAccount',
        on_delete=models.CASCADE,
        related_name='published_videos'
    )
    platform = models.CharField(max_length=15)
    platform_video_id = models.CharField(max_length=255, blank=True, default='')
    published_url = models.URLField(blank=True, default='')
    title = models.CharField(max_length=255)
    description = models.TextField()
    tags = models.JSONField(default=list)
    thumbnail_url = models.URLField(blank=True, default='')
    published_at = models.DateTimeField(null=True, blank=True)
    status = models.CharField(
        max_length=15,
        choices=[
            ('publishing', 'Publishing'),
            ('published', 'Published'),
            ('failed', 'Failed'),
            ('removed', 'Removed'),
        ],
        default='publishing'
    )

    class Meta:
        app_label = 'video'
        db_table = 'igny8_published_videos'
```
**PK:** BigAutoField (integer) — standard Django Model
#### VideoEngagement (video app)
```python
class VideoEngagement(models.Model):
    """
    Engagement metrics for a published video.
    Fetched periodically from platform APIs.
    """
    published_video = models.ForeignKey(
        'video.PublishedVideo',
        on_delete=models.CASCADE,
        related_name='engagement_records'
    )
    views = models.IntegerField(default=0)
    likes = models.IntegerField(default=0)
    comments = models.IntegerField(default=0)
    shares = models.IntegerField(default=0)
    watch_time_seconds = models.IntegerField(default=0)
    avg_view_duration = models.FloatField(default=0.0)
    raw_data = models.JSONField(default=dict, help_text='Full platform API response')
    fetched_at = models.DateTimeField(auto_now_add=True)

    class Meta:
        app_label = 'video'
        db_table = 'igny8_video_engagement'
```
**PK:** BigAutoField (integer) — standard Django Model
### 3.2 New App Registration
Create video app:
- **App config:** `igny8_core/modules/video/apps.py` with `app_label = 'video'`
- **Add to INSTALLED_APPS** in `igny8_core/settings.py`
### 3.3 Migration
```
igny8_core/migrations/XXXX_add_video_models.py
```
**Operations:**
1. `CreateModel('VideoProject', ...)` — with indexes on content, status
2. `CreateModel('VideoScript', ...)` — OneToOne to VideoProject
3. `CreateModel('VideoAsset', ...)` — with index on project
4. `CreateModel('RenderedVideo', ...)` — with index on project, status
5. `CreateModel('PublishedVideo', ...)` — with indexes on rendered_video, social_account
6. `CreateModel('VideoEngagement', ...)` — with index on published_video
### 3.4 API Endpoints
All endpoints under `/api/v1/video/`:
#### Project Management
| Method | Path | Description |
|--------|------|-------------|
| POST | `/api/v1/video/projects/` | Create video project. Body: `{content_id, project_type, target_platforms}`. |
| GET | `/api/v1/video/projects/?site_id=X` | List projects with filters (status, project_type). |
| GET | `/api/v1/video/projects/{id}/` | Project detail with script, assets, renders. |
#### Script
| Method | Path | Description |
|--------|------|-------------|
| POST | `/api/v1/video/scripts/generate/` | AI-generate script from content. Body: `{project_id}`. |
| PUT | `/api/v1/video/scripts/{project_id}/` | Edit script (user modifications). |
#### Voiceover
| Method | Path | Description |
|--------|------|-------------|
| POST | `/api/v1/video/voiceover/generate/` | Generate TTS audio. Body: `{project_id, voice_id, provider}`. |
| POST | `/api/v1/video/voiceover/preview/` | Preview voice sample (short clip). Body: `{text, voice_id, provider}`. |
#### Assets
| Method | Path | Description |
|--------|------|-------------|
| GET | `/api/v1/video/assets/{project_id}/` | List project assets. |
| POST | `/api/v1/video/assets/{project_id}/` | Add/replace asset. |
#### Rendering & Publishing
| Method | Path | Description |
|--------|------|-------------|
| POST | `/api/v1/video/render/` | Queue video render. Body: `{project_id, presets: []}`. |
| GET | `/api/v1/video/render/{id}/status/` | Render progress (queued/rendering/completed/failed). |
| GET | `/api/v1/video/rendered/{project_id}/` | List rendered videos for project. |
| POST | `/api/v1/video/publish/` | Publish to platform. Body: `{rendered_video_id, social_account_id}`. |
#### Analytics
| Method | Path | Description |
|--------|------|-------------|
| GET | `/api/v1/video/analytics/?site_id=X` | Aggregate video analytics across projects. |
| GET | `/api/v1/video/analytics/{published_video_id}/` | Single video analytics with engagement timeline. |
**Permissions:** All endpoints use `SiteSectorModelViewSet` permission patterns.
### 3.5 AI Functions
#### GenerateVideoScriptFunction
**Registry key:** `generate_video_script`
**Location:** `igny8_core/ai/functions/generate_video_script.py`
```python
class GenerateVideoScriptFunction(BaseAIFunction):
    """
    Generates video script from content record.
    Produces hook, intro, body points, CTA, chapter markers,
    and platform-specific SEO metadata.
    """
    function_name = 'generate_video_script'

    def validate(self, project_id, **kwargs):
        # Verify project exists, has linked content with content_html
        pass

    def prepare(self, project_id, **kwargs):
        # Load VideoProject + Content
        # Extract key points, images, meta_description
        # Determine target duration from project_type
        pass

    def build_prompt(self):
        # Include: content title, key points, meta_description
        # Target duration constraints
        # Per target_platform: SEO metadata requirements
        pass

    def parse_response(self, response):
        # Parse script structure: hook, intro, points[], cta, chapter_markers
        # Parse seo_metadata per platform
        pass

    def save_output(self, parsed):
        # Create/update VideoScript record
        # Update VideoProject.status = 'scripting' → 'voiceover'
        pass
```
### 3.6 TTS Service
**Location:** `igny8_core/business/tts_service.py`
```python
class TTSService:
    """
    Text-to-speech service. Supports cloud providers and self-hosted models.
    Returns audio file + word-level timestamps.
    """
    PROVIDERS = {
        'openai': OpenAITTSProvider,
        'elevenlabs': ElevenLabsTTSProvider,
        'coqui': CoquiTTSProvider,      # Self-hosted via 0F
        'bark': BarkTTSProvider,        # Self-hosted via 0F
        'piper': PiperTTSProvider,      # Self-hosted via 0F
    }

    def generate(self, text, voice_id, provider='openai'):
        """
        Generate voiceover audio.
        Returns {audio_path, duration, timestamps: [{word, start, end}]}
        """
        pass

    def preview(self, text, voice_id, provider='openai', max_chars=200):
        """Generate short preview clip."""
        pass

    def list_voices(self, provider='openai'):
        """List available voices for provider."""
        pass
```
### 3.7 Video Composition Service
**Location:** `igny8_core/business/video_composition.py`
```python
class VideoCompositionService:
    """
    Composes video from script + audio + visual assets using FFmpeg/MoviePy.
    """
    PRESETS = {
        'youtube_long':   {'width': 1920, 'height': 1080, 'min_dur': 180, 'max_dur': 900},
        'youtube_short':  {'width': 1080, 'height': 1920, 'min_dur': 30, 'max_dur': 60},
        'instagram_reel': {'width': 1080, 'height': 1920, 'min_dur': 30, 'max_dur': 90},
        'tiktok':         {'width': 1080, 'height': 1920, 'min_dur': 30, 'max_dur': 180},
    }

    def compose(self, project_id, preset):
        """
        Full composition:
        1. Load script + audio + visual assets
        2. Create visual timeline from script sections
        3. Assign visuals to sections
        4. Add text overlays at timestamps
        5. Mix voiceover + background music (20% volume)
        6. Apply transitions
        7. Render to preset resolution/format
        8. Generate SRT subtitles from TTS timestamps
        9. Generate thumbnail
        10. Create RenderedVideo record
        Returns RenderedVideo instance.
        """
        pass

    def _apply_ken_burns(self, image_clip, duration):
        """Apply zoom/pan animation to still image."""
        pass

    def _generate_subtitles(self, timestamps, output_path):
        """Generate SRT file from word-level timestamps."""
        pass

    def _generate_thumbnail(self, project, output_path):
        """Create thumbnail: hero image + title text overlay."""
        pass
```
### 3.8 Video Publisher Service
**Location:** `igny8_core/business/video_publisher.py`
```python
class VideoPublisherService:
    """
    Publishes rendered videos to platforms via their APIs.
    Reuses SocialAccount OAuth from 02H.
    """

    def publish(self, rendered_video_id, social_account_id):
        """
        Upload and publish video to platform.
        1. Load RenderedVideo + SocialAccount
        2. Decrypt OAuth tokens
        3. Upload video file via platform API
        4. Set metadata (title, description, tags, thumbnail)
        5. Create PublishedVideo record
        """
        pass
```
### 3.9 Celery Tasks
**Location:** `igny8_core/tasks/video_tasks.py`
All video tasks run on the dedicated `video` queue:
```python
@shared_task(name='generate_video_script', queue='video')
def generate_video_script_task(project_id):
    """AI script generation from content."""
    pass


@shared_task(name='generate_voiceover', queue='video')
def generate_voiceover_task(project_id):
    """TTS audio generation."""
    pass


@shared_task(name='render_video', queue='video')
def render_video_task(project_id, preset):
    """
    FFmpeg/MoviePy video composition. Long-running (5-30 min).
    Updates RenderedVideo.status through lifecycle.
    """
    pass


@shared_task(name='generate_thumbnail', queue='video')
def generate_thumbnail_task(project_id):
    """AI thumbnail creation with title overlay."""
    pass


@shared_task(name='generate_subtitles', queue='video')
def generate_subtitles_task(project_id):
    """SRT generation from TTS timestamps."""
    pass


@shared_task(name='publish_video', queue='video')
def publish_video_task(rendered_video_id, social_account_id):
    """Upload video to platform API."""
    pass


@shared_task(name='fetch_video_engagement')
def fetch_video_engagement_task():
    """Periodic metric fetch for published videos. Runs on default queue."""
    pass


@shared_task(name='video_pipeline_stage9', queue='video')
def video_pipeline_stage9(content_id):
    """
    Full pipeline: script → voice → render → publish.
    Triggered after Stage 8 (social) or directly after Stage 7 (publish).
    """
    pass
```
**Beat Schedule Additions:**
| Task | Schedule | Notes |
|------|----------|-------|
| `fetch_video_engagement` | Every 12 hours | Fetches engagement metrics for published videos |
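The beat entry above maps to a standard Celery schedule definition. A sketch using `celery.schedules.crontab` (the exact settings location is assumed from the existing `igny8_core` setup):

```python
# Assumed addition to the existing Celery beat configuration.
from celery.schedules import crontab

CELERY_BEAT_SCHEDULE = {
    'fetch-video-engagement': {
        'task': 'fetch_video_engagement',
        'schedule': crontab(minute=0, hour='*/12'),  # every 12 hours
    },
}
```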
**Docker Configuration:**
```yaml
# Add to docker-compose.app.yml:
celery_video_worker:
  build: ./backend
  command: celery -A igny8_core worker -Q video --concurrency=1 --loglevel=info
  # Requires FFmpeg installed in Docker image
```
**Dockerfile Addition:**
```dockerfile
# Add to backend/Dockerfile:
RUN apt-get update && apt-get install -y ffmpeg
```
---
## 4. IMPLEMENTATION STEPS
### Step 1: Create Video App
1. Create `igny8_core/modules/video/` directory with `__init__.py` and `apps.py`
2. Add `video` to `INSTALLED_APPS` in settings.py
3. Create 6 models: VideoProject, VideoScript, VideoAsset, RenderedVideo, PublishedVideo, VideoEngagement
### Step 2: Migration
1. Create migration for 6 new models
2. Run migration
### Step 3: System Dependencies
1. Add FFmpeg to Docker image (`apt-get install -y ffmpeg`)
2. Add to `requirements.txt`: `moviepy`, `pydub`, `Pillow` (already present), `pysrt`
3. Add docker-compose service for `celery_video_worker` with `-Q video --concurrency=1`
### Step 4: AI Function
1. Implement `GenerateVideoScriptFunction` in `igny8_core/ai/functions/generate_video_script.py`
2. Register `generate_video_script` in `igny8_core/ai/registry.py`
### Step 5: Services
1. Implement `TTSService` in `igny8_core/business/tts_service.py` (cloud + self-hosted providers)
2. Implement `VideoCompositionService` in `igny8_core/business/video_composition.py`
3. Implement `VideoPublisherService` in `igny8_core/business/video_publisher.py`
### Step 6: Pipeline Integration
Add Stage 9 trigger:
```python
# After Stage 8 (social posts) or Stage 7 (publish):
def post_social_or_publish(content_id):
    content = Content.objects.get(id=content_id)
    config = AutomationConfig.objects.get(site=content.site)
    if config.settings.get('video_enabled'):
        video_pipeline_stage9.delay(content_id)
```
### Step 7: API Endpoints
1. Create `igny8_core/urls/video.py` with project, script, voiceover, asset, render, publish, analytics endpoints
2. Create views extending `SiteSectorModelViewSet`
3. Register URL patterns under `/api/v1/video/`
### Step 8: Celery Tasks
1. Implement 8 tasks in `igny8_core/tasks/video_tasks.py`
2. Add `fetch_video_engagement` to beat schedule
3. Ensure render tasks target `video` queue
### Step 9: Serializers & Admin
1. Create DRF serializers for all 6 models
2. Register models in Django admin
### Step 10: Credit Cost Configuration
Add to `CreditCostConfig` (billing app):
| operation_type | default_cost | description |
|---------------|-------------|-------------|
| `video_script_generation` | 5 | AI script generation from content |
| `video_tts_standard` | 10/min | Cloud TTS (OpenAI) — per minute of audio |
| `video_tts_selfhosted` | 2/min | Self-hosted TTS (Coqui/Piper via 0F) |
| `video_tts_hd` | 20/min | HD TTS (ElevenLabs) — per minute |
| `video_visual_generation` | 15-50 | AI visual asset generation (varies by count) |
| `video_thumbnail` | 3-10 | AI thumbnail creation |
| `video_composition` | 5 | FFmpeg render |
| `video_seo_metadata` | 1 | SEO metadata per platform |
| `video_short_total` | 40-80 | Total for short-form video |
| `video_long_total` | 100-250 | Total for long-form video |
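A worked example helps check that the per-operation defaults add up to the published totals. The sketch below (illustrative names, low-end values from the table) prices a 1-minute short with cloud TTS, one AI visual batch, and three target platforms:

```python
# Illustrative credit calculation using the default costs above.
COSTS = {
    "script": 5,                 # video_script_generation
    "tts_standard_per_min": 10,  # video_tts_standard
    "visuals": 15,               # video_visual_generation (low end)
    "thumbnail": 3,              # video_thumbnail (low end)
    "composition": 5,            # video_composition
    "seo_per_platform": 1,       # video_seo_metadata
}

def short_video_cost(minutes: float, platforms: int) -> float:
    return (COSTS["script"]
            + COSTS["tts_standard_per_min"] * minutes
            + COSTS["visuals"]
            + COSTS["thumbnail"]
            + COSTS["composition"]
            + COSTS["seo_per_platform"] * platforms)
```

A 1-minute short on 3 platforms comes to 5 + 10 + 15 + 3 + 5 + 3 = 41 credits, inside the `video_short_total` band of 40-80.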
---
## 5. ACCEPTANCE CRITERIA
### Script Generation
- [ ] AI generates structured script from content with hook, intro, body points, CTA
- [ ] Script includes chapter markers with timestamps
- [ ] Platform-specific SEO metadata generated (title, description, tags)
- [ ] Script duration estimates match project_type constraints
- [ ] User can edit script before proceeding
### Voiceover
- [ ] OpenAI TTS generates audio with voice selection
- [ ] ElevenLabs TTS works as premium option
- [ ] Self-hosted TTS (Coqui XTTS-v2) works via 0F GPU
- [ ] Word-level timestamps generated for subtitle sync
- [ ] Voice preview endpoint allows testing before full generation
### Visual Assets
- [ ] Article images from Images model used as visual assets
- [ ] Ken Burns effect applied to still images
- [ ] Text overlay frames rendered via Pillow
- [ ] Transitions applied between scenes
- [ ] User can swap assets before rendering
### Rendering
- [ ] FFmpeg/MoviePy composition produces correct resolution per preset
- [ ] Audio mix: voiceover at 100% + background music at 20%
- [ ] SRT subtitle file generated from TTS timestamps
- [ ] AI thumbnail generated with title text overlay
- [ ] Render runs on dedicated `video` Celery queue with concurrency=1
- [ ] Render progress trackable via status endpoint
### Publishing
- [ ] Video uploads to YouTube via API (reusing 02H SocialAccount)
- [ ] Video uploads to Instagram Reels
- [ ] Video uploads to TikTok
- [ ] Platform video ID and URL stored on PublishedVideo
- [ ] Engagement metrics fetched every 12 hours
### Pipeline Integration
- [ ] Stage 9 triggers automatically when video_enabled in AutomationConfig
- [ ] Full pipeline (script → voice → render → publish) runs as single Celery chain
- [ ] VideoObject schema (02G) generated for published video content
---
## 6. CLAUDE CODE INSTRUCTIONS
### File Locations
```
igny8_core/
├── modules/
│   └── video/
│       ├── __init__.py
│       ├── apps.py                    # app_label = 'video'
│       └── models.py                  # 6 models
├── ai/
│   └── functions/
│       └── generate_video_script.py   # GenerateVideoScriptFunction
├── business/
│   ├── tts_service.py                 # TTSService (cloud + self-hosted)
│   ├── video_composition.py           # VideoCompositionService
│   └── video_publisher.py             # VideoPublisherService
├── tasks/
│   └── video_tasks.py                 # Celery tasks (video queue)
├── urls/
│   └── video.py                       # Video endpoints
└── migrations/
    └── XXXX_add_video_models.py
```
### Conventions
- **PKs:** BigAutoField (integer) — do NOT use UUIDs
- **Table prefix:** `igny8_` on all new tables
- **App label:** `video` (new app)
- **Celery app name:** `igny8_core`
- **Celery queue:** `video` for all render/composition tasks (default queue for engagement fetch)
- **URL pattern:** `/api/v1/video/...`
- **Permissions:** Use `SiteSectorModelViewSet` permission pattern
- **Docker:** FFmpeg must be installed in Docker image; dedicated `celery_video_worker` service
- **AI functions:** Extend `BaseAIFunction`; register as `generate_video_script`
- **Frontend:** `.tsx` files with Zustand stores
### Cross-References
| Doc | Relationship |
|-----|-------------|
| **02H** | Socializer provides SocialAccount model + OAuth for YouTube/Instagram/TikTok publishing |
| **0F** | Self-hosted AI infrastructure provides GPU for TTS + image generation |
| **01E** | Pipeline Stage 9 integration — hooks after Stage 8 (social) or Stage 7 (publish) |
| **02G** | VideoObject schema generated for content with published video |
| **04A** | Managed services may include video creation as premium tier |
### Key Decisions
1. **New `video` app** — Separate app because video has 6 models and complex pipeline logic distinct from social posting
2. **Dedicated Celery queue** — Video rendering is CPU/GPU intensive (5-30 min); isolated `video` queue with concurrency=1 prevents blocking other tasks
3. **VideoScript, VideoAsset as plain models.Model** — Not SiteSectorBaseModel because they're children of VideoProject which carries the site/sector context
4. **Multiple RenderedVideo per project** — One project can target multiple platforms; each gets its own render at the correct resolution
5. **Reuse 02H OAuth** — PublishedVideo references SocialAccount from 02H; no duplicate OAuth infrastructure for video platforms
6. **Temp file cleanup** — Intermediate files (raw audio, image frames, non-final renders) cleaned up after successful publish to manage disk space
### System Requirements
- FFmpeg installed on server (add to Docker image via `apt-get install -y ffmpeg`)
- Python packages: `moviepy`, `pydub`, `Pillow`, `pysrt`
- Sufficient disk space for video temp files (cleanup after publish)
- Self-hosted GPU (from 0F) for TTS + AI image generation (optional — cloud fallback available)
- Dedicated Celery worker for `video` queue: `celery -A igny8_core worker -Q video --concurrency=1`