This commit is contained in:
IGNY8 VPS (Salman)
2026-03-23 17:20:51 +00:00
parent e78a41f11c
commit 0570052fec
21 changed files with 15889 additions and 0 deletions

# IGNY8 Phase 2: Video Creator (02I)
## AI Video Creation Pipeline — Stage 9
**Document Version:** 1.0
**Date:** 2026-03-23
**Phase:** IGNY8 Phase 2 — Feature Expansion
**Status:** Build Ready
**Source of Truth:** Codebase at `/data/app/igny8/`
**Audience:** Claude Code, Backend Developers, Architects
---
## 1. CURRENT STATE
### Video Today
There is **no** video creation capability in IGNY8. No TTS, no FFmpeg pipeline, no video publishing. Images exist (generated by pipeline Stages 5-6) and can feed into video as visual assets.
### What Exists
- `Images` model (writer app) — generated images from pipeline, usable as video visual assets
- `SocialAccount` model (02H) — provides OAuth connections to YouTube, Instagram, TikTok for video publishing
- Self-hosted AI infrastructure (Phase 0F) — provides GPU for TTS and AI image generation
- Content generation pipeline (01E) — content records provide source material for video scripts
- Celery infrastructure with multiple queues — supports dedicated `video` queue for long-running renders
### What Does Not Exist
- No video app or models
- No script generation from articles
- No TTS (text-to-speech) voiceover generation
- No FFmpeg/MoviePy video composition pipeline
- No subtitle generation
- No video publishing to platforms
- No Stage 9 pipeline integration
---
## 2. WHAT TO BUILD
### Overview
Build Stage 9 of the automation pipeline: an AI video creation system that converts published content into videos. The pipeline has 5 stages: script generation → voiceover → visual assets → composition → publishing. Videos publish to YouTube, Instagram Reels, and TikTok.
### 2.1 Video Types
| Type | Duration | Aspect Ratio | Primary Platform |
|------|----------|-------------|-----------------|
| **Short** | 30-90s | 9:16 vertical | YouTube Shorts, Instagram Reels, TikTok |
| **Medium** | 60-180s | 9:16 or 16:9 | TikTok, YouTube |
| **Long** | 5-15m | 16:9 horizontal | YouTube |
### 2.2 Platform Specs
| Platform | Max Duration | Resolution | Encoding | Max File Size |
|----------|-------------|------------|----------|--------------|
| YouTube Long | Up to 12h | 1920×1080 | MP4 H.264, AAC audio | 256GB |
| YouTube Shorts | ≤60s | 1080×1920 | MP4 H.264, AAC audio | 256GB |
| Instagram Reels | ≤90s | 1080×1920 | MP4 H.264, AAC audio | 650MB |
| TikTok | ≤10m | 1080×1920 | MP4 H.264, AAC audio | 72MB |
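As a sketch of how these caps could be enforced before upload (helper name hypothetical; the limits are taken from the table above):

```python
# Hypothetical pre-upload check against the platform caps in the spec table.
PLATFORM_LIMITS = {
    # platform: (max_duration_seconds, max_file_size_bytes)
    'youtube_short': (60, 256 * 1024**3),
    'instagram_reel': (90, 650 * 1024**2),
    'tiktok': (600, 72 * 1024**2),
}

def validate_for_platform(platform, duration_s, file_size_bytes):
    """Return a list of violations; an empty list means the render fits the caps."""
    max_dur, max_size = PLATFORM_LIMITS[platform]
    violations = []
    if duration_s > max_dur:
        violations.append(f'duration {duration_s}s exceeds {max_dur}s cap')
    if file_size_bytes > max_size:
        violations.append(f'file size {file_size_bytes} exceeds {max_size} byte cap')
    return violations
```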
### 2.3 Five-Stage Video Pipeline
#### Stage 1 — Script Generation (AI)
**Input:** Content record (title, content_html, meta_description, keywords, images)
AI extracts key points and produces:
```json
{
  "hook": "text (3-5 sec)",
  "intro": "text (10-15 sec)",
  "points": [
    {"text": "...", "duration_est": 20, "visual_cue": "show chart", "text_overlay": "Key stat"}
  ],
  "cta": "text (5-10 sec)",
  "chapter_markers": [{"time": 0, "title": "Intro"}],
  "total_estimated_duration": 120
}
```
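The `chapter_markers` field maps directly onto YouTube-style description timestamps (mentioned again in Stage 5); a minimal sketch, with a hypothetical helper name:

```python
def format_chapter_markers(chapter_markers):
    """Render script chapter markers as YouTube-style 'M:SS Title' description lines."""
    lines = []
    for marker in chapter_markers:
        minutes, seconds = divmod(int(marker['time']), 60)
        lines.append(f'{minutes}:{seconds:02d} {marker["title"]}')
    return '\n'.join(lines)
```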
SEO: for each target platform, the AI also generates a platform-specific title, description, and tags.
#### Stage 2 — Voiceover (TTS)
**Cloud Providers:**
| Provider | Cost | Quality | Features |
|----------|------|---------|----------|
| OpenAI TTS | $15-30/1M chars | High | Voices: alloy, echo, fable, onyx, nova, shimmer |
| ElevenLabs | Plan-based | Highest | Voice cloning, ultra-realistic |
**Self-Hosted (via 0F GPU):**
| Model | Quality | Speed | Notes |
|-------|---------|-------|-------|
| Coqui XTTS-v2 | Good | Medium | Multi-language, free |
| Bark | Expressive | Slow | Emotional speech |
| Piper TTS | Moderate | Fast | Lightweight |
**Features:** Voice selection, speed control, multi-language support
**Output:** WAV/MP3 audio file + word-level timestamps (for subtitle sync)
#### Stage 3 — Visual Assets
**Sources:**
- Article images from `Images` model (already generated by pipeline Stages 5-6)
- AI-generated scenes (Runware/DALL-E/Stable Diffusion via 0F)
- Stock footage APIs: Pexels, Pixabay (free, API key required)
- Text overlay frames (rendered via Pillow)
- Code snippet frames (via Pygments syntax highlighting)
**Effects:**
- Ken Burns effect on still images (zoom/pan animation)
- Transition effects between scenes (fade, slide, dissolve)
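The Ken Burns effect reduces to per-frame crop-window math. A minimal sketch of the underlying computation (helper name and zoom range are illustrative; in MoviePy this would drive a per-frame crop/resize):

```python
def ken_burns_window(img_w, img_h, t, duration, zoom_start=1.0, zoom_end=1.15):
    """
    Compute the crop window (x, y, w, h) at time t for a slow zoom-in.
    The window shrinks linearly from the full frame down to 1/zoom_end
    of it, staying centred; cropping each frame to this window and
    rescaling to the output resolution produces the zoom animation.
    """
    progress = min(max(t / duration, 0.0), 1.0)
    zoom = zoom_start + (zoom_end - zoom_start) * progress
    w = img_w / zoom
    h = img_h / zoom
    x = (img_w - w) / 2
    y = (img_h - h) / 2
    return x, y, w, h
```

Panning is the same idea with the window's centre interpolated between two points instead of held fixed.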
#### Stage 4 — Video Composition (FFmpeg + MoviePy)
**Libraries:** FFmpeg (encoding), MoviePy (high-level composition), Pillow (text overlays), pydub (audio processing)
**Process:**
1. Create visual timeline from script sections
2. Assign visuals to each section (image/video clip per point)
3. Add text overlays at specified timestamps
4. Mix voiceover audio with background music (royalty-free, 20% volume)
5. Apply transitions between sections
6. Render to target resolution/format
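The "20% volume" mix in step 4 translates to a dB gain; a small worked conversion (helper name hypothetical):

```python
import math

def volume_to_db(fraction):
    """Convert a linear volume fraction (e.g. 0.2 for 20%) to a dB gain for audio mixing."""
    return 20 * math.log10(fraction)

# Background music at 20% volume is roughly a -14 dB gain; with pydub this
# would be applied as `music + volume_to_db(0.2)` before overlaying the
# voiceover track (pydub applies gain via the + operator).
```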
**Render Presets:**
| Preset | Resolution | Duration Range | Encoding |
|--------|-----------|----------------|----------|
| `youtube_long` | 1920×1080 | 3-15m | H.264/AAC |
| `youtube_short` | 1080×1920 | 30-60s | H.264/AAC |
| `instagram_reel` | 1080×1920 | 30-90s | H.264/AAC |
| `tiktok` | 1080×1920 | 30-180s | H.264/AAC |
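Each preset maps onto an FFmpeg invocation; a hedged sketch of building the argument list (helper name and preset dict shape are assumptions; the flags themselves are standard FFmpeg):

```python
def ffmpeg_args(preset, input_path, audio_path, output_path, presets):
    """Build an FFmpeg command for one render preset (H.264 video + AAC audio)."""
    p = presets[preset]
    return [
        'ffmpeg', '-y',
        '-i', input_path,                          # composed visual track
        '-i', audio_path,                          # mixed voiceover + music
        '-c:v', 'libx264',
        '-c:a', 'aac',
        '-vf', f"scale={p['width']}:{p['height']}",
        '-movflags', '+faststart',                 # web-friendly MP4
        output_path,
    ]
```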
#### Stage 5 — SEO & Publishing
- Auto-generate SRT subtitle file from TTS word-level timestamps
- AI thumbnail: hero image with title text overlay
- Platform-specific metadata: title (optimized per platform), description (with timestamps for YouTube), tags, category
- Publishing via platform APIs (reuses OAuth from 02H SocialAccount)
- Confirmation logging with platform video ID
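The SRT step is a pure transformation of the Stage 2 word-level timestamps; a minimal sketch (helper name and grouping heuristic are assumptions):

```python
def timestamps_to_srt(words, max_words=7):
    """
    Group word-level TTS timestamps into SRT cues of up to `max_words` words.
    Each item in `words` is {'word': str, 'start': float, 'end': float} (seconds).
    """
    def fmt(seconds):
        ms = int(round(seconds * 1000))
        h, rem = divmod(ms, 3600000)
        m, rem = divmod(rem, 60000)
        s, ms = divmod(rem, 1000)
        return f'{h:02d}:{m:02d}:{s:02d},{ms:03d}'

    cues = []
    for i in range(0, len(words), max_words):
        group = words[i:i + max_words]
        text = ' '.join(w['word'] for w in group)
        cues.append(f'{len(cues) + 1}\n{fmt(group[0]["start"])} --> {fmt(group[-1]["end"])}\n{text}')
    return '\n\n'.join(cues) + '\n'
```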
### 2.4 User Flow
1. Select content → choose video type (short/medium/long) → target platforms
2. AI generates script → user reviews/edits script
3. Select voice → preview audio → approve
4. Auto-assign visuals → user can swap images → preview composition
5. Render video → preview final → approve
6. Publish to selected platforms → track performance
### 2.5 Dedicated Celery Queue
Video rendering is CPU/GPU intensive and requires isolation:
- **Dedicated `video` queue:** `celery -A igny8_core worker -Q video --concurrency=1`
- **Long-running tasks:** 5-30 minutes per video render
- **Progress tracking:** via Celery result backend (task status updates)
- **Temp file cleanup:** after publish, clean up intermediate files (audio, frames, raw renders)
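One way to wire the dedicated queue, sketched as Django settings (setting names assume the `CELERY_` namespace convention used by the django-celery integration; task names mirror section 3.9):

```python
# Route long-running video tasks to the isolated 'video' queue.
CELERY_TASK_ROUTES = {
    'render_video': {'queue': 'video'},
    'generate_voiceover': {'queue': 'video'},
    'generate_video_script': {'queue': 'video'},
    'publish_video': {'queue': 'video'},
}
# A hard per-task time ceiling can instead be set on the decorator,
# e.g. @shared_task(time_limit=45 * 60) for render_video (renders run 5-30 min).
```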
---
## 3. DATA MODELS & APIS
### 3.1 New Models
All models in a new `video` app.
#### VideoProject (video app)
```python
class VideoProject(SiteSectorBaseModel):
    """
    Top-level container for a video creation project.
    Links to source content and tracks overall progress.
    """
    content = models.ForeignKey(
        'writer.Content',
        on_delete=models.SET_NULL,
        null=True,
        blank=True,
        related_name='video_projects',
        help_text='Source content (null for standalone video)'
    )
    project_type = models.CharField(
        max_length=10,
        choices=[
            ('short', 'Short (30-90s)'),
            ('medium', 'Medium (60-180s)'),
            ('long', 'Long (5-15m)'),
        ]
    )
    target_platforms = models.JSONField(
        default=list,
        help_text='List of target platform strings: youtube_long, youtube_short, instagram_reel, tiktok'
    )
    status = models.CharField(
        max_length=15,
        choices=[
            ('draft', 'Draft'),
            ('scripting', 'Script Generation'),
            ('voiceover', 'Voiceover Generation'),
            ('composing', 'Visual Composition'),
            ('rendering', 'Rendering'),
            ('review', 'Ready for Review'),
            ('published', 'Published'),
            ('failed', 'Failed'),
        ],
        default='draft'
    )
    settings = models.JSONField(
        default=dict,
        help_text='{voice_id, voice_provider, music_track, transition_style}'
    )

    class Meta:
        app_label = 'video'
        db_table = 'igny8_video_projects'
```
**PK:** BigAutoField (integer) — inherits from SiteSectorBaseModel
#### VideoScript (video app)
```python
class VideoScript(models.Model):
    """
    Script for a video project — generated by AI, editable by user.
    """
    project = models.OneToOneField(
        'video.VideoProject',
        on_delete=models.CASCADE,
        related_name='script'
    )
    script_text = models.TextField(help_text='Full narration text')
    sections = models.JSONField(
        default=list,
        help_text='[{text, duration_est, visual_cue, text_overlay}]'
    )
    hook = models.TextField(blank=True, default='')
    cta = models.TextField(blank=True, default='')
    chapter_markers = models.JSONField(
        default=list,
        help_text='[{time, title}]'
    )
    total_estimated_duration = models.IntegerField(
        default=0,
        help_text='Total estimated duration in seconds'
    )
    seo_metadata = models.JSONField(
        default=dict,
        help_text='{platform: {title, description, tags}}'
    )
    version = models.IntegerField(default=1)

    class Meta:
        app_label = 'video'
        db_table = 'igny8_video_scripts'
```
**PK:** BigAutoField (integer) — standard Django Model
#### VideoAsset (video app)
```python
class VideoAsset(models.Model):
    """
    Individual asset (image, footage, music, overlay, subtitle) for a video project.
    """
    project = models.ForeignKey(
        'video.VideoProject',
        on_delete=models.CASCADE,
        related_name='assets'
    )
    asset_type = models.CharField(
        max_length=15,
        choices=[
            ('image', 'Image'),
            ('footage', 'Footage'),
            ('music', 'Background Music'),
            ('overlay', 'Text Overlay'),
            ('subtitle', 'Subtitle File'),
        ]
    )
    source = models.CharField(
        max_length=20,
        choices=[
            ('article_image', 'Article Image'),
            ('ai_generated', 'AI Generated'),
            ('stock_pexels', 'Pexels Stock'),
            ('stock_pixabay', 'Pixabay Stock'),
            ('uploaded', 'User Uploaded'),
            ('rendered', 'Rendered'),
        ]
    )
    file_path = models.CharField(max_length=500, help_text='Path in media storage')
    file_url = models.URLField(blank=True, default='')
    duration = models.FloatField(
        null=True, blank=True,
        help_text='Duration in seconds (for video/audio assets)'
    )
    section_index = models.IntegerField(
        null=True, blank=True,
        help_text='Which script section this asset belongs to'
    )
    order = models.IntegerField(default=0)

    class Meta:
        app_label = 'video'
        db_table = 'igny8_video_assets'
```
**PK:** BigAutoField (integer) — standard Django Model
#### RenderedVideo (video app)
```python
class RenderedVideo(models.Model):
    """
    A rendered video file for a specific platform preset.
    One project can have multiple renders (one per target platform).
    """
    project = models.ForeignKey(
        'video.VideoProject',
        on_delete=models.CASCADE,
        related_name='rendered_videos'
    )
    preset = models.CharField(
        max_length=20,
        choices=[
            ('youtube_long', 'YouTube Long'),
            ('youtube_short', 'YouTube Short'),
            ('instagram_reel', 'Instagram Reel'),
            ('tiktok', 'TikTok'),
        ]
    )
    resolution = models.CharField(
        max_length=15,
        help_text='e.g. 1920x1080 or 1080x1920'
    )
    duration = models.FloatField(help_text='Duration in seconds')
    file_size = models.BigIntegerField(help_text='File size in bytes')
    file_path = models.CharField(max_length=500)
    file_url = models.URLField(blank=True, default='')
    subtitle_file_path = models.CharField(max_length=500, blank=True, default='')
    thumbnail_path = models.CharField(max_length=500, blank=True, default='')
    render_started_at = models.DateTimeField()
    render_completed_at = models.DateTimeField(null=True, blank=True)
    status = models.CharField(
        max_length=15,
        choices=[
            ('queued', 'Queued'),
            ('rendering', 'Rendering'),
            ('completed', 'Completed'),
            ('failed', 'Failed'),
        ],
        default='queued'
    )

    class Meta:
        app_label = 'video'
        db_table = 'igny8_rendered_videos'
```
**PK:** BigAutoField (integer) — standard Django Model
#### PublishedVideo (video app)
```python
class PublishedVideo(models.Model):
    """
    Tracks a rendered video published to a social platform.
    Uses SocialAccount from 02H for OAuth credentials.
    """
    rendered_video = models.ForeignKey(
        'video.RenderedVideo',
        on_delete=models.CASCADE,
        related_name='publications'
    )
    social_account = models.ForeignKey(
        'social.SocialAccount',
        on_delete=models.CASCADE,
        related_name='published_videos'
    )
    platform = models.CharField(max_length=15)
    platform_video_id = models.CharField(max_length=255, blank=True, default='')
    published_url = models.URLField(blank=True, default='')
    title = models.CharField(max_length=255)
    description = models.TextField()
    tags = models.JSONField(default=list)
    thumbnail_url = models.URLField(blank=True, default='')
    published_at = models.DateTimeField(null=True, blank=True)
    status = models.CharField(
        max_length=15,
        choices=[
            ('publishing', 'Publishing'),
            ('published', 'Published'),
            ('failed', 'Failed'),
            ('removed', 'Removed'),
        ],
        default='publishing'
    )

    class Meta:
        app_label = 'video'
        db_table = 'igny8_published_videos'
```
**PK:** BigAutoField (integer) — standard Django Model
#### VideoEngagement (video app)
```python
class VideoEngagement(models.Model):
    """
    Engagement metrics for a published video.
    Fetched periodically from platform APIs.
    """
    published_video = models.ForeignKey(
        'video.PublishedVideo',
        on_delete=models.CASCADE,
        related_name='engagement_records'
    )
    views = models.IntegerField(default=0)
    likes = models.IntegerField(default=0)
    comments = models.IntegerField(default=0)
    shares = models.IntegerField(default=0)
    watch_time_seconds = models.IntegerField(default=0)
    avg_view_duration = models.FloatField(default=0.0)
    raw_data = models.JSONField(default=dict, help_text='Full platform API response')
    fetched_at = models.DateTimeField(auto_now_add=True)

    class Meta:
        app_label = 'video'
        db_table = 'igny8_video_engagement'
```
**PK:** BigAutoField (integer) — standard Django Model
### 3.2 New App Registration
Create video app:
- **App config:** `igny8_core/modules/video/apps.py` with `app_label = 'video'`
- **Add to INSTALLED_APPS** in `igny8_core/settings.py`
### 3.3 Migration
```
igny8_core/migrations/XXXX_add_video_models.py
```
**Operations:**
1. `CreateModel('VideoProject', ...)` — with indexes on content, status
2. `CreateModel('VideoScript', ...)` — OneToOne to VideoProject
3. `CreateModel('VideoAsset', ...)` — with index on project
4. `CreateModel('RenderedVideo', ...)` — with index on project, status
5. `CreateModel('PublishedVideo', ...)` — with indexes on rendered_video, social_account
6. `CreateModel('VideoEngagement', ...)` — with index on published_video
### 3.4 API Endpoints
All endpoints under `/api/v1/video/`:
#### Project Management
| Method | Path | Description |
|--------|------|-------------|
| POST | `/api/v1/video/projects/` | Create video project. Body: `{content_id, project_type, target_platforms}`. |
| GET | `/api/v1/video/projects/?site_id=X` | List projects with filters (status, project_type). |
| GET | `/api/v1/video/projects/{id}/` | Project detail with script, assets, renders. |
#### Script
| Method | Path | Description |
|--------|------|-------------|
| POST | `/api/v1/video/scripts/generate/` | AI-generate script from content. Body: `{project_id}`. |
| PUT | `/api/v1/video/scripts/{project_id}/` | Edit script (user modifications). |
#### Voiceover
| Method | Path | Description |
|--------|------|-------------|
| POST | `/api/v1/video/voiceover/generate/` | Generate TTS audio. Body: `{project_id, voice_id, provider}`. |
| POST | `/api/v1/video/voiceover/preview/` | Preview voice sample (short clip). Body: `{text, voice_id, provider}`. |
#### Assets
| Method | Path | Description |
|--------|------|-------------|
| GET | `/api/v1/video/assets/{project_id}/` | List project assets. |
| POST | `/api/v1/video/assets/{project_id}/` | Add/replace asset. |
#### Rendering & Publishing
| Method | Path | Description |
|--------|------|-------------|
| POST | `/api/v1/video/render/` | Queue video render. Body: `{project_id, presets: []}`. |
| GET | `/api/v1/video/render/{id}/status/` | Render progress (queued/rendering/completed/failed). |
| GET | `/api/v1/video/rendered/{project_id}/` | List rendered videos for project. |
| POST | `/api/v1/video/publish/` | Publish to platform. Body: `{rendered_video_id, social_account_id}`. |
#### Analytics
| Method | Path | Description |
|--------|------|-------------|
| GET | `/api/v1/video/analytics/?site_id=X` | Aggregate video analytics across projects. |
| GET | `/api/v1/video/analytics/{published_video_id}/` | Single video analytics with engagement timeline. |
**Permissions:** All endpoints use `SiteSectorModelViewSet` permission patterns.
### 3.5 AI Functions
#### GenerateVideoScriptFunction
**Registry key:** `generate_video_script`
**Location:** `igny8_core/ai/functions/generate_video_script.py`
```python
class GenerateVideoScriptFunction(BaseAIFunction):
    """
    Generates video script from content record.
    Produces hook, intro, body points, CTA, chapter markers,
    and platform-specific SEO metadata.
    """
    function_name = 'generate_video_script'

    def validate(self, project_id, **kwargs):
        # Verify project exists, has linked content with content_html
        pass

    def prepare(self, project_id, **kwargs):
        # Load VideoProject + Content
        # Extract key points, images, meta_description
        # Determine target duration from project_type
        pass

    def build_prompt(self):
        # Include: content title, key points, meta_description
        # Target duration constraints
        # Per target_platform: SEO metadata requirements
        pass

    def parse_response(self, response):
        # Parse script structure: hook, intro, points[], cta, chapter_markers
        # Parse seo_metadata per platform
        pass

    def save_output(self, parsed):
        # Create/update VideoScript record
        # Update VideoProject.status = 'scripting' → 'voiceover'
        pass
```
### 3.6 TTS Service
**Location:** `igny8_core/business/tts_service.py`
```python
class TTSService:
    """
    Text-to-speech service. Supports cloud providers and self-hosted models.
    Returns audio file + word-level timestamps.
    """
    PROVIDERS = {
        'openai': OpenAITTSProvider,
        'elevenlabs': ElevenLabsTTSProvider,
        'coqui': CoquiTTSProvider,    # Self-hosted via 0F
        'bark': BarkTTSProvider,      # Self-hosted via 0F
        'piper': PiperTTSProvider,    # Self-hosted via 0F
    }

    def generate(self, text, voice_id, provider='openai'):
        """
        Generate voiceover audio.
        Returns {audio_path, duration, timestamps: [{word, start, end}]}
        """
        pass

    def preview(self, text, voice_id, provider='openai', max_chars=200):
        """Generate short preview clip."""
        pass

    def list_voices(self, provider='openai'):
        """List available voices for provider."""
        pass
```
### 3.7 Video Composition Service
**Location:** `igny8_core/business/video_composition.py`
```python
class VideoCompositionService:
    """
    Composes video from script + audio + visual assets using FFmpeg/MoviePy.
    """
    PRESETS = {
        'youtube_long': {'width': 1920, 'height': 1080, 'min_dur': 180, 'max_dur': 900},
        'youtube_short': {'width': 1080, 'height': 1920, 'min_dur': 30, 'max_dur': 60},
        'instagram_reel': {'width': 1080, 'height': 1920, 'min_dur': 30, 'max_dur': 90},
        'tiktok': {'width': 1080, 'height': 1920, 'min_dur': 30, 'max_dur': 180},
    }

    def compose(self, project_id, preset):
        """
        Full composition:
        1. Load script + audio + visual assets
        2. Create visual timeline from script sections
        3. Assign visuals to sections
        4. Add text overlays at timestamps
        5. Mix voiceover + background music (20% volume)
        6. Apply transitions
        7. Render to preset resolution/format
        8. Generate SRT subtitles from TTS timestamps
        9. Generate thumbnail
        10. Create RenderedVideo record
        Returns RenderedVideo instance.
        """
        pass

    def _apply_ken_burns(self, image_clip, duration):
        """Apply zoom/pan animation to still image."""
        pass

    def _generate_subtitles(self, timestamps, output_path):
        """Generate SRT file from word-level timestamps."""
        pass

    def _generate_thumbnail(self, project, output_path):
        """Create thumbnail: hero image + title text overlay."""
        pass
```
### 3.8 Video Publisher Service
**Location:** `igny8_core/business/video_publisher.py`
```python
class VideoPublisherService:
    """
    Publishes rendered videos to platforms via their APIs.
    Reuses SocialAccount OAuth from 02H.
    """
    def publish(self, rendered_video_id, social_account_id):
        """
        Upload and publish video to platform.
        1. Load RenderedVideo + SocialAccount
        2. Decrypt OAuth tokens
        3. Upload video file via platform API
        4. Set metadata (title, description, tags, thumbnail)
        5. Create PublishedVideo record
        """
        pass
```
### 3.9 Celery Tasks
**Location:** `igny8_core/tasks/video_tasks.py`
All video tasks run on the dedicated `video` queue:
```python
@shared_task(name='generate_video_script', queue='video')
def generate_video_script_task(project_id):
    """AI script generation from content."""
    pass


@shared_task(name='generate_voiceover', queue='video')
def generate_voiceover_task(project_id):
    """TTS audio generation."""
    pass


@shared_task(name='render_video', queue='video')
def render_video_task(project_id, preset):
    """
    FFmpeg/MoviePy video composition. Long-running (5-30 min).
    Updates RenderedVideo.status through lifecycle.
    """
    pass


@shared_task(name='generate_thumbnail', queue='video')
def generate_thumbnail_task(project_id):
    """AI thumbnail creation with title overlay."""
    pass


@shared_task(name='generate_subtitles', queue='video')
def generate_subtitles_task(project_id):
    """SRT generation from TTS timestamps."""
    pass


@shared_task(name='publish_video', queue='video')
def publish_video_task(rendered_video_id, social_account_id):
    """Upload video to platform API."""
    pass


@shared_task(name='fetch_video_engagement')
def fetch_video_engagement_task():
    """Periodic metric fetch for published videos. Runs on default queue."""
    pass


@shared_task(name='video_pipeline_stage9', queue='video')
def video_pipeline_stage9(content_id):
    """
    Full pipeline: script → voice → render → publish.
    Triggered after Stage 8 (social) or directly after Stage 7 (publish).
    """
    pass
```
**Beat Schedule Additions:**
| Task | Schedule | Notes |
|------|----------|-------|
| `fetch_video_engagement` | Every 12 hours | Fetches engagement metrics for published videos |
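The beat entry above could be sketched as (key name hypothetical; `timedelta` is one accepted schedule type for Celery beat):

```python
from datetime import timedelta

CELERY_BEAT_SCHEDULE = {
    'fetch-video-engagement': {
        'task': 'fetch_video_engagement',   # runs on the default queue
        'schedule': timedelta(hours=12),
    },
}
```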
**Docker Configuration:**
```yaml
# Add to docker-compose.app.yml:
celery_video_worker:
  build: ./backend
  command: celery -A igny8_core worker -Q video --concurrency=1 --loglevel=info
  # Requires FFmpeg installed in Docker image
```
**Dockerfile Addition:**
```dockerfile
# Add to backend/Dockerfile:
RUN apt-get update && apt-get install -y ffmpeg
```
---
## 4. IMPLEMENTATION STEPS
### Step 1: Create Video App
1. Create `igny8_core/modules/video/` directory with `__init__.py` and `apps.py`
2. Add `video` to `INSTALLED_APPS` in settings.py
3. Create 6 models: VideoProject, VideoScript, VideoAsset, RenderedVideo, PublishedVideo, VideoEngagement
### Step 2: Migration
1. Create migration for 6 new models
2. Run migration
### Step 3: System Dependencies
1. Add FFmpeg to Docker image (`apt-get install -y ffmpeg`)
2. Add to `requirements.txt`: `moviepy`, `pydub`, `Pillow` (already present), `pysrt`
3. Add docker-compose service for `celery_video_worker` with `-Q video --concurrency=1`
### Step 4: AI Function
1. Implement `GenerateVideoScriptFunction` in `igny8_core/ai/functions/generate_video_script.py`
2. Register `generate_video_script` in `igny8_core/ai/registry.py`
### Step 5: Services
1. Implement `TTSService` in `igny8_core/business/tts_service.py` (cloud + self-hosted providers)
2. Implement `VideoCompositionService` in `igny8_core/business/video_composition.py`
3. Implement `VideoPublisherService` in `igny8_core/business/video_publisher.py`
### Step 6: Pipeline Integration
Add Stage 9 trigger:
```python
# After Stage 8 (social posts) or Stage 7 (publish):
def post_social_or_publish(content_id):
    content = Content.objects.get(id=content_id)
    config = AutomationConfig.objects.get(site=content.site)
    if config.settings.get('video_enabled'):
        video_pipeline_stage9.delay(content_id)
```
### Step 7: API Endpoints
1. Create `igny8_core/urls/video.py` with project, script, voiceover, asset, render, publish, analytics endpoints
2. Create views extending `SiteSectorModelViewSet`
3. Register URL patterns under `/api/v1/video/`
### Step 8: Celery Tasks
1. Implement 8 tasks in `igny8_core/tasks/video_tasks.py`
2. Add `fetch_video_engagement` to beat schedule
3. Ensure render tasks target `video` queue
### Step 9: Serializers & Admin
1. Create DRF serializers for all 6 models
2. Register models in Django admin
### Step 10: Credit Cost Configuration
Add to `CreditCostConfig` (billing app):
| operation_type | default_cost | description |
|---------------|-------------|-------------|
| `video_script_generation` | 5 | AI script generation from content |
| `video_tts_standard` | 10/min | Cloud TTS (OpenAI) — per minute of audio |
| `video_tts_selfhosted` | 2/min | Self-hosted TTS (Coqui/Piper via 0F) |
| `video_tts_hd` | 20/min | HD TTS (ElevenLabs) — per minute |
| `video_visual_generation` | 15-50 | AI visual asset generation (varies by count) |
| `video_thumbnail` | 3-10 | AI thumbnail creation |
| `video_composition` | 5 | FFmpeg render |
| `video_seo_metadata` | 1 | SEO metadata per platform |
| `video_short_total` | 40-80 | Total for short-form video |
| `video_long_total` | 100-250 | Total for long-form video |
---
## 5. ACCEPTANCE CRITERIA
### Script Generation
- [ ] AI generates structured script from content with hook, intro, body points, CTA
- [ ] Script includes chapter markers with timestamps
- [ ] Platform-specific SEO metadata generated (title, description, tags)
- [ ] Script duration estimates match project_type constraints
- [ ] User can edit script before proceeding
### Voiceover
- [ ] OpenAI TTS generates audio with voice selection
- [ ] ElevenLabs TTS works as premium option
- [ ] Self-hosted TTS (Coqui XTTS-v2) works via 0F GPU
- [ ] Word-level timestamps generated for subtitle sync
- [ ] Voice preview endpoint allows testing before full generation
### Visual Assets
- [ ] Article images from Images model used as visual assets
- [ ] Ken Burns effect applied to still images
- [ ] Text overlay frames rendered via Pillow
- [ ] Transitions applied between scenes
- [ ] User can swap assets before rendering
### Rendering
- [ ] FFmpeg/MoviePy composition produces correct resolution per preset
- [ ] Audio mix: voiceover at 100% + background music at 20%
- [ ] SRT subtitle file generated from TTS timestamps
- [ ] AI thumbnail generated with title text overlay
- [ ] Render runs on dedicated `video` Celery queue with concurrency=1
- [ ] Render progress trackable via status endpoint
### Publishing
- [ ] Video uploads to YouTube via API (reusing 02H SocialAccount)
- [ ] Video uploads to Instagram Reels
- [ ] Video uploads to TikTok
- [ ] Platform video ID and URL stored on PublishedVideo
- [ ] Engagement metrics fetched every 12 hours
### Pipeline Integration
- [ ] Stage 9 triggers automatically when video_enabled in AutomationConfig
- [ ] Full pipeline (script → voice → render → publish) runs as single Celery chain
- [ ] VideoObject schema (02G) generated for published video content
---
## 6. CLAUDE CODE INSTRUCTIONS
### File Locations
```
igny8_core/
├── modules/
│   └── video/
│       ├── __init__.py
│       ├── apps.py                      # app_label = 'video'
│       └── models.py                    # 6 models
├── ai/
│   └── functions/
│       └── generate_video_script.py     # GenerateVideoScriptFunction
├── business/
│   ├── tts_service.py                   # TTSService (cloud + self-hosted)
│   ├── video_composition.py             # VideoCompositionService
│   └── video_publisher.py               # VideoPublisherService
├── tasks/
│   └── video_tasks.py                   # Celery tasks (video queue)
├── urls/
│   └── video.py                         # Video endpoints
└── migrations/
    └── XXXX_add_video_models.py
```
### Conventions
- **PKs:** BigAutoField (integer) — do NOT use UUIDs
- **Table prefix:** `igny8_` on all new tables
- **App label:** `video` (new app)
- **Celery app name:** `igny8_core`
- **Celery queue:** `video` for all render/composition tasks (default queue for engagement fetch)
- **URL pattern:** `/api/v1/video/...`
- **Permissions:** Use `SiteSectorModelViewSet` permission pattern
- **Docker:** FFmpeg must be installed in Docker image; dedicated `celery_video_worker` service
- **AI functions:** Extend `BaseAIFunction`; register as `generate_video_script`
- **Frontend:** `.tsx` files with Zustand stores
### Cross-References
| Doc | Relationship |
|-----|-------------|
| **02H** | Socializer provides SocialAccount model + OAuth for YouTube/Instagram/TikTok publishing |
| **0F** | Self-hosted AI infrastructure provides GPU for TTS + image generation |
| **01E** | Pipeline Stage 9 integration — hooks after Stage 8 (social) or Stage 7 (publish) |
| **02G** | VideoObject schema generated for content with published video |
| **04A** | Managed services may include video creation as premium tier |
### Key Decisions
1. **New `video` app** — Separate app because video has 6 models and complex pipeline logic distinct from social posting
2. **Dedicated Celery queue** — Video rendering is CPU/GPU intensive (5-30 min); isolated `video` queue with concurrency=1 prevents blocking other tasks
3. **VideoScript, VideoAsset as plain models.Model** — Not SiteSectorBaseModel because they're children of VideoProject which carries the site/sector context
4. **Multiple RenderedVideo per project** — One project can target multiple platforms; each gets its own render at the correct resolution
5. **Reuse 02H OAuth** — PublishedVideo references SocialAccount from 02H; no duplicate OAuth infrastructure for video platforms
6. **Temp file cleanup** — Intermediate files (raw audio, image frames, non-final renders) cleaned up after successful publish to manage disk space
### System Requirements
- FFmpeg installed on server (add to Docker image via `apt-get install -y ffmpeg`)
- Python packages: `moviepy`, `pydub`, `Pillow`, `pysrt`
- Sufficient disk space for video temp files (cleanup after publish)
- Self-hosted GPU (from 0F) for TTS + AI image generation (optional — cloud fallback available)
- Dedicated Celery worker for `video` queue: `celery -A igny8_core worker -Q video --concurrency=1`