# Item 3: Prompt Improvement and Model Optimization

**Priority:** High
**Target:** Production Launch
**Last Updated:** December 11, 2025

---

## Overview

Redesign and optimize all AI prompts for clustering, idea generation, content generation, and image prompt extraction to achieve:
- Extreme accuracy and consistent outputs
- Faster processing with optimized token usage
- Correct word count adherence (500, 1000, 1500 words)
- Improved clustering quality and idea relevance
- Better image prompt clarity and relevance

---

## Current Prompt System Architecture

### Prompt Registry

**Location:** `backend/igny8_core/ai/prompts.py`

**Class:** `PromptRegistry`

**Hierarchy** (resolution order):
1. Task-level `prompt_override` (if exists on specific task)
2. Database prompt from `AIPrompt` model (account-specific)
3. Default fallback from `PromptRegistry.DEFAULT_PROMPTS`

**Storage:**
- Default prompts: Hardcoded in `prompts.py`
- Account overrides: `system_aiprompt` database table
- Task overrides: `prompt_override` field on task object
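
The resolution order above can be sketched as a small lookup function. This is only an illustration of the three-level fallback, not the actual `PromptRegistry` code: `resolve_prompt`, its parameters, and the stand-in `DEFAULT_PROMPTS` dictionary are hypothetical names.

```python
from typing import Mapping, Optional

# Stand-in for PromptRegistry.DEFAULT_PROMPTS (hypothetical content)
DEFAULT_PROMPTS = {"clustering": "Default clustering prompt"}


def resolve_prompt(
    prompt_key: str,
    task_override: Optional[str] = None,
    account_prompts: Optional[Mapping[str, str]] = None,
) -> str:
    """Return the first non-empty prompt in the documented resolution order."""
    if task_override:  # 1. task-level prompt_override
        return task_override
    if account_prompts and account_prompts.get(prompt_key):  # 2. account-specific prompt
        return account_prompts[prompt_key]
    return DEFAULT_PROMPTS[prompt_key]  # 3. hardcoded default
```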

---

## Current Prompts Analysis

### 1. Clustering Prompt

**Function:** `auto_cluster`
**File:** `backend/igny8_core/ai/functions/auto_cluster.py`
**Prompt Key:** `'clustering'`

#### Current Prompt Structure

**Approach:** Semantic strategist + intent-driven clustering

**Key Instructions:**
- Return single JSON with "clusters" array
- Each cluster: name, description, keywords[]
- Multi-dimensional grouping (intent, use-case, function, persona, context)
- Model real search behavior and user journeys
- Avoid superficial groupings and duplicates
- 3-10 keywords per cluster

**Strengths:**
- ✅ Clear JSON output format
- ✅ Detailed grouping logic with dimensions
- ✅ Emphasis on semantic strength over keyword matching
- ✅ User journey modeling (Problem → Solution, General → Specific)

**Issues:**
- ❌ Very long prompt (~400+ tokens), which may confuse the model
- ❌ No examples provided, so the model must guess the formatting
- ❌ Doesn't explicitly say how to handle outlier keywords
- ❌ No guidance on cluster count (the number of clusters varies between runs)
- ❌ Description length not constrained

**Real-World Performance Issues:**
- Sometimes creates too many small clusters (1-2 keywords each)
- Inconsistent cluster naming convention
- Descriptions sometimes generic ("Keywords related to...")

---

### 2. Idea Generation Prompt

**Function:** `generate_ideas`
**File:** `backend/igny8_core/ai/functions/generate_ideas.py`
**Prompt Key:** `'ideas'`

#### Current Prompt Structure

**Approach:** SEO-optimized content ideas + outlines

**Key Instructions:**
- Input: Clusters + Keywords
- Output: JSON "ideas" array
- 1 cluster_hub + 2-4 supporting ideas per cluster
- Fields: title, description, content_type, content_structure, cluster_id, estimated_word_count, covered_keywords
- Outline format: intro (hook + 2 paragraphs), 5-8 H2 sections with 2-3 H3s each
- Content mixing: paragraphs, lists, tables, blockquotes
- No bullets/lists at start
- Professional tone, no generic phrasing

**Strengths:**
- ✅ Detailed outline structure
- ✅ Content mixing guidance (lists, tables, blockquotes)
- ✅ Clear JSON format
- ✅ Tone guidelines

**Issues:**
- ❌ Very complex prompt (600+ tokens)
- ❌ Outline format too prescriptive (might limit creativity)
- ❌ No examples provided
- ❌ Estimated word count often inaccurate (too high or too low)
- ❌ "hook" guidance unclear (what makes a good hook?)
- ❌ Content structure validation not enforced

**Real-World Performance Issues:**
- Generated ideas sometimes too similar within cluster
- Outlines don't always respect structure types (e.g., "review" vs "guide")
- covered_keywords field sometimes empty or incorrect
- cluster_hub vs supporting ideas distinction unclear

---

### 3. Content Generation Prompt

**Function:** `generate_content`
**File:** `backend/igny8_core/ai/functions/generate_content.py`
**Prompt Key:** `'content_generation'`

#### Current Prompt Structure

**Approach:** Editorial content strategist

**Key Instructions:**
- Output: JSON {title, content (HTML)}
- Introduction: 1 italic hook (30-40 words) + 2 paragraphs (50-60 words each), no headings
- H2 sections: 5-8 total, 250-300 words each
- Section format: 2 narrative paragraphs → list/table → optional closing paragraph → 2-3 subsections
- Vary list/table types
- Never start section with list/table
- Tone: professional, no passive voice, no generic intros
- Keyword usage: natural in title, intro, headings

**Strengths:**
- ✅ Detailed structure guidance
- ✅ Strong tone/style rules
- ✅ HTML output format
- ✅ Keyword integration guidance

**Issues:**
- ❌ **Word count not mentioned in prompt** - critical flaw
- ❌ No guidance on 500 vs 1000 vs 1500 word versions
- ❌ Hook word count (30-40) + paragraph counts (50-60 × 2) don't scale proportionally
- ❌ Section word count (250-300) doesn't adapt to total target
- ❌ No example output
- ❌ Content structure (article vs guide vs review) not clearly differentiated
- ❌ Table column guidance missing (what columns? how many?)

**Real-World Performance Issues:**
- **Output length wildly inconsistent** (generates 800 words when asked for 1500)
- Introductions sometimes have headings despite instructions
- Lists appear at start of sections
- Table structure unclear (random columns)
- Doesn't adapt content density to word count

---

### 4. Image Prompt Extraction

**Function:** `generate_image_prompts`
**File:** `backend/igny8_core/ai/functions/generate_image_prompts.py`
**Prompt Key:** `'image_prompt_extraction'`

#### Current Prompt Structure

**Approach:** Extract visual descriptions from article

**Key Instructions:**
- Input: article title + content
- Output: JSON {featured_prompt, in_article_prompts[]}
- Extract featured image (main topic)
- Extract up to {max_images} in-article images
- Each prompt detailed for image generation (visual elements, style, mood, composition)

**Strengths:**
- ✅ Clear structure
- ✅ Separates featured vs in-article
- ✅ Emphasizes detail in descriptions

**Issues:**
- ❌ No guidance on what makes a good image prompt
- ❌ No style/mood specifications
- ❌ Doesn't specify where in article to place images
- ❌ No examples
- ❌ "Detailed enough" is subjective

**Real-World Performance Issues:**
- Prompts sometimes too generic ("Image of a person using a laptop")
- No context from article content (extracts irrelevant visuals)
- Featured image prompt sometimes identical to in-article prompt
- No guidance on image diversity (all similar)

---

### 5. Image Generation Template

**Prompt Key:** `'image_prompt_template'`

#### Current Template

**Approach:** Template-based prompt assembly

**Format:**
```
Create a high-quality {image_type} image... "{post_title}"... {image_prompt}...
Focus on realistic, well-composed scene... lifestyle/editorial web content...
Avoid text, watermarks, logos... **not blurry.**
```

**Issues:**
- ❌ {image_type} not always populated
- ❌ "high-quality" and "not blurry" redundant/unclear
- ❌ No style guidance (photographic, illustration, 3D, etc.)
- ❌ No aspect ratio specification

---

## Required Improvements

### A. Clustering Prompt Redesign

#### Goals
- Reduce prompt length by 30-40%
- Add 2-3 concrete examples
- Enforce consistent cluster count (5-15 clusters ideal)
- Standardize cluster naming (title case, descriptive)
- Limit description to 20-30 words

#### Proposed Structure

**Section 1: Role & Task** (50 tokens)
- Clear, concise role definition
- Task: group keywords into intent-driven clusters

**Section 2: Output Format with Example** (100 tokens)
- JSON structure
- Show 1 complete example cluster
- Specify exact field requirements

**Section 3: Clustering Rules** (150 tokens)
- List 5-7 key rules (bullet format)
- Keyword-first approach
- Intent dimensions (brief)
- Quality thresholds (3-10 keywords per cluster)
- No duplicates

**Section 4: Quality Checklist** (50 tokens)
- Checklist of 4-5 validation points
- Model self-validates before output

**Total:** ~350 tokens (vs current ~420, about 17% shorter; further trimming is needed to reach the 30-40% goal)

#### Example Output Format to Include

```json
{
  "clusters": [
    {
      "name": "Organic Bedding Benefits",
      "description": "Health, eco-friendly, and comfort aspects of organic cotton bedding materials",
      "keywords": ["organic sheets", "eco-friendly bedding", "chemical-free cotton", "hypoallergenic sheets", "sustainable bedding"]
    }
  ]
}
```
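
A response in this format can be checked mechanically against the goals listed for the redesign (5-15 clusters, 3-10 keywords per cluster, no duplicate keywords, descriptions of at most 30 words). The sketch below is one possible validator using only the standard library; `validate_clusters` and its message strings are hypothetical and not part of the existing codebase.

```python
import json
from typing import List


def validate_clusters(raw_json: str, min_clusters: int = 5, max_clusters: int = 15) -> List[str]:
    """Return a list of human-readable problems; an empty list means the payload passed."""
    clusters = json.loads(raw_json).get("clusters", [])
    problems: List[str] = []

    if not min_clusters <= len(clusters) <= max_clusters:
        problems.append(f"cluster count {len(clusters)} outside {min_clusters}-{max_clusters}")

    seen_keywords = set()
    for cluster in clusters:
        name = cluster.get("name", "<unnamed>")
        keywords = cluster.get("keywords", [])
        if not 3 <= len(keywords) <= 10:
            problems.append(f"{name}: {len(keywords)} keywords (expected 3-10)")
        if len(cluster.get("description", "").split()) > 30:
            problems.append(f"{name}: description longer than 30 words")
        for keyword in keywords:
            normalized = keyword.strip().lower()  # compare case-insensitively
            if normalized in seen_keywords:
                problems.append(f"{name}: duplicate keyword '{keyword}'")
            seen_keywords.add(normalized)
    return problems
```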

---

### B. Idea Generation Prompt Redesign

#### Goals
- Simplify outline structure (less prescriptive)
- Add examples of cluster_hub vs supporting ideas
- Better covered_keywords extraction
- Adaptive word count estimation
- Content structure differentiation

#### Proposed Structure

**Section 1: Role & Objective** (40 tokens)
- SEO content strategist
- Task: generate content ideas from clusters

**Section 2: Output Format with Examples** (150 tokens)
- Show 1 cluster_hub example
- Show 1 supporting idea example
- Highlight key differences

**Section 3: Idea Generation Rules** (100 tokens)
- 1 cluster_hub (comprehensive, authoritative)
- 2-4 supporting ideas (specific angles)
- Word count: 1500-2200 for hubs, 1000-1500 for supporting
- covered_keywords: extract from cluster keywords

**Section 4: Outline Guidance** (100 tokens)
- Simplified: Intro + 5-8 sections + Conclusion
- Section types by content_structure:
  - article: narrative + data
  - guide: step-by-step + tips
  - review: pros/cons + comparison
  - listicle: numbered + categories
  - comparison: side-by-side + verdict

**Total:** ~390 tokens (vs current ~610)

---

### C. Content Generation Prompt Redesign

**Most Critical Improvement:** Word Count Adherence

#### Goals
- **Primary:** Hit the target word count within ±5%
- Scale structure proportionally to word count
- Differentiate content structures clearly
- Improve HTML quality and consistency
- Better keyword integration

#### Proposed Adaptive Word Count System

**Word Count Targets** (each total includes the fixed 60-word conclusion from the structure breakdown):
- 500 words: Short-form (60-word intro + 5 sections × 80 words + 60-word conclusion = 520 words)
- 1000 words: Standard (120-word intro + 6 sections × 140 words + 60-word conclusion = 1020 words)
- 1500 words: Long-form (180-word intro + 7 sections × 180 words + 60-word conclusion = 1500 words)

**Prompt Variable Replacement:**

Before sending to AI, calculate:
- `{TARGET_WORD_COUNT}` - from task.word_count
- `{INTRO_WORDS}` - 60 / 120 / 180 based on target
- `{SECTION_COUNT}` - 5 / 6 / 7 based on target
- `{SECTION_WORDS}` - 80 / 140 / 180 based on target
- `{HOOK_WORDS}` - 25 / 35 / 45 based on target
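
The budgets can be sanity-checked with a few lines of arithmetic. The sketch below assumes the conclusion is a fixed 60 words (as in the structure breakdown of the proposed prompt) and that the hook is counted inside the introduction; `WordBudget`, `BUDGETS`, and `budget_deviation` are hypothetical names used only for this illustration.

```python
from dataclasses import dataclass

CONCLUSION_WORDS = 60  # assumed fixed for every target


@dataclass(frozen=True)
class WordBudget:
    intro_words: int
    section_count: int
    section_words: int
    hook_words: int  # counted as part of intro_words

    @property
    def planned_total(self) -> int:
        """Intro + all H2 sections + conclusion."""
        return self.intro_words + self.section_count * self.section_words + CONCLUSION_WORDS


BUDGETS = {
    500: WordBudget(intro_words=60, section_count=5, section_words=80, hook_words=25),
    1000: WordBudget(intro_words=120, section_count=6, section_words=140, hook_words=35),
    1500: WordBudget(intro_words=180, section_count=7, section_words=180, hook_words=45),
}


def budget_deviation(target: int) -> float:
    """Relative difference between the planned total and the target word count."""
    return (BUDGETS[target].planned_total - target) / target
```

Under these assumptions the planned totals come to 520, 1020, and 1500 words, so every budget lands inside the ±5% tolerance before the model writes a single word.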

#### Proposed Structure

**Section 1: Role & Objective** (30 tokens)
```
You are an editorial content writer. Generate a {TARGET_WORD_COUNT}-word article...
```

**Section 2: Word Count Requirements** (80 tokens)
```
CRITICAL: The content must be {TARGET_WORD_COUNT} words (±5% tolerance).

Structure breakdown:
- Introduction: {INTRO_WORDS} words total
  - Hook (italic): {HOOK_WORDS} words
  - Paragraphs: 2, sharing the remaining words equally (about ({INTRO_WORDS} - {HOOK_WORDS}) / 2 each)
- Main Sections: {SECTION_COUNT} H2 sections
  - Each section: {SECTION_WORDS} words
- Conclusion: 60 words

Word count validation: Count words in final output and adjust if needed.
```
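
One way to fill these placeholders is a small regex-based renderer. The sketch below is an assumption about how substitution could work, not the existing `PromptRegistry` behavior; `render_prompt` is a hypothetical helper. It deliberately avoids `str.format`, which would fail on the literal JSON braces that the prompts contain, and it raises on any placeholder that was left unfilled.

```python
import re
from typing import Mapping

# Matches upper-case placeholders such as {TARGET_WORD_COUNT}; JSON braces are left untouched
PLACEHOLDER = re.compile(r"\{([A-Z_]+)\}")


def render_prompt(template: str, values: Mapping[str, object]) -> str:
    """Substitute {UPPER_CASE} placeholders and fail loudly if any value is missing."""
    missing = sorted({name for name in PLACEHOLDER.findall(template) if name not in values})
    if missing:
        raise KeyError(f"unfilled placeholders: {', '.join(missing)}")
    return PLACEHOLDER.sub(lambda match: str(values[match.group(1)]), template)


# Example usage with a shortened template
example = render_prompt(
    'Write {TARGET_WORD_COUNT} words in {SECTION_COUNT} sections. Return {"title": "...", "content": "..."}',
    {"TARGET_WORD_COUNT": 1000, "SECTION_COUNT": 6},
)
```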

**Section 3: Content Flow & HTML** (120 tokens)
- Detailed structure per section
- HTML tag usage (`<p>`, `<h2>`, `<h3>`, `<ul>`, `<ol>`, `<table>`)
- Formatting rules

**Section 4: Style & Quality** (80 tokens)
- Tone guidance
- Keyword usage
- Avoid generic phrases
- Examples of good vs bad openings

**Section 5: Content Structure Types** (90 tokens)
- article: {structure description}
- guide: {structure description}
- review: {structure description}
- comparison: {structure description}
- listicle: {structure description}
- cluster_hub: {structure description}

**Section 6: Output Format with Example** (100 tokens)
- JSON structure
- Show abbreviated example with proper HTML

**Total:** ~500 tokens (vs current ~550, but much more precise)

---

### D. Image Prompt Improvements

#### Goals
- Generate visually diverse prompts
- Better context from article content
- Specify image placement guidelines
- Improve prompt detail and clarity

#### Proposed Extraction Prompt Structure

**Section 1: Task & Context** (50 tokens)
```
Extract image prompts from this article for visual content placement.

Article: {title}
Content: {content}
Required: 1 featured + {max_images} in-article images
```

**Section 2: Image Types & Guidelines** (100 tokens)
```
Featured Image:
- Hero visual representing article's main theme
- Broad, engaging, high-quality
- Should work at large sizes (1200×630+)

In-Article Images (place strategically):
1. After introduction
2. Mid-article (before major H2 sections)
3. Supporting specific concepts or examples
4. Before conclusion

Each prompt must describe:
- Subject & composition
- Visual style (photographic, minimal, editorial)
- Mood & lighting
- Color palette suggestions
- Avoid: text, logos, faces (unless relevant)
```

**Section 3: Prompt Quality Rules** (80 tokens)
- Be specific and descriptive (not generic)
- Include scene details, angles, perspective
- Specify lighting, time of day if relevant
- Mention style references
- Ensure diversity across all images
- No duplicate concepts

**Section 4: Output Format** (50 tokens)
- JSON structure
- Show example with good vs bad prompts

#### Proposed Template Prompt Improvement

Replace current template with:

```
A {style} photograph for "{post_title}". {image_prompt}.
Composition: {composition_hint}. Lighting: {lighting_hint}.
Mood: {mood}. Style: clean, modern, editorial web content.
No text, watermarks, or logos.
```

Where:
- {style} - photographic, minimalist, lifestyle, etc.
- {composition_hint} - center-framed, rule-of-thirds, wide-angle, etc.
- {lighting_hint} - natural daylight, soft indoor, dramatic, etc.
- {mood} - professional, warm, energetic, calm, etc.
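
Because the current template suffers from an unpopulated `{image_type}`, the new variables probably need fallbacks. A minimal sketch of template assembly with defaults follows; `build_image_prompt` and the `DEFAULTS` values are hypothetical choices picked from the option lists above, not settings taken from the codebase.

```python
IMAGE_TEMPLATE = (
    'A {style} photograph for "{post_title}". {image_prompt}. '
    "Composition: {composition_hint}. Lighting: {lighting_hint}. "
    "Mood: {mood}. Style: clean, modern, editorial web content. "
    "No text, watermarks, or logos."
)

# Hypothetical fallbacks so that no placeholder is ever left empty
DEFAULTS = {
    "style": "lifestyle",
    "composition_hint": "rule-of-thirds",
    "lighting_hint": "natural daylight",
    "mood": "professional",
}


def build_image_prompt(post_title: str, image_prompt: str, **hints: str) -> str:
    """Fill the template, falling back to DEFAULTS for missing or empty hints."""
    values = {**DEFAULTS, **{key: value for key, value in hints.items() if value}}
    return IMAGE_TEMPLATE.format(
        post_title=post_title,
        image_prompt=image_prompt.rstrip(". "),  # avoid a doubled period after the prompt
        **values,
    )
```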

---

## Implementation Plan

### Phase 1: Clustering Prompt (Week 1)

**Tasks:**
1. ✅ Draft new clustering prompt with examples
2. ✅ Test with sample keyword sets (20, 50, 100 keywords)
3. ✅ Compare outputs: old vs new
4. ✅ Validate cluster quality (manual review)
5. ✅ Update `PromptRegistry.DEFAULT_PROMPTS['clustering']`
6. ✅ Deploy and monitor

**Success Criteria:**
- Consistent cluster count (5-15)
- No single-keyword clusters
- Clear, descriptive names
- Concise descriptions (20-30 words)
- 95%+ of keywords clustered

---

### Phase 2: Idea Generation Prompt (Week 1-2)

**Tasks:**
1. ✅ Draft new ideas prompt with examples
2. ✅ Test with 5-10 clusters
3. ✅ Validate cluster_hub vs supporting idea distinction
4. ✅ Check covered_keywords accuracy
5. ✅ Verify content_structure alignment
6. ✅ Update `PromptRegistry.DEFAULT_PROMPTS['ideas']`
7. ✅ Deploy and monitor

**Success Criteria:**
- Clear distinction between hub and supporting ideas
- Accurate covered_keywords extraction
- Appropriate word count estimates
- Outlines match content_structure type
- No duplicate ideas within cluster
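
Several of these criteria can be checked per cluster without manual review. The sketch below assumes that a hub idea is marked by `content_structure == "cluster_hub"` (the document does not say which field carries that flag) and that ideas arrive as plain dictionaries with the field names listed earlier; `validate_cluster_ideas` is a hypothetical helper.

```python
from typing import Dict, List


def validate_cluster_ideas(ideas: List[Dict], cluster_keywords: List[str]) -> List[str]:
    """Check one cluster's ideas: 1 hub, 2-4 supporting, unique titles, valid covered_keywords."""
    problems: List[str] = []
    # Assumption: the hub is identified via content_structure
    hubs = [idea for idea in ideas if idea.get("content_structure") == "cluster_hub"]
    supporting = [idea for idea in ideas if idea.get("content_structure") != "cluster_hub"]

    if len(hubs) != 1:
        problems.append(f"expected exactly 1 cluster_hub, found {len(hubs)}")
    if not 2 <= len(supporting) <= 4:
        problems.append(f"expected 2-4 supporting ideas, found {len(supporting)}")

    allowed = {keyword.lower() for keyword in cluster_keywords}
    seen_titles = set()
    for idea in ideas:
        title = idea.get("title", "").strip()
        if title.lower() in seen_titles:
            problems.append(f"duplicate title: {title}")
        seen_titles.add(title.lower())

        covered = idea.get("covered_keywords") or []
        if not covered:
            problems.append(f"{title}: covered_keywords is empty")
        unknown = [keyword for keyword in covered if keyword.lower() not in allowed]
        if unknown:
            problems.append(f"{title}: keywords not in cluster: {', '.join(unknown)}")
    return problems
```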

---

### Phase 3: Content Generation Prompt (Week 2)

**Tasks:**
1. ✅ Draft new content prompt with word count logic
2. ✅ Implement dynamic variable replacement in `build_prompt()`
3. ✅ Test with 500, 1000, 1500 word targets
4. ✅ Validate actual word counts (automated counting)
5. ✅ Test all content_structure types
6. ✅ Verify HTML quality and consistency
7. ✅ Update `PromptRegistry.DEFAULT_PROMPTS['content_generation']`
8. ✅ Deploy and monitor

**Code Change Required:**

**File:** `backend/igny8_core/ai/functions/generate_content.py`

**Method:** `build_prompt()`

**Add word count calculation:**

```python
from typing import Any

# Import path assumed from the registry location (backend/igny8_core/ai/prompts.py)
from igny8_core.ai.prompts import PromptRegistry


def build_prompt(self, data: Any, account=None) -> str:
    # Accept a single task or a list of tasks (the first task is used)
    task = data if not isinstance(data, list) else data[0]

    # Calculate adaptive word count parameters
    target_words = task.word_count or 1000

    if target_words <= 600:
        intro_words = 60
        section_count = 5
        section_words = 80
        hook_words = 25
    elif target_words <= 1200:
        intro_words = 120
        section_count = 6
        section_words = 140
        hook_words = 35
    else:
        intro_words = 180
        section_count = 7
        section_words = 180
        hook_words = 45

    # Get prompt and replace variables
    prompt = PromptRegistry.get_prompt(
        function_name='generate_content',
        account=account,
        task=task,
        context={
            'TARGET_WORD_COUNT': target_words,
            'INTRO_WORDS': intro_words,
            'SECTION_COUNT': section_count,
            'SECTION_WORDS': section_words,
            'HOOK_WORDS': hook_words,
            # ... existing context
        }
    )

    return prompt
```

**Success Criteria:**
- 95%+ of generated content within ±5% of target word count
- HTML structure consistent
- Content structure types clearly differentiated
- Keyword integration natural
- No sections starting with lists
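
The last criterion lends itself to an automated check. The sketch below uses the standard library's `html.parser` to find H2 sections whose first element is a list or table; the class and function names are hypothetical, and it treats the first tag opened after a closing `</h2>` as the start of that section.

```python
from html.parser import HTMLParser
from typing import List


class SectionStartChecker(HTMLParser):
    """Collects the headings of H2 sections whose first element is a list or table."""

    def __init__(self) -> None:
        super().__init__()
        self.violations: List[str] = []
        self._in_h2 = False
        self._heading_parts: List[str] = []
        self._awaiting_first_element = False

    def handle_starttag(self, tag, attrs):
        if tag == "h2":
            self._in_h2 = True
            self._heading_parts = []
            self._awaiting_first_element = False
        elif self._awaiting_first_element:
            # First element after the heading decides whether the section is compliant
            if tag in ("ul", "ol", "table"):
                self.violations.append("".join(self._heading_parts).strip())
            self._awaiting_first_element = False

    def handle_endtag(self, tag):
        if tag == "h2":
            self._in_h2 = False
            self._awaiting_first_element = True

    def handle_data(self, data):
        if self._in_h2:
            self._heading_parts.append(data)


def sections_starting_with_list(html: str) -> List[str]:
    """Return the H2 headings of all sections that open with a list or table."""
    checker = SectionStartChecker()
    checker.feed(html)
    checker.close()
    return checker.violations
```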

---

### Phase 4: Image Prompt Improvements (Week 2-3)

**Tasks:**
1. ✅ Draft new extraction prompt with placement guidelines
2. ✅ Draft new template prompt with style variables
3. ✅ Test with 10 sample articles
4. ✅ Validate image diversity and relevance
5. ✅ Update both prompts in registry
6. ✅ Update `GenerateImagePromptsFunction` to use new template
7. ✅ Deploy and monitor

**Success Criteria:**
- No duplicate image concepts in same article
- Prompts are specific and detailed
- Featured image distinct from in-article images
- Image placement logically distributed
- Generated images relevant to content
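
Duplicate concepts and a featured prompt that repeats an in-article prompt can be flagged with a simple lexical comparison. The sketch below uses Jaccard similarity over word sets, which is a rough stand-in for real semantic comparison; the 0.8 threshold and the function names are hypothetical and would need tuning on real prompts.

```python
import re
from itertools import combinations
from typing import List, Set, Tuple


def _tokens(text: str) -> Set[str]:
    """Lower-cased alphanumeric words of a prompt."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a: str, b: str) -> float:
    """Size of the shared vocabulary divided by the size of the combined vocabulary."""
    tokens_a, tokens_b = _tokens(a), _tokens(b)
    if not tokens_a and not tokens_b:
        return 1.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def find_duplicate_prompts(prompts: List[str], threshold: float = 0.8) -> List[Tuple[int, int]]:
    """Return index pairs of prompts whose word overlap reaches the threshold."""
    return [
        (i, j)
        for (i, first), (j, second) in combinations(enumerate(prompts), 2)
        if jaccard(first, second) >= threshold
    ]
```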

---

## Prompt Versioning & Testing

### Version Control

**Recommendation:** Store prompt versions in database for A/B testing

**Schema:**

```python
from django.db import models


class AIPromptVersion(models.Model):
    # PROMPT_TYPE_CHOICES is assumed to be defined alongside the existing AIPrompt model;
    # max_length=50 is a placeholder value
    prompt_type = models.CharField(max_length=50, choices=PROMPT_TYPE_CHOICES)
    version = models.IntegerField()
    prompt_value = models.TextField()
    is_active = models.BooleanField(default=False)
    created_at = models.DateTimeField(auto_now_add=True)
    performance_metrics = models.JSONField(default=dict)  # Track success rates
```

**Process:**
1. Test new prompt version alongside current
2. Compare outputs on same inputs
3. Measure quality metrics (manual + automated)
4. Gradually roll out if better
5. Keep old version as fallback
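
For the gradual rollout step, accounts can be assigned to the new version deterministically so that the same account always sees the same prompt while the percentage grows. The sketch below is one possible approach, hashing the account ID into 100 buckets; `use_new_prompt_version` and its parameters are hypothetical.

```python
import hashlib


def use_new_prompt_version(account_id: int, prompt_type: str, rollout_percent: int) -> bool:
    """Return True if this account falls inside the rollout percentage for a prompt type."""
    digest = hashlib.sha256(f"{prompt_type}:{account_id}".encode("utf-8")).hexdigest()
    bucket = int(digest[:8], 16) % 100  # stable bucket in the range 0-99
    return bucket < rollout_percent
```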

---

### Automated Quality Metrics

**Implement automated checks:**

| Metric | Check | Threshold |
|--------|-------|-----------|
| Word Count Accuracy | `abs(actual - target) / target` | < 0.05 (±5%) |
| HTML Validity | Parse with BeautifulSoup | 100% valid |
| Keyword Presence | Count keyword mentions | ≥ 3 for primary |
| Structure Compliance | Check H2/H3 hierarchy | Valid structure |
| Cluster Count | Number of clusters | 5-15 |
| Cluster Size | Keywords per cluster | 3-10 |
| No Duplicates | Keyword appears once | 100% unique |

**Log results:**
- Track per prompt version
- Identify patterns in failures
- Use for prompt iteration
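
The word count and keyword checks from the table can be sketched with the standard library alone. The helpers below are hypothetical; they strip tags with `html.parser` and count whitespace-separated words, so a word that is split by inline markup is counted as several words, which is an accepted approximation here.

```python
from html.parser import HTMLParser
from typing import List


class _TextExtractor(HTMLParser):
    """Collects the text nodes of an HTML fragment."""

    def __init__(self) -> None:
        super().__init__()
        self.parts: List[str] = []

    def handle_data(self, data):
        self.parts.append(data)


def visible_text(html: str) -> str:
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)


def count_words(html: str) -> int:
    return len(visible_text(html).split())


def word_count_ok(html: str, target: int, tolerance: float = 0.05) -> bool:
    """Word Count Accuracy check: abs(actual - target) / target < tolerance."""
    return abs(count_words(html) - target) / target < tolerance


def keyword_mentions(html: str, keyword: str) -> int:
    """Keyword Presence check: case-insensitive count of the phrase in the visible text."""
    return visible_text(html).lower().count(keyword.lower())
```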

---

## Model Selection & Optimization

### Current Models

**Location:** `backend/igny8_core/ai/settings.py`

**Default Models per Function:**
- Clustering: GPT-4 (expensive but accurate)
- Ideas: GPT-4 (creative)
- Content: GPT-4 (quality)
- Image Prompts: GPT-3.5-turbo (simpler task)
- Images: DALL-E 3 / Runware

### Optimization Opportunities

**Cost vs Quality Tradeoffs:**

| Function | Current | Alternative | Cost Savings | Quality Impact |
|----------|---------|-------------|--------------|----------------|
| Clustering | GPT-4 | GPT-4-turbo | 50% | Minimal |
| Ideas | GPT-4 | GPT-4-turbo | 50% | Minimal |
| Content | GPT-4 | GPT-4-turbo | 50% | Test required |
| Image Prompts | GPT-3.5 | Keep | - | - |

**Recommendation:** Test GPT-4-turbo for all text generation tasks
- Faster response time
- 50% cost reduction
- Similar quality for structured outputs

---

## Success Metrics

- ✅ Word count accuracy: 95%+ within ±5%
- ✅ Clustering quality: No single-keyword clusters
- ✅ Idea generation: Clear hub vs supporting distinction
- ✅ HTML validity: 100%
- ✅ Keyword integration: Natural, not stuffed
- ✅ Image prompt diversity: No duplicates
- ✅ User satisfaction: Fewer manual edits needed
- ✅ Processing time: <10s for 1000-word article
- ✅ Credit cost: 30% reduction with model optimization

---

## Related Files Reference

### Backend
- `backend/igny8_core/ai/prompts.py` - Prompt registry and defaults
- `backend/igny8_core/ai/functions/auto_cluster.py` - Clustering function
- `backend/igny8_core/ai/functions/generate_ideas.py` - Ideas function
- `backend/igny8_core/ai/functions/generate_content.py` - Content function
- `backend/igny8_core/ai/functions/generate_image_prompts.py` - Image prompts
- `backend/igny8_core/ai/settings.py` - Model configuration
- `backend/igny8_core/modules/system/models.py` - AIPrompt model

### Testing
- Create test suite: `backend/igny8_core/ai/tests/test_prompts.py`
- Test fixtures with sample inputs
- Automated quality validation
- Performance benchmarks

---

## Notes

- All prompt changes should be tested on real data first
- Keep old prompts in version history for rollback
- Monitor user feedback on content quality
- Consider user-customizable prompt templates (advanced feature)
- Document prompt engineering best practices for team
- SAG clustering prompt (mentioned in original doc) to be handled separately as specialized architecture