Item 3: Prompt Improvement and Model Optimization
Priority: High
Target: Production Launch
Last Updated: December 11, 2025
Overview
Redesign and optimize all AI prompts for clustering, idea generation, content generation, and image prompt extraction to achieve:
- Extreme accuracy and consistent outputs
- Faster processing with optimized token usage
- Correct word count adherence (500, 1000, 1500 words)
- Improved clustering quality and idea relevance
- Better image prompt clarity and relevance
Current Prompt System Architecture
Prompt Registry
Location: backend/igny8_core/ai/prompts.py
Class: PromptRegistry
Hierarchy (resolution order):
- Task-level prompt_override (if it exists on the specific task)
- Database prompt from the AIPrompt model (account-specific)
- Default fallback from PromptRegistry.DEFAULT_PROMPTS
Storage:
- Default prompts: Hardcoded in prompts.py
- Account overrides: system_aiprompt database table
- Task overrides: prompt_override field on task object
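The resolution order described above can be sketched as a small lookup function. This is a minimal illustration, not the actual PromptRegistry code: `resolve_prompt`, the `db_prompts` dictionary standing in for the account's database prompts, and the placeholder prompt string are all assumptions.

```python
from typing import Optional

# Stand-in for PromptRegistry.DEFAULT_PROMPTS (the content is a placeholder).
DEFAULT_PROMPTS = {"clustering": "default clustering prompt"}


def resolve_prompt(
    prompt_key: str,
    task_override: Optional[str] = None,
    db_prompts: Optional[dict] = None,
) -> str:
    """Return the first prompt found: task override, account prompt, then default."""
    if task_override:  # 1. task-level prompt_override
        return task_override
    if db_prompts and prompt_key in db_prompts:  # 2. account-specific database prompt
        return db_prompts[prompt_key]
    return DEFAULT_PROMPTS[prompt_key]  # 3. hardcoded default
```

An empty or missing override falls through to the next level, so a task without its own prompt behaves exactly like the account default.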
Current Prompts Analysis
1. Clustering Prompt
Function: auto_cluster
File: backend/igny8_core/ai/functions/auto_cluster.py
Prompt Key: 'clustering'
Current Prompt Structure
Approach: Semantic strategist + intent-driven clustering
Key Instructions:
- Return single JSON with "clusters" array
- Each cluster: name, description, keywords[]
- Multi-dimensional grouping (intent, use-case, function, persona, context)
- Model real search behavior and user journeys
- Avoid superficial groupings and duplicates
- 3-10 keywords per cluster
Strengths:
✅ Clear JSON output format
✅ Detailed grouping logic with dimensions
✅ Emphasis on semantic strength over keyword matching
✅ User journey modeling (Problem → Solution, General → Specific)
Issues:
❌ Very long prompt (~400+ tokens) - may confuse model
❌ No examples provided - model must guess formatting
❌ Doesn't specify what to do with outliers explicitly
❌ No guidance on cluster count (outputs variable)
❌ Description length not constrained
Real-World Performance Issues:
- Sometimes creates too many small clusters (1-2 keywords each)
- Inconsistent cluster naming convention
- Descriptions sometimes generic ("Keywords related to...")
2. Idea Generation Prompt
Function: generate_ideas
File: backend/igny8_core/ai/functions/generate_ideas.py
Prompt Key: 'ideas'
Current Prompt Structure
Approach: SEO-optimized content ideas + outlines
Key Instructions:
- Input: Clusters + Keywords
- Output: JSON "ideas" array
- 1 cluster_hub + 2-4 supporting ideas per cluster
- Fields: title, description, content_type, content_structure, cluster_id, estimated_word_count, covered_keywords
- Outline format: intro (hook + 2 paragraphs), 5-8 H2 sections with 2-3 H3s each
- Content mixing: paragraphs, lists, tables, blockquotes
- No bullets/lists at start
- Professional tone, no generic phrasing
Strengths:
✅ Detailed outline structure
✅ Content mixing guidance (lists, tables, blockquotes)
✅ Clear JSON format
✅ Tone guidelines
Issues:
❌ Very complex prompt (600+ tokens)
❌ Outline format too prescriptive (might limit creativity)
❌ No examples provided
❌ Estimated word count often inaccurate (too high or too low)
❌ "hook" guidance unclear (what makes a good hook?)
❌ Content structure validation not enforced
Real-World Performance Issues:
- Generated ideas sometimes too similar within cluster
- Outlines don't always respect structure types (e.g., "review" vs "guide")
- covered_keywords field sometimes empty or incorrect
- cluster_hub vs supporting ideas distinction unclear
3. Content Generation Prompt
Function: generate_content
File: backend/igny8_core/ai/functions/generate_content.py
Prompt Key: 'content_generation'
Current Prompt Structure
Approach: Editorial content strategist
Key Instructions:
- Output: JSON {title, content (HTML)}
- Introduction: 1 italic hook (30-40 words) + 2 paragraphs (50-60 words each), no headings
- H2 sections: 5-8 total, 250-300 words each
- Section format: 2 narrative paragraphs → list/table → optional closing paragraph → 2-3 subsections
- Vary list/table types
- Never start section with list/table
- Tone: professional, no passive voice, no generic intros
- Keyword usage: natural in title, intro, headings
Strengths:
✅ Detailed structure guidance
✅ Strong tone/style rules
✅ HTML output format
✅ Keyword integration guidance
Issues:
❌ Word count not mentioned in prompt - critical flaw
❌ No guidance on 500 vs 1000 vs 1500 word versions
❌ Hook word count (30-40) and paragraph word counts (50-60 × 2) are fixed, so they don't scale with the target length
❌ Section word count (250-300) doesn't adapt to total target
❌ No example output
❌ Content structure (article vs guide vs review) not clearly differentiated
❌ Table column guidance missing (what columns? how many?)
Real-World Performance Issues:
- Output length wildly inconsistent (generates 800 words when asked for 1500)
- Introductions sometimes have headings despite instructions
- Lists appear at start of sections
- Table structure unclear (random columns)
- Doesn't adapt content density to word count
4. Image Prompt Extraction
Function: generate_image_prompts
File: backend/igny8_core/ai/functions/generate_image_prompts.py
Prompt Key: 'image_prompt_extraction'
Current Prompt Structure
Approach: Extract visual descriptions from article
Key Instructions:
- Input: article title + content
- Output: JSON {featured_prompt, in_article_prompts[]}
- Extract featured image (main topic)
- Extract up to {max_images} in-article images
- Each prompt detailed for image generation (visual elements, style, mood, composition)
Strengths:
✅ Clear structure
✅ Separates featured vs in-article
✅ Emphasizes detail in descriptions
Issues:
❌ No guidance on what makes a good image prompt
❌ No style/mood specifications
❌ Doesn't specify where in article to place images
❌ No examples
❌ "Detailed enough" is subjective
Real-World Performance Issues:
- Prompts sometimes too generic ("Image of a person using a laptop")
- No context from article content (extracts irrelevant visuals)
- Featured image prompt sometimes identical to in-article prompt
- No guidance on image diversity (all similar)
5. Image Generation Template
Prompt Key: 'image_prompt_template'
Current Template
Approach: Template-based prompt assembly
Format:
Create a high-quality {image_type} image... "{post_title}"... {image_prompt}...
Focus on realistic, well-composed scene... lifestyle/editorial web content...
Avoid text, watermarks, logos... **not blurry.**
Issues:
❌ {image_type} not always populated
❌ "high-quality" and "not blurry" redundant/unclear
❌ No style guidance (photographic, illustration, 3D, etc.)
❌ No aspect ratio specification
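One way to guard against the unpopulated {image_type} issue is to fill the template through a helper that substitutes a fallback for missing or empty values. The sketch below is only an illustration: the template string is an abbreviated stand-in (the real template text is elided above), and `render_image_prompt`, `FALLBACKS`, and the "featured" default are hypothetical.

```python
# Abbreviated stand-in; the real template text is elided in this document.
TEMPLATE = 'Create a {image_type} image for "{post_title}". {image_prompt}'

# Hypothetical fallback used when a value is missing or empty.
FALLBACKS = {"image_type": "featured"}


def render_image_prompt(**values):
    """Fill the template, keeping the fallback when a value is missing or empty."""
    filled = dict(FALLBACKS)
    filled.update({key: value for key, value in values.items() if value})
    return TEMPLATE.format(**filled)
```

A variable with no fallback (for example a missing post_title) still raises KeyError, which surfaces the problem instead of sending a half-filled prompt to the image model.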
Required Improvements
A. Clustering Prompt Redesign
Goals
- Reduce prompt length by 30-40%
- Add 2-3 concrete examples
- Enforce consistent cluster count (5-15 clusters ideal)
- Standardize cluster naming (title case, descriptive)
- Limit description to 20-30 words
Proposed Structure
Section 1: Role & Task (50 tokens)
- Clear, concise role definition
- Task: group keywords into intent-driven clusters
Section 2: Output Format with Example (100 tokens)
- JSON structure
- Show 1 complete example cluster
- Specify exact field requirements
Section 3: Clustering Rules (150 tokens)
- List 5-7 key rules (bullet format)
- Keyword-first approach
- Intent dimensions (brief)
- Quality thresholds (3-10 keywords per cluster)
- No duplicates
Section 4: Quality Checklist (50 tokens)
- Checklist of 4-5 validation points
- Model self-validates before output
Total: ~350 tokens (vs current ~420, about a 17% reduction; meeting the 30-40% goal would require trimming to roughly 250-295 tokens)
Example Output Format to Include
{
  "clusters": [
    {
      "name": "Organic Bedding Benefits",
      "description": "Health, eco-friendly, and comfort aspects of organic cotton bedding materials",
      "keywords": ["organic sheets", "eco-friendly bedding", "chemical-free cotton", "hypoallergenic sheets", "sustainable bedding"]
    }
  ]
}
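The clustering goals (5-15 clusters, 3-10 keywords per cluster, 20-30 word descriptions, no duplicate keywords) lend themselves to an automated check on the model's JSON output. The following is a rough sketch under that reading of the goals; `validate_clusters` and its message strings are illustrative names, not existing code.

```python
import json


def validate_clusters(raw, min_clusters=5, max_clusters=15):
    """Return a list of problems found; an empty list means the output passed."""
    problems = []
    clusters = json.loads(raw).get("clusters", [])
    if not min_clusters <= len(clusters) <= max_clusters:
        problems.append(f"cluster count {len(clusters)} outside {min_clusters}-{max_clusters}")
    seen = set()  # keywords already assigned to an earlier cluster
    for cluster in clusters:
        name = cluster.get("name", "")
        keywords = cluster.get("keywords", [])
        if not 3 <= len(keywords) <= 10:
            problems.append(f"{name}: {len(keywords)} keywords (expected 3-10)")
        words = len(cluster.get("description", "").split())
        if not 20 <= words <= 30:
            problems.append(f"{name}: description has {words} words (expected 20-30)")
        for keyword in keywords:
            key = keyword.strip().lower()
            if key in seen:
                problems.append(f"{name}: duplicate keyword '{keyword}'")
            seen.add(key)
    return problems
```

Run against the example cluster above, this check reports a 10-word description, so the example itself would need a longer description (or the 20-30 word limit a lower bound) before it is used in the prompt.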
B. Idea Generation Prompt Redesign
Goals
- Simplify outline structure (less prescriptive)
- Add examples of cluster_hub vs supporting ideas
- Better covered_keywords extraction
- Adaptive word count estimation
- Content structure differentiation
Proposed Structure
Section 1: Role & Objective (40 tokens)
- SEO content strategist
- Task: generate content ideas from clusters
Section 2: Output Format with Examples (150 tokens)
- Show 1 cluster_hub example
- Show 1 supporting idea example
- Highlight key differences
Section 3: Idea Generation Rules (100 tokens)
- 1 cluster_hub (comprehensive, authoritative)
- 2-4 supporting ideas (specific angles)
- Word count: 1500-2200 for hubs, 1000-1500 for supporting
- covered_keywords: extract from cluster keywords
Section 4: Outline Guidance (100 tokens)
- Simplified: Intro + 5-8 sections + Conclusion
- Section types by content_structure:
- article: narrative + data
- guide: step-by-step + tips
- review: pros/cons + comparison
- listicle: numbered + categories
- comparison: side-by-side + verdict
Total: ~390 tokens (vs current ~610)
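The idea-generation rules (one hub plus 2-4 supporting ideas per cluster, covered_keywords drawn from the cluster's keywords) can also be checked mechanically. The sketch below assumes that a hub is marked by content_structure == "cluster_hub" and that covered_keywords is a list of strings; both are assumptions about the output format, and `check_ideas` is a hypothetical helper.

```python
from collections import defaultdict


def check_ideas(ideas, cluster_keywords):
    """Check hub/supporting counts and covered_keywords for each cluster."""
    problems = []
    by_cluster = defaultdict(list)
    for idea in ideas:
        by_cluster[idea["cluster_id"]].append(idea)
    for cluster_id, group in by_cluster.items():
        hubs = [i for i in group if i.get("content_structure") == "cluster_hub"]
        supporting = len(group) - len(hubs)
        if len(hubs) != 1:
            problems.append(f"cluster {cluster_id}: {len(hubs)} hubs (expected 1)")
        if not 2 <= supporting <= 4:
            problems.append(f"cluster {cluster_id}: {supporting} supporting ideas (expected 2-4)")
        allowed = {k.lower() for k in cluster_keywords.get(cluster_id, [])}
        for idea in group:
            covered = idea.get("covered_keywords") or []
            if not covered:
                problems.append(f"{idea['title']}: covered_keywords is empty")
            unknown = [k for k in covered if k.lower() not in allowed]
            if unknown:
                problems.append(f"{idea['title']}: keywords not in cluster: {unknown}")
    return problems
```

This targets two of the reported issues directly: empty or incorrect covered_keywords, and an unclear split between the hub and its supporting ideas.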
C. Content Generation Prompt Redesign
Most Critical Improvement: Word Count Adherence
Goals
- Primary: Hit the target word count within ±5%
- Scale structure proportionally to word count
- Differentiate content structures clearly
- Improve HTML quality and consistency
- Better keyword integration
Proposed Adaptive Word Count System
Word Count Targets:
- 500 words: Short-form (5 sections × 80 words + intro/outro 60 words)
- 1000 words: Standard (6 sections × 140 words + intro/outro 120 words)
- 1500 words: Long-form (7 sections × 180 words + intro/outro 180 words)
Prompt Variable Replacement:
Before sending to AI, calculate:
- {TARGET_WORD_COUNT} - from task.word_count
- {INTRO_WORDS} - 60 / 120 / 180 based on target
- {SECTION_COUNT} - 5 / 6 / 7 based on target
- {SECTION_WORDS} - 80 / 140 / 180 based on target
- {HOOK_WORDS} - 25 / 35 / 45 based on target
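As a quick arithmetic check on these targets, the sketch below sums each budget and compares it with the target. It assumes that INTRO_WORDS covers the whole introduction (hook included) and that the conclusion is a fixed 60 words, as the structure breakdown in the proposed prompt states; both are interpretations, and `BUDGETS` and `planned_total` are illustrative names.

```python
# (intro_words, section_count, section_words) per target, from the lists above.
BUDGETS = {500: (60, 5, 80), 1000: (120, 6, 140), 1500: (180, 7, 180)}
CONCLUSION_WORDS = 60  # assumed fixed for every target


def planned_total(target):
    """Sum the planned words for introduction, sections, and conclusion."""
    intro, sections, per_section = BUDGETS[target]
    return intro + sections * per_section + CONCLUSION_WORDS


for target in BUDGETS:
    total = planned_total(target)
    deviation = abs(total - target) / target
    print(target, total, f"{deviation:.1%}")
# prints:
# 500 520 4.0%
# 1000 1020 2.0%
# 1500 1500 0.0%
```

Under these assumptions every plan lands inside the ±5% tolerance, although the 500-word plan leaves only 5 words of slack below its upper bound of 525.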
Proposed Structure
Section 1: Role & Objective (30 tokens)
You are an editorial content writer. Generate a {TARGET_WORD_COUNT}-word article...
Section 2: Word Count Requirements (80 tokens)
CRITICAL: The content must be exactly {TARGET_WORD_COUNT} words (±5% tolerance).
Structure breakdown:
- Introduction: {INTRO_WORDS} words total
- Hook (italic): {HOOK_WORDS} words
- Paragraphs: 2 × ~{INTRO_WORDS/2} words each
- Main Sections: {SECTION_COUNT} H2 sections
- Each section: {SECTION_WORDS} words
- Conclusion: 60 words
Word count validation: Count words in final output and adjust if needed.
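Because the model's own word counting may not be reliable, a server-side check on the returned HTML is a useful complement to the instruction above. The sketch below uses only the standard library; `count_words` and `within_tolerance` are hypothetical helpers, and splitting on whitespace is just one reasonable definition of a word.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collect the text nodes of an HTML fragment, ignoring the tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def count_words(html):
    """Count whitespace-separated words in the visible text of the HTML."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return len(" ".join(parser.chunks).split())


def within_tolerance(html, target, tolerance=0.05):
    """True when the word count deviates from the target by at most the tolerance."""
    return abs(count_words(html) - target) / target <= tolerance
```

A result outside the tolerance could trigger a retry or be logged per prompt version. Note that joining text nodes with spaces counts a word split by inline markup (such as a partially bolded word) as two.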
Section 3: Content Flow & HTML (120 tokens)
- Detailed structure per section
- HTML tag usage
- Formatting rules
Section 4: Style & Quality (80 tokens)
- Tone guidance
- Keyword usage
- Avoid generic phrases
- Examples of good vs bad openings
Section 5: Content Structure Types (90 tokens)
- article: {structure description}
- guide: {structure description}
- review: {structure description}
- comparison: {structure description}
- listicle: {structure description}
- cluster_hub: {structure description}
Section 6: Output Format with Example (100 tokens)
- JSON structure
- Show abbreviated example with proper HTML
Total: ~500 tokens (vs current ~550, but much more precise)
D. Image Prompt Improvements
Goals
- Generate visually diverse prompts
- Better context from article content
- Specify image placement guidelines
- Improve prompt detail and clarity
Proposed Extraction Prompt Structure
Section 1: Task & Context (50 tokens)
Extract image prompts from this article for visual content placement.
Article: {title}
Content: {content}
Required: 1 featured + {max_images} in-article images
Section 2: Image Types & Guidelines (100 tokens)
Featured Image:
- Hero visual representing article's main theme
- Broad, engaging, high-quality
- Should work at large sizes (1200×630+)
In-Article Images (place strategically):
1. After introduction
2. Mid-article (before major H2 sections)
3. Supporting specific concepts or examples
4. Before conclusion
Each prompt must describe:
- Subject & composition
- Visual style (photographic, minimal, editorial)
- Mood & lighting
- Color palette suggestions
- Avoid: text, logos, faces (unless relevant)
Section 3: Prompt Quality Rules (80 tokens)
- Be specific and descriptive (not generic)
- Include scene details, angles, perspective
- Specify lighting, time of day if relevant
- Mention style references
- Ensure diversity across all images
- No duplicate concepts
Section 4: Output Format (50 tokens)
- JSON structure
- Show example with good vs bad prompts
Proposed Template Prompt Improvement
Replace current template with:
A {style} photograph for "{post_title}". {image_prompt}.
Composition: {composition_hint}. Lighting: {lighting_hint}. Mood: {mood}.
Style: clean, modern, editorial web content. No text, watermarks, or logos.
Where:
- {style} - photographic, minimalist, lifestyle, etc.
- {composition_hint} - center-framed, rule-of-thirds, wide-angle, etc.
- {lighting_hint} - natural daylight, soft indoor, dramatic, etc.
- {mood} - professional, warm, energetic, calm, etc.
Implementation Plan
Phase 1: Clustering Prompt (Week 1)
Tasks:
- ✅ Draft new clustering prompt with examples
- ✅ Test with sample keyword sets (20, 50, 100 keywords)
- ✅ Compare outputs: old vs new
- ✅ Validate cluster quality (manual review)
- ✅ Update PromptRegistry.DEFAULT_PROMPTS['clustering']
- ✅ Deploy and monitor
Success Criteria:
- Consistent cluster count (5-15)
- No single-keyword clusters
- Clear, descriptive names
- Concise descriptions (20-30 words)
- 95%+ of keywords clustered
Phase 2: Idea Generation Prompt (Week 1-2)
Tasks:
- ✅ Draft new ideas prompt with examples
- ✅ Test with 5-10 clusters
- ✅ Validate cluster_hub vs supporting idea distinction
- ✅ Check covered_keywords accuracy
- ✅ Verify content_structure alignment
- ✅ Update PromptRegistry.DEFAULT_PROMPTS['ideas']
- ✅ Deploy and monitor
Success Criteria:
- Clear distinction between hub and supporting ideas
- Accurate covered_keywords extraction
- Appropriate word count estimates
- Outlines match content_structure type
- No duplicate ideas within cluster
Phase 3: Content Generation Prompt (Week 2)
Tasks:
- ✅ Draft new content prompt with word count logic
- ✅ Implement dynamic variable replacement in build_prompt()
- ✅ Test with 500, 1000, 1500 word targets
- ✅ Validate actual word counts (automated counting)
- ✅ Test all content_structure types
- ✅ Verify HTML quality and consistency
- ✅ Update PromptRegistry.DEFAULT_PROMPTS['content_generation']
- ✅ Deploy and monitor
Code Change Required:
File: backend/igny8_core/ai/functions/generate_content.py
Method: build_prompt()
Add word count calculation:
def build_prompt(self, data: Any, account=None) -> str:
    task = data if not isinstance(data, list) else data[0]

    # Calculate adaptive word count parameters
    target_words = task.word_count or 1000

    if target_words <= 600:
        intro_words = 60
        section_count = 5
        section_words = 80
        hook_words = 25
    elif target_words <= 1200:
        intro_words = 120
        section_count = 6
        section_words = 140
        hook_words = 35
    else:
        intro_words = 180
        section_count = 7
        section_words = 180
        hook_words = 45

    # Get prompt and replace variables
    prompt = PromptRegistry.get_prompt(
        function_name='generate_content',
        account=account,
        task=task,
        context={
            'TARGET_WORD_COUNT': target_words,
            'INTRO_WORDS': intro_words,
            'SECTION_COUNT': section_count,
            'SECTION_WORDS': section_words,
            'HOOK_WORDS': hook_words,
            # ... existing context
        }
    )

    return prompt
Success Criteria:
- 95%+ of generated content within ±5% of target word count
- HTML structure consistent
- Content structure types clearly differentiated
- Keyword integration natural
- No sections starting with lists
Phase 4: Image Prompt Improvements (Week 2-3)
Tasks:
- ✅ Draft new extraction prompt with placement guidelines
- ✅ Draft new template prompt with style variables
- ✅ Test with 10 sample articles
- ✅ Validate image diversity and relevance
- ✅ Update both prompts in registry
- ✅ Update GenerateImagePromptsFunction to use new template
- ✅ Deploy and monitor
Success Criteria:
- No duplicate image concepts in same article
- Prompts are specific and detailed
- Featured image distinct from in-article images
- Image placement logically distributed
- Generated images relevant to content
Prompt Versioning & Testing
Version Control
Recommendation: Store prompt versions in database for A/B testing
Schema:
class AIPromptVersion(models.Model):
    prompt_type = CharField(choices=PROMPT_TYPE_CHOICES)
    version = IntegerField()
    prompt_value = TextField()
    is_active = BooleanField(default=False)
    created_at = DateTimeField(auto_now_add=True)
    performance_metrics = JSONField(default=dict)  # Track success rates
Process:
- Test new prompt version alongside current
- Compare outputs on same inputs
- Measure quality metrics (manual + automated)
- Gradually roll out if better
- Keep old version as fallback
Automated Quality Metrics
Implement automated checks:
| Metric | Check | Threshold |
| --- | --- | --- |
| Word Count Accuracy | abs(actual - target) / target | < 0.05 (±5%) |
| HTML Validity | Parse with BeautifulSoup | 100% valid |
| Keyword Presence | Count keyword mentions | ≥ 3 for primary |
| Structure Compliance | Check H2/H3 hierarchy | Valid structure |
| Cluster Count | Number of clusters | 5-15 |
| Cluster Size | Keywords per cluster | 3-10 |
| No Duplicates | Keyword appears once | 100% unique |
Log results:
- Track per prompt version
- Identify patterns in failures
- Use for prompt iteration
Model Selection & Optimization
Current Models
Location: backend/igny8_core/ai/settings.py
Default Models per Function:
- Clustering: GPT-4 (expensive but accurate)
- Ideas: GPT-4 (creative)
- Content: GPT-4 (quality)
- Image Prompts: GPT-3.5-turbo (simpler task)
- Images: DALL-E 3 / Runware
Optimization Opportunities
Cost vs Quality Tradeoffs:
| Function | Current | Alternative | Cost Savings | Quality Impact |
| --- | --- | --- | --- | --- |
| Clustering | GPT-4 | GPT-4-turbo | 50% | Minimal |
| Ideas | GPT-4 | GPT-4-turbo | 50% | Minimal |
| Content | GPT-4 | GPT-4-turbo | 50% | Test required |
| Image Prompts | GPT-3.5 | Keep | - | - |
Recommendation: Test GPT-4-turbo for all text generation tasks
- Faster response time
- 50% cost reduction
- Similar quality for structured outputs
Success Metrics
- ✅ Word count accuracy: 95%+ within ±5%
- ✅ Clustering quality: No single-keyword clusters
- ✅ Idea generation: Clear hub vs supporting distinction
- ✅ HTML validity: 100%
- ✅ Keyword integration: Natural, not stuffed
- ✅ Image prompt diversity: No duplicates
- ✅ User satisfaction: Fewer manual edits needed
- ✅ Processing time: <10s for 1000-word article
- ✅ Credit cost: 30% reduction with model optimization
Related Files Reference
Backend
- backend/igny8_core/ai/prompts.py - Prompt registry and defaults
- backend/igny8_core/ai/functions/auto_cluster.py - Clustering function
- backend/igny8_core/ai/functions/generate_ideas.py - Ideas function
- backend/igny8_core/ai/functions/generate_content.py - Content function
- backend/igny8_core/ai/functions/generate_image_prompts.py - Image prompts
- backend/igny8_core/ai/settings.py - Model configuration
- backend/igny8_core/modules/system/models.py - AIPrompt model
Testing
- Create test suite: backend/igny8_core/ai/tests/test_prompts.py
- Test fixtures with sample inputs
- Automated quality validation
- Performance benchmarks
Notes
- All prompt changes should be tested on real data first
- Keep old prompts in version history for rollback
- Monitor user feedback on content quality
- Consider user-customizable prompt templates (advanced feature)
- Document prompt engineering best practices for team
- SAG clustering prompt (mentioned in original doc) to be handled separately as specialized architecture