NAVIGATION_REFACTOR COMPLETED
docs/plans/implemented/PROMPT_ALIGNMENT_SUGGESTIONS.md
@@ -0,0 +1,425 @@
# AI Prompt Alignment Suggestions

**Date:** January 15, 2026

## 🚨 CRITICAL FINDING: Data Loss Between Idea Generation & Content Generation

**The Problem:** The idea generation AI creates detailed outlines with 6-10 H2 sections, but this outline structure is **never stored in the database**. Only basic fields (title, description text, keywords, estimated word count) are saved, and the content generator is passed only the title, description text, content type, and content structure. When content generation runs, it has NO ACCESS to:
- The planned section count (6? 8? 10?)
- The section outline structure (h2_topic, coverage details)
- The primary focus keywords
- The covered keywords
- The target word count

**Result:** The content generator uses a fixed template (6 sections, 1000-1200 words) that conflicts with the variable planning done by the ideas generator (6-10 sections, 1200-1800 words).

**Solution:** Either add a JSONField to store the complete idea structure, OR update the content prompt to work with limited information and pass available keyword/word count data.

---

## Executive Summary

After analyzing the current **Ideas Generation** and **Content Generation** prompts from the database, I've identified key areas where these prompts need better alignment to ensure consistency in content output.

---

## Current State Analysis

### Ideas Generation Prompt
- Generates 3-7 content ideas per cluster
- Defines 6-10 H2 sections per idea
- Targets 1-2 primary focus keywords + 2-3 covered keywords (3-5 total)
- AI-determined word count based on sections/keywords
- Emphasizes completely different keywords per idea
- Outputs strategic outline only (no detailed H3/formatting)

### Content Generation Prompt
- Targets 1000-1200 words
- Requires exactly 6 H2 sections
- Has rigid section format requirements (2 paragraphs, 2 lists, 1 table)
- Detailed HTML structure specifications
- Strict word count per paragraph (50-80 words)
- Includes specific formatting rules for lists and tables

---

## Key Inconsistencies Identified

### 1. **Section Count Mismatch**
- **Ideas Prompt:** 6-10 H2 sections (variable, AI-determined)
- **Content Prompt:** Exactly 6 H2 sections (fixed)
- **Issue:** Content generator cannot accommodate ideas with 7-10 sections

### 2. **Word Count Flexibility**
- **Ideas Prompt:** AI-determined based on topic complexity (typically 1200-1800 words)
- **Content Prompt:** Fixed 1000-1200 words
- **Issue:** Complex topics with 8-10 sections cannot fit in 1000-1200 words

### 3. **Format Variety vs. Fixed Pattern**
- **Ideas Prompt:** No formatting specifications (lets content generator decide)
- **Content Prompt:** Rigid format (2 paragraphs, 2 lists, 1 table distributed)
- **Issue:** Some topics need more lists/tables, others need more narrative

### 4. **Keyword Coverage Alignment**
- **Ideas Prompt:** 3-5 keywords total (1-2 primary + 2-3 covered)
- **Content Prompt:** Primary keyword + secondary keywords (no clear limit)
- **Alignment:** The two are compatible, but the content prompt needs a clearer instruction on which keywords to use and how many

---

## Suggested Changes to Content Generation Prompt

### Change 1: Dynamic Section Count
**Current:**
```
### 1. WORD COUNT: 1000-1200 words target
- Write 6 H2 sections
```

**Suggested:**
```
### 1. WORD COUNT AND SECTIONS

**Use the section count from the provided outline:**
- The outline specifies the number of H2 sections to write
- Typically 6-10 H2 sections based on topic complexity
- Write ALL sections from the outline

**Word count calculation:**
- Base: 150-180 words per H2 section
- Introduction: 100-150 words
- Total = (Number of H2 sections × 170) + 125
- Example: 6 sections = ~1,145 words | 8 sections = ~1,485 words | 10 sections = ~1,825 words
```
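
The word count rule in the suggested prompt is plain arithmetic, so it can also be checked outside the prompt. The following is a minimal sketch and not part of the codebase: `target_word_count` and both constants are hypothetical names that mirror the numbers above (170 words per H2 section, 125 words for the introduction).

```python
# Hypothetical helper mirroring the formula in the suggested prompt:
# Total = (number of H2 sections x 170) + 125
WORDS_PER_SECTION = 170  # falls inside the 150-180 words-per-section range
INTRO_WORDS = 125        # midpoint of the 100-150 word introduction


def target_word_count(section_count: int) -> int:
    """Return the approximate article length for a given number of H2 sections."""
    if section_count < 1:
        raise ValueError("section_count must be at least 1")
    return section_count * WORDS_PER_SECTION + INTRO_WORDS


if __name__ == "__main__":
    # Print the targets for the three section counts used in the example line.
    for n in (6, 8, 10):
        print(n, target_word_count(n))
```

For 6, 8, and 10 sections this reproduces the 1,145 / 1,485 / 1,825 figures quoted in the prompt text.
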

### Change 2: Flexible Format Distribution
**Current:**
```
### 2. SECTION FORMAT VARIETY
**For 6 H2 sections, distribute as:**
- 2 sections: Paragraphs ONLY
- 2 section: Paragraphs + Lists
- 1 section: Paragraphs + Tables
```

**Suggested:**
```
### 2. SECTION FORMAT VARIETY

**Format distribution (scales with section count):**

**For 6-7 sections:**
- 3-4 sections: Paragraphs ONLY
- 2 sections: Paragraphs + Lists
- 1 section: Paragraphs + Tables

**For 8-9 sections:**
- 4-5 sections: Paragraphs ONLY
- 2-3 sections: Paragraphs + Lists
- 1-2 sections: Paragraphs + Tables

**For 10+ sections:**
- 5-6 sections: Paragraphs ONLY
- 3 sections: Paragraphs + Lists
- 2 sections: Paragraphs + Tables

**Rules (apply to all counts):**
- Randomize which sections get which format
- Never use same pattern for consecutive sections
- Lists: 4-5 items, 15-20 words each
- Tables: 4-5 columns, 5-6 rows with real data
- Use block quotes randomly in non-table sections
```
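
To make the scaling concrete, here is a rough sketch of how the bands above could be turned into exact counts. It is only an illustration: `format_counts` is a hypothetical name, and where the prompt gives a range (for example 2-3 list sections for 8-9 sections) the sketch picks one value as an assumption. It is limited to the 6-10 sections the ideas prompt produces.

```python
def format_counts(section_count: int) -> dict[str, int]:
    """Split H2 sections into paragraph-only, list, and table sections.

    Assumption: within each band the list/table counts are fixed to one value
    from the allowed range, and the remainder becomes paragraph-only sections.
    """
    if not 6 <= section_count <= 10:
        raise ValueError("the ideas prompt plans 6-10 H2 sections")
    if section_count <= 7:
        lists, tables = 2, 1
    elif section_count <= 9:
        lists, tables = 3, 1  # one choice within the 2-3 / 1-2 ranges
    else:
        lists, tables = 3, 2
    return {
        "paragraphs_only": section_count - lists - tables,
        "lists": lists,
        "tables": tables,
    }


if __name__ == "__main__":
    for n in range(6, 11):
        print(n, format_counts(n))
```

Every result stays inside the ranges listed in the suggested prompt, and the three counts always add up to the section count.
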

### Change 3: Input Structure Alignment - CRITICAL FINDING

**What's Currently Output in [IGNY8_IDEA]:**

Based on code analysis (`backend/igny8_core/ai/functions/generate_content.py`), here's what's actually being passed:

```python
# From generate_content.py build_prompt():
idea_data = f"Title: {task.title or 'Untitled'}\n"
if task.description:
    idea_data += f"Description: {task.description}\n"
idea_data += f"Content Type: {task.content_type or 'post'}\n"
idea_data += f"Content Structure: {task.content_structure or 'article'}\n"
```

**Current Output Format (Plain Text):**
```
Title: How to Build an Email List from Scratch
Description: This guide covers the fundamentals of list building...
Content Type: post
Content Structure: guide
```

**What's Available But NOT Being Passed:**

The ContentIdeas model has these fields:
- ✅ `primary_focus_keywords` (CharField - "email list building")
- ✅ `target_keywords` (CharField - "subscriber acquisition, lead magnets")
- ✅ `estimated_word_count` (IntegerField - 1500)
- ✅ `content_type` (CharField - "post")
- ✅ `content_structure` (CharField - "guide")

But the outline structure (intro_focus, main_sections array) is **NOT stored anywhere**:
- ❌ No outline JSON stored in ContentIdeas model
- ❌ No outline JSON stored in Tasks model
- ❌ The AI generates the outline but it's only in the API response, never persisted

**The Root Problem:**

1. **Ideas Generator outputs** full JSON with outline:
   ```json
   {
     "title": "...",
     "description": {
       "overview": "...",
       "outline": {
         "intro_focus": "...",
         "main_sections": [
           {"h2_topic": "...", "coverage": "..."},
           {"h2_topic": "...", "coverage": "..."},
           ...6-10 sections...
         ]
       }
     },
     "primary_focus_keywords": "...",
     "covered_keywords": "..."
   }
   ```

2. **Only these get saved** to ContentIdeas:
   - `idea_title` = title
   - `description` = description.overview (NOT the outline!)
   - `primary_focus_keywords` = primary_focus_keywords
   - `target_keywords` = covered_keywords
   - `estimated_word_count` = estimated_word_count

3. **Content Generator receives** (from Tasks):
   - Title and description text, plus content type and content structure
   - No section outline
   - No keyword info
   - No word count target

**Why This Causes Misalignment:**
- Content generator has NO IDEA how many sections were planned (6? 8? 10?)
- Content generator doesn't know which keywords to target
- Content generator doesn't know the word count goal
- Content generator can't follow the planned outline structure

---

**Recommended Solution Path:**

**OPTION A: Store Full Idea JSON** (Best for Long-term)

1. Add JSONField to ContentIdeas model:
   ```python
   class ContentIdeas(models.Model):
       # ... existing fields ...
       idea_json = models.JSONField(
           default=dict,
           blank=True,
           help_text="Complete idea structure from AI generation (outline, keywords, sections)"
       )
   ```

2. Update generate_ideas.py to save full JSON:
   ```python
   # In save_output method:
   content_idea = ContentIdeas.objects.create(
       # ... existing fields ...
       idea_json=idea_data,  # Store the complete JSON structure
   )
   ```

3. Update generate_content.py to use full structure:
   ```python
   # In build_prompt method (needs `import json` at the top of the module):
   if task.idea and task.idea.idea_json:
       # Pass full JSON structure
       idea_data = json.dumps(task.idea.idea_json, indent=2)
   else:
       # Fallback to current simple format
       idea_data = f"Title: {task.title}\nDescription: {task.description}\n"
   ```

4. Update Content Generation prompt INPUT section:
   ```
   ## INPUT

   **CONTENT IDEA:**
   [IGNY8_IDEA]

   Expected JSON structure:
   {
     "title": "Article title",
     "description": {
       "overview": "2-3 sentence description",
       "outline": {
         "intro_focus": "What the introduction should establish",
         "main_sections": [
           {"h2_topic": "Section heading", "coverage": "What to cover"},
           ... array of 6-10 sections ...
         ]
       }
     },
     "primary_focus_keywords": "1-2 main keywords",
     "covered_keywords": "2-3 supporting keywords",
     "estimated_word_count": 1500,
     "content_type": "post",
     "content_structure": "guide_tutorial"
   }

   **KEYWORD CLUSTER:**
   [IGNY8_CLUSTER]

   **KEYWORDS:**
   [IGNY8_KEYWORDS]

   **INSTRUCTIONS:**
   - Use the exact number of H2 sections from main_sections array
   - Each H2 section should follow the h2_topic and coverage from the outline
   - Target the word count from estimated_word_count (±100 words)
   - Focus on primary_focus_keywords and covered_keywords for SEO
   ```
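
The instructions above depend on a handful of values buried in the idea JSON. As a minimal sketch of where those values sit, assuming the JSON shape shown in this document, a helper like the hypothetical `outline_summary` below could pull them out. It does not exist in the codebase and takes a plain `dict`, so it runs without Django.

```python
from typing import Any


def outline_summary(idea: dict[str, Any]) -> dict[str, Any]:
    """Extract the planning values the content prompt relies on.

    Older records may hold a plain-text description instead of the nested
    object, so a non-dict description is treated as "no outline".
    """
    description = idea.get("description")
    outline = description.get("outline", {}) if isinstance(description, dict) else {}
    sections = outline.get("main_sections", [])
    return {
        "section_count": len(sections),
        "h2_topics": [section.get("h2_topic", "") for section in sections],
        "estimated_word_count": idea.get("estimated_word_count"),
        "primary_focus_keywords": idea.get("primary_focus_keywords", ""),
        "covered_keywords": idea.get("covered_keywords", ""),
    }


if __name__ == "__main__":
    sample = {
        "title": "How to Build an Email List from Scratch",
        "description": {
            "overview": "Fundamentals of list building.",
            "outline": {
                "intro_focus": "Why lists matter",
                "main_sections": [
                    {"h2_topic": "Choosing a platform", "coverage": "Options"},
                    {"h2_topic": "Creating a lead magnet", "coverage": "Formats"},
                ],
            },
        },
        "estimated_word_count": 1500,
    }
    print(outline_summary(sample))
```

A `section_count` of 0 is the signal that only the fallback plain-text format is available for that task.
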

**OPTION B: Quick Fix - Pass Available Fields** (Can implement immediately without DB changes)

Update generate_content.py:
```python
# In build_prompt method:
idea_data = f"Title: {task.title or 'Untitled'}\n"
if task.description:
    idea_data += f"Description: {task.description}\n"
idea_data += f"Content Type: {task.content_type or 'post'}\n"
idea_data += f"Content Structure: {task.content_structure or 'article'}\n"

# ADD: Pull from related idea if available
if task.idea:
    if task.idea.primary_focus_keywords:
        idea_data += f"Primary Focus Keywords: {task.idea.primary_focus_keywords}\n"
    if task.idea.target_keywords:
        idea_data += f"Covered Keywords: {task.idea.target_keywords}\n"
    if task.idea.estimated_word_count:
        idea_data += f"Target Word Count: {task.idea.estimated_word_count}\n"
```

Then update Content Generation prompt:
```
## INPUT

**CONTENT IDEA:**
[IGNY8_IDEA]

Format:
- Title: Article title
- Description: Content overview
- Content Type: post|page|product
- Content Structure: article|guide|comparison|review|listicle
- Primary Focus Keywords: 1-2 main keywords (if available)
- Covered Keywords: 2-3 supporting keywords (if available)
- Target Word Count: Estimated words (if available)

**NOTE:** Generate 6-8 H2 sections based on content_structure type. Scale word count to match Target Word Count if provided (±100 words acceptable).
```

### Change 4: Keyword Usage Clarity
**Current:**
```
## KEYWORD USAGE

**Primary keyword** (identify from title):
- Use in title, intro, meta title/description
- Include in 2-3 H2 headings naturally
- Mention 2-3 times in content (0.5-1% density)

**Secondary keywords** (3-4 from keyword list):
- Distribute across H2 sections
- Use in H2/H3 headings where natural
- 2-3 mentions each (0.3-0.6% density)
- Include variations and related terms
```

**Suggested:**
```
## KEYWORD USAGE

**Primary focus keywords** (1-2 from IGNY8_IDEA.primary_focus_keywords):
- Already in the provided title (use it as-is)
- Include in 2-3 H2 headings naturally (outline already targets this)
- Mention 2-3 times in content (0.5-1% density)

**Covered keywords** (2-3 from IGNY8_IDEA.covered_keywords):
- Distribute across H2 sections
- Use in H2/H3 headings where natural (outline may already include them)
- 2-3 mentions each (0.3-0.6% density)
- Include variations and related terms

**Total keyword target:** 3-5 keywords (1-2 primary + 2-3 covered)
```
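
The density percentages in the prompt can be measured on generated text. The sketch below is one possible way to do that and is not taken from the codebase: `keyword_density` is a hypothetical helper, and it assumes density means the words belonging to keyword occurrences divided by the total word count. The prompts themselves do not define the term.

```python
import re

_WORD = re.compile(r"[a-z0-9']+")


def keyword_density(text: str, keyword: str) -> float:
    """Return the percentage of words in `text` that belong to `keyword` matches.

    Assumption: a multi-word keyword counts once per exact phrase match and
    contributes all of its words to the numerator.
    """
    words = _WORD.findall(text.lower())
    phrase = _WORD.findall(keyword.lower())
    if not words or not phrase:
        return 0.0
    size = len(phrase)
    hits = sum(
        1 for i in range(len(words) - size + 1) if words[i:i + size] == phrase
    )
    return 100.0 * hits * size / len(words)


if __name__ == "__main__":
    article = "Email list tips for your email list"
    print(round(keyword_density(article, "email list"), 2))
```

Run against plain text (HTML tags stripped first), this gives a number to compare with the 0.5-1% and 0.3-0.6% targets.
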

### Change 5: Verification Checklist Update
**Current:**
```
## VERIFICATION BEFORE OUTPUT

- [ ] 1000-1200 words ONLY (excluding HTML tags) - STOP if exceeding
- [ ] 6 H2 sections
- [ ] Maximum 2 sections with lists
- [ ] Maximum 2 sections with tables
```

**Suggested:**
```
## VERIFICATION BEFORE OUTPUT

- [ ] Word count matches outline's estimated_word_count (±100 words acceptable)
- [ ] Number of H2 sections matches outline's main_sections count
- [ ] Format distribution scales appropriately with section count
- [ ] All sections from outline are covered
- [ ] Primary focus keywords (1-2) used correctly
- [ ] Covered keywords (2-3) distributed naturally
- [ ] All paragraphs 50-80 words
- [ ] All lists 4-5 items, 15-20 words each
- [ ] All tables 4-5 columns, 5-6 rows, real data
- [ ] No placeholder content anywhere
- [ ] Meta title <60 chars, description <160 chars
- [ ] Valid JSON with escaped quotes
```
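
The first two checklist items can also be verified in code after generation instead of relying on the model to self-check. The following is a minimal sketch under stated assumptions: `check_against_outline` is a hypothetical function, the generated body is assumed to be an HTML string, and tags are stripped with a crude regex that is good enough for counting words but not for real HTML parsing.

```python
import re


def check_against_outline(
    html: str,
    planned_sections: int,
    planned_words: int,
    tolerance: int = 100,
) -> list[str]:
    """Return human-readable problems; an empty list means both checks pass."""
    problems: list[str] = []

    # Count opening <h2> tags, with or without attributes.
    h2_count = len(re.findall(r"<h2\b", html, flags=re.IGNORECASE))
    if h2_count != planned_sections:
        problems.append(f"expected {planned_sections} H2 sections, found {h2_count}")

    # Replace every tag with a space, then count whitespace-separated words.
    text = re.sub(r"<[^>]+>", " ", html)
    word_count = len(text.split())
    if abs(word_count - planned_words) > tolerance:
        problems.append(f"expected about {planned_words} words, found {word_count}")

    return problems


if __name__ == "__main__":
    body = "<h2>First</h2><p>one two three</p><h2>Second</h2><p>four five</p>"
    print(check_against_outline(body, planned_sections=2, planned_words=7, tolerance=0))
```

The default tolerance of 100 matches the ±100 words the checklist allows.
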

---

## Summary of Benefits

### With These Changes:
1. ✅ **Flexibility:** Content generator can handle 6-10 sections from ideas
2. ✅ **Consistency:** Section count matches between idea and content generation
3. ✅ **Scalability:** Word count scales naturally with complexity
4. ✅ **Quality:** Format variety adapts to content needs
5. ✅ **Alignment:** Clear keyword strategy (1-2 primary + 2-3 covered = 3-5 total)
6. ✅ **Maintainability:** One source of truth for section structure (the outline)

### Key Principle:
- **The Ideas Generator is the strategic planner** (decides sections, word count, keywords)
- **The Content Generator is the tactical executor** (follows the plan, adds formatting/depth)

---

## Implementation Notes

- These changes maintain all quality requirements (word count per paragraph, list/table specs, etc.)
- The rigid structure is replaced with scalable rules that maintain quality at any section count
- The content generator becomes more flexible while maintaining consistency
- Both prompts now work together as a cohesive system

---

## Next Steps

1. Update the `content_generation` prompt in the database with suggested changes
2. Test with various section counts (6, 8, 10 sections) to verify scalability
3. Monitor output quality to ensure formatting rules scale properly
4. Consider creating a validation layer that checks idea/content alignment before generation