IGNY8 Phase 2: Taxonomy Term Content (02B)
Rich Content Generation for Taxonomy Terms
Document Version: 1.0
Date: 2026-03-23
Phase: IGNY8 Phase 2 — Feature Expansion
Status: Build Ready
Source of Truth: Codebase at /data/app/igny8/
Audience: Claude Code, Backend Developers, Architects
1. CURRENT STATE
Existing Taxonomy Infrastructure
The taxonomy system is partially built:
ContentTaxonomy (writer app, db_table=igny8_content_taxonomies):
- Stores taxonomy term references synced from WordPress
- Fields: name, slug, external_id (WP term ID), taxonomy_type (category/tag/product_cat/product_tag/attribute)
- No content generation; terms are metadata only (name + slug + external reference)
ContentTaxonomyRelation (writer app):
- Links Content to ContentTaxonomy (many-to-many through table)
- Allows assigning existing taxonomy terms to content pieces
Content Model (writer app, db_table=igny8_content):
- content_type='taxonomy' exists in CONTENT_TYPE_CHOICES but is unused by the generation pipeline
- CONTENT_STRUCTURE_CHOICES includes category_archive, tag_archive, attribute_archive
- taxonomy_terms ManyToManyField through ContentTaxonomyRelation
Tasks Model (writer app, db_table=igny8_tasks):
- taxonomy_term ForeignKey to ContentTaxonomy (nullable, db_column='taxonomy_id')
- Not used by the automation pipeline; present as a field only
SiteIntegration (integration app):
- WordPress connections exist via the SiteIntegration model
- SyncEvent logs operations, but taxonomy sync is stubbed/incomplete
What Doesn't Exist
- No content generation for taxonomy terms (categories, tags, attributes)
- No cluster mapping for taxonomy terms
- No WordPress → IGNY8 taxonomy sync (full fetch and reconcile)
- No IGNY8 → WordPress term content push
- No AI function for term content generation
- No admin interface for managing term-to-cluster mapping
2. WHAT TO BUILD
Overview
Make taxonomy terms first-class SEO content pages by:
- Syncing terms from WordPress — fetch all categories, tags, WooCommerce taxonomies
- Mapping terms to clusters — automatic keyword-overlap + semantic matching
- Generating rich content — AI-generated landing page content for each term
- Pushing content back — sync generated content to WordPress term descriptions + meta
Taxonomy Sync (WordPress → IGNY8)
Full fetch-and-reconcile sync from WordPress, leveraging the existing SiteIntegration (the return path, IGNY8 → WordPress, is covered under Term Content Sync):
Fetch targets:
- WordPress categories (taxonomy_type='category')
- WordPress tags (taxonomy_type='tag')
- WooCommerce product categories (taxonomy_type='product_cat')
- WooCommerce product tags (taxonomy_type='product_tag')
- WooCommerce product attributes (taxonomy_type='attribute', e.g., pa_color, pa_size)
Sync logic:
- Use existing SiteIntegration.credentials_json to authenticate against the WP REST API
- Fetch all terms via GET /wp-json/wp/v2/categories, /tags, /product_cat, etc.
- Reconcile: create new ContentTaxonomy records, update changed ones, flag deleted
- Store parent/child hierarchy for categories
- Log sync as SyncEvent with event_type='metadata_sync'
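The reconcile step above can be sketched as a pure function that compares the fetched WordPress terms against stored records. This is a minimal illustration under stated assumptions, not the project's implementation: the name `reconcile_terms`, the `ReconcilePlan` shape, and the dict layouts are hypothetical, and terms are matched on the WP term ID (`external_id`).

```python
from typing import Dict, List, TypedDict


class ReconcilePlan(TypedDict):
    create: List[dict]
    update: List[dict]
    flag_deleted: List[int]


def reconcile_terms(remote: List[dict], local: Dict[int, dict]) -> ReconcilePlan:
    """Compare WP terms against stored terms, keyed by external_id.

    remote: dicts with 'id', 'name', 'slug' and optionally 'parent'
            (shape assumed from a WP REST term response).
    local:  {external_id: {'name': ..., 'slug': ..., 'parent': ...}}
    """
    plan: ReconcilePlan = {"create": [], "update": [], "flag_deleted": []}
    seen = set()
    for term in remote:
        ext_id = term["id"]
        seen.add(ext_id)
        fields = {
            "name": term["name"],
            "slug": term["slug"],
            # WP reports parent=0 for top-level terms; store that as None.
            "parent": term.get("parent") or None,
        }
        if ext_id not in local:
            plan["create"].append({"external_id": ext_id, **fields})
        elif local[ext_id] != fields:
            plan["update"].append({"external_id": ext_id, **fields})
    # Anything stored locally but absent from WordPress is flagged, not removed.
    plan["flag_deleted"] = sorted(set(local) - seen)
    return plan
```

Keeping the comparison free of ORM calls makes the create/update/flag decision easy to unit test; the Celery task would then apply the plan and resolve `parent` IDs to `parent_term` references.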
Cluster Mapping Service
A shared service (cluster_mapping_service.py) that maps taxonomy terms to keyword clusters:
Algorithm:
| Factor | Weight | Method |
|---|---|---|
| Keyword overlap | 40% | Compare term name + slug against cluster keywords |
| Semantic similarity | 40% | Embedding-based cosine similarity (term name vs cluster description) |
| Title match | 20% | Exact/partial match of term name in cluster name |
Output per term:
- primary_cluster_id: best-match cluster
- secondary_cluster_ids: additional related clusters (up to 3)
- mapping_confidence: 0.0 to 1.0 score
- mapping_status:
  - auto_mapped (confidence ≥ 0.6): assigned automatically
  - suggested (0.3 ≤ confidence < 0.6): suggested for manual review
  - unmapped (confidence < 0.3): no good match found
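The weighting and thresholds above can be sketched as follows. This is an illustrative sketch only: the function names `combine_scores` and `classify` are hypothetical, and it assumes each factor score is already normalized to the 0.0–1.0 range.

```python
WEIGHTS = {"keyword_overlap": 0.4, "semantic_similarity": 0.4, "title_match": 0.2}
AUTO_MAP_THRESHOLD = 0.6
SUGGEST_THRESHOLD = 0.3


def combine_scores(keyword_overlap: float, semantic_similarity: float, title_match: float) -> float:
    """Weighted sum of the three factor scores, clamped to [0.0, 1.0]."""
    total = (
        WEIGHTS["keyword_overlap"] * keyword_overlap
        + WEIGHTS["semantic_similarity"] * semantic_similarity
        + WEIGHTS["title_match"] * title_match
    )
    # Round to absorb floating-point noise right at the threshold boundaries.
    return round(min(max(total, 0.0), 1.0), 4)


def classify(confidence: float) -> str:
    """Map a confidence score to a mapping_status value."""
    if confidence >= AUTO_MAP_THRESHOLD:
        return "auto_mapped"
    if confidence >= SUGGEST_THRESHOLD:
        return "suggested"
    return "unmapped"
```

For example, a term with keyword overlap 0.5, semantic similarity 0.8, and title match 1.0 scores 0.2 + 0.32 + 0.2 = 0.72, which lands in auto_mapped.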
Term Content Generation
Each taxonomy term gets rich, SEO-optimized content:
Generated sections:
- H1 Title — optimized for the term + primary cluster keywords
- Rich description — 500–1,500 words covering the topic
- FAQ section — 5–8 questions and answers
- Related terms — links to sibling/child terms
- Meta title — 50–60 characters
- Meta description — 150–160 characters
AI function: GenerateTermContentFunction(BaseAIFunction):
- Input: term name, taxonomy_type, assigned cluster keywords, existing content titles under term, parent/sibling terms for context
- Output: structured JSON with sections (intro, overview, FAQ, related)
- Uses ContentTypeTemplate from 02A where content_type='taxonomy'
Term Content Sync (IGNY8 → WordPress)
Push generated content to WordPress:
- Custom WP REST endpoint: POST /wp-json/igny8/v1/terms/{id}/content
- Stores in WordPress term meta:
  - _igny8_term_content: HTML content
  - _igny8_term_faq: JSON FAQ array
  - _igny8_term_meta_title: SEO title
  - _igny8_term_meta_description: SEO description
- Updates native WordPress term description with the generated content
- Schema: CollectionPage with itemListElement for listed content
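One possible shape for the schema mentioned above is sketched here. In schema.org, `itemListElement` belongs to `ItemList`, so this sketch nests an `ItemList` under the `CollectionPage` via `mainEntity`. The helper name `build_collection_page_schema` and the `items` input shape are assumptions, not something the spec defines.

```python
import json
from typing import Dict, List


def build_collection_page_schema(term_name: str, term_url: str, items: List[Dict[str, str]]) -> Dict:
    """Build CollectionPage JSON-LD whose ItemList enumerates the term's content.

    items: [{'name': ..., 'url': ...}] in display order (assumed shape).
    """
    return {
        "@context": "https://schema.org",
        "@type": "CollectionPage",
        "name": term_name,
        "url": term_url,
        "mainEntity": {
            "@type": "ItemList",
            "itemListElement": [
                {"@type": "ListItem", "position": i, "name": item["name"], "url": item["url"]}
                for i, item in enumerate(items, start=1)
            ],
        },
    }


if __name__ == "__main__":
    schema = build_collection_page_schema(
        "Running Shoes",
        "https://example.com/category/running-shoes/",
        [{"name": "Trail Runner X", "url": "https://example.com/trail-runner-x/"}],
    )
    print(json.dumps(schema, indent=2))
```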
3. DATA MODELS & APIs
Modified Models
ContentTaxonomy (db_table=igny8_content_taxonomies) — add fields:
# Cluster mapping
cluster = models.ForeignKey(
    'planner.Clusters', on_delete=models.SET_NULL,
    null=True, blank=True, related_name='taxonomy_terms',
    help_text="Primary cluster this term maps to"
)
secondary_cluster_ids = models.JSONField(
    default=list, blank=True,
    help_text="Additional related cluster IDs"
)
mapping_confidence = models.FloatField(
    default=0.0,
    help_text="Cluster mapping confidence score 0.0-1.0"
)
mapping_status = models.CharField(
    max_length=20, default='unmapped',
    choices=[
        ('auto_mapped', 'Auto Mapped'),
        ('manual_mapped', 'Manual Mapped'),
        ('suggested', 'Suggested'),
        ('unmapped', 'Unmapped'),
    ],
    db_index=True
)

# Generated content
term_content = models.TextField(
    blank=True, default='',
    help_text="Generated rich HTML content for the term page"
)
term_faq = models.JSONField(
    default=list, blank=True,
    help_text="Generated FAQ: [{question, answer}]"
)
meta_title = models.CharField(max_length=255, blank=True, default='')
meta_description = models.TextField(blank=True, default='')
content_status = models.CharField(
    max_length=20, default='none',
    choices=[
        ('none', 'No Content'),
        ('generating', 'Generating'),
        ('generated', 'Generated'),
        ('published', 'Published to WP'),
    ],
    db_index=True
)

# Hierarchy
parent_term = models.ForeignKey(
    'self', on_delete=models.SET_NULL,
    null=True, blank=True, related_name='child_terms'
)
term_count = models.IntegerField(
    default=0,
    help_text="Number of posts/products using this term"
)

# Sync tracking
last_synced_from_wp = models.DateTimeField(null=True, blank=True)
last_pushed_to_wp = models.DateTimeField(null=True, blank=True)
New AI Function
# igny8_core/ai/functions/generate_term_content.py
from typing import Any, Dict, List

# BaseAIFunction and ContentTaxonomy are imported from their existing project modules.


class GenerateTermContentFunction(BaseAIFunction):
    """Generate rich SEO content for taxonomy terms."""

    def get_name(self) -> str:
        return 'generate_term_content'

    def get_metadata(self) -> Dict:
        return {
            'display_name': 'Generate Term Content',
            'description': 'Generate rich landing page content for taxonomy terms',
            'phases': {
                'INIT': 'Initializing...',
                'PREP': 'Loading term and cluster data...',
                'AI_CALL': 'Generating term content...',
                'PARSE': 'Parsing response...',
                'SAVE': 'Saving term content...',
                'DONE': 'Complete!'
            }
        }

    def get_max_items(self) -> int:
        return 10  # Process up to 10 terms per batch

    def validate(self, payload: dict, account=None) -> Dict:
        term_ids = payload.get('ids', [])
        if not term_ids:
            return {'valid': False, 'error': 'No term IDs provided'}
        return {'valid': True}

    def prepare(self, payload: dict, account=None) -> List:
        term_ids = payload.get('ids', [])
        terms = ContentTaxonomy.objects.filter(
            id__in=term_ids,
            account=account
        ).select_related('cluster', 'parent_term')
        return list(terms)

    def build_prompt(self, data: Any, account=None) -> str:
        term = data  # Single term
        # Build context: cluster keywords, existing content, siblings
        cluster_keywords = []
        if term.cluster:
            cluster_keywords = list(
                term.cluster.keywords.values_list('keyword', flat=True)[:20]
            )
        sibling_terms = list(
            ContentTaxonomy.objects.filter(
                taxonomy_type=term.taxonomy_type,
                site=term.site,
                parent_term=term.parent_term
            ).exclude(id=term.id).values_list('name', flat=True)[:10]
        )
        # Use ContentTypeTemplate from 02A if available
        # Fall back to default term prompt
        return self._build_term_prompt(term, cluster_keywords, sibling_terms)

    def parse_response(self, response: str, step_tracker=None) -> Dict:
        # Parse structured JSON: {content_html, faq, meta_title, meta_description}
        pass

    def save_output(self, parsed, original_data, account=None, **kwargs) -> Dict:
        term = original_data
        term.term_content = parsed.get('content_html', '')
        term.term_faq = parsed.get('faq', [])
        term.meta_title = parsed.get('meta_title', '')
        term.meta_description = parsed.get('meta_description', '')
        term.content_status = 'generated'
        term.save()
        return {'count': 1, 'items_updated': [term.id]}
Register in igny8_core/ai/registry.py:
register_lazy_function('generate_term_content', lambda: GenerateTermContentFunction)
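`parse_response` is left as a stub in the skeleton. One way to fill it in is sketched below as a standalone helper: it tolerates a Markdown code fence around the model's reply and checks for the four keys that `save_output` reads. The helper name `parse_term_response` and the decision to raise `ValueError` on malformed output are assumptions, not existing project behavior.

```python
import json
import re
from typing import Dict

REQUIRED_KEYS = ("content_html", "faq", "meta_title", "meta_description")
FENCE = "`" * 3  # built indirectly so this example nests safely in Markdown


def parse_term_response(response: str) -> Dict:
    """Parse the model's JSON reply, tolerating a surrounding code fence."""
    text = response.strip()
    fenced = re.match(rf"^{FENCE}(?:json)?\s*(.*?)\s*{FENCE}$", text, flags=re.DOTALL)
    if fenced:
        text = fenced.group(1)
    data = json.loads(text)  # raises json.JSONDecodeError (a ValueError) on bad JSON
    if not isinstance(data, dict):
        raise ValueError("Expected a JSON object")
    missing = [key for key in REQUIRED_KEYS if key not in data]
    if missing:
        raise ValueError(f"Missing keys: {', '.join(missing)}")
    if not isinstance(data["faq"], list):
        raise ValueError("faq must be a list of {question, answer} objects")
    return data
```

Failing loudly here keeps a malformed reply from being saved as empty content with content_status='generated'.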
New Service
# igny8_core/business/content/cluster_mapping_service.py
from typing import Dict, List


class ClusterMappingService:
    """Maps taxonomy terms to keyword clusters using multi-factor scoring."""

    KEYWORD_OVERLAP_WEIGHT = 0.4
    SEMANTIC_SIMILARITY_WEIGHT = 0.4
    TITLE_MATCH_WEIGHT = 0.2
    AUTO_MAP_THRESHOLD = 0.6
    SUGGEST_THRESHOLD = 0.3

    def map_terms_to_clusters(self, site_id: int, account_id: int) -> Dict:
        """
        Map all unmapped ContentTaxonomy terms to Clusters for a site.
        Returns: {mapped: int, suggested: int, unmapped: int}
        """
        pass

    def map_single_term(self, term: ContentTaxonomy) -> Dict:
        """
        Map a single term. Returns:
        {cluster_id, secondary_ids, confidence, status}
        """
        pass

    def get_suggestions(self, term: ContentTaxonomy, top_n: int = 5) -> List[Dict]:
        """
        Return the top_n candidate clusters for a term, best match first.
        Called by the cluster-suggestions endpoint.
        """
        pass

    def _keyword_overlap_score(self, term_name: str, cluster_keywords: list) -> float:
        pass

    def _semantic_similarity_score(self, term_name: str, cluster_description: str) -> float:
        pass

    def _title_match_score(self, term_name: str, cluster_name: str) -> float:
        pass
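The three private scoring helpers are left unimplemented in the skeleton. The sketch below shows one plausible way to compute each factor as a standalone function. The token-based overlap, the 0.5 value for a partial title match, and the use of precomputed embedding vectors are all assumptions made for illustration; the spec only names the factors and their weights.

```python
import math
import re
from typing import Iterable, List, Set


def _tokens(text: str) -> Set[str]:
    """Lowercase alphanumeric tokens; hyphens and spaces both act as separators."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def keyword_overlap_score(term_name: str, term_slug: str, cluster_keywords: Iterable[str]) -> float:
    """Fraction of term tokens (name + slug) that appear in any cluster keyword."""
    term_tokens = _tokens(term_name) | _tokens(term_slug)
    if not term_tokens:
        return 0.0
    cluster_tokens: Set[str] = set()
    for keyword in cluster_keywords:
        cluster_tokens |= _tokens(keyword)
    return len(term_tokens & cluster_tokens) / len(term_tokens)


def title_match_score(term_name: str, cluster_name: str) -> float:
    """1.0 for an exact match, 0.5 if the term name occurs in the cluster name, else 0.0."""
    term, cluster = term_name.strip().lower(), cluster_name.strip().lower()
    if not term:
        return 0.0
    if term == cluster:
        return 1.0
    return 0.5 if term in cluster else 0.0


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine similarity of two equal-length embedding vectors, floored at 0.0."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return max(0.0, dot / norm) if norm else 0.0
```

Each function returns a value in the 0.0–1.0 range, so the results can be fed directly into the 40/40/20 weighting.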
New Celery Tasks
# igny8_core/tasks/taxonomy_tasks.py
from celery import shared_task


@shared_task(bind=True, max_retries=3, default_retry_delay=60)
def sync_taxonomy_from_wordpress(self, site_id: int, account_id: int):
    """Fetch all taxonomy terms from WordPress and reconcile with ContentTaxonomy."""
    pass


@shared_task(bind=True, max_retries=3, default_retry_delay=60)
def map_terms_to_clusters(self, site_id: int, account_id: int):
    """Run cluster mapping on all unmapped terms for a site."""
    pass


@shared_task(bind=True, max_retries=3, default_retry_delay=60)
def generate_term_content_task(self, term_ids: list, account_id: int):
    """Generate content for a batch of taxonomy terms."""
    pass


@shared_task(bind=True, max_retries=3, default_retry_delay=60)
def push_term_content_to_wordpress(self, term_id: int, account_id: int):
    """Push generated term content to WordPress via REST API."""
    pass
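Because `GenerateTermContentFunction.get_max_items()` caps a batch at 10 terms, a bulk request likely needs to be split before dispatch. A minimal sketch of that split follows; the helper name `chunk_term_ids` and the de-duplication step are assumptions rather than specified behavior.

```python
from typing import Iterator, List, Sequence

MAX_TERMS_PER_BATCH = 10  # mirrors get_max_items() in GenerateTermContentFunction


def chunk_term_ids(term_ids: Sequence[int], size: int = MAX_TERMS_PER_BATCH) -> Iterator[List[int]]:
    """Yield de-duplicated term IDs, in first-seen order, in batches of at most `size`."""
    if size < 1:
        raise ValueError("size must be >= 1")
    unique = list(dict.fromkeys(term_ids))  # dict preserves insertion order
    for start in range(0, len(unique), size):
        yield unique[start:start + size]
```

A bulk endpoint could then loop over the batches and call `generate_term_content_task.delay(batch, account_id)` for each one.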
Migration
igny8_core/migrations/XXXX_taxonomy_term_content.py
Fields added to ContentTaxonomy:
- cluster: ForeignKey to Clusters (nullable)
- secondary_cluster_ids: JSONField
- mapping_confidence: FloatField
- mapping_status: CharField
- term_content: TextField
- term_faq: JSONField
- meta_title: CharField
- meta_description: TextField
- content_status: CharField
- parent_term: ForeignKey to self (nullable)
- term_count: IntegerField
- last_synced_from_wp: DateTimeField (nullable)
- last_pushed_to_wp: DateTimeField (nullable)
API Endpoints
# Taxonomy Term Management
GET /api/v1/writer/taxonomy/terms/ # List terms with mapping status (filterable)
GET /api/v1/writer/taxonomy/terms/{id}/ # Term detail
GET /api/v1/writer/taxonomy/terms/unmapped/ # Terms needing cluster assignment
GET /api/v1/writer/taxonomy/terms/stats/ # Summary: mapped/unmapped/generated/published counts
# WordPress Sync
POST /api/v1/writer/taxonomy/terms/sync/ # Trigger WP → IGNY8 sync
GET /api/v1/writer/taxonomy/terms/sync/status/ # Last sync time + status
# Cluster Mapping
POST /api/v1/writer/taxonomy/terms/{id}/map-cluster/ # Manual cluster assignment
POST /api/v1/writer/taxonomy/terms/auto-map/ # Run auto-mapping for all unmapped terms
GET /api/v1/writer/taxonomy/terms/{id}/cluster-suggestions/ # Get AI cluster suggestions for a term
# Content Generation
POST /api/v1/writer/taxonomy/terms/create-tasks/ # Bulk create generation tasks for selected terms
POST /api/v1/writer/taxonomy/terms/{id}/generate/ # Generate content for single term
POST /api/v1/writer/taxonomy/terms/generate-bulk/ # Generate content for multiple terms
# Publishing to WordPress
POST /api/v1/writer/taxonomy/terms/{id}/publish/ # Push single term content to WP
POST /api/v1/writer/taxonomy/terms/publish-bulk/ # Push multiple terms to WP
ViewSet:
# igny8_core/modules/writer/views/taxonomy_term_views.py
from rest_framework.decorators import action
from rest_framework.response import Response


class TaxonomyTermViewSet(SiteSectorModelViewSet):
    serializer_class = TaxonomyTermSerializer
    queryset = ContentTaxonomy.objects.all()
    filterset_fields = ['taxonomy_type', 'mapping_status', 'content_status', 'site']

    @action(detail=False, methods=['get'])
    def unmapped(self, request):
        qs = self.get_queryset().filter(mapping_status='unmapped')
        return self.paginate_and_respond(qs)

    @action(detail=False, methods=['get'])
    def stats(self, request):
        site_id = request.query_params.get('site_id')
        qs = self.get_queryset()
        # Only narrow by site when one is given; filtering on site_id=None
        # would match only rows with a NULL site.
        if site_id:
            qs = qs.filter(site_id=site_id)
        return Response({
            'total': qs.count(),
            'mapped': qs.filter(mapping_status__in=['auto_mapped', 'manual_mapped']).count(),
            'suggested': qs.filter(mapping_status='suggested').count(),
            'unmapped': qs.filter(mapping_status='unmapped').count(),
            'content_generated': qs.filter(content_status='generated').count(),
            'content_published': qs.filter(content_status='published').count(),
        })

    @action(detail=False, methods=['post'])
    def sync(self, request):
        site_id = request.data.get('site_id')
        sync_taxonomy_from_wordpress.delay(site_id, request.account.id)
        return Response({'message': 'Taxonomy sync started'})

    @action(detail=True, methods=['post'], url_path='map-cluster')
    def map_cluster(self, request, pk=None):
        term = self.get_object()
        cluster_id = request.data.get('cluster_id')
        term.cluster_id = cluster_id
        term.mapping_status = 'manual_mapped'
        term.mapping_confidence = 1.0
        term.save()
        return Response(TaxonomyTermSerializer(term).data)

    @action(detail=False, methods=['post'], url_path='auto-map')
    def auto_map(self, request):
        site_id = request.data.get('site_id')
        map_terms_to_clusters.delay(site_id, request.account.id)
        return Response({'message': 'Auto-mapping started'})

    @action(detail=True, methods=['get'], url_path='cluster-suggestions')
    def cluster_suggestions(self, request, pk=None):
        term = self.get_object()
        service = ClusterMappingService()
        suggestions = service.get_suggestions(term, top_n=5)
        return Response({'suggestions': suggestions})

    @action(detail=True, methods=['post'])
    def generate(self, request, pk=None):
        term = self.get_object()
        generate_term_content_task.delay([term.id], request.account.id)
        return Response({'message': 'Content generation started'})

    @action(detail=True, methods=['post'])
    def publish(self, request, pk=None):
        term = self.get_object()
        push_term_content_to_wordpress.delay(term.id, request.account.id)
        return Response({'message': 'Publishing to WordPress started'})
URL Registration:
# igny8_core/modules/writer/urls.py — add to existing router
router.register('taxonomy/terms', TaxonomyTermViewSet, basename='taxonomy-term')
Credit Costs
| Operation | Credits | Via |
|---|---|---|
| Taxonomy sync (WordPress → IGNY8) | 1 per batch | CreditCostConfig: taxonomy_sync |
| Term content generation | 4–6 per term | CreditCostConfig: term_content_generation |
| Term content optimization | 3–5 per term | CreditCostConfig: term_content_optimization |
Add to CreditCostConfig:
CreditCostConfig.objects.get_or_create(
    operation_type='taxonomy_sync',
    defaults={'display_name': 'Taxonomy Sync', 'base_credits': 1}
)
CreditCostConfig.objects.get_or_create(
    operation_type='term_content_generation',
    defaults={'display_name': 'Term Content Generation', 'base_credits': 5}
)
Add to CreditUsageLog.OPERATION_TYPE_CHOICES:
('taxonomy_sync', 'Taxonomy Sync'),
('term_content_generation', 'Term Content Generation'),
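As a rough illustration of how the base credits above translate into a charge, the sketch below multiplies the seeded `base_credits` by the number of billable units (batches for sync, terms for generation). The function `estimate_credits` is hypothetical, and how the 4–6 range per term is actually resolved at runtime is not specified here.

```python
# Base credits as seeded in CreditCostConfig above.
BASE_CREDITS = {
    "taxonomy_sync": 1,            # per batch
    "term_content_generation": 5,  # per term
}


def estimate_credits(operation_type: str, units: int) -> int:
    """Estimate credits for `units` batches (sync) or terms (generation)."""
    if operation_type not in BASE_CREDITS:
        raise KeyError(f"Unknown operation_type: {operation_type}")
    if units < 0:
        raise ValueError("units must be non-negative")
    return BASE_CREDITS[operation_type] * units
```

Under these assumptions, one sync batch followed by content generation for 12 terms would come to 1 + 60 = 61 credits.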
4. IMPLEMENTATION STEPS
Step 1: Add Fields to ContentTaxonomy
File to modify:
backend/igny8_core/business/content/models.py (or wherever ContentTaxonomy is defined)
- Add all 13 new fields listed in the Migration section
Step 2: Create and Run Migration
cd /data/app/igny8/backend
python manage.py makemigrations --name taxonomy_term_content
python manage.py migrate
Step 3: Build ClusterMappingService
File to create:
backend/igny8_core/business/content/cluster_mapping_service.py
Step 4: Create GenerateTermContentFunction
File to create:
backend/igny8_core/ai/functions/generate_term_content.py
Register in:
backend/igny8_core/ai/registry.py
Step 5: Create Celery Tasks
File to create:
backend/igny8_core/tasks/taxonomy_tasks.py
Register in Celery beat schedule (optional — these are primarily on-demand):
- sync_taxonomy_from_wordpress: can be periodic (weekly) or on-demand
Step 6: Add Credit Cost Entries
Add taxonomy_sync and term_content_generation to:
- CreditCostConfig seed data
- CreditUsageLog.OPERATION_TYPE_CHOICES
Step 7: Build Serializers
File to create:
backend/igny8_core/modules/writer/serializers/taxonomy_term_serializer.py
Step 8: Build ViewSet and URLs
File to create:
backend/igny8_core/modules/writer/views/taxonomy_term_views.py
Modify:
backend/igny8_core/modules/writer/urls.py
Step 9: Frontend
Files to create/modify in frontend/src/:
- pages/Writer/TaxonomyTerms.tsx: term list with mapping status indicators
- pages/Writer/TaxonomyTermDetail.tsx: term detail with generated content preview
- components/Writer/ClusterMappingPanel.tsx: cluster assignment/suggestion UI
- stores/taxonomyTermStore.ts: Zustand store
- api/taxonomyTerms.ts: API client
Step 10: Tests
cd /data/app/igny8/backend
python manage.py test igny8_core.business.content.tests.test_cluster_mapping
python manage.py test igny8_core.ai.tests.test_generate_term_content
python manage.py test igny8_core.modules.writer.tests.test_taxonomy_term_views
5. ACCEPTANCE CRITERIA
- All 13 new fields on ContentTaxonomy migrate successfully
- GenerateTermContentFunction registered in AI function registry
- WordPress → IGNY8 taxonomy sync fetches categories, tags, WooCommerce taxonomies
- Sync creates/updates ContentTaxonomy records with correct taxonomy_type
- Parent/child hierarchy preserved via parent_term FK
- SyncEvent logged with event_type='metadata_sync' after each sync operation
- ClusterMappingService maps terms with confidence scores
- Terms with confidence ≥ 0.6 auto-mapped, 0.3–0.6 suggested, < 0.3 unmapped
- Manual cluster assignment sets mapping_status='manual_mapped' with confidence=1.0
- Term content generation produces: content_html, FAQ, meta_title, meta_description
- content_status transitions: none → generating → generated → published
- Publishing pushes content to WordPress via POST /wp-json/igny8/v1/terms/{id}/content
- All API endpoints require authentication and enforce account isolation
- Frontend term list shows mapping status badges (mapped/suggested/unmapped)
- Frontend supports manual cluster assignment from suggestion list
- Credit deduction works for taxonomy_sync and term_content_generation operations
- Backward compatible — existing ContentTaxonomy records unaffected (new fields nullable/defaulted)
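The content_status lifecycle in the criteria above (none → generating → generated → published) can be guarded with a small transition table. This is a sketch under assumptions: only the forward path comes from the spec, while the rollback and regeneration transitions marked below are hypothetical additions.

```python
ALLOWED_TRANSITIONS = {
    "none": {"generating"},
    # Rollback to 'none' on a failed run is an assumption, not in the spec.
    "generating": {"generated", "none"},
    # Regenerating already generated or published content is an assumption.
    "generated": {"published", "generating"},
    "published": {"generating"},
}


def can_transition(current: str, new: str) -> bool:
    """Return True if content_status may move from `current` to `new`."""
    return new in ALLOWED_TRANSITIONS.get(current, set())


def advance(current: str, new: str) -> str:
    """Return the new status, or raise if the transition is not allowed."""
    if not can_transition(current, new):
        raise ValueError(f"Invalid content_status transition: {current} -> {new}")
    return new
```

A check like this in the generate and publish tasks would, for example, reject publishing a term that has no generated content yet.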
6. CLAUDE CODE INSTRUCTIONS
Execution Order
- Read backend/igny8_core/business/content/models.py: find ContentTaxonomy and ContentTaxonomyRelation
- Read backend/igny8_core/business/planning/models.py: understand Clusters model for FK reference
- Read backend/igny8_core/ai/functions/generate_content.py: reference pattern for new AI function
- Read backend/igny8_core/ai/registry.py: understand registration pattern
- Add fields to ContentTaxonomy model
- Create migration and run it
- Build ClusterMappingService
- Build GenerateTermContentFunction + register it
- Build Celery tasks
- Build serializers, ViewSet, URLs
- Build frontend components
Key Constraints
- ALL primary keys are BigAutoField (integer). No UUIDs.
- Model class names PLURAL: Clusters, Keywords, Tasks, ContentIdeas, Images. Content stays singular. ContentTaxonomy stays singular.
- Frontend: .tsx files, Zustand stores, Vitest testing
- Celery app name: igny8_core
- All new db_tables use igny8_ prefix
- Follow existing ViewSet pattern: SiteSectorModelViewSet for site-scoped resources
- AI functions follow BaseAIFunction pattern with lazy registry
File Tree (New/Modified)
backend/igny8_core/
├── business/content/
│ ├── models.py # MODIFY: add fields to ContentTaxonomy
│ └── cluster_mapping_service.py # NEW: ClusterMappingService
├── ai/functions/
│ └── generate_term_content.py # NEW: GenerateTermContentFunction
├── ai/
│ └── registry.py # MODIFY: register generate_term_content
├── tasks/
│ └── taxonomy_tasks.py # NEW: sync, map, generate, publish tasks
├── modules/writer/
│ ├── serializers/
│ │ └── taxonomy_term_serializer.py # NEW
│ ├── views/
│ │ └── taxonomy_term_views.py # NEW
│ └── urls.py # MODIFY: register taxonomy/terms route
├── migrations/
│ └── XXXX_taxonomy_term_content.py # NEW: auto-generated
frontend/src/
├── pages/Writer/
│ ├── TaxonomyTerms.tsx # NEW: term list page
│ └── TaxonomyTermDetail.tsx # NEW: term detail + content preview
├── components/Writer/
│ └── ClusterMappingPanel.tsx # NEW: cluster assignment UI
├── stores/
│ └── taxonomyTermStore.ts # NEW: Zustand store
├── api/
│ └── taxonomyTerms.ts # NEW: API client
Cross-References
- 02A (content types extension): ContentTypeTemplate for content_type='taxonomy' provides prompt template
- 01A (SAG data foundation): SAGAttribute → taxonomy mapping context
- 01D (setup wizard): wizard creates initial taxonomy plan used for cluster mapping
- 03B (WP plugin connected): connected plugin receives term content via REST endpoint
- 03C (companion theme): theme renders term landing pages using pushed content