# IGNY8 Phase 2: Taxonomy Term Content (02B)
## Rich Content Generation for Taxonomy Terms

**Document Version:** 1.0
**Date:** 2026-03-23
**Phase:** IGNY8 Phase 2 (Feature Expansion)
**Status:** Build Ready
**Source of Truth:** Codebase at `/data/app/igny8/`
**Audience:** Claude Code, Backend Developers, Architects

---

## 1. CURRENT STATE

### Existing Taxonomy Infrastructure

The taxonomy system is partially built:

**ContentTaxonomy** (writer app, db_table=`igny8_content_taxonomies`):
- Stores taxonomy term references synced from WordPress
- Fields: `name`, `slug`, `external_id` (WP term ID), `taxonomy_type` (category/tag/product_cat/product_tag/attribute)
- No content generation: terms are metadata only (name + slug + external reference)

**ContentTaxonomyRelation** (writer app):
- Links `Content` to `ContentTaxonomy` (many-to-many through table)
- Allows assigning existing taxonomy terms to content pieces

**Content Model** (writer app, db_table=`igny8_content`):
- `content_type='taxonomy'` exists in CONTENT_TYPE_CHOICES but is unused by the generation pipeline
- CONTENT_STRUCTURE_CHOICES includes `category_archive`, `tag_archive`, `attribute_archive`
- `taxonomy_terms` ManyToManyField through ContentTaxonomyRelation

**Tasks Model** (writer app, db_table=`igny8_tasks`):
- `taxonomy_term` ForeignKey to ContentTaxonomy (nullable, db_column='taxonomy_id')
- Not used by the automation pipeline; present as a field only

**SiteIntegration** (integration app):
- WordPress connections exist via `SiteIntegration` model
- `SyncEvent` logs operations but taxonomy sync is stubbed/incomplete

### What Doesn't Exist

- No content generation for taxonomy terms (categories, tags, attributes)
- No cluster mapping for taxonomy terms
- No WordPress → IGNY8 taxonomy sync (full fetch and reconcile)
- No IGNY8 → WordPress term content push
- No AI function for term content generation
- No admin interface for managing term-to-cluster mapping

---

## 2. WHAT TO BUILD

### Overview

Make taxonomy terms first-class SEO content pages by:

1. **Syncing terms from WordPress**: fetch all categories, tags, WooCommerce taxonomies
2. **Mapping terms to clusters**: automatic keyword-overlap + semantic matching
3. **Generating rich content**: AI-generated landing page content for each term
4. **Pushing content back**: sync generated content to WordPress term descriptions + meta

### Taxonomy Sync (WordPress → IGNY8)

Full fetch-and-reconcile sync leveraging the existing `SiteIntegration` (the reverse direction, IGNY8 → WordPress, is covered under Term Content Sync):

**Fetch targets:**
- WordPress categories (`taxonomy_type='category'`)
- WordPress tags (`taxonomy_type='tag'`)
- WooCommerce product categories (`taxonomy_type='product_cat'`)
- WooCommerce product tags (`taxonomy_type='product_tag'`)
- WooCommerce product attributes (`taxonomy_type='attribute'`, e.g., `pa_color`, `pa_size`)

**Sync logic:**
1. Use existing `SiteIntegration.credentials_json` to authenticate against the WP REST API
2. Fetch all terms via `GET /wp-json/wp/v2/categories`, `/tags`, `/product_cat`, etc.
3. Reconcile: create new `ContentTaxonomy` records, update changed ones, flag deleted ones
4. Store parent/child hierarchy for categories
5. Log sync as `SyncEvent` with `event_type='metadata_sync'`

### Cluster Mapping Service

A shared service (`cluster_mapping_service.py`) that maps taxonomy terms to keyword clusters:

**Algorithm:**

| Factor | Weight | Method |
|--------|--------|--------|
| Keyword overlap | 40% | Compare term name + slug against cluster keywords |
| Semantic similarity | 40% | Embedding-based cosine similarity (term name vs cluster description) |
| Title match | 20% | Exact/partial match of term name in cluster name |

**Output per term:**
- `primary_cluster_id`: best-match cluster
- `secondary_cluster_ids`: additional related clusters (up to 3)
- `mapping_confidence`: 0.0 to 1.0 score
- `mapping_status`:
  - `auto_mapped` (confidence ≥ 0.6): assigned automatically
  - `suggested` (0.3 ≤ confidence < 0.6): suggested for manual review
  - `unmapped` (confidence < 0.3): no good match found

### Term Content Generation

Each taxonomy term gets rich, SEO-optimized content:

**Generated sections:**
1. **H1 Title**: optimized for the term + primary cluster keywords
2. **Rich description**: 500–1,500 words covering the topic
3. **FAQ section**: 5–8 questions and answers
4. **Related terms**: links to sibling/child terms
5. **Meta title**: 50–60 characters
6.
   **Meta description**: 150–160 characters

**AI function:** `GenerateTermContentFunction(BaseAIFunction)`:
- Input: term name, taxonomy_type, assigned cluster keywords, existing content titles under term, parent/sibling terms for context
- Output: structured JSON with sections (intro, overview, FAQ, related)
- Uses `ContentTypeTemplate` from 02A where `content_type='taxonomy'`

### Term Content Sync (IGNY8 → WordPress)

Push generated content to WordPress:
- Custom WP REST endpoint: `POST /wp-json/igny8/v1/terms/{id}/content`
- Stores in WordPress term meta:
  - `_igny8_term_content`: HTML content
  - `_igny8_term_faq`: JSON FAQ array
  - `_igny8_term_meta_title`: SEO title
  - `_igny8_term_meta_description`: SEO description
- Updates native WordPress term description with the generated content
- Schema: CollectionPage with itemListElement for listed content

---

## 3. DATA MODELS & APIs

### Modified Models

**ContentTaxonomy** (db_table=`igny8_content_taxonomies`), add fields:

```python
# Cluster mapping
cluster = models.ForeignKey(
    'planner.Clusters',
    on_delete=models.SET_NULL,
    null=True,
    blank=True,
    related_name='taxonomy_terms',
    help_text="Primary cluster this term maps to"
)
secondary_cluster_ids = models.JSONField(
    default=list,
    blank=True,
    help_text="Additional related cluster IDs"
)
mapping_confidence = models.FloatField(
    default=0.0,
    help_text="Cluster mapping confidence score 0.0-1.0"
)
mapping_status = models.CharField(
    max_length=20,
    default='unmapped',
    choices=[
        ('auto_mapped', 'Auto Mapped'),
        ('manual_mapped', 'Manual Mapped'),
        ('suggested', 'Suggested'),
        ('unmapped', 'Unmapped'),
    ],
    db_index=True
)

# Generated content
term_content = models.TextField(
    blank=True,
    default='',
    help_text="Generated rich HTML content for the term page"
)
term_faq = models.JSONField(
    default=list,
    blank=True,
    help_text="Generated FAQ: [{question, answer}]"
)
meta_title = models.CharField(max_length=255, blank=True, default='')
meta_description = models.TextField(blank=True, default='')
content_status = models.CharField(
    max_length=20,
    default='none',
    choices=[
        ('none', 'No Content'),
        ('generating', 'Generating'),
        ('generated', 'Generated'),
        ('published', 'Published to WP'),
    ],
    db_index=True
)

# Hierarchy
parent_term = models.ForeignKey(
    'self',
    on_delete=models.SET_NULL,
    null=True,
    blank=True,
    related_name='child_terms'
)
term_count = models.IntegerField(
    default=0,
    help_text="Number of posts/products using this term"
)

# Sync tracking
last_synced_from_wp = models.DateTimeField(null=True, blank=True)
last_pushed_to_wp = models.DateTimeField(null=True, blank=True)
```

### New AI Function

```python
# igny8_core/ai/functions/generate_term_content.py
import json
from typing import Any, Dict, List

# BaseAIFunction and ContentTaxonomy are imported from their existing modules
# (follow the import pattern in ai/functions/generate_content.py).


class GenerateTermContentFunction(BaseAIFunction):
    """Generate rich SEO content for taxonomy terms."""

    def get_name(self) -> str:
        return 'generate_term_content'

    def get_metadata(self) -> Dict:
        return {
            'display_name': 'Generate Term Content',
            'description': 'Generate rich landing page content for taxonomy terms',
            'phases': {
                'INIT': 'Initializing...',
                'PREP': 'Loading term and cluster data...',
                'AI_CALL': 'Generating term content...',
                'PARSE': 'Parsing response...',
                'SAVE': 'Saving term content...',
                'DONE': 'Complete!'
            }
        }

    def get_max_items(self) -> int:
        return 10  # Process up to 10 terms per batch

    def validate(self, payload: dict, account=None) -> Dict:
        term_ids = payload.get('ids', [])
        if not term_ids:
            return {'valid': False, 'error': 'No term IDs provided'}
        return {'valid': True}

    def prepare(self, payload: dict, account=None) -> List:
        term_ids = payload.get('ids', [])
        # Scope the lookup to the account to enforce tenant isolation
        terms = ContentTaxonomy.objects.filter(
            id__in=term_ids,
            account=account
        ).select_related('cluster', 'parent_term')
        return list(terms)

    def build_prompt(self, data: Any, account=None) -> str:
        term = data  # Single term
        # Build context: cluster keywords, existing content, siblings
        cluster_keywords = []
        if term.cluster:
            cluster_keywords = list(
                term.cluster.keywords.values_list('keyword', flat=True)[:20]
            )
        # Siblings share taxonomy type, site, and parent with the term
        sibling_terms = list(
            ContentTaxonomy.objects.filter(
                taxonomy_type=term.taxonomy_type,
                site=term.site,
                parent_term=term.parent_term
            ).exclude(id=term.id).values_list('name', flat=True)[:10]
        )
        # Use ContentTypeTemplate from 02A if available
        # Fall back to default term prompt
        return self._build_term_prompt(term, cluster_keywords, sibling_terms)

    def parse_response(self, response: str, step_tracker=None) -> Dict:
        # Parse structured JSON: {content_html, faq, meta_title, meta_description}
        # Minimal parse; reuse the response-cleaning helper of existing AI functions if one exists
        return json.loads(response)

    def save_output(self, parsed, original_data, account=None, **kwargs) -> Dict:
        term = original_data
        term.term_content = parsed.get('content_html', '')
        term.term_faq = parsed.get('faq', [])
        term.meta_title = parsed.get('meta_title', '')
        term.meta_description = parsed.get('meta_description', '')
        term.content_status = 'generated'
        term.save()
        return {'count': 1, 'items_updated': [term.id]}
```

Register in `igny8_core/ai/registry.py`:

```python
register_lazy_function('generate_term_content', lambda: GenerateTermContentFunction)
```

### New Service

```python
# igny8_core/business/content/cluster_mapping_service.py
from typing import Dict, List


class ClusterMappingService:
    """Maps taxonomy terms to keyword clusters using multi-factor scoring."""

    KEYWORD_OVERLAP_WEIGHT = 0.4
    SEMANTIC_SIMILARITY_WEIGHT = 0.4
    TITLE_MATCH_WEIGHT = 0.2
    AUTO_MAP_THRESHOLD = 0.6
    SUGGEST_THRESHOLD = 0.3

    def map_terms_to_clusters(self, site_id: int, account_id: int) -> Dict:
        """
        Map all unmapped ContentTaxonomy terms to Clusters for a site.
        Returns: {mapped: int, suggested: int, unmapped: int}
        """
        pass

    def map_single_term(self, term: ContentTaxonomy) -> Dict:
        """
        Map a single term.
        Returns: {cluster_id, secondary_ids, confidence, status}
        """
        pass

    def get_suggestions(self, term: ContentTaxonomy, top_n: int = 5) -> List[Dict]:
        """
        Return the top_n candidate clusters for a term, best match first.
        Called by the cluster-suggestions endpoint on TaxonomyTermViewSet.
        """
        pass

    def _keyword_overlap_score(self, term_name: str, cluster_keywords: list) -> float:
        pass

    def _semantic_similarity_score(self, term_name: str, cluster_description: str) -> float:
        pass

    def _title_match_score(self, term_name: str, cluster_name: str) -> float:
        pass
```

### New Celery Tasks

```python
# igny8_core/tasks/taxonomy_tasks.py
from celery import shared_task


@shared_task(bind=True, max_retries=3, default_retry_delay=60)
def sync_taxonomy_from_wordpress(self, site_id: int, account_id: int):
    """Fetch all taxonomy terms from WordPress and reconcile with ContentTaxonomy."""
    pass


@shared_task(bind=True, max_retries=3, default_retry_delay=60)
def map_terms_to_clusters(self, site_id: int, account_id: int):
    """Run cluster mapping on all unmapped terms for a site."""
    pass


@shared_task(bind=True, max_retries=3, default_retry_delay=60)
def generate_term_content_task(self, term_ids: list, account_id: int):
    """Generate content for a batch of taxonomy terms."""
    pass


@shared_task(bind=True, max_retries=3, default_retry_delay=60)
def push_term_content_to_wordpress(self, term_id: int, account_id: int):
    """Push generated term content to WordPress via REST API."""
    pass
```

### Migration

```
igny8_core/migrations/XXXX_taxonomy_term_content.py
```

Fields added to `ContentTaxonomy`:
1. `cluster`: ForeignKey to Clusters (nullable)
2. `secondary_cluster_ids`: JSONField
3. `mapping_confidence`: FloatField
4. `mapping_status`: CharField
5. `term_content`: TextField
6. `term_faq`: JSONField
7. `meta_title`: CharField
8. `meta_description`: TextField
9.
   `content_status`: CharField
10. `parent_term`: ForeignKey to self (nullable)
11. `term_count`: IntegerField
12. `last_synced_from_wp`: DateTimeField (nullable)
13. `last_pushed_to_wp`: DateTimeField (nullable)

### API Endpoints

```
# Taxonomy Term Management
GET    /api/v1/writer/taxonomy/terms/                           # List terms with mapping status (filterable)
GET    /api/v1/writer/taxonomy/terms/{id}/                      # Term detail
GET    /api/v1/writer/taxonomy/terms/unmapped/                  # Terms needing cluster assignment
GET    /api/v1/writer/taxonomy/terms/stats/                     # Summary: mapped/unmapped/generated/published counts

# WordPress Sync
POST   /api/v1/writer/taxonomy/terms/sync/                      # Trigger WP → IGNY8 sync
GET    /api/v1/writer/taxonomy/terms/sync/status/               # Last sync time + status

# Cluster Mapping
POST   /api/v1/writer/taxonomy/terms/{id}/map-cluster/          # Manual cluster assignment
POST   /api/v1/writer/taxonomy/terms/auto-map/                  # Run auto-mapping for all unmapped terms
GET    /api/v1/writer/taxonomy/terms/{id}/cluster-suggestions/  # Get AI cluster suggestions for a term

# Content Generation
POST   /api/v1/writer/taxonomy/terms/create-tasks/              # Bulk create generation tasks for selected terms
POST   /api/v1/writer/taxonomy/terms/{id}/generate/             # Generate content for single term
POST   /api/v1/writer/taxonomy/terms/generate-bulk/             # Generate content for multiple terms

# Publishing to WordPress
POST   /api/v1/writer/taxonomy/terms/{id}/publish/              # Push single term content to WP
POST   /api/v1/writer/taxonomy/terms/publish-bulk/              # Push multiple terms to WP
```

**ViewSet:**

```python
# igny8_core/modules/writer/views/taxonomy_term_views.py
from rest_framework.decorators import action
from rest_framework.response import Response

# SiteSectorModelViewSet, ContentTaxonomy, TaxonomyTermSerializer, ClusterMappingService
# and the Celery tasks are imported from their modules in the codebase.


class TaxonomyTermViewSet(SiteSectorModelViewSet):
    serializer_class = TaxonomyTermSerializer
    queryset = ContentTaxonomy.objects.all()
    filterset_fields = ['taxonomy_type', 'mapping_status', 'content_status', 'site']

    @action(detail=False, methods=['get'])
    def unmapped(self, request):
        qs = self.get_queryset().filter(mapping_status='unmapped')
        return self.paginate_and_respond(qs)

    @action(detail=False, methods=['get'])
    def stats(self, request):
        qs = self.get_queryset()
        # Only narrow by site when site_id is supplied; filtering on None would match nothing
        site_id = request.query_params.get('site_id')
        if site_id:
            qs = qs.filter(site_id=site_id)
        return Response({
            'total': qs.count(),
            'mapped': qs.filter(mapping_status__in=['auto_mapped', 'manual_mapped']).count(),
            'suggested': qs.filter(mapping_status='suggested').count(),
            'unmapped': qs.filter(mapping_status='unmapped').count(),
            'content_generated': qs.filter(content_status='generated').count(),
            'content_published': qs.filter(content_status='published').count(),
        })

    @action(detail=False, methods=['post'])
    def sync(self, request):
        site_id = request.data.get('site_id')
        sync_taxonomy_from_wordpress.delay(site_id, request.account.id)
        return Response({'message': 'Taxonomy sync started'})

    @action(detail=True, methods=['post'], url_path='map-cluster')
    def map_cluster(self, request, pk=None):
        term = self.get_object()
        cluster_id = request.data.get('cluster_id')
        # Manual assignment always carries full confidence
        term.cluster_id = cluster_id
        term.mapping_status = 'manual_mapped'
        term.mapping_confidence = 1.0
        term.save()
        return Response(TaxonomyTermSerializer(term).data)

    @action(detail=False, methods=['post'], url_path='auto-map')
    def auto_map(self, request):
        site_id = request.data.get('site_id')
        map_terms_to_clusters.delay(site_id, request.account.id)
        return Response({'message': 'Auto-mapping started'})

    @action(detail=True, methods=['get'], url_path='cluster-suggestions')
    def cluster_suggestions(self, request, pk=None):
        term = self.get_object()
        service = ClusterMappingService()
        suggestions = service.get_suggestions(term, top_n=5)
        return Response({'suggestions': suggestions})

    @action(detail=True, methods=['post'])
    def generate(self, request, pk=None):
        term = self.get_object()
        generate_term_content_task.delay([term.id], request.account.id)
        return Response({'message': 'Content generation started'})

    @action(detail=True, methods=['post'])
    def publish(self, request, pk=None):
        term = self.get_object()
        push_term_content_to_wordpress.delay(term.id, request.account.id)
        return Response({'message': 'Publishing to WordPress started'})
```

**URL Registration:**

```python
# igny8_core/modules/writer/urls.py: add to existing router
router.register('taxonomy/terms', TaxonomyTermViewSet, basename='taxonomy-term')
```

### Credit Costs

| Operation | Credits | Via |
|-----------|---------|-----|
| Taxonomy sync (WordPress → IGNY8) | 1 per batch | CreditCostConfig: `taxonomy_sync` |
| Term content generation | 4–6 per term | CreditCostConfig: `term_content_generation` |
| Term content optimization | 3–5 per term | CreditCostConfig: `term_content_optimization` |

Add to `CreditCostConfig`:

```python
CreditCostConfig.objects.get_or_create(
    operation_type='taxonomy_sync',
    defaults={'display_name': 'Taxonomy Sync', 'base_credits': 1}
)
CreditCostConfig.objects.get_or_create(
    operation_type='term_content_generation',
    defaults={'display_name': 'Term Content Generation', 'base_credits': 5}
)
```

Add to `CreditUsageLog.OPERATION_TYPE_CHOICES`:

```python
('taxonomy_sync', 'Taxonomy Sync'),
('term_content_generation', 'Term Content Generation'),
```

---

## 4. IMPLEMENTATION STEPS

### Step 1: Add Fields to ContentTaxonomy

File to modify:
- `backend/igny8_core/business/content/models.py` (or wherever ContentTaxonomy is defined)
- Add all 13 new fields listed in the Migration section

### Step 2: Create and Run Migration

```bash
cd /data/app/igny8/backend
python manage.py makemigrations --name taxonomy_term_content
python manage.py migrate
```

### Step 3: Build ClusterMappingService

File to create:
- `backend/igny8_core/business/content/cluster_mapping_service.py`

### Step 4: Create GenerateTermContentFunction

File to create:
- `backend/igny8_core/ai/functions/generate_term_content.py`

Register in:
- `backend/igny8_core/ai/registry.py`

### Step 5: Create Celery Tasks

File to create:
- `backend/igny8_core/tasks/taxonomy_tasks.py`

Register in Celery beat schedule (optional; these are primarily on-demand):
- `sync_taxonomy_from_wordpress`: can be periodic (weekly) or on-demand

### Step 6: Add Credit Cost Entries

Add `taxonomy_sync` and `term_content_generation` to:
- `CreditCostConfig` seed data
- `CreditUsageLog.OPERATION_TYPE_CHOICES`

### Step 7: Build Serializers

File to create:
- `backend/igny8_core/modules/writer/serializers/taxonomy_term_serializer.py`

### Step 8: Build ViewSet and URLs

File to create:
- `backend/igny8_core/modules/writer/views/taxonomy_term_views.py`

Modify:
- `backend/igny8_core/modules/writer/urls.py`

### Step 9: Frontend

Files to create/modify in `frontend/src/`:
- `pages/Writer/TaxonomyTerms.tsx`: term list with mapping status indicators
- `pages/Writer/TaxonomyTermDetail.tsx`: term detail with generated content preview
- `components/Writer/ClusterMappingPanel.tsx`: cluster assignment/suggestion UI
- `stores/taxonomyTermStore.ts`: Zustand store
- `api/taxonomyTerms.ts`: API client

### Step 10: Tests

```bash
cd /data/app/igny8/backend
python manage.py test igny8_core.business.content.tests.test_cluster_mapping
python manage.py test igny8_core.ai.tests.test_generate_term_content
python manage.py test igny8_core.modules.writer.tests.test_taxonomy_term_views
```

---

## 5. ACCEPTANCE CRITERIA

- [ ] All 13 new fields on ContentTaxonomy migrate successfully
- [ ] `GenerateTermContentFunction` registered in AI function registry
- [ ] WordPress → IGNY8 taxonomy sync fetches categories, tags, WooCommerce taxonomies
- [ ] Sync creates/updates ContentTaxonomy records with correct taxonomy_type
- [ ] Parent/child hierarchy preserved via parent_term FK
- [ ] SyncEvent logged with event_type='metadata_sync' after each sync operation
- [ ] ClusterMappingService maps terms with confidence scores
- [ ] Terms with confidence ≥ 0.6 auto-mapped, 0.3 ≤ confidence < 0.6 suggested, < 0.3 unmapped
- [ ] Manual cluster assignment sets mapping_status='manual_mapped' with confidence=1.0
- [ ] Term content generation produces: content_html, FAQ, meta_title, meta_description
- [ ] content_status transitions: none → generating → generated → published
- [ ] Publishing pushes content to WordPress via `POST /wp-json/igny8/v1/terms/{id}/content`
- [ ] All API endpoints require authentication and enforce account isolation
- [ ] Frontend term list shows mapping status badges (mapped/suggested/unmapped)
- [ ] Frontend supports manual cluster assignment from suggestion list
- [ ] Credit deduction works for taxonomy_sync and term_content_generation operations
- [ ] Backward compatible: existing ContentTaxonomy records unaffected (new fields nullable/defaulted)

---

## 6. CLAUDE CODE INSTRUCTIONS

### Execution Order

1. Read `backend/igny8_core/business/content/models.py`: find ContentTaxonomy and ContentTaxonomyRelation
2. Read `backend/igny8_core/business/planning/models.py`: understand Clusters model for FK reference
3. Read `backend/igny8_core/ai/functions/generate_content.py`: reference pattern for new AI function
4. Read `backend/igny8_core/ai/registry.py`: understand registration pattern
5. Add fields to ContentTaxonomy model
6. Create migration and run it
7. Build ClusterMappingService
8.
   Build GenerateTermContentFunction + register it
9. Build Celery tasks
10. Build serializers, ViewSet, URLs
11. Build frontend components

### Key Constraints

- ALL primary keys are `BigAutoField` (integer). No UUIDs.
- Model class names PLURAL: `Clusters`, `Keywords`, `Tasks`, `ContentIdeas`, `Images`. `Content` stays singular. `ContentTaxonomy` stays singular.
- Frontend: `.tsx` files, Zustand stores, Vitest testing
- Celery app name: `igny8_core`
- All new db_tables use `igny8_` prefix
- Follow existing ViewSet pattern: `SiteSectorModelViewSet` for site-scoped resources
- AI functions follow `BaseAIFunction` pattern with lazy registry

### File Tree (New/Modified)

```
backend/igny8_core/
├── business/content/
│   ├── models.py                        # MODIFY: add fields to ContentTaxonomy
│   └── cluster_mapping_service.py       # NEW: ClusterMappingService
├── ai/functions/
│   └── generate_term_content.py         # NEW: GenerateTermContentFunction
├── ai/
│   └── registry.py                      # MODIFY: register generate_term_content
├── tasks/
│   └── taxonomy_tasks.py                # NEW: sync, map, generate, publish tasks
├── modules/writer/
│   ├── serializers/
│   │   └── taxonomy_term_serializer.py  # NEW
│   ├── views/
│   │   └── taxonomy_term_views.py       # NEW
│   └── urls.py                          # MODIFY: register taxonomy/terms route
├── migrations/
│   └── XXXX_taxonomy_term_content.py    # NEW: auto-generated

frontend/src/
├── pages/Writer/
│   ├── TaxonomyTerms.tsx                # NEW: term list page
│   └── TaxonomyTermDetail.tsx           # NEW: term detail + content preview
├── components/Writer/
│   └── ClusterMappingPanel.tsx          # NEW: cluster assignment UI
├── stores/
│   └── taxonomyTermStore.ts             # NEW: Zustand store
├── api/
│   └── taxonomyTerms.ts                 # NEW: API client
```

### Cross-References

- **02A** (content types extension): ContentTypeTemplate for content_type='taxonomy' provides prompt template
- **01A** (SAG data foundation): SAGAttribute → taxonomy mapping context
- **01D** (setup wizard): wizard creates initial taxonomy plan used for cluster mapping
- **03B** (WP plugin connected): connected plugin receives term content via REST endpoint
- **03C** (companion theme): theme renders term landing pages using pushed content
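
### Reference Sketch: Cluster Mapping Score

The weighted scoring from the Cluster Mapping Service section (40% keyword overlap, 40% semantic similarity, 20% title match, with thresholds at 0.6 and 0.3) could be combined roughly as shown here. This is a minimal sketch, not the service implementation: the function names are hypothetical, the token-overlap metric and the 0.5 score for a partial title match are assumptions, and the embedding vectors are assumed to come from whatever embedding provider the codebase already uses.

```python
import math
import re
from typing import Dict, Sequence, Set

# Weights and thresholds as stated in the Cluster Mapping Service section.
KEYWORD_OVERLAP_WEIGHT = 0.4
SEMANTIC_SIMILARITY_WEIGHT = 0.4
TITLE_MATCH_WEIGHT = 0.2
AUTO_MAP_THRESHOLD = 0.6
SUGGEST_THRESHOLD = 0.3


def tokenize(text: str) -> Set[str]:
    """Lowercase and split on non-alphanumerics (this also splits slugs on '-')."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def keyword_overlap_score(name: str, slug: str, cluster_keywords: Sequence[str]) -> float:
    """Assumed metric: share of term tokens that occur in any cluster keyword."""
    term_tokens = tokenize(name) | tokenize(slug)
    if not term_tokens:
        return 0.0
    keyword_tokens: Set[str] = set()
    for keyword in cluster_keywords:
        keyword_tokens |= tokenize(keyword)
    return len(term_tokens & keyword_tokens) / len(term_tokens)


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    if len(a) != len(b):
        raise ValueError("embedding dimensions differ")
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def title_match_score(term_name: str, cluster_name: str) -> float:
    """Assumed grading: 1.0 for an exact match, 0.5 for a substring match, else 0.0."""
    term, cluster = term_name.strip().lower(), cluster_name.strip().lower()
    if not term or not cluster:
        return 0.0
    if term == cluster:
        return 1.0
    return 0.5 if term in cluster or cluster in term else 0.0


def mapping_status(confidence: float) -> str:
    """Translate a confidence score into the mapping_status choices."""
    if confidence >= AUTO_MAP_THRESHOLD:
        return "auto_mapped"
    if confidence >= SUGGEST_THRESHOLD:
        return "suggested"
    return "unmapped"


def score_term(
    name: str,
    slug: str,
    term_embedding: Sequence[float],
    cluster_name: str,
    cluster_keywords: Sequence[str],
    cluster_embedding: Sequence[float],
) -> Dict:
    """Weighted sum of the three factors, clamped to the 0.0-1.0 range."""
    # Negative cosine values are treated as "no similarity".
    semantic = max(0.0, cosine_similarity(term_embedding, cluster_embedding))
    confidence = (
        KEYWORD_OVERLAP_WEIGHT * keyword_overlap_score(name, slug, cluster_keywords)
        + SEMANTIC_SIMILARITY_WEIGHT * semantic
        + TITLE_MATCH_WEIGHT * title_match_score(name, cluster_name)
    )
    confidence = round(min(1.0, confidence), 4)
    return {"confidence": confidence, "status": mapping_status(confidence)}
```

In the real service, `score_term` would be evaluated against every candidate cluster for the site; the highest-scoring cluster becomes the primary mapping and the next few (up to 3) become `secondary_cluster_ids`. Keeping the three factor functions pure makes them straightforward to unit-test without a database.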