igny8/v2/V2-Execution-Docs/02B-taxonomy-term-content.md
IGNY8 VPS (Salman) 0570052fec 1
2026-03-23 17:20:51 +00:00

IGNY8 Phase 2: Taxonomy Term Content (02B)

Rich Content Generation for Taxonomy Terms

Document Version: 1.0
Date: 2026-03-23
Phase: IGNY8 Phase 2 — Feature Expansion
Status: Build Ready
Source of Truth: Codebase at /data/app/igny8/
Audience: Claude Code, Backend Developers, Architects


1. CURRENT STATE

Existing Taxonomy Infrastructure

The taxonomy system is partially built:

ContentTaxonomy (writer app, db_table=igny8_content_taxonomies):

  • Stores taxonomy term references synced from WordPress
  • Fields: name, slug, external_id (WP term ID), taxonomy_type (category/tag/product_cat/product_tag/attribute)
  • No content generation — terms are metadata only (name + slug + external reference)

ContentTaxonomyRelation (writer app):

  • Links Content to ContentTaxonomy (many-to-many through table)
  • Allows assigning existing taxonomy terms to content pieces

Content Model (writer app, db_table=igny8_content):

  • content_type='taxonomy' exists in CONTENT_TYPE_CHOICES but is unused by the generation pipeline
  • CONTENT_STRUCTURE_CHOICES includes category_archive, tag_archive, attribute_archive
  • taxonomy_terms ManyToManyField through ContentTaxonomyRelation

Tasks Model (writer app, db_table=igny8_tasks):

  • taxonomy_term ForeignKey to ContentTaxonomy (nullable, db_column='taxonomy_id')
  • Not used by automation pipeline — present as a field only

SiteIntegration (integration app):

  • WordPress connections exist via SiteIntegration model
  • SyncEvent logs operations but taxonomy sync is stubbed/incomplete

What Doesn't Exist

  • No content generation for taxonomy terms (categories, tags, attributes)
  • No cluster mapping for taxonomy terms
  • No WordPress → IGNY8 taxonomy sync (full fetch and reconcile)
  • No IGNY8 → WordPress term content push
  • No AI function for term content generation
  • No admin interface for managing term-to-cluster mapping

2. WHAT TO BUILD

Overview

Make taxonomy terms first-class SEO content pages by:

  1. Syncing terms from WordPress — fetch all categories, tags, WooCommerce taxonomies
  2. Mapping terms to clusters — automatic keyword-overlap + semantic matching
  3. Generating rich content — AI-generated landing page content for each term
  4. Pushing content back — sync generated content to WordPress term descriptions + meta

Taxonomy Sync (WordPress → IGNY8)

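Full fetch-and-reconcile of terms from WordPress into IGNY8, leveraging the existing SiteIntegration (the reverse direction is covered under Term Content Sync):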

Fetch targets:

  • WordPress categories (taxonomy_type='category')
  • WordPress tags (taxonomy_type='tag')
  • WooCommerce product categories (taxonomy_type='product_cat')
  • WooCommerce product tags (taxonomy_type='product_tag')
  • WooCommerce product attributes (taxonomy_type='attribute', e.g., pa_color, pa_size)

Sync logic:

  1. Use existing SiteIntegration.credentials_json to authenticate WP REST API
  2. Fetch all terms via GET /wp-json/wp/v2/categories, /tags, /product_cat, etc.
  3. Reconcile: create new ContentTaxonomy records, update changed ones, flag deleted
  4. Store parent/child hierarchy for categories
  5. Log sync as SyncEvent with event_type='metadata_sync'
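
The reconcile step (create, update, flag deleted) can be sketched as a pure function over in-memory records. This is a minimal illustration, not the project's implementation: `LocalTerm`, `reconcile_terms`, and the `deleted_in_wp` flag are hypothetical names, and remote items are assumed to carry the `id`, `name`, and `slug` keys that WordPress REST term objects expose.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class LocalTerm:
    """Stand-in for a ContentTaxonomy row (hypothetical, for illustration)."""
    external_id: int
    name: str
    slug: str
    deleted_in_wp: bool = False  # assumed flag; the spec only says "flag deleted"


def reconcile_terms(
    local: Dict[int, LocalTerm], remote: List[dict]
) -> Tuple[List[int], List[int], List[int]]:
    """Apply remote state to local records, keyed by WordPress term ID.

    Returns (created, updated, flagged_deleted) lists of external IDs.
    """
    created, updated, flagged = [], [], []
    seen = set()
    for item in remote:
        ext_id = item["id"]
        seen.add(ext_id)
        term = local.get(ext_id)
        if term is None:
            local[ext_id] = LocalTerm(ext_id, item["name"], item["slug"])
            created.append(ext_id)
        elif (term.name, term.slug) != (item["name"], item["slug"]) or term.deleted_in_wp:
            # Changed in WordPress, or it reappeared after being flagged.
            term.name, term.slug, term.deleted_in_wp = item["name"], item["slug"], False
            updated.append(ext_id)
    # Terms held locally that WordPress no longer returns are flagged, not deleted.
    for ext_id, term in local.items():
        if ext_id not in seen and not term.deleted_in_wp:
            term.deleted_in_wp = True
            flagged.append(ext_id)
    return created, updated, flagged


local = {1: LocalTerm(1, "Shoes", "shoes"), 2: LocalTerm(2, "Hats", "hats")}
remote = [
    {"id": 1, "name": "Footwear", "slug": "shoes"},
    {"id": 3, "name": "Bags", "slug": "bags"},
]
result = reconcile_terms(local, remote)
```

Flagging rather than deleting keeps any generated content and cluster mapping attached to a term that was removed in WordPress by mistake.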

Cluster Mapping Service

A shared service (cluster_mapping_service.py) that maps taxonomy terms to keyword clusters:

Algorithm:

| Factor | Weight | Method |
|---|---|---|
| Keyword overlap | 40% | Compare term name + slug against cluster keywords |
| Semantic similarity | 40% | Embedding-based cosine similarity (term name vs cluster description) |
| Title match | 20% | Exact/partial match of term name in cluster name |
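
One way the three factors could be computed and combined is sketched below. The token-based overlap ratio, the 0.5 score for a partial title match, and all function names are assumptions made for illustration; the spec fixes only the weights. Embeddings are passed in as plain vectors so the sketch stays independent of any embedding provider.

```python
import math
import re
from typing import Iterable, Sequence, Set

WEIGHTS = {"keyword": 0.4, "semantic": 0.4, "title": 0.2}


def _tokens(text: str) -> Set[str]:
    """Lowercase alphanumeric tokens, so 'running-shoes' yields {'running', 'shoes'}."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def keyword_overlap_score(term_name: str, slug: str, cluster_keywords: Iterable[str]) -> float:
    """Share of the term's tokens that also appear in any cluster keyword."""
    term_tokens = _tokens(term_name) | _tokens(slug)
    if not term_tokens:
        return 0.0
    keyword_tokens: Set[str] = set()
    for keyword in cluster_keywords:
        keyword_tokens |= _tokens(keyword)
    return len(term_tokens & keyword_tokens) / len(term_tokens)


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two embedding vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def title_match_score(term_name: str, cluster_name: str) -> float:
    """1.0 for an exact match, 0.5 (assumed value) when the term is a substring."""
    term, cluster = term_name.strip().lower(), cluster_name.strip().lower()
    if not term:
        return 0.0
    if term == cluster:
        return 1.0
    return 0.5 if term in cluster else 0.0


def mapping_confidence(keyword: float, semantic: float, title: float) -> float:
    """Weighted sum of the three factor scores, clamped to the 0.0-1.0 range."""
    score = (
        WEIGHTS["keyword"] * keyword
        + WEIGHTS["semantic"] * semantic
        + WEIGHTS["title"] * title
    )
    return max(0.0, min(1.0, score))
```

The final clamp matters because cosine similarity can be negative for unrelated vectors, which would otherwise pull the confidence below zero.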

Output per term:

  • primary_cluster_id — best-match cluster
  • secondary_cluster_ids — additional related clusters (up to 3)
  • mapping_confidence — 0.0 to 1.0 score
  • mapping_status:
    • auto_mapped (confidence ≥ 0.6) — assigned automatically
    • suggested (0.3 ≤ confidence < 0.6) — suggested for manual review
    • unmapped (confidence < 0.3) — no good match found
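
The per-term output can be derived from a dictionary of cluster scores as in the sketch below. Two points are assumptions rather than part of the spec: secondary clusters must also clear the 0.3 threshold, and an unmapped term keeps no primary cluster. `classify` and `build_mapping` are illustrative names.

```python
from typing import Dict, List, Optional

AUTO_MAP_THRESHOLD = 0.6
SUGGEST_THRESHOLD = 0.3
MAX_SECONDARY = 3


def classify(confidence: float) -> str:
    """Translate a confidence score into a mapping_status value."""
    if confidence >= AUTO_MAP_THRESHOLD:
        return "auto_mapped"
    if confidence >= SUGGEST_THRESHOLD:
        return "suggested"
    return "unmapped"


def build_mapping(scores: Dict[int, float]) -> dict:
    """Pick the primary and secondary clusters from {cluster_id: confidence}."""
    ranked = sorted(scores.items(), key=lambda pair: pair[1], reverse=True)
    if not ranked:
        return {
            "primary_cluster_id": None,
            "secondary_cluster_ids": [],
            "mapping_confidence": 0.0,
            "mapping_status": "unmapped",
        }
    best_id, best_score = ranked[0]
    status = classify(best_score)
    primary: Optional[int] = best_id if status != "unmapped" else None
    # Assumption: runners-up only count as related if they clear the suggest threshold.
    secondary: List[int] = [
        cluster_id for cluster_id, score in ranked[1:] if score >= SUGGEST_THRESHOLD
    ][:MAX_SECONDARY]
    return {
        "primary_cluster_id": primary,
        "secondary_cluster_ids": secondary if primary is not None else [],
        "mapping_confidence": best_score,
        "mapping_status": status,
    }


mapping = build_mapping({10: 0.72, 11: 0.41, 12: 0.2, 13: 0.35})
```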

Term Content Generation

Each taxonomy term gets rich, SEO-optimized content:

Generated sections:

  1. H1 Title — optimized for the term + primary cluster keywords
  2. Rich description — 500-1,500 words covering the topic
  3. FAQ section — 5-8 questions and answers
  4. Related terms — links to sibling/child terms
  5. Meta title — 50-60 characters
  6. Meta description — 150-160 characters

AI function: GenerateTermContentFunction(BaseAIFunction):

  • Input: term name, taxonomy_type, assigned cluster keywords, existing content titles under term, parent/sibling terms for context
  • Output: structured JSON with sections (intro, overview, FAQ, related)
  • Uses ContentTypeTemplate from 02A where content_type='taxonomy'

Term Content Sync (IGNY8 → WordPress)

Push generated content to WordPress:

  • Custom WP REST endpoint: POST /wp-json/igny8/v1/terms/{id}/content
  • Stores in WordPress term meta:
    • _igny8_term_content — HTML content
    • _igny8_term_faq — JSON FAQ array
    • _igny8_term_meta_title — SEO title
    • _igny8_term_meta_description — SEO description
  • Updates native WordPress term description with the generated content
  • Schema: CollectionPage whose mainEntity is an ItemList, with one itemListElement entry per listed content item
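
The push payload and the structured data could be assembled as sketched below. The request body shape for the custom endpoint is not defined in this document, so using the term meta keys directly as body fields is an assumption, as are the function names and the `title`/`url` keys on listed items. In schema.org, `itemListElement` is a property of `ItemList`, so the sketch nests an `ItemList` under the page's `mainEntity`.

```python
import json
from typing import Dict, List


def build_term_payload(
    content_html: str, faq: List[Dict[str, str]], meta_title: str, meta_description: str
) -> Dict[str, str]:
    """Body for POST /wp-json/igny8/v1/terms/{id}/content (field names assumed)."""
    return {
        "_igny8_term_content": content_html,
        "_igny8_term_faq": json.dumps(faq),  # stored as a JSON array string
        "_igny8_term_meta_title": meta_title,
        "_igny8_term_meta_description": meta_description,
    }


def build_collection_page_schema(
    term_name: str, term_url: str, items: List[Dict[str, str]]
) -> Dict:
    """JSON-LD for a term landing page listing its content items in order."""
    return {
        "@context": "https://schema.org",
        "@type": "CollectionPage",
        "name": term_name,
        "url": term_url,
        "mainEntity": {
            "@type": "ItemList",
            "itemListElement": [
                {
                    "@type": "ListItem",
                    "position": position,
                    "name": item["title"],
                    "url": item["url"],
                }
                for position, item in enumerate(items, start=1)
            ],
        },
    }


payload = build_term_payload(
    "<p>All about shoes</p>", [{"question": "q", "answer": "a"}], "Shoes", "Shop shoes"
)
schema = build_collection_page_schema(
    "Shoes",
    "https://example.com/category/shoes/",
    [
        {"title": "Trail shoes guide", "url": "https://example.com/trail-shoes/"},
        {"title": "Road shoes guide", "url": "https://example.com/road-shoes/"},
    ],
)
```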

3. DATA MODELS & APIs

Modified Models

ContentTaxonomy (db_table=igny8_content_taxonomies) — add fields:

# Cluster mapping
cluster = models.ForeignKey(
    'planner.Clusters', on_delete=models.SET_NULL,
    null=True, blank=True, related_name='taxonomy_terms',
    help_text="Primary cluster this term maps to"
)
secondary_cluster_ids = models.JSONField(
    default=list, blank=True,
    help_text="Additional related cluster IDs"
)
mapping_confidence = models.FloatField(
    default=0.0,
    help_text="Cluster mapping confidence score 0.0-1.0"
)
mapping_status = models.CharField(
    max_length=20, default='unmapped',
    choices=[
        ('auto_mapped', 'Auto Mapped'),
        ('manual_mapped', 'Manual Mapped'),
        ('suggested', 'Suggested'),
        ('unmapped', 'Unmapped'),
    ],
    db_index=True
)

# Generated content
term_content = models.TextField(
    blank=True, default='',
    help_text="Generated rich HTML content for the term page"
)
term_faq = models.JSONField(
    default=list, blank=True,
    help_text="Generated FAQ: [{question, answer}]"
)
meta_title = models.CharField(max_length=255, blank=True, default='')
meta_description = models.TextField(blank=True, default='')
content_status = models.CharField(
    max_length=20, default='none',
    choices=[
        ('none', 'No Content'),
        ('generating', 'Generating'),
        ('generated', 'Generated'),
        ('published', 'Published to WP'),
    ],
    db_index=True
)

# Hierarchy
parent_term = models.ForeignKey(
    'self', on_delete=models.SET_NULL,
    null=True, blank=True, related_name='child_terms'
)
term_count = models.IntegerField(
    default=0,
    help_text="Number of posts/products using this term"
)

# Sync tracking
last_synced_from_wp = models.DateTimeField(null=True, blank=True)
last_pushed_to_wp = models.DateTimeField(null=True, blank=True)
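
Because a child term can arrive from WordPress before its parent, `parent_term` is easiest to populate in a second pass once every term has a local row. The sketch below assumes remote items carry the WordPress `id` and `parent` fields (`parent` is 0 for top-level terms and absent on flat taxonomies such as tags); `resolve_parents` and the `local_ids` lookup are hypothetical.

```python
from typing import Dict, List, Optional


def resolve_parents(
    remote_terms: List[dict], local_ids: Dict[int, int]
) -> Dict[int, Optional[int]]:
    """Map each local term ID to its local parent ID.

    local_ids maps WordPress term ID -> local ContentTaxonomy ID.
    Top-level terms and terms whose parent was not synced resolve to None.
    """
    parents: Dict[int, Optional[int]] = {}
    for item in remote_terms:
        local_id = local_ids[item["id"]]
        wp_parent = item.get("parent", 0)
        parents[local_id] = local_ids.get(wp_parent) if wp_parent else None
    return parents


remote_terms = [
    {"id": 9, "parent": 5},   # child listed before its parent
    {"id": 5, "parent": 0},   # top-level category
    {"id": 12, "parent": 99}, # parent missing from this sync
    {"id": 20},               # tag: no parent field at all
]
local_ids = {5: 101, 9: 102, 12: 103, 20: 104}
parent_map = resolve_parents(remote_terms, local_ids)
```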

New AI Function

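# igny8_core/ai/functions/generate_term_content.py

from typing import Any, Dict, List

# Project imports. The module paths below are assumptions; mirror whatever
# generate_content.py in the same package imports.
from igny8_core.ai.base import BaseAIFunction
from igny8_core.business.content.models import ContentTaxonomy
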
class GenerateTermContentFunction(BaseAIFunction):
    """Generate rich SEO content for taxonomy terms."""

    def get_name(self) -> str:
        return 'generate_term_content'

    def get_metadata(self) -> Dict:
        return {
            'display_name': 'Generate Term Content',
            'description': 'Generate rich landing page content for taxonomy terms',
            'phases': {
                'INIT': 'Initializing...',
                'PREP': 'Loading term and cluster data...',
                'AI_CALL': 'Generating term content...',
                'PARSE': 'Parsing response...',
                'SAVE': 'Saving term content...',
                'DONE': 'Complete!'
            }
        }

    def get_max_items(self) -> int:
        return 10  # Process up to 10 terms per batch

    def validate(self, payload: dict, account=None) -> Dict:
        term_ids = payload.get('ids', [])
        if not term_ids:
            return {'valid': False, 'error': 'No term IDs provided'}
        return {'valid': True}

    def prepare(self, payload: dict, account=None) -> List:
        term_ids = payload.get('ids', [])
        terms = ContentTaxonomy.objects.filter(
            id__in=term_ids,
            account=account
        ).select_related('cluster', 'parent_term')
        return list(terms)

    def build_prompt(self, data: Any, account=None) -> str:
        term = data  # Single term
        # Build context: cluster keywords, existing content, siblings
        cluster_keywords = []
        if term.cluster:
            cluster_keywords = list(
                term.cluster.keywords.values_list('keyword', flat=True)[:20]
            )
        sibling_terms = list(
            ContentTaxonomy.objects.filter(
                taxonomy_type=term.taxonomy_type,
                site=term.site,
                parent_term=term.parent_term
            ).exclude(id=term.id).values_list('name', flat=True)[:10]
        )
        # Use ContentTypeTemplate from 02A if available
        # Fall back to default term prompt
        return self._build_term_prompt(term, cluster_keywords, sibling_terms)

    def parse_response(self, response: str, step_tracker=None) -> Dict:
        # Parse structured JSON: {content_html, faq, meta_title, meta_description}
        pass

    def save_output(self, parsed, original_data, account=None, **kwargs) -> Dict:
        term = original_data
        term.term_content = parsed.get('content_html', '')
        term.term_faq = parsed.get('faq', [])
        term.meta_title = parsed.get('meta_title', '')
        term.meta_description = parsed.get('meta_description', '')
        term.content_status = 'generated'
        term.save()
        return {'count': 1, 'items_updated': [term.id]}
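
`parse_response` is left as a stub above. A minimal sketch of what it might do is shown below as a standalone function, assuming the model returns one JSON object with the four keys that `save_output` reads. `parse_term_response` and `REQUIRED_KEYS` are illustrative names, and the tolerance for surrounding prose is an assumption about model behaviour.

```python
import json
import re
from typing import Dict

REQUIRED_KEYS = ("content_html", "faq", "meta_title", "meta_description")


def parse_term_response(response: str) -> Dict:
    """Extract and normalise the structured term content from raw AI output."""
    # Models sometimes wrap JSON in prose or code fences, so take the outermost braces.
    match = re.search(r"\{.*\}", response, flags=re.DOTALL)
    if match is None:
        raise ValueError("No JSON object found in AI response")
    data = json.loads(match.group(0))  # JSONDecodeError is a ValueError subclass
    missing = [key for key in REQUIRED_KEYS if key not in data]
    if missing:
        raise ValueError("Missing keys: " + ", ".join(missing))
    raw_faq = data["faq"] if isinstance(data["faq"], list) else []
    # Keep only FAQ entries that have both a question and an answer.
    faq = [
        {"question": str(item["question"]).strip(), "answer": str(item["answer"]).strip()}
        for item in raw_faq
        if isinstance(item, dict) and item.get("question") and item.get("answer")
    ]
    return {
        "content_html": str(data["content_html"]).strip(),
        "faq": faq,
        "meta_title": str(data["meta_title"]).strip(),
        "meta_description": str(data["meta_description"]).strip(),
    }


raw = (
    'Here is the content:\n'
    '{"content_html": "<p>Hi</p> ", "faq": [{"question": "Q?", "answer": "A."}, '
    '{"question": ""}], "meta_title": "T", "meta_description": "D"}'
)
parsed = parse_term_response(raw)
```

Raising on missing keys, rather than defaulting to empty strings, keeps a malformed response from overwriting existing term content with blanks.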

Register in igny8_core/ai/registry.py:

register_lazy_function('generate_term_content', lambda: GenerateTermContentFunction)
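
Passing a lambda rather than the class itself defers the import until the function is first requested. The stand-in below shows how such a lazy registry might behave; it is not the code in registry.py, and `get_function_class` plus the module-level dictionaries are hypothetical.

```python
from typing import Callable, Dict, List

_loaders: Dict[str, Callable[[], type]] = {}
_resolved: Dict[str, type] = {}


def register_lazy_function(name: str, loader: Callable[[], type]) -> None:
    """Store a zero-argument loader; nothing is imported or built yet."""
    _loaders[name] = loader
    _resolved.pop(name, None)  # re-registering invalidates any cached class


def get_function_class(name: str) -> type:
    """Resolve the loader on first use and cache the result (hypothetical helper)."""
    if name not in _resolved:
        if name not in _loaders:
            raise KeyError(f"Unknown AI function: {name}")
        _resolved[name] = _loaders[name]()
    return _resolved[name]


class GenerateTermContentFunction:
    """Placeholder for the real class defined above."""


load_calls: List[str] = []


def _load_term_function() -> type:
    load_calls.append("loaded")  # records each time the loader actually runs
    return GenerateTermContentFunction


register_lazy_function("generate_term_content", _load_term_function)
```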

New Service

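# igny8_core/business/content/cluster_mapping_service.py

from typing import Dict

# Path per Step 1 of the implementation steps; adjust if the model lives elsewhere.
from igny8_core.business.content.models import ContentTaxonomy
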
class ClusterMappingService:
    """Maps taxonomy terms to keyword clusters using multi-factor scoring."""

    KEYWORD_OVERLAP_WEIGHT = 0.4
    SEMANTIC_SIMILARITY_WEIGHT = 0.4
    TITLE_MATCH_WEIGHT = 0.2
    AUTO_MAP_THRESHOLD = 0.6
    SUGGEST_THRESHOLD = 0.3

    def map_terms_to_clusters(self, site_id: int, account_id: int) -> Dict:
        """
        Map all unmapped ContentTaxonomy terms to Clusters for a site.
        Returns: {mapped: int, suggested: int, unmapped: int}
        """
        pass

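    def map_single_term(self, term: ContentTaxonomy) -> Dict:
        """
        Map a single term. Returns:
        {cluster_id, secondary_ids, confidence, status}
        """
        pass

    def get_suggestions(self, term: ContentTaxonomy, top_n: int = 5) -> list:
        """
        Return up to top_n candidate clusters for a term, ranked by confidence.
        Called by TaxonomyTermViewSet.cluster_suggestions.
        """
        pass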

    def _keyword_overlap_score(self, term_name: str, cluster_keywords: list) -> float:
        pass

    def _semantic_similarity_score(self, term_name: str, cluster_description: str) -> float:
        pass

    def _title_match_score(self, term_name: str, cluster_name: str) -> float:
        pass
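
The ViewSet later in this document calls `service.get_suggestions(term, top_n=5)`. A ranking of that kind can be sketched as below, with the scoring function injected so the example runs without a database. The shape of each suggestion and the token-overlap `demo_score` are assumptions for illustration only.

```python
import heapq
from typing import Callable, Dict, Iterable, List


def get_suggestions(
    term_name: str,
    clusters: Iterable[Dict],
    score_fn: Callable[[str, Dict], float],
    top_n: int = 5,
) -> List[Dict]:
    """Score every cluster against the term and return the top_n, best first."""
    scored = (
        {
            "cluster_id": cluster["id"],
            "name": cluster["name"],
            "confidence": round(score_fn(term_name, cluster), 3),
        }
        for cluster in clusters
    )
    return heapq.nlargest(top_n, scored, key=lambda entry: entry["confidence"])


def demo_score(term_name: str, cluster: Dict) -> float:
    """Toy scorer: Jaccard overlap between term tokens and cluster-name tokens."""
    term_tokens = set(term_name.lower().split())
    cluster_tokens = set(cluster["name"].lower().split())
    union = term_tokens | cluster_tokens
    return len(term_tokens & cluster_tokens) / len(union) if union else 0.0


clusters = [
    {"id": 1, "name": "Trail Running Shoes"},
    {"id": 2, "name": "Running Shoes"},
    {"id": 3, "name": "Winter Hats"},
]
suggestions = get_suggestions("Running Shoes", clusters, demo_score, top_n=2)
```

`heapq.nlargest` avoids sorting the full candidate list, which helps when a site has many clusters and only a handful of suggestions are shown.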

New Celery Tasks

# igny8_core/tasks/taxonomy_tasks.py

from celery import shared_task

@shared_task(bind=True, max_retries=3, default_retry_delay=60)
def sync_taxonomy_from_wordpress(self, site_id: int, account_id: int):
    """Fetch all taxonomy terms from WordPress and reconcile with ContentTaxonomy."""
    pass

@shared_task(bind=True, max_retries=3, default_retry_delay=60)
def map_terms_to_clusters(self, site_id: int, account_id: int):
    """Run cluster mapping on all unmapped terms for a site."""
    pass

@shared_task(bind=True, max_retries=3, default_retry_delay=60)
def generate_term_content_task(self, term_ids: list, account_id: int):
    """Generate content for a batch of taxonomy terms."""
    pass

@shared_task(bind=True, max_retries=3, default_retry_delay=60)
def push_term_content_to_wordpress(self, term_id: int, account_id: int):
    """Push generated term content to WordPress via REST API."""
    pass
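
Since `get_max_items` caps a batch at 10 terms, a bulk request has to be split before `generate_term_content_task` is queued. A minimal sketch of that splitting is shown below; `chunk_term_ids` is a hypothetical helper, and de-duplicating IDs first is an assumption that avoids generating the same term twice.

```python
from typing import Iterable, Iterator, List

MAX_ITEMS_PER_BATCH = 10  # mirrors GenerateTermContentFunction.get_max_items()


def chunk_term_ids(
    term_ids: Iterable[int], size: int = MAX_ITEMS_PER_BATCH
) -> Iterator[List[int]]:
    """Yield de-duplicated term IDs in order, in lists of at most `size`."""
    if size < 1:
        raise ValueError("size must be at least 1")
    unique = list(dict.fromkeys(term_ids))  # keeps first-seen order
    for start in range(0, len(unique), size):
        yield unique[start:start + size]


batches = list(chunk_term_ids([*range(1, 26), 3, 7]))
# Each batch would then be queued, for example:
#   generate_term_content_task.delay(batch, account_id)
```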

Migration

igny8_core/migrations/XXXX_taxonomy_term_content.py

Fields added to ContentTaxonomy:

  1. cluster — ForeignKey to Clusters (nullable)
  2. secondary_cluster_ids — JSONField
  3. mapping_confidence — FloatField
  4. mapping_status — CharField
  5. term_content — TextField
  6. term_faq — JSONField
  7. meta_title — CharField
  8. meta_description — TextField
  9. content_status — CharField
  10. parent_term — ForeignKey to self (nullable)
  11. term_count — IntegerField
  12. last_synced_from_wp — DateTimeField (nullable)
  13. last_pushed_to_wp — DateTimeField (nullable)

API Endpoints

# Taxonomy Term Management
GET    /api/v1/writer/taxonomy/terms/                               # List terms with mapping status (filterable)
GET    /api/v1/writer/taxonomy/terms/{id}/                          # Term detail
GET    /api/v1/writer/taxonomy/terms/unmapped/                      # Terms needing cluster assignment
GET    /api/v1/writer/taxonomy/terms/stats/                         # Summary: mapped/unmapped/generated/published counts

# WordPress Sync
POST   /api/v1/writer/taxonomy/terms/sync/                          # Trigger WP → IGNY8 sync
GET    /api/v1/writer/taxonomy/terms/sync/status/                   # Last sync time + status

# Cluster Mapping
POST   /api/v1/writer/taxonomy/terms/{id}/map-cluster/              # Manual cluster assignment
POST   /api/v1/writer/taxonomy/terms/auto-map/                      # Run auto-mapping for all unmapped terms
GET    /api/v1/writer/taxonomy/terms/{id}/cluster-suggestions/      # Get AI cluster suggestions for a term

# Content Generation
POST   /api/v1/writer/taxonomy/terms/create-tasks/                  # Bulk create generation tasks for selected terms
POST   /api/v1/writer/taxonomy/terms/{id}/generate/                 # Generate content for single term
POST   /api/v1/writer/taxonomy/terms/generate-bulk/                 # Generate content for multiple terms

# Publishing to WordPress
POST   /api/v1/writer/taxonomy/terms/{id}/publish/                  # Push single term content to WP
POST   /api/v1/writer/taxonomy/terms/publish-bulk/                  # Push multiple terms to WP

ViewSet:

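# igny8_core/modules/writer/views/taxonomy_term_views.py
from rest_framework.decorators import action
from rest_framework.response import Response

# Project imports (SiteSectorModelViewSet, ContentTaxonomy, TaxonomyTermSerializer,
# ClusterMappingService, and the taxonomy Celery tasks) follow the paths used
# elsewhere in the codebase.
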
class TaxonomyTermViewSet(SiteSectorModelViewSet):
    serializer_class = TaxonomyTermSerializer
    queryset = ContentTaxonomy.objects.all()
    filterset_fields = ['taxonomy_type', 'mapping_status', 'content_status', 'site']

    @action(detail=False, methods=['get'])
    def unmapped(self, request):
        qs = self.get_queryset().filter(mapping_status='unmapped')
        return self.paginate_and_respond(qs)

    @action(detail=False, methods=['get'])
    def stats(self, request):
        qs = self.get_queryset()
        # Narrow by site only when site_id is supplied; filtering on
        # site_id=None would silently return zero for every count.
        site_id = request.query_params.get('site_id')
        if site_id:
            qs = qs.filter(site_id=site_id)
        return Response({
            'total': qs.count(),
            'mapped': qs.filter(mapping_status__in=['auto_mapped', 'manual_mapped']).count(),
            'suggested': qs.filter(mapping_status='suggested').count(),
            'unmapped': qs.filter(mapping_status='unmapped').count(),
            'content_generated': qs.filter(content_status='generated').count(),
            'content_published': qs.filter(content_status='published').count(),
        })

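    @action(detail=False, methods=['post'])
    def sync(self, request):
        site_id = request.data.get('site_id')
        if not site_id:
            # Do not queue a task that cannot resolve a site.
            return Response({'error': 'site_id is required'}, status=400)
        sync_taxonomy_from_wordpress.delay(site_id, request.account.id)
        return Response({'message': 'Taxonomy sync started'})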

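    @action(detail=True, methods=['post'], url_path='map-cluster')
    def map_cluster(self, request, pk=None):
        term = self.get_object()
        cluster_id = request.data.get('cluster_id')
        if not cluster_id:
            return Response({'error': 'cluster_id is required'}, status=400)
        # The cluster should also be checked against the requesting account
        # before assignment, to preserve account isolation.
        term.cluster_id = cluster_id
        term.mapping_status = 'manual_mapped'
        term.mapping_confidence = 1.0
        term.save()
        return Response(TaxonomyTermSerializer(term).data)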

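    @action(detail=False, methods=['post'], url_path='auto-map')
    def auto_map(self, request):
        site_id = request.data.get('site_id')
        if not site_id:
            # Do not queue a task that cannot resolve a site.
            return Response({'error': 'site_id is required'}, status=400)
        map_terms_to_clusters.delay(site_id, request.account.id)
        return Response({'message': 'Auto-mapping started'})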

    @action(detail=True, methods=['get'], url_path='cluster-suggestions')
    def cluster_suggestions(self, request, pk=None):
        term = self.get_object()
        service = ClusterMappingService()
        suggestions = service.get_suggestions(term, top_n=5)
        return Response({'suggestions': suggestions})

    @action(detail=True, methods=['post'])
    def generate(self, request, pk=None):
        term = self.get_object()
        generate_term_content_task.delay([term.id], request.account.id)
        return Response({'message': 'Content generation started'})

    @action(detail=True, methods=['post'])
    def publish(self, request, pk=None):
        term = self.get_object()
        push_term_content_to_wordpress.delay(term.id, request.account.id)
        return Response({'message': 'Publishing to WordPress started'})

URL Registration:

# igny8_core/modules/writer/urls.py — add to existing router
router.register('taxonomy/terms', TaxonomyTermViewSet, basename='taxonomy-term')

Credit Costs

| Operation | Credits | Via |
|---|---|---|
| Taxonomy sync (WordPress → IGNY8) | 1 per batch | CreditCostConfig: taxonomy_sync |
| Term content generation | 4-6 per term | CreditCostConfig: term_content_generation |
| Term content optimization | 3-5 per term | CreditCostConfig: term_content_optimization |

Add to CreditCostConfig:

CreditCostConfig.objects.get_or_create(
    operation_type='taxonomy_sync',
    defaults={'display_name': 'Taxonomy Sync', 'base_credits': 1}
)
CreditCostConfig.objects.get_or_create(
    operation_type='term_content_generation',
    defaults={'display_name': 'Term Content Generation', 'base_credits': 5}
)

Add to CreditUsageLog.OPERATION_TYPE_CHOICES:

('taxonomy_sync', 'Taxonomy Sync'),
('term_content_generation', 'Term Content Generation'),
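
A rough budget for a bulk run can be estimated from these figures. The sketch below assumes term content generation costs between 4 and 6 credits per term with the seeded base of 5, and that taxonomy sync costs 1 credit per batch; `CreditEstimate` and `estimate_credits` are hypothetical names, and actual deduction is expected to go through CreditCostConfig.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class CreditEstimate:
    minimum: int
    expected: int
    maximum: int


def estimate_credits(
    term_count: int,
    sync_batches: int = 0,
    per_term_range: Tuple[int, int] = (4, 6),  # assumed per-term cost range
    base_per_term: int = 5,                    # base_credits seeded above
    per_sync_batch: int = 1,
) -> CreditEstimate:
    """Estimate credits for generating `term_count` terms plus optional sync batches."""
    if term_count < 0 or sync_batches < 0:
        raise ValueError("counts must be non-negative")
    low, high = per_term_range
    sync_cost = sync_batches * per_sync_batch
    return CreditEstimate(
        minimum=term_count * low + sync_cost,
        expected=term_count * base_per_term + sync_cost,
        maximum=term_count * high + sync_cost,
    )


estimate = estimate_credits(term_count=20, sync_batches=2)
```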

4. IMPLEMENTATION STEPS

Step 1: Add Fields to ContentTaxonomy

File to modify:

  • backend/igny8_core/business/content/models.py (or wherever ContentTaxonomy is defined)
  • Add all 13 new fields listed in migration section

Step 2: Create and Run Migration

cd /data/app/igny8/backend
python manage.py makemigrations --name taxonomy_term_content
python manage.py migrate

Step 3: Build ClusterMappingService

File to create:

  • backend/igny8_core/business/content/cluster_mapping_service.py

Step 4: Create GenerateTermContentFunction

File to create:

  • backend/igny8_core/ai/functions/generate_term_content.py

Register in:

  • backend/igny8_core/ai/registry.py

Step 5: Create Celery Tasks

File to create:

  • backend/igny8_core/tasks/taxonomy_tasks.py

Register in Celery beat schedule (optional — these are primarily on-demand):

  • sync_taxonomy_from_wordpress — can be periodic (weekly) or on-demand

Step 6: Add Credit Cost Entries

Add taxonomy_sync and term_content_generation to:

  • CreditCostConfig seed data
  • CreditUsageLog.OPERATION_TYPE_CHOICES

Step 7: Build Serializers

File to create:

  • backend/igny8_core/modules/writer/serializers/taxonomy_term_serializer.py

Step 8: Build ViewSet and URLs

File to create:

  • backend/igny8_core/modules/writer/views/taxonomy_term_views.py

Modify:

  • backend/igny8_core/modules/writer/urls.py

Step 9: Frontend

Files to create/modify in frontend/src/:

  • pages/Writer/TaxonomyTerms.tsx — term list with mapping status indicators
  • pages/Writer/TaxonomyTermDetail.tsx — term detail with generated content preview
  • components/Writer/ClusterMappingPanel.tsx — cluster assignment/suggestion UI
  • stores/taxonomyTermStore.ts — Zustand store
  • api/taxonomyTerms.ts — API client

Step 10: Tests

cd /data/app/igny8/backend
python manage.py test igny8_core.business.content.tests.test_cluster_mapping
python manage.py test igny8_core.ai.tests.test_generate_term_content
python manage.py test igny8_core.modules.writer.tests.test_taxonomy_term_views

5. ACCEPTANCE CRITERIA

  • All 13 new fields on ContentTaxonomy migrate successfully
  • GenerateTermContentFunction registered in AI function registry
  • WordPress → IGNY8 taxonomy sync fetches categories, tags, WooCommerce taxonomies
  • Sync creates/updates ContentTaxonomy records with correct taxonomy_type
  • Parent/child hierarchy preserved via parent_term FK
  • SyncEvent logged with event_type='metadata_sync' after each sync operation
  • ClusterMappingService maps terms with confidence scores
  • Terms with confidence ≥ 0.6 are auto-mapped, 0.3 to < 0.6 suggested, < 0.3 unmapped
  • Manual cluster assignment sets mapping_status='manual_mapped' with confidence=1.0
  • Term content generation produces: content_html, FAQ, meta_title, meta_description
  • content_status transitions: none → generating → generated → published
  • Publishing pushes content to WordPress via POST /wp-json/igny8/v1/terms/{id}/content
  • All API endpoints require authentication and enforce account isolation
  • Frontend term list shows mapping status badges (mapped/suggested/unmapped)
  • Frontend supports manual cluster assignment from suggestion list
  • Credit deduction works for taxonomy_sync and term_content_generation operations
  • Backward compatible — existing ContentTaxonomy records unaffected (new fields nullable/defaulted)
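
The content_status criterion can be enforced with a small transition table, sketched below. Only the documented forward path is allowed here; whether regeneration or failure rollback should add further transitions is left open, and `TRANSITIONS`, `can_transition`, and `advance` are hypothetical names.

```python
from typing import Dict, Set

# Forward-only path from the acceptance criteria: none → generating → generated → published.
TRANSITIONS: Dict[str, Set[str]] = {
    "none": {"generating"},
    "generating": {"generated"},
    "generated": {"published"},
    "published": set(),
}


def can_transition(current: str, new: str) -> bool:
    """True when moving from `current` to `new` follows the documented path."""
    return new in TRANSITIONS.get(current, set())


def advance(current: str, new: str) -> str:
    """Return the new status, or raise if the move skips or reverses a step."""
    if not can_transition(current, new):
        raise ValueError(f"Invalid content_status transition: {current} -> {new}")
    return new


status = "none"
for step in ("generating", "generated", "published"):
    status = advance(status, step)
```

A test for the acceptance criterion could walk this table and assert that every other pair of statuses is rejected.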

6. CLAUDE CODE INSTRUCTIONS

Execution Order

  1. Read backend/igny8_core/business/content/models.py — find ContentTaxonomy and ContentTaxonomyRelation
  2. Read backend/igny8_core/business/planning/models.py — understand Clusters model for FK reference
  3. Read backend/igny8_core/ai/functions/generate_content.py — reference pattern for new AI function
  4. Read backend/igny8_core/ai/registry.py — understand registration pattern
  5. Add fields to ContentTaxonomy model
  6. Create migration and run it
  7. Build ClusterMappingService
  8. Build GenerateTermContentFunction + register it
  9. Build Celery tasks
  10. Build serializers, ViewSet, URLs
  11. Build frontend components

Key Constraints

  • ALL primary keys are BigAutoField (integer). No UUIDs.
  • Model class names PLURAL: Clusters, Keywords, Tasks, ContentIdeas, Images. Content stays singular. ContentTaxonomy stays singular.
  • Frontend: .tsx files, Zustand stores, Vitest testing
  • Celery app name: igny8_core
  • All new db_tables use igny8_ prefix
  • Follow existing ViewSet pattern: SiteSectorModelViewSet for site-scoped resources
  • AI functions follow BaseAIFunction pattern with lazy registry

File Tree (New/Modified)

backend/igny8_core/
├── business/content/
│   ├── models.py                                        # MODIFY: add fields to ContentTaxonomy
│   └── cluster_mapping_service.py                       # NEW: ClusterMappingService
├── ai/functions/
│   └── generate_term_content.py                         # NEW: GenerateTermContentFunction
├── ai/
│   └── registry.py                                      # MODIFY: register generate_term_content
├── tasks/
│   └── taxonomy_tasks.py                                # NEW: sync, map, generate, publish tasks
├── modules/writer/
│   ├── serializers/
│   │   └── taxonomy_term_serializer.py                  # NEW
│   ├── views/
│   │   └── taxonomy_term_views.py                       # NEW
│   └── urls.py                                          # MODIFY: register taxonomy/terms route
└── migrations/
    └── XXXX_taxonomy_term_content.py                    # NEW: auto-generated

frontend/src/
├── pages/Writer/
│   ├── TaxonomyTerms.tsx                                # NEW: term list page
│   └── TaxonomyTermDetail.tsx                           # NEW: term detail + content preview
├── components/Writer/
│   └── ClusterMappingPanel.tsx                          # NEW: cluster assignment UI
├── stores/
│   └── taxonomyTermStore.ts                             # NEW: Zustand store
└── api/
    └── taxonomyTerms.ts                                 # NEW: API client

Cross-References

  • 02A (content types extension): ContentTypeTemplate for content_type='taxonomy' provides prompt template
  • 01A (SAG data foundation): SAGAttribute → taxonomy mapping context
  • 01D (setup wizard): wizard creates initial taxonomy plan used for cluster mapping
  • 03B (WP plugin connected): connected plugin receives term content via REST endpoint
  • 03C (companion theme): theme renders term landing pages using pushed content