igny8/DATA_SEGREGATION_SYSTEM_VS_USER.md
2025-12-20 02:46:00 +00:00

Data Segregation: System vs User Data

Purpose

This document categorizes all models in the Django admin sidebar to identify:

  • SYSTEM DATA: Configuration, templates, and settings that must be preserved (pre-configured, production-ready data)
  • USER DATA: Account-specific, tenant-specific, or test data that can be cleaned up during the testing phase

1. Accounts & Tenancy

| Model | Type | Description | Clean/Keep |
| --- | --- | --- | --- |
| Account | USER DATA | Customer accounts (test accounts during development) | CLEAN - Remove test accounts |
| User | USER DATA | User profiles linked to accounts | CLEAN - Remove test users |
| Site | USER DATA | Sites/domains owned by accounts | CLEAN - Remove test sites |
| Sector | USER DATA | Sectors within sites (account-specific) | CLEAN - Remove test sectors |
| SiteUserAccess | USER DATA | User permissions per site | CLEAN - Remove test access records |

Summary: All models are USER DATA - Safe to clean for a fresh production start


2. Global Resources

| Model | Type | Description | Clean/Keep |
| --- | --- | --- | --- |
| Industry | SYSTEM DATA | Global industry taxonomy (e.g., Healthcare, Finance, Technology) | ⚠️ KEEP - Pre-configured industries |
| IndustrySector | SYSTEM DATA | Sub-categories within industries (e.g., Cardiology, Investment Banking) | ⚠️ KEEP - Pre-configured sectors |
| SeedKeyword | MIXED DATA | Seed keywords for industries - can be seeded or user-generated | ⚠️ REVIEW - Keep system seeds, remove test seeds |

Summary:

  • KEEP: Industry and IndustrySector (global taxonomy)
  • REVIEW: SeedKeyword - separate system defaults from test data

3. Plans and Billing

| Model | Type | Description | Clean/Keep |
| --- | --- | --- | --- |
| Plan | SYSTEM DATA | Subscription plans (Free, Pro, Enterprise, etc.) | ⚠️ KEEP - Production pricing tiers |
| Subscription | USER DATA | Active subscriptions per account | CLEAN - Remove test subscriptions |
| Invoice | USER DATA | Generated invoices for accounts | CLEAN - Remove test invoices |
| Payment | USER DATA | Payment records | CLEAN - Remove test payments |
| CreditPackage | SYSTEM DATA | Available credit packages for purchase | ⚠️ KEEP - Production credit offerings |
| PaymentMethodConfig | SYSTEM DATA | Supported payment methods (Stripe, PayPal) | ⚠️ KEEP - Production payment configs |
| AccountPaymentMethod | USER DATA | Saved payment methods per account | CLEAN - Remove test payment methods |

Summary:

  • KEEP: Plan, CreditPackage, PaymentMethodConfig (system pricing/config)
  • CLEAN: Subscription, Invoice, Payment, AccountPaymentMethod (user transactions)

4. Credits

| Model | Type | Description | Clean/Keep |
| --- | --- | --- | --- |
| CreditTransaction | USER DATA | Credit add/subtract transactions | CLEAN - Remove test transactions |
| CreditUsageLog | USER DATA | Log of credit usage per operation | CLEAN - Remove test usage logs |
| CreditCostConfig | SYSTEM DATA | Cost configuration per operation type | ⚠️ KEEP - Production cost structure |
| PlanLimitUsage | USER DATA | Usage tracking per account/plan limits | CLEAN - Remove test usage data |

Summary:

  • KEEP: CreditCostConfig (system cost rules)
  • CLEAN: All transaction and usage logs (user activity)

5. Content Planning

| Model | Type | Description | Clean/Keep |
| --- | --- | --- | --- |
| Keywords | USER DATA | Keywords researched per site/sector | CLEAN - Remove test keywords |
| Clusters | USER DATA | Content clusters created per site | CLEAN - Remove test clusters |
| ContentIdeas | USER DATA | Content ideas generated for accounts | CLEAN - Remove test ideas |

Summary: All models are USER DATA - Safe to clean completely


6. Content Generation

| Model | Type | Description | Clean/Keep |
| --- | --- | --- | --- |
| Tasks | USER DATA | Content writing tasks assigned to users | CLEAN - Remove test tasks |
| Content | USER DATA | Generated content/articles | CLEAN - Remove test content |
| Images | USER DATA | Generated or uploaded images | CLEAN - Remove test images |

Summary: All models are USER DATA - Safe to clean completely


7. Taxonomy & Organization

| Model | Type | Description | Clean/Keep |
| --- | --- | --- | --- |
| ContentTaxonomy | USER DATA | Custom taxonomies (categories/tags) per site | CLEAN - Remove test taxonomies |
| ContentTaxonomyRelation | USER DATA | Relationships between content and taxonomies | CLEAN - Remove test relations |
| ContentClusterMap | USER DATA | Mapping of content to clusters | CLEAN - Remove test mappings |
| ContentAttribute | USER DATA | Custom attributes for content | CLEAN - Remove test attributes |

Summary: All models are USER DATA - Safe to clean completely


8. Publishing & Integration

| Model | Type | Description | Clean/Keep |
| --- | --- | --- | --- |
| SiteIntegration | USER DATA | WordPress/platform integrations per site | CLEAN - Remove test integrations |
| SyncEvent | USER DATA | Sync events between IGNY8 and external platforms | CLEAN - Remove test sync logs |
| PublishingRecord | USER DATA | Records of published content | CLEAN - Remove test publish records |
| PublishingChannel | SYSTEM DATA | Available publishing channels (WordPress, Ghost, etc.) | ⚠️ KEEP - Production channel configs |
| DeploymentRecord | USER DATA | Deployment history per account | CLEAN - Remove test deployments |

Summary:

  • KEEP: PublishingChannel (system-wide channel definitions)
  • CLEAN: All user-specific integration and sync data

9. AI & Automation

| Model | Type | Description | Clean/Keep |
| --- | --- | --- | --- |
| IntegrationSettings | MIXED DATA | API keys/settings for OpenAI, etc. | ⚠️ REVIEW - Keep system defaults, remove test configs |
| AIPrompt | SYSTEM DATA | AI prompt templates for content generation | ⚠️ KEEP - Production prompt library |
| Strategy | SYSTEM DATA | Content strategy templates | ⚠️ KEEP - Production strategy templates |
| AuthorProfile | SYSTEM DATA | Author persona templates | ⚠️ KEEP - Production author profiles |
| APIKey | USER DATA | User-generated API keys for platform access | CLEAN - Remove test API keys |
| WebhookConfig | USER DATA | Webhook configurations per account | CLEAN - Remove test webhooks |
| AutomationConfig | USER DATA | Automation rules per account/site | CLEAN - Remove test automations |
| AutomationRun | USER DATA | Execution history of automations | CLEAN - Remove test run logs |

Summary:

  • KEEP: AIPrompt, Strategy, AuthorProfile (system templates)
  • REVIEW: IntegrationSettings (separate system vs user API keys)
  • CLEAN: APIKey, WebhookConfig, AutomationConfig, AutomationRun (user configs)

10. System Settings

| Model | Type | Description | Clean/Keep |
| --- | --- | --- | --- |
| ContentType | SYSTEM DATA | Django ContentTypes (auto-managed) | ⚠️ KEEP - Django core system table |
| ContentTemplate | SYSTEM DATA | Content templates for generation | ⚠️ KEEP - Production templates |
| TaxonomyConfig | SYSTEM DATA | Taxonomy configuration rules | ⚠️ KEEP - Production taxonomy rules |
| SystemSetting | SYSTEM DATA | Global system settings | ⚠️ KEEP - Production system config |
| ContentTypeConfig | SYSTEM DATA | Content type definitions (blog post, landing page, etc.) | ⚠️ KEEP - Production content types |
| NotificationConfig | SYSTEM DATA | Notification templates and rules | ⚠️ KEEP - Production notification configs |

Summary: All models are SYSTEM DATA - Must be kept and properly seeded for production


11. Django Admin

| Model | Type | Description | Clean/Keep |
| --- | --- | --- | --- |
| Group | SYSTEM DATA | Permission groups (Admin, Editor, Viewer, etc.) | ⚠️ KEEP - Production role definitions |
| Permission | SYSTEM DATA | Django permissions (auto-managed) | ⚠️ KEEP - Django core system table |
| PasswordResetToken | USER DATA | Password reset tokens (temporary) | CLEAN - Remove expired tokens |
| Session | USER DATA | User session data | CLEAN - Remove old sessions |

Summary:

  • KEEP: Group, Permission (system access control)
  • CLEAN: PasswordResetToken, Session (temporary user data)

12. Tasks & Logging

| Model | Type | Description | Clean/Keep |
| --- | --- | --- | --- |
| AITaskLog | USER DATA | Logs of AI operations per account | CLEAN - Remove test logs |
| AuditLog | USER DATA | Audit trail of user actions | CLEAN - Remove test audit logs |
| LogEntry | USER DATA | Django admin action logs | CLEAN - Remove test admin logs |
| TaskResult | USER DATA | Celery task execution results | CLEAN - Remove test task results |
| GroupResult | USER DATA | Celery group task results | CLEAN - Remove test group results |

Summary: All models are USER DATA - Safe to clean completely (logs/audit trails)


Summary Table: Data Segregation by Category

| Category | System Data Models | User Data Models | Mixed/Review |
| --- | --- | --- | --- |
| Accounts & Tenancy | 0 | 5 | 0 |
| Global Resources | 2 | 0 | 1 |
| Plans and Billing | 3 | 4 | 0 |
| Credits | 1 | 3 | 0 |
| Content Planning | 0 | 3 | 0 |
| Content Generation | 0 | 3 | 0 |
| Taxonomy & Organization | 0 | 4 | 0 |
| Publishing & Integration | 1 | 4 | 0 |
| AI & Automation | 3 | 4 | 1 |
| System Settings | 6 | 0 | 0 |
| Django Admin | 2 | 2 | 0 |
| Tasks & Logging | 0 | 5 | 0 |
| TOTAL | 18 | 37 | 2 |
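
The totals in the summary table can be re-derived from the per-section tables. The sketch below is one way to do that in plain Python: the model names are copied from this document, while the `CLASSIFICATION` structure and the `tally` helper are illustrative names, not part of the IGNY8 codebase.

```python
SYSTEM, USER, MIXED = "system", "user", "mixed"

# Model names copied from the twelve per-section tables in this document.
CLASSIFICATION = {
    "Accounts & Tenancy": {
        USER: ["Account", "User", "Site", "Sector", "SiteUserAccess"],
    },
    "Global Resources": {
        SYSTEM: ["Industry", "IndustrySector"],
        MIXED: ["SeedKeyword"],
    },
    "Plans and Billing": {
        SYSTEM: ["Plan", "CreditPackage", "PaymentMethodConfig"],
        USER: ["Subscription", "Invoice", "Payment", "AccountPaymentMethod"],
    },
    "Credits": {
        SYSTEM: ["CreditCostConfig"],
        USER: ["CreditTransaction", "CreditUsageLog", "PlanLimitUsage"],
    },
    "Content Planning": {USER: ["Keywords", "Clusters", "ContentIdeas"]},
    "Content Generation": {USER: ["Tasks", "Content", "Images"]},
    "Taxonomy & Organization": {
        USER: ["ContentTaxonomy", "ContentTaxonomyRelation",
               "ContentClusterMap", "ContentAttribute"],
    },
    "Publishing & Integration": {
        SYSTEM: ["PublishingChannel"],
        USER: ["SiteIntegration", "SyncEvent", "PublishingRecord", "DeploymentRecord"],
    },
    "AI & Automation": {
        SYSTEM: ["AIPrompt", "Strategy", "AuthorProfile"],
        USER: ["APIKey", "WebhookConfig", "AutomationConfig", "AutomationRun"],
        MIXED: ["IntegrationSettings"],
    },
    "System Settings": {
        SYSTEM: ["ContentType", "ContentTemplate", "TaxonomyConfig",
                 "SystemSetting", "ContentTypeConfig", "NotificationConfig"],
    },
    "Django Admin": {
        SYSTEM: ["Group", "Permission"],
        USER: ["PasswordResetToken", "Session"],
    },
    "Tasks & Logging": {
        USER: ["AITaskLog", "AuditLog", "LogEntry", "TaskResult", "GroupResult"],
    },
}


def tally(classification):
    """Return per-category counts and overall totals for each data type."""
    rows = {}
    totals = {SYSTEM: 0, USER: 0, MIXED: 0}
    for category, groups in classification.items():
        rows[category] = {kind: len(groups.get(kind, [])) for kind in totals}
        for kind, count in rows[category].items():
            totals[kind] += count
    return rows, totals


rows, totals = tally(CLASSIFICATION)
for category, counts in rows.items():
    print(f"{category}: {counts[SYSTEM]} / {counts[USER]} / {counts[MIXED]}")
print(f"TOTAL: {totals[SYSTEM]} / {totals[USER]} / {totals[MIXED]}")
```

Keeping the classification in one data structure like this would also let a cleanup script and a fixture export share the same source of truth, though that is a design suggestion rather than something the project is known to do.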

Action Plan: Production Data Preparation

Phase 1: Preserve System Data ⚠️

Models to Keep & Seed Properly:

  1. Global Taxonomy

    • Industry (pre-populate 10-15 major industries)
    • IndustrySector (pre-populate 100+ sub-sectors)
    • SeedKeyword (system-level seed keywords per industry)
  2. Pricing & Plans

    • Plan (Free, Starter, Pro, Enterprise tiers)
    • CreditPackage (credit bundles for purchase)
    • PaymentMethodConfig (Stripe, PayPal configs)
    • CreditCostConfig (cost per operation type)
  3. Publishing Channels

    • PublishingChannel (WordPress, Ghost, Medium, etc.)
  4. AI & Content Templates

    • AIPrompt (100+ production-ready prompts)
    • Strategy (content strategy templates)
    • AuthorProfile (author persona library)
    • ContentTemplate (article templates)
    • ContentTypeConfig (blog post, landing page, etc.)
  5. System Configuration

    • SystemSetting (global platform settings)
    • TaxonomyConfig (taxonomy rules)
    • NotificationConfig (email/webhook templates)
  6. Access Control

    • Group (Admin, Editor, Viewer, Owner roles)
    • Permission (Django-managed)
    • ContentType (Django-managed)

Phase 2: Clean User/Test Data

Models to Truncate/Delete:

  1. Account Data: Account, User, Site, Sector, SiteUserAccess
  2. Billing Transactions: Subscription, Invoice, Payment, AccountPaymentMethod, CreditTransaction
  3. Content Data: Keywords, Clusters, ContentIdeas, Tasks, Content, Images
  4. Taxonomy Relations: ContentTaxonomy, ContentTaxonomyRelation, ContentClusterMap, ContentAttribute
  5. Integration Data: SiteIntegration, SyncEvent, PublishingRecord, DeploymentRecord
  6. User Configs: APIKey, WebhookConfig, AutomationConfig, AutomationRun
  7. Logs: AITaskLog, AuditLog, LogEntry, TaskResult, GroupResult, CreditUsageLog, PlanLimitUsage, PasswordResetToken, Session

Phase 3: Review Mixed Data ⚠️

Models Requiring Manual Review:

  1. SeedKeyword: Separate system seeds from test data
  2. IntegrationSettings: Keep system-level API configs, remove test account keys
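
For the SeedKeyword review, the split between system seeds and test seeds needs some rule. The sketch below assumes, purely for illustration, that a row with no owning account is a system seed; the `account_id` key and the `split_seed_keywords` helper are hypothetical, and the real SeedKeyword schema may distinguish the two differently.

```python
def split_seed_keywords(rows, is_system=None):
    """Partition keyword rows into (keep, remove) lists.

    `rows` are plain dicts here so the rule can be tried without a database.
    ASSUMPTION: by default a row counts as a system seed when its
    `account_id` is missing or None. Pass `is_system` to use another rule.
    """
    if is_system is None:
        is_system = lambda row: row.get("account_id") is None
    keep, remove = [], []
    for row in rows:
        (keep if is_system(row) else remove).append(row)
    return keep, remove


# Sample rows are made up for the demonstration.
sample = [
    {"keyword": "cardiology clinic", "account_id": None},
    {"keyword": "test keyword 1", "account_id": 42},
    {"keyword": "investment banking"},
]
keep, remove = split_seed_keywords(sample)
print([row["keyword"] for row in keep])
print([row["keyword"] for row in remove])
```

Printing both lists before deleting anything gives the manual review this phase calls for a concrete artifact to check.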

Database Cleanup Commands (Use with Caution)

Safe Cleanup (Logs & Sessions)

# Run in `python manage.py shell`; import the models from their apps first
from datetime import timedelta

from django.utils import timezone

now = timezone.now()

# Remove old logs (>90 days)
log_cutoff = now - timedelta(days=90)
AITaskLog.objects.filter(created_at__lt=log_cutoff).delete()
CreditUsageLog.objects.filter(created_at__lt=log_cutoff).delete()
LogEntry.objects.filter(action_time__lt=log_cutoff).delete()

# Remove expired sessions and tokens
Session.objects.filter(expire_date__lt=now).delete()
PasswordResetToken.objects.filter(expires_at__lt=now).delete()

# Remove old task results (>30 days)
TaskResult.objects.filter(date_done__lt=now - timedelta(days=30)).delete()
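
The retention rules in the commands above can also be kept as data, so the cutoffs are defined in one place. This is a minimal sketch in plain Python: the field names and day counts are taken from the commands above, while `RETENTION_RULES` and `build_filters` are illustrative names. It only builds the filter arguments and does not touch the database.

```python
from datetime import datetime, timedelta, timezone

# (model name, timestamp field, retention in days), as used in the commands above.
# A retention of 0 days means "anything already past its timestamp".
RETENTION_RULES = [
    ("AITaskLog", "created_at", 90),
    ("CreditUsageLog", "created_at", 90),
    ("LogEntry", "action_time", 90),
    ("Session", "expire_date", 0),
    ("PasswordResetToken", "expires_at", 0),
    ("TaskResult", "date_done", 30),
]


def build_filters(now, rules=RETENTION_RULES):
    """Map each model name to ORM-style filter kwargs selecting expired rows."""
    return {
        model: {f"{field}__lt": now - timedelta(days=days)}
        for model, field, days in rules
    }


# A fixed timestamp keeps the example reproducible.
now = datetime(2025, 12, 20, tzinfo=timezone.utc)
filters = build_filters(now)
for model, kwargs in filters.items():
    print(model, kwargs)
```

In a Django shell the result would be applied as `Model.objects.filter(**kwargs).delete()`, with `django.utils.timezone.now()` in place of the fixed timestamp.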

Full Test Data Cleanup (Development/Staging Only)

# WARNING: Only run in development/staging environments
# This deletes ALL user-generated data; only superuser accounts are kept

# User data
Account.objects.all().delete()  # Cascades to most user data
User.objects.filter(is_superuser=False).delete()

# Remaining user data
SiteIntegration.objects.all().delete()
AutomationConfig.objects.all().delete()
APIKey.objects.all().delete()
WebhookConfig.objects.all().delete()

# Logs and history
AITaskLog.objects.all().delete()
AuditLog.objects.all().delete()
LogEntry.objects.all().delete()
TaskResult.objects.all().delete()
GroupResult.objects.all().delete()
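
Because the full cleanup is destructive, it may be worth guarding it in code rather than relying on the warning comment alone. The sketch below is one possible guard: the `APP_ENV` variable name, the allowed values, and the `require_non_production` helper are all assumptions made for illustration, since the document does not say how IGNY8 identifies its environment.

```python
import os

# ASSUMPTION: the deployment exposes its environment name in an env variable.
ALLOWED_ENVIRONMENTS = {"development", "staging"}


class UnsafeEnvironmentError(RuntimeError):
    """Raised when a destructive cleanup is attempted outside dev/staging."""


def require_non_production(environ=None, var="APP_ENV"):
    """Return the environment name, or raise if it is not explicitly allowed.

    An unset or unknown value is treated as unsafe, so the cleanup fails
    closed instead of running by accident.
    """
    environ = os.environ if environ is None else environ
    env = environ.get(var, "").strip().lower()
    if env not in ALLOWED_ENVIRONMENTS:
        raise UnsafeEnvironmentError(
            f"Refusing to run cleanup: {var}={env!r} is not one of "
            f"{sorted(ALLOWED_ENVIRONMENTS)}"
        )
    return env


# Demonstration with an explicit mapping instead of the real environment.
print(require_non_production({"APP_ENV": "staging"}))
```

Calling `require_non_production()` as the first line of a cleanup script would stop it before any `delete()` runs when the variable is missing or set to a production value.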

Verify System Data Exists

# Check system data is properly seeded
print(f"Industries: {Industry.objects.count()}")
print(f"Plans: {Plan.objects.count()}")
print(f"AI Prompts: {AIPrompt.objects.count()}")
print(f"Strategies: {Strategy.objects.count()}")
print(f"Content Templates: {ContentTemplate.objects.count()}")
print(f"Publishing Channels: {PublishingChannel.objects.count()}")
print(f"Groups: {Group.objects.count()}")
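
Printing counts shows what exists but not whether it is enough. The sketch below compares counts against minimum thresholds: the figures for Industry, IndustrySector, and AIPrompt follow the targets stated in Phase 1 (10-15 industries, 100+ sub-sectors, 100+ prompts), while the value of 1 for the other models and the `find_unseeded` helper are assumptions for illustration.

```python
# Minimum expected rows per system model.
# Industry, IndustrySector and AIPrompt follow the Phase 1 targets;
# ASSUMPTION: every other listed model just needs at least one row.
MINIMUM_ROWS = {
    "Industry": 10,
    "IndustrySector": 100,
    "AIPrompt": 100,
    "Plan": 1,
    "Strategy": 1,
    "ContentTemplate": 1,
    "PublishingChannel": 1,
    "Group": 1,
}


def find_unseeded(counts, minimums=MINIMUM_ROWS):
    """Return {model: (actual, required)} for models below their minimum."""
    return {
        model: (counts.get(model, 0), required)
        for model, required in minimums.items()
        if counts.get(model, 0) < required
    }


# Example counts are made up; in practice they would come from Model.objects.count().
example_counts = {
    "Industry": 12, "IndustrySector": 85, "AIPrompt": 140, "Plan": 4,
    "Strategy": 6, "ContentTemplate": 9, "PublishingChannel": 3,
}
for model, (actual, required) in find_unseeded(example_counts).items():
    print(f"{model}: {actual} rows, expected at least {required}")
```

A non-empty result could be used to fail a deployment check before launch.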

Recommendations

Before Production Launch:

  1. Export System Data: Export all SYSTEM DATA models to fixtures for reproducibility

    python manage.py dumpdata igny8_core_auth.Industry > fixtures/industries.json
    python manage.py dumpdata igny8_core_auth.Plan > fixtures/plans.json
    python manage.py dumpdata system.AIPrompt > fixtures/prompts.json
    # ... repeat for all system models
    
  2. Create Seed Script: Create a management command that populates a fresh database with system data

    python manage.py seed_system_data
    
  3. Database Snapshot: Take a snapshot after system data is seeded and before any user data is created

  4. Separate Databases: Consider keeping a separate staging database with full test data, while production starts clean

  5. Data Migration Plan:

    • If migrating from an old system: migrate only Account, User, Content, and critical user data
    • Leave test data behind in the old system
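
The fixture exports in recommendation 1 follow a fixed pattern, so the command lines can be generated from a list. The sketch below uses only the three model labels and file names given in that recommendation; labels for the remaining system models are not stated in this document and would need to be filled in. `FIXTURES` and `dumpdata_argv` are illustrative names. It uses the `--output` and `--indent` options of Django's `dumpdata` instead of shell redirection.

```python
# (app_label.ModelName, fixture path) pairs taken from recommendation 1.
# Labels for the other system models are not given in this document.
FIXTURES = [
    ("igny8_core_auth.Industry", "fixtures/industries.json"),
    ("igny8_core_auth.Plan", "fixtures/plans.json"),
    ("system.AIPrompt", "fixtures/prompts.json"),
]


def dumpdata_argv(model_label, fixture_path, indent=2):
    """Build the argument list for one `manage.py dumpdata` export."""
    return [
        "python", "manage.py", "dumpdata", model_label,
        "--indent", str(indent),
        "--output", fixture_path,
    ]


commands = [dumpdata_argv(label, path) for label, path in FIXTURES]
for argv in commands:
    print(" ".join(argv))
```

Each list can be passed to `subprocess.run(argv, check=True)` from the project root; the indented output also makes fixture diffs easier to review.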

Next Steps

  1. Review this document and confirm data segregation logic
  2. ⚠️ Create fixtures/seeds for all 18 SYSTEM DATA models
  3. ⚠️ Review 2 MIXED DATA models (SeedKeyword, IntegrationSettings)
  4. Create cleanup script for 37 USER DATA models
  5. Test cleanup script in staging environment
  6. Execute cleanup before production launch

Generated: December 20, 2025
Purpose: Production data preparation and test data cleanup