# Data Segregation: System vs User Data

## Purpose

This document categorizes all models in the Django admin sidebar to identify:

- **SYSTEM DATA**: Configuration, templates, and settings that must be preserved (pre-configured, production-ready data)
- **USER DATA**: Account-specific, tenant-specific, or test data that can be cleaned up during the testing phase

---

## 1. Accounts & Tenancy

| Model | Type | Description | Clean/Keep |
|-------|------|-------------|------------|
| Account | USER DATA | Customer accounts (test accounts during development) | ✅ CLEAN - Remove test accounts |
| User | USER DATA | User profiles linked to accounts | ✅ CLEAN - Remove test users |
| Site | USER DATA | Sites/domains owned by accounts | ✅ CLEAN - Remove test sites |
| Sector | USER DATA | Sectors within sites (account-specific) | ✅ CLEAN - Remove test sectors |
| SiteUserAccess | USER DATA | User permissions per site | ✅ CLEAN - Remove test access records |

**Summary**: All models are USER DATA - Safe to clean for a fresh production start

---

## 2. Global Resources

| Model | Type | Description | Clean/Keep |
|-------|------|-------------|------------|
| Industry | SYSTEM DATA | Global industry taxonomy (e.g., Healthcare, Finance, Technology) | ⚠️ KEEP - Pre-configured industries |
| IndustrySector | SYSTEM DATA | Sub-categories within industries (e.g., Cardiology, Investment Banking) | ⚠️ KEEP - Pre-configured sectors |
| SeedKeyword | MIXED DATA | Seed keywords for industries - can be seeded or user-generated | ⚠️ REVIEW - Keep system seeds, remove test seeds |

**Summary**:
- **KEEP**: Industry and IndustrySector (global taxonomy)
- **REVIEW**: SeedKeyword - separate system defaults from test data

---

## 3. Plans and Billing

| Model | Type | Description | Clean/Keep |
|-------|------|-------------|------------|
| Plan | SYSTEM DATA | Subscription plans (Free, Pro, Enterprise, etc.) | ⚠️ KEEP - Production pricing tiers |
| Subscription | USER DATA | Active subscriptions per account | ✅ CLEAN - Remove test subscriptions |
| Invoice | USER DATA | Generated invoices for accounts | ✅ CLEAN - Remove test invoices |
| Payment | USER DATA | Payment records | ✅ CLEAN - Remove test payments |
| CreditPackage | SYSTEM DATA | Available credit packages for purchase | ⚠️ KEEP - Production credit offerings |
| PaymentMethodConfig | SYSTEM DATA | Supported payment methods (Stripe, PayPal) | ⚠️ KEEP - Production payment configs |
| AccountPaymentMethod | USER DATA | Saved payment methods per account | ✅ CLEAN - Remove test payment methods |

**Summary**:
- **KEEP**: Plan, CreditPackage, PaymentMethodConfig (system pricing/config)
- **CLEAN**: Subscription, Invoice, Payment, AccountPaymentMethod (user transactions)

---

## 4. Credits

| Model | Type | Description | Clean/Keep |
|-------|------|-------------|------------|
| CreditTransaction | USER DATA | Credit add/subtract transactions | ✅ CLEAN - Remove test transactions |
| CreditUsageLog | USER DATA | Log of credit usage per operation | ✅ CLEAN - Remove test usage logs |
| CreditCostConfig | SYSTEM DATA | Cost configuration per operation type | ⚠️ KEEP - Production cost structure |
| PlanLimitUsage | USER DATA | Usage tracking per account/plan limits | ✅ CLEAN - Remove test usage data |

**Summary**:
- **KEEP**: CreditCostConfig (system cost rules)
- **CLEAN**: All transaction and usage logs (user activity)

---

## 5. Content Planning

| Model | Type | Description | Clean/Keep |
|-------|------|-------------|------------|
| Keywords | USER DATA | Keywords researched per site/sector | ✅ CLEAN - Remove test keywords |
| Clusters | USER DATA | Content clusters created per site | ✅ CLEAN - Remove test clusters |
| ContentIdeas | USER DATA | Content ideas generated for accounts | ✅ CLEAN - Remove test ideas |

**Summary**: All models are USER DATA - Safe to clean completely

---

## 6. Content Generation

| Model | Type | Description | Clean/Keep |
|-------|------|-------------|------------|
| Tasks | USER DATA | Content writing tasks assigned to users | ✅ CLEAN - Remove test tasks |
| Content | USER DATA | Generated content/articles | ✅ CLEAN - Remove test content |
| Images | USER DATA | Generated or uploaded images | ✅ CLEAN - Remove test images |

**Summary**: All models are USER DATA - Safe to clean completely

---

## 7. Taxonomy & Organization

| Model | Type | Description | Clean/Keep |
|-------|------|-------------|------------|
| ContentTaxonomy | USER DATA | Custom taxonomies (categories/tags) per site | ✅ CLEAN - Remove test taxonomies |
| ContentTaxonomyRelation | USER DATA | Relationships between content and taxonomies | ✅ CLEAN - Remove test relations |
| ContentClusterMap | USER DATA | Mapping of content to clusters | ✅ CLEAN - Remove test mappings |
| ContentAttribute | USER DATA | Custom attributes for content | ✅ CLEAN - Remove test attributes |

**Summary**: All models are USER DATA - Safe to clean completely

---

## 8. Publishing & Integration

| Model | Type | Description | Clean/Keep |
|-------|------|-------------|------------|
| SiteIntegration | USER DATA | WordPress/platform integrations per site | ✅ CLEAN - Remove test integrations |
| SyncEvent | USER DATA | Sync events between IGNY8 and external platforms | ✅ CLEAN - Remove test sync logs |
| PublishingRecord | USER DATA | Records of published content | ✅ CLEAN - Remove test publish records |
| PublishingChannel | SYSTEM DATA | Available publishing channels (WordPress, Ghost, etc.) | ⚠️ KEEP - Production channel configs |
| DeploymentRecord | USER DATA | Deployment history per account | ✅ CLEAN - Remove test deployments |

**Summary**:
- **KEEP**: PublishingChannel (system-wide channel definitions)
- **CLEAN**: All user-specific integration and sync data

---

## 9. AI & Automation

| Model | Type | Description | Clean/Keep |
|-------|------|-------------|------------|
| IntegrationSettings | MIXED DATA | API keys/settings for OpenAI, etc. | ⚠️ REVIEW - Keep system defaults, remove test configs |
| AIPrompt | SYSTEM DATA | AI prompt templates for content generation | ⚠️ KEEP - Production prompt library |
| Strategy | SYSTEM DATA | Content strategy templates | ⚠️ KEEP - Production strategy templates |
| AuthorProfile | SYSTEM DATA | Author persona templates | ⚠️ KEEP - Production author profiles |
| APIKey | USER DATA | User-generated API keys for platform access | ✅ CLEAN - Remove test API keys |
| WebhookConfig | USER DATA | Webhook configurations per account | ✅ CLEAN - Remove test webhooks |
| AutomationConfig | USER DATA | Automation rules per account/site | ✅ CLEAN - Remove test automations |
| AutomationRun | USER DATA | Execution history of automations | ✅ CLEAN - Remove test run logs |

**Summary**:
- **KEEP**: AIPrompt, Strategy, AuthorProfile (system templates)
- **REVIEW**: IntegrationSettings (separate system vs user API keys)
- **CLEAN**: APIKey, WebhookConfig, AutomationConfig, AutomationRun (user configs)

---

## 10. System Settings

| Model | Type | Description | Clean/Keep |
|-------|------|-------------|------------|
| ContentType | SYSTEM DATA | Django ContentTypes (auto-managed) | ⚠️ KEEP - Django core system table |
| ContentTemplate | SYSTEM DATA | Content templates for generation | ⚠️ KEEP - Production templates |
| TaxonomyConfig | SYSTEM DATA | Taxonomy configuration rules | ⚠️ KEEP - Production taxonomy rules |
| SystemSetting | SYSTEM DATA | Global system settings | ⚠️ KEEP - Production system config |
| ContentTypeConfig | SYSTEM DATA | Content type definitions (blog post, landing page, etc.) | ⚠️ KEEP - Production content types |
| NotificationConfig | SYSTEM DATA | Notification templates and rules | ⚠️ KEEP - Production notification configs |

**Summary**: All models are SYSTEM DATA - Must be kept and properly seeded for production

---

## 11. Django Admin

| Model | Type | Description | Clean/Keep |
|-------|------|-------------|------------|
| Group | SYSTEM DATA | Permission groups (Admin, Editor, Viewer, etc.) | ⚠️ KEEP - Production role definitions |
| Permission | SYSTEM DATA | Django permissions (auto-managed) | ⚠️ KEEP - Django core system table |
| PasswordResetToken | USER DATA | Password reset tokens (temporary) | ✅ CLEAN - Remove expired tokens |
| Session | USER DATA | User session data | ✅ CLEAN - Remove old sessions |

**Summary**:
- **KEEP**: Group, Permission (system access control)
- **CLEAN**: PasswordResetToken, Session (temporary user data)

---

## 12. Tasks & Logging

| Model | Type | Description | Clean/Keep |
|-------|------|-------------|------------|
| AITaskLog | USER DATA | Logs of AI operations per account | ✅ CLEAN - Remove test logs |
| AuditLog | USER DATA | Audit trail of user actions | ✅ CLEAN - Remove test audit logs |
| LogEntry | USER DATA | Django admin action logs | ✅ CLEAN - Remove test admin logs |
| TaskResult | USER DATA | Celery task execution results | ✅ CLEAN - Remove test task results |
| GroupResult | USER DATA | Celery group task results | ✅ CLEAN - Remove test group results |

**Summary**: All models are USER DATA - Safe to clean completely (logs/audit trails)

---

## Summary Table: Data Segregation by Category

| Category | System Data Models | User Data Models | Mixed/Review |
|----------|-------------------|------------------|--------------|
| **Accounts & Tenancy** | 0 | 5 | 0 |
| **Global Resources** | 2 | 0 | 1 |
| **Plans and Billing** | 3 | 4 | 0 |
| **Credits** | 1 | 3 | 0 |
| **Content Planning** | 0 | 3 | 0 |
| **Content Generation** | 0 | 3 | 0 |
| **Taxonomy & Organization** | 0 | 4 | 0 |
| **Publishing & Integration** | 1 | 4 | 0 |
| **AI & Automation** | 3 | 4 | 1 |
| **System Settings** | 6 | 0 | 0 |
| **Django Admin** | 2 | 2 | 0 |
| **Tasks & Logging** | 0 | 5 | 0 |
| **TOTAL** | **18** | **37** | **2** |

---

## Action Plan: Production Data Preparation

### Phase 1: Preserve System Data ⚠️

**Models to Keep & Seed Properly:**

1. **Global Taxonomy**
   - Industry (pre-populate 10-15 major industries)
   - IndustrySector (pre-populate 100+ sub-sectors)
   - SeedKeyword (system-level seed keywords per industry)

2. **Pricing & Plans**
   - Plan (Free, Starter, Pro, Enterprise tiers)
   - CreditPackage (credit bundles for purchase)
   - PaymentMethodConfig (Stripe, PayPal configs)
   - CreditCostConfig (cost per operation type)

3. **Publishing Channels**
   - PublishingChannel (WordPress, Ghost, Medium, etc.)

4. **AI & Content Templates**
   - AIPrompt (100+ production-ready prompts)
   - Strategy (content strategy templates)
   - AuthorProfile (author persona library)
   - ContentTemplate (article templates)
   - ContentTypeConfig (blog post, landing page, etc.)

5. **System Configuration**
   - SystemSetting (global platform settings)
   - TaxonomyConfig (taxonomy rules)
   - NotificationConfig (email/webhook templates)

6. **Access Control**
   - Group (Admin, Editor, Viewer, Owner roles)
   - Permission (Django-managed)
   - ContentType (Django-managed)

### Phase 2: Clean User/Test Data ✅

**Models to Truncate/Delete:**

1. **Account Data**: Account, User, Site, Sector, SiteUserAccess
2. **Billing Transactions**: Subscription, Invoice, Payment, AccountPaymentMethod, CreditTransaction
3. **Content Data**: Keywords, Clusters, ContentIdeas, Tasks, Content, Images
4. **Taxonomy Relations**: ContentTaxonomy, ContentTaxonomyRelation, ContentClusterMap, ContentAttribute
5. **Integration Data**: SiteIntegration, SyncEvent, PublishingRecord, DeploymentRecord
6. **User Configs**: APIKey, WebhookConfig, AutomationConfig, AutomationRun
7. **Logs**: AITaskLog, AuditLog, LogEntry, TaskResult, GroupResult, CreditUsageLog, PlanLimitUsage, PasswordResetToken, Session

### Phase 3: Review Mixed Data ⚠️

**Models Requiring Manual Review:**

1. **SeedKeyword**: Separate system seeds from test data
2. **IntegrationSettings**: Keep system-level API configs, remove test account keys

---

## Database Cleanup Commands (Use with Caution)

### Safe Cleanup (Logs & Sessions)

```python
from datetime import timedelta

from django.contrib.admin.models import LogEntry
from django.contrib.sessions.models import Session
from django.utils import timezone
from django_celery_results.models import TaskResult  # assumed: django-celery-results
# AITaskLog, CreditUsageLog, PasswordResetToken: import from the project's own apps

now = timezone.now()

# Remove old logs (>90 days)
log_cutoff = now - timedelta(days=90)
AITaskLog.objects.filter(created_at__lt=log_cutoff).delete()
CreditUsageLog.objects.filter(created_at__lt=log_cutoff).delete()
LogEntry.objects.filter(action_time__lt=log_cutoff).delete()

# Remove expired sessions and tokens
Session.objects.filter(expire_date__lt=now).delete()
PasswordResetToken.objects.filter(expires_at__lt=now).delete()

# Remove old task results (>30 days)
TaskResult.objects.filter(date_done__lt=now - timedelta(days=30)).delete()
```

### Full Test Data Cleanup (Development/Staging Only)

```python
# WARNING: Only run in development/staging environments
# This will delete ALL user-generated data
# (import each model from its app before running)

# User data
# Check on_delete rules first: if User cascades from Account,
# this also removes superusers that belong to an account.
Account.objects.all().delete()  # Cascades to most user data
User.objects.filter(is_superuser=False).delete()

# Remaining user data
SiteIntegration.objects.all().delete()
AutomationConfig.objects.all().delete()
APIKey.objects.all().delete()
WebhookConfig.objects.all().delete()

# Logs and history
AITaskLog.objects.all().delete()
AuditLog.objects.all().delete()
LogEntry.objects.all().delete()
TaskResult.objects.all().delete()
GroupResult.objects.all().delete()
```

### Verify System Data Exists

```python
# Check system data is properly seeded
# (import each model from its app before running)
print(f"Industries: {Industry.objects.count()}")
print(f"Plans: {Plan.objects.count()}")
print(f"AI Prompts: {AIPrompt.objects.count()}")
print(f"Strategies: {Strategy.objects.count()}")
print(f"Content Templates: {ContentTemplate.objects.count()}")
print(f"Publishing Channels: {PublishingChannel.objects.count()}")
print(f"Groups: {Group.objects.count()}")
```

---

## Recommendations

### Before Production Launch:

1. **Export System Data**: Export all SYSTEM DATA models to fixtures for reproducibility

   ```bash
   python manage.py dumpdata igny8_core_auth.Industry > fixtures/industries.json
   python manage.py dumpdata igny8_core_auth.Plan > fixtures/plans.json
   python manage.py dumpdata system.AIPrompt > fixtures/prompts.json
   # ... repeat for all system models
   ```

2. **Create Seed Script**: Create a management command to populate a fresh database with system data

   ```bash
   python manage.py seed_system_data
   ```

3. **Database Snapshot**: Take a snapshot after system data is seeded, before any user data is created

4. **Separate Databases**: Consider a separate staging database with full test data, and a production database with a clean start

5. **Data Migration Plan**:
   - If migrating from an old system: Only migrate Account, User, Content, and critical user data
   - Leave test data behind in the old system

---

## Next Steps

1. ✅ Review this document and confirm data segregation logic
2. ⚠️ Create fixtures/seeds for all 18 SYSTEM DATA models
3. ⚠️ Review 2 MIXED DATA models (SeedKeyword, IntegrationSettings)
4. ✅ Create cleanup script for 37 USER DATA models
5. ✅ Test cleanup script in staging environment
6. ✅ Execute cleanup before production launch

---

*Generated: December 20, 2025*
*Purpose: Production data preparation and test data cleanup*
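---

The classification in the tables above can also be kept as plain data, so that a fixture export or cleanup script iterates over one list instead of repeating model names. The following is a minimal sketch, not part of the existing codebase: the `CLASSIFICATION` dictionary and the helpers `summarize` and `find_duplicates` are hypothetical names introduced here, and the model names are copied as strings from the tables rather than imported from Django apps.

```python
from collections import Counter

# Model names copied from the category tables in this document.
# Hypothetical structure: the project may organize this differently.
CLASSIFICATION = {
    "SYSTEM": [
        "Industry", "IndustrySector",
        "Plan", "CreditPackage", "PaymentMethodConfig",
        "CreditCostConfig",
        "PublishingChannel",
        "AIPrompt", "Strategy", "AuthorProfile",
        "ContentType", "ContentTemplate", "TaxonomyConfig",
        "SystemSetting", "ContentTypeConfig", "NotificationConfig",
        "Group", "Permission",
    ],
    "USER": [
        "Account", "User", "Site", "Sector", "SiteUserAccess",
        "Subscription", "Invoice", "Payment", "AccountPaymentMethod",
        "CreditTransaction", "CreditUsageLog", "PlanLimitUsage",
        "Keywords", "Clusters", "ContentIdeas",
        "Tasks", "Content", "Images",
        "ContentTaxonomy", "ContentTaxonomyRelation",
        "ContentClusterMap", "ContentAttribute",
        "SiteIntegration", "SyncEvent", "PublishingRecord", "DeploymentRecord",
        "APIKey", "WebhookConfig", "AutomationConfig", "AutomationRun",
        "PasswordResetToken", "Session",
        "AITaskLog", "AuditLog", "LogEntry", "TaskResult", "GroupResult",
    ],
    "MIXED": ["SeedKeyword", "IntegrationSettings"],
}


def summarize(classification):
    """Return the number of models in each data category."""
    return {kind: len(names) for kind, names in classification.items()}


def find_duplicates(classification):
    """Return model names that appear in more than one place."""
    counts = Counter(
        name for names in classification.values() for name in names
    )
    return sorted(name for name, n in counts.items() if n > 1)


if __name__ == "__main__":
    print(summarize(CLASSIFICATION))
    print(find_duplicates(CLASSIFICATION))
```

The counts returned by `summarize` match the TOTAL row of the summary table (18 system, 37 user, 2 mixed), and `find_duplicates` guards against a model being listed under both KEEP and CLEAN. A cleanup script could resolve each name to a model class before deleting, but that lookup depends on the project's app labels and is left out here.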