Data Segregation: System vs User Data
Purpose
This document categorizes all models in the Django admin sidebar to identify:
- SYSTEM DATA: Configuration, templates, and settings that must be preserved (pre-configured, production-ready data)
- USER DATA: Account-specific, tenant-specific, or test data that can be cleaned up during the testing phase
1. Accounts & Tenancy
| Model | Type | Description | Clean/Keep |
|---|---|---|---|
| Account | USER DATA | Customer accounts (test accounts during development) | ✅ CLEAN - Remove test accounts |
| User | USER DATA | User profiles linked to accounts | ✅ CLEAN - Remove test users |
| Site | USER DATA | Sites/domains owned by accounts | ✅ CLEAN - Remove test sites |
| Sector | USER DATA | Sectors within sites (account-specific) | ✅ CLEAN - Remove test sectors |
| SiteUserAccess | USER DATA | User permissions per site | ✅ CLEAN - Remove test access records |
Summary: All models are USER DATA - Safe to clean for a fresh production start
2. Global Resources
| Model | Type | Description | Clean/Keep |
|---|---|---|---|
| Industry | SYSTEM DATA | Global industry taxonomy (e.g., Healthcare, Finance, Technology) | ⚠️ KEEP - Pre-configured industries |
| IndustrySector | SYSTEM DATA | Sub-categories within industries (e.g., Cardiology, Investment Banking) | ⚠️ KEEP - Pre-configured sectors |
| SeedKeyword | MIXED DATA | Seed keywords for industries - can be seeded or user-generated | ⚠️ REVIEW - Keep system seeds, remove test seeds |
Summary:
- KEEP: Industry and IndustrySector (global taxonomy)
- REVIEW: SeedKeyword - separate system defaults from test data
3. Plans and Billing
| Model | Type | Description | Clean/Keep |
|---|---|---|---|
| Plan | SYSTEM DATA | Subscription plans (Free, Pro, Enterprise, etc.) | ⚠️ KEEP - Production pricing tiers |
| Subscription | USER DATA | Active subscriptions per account | ✅ CLEAN - Remove test subscriptions |
| Invoice | USER DATA | Generated invoices for accounts | ✅ CLEAN - Remove test invoices |
| Payment | USER DATA | Payment records | ✅ CLEAN - Remove test payments |
| CreditPackage | SYSTEM DATA | Available credit packages for purchase | ⚠️ KEEP - Production credit offerings |
| PaymentMethodConfig | SYSTEM DATA | Supported payment methods (Stripe, PayPal) | ⚠️ KEEP - Production payment configs |
| AccountPaymentMethod | USER DATA | Saved payment methods per account | ✅ CLEAN - Remove test payment methods |
Summary:
- KEEP: Plan, CreditPackage, PaymentMethodConfig (system pricing/config)
- CLEAN: Subscription, Invoice, Payment, AccountPaymentMethod (user transactions)
4. Credits
| Model | Type | Description | Clean/Keep |
|---|---|---|---|
| CreditTransaction | USER DATA | Credit add/subtract transactions | ✅ CLEAN - Remove test transactions |
| CreditUsageLog | USER DATA | Log of credit usage per operation | ✅ CLEAN - Remove test usage logs |
| CreditCostConfig | SYSTEM DATA | Cost configuration per operation type | ⚠️ KEEP - Production cost structure |
| PlanLimitUsage | USER DATA | Usage tracking per account/plan limits | ✅ CLEAN - Remove test usage data |
Summary:
- KEEP: CreditCostConfig (system cost rules)
- CLEAN: All transaction and usage logs (user activity)
5. Content Planning
| Model | Type | Description | Clean/Keep |
|---|---|---|---|
| Keywords | USER DATA | Keywords researched per site/sector | ✅ CLEAN - Remove test keywords |
| Clusters | USER DATA | Content clusters created per site | ✅ CLEAN - Remove test clusters |
| ContentIdeas | USER DATA | Content ideas generated for accounts | ✅ CLEAN - Remove test ideas |
Summary: All models are USER DATA - Safe to clean completely
6. Content Generation
| Model | Type | Description | Clean/Keep |
|---|---|---|---|
| Tasks | USER DATA | Content writing tasks assigned to users | ✅ CLEAN - Remove test tasks |
| Content | USER DATA | Generated content/articles | ✅ CLEAN - Remove test content |
| Images | USER DATA | Generated or uploaded images | ✅ CLEAN - Remove test images |
Summary: All models are USER DATA - Safe to clean completely
7. Taxonomy & Organization
| Model | Type | Description | Clean/Keep |
|---|---|---|---|
| ContentTaxonomy | USER DATA | Custom taxonomies (categories/tags) per site | ✅ CLEAN - Remove test taxonomies |
| ContentTaxonomyRelation | USER DATA | Relationships between content and taxonomies | ✅ CLEAN - Remove test relations |
| ContentClusterMap | USER DATA | Mapping of content to clusters | ✅ CLEAN - Remove test mappings |
| ContentAttribute | USER DATA | Custom attributes for content | ✅ CLEAN - Remove test attributes |
Summary: All models are USER DATA - Safe to clean completely
8. Publishing & Integration
| Model | Type | Description | Clean/Keep |
|---|---|---|---|
| SiteIntegration | USER DATA | WordPress/platform integrations per site | ✅ CLEAN - Remove test integrations |
| SyncEvent | USER DATA | Sync events between IGNY8 and external platforms | ✅ CLEAN - Remove test sync logs |
| PublishingRecord | USER DATA | Records of published content | ✅ CLEAN - Remove test publish records |
| PublishingChannel | SYSTEM DATA | Available publishing channels (WordPress, Ghost, etc.) | ⚠️ KEEP - Production channel configs |
| DeploymentRecord | USER DATA | Deployment history per account | ✅ CLEAN - Remove test deployments |
Summary:
- KEEP: PublishingChannel (system-wide channel definitions)
- CLEAN: All user-specific integration and sync data
9. AI & Automation
| Model | Type | Description | Clean/Keep |
|---|---|---|---|
| IntegrationSettings | MIXED DATA | API keys/settings for OpenAI, etc. | ⚠️ REVIEW - Keep system defaults, remove test configs |
| AIPrompt | SYSTEM DATA | AI prompt templates for content generation | ⚠️ KEEP - Production prompt library |
| Strategy | SYSTEM DATA | Content strategy templates | ⚠️ KEEP - Production strategy templates |
| AuthorProfile | SYSTEM DATA | Author persona templates | ⚠️ KEEP - Production author profiles |
| APIKey | USER DATA | User-generated API keys for platform access | ✅ CLEAN - Remove test API keys |
| WebhookConfig | USER DATA | Webhook configurations per account | ✅ CLEAN - Remove test webhooks |
| AutomationConfig | USER DATA | Automation rules per account/site | ✅ CLEAN - Remove test automations |
| AutomationRun | USER DATA | Execution history of automations | ✅ CLEAN - Remove test run logs |
Summary:
- KEEP: AIPrompt, Strategy, AuthorProfile (system templates)
- REVIEW: IntegrationSettings (separate system vs user API keys)
- CLEAN: APIKey, WebhookConfig, AutomationConfig, AutomationRun (user configs)
10. System Settings
| Model | Type | Description | Clean/Keep |
|---|---|---|---|
| ContentType | SYSTEM DATA | Django ContentTypes (auto-managed) | ⚠️ KEEP - Django core system table |
| ContentTemplate | SYSTEM DATA | Content templates for generation | ⚠️ KEEP - Production templates |
| TaxonomyConfig | SYSTEM DATA | Taxonomy configuration rules | ⚠️ KEEP - Production taxonomy rules |
| SystemSetting | SYSTEM DATA | Global system settings | ⚠️ KEEP - Production system config |
| ContentTypeConfig | SYSTEM DATA | Content type definitions (blog post, landing page, etc.) | ⚠️ KEEP - Production content types |
| NotificationConfig | SYSTEM DATA | Notification templates and rules | ⚠️ KEEP - Production notification configs |
Summary: All models are SYSTEM DATA - Must be kept and properly seeded for production
11. Django Admin
| Model | Type | Description | Clean/Keep |
|---|---|---|---|
| Group | SYSTEM DATA | Permission groups (Admin, Editor, Viewer, etc.) | ⚠️ KEEP - Production role definitions |
| Permission | SYSTEM DATA | Django permissions (auto-managed) | ⚠️ KEEP - Django core system table |
| PasswordResetToken | USER DATA | Password reset tokens (temporary) | ✅ CLEAN - Remove expired tokens |
| Session | USER DATA | User session data | ✅ CLEAN - Remove old sessions |
Summary:
- KEEP: Group, Permission (system access control)
- CLEAN: PasswordResetToken, Session (temporary user data)
12. Tasks & Logging
| Model | Type | Description | Clean/Keep |
|---|---|---|---|
| AITaskLog | USER DATA | Logs of AI operations per account | ✅ CLEAN - Remove test logs |
| AuditLog | USER DATA | Audit trail of user actions | ✅ CLEAN - Remove test audit logs |
| LogEntry | USER DATA | Django admin action logs | ✅ CLEAN - Remove test admin logs |
| TaskResult | USER DATA | Celery task execution results | ✅ CLEAN - Remove test task results |
| GroupResult | USER DATA | Celery group task results | ✅ CLEAN - Remove test group results |
Summary: All models are USER DATA - Safe to clean completely (logs/audit trails)
Summary Table: Data Segregation by Category
| Category | System Data Models | User Data Models | Mixed/Review |
|---|---|---|---|
| Accounts & Tenancy | 0 | 5 | 0 |
| Global Resources | 2 | 0 | 1 |
| Plans and Billing | 3 | 4 | 0 |
| Credits | 1 | 3 | 0 |
| Content Planning | 0 | 3 | 0 |
| Content Generation | 0 | 3 | 0 |
| Taxonomy & Organization | 0 | 4 | 0 |
| Publishing & Integration | 1 | 4 | 0 |
| AI & Automation | 3 | 4 | 1 |
| System Settings | 6 | 0 | 0 |
| Django Admin | 2 | 2 | 0 |
| Tasks & Logging | 0 | 5 | 0 |
| TOTAL | 18 | 37 | 2 |
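The totals in this table can be cross-checked mechanically. The following is a minimal sketch that restates the per-category tables above as a plain Python mapping and recounts them; the names `CATEGORIES` and `totals` are illustrative and not part of the codebase.

```python
from collections import Counter

# Classification restated from the per-category tables in this document.
# "system" = keep, "user" = clean, "mixed" = manual review.
CATEGORIES = {
    "Accounts & Tenancy": {
        "user": ["Account", "User", "Site", "Sector", "SiteUserAccess"],
    },
    "Global Resources": {
        "system": ["Industry", "IndustrySector"],
        "mixed": ["SeedKeyword"],
    },
    "Plans and Billing": {
        "system": ["Plan", "CreditPackage", "PaymentMethodConfig"],
        "user": ["Subscription", "Invoice", "Payment", "AccountPaymentMethod"],
    },
    "Credits": {
        "system": ["CreditCostConfig"],
        "user": ["CreditTransaction", "CreditUsageLog", "PlanLimitUsage"],
    },
    "Content Planning": {"user": ["Keywords", "Clusters", "ContentIdeas"]},
    "Content Generation": {"user": ["Tasks", "Content", "Images"]},
    "Taxonomy & Organization": {
        "user": ["ContentTaxonomy", "ContentTaxonomyRelation",
                 "ContentClusterMap", "ContentAttribute"],
    },
    "Publishing & Integration": {
        "system": ["PublishingChannel"],
        "user": ["SiteIntegration", "SyncEvent", "PublishingRecord",
                 "DeploymentRecord"],
    },
    "AI & Automation": {
        "system": ["AIPrompt", "Strategy", "AuthorProfile"],
        "user": ["APIKey", "WebhookConfig", "AutomationConfig", "AutomationRun"],
        "mixed": ["IntegrationSettings"],
    },
    "System Settings": {
        "system": ["ContentType", "ContentTemplate", "TaxonomyConfig",
                   "SystemSetting", "ContentTypeConfig", "NotificationConfig"],
    },
    "Django Admin": {
        "system": ["Group", "Permission"],
        "user": ["PasswordResetToken", "Session"],
    },
    "Tasks & Logging": {
        "user": ["AITaskLog", "AuditLog", "LogEntry", "TaskResult", "GroupResult"],
    },
}


def totals(categories):
    """Count models per data type across all categories."""
    counts = Counter()
    for groups in categories.values():
        for kind, models in groups.items():
            counts[kind] += len(models)
    return dict(counts)


print(totals(CATEGORIES))
```

Keeping the classification in one data structure like this would also let a cleanup script and a fixture export share a single source of truth, although that is a design suggestion rather than something the project currently does.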
Action Plan: Production Data Preparation
Phase 1: Preserve System Data ⚠️
Models to Keep & Seed Properly:
- Global Taxonomy
  - Industry (pre-populate 10-15 major industries)
  - IndustrySector (pre-populate 100+ sub-sectors)
  - SeedKeyword (system-level seed keywords per industry)
- Pricing & Plans
  - Plan (Free, Starter, Pro, Enterprise tiers)
  - CreditPackage (credit bundles for purchase)
  - PaymentMethodConfig (Stripe, PayPal configs)
  - CreditCostConfig (cost per operation type)
- Publishing Channels
  - PublishingChannel (WordPress, Ghost, Medium, etc.)
- AI & Content Templates
  - AIPrompt (100+ production-ready prompts)
  - Strategy (content strategy templates)
  - AuthorProfile (author persona library)
  - ContentTemplate (article templates)
  - ContentTypeConfig (blog post, landing page, etc.)
- System Configuration
  - SystemSetting (global platform settings)
  - TaxonomyConfig (taxonomy rules)
  - NotificationConfig (email/webhook templates)
- Access Control
  - Group (Admin, Editor, Viewer, Owner roles)
  - Permission (Django-managed)
  - ContentType (Django-managed)
Phase 2: Clean User/Test Data ✅
Models to Truncate/Delete:
- Account Data: Account, User, Site, Sector, SiteUserAccess
- Billing Transactions: Subscription, Invoice, Payment, AccountPaymentMethod, CreditTransaction
- Content Data: Keywords, Clusters, ContentIdeas, Tasks, Content, Images
- Taxonomy Relations: ContentTaxonomy, ContentTaxonomyRelation, ContentClusterMap, ContentAttribute
- Integration Data: SiteIntegration, SyncEvent, PublishingRecord, DeploymentRecord
- User Configs: APIKey, WebhookConfig, AutomationConfig, AutomationRun
- Logs, Usage & Temporary Data: AITaskLog, AuditLog, LogEntry, TaskResult, GroupResult, CreditUsageLog, PlanLimitUsage, PasswordResetToken, Session
Phase 3: Review Mixed Data ⚠️
Models Requiring Manual Review:
- SeedKeyword: Separate system seeds from test data
- IntegrationSettings: Keep system-level API configs, remove test account keys
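How system rows are told apart from test rows depends on the actual schema, which this document does not describe. As a rough sketch only, assume a hypothetical rule that system-level seed keywords have no owning account (`account_id` is `None`); the `SeedRow` type and the `account_id` field are illustrative stand-ins, not the real `SeedKeyword` model.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class SeedRow:
    """Stand-in for a SeedKeyword row (hypothetical fields)."""
    keyword: str
    account_id: Optional[int]  # assumption: None marks a system-level seed


def partition_seeds(rows: List[SeedRow]) -> Tuple[List[SeedRow], List[SeedRow]]:
    """Split rows into (keep, review) using the assumed ownership rule."""
    keep = [row for row in rows if row.account_id is None]
    review = [row for row in rows if row.account_id is not None]
    return keep, review


rows = [
    SeedRow("cardiology clinic", None),   # looks like a system seed
    SeedRow("asdf test keyword", 7),      # tied to an account, needs review
]
keep, review = partition_seeds(rows)
print(len(keep), len(review))
```

If the real model distinguishes seeds some other way (for example a flag or a creator field), only the predicate inside `partition_seeds` would need to change.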
Database Cleanup Commands (Use with Caution)
Safe Cleanup (Logs & Sessions)
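```python
from datetime import timedelta

from django.contrib.admin.models import LogEntry
from django.contrib.sessions.models import Session
from django.utils import timezone
from django_celery_results.models import TaskResult  # assumes django-celery-results
# AITaskLog, CreditUsageLog and PasswordResetToken come from the project's own apps

# Remove old logs (>90 days)
AITaskLog.objects.filter(created_at__lt=timezone.now() - timedelta(days=90)).delete()
CreditUsageLog.objects.filter(created_at__lt=timezone.now() - timedelta(days=90)).delete()
LogEntry.objects.filter(action_time__lt=timezone.now() - timedelta(days=90)).delete()
# Remove expired sessions and tokens
Session.objects.filter(expire_date__lt=timezone.now()).delete()
PasswordResetToken.objects.filter(expires_at__lt=timezone.now()).delete()
# Remove task results older than 30 days
TaskResult.objects.filter(date_done__lt=timezone.now() - timedelta(days=30)).delete()
```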
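The retention windows above could be collected in one table so the cutoffs are not repeated per model. This is a minimal sketch using only the standard library; `RETENTION` and `stale_filter` are illustrative names, and the field names and day counts are taken from the commands above.

```python
from datetime import datetime, timedelta, timezone

# model name -> (timestamp field, retention in days), as used in the commands above
RETENTION = {
    "AITaskLog": ("created_at", 90),
    "CreditUsageLog": ("created_at", 90),
    "LogEntry": ("action_time", 90),
    "TaskResult": ("date_done", 30),
}


def stale_filter(model_name, now=None):
    """Build the keyword arguments for a `<field>__lt=<cutoff>` lookup."""
    field, days = RETENTION[model_name]
    # Note: this is datetime.timezone, not django.utils.timezone.
    now = now or datetime.now(timezone.utc)
    return {f"{field}__lt": now - timedelta(days=days)}


print(stale_filter("TaskResult", now=datetime(2025, 12, 20, tzinfo=timezone.utc)))
```

In a Django shell the result would be unpacked into the queryset call, for example `TaskResult.objects.filter(**stale_filter("TaskResult")).delete()`.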
Full Test Data Cleanup (Development/Staging Only)
```python
# WARNING: Only run in development/staging environments
# This will delete ALL user-generated data

# User data
Account.objects.all().delete()  # Cascades to most user data
User.objects.filter(is_superuser=False).delete()

# Remaining user data
SiteIntegration.objects.all().delete()
AutomationConfig.objects.all().delete()
APIKey.objects.all().delete()
WebhookConfig.objects.all().delete()

# Logs and history
AITaskLog.objects.all().delete()
AuditLog.objects.all().delete()
LogEntry.objects.all().delete()
TaskResult.objects.all().delete()
GroupResult.objects.all().delete()
```
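Because the full cleanup is destructive, it may be worth refusing to run it unless the environment is explicitly marked as non-production. The sketch below assumes a hypothetical `APP_ENV` environment variable; the project may expose its environment differently (for example through a Django setting), so treat the variable name and the allowed values as placeholders.

```python
import os

# Environments in which the full cleanup is permitted (per the warning above).
ALLOWED_ENVIRONMENTS = {"development", "staging"}


class UnsafeEnvironmentError(RuntimeError):
    """Raised when a destructive cleanup is attempted outside dev/staging."""


def assert_cleanup_allowed(environ=None):
    """Return the normalized environment name, or raise if cleanup is not allowed.

    `APP_ENV` is a hypothetical variable name used for illustration only.
    """
    environ = os.environ if environ is None else environ
    env = environ.get("APP_ENV", "").strip().lower()
    if env not in ALLOWED_ENVIRONMENTS:
        raise UnsafeEnvironmentError(
            f"Refusing to delete user data: APP_ENV={env!r} is not one of "
            f"{sorted(ALLOWED_ENVIRONMENTS)}"
        )
    return env


print(assert_cleanup_allowed({"APP_ENV": "staging"}))
```

Calling `assert_cleanup_allowed()` as the first statement of the cleanup script means a missing or unexpected value fails closed instead of deleting data.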
Verify System Data Exists
```python
# Check system data is properly seeded
print(f"Industries: {Industry.objects.count()}")
print(f"Plans: {Plan.objects.count()}")
print(f"AI Prompts: {AIPrompt.objects.count()}")
print(f"Strategies: {Strategy.objects.count()}")
print(f"Content Templates: {ContentTemplate.objects.count()}")
print(f"Publishing Channels: {PublishingChannel.objects.count()}")
print(f"Groups: {Group.objects.count()}")
```
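Printing counts still leaves the pass/fail judgment to the reader. As a sketch, the counts could be compared against minimums: the thresholds for `Industry` (10) and `AIPrompt` (100) come from Phase 1 of this document, while the floor of 1 for the remaining models is an assumption. `MINIMUMS` and `find_underseeded` are illustrative names.

```python
# Minimum expected row counts per system model.
# Industry and AIPrompt thresholds are from Phase 1; the rest are assumed floors.
MINIMUMS = {
    "Industry": 10,
    "Plan": 1,
    "AIPrompt": 100,
    "Strategy": 1,
    "ContentTemplate": 1,
    "PublishingChannel": 1,
    "Group": 1,
}


def find_underseeded(counts, minimums=MINIMUMS):
    """Return {model: (actual, required)} for models below their minimum."""
    return {
        name: (counts.get(name, 0), required)
        for name, required in minimums.items()
        if counts.get(name, 0) < required
    }


# Example with made-up counts; in practice each value would be Model.objects.count().
example_counts = {
    "Industry": 12, "Plan": 4, "AIPrompt": 40, "Strategy": 6,
    "ContentTemplate": 9, "PublishingChannel": 3, "Group": 0,
}
print(find_underseeded(example_counts))
```

An empty result means every checked model meets its threshold; anything else lists the models that still need seeding.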
Recommendations
Before Production Launch:
- Export System Data: Export all SYSTEM DATA models to fixtures for reproducibility

  ```bash
  python manage.py dumpdata igny8_core_auth.Industry > fixtures/industries.json
  python manage.py dumpdata igny8_core_auth.Plan > fixtures/plans.json
  python manage.py dumpdata system.AIPrompt > fixtures/prompts.json
  # ... repeat for all system models
  ```

- Create Seed Script: Create management command to populate fresh database with system data

  ```bash
  python manage.py seed_system_data
  ```

- Database Snapshot: Take snapshot after system data is seeded, before any user data
- Separate Databases: Consider separate staging database with full test data vs production with clean start
- Data Migration Plan:
  - If migrating from old system: Only migrate Account, User, Content, and critical user data
  - Leave test data behind in old system
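The export commands in the first recommendation follow one pattern, so they could be generated from a mapping instead of typed by hand. The sketch below covers only the three model labels this document actually names; the app labels of the remaining system models are not given here and would have to be filled in. `SYSTEM_FIXTURES` and `dumpdata_commands` are illustrative names.

```python
# app_label.ModelName -> fixture path, limited to the labels named in this document.
SYSTEM_FIXTURES = {
    "igny8_core_auth.Industry": "fixtures/industries.json",
    "igny8_core_auth.Plan": "fixtures/plans.json",
    "system.AIPrompt": "fixtures/prompts.json",
}


def dumpdata_commands(fixtures):
    """Build one `manage.py dumpdata` shell command per model label."""
    return [
        f"python manage.py dumpdata {label} > {path}"
        for label, path in fixtures.items()
    ]


for command in dumpdata_commands(SYSTEM_FIXTURES):
    print(command)
```

The same mapping could later drive the seed step as well, since each fixture written by `dumpdata` can be loaded with `python manage.py loaddata <path>`.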
Next Steps
- ✅ Review this document and confirm data segregation logic
- ⚠️ Create fixtures/seeds for all 18 SYSTEM DATA models
- ⚠️ Review 2 MIXED DATA models (SeedKeyword, IntegrationSettings)
- ✅ Create cleanup script for 37 USER DATA models
- ✅ Test cleanup script in staging environment
- ✅ Execute cleanup before production launch
Generated: December 20, 2025
Purpose: Production data preparation and test data cleanup