# Data Segregation: System vs User Data

## Purpose
This document categorizes all models in the Django admin sidebar to identify:
- **SYSTEM DATA**: Configuration, templates, and settings that must be preserved (pre-configured, production-ready data)
- **USER DATA**: Account-specific, tenant-specific, or test data that can be cleaned up during the testing phase

Models that hold both kinds of records are marked **MIXED DATA** and need a manual review.

---

## 1. Accounts & Tenancy

| Model | Type | Description | Clean/Keep |
|-------|------|-------------|------------|
| Account | USER DATA | Customer accounts (test accounts during development) | ✅ CLEAN - Remove test accounts |
| User | USER DATA | User profiles linked to accounts | ✅ CLEAN - Remove test users |
| Site | USER DATA | Sites/domains owned by accounts | ✅ CLEAN - Remove test sites |
| Sector | USER DATA | Sectors within sites (account-specific) | ✅ CLEAN - Remove test sectors |
| SiteUserAccess | USER DATA | User permissions per site | ✅ CLEAN - Remove test access records |

**Summary**: All models are USER DATA - Safe to clean for fresh production start

---

## 2. Global Resources

| Model | Type | Description | Clean/Keep |
|-------|------|-------------|------------|
| Industry | SYSTEM DATA | Global industry taxonomy (e.g., Healthcare, Finance, Technology) | ⚠️ KEEP - Pre-configured industries |
| IndustrySector | SYSTEM DATA | Sub-categories within industries (e.g., Cardiology, Investment Banking) | ⚠️ KEEP - Pre-configured sectors |
| SeedKeyword | MIXED DATA | Seed keywords for industries - can be seeded or user-generated | ⚠️ REVIEW - Keep system seeds, remove test seeds |

**Summary**:
- **KEEP**: Industry and IndustrySector (global taxonomy)
- **REVIEW**: SeedKeyword - separate system defaults from test data

---

## 3. Plans and Billing

| Model | Type | Description | Clean/Keep |
|-------|------|-------------|------------|
| Plan | SYSTEM DATA | Subscription plans (Free, Pro, Enterprise, etc.) | ⚠️ KEEP - Production pricing tiers |
| Subscription | USER DATA | Active subscriptions per account | ✅ CLEAN - Remove test subscriptions |
| Invoice | USER DATA | Generated invoices for accounts | ✅ CLEAN - Remove test invoices |
| Payment | USER DATA | Payment records | ✅ CLEAN - Remove test payments |
| CreditPackage | SYSTEM DATA | Available credit packages for purchase | ⚠️ KEEP - Production credit offerings |
| PaymentMethodConfig | SYSTEM DATA | Supported payment methods (Stripe, PayPal) | ⚠️ KEEP - Production payment configs |
| AccountPaymentMethod | USER DATA | Saved payment methods per account | ✅ CLEAN - Remove test payment methods |

**Summary**:
- **KEEP**: Plan, CreditPackage, PaymentMethodConfig (system pricing/config)
- **CLEAN**: Subscription, Invoice, Payment, AccountPaymentMethod (user transactions)

---

## 4. Credits

| Model | Type | Description | Clean/Keep |
|-------|------|-------------|------------|
| CreditTransaction | USER DATA | Credit add/subtract transactions | ✅ CLEAN - Remove test transactions |
| CreditUsageLog | USER DATA | Log of credit usage per operation | ✅ CLEAN - Remove test usage logs |
| CreditCostConfig | SYSTEM DATA | Cost configuration per operation type | ⚠️ KEEP - Production cost structure |
| PlanLimitUsage | USER DATA | Usage tracking per account/plan limits | ✅ CLEAN - Remove test usage data |

**Summary**:
- **KEEP**: CreditCostConfig (system cost rules)
- **CLEAN**: All transaction and usage logs (user activity)

---

## 5. Content Planning

| Model | Type | Description | Clean/Keep |
|-------|------|-------------|------------|
| Keywords | USER DATA | Keywords researched per site/sector | ✅ CLEAN - Remove test keywords |
| Clusters | USER DATA | Content clusters created per site | ✅ CLEAN - Remove test clusters |
| ContentIdeas | USER DATA | Content ideas generated for accounts | ✅ CLEAN - Remove test ideas |

**Summary**: All models are USER DATA - Safe to clean completely

---

## 6. Content Generation

| Model | Type | Description | Clean/Keep |
|-------|------|-------------|------------|
| Tasks | USER DATA | Content writing tasks assigned to users | ✅ CLEAN - Remove test tasks |
| Content | USER DATA | Generated content/articles | ✅ CLEAN - Remove test content |
| Images | USER DATA | Generated or uploaded images | ✅ CLEAN - Remove test images |

**Summary**: All models are USER DATA - Safe to clean completely

---

## 7. Taxonomy & Organization

| Model | Type | Description | Clean/Keep |
|-------|------|-------------|------------|
| ContentTaxonomy | USER DATA | Custom taxonomies (categories/tags) per site | ✅ CLEAN - Remove test taxonomies |
| ContentTaxonomyRelation | USER DATA | Relationships between content and taxonomies | ✅ CLEAN - Remove test relations |
| ContentClusterMap | USER DATA | Mapping of content to clusters | ✅ CLEAN - Remove test mappings |
| ContentAttribute | USER DATA | Custom attributes for content | ✅ CLEAN - Remove test attributes |

**Summary**: All models are USER DATA - Safe to clean completely

---

## 8. Publishing & Integration

| Model | Type | Description | Clean/Keep |
|-------|------|-------------|------------|
| SiteIntegration | USER DATA | WordPress/platform integrations per site | ✅ CLEAN - Remove test integrations |
| SyncEvent | USER DATA | Sync events between IGNY8 and external platforms | ✅ CLEAN - Remove test sync logs |
| PublishingRecord | USER DATA | Records of published content | ✅ CLEAN - Remove test publish records |
| PublishingChannel | SYSTEM DATA | Available publishing channels (WordPress, Ghost, etc.) | ⚠️ KEEP - Production channel configs |
| DeploymentRecord | USER DATA | Deployment history per account | ✅ CLEAN - Remove test deployments |

**Summary**:
- **KEEP**: PublishingChannel (system-wide channel definitions)
- **CLEAN**: All user-specific integration and sync data

---

## 9. AI & Automation

| Model | Type | Description | Clean/Keep |
|-------|------|-------------|------------|
| IntegrationSettings | MIXED DATA | API keys/settings for OpenAI, etc. | ⚠️ REVIEW - Keep system defaults, remove test configs |
| AIPrompt | SYSTEM DATA | AI prompt templates for content generation | ⚠️ KEEP - Production prompt library |
| Strategy | SYSTEM DATA | Content strategy templates | ⚠️ KEEP - Production strategy templates |
| AuthorProfile | SYSTEM DATA | Author persona templates | ⚠️ KEEP - Production author profiles |
| APIKey | USER DATA | User-generated API keys for platform access | ✅ CLEAN - Remove test API keys |
| WebhookConfig | USER DATA | Webhook configurations per account | ✅ CLEAN - Remove test webhooks |
| AutomationConfig | USER DATA | Automation rules per account/site | ✅ CLEAN - Remove test automations |
| AutomationRun | USER DATA | Execution history of automations | ✅ CLEAN - Remove test run logs |

**Summary**:
- **KEEP**: AIPrompt, Strategy, AuthorProfile (system templates)
- **REVIEW**: IntegrationSettings (separate system vs user API keys)
- **CLEAN**: APIKey, WebhookConfig, AutomationConfig, AutomationRun (user configs)

---

## 10. System Settings

| Model | Type | Description | Clean/Keep |
|-------|------|-------------|------------|
| ContentType | SYSTEM DATA | Django ContentTypes (auto-managed) | ⚠️ KEEP - Django core system table |
| ContentTemplate | SYSTEM DATA | Content templates for generation | ⚠️ KEEP - Production templates |
| TaxonomyConfig | SYSTEM DATA | Taxonomy configuration rules | ⚠️ KEEP - Production taxonomy rules |
| SystemSetting | SYSTEM DATA | Global system settings | ⚠️ KEEP - Production system config |
| ContentTypeConfig | SYSTEM DATA | Content type definitions (blog post, landing page, etc.) | ⚠️ KEEP - Production content types |
| NotificationConfig | SYSTEM DATA | Notification templates and rules | ⚠️ KEEP - Production notification configs |

**Summary**: All models are SYSTEM DATA - Must be kept and properly seeded for production

---

## 11. Django Admin

| Model | Type | Description | Clean/Keep |
|-------|------|-------------|------------|
| Group | SYSTEM DATA | Permission groups (Admin, Editor, Viewer, etc.) | ⚠️ KEEP - Production role definitions |
| Permission | SYSTEM DATA | Django permissions (auto-managed) | ⚠️ KEEP - Django core system table |
| PasswordResetToken | USER DATA | Password reset tokens (temporary) | ✅ CLEAN - Remove expired tokens |
| Session | USER DATA | User session data | ✅ CLEAN - Remove old sessions |

**Summary**:
- **KEEP**: Group, Permission (system access control)
- **CLEAN**: PasswordResetToken, Session (temporary user data)

---

## 12. Tasks & Logging

| Model | Type | Description | Clean/Keep |
|-------|------|-------------|------------|
| AITaskLog | USER DATA | Logs of AI operations per account | ✅ CLEAN - Remove test logs |
| AuditLog | USER DATA | Audit trail of user actions | ✅ CLEAN - Remove test audit logs |
| LogEntry | USER DATA | Django admin action logs | ✅ CLEAN - Remove test admin logs |
| TaskResult | USER DATA | Celery task execution results | ✅ CLEAN - Remove test task results |
| GroupResult | USER DATA | Celery group task results | ✅ CLEAN - Remove test group results |

**Summary**: All models are USER DATA - Safe to clean completely (logs/audit trails)

---

## Summary Table: Data Segregation by Category

| Category | System Data Models | User Data Models | Mixed/Review |
|----------|-------------------|------------------|--------------|
| **Accounts & Tenancy** | 0 | 5 | 0 |
| **Global Resources** | 2 | 0 | 1 |
| **Plans and Billing** | 3 | 4 | 0 |
| **Credits** | 1 | 3 | 0 |
| **Content Planning** | 0 | 3 | 0 |
| **Content Generation** | 0 | 3 | 0 |
| **Taxonomy & Organization** | 0 | 4 | 0 |
| **Publishing & Integration** | 1 | 4 | 0 |
| **AI & Automation** | 3 | 4 | 1 |
| **System Settings** | 6 | 0 | 0 |
| **Django Admin** | 2 | 2 | 0 |
| **Tasks & Logging** | 0 | 5 | 0 |
| **TOTAL** | **18** | **37** | **2** |

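
The counts in the summary table can be recomputed from the per-category tables, which helps keep the two in sync when a model is added or reclassified. The sketch below is only an illustration: the `REGISTRY` dictionary and the `tally` helper are hypothetical names, not part of the project, and the model lists are copied by hand from the tables in this document.

```python
# Classification copied from the per-category tables in this document.
# REGISTRY and tally() are illustrative names, not project code.
TYPES = ("SYSTEM", "USER", "MIXED")

REGISTRY = {
    "Accounts & Tenancy": {
        "USER": ["Account", "User", "Site", "Sector", "SiteUserAccess"],
    },
    "Global Resources": {
        "SYSTEM": ["Industry", "IndustrySector"],
        "MIXED": ["SeedKeyword"],
    },
    "Plans and Billing": {
        "SYSTEM": ["Plan", "CreditPackage", "PaymentMethodConfig"],
        "USER": ["Subscription", "Invoice", "Payment", "AccountPaymentMethod"],
    },
    "Credits": {
        "SYSTEM": ["CreditCostConfig"],
        "USER": ["CreditTransaction", "CreditUsageLog", "PlanLimitUsage"],
    },
    "Content Planning": {"USER": ["Keywords", "Clusters", "ContentIdeas"]},
    "Content Generation": {"USER": ["Tasks", "Content", "Images"]},
    "Taxonomy & Organization": {
        "USER": ["ContentTaxonomy", "ContentTaxonomyRelation",
                 "ContentClusterMap", "ContentAttribute"],
    },
    "Publishing & Integration": {
        "SYSTEM": ["PublishingChannel"],
        "USER": ["SiteIntegration", "SyncEvent", "PublishingRecord",
                 "DeploymentRecord"],
    },
    "AI & Automation": {
        "SYSTEM": ["AIPrompt", "Strategy", "AuthorProfile"],
        "USER": ["APIKey", "WebhookConfig", "AutomationConfig", "AutomationRun"],
        "MIXED": ["IntegrationSettings"],
    },
    "System Settings": {
        "SYSTEM": ["ContentType", "ContentTemplate", "TaxonomyConfig",
                   "SystemSetting", "ContentTypeConfig", "NotificationConfig"],
    },
    "Django Admin": {
        "SYSTEM": ["Group", "Permission"],
        "USER": ["PasswordResetToken", "Session"],
    },
    "Tasks & Logging": {
        "USER": ["AITaskLog", "AuditLog", "LogEntry", "TaskResult", "GroupResult"],
    },
}


def tally(registry):
    """Return per-category (system, user, mixed) counts and the column totals."""
    rows = {
        category: tuple(len(groups.get(kind, [])) for kind in TYPES)
        for category, groups in registry.items()
    }
    total = tuple(sum(column) for column in zip(*rows.values()))
    return rows, total


rows, total = tally(REGISTRY)
for category, (system, user, mixed) in rows.items():
    print(f"{category}: system={system} user={user} mixed={mixed}")
print(f"TOTAL: system={total[0]} user={total[1]} mixed={total[2]}")
```
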

---

## Action Plan: Production Data Preparation

### Phase 1: Preserve System Data ⚠️
**Models to Keep & Seed Properly:**

1. **Global Taxonomy**
   - Industry (pre-populate 10-15 major industries)
   - IndustrySector (pre-populate 100+ sub-sectors)
   - SeedKeyword (system-level seed keywords per industry)

2. **Pricing & Plans**
   - Plan (Free, Starter, Pro, Enterprise tiers)
   - CreditPackage (credit bundles for purchase)
   - PaymentMethodConfig (Stripe, PayPal configs)
   - CreditCostConfig (cost per operation type)

3. **Publishing Channels**
   - PublishingChannel (WordPress, Ghost, Medium, etc.)

4. **AI & Content Templates**
   - AIPrompt (100+ production-ready prompts)
   - Strategy (content strategy templates)
   - AuthorProfile (author persona library)
   - ContentTemplate (article templates)
   - ContentTypeConfig (blog post, landing page, etc.)

5. **System Configuration**
   - SystemSetting (global platform settings)
   - TaxonomyConfig (taxonomy rules)
   - NotificationConfig (email/webhook templates)

6. **Access Control**
   - Group (Admin, Editor, Viewer, Owner roles)
   - Permission (Django-managed)
   - ContentType (Django-managed)

### Phase 2: Clean User/Test Data ✅
**Models to Truncate/Delete:**

1. **Account Data**: Account, User, Site, Sector, SiteUserAccess
2. **Billing Transactions**: Subscription, Invoice, Payment, AccountPaymentMethod, CreditTransaction
3. **Content Data**: Keywords, Clusters, ContentIdeas, Tasks, Content, Images
4. **Taxonomy Relations**: ContentTaxonomy, ContentTaxonomyRelation, ContentClusterMap, ContentAttribute
5. **Integration Data**: SiteIntegration, SyncEvent, PublishingRecord, DeploymentRecord
6. **User Configs**: APIKey, WebhookConfig, AutomationConfig, AutomationRun
7. **Logs**: AITaskLog, AuditLog, LogEntry, TaskResult, GroupResult, CreditUsageLog, PlanLimitUsage, PasswordResetToken, Session
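
When models are truncated or deleted one by one, the order matters: rows that reference another table should go before the rows they point to. A minimal sketch of deriving such an order follows. The `REFERENCES` map is an assumption inferred from the table descriptions ("Sites/domains owned by accounts", "Sectors within sites", and so on), not read from the real models, and `deletion_order` is a hypothetical helper.

```python
from graphlib import TopologicalSorter

# ASSUMPTION: each model maps to the models it references through a foreign key.
# The directions are guessed from the descriptions in this document; check the
# actual model definitions before relying on them.
REFERENCES = {
    "User": {"Account"},
    "Site": {"Account"},
    "Sector": {"Site"},
    "SiteUserAccess": {"Site", "User"},
    "Keywords": {"Site", "Sector"},
    "Content": {"Site"},
}


def deletion_order(references):
    """Return model names ordered so that dependents come before their targets."""
    # static_order() yields referenced models first; reversing puts dependents first.
    referenced_first = list(TopologicalSorter(references).static_order())
    return list(reversed(referenced_first))


order = deletion_order(REFERENCES)
print(order)
```

With `on_delete=CASCADE`, Django's own `delete()` already removes dependent rows, so an explicit order mostly matters for `PROTECT` relations or raw `TRUNCATE` statements.
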

### Phase 3: Review Mixed Data ⚠️
**Models Requiring Manual Review:**

1. **SeedKeyword**: Separate system seeds from test data
2. **IntegrationSettings**: Keep system-level API configs, remove test account keys
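
The review of a mixed model comes down to partitioning its rows into "keep" and "remove". The sketch below shows the idea on plain Python objects. It assumes a marker that tells system rows from account-created rows; the `created_by_account` field is hypothetical and may not exist on the real `SeedKeyword` model, in which case another criterion (for example a creation date before testing began) would be needed.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SeedKeywordRow:
    keyword: str
    industry: str
    # HYPOTHETICAL field: None means the row shipped with the system seeds.
    created_by_account: Optional[int]


def split_seed_keywords(rows):
    """Partition rows into (system seeds to keep, test rows to remove)."""
    keep = [row for row in rows if row.created_by_account is None]
    remove = [row for row in rows if row.created_by_account is not None]
    return keep, remove


# Made-up sample rows for illustration only.
sample = [
    SeedKeywordRow("telehealth", "Healthcare", None),
    SeedKeywordRow("test keyword 123", "Healthcare", 42),
    SeedKeywordRow("index funds", "Finance", None),
]
keep, remove = split_seed_keywords(sample)
print("keep:", [row.keyword for row in keep])
print("remove:", [row.keyword for row in remove])
```
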

---

## Database Cleanup Commands (Use with Caution)

### Safe Cleanup (Logs & Sessions)
```python
from datetime import timedelta

from django.contrib.admin.models import LogEntry
from django.contrib.sessions.models import Session
from django.utils import timezone
# Also import AITaskLog, CreditUsageLog, PasswordResetToken and TaskResult;
# their module paths depend on the project and its Celery result backend.

now = timezone.now()

# Remove old logs (>90 days)
AITaskLog.objects.filter(created_at__lt=now - timedelta(days=90)).delete()
CreditUsageLog.objects.filter(created_at__lt=now - timedelta(days=90)).delete()
LogEntry.objects.filter(action_time__lt=now - timedelta(days=90)).delete()

# Remove expired sessions and tokens
Session.objects.filter(expire_date__lt=now).delete()
PasswordResetToken.objects.filter(expires_at__lt=now).delete()

# Remove old task results (>30 days)
TaskResult.objects.filter(date_done__lt=now - timedelta(days=30)).delete()
```
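
The retention windows above (90 days for logs, 30 days for task results) can be kept in one table so that the cutoffs are computed in a single place. The following is a sketch only: `RETENTION` and `cutoff_filters` are hypothetical names, and the code uses the standard library's `datetime.timezone` so it runs without Django.

```python
from datetime import datetime, timedelta, timezone

# (model name, date field, retention in days), taken from the cleanup block above.
RETENTION = [
    ("AITaskLog", "created_at", 90),
    ("CreditUsageLog", "created_at", 90),
    ("LogEntry", "action_time", 90),
    ("TaskResult", "date_done", 30),
]


def cutoff_filters(now):
    """Build the filter keyword arguments each cleanup query would use."""
    return {
        model: {f"{field}__lt": now - timedelta(days=days)}
        for model, field, days in RETENTION
    }


# Fixed timestamp so the example is reproducible.
now = datetime(2025, 12, 20, tzinfo=timezone.utc)
filters = cutoff_filters(now)
for model, kwargs in filters.items():
    print(model, kwargs)
```

In a Django shell, each entry could be passed as `Model.objects.filter(**kwargs).count()` first, as a dry run, before calling `.delete()`.
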

### Full Test Data Cleanup (Development/Staging Only)
```python
# WARNING: Only run in development/staging environments
# This will delete ALL user-generated data
# (Imports omitted: model module paths depend on the project.)

# User data
# Check the on_delete rules first: if User has a CASCADE foreign key to Account,
# the next line also deletes superusers that belong to an account.
Account.objects.all().delete()  # Cascades to most user data
User.objects.filter(is_superuser=False).delete()

# Remaining user data
SiteIntegration.objects.all().delete()
AutomationConfig.objects.all().delete()
APIKey.objects.all().delete()
WebhookConfig.objects.all().delete()

# Logs and history
AITaskLog.objects.all().delete()
AuditLog.objects.all().delete()
LogEntry.objects.all().delete()
TaskResult.objects.all().delete()
GroupResult.objects.all().delete()
```
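
The warning in the block above is only a comment, so nothing stops the script from running against production. One option is a guard that fails closed unless the environment is explicitly marked as development or staging. This is a minimal sketch under stated assumptions: the `APP_ENV` variable name, the `UnsafeEnvironmentError` class, and the `ensure_cleanup_allowed` helper are all hypothetical.

```python
import os

ALLOWED_ENVIRONMENTS = {"development", "staging"}


class UnsafeEnvironmentError(RuntimeError):
    """Raised when a destructive cleanup is attempted outside dev/staging."""


def ensure_cleanup_allowed(environ=os.environ):
    """Return the environment name, or raise if cleanup must not run here."""
    # ASSUMPTION: the deployment sets APP_ENV; an unset value is treated as unsafe.
    env = environ.get("APP_ENV", "").strip().lower()
    if env not in ALLOWED_ENVIRONMENTS:
        raise UnsafeEnvironmentError(
            f"Refusing to delete user data: APP_ENV={env or '<unset>'}"
        )
    return env


# Example with an explicit mapping instead of the real process environment.
print(ensure_cleanup_allowed({"APP_ENV": "staging"}))
```

Calling `ensure_cleanup_allowed()` as the first statement of the cleanup script would make the staging-only rule enforceable rather than advisory.
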

### Verify System Data Exists
```python
# Check system data is properly seeded
print(f"Industries: {Industry.objects.count()}")
print(f"Plans: {Plan.objects.count()}")
print(f"AI Prompts: {AIPrompt.objects.count()}")
print(f"Strategies: {Strategy.objects.count()}")
print(f"Content Templates: {ContentTemplate.objects.count()}")
print(f"Publishing Channels: {PublishingChannel.objects.count()}")
print(f"Groups: {Group.objects.count()}")
```
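
Printed counts still have to be read by a person. A small comparison against expected minimums turns the check into a pass/fail result. The sketch below takes its thresholds from Phase 1 of this document (10-15 industries, 100+ sub-sectors, 100+ prompts, four plan tiers, four role groups); the `MINIMUMS` table and `under_seeded` helper are hypothetical names, and the sample counts are invented for illustration.

```python
# Thresholds derived from the Phase 1 list in this document.
MINIMUMS = {
    "Industry": 10,
    "IndustrySector": 100,
    "AIPrompt": 100,
    "Plan": 4,
    "Group": 4,
}


def under_seeded(counts, minimums=MINIMUMS):
    """Return {model: (actual, required)} for every model below its minimum."""
    return {
        model: (counts.get(model, 0), required)
        for model, required in minimums.items()
        if counts.get(model, 0) < required
    }


# Invented counts; in practice these would come from Model.objects.count().
sample_counts = {"Industry": 12, "IndustrySector": 87, "AIPrompt": 140, "Plan": 4}
problems = under_seeded(sample_counts)
for model, (actual, required) in problems.items():
    print(f"{model}: found {actual}, expected at least {required}")
```
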

---

## Recommendations

### Before Production Launch:

1. **Export System Data**: Export all SYSTEM DATA models to fixtures for reproducibility
   ```bash
   python manage.py dumpdata igny8_core_auth.Industry > fixtures/industries.json
   python manage.py dumpdata igny8_core_auth.Plan > fixtures/plans.json
   python manage.py dumpdata system.AIPrompt > fixtures/prompts.json
   # ... repeat for all system models
   ```

2. **Create Seed Script**: Create a management command that populates a fresh database with system data
   ```bash
   python manage.py seed_system_data
   ```

3. **Database Snapshot**: Take a snapshot after system data is seeded and before any user data is created

4. **Separate Databases**: Consider a separate staging database with full test data, and a production database with a clean start

5. **Data Migration Plan**:
   - If migrating from an old system: Only migrate Account, User, Content, and critical user data
   - Leave test data behind in the old system
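
The export in step 1 repeats one command per model, which is easy to get out of sync with the list of system models. A sketch of generating the commands from a single list is shown here. Only the three model labels that appear in this document are used; the `dumpdata_command` helper and the lowercase file naming are assumptions (the examples above use plural file names such as `industries.json`), while `--indent` and `--output` are standard `dumpdata` options.

```python
import shlex

# Labels as they appear in the export commands in this document.
SYSTEM_MODELS = [
    "igny8_core_auth.Industry",
    "igny8_core_auth.Plan",
    "system.AIPrompt",
]


def dumpdata_command(label, fixtures_dir="fixtures"):
    """Build a dumpdata command line for one app_label.ModelName label."""
    model_name = label.split(".", 1)[1]
    # ASSUMPTION: one fixture file per model, named after the lowercased model.
    output_path = f"{fixtures_dir}/{model_name.lower()}.json"
    args = ["python", "manage.py", "dumpdata", label,
            "--indent", "2", "--output", output_path]
    return shlex.join(args)


commands = [dumpdata_command(label) for label in SYSTEM_MODELS]
for command in commands:
    print(command)
```
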

---

## Next Steps

1. ✅ Review this document and confirm data segregation logic
2. ⚠️ Create fixtures/seeds for all 18 SYSTEM DATA models
3. ⚠️ Review 2 MIXED DATA models (SeedKeyword, IntegrationSettings)
4. ✅ Create cleanup script for 37 USER DATA models
5. ✅ Test cleanup script in staging environment
6. ✅ Execute cleanup before production launch

---

*Generated: December 20, 2025*
*Purpose: Production data preparation and test data cleanup*