igny8/KW_DB/management/README.md
2026-01-13 12:00:16 +00:00


# Keyword Database Management Scripts
⚠️ **IMPORTANT: Import scripts have been moved to `/data/app/igny8/backend/scripts/`**
This folder contains the keyword data organized by Industry/Sector structure.
## ✅ Successfully Imported
**Date**: January 13, 2026
**Total Keywords**: 6,470 keywords from 31 CSV files
**Industries**: HealthCare Medical
**Sectors**: Physiotherapy Rehabilitation, Massage & Therapy, Relaxation Devices
## 📁 Actual Script Locations
The import scripts are located in:
```
/data/app/igny8/backend/scripts/
├── README.md # Full documentation
├── IMPORT_STATUS.md # Current import status
├── import_seed_keywords_single.py # Import single CSV file (testing)
└── import_all_seed_keywords.py # Import all CSVs from folder structure
```
## Folder Structure
```
KW_DB/
├── management/
│   └── README.md                  # This file (redirects to actual scripts)
└── [Industry]/[Sector]/*.csv      # Keyword data files
```
## Data Structure
```
/KW_DB/
└── HealthCare_Medical/                  → Industry
    └── Physiotherapy_Rehabilitation/    → Sector
        └── *.csv files                  → Keywords
```
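The mapping from folders to industry and sector names can be sketched as follows. This is a minimal illustration, not the import script's actual code: the helper `iter_keyword_csvs` is hypothetical, and the underscore-to-space conversion is an assumption inferred from the `HealthCare_Medical` folder and the "HealthCare Medical" industry name used in this README.

```python
from pathlib import Path


def iter_keyword_csvs(base_path):
    """Yield (industry, sector, csv_path) for every [Industry]/[Sector]/*.csv file.

    Hypothetical helper: folder names become display names by replacing
    underscores with spaces (an assumption, not confirmed by the scripts).
    """
    base = Path(base_path)
    for csv_path in sorted(base.glob("*/*/*.csv")):
        sector_dir = csv_path.parent
        industry_dir = sector_dir.parent
        if industry_dir.name == "management":
            continue  # documentation folder, holds no keyword data
        yield (
            industry_dir.name.replace("_", " "),
            sector_dir.name.replace("_", " "),
            csv_path,
        )
```

Calling `list(iter_keyword_csvs("/data/app/igny8/KW_DB"))` would list each CSV together with the industry and sector it belongs to, which is a quick way to preview what a full import would cover.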
## 🚀 Quick Start - Import Commands
### Import Single CSV File (Testing)
```bash
cd /data/app/igny8
docker compose -f docker-compose.app.yml exec igny8_backend \
  python3 /app/scripts/import_seed_keywords_single.py \
  --csv /data/app/igny8/KW_DB/HealthCare_Medical/Physiotherapy_Rehabilitation/google_us_muscle-stimulator_matching-terms_2025-12-19_04-25-32.csv \
  --industry "HealthCare Medical" \
  --sector "Physiotherapy Rehabilitation" \
  --dry-run --verbose
```
### Import All Keywords
```bash
cd /data/app/igny8
# Dry run first (preview)
docker compose -f docker-compose.app.yml exec igny8_backend \
  python3 /app/scripts/import_all_seed_keywords.py \
  --base-path /data/app/igny8/KW_DB \
  --dry-run

# Actual import
docker compose -f docker-compose.app.yml exec igny8_backend \
  python3 /app/scripts/import_all_seed_keywords.py \
  --base-path /data/app/igny8/KW_DB
```
## 📚 Full Documentation
For complete documentation, see:
- **[/data/app/igny8/backend/scripts/README.md](../../backend/scripts/README.md)** - Full import documentation
- **[/data/app/igny8/backend/scripts/IMPORT_STATUS.md](../../backend/scripts/IMPORT_STATUS.md)** - Current import status
## CSV File Format
Expected columns:
- `#` - Row number (ignored)
- `Keyword` - The keyword text (required)
- `Country` - Country code like "us" (required, converted to uppercase)
- `Volume` - Search volume (optional, defaults to 0)
- `Difficulty` - Keyword difficulty 0-100 (optional, defaults to 0)
- Other columns are ignored
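The column rules above can be sketched in a few lines of Python. This is an illustrative approximation rather than the import script itself: `parse_keyword_rows` and `to_int` are hypothetical names, and skipping rows that lack a required field is an assumption about how such rows are handled.

```python
import csv
import io


def to_int(value):
    """Return value as an int, or 0 when it is empty or not numeric."""
    try:
        return int(float(value))
    except (TypeError, ValueError, OverflowError):
        return 0


def parse_keyword_rows(csv_text):
    """Parse CSV text into dicts following the column rules listed above."""
    rows = []
    for raw in csv.DictReader(io.StringIO(csv_text)):
        keyword = (raw.get("Keyword") or "").strip()
        country = (raw.get("Country") or "").strip().upper()
        if not keyword or not country:
            continue  # assumption: rows missing a required field are skipped
        rows.append({
            "keyword": keyword,
            "country": country,
            "volume": to_int(raw.get("Volume")),
            "difficulty": to_int(raw.get("Difficulty")),
        })
    return rows
```

The `#` column and any extra columns are simply never read, which matches the "ignored" behaviour described in the list.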
## Database Models
### Industry
- Global industry categories
- Example: "HealthCare Medical", "E-Commerce", "Finance"
### IndustrySector
- Subcategories within industries
- Example: "Physiotherapy Rehabilitation", "Dental Care"
- Linked to parent Industry
### SeedKeyword
- Individual keywords with metrics
- Linked to Industry + Sector
- Unique constraint: `keyword + industry + sector + country`
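To make the unique constraint concrete, the sketch below applies the same four-part key to plain dictionaries. It does not use the real Django models: `dedupe_keywords` is a hypothetical helper, the dictionary keys are illustrative, and comparing keywords by exact string match is an assumption.

```python
def dedupe_keywords(rows):
    """Keep the first row for each (keyword, industry, sector, country) key."""
    seen = set()
    unique = []
    for row in rows:
        # Same four fields as the SeedKeyword unique constraint.
        key = (row["keyword"], row["industry"], row["sector"], row["country"])
        if key in seen:
            continue  # a row with this key was already kept
        seen.add(key)
        unique.append(row)
    return unique
```

Under this key, the same keyword may appear once per country and once per sector, but not twice within the same industry, sector, and country.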
## Common Issues
### Issue: "Industry not found"
**Solution**: The script auto-creates industries from folder names.
### Issue: "Duplicate keywords"
**Solution**: The script checks `keyword + country` and skips duplicates.
### Issue: "Empty volume/difficulty values"
**Solution**: The script defaults empty values to 0.
### Issue: "CSV parsing errors"
**Solution**: Check CSV encoding (should be UTF-8)
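One way to check a file's encoding before importing is to try decoding it as UTF-8. The helper below is a minimal sketch with a hypothetical name (`is_utf8`); it is not part of the import scripts.

```python
def is_utf8(path):
    """Return True when the file at path decodes cleanly as UTF-8."""
    try:
        with open(path, encoding="utf-8") as handle:
            for _ in handle:
                pass  # reading every line forces the whole file to be decoded
    except UnicodeDecodeError:
        return False
    return True
```

A file that fails this check can usually be re-saved as UTF-8 from the tool that exported it.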
## Verification Commands
After import, verify the data in the Django shell:
```python
# Start the shell first:
#   cd /data/app/igny8/backend
#   python manage.py shell
from igny8_core.auth.models import Industry, IndustrySector, SeedKeyword
```
Or check the imported data with one-off commands:
```bash
# Quick counts
docker compose -f docker-compose.app.yml exec igny8_backend python3 manage.py shell -c "
from igny8_core.auth.models import Industry, IndustrySector, SeedKeyword
print(f'Industries: {Industry.objects.count()}')
print(f'Sectors: {IndustrySector.objects.count()}')
print(f'Keywords: {SeedKeyword.objects.count()}')
"
# Check by industry
docker compose -f docker-compose.app.yml exec igny8_backend python3 manage.py shell -c "
from igny8_core.auth.models import SeedKeyword
print(f'HealthCare Medical: {SeedKeyword.objects.filter(industry__slug=\"healthcare-medical\").count()}')
"
```
## 🔗 Django Admin
View imported data:
- Industries: `/admin/auth/industry/`
- Sectors: `/admin/auth/industrysector/`
- Keywords: `/admin/auth/seedkeyword/`
---
**For full documentation and troubleshooting, see: [/data/app/igny8/backend/scripts/README.md](../../backend/scripts/README.md)**