IGNY8 Seed Keywords Import Scripts
This folder contains scripts for importing seed keywords from the KW_DB folder structure into the IGNY8 global keywords database.
📁 Folder Structure
/data/app/igny8/KW_DB/
{Industry}/ # e.g., HealthCare_Medical
{Sector}/ # e.g., Physiotherapy_Rehabilitation
*.csv # Keyword CSV files
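As a rough illustration of how this layout can be walked, the sketch below yields one (industry, sector, file) triple per CSV. The function names `folder_to_display_name` and `discover_csv_files` are hypothetical and not taken from the actual scripts; the underscore-to-space conversion follows the rule stated in the Notes section.

```python
from pathlib import Path


def folder_to_display_name(folder_name: str) -> str:
    """Convert a folder name such as 'HealthCare_Medical' to 'HealthCare Medical'."""
    return folder_name.replace("_", " ").strip()


def discover_csv_files(base_path):
    """Yield (industry, sector, csv_path) for every {Industry}/{Sector}/*.csv file."""
    base = Path(base_path)
    # Exactly two directory levels below the base path, then CSV files only.
    for csv_path in sorted(base.glob("*/*/*.csv")):
        sector_dir = csv_path.parent
        industry_dir = sector_dir.parent
        yield (
            folder_to_display_name(industry_dir.name),
            folder_to_display_name(sector_dir.name),
            csv_path,
        )


if __name__ == "__main__":
    # Hypothetical usage; the path only exists inside the container.
    for industry, sector, path in discover_csv_files("/data/app/igny8/KW_DB"):
        print(f"{industry} > {sector}: {path.name}")
```

Files that sit directly in the base folder or in an industry folder are not matched by this pattern, which mirrors the two-level structure shown above.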
🔧 Available Scripts
1. import_seed_keywords_single.py
Import keywords from a single CSV file (for testing).
Usage:
# Dry run (preview only)
docker compose -f docker-compose.app.yml exec igny8_backend \
python3 /app/scripts/import_seed_keywords_single.py \
--csv /data/app/igny8/KW_DB/HealthCare_Medical/Physiotherapy_Rehabilitation/google_us_muscle-stimulator_matching-terms_2025-12-19_04-25-32.csv \
--industry "HealthCare Medical" \
--sector "Physiotherapy Rehabilitation" \
--dry-run --verbose
# Actual import
docker compose -f docker-compose.app.yml exec igny8_backend \
python3 /app/scripts/import_seed_keywords_single.py \
--csv /data/app/igny8/KW_DB/HealthCare_Medical/Physiotherapy_Rehabilitation/google_us_muscle-stimulator_matching-terms_2025-12-19_04-25-32.csv \
--industry "HealthCare Medical" \
--sector "Physiotherapy Rehabilitation"
Options:
- --csv - Path to CSV file (required)
- --industry - Industry name (required)
- --sector - Sector name (required)
- --dry-run - Preview without saving to database
- --verbose - Show detailed progress for each keyword
2. import_all_seed_keywords.py
Import keywords from all CSV files in the KW_DB folder structure.
Usage:
# Dry run (preview all imports)
docker compose -f docker-compose.app.yml exec igny8_backend \
python3 /app/scripts/import_all_seed_keywords.py \
--base-path /data/app/igny8/KW_DB \
--dry-run
# Actual import
docker compose -f docker-compose.app.yml exec igny8_backend \
python3 /app/scripts/import_all_seed_keywords.py \
--base-path /data/app/igny8/KW_DB
Options:
- --base-path - Base path to KW_DB folder (default: /data/app/igny8/KW_DB)
- --dry-run - Preview without saving to database
- --verbose - Show detailed progress for each keyword
📊 CSV File Format
Expected CSV columns:
- Keyword (required) - The keyword text
- Country (optional) - Country code (default: US)
- Volume (optional) - Search volume (default: 0)
- Difficulty (optional) - Keyword difficulty 0-100 (default: 0)
- CPC (ignored) - Not imported
- Parent Keyword (ignored) - Not imported
Example:
Keyword,Country,Volume,Difficulty,CPC,Parent Keyword
physical therapy,us,12000,45,3.20,
tens unit,us,5000,32,2.50,physical therapy
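To make the column rules concrete, here is a minimal sketch of how one row could be normalized with the documented defaults. `parse_row` and `to_int` are hypothetical helpers, not the scripts' real functions, and upper-casing the country code is an assumption based on the "US" examples in this README.

```python
import csv
import io

SAMPLE = """Keyword,Country,Volume,Difficulty,CPC,Parent Keyword
physical therapy,us,12000,45,3.20,
tens unit,us,5000,32,2.50,physical therapy
"""


def to_int(value, default=0):
    """Parse an integer-like cell, falling back to the default on bad or missing data."""
    try:
        return int(float(str(value).strip()))
    except (TypeError, ValueError, OverflowError):
        return default


def parse_row(row):
    """Return a normalized dict, or None when the required Keyword column is empty."""
    keyword = (row.get("Keyword") or "").strip()
    if not keyword:
        return None  # invalid row: Keyword is required
    # Assumption: country codes are stored upper-case; default is US.
    country = (row.get("Country") or "").strip().upper() or "US"
    return {
        "keyword": keyword,
        "country": country,
        "volume": to_int(row.get("Volume")),
        "difficulty": to_int(row.get("Difficulty")),
        # CPC and Parent Keyword are deliberately ignored.
    }


if __name__ == "__main__":
    for raw in csv.DictReader(io.StringIO(SAMPLE)):
        print(parse_row(raw))
```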
🔍 Duplicate Handling
Duplicate Check: keyword + country (case-insensitive) within the same industry + sector
- If a keyword with the same country already exists in the same industry + sector → the row is SKIPPED
- Example: "physical therapy [US]" in "HealthCare Medical > Physiotherapy Rehabilitation" is skipped if it already exists
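The check described above can be pictured as a lookup on a normalized key. The following is only a sketch: `duplicate_key` and `split_new_and_duplicates` are hypothetical names, and treating repeats within the same file as duplicates is an assumption rather than documented behavior.

```python
def duplicate_key(keyword, country, industry, sector):
    """Build a case-insensitive key for keyword + country within an industry + sector."""
    return (keyword.strip().lower(), country.strip().lower(), industry, sector)


def split_new_and_duplicates(rows, existing_keys, industry, sector):
    """Separate rows into those to import and those to skip as duplicates."""
    seen = set(existing_keys)
    new_rows, duplicates = [], []
    for row in rows:
        key = duplicate_key(row["keyword"], row["country"], industry, sector)
        if key in seen:
            duplicates.append(row)
        else:
            seen.add(key)  # assumption: later repeats in the same file also count
            new_rows.append(row)
    return new_rows, duplicates


if __name__ == "__main__":
    industry, sector = "HealthCare Medical", "Physiotherapy Rehabilitation"
    existing = {duplicate_key("Physical Therapy", "US", industry, sector)}
    rows = [
        {"keyword": "physical therapy", "country": "us"},
        {"keyword": "tens unit", "country": "us"},
    ]
    new_rows, duplicates = split_new_and_duplicates(rows, existing, industry, sector)
    print(len(new_rows), "new,", len(duplicates), "skipped")
```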
🗄️ Database Models
Industry
- name - Industry name (e.g., "HealthCare Medical")
- slug - URL-friendly slug (e.g., "healthcare-medical")
- is_active - Active status (default: True)
IndustrySector
- name - Sector name (e.g., "Physiotherapy Rehabilitation")
- slug - URL-friendly slug (e.g., "physiotherapy-rehabilitation")
- industry - Foreign key to Industry
- is_active - Active status (default: True)
SeedKeyword
- keyword - Keyword text
- industry - Foreign key to Industry
- sector - Foreign key to IndustrySector
- country - Country code (e.g., "US")
- volume - Search volume
- difficulty - Keyword difficulty (0-100)
- is_active - Active status (default: True)
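Unique Constraint: keyword + industry + sector (at model level)
Script Duplicate Check: keyword + country + industry + sector (narrower than the model constraint: a keyword that already exists under a different country passes the script check but can still be rejected by the model-level constraint)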
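The slug examples listed for the models ("healthcare-medical", "physiotherapy-rehabilitation") are consistent with a simple lower-case-and-hyphenate rule. The sketch below reproduces those two examples; `make_slug` is a hypothetical helper, and the scripts may well rely on a framework utility instead.

```python
import re


def make_slug(name: str) -> str:
    """Lower-case the name and replace each run of non-alphanumerics with one hyphen."""
    slug = re.sub(r"[^a-z0-9]+", "-", name.lower())
    return slug.strip("-")


if __name__ == "__main__":
    print(make_slug("HealthCare Medical"))
    print(make_slug("Physiotherapy Rehabilitation"))
```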
✅ Verification
After import, verify the data:
# Check counts in Django shell
docker compose -f docker-compose.app.yml exec igny8_backend python3 manage.py shell
>>> from igny8_core.auth.models import Industry, IndustrySector, SeedKeyword
>>> Industry.objects.count()
>>> IndustrySector.objects.count()
>>> SeedKeyword.objects.count()
>>> SeedKeyword.objects.filter(industry__name="HealthCare Medical").count()
Or check in Django admin:
- Industries: /admin/auth/industry/
- Sectors: /admin/auth/industrysector/
- Keywords: /admin/auth/seedkeyword/
🐛 Troubleshooting
Issue: "ModuleNotFoundError: No module named 'igny8_core'"
Solution: The script must run inside the Docker container:
docker compose -f docker-compose.app.yml exec igny8_backend python3 /app/scripts/...
Issue: "CSV file not found"
Solution: Use the full path as seen inside the container: /data/app/igny8/KW_DB/...
Issue: Keywords not importing (showing as duplicates)
Solution: Check if keywords already exist:
docker compose -f docker-compose.app.yml exec igny8_backend python3 manage.py shell
>>> from igny8_core.auth.models import SeedKeyword
>>> SeedKeyword.objects.filter(keyword__iexact="physical therapy", country="US")
Issue: Want to re-import after cleaning database
Solution: Delete existing keywords first:
docker compose -f docker-compose.app.yml exec igny8_backend python3 manage.py shell
>>> from igny8_core.auth.models import SeedKeyword
>>> SeedKeyword.objects.all().delete() # Option A: delete all keywords
>>> SeedKeyword.objects.filter(industry__slug="healthcare-medical").delete() # Option B: delete one industry only
📈 Import Statistics
The scripts provide detailed statistics:
- Total rows processed
- Keywords imported successfully
- Duplicates skipped (keyword + country)
- Invalid rows skipped (empty keywords, bad data)
- Errors encountered
- Industries/Sectors created
Example output:
======================================================================
IMPORT SUMMARY
======================================================================
Total rows processed: 4,523
✓ Imported: 4,201
⊘ Skipped (duplicate): 280
⊘ Skipped (invalid): 37
✗ Errors: 5
======================================================================
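For readers who want to post-process or reproduce this summary, the counters can be modeled as a small record. This is a sketch only: `ImportStats` and its field names are hypothetical, and the 70-character rule width is an assumption. In the example above the four outcome counts add up to the total (4,201 + 280 + 37 + 5 = 4,523), which the sketch checks.

```python
from dataclasses import dataclass


@dataclass
class ImportStats:
    total: int = 0
    imported: int = 0
    skipped_duplicate: int = 0
    skipped_invalid: int = 0
    errors: int = 0

    def is_consistent(self) -> bool:
        """Every processed row should land in exactly one outcome bucket."""
        outcomes = (
            self.imported + self.skipped_duplicate + self.skipped_invalid + self.errors
        )
        return outcomes == self.total

    def summary(self) -> str:
        """Render the counters in the same layout as the example output."""
        bar = "=" * 70
        lines = [
            bar,
            "IMPORT SUMMARY",
            bar,
            f"Total rows processed: {self.total:,}",
            f"✓ Imported: {self.imported:,}",
            f"⊘ Skipped (duplicate): {self.skipped_duplicate:,}",
            f"⊘ Skipped (invalid): {self.skipped_invalid:,}",
            f"✗ Errors: {self.errors:,}",
            bar,
        ]
        return "\n".join(lines)


if __name__ == "__main__":
    stats = ImportStats(4523, 4201, 280, 37, 5)
    print(stats.summary())
```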
🚀 Quick Start Guide
1. Test with a single file first (dry run):
   docker compose -f docker-compose.app.yml exec igny8_backend \
     python3 /app/scripts/import_seed_keywords_single.py \
     --csv /data/app/igny8/KW_DB/HealthCare_Medical/Physiotherapy_Rehabilitation/google_us_muscle-stimulator_matching-terms_2025-12-19_04-25-32.csv \
     --industry "HealthCare Medical" \
     --sector "Physiotherapy Rehabilitation" \
     --dry-run
2. If successful, remove --dry-run and run the actual import:
   docker compose -f docker-compose.app.yml exec igny8_backend \
     python3 /app/scripts/import_seed_keywords_single.py \
     --csv /data/app/igny8/KW_DB/HealthCare_Medical/Physiotherapy_Rehabilitation/google_us_muscle-stimulator_matching-terms_2025-12-19_04-25-32.csv \
     --industry "HealthCare Medical" \
     --sector "Physiotherapy Rehabilitation"
3. Verify in Django admin: /admin/auth/seedkeyword/
4. Preview the import of all files (dry run):
   docker compose -f docker-compose.app.yml exec igny8_backend \
     python3 /app/scripts/import_all_seed_keywords.py \
     --base-path /data/app/igny8/KW_DB \
     --dry-run
5. If successful, run the actual bulk import:
   docker compose -f docker-compose.app.yml exec igny8_backend \
     python3 /app/scripts/import_all_seed_keywords.py \
     --base-path /data/app/igny8/KW_DB
📝 Notes
- Scripts automatically create Industries and Sectors if they don't exist
- Folder names are converted to display names (underscores → spaces)
- Slugs are auto-generated from names
- All imports happen within transactions for data integrity
- Dry-run mode uses transaction rollback (no database changes)
- Empty or invalid CSV rows are skipped with warnings
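The dry-run note above describes a common pattern: do all the work inside a transaction, then roll back instead of committing. The sketch below shows that pattern with Python's built-in sqlite3 module as a stand-in; the real scripts run against the application's database, and the `seed_keyword` table and `import_keywords` function here are hypothetical.

```python
import sqlite3


def import_keywords(conn, keywords, dry_run=False):
    """Insert keywords in one transaction; roll back instead of committing on a dry run."""
    try:
        conn.executemany(
            "INSERT INTO seed_keyword (keyword) VALUES (?)",
            [(keyword,) for keyword in keywords],
        )
        if dry_run:
            conn.rollback()  # preview only: discard every insert
        else:
            conn.commit()
        return len(keywords)
    except Exception:
        conn.rollback()  # any failure leaves the table unchanged
        raise


def count_keywords(conn):
    return conn.execute("SELECT COUNT(*) FROM seed_keyword").fetchone()[0]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE seed_keyword (keyword TEXT NOT NULL)")
    conn.commit()
    import_keywords(conn, ["physical therapy", "tens unit"], dry_run=True)
    print("after dry run:", count_keywords(conn))
    import_keywords(conn, ["physical therapy", "tens unit"])
    print("after import:", count_keywords(conn))
```

Because the dry run exercises the same insert path as the real import, it surfaces the same errors without leaving any rows behind.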
🔗 Related Documentation
Author: IGNY8 Team
Created: January 13, 2026
Last Updated: January 13, 2026