- Simplified HasTenantAccess permission logic to ensure every authenticated user has an account.
- Added fallback to system account for OpenAI settings in AI configuration.
- Allowed any authenticated user to check task progress in IntegrationSettingsViewSet.
- Created a script to identify and fix orphaned users without accounts.
- Updated error response handling in business endpoints for clarity.
UNDER OBSERVATION
Issue: User Logged Out During Image Prompt Generation (Dec 10, 2025)
Original Problem
User performed workflow: auto-cluster → generate ideas → queue to writer → generate content → generate image prompt. During image prompt generation (near completion), user was automatically logged out.
Investigation Timeline
Initial Analysis:
- Suspected backend container restarts invalidating sessions
- Docker ps showed all containers up 19+ minutes - NO RESTARTS during incident
- Backend logs showed: "[IsAuthenticatedAndActive] DENIED: User not authenticated" and "Client error: Authentication credentials were not provided"
- Token was not being sent with API requests
Root Cause Identified: The logout was NOT caused by backend issues or container restarts. It was caused by frontend state corruption during HMR (Hot Module Replacement), triggered by code changes made to fix an unrelated useLocation() error.
What Actually Happened:
- Commit 5fb3687854 - Already had proper fix for useLocation() error (Suspense outside Routes)
- Additional "fixes" applied on Dec 10, 2025:
  - Changed cacheDir: "/tmp/vite-cache" in vite.config.ts
  - Moved BrowserRouter above ErrorBoundary in main.tsx
  - Added watch.interval: 100 and fs.strict: false
- These changes triggered:
  - Vite cache stored in /tmp got wiped on container operations
  - Full rebuild with HMR
  - Component tree restructuring (BrowserRouter position change)
  - Auth store (Zustand persist) lost state during rapid unmount/remount cycle
  - Frontend started making API calls WITHOUT Authorization header
  - Backend correctly rejected unauthenticated requests
  - Frontend logout() triggered
Fix Applied
Reverted the problematic changes:
- Removed cacheDir: "/tmp/vite-cache" - let Vite use default node_modules/.vite
- Restored BrowserRouter position inside ErrorBoundary/ThemeProvider (original structure)
- Removed watch.interval and fs.strict additions
Kept the actual fixes:
- Backend: Removed IsSystemAccountOrDeveloper from IntegrationSettingsViewSet class-level permissions
- Backend: Auto-cluster extra_data → debug_info parameter fix
- Frontend: Suspense wrapping Routes (from commit 5fb3687) - THIS was the real useLocation() fix
What to Watch For
1. useLocation() Error After Container Restarts
- Symptom: "useLocation() may be used only in the context of a <Router> component"
- Where: Keywords page, other planner/writer module pages (50-60% of pages)
- If it happens:
  - Check if Vite cache is stale
  - Clear node_modules/.vite inside frontend container: docker compose exec igny8_frontend rm -rf /app/node_modules/.vite
  - Restart frontend container
  - DO NOT change cacheDir or component tree structure
2. Auth State Loss During Development
- Symptom: Random logouts during active sessions, "Authentication credentials were not provided"
- Triggers:
  - HMR with significant component tree changes
  - Rapid container restarts during development
  - Changes to context provider order in main.tsx
- Prevention:
  - Avoid restructuring main.tsx component tree
  - Test auth persistence after any main.tsx changes
  - Monitor browser console for localStorage errors during HMR
3. Permission Errors for Normal Users
- Symptom: "You do not have permission to perform this action" for valid users with complete account setup
- Check:
  - Backend logs for permission class debug output: [IsAuthenticatedAndActive], [IsViewerOrAbove], [HasTenantAccess]
  - Verify user has role='owner' and is_active=True
  - Ensure viewset doesn't have IsSystemAccountOrDeveloper at class level for endpoints normal users need
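When checking backend logs for permission debug output, a small filter can show which permission class is rejecting requests. The following is a minimal sketch, not part of the project: it assumes every denial follows the "[ClassName] DENIED: reason" shape of the one log line quoted in this report, and the function names find_denials and count_by_class are hypothetical.

```python
import re
from collections import Counter

# Permission classes named in this report as emitting debug output.
PERMISSION_TAGS = ("IsAuthenticatedAndActive", "IsViewerOrAbove", "HasTenantAccess")

# ASSUMPTION: denials look like "[ClassName] DENIED: reason", as in the
# single log line quoted in the investigation timeline.
DENIED_RE = re.compile(r"\[(?P<tag>\w+)\]\s+DENIED:\s*(?P<reason>.*)")


def find_denials(lines):
    """Return (class_name, reason) pairs for DENIED lines from known permission classes."""
    denials = []
    for line in lines:
        match = DENIED_RE.search(line)
        if match and match.group("tag") in PERMISSION_TAGS:
            denials.append((match.group("tag"), match.group("reason").strip()))
    return denials


def count_by_class(denials):
    """Count denials per permission class to see which check is failing most."""
    return Counter(tag for tag, _ in denials)


if __name__ == "__main__":
    sample = [
        "[IsAuthenticatedAndActive] DENIED: User not authenticated",
        "Client error: Authentication credentials were not provided",
    ]
    for tag, reason in find_denials(sample):
        print(f"{tag}: {reason}")
```

Feeding it the backend container's log lines gives a quick per-class tally; lines that do not match the assumed shape are ignored rather than guessed at.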
4. Celery Task Progress Polling 403 Errors
- Symptom: Task progress endpoint returns 403 for normal users
- Root cause: ViewSet class-level permissions blocking action-level overrides
- Solution: Ensure IntegrationSettingsViewSet permission_classes doesn't include IsSystemAccountOrDeveloper
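The effect of a class-level permission on an action can be modelled in plain Python. This is a simplified sketch, not the project's code and not Django REST Framework itself: the User fields and the has_permission bodies are hypothetical, and effective_permissions assumes that action-level classes are added to the class-level list, which this report does not confirm. What it does mirror is the DRF rule that every permission in the effective list must pass.

```python
from dataclasses import dataclass


@dataclass
class User:
    """Hypothetical stand-in for the request user; field names are illustrative."""
    is_authenticated: bool = True
    is_active: bool = True
    is_system_or_developer: bool = False


class IsAuthenticatedAndActive:
    # Hypothetical body; only the class name comes from this report.
    def has_permission(self, user):
        return user.is_authenticated and user.is_active


class IsSystemAccountOrDeveloper:
    # Hypothetical body; only the class name comes from this report.
    def has_permission(self, user):
        return user.is_system_or_developer


def check_permissions(user, permission_classes):
    """Allow the request only if every permission class passes (AND semantics)."""
    return all(cls().has_permission(user) for cls in permission_classes)


def effective_permissions(class_level, action_level):
    """ASSUMPTION: the action's classes are appended to the class-level list."""
    return list(class_level) + list(action_level)


if __name__ == "__main__":
    normal_user = User()
    before = effective_permissions(
        [IsAuthenticatedAndActive, IsSystemAccountOrDeveloper],
        [IsAuthenticatedAndActive],
    )
    after = effective_permissions([IsAuthenticatedAndActive], [IsAuthenticatedAndActive])
    print(check_permissions(normal_user, before))  # False: class-level entry still blocks
    print(check_permissions(normal_user, after))   # True: restrictive class removed
```

Under that assumption, relaxing the action alone cannot help a normal user while the restrictive class stays at class level, which matches the fix of removing it from the class-level list.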
Lessons Learned
- Don't layer fixes on top of fixes - Identify root cause first
- Vite cache location matters - /tmp gets wiped, breaking HMR state persistence
- Component tree structure is fragile - Moving BrowserRouter breaks auth rehydration timing
- Container uptime ≠ code stability - HMR can cause issues without restart
- Permission debugging - Adding logging to the permission classes was critical for diagnosis
- The original fix was already correct - Commit 5fb3687 had it right, additional "improvements" broke it
Files Modified (Reverted)
- frontend/vite.config.ts - Removed cacheDir and watch config changes
- frontend/src/main.tsx - Restored original component tree structure
Files Modified (Kept)
- backend/igny8_core/modules/system/integration_views.py - Removed IsSystemAccountOrDeveloper
- backend/igny8_core/modules/planner/views.py - Fixed extra_data → debug_info
- backend/igny8_core/api/permissions.py - Added debug logging (can be removed later)
Status
RESOLVED - Auth state stable, backend permissions correct, useLocation fix preserved.
Monitor for 48 hours - Watch for any recurrence of useLocation errors or auth issues after container restarts.