# Rate Limiting Guide
**Version**: 1.0.0
**Last Updated**: 2025-11-16
Complete guide for understanding and handling rate limits in the IGNY8 API v1.0.
---
## Overview
Rate limiting protects the API from abuse and ensures fair resource usage. Different operation types have different rate limits based on their resource intensity.
---
## Rate Limit Headers
Every API response includes rate limit information in headers:
- `X-Throttle-Limit`: Maximum requests allowed in the time window
- `X-Throttle-Remaining`: Remaining requests in current window
- `X-Throttle-Reset`: Unix timestamp when the limit resets
### Example Response Headers
```http
HTTP/1.1 200 OK
X-Throttle-Limit: 60
X-Throttle-Remaining: 45
X-Throttle-Reset: 1700123456
Content-Type: application/json
```
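The `X-Throttle-Reset` value is a Unix timestamp, so a client can compute how long to wait by subtracting the current time. A minimal sketch (the `seconds_until_reset` helper is illustrative, not part of the API):

```python
import time

def seconds_until_reset(headers):
    """Seconds until the rate-limit window resets, based on X-Throttle-Reset."""
    reset = int(headers.get('X-Throttle-Reset', 0))
    return max(0, reset - int(time.time()))

# Example with the header values shown above (reset 30 seconds from now)
headers = {
    'X-Throttle-Limit': '60',
    'X-Throttle-Remaining': '45',
    'X-Throttle-Reset': str(int(time.time()) + 30),
}
print(seconds_until_reset(headers))
```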
---
## Rate Limit Scopes
Rate limits are scoped by operation type:
### AI Functions (Expensive Operations)
| Scope | Limit | Endpoints |
|-------|-------|-----------|
| `ai_function` | 10/min | Auto-cluster, content generation |
| `image_gen` | 15/min | Image generation (DALL-E, Runware) |
| `planner_ai` | 10/min | AI-powered planner operations |
| `writer_ai` | 10/min | AI-powered writer operations |
### Content Operations
| Scope | Limit | Endpoints |
|-------|-------|-----------|
| `content_write` | 30/min | Content creation, updates |
| `content_read` | 100/min | Content listing, retrieval |
### Authentication
| Scope | Limit | Endpoints |
|-------|-------|-----------|
| `auth` | 20/min | Login, register, password reset |
| `auth_strict` | 5/min | Sensitive auth operations |
### Planner Operations
| Scope | Limit | Endpoints |
|-------|-------|-----------|
| `planner` | 60/min | Keywords, clusters, ideas CRUD |
### Writer Operations
| Scope | Limit | Endpoints |
|-------|-------|-----------|
| `writer` | 60/min | Tasks, content, images CRUD |
### System Operations
| Scope | Limit | Endpoints |
|-------|-------|-----------|
| `system` | 100/min | Settings, prompts, profiles |
| `system_admin` | 30/min | Admin-only system operations |
### Billing Operations
| Scope | Limit | Endpoints |
|-------|-------|-----------|
| `billing` | 30/min | Credit queries, usage logs |
| `billing_admin` | 10/min | Credit management (admin) |
### Default
| Scope | Limit | Endpoints |
|-------|-------|-----------|
| `default` | 100/min | Endpoints without explicit scope |
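The per-scope limits above can be collected into a client-side table to derive a safe minimum spacing between requests. A sketch using the limits from the tables (`SCOPE_LIMITS` and `min_interval` are illustrative helper names, not part of the API):

```python
# Requests per minute for each scope, from the tables above.
SCOPE_LIMITS = {
    'ai_function': 10, 'image_gen': 15, 'planner_ai': 10, 'writer_ai': 10,
    'content_write': 30, 'content_read': 100,
    'auth': 20, 'auth_strict': 5,
    'planner': 60, 'writer': 60,
    'system': 100, 'system_admin': 30,
    'billing': 30, 'billing_admin': 10,
    'default': 100,
}

def min_interval(scope):
    """Minimum seconds between requests to stay under the scope's limit."""
    limit = SCOPE_LIMITS.get(scope, SCOPE_LIMITS['default'])
    return 60.0 / limit
```

For example, `planner` at 60/min allows one request per second, while `auth_strict` at 5/min requires 12 seconds between requests.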
---
## Rate Limit Exceeded (429)
When a rate limit is exceeded, you receive:
**Status Code**: `429 Too Many Requests`
**Response**:
```json
{
  "success": false,
  "error": "Rate limit exceeded",
  "request_id": "550e8400-e29b-41d4-a716-446655440000"
}
```
**Headers**:
```http
X-Throttle-Limit: 60
X-Throttle-Remaining: 0
X-Throttle-Reset: 1700123456
```
### Handling Rate Limits
**1. Check Headers Before Request**
```python
import time

import requests

def make_request(url, headers):
    response = requests.get(url, headers=headers)

    # Check remaining requests
    remaining = int(response.headers.get('X-Throttle-Remaining', 0))
    if remaining < 5:
        # Approaching limit, slow down
        time.sleep(1)

    return response.json()
```
**2. Handle 429 Response**
```python
import time

import requests

def make_request_with_backoff(url, headers, max_retries=3):
    for attempt in range(max_retries):
        response = requests.get(url, headers=headers)

        if response.status_code == 429:
            # Wait until the limit resets
            reset_time = int(response.headers.get('X-Throttle-Reset', 0))
            current_time = int(time.time())
            wait_seconds = max(1, reset_time - current_time)
            print(f"Rate limited. Waiting {wait_seconds} seconds...")
            time.sleep(wait_seconds)
            continue

        return response.json()

    raise Exception("Max retries exceeded")
```
**3. Implement Exponential Backoff**
```python
import random
import time

import requests

def make_request_with_exponential_backoff(url, headers):
    max_wait = 60  # Maximum wait time in seconds
    base_wait = 1  # Base wait time in seconds

    for attempt in range(5):
        response = requests.get(url, headers=headers)

        if response.status_code != 429:
            return response.json()

        # Exponential backoff with jitter
        wait_time = min(
            base_wait * (2 ** attempt) + random.uniform(0, 1),
            max_wait
        )
        print(f"Rate limited. Waiting {wait_time:.2f} seconds...")
        time.sleep(wait_time)

    raise Exception("Rate limit exceeded after retries")
```
---
## Best Practices
### 1. Monitor Rate Limit Headers
Always check `X-Throttle-Remaining` to avoid hitting limits:
```python
def check_rate_limit(response):
    remaining = int(response.headers.get('X-Throttle-Remaining', 0))
    if remaining < 10:
        print(f"Warning: Only {remaining} requests remaining")
    return remaining
```
### 2. Implement Request Queuing
For bulk operations, queue requests to stay within limits:
```python
import time

import requests

class RateLimitedAPI:
    def __init__(self, requests_per_minute=60):
        self.requests_per_minute = requests_per_minute
        self.min_interval = 60 / requests_per_minute
        self.last_request_time = 0

    def make_request(self, url, headers):
        # Enforce a minimum interval between requests
        elapsed = time.time() - self.last_request_time
        if elapsed < self.min_interval:
            time.sleep(self.min_interval - elapsed)

        response = requests.get(url, headers=headers)
        self.last_request_time = time.time()
        return response.json()
```
### 3. Cache Responses
Cache frequently accessed data to reduce API calls:
```python
import time

import requests

class CachedAPI:
    def __init__(self, cache_ttl=300):  # 5 minutes
        self.cache = {}
        self.cache_ttl = cache_ttl

    def get_cached(self, url, headers, cache_key):
        # Return the cached value if it is still fresh
        if cache_key in self.cache:
            data, timestamp = self.cache[cache_key]
            if time.time() - timestamp < self.cache_ttl:
                return data

        # Fetch from API
        response = requests.get(url, headers=headers)
        data = response.json()

        # Store in cache
        self.cache[cache_key] = (data, time.time())
        return data
```
### 4. Batch Requests When Possible
Use bulk endpoints instead of multiple individual requests:
```python
# ❌ Don't: Multiple individual requests
for keyword_id in keyword_ids:
    response = requests.get(
        f"/api/v1/planner/keywords/{keyword_id}/", headers=headers
    )

# ✅ Do: Use a bulk endpoint if available
response = requests.post(
    "/api/v1/planner/keywords/bulk/",
    json={"ids": keyword_ids},
    headers=headers
)
```
---
## Rate Limit Bypass
### Development/Debug Mode
Rate limiting is automatically bypassed when:
- `DEBUG=True` in Django settings
- `IGNY8_DEBUG_THROTTLE=True` environment variable
- User belongs to `aws-admin` account
- User has `admin` or `developer` role
**Note**: Headers are still set for debugging, but requests are not blocked.
---
## Monitoring Rate Limits
### Track Usage
```python
class RateLimitMonitor:
    def __init__(self):
        self.usage_by_scope = {}

    def track_request(self, response, scope):
        if scope not in self.usage_by_scope:
            self.usage_by_scope[scope] = {'total': 0, 'limited': 0}

        self.usage_by_scope[scope]['total'] += 1

        if response.status_code == 429:
            self.usage_by_scope[scope]['limited'] += 1

        remaining = int(response.headers.get('X-Throttle-Remaining', 0))
        limit = int(response.headers.get('X-Throttle-Limit', 0))

        if limit > 0:
            usage_percent = ((limit - remaining) / limit) * 100
            if usage_percent > 80:
                print(f"Warning: {scope} at {usage_percent:.1f}% capacity")

    def get_report(self):
        return self.usage_by_scope
```
---
## Troubleshooting
### Issue: Frequent 429 Errors
**Causes**:
- Too many requests in short time
- Not checking rate limit headers
- No request throttling implemented
**Solutions**:
1. Implement request throttling
2. Monitor `X-Throttle-Remaining` header
3. Add delays between requests
4. Use bulk endpoints when available
### Issue: Rate Limits Too Restrictive
**Solutions**:
1. Contact support for higher limits (if justified)
2. Optimize requests (cache, batch, reduce frequency)
3. Use development account for testing (bypass enabled)
---
## Code Examples
### Python - Complete Rate Limit Handler
```python
import time

import requests

class RateLimitedClient:
    def __init__(self, base_url, token):
        self.base_url = base_url
        self.headers = {
            'Authorization': f'Bearer {token}',
            'Content-Type': 'application/json'
        }
        self.rate_limits = {}

    def _wait_for_rate_limit(self, scope='default'):
        """Wait if approaching rate limit"""
        if scope in self.rate_limits:
            limit_info = self.rate_limits[scope]
            remaining = limit_info.get('remaining', 0)
            reset_time = limit_info.get('reset_time', 0)

            if remaining < 5:
                wait_time = max(0, reset_time - time.time())
                if wait_time > 0:
                    print(f"Rate limit low. Waiting {wait_time:.1f}s...")
                    time.sleep(wait_time)

    def _update_rate_limit_info(self, response, scope='default'):
        """Update rate limit information from response headers"""
        limit = response.headers.get('X-Throttle-Limit')
        remaining = response.headers.get('X-Throttle-Remaining')
        reset = response.headers.get('X-Throttle-Reset')

        if limit and remaining and reset:
            self.rate_limits[scope] = {
                'limit': int(limit),
                'remaining': int(remaining),
                'reset_time': int(reset)
            }

    def request(self, method, endpoint, scope='default', **kwargs):
        """Make a rate-limited request"""
        # Wait if approaching limit
        self._wait_for_rate_limit(scope)

        # Make request
        url = f"{self.base_url}{endpoint}"
        response = requests.request(method, url, headers=self.headers, **kwargs)

        # Update rate limit info
        self._update_rate_limit_info(response, scope)

        # Handle rate limit error
        if response.status_code == 429:
            reset_time = int(response.headers.get('X-Throttle-Reset', 0))
            wait_time = max(1, reset_time - time.time())
            print(f"Rate limited. Waiting {wait_time:.1f}s...")
            time.sleep(wait_time)

            # Retry once
            response = requests.request(method, url, headers=self.headers, **kwargs)
            self._update_rate_limit_info(response, scope)

        return response.json()

    def get(self, endpoint, scope='default'):
        return self.request('GET', endpoint, scope)

    def post(self, endpoint, data, scope='default'):
        return self.request('POST', endpoint, scope, json=data)

# Usage
client = RateLimitedClient("https://api.igny8.com/api/v1", "your_token")

# Make requests with automatic rate limit handling
keywords = client.get("/planner/keywords/", scope="planner")
```
---