# Rate Limiting Guide

**Version:** 1.0.0
**Last Updated:** 2025-11-16

Complete guide to understanding and handling rate limits in the IGNY8 API v1.0.
## Overview

Rate limiting protects the API from abuse and ensures fair resource usage. Different operation types have different rate limits based on their resource intensity.
## Rate Limit Headers

Every API response includes rate limit information in headers:

- `X-Throttle-Limit`: Maximum requests allowed in the time window
- `X-Throttle-Remaining`: Remaining requests in the current window
- `X-Throttle-Reset`: Unix timestamp when the limit resets

### Example Response Headers

```
HTTP/1.1 200 OK
X-Throttle-Limit: 60
X-Throttle-Remaining: 45
X-Throttle-Reset: 1700123456
Content-Type: application/json
```
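Given these headers, the number of seconds until the current window resets can be derived from `X-Throttle-Reset`. A minimal sketch (the header names come from this guide; the helper name and sample values are illustrative):

```python
import time

def seconds_until_reset(headers):
    """Return seconds until the rate limit window resets (0 if already past)."""
    reset = int(headers.get('X-Throttle-Reset', 0))
    return max(0, reset - int(time.time()))

# Hypothetical headers mirroring the example above
headers = {
    'X-Throttle-Limit': '60',
    'X-Throttle-Remaining': '45',
    'X-Throttle-Reset': str(int(time.time()) + 30),
}
```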
## Rate Limit Scopes

Rate limits are scoped by operation type:

### AI Functions (Expensive Operations)

| Scope | Limit | Endpoints |
|---|---|---|
| `ai_function` | 10/min | Auto-cluster, content generation |
| `image_gen` | 15/min | Image generation (DALL-E, Runware) |
| `planner_ai` | 10/min | AI-powered planner operations |
| `writer_ai` | 10/min | AI-powered writer operations |
### Content Operations

| Scope | Limit | Endpoints |
|---|---|---|
| `content_write` | 30/min | Content creation, updates |
| `content_read` | 100/min | Content listing, retrieval |

### Authentication

| Scope | Limit | Endpoints |
|---|---|---|
| `auth` | 20/min | Login, register, password reset |
| `auth_strict` | 5/min | Sensitive auth operations |

### Planner Operations

| Scope | Limit | Endpoints |
|---|---|---|
| `planner` | 60/min | Keywords, clusters, ideas CRUD |

### Writer Operations

| Scope | Limit | Endpoints |
|---|---|---|
| `writer` | 60/min | Tasks, content, images CRUD |

### System Operations

| Scope | Limit | Endpoints |
|---|---|---|
| `system` | 100/min | Settings, prompts, profiles |
| `system_admin` | 30/min | Admin-only system operations |

### Billing Operations

| Scope | Limit | Endpoints |
|---|---|---|
| `billing` | 30/min | Credit queries, usage logs |
| `billing_admin` | 10/min | Credit management (admin) |

### Default

| Scope | Limit | Endpoints |
|---|---|---|
| `default` | 100/min | Endpoints without an explicit scope |
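The per-scope limits above can be collected into a lookup table to compute a safe minimum spacing between requests for any scope. A sketch (the limits come from the tables above; the table and helper names are ours):

```python
# Requests per minute by throttle scope (from the tables above)
SCOPE_LIMITS = {
    'ai_function': 10, 'image_gen': 15, 'planner_ai': 10, 'writer_ai': 10,
    'content_write': 30, 'content_read': 100,
    'auth': 20, 'auth_strict': 5,
    'planner': 60, 'writer': 60,
    'system': 100, 'system_admin': 30,
    'billing': 30, 'billing_admin': 10,
    'default': 100,
}

def min_interval(scope):
    """Minimum seconds between requests to stay under the scope's limit."""
    return 60 / SCOPE_LIMITS.get(scope, SCOPE_LIMITS['default'])
```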
## Rate Limit Exceeded (429)

When the rate limit is exceeded, you receive:

**Status Code:** `429 Too Many Requests`

**Response:**

```json
{
  "success": false,
  "error": "Rate limit exceeded",
  "request_id": "550e8400-e29b-41d4-a716-446655440000"
}
```

**Headers:**

```
X-Throttle-Limit: 60
X-Throttle-Remaining: 0
X-Throttle-Reset: 1700123456
```
## Handling Rate Limits

### 1. Check Headers Before Request

```python
import time

import requests

def make_request(url, headers):
    response = requests.get(url, headers=headers)

    # Check remaining requests
    remaining = int(response.headers.get('X-Throttle-Remaining', 0))
    if remaining < 5:
        # Approaching the limit; slow down
        time.sleep(1)

    return response.json()
```
### 2. Handle 429 Responses

```python
import time

import requests

def make_request_with_backoff(url, headers, max_retries=3):
    for attempt in range(max_retries):
        response = requests.get(url, headers=headers)

        if response.status_code == 429:
            # Wait until the window resets, then retry
            reset_time = int(response.headers.get('X-Throttle-Reset', 0))
            wait_seconds = max(1, reset_time - int(time.time()))
            print(f"Rate limited. Waiting {wait_seconds} seconds...")
            time.sleep(wait_seconds)
            continue

        return response.json()

    raise Exception("Max retries exceeded")
```
### 3. Implement Exponential Backoff

```python
import random
import time

import requests

def make_request_with_exponential_backoff(url, headers):
    max_wait = 60   # Maximum wait time in seconds
    base_wait = 1   # Base wait time in seconds

    for attempt in range(5):
        response = requests.get(url, headers=headers)
        if response.status_code != 429:
            return response.json()

        # Exponential backoff with jitter
        wait_time = min(
            base_wait * (2 ** attempt) + random.uniform(0, 1),
            max_wait
        )
        print(f"Rate limited. Waiting {wait_time:.2f} seconds...")
        time.sleep(wait_time)

    raise Exception("Rate limit exceeded after retries")
```
## Best Practices

### 1. Monitor Rate Limit Headers

Always check `X-Throttle-Remaining` to avoid hitting limits:

```python
def check_rate_limit(response):
    remaining = int(response.headers.get('X-Throttle-Remaining', 0))
    if remaining < 10:
        print(f"Warning: Only {remaining} requests remaining")
    return remaining
```
### 2. Implement Request Queuing

For bulk operations, pace requests to stay within limits:

```python
import time

import requests

class RateLimitedAPI:
    def __init__(self, requests_per_minute=60):
        self.requests_per_minute = requests_per_minute
        self.min_interval = 60 / requests_per_minute
        self.last_request_time = 0

    def make_request(self, url, headers):
        # Ensure a minimum interval between requests
        elapsed = time.time() - self.last_request_time
        if elapsed < self.min_interval:
            time.sleep(self.min_interval - elapsed)

        response = requests.get(url, headers=headers)
        self.last_request_time = time.time()
        return response.json()
```
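The fixed-interval pacing above spaces every request evenly. A token bucket is an alternative strategy that permits short bursts while keeping the same average rate; this is a sketch of the technique, not part of the API:

```python
import time

class TokenBucket:
    """Allow bursts up to `capacity` while refilling at `rate` tokens/second."""
    def __init__(self, rate, capacity):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity
        self.last_refill = time.monotonic()

    def acquire(self):
        """Block until a token is available, then consume it."""
        while True:
            # Refill tokens based on elapsed time, capped at capacity
            now = time.monotonic()
            self.tokens = min(self.capacity,
                              self.tokens + (now - self.last_refill) * self.rate)
            self.last_refill = now
            if self.tokens >= 1:
                self.tokens -= 1
                return
            time.sleep((1 - self.tokens) / self.rate)

# 60 requests/minute on average, with bursts of up to 5
bucket = TokenBucket(rate=1.0, capacity=5)
```

Call `bucket.acquire()` before each request; the first few calls in a burst return immediately, after which callers block until tokens refill.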
### 3. Cache Responses

Cache frequently accessed data to reduce API calls:

```python
import time

import requests

class CachedAPI:
    def __init__(self, cache_ttl=300):  # 5 minutes
        self.cache = {}
        self.cache_ttl = cache_ttl

    def get_cached(self, url, headers, cache_key):
        # Serve from the cache if the entry is still fresh
        if cache_key in self.cache:
            data, timestamp = self.cache[cache_key]
            if time.time() - timestamp < self.cache_ttl:
                return data

        # Fetch from the API and store in the cache
        response = requests.get(url, headers=headers)
        data = response.json()
        self.cache[cache_key] = (data, time.time())
        return data
```
### 4. Batch Requests When Possible

Use bulk endpoints instead of multiple individual requests:

```python
# ❌ Don't: Multiple individual requests
for keyword_id in keyword_ids:
    response = requests.get(f"/api/v1/planner/keywords/{keyword_id}/", headers=headers)

# ✅ Do: Use a bulk endpoint if available
response = requests.post(
    "/api/v1/planner/keywords/bulk/",
    json={"ids": keyword_ids},
    headers=headers
)
```
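If a bulk endpoint caps its batch size, split the ID list into chunks and issue one request per chunk. A sketch (the chunk size of 100 is an assumption, not a documented limit):

```python
def chunked(items, size):
    """Yield successive chunks of at most `size` items."""
    for i in range(0, len(items), size):
        yield items[i:i + size]

# e.g. one POST to the bulk endpoint per chunk of 100 IDs:
# for batch in chunked(keyword_ids, 100):
#     requests.post("/api/v1/planner/keywords/bulk/",
#                   json={"ids": batch}, headers=headers)
```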
## Rate Limit Bypass

### Development/Debug Mode

Rate limiting is automatically bypassed when:

- `DEBUG=True` in Django settings
- The `IGNY8_DEBUG_THROTTLE=True` environment variable is set
- The user belongs to the `aws-admin` account
- The user has the `admin` or `developer` role

**Note:** Headers are still set for debugging, but requests are not blocked.
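For local development, the bypass can be enabled via the environment variable listed above. A config sketch (only the variable name comes from this guide):

```shell
# Bypass throttling in local development (headers are still emitted)
export IGNY8_DEBUG_THROTTLE=True
```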
## Monitoring Rate Limits

### Track Usage

```python
class RateLimitMonitor:
    def __init__(self):
        self.usage_by_scope = {}

    def track_request(self, response, scope):
        if scope not in self.usage_by_scope:
            self.usage_by_scope[scope] = {'total': 0, 'limited': 0}

        self.usage_by_scope[scope]['total'] += 1
        if response.status_code == 429:
            self.usage_by_scope[scope]['limited'] += 1

        remaining = int(response.headers.get('X-Throttle-Remaining', 0))
        limit = int(response.headers.get('X-Throttle-Limit', 0))
        if limit > 0:  # Guard against missing headers
            usage_percent = ((limit - remaining) / limit) * 100
            if usage_percent > 80:
                print(f"Warning: {scope} at {usage_percent:.1f}% capacity")

    def get_report(self):
        return self.usage_by_scope
```
## Troubleshooting

### Issue: Frequent 429 Errors

**Causes:**

- Too many requests in a short time
- Not checking rate limit headers
- No request throttling implemented

**Solutions:**

- Implement request throttling
- Monitor the `X-Throttle-Remaining` header
- Add delays between requests
- Use bulk endpoints when available
### Issue: Rate Limits Too Restrictive

**Solutions:**

- Contact support for higher limits (if justified)
- Optimize requests (cache, batch, reduce frequency)
- Use a development account for testing (bypass enabled)
## Code Examples

### Python - Complete Rate Limit Handler

```python
import time

import requests

class RateLimitedClient:
    def __init__(self, base_url, token):
        self.base_url = base_url
        self.headers = {
            'Authorization': f'Bearer {token}',
            'Content-Type': 'application/json'
        }
        self.rate_limits = {}

    def _wait_for_rate_limit(self, scope='default'):
        """Wait if approaching the rate limit for this scope."""
        if scope in self.rate_limits:
            limit_info = self.rate_limits[scope]
            remaining = limit_info.get('remaining', 0)
            reset_time = limit_info.get('reset_time', 0)
            if remaining < 5:
                wait_time = max(0, reset_time - time.time())
                if wait_time > 0:
                    print(f"Rate limit low. Waiting {wait_time:.1f}s...")
                    time.sleep(wait_time)

    def _update_rate_limit_info(self, response, scope='default'):
        """Update stored rate limit information from response headers."""
        limit = response.headers.get('X-Throttle-Limit')
        remaining = response.headers.get('X-Throttle-Remaining')
        reset = response.headers.get('X-Throttle-Reset')
        if limit and remaining and reset:
            self.rate_limits[scope] = {
                'limit': int(limit),
                'remaining': int(remaining),
                'reset_time': int(reset)
            }

    def request(self, method, endpoint, scope='default', **kwargs):
        """Make a rate-limited request, retrying once on 429."""
        # Wait if approaching the limit
        self._wait_for_rate_limit(scope)

        url = f"{self.base_url}{endpoint}"
        response = requests.request(method, url, headers=self.headers, **kwargs)
        self._update_rate_limit_info(response, scope)

        # Handle rate limit error
        if response.status_code == 429:
            reset_time = int(response.headers.get('X-Throttle-Reset', 0))
            wait_time = max(1, reset_time - time.time())
            print(f"Rate limited. Waiting {wait_time:.1f}s...")
            time.sleep(wait_time)

            # Retry once
            response = requests.request(method, url, headers=self.headers, **kwargs)
            self._update_rate_limit_info(response, scope)

        return response.json()

    def get(self, endpoint, scope='default'):
        return self.request('GET', endpoint, scope)

    def post(self, endpoint, data, scope='default'):
        return self.request('POST', endpoint, scope, json=data)

# Usage
client = RateLimitedClient("https://api.igny8.com/api/v1", "your_token")

# Make requests with automatic rate limit handling
keywords = client.get("/planner/keywords/", scope="planner")
```