Rate Limits
API key requests are subject to the following rate limits:
Default Limits
| Limit | Value | Window |
|---|---|---|
| Requests per minute | 30 | Rolling 60-second window |
| Decisions per hour | 20 | Rolling 3600-second window |
Both limits are enforced independently. Exceeding either one returns an HTTP 429 response.
Rate Limit Response
When rate limited, the API returns:
{
  "error": {
    "code": "RATE_LIMIT_EXCEEDED",
    "message": "API key rate limit exceeded: 31/30 requests per minute.",
    "status": 429,
    "details": {},
    "request_id": "req_abc123",
    "timestamp": "2026-02-19T12:00:00Z"
  }
}
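If you call the HTTP API directly rather than through an SDK, the error body above can be parsed into a small structure before you decide how to react. The following is a minimal sketch, not part of either SDK: `ApiError`, `parse_error`, and `is_rate_limited` are hypothetical names, and the field set is taken only from the sample response shown above.

```python
import json
from dataclasses import dataclass


@dataclass
class ApiError:
    """Hypothetical container for the fields of the error envelope."""
    code: str
    message: str
    status: int
    request_id: str


def parse_error(body: str) -> ApiError:
    # The envelope nests everything under a top-level "error" key.
    err = json.loads(body)["error"]
    return ApiError(
        code=err["code"],
        message=err["message"],
        status=err["status"],
        request_id=err["request_id"],
    )


def is_rate_limited(err: ApiError) -> bool:
    # Check both the status and the error code from the body.
    return err.status == 429 and err.code == "RATE_LIMIT_EXCEEDED"


# The sample response from this page.
sample = """
{
  "error": {
    "code": "RATE_LIMIT_EXCEEDED",
    "message": "API key rate limit exceeded: 31/30 requests per minute.",
    "status": 429,
    "details": {},
    "request_id": "req_abc123",
    "timestamp": "2026-02-19T12:00:00Z"
  }
}
"""
error = parse_error(sample)
print(is_rate_limited(error), error.request_id)
```

Logging the `request_id` alongside the error makes it easier to reference a specific rejected request later.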
How Rate Limits Work
- Limits are tracked per API key, not per IP address or account.
- Each clone can have up to 5 API keys, each with independent rate limits.
- The requests per minute counter increments on every request, regardless of success or failure.
- The decisions per hour counter increments only on successful decision requests.
- Counters reset automatically after their window expires.
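The counting rules above can be mirrored on the client to estimate whether a key still has headroom before a request is sent. This is a rough sketch under stated assumptions: `RollingWindow` and `KeyBudget` are hypothetical helpers, every tracked request is assumed to be a decision request, and the server's actual accounting may differ from this local estimate.

```python
import time
from collections import deque


class RollingWindow:
    """Counts events that fall within the trailing `window` seconds."""

    def __init__(self, limit: int, window: float):
        self.limit = limit
        self.window = window
        self._events = deque()

    def _prune(self, now: float) -> None:
        # Drop events that have aged out of the window.
        while self._events and now - self._events[0] >= self.window:
            self._events.popleft()

    def has_capacity(self, now: float) -> bool:
        self._prune(now)
        return len(self._events) < self.limit

    def record(self, now: float) -> None:
        self._events.append(now)


class KeyBudget:
    """Client-side estimate for ONE API key (limits are tracked per key)."""

    def __init__(self):
        self.requests = RollingWindow(limit=30, window=60)      # per minute
        self.decisions = RollingWindow(limit=20, window=3600)   # per hour

    def can_send(self, now=None) -> bool:
        now = time.monotonic() if now is None else now
        return self.requests.has_capacity(now) and self.decisions.has_capacity(now)

    def record(self, succeeded: bool, now=None) -> None:
        now = time.monotonic() if now is None else now
        self.requests.record(now)       # every request counts
        if succeeded:
            self.decisions.record(now)  # only successful decisions count
```

Call `can_send()` before each request and `record(succeeded=...)` after it. The `now` parameter exists only to make the helper easy to test with fixed timestamps.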
Best Practices
1. Implement Exponential Backoff
When you receive a 429, wait before retrying:
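import time

# Assumes `client` and `RateLimitError` come from the Python SDK.
retry_after = 2  # start with 2 seconds
max_attempts = 3
for attempt in range(max_attempts):
    try:
        result = client.decide(context="...")
        break
    except RateLimitError:
        if attempt == max_attempts - 1:
            raise  # out of retries: surface the error to the caller
        time.sleep(retry_after)
        retry_after *= 2  # double the wait before the next attempt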
2. Cache Decisions
If the same context produces the same decision, cache results to reduce API calls.
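One way to do this is a small in-memory cache keyed by a hash of the context string. The sketch below is illustrative only: `DecisionCache`, its five-minute default TTL, and `fake_decide` are assumptions, and `fake_decide` stands in for a real call such as `client.decide(context=...)`.

```python
import hashlib
import time


class DecisionCache:
    """Hypothetical in-memory cache with a time-to-live per entry."""

    def __init__(self, ttl_seconds: float = 300.0):
        self.ttl = ttl_seconds
        self._store = {}  # key -> (stored_at, result)

    @staticmethod
    def _key(context: str) -> str:
        # Hash the context so long inputs make compact dictionary keys.
        return hashlib.sha256(context.encode("utf-8")).hexdigest()

    def get_or_fetch(self, context: str, fetch):
        key = self._key(context)
        now = time.monotonic()
        hit = self._store.get(key)
        if hit is not None and now - hit[0] < self.ttl:
            return hit[1]          # fresh entry: no API call
        result = fetch(context)    # miss or expired: call the API once
        self._store[key] = (now, result)
        return result


# Stand-in for the real API call; records how often it is invoked.
calls = []


def fake_decide(context: str) -> str:
    calls.append(context)
    return f"decision for {context}"


cache = DecisionCache()
first = cache.get_or_fetch("refund request", fake_decide)
second = cache.get_or_fetch("refund request", fake_decide)
print(len(calls))  # the second lookup is served from the cache
```

Caching is only appropriate when a repeated context is expected to yield the same decision; pick a TTL short enough that stale results are acceptable for your use case.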
3. Use Separate Keys for Environments
Use different API keys for development, staging, and production to avoid test traffic consuming production limits.
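A simple way to keep keys apart is to read each one from its own environment variable. This is a sketch under assumptions: the `DECISION_API_KEY_*` variable names and the `api_key_for` helper are hypothetical, so substitute whatever naming your deployment already uses.

```python
import os

ENVIRONMENTS = ("development", "staging", "production")


def api_key_for(environment: str, env=os.environ) -> str:
    """Return the API key configured for the given environment."""
    if environment not in ENVIRONMENTS:
        raise ValueError(f"unknown environment: {environment!r}")
    # Hypothetical naming scheme, e.g. DECISION_API_KEY_STAGING.
    var = f"DECISION_API_KEY_{environment.upper()}"
    key = env.get(var)
    if not key:
        # Fail loudly rather than silently falling back to another key.
        raise RuntimeError(f"{var} is not set")
    return key
```

Failing when a variable is missing, instead of falling back to the production key, keeps test traffic from drawing on production limits by accident.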
4. Monitor Usage
Check the API key's total_requests and last_used_at in the dashboard to track consumption patterns.
SDK Behavior
Both official SDKs surface rate limits as typed errors rather than retrying for you:
- Python SDK: Raises RateLimitError with a retry_after attribute.
- TypeScript SDK: Throws RateLimitError with a retryAfter property.
- Neither SDK auto-retries on 429. Rate limit errors are always surfaced to your code.
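Because retries are left to your code, a small wrapper can combine the error's `retry_after` value with exponential backoff. The sketch below makes several assumptions: the `RateLimitError` class here is a local stand-in so the example runs on its own (in real code, use the exception from the SDK), `retry_after` is assumed to be a number of seconds that may be absent, and `call_with_retry` is a hypothetical helper.

```python
import random
import time


class RateLimitError(Exception):
    """Stand-in for the SDK exception; import the real one in your code."""

    def __init__(self, retry_after=None):
        super().__init__("rate limit exceeded")
        self.retry_after = retry_after  # assumed to be seconds


def call_with_retry(fn, max_attempts=4, base_delay=2.0, sleep=time.sleep):
    """Call fn(), retrying on RateLimitError up to max_attempts times."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except RateLimitError as exc:
            if attempt == max_attempts - 1:
                raise  # out of retries: let the caller handle it
            # Prefer the hint on the error; otherwise back off exponentially.
            backoff = base_delay * (2 ** attempt)
            delay = exc.retry_after if exc.retry_after is not None else backoff
            # Small random jitter so parallel workers do not retry in lockstep.
            sleep(delay + random.uniform(0, 0.5))
```

With the Python SDK this might be used as `call_with_retry(lambda: client.decide(context="..."))`. The `sleep` parameter is injectable so the wrapper can be tested without waiting.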