Rate Limits¶

API key requests are subject to the following rate limits:

Default Limits¶

Limit	Value	Window
Requests per minute	30	Rolling 60-second window
Decisions per hour	20	Rolling 3600-second window

Both limits are enforced independently. Exceeding either one returns an HTTP 429 (Too Many Requests) status.

Rate Limit Response¶

When rate limited, the API returns:

{
  "error": {
    "code": "RATE_LIMIT_EXCEEDED",
    "message": "API key rate limit exceeded: 31/30 requests per minute.",
    "status": 429,
    "details": {},
    "request_id": "req_abc123",
    "timestamp": "2026-02-19T12:00:00Z"
  }
}
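A client that calls the API over raw HTTP can recognize this response by checking both the status and the error code. The sketch below assumes the body has exactly the shape shown above; `parse_rate_limit_error` is a hypothetical helper name, not part of either SDK.

```python
import json
from typing import Optional


def parse_rate_limit_error(status: int, body: str) -> Optional[dict]:
    """Return the `error` object if the response is a rate limit error, else None."""
    if status != 429:
        return None
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return None
    if not isinstance(payload, dict):
        return None
    error = payload.get("error")
    if isinstance(error, dict) and error.get("code") == "RATE_LIMIT_EXCEEDED":
        return error
    return None


sample = '''{"error": {"code": "RATE_LIMIT_EXCEEDED",
  "message": "API key rate limit exceeded: 31/30 requests per minute.",
  "status": 429, "details": {}, "request_id": "req_abc123",
  "timestamp": "2026-02-19T12:00:00Z"}}'''

error = parse_rate_limit_error(429, sample)
if error is not None:
    # Log the request_id so the rejected call can be traced later.
    print(f"rate limited ({error['request_id']}): {error['message']}")
```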

How Rate Limits Work¶

Limits are tracked per API key, not per IP address or account.
Each clone can have up to 5 API keys, each with independent rate limits.
The requests per minute counter increments on every request, regardless of success or failure.
The decisions per hour counter increments only on successful decision requests.
Counters reset automatically after their window expires.
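To stay under both limits proactively, a client can mirror them locally. The sketch below is a client-side sliding-window counter that uses the values from the Default Limits table. It is one possible way to throttle on the client, not a description of how the server tracks usage, and `SlidingWindowLimiter` is a hypothetical name.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `limit` events in any rolling `window` seconds."""

    def __init__(self, limit: int, window: float, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock
        self.events = deque()  # timestamps of recorded events, oldest first

    def _evict(self, now: float) -> None:
        # Drop timestamps that have aged out of the window.
        while self.events and now - self.events[0] >= self.window:
            self.events.popleft()

    def wait_time(self) -> float:
        """Seconds until another event fits; 0.0 if one fits now."""
        now = self.clock()
        self._evict(now)
        if len(self.events) < self.limit:
            return 0.0
        return self.events[0] + self.window - now

    def record(self) -> None:
        """Record one event at the current time."""
        self.events.append(self.clock())


# One pair of limiters per API key, since limits are tracked per key.
requests_per_minute = SlidingWindowLimiter(limit=30, window=60)
decisions_per_hour = SlidingWindowLimiter(limit=20, window=3600)
```

Before each call, sleep for the larger of the two `wait_time()` values. Call `record()` on the request limiter for every request and on the decision limiter only after a successful decision, mirroring the counting rules above.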

Best Practices¶

1. Implement Exponential Backoff¶

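When you receive a 429, wait before retrying, and double the wait after each failed attempt:

import time

# `client` and `RateLimitError` come from the Python SDK.
delay = 2  # initial wait in seconds
max_attempts = 3
for attempt in range(max_attempts):
    try:
        result = client.decide(context="...")
        break
    except RateLimitError:
        if attempt == max_attempts - 1:
            raise  # out of retries: surface the error to the caller
        time.sleep(delay)
        delay *= 2  # double the wait after each 429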

2. Cache Decisions¶

If the same context produces the same decision, cache results to reduce API calls.
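A minimal sketch of such a cache, assuming decisions for identical context strings can safely be reused for a short time. `CachedDecider`, the `decide` callable, and the 300-second TTL are illustrative assumptions; substitute your own SDK call and expiry policy.

```python
import time
from typing import Any, Callable, Dict, Tuple


class CachedDecider:
    """Memoize decisions by context string for `ttl` seconds."""

    def __init__(self, decide: Callable[[str], Any], ttl: float = 300.0,
                 clock: Callable[[], float] = time.monotonic):
        self._decide = decide
        self._ttl = ttl
        self._clock = clock
        self._cache: Dict[str, Tuple[float, Any]] = {}

    def decide(self, context: str) -> Any:
        now = self._clock()
        hit = self._cache.get(context)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]  # fresh cached result: no API call is made
        result = self._decide(context)  # this call counts against the limits
        self._cache[context] = (now, result)
        return result


# Stand-in for the real API call, used here only to count invocations.
calls = []


def fake_decide(context: str) -> str:
    calls.append(context)
    return f"decision for {context}"


cached = CachedDecider(fake_decide)
cached.decide("refund request")
cached.decide("refund request")  # served from the cache
print(len(calls))
```

Only cache when a stale decision is acceptable for your use case; a shorter TTL trades fewer saved calls for fresher results.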

3. Use Separate Keys for Environments¶

Use different API keys for development, staging, and production to avoid test traffic consuming production limits.
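One way to keep keys separate is to select the key from the deployment environment at startup. The variable naming scheme below (`API_KEY_DEVELOPMENT`, `API_KEY_STAGING`, `API_KEY_PRODUCTION`) and the helper `api_key_for` are hypothetical; use whatever your configuration system provides.

```python
import os
from typing import Mapping

ENVIRONMENTS = ("development", "staging", "production")


def api_key_for(env_name: str, environ: Mapping[str, str] = os.environ) -> str:
    """Look up the API key for one deployment environment."""
    if env_name not in ENVIRONMENTS:
        raise ValueError(f"unknown environment: {env_name!r}")
    var = f"API_KEY_{env_name.upper()}"  # hypothetical naming scheme
    key = environ.get(var)
    if not key:
        # Fail loudly rather than fall back to another environment's key.
        raise RuntimeError(f"{var} is not set")
    return key
```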

4. Monitor Usage¶

Check the API key's total_requests and last_used_at in the dashboard to track consumption patterns.

SDK Behavior¶

Both official SDKs surface rate limit responses as a typed error:

Python SDK: Raises RateLimitError with a retry_after attribute.
TypeScript SDK: Throws RateLimitError with a retryAfter property.
Neither SDK retries automatically on a 429; rate limit errors are always surfaced to your code.
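Because retries are left to the caller, a small wrapper can honor the hint carried on the error. The sketch below assumes `retry_after` is a number of seconds (the unit is an assumption) and may be absent. It defines a stand-in `RateLimitError` so the example runs on its own; in real code, catch the SDK's exception instead. `call_with_retry` is a hypothetical helper name.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class RateLimitError(Exception):
    """Stand-in for the SDK exception; only `retry_after` is modeled."""

    def __init__(self, retry_after: Optional[float] = None):
        super().__init__("rate limit exceeded")
        self.retry_after = retry_after


def call_with_retry(fn: Callable[[], T], max_attempts: int = 3,
                    base_delay: float = 2.0,
                    sleep: Callable[[float], None] = time.sleep) -> T:
    """Call `fn`, retrying on RateLimitError up to `max_attempts` times."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except RateLimitError as exc:
            if attempt == max_attempts - 1:
                raise  # out of attempts: let the caller handle it
            # Prefer the hint on the error; otherwise back off exponentially.
            fallback = base_delay * (2 ** attempt)
            delay = exc.retry_after if exc.retry_after is not None else fallback
            sleep(delay)
    raise ValueError("max_attempts must be at least 1")
```

A call site would then look like `result = call_with_retry(lambda: client.decide(context="..."))`, where `client` is your SDK client.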