Rate Limits
Revenium applies generous rate limits to keep usage predictable for agents and integrations. Limits are scoped per account and grouped into three buckets, each with its own ceiling and observable response headers.
Buckets and limits
Every authenticated request is mapped to one of three buckets based on its path. Each bucket has its own requests-per-second limit.
| Bucket | Endpoints | Limit |
| --- | --- | --- |
| metering | /meter/v2/** (REST metering: AI completions, events, tool, API) and /v2/otlp/** (OTLP signals) | 1,000 req/sec |
| analytics | AI Metrics, AI Traces, billing analytics, chart and cost-attribution reads | 100 req/sec |
| platform | All other /profitstream/v2/api/** and /v2/sdk/** endpoints (CRUD on subscriptions, tenants, sources, integrations, alerts, models, anomalies, plus SDK auth endpoints) | 50 req/sec |
Limits are per account, not per API key. If your integration uses multiple keys under the same account, they share the bucket budget.
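Because the budget is shared across keys, it can help to throttle on the client side with one counter per bucket for the whole account. The following is a minimal sketch, not an official client: `AccountThrottle` and `try_acquire` are hypothetical names, and the fixed one-second window is an assumption about how to approximate the server's behavior, which is not specified here.

```python
import time
from typing import Callable, Dict, Tuple

# Per-second ceilings for each bucket, as listed above.
BUCKET_LIMITS: Dict[str, int] = {"metering": 1000, "analytics": 100, "platform": 50}


class AccountThrottle:
    """Client-side fixed-window counter shared by every API key of one account.

    Hypothetical helper; the server's exact windowing algorithm may differ.
    """

    def __init__(self, clock: Callable[[], float] = time.time) -> None:
        self._clock = clock
        # bucket -> (window start in whole seconds, requests counted in that window)
        self._windows: Dict[str, Tuple[int, int]] = {}

    def try_acquire(self, bucket: str) -> bool:
        """Return True if one more request fits in the current one-second window."""
        limit = BUCKET_LIMITS[bucket]
        second = int(self._clock())
        start, used = self._windows.get(bucket, (second, 0))
        if start != second:
            # A new second has begun, so the budget starts over.
            start, used = second, 0
        if used >= limit:
            return False
        self._windows[bucket] = (start, used + 1)
        return True
```

A single `AccountThrottle` instance would be shared by all workers that use keys from the same account. The sketch is not thread-safe; a lock around `try_acquire` would be needed for concurrent callers.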
Response headers
Four headers appear on every authenticated response that resolves to a bucket, regardless of status code (2xx including 204, 4xx, or 5xx).
| Header | Meaning |
| --- | --- |
| X-RateLimit-Limit | Maximum requests allowed in the current window for this bucket |
| X-RateLimit-Remaining | Requests remaining in the current window. Reaches 0 at the limit, never goes negative |
| X-RateLimit-Reset | Unix epoch (seconds) when the current window resets |
| X-RateLimit-Bucket | Which bucket this request mapped to: metering, analytics, or platform |
Use Remaining to throttle preemptively. Use Reset to schedule your next batch. Use Bucket so a metering call does not back off because a platform call was busy.
Limits reset each second. Use X-RateLimit-Reset to see exactly when.
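As an illustration of reading these headers, the sketch below parses the four values and computes how long to pause once the budget is spent. `RateLimitState`, `parse_rate_limit_headers`, `seconds_to_pause`, and the `floor` margin are hypothetical names introduced for this example; `headers` is assumed to be any string-to-string mapping from your HTTP client.

```python
import time
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class RateLimitState:
    limit: int        # X-RateLimit-Limit
    remaining: int    # X-RateLimit-Remaining
    reset_epoch: int  # X-RateLimit-Reset, Unix epoch seconds
    bucket: str       # X-RateLimit-Bucket


def parse_rate_limit_headers(headers: Mapping[str, str]) -> Optional[RateLimitState]:
    """Return the bucket state, or None if the four headers are missing or malformed."""
    lowered = {k.lower(): v for k, v in headers.items()}
    try:
        return RateLimitState(
            limit=int(lowered["x-ratelimit-limit"]),
            remaining=int(lowered["x-ratelimit-remaining"]),
            reset_epoch=int(lowered["x-ratelimit-reset"]),
            bucket=lowered["x-ratelimit-bucket"],
        )
    except (KeyError, ValueError):
        return None


def seconds_to_pause(state: RateLimitState, now: Optional[float] = None, floor: int = 0) -> float:
    """Seconds to wait before the next request in this bucket.

    `floor` is an assumed safety margin: pause once Remaining is at or below it.
    """
    if state.remaining > floor:
        return 0.0
    current = time.time() if now is None else now
    return max(0.0, state.reset_epoch - current)
```

Keeping one `RateLimitState` per `bucket` value lets a busy platform flow pause without delaying metering calls.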
When you hit the limit
A request that exceeds the bucket budget returns 429 Too Many Requests with two additional headers and a JSON body.
Additional headers:
| Header | Meaning |
| --- | --- |
| Retry-After | Seconds to wait before retrying. Integer, always present |
| X-RateLimit-Limited-Reason | Why the limit fired. Values: bucket-rate (bucket exhaustion) or error-pattern (repeated 4xx responses) |
Response body:
The body is a JSON object with type, code, message, and doc_url fields. A bucket field is also present when X-RateLimit-Limited-Reason is bucket-rate; error-pattern responses omit it.
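A minimal sketch of turning a 429 response into a structured value follows. It assumes the fields named above are top-level string members of the JSON object, which this page does not confirm; `RateLimitError` and `parse_429` are hypothetical names.

```python
import json
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class RateLimitError:
    retry_after: int       # seconds, from Retry-After
    reason: str            # "bucket-rate" or "error-pattern"
    code: str
    message: str
    bucket: Optional[str]  # None for error-pattern responses


def parse_429(headers: Mapping[str, str], body: str) -> RateLimitError:
    """Combine the 429 headers and JSON body (assumed flat) into one record."""
    lowered = {k.lower(): v for k, v in headers.items()}
    payload = json.loads(body)
    return RateLimitError(
        retry_after=int(lowered["retry-after"]),
        reason=lowered["x-ratelimit-limited-reason"],
        code=payload["code"],
        message=payload["message"],
        bucket=payload.get("bucket"),  # absent when the reason is error-pattern
    )
```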
Error-pattern protection
Beyond per-bucket budgets, Revenium temporarily slows callers that produce sustained 4xx errors. This protects accounts from runaway misconfigured integrations spamming the API with broken requests.
Sustained 4xx over a short window triggers a temporary block. Each consecutive violation doubles the cooldown, capped at one hour. Retry-After reflects the remaining cooldown. X-RateLimit-Limited-Reason will be error-pattern.
Fix the underlying request (auth, missing fields, invalid IDs) and the cooldown clears on its own.
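The doubling rule can be expressed as a small formula. The sketch below is illustrative only: the initial cooldown is not stated on this page, so `base_seconds` is a required parameter rather than a documented value, and `cooldown_seconds` is a hypothetical name. In practice, rely on Retry-After rather than computing the cooldown yourself.

```python
MAX_COOLDOWN_SECONDS = 3600  # the cooldown is capped at one hour


def cooldown_seconds(consecutive_violations: int, base_seconds: int) -> int:
    """Cooldown after the Nth consecutive violation, doubling each time up to the cap.

    `base_seconds` (the first cooldown) is an assumption supplied by the caller.
    """
    if consecutive_violations < 1:
        return 0
    doubled = base_seconds * 2 ** (consecutive_violations - 1)
    return min(doubled, MAX_COOLDOWN_SECONDS)
```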
Recommended client behavior
- Read the headers. Treat X-RateLimit-Remaining and X-RateLimit-Reset as authoritative. If Remaining is low, slow down before you hit 0.
- Honor Retry-After. On 429, wait at least that many seconds before retrying. Do not retry sooner.
- Use exponential backoff with jitter on repeat 429s. Start at the Retry-After value and add randomized jitter to avoid thundering-herd retries with other clients in your account.
- Differentiate by Limited-Reason:
  - bucket-rate: you are sending too fast. Throttle by Remaining.
  - error-pattern: you are sending broken requests. Fix the request shape, not the rate.
- Separate metering from platform calls. The metering bucket has 20x the platform budget. A busy platform flow should not slow down metering.
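The recommendations above can be combined into one retry loop. This is a sketch under stated assumptions, not a reference client: `send` is a hypothetical callable that performs one request and returns `(status, headers)`, and `max_attempts`, the doubling schedule, and the jitter range are illustrative choices.

```python
import random
import time
from typing import Callable, Mapping, Tuple

Response = Tuple[int, Mapping[str, str]]  # (status code, response headers)


class BrokenRequestError(Exception):
    """The server reported error-pattern, so retrying the same request will not help."""


def send_with_backoff(
    send: Callable[[], Response],
    max_attempts: int = 5,
    sleep: Callable[[float], None] = time.sleep,
    rng: Callable[[], float] = random.random,
) -> Response:
    """Retry on 429, waiting at least Retry-After and backing off on repeats."""
    for attempt in range(max_attempts):
        status, headers = send()
        if status != 429:
            return status, headers
        lowered = {k.lower(): v for k, v in headers.items()}
        if lowered.get("x-ratelimit-limited-reason") == "error-pattern":
            # Broken requests: fix the request shape instead of slowing down.
            raise BrokenRequestError("fix the request before retrying")
        # Start at Retry-After (at least 1s so jitter is non-zero), doubling per repeat 429.
        delay = max(1, int(lowered["retry-after"])) * (2 ** attempt)
        # Jitter is added on top, so the wait is never shorter than Retry-After.
        sleep(delay + rng() * delay)
    raise RuntimeError("still rate limited after max_attempts")
```

Injecting `sleep` and `rng` keeps the loop testable without real waiting. Running a separate loop per bucket keeps a throttled platform flow from delaying metering calls.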
If you have a use case that needs sustained higher throughput on a specific bucket, contact support so we can review the integration.