❗Cost & Performance Alerts
Configure Revenium cost & performance alerts to be notified as soon as something changes, not when the bill arrives.
Revenium’s alerting system detects unexpected changes in usage or spend in near real time, so you can act while the problem is still small.
The alert dashboard allows you to configure alerts on key metrics, apply scoped filters, and receive notifications via your preferred channels.

Supported Alert Types
You can configure alerts based on two condition types:
1. Spike Detection
Fires when a metric exceeds a defined threshold during a scheduled check.
Operates across any rolling time window.
Best for monitoring rate-based metrics and detecting unusual spikes.
Example: Alert when cost per transaction exceeds $0.25 for a specific product or agent.
Example: Alert when error rate exceeds 5% for any API endpoint.
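As a rough illustration of a spike check over a rolling window, the sketch below averages the cost of recent transactions and compares the result with a threshold. The `Transaction` type and the `cost_per_transaction` and `spike_fires` functions are hypothetical names used only for this example; they are not part of Revenium's API, and the actual check is configured in the alert dashboard.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Transaction:
    # Hypothetical record of one metered request.
    timestamp: datetime
    cost_usd: float


def cost_per_transaction(transactions, now, window):
    """Average cost of transactions with a timestamp in (now - window, now]."""
    recent = [t for t in transactions if now - window < t.timestamp <= now]
    if not recent:
        return 0.0  # assumption: an empty window never fires
    return sum(t.cost_usd for t in recent) / len(recent)


def spike_fires(transactions, now, window, threshold_usd):
    """True when the windowed cost per transaction exceeds the threshold."""
    return cost_per_transaction(transactions, now, window) > threshold_usd


now = datetime(2024, 1, 1, 12, 0)
history = [
    Transaction(now - timedelta(minutes=50), 0.10),  # outside a 30-minute window
    Transaction(now - timedelta(minutes=20), 0.30),
    Transaction(now - timedelta(minutes=5), 0.40),
]

# The two recent transactions average $0.35, above the $0.25 threshold.
fired = spike_fires(history, now, timedelta(minutes=30), threshold_usd=0.25)
print(fired)  # True
```

Only transactions inside the window contribute, so an old, cheap request does not mask a recent jump in cost.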
2. Budget Threshold
Fires when cumulative usage or cost exceeds a threshold within a calendar period.
Automatically resets at the start of each new period (daily, weekly, monthly, quarterly).
Ideal for budget monitoring and cost control.
Example: Alert when total cost exceeds $5,000 in a month.
Example: Alert when token usage exceeds 1 million tokens per week.
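The reset behavior can be sketched as a sum over the current calendar period only. This is a minimal illustration, not Revenium's implementation: `period_start` and `budget_fires` are hypothetical helpers, and treating Monday as the start of the week is an assumption.

```python
from datetime import date, timedelta


def period_start(day, period):
    """First day of the calendar period that contains `day`."""
    if period == "daily":
        return day
    if period == "weekly":
        # Assumption: weeks start on Monday (weekday() == 0).
        return day - timedelta(days=day.weekday())
    if period == "monthly":
        return day.replace(day=1)
    if period == "quarterly":
        first_month = 3 * ((day.month - 1) // 3) + 1
        return day.replace(month=first_month, day=1)
    raise ValueError(f"unknown period: {period}")


def budget_fires(usage, today, period, threshold):
    """usage is an iterable of (date, amount); only the current period is summed."""
    start = period_start(today, period)
    total = sum(amount for day, amount in usage if start <= day <= today)
    return total > threshold


usage = [
    (date(2024, 4, 28), 3000.0),  # previous month, excluded after the reset
    (date(2024, 5, 3), 2500.0),
    (date(2024, 5, 17), 2600.0),
]

# May spend is $5,100, so a $5,000 monthly budget alert fires on May 20.
print(budget_fires(usage, date(2024, 5, 20), "monthly", 5000.0))  # True
# On May 10 only $2,500 has accrued in the period, so it does not fire.
print(budget_fires(usage, date(2024, 5, 10), "monthly", 5000.0))  # False
```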
💰 Learn more about Budget Monitoring →
Available Metrics and Alert Types
| Metric | Description | Alert types |
| --- | --- | --- |
| Total Cost | Total cost incurred | Both types available |
| Cost Per Transaction | Cost per task/request | Spike Detection only |
| Tokens Per Minute | Rate of token generation/processing | Spike Detection only |
| Requests Per Minute | Rate of API requests | Spike Detection only |
| Error Rate (%) | Percentage of requests resulting in errors | Spike Detection only |
| Error Count | Total number of errors | Both types available |
| Token Count | Total tokens processed/generated | Both types available |
| Input Tokens | Total input tokens processed | Both types available |
| Output Tokens | Total output tokens processed | Both types available |
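If you generate or review alert definitions in your own tooling, the metric and alert type pairings listed above can be encoded as a lookup and checked before an alert is created. The identifiers below (`total_cost`, `spike_detection`, `validate_alert`, and so on) are hypothetical and chosen for this sketch; they are not Revenium API names.

```python
SPIKE = "spike_detection"
BUDGET = "budget_threshold"

# Which alert types each metric supports, per the list above.
SUPPORTED_ALERT_TYPES = {
    "total_cost": {SPIKE, BUDGET},
    "cost_per_transaction": {SPIKE},
    "tokens_per_minute": {SPIKE},
    "requests_per_minute": {SPIKE},
    "error_rate": {SPIKE},
    "error_count": {SPIKE, BUDGET},
    "token_count": {SPIKE, BUDGET},
    "input_tokens": {SPIKE, BUDGET},
    "output_tokens": {SPIKE, BUDGET},
}


def validate_alert(metric, alert_type):
    """Raise ValueError when the metric does not support the alert type."""
    allowed = SUPPORTED_ALERT_TYPES.get(metric)
    if allowed is None:
        raise ValueError(f"unknown metric: {metric}")
    if alert_type not in allowed:
        raise ValueError(f"{metric} does not support {alert_type}")


validate_alert("total_cost", BUDGET)  # passes silently

try:
    validate_alert("error_rate", BUDGET)  # rate metrics are spike-only
except ValueError as exc:
    print(exc)
```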
Filtering Options
Alerts can be scoped using one or more dimensions. You may apply filters to narrow or group alert conditions by:
Organization
Credential
Product
Model
Provider
Agent
Subscriber
Supported filter operators:
equals
contains
starts with
ends with
Filters can be combined to target a specific workload, tenant, or user.
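To make the operator semantics concrete, here is a small sketch of how combined filters might be evaluated against a single usage event. It assumes that combined filters must all match (logical AND) and that matching is case-sensitive; the source does not state either, and the `OPERATORS` table, `matches` function, and event field names are hypothetical.

```python
# Each operator takes the event's value and the filter's target string.
OPERATORS = {
    "equals": lambda value, target: value == target,
    "contains": lambda value, target: target in value,
    "starts with": lambda value, target: value.startswith(target),
    "ends with": lambda value, target: value.endswith(target),
}


def matches(event, filters):
    """True when the event satisfies every (dimension, operator, target) filter."""
    return all(
        OPERATORS[operator](event.get(dimension, ""), target)
        for dimension, operator, target in filters
    )


event = {
    "product": "summarization-api",
    "model": "gpt-4-turbo",
    "credential": "dev-key-42",
}

filters = [
    ("product", "equals", "summarization-api"),
    ("credential", "starts with", "dev-key-"),
]

print(matches(event, filters))  # True
print(matches(event, [("model", "ends with", "opus")]))  # False
```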
Notification Channels
Alerts can be sent to:
Email
Slack
If you need additional notification mechanisms (e.g., PagerDuty, Webhook, Opsgenie), please let us know.
Common Alerting Scenarios
These examples illustrate high-value alert configurations used by developers, FinOps leads, and engineering managers.
1. Prevent Cost Spikes per Request
Use Case: A developer tests a new feature or model and needs to ensure it doesn’t exceed budgeted request costs.
Metric: Cost per transaction
Filter: Product = summarization-api
Condition: Threshold > $0.25
Action: Send Slack alert if request costs exceed target
2. Enforce Daily Token Budgets
Use Case: Limit individual user activity to manage runaway costs.
Metric: Token count
Filter: Credential contains dev-key-*
Condition: Cumulative Usage ≥ 1,000,000 tokens / Daily
Action: Alert user and engineering lead via email
3. Catch Performance Degradations in Request Volume
Use Case: Alert if request rate drops or surges abnormally.
Metric: Requests per minute
Filter: Model = gpt-4-turbo
Condition: Threshold < 10 or > 1000 / minute
Action: Notify platform team via Slack
4. Track Monthly Spend by Business Unit
Use Case: Finance team manages AI budgets per department.
Metric: Total cost
Filter: Organization = acme-corp
Condition: Cumulative Usage > $10,000 / Monthly
Action: Email budget owner with CSV export of usage data
5. Flag Sudden Cost Surges
Use Case: Alert on sharp spikes regardless of baseline.
Metric: Total cost
Condition: Change > 30% / 24hr
Action: Notify platform operations team
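The source does not define the baseline for the 30% change. One plausible reading, assumed here, is that the total cost of the latest 24 hours is compared with the total for the preceding 24 hours. The `percent_change` and `surge_fires` functions are hypothetical names for this sketch.

```python
def percent_change(previous, current):
    """Relative change from previous to current, in percent; None if undefined."""
    if previous == 0:
        return None  # no baseline to compare against
    return (current - previous) * 100.0 / previous


def surge_fires(previous, current, threshold_pct=30.0):
    """True when cost grew by more than threshold_pct between the two windows."""
    change = percent_change(previous, current)
    return change is not None and change > threshold_pct


# $400 in the previous 24 hours, $540 in the latest 24 hours: a 35% increase.
print(percent_change(400.0, 540.0))  # 35.0
print(surge_fires(400.0, 540.0))     # True
print(surge_fires(400.0, 500.0))     # False (25% increase)
```

A relative threshold like this scales with the workload, which is why it suits workloads whose normal spend varies widely.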
6. Control Cost per Feature or Agent
Use Case: Understand cost efficiency of specific workloads.
Metric: Cost per transaction
Filter: Agent = recommendation-engine
Condition: Threshold > $1.00
Action: Post alert to Slack channel with link to usage explorer
7. Monitor Token Throughput
Use Case: Ensure high-volume workloads don’t exceed rate limits or quota.
Metric: Tokens per minute
Condition: Threshold > 150,000 TPM
Action: Trigger Slack alert and send webhook to autoscaler
8. Proactively Surface High Error Rates
Use Case: Minimize the impact of backend or model instability.
Metric: Error rate
Filter: Model = claude-3-opus
Condition: Threshold > 5% over 10 minutes
Action: Notify SRE team for triage
Summary
Revenium’s alerting system provides early, targeted visibility into cost and performance anomalies—so developers and teams can fix problems before they impact budgets. Alerts are configurable, scoped to the workloads that matter, and designed to eliminate billing surprises.
For questions or to request support for additional alert types or channels, contact the Revenium team.
