Cost & Performance Alerts

Configure Revenium cost & performance alerts to be notified as soon as something changes, not when the bill arrives.

Revenium’s cost and performance alerting system helps you detect unexpected changes in usage or spend in near real time, so you can act before an unexpected bill arrives rather than after.

The alert dashboard allows you to configure alerts on key metrics, apply scoped filters, and receive notifications via your preferred channels.


Supported Alert Types

You can configure alerts based on two condition types:

1. Spike Detection

Fires when a metric exceeds a defined threshold during a scheduled check.

  • Operates across any rolling time window.

  • Best for monitoring rate-based metrics and detecting unusual spikes.

  • Example: Alert when cost per transaction exceeds $0.25 for a specific product or agent.

  • Example: Alert when error rate exceeds 5% for any API endpoint.
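
The spike check described above can be sketched as follows. This is an illustrative model only, not Revenium's implementation: the function names, the event layout, and the 15-minute window are assumptions chosen for the example.

```python
from datetime import datetime, timedelta

def cost_per_transaction(events, now, window):
    """Average cost per transaction for events inside the rolling window."""
    recent = [e for e in events if now - window < e["time"] <= now]
    if not recent:
        return 0.0
    return sum(e["cost"] for e in recent) / len(recent)

def spike_fires(value, threshold):
    """A spike alert fires only when the metric exceeds the threshold."""
    return value > threshold

# Hypothetical data evaluated at one scheduled check.
now = datetime(2024, 1, 1, 12, 0)
events = [
    {"time": now - timedelta(minutes=50), "cost": 0.10},  # outside the window
    {"time": now - timedelta(minutes=10), "cost": 0.30},
    {"time": now - timedelta(minutes=5), "cost": 0.26},
]
value = cost_per_transaction(events, now, timedelta(minutes=15))
print(spike_fires(value, 0.25))
```

Only the two events inside the window count toward the average, so an old, cheap request does not mask a recent rise in cost.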

2. Budget Threshold

Fires when cumulative usage or cost exceeds a threshold within a calendar period.

  • Automatically resets at the start of each new period (daily, weekly, monthly, quarterly).

  • Ideal for budget monitoring and cost control.

  • Example: Alert when total cost exceeds $5,000 in a month.

  • Example: Alert when token usage exceeds 1 million tokens per week.
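
The reset behavior can be illustrated with a short sketch. It assumes that weeks start on Monday and that usage is available as (day, amount) pairs; both are assumptions made for the example, and the function names are hypothetical rather than part of Revenium.

```python
from datetime import date, timedelta

def period_start(day, period):
    """First day of the calendar period that contains `day`."""
    if period == "daily":
        return day
    if period == "weekly":
        return day - timedelta(days=day.weekday())  # Monday start (assumed)
    if period == "monthly":
        return day.replace(day=1)
    if period == "quarterly":
        first_month = 3 * ((day.month - 1) // 3) + 1
        return day.replace(month=first_month, day=1)
    raise ValueError(f"unknown period: {period}")

def budget_exceeded(usage, today, period, limit):
    """Sum usage since the current period began and compare it to the limit."""
    start = period_start(today, period)
    total = sum(amount for day, amount in usage if start <= day <= today)
    return total > limit

# Hypothetical daily cost records: (day, cost in dollars).
usage = [
    (date(2024, 5, 30), 3000.0),  # previous month, ignored after the reset
    (date(2024, 6, 2), 2500.0),
    (date(2024, 6, 20), 2600.0),
]
print(budget_exceeded(usage, date(2024, 6, 20), "monthly", 5000.0))
```

Because the total restarts at each period boundary, the May spend does not count against the June budget.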

💰 Learn more about Budget Monitoring →


Available Metrics and Alert Types

| Metric | Description | Available Alert Types |
|---|---|---|
| Total Cost | Total cost incurred | Both Types Available |
| Cost Per Transaction | Cost per task/request | Spike Detection Only |
| Tokens Per Minute | Rate of token generation/processing | Spike Detection Only |
| Requests Per Minute | Rate of API requests | Spike Detection Only |
| Error Rate (%) | Percentage of requests resulting in errors | Spike Detection Only |
| Error Count | Total number of errors | Both Types Available |
| Token Count | Total tokens processed/generated | Both Types Available |
| Input Tokens | Total input tokens processed | Both Types Available |
| Output Tokens | Total output tokens processed | Both Types Available |
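
If you manage alert definitions in your own tooling, the metric and alert type combinations listed above can be encoded as a lookup and checked before an alert is created. The identifiers below (such as total_cost and spike_detection) are hypothetical names chosen for this sketch, not Revenium API values.

```python
SPIKE = "spike_detection"
BUDGET = "budget_threshold"
BOTH = frozenset({SPIKE, BUDGET})
SPIKE_ONLY = frozenset({SPIKE})

# Mirrors the metric list above; keys are illustrative identifiers.
ALLOWED_ALERT_TYPES = {
    "total_cost": BOTH,
    "cost_per_transaction": SPIKE_ONLY,
    "tokens_per_minute": SPIKE_ONLY,
    "requests_per_minute": SPIKE_ONLY,
    "error_rate": SPIKE_ONLY,
    "error_count": BOTH,
    "token_count": BOTH,
    "input_tokens": BOTH,
    "output_tokens": BOTH,
}

def is_supported(metric, alert_type):
    """Return True when the metric can be used with the given alert type."""
    return alert_type in ALLOWED_ALERT_TYPES.get(metric, frozenset())

print(is_supported("total_cost", BUDGET))
print(is_supported("error_rate", BUDGET))
```

Rate-based metrics map to spike detection only, which matches the idea that a rate has no meaningful cumulative total to budget against.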


Filtering Options

Alerts can be scoped using one or more dimensions. You may apply filters to narrow or group alert conditions by:

  • Organization

  • Credential

  • Product

  • Model

  • Provider

  • Agent

  • Subscriber

Supported filter operators:

  • equals

  • contains

  • starts with

  • ends with

Filters can be combined to target a specific workload, tenant, or user.
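
As a rough model of how combined filters narrow an alert's scope, the sketch below evaluates the four operators against a single usage record. It assumes that all filters must match (logical AND) and that comparisons are case-sensitive; neither detail is stated above, and the record fields are hypothetical.

```python
OPERATORS = {
    "equals": lambda actual, expected: actual == expected,
    "contains": lambda actual, expected: expected in actual,
    "starts with": lambda actual, expected: actual.startswith(expected),
    "ends with": lambda actual, expected: actual.endswith(expected),
}

def matches(record, filters):
    """Return True when the record satisfies every (dimension, operator, value) filter."""
    return all(
        OPERATORS[op](record.get(dimension, ""), value)
        for dimension, op, value in filters
    )

# Hypothetical usage record and filter set.
record = {"product": "summarization-api", "credential": "dev-key-42"}
filters = [
    ("product", "equals", "summarization-api"),
    ("credential", "starts with", "dev-key-"),
]
print(matches(record, filters))
```

A record that lacks a filtered dimension is treated as an empty string here, so it fails any filter with a non-empty value.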


Notification Channels

Alerts can be sent to:

  • Email

  • Slack

If you need additional notification mechanisms (e.g., PagerDuty, Webhook, Opsgenie), please let us know.


Common Alerting Scenarios

These examples illustrate high-value alert configurations used by developers, FinOps leads, and engineering managers.

1. Prevent Cost Spikes per Request

Use Case: A developer tests a new feature or model and needs to ensure it doesn’t exceed budgeted request costs.

  • Metric: Cost per transaction

  • Filter: Product = summarization-api

  • Condition: Threshold > $0.25

  • Action: Send Slack alert if request costs exceed target


2. Enforce Daily Token Budgets

Use Case: Limit individual user activity to manage runaway costs.

  • Metric: Token count

  • Filter: Credential starts with dev-key-

  • Condition: Cumulative Usage ≥ 1,000,000 tokens / Daily

  • Action: Alert user and engineering lead via email


3. Catch Performance Degradations in Request Volume

Use Case: Alert if request rate drops or surges abnormally.

  • Metric: Requests per minute

  • Filter: Model = gpt-4-turbo

  • Condition: Threshold < 10 or > 1,000 requests per minute

  • Action: Notify platform team via Slack


4. Track Monthly Spend by Business Unit

Use Case: Finance team manages AI budgets per department.

  • Metric: Total cost

  • Filter: Organization = acme-corp

  • Condition: Cumulative Usage > $10,000 / Monthly

  • Action: Email budget owner with CSV export of usage data


5. Flag Sudden Cost Surges

Use Case: Alert on sharp spikes regardless of baseline.

  • Metric: Total cost

  • Condition: Change > 30% over 24 hours

  • Action: Notify platform operations team
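
The percentage change in this scenario can be worked through as follows. The sketch assumes the comparison is between total cost in the most recent 24 hours and total cost in the 24 hours before that, which is one plausible reading of the condition; the function names are illustrative.

```python
def percent_change(previous, current):
    """Percentage change from previous to current, or None if previous is zero."""
    if previous == 0:
        return None  # change relative to zero is undefined
    return (current - previous) * 100.0 / previous

def surge_fires(previous, current, limit_percent):
    """Fire when cost grew by more than limit_percent between the two windows."""
    change = percent_change(previous, current)
    return change is not None and change > limit_percent

# Hypothetical totals: $400 in the prior 24 hours, $540 in the latest 24 hours.
print(percent_change(400.0, 540.0))
print(surge_fires(400.0, 540.0, 30.0))
```

Because the check is relative, a $140 increase on a $400 baseline fires, while the same increase on a much larger baseline would not.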


6. Control Cost per Feature or Agent

Use Case: Understand cost efficiency of specific workloads.

  • Metric: Cost per transaction

  • Filter: Agent = recommendation-engine

  • Condition: Threshold > $1.00

  • Action: Post alert to Slack channel with link to usage explorer


7. Monitor Token Throughput

Use Case: Ensure high-volume workloads don’t exceed rate limits or quota.

  • Metric: Tokens per minute

  • Condition: Threshold > 150,000 TPM

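  • Action: Trigger Slack alert so the platform team can scale capacity (sending a webhook to an autoscaler requires webhook support, which is not among the notification channels listed above)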


8. Proactively Surface High Error Rates

Use Case: Minimize the impact of backend or model instability.

  • Metric: Error rate

  • Filter: Model = claude-3-opus

  • Condition: Threshold > 5% over 10 minutes

  • Action: Notify SRE team for triage


Summary

Revenium’s alerting system provides early, targeted visibility into cost and performance anomalies, so developers and teams can fix problems before they impact budgets. Alerts are configurable, scoped to the workloads that matter, and designed to eliminate billing surprises.

For questions or to request support for additional alert types or channels, contact the Revenium team.
