Cost & Performance Alerts

Configure Revenium cost & performance alerts to be notified as soon as something changes, not when the bill arrives.

Revenium’s cost and performance alerting system detects unexpected changes in usage or spend in near real time, so you can respond before the charges accumulate.

The alert dashboard allows you to configure alerts on key metrics, apply scoped filters, and receive notifications via your preferred channels.


Supported Alert Types

You can configure alerts based on two condition types:

1. Threshold

  • Fires when a metric crosses a fixed value.

  • Evaluated over a rolling time window of any length.

  • Example: Alert when cost per transaction exceeds $0.25 for a specific product or agent.

2. Cumulative Usage in Period

  • Fires when total usage in a calendar-aligned period (daily, weekly, monthly, quarterly) exceeds a fixed value.

  • Automatically resets at the start of each new period.

  • Example: Alert when total cost exceeds $5,000 in a month.
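
The difference between the two condition types can be sketched in a few lines of Python. This is an illustration only, not Revenium's implementation: the `Sample` type, both function names, and the choice to average the samples inside the rolling window are assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Sample:
    at: datetime   # when the value was recorded
    value: float   # metric value, e.g. cost in dollars


def threshold_fires(samples, now, window, limit):
    """Rolling window: fire when the mean of recent samples exceeds limit.

    Averaging is an assumption; the real aggregation may differ.
    """
    recent = [s.value for s in samples if now - window < s.at <= now]
    return bool(recent) and sum(recent) / len(recent) > limit


def cumulative_fires(samples, now, limit):
    """Monthly period: fire when the total since the 1st exceeds limit."""
    period_start = now.replace(day=1, hour=0, minute=0, second=0, microsecond=0)
    total = sum(s.value for s in samples if period_start <= s.at <= now)
    return total > limit


now = datetime(2024, 6, 2, 12, 0)

# Cost per transaction over the last 15 minutes, limit $0.25.
per_txn = [Sample(datetime(2024, 6, 2, 11, 50), 0.20),
           Sample(datetime(2024, 6, 2, 11, 58), 0.40)]
print(threshold_fires(per_txn, now, timedelta(minutes=15), 0.25))

# Monthly spend, limit $5,000. The May sample falls outside the June period.
spend = [Sample(datetime(2024, 5, 31, 23, 0), 4000.0),
         Sample(datetime(2024, 6, 1, 1, 0), 3000.0),
         Sample(datetime(2024, 6, 2, 1, 0), 2500.0)]
print(cumulative_fires(spend, now, 5000.0))
```

Both calls return True. The rolling window moves with the current time, while the cumulative total starts again from zero at each period boundary, which is why the $4,000 recorded on May 31 does not count toward June.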


Available Metrics

You can create alerts on the following metrics:

  • Total cost

  • Cost per transaction

  • Tokens per minute

  • Requests per minute

  • Token count

  • Input token count

  • Output token count

  • Error rate

  • Error count


Filtering Options

Alerts can be scoped using one or more dimensions. You may apply filters to narrow or group alert conditions by:

  • Organization

  • Credential

  • Product

  • Model

  • Provider

  • Agent

  • Subscriber

Supported filter operators:

  • equals

  • contains

  • starts with

  • ends with

Filters can be combined to target a specific workload, tenant, or user.
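
As a rough model of how the four operators behave, the sketch below matches a usage event against a list of filters. The event shape, the lowercase dimension keys, the `matches` helper, case-sensitive comparison, and the rule that combined filters must all match (logical AND) are assumptions for illustration, not documented Revenium behavior.

```python
# Each operator compares the event's actual value with the filter value.
OPERATORS = {
    "equals": lambda actual, expected: actual == expected,
    "contains": lambda actual, expected: expected in actual,
    "starts with": lambda actual, expected: actual.startswith(expected),
    "ends with": lambda actual, expected: actual.endswith(expected),
}


def matches(event, filters):
    """Return True when every (dimension, operator, value) filter matches.

    Combining filters with AND is an assumption made for this sketch.
    """
    return all(
        OPERATORS[op](event.get(dimension, ""), value)
        for dimension, op, value in filters
    )


# Hypothetical usage event and filter set.
event = {
    "product": "summarization-api",
    "model": "gpt-4-turbo",
    "credential": "dev-key-42",
}
filters = [
    ("product", "equals", "summarization-api"),
    ("credential", "starts with", "dev-key-"),
]
print(matches(event, filters))
```

Here the event satisfies both filters, so `matches` returns True. Adding a filter such as `("model", "ends with", "opus")` would make it return False.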


Notification Channels

Alerts can be sent to:

  • Email

  • Slack

If you need additional notification mechanisms (e.g., PagerDuty, Webhook, Opsgenie), please let us know.


Common Alerting Scenarios

These examples illustrate high-value alert configurations used by developers, FinOps leads, and engineering managers.

1. Prevent Cost Spikes per Request

Use Case: A developer tests a new feature or model and needs to ensure it doesn’t exceed budgeted request costs.

  • Metric: Cost per transaction

  • Filter: Product = summarization-api

  • Condition: Threshold > $0.25

  • Action: Send Slack alert if request costs exceed target


2. Enforce Daily Token Budgets

Use Case: Limit individual user activity to manage runaway costs.

  • Metric: Token count

  • Filter: Credential starts with dev-key-

  • Condition: Cumulative Usage ≥ 1,000,000 tokens / Daily

  • Action: Alert user and engineering lead via email


3. Catch Performance Degradations in Request Volume

Use Case: Alert if request rate drops or surges abnormally.

  • Metric: Requests per minute

  • Filter: Model = gpt-4-turbo

  • Condition: Threshold < 10 or > 1,000 requests per minute

  • Action: Notify platform team via Slack


4. Track Monthly Spend by Business Unit

Use Case: Finance team manages AI budgets per department.

  • Metric: Total cost

  • Filter: Organization = acme-corp

  • Condition: Cumulative Usage > $10,000 / Monthly

  • Action: Email budget owner with CSV export of usage data


5. Flag Sudden Cost Surges

Use Case: Alert on sharp spikes regardless of baseline.

  • Metric: Total cost

  • Condition: Change > 30% over 24 hours

  • Action: Notify platform operations team
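
The arithmetic behind a percentage-change condition can be sketched as follows. The function names and the decision to skip the check when the previous period's cost is zero are assumptions for this example; how Revenium handles a zero baseline is not stated here.

```python
def percent_change(previous, current):
    """Relative change from previous to current, in percent.

    Returns None when previous is zero, because the ratio is undefined.
    """
    if previous == 0:
        return None
    return (current - previous) / previous * 100.0


def surge_fires(previous, current, limit_pct=30.0):
    """Fire when cost grew by more than limit_pct between two 24-hour periods."""
    change = percent_change(previous, current)
    return change is not None and change > limit_pct


# Yesterday: $1,000. Today: $1,350, a 35% increase.
print(surge_fires(1000.0, 1350.0))

# Yesterday: $1,000. Today: $1,200, a 20% increase.
print(surge_fires(1000.0, 1200.0))
```

The first call returns True and the second returns False. Because the condition is relative, a workload that goes from $10 to $14 triggers it just as one that goes from $10,000 to $14,000 does.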


6. Control Cost per Feature or Agent

Use Case: Understand cost efficiency of specific workloads.

  • Metric: Cost per transaction

  • Filter: Agent = recommendation-engine

  • Condition: Threshold > $1.00

  • Action: Post alert to Slack channel with link to usage explorer


7. Monitor Token Throughput

Use Case: Ensure high-volume workloads don’t exceed rate limits or quota.

  • Metric: Tokens per minute

  • Condition: Threshold > 150,000 TPM

  • Action: Trigger Slack alert and send webhook to autoscaler


8. Proactively Surface High Error Rates

Use Case: Minimize the impact of backend or model instability.

  • Metric: Error rate

  • Filter: Model = claude-3-opus

  • Condition: Threshold > 5% over 10 minutes

  • Action: Notify SRE team for triage
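
An error rate over a 10-minute window can be computed as errors divided by requests within that window. The sketch below assumes each request is recorded as a `(timestamp, is_error)` pair; that representation and the `error_rate` helper are hypothetical.

```python
from datetime import datetime, timedelta


def error_rate(events, now, window):
    """Percentage of requests in (now - window, now] that failed."""
    recent = [is_error for at, is_error in events if now - window < at <= now]
    if not recent:
        return 0.0  # no traffic in the window, so nothing to report
    return 100.0 * sum(recent) / len(recent)


now = datetime(2024, 6, 2, 12, 0)

# 200 requests in the last few minutes, 12 of them errors.
events = [(now - timedelta(seconds=i), i <= 12) for i in range(1, 201)]
# An older error that falls outside the 10-minute window.
events.append((now - timedelta(minutes=30), True))

rate = error_rate(events, now, timedelta(minutes=10))
print(rate, rate > 5.0)
```

With 12 errors out of 200 recent requests the rate is 6%, which is above the 5% threshold. The error from 30 minutes earlier is ignored because it lies outside the window.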


Summary

Revenium’s alerting system provides early, targeted visibility into cost and performance anomalies, so developers and teams can fix problems before they affect budgets. Alerts are configurable, scoped to the workloads that matter, and designed to eliminate billing surprises.

For questions or to request support for additional alert types or channels, contact the Revenium team.
