# Cost & Performance Alerts

Revenium’s cost and performance alerting system helps you detect unexpected changes in usage or spend in near real time, so you can act before a bill surprises you rather than after.

The alert dashboard allows you to configure alerts on key metrics, apply scoped filters, and receive notifications via your preferred channels.

***

<figure><img src="https://2470865788-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FSUfCzMW8qWeXstipFXEh%2Fuploads%2Fgit-blob-b9c669a435f4d9562075bf2e929820ec6fdd3b96%2Fimage.png?alt=media" alt=""><figcaption></figcaption></figure>

### Supported Alert Types

You can configure alerts based on three condition types:

#### 1. Spike Detection

Fires when a metric exceeds a defined threshold during a scheduled check.

* Operates across any rolling time window.
* Best for monitoring rate-based metrics and detecting unusual spikes.
* Example: Alert when cost per transaction exceeds `$0.25` for a specific product or agent.
* Example: Alert when error rate exceeds `5%` for any API endpoint.
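
The check described above can be sketched in a few lines. This is a minimal illustration rather than Revenium's implementation: the `Sample` type, the `spike_fires` function, and the choice to compute cost per transaction as the average cost of requests inside the window are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Sample:
    """One hypothetical usage record: when the request happened and what it cost."""
    timestamp: datetime
    cost: float


def spike_fires(samples, now, window, threshold):
    """Return True if average cost per transaction in the rolling window exceeds the threshold."""
    start = now - window
    in_window = [s for s in samples if start < s.timestamp <= now]
    if not in_window:
        return False  # no traffic in the window, so there is nothing to compare
    cost_per_transaction = sum(s.cost for s in in_window) / len(in_window)
    return cost_per_transaction > threshold
```

Calling `spike_fires(samples, now, timedelta(minutes=15), 0.25)` at each scheduled check mirrors the `$0.25` cost-per-transaction example; widening the window smooths out short bursts, while narrowing it makes the alert more sensitive.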

#### 2. Budget Threshold

Fires when usage or costs exceed a threshold within a calendar period.

* Automatically resets at the start of each new period (daily, weekly, monthly, quarterly).
* Ideal for budget monitoring and cost control.
* Example: Alert when total cost exceeds `$5,000` in a month.
* Example: Alert when token usage exceeds `1,000,000` tokens per week.

💰 [Learn more about Budget Monitoring →](https://docs.revenium.io/budget-monitoring)
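
To make the reset behavior concrete, here is a minimal sketch of a calendar-period budget check. The function names, the `(timestamp, amount)` event shape, and the assumption that weeks start on Monday are all hypothetical; Revenium's actual period boundaries and time zone handling may differ.

```python
from datetime import datetime, timedelta


def period_start(now, period):
    """Return the start of the calendar period that contains `now`."""
    midnight = now.replace(hour=0, minute=0, second=0, microsecond=0)
    if period == "daily":
        return midnight
    if period == "weekly":
        # Assumption: weeks start on Monday.
        return midnight - timedelta(days=midnight.weekday())
    if period == "monthly":
        return midnight.replace(day=1)
    if period == "quarterly":
        first_month = 3 * ((midnight.month - 1) // 3) + 1
        return midnight.replace(month=first_month, day=1)
    raise ValueError(f"unknown period: {period}")


def budget_fires(events, now, period, threshold):
    """events: iterable of (timestamp, amount). True if the period-to-date total exceeds the threshold."""
    start = period_start(now, period)
    total = sum(amount for ts, amount in events if start <= ts <= now)
    return total > threshold
```

Because the total only includes events on or after `period_start`, spend from the previous period drops out automatically when a new period begins.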

#### 3. Relative Change

Fires when a metric changes by a specified percentage compared to a previous period.

* Compares the current period against the previous period of the same length.
* Detects both increases and decreases.
* Supports daily, weekly, monthly, and quarterly evaluation frequencies.
* Ideal for detecting gradual trends or sudden drops in activity.
* Example: Alert when total cost increases by more than `50%` compared to the previous week.
* Example: Alert when token usage drops by more than `30%` compared to the previous day.

{% hint style="info" %}
**Drop-to-Zero Detection**: Relative change alerts automatically detect scenarios where usage drops to zero—useful for identifying when a critical workflow or customer stops sending data entirely.
{% endhint %}
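
The comparison behind a relative change alert is simple arithmetic. The sketch below is illustrative only: `relative_change` and `relative_change_fires` are hypothetical names, and treating a zero previous period as "no baseline" (so the alert does not fire) is an assumption of this example, not documented Revenium behavior.

```python
def relative_change(current, previous):
    """Percentage change from previous to current; None when there is no baseline."""
    if previous == 0:
        return None  # assumption: a percentage change is undefined without a baseline
    return (current - previous) / previous * 100.0


def relative_change_fires(current, previous, threshold_pct):
    """A positive threshold checks for increases, a negative one for decreases."""
    change = relative_change(current, previous)
    if change is None:
        return False
    if threshold_pct >= 0:
        return change > threshold_pct
    return change < threshold_pct
```

A drop to zero is the limiting case: when `current` is `0` and `previous` is positive, the change is -100%, which satisfies any decrease threshold such as `-30`.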

***

### Available Metrics and Alert Types

| Metric               | Description                                | Available Alert Types |
| -------------------- | ------------------------------------------ | --------------------- |
| Total Cost           | Total cost incurred                        | Both Types Available  |
| Cost Per Transaction | Cost per task/request                      | Spike Detection Only  |
| Tokens Per Minute    | Rate of token generation/processing        | Spike Detection Only  |
| Requests Per Minute  | Rate of API requests                       | Spike Detection Only  |
| Error Rate (%)       | Percentage of requests resulting in errors | Spike Detection Only  |
| Error Count          | Total number of errors                     | Both Types Available  |
| Token Count          | Total tokens processed/generated           | Both Types Available  |
| Input Tokens         | Total input tokens processed               | Both Types Available  |
| Output Tokens        | Total output tokens processed              | Both Types Available  |

***

### Filtering Options

Alerts can be scoped using one or more dimensions. You may apply filters to narrow or group alert conditions by:

* Organization
* Credential
* Product
* Model
* Provider
* Agent
* Subscriber
* Task Type

Supported filter operators:

* `equals`
* `contains`
* `starts with`
* `ends with`

Filters can be combined to target a specific workload, tenant, or user.
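
As a rough model of how combined filters narrow an alert's scope, the sketch below applies the four operators to a usage record. The record shape, the `matches` function, case-sensitive comparison, and AND semantics between filters are assumptions for illustration.

```python
# Each operator compares a record's value for a dimension against the filter target.
OPERATORS = {
    "equals": lambda value, target: value == target,
    "contains": lambda value, target: target in value,
    "starts with": lambda value, target: value.startswith(target),
    "ends with": lambda value, target: value.endswith(target),
}


def matches(record, filters):
    """record: dict of dimension -> value. filters: list of (dimension, operator, target).

    Assumes every filter must match (AND semantics); a missing dimension never matches.
    """
    return all(
        dimension in record and OPERATORS[operator](record[dimension], target)
        for dimension, operator, target in filters
    )
```

For example, `[("organization", "equals", "acme-corp"), ("credential", "starts with", "dev-key-")]` would scope an alert to development credentials within a single organization.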

***

### Notification Channels

Alerts can be sent to:

* **Email** – Receive alerts directly in your inbox
* **Slack** – Post alerts to designated Slack channels
* **Webhook** – Send alerts to any HTTP endpoint for custom integrations

#### Webhook Notifications

Webhooks allow you to integrate Revenium alerts with any external system—incident management platforms, custom dashboards, automation workflows, or internal tools.

**Webhook Configuration:**

* **URL**: The HTTP endpoint that will receive alert payloads
* **Method**: POST request with JSON body
* **Authentication**: Optional headers for API keys or bearer tokens

**Reliability Features:**

* **Automatic Retries**: Failed webhook deliveries are retried with exponential backoff
* **Circuit Breaker**: If an endpoint consistently fails, Revenium temporarily pauses deliveries to prevent cascade failures, then automatically resumes when the endpoint recovers
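
For readers unfamiliar with these two patterns, the sketch below shows what they generally look like. It is not Revenium's delivery code: the base delay, cap, failure threshold, and cooldown are hypothetical values chosen for the example, and the actual retry schedule is not specified here.

```python
def backoff_delays(attempts, base_seconds=1.0, cap_seconds=60.0):
    """Delay before each retry: the base doubles on every attempt, up to a cap."""
    return [min(cap_seconds, base_seconds * 2 ** n) for n in range(attempts)]


class CircuitBreaker:
    """Pauses deliveries after repeated failures and lets them resume after a cooldown."""

    def __init__(self, failure_threshold=5, cooldown_seconds=300.0):
        self.failure_threshold = failure_threshold
        self.cooldown_seconds = cooldown_seconds
        self.consecutive_failures = 0
        self.opened_at = None  # time the breaker opened, or None while closed

    def allow(self, now):
        """Return True if a delivery may be attempted at time `now` (in seconds)."""
        if self.opened_at is None:
            return True
        return now - self.opened_at >= self.cooldown_seconds

    def record(self, success, now):
        """Update state after a delivery attempt."""
        if success:
            self.consecutive_failures = 0
            self.opened_at = None
        else:
            self.consecutive_failures += 1
            if self.consecutive_failures >= self.failure_threshold:
                self.opened_at = now
```

With the defaults above, `backoff_delays(5)` yields delays of 1, 2, 4, 8, and 16 seconds. On the receiving side, the practical takeaway is that your endpoint should respond quickly with a success status and tolerate receiving the same alert more than once.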

**Webhook Payload Example:**

```json
{
  "alertId": "alert-12345",
  "alertName": "Cost Spike - Production API",
  "severity": "warning",
  "metric": "totalCost",
  "currentValue": 1250.00,
  "threshold": 1000.00,
  "triggeredAt": "2025-01-23T14:30:00Z",
  "filters": {
    "organization": "acme-corp",
    "product": "summarization-api"
  }
}
```

{% hint style="info" %}
Webhooks are ideal for triggering automated responses—such as scaling infrastructure, notifying on-call engineers via PagerDuty, or updating internal dashboards.
{% endhint %}
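
A receiving endpoint can be quite small. The following sketch uses only the Python standard library and reads the fields shown in the payload example above. The `EXPECTED_TOKEN` value, the use of an `Authorization: Bearer` header, the port, and the `parse_alert` and `serve` helpers are assumptions for illustration; adapt them to whatever headers you configure on the webhook.

```python
import hmac
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical shared secret, assumed to arrive as "Authorization: Bearer <token>".
EXPECTED_TOKEN = "replace-me"


def parse_alert(body, auth_header):
    """Validate the bearer token and decode the payload. Returns (status_code, alert_or_None)."""
    expected = f"Bearer {EXPECTED_TOKEN}"
    if not hmac.compare_digest((auth_header or "").encode(), expected.encode()):
        return 401, None
    try:
        alert = json.loads(body)
    except ValueError:  # covers malformed JSON and undecodable bytes
        return 400, None
    if not isinstance(alert, dict) or "alertId" not in alert:
        return 400, None
    return 200, alert


class AlertHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        status, alert = parse_alert(self.rfile.read(length), self.headers.get("Authorization"))
        if alert is not None:
            # Replace this with real handling: page on-call, scale infrastructure, etc.
            print(f"{alert.get('severity')}: {alert.get('alertName')} "
                  f"({alert.get('metric')}={alert.get('currentValue')}, "
                  f"threshold={alert.get('threshold')})")
        self.send_response(status)
        self.end_headers()


def serve(port=8080):
    """Start a blocking HTTP server that accepts alert webhooks."""
    HTTPServer(("", port), AlertHandler).serve_forever()
```

Calling `serve()` starts the listener. Keeping `parse_alert` separate from the HTTP handler makes the validation logic easy to unit test without opening a socket.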

***

### Common Alerting Scenarios

These examples illustrate high-value alert configurations used by developers, FinOps leads, and engineering managers.

#### 1. Prevent Cost Spikes per Request

**Use Case**: A developer tests a new feature or model and needs to ensure it doesn’t exceed budgeted request costs.

* **Metric**: Cost per transaction
* **Filter**: Product = `summarization-api`
* **Condition**: Threshold > $0.25
* **Action**: Send Slack alert if request costs exceed target

***

#### 2. Enforce Daily Token Budgets

**Use Case**: Limit individual user activity to manage runaway costs.

* **Metric**: Token count
* **Filter**: Credential starts with `dev-key-`
* **Condition**: Cumulative Usage ≥ 1,000,000 tokens / Daily
* **Action**: Alert user and engineering lead via email

***

#### 3. Catch Performance Degradations in Request Volume

**Use Case**: Alert if request rate drops or surges abnormally.

* **Metric**: Requests per minute
* **Filter**: Model = `gpt-4-turbo`
* **Condition**: Threshold < 10 or > 1,000 requests / minute
* **Action**: Notify platform team via Slack

***

#### 4. Track Monthly Spend by Business Unit

**Use Case**: Finance team manages AI budgets per department.

* **Metric**: Total cost
* **Filter**: Organization = `acme-corp`
* **Condition**: Cumulative Usage > $10,000 / Monthly
* **Action**: Email budget owner with CSV export of usage data

***

#### 5. Flag Sudden Cost Surges

**Use Case**: Alert on sharp spikes regardless of baseline.

* **Metric**: Total cost
* **Condition**: Change > 30% / 24hr
* **Action**: Notify platform operations team

***

#### 6. Control Cost per Feature or Agent

**Use Case**: Understand cost efficiency of specific workloads.

* **Metric**: Cost per transaction
* **Filter**: Agent = `recommendation-engine`
* **Condition**: Threshold > $1.00
* **Action**: Post alert to Slack channel with link to usage explorer

***

#### 7. Monitor Token Throughput

**Use Case**: Ensure high-volume workloads don’t exceed rate limits or quota.

* **Metric**: Tokens per minute
* **Condition**: Threshold > 150,000 TPM
* **Action**: Trigger Slack alert and send webhook to autoscaler

***

#### 8. Proactively Surface High Error Rates

**Use Case**: Minimize the impact of backend or model instability.

* **Metric**: Error rate
* **Filter**: Model = `claude-3-opus`
* **Condition**: Threshold > 5% over 10 minutes
* **Action**: Notify SRE team for triage

***

#### 9. Detect Week-over-Week Cost Increases

**Use Case**: Catch gradual cost creep before it becomes significant.

* **Metric**: Total cost
* **Alert Type**: Relative Change
* **Filter**: Organization = `engineering`
* **Condition**: Change > 25% vs previous week
* **Action**: Email FinOps team for review

***

#### 10. Alert on Workflow Disruption

**Use Case**: Detect when a critical AI workflow stops entirely.

* **Metric**: Token count
* **Alert Type**: Relative Change
* **Filter**: Task Type = `customer-support-summarization`
* **Condition**: Change < -90% vs previous day
* **Action**: Page on-call engineer via webhook

***

#### 11. Monitor Cost by Task Type

**Use Case**: Track AI costs for specific workflows across your organization.

* **Metric**: Total cost
* **Filter**: Task Type = `code-review`
* **Condition**: Cumulative Usage > $1,000 / Weekly
* **Action**: Notify engineering manager via Slack

***

### Summary

Revenium’s alerting system provides early, targeted visibility into cost and performance anomalies—so developers and teams can fix problems before they impact budgets. Alerts are configurable, scoped to the workloads that matter, and designed to eliminate billing surprises.

For questions or to request support for additional alert types or channels, contact the Revenium team.
