# AI Insights

AI Insights is a dedicated analytics surface that runs your AI usage data through Revenium's Recommendations Engine and surfaces the findings most worth acting on. Instead of reading dashboards and trying to spot what matters, Revenium analyzes your transaction history, including costs, errors, and agent behavior, and hands you a prioritized list of recommendations with concrete suggested actions and the spend at stake.

Each recommendation is grounded in your actual data (no hypothetical advice), tagged with severity and category, linked to the specific transactions that triggered it, and ranked by potential monthly savings.

## Why this matters

AI usage data accumulates faster than most teams have time to read. Patterns that meaningfully affect spend or reliability — wasted spend, agents failing silently, model costs growing faster than the usage that drives them — are scattered across transactions that nobody scans end-to-end.

AI Insights surfaces those patterns automatically. Detectors run across every dimension of your usage and produce a prioritized list of findings, each grounded in your actual data with a potential monthly savings estimate, the affected transactions, and a concrete suggested action.

## How AI Insights works

The Recommendations Engine runs a multi-stage pipeline that looks across dozens of dimensions for optimization opportunities. The in-depth analysis takes a few minutes to complete, and you'll be notified within the application when the results are ready. Analyses can also be triggered and retrieved via API, as in the sketch below.
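
As an illustration, a script that triggers a run and polls for the results might look like the following sketch. The endpoint paths, payload fields, and auth header are assumptions made for the example, not the documented API surface; consult the Revenium API reference for the actual contract.

```python
# Illustrative sketch only: endpoint paths, payload fields, and response
# shapes below are assumptions, not the documented Revenium API.
import time

import requests

BASE_URL = "https://api.revenium.io"             # assumed base URL
HEADERS = {"Authorization": "Bearer <API_KEY>"}  # assumed auth scheme

# Trigger a fresh analysis over the last 7 days (assumed endpoint/payload).
run = requests.post(
    f"{BASE_URL}/ai-insights/analyses",
    headers=HEADERS,
    json={"timeRange": "7d"},
    timeout=30,
).json()

# The pipeline takes a few minutes, so poll until the run completes,
# then read back the prioritized recommendations.
while True:
    status = requests.get(
        f"{BASE_URL}/ai-insights/analyses/{run['id']}",
        headers=HEADERS,
        timeout=30,
    ).json()
    if status.get("state") == "complete":
        break
    time.sleep(30)

for rec in status.get("recommendations", []):
    print(rec["severity"], rec["title"], rec.get("monthlySavings"))
```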

## Reading the Insights page

### Summary cards

Four metric cards across the top of the page give you the headline view of the current run:

* **Potential Monthly Savings** — The sum of estimated monthly savings across all recommendations, if each one is acted on. These are estimates derived from your actual usage data, not guarantees.
* **Critical / High Severity** — Critical means immediate action is warranted with strong evidence of significant impact. High means a material problem that should be prioritized as soon as possible.
* **Top Category** — The recommendation category contributing the largest share of impact in this run (Waste, Concentration, Reliability, or Efficiency). Lower-priority findings are consolidated under a separate "Worth a look" category.
* **Affected Spend** — The total AI spend linked to specific findings in the analyzed window. Potential Monthly Savings is the 30-day projection of the recoverable portion of that spend: for a 30-day analysis it will be smaller than Affected Spend, while a shorter window (1 day or 7 days) can project monthly savings higher than the raw affected spend in that window. A worked example follows this list.
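
To make the window arithmetic concrete, here is an illustrative calculation. The dollar figures and the recoverable share are invented for the example, and the real engine additionally applies per-detector capture rates (see the FAQ on how monthly savings are estimated):

```python
# Illustrative arithmetic only; the engine's real projection also applies
# per-detector capture rates (see the FAQ at the end of this page).
window_days = 7
affected_spend = 1_400.00   # spend linked to findings in the 7-day window
recoverable_share = 0.35    # assumed fraction of that spend that is recoverable

recoverable_in_window = affected_spend * recoverable_share              # $490 over 7 days
potential_monthly_savings = recoverable_in_window * (30 / window_days)  # $2,100

# With a short window, the 30-day projection ($2,100) can exceed the raw
# affected spend in that window ($1,400). Over a full 30-day window the
# projection is a fraction of affected spend, so it comes out smaller.
```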

### Recommendation cards

Each recommendation appears as a card with the elements below (an illustrative data shape follows the list):

* **Severity badge** — Critical, High, Medium, Low, or Info.
* **Category badge** — Waste, Worth a look, Concentration, Reliability, or Efficiency.
* **Monthly impact** — A green badge showing the estimated monthly savings if non-zero.
* **Detector label** — The specific pattern that triggered the recommendation (e.g. "Error concentration", "Outdated model").
* **Title and body** — A plain-English explanation of what was detected. Long bodies truncate with a **Read more** link.
* **Affected entities** — The agents, models, credentials, or subscribers implicated in the finding.
* **Suggested action** — An expandable section with the specific change to make (swap model, adjust prompt, enable a setting, restructure the flow, etc.).
* **Feedback controls** — Thumbs up / thumbs down, plus a dismiss button with reason picker.
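
For reference, the card fields above map naturally onto a record shape like the following sketch. The field names and enum values are illustrative, not the product's actual schema:

```python
# Illustrative record shape; field names and values are assumptions,
# not the actual Revenium schema.
from dataclasses import dataclass, field

@dataclass
class Recommendation:
    severity: str              # "critical" | "high" | "medium" | "low" | "info"
    category: str              # "waste" | "worth_a_look" | "concentration" | "reliability" | "efficiency"
    monthly_impact_usd: float  # estimated monthly savings; rendered as a badge when non-zero
    detector: str              # e.g. "error_concentration", "outdated_model"
    title: str
    body: str                  # plain-English explanation; long bodies truncate in the UI
    affected_entities: list[str] = field(default_factory=list)  # agents, models, credentials, subscribers
    suggested_action: str = ""    # the specific change to make
    sample_transaction_ids: list[str] = field(default_factory=list)  # up to 20 per finding
```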

Switch between **By impact** (single ranked list across all categories) and **By type** (grouped by category) using the toggle at the top of the list.

### Sample transactions and trace linking

Expand any recommendation to see the sample transactions that triggered the finding — up to 20 transaction IDs per finding. For each sample you can:

* **Copy** the transaction ID to clipboard.
* **Open Trace** — Resolves the transaction to its enclosing trace and navigates to Trace Analytics so you can inspect the full request path, latency breakdown, and downstream calls.
* **Copy Transaction IDs** — Copies every sample transaction ID at once so you can investigate them in another system.

### Run history and time range

The meta bar above the cards lets you:

* **Switch runs** — Jump between past analyses via the run history dropdown.
* **Choose a time range** — Scope the analysis window to the past 1 day, 7 days, or 30 days (default).
* **Run a new analysis** — Click **Analyze** to kick off a fresh run on the selected time range.

## Recommendation categories

Recommendations are grouped into five categories, each describing a different *kind* of problem.

### Waste

> Spend you can eliminate without changing what the workload does — enable a setting, upgrade to current-generation pricing, or stop redundant work. Success rate and capability stay the same.

Examples: an outdated model that has been superseded by a same-price newer version, a prompt that repeatedly misses the cache, or retry loops that bill the same prompt three times.

### Concentration

> A single subscriber, credential, agent, or model accounts for a disproportionate share of spend, errors, or exposure — creating fragility and limiting your optimization leverage.

Acting on Concentration findings often unlocks downstream improvements — when one entity is responsible for a large share of issues, addressing it has outsized leverage even when the immediate dollar impact looks modest.

### Reliability

> The workload is failing, retrying, or degrading at a rate that is itself the problem. Fixing it improves the success rate; cost usually drops as a side effect.

Reliability findings cover error concentration, throttling patterns, correlated error-and-retry behavior, quality drops, and similar issues.

### Efficiency

> The cost per successful completion is high relative to what the task warrants. Swap a model, rewrite a prompt, or restructure a flow to produce the same result for less.

In practice, these findings point at model swaps, prompt rewrites, or flow restructuring that produce the same outcome for less spend.

### Worth a look

> Something notable you probably haven't seen yet — either a cost that appeared or grew faster than usage justifies, or an option you didn't know was available (like a same-price newer model).

These don't necessarily require action — they're surfaced because they're worth a glance.

## Types of issues AI Insights looks for

Findings span seven broad areas:

* **Configuration waste** — workloads paying for outdated models, missed prompt caching, mismatched service tiers, or other settings that can be changed without affecting capability.
* **Failure-driven cost** — errors, retries, throttling, and reliability issues that are themselves driving spend.
* **Concentration risk** — a single subscriber, agent, credential, or model accounting for a disproportionate share of cost, errors, or exposure.
* **Cost-to-outcome misalignment** — workloads where the cost per successful result is high relative to the value of the result.
* **Better alternatives available** — newer or less expensive model options, or tasks better served by a different approach.
* **Unexpected growth** — cost or call volume growing faster than the underlying usage that drives it.
* **Attribution gaps** — spend that can't be tied back to a known agent, customer, product, or workflow.

## Use cases

* **Pre-renewal review** — Before a contract renewal or budget cycle, run a 30-day analysis to surface every Waste and Concentration finding for review.
* **Incident triage** — When an alert fires for a cost or error spike, an Insights run scoped to the last day or seven days will frequently identify the root finding (error concentration, retry waste, throttling) without manual log diving.
* **Quarterly review** — Compare runs across the quarter to track whether your findings are decreasing — i.e. whether the team is actually closing the optimization gaps.
* **New integration validation** — After connecting a new model, agent, or middleware, run Insights a week later to catch any unexpected fan-out, error patterns, or attribution gaps before they grow into problems.

## Frequently asked questions

**Do I get charged for analysis runs?** Analysis runs are included in your Revenium subscription at no additional cost.

**How often should I run an analysis?** Weekly is a reasonable cadence for active workloads. For lower-traffic teams, monthly is often enough.

**What if I don't have enough data yet?** The engine works best with at least a few days of meaningful traffic. With very low volume, Worth a look and Concentration findings dominate; Waste and Efficiency findings need more data before the engine can surface them with confidence.

**How are monthly savings estimated?** Savings are projected from the affected-spend slice over the analyzed window, scaled to a 30-day estimate, and adjusted by an estimated capture rate per detector. They're estimates, not guarantees.
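
Expressed as code, the projection described above amounts to something like the following sketch; the capture rate is internal to each detector, so the value used here is illustrative:

```python
# Sketch of the projection described in the FAQ answer above. Capture
# rates are internal per-detector estimates; 0.6 here is illustrative.
def projected_monthly_savings(recoverable_in_window: float,
                              window_days: int,
                              capture_rate: float) -> float:
    """Scale window-level recoverable spend to 30 days, then discount by
    the detector's estimated capture rate."""
    return recoverable_in_window * (30 / window_days) * capture_rate

# e.g. $490 of recoverable spend found in a 7-day window, at an assumed
# 60% capture rate, projects to 490 * (30/7) * 0.6 = $1,260 per month.
print(projected_monthly_savings(490.0, 7, 0.6))
```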


---

# Agent Instructions: Querying This Documentation

If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://docs.revenium.io/ai-insights.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.
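
For example, a URL-encoded question against this page might look like:

```
GET https://docs.revenium.io/ai-insights.md?ask=How%20are%20monthly%20savings%20estimated%3F
```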

Use this mechanism when the answer is not explicitly present in the current page, when you need clarification or additional context, or when you want to retrieve related documentation sections.
