# Monitor Latency & Performance

Knowing your AI executed successfully is a low bar. What matters is whether it executed well - and whether the performance you're seeing today is better or worse than it was last week. Revenium's Performance section is where slow agents get caught, degrading models get identified, and inefficient workflows get fixed before they become expensive habits.

Find it under **Intelligence > Performance** in your sidebar.

***

### <i class="fa-stopwatch">:stopwatch:</i> Catch Reliability Problems Before Your Users Do

The Tasks view gives you the first signal that something is wrong: completion rate, broken down into successful, timed out, and failed - over time, not just as a snapshot. A completion rate that looks fine today can be masking a window of degradation that happened last Tuesday. The over-time view is what catches that.
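To see why the over-time view matters, here's a minimal sketch that computes completion rate per day from raw task records. The record shape, the status strings, and the sample data are all assumptions made for illustration; they aren't Revenium's export format.

```python
from collections import Counter, defaultdict
from datetime import date

# Hypothetical task records: (day, status). The three status values mirror
# the outcomes named above; the record shape itself is an assumption.
TASKS = [
    (date(2024, 5, 6), "successful"),
    (date(2024, 5, 6), "successful"),
    (date(2024, 5, 7), "successful"),
    (date(2024, 5, 7), "timed_out"),
    (date(2024, 5, 7), "failed"),
    (date(2024, 5, 7), "failed"),
    (date(2024, 5, 8), "successful"),
    (date(2024, 5, 8), "successful"),
]


def completion_rate_by_day(tasks):
    """Return {day: fraction of that day's tasks that were successful}."""
    counts = defaultdict(Counter)
    for day, status in tasks:
        counts[day][status] += 1
    return {
        day: statuses["successful"] / sum(statuses.values())
        for day, statuses in sorted(counts.items())
    }


rates = completion_rate_by_day(TASKS)
for day, rate in rates.items():
    print(day.isoformat(), f"{rate:.0%}")
```

In this invented data the most recent day is clean, while the day before it completed only a quarter of its tasks. A snapshot of the latest day alone would never show that.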

When tasks do fail, the **Failed Tasks** table tells you immediately what failed, on which agent, using which model and provider, why it stopped, and how long it ran before it did. You're not hunting through logs - everything you need to start an investigation is in one place.

***

### <i class="fa-traffic-light-slow">:traffic-light-slow:</i> Find Out Which Tasks Are Actually Slow

Not all slowness is equal. **Duration by Task Type** surfaces which operations are taking significantly longer than others. If code analysis is running at ten times the duration of a chat response, that's either a prompt length issue, a model selection problem, or a workflow that needs restructuring - and you won't know which until you can see the comparison clearly.
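As a rough illustration of the comparison, the sketch below averages duration per task type and expresses each type as a multiple of the fastest one. The task type names and durations are invented, and using the mean as the summary statistic is an assumption.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical (task_type, duration_seconds) records.
RECORDS = [
    ("chat", 1.0),
    ("chat", 2.0),
    ("chat", 1.5),
    ("code_analysis", 14.0),
    ("code_analysis", 16.0),
]


def duration_by_task_type(records):
    """Return {task_type: mean duration in seconds}."""
    grouped = defaultdict(list)
    for task_type, seconds in records:
        grouped[task_type].append(seconds)
    return {task_type: mean(values) for task_type, values in grouped.items()}


averages = duration_by_task_type(RECORDS)
fastest = min(averages.values())
for task_type, avg in sorted(averages.items(), key=lambda kv: kv[1]):
    print(f"{task_type}: {avg:.1f}s ({avg / fastest:.1f}x the fastest type)")
```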

**Time to First Token** adds the dimension that raw duration misses: the latency your users actually experience. A model that's fast to complete but slow to start feels broken, even if the total response time is acceptable. Tracking TTFT by model over time means you'll spot a provider degradation or model regression as it's happening, not after users have noticed.

<figure><img src="/files/pXHsJ2bBx7zYvZniXUC9" alt="" width="563"><figcaption></figcaption></figure>
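If you want to sanity-check TTFT on your own side, the idea can be sketched as timing the arrival of the first streamed chunk separately from the end of the stream. The function below is a hypothetical helper, not part of any Revenium SDK; it assumes your model client exposes its response as an iterable of chunks.

```python
import time
from typing import Callable, Iterable, Optional


def measure_stream(
    chunks: Iterable[str],
    clock: Callable[[], float] = time.perf_counter,
) -> dict:
    """Consume a streamed response, timing the first chunk and the whole stream.

    `clock` is injectable so the timing logic can be checked without sleeping.
    """
    start = clock()
    first_chunk_at: Optional[float] = None
    count = 0
    for _ in chunks:
        if first_chunk_at is None:
            first_chunk_at = clock()  # the moment the user first sees output
        count += 1
    end = clock()
    ttft = None if first_chunk_at is None else first_chunk_at - start
    return {"ttft_s": ttft, "total_s": end - start, "chunks": count}


# Usage with a stand-in stream; replace the list with your client's iterator.
print(measure_stream(["Hello", ", ", "world"]))
```

Two responses with the same `total_s` can have very different `ttft_s`, which is the gap between what a duration chart shows and what a user feels.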

***

### <i class="fa-robot">:robot:</i> Hold Individual Agents Accountable

Aggregate metrics hide individual misbehaviour. The Agents view breaks throughput, completion rate, and execution duration down per agent, so an agent that's consistently slower or less reliable than its peers is immediately visible rather than averaged away.

The **Agent Model Comparison** table is where model choice decisions get validated. If you've switched an agent from one model to another, this table shows you the before and after - requests, average duration, TTFT, failure rate, and quality score - side by side. It's the difference between assuming a model change improved things and knowing it did.

**Reliability by Agent** ranks failure rate per agent from highest to lowest. One agent with a disproportionate failure rate is the kind of signal that gets missed in aggregate reporting and found here.
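The ranking itself is simple to reason about. Here's a minimal sketch, assuming each run is recorded as an agent name plus a success flag; the agent names and counts are invented.

```python
from collections import Counter

# Hypothetical run log: (agent, succeeded). Ten runs per agent.
RUNS = (
    [("router", True)] * 10
    + [("summarizer", True)] * 9 + [("summarizer", False)] * 1
    + [("extractor", True)] * 6 + [("extractor", False)] * 4
)


def reliability_by_agent(runs):
    """Return [(agent, failure_rate)] sorted from highest to lowest rate."""
    totals, failures = Counter(), Counter()
    for agent, succeeded in runs:
        totals[agent] += 1
        if not succeeded:
            failures[agent] += 1
    rates = {agent: failures[agent] / totals[agent] for agent in totals}
    return sorted(rates.items(), key=lambda kv: kv[1], reverse=True)


ranking = reliability_by_agent(RUNS)
for agent, rate in ranking:
    print(f"{agent}: {rate:.0%} failed")
```

Across all thirty runs the failure rate is about 17%, which looks unremarkable. Ranked per agent, one agent is failing 40% of the time.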

***

### <i class="fa-bone-break">:bone-break:</i> Find the Runs That Are Breaking Your P99

Most traces are fine. The expensive ones - the outliers that are distorting your average and quietly degrading your user experience - live in the tail. The Traces view surfaces them.

The gap between your average duration and your P99 duration tells you how much variance exists in your system. By definition, about one run in a hundred is at least as slow as your P99, so a P99 that's ten times your average means roughly 1% of runs take ten times as long as your headline metric suggests, or longer. **Performance Anomalies** classifies those outliers automatically into Critical (P99), High (P95), and Moderate (P75) tiers, each with an explanation of what's happening and a direct link to filter and investigate the specific traces responsible.
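To make the arithmetic concrete, here's a small sketch that computes the average and tail percentiles for a set of durations and assigns each run a tier. The sample durations are invented, and treating a run as Critical, High, or Moderate when it sits at or above P99, P95, or P75 is an assumption about how the thresholds apply, not Revenium's documented algorithm.

```python
from statistics import mean, quantiles

# Hypothetical trace durations in seconds: mostly fast, with a long tail.
DURATIONS = [1.0] * 70 + [2.0] * 20 + [4.0] * 5 + [8.0] * 3 + [30.0] * 2


def tail_profile(durations):
    """Return the average plus the P75, P95, and P99 cut points."""
    # quantiles(n=100) returns 99 cut points; index k-1 is the k-th percentile.
    cuts = quantiles(durations, n=100, method="inclusive")
    return {
        "avg": mean(durations),
        "p75": cuts[74],
        "p95": cuts[94],
        "p99": cuts[98],
    }


def classify(duration, profile):
    """Assumed tier mapping: at or above P99, P95, or P75."""
    if duration >= profile["p99"]:
        return "Critical"
    if duration >= profile["p95"]:
        return "High"
    if duration >= profile["p75"]:
        return "Moderate"
    return None


profile = tail_profile(DURATIONS)
print(profile)
print("P99 / average:", round(profile["p99"] / profile["avg"], 1))
```

The average here sits close to the typical run, while the P99 is more than ten times larger. That ratio is the signal the average alone hides.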

**Four callouts cut straight to the most important signals:** your **slowest trace type** by P95 duration, your **most transaction-heavy trace type**, the trace type with the worst P99/P50 ratio (**the most unpredictable**), and the **trace type that has degraded most** since the previous period. If something has changed in your system, one of those four numbers is the first place it's likely to show up.
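The four callouts can be sketched as four different ways of picking a maximum over per-trace-type statistics. Everything below is illustrative: the trace type names and numbers are invented, and measuring degradation as the relative change in P95 against the previous period is an assumption, since the metric behind that callout isn't specified here.

```python
# Hypothetical per-trace-type summary stats (durations in seconds).
STATS = {
    "code_analysis": {"p50": 4.0, "p95": 12.0, "p99": 20.0,
                      "transactions": 300, "prev_p95": 11.0},
    "chat":          {"p50": 0.5, "p95": 1.0, "p99": 6.0,
                      "transactions": 5000, "prev_p95": 1.0},
    "summarize":     {"p50": 2.0, "p95": 6.0, "p99": 8.0,
                      "transactions": 900, "prev_p95": 3.0},
}


def callouts(stats):
    """Pick the trace type that leads on each of the four signals."""
    def top(score):
        return max(stats, key=lambda name: score(stats[name]))

    return {
        "slowest": top(lambda s: s["p95"]),
        "most_transaction_heavy": top(lambda s: s["transactions"]),
        "most_unpredictable": top(lambda s: s["p99"] / s["p50"]),
        # Assumed degradation metric: relative P95 change vs. previous period.
        "most_degraded": top(lambda s: (s["p95"] - s["prev_p95"]) / s["prev_p95"]),
    }


summary = callouts(STATS)
for signal, trace_type in summary.items():
    print(f"{signal}: {trace_type}")
```

Note that the four answers needn't agree. In this sample the slowest trace type is neither the most unpredictable nor the most degraded one.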

***

### <i class="fa-eyes">:eyes:</i> Spot Agents That Are Looping

Slowness isn't the only way a workflow can become expensive. The Efficiency view tracks transaction count per trace - how many calls each execution is generating. An agent that's suddenly producing five times its usual transaction count may not look much slower in wall-clock time, but it's likely stuck in a loop, making redundant tool calls, or failing to reach a clean exit condition.
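A minimal sketch of that check: count transactions per trace, take the median as the "usual" count, and flag any trace at or above a multiple of it. The trace IDs are invented, and both the median baseline and the factor of five are assumptions chosen to mirror the example above.

```python
from collections import Counter
from statistics import median

# Hypothetical transaction log: one trace ID per recorded call.
TRANSACTION_TRACE_IDS = (
    ["t1"] * 4 + ["t2"] * 5 + ["t3"] * 4 + ["t4"] * 25
)


def runaway_traces(trace_ids, factor=5):
    """Return trace IDs whose call count is >= factor x the median count."""
    counts = Counter(trace_ids)
    baseline = median(counts.values())
    return sorted(t for t, n in counts.items() if n >= factor * baseline)


print(runaway_traces(TRANSACTION_TRACE_IDS))
```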

**Circular Pattern Analysis** takes this further, automatically detecting whether any workflows have developed circular dependencies - agents or tools calling each other in a loop. This is the failure mode that can turn a $2 workflow into a $200 one before anyone notices. "No circular patterns detected" is the result you want; when circular patterns do appear, this is where you'll find them.
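For intuition about what a circular dependency looks like in data, here's a standard depth-first cycle search over caller-to-callee edges. It illustrates the general technique only and makes no claim about how Circular Pattern Analysis is implemented; the agent and tool names are invented.

```python
def find_cycle(edges):
    """Return one cycle as a list of nodes (first == last), or None."""
    graph = {}
    for caller, callee in edges:
        graph.setdefault(caller, []).append(callee)
        graph.setdefault(callee, [])

    UNSEEN, IN_PROGRESS, DONE = 0, 1, 2
    state = dict.fromkeys(graph, UNSEEN)
    path = []  # nodes on the current DFS branch, in call order

    def visit(node):
        state[node] = IN_PROGRESS
        path.append(node)
        for nxt in graph[node]:
            if state[nxt] == IN_PROGRESS:
                # Back edge: nxt is already on the current branch.
                return path[path.index(nxt):] + [nxt]
            if state[nxt] == UNSEEN:
                found = visit(nxt)
                if found:
                    return found
        path.pop()
        state[node] = DONE
        return None

    for node in graph:
        if state[node] == UNSEEN:
            found = visit(node)
            if found:
                return found
    return None


# Hypothetical call graph: the writer hands work back to the planner.
CALLS = [
    ("planner", "researcher"),
    ("researcher", "search_tool"),
    ("researcher", "writer"),
    ("writer", "planner"),
]
print(find_cycle(CALLS))
```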

***

### <i class="fa-people-arrows">:people-arrows:</i> Understand How Your Agents Talk to Each Other

For multi-agent architectures, performance isn't just about individual agents - it's about how they interact. The Agent Interaction view tracks patterns, costs, and performance metrics for agent-to-agent calls within a trace, making it possible to see whether the overhead of agent coordination is justified by the outcomes it produces.

This requires your instrumentation to pass `agent`, a shared `traceId` across all transactions in a workflow, and `parentTransactionId` to link agent calls together. See [Instrument Your Code](/track-and-control-costs/instrument-your-code.md) for setup details.
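To show why those three fields matter, the sketch below models them on a plain record and derives agent-to-agent calls from the parent links. The `TransactionMeta` class, its `transaction_id` field, and the sample values are hypothetical; this is not the Revenium SDK, and the linked setup page remains the reference for where the fields actually go.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TransactionMeta:
    """Hypothetical record carrying the three fields named above."""
    transaction_id: str                      # assumed identifier per call
    agent: str
    traceId: str                             # shared across the workflow
    parentTransactionId: Optional[str] = None


def agent_edges(transactions):
    """Return (calling_agent, called_agent) pairs within the same trace."""
    by_id = {t.transaction_id: t for t in transactions}
    edges = []
    for t in transactions:
        parent = by_id.get(t.parentTransactionId)
        if (
            parent is not None
            and parent.traceId == t.traceId
            and parent.agent != t.agent
        ):
            edges.append((parent.agent, t.agent))
    return edges


WORKFLOW = [
    TransactionMeta("tx-1", "planner", "trace-A"),
    TransactionMeta("tx-2", "researcher", "trace-A", "tx-1"),
    TransactionMeta("tx-3", "writer", "trace-A", "tx-2"),
]
print(agent_edges(WORKFLOW))
```

Without the shared `traceId` the three calls look unrelated, and without `parentTransactionId` there's no way to tell which agent called which.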

