
# Agent Instrumentation Guide

Agent workflows break the assumptions that most observability tools are built on. A standard LLM integration is simple: one prompt, one response, one cost. An agent is something else entirely - a chain of reasoning steps, tool calls, handoffs between sub-agents, and occasionally human escalations, all working together to deliver a single business outcome that may or may not have anything to do with whether the underlying model calls succeeded.

Tracking this properly is hard for reasons that compound. Token costs are the smallest part of what an agent actually costs to run - in production workflows, external tool calls commonly cost ten times as much as the tokens. Agents loop in ways that aren't visible until someone checks the aggregate numbers at the end of the month. Decision paths that look identical at the transaction level have wildly different cost profiles once you account for which tools they triggered. And the thing that actually matters - did the agent close the sale, deflect the support ticket, complete the task - doesn't show up anywhere in technical logs at all.

Revenium treats agents as first-class workloads rather than slightly more elaborate LLM calls. This page explains the conceptual model, what it lets you see that infrastructure monitoring doesn't, and where to go in the product and API for each piece.

### What Makes Agent Workloads Hard to Observe

Four problems show up in every serious agent deployment, and each one needs a different kind of instrumentation to see properly.

**Tool costs dwarf token costs.** A customer service agent that pulls a credit report from Experian at $25 per lookup spends more on one external API call than it does on thousands of tokens. Agents that call multiple services on each run - CRM lookups, document processing, data enrichment, maps, databases - generate cost profiles where the LLM spend is a rounding error. Without structured tool tracking, all of this is invisible to any monitoring that only watches what flows through the model provider.

**Agents loop without anyone noticing.** A reasoning step that fails once is fine. A reasoning step that fails and triggers a retry, which triggers another external API call, which feeds back into another reasoning step that also fails, can easily run dozens of times before the workflow either succeeds or gives up. The token cost per iteration is small. The tool cost per iteration is not. Without trace-level visibility, a looping agent looks identical to a well-behaved one in aggregate dashboards.

**Multi-agent handoffs are opaque.** When agent A calls agent B which calls agent C, reconstructing that call tree from raw logs means reading timestamps and guessing at relationships. The interesting questions - which handoff is slowest, which agent spawns the most sub-calls, where does cost concentrate across the workflow - are impossible to answer without explicit parent/child relationships between calls.

**Technical success and business outcomes are different things.** An agent can execute flawlessly on every single step and still fail to close the sale, deflect the ticket, or complete the task it was built for. A 100% technical success rate on a lead qualification workflow tells you nothing about whether leads are actually getting qualified. Observing agents properly means measuring outcomes, not just executions.

### The Four Dimensions Revenium Uses

The instrumentation model maps to the four levels at which agent work actually happens:

**The transaction.** One AI call with full metadata - which agent made it, which model it used, what it cost, how long it took, and (crucially) what its parent transaction was. The `agent` and `parentTransactionId` fields are what let Revenium reconstruct who-called-whom across multi-agent workflows without guessing.
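As a rough illustration of why the parent link matters, the sketch below rebuilds a call tree from a handful of transaction records. Only the field names `agent`, `transactionId`, and `parentTransactionId` come from this guide; the record values, the agent names, and the helper functions are hypothetical, and this is not a description of how Revenium itself stores or renders the data.

```python
from collections import defaultdict

# Hypothetical transaction records. Field names follow the guide;
# the IDs and agent names are invented for illustration.
transactions = [
    {"transactionId": "t1", "parentTransactionId": None, "agent": "router"},
    {"transactionId": "t2", "parentTransactionId": "t1", "agent": "researcher"},
    {"transactionId": "t3", "parentTransactionId": "t1", "agent": "writer"},
    {"transactionId": "t4", "parentTransactionId": "t2", "agent": "fact-checker"},
]


def build_children_index(records):
    """Map each parent transaction ID to the records it triggered."""
    children = defaultdict(list)
    for record in records:
        children[record["parentTransactionId"]].append(record)
    return children


def render_call_tree(records):
    """Return indented lines showing who called whom, roots first."""
    children = build_children_index(records)
    lines = []

    def walk(parent_id, depth):
        for record in children.get(parent_id, []):
            label = f'{record["agent"]} ({record["transactionId"]})'
            lines.append("  " * depth + label)
            walk(record["transactionId"], depth + 1)

    walk(None, 0)  # root transactions have no parent
    return lines


call_tree = render_call_tree(transactions)
print("\n".join(call_tree))
```

With explicit parent IDs the tree falls out of a single pass over the records; without them, the same structure would have to be inferred from timestamps.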

**The trace.** A sequence of related transactions that together make up one workflow run, grouped by `traceId`. This is the unit at which loops become visible, bottlenecks show up, and cost distribution across steps can be analyzed. A single trace might span one agent making ten calls, or ten agents each making one call - the grouping is what matters.

**The job.** The business-level unit of work that a single trace, or many traces, contributes toward. `agenticJobId` ties technical execution to a real-world outcome - a support ticket resolution, a lead qualification, a document review. Jobs are what you measure ROI on, because jobs are what your business actually cares about.

**The squad.** A named, coordinated group of agents executing together as a unit. Different from ad-hoc multi-agent workflows - a squad is an explicit concept that Revenium tracks as its own entity, with aggregated metrics across executions and full timeline views of agent coordination within each run.

These four dimensions don't form a single strict hierarchy. Transactions and traces are a clean technical stack - each trace contains one or more transactions. Jobs and squads sit on separate axes: a job is a *business* grouping that collects whichever transactions contributed to one outcome (possibly spanning one trace, possibly many), and a squad is an *orchestration* grouping that aggregates executions of a named multi-agent unit. The same transaction can carry a `traceId` and an `agenticJobId` and belong to a squad execution at the same time - they're independent dimensions that each answer a different question. You instrument at whichever levels match the questions you need to answer: trace for "what happened inside this workflow run", job for "did this decision produce a business outcome", squad for "how are my coordinated multi-agent units performing".
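To make the "independent dimensions" point concrete, here is a minimal sketch that rolls the same records up two different ways. `traceId` and `agenticJobId` are the field names used in this guide; the `cost` field, the record values, and the `total_cost_by` helper are assumptions made for the example. Squad membership would be a third grouping key, but this page doesn't name the field that carries it, so it is left out.

```python
from collections import defaultdict

# Hypothetical records: one job ("job-1") spans two traces.
# `cost` is an illustrative field in US dollars, not a documented name.
records = [
    {"transactionId": "t1", "traceId": "trace-a", "agenticJobId": "job-1", "cost": 0.02},
    {"transactionId": "t2", "traceId": "trace-a", "agenticJobId": "job-1", "cost": 0.03},
    {"transactionId": "t3", "traceId": "trace-b", "agenticJobId": "job-1", "cost": 0.10},
    {"transactionId": "t4", "traceId": "trace-c", "agenticJobId": "job-2", "cost": 0.05},
]


def total_cost_by(rows, key):
    """Sum `cost` per distinct value of `key`, rounded to 4 decimal places."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[key]] += row["cost"]
    return {group: round(total, 4) for group, total in totals.items()}


cost_by_trace = total_cost_by(records, "traceId")      # "what happened in this run?"
cost_by_job = total_cost_by(records, "agenticJobId")   # "what did this outcome cost?"

print(cost_by_trace)
print(cost_by_job)
```

The same four transactions answer two different questions depending on the key, which is why neither grouping needs to nest inside the other.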

### What This Unlocks That Token-Only Observability Doesn't

Once this structure is in place, five things become legible that would otherwise require ad-hoc log reconstruction every time:

**Tool costs alongside token costs.** Revenium's Tool Registry captures every external tool, API, or service an agent calls - with its own pricing model, its own attribution chain, and full visibility into the cost iceberg beneath the tokens. In the typical production workflow this is the majority of the spend, not the minority.

**Agent-to-agent call patterns.** Because `parentTransactionId` links every call to the one that triggered it, Revenium can render an agent interaction matrix - which agents call which, how often, how much it costs, and how long it takes. The critical path through a multi-agent workflow becomes a chart rather than a detective exercise.
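The counting behind such a matrix can be sketched in a few lines. This is an illustration of the idea only, not Revenium's implementation: the field names come from this guide, while the sample records and the `interaction_counts` helper are hypothetical.

```python
from collections import Counter

# Hypothetical records; field names follow the guide, values are invented.
transactions = [
    {"transactionId": "t1", "parentTransactionId": None, "agent": "router"},
    {"transactionId": "t2", "parentTransactionId": "t1", "agent": "researcher"},
    {"transactionId": "t3", "parentTransactionId": "t1", "agent": "researcher"},
    {"transactionId": "t4", "parentTransactionId": "t1", "agent": "writer"},
    {"transactionId": "t5", "parentTransactionId": "t2", "agent": "fact-checker"},
]


def interaction_counts(records):
    """Count (caller agent, callee agent) pairs by following parent links."""
    agent_by_id = {r["transactionId"]: r["agent"] for r in records}
    counts = Counter()
    for record in records:
        parent_id = record["parentTransactionId"]
        if parent_id in agent_by_id:  # skip roots and unknown parents
            counts[(agent_by_id[parent_id], record["agent"])] += 1
    return counts


matrix = interaction_counts(transactions)
for (caller, callee), calls in sorted(matrix.items()):
    print(f"{caller} -> {callee}: {calls}")
```

Summing cost or duration per pair instead of incrementing a counter would give the other columns the paragraph mentions.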

**Circular patterns.** Traces that loop show up as an explicit anomaly class rather than a slightly-higher-than-average cost. An agent calling itself, or two agents calling each other in an unproductive handshake, is a distinctive pattern that can be detected structurally and surfaced before it compounds.
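One way to picture "detected structurally" is a cycle check over the agent-level call graph. The sketch below uses a standard depth-first search; it is a generic technique chosen for illustration and makes no claim about how Revenium detects circular patterns. The `find_agent_cycle` function and the agent names are hypothetical.

```python
def find_agent_cycle(edges):
    """Return one cycle as a list of agent names, or None if there is none.

    `edges` is an iterable of (caller, callee) agent-name pairs.
    """
    graph = {}
    for caller, callee in edges:
        graph.setdefault(caller, set()).add(callee)
        graph.setdefault(callee, set())

    unvisited, in_progress, done = 0, 1, 2
    state = {node: unvisited for node in graph}
    path = []

    def visit(node):
        state[node] = in_progress
        path.append(node)
        for nxt in sorted(graph[node]):
            if state[nxt] == in_progress:
                # Back edge: the cycle runs from nxt to the current node.
                return path[path.index(nxt):] + [nxt]
            if state[nxt] == unvisited:
                found = visit(nxt)
                if found:
                    return found
        path.pop()
        state[node] = done
        return None

    for node in sorted(graph):
        if state[node] == unvisited:
            found = visit(node)
            if found:
                return found
    return None


handshake = find_agent_cycle([("planner", "researcher"), ("researcher", "planner")])
self_call = find_agent_cycle([("summarizer", "summarizer")])
linear = find_agent_cycle([("router", "researcher"), ("researcher", "writer")])
print(handshake, self_call, linear)
```

A structural check like this flags only the shape of the calls. Whether a given cycle is unproductive still depends on what each iteration accomplished and what it cost.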

**Business ROI per job type.** With outcomes reported against `agenticJobId`, Revenium can tell you not just what an agent cost, but whether it delivered - and how that compares across job types. A support-ticket agent can have a 60% deflection rate at $0.15 per ticket versus $50 for a human agent, and that ratio appears in the product as a clean ROI figure rather than something a finance team has to build in a spreadsheet.
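The arithmetic behind the support-ticket example can be worked through under a few stated assumptions: $0.15 is the agent's cost for every ticket it handles, $50 is what a human would cost for a ticket the agent deflects, and a batch of 1,000 tickets is used purely for round numbers. The function and its field names are hypothetical, and the formula Revenium uses for its ROI figure may differ.

```python
from decimal import Decimal


def deflection_economics(tickets, deflection_rate, agent_cost_per_ticket,
                         human_cost_per_ticket):
    """Compare agent spend with the human cost avoided on deflected tickets."""
    deflected = tickets * deflection_rate
    agent_spend = tickets * agent_cost_per_ticket  # agent runs on every ticket
    human_cost_avoided = deflected * human_cost_per_ticket
    return {
        "agent_spend": agent_spend,
        "human_cost_avoided": human_cost_avoided,
        "net_savings": human_cost_avoided - agent_spend,
        "value_ratio": human_cost_avoided / agent_spend,
    }


# Figures from the example above, applied to an assumed batch of 1,000 tickets.
result = deflection_economics(
    tickets=Decimal(1000),
    deflection_rate=Decimal("0.60"),
    agent_cost_per_ticket=Decimal("0.15"),
    human_cost_per_ticket=Decimal("50"),
)
for name, value in result.items():
    print(f"{name}: {value}")
```

Under these assumptions the agent spends $150 to avoid $30,000 of human handling, a ratio of 200 to 1.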

**Human-in-the-loop cost accounting.** Escalations to human reviewers cost real money in time. Registering human effort as a custom tool in the Tool Registry and metering each escalation properly means the full cost of a workflow - including the minutes a human spent on it - flows into the same ROI calculation as the token and tool costs. Escalated outcomes keep their full business value, and the human time shows up as a cost, which is the only way the economics come out right.
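A minimal sketch of that accounting follows, assuming human effort is priced per minute from an hourly rate. The token cost, the six-minute review, and the $45 hourly rate are invented for illustration; the $25 tool cost mirrors the credit report lookup mentioned earlier on this page. The `workflow_cost` helper is hypothetical and is not part of any Revenium SDK.

```python
from decimal import Decimal


def workflow_cost(token_cost, tool_cost, human_minutes, human_hourly_rate):
    """Return a cost breakdown that treats human review time as one more cost."""
    human_cost = human_hourly_rate * human_minutes / Decimal(60)
    return {
        "tokens": token_cost,
        "tools": tool_cost,
        "human": human_cost,
        "total": token_cost + tool_cost + human_cost,
    }


# Assumed figures for one escalated workflow run.
breakdown = workflow_cost(
    token_cost=Decimal("0.04"),
    tool_cost=Decimal("25.00"),      # e.g. one credit report lookup
    human_minutes=Decimal("6"),      # reviewer time on the escalation
    human_hourly_rate=Decimal("45"),
)
print(breakdown)
```

Leaving the human line out would understate this run's cost by $4.50, which is far more than its token spend.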

### Where Agent Workloads Surface in the Product

Three areas of the UI are scoped specifically for agent work:

**Performance > Agents** is the aggregate view - total throughput, agent completion rate, reliability per agent, duration breakdowns, and cross-model comparison. The place to look when the question is "which of my agents is performing well and which isn't?"

**Performance > Traces > Agent Interaction** shows how agents communicate within traces. This is where the agent interaction matrix, critical path analysis, and circular pattern detection live. The place to look when the question is "what's actually happening inside this workflow?"

**Data > Squads** aggregates multi-agent workflow executions by named squad, with status distribution, trace counts, and execution timelines. The place to look when you're running explicit multi-agent coordination and want to see how the whole squad performed rather than the agents individually.

### Deeper Coverage in This Section

Four related pages go deeper on specific aspects of agent instrumentation.

[AI Outcomes](/instrument-your-agents/agent-outcomes.md) covers outcome reporting - CONVERTED, ESCALATED, DEFLECTED, and CUSTOM - posted after each job run so Revenium can calculate ROI, deflection rates, and cost per conversion. This is where the business-value side of the ledger lives. Start here if your question is "are these agents paying for themselves?"

[Monitor Agent Tool Usage](/instrument-your-agents/monitor-agent-tool-usage.md) covers the Tool Registry in depth - registering tools, configuring pricing models for them, metering tool events, and analyzing the cost iceberg where tool spend dominates token spend. This is the page to work through once the transaction-level instrumentation is in place and you're ready to capture the rest of the cost picture.

[Analyze Decision Costs](/instrument-your-agents/analyze-decision-costs.md) covers the Jobs system and outcomes tracking. Jobs are Revenium's higher-level abstraction for "one decision" - the business-level unit of work that groups transactions and traces together and gets measured against a business outcome. This is where ROI, conversion funnels, value ratios per job type, and the correct handling of escalation costs all live. Essential reading if you're using LangChain, CrewAI, or any orchestration framework where "did the workflow deliver" matters more than "did each call succeed".

[MCP Server Setup](/integrations/mcp-server.md) addresses the reverse direction: hooking your own AI agents - Claude, Cursor, custom assistants - to Revenium itself via Model Context Protocol, so they can query cost data, investigate spikes, configure alerts, and reason about AI economics natively inside their own context. Different problem entirely from instrumenting your agents' output; this is about instrumenting the agents that work with your Revenium data.

### Via API

Agent instrumentation uses five sections of the Revenium API working together. The [AI Metering endpoints](https://revenium.readme.io/reference/meter_ai_completion) accept the standard completion payload with the `agent`, `transactionId`, `parentTransactionId`, and `traceId` fields that make the dimensional model work. The [Tool Metering endpoint](https://revenium.readme.io/reference/meter_tool_event) captures external tool costs against registered tools with full attribution. The [Jobs API](https://revenium.readme.io/reference/report_job_outcome) handles business-outcome reporting against `agenticJobId`. The [Squads endpoints](https://revenium.readme.io/reference/list_squad_executions) surface aggregated metrics for multi-agent coordinated executions. And the [AI Traces endpoints](https://revenium.readme.io/reference/list_ai_traces) expose trace-level analytics including agent interaction tables, interaction matrices, critical path analysis, and circular pattern detection - the underlying data behind the agent-specific UI surfaces.

For per-agent cost and performance analytics, the [Analytics API](https://revenium.readme.io/reference/get_performance_metrics_by_agent) exposes cost-by-agent, performance-by-agent, task-performance-by-agent, and task-completion endpoints that power the dashboards but are also usable directly for custom reporting or external integrations.

