CAUM Live

See when agent work stops converting into progress.

Install CAUM beside one recurring AI agent workflow. CAUM returns zero-semantic receipts and a live dashboard for loops, retries, workflow memory, review boundaries, token exposure, and cost exposure without reading prompts, files, messages, source code, or business payloads.

Start Live Pilot | Watch Simple Demo | Run One Receipt

3-line SDK path | Per-step evidence | Workflow memory | Zero-semantic | Observe-only | No blocking | Hash-chain evidence

Live Runs

Shadow mode

Active runs

Review boundaries

Reviewable exposure: 5.0%

agent_42 | T5 | boundary: step 6 | 0% conversion

browser_run | T4 | review | 62% conversion

code_agent | T2 | boundary: none | 78% conversion

chain_head: 6f1a9c8b... | decision: observe_only | allowed_to_block: false

What CAUM Live tells you

CAUM is narrow by design. It does not try to be a general tracing suite or a truth detector. It measures whether agent work still has healthy structural movement, where it first stops converting cleanly, and what budget exposure sits after that passive review boundary.

Developer API | Customer Dashboard | Technical Sandbox | Evidence Samples

Structural health

T1-T5 health tiers for each run, with review priority and profile-aware calibration.

Review boundary

The first structural point where a run should become reviewable, with tail tokens and cost after that point.

Work conversion

A zero-semantic readout of whether observed steps are still turning into structural movement.

Workflow memory

Customer-scoped review cards for recurrent fingerprints, conversion drift, and long-horizon rhythm across repeated sessions.
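As a rough illustration of the review-boundary readout, the sketch below computes the tail tokens and cost that sit after a boundary step. The `Step` type, the `tail_after_boundary` helper, the 1-indexed step convention, and the tail-share ratio are all assumptions made for this example; they are not CAUM's API, and CAUM's own calculation may differ.

```python
from dataclasses import dataclass


@dataclass
class Step:
    """One observed step: structural counters only, no content."""
    tokens: int
    cost_usd: float


def tail_after_boundary(steps, boundary_step):
    """Return (tail_tokens, tail_cost_usd, tail_share).

    Assumption: boundary_step is 1-indexed and the tail is every step
    strictly after the boundary.
    """
    tail = steps[boundary_step:]
    tail_tokens = sum(s.tokens for s in tail)
    tail_cost = sum(s.cost_usd for s in tail)
    total_tokens = sum(s.tokens for s in steps)
    # Share of all observed tokens that were spent after the boundary.
    tail_share = tail_tokens / total_tokens if total_tokens else 0.0
    return tail_tokens, tail_cost, tail_share


# Hypothetical run: ten equal steps with a review boundary at step 6.
steps = [Step(tokens=100, cost_usd=0.01) for _ in range(10)]
tail_tokens, tail_cost, tail_share = tail_after_boundary(steps, boundary_step=6)
print(tail_tokens, round(tail_cost, 2), round(tail_share, 2))
```

In this made-up run, four of the ten steps fall after the boundary, so the tail carries 400 tokens and 40% of the observed token volume.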

Audit is the entry point. Live is the money layer.

A one-time receipt shows the structural loop or retry pattern. CAUM Live keeps that evidence attached to running agents, where repeated loops, retry storms, reasoning stalls, tool churn, and workflow-level memory can accumulate across days or teams.

Before signal: T5

After guard: T2

Review boundary: step 6

Exposure scenario: 5.0% -> 1.8%

Receipt shows the pattern

Run one workflow through CAUM and show the structural finding, cost exposure, and remediation receipt.

Live watches the recurrence

Stream neutral events from production or internal agents and review hard alerts, work conversion, review boundaries, tokens, and cost exposure during operation.

Teams tune policies

Use receipts to adjust retry ceilings, checkpoints, tool budgets, fallback routes, and review gates. CAUM observes the before/after; it does not control the agent.

Policy Effectiveness closes the loop.

CAUM Live can compare customer-marked before/after cohorts after a team applies its own retry ceiling, handoff rule, or exit contract. The output is an observed structural exposure delta, not a realized savings claim.

Generated tasks: 600

Policy cohorts: 3

Mean exposure delta: 69.3%

Observed token delta: 1.85M

Claim lock: this RunPod lab validates the measurement path on generated structural tasks. It is not customer prevalence, not ROI proof, and not a guarantee of financial reduction.

RunPod Policy Effectiveness Lab

Three passive policies were tested as before/after structural cohorts. CAUM observed lower reviewable exposure after the policy marker was applied.

Policy	Description	Exposure delta	Reviewable delta	Token delta
Retry error ceiling	Marks repeated failed retries before they become hidden churn.	50.4%	$119.90	358,400
Handoff bounce limit	Surfaces repeated routing between actors or tools.	79.0%	$173.30	435,000
Reasoning-to-action exit contract	Detects long reasoning runs that stop becoming action.	78.4%	$289.63	1,051,800
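The summary figures above can be cross-checked against the table. The short Python check below assumes the 69.3% figure is an unweighted mean of the three per-policy exposure deltas and that the 1.85M figure is the sum of the three token deltas; the page does not state either definition explicitly.

```python
# (policy, exposure delta %, reviewable delta USD, token delta), copied from the table.
rows = [
    ("Retry error ceiling", 50.4, 119.90, 358_400),
    ("Handoff bounce limit", 79.0, 173.30, 435_000),
    ("Reasoning-to-action exit contract", 78.4, 289.63, 1_051_800),
]

# Assumption: the headline mean is unweighted across policies.
mean_exposure_delta = sum(r[1] for r in rows) / len(rows)
total_token_delta = sum(r[3] for r in rows)

print(f"{mean_exposure_delta:.1f}%")       # 69.3%
print(f"{total_token_delta / 1e6:.2f}M")   # 1.85M
```

Both values match the headline tiles under those assumptions. As the claim lock states, these are observed deltas on generated tasks, not realized savings.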

Send structure, not secrets

Live events can include agent identity and delegation metadata when available. CAUM hashes identity fields in returned evidence and ignores sensitive content fields.

{
  "session_id": "deploy-184",
  "agent_id": "agent-42",
  "identity_sub": "spiffe://acme/prod/agent/42",
  "scopes": ["tools:read", "tools:execute"],
  "event": {
    "event": "tool_call",
    "tool": "bash",
    "status": "completed",
    "input_tokens": 420,
    "cost_usd": 0.013
  }
}
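One way to keep content out of the event on the sending side is an allowlist applied before anything leaves the process. The Python sketch below is illustrative only: `STRUCTURAL_EVENT_FIELDS`, `structural_event`, and `build_payload` are hypothetical names, not part of a CAUM SDK, and the allowlist simply mirrors the fields in the sample event above.

```python
import json

# Hypothetical allowlist: only the structural fields shown in the sample event.
STRUCTURAL_EVENT_FIELDS = {"event", "tool", "status", "input_tokens", "cost_usd"}


def structural_event(raw_step: dict) -> dict:
    """Keep allowlisted structural fields and drop everything else."""
    return {k: v for k, v in raw_step.items() if k in STRUCTURAL_EVENT_FIELDS}


def build_payload(session_id, agent_id, identity_sub, scopes, raw_step):
    """Assemble a payload with the same shape as the sample event."""
    return {
        "session_id": session_id,
        "agent_id": agent_id,
        "identity_sub": identity_sub,
        "scopes": list(scopes),
        "event": structural_event(raw_step),
    }


# A raw step as an agent framework might record it, including content fields.
raw_step = {
    "event": "tool_call",
    "tool": "bash",
    "status": "completed",
    "input_tokens": 420,
    "cost_usd": 0.013,
    "command": "cat notes.txt",      # content: dropped by the allowlist
    "stdout": "private file body",   # content: dropped by the allowlist
}

payload = build_payload(
    "deploy-184",
    "agent-42",
    "spiffe://acme/prod/agent/42",
    ["tools:read", "tools:execute"],
    raw_step,
)
print(json.dumps(payload, indent=2))
```

An allowlist is used rather than a blocklist so that any new field added by the agent framework stays out of the event until someone explicitly approves it.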

Start Live once the receipt proves the case.

These plans buy continuous structural observability and onboarding for agent runs. CAUM Receipt remains the low-friction entry point; CAUM Live is for recurring agent spend and operational review.

Builder

CAUM Live Builder

For one builder who wants CAUM on coding-agent runs without an enterprise process.

$99 / month

CAUM Live onboarding for one agent workflow
Structural health, work conversion, token, and cost signals
Zero-semantic event boundary

Start Builder

Recommended

CAUM Live Team Pilot

For teams running agents often enough that repeated loops, retries, and reviewable cost exposure already matter.

$299 / month

Team CAUM Live pilot and setup support
Review-boundary ledger and delegation metadata support
CAUM Receipt workflow for review evidence

Start Live Pilot

Operator

CAUM Live Operator

For higher-volume pilots that need a more guided integration path and operator review.

$999 / month

Higher-touch onboarding for multiple agent workflows
Run review support around hard structural alerts and passive boundaries
Private-content-safe event design

Start Operator

CAUM observes structural evidence only. It does not judge whether an answer is true, does not make content-truth claims, and does not block agents.

CAUM Live during the run. CAUM Receipt after the run.

Use CAUM Receipt to enter the workflow. Use CAUM Live when the same agent behavior repeats often enough that reviewable structural exposure becomes a recurring operating problem.

Start Live Pilot | Watch Simple Demo | Open Technical Sandbox | Run Receipt