Agent integration

A reference for AI agents (Claude Desktop, Cursor, Cline, custom MCP clients) and backend code that need to query Brain Orchestra for audit logs, traces, billing, and the model catalog.

Two equivalent surfaces

The same data is available over two transports, MCP and REST. Pick one based on the shape of your integration.

Both are scoped to the project the API key belongs to — every query auto-filters to that project's rows. No cross-tenant leakage by construction.

Connecting an MCP client

Claude Desktop

Edit ~/Library/Application Support/Claude/claude_desktop_config.json:

json
{
  "mcpServers": {
    "brain-orchestra": {
      "command": "npx",
      "args": ["-y", "mcp-remote", "https://api.brainorchestra.ai/v1/mcp"],
      "env": {
        "MCP_REMOTE_HEADERS": "{\"Authorization\":\"Bearer bo_live_YOUR_API_KEY\"}"
      }
    }
  }
}

Restart Claude Desktop. The MCP icon should show "brain-orchestra" with the tool list available.

Cursor / Cline / generic MCP client

Point the client at https://api.brainorchestra.ai/v1/mcp and send Authorization: Bearer bo_live_... as a request header. These clients can also use the same mcp-remote proxy shown for Claude Desktop if needed.

Self-test

bash
curl -X POST https://api.brainorchestra.ai/v1/mcp \
  -H "Authorization: Bearer bo_live_YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"jsonrpc":"2.0","method":"tools/list","id":1}'

You should get back the tool list. A 401 response means either the key is wrong or the project hasn't accepted the Terms (sign in to the dashboard once to accept them).
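
The same self-test can be sketched in Python with only the standard library. This is a minimal sketch that mirrors the curl call above; the helper names `build_tools_list_request` and `explain_status` are hypothetical, and the block only builds the request so it can be inspected before anything is sent.

```python
import json
import urllib.request

MCP_URL = "https://api.brainorchestra.ai/v1/mcp"


def build_tools_list_request(api_key: str, request_id: int = 1) -> urllib.request.Request:
    """Build (but do not send) the JSON-RPC tools/list request from the curl example."""
    payload = {"jsonrpc": "2.0", "method": "tools/list", "id": request_id}
    return urllib.request.Request(
        MCP_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def explain_status(status: int) -> str:
    """Map an HTTP status to the causes described in this section (hypothetical helper)."""
    if status == 401:
        return "401: the key is wrong, or the project has not accepted the Terms"
    if 200 <= status < 300:
        return "ok"
    return f"unexpected status {status}"


req = build_tools_list_request("bo_live_YOUR_API_KEY")
print(req.get_method(), req.full_url)
```

To actually send it, pass `req` to `urllib.request.urlopen`; note that `urlopen` raises `urllib.error.HTTPError` on a 401, and the error's `code` attribute can be fed to `explain_status`.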

Tools available to agents

search_audit_logs — query the audit trail

Every gateway request through BO produces a durable audit row. This tool searches them. REST equivalent: GET /v1/data/audit.

Parameters: actor_id, model, status (completed | failed | cancelled | timed_out | streaming), date_from / date_to (ISO 8601), limit (default 50, max 200), offset.

Returns: array of audit records with request_id, actor_id, model, provider, territorial_tier, status, latency_ms, tokens_in, tokens_out, cost_eur, created_at, completed_at. PII fields are included when applicable.
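
Because limit is capped at 200, larger result sets need paging with offset. The following is a minimal sketch of that loop; `audit_params`, `iter_audit_logs`, and the `fetch_page` callable are hypothetical names, and the stop condition (a page shorter than limit means the end) is an assumption rather than documented behaviour.

```python
from typing import Any, Callable, Dict, Iterator, List

MAX_LIMIT = 200  # documented maximum page size


def audit_params(status=None, actor_id=None, model=None,
                 date_from=None, date_to=None,
                 limit: int = 50, offset: int = 0) -> Dict[str, Any]:
    """Build query parameters, clamping limit to the documented 1..200 range."""
    params: Dict[str, Any] = {
        "limit": max(1, min(limit, MAX_LIMIT)),
        "offset": max(0, offset),
    }
    optional = {"status": status, "actor_id": actor_id, "model": model,
                "date_from": date_from, "date_to": date_to}
    params.update({k: v for k, v in optional.items() if v is not None})
    return params


def iter_audit_logs(fetch_page: Callable[[Dict[str, Any]], List[dict]],
                    **filters) -> Iterator[dict]:
    """Yield audit rows page by page.

    fetch_page is a caller-supplied function (hypothetical) that performs the
    GET /v1/data/audit call and returns the list of rows for one page.
    """
    params = audit_params(**filters)
    while True:
        rows = fetch_page(dict(params))
        yield from rows
        if len(rows) < params["limit"]:  # assumed end-of-results signal
            return
        params["offset"] += len(rows)
```

With requests, `fetch_page` could be as small as `lambda p: requests.get(url, headers=headers, params=p).json()["audit_logs"]`, matching the response key used in the worked example further down.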

Agent prompts that translate to this tool:

list_traces, get_trace_detail — multi-step agentic execution

BO supports trace-based observability for agents that make N gateway calls under one logical operation. REST equivalents: GET /v1/data/traces and GET /v1/data/traces/:traceId/spans.

list_traces returns the trace summary (cost, duration, status, span count). get_trace_detail returns the full span tree showing the order, depth, model, cost, and latency of every constituent request.
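
As an illustration of what the span tree is useful for, here is a minimal sketch that renders spans as an indented outline. The exact response shape is not specified here, so the field names `depth`, `model`, `latency_ms`, and `cost_eur`, and the assumption that spans arrive in execution order, are hypothetical; adjust them to the payload you actually receive.

```python
from typing import Iterable, List


def render_span_outline(spans: Iterable[dict]) -> List[str]:
    """Render one line per span, indented by its depth.

    Assumes (hypothetically) that each span dict carries depth, model,
    latency_ms, and cost_eur, and that spans are already in execution order.
    """
    lines = []
    for span in spans:
        indent = "  " * int(span.get("depth", 0))
        model = span.get("model", "?")
        latency = span.get("latency_ms", "?")
        cost = float(span.get("cost_eur", 0.0))
        lines.append(f"{indent}{model}  {latency} ms  EUR {cost:.4f}")
    return lines


# Made-up sample data, for illustration only.
sample_spans = [
    {"depth": 0, "model": "claude-haiku-4-5", "latency_ms": 120, "cost_eur": 0.001},
    {"depth": 1, "model": "gpt-5", "latency_ms": 900, "cost_eur": 0.0125},
]
for line in render_span_outline(sample_spans):
    print(line)
```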

Agent prompts:

get_billing_summary — aggregate cost

REST equivalent: GET /v1/data/billing/summary.

Parameters: month (format YYYY-MM, defaults to current month).

Returns: total_cost_eur, by_model, by_actor, by_provider arrays.
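
The month parameter is a plain YYYY-MM string, so asking for a previous month is mostly date arithmetic. A minimal sketch, where `billing_month` is a hypothetical helper and not part of any BO client:

```python
from datetime import date


def billing_month(today: date, months_back: int = 0) -> str:
    """Return the YYYY-MM string for the month `months_back` months before `today`."""
    # Count months since year 0 so that year boundaries roll over correctly.
    index = today.year * 12 + (today.month - 1) - months_back
    year, month_zero_based = divmod(index, 12)
    return f"{year:04d}-{month_zero_based + 1:02d}"


# Query parameters for last month's summary, relative to a fixed example date.
params = {"month": billing_month(date(2026, 1, 15), months_back=1)}
print(params)
```

Omitting the parameter entirely keeps the documented default of the current month.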

Catalog — list_models vs get_catalog

Two tools, distinct use cases.

list_models (REST: GET /v1/data/models) returns models eligible for your project's tier. Lightweight, no pricing. Use for routing / capability discovery.

get_catalog (REST: GET /v1/data/catalog) returns the entire catalog regardless of project tier, with eligibleForProject: bool per row plus per-customer effective pricing in USD per 1M tokens. Use for model pickers and tier-upgrade UIs.

Per-entry shape from get_catalog:

json
{
  "name": "gpt-5",
  "providerModelId": "gpt-5",
  "provider": "openai",
  "size": "large",
  "qualityRank": 0.97,
  "modelType": "chat",
  "outputDimensions": null,
  "capabilities": ["text", "vision", "document"],
  "tiers": {
    "unrestricted": true,
    "eu_cloud": false,
    "eu_strict": false,
    "eu_sweden": false
  },
  "adapterByTier": { "unrestricted": "direct" },
  "pricing": {
    "inputPer1m": 1.25,
    "outputPer1m": 10.0,
    "currency": "usd",
    "source": "catalog_default_pending_api",
    "verifiedAt": null,
    "effectiveFrom": null
  },
  "eligibleForProject": true
}
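
Since pricing is quoted in USD per 1M tokens, a per-request estimate follows directly from an entry's pricing block. The sketch below uses the sample entry above; `estimate_cost_usd` is a hypothetical helper, and it assumes input and output tokens are billed linearly at the listed rates.

```python
def estimate_cost_usd(entry: dict, tokens_in: int, tokens_out: int) -> float:
    """Estimate the USD cost of one request from a get_catalog entry.

    Assumes linear per-token billing at the entry's inputPer1m / outputPer1m rates.
    """
    pricing = entry["pricing"]
    if pricing.get("currency") != "usd":
        raise ValueError(f"unexpected currency: {pricing.get('currency')!r}")
    return (tokens_in / 1_000_000 * pricing["inputPer1m"]
            + tokens_out / 1_000_000 * pricing["outputPer1m"])


# Pricing block copied from the sample entry above.
sample_entry = {
    "name": "gpt-5",
    "pricing": {"inputPer1m": 1.25, "outputPer1m": 10.0, "currency": "usd"},
    "eligibleForProject": True,
}
cost = estimate_cost_usd(sample_entry, tokens_in=2_000, tokens_out=500)
print(f"{sample_entry['name']}: ~${cost:.4f}")
```

Audit rows and billing summaries report cost_eur, so treat this figure as a planning estimate rather than something to reconcile against billing.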

Agent prompts:

get_model_health — real-time availability

REST equivalent: GET /v1/data/model-health. Returns per-model avg_ttft_ms, p95_ttft_ms, success_rate, composite score, is_available from BO's synthetic-ping worker.
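
One way to use these fields is as a routing pre-check: skip unavailable models and prefer the lowest p95 time to first token. This is a minimal sketch under several assumptions: the rows are passed in as a list of dicts, each row names its model under a `model` key, and success_rate is a fraction between 0 and 1. None of these is confirmed here, and `pick_healthiest` is a hypothetical helper.

```python
from typing import Iterable, Optional


def pick_healthiest(rows: Iterable[dict], min_success_rate: float = 0.99) -> Optional[str]:
    """Return the available model with the lowest p95_ttft_ms, or None if none qualify.

    Assumes each row has model (hypothetical key), is_available,
    success_rate (0..1, assumed scale), and p95_ttft_ms.
    """
    candidates = [
        row for row in rows
        if row.get("is_available") and row.get("success_rate", 0.0) >= min_success_rate
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda row: row["p95_ttft_ms"])["model"]


# Made-up sample rows, for illustration only.
sample_health = [
    {"model": "gpt-5", "is_available": True, "success_rate": 0.999, "p95_ttft_ms": 850},
    {"model": "claude-haiku-4-5", "is_available": True, "success_rate": 0.995, "p95_ttft_ms": 420},
    {"model": "offline-model", "is_available": False, "success_rate": 1.0, "p95_ttft_ms": 100},
]
print(pick_healthiest(sample_health))
```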

Trace correlation — grouping multi-call operations

When an agent turn or a job runs N gateway calls under one logical operation (a Noot conversation, a multi-step briefing, a tool-use loop), you can group them under a single trace so the dashboard can show them as one unit and so the gateway can enforce a shared budget + guardrails.

A trace carries an operation contract (territorial tier, budget cap, max depth, max requests, allowed models) that is set at creation and immutable for the trace's lifetime. That's by design: when an auditor asks "what was the budget on this run before it started?", the trace row answers it.

Two ways to create a trace

Option A — Explicit pre-create (one extra round trip, cleanest separation). Make a one-time call to POST /v1/traces at the start of your logical operation:

bash
curl -X POST https://api.brainorchestra.ai/v1/traces \
  -H "Authorization: Bearer bo_live_..." \
  -H "X-User-Id: alice@acme.com" \
  -H "Content-Type: application/json" \
  -d '{
    "operation_type": "noot_chat",
    "max_requests": 5,
    "max_depth": 5,
    "budget_eur": 0.50
  }'

# 201 Created
# { "trace_id": "bo_trace_0a1b2c...", "status": "active", ... }

Then on every downstream chat or embeddings call, attach the trace via the X-Trace-Id header:

bash
curl https://api.brainorchestra.ai/v1/chat/completions \
  -H "Authorization: Bearer bo_live_..." \
  -H "X-User-Id: alice@acme.com" \
  -H "X-Trace-Id: bo_trace_0a1b2c..." \
  -H "Content-Type: application/json" \
  -d '{ "model": "claude-haiku-4-5", "messages": [...], "stream": true }'
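
On the client side, Option A can be modelled as a small immutable contract plus a header helper. A minimal sketch: `TraceContract` and `traced_headers` are hypothetical names, the fields mirror the request body shown above, and the frozen dataclass only echoes the server-side immutability locally; it does not enforce anything on BO.

```python
from dataclasses import asdict, dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class TraceContract:
    """Client-side mirror of the POST /v1/traces body shown above."""
    operation_type: str
    max_requests: int
    max_depth: int
    budget_eur: Optional[float] = None

    def to_body(self) -> dict:
        # Drop unset optional fields so they are not sent as null.
        return {k: v for k, v in asdict(self).items() if v is not None}


def traced_headers(api_key: str, user_id: str, trace_id: str) -> Dict[str, str]:
    """Headers for a downstream call attached to an existing trace."""
    return {
        "Authorization": f"Bearer {api_key}",
        "X-User-Id": user_id,
        "X-Trace-Id": trace_id,
        "Content-Type": "application/json",
    }


contract = TraceContract("noot_chat", max_requests=5, max_depth=5, budget_eur=0.50)
print(contract.to_body())
```

POST `contract.to_body()` as JSON to /v1/traces, read trace_id from the 201 response, then pass it to `traced_headers` for every call in the operation.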

Option B — Inline create on first call (no extra round trip, fits agent loops better). On the first request of the operation, send X-Create-Trace: true (and no X-Trace-Id), plus an optional trace_contract in the request body:

bash
curl https://api.brainorchestra.ai/v1/chat/completions \
  -H "Authorization: Bearer bo_live_..." \
  -H "X-User-Id: alice@acme.com" \
  -H "X-Create-Trace: true" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-haiku-4-5",
    "messages": [...],
    "stream": true,
    "trace_contract": {
      "operation_type": "noot_chat",
      "max_requests": 5,
      "max_depth": 5
    }
  }'

# Response includes header: X-Trace-Id: bo_trace_...
# Read that header, then thread it on calls 2..N as X-Trace-Id

BO mints the trace as part of the request, runs the call as its first span, and returns the new trace id in the X-Trace-Id response header. Read that header from the response and thread it as X-Trace-Id: <that-id> on every subsequent call in the operation.
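
The header threading for Option B can be kept in one small object so an agent loop does not have to track whether it is on call 1 or call N. This is a minimal sketch; `TraceSession` is a hypothetical helper, not part of any BO SDK, and it handles headers only, leaving the HTTP call to the caller.

```python
from typing import Dict, Mapping, Optional


class TraceSession:
    """Sends X-Create-Trace on the first call, then X-Trace-Id on later calls."""

    def __init__(self, api_key: str, user_id: str) -> None:
        self._base = {
            "Authorization": f"Bearer {api_key}",
            "X-User-Id": user_id,
            "Content-Type": "application/json",
        }
        self.trace_id: Optional[str] = None

    def request_headers(self) -> Dict[str, str]:
        headers = dict(self._base)
        if self.trace_id is None:
            headers["X-Create-Trace"] = "true"  # first call: ask BO to mint the trace
        else:
            headers["X-Trace-Id"] = self.trace_id  # calls 2..N: attach to it
        return headers

    def record_response(self, response_headers: Mapping[str, str]) -> None:
        """Capture the trace id from the first response; header names are matched case-insensitively."""
        if self.trace_id is not None:
            return
        for name, value in response_headers.items():
            if name.lower() == "x-trace-id":
                self.trace_id = value
                return


session = TraceSession("bo_live_YOUR_API_KEY", "alice@acme.com")
first = session.request_headers()
session.record_response({"x-trace-id": "bo_trace_example"})  # made-up id for illustration
later = session.request_headers()
```

One design consequence: calls issued in parallel before the first response arrives would each carry X-Create-Trace, so it seems safest to run the first call on its own and fan out afterwards.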

Common pitfalls

⚠️ Don't generate trace IDs client-side.

Sending X-Trace-Id: <your-fresh-uuid> against a trace that BO doesn't know about returns 404 trace_not_found. The trace must exist server-side first, either via POST /v1/traces or via X-Create-Trace: true on the first call. There is no "first request with a new ID auto-creates" semantic.

Authentication and scoping

Worked examples

Backend script (Python, REST)

python
import os, requests

bo = "https://api.brainorchestra.ai"
headers = {"Authorization": f"Bearer {os.environ['BO_API_KEY']}"}

# Last 7 days of failed requests
r = requests.get(
    f"{bo}/v1/data/audit",
    headers=headers,
    params={
        "status": "failed",
        "date_from": "2026-04-19T00:00:00Z",
        "date_to": "2026-04-26T00:00:00Z",
        "limit": 200,
    },
)
failures = r.json()["audit_logs"]

# This month's billing
r = requests.get(f"{bo}/v1/data/billing/summary", headers=headers)
print(f"Spend: €{r.json()['total_cost_eur']}")

# Discover document-capable models with pricing
r = requests.get(f"{bo}/v1/data/catalog", headers=headers)
docs = [m for m in r.json()["models"]
        if m["eligibleForProject"] and "document" in m["capabilities"]]
for m in sorted(docs, key=lambda m: m["pricing"]["inputPer1m"]):
    print(f"{m['name']}: ${m['pricing']['inputPer1m']}/1M in")

Building a model picker (Node, REST)

javascript
const res = await fetch("https://api.brainorchestra.ai/v1/data/catalog", {
  headers: { Authorization: `Bearer ${process.env.BO_API_KEY}` },
});
const { models, project_tier } = await res.json();

const eligible = models.filter(m => m.eligibleForProject);
const upsell = models.filter(m => !m.eligibleForProject);

console.log(`${eligible.length} models available; ${upsell.length} more if you upgrade tier`);
upsell.forEach(m => {
  const tiers = Object.entries(m.tiers).filter(([, v]) => v).map(([k]) => k);
  console.log(`  ${m.name} — needs tier: ${tiers.join(' or ')}`);
});

MCP vs REST — when to choose which

Use MCP when your agent runs inside an MCP-aware host (Claude Desktop, Cursor, Cline). The host handles auth, retries, and schema validation, and translates natural language into tool calls. The operator can grant per-tool permissions and see what the agent is doing.

Use REST when your code is imperative (backend script, dashboard, CI job, scheduled report) or when you're not running inside an MCP host.

Both surfaces stay in lockstep — the catalog endpoint shape, for example, is built from a single src/api/catalog-shape.ts module so MCP and REST can never disagree.

Rate limits

Read endpoints share the project's per-minute rate limit (default 60 RPM, configurable per project in dashboard settings). They don't consume your prepaid balance — read-only data returns are free.
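
A script that polls these endpoints can stay under the limit with a client-side throttle. Below is a minimal sliding-window sketch assuming the default of 60 requests per minute; `MinuteThrottle` is a hypothetical helper, and how the server actually counts its window is not specified here, so treat this as a conservative guard rather than an exact mirror.

```python
import time
from collections import deque
from typing import Callable, Deque


class MinuteThrottle:
    """Allow at most max_per_minute calls in any rolling 60-second window."""

    def __init__(self, max_per_minute: int = 60,
                 clock: Callable[[], float] = time.monotonic,
                 sleep: Callable[[float], None] = time.sleep) -> None:
        self._max = max_per_minute
        self._clock = clock
        self._sleep = sleep
        self._sent: Deque[float] = deque()  # timestamps of recent calls

    def _prune(self, now: float) -> None:
        # Forget calls that have left the 60-second window.
        while self._sent and now - self._sent[0] >= 60.0:
            self._sent.popleft()

    def wait_time(self) -> float:
        """Seconds to wait before the next call is allowed (0.0 if allowed now)."""
        now = self._clock()
        self._prune(now)
        if len(self._sent) < self._max:
            return 0.0
        return 60.0 - (now - self._sent[0])

    def acquire(self) -> None:
        """Block until a call is allowed, then record it."""
        delay = self.wait_time()
        if delay > 0:
            self._sleep(delay)
        now = self._clock()
        self._prune(now)
        self._sent.append(now)


throttle = MinuteThrottle(max_per_minute=60)
# Call throttle.acquire() immediately before each read request.
```

If the project's limit has been changed in dashboard settings, pass that value as `max_per_minute`.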