
Agent integration

When your agent fetches content from the web and uses it in a response, OpenAttribution tracks that usage so content owners get visibility. This page explains what to report, when, and how.


The short version

Your agent does three things that matter for attribution:

  1. Retrieves content - fetches a URL to read it
  2. Grounds on content - loads it into the generation context so it can shape the response
  3. Cites content - explicitly references it in the response to the user

Report all three. Retrieval tells the content owner their content was accessed. Grounding tells them it actually entered the model's context and influenced the response - the core attribution signal. Citation is the subset of grounded content the agent surfaced to the user.

Content can influence every turn in a session without being cited once. That is why grounding, not citation, is the load-bearing event for attribution.


Setup

1. Request agent access

Agent access is gated while the network is bedding in. Sign in, then submit a request via POST /api/v1/identity/agent-access-requests with your org name, admin email, and a short use case. An operator admin reviews and approves; on approval the agent org is created and you can mint an oat_pk_ API key from the dashboard. Content owners are self-serve, but agent and platform onboarding both go through the same approval path.

Submit an agent access request if you don't have a session yet.

oat_pk_ is an API-key prefix, not a signing key
oat_pk_ ("pk" = platform key) is the prefix on platform and agent API keys you put on the X-API-Key header. It is unrelated to the Ed25519 publicKey values in a .well-known manifest's keys array, which are for signing telemetry events (informational in v0.1). Key prefixes and scopes are documented in full at authentication and on the API reference.

2. Publish your own manifest

Publish an OpenAttribution manifest declaring your agent's identity, signing keys, and the endpoint where you submit telemetry. The manifest goes under a path prefix on your own domain, served at the well-known path:

https://yourcompany.com/agents/your-agent/.well-known/openattribution.json
{
  "schema_version": "0.1",
  "id": "https://yourcompany.com/agents/your-agent/.well-known/openattribution.json",
  "roles": ["agent"],
  "operator": { "name": "Your Company" },
  "keys": [
    { "id": "key-1", "type": "Ed25519", "publicKey": "z6Mk..." }
  ],
  "telemetry": {
    "endpoint": "https://telemetry.openattribution.org/events",
    "conformance_level": "grounding"
  }
}

Each agent your company operates gets its own manifest under its own path prefix. The telemetry.endpoint is where you submit your sessions - usually OA's hosted endpoint, but you can self-host the reference server and point at your own URL instead. The .well-known manifest guide covers the full schema.
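As a rough illustration, the manifest above could be generated per agent rather than written by hand. The sketch below is not an official tool: `build_agent_manifest` and its parameters are hypothetical names, and it only mirrors the fields shown in the example. The manifest guide remains the authority on the schema.

```python
import json

WELL_KNOWN_SUFFIX = "/.well-known/openattribution.json"


def build_agent_manifest(agent_base_url, operator_name, public_key,
                         telemetry_endpoint, conformance_level="grounding",
                         key_id="key-1"):
    """Assemble a manifest dict with the same fields as the example above.

    Hypothetical helper: field names are copied from the example, nothing
    beyond that is validated.
    """
    manifest_url = agent_base_url.rstrip("/") + WELL_KNOWN_SUFFIX
    return {
        "schema_version": "0.1",
        "id": manifest_url,  # the manifest identifies itself by its own URL
        "roles": ["agent"],
        "operator": {"name": operator_name},
        "keys": [{"id": key_id, "type": "Ed25519", "publicKey": public_key}],
        "telemetry": {
            "endpoint": telemetry_endpoint,
            "conformance_level": conformance_level,
        },
    }


manifest = build_agent_manifest(
    "https://yourcompany.com/agents/your-agent",
    "Your Company",
    "z6Mk...",  # placeholder, as in the example above
    "https://telemetry.openattribution.org/events",
)
print(json.dumps(manifest, indent=2))
```

Serve the printed JSON at the URL in its `id` field, so the document and its location stay in agreement.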

3. Resolve content-owner manifests

Before fetching content from a domain, fetch its manifest at /.well-known/openattribution.json. A content-owner manifest looks like:

bash
curl -s https://example.com/.well-known/openattribution.json | jq .
response
{
  "schema_version": "0.1",
  "id": "https://example.com/.well-known/openattribution.json",
  "roles": ["content_owner"],
  "operator": { "name": "Example Media" },
  "telemetry": {
    "endpoint": "https://telemetry.openattribution.org/events"
  },
  "domains": ["example.com", "*.example.com"]
}

You don't need to send the content owner's events anywhere yourself - your agent submits its own events to its own telemetry.endpoint. Resolving the content owner's manifest is how you confirm the participant exists, learn the roles they declare, and pick up any signing keys you'll later use to verify cross-observer events. Section 8.7 of the spec covers consumer behaviour.
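A minimal sketch of the lookup side, assuming the manifest is always fetched over HTTPS from the host of the content URL. The helper names are hypothetical, and the wildcard handling (`*.example.com` matching any subdomain but not the bare domain) is an assumption about how `domains` entries are meant to be read, not something this page specifies.

```python
from urllib.parse import urlsplit


def manifest_url_for(content_url):
    """Derive the well-known manifest URL for the host serving content_url."""
    parts = urlsplit(content_url)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        raise ValueError(f"not an absolute HTTP(S) URL: {content_url!r}")
    # Assumption: manifests are fetched over HTTPS regardless of content scheme.
    return f"https://{parts.netloc}/.well-known/openattribution.json"


def declares_role(manifest, role):
    """True if the manifest lists the given role, e.g. 'content_owner'."""
    return role in manifest.get("roles", [])


def covers_host(manifest, host):
    """True if host matches an entry in the manifest's domains list.

    Assumption: a leading '*.' matches any subdomain, not the bare domain.
    """
    host = host.lower()
    for pattern in manifest.get("domains", []):
        pattern = pattern.lower()
        if pattern.startswith("*."):
            if host.endswith(pattern[1:]):  # compare against '.example.com'
                return True
        elif host == pattern:
            return True
    return False


owner = {"roles": ["content_owner"], "domains": ["example.com", "*.example.com"]}
print(manifest_url_for("https://example.com/article/best-headphones"))
print(declares_role(owner, "content_owner"), covers_host(owner, "news.example.com"))
```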

Cache the lookup
Manifests change rarely. Cache them per domain respecting the response's Cache-Control headers - typically at least an hour.
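One way to implement that cache is a per-domain store whose lifetime comes from the `max-age` directive. This is a simplified sketch: `ManifestCache` and `ttl_from_cache_control` are hypothetical names, the one-hour fallback is an assumption taken from the note above, and a production client would normally lean on a full HTTP caching library instead.

```python
import re
import time

DEFAULT_TTL_SECONDS = 3600  # assumption: fall back to one hour


def ttl_from_cache_control(header_value, default=DEFAULT_TTL_SECONDS):
    """Return a cache lifetime in seconds from a Cache-Control header value."""
    if not header_value:
        return default
    directives = [d.strip().lower() for d in header_value.split(",")]
    if "no-store" in directives or "no-cache" in directives:
        return 0  # simplification: do not cache at all
    for directive in directives:
        match = re.fullmatch(r"max-age=(\d+)", directive)
        if match:
            return int(match.group(1))
    return default


class ManifestCache:
    """In-memory, per-domain manifest cache with expiry."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}  # domain -> (expires_at, manifest)

    def put(self, domain, manifest, cache_control=None):
        ttl = ttl_from_cache_control(cache_control)
        if ttl > 0:
            self._entries[domain] = (self._clock() + ttl, manifest)

    def get(self, domain):
        entry = self._entries.get(domain)
        if entry is None:
            return None
        expires_at, manifest = entry
        if self._clock() >= expires_at:
            del self._entries[domain]  # expired: caller should refetch
            return None
        return manifest
```

A `None` from `get` means "fetch the manifest again and call `put` with the new response's header".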

Reporting events

A typical agent interaction produces this sequence:

event flow
User asks a question
  -> Agent fetches content          (content_retrieved)
  -> Agent loads it into context    (content_grounded)
  -> Agent generates response       (content_cited)
  -> User sees source references    (content_displayed)
  -> User clicks a source link      (content_engaged)

You can report events incrementally during a session or upload a complete session at the end. Both patterns are supported.

The API envelope vs the session document
The request bodies below use a loose API envelope - { "session_id": ..., "events": [...] } for incremental reporting (/sessions/start then /events), and a slightly fuller shape with started_at / ended_at / outcome for bulk upload (/sessions/bulk). Both are accepted on the wire. The canonical archival and interchange representation of a complete session is the JSON session document defined by telemetry-session.json (with document_type and schema_version) - that is what the server materialises, and what a conformance validator checks against. You report against the envelope; the document is the assembled result.

Option A: Incremental reporting

Start a session, report events as they happen, end the session.

1. Start session
curl -X POST https://telemetry.openattribution.org/sessions/start \
  -H "X-API-Key: YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "initiator_type": "user",
    "agent_id": "your-agent-name"
  }'
response
{ "session_id": "550e8400-e29b-41d4-a716-446655440000" }
2. Report events
curl -X POST https://telemetry.openattribution.org/events \
  -H "X-API-Key: YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "session_id": "550e8400-e29b-41d4-a716-446655440000",
    "events": [
      {
        "type": "content_retrieved",
        "timestamp": "2026-03-18T10:00:01Z",
        "source_role": "agent",
        "content_url": "https://example.com/article/best-headphones"
      },
      {
        "type": "content_grounded",
        "timestamp": "2026-03-18T10:00:02Z",
        "source_role": "agent",
        "content_url": "https://example.com/article/best-headphones",
        "data": {
          "scope": "turn",
          "tokens_ingested": 1840,
          "cached": false,
          "media_type": "text"
        }
      },
      {
        "type": "content_cited",
        "timestamp": "2026-03-18T10:00:05Z",
        "source_role": "agent",
        "content_url": "https://example.com/article/best-headphones",
        "data": {
          "citation_type": "paraphrase",
          "excerpt_tokens": 85,
          "excerpt_chars": 340,
          "position": "primary",
          "media_type": "text"
        }
      }
    ]
  }'
3. End session
curl -X POST https://telemetry.openattribution.org/sessions/end \
  -H "X-API-Key: YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "session_id": "550e8400-e29b-41d4-a716-446655440000",
    "outcome": { "type": "browse" }
  }'
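The same three calls can be sketched in Python with only the standard library. The helper names below are hypothetical. The sketch builds the requests without sending them; passing each one to `urllib.request.urlopen` would perform the call.

```python
import json
from urllib.request import Request

TELEMETRY_BASE = "https://telemetry.openattribution.org"


def build_request(path, api_key, body):
    """Build (but do not send) a JSON POST with the X-API-Key header."""
    return Request(
        TELEMETRY_BASE + path,
        data=json.dumps(body).encode("utf-8"),
        headers={"X-API-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


def start_session_request(api_key, agent_id):
    return build_request("/sessions/start", api_key,
                         {"initiator_type": "user", "agent_id": agent_id})


def events_request(api_key, session_id, events):
    return build_request("/events", api_key,
                         {"session_id": session_id, "events": events})


def end_session_request(api_key, session_id, outcome_type="browse"):
    return build_request("/sessions/end", api_key,
                         {"session_id": session_id,
                          "outcome": {"type": outcome_type}})


# session_id would come from the /sessions/start response in a real run.
session_id = "550e8400-e29b-41d4-a716-446655440000"
retrieved = {
    "type": "content_retrieved",
    "timestamp": "2026-03-18T10:00:01Z",
    "source_role": "agent",
    "content_url": "https://example.com/article/best-headphones",
}
for req in (start_session_request("YOUR_KEY", "your-agent-name"),
            events_request("YOUR_KEY", session_id, [retrieved]),
            end_session_request("YOUR_KEY", session_id)):
    print(req.get_method(), req.full_url)
```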

Option B: Bulk upload

Upload a complete session after it ends. Simpler if you're batch processing.

bash
curl -X POST https://telemetry.openattribution.org/sessions/bulk \
  -H "X-API-Key: YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "session_id": "550e8400-e29b-41d4-a716-446655440000",
    "agent_id": "your-agent-name",
    "started_at": "2026-03-18T10:00:00Z",
    "ended_at": "2026-03-18T10:05:00Z",
    "events": [
      {
        "type": "content_retrieved",
        "timestamp": "2026-03-18T10:00:01Z",
        "source_role": "agent",
        "content_url": "https://example.com/article/best-headphones"
      },
      {
        "type": "content_grounded",
        "timestamp": "2026-03-18T10:00:02Z",
        "source_role": "agent",
        "content_url": "https://example.com/article/best-headphones",
        "data": {
          "scope": "turn",
          "tokens_ingested": 1840,
          "cached": false
        }
      },
      {
        "type": "content_cited",
        "timestamp": "2026-03-18T10:00:05Z",
        "source_role": "agent",
        "content_url": "https://example.com/article/best-headphones",
        "data": {
          "citation_type": "paraphrase",
          "excerpt_tokens": 85,
          "position": "primary"
        }
      }
    ],
    "outcome": { "type": "browse" }
  }'
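If events are buffered in memory during the session, the bulk body can be assembled at the end. The sketch below uses a hypothetical `build_bulk_session` helper. Sorting events and checking that they fall inside the session window are local sanity checks chosen for this example, not documented server requirements.

```python
from datetime import datetime


def parse_ts(value):
    """Parse an ISO 8601 timestamp, accepting a trailing 'Z' for UTC."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def build_bulk_session(session_id, agent_id, started_at, ended_at, events,
                       outcome_type="browse"):
    """Assemble a /sessions/bulk body from buffered events."""
    start, end = parse_ts(started_at), parse_ts(ended_at)
    if end < start:
        raise ValueError("ended_at is earlier than started_at")
    ordered = sorted(events, key=lambda e: parse_ts(e["timestamp"]))
    for event in ordered:
        if not start <= parse_ts(event["timestamp"]) <= end:
            raise ValueError(f"event outside session window: {event['type']}")
    return {
        "session_id": session_id,
        "agent_id": agent_id,
        "started_at": started_at,
        "ended_at": ended_at,
        "events": ordered,
        "outcome": {"type": outcome_type},
    }


buffered = [
    {"type": "content_cited", "timestamp": "2026-03-18T10:00:05Z",
     "source_role": "agent",
     "content_url": "https://example.com/article/best-headphones"},
    {"type": "content_retrieved", "timestamp": "2026-03-18T10:00:01Z",
     "source_role": "agent",
     "content_url": "https://example.com/article/best-headphones"},
]
session = build_bulk_session(
    "550e8400-e29b-41d4-a716-446655440000", "your-agent-name",
    "2026-03-18T10:00:00Z", "2026-03-18T10:05:00Z", buffered,
)
print([e["type"] for e in session["events"]])
```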

Event types

These are the events an agent typically reports. All use source_role: "agent".

content_retrieved

Your agent fetched a URL. Report this for every URL you read, whether or not you end up using it. This is the base signal - it tells content owners their content was accessed.

| Field | Required | Description |
|---|---|---|
| content_url | Yes | The URL you fetched |
| data.media_type | No | text, image, video, or audio. Defaults to text |

content_grounded

Your agent loaded the content into the generation model's context. This is the boundary where content can directly influence the response - the core attribution signal. Content used only for retrieval selection (embedding similarity, re-ranking, query routing) without entering the generation context is not grounded.

Grounding is decoupled from retrieval. A live fetch produces both content_retrieved and content_grounded. A cached grounding produces content_grounded only - there is no HTTP request for the content owner to observe, so the agent's report is the only signal the content owner gets.

| Field | Required | Description |
|---|---|---|
| content_url | Yes | The URL you grounded on (or content_id if URL is not available) |
| data.scope | Yes | turn (influences one response) or session (influences every subsequent turn) |
| data.cached | Recommended | true if grounded from agent-side cache rather than a live fetch |
| data.tokens_ingested | Recommended | Token count of the chunk or document placed into context |
| data.content_hash | No | SHA-256 of the content as it entered context (sha256:{hex}). Chunk hash, not document hash, when chunked |
| data.media_type | No | text, image, video, or audio |

Grounded but not cited is normal
Content can shape every turn in a session without ever being cited. A session-scoped grounding event captures that influence. Citations are a strict subset of grounded content.
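The decoupling of grounding from retrieval can be expressed as a small event builder: a live fetch yields both events, a cached grounding yields `content_grounded` only. `grounding_events` is a hypothetical helper, and the field names follow the table above.

```python
def grounding_events(content_url, timestamp, scope, tokens_ingested, cached):
    """Return the events to report when content enters the generation context."""
    if scope not in ("turn", "session"):
        raise ValueError("scope must be 'turn' or 'session'")
    events = []
    if not cached:
        # A live fetch happened, so the retrieval is reported as well.
        events.append({
            "type": "content_retrieved",
            "timestamp": timestamp,
            "source_role": "agent",
            "content_url": content_url,
        })
    events.append({
        "type": "content_grounded",
        "timestamp": timestamp,
        "source_role": "agent",
        "content_url": content_url,
        "data": {
            "scope": scope,
            "tokens_ingested": tokens_ingested,
            "cached": cached,
        },
    })
    return events


url = "https://example.com/article/best-headphones"
live = grounding_events(url, "2026-03-18T10:00:02Z", "turn", 1840, cached=False)
from_cache = grounding_events(url, "2026-03-18T10:04:10Z", "session", 1840, cached=True)
print([e["type"] for e in live])
print([e["type"] for e in from_cache])
```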

content_cited

Your agent referenced the content in its response - direct quote, paraphrase, or source link. Citation is the high-value signal that connects content to user-visible output.

| Field | Required | Description |
|---|---|---|
| content_url | Yes | The URL you cited |
| data.citation_type | No | direct_quote, paraphrase, reference, or contradiction |
| data.excerpt_tokens | No | Token count of the excerpt used |
| data.excerpt_chars | No | Character count of the excerpt used |
| data.position | No | primary, supporting, or mentioned |
| data.media_type | No | text, image, video, or audio |
| data.content_hash | No | SHA-256 of the content you processed (sha256:{hex}). Useful for dispute resolution |
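For the `data.content_hash` field, a value in the `sha256:{hex}` form can be produced with the standard library. This sketch assumes text content is hashed as its UTF-8 bytes. This page does not state an encoding or normalization rule, so check the spec before relying on hashes matching across observers.

```python
import hashlib


def content_hash(text):
    """Return 'sha256:{hex}' for the text exactly as it was processed.

    Assumption: the text is hashed as UTF-8 bytes with no normalization.
    """
    digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
    return "sha256:" + digest


chunk = "The WH-1000XM5 remains our top pick for noise cancelling."
print(content_hash(chunk))
```

When content is chunked, pass the chunk that entered context, not the whole document, as the grounding table notes.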

content_displayed

Content was shown to the user - in a card, sidebar, source list, or inline citation. Not all agents have a display surface, so this is optional.

content_engaged

The user interacted with cited content. The data.engagement_type field captures what happened:

| engagement_type | When to send |
|---|---|
| link_click | User clicked a link to the source content |
| expand | User expanded a citation preview or source card |
| copy | User copied the cited text |
| share | User shared the response or a specific citation |
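A small builder can guard against typos in `engagement_type` before an event is sent. `engagement_event` and `utc_now_iso` are hypothetical helpers. The allowed values are the four listed above, and rejecting anything else is a choice made for this sketch, since the server may accept further values.

```python
from datetime import datetime, timezone

ENGAGEMENT_TYPES = {"link_click", "expand", "copy", "share"}


def utc_now_iso():
    """Current UTC time in the 'YYYY-MM-DDTHH:MM:SSZ' form used in this guide."""
    return datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def engagement_event(content_url, engagement_type, timestamp=None):
    """Build a content_engaged event for one user interaction."""
    if engagement_type not in ENGAGEMENT_TYPES:
        raise ValueError(f"unknown engagement_type: {engagement_type!r}")
    return {
        "type": "content_engaged",
        "timestamp": timestamp or utc_now_iso(),
        "source_role": "agent",
        "content_url": content_url,
        "data": {"engagement_type": engagement_type},
    }


click = engagement_event(
    "https://retailer.com/headphones/sony-wh1000xm5",
    "link_click",
    timestamp="2026-03-18T10:01:00Z",
)
print(click["data"])
```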

Click tokens and landing page attribution

When your agent displays a link to source content and the user clicks it, you can pass session context through to the landing page. This lets the destination site (or their affiliate network) see which content influenced the visit - closing the loop between "agent cited this article" and "user landed here."

How it works

  1. Before displaying an outbound link, create a click token for the session
  2. Append the token to the URL as a ctx query parameter
  3. When the user clicks, report a content_engaged event with link_click
  4. The landing page owner captures ctx and looks up the session via the public API
1. Create a click token
curl -X POST https://api.openattribution.org/click-tokens \
  -H "X-API-Key: YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "session_id": "550e8400-e29b-41d4-a716-446655440000",
    "content_url": "https://retailer.com/headphones/sony-wh1000xm5"
  }'
response
{
  "token": "ctx_abc123def456",
  "session_id": "550e8400-...",
  "content_url": "https://retailer.com/headphones/sony-wh1000xm5",
  "expires_at": "2026-06-18T10:00:00Z"
}

Append the token to the outbound URL:

text
https://retailer.com/headphones/sony-wh1000xm5?ctx=ctx_abc123def456
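String concatenation breaks when the outbound URL already carries a query string or a fragment, so it is safer to rebuild the URL. A minimal sketch with the standard library follows. `with_click_token` is a hypothetical helper, and replacing any existing `ctx` parameter is a design choice made here, not documented behaviour.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_click_token(url, token):
    """Return url with ctx=<token> set, keeping other parameters and the fragment."""
    parts = urlsplit(url)
    query = [(key, value)
             for key, value in parse_qsl(parts.query, keep_blank_values=True)
             if key != "ctx"]  # drop any stale token before adding the new one
    query.append(("ctx", token))
    return urlunsplit(parts._replace(query=urlencode(query)))


print(with_click_token(
    "https://retailer.com/headphones/sony-wh1000xm5", "ctx_abc123def456"))
print(with_click_token(
    "https://retailer.com/headphones/sony-wh1000xm5?ref=1#reviews",
    "ctx_abc123def456"))
```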

The landing page owner (or their network) captures the ctx parameter and looks up the session context with their own key (the lookup needs telemetry:read scope):

lookup
curl https://api.openattribution.org/ctx/ctx_abc123def456 \
  -H "X-API-Key: oat_pk_..."

The lookup returns 404 unless both sides have opted in - the agent that issued the token has share_sessions_via_click_tokens enabled and the cited content owner has visible_in_click_token_lookups enabled. See the affiliate and ad networks guide for the receiving side.

response
{
  "session_id": "550e8400-...",
  "started_at": "2026-03-18T10:00:00Z",
  "click_content_url": "https://retailer.com/headphones/sony-wh1000xm5",
  "content_urls_cited": [
    "https://wirecutter.com/reviews/best-headphones",
    "https://techradar.com/best/wireless-headphones"
  ],
  "content_urls_retrieved": [
    "https://wirecutter.com/reviews/best-headphones",
    "https://techradar.com/best/wireless-headphones",
    "https://rtings.com/headphones/reviews/sony/wh-1000xm5"
  ]
}
Privacy controls
Click-token lookup is two-sided opt-in. The endpoint returns 404 unless you (the agent) have share_sessions_via_click_tokens enabled and the content owner whose URLs appear in the session has visible_in_click_token_lookups enabled. Both default to false. They're single toggles in your dashboard settings - no per-content-owner approval needed. The click URL is recorded either way; consent gates the cited-context view, not the click record. Tokens expire after 90 days.

Report the click

When the user clicks the link, report a content_engaged event so the content owner sees the clickthrough:

json
{
  "type": "content_engaged",
  "timestamp": "2026-03-18T10:01:00Z",
  "source_role": "agent",
  "content_url": "https://retailer.com/headphones/sony-wh1000xm5",
  "data": {
    "engagement_type": "link_click"
  }
}
Networks can also report engagement
If the destination site or its affiliate network supports OA, they can also report a content_engaged event using the same ctx token - session_id is not required when ctx_token is present. Two independent observations of the same click corroborate each other. See the marketplaces and networks guide.

The Content-Telemetry-ID header

When your agent fetches content over HTTP, include a Content-Telemetry-ID header with a UUID:

text
GET /article/best-headphones HTTP/1.1
Host: www.example.com
Content-Telemetry-ID: 550e8400-e29b-41d4-a716-446655440000

Then include the same UUID as the content_telemetry_id field on your content_retrieved event.

This enables cross-observer correlation. If the content owner's CDN or web server also reports the retrieval, both events share the same ID. A retrieval confirmed by both agent and content owner is a stronger signal than either alone.

your event
{
  "type": "content_retrieved",
  "timestamp": "2026-03-18T10:00:01Z",
  "source_role": "agent",
  "content_telemetry_id": "550e8400-e29b-41d4-a716-446655440000",
  "content_url": "https://www.example.com/article/best-headphones"
}
Not required, but valuable
The header is optional. Without it, your retrieval events still work - they just can't be corroborated against the content owner's own logs.
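The key detail is that the header and the event carry the same UUID. A sketch that generates the ID once and uses it in both places follows. `traced_fetch` is a hypothetical helper, and the request is built but not sent. Note that `urllib` normalizes header-name capitalization, which is harmless because HTTP header names are case-insensitive.

```python
import uuid
from urllib.request import Request


def traced_fetch(content_url, timestamp):
    """Build a GET request and its matching content_retrieved event.

    One UUID is generated and placed in both the Content-Telemetry-ID
    header and the event's content_telemetry_id field.
    """
    telemetry_id = str(uuid.uuid4())
    request = Request(content_url,
                      headers={"Content-Telemetry-ID": telemetry_id})
    event = {
        "type": "content_retrieved",
        "timestamp": timestamp,
        "source_role": "agent",
        "content_telemetry_id": telemetry_id,
        "content_url": content_url,
    }
    return request, event


request, event = traced_fetch(
    "https://www.example.com/article/best-headphones",
    "2026-03-18T10:00:01Z",
)
print(event["content_telemetry_id"])
```

Generate a fresh ID per fetch. Reusing one across requests would make distinct retrievals indistinguishable to the content owner.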

Session outcomes

When you end a session, include an outcome. This closes the attribution loop.

| Outcome | When |
|---|---|
| browse | User read the response, no further action |
| conversion | User completed a purchase or signup |
| abandonment | User left mid-conversation |

For conversions, include the value:

json
{
  "outcome": {
    "type": "conversion",
    "value_amount": 4999,
    "currency": "USD"
  }
}

value_amount is in minor currency units - cents for USD, pence for GBP. So $49.99 = 4999.
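Converting with floating-point arithmetic can be off by one unit, so a decimal conversion is safer. The sketch below uses hypothetical helper names. The `exponent` parameter is the number of minor-unit digits for the currency (2 for USD and GBP, 0 for JPY), and half-up rounding is an assumption made for this example.

```python
from decimal import ROUND_HALF_UP, Decimal


def to_minor_units(amount, exponent=2):
    """Convert a major-unit amount (e.g. '49.99') to integer minor units."""
    scaled = Decimal(amount) * (Decimal(10) ** exponent)
    return int(scaled.quantize(Decimal("1"), rounding=ROUND_HALF_UP))


def conversion_outcome(amount, currency, exponent=2):
    """Build the outcome object for a conversion."""
    return {
        "outcome": {
            "type": "conversion",
            "value_amount": to_minor_units(amount, exponent),
            "currency": currency,
        }
    }


print(conversion_outcome("49.99", "USD"))
print(conversion_outcome("500", "JPY", exponent=0))
```

Passing the amount as a string keeps the decimal value exact.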


What to report when

Start with retrieval, grounding, and citation. These three events cover the core attribution signal. Add display and engagement as your integration matures.

The spec defines three conformance levels - retrieval, grounding, and attribution - which map onto the priority tiers below. The authoritative declaration is telemetry.conformance_level in your manifest; you may also stamp conformance_level on session documents, where it's informational.

| Priority | Events | What it enables |
|---|---|---|
| Retrieval | content_retrieved | Content owners see which content was fetched |
| Grounding | Above + content_grounded | Distinguish fetched-but-unused from content that actually entered the model's context. Captures cached groundings invisible to the content owner's logs |
| Attribution | Above + content_cited + content_displayed + content_engaged + outcomes | Full attribution loop: which content shaped which response, what the user saw, what they did, what it was worth |
| Recommended | Content-Telemetry-ID header | Cross-observer corroboration with content owner-side logs |

Next steps