Agent integration
When your agent fetches content from the web and uses it in a response, OpenAttribution tracks that usage so content owners get visibility. This page explains what to report, when, and how.
The short version
Your agent does three things that matter for attribution:
- Retrieves content - fetches a URL to read it
- Grounds on content - loads it into the generation context so it can shape the response
- Cites content - explicitly references it in the response to the user
Report all three. Retrieval tells the content owner their content was accessed. Grounding tells them it actually entered the model's context and influenced the response - the core attribution signal. Citation is the subset of grounded content the agent surfaced to the user.
Content can influence every turn in a session without being cited once. That is why grounding, not citation, is the load-bearing event for attribution.
Setup
1. Request agent access
Agent access is gated while the network is bedding in. Sign in, then submit a request via POST /api/v1/identity/agent-access-requests with your org name, admin email, and a short use case. An operator admin reviews and approves;
on approval the agent org is created and you can mint an oat_pk_ API key from the dashboard. Content owners are
self-serve, but agent and platform onboarding both go through the same approval path.
Submit an agent access request if you don't have a session yet.
oat_pk_ ("pk" = platform key) is the prefix on platform and agent
API keys you put on the X-API-Key header. It is unrelated to the
Ed25519 publicKey values in a .well-known manifest's keys array, which are for signing telemetry events (informational
in v0.1). Key prefixes and scopes are documented in full at authentication and on
the API reference.
2. Publish your own manifest
Publish an OpenAttribution manifest declaring your agent's identity, signing keys, and the endpoint where you submit telemetry. The manifest goes under a path prefix on your own domain, served at the well-known path:
{
  "schema_version": "0.1",
  "id": "https://yourcompany.com/agents/your-agent/.well-known/openattribution.json",
  "roles": ["agent"],
  "operator": { "name": "Your Company" },
  "keys": [
    { "id": "key-1", "type": "Ed25519", "publicKey": "z6Mk..." }
  ],
  "telemetry": {
    "endpoint": "https://telemetry.openattribution.org/events",
    "conformance_level": "grounding"
  }
}
Each agent your company operates gets its own manifest under its own path prefix. The telemetry.endpoint is where you submit your sessions -
usually OA's hosted endpoint, but you can self-host the reference server and point at your own
URL instead. The .well-known manifest guide covers the full schema.
3. Resolve content-owner manifests
Before fetching content from a domain, fetch its manifest at /.well-known/openattribution.json.
A content-owner manifest looks like:
curl -s https://example.com/.well-known/openattribution.json | jq .

{
  "schema_version": "0.1",
  "id": "https://example.com/.well-known/openattribution.json",
  "roles": ["content_owner"],
  "operator": { "name": "Example Media" },
  "telemetry": {
    "endpoint": "https://telemetry.openattribution.org/events"
  },
  "domains": ["example.com", "*.example.com"]
}
You don't need to send the content owner's events anywhere yourself - your agent submits its
own events to its own telemetry.endpoint.
Resolving the content owner's manifest is how you confirm the participant exists, learn the
roles they declare, and pick up any signing keys you'll later use to verify cross-observer
events. Section 8.7 of the spec covers consumer behaviour.
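The resolution step can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not a reference client: the helper names (`manifest_url_for`, `cache_ttl_seconds`, `declares_role`) are hypothetical, it assumes the manifest lives on the same host that serves the content URL, and the one-hour fallback is an assumption taken from the typical cache lifetime mentioned on this page.

```python
import re
from typing import Optional
from urllib.parse import urlsplit

WELL_KNOWN_PATH = "/.well-known/openattribution.json"
DEFAULT_TTL_SECONDS = 3600  # assumption: fall back to one hour


def manifest_url_for(content_url: str) -> str:
    """Return the well-known manifest URL for the host serving content_url."""
    parts = urlsplit(content_url)
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError(f"not an absolute HTTP(S) URL: {content_url!r}")
    return f"{parts.scheme}://{parts.netloc}{WELL_KNOWN_PATH}"


def cache_ttl_seconds(cache_control: Optional[str]) -> int:
    """Read max-age from a Cache-Control header value, else use the default."""
    if cache_control:
        match = re.search(r"max-age\s*=\s*(\d+)", cache_control, re.IGNORECASE)
        if match:
            return int(match.group(1))
    return DEFAULT_TTL_SECONDS


def declares_role(manifest: dict, role: str) -> bool:
    """Check whether a parsed manifest lists the given role."""
    return role in manifest.get("roles", [])
```

A caller would fetch `manifest_url_for(url)` once per host, keep the parsed JSON for `cache_ttl_seconds(...)` seconds, and check `declares_role(manifest, "content_owner")` before relying on it.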
Cache manifests according to their Cache-Control headers - typically at least an hour.
Reporting events
A typical agent interaction produces this sequence:
User asks a question
  -> Agent fetches content (content_retrieved)
  -> Agent loads it into context (content_grounded)
  -> Agent generates response (content_cited)
  -> User sees source references (content_displayed)
  -> User clicks a source link (content_engaged)
You can report events incrementally during a session or upload a complete session at the end. Both patterns are supported.
The request envelope comes in two shapes: { "session_id": ..., "events": [...] } for incremental reporting (/sessions/start then /events), and a
slightly fuller shape with started_at / ended_at / outcome for bulk upload (/sessions/bulk). Both are accepted on the wire. The
canonical archival and interchange representation of a complete session is the JSON session document defined by telemetry-session.json (with document_type and schema_version) - that
is what the server materialises, and what a conformance validator checks against. You report against
the envelope; the document is the assembled result.
Option A: Incremental reporting
Start a session, report events as they happen, end the session.
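The start, report, end flow can also be driven from code. The sketch below is one possible Python wrapper using only the standard library; the function names are hypothetical, and the paths, headers, and payload fields are copied from the curl examples in this section. Building the request is kept separate from sending it so the payloads can be inspected without network access.

```python
import json
import urllib.request

TELEMETRY_BASE = "https://telemetry.openattribution.org"


def build_request(path: str, payload: dict, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a JSON POST to the telemetry endpoint."""
    return urllib.request.Request(
        url=f"{TELEMETRY_BASE}{path}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"X-API-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


def post_json(request: urllib.request.Request) -> dict:
    """Send a prepared request and decode the JSON response body, if any."""
    with urllib.request.urlopen(request, timeout=10) as response:
        body = response.read()
    return json.loads(body) if body else {}


def start_session_request(agent_id: str, api_key: str) -> urllib.request.Request:
    payload = {"initiator_type": "user", "agent_id": agent_id}
    return build_request("/sessions/start", payload, api_key)


def events_request(session_id: str, events: list, api_key: str) -> urllib.request.Request:
    payload = {"session_id": session_id, "events": events}
    return build_request("/events", payload, api_key)


def end_session_request(session_id: str, outcome_type: str, api_key: str) -> urllib.request.Request:
    payload = {"session_id": session_id, "outcome": {"type": outcome_type}}
    return build_request("/sessions/end", payload, api_key)
```

In use, `post_json(start_session_request("your-agent-name", key))["session_id"]` yields the ID to pass to `events_request` and `end_session_request`. Error handling and retries are left out of this sketch.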
curl -X POST https://telemetry.openattribution.org/sessions/start \
  -H "X-API-Key: YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "initiator_type": "user",
    "agent_id": "your-agent-name"
  }'

{ "session_id": "550e8400-e29b-41d4-a716-446655440000" }

curl -X POST https://telemetry.openattribution.org/events \
  -H "X-API-Key: YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "session_id": "550e8400-e29b-41d4-a716-446655440000",
    "events": [
      {
        "type": "content_retrieved",
        "timestamp": "2026-03-18T10:00:01Z",
        "source_role": "agent",
        "content_url": "https://example.com/article/best-headphones"
      },
      {
        "type": "content_grounded",
        "timestamp": "2026-03-18T10:00:02Z",
        "source_role": "agent",
        "content_url": "https://example.com/article/best-headphones",
        "data": {
          "scope": "turn",
          "tokens_ingested": 1840,
          "cached": false,
          "media_type": "text"
        }
      },
      {
        "type": "content_cited",
        "timestamp": "2026-03-18T10:00:05Z",
        "source_role": "agent",
        "content_url": "https://example.com/article/best-headphones",
        "data": {
          "citation_type": "paraphrase",
          "excerpt_tokens": 85,
          "excerpt_chars": 340,
          "position": "primary",
          "media_type": "text"
        }
      }
    ]
  }'

curl -X POST https://telemetry.openattribution.org/sessions/end \
  -H "X-API-Key: YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "session_id": "550e8400-e29b-41d4-a716-446655440000",
    "outcome": { "type": "browse" }
  }'
Option B: Bulk upload
Upload a complete session after it ends. Simpler if you're batch processing.
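One way to arrive at a bulk upload is to buffer events in memory while the session runs and serialise them once it ends. The Python sketch below illustrates that pattern under stated assumptions: `SessionBuffer` and `utc_timestamp` are hypothetical names, timestamps are passed in explicitly, and the resulting dictionary mirrors the field names of the /sessions/bulk request body shown in this section.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


def utc_timestamp(moment: datetime) -> str:
    """Format a timezone-aware datetime as, e.g., 2026-03-18T10:00:00Z."""
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


@dataclass
class SessionBuffer:
    """Collects events in memory until the session is ready for bulk upload."""

    session_id: str
    agent_id: str
    started_at: datetime
    events: list = field(default_factory=list)

    def record(self, event_type: str, content_url: str, at: datetime,
               data: Optional[dict] = None) -> None:
        """Append one agent-side event; the data object is omitted when empty."""
        event = {
            "type": event_type,
            "timestamp": utc_timestamp(at),
            "source_role": "agent",
            "content_url": content_url,
        }
        if data:
            event["data"] = data
        self.events.append(event)

    def to_bulk_payload(self, ended_at: datetime, outcome_type: str) -> dict:
        """Assemble the request body for a single bulk upload."""
        return {
            "session_id": self.session_id,
            "agent_id": self.agent_id,
            "started_at": utc_timestamp(self.started_at),
            "ended_at": utc_timestamp(ended_at),
            "events": list(self.events),
            "outcome": {"type": outcome_type},
        }
```

The payload from `to_bulk_payload` can then be JSON-encoded and posted in one request, which keeps telemetry off the hot path of the conversation.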
curl -X POST https://telemetry.openattribution.org/sessions/bulk \
  -H "X-API-Key: YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "session_id": "550e8400-e29b-41d4-a716-446655440000",
    "agent_id": "your-agent-name",
    "started_at": "2026-03-18T10:00:00Z",
    "ended_at": "2026-03-18T10:05:00Z",
    "events": [
      {
        "type": "content_retrieved",
        "timestamp": "2026-03-18T10:00:01Z",
        "source_role": "agent",
        "content_url": "https://example.com/article/best-headphones"
      },
      {
        "type": "content_grounded",
        "timestamp": "2026-03-18T10:00:02Z",
        "source_role": "agent",
        "content_url": "https://example.com/article/best-headphones",
        "data": {
          "scope": "turn",
          "tokens_ingested": 1840,
          "cached": false
        }
      },
      {
        "type": "content_cited",
        "timestamp": "2026-03-18T10:00:05Z",
        "source_role": "agent",
        "content_url": "https://example.com/article/best-headphones",
        "data": {
          "citation_type": "paraphrase",
          "excerpt_tokens": 85,
          "position": "primary"
        }
      }
    ],
    "outcome": { "type": "browse" }
  }'
Event types
These are the events an agent typically reports. All use source_role: "agent".
content_retrieved
Your agent fetched a URL. Report this for every URL you read, whether or not you end up using it. This is the base signal - it tells content owners their content was accessed.
| Field | Required | Description |
|---|---|---|
| content_url | Yes | The URL you fetched |
| data.media_type | No | text, image, video, or audio. Defaults to text |
content_grounded
Your agent loaded the content into the generation model's context. This is the boundary where content can directly influence the response - the core attribution signal. Content used only for retrieval selection (embedding similarity, re-ranking, query routing) without entering the generation context is not grounded.
Grounding is decoupled from retrieval. A live fetch produces both content_retrieved and content_grounded. A cached grounding produces content_grounded only - there is no HTTP request for the content owner to observe, so the agent's report is the
only signal the content owner gets.
| Field | Required | Description |
|---|---|---|
| content_url | Yes | The URL you grounded on (or content_id if URL is not available) |
| data.scope | Yes | turn (influences one response) or session (influences every subsequent turn) |
| data.cached | Recommended | true if grounded from agent-side cache rather than a live fetch |
| data.tokens_ingested | Recommended | Token count of the chunk or document placed into context |
| data.content_hash | No | SHA-256 of the content as it entered context (sha256:{hex}). Chunk hash, not document hash, when chunked |
| data.media_type | No | text, image, video, or audio |
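The split between live and cached grounding, and the sha256:{hex} hash format, can be sketched as follows. This is an illustrative Python helper, not part of any official SDK: the function names are hypothetical, hashing the UTF-8 bytes of the chunk is an assumption about how "content as it entered context" is serialised, and for brevity both events share one timestamp.

```python
import hashlib
from typing import Optional


def content_hash(chunk: bytes) -> str:
    """Hash the bytes that entered context, formatted as sha256:{hex}."""
    return "sha256:" + hashlib.sha256(chunk).hexdigest()


def grounding_events(
    content_url: str,
    chunk_text: str,
    timestamp: str,
    *,
    scope: str = "turn",
    cached: bool = False,
    tokens_ingested: Optional[int] = None,
) -> list:
    """Return the events to report when a chunk is placed into context.

    A live fetch yields content_retrieved followed by content_grounded;
    a cached grounding yields content_grounded only.
    """
    if scope not in ("turn", "session"):
        raise ValueError(f"scope must be 'turn' or 'session', got {scope!r}")

    base = {"timestamp": timestamp, "source_role": "agent", "content_url": content_url}
    events = []
    if not cached:
        events.append({"type": "content_retrieved", **base})

    data = {
        "scope": scope,
        "cached": cached,
        # Assumption: the chunk is hashed as UTF-8 bytes.
        "content_hash": content_hash(chunk_text.encode("utf-8")),
    }
    if tokens_ingested is not None:
        data["tokens_ingested"] = tokens_ingested
    events.append({"type": "content_grounded", **base, "data": data})
    return events
```

Passing `cached=True` is what makes a cache hit visible to the content owner, since no HTTP request reaches their logs in that case.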
content_cited
Your agent referenced the content in its response - direct quote, paraphrase, or source link. Citation is the high-value signal that connects content to user-visible output.
| Field | Required | Description |
|---|---|---|
| content_url | Yes | The URL you cited |
| data.citation_type | No | direct_quote, paraphrase, reference, or contradiction |
| data.excerpt_tokens | No | Token count of the excerpt used |
| data.excerpt_chars | No | Character count of the excerpt used |
| data.position | No | primary, supporting, or mentioned |
| data.media_type | No | text, image, video, or audio |
| data.content_hash | No | SHA-256 of the content you processed (sha256:{hex}). Useful for dispute resolution |
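A small builder can catch misspelled enum values before an event leaves the agent. The Python sketch below is illustrative only: `cited_event` is a hypothetical helper, the allowed values are the ones listed in the table above, and the token count is taken as an argument because it depends on whichever tokenizer your agent uses.

```python
from typing import Optional

CITATION_TYPES = {"direct_quote", "paraphrase", "reference", "contradiction"}
POSITIONS = {"primary", "supporting", "mentioned"}


def cited_event(
    content_url: str,
    timestamp: str,
    *,
    citation_type: Optional[str] = None,
    position: Optional[str] = None,
    excerpt: Optional[str] = None,
    excerpt_tokens: Optional[int] = None,
) -> dict:
    """Build a content_cited event, rejecting values outside the listed sets."""
    if citation_type is not None and citation_type not in CITATION_TYPES:
        raise ValueError(f"unknown citation_type: {citation_type!r}")
    if position is not None and position not in POSITIONS:
        raise ValueError(f"unknown position: {position!r}")

    data = {}
    if citation_type is not None:
        data["citation_type"] = citation_type
    if position is not None:
        data["position"] = position
    if excerpt is not None:
        data["excerpt_chars"] = len(excerpt)  # character count of the excerpt
    if excerpt_tokens is not None:
        data["excerpt_tokens"] = excerpt_tokens  # supplied by your tokenizer

    event = {
        "type": "content_cited",
        "timestamp": timestamp,
        "source_role": "agent",
        "content_url": content_url,
    }
    if data:  # every data field is optional, so omit the object when empty
        event["data"] = data
    return event
```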
content_displayed
Content was shown to the user - in a card, sidebar, source list, or inline citation. Not all agents have a display surface, so this is optional.
content_engaged
The user interacted with cited content. The data.engagement_type field captures what happened:
| engagement_type | When to send |
|---|---|
| link_click | User clicked a link to the source content |
| expand | User expanded a citation preview or source card |
| copy | User copied the cited text |
| share | User shared the response or a specific citation |
Click tokens and landing page attribution
When your agent displays a link to source content and the user clicks it, you can pass session context through to the landing page. This lets the destination site (or their affiliate network) see which content influenced the visit - closing the loop between "agent cited this article" and "user landed here."
How it works
- Before displaying an outbound link, create a click token for the session
- Append the token to the URL as a ctx query parameter
- When the user clicks, report a content_engaged event with link_click
- The landing page owner captures ctx and looks up the session via the public API
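Appending the token is simple string work, but outbound URLs often carry their own query parameters or fragments. A minimal Python sketch of the second step, assuming a hypothetical helper name `with_click_token`, could look like this:

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_click_token(url: str, token: str) -> str:
    """Return url with ctx=<token> set, keeping other parameters and the fragment."""
    parts = urlsplit(url)
    # Drop any stale ctx value, keep everything else in its original order.
    # Note: urlencode re-encodes the existing query string.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "ctx"
    ]
    query.append(("ctx", token))
    return urlunsplit(parts._replace(query=urlencode(query)))
```

Replacing an existing ctx value rather than adding a second one is a design choice in this sketch, so that a re-shared link does not carry two tokens.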
curl -X POST https://api.openattribution.org/click-tokens \
  -H "X-API-Key: YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "session_id": "550e8400-e29b-41d4-a716-446655440000",
    "content_url": "https://retailer.com/headphones/sony-wh1000xm5"
  }'

{
  "token": "ctx_abc123def456",
  "session_id": "550e8400-...",
  "content_url": "https://retailer.com/headphones/sony-wh1000xm5",
  "expires_at": "2026-06-18T10:00:00Z"
}
Append the token to the outbound URL:
https://retailer.com/headphones/sony-wh1000xm5?ctx=ctx_abc123def456
The landing page owner (or their network) captures the ctx parameter and looks up the session context with their own key (the lookup needs telemetry:read scope):
curl https://api.openattribution.org/ctx/ctx_abc123def456 \
  -H "X-API-Key: oat_pk_..."
The lookup returns 404 unless both sides have opted in - the agent that issued the token has share_sessions_via_click_tokens enabled and the cited content owner has visible_in_click_token_lookups enabled. See the affiliate and ad networks guide for the receiving side.
{
  "session_id": "550e8400-...",
  "started_at": "2026-03-18T10:00:00Z",
  "click_content_url": "https://retailer.com/headphones/sony-wh1000xm5",
  "content_urls_cited": [
    "https://wirecutter.com/reviews/best-headphones",
    "https://techradar.com/best/wireless-headphones"
  ],
  "content_urls_retrieved": [
    "https://wirecutter.com/reviews/best-headphones",
    "https://techradar.com/best/wireless-headphones",
    "https://rtings.com/headphones/reviews/sony/wh-1000xm5"
  ]
}
The lookup returns 404 unless you (the agent) have share_sessions_via_click_tokens enabled and the content owner whose URLs appear in the session has visible_in_click_token_lookups enabled. Both
default to false. They're single toggles in your
dashboard settings - no per-content-owner approval needed. The click URL is recorded
either way; consent gates the cited-context view, not the click record. Tokens expire
after 90 days.
Report the click
When the user clicks the link, report a content_engaged event so the content owner sees the clickthrough:
{
  "type": "content_engaged",
  "timestamp": "2026-03-18T10:01:00Z",
  "source_role": "agent",
  "content_url": "https://retailer.com/headphones/sony-wh1000xm5",
  "data": {
    "engagement_type": "link_click"
  }
}
The landing page owner (or their network) can also report a content_engaged event using the same ctx token - session_id is not required when ctx_token is present. Two
independent observations of the same click corroborate each other. See the marketplaces and networks guide.
The Content-Telemetry-ID header
When your agent fetches content over HTTP, include a Content-Telemetry-ID header with a UUID:
GET /article/best-headphones HTTP/1.1
Host: www.example.com
Content-Telemetry-ID: 550e8400-e29b-41d4-a716-446655440000
Then include the same UUID as the content_telemetry_id field on your content_retrieved event.
This enables cross-observer correlation. If the content owner's CDN or web server also reports the retrieval, both events share the same ID. A retrieval confirmed by both agent and content owner is a stronger signal than either alone.
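The key point is that the header and the event must carry the same value, so it helps to mint both from one place. A minimal Python sketch follows; `tagged_retrieval` is a hypothetical helper, and the choice of a random version 4 UUID is an assumption, since this page only asks for a UUID.

```python
import uuid


def tagged_retrieval(content_url: str, timestamp: str) -> tuple:
    """Return (request_headers, event) that share one freshly generated UUID."""
    telemetry_id = str(uuid.uuid4())  # assumption: a random (v4) UUID
    headers = {"Content-Telemetry-ID": telemetry_id}
    event = {
        "type": "content_retrieved",
        "timestamp": timestamp,
        "source_role": "agent",
        "content_telemetry_id": telemetry_id,
        "content_url": content_url,
    }
    return headers, event
```

Send `headers` with the HTTP fetch and queue `event` for reporting; generating a new ID per fetch keeps repeated retrievals of the same URL distinguishable.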
{
  "type": "content_retrieved",
  "timestamp": "2026-03-18T10:00:01Z",
  "source_role": "agent",
  "content_telemetry_id": "550e8400-e29b-41d4-a716-446655440000",
  "content_url": "https://www.example.com/article/best-headphones"
}
Session outcomes
When you end a session, include an outcome. This closes the attribution loop.
| Outcome | When |
|---|---|
| browse | User read the response, no further action |
| conversion | User completed a purchase or signup |
| abandonment | User left mid-conversation |
For conversions, include the value:
{
  "outcome": {
    "type": "conversion",
    "value_amount": 4999,
    "currency": "USD"
  }
}
value_amount is in minor currency units - cents for USD, pence for GBP. So $49.99 = 4999.
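Converting a decimal price to minor units is a common place for floating-point slips, so it is worth doing with exact arithmetic. The Python sketch below is illustrative: the helper names are hypothetical, the digits table lists only a few currencies, and treating zero-decimal currencies such as JPY as already being in minor units is an assumption based on ISO 4217 rather than something this page states.

```python
from decimal import Decimal

# Decimal places per currency (ISO 4217 minor units). Only a few entries are
# listed for illustration; extend the table for the currencies you handle.
MINOR_UNIT_DIGITS = {"USD": 2, "GBP": 2, "EUR": 2, "JPY": 0}


def to_minor_units(amount: Decimal, currency: str) -> int:
    """Convert a decimal amount to an integer count of minor units."""
    digits = MINOR_UNIT_DIGITS[currency]
    scaled = amount * (10 ** digits)
    if scaled != scaled.to_integral_value():
        raise ValueError(f"{amount} has more precision than {currency} allows")
    return int(scaled)


def conversion_outcome(amount: Decimal, currency: str) -> dict:
    """Build the outcome object for a conversion."""
    return {
        "type": "conversion",
        "value_amount": to_minor_units(amount, currency),
        "currency": currency,
    }
```

For example, `conversion_outcome(Decimal("49.99"), "USD")` produces the outcome object shown above with value_amount 4999. Passing the amount as a string-initialised Decimal avoids the rounding error that `int(49.99 * 100)` can introduce.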
What to report when
Start with retrieval, grounding, and citation. These three events cover the core attribution signal. Add display and engagement as your integration matures.
The spec defines three conformance levels - retrieval, grounding, and attribution - which map onto the priority tiers below. The authoritative declaration is telemetry.conformance_level in your manifest; you may also stamp conformance_level on session documents, where it's informational.
| Priority | Events | What it enables |
|---|---|---|
| Retrieval | content_retrieved | Content owners see which content was fetched |
| Grounding | Above + content_grounded | Distinguish fetched-but-unused from content that actually entered the model's context. Captures cached groundings invisible to the content owner's logs |
| Attribution | Above + content_cited + content_displayed + content_engaged + outcomes | Full attribution loop: which content shaped which response, what the user saw, what they did, what it was worth |
| Recommended | Content-Telemetry-ID header | Cross-observer corroboration with content owner-side logs |
Next steps
- Submit an agent access request and get an API key once approved
- API reference for full endpoint documentation
- Telemetry specification for the complete schema