Fastly
Fastly AI Bot Management already classifies AI traffic at the edge - crawlers vs fetchers,
verified vs suspected. OA needs that classification piped into a content_retrieved event
and sent to a telemetry endpoint. This page describes what the integration needs to produce and
the enrichment data available at the edge.
What the integration produces
When AI Bot Management classifies a request as an AI crawler or fetcher, a content_retrieved event
should be sent to the content owner's telemetry endpoint. The event carries the URL that was
fetched, the classification, and edge metadata. No user data, no request body, no response
content.
AI Bot ──GET──> Fastly Edge
                    │
                    ├── AI Bot Management classifies request
                    │   (VERIFIED-BOT.AI-CRAWLER, VERIFIED-BOT.AI-FETCHER, etc.)
                    │
                    ├── Serves content from cache/origin (unaffected)
                    └── content_retrieved event sent to telemetry endpoint
                        └── https://telemetry.openattribution.dev/v1/events

Event payload
This is what the telemetry endpoint expects. The data object is flexible -
the more enrichment the edge can provide, the more useful the event is for the content owner.
{
  "org_id": "content-owner-org-id",
  "events": [{
    "type": "content_retrieved",
    "timestamp": "2026-03-23T14:30:00Z",
    "content_url": "https://example.com/article/ai-transparency",
    "source_role": "edge",
    "oa_telemetry_id": "from OA-Telemetry-ID request header, if present",
    "data": {
      "user_agent": "GPTBot/1.0",
      "bot_category": "inference",
      "verified": true,
      "cache_status": "HIT",
      "response_status": 200,
      "response_bytes": 34210,
      "ja4": "t13d...",
      "asn": 14061,
      "asn_org": "DigitalOcean, LLC",
      "country": "US",
      "ip_hash": "a1b2c3..."
    }
  }]
}

The bot_category field maps from Fastly's AI Bot Management signals:
signals containing AI-FETCHER map to "inference",
signals containing AI-CRAWLER map to "training".
The verified field maps from VERIFIED-BOT vs SUSPECTED-BOT.
Enrichment available at the edge
These VCL variables are available in the log context and would be valuable in the event payload:
| VCL variable | OA field | Why it matters |
|---|---|---|
| req.http.User-Agent | user_agent | Identifies which agent is fetching |
| req.http.OA-Telemetry-ID | oa_telemetry_id | Correlates edge event with agent-side event |
| tls.client.ja4 | ja4 | TLS fingerprint - identifies clients beyond UA string |
| client.as.number | asn | AS number - confirms origin network |
| client.as.name | asn_org | AS organisation name |
| client.geo.country_code | country | Request origin country |
| fastly_info.state | cache_status | HIT/MISS - was origin hit or cache served? |
| resp.status | response_status | HTTP status code returned |
| resp.body_bytes_written | response_bytes | Volume of content served |
| digest.hash_sha256(client.ip) | ip_hash | Hashed IP for deduplication without storing raw IPs |
The OA-Telemetry-ID header deserves a note: when an agent includes this header in its request, the edge event can be
correlated with the agent's own telemetry event for the same retrieval. Two independent
observations of the same fetch. When the agent doesn't include the header, the edge event still
stands on its own.
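The correlation itself is just a join on the ID. A sketch of what a telemetry consumer might do - the function name is ours, and the event shapes follow the payload example above:

```python
def correlate(edge_events, agent_events):
    """Pair edge-side and agent-side observations of the same retrieval.

    Events without an oa_telemetry_id still stand on their own; they
    simply come back unpaired.
    """
    by_id = {e["oa_telemetry_id"]: e
             for e in agent_events if e.get("oa_telemetry_id")}
    paired, unpaired = [], []
    for ev in edge_events:
        tid = ev.get("oa_telemetry_id")
        if tid and tid in by_id:
            paired.append((ev, by_id[tid]))
        else:
            unpaired.append(ev)
    return paired, unpaired
```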
AI Bot Management signals
Fastly AI Bot Management provides four system signals for AI traffic. These map cleanly to OA's bot_category and verified fields:
| Fastly signal | OA bot_category | Meaning |
|---|---|---|
| VERIFIED-BOT.AI-CRAWLER | training | Verified identity, crawling for model training |
| VERIFIED-BOT.AI-FETCHER | inference | Verified identity, fetching at query time (RAG) |
| SUSPECTED-BOT.AI-CRAWLER | training | Suspected training crawler (not DNS-verified) |
| SUSPECTED-BOT.AI-FETCHER | inference | Suspected inference fetcher (not DNS-verified) |
The inference category is where content attribution is most relevant - there is a user, a query, and a session
behind the retrieval. training crawls have no session context but are still valuable to track for volume and compliance visibility.
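The mapping as code - the signal strings are Fastly's; the function name and the None-for-non-AI convention are ours:

```python
def map_signal(signal: str):
    """Map a Fastly AI Bot Management signal to (bot_category, verified).

    Returns None for non-AI signals so callers can skip the event.
    """
    if "AI-CRAWLER" in signal:
        category = "training"
    elif "AI-FETCHER" in signal:
        category = "inference"
    else:
        return None
    return category, signal.startswith("VERIFIED-BOT")
```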
Integration paths
We see three possible approaches, ranging from what a Fastly customer can set up today to a native integration.
Log Streaming with basic bot detection
A Fastly customer can configure HTTPS Real-Time Log Streaming to an OA telemetry endpoint today.
VCL conditions can filter on client.class.bot (boolean - is this a bot?) and client.bot.name (string - e.g. "GPTBot"). This works, but it is User-Agent-based detection only - it doesn't
use the richer AI Bot Management classification and can't distinguish training from inference.
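Since Fastly log formats are customer-defined, the endpoint sees whatever fields the log format emits. A sketch of an endpoint-side filter, assuming a hypothetical log format that writes client.class.bot as is_bot and client.bot.name as bot_name in newline-delimited JSON:

```python
import json

def ai_bot_records(ndjson_stream):
    """Yield only bot-classified records from a Fastly NDJSON log stream.

    Field names (is_bot, bot_name) are an assumed log format, not a
    Fastly default - they must match whatever the VCL log line emits.
    """
    for line in ndjson_stream:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        if record.get("is_bot"):
            yield record
```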
To verify the endpoint, Fastly requests /.well-known/fastly/logging/challenge on the endpoint domain. The response must
contain the hex SHA-256 of your Fastly service ID. The OA public telemetry server already
handles this.
WAF rules with redirect or tag
AI Bot Management signals are available in the Next-Gen WAF rules system. A rule matching on VERIFIED-BOT.AI-FETCHER or SUSPECTED-BOT.AI-CRAWLER can trigger actions - this is how the TollBit integration works (redirect AI bots to a paywall).
The same mechanism could route AI bot requests through an OA-aware path, but we'd need to
understand the best way to capture the event data without disrupting the content delivery.
Native integration
The ideal path: AI Bot Management classification triggers an async event to the content owner's OA telemetry endpoint. No VCL, no log streaming configuration, no redirect. The customer enables it, points it at their endpoint (or the OA public server), and retrieval events flow. Same zero-latency pattern as existing log streaming but with the full AI Bot Management classification included.
Access gating
Separate from telemetry, AI Bot Management supports block and redirect actions per signal. An
OA-aware gating rule could check for AI bot classification plus the presence of an OA-Telemetry-ID header, and block agents that don't participate in the protocol. Agents that carry the header
get access. Agents that don't get a 403 explaining how to participate.
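A sketch of that gating decision, with the classification signal and the header value as inputs - the function shape and response text are illustrative, not a Fastly rule syntax:

```python
def gate(signal=None, oa_telemetry_id=None):
    """Decide whether to serve a request, given its AI classification.

    Non-AI traffic passes through untouched; AI-classified traffic must
    carry an OA-Telemetry-ID header to get content.
    Returns (status_code, body_or_None).
    """
    is_ai = signal is not None and (
        "AI-CRAWLER" in signal or "AI-FETCHER" in signal
    )
    if not is_ai or oa_telemetry_id:
        return 200, None
    return 403, ("This site participates in Open Attribution; "
                 "include an OA-Telemetry-ID header to access content.")
```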