ADRF v0.1.0: An open rule format for agent detection

AI agents have started to look like normal HTTP clients with unusual habits. A coding agent, crawler, operator, or payment assistant may call the same origin through the same gateway as a human's browser, but the wire shape is different: User-Agent prefixes, SDK headers, TLS fingerprints, signature headers, request cadence, and eventually payload hints.

Every gateway that wants to classify that traffic ends up writing the same small rule language. One team writes YAML for Claude Code. Another writes JSON for Cursor. A third adds a JA4 matcher to a WAF rule. The rule logic is not the hard part. The hard part is keeping the rule packs portable, reviewable, and current.

Today we published ADRF, the Agent Detection Rule Format, under the BSD-3-Clause license. The initial v0.1.0 release includes the normative spec, JSON Schema, governance docs, contribution checklist, and worked examples.

What ADRF does

ADRF is a YAML rule-pack format. A rule names an agent and declares the passive request signals that identify it. Version 0 covers the matchers we already use in production paths: User-Agent regexes, required headers, and JA4 prefixes. The matcher returns an agent id, a provenance tier, and a score. Policy stays downstream.

version: 0
agents:
  - id: claude-code-cli
    match:
      user_agent_pattern: '^claude-cli/'
      header_present:
        - x-stainless-arch
    provenance: unsigned-named
    score: 95
    confidence: 0.95

That is intentionally boring. The format should be easy to read in a pull request, easy to validate with JSON Schema, and easy for a proxy, SIEM, IDS, or test harness to consume without bringing along the rest of SBproxy.
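To make "easy to consume" concrete, here is a minimal Python sketch of how a consumer might evaluate one rule against one request. It is not the reference matcher from the sbproxy-agent-detect crate. The rule is written as a dict that mirrors the YAML above so the sketch needs no YAML parser. Several details are assumptions for illustration, not taken from SPEC.md: the `ja4_prefix` key name, the `match_rule` and `Match` names, the choice to AND all matchers within a rule, and the use of unanchored regex search.

```python
import re
from dataclasses import dataclass
from typing import Mapping, Optional

# Mirrors the YAML example above as a plain dict.
RULE = {
    "id": "claude-code-cli",
    "match": {
        "user_agent_pattern": "^claude-cli/",
        "header_present": ["x-stainless-arch"],
    },
    "provenance": "unsigned-named",
    "score": 95,
    "confidence": 0.95,
}


@dataclass(frozen=True)
class Match:
    """Hypothetical result type: agent id, provenance tier, and score."""
    agent_id: str
    provenance: str
    score: int


def match_rule(rule: Mapping, headers: Mapping[str, str],
               ja4: Optional[str] = None) -> Optional[Match]:
    """Return a Match if every matcher declared in the rule is satisfied.

    Assumption: matchers within one rule are ANDed together.
    """
    # HTTP header names are case-insensitive, so compare in lowercase.
    lowered = {name.lower(): value for name, value in headers.items()}
    matchers = rule["match"]

    pattern = matchers.get("user_agent_pattern")
    if pattern is not None:
        if not re.search(pattern, lowered.get("user-agent", "")):
            return None

    for name in matchers.get("header_present", []):
        if name.lower() not in lowered:
            return None

    # "ja4_prefix" is an assumed key name for the JA4 prefix matcher.
    prefix = matchers.get("ja4_prefix")
    if prefix is not None and not (ja4 or "").startswith(prefix):
        return None

    return Match(rule["id"], rule["provenance"], rule["score"])
```

The function only reports what matched. Whether to allow, rate-limit, or block the request is left to the caller, which is the sense in which policy stays downstream.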

What v0.1.0 includes

The release ships a small but complete package:

SPEC.md, the normative ADRF v0 format.
schemas/v0.json, a draft 2020-12 JSON Schema for rule packs.
examples/claude-code.yaml, a worked single-agent rule.
GOVERNANCE.md and CONTRIBUTING.md, including the evidence checklist for new named-agent rules.
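For a sense of what a pre-merge check on a rule pack might look like, the sketch below hand-rolls a few structural checks in Python. It is not schemas/v0.json and does not replace validating against it. It covers only the fields visible in the example rule earlier in this post. The `check_pack` name, the `ja4_prefix` key, and the decision to report unknown matcher keys instead of ignoring them are assumptions made for illustration.

```python
from typing import List

# Matcher keys this sketch recognises. "ja4_prefix" is an assumed name.
KNOWN_MATCH_KEYS = {"user_agent_pattern", "header_present", "ja4_prefix"}


def check_pack(pack: dict) -> List[str]:
    """Return a list of human-readable problems; an empty list means none found."""
    errors: List[str] = []

    if pack.get("version") != 0:
        errors.append("version: expected 0")

    agents = pack.get("agents")
    if not isinstance(agents, list) or not agents:
        errors.append("agents: expected a non-empty list")
        return errors

    for index, agent in enumerate(agents):
        where = f"agents[{index}]"
        if not isinstance(agent, dict):
            errors.append(f"{where}: expected a mapping")
            continue
        if not isinstance(agent.get("id"), str):
            errors.append(f"{where}.id: expected a string")

        match = agent.get("match")
        if not isinstance(match, dict) or not match:
            errors.append(f"{where}.match: expected a non-empty mapping")
            continue

        # Report matchers this consumer does not understand rather than
        # silently skipping them (an assumed policy, not a spec requirement).
        for key in sorted(set(match) - KNOWN_MATCH_KEYS):
            errors.append(f"{where}.match.{key}: unknown matcher")

    return errors
```

Returning a list of messages, instead of raising on the first problem, lets a pull-request check show every issue in a pack at once.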

The reference matcher lives in the sbproxy-agent-detect crate. SBproxy consumes ADRF packs directly; the spec repo exists so other projects do not need to treat SBproxy source code as the format documentation.

What comes next

ADRF v0 is request-scoped and deliberately small. The roadmap tracks the next signal families: header-order hashes, JA4T and JA4X predicates, behavioural cadence buckets, payload hints such as MCP clientInfo.name, and identity predicates tied to verified Web Bot Auth Signature-Agent values.

Those features need careful boundaries. Payload rules can see user content. Behavioural rules need state. Signed-agent identity depends on a verifier outside the rule format. The v0 spec draws the line clearly enough for current consumers, and the governance doc gives us a public path to extend it without breaking those consumers silently.

How to contribute

If you maintain a gateway, proxy, IDS, WAF, crawler registry, or agent runtime, try the schema against your own rule format. If you have a stable public wire shape for a named agent, open a rule PR with evidence: observed User-Agent, version and date, header tells, source links, and false-positive tradeoffs.

The goal is not to make one vendor's classifier canonical. The goal is to make agent detection rules portable enough that the security community can argue about the evidence in the open.