How we measure agent readability
The CovenAI v2.0.1 score quantifies how well a website is structured, citable, discoverable, accessible to AI agents, and transactable by them. We measure these signals; we don't transact for you. Open rubric, paid tooling. This document describes what we measure, how scores are constructed, and what the corpus data reveals.
Background
What the v2.0.1 score measures
As AI systems — large language models, autonomous agents, shopping assistants, and search overviews — become primary intermediaries between people and the web, a site's ability to be read and cited by these systems matters as much as its ability to rank in a traditional search index.
A site can rank well in conventional search results and still be effectively unreadable to AI-driven discovery. The reasons are structural: AI systems evaluate content differently from keyword-matching algorithms. They assess whether content is machine-readable, structurally coherent, semantically clear, demonstrably authoritative, and accessible to the crawlers that feed their training and retrieval pipelines.
The CovenAI v2.0.1 score was built to make these signals measurable. A high score indicates that a site is well-positioned to appear in AI-generated responses, be cited in LLM outputs, and be transacted with by purchasing agents. A low score reveals specific, fixable gaps in how the site presents itself to non-human systems. The score is available free via the scan tool and updated continuously in Agent Analytics for monitored sites.
Framework
The five scored dimensions
Each site is evaluated across five independent scored dimensions. Scores for each dimension range from 0 to 100. The composite score is the weighted sum of all five and ranges from 0 to 100. Weights reflect the relative influence each dimension has on observed AI citation and transaction behaviour.
A sixth signal, Agent Correlation, is also measured and reported below for transparency, but contributes 0% to the public score. We treat it as a research signal because it can be inflated with synthetic agent traffic; using it in the headline number would reward gaming over genuine agent-readiness.
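As a rough illustration of the weighted sum described above, here is a minimal sketch. The weight values and dimension keys are hypothetical placeholders chosen only so the weights total 100%; they are not CovenAI's published weights.

```python
# Hypothetical weights in percent, for illustration only.
# The real rubric defines its own values; these merely sum to 100.
WEIGHTS_PCT = {
    "structure": 20,
    "citability": 25,
    "discoverability": 25,
    "agent_accessibility": 15,
    "transactability": 15,
}


def composite_score(dimension_scores: dict[str, float]) -> float:
    """Return the weighted sum of the five scored dimensions (0 to 100).

    Extra keys, such as an Agent Correlation reading, are ignored,
    which mirrors a signal that contributes 0% to the public score.
    """
    if sum(WEIGHTS_PCT.values()) != 100:
        raise ValueError("weights must total 100%")
    total = 0.0
    for name, weight in WEIGHTS_PCT.items():
        score = dimension_scores[name]
        if not 0 <= score <= 100:
            raise ValueError(f"{name} score out of range: {score}")
        total += weight * score
    return total / 100


example = {
    "structure": 80,
    "citability": 40,
    "discoverability": 60,
    "agent_accessibility": 50,
    "transactability": 20,
    "agent_correlation": 95,  # reported, but not part of the composite
}
print(composite_score(example))
```

Because every dimension is bounded by 0 and 100 and the weights total 100%, the composite stays within 0 to 100 as well.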
- Real heading hierarchy: <h2> tags, not styled <p> elements or <details> accordions
- Structured lists: <ul>/<ol> elements, not comma-separated prose
- Semantic wrappers: <section> and <article> to define content boundaries
- Answer-first pattern: direct answer in the opening paragraph
- Content depth — minimum word count for citation eligibility
- JSON-LD presence, type specificity, and entity completeness
- datePublished and dateModified in structured data
- Author attribution and credentials
- About and contact information
- Transparency signals — methodology, data sourcing, external references
- robots.txt — all nine major AI crawlers permitted
- llms.txt presence and validity
- Sitemap availability and freshness
- HTTPS and canonical tag hygiene
- Crawl accessibility — no login walls or JS-only rendering on key pages
- MCP server card (Model Context Protocol discovery)
- Agent Skills index
- API Catalog (RFC 9727)
- OAuth Authorisation Server metadata (RFC 8414)
- Markdown content negotiation, Web Bot Auth, RFC 8288 link headers
- Agent policy file declaring transaction posture
- Structured offers endpoint (catalogue discoverable by agents)
- HTTP 402 response shape on price-gated resources
- MCP manifest declaring payment-capable tools
- DNS TXT record announcing the payment endpoint
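Several of the citability checks listed above (JSON-LD presence, datePublished and dateModified, author attribution) can be approximated with the standard library alone. The sketch below is an assumption about how such a check might look, not CovenAI's scanner; the function name and the returned keys are hypothetical.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collect the parsed contents of <script type="application/ld+json"> tags."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        script_type = (dict(attrs).get("type") or "").lower()
        if tag == "script" and script_type == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                self.blocks.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                pass  # malformed JSON-LD is treated as absent


def citability_signals(html: str) -> dict:
    """Report which structured-data signals appear in top-level JSON-LD items."""
    parser = JsonLdExtractor()
    parser.feed(html)
    items = []
    for block in parser.blocks:
        for item in block if isinstance(block, list) else [block]:
            if isinstance(item, dict):
                graph = item.get("@graph", [item])  # unwrap @graph containers
                items.extend(graph if isinstance(graph, list) else [graph])
    items = [i for i in items if isinstance(i, dict)]
    types = set()
    for item in items:
        declared = item.get("@type", [])
        types.update(declared if isinstance(declared, list) else [declared])
    return {
        "has_jsonld": bool(items),
        "types": sorted(str(t) for t in types),
        "has_date_published": any("datePublished" in i for i in items),
        "has_date_modified": any("dateModified" in i for i in items),
        "has_author": any("author" in i for i in items),
    }
```

A page whose only JSON-LD is malformed reports `has_jsonld` as false here, which is a design choice of this sketch rather than a documented scoring rule.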
Agent Correlation · research signal · 0% to public score
We also observe live agent traffic against scanned sites: visit recency, agent-type diversity, and dwell signals across the nine AI agent systems we track (GPTBot, OAI-SearchBot, ChatGPT-User, ClaudeBot, Claude-SearchBot, Claude-User, PerplexityBot, Google-Extended, Gemini-Deep-Research). This data is reported alongside the score for transparency, but contributes 0% to the public number. Including it would reward sites with synthetic agent traffic over sites with genuine agent-readiness; we treat it as a confidence signal that we use to validate the rubric, not as a public scoring input.
Scoring
Score bands and what they mean
The composite score runs from 0 to 100. Scores are normalised against observed performance across the CovenAI corpus, which is updated as new sites are added. The bands below reflect meaningful thresholds in real-world agent readability outcomes.
| Band | Score range | What it means |
|---|---|---|
| Bad | 0 – 39 | Significant structural or technical barriers to AI discovery. A site in this band is poorly read by most AI systems and will not surface reliably in AI-generated responses. High-priority remediation recommended. |
| Needs Work | 40 – 59 | Partial readability. Some AI systems may encounter the site, but inconsistent signals reduce citation likelihood. Targeted improvements to the lowest-scoring dimensions will have the highest impact. |
| Good with caveats | 60 – 79 | Solid foundation. The site is legible to most AI systems and reasonably likely to appear in relevant AI-generated responses. One or more dimensions still have meaningful gaps worth addressing. |
| Good | 80 – 100 | Strong agent readability. The site is well-structured, citable, discoverable, and accessible to AI systems. Well-positioned to appear in AI-generated responses and agentic discovery pipelines. |
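The band table can be expressed as a small lookup. This is a sketch only; the table lists whole-number ranges, so treating a fractional score such as 39.5 as belonging to the lower band is an assumption made here.

```python
def score_band(score: float) -> str:
    """Map a composite score (0 to 100) to its band name from the table."""
    if not 0 <= score <= 100:
        raise ValueError(f"score out of range: {score}")
    if score < 40:
        return "Bad"
    if score < 60:
        return "Needs Work"
    if score < 80:
        return "Good with caveats"
    return "Good"


for value in (39, 40, 59, 60, 79, 80):
    print(value, score_band(value))
```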
Corpus Data
Global corpus: April 2026
In April 2026, CovenAI ran the v2.0.1 scoring methodology across 938 sites spanning a range of industries, regions, and site types. Of those, 488 returned a complete score across all five scored dimensions. The remainder had one or more dimensions that could not be evaluated — typically due to crawl access restrictions or insufficient content depth. The corpus continues to grow as sites connect to Agent Analytics monitoring; the figures below reflect the April 2026 snapshot.
The results show that most sites are partially readable by AI systems but fall short on the signals that drive citation: structured data, clear authorship, fresh date markup, and permissive AI crawler access.
The average score of 47.4 and the median of 50 both fall in the Needs Work band. Because the two values are close, the distribution is roughly centred on the midpoint, with no heavy skew towards either extreme. The highest score recorded was 76, near the top of the Good with caveats band: a reminder that even well-optimised sites tend to have meaningful gaps in at least one dimension.
The most consistent low-score drivers across the corpus are absent citability signals (missing structured data, no author attribution, no date metadata) and restrictive robots.txt configurations that block one or more major AI crawlers.
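To see whether a robots.txt file blocks any of these crawlers, Python's standard `urllib.robotparser` is enough for a first pass. The sketch below assumes the nine major AI crawlers are the same nine agents named in the Agent Correlation section, and it only tests the site root; the function name is hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Assumption: the "nine major AI crawlers" are the nine agents
# listed in the Agent Correlation section of this document.
AI_AGENTS = [
    "GPTBot", "OAI-SearchBot", "ChatGPT-User",
    "ClaudeBot", "Claude-SearchBot", "Claude-User",
    "PerplexityBot", "Google-Extended", "Gemini-Deep-Research",
]


def blocked_agents(robots_txt: str, path: str = "/") -> list[str]:
    """Return the AI agents that the given robots.txt text disallows for path."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [agent for agent in AI_AGENTS if not parser.can_fetch(agent, path)]


sample = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""
print(blocked_agents(sample))  # ['GPTBot']
```

The function takes the file's text rather than a URL so it can be run against a saved copy; fetching the live file is left to the caller.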
Data Sources
What we analyse
Scores are derived from live page analysis conducted by CovenAI’s scanning infrastructure — including Coven-Citability-Bot, our own web agent — at the time of assessment. Scores reflect the state of a site at scan time and will change as the site evolves. Sites connected to Agent Analytics receive updated scores on a regular cadence.
The analysis draws on signals from the publicly accessible version of each page as seen by a standard web client, structured data validators, the Agent Diagnostic Layer (ADL) for agent identity and traffic classification, crawl behaviour logs (for the Agent Correlation research signal), Transactability surface probes, and heuristic evaluation of content quality signals. No proprietary or authenticated data is used; scores reflect only what AI systems themselves can observe.
Industry benchmark data is aggregated and anonymised. Individual site scores are not disclosed in public reports.
Improvement
How to improve your score
Because scores decompose into five independent dimensions, improvement is systematic rather than speculative. The highest-impact actions are almost always in the lowest-scoring dimensions. Across the corpus, the most consistent low-score dimensions are Citability and Discoverability — both addressable with focused, low-effort changes.
Structure is consistently the most surprising high-impact area. Pages with genuinely good content often score poorly on structure because of invisible markup issues: FAQ sections built with <details> accordions register zero headings to an AI agent; section titles styled as <p> tags are indistinguishable from body copy; comma-separated item lists score nothing where a <ul> would score full points. These are low-effort fixes with outsized score impact.
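The markup issues described above are easy to spot by counting tags. The following is a minimal sketch using the standard-library HTML parser; it only counts elements and does not reproduce how CovenAI actually awards structure points. The function name and result keys are hypothetical.

```python
from collections import Counter
from html.parser import HTMLParser

HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}
LIST_TAGS = {"ul", "ol"}
WRAPPER_TAGS = {"section", "article"}


class StructureCounter(HTMLParser):
    """Count the structural elements an agent could rely on."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def handle_starttag(self, tag, attrs):
        if tag in HEADING_TAGS:
            self.counts["headings"] += 1
        elif tag in LIST_TAGS:
            self.counts["lists"] += 1
        elif tag in WRAPPER_TAGS:
            self.counts["semantic_wrappers"] += 1
        elif tag == "details":
            self.counts["details"] += 1


def audit_structure(html: str) -> dict[str, int]:
    """Return counts of headings, lists, semantic wrappers, and <details> blocks."""
    counter = StructureCounter()
    counter.feed(html)
    keys = ("headings", "lists", "semantic_wrappers", "details")
    return {key: counter.counts[key] for key in keys}


before = '<details><summary>Plans</summary></details><p class="title">Plans</p><p>Basic, Pro, Team</p>'
after = "<section><h2>Plans</h2><ul><li>Basic</li><li>Pro</li><li>Team</li></ul></section>"
print(audit_structure(before))
print(audit_structure(after))
```

In the `before` snippet the accordion and the styled paragraph yield zero headings and zero lists, while the `after` snippet exposes one heading, one list, and one semantic wrapper.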
Run a free scan, with no account required, to see your site's current scores across all five dimensions. Agent Analytics provides continuous monitoring and updated scores as your site evolves.