We Built an AI Retrieval Observatory. Here Is What 67 Days Showed

67 days, six software directories, 236,059 AI-agent requests: what machines read, what switches retrieval on, and why retrieval is not citation.

By Emerging S.R.O. · Published 2026-07-02

For months, our software directories showed a discouraging analytics pattern. Google displayed the pages often. Almost nobody clicked.

That left a basic question. Was the content being ignored, or was some of its use happening outside the click?

Google Search Console could not answer it. Search Console records impressions and clicks. It does not show when an AI system requests a page while preparing an answer.

So we built a separate instrument. Since April 25, 2026, our servers have recorded every request from declared AI-associated user agents across six independent software directories covering AI software, cybersecurity, finance, HR, logistics, and energy. Together the directories publish 905 structured reviews with pricing, features, compliance facts, and source dates.

This report covers the first 67 complete days of that record, April 25 through June 30. It is the first issue of a recurring series. The clean result: 236,059 successful requests from tracked AI-associated agents, most of them reaching actual software reviews. That did not prove our pages were cited. It did prove that low click-through is not the same as no machine readership.

The short version

| Finding | Result |
| --- | --- |
| Successful tracked requests | 236,059 |
| Measurement days | 67, none missing |
| Requests classified as live retrieval | 78.42% |
| Requests declaring ChatGPT-User | 163,464 (69.25%) |
| Share reaching substantive content | 84.31% |
| Google clicks, same properties, same quarter | 484 |
| Requests to llms.txt and llms-full.txt | 239 (0.10%) |

The network averaged 2,412 requests per day in the last six days of April, 3,232 per day in May, and 4,047 per day in June. The rise was not endless. June reached a plateau. An observatory records what happened; it does not force every curve into a growth story.

What server logs add that search analytics cannot

Search analytics describe one route from a page to a person. A result appears, a person clicks, the site records a visit.

AI retrieval can follow a different route. A person asks an AI system a question. The system searches, requests several pages, compares their claims, and produces an answer without sending the person anywhere. From the publisher's side, the click disappears. The server request remains.

One terminology note, because precision matters here. We say "request" throughout. A successful request confirms our server delivered a page to a tracked user agent. It does not prove the page was parsed, trusted, quoted, or cited. Our logs record the declared user agent, host, path, timestamp, and status. We count only status-200 responses from the six directory domains.
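The counting rule in that note can be sketched as a filter over parsed log records. This is a minimal illustration under stated assumptions, not the production extractor: the record fields, the placeholder host names, the sample user-agent strings, and the substring-matching rule are all hypothetical, and the agent list is only a short excerpt.

```python
from typing import Iterable

# Placeholder host names (assumption): the report does not list the six domains.
DIRECTORY_HOSTS = {"directory-one.example", "directory-two.example"}

# Short excerpt of declared agent names mentioned in this report.
TRACKED_AGENTS = ("ChatGPT-User", "PerplexityBot", "OAI-SearchBot", "GPTBot", "ClaudeBot")


def is_counted(record: dict) -> bool:
    """True if a parsed record is a status-200 response from a directory
    host to a declared, tracked user agent."""
    if record["status"] != 200:
        return False  # redirects and errors are excluded
    if record["host"].lower() not in DIRECTORY_HOSTS:
        return False  # e.g. a corporate site is not one of the directories
    user_agent = record["user_agent"].lower()
    return any(name.lower() in user_agent for name in TRACKED_AGENTS)


def count_requests(records: Iterable[dict]) -> int:
    """Count the records that pass the baseline filter."""
    return sum(1 for record in records if is_counted(record))


# Made-up records for illustration only.
sample = [
    {"host": "directory-one.example", "status": 200, "user_agent": "Mozilla/5.0; ChatGPT-User/1.0"},
    {"host": "directory-one.example", "status": 301, "user_agent": "Mozilla/5.0; GPTBot/1.2"},
    {"host": "corporate.example", "status": 200, "user_agent": "ClaudeBot/1.0"},
    {"host": "directory-two.example", "status": 200, "user_agent": "Mozilla/5.0 (Windows NT 10.0)"},
]
print(count_requests(sample))  # 1
```

Only the first sample record passes: the second is a redirect, the third comes from a non-directory host, and the fourth declares no tracked agent.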

Finding 1: most requests reached software reviews

The largest result is also the simplest.

| Content surface | Requests | Share |
| --- | --- | --- |
| Software review pages | 186,469 | 78.99% |
| Homepages | 19,371 | 8.21% |
| robots.txt and sitemaps | 10,315 | 4.37% |
| Other pages | 7,121 | 3.02% |
| AI visibility reports | 6,667 | 2.82% |
| Research articles | 3,142 | 1.33% |
| Briefing PDFs | 2,735 | 1.16% |
| llms.txt and llms-full.txt | 239 | 0.10% |

Reviews, AI reports, research articles, and briefing PDFs together received 199,013 requests, 84.31% of the total. The traffic was not mainly bots checking robots.txt or reading a machine-specific index. It concentrated on the pages containing software facts.
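The substantive-content share can be rechecked from the surface counts alone. A small sketch using only the figures in the table above; the variable names are illustrative.

```python
# Request counts per content surface, copied from the table above.
surfaces = {
    "Software review pages": 186_469,
    "Homepages": 19_371,
    "robots.txt and sitemaps": 10_315,
    "Other pages": 7_121,
    "AI visibility reports": 6_667,
    "Research articles": 3_142,
    "Briefing PDFs": 2_735,
    "llms.txt and llms-full.txt": 239,
}

# The four surfaces the report groups as substantive content.
SUBSTANTIVE = (
    "Software review pages",
    "AI visibility reports",
    "Research articles",
    "Briefing PDFs",
)

total = sum(surfaces.values())
substantive = sum(surfaces[name] for name in SUBSTANTIVE)
substantive_share = round(100 * substantive / total, 2)

print(total, substantive, substantive_share)  # 236059 199013 84.31
```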

Finding 2: Google shows the same pages, and almost nobody clicks

Over the overlapping quarter, March 30 to June 29, Google Search Console recorded 484 human clicks across all six sites combined, 5.3 per day, against just over one million impressions. That is a network click-through rate of 0.045 percent. The largest site averaged 1.6 clicks a day on roughly 231,000 monthly impressions, with 24 zero-click days in the quarter.

The demand is visible; the clicks are not. On queries ranking in Google's top five positions, where a typical result converts around ten percent of impressions into clicks, our network-wide rate was 0.15 percent, a gap of roughly 70x. Bing, which does not overlay AI answers as aggressively, converts the same pages at 2.0 percent.

One page makes the point by itself. Our Microsoft Azure Speech Service review drew 82,635 Google impressions this quarter and 4 clicks, while tracked agents requested the same page thousands of times.

The safe combined conclusion, stated carefully: Google exposes these pages but sends very few clicks. Separately, declared AI-associated agents request the pages at substantial volume. AI answer surfaces are a plausible contributor to the missing clicks, but the observatory has not proven they explain every one.

The channel has already moved

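For this catalog, tracked AI requests outnumber human search clicks by roughly seven hundred to one per day: about 3,523 requests per day across the 67-day window, against 5.3 clicks per day across the search quarter. The distribution channel for reference content is no longer the click. It is the request.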
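Because the request window (67 days) and the search quarter (92 days) differ in length, the ratio is easiest to check per day. A minimal sketch using only totals and dates stated in this report; the helper name is illustrative.

```python
from datetime import date


def days_inclusive(start: date, end: date) -> int:
    """Number of calendar days in a window, counting both endpoints."""
    return (end - start).days + 1


request_days = days_inclusive(date(2026, 4, 25), date(2026, 6, 30))  # log window
click_days = days_inclusive(date(2026, 3, 30), date(2026, 6, 29))    # search quarter

requests_per_day = 236_059 / request_days
clicks_per_day = 484 / click_days

raw_ratio = 236_059 / 484                          # ignores the different window lengths
per_day_ratio = requests_per_day / clicks_per_day  # compares like with like

print(request_days, click_days, round(raw_ratio), round(per_day_ratio))
```

On these inputs the raw totals give about 488 to one, and the per-day comparison gives about 670 to one.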

Finding 3: ChatGPT-User was the dominant declared agent

| Declared agent | Requests | Share |
| --- | --- | --- |
| ChatGPT-User | 163,464 | 69.25% |
| Meta-ExternalAgent | 31,382 | 13.29% |
| PerplexityBot | 8,556 | 3.62% |
| OAI-SearchBot | 7,930 | 3.36% |
| ClaudeBot | 7,172 | 3.04% |
| GPTBot | 5,503 | 2.33% |
| Bytespider | 4,197 | 1.78% |
| GoogleOther | 2,402 | 1.02% |
| anthropic | 2,175 | 0.92% |
| YouBot | 1,281 | 0.54% |

Our current manifest classifies 78.42% of successful requests as live retrieval, meaning agents associated with answering a question at the moment it is asked rather than gathering training data.

That classification needs care. ChatGPT-User is associated with user-triggered retrieval. Other labels may mix retrieval, indexing, and supporting activity. User-agent strings can be copied or spoofed. The defensible claim is not that 185,119 people received answers from our pages. The defensible claim is that 185,119 successful requests matched patterns our manifest classifies as live retrieval. That is still a meaningful result, and one we expect to refine as agent behavior becomes better documented.
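A manifest of this kind can be sketched as two name lists and a matching rule. The names below are the ones listed in the Methodology section; the case-insensitive substring match and the function names are assumptions, and Anthropic's retrieval agent is left out because its exact user-agent string is not given in this report.

```python
from collections import Counter

# Agent names as listed in this report's methodology.
LIVE_RETRIEVAL = ("ChatGPT-User", "Perplexity-User", "PerplexityBot",
                  "OAI-SearchBot", "DuckAssistBot", "YouBot")
TRAINING_OR_INDEX = ("GPTBot", "ClaudeBot", "Meta-ExternalAgent", "GoogleOther",
                     "Google-Extended", "CCBot", "Bytespider")


def classify(user_agent: str) -> str:
    """Label a declared user agent by case-insensitive substring match (assumed rule)."""
    ua = user_agent.lower()
    if any(name.lower() in ua for name in LIVE_RETRIEVAL):
        return "live_retrieval"
    if any(name.lower() in ua for name in TRAINING_OR_INDEX):
        return "training_or_index"
    return "untracked"


def live_retrieval_share(user_agents: list[str]) -> float:
    """Percentage of tracked requests whose declared agent is labeled live retrieval."""
    labels = Counter(classify(ua) for ua in user_agents)
    tracked = labels["live_retrieval"] + labels["training_or_index"]
    return 100 * labels["live_retrieval"] / tracked if tracked else 0.0


# Made-up declared agents for illustration only.
print(live_retrieval_share(["ChatGPT-User", "ChatGPT-User", "PerplexityBot", "GPTBot"]))  # 75.0
```

The label describes the declared string only. A spoofed user agent would be classified exactly like a genuine one.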

Finding 4: the network rose, then flattened

| Period | Requests | Average per day |
| --- | --- | --- |
| April 25 to 30 | 14,473 | 2,412 |
| May | 100,186 | 3,232 |
| June | 121,400 | 4,047 |

The highest clean daily total was 5,046. The pattern is not a smooth line. There are crawler bursts, quiet weekends, and single ingestion events. On May 7, GPTBot made 2,037 successful requests, against 181 the day before and 128 the day after, most of them against a newly published report surface. That spike looked suspicious in the daily chart. The raw logs showed a real one-day crawler wave, not an extraction error.

By late June, volume had stopped rising. We do not yet know whether the plateau reflects seasonality, catalog size, agent behavior, or a pause. The next issue will say.
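The period averages and the size of the May 7 burst follow from figures already stated in this section. A short arithmetic sketch; comparing the burst day to the mean of its two neighbours is an illustrative choice, not a rule taken from the report.

```python
# (requests, days) per period, copied from the period table in this section.
periods = {
    "April 25 to 30": (14_473, 6),
    "May": (100_186, 31),
    "June": (121_400, 30),
}

averages = {name: round(requests / days) for name, (requests, days) in periods.items()}
total_requests = sum(requests for requests, _ in periods.values())
total_days = sum(days for _, days in periods.values())

# GPTBot on May 7 against the day before and the day after.
burst, day_before, day_after = 2_037, 181, 128
burst_factor = round(burst / ((day_before + day_after) / 2), 1)

print(averages, total_requests, total_days, burst_factor)
```

The burst day comes out at roughly 13 times the average of the adjacent days, which is why it stood out in the daily chart.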

Finding 5: the pattern was not limited to the AI directory

| Directory | Requests | Share |
| --- | --- | --- |
| BlockSentient (AI) | 142,754 | 60.47% |
| ZeroMetric (security) | 31,318 | 13.27% |
| StaffGrid (HR) | 19,031 | 8.06% |
| TreasuryMetric (finance) | 18,372 | 7.78% |
| LedgerSupply (logistics) | 12,448 | 5.27% |
| PowerAudit (energy) | 12,136 | 5.14% |

Every directory recorded at least one tracked request on every one of the 67 days. And the distribution moved over time. BlockSentient's share of each period's tracked traffic fell from roughly 73 percent in late April to about 57 percent in early June to about 49 percent in late June. The other five grew faster. The directories differ in age, catalog size, and enrichment depth, so this is not a controlled vertical comparison. It does reduce the chance that the observation is an artifact of publishing about AI tools.

Finding 6: structured pages entered the request stream within days

Between early May and early June we added 93 new tool listings, each with structured pricing, feature, and compliance data. The before state is unambiguous: the pages did not exist. Within days of publication they were drawing 0.1 to 17 requests per day each. Across 6,000 post-publication requests on this cohort, 82.5 percent carried live-retrieval labels and 73 percent declared ChatGPT-User.

Coverage of the long tail is nearly total: in a representative two-week window, 519 of 522 never-enriched pages received at least one tracked request.

The harder question is what structured enrichment does to a page that already exists. We have seven historical cases with verifiable upgrade dates inside the window, against a network tide of x1.3 to x2.1 for untouched pages over the same weeks:

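| Tool | Before (requests/day) | After (requests/day) | Change |
| --- | --- | --- | --- |
| Shippo | 0.11 | 9.87 | x88.8 |
| 8fig | 0.15 | 1.35 | x8.8 |
| Miro | 0.58 | 3.57 | x6.1 |
| Later | 0.75 | 2.97 | x4.0 |
| Rebolt | 0.38 | 1.26 | x3.3 |
| Blinq | 1.62 | 2.21 | x1.4 |
| ActiveCampaign | 1.90 | 0.61 | x0.3 |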

Median change: x4.0. Six rose, one fell, and we publish the one that fell. Seven observational cases are suggestive, not causal evidence. So we turned the question into an experiment.
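The summary of the seven cases can be recomputed from the published change factors. In the sketch below, dividing the median by the network tide is an illustrative adjustment added here, not a method the report applies, and the variable names are hypothetical.

```python
from statistics import median

# Change factors (after / before) as published in the table above.
changes = {
    "Shippo": 88.8,
    "8fig": 8.8,
    "Miro": 6.1,
    "Later": 4.0,
    "Rebolt": 3.3,
    "Blinq": 1.4,
    "ActiveCampaign": 0.3,
}

# Growth of untouched pages over the same weeks, as stated above.
TIDE_LOW, TIDE_HIGH = 1.3, 2.1

median_change = median(changes.values())
rose = sum(1 for factor in changes.values() if factor > 1)

# Cases that grew faster than even the upper end of the network tide.
above_tide = [tool for tool, factor in changes.items() if factor > TIDE_HIGH]

# Illustrative only: median change relative to the tide, low and high bound.
adjusted_median = (round(median_change / TIDE_HIGH, 2), round(median_change / TIDE_LOW, 2))

print(median_change, rose, above_tide, adjusted_median)
```

Read this way, five of the seven cases outgrew the upper end of the tide, and the median case grew roughly 1.9 to 3.1 times faster than untouched pages. That is still an observational comparison.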

A pre-registered test, running now

On June 11, 2026 we randomly selected 20 never-enriched tools using the script below, seed included. Ten receive our standard structured enrichment during the first week of July. Ten form a holdout we will not touch until the readout. The analysis rules are fixed in advance: difference-in-differences between treatment and holdout on tracked requests per day. We will publish the result whether it is positive, null, or negative, in the August issue. Anyone can audit the assignment. Anyone with a website can replicate the design.

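```python
import json
import os
import random

# Fixed seed so anyone can reproduce the assignment.
random.seed(20260611)

# Pool of never-enriched tools: catalog entries with no features file yet.
pool = []
for site in ["blocksentient", "zerometric", "treasurymetric",
             "staffgrid", "ledgersupply", "poweraudit"]:
    fd = f"/var/www/{site}/features_data"
    # IDs that already have enrichment data (file name minus ".json").
    have = {fn[:-5] for fn in os.listdir(fd) if fn.endswith(".json")} if os.path.isdir(fd) else set()
    with open(f"/var/www/{site}/db.json") as f:
        tools = json.load(f)
    for t in tools:
        i = t.get("id")
        if i and i not in have:
            pool.append((site, i))

# Draw 20 tools: the first ten are treatment, the last ten are holdout.
sample = random.sample(pool, 20)
treat, hold = sample[:10], sample[10:]
```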

| Treatment (enriched July 2026) | Holdout (untouched until readout) |
| --- | --- |
| blocksentient: aivo | blocksentient: botstar |
| blocksentient: chatgpt | blocksentient: gupshup |
| blocksentient: intercom | ledgersupply: cerasis |
| blocksentient: mailjet | ledgersupply: fleet-complete |
| blocksentient: zoho-analytics | poweraudit: bosch |
| poweraudit: energycap | poweraudit: ginlong-solis |
| poweraudit: greenlots | poweraudit: newmotion |
| poweraudit: juicebox-(enel-x) | staffgrid: bullhorn |
| staffgrid: jazzhr | zerometric: cybrary |
| treasurymetric: paylocity | zerometric: proofpoint-zenguide |

A falsifiable claim, in public

If enrichment does what the seven cases suggest, the treatment column outruns the holdout by August. If it does not, we will print that instead.
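The pre-registered comparison is a difference-in-differences on tracked requests per day. A minimal sketch of that estimator, assuming one before value and one after value per tool; the function name and the input numbers are invented placeholders, not measurements from this experiment.

```python
from statistics import mean


def diff_in_diff(treat_before, treat_after, hold_before, hold_after):
    """Difference-in-differences on mean requests per day.

    Subtracting the holdout's change removes growth that would have
    happened without enrichment (the network tide).
    """
    treatment_change = mean(treat_after) - mean(treat_before)
    holdout_change = mean(hold_after) - mean(hold_before)
    return treatment_change - holdout_change


# Invented placeholder values, one per tool. NOT data from this study.
estimate = diff_in_diff(
    treat_before=[1.0, 2.0],
    treat_after=[3.0, 4.0],
    hold_before=[1.0, 1.0],
    hold_after=[2.0, 2.0],
)
print(estimate)  # 1.0
```

In the placeholder example the treatment group gains 2.0 requests per day and the holdout gains 1.0, so 1.0 request per day is attributed to the treatment. A result near zero would be the null outcome.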

Finding 7: llms.txt was a very small surface

The llms.txt proposal asks publishers to provide a dedicated machine-readable index file for AI systems. [1]

We published both llms.txt and llms-full.txt on all six sites in April. Across the full window, the exact files received 114 and 125 successful requests respectively, 239 combined. That is 0.10 percent of the clean baseline.

What tracked agents appear to follow instead is ordinary web architecture. When we published 877 new pages on May 1 with zero sitemap presence, agents found them through a single small link on each review page. They made 748 requests within six days, crawling review page and companion page together in the same sessions.

This finding is limited to our network. It does not prove llms.txt has no value elsewhere or that future agents will ignore it. During this window, on these sites, agents requested ordinary content pages far more often than the dedicated machine-index files.

Finding 8: retrieval is not citation

Everything above measures the request. A request is one stage in a longer chain. A crawler discovers the page. A retrieval system requests it. The page enters a candidate-source set. A fact from it appears in the answer. The answer visibly cites the page. Server logs are strongest at the first two stages. For the third, we captured direct evidence.

ChatGPT exposes its reasoning while it searches. We recorded full traces for two procurement questions in temporary chats with personalization off, and we publish them as exhibits: two anecdotes about one engine, not statistics.

Question one, trivial: is the Telegram Bot API free in 2026. The engine spent 14 seconds, consulted 24 sources, took the unambiguous official answer, and opened no window for third parties.

Question two, hard: what does Microsoft Azure Speech Service cost in 2026. The engine spent just under three minutes across 78 sources. The official pricing page hides values behind scripts, so the engine triangulated Microsoft's own numbers through Korean, Spanish, Indonesian, and Swedish localizations where static snippets leak them. It pulled third-party pages into its candidate set, ours among them, then cross-checked their facts against the official source and discarded candidates it judged stale.

Microsoft's official batch transcription list price is 0.18 dollars per audio hour. [2]

Across the third-party sources the engine encountered, published figures for the same line item ranged from 0.18 to 4.50 dollars per hour, a 25x spread. In the middle of that chaos, the engine noted the "potential value in citing third-party sources" where official values are missing.

Our pages entered the candidate set in both runs of the hard question. Candidate inclusion is stronger than a log request and weaker than answer inclusion. We have not yet confirmed a visible citation in a final answer. The two traces suggest that freshness and verifiability may influence which sources survive the engine's checks, and that verification effort scales with how ambiguous the official record is. Testing that properly is the second experiment: a controlled question panel that records the engine, the exact question, the final answer, visible citations, and the matching server requests, connecting retrieval to answer-side evidence.

The clock from publish to answer surface

Our May 1 deployment of 877 pages gave a clean timing read. Index and training crawlers arrived within hours to days. OpenAI's search-index crawler became and remains the top reader of those pages, and Anthropic's ClaudeBot roughly quadrupled its activity on them in late June. Requests declaring ChatGPT-User began as a trickle four to five weeks after publication and are still climbing slowly.

Publication to answer surface is a clock measured in weeks. It is one reason this report ships in early July.

Common Crawl publishes a new web snapshot roughly quarterly, and the next one lands mid-summer. [3]

Its crawler's bursts are visible in our own logs. Inclusion does not guarantee model-training use, but content absent from the snapshot cannot be included at all.

A correction before publication

Our first draft total was 238,995 successful requests. We did not publish that number.

During validation we found two problems. First, the total included 1,773 requests to Emerging.cz, our corporate site, which is not one of the six directories. Second, the production log contained a one-time overlap: part of April 25 was processed twice by two extraction modes, duplicating 1,163 requests.

We rebuilt the full window directly from 84 retained nginx source files, processing each file once. The corrected result is 236,059. The change is 1.23 percent and alters no conclusion, but it changes the number, so it belongs in the report. Earlier counts shown on our own pages also included status-301 redirects; the baseline here uses status-200 responses only, and 36,395 redirects were excluded by that rule.
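The correction can be checked with the three numbers stated above. A small arithmetic sketch; the variable names are illustrative.

```python
draft_total = 238_995
corporate_site_requests = 1_773  # requests to the non-directory corporate site
duplicated_requests = 1_163      # part of April 25 processed twice

corrected_total = draft_total - corporate_site_requests - duplicated_requests
change_percent = round(100 * (draft_total - corrected_total) / draft_total, 2)

print(corrected_total, change_percent)  # 236059 1.23
```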

The clean dataset, extractor, manifest, and summary are preserved with SHA-256 checksums.
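Anyone holding a copy of a preserved file and its published checksum can verify it with a few lines of standard-library Python. This is a generic sketch, not the project's own tooling; the function names are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks so that
    large log files do not have to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path: Path, expected_hex: str) -> bool:
    """Compare a file's digest with a recorded checksum, ignoring case
    and surrounding whitespace in the recorded value."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

A mismatch means the file differs from the one that was checksummed. It does not say which copy is the correct one.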

What the data proves, and what it does not

The first 67 days support five conclusions. The instrument works: every day covered, all six sites, zero parser errors. Low click-through did not mean no machine audience. The activity concentrated on substantive content. It extended across six verticals. And the system can detect its own errors, which is the difference between an observatory and a dashboard.

The same logs cannot tell us how many people caused the requests or which questions caused them. They cannot show whether pages were fully parsed, whether facts appeared in answers, or whether citations were shown. They cannot verify that every declared agent was authentic. Retrieval is not citation. The experiments above exist to close that gap with evidence instead of assumption.

If you publish on the web

Four observations from one network, offered as hypotheses for yours. Structured, specific pages entered the machine request stream within days. Internal links, not sitemaps and not llms.txt, appeared to be the discovery route. Live-retrieval-labeled traffic dominated, which means page freshness matters at answer time, every day. And the citation gate appears to run on verifiable currency: a page stating each fact with a date and a checkable source is doing the engine's verification work for it.

Methodology

Properties: six software directories operated by Emerging S.R.O.
Window: 2026-04-25 through 2026-06-30, UTC.
Source: nginx access logs, rebuilt from 84 retained source files with a preserved extractor version; each canonical record carries a source-file hash and line.
Filter: HTTP status 200.
Excluded host: Emerging.cz.
Agent detection: user-agent strings matched against a maintained pattern list, taken at face value.
Live-retrieval classification: ChatGPT-User, Perplexity-User, PerplexityBot, OAI-SearchBot, DuckAssistBot, YouBot, and Anthropic's retrieval agent.
Training and index classification: GPTBot, ClaudeBot, Meta-ExternalAgent, GoogleOther, Google-Extended, CCBot, Bytespider, and similar.
Integrity: 67 of 67 days represented, all six sites on every day, zero parser errors.
Search figures: Google Search Console and Bing Webmaster Tools for the same properties, March 30 to June 29.
Reasoning traces: captured from ChatGPT in temporary chats with personalization off, June 2026, stored in full.

Privacy: we never receive the questions people enter into AI systems. Raw server logs contain standard request metadata, including IP addresses and full user-agent strings. The canonical research dataset excludes IP addresses, full user-agent strings, referrers, and query parameters. Public releases use aggregates only.

Limitations

This is one network operated by one small company. The six directories share infrastructure and publishing methods; their traffic is not representative of the web. User-agent strings can be spoofed; our categories describe declared agents, not verified identities. The live-retrieval label depends on our current manifest, and some agents may have mixed purposes. A successful request does not prove parsing, answer inclusion, or citation. Our server answers unknown paths with the application shell and status 200, so path-based surface counts include some requests to pages that do not exist; cross-checking requested paths against the live catalog is planned for the next issue. Parts of the network served mislabeled branding to crawlers during part of May; the requests remain valid, but content-level analysis of affected pages needs separate tagging. The seven enrichment cases are observational; the randomized experiment exists to test causality. The June plateau may be seasonal. The two reasoning traces are exhibits from one engine on two questions.

Data

An aggregated dataset accompanies this piece at /research/data/ on each network site: per-day, per-agent, per-site request counts for the full window, plus per-tool counts for the 100 most-requested review pages. Files are CSV and JSON under a stable, versioned URL, built from the checksummed canonical baseline. License: CC BY 4.0. If you use it, cite this report. If you find an error, email us and we will publish the correction.

Disclosure

We measure AI-associated requests to pages we publish ourselves, including pages about AI measurement. That is a conflict of interest, and more requests can make the network look more important, so we name it rather than hide it. Our response is method, corrections, limitations, and negative findings published alongside the positive ones: the June plateau is in this report, and a null or negative experiment result will be too.

Nothing here is sponsored. Vendors cannot pay to change a transparency score, receive a Vendor Verified badge without documented corrections, or alter the observatory's measurements. The Veracity Media Network is operated by Emerging S.R.O., a two-person company in Brno, Czech Republic, building this in public. Researchers, journalists, publishers, and teams studying AI retrieval are welcome to contact us through any network site to compare methods or request the aggregate tables.

Sources

  1. The llms.txt proposal asks publishers to provide a dedicated machine-readable index file for AI systems. https://llmstxt.org (verified 2026-07-01)
  2. Microsoft's official batch transcription list price is 0.18 dollars per audio hour. https://azure.microsoft.com/pricing/details/cognitive-services/speech-services/ (verified 2026-06-11)
  3. Common Crawl publishes a new web snapshot roughly quarterly, and the next one lands mid-summer. https://commoncrawl.org (verified 2026-07-01)