
The Autonomous Infrastructure Thesis

This is not a prediction about AGI or a bet on a specific autonomous future. It is the continuation of a pattern that has repeated in computing for decades. The actor performing a task changes. The infrastructure that actor needs to perform it reliably changes with it. Security operations are at that transition point now.

00 / the pattern

This transition has happened before. It is happening again.

Throughout the history of computing, the same sequence has repeated. Humans perform a task. Tooling emerges to help humans perform it. Automation increases. The automation needs structured versions of the same information the humans were consuming. New infrastructure emerges around machine consumption.

Humans once manually routed network traffic. Routers appeared. Humans once manually scheduled compute jobs. Schedulers appeared. Humans once manually monitored infrastructure health. Observability systems appeared. In each case the underlying task did not change. The actor performing it did. And the infrastructure that actor needed to perform it reliably was necessarily different from what humans had used before.

A human security engineer does not perceive reality directly. They consume representations of it: vendor advisories, CVE databases, KEV feeds, scanner output, trusted sources synthesized into a picture they can reason from. Their effectiveness comes partly from judgment and partly from the quality of that instrumentation. The tooling that exists today was built for that consumer.

As machines increasingly perform the same workflows (evaluating dependencies, making deployment decisions, managing running infrastructure), they need structured sources of truth designed for machine consumption. The underlying security questions do not change. Is this version exploited? Is this package compromised? Is there a fix available? Those questions are identical. What has changed is the actor asking them, and the format that actor can consume reliably is different.

This is not a prediction. It is the continuation of a pattern that has repeated in computing for decades. The decision-maker changes. The supporting infrastructure changes with it.

01 / direction of travel

Security operations are at that transition point now.

Agents are already resolving dependencies, evaluating whether to deploy a component, managing running services, and patching systems without waiting for a human to approve each step. The model is not human-in-the-loop with some automation. The model is autonomous operation with human oversight at the boundary, and that boundary is moving outward as the systems become more capable and the humans become the bottleneck.

The evidence is observable in call patterns. CI pipelines call a security API in predictable batches during build time: systematic product/version combinations at scheduled intervals. Agent tool loops look structurally different: calls arrive outside build windows, with the timing irregularity and retry cadence of interactive reasoning rather than scheduled jobs. The same product/version pair gets queried multiple times within seconds. A CI pipeline does not do that. An agent working through a decision does, checking a condition, forming an intermediate conclusion, checking again to confirm. The integrations with LangChain, AutoGen, and the MCP server exist because developers are already building these systems, not because they are planning to.
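The repeated-query signal described above can be sketched as a small log heuristic. This is an illustration only, not attestd's actual detection logic: the `Call` record, its field names, and the ten-second window are all assumptions made for the example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Call:
    """One logged API call. All field names here are hypothetical."""
    timestamp: float  # seconds since epoch
    product: str
    version: str


def repeated_query_pairs(calls, window_seconds=10.0, min_repeats=2):
    """Return product/version pairs queried at least `min_repeats` times
    within any span of `window_seconds`. The thresholds are assumptions."""
    by_pair = defaultdict(list)
    for call in calls:
        by_pair[(call.product, call.version)].append(call.timestamp)

    flagged = set()
    for pair, stamps in by_pair.items():
        stamps.sort()
        # Slide over the sorted timestamps looking for a tight cluster.
        for i in range(len(stamps) - min_repeats + 1):
            if stamps[i + min_repeats - 1] - stamps[i] <= window_seconds:
                flagged.add(pair)
                break
    return flagged
```

A scheduled build that queries a pair once per run would not be flagged under these thresholds. A tool loop that re-checks the same pair a few seconds later would be.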

The direction of travel is clear and the trajectory is accelerating. The question is not whether autonomous systems will make infrastructure security decisions. They already do. The question is whether they will make them reliably.

02 / the replicability problem

The problem is not model quality. It is input quality.

LLMs are capable of sophisticated reasoning. They are not capable of producing consistent outputs from inconsistent inputs. Security advisories, CVSS descriptions, and KEV feeds are written for human analysts. They are prose. They contain qualifications, context-dependent language, and domain-specific conventions that require interpretation. When you ask a model to reason from prose security data, you get different synthesis on different runs, not because the model is unreliable, but because the input does not constrain the output space sufficiently to produce a consistent answer.

Ask any LLM whether a given version of nginx is safe to deploy. You will get a different answer depending on how the question is framed, what other context is in the prompt, and whether the model has access to current CISA KEV data. The variance is not a bug that will be resolved with a larger model or a better prompt. It is a structural consequence of asking the model to reconstruct a security condition from an unstructured description of it.

The attestd ingestion pipeline provides direct evidence of this. When NVD provides structured CVSS vectors alongside the advisory prose, LLM extraction cross-checks prose against metadata and the system reports high confidence: a confidence score of 0.85 or above. When only the prose is usable and structured metadata is absent or incomplete, the system falls back to database-derived fields and reports lower confidence: 0.5. The same structural problem that makes direct LLM reasoning from prose unreliable is measurably visible in the quality of our own outputs when the input quality degrades. This is not a theoretical concern. It is instrumented.
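The two confidence tiers can be written down as a rule. The sketch below is a deliberately simplified stand-in, not the pipeline itself: only the 0.85 and 0.5 values come from the text, while `AdvisoryInput`, `confidence_for`, and the "non-empty vector" check are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Optional

HIGH_CONFIDENCE = 0.85  # floor quoted above for cross-checked records
LOW_CONFIDENCE = 0.5    # value quoted above for the fallback path


@dataclass(frozen=True)
class AdvisoryInput:
    """Hypothetical container for one advisory as it enters ingestion."""
    prose: str
    cvss_vector: Optional[str] = None  # structured metadata, when present


def confidence_for(advisory: AdvisoryInput) -> float:
    """Assumed rule: prose that can be cross-checked against a structured
    CVSS vector gets the high tier; prose alone gets the low tier."""
    if advisory.cvss_vector:
        return HIGH_CONFIDENCE
    return LOW_CONFIDENCE
```

The point of the sketch is that confidence is a property of the input, not of the model. The same extraction step yields a weaker result when the structured half of the record is missing.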

Trustworthy autonomous decision-making requires that the same input produces the same output. Security advisories written for human analysts do not have this property. The solution is not to improve the model. It is to change the input format.

03 / structural forces

Three forces are making this worse simultaneously.

The replicability problem is a permanent structural condition, not a temporary gap to be closed by NVD or the model providers. Three independent forces are making it worse on an accelerating timeline.

NVD enrichment degradation. NIST publicly confirmed in April 2026 that NVD enrichment processing fell critically behind schedule. The enrichment phase is what adds CVSS scores, CWE classification, and CPE applicability statements to newly published CVEs. Thousands of CVEs entered the NVD API without CVSS base scores, without CWE identifiers, and without any CPE data. A direct NVD API consumer receives these incomplete records with no in-band signal that enrichment is missing. The expected input, structured metadata alongside the prose, was absent with no warning. The enrichment gap is structural: NVD publication volume has grown faster than enrichment processing capacity, and the backlog compounds.
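Because there is no in-band signal, a consumer has to check for missing enrichment itself. A minimal sketch follows, assuming the record is the `cve` object from an NVD CVE API 2.0 response and that the `metrics`, `weaknesses`, and `configurations` keys follow that layout. The key names should be verified against the current NVD schema, and `missing_enrichment` is a hypothetical helper.

```python
# Metric keys assumed from the NVD CVE API 2.0 JSON layout.
CVSS_KEYS = ("cvssMetricV40", "cvssMetricV31", "cvssMetricV30", "cvssMetricV2")


def missing_enrichment(cve: dict) -> list:
    """Return which enrichment fields are absent from one CVE record.

    An empty list means CVSS, CWE, and CPE data are all present.
    """
    missing = []
    metrics = cve.get("metrics") or {}
    if not any(metrics.get(key) for key in CVSS_KEYS):
        missing.append("cvss")
    if not cve.get("weaknesses"):
        missing.append("cwe")
    if not cve.get("configurations"):
        missing.append("cpe")
    return missing
```

A consumer that runs this check on every record can at least tell "no known severity" apart from "severity not yet assigned", which the raw record does not distinguish.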

Supply chain attacks concentrating on agentic toolchains. In March 2026, a malicious version of litellm was published to PyPI. It contained a credential stealer in proxy_server.py. The version had no CVEs. Its risk_state was "none" because the attack was a malicious publish, not an exploitation of a known vulnerability. An agent checking only CVE risk would have approved the version as safe. The supply chain compromise signal, which is independent of CVE history, is what catches it. LiteLLM is not an incidental target. It is exactly the kind of AI orchestration package that autonomous agents depend on. The attackers know where autonomous systems are.
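The gap between the two checks is easy to show. In the sketch below, `risk_state` and its "none" value come from the text; the `supply_chain_compromised` field and both gate functions are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PackageSignals:
    risk_state: str                 # CVE-derived; "none" appears in the text
    supply_chain_compromised: bool  # hypothetical name for the second signal


def cve_only_gate(signals: PackageSignals) -> bool:
    """Approve whenever there is no CVE-derived risk."""
    return signals.risk_state == "none"


def combined_gate(signals: PackageSignals) -> bool:
    """Approve only if both independent signals are clean."""
    return signals.risk_state == "none" and not signals.supply_chain_compromised


# A malicious publish with no CVEs, as in the litellm case described above.
malicious_publish = PackageSignals(risk_state="none", supply_chain_compromised=True)
print(cve_only_gate(malicious_publish))  # True: approved as safe
print(combined_gate(malicious_publish))  # False: blocked
```

The two signals are kept as separate fields on purpose. Folding them into one score would let a clean CVE history mask a compromised release.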

In May 2026, CVE-2026-45321 was published affecting TanStack Router and related packages: 27 packages across the TanStack Router/Start/Solid family, all of which sit inside JS AI agent toolchains. attestd's automated OSV sweep detected the threat, expanded monitoring coverage to all 27 packages, and made structured signals available before any manual triage occurred. This is not an argument that automated security response is necessary. It is a demonstration that it already works at scale. The attack surface on autonomous toolchains is actively growing. Systematic automated response is the only viable answer.

Infrastructure that autonomous systems depend on is a high-value target. Erlang/OTP CVE-2025-32433, an unauthenticated remote code execution via SSH at CVSS 10.0, affects the exact runtime that underpins RabbitMQ, the message bus infrastructure that agentic orchestration systems depend on for coordination. The attack surface on autonomous infrastructure is not hypothetical. It is well-indexed and actively targeted.

04 / the prerequisite

Deterministic structured security inputs are not an optimization. They are the prerequisite.

If you want a machine to make the same correct decision every time it encounters the same security condition, it needs to receive that condition as a fact. Not as a passage of prose that it must interpret. Not as a CVSS score that it must translate into operational meaning. As a typed, structured fact: this version has a known unauthenticated RCE. This package version contains a backdoor. This product has no relevant CVEs in the current ingestion snapshot.
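What "typed, structured fact" might mean in code is sketched below. Only the `risk_state` field and its "none" value appear in the text. The other enum members, the `SecurityFact` fields, and the payload keys are assumptions, not attestd's published schema.

```python
from dataclasses import dataclass
from enum import Enum


class RiskState(Enum):
    NONE = "none"          # appears in the text
    HIGH = "high"          # hypothetical
    CRITICAL = "critical"  # hypothetical


@dataclass(frozen=True)
class SecurityFact:
    """An immutable, comparable statement about one product@version."""
    product: str
    version: str
    risk_state: RiskState
    snapshot: str  # identifier of the ingestion snapshot the fact came from


def parse_fact(payload: dict) -> SecurityFact:
    """Build a fact from a response payload (keys are assumed).

    RiskState(...) raises ValueError for any state outside the enum, so an
    unexpected value fails loudly instead of being interpreted.
    """
    return SecurityFact(
        product=payload["product"],
        version=payload["version"],
        risk_state=RiskState(payload["risk_state"]),
        snapshot=payload["snapshot"],
    )
```

An agent branching on `fact.risk_state is RiskState.CRITICAL` has nothing to interpret. The set of possible inputs is closed, which is the property prose cannot offer.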

attestd runs LLM extraction at ingestion time. It runs once, with structured metadata to cross-check against, and holds the output in a pre-computed synthesis result. When a query arrives, the response is a lookup against that pre-computed state. There is no generative step at query time. The same product@version pair always returns the same risk_state within the same ingestion snapshot. Call it once or a hundred times in rapid succession, and the answer does not flap. This is what deterministic structured input looks like in practice: the security condition, computed once from the best available data, served as a fact.
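The query-time half of that design reduces to a dictionary lookup. The sketch below assumes an in-memory snapshot; `SnapshotStore`, its method names, and the sample packages and states are hypothetical and say nothing about how attestd actually stores its results.

```python
from typing import Dict, Optional, Tuple


class SnapshotStore:
    """Serves pre-computed risk states for one ingestion snapshot."""

    def __init__(self, snapshot_id: str, facts: Dict[Tuple[str, str], str]):
        self.snapshot_id = snapshot_id
        # Copy so later changes to the caller's dict cannot alter answers.
        self._facts = dict(facts)

    def lookup(self, product: str, version: str) -> Optional[str]:
        """Pure lookup: no model call, no randomness, no network.

        Returns None when the pair is not in this snapshot."""
        return self._facts.get((product, version))


# Hypothetical snapshot contents, computed once at ingestion time.
store = SnapshotStore("snap-1", {
    ("example-lib", "1.0.0"): "none",
    ("example-lib", "0.9.0"): "critical",
})

# A hundred rapid queries for the same pair yield one distinct answer.
answers = {store.lookup("example-lib", "0.9.0") for _ in range(100)}
print(answers)  # {'critical'}
```

Answers change only when a new snapshot replaces the old one, so any change in an agent's decision can be traced to a change in the data rather than to variance in generation.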

The failure mode in the absence of this layer is not abstract. Consider an autonomous orchestration system managing its own RabbitMQ broker infrastructure, its own message bus used for coordination between agents. Erlang/OTP CVE-2025-32433 was published in 2025: unauthenticated remote code execution via the SSH daemon, CVSS 10.0. The system had no structured signal that this vulnerability was present in the runtime it was running. Not because the CVE was hidden. It was published, indexed, and fully enriched in NVD. But there was no layer providing that security condition as a fact the system could act on. The CVE existed. The runtime was vulnerable. The agent had no way to know.

The prerequisite argument is not that autonomous systems need better security tooling. It is that autonomous systems need security inputs in a format that autonomous systems can consume correctly. The existing tooling (scanners, advisories, CVE feeds) was built for humans. The interface needs to change, not just the speed of delivery.

05 / where attestd fits

The security perception layer for autonomous infrastructure.

attestd is not a developer productivity tool. It does not exist to save time compared to reading NVD directly. The argument for attestd is not that it makes security checks faster or easier. The argument is that reliable autonomous decision-making requires security conditions to arrive as structured facts, and that without a layer providing those facts, the available outcomes are all worse than wrong.

Without a structured security perception layer, an agent assessing a deployment either hallucinates a risk state from whatever prose it can retrieve, produces an assessment that varies on each run and cannot be relied upon across reasoning loops, or, most commonly, has no security context at all and proceeds. All three outcomes share the same failure mode: the system's behaviour is unreliable in the specific way that makes autonomous infrastructure untrustworthy. Not wrong in a way that can be caught and corrected, but inconsistent in a way that cannot be detected until something fails.

attestd sits at the layer where autonomous systems perceive security conditions about the software they run and depend on. It ingests NVD, CISA KEV, OSV, and PyPI and npm registry signals continuously. It synthesises those inputs into structured, deterministic facts. It makes those facts available via a single API call that returns the same answer every time for the same question. That is the architecture the trustworthy autonomous future requires. Not because it is a good idea, but because the alternative does not work.
