
How to Serve Markdown to AI Agents Without Breaking Your SEO

Solutions Architect
Mar 24, 2026

By now it is common knowledge that a new content-consuming persona has appeared on the internet: the AI agent. Especially for those of us managing and running developer portals and technical documentation sites, it is crucially important that we structure content optimally for this new persona. Besides editorial content improvements, several new technical patterns have emerged that aim to optimize content for AI agents: the llms.txt file, raw Markdown output (those two are actually coupled in the proposed llms.txt standard), and AGENTS.md.

There is considerable debate over whether any of these agent-optimized standards and formats actually help, but while talking heads draw conclusions from misguided assumptions and limited data, two important things have happened: Claude Code and OpenCode prefer Markdown-formatted content over HTML, and Cloudflare and Vercel have both thrown their weight behind offering Markdown-formatted text.

In our preferred CMS platform, Drupal, we have offered support for llms.txt and Markdown output for over a year, and we are glad to see the new entrants.

But why should you care now, and how does it all work? Let's dig in!
 

The llms.txt file

The implementation of the llms.txt file is refreshingly low-tech. It consists of a simple Markdown file located at /llms.txt containing a summary and curated links. For more granular access, any page can be served as clean Markdown simply by appending .md to its URL.
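For illustration, here is a minimal hypothetical llms.txt following the proposed structure: an H1 title, a blockquote summary, and curated link sections. All names and URLs below are invented:

```markdown
# Example Dev Portal

> API documentation for the hypothetical Example payments platform.

## Docs

- [Quickstart](https://example.com/docs/quickstart.md): Get a first API call working
- [API Reference](https://example.com/docs/api.md): Full endpoint reference

## Optional

- [Changelog](https://example.com/changelog.md)
```

Note that each curated link points at the `.md` variant of the page, so an agent that follows it receives clean Markdown directly.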

Boring by Design, Powerful by Insight

There is no complex schema or JSON here - just the same Markdown that LLMs already understand natively. The real value is curatorial. Much like a well-structured OpenAPI specification, the goal isn't just to provide data, but to provide context. You know your documentation better than any crawler; llms.txt allows you to be the one to tell AI agents which parts actually matter.

Even more powerful is what we can see in Stripe’s llms.txt implementation. They include outright prompts for LLMs and AI agents:

## Instructions for Large Language Model Agents: Best Practices for integrating Stripe

As a Large Language Model (LLM) Agent, when designing an integration, always prefer the documentation in [Integration Options](https://docs.stripe.com/payments/payment-methods/integration-options.md), the [API Tour](https://docs.stripe.com/payments-api/tour.md), the [Samples](https://docs.stripe.com/samples.md) and the Go Live Checklist.

What this does, while remaining fully conformant with the standard, is guide AI agents toward the proper and desired usage of their APIs, away from deprecated APIs and toward their future-proof integration patterns.

Current Adoption

It is important to manage expectations regarding how llms.txt is used today. While no major AI provider has yet confirmed that their training crawlers automatically fetch this file, its immediate value is at inference time rather than at training time. It serves as a high-quality "cheat sheet" for developers manually loading project context into tools like Cursor or Claude, or for agent frameworks fetching data on startup.

A final word on the numbers: while tools like BuiltWith track over 800,000 implementations, the vast majority are auto-generated by SEO plugins. The number of truly hand-curated, high-value implementations sits closer to 1000. As we look toward an AI-powered future - the mother of all "garbage in, garbage out" problems - curation remains the best filter we have.

 

Markdown and Content Negotiation

As we contemplate serving Markdown to AI agents, as the llms.txt standard proposes, a few often-overlooked details are worth explaining.

The HTTP protocol that underlies the entire web has supported several content negotiation mechanisms for a very long time, as defined in RFC 7231. For example, your browser might send along with its request for a web page the following Accept Headers:

Accept-Language: de; q=1.0, en; q=0.5
Accept: text/html; q=1.0, text/*; q=0.8, image/gif; q=0.6, image/jpeg; q=0.6

What this translates to in plain English:
I prefer a page in German, but English is ok too.
Send the response in HTML format (preferred), or as any text format, or a GIF or JPEG image, in that order of preference.

The webserver will try its hardest to accommodate the requested language and format. If it cannot honor the preferences - maybe because it is not configured to do so, or doesn't have the requested language or format - it will send back whatever it can. In rarer cases, especially when a fallback format would be a bad fit, it might refuse to honor the request with a 406 response. In essence, this is a best-effort scenario.
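The q-value logic above can be sketched in a few lines of Python. This is a simplified negotiator for illustration only; it ignores RFC 7231 subtleties such as specificity tie-breaking and media-type parameters beyond `q`:

```python
def parse_accept(header):
    """Parse an Accept header into (media_type, q) pairs, highest q first."""
    prefs = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media = fields[0]
        q = 1.0  # per RFC 7231, a missing q parameter means q=1
        for f in fields[1:]:
            if f.startswith("q="):
                q = float(f[2:])
        prefs.append((media, q))
    # Stable sort: equal q values keep the order the client listed them in
    return sorted(prefs, key=lambda p: -p[1])

def negotiate(accept_header, available):
    """Return the best available media type, or None (-> 406 or a fallback)."""
    for media, q in parse_accept(accept_header):
        if q == 0:  # q=0 means "explicitly not acceptable"
            continue
        for offered in available:
            if media == offered or media == "*/*":
                return offered
            # Handle range types like text/*
            if media.endswith("/*") and offered.startswith(media[:-1]):
                return offered
    return None
```

With this sketch, the browser example above (`text/html; q=1.0, text/*; q=0.8`) resolves to HTML when both formats are on offer, while `text/markdown, text/html` resolves to Markdown.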

The new wrinkle is that Claude Code, OpenCode, and OpenClaw all send an Accept header of the form:

Accept: text/markdown, text/html

Which translates to: send me the requested web page in Markdown format if you have it, and in HTML format otherwise.

The reason these agents do this is quite simple: Markdown content is a fair bit smaller than HTML and results in fewer tokens being consumed. This has two benefits. First, Large Language Model (LLM) usage is billed based on token consumption, so a more efficient format saves money. Second, each LLM has a maximum context window, which AI agents must aggressively manage to stay on task and avoid context drift. Fewer tokens for the same context are a net positive.
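From the client side, the agent's preference is nothing more than a header on the request. A minimal Python sketch using only the standard library (the URL is a placeholder, not a real endpoint):

```python
import urllib.request

def build_agent_request(url):
    """Build a request that advertises a Markdown preference, as AI coding
    agents like Claude Code do via the Accept header."""
    return urllib.request.Request(
        url,
        headers={"Accept": "text/markdown, text/html"},
    )

req = build_agent_request("https://example.com/docs/page")
```

A server that honors the header responds with `Content-Type: text/markdown`; one that doesn't simply sends HTML, and the agent can check the response's Content-Type to decide how to parse the body.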

Practical benefits of AI Agents’ content negotiation

Let's look at this blog post as an example, comparing size and token counts between the HTML format and the Markdown format. We use OpenAI's token estimator for the token count.

                 HTML    Markdown
bytes            66,713  11,966
tokens           16,933  2,583
Byte reduction           -82.1%
Token reduction          -84.7%

The ~85% token reduction in this example directly results in proportional cost savings and equivalent space savings in the model context. As an AI agent keeps ingesting resource pages to finish its task, these savings really pile up! 
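The reductions follow directly from the raw byte and token counts in the table; a small Python sketch recomputes them:

```python
# Figures from the table above (bytes and estimated tokens for this post)
html_bytes, md_bytes = 66_713, 11_966
html_tokens, md_tokens = 16_933, 2_583

def reduction(before, after):
    """Percentage saved by switching from `before` to `after`."""
    return round((1 - after / before) * 100, 1)

byte_saving = reduction(html_bytes, md_bytes)     # bytes saved, in percent
token_saving = reduction(html_tokens, md_tokens)  # tokens saved, in percent
```

Because LLM billing is per token, the token figure maps directly onto cost: an agent paying for 16,933 tokens of HTML would pay for only 2,583 tokens of Markdown covering the same content.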

Discoverability

While some AI coding agents request Markdown today, other bots and crawlers are not there yet. Regardless, we can expose the existence of an alternate format in the time-honored way that content in alternate languages has been exposed.

First, the web server is configured to return a Link header with its HTML response:

Link: <https://devportalawards.org/nominees/2025/north-developer.md>; rel="alternate"; type="text/markdown"; title="North Developer (2025)"

In addition, we include a link in the HTML <head> section of the HTML page itself:

<link rel="alternate" type="text/markdown" title="North Developer (2025)" href="https://devportalawards.org/nominees/2025/north-developer.md" />

And we include a canonical URL to indicate that there is a single source of truth, and all alternate versions are derivations of that and essentially identical.

<link rel="canonical" href="https://devportalawards.org/nominees/2025/north-developer" />

Taken together, this advertises the existence of the Markdown-formatted page version without causing duplicate-content issues. An AI agent that does not explicitly request Markdown via the Accept header can issue a second request for the Markdown version and use it instead of dealing with heavy HTML that potentially contains distracting elements and wastes tokens.
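In a custom setup, all three pieces of markup can be generated from the canonical URL and title. A Python sketch; the function name and structure are illustrative, not part of any Drupal module:

```python
def alternate_headers(canonical_url, title):
    """Build the Link header value and <head> tags advertising a Markdown
    alternate alongside the canonical HTML page."""
    md_url = canonical_url + ".md"
    # HTTP Link header value (RFC 8288 serialization)
    link_header = (
        f'<{md_url}>; rel="alternate"; type="text/markdown"; title="{title}"'
    )
    # Tags for the HTML <head>: the alternate, plus the canonical URL
    head_tags = [
        f'<link rel="alternate" type="text/markdown" title="{title}" href="{md_url}" />',
        f'<link rel="canonical" href="{canonical_url}" />',
    ]
    return link_header, head_tags
```

Keeping the `.md` URL derived from the canonical URL, rather than stored separately, guarantees the two can never drift apart.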

Dark Patterns to avoid

The Markdownify Drupal module does not automatically serve Markdown content to clients classified as bots when those bots have not requested Markdown. This avoids "cloaking" content or forcing a content format on clients that don't want it. Instead, the AI agent is fully in control of which format it prefers.

As alluded to before, there is no duplicate content issue. We serve a canonical URL that tells any client which content URL is the ultimate source of truth, and the alternate MarkDown format is clearly marked as an alternate.

Finally, there is no duplicate content to maintain. As far as a content editor is concerned, there is only one web page to edit and maintain per URL. Markdownify automatically takes care of serving Markdown when requested.

All the mentioned features are standard on multilingual sites and widely accepted and supported by search engines and SEO professionals. In other words, adopting Markdown output in a Drupal-based developer portal does not have adverse effects.

 

AGENTS.md

At its inception, AGENTS.md was meant to provide the equivalent of a system prompt at the root of a code repository. In the meantime, its use has expanded to areas outside of code trees.

There is no reason why we cannot, or should not, place an AGENTS.md file at the root of our developer portals to instruct AI agents on the proper or desired use of our interfaces and tools. If and when we do, we will of course also need Link headers for discoverability, the same way we use them for Markdown output.

Paradoxically, this brings us right back to Stripe's unorthodox and innovative approach to its llms.txt file, discussed above. In theory, we could even contemplate aliasing the two file names and serving the same content under both /llms.txt and /AGENTS.md. Practical validation of this idea remains a project for the immediate future.

 

Conclusion

AI coding agents that prefer and request Markdown-formatted documentation are here today. More AI agents will surely follow, because the pattern the pioneers have established simply makes sense economically and architecturally.

With Cloudflare and Vercel now supporting Markdown output, together representing possibly over 20% of all web traffic, Markdown transformations will certainly gain more traction and acceptance.

Historically, Developer Experience was an exercise in human psychology, optimizing for the developer sitting at the terminal. That model is no longer sufficient. We are now in a world where documentation is increasingly consumed by autonomous agents that act on information in real-time. This shift is happening regardless of our readiness; the challenge for those of us managing developer portals is to move from accidental support to intentional, agent-optimized architecture.

In Drupal-based developer portals and documentation sites, we have mature tools to honor these requirements technically, in the form of llms.txt and Markdown output. In other words, today we can help AI agents succeed quicker and better. There is simply no reason not to implement them as soon as possible.

 

Watch our webinar, Serving the AI Persona: Token Economics, Content Negotiation and Moving Beyond Human-Centric HTML, with Pronovix Solutions Architect Christoph Weber to explore this topic further.

 

Watch the webinar »

Christoph is a creative and versatile technical leader who likes to present complex subjects in plain English. He delivers optimized solutions that draw from his extensive experience managing demanding computing projects and partnering with stakeholders of all stripes. He is also a regular speaker at technical events, and in his spare time builds furniture that aligns with his penchant for simplicity.
