Make your Developer Portal Ready for AI Agents

Pronovix announces full support for the llms.txt standard with a Drupal Recipe

To simplify consumption of developer portals by Large Language Models (LLMs) and AI agents, Pronovix is adding full support for the llms.txt standard by open-sourcing a Drupal Recipe and two supporting modules. LLMs, AI agents, and web crawlers are everywhere, so making your site AI-readable, not just human-friendly, is essential. Think about it: a text packed with technical jargon is much harder to understand than one that helps you grasp the story it is trying to tell. The same is true for AI: when a model has to work through large amounts of irrelevant data, it uses up its context window.

A little context

LLMs are powerful tools. We all know they have limitations, but as a fellow technical writer said, it’s always nice to have an eager junior colleague by your side who knows a lot about everything (although not always accurately), never gets tired, and is always there when you need them. Why not make their job easier with your developer portals, too? This is where the llms.txt standard comes in. Jeremy Howard, the author of the article introducing the standard, summed up the issue perfectly, so I am just going to quote his description:

"Large language models increasingly rely on website information, but face a critical limitation: context windows are too small to handle most websites in their entirety. Converting complex HTML pages with navigation, ads, and JavaScript into LLM-friendly plain text is both difficult and imprecise."

The llms.txt file’s role in tackling this is simple yet significant. It serves as a crucial resource for LLMs by acting as a concise homepage and sitemap that aggregates key website content and directly links to detailed content pages, enabling LLMs to bypass less relevant areas like menus and archives. Written in Markdown format, the currently most easily processed format for LLMs, llms.txt provides brief background information, guidance, and pointers to more extensive Markdown files, ensuring both structured information for AI and readability for humans. By offering a collection of relevant page content, this approach significantly enhances how LLMs can extract knowledge from a website and generate useful, contextualized answers for users. Read more about the specification.
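
To make the structure concrete, here is a minimal Python sketch that renders an llms.txt file in the shape described above: a title, a short summary, and sections that list links to Markdown pages. The `Link` and `Section` classes, the function name, and the example URLs are all hypothetical and only serve the illustration; this is not how the Pronovix module builds the file.

```python
from dataclasses import dataclass, field


@dataclass
class Link:
    title: str
    url: str
    note: str = ""  # optional one-line description of the target page


@dataclass
class Section:
    heading: str
    links: list[Link] = field(default_factory=list)


def render_llms_txt(title: str, summary: str, sections: list[Section]) -> str:
    """Render a title, a blockquote summary, and link sections as Markdown."""
    lines = [f"# {title}", "", f"> {summary}", ""]
    for section in sections:
        lines.append(f"## {section.heading}")
        lines.append("")
        for link in section.links:
            entry = f"- [{link.title}]({link.url})"
            if link.note:
                entry += f": {link.note}"
            lines.append(entry)
        lines.append("")
    # End the file with exactly one newline.
    return "\n".join(lines).rstrip() + "\n"


if __name__ == "__main__":
    docs = Section(
        "Docs",
        [Link("Quickstart", "https://example.com/quickstart.md", "First API call")],
    )
    print(render_llms_txt("Example Devportal", "Docs for the Example API.", [docs]))
```

The output is plain Markdown, so the same file stays readable for a human who opens /llms.txt in a browser.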

Furthermore, the llms.txt standard proposes that all content pages be made available in Markdown at the same URL, but with a .md extension, to simplify content delivery to machines.
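
As a rough illustration of that URL convention, the Python sketch below derives the Markdown address of a page from its regular address. The function name is hypothetical, and the handling of URLs that end in a slash (appending index.html.md) is an assumption based on our reading of the proposal; a given site or module may resolve such URLs differently.

```python
from urllib.parse import urlsplit, urlunsplit


def markdown_url(page_url: str) -> str:
    """Return the URL where the Markdown version of a page is expected."""
    parts = urlsplit(page_url)
    path = parts.path or "/"
    if path.endswith("/"):
        # Assumption: URLs without a file name get an index file name.
        path += "index.html.md"
    elif not path.endswith(".md"):
        path += ".md"
    # Keep the query string and fragment untouched.
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))


if __name__ == "__main__":
    print(markdown_url("https://example.com/docs/getting-started"))
    print(markdown_url("https://example.com/docs/"))
```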

The LLM Support Recipe

In the LLM Support Recipe, we combine three Drupal modules to achieve full llms.txt standard support:

  1. The /llms.txt module provides support for authoring a full-featured /llms.txt file.
  2. The Markdownify module enables output of Drupal entities as Markdown-formatted text.
  3. The Markdownify File Attachment submodule enables output of API specification files as code blocks inside Markdown to provide full context for LLMs when it comes to API-related content.

Pronovix’s llms.txt Drupal module

In light of the above, what’s in Pronovix’s /llms.txt module? This module aims to provide an advanced solution to manage the content of the llms.txt file. What it does:

  • Provides a configurable and custom entity type to manage the llms.txt file’s header and other sections in an editable format.
  • Adds menu tokens that enable users to embed dedicated menus in a machine readable Markdown format.
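
To give an idea of what a menu in machine-readable Markdown can look like, here is a small Python sketch that turns a nested menu into an indented list of links. The `MenuItem` class, the function name, and the sample menu are hypothetical; the module itself is a Drupal module, and its token output may be shaped differently.

```python
from dataclasses import dataclass, field


@dataclass
class MenuItem:
    title: str
    url: str
    children: list["MenuItem"] = field(default_factory=list)


def render_menu(items: list[MenuItem], depth: int = 0) -> list[str]:
    """Return one Markdown list line per menu item, indented by nesting depth."""
    lines = []
    for item in items:
        lines.append(f"{'  ' * depth}- [{item.title}]({item.url})")
        lines.extend(render_menu(item.children, depth + 1))
    return lines


if __name__ == "__main__":
    menu = [
        MenuItem(
            "APIs",
            "https://example.com/apis",
            [MenuItem("Train Travel API", "https://example.com/apis/train-travel.md")],
        ),
        MenuItem("Guides", "https://example.com/guides"),
    ]
    print("\n".join(render_menu(menu)))
```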

And what makes it different from other similar modules? Pronovix’s llms_txt Drupal module has advanced caching and content management features, such as section reordering and an editable view of each descriptive section.

Markdownify module

To further support the proposed llms.txt standard, the Markdownify module can serve all your site’s content in a machine-readable format at its original URL with a .md extension appended, so that you don’t have to worry about large HTML contexts.

While modern LLMs can ingest and parse raw HTML, Markdown offers significant advantages:

  • Cost efficiency: AI services charge based on token usage, typically priced per million tokens. In our testing, Markdown reduced the token count by an 18:1 ratio compared to HTML, leading to lower costs.
  • Faster processing: The reduction in token count results in quicker processing of your content in Markdown format.
  • Zero distractions: The Markdown output omits headers, footers, ads, and other irrelevant output, providing a clean and concise context for AI models.
  • Universal format: Markdown is widely supported and easily understood by AI models, making it the lingua franca for structured text.
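
To see what an 18:1 reduction can mean in practice, here is a back-of-the-envelope calculation in Python. Only the ratio comes from our testing; the token count and the price per million tokens are made-up figures for illustration, so substitute your own numbers and your provider's actual pricing.

```python
# Hypothetical figures for illustration; only the 18:1 ratio comes from the article.
HTML_TOKENS = 900_000          # assumed token count of a set of pages served as HTML
HTML_TO_MARKDOWN_RATIO = 18    # reduction ratio reported above
PRICE_PER_MILLION_USD = 3.00   # assumed input price per million tokens


def estimated_cost(tokens: int, price_per_million: float) -> float:
    """Return the cost of processing `tokens` at a per-million-token price."""
    return tokens / 1_000_000 * price_per_million


markdown_tokens = HTML_TOKENS // HTML_TO_MARKDOWN_RATIO
html_cost = estimated_cost(HTML_TOKENS, PRICE_PER_MILLION_USD)
markdown_cost = estimated_cost(markdown_tokens, PRICE_PER_MILLION_USD)

print(f"HTML:     {HTML_TOKENS:>7} tokens -> ${html_cost:.2f}")
print(f"Markdown: {markdown_tokens:>7} tokens -> ${markdown_cost:.2f}")
```

With these assumed inputs, the same content costs $2.70 to process as HTML and $0.15 as Markdown.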

Markdownify File Attachment submodule

Because a core content type on developer portals is a page with an attached API specification file, it is important to provide complete context for API documentation by including the text of the API specification with the Markdown output.

Hence, we wrote and are open-sourcing a filefield submodule to Markdownify that renders out attached text files. The assumption is that an LLM can easily parse the YAML or JSON text of an API specification, especially when given the context of what the attached file is.

A quick example:

# Train Travel API
API for finding and booking train trips across Europe.
Attached file traintravel.yml with yml extension available at https://example.com/files/default/train-travel.yml follows:
```
openapi: 3.1.0
info:
  title: Train Travel API
  description: API for finding and booking train trips across Europe.
  version: 1.0.1
  contact:
    name: Train Support
    url: https://example.com/support
(and so on)
```

The result is a simple text file that an LLM can read and use. It embeds the API specification at the location where the overall documentation for this API resides, and therefore provides complete information and context exactly where needed.
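
The idea behind this output can be sketched in a few lines of Python. This is not the submodule's code (the submodule is a Drupal module); the function names are hypothetical, and the intro line simply mirrors the wording of the example above. The fence helper is an extra precaution we assume is useful: it picks a fence longer than any run of backticks in the file, so an attachment that itself contains code fences does not end the block early.

```python
import re
from pathlib import PurePosixPath


def pick_fence(text: str) -> str:
    """Return a backtick fence longer than any backtick run inside `text`."""
    longest = max((len(run) for run in re.findall(r"`+", text)), default=0)
    return "`" * max(3, longest + 1)


def render_attachment(file_name: str, file_url: str, file_text: str) -> str:
    """Render an attached text file as an intro line plus a fenced code block."""
    extension = PurePosixPath(file_name).suffix.lstrip(".")
    intro = (
        f"Attached file {file_name} with {extension} extension "
        f"available at {file_url} follows:"
    )
    fence = pick_fence(file_text)
    return "\n".join([intro, fence, file_text.rstrip("\n"), fence])


if __name__ == "__main__":
    spec = "openapi: 3.1.0\ninfo:\n  title: Train Travel API\n"
    print(render_attachment(
        "train-travel.yml",
        "https://example.com/files/default/train-travel.yml",
        spec,
    ))
```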

We expect the same to hold on other websites with attached text files, so the file attachment module should be helpful there as well.

Conclusion

The Pronovix LLM Support Drupal Recipe addresses the crucial need for websites to be AI-accessible, particularly for the nascent category of AI agents. By creating the structured llms.txt file and providing additional context with links, Drupal site owners can make their websites AI-ready. Coupled with the Markdownify module, this solution reduces token usage, accelerates processing, and enhances the clarity of information provided to LLMs. Empower your Drupal developer portal to speak fluently with AI today!

Need help with adding full LLM support to your developer portal? Let's talk about your case!

All Pronovix publications are the fruit of a team effort, enabled by the research and collective knowledge of the entire Pronovix team. Our ideas and experiences are greatly shaped by our clients and the communities we participate in.

Dávid is a Technical Writer at Pronovix. Before joining the company, he worked as a freelance translator and interpreter. He is enthusiastic about bridging gaps and strongly believes that communication is essential to that, be it a subtitle for your favorite series or API documentation.

Dezső is the Chief Technology Officer at Pronovix. He wanted to have a computer from a very young age, not for playing games, but to do programming and other cool stuff. He started learning web programming in high school, where he met his mentor László Csécsy (boobaa), who introduced him to Drupal. He earned a BSc degree in Business Information Technology and later an MSc degree in Software Engineering at the University of Szeged in Hungary. Thanks to his enthusiasm for computers and programming, he is always ready to improve his skills and can quickly learn new languages and technologies.

Christoph is a creative and versatile technical leader who likes to present complex subjects in plain English. He delivers optimized solutions that draw from his extensive experience managing demanding computing projects and partnering with stakeholders of all stripes. He is also a regular speaker at technical events, and in his spare time builds furniture that aligns with his penchant for simplicity.
