Building on our insightful conversations, such as those at the AI The Docs conference, and drawing from our project experiences, we bring you a white paper on AI-ready developer portals.
If you’d like to stay informed about our latest publications, podcasts, and upcoming events, subscribe to the Pronovix Developer Portals Newsletter.
Introduction
From API catalogs to AI-ready portals
When we started our journey with developer portals, they were little more than browsable API catalogs and the audience was mostly developers. For this reason, documentation was technical, minimal, and often written by engineers as a side task. However, the audience and business needs changed, prompting a broader content scope and an evolved user experience.
As portals changed, content such as guides, tutorials, and use cases became essential players: developer portals transformed into self-service ecosystems for onboarding users, supporting product adoption, and driving business outcomes.
The rise of generative AI tools (such as agents) has introduced new challenges and opportunities for developer portals. As these systems become integral to how developers search for, access, and interact with technical information, organizations need to rethink how they structure and maintain their content.
To support AI-driven developer experiences, portals need to evolve from static content repositories into dynamic, machine-consumable sources of truth. Preparing for this shift requires not just technical adjustments, but a strategic re-evaluation of content governance, metadata practices, and cross-functional collaboration.
About the white paper and who it is for
As part of our commitment to knowledge-sharing, we created this white paper to explore how developer portals can support the effective and responsible use of generative AI tools. It addresses key questions around organizational and technical to-dos, offering guidance on how to improve your developer portal’s AI-readiness.
In this paper, we relied on our teams’ experience and expertise to offer you a holistic approach from theory to actionable steps. The white paper is intended for professionals involved in the strategy, design, and maintenance of developer portals and API ecosystems, particularly those exploring (and experimenting with) how AI will impact their work. The scope mostly entails downstream content creation and focuses on LLMs and generative AI agents.
The white paper is especially relevant for:
- Developer Portal Owners and Product Managers looking to align their portals with AI-driven developer expectations.
- Technical Writers and Content Strategists aiming to structure content that serves both human users and AI agents.
- API and Platform Architects seeking to understand how documentation and metadata can support machine interoperability.
- User Experience Specialists working to enhance the discoverability, usability, and reach of their API products.
Regardless of your journey with generative AI tools, this white paper offers practical insights and a framework to guide your next steps (outlined in the first few chapters). You can use the theoretical background (second chapter) as a foundation to help decision-makers understand why content audits, cross-team alignment, and high-quality data are essential for building AI-ready developer portals. If you are looking for a reliable, future-ready solution, in the final chapter, we explore how Drupal’s open-source architecture, robust content modeling, and its other capabilities align with the evolving needs of AI-driven developer experiences.
Table of contents:
- Practical steps toward AI-readiness
- AI-readiness in theory
- Generative AI and Content Management Systems (CMS) - Drupal’s capabilities
- Search Engine Optimization (SEO), Answer Engine Optimization (AEO) and Generative Engine Optimization (GEO)
- Conclusion and resources

About Pronovix Developer Portals
We build scalable, extensible developer portals by combining resilient solutions, managed services, and hands-on consultancy. Our approach is grounded in knowledge-sharing and consulting, and we offer a range of services that can be used independently or in combination with the Zero Gravity platform.
For each project, we assemble a team of specialists (developers, UX researchers/designers, technical writers, and more), tailored to our customer’s specific needs and challenges. We collaborate with organizations across a variety of industries, which has allowed us to gain broad experience in developer portal strategy and implementation.
Authors and main contributors

Practical steps toward AI-readiness
The foundation: cross-team collaboration
Structured content and well-written, rich data are more important than ever. However, while developer portals have come a long way, we still often see teams (like those of technical writers or UX specialists) working in silos behind the scenes. In our experience, these disconnects become even more pronounced when AI comes into the picture.
Design without content is hollow and content without structure is invisible. Both fall short unless usability and information architecture are treated as shared responsibilities. For example, challenges arise when UX designers create intuitive flows without accounting for content requirements or when technical writers try to produce clear, relevant documentation without insight into user personas and their journeys. Without close collaboration, even well-executed work can result in a fragmented developer experience.
When setting up the information architecture of a developer portal, our UX team aims to reflect the company’s unique context: its business goals, user expectations, and long-term strategy. Our technical writers play a critical role in making content well-written, accessible, and logically structured: foundations that are increasingly vital for enabling meaningful interaction.
As you will see soon in the example of a global financial services provider we worked for, missing metadata, a lack of clear structure, and inaccessible legacy documents can prevent AI tools from retrieving reliable, up-to-date information.
Collaboration between UX and Technical Writer specialists
What might a fruitful collaboration look like? At Pronovix, we focus on the following areas to ensure that both UX and technical writer specialists bring their expertise:
When our UX and technical writer teams join forces, we ensure better outcomes. Even if our teams are not in the same meetings, we still work from a shared roadmap. For UX designers, it is crucial to involve the technical writers early, and the other way round: when you are creating content, make sure you check the personas the UX specialists set up and which user requirements need to be fulfilled. In other words, collaboration from the start is key. Long-term success is impossible if specialist teams are stuck in silos.
AI as an intermediary user
Speaking of personas, AI brought changes in this practice too. We use personas to translate complex user behavior into clear, actionable insights that support more focused, user-centered planning. Traditionally, we focused on human users, but now generative AI is starting to participate in the user journey as a persona in its own right. It gathers and interprets content to generate answers, recommendations, or insights for others. In this way, AI becomes a kind of intermediary user: it consumes to provide.
This shift challenges us to rethink our approach. To serve this new user well, we must ensure that portals are structured, annotated, and interconnected in ways that AI can understand and use effectively. When we treat AI as a persona, we can list its characteristics, needs, typical behavior, and the way it will interact with the portal.
A persona card can help us understand if the decisions we make around organizing the sitemap and setting up content will benefit or hinder users. In this case, the user is AI:
Designing for generative AI
The output provided by any generative AI is only as good as the structure and content it relies on.
AI should be able to:
- consume content through a predictable, clear structure and accurate information,
- recognize and find relationships, dependencies, and semantic markers, and
- follow an accessible layout and content (with the help of alt text or captions).
We cannot just add AI as something new to the portal and expect it to interpret data and documentation as a human would; we need to ensure that what we have is prepared for it. For this reason, AI becomes part of the whole user experience design: UX and technical writer teams need to align more than ever to build developer portals that are helpful to human users and interpretable by machines, laying the foundation for AI-driven services.
Practical steps toward AI-readiness (continued)
Five steps to be prepared for generative AI tools
Step 1: Apply the Pareto Principle - Prioritize high-impact content
We recommend starting with the 20% of your content that drives 80% of user interactions and AI value. With thousands of content items already in place, determining what truly matters can quickly become overwhelming. For this reason, we have a suggestion for you.
Focus first on the minimum viable documentation:
- API reference: Use structured format standards, like OpenAPI for REST APIs, to expose endpoints and schema in a machine-readable way.
- Onboarding guides: Break down workflows into clear, step-by-step instructions. Visual elements, however helpful to some human user personas, need descriptive alt text and consistent internal links.
- Tables and diagrams: Use machine-readable formats; do not add them as images.
- API overviews: Provide plain-language summaries of API products, focusing on capabilities and business value for non-technical users.
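To make the API reference point concrete, here is a minimal OpenAPI 3 fragment for a single endpoint. The product name, path, and fields are purely illustrative; a real specification would cover every endpoint and schema:

```yaml
openapi: 3.0.3
info:
  title: Best Product API   # illustrative product name
  version: 1.0.0
paths:
  /orders/{orderId}:
    get:
      summary: Retrieve a single order
      parameters:
        - name: orderId
          in: path
          required: true
          schema:
            type: string
      responses:
        "200":
          description: The requested order
          content:
            application/json:
              schema:
                type: object
                properties:
                  id:
                    type: string
                  status:
                    type: string
```

Because the endpoint, parameters, and response schema are declared explicitly, both documentation tooling and AI systems can consume them without parsing prose.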
Step 2: Improve documentation quality with a targeted content audit
Before automating anything, audit your existing content. Companies typically have more information documented than they presume, but in worse shape than they expect.
Assess:
- Accuracy and relevance
- Usage patterns (via site analytics or support data)
- Gaps in structure or clarity
One example that highlights the importance of solid content foundations comes from our collaboration with a global financial services provider. Like many large enterprises, the Company maintained documentation across multiple disconnected systems (which are a natural evolution in many cases). Over time, this led to inconsistent content, outdated versions, and poor discoverability.
Despite these underlying content challenges, the Company invested in an advanced search interface with filters, categories, and drill-down features. Their CMS supported structured content and metadata, and technically, the building blocks were in place. But in practice, the search experience remained frustrating: critical information was missing from results, and AI-powered tools (like retrieval-augmented generation [RAG]) failed to deliver accurate or helpful answers.
When the Company reached out to us and we looked more closely, it became clear that the root cause was not the technology, it was the content. Key gaps included:
- Incomplete or missing metadata
- Inconsistent information architecture
- Broken links disrupting content relationships
- Crucial information trapped in PDFs stored outside the system
This is where a targeted content audit makes a difference, and Pronovix’s technical writers can help you avoid long-term, costly issues. The scope of the audit was limited to a selected set of documentation, requiring roughly one full workday from our team. By systematically evaluating the structure, quality, and accessibility of documentation, a content audit identifies what needs to be fixed or improved before investing further in tooling or AI solutions. It is an affordable and scalable way to lay the groundwork for both human- and machine-readable content. In the Company’s case, it would have meant a clearer path to usable search, effective AI integration, and ultimately a better developer experience from the get-go.
Step 3: Strengthen content hierarchy and metadata
A strong information architecture is essential for AI to navigate your content ecosystem (we will return to this later in more detail). As an actionable step, we recommend focusing on:
- Consistent and straightforward URL structures to reflect hierarchy and topic groupings
- Parent-child relationships between content types (e.g., concept → tutorial → reference)
- Rich metadata including tags, categories, versions, and supported features
Some examples of parent-child relationships in URLs:
- Won’t be effective:
  - www.zero-gravity.org/node/32
  - www.zero-gravity.org/bpapi-tutorial
- Effective:
  - www.zero-gravity.org/apis-for-the-universe
  - www.zero-gravity.org/api-catalog/best-product-api/tutorial
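To illustrate what rich metadata can look like in practice, here is a hypothetical front-matter block for a tutorial page. The field names are illustrative, not a specific Drupal schema; the point is that version, hierarchy, and topic information live alongside the content:

```yaml
# Hypothetical metadata for a tutorial page; field names are illustrative.
title: "Best Product API tutorial"
content_type: tutorial
parent: /api-catalog/best-product-api   # parent-child relationship
api_version: "2.1"
tags: [payments, onboarding, REST]
last_reviewed: 2025-01-15
audience: [developer, ai-agent]
```

Metadata like this lets both search facets and AI retrieval pipelines filter by product, version, and topic instead of guessing from prose.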
Drupal’s structured content model and flexible taxonomy system make it easier to enforce consistent writing practices and manage reusable content elements across teams. With built-in support for metadata, versioning, and multilingual content, Drupal enables organizations to create machine-consumable documentation that serves both human users and AI systems. By aligning your editorial workflows with Drupal’s capabilities, you can streamline collaboration between writers, developers, and product teams. You can find more information about Drupal’s capabilities in a later chapter.
Step 4: Make content discoverable - Internally and for AI
Well-structured internal and cross-linking boosts usability and AI comprehension. Use:
- Manual linking to connect related topics, API references, and tutorials.
- Automated linking tools like Drupal’s SmartLinker AI to scale internal linking consistently.
Good linking not only guides users, it helps AI models understand topical clusters, which improves their ability to retrieve, summarize, and contextualize content correctly.
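As a sketch of how links encode topical clusters, the snippet below extracts Markdown links from a tiny toy corpus and builds a link graph. The page names and bodies are invented for illustration; a real portal would crawl its own content:

```python
import re
from collections import defaultdict

# Toy corpus: page name -> Markdown body (invented for illustration).
pages = {
    "orders-tutorial": "See the [Orders API reference](orders-reference) and "
                       "[authentication guide](auth-guide).",
    "orders-reference": "Start with the [Orders tutorial](orders-tutorial).",
    "auth-guide": "Tokens are covered in the [Orders API reference](orders-reference).",
}

LINK = re.compile(r"\[[^\]]+\]\(([^)]+)\)")  # matches [text](target)

def link_graph(pages):
    """Map each page to the set of internal pages it links to."""
    graph = defaultdict(set)
    for name, body in pages.items():
        for target in LINK.findall(body):
            if target in pages:          # keep internal links only
                graph[name].add(target)
    return dict(graph)

graph = link_graph(pages)
# Pages that link to each other form a topical cluster that retrieval
# systems can exploit when ranking and contextualizing results.
assert graph["orders-tutorial"] == {"orders-reference", "auth-guide"}
```

Even this trivial graph shows why consistent linking matters: an AI retriever that follows these edges can pull the tutorial, reference, and auth guide together as one topic.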
Beyond linking, organizations now try to leverage AI to make content easier to find, while also shaping an experience that encourages exploration and smoother workflows. We expect to see developments in areas such as:
- Cognitive Search: Search tools powered by natural language processing can interpret developer intent, not just the keywords. For instance, a query like "recommendation system" could surface relevant APIs, documentation, and tutorials related to machine learning-based solutions.
- Content Personalization: By adapting what content is displayed based on a developer’s browsing behavior or previous interactions, portals can offer a more targeted experience. While not a new concept, generative AI opens new frontiers for relevance and responsiveness.
- Predictive Assistance: Machine learning can suggest tools, integrations, or next steps to help developers discover new possibilities they did not know to look for.
Of course, some of these are still only possibilities, but with a well-prepared foundation, companies can make them a reality.
Step 5: Convert legacy content to structured formats
Legacy documentation often lives in PDFs, Word files, or unstructured HTML. In our experience with the global financial services provider, this was one of the main issues. Files in these formats can be expensive for AI to process and difficult to extract meaning from, and AI-based search engines cannot access external PDF content by default.
If the content must live externally (for example for compliance reasons), then:
- Periodically import or synchronize the key external PDFs into your portal
- Extract and index their content in a structured way
- Present users with summarized or previewed content with the option to open the full PDF
In summary, conversion is not just about the format, it also highlights underlying quality issues. Fixing those will enhance both human and machine experiences.
Conclusion
Your AI readiness depends on more than technology: it needs accurate, structured, and semantically rich content. In the past few chapters, we offered practical steps toward your goals. If you need help or you are unsure about how to execute these, we are here to help you write for humans and structure for machines, aligning UX, technical writing, and architecture to support the next generation of developer experience.
Pronovix’s UX team has developed a dedicated AI user persona card to help teams plan and organize content with AI consumption in mind. By defining AI’s behavioral traits, content needs, and typical interaction patterns, we guide organizations to make decisions that benefit both human and machine users.
We can also offer guidance with a content audit, as it might be overwhelming for teams that have several more urgent tasks. Our technical writers provide affordable, targeted content audits that uncover the root causes of broken user and AI experiences. From missing metadata to inaccessible PDFs, we identify and help fix foundational issues before you invest in AI tooling.
Ready to gauge your platform’s AI readiness? Talk to us about our content audit or how we can support your journey toward a future-proof developer portal. With the practical steps in place, we can now turn to the rationale behind them and explore their broader significance.
AI-readiness in theory
The rise of LLMs and generative AI
LLMs and generative AI are built to mimic human communication, which makes them impressive but not necessarily factual. They are similar to human experts who generate approximate answers first, and turn to authoritative sources (such as search engines or calculations) when precision is needed.
This reflects a bi-modal information system:
- A generative mode for exploring ideas and producing initial responses.
- An authoritative mode for delivering verified, factual answers.
It is important not to confuse the two. LLMs sound convincing, but they are not designed for accuracy; they are designed to predict plausible language. That is why we need to interact with AI systems in a consciously multi-modal way and serve them the right content.
One key takeaway from our experience and participation in many discussions about generative AI is that no matter how advanced an AI model is, AI does not fix content problems, it magnifies them: its effectiveness depends entirely on the quality of the underlying content. If your documentation is incomplete, inconsistent, or hard to navigate, AI tools will struggle to generate accurate or helpful responses. Instead of creating clarity, they risk compounding confusion at scale.
To make AI work for your developer portal, you need a strong content foundation. That means well-structured, up-to-date, and meaningful documentation that both humans and machines can interpret and use effectively. Without it, even the best AI strategy is likely to fall short of expectations.
Semantic Search and Retrieval-Augmented Generation (RAG)
In the context of AI, it becomes essential to understand how search engines operate and shape the results users receive. With this knowledge, teams can better structure and prepare their content to support more accurate, helpful search.
Semantic Search aims to interpret the user’s intent and the contextual meaning of their query, rather than relying only on keyword matching (as regular search engines do). From a UX point of view, semantic search aims to understand what the user truly wants, so it can improve: relevance, user experience, and engagement.
When implemented in a developer portal, semantic search enables the system to:
- understand synonyms and related terms, such as “car” and “vehicle”,
- process natural language queries like “How can I use the API?” (rather than relying on exact keyword matches like “API usage”),
- return semantically related content (a query for “green travel” also returns results for “sustainable transportation”),
- consider context, so that if a user has been browsing content about library systems, a search for “catalog” will prioritize library-related results over fashion,
- go beyond document titles to surface pages where the exact search term does not appear but the meaning is still present,
- filter results by people, entities, or categories.
Semantic search relies on natural language processing (NLP), a branch of artificial intelligence that enables systems to process and understand human language. One of the key technologies behind semantic search is the use of vector embeddings, which are numerical representations of text that allow systems to compare meaning rather than rely on exact keyword matches. This enables search engines to recognize the contextual similarity between queries and content, even when the language differs.
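A minimal sketch of comparing meaning via embeddings: the toy 3-dimensional vectors below are hand-picked for illustration (real systems use model-generated vectors with hundreds or thousands of dimensions), but the cosine-similarity arithmetic is the same:

```python
import math

# Toy "embeddings"; the values are invented so that "car" and "vehicle"
# point in a similar direction while "banana" does not.
embeddings = {
    "car":     [0.90, 0.10, 0.00],
    "vehicle": [0.85, 0.15, 0.05],
    "banana":  [0.00, 0.10, 0.95],
}

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# Semantically close terms score near 1.0; unrelated ones score near 0.
assert cosine_similarity(embeddings["car"], embeddings["vehicle"]) > \
       cosine_similarity(embeddings["car"], embeddings["banana"])
```

This is why a query for “car” can surface “vehicle” content even though the keyword never matches: the comparison happens in vector space, not on strings.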
In other implementations, semantic search engines may also integrate knowledge graphs (structured models of entities and their relationships) to further enhance understanding. While a knowledge base provides the content (such as articles and documentation), vectors make that content searchable by meaning, and knowledge graphs add a layer of semantic context.
Regardless of the technology, in semantic systems, metadata is a key component: it provides contextual information such as when something was published, who created it, and which products or topics it relates to. In short, metadata helps systems and users better understand, organize, and make use of the underlying data. Metadata helps establish the connections within knowledge graphs, enabling the creation of semantically rich databases. Together, metadata and graph structures support more meaningful, machine-interpretable relationships between content.
Closely connected to these systems are large language models, which are trained on massive amounts of text to understand and generate human-like language. A specific architecture that enhances LLMs is retrieval-augmented generation. RAG improves performance in tasks like question-answering by enabling the model to retrieve relevant information from external, curated sources before generating a response.
Originally, LLMs operated on static training data and could not access real-time information. Today, however, techniques like RAG and protocols such as the model context protocol (MCP) allow models to retrieve and incorporate live, domain-specific data resulting in more accurate and context-aware outputs. By feeding external knowledge into the model, RAG systems help reduce hallucinations and generate more precise, factual, and relevant responses.
In advanced AI applications (especially those requiring up-to-date or domain-specific information) semantic search and RAG are often combined, leveraging the strengths of both to produce highly effective and accurate AI-driven interactions.
Format efficiency and token optimization for AI systems
When preparing content for AI, one often-overlooked factor is its format: how does your choice of markup language or other structured text formats affect tokenization and processing efficiency?
As AI systems process text by breaking it into tokens (small, meaningful units of language), complex formatting can significantly increase the token count, and by extension, computational costs. Pronovix’s technical writers conducted research on how different content formats influence tokenization and found measurable differences.
For example, converting legacy HTML pages to Markdown can significantly reduce the number of tokens required to represent the same content. Markdown is, on average, five times more token-efficient than HTML. To illustrate:
- The original HTML version of a web page contains 40,988 characters, which break down into 11,184 tokens due to its heavy use of tags and styling elements.
- After converting the same content into Markdown (preserving only the essential structure and meaning), the size drops to 6,577 characters and just 1,580 tokens.
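The effect of markup overhead can be sketched in a few lines. The crude word-and-punctuation “tokenizer” below stands in for a real model tokenizer, and the snippet is invented, so the absolute counts are illustrative only; what carries over is that the HTML version always costs more tokens for the same content:

```python
import re

# The same short snippet expressed in HTML and in Markdown (invented example).
html = ('<div class="docs"><h2>Authentication</h2>'
        '<p>Send your <strong>API key</strong> in the '
        '<code>Authorization</code> header.</p></div>')
markdown = ('## Authentication\n'
            'Send your **API key** in the `Authorization` header.')

def naive_token_count(text):
    """Very rough token proxy: every word and every punctuation mark."""
    return len(re.findall(r"\w+|[^\w\s]", text))

html_tokens = naive_token_count(html)
md_tokens = naive_token_count(markdown)
assert html_tokens > md_tokens  # tag soup inflates the token count
```

Real tokenizers differ in detail, but the ratio moves the same way: every angle bracket, attribute, and closing tag is ballast the model must pay for.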
Reducing token count lowers processing costs and enables faster, more efficient AI responses. The investments you make today in content strategy, metadata usage, and collaborative processes will shape how effectively AI tools can serve your users tomorrow.
Below is a comparative breakdown of how different formats perform in terms of token usage per 100 words:
| Format | Token estimate per 100 words | What the estimate includes |
| --- | --- | --- |
| Markdown | 120-140 | Includes typical structure: headers (# Heading), emphasis (**bold**), links ([text](url)), and lists. |
| DITA XML | 300-350 | Heavy use of tags: <topic>, <title>, <section>, <p>, <codeblock>, etc. - markup often outweighs content. |
| reStructuredText (RST) | 150-200 | Includes headers, inline roles (e.g., :fieldname: Field content), directives, and links. |
| YAML | 160-180 | Includes typical structure: nested key-value pairs, simple schemas (e.g., API specs, config files). |
| JSON | 180-200 | Includes quoted keys/values, brackets, braces - more verbose than YAML. |
Optimizing formats, reducing unnecessary processing costs, and clarifying ownership across teams are all essential steps toward a future-proof developer portal.
With the right content practices and team workflows in place, your developer portal can evolve into a scalable interface that meets the needs of both humans and machines. We wrote this white paper to help teams execute towards this evolution independently, and of course Pronovix is here to support you whenever you are ready to take the next step.
“We're building the future, but it runs on today’s forgotten pages.” - Ádám Balogh, Technical Writer and DocOps Engineer.
Generative AI and Content Management Systems (CMS) - Drupal’s capabilities
While the previous chapters focused on tool-agnostic principles to help you build AI-ready developer portals, we know that choosing the right platform is a critical part of the implementation process. Based on our experience, we can recommend Drupal as a reliable and future-proof publishing solution (CMS).
A CMS is the foundation of the digital content infrastructure: it enables teams to create, organize, and maintain content efficiently without needing to write code. In the context of AI, a CMS plays a strategic role, as it allows content to be structured and enriched with metadata, making it more accessible to humans and machines. With a CMS, you can allow a multitude of users to access the site via a login, fill in forms, search, or participate in discussion forums. A CMS can reliably scale to your needs.
Drupal offers a flexible and robust foundation for integrating generative tools. With its structured content model, powerful taxonomy system, and open-source extensibility, Drupal is well-suited to support RAG and AI-powered search.
In this section, we highlight some of the capabilities that explain why Drupal is suitable for future-proof developer portals and show how to configure and operate RAG Search in Drupal. Then you can learn more about how llms.txt works and why it matters. While there are many more capabilities worth exploring, this white paper focuses on those that offer long-term value for your developer portal and are already available for implementation, empowering you to take action independently or with our help.
Drupal RAG Search
As we explained in the Semantic Search and Retrieval-Augmented Generation (RAG) chapter, instead of just returning links, RAG retrieves relevant content chunks from a knowledge base (Drupal content indexed in a vector database) and uses an LLM to synthesize a direct answer based only on that retrieved information. This provides users with contextually accurate answers grounded in your specific content, complete with source references.
The Drupal AI module orchestrates interactions with LLMs and vector databases. Key submodules and features include:
- Search API AI plugin: Handles the embedding process during ingestion.
- AI Vector database provider: Facilitates connection to various Vector Databases.
- Moderation submodule: Filters user input for harmful content.
- AI Providers: Choose from the most popular AI providers or connect your own LLM.
- DeepChat integration: Provides a ready-to-use Chatbot UI.
- AI agents: A framework to manage and orchestrate AI agents.
- AI Automators: Tools to generate content or automate AI-assisted tasks based on content input.

By integrating RAG into Drupal, you can provide relevant, context-grounded answers sourced directly from your knowledge base without leaving the site.
RAG workflow in practice
Based on the high-level workflow diagram, the two main phases include the following steps:
Phase 1: Content Ingestion
- Chunking: Drupal content is split into semantically meaningful units via Search API processors. Following the diagram, the document passes through a Search API AI plugin, which transforms the data into digestible chunks before it is embedded into the vector space.
Tip: Smaller chunks improve the precision of semantic search.
- Embedding: Each chunk is processed by an LLM (via the AI Module's Search API AI plugin) to generate a vector embedding (a numerical representation of its meaning).
- Storing: Embeddings, along with metadata (node ID, title, URI), are saved to the vector database. You can choose from different vendors, from open-source (like Milvus) to cloud-based (such as Pinecone).
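The ingestion phase above can be sketched in a few self-contained lines of Python. The `embed` function (a vowel-count histogram) and the in-memory list are toy stand-ins for a real embedding model and a vector database such as Milvus or Pinecone; the chunk size and sample content are likewise invented:

```python
def chunk(text, size=40):
    """Split text into roughly fixed-size chunks on word boundaries."""
    words, chunks, current = text.split(), [], []
    for word in words:
        current.append(word)
        if sum(len(w) + 1 for w in current) >= size:
            chunks.append(" ".join(current))
            current = []
    if current:
        chunks.append(" ".join(current))
    return chunks

def embed(text):
    """Toy stand-in for an LLM embedding: counts of each vowel."""
    return [text.lower().count(c) for c in "aeiou"]

vector_db = []  # each row: (embedding, metadata) - stand-in for a vector DB

def ingest(node_id, title, body):
    """Chunk, embed, and store one piece of Drupal-like content."""
    for i, piece in enumerate(chunk(body)):
        vector_db.append((embed(piece),
                          {"node_id": node_id, "title": title,
                           "chunk": i, "text": piece}))

ingest(32, "Best Product API tutorial",
       "Request an API key from the portal. Then call the orders endpoint "
       "with the key in the Authorization header.")
assert len(vector_db) >= 2            # the body was split into several chunks
assert vector_db[0][1]["node_id"] == 32
```

Note how each stored row carries its metadata, which is exactly what allows the answer phase to cite sources later.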
Phase 2: Query Processing
- User Query: A user submits a natural language question via chatbot or another interface.
- Moderation: The input is checked for harmful or inappropriate content.
- Embedding the Query: The query is embedded using the same LLM model as in ingestion.
- Semantic Search: The query vector is matched against stored content chunks in the vector DB.
- Context Augmentation: Retrieved chunks are compiled into a prompt.
- Answer Generation: The LLM generates a response using only the provided context, which is then presented to the user via UI.
- The process can be repeated as often as needed.
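The query phase can be sketched the same way. This self-contained snippet uses a toy vowel-count `embed` (standing in for the ingestion-time embedding model, which, as noted above, must match the one used for ingestion), a two-row in-memory “database”, and a deliberately trivial moderation check; all names and content are illustrative:

```python
import math

def embed(text):
    """Toy stand-in for the same embedding model used at ingestion time."""
    return [text.lower().count(c) for c in "aeiou"]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a)) or 1.0
    nb = math.sqrt(sum(y * y for y in b)) or 1.0
    return dot / (na * nb)

# Pre-embedded content chunks (invented examples).
vector_db = [
    (embed("Request an API key from the portal settings page."),
     {"title": "Authentication", "text": "Request an API key ..."}),
    (embed("Webhooks notify your app about order updates."),
     {"title": "Webhooks", "text": "Webhooks notify ..."}),
]

BLOCKLIST = {"ignore previous instructions"}  # trivial moderation stand-in

def answer(query):
    """Moderate, embed, retrieve the best chunk, and build the LLM prompt."""
    if query.lower() in BLOCKLIST:
        return None, "Query rejected by moderation."
    q = embed(query)
    best = max(vector_db, key=lambda row: cosine(q, row[0]))
    prompt = (f"Answer using ONLY this context:\n{best[1]['text']}\n"
              f"Question: {query}")
    return best[1]["title"], prompt   # the prompt would go to the LLM

title, prompt = answer("How do I get an API key?")
assert title == "Authentication"
```

The final prompt restricts the model to the retrieved context, which is the mechanism that keeps answers grounded and citable.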
Drupal AI Module
The Drupal AI module provides a framework for easily integrating artificial intelligence on any Drupal site, regardless of vendor. The AI module aims to provide a suite of modules and an API foundation for generating text content and images, analyzing content, and more.
You can combine the best features and approaches from the AI Automator, Search API AI and other modules in a unified framework and solution for AI in Drupal. There is an abstraction layer enabling integrations with third party AI providers such as Google Gemini, OpenAI (ChatGPT, DALL-E), Anthropic (Claude), Fireworks, Mistral and more. You can even use open-source models on servers you host and control.
In our experience, a simpler solution is usually better for initial experimentation, while highly accurate results require investment in more mature tools. You can start your journey based on the suggestions above, but if you need help, we can guide you through planning, implementation, and maintenance.
Considerations and best practices
- Chunking Strategy: Balance chunk size for speed vs. contextual richness.
- Model Selection (Vector Database Choice and LLM Choice): Match the same model family for embedding and generation when possible.
- Prompt Engineering: The quality of a prompt plays a critical role in shaping the effectiveness of AI outputs. In many cases, the success or failure of the interaction hinges on how well the prompt is constructed.
- Moderation: Ensure safeguards for responsible AI use.
- Performance Tuning: Expect to iterate especially for threshold values and embedding strategies.
The RAG Search module and the Drupal AI ecosystem are evolving rapidly. Changes to how AI agents and assistants interact will open up new use cases and more flexible architectures. For full technical docs and examples, see the official Drupal AI module documentation.
Are you interested in this solution and want to know more about dependencies, installing, and enabling the required modules? Contact us and we can discuss your case.
llms.txt Standard
The llms.txt file is an emerging standard designed to support responsible content discovery and usage by large language models. It acts as a centralized entry point, guiding LLMs, web crawlers, and AI agents to relevant, high-quality content on a website while helping them avoid less useful sections such as navigational menus or archives. Typically written in Markdown or plain text for simplicity and ease of parsing, llms.txt may include brief descriptions, context, and direct links to deeper documentation or structured content (often also in Markdown). This format enhances both human readability and machine processing, facilitating more efficient indexing and enabling LLMs to generate more accurate, contextualized answers from the content they reference.
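To give a sense of the format, a minimal llms.txt might look like the sketch below; the project name, descriptions, and URLs are illustrative. Following the proposed standard, the file opens with an H1 title and a blockquote summary, followed by H2 sections listing links with short descriptions:

```markdown
# Example Devportal

> Documentation for the Example Payments and Identity APIs.

## Docs

- [Getting started](https://developer.example.com/docs/getting-started.md): Quick start guide for new users
- [Payments API reference](https://developer.example.com/docs/payments.md): Endpoints, authentication, rate limits

## Optional

- [Changelog](https://developer.example.com/changelog.md): Release history
```

Note that the links point to Markdown variants of the pages, which is exactly the kind of lower-token representation discussed below.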
To support more formal mechanisms for controlling data usage, we have open-sourced a Drupal Recipe that provides full support for llms.txt. The promise of llms.txt, alongside providing content in machine-preferred formats, lies not necessarily with the crawlers of today, but with the agents not yet built.
Furthermore, it is about extending your developer portal: enabling it to serve both humans and machines with content tailored to their needs. A future crawler, seeing a pointer to a simpler, lower-token version of the page, could switch tactics in real-time. For our clients’ portals, we recommend enabling this option by default to improve machine discoverability.
As our comparison tests in the previous chapter show, consuming a streamlined version of a page can reduce token count, translating directly into lower cloud processing costs and faster throughput for LLM-based systems. LLMs benefit from reduced noise, and your portal should offer a clear directive to machines: one that will probably differ from what is designed for human navigation.
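To illustrate why streamlined output matters, the sketch below compares a rough token estimate for an HTML fragment and its Markdown equivalent, using the common heuristic of roughly four characters per token. The snippets are made up and real tokenizers will give different numbers, but the direction of the saving holds.

```python
def rough_token_count(text: str) -> int:
    """Very rough proxy for LLM token count (~4 characters per token).
    Real tokenizers (BPE, etc.) differ, but relative comparisons hold."""
    return len(text) // 4

html = (
    '<div class="doc-body"><h2 id="rate-limits">Rate limits</h2>'
    '<p>The <a href="/docs/payments">Payments API</a> allows '
    '<strong>100 requests per minute</strong>.</p></div>'
)
markdown = (
    "## Rate limits\n\n"
    "The [Payments API](/docs/payments) allows **100 requests per minute**."
)

# Same information, less markup noise -> fewer tokens per request.
saving = rough_token_count(html) - rough_token_count(markdown)
```

Across thousands of retrieval calls, this per-page saving compounds into measurable cost and latency differences.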
How it works in practice
In the LLM Support Recipe, we combine three Drupal modules to achieve full llms.txt standard support:
- The /llms.txt module provides support for authoring a full-featured llms.txt file.
- The Markdownify module enables output of Drupal entities as Markdown-formatted text.
- The Markdownify File Attachment submodule outputs API specification files as code blocks inside Markdown, providing full context for LLMs when it comes to API-related content.
The Markdownify module further supports the proposed llms.txt standard by exposing all of the site’s content in machine-readable format at its original URL with a .md extension appended, so you do not have to worry about large HTML contexts.
Pronovix’s /llms.txt module aims to provide an advanced solution to manage the content of the llms.txt file. What it does:
- Provides a configurable and custom entity type to manage the llms.txt file’s header and other sections in an editable format.
- Adds menu tokens that enable users to embed dedicated menus in a machine-readable Markdown format.
Our module has advanced caching and content management features, such as section reordering and an editable view of each descriptive section.
Benefits
The Pronovix LLM Support Drupal Recipe addresses the crucial need for websites to be AI-accessible, particularly for the nascent category of AI agents. To summarize, the module’s main benefits are:
- Enable AI agents: Provides information to help LLMs use your developer portal at inference time.
- Reduce AI cost: Simplifies documentation output to reduce token use via the LLM API.
- Increase inference speed: The reduction in token count results in quicker processing of your content in Markdown format.
By creating the structured llms.txt file and providing additional context with links, Drupal site owners can make their websites AI-ready. Coupled with the Markdownify module, this solution reduces token usage, accelerates processing, and enhances the clarity of information provided to LLMs.
Do you need help with empowering your Drupal developer portal to speak fluently with AI? Contact us and let’s discuss your case.
Search Engine Optimization (SEO), Answer Engine Optimization (AEO) and Generative Engine Optimization (GEO)
In this chapter we clarify how AI search currently finds answers to user questions, and share best practices for making your content more likely to show up.
Is SEO dead?
LLMs and AI-based search are profoundly reshaping how people interact with developer portal content. In the “pre-LLM” era, leading developer portal teams implemented SEO (Search Engine Optimization) best practices, as organic search traffic was seen as a natural and necessary off-site discoverability layer on top of a cutting-edge on-site information architecture, a flawless in-site search feature, and carefully built onboarding workflows.
With the rise of large language models, a new solution-driven search behaviour is emerging, where the classic "google a keyword, click a link, and hunt for the code snippet" workflow doesn’t necessarily apply. Instead, users ask ChatGPT, Claude, or Copilot for an (API) product overview, or check Perplexity for an API rate limit.
However much LLMs are reshaping organic search, SEO remains fundamental. For organic documentation discoverability, classical and general SEO principles still apply, including the old trinity of on-page, technical, and off-page tactics.
In the discovery phase, AI search engines (e.g. Perplexity or SearchGPT) and features (like Google AI Overviews) use Retrieval-Augmented Generation (RAG), meaning that they rely on “traditional” search engines to find high-quality content and then use an LLM to summarize it.
This means that in case of major SEO shortcomings (for example indexability issues or severe accessibility problems), your content might remain undiscoverable for AI search.
If you make it to the retrieved pages in AI search, your content still needs to pass the authority and verification test in order to be cited. This means that both traditional and AI search tools are looking for Experience, Expertise, Authoritativeness, and Trustworthiness (E-E-A-T) signals, intent alignment and entity authority.
Definitions: SEO, AEO, GEO
These three interconnected practices are closely related and built upon each other in a hierarchical way. SEO provides the technical and authoritative foundation, AEO ensures your documentation provides the direct answer, and GEO secures your place as the cited source within an AI's synthesized response.
Let’s see the definition and purpose of each in a bit more detail.
Search Engine Optimization (SEO): The practice of optimizing your website to rank as high as possible in traditional search engines like Google or Bing.
Goal: to drive organic traffic and clicks to your website.
Key Focus: As the foundation of off-site discoverability, it involves optimizing website content, technical structure (e.g., indexability, loading speed, mobile-friendliness), and authority signals and backlinks to achieve high rankings in Search Engine Results Pages (SERPs). In SEO, the atomic-level fundamental units of measure are short- and long-tail keywords.
A devportal with indexability issues won’t even make it to the results page (or into AI responses). Also, if an API's landing page doesn't rank on Google, it likely won't be indexed deeply enough by the crawlers that feed AI models.
AEO (Answer Engine Optimization): A subset of SEO focused on providing a single, direct answer to a user’s specific question.
Goal: to be the source for Featured Snippets, "People Also Ask" boxes, and Voice Search results (Siri, Alexa).
Key Focus: AEO marks an important step moving from click-throughs to a “Zero-Click" strategy. It focuses on structuring content—often using question/answer formats and Schema Markup—to be easily extracted as a direct answer in featured snippets, knowledge panels, or voice assistants. Conversational "Question & Answer" formatting, FAQ sections, and Schema Markup are often utilized to tell the engine exactly what the answer is.
When a developer asks, "What is the rate limit for this API?", AEO ensures the answer is served instantly in the search bar or chat interface, reducing friction.
GEO (Generative Engine Optimization) is the newest layer, focused on making your content the preferred source for Generative AI (like ChatGPT, Perplexity, or Google AI Overviews).
Goal: To be cited and summarized as an authoritative reference within an AI-generated response.
Key Focus: Brand authority across the web, factual accuracy (to prevent AI hallucinations), and providing deep "topical authority" that AI models want to use as a primary source.
In this hierarchy, GEO is on top. Instead of click-throughs and instant answers, the goal is to become an authority on a topic with the highest possible citation frequency in AI-generated summaries. To achieve this, it is necessary to create contextually rich, authoritative, and informative content.
If your onboarding guide is cited as the definitive source, you gain "semantic authority" that drives long-term developer trust.
Best Practices
Optimize for High Entity Authority
From the perspective of modern search and AI-powered retrieval systems, your brand and your APIs function as entities—unique, identifiable nodes within explicit or implicit knowledge graphs. These entities are connected through relationships (for example, a brand providing an API, or an API implementing OAuth 2.0), and it is therefore essential to describe these relationships clearly and consistently across your content.
Entity authority reflects how strongly a search or AI system associates your portal with specific technical concepts (such as ‘OAuth 2.0’ or ‘event-driven architecture’), based on consistent terminology, contextual signals, depth of coverage, and corroboration across trusted sources.
How to establish high entity authority?
- Consistent naming: Use the exact same name for your API products, methods, and error codes across your portal, GitHub, and social media.
- Entity stacking: Link your documentation to other high-authority entities. For example, explicitly state your compatibility with "AWS Lambda" or "Kubernetes" using their official names to associate your "entity" with theirs.
- Internal Knowledge Graphs: Use a "Hub and Spoke" model. A central "API Overview" page (the Hub) should link to specific "Tutorials" and "Reference Docs" (the Spokes), creating a clear “semantic web” for crawlers to follow. Pay special attention to consistent naming of APIs, products, and concepts. Investing time and effort into mapping and designing your internal knowledge graph pays off in the long term. If you don’t have the resources and expertise in your team to design your information architecture, ask for professional help.
E-E-A-T in Developer Portals
Google and AI engines prioritize content that demonstrates Experience, Expertise, Authoritativeness, and Trustworthiness.
- Experience: Include real-world "Case Studies" or "Success Stories." Show code samples that were "tested in production" rather than just generic boilerplate.
- Expertise: Ensure every tutorial or blog post has a clear Author Bio. If the content is written by a Senior Developer Advocate or a Core Engineer, the AI is more likely to trust it over an anonymous page.
- Authoritativeness: Earn mentions from reputable developer communities (Stack Overflow, Hacker News, Dev.to) and high-domain authority sites. High-engagement social media posts also serve as a high-impact trust signal for AI models looking for authority.
- Trustworthiness: Keep technical documentation up-to-date. Outdated OpenAPI specs or broken code samples (detected by AI via "hallucination checks") will lead to your content being de-prioritized.
Metadata and Structured Data
Structured data markup (or Schema Markup) provides search engines with a standardized, machine-readable way to understand and classify your content. Adding Schema Markup in parallel with best practices listed above can significantly improve how models understand products, events and concepts on your developer portal.
- FAQ Schema: Use FAQ question-answer pairs on your "Common Errors" or "Support" pages, but it also makes sense to experiment with adding FAQ content and schema to API overview pages, providing straightforward, concise information for all major user personas on your developer portal. Make sure you only use FAQ markup when the same question-and-answer content is visible on the page.
- How-To Schema: Apply this to your "Quick Start" guides. This tells AI engines that your content is a step-by-step instruction set, making it highly likely to be used in "How do I..." queries.
- SoftwareApplication & WebAPI Schema: Use specialized Schema.org types like SoftwareApplication or WebAPI to define your product's version, license, and technical requirements in a way LLMs can index instantly.
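As an illustration of the last point, the sketch below builds JSON-LD for a hypothetical Payments API page using the Schema.org WebAPI type (currently a pending Schema.org type); all names and URLs are placeholders.

```python
import json

# Illustrative JSON-LD for a hypothetical API documentation page.
# The Schema.org WebAPI type describes the API as an entity that
# search engines and LLM crawlers can classify directly.
web_api_schema = {
    "@context": "https://schema.org",
    "@type": "WebAPI",
    "name": "Example Payments API",
    "description": "REST API for creating and querying payments.",
    "documentation": "https://developer.example.com/docs/payments",
    "termsOfService": "https://developer.example.com/terms",
    "provider": {
        "@type": "Organization",
        "name": "Example Inc.",
    },
}

# Embed the result in the page inside a <script type="application/ld+json"> tag.
json_ld = json.dumps(web_api_schema, indent=2)
```

The same pattern applies to FAQPage and HowTo markup: build the structured object alongside the visible content so the two never drift apart.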
Tracking Performance: SEO, AEO/GEO
For measuring SEO performance, traditional tools like Bing Webmaster Tools and Google Search Console provide indispensable data. However, most interactions in AI search don’t end in a click-through, so you won’t get any information about the 5,000 users who saw your brand in a chat interface.
In GEO, instead of Click-Through Rate we try to track Citation Frequency and Brand Sentiment, and instead of looking for keyword opportunities we perform a Competitor Gap Analysis to understand our position in AI responses.
| Feature | Traditional SEO | AEO & GEO |
|---|---|---|
| Primary Metric | CTR (Click-Through Rate): % of users who clicked from search results to your portal. | Citation Frequency: How often your docs are cited/linked by AI as the definitive source. |
| Primary Goal | Organic Traffic: Driving high session volumes to your developer portal. | Market Share of Answers: Being the AI's preferred response. |
| Performance Drivers | Keyword Opportunities: Finding "low difficulty" terms to rank for. | Competitor Gap Analysis: Finding prompts where competitors are cited but you are not. |
| Trust Signal | Backlink Profile: Quality and quantity of external domains linking to your portal. | Brand Sentiment: The tone AI uses (e.g., "Industry leader" vs. "Known for poor uptime"). |
| Tool Stack | Standard: Google Search Console (GSC) & Bing Webmaster Tools (BWT). | Modern: GSC/BWT + 3rd Party AI Trackers |
| Success Feature | SERP Rankings: Holding "Position 1" in the standard list of links. | Position Zero / LLM Context: Winning the Featured Snippet or the "Citation List" in Chatbots. |
| Visibility Type | Passive Discovery: A dev finds you among a list of links. | Active Recommendation: AI explicitly recommends your API during a coding workflow. |
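The GEO-side metrics above can be approximated with simple log analysis once you sample AI answers for a set of target prompts. The sketch below computes Citation Frequency and a naive Competitor Gap list; the record shape and domains are hypothetical, since no standard export format for AI citations exists yet.

```python
def citation_frequency(answers: list[dict], domain: str) -> float:
    """Share of sampled AI answers that cite the given domain.

    Each record is {"prompt": ..., "cited_domains": [...]} -- a
    hypothetical log shape for manually sampled AI responses.
    """
    if not answers:
        return 0.0
    hits = sum(1 for a in answers if domain in a["cited_domains"])
    return hits / len(answers)

def competitor_gaps(answers: list[dict], ours: str, competitor: str) -> list[str]:
    """Prompts where the competitor is cited but we are not."""
    return [
        a["prompt"]
        for a in answers
        if competitor in a["cited_domains"] and ours not in a["cited_domains"]
    ]

sampled = [
    {"prompt": "How do I authenticate to the Payments API?",
     "cited_domains": ["developer.example.com", "stackoverflow.com"]},
    {"prompt": "Payments API rate limits",
     "cited_domains": ["competitor.dev"]},
    {"prompt": "Best payments API for startups",
     "cited_domains": ["developer.example.com"]},
]

freq = citation_frequency(sampled, "developer.example.com")
gaps = competitor_gaps(sampled, "developer.example.com", "competitor.dev")
```

Run the same prompt set at regular intervals and the trend in `freq`, more than any single snapshot, tells you whether your GEO efforts are working.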
Conclusion and resources
As AI becomes increasingly embedded in digital experiences, organizations are seeking platforms that can support intelligent, dynamic content delivery. This white paper seeks to provide both strategic insight and practical guidance to help you navigate your AI journey regardless of where you are in the process.
We aimed to give you the knowledge and tools to begin implementing these solutions independently. However, if you are looking to accelerate progress, avoid common pitfalls, or tailor your approach to your organization’s unique needs, our team is here to support you.
One of the key takeaways we hope to leave you with is this: achieving AI-readiness on your developer portal requires more than just the right tools. It begins with accurate, well-structured, and semantically rich content. To support this, Pronovix’s UX team has developed an AI user persona card designed to help teams plan and organize their content for machine consumption. By articulating AI-specific behavioral patterns, content needs, and interaction expectations, we empower organizations to make informed, future-proof content decisions that serve both human and AI users. Furthermore, our technical writers can ensure that your content is ready for AI consumption so you can avoid bottlenecks from the beginning.
Throughout this white paper, we have also highlighted why Drupal is a strong candidate for building secure, trustworthy AI-powered developer portals. Its flexible content architecture, advanced modeling capabilities, and growing ecosystem of AI integrations make it well-suited for generative AI applications.
Main resources
- Make Your Developer Portal Ready For AI Agents | Pronovix
- 5 practical actions toward AI-readiness | Pronovix
- Christoph Weber: Make Your Entire API Operations AI-ready with APIs.json | Pronovix
- AI success begins with a strong developer portal content strategy | Pronovix
- Kristof Van Tomme - The API portal is dead, long live the Platform- and Interface portals | Pronovix
- 3 ideas on AI readiness, the role of APIs and developer portals in generative AI systems | Pronovix
- Developer Portals as Digital Marketing Tools and Technology Choices for Devportals | Pronovix
- Insights from Building a Private LLM Chatbot | Pronovix
- API Productization and Governance - Discussion with Michaela Halliwell | Pronovix
- What Makes Conversational AI Trustworthy? - Discussion with Ronald Ashri | Pronovix
- AI The Docs Online 2025
- Retrieval Augmented Generation (RAG) and Semantic Search for GPTs | OpenAI Help Center
- What is semantic search, and how does it work? | Google Cloud
- Milvus VDB Provider | Drupal.org
- Pinecone VDB Provider | Drupal.org
- OpenAI Provider | Drupal.org
- Gemini Provider | Drupal.org
- AI (Artificial Intelligence) | Drupal.org
- Markdownify Content | Drupal.org
At Pronovix, we’re committed to continuous learning and knowledge-sharing: an approach that drives our podcast initiatives. The API Resilience podcast features leading API teams discussing key trends shaping the API economy, while Developer Success explores the critical factors behind successful API and interface programs. Both podcasts are hosted by Kristof Van Tomme, CEO and co-founder of Pronovix. Laura Vass, co-founder of Pronovix and organizer of the API The Docs conference series, hosts documentarians and other practitioners from all across the API world to discuss the latest topics, new learnings and best practices around API documentation and developer portals.
Through initiatives like the DevPortal Awards and the API The Docs conferences, we see how companies are pushing the boundaries of developer experience and thus promoting adoption of their technological solutions. These are ongoing learning journeys and we invite you to join us on the path forward.
All Pronovix publications are the fruit of a team effort, enabled by the research and collective knowledge of the entire Pronovix team. Our ideas and experiences are greatly shaped by our clients and the communities we participate in.
Sign up for our Developer Portal Newsletter to receive notifications about white paper updates, our developer portal, API documentation, and Developer Experience research publications.