DITA to Drupal
We use the Feeds module to import the documents. A custom fetcher checks a folder on each cron run, and a custom parser handles the transformed DITAMap XHTML. Each book and topic gets a special GUID (or UUID) so that content can be updated in place, touching only new or changed nodes; this identifier should be composed of a map ID and a topic element ID. Each DITAMap also contains metadata for the software version, which is imported into Drupal and can then be used in a faceted search interface to differentiate between software versions. Currently, the XHTML map import is a two-step workflow: the first step recognizes and recreates the map structure in the Drupal Book node framework, and the second step populates the created nodes with the DITA-specific content, including rich formatted text, URLs, internal link references, embedded images and tags.
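As an illustration of the GUID idea, here is a minimal Python sketch (the importer itself is a Drupal module, so this is not its actual code). It assumes the GUID is a name-based UUID derived from the map ID and the topic element ID; the namespace URL and the function names `topic_guid` and `plan_import` are hypothetical.

```python
import uuid

# Hypothetical fixed namespace for this sketch; it must never change,
# otherwise every GUID derived from it would change too.
DITA_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_URL, "https://example.com/dita-import")


def topic_guid(map_id: str, topic_id: str) -> str:
    """Derive a stable GUID from a DITA map ID and a topic element ID."""
    # "#" cannot appear in an XML ID, so the joined name is unambiguous.
    return str(uuid.uuid5(DITA_NAMESPACE, f"{map_id}#{topic_id}"))


def plan_import(existing_guids: set[str], map_id: str, topic_ids: list[str]) -> dict[str, str]:
    """Decide, per topic, whether an import would create or update a node."""
    plan = {}
    for topic_id in topic_ids:
        guid = topic_guid(map_id, topic_id)
        plan[guid] = "update" if guid in existing_guids else "create"
    return plan
```

Because the same map and topic IDs always yield the same GUID, re-importing a map matches the nodes created earlier instead of duplicating them.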
Mass publication is a new feature that lets content editors bulk upload content by uploading the XHTML output in a zip file. One or several maps can be uploaded at a time. This semi-automated solution unzips the files on the web server into a ‘watched’ folder, and on the next cron run the importer fetches and processes the files, creating and/or updating content with the latest changes. The UUIDs make sure that only the affected nodes are updated.
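The unzip step could look roughly like the following Python sketch. It is an illustration under assumptions, not the module's code: the function name `extract_to_watched` and the choice of one subfolder per archive are hypothetical, and the path check guards against archive entries that would escape the watched folder.

```python
import zipfile
from pathlib import Path


def extract_to_watched(zip_path: Path, watched_dir: Path) -> list[Path]:
    """Extract an uploaded XHTML archive into its own subfolder of the watched folder."""
    target = (watched_dir / zip_path.stem).resolve()
    target.mkdir(parents=True, exist_ok=True)
    extracted = []
    with zipfile.ZipFile(zip_path) as archive:
        for member in archive.infolist():
            destination = (target / member.filename).resolve()
            # Refuse entries that would land outside the target folder ("zip slip").
            if not destination.is_relative_to(target):
                raise ValueError(f"unsafe path in archive: {member.filename}")
            archive.extract(member, target)
            if not member.is_dir():
                extracted.append(destination)
    return extracted
```

A cron-driven fetcher would then only need to look for new subfolders in the watched folder.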
The file upload interface is available for authenticated users with a special permission.
To make sure that XHTML outputs will work fine with the importer, we set up a validation process to check the structure of the XHTML, the cross-references between the index.html file and the rest of the HTML files, as well as the referenced images. In case of any inconsistencies, the validation will stop the import process and send an email report to the editorial team with the discovered issues.
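A validation pass of this kind can be sketched in Python as follows. This is a simplified illustration: it assumes every local `href` and `src` must resolve to a file inside the export folder, and the names `ReferenceCollector` and `validate_export` are hypothetical. The real validation also checks the XHTML structure and emails the report, which this sketch leaves out.

```python
from html.parser import HTMLParser
from pathlib import Path
from urllib.parse import unquote, urldefrag, urlparse


class ReferenceCollector(HTMLParser):
    """Collect local link targets and image sources from one XHTML file."""

    def __init__(self):
        super().__init__()
        self.references = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "a":
            value = attributes.get("href")
        elif tag == "img":
            value = attributes.get("src")
        else:
            return
        if not value:
            return
        path, _fragment = urldefrag(value)
        # Skip external URLs and pure in-page anchors.
        if path and not urlparse(path).scheme:
            self.references.append(unquote(path))


def validate_export(root: Path) -> list[str]:
    """Return human-readable problems; an empty list means the export passed."""
    if not (root / "index.html").is_file():
        return ["index.html is missing"]
    problems = []
    for page in sorted(root.rglob("*.html")):
        collector = ReferenceCollector()
        collector.feed(page.read_text(encoding="utf-8"))
        for reference in collector.references:
            if not (page.parent / reference).is_file():
                problems.append(f"{page.relative_to(root)}: missing {reference}")
    return problems
```

An importer built this way would stop when the returned list is non-empty and pass the list on to the report.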
To preserve cross references between DITA topic maps, a script scans through all the existing x-refs and translates DITA links to Drupal links, so we can make sure that all of the linking structure implemented in DITA gets translated over to Drupal correctly. Basically, we ensure links work within a book and across multiple books.
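The link translation can be pictured with this small Python sketch. It assumes the importer already knows which exported file became which Drupal path; the mapping `node_paths`, the `/node/12` style paths and the function name `rewrite_links` are hypothetical, and a regular expression stands in for a proper XHTML parser.

```python
import re
from urllib.parse import urldefrag, urlparse

HREF_PATTERN = re.compile(r'href="([^"]*)"')


def rewrite_links(xhtml: str, node_paths: dict[str, str]) -> str:
    """Replace hrefs that point at exported files with the matching Drupal paths."""

    def replace(match: re.Match) -> str:
        target, fragment = urldefrag(match.group(1))
        if urlparse(target).scheme or target not in node_paths:
            return match.group(0)  # external or unknown link: leave untouched
        new_target = node_paths[target]
        if fragment:
            new_target += "#" + fragment  # keep in-page anchors
        return f'href="{new_target}"'

    return HREF_PATTERN.sub(replace, xhtml)
```

If `node_paths` covers the files of every imported book, the same function handles links within one book and links across books.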
We automatically process images found in your DITA content, and we store them as actual image assets in the Drupal files directory. The filter makes sure that all image references are mapped to the content and all image files are embedded properly. This allows us, if necessary, to further customise the image experience (e.g. image processing to improve site performance, responsive images for mobile users, etc.).
We implemented two types of reports. The first is created when an import fails because validation did not pass. It lists all the inconsistencies found in the XHTML structure and gives editors detailed feedback on where to look for errors in their DITA maps.
The second report is created after a successful import. It states the number of imported files and of created/updated Drupal nodes and books, and it includes links to the created and updated nodes so editors can find them more easily.
All reports are sent via email to users in a predefined editorial role.
Drupal as a publishing format for DITA content
Drupal’s modular and extendable approach made it possible to build a custom content model that can integrate a DITA specialisation and publishing workflow. See our process description for a more detailed explanation and the Drupal modules used.
Our process is tailor-made for your organization, but here is an example process:
The DITA XML source is managed by product units. The CMS pulls in the generated XHTML and automatically creates the content based on that.
The currently proven solution for transferring documentation prepared by the DITA Open Toolkit to Drupal consists of two main steps:
- The first step is to compile the DITA XML into an easy-to-process format that can be transformed into Drupal pages with the least effort. The DITA XML editor tools offer a standard output format called XHTML for this purpose, which preserves the full content and the structure of the DITA topic maps.
- The second step is to feed this structure into Drupal and mirror the original page structure in the website's pages. The most suitable solution for this is the Feeds module with its Tamper plugin system and an extensible parser that can import the XHTML maps. Some specific maps may require a custom Tamper plugin or custom parsers written by developers.
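To show how the page structure can be read from the XHTML output, here is a minimal Python sketch. It assumes the exported index.html expresses the map as nested `<ul>`/`<li>`/`<a>` elements, which may not hold for every export; `TocParser` and `parse_toc` are hypothetical names, and the real parser is a Feeds plugin rather than Python.

```python
from html.parser import HTMLParser


class TocParser(HTMLParser):
    """Flatten a nested <ul>/<li>/<a> table of contents into (depth, title, href) rows."""

    def __init__(self):
        super().__init__()
        self.depth = 0
        self.rows = []
        self._href = None
        self._title = []

    def handle_starttag(self, tag, attrs):
        if tag == "ul":
            self.depth += 1
        elif tag == "a" and self.depth > 0:
            self._href = dict(attrs).get("href")
            self._title = []

    def handle_data(self, data):
        # Only collect text while inside a link that has an href.
        if self._href is not None:
            self._title.append(data)

    def handle_endtag(self, tag):
        if tag == "ul":
            self.depth -= 1
        elif tag == "a" and self._href is not None:
            self.rows.append((self.depth, "".join(self._title).strip(), self._href))
            self._href = None


def parse_toc(xhtml: str) -> list[tuple[int, str, str]]:
    """Return the table of contents as rows in document order."""
    parser = TocParser()
    parser.feed(xhtml)
    parser.close()
    return parser.rows
```

Each row's depth tells the importer where a page belongs in the book hierarchy: depth 1 is a top-level page, and a deeper row is a child of the nearest preceding row that is one level shallower.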
The most suitable content type for holding the imported DITA maps is supplied by the Drupal Book module along with the Book Helper module. This configuration offers a well-structured back end for the content of the DITA maps, as well as organised navigation across all levels of the document pages. With the help of the DHTML Menu module, the navigation can become even smoother, providing expandable/collapsible menu structures.
Based on the DITA maps, the importer can identify each page uniquely and saves its unique identifier for the content update process. The solution also supports integration with versioning and content authoring workflows, like the Workbench and the Workbench Moderation modules.
Not a CCMS solution
Our current solution exports existing DITA content into XHTML that is then imported into a Drupal site. The Drupal CMS is not used as a Component Content Management System (CCMS); it is merely used as a publishing format. At Pronovix we are looking into the possibility of building an open source CCMS for DITA content using Drupal, but this is currently only in the planning phase.
Equally, the current solution doesn’t process XML topics on the fly. If you want to make use of conditional text or variables, this will require another approach. We have developed tools for conditional text in Drupal, but they are not part of this package.
DITA is the leading open standard for single source publishing in enterprise businesses. It allows organisations to reuse topics between different types of deliverables. Learn more about DITA here.
Drupal is designed to be the perfect content management solution for nontechnical users who need both simplicity and flexibility. It accomplishes this through its modular approach to site building. Unlike other CMSs, Drupal isn’t a prefabricated toy truck, but rather a collection of wheels, windshields, axles, frames, etc., that a toy maker can easily connect together. With Drupal, a maker could create a toy truck, but she or he could just as easily create a toy airplane, submarine, or robot. For this reason, Drupal may be described as both a content management system and a content management framework—one system that strives to have the strengths of both, without their deficiencies. Learn more about Drupal here.
If you would like to learn more about the DITA to Drupal package and are interested in how we could build one for you, get in touch with us, and we will schedule a free introduction call with you.