Wednesday, October 28, 2009

Getting into the (xml)Flow of Things

By now there is widespread acceptance that XML's tagging and indexing capabilities are a powerful tool for leveraging a publisher's valuable content assets. Just as important is implementing a publishing workflow built on a content management system that stores documents in a native XML format. The goal is a workflow in which data is not only created and tagged in XML but also stored as native XML, so that it can be repurposed as needed without transforming it on the way into and out of the CMS.


Consider the challenges presented by the following typical workflow. When content is tagged in a rich XML schema but stored in a relational database, the first step we face is transforming the data from XML so that it can be stored in relational database tables. Once it is stored, if we want to repurpose the data for publication, say on the web, another conversion must take place to recreate the XML. These laborious back-and-forth transforms never make for a timely or high-quality production process.
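To make the two transforms concrete, here is a minimal sketch in Python using only the standard library (sqlite3 stands in for the relational database). The element names, table layout, and function names are all hypothetical, chosen for illustration; a real publishing schema would be far richer and would need many more tables.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical article markup; the element names are illustrative only.
SOURCE_XML = (
    '<article id="a1">'
    "<title>Getting into the Flow</title>"
    "<para>First paragraph.</para>"
    "<para>Second paragraph.</para>"
    "</article>"
)


def shred(conn, xml_text):
    """Transform 1: flatten the XML tree into relational rows."""
    root = ET.fromstring(xml_text)
    conn.execute(
        "INSERT INTO article (id, title) VALUES (?, ?)",
        (root.get("id"), root.findtext("title")),
    )
    for position, para in enumerate(root.findall("para")):
        conn.execute(
            "INSERT INTO para (article_id, position, body) VALUES (?, ?, ?)",
            (root.get("id"), position, para.text),
        )


def rebuild(conn, article_id):
    """Transform 2: recreate the XML from the rows for publication."""
    (title,) = conn.execute(
        "SELECT title FROM article WHERE id = ?", (article_id,)
    ).fetchone()
    root = ET.Element("article", {"id": article_id})
    ET.SubElement(root, "title").text = title
    rows = conn.execute(
        "SELECT body FROM para WHERE article_id = ? ORDER BY position",
        (article_id,),
    )
    for (body,) in rows:
        ET.SubElement(root, "para").text = body
    return ET.tostring(root, encoding="unicode")


def round_trip(xml_text):
    """Run both transforms against an in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE article (id TEXT PRIMARY KEY, title TEXT)")
    conn.execute(
        "CREATE TABLE para (article_id TEXT, position INTEGER, body TEXT)"
    )
    shred(conn, xml_text)
    return rebuild(conn, ET.fromstring(xml_text).get("id"))
```

Even for this tiny document, both directions need hand-written mapping code that must change whenever the markup does. Note also that this simple two-table layout stores only each paragraph's leading text, so any inline markup inside a paragraph would need further tables or would be lost.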


Certainly, just getting the data into the relational database can be a long process to begin with. But consider the challenge of receiving XML data from multiple, even hundreds of, sources on a daily basis. The process then involves standardizing the data, which is a huge undertaking. In his post The First Step's a Doozy, Dave Kellogg (CEO of MarkLogic) considers Step 1, loading content into the relational database system, to be a daunting challenge.


In order to realize the full potential of an end-to-end publishing workflow, it must be built around content management that does not merely "handle" XML as another data type but employs a central native XML repository. Once the XML gets flowing in this manner, publishers can turn content that was cumbersome to repurpose into an asset that is easy to assemble in any form desired.
