Complex Data Sharing Using XSLT and XPath
As XSLT and XPath gain a foothold among web services devs, many Java and .NET devs still resist using these tools for complex data sharing projects.
Integration Developer News recently spoke with Zarella L. Rendon, coauthor (with John Robert Gardner) of one of the top books in the field, Prentice Hall's XSLT and XPath: A Guide to XML Transformations, to find out how useful XSLT and XPath can be to devs working on even the most complex data sharing-driven web services or integration projects.
Rendon is the Managing Director at XML-Factor, specializing in the implementation of custom XML solutions. She is a member of OASIS, the W3C XSL Working Group, and several vertical industry groups.
IDN: What specific types of XSLT (or XPath) functions do you consider the easiest to learn and/or give developers the best "bang for the buck"?
Rendon: The first and most important thing that programmers need to understand about XSLT is that it isn't intended to be a general-purpose conversion tool. XSLT is a tool for converting XML to something else, not the other way around. XSLT presupposes that you already have XML and some understanding of how XML works. XSLT is written in XML, so knowing the rules is a good place to start. Once you know how XML works, you can understand the structure of an XML document.
XPath is the tool that provides the navigation through that structure. XSLT and XPath work together: XPath to access nodes in an XML document, and XSLT to do something with them once you have them.
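The XPath half of that division of labor can be sketched in Java with the standard JAXP `javax.xml.xpath` API. This is only an illustration, not an example from Rendon's book: the sample document, the class name `XPathNavigation` and the helper `select` are assumptions made up for this sketch.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class XPathNavigation {
    // Hypothetical sample document, invented for this sketch.
    static final String SAMPLE_XML = String.join("\n",
        "<book>",
        "  <chapter>",
        "    <title>Getting Started</title>",
        "    <section><title>Installing</title></section>",
        "  </chapter>",
        "</book>");

    // Evaluates an XPath expression against the XML text and returns
    // the text content of every selected node, in document order.
    static String[] select(String xml, String expression) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList nodes = (NodeList) xpath.evaluate(
            expression,
            new InputSource(new StringReader(xml)),
            XPathConstants.NODESET);
        String[] values = new String[nodes.getLength()];
        for (int i = 0; i < nodes.getLength(); i++) {
            values[i] = nodes.item(i).getTextContent();
        }
        return values;
    }

    public static void main(String[] args) throws Exception {
        // Select only the titles whose parent is a chapter.
        for (String title : select(SAMPLE_XML, "/book/chapter/title")) {
            System.out.println(title);
        }
    }
}
```

Here XPath only addresses the nodes. Deciding what to emit for each selected node is the job XSLT takes on.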
IDN: What are the most common mistakes (and/or misconceptions) that non-XML developers have when it comes to using XSLT or XPath?
Rendon: Most non-XML programmers have a hard time understanding the concept of nested elements, and how to take advantage of that nesting to process their documents or data. For example, XML allows parent-child relationships in a document like section-title, chapter-title, topic-title. In this example, the title element has no distinction except for the fact that its parent is either a chapter, a section or a topic.
Using XSLT, you can process elements based on their parentage, so a section title can be mapped to an h3 tag in HTML, and a chapter title can be mapped to an h2, and so on. Conversely, you can process XML elements based on their children or attributes. Once the XML file is in "memory," you can access any part of the structure as many times as necessary.
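A minimal sketch of parent-based processing, run through the JAXP transformation API that ships with the JDK, might look like the following. The element names follow the chapter/section/topic example above, but the sample document, the choice of h4 for topic titles, and the names `ParentBasedTitles` and `transform` are assumptions for illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class ParentBasedTitles {
    // XSLT 1.0 stylesheet: the same title element is mapped to a
    // different HTML heading depending on its parent element.
    static final String STYLESHEET = String.join("\n",
        "<xsl:stylesheet version='1.0'",
        "    xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>",
        "  <xsl:output method='html' indent='no'/>",
        "  <xsl:strip-space elements='*'/>",
        "  <xsl:template match='chapter/title'>",
        "    <h2><xsl:value-of select='.'/></h2>",
        "  </xsl:template>",
        "  <xsl:template match='section/title'>",
        "    <h3><xsl:value-of select='.'/></h3>",
        "  </xsl:template>",
        "  <xsl:template match='topic/title'>",
        "    <h4><xsl:value-of select='.'/></h4>",
        "  </xsl:template>",
        "</xsl:stylesheet>");

    // Hypothetical input document, invented for this sketch.
    static final String SAMPLE_XML = String.join("\n",
        "<book>",
        "  <chapter>",
        "    <title>Getting Started</title>",
        "    <section>",
        "      <title>Installing</title>",
        "      <topic><title>On Linux</title></topic>",
        "    </section>",
        "  </chapter>",
        "</book>");

    // Applies the stylesheet text to the XML text and returns the output.
    static String transform(String xml, String xslt) throws Exception {
        Transformer transformer = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(xslt)));
        StringWriter out = new StringWriter();
        transformer.transform(
            new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(transform(SAMPLE_XML, STYLESHEET));
    }
}
```

Each match pattern names the parent as well as the element, so no template has to test where a given title sits in the document.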
[Ed. Note: Several XSLT stylesheet templates, with ample code and rationale for their use, are provided on the web by Prentice Hall in a sample chapter of Rendon's book, XSLT and XPath: A Guide to XML Transformations. Developers can download the sample chapter here: http://vig.prenhall.com/samplechapter/0130404462.pdf.]
So if you want to build a table of contents from the titles of the chapter, section and topic elements, you don't need to process the document twice.
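One common way to do this in XSLT 1.0 is with template modes: the titles are visited once in a "toc" mode to build the list, and again in the default mode to build the body, all within a single transformation of one in-memory tree. The sketch below assumes that approach; the sample document, the mode name `toc`, the `div` wrapper and the class name `TocInOnePass` are illustrative, not taken from the book.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class TocInOnePass {
    // The root template first walks every title in "toc" mode to emit
    // the list, then applies the default-mode templates for the body.
    static final String STYLESHEET = String.join("\n",
        "<xsl:stylesheet version='1.0'",
        "    xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>",
        "  <xsl:output method='html' indent='no'/>",
        "  <xsl:strip-space elements='*'/>",
        "  <xsl:template match='/'>",
        "    <div>",
        "      <ul><xsl:apply-templates select='//title' mode='toc'/></ul>",
        "      <xsl:apply-templates/>",
        "    </div>",
        "  </xsl:template>",
        "  <xsl:template match='title' mode='toc'>",
        "    <li><xsl:value-of select='.'/></li>",
        "  </xsl:template>",
        "  <xsl:template match='chapter/title'>",
        "    <h2><xsl:value-of select='.'/></h2>",
        "  </xsl:template>",
        "  <xsl:template match='section/title'>",
        "    <h3><xsl:value-of select='.'/></h3>",
        "  </xsl:template>",
        "</xsl:stylesheet>");

    // Hypothetical input document, invented for this sketch.
    static final String SAMPLE_XML = String.join("\n",
        "<book>",
        "  <chapter>",
        "    <title>Getting Started</title>",
        "    <section><title>Installing</title></section>",
        "  </chapter>",
        "</book>");

    // Parses the input once and runs a single transformation over it.
    static String transform(String xml, String xslt) throws Exception {
        Transformer transformer = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(xslt)));
        StringWriter out = new StringWriter();
        transformer.transform(
            new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(transform(SAMPLE_XML, STYLESHEET));
    }
}
```

The input is parsed a single time, and both the table of contents and the headings are drawn from the same tree.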
IDN: In your book, you provide some very good and concise examples of how to use XSLT stylesheets. But I'm not sure where they are derived from. Are there XSLT stylesheet "templates" that developers who don't code directly in XML can use, or do developers need to create their own stylesheets?
Rendon: XSLT is a dynamic language that is written specifically for the data set that you're trying to process. The match statements in an XSLT template use the names of the elements in your XML file, as defined by your DTD (Document Type Definition) or Schema. There are some predefined DTDs and Schemas that can be used generically, like DocBook, and these have some XSLT templates provided.
However, because the XML structure and your target output vary according to the data, no one set of templates will match unless it is written specifically for that data set. Developers can use sample stylesheets found in various places on the web, but these must be tailored to fit the application.
Developers who are using XSLT should fully understand their XML document structure before starting out. XSLT and XPath are simple to use once you understand the syntax, and they provide powerful addressing and conversion tools for XML. If you have any XML knowledge, using XSLT and XPath will be a simple extension of that knowledge. The book also includes some boilerplates for XSLT stylesheets, and breaks down the needed components.
[Ed. Note: For Java devs working with XSLT and XPath, we suggest Apache's Xalan-Java project. Xalan-Java (named after a rare musical instrument) fully implements XSL Transformations (XSLT) Version 1.0 and the XML Path Language (XPath) Version 1.0. XSLT is the first part of the XSL stylesheet language for XML. It includes the XSL Transformation vocabulary and XPath, a language for addressing parts of XML documents.]