Data Sharing Using XSLT and XPath

XSLT and XPath are gaining a reputation for helping developers tie into web and web services applications. But don't think you need to be an XML expert. Integration Developer News spoke with Zarella L. Rendon, coauthor of Prentice Hall's XSLT and XPath: A Guide to XML Transformations, to find out how XSLT and XPath can help developers speed web services and integration projects.

by Vance McCarthy

XSLT and XPath are gaining a reputation for helping developers tie into web and web services applications. But one of the drawbacks of using these tools is that developers think they need to be XML experts. Integration Developer News spoke with Zarella L. Rendon, coauthor (with John Robert Gardner) of Prentice Hall's XSLT and XPath: A Guide to XML Transformations, to find out what aspects of XSLT and XPath can help developers with their web services and integration projects.

Rendon is a senior applications engineer and cofounder of the Hackensack, N.J.-based Isogen International, a leading provider of XML and SGML solutions. Rendon is also a member of the W3C XSL Working Group.

IDN: What specific types of XSLT (or XPath) functions do you consider the easiest to learn and/or give developers the best "bang for the buck"?

Rendon: The first and most important thing that programmers need to understand about XSLT is that it isn't intended to be a general-purpose conversion tool. XSLT is a tool for converting XML to something else, not the other way around. XSLT presupposes that you already have XML and some understanding of how XML works. XSLT is written in XML, so knowing the rules is a good place to start. Once you know how XML works, you can understand the structure of an XML document.

XPath is the tool that provides the navigation through that structure. XSLT and XPath work together: XPath to access nodes in an XML document, and XSLT to do something with them once you have them.
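As a rough illustration of the navigation role described above, the following sketch evaluates two XPath 1.0 expressions with the javax.xml.xpath API that ships with the JDK. The sample document and the class name XPathNavigation are invented for this example and do not come from the interview; the text-block syntax assumes Java 15 or later.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

public class XPathNavigation {
    // Hypothetical sample document, invented for illustration only.
    static final String XML = """
        <book>
          <chapter><title>Getting Started</title></chapter>
          <chapter><title>Templates</title></chapter>
        </book>
        """;

    // Evaluates an XPath 1.0 expression against XML and returns the result as a string.
    static String evaluate(String expression) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        return xpath.evaluate(expression, new InputSource(new StringReader(XML)));
    }

    public static void main(String[] args) throws Exception {
        // Address one node by its position in the tree.
        System.out.println(evaluate("/book/chapter[2]/title"));
        // Ask a question about the structure as a whole.
        System.out.println(evaluate("count(/book/chapter)"));
    }
}
```

XPath only locates the nodes here; turning them into a new document is the job XSLT takes on.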

IDN: What are the most common mistakes (and/or misconceptions) that non-XML developers have when it comes to using XSLT or XPath?

Rendon: Most non-XML programmers have a hard time understanding the concept of nested elements, and how to take advantage of that nesting to process their documents or data. For example, XML allows parent-child relationships in a document such as section-title, chapter-title and topic-title. In this example, nothing distinguishes one title element from another except its parent, which is a chapter, a section or a topic.

Using XSLT, you can process elements based on their parentage, so a section title can be mapped to an h3 tag in HTML, and a chapter title can be mapped to an h2, and so on. Conversely, you can process XML elements based on their children or attributes. Once the XML file is in "memory," you can access any part of the structure as many times as necessary.

So if you wanted to build a table of contents for the titles of the chapter, section and topic elements, you don't need to process the document twice.
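The parent-based mapping and the single-pass table of contents described above might look something like the sketch below. It runs a small stylesheet through the XSLT 1.0 processor built into the JDK (javax.xml.transform). The book, chapter and section element names, the sample data and the class name TitlesByParent are assumptions made for illustration; text blocks assume Java 15 or later.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class TitlesByParent {
    // Hypothetical input: the same <title> element under two different parents.
    static final String XML = """
        <book>
          <chapter>
            <title>Basics</title>
            <section>
              <title>Install</title>
            </section>
          </chapter>
        </book>
        """;

    static final String XSLT = """
        <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
          <xsl:output method="html" indent="no"/>
          <xsl:strip-space elements="*"/>
          <xsl:template match="/book">
            <html><body>
              <!-- Table of contents: a second visit to titles already in memory. -->
              <ul>
                <xsl:for-each select="//title">
                  <li><xsl:value-of select="."/></li>
                </xsl:for-each>
              </ul>
              <xsl:apply-templates/>
            </body></html>
          </xsl:template>
          <!-- The parent decides which HTML heading a title becomes. -->
          <xsl:template match="chapter/title"><h2><xsl:value-of select="."/></h2></xsl:template>
          <xsl:template match="section/title"><h3><xsl:value-of select="."/></h3></xsl:template>
        </xsl:stylesheet>
        """;

    // Applies the stylesheet to the input and returns the serialized result.
    static String transform(String xslt, String xml) throws Exception {
        Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xslt)));
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(transform(XSLT, XML));
    }
}
```

The input is parsed once; the for-each loop and the apply-templates call both read from the same in-memory tree.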

IDN: Can you provide a short example of how a developer without much XML or XSLT experience (but strong skills in ASP/.NET, Java, C++ and Open Source) might use that technique to solve a common data-sharing or web services/integration problem?

Rendon: As an example of using XPath to access the children of an element, consider an XML financial database that has a structure like this:


Let's say you need a report of all the events for 2002-11-20 that closed above the opening price. Because you know the structure of the XML, you can address the children of an event, compare them and return the entire event as follows:


This example uses XSLT together with XPath to check the date attribute of an event, and then check to see if its close is greater than its open price.

Note that the &gt; shown in the statement is the XML escape for the "greater than" sign. XPath is the agent that allows sophisticated queries on the data and returns the resulting node set to XSLT for processing. The XSLT instruction in this example gets processed for every element that is matched by the XPath statement.
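The listings that accompanied this answer are not reproduced in this copy of the interview. As a stand-in, the sketch below shows one way such a query could be written. The events/event/open/close structure, the symbol attribute, the sample prices, the use of xsl:copy-of and the class name EventReport are all assumptions for illustration, not the original code. It uses the XSLT 1.0 processor bundled with the JDK and Java 15+ text blocks.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class EventReport {
    // Hypothetical financial data; the real structure from the interview is unknown.
    static final String XML = """
        <events>
          <event date="2002-11-20" symbol="AAA"><open>10.00</open><close>12.50</close></event>
          <event date="2002-11-20" symbol="BBB"><open>20.00</open><close>19.25</close></event>
          <event date="2002-11-19" symbol="CCC"><open>5.00</open><close>6.00</close></event>
        </events>
        """;

    static final String XSLT = """
        <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
          <xsl:output method="xml" omit-xml-declaration="yes" indent="no"/>
          <xsl:template match="/">
            <report>
              <!-- First predicate checks the date attribute; the second
                   compares the close and open children numerically. -->
              <xsl:copy-of select="//event[@date='2002-11-20'][close &gt; open]"/>
            </report>
          </xsl:template>
        </xsl:stylesheet>
        """;

    // Applies the stylesheet to the input and returns the serialized result.
    static String transform(String xslt, String xml) throws Exception {
        Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xslt)));
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(transform(XSLT, XML));
    }
}
```

With this sample data only the AAA event satisfies both predicates, so it is the only event copied into the report element.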

IDN: In your book, you provide some very good and concise examples of how to use XSLT stylesheets. But I'm not sure where they are derived from. Are there XSLT stylesheet "templates" that developers who don't code directly in XML can use, or do developers need to create their own stylesheets?

Rendon: XSLT is a dynamic language that is written specifically for the data set that you're trying to process. The match statements in an XSLT template use the names of the elements in your XML file, as defined by your DTD (Document Type Definition) or Schema. There are some predefined DTDs and Schemas that can be used generically, like DocBook, and these have some XSLT templates provided.

However, because XSLT, XML, and your target output vary according to the data, no one set of templates will match unless it is written specifically for that data set. Developers can use sample stylesheets found in various places on the web, but they must be tailored to fit the application.

Developers who are using XSLT should fully understand their XML document structure before starting out. XSLT and XPath are simple to use once you understand the syntax, and they provide powerful addressing and conversion tools for XML. If you have any XML knowledge, using XSLT and XPath will be a simple extension of that knowledge. The book also includes some boilerplates for XSLT stylesheets, and breaks down the needed components.
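One widely used starting point, offered here as a general sketch and not necessarily one of the boilerplates in the book, is the identity template: it copies the input unchanged, and the developer adds an override only for the elements that need to differ in the target format. The order, cust and customer element names and the class name IdentityOverride are hypothetical; the code uses the JDK's built-in XSLT 1.0 processor and Java 15+ text blocks.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class IdentityOverride {
    // Hypothetical message from one system that another system expects
    // with <customer> in place of <cust>.
    static final String XML = """
        <order id="7"><cust>Acme</cust><total>42</total></order>
        """;

    static final String XSLT = """
        <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
          <xsl:output method="xml" omit-xml-declaration="yes" indent="no"/>
          <!-- Generic part: copy every node and attribute unchanged. -->
          <xsl:template match="@*|node()">
            <xsl:copy><xsl:apply-templates select="@*|node()"/></xsl:copy>
          </xsl:template>
          <!-- Data-set-specific part: the match uses this document's element name. -->
          <xsl:template match="cust">
            <customer><xsl:apply-templates select="@*|node()"/></customer>
          </xsl:template>
        </xsl:stylesheet>
        """;

    // Applies the stylesheet to the input and returns the serialized result.
    static String transform(String xslt, String xml) throws Exception {
        Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xslt)));
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(transform(XSLT, XML));
    }
}
```

The generic template would work on any XML document, but the override is meaningful only for data that actually contains a cust element, which is why a borrowed stylesheet has to be tailored to the data set at hand.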

[Ed. Note: Several XSLT stylesheet templates, with ample code and rationale for their use, are provided on the web by Prentice Hall in a sample chapter of Rendon's book, XSLT and XPath: A Guide to XML Transformations. Developers can download the sample chapter here:]

Other areas addressed in the book include: working with multiple stylesheets; working with variables; defining conditional XSLT elements; developing XSLT processors; defining namespaces; and working with Saxon (an Open Source XSLT processor) and Xalan (the Apache XSLT stylesheet processor, available in Java and C++, which implements the W3C XSLT and XPath recommendations) to generate multiple output files.