Sept. 3, 2007, 4:57 a.m.
posted by reo
Concatenating XSLT Transformations with a Filter Chain

It is sometimes useful to create a "filter chain" of XSLT transformations, so that the output of one transformation becomes the input of the next. This section of the tutorial shows you how to do that.

1 Writing the Program

Start by writing a program to do the filtering. This example will show the full source code, but you can use one of the programs you've been working on as a basis, to make things easier.

Note
The code described here is contained in FilterChain.java.

The sample program includes the import statements that identify the package locations for each class:

import javax.xml.parsers.FactoryConfigurationError;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.XMLFilter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.TransformerConfigurationException;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.SAXSource;
import javax.xml.transform.sax.SAXResult;
import javax.xml.transform.stream.StreamSource;
import javax.xml.transform.stream.StreamResult;
import java.io.*;

The program also includes the standard error handlers you're used to. They're listed here, just so they are all gathered together in one place:
}
catch (TransformerConfigurationException tce) {
// Error generated while configuring the transformer factory
System.out.println ("* Transformer Factory error");
System.out.println(" " + tce.getMessage() );
// Use the contained exception, if any
Throwable x = tce;
if (tce.getException() != null)
x = tce.getException();
x.printStackTrace();
}
catch (TransformerException te) {
// Error generated during the transformation
System.out.println ("* Transformation error");
System.out.println(" " + te.getMessage() );
// Use the contained exception, if any
Throwable x = te;
if (te.getException() != null)
x = te.getException();
x.printStackTrace();
}
catch (SAXException sxe) {
// Error generated by this application
// (or a parser-initialization error)
Exception x = sxe;
if (sxe.getException() != null)
x = sxe.getException();
x.printStackTrace();
}
catch (ParserConfigurationException pce) {
// Parser with specified options can't be built
pce.printStackTrace();
}
catch (IOException ioe) {
// I/O error
ioe.printStackTrace();
}
In between the import statements and the error handling, the core of the program consists of the code shown below.
public static void main (String argv[])
{
if (argv.length != 3) {
System.err.println ("Usage: java FilterChain stylesheet1 " +
"stylesheet2 xmlfile");
System.exit (1);
}
try {
// Read the arguments
File stylesheet1 = new File(argv[0]);
File stylesheet2 = new File(argv[1]);
File datafile = new File(argv[2]);
// Set up the input stream
BufferedInputStream bis = new
BufferedInputStream(new FileInputStream(datafile));
InputSource input = new InputSource(bis);
// Set up to read the input file
SAXParserFactory spf = SAXParserFactory.newInstance();
// XSLT processing expects namespace-aware SAX events
spf.setNamespaceAware(true);
SAXParser parser = spf.newSAXParser();
XMLReader reader = parser.getXMLReader();
// Create the filters (see Note #1)
SAXTransformerFactory stf =
(SAXTransformerFactory)
TransformerFactory.newInstance();
XMLFilter filter1 = stf.newXMLFilter(
new StreamSource(stylesheet1));
XMLFilter filter2 = stf.newXMLFilter(
new StreamSource(stylesheet2));
// Wire the output of the reader to filter1 (see Note #2)
// and the output of filter1 to filter2
filter1.setParent(reader);
filter2.setParent(filter1);
// Set up the output stream
StreamResult result = new StreamResult(System.out);
// Set up the transformer to process the SAX events generated
// by the last filter in the chain
Transformer transformer = stf.newTransformer();
SAXSource transformSource = new SAXSource(
filter2, input);
transformer.transform(transformSource, result);
} catch (...) {
...
Notes
1. This weird bit of code is explained by the fact that SAXTransformerFactory extends TransformerFactory, adding methods to obtain filter objects. The newInstance() method is a static method defined in TransformerFactory, which (naturally enough) returns a TransformerFactory object. In reality, though, the object it returns is a SAXTransformerFactory. So, to get at the extra methods defined by SAXTransformerFactory, the return value must be cast to the actual type.
2. An XMLFilter object is both a SAX reader and a SAX content handler. As a SAX reader, it sends SAX events to whatever object has registered to receive them. As a content handler, it consumes SAX events generated by its "parent" object, which is, of necessity, a SAX reader as well. (Calling the event generator a "parent" must make sense when looking at the internal architecture. From the external perspective, the name doesn't appear to be particularly fitting.) The fact that filters both generate and consume SAX events allows them to be chained together.

2 Understanding How it Works

The code listed above shows you how to set up the transformation. The figure should help you get a better feel for what's happening when it executes.

Figure: Operation of chained filters

When you create the transformer, you pass it a SAXSource object, which encapsulates a reader (in this case, filter2) and an input stream. You also pass it a reference to the result stream, where it directs its output. The diagram shows what happens when you invoke transform() on the transformer. Here is an explanation of the steps:

1. The transformer registers an internal object as the content handler for filter2 and tells filter2 to parse the input source.
2. filter2, in turn, registers itself as the content handler for filter1 and tells filter1 to parse the input source.
3. filter1, in turn, tells the parser (reader) to parse the input source.
4. The parser does so, generating SAX events that it passes to filter1.
5. filter1, acting as a content handler, processes the events and applies its transformation. Then, acting as a SAX reader, it sends SAX events to filter2.
6. filter2 does the same, sending its events to the transformer's content handler, which writes the output stream.
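The wiring used in main() can also be tried without any files. The following is a minimal sketch, not part of FilterChain.java: the class name InMemoryFilterChain, the transform() helper, and the two toy stylesheets (one renames <a> to <b>, the other renames <b> to <c>) are all made up for illustration. It assumes, as the tutorial code does, that the default TransformerFactory can be cast to SAXTransformerFactory.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXSource;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import org.xml.sax.InputSource;
import org.xml.sax.XMLFilter;
import org.xml.sax.XMLReader;

// Hypothetical demo class; not part of the tutorial's FilterChain.java.
class InMemoryFilterChain {

    // Toy stylesheet (assumption): turns <a> elements into <b> elements.
    static final String RENAME_A_TO_B =
        "<xsl:stylesheet xmlns:xsl='http://www.w3.org/1999/XSL/Transform' version='1.0'>"
      + "<xsl:template match='a'><b><xsl:apply-templates/></b></xsl:template>"
      + "</xsl:stylesheet>";

    // Toy stylesheet (assumption): turns <b> elements into <c> elements.
    static final String RENAME_B_TO_C =
        "<xsl:stylesheet xmlns:xsl='http://www.w3.org/1999/XSL/Transform' version='1.0'>"
      + "<xsl:template match='b'><c><xsl:apply-templates/></c></xsl:template>"
      + "</xsl:stylesheet>";

    // Runs xml through the given stylesheets, in order, as a chain of XMLFilters.
    static String transform(String xml, String... stylesheets) throws Exception {
        SAXParserFactory spf = SAXParserFactory.newInstance();
        spf.setNamespaceAware(true);
        XMLReader reader = spf.newSAXParser().getXMLReader();

        // Assumes the default factory is a SAXTransformerFactory.
        SAXTransformerFactory stf =
            (SAXTransformerFactory) TransformerFactory.newInstance();

        // Each new filter takes the previous link in the chain as its parent.
        XMLReader last = reader;
        for (String sheet : stylesheets) {
            XMLFilter filter = stf.newXMLFilter(new StreamSource(new StringReader(sheet)));
            filter.setParent(last);
            last = filter;
        }

        // An identity transformer serializes the events from the last filter.
        Transformer transformer = stf.newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(
            new SAXSource(last, new InputSource(new StringReader(xml))),
            new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(transform("<a>hello</a>", RENAME_A_TO_B, RENAME_B_TO_C));
    }
}
```

Because each filter only needs an XMLReader as its parent, the loop works for any number of stylesheets; with none at all, the transformer simply reads from the parser.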
3 Testing the Program

To try out the program, you'll create an XML file based on a tiny fraction of the XML DocBook format, and convert it to the ARTICLE format defined here. Then you'll apply the ARTICLE stylesheet to generate an HTML version.

Note
This example processes small-docbook-article.xml using docbookToArticle.xsl and article1c.xsl. The result is the HTML code shown in filter-out.txt. (The browser-displayable versions are small-docbook-article-xml.html, docbookToArticle-xsl.html, article1c-xsl.html, and filterout.html.) See the O'Reilly Web pages for a good description of the DocBook article format.

Start by creating a small article that uses a minute subset of the XML DocBook format:
<?xml version="1.0"?>
<Article>
<ArtHeader>
<Title>Title of my (Docbook) article</Title>
</ArtHeader>
<Sect1>
<Title>Title of Section 1.</Title>
<Para>This is a paragraph.</Para>
</Sect1>
</Article>
Next, create a stylesheet to convert it into the ARTICLE format:
<xsl:stylesheet
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="1.0"
>
<xsl:output method="xml"/> (see Note #1)
<xsl:template match="/">
<ARTICLE>
<xsl:apply-templates/>
</ARTICLE>
</xsl:template>
<!-- Lower level titles strip out the element tag --> (see Note #2)
<!-- Top-level title -->
<xsl:template match="/Article/ArtHeader/Title"> (see Note #3)
<TITLE> <xsl:apply-templates/> </TITLE>
</xsl:template>
<xsl:template match="//Sect1"> (see Note #4)
<SECT><xsl:apply-templates/></SECT>
</xsl:template>
<xsl:template match="Para">
<PARA><xsl:apply-templates/></PARA> (see Note #5)
</xsl:template>
</xsl:stylesheet>
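Before chaining the stylesheet above, it can help to look at the intermediate ARTICLE document it produces on its own. The following is a small sketch under stated assumptions: the class name DocbookToArticleCheck and the convert() helper are invented for this illustration, and the XML and stylesheet are the listings above compressed into string constants, with the "(see Note ...)" annotations and comments left out.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper class for inspecting the first transformation alone.
class DocbookToArticleCheck {

    // The small DocBook article from this section, on one line.
    static final String ARTICLE_XML =
        "<Article>"
      + "<ArtHeader><Title>Title of my (Docbook) article</Title></ArtHeader>"
      + "<Sect1><Title>Title of Section 1.</Title>"
      + "<Para>This is a paragraph.</Para></Sect1>"
      + "</Article>";

    // The DocBook-to-ARTICLE stylesheet from this section, without annotations.
    static final String STYLESHEET =
        "<xsl:stylesheet xmlns:xsl='http://www.w3.org/1999/XSL/Transform' version='1.0'>"
      + "<xsl:output method='xml'/>"
      + "<xsl:template match='/'><ARTICLE><xsl:apply-templates/></ARTICLE></xsl:template>"
      + "<xsl:template match='/Article/ArtHeader/Title'>"
      + "<TITLE><xsl:apply-templates/></TITLE></xsl:template>"
      + "<xsl:template match='//Sect1'><SECT><xsl:apply-templates/></SECT></xsl:template>"
      + "<xsl:template match='Para'><PARA><xsl:apply-templates/></PARA></xsl:template>"
      + "</xsl:stylesheet>";

    // Applies one stylesheet to one document and returns the serialized result.
    static String convert(String xml, String stylesheet) throws Exception {
        Transformer transformer = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(stylesheet)));
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(
            new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(convert(ARTICLE_XML, STYLESHEET));
    }
}
```

In the printed result, the article title is wrapped in a TITLE element, while the section title appears as bare text inside SECT, because no template matches the Title element under Sect1.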
Note
Although it hasn't been mentioned explicitly, XSLT defines a number of built-in (default) template rules. The complete set is listed in section 5.8 of the spec. Mainly, they provide for the automatic copying of text and attribute nodes, and for skipping comments and processing instructions. They also dictate that inner elements are processed, even when their containing tags don't have templates. That is the reason that the text node in the section title is processed, even though the section title is not covered by any template.

Now, run the FilterChain program, passing it the stylesheet above, the ARTICLE stylesheet, and the small DocBook file, in that order. The result should look like this:

<html>
<body>
<h1 align="center">Title of my (Docbook) article</h1>
<h1>Title of Section 1.</h1>
<p>This is a paragraph.</p>
</body>
</html>

4 Conclusion

Congratulations! You have completed the XSLT tutorial! There is a lot you can do with XML and XSLT, and you are now prepared to explore the many exciting possibilities that await.
