
Thursday, October 25, 2012

Groovy Goodness: Pretty Print XML

The easiest way to pretty print an XML structure is with the XmlUtil class. The class has a serialize() method that is overloaded for several parameter types, such as String, GPathResult and Node. We can pass an OutputStream or Writer object as an extra argument to write the pretty formatted XML to. If we don't specify one, the serialize() method returns a String value.

import groovy.xml.*

def prettyXml = '''\<?xml version="1.0" encoding="UTF-8"?><languages>
  <language id="1">Groovy</language>
  <language id="2">Java</language>
  <language id="3">Scala</language>
</languages>
'''


// Pretty print a non-formatted XML String.
def xmlString = '<languages><language id="1">Groovy</language><language id="2">Java</language><language id="3">Scala</language></languages>'
assert XmlUtil.serialize(xmlString) == prettyXml

// Use Writer object as extra argument.
def xmlOutput = new StringWriter()
XmlUtil.serialize xmlString, xmlOutput
assert xmlOutput.toString() == prettyXml

// Pretty print a Node.
Node languagesNode = new XmlParser().parseText(xmlString)
assert XmlUtil.serialize(languagesNode) == prettyXml


// Pretty print a GPathResult.
def languagesResult = new XmlSlurper().parseText(xmlString)
assert XmlUtil.serialize(languagesResult) == prettyXml


// Pretty print org.w3c.dom.Element.
org.w3c.dom.Document doc = DOMBuilder.newInstance().parseText(xmlString)
org.w3c.dom.Element root = doc.documentElement
assert XmlUtil.serialize(root) == prettyXml


// Little trick to pretty format
// the result of StreamingMarkupBuilder.bind(). 
def languagesXml = {
    languages {
        language id: 1, 'Groovy'
        language id: 2, 'Java'
        language id: 3, 'Scala'
    }
}
def languagesBuilder = new StreamingMarkupBuilder()
assert XmlUtil.serialize(languagesBuilder.bind(languagesXml)) == prettyXml
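The samples above use a Writer as the extra argument. As a small sketch of the OutputStream variant mentioned earlier, we can serialize into a ByteArrayOutputStream. The variable names here are only for illustration. Because the exact whitespace and line separators in the output can differ between platforms and JDK versions, this sketch checks the content by parsing the result again instead of comparing the complete String.

```groovy
import groovy.xml.*

def xmlString = '<languages><language id="1">Groovy</language><language id="2">Java</language></languages>'

// Serialize into an in-memory OutputStream instead of a Writer.
def bytes = new ByteArrayOutputStream()
XmlUtil.serialize(xmlString, bytes)
def result = bytes.toString('UTF-8')

// Elements with only text content stay on a single line.
assert result.contains('<language id="1">Groovy</language>')

// Parse the pretty printed XML again and check the structure,
// so the check doesn't depend on indentation or line endings.
def parsed = new XmlSlurper().parseText(result)
assert parsed.language.size() == 2
assert parsed.language.collect { it.text() } == ['Groovy', 'Java']
```

Checking the parsed structure is less strict than comparing Strings, but it is also less likely to break when the code runs on another operating system or Groovy version.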

If we already have a groovy.util.Node object we can also use the XmlNodePrinter. For example, if we use XmlParser to parse XML we get a Node object. We create a new instance of the XmlNodePrinter and use the print() method to output the node with its child nodes. If we don't specify a Writer object the output is sent to System.out.

import groovy.xml.*

// Get groovy.util.Node value.
def xmlString = '<languages><language id="1">Groovy</language><language id="2">Java</language><language id="3">Scala</language></languages>'
Node languages = new XmlParser().parseText(xmlString)


// Create output with all default settings.
def xmlOutput = new StringWriter()
def xmlNodePrinter = new XmlNodePrinter(new PrintWriter(xmlOutput))
xmlNodePrinter.print(languages)

assert xmlOutput.toString() == '''\
<languages>
  <language id="1">
    Groovy
  </language>
  <language id="2">
    Java
  </language>
  <language id="3">
    Scala
  </language>
</languages>
'''


// Create output and set the indent to
// one space.
// (can also be \t for tabs, or other characters)
xmlOutput = new StringWriter()
xmlNodePrinter = new XmlNodePrinter(new PrintWriter(xmlOutput), " " /* indent */)
xmlNodePrinter.print(languages)

assert xmlOutput.toString() == '''\
<languages>
 <language id="1">
  Groovy
 </language>
 <language id="2">
  Java
 </language>
 <language id="3">
  Scala
 </language>
</languages>
'''


// Use properties preserveWhitespace,
// expandEmptyElements and quote to
// change the formatting.
xmlOutput = new StringWriter()
xmlNodePrinter = new XmlNodePrinter(new PrintWriter(xmlOutput))
xmlNodePrinter.with {
    preserveWhitespace = true
    expandEmptyElements = true
    quote = "'" // Use single quote for attributes
}
xmlNodePrinter.print(languages)

assert xmlOutput.toString() == """\
<languages>
  <language id='1'>Groovy</language>
  <language id='2'>Java</language>
  <language id='3'>Scala</language>
</languages>
"""
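All samples above pass a PrintWriter to the XmlNodePrinter. The following sketch shows the default behavior where the output goes to System.out. To be able to inspect what is printed, System.out is temporarily replaced with a PrintStream backed by a ByteArrayOutputStream. That redirect is only there for the purpose of this example and is not needed in normal use.

```groovy
import groovy.xml.*

Node languages = new XmlParser().parseText('<languages><language id="1">Groovy</language></languages>')

// Temporarily redirect System.out so we can inspect the output.
def originalOut = System.out
def captured = new ByteArrayOutputStream()
System.setOut(new PrintStream(captured, true))
try {
    // No Writer argument: the printer writes to System.out.
    new XmlNodePrinter().print(languages)
} finally {
    // Always restore the original System.out.
    System.setOut(originalOut)
}

def printed = captured.toString()
assert printed.contains('<language id="1">')
assert printed.contains('Groovy')
```

Without the redirect, the single line new XmlNodePrinter().print(languages) is enough to show the formatted XML on the console.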

Code written with Groovy 2.0.5

3 comments:

Sérgio Michels said...

On Windows with Grails 2.2.1 (Groovy 2) this assertion is failing.

Andrea Panattoni said...

I ran your first script in the IntelliJ Groovy shell and discovered that the assertions are wrong. It seems that XmlUtil.serialize(...) doesn't put a new line after the heading declaration. I used the Groovy SDK 2.1.8. Do you know if this is a well-known issue?

Hubert Klein Ikkink said...

@Andrea Panattoni: I have run the code again with Groovy 2.1.7 and I got the same issue as you. I have now changed the code sample in the blog post.
