Recently I had a requirement to sort an XML document based on the tag names in the document.
You can sort it using XSLT, but this post tells you how to sort the XML nodes through Java.
Let's extend com.sun.org.apache.xerces.internal.util.DOMUtil (or org.apache.xerces.internal.util.DOMUtil), which has some basic utility methods, by adding a method called sortChildNodes().
This method sorts the children of the given node in ascending or descending order using the given Comparator. It recurses into the descendants up to the specified depth, stopping early wherever a node has no children.
package com.googlepages.aanand.dom;

import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.w3c.dom.Text;

import com.sun.org.apache.xerces.internal.util.DOMUtil;

public class DOMUtilExt extends DOMUtil {

    /**
     * Sorts the children of the given node, recursing up to the specified
     * depth if that many levels are available.
     *
     * @param node
     *            node whose children will be sorted
     * @param descending
     *            true for sorting in descending order
     * @param depth
     *            number of levels to sort; 1 sorts only the direct children
     * @param comparator
     *            comparator used to sort; if null, a default node-name
     *            comparator is used
     */
    public static void sortChildNodes(Node node, boolean descending,
            int depth, Comparator<Node> comparator) {

        NodeList childNodeList = node.getChildNodes();
        if (depth <= 0 || childNodeList.getLength() == 0) {
            return;
        }

        List<Node> nodes = new ArrayList<Node>();
        List<Node> blankTextNodes = new ArrayList<Node>();
        for (int i = 0; i < childNodeList.getLength(); i++) {
            Node tNode = childNodeList.item(i);
            // Sort the next level down before sorting this one
            sortChildNodes(tNode, descending, depth - 1, comparator);
            // Set aside empty (whitespace-only) text nodes
            if (tNode instanceof Text
                    && tNode.getTextContent().trim().length() == 0) {
                blankTextNodes.add(tNode);
            } else {
                nodes.add(tNode);
            }
        }

        // Remove the empty text nodes only after the loop, because the
        // NodeList is live and would shift while we iterate over it
        for (Node blank : blankTextNodes) {
            node.removeChild(blank);
        }

        Comparator<Node> comp = (comparator != null) ? comparator
                : new DefaultNodeNameComparator();
        if (descending) {
            // If descending is true, get the reverse ordered comparator
            comp = Collections.reverseOrder(comp);
        }
        Collections.sort(nodes, comp);

        // appendChild() moves a node that is already in the tree, so
        // appending in sorted order rearranges the children in place
        for (Node element : nodes) {
            node.appendChild(element);
        }
    }

}

class DefaultNodeNameComparator implements Comparator<Node> {

    public int compare(Node arg0, Node arg1) {
        return arg0.getNodeName().compareTo(arg1.getNodeName());
    }

}
The method also removes empty (whitespace-only) text nodes. If descending is set to true, a reverse-ordering comparator is obtained from the Collections utility class.
The utility uses a default NodeName comparator if no comparator is specified. It sorts based on the names of the nodes in the DOM.
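To see the default name ordering and the depth parameter in action, here is a minimal, self-contained sketch. It does not call DOMUtilExt itself: sortByName() is a hypothetical stand-in that copies the same logic into a plain class, because the JDK-internal DOMUtil may not be accessible on newer JDKs. The class name, the helper names, and the sample XML are all made up for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.w3c.dom.Text;
import org.xml.sax.InputSource;

class SortByNameDemo {

    // Hypothetical stand-in for sortChildNodes() with the default
    // node-name ordering; same steps, no dependency on DOMUtil.
    static void sortByName(Node node, boolean descending, int depth) {
        NodeList children = node.getChildNodes();
        if (depth <= 0 || children.getLength() == 0) {
            return;
        }
        List<Node> keep = new ArrayList<>();
        List<Node> blanks = new ArrayList<>();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            sortByName(child, descending, depth - 1);
            boolean blank = child instanceof Text
                    && child.getTextContent().trim().isEmpty();
            (blank ? blanks : keep).add(child);
        }
        for (Node blank : blanks) {
            node.removeChild(blank); // drop whitespace-only text nodes
        }
        Comparator<Node> byName = Comparator.comparing(Node::getNodeName);
        keep.sort(descending ? byName.reversed() : byName);
        for (Node child : keep) {
            node.appendChild(child); // moves the existing child to the end
        }
    }

    // Parses an XML string; checked exceptions are wrapped for brevity.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Names of the direct children of a node, joined with commas.
    static String childNames(Node node) {
        StringBuilder sb = new StringBuilder();
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            if (sb.length() > 0) {
                sb.append(',');
            }
            sb.append(children.item(i).getNodeName());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Document doc = parse("<root>\n <c/>\n <a><z/><y/></a>\n <b/>\n</root>");
        Node root = doc.getDocumentElement();

        sortByName(root, false, 1);                        // direct children only
        System.out.println(childNames(root));              // a,b,c
        System.out.println(childNames(root.getFirstChild())); // z,y (untouched)

        sortByName(root, false, 2);                        // one level deeper
        System.out.println(childNames(root.getFirstChild())); // y,z

        sortByName(root, true, 1);                         // descending
        System.out.println(childNames(root));              // c,b,a
    }
}
```

A depth of 1 reorders only the direct children of the node you pass in; each extra level of depth sorts one more generation below it.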
Writing a Comparator implementation is very simple. For example, you may want to sort a document based on the value of an attribute instead of the tag name.
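import java.util.Comparator;

import org.w3c.dom.Element;
import org.w3c.dom.Node;

// Sorts elements by their "id" attribute; any other pair of nodes
// falls back to comparing node names
class MyComparator3 implements Comparator<Node> {
    public int compare(Node arg0, Node arg1) {
        if (arg0 instanceof Element && arg1 instanceof Element) {
            // getAttribute() returns "" when the attribute is missing
            return ((Element) arg0).getAttribute("id").compareTo(
                    ((Element) arg1).getAttribute("id"));
        } else {
            return arg0.getNodeName().compareTo(arg1.getNodeName());
        }
    }
}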
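One thing to keep in mind with an attribute comparator like this is that compareTo() on strings is lexicographic, so an id of "10" sorts before "9". If your ids are numbers, a variant along the following lines may be closer to what you want. This is only a sketch: NumericIdComparator, idOf() and NumericIdDemo are hypothetical names, and treating a missing or non-numeric id as Long.MAX_VALUE (so those nodes go last) is my own assumption.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;

import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

class NumericIdComparator implements Comparator<Node> {

    // Numeric value of the "id" attribute, or Long.MAX_VALUE when the
    // node is not an element or the id is missing or not a number.
    static long idOf(Node n) {
        if (n instanceof Element) {
            try {
                return Long.parseLong(((Element) n).getAttribute("id").trim());
            } catch (NumberFormatException e) {
                // fall through to the sentinel below
            }
        }
        return Long.MAX_VALUE;
    }

    @Override
    public int compare(Node a, Node b) {
        int byId = Long.compare(idOf(a), idOf(b));
        // Ties (including nodes without a numeric id) are ordered by name
        return byId != 0 ? byId : a.getNodeName().compareTo(b.getNodeName());
    }
}

class NumericIdDemo {

    // Builds <item id="..."/> elements, sorts them, and returns the ids.
    static List<String> sortedIds(String... ids) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            List<Node> items = new ArrayList<>();
            for (String id : ids) {
                Element e = doc.createElement("item");
                e.setAttribute("id", id);
                items.add(e);
            }
            items.sort(new NumericIdComparator());
            List<String> out = new ArrayList<>();
            for (Node n : items) {
                out.add(((Element) n).getAttribute("id"));
            }
            return out;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(sortedIds("10", "9", "2")); // [2, 9, 10]
    }
}
```

An instance of such a comparator can be passed as the last argument of sortChildNodes() in the same way as MyComparator3.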
It's a very simple class that lets you sort the nodes in any way you want. Please leave a comment if you spot a problem with the utility.