Programmatica: 12/01/2006

Saturday, December 16, 2006

Sorting XML in Java

Recently I had a requirement to sort an XML document based on the tag names in the document.
You can sort it using XSLT, but this post tells you how to sort the XML nodes through Java.
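
For comparison, the XSLT route can also be driven from Java through the standard javax.xml.transform API. The following is only a minimal sketch and not part of the utility described in this post: the class name XsltSortSketch, the method sortByTagName and the stylesheet are assumptions made for illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;

import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class XsltSortSketch {

    // Identity-style stylesheet (an assumption for this sketch): it copies
    // every node, but visits the children of each element ordered by name.
    static final String SORT_XSL =
          "<xsl:stylesheet version='1.0' "
        + "xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:strip-space elements='*'/>"
        + "<xsl:template match='@*|node()'>"
        + "<xsl:copy>"
        + "<xsl:apply-templates select='@*'/>"
        + "<xsl:apply-templates select='node()'>"
        + "<xsl:sort select='name()'/>"
        + "</xsl:apply-templates>"
        + "</xsl:copy>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    // Applies the stylesheet to the given XML text and returns the result.
    static String sortByTagName(String xml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(SORT_XSL)));
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)),
                    new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // The children c, a, b should come out in the order a, b, c
        System.out.println(sortByTagName("<root><c/><a/><b/></root>"));
    }
}
```

The rest of this post does the same kind of reordering directly on the DOM, which is handy when the document is already parsed and you want a custom Comparator.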

Let's extend the DOMUtil class, which has some basic utility methods. It is available as com.sun.org.apache.xerces.internal.util.DOMUtil in the JDK and as org.apache.xerces.util.DOMUtil in Apache Xerces. I'm going to extend it by adding a method called sortChildNodes().

This method sorts the children of the given node in ascending or descending order using the given Comparator. It recurses into the descendants up to the specified depth, or as deep as the document goes if it is shallower.


package com.googlepages.aanand.dom;

import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.w3c.dom.Text;

import com.sun.org.apache.xerces.internal.util.DOMUtil;

public class DOMUtilExt extends DOMUtil {

        /**
         * Sorts the children of the given node, recursing up to the specified
         * depth if the document is that deep.
         *
         * @param node
         *            node whose children will be sorted
         * @param descending
         *            true for sorting in descending order
         * @param depth
         *            depth up to which to sort in the DOM
         * @param comparator
         *            comparator used to sort; if null, a default node-name
         *            comparator is used
         */
        public static void sortChildNodes(Node node, boolean descending,
                        int depth, Comparator<Node> comparator) {

                NodeList childNodeList = node.getChildNodes();
                if (depth <= 0 || childNodeList.getLength() == 0) {
                        return;
                }

                List<Node> nodes = new ArrayList<Node>();
                List<Node> blankTextNodes = new ArrayList<Node>();
                for (int i = 0; i < childNodeList.getLength(); i++) {
                        Node tNode = childNodeList.item(i);
                        sortChildNodes(tNode, descending, depth - 1, comparator);
                        // Set aside whitespace-only text nodes; keep everything else
                        if (tNode instanceof Text
                                        && tNode.getTextContent().trim().length() == 0) {
                                blankTextNodes.add(tNode);
                        } else {
                                nodes.add(tNode);
                        }
                }

                // Remove the empty text nodes from the parent
                for (Node blank : blankTextNodes) {
                        node.removeChild(blank);
                }

                Comparator<Node> comp = (comparator != null) ? comparator
                                : new DefaultNodeNameComparator();
                // If descending is true, sort with the reverse-ordered comparator
                Collections.sort(nodes,
                                descending ? Collections.reverseOrder(comp) : comp);

                // appendChild moves an existing child to the end, so appending
                // in sorted order rearranges the children in place
                for (Node child : nodes) {
                        node.appendChild(child);
                }
        }

}

class DefaultNodeNameComparator implements Comparator<Node> {

        public int compare(Node arg0, Node arg1) {
                return arg0.getNodeName().compareTo(arg1.getNodeName());
        }

}

The method also removes empty text nodes. If descending is true, a reverse-ordering comparator is obtained from the Collections utility class.

The utility uses a default node-name comparator if no comparator is specified. It sorts based on the names of the nodes in the DOM.
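
To try the idea without depending on the internal Xerces class, here is a reduced, self-contained sketch that uses only the standard JAXP and DOM APIs. It sorts just the direct element children of the root by node name. The names SortDemo, sortElementChildren and sortedChildNames are made up for this example and are not part of the utility above.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class SortDemo {

    // Hypothetical helper: reorders only the direct element children.
    static void sortElementChildren(Node parent, Comparator<Node> order) {
        List<Node> elements = new ArrayList<>();
        NodeList children = parent.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            if (children.item(i) instanceof Element) {
                elements.add(children.item(i));
            }
        }
        elements.sort(order);
        for (Node element : elements) {
            // appendChild moves an existing child to the end of the parent
            parent.appendChild(element);
        }
    }

    // Parses the XML, sorts the root's children and lists their names.
    static String sortedChildNames(String xml, boolean descending) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            Comparator<Node> byName = Comparator.comparing(Node::getNodeName);
            Element root = doc.getDocumentElement();
            sortElementChildren(root, descending ? byName.reversed() : byName);

            StringBuilder names = new StringBuilder();
            for (Node n = root.getFirstChild(); n != null; n = n.getNextSibling()) {
                if (n instanceof Element) {
                    names.append(n.getNodeName()).append(' ');
                }
            }
            return names.toString().trim();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<root><pear/><apple/><mango/></root>";
        System.out.println(sortedChildNames(xml, false)); // apple mango pear
        System.out.println(sortedChildNames(xml, true));  // pear mango apple
    }
}
```

The sketch relies on the same trick as sortChildNodes(): appending a node that is already a child moves it, so appending in sorted order is enough to reorder the parent.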

Writing a Comparator implementation is very simple. For example, you may want to sort a document based on an attribute of its elements.


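import java.util.Comparator;

import org.w3c.dom.Element;
import org.w3c.dom.Node;

class MyComparator3 implements Comparator<Node> {

        public int compare(Node arg0, Node arg1) {
                // Compare by the "id" attribute when both nodes are elements;
                // getAttribute returns "" if the attribute is missing
                if (arg0 instanceof Element && arg1 instanceof Element) {
                        return ((Element) arg0).getAttribute("id").compareTo(
                                        ((Element) arg1).getAttribute("id"));
                }
                // Otherwise fall back to comparing the node names
                return arg0.getNodeName().compareTo(arg1.getNodeName());
        }

}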

It's a very simple class that lets you sort the nodes any way you want. Please leave a comment if you spot a problem with the utility.
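
One thing to watch with attribute comparators is that compareTo on the raw attribute text gives string order, so an id of "10" sorts before "9". The sketch below contrasts text and numeric ordering. The class IdComparatorSketch, the two comparators and the "item" elements are hypothetical, and the numeric version assumes every id parses as an integer.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;

import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class IdComparatorSketch {

    // Text order, equivalent to compareTo on the raw attribute strings
    static final Comparator<Element> BY_ID_TEXT =
            Comparator.comparing(e -> e.getAttribute("id"));

    // Numeric order; assumes every id attribute parses as an int
    static final Comparator<Element> BY_ID_NUMBER =
            Comparator.comparingInt(e -> Integer.parseInt(e.getAttribute("id")));

    // Builds one <item id="..."/> per id, sorts them and returns the ids.
    static List<String> sortedIds(Comparator<Element> order, String... ids) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            List<Element> items = new ArrayList<>();
            for (String id : ids) {
                Element item = doc.createElement("item");
                item.setAttribute("id", id);
                items.add(item);
            }
            items.sort(order);

            List<String> result = new ArrayList<>();
            for (Element item : items) {
                result.add(item.getAttribute("id"));
            }
            return result;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(sortedIds(BY_ID_TEXT, "9", "10", "2"));   // [10, 2, 9]
        System.out.println(sortedIds(BY_ID_NUMBER, "9", "10", "2")); // [2, 9, 10]
    }
}
```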

Tuesday, December 12, 2006

Vim: The Beauty

I have always been on the lookout for new editors and have long been a fan of Vim, though I was not using it very often. I have used it for editing Ruby and Java with IntelliSense-like completion (omni completion), which is available from version 7.0. But now I feel that it is also very easy to customize and beautify the editor to suit your needs.
I tried a new color scheme and font for Vim 7.0 from this site: http://iamphet.nm.ru/vim/index.html

Then I set up the default look of the editor by editing the "Startup Settings" in the Edit menu.

set nocompatible
source $VIMRUNTIME/vimrc_example.vim
source $VIMRUNTIME/mswin.vim
behave mswin
set nobackup
" Show line numbers
set number
colorscheme northsky
set guifont=ke9x15
set diffexpr=MyDiff()
function MyDiff()
  let opt = ''
  if &diffopt =~ 'icase' | let opt = opt . '-i ' | endif
  if &diffopt =~ 'iwhite' | let opt = opt . '-b ' | endif
  " The ! must come first; the quotes protect the space in the path
  silent execute '!"e:\Program Files\vim\diff" -a ' . opt . v:fname_in . ' ' . v:fname_new . ' > ' . v:fname_out
endfunction

Before that I installed the font kex.fon from the "Fonts" window in the Control Panel.
The best resource for customizing Vim is the Vim user's guide.

Happy Vimming!

Saturday, December 09, 2006

Work around for math:max in XPath 1.0

XPath 1.0 doesn't provide a max() function, but in XSLT 1.0 you can get the same result by sorting the nodes numerically and picking the one at position last().

For example:


<?xml version="1.0" encoding="UTF-8"?>
<nodes>
     <node name="1" value="2"></node>
     <node name="2" value="3"></node>
     <node name="3" value="1"></node>
     <node name="4" value="5"></node>
     <node name="5" value="4"></node>
</nodes>

has nodes with different values. To find the maximum of the values available, we can sort by value and use the last() function as follows:


<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
      <xsl:template match="nodes">
              <xsl:for-each select="node">
                      <xsl:sort select="@value" data-type="number"/>
                      <xsl:if test="position() = last()">
                              Max value - <xsl:value-of select="@value"/>
                      </xsl:if>
              </xsl:for-each>
      </xsl:template>
</xsl:stylesheet>

After the numeric sort, the node at position last() carries the maximum value (5 for the document above). This can be used as a workaround if we are not using XPath 2.0.

Note that last() by itself does not compute a maximum: it only returns the position of the last node in the current context. An expression such as node/@value[last()] evaluates last() separately for each node's attributes, so xsl:value-of prints the value of the first node (2 here).
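
If you need the maximum outside a stylesheet, a single XPath 1.0 expression can select it as well: keep the values that are not smaller than any other value. Below is a minimal sketch using the standard javax.xml.xpath API. The class XPathMaxSketch and the method maxValue are names invented for this example, and the expression assumes the nodes/node/@value layout shown above.

```java
import java.io.StringReader;

import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;

import org.xml.sax.InputSource;

public class XPathMaxSketch {

    // Selects the value attributes for which no other value attribute is
    // larger. Comparing against a node-set is true if ANY member matches,
    // so not(. < set) means "nothing in the set is greater than me".
    static final String MAX_EXPR = "//node/@value[not(. < //node/@value)]";

    // Evaluates the expression and returns the maximum as a number.
    static double maxValue(String xml) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            // The String overload returns the string value of the first
            // selected node, which is enough here even if there are ties.
            String max = xpath.evaluate(MAX_EXPR,
                    new InputSource(new StringReader(xml)));
            return Double.parseDouble(max);
        } catch (XPathExpressionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<nodes>"
                + "<node name='1' value='2'/>"
                + "<node name='2' value='3'/>"
                + "<node name='3' value='1'/>"
                + "<node name='4' value='5'/>"
                + "<node name='5' value='4'/>"
                + "</nodes>";
        System.out.println(maxValue(xml)); // 5.0
    }
}
```

This compares every value with every other value, so it is best kept for small documents.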

Sunday, December 03, 2006

Ruby Users Guide

I have uploaded the Ruby User's Guide in CHM format for ease of use. I'll be uploading other open-source-licensed books in CHM format as well.

Ruby User's Guide
