Sorting tables with Java 
        
        
        Tuesday, January 09, 2007
         
        
        
        One of the main functions of computer applications is the visualization of data. The most common visualization format is a table with rows and column.
I find that the most desirable function in a table is that of sorting by specific column types. Whether sorting alphanumerically, dates, or object types, it helps the user find the information faster.
Not all data needs sorting, though. For example, sorting the following is not really needed:
| Name | Age | 
| Jose | 145 | 
| Gaius Julius Caesar | 2106 | 
(This is such a silly example as we rarely encounter so little data to analyze.)
The solution to this almost trivial problem depends on how the system has actually being designed: what is the architecture of the system.
After looking back at the sorting algorithms I've used in the past, I started thinking of the different ways table sorting can actually be implemented.
Every case I present here does the same thing (sort a table), however, each case deals with the actual system at hand differently. In other words, the solutions presented are part of an existing web framework.
Note that there is nothing really earth shattering or new about any of these solutions. I just thought they would make a good example of different web architectures, and simple curiosity for some, for an almost trivial but necessary requirement in most web applications: sorting by column type.
(Click on the sequence diagrams to enlarge.)
Client Only Solution (DOM-JavaScript)The client only solution makes use of JavaScript and DOM to do all the sorting. 

This is quite an easy solution to implement, as all the data is in the rendered HTML and sorting is accomplished via JavaScript, DOM, and CSS--of course, the data is stored in a DB, but I didn't include it in the diagram for simplicity's sake.
Matt Kruse has a very neat implementation of this and is available as a (free) JavaScript library. I won't steal his thunder, so go the 
examples page to see the versatility of his solution.
The issue with a client only sorting implementation is that the data set can change on the database at any time and if we don't refresh the page with the new data important decisions could be made with incomplete information. Hence, in theory, every sorting request should check for new data and just keep sorting away. 
Note, however, that if a table is part of a summary, for example, monthly income reports, then it is safe to use the JavaScript sorting as is--no need for data refresh.
But what about when the data needs to be refreshed for every sort request? 
The simple act of sorting can put quite a load on resources if we have thousands of users doing the same thing at the same time and if every sorting action means a trip to the whole system. Therefore, any solution needs to take into account how it will behave under heavy load in the existing architecture.
As end users, we only care that our applications work. We don't want to know how complex, or simple the solution is. All we want to see is our tables sorting instantaneously. Hence, I present a few solutions (some better than others).
Case 1Go directly from the web page to the database and use SQL to return the sorted items (essentially, using "ORDER BY"). 

This is how web application were coded circa 1996: C or Perl application used to talk directly to databases. We still find this development pattern being implemented with newer scripting languages like PHP or Ruby.
Case 2Add a middle tier to handle the business requirements of your application. 

This is preferable to Case 1 above, however, it is probably not the most scalable solution. Nothing surprising here. All that has been done differently is that I've moved the logic embedded in the web page (JSP) into some middle tier, which is still part of the Web layer.
In this realm, you can find MTS/COM+ components with ASP pages and Servlet with JSP interaction, where you have massive middle tier servers doing all the work, i.e., concurrency, load handling, etc.
Case 3This is a typical MVC pattern implementation. 

This way of doing things just separates code into logical chunks to do stuff. Even though it is a full MVC pattern solution, is not a very smart one. Every time sorting needs to be done, the whole system needs to be put into action--could be expensive with thousands of concurrent users.
Note that this is an advantage over the case 1 and 2, as I'm not coupling data to my actual business logic. And this separation is what leads to scalability of Software Development: you can have teams of 50 or 100 developers to work on the same code tree without any worries of duplicate or lost work.
Case 4Still and MVC solution, with some minor caching added.

What I mean by caching is that for every sorting request the set is sorted in some web layer object. In Java, the data set can be stored in a collection object in the HttpSession.
I think you can see how sorting at the Web layer would be much more efficient when thousands of users are hitting the same server, i.e., the ball stops at the Controller layer and business and DB objects are not doing any of the work.
Case 5Now the bleeding edge solution (meant to be funny). 
I did say that if your data set needs to be updated for every sorting request (you always need the latest data), then you can use Case 3 (use the whole system to sort by column). However, there is a better solution that makes use of Matt's JavaScript sorting library and an AJAX call.

This is what is happening and why this is more efficient than Case 3 or 4:
- The first time the table is displayed, there are no two ways about it, you must hit the DB to get the data set.
 
 
- Once the data is rendered, you can serve every sorting request with the JavaScript library (no back end interaction means fast execution and no server load).
 
 Now, what if the data changed on the back end after the first request and you need to have the latest data set at all times?
 
 
-  This is where a bit of AJAX can be used if you don't want your users to keep refreshing the web page, i.e., Case 3 or 4. 
 
 You can make an asynchronous call to check if the data set has changed or not. And if it has, update the collection at the web layer. This will guarantee that the next time the table is sorted, fresh new data will be available. Once the fresh data has been rendered, then the JavaScript sorting code takes over and everything is quite fast again.
 
 Note that the JavaScript sorting action needs to "know" that the collection at the web layer has been updated, and hence go to the controller to get the latest data set. I'm not sure if this step is clear in the sequence diagram, as I didn't want to combine the JavaScript sorting with the checking of the web layer for the data in the diagram, but these two actions must be coded as one. Also note that every asynchronous call (the AJAX part) does hit the backend at some pre-defined interval, however, to your user the interactivity of the application has improved, giving them an almost desktop feel to the web application.
SummaryAny solution presented here would end up sorting tables by column, however, each one is different depending on the architecture of the solution and the specific user requirement. 
Case 5 is a combination of solutions and may seem a bit over done with so many layers doing stuff. However, if the requirement is part of an existing architecture, we must code to accommodate the design, i.e., the architecture is there for a reason, whether is to take advantage of load balancing or logical layering services.
 
		
        
   
Comments:
    
   
      
         Hi,
Your last solution will not work if you have to use paging, because then you cannot sort the data on the client anymore. If you need to show thousands of objects (paged) then the only logical solution is to sort on the business layer.
Regards,
Markus(java performance blog)
   
      
         Hi Markus. Thanks for your comment. I wish you had left an email address, but hope you come back.
Actually, the last solution DOES work with paging.
If you read in detail, I said to use Matt's JavaScript sorting solution.
The library does handle multiple pages for large data sets. Go to the examples page at http://www.javascripttoolbox.com/lib/table/examples.php and navigate to the bottom of the page. There is an example with paging that you can take a look at.
I hope that helps.