Ajax address suggestion using Lucene and J2EE
This post describes an Ajax address suggestion feature, implemented as a J2EE web application, that displays matching addresses as you type.
I used the “National Register of Historic Places” data, which was available at http://www.itasoftware.com/careers/SolveThisWorkHerePuzzles.html?catid=114
The data file is an XML file containing around 84,000 addresses. Each address is represented as a Property object.
The address data is indexed using the open source search engine Apache Lucene. The XML file is loaded into memory using the JAXB APIs.
To use JAXB, we need a schema from which to generate the classes corresponding to each property element in the XML data. I generated the schema for the XML file using Trang (trang.jar).
The command for generating the schema from the XML document is:
java -jar trang.jar -I xml -O xsd nrhp.xml nrhp.xsd
The xsd document is as follows.
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified">
  <xs:element name="properties">
    <xs:complexType>
      <xs:sequence>
        <xs:element maxOccurs="unbounded" ref="property"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <xs:element name="property">
    <xs:complexType>
      <xs:sequence>
        <xs:element maxOccurs="unbounded" ref="name"/>
        <xs:element minOccurs="0" ref="address"/>
        <xs:element minOccurs="0" ref="city"/>
        <xs:element ref="state"/>
      </xs:sequence>
      <xs:attribute name="id" use="required" type="xs:integer"/>
    </xs:complexType>
  </xs:element>
  <xs:element name="name" type="xs:string"/>
  <xs:element name="address" type="xs:string"/>
  <xs:element name="city" type="xs:string"/>
  <xs:element name="state" type="xs:NCName"/>
</xs:schema>
Next, the schema needs to be bound, which generates a set of Java classes that represent the schema. This can be done using the xjc command from the command line or using the xjc Eclipse plugin. The following classes are generated.
com.objects.ObjectFactory.java
com.objects.Properties.java
com.objects.Property.java
In most cases, the generated binding classes should be treated as read-only, but they can be modified with care. I modified the class com.objects.Properties to include the following method.
public void setProperty(List<Property> prop) {
property = prop;
}
This method is used when returning the results from the server to the web browser. I created two classes: com.dataConversion.Xml2Obj to convert the XML document to Java objects, and com.dataConversion.Obj2Xml to convert Java objects back to an XML document. With JAXB, both tasks are easy to implement.
Class com.dataConversion.Xml2Obj
package com.dataConversion;
import java.io.File;
import java.util.List;
import javax.xml.bind.JAXBContext;
import javax.xml.bind.JAXBException;
import javax.xml.bind.Unmarshaller;
public class Xml2Obj {
public List<com.objects.Property> getAddressList(String xmlFilePath) throws JAXBException {
JAXBContext jc = JAXBContext.newInstance("com.objects");
Unmarshaller unmarshaller = jc.createUnmarshaller();
com.objects.Properties properties =
(com.objects.Properties)unmarshaller.unmarshal(new File(xmlFilePath));
return properties.getProperty();
}
}
Class com.dataConversion.Obj2Xml
package com.dataConversion;
import java.io.Writer;
import javax.xml.bind.JAXBContext;
import javax.xml.bind.JAXBException;
import javax.xml.bind.Marshaller;
public class Obj2Xml {
public void persistAddressList(com.objects.Properties prop, Writer writer) throws JAXBException {
JAXBContext jaxbContext = JAXBContext.newInstance("com.objects");
Marshaller marshaller = jaxbContext.createMarshaller();
marshaller.marshal(prop, writer);
}
}
The most important step is indexing the data. The address data is indexed on the address and city fields together. Since I keep the index in memory for performance, I used org.apache.lucene.store.RAMDirectory to store it. Indexing involves adding org.apache.lucene.document.Document objects to an org.apache.lucene.index.IndexWriter. Each Document corresponds to a Property object, and the property id is the key. The class com.dataServices.AddressSearch builds the index and provides a method for searching it. The implementation is as follows.
package com.dataServices;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Hashtable;
import java.util.List;
import javax.xml.bind.JAXBException;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.CorruptIndexException;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.queryParser.ParseException;
import org.apache.lucene.queryParser.QueryParser;
import org.apache.lucene.search.Hits;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.Searcher;
import org.apache.lucene.store.LockObtainFailedException;
import org.apache.lucene.store.RAMDirectory;
import com.objects.*;
import com.dataConversion.*;
public class AddressSearch {
private RAMDirectory inMemoryIndex;
private IndexWriter indexWriter;
private Searcher searcher;
private StandardAnalyzer standardAnalyzer;
private QueryParser queryParser;
private Hashtable<String, Property> addressLookup= null;
private static AddressSearch addressSearch;
//Default file path
private String xmlFilePath = "C:/java/nrhp.xml";
private AddressSearch() throws CorruptIndexException, LockObtainFailedException, IOException, JAXBException {
inMemoryIndex = new RAMDirectory();
indexWriter = new IndexWriter(inMemoryIndex, new StandardAnalyzer());
addressLookup = new Hashtable<String, Property>();
buildIndex();
searcher = new IndexSearcher(inMemoryIndex);
standardAnalyzer = new StandardAnalyzer();
queryParser = new QueryParser("data", standardAnalyzer);
queryParser.setAllowLeadingWildcard(true);
}
private void buildIndex() throws JAXBException, CorruptIndexException, IOException {
List<Property> propList = getAddressList();
Document doc = null;
StringBuffer sb = null;
for (Property p: propList) {
doc = new Document();
sb= new StringBuffer();
if ((p.getId()==null)||(p.getAddress()==null)) continue;
doc.add(new Field("propId", p.getId().toString(),Field.Store.YES, Field.Index.NO));
if ((p.getAddress()==null) || (p.getCity()==null) || (p.getState()==null))
continue;
sb.append(p.getAddress().replaceAll(" ", "")).append(p.getCity().replaceAll(" ", ""));
doc.add(new Field("data", sb.toString(), Field.Store.COMPRESS, Field.Index.TOKENIZED));
indexWriter.addDocument(doc);
//also add to HashTable
addressLookup.put(p.getId().toString(), p);
}
indexWriter.optimize();
indexWriter.close();
}
public List<Property> search(String searchString) throws ParseException, IOException {
List<Property> results = new ArrayList<Property>();
Query query = queryParser.parse(searchString);
Hits hits = searcher.search(query);
if (hits.length() != 0)
for (int i=0; i< hits.length(); i++) {
results.add(addressLookup.get(hits.doc(i).get("propId")));
}
return results;
}
public static synchronized AddressSearch getAddressSearch() throws Exception{
if (addressSearch == null)
addressSearch = new AddressSearch();
return addressSearch;
}
private List<Property> getAddressList() throws JAXBException {
Xml2Obj x2o = new Xml2Obj();
return x2o.getAddressList(xmlFilePath);
}
public static void initialize(String xmlFile) throws Exception {
getAddressSearch();
}
}
The class com.dataServices.GetAddress is an HttpServlet that provides the address search functionality by accessing the AddressSearch object. The result of the search is a list of Property objects that match the search term. The Property objects are converted to XML using the class Obj2Xml and sent back as the response.
The page addressSuggestion.jsp is the client-side page where users type in an address. The important JavaScript functions, which make the Ajax request and handle the response, are shown below.
//For Ajax
var req;
function init() {
if (window.XMLHttpRequest)
req = new XMLHttpRequest();
else if (window.ActiveXObject)
req = new ActiveXObject("Microsoft.XMLHTTP");
var url = 'http://' + location.host + '/AjaxAddressSugg/GetAddress';
req.open("POST", url, true);
req.setRequestHeader("Content-Type", "application/x-www-form-urlencoded");
}
function initiateSearch() {
var searchString = document.getElementById('address').value;
if ((searchString == null) || (searchString.length < 3)) {
return;
}
init();
req.onreadystatechange = displayResults;
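//Encode the value so characters such as '&' or '+' do not corrupt the form body
var request="searchString="+encodeURIComponent(searchString);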
req.send(request);
}
function displayResults() {
if (req.readyState == 4) {
if (req.status == 200) {
var propertyTags = req.responseXML.getElementsByTagName('property');
for (var i=0; i< propertyTags.length; i++) {
var full_address = "<table width='100%' cellspacing='0' cellpadding='1'>"
+ "<tr style='background-color: #CCDEF7;'><td><b>";
for (var j=0; j< propertyTags[i].getElementsByTagName('name').length; j++)
full_address = full_address +
propertyTags[i].getElementsByTagName('name')[j].firstChild.nodeValue;
full_address = full_address + "</b></td></tr>"
+ "<tr style='background-color: #DDE9FB;'><td>"
+ propertyTags[i].getElementsByTagName('address')[0].firstChild.nodeValue
+ ' ' + propertyTags[i].getElementsByTagName('city')[0].firstChild.nodeValue
+ ' ' + propertyTags[i].getElementsByTagName('state')[0].firstChild.nodeValue
+ '</td></tr></table>';
var row = document.getElementById('table001').insertRow(-1);
var cell = row.insertCell(-1);
cell.innerHTML = full_address;
}
}
}
}
function removeResults() {
var resultsTable = document.getElementById('table001');
var records = resultsTable.rows.length;
for (var i=(records-1); i> 0; i--)
resultsTable.deleteRow(i);
}
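The request-building step in initiateSearch can also be pulled out into a small function with no DOM dependencies, which makes the three-character minimum and the URL encoding easy to check in isolation. This is only a sketch: buildSearchRequest and MIN_SEARCH_LENGTH are names made up for illustration and are not part of the attached source, and trimming the input is an extra assumption.

```javascript
// Hypothetical helper, not part of the original source.
var MIN_SEARCH_LENGTH = 3;

// Returns the POST body for the GetAddress servlet, or null when the
// input is too short to be worth a round trip to the server.
function buildSearchRequest(searchString) {
  if (searchString == null) {
    return null;
  }
  // Assumption: leading and trailing whitespace should not count
  // towards the minimum length.
  var trimmed = String(searchString).replace(/^\s+|\s+$/g, "");
  if (trimmed.length < MIN_SEARCH_LENGTH) {
    return null;
  }
  // encodeURIComponent keeps characters such as '&', '+' and spaces
  // from corrupting the form-encoded body.
  return "searchString=" + encodeURIComponent(trimmed);
}
```

With something like this, initiateSearch would only call init() and req.send() when the function returns a non-null body.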
I have attached the screen shot of one of the searches below.
On my laptop, with a 1.66 GHz Intel Core Duo, 2 GB of RAM and Java version 1.6.0_02, the response time was around 170-250 ms.
The current implementation is not optimized and could be improved a lot. I have enclosed the WAR file, which includes the source code. The WAR file has been tested successfully on Tomcat and Glassfish.
The search fails in some cases with the following exception.
org.apache.lucene.search.BooleanQuery$TooManyClauses: maxClauseCount is set to 1024
This happens when the search string expands into more than 1024 query clauses, for example when a wildcard term matches a large number of indexed terms. It can be fixed by raising the limit, for example with BooleanQuery.setMaxClauseCount(2048).
Also, the Ajax search begins only once the user has entered at least 3 characters.
The servlet above is not thread-safe.
Update
The servlet has been changed to read the location of the XML data file from web.xml instead of relying on a workaround.
Resources
SourceCode and WAR file
Scripting Graphs in Browser
Many of us will have encountered situations where a graph based on dynamic data needs to be part of a web page. In these cases, we usually opt for an open source or commercial graph-generating tool. With a little effort in learning, we can also use the HTML Canvas element to create graphs. In this post, I describe how I used the Canvas element to generate graphs.
The Canvas element is part of the HTML 5 specification, so any browser that conforms to HTML 5 should be able to render it properly. As of now, Firefox (version 1.5 and above) and Opera support it. IE doesn’t support this tag, but there is a workaround.
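Since support varies between browsers, it can help to check for a usable 2D context before trying to draw. The sketch below is one way to do that; supportsCanvas2d is a hypothetical helper name, and it relies only on the getContext method that the drawing code in this post already uses.

```javascript
// Hypothetical helper: reports whether the given element can hand out
// a 2D drawing context. Browsers without Canvas support return an
// element with no getContext method.
function supportsCanvas2d(canvas) {
  return !!(canvas &&
            typeof canvas.getContext === "function" &&
            canvas.getContext("2d"));
}
```

A page could call supportsCanvas2d(document.getElementById("graphCanvas")) and fall back to a static image or a message when it returns false.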
Here is a snapshot of the line graph. Since I used a random number generator to produce the values, the graph looks random.

This is the JavaScript source code, which generated the graph.
var graphArray;
function populateGraphArray() {
//value can be anything from 0 to 100
graphArray = new Array();
for (var i=0; i< 200 ; i++){
graphArray[i] = Math.floor(Math.random()*101);
}
}
function drawGraph() {
var canvas = document.getElementById("graphCanvas");
var graphWidth = canvas.width;
var graphHeight = canvas.height;
if (canvas.getContext) {
var ctx = canvas.getContext("2d");
ctx.beginPath();
//Draw the x and y axis
ctx.lineWidth = 5;
ctx.moveTo(0, 0);
ctx.lineTo(0, graphHeight);
ctx.lineTo(graphWidth, graphHeight);
ctx.stroke();
//Draw the Graph; start a new path so the axis is not stroked again
ctx.beginPath();
ctx.lineWidth = 1;
ctx.strokeStyle='rgb(114, 172, 216)';
var unitX;
var unitY;
if (graphHeight > 100)
unitY = Math.floor(graphHeight/100);
else
unitY = Math.floor(100/graphHeight);
if ( graphWidth > graphArray.length)
unitX = Math.floor(graphWidth/graphArray.length);
else
unitX = Math.floor(graphArray.length/graphWidth);
for (var i=0; i < graphArray.length; i++){
if (i==0)
ctx.moveTo(unitX*i, (graphHeight - (graphArray[i]*unitY)));
ctx.lineTo(unitX*i, (graphHeight - (graphArray[i]*unitY)));
}
ctx.stroke();
}
}
“graphCanvas” is the id of the Canvas element in the HTML page. The code is fairly simple once you have gone through the Canvas documentation at the Mozilla Developer Center.
The logic boils down to generating the x and y coordinate values, which usually takes only a small algorithm.
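To make that coordinate step concrete, here is a minimal sketch that separates it from the drawing calls. Note that it is not the same calculation as in drawGraph: it uses fractional scaling instead of Math.floor units, so the data always spans the full canvas. The name toCanvasPoints and its parameters are assumptions made for illustration.

```javascript
// Hypothetical helper: maps data values onto canvas pixel coordinates.
// values   - array of numbers between 0 and maxValue
// width    - canvas width in pixels
// height   - canvas height in pixels
// maxValue - largest value the data can take (100 in this post)
function toCanvasPoints(values, width, height, maxValue) {
  var points = [];
  // Fractional step so the first point sits at x = 0 and the last at x = width.
  var stepX = values.length > 1 ? width / (values.length - 1) : 0;
  var scaleY = height / maxValue;
  for (var i = 0; i < values.length; i++) {
    points.push({
      x: i * stepX,
      // Canvas y grows downward, so subtract from the height.
      y: height - values[i] * scaleY
    });
  }
  return points;
}
```

The drawing loop would then only need to call ctx.moveTo for the first point and ctx.lineTo for the rest.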
You can also create a bar graph by changing just two lines of code. The generated bar graph is shown below.
The JavaScript code is:
function drawGraph() {
var canvas = document.getElementById("graphCanvas");
var graphWidth = canvas.width;
var graphHeight = canvas.height;
if (canvas.getContext) {
var ctx = canvas.getContext("2d");
ctx.beginPath();
//Draw the x and y axis
ctx.lineWidth = 5;
ctx.moveTo(0, 0);
ctx.lineTo(0, graphHeight);
ctx.lineTo(graphWidth, graphHeight);
ctx.stroke();
//Draw the Graph; start a new path so the axis is not stroked again
ctx.beginPath();
ctx.lineWidth = 1;
ctx.strokeStyle='rgb(80, 45, 46)';
var unitX;
var unitY;
if (graphHeight > 100)
unitY = Math.floor(graphHeight/100);
else
unitY = Math.floor(100/graphHeight);
if ( graphWidth > graphArray.length)
unitX = Math.floor(graphWidth/graphArray.length);
else
unitX = Math.floor(graphArray.length/graphWidth);
ctx.lineWidth = unitY;
for (var i=0; i < graphArray.length; i++){
ctx.moveTo(unitX*i, graphHeight);
ctx.lineTo(unitX*i, (graphHeight - (graphArray[i]*unitY)));
}
ctx.stroke();
}
}
References
Source Code Download

