As you may know, working with Lucene spatial search can be hard. To understand the basic functionality, I looked for a simple example. In his post "Spatial search with Lucene", Mike Haller explains how Lucene spatial works and provides example code based on Lucene 3.0.2. In this post I use Lucene 2.9.3 together with Lucene spatial.
Some wrapper and helper classes don't exist in Lucene 2.9.3, so we have to create them.
Coordinate class to store latitude and longitude
public class Coordinate {

    private final double lat;
    private final double lon;
    private String name;

    public Coordinate(final double lat, final double lon) {
        this.lat = lat;
        this.lon = lon;
    }

    public Coordinate(final double lat, final double lon, final String name) {
        this(lat, lon);
        this.name = name;
    }

    public double getLat() {
        return lat;
    }

    public double getLon() {
        return lon;
    }

    public String getName() {
        return name;
    }
}
SpatialHelper class to build the cartesian tiers
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.Field.Index;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.spatial.tier.projections.CartesianTierPlotter;
import org.apache.lucene.spatial.tier.projections.IProjector;
import org.apache.lucene.spatial.tier.projections.SinusoidalProjector;
import org.apache.lucene.util.NumericUtils;

public class SpatialHelper {

    // kilometers per mile
    public static final double MILE = 1.609344;
    public static final String LAT_FIELD = "lat";
    public static final String LON_FIELD = "lng";

    private final IProjector projector = new SinusoidalProjector();
    // only used to look up the best fitting tiers
    private final CartesianTierPlotter ctp =
            new CartesianTierPlotter(0, projector, CartesianTierPlotter.DEFALT_FIELD_PREFIX);
    private final int startTier;
    private final int endTier;

    // A smaller radius maps to a higher (finer) tier,
    // so the smallest radius determines the last tier.
    public SpatialHelper(final double minKm, final double maxKm) {
        endTier = ctp.bestFit(getMiles(minKm));
        startTier = ctp.bestFit(getMiles(maxKm));
    }

    public void addLoc(final IndexWriter writer, final String name, final Coordinate coord) throws Exception {
        final Document doc = new Document();
        doc.add(new Field("name", name, Field.Store.YES, Index.ANALYZED));
        addSpatialLcnFields(coord, doc);
        writer.addDocument(doc);
    }

    public static double getMiles(final double km) {
        return km / MILE;
    }

    public static double getKm(final double miles) {
        return miles * MILE;
    }

    // stores latitude and longitude as prefix-coded fields, then adds the tier fields
    private void addSpatialLcnFields(final Coordinate coord, final Document document) {
        document.add(new Field(LAT_FIELD, NumericUtils.doubleToPrefixCoded(coord.getLat()),
                Field.Store.YES, Field.Index.NOT_ANALYZED));
        document.add(new Field(LON_FIELD, NumericUtils.doubleToPrefixCoded(coord.getLon()),
                Field.Store.YES, Field.Index.NOT_ANALYZED));
        addCartesianTiers(coord, document);
    }

    // adds one box id field per tier between startTier and endTier
    private void addCartesianTiers(final Coordinate coord, final Document document) {
        for (int tier = startTier; tier <= endTier; tier++) {
            final CartesianTierPlotter plotter =
                    new CartesianTierPlotter(tier, projector, CartesianTierPlotter.DEFALT_FIELD_PREFIX);
            final double boxId = plotter.getTierBoxId(coord.getLat(), coord.getLon());
            document.add(new Field(plotter.getTierFieldName(), NumericUtils.doubleToPrefixCoded(boxId),
                    Field.Store.YES, Field.Index.NOT_ANALYZED_NO_NORMS));
        }
    }
}
Examples class that generates some locations in and near Zürich
The latitude / longitude values are taken from:
http://maps.google.com/maps/api/geocode/xml?address=…&sensor=false&region=ch
The value for address can be a postal code, a city, a street, and so on. More details about the Google geocoding service can be found on the Google geocoding documentation site.
import org.apache.lucene.index.IndexWriter;

public class Examples {

    public static final Coordinate SCHWAMMENDINGEN_8051 = new Coordinate(47.4008593, 8.5781373, "Schwamendingen");
    public static final Coordinate SEEBACH_8052 = new Coordinate(47.4232860, 8.5422655, "Seebach");
    public static final Coordinate KNONAU_8934 = new Coordinate(47.2237640, 8.4611790, "Knonau");
    public static final Coordinate ZUERICH_8000 = new Coordinate(47.3690239, 8.5380326, "Zürich in 8000");
    public static final Coordinate EBIKON_6030_6031 = new Coordinate(47.0819237, 8.3415740, "Ebikon");
    public static final Coordinate ADLISWIL_8134 = new Coordinate(47.3119892, 8.5256064, "Adliswil");
    public static final Coordinate BAAR_6341 = new Coordinate(47.1934110, 8.5230670, "Baar");

    public static void createExampleLocations(final IndexWriter w) throws Exception {
        // index the tiers for search radii from 10 km to 20 km
        final SpatialHelper s = new SpatialHelper(10.0d, 20.0d);
        s.addLoc(w, EBIKON_6030_6031.getName(), EBIKON_6030_6031);
        s.addLoc(w, SEEBACH_8052.getName(), SEEBACH_8052);
        // uncomment these lines to generate 10'000 locations near Zürich
        // for (int i = 0; i < 10000; i++) {
        //     s.addLoc(w, "Zürich_" + i, new Coordinate(ZUERICH_8000.getLat() + i / 100000D,
        //             ZUERICH_8000.getLon() + i / 100000D));
        // }
        s.addLoc(w, ZUERICH_8000.getName(), ZUERICH_8000);
        s.addLoc(w, SCHWAMMENDINGEN_8051.getName(), SCHWAMMENDINGEN_8051);
        s.addLoc(w, ADLISWIL_8134.getName(), ADLISWIL_8134);
        s.addLoc(w, KNONAU_8934.getName(), KNONAU_8934);
        s.addLoc(w, BAAR_6341.getName(), BAAR_6341);
    }
}
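If you want to look up coordinates for your own places, the geocoding URL quoted above can be assembled in code. The following is only a minimal sketch: the class name GeocodeUrlSketch and the method buildUrl are made up for illustration, and the endpoint and the sensor / region parameters are copied from this post as-is, so check the current Google documentation before relying on them.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper, not part of Lucene or of any Google client library.
final class GeocodeUrlSketch {

    // Endpoint as quoted in this post; assumed to be unchanged.
    private static final String BASE = "http://maps.google.com/maps/api/geocode/xml";

    // Builds the request URL for one address (postal code, city, street, ...).
    static String buildUrl(final String address) {
        try {
            // URL-encode the address so that spaces and umlauts survive the request.
            final String encoded = URLEncoder.encode(address, "UTF-8");
            return BASE + "?address=" + encoded + "&sensor=false&region=ch";
        } catch (final UnsupportedEncodingException e) {
            // UTF-8 is always available on a Java platform.
            throw new IllegalStateException(e);
        }
    }
}
```

Calling GeocodeUrlSketch.buildUrl("Seebach 8052") gives a URL whose address parameter is Seebach+8052; fetching and parsing the XML response is left out here.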
Test class that searches for locations within a radius given in kilometers and sorts the results by distance
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

import org.apache.lucene.analysis.WhitespaceAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriter.MaxFieldLength;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.MatchAllDocsQuery;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.Sort;
import org.apache.lucene.search.SortField;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.spatial.tier.DistanceFieldComparatorSource;
import org.apache.lucene.spatial.tier.DistanceQueryBuilder;
import org.apache.lucene.spatial.tier.projections.CartesianTierPlotter;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.RAMDirectory;
import org.junit.Test;

public class GEOLocationTest {

    @Test
    public void testSpatialSearch() throws Exception {
        // build an in-memory index with the example locations
        final Directory dir = new RAMDirectory();
        final IndexWriter writer = new IndexWriter(dir, new WhitespaceAnalyzer(), MaxFieldLength.UNLIMITED);
        Examples.createExampleLocations(writer);
        writer.commit();
        writer.close(true);

        // search everything within 10 km of Seebach
        final IndexSearcher searcher = new IndexSearcher(dir, true);
        final double testDistance = SpatialHelper.getMiles(10.0D);
        final List<String> locations = find(searcher, Examples.SEEBACH_8052, testDistance);
        for (final String location : locations) {
            System.out.println("location found: " + location);
        }
    }

    private List<String> find(final IndexSearcher searcher, final Coordinate start, final double miles)
            throws Exception {
        final List<String> result = new ArrayList<String>();
        final DistanceQueryBuilder dq = new DistanceQueryBuilder(start.getLat(), start.getLon(), miles,
                SpatialHelper.LAT_FIELD, SpatialHelper.LON_FIELD, CartesianTierPlotter.DEFALT_FIELD_PREFIX, true);

        // Create a distance sort.
        // As the radius filter has performed the distance calculations
        // already, pass in the filter to reuse the results.
        final DistanceFieldComparatorSource dsort = new DistanceFieldComparatorSource(dq.getDistanceFilter());
        final Sort sort = new Sort(new SortField("geo_distance", dsort));
        final Query query = new MatchAllDocsQuery();

        // find with distance sort
        final TopDocs hits = searcher.search(query, dq.getFilter(), 20, sort);
        final Map<Integer, Double> distances = dq.getDistanceFilter().getDistances();

        // find normal, gets unordered result
        // final TopDocs hits = searcher.search(dq.getQuery(query), 10);

        // iterate over the returned hits only; totalHits can exceed the requested 20
        for (int i = 0; i < hits.scoreDocs.length; i++) {
            final int docID = hits.scoreDocs[i].doc;
            final Document doc = searcher.doc(docID);
            final StringBuilder builder = new StringBuilder();
            // "Ort" is German for "place"; the filter reports miles, so convert to km
            builder.append("Ort: ").append(doc.get("name")).append(" distance: ").append(
                    SpatialHelper.getKm(distances.get(docID)));
            result.add(builder.toString());
        }
        return result;
    }
}
Called with a kilometer range from 10 km to 20 km, the SpatialHelper makes the CartesianTierPlotter produce data for tier 15 only. The grid size for tier 15 is one mile (1.61 km).
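To see why both ends of the range land on the same tier, here is a small stand-alone sketch of the tier selection. It is modelled on how CartesianTierPlotter.bestFit appears to work in the 2.9.x contrib sources; the class name TierFitSketch is made up, and the circumference constant and the cap at tier 15 are assumptions that you should check against your Lucene version.

```java
// Stand-alone sketch, not the Lucene class itself.
final class TierFitSketch {

    // Rough circumference of the earth in miles (assumed constant).
    private static final double EARTH_CIRCUMFERENCE_MILES = 28892.0;

    // Finest tier handed out; anything finer is capped (assumption).
    private static final int MAX_TIER = 15;

    static double kmToMiles(final double km) {
        return km / 1.609344;
    }

    // Picks a tier for a search distance given in miles.
    static int bestFit(final double miles) {
        final double r = miles / 2.0;
        // difference between r and r / sqrt(2)
        final double corner = r - Math.sqrt(r * r / 2.0);
        // how many times that length fits around the earth
        final double times = EARTH_CIRCUMFERENCE_MILES / corner;
        // each tier doubles the number of boxes, hence the base-2 logarithm
        final int tier = (int) Math.ceil(Math.log(times) / Math.log(2.0)) + 1;
        return Math.min(tier, MAX_TIER);
    }
}
```

Under these assumptions, TierFitSketch.bestFit(TierFitSketch.kmToMiles(10)) and the same call with 20 both return 15, so only one tier field is written per document. A much larger radius such as 100 km falls onto a coarser tier (13 in this sketch).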
I tried generating a large number of location points around Zürich to get a rough performance estimate. Basically, it looks good. Try it out!
Have fun playing with this example, and let me know if you know other ways to tackle Lucene spatial.
To sort by distance, is it necessary to index the field "geo_distance" during indexing?
Is it possible to sort using only latitude and longitude?
PS: I am using the Lucene 3.0.3 contrib for the spatial APIs.
I think that sorting by latitude / longitude alone is not a meaningful way to order results by distance. It is not necessary to index the distance. The distance filter that belongs to the distance query calculates the distance of each hit while it filters the results. geo_distance is not an indexed field; the name stands for those calculated distances, which are then used as the sort values.
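The idea can be shown without Lucene. The sketch below filters the example locations by radius, keeps each calculated distance, and reuses it for sorting, which is conceptually what the distance filter and the comparator do together. The haversine formula and the earth radius of 6371 km are stand-ins chosen for this sketch, not Lucene's own distance calculation, and the class DistanceSortSketch is hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration; it does not use any Lucene classes.
final class DistanceSortSketch {

    // Mean earth radius in kilometers (assumption of this sketch).
    private static final double EARTH_RADIUS_KM = 6371.0;

    // Great-circle distance between two points, haversine formula.
    static double haversineKm(final double lat1, final double lon1, final double lat2, final double lon2) {
        final double dLat = Math.toRadians(lat2 - lat1);
        final double dLon = Math.toRadians(lon2 - lon1);
        final double a = Math.sin(dLat / 2) * Math.sin(dLat / 2)
                + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
                * Math.sin(dLon / 2) * Math.sin(dLon / 2);
        return 2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(a));
    }

    // Names of all points within radiusKm of the start, nearest first.
    static List<String> within(final Map<String, double[]> points, final double startLat,
            final double startLon, final double radiusKm) {
        // Filtering step: calculate every distance once and remember it.
        final Map<String, Double> distances = new LinkedHashMap<String, Double>();
        for (final Map.Entry<String, double[]> e : points.entrySet()) {
            final double d = haversineKm(startLat, startLon, e.getValue()[0], e.getValue()[1]);
            if (d <= radiusKm) {
                distances.put(e.getKey(), d);
            }
        }
        // Sorting step: reuse the remembered distances, nothing is recalculated.
        final List<String> names = new ArrayList<String>(distances.keySet());
        Collections.sort(names, new Comparator<String>() {
            public int compare(final String a, final String b) {
                return Double.compare(distances.get(a), distances.get(b));
            }
        });
        return names;
    }

    // The example coordinates from this post, in the order they are indexed.
    static Map<String, double[]> examplePoints() {
        final Map<String, double[]> points = new LinkedHashMap<String, double[]>();
        points.put("Ebikon", new double[] {47.0819237, 8.3415740});
        points.put("Seebach", new double[] {47.4232860, 8.5422655});
        points.put("Zuerich", new double[] {47.3690239, 8.5380326});
        points.put("Schwamendingen", new double[] {47.4008593, 8.5781373});
        points.put("Adliswil", new double[] {47.3119892, 8.5256064});
        points.put("Knonau", new double[] {47.2237640, 8.4611790});
        points.put("Baar", new double[] {47.1934110, 8.5230670});
        return points;
    }
}
```

Searching 10 km around Seebach with DistanceSortSketch.within(DistanceSortSketch.examplePoints(), 47.4232860, 8.5422655, 10.0) keeps Seebach, Schwamendingen and Zuerich, in that order. Schwamendingen lies east of Seebach and Zuerich lies south of it, so neither a latitude sort nor a longitude sort on its own is guaranteed to reproduce a distance order.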