As mentioned here, I’ve been working with a spreadsheet containing addresses that want to be geocoded. I’ve had lots of experience running batches of addresses through geocoding services, but in the case of the police department I’ve been working with, it would be nice to be able to do the geocoding interactively. That way, if 400 Marlboro resolves incorrectly to 400 Marlboro Rd., the clerk will know it’s necessary to specify 400 Marlboro St. if that’s the intended address.
I found two examples of spreadsheets programmed with this behavior, first from AutomateExcel.com and second from Juice Analytics. When I compared these I realized that I wanted to combine aspects of both.
The AutomateExcel version is extremely simple. That’s partly because it uses the XML mapping features of Excel 2003 or 2007 to capture the XML output of a geocoding service, and partly because it only deals with a single address.
The Juice version is more complex. That’s partly because it eschews the XML mapping features in order to support older versions of Excel, and partly because it deals with many addresses. (It also exports KML for use with Google Earth.)
In my case I was willing to assume Excel 2003 or later, and use XML mapping. But I wanted to be able to accumulate results for many rows of addresses. I also wanted to switch from the XML output of geocoder.us, which is used in the AutomateExcel version, to the XML output of Yahoo’s geocoder, which is used in the Juice version.
The version I came up with is here, and the VBA code appears below. I haven’t used VBA or the XML mapping features of Excel in a while so, while the experience is fresh in my mind, I thought I’d record some of my key observations.
Mapping the output of Yahoo’s geocoder
I started by replicating the XML mapping in the AutomateExcel version. Here’s a sample geocoder.us query:
To create an XML mapping in Excel you do: Data -> From the web -> [plug in the URL] -> Import. Excel warns: “The specified XML source does not refer to a schema. Excel will create a schema.” OK.
Then it asks: “Where do you want to put the data?” I answered: “XML table in existing worksheet.”
Then I did Developer -> Source to reveal the XML map, unbound the mapped fields, and rebound them to the vertical rather than horizontal layout I wanted to use.
It was all good. So now I tried the same procedure using this sample Yahoo query:
But this time when I unbound and rebound the fields, I couldn’t access the values in the same way. Eventually I saw why not. The Yahoo results reference a schema, and that triggers a more complex behavior in Excel involving the importation of whole data sets.
So I saved an instance of Yahoo’s XML results in a file, stripped out the schema reference, and then acquired it using Data -> From other sources -> From XML Data import. Then it behaved just like the first example. I expect there’s a simpler solution, and hopefully this item will attract a reference to it.
With the mapping done, it’s a one-liner in VBA (ActiveWorkbook.XmlMaps(“YahooMap”).Import url:=url) to fetch the XML data and spray it into the mapped cells. That’s dramatically simpler than the regular-expression gymnastics performed by the Juice version. Of course if you need to support older versions of Excel, you’ve got to perform those gymnastics.
My first version was full of hardcoded references to rows and columns in the temporary sheet where the XML data gets unpacked, and in the main sheet where raw addresses are decorated with latitude, longitude, parsed address, precision (e.g. exact address vs. street-level vs. city-level), and cleanliness (e.g. whether there were warnings).
I knew that I’d need to use lookup functions to relativize all those references, and it soon became apparent that I’d want to use the Match function — which finds the position of an item in a row or column — to do it. But it returns numeric positions, which are fine for rows but don’t correspond to alphanumeric column names like C3. The solution, as generations of Excel hackers have learned but I never had need of until now, is to go to Options and enable the R1C1 reference style. Now the columns have numbers too, and in VBA you can write reference like so:
Dynamic variable assignment
That cleaned up a lot of the mess, but there was still a lot of per-variable code that I’d written in order to stash the geocoded results into VBA variables and then later retrieve them. I thought of generalizing that by using Eval, like so:
Eval( ‘y_lat = Selection.Value’ )
But no dice. Excel 2007 told me there was no Eval function. Which is just as well, because that time-honored trick is really sketchy. So I went looking for VBA’s equivalent to the Perl associative array or the Python dictionary, and found it in VBA’s Collection.
All in all, it was an educational exercise. The patterns here can serve as a model for any scenario that involves interactively querying a web service based on some cell in Excel, and then incorporating the results into companion cells. Of course since I’m a complete novice when it comes to this stuff, I’m hoping that by posting my code I’ll also find out about other and better approaches.
1 You’ll want to substitute your Yahoo application id for YahooDemo. And unless the addresses you’re looking up happen to be in my town, you’ll want to adjust the city and state too.
dim address as string dim escaped_address as string dim city as string dim state as string dim yahoo_id as string dim url as string dim y_labels() as variant dim y_values as collection dim main_address_col as integer dim scratch_data_col as integer dim scratch_label_col as integer public sub init city = "Keene" state = "NH" address = ActiveWorkbook.Application.ActiveCell main_address_col = 2 scratch_label_col = 1 scratch_data_col = 2 escaped_address = replace(address, " ", "+", 1) yahoo_id = "YahooDemo" y_labels = array ("y_lon","y_lat","y_addr","y_precision","y_clean") end sub public sub GeocoderYahoo() on error goto ErrorMsg init url = "http://local.yahooapis.com/MapsService/V1/geocode?appid=" & yahoo_id url = url & "&city=" & city & "&state=" & state & "&street=" & escaped_address 'call the geocoder ActiveWorkbook.XmlMaps("YahooMap").Import url:=url 'find current row index = application.match(address,columns(main_address_col),0) set y_values = new collection 'gather results into collection worksheets("Scratchpad").select for each label in y_labels row = application.match(label,columns(scratch_label_col),0) rows(row).columns(scratch_data_col).select y_values.Add Item:=selection.value, Key:=label next label 'unpack collection worksheets("Main").select for each label in y_labels col = application.match(label,rows(1),0) rows(index).columns(col).select selection.value = y_values(label) next label rows(index).columns(main_address_col).select goto Fini ErrorMsg: msgbox ("Cannot geocode: " & address) Fini: end sub