
On my last trip through Chicago’s O’Hare Airport I got fooled by the mystery outlet depicted in the photo. At a distance it looked like an AC power outlet, but it’s not one of these. Annoying, because I’d rather spread out on the floor than crouch among the huddled masses at the power bar. What is that thing anyway?
First look at Resolver, an IronPython-based spreadsheet
Last month in an item about working with crime data I asked:

Will there be a role for IronPython (or IronRuby) here, someday, such that you could use these languages inside Excel? That’d be very cool.
Several folks suggested that I should take a look at Resolver, an IronPython-based spreadsheet that deeply unifies Pythonic object-oriented programming with the sort of direct manipulation that makes the spreadsheet so useful. Resolver was and still is in private beta, but today’s screencast (Flash, Silverlight) will give you a good sense of what it’s all about.
The presenters are Giles Thomas, managing director and CTO of Resolver Systems (and creator of his own Resolver screencast), and Michael Foord, who blogs about Python, contributes to the IronPython cookbook, and is also working on the forthcoming book IronPython in Action.
If you are (or would like to be) using Python to wrangle business data, Resolver will make sense immediately. You’ll love the idea of wielding Python’s powerful data manipulation features in that context. You’ll appreciate what it would mean to harness not only the Python standard libraries but, because Resolver is IronPython-based, also the .NET Framework and the universe of third-party .NET assemblies. And you’ll be intrigued by the way in which the IronPython code that represents and animates a Resolver spreadsheet can be reused elsewhere — for example, in web applications.
But there’s more to the story. Because a cell in a Resolver spreadsheet can contain a reference to any .NET object, Resolver creates, as Giles Thomas says, “a somewhat pathological but entirely new way of programming using a spreadsheet.” You can, for example, define an anonymous function — say, a function that returns the square of its argument — and store it in cell B4. Then you can place a value — say, 5 — in cell A2. Then you can store this formula in cell B6:
=B4(A2)
That says: “Apply the squaring function in B4 to the value in A2.” The result in B6 will be 25.
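Outside Resolver, the idea can be sketched in plain Python: treat cells as a dictionary whose values may be data or callables, and let a formula apply one cell to another. (The `cells` dictionary and `evaluate` helper here are my own illustration of the concept, not Resolver's actual API.)

```python
# Toy model of Resolver's "a cell can hold any object" idea.
# The cells dict and evaluate() helper are illustrative, not Resolver's API.

cells = {}
cells["B4"] = lambda x: x * x   # an anonymous squaring function stored in a cell
cells["A2"] = 5                 # a plain value stored in another cell

def evaluate(formula, cells):
    """Evaluate a formula like '=B4(A2)': apply the callable
    stored in one cell to the value stored in another."""
    assert formula.startswith("=")
    func_cell, arg_cell = formula[1:].rstrip(")").split("(")
    return cells[func_cell](cells[arg_cell])

print(evaluate("=B4(A2)", cells))  # → 25
```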
I’ve long argued that the interactive and exploratory style of dynamic object-oriented languages is an important but underappreciated benefit. As I may have mentioned before, IronPython’s creator Jim Hugunin told me that when he first showed IronPython to folks at Microsoft, he was surprised by their reaction. He thought the big wow would be IronPython’s ability to streamline and accelerate use of the .NET Framework. But while people did appreciate that, they were truly wowed by something that’s second nature to every Python programmer — the read/eval/print loop which traces all the way back to the earliest Lisp systems.
It is a magical and powerful thing to be able to explore and modify a running program’s code and data. From those early Lisp systems to today’s Python and Ruby implementations, we have been doing that exploration and modification using a command line.1 We can trick it out with recall, name completion, and search, but it’s still a command line with all the limitations that entails. If I’ve defined an object A and stored some code or data there, my definition and invocations of A will scroll out of view as I continue to work. They won’t be visually persistent.
In a Resolver spreadsheet, these objects are visually persistent. I haven’t yet got my hands on Resolver, but here’s an example of what I think that will mean. Suppose that I have a data set I want to transform, against which I’m testing five different versions of a transformation function. I’d put the data in cell A1, the functions in cells B1..B5, and the results in C1..C5. Now I’ll see everything at a glance. The spreadsheet that would conventionally have been the results viewer at the end of a series of tests becomes the environment in which the tests are written, performed, and evaluated.
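The test-grid idea can be mimicked in ordinary Python, too. A minimal sketch, with arbitrary stand-in transformation functions of my own invention rather than anything from Resolver:

```python
# Sketch of the "spreadsheet as test environment" idea: one data set,
# five candidate transformation functions, and all five results visible
# side by side. The functions are arbitrary stand-ins for illustration.

data = [1, 2, 3, 4]  # imagine this in cell A1

transforms = {       # imagine these in cells B1..B5
    "B1": lambda xs: [x + 1 for x in xs],
    "B2": lambda xs: [x * 2 for x in xs],
    "B3": lambda xs: [x * x for x in xs],
    "B4": lambda xs: sorted(xs, reverse=True),
    "B5": lambda xs: [x for x in xs if x % 2 == 0],
}

# Results land in C1..C5, everything visible at a glance.
results = {cell.replace("B", "C"): fn(data) for cell, fn in transforms.items()}
for cell in sorted(results):
    print(cell, results[cell])
```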
The spreadsheet is also an important bridge between programmers and their business sponsors. It’s no accident that Ward Cunningham’s FIT (Framework for Integrated Test) was originally inspired by Ward’s experience of inviting business analysts to write test cases in spreadsheets. In its current form, FIT uses HTML tables in a wiki as the bridge between analysts who write tests and developers who write the code that must pass those tests. I think Resolver and FIT may prove to be a marriage made in heaven.
While Resolver will initially appeal to business programmers who appreciate Python as a language, and IronPython as a way of leveraging the .NET Framework and .NET-based business logic, the ideas it embodies transcend Python and .NET. I’ll be fascinated to see how this “pathological but entirely new way of programming using a spreadsheet” will evolve.
1 Smalltalkers will note that they have been using a three-pane browser all along, and that’s true. However the spreadsheet metaphor, in this context, is something else again.
Screencasting and scripting
I was chatting the other day with Jim Hugunin about an earlier posting on automation and accessibility, and Jim highlighted a point that’s worth calling out separately. If you had a script that could drive an application through all of the things shown in a screencast, you wouldn’t need the screencast. The script would not only embody the knowledge contained passively in the screencast, but would also activate that knowledge, combining task demonstration with task performance.
Of course this isn’t an either/or kind of thing. There would still be reasons to want a screencast too. As James MacLennon pointed out yesterday:
All too often, the classic on-the-job training technique has been “just follow Jim around, and do what he does for the next three weeks …”. This kind of unstructured training doesn’t lend itself easily to written documentation – it’s the nature of the process as well as the nature of the people. Video, however, allows us to simulate this “follow him around” approach.
Citing Chris Gemignani’s Excel recreation of a New York Times graphic, James says:
This kind of approach clicked with me, because this was my preferred method for learning a new programming environment. If I could just get an experienced programmer to take me through the edit / compile / debug / build cycle, I would be off and running.
So you’d really want both the screencast and the script — for extra credit, synchronized to work together.
What stands in the way of doing this? Don’t the Office applications, for example, already have the ability to record scripts? Yes, they do, but that flavor of scripting targets what I called engine-based rather than UI-based automation. Try this: Launch Word, turn on macro recording, and then perform the following sequence of actions:
- Mailings
- Recipients
- Type New List
Now switch off the recorder and look at your script. It’s empty, because you haven’t yet done anything with the engine that’s exposed by Word’s automation interface; you’ve only interacted with the user interface in preparation for doing something with the engine.
It would be really useful to be able to capture and replay that interaction. And in fact, I’ve written a little IronPython script that does replay it, using the UI Automation mechanism I discussed in the earlier posting. It’s not yet even really a proof of concept, but it does contain three lines of code that correspond exactly to the above sequence. Each line animates the corresponding piece of Word’s user interface. So when you run the script, the Mailings ribbon is activated, then the Recipients button is highlighted and selected, and then the Type New List menu choice appears and is selected.
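The real script drives Word through .NET’s UI Automation machinery, which I can’t reproduce here. But the shape of the idea — replaying a sequence of named UI elements rather than raw keystrokes and coordinates — can be sketched with a mock interface standing in for Word:

```python
# Toy illustration of "semantic" UI scripting: each step names a UI
# element ("Mailings", "Recipients") instead of recording keystrokes or
# mouse coordinates. MockRibbon stands in for the real Word UI, which
# the actual script reaches through .NET's UI Automation API.

class MockRibbon:
    """Pretend user interface that records which named elements were activated."""
    def __init__(self):
        self.activated = []

    def invoke(self, name):
        self.activated.append(name)

def replay(ui, steps):
    """Replay a recorded sequence of named UI steps against a UI object."""
    for step in steps:
        ui.invoke(step)

ui = MockRibbon()
replay(ui, ["Mailings", "Recipients", "Type New List"])
print(ui.activated)  # → ['Mailings', 'Recipients', 'Type New List']
```

Each replayed step maps to one piece of the interface by name — which is what makes the script readable as documentation as well as runnable as automation.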
What I’m envisioning here is UI-based semantic automation. I call it UI-based to distinguish it from the engine-based approach that bypasses the user interface. I call it semantic because it deals with named objects in addition to keystrokes and mouseclicks. Is this even possible? I think so, but so far I’ve only scratched the surface. Deeper in there be dragons, some of which John Robbins contends with in the article I cited. I’d be curious to know who else has fought those dragons and what lessons have been learned.
Talk faster! No, slower!
I learned a couple of things and it spurred some interesting ideas. However, neither of them talks very fast…I just cannot stand that most people talk so slowly.
Have no fear of pauses; they help frame and structure the noise between the pauses.
Silverlight for screencasters
I’ve been doing some experiments to find out how the Silverlight plug-in will work as a player for screencasts. On this test page you’ll find four different versions of a 23-second clip. There’s one for Quicktime, one for Windows Media, one for Flash, and one for Silverlight.
Some important variables, from a screencaster’s perspective, are: legibility, file size, and convenience of production, deployment, and viewing.
That legibility matters seems obvious, but I see an awful lot of screencasts delivered at squinty resolutions. This puzzles me. The purpose of a screencast is to show and describe on-screen action. If you can’t read the screen, what’s the point?
All four of these examples are legible. The Quicktime version achieves the best clarity, but there’s a tradeoff: it’s also the largest file.
That size matters is perhaps less obvious to those of us living in the developed world. But as I’ve been recently reminded by both Beth Kanter and Barbara Aronson, much of the world remains bandwidth-challenged. Videos that don’t squeeze themselves down will not be seen in many places where they should be.
Among these four examples, Windows Media weighs in lightest at under half a megabyte. That works out to about a megabyte per minute, which is the target I like to shoot for. If it’s possible to deliver a legible screencast at a data rate significantly less than that, I’d like to know how.
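The back-of-the-envelope arithmetic, for anyone checking: extrapolating a 23-second clip of just under half a megabyte (I’m assuming roughly 0.45MB) to a full minute lands near the one-megabyte-per-minute target.

```python
# Data-rate check: a 23-second clip at roughly 0.45 MB extrapolates to
# about a megabyte per minute. (0.45 MB is an assumed stand-in for
# "under half a megabyte".)

clip_seconds = 23
clip_megabytes = 0.45

rate_mb_per_min = clip_megabytes / clip_seconds * 60
print(round(rate_mb_per_min, 2))  # → 1.17
```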
The sizes of the other versions in this example, in ascending order: Flash 1.2MB, Silverlight 1.5MB, Quicktime 2MB.
Of course these sizes depend on which encoder is used, and on which settings are applied. For these tests, I produced all of the screencasts in Camtasia. For Quicktime and Windows Media, Camtasia uses the encoders that come with those platforms. For Flash, it supplies an encoder. For Silverlight, it doesn’t yet supply an encoder so I produced an uncompressed AVI and then used Expression Encoder to create a Silverlight-compatible WMV file.
I should add here that, despite all the work I’ve done in this area, I’m still a bit vague on the concept of a screen encoder — that is, a video encoder that’s tuned for the kind of low-motion but text-rich content that’s typical of screencasts. In beta versions of Silverlight and Expression Encoder, for example, there wasn’t a screen video option, so the only way to produce a legible screencast was to crank up a motion-video encoder to the maximum data rate, which produced a massive file. Now Expression Encoder provides a screen encoding option, which I used for this test and which Silverlight 1.0 can obviously play back.
It seems to me that Camtasia should be able to use that encoder directly, but until I figure out how, it will be less convenient to produce Silverlight screencasts from Camtasia than to produce the other formats. Rendering to AVI as an intermediate step is doable, but time-consuming.
In terms of deployment convenience, one measure is the number of supporting HTML, JavaScript, configuration, and other files required in order to play a screencast. I’m a minimalist, so when I deploy Camtasia screencasts I throw away the wrappers that Camtasia generates and go with the Simplest Thing That Could Possibly Work. From my perspective, that winds up being an OBJECT tag (and, sigh, also an EMBED tag) for Quicktime or Windows Media, plus a reference to a minimal player in the case of Flash. By comparison, my Expression-generated Silverlight example has lots of moving parts — an HTML file, a XAML file, a flock of JavaScript files, and the WMV file.
The Silverlight example could of course be simplified by coalescing the JavaScript support, but that alone won’t solve another issue of deployment convenience. It’s nice to be able to embed a screencast in any arbitrary web host. From the perspective of my WordPress.com blog, that’s an issue for all four of these approaches. WordPress is always coming up with new ways to embed video from various services, but the reason that’s necessary is that WordPress.com — quite rationally — strips out most of the advanced HTML tags and JavaScript support that you might want to include in your blog postings. In general, embedded video seems to be a game of point solutions. In order to embed video flavor X in web host Y you need a specific X+Y adapter. I understand the reasons why, but it’s frustrating.
One of those adapters, by the way, will be needed for WordPress.com and Silverlight Streaming, which is the Microsoft hosting service announced at the MIX conference earlier this year. I’ve hosted another version of my Silverlight example there. It’s the same set of files as this example, minus the HTML wrapper and the core Silverlight JavaScript code, plus an XML manifest, all packaged up in a zip file. I’m not expecting my little test to attract millions of viewers, but if I were, this hosting service would be one way to handle the load.
In terms of viewing convenience, the Silverlight example exhibits a nice property that I wasn’t expecting. When you resize the window containing the player, the player scales to fit. I’m pretty sure the embeddable Quicktime and Windows Media players can’t do that. Flash-based media players are more customizable, and can respond to container resize events, but I don’t think I’ve ever seen the technique applied to a screencast. It’s a nice idea. A screencast at 1:1 resolution is guaranteed to be legible, but will also consume a lot of screen real estate. So it’s tempting to shrink its width and height in production. But by how much? Any fixed resolution will work well for some people and not others. Resizable screencasts would be great for accessibility.
Of course you can resize any standalone player. So this issue boils down to what’s possible when the player is embedded in a web page. And as we’ve seen, embedding can be problematic. In general, we need to work toward a smoother transition between embedded and standalone viewing experiences.
The ultimate test of viewing convenience is, of course: Does it play instantly, regardless of the operating system or browser I happen to be using? Flash leads the way in that regard. Silverlight aspires to the same level of plug-in ubiquity, and with the announcement of Moonlight that aspiration seems achievable.
Ultimately a screencaster wants to be able to produce one video that works well for everyone, everywhere, for various definitions of works well. That’s a hard problem. Solutions depend on the raw capabilities of media players, it’s true. But they also depend on an ecosystem of plug-ins, browsers, encoders, operating systems, and hosting services.
A conversation with Beth Kanter about social software and non-profit organizations
My guest for this week’s ITConversations show is Beth Kanter. We share a common interest in showing people how and why to use social software. In this conversation Beth reflects on her work with “digital immigrants” in non-profit organizations. The cornucopia of free services is a blessing for these organizations. But even when financial hurdles are stripped away, conceptual hurdles remain. Helping people to understand what’s possible, and to exploit online services in appropriate ways, is both a great challenge and a great opportunity.
The fourth platform
In my podcast with Ed Iacobucci about DayJet’s approach to reinventing air travel, Ed recalls the moment when he knew that the Eclipse VLJ (very light jet) represented the hardware component of a new platform. His contribution would be to create the operating system that would enable new travel applications.
Antonio Rodriguez, who joined me for another podcast about Tabblo, the online photo service he founded, enjoyed the Ed Iacobucci podcast but concluded:
I think I am beginning to develop an aversion to the term platform.
When I read that, I said to myself: “Yeah, me too.”
But we can’t help ourselves. On the very same day, Antonio responded to Marc Andreessen’s taxonomy of platforms. So did lots of others, including Joshua Allen.
Marc’s post defines three levels of platform:
- Flickr-style data-access APIs.
- Facebook-style containers of “Internet plug-ins.” While hosted by their container, these plug-in applications must also provide their own life support.
- Ning-, Salesforce-, and Second Life-style runtime environments that fully support their dependent applications.
As both Antonio and Joshua point out, Marc’s level 3 runtime leads to the sort of lock-in scenario that developers have learned to regard with suspicion. Antonio pushes back on Marc’s characterization of Amazon’s S3 and EC2 as “only sort of” a level 3 platform. Facebook alone may be a level 2 platform, he says, but in combination with Amazon’s neutral infrastructure it already reaches level 3. And of course Amazon’s services can recombine with other level 2 platforms to yield other level 3 platforms.
Joshua, meanwhile, questions the need for levels 2 and 3. He thinks that Marc’s base platform — the data flowing at level 1 — has potential we’ve scarcely begun to realize. I agree. Syndication-oriented architecture surely has limitations, but we haven’t run into them yet.
It’s also worth noting that Marc’s taxonomy is wholly cloud-centric:
I think that kids coming out of college over the next several years are going to wonder why anyone ever built apps for anything other than “the cloud” — the Internet — and, ultimately, why they did so with anything other than the kinds of Level 3 platforms that we as an industry are going to build over the next several years.
I’ll go along with that, but only if we can extend our definition of the cloud to encompass what the Internet originally was: a network of peers. With rare but notable exceptions (e.g. BitTorrent) it hasn’t been that for a long time. I think it will be that again. There’s a level 4 platform waiting in the wings. At level 4, the cloud of storage and computation is partly centralized in a handful of intergalactic clusters, and partly distributed across a network of humble peers. Microsoft’s forthcoming Internet service bus is one example of a level 4 platform. I hope, and expect, we’ll see others.
Appreciating Common Craft’s “paperworks” sketchcasts
I am an immediate fan of Common Craft’s style of concept videos. Their explanations of how and why to use del.icio.us and Google Docs are crisp and entertaining. They convey the essence of these activities more clearly than any other visual explanations I’ve seen, including many of the screencasts I’ve made.
The style is called paperworks because these sketchcasts are made by capturing screenshots, printing out key elements, and then filming, animating, annotating, and narrating arrangements and rearrangements of these scraps of paper. The first time you watch one, you’ll be captivated: it’s cute, it’s fresh. But is this just a gimmick? After you watch a few more, and you begin to acclimate to the style, does its effectiveness wane? Not yet, for me, because these productions have more going for them than cuteness and freshness.
One of the principles at work here is the moral equivalent of cropping and zooming in the screencast medium. When you’re trying to explain software on a conceptual level, images captured from screens can be a mixed blessing. It’s valuable to show exactly what screens look like, and exactly how actions flow within and across them. But the amount of detail that’s visible in a typical screen can often distract from the story you’re trying to tell. By cropping the screen, and/or by zooming in on the active region, you can prune away a lot of visual clutter and focus on key interactions. The paperworks style is an extreme form of cropping and zooming; it prunes and focuses very aggressively.
Another principle is sketching. According to Bill Buxton, sketching goes hand in hand with what he calls design thinking. When I asked Bill how he would have used sketching in the design of a feature like the Office ribbon, he said:
You’d start with paper prototyping — quickly hand-rendered versions, and for the pulldown menus and other objects you’d have Post-It notes. So when somebody comes with a pencil and pretends it’s their stylus and they click on something, you’ve anticipated the things they’ll do, and you stick down a Post-It note.
If that’s a helpful way to imagine software interaction in the design phase, why wouldn’t it also be a helpful way to conceptualize the software in use? The paperworks style strongly suggests that it is. These sketchcasts are great visual explanations of working software. I suspect they’d be equally useful during the design of that software.
A conversation with Ed Iacobucci about the reinvention of air travel
In Free Flight, the seminal book on the forthcoming reinvention of air travel, James Fallows tells a story about Bruce Holmes, who was then the manager of NASA’s general aviation program office. For years Holmes clocked his door-to-door travel times for commercial flights, and he found that for trips shorter than 500 miles, flying was no faster than driving. The hub-and-spoke air travel system is the root of the problem, and there’s no incremental fix. The solution is to augment it with a radically new system that works more like a peer-to-peer network.
Today Bruce Holmes works for DayJet, one of the companies at the forefront of a movement to invent and deliver that radically new system. Ed Iacobucci is DayJet’s co-founder, president, and CEO, and I’m delighted to have him join me for this week’s episode of Interviews with Innovators.
I first met Ed way back in 1991 when he came to BYTE to show us the first version of Citrix, which was the product he left IBM and founded his first company to create. As we discuss in this interview, the trip he made then — from Boca Raton, Florida to Peterborough, New Hampshire — was a typically grueling experience, and it would be no different today. A long car trip to a hub airport, a multi-hop flight, another long car trip from hub airport to destination.
In a few weeks, DayJet will begin offering a different kind of experience for travel within a trial network of small Florida airports. If all goes well, the network will then expand to the entire Southeast, and eventually — I sincerely hope — will reactivate small airports around the country, including the one that’s two miles from my home.
In this interview, Ed describes how he worked through a false start, realized that on-demand air travel would require a platform, decided that Eclipse Aviation’s line of precision-engineered, mass-producible, and affordable jets would be the platform’s equivalent to the personal computer, and then conceived and created its network operating system and software service infrastructure.
There were two major research and development challenges. First, how do you find an optimal routing solution when there’s no fixed schedule and when every new reservation ripples through the entire network?
It didn’t take very long to figure out that if you replace one 25-million-dollar plane with 25 one-million-dollar planes, it fixes a lot of problems. And if you couple that with doing it by the seat instead of by the plane, that lets you interleave packets, or payloads, and increases the efficiency even more. So it became very clear that we needed to build a large, self-optimizing network that would take a lot of other factors into consideration, like the physics of the airplane, the temperature, the loads. The beauty of aviation is that it’s like physics meets business, right? How much you can carry depends on temperatures, altitudes, runway lengths — and safety is all expressed in terms of parameters that the optimizer has to take into account as it starts shuffling around customers. It’s not a straight optimization, it has to be done in real time, and it has an incredible number of constraints.
So I hired mathematicians, really smart guys, and we brought them on and gave them the challenge of their lifetime — really, for the rest of their lives — because you’ll never find a solution to the problem, it’s what mathematicians call NP-hard, which means you can take every computer made between now and the end of our lives, and run them until the end of our solar system, and you’ll never find the optimal solution. You have to move from traditional hard optimization to heuristics aided by optimization techniques. So then we brought in an operations research group from Georgia Tech, real heavy hitters who did optimization for large air carriers. But optimizing assets around a fixed schedule is a vastly different problem from trying to determine the most optimal solution in real time for something that doesn’t have a fixed schedule and morphs with every new request that comes in. Their response was, “Nobody’s ever done this.”
If it were just chartering airplanes, that’s not very exciting. But now we’ve got new science, new math, it’s a lot of green fields in areas where we could get collaboration with major universities where topnotch people want to work with us, and assign Ph.D. students to work with us.
The goal was to be able to respond yes or no, within ten seconds, to a customer’s request for a flight between two participating airports, and that goal has been achieved. Given that capability, DayJet has been able to create a new business model that prices tickets according to the value of each customer’s time. If you value your time highly, you can request a narrow time window for your flight, and pay more for your ticket. If you’re willing to accept a wider window — say, you’re OK with leaving anywhere between 10AM and 3PM — you’ll pay a lot less. Ed calls this “time arbitrage” and it’s at the heart of what’s really revolutionary about this system from the customer’s point of view.
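The pricing mechanics can be sketched as a toy model: the wider the time window a customer accepts, the deeper the discount. All of the numbers, the function name, and the discount structure below are hypothetical illustrations of the idea — DayJet’s actual pricing model isn’t public.

```python
# Toy model of "time arbitrage" pricing: wider acceptance windows earn
# bigger discounts, down to a floor. All numbers and the discount
# structure are hypothetical, not DayJet's actual model.

def seat_price(base_price, window_hours, discount_per_hour=0.05, floor=0.5):
    """Discount the base fare for each hour of flexibility,
    never dropping below floor * base_price."""
    discount = min(discount_per_hour * window_hours, 1 - floor)
    return base_price * (1 - discount)

print(round(seat_price(500, 1), 2))   # narrow window → 475.0
print(round(seat_price(500, 5), 2))   # e.g. 10AM-3PM → 375.0
print(round(seat_price(500, 12), 2))  # very flexible, capped at floor → 250.0
```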
There was another big problem. As you build out the network of regional airports, you have to make big asset investment decisions. How can you model demand in order to guide those investment decisions?
Believe me, there are many more ways to go bankrupt than to make money. What we’re really building here is a value network, and the composition of that network determines the load. If I have a network of nodes A, B, C, and D, and I add a new node, E, that can have an impact on all the nodes at various times of the day. But if I add F, that could impact some nodes differently than others. It’s an interrelated loading problem that’s very difficult to model. So I thought, OK, we’ve got these guys taking chaos and organizing it into order, so we can file flight plans and make it all look organized, or actually be organized, on the back end. What I need is another group of people to create organized chaos, or complexity, to mimic the behaviors of a region of travelers, that can be used to test how well we can reorganize that chaos into order. That’s not simple either, it’s going to depend on pricing, and time/value tradeoffs, and density of your transportation network, and what nodes you introduce, and what the interactions between the nodes are, because every city you introduce has a different effect on the others.
I realized we needed the kind of thing that SimCity represents. When I was in school we called it discrete time simulation, but then it got a biological twist and became complexity science, and at one point chaos theory, though complexity science is the more accepted term. Along with one of my directors I had served on the board of the BIOS Institute in Santa Fe, an offshoot of the Santa Fe Institute that did biological or evolutionary modeling of large complex systems. So I got in touch with some of those guys and we offered them a job. We said, hey, come on board and we’re going to build the most sophisticated regional travel model that’s ever been built, and we’re going to use it not just to postulate the future but to build a business.
So they came on and worked for about four years and came up with this other piece of technology, which is married to the optimizer, and the simplest way to describe it is as SimCity on steroids, very targeted on the problem of regional travel. So we’ve got nine different types of agents or sims, populated using IRS statistics, operating in ten-square-mile zones; they all have different rules on how they book trips and what flexibility they have. Then we loaded on top of that a bunch of demographic data — some we bought, some we got from DOT, some from IRS. And then we loaded all the schedules for all the airlines between all the airports in the contiguous 48. And then we developed algorithms so we could estimate driving times, and added time-of-day congestion through various nodes. And then we added train schedules. The result is a very sophisticated, very high-fidelity model of the transportation options you would face if you lived in one ten-square-mile region of the US, and needed to go to another one.
The story so far sounds like a high-tech dream come true, and if DayJet succeeds that’s just what it will be. But there are a couple of big reality checks. First and foremost is regulation. Although DayJet’s technology is built to exploit the benefits of extreme virtualization, the FAA places severe limits on how far that can go. So while DayJet would have preferred not to own and operate its entire fleet, hire and train all its crews, and manage all of its airport facilities, that’s exactly what it must do to meet current regulations. The business can only succeed if it works within the current regulatory regime. Of course if it does succeed on those terms, and if that success paves the way for regulations that are friendlier to a more virtualized approach, DayJet’s travel-oriented network operating system will become all the more valuable.
Another reality check is the customer’s experience. Because there are no fixed schedules, there’s much more fluidity than you can reasonably present to a customer. Internally the system may reschedule your trip a dozen times, but you won’t want to be flooded with rescheduling notifications.
This is what we’ve been wrestling with for the last six months. What it means is that you just add more constraints. We won’t flip you around all over the place. You start by negotiating as big a window as you can accept, because the bigger the window, the cheaper the ticket. And it’s not a departure window, it’s a window in which we will complete the mission. Then the challenge operationally is how to shrink those windows down as you get closer to flight time, leaving enough space for disruption recovery. We’re learning, and we’ve discovered that the night before we can crunch that window down. So we notify the customer the night before that you have to be at the airport by time X, and you’ll be at your destination by time Y.
It’s all music to my ears. I’ve been dreaming about this for years. Ed’s actually doing it, and I can’t wait to see how things turn out.
Tools of the trade
My wife, who is an artist, recently picked up a copy of David Hockney’s book Secret Knowledge: Recovering the Lost Techniques of the Old Masters. She’d been having a discussion with an artist friend of hers about whether it’s wrong to use tracing, or other optical aids, when doing illustrations or paintings. In the book, Hockney advances the highly controversial theory that the dramatic surge in visual realism that occurred in the early 15th century was propelled by the use of optical projection techniques. The old masters, he claims, used mirrors, lenses, and the camera obscura to capture the outlines of the people and objects they painted.
Hockney says that a newly-available form of visualization led him to this conjecture:
Now with colour photocopiers and desktop printers anyone can produce cheap but good reproductions at home, and so place works that were previously separated by hundreds of miles side by side. This is what I did in my studio, and it allowed me to see the whole sweep of it all. It was only by putting pictures together in this way that I began to notice things; and I’m sure these things could only have been seen by an artist, a mark-maker, who is not as far from practice, or science, as an art historian. After all, I’m only saying that artists once knew how to use a tool, and that this knowledge was lost.
It all sounded perfectly plausible to me, but the art world isn’t buying it. Hockney’s critics cite a research paper by Microsoft’s Antonio Criminisi and Ricoh’s David Stork in which the authors show that Hockney’s central example — a painting of a chandelier — exhibits more irregularity than you’d expect if it had been rendered using an optical aid.
It’s fascinating stuff. I was already thinking about interviewing some folks at Microsoft Research about the state of computer vision, so maybe this will be the place to start.
But meanwhile, I’m left wondering about the context of the debate. Setting aside for the moment whether Hockney is right or wrong about the old masters’ use of optical aids, would they have been wrong to have used them in the way he suggests? Is that cheating? Should it diminish our appreciation of the work?
The art world says yes, it is cheating and does cheapen the work. But Hockney doesn’t see it that way. These optical aids, he argues, were just tools used by professionals who wisely chose to automate where they could in order to free up time and energy so they could add creative value where it mattered most.
My wife’s artist friend concurs. It’s not that she can’t draw freehand. She can and she does, but she also uses tracing techniques to identify landmarks and — because it’s commercial work she’s doing — to speed up some of the foundation-laying drudgery.
I’m sure the analogy is imperfect but, to a software guy, this all sounds very familiar. There are right and wrong ways to rely on software tools and frameworks, but I don’t think less of programmers who rely on them in the right ways. On the contrary, I think less of programmers who don’t.
The blurred line between personal information management and publishing
When I mothballed my InfoWorld blog and moved in here, I decided not to use WordPress categories but instead to continue the del.icio.us-based method I’d been using before. Of the many strategies woven together in my use of del.icio.us, two principal ones are keeping track of stuff in general, and keeping track of my own stuff. In terms of the latter, I like to be able to answer a question like this:
Where is your collection of articles about how to do screencasting?
With an answer like this:
If I relied on WordPress categories, the scope of such a query would be restricted to my WordPress blog. Because I use del.icio.us tags instead, the scope can include my old blog, my new blog, essays I’ve published elsewhere, and of course material from anywhere else on the web.
So that was the plan, but when I switched blogs I never got around to adapting my tagging workflow to the new setup. After a while I began to realize that I couldn’t answer questions with URLs because none of my recent items were queryable in that way.
So I went through and tagged all the items in this blog, from January to August, in a single blitz. That might sound like an insurmountable task but really it isn’t. I exported the blog to a file, captured just the titles and links, and opened those up in a browser. Then I grabbed items in batches of twenty or so, opened them into tabs, and worked through them. It took an hour and a half. Being the tagaholic that I am, it wasn’t just an exercise in drudgery. I appreciated the opportunity to reflect on the evolution of my tag vocabulary.
At the time I did worry about how this would look to somebody watching my del.icio.us tagstream. And for good reason. Here’s how it looked to Chris Muscarella:
Jon Udell tags his own things almost exclusively. That’s lame.
Historically that’s not true, but recently it looks that way, and in any case it’s a fair comment. When you mix personal information management with publishing, the lines can get blurry.
On reflection I realized that I’d made things worse by including my del.icio.us links in the blog’s sidebar. On my old blog, I filtered these to not include my own postings, which are all identified with the tag jonudell. (And eerily, although that blog is mothballed, it is still syndicating my current non-personal del.icio.us links.) I could probably do that here as well, but not with the WordPress del.icio.us widget. It offers a filter for tag inclusion:
Show only these tags (separated by spaces):
But there’s no filter for tag exclusion — e.g., everything not tagged jonudell. So I’ve yanked that widget for now. Come to think of it, that same exclusion filter would be useful for my del.icio.us feed. Should WordPress and del.icio.us add these features? Perhaps. Then again, this is exactly the sort of thing a general purpose syndication bus ought to be able to do for us.
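The exclusion filter I'm after is trivial to express in code. Here's a minimal sketch in Python; the item structure is invented for illustration — a real implementation would first parse the del.icio.us RSS or JSON feed into records like these.

```python
# Keep every feed item NOT carrying any of the excluded tags.
# (Item dictionaries here are hypothetical stand-ins for parsed feed entries.)

def exclude_tags(items, excluded):
    excluded = set(excluded)
    return [item for item in items
            if not excluded & set(item['tags'])]

items = [
    {'url': 'http://example.com/a', 'tags': ['screencasting']},
    {'url': 'http://example.com/b', 'tags': ['jonudell', 'blog']},
    {'url': 'http://example.com/c', 'tags': ['syndication']},
]

# Everything except my own postings, which are tagged 'jonudell'.
public = exclude_tags(items, ['jonudell'])
```

A syndication bus sitting between the feed and its subscribers could apply exactly this kind of predicate on the way through.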
A conversation with Rohit Khare about syndication-oriented architecture
This week’s ITConversations podcast with Rohit Khare focuses on a topic that is near and dear to my heart: syndication. For both of us, that is the real substance of Facebook. Says Rohit:
Imagine there’s an application someday with 35 million users, and the first thing they see every morning is a news feed, and it’ll do a really intelligent job of summarizing what everyone they know has been up to since they last logged in. You wouldn’t have thought, “I need to sign up for a new consumer service that will tell me when people break up or get married or give talks.” And yet here we have this wonderful new phenomenon showing that there is pent-up demand. Now you can come back to the office and say, “Don’t you wish you had an interface like that so all of our field service techs could know what was going on, and be just as collaborative as this is?”
So how do we get there? Start by “RSSifying” everything in sight. Then flow all the feeds through a “syndication bus”:
You do in some ways centralize the information flow, but you get the benefit of decentralized awareness — it’s an interesting paradox. If I have one syndication bus that’s responsible for delivering information to all of my users, and everyone in the community, then that same piece of software is in a very good position to detect patterns and emerging trends. If you think about meme trackers that can report, hey, this is a hot story that’s come up in the last few hours, that’s going to be really powerful when it mainstreams.
By way of disclosure, the backstory for this interview begins in 2002 when Rohit — who had co-founded KnowNow in 2000 — gave a great talk at the Emerging Technology conference on what he was then calling application-layer internetworking (ALIN). (I mentioned it in this InfoWorld column.) Among his other talents, Rohit is a great coiner of sticky buzzphrases and acronyms. Phil Windley, for example, conceded that ALIN was catchier than his own Layer 5 routing for web services.
Then in 2004, I interviewed KnowNow’s Michael Terner and Richard Treadway. The company’s tagline then — Simple Integration Connecting Data, Applications, and People: Business-to-Business, Event-Driven, Loosely-Coupled — was descriptive but decidedly less catchy.
Now Rohit and KnowNow are pitching a new buzzphrase, Syndication-Oriented Architecture, and a new acronym, SynOA. We are admittedly pushing the envelope when it comes to variations on the -OA theme, but I can’t help myself, I like this one for two reasons. First, the idea of syndication needs all the marketing help it can get. We’ve been at this for almost a decade and it hasn’t really caught on in the way it deserves to. Second, it’s just so obviously the right thing on so many levels, one of which happens to be information flow within the enterprise.
Automation and accessibility
In last week’s item on social scripting, I suggested that CoScripter’s automation strategy — based on simple English instructions that people can easily read, write, and share — could in theory work across the continuum of application styles. And arguably it will need to, because we’re increasingly likely to mix those styles. If you begin to rely on an automation sequence for your bank’s web application, for example, you’ll be sorry to have it broken by an upgrade that introduces AJAX, Flash, or Silverlight components.
What enables CoScripter to work in the web domain is the document object model (DOM) of which every web page is a rendering. Because JavaScript code can explore and interact with the DOM’s tree of user-interface objects, the browser can be driven semantically, by object names and properties, rather than literally, by mouse clicks and keystrokes. The literal method is workable, and there are many tools that make excellent use of it. The semantic method is more reliable if available, but it isn’t always. So the literal method winds up being the common denominator, because every style of application will respond to mouse clicks and keystrokes.
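The semantic method can be illustrated with Python's built-in DOM standing in for the browser's. The HTML fragment and the helper function are invented for this sketch; the point is that the target is found by a stable name, not by a screen coordinate.

```python
# Semantic targeting: locate a UI element by its 'name' attribute.
# A layout or styling change can't break this, whereas a recorded
# mouse click at (x, y) would silently hit the wrong thing.
from xml.dom.minidom import parseString

page = parseString(
    '<form><input name="username"/><input name="password"/>'
    '<button name="login">Sign in</button></form>')

def find_by_name(doc, name):
    for el in doc.getElementsByTagName('*'):
        if el.getAttribute('name') == name:
            return el
    return None

button = find_by_name(page, 'login')
```

CoScripter's instructions ("click the Login button") resolve to lookups of exactly this kind against the live DOM.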
There is another kind of semantic technique long supported by desktop applications that define object models, notably the Mac’s AppleScript object model and Windows’ Component Object Model. These technologies enable automation scripts to reach below the user interface of applications, and to work with their internal machinery.
Using the Word object model, for example, you can automate a mail merge. If you run this program, you’ll see Word launch, you’ll see a data document written by an invisible hand, and then you’ll see a mail merge appear. What you won’t see are the user-interface actions required to produce these effects, because this level of automation bypasses the user interface.
So let’s distinguish between two flavors of semantic automation. The mail merge script does what I’ll call engine-based semantic automation. And CoScripter does what I’ll call UI-based semantic automation.
These two flavors are useful in quite different ways. With the engine-based approach, an automation script uses the application as if it (the application) were a service. In this case you don’t want windows and dialog boxes popping up all over the place, you just want to feed inputs and harvest outputs. The engine-based approach works accurately and efficiently, but it doesn’t yield a representation of task knowledge that a normal person could use, learn from, adapt, or share.
With the UI-based approach, an automation script uses the application as if it (the script) were a human being. It sees and touches exactly what the human sees and touches. This is not the optimal way to crank out a thousand mailing labels. But the UI-based approach does yield a representation of task knowledge that a normal person could use, learn from, adapt, or share.
Shareable representations of task knowledge are incredibly useful and powerful. Screencasts are one such representation, and as many people have noticed in recent years, they can radically outperform traditional forms of documentation. But you can’t interact with a screencast or concisely describe it. You can only watch and learn and imitate. Although that’s way better than not being able to watch and learn and imitate, interaction and concise description would be better still.
CoScripter delivers that superior experience of interaction and concise description. It does so by means of UI-based semantic automation which, in turn, is enabled by the browser’s document object model.
What might enable a more comprehensive flavor of UI-based semantic automation? Noodling on this question I arrived at one possible answer: the Windows UI Automation API, which is part of .NET Framework 3.0. I’d heard of it, but hadn’t connected the dots. In this June 2005 article for the ACM’s Special Interest Group on Accessible Computing, Rob Haverty lays out the rationale for this relatively new mechanism:
Windows UI Automation unifies disparate UI Frameworks such as Avalon [Windows Presentation Foundation], Trident [the browser], and Win32 so that code can be written against one API rather than several.
The basis of this unification is a tree of automation elements that is, in effect, a generic document object model. Automation providers map various specific object models, notably those of the browser and of Windows, into the generic tree. The API provides mechanisms for searching the tree and interacting with its elements.
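To make the idea concrete, here is a toy analogue of that generic tree in plain Python. The class and function names are invented — they only gesture at what a `PropertyCondition` plus `FindFirst` search does in the real `System.Windows.Automation` API.

```python
# A toy automation tree: nodes carry property dictionaries, and a
# search walks the tree for the first node matching a condition --
# roughly the shape of UI Automation's element search, not its API.

class Element:
    def __init__(self, props, children=()):
        self.props = props
        self.children = list(children)

def find_first(root, **condition):
    if all(root.props.get(k) == v for k, v in condition.items()):
        return root
    for child in root.children:
        found = find_first(child, **condition)
        if found:
            return found
    return None

desktop = Element({'name': 'Desktop'}, [
    Element({'name': 'Notepad', 'control': 'window'}, [
        Element({'name': 'Edit', 'control': 'document'}),
    ]),
])

editor = find_first(desktop, control='document')
```

The real API adds scoping, caching, and event subscription on top of this basic pattern, which is part of what makes it fiddly.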
It’s a powerful system that is also accurately described by John Robbins as “intensively fiddly.” So in this March 2007 MSDN article, he provides and illustrates the use of a set of convenience wrappers around the raw System.Windows.Automation classes. The sample program included with that article drives Notepad through a few basic operations. Could it be extended in the direction of CoScripter, in a way that realizes UI Automation’s ambition to uniformly control Windows and web applications?
I took a crack at that, and concluded that creating even a proof-of-concept will require more time and more programming chops than I can muster. But I’d be interested to hear from anyone who’s gone further down that path. I think this is potentially a very big deal. Although I suspect most programmers see UI Automation in the context of software testing, for which it is indeed well suited, Rob Haverty’s article suggests that it was primarily motivated by the need for better assistive technologies and improved accessibility.
When Tessa Lau says that accessibility guidelines are the lifeblood of CoScripter, she’s talking about affordances for people who cannot otherwise use the full capability of their software. But consider Rob Haverty’s definition of accessible technology:
Accessible technology enables individuals to adjust their computers to meet their visual, hearing, dexterity, cognitive, and speech needs.
I like his use of the word cognitive because in some sense we are all cognitively impaired when we try to use software. For most people, most of the time, the concept count is way too high. We don’t normally think of automation as an assistive technology. But arguably it is one. And when automation yields interactive documentation that lives in shared information spaces, it becomes a really potent assistive technology.
In case it’s not obvious, I am not claiming that Windows UI Automation can realize this vision of assistive automation across the spectrum of application types. It’s currently only available by default for Vista, and optionally for Windows XP if enhanced with the .NET Framework 3.0. It is not part of Silverlight or Moonlight, though conceivably one day it might be. And it clearly has nothing to do with Mac OS X, or Java, or Flash, or the Linux desktop.
But the idea of UI-based semantic automation is something that could apply in all these domains. A proof-of-concept CoScripter-like application-plus-service spanning two major domains — Windows desktop apps and browser-based apps running on Windows — would be a big step toward that broader vision.
The social scripting continuum
Back in June, IBM’s Tessa Lau joined me on my ITConversations podcast to discuss Koala, “a system for recording, automating, and sharing business processes performed in a web browser.” The service is now available on the AlphaWorks site as CoScripter, where the first script I tried was Tessa’s own Update your Facebook status. Here is the text of the script as it appears in the CoScripter wiki:
* go to "http://www.facebook.com"
* enter your "e-mail address" (e.g. tlau@tlau.org) into the "Email:" textbox
* enter your password into the "Password:" textbox
* click the "Login" button
* click the "Profile" link
* click the "Update your status..." link
* enter your status into the status field
Interestingly there was a bug in that script. The fourth step was originally:
* click the "Password" button
Because there is no button labeled “Password” on Facebook’s login page, the script failed.1 When I made the change from “Password” to “Login” in the CoScripter sidebar I simultaneously fixed the script and added the corrected version to the wiki. After posting this entry, I added a comment to the wiki that points back here. All in all, it’s a nice illustration of the emerging style of social programming that we also see in applications like Yahoo! Pipes and Popfly.
As Tessa explains in the podcast, many scripts — including this Facebook example — require secrets, notably usernames and passwords. These you can conveniently record as name/value pairs stored in a personal database. I have two observations about that. First, secrets appear to be stored remotely. If so, I’d prefer to keep them local. (Update: They are indeed local, see Tessa’s comment below.) Second, there should be a way to qualify them by domain, because names like “Email Address” and “Password” will soon become overloaded.
One of the delightful things about CoScripter is the simple and natural language used to express sequences of actions. It looks just like the instructions an ordinary user would write down for another ordinary user to follow. By embedding those instructions in an interpreter that makes it easy for anyone to run and debug them step by step, and by reflecting them into a versioned wiki, CoScripter creates a rich environment in which people can record, exchange, and refine their operational knowledge of web applications.
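It's worth noticing how little machinery is needed to turn such instructions into structured actions. Here's a minimal sketch; the grammar is invented and far cruder than CoScripter's real interpreter, but it shows how "instructions an ordinary user would write" can double as a scripting language.

```python
# Parse CoScripter-style steps into action dictionaries.
# The three patterns below are a hypothetical subset of the grammar.
import re

PATTERNS = [
    (re.compile(r'go to "(?P<url>[^"]+)"'), 'goto'),
    (re.compile(r'click the "(?P<target>[^"]+)" (?P<kind>button|link)'), 'click'),
    (re.compile(r'enter (?P<value>.+?) into the "(?P<target>[^"]+)" textbox'), 'enter'),
]

def parse_step(step):
    for pattern, action in PATTERNS:
        m = pattern.search(step)
        if m:
            return dict(m.groupdict(), action=action)
    return {'action': 'unknown', 'text': step}

step = parse_step('* click the "Login" button')
```

An interpreter then only has to map each action dictionary onto a DOM operation — which is also why the bug above was so easy to fix: the fix was an edit to plain text.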
Currently CoScripter is a creature of the web, and specifically of a Firefox-based, Flash-free web. Adapting it to another browser would be hard but doable. Adapting it to work with RIA (rich Internet application) plug-ins like Flash or Silverlight is really problematic, though, because RIA plug-ins don’t mesh very well with the web’s RESTful style.
There are minor exceptions. Back in 2004 I raised that issue in terms of Flash, and Adobe’s Kevin Lynch showed how to materialize URLs for states within a Flash application. But this doesn’t occur normally and naturally when you write a Flash application, as it does when you write a web application. Or rather, as it used to when you wrote a web application, because AJAX also tends to hide an application’s URL namespace.
Because the same issue is going to come up all over again in the context of Silverlight, now would be a good time to think about how Silverlight apps can expose automation interfaces that cooperate with the RESTful web they’re part of.
With any flavor of web application, whether it’s based on simple HTML and JavaScript, or enriched with AJAX, or turbocharged with Flash or Silverlight, it would be great not only to be able to automate as CoScripter can, but also to share and collaboratively refine the scripts. How can we best assure that possibility? Tessa Lau thinks that web accessibility guidelines represent our best hope. If CoScripter-style automation were to catch on it would be a further incentive to adopt those guidelines, and would likely reshape them in useful ways as well.
But why stop there? In principle there’s no reason why desktop applications can’t play the same game, and there are compelling reasons why they should. Today, for example, I found the answers to the 25 top “How do I?” questions asked about Word. Those answers are pointers to articles in the Microsoft knowledge base. For the ever-popular “How do I create mailing labels?”, the answer includes instructions like these:
- Open the document in Word, and then start the mail merge. To start a mail merge, follow these steps, as appropriate for the version of Word that you are running:
  - Microsoft Word 2002: On the Tools menu, click Letters and Mailings, and then click Mail Merge Wizard.
  - Microsoft Office Word 2003: On the Tools menu, click Letters and Mailings, and then click Mail Merge.
  - Microsoft Office Word 2007: On the Mailings tab, click Start Mail Merge, and then click Step by Step Mail Merge Wizard.
- Microsoft Word 2002:
  - Under Select document type, click Labels, and then click Next: Starting Document. Step 2 of the Mail Merge appears.
  - Under Select starting document, click Change document layout or Start from existing document. With the Change document layout option, you can use one of the mail-merge templates to set your label options. When you click Label options, the Label Options dialog box appears. Select the type of printer (dot matrix or laser), the type of label product (such as Avery), and the product number. If you are using a custom label, click Details, and then type the size of the label. Click OK. With the Start from existing document option, you can open an existing mail-merge document and use that as your main document.
  - Click Next: Select Recipients
The resemblance to CoScripter’s step-by-step instructions is striking. Why shouldn’t instructions like these be able to drive Word’s automation interfaces? Why couldn’t users create and share their own instructions? Sure it’s a desktop application, but nowadays that’s just an endpoint along a continuum of application styles — HTML, JavaScript, AJAX, RIA, desktop app — all of which are connected and can communicate. Collaborative automation is just one of many opportunities to exploit that ability to communicate, but it’s a huge one.
1 I suspect that Tessa planted that bug intentionally to see if we were paying attention!
XML documents: flavors versus essence
I have steered clear of the politics surrounding XML document formats both before and after joining Microsoft. But I was, and will always be, an outspoken advocate for the idea of XML documents. That’s a message that doesn’t make headlines but bears repeating. We have hardly begun to appreciate or exploit the value of XML. A couple of articles in the current issue of CTQuarterly, a journal about how cyberinfrastructure enables science, illuminate that point.
In Next-Generation Implications of Open Access, Paul Ginsparg writes:
One of the surprises of the past two decades is how little progress has been made in the underlying document format employed. Equation-intensive physicists, mathematicians, and computer scientists now generally create PDF from TeX. It is a methodology based on a pre-1980s print-on-paper mentality and not optimized for network distribution. The implications of widespread usage of newer document formats such as Microsoft’s Open Office XML or the OASIS OpenDocument format and the attendant ability to extract semantic information and modularize documents are scarcely appreciated by the research communities.
As the developer of the arXiv (formerly LANL) preprint archive, which predates the web, he understands better than almost anyone how that “pre-1980s print-on-paper” mentality thwarts the advancement of knowledge.
In The Shape of the Scientific Article in The Developing Cyberinfrastructure, Clifford Lynch writes:
We are seeing the deployment of software that computes upon the entire corpus of scientific literature. Such computation includes not only the now familiar and commonplace indexing by various search engines, but also computational analysis, abstraction, correlation, anomaly identification and hypothesis generation that is often termed “data mining” or “text mining.”
I like his tagline for this: “Scientific literature that is computed upon, not merely read by humans.”
XML document formats aren’t a panacea, but when we use them to reduce friction and lower activation thresholds, data will find data, and people will find people. To achieve those effects, the essential property of machine readability matters more than its flavor.
SharePoint, IronPython, and another lesson in the virtue of laziness
I’m doing an internal project that involves reading several different data sources from a SharePoint 2007 server, merging them, and posting the merged data back to the server. Being lazy, I wanted to use IronPython, write as little code as possible, and do everything dynamically.
Reading the data sources, which are customized SharePoint lists (i.e., database tables), was straightforward. Every SharePoint list offers an “Export to Spreadsheet” link which produces an XML dump. Given that export URL, here’s a recipe for reading the data (from a Windows client that’s already authenticated to the server) and converting it to a list of Python dictionaries.
import clr
clr.AddReferenceByPartialName('System.Xml')
from System.Xml import *
from System.Net import WebRequest
from System.IO import StreamReader

def getDataAsListOfXmlNodes(URL):
    request = WebRequest.Create(URL)
    request.Method = "GET"
    request.UseDefaultCredentials = True
    response = request.GetResponse()
    result = StreamReader(response.GetResponseStream()).ReadToEnd()
    doc = XmlDocument()
    doc.LoadXml(result)
    nsmgr = XmlNamespaceManager(doc.NameTable)
    nsmgr.AddNamespace('z', '#RowsetSchema')
    nodes = doc.SelectNodes("//z:row", nsmgr)
    return nodes

def convertNodesToDicts(nodes):
    listOfDicts = []
    for node in nodes:
        attrs = node.Attributes
        dict = {}
        for a in attrs:
            dict[a.Name] = a.Value
        listOfDicts.append(dict)
    return listOfDicts

nodes = getDataAsListOfXmlNodes('http://host/sites/mysite/_vti_bin/...')
dicts = convertNodesToDicts(nodes)
Uploading my merged file to a document library on the server wasn’t so straightforward. I knew that SharePoint provides a set of web services APIs, so I started by acquiring the IronPython “Dynamic Web Services Helpers” from the Web Services sample. Among other things, these wrappers make it trivial to consume a WSDL-based web service. Here, for example, is a snippet that uploads a photo using the Imaging web service:
import System, clr
clr.AddReference("DynamicWebServiceHelpers.dll")
from DynamicWebServiceHelpers import *
filename = 'jon.jpg'
ws = WebService.Load('http://HOST/_vti_bin/Imaging.asmx')
ws.UseDefaultCredentials = True
bytes = open(filename,'rb').read()
bytes = map (ord, list(bytes))
bytes = System.Array.CreateArray(System.Byte,bytes)
ws.Upload('Photos','',bytes,filename,True)
So far, so good. But when I then looked for a generic service to upload any file to any document library, I found myself on a slippery slope. In the midst of exploring how to use the Swiss-Army-knife Lists service to accomplish a simple file upload, I realized I was working way too hard. Back in 2004, Bill Simser reached the same conclusion:
There seemed to be a lot of argument about using Web Services, lists, and all that just to upload a document. It can’t be that hard.
And it isn’t. As others have discovered too, SharePoint responds to a plain old HTTP PUT. Here’s an IronPython update to Bill’s recipe:
from System.Net import WebClient

def upload(HOST, fname, rdir, rfile):
    wc = WebClient()
    wc.UseDefaultCredentials = True
    bytes = open(fname, 'rb').read()
    bytes = map(ord, list(bytes))
    bytes = System.Array.CreateArray(System.Byte, bytes)
    url = '%s/%s/%s' % (HOST, rdir, rfile)
    wc.UploadData(url, 'PUT', bytes)
And presto. A local file called, say, myfile.html, lands someplace like http://host/sites/mysite/MyLibrary/myfile.html.
Sheesh. If it feels like you’re working too hard, maybe you are. Step back, take a deep breath, and look for a lazier solution.
A conversation with Barbara Aronson about global access to medical journals
If you doubt that a librarian can change the world, listen to what Barbara Aronson has to say in this week’s installment of my ITConversations podcast. As the librarian for the World Health Organization, she’s been the driving force behind HINARI, a publisher partnership that’s making thousands of otherwise unaffordable medical journals available to researchers in poor countries — at no cost to the poorest 70 (“band 1”) countries, and at nominal cost to the next-poorest 43 (“band 2”) countries.
I first heard about HINARI from Lee Dirks, Microsoft’s director of scholarly communication, whom I met last month at the EDUCAUSE Seminars on Academic Computing. “She’s doing amazing work,” Lee told me. Wow, is she ever. Her mission to democratize access to medical knowledge challenges our assumptions about the nature of open access, the economics of publishing, and the research priorities of the developed world.
In response to criticism that HINARI isn’t “pure” open access, because of the token fees paid by band 2 countries, she says:
I think that most people who live in wealthy countries have no idea about what incomes are like in poor countries, and how many countries are poor. There are 72 countries whose gross national income per capita is less than a thousand dollars. There are 43 who are in the one- to three-thousand-dollar range. These countries bear the highest burden of disease in the world. These are places where most of the wars in the world are taking place, where they have the highest unemployment, and the most precarious life. Now most of the health research in the world is done on the problems of the richest people, because the money for health research comes from the wealthy countries. So you’ve got these very poor countries with the worst health problems, they’re trying to train doctors and nurses to take care of their populations, they’re trying to train researchers to find out the information that doesn’t exist to solve their problems, and they’re doing all this with no access to scientific and technical information.
Is HINARI open access? Not according to anybody’s strict definition of open access, but those definitions are made in the developed world. The open access argument, or discussion, as it’s going along in the places where it is going along, is too narrow, and it needs to be expanded if we’re going to be truly global about this.
What’s fascinating here, from the perspective of publishing economics, is the way in which publishers seem to be using HINARI to prototype a tiered pricing model that Barbara Aronson suggests may be applicable to their customers in the developed world too:
Maybe if the publishers could find a way to manage tiered pricing, which is a big question, maybe then they’d be able to address issues like: What about poor institutions in rich countries? What about underfunded individuals in rich countries?
But I don’t want to leave the impression that this conversation was all about publishing economics. There’s much more at stake. As she points out, researchers don’t just consume the information that HINARI makes available, they refract it through their own experiences and then contribute important new knowledge.
Two weeks ago I heard from the International Center for Diarrheal Diseases Research, in Bangladesh. They’re big users of HINARI. These are the people who pioneered the absolutely brilliant way of treating diarrhea in small children. Diarrheal diseases are one of the two biggest killers of children under five in the world. And this means in the developing world, because in the developed world people don’t die of this. These are the people who developed groundbreaking approaches to dealing with that, and to dealing with acute respiratory infection which is the other big killer. Bangladesh is one of the poorest countries in the world, and yet it has produced world-class research that has helped way beyond its borders.
I tip my hat to Barbara Aronson, to the World Health Organization, and to the hundred-plus publishers who’ve joined the WHO in this inspirational project.
Collaborative mapping and computational thinking
When Eric MacKnight pointed me to Ewan McIntosh’s reflections on Stuart Meldrum’s mapping project, he said: “This struck me as an idea that would interest you.”
It does. These folks are reaching for ways to build maps collaboratively — in this case, maps of the locations of schools in Scotland. One option would be to rely on a central authority that publishes the whole set of locations. Another would be to divvy up the work such that districts publish subsets of locations, or schools publish their own individual locations.
Division of labor aside, there’s the question of locus of control. Should there be a central registry to which contributors are granted access? Or should there be small pieces loosely joined?
Both, actually. Although we see this as an either/or choice, the two strategies can be complementary, and not just in the realm of collaborative mapping. All kinds of collective data management scenarios will benefit from combining these approaches.
Ewan McIntosh offers an example of an easy way to implement a central registry: a mashup of Google Spreadsheets and Google Maps. The spreadsheet is used as a lightweight, multi-user, versioned database of locations that populate the map. A central authority can control the registry, granting access to districts or schools.
In a comment on Ewan’s entry, John Johnston points to another mashup that’s conducive to a decentralized strategy based on tagging and syndication. John’s mashup scans his Flickr account for geotagged photos and sprays them onto the map. Whenever he geotags a new photo, a new point appears on the map.
To build a collaborative map using John’s approach, you still need a central registry. But it needn’t manage primary data such as locations. Instead it need only manage metadata: identifiers for schools, identifiers for authorized contributors. If John were a school administrator he’d register as a contributor, geotag a photo of the school, and tag it with the school’s identifier.
Now that’s a contrived example, because geotagged photos are a circuitous way to federate location data. But this model supports other and more natural approaches equally well. Instead of looking for tagged photos on Flickr, for example, the registry might look for resources (e.g. schools’ domain names) tagged on del.icio.us or elsewhere.
With this scheme you have a nice separation of concerns. Schools are the authoritative sources for data about themselves. Registries define the protocols for syndicating that data, and the sources they’ll syndicate from.
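To make that separation of concerns concrete, here’s a minimal sketch of how such a registry might work. Everything in it is invented for illustration: the contributor names, the school identifiers, and the shape of the items arriving from a tagging service’s feed. The point is only that the registry holds metadata (who may contribute, which identifiers are valid), while the location data itself is syndicated in from the sources.

```python
# Hypothetical sketch of a metadata-only registry. The registry knows
# authorized contributors and valid school identifiers, but holds no
# primary location data; locations are syndicated in from tagged items.

registry = {
    "authorized_contributors": {"john.johnston"},
    "school_ids": {"sch-001", "sch-002"},
}

# Items as they might arrive from a tagging service's feed
# (structure and values invented for this example):
feed_items = [
    {"contributor": "john.johnston", "tags": ["sch-001"],
     "lat": 55.95, "lon": -3.19},
    {"contributor": "unknown.user", "tags": ["sch-002"],
     "lat": 56.46, "lon": -2.97},
]

def syndicate(registry, feed_items):
    """Accept only items from authorized contributors that carry a
    known school identifier; return the assembled location map."""
    locations = {}
    for item in feed_items:
        if item["contributor"] not in registry["authorized_contributors"]:
            continue
        for tag in item["tags"]:
            if tag in registry["school_ids"]:
                locations[tag] = (item["lat"], item["lon"])
    return locations

print(syndicate(registry, feed_items))  # only sch-001 passes the filter
```

The second item is dropped because its contributor isn’t registered, even though its tag is valid. That’s the division of labor: schools remain authoritative for their own data, the registry only decides whose declarations it will trust.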
If there’s only one registry the benefits aren’t overwhelming. But in fact there are many registries.
Consider a school that belongs to a variety of regional academic and athletic associations. Today, each association’s view of its member schools requires yet another centrally-controlled database. Many facts about individual schools will be duplicated across those databases. The location won’t often change, but other facts will. When that happens there’s no way to publish one authoritative change about a school that flows to many subscribing registries.
The alternative I’m sketching is really a variation on the lifebits theme. Like individuals, organizations have an interest in declaring authoritative facts about themselves. Aggregate views shouldn’t require singular control of a database, but rather singular definitions of tagging, syndication, and membership protocols.
There is, of course, a huge problem with this scheme. It presents a formidable conceptual barrier. For example, my experiment in community information hasn’t gone very far yet. To me, it makes perfect sense. No need to plug your photos or your events into my database. Instead, just reuse your photos on Flickr and your events on Eventful. But nobody expects things to work that way. In principle the indirection of tagging and syndication creates all sorts of useful effects. In practice most people aren’t comfortable with that indirection. If Jeanette Wing has her way, such computational thinking will become much more prevalent. I hope she’s right, and I wonder what we can do to encourage it.
Social networks then and now
On a recent vacation during which I helped a friend who’s building a house on Prince Edward Island, I picked up a copy of The Guardian and happened upon the death and funeral announcements. At first glance what’s remarkable is the amount of detail about the family of the deceased, the entire cast of characters involved with the funeral, and even the hymns sung. Scanning all this information, it took me a while to realize that something was missing. There’s almost no information about the life and times of the deceased. What is recognized in these pages is not the person but rather the social network to which the person was connected.
We caught a glimpse of the power of that social network when we were raising the first wall of the house. Word got around, people showed up to help, and we felt the force of community in a place that modernized fairly recently and still retains a strong flavor of pre-industrial culture. In that world, social networking isn’t a lifestyle choice, it’s a matter of survival.
On the way home, waiting in the Charlottetown airport, I saw a copy of Newsweek with a cover story about Facebook. Arguably our new modes of Internet-based social networking really are lifestyle choices, at least so far. As they mature it will be interesting to see how we use them — both to recapture lost ways of life, and to create new ones.
How wind works
At Burning Man this year Dick Hardt will be generating electricity with a 400W wind turbine. A couple of days ago I saw what appears to be that same thing in a Canadian Tire store on Prince Edward Island, at the end of an aisle that included the usual sorts of automotive accessories you’d expect to find there. It reminded me of the first time I saw an Ethernet switch, formerly an esoteric item, on the shelf at Staples. Evidently wind power is going mainstream. Here’s the 400W generator on sale for $800:

And here’s the delightful illustration on the side of the box, with the immortal caption “How Wind Works”:

It starts with a puff…
Two turns
As a former gymnast, I’ve always been frustrated with what passes for television coverage of the sport. The announcers always point out what everyone can plainly see: “Oops, didn’t stick the landing.” But they never tell you anything about the real subtleties of the sport. When I’m watching on TV with friends and family I try to explain things, but it all goes by too quickly. Even when replaying a recorded show in slow motion, it can be really hard to pinpoint what goes on.
Here’s an example from a competition I saw tonight. In these parallel frames of video, Nastia Liukin on the left and Shayla Worley on the right are at exactly the same point in a back giant swing:
One second later they’ve both done a half turn to what appears to be exactly the same point in a front giant swing:
But although Elfi Schlegel and Tim Daggett never mention this, there’s a huge difference between what those two women did in the intervening second, and also in the positions they came to.
I captured the two sequences side by side in this video. You may have to drag the slider back and forth a few times to catch what’s going on. Here’s a guide.
They both release their left hands and begin to turn.
Worley on the right turns her back away from the camera and ends in an ordinary undergrip. You know that one. Extend your arms forward, palms up and thumbs out, lay a broomstick across your palms, and grasp. It’s easy and natural.
Liukin on the left turns her back toward the camera, and ends in an eagle grip. You don’t know that one. Release one hand from the broomstick, rotate your thumb inwards and then outwards again through 180 degrees, and regrasp. Now do the same with the other hand. It’s hard and unnatural. Unless you have extremely flexible forearms and shoulders, you won’t even be able to do it.
This intermediate frame shows the difference most clearly. You can see their ponytails flying in opposite directions:
When I was in high school, my coach used to take Super 8 movies of the top competitors — in that era, in men’s gymnastics, it was the Japanese — and we would analyze their performances frame by frame. It’s so cool to be able to make and share that kind of analysis on the web. If we could get Elfi and Tim to do some of that, televised gymnastics would be so much better.
By the way, Nastia Liukin’s set was one of the most fabulous bar routines that I’ve ever seen performed by a woman or a man. Not just because of that crazy eagle grip, which she uses in several places, but in every way: flow, extension, flight, timing, power, flexibility, daring, and style.
Hosted lifebits scenarios
Recently I gave a talk in which I explored the idea of a hosted lifebits service. I think it’ll turn out to be a fundamental principle and an enabler of many things, including the social network portability that is the blogosphere’s topic du jour. But before we go there, let’s explore how a series of more basic scenarios might play out in the context of a hosted lifebits service.
1) I write a blog entry.
Today we can, and often do, put serious effort into these acts of personal publishing. But the infrastructure to which we commit our words, sounds, and images doesn’t take our effort seriously. There’s no guarantee that anyone will be able to access an item at the published address in a year, never mind ten or a hundred. And there’s no guarantee that the effects of these acts of personal publishing — the reactions they provoke, the influences that flow from them, the reputations they create for us — can be measured.
In the hosted lifebits scenario such guarantees will exist, because we’ll pay for the service that makes them. At the core of that service is an archive that provides price-tiered levels of assurance that your stuff will be stable over time, that access will be granted in exactly the ways you specify, and that you can monitor that access.
I may over time use a succession of blog publishing systems. No problem, because the publishing service is decoupled from the core lifebits service. When I change publishing services, no content changes hands. There’s no export or import. I just authorize a different publishing service to access my archive. And there’s no rewriting of the URLs either. I declare what my web namespace will be. The lifebits service guarantees its long-term persistence, and collaborates with the publishing service to populate that namespace.
What if my lifebits service goes belly-up? Still no problem. There are multiple lifebits providers, and they belong to a government-regulated federation that assures continuity. One real-life example of this business arrangement is the insurance industry’s notion of a guaranty fund.
2) I comment on somebody else’s blog.
Today, each time I do that, I commit my words to a different foreign system. Logically they’re all my comments, but operationally they’re scattered all over the place, subject to a random assortment of naming and archival policies. True, there are services that can help me lasso all my comments, but architecturally this is just herding cats that have already gotten out of the bag.
In the hosted lifebits scenario, the item I’m commenting on is a permanent part of its author’s archive, at a stable URL. To comment on it, I write an item into my archive that refers to the item I’m commenting on, and my publishing service notifies the author’s publishing service that a comment has been made. We have various approximations of this behavior today, of course, but real consistency and coherence will require the use of lifebits services and associated lifebits-aware publishing services.
3) I write an email.
Today when I do that, I transmit a message from my email system to yours. If I want to maintain a coherent archive of my email, there are all sorts of challenges. Over time I use a succession of personal and business email systems. And at any given point I use several different ones concurrently, to separate personal from business correspondence. I know a few people who have kept their email archives intact over time, but for most those archives are scattered across a variety of local and (nowadays) cloud-based repositories.
In the hosted lifebits scenario, an email message can be a kissing cousin to a blog posting or a comment. I write it, commit it to my archive at a stable URL, notify you of its existence at that URL, and optionally transmit a copy of the message. That last step is optional because this model decouples two aspects of email that have always been inseparable: notification and transmission.
The core lifebits service that I’m postulating here, plus associated lifebits-aware publishing services — which are what email services turn out to be in this model — aren’t enough to achieve that decoupling. We’ll also need an access control regime that leverages an identity metasystem. Once those ingredients are all available, we’ll start to see that the services of notification, storage, access control, and transmission can be recombined to achieve powerful effects.
Consider the following example. When Robert Scoble worked for Microsoft he reportedly once said: “I wish we had trackbacks for email.” Why? He was comparing the efficiency of blog communication to the relative inefficiency of email communication. In the blogosphere it’s fairly easy to trace the influence of a posting, but in email there’s no way to monitor the influence of your contribution to an email thread once your name drops off it. In a pervasively publishing-oriented enterprise, knowledge management and social network analysis would be radically simpler and more effective than they are today.
Pushing the email example even further leads to a key objection. From my perspective, my personal and professional lifebits are all part of the same stream. But can we really imagine that when I join a company I’ll be able to federate my personal lifebits service with its corporate lifebits service? That sounds crazy at first. Companies run their own email infrastructure in part so they can enforce policies about email retention and destruction. And yet, companies are increasingly outsourcing that infrastructure and delegating the enforcement of those policies to third parties. Providers that survive in an ecosystem of lifebits providers will have to convince everyone that they are trustworthy, reliable, and interoperable.
Is there a reason to think that my company’s provider will do better on those measures than my personal provider? If we’re thinking just in terms of today’s email and blogging systems, then yes, there probably is. There hasn’t been an incentive, yet, for personal-grade systems to meet enterprise-grade expectations. But let’s add one more scenario:
4) I visit a doctor.
Today, the record of my visit is kept by the hospital. Yes, the portable health record is coming, finally, and that will be a great step forward. But it’s really just an interim step. Here too, the services of notification, storage, access control, and transmission can be usefully recombined. Imagine that your health records are managed, in the most permanent and authoritative sense, by your own lifebits service which you choose to federate with the corporate lifebits services of the various health care providers you encounter during your life.
This raises the bar on the guarantees of trustworthiness, reliability, and interoperability that a personal lifebits service must make. Those guarantees won’t come for free. But if I can amortize the cost across all my data silos — health, family, employment, education, finance, shopping, social life — the benefits will be huge and I’ll gladly pay.
A conversation with Greg Elin about the Sunlight Foundation
My guest for this week’s ITConversations show is Greg Elin, chief data architect with the Sunlight Foundation. Founded in 2006, the Sunlight Foundation aims to make the operation of Congress and the U.S. government more transparent and accountable. There are lots of obvious reasons why that’s a good thing. Greg adds a non-obvious reason that I hadn’t heard and find compelling:
I increasingly feel that the reason for Congressional hearings to be open and recorded and annotated is market efficiency. The Fed does not announce what it’s going to do with interest rates until it announces it to everybody. But is that the case for the rest of Congress and legislation? If I can afford to have a full-time lobbyist going to the committee meetings, don’t I have an inside track? Can’t I arbitrage my market investments based on that? It’s a question of market efficiency.
That was one of the moments in this conversation where I stopped and said: Wow, great point. Here’s another. We were talking about the difficulty of organizing information from disparate sources based on unique identifiers, whether for individual legislators or for sections and paragraphs of legislation. Greg made this excellent point:
As technologists, we forget how much we’ve gamed the system from the beginning in setting up our tools. That Ethernet card comes with a hardcoded ID, and it’s unique, but it took us a long time to get there, and it required the cooperation of a lot of people to make it work.
Having surveyed a wide range of government data sources, Greg’s conclusion is that the future is already here, but not yet evenly distributed. There are pockets within the government where data management practices are excellent, and large swaths where they are mediocre to horrible. The Sunlight Foundation has an interesting take on how to bootstrap better data practices across the board. By demonstrating them externally, in compelling ways, you can incent the government to internalize them:
Sunlight Foundation made a grant to OMBWatch, they put together fedspending.org, and as that was happening the Coburn-Obama bill was passed, which basically said that the OMB had to put together the same type of website. If the Sunlight Foundation — and other organizations like the Heritage Foundation and Porkbusters — if we had not been doing a collaborative project at the time around earmarks, and at the same time working with OMBWatch to do fedspending.org, I think that there wouldn’t have been the drumbeat pressure for the government to make this information available.
Later the conversation turned to data integrity and data provenance. What I mean by integrity, here, is the sort of question raised by my Hans Rosling wannabe screencast in which I observe that town-reported crime statistics rolled up to a statewide total don’t agree with state crime statistics as seen from a national perspective. Greg has a similar example:
Everything that CRP [Center for Responsive Politics] tracks is on a two-year election cycle. But OMBWatch is tracking contracts, and Taxpayers for Common Sense is tracking earmarks, on a budget year cycle. So things don’t necessarily line up.
There’s never going to be an easy way to make these different gears mesh. But until now, we’ve never had any way to see exactly how they don’t mesh, and to factor that into our thinking. That’s one of the subtler effects of transparency.
Another is the possibility of a more complete view of data provenance — that is, where it comes from, and how it’s transformed along the way. Influenced by Jeff Jonas’ notions of sequence neutrality and data tethering, Greg envisions an open protocol for what he calls continuous data analysis:
If we can get an open protocol for reporting what we find in data, you’re beginning to make explicit the transformations that you apply. What I need to be able to do here at Sunlight, and what all of us working with public data need to be able to do, is instantly reprocess data that we’ve already processed, because any data we get is going to be missing something. If someone decides to change a taxonomy term, you ought to be able to rerun the data at every level with that new taxonomy term.
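The reprocessing idea Greg describes can be sketched in a few lines. This is a toy illustration, not Sunlight’s actual pipeline: the records and the keyword-to-category taxonomies are invented. What it shows is the principle that classification is a function applied to raw data, so when the taxonomy changes you rerun the function rather than patch the results.

```python
# Invented example: raw records are classified by a taxonomy that maps
# keywords to categories. When the taxonomy changes, the data set is
# reprocessed from the raw records rather than edited in place.

raw_records = [
    {"id": 1, "description": "highway repair earmark"},
    {"id": 2, "description": "rural broadband grant"},
]

def classify(records, taxonomy):
    """Reapply the taxonomy to every raw record, returning new
    classified records and leaving the raw data untouched."""
    out = []
    for rec in records:
        cats = [cat for kw, cat in taxonomy.items()
                if kw in rec["description"]]
        out.append({**rec, "categories": cats})
    return out

taxonomy_v1 = {"highway": "transportation"}
taxonomy_v2 = {"highway": "infrastructure", "broadband": "infrastructure"}

print(classify(raw_records, taxonomy_v1))  # broadband grant unclassified
print(classify(raw_records, taxonomy_v2))  # same raw data, new categories
```

Because the raw records are never overwritten, any downstream consumer can rerun the analysis at every level with the new taxonomy term, which is exactly the capability Greg is asking for.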
This was an excellent conversation, thanks Greg!
Unexamined software idioms #1: Linking in rich text editors
There’s undoubtedly a whole series of items to be written on unexamined idioms in software user interfaces.1 Here’s one to kick off the series: the linking mechanism in rich text editors. It hasn’t changed in a decade, and it works the same way in new editors — like Yahoo’s Rich Text Editor and the .NET-based Windows Live Writer — as it always did. The idiom goes like this:
1. Select the text to which you want to attach your link.
2. Click the Link button.
3. Type (or paste) the URL.
I’ve watched novices struggle with this for years, and it’s no wonder that they do. What’s missing from this protocol is the capture of the URL. (That’s almost always necessary because few URLs nowadays can be easily typed.) So the idiom really goes like this:
1. Navigate to the target of the link.
2. Capture the URL.
3. Select the text to which you want to attach your link.
4. Click the Link button.
5. Type (or paste) the URL.
We have never, in any rich text editor I’ve ever seen, woven in support for those crucial first two steps.
How might that work? It occurs to me that a picture-in-picture browser would be really helpful. I’ve only seen one example of that genre — Bitty Browser — but it, or an equivalent widget, would seem like a great solution. When you click the Link button you get a picture-in-picture browser that you use to navigate to the link target. Ideally it loads with your current history and tabs, so the target is within easy reach.2 When you land on the target, there’s a button to copy the URL. Now that you’ve been guided through the first two steps, the remaining three flow naturally.
1 Just for fun, I’m going to try keeping a list at del.icio.us/judell/unexamined-software-idioms. To play along you can do the same at del.icio.us/YOU/unexamined-software-idioms, and we can see what accumulates in del.icio.us/tag/unexamined-software-idioms.
2 You can see a glimmer of this idea in Live Writer. From its linking dialog you can navigate to, and select, a prior post or a glossary entry.
MuVo woes
I listen to lots of podcasts, often in harsh conditions to which I wouldn’t want to expose a hard-disk-based device. So flash-memory-based gadgets are an attractive choice. Their capacity isn’t an issue for me because once I’ve listened to a podcast I just discard it. There’s no need to manage it as part of a collection that lives on my computer, or is synchronized to a player. If I want to hear it again sometime, I’ll download it and transfer it again.
I also do an increasing amount of voice recording, when preparing for talks. Here too, less is more. I don’t need high quality sound, just convenient recording of speech that’s recognizable on playback. Again, this recording often occurs in harsh outdoor conditions. And it sometimes occurs spontaneously, in which case I want to be able to pop in a AAA battery and go. I’ve come to loathe devices with proprietary batteries that are useless if you forget to charge them in advance.
Finally, like everyone, I use USB sticks to store files and move them from one place to another.
I’ve found one device that meets all three of these requirements brilliantly: the Creative MuVo. I’ve been a huge fan of this gadget since 2004, and have owned three models. First was the 256MB MuVo TX. Later came the 512MB MuVo TX FM, which doubled the storage and added an FM radio I never used. Before giving away the TX I owned both for a while, and on one memorable occasion I found a compelling simultaneous use for the pair.
True, the device has a tendency to flake out now and then, in ways that would confound most people, but I was always able to resurrect it with a firmware refresh.
Until now. The TX FM still works as a USB drive but the player is dead. Since I was going to Staples anyway I picked up what seemed like the obvious replacement, the MuVo V100, without doing any research. Bad idea: it’s dog slow on transfers:
At Creative’s site there are three pages of customer complaints about the MuVo v100 slow file transfer rate. No fix is currently available, though Creative customer service sends customers on a useless firmware download wild goose chase and neglects to mention that the snail-like transfer rate is a well-documented problem. [Amazon customer reviews]
Sheesh. I’m taking it back, ordering another TX FM instead, and wishing that somebody would provide that excellent bundle of features in a sturdier package.
A conversation with Kentaro Toyama about Microsoft Research India
This morning I spoke with Kentaro Toyama, the assistant managing director of Microsoft Research India, about the mission of Microsoft’s Bangalore-based research center. Our podcast touches on all six of MSR India’s research areas. These are mostly concerned with the same kinds of advanced computer science problems that the other labs around the world focus on. Although it wasn’t a requirement that each of these efforts be particularly appropriate to India, it turns out that one way or another they are.
India’s wealth of mathematical talent, for example, is a tremendous asset for a research program in cryptography, security, and algorithms. Likewise its linguistic diversity — there are 22 officially recognized languages, and several hundred dialects — makes it a natural home for research on multilingual systems. And a country that’s adding 7 million mobile phone subscribers every month is a great place to investigate mobility, networks, and systems.
There’s also work in areas outside the realm of classic computer science. Kentaro Toyama leads an area called technology for emerging markets which tackles problems like how to create text-free user interfaces for people who cannot read. Obviously you need to rely heavily on graphics and on audio feedback, but there are fascinating subtleties involved. Simple icons don’t work well, because they’re not expressive enough. But fully realistic images don’t work well either, because they’re overly literal. It turns out that a cartoon-like approach is what works best, and within that discipline there are further subtleties — for example, you want to animate the pictorial verbs, but not the nouns.
I was also fascinated to hear about related work in digital geographics, and in particular, about an effort to render map data in the style of hand-drawn historical maps. Why do this? Well for one thing, those old maps are beautiful. But as Kentaro Toyama points out, there’s a non-aesthetic reason too. Maps produced by human cartographers communicate more effectively than machine-generated maps normally can. That’s because cartographers use their intelligence and judgment to select and emphasize certain features at the expense of others. It’d be great to be able to model some of that intelligence and judgment and reproduce it in software.
I’ve been to India twice. When I was 5, my family lived in New Delhi for a year. Then in 1993, for BYTE, I visited to learn about India’s software industry. Maybe finding out more about MSR India will turn out to be a reason to go again.
Transmission of tacit knowledge: teaching what we don’t know that we know
In a couple of talks last year on the theme of network-enabled apprenticeship, I referred to an example of the transmission of tacit knowledge. What happened was that Jim Hugunin accidentally taught me a feature of the Python programming language — the use of the special underscore variable to store the value of the most recently evaluated expression — without ever realizing that I hadn’t known about it, or that his use of the idiom transferred it to me.
Now Chris Gemignani has taught me something else about Python in the same accidental and unconscious way. Last week I mentioned his geocoder for Excel. He’s also written a Python class that’s useful for batch geocoding, and when I found it today I was struck by this idiom:
print "location: %(latitude)s, %(longitude)s" % address
If you’re a non-programmer, here’s a bit of background. Most programming languages include some version of printf, a function that uses a format string to control the interpolation of the values of variables into text. So in Python, for example, this statement…
print "location: %s, %s" % ( latitude, longitude )
…would interpolate the values of the variables named latitude and longitude into the format string "location: %s, %s" to produce an output like:
location: 42.933659, -72.278542
It’s quite likely, though, that those variables will be members of a data structure like this dictionary:
address = { 'latitude': 42.933659, 'longitude': -72.278542 }
In this case, your normal instinct will be to write:
print "location: %s, %s" % ( address['latitude'], address['longitude'] )
That works fine, but the alternative Chris revealed to me is better:
print "location: %(latitude)s, %(longitude)s" % address
Although I use Python extensively, I had never discovered this! It’s better in two ways. First, it’s more concise. Second, it associates the names of the variables directly with the percent markers in the format string. That’s not a big deal when there are only two variables to keep track of, but often there are more, and matching up the positions of the markers in the format string with the positions of their corresponding variables in the corresponding list is tedious and error prone.
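To make the comparison concrete, here are both forms side by side as a runnable snippet (written in Python 3, so print is a function). The extra city and state fields are added just to show how quickly positional markers become error prone as the field count grows.

```python
# A geocoded address as a dictionary; city and state are added here
# purely to illustrate the bookkeeping burden of positional markers.
address = {'latitude': 42.933659, 'longitude': -72.278542,
           'city': 'Keene', 'state': 'NH'}

# Positional markers: the tuple's order must be kept in sync with
# the order of the %s markers in the format string.
s1 = "location: %s, %s (%s, %s)" % (
    address['latitude'], address['longitude'],
    address['city'], address['state'])

# Named markers: each marker names its dictionary key, so nothing
# can drift out of order.
s2 = "location: %(latitude)s, %(longitude)s (%(city)s, %(state)s)" % address

assert s1 == s2
print(s2)  # location: 42.933659, -72.278542 (Keene, NH)
```

Both produce identical output, but only the second survives a reordering of the format string without a corresponding edit elsewhere.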
Quite possibly none of this means anything to you, because you’re neither a programmer nor a Pythonista. Even so, I’ll argue that this principle of transmission of tacit knowledge is profound, and can apply to almost any discipline that’s subject to online narration.
There are all sorts of obvious reasons to narrate the work that we do. By doing so we build reputation, we attract like-minded collaborators, we draw constructive criticism, and we teach what we know.
Sometimes there’s also a non-obvious reason. It’s possible to teach what we don’t know that we know.
PodScreenMathSlideSketchCasting
Richard Ziade is experimenting with a video form he calls sketchcasting. A sketchcast is a recording of a whiteboard session plus voiceover. I’ve seen some very effective educational uses of this technique, and it’s interesting to compare Tim Fahlberg’s mathcasts to Richard Ziade’s sketchcasts. When Tim Fahlberg demonstrates the solution to a math problem in one of his mathcasts, the visual repertoire of numbers and symbols is fixed, and the creative contribution is sequencing and narration. When Richard Ziade delivers a presentation as a sketchcast, the visual repertoire is open-ended. We all know people who like to sketch and who communicate effectively that way. Richard Ziade is clearly one of them. Microsoft’s Steve Cellini is another. In meetings he invariably leaps to the whiteboard and draws pictures of the ideas being discussed.
It’s great to see all these forms evolving and — crucially — becoming more accessible. TechSmith’s Jing, for example, aims to make screencasting more spontaneous. SlideShare makes it easy to produce and share slidecasts, which are audio narrations of slide decks.
As words suffixed with cast proliferate — pod, screen, math, sketch, slide — it can all seem a bit bewildering. But with a range of choices, people who want to produce rich media can gravitate to the forms that match their skills and inclinations. And for those who watch and listen to these productions, it’s not complicated at all. You click the link, you watch and/or listen.
Excel geocoding adventures
As mentioned here, I’ve been working with a spreadsheet containing addresses that want to be geocoded. I’ve had lots of experience running batches of addresses through geocoding services, but in the case of the police department I’ve been working with, it would be nice to be able to do the geocoding interactively. That way, if 400 Marlboro resolves incorrectly to 400 Marlboro Rd., the clerk will know it’s necessary to specify 400 Marlboro St. if that’s the intended address.
I found two examples of spreadsheets programmed with this behavior, first from AutomateExcel.com and second from Juice Analytics. When I compared these I realized that I wanted to combine aspects of both.
The AutomateExcel version is extremely simple. That’s partly because it uses the XML mapping features of Excel 2003 or 2007 to capture the XML output of a geocoding service, and partly because it only deals with a single address.
The Juice version is more complex. That’s partly because it eschews the XML mapping features in order to support older versions of Excel, and partly because it deals with many addresses. (It also exports KML for use with Google Earth.)
In my case I was willing to assume Excel 2003 or later, and use XML mapping. But I wanted to be able to accumulate results for many rows of addresses. I also wanted to switch from the XML output of geocoder.us, which is used in the AutomateExcel version, to the XML output of Yahoo’s geocoder, which is used in the Juice version.
The version I came up with is here, and the VBA code appears below. I haven’t used VBA or the XML mapping features of Excel in a while so, while the experience is fresh in my mind, I thought I’d record some of my key observations.
Mapping the output of Yahoo’s geocoder
I started by replicating the XML mapping in the AutomateExcel version. Here’s a sample geocoder.us query:
geocode?address=400%20marlboro%20st,keene,nh
To create an XML mapping in Excel you do: Data -> From the web -> [plug in the URL] -> Import. Excel warns: “The specified XML source does not refer to a schema. Excel will create a schema.” OK.
Then it asks: “Where do you want to put the data?” I answered: “XML table in existing worksheet.”
Then I did Developer -> Source to reveal the XML map, unbound the mapped fields, and rebound them to the vertical rather than horizontal layout I wanted to use.
It was all good. So now I tried the same procedure using this sample Yahoo query:
geocode?appid=YahooDemo&street=400%20marlboro%20st&city=Keene&state=NH¹
But this time when I unbound and rebound the fields, I couldn’t access the values in the same way. Eventually I saw why not. The Yahoo results reference a schema, and that triggers a more complex behavior in Excel involving the importation of whole data sets.
So I saved an instance of Yahoo’s XML results in a file, stripped out the schema reference, and then acquired it using Data -> From other sources -> From XML Data import. Then it behaved just like the first example. I expect there’s a simpler solution, and hopefully this item will attract a reference to it.
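The schema-stripping step can be automated rather than done by hand in an editor. Here's a rough Python sketch of the idea; the attribute names (xmlns, xsi:schemaLocation) are my assumption about what Yahoo's output carries, not a verified schema:

```python
import re

def strip_schema_refs(xml_text):
    """Remove xmlns and xsi:schemaLocation attributes from an XML document
    so that Excel falls back to inferring its own schema on import."""
    return re.sub(r'\s+(xmlns(:\w+)?|xsi:schemaLocation)="[^"]*"', '', xml_text)

# Illustrative stand-in for a saved Yahoo response
sample = ('<ResultSet xmlns="urn:yahoo:maps" '
          'xsi:schemaLocation="urn:yahoo:maps somewhere.xsd">'
          '<Result precision="address"/></ResultSet>')

print(strip_schema_refs(sample))
# -> <ResultSet><Result precision="address"/></ResultSet>
```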
With the mapping done, it’s a one-liner in VBA (ActiveWorkbook.XmlMaps("YahooMap").Import url:=url) to fetch the XML data and spray it into the mapped cells. That’s dramatically simpler than the regular-expression gymnastics performed by the Juice version. Of course if you need to support older versions of Excel, you’ve got to perform those gymnastics.
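Outside Excel, the same unpack step looks like this in Python — a sketch only, parsing a canned response whose element names are my assumption about the geocoder's output rather than its documented schema:

```python
import xml.etree.ElementTree as ET

# Canned response standing in for the live geocoder call;
# the element names here are illustrative assumptions.
response = """<ResultSet>
  <Result precision="address">
    <Latitude>42.9336</Latitude>
    <Longitude>-72.2781</Longitude>
    <Address>400 Marlboro St</Address>
  </Result>
</ResultSet>"""

def unpack(xml_text):
    """Mimic what Excel's XML map does: spray the result
    fields into a label -> value mapping."""
    result = ET.fromstring(xml_text).find("Result")
    return {
        "y_lat": result.findtext("Latitude"),
        "y_lon": result.findtext("Longitude"),
        "y_addr": result.findtext("Address"),
        "y_precision": result.get("precision"),
    }

print(unpack(response)["y_lat"])  # -> 42.9336
```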
Relativizing references
My first version was full of hardcoded references to rows and columns in the temporary sheet where the XML data gets unpacked, and in the main sheet where raw addresses are decorated with latitude, longitude, parsed address, precision (e.g. exact address vs. street-level vs. city-level), and cleanliness (e.g. whether there were warnings).
I knew that I’d need to use lookup functions to relativize all those references, and it soon became apparent that I’d want to use the Match function — which finds the position of an item in a row or column — to do it. But it returns numeric positions, which are fine for rows but don’t correspond to alphabetic column letters like the C in C3. The solution, as generations of Excel hackers have learned but I never had need of until now, is to go to Options and enable the R1C1 reference style. Now the columns have numbers too, and in VBA you can write references like so:
rows(index).columns(address_col)
Dynamic variable assignment
That cleaned up a lot of the mess, but there was still a lot of per-variable code that I’d written in order to stash the geocoded results into VBA variables and then later retrieve them. I thought of generalizing that by using Eval, like so:
Eval("y_lat = Selection.Value")
But no dice. Excel 2007 told me there was no Eval function. Which is just as well, because that time-honored trick is really sketchy. So I went looking for VBA’s equivalent to the Perl associative array or the Python dictionary, and found it in VBA’s Collection.
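For readers who know Python but not VBA, the stash-and-retrieve pattern that the Collection enables amounts to this (a sketch with made-up values for illustration):

```python
# Python's dict plays the part of VBA's Collection with string keys:
# stash each geocoded field under its label, then retrieve by label.
y_labels = ["y_lon", "y_lat", "y_addr", "y_precision", "y_clean"]
scratch = {"y_lon": "-72.2781", "y_lat": "42.9336",
           "y_addr": "400 Marlboro St", "y_precision": "address",
           "y_clean": "yes"}  # made-up values for illustration

y_values = {}                          # Set y_values = New Collection
for label in y_labels:
    y_values[label] = scratch[label]   # y_values.Add Item:=..., Key:=label

print(y_values["y_lat"])               # y_values("y_lat") in VBA
# -> 42.9336
```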
All in all, it was an educational exercise. The patterns here can serve as a model for any scenario that involves interactively querying a web service based on some cell in Excel, and then incorporating the results into companion cells. Of course since I’m a complete novice when it comes to this stuff, I’m hoping that by posting my code I’ll also find out about other and better approaches.
¹ You’ll want to substitute your Yahoo application id for YahooDemo. And unless the addresses you’re looking up happen to be in my town, you’ll want to adjust the city and state too.
Dim address As String
Dim escaped_address As String
Dim city As String
Dim state As String
Dim yahoo_id As String
Dim url As String
Dim y_labels() As Variant
Dim y_values As Collection
Dim main_address_col As Integer
Dim scratch_data_col As Integer
Dim scratch_label_col As Integer

Public Sub Init()
    city = "Keene"
    state = "NH"
    address = ActiveWorkbook.Application.ActiveCell
    main_address_col = 2
    scratch_label_col = 1
    scratch_data_col = 2
    escaped_address = Replace(address, " ", "+", 1)
    yahoo_id = "YahooDemo"
    y_labels = Array("y_lon", "y_lat", "y_addr", "y_precision", "y_clean")
End Sub

Public Sub GeocoderYahoo()
    Dim index As Variant, row As Variant, col As Variant, label As Variant
    On Error GoTo ErrorMsg
    Init
    url = "http://local.yahooapis.com/MapsService/V1/geocode?appid=" & yahoo_id
    url = url & "&city=" & city & "&state=" & state & "&street=" & escaped_address
    ' Call the geocoder, spraying results into the mapped cells
    ActiveWorkbook.XmlMaps("YahooMap").Import url:=url
    ' Find the current row in the main sheet
    index = Application.Match(address, Columns(main_address_col), 0)
    Set y_values = New Collection
    ' Gather results from the scratchpad into the collection
    Worksheets("Scratchpad").Select
    For Each label In y_labels
        row = Application.Match(label, Columns(scratch_label_col), 0)
        Rows(row).Columns(scratch_data_col).Select
        y_values.Add Item:=Selection.Value, Key:=label
    Next label
    ' Unpack the collection into the main sheet
    Worksheets("Main").Select
    For Each label In y_labels
        col = Application.Match(label, Rows(1), 0)
        Rows(index).Columns(col).Select
        Selection.Value = y_values(label)
    Next label
    Rows(index).Columns(main_address_col).Select
    GoTo Fini
ErrorMsg:
    MsgBox "Cannot geocode: " & address
Fini:
End Sub
Internet history: the missing 15 years
Imagine a 300-page history of the United States that spent the first 290 pages on events up to and including the Civil War, then zoomed through everything else in the last 10 pages. According to Doug Gale, whom I met this week at EDUCAUSE, that’s more or less how the history of the Internet has been written. He runs a consultancy called Information Technology Associates, which is located in Big Sky, Montana because it can be. Earlier in his career he was an NSFNET administrator who was instrumental in taking us from a research network with a few hundred nodes in 1980 to the 7-million-node recognizably modern Internet we had by 1995. Although much has been written about the early ARPANET, there’s surprisingly little documentation of the 15-year period from the development of CSNET in 1980 to the decommissioning of the NSFNET in 1995.
So Doug Gale is now interviewing seventy-odd of the key players in that 15-year transformation, in order to build an archive of source materials that can be used to write the history of that formative era. The oral archive is currently a work in progress, and none of the interviews captured so far have been published, but he intends to do that and is looking for help.
I’d love to read that history once it’s written. Meanwhile, here’s the project’s home page.