A conversation with Bill Buxton about design thinking

In the latest episode of my Microsoft Conversations series I got together with Bill Buxton to talk about the design philosophy set forth in his new book Sketching User Experiences. Nowadays Bill is a principal researcher with Microsoft Research, and before that he was chief scientist at Alias/Wavefront, but his involvement in the design of software and hardware user interfaces goes all the way back to Xerox PARC. Along the way he’s accumulated a fund of wisdom about what he calls design thinking — a way of producing, illustrating, and winnowing ideas about how products could work.

I haven’t yet received my copy of his book, but my background for this conversation was a talk given last November at BostonCHI, the Boston chapter of the ACM’s special interest group on computer-human interaction. In that talk (which summarizes key themes from the book), and also in this conversation, Bill lays down core principles for designing effective user experiences.

He proceeds from the assumption that sketching is fundamental to all design activity, and explores what it means to sketch a variety of possible user experiences. His approach is aggressively low-tech and eclectic. He argues that although you can use software tools to create fully-realized interactive mockups, you generally shouldn’t. Those things aren’t sketches, they’re prototypes, and as such they eat up more time, effort, and money than is warranted in the early stages of design. What you want to do instead is produce sketches that are quick, cheap, and disposable.

How would you apply that strategy to the design of, say, the Office ribbon? When Bill talks about sketching, he means it literally:

You’d start with paper prototyping — quickly hand-rendered versions, and for the pulldown menus and other objects you’d have Post-It notes. So when somebody comes with a pencil and pretends it’s their stylus and they click on something, you’ve anticipated the things they’ll do, and you stick down a Post-It note.

What matters here isn’t the interaction between the test subject and the prototype, because it isn’t really a prototype, it’s a sketch. Rather, what matters is the interaction between the test subject and the designer. The sketch need do no more than facilitate that interaction.

Continuing with the same example, here’s how an eclectic strategy keeps things simple and cheap:

Now that will give you the flow and the sequence of actions, but it will not give you the dynamics in terms of response time. To show that, I’d use exactly the same things, photograph them, and then make a rough pencil-test video so I could play back what I think the timing has to be to show it in realtime. It’s a combination of techniques, where none is sufficient on its own.

Later in the conversation, he challenges some of my favorite themes. Bill’s skeptical about the notion (popularized by Eric von Hippel) that lead users can be co-designers of products. And he doesn’t think that logging interaction data is as useful as I think it is. But he agrees with me that a key weakness of paper prototypes is their inability to incorporate the actual data that animates our experiences of products and services. One of his examples: MP3 players think in terms of songs, not movements, so if you load one with classical music you’ll find a bunch of duplicate songs called Adagio. In such a case, Bill admits, you’d like to have used a more fully-realized prototype that could have absorbed real data and flushed out these kinds of problems. His point isn’t that you should never deploy heavier design artillery, but rather that you should reserve it for when it’s absolutely necessary. Much of the time, he believes, sketching is faster, cheaper, and more productive.

Unifying the experience of online identity

Several months ago my bank implemented an anti-phishing scheme called Site ID, and now my mortgage company has gone to a similar scheme called PassMark. Both required an enrollment procedure in which I had to choose private questions and give answers (e.g., mother’s maiden name) and then choose (and label) an image. The question-and-answer protocol mainly beefs up name/password security, and secondarily deters phishing — because I’d notice if a site I believed to be my bank or mortgage company suddenly didn’t use that protocol. The primary anti-phishing feature is the named image. The idea is that now I’ll be suspicious if one of these sites doesn’t show me the image and label that I chose.

When you’re talking about a single site, this idea arguably makes sense. But it starts to break down when applied across sites. In my case, there’s dissonance created by different variants of the protocol: PassMark versus Site ID. Then there’s the fact that these aren’t my images, they’re generic clip art with no personal significance to me. Another variant of this approach, the Yahoo! Sign-In Seal, does allow me to choose a personally meaningful image — but only to verify Yahoo! sites.

These fragmentary approaches can’t provide the grounded and consistent experience that we so desperately need. One subtle aspect of that consistency, highlighted in Richard Turner’s CardSpace screencast, is the visual gestalt that’s created by the set of cards you hold. In the CardSpace identity selector, the images you see always appear together and form a pattern. Presumably the same will be true in the Higgins-based identity selector, though I haven’t seen that yet.

I can’t say for sure, because none of us is yet having this experience with our banks and mortgage companies, but the use of that pattern across interactions with many sites should provide that grounded and consistent experience. Note that the images forming that pattern can be personalized, as Kevin Hammond discusses in this item (via Kim Cameron) about adding a handmade image to a self-issued card. Can you do something similar with a managed card issued by an identity provider? I imagine it’s possible, but I’m not sure. Maybe somebody on the CardSpace team can answer that.

In any event, the general problem isn’t just that PassMark or Site ID or Sign-In Seal are different schemes. Even if one of those were suddenly to become the standard used everywhere, the subjective feeling would still be that each site manages a piece of your identity but that nothing brings it all together under your control. We must have, and I’m increasingly hopeful that we will have, diverse and interoperable identity selectors, identity providers, relying parties, and trust protocols. But every participant in the identity metasystem must also have a set of core properties that are invariant. One of the key invariant properties is that it must bring your experience of online identity together and place it under your control.

A conversation with Doug Kaye about PodCorps

Because of travel, last week’s ITConversations show was a rerun of my conversation with Lou Rosenfeld about a cluster of topics including information architecture, search analytics, print and online publishing, designing for usability, tagging, and microformats. This week’s show is a conversation with ITConversations founder Doug Kaye about his new project, PodCorps, which aims to connect producers of spoken-word events with stringers who can help get those events audio- or video-recorded and then published on the Web.

Here’s a fun fact I uncovered. Doug hadn’t heard of one of my favorite things lately, the LibriVox project. When I mentioned that Hugh McGuire cites AKMA’s collaborative recording of Larry Lessig’s Free Culture as a primary inspiration, Doug pointed out that he was the reader for the first chapter of that project. Small world!

RESTful Web Services

RESTful Web Services, by Leonard Richardson and Sam Ruby, was published this month. I interviewed the authors yesterday for an upcoming ITConversations show, but I also want to spell out here why I think it’s such an important book.

In the realm of IT you could hardly pick a more controversial topic. Or, in a way, a more unlikely one, given that the REST (Representational State Transfer) architectural style has its roots in what would normally have been an obscure Ph.D. thesis. Roy Fielding, the author of that thesis, told me in an interview that he was surprised by its breakout popularity. But he probably shouldn’t have been. There are not many technologies as foundational as the Hypertext Transfer Protocol (HTTP), whose principles that thesis defines.

But Fielding’s thesis is a thesis, not a practical guide. The effort to bridge from theory to practice has produced a considerable amount of folklore. “We’re writing a book,” the authors say in their web introduction, “to codify the folklore, define what’s been left undefined, and try to move past the theological arguments.” The mission is clearly defined in the first chapter:

My goal in this book is not to make the programmable web bigger. That’s almost impossible: the programmable web already encompasses nearly everything with an HTTP interface. My goal is to help make the programmable web better: more uniform, better structured, and using the features of HTTP to greatest advantage.

The book opens by usefully distinguishing between a set of architectural styles (REST, RPC [remote procedure call], REST-RPC hybrid) and a suite of technologies (HTTP, XML-RPC, WS-*, SOAP). We tend to conflate architectures with technologies because they usually go together, but that’s not necessarily the case. The authors cite Google’s SOAP API (and other “read-only SOAP and XML-RPC services”) as being “technically REST architecture” but nevertheless “bad architectures for web services,” because they “look nothing like the Web.”

This book asserts that most services can, and should, “look like the Web,” and it spells out what that means. Among the key principles:

  • Data are organized as sets of resources
  • Resources are addressable
  • An application presents a broad surface area of addressable resources
  • Representations of resources are densely interconnected

To illustrate these principles, the authors work through a series of examples from which they distill gems of practical advice. When designing URIs, for example, they recommend that you use forward slashes to encode hierarchy (/parent/child), commas to encode ordered siblings (/parent/child1,child2), and semicolons to encode unordered siblings (/parent/red;green). Pedantic? Yes. And bring it on. Lacking a Strunk and White Elements of Style for the URI namespace, we’ve made a mess of it. It’s long past time to grow up and recognize the serious importance of principled design in this infinitely large namespace.
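Here is a minimal Python sketch of those three delimiter conventions. The function names are mine, not the book's, and sorting the unordered siblings is my own assumption: it just makes sure that one set of siblings always maps to one URI, which is why the output reads green;red rather than red;green.

```python
from urllib.parse import quote


def _segment(value):
    # Percent-encode one path segment so a literal "/", "," or ";" inside
    # a value cannot be mistaken for one of the structural delimiters.
    return quote(str(value), safe="")


def hierarchy(*segments):
    """Forward slashes encode parent/child nesting."""
    return "/" + "/".join(_segment(s) for s in segments)


def ordered_siblings(parent_uri, *items):
    """Commas join siblings whose order matters; caller order is kept."""
    return parent_uri + "/" + ",".join(_segment(i) for i in items)


def unordered_siblings(parent_uri, *items):
    """Semicolons join siblings whose order does not matter.

    Sorting is this sketch's own choice (an assumption, not the book's
    rule): it gives each set of siblings exactly one URI.
    """
    return parent_uri + "/" + ";".join(sorted(_segment(i) for i in items))


parent = hierarchy("parent")
print(hierarchy("parent", "child"))                # /parent/child
print(ordered_siblings(parent, "child1", "child2"))  # /parent/child1,child2
print(unordered_siblings(parent, "red", "green"))    # /parent/green;red
```

Encoding each segment separately is the design point: the delimiters keep their structural meaning only if they can never leak in from the data.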

Here’s another key principle: “When in doubt, model it as a resource.” To illustrate that principle in a dramatic way, the authors apply it to a problem that RESTful web services are normally thought incapable of solving: transactions. By modeling the transaction itself as a resource, they arrive at the following:

First I create a transaction by sending a POST to a transaction factory resource:

POST /transactions/account-transfer HTTP/1.1
Host: example.com

The response gives me the URI of my newly created transaction resource:

201 Created
Location: /transactions/account-transfer/11a5

I PUT the first part of my transaction: the new, reduced balance of
the checking account.

PUT /transactions/account-transfer/11a5/accounts/checking/11 HTTP/1.1
Host: example.com

balance=150

I PUT the second part of my transaction: the new, increased balance of
the savings account.

PUT /transactions/account-transfer/11a5/accounts/savings/55 HTTP/1.1
Host: example.com

balance=250

At any point up to this I can DELETE the transaction resource to roll
back the transaction. Instead I’m going to commit the transaction:

PUT /transactions/account-transfer/11a5 HTTP/1.1
Host: example.com

committed=true

I don’t think my bank’s going to be adopting this technique any time soon, but it’s a fascinating thought experiment which suggests that what the authors call resource-oriented architecture (ROA) is a young and in many ways still relatively unexplored discipline.
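To make the thought experiment concrete, here is a small in-memory Python sketch of the server side of that exchange. It is an illustration under stated assumptions, not the authors' implementation: the class and method names are hypothetical, the starting balances of 200 are invented, transaction ids are a simple counter rather than something like 11a5, and there is no concurrency control or validation.

```python
import itertools


class TransactionStore:
    """Models a transaction as a resource with its own staged state."""

    def __init__(self, balances):
        self.balances = dict(balances)  # committed state, keyed by account
        self.pending = {}               # transaction id -> staged balances
        self._ids = itertools.count(1)

    def post_transaction(self):
        # POST to the factory resource: create a transaction, return its URI.
        txid = format(next(self._ids), "x")
        self.pending[txid] = {}
        return "/transactions/account-transfer/" + txid

    def put_balance(self, tx_uri, account, balance):
        # PUT under the transaction URI: stage a new balance, commit nothing.
        self.pending[self._txid(tx_uri)][account] = balance

    def delete_transaction(self, tx_uri):
        # DELETE the transaction resource: roll back by discarding staged state.
        del self.pending[self._txid(tx_uri)]

    def put_committed(self, tx_uri):
        # PUT committed=true: apply every staged balance in one step.
        self.balances.update(self.pending.pop(self._txid(tx_uri)))

    @staticmethod
    def _txid(tx_uri):
        return tx_uri.rsplit("/", 1)[-1]


store = TransactionStore({"checking/11": 200, "savings/55": 200})
tx = store.post_transaction()
store.put_balance(tx, "checking/11", 150)
store.put_balance(tx, "savings/55", 250)
store.put_committed(tx)
print(store.balances)  # {'checking/11': 150, 'savings/55': 250}
```

Nothing touches the committed balances until the final PUT, which is the whole trick: the transaction resource holds the intermediate state that plain HTTP requests would otherwise expose.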

On the question of ROA versus SOA (service-oriented architecture), the authors say that for certain kinds of enterprisey problems — including advanced security protocols and complex coordinated workflows — only SOA meets the need. They recommend it for these purposes, when the need arises. But in the many situations where the need does not arise, they recommend starting with ROA.

I’m inclined to agree, but I’d feel better about that recommendation if the glide path from ROA to SOA were smoother. It isn’t. Toward the end of our interview I asked Sam Ruby, who has been a long and forceful advocate for a smooth glide path, whether he thinks we’ll achieve it. He doesn’t. That worries me, but I haven’t given up hope. I’ve always seen ROA and SOA as points along what Frank Martinez calls a tolerance continuum. Among its other accomplishments, this excellent book advances that important point of view.

Hosted lifebits

Today my digital assets are spread out all over the place. Some are on websites that I control, and a lot more are on websites that I don’t. Others are on local hard disks that I control, and a lot more are on disks that I don’t. It’s become really clear to me that I’d be willing to pay for the service of consolidating all this stuff, syndicating it to wherever it’s needed, and guaranteeing its availability throughout, and indeed beyond, my lifetime.

The scenario, as I’ve been painting it in conversations with friends and associates, begins at childbirth. In addition to a social security number, everyone gets a handle to a chunk of managed storage. How that’s coordinated by public- and private-sector entities is an open question, but here’s how it plays out from the individual’s point of view.

Grade 3

Your teacher assigns a report that will be published in your e-portfolio, which is a website managed by the school. Your parents tell you to write the report, and publish it into your space. Then they release it to the school’s content management system. A couple of years later the school switches to a new system and breaks all the old URLs. But the original version remains accessible throughout your parents’ lives, and yours, and even your kids’.

Grade 8

On the class trip to Washington, DC, you take a batch of digital photos. You want to share them on MySpace, so you do, but not directly, because MySpace isn’t really your space. So you upload the photos to the place that really is your space, where they’ll be permanently and reliably available, then you syndicate them into MySpace for the social effects that happen there.

Grade 11

You’re applying to colleges. You publish your essay into your space, then syndicate it to the common application service. The essay points to supporting evidence — your e-portfolio, recommendations — which are also (to a reasonable degree of assurance) permanently recorded in your space.

College sophomore

You visit the clinic and are diagnosed with mononucleosis. You’ve authorized the clinic to store your medical records in your space. This comes in handy a couple of years later, when you’ve transferred to another school, and their clinic needs to refer to your health history.

Working professional

You use your blog to narrate the key events and accomplishments in your professional life, and to articulate your public agenda. All this is, of course, published in your space where you are confident (to the level of assurance you can reasonably afford) that it will be reliably available for your whole life, and even beyond.

Although this notion of a hosted lifebits service seems inevitable in the long run, it’s not at all clear how we’ll get there. The need is not yet apparent to most people, though it will increasingly become apparent. The technical aspects are somewhat challenging, but the social and business aspects are even more challenging.

In social terms, I think it’ll be hard to get people to decouple the idea of storage as a service from the idea of value-added services wrapped around storage.

On the business side, my conversations with Tony Hammond and Geoffrey Bilder have given me a glimpse of how these issues are being approached in the world of scholarly and professional publishing. But it’s not yet apparent that the specialized concerns driving these efforts will, in fact, generalize in important ways to almost everybody.

Trusting, but verifying, your teenager’s use of the Internet

Parents nowadays face tough questions about whether to monitor or (try to) control their kids’ use of the Internet, and if so, how. Although my personal opinion is that trying to restrict access is a losing battle, I understand why the idea is appealing. You’d like your kids to have some maturity and some perspective under their belts before encountering some of what the Internet so readily brings to their attention. When my kids were younger, the Internet was younger too. I guess if they were still that young I’d be wishing I could create a sandbox for them, even though I don’t think you can. But they’re teenagers now, and they have their own computers. For two reasons, activating the parental controls on those computers isn’t the strategy I want to pursue.

The first reason is that I don’t think filtering the Internet is feasible. Even if we could agree on a definition of what may be harmful, which we never could, people will find ways to route around censorship. Meanwhile we’ll inevitably censor things we never meant to — like, for example, my InfoWorld blog.

The second reason is that I don’t want to incent my kids to route around controls I might try to impose. Nor do I want to force them to go elsewhere to experience an uncensored Internet. The reality of the Internet, like the reality of the world, is something they’ll be dealing with for the rest of their lives. I’d rather they engage with that reality at home where I can more easily keep track of their activities.

If you want to be able to monitor without imposing explicit controls — in other words, trust but verify — then it’s worth knowing about the feature of Windows Vista that supports that preference. It’s in Control Panel -> User Accounts and Family Safety -> Parental Controls. There are two On/Off choices. The first, Parental Controls, enforces any of the controls you elsewhere define. These include restrictions about which websites are accessible, when the computer may be used, and which games or other programs may be used.

My strategy is to leave Parental Controls off, but switch on the second On/Off choice: Activity Reporting. That produces a detailed report about which websites were visited, which applications were used when, which games were played, messages sent/received and contacts added (if the kids use Outlook and Windows Messenger, which mine don’t), and more.

The crucial item here, for me, is websites visited. That’s recently become an issue that I want to keep an eye on. With this setup I can, in a way that’s browser-independent, persistent across flushes of the browser cache, and very unlikely to be disabled.

Windows XP and Mac OS X don’t offer the same capability out of the box, but there are of course lots of third-party add-ons. Not ever having tried them myself, I’d be interested to hear how effectively they can be used to implement a “trust but verify” policy. And more generally, I’d be interested to hear about how other parents of teenagers are dealing with the difficult tradeoffs involved in this thorny issue.

A conversation with Allen Wirfs-Brock about the history of Smalltalk and the future of dynamic languages

More than 25 years ago, Allen Wirfs-Brock created one of the early implementations of Smalltalk. He was working at Tektronix at the time, as was Ward Cunningham, who became the first user of Tektronix Smalltalk. Allen later served as chief scientist of Digitalk-ParcPlace and CTO of Instantiations, then joined Microsoft four years ago. His original charter was to work on future strategies for Visual Studio, but recently, in light of growing interest in dynamic languages at Microsoft, he’s returning to his roots.

In the latest installment of my Microsoft Conversations series we review the history of Smalltalk, and trace the evolution of the techniques that it (and Lisp) pioneered, from the early implementations to such modern descendants as Python and Ruby.

I’m always looking for ways to explain why dynamic programming techniques are so important, and a great explanation emerged from this conversation. A Smalltalk system is, among other things, a population of continuously evolving objects that communicate by passing messages. That same description applies to another kind of system: the Internet. I suggested — and Allen agreed — that this congruence is driving renewed appreciation for dynamic languages.

Motivation, context, and citizen analysis of government data

Matt McAlister heard “crackling firearms” in his San Francisco neighborhood and wrote a wonderful essay on a theme that was central to my keynote talk last week at the GOVIS conference: how citizens can and will work with governments to diagnose social problems and develop solutions. When the District of Columbia’s DCStat program rolled out last summer, I was delighted by the forward thinking involved. Publishing the city’s operational data directly to the web, for everyone to see and analyze, with the explicit goal of making the delivery of government services transparent and accountable, was and is an astonishingly bold move. And as Matt found when investigating crime in his neighborhood, it’s still part of the unevenly distributed future:

I then found the official San Francisco Police Department Crime Map. Of course, the data is wrapped in their own heavy-handed user interface and unavailable in common shareable web data formats.

Access to data is good, and access to data in useful formats is better, but these are only the first steps. We need to make interpretations of the data, compare and discuss those interpretations, and use them to inform policy advocacy. The mashups that Matt reviews are a glimpse of what’s to come, but these interactive visualizations have a long way to go.

Here’s another glimpse of what’s to come: I took a snapshot of the DC crime data, uploaded it to Dabble DB, built a view of burglary by district and neighborhood, and published it at this public URL. There are two key points here. First, discussion can attach to (and will be discoverable in relation to) that URL. Second, the data behind the view is also available at that URL, in a variety of useful formats, so alternate views can be produced, pointed to, and discussed.
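For readers who would rather build that kind of view in code than in Dabble DB, here is a rough Python sketch of a "burglary by district and neighborhood" count. The rows, field names, and place names are all made up for illustration. They are not the DC data, and the real export's column names may differ.

```python
from collections import Counter

# Made-up rows for illustration only; not real DC crime data.
incidents = [
    {"offense": "BURGLARY", "district": "First", "neighborhood": "Alpha"},
    {"offense": "BURGLARY", "district": "First", "neighborhood": "Alpha"},
    {"offense": "THEFT", "district": "First", "neighborhood": "Alpha"},
    {"offense": "BURGLARY", "district": "Second", "neighborhood": "Beta"},
]


def count_by(rows, offense, *keys):
    """Count rows of one offense type, grouped by the given fields."""
    counts = Counter()
    for row in rows:
        if row["offense"] == offense:
            counts[tuple(row[k] for k in keys)] += 1
    return counts


view = count_by(incidents, "BURGLARY", "district", "neighborhood")
for (district, neighborhood), n in sorted(view.items()):
    print(district, neighborhood, n)
```

This is still only a view of the data, of course. It counts; it doesn't interpret.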

Still, these are only views of data. There’s no analysis and interpretation, no statistical rigor. Since most ordinary citizens lack the expertise to engage at that level, are governments that publish raw data simply asking for trouble? Will bogus interpretations by unqualified observers wind up doing more harm than good?

That’s a legitimate concern, and while the issue hasn’t yet arisen, because public access to this level of data is a very new phenomenon, it certainly will. To address that concern I’ll reiterate part of another item in which I mentioned John Willinsky’s amazing talk on the future of education:

Willinsky talks about how he, as a reading specialist, would never have predicted what has now become routine. Patients with no ability to read specialized medical literature are, nonetheless, doing so, and then arriving in their doctors’ offices asking well-informed questions. Willinsky (only semi-jokingly) says the Canadian Medical Association decided this shouldn’t be called “patient intimidation” but, rather, “shared decision-making.”

How can level 8 readers absorb level 14 material? There are only two factors that govern reading success, Willinsky says: motivation, and context. When you’re sick, or when a loved one is sick, your motivation is a given. As for context:

They don’t have a context? They build a context. The first time they get a medical article, duh, I don’t know what’s going on here, I can’t read the title. But what happened when I did that search? I got 20 other articles on the same topic. And of those 20, one of them, I got a start on. It was from the New York Times, or the Globe and Mail, and when I take that explanation back to the medical research, I’ve got a context. And then when I go into the doctor’s office…and actually, one of the interesting things…is that a study showed that 65% of the doctors who had had this experience of patient intimidation shared decision-making said the research was new to them, and they were kind of grateful, because they don’t have time to check every new development.

When your loved one is sick, you’re motivated to engage with primary medical literature, and you’ll build yourself a context in which to do that. Similarly, when your neighborhood is sick, you’ll be motivated to engage with government data, and you’ll build yourself a context for that.

The quest for context could, among other things, lead to a renewed appreciation for a tool that’s widely available but radically underutilized: Excel. Most people don’t earn a living as quants, so Excel, for most people, winds up being a tool for summing columns of numbers and arranging text in tabular format. That may change as more public data surfaces, and as more people realize they want to be able to interpret it. In which case Chris Gemignani and the rest of the Juice Analytics team will emerge as leading resources available to motivated citizens wanting to learn how to make better use of Excel.

Shared navigation of online bureaucracies

In my talk on Friday at the GOVIS (government information systems) conference in Wellington, I wasn’t the only one to suggest that web 2.0 attitudes will change the relationship between governments and citizens. That notion now seems to be pretty firmly established, and the question is not whether citizens will collaborate with their governments, but rather how.

Among other developments, I think we’ll soon see a refreshing new approach to the consumption of government services. A couple of weeks ago at Berkeley’s school of information I met Anna Kartavenko, one of Bob Glushko’s graduate students. She’s working on ways to make the byzantine California regulatory apparatus more accessible. If you’re starting a business in that state, it’s really hard to figure out which licenses you need to apply for, as well as how (and in what order) to apply for them.

The problem is universal, of course, and folks at GOVIS were wrestling with it too. When you’re providing the information systems that both document and implement government services, you certainly want to do everything right in terms of system and information architecture. But I suspect there’s about to be a new force in the world that will work toward the same ends — easy discovery and effective use of services — by very different means. That force is shared experiential knowledge.

Yes, search should give the right answer, and the systems that search points you to should work well. No, these things don’t always happen. But even if they do, you’d still like to plug into somebody who’s been down the same path you are traveling. A formal description of a procedure is never enough. If possible, we’d always like to hear from somebody who’s been there, done that, knows the drill, and can point out the pitfalls.

What we loosely call social media are beginning to create that possibility. For a variety of reasons, people are beginning to document and share what they know. If you write it down, you’ll be able to remember it yourself in case you have to replay the steps. And writing it down in a shared information system in the cloud is becoming a more reliable way to assure your own future access to this documentation than writing it down locally.

To the extent your knowledge is a source of competitive advantage, you’ll want to be cautious about how much of it you publish. But then again, the reputation you establish by publishing some of your knowledge may lead to new opportunities to use that knowledge for your own gain.

Along with these incentives, which I classify as examples of enlightened self interest, there are also purely altruistic motives, and I don’t discount those. But let’s just stick with enlightened self interest for now. Given those incentives to share knowledge, how can we lower the activation threshold for sharing?

I think one answer will emerge from the intersection of social bookmarking and clickstream logging. Suppose that instead of bookmarking and tagging a single URL, you could bookmark and tag a sequence of page-visiting and form-filling events. The sequence corresponds to some complex multi-step task. The performance of the task crosses several (or many) online jurisdictions. The outcome might be successful or not: “Yes, I got the license,” or “No I didn’t.” But in either case, it would be qualified by an anecdotal report: “Yes, I got the license, but I found out that if you’re in my category you need an import license and you have to meet the following insurance requirement.”

You couldn’t reasonably expect very many people to reflect on their encounters with online bureaucracy and take time to write reports like that. But what if it were a much more lightweight activity, like the difference between writing a blog entry and tossing off a del.icio.us bookmark or a Twitter message? Then participation becomes much more likely.

The key ingredient here is identifying a sequence of events in the browser (or rich client), and enabling people to visualize and then categorize and describe that sequence. And that seems eminently doable.
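As a sketch of what such a bookmarkable sequence might look like as data, here is one possible shape in Python. Everything in it is hypothetical: the class names, the fields, and the example.org and example.net hosts stand in for whatever a real browser extension or rich client would capture.

```python
from dataclasses import dataclass, field
from typing import List, Optional
from urllib.parse import urlsplit


@dataclass
class Event:
    kind: str  # "visit" for a page view, "submit" for a form submission
    url: str


@dataclass
class TaskTrail:
    """A tagged, annotated sequence of browser events for one task."""
    events: List[Event] = field(default_factory=list)
    tags: List[str] = field(default_factory=list)
    succeeded: Optional[bool] = None  # None until the outcome is known
    note: str = ""                    # the short anecdotal report

    def record(self, kind, url):
        self.events.append(Event(kind, url))

    def jurisdictions(self):
        # Distinct hosts in first-seen order: the online "jurisdictions"
        # that the task crossed.
        hosts = []
        for event in self.events:
            host = urlsplit(event.url).hostname
            if host and host not in hosts:
                hosts.append(host)
        return hosts


trail = TaskTrail(tags=["business-license"])
trail.record("visit", "https://licenses.example.org/start")
trail.record("submit", "https://licenses.example.org/apply")
trail.record("visit", "https://insurance.example.net/requirements")
trail.succeeded = True
trail.note = "Got the license, but my category also needs an import license."
print(trail.jurisdictions())
```

The point of the structure is that the expensive part, the event list, is captured automatically. The person only supplies tags, an outcome, and a sentence, which is about as much effort as a bookmark.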

Comparing notes on speaker preparation

Jeremy Zawodny describes his method for preparing talks and asks:

If you end up speaking in front of audiences on a semi-regular basis, is your preparation experience anything like mine?

My process used to be what Jeremy describes — composing slides — but now it’s turned into something completely different and quite surprising to me. As I discussed here, I’ve finally trained myself to use dictation effectively. I’ll go out for a long walk, like two or three hours, and dictate a rough draft of the talk. I’m not able to do that continuously, I have to stop and think and start again, but I turn the recorder off during the think time so when I’m done I’ve got something approximating what the talk will be. Then I go for another long walk and listen to what I recorded, making notes about what slides to use. For last week’s talk I didn’t take those notes in audio form, I scribbled them down while walking, but next time I’m going to go back to audio capture. If you distill the long narrative down to short titles or phrases, it’s quick and easy to listen to a spoken distillation and write down the titles which become the armature for the slides.

The obvious reason why this works is that speaking out loud is good practice for speaking out loud. One of the subtler reasons is that exercise and fresh air really help. Another is that when I’m away from my office and can’t fiddle with a computer or look things up on the web, I have to literally think on my feet.

As I acknowledged here, I’m indebted to John Mitchell for suggesting this technique to me. According to him, it dates back to “the BBC WW2 radio correspondents, and then Edward R. Murrow.” Thanks again for the tip, John, it’s been really helpful.

Internet access adventures in New Zealand’s South Island

I’ve spent the last three days touring the top part of New Zealand’s South Island, from Picton (where the ferry lands) over the scenic highway to Nelson, down the even more scenic west coast road to Greymouth, and across the spectacularly scenic Arthur’s Pass to Christchurch on the east. It hasn’t been like a US roadtrip at all. The distances aren’t great, I’ve been going slowly in order to better enjoy the narrow twisty roads and ubiquitous one-lane bridges, and I’ve been stopping often to hike in the native forest or climb partway up a mountain. If you wanted to get a taste of this place and happened to only have a handful of days in which to do it, you could do worse than to follow this itinerary.

Notwithstanding my rationalization the other day, you’ll certainly want to bring your camera. You might reasonably opt to leave your laptop home, though, because Internet access from hotels here is a comedy of errors. The most absurd moment came last night when I checked into a hotel in Christchurch. I bought an access card for the WiFi service for $10, scratched it to reveal the access code, and…it smudged completely! I could not believe it! This card has only one purpose in life — to reveal a string of hex digits — and it cannot even manage to do that. Incredible!

It wasn’t a fluke, either. I showed it to the hotel clerk and he tried another card. Same result, except almost legible this time. We debated whether a digit was a 5 or an S, and whether another was an S or a 3, and in the end I had to try about a half-dozen variations. They had that access point locked down pretty well, too. The code was a dozen hex digits, and I felt like I was burgling Fort Knox. Finally I cracked the code, checked email, called the person I was meeting for dinner, and headed out, 10 minutes into the two-hour session I’d bought. Later I powered up to continue the session and…you guessed it…token expired. Aaarrggghhh! The future is not yet evenly distributed.

So anyway, Internet cafes are the ticket here, and I’ve seen quite a variety of them. Most notable was the McDonalds in Greymouth. That town, like a lot of the towns around here, caters to backpackers. I bought an access card along with my coffee — in this case, one that revealed legible digits when scratched — and sat down at a PC equipped with a camera and a headset. I didn’t see any backpackers videoconferencing, but it was interesting to see that the capability is part of the standard kit rolled out to these locations.

Even more interesting was the list of applications installed on the machine. It included Internet Explorer, Firefox, Microsoft Office, various instant messaging clients including Skype and…wait for it…SSH. If you’re technically inclined, you’ll probably laugh at that. If not, here’s why it’s funny. SSH stands for secure shell, and it’s used to open up a command-line session with a remote computer — typically a Linux or Unix server, although Windows servers can receive SSH connections as well (mine do). You might SSH into a remote machine in order to read mail in a terminal-oriented mail reader, or to perform administrative tasks on a remote system, but these are fairly esoteric activities. You wouldn’t think there are enough backpackers wanting to do these things to warrant making SSH part of the standard setup. Who knew?

For photographic storytelling, cameras are becoming optional

I never would have chosen to be cameraless in one of the most photogenic parts of the world, but since I am it’s turned into an opportunity to reflect on my relationship to photography — past, present, and future. My dad likes to joke about the obsessive vacation photographer who, when asked how the trip went, would reply: “Don’t know, haven’t gotten the pictures back yet.”

There’s always been a tension between the desire to soak in experience through our primary senses, and the desire to augment our primary senses with gear that enables us to record, edit, preserve, and share what we experience. But here on the sundeck of the Kaitaki ferry, waiting to cross the Cook Strait from Wellington to Picton, augmentation is ascendant. You can see the whole spectrum of photographers and their gear on display. Happy snappers with their low-end digicams, videographers with camcorders, serious photographers with their digital SLRs.

Although skills and inclinations vary across that spectrum, everyone shares a common appreciation for the thrill of photography, and advancing technology keeps renewing that thrill. But as I watch my fellow passengers snapping and filming all around me, I’m aware that a sea change is underway. I am certain that some of the photos and videos being shot all around me today will turn up on Flickr and YouTube, and that I’ll be able to find them by searching for words like Kaitaki and Wellington and Picton and ferry, and for the date May 12 2007. There’s even a small but growing probability that some of this imagery will be geotagged and thus correlatable with precise locations along the way.

This collaborative annotation of the planet opens the door to a new dimension of pictorial storytelling. We will no longer be limited to the images that we ourselves have captured. We’ll be able to combine our own deeply personal images with the most interesting ones shot by others who have been to the same places. In some cases, those will be photos that no contemporary photographer could have taken. The other day, in the library of Wellington’s Te Papa museum, I was looking through drawers full of gorgeous black and white photos from New Zealand’s past. I don’t think many of those have yet been scanned and uploaded and tagged, but inevitably some will be.

Deeper layers of annotation are possible as well. At the GOVIS conference, a lunch companion mentioned the work of David Rumsey, whose digitizations of historical maps provoked a standing ovation at OSCON 2004. Rumsey has the wonderful idea that by making these exquisite hand-drawn maps available for everyone to curate and to use, he’ll enable us to add a historical dimension to the stories we tell with our own (and other) photographic images.

I can’t deny that I’m missing my camera, and that this essay is partly a rationalization of circumstances I wouldn’t have chosen. I really enjoy taking pictures, I’ve got an eye for taking decent ones, and I’d like to be doing that right now. But I also like the idea that it’s becoming less necessary to carry and use a camera in order to tell pictorial stories about the places we visit and the people we meet.

PS. In case you are wondering I did not post this from the ferry. The unevenly distributed future does not yet extend that far.

Amazing lifehack: Pack a starter pistol to deter luggage theft

It was really dumb of me to put a camera into a piece of checked luggage, but I did, and now an airport baggage handler somewhere is one camera richer. It’s my fault, of course. My only excuse is that I almost never check bags, so when I packed this one I was thinking carry-on, not checked.

In searching around for similar cases, I found this tale of a guy who found his stolen camera on eBay, tracked down the thief, and got him arrested.

In the comments, mixed in with the debate about whether or not Delta should have granted the refund he requested (they didn’t), there’s this amazing bit of advice lifted from a comment on another blog:

One note on using TSA rules to your advantage.

Weapons that travel MUST be in a hard case, must be declared upon check-in, and MUST BE LOCKED by a TSA official.

A “weapons” is defined as a rifle, shotgun, pistol, airgun, and STARTER PISTOL. Yes, starter pistols – those little guns that fire blanks at track and swim meets – are considered weapons…and do NOT have to be registered in any state in the United States.

I have a starter pistol for all my cases. All I have to do upon check-in is tell the airline ticket agent that I have a weapon to declare…I’m given a little card to sign, the card is put in the case, the case is given to a TSA official who takes my key and locks the case, and gives my key back to me.

That’s the procedure. The case is extra-tracked…TSA does not want to lose a weapons case. This reduces the chance of the case being lost to virtually zero.

It’s a great way to travel with camera gear…I’ve been doing this since Dec 2001 and have had no problems whatsoever.

What a brilliant hack!

Annotating online maps for offline use

Last week I wrote about making a screencast to capture information about the MacArthur Maze detour for possible use offline. This week I’m in a similar situation. I’m scouting out places to go in Wellington, and elsewhere in New Zealand, and of course I’m doing that online. But I’ll probably be needing to refer to this research while I’m offline, so I’m annotating maps with that purpose in mind.

From that perspective, I’ve noticed some subtle differences between maps.google.com and maps.live.com. (maps.yahoo.com doesn’t seem to cover this part of the world.) In both cases, you can search for an address, add it to a saved map as a pushpin which includes the address you searched for, and then edit the blurb that appears with the pushpin. So, for example, I was able to change “25 Dixon Street” to the more useful “Avis, 4-801-8108, 25 Dixon Street.”

One subtle difference is that in Google Maps, the pushpins you add to your saved map all default to the same color. If you’ll be referring to a cached image, there’s no way to correlate between the legend and the locations on the map. To enable that, I had to edit each pushpin and vary its color.

In the Live Search product, by contrast, the pushpins are numbered which means there’s no extra step required to correlate between the legend and the locations. That makes it slightly easier to create something that’s useful in print — or, since printers are often hard to find when traveling, that’s useful as a saved image.

Another subtle difference is that in the Google case, varying the colors of your saved pushpins might not help if you send the map to a black-and-white printer.

I’d be curious to know to what extent these differences represent a conscious strategy on the part of the Live team to make annotated maps more useful offline. I’m also really interested in ways that a subset of the interactivity of online maps can be captured for offline use. The screencast I made last week was a crude step in that direction; there’s lots more that could be done.

Wellington bound

It is a long way from New Hampshire to New Zealand, where I’ll be speaking at the GOVIS conference in Wellington on Friday. A really long way, it’s dawning on me, as I prepare to leave on Monday afternoon in order to arrive (just past the International Date Line) on Wednesday afternoon. But hey, it’s doable thanks to 747s and laptop computers and MP3 players. Imagine making that trip 100 years ago!

I’ve always wanted to see New Zealand, so I’ll be taking some time to look around after the conference. If topics emerge that fit the themes of this blog, and if network access permits, I’ll blog them. Otherwise it’s likely to be quiet here for a while.

Happy Snappers and Happy Casters

When I interviewed Bill Crow about HD Photo, he talked about how the efforts of the Windows photo team — and more broadly the efforts of the whole digital photography ecosystem — were directed toward a persona called the Happy Snapper. Earlier generations of Happy Snappers used Brownie cameras, and later Polaroids, to achieve decent results without much skill. Our generation of Happy Snappers uses digital cameras in the same way. It makes sense to take care of the Happy Snapper because there are so many of them. So far, it hasn’t made sense to invest the same effort in the Happy Caster — that is, the Happy Snapper’s audio counterpart.

I doubt that podcasting alone will turn the tide. Although lots of people are discovering the possibilities of the medium, there will always be more interest in creating pictures than in creating sounds. But video of course includes sound, so maybe video will tip things in favor of the Happy Caster. That’d be good, because I’ve been doing a lot of audio recording lately, and it turns out to be harder to get decent results than I had imagined.

I was amused to note, in the credits for my latest ITConversations show, that my name appears as the audio engineer. That’s the moral equivalent of listing Alan Smithee as the director of a film. It’s what happens when things get screwed up and the real director doesn’t want his own name to appear. In this case I was the one who screwed up, and Paul Figgiani — ITC’s highly-experienced professional audio engineer — was the one who quite rightly declined to appear in the credits.

What happened was that I recorded the show using a new software/hardware combo and, although the levels looked OK, they were really too high. You can never really recover from a mistake like that, but I had to try. Thanks to the power of Adobe Audition the results were at least passable, albeit decidedly sketchy.

That’s an extreme example, but in general I’ve found that in the digital audio domain there are lots of ways to screw up, and that it takes a lot of specialized expert knowledge to avoid screwing up. I’m learning, but there’s still a lot to learn.

When I’m interviewing people on the phone, over the Internet, or in person, I’d rather spend less time worrying about gear and software settings and more time focusing on the conversation. I’m not a Happy Caster yet, but I’d sure like to be.

The question is whether there are (or will be) enough of us would-be Happy Casters to warrant the creation of the same kind of ecosystem in which the Happy Snappers flourish. I hope so!

A conversation with Gent Hito about RSSBus and the data web

The acronym RSS usually stands for Really Simple Syndication. Gent Hito thinks it should also stand for Really Simple Services. In this week’s ITConversations podcast we discuss RSSBus, a .NET-based engine — for desktops and for servers — that aims to simplify both the production and consumption of data expressed in terms of RSS (or Atom) feeds.

By normalizing all data feeds to flattened sets of name/value pairs, RSSBus trades away some of the power of advanced data modeling in order to reach a broad population of developers — and even, ideally, ordinary information workers who will be able to pull feeds into their spreadsheets, combine and filter them, and publish their transformed feeds back out to the Net.
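
To make the "flattened sets of name/value pairs" idea concrete, here is a minimal Python sketch. It is not RSSBus code and says nothing about how RSSBus works internally; the sample feed and the flatten_items helper are made up for illustration. It simply reduces each RSS item to a flat dictionary and throws away any deeper structure, which is the trade-off described above.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample feed: element names follow RSS 2.0, the content is invented.
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Example feed</title>
    <item>
      <title>First post</title>
      <link>http://example.com/1</link>
      <category>news</category>
    </item>
    <item>
      <title>Second post</title>
      <link>http://example.com/2</link>
      <category>sports</category>
    </item>
  </channel>
</rss>"""


def flatten_items(rss_text):
    """Return each <item> as a flat dict of element name -> text."""
    root = ET.fromstring(rss_text)
    rows = []
    for item in root.iter("item"):
        # Only direct children are kept. Nested elements are dropped, and a
        # repeated element name would overwrite the earlier value.
        rows.append({child.tag: (child.text or "").strip() for child in item})
    return rows


rows = flatten_items(SAMPLE_RSS)
```

Each row is now something a spreadsheet could digest: one record, a handful of named columns.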

If you’re handy with scripting languages and with data, you may or may not need something like RSSBus, because you can combine and manipulate feeds directly and pretty easily. But the amount of web screenscraping that I keep having to do tells me that providing the more structured outputs I’d rather work with still isn’t an obvious and easy thing for many developers. And while a growing minority of developers do produce XML outputs, information workers typically can’t make sense of those feeds. By helping to democratize the creation and use of simple feeds, Gent Hito hopes to accelerate the emergence of the data web.
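
For the scripting-inclined, "combine and manipulate feeds directly" can be as small as the following sketch. The two feeds, their field names, and the combine_and_filter helper are all assumptions of mine, chosen only to show the merge, filter, and sort steps on rows that have already been flattened.

```python
# Two hypothetical feeds, already reduced to flat name/value rows.
feed_a = [
    {"title": "Ferry timetable", "category": "travel", "date": "2007-05-12"},
    {"title": "Budget update", "category": "finance", "date": "2007-05-10"},
]
feed_b = [
    {"title": "Hiking Arthurs Pass", "category": "travel", "date": "2007-05-14"},
]


def combine_and_filter(feeds, name, value):
    """Merge several feeds, keep rows where name == value, newest first."""
    merged = [row for feed in feeds for row in feed]
    matching = [row for row in merged if row.get(name) == value]
    # ISO-style date strings sort correctly as plain text.
    return sorted(matching, key=lambda row: row.get("date", ""), reverse=True)


travel = combine_and_filter([feed_a, feed_b], "category", "travel")
```

That is easy if you already think this way. The point of the paragraph above is that most information workers don't.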

Tagging is declarative programming for everybody

I’ve been searching for a pithy phrase to capture an idea, and the title of this piece takes my best shot. For ages, we’ve imagined that non-programmers can, on some level, learn to adapt the software that they use. Many did, most dramatically by engaging with Excel’s programming features, but most did not. And though I’m an admirer of Yahoo Pipes, I don’t think that a visual skin on top of a procedural programming model will appeal broadly to folks who otherwise aren’t inclined toward procedural programming.

Now arguably, the reason that most people aren’t so inclined is that we’ve failed to teach computational thinking [1]. Jeannette Wing’s manifesto on this topic invites us to imagine how the intellectual tools of programming (including abstraction, naming, composition, refactoring, heuristic reasoning, and separation of concerns) add up to “a universally applicable attitude and skill set that everyone, not just computer scientists, would be eager to learn and use.”

I’m sure Jeannette Wing is right, I hope she will persuade educators, and I’m delighted to discover that Microsoft Research is sponsoring the Center for Computational Thinking at Carnegie Mellon to explore this topic. Meanwhile, I wonder if we’re seeing the emergence, in the wild, of a very basic form of computational thinking that may prove to be intuitive and broadly accessible.

As I mentioned in my talk at Berkeley earlier this week, the elmcity.info project invites people to use tags not only to describe things, but also to compose and coordinate simple services. So for example, by posting a geotagged photo of a restaurant menu to Flickr, with the right set of tags, you’re both helping to create a directory service and enhancing that directory with Flickr’s image services and Yahoo’s mapping services. By posting a video clip of a candidate’s appearance to YouTube, along with the tag nhprimary, you’re helping to build a database of clips. By posting an event to Eventful along with the tag podcorps, you’re making a request for service — that is, you’re inviting one of the PodCorps.org stringers to help you record and publish a podcast of your event.

One of the trends in programming is the transition to a more declarative style. Where possible, you specify what you want to retrieve from the database, rather than how to retrieve it. Where possible, you specify which transactional or security behaviors your service should have, rather than how to implement those behaviors.
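
Here is a small sketch of the what-versus-how distinction, using Python's built-in sqlite3 module. The products table and its contents are invented for the example. The first query spells out how to walk the rows; the second just states what is wanted and leaves the rest to the database.

```python
import sqlite3

# In-memory database with a made-up products table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (name TEXT, price REAL)")
conn.executemany(
    "INSERT INTO products VALUES (?, ?)",
    [("widget", 4.0), ("gadget", 12.5), ("gizmo", 30.0)],
)

# Procedural: fetch everything, then say how to test, collect, and order it.
imperative = []
for name, price in conn.execute("SELECT name, price FROM products"):
    if price > 10:
        imperative.append(name)
imperative.sort()

# Declarative: say what is wanted and let the database decide how to get it.
declarative = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM products WHERE price > 10 ORDER BY name"
    )
]
```

Both lists end up the same. The difference is in who had to think about the loop.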

Among other things, tagging may become to ordinary folks what attributes are becoming to programmers: a language that doesn’t just describe things, but also invokes and coordinates behaviors.
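
As a rough illustration of tags that invoke behavior rather than merely describe, consider this sketch. The registry, the on_tag decorator, and the handler functions are hypothetical; only the tag names nhprimary and podcorps come from the examples above, and nothing here reflects how Flickr, YouTube, or Eventful actually process tags.

```python
# Hypothetical registry: tag name -> function invoked for items carrying that tag.
HANDLERS = {}


def on_tag(tag):
    """Decorator that registers a function as the behavior for a tag."""
    def register(func):
        HANDLERS[tag] = func
        return func
    return register


@on_tag("nhprimary")
def add_to_clip_database(item):
    return f"indexed clip: {item['title']}"


@on_tag("podcorps")
def request_recording(item):
    return f"recording requested: {item['title']}"


def process(item):
    """Run every handler whose tag appears on the item.

    Tags with no registered handler are left alone: they only describe.
    """
    return [HANDLERS[tag](item) for tag in item["tags"] if tag in HANDLERS]


results = process({"title": "Town hall meeting", "tags": ["podcorps", "townhall"]})
```

The person who adds the podcorps tag never sees a function call. They declare something about the event, and a service somewhere responds.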


[1] Thanks to Patrick Phelan for pointing me to Jeannette Wing’s work.

Screencasting and map exploration

I’m always finding surprising new uses for screencast technology, and yesterday was another revelation. On Tuesday I left the MIX show for Berkeley, where I gave a talk at the School of Information to a group that included some of the students whose projects I discussed in my podcast with Bob Glushko and AnnaLee Saxenian. From the San Francisco airport, the normal route to Berkeley would have involved the MacArthur maze, but its recent meltdown dictated an elaborate detour.

I assumed the detour wouldn’t yet be available in the mobile navigation systems, so I spent some time reviewing the route online. Normally I’d print out a couple of maps but I was in a hotel room with no printer handy, so I wound up making a screen recording of the route. It was a screencast, sort of, but one that I’d never show anyone, just a reference in case I needed it.

What I captured was a much richer representation of the route than a sequence of still screenshots interspersed with written instructions. At various points along the way, I’d zoom in for more detail, zoom out for context, and toggle back and forth between map view for the roads and hybrid view for buildings and landscape. It would be awkward to have to pull over, pop open the laptop, and review this movie, but I figured if I had to, I could.

As soon as I finished capturing the screencast, though, I knew I wouldn’t need to refer to it, for reasons that remind me of the dynamics of writing down lists of things to do. More often than not, once I write things down, I don’t need to refer back to them. The act of writing gives them a mental permanence they otherwise wouldn’t have.

Making the screencast of my route seems to have had a similar effect. It would be interesting to run the following experiment. Have one group of subjects explore a route in an online mapping service, then measure how well they can follow the route. Have another group of subjects use the mapping software in the same way, while also requiring them to capture a movie of that interaction, and then measure their performance. I predict that the latter group will perform better because recording things that you might (or might not) later play back has a similar effect to writing things down that you might (or might not) later read. The process makes you think in a more active and intentional way, which leads to a more permanent mental representation.

It’s occurred to me before that every online mapping service should offer the ability to record an interactive exploration of a route, play it back online, and also make the screencast available for downloading and offline viewing. I still think that’s a great idea. What I hadn’t considered is that the most valuable part of this process might not be the use of the final output, but rather the act of producing it.

Watching Anders Hejlsberg reinvent the relationship between programs and data

Although Silverlight is the star of MIX 07, this is a big Microsoft developer show and there are all kinds of things going on. My favorite session from yesterday was Anders Hejlsberg’s mega-demonstration of LINQ. It’s come a long way since I first saw it at the 2005 PDC, and since my mid-2006 screencast with Anders. All along, he’s been very patiently and persistently reinventing the relationship between programming languages and data.

You have to be a certain kind of person to enjoy watching Anders run LINQ through its paces, Intellisensing his way through the construction of C# queries against object, SQL and XML data, but I am that kind of person, and I find it utterly hypnotic. In this particular demo, he spent a lot of time creating snippets of C# code to generate SQL queries, and peeking under the covers at the SQL that gets generated. There were two key takeaway points. First, obviously, that some sophisticated SQL effects become available to average developers who might otherwise not be able to achieve them. But second, and more subtly, that even the simplified query syntax in the C# code can be made simpler. The tool-generated code that represents SQL tables, for example, can now express relationships among tables directly. Given a table of Products and a table of Categories, you can eliminate the join syntax from a LINQ query and treat Categories as a property of Products. The bottom line is not only writing (and reading) less SQL code, but also less C# code.
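
The Products and Categories point is easier to see in code. What follows is a Python analogy, not LINQ and not what Anders showed: the classes, the sample data, and the loop that wires up the relationship are my own stand-ins for what generated mapping code would provide. The first query matches keys by hand; the second treats the category as a property of the product.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Category:
    id: int
    name: str


@dataclass
class Product:
    name: str
    category_id: int
    category: Optional[Category] = None  # the relationship, exposed as a property


categories = [Category(1, "Beverages"), Category(2, "Snacks")]
products = [Product("Chai", 1), Product("Chips", 2), Product("Coffee", 1)]

# Stand-in for generated mapping code: wire up the relationship once.
by_id = {c.id: c for c in categories}
for p in products:
    p.category = by_id[p.category_id]

# Explicit join: every query has to match the keys by hand.
joined = [
    p.name
    for p in products
    for c in categories
    if p.category_id == c.id and c.name == "Beverages"
]

# Relationship as a property: the join disappears from the query.
navigated = [p.name for p in products if p.category.name == "Beverages"]
```

Both queries return the same products, but the second one reads like the question you actually wanted to ask.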

There is, however, a connection between Silverlight and LINQ. As I mentioned yesterday, LINQ for in-memory objects is part of Silverlight. I haven’t had a chance to try this yet, but I do have a good example of the kinds of things I’d want to use it for.

The first version of my InfoWorld metadata explorer was actually a client-side application — still available here (for Firefox only) — that pulled a bunch of metadata into local memory and used XPath queries and JavaScript list filtering to produce its effects.

The server-based version uses Python to accomplish the same effects. With Silverlight it should now be possible to move that Python version back to the client, where it would probably run faster than the JavaScript version. But what I’m really interested to see is whether I can use LINQ in Silverlight to simplify the gnarly XPath and list-filtering logic in order to make the thing easier to understand and maintain.
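
For a sense of what "XPath and list-filtering logic" looks like, here is a tiny sketch in Python. It is not the metadata explorer's code; the XML shape, the attribute names, and the data are invented, and it uses only the limited XPath subset that the standard library's ElementTree supports.

```python
import xml.etree.ElementTree as ET

# Hypothetical metadata, shaped only for this illustration.
METADATA = """<items>
  <item type="article" author="alice"><title>LINQ overview</title></item>
  <item type="blog" author="bob"><title>Silverlight notes</title></item>
  <item type="article" author="bob"><title>XPath tips</title></item>
</items>"""

root = ET.fromstring(METADATA)

# Step 1: an XPath-style query selects the candidate elements.
articles = root.findall(".//item[@type='article']")

# Step 2: ordinary list filtering narrows them further.
by_bob = [item.findtext("title") for item in articles if item.get("author") == "bob"]
```

Two steps, two notations. Multiply that by a dozen facets and you can see why a single query language over in-memory objects is appealing.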

To take this a step further, recall my earlier observations about Greasemonkey and Silverlight. A lot of the time, when we use the web, we’re effectively performing joins among data sources. You visit one site to look up some data, then you grab some of it and plug it into another site. If you’re lucky, somebody will have built a mashup, on a third site, to facilitate that join. But what if your browser had the data manipulation chops to help you do that mashup directly? I’m hoping that technologies like Silverlight and LINQ will enable things like that to happen.