First have a great use experience, then have a great user experience

For a couple of years I’ve been trying to transfer my experience of listening to podcasts to my dad. There’s so much interesting stuff to listen to, and he has both the time and the interest to listen, so in theory it’s a perfect fit. But in practice, though he’s heard a few of the talks I’ve forwarded to him as links, I haven’t managed to create the “aha” moment for him. This past week, though, may have been the tipping point. He’d landed in the hospital and I was determined to give him an alternative to the in-room TV. So I loaded up my old 256MB Creative MuVo with a selection of favorite talks, bought him a pair of headphones, gave him the kit, and showed him how to turn it on and press play.

It’s been a huge success. The next challenge, of course, will be to show him how to refill the gadget once he’s listened to the dozen or so hours of stuff I gave him. But I hope I’ve won the important battle. Time will tell, and I could be wrong, but my hunch is that what remains — a conversation about feeds, podcatchers, and USB cables — will be a mop-up operation.

In the tech industry, though, I think we often pretend that the mop-up operation is the battle. We talk obsessively about the user experience, and we recognize that we invariably fail to make it as crisp and coherent as it should be. But user experience is an overloaded term. I propose that we unpack it into (at least) two separate concepts. One is the basis of the “aha” moment. For now I’ll call it the use experience. In this example, it’s the experience of listening to spoken-word podcasts from sources that, just a few years ago, weren’t available.

I’ll reserve the term user experience for something else: the tax we pay in order to enjoy the use experience. This tax is not the basis of an “aha” moment. It’s expressed in terms of the devices, cables, batteries, applications, menus, dialog boxes, and — last but not least — the concepts we must grapple with in order to reliably reproduce the use experience. A great user experience makes all this crap relatively less awkward, confusing, and annoying. A lousy user experience makes it relatively more so. But the point is that it’s all crap! It’s the tax we pay to enjoy the use experience, and we want to pay as little of it as we can get away with.

How do you engineer a great use experience, as opposed to a great user experience? Part of the answer is deep personalization. So while the talks I loaded onto that MuVo for my dad included some of my favorites from ITConversations, the Long Now Foundation, TED, and elsewhere, I also included some of my own podcasts. And that’s what tipped it for my dad. He’s proud of the work I do, but most of it has always been inaccessible to him. So I picked a handful of my most general-interest podcasts on education, government, and health care. And that worked really well. Every time I visited last weekend, he was listening to one of mine. But when I talked to him mid-week he was listening to the Burt Rutan TED talk that I’d hoped he would enjoy.

This is a personal story, but I’m certain the principles apply more broadly. On the Microsoft campus this past week, for example, I got together with Mike Champion for coffee and a chat. Among other things we talked about Channel 9, which he rarely tunes into, although it features a variety of things that would interest him. We also discussed the fact that, while there are spaces in his life into which he might enjoyably and profitably inject podcast listening (long bike rides, for example), he hasn’t yet done so.

Right after our chat I walked into my first team meeting with the Channel 9 and 10 folks and recalled a point I’d made a while ago, which is that video isn’t the medium of choice for Mike’s bike ride. He knows what Anders Hejlsberg and Jim Gray look like. He doesn’t have time in front of a computer (or a handheld video player) to watch them talk. But he does have time on his bike to listen to them talk. Everybody in the meeting agreed that peeling off sound tracks from the videos and making them available as podcasts is a no-brainer, so it looks like that’ll happen. Thanks in advance, Adam, for agreeing to make it happen, and please don’t take this as arm-twisting. I know you’re busy and will get to this when you can. I’m telling this story to make a larger point which I think may provoke some useful discussion.

The larger point is that all of us, me included, tend to focus on engineering the user experience and tend to forget about engineering the use experience. A better user experience, in this case, is partly about making audio files available, and partly about organizing podcast feeds so that I could subscribe to everything that comes down the pipe featuring Anders Hejlsberg or Jim Gray or other folks I want to tune in.
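To make the "subscribe to everything featuring a given person" idea concrete, here is a minimal sketch. It assumes an ordinary RSS 2.0 feed whose item titles mention the speaker by name; the sample feed, its URLs, and the function name are made up for illustration, and a real feed might carry speaker names in tags or categories instead.

```python
import xml.etree.ElementTree as ET

# Hypothetical feed: the titles and example.com URLs are placeholders.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Hypothetical show feed</title>
    <item>
      <title>Anders Hejlsberg on language design</title>
      <enclosure url="http://example.com/anders.mp3" type="audio/mpeg" length="0"/>
    </item>
    <item>
      <title>A tour of the build lab</title>
      <enclosure url="http://example.com/buildlab.mp3" type="audio/mpeg" length="0"/>
    </item>
    <item>
      <title>Jim Gray on databases</title>
      <enclosure url="http://example.com/gray.mp3" type="audio/mpeg" length="0"/>
    </item>
  </channel>
</rss>"""


def audio_for_speakers(feed_xml, speakers):
    """Return (title, enclosure URL) pairs for items whose title names a speaker."""
    root = ET.fromstring(feed_xml)
    wanted = [name.lower() for name in speakers]
    matches = []
    for item in root.iter("item"):
        title = item.findtext("title", default="")
        enclosure = item.find("enclosure")
        if enclosure is None:
            continue  # no audio attached, nothing to listen to
        if any(name in title.lower() for name in wanted):
            matches.append((title, enclosure.get("url")))
    return matches


if __name__ == "__main__":
    for title, url in audio_for_speakers(SAMPLE_FEED, ["Anders Hejlsberg", "Jim Gray"]):
        print(title, url)
```

Matching on titles is crude. If the publisher tagged each item with its speakers, the filter could key on those tags and the resulting per-person feed would be far more reliable.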

Those tweaks will probably lower the activation threshold enough for Mike to hop over it. But I’m not certain of that. There are still obstacles to overcome. What will motivate Mike to overcome them? An “aha” moment, a good use experience. So how do you engineer that?

I like Mike, but not enough to give him a preloaded MP3 player. I could, however, make him a mix of some Channel 9 stuff. And as I did for my dad, I’d want to include some of the other stuff that I’m always recommending to people.

How exactly to do that is an interesting question. As a hip Internet citizen and podcast aficionado I’ll be inclined to find a podcast remixing service, use it to make Mike’s mix, then point him to the feed it emits. But I actually think that would be the wrong approach. If I point him to a podcast feed, I force him to grapple with the podcatching user experience. But I don’t want to clobber him right away with a user experience. First I want to give him a satisfying use experience.

Different requirements dictate different engineering solutions. If I’m trying to engineer a delightful use experience it might be best to hand him a ZIP file of MP3s. I know he’ll know what to do with that, using any kind of MP3 player, without having to deal with any new tools or concepts.
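For what it's worth, producing that kind of bundle takes only a few lines. This is a rough sketch, assuming the picks already sit in one local folder as .mp3 files; the function name and paths are hypothetical.

```python
import zipfile
from pathlib import Path


def bundle_mp3s(source_dir, bundle_path):
    """Pack every .mp3 in source_dir into one ZIP and return the track names."""
    source = Path(source_dir)
    tracks = sorted(p for p in source.iterdir() if p.suffix.lower() == ".mp3")
    # MP3 audio is already compressed, so store the files as-is
    # rather than spending time deflating them again.
    with zipfile.ZipFile(bundle_path, "w", compression=zipfile.ZIP_STORED) as bundle:
        for track in tracks:
            # arcname drops the folder path so the ZIP unpacks flat.
            bundle.write(track, arcname=track.name)
    return [track.name for track in tracks]


if __name__ == "__main__":
    # Hypothetical folder and output name.
    print(bundle_mp3s("mix_for_mike", "mix_for_mike.zip"))
```

The recipient unzips it and drags the files onto whatever player is at hand. There is no feed or podcatcher to learn first.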

Now of course Mike, being a typically super-smart and super-technical Microsoft employee, is perfectly able to deal with new tools and concepts. That’s what he does for a living, after all, and he does it because he likes to.

But in this context, I think that’s a red herring. Just because Mike can power through the crap doesn’t mean that he should have to, at least not right away, at least not if it can be avoided. The less to distract him from that “aha” moment, the better.

There’s probably a whole literature on this topic, and the terminological distinction I’m trying to make here may have been made differently and better elsewhere, in which case I’ll appreciate pointers to that literature. Terminology aside, I think the distinction is important in lots of ways. In terms of Channels 9 and 10, for example, it suggests the following:

1. As do video stores, Channels 9 and 10 could offer staff picks.

2. The picks could be made available not only as feeds, but also as bundles.

3. The picks could mix store-brand stuff from 9 and 10 with related stuff from elsewhere.

4. Viewers and listeners who follow 9/10 (and other sources) could use remix services hosted at 9/10 (or elsewhere) to share their own picks as feeds (or bundles).
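The feed half of the list above could be sketched like this: a list of picks, mixing sources, turned into a minimal RSS 2.0 document with one audio enclosure per pick. The pick titles, URLs, and function name are placeholders, not anything Channel 9 or 10 actually publishes.

```python
import xml.etree.ElementTree as ET


def picks_to_rss(title, link, picks):
    """Build a minimal RSS 2.0 feed with one audio enclosure per pick."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = "Hand-picked talks"
    for pick in picks:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = pick["title"]
        # The enclosure element is what podcatchers look for.
        ET.SubElement(
            item,
            "enclosure",
            url=pick["url"],
            type="audio/mpeg",
            length=str(pick.get("length", 0)),
        )
    return ET.tostring(rss, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical picks mixing "store-brand" and outside sources.
    picks = [
        {"title": "A store-brand interview", "url": "http://example.com/interview.mp3"},
        {"title": "A related talk from elsewhere", "url": "http://example.org/talk.mp3"},
    ]
    print(picks_to_rss("Staff picks", "http://example.com/picks", picks))
```

The same list of picks could just as easily drive a downloadable bundle, so feed and bundle are two renderings of one piece of editorial work.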

But I think this principle applies much more broadly. Recently, for example, I mentioned my positive reaction to the $8/month commodity hosting offered by BlueHost. Ironically, BlueHost’s founder recently blogged about how Microsoft doesn’t (and, he thinks, can’t) play in that market. Could that change? If so, how? I can’t answer those questions at the moment, but I can say that good answers would lead with use experiences and follow with user experiences.

When you provision an instance of a MySQL database at BlueHost, you have a much better user experience than you have when you provision one from the command line, but to be honest it’s not a great user experience. Lots more could be done to clarify the concepts of databases, users, passwords, rights, and so on. Still, relative to the command-line alternative, you can much more quickly and more easily have the experience of deploying a world-accessible database-backed application. When you have that kind of use experience, you become an adopter of the enabling technology. It’s that powerful.

A conversation with Avi Bryant and Andrew Catton about Dabble DB

Last week’s Friday podcast ran afoul of travel craziness but the series continues this week with a further exploration of Dabble DB, the through-the-web database that was also featured in a screencast. In my conversation with Avi Bryant and Andrew Catton we explore some of the underpinnings of Dabble, including the remarkable fact that it’s written in the Squeak implementation of Smalltalk.

I’ve underplayed that point until now, because I’m trying to broaden the appeal of what I do, but it turns out that Dabble DB is a great example of how dynamic languages can produce effects that people see, interact with, and care very much about. Programmers aren’t the only ones to benefit from direct manipulation of objects, continuous refinement, and always-live data. We all need things to work that way, so it’s cool to see how the dynamic qualities of Dabble’s Smalltalk engine bubble up into the application.

Rewriting the web with MSIE

In response to my item on media hacking the other day, this comment alerted me to a really sweet bookmarklet that adds a slider to a Flash movie. You don’t get timecodes but you do get start/pause/scrub which is a tremendous benefit.

When I tried it out, on both Firefox and IE, I was reminded again about the relative inaccessibility of bookmarklets in recent versions of IE. In Firefox it’s a drag-and-drop to the linkbar, and even that procedure eludes most people. In IE it’s a much more complicated dance which I illustrated in my Bookmarklets 101 screencast.

Because I now aim to improve digital literacy as broadly as I can, I’ll be focusing more than I have in the past on the browser that most people still use, which is IE. Here I’d like to toss out a couple of points for discussion and follow-up.

Bookmarklet policies.

It’s understandable that bookmarklets, which are JavaScript snippets that run in the context of web pages, would be locked down in a browser that’s busily rehabilitating its security reputation. But typically they’re not really locked down, just inconveniently accessible. Suppose you want to encourage your people to use these kinds of productivity aids. What does the domain policy look like for doing that?
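For readers who haven't looked inside one, a bookmarklet is just a javascript: URL saved as a bookmark. The sketch below packs a small script into that form. The JavaScript in the example is a made-up stand-in, not the Flash slider bookmarklet mentioned earlier, and the helper name is my own.

```python
from urllib.parse import quote


def make_bookmarklet(js_source):
    """Wrap a JavaScript snippet as a javascript: URL suitable for a bookmark."""
    # Collapse to one line. This assumes the snippet uses no // comments,
    # which would swallow everything after them once newlines are gone.
    body = " ".join(line.strip() for line in js_source.strip().splitlines())
    # The function wrapper keeps the snippet's variables out of the page's scope.
    wrapped = "(function(){" + body + "})();"
    return "javascript:" + quote(wrapped, safe="(){};=.,'!")


if __name__ == "__main__":
    # Hypothetical snippet: count the embedded objects on the current page.
    snippet = """
    var n = document.getElementsByTagName('embed').length;
    alert('Embedded objects on this page: ' + n);
    """
    print(make_bookmarklet(snippet))
```

Because the whole program lives in a URL, it runs with the privileges of whatever page is open when you click it, which is what makes the policy question interesting.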

Greasemonkey for IE.

What’s the deal nowadays? At one time I heard about Trixie but not so much lately. I’ll revisit it myself, but I’m curious to hear reports on Trixie’s compatibility with Greasemonkey userscripts, its rate of adoption, and its security model.

Thursday night switcheroo

The original plan to meet up at the Crossroads Mall on Thursday night turned out to conflict with what sounds like a popular and fun event: the Seattle podcasting network get-together. So I’ll be there instead, and if anyone wants to go out for drinks later, I’ll be up for that. Dennis Hamilton, who originally proposed the Crossroads Mall, has kindly volunteered to drop by there and see if anybody who does happen to show up wants to go over to the podcast meeting instead.

Channel 9 media hacking

In honor of my first get-together with the MSDN Channel 9 and 10 folks later today, I thought I’d do a spot of media hacking in support of the cause. One of the things that caught my eye recently was Brian Jones’ screencast on data/view separation in Word 2007. It’s published as a SWF (Shockwave Flash) movie and, like other SWF files on Channel 9, it’s delivered into the browser directly, without a controls wrapper. So there’s no way to see the length of the screencast, or pause it, or scroll around in it, or — as I was inclined to do — refer to a segment within the screencast.

I figured it would be a snap to grab the controller that Camtasia Studio emits and tweak its configuration file to point to Brian’s screencast. But that seemingly simple hack turned into a merry chase. It turns out that the Camtasia controller isn’t entirely generic. It embeds (at least) the width and height of the controlled video. I could use Camtasia to create a new controller, but I don’t have that software here with me, and in any case it seems like there should be a way to override those values.

First, though, I took a step back and spent some time looking for a generic SWF component to play back SWFs. For FLV (Flash video) files, I’ve made great use of flvplayer.swf (see note 1 below). It’s a nice simple widget that does just the one thing I want: it accepts the address of an FLV file as a parameter, and it plays that file. There has to be an analogous swfplayer.swf, right? Well, I looked hard and didn’t find it. Maybe someone can enlighten us on that score.

Circling back to the Camtasia controller, I asked myself another question. There has to be an easy way to not only display, but also edit, the header of a SWF file, right? Again, I looked hard and came up empty-handed. Now it was a challenge, so I dug into things like SWF::File and Flasm, tools for picking apart and reassembling SWF files. Neither quite did the trick. Then I remembered a tip from Rich Kilmer about Kinetic Fusion, a Java toolkit for roundtripping between SWF and XML. Using it, I was able to convert the SWF to XML, alter the width/height values, and recreate the SWF (see note 2 below).
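At least the display half of that question is tractable by hand. As a rough sketch, and assuming the header layout described in the published SWF file format (a three-byte signature, a version byte, a four-byte length, then a bit-packed frame rectangle measured in twips, with everything after the first eight bytes zlib-compressed in CWS files), a movie's dimensions can be read like this. The function name is my own. This only reads the values, and a controller like Camtasia's may well keep its dimensions somewhere other than the header.

```python
import struct
import zlib

TWIPS_PER_PIXEL = 20  # SWF measures the stage in twentieths of a pixel


def read_swf_frame_size(data):
    """Return (version, width_px, height_px) from the bytes of a SWF file."""
    signature = data[:3]
    if signature not in (b"FWS", b"CWS"):
        raise ValueError("not an uncompressed or zlib-compressed SWF")
    version = data[3]
    (declared_length,) = struct.unpack("<I", data[4:8])  # read but not checked
    body = data[8:]
    if signature == b"CWS":
        body = zlib.decompress(body)

    # Frame RECT: 5 bits give the field width, then Xmin, Xmax, Ymin, Ymax.
    nbits = body[0] >> 3
    total_bits = 5 + 4 * nbits
    nbytes = (total_bits + 7) // 8
    bits = int.from_bytes(body[:nbytes], "big") >> (nbytes * 8 - total_bits)

    def field(index):
        raw = (bits >> ((3 - index) * nbits)) & ((1 << nbits) - 1)
        if nbits and raw & (1 << (nbits - 1)):  # sign-extend negative values
            raw -= 1 << nbits
        return raw

    xmin, xmax, ymin, ymax = (field(i) for i in range(4))
    width = (xmax - xmin) // TWIPS_PER_PIXEL
    height = (ymax - ymin) // TWIPS_PER_PIXEL
    return version, width, height


if __name__ == "__main__":
    import sys

    with open(sys.argv[1], "rb") as f:
        print(read_swf_frame_size(f.read()))
```

Writing new values back is the harder half, since changing the field width shifts every bit that follows, and that is presumably why a round trip through a full toolkit ended up being the practical route.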

I know, I know, this is crazy, there has to be a better way, and I hope someone will enlighten me. But in any case, I finally did succeed, sort of. Here’s a controllable version of Brian’s screencast:

One further complication: I’d hoped to publish only the modified controller and configuration file, leaving the screencast in situ on Channel 9. But the cross-domain nature of that arrangement seems to rule it out. So I wound up rehosting the video on the same server as the controller and configuration file. In this case, just to keep things interesting, that server happens to be my Amazon S3 account.

Anyway, if you’ve made it this far, I can now refer you to the segment of that screencast I had in mind. At 6:45 (of 9:53), Brian shows how to swap out one batch of XML data associated with a Word document and swap in another. I’ll say more about why I found that interesting in another post. Meanwhile, I’ll be pondering how one of my perennial interests (URL-addressable and randomly accessible rich media) can help expose more of the considerable value that’s contained in the Channel 9 screencasts.
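As a small illustration of what URL-addressable could mean here, the sketch below turns a timecode like 6:45 into an offset in seconds and appends it to a URL. The "#t=" fragment is an assumed convention for the example; nothing in the controller described above is known to honor it, and the URL is a placeholder.

```python
def timecode_to_seconds(timecode):
    """Convert 'M:SS' or 'H:MM:SS' into a whole number of seconds."""
    seconds = 0
    for part in timecode.split(":"):
        seconds = seconds * 60 + int(part)
    return seconds


def segment_url(base_url, start):
    """Append a start offset to a media URL (hypothetical '#t=' convention)."""
    return f"{base_url}#t={timecode_to_seconds(start)}"


if __name__ == "__main__":
    start = timecode_to_seconds("6:45")   # 405 seconds
    total = timecode_to_seconds("9:53")   # 593 seconds
    print(segment_url("http://example.com/screencast.swf", "6:45"))
    print(f"{start / total:.0%} of the way through")
```

A player that understood such a link could open straight at the segment of interest, roughly two-thirds of the way into this screencast.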

1. If you’re a Flash developer, it’s trivial to whip up your own playback control. But it’s non-trivial for regular folks who just want to embed videos in HTML pages. These folks find themselves rooting around on the net for components that should be way easier to find and use.

2. If you try this on the Camtasia controller, note that the decompiled XML won’t immediately recompile. The generated ActionScript contains a handful of references to this:componentName that instead should be this.componentName.

New employee orientation

Once in a blue moon I find myself sitting in a new employee orientation. Today, as on other occasions, I was struck by how hard it is for the benefits people to explain their offerings. The presentation is necessarily general but everyone’s circumstances are particular. There’s no good way to bridge that gap in a large group session.

My guess is that a lot of the folks who were in that session today will, upon joining their teams tomorrow, ask for advice about various health and investment options. But team members won’t necessarily be the best sources of advice, because similar work circumstances don’t map to similar life circumstances. What new employees really need is to compare notes with other employees in similar life circumstances.

Benefits people and coworkers often won’t be in a position to meet that need. But a social application that matched up employees in similar life circumstances could be a great way to transfer highly particular kinds of benefits knowledge.

Ambient video awareness and visible conversations

A few years ago Marc Eisenstadt, chief scientist with the Open University’s Knowledge Media Institute, wrote to tell me about a system called BuddySpace. We’ve been in touch on and off since then, and when he heard I’d be in Cambridge for the Technology, Knowledge, and Society conference, he invited me to the OU’s headquarters in Milton Keynes for a visit. I wasn’t able to make that detour, but we got together anyway thanks to KMI’s new media maven Peter Scott, who was at the conference to demonstrate and discuss some of the Open University’s groundbreaking work in video-enhanced remote collaboration.

Peter’s talk focused mainly on Hexagon, a project in “ambient video awareness.” The idea is that a distributed team of webcam-equipped collaborators monitor one another’s work environments (at home, in the office, on the road) using hexagonal windows that tile nicely on a computer display. It’s a “room-based” system, Peter says. Surveillance occurs only when team members enter a virtual room, thereby announcing their willingness to see and be seen.

Why would anyone want to do that? Suppose Mary wants to contact Joe, and must choose between an assortment of communication options: instant messaging, email, phone, videoconferencing. If she can see that Joe is on the phone, she’ll know to choose email or IM over a phone call or a videoconference. Other visual cues might help her to decide between synchronous IM and asynchronous email. If Joe looks bored and is tapping his fingers, he might be on hold and thus receptive to instantaneous chat. If he’s gesticulating wildly and talking up a storm, though, he’s clearly non-interruptible, in which case Mary should cut him some slack and use email as a buffer.

Hexagon has been made available to a number of groups. Some used it enthusiastically for a while. But only one group so far has made it a permanent habit: Peter’s own research group. As a result, he considers it a failed experiment. Maybe so, but I’m willing to cut the project some slack. It’s true that in the real world, far from research centers dedicated to video-enhanced remote collaboration, you won’t find many people who are as comfortable with extreme transparency — and as fluent with multi-modal communication — as Marc and Peter and their crew. But the real world is moving in that direction, and the camera-crazy UK may be leading the way as seen in the photo at right which juxtaposes a medieval wrought-iron lantern and a modern TV camera.

Meanwhile, some of those not yet ready for Hexagon may find related Open University projects, like FlashMeeting, to be more approachable. FlashMeeting is a lightweight videoconferencing system based on Adobe’s Flash Communication Server. Following his talk, Peter used his laptop to set up a FlashMeeting conference that included the two of us, Marc Eisenstadt at OU headquarters, and Tony Hirst who joined from the Isle of Wight. It’s a push-to-talk system that requires speakers to take turns. You queue for the microphone by clicking on a “raise your hand” icon. Like all such schemes, it’s awkward in some ways and convenient in others.

There were two awkward bits for me. First, I missed the free-flowing give-and-take of a full duplex conversation. Second, I had to divide my attention between mastering the interface and participating in the conference. At one point, for example, I needed to dequeue a request to talk. That’s doable, but in focusing on how to do it I lost the conversational thread.

Queue-to-talk is a common protocol, of course; it’s how things work at conferences, for example. In the FlashMeeting environment it serves to chunk the conversation in a way that’s incredibly useful downstream. All FlashMeeting conferences are recorded and can be played back. Because people queue to talk, it’s easy to chunk the playback into fragments that map the structure of the conversation. You can see the principle at work in this playback. Every segment boundary has a URL. If a speaker runs long, his or her segment will be subdivided to ensure fine-grained access to all parts of the meeting.
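I don't know how FlashMeeting implements this, but the idea is easy to sketch. Assume each turn at the microphone is logged as a speaker with start and end times in seconds; turns longer than some limit (120 seconds here, an arbitrary choice) get subdivided, and each resulting segment gets an address. The "#t=" URL form and all the names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Turn:
    speaker: str
    start: float  # seconds from the start of the recording
    end: float


def chunk_turns(turns, max_len=120.0):
    """Split speaker turns into segments no longer than max_len seconds."""
    segments = []
    for turn in turns:
        start = turn.start
        # Peel off full-length pieces while the remainder is still too long.
        while turn.end - start > max_len:
            segments.append(Turn(turn.speaker, start, start + max_len))
            start += max_len
        segments.append(Turn(turn.speaker, start, turn.end))
    return segments


def segment_urls(base_url, segments):
    """Give every segment boundary its own address (assumed '#t=' form)."""
    return [f"{base_url}#t={int(seg.start)}" for seg in segments]


if __name__ == "__main__":
    meeting = [Turn("Peter", 0, 90), Turn("Jon", 90, 400), Turn("Marc", 400, 450)]
    for seg in chunk_turns(meeting):
        print(seg)
```

The turn-taking protocol does the hard part: because only one person holds the microphone at a time, the segment boundaries come for free and no speech analysis is needed.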

The chunking also provides data that can be used to visualize the “shape” of a meeting. These conversational maps clearly distinguish between, for example, meetings that are presentations dominated by one speaker, versus meetings that (like ours) are conversations among co-equals. The maps also capture subtleties of interaction. You can see, for example, when someone’s hand has been raised for a long time, and whether that person ultimately does speak or instead withdraws from the queue.
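The presentation-versus-conversation distinction can be approximated from the same data. In this rough sketch, turns are (speaker, start, end) tuples in seconds, and a meeting counts as a presentation when one person holds the floor for at least 70% of the talk time. That threshold is an arbitrary assumption of mine, not anything the OU researchers use.

```python
from collections import defaultdict


def talk_shares(turns):
    """Return each speaker's fraction of the total talk time."""
    totals = defaultdict(float)
    for speaker, start, end in turns:
        totals[speaker] += end - start
    grand_total = sum(totals.values())
    if not grand_total:
        return {}
    return {speaker: t / grand_total for speaker, t in totals.items()}


def meeting_shape(turns, dominance=0.7):
    """Label a meeting by how concentrated the talking was (assumed threshold)."""
    shares = talk_shares(turns)
    if not shares:
        return "empty"
    return "presentation" if max(shares.values()) >= dominance else "conversation"


if __name__ == "__main__":
    lecture = [("A", 0, 80), ("B", 80, 100)]
    chat = [("A", 0, 30), ("B", 30, 55), ("C", 55, 85), ("A", 85, 100)]
    print(meeting_shape(lecture), meeting_shape(chat))
```

A fuller version would also log how long each hand stayed raised and whether the request was withdrawn, which is where the subtler patterns of interaction would show up.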

A map of a conversation

I expect the chunking is also handy for random-access navigation. In the conversation mapped here, for example, I spoke once at some length. If I were trying to recall what I said at that point, seeing the structure would help me pinpoint where to tune in.

Although Hexagon hasn’t caught on outside the lab, Peter says there’s been pretty good uptake of FlashMeeting because people “know how to have meetings.” I wonder if that’s really true, though. I suspect we know less about meetings than we think we do, and that automated analysis could tell us a lot.

The simple act of recording and playback can be a revelation. Once, for example, I recorded (with permission) a slightly tense phone negotiation. When I played it back, I heard myself making strategic, tactical, and social errors. I learned a lot from that, and might have learned even more if I’d had the benefit of the kinds of conversational x-rays that the OU researchers are developing.

Virtual worlds with exotic modes of social interaction tend to be headline-grabbers. Witness the Second Life PR fad, for example. By contrast, technology that merely reflects our real-world interactions back to us isn’t nearly so sexy. For most of us, though, in most cases, it might be a lot more useful.