A few years ago Marc Eisenstadt, chief scientist with the Open University’s Knowledge Media Institute, wrote to tell me about a system called BuddySpace. We’ve been in touch on and off since then, and when he heard I’d be in Cambridge for the Technology, Knowledge, and Society conference, he invited me to the OU’s headquarters in Milton Keynes for a visit. I wasn’t able to make that detour, but we got together anyway thanks to KMI’s new media maven Peter Scott, who was at the conference to demonstrate and discuss some of the Open University’s groundbreaking work in video-enhanced remote collaboration.
Peter’s talk focused mainly on Hexagon, a project in “ambient video awareness.” The idea is that a distributed team of webcam-equipped collaborators monitor one anothers’ work environments — at home, in the office, on the road — using hexagonal windows that tile nicely on a computer display. It’s a “room-based” system, Peter says. Surveillance occurs only team members enter a virtual room, thereby announcing their willingness to see and be seen.
Why would anyone want to do that? Suppose Mary wants to contact Joe, and must choose between an assortment of communication options: instant messaging, email, phone, videoconferencing. If she can see that Joe is on the phone, she’ll know to choose email or IM over a phone call or a videoconference. Other visual cues might help her to decide between synchronous IM and asynchronous email. If Joe looks bored and is tapping his fingers, he might be on hold and thus receptive to instantaneous chat. If he’s gesticulating wildly and talking up a storm, though, he’s clearly non-interruptible, in which case Mary should cut him some slack and use email as a buffer.
Hexagon has been made available to a number of groups. Some used it enthusiastically for a while. But only one group so far has made it a permanent habit: Peter’s own research group. As a result, he considers it a failed experiment. Maybe so, but I’m willing to cut the project some slack. It’s true that in the real world, far from research centers dedicated to video-enhanced remote collaboration, you won’t find many people who are as comfortable with extreme transparency — and as fluent with multi-modal communication — as Marc and Peter and their crew. But the real world is moving in that direction, and the camera-crazy UK may be leading the way as seen in the photo at right which juxtaposes a medieval wrought-iron lantern and a modern TV camera.
Meanwhile, some of those not yet ready for Hexagon may find related Open University projects, like FlashMeeting, to be more approachable. FlashMeeting is a lightweight videoconferencing system based on Adobe’s Flash Communication Server. Following his talk, Peter used his laptop to set up a FlashMeeting conference that included the two of us, Marc Eisenstadt at OU headquarters, and Tony Hirst who joined from the Isle of Wight. It’s a push-to-talk system that requires speakers to take turns. You queue for the microphone by clicking on a “raise your hand” icon. Like all such schemes, it’s awkward in some ways and convenient in others.
There were two awkward bits for me. First, I missed the free-flowing give-and-take of a full duplex conversation. Second, I had to divide my attention between mastering the interface and participating in the conference. At one point, for example, I needed to dequeue a request to talk. That’s doable, but in focusing on how to do it I lost the conversational thread.
Queue-to-talk is a common protocol, of course — it’s how things work at conferences, for example. In the FlashMeeting environment it serves to chunk the conversation in a way that’s incredibly useful downstream. All FlashMeeting conferences are recorded and can be played back. Because people queue to talk, it’s easy to chunk the playback into fragments that map the structure of the conversation. You can see the principle at work in this playback. Every segment boundary has an URL. If a speaker runs long, his or her segment will be subdivided to ensure fine-grained access to all parts of the meeting.
The chunking also provides data that can be used to visualize the “shape” of a meeting. These conversational maps clearly distinguish between, for example, meetings that are presentations dominated by one speaker, versus meetings that (like ours) are conversations among co-equals. The maps also capture subtleties of interaction. You can see, for example, when someone’s hand has been raised for a long time, and whether that person ultimately does speak or instead withdraws from the queue.
I expect the chunking is also handy for random-access navigation. In the conversation mapped here, for example, I spoke once at some length. If I were trying to recall what I said at that point, seeing the structure would help me pinpoint where to tune in.
Although Hexagon hasn’t caught on outside the lab, Peter says there’s been pretty good uptake of FlashMeeting because people “know how to have meetings.” I wonder if that’s really true, though. I suspect we know less about meetings than we think we do, and that automated analysis could tell us a lot.
The simple act of recording and playback can be a revelation. Once, for example, I recorded (with permission) a slightly tense phone negotiation. When I played it back, I heard myself making strategic, tactical, and social errors. I learned a lot from that, and might have learned even more if I’d had the benefit of the kinds of conversational x-rays that the OU researchers are developing.
Virtual worlds with exotic modes of social interaction tend to be headline-grabbers. Witness the Second Life PR fad, for example. By contrast, technology that merely reflects our real-world interactions back to us isn’t nearly so sexy. For most of us, though, in most cases, it might be a lot more useful.