An experiment in online community

For my sabbatical project, I’m laying foundations for a community website. The project is focused on my hometown, but the idea is to do things in a way that can serve as a model for other towns. So I’m using cheaply- or freely-available infrastructure that can be deployed on commodity hosting services. The Python-based web development framework Django qualifies, with some caveats I’ve mentioned. And Django is particularly well-suited to this project because it distills the experiences of the developers of the top-notch community websites of Lawrence, Kansas. The story of those sites, told here by Rob Curley, is inspirational.

I’m also using Assembla for version control (with Subversion), issue tracking (with Trac), and other collaborative features described by my friend Andy Singleton in this podcast (transcript). It’s downright remarkable to be able to conjure up all this infrastructure instantly and for free. I’m a lone wolf on this project so far, but I hope to recruit collaborators, and I look forward to being able to work with them in this environment.

I have two goals for this project. First, aggregate and normalize existing online resources. Second, show people how and why to create online resources in ways that are easy to aggregate and normalize.

Online event calendars are one obvious target. The newspaper has one, the college has one, the city has one, and there’s also a smattering of local events listed in places like Yahoo Local, Upcoming, and Eventful. So far I’ve welded four such sources into a common calendar, and wow, what a messy job that’s been. The newspaper, the college, and the city offer web calendars only as HTML, which I can and do scrape. In theory the Yahoo/Upcoming and Eventful stuff is easier to work with, but in practice, not so much. Yahoo Local offers no structured outputs. Upcoming does, the events reflected into it from Y Local use hCalendar format, but finding and using tools to parse that stuff always seems to involve more time and effort than I expect. Eventful’s structured outputs are RSS and iCal. If you want details about events, such as location and time, you need to parse the iCal, which is non-trivial but doable. If you just need the basics, though — date, title, link — it’s trivial to get that from the RSS feed.

I’m pretty good at scraping and parsing and merging, but I don’t want to make a career out of it. The idea is to repurpose various silos in ways that are immediately useful, but also lead people to discover better ways to manage their silos — or, ultimately, to discover alternatives to the silos.

An example of a better way to manage a siloed calendar would be to publish it in structured formats as well as HTML. But while that would make things easier for me, I doubt that iCal or RSS have enough mainstream traction to make it a priority for a small-town newspaper, college, or town government. If folks could flip a switch and make the legacy web calendar emit structured output, they might do that. But otherwise — and I’d guess typically — it’s not going to happen.

For event calendars, switching to a hosted service is becoming an attractive alternative. In the major metro areas with big colleges and newspapers, it may make sense to manage event information using in-house IT systems, although combining these systems will require effort and is thus unlikely to occur. But for the many smaller communities like mine, it’s hard to justify a do-it-yourself approach. Services like Upcoming and Eventful aren’t simply free, they’re much more capable than homegrown solutions will ever be. If you’re starting from scratch, the choice would be a no-brainer — if more people realized these services were available, and understood what they can do. If you’re already using a homegrown service, though, it’ll be hard to overcome inertia and make a switch.

How to overcome that inertia? In theory, if I reflect screenscraped events out to Upcoming and/or Eventful, the additional value they’ll have there will encourage gradual migration. If anyone’s done something like that successfully, I’d be interested to hear about it.

On another front, I hope to showcase the few existing local blogs and encourage more local blogging activity. Syndication and tagging make it really easy to federate such activity. But although I know that, and doubtless every reader of this blog knows that, most people still don’t.

I think the best way to show what’s possible will be to leverage services like Flickr and YouTube. There are a lot more folks who are posting photos and videos related to my community than there are folks who are blogging about my community. Using text search and tag search, I can create a virtual community space in which those efforts come together. If that community space gains some traction, will people start to figure out that photos and videos described and tagged in certain simple and obvious ways are, implicitly, contributions to the community space? Might they then begin to realize that other self-motivated activities, like blogging, could also contribute to the community space, as and when they intersect with the public agenda?

I dunno, but I’d really like to see it happen. So, I’m doing the experiment.

A conversation with John Halamka about health information exchange

Dr. John Halamka joins me for this week’s podcast. He’s a renaissance guy: a physician, a CIO, and a healthcare IT innovator whose work I mentioned in a pair of InfoWorld columns. Lots of people are talking about secure exchange of medical records and portable continuity of care documents. John Halamka is on the front lines actually making these visions real. Among other activities he chairs the New England Health Electronic Data Interchange Network (NEHEN), which began exchanging financial and insurance data almost a decade ago and is now handling clinical data as well in the form of e-prescriptions. The technical, legal, and operational issues are daunting, but you’ll enjoy his pragmatic style and infectious enthusiasm.

We also discuss the national initiative to create a standard for continuity of care documents that will provide two key benefits. First, continuity both within and across regions. Second, data on medical outcomes that can be used by patients to choose providers, and by providers to study the effectiveness of procedures and medicines.

Websites mentioned in this podcast include:

Oh, and there’s a new feed address for this series: feeds.feedburner.com/JonUdellFridayPodcasts.

Django gymnastics

Recently I’ve been noodling with Django, a Python-based web application framework that’s comparable in many ways to Ruby on Rails. It appeals to me for a variety of reasons. Python has been my language of choice for several years, and going forward I expect it to help me build bridges between the worlds of LAMP and .NET. Django’s templating and object-relational mapping features are, as in RoR, hugely productive. And Django’s through-the-web administration reminds me of a comparable feature in Zope that I’ve always treasured. It’s incredibly handy to be able to delegate basic CRUD operations to trusted associates, who can use the built-in interface in lieue of the friendlier one you’d want to create for the general public.

The recommended way to deploy Django is to run it under mod_python, the Apache module that keeps Python interpreters in memory for high performance. But a lot of popular web hosting services don’t support that arrangement. For example, I just signed up for an account at BlueHost, the service used by the instructional technologists at the University of Mary Washington, and I looked into what it would take to get Django working in that environment.

Despite helpful clues it still took a while to work out the solution. In the process I reactivated dormant neurons in the parts of my brain dedicated to such esoterica as mod_rewrite and FastCGI, but I’d rather have been working with Django than working out how to configure it.

By way of contrast, setting up WordPress — a more well-known and popular application — was a one-click operation thanks to Fantastico, an add-on installer for the cPanel site manager.

I’ve heard it said that a compelling screencast is one key factor influencing the adoption of a new web-based application. One-click install in shared hosting environments has to be another. For a while, anyway, until the virtualization juggernaut gives everyone the illusion of dedicated hosting.

Video knowledge

Sean McCown is a professional database administrator who writes the Database Underground blog for InfoWorld. Lately his postings have been full of references to videos. One day, he watched a Sysinternals training flick, combining live video with screencasting, and made immediate use of it to pinpoint and fix a problem. Another day, he made his own training screencast:

I sat down last night and made a video of the restore procedure for one of our ETL processes. It was 10mins long, and it explained everything someone would need to know to recover the process from a crash. [Database Underground: Not just a DR plan anymore]

Screencasting is poised to become a routine tool of business communication, but there are still a few hurdles to overcome. For starters, video capture isn’t as accessible as it ought to be. Second Life gets it right: there’s always a camera available, and you can turn it on at any time. Every desktop OS should work like that.

Meanwhile, I’ll reiterate some advice: Camtasia is an excellent tool for capturing screen video on Windows, but its $300 price tag covers a lot of editing and production features that you may never use if you’re capturing in stream-of-consciousness mode for purposes of documentation. In that case, the free Windows Media Encoder is perfectly adequate.

On the Mac I’d been using Snapz Pro X for short flicks, but it takes forever to save long sessions. Next time I do a long-form Mac screencast I’ll try iShowU. That’s what Peter Wayner used for his AJAX screencasts. Peter says that iShowU saves instantly. I tried the demo, and it does.

Finally, there’s the odd hack I tried here: I used the camera’s display as the Mac’s screen, and captured to tape. If the 720×480 format is appropriate for your subject — and when the focus is a single application window, it can be — this is a nice way to collect a lot of raw material without chewing up a ton of disk space.

Capture mechanics aside, I think the bigger impediment is mindset. To do what Sean did — that is, narrate and show an internal process, for internal consumption — you have to overcome the same natural reticence that makes dictation such an awkward process for those of us who haven’t formerly incorporated it into our work style. You also have to overcome the notion, which we unconsciously absorb from our entertainment-oriented culture, that video is a form of entertainment. It can be. Depending on the producer, a screencast documenting a disaster recovery scenario could be side-splittingly funny. And if the humor didn’t compromise the message, a funny version would be much more effective than a dry recitation. But even a dry recitation is way, way better than what’s typically available: nothing.

Trailing-edge requirements for a community app

One of the projects I’m tackling on sabbatical is a community version of LibraryLookup. The service I wanted to create is described here: an RSS feed that’s updated when a book on your Amazon wishlist becomes available in your local library. Originally I planned to build a simple web application that would register Amazon wishlist IDs and produce custom RSS feeds for each registrant. But as I thought about what would make this service palatable to a community, I saw two problems with that approach:

  1. Familiarity. Most folks will not be familiar with RSS. If the primary goal is to get people using the service, rather than to evangelize RSS, it should use the more familiar style of email notification.
  2. Deployability. A web application needs to be hosted somewhere. In most communities, the library won’t be able to host the service on its own infrastructure. But if it’s hosted elsewhere, there will be a (rational) reluctance to take a dependency on that provider.

To address the first concern, I’m doing this as an old-fashioned email-based app. You subscribe or unsubscribe by sending email with a command and a wishlist ID in the Subject: header. And you receive notifications about book availability by way of email.

To address the second concern, I’m doing it as a client-side Python script, so that the only dependency is some version of Python and an Internet connection.

Because a library might not even be able to dedicate an email address for this purpose, I’m exploring the use of Gmail as the communication engine. In order for that to work, Python has to be able to make secure and authenticated POP and SMTP connections. Happily, it can.

The recipe for connecting Python to Gmail’s POP service is trivial:

import poplib
p = poplib.POP3_SSL(‘pop.gmail.com’)
p.user(‘USERNAME’)
p.pass_(‘PASSWORD’)

The recipe for connecting Python to Gmail’s SMTP service is less obvious:

import smtplib,
s = smtplib.SMTP(“smtp.gmail.com”)
s.ehlo(‘smtp.gmail.com’)
s.starttls()
s.ehlo(‘smtp.gmail.com’)
auth = ‘\x00USERNAME\x00PASSWORD’
eauth = base64.b64encode(auth)
s.putcmd(“AUTH PLAIN”)
s.putcmd(eauth)

This won’t work with no authentication, but neither will it work with the SMTP module’s login() which uses the wrong authentication type (i.e., LOGIN rather than PLAIN, I think).

Any POP/SMTP servers can be used, of course, so there’s no dependency on Gmail here, but it’s nice to see that Gmail can easily be pressed into service if need be.

It feels retro and trailing-edge to do an email-based app but, in order to make it familiar and deployable that seems like the right approach.

Larry O’Brien serves up three hardball questions

It is both sobering and gratifying to see folks asking the same questions about my upcoming gig that I’ve been asking myself. Larry O’Brien serves up three hardballs:

1. To what extent will the inherent imperative to advocate MS technologies stifle him?

Note that there is also a weird corollary: What about the MS technologies that I’m deeply fond of? In the past, nobody (well, hardly anybody) would question my motives if I got fired up about Monad or LINQ or IronPython. Now that’s bound to happen.

In any case, the only way this will work is if I explore and advocate things I believe in. So that’s what I plan to do. Some of those things will exist within the MS portfolio, some outside. Identifying value wherever it exists, and finding useful ways to extract and recombine it, is what I do. I hope I can continue to do that effectively as an MS employee but, of course, we’ll all just have to wait and see.

2. Will he be reduced to just a conduit of information (Microsoft’s new A-list blogger) or will he continue to contribute new creations?

Hand-on tinkering is critical to my method. I have in mind a long-running project that will enable me to try out lots of interesting things, while creating something useful in the process. I don’t want to say more until I’ve laid some foundations, but yes, I do plan to keep on contributing in the modest ways that I always have.

3. Will direct knowledge of unannounced initiatives keep him quiet on the very subjects on which he’s passionate?

Part of this career change goes beyond switching employers. The disconnect between the geek world and the civilian world has really been bugging me lately. Leading edge aside, there’s so much potential at the trailing edge that languishes because nobody helps people connect the dots. On the desktop, on the web, and everywhere else touched by computers and networks, people are running on 2 cylinders. And when we upgrade their computers and operating systems, that doesn’t tend to change.

I really, really want to show a lot of people how to use more of what they’ve got. Smarter methods of communication. More powerful data analysis and visualization. Surprisingly simple kinds of integration. These are my passions, and as Larry points out, they tend to involve fairly simple and accessible tools and techniques. In theory, to pursue this part of my mission, I don’t need to know about every secret project in the pipeline. Whether it’ll actually work out that way in practice, I dunno.

Being here, being there

Mike Champion raises an interesting point that applies to Microsoft but also more broadly:

The culture at MS is very F2F-oriented…if you’re out of sight, you have to work hard not to be out of mind.

But then he adds:

Geographic distance will help keep you from getting sucked into the groupthink of whatever group you’re in. Microsoft collectively needs to be constantly reminded what the world looks like to people whose view isn’t fogged up by our typical drizzle or distracted by the scenery on the sunny days.

We’re entering an era in which our personal, social, and professional lives are increasingly network-mediated. Trust-at-a-distance is a new possibility, with economic ramifications that everyone from Yochai Benkler to Jim Russell is trying to figure out. As someone who’s worked remotely for 8 years, and is about to work remotely for a company with relatively few remote employees, this question is extremely interesting to me.

On the one hand, I’ve learned that I can accomplish a lot because I spend an abormal percentage of my waking hours in flow rather than in meetings. I’ve also learned that network-mediated interactions can be more productive than F2F interactions. Consider my August screencast with Jim Hugunin, or my May screencast with Anders Hejlsberg, or indeed any of the other screencasts in that series. They’re all scheduled events, mediated by telephone and screensharing. I can’t see how physical colocation would improve them.

On the other hand, there’s the “watercooler” effect: being in a place, you see and hear and smell things that aren’t otherwise transmitted through the network. I have no doubt whatsoever that shared physical space matters in ways we can’t begin to describe or understand.

But as collaboration in shared virtual space takes its rightful place alongside collaboration in shared physical space, shouldn’t a company whose products are key enablers of virtual collaboration be eating its own dogfood?

Of course things are never as black-and-white as they appear. So I’m going to bookmark this posting and return to it in six months. Hopefully by then I’ll know more about the value of being here and of being there.