For my sabbatical project, I’m laying foundations for a community website. The project is focused on my hometown, but the idea is to do things in a way that can serve as a model for other towns. So I’m using cheaply or freely available infrastructure that can be deployed on commodity hosting services. The Python-based web development framework Django qualifies, with some caveats I’ve mentioned. And Django is particularly well-suited to this project because it distills the experiences of the developers of the top-notch community websites of Lawrence, Kansas. The story of those sites, told here by Rob Curley, is inspirational.

I’m also using Assembla for version control (with Subversion), issue tracking (with Trac), and other collaborative features described by my friend Andy Singleton in this podcast (transcript). It’s downright remarkable to be able to conjure up all this infrastructure instantly and for free. I’m a lone wolf on this project so far, but I hope to recruit collaborators, and I look forward to being able to work with them in this environment.

I have two goals for this project. First, aggregate and normalize existing online resources. Second, show people how and why to create online resources in ways that are easy to aggregate and normalize.

Online event calendars are one obvious target. The newspaper has one, the college has one, the city has one, and there’s also a smattering of local events listed in places like Yahoo Local, Upcoming, and Eventful. So far I’ve welded four such sources into a common calendar, and wow, what a messy job that’s been. The newspaper, the college, and the city offer web calendars only as HTML, which I can and do scrape. In theory the Yahoo/Upcoming and Eventful stuff is easier to work with, but in practice, not so much. Yahoo Local offers no structured outputs. Upcoming does: the events reflected into it from Yahoo Local use hCalendar format. But finding and using tools to parse that stuff always seems to involve more time and effort than I expect. Eventful’s structured outputs are RSS and iCal. If you want details about events, such as location and time, you need to parse the iCal, which is non-trivial but doable. If you just need the basics, though (date, title, link), it’s trivial to get that from the RSS feed.
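To give a flavor of the easy case, here’s a minimal sketch of pulling date, title, and link out of an RSS feed and merging several sources into one date-ordered list. It isn’t the code I’m actually running. The sample feed, the function names (basic_events, merge_calendars), and the source label are made up for illustration, and it assumes each item’s pubDate carries the event date, which a real feed may not honor.

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime

# Hypothetical sample feed; a real one would be fetched over HTTP.
SAMPLE_RSS = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Sample events</title>
    <item>
      <title>Farmers Market</title>
      <link>http://example.com/events/2</link>
      <pubDate>Sat, 14 Jul 2007 09:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Town Band Concert</title>
      <link>http://example.com/events/1</link>
      <pubDate>Wed, 11 Jul 2007 19:00:00 GMT</pubDate>
    </item>
  </channel>
</rss>"""


def basic_events(rss_text, source):
    """Return one dict (date, title, link, source) per RSS item."""
    root = ET.fromstring(rss_text)
    events = []
    for item in root.iter("item"):
        pub_date = item.findtext("pubDate")
        events.append({
            # Assumption: pubDate holds the event date (RFC 822 format).
            "date": parsedate_to_datetime(pub_date) if pub_date else None,
            "title": item.findtext("title", default="").strip(),
            "link": item.findtext("link", default="").strip(),
            "source": source,
        })
    return events


def merge_calendars(*event_lists):
    """Combine events from several sources, dropping undated ones, sorted by date."""
    merged = [e for events in event_lists for e in events if e["date"] is not None]
    return sorted(merged, key=lambda e: e["date"])


if __name__ == "__main__":
    for event in merge_calendars(basic_events(SAMPLE_RSS, "eventful")):
        print(event["date"].date(), event["title"], event["link"])
```

The scraped HTML sources would need their own extractors, but if each one returns the same small dict, the merge step doesn’t have to care where an event came from.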

I’m pretty good at scraping and parsing and merging, but I don’t want to make a career out of it. The idea is to repurpose various silos in ways that are immediately useful, but also lead people to discover better ways to manage their silos — or, ultimately, to discover alternatives to the silos.

An example of a better way to manage a siloed calendar would be to publish it in structured formats as well as HTML. But while that would make things easier for me, I doubt that iCal or RSS has enough mainstream traction to make it a priority for a small-town newspaper, college, or town government. If folks could flip a switch and make the legacy web calendar emit structured output, they might do that. But otherwise, and I’d guess typically, it’s not going to happen.

For event calendars, switching to a hosted service is becoming an attractive alternative. In the major metro areas with big colleges and newspapers, it may make sense to manage event information using in-house IT systems, although combining these systems will require effort and is thus unlikely to occur. But for the many smaller communities like mine, it’s hard to justify a do-it-yourself approach. Services like Upcoming and Eventful aren’t just free; they’re also much more capable than homegrown solutions will ever be. If you’re starting from scratch, the choice would be a no-brainer, if only more people realized these services were available and understood what they can do. If you’re already using a homegrown service, though, it’ll be hard to overcome inertia and make a switch.

How to overcome that inertia? In theory, if I reflect screenscraped events out to Upcoming and/or Eventful, the additional value they’ll have there will encourage gradual migration. If anyone’s done something like that successfully, I’d be interested to hear about it.

On another front, I hope to showcase the few existing local blogs and encourage more local blogging activity. Syndication and tagging make it really easy to federate such activity. But although I know that, and doubtless every reader of this blog knows that, most people still don’t.

I think the best way to show what’s possible will be to leverage services like Flickr and YouTube. There are a lot more folks who are posting photos and videos related to my community than there are folks who are blogging about my community. Using text search and tag search, I can create a virtual community space in which those efforts come together. If that community space gains some traction, will people start to figure out that photos and videos described and tagged in certain simple and obvious ways are, implicitly, contributions to the community space? Might they then begin to realize that other self-motivated activities, like blogging, could also contribute to the community space, as and when they intersect with the public agenda?
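As a rough sketch of what combining tag search and text search might look like once photo and video metadata has been fetched, here’s a small filter over plain dicts. It makes no calls to the Flickr or YouTube APIs. The item fields (title, description, tags), the function names, and the town name are all hypothetical stand-ins.

```python
def matches_community(item, tags, phrases):
    """True if the item carries one of the tags or mentions one of the phrases."""
    item_tags = {t.lower() for t in item.get("tags", [])}
    text = (item.get("title", "") + " " + item.get("description", "")).lower()
    tag_hit = bool(item_tags & {t.lower() for t in tags})
    text_hit = any(p.lower() in text for p in phrases)
    return tag_hit or text_hit


def community_space(items, tags, phrases):
    """Keep only the items that match by tag or by text."""
    return [item for item in items if matches_community(item, tags, phrases)]


if __name__ == "__main__":
    # Hypothetical metadata, as it might look after fetching from a photo or video service.
    items = [
        {"title": "Sunset", "description": "Over the river", "tags": ["Springfield", "sunset"]},
        {"title": "Parade in Springfield", "description": "", "tags": ["parade"]},
        {"title": "My cat", "description": "Asleep again", "tags": ["cat"]},
    ]
    for item in community_space(items, tags=["springfield"], phrases=["springfield"]):
        print(item["title"])
```

The point of matching on either signal is that people who haven’t yet learned to tag for the community still show up through their titles and descriptions.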

I dunno, but I’d really like to see it happen. So, I’m doing the experiment.