Trusted feeds

As several folks rightly pointed out in comments here, a community site based on tagging and syndication is exquisitely vulnerable to abuse. In the first incarnation of the photos page, for example, a malicious person could have posted something awful to Flickr and it would have shown up on that page. Flickr has its own abuse-handling system, of course, but its response time might not be quick enough to avert bad PR for elmcity.info.

My first thought was to attach an abuse-reporting link to every piece of externally sourced content. It would be a hair trigger that would — in a Wiki-like way — allow anyone to shoot first (i.e., remove an offensive item immediately) and enable the site management to ask questions later (i.e., review logs, revert removals if need be.)
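
Here is a minimal sketch of that hair-trigger idea, just to make the shoot-first, ask-questions-later flow concrete. The class and method names are hypothetical, and it assumes items are identified by simple IDs: anyone can hide an item at once, every action is logged, and a reviewer can restore it later.

```python
from datetime import datetime, timezone


class Moderation:
    """Hypothetical hair-trigger moderation: hide first, review the log later."""

    def __init__(self):
        self.hidden = set()   # IDs of items currently removed from view
        self.log = []         # (timestamp, action, item_id, who) tuples

    def _record(self, action, item_id, who):
        self.log.append((datetime.now(timezone.utc), action, item_id, who))

    def report(self, item_id, reporter):
        # Anyone can pull the trigger; the item disappears immediately.
        self.hidden.add(item_id)
        self._record("hide", item_id, reporter)

    def revert(self, item_id, moderator):
        # Site management reviews the log and restores the item if need be.
        self.hidden.discard(item_id)
        self._record("restore", item_id, moderator)

    def visible(self, item_ids):
        return [i for i in item_ids if i not in self.hidden]
```

The log is what makes the hair trigger tolerable: nothing is deleted, so a mistaken or malicious report costs only a revert.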

I’m still interested in trying that approach, but not as the mainstay. Instead I want to promote the idea of trusted feeds. There are currently two on that photos page, one from Flickr and one from local blogger Lorianne DiSabato. I know Lorianne and I trust her to produce a stream of high-quality photos (and essays) about life in our community.

After reviewing the Flickr photostreams of the people whose recent photos match a Flickr search for “Keene NH”, I decided to extend provisional trust to them as well, so I put their names on a list of trusted feeds.

Then I restricted the page to just those feeds, and added a note explaining that anyone who sends an email request can join the list of trusted feeds.
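
In code, the restriction amounts to a whitelist filter. This is only a sketch, not the site's actual implementation: the `Photo` record, the account names, and the `trusted_only` function are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Photo:
    owner: str   # account name of whoever posted the item
    title: str
    url: str


# Hypothetical whitelist; names get added by hand after an email request.
TRUSTED_OWNERS = {"lorianne", "keene_photographer"}


def trusted_only(photos, trusted=TRUSTED_OWNERS):
    """Keep only the items whose owner is on the trusted list."""
    wanted = {name.lower() for name in trusted}
    return [p for p in photos if p.owner.lower() in wanted]
```

Anything from an unknown account simply never reaches the page, which is the point: moderation happens once, when a name joins the list, rather than on every item.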

Of course anything short of frictionless participation is an obstacle. On the other hand, based on my conversation with Paul English about customer service, there’s a lot to be said for a required step in the process that forms a human relationship — attenuated by email, true, but still, a relationship.

I think it’s even more interesting when the service, or site, is rooted in a geographical place. On the world wide web, I’m always forming those kinds of relationships with people I will never meet. But on a place-based site, I may already have met these folks. If I haven’t yet, I might still. Trust on the Internet has a very different flavor when the scope is local.

A couple of years ago I was on a panel of media types at a local community leadership seminar, where I was the token blogger. The topic was how the community gathers and disseminates news. NHPR’s executive editor Jon Greenberg said what needed to be said about blogging, which was helpful because it was more credible to that audience coming from him than from me. Even so, there was a lot of pushback. When it was suggested that people could consume a richer and more varied diet of news, they balked. “It’s your [the media’s] job to sift and summarize, not ours.”

Similarly, when it was suggested that people could produce news about the local issues where they are stakeholders and have important knowledge, the pushback was: “But you can’t trust random information on the Internet.”

I found that fascinating. Here were a bunch of folks — a hospital administrator, a fire chief, a school nurse, a librarian — who all know one another. What they seemed to be saying, though, is the Internet would invalidate that trust.

Now I assume that they trust emails from one another. Likewise phone calls, which are increasingly carried over the Internet. And if the fire chief wrote a blog that the school nurse subscribed to, there would be no doubt in the mind of John, the school nurse, that the information blogged by Mary, the fire chief, was real and trustworthy.

Until you join the two-way web, though, you don’t really see how it’s like other familiar modes of communication: phone, email. Or how the nature of that communication differs depending on whether the communicating parties live near one another.

If feeds begin to flow locally, it’ll be easy to trust them in a way that’ll supply most of the moderation we need. The problem, of course, is getting those feeds to flow. Bill Seitz asked:

So you think the “average” person will have Flickr and del.icio.us accounts in addition to joining your site?

No, I don’t, though over time more will use these or equivalent services. So yes, I also need to show how any online resource that’s being created, anywhere, for any purpose, can flow into the community site. It only takes two agreements:

  1. An agreement on where to find the source.
  2. An agreement to trust the source.

In the short-to-medium term, those sources are not going to be friendly to me, the developer. So I’ll have to go the extra mile to bring them in, as I’m doing on the events page.

Conceptual barriers

I’ve planted the seed that I hope will grow into the kind of community site that defines community the old-fashioned way — people living in the same place — as well as in the modern sense of network affiliation. The project has raised a bunch of technical, operational, and aesthetic issues.

Technical: Django is working well for me, but I haven’t invested deeply in it yet. Patrick Phelan, a web developer I’ve corresponded with for years, reminded me the other day that my reluctance is strategic. With any framework, buy-in cuts two ways, and you should never take unnecessary dependencies. Patrick noted that I am using WSGI, Python’s Web Server Gateway Interface, to connect Django by way of FastCGI to my commodity hosting service. And he pointed out that a rich WSGI ecosystem is evolving that could enable me to proceed in the minimalistic style I prefer, integrating best-of-breed middleware (e.g., URL mapping, templating) as needed. If the preceding sentence makes any sense to you, but you haven’t heard about Paste and Pylons (as I had not until Patrick pointed me at them), then you might want to watch the Google TechTalk that Patrick recommends.
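
For anyone who hasn't seen WSGI up close, here is a minimal sketch of the minimalistic style, using only the standard library. It is not how Paste or Pylons do it; `hello_app`, `PrefixRouter`, `not_found`, and `call` are hypothetical names. The point is only that a WSGI application is a callable, and that middleware such as URL mapping is just another callable wrapped around it.

```python
from wsgiref.util import setup_testing_defaults


def hello_app(environ, start_response):
    """A bare WSGI application: takes the request environ, returns body chunks."""
    body = b"Hello from elmcity\n"
    start_response("200 OK", [("Content-Type", "text/plain"),
                              ("Content-Length", str(len(body)))])
    return [body]


def not_found(environ, start_response):
    start_response("404 Not Found", [("Content-Type", "text/plain")])
    return [b"Not found\n"]


class PrefixRouter:
    """Tiny URL-mapping middleware: dispatch on a path prefix."""

    def __init__(self, routes, default):
        self.routes = routes      # e.g. {"/hello": hello_app}
        self.default = default

    def __call__(self, environ, start_response):
        path = environ.get("PATH_INFO", "")
        for prefix, app in self.routes.items():
            if path == prefix or path.startswith(prefix + "/"):
                return app(environ, start_response)
        return self.default(environ, start_response)


application = PrefixRouter({"/hello": hello_app}, default=not_found)


def call(app, path):
    """Invoke a WSGI app in-process, without a server, and capture the result."""
    environ = {}
    setup_testing_defaults(environ)   # fills in the required WSGI keys
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status

    body = b"".join(app(environ, start_response))
    return captured["status"], body
```

Because `application` is an ordinary WSGI callable, the same object could sit behind FastCGI, mod_python, or a test harness like `call` without changes.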

Operational: I’m doing this project on $8/month commodity hosting because I want to understand, and explain, how much can be accomplished for how little. Bottom line: amazingly much for amazingly little. For years I’ve supplied my own infrastructure, so I never had the experience of using a hosting service that provides web wrappers to: create subdomains; provision databases and email accounts; deploy blogs and wikis. Sweet! At the same time, though, I’m struck by how much specialized cross-domain knowledge I’ve had to muster. For example, the first service I’ve built on the site, a community version of LibraryLookup, relies on programmatic use of authenticated SMTP to send signup confirmation messages and status alerts. I figured out how to do that in Python, but it took some head-scratching, and my solution isn’t particularly robust. For me, spending an extra buck a month for a more robust solution (ideally delivered as a language-independent web service) would be an option I’d consider. For many people, though, it would be an enabler for things that otherwise wouldn’t happen. There’s a ton of opportunity in this space for buck-a-month services like that.
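
For the curious, the authenticated SMTP part can be sketched with Python's standard library roughly as follows. This is not the code running on the site. The host, port, credentials, addresses, and function names are placeholders, and it assumes the hosting service's mail server supports STARTTLS.

```python
import smtplib
from email.message import EmailMessage


def build_confirmation(sender, recipient, confirm_url):
    """Compose a signup confirmation message (the wording is illustrative)."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = "Please confirm your LibraryLookup signup"
    msg.set_content("Follow this link to confirm your signup:\n" + confirm_url + "\n")
    return msg


def send_authenticated(msg, host, port, username, password):
    """Send over an authenticated SMTP session, upgraded to TLS first."""
    with smtplib.SMTP(host, port, timeout=30) as server:
        server.starttls()                  # assumes the server offers STARTTLS
        server.login(username, password)   # raises SMTPAuthenticationError on failure
        server.send_message(msg)
```

A more robust version would catch `smtplib.SMTPException`, retry on transient failures, and queue messages, which is exactly the kind of thing a buck-a-month service could take off your hands.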

Aesthetic: For now I’m going with an aggressively Web 0.1 style, a la del.icio.us and craigslist. My wife’s first comment was: “So, you are going to pretty it up a bit, right?” I dunno, you can argue it both ways. The current arrangement has the advantage of being The Simplest Thing That Could Possibly Work. But virtuous laziness aside, it may be that craigslist, in particular, has validated the Web 0.1 aesthetic for community information services. Or it may be that my wife’s first reaction was correct, and I’ll have to look for a volunteer designer. We’ll see.

None of these issues are top of mind for me now, though, because they’re all trumped by a conceptual issue. How do I demonstrate methods of syndication, tagging, and service composition so that people will understand them and, more importantly, apply them?

Consider the version of LibraryLookup that I’ve built for this site. The protocol is, admittedly, abstract. It invites you to use your Amazon wishlist not only for its existing purposes — keeping track of stuff you’re interested in, registering for gifts you’d like to receive — but also as an interface to your local library.

Dan Chudnov thinks this is a questionable approach, and his point about interlibrary loan is well taken. But we don’t have through-the-web interlibrary loan in my town, and if we did, I’d still want to use Amazon as my primary interface to it. To me, it’s obvious why and how to wire those things together. To most people, it isn’t, and that’s the challenge.

To meet that challenge, I’m stepping back from some things that have been articles of faith for me. For example, this service does not yet notify by way of RSS. Just email for now. Of course I can and will offer RSS, but in my community (as in most) that is not the preferred way to receive notifications.

Everything else about this service will be unfamiliar to most people:

  • That an Amazon wishlist can serve multiple purposes.
  • That LibraryLookup is OK with Amazon. (It is. Jeff Bezos told me so.)
  • That we should expect to be able to wire the web to suit our purposes.

The lone familiar aspect of this service, I realized, is that once in a while you get an email alerting you that something you want is available. Everyone will understand that. But the rest is going to be hard, and I’ve concluded that evangelizing RSS in this context would only muddy the waters even more.

In other ways, though, I’m pushing hard for the unfamiliar. It would be an obvious thing to use Django’s wonderful automation of database CRUD (create, read, update, delete) operations to directly manage events, businesses, outdoor activities, media, and other collections of items of local interest. People are familiar with the notion of a site that you contribute directly to, and I could do things that way, but for the most part I don’t want to. I want to show that you can contribute indirectly, from almost anywhere, and that services like Flickr and del.icio.us can be the database.

I got a great idea about how to approach this from Mark Phippard, a software guy who lives in my town (though we’ve not yet met in person). Mark wrote to offer technical assistance, which I’m glad to receive, but I wrote back asking for help breaking through the conceptual barrier. How do I motivate the idea of indirect, loosely-coupled contribution?

Mark mentioned that one of his pet peeves is the dearth of online information about local restaurants. You can find their phone numbers on the web, but he’d like to see their menus. That’s a perfect opportunity to show how Flickr can be used as a database. If Mark, or I, or someone else scans or photographs a couple of restaurant menus and posts them to Flickr, tagged with ‘restaurant’ and ‘menu’ and ‘elmcityinfo’, we’ll have the seed of a directory that anyone can help populate very easily. Along the way, we might be able to show that Flickr isn’t the only way to do it. A blog can also serve the purpose, or a personal site with photo albums made and uploaded by JAlbum. So long as we agree on a tag vocabulary, I can federate stuff from a variety of sources.
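
To make the tag-vocabulary idea concrete, here is a rough sketch of the federation step. It assumes each source (Flickr, a blog, a JAlbum site) has already been fetched and reduced to a list of dictionaries with `url`, `title`, and `tags` keys. That common shape, and the `federate` function, are my assumptions for illustration, not anything the services provide.

```python
REQUIRED_TAGS = {"elmcityinfo", "restaurant", "menu"}


def normalize(tags):
    """Compare tags case-insensitively and ignore stray whitespace."""
    return {t.strip().lower() for t in tags}


def federate(sources, required=REQUIRED_TAGS):
    """Merge items from several sources, keeping those with every required tag."""
    merged = []
    seen = set()   # URLs already collected, so duplicates appear once
    for source_name, items in sources.items():
        for item in items:
            if required <= normalize(item["tags"]) and item["url"] not in seen:
                seen.add(item["url"])
                merged.append({"source": source_name, **item})
    return merged
```

The code doesn't care where an item came from. The agreement on tags does all the work, so adding a new source means adding one more entry to `sources`.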

And now, I’m off to collect some local restaurant menus. A nice little fieldwork project for my sabbatical!

A conversation with Paul English about customer service and human dignity

This week’s podcast features Paul English. He’s a software veteran who’s been VP of technology at Intuit and runs the Internet travel search engine at Kayak.com, but is best known for the IVR Cheat Sheet. Now available at gethuman.com, this popular database of voice-system shortcuts makes it easier for people to get the human assistance they crave when calling customer service centers.

The gethuman project isn’t just a list of IVR hacks anymore. It’s evolved into a consumer movement that publishes best practices for quality phone service and rates companies’ adherence to those best practices.

Although human-intensive customer service is usually regarded as costly and inefficient, operations like craigslist — where Craig Newmark’s title is, famously, customer service representative and founder — invite us to rethink that conventional wisdom. Kayak.com’s customer service was inspired by craigslist. Paul English says that making his engineers directly responsible for customer service has done wonders for the software development process. Because they’re on the front lines dealing with the fallout from poor usability, they’re highly motivated to improve it.

We also discussed web-based data management. The original IVR Cheat Sheet was done with Intuit QuickBase, an early and little-known entrant into a category that’s now heating up: web databases.

Finally, we talked about Partners in Health, the organization to which Paul English donates his consulting fees. The story of Partners in Health is told in Tracy Kidder’s book Mountains Beyond Mountains: Healing the World: The Quest of Dr. Paul Farmer. At the end of the podcast I mention that I’d added that book to my Amazon wishlist. The other day, while looking for something to listen to on an afternoon run, I checked my RSS reader and saw that the book was available in my local library in audio format. Sweet! Two afternoon runs later, I’m halfway through. It’s both an inspirational tale about Paul Farmer’s mission and a case study in how holistic health care systems can operate far more cost-effectively than most do today.

PowerBook rot

Back in 2003 I wrote an essay on the dreaded syndrome of Windows rot. As fate would have it, I am still using that same machine, and it’s been quite stable since then. In a year or so, we’ll start hearing opinions about the relative rot-resistance of Vista versus XPSP2. But meanwhile, I’m plagued by a different syndrome: PowerBook rot.

Both of my PowerBooks are afflicted. About six months ago, my 2001-era Titanium G4 began to suffer sporadic WiFi signal loss along with the kinds of narcolepsy and spontaneous shutdowns that many new Intel Macs have exhibited. I’ve tried all the obvious things, including reseating the AirPort card and resetting the NVRAM and PMU, but to no avail. The WiFi is negotiable: I could try a different card or just use the machine at home on a wired LAN. But if I can’t fix the worsening narcolepsy and shutdowns, it’s all over. Something’s gone funky on the motherboard, I guess, and this machine’s too old and beat up to justify replacing the board.

Then, a couple of weeks ago, my 2005-era G4 caught the spontaneous shutdown bug. I wondered if it might be protesting my new job, but when I noticed half my RAM was missing, I diagnosed lower memory slot failure. So now that machine is away having its motherboard replaced, a procedure that appears to be suspiciously routine for my local Apple store.

It’s always dangerous to extrapolate from anecdotal experience, and there’s never good data on this kind of thing, but I must say that while researching these problems I’ve seen a lot of bitching about PowerBooks. Is it just me, or are these things not built to last?

An experiment in online community

For my sabbatical project, I’m laying foundations for a community website. The project is focused on my hometown, but the idea is to do things in a way that can serve as a model for other towns. So I’m using cheaply- or freely-available infrastructure that can be deployed on commodity hosting services. The Python-based web development framework Django qualifies, with some caveats I’ve mentioned. And Django is particularly well-suited to this project because it distills the experiences of the developers of the top-notch community websites of Lawrence, Kansas. The story of those sites, told here by Rob Curley, is inspirational.

I’m also using Assembla for version control (with Subversion), issue tracking (with Trac), and other collaborative features described by my friend Andy Singleton in this podcast (transcript). It’s downright remarkable to be able to conjure up all this infrastructure instantly and for free. I’m a lone wolf on this project so far, but I hope to recruit collaborators, and I look forward to being able to work with them in this environment.

I have two goals for this project. First, aggregate and normalize existing online resources. Second, show people how and why to create online resources in ways that are easy to aggregate and normalize.

Online event calendars are one obvious target. The newspaper has one, the college has one, the city has one, and there’s also a smattering of local events listed in places like Yahoo Local, Upcoming, and Eventful. So far I’ve welded four such sources into a common calendar, and wow, what a messy job that’s been. The newspaper, the college, and the city offer web calendars only as HTML, which I can and do scrape. In theory the Yahoo/Upcoming and Eventful stuff is easier to work with, but in practice, not so much. Yahoo Local offers no structured outputs. Upcoming does: the events reflected into it from Yahoo Local use hCalendar format. But finding and using tools to parse that stuff always seems to involve more time and effort than I expect. Eventful’s structured outputs are RSS and iCal. If you want details about events, such as location and time, you need to parse the iCal, which is non-trivial but doable. If you just need the basics, though (date, title, link), it’s trivial to get that from the RSS feed.
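
As a sketch of the trivial case, here is how the basics can be pulled from an RSS 2.0 feed with Python's standard library. The sample feed is invented, and the sketch assumes the date of interest is carried in each item's `pubDate` element. A real feed may put the event date elsewhere, in which case that one lookup changes.

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime

# Invented sample data standing in for a fetched feed.
SAMPLE_RSS = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Sample events</title>
    <item>
      <title>Farmers market</title>
      <link>http://example.org/events/1</link>
      <pubDate>Sat, 06 Jan 2007 09:00:00 -0500</pubDate>
    </item>
    <item>
      <title>Contra dance</title>
      <link>http://example.org/events/2</link>
      <pubDate>Wed, 03 Jan 2007 19:00:00 -0500</pubDate>
    </item>
  </channel>
</rss>"""


def basics(rss_text):
    """Return date, title, and link for each item, sorted by date."""
    root = ET.fromstring(rss_text)
    events = []
    for item in root.iter("item"):
        raw_date = item.findtext("pubDate")
        if raw_date is None:
            continue   # skip items that carry no date at all
        events.append({
            "date": parsedate_to_datetime(raw_date).date(),
            "title": item.findtext("title", default="").strip(),
            "link": item.findtext("link", default="").strip(),
        })
    return sorted(events, key=lambda e: e["date"])
```

Once every source is reduced to the same date, title, link shape, merging four calendars into one is just concatenating the lists and sorting again.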

I’m pretty good at scraping and parsing and merging, but I don’t want to make a career out of it. The idea is to repurpose various silos in ways that are immediately useful, but also lead people to discover better ways to manage their silos — or, ultimately, to discover alternatives to the silos.

An example of a better way to manage a siloed calendar would be to publish it in structured formats as well as HTML. But while that would make things easier for me, I doubt that iCal or RSS have enough mainstream traction to make it a priority for a small-town newspaper, college, or town government. If folks could flip a switch and make the legacy web calendar emit structured output, they might do that. But otherwise — and I’d guess typically — it’s not going to happen.

For event calendars, switching to a hosted service is becoming an attractive alternative. In the major metro areas with big colleges and newspapers, it may make sense to manage event information using in-house IT systems, although combining these systems will require effort and is thus unlikely to occur. But for the many smaller communities like mine, it’s hard to justify a do-it-yourself approach. Services like Upcoming and Eventful aren’t simply free, they’re much more capable than homegrown solutions will ever be. If you’re starting from scratch, the choice would be a no-brainer — if more people realized these services were available, and understood what they can do. If you’re already using a homegrown service, though, it’ll be hard to overcome inertia and make a switch.

How to overcome that inertia? In theory, if I reflect screenscraped events out to Upcoming and/or Eventful, the additional value they’ll have there will encourage gradual migration. If anyone’s done something like that successfully, I’d be interested to hear about it.

On another front, I hope to showcase the few existing local blogs and encourage more local blogging activity. Syndication and tagging make it really easy to federate such activity. But although I know that, and doubtless every reader of this blog knows that, most people still don’t.

I think the best way to show what’s possible will be to leverage services like Flickr and YouTube. There are a lot more folks who are posting photos and videos related to my community than there are folks who are blogging about my community. Using text search and tag search, I can create a virtual community space in which those efforts come together. If that community space gains some traction, will people start to figure out that photos and videos described and tagged in certain simple and obvious ways are, implicitly, contributions to the community space? Might they then begin to realize that other self-motivated activities, like blogging, could also contribute to the community space, as and when they intersect with the public agenda?

I dunno, but I’d really like to see it happen. So, I’m doing the experiment.

A conversation with John Halamka about health information exchange

Dr. John Halamka joins me for this week’s podcast. He’s a renaissance guy: a physician, a CIO, and a healthcare IT innovator whose work I mentioned in a pair of InfoWorld columns. Lots of people are talking about secure exchange of medical records and portable continuity of care documents. John Halamka is on the front lines actually making these visions real. Among other activities he chairs the New England Health Electronic Data Interchange Network (NEHEN), which began exchanging financial and insurance data almost a decade ago and is now handling clinical data as well in the form of e-prescriptions. The technical, legal, and operational issues are daunting, but you’ll enjoy his pragmatic style and infectious enthusiasm.

We also discuss the national initiative to create a standard for continuity of care documents that will provide two key benefits. First, continuity both within and across regions. Second, data on medical outcomes that can be used by patients to choose providers, and by providers to study the effectiveness of procedures and medicines.

Oh, and there’s a new feed address for this series: feeds.feedburner.com/JonUdellFridayPodcasts.

Django gymnastics

Recently I’ve been noodling with Django, a Python-based web application framework that’s comparable in many ways to Ruby on Rails. It appeals to me for a variety of reasons. Python has been my language of choice for several years, and going forward I expect it to help me build bridges between the worlds of LAMP and .NET. Django’s templating and object-relational mapping features are, as in RoR, hugely productive. And Django’s through-the-web administration reminds me of a comparable feature in Zope that I’ve always treasured. It’s incredibly handy to be able to delegate basic CRUD operations to trusted associates, who can use the built-in interface in lieu of the friendlier one you’d want to create for the general public.

The recommended way to deploy Django is to run it under mod_python, the Apache module that keeps Python interpreters in memory for high performance. But a lot of popular web hosting services don’t support that arrangement. For example, I just signed up for an account at BlueHost, the service used by the instructional technologists at the University of Mary Washington, and I looked into what it would take to get Django working in that environment.

Despite helpful clues it still took a while to work out the solution. In the process I reactivated dormant neurons in the parts of my brain dedicated to such esoterica as mod_rewrite and FastCGI, but I’d rather have been working with Django than working out how to configure it.

By way of contrast, setting up WordPress, a better-known and more popular application, was a one-click operation thanks to Fantastico, an add-on installer for the cPanel site manager.

I’ve heard it said that a compelling screencast is one key factor influencing the adoption of a new web-based application. One-click install in shared hosting environments has to be another. For a while, anyway, until the virtualization juggernaut gives everyone the illusion of dedicated hosting.