Strategic choices for calendar publishers

Although I haven’t been able to confirm this officially yet, it looks like FuseCal, the HTML screen-scraping service that I’ve been using (and recommending) as a way to convert calendar-like web pages into iCalendar feeds, has shut down.

The web pages that FuseCal has been successfully processing, for several curators participating in the elmcity project, are listed below. They’re a kind of existence proof, validating the notion that unstructured calendar info — what people intuitively create — can be mechanically transformed into structured info that syndicates reliably.

I hope this service, or some future variant of it, will continue. It’s a really useful way to help people grasp the concept of publishing calendar feeds.

But in the long run, it’s a set of training wheels. Ultimately we need to teach people why and how to produce such feeds more directly. All of the event information shown below could be managed in a more structured way using calendar software that produces data feeds for syndication and web pages for browsing.

More broadly, incidents like this prompt us to consider the nature of the services ecosystem we’re all embedded in — as users and, increasingly, as co-creators. In the software business, developers have long since learned to evaluate the benefits and risks of “taking a dependency” on a component, library, or service. Users didn’t have to think too much about that. A software product that was discontinued would keep working perfectly well, maybe for years. But services can — and sometimes do — stop abruptly.

Since the elmcity project is embedded in a services ecosystem, as both a provider and a consumer, how should a curator evaluate service dependencies and their associated risks and benefits? Here are some guidelines.

Many eggs, many baskets

An instance of the calendar aggregator gathers events from three main sources: Eventful (service #1), Upcoming (service #2), and a curated set of iCalendar feeds. A subset of those feeds may (until recently) have been mediated by FuseCal (service #3). So there were three main service dependencies here, and that’s one form of diversification.

But the iCalendar feeds represent another, and more powerful, form of diversification. One may be served up by a Drupal system, one may be an ICS file posted from Outlook 2007, one may be an instance of Google Calendar. Each depends on its own supporting services, but the ecosystem is very diverse.

Data and service portability

The elmcity project isn’t a database of events, but rather an aggregator of feeds of events. What matters in this case is portability of metadata describing the feeds, as well as data describing events. The system depends on Delicious for the management of the metadata. But all this metadata is replicated to Azure for safekeeping.

Since the elmcity project does run on Azure, there’s clearly a strong dependence on that platform’s compute and storage services. But I could run the code on another host — even another cloud-based host, thanks to Amazon’s EC2 for Windows. Likewise I could store blobs and tables in Amazon’s S3 and SimpleDB.

Strategic choices

In this context, the use of FuseCal was a strategic choice. There isn’t a readily available replacement, and that’s a recipe for the sort of disruption we’ve just experienced. But since the system is diversified, that disruption is contained. Was the benefit provided by this unique service worth the cost of disruption? Some curators may disagree, but I think the answer is yes. It was really helpful to be able to show people that informational web pages are implicitly data feeds, and to show what can happen when those data feeds are made explicit.

Still, it was a crutch. Ultimately we want people to stand on their own two feet, and take direct control of the information they publish to the web. FuseCal had to guess which times went with which events, and sometimes guessed wrong. If you’re publishing the event, you want to state these facts unambiguously. And using a variety of methods, as I’ve shown, you can. Those methods are the real strategic choices. If you can publish your own data feed, simply and inexpensively, you should seize the opportunity to do so

Calendar pages successfully parsed by FuseCal