Feed validation revisited: The parallel universe of iCalendar feeds

If you were tuned into the blogosphere back in 2001, you’ll recall lots of chatter about RSS feed validation. RSS came in multiple flavors. Anyone could whip up a feed purporting to be in one or another of those formats, and many of us did. There were all kinds of questions about how and why feeds did or didn’t conform to the various specifications.

Nowadays we have even more flavors. There’s RSS 2.0. And there’s Atom, which isn’t a member of the RSS family at all but a different species of feed format. And yet you rarely hear about problems with feeds that can’t be read and processed by feedreaders.

I think there are two reasons why RSS/Atom-style feeds work pretty well nowadays. First, there’s the Feed Validator. Mark Pilgrim and Sam Ruby put a huge amount of effort into this excellent tool. Why? Here is their explanation:

Despite its relatively simple nature, RSS is poorly implemented by many tools. This validator is an attempt to codify the specification (literally, to translate it into code) to make it easier to know when you’re producing RSS correctly, and to help you fix it when you’re not.

The second reason is that RSS/Atom-style syndication has been happening in a lot of places for a long time now. A lot of people have used, and helped to refine, the tools and techniques.

Now I’m exploring the parallel world of calendar syndication, using ICS feeds instead of RSS/Atom feeds. And it feels like 2001 all over again. There are ICS feeds out there, but nowhere near as many as RSS/Atom feeds. And my hunch is that even when ICS feeds are published, they’re often unused, so there isn’t enough feedback to flush out problems. Finally, the ICS equivalent of the RSS/Atom Feed Validator — a service called iCalendar Validator, based on a Java library called iCal4j — isn’t anywhere near as comprehensive and informative as the RSS/Atom Validator.

Here’s a chart that lists the iCalendar feeds currently being collected by the elmcity.info calendar aggregator.

feed                    | producer      | valid in iCal4j | loads with DDay.iCal | loads with iCalendar.py | loads with vObject
------------------------+---------------+-----------------+----------------------+-------------------------+-------------------
armadillos              | google        | yes             | yes                  | yes                     | yes
aveo                    | google        | yes             | yes                  | yes                     | yes
chamber of commerce     | homegrown     | yes             | no                   | yes                     | yes
cheshire democrats      | google        | yes             | yes                  | yes                     | yes
frost free library      | drupal        | no              | no                   | yes                     | no
fuzzy logic             | google        | yes             | yes                  | yes                     | yes
gilsum church           | google        | yes             | yes                  | yes                     | yes
hannah grimes           | drupal        | yes             | yes                  | yes                     | no
keene high soccer       | google        | no              | yes                  | yes                     | yes
keene public library    | fusecal       | yes             | yes                  | yes                     | yes
keene state bodyworks   | google        | yes             | yes                  | yes                     | yes
mmama cinema            | google        | yes             | yes                  | yes                     | yes
mmama dance             | google        | yes             | yes                  | no                      | no
mmama music             | google        | yes             | yes                  | yes                     | yes
mmama visual            | google        | yes             | yes                  | yes                     | yes
monadnock folk          | wordpress ec3 | yes             | yes                  | yes                     | yes
monadnock regional high | unknown       | no              | yes                  | yes                     | yes
swamp bats              | google        | yes             | yes                  | yes                     | yes
town of gilsum          | google        | yes             | yes                  | yes                     | yes
unh coop extension      | homegrown     | no              | yes                  | yes                     | yes
upcoming                | yahoo         | no              | yes                  | yes                     | yes
ymca                    | google        | yes             | yes                  | yes                     | yes

As you can see, the results are all over the map. Some purportedly valid feeds won’t load using one iCalendar library, some won’t load using another. Some purportedly invalid feeds do load.

I expect things will get worse before they get better. There are only a handful of different ICS producers represented here, but the two labeled homegrown were created directly or indirectly in response to my project. If we recapitulate the RSS/Atom experience with ICS, and lots more ad-hoc ICS feeds arrive on the scene, charts like this will go even redder.

To make them go green, we’ll need a more robust ICS validator.
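
To give a sense of what even the most basic layer of such a validator has to check, here is a minimal sketch in Python. It is not how iCal4j or any of the libraries in the chart work, and the names `unfold` and `check_ics` are made up for this illustration. It assumes only three rules from the iCalendar specification: folded lines continue with a leading space or tab, BEGIN/END pairs must nest properly, and a VCALENDAR must carry VERSION and PRODID properties.

```python
REQUIRED_CALENDAR_PROPERTIES = ("VERSION", "PRODID")


def unfold(text):
    """Join folded lines: a line starting with a space or tab continues the previous one."""
    lines = []
    for raw in text.replace("\r\n", "\n").split("\n"):
        if raw[:1] in (" ", "\t") and lines:
            lines[-1] += raw[1:]
        elif raw:
            lines.append(raw)
    return lines


def check_ics(text):
    """Return a list of problems; an empty list means these basic checks passed."""
    problems = []
    open_components = []          # stack of component names from BEGIN lines
    calendar_properties = set()   # property names seen directly inside VCALENDAR
    for line in unfold(text):
        name, colon, value = line.partition(":")
        if not colon:
            problems.append("line has no ':' separator: " + line)
            continue
        # Drop any parameters (;TZID=..., ;VALUE=...) to get the bare property name.
        name = name.split(";")[0].strip().upper()
        value = value.strip().upper()
        if name == "BEGIN":
            open_components.append(value)
        elif name == "END":
            if open_components and open_components[-1] == value:
                open_components.pop()
            else:
                problems.append("END:" + value + " does not match an open BEGIN")
        elif open_components == ["VCALENDAR"]:
            calendar_properties.add(name)
    if open_components:
        problems.append("unclosed component(s): " + ", ".join(open_components))
    for required in REQUIRED_CALENDAR_PROPERTIES:
        if required not in calendar_properties:
            problems.append("VCALENDAR is missing " + required)
    return problems
```

The sketch collects every problem it finds rather than stopping at the first one, since the point of a validator is to tell the publisher what to fix. A real validator would go much further, checking value types, time zones, and recurrence rules.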

11 Comments

  1. I originally intended vobject to provide validation in addition to parsing of icalendar data, but I ended up focusing more on being liberal in what I accepted and working around bugs in major clients.

    (as an aside, I’d be fascinated to see which of your feeds vobject fails to parse).

    The validation scaffolding is still in there, though, and it wouldn’t be particularly hard to improve.

    It’d be great to collect a snapshot of failing icalendar files and catalog them just so implementers in all languages could see which ones they fail on, how! I think the folks at CalConnect came up with test files like this at some point, but I don’t think they were made public.

    I keep meaning to make a GAE icalendar validator using vobject, maybe one of these weekends.

    P.S. If you aren’t already planning on it, you might try to attend the CalConnect at Microsoft’s Redmond campus, http://www.calconnect.org/calconnect14.shtml

  2. I created the UNH Cooperative Extension application, and didn’t know about the validator. I fixed the feeds so they validate. The problem was not adding ;VALUE=DATE to all day events where I set the DTSTART and DTEND to date values.

    I’m using iCalendar.py on my GAE project (www.mashical.com) and have felt the pain of ics formats. For example, Yahoo sports calendars use TZID in the DTSTART and DTEND, which iCalendar.py doesn’t handle natively. I had to use a regular expression and pytz to take the timezone information and transform the start and end times.

    Also, when looking around my community (New Boston, NH) to see if I could create an aggregated calendar, like elmcity.info, I found nary an iCalendar enabled calendar!

  3. “I’d be fascinated to see which of your feeds vobject fails to parse”

    I updated the chart above to include vObject. As you can see, it seems to like everything except the two Drupal-produced feeds.

    “It’d be great to collect a snapshot of failing icalendar files and catalog them just so implementers in all languages could see which ones they fail on, how!”

    Absolutely. That’s another benefit of maintaining the feed registry in an open, queryable location:

    http://del.icio.us/elmcity/trusted+ics+feed

    “I think the folks at CalConnect came up with test files like this at some point, but I don’t think they were made public.”

    In any case, the stuff that comes up in the wild is what’s most interesting to me.

    “If you aren’t already planning on it, you might try to attend the CalConnect at Microsoft’s Redmond campus, http://www.calconnect.org/calconnect14.shtml”

    Hmm. Perhaps. Although I don’t think that the use case I’m pursuing here is of much interest to that community.

  4. “I fixed the feeds so they validate.”

    Although, as we see above, their former non-validity didn’t prevent three iCalendar libraries from parsing the feeds.

    “I had to use a regular expression and pytz to take the timezone information and transform the start and end times.”

    And therein lies the dilemma. If those changes are required, a robust validator should prescribe them.

    “Also, when looking around my community (New Boston, NH) to see if I could create an aggregated calendar, like elmcity.info, I found nary an iCalendar enabled calendar!”

    This, of course, is the ultimate purpose of my project. ICS feeds want to be as superabundant as RSS feeds. They aren’t, and for no better reason than that it just doesn’t occur to people to publish them.

    Changing that is an exercise in politics more than in software development. Jim Groom’s going to try to make Fredericksburg VA a sister city to Keene w/respect to this idea:

    http://bavatuesdays.com/a-calendar-year/

    Maybe you can do the same for your region of NH?

  5. I would like the feed validator to have a web service interface, so I can put a more neutral UI on it, stressing what I feel is important and de-emphasizing what I don’t feel is important.

    I would then be able to do more with it than I have been lately — because I can’t recommend something that makes warnings look like errors, esp when they’re so much a matter of taste and don’t reflect anything the spec says or doesn’t say.

    I was looking for a recent thread that mentioned the validator and this is the one I found — I thought you might agree that a web service interface for the validator makes sense.

    Hope all is well and that you’re having a pleasant and healthy 2009!

  6. > I thought you might agree that a web
    > service interface for the validator makes
    > sense.

    I do. Of course I also think a web service interface on just about everything makes sense.

    I hope that if iCalendar validation advances in 2009, it’ll take that approach.

    I’d also love to see it follow the approach taken at feedvalidator.org w/respect to test cases.

    http://feedvalidator.org/testcases/

    Although the validator itself is written in Python, the tests are purely declarative and completely decoupled from the implementation. A different validator could work off the same tests, and multiple validators could serve as checks on one another.

    What’s more, if that happened, the ensuing conversation would doubtless improve the breadth and quality of the (already excellent) test suite.
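
Two ideas from the comments above can be sketched together in Python: the missing ;VALUE=DATE problem described in comment 2, and test cases kept as plain data so that more than one validator can run them. This is only an illustration. The case format, `TEST_CASES`, `date_value_problems`, and `run_cases` are all hypothetical and are not taken from feedvalidator.org, iCal4j, or any of the libraries mentioned. It assumes a validator is any function that takes ICS text and returns a list of problems.

```python
import re

# A DTSTART or DTEND line whose value is a bare date (YYYYMMDD, no time part).
DATE_ONLY = re.compile(r"^(DTSTART|DTEND)((?:;[^:]*)?):(\d{8})\s*$", re.IGNORECASE)


def date_value_problems(ics_text):
    """Flag date-only DTSTART/DTEND values that lack the VALUE=DATE parameter."""
    problems = []
    for line in ics_text.splitlines():
        match = DATE_ONLY.match(line)
        if not match:
            continue
        params = match.group(2).upper().split(";")
        if "VALUE=DATE" not in params:
            problems.append(match.group(1).upper() + " has a date value but no ;VALUE=DATE")
    return problems


# Test cases are plain data: a name, some ICS text, and the expected verdict.
TEST_CASES = [
    {"name": "all-day event with VALUE=DATE",
     "ics": "DTSTART;VALUE=DATE:20090115\nDTEND;VALUE=DATE:20090116",
     "expect_valid": True},
    {"name": "all-day event without VALUE=DATE",
     "ics": "DTSTART:20090115\nDTEND:20090116",
     "expect_valid": False},
    {"name": "timed event",
     "ics": "DTSTART:20090115T190000Z\nDTEND:20090115T210000Z",
     "expect_valid": True},
]


def run_cases(validator, cases):
    """Return the names of cases where the validator disagrees with the expected verdict."""
    failures = []
    for case in cases:
        is_valid = not validator(case["ics"])
        if is_valid != case["expect_valid"]:
            failures.append(case["name"])
    return failures
```

Calling `run_cases(date_value_problems, TEST_CASES)` reports which cases a given validator gets wrong. Because the cases know nothing about the function that consumes them, a second validator written in another language could read the same data and serve as a check on the first.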
