Why we need an XML representation for iCalendar

On this week’s Innovators show I got together with two of the authors of a new proposal for representing iCalendar in XML. Mike Douglass is lead developer of the Bedework Calendar System, and Steven Lees is Microsoft’s program manager for FeedSync and chair of the XML technical committee in CalConnect, the Calendaring and Scheduling Consortium.

What’s proposed is no more, but no less, than a well-defined two-way mapping between the current non-XML-based iCalendar format and an equivalent XML format. So, for example, here’s an event — the first low tide of 2009 in Myrtle Beach, SC — in iCalendar format:

BEGIN:VEVENT
SUMMARY:Low Tide 0.39 ft
DTSTART:20090101T090000Z
UID:2009.0
DTSTAMP:20080527T000001Z
END:VEVENT

And here’s the equivalent XML:

<vevent>
  <properties>
    <dtstamp>
      <date-time utc='yes'>
        <year>2008</year><month>5</month><day>27</day>
        <hour>0</hour><minute>0</minute><second>1</second>
      </date-time>
    </dtstamp>
    <dtstart>
      <date-time utc='yes'>
        <year>2009</year><month>1</month><day>1</day>
        <hour>9</hour><minute>0</minute><second>0</second>
      </date-time>
    </dtstart>
    <summary>
      <text>Low Tide 0.39 ft</text>
    </summary>
    <uid>
      <text>2009.0</text>
    </uid>
  </properties>
</vevent>

The mapping is quite straightforward, as you can see. At first glance, the XML version just seems verbose. So why bother? Because the iCalendar format can be tricky to read and write, either directly (using eyes and hands) or indirectly (using software). That’s especially true when, as is typical, events include longer chunks of text than you see here.
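To make the mapping concrete, here is a minimal Python sketch that turns the event above into the XML shape shown. It is an illustration under stated assumptions, not the proposal's reference implementation: the function names (`date_time_element`, `vevent_to_xml`) are my own, only DTSTART and DTSTAMP are treated as date-times, and it ignores line folding, property parameters, and text escaping that real iCalendar data would require.

```python
import xml.etree.ElementTree as ET

# Assumption: only these two properties carry date-time values in this sketch.
DATE_TIME_PROPS = {"DTSTART", "DTSTAMP"}

EVENT = """BEGIN:VEVENT
SUMMARY:Low Tide 0.39 ft
DTSTART:20090101T090000Z
UID:2009.0
DTSTAMP:20080527T000001Z
END:VEVENT"""


def date_time_element(value):
    """Build a <date-time> element from a value like 20090101T090000Z."""
    el = ET.Element("date-time")
    if value.endswith("Z"):
        el.set("utc", "yes")
        value = value[:-1]
    date_part, time_part = value.split("T")
    fields = [
        ("year", date_part[0:4]),
        ("month", date_part[4:6]),
        ("day", date_part[6:8]),
        ("hour", time_part[0:2]),
        ("minute", time_part[2:4]),
        ("second", time_part[4:6]),
    ]
    for name, digits in fields:
        # Drop leading zeros, matching the XML example in the post.
        ET.SubElement(el, name).text = str(int(digits))
    return el


def vevent_to_xml(ical_text):
    """Map one unfolded, parameter-free VEVENT block to a <vevent> element."""
    vevent = ET.Element("vevent")
    props = ET.SubElement(vevent, "properties")
    for line in ical_text.strip().splitlines():
        line = line.strip()
        if not line:
            continue
        name, _, value = line.partition(":")
        if name in ("BEGIN", "END"):
            continue
        prop = ET.SubElement(props, name.lower())
        if name in DATE_TIME_PROPS:
            prop.append(date_time_element(value))
        else:
            ET.SubElement(prop, "text").text = value
    return vevent


root = vevent_to_xml(EVENT)
print(ET.tostring(root, encoding="unicode"))
```

The properties come out in the order they appear in the input rather than the order used in the XML example, which should not matter to an XML consumer.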

I make an analogy to the RSS ecosystem. When I published my first RSS feed a decade ago, I wrote it by hand. More specifically, I copied an existing feed as a template, and altered it using cut-and-paste. Soon afterward, I wrote the first of countless scripts that flowed data through similar templates to produce various kinds of RSS feeds.

Lots of other people did the same, and that’s part of the reason why we now have a robust network of RSS and Atom feeds that carries not only blogs, but all kinds of data packets.

Another part of the reason is the Feed Validator which, thanks to heroic efforts by Mark Pilgrim and Sam Ruby, became and remains the essential sanity check for anybody who’s whipping up an ad-hoc RSS or Atom feed.

No such ecosystem exists for iCalendar. I’ve been working hard to show why we need one, but the most compelling rationale comes from a Scott Adams essay that I quoted from in this blog entry. Dilbert’s creator wrote:

I think the biggest software revolution of the future is that the calendar will be the organizing filter for most of the information flowing into your life. You think you are bombarded with too much information every day, but in reality it is just the timing of the information that is wrong. Once the calendar becomes the organizing paradigm and filter, it won’t seem as if there is so much.

If you buy that argument, then we’re going to need more than a handful of applications that can reliably create and exchange calendar data. We’ll want anyone to be able to whip up a calendar feed as easily as anyone can now whip up an RSS/Atom feed.

We’ll also need more than a handful of parsers that can reliably read calendar feeds, so that thousands of ad-hoc applications, services, and scripts will be able to consume all the new streams of time-and-date-oriented information.

I think that a standard XML representation of iCalendar will enable lots of ad-hoc producers and consumers to get into the game, and collectively bootstrap this new ecosystem. And that will enable what Scott Adams envisions.

Here’s a small but evocative example. Yesterday I started up a new instance of the elmcity aggregator for Myrtle Beach, SC. The curator, Dave Slusher, found a tide table for his location, and it offers an iCalendar feed. So the Myrtle Beach calendar for today begins like this:

Thu Jul 23 2009

WeeHours

Thu 03:07 AM Low Tide -0.58 ft (Tide Table for Myrtle Beach, SC)

Morning

Thu 06:21 AM Sunrise 6:21 AM EDT (Tide Table for Myrtle Beach, SC)
Thu 09:09 AM High Tide 5.99 ft (Tide Table for Myrtle Beach, SC)
Thu 10:00 AM Free Coffee Fridays (eventful: )
Thu 10:00 AM Summer Arts Project at The Market Common (eventful: )
Thu 10:00 AM E.B. Lewis: Story Painter (eventful: )

Imagine this kind of thing happening on the scale of the RSS/Atom feed ecosystem. The lack of an agreed-upon XML representation for iCalendar isn’t the only reason why we don’t have an equally vibrant ecosystem of calendar feeds. But it’s an impediment that can be swept away, and I hope this proposal will finally do that.

25 thoughts on “Why we need an XML representation for iCalendar”

  1. There was some work done to map iCalendar to RDF, which of course has an XML serialization as well. They created some URIs and proof-of-concept tools to round-trip, but as far as I know it pretty much ended there.

    Actually, I think RDF is a little more suited to this than XML. Especially RDFa, which would allow you to mark up the same document with human- and machine-readable information about an event with minimal repetition.

    hCalendar is another approach to machine-data-within-human-data.

  2. But, why split up the time coordinates? ISO8601, or rather its XML Schema subset, is widespread in use and well supported by XML technologies… E.g. in XSLT2 it’s much easier to work with an ISO8601 date than it would be with this… plus it’s very verbose.

    1. “Why split up the time”

      Here is the stated rationale:

      “By separating the values into different elements, it becomes much easier to transform or query over the resulting XML.

      For instance, given a set of xCal elements representing events, it is easy to find the events that start at a particular time or in a particular year, month, or day. Given a date / time value in the xCal format described here, it would be easy to transform it to HTML while removing unneeded sub-elements; no additional parsing is needed.”

      1. If the software library or whatever that’s reading the XML supports standard data types, then it should be able to parse an ISO8601 date/time just fine. It should: there are a few standard time data types incorporated right into XML: http://www.w3.org/TR/xmlschema-2/#dateTime .

        In general, you err on the side of too much structure, which is not bad, but can make life a bit harder for implementors.

        The minimal version is

        Low Tide 0.39 ft

        Either way, it would be useful.

        The main problem though with current ics is not the syntax per se, but that it allows for some wide variation between valid files (as I think you’ve discussed earlier).

      2. Second try at including angle brackets in a comment…

        <vevent dtstart="2009-01-01T09:00:00Z" uid="2009.1" dtstamp="2008-05-27T00:00:01Z">
        <summary>Low Tide 0.39 ft</summary>
        </vevent>
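The trade-off debated in this thread can be sketched in a few lines of Python. This is a rough comparison, not anything taken from the proposal: the `<calendar>` wrapper element, the second event in each sample, and both function names are made up for illustration. The first function finds events starting in a given year using the split-out elements; the second does the same against the compact attribute form by parsing the ISO 8601 string.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

# Split form, as in the post. The <calendar> wrapper and second event are invented.
SPLIT_XML = """<calendar>
  <vevent><properties>
    <dtstart><date-time utc='yes'>
      <year>2009</year><month>1</month><day>1</day>
      <hour>9</hour><minute>0</minute><second>0</second>
    </date-time></dtstart>
    <summary><text>Low Tide 0.39 ft</text></summary>
  </properties></vevent>
  <vevent><properties>
    <dtstart><date-time utc='yes'>
      <year>2008</year><month>12</month><day>31</day>
      <hour>21</hour><minute>0</minute><second>0</second>
    </date-time></dtstart>
    <summary><text>Made-up earlier event</text></summary>
  </properties></vevent>
</calendar>"""

# Compact form, as suggested in the comment above (same invented second event).
COMPACT_XML = """<calendar>
  <vevent dtstart="2009-01-01T09:00:00Z"><summary>Low Tide 0.39 ft</summary></vevent>
  <vevent dtstart="2008-12-31T21:00:00Z"><summary>Made-up earlier event</summary></vevent>
</calendar>"""


def starts_in_year_split(root, year):
    """Compare the <year> element's text directly; no date parsing needed."""
    return [
        ev.findtext("properties/summary/text")
        for ev in root.iter("vevent")
        if ev.findtext("properties/dtstart/date-time/year") == str(year)
    ]


def starts_in_year_iso(root, year):
    """Parse the ISO 8601 attribute, then compare the resulting year."""
    return [
        ev.findtext("summary")
        for ev in root.iter("vevent")
        if datetime.strptime(ev.get("dtstart"), "%Y-%m-%dT%H:%M:%SZ").year == year
    ]


split_root = ET.fromstring(SPLIT_XML)
compact_root = ET.fromstring(COMPACT_XML)
print(starts_in_year_split(split_root, 2009))
print(starts_in_year_iso(compact_root, 2009))
```

Either way the query is short. The split form avoids a parsing step, while the compact form leans on a date library that most environments already provide.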

  3. It is not just length, but richness of text that is important in real-world event calendar applications. Something like CalDAV/iCalendar, for example, is great for scheduling appointments, but what if you want to write an event profile or preview for others. The lack of ability to include rich text in these “structured” formats is limiting compared to, for example, an HTML microformat alternative like hCalendar.

  4. “The lack of ability to include rich text in these “structured” formats is limiting compared to, for example, an HTML microformat alternative like hCalendar.”

    I know what you mean, and sympathize.

    Yet I think we need to remind ourselves of the power that can be unleashed when we embrace constraints. The obvious contemporary example is Twitter. Its brutally small and simple packets make it an efficient routing system that interconnects many flavors of richness external to it.

    I have come to regard iCalendar in the same way. And FWIW, the data structure at the core of my aggregator is way simpler even than what iCalendar can represent. Just these fields:

    title (open lap swim)
    source (ymca)
    url (http://…)
    dtstart (7AM)
    categories (recreation)

    The strategy is:

    – Low activation threshold for feed producers to join calendar syndication networks

    – Router, not repository

    – Use the network to coordinate notices about events, but expect that users will always visit the authoritative sources for full details and context

  5. I am *so* glad to see this, Jon. Thanks for pointing it out. The RDF serialization never seemed simple and intuitive to me, but then I’ve never been able to get RDF either, so it’s probably my own limitations.

  6. I guess that wave crested without anybody hopping on, so now there’s another coming along.

    From my perspective, agreement on anything reasonable (for XML and, as Jake points out, nowadays probably for JSON too) is preferable to no agreement.

  7. Jon,

    I think hCalendar actually solves the use cases you present for “an XML representation for iCalendar”, and there’s already an ecosystem of hCalendar generators, processors, search engines (Yahoo! Searchmonkey) and validators.

    The Optimus microformats transformer and validator will for example validate your hCalendar for you:

    http://microformats.org/wiki/optimus

  8. Hi Tantek,

    That’s a really interesting point. Question: There are infinitely many ways to represent a chunk of iCalendar in hCalendar. But is there a canonical way to do it? If so that might indeed meet the requirement for round-trip conversion between iCalendar and something angle-brackety.

    Another question: How much validation does Optimus do? I ask because the current best iCalendar validator, http://severinghaus.org/projects/icv/, isn’t nearly as complete or robust as the RSS/Atom analog at http://feedvalidator.org/. Over at http://icalvalid.wikidot.com/ I’ve been doing some groundwork with Ben Fortuna and Doug Day but we haven’t gotten very far yet.

  9. One of the reasons for an XML format in addition to hCalendar is that it’s better suited to data interchange, as opposed to presentation.

    A better reason is that we know from folks like Eventful and others that people are embedding calendar data in XML today in a variety of incompatible ways. It’s definitely progress if we can converge on a single XML format.

  10. > It’s definitely progress if we can
    > converge on a single XML format.

    Agreed.
    Thought experiment: What if there were a canonical way to write hCalendar for purposes of data interchange?

  11. > What if there were a canonical way to
    > write hCalendar for purposes of data
    > interchange?

    On second thought: Nah, dumb idea, hCalendar sans presentation wouldn’t make any sense, would it?

  12. Arguments for XML:

    First, EVERYTHING ELSE uses it! While RDF may be better suited, and JSON sleek and new, we should be pushing for a general information standard. XML should feed into or reference RDF, and it easily can be converted to other, similar representations like JSON. Realistically this data won’t (or shouldn’t) exist in any format until it is pulled from a source like a database, so the best choice should be the standard that can be generated easily by hand (for those of us with simple needs or quick tasks that won’t justify a db), AND can be easily served up by a script or directly from a database. This is one of the goals as I see it of XML: interchangeability.

    Secondly, wider client support. And I mean both editors and browsers. I can currently create an SVG in Inkscape, tweak it in Notepad++, open it in Firefox or Gimp, and, if I’m feeling crazy, send it to a screen printer who can open it in Illustrator and make me a wicked huge print of it. I can (to some extent), embed that SVG into my site’s XHTML along side some MathML. Someday I might even get to drop some MusicML somewhere in there so that a visitor can get the tabs for ‘Yesterday’ and even get the browser to play it while they read it.

    Now wouldn’t it be great if I could make my staff’s work calendar in Sunbird, tweak it in Notepad++, drop it directly into my schedule page’s XHTML and know that when they visit the site in something other than IE, they can not only get a good output of their schedule, but can add it to their Google calendar or send it straight to their phone, maybe without even needing an add on?

    The thing that I like the least about hCalendar is that it’s trying (very nobly and cleverly, don’t get me wrong) to put scheduling info on top of HTML. The idea of microformats is great if the data is going to be there anyway in that format already and you want to add that extra layer. But most of the time we just want the schedule, not some divs and list items that also have that extra feature.

    So XML is certainly the way to go. It makes the data more interchangeable between devices and applications, more embeddable/interactive with other pre-existing XML standards, and more available for editing by both humans and machines.

    PS: Is there ANY actual implementation of X-Cal that is current and working? I’d really like to see a full example that is both valid and can be imported into Outlook or GCalendar.

  13. While hCalendar may not be XML at some “formal” level, it is for all practical purposes the current standard XML-compatible form of iCalendar data. Existing libraries such as ical4j treat it as such with round-tripping and all (obviously there is more than one possible byte stream resulting from transformation to hCalendar format, but that is par for the course with XML).

    Standardizing on preferred round trip rules between iCalendar and hCalendar, or adding extra decorations to hCalendar to ease parsing by generic XML tools would probably be good, but creating a new incompatible format is just creating more problems for everyone.

  14. Hey, this is really interesting and I’m glad I stumbled upon your blog. The only thing I knew about XML was “sitemap”. But now you have made an impact. I’ll add your site to my blog.
