Sam Ruby offers the following advice to those of us who would like to improve the interoperability of iCalendar feeds:
Identifying real issues that prevent real feeds from being consumed by real consumers and describing the issue in terms that makes sense to the producer is what most would call value.
I’ll be documenting issues as I encounter them. Here’s the first: Should feeds use, or not use, blank lines between components? (A component is a chunk of text representing an event, or something else that can show up in an iCalendar file, like a todo item.)
The presence of blank lines is a reason why this feed is one of two I’m tracking that won’t parse in DDay.iCal.
The unmodified feed looks like this:
BEGIN:VEVENT
...stuff...
END:VEVENT

BEGIN:VEVENT
...stuff
END:VEVENT
Part of the “fix” is to make it look like this:
BEGIN:VEVENT
...stuff...
END:VEVENT
BEGIN:VEVENT
...stuff
END:VEVENT
But I’ve put “fix” in air quotes because, well, who’s wrong in this case? The feed producer (in this case, the Keene Chamber of Commerce), or the feed consumer (in this case, DDay.iCal)?
I looked at the spec and didn’t find evidence pointing one way or the other. Neither did this person:
> 1) yes, KOrganizer adds empty lines between VEVENT, VTODO and
> VJOURNAL. I just checked the specification (RFC 2445), and it
> doesn't say anything about blank lines... (neither explicitly
> allowed, nor explicitly not allowed)
This is a perfect example of why the process that Mark Pilgrim and Sam Ruby went through for RSS/Atom feeds will be so valuable for iCalendar feeds. Quite a few details that affect interoperability turn out to depend on assumptions and interpretations that aren’t explicit.
Maybe I’m misreading the spec, and it really does forbid blank lines between components. If so, great, the validator can enforce that rule. But maybe it neither allows nor forbids. In that case, the validator can say so, and suggest a best practice. In this case, my guess is that the best practice would be not to include blank lines.
But I said that removing the blank lines is only part of the “fix”, and here’s why. When I remove them, the feed still won’t parse in DDay.iCal, but for a different reason. Now the problem lies here:
BEGIN:VCALENDAR
X-WR-CALNAME:GKCC
BEGIN:VEVENT
...stuff...
In this case, the reason is clearly stated in the spec. A feed is supposed to include VERSION and PRODID properties like so:
BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//hacksw/handcal//NONSGML v1.0//EN
BEGIN:VEVENT
If I inject those into the Chamber of Commerce feed, and remove blank lines, it parses in DDay.iCal.
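Here is a minimal Python sketch of that cleanup, offered as an illustration rather than as what my aggregator actually runs. The function name `normalize_feed` and the placeholder PRODID value are assumptions of the sketch. It ignores folded (continuation) lines and only handles a feed that starts with BEGIN:VCALENDAR.

```python
def normalize_feed(text, prodid="-//example//placeholder//EN"):
    """Drop blank lines and add VERSION/PRODID after BEGIN:VCALENDAR if absent.

    The default prodid is a made-up placeholder, not a registered identifier.
    """
    # splitlines() copes with CRLF, LF, and CR line endings alike.
    lines = [line for line in text.splitlines() if line.strip()]

    # Calendar-level properties sit between BEGIN:VCALENDAR and the first
    # nested BEGIN: line, so find where that header region ends.
    header_end = next(
        (i for i, line in enumerate(lines[1:], start=1)
         if line.upper().startswith("BEGIN:")),
        len(lines),
    )

    # Property name = text before the first ":" with any ";param" removed.
    names = {
        line.split(":", 1)[0].split(";", 1)[0].upper()
        for line in lines[1:header_end]
    }

    extra = []
    if "VERSION" not in names:
        extra.append("VERSION:2.0")
    if "PRODID" not in names:
        extra.append("PRODID:" + prodid)

    # Only inject when the feed really starts with a VCALENDAR wrapper.
    if lines and lines[0].upper().startswith("BEGIN:VCALENDAR"):
        lines[1:1] = extra

    # iCalendar content lines are delimited by CRLF.
    return "\r\n".join(lines) + "\r\n"
```

Running the result through the function a second time should leave it unchanged, which makes it safe to apply to feeds that are already well formed.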
Note that the unmodified feed is reported to be valid by this iCal4J-based validator. A more robust validator, in the style of the Pilgrim/Ruby RSS/Atom validator, would fail the feed, and would cite the relevant part of the spec in its explanation of the failure.
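To make the idea concrete, here is a rough Python sketch of two such checks. It is not the Pilgrim/Ruby validator, nor any existing iCalendar validator. The function name `validate_feed` and the choice to report blank lines as a warning rather than an error are my assumptions. The section numbers refer to RFC 2445 (4.1 for content lines, 4.7.3 for PRODID, 4.7.4 for VERSION).

```python
def validate_feed(text):
    """Return a list of (severity, line_number, message) findings.

    line_number is None for findings that apply to the feed as a whole.
    """
    findings = []
    calendar_props = set()
    depth = 0  # nesting level: 1 means directly inside VCALENDAR

    for number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            # Assumption: treated as a warning because the spec is silent
            # on blank lines rather than explicitly forbidding them.
            findings.append((
                "warning", number,
                "blank line: the contentline grammar in RFC 2445 "
                "section 4.1 has no production for an empty line",
            ))
            continue

        # Property name = text before the first ":" with any ";param" removed.
        # Folded continuation lines are not handled in this sketch.
        name = line.split(":", 1)[0].split(";", 1)[0].upper()
        if name == "BEGIN":
            depth += 1
        elif name == "END":
            depth -= 1
        elif depth == 1:
            calendar_props.add(name)

    for required, section in (("PRODID", "4.7.3"), ("VERSION", "4.7.4")):
        if required not in calendar_props:
            findings.append((
                "error", None,
                f"missing {required}: required by RFC 2445 section {section}",
            ))
    return findings
```

Each finding carries the spec reference in its message, so a producer who sees the report knows where to look.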
The spec says, by the way, that both VERSION and PRODID are required elements. When I saw that DDay.iCal was rejecting the Chamber of Commerce feed, which contains neither, I figured that was why. And sure enough, it accepts this:
BEGIN:VCALENDAR
VERSION:2.0
PRODID:Keene Chamber of Commerce
X-WR-CALNAME:GKCC
BEGIN:VEVENT
But it also accepts this:
BEGIN:VCALENDAR
VERSION:2.0
X-WR-CALNAME:GKCC
BEGIN:VEVENT
And this:
BEGIN:VCALENDAR
PRODID:Keene Chamber of Commerce
X-WR-CALNAME:GKCC
BEGIN:VEVENT
But not this:
BEGIN:VCALENDAR
PRODID:Keene Chamber of Commerce
BEGIN:VEVENT
Eventually I twigged to the fact that it’s evidently just looking for two (or more) non-empty lines between the BEGINs. For example, this parses:
BEGIN:VCALENDAR
FOO:BAR
BAZ:FOO
BEGIN:VEVENT
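That guess can be written down as a small Python predicate and checked against the headers tried above. This models only the behavior I observed; it is not DDay.iCal's code, and the name `accepted_by_hypothesis` is made up for the sketch.

```python
def accepted_by_hypothesis(text):
    """True if at least two non-empty lines separate BEGIN:VCALENDAR
    from the first nested BEGIN: line."""
    lines = [line.strip() for line in text.splitlines()]
    count = 0
    for line in lines[1:]:
        if line.upper().startswith("BEGIN:"):
            break
        if line:
            count += 1
    return count >= 2


# Headers from the experiments above, paired with what the parser did.
SAMPLES = [
    ("BEGIN:VCALENDAR\nVERSION:2.0\nPRODID:Keene Chamber of Commerce\n"
     "X-WR-CALNAME:GKCC\nBEGIN:VEVENT", True),
    ("BEGIN:VCALENDAR\nVERSION:2.0\nX-WR-CALNAME:GKCC\nBEGIN:VEVENT", True),
    ("BEGIN:VCALENDAR\nPRODID:Keene Chamber of Commerce\n"
     "X-WR-CALNAME:GKCC\nBEGIN:VEVENT", True),
    ("BEGIN:VCALENDAR\nPRODID:Keene Chamber of Commerce\nBEGIN:VEVENT", False),
    ("BEGIN:VCALENDAR\nFOO:BAR\nBAZ:FOO\nBEGIN:VEVENT", True),
]

for header, observed in SAMPLES:
    predicted = accepted_by_hypothesis(header)
    print(predicted == observed, header.splitlines()[1:-1])
```

The predicate agrees with all five observations, and it also predicts the earlier failure of the header that contained only X-WR-CALNAME.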
In practice this isn’t a big deal. None of the metadata matters to me, for my purposes, so my aggregator can just elide it before sending a feed to the parser. But the metadata might matter for someone, for some purpose. A proper validator would help ensure that it will be available to those people, for those purposes, by enabling feed producers and feed consumers to more easily produce and consume valid feeds.
For what it’s worth, I’m going to track this category of issue using the tag icalvalid, and I invite other interested parties to do the same. As in the case of the grl2020 tag, I know the tag can appear in a variety of places including del.icio.us, Technorati, WordPress, and nowadays of course Twitter. So I’ll create a metafeed that tracks icalvalid in all of those places.
Update: OK, here’s the icalvalid metafeed, based on this Yahoo Pipe.
10 thoughts on “iCalendar validation issues #1 and #2: blank lines, PRODID and VERSION”
I would read the following from RFC2445:
“The iCalendar object is organized into individual lines of text,
called content lines. Content lines are delimited by a line break,
which is a CRLF sequence (US-ASCII decimal 13, followed by US-ASCII
decimal 10).”
to support your contention that blank lines are not permitted. “Content lines” are not permitted to be empty:
“contentline = name *(";" param ) ":" value CRLF”
I just want to thank you for digging into this, Jon.
To me, it seems like the first matter (DDay.iCal throwing errors because of an extra line feed) is a point where the author of DDay.iCal should follow Postel’s law: “Be conservative in what you do; be liberal in what you accept from others.” If the spec is somewhat vague on the matter, the parser should have enough robustness to handle a few stray characters (line feeds, in this case).
From your subsequent experiments, though, it sounds like DDay.iCal is quite a fragile hack to start with…
Thank you Jon for doing this research – I believe this is a good step toward improving iCalendar interoperability.
“it sounds like DDay.iCal is quite a fragile hack to start with…”
Peter, please be careful of gross assumptions before flaming another’s work: A close examination of RFC2445’s BNF demonstrates that indeed, blank lines are not explicitly allowed by the standard:
Although it is not explicitly accepted by the standard, I personally felt this issue should be corrected in DDay.iCal, and have subsequently modified it to accept blank lines.
As to the other issue of requiring PRODID and VERSION: both are required by the standard, and hence by DDay.iCal and its parser (Google Calendars will not parse calendars without a VERSION property). Since the parser (ANTLR) is not robust enough to require PRODID and VERSION explicitly, it simply required two or more calendar properties. This has since been modified as well to accept calendars with no properties.
Thanks again Jon – this is excellent work, and I appreciate you letting me know the results of your work. Keep it up!!
> From your subsequent experiments, though,
> it sounds like DDay.iCal is quite a
> fragile hack to start with…
Far from it. The DDay.iCal library is of very high quality. The issue, in my view, is that — lacking a robust validator — we face a lot of uncertainty about how all the libraries I’ve looked at should handle the kinds of feeds we see out in the wild.
> the parser (ANTLR) is not robust
> enough to require PRODID and VERSION
I wondered about that.
> Far from it. The DDay.iCal library is of
> very high quality.
Thank you very much.
> The issue, in my view, is that — lacking a
> robust validator — we face a lot of
> uncertainty about how all the libraries
> I’ve looked at should handle the kinds of
> feeds we see out in the wild.
In your view, would a DDay.iCal-based validator be appropriate or useful? I’m wondering what your thoughts are on finding or creating a robust validator.
A calendar validator would certainly be useful! I’m having Outlook compatibility issues, and a validator would be great!