Azure calendar aggregator: Part 1

For about a week now, I’ve been running a service in the Azure cloud that aggregates calendar events from Eventful.com and from a diverse set of iCalendar feeds. As I mentioned last month, my aim is to recreate and then extend my experimental elmcity.info community information hub, while exploring and documenting the evolution of Azure and the layered services emerging on top of it.

I haven’t written a whole lot about programming here for a while, because I’ve trying to to explain the whys and wherefores of syndication-oriented communication to a wider audience. But as I build out this service I’m learning a lot about cloud-based software development in general, and about Azure in particular, and I want to narrate this work. I’ll try to do it in a way that will inform developers who currently use Microsoft tools and technologies, as well as those who don’t. But I’ll also try to be accessible to folks who don’t write software, yet would like to learn something about the opportunities that cloud computing is creating as well as the challenges it poses.

The service, as it currently exists, is running as an Azure worker role. That means it does input, processing, and output, but presents no user interface. The inputs are Eventful.com, accessed by way of its API, and a growing set of public iCalendar feeds. The processing involves reading calendar events and normalizing them to a common intermediate format. The output is currently XML to the Azure blob store, one file for Eventful and another for the iCalendar feeds.

I’m only allocating one instance of this worker process, and that’s probably enough horsepower for any single community’s events. But I’d like to be able to scale out the aggregator to serve other communities as well, potentially many others. Turning up the dial to do that would be a nice illustration — and test — of the cloud computing fabric.

The existing aggregator at elmcity.info is written in Python, and my original plan was to port it with minimal change to IronPython on Azure. That didn’t work out because, although bare-bones IronPython code runs on Azure as I show here, you quickly run into restrictions imposed by Azure’s security sandbox. The trust policy, defined here, is based on a feature of the .NET platform known as code access security (CAS).

When you upload code to the Azure cloud, or run it in the local development fabric, the hosting environment only partly trusts your code, and also only partly trusts any components used by your code. This is part of a layered, defense-in-depth security strategy, prudent for the same reason that it’s prudent to run your own computer as a partly-trusted user instead of an all-powerful administrator. It is also problematic for the same reason. A lot of Windows applications used to require administrative privilege in order to run properly, and some — though fewer month by month — still do. Similarly, a lot of .NET components that run happily in the fully-trusted environment of your local computer won’t run in Azure’s medium-trust environment, or (what’s nearly equivalent) in Internet Information Server 7 (IIS 7) when its security mode is set to medium trust.

I am no expert on the subject of code access security, but here’s what I think:

  • The medium-trust policy is probably a good thing.
  • It does, however, impede instant gratification when you’re mixing components from various sources.
  • But that impedance will diminish as more component builders adopt the good practice of not making their components unnecessarily require full trust.

I think that IronPython is likely to become such a component, once the dust settles from the recent 2.0 release. (If you care about this issue, you can vote up its priority.) Meanwhile I’ve been working in C#, which has been a fascinating experience. On the one hand, I believe that dynamic languages like Python are excellent choice for agile development everywhere, and especially in the fluid environment of the cloud. On the other hand, I’m not a language bigot and have always appreciated the virtues of statically-typed languages.

My basic philosophy has always been to use a mix of best-of-breed tools in order to gain maximum leverage. The combination of IronPython and C#, on the .NET platform, is a really powerful one, for the same reason that the Jython/Java combo is. On this project, even though I am not yet deploying any code written in IronPython, I often use IronPython to test C# components that I’ve written or acquired.

Along the way, I’ve been recalling something IronPython’s creator, Jim Hugunin, said at the Professional Developers Conference back in October. Jim’s talk followed one by Anders Hejlsberg, the creator of C#. Anders showed an experimental future version of C# that makes use of the Dynamic Language Runtime which supports IronPython and IronRuby on .NET. The effect was to create an island of dynamic typing within C#’s otherwise statically-typed world. We all appreciated the delicious irony of a static type called ‘dynamic’.

Jim might have sounded a bit wistful when he said: “I’m not sure what a dynamic language is any more.” But I think this blurring of boundaries is a wonderful thing. Many smart people I deeply respect value the static typing of C#. Some of the same smart people, and many different ones, value the dynamic typing in languages like Ruby and Python. If I can leverage the union of what all of those smart people find valuable, I’ll happily do so.

I’ll have more to say about this project, and of course code to share, as things evolve. Meanwhile, though, I want to acknowledge Doug Day at DDay Software. When I switched from Python to C#, the key component I needed was an iCalendar module equivalent to MaxM’s excellent Python iCalendar module, which I’m using at elmcity.info. Doug’s DDay.iCal met the need. It’s a solid, cleanly-built, open source .NET component that enables code written in any of the .NET family of languages to parse, and generate, iCalendar (RFC 2445) files.

And now back to the project, which reminds me of the era at BYTE during which I got to build stuff while writing about what I was building. It’s great fun. And as John Leeke so eloquently says, it engages the mind, the hands, and the heart.

5 Comments

  1. Jon,

    Please do keep us updated as your proof of concept progresses. I’ve joined the tech-preview for Azure, but have not had the time available to use it yet (which is why I’m interested in other’s opinions!). Thanks for sharing what you learn and experience.

  2. Don’t get me wrong, I’ve always respected your work and even didn’t mind seeing you go to work for Microsoft (it felt like one of us was infiltrating the tower). But hearing you say “The medium-trust policy is probably a good thing” is heart breaking. Either give me a machine in the cloud to work on our don’t (anything less is censorship), I’m not going to learn security stuff (and I don’t think you are going to really take the time too either: if so do please do tell, maybe it takes someone like you to convince us otherwise?). As always and with the utmost respect, still enjoying your blog/podcasts and thoughts on mixed dynamic/static languages for us died in the wool python/ruby zealots is appreciated. But please don’t say security is a good thing for the web: “the web is agreement”. Sincerely, Disgusted of Tumbridge Wells

  3. “Either give me a machine in the cloud to work on our don’t (anything less is censorship)”

    I’d rather have the opportunity to self-censor. And on Amazon EC2 I have that opportunity. That said, when I’ve used EC2 VMs I have been running as root. Why? No good reason, just path of least resistance.

    Do you routinely run as root on your personal box, and on hosted boxes? If so, you can do that on EC2, and I suspect you’ll be able to on raw Azure VMs too. But setting the default to something less potent is…well, think about it. Have you ever condemned Microsoft for not being secure by default? How do you square that with condemning Microsoft for being secure by default?

    More broadly, the cloud environment is going to challenge a lot of long-held assumptions in what I think will be useful ways. Less so for raw VM hosting a la Amazon, more so for the kinds of “fabrics” of which App Engine and Azure are examples.

  4. I’m glad I happened upon this article. I’m a developer evangelist for Microsoft, and one of the things I’ve been working on lately is a web site (link above) for helping user groups and other folks in the community promote and discover events of interest to them. I’ve had an iCalendar feed for most of the life of the site, and also provide .ics attachments in the RSS feed for each event using enclosures, so I’m a fellow proponent of the use and availability of iCalendar.

    I just gave a presentation on Azure services at the MSDN Developer Conference in Reston, VA, and after the event was discussing with some of the local community folks ways to improve the process of getting data into the Community Megaphone site, including the possibility of having a worker process in the Azure cloud that could monitor an Azure queue, waiting for messages passed in by other sites or applications. It hadn’t occurred to me to try parsing external iCalendar feeds, but that’s a great idea.

    I’m curious, is your XML ‘intermediate’ format essentially an XML recreation of the iCalendar format, or are you using your own custom schema?

    Looking forward to reading more as your project progresses. Do you envision sharing any of the code at some point?

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s