URI, XML, HTTP, REST, and the Azure Services Platform

When friends and family ask about the Professional Developers Conference I attended this week, I tell them it’s kind of like Microsoft’s State of the Union address. I’ve been to a number of these over the years. This was my first as an employee, and Microsoft’s first as a company fully committed to what I believe are the right principles, patterns, and practices. That’s a big statement, and as always you should consider the source and take it for what it’s worth. But if you’ve followed my work over the years, you’ll spot many familiar themes in the following exegesis of the day two keynote by Don Box and Chris Anderson, and you’ll know why this PDC put a huge smile on my face.

In case you’re unfamiliar with the theatrical genre I call PDC performance art, I should briefly explain. Traditionally, at this show attended by thousands of software developers, a few of Microsoft’s technical leaders come to the stage, write small programs on the fly, and run them. These daring high-wire acts are humorous and entertaining, but also deeply informative. The live code exercises new platform technologies, and tells stories about why and how the audience might want to apply those technologies.

The story that Don and Chris told began with a simple web service, running on a demo machine, that printed out a list of processes — effectively, a Unix ps (process status) command. It was built using several key components and features of the .NET Framework: LINQ (Language Integrated Query) to query for and enumerate the list, WCF (Windows Communication Foundation) to package the query as an HTTP-accessible service, UriTemplate to control the namespace of that service, SyndicationFeed to format the response as an Atom feed, and ServiceHost to run the service on the local machine.

When it ran, this program enabled a browser running on the local machine to surf to a service running on the local machine and view its process list as an Atom feed. This colocation of web client and web service on the local machine is a key pattern that I first explored a decade ago. Dave Winer named the pattern Fractional Horsepower HTTP Server and put it to excellent use in his pioneering blog tool Radio UserLand. The pattern embodies a key underlying principle: symmetry. We have long been conditioned to think of the Internet in terms of clients versus servers (and now services), but that’s an artificial distinction. In the terminology of TCP/IP networking, there are no servers and clients, there are only hosts — that is, peer nodes communicating directly with one another. Firewalls and NATs abolished that symmetry. The newly-announced Azure Services Platform is a technology that can help us restore it.
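To make the pattern concrete, here’s a rough Python sketch (the keynote demo used C# and WCF) of a fractional horsepower HTTP server: it publishes the local process list as an Atom feed on localhost. The port, the Unix-style ps command, and the URI layout are my assumptions, not the demo’s.

import BaseHTTPServer, commands

def process_feed():
    # shell out to ps and wrap each process in a bare-bones Atom entry
    entries = ''
    for line in commands.getoutput('ps -e').splitlines()[1:]:
        fields = line.split()
        pid, name = fields[0], fields[-1]
        entries += ('<entry><title>%s</title>'
                    '<id>http://localhost:8080/service/Process?id=%s</id>'
                    '</entry>\n' % (name, pid))
    return ('<?xml version="1.0"?>\n'
            '<feed xmlns="http://www.w3.org/2005/Atom">\n'
            '<title>processes</title>\n%s</feed>' % entries)

class Handler(BaseHTTPServer.BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(200)
        self.send_header('Content-Type', 'application/atom+xml')
        self.end_headers()
        self.wfile.write(process_feed())

BaseHTTPServer.HTTPServer(('localhost', 8080), Handler).serve_forever()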

The next step was to extend the program, adding the ability to kill any of the running processes. The Atom feed was already modeling the process list as a set of URI-addressable resources. To implement the feature in a simple, standard, and discoverable way, it was only necessary to apply the HTTP DELETE verb to those resources. Internally, the program of course had to implement a DeleteProcess method. But that method name need not, and according to RESTful best practices should not, appear in the service’s API. And happily, the service did not — as do so many purportedly RESTful services — expose any URIs that look like this:

http://localhost/service?method=delete&process=123

Instead it only exposed URIs that look like this:

http://localhost/service/Process?id=123

An HTTP GET method, invoked on this URI, returned information about the process. An HTTP DELETE method, invoked on the same URI, accomplished the kill function, and did so without violating the RESTful principle of interface uniformity. Later on we’ll see a nice example of the benefits of that uniformity. But here, let’s notice another key principle at work. I’ve said that the kill operation was discoverable. That’s true thanks to the Atom Publishing Protocol. It defines a hyperlink within each entry that is the RESTful endpoint for update and delete requests targeted at that entry. So the program’s DeleteProcess method queried the Atom feed for those hyperlinks, and used their addresses to create the URI namespace that exposed process deletion to clients.
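Here’s a hedged sketch of that discovery pattern in Python (the demo itself was C#/.NET): read the service’s Atom feed, collect each entry’s edit link, and kill a process by applying DELETE to one of those links. The feed URL is a stand-in, not the demo’s actual address.

import urllib2
from xml.etree import ElementTree

ATOM = '{http://www.w3.org/2005/Atom}'

# collect the Atom Publishing Protocol edit link from each entry
feed = ElementTree.parse(urllib2.urlopen('http://localhost:8080/service/Processes'))
edit_uris = [link.get('href')
             for link in feed.findall(ATOM + 'entry/' + ATOM + 'link')
             if link.get('rel') == 'edit']

# urllib2 only speaks GET and POST by default; overriding get_method
# lets us apply other verbs to the same URI
class DeleteRequest(urllib2.Request):
    def get_method(self):
        return 'DELETE'

# DELETE the resource behind the first edit link: the kill function
print urllib2.urlopen(DeleteRequest(edit_uris[0])).code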

The general principle at work here is linking. A core tenet of RESTful style is that link-rich hypermedia documents, useful to people because they make it possible to navigate and discover related things, are equally useful to programs for the same reason.

These are, of course, best practices for an ecosystem sustained by web standards like URI, HTTP, and XML. But it was wonderful to see those best practices clearly demonstrated in a PDC keynote. It has not always been so. Trust me, I would have noticed.

On the next turn of the crank, the standalone process viewer and killer was network-enabled thanks to Azure technology that I first told you about a year ago, back when it was known as the Internet Service Bus. Using it, Don and Chris created this endpoint in the cloud:

http://servicebus.windows.net/services/DonAndChrisPDC

You can go ahead and click that URL if you like; it’s still live. What you’ll fetch is an empty Atom feed. During the keynote, though, Don and Chris wired that endpoint to the program running on the demo machine onstage. This was accomplished in a purely declarative way, by adding a binding to the program’s configuration file that pointed to the chunk of web namespace whose root is servicebus.windows.net/services/DonAndChrisPDC.

This wasn’t yet a cloud-based service; that came later. At this stage it was still a local service that was advertised in the cloud and made available to the public Internet. To accomplish that, Azure has to enable clients out on the Net to traverse intervening firewalls and NATs and contact the local service. It does so in a way that illustrates another key principle: policy-driven intermediation.

The need for such intermediation was soon apparent when the local service was relaunched with its Azure binding. Now anyone in the world could visit the above URL in a browser, view processes, and even try to delete one. Within seconds, someone did try, and Don shouted: “Stop the service, Chris!” There was no real risk — the program was running in debug mode, with a breakpoint set on DeleteProcess — but it was a great theatrical moment.

In fact, the service was secure by default. To expose it to the Net in an unauthenticated way, Don and Chris had used a configuration setting that overrode the default security. After removing that setting, an interactive (i.e., browser-based) request produced a login page. Crucially, that login page did not come from the local service, but rather from Azure, which was handling security, as well as connectivity, for the service. The policy in effect was username/password, so after typing in appropriate credentials, interactive access was restored, but now in a controlled way. A different policy — for example, one requiring X.509 certificates or SAML tokens — could be defined in, and enforced by, the Azure fabric.

Next, the local client program that had been accessing the service — first directly, then by way of the Azure cloud — was adapted for the same kind of secure access. To do that, it requested an authentication token from Azure’s access control system, and then inserted that token into the HTTP headers of subsequent requests to the service.
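In plain HTTP terms, that amounts to something like this sketch, in which both the token-issuing URL and the header name are placeholders rather than the actual Azure access control details:

import urllib2

# 1. acquire a token from the access control service (placeholder URL)
token = urllib2.urlopen('https://accesscontrol.example.net/issue').read()

# 2. present the token on each subsequent request to the relayed service
req = urllib2.Request('http://servicebus.windows.net/services/DonAndChrisPDC')
req.add_header('X-Auth-Token', token)  # placeholder header name
print urllib2.urlopen(req).read()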

So that was act one. Here was Don’s segue into act two: “Chris, are there other services in the world we might want to program in a similar fashion?”

Why yes, Chris said, and launched Live Desktop. There, courtesy of Live Mesh, were some folders that were synchronized cloud replicas of folders on the local demo machine. Since Live Mesh is also based on Atom feeds, it should be easy to convert a RESTful service that enumerates and deletes OS processes into a RESTful service that enumerates and deletes Live Mesh folders.

It was easy. In the client program, the base URI changed from servicebus.windows.net to user-ctp.windows.net/V0.1/Mesh/MeshObjects. And the authentication token had to change too because, well, to be honest, Azure’s subsystems aren’t yet seamlessly integrated. But that was it. The same LINQ query to find entries in a feed worked exactly as before. Only now it listed folders in the cloud rather than processes on the local machine. That’s the beauty of a uniform HTTP interface in the RESTful style.

Note that the Live Mesh API works symmetrically with respect to the cloud and the local client. The same program that lists folders in the cloud can list folders on your local machine. You just point the URIs at localhost, and use the Fractional Horsepower HTTP Server that’s part of the locally-installed Live Mesh software.

Note also that you don’t have to use any Microsoft technologies to work with these Azure services. The demo program used LINQ, WCF, and — for the Live Mesh stuff — a wrapper library that packages the API for use by .NET software. But any technology for shredding XML and communicating over HTTP will work just fine.
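For instance, here’s a sketch that lists Mesh folder names using nothing but Python’s standard library. The token value, and the header that carries it, are placeholders (acquiring a real token is a separate step); the point is simply that this is ordinary HTTP plus ordinary XML.

import urllib2
from xml.etree import ElementTree

ATOM = '{http://www.w3.org/2005/Atom}'

token = 'PLACEHOLDER'  # acquired separately from the access control system

req = urllib2.Request('https://user-ctp.windows.net/V0.1/Mesh/MeshObjects')
req.add_header('Authorization', token)  # placeholder header and scheme
feed = ElementTree.parse(urllib2.urlopen(req))
for entry in feed.findall(ATOM + 'entry'):
    print entry.find(ATOM + 'title').text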

In act three, the focus shifted to Azure’s storage service. Using all the same patterns and principles, the program morphed into one that could upload DLL files into Azure’s blob store, use Azure tables to associate human-readable metadata with the DLLs, and issue a simple relational query against the set of uploaded files.

Finally, in act four, the service that had been running locally, on the demo machine, was adapted — with some minor changes — to work with the local development version of the Azure compute cloud, and then deployed to the staging and production areas of the real cloud.

To sum up, the emerging Microsoft platform not only spans a continuum of programmable devices and services, it also spans a continuum of access styles that are all based on core standards including URI, XML, and HTTP. I think this is a great story, and I’m exceedingly happy to finally be able to tell it.

Kim Cameron’s excellent adventure

I hope James Governor, Mary Branscombe, and Kim Cameron will triangulate on this, but here’s my report on a cosmically funny incident at a party last night. I walked up to James just as he witnessed Kim being forcibly denied access to the venue. He lacked the necessary identity token — a plastic wristband — and couldn’t talk his way in.

If you don’t know who Kim is, what’s cosmically funny here is that he’s the architect for Microsoft’s identity system and one of the planet’s leading authorities on identity tokens and access control.

We stood around for a while, laughing and wondering if Kim would reappear or just call it a night. Then he emerged from the elevator, wearing a wristband which — wait for it — belonged to John Fontana.

Kim hacked his way into the party with a forged credential! You can’t make this stuff up!

PyAWS, Fermat’s Last Theorem, and search diversity

I use the Amazon API to check wishlists programmatically, and back in March I mentioned that it was being upgraded in a way that would break the Python wrapper I’d been using for years. Readers pointed me to a new wrapper called PyAWS, but I found that it didn’t offer the one thing I needed: A simple way to retrieve all the ISBNs on a wishlist.

I solved the problem for myself with a few lines of code, but neglected to include them. Today, that March entry received a hilarious comment:

I came here searching for a way to retrieve my Amazon wishlist using PyAWS… You’re the top query (out of a grand total of 5!) for pyaws wishlist amazon.

However, reading the blog article above, I had flashbacks of Fermat’s Last Theorem:

“After poring over this mysterious PyAWS, I found a wonderfully simple way of retrieving a wishlist like with PyAmazon. However the margin of this blog post is too narrow to contain the few lines of Python code required.”

:-)

Could you please post the said few lines of codes to retrieve a wishlist with PyAWS? Would be much appreciated. I’d rather not have to pore over the whole Amazon API documentation to learn how to retrieve a simple wishlist or two with PyAWS.

Sorry about that! Here’s what I’m currently doing. It’s not PyAWS, just a regex hack of the raw XML output from REST queries.

import urllib2, re

def getAmazonWishlist(aws_access_id, wishlist_id):
  # first request just discovers how many pages the wishlist spans
  url = ('http://webservices.amazon.com/onca/xml?Service='
    'AWSECommerceService&AWSAccessKeyId=%s&ListId=%s'
    '&ListType=WishList&Operation=ListLookup' %
    (aws_access_id, wishlist_id))
  s = urllib2.urlopen(url).read()
  pages = re.findall('<TotalPages>(.+?)</TotalPages>', s)[0]
  # then fetch every page with the full response group
  for page in range(int(pages)):
    url = ('http://webservices.amazon.com/onca/xml?Service='
      'AWSECommerceService&AWSAccessKeyId=%s&ListId=%s'
      '&ProductPage=%s&ListType=WishList&Operation=ListLookup'
      '&ResponseGroup=ListFull' %
      (aws_access_id, wishlist_id, page + 1))
    s += urllib2.urlopen(url).read()
  # regex hack: pair up ASINs and titles in the accumulated XML
  return re.findall('<ASIN>(.+?)</ASIN>.+?<Title>(.+?)<', s, re.DOTALL)

(Ironically the margin of this blog post is too narrow for the few lines of Python code required, so the long URL strings are split using Python’s implicit concatenation of adjacent literals.)

By the way, DoubleSearch reveals that although Google currently finds only 5 results for pyaws wishlist amazon, Live Search finds 9. More importantly, if the blog entries from Rich Burridge and me are indeed the most relevant results, Live Search puts them first.

That’s not always true, of course. Often Google does better. But not always. In any case, even when the first pages of results from both engines are equally relevant, they’ll likely differ in ways that DoubleSearch invites you to notice.

If you’re inclined to dismiss what I’m about to say because I’m employed as a Microsoft evangelist, then fair enough, move along, there’s nothing to see here. But if you’ve followed me over the years and continue to trust my instincts, then hear me out on this. I’ve always believed in, and acted on, the principle of diversity. If you think the same way, then you use more than one operating system, more than one programming language, more than one application in many categories.

So why would you use only one search engine?

If you haven’t tried Live Search in a while, you’ll find that it’s improved quite a bit. I’m not saying it’s better than Google, but I am saying it’s usefully different. Given the central importance of search, I argue that it’s in everyone’s interest to exploit that diversity.

Now arguably most people don’t care about diversity. There’s a strong impulse to find one way to do something, and then stick with it. People don’t readily adopt new behavior. To help them along, you need to minimize disruption.

To that end, I’ve been asking some friends and associates to give DoubleSearch a try. Specifically, I’m asking them to make it their browser’s default search provider, then let me know how long they keep it and, if they drop it, why.

I know there are logistical issues with DoubleSearch. In particular, given the side-by-side-in-frames presentation, it’s awkward to click through on a search result. You’d rather right-click and open in a new tab. Some people already have that habit; others don’t; their experiences will differ accordingly.

I’m sure there are deeper cognitive issues as well. For example, I find it useful to compare the two result pages side-by-side, but others — maybe many others — will just find that distracting.

Anyway, if you do try this experiment for yourself, feel free to comment here on how it goes.

Pumpkins with Oomph

Five years ago I was exploring the idea of embedding active chunks of structured data into web pages. Back then I used the phrase interactive microcontent. Nowadays, we say microformats. If you’re a reader of this blog you’re probably technically-oriented, and you already know about microformats. But most people aren’t, and they don’t. The challenge has always been to provide an end-to-end experience that will enable non-technical folks to create and use these nuggets of semantic web goodness.

Here’s a project that can help: Oomph. It’s the first lab component of the relaunched MIX Online site, which is run by Microsoft evangelists who, like me, care about web standards and web innovation.

To demonstrate Oomph, I’ve injected a microformatted event here:

What: Keene Pumpkin Festival
When: Saturday, October 25, 2008 (all day)
Where: Downtown

Keene, New Hampshire
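Under the covers, the markup for an event like this one is plain hCalendar. It looks something like the following simplified sketch, not necessarily the plug-in’s exact output:

<div class="vevent">
  <span class="summary">Keene Pumpkin Festival</span>:
  <abbr class="dtstart" title="2008-10-25">Saturday, October 25, 2008</abbr>,
  <span class="location">Downtown, Keene, New Hampshire</span>
</div>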

I created this event using Live Writer — a WYSIWYG blog editor — and its Event Plug-in. In this case, no data entry was required because the plug-in enabled me to search Eventful and capture the existing Pumpkin Festival record found there. That’s just the sort of grease we’ll need in order to overcome data friction.

Still, for most folks there’s no obvious reason to publish a microformatted event. The information looks nice, but it’s not clear what you or anyone else can do with it.

One aspect of the Oomph toolkit is an Internet Explorer extension that makes that embedded event come alive. Here’s what this page looks like in IE with the Oomph extension:

The arrow points to an indicator that “gleams” when a page contains microformatted elements.

Clicking on the indicator opens a panel that activates them. In this case, the event is enhanced with icons for a variety of calendar import methods.

When an item has a location, you can map it:

If Oomph were only an IE-specific extension, I wouldn’t be writing about it. But in fact, it’s a cross-browser solution based on jQuery. I can’t demonstrate that here because WordPress.com blocks JavaScript, but consider these two pages:

1. Oomph: with explicit JavaScript. This page explicitly calls the Oomph JavaScript code, and works cross-browser. Try it!

2. Oomph: without explicit JavaScript. This page (like the blog entry you’re reading) does not use the Oomph JavaScript code. The enhanced behavior is still available in IE, by way of the Oomph extension. It could also be available in Firefox, Safari, or Chrome if similarly extended.

It’s really helpful to have the option to go both ways: Server-side where it’s permitted, client-side where it isn’t.

There’s more to Oomph: CSS styles for microformats, and a Live Writer plug-in for inserting hCard (contact) elements into blog postings. You can get the toolkit and documentation on CodePlex. Nice work guys!

Why and how to blurb your social bookmarks

From a 2004 entry entitled Information Routing:

To further my own self-interest in keeping track of things, I’ve made a minor extension to the del.icio.us bookmarklet, so that selected text on the target page is used for the (optional) extended description of the routed item. This makes the items I route easier for me to scan. And for you too. Of course if you did the same, the items you route would be easier for you to scan. And for me too.

We keep losing sight of these basic principles. Meanwhile, their importance keeps growing. Here’s a case in point from this morning’s flow: an item in FriendFeed from Charlene Li:

OK, I get the gist. If malware and/or Facebook are central topics for me, and I haven’t heard about this already, I’ll click through. But in most cases, I’m scanning my flow to expand my peripheral awareness. I don’t have time to click through and look at everything. I need context wrapped around the items that appear in my flow.

Annotating your social bookmarks is a great way to provide me with that context. In del.icio.us, here’s how you do that:

Before you invoke the del.icio.us posting form, select the text that best summarizes the item you’re bookmarking. Then paste it into the Notes field.

Here’s the del.icio.us view of the item that Charlene and I have bookmarked:

It’s been bookmarked five times. There are two annotations. But I claim that only one of them is useful. carlhaggerty’s blurb is site boilerplate that comes from an F-Secure meta tag. It says nothing about this particular item. My blurb, however, adds useful context. It identifies the item as a Flash-related phishing exploit.

Admittedly, you have to go out of your way to expose this del.icio.us view of the F-Secure item as bookmarked by five del.icio.us users. But in an environment where syndication and information routing are pervasive, our actions have consequences elsewhere. Here’s how that same item appears to my FriendFeed subscribers:

Now my subscribers can absorb, at a glance, the additional context about phishing and Flash. Their peripheral awareness expands. Their time spent scanning their flows is more productive. And their subconscious anxiety — about not clicking through to read the majority of items they can’t possibly have time to read — is alleviated.

I would like to enjoy these benefits too, but I need your help. Please consider annotating the items you share. If you’re so inclined, here’s a bookmarklet that will help:

post to del.icio.us

(Update: As per Carl’s comment below, that won’t work because WordPress defangs the JavaScript. Another way: Create a bookmark on your links bar, edit its properties, and paste the following into its URL or Location field:

javascript:(function(){var notes=window.getSelection ? window.getSelection().toString() : (document.selection ? document.selection.createRange().text : ""); notes=encodeURIComponent(notes); f='http://delicious.com/save?url=' +encodeURIComponent(window.location.href) +'&title='+encodeURIComponent(document.title) +'&notes='+notes+'&v=5&'; a=function() {if(!window.open(f+'noui=1&jump=doclose', 'deliciousuiv5','location=yes,links=no,scrollbars=no,toolbar=no,width=550,height=550')) location.href=f+'jump=yes'}; if(/Firefox/.test(navigator.userAgent)) {setTimeout(a,0)}else{a()}})()

I’ve also updated it to incorporate a better way to capture the selection, per Alf’s comment below, thanks Alf!)

(Further update: Crap. WordPress.com keeps converting the single quotes above into “smart” quotes, which break the bookmarklet when copied directly, so I’ve put draggable and copyable versions at http://jonudell.net/delicious-bookmarklet.html).

This is a version of the standard del.icio.us bookmarklet. It updates the extension I made way back in 2004. If you replace the standard del.icio.us bookmarklet with this version, you’ll still need to highlight the salient text on the page you’re bookmarking. But you won’t need to paste it into the form. It’ll pop into the Notes field automatically. Do this, and I’ll love you forever.

PS: del.icio.us: Why not make this version the standard bookmarklet, and explain why? As your bookmarks increasingly find their way into our lifestreams and workstreams, useful annotations will matter more and more.

Finding faces

The fun I’ve been having with DoubleSearch has reminded me how easy it is to create new search providers that plug into your browser’s dropdown list of search engines. Here’s an interesting one: FaceSearch. As the name suggests, it finds pictures of faces.

This is nothing more than a Live Search for images, with face recognition turned on. You can get the same effect by doing a regular search, and then using the Refine By options to show only color photos of faces. But those restrictions don’t persist across searches. And if you have to tweak them every time, it’s awkward to explore face space.

With FaceSearch, you can fly through sequences of face-oriented searches. For example:

  1. people: yourself, friends, family, associates, celebrities
  2. emotions: happy, angry
  3. expressions: smile, frown, wink
  4. places: your town, New Zealand, Mumbai, Reykavik
  5. ethnic groups: Maori, Pashtun
  6. decorations: glasses, hat, piercing
  7. hairstyles: combover, mullet, afro

It’s interesting to compare these results to Flickr searches that include the tag face. When the facial aspect of a photo is important enough to tag, you’ll do much better with Flickr. For example, it delivers wonderful results for angry face and Mumbai face. But it finds nothing for Reykavik face, whereas Live Search finds thousands of Icelandic faces.

Update #1: It helps if you spell Reykjavik correctly. When you do, Flickr finds 300 Icelandic faces.

Update #2: As per Bill’s comment below, Google has a syntax for face search too. Hadn’t known that. So this can become another kind of DoubleSearch. Done! Now twice the fun!

Tracks4Africa: Mapping and annotating Africa’s remote eco-destinations

Back in 2005 I made a screencast that showed how the convergence of GPS and online mapping enables us to collectively annotate the planet. The Tracks4Africa folks have been doing that since 2000. On this week’s Innovators show, Johann Groenewald explains how some GPS enthusiasts who are passionate about exploring, documenting, and preserving Africa’s rural and remote “eco-destinations” have created an annotated map that travelers can use and enhance. The GPS maps have evolved into a commercial product. The annotations — including photos and commentary — are available at the Padkos website, and also as a layer in Google Earth.

I found out about T4A when a reader commented on an earlier item about ground truthing and crowdsourced mapping. T4A is a wonderful demonstration of that possibility. It’s also a great story about how open data contributed by a community, and commercial data managed by a business, can thrive in a symbiotic relationship.

Dual search revisited

Paul Pival noticed a problem with the browser widget I made the other day to search Google and Live side-by-side. The service invoked by that widget, at dualsearch.atsites.net, fails when your query contains double-quoted phrases.

It’s an easy fix as I’ll demonstrate here. There are three ingredients:

  • An itty-bitty web application
  • A simple XML file
  • An even simpler HTML/JavaScript file

Let’s examine them.

1. The web application just receives a query, URL-encodes it, and interpolates it into the template for a web page that invokes the two search engines in side-by-side frames.

There are a million ways to do that. Here’s a Python/Django implementation:

def doublesearch(request):
  import urllib
  from django.http import HttpResponse
  q = request.GET['q']
  q = urllib.quote(q)  # URL-encode, so double-quoted phrases survive
  template = """<html>
<frameset cols="*,*" frameborder="no">
  <frame src="http://www.google.com/search?q=__QUERY__" />
  <frame src="http://search.msn.com/results.aspx?q=__QUERY__" />
</frameset>
</html>"""
  html = template.replace('__QUERY__',q)
  return HttpResponse(html)

2. The XML file contains an OpenSearch description that invokes that little web application, passing it the query that you type into your browser’s search box. Here’s an example that uses a sample service I’ve hosted at my elmcity.info site:

<?xml version="1.0" encoding="UTF-8" ?>
<OpenSearchDescription xmlns="http://a9.com/-/spec/opensearch/1.1/">
<ShortName>DoubleSearch</ShortName>
<Description>DoubleSearch provider</Description>
<Image height="16" width="16" type="image/x-icon">
http://jonudell.net/img/doublesearch.ico</Image>
<InputEncoding>UTF-8</InputEncoding>
<Url type="text/html"
  template="http://elmcity.info/doublesearch/?q={searchTerms}" />
</OpenSearchDescription>

3. Finally, here’s the HTML file that encapsulates the snippet of JavaScript that installs the OpenSearch widget into your browser:

<a href="javascript:window.external.AddSearchProvider
 ('http://jonudell.net/doublesearch.xml')">Add</a> the
 DoubleSearch provider.

You can Add the DoubleSearch widget and try it for yourself. Unlike other variants I’ve found, this one doesn’t wrap any cruft around the side-by-side results. It simply presents them.

As I mentioned the other day, I’m finding that combining the top 10 results from both engines makes for a more useful set of 20 results than taking the top 20 from either.

With today’s wider screens, placing the two result frames side-by-side works out pretty well. In this mode, however, you’ll want to avoid clicking through directly on a result. Instead, right-click on the result and open it in a new tab.

An Internet-to-TV feed with IronPython, XAML, and WPF

In a recent series of items I discussed ways of turning an Internet data feed into a video crawl for use on a local public access cable television channel. In the last installment the solution had evolved into an IronPython script that fetches the data, writes XAML code to animate the crawl, and runs that XAML as a fullscreen WPF (Windows Presentation Foundation) application.

This week we finally got a chance to try out the live feed, and we didn’t like what we saw. For starters, the animation was jerky. The PC that became available for this project is an older box running Windows XP. I installed .NET Framework 3.0 on the box, and it now supports WPF apps, but not with the graphics acceleration needed for smooth scrolling.

Even with the smooth scrolling that we see on my laptop, though, it wasn’t quite right. This application displays a long list of events, and it’s going to grow even longer. We decided that a paginated display would be better, so I went back to the drawing board.

We’re happy with the result. It displays pages like so:

                  Community Calendar

 06:30 PM open/lap swim  (ymca) 

 07:00 PM Caregiving for Individuals with 
   Dementia (unh coop extension) 

 07:00 PM Vicky Cristina Barcelona (colonial 
   theatre) 

 07:30 PM Faculty Recital-Jazz (eventful: Redfern 
  Arts Center) 

 events from http://elmcity.info    page 9 of 12

Pages fade in, display for 8 seconds, then fade out. There are a million ways to do this, but since I was already exploring IronPython, XAML, and WPF I decided to remix those ingredients. For my own future reference, and for anyone else heading down the same path, here are some notes on what I learned. As always, I welcome suggestions and corrections. I’m still a XAML beginner, and will be very interested to learn about alternative approaches.

The approach I take here is clearly influenced by my own past experience doing web development using dynamic languages. There’s no C# code, no compilation, no Visual Studio. The solution is minimal in the way I strongly prefer for simple projects: a single IronPython script that depends only on IronPython and .NET Framework 3.0.

When developing for the web, I typically build an HTML/JavaScript mockup, view it in a browser, and then consider how to generate that HTML and JavaScript. Here, XAML is the HTML, and a XAML viewer is the browser. The conventional XAML viewer that comes with the Windows SDK is called XAMLPad, but it’s a beefier tool than I needed for this purpose, so I wound up using the more minimal XamlHack.

I started with the contents of a single page:

<Canvas ClipToBounds="True" Background="Black" 
  Width="800" Height="600">

<TextBlock x:Name="page1" Canvas.Top="0" Canvas.Left="20" 
  Foreground="#FFFFFF" FontSize="36" FontFamily="Arial" 
  xml:space="preserve">

<![CDATA[
 06:30 PM open/lap swim  (ymca) 

 07:00 PM Caregiving for Individuals with 
   Dementia (unh coop extension) 
]]>

</TextBlock>
</Canvas>

I found that text formatting isn’t WPF’s strong suit, so I’m using the XAML equivalent of an HTML <pre> tag to display text that’s preformatted in IronPython.

Next, I added the fade-in and fade-out effects:

<Canvas ClipToBounds="True" Background="Black" 
  Width="800" Height="600">

<TextBlock x:Name="page1" Canvas.Top="0" Canvas.Left="20" 
  Foreground="#FFFFFF" FontSize="36" FontFamily="Arial" 
  xml:space="preserve">

<TextBlock.Triggers>
  <EventTrigger RoutedEvent="FrameworkElement.Loaded">
    <BeginStoryboard>
      <Storyboard>
        <!-- 1 second fade in -->
        <DoubleAnimation BeginTime="0:0:0" 
          Storyboard.TargetName="page1" 
          Storyboard.TargetProperty="Opacity" 
          From="0" To="1" Duration="0:0:1" />
        <!-- hold for 8 seconds, then 1 second fade out -->
        <DoubleAnimation BeginTime="0:0:9" 
          Storyboard.TargetName="page1" 
          Storyboard.TargetProperty="Opacity" 
          From="1" To="0" Duration="0:0:1" />
      </Storyboard>
    </BeginStoryboard>
  </EventTrigger>
</TextBlock.Triggers>

<![CDATA[
 06:30 PM open/lap swim  (ymca) 

 07:00 PM Caregiving for Individuals with 
   Dementia (unh coop extension) 
]]>

</TextBlock>
</Canvas>

I thought it would be possible to chain together a series of these animations, and nest that series inside another animation in order to create the infinite loop that’s required. There may be a way to do that in XAML, but I didn’t find it. So, since I was already planning to generate the XAML — in order to interpolate current event data, plus a variety of attribute values — I went with a generator that produces a series of these pages. That solved chaining, but not looping. To make the sequence loop, I added a second timer/event-handler pair to the IronPython script. The first handler reloads the data once a day. The second handler reloads the XAML at intervals computed according to the number of pages for each day, thus looping the animation.

Next I added XAML elements for the header and footer. The header is static, but the footer has a dynamic page counter so I animated it in the same way as the page.

Next I made templates for all the XAML elements. Here’s the footer template:

template_footer = """<Label x:Name="footer___FOOTER_PAGE_NUM___" 
  Canvas.Top="___FOOTER_CANVAS_TOP___" Canvas.Left="___
  FOOTER_CANVAS_LEFT___" Foreground="#FFFFFF" xml:space="preserve" 
  FontSize="___FOOTER_FONTSIZE___" FontFamily="Arial" Opacity="0">
           page ___FOOTER_PAGE_NUM___ of ___FOOTER_PAGE_COUNT___
<Label.Triggers>
<EventTrigger RoutedEvent="FrameworkElement.Loaded">
  <BeginStoryboard>
    <Storyboard>
     <DoubleAnimation 
      BeginTime="___BEGIN_FADE_IN___" 
      Storyboard.TargetName="footer___FOOTER_PAGE_NUM___"
      Storyboard.TargetProperty="Opacity" 
       From="0" To="1" Duration="___FADE_DURATION___"  /> 
     <DoubleAnimation 
      BeginTime="___BEGIN_FADE_OUT___" 
      Storyboard.TargetName="footer___FOOTER_PAGE_NUM___"
      Storyboard.TargetProperty="Opacity" 
       From="1" To="0" Duration="___FADE_DURATION___"  /> 
     </Storyboard>
  </BeginStoryboard>
</EventTrigger>
</Label.Triggers>
</Label>
"""

The script uses variables that correspond to the uppercase triple-underscore-bracketed names. So, for example:

___FOOTER_CANVAS_TOP___ = 520
___FOOTER_CANVAS_LEFT___ = 10
___FOOTER_FONTSIZE___ = 28

To avoid typing all these names twice in order to interpolate variables into the template, I cheated by defining this pair of Python functions:

def isspecial(key):
  import re
  return re.match('^___.+___$', key) is not None

def interpolate(localdict, template):
  # swap each ___NAME___ placeholder for the corresponding local variable
  specialkeys = filter(isspecial, localdict.keys())
  for key in specialkeys:
    template = template.replace(key, str(localdict[key]))
  return template

Given that setup, here’s the core of the XAML generator:

def create_xaml(raw_text,watch_time,fade_duration):

  ___TITLE_TEXT___ = 'Community Calendar'
  ___BODY_TEXT___ = ''
  ___BODIES_AND_FOOTERS___ = ''
  ___BODY_NUM___ = 0
  ___FOOTER_PAGE_NUM___ = 0
  ___BODY_CANVAS_TOP___ = 0
  ___BODY_CANVAS_LEFT___ = 20
  ___BODY_FONTSIZE___ = 36 
  ___TITLE_CANVAS_TOP___ = -30
  ___TITLE_CANVAS_LEFT___ = 200
  ___TITLE_FONTSIZE___ = 34 
  ___FOOTER_CANVAS_TOP___ = 520
  ___FOOTER_CANVAS_LEFT___ = 10
  ___FOOTER_FONTSIZE___ = 28
  ___FOOTER_PAGE_COUNT___ = 0
  ___FOOTER_PAGE_NUM___ = 0
  ___BEGIN_FADE_IN___ = ''
  ___BEGIN_FADE_OUT___ = ''
  ___FADE_DURATION___ = ''

  pagecount = 0
  for page in page_iterator(raw_text):
    pagecount += 1
  ___FOOTER_PAGE_COUNT___ = pagecount

  begin_fade_in = 0
  begin_fade_out = begin_fade_in + fade_duration + watch_time

  pagenum = 0

  for page in page_iterator(raw_text):
    pagenum += 1

    ___BODY_TEXT___ = page
    ___BODY_NUM___ = pagenum
    ___FOOTER_PAGE_NUM___ = pagenum
    ___BEGIN_FADE_IN___ = makeMinsSecs(begin_fade_in)
    ___BEGIN_FADE_OUT___ = makeMinsSecs(begin_fade_out)
    ___FADE_DURATION___ = makeMinsSecs(fade_duration)

    body = interpolate(locals(),template_body)

    footer = interpolate(locals(),template_footer)

    ___BODIES_AND_FOOTERS___ += body + footer

    begin_fade_in = begin_fade_out + fade_duration
    begin_fade_out = (begin_fade_in + fade_duration
      + watch_time)

  xaml = interpolate(locals(),template_xaml)
  
  return (pagecount,xaml)

I guess I could rely less on XAML code generation and exploit IronPython’s ability to dynamically reach into and modify live .NET objects. That would be the WPF analog to JavaScript DOM-tweaking in the web realm. But this works, it’s easy enough to understand, and it’s handy for debugging purposes to have the generated XAML lying around in a file I can easily inspect.

Finally, here’s the core of the application itself:

# Imports needed to run this excerpt. (calendarToXaml and the templates
# shown earlier live elsewhere in the same script. The namespace that
# provides CallTarget0 varies across IronPython versions; this is the
# IronPython 1.x location.)
import clr
clr.AddReference('PresentationFramework')
clr.AddReference('PresentationCore')
clr.AddReference('WindowsBase')

from System import TimeSpan
from System.IO import FileStream, FileMode
from System.Windows import Application, Window, WindowStyle, WindowState
from System.Windows.Input import Cursors
from System.Windows.Media import Brushes
from System.Windows.Threading import DispatcherTimer, DispatcherPriority
from IronPython.Runtime.Calls import CallTarget0

class CalendarDisplay(Application):

  def load_xaml(self,filename):
    from System.Windows.Markup import XamlReader
    f = FileStream(filename, FileMode.Open)
    try:
      element = XamlReader.Load(f)
    finally:
      f.Close()
    return element

  def loop_handler(self,sender,args):  # reload XAML

    def update_xaml():
      self.window.Content = self.load_xaml(self.xamlfile)

    self.loop_timer.Dispatcher.Invoke(DispatcherPriority.Normal,
      CallTarget0(update_xaml))

  def day_handler(self,sender,args):     # fetch data, generate XAML

    def update_xaml():
      self.pagecount = calendarToXaml(self.path,self.xamlfile,self.url,
        self.cachefile,self.watch_time,self.fade_duration)
      self.window.Content = self.load_xaml(self.xamlfile)

    self.day_timer.Dispatcher.Invoke(DispatcherPriority.Normal,
      CallTarget0(update_xaml))

  def __init__(self):

    Application.__init__(self)

    self.xamlfile = 'display.xaml'
    self.path = '.'
    self.cachefile = 'last.txt'
    self.url = 'http://elmcity.info/events/todayAsText'
    self.watch_time = 8
    self.fade_duration = 1
    self.pagecount = calendarToXaml(self.path,self.xamlfile,self.url,
      self.cachefile,self.watch_time,self.fade_duration)

    self.window = Window()
    self.window.Content = self.load_xaml(self.xamlfile)
    self.window.WindowStyle = getattr(WindowStyle, 'None')  # 'None' is a Python keyword
    self.window.WindowState = WindowState.Maximized
    self.window.Topmost = True
    self.window.Cursor = getattr(Cursors, 'None')
    self.window.Background = Brushes.Black
    self.window.Foreground = Brushes.White
    self.window.Show()

    self.day_timer = DispatcherTimer()
    self.day_timer.Interval = TimeSpan(24, 0, 0)
    self.day_timer.Tick += self.day_handler
    self.day_timer.Start()

    self.loop_timer = DispatcherTimer()
    interval = self.pagecount * (self.watch_time + self.fade_duration*2)
    self.loop_timer.Interval = TimeSpan(0, 0, interval)
    self.loop_timer.Tick += self.loop_handler
    self.loop_timer.Start()

CalendarDisplay().Run()

Celebrating iCalendar’s 10th anniversary: The best is yet to come

Next month marks the tenth anniversary of RFC 2445 (iCalendar), the specification that describes how Internet applications represent and exchange calendar information. The authors of RFC 2445 were Frank Dawson (now with Nokia) and Derik Stenerson (now with Microsoft). I asked both to join me to reflect on the past, present, and future of this key standard. Only Derik was available, and he’s my guest for this week’s ITConversations show.

If you’ve followed my blog you’ll know that I’ve come to regard the ICS files that iCalendar-aware apps create and consume as feeds that could and should form a syndication ecosystem analogous to the RSS ecosystem. So in addition to filling us in on how iCalendar came to be, Derik considers whether the analogy holds water, and concludes that it probably does.

Although iCalendar has been around for a decade, I argue that the confluence of syndication and personal publishing, in the calendar domain, requires three enablers.

First, you need a workable syndication format, and we have that: RSS for blogs, ICS for calendars.

Second, you need what we used to call one-button personal publishing. Bloggers have had that capability for a long time. Calendar users have it too, but it’s emerged relatively recently, and many aren’t aware of it.

Third, you need feed aggregators. These proliferate in blogspace but, I argue, are conspicuously absent from calendar space. Services like Eventful and Upcoming produce calendar feeds. But because they do not consume them, they don’t encourage individuals and groups to publish feeds, and to think and act in a syndication-oriented way. I’ve prototyped a calendar aggregator at http://elmcity.info/events/, but the category isn’t yet well-established.
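To make the aggregation step concrete, here’s a minimal sketch in the regex-hack style I’ve used elsewhere on this blog: fetch a few ICS feeds, pull out start times and summaries, and merge them into one time-ordered list. The feed URLs are placeholders, and a real aggregator would of course use a proper iCalendar parser and handle time zones and recurrence.

import urllib2, re

feeds = ['http://example.org/library.ics',
         'http://example.org/theatre.ics']

events = []
for url in feeds:
    ics = urllib2.urlopen(url).read()
    # pair each DTSTART with the SUMMARY that follows it
    events += re.findall(r'DTSTART[^:\r\n]*:(\S+)[\s\S]*?SUMMARY[^:\r\n]*:(.+)', ics)

# merge and print the combined calendar, ordered by start time
for dtstart, summary in sorted(events):
    print dtstart, summary.strip()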

If my analysis is correct, one or more well-known services that both consume and produce calendar feeds would unlock the latent potential of iCalendar and help us jumpstart a calendar syndication ecosystem.

This American Life’s finest hours

Back in May, This American Life aired a widely-acclaimed show on the mortgage crisis. In The Giant Pool of Money, Alex Blumberg and Adam Davidson pepper their analysis with dialogue from a cast of characters including:

Richard Campbell, ex-Marine, behind on his mortgage: “At one point, my son had $7,000 in a CD and I had to break it. That really hurt.”

Clarence Nathan, who got a $540K second mortgage while working 3 part time not very steady jobs: “I wouldn’t have loaned me the money. And nobody that I know would have loaned me the money. I know guys who are criminals who wouldn’t loan me that and they break your knee-caps.”

Glen Pizzolorusso, just out of college, making $1 million a year selling mortgages to people like Clarence Nathan: “These people didn’t have a pot to piss in. They can barely make a car payment and we’re giving them a 300, 400 thousand dollar house.”

It’s a powerful show. If you don’t have the time or inclination to listen, you can read the transcript.

Last Friday, Alex Blumberg and Adam Davidson returned with Another Frightening Show About the Economy. There’s no transcript yet, but I just listened while doing housework. It’s just as compelling, and also amazingly prescient. Here’s Adam Davidson from that 10/3 show:

We’ve surveyed a bunch of economists, and most say there’s another approach that’s clearly better. It’s called a stock injection plan. In the Paulson plan, we give 700 billion to the banks and get back these toxic, crappy assets. With the stock injection plan, we still give something like 700 billion dollars to the banks, but in return we get an ownership plan.

From the Planet Money blog, also on 10/3, referring to the TAL show:

That White House plan wasn’t the only plan. It wasn’t even necessarily the plan you think it is. In this podcast, Adam Davidson tells This American Life host Ira Glass about a mysterious phone call in which a tipster suggested that an alternate proposal had crept into the language of the reworked bill. Davidson says that it concerns so-called stock injection, and that economists like it — a lot.

And sure enough, we learned about that alternate plan today. I heard it on the news, and today’s Planet Money is a well-deserved “I told you so”:

That backdoor bailout we’ve been talking about came now front and center. U.S. Treasury Secretary Henry Paulson says the U.S. is prepared to use public money to buy up portions of private banks. Alternately called a stock injection and a capital one, the move would amount to at least a partial nationalization of the financial system.

Why wasn’t this the original plan? Because banks hate it, Davidson says, and they’re a powerful lobby. But, push has come to shove.

The 10/3 TAL show paints a brighter picture of this alternate plan, calling it simpler, fairer, more economically sound, and a better deal for the taxpayer. We’ll see how the market responds tomorrow. But here’s the line that stuck in my head:

Someone, and we still don’t know who, put in very subtle language into the Senate bill that gives this as an option to the Treasury Secretary.

Repeat: “Someone, and we still don’t know who.” Excuse me? The future of our economy depends on subtle language inserted into the bailout bill, we can’t point to who wrote it, or when, and reporters have to receive anonymous tips to learn about it?

I’ve written recently about a Congressional content management system. Micah Sifry makes the same point in an outstanding episode of Phil Windley’s Technometria podcast. The stakes are way too high for these shell games. We need a whole lot more transparency in the legislative as well as financial realms, and we need it now.

Metasearching the web with OpenSearch

Mark O’Neill dug up some ancient history in a recent blog post:

Is “WOA” really new? I urge everyone to read this Byte article from Jon Udell in 1996, 14 years ago. Part of the title says it all: Every website is a software component. A powerful capability for ad hoc distributed computing arises naturally from the architecture of the Web.

Actually it was only 12 years ago, but long enough so that I had to remind myself, today, of the lesson I learned back then. The full title of the column Mark refers to was: “I use AltaVista to build BYTE’s Metasearch application and realize that every Web site is a software component.” It was my first experience with client-side web scripting and lightweight service composition.

Fast forwarding to today, I was flipping between Google and Live Search and noticing that the answers I was looking for were distributed across the two sources. I’ve been doing that a lot lately, because the combination is really powerful. But for some reason, I hadn’t gotten around to automating a side-by-side search. And it’s gotten a whole lot easier than it used to be.

To see why metasearch is helpful, try this query two ways:

Google: search google live side-by-side

Live: search google live side-by-side

I found four relevant results spread, in non-overlapping pairs, across the two engines: TripleMe and SearchDub (via Google), and DualSearch and SearchBoth (via Live).

I tried the above query in all four, found DualSearch to be most useful, and made a DualSearch OpenSearch provider that you can use to add this side-by-side capability to the search box in Firefox, MSIE, or any other browser that can plug in OpenSearch providers.

Poking around some more, I came across FuzzFind and, although I don’t find it as useful as DualSearch, it does incorporate del.icio.us, which is helpful for me. So I made a FuzzFind OpenSearch provider too.

Clearly I’m not the first person to think of metasearch OpenSearch providers. Which other ones are you aware of? Which do you use most, and why? Feel free to tag your finds with metasearch, opensearch, and provider.

Bonus question: Why doesn’t every search engine offer its own browser-pluggable OpenSearch provider right on its home page?

Small steps forward for calendar syndication

In turbulent times it can help to focus on small steps and tangible signs of progress. In that spirit, here’s a fragment of the collaborative events calendar I’ve been trying to summon into existence:

07:00 PM DREW HICKUM & THE COLONELS (armadillos)
08:00 PM Roger McGuinn & Tom Rush (eventful: Colonial Theatre)
08:00 PM Patty Larkin | Francestown Meetinghouse (monadnock folk)
08:30 PM Chris Fitz (eventful: E.F. Lane Hotel)

A pretty good selection for a Saturday night in the Monadnock region! That’s good news for all of us living around here.

What’s especially encouraging, for me, is the process behind the scenes. That sequence of four closely-spaced events comes from four contributors who are publishing three different flavors of calendar feed, using Eventful, Google Calendar, and WordPress.

Best of all, only one of those contributors was me.

A conversation with Howard Bloom about collective learning, group selectionism, and the global brain

My guest for this week’s Innovators podcast is Howard Bloom. He’s written several books, one of which — Global Brain: The Evolution of Mass Mind from the Big Bang to the 21st Century — is the main topic of our conversation.

There’s no easy way to summarize this show, but here are some notes that I took while reading the book, and used to guide the discussion:

global data sharing among bacteria

complex adaptive system

imitative learning

individual vs group selection

passion for gathering in cities

raven roosts are data collection centers

elements of a collective learning machine:

  1. conformity enforcers (genome, social norms)
  2. diversity generators (curiosity, deviance)
  3. inner judges
  4. resource shifters
  5. intergroup tournaments

apoptosis / cell suicide

behavioral vs verbal memes

the group influences individual perception

each node in the collective brain represents a different approach available to the mesh of mind

individuals and subgroups are disposable rovers, sensors for an interlaced intelligence

pumphouse gang shows how individuals and groups can become test pilots for speculative strategies

team hunters, crop thieves, garbage raiders: each a separate “hypothesis”

collective intelligence uses the ground rules of a neural net: shuttling resources and influence to those who master problems, stripping influence, connection, and luxury from those who cannot seem to understand

If these themes resonate, you’ll love hearing Howard elaborate them.

Meme tracking with Twitter and Timeline

Social networks are Petri dishes in which we can watch memes emerge and spread by imitation. Three years ago, I traced the effect of a powerful one created by the ACLU: a fictional screencast about a dystopian future in which identity and privacy have gone horribly wrong. What I found when I looked at the data was that, although forward thinkers and actors in the realm of digital identity had only recently become aware of the ACLU’s powerful meme, it had been active for 18 months, most forcefully at the beginning of that span.

In that case the meme was an idea which, because it was neatly represented by an URL, could be tracked by using services like del.icio.us and bloglines as proxies for the attention that flows to an URL.

In other cases, a meme is best represented by a word — often, a neologism. There’s no canonical URL to track, but there are other ways to monitor the spread of the meme. Search engines, for example. In the case of screencast, there were 200 Google hits for screencast in April 2005, 60,000 in June 2005, 325,000 in November 2005, and there are 3,000,000 today.

I’m always on the lookout for new ways to make these kinds of observations. Yesterday I encountered Pecha Kucha for the first time. It has a Wikipedia page, so the revision log there is one source of insight.

Since I encountered the phrase on Twitter, I tried a different strategy. While relaying a definition of the term, I used the tag #pechakucha. I realized that these Twitter “hashtags” are another proxy for linguistic memeflow, so I plotted occurrences of the tag on a Timeline. There were only 16 occurrences as of yesterday, so it’s a little sparse, but the same approach can be used to provide insight into the birth and evolution of any Twitter hashtag.
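The plumbing is simple enough to sketch. Something like the following Python fragment, which glosses over paging and escaping, fetches the Atom results for a hashtag from Twitter’s search API and writes the simple XML event format that Timeline consumes:

import urllib2
from xml.etree import ElementTree

ATOM = '{http://www.w3.org/2005/Atom}'

feed = ElementTree.parse(urllib2.urlopen(
    'http://search.twitter.com/search.atom?q=%23pechakucha'))

# emit one Timeline event per tweet, dated by the Atom published element
print '<data>'
for entry in feed.findall(ATOM + 'entry'):
    when = entry.find(ATOM + 'published').text
    title = entry.find(ATOM + 'title').text
    title = title.replace('&', '&amp;').replace('<', '&lt;').replace('"', '&quot;')
    print '<event start="%s" title="%s" />' % (when, title.encode('utf-8'))
print '</data>'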

Here’s a Timeline for #quotes. It started on April 6, 2008, when Leonardo Souza quoth: “#quotes ‘This story, like any story worth telling, is about a girl'”, which evidently is from Spider-Man.

One of the nice features of Timeline, one of David Huynh’s many ingenious creations, is this condensed summary of activity:

Here we can see sporadic use of #quotes from April to the first of September, and then much heavier use. What happened on September 1? Tim O’Reilly, a powerful meme transmitter and amplifier, quoth: “‘The skill of writing is to create a context in which other people can think.’ Edwin Schlossberg. #quotes”

In Timeline we can watch other Twitter users immediately begin to use and transmit the #quotes meme:

This method will be most useful for watching Twitter hashtags that haven’t yet been widely adopted. If you apply it to, say, #ike you’ll run into two problems. First, Twitter’s API caps the number of search results you can retrieve, so in the case of #ike we can only see back as far as September 18. Second, Timeline struggles to display thousands of events.

These are general problems. No matter which Petri dish we observe — del.icio.us tagspace, the blogosphere, Twitter — our ability to watch memes evolve is limited by the amount of data we can gather, and also by our ability to effectively visualize what data we can gather. I expect both constraints to gradually erode. As they do, this game of meme tracking will become even more interesting.

The Congressional content management system

Recent legislative drama highlights the absurdity of expecting people to make sense of complex texts that are evolving rapidly in high-stakes, high-pressure situations. What we have here is a classic culture clash, in this case between people who think in terms of paper documents and those who think in terms of electronic documents.

Washington is a paper-based culture. There are hopeful signs of change, and Bob Glushko spotted one of them here:

Based on the file name embedded in the pdf of the bill — O:\AYO\AYO08C04.xml — at least the people doing the publishing work for the bill are doing their best to save our tax dollars by creating the file using XML for efficient production and revision.

But there’s no public access to AYO08C04.xml. The government’s reflex is still to publish paper, or its electronic equivalent, PDF. So when the Sunlight Foundation’s John Wonderlich tried to visualize the evolution of the Senate’s version of the bailout bill, he was reduced to printing out PDFs, arranging them on the floor, and marking them up with a yellow highlighter.

Recognizing the futility of this approach, he complained on a mailing list and Joshua Tauberer responded with a special GovTrack.us feature that extracts the text from the PDFs and provides electronic comparisons. John Wonderlich observes:

Josh’s page does what I failed to effectively do with paper: get a comprehensive view of what has changed between each copy of the bill.

As I noted with respect to my recent legislative excursion, every Wikipedia author/editor takes for granted the ability to review the entire history of an article, compare differences between any two versions easily and effectively, and collaborate with other interested parties.

Even more powerful change visualization is possible, as we saw when Andy Baio, in response to my LazyWeb request for animation of Wikipedia change history, sponsored a contest that Dan Phiffer won.

Is MediaWiki, the software that powers Wikipedia, a more capable content management system than the one used by Congress to produce and collaboratively edit AYO08C04.xml? I would hope that the internal taxpayer-funded system is actually delivering the benefits that Bob Glushko supposes it must be. But how can we be sure? Maybe somebody in the know can comment.

Old-fashioned and newfangled plumbing

On this week’s Interviews with Innovators I followed up on the most unusual thing I saw at DEMO: a silicon-based flow-control valve for air conditioners. Mark Luckevich, VP of engineering for Microstaq, explains how they’re using MEMS (micro-electro-mechanical systems) to enable a simple retrofit that could save large amounts of the energy currently used for commercial air conditioning.

This conjunction of old-fashioned and newfangled styles of plumbing represents the sort of cultural mashup that always gets my attention. As Amory Lovins has been saying for decades about energy conservation, there’s low-hanging fruit we can harvest by instrumenting, monitoring, and controlling our HVAC systems using modern sensors, controls, and information systems. The Microstaq valve is a great example of that.

More generally, it points toward an interdisciplinary cross-fertilization that enables a set of well-established IT practices — logging, testing, debugging, hot-spot analysis, refactoring — to be applied in a very different domain.