Art Rhyno’s science project

Art Rhyno’s title is Systems Librarian, but he should consider adding Mad Scientist to his business card because he is full of wild and crazy and (to me, at least) brilliant ideas. Last year, when I was a judge for the Talis “Mashing up the Library” competition, one of my favorite entries was this one from Art. The project mirrors a library catalog to the desktop and integrates it with desktop search. The search tool in this case is Google Desktop, though it could be another, and the integration is accomplished by exposing the catalog as a set of Web Folders, which Art correctly describes as “Microsoft’s in-built and oft-overlooked WebDAV option.”

There’s more going on in this example than even I can easily wrap my head around, but let’s step back and consider the document itself, which Art provides at this URL:

http://librarycog.uwindsor.ca:8087/artblog/librarycog/indexcat

That’s a very special URL. Art explains:

This document was created in OpenOffice and is served directly on the web using Cocoon’s nifty Zip support and the elegant and sensible XML syntax of OpenDocument.

In other words, Art writes and maintains an OpenOffice document, but an intermediary translates it on the fly into an HTML document. Other translations are equally feasible — to Word’s XML format, for example. What’s more:

Add in WebDAV support, and the barriers between the desktop and the Web start to blur, and the options for repurposing content achieve megaton levels.
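To make the on-the-fly translation idea concrete, here is a minimal sketch in Python rather than Cocoon. It is not Art's pipeline. It only illustrates that an OpenDocument file is a ZIP archive whose content.xml can be read and rendered as HTML. The function name odt_to_html is made up for this example, and the sketch handles nothing beyond headings and plain paragraphs.

```python
import io
import zipfile
import xml.etree.ElementTree as ET
from html import escape

# OpenDocument namespace for text content (headings, paragraphs).
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"


def odt_to_html(odt_bytes: bytes) -> str:
    """Render the headings and paragraphs of an .odt file as bare HTML.

    Hypothetical helper: styles, lists, tables, and images are ignored.
    """
    # An .odt file is a ZIP archive; the document body lives in content.xml.
    with zipfile.ZipFile(io.BytesIO(odt_bytes)) as archive:
        root = ET.fromstring(archive.read("content.xml"))

    parts = []
    for element in root.iter():
        text = escape("".join(element.itertext()))
        if element.tag == f"{{{TEXT_NS}}}h":
            # Map the ODF outline level onto h1..h6.
            raw_level = element.get(f"{{{TEXT_NS}}}outline-level", "1")
            level = min(max(int(raw_level), 1), 6)
            parts.append(f"<h{level}>{text}</h{level}>")
        elif element.tag == f"{{{TEXT_NS}}}p":
            parts.append(f"<p>{text}</p>")
    return "\n".join(parts)
```

A server that did this per request, instead of exporting HTML by hand, would be doing in miniature what Art describes: the author keeps editing the OpenOffice file, and readers always see the current HTML rendering.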

Now let’s switch gears and look at some remarkable developments in the realm of Office. In this entry, Doug Mahugh explores the anatomy of a Word document that embeds within itself a contact record that’s in hCard format. The exact same chunk of XHTML data that you’d find on a web page, like this one, lives inside the Word document.

What’s more, the fields of the contact record can be individually read and written because they’re bound to controls. So if you download the file from Doug’s blog and open it up in Word 2007, you can modify the contact record in situ. The rewritten fields appear inline with the text of the document, but under the covers they’re written into a custom XML part — that is, a file of XML that lives inside the ZIP file that is the new format for Word docs.

(The mechanism that Doug describes for wiring the interactive controls to the custom part in which they’re stored is radically simplified by Matthew Scott’s Content Control Toolkit, a really nice visual editor and mapper that’s freely available as both an executable .NET program and as C# source code.)
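Because the new Word format is just a ZIP file, you can peek at those custom XML parts without Word at all. Here is a rough sketch in Python. It assumes the data parts sit at customXml/item1.xml, customXml/item2.xml, and so on, which is where Word conventionally puts them; strictly speaking the package relationships decide the location, so treat the path test as an assumption. The function name custom_xml_parts is invented for the example.

```python
import io
import zipfile


def custom_xml_parts(docx_bytes: bytes) -> dict[str, bytes]:
    """Return {part name: raw XML bytes} for custom XML data parts in a .docx.

    Hypothetical helper: relies on the conventional customXml/itemN.xml
    naming rather than resolving the package relationships.
    """
    parts = {}
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as package:
        for name in package.namelist():
            is_item = name.startswith("customXml/item") and name.endswith(".xml")
            # itemPropsN.xml files describe a part; they are not the data itself.
            if is_item and "Props" not in name:
                parts[name] = package.read(name)
    return parts
```

Given the bytes of a document like Doug's, the returned dictionary should include the part that holds the contact record, ready to hand to any XML parser.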

The style of intermediation that Art Rhyno’s been developing — based on the notion of what he calls a WebDAV proxy — could produce powerful effects in this realm too, blurring the boundaries between XML file formats on the one hand and between the desktop and the web on the other. For example, the act of opening a file containing an embedded hCard could silently trigger the extraction of that contact information, and the storage of it locally or remotely or both.
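The extraction step itself is not hard to imagine. As a sketch under stated assumptions, here is how an intermediary might pull contact fields out of a chunk of hCard XHTML, in Python. It assumes the markup is well-formed XML, looks only at a handful of hCard class names, and takes each property's value from the element text, which is simpler than the real hCard parsing rules. The function name extract_hcards and the sample contact are made up.

```python
import xml.etree.ElementTree as ET

# A few common hCard property class names; the real vocabulary is larger.
HCARD_PROPERTIES = ("fn", "org", "email", "tel", "url")


def extract_hcards(xhtml: str) -> list[dict[str, str]]:
    """Return one dict of property -> text for each class="vcard" element."""
    root = ET.fromstring(xhtml)
    cards = []
    for element in root.iter():
        if "vcard" not in element.get("class", "").split():
            continue
        card = {}
        for node in element.iter():
            for cls in node.get("class", "").split():
                # Keep the first value seen for each property.
                if cls in HCARD_PROPERTIES and cls not in card:
                    card[cls] = "".join(node.itertext()).strip()
        cards.append(card)
    return cards


# Made-up sample data, not taken from Doug's document.
sample = (
    '<div xmlns="http://www.w3.org/1999/xhtml" class="vcard">'
    '<span class="fn">Jane Example</span>, '
    '<span class="org">Example Library</span> '
    '<a class="email" href="mailto:jane@example.org">jane@example.org</a>'
    '</div>'
)
print(extract_hcards(sample))
```

What happens to the resulting dictionaries (a local address book, a remote store, or both) is exactly the part a WebDAV proxy could decide on the user's behalf.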

I’d like to explore this theme on the Windows desktop and find out what’s possible. Art’s weapon of choice is Cocoon, the Apache project’s XML pipelining framework. And of course Cocoon can run on the Windows desktop. But deploying it there isn’t something many people are likely to want to do. So I’m looking for a way to achieve similar effects with infrastructure that’s based on (or ideally already contained within) the .NET Framework. Does it exist?


7 thoughts on “Art Rhyno’s science project”

  1. I guess the logical question then becomes, “Who needs the desktop?” Why not cut out the middle man and just go directly to Google Docs? I realize that this isn’t a practical position to take in the current environment (especially in the enterprise), and understand the value of offline working and security. However, blurring the lines between text and presentation seems to be a harbinger of bad things to come for the owner of the desktop.
