A conversation with Dan Chudnov about OpenURL, context-sensitive linking, and digital archiving

Today’s podcast with Dan Chudnov is a sequel to my earlier podcast with Tony Hammond about the Nature Publishing Group’s use of digital object identifiers. I invited Dan to discuss related topics including the OpenURL standard for context-sensitive linking.

I’m not the only one who’s had a hard time understanding how these technologies relate to one another and to the web. See, for example, Dorothea Salo’s rant I hate library standards, and Dan’s own recent essay Rethinking OpenURL.

I have ventured into this confusing landscape because I think that the issues that libraries and academic publishers are wrestling with — persistent long-term storage, permanent URLs, reliable citation indexing and analysis — are ones that will matter to many businesses and individuals. As we project our corporate, professional, and personal identities onto the web, we’ll start to see that the long-term stability of those projections is valuable and worth paying for.

Recently, for example, Dave Winer — who’s been exploring Amazon’s S3 — wrote:

I have an idea of making a proposal to Amazon to pay it a onetime fee for hosting the content for perpetuity, that way I can remove a concern for my heirs, and feel that my writing may survive me, something I’d like to assure.

Beyond long-term storage of bits, there’s a whole cluster of related services that we’re coming to depend on, but that flow from relationships that are transient. When I moved this blog from infoworld.com to wordpress.com, for example, InfoWorld very graciously redirected the RSS feed, but another organization might not have done so. I could have finessed that issue by using FeedBurner, but I wasn’t — and honestly, still am not — ready to make a long-term bet on that service.

For most people today, digital archiving and web publishing services are provided to you by your school, by your employer, or — increasingly — by some entity on the web. When your life circumstances change, it’s often necessary or desirable to change your provider, but it’s rarely easy to do that, and almost never possible to do it without loss of continuity.

There are no absolute guarantees, of course, but a relatively strong assurance of continuity is something that more and more folks will be ready to pay for. Amazon is on the short list of organizations in a position to make such assurances. So, obviously, is Microsoft. Will Microsoft’s existing and future online services move in that direction? I hope so. Among other things, it’s a business model that doesn’t depend on advertising, and that would be a refreshing change.

XMP and microformats revisited

Yesterday I exercised poetic license when I suggested that Adobe’s Extensible Metadata Platform (XMP) was not only the spiritual cousin of microformats like hCalendar but also, perhaps, more likely to see widespread use in the near term. My poetic license was revoked, though, in a couple of comments:

Mike Linksvayer: How someone as massively clued-in as Jon Udell could be so misled as to describe XMP as a microformat is beyond me.

Danny Ayers: Like Mike I don’t really understand Jon’s references to microformats – I first assumed he meant XMP could be replaced with a uF.

Actually, I’m serious about this. If I step back and ask myself what are the essential qualities of a microformat, it’s a short list:

  1. A small chunk of machine-readable metadata,
  2. embedded in a document.
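Concretely, hCalendar is exactly such a chunk: a bit of event data carried in ordinary HTML class attributes. Here’s a sketch of one (the event and its details are invented), along with the few lines of Python it takes to recover the machine-readable fields:

```python
from html.parser import HTMLParser

# A minimal hCalendar event embedded in ordinary HTML.
# (The event itself is invented for illustration.)
HTML = """
<div class="vevent">
  <span class="summary">Library standards meetup</span>
  <abbr class="dtstart" title="2007-03-01">March 1</abbr>
</div>
"""

class HCalParser(HTMLParser):
    """Collect values for a couple of hCalendar property classes."""
    PROPS = {"summary", "dtstart"}

    def __init__(self):
        super().__init__()
        self.fields = {}
        self._current = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = set(attrs.get("class", "").split())
        matched = classes & self.PROPS
        if matched:
            name = matched.pop()
            # On abbr, the title attribute carries the machine-readable value
            if tag == "abbr" and "title" in attrs:
                self.fields[name] = attrs["title"]
            else:
                self._current = name

    def handle_data(self, data):
        if self._current:
            self.fields[self._current] = data.strip()
            self._current = None

p = HCalParser()
p.feed(HTML)
print(p.fields)  # {'summary': 'Library standards meetup', 'dtstart': '2007-03-01'}
```

The abbr pattern, where the title attribute carries the machine-readable date and the element’s text carries the human-readable one, is the microformat answer to colocating metadata with visible HTML.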

Mike notes:

XMP is embedded in a binary file, completely opaque to nearly all users; microformats put a premium on (practically require) colocation of metadata with human-visible HTML.

Yes, I understand. And as someone who is composing this blog entry as XHTML, in emacs, using a semantic CSS class that will enable me to search for quotes by Mike Linksvayer and find the above fragment, I’m obviously all about metadata coexisting with human-readable HTML. And I’ve been applying this technique since long before I ever heard the term microformats — my own term was originally microcontent.

But some things that have mattered to me in my ivory tower, like “colocation of metadata with human-visible HTML,” matter to almost nobody else. In the real world, people have been waiting — still are waiting — for widespread deployment of the tools that will enable them to embed chunks of metadata in documents, work with that metadata in-place, and exchange it.

We’ll get there, I hope and pray. But when we finally do, how different are these two scenarios, really?

  1. I use an interactive editor to create the chunk of metadata I embed in a blog posting.
  2. I use an interactive editor to create the chunk of metadata I embed in a photo.

Now there is, as Mike points out, a big philosophical difference between XMP, which aims for arbitrary extensibility, and fixed-function microformats that target specific things like calendar events. But in practice, from the programmer’s perspective, here’s what I observe.

Hand me an HTML document containing a microformat instance and I will cast about in search of tools to parse it, find a variety of ones that sort of work, and then wrestle with the details.

Hand me an image file containing an XMP fragment and, lo and behold, it’s the same story!

In both of these cases, there either will or won’t be enough use of these formats to kickstart the kind of virtuous cycle where production of the formats gets reasonably well normalized. In the ivory tower we pretend that the formats matter above all, and we argue endlessly about them. Personally I’d rather see what I’d consider to be a simpler and cleaner XMP. Others will doubtless argue that XMP doesn’t go far enough in its embrace of semantic web standards. But when we have that argument we are missing the point. What matters is use. This method of embedding metadata in photos is going to be used a whole lot, and in ways that are very like how I’ve been imagining microformats would be used.
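For what it’s worth, the XMP case is at least uniform at the byte level: the packet is plain UTF-8 XML bracketed by xpacket processing instructions, so locating it doesn’t even require a format-specific parser. A minimal sketch (the blob below is a synthetic stand-in for a real JPEG’s bytes):

```python
import re

def extract_xmp(data: bytes):
    """Pull the first XMP packet out of a binary file's bytes.

    XMP is serialized as UTF-8 XML bracketed by <?xpacket begin...?>
    and <?xpacket end...?> processing instructions, so a raw byte
    scan works regardless of the container (JPEG, TIFF, PDF, ...).
    """
    m = re.search(rb"<\?xpacket begin=.*?\?>(.*?)<\?xpacket end=.*?\?>",
                  data, re.DOTALL)
    return m.group(1).decode("utf-8").strip() if m else None

# Demo on a synthetic blob standing in for a JPEG's bytes:
blob = (b"\xff\xd8...image data..."
        b"<?xpacket begin='\xef\xbb\xbf' id='W5M0MpCehiHzreSzNTczkc9d'?>"
        b"<rdf:RDF><rdf:Description/></rdf:RDF>"
        b"<?xpacket end='w'?>"
        b"...more image data...")
print(extract_xmp(blob))  # -> <rdf:RDF><rdf:Description/></rdf:RDF>
```

Of course, finding the packet is the easy part; making sense of the RDF inside it is where the wrestling begins.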

PS: As per this comment, Scott Dart informs me that PNG (and to a lesser extent GIF) can embed arbitrary metadata, but that support for those embeddings regrettably didn’t make the cut in .NET Framework 3.0.

Truth, files, microformats, and XMP

In 2005 I noted the following two definitions of truth:

1. WinFS architect Quentin Clark: “We [i.e. the WinFS database] are the truth.”

2. Indigo architect Don Box: “Message, oh Message / The Truth Is On The Wire / There Is Nothing Else”

Today I’m adding a third definition:

3. Scott Dart, program manager for the Vista Photo Gallery: “The truth is in the file.”

What Scott means is that although image metadata is cached in a database, so that Photo Gallery can search and organize quickly, the canonical location for metadata, including tags, is the file itself. As a result, when you use Photo Gallery to tag your images, you’re making an investment in the image files themselves. If you copy those files to another machine, or upload them to the Net, the tags will travel with those image files. Other applications will be able to make them visible and editable, and those edits can flow back to your local store if you transfer the files back.

That’s huge. It’s also, of course, a bit more complicated. As Scott explains, there are different flavors of metadata: EXIF, IPTC, and the new favorite, XMP. And not all image formats can embed image metadata. In fact many popular formats can’t, including PNG, GIF, and BMP. [Update: Incorrect, see next rock.] But JPG can, and it’s a wonderful thing to behold.

For example, I selected a picture of a yellow flower in Photo Gallery and tagged it with flower. Here’s the XML (abridged) that showed up inside yellowflower.jpg:

<xmp:xmpmeta xmlns:xmp="adobe:ns:meta/">
 <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description rdf:about="uuid:faf5bdd5-ba3d-11da-ad31-d33d75182f1b"
      xmlns:dc="http://purl.org/dc/elements/1.1/">
   <dc:subject>
    <rdf:Bag xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
     <rdf:li>flower</rdf:li>
    </rdf:Bag>
   </dc:subject>
  </rdf:Description>
  <rdf:Description xmlns:MicrosoftPhoto="http://ns.microsoft.com/photo/1.0">
   ...
  </rdf:Description>
  <rdf:Description xmlns:xmp="http://ns.adobe.com/xap/1.0/">
   ...
  </rdf:Description>
 </rdf:RDF>
</xmp:xmpmeta>

It’s a bit of a mish-mash, to say the least. There’s RDF (Resource Description Framework) syntax, Adobe-style metadata syntax, and Microsoft-style metadata syntax. But it works. And when I look at this it strikes me that here, finally, is a microformat that has a shot at reaching critical mass.

Perhaps we’ve been looking in the wrong places for the first microformat to achieve liftoff. Many of us hoped hCalendar would, but it’s hard to argue that it has. I suppose that’s partly because even though we have a variety of online event services that produce the hCalendar format, there just aren’t that many people publishing and annotating that many events.

There are already a lot of people saving, publishing, and annotating photos. And the tagging interface in Vista’s Photo Gallery, which is really sweet, is about to recruit a whole lot more.

There’s also good support in .NET Framework 3.0 for reading and writing XMP metadata. In the example above, the tag flower was assigned interactively in Photo Gallery. Here’s an IronPython script to read that tag, and change it to iris.

import clr
clr.AddReference('PresentationCore')
from System.IO import FileStream, FileMode, FileAccess, FileShare
from System.Windows.Media.Imaging import JpegBitmapDecoder, BitmapCreateOptions, BitmapCacheOption

def ReadFirstTag(jpg):
  # open the JPEG, decode its first frame, and query the first dc:subject tag
  f = FileStream(jpg, FileMode.Open)
  decoder = JpegBitmapDecoder(f, BitmapCreateOptions.PreservePixelFormat,
                              BitmapCacheOption.Default)
  frame = decoder.Frames[0]
  metadata = frame.Metadata
  tag = metadata.GetQuery("/xmp/dc:subject/{int=0}")
  f.Close()
  return tag

def WriteFirstTag(jpg, tag):
  # rewrite the first dc:subject tag in place, without re-encoding the image
  f = FileStream(jpg, FileMode.Open, FileAccess.ReadWrite, FileShare.ReadWrite)
  decoder = JpegBitmapDecoder(f, BitmapCreateOptions.PreservePixelFormat,
                              BitmapCacheOption.Default)
  frame = decoder.Frames[0]
  writer = frame.CreateInPlaceBitmapMetadataWriter()
  writer.SetQuery("/xmp/dc:subject/{int=0}", tag)
  if not writer.TrySave():
    print "cannot save metadata"
  f.Close()

print ReadFirstTag('yellowflower.jpg')
WriteFirstTag('yellowflower.jpg', 'iris')
print ReadFirstTag('yellowflower.jpg')

The output of this script is:

flower
iris
And when you revisit the photo in Photo Gallery, the tag has indeed changed from flower to iris. Very cool.

Adaptive user interfaces for focused attention

The goal of the search strategy I outlined the other day was to find Mary Czerwinski, a Microsoft researcher, and interview her for a podcast. I did find her, and the resulting podcast is here. We had a great time talking about ways that adaptive user interfaces can leverage spatial and temporal memory, about ambient awareness of team activity, and about the proper role of interruptions in the modern work environment.

In the course of the conversation I mentioned WriteRoom and the notion of a distraction-free desktop. Lately I find myself powerfully attracted to Zen simplicity, and I wondered how that impulse might square with the new Office ribbon. It’s a great improvement over the conventional menu systems, but I wondered if there were a quick and easy way to suppress the ribbon when you want to achieve the WriteRoom effect.

It turns out that there are several ways to do that, and I documented them in this short screencast.

Now that I’ve learned how to use the ribbon selectively, there’s one piece of unfinished business. In Vista, as in Windows XP, you can hide the desktop icons by right-clicking the desktop and choosing View->Show Desktop Icons. But to really incorporate this feature into your workflow you’d want it on a hotkey, like WindowsKey->M, which instantly minimizes all open windows.

Jeff Ullmann had written to me a while ago with a solution based on the Windows Scripting Host, but the registry layout that it depends on is different in Vista. So, how can you make a clean-desktop hotkey in Vista? I’ve seen the question asked in various places but as yet have found no answers. If you’ve got the recipe I’d love to see it.

Annotate the web, then rewire it

In an essay last week about Yahoo Pipes, Tim O’Reilly said he was inspired, back in 1997, by a talk at the first Perl conference in which I had “expressed a vision of web sites as data sources that could be re-used, and of a new programming paradigm that took the whole internet as its platform.” Someone asked in the comments whether that idea hadn’t instead been put forward in Andrew Schulman’s talk. It turns out that neither Tim nor I can remember exactly what Andrew and I said, but I hope we both touched on this idea because it’s a big one that underlies the whole web services movement and much else besides.

Later on in that comment thread, Tim cites an email message from me in which I try to reconstruct what may have happened. One of the artifacts I dug up was this 1996 BYTE column (cleaner version here). That’s when the lightbulb clicked on for me, and I saw very clearly that the web was a collection of components that I’d be able to wire together.

Of course all I was doing was drawing attention to what the creators of the web had intended and made possible. In my recent interview with Roy Fielding, for example, we talked about his early work on libwww-perl, the library that made websites into playthings for Perl programmers. Wiring the web was very much part of the original vision. The idea just needed some champions to broaden its appeal. That’s the role that I, among others, have played.

From that perspective, then, what of Yahoo Pipes? It delights me! Much more importantly, I think it could ultimately appeal to non-technical folks, but there are some conceptual barriers to overcome. The concept of “wiring the web” is one of those, but not the first one. The dominant way in which most people will “program” the web is by writing metadata, not code, and we’ll need an interface as friendly and powerful as Pipes to help them do that.

That last sentence won’t make any sense to the average non-technical person, but the example I gave yesterday might. A by-product of this presidential election cycle will be massive quantities of online video. We should expect to be able to reach into the various repositories and assemble coherent views by issue and by candidate, and Yahoo Pipes would be a great way to do that. But not until and unless the video has been sliced and diced and tagged appropriately so as to yield to structured search.

It’s the slicing and dicing and tagging, not the rewiring, that’s the real bottleneck. I talked last week about factoring group formation out of the various social networks into a common infrastructure. We need to do the same for tagging. How do I know whether to tag my contribution as HillaryClinton and NewHampshire and manufacturing or Hillary Clinton and NH and manufacturing? Where’s the immediate feedback that shows me, across tag-oriented services including YouTube and Blip, how my contribution does or doesn’t align with others, and how I might adjust my tag vocabulary to improve that alignment?

When I tag a video snippet with the name of a politician (“Hillary Clinton”) and a topic (“manufacturing”) I clearly envision a future query in which these slots are filled with the same values or different ones (“Barack Obama”, “energy”). And I clearly envision the kinds of richly-annotated topical remixes that such queries will enable. But such outcomes are not obvious to most people. We need to figure out how to make them obvious.

Retail politics in New Hampshire circa 2007

Hillary Clinton kicked off her campaign this weekend in New Hampshire, and spoke today at the high school in Keene, where I live. Seeing candidates up close and personal is one of the perks of life in small-town New Hampshire, but today it didn’t pan out for me. I arrived early but still couldn’t get into the cafeteria where the event was held. I could have watched the video feed that was piped into the auditorium for a spillover crowd, but instead I went home and watched on the local cable channel.

Here’s a question-and-answer exchange that I captured and put up on Blip.tv:

The question was: “How can government revive and support U.S. manufacturing?” The five-part answer runs almost six-and-a-half minutes. That’s way more time than is ever allotted in the official debates we so obsessively scrutinize.

Retail politics is a wonderful thing, and I wish I’d been there in person. Not everyone who lives in Keene got in, though, and few who live outside Keene did. But those of us connected to the local cable network got to see and hear a whole lot more than the snippets that will air on regular TV. The same will be true in other local communities. Collectively over the course of the various campaigns we’ll see and hear a lot and, in principle, we will be able to collaboratively make sense of it.

By the time the 2008 election rolls around, we ought to be in a position to assemble and review catalogs of these kinds of detailed responses, tagged by candidate and by issue. If you care about manufacturing, you ought to be able to mix yourself a 2-hour show that includes the most informative discourse on the topic from all the candidates. And you should be able to review commentary, from experts who aren’t necessarily the usual TV suspects, that adds value to that discourse.

In practice there’s a fly in the ointment. Are we allowed to republish and categorize this material, as I’ve done here, to provide fodder for decentralized discussion and analysis?

I’m going to check with the guy who runs our local cable channel tomorrow and if there’s a problem I’ll take that video down. But I hope there won’t be a problem. What’s more, I hope that he and his counterparts in other communities will take the issue off the table by choosing appropriate Creative Commons-style licenses for this kind of public-interest material, whether it airs on local cable channels or streams to the Net or both.

A conversation with Antonio Rodriguez about Tabblo, photo albums, and social networks

My guest for this week’s podcast is Antonio Rodriguez, founder of Tabblo, a photo site that’s used to create online photo albums that can be transformed into a variety of print formats.

Among the topics of discussion were:

  • How photo albums tell stories about key events in people’s lives
  • Strategies for archival storage of images
  • Strategies for organizing collections of images
  • The relationship between photo applications that live on the desktop and applications that live in the cloud
  • Whether people share their photos online, and if so, with whom
  • What Tabblo’s layout engine does, and how it might be extended
  • Automatic geotagging

We also revisited a topic we’d discussed earlier in the week, on a panel at the MIT Enterprise Forum. The question, also explored here, is: How might certain features of social networks, notably group formation, be factored out of individual sites and made available in a more federated way?