Passwordless MyOpenID

In response to a Kim Cameron item about Blogger’s support for OpenID — and, when the OpenID provider is myopenid.com, for identity selectors — Vittorio Bertocci pointed out something I had not realized:

MyOpenID does exactly what I was asking for: it allows me to create a new openid without having to establish any password. Let me repeat/rephrase it: I can create an account that can be accessed exclusively by using a personal card.

That got my attention. Coincidentally I had just been reading the rough cut of Vittorio’s forthcoming book, Understanding CardSpace, and was at the same time reviewing how OpenID providers like MyOpenID work with OpenID relying parties like ClaimID.com. The ability to create a passwordless, card-only account on MyOpenID is a great step forward, for the reasons Vittorio explains on his blog.

I went over to MyOpenID, created a new, passwordless account, associated that OpenID URL with my ClaimID account, and away I went. Nice!

Now I’m trying to imagine how I would explain all this to a civilian. Honestly, I don’t think I could, yet. It’s a stretch even for me to hold in my head all the moving parts. Which identity selector works with which browser on which platform? What does the card represent? What does the OpenID URL represent?

But we are tantalizingly close to real use cases that will begin to walk people through these scenarios. It’s difficult to describe the abstractions, but as people begin to actually have the experiences, it’ll all start to come clear. Similarly, as people start to have the managed-card experiences that Dick Hardt discusses in our ITConversations podcast, those will start to come clear as well.

To all those attending the Internet Identity Workshop today: Thanks, and keep up the great work!

A conversation with Greg Whisenant about CrimeReports.com

For this week’s ITConversations show I spoke with Greg Whisenant, founder of CrimeReports.com. His company, called Public Engines, has ambitions to offer a range of services that enable citizens to access public data. CrimeReports, the flagship, aims to generalize the process of data extraction and reformulation that was done by Adrian Holovaty for ChicagoCrime.org. It works by installing software behind the police department’s firewall that relays crime data from internal reporting systems to the CrimeReports service.

Participating towns and cities all become part of a single federated mapping application. So if two towns are adjacent, you’ll just pan seamlessly across the political border. It’s a cool idea, and makes you wonder about how a service/syndication-oriented architecture could enable federation across different mapping applications.
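The seamless-pan idea is easy to picture in code: each jurisdiction contributes records to a common pool, and the map queries by viewport rather than by town. Here's a minimal sketch of that approach; the town names, incident data, and function names are all invented for illustration:

```python
# Hypothetical sketch: two towns' crime feeds merged into one pool,
# queried by map viewport (bounding box) rather than by jurisdiction.
incidents = [
    {"town": "Keene",   "type": "burglary", "lat": 42.933, "lon": -72.278},
    {"town": "Swanzey", "type": "theft",    "lat": 42.869, "lon": -72.281},
]

def in_viewport(incident, south, west, north, east):
    return south <= incident["lat"] <= north and west <= incident["lon"] <= east

def query(viewport):
    """Return every incident in the viewport, ignoring town lines."""
    return [i for i in incidents if in_viewport(i, *viewport)]

# A viewport straddling the border between the two towns returns both.
print(query((42.85, -72.30, 42.95, -72.25)))
```

Because the query is purely spatial, the political border simply disappears from the user's point of view; federation becomes a data-pooling problem rather than a UI problem.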

What’s particularly exciting to Greg, and to me as well, is the way in which these kinds of applications begin to create a framework for citizen/government collaboration. To that end, it’ll be important to roll out these services at a pace, and in a way, that enables governments to feel comfortable as they move to a more transparent stance. So CrimeReports does things in a pretty controlled way. Police departments can internally preview the application before it’s released, and there’s also the option to run more detailed analysis internally than is available to the public.

What worries me a little, though, is that CrimeReports implementations don’t (so far) yield up feeds of the underlying data. I understand the reasons why not. But I think it’s crucial that citizens come to expect such access, and that they be encouraged to make effective use of it.

First things first, to be sure. Systems that enable citizens to both report and review a variety of events in the lives of their cities will bring a new and welcome era of collaboration. But let’s make sure the data flowing through those systems is, and remains, available.

Is software too soft?

The other night I was remotely assisting my mom, because she couldn’t find the search box in her Safari browser. Turns out that she’d somehow removed it, along with the address box, from the browser’s chrome. Not being a regular Safari user, I took a minute to track down where to fix this: View -> Customize Address Bar. But on reflection I can see why she was utterly baffled by the disappearance of this basic landmark.

We talk a lot about how people can figure out how to use cars, and about how they’ll be able to figure out how to use computers too — if only we can make computers as “easy to use” as cars.

But nobody ever gets into a car and asks: “Hey, where’d the steering wheel go?”

Software is essentially metamorphic, and none of us — if we’re honest — can deal very well with that. This isn’t simply a question of newbies versus adepts. In a lecture on the personalization of search — part of the UC Berkeley course Search Engines: Technology, Society, and Business — Microsoft researcher Jaime Teevan talks about how something like 40% of the finding that people do is actually re-finding. Most people don’t bookmark or otherwise save found items because they expect to be able to find them again. But they also expect to re-find an item at the same position in the search results list, and they’re significantly disrupted if it has moved.

If you observe yourself interacting with a computer, you’ll see lots of examples of this kind of thing. The composition and sequence of buttons or bookmarklets in a toolbar is completely arbitrary, but once you’ve created a layout you start to depend on it in ways that you don’t even realize until you switch to another environment that lacks that customization. Navigational paths through applications, or file systems, are trails that could have been blazed in a number of ways but, once blazed in a particular way, compel you to follow them. And when those trails are disrupted, so are you.

Sometimes I wonder if computer interfaces simply have too many degrees of freedom for most people to ever really be comfortable with. And if handhelds will become ascendant not only because the devices are mobile, but also because the interfaces aren’t so aggressively metamorphic.

CardSpace for the rest of us

Hat tip to the CardSpace team for enabling “long tail” use of Information Card technology by lots of folks who are (understandably) daunted by the prospect of installing SSL certificates onto web servers. Kim Cameron’s screencast walks through the scenario in PHP, but anyone who can parse a bit of XML in any language will be able to follow along. The demo shows how to create a simple http: (not https:) web page that invokes an identity selector, and then parses out and reports the attributes sent by the client.
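The parsing step in the demo amounts to walking a bit of XML and pulling out claim values. Here's a rough sketch of that idea in Python rather than the demo's PHP, using a deliberately simplified stand-in for the token — real tokens are namespaced SAML assertions, and in higher-value scenarios they're encrypted:

```python
import xml.etree.ElementTree as ET

# A made-up, simplified stand-in for the token an identity selector
# might post. Real tokens are SAML assertions with namespaces.
token = """
<Assertion>
  <AttributeStatement>
    <Attribute Name="givenname"><Value>Jon</Value></Attribute>
    <Attribute Name="emailaddress"><Value>jon@example.com</Value></Attribute>
  </AttributeStatement>
</Assertion>
"""

def claims_from_token(xml_text):
    """Return a dict mapping claim name to value from the token XML."""
    root = ET.fromstring(xml_text)
    claims = {}
    for attr in root.iter("Attribute"):
        value = attr.find("Value")
        if value is not None:
            claims[attr.get("Name")] = value.text
    return claims

print(claims_from_token(token))
```

The point is just that, once SSL is out of the picture, the receiving side of the exchange is nothing more exotic than this kind of attribute extraction.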

As Kim points out, this is advisable only in low-value scenarios where an unencrypted exchange may be deemed acceptable. But when you count blogs, and other kinds of lightweight or ad-hoc services, there are a lot of those scenarios.

Kim adds the following key point:

Students and others who want to see the basic ideas of the Metasystem can therefore get into the game more easily, and upgrade to certificates once they’ve mastered the basics.

Exactly. Understanding the logistics of SSL is unrelated to understanding how identity claims can be represented and exchanged. Separating those concerns is a great way to grow the latter understanding.

More simple, single-purpose screen sharing

In a 2006 InfoWorld column entitled Simple, single-purpose screen sharing I lamented the tendency of screen-sharing programs to pile on features that get in the way of doing the core screen-sharing function simply and well. Readers responded with quite a few suggestions. Yesterday I learned about a Microsoft product, SharedView, which belongs on that list. (Note: it’s currently a beta 2 release.)

For me, the Ars Technica review of SharedView — which dings it for lacking the full feature set you find in products like WebEx or Live Meeting — completely misses the point. Simple, single-purpose screen sharing is exactly what I want, and exactly what SharedView delivers.

Excellent debate visualizer at NYTimes.com

Kudos to the New York Times for its remarkable interactive transcript of the Republican debate. The display has two tabs: Video Transcript, and Transcript Analyzer. The Video Transcript is a side-by-side display of the video and the transcript, linked together and randomly accessible from either side, plus a list of topics that you can jump to. These are the kinds of features you’d like to be able to take for granted, but which aren’t always implemented as cleanly as they are here.

But the Transcript Analyzer takes the game to a new level. Or at least, one that I haven’t seen before in a mainstream publication. The entire conversation is chunked and can be visualized in several ways. It’s reminiscent of the Open University’s FlashMeeting technology which I mentioned here.

In the Times’ visualizer, you can see at a glance the length and wordcount of all participants’ contributions — including YouTube participants who are aggregated as “YouTube user”. Selecting a participant highlights their contributions, and when you mouse over the colored bars, that section of the transcript pops up.

Even more wonderful is the ability to search for words, and see at a glance which chunks from which participants contain those words. The found chunks are highlighted and, in a really nice touch, the locations of the found words within the chunks are indicated with small dark bars. Mouse over a found chunk, and the transcript pops up with the found words in bold. Wow! It’s just stunningly well done.

The point of all this, of course, is not to exhibit stunning technical virtuosity, although it does. The point is to be able to type in a word like, say, energy, and instantly discover that only one candidate said anything substantive on the topic. (It was Mitt Romney, by the way.) Somehow, in all of the presidential campaigning, that topic continues to languish. But with tools like this, citizens can begin to focus with laserlike precision not only on what candidates are saying, but also — and in some ways more crucially — on what they are not.

Hats off to the Times’ Shan Carter, Gabriel Dance, Matt Ericson, Tom Jackson, Jonathan Ellis, and Sarah Wheaton for their great work on this amazing conversation visualizer.

Your winnings, sir

In random moments I type my first name into Google to check on my long-running competition with Jon Stewart for the top spot. I thought that once he ousted me it would be all over, but strangely there are still days — like today — when I show up first. Except not really, because the top link goes to my InfoWorld blog, not my current blog which currently shows up at #40.

The situation is completely different in Live Search, by the way, where I’m way down in the list along with other Jons who are loved by Google but are not conventionally famous.

This Google love is a temporary anomaly that’s lasted longer than I expected. But if things really shouldn’t work this way, how should they work?

Part of the answer is a lifebits service that guarantees me a persistent lifelong online persona and namespace. That’ll present interesting challenges as people mix personal identities with institutional identities, and then move among institutions. But those challenges will also create business opportunities for a service fabric that manages identity, syndicates content, and measures reputation.

Suppose you’re a Microsoft blogger who has launched at blogs.msdn.com. You can choose to write a mostly professional blog, or a mostly personal one, or a blend of both. Or you can separate the professional from the personal by establishing separate blogs. But no matter how you slice it, there are no good answers to some vexing questions like:

How do you integrate the online persona that you developed before joining Microsoft, or the one you will develop if you leave?

or:

If you establish separate blogs for separate purposes, but wish to combine their reputation effects, how do you do that?

More broadly, this isn’t just about the reputation that accrues to your online persona, but also the reputation that it confers on others. Page ranking algorithms are numeric, not social. People who know me, and my work, value resources I cite because it’s me citing them. So they assign equal value to citations that emanate from weblog.infoworld.com/udell or from jonudell.net. But ranking engines have no idea that those two sources represent a common identity, and no idea of how other identities relate to that one.
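The missing piece is easy to state in code: a mapping from sources to identities, consulted before weights are aggregated. Here's a toy sketch; the weights and the third source are invented, and `identity_weights` is a hypothetical name for an operation no real ranking engine exposes today:

```python
# Hypothetical: fold per-source citation weights into per-identity
# weights, given a declared mapping of sources to a common identity.
identity_of = {
    "weblog.infoworld.com/udell": "id:jon",
    "jonudell.net": "id:jon",
    "example.org/someone": "id:other",
}
source_weight = {
    "weblog.infoworld.com/udell": 0.7,
    "jonudell.net": 0.2,
    "example.org/someone": 0.4,
}

def identity_weights():
    """Sum each source's weight into the identity it belongs to."""
    totals = {}
    for source, weight in source_weight.items():
        ident = identity_of[source]
        totals[ident] = totals.get(ident, 0.0) + weight
    return totals

print(identity_weights())
```

The interesting part isn't the arithmetic, of course; it's who gets to assert, and who gets to trust, the mapping in `identity_of` — which is exactly where claims-based identity comes in.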

The service fabric I’m envisioning would deal with this problem by means of:

1. Claims-based digital identity.

2. Persistent digital object identifiers.

From the identity metasystem manifesto:

Digital identities consist of sets of claims made about the subject of the identity, where “claims” are pieces of information about the subject that the issuer asserts are valid.

In this scenario the issuer of claims about me might as well be me. I have no need to appeal to some other authority, I just want to be able to say, definitively, “I published this piece of content,” and also, “I linked to that other piece of content.”

Now, although we normally think of people as having digital identities, it seems to me that digital objects can have them too. If those objects have unique and stable identifiers, then they can be the subjects of claims. In the case of a conventional hyperlink, the claim is simply that my digital identity has linked to a digital object that’s associated with some other digital identity. Your evaluation of me, of the object, and of the object’s author can leverage not only the numeric weights assigned by conventional search engines, but also claims made — about me, the object, or the object’s author — by people in your social network whom you trust.

We can also imagine the service fabric supporting stronger claims, like “I recommend this object,” or “I assert that this object has been peer-reviewed,” or “This object is required reading at institution X for purpose Y.” These claims won’t be implicit in the web, but could arise from a federation of identity and content services.
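One way to picture such a fabric: a claim is just an (issuer, subject, assertion) triple, where the subject can be a person or a stably identified object. A toy sketch, with every identifier invented:

```python
# Toy model: a claim is an (issuer, subject, assertion) triple.
# Subjects can be people or stably identified digital objects.
claims = [
    ("id:jon", "doi:obj-123", "published"),
    ("id:jon", "doi:obj-456", "linked-to"),
    ("id:ann", "doi:obj-123", "peer-reviewed"),
]

def claims_about(subject):
    """Every (issuer, assertion) pair made about a given subject."""
    return [(issuer, assertion) for issuer, subj, assertion in claims
            if subj == subject]

print(claims_about("doi:obj-123"))
```

In a real fabric the issuers would be cryptographically verified and the assertions drawn from agreed vocabularies, but the shape of the data — and of queries like "what has my network asserted about this object?" — would look much like this.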

It’s admittedly a stretch, but surely a worthy ambition. The recent brouhaha at TechCrunch, about astroturfing YouTube to make videos go viral, drew strong reactions from all quarters. Some people were shocked by the tactics described. Others were shocked by the naivete of the shocked. And still others were shocked in a Casablanca sort of way:

Captain Renault:
I’m shocked, shocked to find that gambling is going on in here!

[a croupier hands Renault a pile of money]

Croupier:
Your winnings, sir.

Captain Renault:
[sotto voce] Oh, thank you very much.

Piles of money will continue to be made in this way. But there are other piles that can be made by offering identity and content services that take us in another direction. I would like to gravitate toward those piles.

Social information management

For much of my career I’ve been exploring ways to bring people and information together in shared online spaces. Groupware, social software, the semantic web, the giant global graph, it’s all the same to me in one fundamental way. When we push information into shared spaces, data finds data, and people find people, and all sorts of magic happens.

What do all these acts have in common?

– I publish a blog entry about a technique I just learned.

– I geocode a photo on Flickr.

– I curate a list on del.icio.us.

– I add to a Wikipedia page.

– I arrange a dinner using Windows Live Events.

These all begin as acts of personal information management. In another era I’d have written down the technique I learned in a private journal, put the photo in a shoebox, kept the list in a notebook, scribbled in the margin of my encyclopedia, and recorded the dinner engagement on my kitchen calendar.

When I instead perform these acts of personal information management in shared information spaces, they continue to serve all of their original purposes. But amazing new possibilities arise. Somebody may thank me for blogging the technique I learned, or even better, show me a refinement I hadn’t thought of. My geocoded photos can cluster together with others, revealing unsuspected and delightful connections among people and places. Other people help me grow my list. My contribution to Wikipedia feels virtuous. My dinner arrangements can include unplanned invitees in a context-preserving way.

What do we call personal information management when it moves into shared online spaces? I asked myself that question, and the answer that came back was: social information management. That didn’t seem like a familiar term. But when I searched, I found that it was claimed by my 2005 InfoWorld column, WinFS and social information management.

The questions raised there, about the tug-of-war between formally structured and organically grown information, apply equally to the emerging crop of semantic web applications. A new question now arises about what the organizing structure will be. A knowledge taxonomy? A social graph? A hybrid of the two?

I don’t know the answer, and I don’t think anybody does. But everything that’s happening traces an arc from personal to social information management. From now on we’ll all be learning how far along that arc we wish to travel, weighing the risks and benefits of participating against the risks and benefits of not participating.

Update: Reading Brian Jones’ note about the upcoming XML 2007 conference in Boston reminded me that I once delivered a keynote at that conference. The year was 2003, the title of the talk was The social life of XML, and it was an extended variation on this same theme. For example:

Documents, including the purchase order and the messages related to it, aren’t just passive carriers of information. They’re the warp on which we weave a socially constructed reality. Somehow, we need to find ways to connect that reality to the workflow and process orchestration systems now being invented.

And:

The emerging web services network is radically open — not only because the messages exchanged on that network are XML, but also because the services are connected using pipelines. We can inject intermediaries into those pipelines; the intermediaries can observe and act on the messages. So we can acquire a lot of useful context, and can implement useful policy, by reading and writing what goes by on the wire. Things don’t tend to work the same way on the desktop, but maybe they could. Our personal productivity tools are in a position to learn a lot about how we interact with remote services, communicate with other people, and manage our data. And they’re in a position to help us do those things more effectively. But the messages and events flowing on our local machines have nothing in common with the messages and events flowing in the cloud.

At the time I was very excited about the XML intelligence that had recently been injected into the Office applications. And I still am. The use of data-enriched documents in software-plus-services scenarios is growing — slowly, to be sure, but steadily.

If I were to give another talk today, though, I’d want to elaborate on what Adam Bosworth said in his keynote at that same conference. He talked about RSS, about navigating a linked web of data, and about what today I would call (thanks to Rohit Khare) syndication-oriented architecture. Enabling regular folks to produce and consume structured and data-enriched documents is something that will happen gradually, as tools and infrastructure settle into place. But enabling regular folks to produce and consume feeds, and combine them in navigable ways, is something that could happen explosively. So far as I can see, the barriers are more conceptual than technical.

What would a civilian do?

On my last trip through Logan airport my laptop ran afoul of a DHCP server that handed it a duplicate address. I didn’t notice right away, because I hopped onto a long flight and then spent the following day with a wired connection, but the wireless capability of my Vista laptop had been turned off as a result of this incident.

When I realized what had happened, my first instinct was to start spelunking around in various administrative nooks and crannies of the system. It reminded me of another incident where Phil Windley went into overdrive to debug a display problem on his Mac. As Reed Hedges wrote in a comment:

…it’s as if you were the engineers building the MacOSX operating system and the only way to figure out problems is to rebuild it, and isolate causes.

Mindful of this, I asked myself: “What would a civilian do?” Then I took a deep breath and clicked on the Diagnose and Repair link. And lo, Vista said: “Wireless service is turned off. Would you like to turn it back on?”

Why yes, thanks, I would.

And lo, it worked.

Of course I couldn’t just carry on without knowing what had been turned off and on. For future reference, it was the WLAN AutoConfig service, which “enumerates WLAN adapters, manages WLAN connections and profiles” — something I’d forgotten about, or possibly never encountered, but that a civilian would hope never to meet.

It was a nice glimpse of how things ought to be.

That said, I’m wondering how the explanation that I found — using the ipconfig command, and the event log viewer — could be translated into something a civilian would find intelligible and useful. Recall that there was a long delay between the incident that triggered the wireless shutdown, and the appearance of the problem. I was able to figure out what had happened, but a civilian wouldn’t associate a transient popup message in an airport with a problem manifesting 18 hours later.

It’s wonderful to know that somebody can (in certain cases) wave a magic wand, say “Yes, please fix this,” and make it so. But absent a rational explanation for what went wrong, this can become a recipe for superstition. A civilian would be highly likely to attribute the whole incident to something that he or she did wrong, and then to fear doing that wrong thing again.

Of course we’d like things to always just work. But our systems are complex artifacts and things will go wrong. Intelligent diagnosis and repair is a daunting but worthwhile challenge, and it’s nice to see evidence of progress. Communicating helpfully to users about that process is an equally daunting but worthwhile challenge.

Kucinich in Keene

I went down to the bagel shop early this morning to have breakfast and read the NY Times. There’s a buzz, I look up, and it’s Dennis Kucinich. I’d heard he was in town today, but didn’t expect him here. He moves from table to table, introducing himself and chatting, and eventually makes his way over to me. What to say? That I admire him in certain ways but don’t think he’s a viable candidate? That I’m worried about the war, the economy, and the climate?

I stand up, shake his hand, and thank him for his visit.

Such is life in New Hampshire during the presidential primary season.

Competing for the creative class

I went to the Cities of Knowledge conference in Dublin, at the historic Clontarf Castle, to give a talk on citizen use of government data. But I would say that the man of the hour was Richard Florida. His theory about the economic and social impact of what he calls the creative class was repeatedly invoked by the technologists, officials, and academics who were at this conference to discuss the future of IT-enabled cities.

The gist of Florida’s thinking, which you can catch at ITConversations if you are not inclined to read the book1, is that a vibrant creative culture — where creativity is very broadly defined to include scientists, technologists, entrepreneurs, artists, musicians, and others — has become the defining reason why cities, regions, and countries succeed.

If you buy into that notion, and the attendees from Barcelona, Derry, Dublin, Helsinki, San Jose, Tallinn, and elsewhere very much did, it leads to an interesting conclusion. Cities have always competed with one another to provide attractive business climates. Quality of life was an aspect of the competition, but incentives to businesses were what really mattered.

A point made at this conference, though, is that the creative class values place above employer. To a 25-year-old European marketing or software professional, the choice of Barcelona over some less desirable city is now more decisive than the choice between working for IBM and working for Microsoft.

You still need to make your city attractive to IBM and Microsoft, because these companies help create and sustain the quality-of-life conditions that attract the creative class. But companies don’t have a direct interest in those conditions, people do.

It was fascinating to see how these cities are now thinking explicitly about competing — in terms of their housing, transportation, safety, culture, and IT enablement — to attract the creative class. Success produces a compound benefit, because the creative class is an engine of prosperity. Not only does it spend money, it also germinates new businesses. And those tend to be just the kinds of businesses that appeal to the creative class, so it can become a virtuous cycle.

Is it elitist to focus on the needs of the creative class? I don’t think so. Every citizen cares about housing, transportation, safety, culture, and IT enablement. If cities do better in those areas in order to attract the creative class, everybody wins.


1 Can this really be a good thing for the book business? Based on the number of books I have not read after catching the author’s drift in an extended audio interview or lecture, I do wonder.

Why Guinness tastes better in Ireland

According to the tour guide at the Guinness Storehouse in Dublin, the belief that Guinness tastes better in Ireland is not just an urban myth. He offered two explanations for the phenomenon.

1. Freshness

Because Guinness is so much more popular in Ireland than elsewhere, kegs don’t last long. So your pint is more likely to be fresh.

2. Cleanliness

If you’re a pub owner in Ireland, you are affiliated with Guinness. One of the terms of that relationship is that every three weeks you’ll receive a visit from a Guinness representative who will flush the lines to your Guinness taps. Pub owners are supposed to do that on their own, but some are lazy about it and Guinness doesn’t want to take any chances. If the lines aren’t flushed, the brew will be compromised.

What a savvy business practice!

On the way home, in the airport, I got to see it in action. The taps had signs on them that said Line Cleaning In Progress.

And there was the Guinness rep, flushing the lines. I asked him how often he shows up to do this, and he confirmed the story. “Every twenty-one days.”

The only problem is, knowing this, how can I ever again order a Guinness at home?

TSA to Aer Lingus: Hello?

So, by the way, again with the boarding pass stamp. This time it was even stranger. AerLingus.com had invited me to check in online, and to print a two-page boarding pass. There was a gate copy, and a passenger copy. Both displayed this instruction:

Proceed directly to your gate? On an international flight? OK, whatever.

But of course when I got to security, nothing doing. They sent me back to the ticket counter to get my boarding pass stamped. When I got there, they just tore up the passes I’d printed and issued me a new one. Which was…wait for it…not stamped.

It’s baffling. Mostly, I felt sorry for the developers who worked hard to put together a smooth interaction that leads to a very nice-looking boarding pass which, apparently unbeknownst to them, has no purpose whatsoever.

Drizzly Dublin

I’m here to give a talk tomorrow at the Cities of Knowledge conference, on the topic of citizen use of public data. Walking past Trinity College I saw a sign for Vordel. Hey, I know those guys! Stopped in to have a chat, and while I was there I asked them how they’re getting along now that REST has vanquished WS-*. Of course, from their perspective, that hasn’t happened at all. “For banks and insurance companies,” said VP of Engineering Dave McKenna, “it’s XML in, XML out.”

There’s more than one way to do it.

Update: Vordel’s Mark O’Neill adds:

With our XML Gateways you can support SOAP and REST with the same Web Services, and apply the same policy umbrella to both: http://radio.weblogs.com/0111797/2007/10/05.html

Exactly. As Mark says in that posting, “it isn’t a case of ‘either/or’ for SOAP And REST.” And with today’s release-to-manufacturing of .NET Framework 3.5, REST support in the Windows Communication Foundation means that, there too, you can choose the style that suits your need.

A conversation with Gardner Campbell about the digital imagination

In this week’s ITConversations show I spoke with Gardner Campbell about how networked computers and human minds together can produce what he calls “the digital imagination.” I first met Gardner when he invited me to speak at the University of Mary Washington’s Faculty Academy, a high-energy gathering of instructional technologists. As Gardner mentioned on Twitter recently, the magic that’s happening at UMW is a team effort. Team members I’ve met include Jim Groom, Martha Burtis, Andy Rush, and Jerry Slezak. I don’t know whether their collective efforts are creating a positive buzz for UMW outside the blogosphere, but they should be.

As for Gardner, well, here’s a guy who teaches everything from Milton to rock and roll to Ted Nelson. He’s creating a new kind of academic discipline that preaches but also practices information and media literacy. In this interview he explains clearly and passionately what that means, and why it matters.

Beth Kanter’s birthday card to screencasting

Beth Kanter made a birthday card for the third anniversary of the term screencast1, and included a screencast in which she reflects on her use of the medium, and on what the future may hold.

Beth’s using Jing too, thinks it may be part of an emerging micromedia trend, and points to Jeremiah Owyang’s definition: micromedia “provides bite-sized voice and video to micro audiences.”

Meanwhile Mary Branscombe asks:

Is so much content online in bite sized chunks because it’s easier to put up (not necessarily if you’re taking the effort to bite off the right chunk) or easier to consume?

Both, although point taken about mastery of the short form. Ironic, isn’t it? We fast forward through the 15- and 30-second chunks on TV, then create our own and watch each other’s.

Beth sees micromedia through a teaching/training lens, as do I. For me there’s also the bug-report aspect — it seems mundane, but it’d be huge for the software business if users could give developers narrated demos of the problems they’re having, rather than verbal descriptions.

Jeremiah Owyang emphasizes the social aspect:

Quick audio or video messages published to a trusted social community. May be created and consumed using mobile technology, and often distributed using other social media tools.

I hope we’ll also see broader, Wikipedia-like uses that pool collective knowledge. An awful lot of what we know is best conveyed by showing and telling directly, rather than by abstractly describing. Lawnmower maintenance, social bookmarking, or any other knowledge-based skill — in the physical or the virtual world — can be demonstrated and cataloged. As micromedia tools lower the activation threshold, that encyclopedia can be written — or rather, performed and recorded and uploaded.


1 I always like to point out, as Wikipedia currently says, that I only invited readers of my blog to propose names, and then selected the term screencast from among their suggestions. It was Joseph McDonald and Deeje Cooley who (separately) proposed it.

I also like to point out that nothing was invented, and that the medium has a long history going back to (as far as I can remember) Lotus Screencam. My contributions were to realize that the medium was radically underappreciated, to explore it, and to evangelize it.

Jing’s the thing

Tomorrow is the third anniversary of the term screencast. Taking stock, I’m reminded of all the uses of this medium we’ve seen since, and also of those still in the pipeline. The diagnostic use that I recommended to Mary Branscombe here is one of those still-emerging uses for most people. And after describing the Windows Media Encoder technique to her, I realized I’ve been remiss in not exploring, and advocating, TechSmith’s Jing.

Here’s a short screencast illustrating the use of the Excel geocoder I discussed a while ago. It was ridiculously easy to capture that screencast using Jing and, what’s equally useful, to upload it to screencast.com at a shareable URL.

When the TechSmith folks told me about Jing I was thinking about screencasting at a different level: professional quality, careful editing, multiple delivery formats. So I made note of it, but didn’t fully appreciate its significance. Jing is perfect for Mary Branscombe’s scenario. And when more scenarios like that can play out more easily and naturally, we’ll all benefit from the improved flow of understanding about how software works, or why it sometimes doesn’t.

“It won’t repro”

In a comment on an item last month about Photo Gallery, Mary Branscombe writes:

I’m having an issue at the moment where renaming a file in Windows Live Photo Gallery seems to reset the date on the file so WLPG sees a file from May 2006 as having been taken today. Has anyone else seen this? Changing the name also loses my tags and confuses WLPG so it can’t upload it to flickr… All JPEGs.

That’s a perfectly plausible problem report. But I couldn’t reproduce it, and neither could a couple of product managers I asked to take a look. If it “won’t repro” we’re stuck.

But there might be a way out of this bind. The description “renaming a file in Windows Live Photo Gallery seems to reset the date” is reasonably precise, but there can be all kinds of nuances that would be difficult for Mary to convey, or that she might not even be aware of. That’s why it’s a great idea to capture a screencast that illustrates the problem you’re having. You can do that with Windows Media Encoder, which remains, as I’ve been saying for years, sadly unknown and unappreciated. If you’ve never installed or used it, John Montgomery’s recent screencast shows you how.

I wish that this sort of diagnostic screencasting were more accessible to people. Even I don’t reach for the tool as often as I should, and when I don’t, I regret it. For example, last year I ran into an issue with an application that suddenly wouldn’t save a particular file type. Of course it “wouldn’t repro” and I got into a long back-and-forth with the developer in the course of which I wound up installing a specially instrumented version of the program to capture a detailed log file.

In the end we found that, as is so often the case, it was a silly little thing. The export feature broke when I switched, without realizing it, from an absolute path:

c:/jon/…

to a relative path:

/jon/…

It turns out that a third-party component used by the program for this export operation won’t accept relative paths. The program needed to know that (which it didn’t) and, if a user entered a relative path, needed to transform it into an absolute one.
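The transformation the program needed is tiny. A sketch in Python (the function name is mine, not the application’s):

```python
import os

def normalize_export_path(path):
    # A third-party component may reject relative paths, so resolve
    # anything relative against the current working directory.
    if os.path.isabs(path):
        return path
    return os.path.abspath(path)
```

A one-line defensive check like this would have spared both of us the instrumented build and the log files.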

We’ll never know how things might have otherwise turned out. But if I’d shown the developer a screencast of my problem scenario, there’s a decent chance he’d have said: “Hmm. Something unusual about that path in the file save box.”

So I’ve asked Mary Branscombe to make a diagnostic screencast, and if you find yourself in a similar situation I urge you to do the same. Pictorial descriptions of software behavior can enhance verbal descriptions with details that we ourselves aren’t aware of.

Owning your namespace

Given my interest in persistent URLs and reliable citation, it’s surprising that I only just today learned about WebCite. Here is the WebCite URL for a recent entry of mine:

http://www.webcitation.org/5TLg33jR5

This looks a lot like a TinyURL. If you’re on Twitter, you’re seeing a lot of those because Twitter automatically invokes the TinyURL service when you cite a URL.

But WebCite has a different, and very special, mission. It’s for scholarly and professional authors whose articles are themselves persistently linkable by way of Digital Object Identifiers. Increasingly those articles cite more ephemeral things, like blog entries. Using a WebCite bookmarklet, these authors can produce URLs that point to archived copies of web pages. Think Wayback Machine, but you can ask to have an item archived and be sure that it will be.

This is cool, and it’s interestingly different from the ad-supported TinyURL. In the case of WebCite, support is expected from a consortium of publishers whose content cites a mix of persistent academic works and ephemeral web stuff. Such content will be more valuable, the reasoning goes, if the ephemera can also be reliably cited.

As the author of ephemeral items, of course, I’d like to insert myself into that value chain. In this model the citing author and the publisher can see referrals to my item, but I can’t. That’s another reason why I need a lifebits system that’s independent of my blog publishing service, and of linking and persistence services, but can control my namespace and syndicate to and from those services.

From screencasting to automation

I was pleased to see the announcement that Novell and Microsoft are collaborating on the User Interface Automation (UIA) stuff. My mom can use all the help she can get. But as I discussed in Automation and accessibility, beefing up our ability to automate software in a consistent way can give us huge leverage in other areas, like education, training, and collaboration.

In The social scripting continuum I suggested that a system like CoScripter could automate desktop and web applications in a common way. Here’s one way to think of the benefit of doing that. Today, I can share software-related task knowledge in a social manner by creating and posting screencasts. But you can only watch a screencast. If I could instead share that task knowledge in the form of standardized high-level scripts, you wouldn’t need to watch the screencast. Of course, you might want to, for other reasons, but not simply to get the procedural knowledge transferred from my brain and fingers to yours.

Given how popular screencasts have become in three years, I’ve got a hunch that taking things to that next level would be huge. And lord knows I’d love to be able to convey packages of procedural knowledge to my mom that way.

Multilingual idioms

As I’ve begun to dig into PowerShell, I’m reminded again of a whole cluster of themes: why programming languages differ, how they can share a common foundation, and what influences our ability to use a mix of languages according to their strengths and our preferences.

In this example I used a combination of PowerShell and Python because each afforded convenient access to a familiar idiom. In the case of PowerShell, the idiom was a certain style of XML processing. Of course the idiom is available in Python too, but even as a regular Python user I need to stop and think which module provides it.

In the case of Python, the idiom was a certain use of the regular expression machinery. I wasn’t familiar with the PowerShell/.NET idiom, but I knew it in Python, so I used what I knew.

Of course any programming language can be used to accomplish any programming task. You can write a game of Tetris in XSLT, for example, and people have. But each language has its sweet spot. Once you get comfortable in that zone, you want to stay there. It takes effort to be open to other languages’ sweet spots, and to the possibility of combining them.

When I suggested that a pattern dictionary of idioms could help grease the skids, Kevin Reid pointed to RosettaCode.org. Very cool! This would be a great place to enumerate variations on, for example, the regular expression idiom that allows the dot (period) character to match newlines. I have probably learned and forgotten a half-dozen times that Singleline is not to be confused with Multiline, that the name comes from Perl’s “single line” /s modifier, and that it corresponds to DOTALL in Java’s regex flavor. If we could reliably find patterns like allows “.” to match newlines and map from them to implementations in any language, we’d be in a better position to use multiple languages according to their strengths and our preferences.
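Here, for example, is the Python entry such a dictionary might map that pattern to:

```python
import re

text = "first line\nsecond line"

# By default the dot does not match a newline character...
assert re.search('line.second', text) is None

# ...but with DOTALL (Python's name for the idiom) it does.
assert re.search('line.second', text, re.DOTALL) is not None
```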

Note that although my original example was done using the standard C-based version of Python, I later switched to IronPython in order to use the .NET regular expression machinery. It’s true that there are some idioms to be mastered in order to use that machinery from different .NET languages. But there’s also a large foundation of idioms shared by all .NET languages. Because PowerShell and IronPython rest on that common foundation, I think we’ll be able to discover, as both evolve, more about what is essentially Pythonic and essentially PowerShellish.

A conversation with Dick Hardt about British Columbia’s digital identity initiative

When Kim Cameron pointed to this CBC story about a British Columbia trial use of managed Information Cards, he noted:

Dick Hardt of sxip played the key and even charismatic role in developing a catalytic relationship between industry and government.

On this week’s ITConversations show1 I chatted with Dick Hardt about that project. According to Kim’s Information Card thermometer, 10 percent of desktops are now running CardSpace or an equivalent identity selector technology such as DigitalMe. I’m not sure where the tipping point will be, but even if you’re in that 10 percent it’s hard to find concrete examples of how the technology will simplify your life.

The BC program should prove to be a nice example. It will provide roaming access to WiFi hotspots for people who work in government agencies and also in public-sector organizations. The managed cards issued to these folks will identify them as members of those agencies and organizations.

From the user’s perspective, this will in many cases be the first real hands-on experience with the identity selector that’s built into Vista, available for XP, and emerging in other forms.

From the government’s perspective, it will provide another kind of experience. The identity metasystem that Kim Cameron has been birthing is really about network effects. In this kind of network, the packets are identity claims, and you want them to be able to flow frictionlessly.

I asked Dick to compare this architecture to other kinds of “trust bridges” — like the Higher Education Bridge Certification Authority and the Federal Bridge Certification Authority — and here’s what he said:

The architectural advantage of this model is that you have a URI representing each claim in a transaction. So that makes it wide open. You don’t have a single schema, you have a set of URIs and anybody can define a new one. That enables an organization to set up their own claims. They can say, our people have these attributes, and this is what those attributes mean.

The advantage of this approach is that once you’ve got some parts of it working, it’s very easy for someone else to join in and become part of the whole network. So once we’ve got this WiFi thing set up and running, another public sector organization comes along and wants to use it, and we just say, OK, you just need to turn something up to issue them managed cards. Then someone says, well, I’ve got a service I’d like someone to access if they’re members of one of these organizations. They can just turn it up, and their people already have the cards they can use to access it.

The equivalence between URIs and identity claims seems crucial here. Although I hadn’t made this connection before, I suspect it will enable a compositional approach to identity management which has much in common with the principles of RESTful web services.

Of course it’s challenging for experts, and impossible for civilians, to discuss this stuff in the abstract. But when somebody receives a managed card, uses it to access a service, finds that the claims carried by the card can be used to access another service, and can see which claims are being sent to which parties for which purposes, it’ll all start to make sense. It’s been a long time coming, but it feels like the puzzle pieces are finally fitting into place.


1 An audio glitch injected some annoying static into this particular episode, for which I apologize to Dick and to my listeners. Grumble. I wish it were easier to be a happy caster.

Processing a WordPress export file with PowerShell

So I wanted to make an HTML page of just the titles of my blog items, with the titles hyperlinked. Here’s a solution in PowerShell:

# parse the export file as XML, then project out title and link
[xml]$xml = get-content 'wordpress.xml'
$items = $xml.rss.channel.item | Select-Object title,link
foreach ($item in $items)
  {
  # emit each item as a hyperlinked paragraph
  $s = '<p><a href="' + $item.link + '">' + $item.title + '</a></p>'
  echo $s
  }

I like how the XML handling is just woven into the fabric.
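For comparison, here’s my rough Python equivalent using the standard ElementTree module, with a tiny inline document standing in for the cleaned-up export file:

```python
import xml.etree.ElementTree as ET

# A tiny inline stand-in for the cleaned-up WordPress export.
doc = """<rss><channel>
<item><title>First post</title><link>http://example.com/1</link></item>
<item><title>Second post</title><link>http://example.com/2</link></item>
</channel></rss>"""

# Walk the items and emit each one as a hyperlinked paragraph.
root = ET.fromstring(doc)
for item in root.iter('item'):
    print('<p><a href="%s">%s</a></p>'
          % (item.findtext('link'), item.findtext('title')))
```

It’s comparably direct, though the dotted-path access in PowerShell still feels more woven-in.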

That said, the XML file that WordPress exports is — I just discovered to my chagrin — not actually XML. The comments contain all sorts of junk that choke an XML parser. I couldn’t find an example of a multiline non-greedy regular expression search-and-replace in PowerShell, so I stripped out the comments using Python:

import re

# The <wp:comment> elements contain the junk that chokes the XML parser.
# DOTALL lets "." match newlines; ".+?" keeps the match non-greedy.
s = open('wordpress.2007-11-09.xml').read()
pat = re.compile('<wp:comment>.+?</wp:comment>', re.DOTALL)
s = re.sub(pat, '', s)
f = open('wordpress.xml', 'w')
f.write(s)
f.close()

Mapping idioms from one language to another is such an interesting problem. I’ve always imagined a kind of Rosetta Stone of patterns. It would contain patterns like multiline non-greedy regular expression search-and-replace and then you could map examples from any language into those patterns. Does any resource on the web approximate that kind of pattern vocabulary?

Stocks and flows in online communication: another hat tip to Jerry Michalski

I always try to push information to where it can be most valuable. In practice, that often means moving from personal invitation-only spaces to shared public spaces.

In an essay called Too busy to blog? Count your keystrokes, I made the point that this strategy needn’t cost you more effort. What I’ve been calling the principle of keystroke conservation seems to stick in people’s heads.

Here’s another phrase that’s been stuck in my head since October: stocks and flows. That’s how Jerry Michalski, at the Social Computing Symposium, described the communication pattern that moves information from personal/transient to shared/persistent spaces.

When I use email to solicit responses back to a searchable/subscribable wiki, that’s just what I’m trying to do: convert flows into stocks.

Lee LeFever, whose company’s “paperworks” sketchcasts I recently applauded, wrote a series of items elaborating on Jerry’s notion of stocks and flows in online communication.

I would quibble slightly with Lee’s point that blogs are mainly about flow and wikis mainly stock. I think each can nicely combine aspects of both, and that their essential differences lie elsewhere.

The distinction between non-archived, non-shared interpersonal messaging (email, IM) and any kind of shared space (forum, wiki, blog) is much more stark. Some flows are inherently private and/or transient. Those can’t and shouldn’t become stocks. Other flows are not inherently private and/or transient. Those can, and almost always should, become stocks.

NoScript

By way of Patrick Logan, I see that Douglas Crockford is recommending that Firefox users should be running with the NoScript extension, which enables you to whitelist or blacklist sites trying to run JavaScript code in the page you’re visiting.

I hadn’t tried NoScript before. Wearing my security-minded developer’s hat, I like the idea. It’s a great way to see which scripts are invoked by various websites, and to understand how those sites behave with those scripts enabled or disabled.

Wearing my civilian hat, I’d wonder about the level of effort required to make those kinds of granular decisions. Douglas Crockford observes:

You might think that you would have to spend a lot of time managing the policy, but surprisingly, you don’t.

On the one hand I’m inclined to agree. We’ve seen the same thing with firewalls that do outbound filtering. But on the other hand, NoScript prompts occur much more frequently. Will civilians be willing to deal with that? I’d be curious to know how non-geeks are getting along with NoScript.

I also have a question about NoScript’s default policy. The NoScript.net tagline reads: “NoScript – JavaScript/Java/Flash blocker for a safer Firefox experience!” However, having just installed it, I find it to be a Java/Silverlight blocker and a Flash allower:

Just curious: Why?

A day at the Wharton School

Last Friday I attended a Wharton School conference on technology-enabled business transformation. It was a small gathering of students, academics, and Philadelphia-area businessfolk who partner with Wharton on a variety of projects. The agenda was diverse and open-ended.

My own talk riffed on the theme of syndication. I borrowed Rohit Khare’s phrase syndication-oriented architecture, and talked about how Internet-scale publish/subscribe mechanisms — including not only RSS feeds but also things like Facebook news feeds — will also work inside the enterprise.

Coincidentally, BEA’s Shane Pearson was making a similar case at Defrag, as reported by Phil Windley:

Shane asked “what if wanted to know what articles and blogs my co-workers were reading?” He then put up a slide that showed what Facebook might look like if it provided enterprise-friendly functionality.

Shane Pearson's Facebook for the enterprise mock-up

This got my attention. Maybe it’s been obvious to others, but it wasn’t to me. I’ve informally done similar things with co-workers, sharing what we’re reading, but this could make it more automatic. I’d welcome the opportunity to see more of what my co-workers think is interesting on any given day.

Update: Andrew McAfee writes on this same topic today:

One of my Facebook friends told his network via his status message that he was going to accompany a foreign head of state to a high-level meeting on technology issues. Because I was only weakly tied to this person I had no idea that he was that well connected or interested in public policy. But as a result of his Facebook update, which took him about ten seconds to type and me one second to read, I now know who to reach out to should I ever want to dive into European IT issues.

I also suggested that enterprises will want to embed these kinds of feeds in their service-oriented architectures, in order to enable declarative control of authentication, reliability, and auditing. And I noted that in the Microsoft portfolio, the Windows Communication Foundation and the Internet Service Bus are ways of providing that kind of control.

This wasn’t a geek crowd, though, it was a business school crowd, so I didn’t spend much time talking about plumbing. But the broad theme of lightweight syndication, as an enabler of what I called the enterprise awareness network, was very well received. Perhaps that’s because technical folks take these concepts for granted. But in this group, very few were regular users of RSS readers. Afterward, though, I heard lots of comments along the lines of: “Hmm, that would make a lot of sense.”

The other speakers were mostly business school types, and their perspectives were quite different from those I encounter at tech conferences. Here are some snapshots.

Tom Eisenmann delivered a broad survey of emerging business models in which network effects play a key role. He said that 60% of the world’s 100 largest companies earn > 50% of their revenues from platform-mediated networks. One example of a platform-mediated network is the Xbox ecosystem. It is a “two-sided network”: i.e., developers on one side, users on the other. Network effects occur both within and across those two sides. The principles that govern the creation and management of businesses based on those network effects, he said, are only now emerging.

Peter Fader gave an entertaining “I told you so” talk on Napster. What he told us, back in 2000, was that Napster was the right thing for the music industry, if they could only have wrapped their heads around it.

His view of what the music industry should do (apart from not suing customers) is:

  1. Don’t let an outsider (i.e. Steve Jobs) drive your industry
  2. Create the celestial jukebox
  3. Get people sharing and discovering music again
  4. Make money with subscriptions

What about the claim that you can’t compete with free? He doesn’t believe that. With superior selection, convenience, and service — plus the innovation that becomes possible when you unleash network effects — he says you can.

Ravi Aron talked about a study in which people in Asian countries and in the US/UK were asked to rate the complexity of various tasks. A stunning difference emerged. In Asian countries, people said that tasks requiring analytical and mathematical skills were simple, whereas tasks involving judgement, negotiation, and interpersonal skills were complex. In the US/UK, it was exactly the reverse!

He suggests that as businesses reconfigure themselves into composable sets of services, outsourcing will flow in both directions, divided along these lines.

PowerShell data munging, revisited

It can be dicey to invite comparisons between programming languages, as I did last week in an entry on data munging with PowerShell. But in this case, although I didn’t at first articulate very well what I found interesting about the example from Lee Holmes, by way of Scott Hanselman, the commentary helped me figure that out.

Among the questions that arose:

Terseness

Can PowerShell express things as tersely as other dynamic languages? Nearly so. Does that matter? Lee Holmes writes:

Once you have a solution that works, a natural scripter’s passion is to tinker it down to one line. It’s no longer educational, intelligible, or extendable, but it’s fun.

There may be a bit more to this story. Dynamically-typed scripting languages are naturally terser than their statically-typed counterparts. An oft-cited benefit is that a complete chunk of logic can be seen and understood as a whole thing. Should that chunk be a function, or an entire program? You can legitimately make a tradeoff. Sometimes that chunk needs to be valuable to others in the future, in which case the maximally-terse one-liner isn’t helpful. But sometimes it only needs to be valuable to you right now, in which case the one-liner may actually improve your focus.
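A hypothetical Python illustration of the same tradeoff, tallying hits both ways:

```python
from collections import defaultdict

# Hypothetical data: (show, hits) pairs.
rows = [('show1', 3), ('show2', 7), ('show1', 2)]

# Tinkered down to one expression: fun, but no longer very intelligible.
one_liner = sorted(((k, sum(h for n, h in rows if n == k))
                    for k in set(n for n, h in rows)),
                   key=lambda kv: kv[1], reverse=True)

# The same logic as an educational, extendable function.
def tally_hits(rows):
    totals = defaultdict(int)
    for name, hits in rows:
        totals[name] += hits
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)

assert one_liner == tally_hits(rows) == [('show2', 7), ('show1', 5)]
```

For a throwaway, the one-liner sharpens your focus; for anything you’ll share or revisit, the function earns its extra lines.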

Style

More interesting to me was the style of Lee’s PowerShell function. It expressed the operations of selection, grouping, and summarization in a style that felt different from what I’m used to seeing, and doing, in Python. I used the word “composable” to describe that difference. Other words might be “declarative,” “SQL-like,” or indeed “LINQ-like.” I wondered whether this might be an essential or a superficial difference between PowerShell and Python.

Wai Yip Tung’s example, which closely matches Lee Holmes’ example, suggests that it is not an essential difference.

Wai’s example also reminds us that much depends on what you know, or don’t know, about a language’s supporting libraries. And that in turn says something about how the libraries do or don’t manifest themselves to us. As Wai says, if Python’s obscure itemgetter function had instead been called select1, and presented as part of a module that includes groupby, Python programmers would be more likely to produce scripts in the style of the PowerShell example.
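To see Wai’s point, imagine itemgetter imported under the friendlier name; grouping and summing then read almost like the PowerShell version:

```python
from itertools import groupby
from operator import itemgetter as select  # Wai's hypothetical renaming

# Hypothetical (name, hits) records.
records = [('b', 2), ('a', 1), ('b', 3), ('a', 4)]

# groupby needs its input sorted on the grouping key.
rows = sorted(records, key=select(0))
summary = [(key, sum(select(1)(r) for r in group))
           for key, group in groupby(rows, key=select(0))]

assert summary == [('a', 5), ('b', 5)]
```

Same machinery, but a name like select makes the declarative intent visible.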

Finally, Wai speculates on how PowerShell’s pipelining style might be captured in Python, and whether or not that might be broadly useful. I don’t know the answer to that, but I think it’s an interesting question.


1 Note: The select(0,1) idiom shown in Wai’s example is new in Python 2.5, to which I have not yet upgraded. In Python 2.4 you can only select one thing.

A conversation with Beth Jefferson about reinventing the library catalog

This week’s ITConversations show features Beth Jefferson, founder of BiblioCommons, a company that aims to reinvent and federate the online catalogs of public libraries. She’s thinking very creatively about the social forces that such a federation could marshal. The idea is not to create yet another social network. Instead, she wants to promote the social discovery — and social cataloging — of books, CDs, videos, and other kinds of library resources. Social networks pivot on interpersonal relationships. A BiblioCommons-enabled network would, in a complementary way, pivot on those resources.

How would such a network achieve meaningful scale? Beth has found some data which suggests that if you federated lots of public library catalogs, the combined user population would rival some of the web’s largest sites. Enabling those folks to connect with one another, in the context of resource collections that share common metadata, would be a big deal.

The BiblioCommons software is only now entering its first trial phase. But you can see some of what it does in Beth’s presentation at code4lib, a conference for library technologists.

PowerShell data munging

I first wrote about PowerShell back in 2004, when it was called Monad and/or MSH. What most intrigued me was the way .NET objects could flow through its pipeline. I wouldn’t have thought then, or now, of reaching for PowerShell just to do some basic logfile processing. But Scott Hanselman’s log analysis example today got my attention.

My assumption was that Python would offer easier and more natural ways to absorb, consolidate, sort, and emit the kind of simple log data shown in Scott’s example. But when I tried to recreate that example in Python, I developed a new appreciation for PowerShell’s data munging chops. The selection, grouping, summing, and sorting capabilities — and the pervasive use of the pipeline — are potent ways to manipulate .NET objects. But they’re equally potent when you’re just munging a CSV file.

True, Python has a csv module that’s roughly equivalent to PowerShell’s Import-CSV, so you can easily slurp the data into a dictionary, taking fieldnames from the first row. But how do you perform the selection, grouping, summing, and sorting as succinctly and — for lack of a better word — as composably? I didn’t find obvious answers. Of course, just because I failed doesn’t mean it can’t be done. I’d be curious to see solutions in Python (or Ruby, Perl, etc.) that people think capture the spirit of Scott’s example. I suspect there may be interesting lessons to be learned going in both directions.
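Here’s how far I got. A sketch that stands in for Scott’s log with hypothetical Show and Bytes columns:

```python
import csv
from collections import defaultdict
from io import StringIO

# Hypothetical stand-in for the real logfile.
log = StringIO(
    "Show,Bytes\n"
    "hanselminutes_0001,100\n"
    "hanselminutes_0002,50\n"
    "hanselminutes_0001,25\n")

reader = csv.DictReader(log)      # fieldnames come from the first row
totals = defaultdict(int)
for row in reader:                # selection, grouping, and summing, imperatively
    totals[row['Show']] += int(row['Bytes'])

# sorting, descending by total
for show, total in sorted(totals.items(), key=lambda kv: kv[1], reverse=True):
    print(show, total)
```

It works, but the loop-and-accumulate style feels more imperative than the PowerShell pipeline it imitates.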

PS: When I ran the script that Lee Holmes wrote based on Scott’s one-liner, I was initially puzzled to see a one-column display of names, not a two-column display of names and counts. Then I realized it was a formatting problem — the second column was running off the edge of my display. So I changed:

Get-ShowHits | Sort -Desc Hits

To:

Get-ShowHits | Sort -Desc Hits | Format-Table Name,Hits -auto