My close encounter with the Hannaford data breach

My debit card was one of the potentially 4.2 million exposed in the recent Hannaford data breach. Here’s part of the letter from my bank, the Savings Bank of Walpole.

I’ve thanked them privately, and want to thank them publicly as well, for being proactive and doing the right thing here. They’re dealing with fallout from a problem they didn’t create.

Details are still emerging, and we don’t yet have the full story. As the InfoWorld story notes, Hannaford’s servers might have been compromised by a remote exploit through the network, or by a local exploit made possible by unauthorized physical access.

In the aftermath, most of the usual defense-in-depth strategies are being rehashed, and that’s good. But one-time account numbers still aren’t on the radar screen, and I keep on wondering: Why not?

A conversation with Tim Spalding about LibraryThing

I had a great time talking about LibraryThing with Tim Spalding for this week’s ITConversations show. He says LibraryThing is a baroque application. I think of it as deep in the same ways that Flickr is: Many features, many modes of use, many constituencies. Although Tim is flagellating himself about the way we swam around in those depths, I enjoyed the conversation immensely. If you’re fascinated by the dynamics of social information management — whether or not you are a book-lover — I think you will too.

We wound up talking for almost two hours. I omitted the second hour not only for reasons of length, but also because it raised a question that neither of us felt we were able to address very well. As mentioned in comments here, though, it does warrant further consideration. A lot of folks, me included, feel that the inability to move identity and relationships across social networks is increasingly an impediment to joining them and participating in them.

But Tim rightly points out that friction has value. Rites of initiation are costly for a reason. When you invest effort you create meaning. So here’s the question. How do we separate those aspects of social information management that should be portable and frictionless from those that should be unique and special?

Cluster computing, with large data, for the classroom

This week’s Perspectives is a two-parter: an interview and companion screencast on the topic of cluster computing in the classroom. The interview is with Kyril Faenov, the General Manager of the Windows HPC (high performance computing) unit, and the screencast is with Rich Ciapala, a program manager for Microsoft HPC++ Labs.

The project demonstrated in the screencast, and discussed in the interview, is called CompFin Lab. It’s a system that lets professors give their students a way to run computationally expensive financial models on large quantities of data. From the student’s perspective, you go to a SharePoint server, select a computational model, pick a basket of stocks, and run the model. Behind the scenes the task is partitioned and sprayed across a cluster of computers, then the results are gathered and presented in an Excel spreadsheet.

From the professor’s point of view, some .NET programming is required. But a framework abstracts the mechanics of dealing with the cluster, so the professor can focus on the logic of the model itself.
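The partition, spray, and gather pattern can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a local thread pool stands in for the cluster, and the names `partition`, `mean_return`, and `run_model`, along with the sample prices, are invented here and are not part of CompFin Lab or its .NET framework. The point is that the "professor" writes only `mean_return`, while `run_model` hides the mechanics.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(items, n_parts):
    """Split items into at most n_parts roughly equal, contiguous chunks."""
    size = max(1, -(-len(items) // n_parts))  # ceiling division
    return [items[i:i + size] for i in range(0, len(items), size)]


def mean_return(prices):
    """Hypothetical 'model': average period-over-period return of one price series."""
    returns = [(b - a) / a for a, b in zip(prices, prices[1:])]
    return sum(returns) / len(returns)


def run_model(model, basket, n_workers=4):
    """Scatter the basket across workers, run the model per stock, gather the results."""
    chunks = partition(list(basket), n_workers)

    def work(chunk):
        # Each worker handles one chunk of ticker symbols.
        return {symbol: model(basket[symbol]) for symbol in chunk}

    results = {}
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        for partial in pool.map(work, chunks):
            results.update(partial)
    return results


# Invented sample data: three made-up tickers with short price histories.
basket = {
    "AAA": [100.0, 110.0, 121.0],
    "BBB": [50.0, 50.0, 50.0],
    "CCC": [10.0, 9.0, 9.9],
}
results = run_model(mean_return, basket, n_workers=2)
```

On a real cluster the chunks would travel to other machines rather than other threads, but the division of labor between model logic and plumbing is the same.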

There are a couple of key points about the evolution of high-performance computing that I want to highlight here. First, there’s what Kyril calls “the gravitational pull of data.” Increasingly, people and organizations are building vast repositories of data that other people and organizations will want to analyze in computationally expensive ways. It’s great to have access to a compute cluster in the cloud that can do the heavy lifting, but when datasets get really big you get bottlenecked trying to send the data to where the code runs. At a certain point you’d rather send the code to where the data lives.

A second and related point is that in our current model for large-scale cloud-based computing, there are only a handful of what I call intergalactic clusters — namely, those operated by Google, Yahoo, Amazon, and Microsoft. These are one-of-a-kind behemoths. You can’t replicate one of them locally and apply it to your terabytes of data. So as Kyril and his team build out their cloud-based HPC services, they’re working to ensure the services can be replicated locally.

Maybe the most optimal thing is for you to stand up a 1000-node cluster with each node having a terabyte of disk. We want to enable that. We want to be able to tell our customers: Here’s how we run these large-scale data-driven HPC applications, and here’s how, within a day or two, you can stand up one of these yourself.

The idea is that if you build one of those for your own terabyte trove of astronomical or climatological data, you can run your own computations against that data, and you can also share that capability with other people and organizations who want to run their code against your data.

Revisiting the InfoWorld metadata explorer

A while ago I wrote an alternative search and navigation interface to InfoWorld.com. The search is broken now because the underlying engine switched from Ultraseek to Google, and nobody has updated the search wrapper. But the navigation piece still works, and while it does, I want to invite some commentary because I’m thinking of doing something similar for another project.

In this model the navigation is metadata-driven, and supports views like:

InfoWorld stories tagged ‘Silverlight’

InfoWorld news stories tagged ‘Silverlight’

InfoWorld news stories by Elizabeth Montalbano tagged ‘Silverlight’

Every piece of metadata in the tabular display is active, and toggles a filter for that item. This works especially well for the tags, and enables you to cruise through the tagspace in a fluid way. For example, try this progression:

1. InfoWorld news stories tagged ‘Silverlight’

2. Click ‘flash’ to toggle it on

3. InfoWorld news stories tagged ‘Silverlight’ and ‘Flash’

4. Click ‘silverlight’ to toggle it off

5. InfoWorld news stories tagged ‘Flash’

The same principle holds for other bits of metadata, like storytype. So for example:

1. InfoWorld news stories tagged ‘Silverlight’

2. Click ‘News’ to toggle it off

3. InfoWorld stories tagged ‘Silverlight’

4. Click ‘Review’ to toggle it on

5. InfoWorld Reviews tagged ‘Silverlight’

6. Click ‘Martin Heller’ to toggle it on

7. InfoWorld Reviews by Martin Heller tagged ‘Silverlight’

8. Click ‘silverlight’ to toggle it off

9. InfoWorld Reviews by Martin Heller
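The toggling behavior in these progressions is easy to model. Here is a minimal sketch, assuming the filter state is a set of (field, value) pairs; the `toggle` and `matches` helpers and the three sample records are invented for illustration and are not the explorer’s actual code or data.

```python
def toggle(filters, field, value):
    """Return a new filter state with (field, value) switched on if absent, off if present."""
    return filters ^ {(field, value.lower())}


def matches(story, filters):
    """A story matches when it satisfies every active filter."""
    for field, value in filters:
        values = story[field]
        if isinstance(values, str):
            values = [values]
        if value not in [v.lower() for v in values]:
            return False
    return True


# Invented sample records, not real InfoWorld stories.
stories = [
    {"title": "Story A", "type": "News", "tags": ["silverlight", "flash"]},
    {"title": "Story B", "type": "Review", "tags": ["silverlight"]},
    {"title": "Story C", "type": "News", "tags": ["flash"]},
]

state = frozenset()
state = toggle(state, "type", "News")
state = toggle(state, "tags", "Silverlight")  # news stories tagged 'Silverlight'
state = toggle(state, "tags", "Flash")        # ... and 'Flash'
both = [s["title"] for s in stories if matches(s, state)]

state = toggle(state, "tags", "silverlight")  # toggle 'silverlight' off
flash_only = [s["title"] for s in stories if matches(s, state)]
```

Because every click is the same symmetric-difference operation, any cell in the table can act as a control. That uniformity is the appeal of the design, and perhaps also why it isn’t obvious to users.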

It’s powerful to explore things this way, but if I did something like this again, I’d look for ways to make these filter progressions more intuitive and discoverable.

I just don’t think people expect every item to work as a control as well as an information display. And because they don’t, it may be a bad idea to do things that way. Or maybe it’s a good idea that’s still in search of its perfect expression. I’d be curious to know what you think.

Rediscovering LibraryThing

To prepare for an interview with Tim Spalding, the founder and lead developer of LibraryThing, I re-registered with LibraryThing, spent some quality time with the service, and was wildly impressed.

At one point in the interview, Tim asked me how I, Mr. LibraryLookup, as likely a person as there is to use and appreciate LibraryThing, could have gone so long without hooking up with it.

I think part of the answer is hidden in the first paragraph: I had to re-register for the service, which I had tirekicked a year or two ago. The friction of joining and re-joining online services has become a major barrier.

There’s also conceptual friction. LibraryThing is a deep application that does lots of things, but on the surface, it appears to be a mechanism for cataloging books that you own. In fact it isn’t only that. You can just as well load it with books that you’ve read, or might read, as a way to seed discovery and recommendation.

Finally, there’s data friction. There are bibliophiles who will obsessively catalog their own collections, but I’m not one of them. I do, however, maintain a list of books on my Amazon wishlist. I syndicate that list to the version of LibraryLookup that alerts me when books on the wishlist become available in my local library.

What I needed was a frictionless way to reuse that list. And on this go-round with LibraryThing I found it. Sort of. You can import your Amazon wishlist into LibraryThing, which is a great way to jumpstart the discovery and recommendation process. It doesn’t yet syndicate from Amazon, so the initial import won’t be refreshed, but Tim says that’s coming.

It turns out not to matter at all that the list of books I’m interested in happens to be an Amazon wishlist. All that matters is that I can keep it in some service, somewhere, that can syndicate data to other services elsewhere.

A conversation with Carl Malamud about access to public information

This week’s ITConversations show is a chat with Carl Malamud, whose exploits I’ve followed ever since he launched podcasting a decade ahead of schedule with a project called Internet Talk Radio. Since then, Carl has been known mainly for his tireless crusade to release troves of public information to the Net: SEC filings, patents, Congressional video, historical photographs, and most recently, U.S. case law.

One of the questions I wanted to explore with Carl is also raised here by John Montgomery:

Popfly, a mashup tool, depends on three things: data that is simple to access programmatically, interesting, and available under terms that enable users to work with it. As with most software endeavors, you can pick two.

The government has a huge amount of interesting data that’s available under really great terms. Weather? Check out http://www.noaa.gov. Financial information? Start with http://www.sec.gov. Crime statistics? Dig around in http://www.usdoj.gov/. But how much of this is programmatically accessible? Very little, as it turns out.

John mentions the Sunlight Foundation’s efforts to provide an intermediary layer of services that make raw data easier to access and manipulate, and I raised that point with Carl. From his perspective, of course, it all starts with the data, which he is rightly focused on providing. Even though the U.S. is far ahead of many other countries in this regard, there are oceans of important information not yet available even in raw form.

Carl has enormous faith in the Net’s ability to interconnect and enhance these raw sources, and I do too. Here’s a small but significant example. If you view source on 28 Fed.R.Serv.3d 415, you’ll see one of my favorite strategies at work: semantic metadata encoded using CSS style tags. That enables an important kind of programmatic access. Now it’s true that today, Internet search engines don’t support queries that ask for documents where Shelby Reed appears as a plaintiff in an appeal to the U.S. Court of Appeals, Fifth Circuit. Someday, though, that kind of query will be supported, and the latent semantics of this rendering of U.S. case law will emerge.

These enhanced services don’t necessarily just arise from the grassroots, however. Resource-rich organizations are often in the best position to provide them. One example, we agreed, is the New York Times’ stunningly effective visualization of presidential election debates. Ideally we’d be able to visualize all of the proceedings of Congress in the same way. That’s probably too much to expect of public-interest groups running shoestring operations. But what such groups can do is apply Carl’s favorite technique: Create a few high-profile examples, and then pressure the government into internalizing the process.

Perspectives: Understanding CardSpace with Vittorio Bertocci

The second installment of Perspectives is up, with Vittorio Bertocci, author of Understanding Windows CardSpace. This interview was recorded a few months ago, and has been waiting for the Perspectives site to launch. In January I excerpted the part about omnidirectional identity, a difficult phrase that I continue to struggle with. Maybe a better one is Internet persona: the social mask that you project when you self-publish online, and to which reputation attaches. Whatever we call this phenomenon, its Laws of Identity (not only for people, but also for digital objects) are not yet well defined.

Most of the interview, though, concerns the existing “unidirectional” mechanisms supported by CardSpace. I asked Vittorio to relate those mechanisms to precursors like SSL client certificates and Kerberos, and also to the complementary OpenID system. As discussed in my ITConversations podcast with Dick Hardt, the principles that govern this identity machinery are abstract and, until we experience them firsthand, will be hard for most of us to grasp. But Vittorio does a good job of explaining those principles in terms of concrete examples.

A close call: photos lost, then found

While reviewing a white paper by a colleague on the subject of personal digital archives, I realized that I hadn’t followed through on a plan to consolidate a few different caches of digital photos from various digicam and computer eras. So of course, when I went looking, things weren’t exactly the way I remembered. One particular batch was missing, and there were some anxious moments while I booted up dormant computers and mounted shelved disks. In the end I found the missing set, but although I could have sworn they were in three safe places, there was really only one.

In these moments of panic, the need for a lifebits service becomes crystal clear. But the moments pass, and we move on. Most people, most of the time, don’t yet feel the need for that kind of service.

Inevitably that will change. I wonder how, and when?

When the LazyWeb gets too lazy

I’m running a couple of services that make automatic use of Amazon wishlists, and today I noticed that the current version of the API is going away:

503 – Service Unavailable

ECS3 is currently unavailable due to a planned outage in preparation for the complete shutdown of ECS3 on March 31, 2008.

After March 31, 2008, we will no longer accept Amazon ECS 3.0 requests. Please upgrade to the Amazon Associates Web Service (previously called Amazon E-Commerce Web Service 4.0) by then to ensure that you or your customers are not affected by the upcoming deprecation.

Amazon ECS 3.0 deprecation was announced a year ago in February 2007. You can read the original post at http://developer.amazonwebservices.com/connect/ann.jspa?annID=164.

In preparation of the March 31st deprecation, the Amazon ECS 3.0 web service will experience several outages. The complete outage schedule can be viewed at http://developer.amazonwebservices.com/connect/ann.jspa?annID=276.

Please refer to the migration guide for assistance in mapping Amazon ECS 3.0 calls to their Amazon Associates Web Service 4.0 equivalents. You can find the migration guide at http://developer.amazonwebservices.com/connect/entry.jspa?categoryID=12&externalID=627. Please use the Amazon Associates Web Service forum to ask technical questions and share answers with your fellow developers.

We thank you for being part of Amazon’s Developer community and look forward to your continued support.

Like Rich Burridge, I’ll be needing a replacement for PyAmazon, the Python module Mark Pilgrim wrote long ago to simplify use of the original Amazon API.

In our modern world of aggregation, search, and syndication, it’s easy to wait and see what will happen. I went to bloglines and searched for blog items that — like Rich’s and now mine — point to Amazon’s page about migrating to the new API. And then I subscribed to that search.

In a way, this is too easy. I can imagine a bunch of people camped on that query, watching the clock and waiting for someone else to step up to the plate before March 31. The first time around, when Amazon web services were new and shiny, it was cool to be that person. Now, not so much.

Update: A couple of folks have pointed to PyAWS. As mentioned in Rich Burridge’s blog entry, it doesn’t seem to offer, e.g., a single call to retrieve all items from a wishlist. However, when I reviewed my use of the earlier PyAmazon, in terms of raw interaction with the RESTful API and its XML output, I remembered how simple that interaction was. It’s just as simple in the new Amazon API, just slightly different. Encapsulating what I needed to do required only a few lines of code.
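To show what I mean by “a few lines,” here is a sketch of the general shape of that kind of raw interaction: compose a query URL, then pull what you need out of the XML. Everything specific in it is an assumption made for illustration. The endpoint, the parameter names, and the response elements are invented, and they are not Amazon’s actual API or schema.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode


def build_request(base_url, **params):
    """Compose a REST query URL from keyword parameters (sorted for stable output)."""
    return base_url + "?" + urlencode(sorted(params.items()))


def list_titles(xml_text):
    """Pull the text of every <Title> element out of a response document."""
    root = ET.fromstring(xml_text)
    return [el.text for el in root.iter("Title")]


# Hypothetical endpoint and parameter names, for illustration only.
url = build_request("http://api.example.com/lookup", operation="list_items", list_id="EXAMPLE")

# Hypothetical response shape; a real service's schema (and XML namespace) will differ.
sample = """
<Response>
  <Item><Title>First Book</Title></Item>
  <Item><Title>Second Book</Title></Item>
</Response>
"""
titles = list_titles(sample)
```

A real client would also fetch the URL, page through results, and handle errors, which is where the few lines start to grow.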

Generalizing that encapsulation is much harder. And when you have to repeat that hard work for many different languages, and for many different APIs, the inevitable result is that these per-language API wrappers tend to lag.

That’s one reason I’m looking forward to services built on Astoria (ADO.NET Data Services), or an equivalent normalization layer. I think it can substantially narrow the gap between RESTful APIs and the convenience wrappers we enjoy in various programming languages.

A conversation with Ward Cunningham about visible workings and aboutus.org

This week on ITConversations I have a two-part interview with Ward Cunningham. In part one, we explore his implementation of Brian Marick’s visible workings idea, which combines software testing with business process transparency. This is one of those transformative ideas that will not, at first, seem interesting and important to most people. And maybe it never will. But then again, Ward has a track record. The wiki idea didn’t at first seem interesting and important to most people either, and look what’s happened there. So, you never know. Maybe in 2020 we’ll notice that business software is a lot more reliable and understandable than it used to be, and we’ll look back and say: Ward did it again.

In part two, we discuss Ward’s new wiki-based venture, aboutus.org. It’s a directory that aims to become a sort of extended WHOIS database, where domain name owners — along with anyone who reads the websites attached to those domains — can collaboratively describe the people, companies, and organizations represented by those websites. I like the concept, but I wish it weren’t necessary to sign up in order to update http://aboutus.org/jonudell.net. Instead I’d prefer to describe myself on my own hosted lifebits service, wherever that might be, and then syndicate the information to aboutus.org and elsewhere.

Missing the cluetrain

I wasn’t going to post this humorous anecdote, but Mike Caulfield reminded me that it’s too funny not to share. After musing about a subscription service for running shoes, I walked into my local store, bought a new pair, and invited them to notify me in three months. Hilarity ensued.

He: We’re not really set up to do that.

Me: You could email me.

He: Yeah, but then we’d have to keep some kind of customer database on the computer.

Oh, right. Having a database of customers who’ve invited you to contact them on a regular basis … that’d suck, wouldn’t it?

Perspectives, a new interview series, launches today

Today I’m launching a new Microsoft-oriented interview series called Perspectives. The show will touch on a variety of topics including robotics, digital identity, e-science, and social software. I’ll be speaking mostly with passionate Microsoft innovators, and sometimes also with key partners from academia and industry.

The format is an audio podcast and a blog, where the blog provides a partial (but substantial) text transcription in order to make these conversations accessible to folks who don’t listen to podcasts, and also to expose them to the Net’s ecosystem of search, linking, and aggregation. Where appropriate, I’ll also use screencasts to show software in action.

Perspectives runs on the same publishing platform that supports Channel 10 (for enthusiasts), Channel 8 (for students), TechNet Edge (for IT pros), and VisitMIX (for Web designers and developers). (Channel 9, the original site, will migrate to this platform too.) Perspectives intersects with the interests of all these sites, but it doesn’t really belong in any of them, so we’ve created an independent home for it. Thanks to the EvNet team, especially Duncan Mackenzie, David Shadle, and Jeff Sandquist, for making that happen.

The first episode, with Henrik Nielsen and Tandy Trower, explores the Microsoft Robotics initiative. We discuss why robotics is — as futurist Paul Saffo believes — a Next Big Thing. And Henrik and Tandy explain how the concurrency and decentralized-services infrastructure that supports the robotics platform is broadly relevant in an era of loosely-coupled services.

Ann Arbor’s public library is a beacon of progress

On the Ann Arbor public library’s website you can find a wonderful example of how two local institutions — the library and the police department — can work together to curate an online exhibit. In 2002, history buff and police sergeant Michael Logghe self-published the lavishly illustrated True Crimes and the History of the Ann Arbor Police Department. The library worked with Logghe to produce an online version of the book. And when he visited the library to speak about the book and the online exhibit, his talk was recorded and made available for download (as video or audio-only) from the library’s podcast feed. Nicely done!

In my Remixing the library talk, I said that the two-way web paves the way for this kind of productive teamwork. It’s not a natural reflex, as Cassandra Targett points out:

It’s a shift from being passive recipients of the world’s knowledge to active participants in its creation, a shift that in many ways goes against some of the deepest core principles of what has become library science.

For a profession steeped in the idea that our role is to describe packaged knowledge and then help people find it (and play no role in how they use it once we point the way to it), the idea that we can not only modify some types of packages or even create substantially new ones is quite foreign still.

As I noted in my interview with Adrian Holovaty about EveryBlock, the curatorial collaboration among local governments, newspapers and libraries can encompass more than text, images, audio, and video. Those same institutions can work together to curate data about the operation of government (crime, taxes, maintenance), about social and civic life (event calendars), about the environment (weather, air quality), and more.

Although it’s starting to happen more in the scientific realm, I haven’t yet found a good example of that kind of data-oriented collaboration in the civic realm. But the teamwork shown by Ann Arbor’s police department and public library embodies the spirit that will make it happen.

Linking to excerpts from the MIX keynotes

John Lam asked how to excerpt fragments of Steve Ballmer’s keynote, and the principle of keystroke conservation requires me to answer here. The VisitMIX page for the keynote lists three streams. The links point to .asx files, which are wrappers around references to media files or streams. In this case, the references point to streams, which means that you can excerpt fragments by specifying the starttime and duration parameters.

Here’s the medium-bandwidth .asx file into which I’ve inserted starttime and duration parameters to create a fragment that points to a question and answer about HealthVault.

<asx version="3.0">
  <title>mix08: steve ballmer</title>
  <entry>
    <title>mix08: steve ballmer on healthvault</title>
    <starttime value="52:50.0"/>
    <duration value="1:45"/>
    <copyright>copyright 2008. all rights reserved.</copyright>
    <ref href="mms://istreampl.wmod.llnwd.net/a269/o2/microsoft/300_microsoft_mix_080306.wmv" />
  </entry>
</asx>

I’ve posted the file at http://channel9.msdn.com/media/ballmer-keynote-healthvault.asx. It should play in Windows Media Player, and also in VLC on the Mac or Linux, though I can’t check those at the moment.
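If you want to cut more than one excerpt, generating the wrapper is easy to script. Here is a minimal sketch, assuming Python’s standard `xml.etree.ElementTree`; the `make_asx_excerpt` helper is a made-up name, and it emits only the elements that matter for an excerpt (the outer title and the copyright line are left out).

```python
import xml.etree.ElementTree as ET


def make_asx_excerpt(stream_url, title, start, duration):
    """Build an ASX 3.0 playlist whose single entry plays one fragment of a stream."""
    asx = ET.Element("asx", {"version": "3.0"})
    entry = ET.SubElement(asx, "entry")
    ET.SubElement(entry, "title").text = title
    ET.SubElement(entry, "starttime", {"value": start})
    ET.SubElement(entry, "duration", {"value": duration})
    ET.SubElement(entry, "ref", {"href": stream_url})
    return ET.tostring(asx, encoding="unicode")


asx_text = make_asx_excerpt(
    "mms://istreampl.wmod.llnwd.net/a269/o2/microsoft/300_microsoft_mix_080306.wmv",
    "mix08: steve ballmer on healthvault",
    start="52:50.0",
    duration="1:45",
)
```

Write `asx_text` to a file with an .asx extension and it should behave like the hand-edited version, though I have only reasoned about that, not tested it in every player.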

In general, launching appropriate media players from a web page is a complex process. I’m hoping and expecting that Silverlight, over time, will simplify it, and help make rich media more granularly linkable.

A conversation with Michael Lenczner about community wifi in Montreal

In Montreal this Friday, McGill professor Darin Barney will be giving a version of his talk on citizenship and technology. Here’s an excerpt:

Each of the telegraph, telephone, radio and television was accompanied by its own heroic rhetoric of democratic transformation and reinvigorated civic engagement. None have delivered fully on this promise, but each has been crucial for the maintenance of a system of political and economic power in which most people are systematically distanced from the practice of citizenship most of the time. For the most part, these technologies have been means of anything but citizenship: spectacular entertainment; docile recreation; habituation to the rhythms of capitalist production and consumption; cultural normalization. The internet, as a radically decentralized medium whose capacity for publication and circulation far surpasses that of its broadcast predecessors, has certainly provided the means by which politically-engaged citizens can access and produce politically-charged information that would never have seen the light of day under the regime of the television and newspaper. This information can be an important resource for political judgment. But the Internet also surpasses its predecessors as an integrated medium of enrolment in the depoliticized economy and culture of consumer capitalism. This is why we should be wary of equating more and better access to information and communication technology with enhanced citizenship.

One Montreal resident deeply influenced by Barney’s critique of the Internet as an enabler of citizenship is Michael Lenczner, whom I interviewed for this week’s ITConversations show. Mike is a co-founder of Île Sans Fil, Montreal’s community wireless network. With over 150 access points and nearly 60,000 users, the project is a huge success, all the more so given that municipal wi-fi projects in other cities have failed to materialize. And yet, Mike questions the value of what’s been accomplished. The project’s goal was not merely to light up hotspots in downtown Montreal, but to enhance the “sociality” of the city and elicit more and better civic engagement. He doubts these goals have been achieved, and asks himself hard questions about how technology can be deployed to these ends.

When I met Mike recently in Montreal, I said: “It amazes me that you’re asking yourself these questions.” He replied: “It amazes me that others don’t.”

Automation and accessibility in Silverlight and IE8

In this interview at MIX, Mark Rideout explains how Silverlight will use the same UIA (User Interface Automation) mechanisms that make Windows apps (and will make Linux applications) accessible by way of assistive technologies like screenreaders.

If you’re not somebody who needs that kind of assistance, you may not think this matters to you. But as I’ve pointed out in a series of essays, the flip side of accessibility is automation, and that’s something we all need.

For software developers, the automation framework provides the hooks needed to test the interactive behavior of applications.

For users, it provides the hooks needed to record, exchange, and replay software interactions. In The social scripting continuum I showed how IBM’s CoScripter enables people to share their knowledge of how to use web applications. It’s fabulous, but it’s restricted to the domain of simple web apps running in Firefox. IE, Ajax, Flash, Silverlight, and desktop apps are all out of scope.

With an automation/accessibility framework common to browsers, rich runtimes in browsers, and desktop apps, you could in theory enable a common way for people to describe and share their knowledge of how to use software across the full range of application types, for any browser, any rich runtime, and any operating system.

We’re not there yet, and we may never get there, but this Silverlight announcement points toward a future that’s worth imagining.

Update: In related news, John Resig notes that IE8 supports the W3C’s ARIA (Accessible Rich Internet Applications), which makes Ajax applications accessible to screenreaders. Here’s a brief guide for the perplexed, myself included, because this stuff is a layer cake.

Native accessibility toolkits, like MSAA (Microsoft Active Accessibility) and ATK (the Accessibility Toolkit used on Linux), are the foundation. The Mozilla implementation of ARIA rests on this layer, as does the IE8 implementation. User Interface Automation (UIA), meanwhile, is part of the .NET Framework. It can be used to automate unmanaged apps like Word, as well as managed apps on the desktop or (now) in Silverlight. How UIA will be realized on Linux is something I don’t know, but would like to find out.

I can’t formulate a unified field theory that joins all these pieces, on various platforms, but I hope one will emerge.

Permalinking the Hard Rock Memorabilia exhibit

The Hard Rock Memorabilia exhibit is a great example of what becomes possible now that Seadragon Deep Zoom is integrated into Silverlight 2. The exhibit includes:

Madonna’s page in her high school yearbook:

Pat Boone’s shoes:

John Lennon’s handwritten lyrics to Imagine:

And there’s much more. When you choose subsets — by artist, decade, type (e.g. clothing, instruments), genre, location — the images retile, and they’re all navigable using Deep Zoom’s extreme zoom and pan capability.

Note that the links above lead directly into the exhibit and focus on the indicated asset. You acquire these from the Share link in the right pane, which exposes URLs of the form:

http://memorabilia.hardrock.com/Default.aspx?AssetId=8191

It’s great to see this permalink feature included. Deep Zoom is going to open up vast spaces for exploration, and in order to explore those spaces together we’ll need shared coordinate systems.

To that end, I’m hoping that future incarnations of this sort of exhibit will expose richer URL namespaces. If I want to show you Madonna’s yearbook in the context of the 1970s, I have to tell you to click Decade, then 1970, then choose the 2nd item in the 3rd row. It’d be great to be able to get you there directly:

memorabilia.hardrock.com/decade/1970/20352

And of course I’d want to locate Madonna for you, among her other classmates, by zooming to the desired view and then tacking those coordinates onto the URL.
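Here is a minimal sketch of what tacking coordinates onto the URL could look like, along with the reverse step a viewer would need. The `AssetId` parameter comes from the exhibit’s Share link; the `x`, `y`, and `zoom` parameters and both helper functions are hypothetical, and the exhibit does not, as far as I know, accept them.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def view_url(base, asset_id, x=None, y=None, zoom=None):
    """Append an asset id and, optionally, view coordinates to a base URL."""
    params = {"AssetId": asset_id}
    if None not in (x, y, zoom):
        # Hypothetical view parameters: normalized center point plus zoom level.
        params.update({"x": x, "y": y, "zoom": zoom})
    return base + "?" + urlencode(params)


def parse_view(url):
    """Recover the asset id and any view coordinates from such a URL."""
    query = parse_qs(urlsplit(url).query)
    view = {"asset_id": int(query["AssetId"][0])}
    for name in ("x", "y", "zoom"):
        if name in query:
            view[name] = float(query[name][0])
    return view


base = "http://memorabilia.hardrock.com/Default.aspx"
url = view_url(base, 8191, x=0.42, y=0.17, zoom=8)
view = parse_view(url)
```

The round trip is the point: if the viewer can both emit and consume the same locator, a view becomes something you can paste into an email or a blog post.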

If these precise locators are made available, conversations about the views they identify can form on the web. To see why it’s crucial to expose a public namespace, consider the David Rumsey map collection. There you can explore and precisely annotate an extraordinary collection of historical maps. And you can search for those annotations within the Java-based viewer. But when you annotate a feature within a map, it doesn’t, so far as I can tell, produce a shareable URL. If those URLs were available, the collection would be woven into public discourse to a far greater degree than it is.

A couple of years ago, I asked whether rich Internet apps can be web-friendly. One of the responses came from Kevin Lynch at Adobe, who made this example showing how navigation within a Flash exhibit of images can be reflected on the URL-line.

I don’t think it matters much whether you expose the RIA’s state on the URL-line or by means of a permalink. What matters is that you do it, and do it in as granular a way as makes sense for the application.

PS: For extra credit, it’s nice to provide the underlying data for this sort of exhibit. When you’re exploring the Cubism timeline, for example, you can grab the data and mix it as you please.

WebSlices can help popularize feed syndication

With the release of the first public beta of Internet Explorer 8, two new features come to light: Activities and WebSlices. You can see a demo of both in Joshua Allen’s interview with Jane Kim. I think of Activities as next-generation bookmarklets, and also as kissing cousins to the OpenSearch providers that you can add to the browser’s search box.

WebSlices are something else again. They transform pieces of web pages into little feeds that you can subscribe to. For all its power and utility, feed syndication hasn’t yet really sunk into the consciousness of most people. I’m hoping that WebSlices, which are dead simple to create, will help bridge the gap.

Here’s a complete working example of a page with two slices:

<div class="hslice" id="1">  
<p class="entry-title">Slice 1</p>  
<p class="entry-content">This is slice 1.</p>
</div> 

<div class="hslice" id="2">  
<p class="entry-title">Slice 2</p>  
<p class="entry-content">This is slice 2.</p>
</div>
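To show how little structure a consumer needs to find those slices, here's a rough sketch in Python using only the standard library. It is not how IE8 does it. The class and function names are mine, and the parser is deliberately naive: it assumes the title and content sit in simple elements with no nested markup.

```python
from html.parser import HTMLParser

# The two-slice page, reproduced as a string for the sketch.
PAGE = """
<div class="hslice" id="1">
<p class="entry-title">Slice 1</p>
<p class="entry-content">This is slice 1.</p>
</div>

<div class="hslice" id="2">
<p class="entry-title">Slice 2</p>
<p class="entry-content">This is slice 2.</p>
</div>
"""

class SliceParser(HTMLParser):
    """Collect id, title, and content for each div marked class="hslice"."""

    def __init__(self):
        super().__init__()
        self.slices = []
        self._current = None  # the slice being filled in, if any
        self._field = None    # "title" or "content" while reading text
        self._depth = 0       # div nesting depth inside the current slice

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if tag == "div":
            if self._current is None and "hslice" in classes:
                self._current = {"id": attrs.get("id"), "title": "", "content": ""}
                self._depth = 1
            elif self._current is not None:
                self._depth += 1
        elif self._current is not None:
            if "entry-title" in classes:
                self._field = "title"
            elif "entry-content" in classes:
                self._field = "content"

    def handle_data(self, data):
        if self._current is not None and self._field:
            self._current[self._field] += data

    def handle_endtag(self, tag):
        if self._current is None:
            return
        if tag == "div":
            self._depth -= 1
            if self._depth == 0:  # closing tag of the slice itself
                self.slices.append(self._current)
                self._current = None
                self._field = None
        else:
            # Naive: any non-div end tag closes the field being read.
            self._field = None

def extract_slices(html):
    parser = SliceParser()
    parser.feed(html)
    parser.close()
    return parser.slices

if __name__ == "__main__":
    for s in extract_slices(PAGE):
        print(s["id"], s["title"], "-", s["content"])
```

Each dictionary that comes back is effectively one feed entry: an identifier, a title, and a body.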

The syntax is based on the hAtom microformat, which in turn is based on a subset of the Atom feed format. For my purposes here, ’nuff said about that. I’m much more interested in what users will see, do, and understand. Let’s view that page in IE8:

The orange feed icon in the toolbar changes to a (presumably not final) purplish thingy. And when I hover over the second slice, another of those pops up. Both are lit, indicating there’s fresh content.

From either the toolbar or the inline hover, I can subscribe (to just the second slice) like so:

It shows up as a favorite, bolded to indicate fresh content:

From another page, I can peek at the slice’s content by clicking its button:

But when you click Favorites->Feeds, you’ll see it’s also a conventional feed:

I like this for a couple of reasons. First, because it will give microformats a big boost, and propel the data web forward. Second, because it will introduce many more people to the whole idea of subscribing to feeds. There’s a big conceptual barrier there that we haven’t yet brought most people across. I’m hoping that a new way of subscribing to a new kind of feed will also raise awareness about the old ways of subscribing to conventional feeds.

Ward Cunningham’s implementation of Brian Marick’s “Visible Workings”

In Portland last week I visited with Ward Cunningham, whose pragmatic and humane approach to the art of software informs everything he touches: the Wiki, object-oriented, agile, and test-driven programming, the framework for integrated test. (InfoWorld stuff about Ward here, here, and here.) Ward’s living the startup life these days, at aboutus.org, which describes itself as a “socially editable directory of the internet.” Think WHOIS morphed into a Wikipedia where you are not only permitted, but actively encouraged, to write the biography of your company or community.

But that’s not what we mostly talked about. Instead Ward took me behind the scenes at the portal for the Eclipse Foundation. Only members can participate in the workflows accessible through this portal: electing new committers, scheduling project reviews. But it turns out that anybody can explore the portal use cases.

Here’s a simple one: Change Personal Address. This is the part of the system that runs when a member changes facts about his or her address. You can see a test script that exercises this part of the system. You can even run the test script and inspect the results. Try that, and you’ll see that the output interleaves lines of script with renderings of what the user sees: screenshots, emails.

Finally you can swim the test. Here the steps and results are laid out in a table. Time advances as you move down the rows, and there’s a column for every actor in the workflow.
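The table layout itself is easy to picture in code. Here's a small Python sketch of the idea, and only the idea: it is not Ward's implementation, and the steps, actor names, and function names are all made up for illustration. Given a time-ordered list of (actor, action) pairs, it puts each action in its actor's column, one row per step.

```python
# Hypothetical steps for an address change; not taken from the real portal.
STEPS = [
    ("Member", "submit new address"),
    ("Portal", "save address"),
    ("Portal", "send confirmation email"),
    ("Member", "read confirmation email"),
]

def swim_table(steps):
    """Return (actors, rows): one column per actor, one row per step."""
    actors = []
    for actor, _ in steps:
        if actor not in actors:  # keep actors in order of first appearance
            actors.append(actor)
    rows = [[text if actor == col else "" for col in actors]
            for actor, text in steps]
    return actors, rows

def render(actors, rows):
    """Format the table as plain text, time advancing down the rows."""
    widths = [max([len(a)] + [len(r[i]) for r in rows])
              for i, a in enumerate(actors)]

    def line(cells):
        return " | ".join(c.ljust(w) for c, w in zip(cells, widths)).rstrip()

    header = line(actors)
    rule = "-+-".join("-" * w for w in widths)
    return "\n".join([header, rule] + [line(r) for r in rows])

if __name__ == "__main__":
    actors, rows = swim_table(STEPS)
    print(render(actors, rows))
```

In the real portal each cell would also link to the screenshot or email captured at that step, which is what makes the table explorable rather than merely descriptive.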

When you hover over an action step or a notification, the corresponding screenshot or email message pops up. This is a great way to visualize a complex email-mediated workflow that can involve many actors, and unfold over many days. But here’s the kicker: the visualization is also available to users, directly from the interface. Here’s the screen that you see when you’re changing your address:

Next to the Save button there’s an Explore link. If you click it, you’ll discover the same swim visualization that anyone, anytime, can explore here. Note the variations, most interestingly the one for the case where the person is a committer, and where the address change either does or does not coincide with a change of employer. If you did change employer, you’re going to get this email informing you that additional paperwork is required:

This isn’t just an innovative approach to software testing and workflow visualization. It’s also a radical statement about business process transparency. For most of us, most of the time, business systems are black boxes whose internal workings we can only discern in the outcomes of our (often painful) interactions with them. But what if you could find out, before pressing the Save button, what’s going on in that black box? And what if your way of finding out wasn’t by reading bogus documentation, but instead by probing the system itself using its own test framework?

It’s a huge idea. In a blog about this project, Ward writes:

The MyFoundation portal, once again, respects the curiosity and intellect of its users by exposing all aspects of the processes it supports. Who asked for this? No one. No one thought to. That doesn’t mean it isn’t needed.

Brian Marick calls this Visible Workings. He identifies a middle ground, between the traditional GUI presentation and the raw source code that produces it. This middle ground makes the application both explanatory and tinkerable. The portal’s swim diagram is our middle ground. We know it makes our work explanatory and look forward to investigating the tinkerable aspects too.

And elsewhere:

Online forms have too much in common with income tax forms. Nobody likes filling out either one. Each is a sea of fields, each field another question, one question after another. It is like being interrogated. Can we make filling out a form more like a conversation than an interrogation? The portal’s explore links suggest a way toward this goal. These links let you ask a question every now and then. You get to ask, “why do you ask?” Wouldn’t it be great if you could always do that?

Amen, brother.

A conversation with Adrian Holovaty about EveryBlock.com

For this week’s ITConversations show, Adrian Holovaty joins me to chat about EveryBlock, a new website that gathers and publishes “address-specific” information such as crime reports, building-code violations, and restaurant inspections.

Acquiring this information isn’t frictionless, and it raises questions about how this kind of data can be published usefully, as opposed to merely published. EveryBlock also raises broader questions about news gathering and reporting. The project, which is funded by a Knight Foundation grant, has attracted some criticism for not being journalistic in spirit. But Adrian Holovaty suggests that EveryBlock actually redefines news.

The previous criterion for something being covered in the newspaper was that it has to affect a lot of people in the readership. But if the pothole is fixed on your block, it’s news to you, just like what your friends are doing on Facebook is news to you. Instead of a friend feed, we’re making an address feed.

More broadly, as information that used to yield only to investigative shoe-leather starts to flow freely on the Net, journalists will be able to divert energy from data collection to analysis.

I get a little frustrated when the high-falutin’ journalists look at EveryBlock and say ‘How is this journalism? Why do you think this is replacing newspapers?’ Well, this isn’t intended to replace journalism at all, if anything it’ll help you find trends going on in the world.

There’s also an open question as to which social institutions can best organize and curate these sources of information. Governments? Newspapers? Libraries? Self-organizing groups of citizens? I’m really curious to see how it plays out.