Joining web namespaces

The other day I read the following statement in the Economist:

Sensitivity of the data will decide if an application is suitable for processing in the cloud.

The writer does not mention, and probably is unaware of, the principle of translucent data. In a translucent database, the data is encrypted and thus opaque to the operator of the database. Users of the data share keys to unlock the data, and can do anything with cleartext copies that they keep locally. Can real and useful applications be built in this kind of regime? We don’t really know, because hardly anybody has tried. But if it turns out to be possible, it could become a foundation of cloud computing.

I wanted to advance the story. In particular, I wanted to help make a connection between that statement in the Economist and the idea of data translucency. I’ve written about translucency on my blog, and those entries are tagged on delicious. But nowadays the attention stream flows mainly through Twitter. So I composed this tweet:

Economist: “Sensitivity of the data will decide if an application is suitable for processing in the cloud.” Unless the data is #translucent.

There’s a limit to what you can do in 140 characters. That tweet uses all 140, but still falls short of what I wanted to do:

  • Quote from the Economist
  • Link to the Economist
  • Colonize a formerly empty hashtag namespace (#translucency)
  • Connect that namespace to its delicious counterpart

Inevitably I failed to do all that in 140 characters. Reflecting on the failure, I made this LazyWeb wish:

I wish I could tweet the command “join http://delicious.com/judell/translucency to #translucent and #translucency”

I’ve had some success joining tag namespaces from different domains. I mentioned the idea in this entry, and a commenter (engtech) provided a nifty solution based on Yahoo Pipes. I have since used it to keep track of items tagged icalvalid on blogs, on delicious, and on Twitter.1

My LazyWeb wish came from that experience, plus another which I wrote up in an entry entitled To: elmcity, From: @curator, Message: start. That entry describes how elmcity curators can now use Twitter direct messages to send commands to the elmcity service. The mechanism harkens back to Rael Dornfest’s brilliant Sandy, a service that acted as a personal assistant and responded to a repertoire of command messages.

Sandy lost her job when Rael went to work for Twitter. I’ve wondered if she would be rehired there. If so, a command like the one I proposed might be an example of the kind of thing she could do.

On further reflection, I’m not really sure what such a command would mean, or whether it would make sense to use Twitter to send it, or indeed whether it would make sense for Twitter (rather than some other service) to respond to it. But I’m in an exploratory mood, so let’s explore.

It would be straightforward to create a service that would take the Yahoo Pipes trick to the next level. Instead of editing and saving a Yahoo Pipe, you’d just command that service to merge the set of feeds for some tag. That command might best take the form of a URL:

http://tagjoiner.org/join/TAG?delicious=yes&twitter=yes&wordpress=yes

As is true for my combined icalvalid feed, the result formats could be HTML for viewing and RSS for feed splicing. As the creator of the joined feed, I’m aware that it exists, and I can cite it when I want to direct people’s attention to the union of the namespaces.

But suppose I wanted the joined namespace to be more discoverable than that? Here’s where it might make sense for Twitter to be involved. If a hashtag search on Twitter did the join, it could be made evident to the followers of the person making the join request, or even to anyone searching for the hashtag involved in the request.

This is almost surely too indirect and too abstract to ever make sense as a mainstream feature. But it’s fun to imagine. If I’ve made an investment in a tag on delicious, or WordPress, or somewhere else, I’d like to be able to bring those items to the attention of people who encounter the corresponding Twitter hashtag.

The general idea behind all this goes way beyond Twitter, of course. Waiting in the wings is a whole class of services that reconcile different web namespaces.


1 That feed used to include a mix of items marked [DELICIOUS] and [TWITTER]. But the Twitter items are less durable and seem to have aged out of the combined feed.

SQL Azure “Vidalia”: Practical translucency

Ever since Peter Wayner introduced me to the idea of a translucent database I’ve been thinking about the implications of this powerful idea. In a nutshell, the data in a translucent database service is opaque to the operator of the service, and visible only to sets of users who establish trust relationships. My 2002 review of Peter’s book summarizes his babysitter example:

Imagine a web service that enables parents to find available babysitters. A compromise would disastrously reveal vulnerable households where parents are absent and teenage girls are present. Translucency, in this case, means encrypting sensitive data (identities of parents, identities and schedules of babysitters) so that it is hidden even from the database itself, while yet enabling the two parties (parents, babysitters) to rendezvous.

Fast forwarding to 2009, here’s a current headline from InfoWorld: Microsoft adds access controls for SQL Azure online database. The article doesn’t say so, but this is database translucency in action.

The 2009 version of the babysitter example appears at 37:45 in this PDC session, where Dave Campbell and Rahul Auradkur discuss, and also show, a translucent pharmaceutical reagent marketplace. Dave Campbell spells out the scenario:

Pharma companies see reagents as being pre-competitive. They don’t compete at that level, and they’re willing to sell these reagents to one another, as long nobody can see what’s being bought and sold. That’s the controlled trust we need to set up.

The trick is accomplished by means of encryption and careful separation of concerns. Access policies are isolated from data storage, capable of federation, and auditable by trusted intermediaries.

This is exciting new territory. Historically, we’ve always assumed that the operator of an online information system has complete access to the data in that service. Translucency turns that assumption on its head, and leads to entirely new service design patterns. To implement those patterns requires more than just a database in the cloud. You also need a coordinated suite of supporting services for identity, access control, auditing, and more. Azure, as it becomes one provider of such services, will help make translucency a practical reality.

Talking with Marco Barulli about zero-knowledge online password management

A couple of years ago I was enamored with a clever password manager that pointed the way toward an ideal solution. It was really just a bookmarklet — a small chunk of JavaScript code — that used a simple method to produce a unique and strong password for the website you were visiting. The method was to combine a passphrase that you could remember with the domain name of the site, using a one-way cryptographic hash, in order to produce a strong password that would be unique to the site — and that you’d otherwise never be able to remember.

It wasn’t perfect. Sometimes the passwords it generated wouldn’t meet a site’s requirements. And sometimes the login domain name would vary, which broke the scheme. But it introduced me to two powerful — and related — ideas. JavaScript could turn your browser into a programmable cryptographic engine. And that engine could be used to implement protocols that relied on cryptography but transmitted no secrets over the wire.

To my way of thinking, that’s a killer combination. For years I’ve been using Bruce Schneier’s Password Safe, a Windows program that keeps my passwords in an encrypted store. There are many such programs, another example being 1Password for the Mac. This kind of app lives on your computer and talks to a local data store. That means it’s cumbersome to move the app and your data from one of your machines to another. And you can’t use it online, say from a public machine at the library or a friend’s computer.

Imagine a web application that would encrypt your credentials and store them in the cloud. It would deliver that encrypted store to any browser you happen to be using, along with a JavaScript engine that could decrypt it, display your credentials, and even use them to automatically log you onto any of your password-protected services. You’d trust it because its cryptographic code would be available for security pros to validate.

I’ve wanted this solution for a long time. Now I have it: Clipperz. My guest for this week’s Innovators show is Marco Barulli, founder and CEO of Clipperz, which he describes as a zero-knowledge web application. What Clipperz has zero knowledge of is you and your data. It just connects you with your data, on terms that you control, in a way that reminds me of Peter Wayner’s concept of translucent databases.

Clipperz is immediately useful to all of us who struggle to manage our growing collections of online credentials, But it’s also a great example of an important design principle. We reflexively build services that identity users and retain all kinds of information about them. Often we need such knowledge, but it’s a liability for the operators of services that store it, and a risk for users of those services. If it’s feasible not to know, we can embrace that constraint and achieve powerful effects.