Today my digital assets are spread out all over the place. Some are on websites that I control, and a lot more are on websites that I don’t. Others are on local hard disks that I control, and a lot more are on disks that I don’t. It’s become really clear to me that I’d be willing to pay for the service of consolidating all this stuff, syndicating it to wherever it’s needed, and guaranteeing its availability throughout, and indeed beyond, my lifetime.
The scenario, as I’ve been painting it in conversations with friends and associates, begins at childbirth. In addition to a social security number, everyone gets a handle to a chunk of managed storage. How that’s coordinated by public- and private-sector entities is an open question, but here’s how it plays out from the individual’s point of view.
Your teacher assigns a report that will be published in your e-portfolio, which is a website managed by the school. Your parents tell you to write the report, and publish it into your space. Then they release it to the school’s content management system. A couple of years later the school switches to a new system and breaks all the old URLs. But the original version remains accessible throughout your parents’ lives, and yours, and even your kids’.
On the class trip to Washington, DC, you take a batch of digital photos. You want to share them on MySpace, so you do, but not directly, because MySpace isn’t really your space. So you upload the photos to the place that really is your space, where they’ll be permanently and reliably available, then you syndicate them into MySpace for the social effects that happen there.
You’re applying to colleges. You publish your essay into your space, then syndicate it to the common application service. The essay points to supporting evidence — your e-portfolio, recommendations — which are also (to a reasonable degree of assurance) permanently recorded in your space.
You visit the clinic and are diagnosed with mononucleosis. You’ve authorized the clinic to store your medical records in your space. This comes in handy a couple of years later, when you’ve transferred to another school, and their clinic needs to refer to your health history.
You use your blog to narrate the key events and accomplishments in your professional life, and to articulate your public agenda. All this is, of course, published in your space where you are confident (to the level of assurance you can reasonably afford) that it will be reliably available for your whole life, and even beyond.
Although this notion of a hosted lifebits service seems inevitable in the long run, it’s not at all clear how we’ll get there. The need is not yet apparent to most people, though it will increasingly become apparent. The technical aspects are somewhat challenging, but the social and business aspects are even more challenging.
In social terms, I think it’ll be hard to get people to decouple the idea of storage as a service from the idea of value-added services wrapped around storage.
On the business side, my conversations with Tony Hammond and Geoffrey Bilder have given me a glimpse of how these issues are being approached in the world of scholarly and professional publishing. But it’s not yet apparent that the specialized concerns driving these efforts will, in fact, generalize in important ways to almost everybody.
59 thoughts on “Hosted lifebits”
How do we get from here and now to the state you describe?
How do we organize things once we get there?
How do we maintain back-ups?
How do we handle versioning?
How do we pay for this?
How do we handle proprietary data formats and information?
How do we handle expanding storage requirements? As storage grows cheaper, people want to use more and more of it.
I suppose all these problems indicate opportunities for research and commerce, and a possible career path.
I think this is absolutely where things are heading and what people need, even if many of them don’t yet realize it. Amazon’s S3 service is an example of the kind of building-block infrastructural service that could be used to host our lifebits. Others will follow, but I predict that in the end game there will only be a handful of platforms available that will be able to provide these types of infrastructural services.
The decoupling is essential, though. I think facial recognition software is cool and I would love to try it out but I am not going to upload all of my pictures to YAPSS (Yet Another Photo Sharing Site) just to get this service. I want that service (and lots of other cool services!) to deal with my photos in their one, canonical location. Over time, these services will continue to decorate my photos (and other files) with more and more useful information (tags!) and then those files will be available to be mashed up into new applications.
This particular singularity is near.
p.s. – I still miss Byte magazine!
Yes, many questions arise. Some others:
How do I make it forget something?
How do I specify access?
How do I audit access?
“I suppose all these problems indicate opportunities for research and commerce”
“in the end game there will only be a handful of platforms available”
Yes, but there can’t be too few because there’s a level of continuity here that can’t by definition be guaranteed by any one business entity. There needs to be a consortium of interoperable entities, that has to be governed, there may or may not need to be a government involved, etc.
“I want that service (and lots of other cool services!) to deal with my photos in their one, canonical location.”
Exactly! Not at all obvious yet but, like you, I’m optimistic that it will shortly become so.
Great idea, but a couple of problems I see, and mentioned earlier, are who is going to maintain this database of information and how we are going to keep it from becoming obsolete as versions change. My in-laws have VHS tapes of my wife when she was growing up, and they are now looking to transfer those to DVD because the tapes are starting to show their age. In one way they might not have to deal with that problem because everything would be digital, but if for some reason it was stored in an MS Word document, would I then have to make sure that every time Microsoft released an update to Office, my data still worked?
If I place it on the web, how do I know that whatever markup language I use will still be in use 30 years from now? If I store my videos in Windows Media format, who makes sure that 20+ years from now, if my wife and I have children, there will be some form of Windows Media player? Would there be companies that I would then have to pay to make sure my “data” was current and usable?
Great thought though
“There needs to be a consortium of interoperable entities”
I think that’s right. A loose consortium bound together by standards around data independence. Comparable to how I can move my phone number between different providers, only a lot more complicated.
The differentiator between these different providers will be the quality and quantity of services they offer to operate on and add value to your content and make connections to other relevant content. Those services will still be silo’ed and by choosing a provider you would be choosing the set of available services.
Don’t even store anything that isn’t in a fully documented, open, format.
We need URNs and a properly working name broker, so we can refer to things by *name* and not by *location*. If I could grab URN:GUID:5fc21f43-a618-4106-84cc-978865f9870b (say) and have it refer to a storage location of my files, who cares where they are? Then if the current storage provider goes AWOL, I can move ’em somewhere else and update the “URN broker”.
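A rough sketch of the broker idea above, in Python. Everything here is hypothetical: the `UrnBroker` class, its method names, and the `example` provider URLs are made up for illustration, and the sketch uses the registered `urn:uuid:` form rather than the `URN:GUID:` spelling in the comment. A real broker would need persistence, authentication, and some agreed protocol; this only shows the name staying fixed while the location changes.

```python
import uuid


class UrnBroker:
    """Toy in-memory name broker: maps a stable URN to its current location."""

    def __init__(self):
        self._locations = {}  # URN string -> current storage URL

    def register(self, location):
        """Mint a new URN for an item and record where it lives right now."""
        urn = f"urn:uuid:{uuid.uuid4()}"
        self._locations[urn] = location
        return urn

    def resolve(self, urn):
        """Return the current location for a URN (KeyError if unknown)."""
        return self._locations[urn]

    def move(self, urn, new_location):
        """Point an existing URN at a new location; the name itself is unchanged."""
        if urn not in self._locations:
            raise KeyError(urn)
        self._locations[urn] = new_location


# The provider goes away, the files move, and links that use the URN still work.
broker = UrnBroker()
photo = broker.register("https://provider-a.example/brian/duck.jpg")
broker.move(photo, "https://provider-b.example/brian/duck.jpg")
print(broker.resolve(photo))  # prints the provider-b URL
```

Anything that linked to the URN never has to know that the move happened; only the broker's table changes.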
I’m not sure unique names for *everything* is a goal worth working toward. If the ultimate end is to have everyone’s ‘stuff’ permanently and universally accessible to whomever they chose to grant access, why complicate things by dictating everyone’s personal namespace? I mean, given a choice between asking for ‘URN:GUID:5fc21f43-a618-4106-84cc-978865f9870b’ and asking for ‘the picture of a duck from 2006 in Brian’s family data store’ or ‘Brian:family:2006:duck.jpg’ (where ‘Brian’ is some unique identifier), I’ll take the duck every time.
I suppose I’m really taking exception to how far to extend the unique identifiers. I’d rather give a personal store a unique ID and let the people handle their own naming from there on out. Although, re-reading your example as opposed to your opening sentence, maybe we’re saying the same thing. This is already here in a hodge-podge sort of way for the technically advanced. ftp://briansdomain.net/family/2006/duck.jpg can be pretty much wherever I want it to be physically. Of course, it will only be around for as long as I keep up with it. If only we just could make it easy, universal, and automatic.
As a side note (and maybe it’s because I’m spatially oriented), I like location as a reference. ‘The 3rd crescent wrench down in the 2nd drawer over from the right in the tool chest in my garage’ seems more natural than ‘The silver 3/8″ craftsman crescent wrench with serial #65219756954’ somehow.
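To make the "unique ID for the store, personal naming inside it" idea concrete, here is a minimal Python sketch. The `STORES` table, the `resolve_name` function, and the `provider-a.example` URL are all assumptions for illustration; only the `Brian:family:2006:duck.jpg` name comes from the comment above.

```python
# Hypothetical registry: one globally unique entry per personal store.
STORES = {"Brian": "https://provider-a.example/stores/brian"}


def resolve_name(name, stores=STORES):
    """Split 'store:part:part...' into a store ID plus an owner-chosen path."""
    store_id, *path = name.split(":")
    if not path:
        raise ValueError("expected a name like 'store:folder:file'")
    # Only the store ID needs a global lookup; the rest is the owner's own naming.
    return stores[store_id] + "/" + "/".join(path)


print(resolve_name("Brian:family:2006:duck.jpg"))
# prints https://provider-a.example/stores/brian/family/2006/duck.jpg
```

If the store moves to another provider, only the one `STORES` entry changes and every name under it keeps working.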
All conversations like this should include a thorough review of the digital identity solutions like OpenID, and I suggest getting people like Dick Hardt from Sxip in on the conversation, since he’s been thinking about it for years. We need secure online “banks”, hosted by providers of our choice, that will share our data – identity data, reputation data, pictures, whatever – with whomever we give our permission to share it with. So, not a new idea, but it will really take (and already has been taking) a team effort to implement.
Great concept. I think this goal is reachable less through settling on a single physical storage server (syndicating outwards in every direction) and more through the accumulating data itself being formatted in an open, transformable, evolvable collection of instances. Some structured and some unstructured – depending on what frags of life I’m collecting. If I accumulate this massive collection of XML fragments (and reasonably convertible binaries for photos etc.) then anywhere I move them around becomes that syndicatable and searchable perma-store. When my kids and grandkids are using photon computing devices then they will certainly be able to take my old bag of lifebits and “transform” them so they can enjoy my life holographically (or whatever). The record was open but still needed to hop-scotch through time from physical medium to physical medium.
“reachable less through settling on a single physical storage server”
It wouldn’t (couldn’t) be that. Rather, a virtual storage service that runs in the context of an interoperable federation of providers that individually offer various levels of value-added services and data integrity assurance, and collectively offer business continuity assurance that no individual one of them could.
“a virtual storage service that runs in the context of an interoperable federation of providers”
It would then be a “federation of federations”. If there were 100 “nodes” available in such a federation I might use 1-17 + 36, 38, 47 plus 78-91 for my unique footprint of lifebits. You might also use 1-17 and 78-91 but you prefer 54, 59, 71 over my three other choices etc. So we are all in the federated cloud but some nodes compete for business from federation users.
The federation would then need a design where, if any one node went dead, my data at that node would be able to hop to another live node (even if only temporarily and manually managed by me) while the federation “healed” itself by having a more permanent replacement node step in. Kind of how the original Internet had a design goal of uninterrupted re-routing around a city taken out by a nuclear blast.
This is partly why an open-format backup/portable mechanism would be needed (in my view).
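The node-failure behaviour described in this comment might look something like the following Python sketch. The `Federation` class and its methods are hypothetical, and the sketch assumes each item is kept on more than one node so that a surviving copy exists to re-replicate from; it does no real data transfer, only the bookkeeping of which nodes hold what.

```python
class Federation:
    """Toy model: tracks which live nodes hold a copy of each item."""

    def __init__(self, node_ids):
        self.live = set(node_ids)
        self.placement = {}  # item name -> set of node IDs holding a copy

    def store(self, item, nodes):
        """Place an item on the given nodes, all of which must be live."""
        chosen = set(nodes)
        if not chosen <= self.live:
            raise ValueError("cannot store on a dead or unknown node")
        self.placement[item] = chosen

    def fail(self, node):
        """Mark a node dead and re-replicate its items onto a live spare.

        Returns the items that had no surviving copy and are therefore lost.
        """
        self.live.discard(node)
        lost = []
        for item, holders in self.placement.items():
            if node not in holders:
                continue
            holders.discard(node)
            if not holders:
                lost.append(item)  # the dead node held the only copy
                continue
            spares = sorted(self.live - holders)
            if spares:
                holders.add(spares[0])  # copy from a survivor to a spare node
        return lost


fed = Federation(range(1, 6))
fed.store("duck.jpg", [1, 2])
fed.fail(2)
print(sorted(fed.placement["duck.jpg"]))  # prints [1, 3]
```

An item stored on a single node is simply lost when that node fails, which is one way to read the case for a portable, open-format backup outside the federation.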
I ran across this bit from Dave Winer this morning: http://www.scripting.com/stories/2007/03/01/preservingIdeas.html – and it seems to be a related goal.
“it seems to be a related goal”
Very much so. I’ve referenced Dave’s thinking on this topic here — http://blog.jonudell.net/2007/02/16/a-conversation-with-dan-chudnov-about-openurl-context-sensitive-linking-and-digital-archiving/ — and in several podcasts.
Hi Jon, great post.
While we have been coming at it from a slightly different angle, these are some of the things we, at the Internet Address Book, have been thinking about too. In fact we are getting ready to launch some new features within the next couple of weeks which, while being a long, long way from solving the vast majority of the problems you have highlighted, take a very small step in the right direction. It’s about using the basic infrastructure of the internet to create a place online that people can truly call their own.
Anyway, thought you might be interested. Our first post on the subject (it reads a little like a manifesto) is here: http://blog.internetaddressbook.com/?p=55 Of course, we’re always keen to discuss more.
Related and worth noting, though only a partial (and ad hoc) solution to the real problem:
The Society for American Baseball Research (SABR) has taken responsibility for preserving two member websites. In one case the member died unexpectedly, orphaning a site with content the society considers immensely valuable (and a lot of non-baseball stuff which mostly reminds us that Doug was an interesting guy); in the other case, the member moved on to other interests and offered the site to the society (again, a site with clear value to the membership).
Not, as I say, a full solution; more an acknowledgment that the issue needs a solution, and a tentative step in that direction. I imagine other research and/or hobby groups have done something like this.
Yes, there are so many places that things can be stored these days that people will be reluctant to pay for it.
A new service interesting in this context: https://fluidinfo.com/
Someone has already built a resume layer for it: http://fluid-cv.appspot.com/