In response to last Friday’s podcast with Tony Hammond about publishing for posterity, David Magda wrote to point out that our main topic of discussion, the DOI (digital object identifier) system, is one implementation of the CNRI (Corporation for National Research Initiatives) Handle System, but there are others, including DSpace. I wondered whether this class of software might work its way into the realm of mainstream blogging. David responded:

A weblog (or web pages in general) is simply a collection of text, links, pictures. This is no different than any other document / object / entity that DSpace would handle. It’d simply be another type of CMS IMHO. I think this would be a really good project to implement for an undergrad thesis, or perhaps as part of a master’s thesis.

However, as neat as all this is, I don’t think it would be implemented soon, or at least not in mainstream software. Few people will care whether their MySpace page survives over the aeons (and many people don’t want their kids to know what they did twenty years in the past).

But some of us do, and more of us will. The other day, for example, my daughter walked into my office while I was in the middle of a purge. Among the items destined for the recycling bin was a pile of InfoWorld magazines.

She: You’re throwing all these out?

Me: No, I’m keeping a few of my favorites. But as for the rest, I don’t have the space, and anyway it’s all on the web.

She: Don’t you want your grandkids to be able to see what you did?

Heh. She had me there. A pile of magazines sitting on a shelf is almost certainly a more reliable long-term archive than a website running on any current content management system.

Here’s another example. Back in 2002 I cited an essay by Ray Ozzie that appeared on what was then his blog, at ozzie.net. But if you follow that link today, you’ll land on the home page of the latest incarnation of Ray’s blog. The original essay is still available, but to find it you have to navigate a path like this:

My Blog v1 & v2 -> stories -> Why?

So OK, the web rots, get over it, we should all accept that, right?

Well, libraries and academic publishers don’t accept that. Nothing lasts forever, but they’re building content management systems that are far more durable and resilient than any of the current blogging systems.
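
A rough sketch may help show the core idea behind persistent-identifier systems such as the Handle System: a citation points at a stable identifier, and a resolver maps that identifier to wherever the content currently lives. Everything in this Python example is hypothetical and for illustration only. The `Resolver` class, the identifier `example/essay-1`, and the URLs are all made up, and none of it reflects the actual Handle or DOI software.

```python
class Resolver:
    """Toy registry that maps stable identifiers to current locations.

    Hypothetical illustration only; not the Handle System or DOI API.
    """

    def __init__(self):
        self._locations = {}

    def register(self, identifier, url):
        # An identifier is assigned once and never reused.
        if identifier in self._locations:
            raise ValueError(f"{identifier!r} is already registered")
        self._locations[identifier] = url

    def move(self, identifier, new_url):
        # When content moves, only the registry entry changes;
        # every citation of the identifier keeps working.
        if identifier not in self._locations:
            raise KeyError(identifier)
        self._locations[identifier] = new_url

    def resolve(self, identifier):
        # Look up where the content lives right now.
        return self._locations[identifier]


resolver = Resolver()
resolver.register("example/essay-1", "http://old-blog.example/stories/why")

# The blog is reorganized, so the essay gets a new address.
resolver.move("example/essay-1", "http://new-blog.example/archive/v1/stories/why")

# A citation made before the move still resolves to the essay.
print(resolver.resolve("example/essay-1"))
```

The code is the easy part. What makes such a registry durable is the commitment to keep it running and to keep its entries current.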

Conventional wisdom says that it wouldn’t make sense to make blogging systems similarly durable and resilient, for two reasons. First, because the investment would be too costly. Second, because blogs aren’t meant to last anyway; they’re just throwaway content.

The first point is well taken. As Tony Hammond points out in our podcast, the cost isn’t just software. Even when that’s free, infrastructure and governance are costly.

But I violently disagree with the second point. Just because most blog entries aren’t written for posterity doesn’t mean that many can’t be or shouldn’t be. My view is that blogs are becoming our resumes, our digital portfolios, our public identities. We’re already forced to think long-term about the consequences of what we put into those public portfolios because, though no real persistence infrastructure exists, stuff does tend to hang around. And if it’s going to be remembered, it should be remembered properly.

So a logical next step, and a business opportunity for someone, is to provide real persistence. This service likely won’t emerge in the context of enterprise blogging, because enterprises nowadays are more focused on the flip side of document retention: forgetting rather than remembering. Instead it’s a service that individuals will pay for, to ensure that the public record they write will persist across a series of employers and content management systems.