Owning your namespace

Given my interest in persistent URLs and reliable citation, it’s surprising that I only just today learned about WebCite. Here is the WebCite URL for a recent entry of mine:

http://www.webcitation.org/5TLg33jR5

This looks a lot like a TinyURL. If you’re on Twitter, you’re seeing a lot of those because Twitter automatically invokes the TinyURL service when you cite a URL.

But WebCite has a different, and very special, mission. It’s for scholarly and professional authors whose articles are themselves persistently linkable by way of Digital Object Identifiers. Increasingly those articles cite more ephemeral things, like blog entries. Using a WebCite bookmarklet, these authors can produce URLs that point to archived copies of web pages. Think Wayback Machine, but you can ask to have an item archived and be sure that it will be.

This is cool, and it’s interestingly different from the ad-supported TinyURL. In the case of WebCite, support is expected from a consortium of publishers whose content cites a mix of persistent academic works and ephemeral web stuff. Such content will be more valuable, the reasoning goes, if the ephemera can also be reliably cited.

As the author of ephemeral items, of course, I’d like to insert myself into that value chain. In this model the citing author and the publisher can see referrals to my item, but I can’t. That’s another reason why I need a lifebits system that’s independent of my blog publishing service, and of linking and persistence services, but can control my namespace and syndicate to and from those services.

17 Comments

  1. Call me skeptical, but I just don’t see the business model. The WebCite FAQ has these two entries:

    Who is going to pay for this?
    There are various possible models to cover the ongoing costs of operations. The most likely model is that publishers will pay a membership fee (similar to PILA/CrossRef membership fees) to have their cited webreferences archived. There is no fee for authors. Readers from publishers/journals who are WebCite members will also have free access to archived material, unless publishers opt to charge their readers or to make this a value-added service for subscribers only.

    How can I be certain that webcitation.org doesn’t disappear in the future?
    Ultimately, WebCite will be owned and operated by a consortium of publishers, who together all have a vested interest in keeping this service alive. The creators of WebCite are already using it for their journals and citations and therefore have a vital interest in keeping the service alive.

    Perhaps the service is better than nothing, but those sound like dot-com-bubble promises. I wonder what the motivation of a publisher would be to pay to put all of their content into this system rather than keep up their own publishing systems. And what about “appropriate copy” problems? For articles that are behind a pay-for wall, what incentive is there to get a short “webcitation.org” URL that would open up the content to the world? And if it is all about a short URL, how is webcitation.org any better than a DOI (or an ARK)?

  2. I really like the idea – on demand cached permalink.

    That reminds me of another url-shrinker I’ve been thinking you might be interested in, conceptually: http://decenturl.com/. Instead of obfuscating url’s, it constructs friendly url’s based on the page title.

  3. It just hit me – WebCite’s missing the boat by going after this with a non-profit point of view. I have the same reservation that Peter Murray’s got – there’s no reason to believe they’ll be around in 5 years. I used URL123 a bunch a while ago, and when they shut down it was a major inconvenience to me and a lot of other people.

    Since WebCite adds their banner to the top of the page, the smart move would be for them to add some contextual advertising (like Google AdSense), either at the top or bottom of the page. It could be very minimal, and clearly marked “Ads placed by WebCite” so it would be obvious that the ads weren’t part of the original content. That’s the kind of content for which contextual ads really make sense, and I’d feel a lot better about using WebCite if I knew they had a real business model.

  4. Jon– Looooong time no see. Last time in person was at the lab in Peterborough. At any rate, I am a professor at DePaul University now, and I am doing research into early PC/networking journalism. I really would like to interview you– you could provide some great info for me and the study I’m doing. Please contact me via email or phone: 630-406-6130.

    Thanks, and late congrats on the ‘new’ job.

    Steve Bosak (formerly with Palindrome/Seagate Software/Zenith Data Systems; former writer at PC Tech Journal, the Chicago Trib)

  5. This sounds like a solution looking for a problem. Just moving content to a different site isn’t going to change the percent availability of the collection, and by putting it in a central repository, you create a central point of failure, throwing away one of the main strengths of the web.

    That’s not the worst thing, though. The most serious problem is that by choosing a TinyURL scheme, you’re stripping information out of the links themselves, something that bloggers have worked hard to put there by creating meaningful, human-readable permalinks.

    The problem of persistent content has to be solved by the content publishers. If you’re worried that a page you cite won’t be around in 5-10 years, the way to handle that is to make a local copy of it. If you’re worried that citations of your content won’t work 5-10 years from now, the best way to handle that is to use a redirect and keep control of the domain. If you can’t do that, you need to take responsibility for breaking links to your content and update the referring cites.

    It just seems to me that the cost of maintaining a mirror of the web vastly exceeds the value of maintaining links to content that the content owner and citing author didn’t feel important enough to keep accessible.

  6. I am suffering from jet lag, so I’ll keep this short, but I should point out that CrossRef has been in discussions with WebCite about whether we should offer a service like this. I summarized my enthusiasm and concerns about WebCite on the CrossTech blog last month (http://www.crossref.org/CrossTech/2007/10/nlm_blog_citation_guidelines_1.html), so I won’t repeat them here, but I will add a few thoughts:

    0) In the above posting, substitute my use of the word “cache” with the word “archive.” Cliff Lynch rightly pointed out that I was using the wrong term.
    1) There is increasing interest on the part of academic bloggers in making sure that they are able to differentiate their professionally related blogging, from their posts about their cats, vacations, etc. The BPR3 initiative (bpr3.org) is an interesting development in this area as is an upcoming conference on Science Blogging (scienceblogging.com). I should think that these bloggers might also be concerned about the relative ephemerality of their content as it is currently managed.
    2) General Bloggers might be interested in paying for a service like this because they are realizing increasingly that, like academics, they *too* now live in a world where citation persistence starts to matter. I can imagine a service like WordPress or ScienceBlogs offering a “professional blogger package” that included assignment of DOIs and archiving postings (as described in the above-mentioned CrossTech posting).
    3) Publishers might be interested in financing such a service in order to help them increase the utility of the work they publish. Clearly, an article where all the citations actually lead to something is far more useful than one where the citations break. We’ve got a solution for ensuring the integrity of links to more formally published content, but what are we going to do about relatively ephemeral content?

  7. @Peter Murray “how is webcitation.org any better than a DOI.”

    It’s interesting to be able to do it in a lightweight, on-demand way, is all.

    @Mr Gunn “The most serious problem is that by choosing a TinyURL scheme, you’re stripping information out of the links themselves, something that bloggers have worked hard to put there by creating meaningful, human-readable permalinks.”

    Agreed. I would like the best of all worlds. To control my own namespace, and to assure persistence and reliable citation. What I actually like most about this discussion is that it forces us to tease apart the separable components and think about how we might recompose them.

    @Geoffrey Bilder: Over here — http://www.crossref.org/CrossTech/2007/10/nlm_blog_citation_guidelines_1.html — you said: “It seems to me that a system for reliably citing blogs and wikis would benefit many communities.”

    Exactly. There are no perfect solutions but for the most part we’re not even trying. The blogosphere is doing a pretty darn good job of citation analysis, but is lousy at preserving what is cited.

    We oughta have higher expectations, and be thinking about ways to achieve them, is all I’m saying.

  8. What this effort has made me realize is that if publishers and authors don’t get their act together and come up with a workable solution that they control, there’s a strong possibility that a good idea with a seriously flawed implementation, such as this one, can sneak up and become a de facto standard.

  9. @Mr Gunn “The most serious problem is that by choosing a TinyURL scheme, you’re stripping information out of the links themselves, something that bloggers have worked hard to put there by creating meaningful, human-readable permalinks.”

    If you check the WebCite technical documentation at http://www.webcitation.org/doc/WebCiteBestPracticesGuide.pdf you will see that WebCite also supports a transparent format like http://www.webcitation.org/query?url=this&date=that (retrieving THIS url at THAT date). The abbreviated format using an ID instead (http://www.webcitation.org/aSDHJE) is handy for print publications to save space, and is also used by publishers who in the references section cite the original URL plus the WebCited (archived) URL (see for example references in http://www.jmir.org, or also the BMC journals).

    I guess we could (and will) also make the database table public (the table which maps WebCite snapshot IDs to URLs/dates).

    Gunther Eysenbach
    WebCite initiator
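
Editor's sketch: the transparent format mentioned in comment 9 (http://www.webcitation.org/query?url=this&date=that) keeps the cited URL readable inside the link, which speaks to the concern about opaque IDs. Below is a minimal illustration of building and reading such a link, assuming only the `url` and `date` parameter names given in that comment. The helper names, the example.org address, and the date string are hypothetical, and the date format the service actually expects is not stated here.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Base of the transparent query format quoted in comment 9.
WEBCITE_QUERY = "http://www.webcitation.org/query"


def transparent_citation_url(cited_url: str, date: str) -> str:
    """Build a link of the form .../query?url=THIS&date=THAT.

    The date is passed through unchanged; its expected format is an
    assumption left to the caller.
    """
    return f"{WEBCITE_QUERY}?{urlencode({'url': cited_url, 'date': date})}"


def cited_url_from(citation_url: str) -> str:
    """Recover the original cited URL from a transparent link."""
    return parse_qs(urlsplit(citation_url).query)["url"][0]


# Hypothetical blog permalink and date, for illustration only.
original = "http://example.org/blog/2007/11/owning-your-namespace"
link = transparent_citation_url(original, "2007-11-05")
print(link)
print(cited_url_from(link))
```

Because the original address survives inside the link, a reader (or a script) can still fall back to the author's own namespace if the archive is unavailable, which the short ID form does not allow without the mapping table.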
