Revisiting language evolution in

Recently I began keeping track of interesting public data sources using the tag judell/publicdata, and invited others to do the same using their own accounts. That method sets up an interesting pattern of collaboration whereby all contributions flow up to the global bucket, tag/publicdata, but individual contributors can curate subsets of that collection according to their own interests.
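To make that pattern concrete, here is a minimal Python sketch. The `Bookmark` record, the two helper functions, and the example.org URLs are all hypothetical illustrations, not the bookmarking service's actual data model or API. The sketch only shows how one shared tag can feed a global view while each contributor keeps a personal subset.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Bookmark:
    # Hypothetical record: who posted the bookmark, the URL, and its tags.
    user: str
    url: str
    tags: frozenset


def global_bucket(bookmarks, tag):
    """Every bookmark carrying the tag, regardless of who posted it."""
    return [b for b in bookmarks if tag in b.tags]


def user_bucket(bookmarks, user, tag):
    """Only the named user's bookmarks carrying the tag."""
    return [b for b in bookmarks if b.user == user and tag in b.tags]


# Made-up sample data; the URLs are placeholders.
bookmarks = [
    Bookmark("judell", "http://example.org/census", frozenset({"publicdata"})),
    Bookmark("manyeyes", "http://example.org/survey", frozenset({"publicdata", "class"})),
    Bookmark("judell", "http://example.org/recipes", frozenset({"cooking"})),
]

everyone = global_bucket(bookmarks, "publicdata")             # tag/publicdata
mine = user_bucket(bookmarks, "judell", "publicdata")         # judell/publicdata
theirs = user_bucket(bookmarks, "manyeyes", "publicdata")     # manyeyes/publicdata
```

Nothing is copied or moved between buckets here: each view is just a different filter over the same pool of bookmarks, which is why contributing and curating can happen in a single act of tagging.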

A nice example of that pattern emerged when the Many Eyes folks showed up at manyeyes/publicdata. Their contributions flowed up to the global bucket, and thence to the RSS feed I’m watching, which is how I found out about this excellent survey of a variety of public sources. It was done for a class at the University of Maryland, and it very helpfully characterizes data sources along a number of axes including searchability, browsability, interaction, and formats.

All this is quite straightforward and unsurprising to anyone who’s familiar with social bookmarking — which is to say, still quite unfamiliar to most people today.

So there’s not much chance that the next maneuver I’m going to describe will resonate in the general population, but I want to describe it anyway because those of us who think about these things ought to be thinking about how to make it more discoverable.

Several years ago, in a screencast entitled Language evolution in, I posited that tag vocabularies could evolve in the same way that natural languages do. In the realm of natural language, we coin new words all the time. When we hear a new word that we like, we adopt it — or, perhaps, adapt it. The punchline of the screencast was that this is how the grassroots semantic web will form. There are just two requirements: We need to be able to speak, and we need to be able to hear others speak.

Speaking, in the realm of tag vocabularies, means writing tags, and sometimes creating new ones. Hearing means reading tags, and observing how they’re applied to resources and by whom.

If you land on a page that you haven’t yet bookmarked, you can use the posting bookmarklet to show you (as recommended tags) which other tags have been assigned to that URL.

I tend to rely on a more sensitive organ of hearing: a bookmarklet that I call dc, for conversation. I use it all the time. Suppose, for example, I’d found that University of Maryland page through some other means of referral. I’d have reflexively clicked the dc bookmarklet to produce this report, which shows who else has bookmarked that page, and how it has been described.

In this case there’s not much to see. The URL was bookmarked once in Feb 07, by elzzup, to the tags data and class, and again in Jul 07, by manyeyes, to the tag publicdata.

This view is interesting for a couple of reasons that I don’t think are widely appreciated. First, it shows a progression from general ways of describing the resource to a more particular way. Note, by the way, that the proposed refinement of data to publicdata is not visible when you launch the bookmarking form, which recommends only class and publicdata. Note also that the introduction of publicdata is really a hack. It would arguably be better to rely on the individual tags public and data. But that would make it necessary to query for the conjunction, and that connection is too fragile. So publicdata also suggests something about how to form tags — that is, by making these conjunctions explicit.
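Both points can be sketched in a few lines of Python. The two history entries come from the report described above; the function names and the data layout are my own assumptions for illustration, not anything the service exposes.

```python
# Bookmarking history for the survey page, as described in the report:
# ((year, month), user, tags applied)
history = [
    ((2007, 2), "elzzup", {"data", "class"}),
    ((2007, 7), "manyeyes", {"publicdata"}),
]


def first_uses(history):
    """Map each tag to the (date, user) that first applied it to the URL."""
    seen = {}
    # Sort on the date only, so the tag sets are never compared.
    for date, user, tags in sorted(history, key=lambda entry: entry[0]):
        for tag in tags:
            seen.setdefault(tag, (date, user))
    return seen


def matches_all(tags, wanted):
    """True if a bookmark's tags include every tag in the query."""
    return set(wanted) <= set(tags)


introduced = first_uses(history)

# A conjunctive query for public AND data finds neither bookmark,
# while the explicit compound tag finds the later one.
found_by_conjunction = [user for _, user, tags in history
                        if matches_all(tags, {"public", "data"})]
found_by_compound = [user for _, user, tags in history
                     if matches_all(tags, {"publicdata"})]
```

The conjunction only works if every contributor happens to apply both halves, which is the fragility mentioned above. The compound tag succeeds with a single deliberate act.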

Second, it shows who has proposed publicdata — namely, manyeyes, an identity that may be recognized, and that if recognized will add weight to the proposed usage of the tag.

These are subtle effects. For most people, they’re too subtle to matter at all. But I’m reminded that there’s important work yet to be done to render these effects in ways that make it easier for everyone to hear (and visualize) linguistic evolution in the tag domain, so that people can participate more actively and more naturally in that evolution.


  1. Did you try to use to track those RSS feeds with more instant insights? I did the same thing with Anothr months ago, finding Japanese and Chinese content are now peering with English ones.

  2. >I tend to rely on a more sensitive organ of hearing: a bookmarklet that I call dc, for conversation.

    Where can I find this bookmarklet?

  3. This is very interesting

    …but I might need a definition of public… :

    You are not authorized to view this page
    The Web server you are attempting to reach has a list of IP addresses that are not allowed to access the Web site, and the IP address of your browsing computer is on this list.

  4. “Where can I find this bookmarklet?”

    I tried to include it in the above but evidently failed. The text of the bookmarklet is:


  5. I had to change the “’” quotes to “‘” quotes to make the bookmark work (in FF2). (quoting quotes, how delicious for the use-mention conscious)

  6. hmmm, what I meant was, the copy-paste quote, whichever it is, did not work. Replacing them by quotes input by keyboard worked.

  7. (Did you know that your blog site goes into a redirect loop under Firefox? At least for me.

    For some reason it tries to redirect me to, which redirects back to
    and then loops. I suspect it may be related to the fact I have a account?)


    One thing I’m surprised hasn’t implemented is the idea of discussions on the tag pages, and on the page for each URL.

    We recently did a demo of this work with a customized version of Scuttle, which went down pretty well. I’ve posted some about this at (although not about the comments explicitly)

  8. “Did you know that your blog site goes into a redirect loop under firefox?”

    I’ve seen something that smells related to that, when going to my own site through a proxy. Are you proxied?

    “the idea of discussions on the tag pages”

    True. In principle, there can be virtual discussion, blog-to-blog style, by way of Technorati et al. In practice that’s probably a bit too attenuated to catch on.

    “Cloudalicious looks at these language changes over time as well ”

