Accounting for page popularity

Today Lauren Weinstein draws attention to “a fascinating and apparently singular page on Google that you’ve probably never seen.” He’s right: I hadn’t, and apparently not many others have either. The page, http://www.google.com/explanation.html, appears as a sponsored link when you search for the word Jew, and apologizes for the fact that a hate site appears as a highly-ranked result. Although the apology dates back to April 2004, more than three years ago, it has so far attracted fewer citations (currently 50) and bookmarks (currently 26) than some of the blog posts I’ve written since April 2004.

Lauren writes:

The Web, after all, isn’t really computers and routers, fiber and spinning disk arrays, databases and blogs. The Web is people. Our job now is to find the path toward helping make sure that the power of Web search enhances people’s lives while not incidentally creating asymmetric opportunities for seriously damaging innocent lives in the process.

Lauren’s item today points back to a pair of earlier items in which he proposed a dispute resolution mechanism that’s reminiscent of Wikipedia’s:

Question: Would it make sense for search engines, only in carefully limited, delineated, and serious situations, to provide on some search results a “Disputed Page” link to information explaining the dispute in detail, as an available middle ground between complete non-action and total page take downs?

As we see today, that’s already happening in at least this one case. I’m sure it won’t be the only one, and that the kind of mechanism Lauren envisions will emerge.

In parallel, I believe we’ll increasingly need and want more and better explanations of all search results. Today, for example, I am the second and tenth results for the word Jon. As recently as last week I edged out Jon Stewart for the top spot. Why? I have a large Web surface area, it has grown steadily over many years, and it’s mostly contained within the link-happy blogosphere.

Five years ago I called this a temporary anomaly, and predicted that a democratization of web presence would adjust the imbalance. It hasn’t happened yet, though. Meanwhile, it’s reasonable to expect that search engines might begin to provide the kinds of explanations that I’ve given here. Yes, ranking algorithms are proprietary, but some evidence — about the number of supporting pages, the structure of collections, the nature of supporting link networks — could go a long way toward helping people contextualize search results.
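
To make that concrete, here is a minimal sketch, in Python, of the kind of summary I have in mind. Everything in it is hypothetical: the toy link data, the field names, and the explain_rank helper are mine, not anything a search engine actually exposes. It only illustrates the sort of evidence that could accompany a result.

    from collections import namedtuple
    from urllib.parse import urlparse

    # One hypothetical inbound link: the page that links to a result,
    # and the year a crawler first saw that link.
    InboundLink = namedtuple("InboundLink", ["source_url", "first_seen_year"])

    def explain_rank(result_url, inbound_links):
        """Summarize the evidence behind a result's popularity: how many
        pages link to it, from how many distinct hosts, and over how long
        a period those links accumulated."""
        hosts = {urlparse(link.source_url).netloc for link in inbound_links}
        years = [link.first_seen_year for link in inbound_links]
        return {
            "result": result_url,
            "supporting_pages": len(inbound_links),
            "distinct_hosts": len(hosts),
            "oldest_link_year": min(years) if years else None,
            "newest_link_year": max(years) if years else None,
        }

    # Toy data standing in for a crawl of the link-happy blogosphere.
    links = [
        InboundLink("http://blog-a.example.com/2002/03/some-post", 2002),
        InboundLink("http://blog-b.example.com/2004/11/another-post", 2004),
        InboundLink("http://blog-a.example.com/2006/07/yet-another", 2006),
    ]

    print(explain_rank("http://weblog.example.com/", links))
    # {'result': 'http://weblog.example.com/', 'supporting_pages': 3,
    #  'distinct_hosts': 2, 'oldest_link_year': 2002, 'newest_link_year': 2006}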

Web search can create an asymmetric advantage for all kinds of agendas. In exceptional circumstances where such an advantage is exploited to damage people, I think Lauren’s right: we’ll need a mechanism to handle those exceptions. But in all cases, whether the agenda is positive or negative, better accounting for the nature of the advantage would be helpful.


3 thoughts on “Accounting for page popularity”

  1. After reading your post I found myself browsing to the Google-owned photo sharing system Picasa. What is curious, and prompted me to write this comment, is that at the bottom of the album pages they have a link to “Report inappropriate content”. Why treat Picasa photo albums differently from Google Search results? From the outside it might appear that Google is suffering from corporate schizophrenia: on the one hand unwilling to censor search results, yet on the other hand willing to play the role of censor. I suspect the distinction arises from a complex mix of reasons:

    Because Picasa (Google) are hosting the photographs themselves, they feel (and are) more responsible for the content. Though I’ve not checked, I’m guessing they explicitly specify in the terms of service what content they feel is appropriate for Picasa users to publish.

    Google search and Picasa are clearly different products with different purposes and uses. Users may be more likely to browse Picasa for content, rather than search explicitly. A search request heuristically tends to mirror intention, whereas browsing behaviour tends to be more ephemeral. If a user searches explicitly for something, they usually want what they’re looking for (unless, of course, you want to stay in a Hilton hotel in Paris). Consequently they are less likely to be surprised by what they find.

    There are no universally recognised “terms of service” for the Internet. Filtering search results effectively puts them in a position of being an arbiter for what’s allowed on the web. This is obviously a position they need to be very careful about adopting.

    Having an “inappropriate content” button on search pages would doubtless expose their ranking algorithms to easier gaming.

    Many Google properties such as Blogger, Picasa and YouTube were acquisitions, so in many ways it is not surprising that there are differences.

    Does the distinction here arise solely through the terms of service?

    Whilst these and many more reasons might be used to justify the different approaches between their products, it is interesting to note that there is a difference in attitude. Will this prevail, or will we see a unified approach to the problem across the board?

    I’d argue that a slight modification of the ‘disputed page’ idea could work for Picasa, provided of course that the content is legal and there are no awkward copyright issues (nasty-grams). Indeed this approach is used by the other Google-owned site, YouTube. The YouTube community flags potentially offensive videos as such; users stumbling across such content are warned and are then required to log in and verify their age. It seems that the suggested strategy could form the basis of Google’s censorship stance across the board, though questions of abuse and fairness will always arise.

  2. Although I do agree with Lauren Weinstein’s notion that a dispute mechanism will sometimes be necessary, it will by definition have to be used only in rare and exceptional cases.

    The question I’m raising is whether, in parallel, it will be generally helpful to have more transparency about what we might call the infrastructure of popularity. Banning a hate site is problematic for a bunch of reasons, one of them being that if that site is popular, its popularity is telling us something we might want to understand better — and could with more data.

  3. Google shouldn’t be apologizing for content that isn’t their own to begin with. But if they feel they must, the apology should be general, not preferential to one word. Certainly anti-Semites would see the cited apology as nothing short of pandering.
