How shared vocabularies tie the annotated web together

I’m fired up about the work I want to share at Domains 2017 this summer. The tagline for the conference is Indie Tech and Other Curiosities, and I plan to be one of the curiosities!

I’ve long been a cheerleader for the Domain of One’s Own movement. In Reclaiming Innovation, Jim Groom wrote about the need to “understand technologies as ‘potentiality’ (to graft a concept by Anton Chekhov from a literary to a technical context).” He continued:

This is the idea that within the use of every technical tool there is more than just the consciousness of that tool, there is also the possibility to spark something beyond those predefined uses. The only real way to galvanize that potentiality is to provide the conditions of possibility — that is, a toolkit for user innovation.

My recent collaboration with Mike Caulfield on the Digital Polarization Initiative has led to the creation of just such a toolkit. It supports DigiPo in the ways described and shown here. A version of the toolkit, demoed here, will support a team of investigative journalists. Now I need to show how the toolkit enables educators, scientists, investigative reporters, students — anyone who researches and writes articles or reports or papers backed by web-based evidence — to innovate in similar ways.

In tech we tend to abuse the term innovation so let me spell out exactly what I mean: Better ways to gather, organize, reason over, and cite online evidence. Web annotation, standardized this week by the W3C, is a key enabler. The web’s infinite space of addressable URLs is now augmented by a larger infinity of segments of interest within the resources pointed to by URLs. In the textual realm, paragraphs, list items, sentences, or individual words can be reliably linked to conversations — but also applications — that live in connected annotation layers.

A web of addressable segments of interest is a necessary, but not sufficient, condition of possibility. We also need tools that enable us to gather, organize, recombine, and cite those segments. And some of those tools need to be malleable in the hands of users who can shape them for their own purposes.

When I reread Vannevar Bush’s As We May Think, to prepare for a conversation about it with Gardner Campbell and Jeremy Dean (video, Gardner’s reflections), I focused on this passage:

He has dozens of possibly pertinent books and articles in his memex. First he runs through an encyclopedia, finds an interesting but sketchy article, leaves it projected. Next, in a history, he finds another pertinent item, and ties the two together.

Nowadays that first encyclopedia article lives at one URL. The pertinent item in a history is a segment of interest within another URL-addressable resource. How do we tie them together? A crucial connector is a tag that belongs to neither resource but refers to both.

When tools control the sets of tags available for resource interconnection, they enable groups of people to make such connections reliably. That’s what the DigiPo toolkit does when it offers a list of investigation pages, drawn from the namespace of a wiki, as the set of tags that connect annotation-defined evidence to investigations. You see that happening with the DigiPo toolkit shown here, and with a variant of the toolkit shown here. In both cases the tags that bind evidence to wiki pages are controlled by software that acquires a list of wiki pages and presents the names of those pages as selectable tags.
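
Here is a minimal sketch of that idea in Python. It is not the DigiPo toolkit's actual code: the page names, the dictionary shape of an annotation, and both helper functions are assumptions made up for illustration. It shows software turning a list of wiki page names into a controlled tag set, then refusing any tag outside that set.

```python
def tags_from_pages(page_names):
    """Turn wiki page names into a controlled vocabulary of tags."""
    # Assumed convention: a tag is the page name, trimmed and de-duplicated.
    vocabulary = []
    for name in page_names:
        tag = name.strip()
        if tag and tag not in vocabulary:
            vocabulary.append(tag)
    return vocabulary


def bind_evidence(annotation, tag, vocabulary):
    """Return a copy of the annotation tagged with an investigation page."""
    if tag not in vocabulary:
        raise ValueError(f"{tag!r} is not an investigation page")
    tagged = dict(annotation)
    tagged["tags"] = sorted(set(annotation.get("tags", [])) | {tag})
    return tagged


# Made-up page names standing in for the wiki's namespace.
pages = ["Claim_A", " Claim_B ", "Claim_A"]
vocabulary = tags_from_pages(pages)

# A made-up piece of annotation-defined evidence.
evidence = {"uri": "https://example.com/article", "quote": "a cited passage"}
tagged = bind_evidence(evidence, "Claim_B", vocabulary)
```

Because the only tags a user can pick are names of pages that exist, a typo can't silently detach a piece of evidence from its investigation.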

One future direction for the toolkit leads to software that acquires lists of pages from other kinds of content management systems: WordPress, Drupal, you name it. Every CMS defines a namespace that is implicitly a list of tags that can be used to bind sets of resources to the pages served by that CMS. If you’re looking to adapt a DigiPo-like tool to your CMS, I’ll be delighted to show you how.

Such adaptation, though, requires somebody to write some code. While it’s unfashionable in some circles to say so, I don’t think everyone should learn to code. There’s a more fundamental web literacy, nicely captured by Audrey Watters here:

It’s about understanding the components of the Web and knowing how to tag and then manipulate them. By thinking and developing sets of named resources, you are a Web thinker. This isn’t about programming but rather the creation of sets of resources and the identification of components that work with those resources and combine them to create solutions.

Web annotation vastly enlarges the universe of resources that can be named. But it’s on us to name them. Tags are a principal way we do that. If our naming of resources is going to be an effective way to organize and combine them, though, we need to do it reliably and consistently. Software can enforce that consistency, but not everyone can write software. So a user innovation toolkit for the annotated web needs to empower users to enforce consistent naming without writing code.

A couple of weeks ago I built a Chrome extension that enables users to define their own lists of shared tags by recording them in a Google Doc. The demonstration video prompted this query from Jim Groom:

I just got through with a workshop here demoing Hypothes.is for a European group that may be using it to annotate online legislation for data privacy set to go live in 2018. They are teaching a course on it, and this could be one of the spaces/hubs they build the open part around. I came back to this video just now, but got the sense I could already tag from within annotations/pages, so how does the tag helper change this? Just a different way at it? Is it new functionality from previous tags? I love that you can have a Google Doc list of tags, but the video example is not making sense to me for some reason. And I wanna know :)

Here’s my response. That tag helper, now incorporated into the toolkit I’m evolving for DigiPo and other uses, makes it possible for people who don’t write code to define tag namespaces that govern their gathering, organization, recombination, and citation not only of URL-addressable resources but also of annotation-addressable segments of interest within those resources. People can “tie them together” — as Vannevar Bush imagined — in the ways their interests and workflows require.
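
To make that concrete, here is a rough sketch, not the extension's real code. It assumes the shared document has already been exported as plain text with one tag per line, and that annotations are simple dictionaries. The tag names and both function names are invented for the example.

```python
from collections import defaultdict


def parse_tag_list(doc_text):
    """Read a user-maintained tag list: one tag per line, blanks ignored."""
    namespace = []
    for line in doc_text.splitlines():
        tag = line.strip()
        if tag and tag not in namespace:
            namespace.append(tag)
    return namespace


def tie_together(annotations, namespace):
    """Group annotations by shared tag, ignoring tags outside the namespace."""
    groups = defaultdict(list)
    for annotation in annotations:
        for tag in annotation.get("tags", []):
            if tag in namespace:
                groups[tag].append(annotation)
    return dict(groups)


# Hypothetical contents of the shared document.
doc_text = "consent\ndata-portability\n\nenforcement\n"
namespace = parse_tag_list(doc_text)

annotations = [
    # A whole URL-addressable resource.
    {"uri": "https://example.org/law", "tags": ["consent"]},
    # A segment of interest within another resource.
    {"uri": "https://example.org/commentary", "quote": "opt-in by default",
     "tags": ["consent", "misc"]},
]
groups = tie_together(annotations, namespace)
```

Here the tag "consent" belongs to neither resource but refers to both, and the stray tag "misc" is ignored because nobody added it to the shared list.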

Does that answer the question? If not, please keep asking until I do so properly. User-defined tag namespaces, though admittedly still a curiosity, are one of the best ways to make collective use of a web of addressable segments.

How annotation layers define “segments of interest” for new kinds of applications

Here are some analogies we use when talking about software:

Construction: Programs are houses built on foundations called platforms.

Ecology: Programs are organisms that depend on ecosystem services provided by platforms.

Community: Programs work together in accordance with rules defined by platforms.

Architecture: Programs are planned, designed, and built according to architectural plans.

Economics: Programs are producers and consumers of services.

Computer hardware: Programs are components that attach to a shared bus.

All are valid and may be useful in one way or another. In this essay I focus on the last because it points to an important way of understanding what web annotation can enable. My claim here is that the web’s emerging annotation layer forms a shared bus for a new wave of content-oriented applications.

A computer’s bus connects devices: disk drive, keyboard, network adapter. If we think of the web in this way, we’d say that devices (your computer, mine) and also people (you, me) attach to the bus. And that the protocol for attachment has something to do with URLs.

You can, for example, follow this link to display and interact with the set of Hypothesis annotations related to this web page. You can also paste the link’s URL into a message or a document to share the view with someone else.

That same URL can behave like an API (application programming interface) that accesses the resource named and located by the URL. A page like this one, part of the DigiPo fact-checking project, uses the link that way. It derives the Hypothesis search URL from its own URL, and injects the resulting Hypothesis view into the page.
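
Deriving one URL from another is simple enough to sketch. The two URL patterns below are my assumptions about how a Hypothesis search page and a Hypothesis API search are addressed; check them against the current Hypothesis documentation before relying on them. The function names and the example page URL are invented.

```python
from urllib.parse import urlencode


def hypothesis_search_url(page_url):
    """Build a link to the human-readable view of a page's annotations."""
    # Assumed pattern: the search page accepts a "url:" facet in its query.
    return "https://hypothes.is/search?" + urlencode({"q": "url:" + page_url})


def hypothesis_api_search_url(page_url, limit=20):
    """Build the machine-readable counterpart of the same search."""
    # Assumed pattern: the API search endpoint accepts "uri" and "limit".
    query = urlencode({"uri": page_url, "limit": limit})
    return "https://api.hypothes.is/api/search?" + query


page = "https://example.com/page"
view_link = hypothesis_search_url(page)
api_link = hypothesis_api_search_url(page)
```

The same page URL yields one link for people and one for programs, which is the sense in which a URL behaves like an API.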

Every time we create a new wiki page at digipo.io, we mint a new URL that summons the set of Hypothesis annotations specific to that page. In principle there’s no limit to the number of such pages — and associated sets of annotations — we can add. And that’s just one of an unlimited number of sites. The web of URL-addressable resources is infinitely large.

Even so, URLs address only a small part of a larger infinity of resources: words and phrases in texts, regions within images, segments of audio and video. Web annotation enables us to address that larger infinity. The DigiPo project illustrates some of the ways in which annotation expands the notion of content as a bus shared by people and computers. But first some background on how annotation works.

The proposed standard for web annotation defines an extensible set of selectors:

Many Annotations refer to part of a resource, rather than all of it, as the Target. We call that part of the resource a Segment (of Interest). A Selector is used to describe how to determine the Segment from within the Source resource.

When the segment of interest is a selection in a textual resource, one kind of selector captures the selection and its surrounding text. Another captures the position of the selection (“starts at the 347th character, ends at the 364th”). Still another captures its location in a web page (“contained in the 2nd list item in the first list in the seventh paragraph”). For reasons of both speed and reliability, Hypothesis uses all three selectors when it attaches (“anchors”) an annotation to a selection.
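
Here is a small sketch of why redundant selectors help. It uses the W3C model's TextQuoteSelector and TextPositionSelector shapes but leaves out the third, page-location selector, which needs a real DOM. The anchoring logic is a simplified illustration of the idea, not Hypothesis's actual algorithm, and the function names are mine.

```python
def describe_selection(text, start, end, context=16):
    """Build quote and position selectors for text[start:end]."""
    quote = {
        "type": "TextQuoteSelector",
        "exact": text[start:end],
        "prefix": text[max(0, start - context):start],
        "suffix": text[end:end + context],
    }
    position = {"type": "TextPositionSelector", "start": start, "end": end}
    return [quote, position]


def anchor(text, selectors):
    """Return (start, end) of the selection in text, or None if not found."""
    by_type = {s["type"]: s for s in selectors}
    quote = by_type["TextQuoteSelector"]
    position = by_type.get("TextPositionSelector")
    # Fast path: the recorded position still points at the quoted text.
    if position and text[position["start"]:position["end"]] == quote["exact"]:
        return position["start"], position["end"]
    # Reliable path: search for the quote in context, then the bare quote.
    found = text.find(quote["prefix"] + quote["exact"] + quote["suffix"])
    if found != -1:
        start = found + len(quote["prefix"])
        return start, start + len(quote["exact"])
    found = text.find(quote["exact"])
    if found != -1:
        return found, found + len(quote["exact"])
    return None


original = "The web of URL-addressable resources is infinitely large."
start = original.index("infinitely large")
selectors = describe_selection(original, start, start + len("infinitely large"))

# The page is edited after the annotation was made, shifting every offset.
edited = "Note: " + original
where = anchor(edited, selectors)
```

The position selector is quick when the page hasn't changed. When it has, the quote selector still finds the passage.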

When a segment of interest is a clip within a podcast or a video, a selector would capture the start and stop (“starts at 1 minute, 32 seconds, ends at 3 minutes, 12 seconds”). When it’s a region in a bitmapped image, a selector would capture the coordinates (“starts at x=12,y=53, ends at x=355,y=124”). When it’s a piece of a vector image, a selector would capture the Scalable Vector Graphics (SVG) markup defining that piece of the image.
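
For clips and image regions, the W3C model can express a selector as a FragmentSelector whose value follows the Media Fragments syntax. The sketch below assumes that form. The helper names are invented, and the numbers are the ones from the examples above.

```python
# Identifier the W3C annotation model uses for Media Fragments syntax.
MEDIA_FRAGS = "http://www.w3.org/TR/media-frags/"


def to_seconds(minutes, seconds):
    """Convert a minutes-and-seconds timestamp to whole seconds."""
    return minutes * 60 + seconds


def clip_selector(start_seconds, end_seconds):
    """Select a time range within an audio or video resource."""
    return {
        "type": "FragmentSelector",
        "conformsTo": MEDIA_FRAGS,
        "value": f"t={start_seconds},{end_seconds}",
    }


def region_selector(x1, y1, x2, y2):
    """Select a rectangle in an image, given two corner points."""
    # Media Fragments express a region as x, y, width, height.
    return {
        "type": "FragmentSelector",
        "conformsTo": MEDIA_FRAGS,
        "value": f"xywh={x1},{y1},{x2 - x1},{y2 - y1}",
    }


clip = clip_selector(to_seconds(1, 32), to_seconds(3, 12))
region = region_selector(12, 53, 355, 124)
```

With these inputs the clip's value is t=92,192 and the region's value is xywh=12,53,343,71, since the region is 343 pixels wide and 71 pixels high.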

The W3C’s model of web annotation lays a foundation for other kinds of selectors in other domains: locations in maps, nodes in Jupyter notebooks, bars and trend lines and data points in charts. But let’s stick with textual annotation for now, consider how it expands the universe of addressable resources, and explore what we can do in that universe.

Here’s a picture of what’s happening in and around the above-mentioned DigiPo page:

The author has cited a Hypothesis link that refers to a piece of evidence in another web page. The link encapsulates both the URL of that page and a set of selectors that mark the selected passage within it. When you follow the link, Hypothesis takes you to the page, scrolls to the passage, and highlights it. That’s a powerful interactive experience!

Now suppose you want to review all the evidence that supports this investigation. You can do it interactively but that will require a lot of context-disrupting clicks. So another program embedded in the wiki page summarizes the cited quotes for you. Where the Hypothesis direct link delivers the interactive experience, the script uses a variant of it: a Hypothesis API call that delivers the same annotation in a machine-friendly format. The summarization script collects all the Hypothesis direct links on the page, gathers the annotations, extracts the URLs and quotes, injects them into the Footnotes section of the page, and rewrites the links to point to corresponding footnotes.
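
The steps in that script can be sketched as follows. This is an illustration under stated assumptions, not the script that runs on the wiki: I assume direct links look like https://hyp.is/ followed by an annotation ID and a path, that an annotation carries a uri and a TextQuoteSelector with an exact quote, and that footnote anchors are named footnote-1, footnote-2, and so on. The fetcher is passed in as a parameter. A real one would call the Hypothesis API, while the stand-in below returns canned data so the example runs offline.

```python
import re

# Assumed shape of a direct link: https://hyp.is/<annotation id>/<page path>
DIRECT_LINK = re.compile(r"https://hyp\.is/([A-Za-z0-9_-]+)(?:/[^\s\"'<>)]*)?")


def extract_quote(annotation):
    """Pull the quoted passage out of an annotation's selectors."""
    for target in annotation.get("target", []):
        for selector in target.get("selector", []):
            if selector.get("type") == "TextQuoteSelector":
                return selector["exact"]
    return ""


def summarize(page_text, fetch_annotation):
    """Rewrite direct links as footnote references and build the footnotes."""
    footnotes = []
    numbers = {}  # annotation id -> footnote number

    def to_footnote(match):
        annotation_id = match.group(1)
        if annotation_id not in numbers:
            annotation = fetch_annotation(annotation_id)
            number = len(footnotes) + 1
            quote = extract_quote(annotation)
            footnotes.append(f'[{number}] {annotation["uri"]}: "{quote}"')
            numbers[annotation_id] = number
        return f"#footnote-{numbers[annotation_id]}"

    return DIRECT_LINK.sub(to_footnote, page_text), footnotes


def fake_fetch(annotation_id):
    """Stand-in for an API call; returns a made-up annotation."""
    return {
        "uri": "https://example.com/story",
        "target": [{"selector": [
            {"type": "TextQuoteSelector", "exact": "the quoted passage"},
        ]}],
    }


page = 'Evidence: <a href="https://hyp.is/abc123/example.com/story">source</a>'
rewritten, footnotes = summarize(page, fake_fetch)
```

Passing the fetcher in keeps the link-rewriting logic separate from the network call, so each can be tested and replaced on its own.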

To enable this magic, an app that people can use to annotate regions in web pages is necessary but not sufficient. You also need an API-accessible service that enables computers to create and retrieve annotations. Even more fundamentally, you need an open web standard that defines how apps and services work not only with atomic resources named and located by URLs, but also with segments of interest within them.

What else is possible on a shared content bus where segments of interest are directly addressable both by people and computers? Here’s one idea being pondered by some folks in the world of open educational resources (OER). Suppose you’re creating an open textbook that attaches quizzes to segments within the text. The quizzes live in a database. How do you connect a quiz to a segment in your book?

Because a quiz is a URL-addressable resource, you can transclude one directly into your book near the segment to which it applies. Doing that normally means encoding the segment’s location in the book’s markup so the software that attaches the quiz can put it in the right place. That works, but it entangles two editorial tasks: writing the book, and curating the quizzes. That entanglement makes it harder to provide tools that support the tasks individually. If you can annotate segments of interest, though, you can disentangle the tasks, tool them separately, build the book more efficiently, and ensure others can more cleanly repurpose your work.
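
A toy version of the disentangled approach might look like this. Everything in it is hypothetical: the quiz URLs, the bracketed marker format, and the idea that each quiz annotation records the exact text of its segment. The point is only that the book's text carries no quiz markup. The quizzes live in a separate layer and are attached at render time.

```python
def attach_quizzes(book_text, quiz_annotations):
    """Insert a quiz marker after each annotated segment found in the text."""
    # Locate every segment in the untouched text first.
    placements = []
    for annotation in quiz_annotations:
        found = book_text.find(annotation["exact"])
        if found != -1:
            end = found + len(annotation["exact"])
            placements.append((end, annotation["quiz_url"]))
    # Insert from the end backward so earlier offsets stay valid.
    rendered = book_text
    for offset, quiz_url in sorted(placements, reverse=True):
        rendered = rendered[:offset] + f" [quiz: {quiz_url}]" + rendered[offset:]
    return rendered


book = "Cells divide by mitosis. Gametes form by meiosis."

# The annotation layer, maintained separately from the book.
quizzes = [
    {"exact": "mitosis", "quiz_url": "https://quizzes.example/1"},
    {"exact": "meiosis", "quiz_url": "https://quizzes.example/2"},
    {"exact": "osmosis", "quiz_url": "https://quizzes.example/3"},
]
rendered = attach_quizzes(book, quizzes)
```

The third quiz targets a segment this book doesn't contain, so it is skipped. Someone repurposing the text gets a clean book and can bring the quiz layer along or leave it behind.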

Analogies are necessary but imperfect. The notion of a shared bus, formed by an annotation layer and used by applications oriented to segments of content, may or may not resonate. I’m looking for a better analogy; suggestions welcome. But however you want to think about it, the method I’m describing here works powerfully well, I’ll continue to apply it, and I’d love to discuss ways you can too.