A story about marginalia in today’s New York Times, Book Lovers Fear Dim Future for Notes in the Margins, opens with an account of a rare and otherwise undistinguished book that’s valuable only because Mark Twain scribbled in its margins:

Like many readers, Twain was engaging in marginalia, writing comments alongside passages and sometimes giving an author a piece of his mind. It is a rich literary pastime, sometimes regarded as a tool of literary archaeology, but it has an uncertain fate in a digitalized world.

“People will always find a way to annotate electronically,” said G. Thomas Tanselle, a former vice president of the John Simon Guggenheim Memorial Foundation and an adjunct professor of English at Columbia University. “But there is the question of how it is going to be preserved. And that is a problem now facing collections libraries.”

Actually it’s a problem facing everyone, and if we solve it for ourselves we’ll solve it for libraries too. The Times story wanders off into nostalgia without proposing any solution. Here’s my proposal for the next Mark Twain and for all the rest of us too: a network of cloud-based personal data stores.

When Mark Twain v1 wrote his marginalia, he had to commit them to a single physical copy of a book. His notes were available only to him, and even then not very effectively. He couldn’t search for one of his own comments. There was only one way to access it. If he wasn’t where the book was, that path was temporarily blocked. If he lost the book, he lost the comment.

Mark Twain v2 will be a citizen of the web. He will possess, among other habits of highly effective web citizens, the habit of communicating by reference rather than by value. In this case, he’d start by citing the passage he was annotating. Whether he was reading a print or electronic book, the citation would encode some facts: author, title, edition, page number, paragraph number. Writing down all those facts for every marginal note would be onerous, but Mark Twain v2 won’t have to. He’ll use software that automates the drudgery and enables him to write as simply as: “pg 52, para 3: Nonsense! I published Huck Finn…”
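Here is a minimal sketch of what that drudgery-automating software might do, assuming the reading application already knows which book is open. The shorthand grammar, the `Citation` and `Note` types, and the `parse_note` function are all hypothetical names invented for illustration; they do not describe any existing tool.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Citation:
    """The facts a citation encodes, as listed above."""
    author: str
    title: str
    edition: str
    page: int
    paragraph: int


@dataclass(frozen=True)
class Note:
    citation: Citation
    text: str


# Assumed shorthand: "pg <page>, para <paragraph>: <comment>"
NOTE_PATTERN = re.compile(
    r"^\s*pg\s+(\d+),\s*para\s+(\d+):\s*(.+)$",
    re.IGNORECASE | re.DOTALL,
)


def parse_note(shorthand: str, author: str, title: str, edition: str) -> Note:
    """Expand a terse marginal note into a full citation plus comment.

    The author, title, and edition come from the reading context,
    so the writer only types the page, the paragraph, and the comment.
    """
    match = NOTE_PATTERN.match(shorthand)
    if match is None:
        raise ValueError(f"not a recognizable note: {shorthand!r}")
    page, paragraph, text = match.groups()
    citation = Citation(author, title, edition, int(page), int(paragraph))
    return Note(citation, text.strip())
```

The point of the design is that the note lives in its own record, separate from the book. The citation is just data that points at the passage, so the same note can later be searched, moved, or shared without touching any copy of the book.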

The citation refers to, but is not part of, the book to which it refers. That’s one level of indirection, and nowadays it’s a familiar one. We have all created and used URLs that point to pages of books, or even to paragraphs within those pages.

The next level of indirection is less familiar. Mark Twain v1’s note was inscribed in the margin of a particular copy of a particular edition of a book. Mark Twain v2’s note can, at a minimum, refer to every copy of that edition. But ideally it can do better. It can refer to every copy of every edition that contains the referenced passage. That’s hard to achieve today in the realm of conventional books, which don’t afford edition-independent ways to refer to works, or to paragraphs, or to sentences within those works.

But the web shows how it can be done. There are, for example, a variety of ways to refer to a work. There’s no consensus as to which is best, and poor interoperability among the various schemes, but it’s a start.

There’s also a longstanding web tradition of intra-work citation. Back in 2000, I wrote a report called Internet Groupware for Scientific Collaboration. The document uses a technique called Purple Numbers, one of the many spinoffs from Doug Engelbart’s pioneering work, to create a URL for each paragraph. What’s more, each of those Purple Numbers (in my case, actually, they were Green Numbers) linked to a discussion board.

Amazingly, that discussion board still survives at QuickTopic, but it was never very useful. That’s because a crucial third kind of indirection was, and remains, missing in action. Ideally the authors of those QuickTopic comments would have committed their words to personal web archives under their control, and then syndicated their comments to QuickTopic. Because we still lack that capability, Mark Twain v2 can at best exploit the principle of indirection for his own purposes. He can write marginalia that refer to works, and he can store his marginalia in the cloud for anywhere/anytime access and for safekeeping.

For Mark Twain v1 that might have been plenty good enough. He could rant privately, knowing that his marginalia, like his autobiography, would be available to scholars and to the public after his death. But Mark Twain v2 expects more. Like his namesake, he wants to control access to his lifestream and assure its continuity. But he also wants selected bits of that lifestream to influence the world now instead of later, on terms he defines. A comment that refers to pg 52, para 3 in a work can be declared private or public. If public, it can syndicate to any web context that refers to pg 52, para 3 of that work, while remaining tethered to the authoritative source: a public facet of Mark Twain v2’s lifestream.
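To make the private/public distinction concrete, here is a rough sketch of a personal store that keeps every comment under its owner's control and hands out only the public ones. Everything in it is an assumption made for illustration: the `PersonalStore`, `PassageRef`, and `Comment` names, the URL layout, and the `store.example` host are hypothetical, and a real service would also need authentication, persistence, and an agreed way to identify works.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PassageRef:
    """A reference to a passage; the work identifier is a placeholder."""
    work: str
    page: int
    paragraph: int


@dataclass(frozen=True)
class Comment:
    author: str
    passage: PassageRef
    text: str
    source_url: str       # the authoritative copy in the owner's store
    public: bool = False  # private unless the owner says otherwise


class PersonalStore:
    """An in-memory stand-in for a cloud-based personal data store."""

    def __init__(self, owner: str, base_url: str) -> None:
        self.owner = owner
        self.base_url = base_url.rstrip("/")
        self._comments: list[Comment] = []

    def annotate(self, passage: PassageRef, text: str, public: bool = False) -> Comment:
        """Record a comment in the owner's store and return it."""
        comment_id = len(self._comments) + 1
        url = f"{self.base_url}/comments/{comment_id}"
        comment = Comment(self.owner, passage, text, url, public)
        self._comments.append(comment)
        return comment

    def syndicate(self, passage: PassageRef) -> list[Comment]:
        """Return the public comments on a passage.

        Each one carries source_url, so any web context that displays
        it can link back to the authoritative copy.
        """
        return [c for c in self._comments if c.public and c.passage == passage]
```

A discussion site that refers to the same passage would call something like `syndicate` and display the results, rather than holding the only copy of the words. Private comments never leave the store.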

A lot of pieces need to fall into place to enable this scenario. Happily they are, for many reasons, the right pieces.