Reading and writing for our peers

The story Jan Dawson tells in The De-Democratization of Online Publishing is familiar to me. Like him, I was thrilled to be part of the birth of personal publishing in the mid-1990s. By 2001 my RSS feedreader was delivering a healthy mix of professional and amateur sources. Through the lens of my RSS reader, stories in the New York Times were no more or less important than blog posts from my peers in the tech blogosphere. And because RSS was such a simple format, there was no technical barrier to entry. It was a golden era of media democratization not seen before or since.

As Dawson rightly points out, new formats from Google (Accelerated Mobile Pages) and Facebook (Instant Articles) are “de-democratizing” online publishing by upping the ante. These new formats require skills and tooling not readily available to amateurs. That means, he says, that “we’re effectively turning back the clock to a pre-web world in which the only publishers that mattered were large publishers and it was all but impossible to be read if you didn’t work for one of them.”

Let’s unpack that. When I worked for a commercial publisher in 2003, my charter was to bring its audience to the web and establish blogging as a new way to engage with that audience. But my situation was atypical. Most of the bloggers I read weren’t, like me, working for employers in the business of manufacturing audiences. They were narrating their work and conserving keystrokes. Were they impossible to read? On the contrary, if you shared enough interests it was impossible not to read them.

When publishers created audiences and connected advertisers to them, you were unlikely to be read widely. Those odds don’t change when Google and Facebook become the publishers; only the gatekeepers do. But when publishing is personal and social, that doesn’t matter.

One of the bloggers I met long ago, Lucas Gonze, is a programmer and a musician who curates and performs 19th-century parlour music. He reminded me that before the advent of recording and mass distribution, music wasn’t performed by a small class of professionals for large audiences. People gathered around the piano in the parlour to play and sing.

Personal online publishing once felt like that. I don’t know if it will again, but the barrier isn’t technical. The tools invented then still exist and they work just fine. The only question is whether we’ll rekindle our enthusiasm for reading and writing for our peers.

From PDF to PWP: A vision for compound web documents

I’ve been in the web publishing game since it began, and for all this time I’ve struggled to make peace with the refusal of the Portable Document Format (PDF) to wither and die. Why, in a world of born-digital documents mostly created and displayed on computers and rarely printed, would we cling to a format designed to emulate sheets of paper bound into books?

For those of us who labor to extract and repurpose the contents of PDF files, it’s a nightmare. You can get the text out of a PDF file but you can’t easily reconstruct the linear stream that went in. That problem is worse for tabular data. For web publishers, it’s a best practice to separate content assets (text, lists, tables, images) from presentation (typography, layout) so the assets can be recombined for different purposes and reused in a range of formats: print, screens of all sizes. PDF authoring tools could, in theory, enable some of that separation, but in practice they don’t. Even if they did, it probably wouldn’t matter much.

Consider a Word document. Here the tools for achieving separation are readily available. If you want to set the size of a heading you don’t have to do it concretely, by setting it directly. Instead you can do it abstractly, by defining a class of heading, setting properties on the class, and assigning the class to your heading. This makes perfect sense to programmers and zero sense to almost everyone else. Templates help. But when people need to color outside the lines, it’s most natural to do so concretely (by adjusting individual elements) not abstractly (by defining and using classes).

It is arguably a failure of software design that our writing tools don’t notice repetition of concrete patterns and guide us to corresponding abstractions. That’s true for pre-web tools like Word. It’s equally true for web tools — like Google Docs — that ape their ancestors. Let’s play this idea out. What if, under the covers, the tools made a clean separation of layout and typography (defined in a style sheet) from text, images, and data (stored in a repository)? Great! Now you can restyle your document, and print it or display it on any device. And you can share with others who work with you on any of their devices.

What does sharing mean, though? It gets complicated. The statements “I’ll send you the document” or “I’ll share the document with you” can sometimes mean: “Here is a link to the document.” But they can also mean: “Here is a copy of the document.” The former is cognitively unnatural for the same reason that defining abstract styles is. We tend to think concretely. We want to manipulate things in the digital world directly. Although we’re learning to appreciate how the link enables collaboration and guarantees we see the same version, sending or sharing a copy (which affords neither advantage) feels more concrete and therefore more natural than sending or sharing a link.

Psychology notwithstanding, we can’t (yet) be sure that the recipient of a document we send or share will be able to use it online. So, often, sending or sharing can’t just mean transferring a link. It has to mean transferring a copy. The sender attaches the copy to a message, or makes the copy available to the recipient for download.

That’s where the PDF file shines. It bundles a set of assets into a single compound document. You can’t recombine or repurpose those assets easily, if at all. But transfer is a simple transaction. The sender does nothing extra to bundle it for transmission, and the recipient does nothing extra to unbundle it for use.

I’ve been thinking about this as I observe my own use of Google Docs. Nowadays I create lots of them. My web publishing instincts tell me to create sets of reusable assets and then link them together. Instead, though, I find myself making bigger and bigger Google Docs. One huge driver of this behavior has been the ability to take screenshots, crop them, and copy/paste them into a doc. It’s massively more efficient than the corresponding workflow in, say, WordPress, where the process entails saving a file, uploading to the Media Library, and then sourcing the image from there.

Another driver has been the Google Docs table of contents feature. I have a 100-page Google Doc that’s pushing the limits of the system and really ought to be a set of interlinked files. But the workflow for that is also a pain: capture the link to A, insert it into B, capture the link to B, insert it into A. I’ve come to see the table of contents feature — which builds the TOC as a set of links derived from doc headings — as a link automation tool.
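To make the "link automation" idea concrete, here is a minimal sketch of deriving a table of contents from a list of headings. The heading objects, the `slugify` and `buildToc` names, and the anchor scheme are all assumptions made for illustration; nothing here reflects how Google Docs builds its TOC internally.

```javascript
// Hypothetical sketch: turn headings into TOC entries with anchor links.
// The slug scheme below is an assumption, not Google Docs behavior.
function slugify(text) {
  return text
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-") // collapse runs of non-alphanumerics into one hyphen
    .replace(/^-+|-+$/g, "");    // trim leading and trailing hyphens
}

function buildToc(headings) {
  // Each entry keeps its outline level so the TOC can be indented.
  return headings.map(({ level, text }) => ({
    level,
    text,
    href: "#" + slugify(text),
  }));
}

const toc = buildToc([
  { level: 1, text: "Project Overview" },
  { level: 2, text: "Open Questions (Q3)" },
]);

for (const entry of toc) {
  console.log("  ".repeat(entry.level - 1) + entry.text + " -> " + entry.href);
}
```

The point of the sketch is that every link is computed from the heading text, so nobody has to capture and paste links by hand.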

As the Google Drive at work accumulates more stuff, I’m finding it harder to find and assemble bits and pieces scattered everywhere. It’s more productive to work with fewer but larger documents that bundle many bits and pieces together. If I send you a link to a section called out in the TOC, it’s as if I sent you a link to an individual document. But you land in a context that enables you to find related stuff by scanning the TOC. That can be a more reliable method of discovery, for you, than searching the whole Google Drive.

Can’t I just keep an inventory of assets in a folder and point you to the folder? Yes, but I’ve tried, and it feels way less effective. I think there are two reasons why. First, there’s the overhead of creating and naming the assets. Second, the TOC conveys outline structure that the folder listing doesn’t.

This method is woefully imperfect for all kinds of reasons. A 100-page Google Doc is an unwieldy construct. Anonymous assets can’t be found by search. Links to headings lack human-readable information. And yet it’s effective because, I am coming to realize, there’s an ancient and powerful technology at work here. When I create a Google Doc in this way I am creating something like a book.

This may explain why the seeming immortality of the PDF format is less crazy than I have presumed. Even so, I’m still not ready to ante up for Acrobat Pro. I don’t know exactly what a book that’s born digital and read on devices ought to be. I do know a PDF file isn’t the right answer. Nor is a website delivered as a zip file. We need a thing with properties of both.

I think a W3C Working Draft entitled Portable Web Publications for the Open Web Platform (PWP) points in the right direction. Here’s the manifesto:

Our vision for Portable Web Publications is to define a class of documents on the Web that would be part of the Digital Publishing ecosystem but would also be fully native citizens of the Open Web Platform.

PWP usefully blurs distinctions along two axes.

That’s exactly what’s needed to achieve the goal. We want compound documents to be able to travel as packed bundles. We want to address their parts individually. And we want both modes available to us regardless of whether the documents are local or remote.

Because a PWP will be made from an inventory of managed assets, it will require professional tooling that’s beyond the scope of Google Docs or Word Online. Today it’s mainly commercial publishers who create such tools and use them to take apart and reconstruct the documents — typically still Word files — sent to them by authors. But web-native authoring tools are emerging, notably in scientific publishing. It’s not a stretch to imagine such tools empowering authors to create publication-ready books in PWP. It’s more of a stretch to imagine successors to Google Docs and Word Online making that possible for those of us who create book-like business documents. But we can dream.

Customer service and human dignity

It’s been a decade since I interviewed Paul English on the subject of customer service and human dignity (audio). He was CTO and co-founder at Kayak, but in this interview we talked more about GetHuman. It had begun as a list of cheats to help you hack through the automated defenses of corporate customer service and get to a real person. Here’s how I remember The IVR Cheat Sheet back then:

finance                     phone         steps to find a human
America First Credit Union  800-999-3961  0 or say “member services”
American Express            800-528-4800  0 repeatedly
Bank of America             800-900-9000  00 or dial 813-882-1103 for Executive Office.
Bank of America             800-622-8731  *
Bank of America             800-432-1000  Say “operator” or “associate” at any point in the menu.
Charles Schwab              800-435-9050  3, 0
Chase                       800-CHASE24   5 pause 1 4
Chrysler Financial          800-700-0738  Select language, then press 00
Citi AAdvantage             888-766-2484  Ignore prompts and wait for a human.
Citi Card                   800-967-8500  0,0,0,0,0

In our interview Paul said:

Dignity is defined in part as giving people the right to make decisions. In particular if it’s a company I’m paying $100/month for cable or cell phone or whatever, and they don’t give me the ability to decide when I need to talk to a human, I find it really insulting.

When the CEO makes the terrible decision to treat customer service as a cost center, the bonus for the VP who runs it is based on one thing: shaving pennies off the cost of the call.

I responded:

Which is a tragedy because customer service is a huge opportunity for business differentiation. If we set up a false dichotomy, where it’s either automated or human, we’re missing out on the real opportunity which is to connect the right people to the right context at the right time. That’s what needs to happen, but a tricky thing to orchestrate and there doesn’t seem to be any vision for how to do that.

I’ve used GetHuman for 10 years. Yesterday I went there to gird for battle with Comcast and was delighted to see that the service has morphed into this:

Boston-based startup GetHuman on Wednesday unveiled a new service that lets you pay $5 to $25 to hire a “problem solver” who will call a company’s customer service line on your behalf to resolve issues. Prices vary depending on the company, but GetHuman offers to fight for your airline refund, deal with Facebook account issues, or perhaps even prevent a grueling call with Comcast to disconnect your service.

— CNET, May 4, 2016

I’m really curious about their hands-off problem-solving service and will try it in other circumstances, but my negotiation with Comcast was going to require my direct involvement. So this free call-back service made my day:

How our Comcast call-back works

First we call Comcast, wade through their phone maze, wait on hold for you, and then call you back when an agent can talk. We try 4 times, in case we don’t get through the first time. Of course, once you do talk to a Comcast rep, you still have to do the talking, negotiating, etc.

I went back to work. The call came. Normally I’d be feeling angry and humiliated in this situation. Instead I felt happy and empowered. Companies have used their robots to thwart me all these years. Now I’ve got a robot on my side of the table. It’s on!

A chorus of IT recipes

My all-time favorite scene in The Matrix, if not in all of moviedom, is the one where Trinity needs to know how to fly a helicopter. “Tank, I need a pilot program for a B-212.” Her eyelids flutter while she downloads the skill.

I always used to think there was just one definitive flight instruction implant. But lately, thanks to Ward Cunningham and Mike Caulfield, I’ve started to imagine it a different way.

Here’s a thing that happened today. I needed to test a contribution from Ned Zimmerman that will improve the Hypothesis WordPress plugin. The WordPress setup I’d been using had rotted, it was time for a refresh, and the way you do that nowadays is with a tool called Docker. I’d used it for other things but not yet for WordPress. So of course I searched:

wordpress docker ubuntu

A chorus of recipes came back. I picked the first one and got stuck here with this sad report:

'module' object has no attribute 'connection'

Many have tried to solve this problem. Some have succeeded. But for my particular Linux setup it just wasn’t in the cards. Pretty quickly I pulled the plug on that approach, went back to the chorus, and tried another recipe which worked like a charm.

The point is that there is no definitive recipe for the task. Circumstances differ. There’s a set of available recipes, some better than others for your particular situation. You want to be able to discover them, then rapidly evaluate them.

Learning by consulting a chorus is something programmers and sysadmins take for granted because a generation of open source practice has built a strong chorus. The band’s been together for a long time, and the community knows the tunes.

Can this approach help us master other disciplines? Yes, but only if the work of practitioners is widely available online for review and study. Where that requirement is met, choral explanations ought to be able to flourish.

Augmenting journalism

Silicon Valley’s enthusiasm for a universal basic income follows naturally from a techno-utopian ideology of abundance. As robots displace human workers, they’ll provide more and more of the goods and services that humans need, faster and cheaper and better than we could. We’ll just need to be paid to consume those goods and services.

This narrative reveals a profound failure of imagination. Our greatest tech visionary, Doug Engelbart, wanted to augment human workers, not obsolete them. If an automated economy can free people from drudgework and — more importantly — sustain them, I’m all for it. But I believe that many people want to contribute if they can. Some want to teach. Some want to care for the elderly. Some want to build affordable housing. Some want to explore a field of science. Some want to grow food. Some want to write news stories about local or global issues.

Before we pay people simply to consume, why wouldn’t we subsidize these jobs? People want to do them, but too few are available and they pay too poorly. Expanding these workforces would benefit everyone.

The argument I’ll make here applies equally to many kinds of jobs, but I’ll focus on journalism because my friend Joshua Allen invited me to respond to a Facebook post in which he says, in part:

We thought we were creating Borges’ Library of Babel, but we were haplessly ushering in the surveillance state and burning down the journalistic defenses that might have protected us from ascendant Trump.

Joshua writes from the perspective of someone who, like me, celebrated an era of technological progress that hasn’t served society in the ways we imagined it would. But we can’t simply blame the web for the demise of journalism. We mourn the loss of an economic arrangement — news as a profit-making endeavor — that arguably never ought to have existed. At the dawn of the republic it did not.

This is a fundamental of democratic theory: that you have to have an informed citizenry if you’re going to have not even self-government, but any semblance of the rule of law and a constitutional republic, because people in power will almost always gravitate to doing things to benefit themselves that will be to the harm of the Republic, unless they’re held accountable, even if they’re democratically elected. That’s built into our constitutional system. And that’s why the framers of the Constitution were obsessed with a free press; they were obsessed with understanding if you don’t have a credible press system, the Constitution can’t work. And that’s why the Framers in the first several generations of the Republic, members of Congress and the President, put into place extraordinary press subsidies to create a press system that never would have existed had it been left to the market.

— Robert McChesney, in Why We Need to Subsidize Journalism. An Exclusive Interview with Robert W. McChesney and John Nichols

It’s true that a universal basic income would enable passionate journalists like Dave Askins and Mary Morgan to inform their communities in ways otherwise uneconomical. But we can do better than that. The best journalism won’t be produced by human reporters or robot reporters. It will be a collaboration among them.

The hottest topic in Silicon Valley, for good reason, is machine learning. Give the machines enough data, proponents say, and they’ll figure out how to outperform us on tasks that require intelligence — even, perhaps, emotional intelligence. It helps, of course, if the machines can study the people now doing those tasks. So we’ll mentor our displacers, show them the ropes, help them develop and tune their algorithms. The good news is that we will at least play a transitional role before we’re retired to enjoy our universal basic incomes. But what if we don’t want that outcome? And what if it isn’t the best outcome we could get?

Let’s change the narrative. The world needs more and better journalism. Many more want to do that journalism than our current economy can sustain. The best journalism could come from people who are augmented by machine intelligence. Before we pay people to consume it, let’s pay some of them to partner with machines in order to produce quality journalism at scale.

I get to be a blogger

To orient myself to Santa Rosa when we arrived two years ago I attended a couple of city council meetings. At one of them I heard a man introduce himself in a way that got my attention. “I’m Matt Martin,” he said, “and I get to be the executive director of Social Advocates for Youth.” I interpreted that as: “It is my privilege to be the director of SAY.” Last week at a different local event I heard the same thing from another SAY employee. “I’m Ken Quinto and I get to be associate director of development for SAY.” I asked Ken if I was interpreting that figure of speech correctly and he said I was.

Well, I get to be director of partnership and integration for Hypothesis and also a blogger. Former privileges include: evangelist for Microsoft, pioneering blogger for InfoWorld, freelance web developer and consultant, podcaster for ITConversations, columnist for various tech publications, writer and editor and web developer for BYTE. In all these roles I’ve gotten to explore technological landscapes, tackle interesting problems, connect with people who want to solve them, and write about what I learn.

Once, and for a long time, the writing was my primary work product. When blogging took off in the early 2000s I became fascinated with Dave Winer’s notion that narrating your work — a practice more recently called observable work and working out loud — made sense for everyone, not just writers who got paid to write. I advocated strongly for that practice. But my advice came from a place of privilege. Unlike most people, I was getting paid to write.

I still get to tackle interesting problems and connect with people who want to solve them. But times have changed. For me (and many others) that writing won’t bring the attention or the money that it once did. It’s been hard — really hard — to let go of that. But I’m still the writer I always was. And the practice of work narration that I once advocated from a position of privilege still matters now that I’ve lost that privilege.

The way forward, I think, is to practice what I long preached. I can narrate a piece of work, summarize what I’ve learned, and invite fellow travelers to validate or revise my conclusions. The topics will often be narrow and will appeal to small audiences. Writing about assistive technology, for example, won’t make pageview counters spin. But it doesn’t have to. It only needs to reach the people who care about the topic, connect me to them, and help us advance the work.

Doing that kind of writing isn’t my day job anymore, and maybe never will be again. But I get to do it if I want to. That is a privilege available to nearly everyone.

Towards accessible annotation: a prototype and some questions

The most basic operation in Hypothesis — select text on a page, click the Annotate button — is not yet accessible to a visually-impaired person who is using a screenreader. I’ve done a bit of research and come up with an approach that looks like it could work, but also raises many questions. In the spirit of keystroke conservation I want to record here what I think I know, and try to find out what I don’t.

Here’s a screencast of an initial prototype that shows, with the NVDA screen reader active on my system, the following sequence of events:

  • Load the Gettysburg address.
  • Use a key to move a selection from paragraph to paragraph.
  • Hear the selected paragraph.
  • Tab to the Annotate button and hit Enter to annotate the selected paragraph.

It’s a start. Now for some questions:

1. Is this a proper use of the aria-live attribute?

The screenreader can do all sorts of fancy navigation, like skip to the next word, sentence, or paragraph. But its notion of a selection exists within a copy of the document and (so far as I can tell) is not connected to the browser’s copy. So the prototype uses a mechanism called ARIA Live Regions.

When you use the hotkey to advance to a paragraph and select it, a JavaScript method sets the aria-live attribute on that paragraph. That alone isn’t enough to make the screenreader announce the paragraph; it just tells the screenreader to watch the element and read it aloud if it changes. To effect a change, the JS method prepends selected: to the paragraph. Then the screenreader speaks it.
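The two steps just described (mark the paragraph as a live region, then change it) can be sketched roughly as follows. This is not the prototype's actual code: the function name `announceSelection`, the `doc` parameter, and the choice of the "polite" setting are assumptions made for illustration.

```javascript
// A sketch of the technique, under the assumptions stated above.
// `paragraph` is any DOM element; `doc` is the page's `document`.
function announceSelection(paragraph, doc) {
  // Step 1: ask the screenreader to watch this element for changes.
  // "polite" is an assumed setting; the source does not say which value is used.
  paragraph.setAttribute("aria-live", "polite");

  // Step 2: change the element so there is something to announce.
  const marker = doc.createTextNode("selected: ");
  paragraph.insertBefore(marker, paragraph.firstChild);

  // Return the marker so the caller can remove it when the selection moves on.
  return marker;
}

// In a browser: announceSelection(document.querySelector("p"), document);
```

Inserting a text node, rather than rewriting the paragraph's text wholesale, leaves any links or emphasis inside the paragraph intact. Whether a given screenreader reliably announces a region that becomes live and changes in the same step is exactly the kind of thing question 1 is asking about.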

2. Can JavaScript in the browser relate the screenreader’s virtual buffer to the browser’s Document Object Model?

I suspect the answer is no, but I’d love to be proven wrong. If JS in the browser can know what the screenreader knows, the accessibility story would be much better.

3. Is this a proper use of role="link"?

The first iteration of this prototype used a document that mixed paragraphs and lists. Both were selected by the hotkey, but only the list items were read aloud by the screen reader. Then I realized that’s because list items are among the set of things — links, buttons, input boxes, checkboxes, menus — that are primary navigational elements from the screenreader’s perspective. So the version shown in the screencast adds role="link" to the visited-and-selected paragraph. That smells wrong, but what’s right?

4. Is there a polyfill for Selection.modify()?

Navigating by element — paragraph, list item, etc. — is a start. But you want to be able to select the next (or previous) word or sentence or paragraph or table cell. And you want to be able to extend a selection to include the next word or sentence or paragraph or table cell.

A non-standard technology, Selection.modify(), is headed in that direction, and works today in Firefox and Chrome. But it’s not on a standards track. So is there a library that provides that capability in a cross-browser fashion?
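For reference, a call to Selection.modify() takes three arguments: whether to move or extend, a direction, and a granularity. Here is a minimal sketch with feature detection. The wrapper name `extendSelection` and its return convention are assumptions for illustration, and browsers may differ in which granularities they honor.

```javascript
// Hypothetical wrapper around Selection.modify().
// Returns true if the call was made, false if the method is unavailable.
function extendSelection(selection, granularity, direction = "forward") {
  if (!selection || typeof selection.modify !== "function") {
    return false; // no native support; a polyfill would have to step in here
  }
  // "extend" grows the current selection; "move" would relocate the caret instead.
  selection.modify("extend", direction, granularity);
  return true;
}

// In a browser: extendSelection(window.getSelection(), "word");
```

A cross-browser library would slot in at the point where this sketch returns false.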

It’s a hard problem. A selection within a paragraph that appears to grab a string of characters is, under the covers, quite likely to cross what are called node boundaries. Here, from an answer on StackOverflow, is a picture of what’s going on:

When a selection includes a superscript3 as shown here, it’s obvious to you what the text of the selection should be: 123456790. But that sequence of characters isn’t readily available to a JavaScript program looking at the page. It has to traverse a sequence of nodes in the browser’s Document Object Model in order to extract a linear stream of text.
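To illustrate the traversal, here is a minimal sketch that walks a node and its descendants in document order and concatenates the text nodes. The helper name `linearText` and the plain-object stand-ins for DOM nodes are assumptions made so the example runs outside a browser; it handles a whole subtree, whereas a real selection would also need the start and end offsets from a Range.

```javascript
const TEXT_NODE = 3; // same value as Node.TEXT_NODE in the browser

// Hypothetical helper: flatten a node tree into one linear string.
function linearText(node) {
  if (node.nodeType === TEXT_NODE) {
    return node.nodeValue;
  }
  let text = "";
  for (const child of Array.from(node.childNodes || [])) {
    text += linearText(child); // recurse across node boundaries
  }
  return text;
}

// A stand-in for <p>ab<sup>3</sup>cd</p>, built from plain objects.
// Real DOM nodes expose the same nodeType, nodeValue, and childNodes properties.
const textNode = (value) => ({ nodeType: TEXT_NODE, nodeValue: value });
const element = (...children) => ({ nodeType: 1, childNodes: children });
const paragraph = element(textNode("ab"), element(textNode("3")), textNode("cd"));

console.log(linearText(paragraph)); // prints "ab3cd"
```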

It’s doable, and in fact Hypothesis does just that when you make a selection-based annotation. That gets harder, though, when you want to move or extend that selection by words and paragraphs. So is there a polyfill for Selection.modify()? The closest I’ve found is rangy. Are there others?

5. What about key bindings?

The screen reader reserves lots of keystrokes for its own use. If it’s not going to be possible to access its internal representation of the document, how will there be enough keys left over for rich navigation and selection in the browser?