AOL’s Patch enshrines the event anti-pattern

An anti-pattern, Wikipedia says, is a design pattern “that may be commonly used but is ineffective and/or counterproductive in practice.” The example most familiar to me is the password anti-pattern which describes sites that use your credentials to impersonate you on other sites. My work on the elmcity project has surfaced a few other anti-patterns, including one I call the submit-your-event anti-pattern. On websites with events pages, every invitation to send in event information by email, or to type it into a web form, is an example of this anti-pattern. Why? It breaks most of the seven habits of highly effective web citizens, especially #1 (“Be the authoritative source for your own data”) and #2 (“Pass by reference not by value”).

The pattern I recommend invites sources of event data to submit it by publishing feeds, and aggregators of event data to acquire it by subscribing to feeds. Since the legacy forms-based method has such deep roots, you might want to keep it going while offering the feed-based method as a preferred alternative.

The submit-your-event anti-pattern is being rolled out in a major way at Patch, AOL’s foray into local news. On every events page, like this one for East Providence, RI, Patch invites you to sign up, log in, and populate its database. I had noticed this in my travels through the eventsphere, and thought of it again when I read Ken Auletta’s (paywalled) New Yorker article this week, which profiles Tim Armstrong’s mission to save AOL by building a network of local news hubs. Here was the inspiration for the Patch events service:

While he was at Google, Armstrong had his revelation about local news. One Saturday morning in 2007, he and his children were driving home from a bagel store half a mile from their home, in Riverside, Connecticut. At a stoplight, they pulled over to look at the hand-lettered signs that residents had stuck in the grass to advertise local events. There was no online listing of events in Riverside, and the Greenwich Time lacked a calendar. Armstrong called the newspaper and introduced himself as a resident and told an editor that the paper was missing out on a terrific business opportunity.

“We really don’t need any help. We have a fine business,” the editor told him, before saying thanks and hanging up.

This was crazy, Armstrong recalls thinking. He lived in one of the wealthiest towns in America, yet he had to drive to a stoplight to find out what to do with his family.

[Photo: event posters, Keene, spring 2005]

I had a similar revelation back in 2005, when I walked around Keene, photographed all the event posters on shop windows and kiosks, and compared the results to listings in print and online. The posters collectively told much more about goings-on in town than did any of the listings, or indeed all of the listings combined.

A few years later, I came to a very different conclusion than the one Armstrong reached. The kiosk-and-shop-window system was outperforming the web because it was, in one crucial way, more weblike than any existing web-based events system. You don’t have to ask permission to post flyers, you just post them where they can be seen by everyone.

Imagine if the web used a submit-your-web-page anti-pattern. To post a page to the web you’d visit a site, register, log in, fill in a form, hit submit, and wait for approval. Well of course you can’t imagine that, because there wouldn’t be a web if things had to work that way.

To work as well as the web, the eventsphere has to work like the web. If you have events to promote, you post them to your own site. That’s the authoritative source. The information is displayed there as text for people to read, but is also available as data for people and machines to syndicate. Local media hubs don’t get to be the exclusive owners of a database of events because there is no database of events, there are only feeds from authoritative sources. They do, if they’re savvy web thinkers, get to play a preeminent role in the eventsphere by inviting contributors to light up their feeds, and by offering rules and tools that help contributors manage them.

Seven ways to think like the web

Update: For a simpler formulation of the ideas in this essay, see Doug Belshaw’s Working openly on the web: a manifesto.

Back in 2000, the patterns, principles, and best practices for building web information systems were mostly anecdotal and folkloric. Roy Fielding’s dissertation on the web’s deep architecture provided a formal definition that we’ve been digesting ever since. In his introduction he wrote that the web is “an Internet-scale distributed hypermedia system” that aims to “interconnect information networks across organizational boundaries.” His thesis helped us recognize and apply such principles as universal naming, linking, loose coupling, and disciplined resource design. These are not only engineering concerns. Nowadays they matter to everyone. Why? Because the web is a hybrid information system co-created by people and machines. Sometimes computers publish our data for us, and sometimes we publish it directly. Sometimes machines subscribe to what machines and people publish, sometimes people do.

Given the web’s hybrid nature, how can we teach people to make best use of this distributed hypermedia system? That’s what I’ve been trying to do, in one way or another, for many years. It’s been a challenge to label and describe the principles I want people to learn and apply. I’ve used the terms computational thinking, Fourth R principles, and most recently Mark Surman’s evocative thinking like the web.

Back in October, at the Traction Software users’ conference, I led a discussion on the theme of observable work in which we brainstormed a list of some principles that people apply when they work well together online. It’s the same list that emerges when I talk about computational thinking, or Fourth R principles, or thinking like the web. Here’s an edited version of the list we put up on the easel that day:

  1. Be the authoritative source for your own data

  2. Pass by reference not by value

  3. Know the difference between structured and unstructured data

  4. Create and adopt disciplined naming conventions

  5. Push your data to the widest appropriate scope

  6. Participate in pub/sub networks as both a publisher and a subscriber

  7. Reuse components and services

1. Be the authoritative source for your own data

In the elmcity context, that means regarding your own website, blog, or online calendar as the authoritative source. More broadly, it means publishing facts about yourself, or your organization, to a place on the web that you control, and that is bound in some way to your identity.

Why?

To a large and growing extent, your public identity is what the web knows about your ideas, activities, and relationships. When that knowledge isn’t private, your interests are best served by publishing it to online spaces that you control and use for the purpose.

Related

Mastering your own search index, Hosted lifebits

2. Pass by reference not by value

In the case of calendar events, you’re passing by value when you send copies of your data to event sites in email, or when you log into an events site and recopy data that you’ve already written down for yourself and published on your own site.

You’re passing by reference when you publish the URL of your calendar feed and invite people and services to subscribe to your feed at that URL.

Other examples include sending somebody a link to an article instead of a copy of the article, or uploading a file to Dropbox and sharing the URL.

Why?

Nobody else cares about your data as much as you do. If other people and other systems source your data from a canonical URL that you advertise and control, then they will always get data that’s as timely and accurate as you care to make it.

Also, when you pass by reference you’re enabling reuse (see 7 below). The resources you publish can be recombined, by you and by others, with other resources published by you and by others.

Finally, a canonical URL helps you measure how the web reacts to your data. If the URL is cited elsewhere you can discover those citations, and you can evaluate the context that surrounds them.

Related

The principle of indirection, Hyperlinks matter

3. Know the difference between structured and unstructured data

When you create an events page on your website, and the calendar on that page is an HTML file or a PDF file, you’re posting unstructured data. This is information that people can read and print, and it’s fine for that purpose. But it’s not data that networked computers can process.

When you publish an iCalendar feed in addition to your HTML- or PDF-based calendar, you’re publishing data that machines can work with.
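To make the contrast concrete, here is a minimal sketch of what such a feed contains and how little it takes to produce one. It is an illustration under stated assumptions, not how any particular calendar program does it: the function names, the PRODID string, and the sample event are all hypothetical, and the sketch skips details a production feed would need, such as folding long lines and time zone definitions.

```python
from datetime import datetime, timezone


def format_utc(dt):
    """Format a timezone-aware datetime as an iCalendar UTC timestamp."""
    return dt.astimezone(timezone.utc).strftime("%Y%m%dT%H%M%SZ")


def escape_text(value):
    """Escape backslash, semicolon, comma, and newline in a text value."""
    return (value.replace("\\", "\\\\")
                 .replace(";", "\\;")
                 .replace(",", "\\,")
                 .replace("\n", "\\n"))


def build_feed(events, prodid="-//example.org//hypothetical feed//EN"):
    """Build a bare-bones iCalendar document from a list of event dicts.

    Each event dict is assumed to carry 'uid', 'stamp', 'start', 'summary'.
    """
    lines = ["BEGIN:VCALENDAR", "VERSION:2.0", f"PRODID:{prodid}"]
    for ev in events:
        lines += [
            "BEGIN:VEVENT",
            f"UID:{ev['uid']}",
            f"DTSTAMP:{format_utc(ev['stamp'])}",
            f"DTSTART:{format_utc(ev['start'])}",
            f"SUMMARY:{escape_text(ev['summary'])}",
            "END:VEVENT",
        ]
    lines.append("END:VCALENDAR")
    # iCalendar content lines are terminated by CRLF.
    return "\r\n".join(lines) + "\r\n"


# Hypothetical sample event, for illustration only.
sample = {
    "uid": "dance-2011-02-05@example.org",
    "stamp": datetime(2011, 1, 20, 12, 0, tzinfo=timezone.utc),
    "start": datetime(2011, 2, 5, 19, 0, tzinfo=timezone.utc),
    "summary": "Contra dance, all welcome",
}
print(build_feed([sample]))
```

The point is not that anyone should write this by hand. Calendar programs already emit this format; the sketch only shows that the same facts a person reads on an HTML page can sit alongside it in a form a machine can parse.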

Perhaps the most familiar example is your blog, if you have one. Your blog publishing software creates an HTML page for people to read. But at the same time it creates an RSS or Atom feed that enables feedreaders, or blog aggregation services, to automatically collect your entries and merge them with entries from other blogs.

Why?

The web is a human/machine hybrid. If you contribute data in formats useful only to people, you sacrifice the network effects that the machines can promote. If you also contribute in formats the machines understand, they can share your stuff amongst themselves, convey it to more people than you can reach through word-of-mouth human networks, and enable hybrid human/machine intelligence to work with it.

Related

The laws of information chemistry, Developing intuitions about data

4. Create and adopt disciplined naming conventions

When people publish calendars into elmcity hubs, they can assign unique and meaningful URLs and/or tags to each event they publish. And they can collaborate with curators of hubs to use tag vocabularies that define virtual collections of events.

The same strategies work in all web contexts. Most familiar is the first order of business at every conference attended by web thinkers: “The tag for this conference is ______.” When people agree to use common names in shared data spaces, effects like aggregation, routing, and targeted search require no special software.
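As a rough sketch of why agreed-upon names need no special software, consider how little code it takes to turn tags into virtual collections. The event dictionaries, their 'url' and 'tags' keys, and the sample tags below are hypothetical stand-ins, not elmcity's actual data model.

```python
from collections import defaultdict


def index_by_tag(events):
    """Group event URLs by normalized tag.

    Each event is assumed to be a dict with a 'url' and an optional
    list of 'tags'. Tags are lowercased and trimmed so that 'Music'
    and ' music ' land in the same collection.
    """
    index = defaultdict(list)
    for ev in events:
        for tag in ev.get("tags", []):
            index[tag.strip().lower()].append(ev["url"])
    return dict(index)


# Hypothetical events from two different publishers.
events = [
    {"url": "http://example.org/events/1", "tags": ["Music", "family"]},
    {"url": "http://example.com/cal/7", "tags": [" music "]},
    {"url": "http://example.com/cal/8"},
]
collections = index_by_tag(events)
print(collections["music"])
```

Two publishers who never coordinated beyond agreeing on the word "music" end up in the same collection. The aggregation is a side effect of the naming convention.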

Why?

The web’s supply of unique names (e.g., URLs, tags) is infinite. The namespace that you can control, by choosing URLs and tags for the things you post, is smaller but still infinite. Web thinkers use thoughtful, rigorous naming conventions to manage their own personal information and, at the same time, to enable network effects in shared data spaces.

Related

Heds, deks, and ledes, The power of informal contracts, Permalinks and hashtags for city council agenda items, Scribbling in the margins of iCalendar

5. Push your data to the widest appropriate scope

When you speak in electronic spaces you can address audiences at varying scopes. An email message addresses one or several people; a blog post on a company intranet can address the whole company; a blog post on the public web can address the whole world. Web thinkers know that keystrokes invested to capture and transmit knowledge will pay the highest dividends when routed to the widest appropriate scope.

The elmcity example: a public calendar of events can be managed in what is notionally a personal calendar application, say, Google Calendar or Outlook, but one that can post data to a public URL.

For bloggers, this principle governs the choice to explain what you think, learn, and do on your public blog (when appropriate) rather than in private communication.

Why?

Unless confidentiality precludes the choice, web thinkers prefer shared data spaces to private ones because they enable directed or serendipitous discovery and ad-hoc collaboration.

Related

Too busy to blog? Count your keystrokes

6. Participate in pub/sub networks as both a publisher and a subscriber

Our everyday calendar programs are, in blog parlance, both feed publishers and feed readers. Individuals and organizations can publish their own feeds to the web of calendar data while at the same time subscribing to others’ feeds. On a larger scale, an elmcity hub subscribes to a set of feeds, and in turn publishes a feed to which other individuals (or hubs) can subscribe.
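Here is a minimal sketch of that larger-scale pattern: a hub that takes in several feeds and republishes one merged feed. It is not the elmcity implementation. The function names and PRODID are hypothetical, feeds are passed in as text rather than fetched from their URLs, and the parsing ignores folded lines and time zone components that a real hub would have to handle.

```python
def extract_events(ics_text):
    """Return the VEVENT blocks of an iCalendar document as lists of lines."""
    events, current = [], None
    for line in ics_text.splitlines():
        if line == "BEGIN:VEVENT":
            current = [line]
        elif current is not None:
            current.append(line)
            if line == "END:VEVENT":
                events.append(current)
                current = None
    return events


def event_uid(event_lines):
    """Return the UID of an event block, or None if it has none."""
    for line in event_lines:
        if line.startswith("UID:"):
            return line[len("UID:"):]
    return None


def merge_feeds(feed_texts, prodid="-//example.org//hypothetical hub//EN"):
    """Subscribe side: read many feeds. Publish side: emit one merged feed.

    Events that share a UID are kept only once, first occurrence wins.
    """
    seen, merged = set(), []
    for text in feed_texts:
        for ev in extract_events(text):
            uid = event_uid(ev)
            if uid is not None:
                if uid in seen:
                    continue
                seen.add(uid)
            merged.append(ev)
    out = ["BEGIN:VCALENDAR", "VERSION:2.0", f"PRODID:{prodid}"]
    for ev in merged:
        out.extend(ev)
    out.append("END:VCALENDAR")
    return "\r\n".join(out) + "\r\n"
```

Because the output is itself an iCalendar document, another hub could pass it to merge_feeds as one of its inputs, which is the sense in which a hub is both subscriber and publisher.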

Why?

The blog ecosystem is the best example of pub/sub syndication among heterogeneous endpoints through intermediary services. Similar effects can happen in social media, and they happen in ways that people find easier to understand, but they happen within silos: Facebook, Twitter. Web thinkers know that standard protocols and formats enable syndication that crosses silos and supports the most open kinds of collaboration.

Related

Personal data stores and pub/sub networks

7. Reuse components and services

In the elmcity context, calendar programs are used in several complementary ways. They combine personal information management (e.g., keeping track of your own organization’s public calendar) with public information management (e.g., publishing the calendar).

In another sense they serve the needs of humans who read those calendars on the web while also supporting mechanical services (like elmcity) that subscribe to and syndicate the calendars.

In general, a reusable web resource is:

  1. Effectively named
  2. Properly structured
  3. Densely interconnected (linked) both within and beyond itself
  4. Appropriately scoped

Why?

The web’s “small pieces loosely joined” architecture echoes what in another era we called the Unix philosophy. Web thinkers design reusable parts, and also reuse such parts where possible, because they know that the web both embodies and rewards this strategy.

Related

How will the elmcity service scale? Like the web!, How to manage private and public calendars together

Inviting Toronto to think like the web

In 2006, operating quietly behind the scenes, Dan Thomas and Suzanne Peck alerted me to what would become the municipal open data movement we know today. As the ball got rolling, though, I felt that something was missing. It’s great that citizens are now learning to expect access to municipal data, and to expect useful online services to flow from such access. But citizens are providers of data too. We need to expect one another to provide the data for which we are individually and collectively authoritative.

One of my favorite taglines for the municipal open data movement is Mark Surman’s evocative phrase: cities that think like the web. The elmcity project aims to help cities do that. Next week I’ll be in Toronto for a series of meetings and also a public talk. I want to suggest that in cities that think like the web, citizens understand and apply “fourth R” principles. They know something about how data can be structured for humans to read versus for computers to process. They recognize that pub/sub syndication is a good way to merge their own data into the public ecosystem. They take responsibility for publishing their own data in useful ways, and they expect their fellow citizens to do the same.

If you’re a Torontonian who’s interested in these ideas, we’ll be discussing them on Tuesday afternoon at the University of Toronto’s Cities Centre (John H. Daniels Faculty of Architecture, Landscape, and Design, 230 College Street, Room 103). And if you know Torontonians who aren’t technical but who care about these ideas, please do alert them. They’re the ones I particularly need to reach.

Location-tagged events in elmcity hubs

The elmcity project’s single biggest hurdle continues to be a conceptual one. People mostly lack the intuition that it’s possible — never mind easy and free — to publish data that can syndicate. In response to an earlier item on this topic, Stefano Mazzocchi (Cocoon, Simile, and Google Refine) offered some thoughts which I’m sharing with his permission:

A few weeks ago, my in-laws were visiting. She is a pretty famous book author and we were talking about how technology could bring value to her workflow (she is not flat out an IT luddite but close enough).

She just created her first web site and asked me how she could promote it on search engines (classic newbie SEO question). She was not interested in the mechanics at all, she just wanted more exposure.

We talked about Twitter and about how publishers and authors use it to promote themselves and engage their audiences. She thought all this was very “Hollywood” and not her style at all. But I showed her that you don’t need to use Twitter that way, you can just mine it for your ego network. Then I explained how I set up all sorts of traps around the web, with newsfeeds, and how I use Google Reader to aggregate them all for me.

I showed her right there and then. Searched for her new book name on Twitter, clicked on the RSS feed, did the same on Google blog search, Google news search, and voila, her personal PR aggregation network was born.

She was completely blown away. She didn’t know any of this was even remotely possible, yet, once explained, it made perfect sense. It’s like having personal agents watching everything that goes on and sending you the information. Email versus RSS doesn’t make any difference to her. As long as she has a place to go and check out what others say about her, she’s happy.

My take: no tech-unsavvy person thinks it reasonable to have a personal agent that does, for individuals and for free, what gigantic organizations struggle to do every day.

The fact that Google can search 15 billion pages in milliseconds doesn’t faze them as much. If librarians can do it, so does Google. Big deal.

But personal agents constantly working in the cloud for you? It doesn’t even show up in the realm of possibilities.

If Stefano’s in-law were on Facebook, of course, she’d be getting a sense of what it’s like to have one of those agents in the cloud. Her activity stream would magically be visible to friends, and their reactions to it would magically be visible to her. That’s why I often say, nowadays, that Facebook is a great set of training wheels for the pub/sub network.

But Facebook isn’t, yet, a place where people can learn how to publish data that syndicates beyond Facebook. It’s possible, as I discussed in Heds, deks, and ledes, to post public events on Facebook in a way that can be discovered by an elmcity hub or by some other agent. If you don’t know such agents can and do exist, though, you’ll never stop to think about whether they’re actually finding your data — and if not, how to make sure that they do.

One of the key points embedded in Stefano’s parable is that his in-law didn’t have to do anything special in order to be able to find the web’s reaction to her book. To the extent that her name and the name of her book are out there and indexed, they provide good-enough hooks for search aggregation.

Over time, of course, the efficacy of these searches will decay. I’ve watched this happen with my own name. Years back, my stuff was pretty much the only stuff that a search for Udell would find. (There was even a time when the first Jon on Google was me, not Jon Stewart!) Then my wife began showing up, along with a whole bunch of other Udells. So I tuned my filters to Jon Udell, and they work better for now, but there are other Jon Udells and it’s only a matter of time before that namespace gets cluttered too.

In order to reliably find stuff about me, I need filters tuned to aspects of my identity: my domain name, my Twitter handle. Eventually Stefano’s in-law may reach the same conclusion. She may realize that posting to her website is more than a way to share her thoughts with the world. It also enables the world to react to her posts in ways she can, in turn, discover. At that point she may start to see why it’s important to actively colonize parts of the web that are, or can be, bound to aspects of her identity.

The elmcity project, similarly, invites promoters of public events, and communities at large, to colonize the web in ways bound to those individual or group identities. When you produce a calendar feed that flows through an elmcity hub, you’re not just helping to populate that hub. That feed is attached to your own site and, in theory, is directly discoverable there. In practice, though, there aren’t yet good methods of discovery. We don’t yet have, for iCalendar, an autodiscovery mechanism like the one we have for RSS. That’d be easy enough, as Mark McClaren suggests:

Love it or hate it, iCalendar is the pervasive calendaring format. If we can enable RSS autodiscovery then why not do the same with iCalendar feeds. Adding one line of code would make it easier for people/machines to subscribe to an iCalendar feed.

<link rel="alternate" type="text/calendar"
  title="iCalendar feed for example.com"
  href="calendar.ics" />
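On the consuming side, an agent could discover such a feed with very little code. The following is a sketch only, assuming pages adopt the link element shown above; the class and function names and the example.com page are hypothetical, and no existing crawler is claimed to work this way.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class CalendarLinkFinder(HTMLParser):
    """Collect hrefs of <link rel="alternate" type="text/calendar"> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        rels = (attributes.get("rel") or "").lower().split()
        media_type = (attributes.get("type") or "").lower()
        href = attributes.get("href")
        if "alternate" in rels and media_type == "text/calendar" and href:
            # Resolve relative hrefs against the page's own URL.
            self.feeds.append(urljoin(self.base_url, href))


def find_calendar_feeds(html, base_url):
    """Return absolute URLs of the iCalendar feeds advertised by a page."""
    finder = CalendarLinkFinder(base_url)
    finder.feed(html)
    return finder.feeds


page = """<html><head>
<link rel="alternate" type="text/calendar"
  title="iCalendar feed for example.com"
  href="calendar.ics" />
<link rel="alternate" type="application/rss+xml" href="feed.xml" />
</head><body>Events</body></html>"""

print(find_calendar_feeds(page, "http://example.com/events/"))
```

The RSS link in the sample page is ignored because its type differs, which is the same convention that lets feedreaders pick out the feeds they understand.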

It would also be really helpful to be able to bind locations to events in a discoverable way. To that end I’ve recently enhanced the HTML rendering of elmcity hubs. Now they include what Google calls rich snippets, using the RDFa-style markup documented here. The snippets include latitude and longitude coordinates derived in one of two ways:

1. Per-event. There are several ways that an event can show up bearing latitude/longitude values. The vast majority of such events will be those coming from Eventful and Upcoming, both of which services provide lat/lon values via their APIs. There’s also a GEO property defined for iCalendar, and some iCalendar producers use it to geocode events.

2. Per-hub. Although most iCalendar producers don’t use the per-event GEO property, elmcity hubs know their own locations. So events that lack specific lat/lon coordinates inherit the locations of their hubs.
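The two-step rule above can be sketched as a simple fallback. This is an illustration, not the elmcity code: the event dictionary, its 'geo' key, and the hub coordinates are hypothetical. The one format detail taken from the iCalendar specification is that a GEO value is a latitude and a longitude separated by a semicolon.

```python
def parse_geo(value):
    """Parse an iCalendar GEO value such as '42.9337;-72.2781'.

    Returns a (lat, lon) tuple of floats, or None if the value is
    missing or malformed.
    """
    try:
        lat, lon = value.split(";")
        return float(lat), float(lon)
    except (ValueError, AttributeError):
        return None


def locate(event, hub_location):
    """Prefer the event's own coordinates, else inherit the hub's."""
    geo = parse_geo(event.get("geo"))
    return geo if geo is not None else hub_location


# Hypothetical hub coordinates, for illustration only.
HUB = (42.93, -72.28)

print(locate({"summary": "Concert", "geo": "42.9337;-72.2781"}, HUB))
print(locate({"summary": "Lecture"}, HUB))
```

An event that inherits its hub's location is only coarsely placed, but that is still enough for a location-aware search to associate it with the right city.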

It’s going to be a while yet until folks like Stefano’s book-writing in-law start to realize they can, as Kingsley Idehen nicely puts it, master their own search indexes. But sooner or later they’ll realize that it’s possible. Likewise, it’ll be a while yet until promoters of public events realize that the event data they push to their websites can not only feed pub/sub networks, but can also feed location-aware search engines. I’m a patient man, though, and I do expect the seeds I’m planting to grow and eventually bear fruit.

The new oral tradition

Nowadays when people ask if I’ve read a book and I start to answer yes, I have to stop and think. Did I actually read the book? Or did I only hear the author discuss the book on a podcast? This confusion wouldn’t happen if the book were a work of fiction, but I’m mainly drawn to non-fiction and in that realm I’ve noticed a couple of things. First, I seem to absorb the gist of non-fiction books so well from listening to their authors that I sometimes feel as if I’ve read them. Second, I find that when I do read these books I am sometimes disappointed to find that the writing doesn’t compel me in the same way that the speaking did.

I almost hate to mention this effect, because book publishing is a tough business already and doesn’t need more grief. Nor would I want authors to fear audio exposure. But the effect is real, at least for me, and I wonder why. Here are two theories:

1. The rebirth of the oral tradition

Before there was print, we mainly experienced writing as authors’ voices. Print expanded the reach of their words but not of their voices. During the 20th century, electronic media expanded the reach of some authors’ voices — but only those few who appeared in mainstream media. In the 21st century, though, podcasting has democratized interviews with — and lectures by — authors. Now, for almost any book, you can find one or more podcasts in which the author discusses the work. We have far greater access to the voices of the authors we read and, through their voices, to their personalities. The voices and personalities can be more compelling than the writing.

2. The process of iterative refinement

When you read a book, you access the author’s brain at the moment when the book was just finished. When you listen to an author discuss a book, though, you access his or her brain after it has reflected on the book and processed the world’s reaction to it. That later brain knows more about the themes of the book, and can articulate them better.

Spoken-word audio occupies a small niche within the ecosystem of downloadable audio, so maybe few are noticing this effect. That’s probably a good thing. I like accessing authors’ brains through their voices in addition to — but sometimes as a substitute for — their written words. That kind of substitution, if more widely practiced, would be disruptive.