Searching across silos, circa 2018

The other day, while searching across various information silos — WordPress, Slack, readthedocs.io, GitHub, Google Drive, Google Groups, Zendesk, Stack Overflow for Teams — I remembered the first time I simplified that chore. It was a fall day in 1996. I’d been thinking about software components since writing BYTE’s Componentware1 cover story in 1994, and about web software since launching the BYTE website in 1995. On that day in 1996 I put the two ideas together for the first time, and wrote up my findings in an installment of my monthly column entitled On-Line Componentware. (Yes, we really did write “On-Line” that way.)

The solution described in that column was motivated by McGraw-Hill’s need to unify search across BYTE and some other MgH pubs that were coming online. Alta Vista, then the dominant search engine, was indexing all the sites. It didn’t offer a way to search across them. But it did offer a site: query as Google and Bing do today, so you could run per-site queries. I saw it was possible to regard Alta Vista as a new kind of software component for a new kind of web-based application that called the search engine once per site and stitched the results into a single web page.

Over the years I’ve done a few of these metasearch apps. In the pre-API era that meant normalizing results from scraped web pages. In the API era it often means normalizing results fetched from per-site APIs. But that kind of normalization was overkill this time around, I just needed an easier way to search for words that might be on our blog, or in our docs, or in our GitHub repos, or in our Google storage. So I made a web page that accepts a search term, runs a bunch of site-specific searches, and opens each into a new tab.

This solution isn’t nearly as nice as some of my prior metasearchers. But it’s way easier than authenticating to APIs and merging their results, or using a bunch of different query syntaxes interactively. And it’s really helping me find stuff scattered across our silos.

But — there’s always a but — you’ll notice that Slack and Zendesk aren’t yet in the mix. All the other services make it possible to form a URL that includes the search term. That’s just basic web thinking. A set of web search results is an important kind of web resource that, like any other, should be URL-accessible. Unless I’m wrong, in which case someone please correct me, I can’t do the equivalent of `?q=notifications` in a Slack or Zendesk URL. Of course it’s possible to wrap their APIs in a way that enables URL query, but I shouldn’t have to. Every search deserves its own URL.


1 Thanks as always to Brewster Kahle for preserving what publishers did not.

Posted in .

2 thoughts on “Searching across silos, circa 2018

  1. There used to be a startup named “Greplin” which tried to solve the similar problem for the personal data hosted in such services

Leave a Reply