Freebase, Wikipedia, Powerset

Although I’ve explored Freebase in several ways, I hadn’t seen the way it is now integrated, along with Wikipedia, into the Powerset demo of natural language search. It’s quite eye-opening to see the answer to a deceptively simple query like Tim O’Reilly’s siblings.

Assuming that a database goes to the trouble of actually knowing Tim as an entity, knowing his siblings as entities, and knowing their relationships, it’s fascinating to think about when you’d want to do Parallax-style exploration — in this case, find Tim, then click to see his siblings — and when you’d want to cut to the chase by asking a direct question. Personally I’d want both modes available, but I suppose some people will mainly prefer to navigate and others to ask questions.


7 thoughts on “Freebase, Wikipedia, Powerset”

  1. When Powerset first came out, the quality of the results when Freebase was crawled was immediately visible, while the Wikipedia-only results left much to be desired.

    You should also take a look at Dipity. Those timelines use Freebase quite a bit.

  2. “Personally I’d want both modes available, but I suppose some people will mainly prefer to navigate and others to ask questions.”

    This reminds me vaguely of the early web days, when Yahoo-style navigation of topics was the only way to find things. Then AltaVista gave you a search box to type boolean queries into. And it was interesting to see who would gravitate to each interface.

  3. Jon,

    These nice-looking views of data are just a part of an overall picture. How do I get my own view of these subjective (albeit nice-looking) projections via opaque Web Pages?

    Opaque Web Page you may ask, yes, and by that I mean what’s exemplified by these section entries for the tags @rel :

    If this page could use and @rel to expose its structured data (which it has culled from Wikipedia) then I would be somewhat happy (not necessarily impressed). Bearing in mind that Wikipedia is the main data source, at the very least the related links from Wikipedia should be exposed using something like rel=”dc:source” within .

  4. Clearly I needed to encode the HTML excerpts in prior comments :-(
    Key point is that there is no attribution in the of their pages which makes them Opaque re Data Sources.

    Data Source Opacity is not what the Web is about, so Semantic Technology usage inside, and opaque Web Pages on the Web, ultimately runs counter to the essence of the Web (imho).


  5. When I followed the Tim O’Reilly… link it included the snippet below, about some very colourful members of the O’Reilly family. Is this a problem with the data source, or Powerset? I don’t suppose the O’Reilly gang has any connection to Tim O’Reilly!

    Doesn’t this minor glitch point to a major problem with the linked data idea? If each dataset is, say, 10% inaccurate (for example, clinical coding of diagnoses in the NHS in England was recently found to be at least 10% inaccurate despite the efforts of full-time coding staff), then linking 3 data sets together will produce results that are very hard to place any reliance on.

    The text is:

    When Cyril and Ryan were younger, Eldridge wrote a story on Irish street gangs which depicted the O’Reilly’s gang as brutal and heartless. … While Cyril was in Protective Custody, it was determined that he and Ryan were half-siblings with different mothers but Ryan and Suzanne keep this secret from him.

  6. > I don’t suppose the O’Reilly gang has
    > any connection to Tim O’Reilly!

    Presumably not :-)

    In fact, that reference comes from Wikipedia not from Freebase. I reckon that the siblings listed there are correct, assuming that one or more members of that family have created/maintained the set of linked entities.

    How other O’Reilly families will disambiguate themselves as Freebase grows is an interesting question, of course…
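
The compounding-error worry raised in comment 5 can be put in rough numbers. This is only a back-of-the-envelope sketch in Python: it assumes that errors in the three datasets are independent and that a linked result is correct only when every dataset it draws on is correct. Neither assumption comes from the thread, and the function name `joint_accuracy` is made up for illustration.

```python
def joint_accuracy(accuracies):
    """Chance that a result linked across every dataset is correct,
    ASSUMING errors are independent and all sources must be right."""
    result = 1.0
    for accuracy in accuracies:
        result *= accuracy
    return result


# Three datasets, each 90% accurate (the 10% error figure from comment 5).
three_sets = joint_accuracy([0.9, 0.9, 0.9])
print(f"accuracy: {three_sets:.3f}, error rate: {1 - three_sets:.3f}")
# prints "accuracy: 0.729, error rate: 0.271"
```

Under those assumptions, roughly one linked result in four would be wrong. Real datasets may do better or worse, since their errors are unlikely to be fully independent.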
