Kudos to the New York Times for its remarkable interactive transcript of the Republican debate. The display has two tabs: Video Transcript, and Transcript Analyzer. The Video Transcript is a side-by-side display of the video and the transcript, linked together and randomly accessible from either side, plus a list of topics that you can jump to. These are the kinds of features you’d like to be able to take for granted, but which aren’t always implemented as cleanly as they are here.
But the Transcript Analyzer takes the game to a new level. Or at least, one that I haven’t seen before in a mainstream publication. The entire conversation is chunked and can be visualized in several ways. It’s reminiscent of the Open University’s FlashMeeting technology which I mentioned here.
In the Times’ visualizer, you can see at a glance the length and wordcount of all participants’ contributions — including YouTube participants who are aggregated as “YouTube user”. Selecting a participant highlights their contributions, and when you mouse over the colored bars, that section of the transcript pops up.
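For the curious, the per-participant tallies could be computed along these lines. This is only a sketch: the `chunks` data, the speaker labels, and the `tally_by_speaker` name are all made up for illustration, and I have no idea how the Times actually stores or processes the transcript.

```python
from collections import defaultdict

# Hypothetical transcript chunks: (speaker, text) pairs in spoken order.
# The speakers and sentences are invented for this example.
chunks = [
    ("Candidate A", "We should talk about energy policy."),
    ("YouTube user", "What is your plan for taxes?"),
    ("Candidate B", "My plan is simple."),
    ("Candidate A", "Energy matters."),
]

def tally_by_speaker(chunks):
    """Return {speaker: (number_of_turns, total_words)}."""
    turns = defaultdict(int)
    words = defaultdict(int)
    for speaker, text in chunks:
        turns[speaker] += 1
        words[speaker] += len(text.split())  # whitespace-delimited word count
    return {speaker: (turns[speaker], words[speaker]) for speaker in turns}

totals = tally_by_speaker(chunks)
for speaker, (n_turns, n_words) in totals.items():
    print(f"{speaker}: {n_turns} turn(s), {n_words} words")
```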
Even more wonderful is the ability to search for words, and see at a glance which chunks from which participants contain those words. The found chunks are highlighted and, in a really nice touch, the locations of the found words within the chunks are indicated with small dark bars. Mouse over a found chunk, and the transcript pops up with the found words in bold. Wow! It’s just stunningly well done.
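The search feature might be approximated as follows: find the chunks that contain a word, and record where in each chunk the word occurs as a fraction of the chunk's length, which is the kind of number you would need to draw those small dark bars. Again, this is a sketch under my own assumptions (the `chunks` sample and the `find_word` function are hypothetical), not a description of the Times' code.

```python
import re

# Hypothetical transcript chunks: (speaker, text) pairs, invented for illustration.
chunks = [
    ("Candidate A", "We should talk about energy policy."),
    ("YouTube user", "What is your plan for taxes?"),
    ("Candidate B", "My plan is simple."),
    ("Candidate A", "Energy matters."),
]

def find_word(chunks, word):
    """Return a list of (chunk_index, speaker, positions) for matching chunks.

    Each position is the match's starting offset divided by the chunk length,
    so it falls between 0.0 (start of chunk) and 1.0 (end of chunk).
    """
    pattern = re.compile(r"\b" + re.escape(word) + r"\b", re.IGNORECASE)
    hits = []
    for index, (speaker, text) in enumerate(chunks):
        positions = [m.start() / len(text) for m in pattern.finditer(text)]
        if positions:
            hits.append((index, speaker, positions))
    return hits

hits = find_word(chunks, "energy")
for index, speaker, positions in hits:
    print(index, speaker, positions)
```

Matching on whole words, case-insensitively, is a design choice here; a real tool might also want stemming so that "energy" and "energies" land in the same search.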
The point of all this, of course, is not to exhibit stunning technical virtuosity, although it does. The point is to be able to type in a word like, say, energy, and instantly discover that only one candidate said anything substantive on the topic. (It was Mitt Romney, by the way.) Somehow, in all of the presidential campaigning, that topic continues to languish. But with tools like this, citizens can begin to focus with laserlike precision not only on what candidates are saying, but also — and in some ways more crucially — on what they are not.
Hats off to the Times’ Shan Carter, Gabriel Dance, Matt Ericson, Tom Jackson, Jonathan Ellis, and Sarah Wheaton for their great work on this amazing conversation visualizer.
16 thoughts on “Excellent debate visualizer at NYTimes.com”
Jon, thanks for sharing and bringing this to people’s attention. I definitely agree that this is a phenomenal piece of technology that adds tremendous value. I was able to look at the visualization and home in on keywords that were important for me. Also, I was able to go directly to the questions I was really concerned about.
Kudos to the folks at the NY Times! Keep up the great work!
Truly amazing, thanks for letting on. Pity it doesn’t work with Mozilla Firefox (no sound).
thanks for the complimentary write-up. i’m stoked that you find the tool impressive, and even happier that you find it useful. that’s the goal. as far as it not working in firefox, we’re certainly hoping that’s not the case. try refreshing. sometimes i think the streaming server gets a little wonky, but a refresh should set u up right.
Indeed, very nifty, Jon. I love this stuff; it’s part of the infrastructure that’s essential for us to be able to bring our national debate to a more nuanced level. Another one I bumped into: http://www.neoformix.com/Projects/TranscriptAnalyzer/index.html
I did a thing once that compared words in the debate transcript with a corpus of spoken English, and flagged the most improbable words. That helps give you a flavor of what was talked about, and shows you things you didn’t know you wanted to ask.
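One plausible way to do that kind of comparison, sketched under assumptions: score each transcript word by the log of its relative frequency in the transcript divided by its smoothed relative frequency in a reference corpus, then list the highest scorers. The function names, the add-one smoothing, and the toy strings below are all illustrative choices, not a record of what was actually built.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def improbable_words(transcript, reference, top_n=5):
    """Rank transcript words by how over-represented they are versus reference."""
    t_counts = Counter(tokenize(transcript))
    r_counts = Counter(tokenize(reference))
    t_total = sum(t_counts.values())
    r_total = sum(r_counts.values())
    vocab_size = len(set(t_counts) | set(r_counts))

    scores = {}
    for word, count in t_counts.items():
        p_transcript = count / t_total
        # Add-one smoothing so words absent from the reference get a small,
        # nonzero probability instead of causing a division by zero.
        p_reference = (r_counts[word] + 1) / (r_total + vocab_size)
        scores[word] = math.log(p_transcript / p_reference)
    return sorted(scores, key=scores.get, reverse=True)[:top_n]

# Toy inputs, invented for illustration; a real run would use the full debate
# transcript and a large corpus of spoken English.
transcript = "energy energy energy policy and the the taxes"
reference = "the and the and the of to a the and well you know the"
print(improbable_words(transcript, reference, top_n=3))
```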
The engine behind this is great. I wonder how much work is involved in getting the transcription, speakers and timers.
I work on Moodle (the learning/course management system), and it would be great for a lecturer to be able to upload a video of a lecture and have a wiki-style page where others can, if they want, transcribe fragments in a simple format. It would also be great for conference videos that stay posted on the internet for a long time: the ones that people think are important would stand a chance of getting transcribed. And with the right payoff (and this kind of browsing is a great payoff), people will be keen on doing it.
This is a most useful tool, and I put up a blog post about it.
I do hope that the NYT does this for some of the upcoming debates.
Following on K. Kennedy’s addition, here is “Issue Tracker”:
*thanks to Web2.Oh Really? for the pointer*
Blinkx uses Autonomy technology to do transcriptions of podcasts and videos but they don’t publish them because the copyright position is uncertain; I long for transcripts of the rich media that’s online so we can mine it like this.