The digital darkroom revealed


Today while editing a podcast I stopped to record a bit of the on-screen action. I’ve written before about the audio editing techniques the NPR pros use to make conversations sound clear and intelligible. I use the same methods on my podcasts, and I’ve been meaning to show them. Today’s two-and-a-half-minute screencast gives you a good idea of how they work.

In this short example, I’m talking to Partha Sundaram about something called SQM (pronounced ‘squim’). In the original version we both talk over each other a bit, and I repeat myself. In the final version each voice stands alone and the needless repetition is gone.

You don’t need fancy editing software to do this. Although I’m using Audition in this demo, I’ve done the same kind of thing quite often in Audacity.

You do, however, need to put the voices onto separate channels. When it comes to telephone recording, I am a disciple of Doug Kaye and I use the gadget he recommends, the Telos One, to split the caller and callee onto left and right stereo channels. At $600 the Telos box clearly isn’t for everyone, though, so I’d be interested to hear about a more accessible way to achieve channel separation.
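As a rough illustration of why channel separation matters (this is not the workflow shown in the screencast), once each voice sits on its own channel, the spots where both people talk at once can be located mechanically. The Python sketch below assumes a 16-bit stereo WAV file with one speaker per channel. The function names, the 50 ms window, and the loudness threshold are all made up for illustration and would need tuning against a real recording.

```python
import array
import sys
import wave


def read_stereo_channels(path):
    """Return (sample_rate, left, right) from a 16-bit stereo WAV file."""
    with wave.open(path, "rb") as wav:
        if wav.getnchannels() != 2 or wav.getsampwidth() != 2:
            raise ValueError("expected 16-bit stereo audio")
        rate = wav.getframerate()
        samples = array.array("h")
        samples.frombytes(wav.readframes(wav.getnframes()))
    if sys.byteorder == "big":
        samples.byteswap()  # WAV sample data is little-endian
    # Stereo frames are interleaved: left, right, left, right, ...
    return rate, samples[0::2], samples[1::2]


def peak(chunk):
    """Largest absolute sample value in a chunk (0 for an empty chunk)."""
    return max((abs(s) for s in chunk), default=0)


def find_overlaps(left, right, rate, window_ms=50, threshold=500):
    """Return (start_sec, end_sec) spans where both channels are loud at once."""
    window = max(1, rate * window_ms // 1000)
    length = min(len(left), len(right))
    spans = []
    start = None
    for pos in range(0, length, window):
        both_active = (
            peak(left[pos:pos + window]) >= threshold
            and peak(right[pos:pos + window]) >= threshold
        )
        if both_active and start is None:
            start = pos  # an overlap begins in this window
        elif not both_active and start is not None:
            spans.append((start / rate, pos / rate))
            start = None
    if start is not None:
        spans.append((start / rate, length / rate))
    return spans
```

Calling `read_stereo_channels("interview.wav")` (a hypothetical file name) and passing the result to `find_overlaps` would give a list of time spans worth inspecting in the editor. It only flags candidates. Deciding which voice to keep in each span is still a judgment call made by ear.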

As I mention in the screencast, it’s tedious to do this kind of editing. But it can go pretty fast once you get the hang of it. Since I review my podcasts anyway before publishing them, I’ve decided it’s worth the trouble to make them as clean and intelligible as I can quickly manage. Just like the pros do.

Or do they? I was driving home with my son last night, listening to Fresh Air — a great episode in which Terry Gross interviews Ira Glass about the new TV version of This American Life — and we were both struck by the absence of internal editing. When my son heard this bit — an extreme but not atypical example of the kind of verbal redundancy we heard throughout the show — he burst out laughing. I just found it puzzling. Is internal editing done only for certain shows and not for others? What rules govern when it is or isn’t done?

14 Comments

  1. Great tutorial Jon. I can answer the internal editing question in the NPR case. They left in the sound of Terry’s “Terryness” and then equity required that they leave in the sound of Ira’s “Iraness” — or is that “Glassiness”.

  2. “They left in the sound of Terry’s “Terryness” and then equity required that they leave in the sound of Ira’s “Iraness” — or is that “Glassiness”.”

    Ya think? Even so, they could’ve dialed it down a bit, leaving enough Terryness and Iraness to qualify for authenticity while eliminating much of what’s simply annoying.

    – Jon

  3. “At $600 the Telos box clearly isn’t for everyone, though, so I’d be interested to hear about a more accessible way to achieve channel separation.”

    From the context it seems that by “more accessible” you mean “cheaper”.
    Sorry about being picky because I enjoy your articles and learn from them.

  4. “From the context it seems that by “more accessible” you mean “cheaper”.”

    Actually I do mean accessible. A cheaper alternative would be Skype, for example, if there’s a way it can do channel separation. (Is there? I haven’t found that out, and would like to know.)

    But from my perspective Skype is not particularly accessible. There are a lot of people in the geek world who can and will use it. But increasingly the folks I want to interview are not citizens of that world. For most people, the telephone is still the most comfortable thing.

    That will change, and is changing, of course. So in the short run I guess “accessible” means “a cheaper Telos” but in the long run it might mean “a procedure for channel separation with Skype that non-propellerheads can easily implement”.

  5. Here’s one from the UK: http://www.retellrecorders.co.uk/recording/machine/650.htm

    Plantronics MX-10 also looks like a possibility, but reading the manual I couldn’t tell so I’m not including the link.

    Seems like, since the earpiece and mouthpiece of a telephone handset are separate circuits (I don’t _think_ I hear myself in the earpiece) it ought to be possible to wire two mono circuits in parallel with the handset and connect those via Y-adapter to the stereo in on the PC: but someone more versed in telephone electronics than I am will have to advise you on that.

  6. Very well presented. I’ve been intimidated by audio editing in the past, but the procedure you’ve shown here really is simple. I think I could handle it. That is if I could take the tedium. ;-)

    While I probably won’t use this myself (not enough patience) I’m sure at some point in the future someone will ask me how to do this, and when they do, I’ll have an answer for them!

  7. I think Fresh Air actually airs live initially. I would imagine that accounts for some of the difference.

    In another life I studied discourse linguistics, and part of the funny thing was we had to transcribe conversations we recorded EXACTLY, all the ums and uhs and false starts included. It was a riot. I really grew to love them: when transcribed, they look like poetry. To me there was nothing more intimate in corpus linguistics than the false start (except perhaps conjunction use — but that’s another day)

    I wonder in some ways about cleaning out redundancy though. I did a 20-minute interview of Governor Bill Richardson last week, and I cut out nothing: I dumped it straight to mp3 and posted it. Had some reservations about that, mainly because *I* sounded like an idiot.

    But the thing is, most questions he jumped straight off the mark, which was impressive for a guy who’d been talking for 10 hours straight. On the question of what his hardest political contest was, though, there was a long pause, and some “um”-ing and a bunch of false starts. That’s information to my listeners, because we’re interested not only in his answer, but in how he finds his way to it.

    As you know from interviewing as well, very often those hesitations are the interviewer watching the face of the interviewee, and seeing confusion, then trying to swing the question around to an angle they look more comfortable with.

    So obviously, when the subject is RSS etc., that stuff can go — but when the subject is partially the interviewee the pauses and false starts are much like micro-expressions — information possibly as important as the answers that follow. In such cases, I’d argue that editing them out is not in the listener’s interest.

  8. As a listener, I am of two minds regarding “umms” and “ahhs”. If I’m listening to a conversation, then I want to hear the pauses, gaps, and personal sound effects. It sounds more natural, and the impression is that I am a part of the conversation.

    However, if it’s more presentational then I prefer it to be more edited. In this case, I’m trying to gather information, and the pauses are blocking me.

  9. I know that I always wind up saying there’s a middle ground but, well, there usually is, and in this case I think there is.

    For example, I just finished editing a conversational podcast from which I removed maybe 2/3 of the “likes” and “umms” and false starts and repetitions, but left the rest. So it still feels quite natural, but things move along better than they otherwise did.

    As I did this editing, I thought about Mike Caulfield’s very interesting point about when those “micro-expressions” might in fact be conveying information.

  10. This post really helped me make a decision I very much needed to make. Special thanks to the author for that! I look forward to new posts from you!
