Behind the scenes: The editing of a screencast

While I was editing today’s screencast I kept a log of my edits, and I’ve included that log below. As is typical when I edit screencasts, this one squeezed down quite a lot: from almost 54 minutes to 34 minutes. The result not only saves the viewer a precious 20 minutes, but also unfolds in a far more entertaining and engaging way.

I’ve written a lot about why to do this kind of editing, but never shown in detail what the process is like. For folks who are familiar with the editing process — in any medium — this is all just basic knowledge and common sense. But there are lots of folks who are not familiar with the editing process in any medium. So to convey what it’s like, I decided to narrate (part of) the editing of this particular screencast.

As I’ve mentioned before, there’s one huge difference between editing audio and editing video. With audio, as with text, you can seamlessly cut and rearrange to your heart’s content. With video, the need to preserve visual continuity imposes severe limits, especially on the so-called internal edits that elide words and phrases. It’s interesting to note that, in this respect, the demo/interview genre of screencasting has more in common with audio than with video. There’s usually a lot less happening in a screencast than in motion video. So you can usually get away with the sort of heavy editing that’s normally only possible in the audio domain. And it’s very useful to be able to do that.


(Initial length: 53:45.)

I cut the first 2.5 mins of Henrik talking in general terms about CCR, DSS, the programming model. Why? Nothing to show, and this info is available elsewhere.

The real meat of this demo is to show how the Robotics Studio exposes a RESTful interface, and to demo interactions with (real and simulated) robots using that interface.

In the next segment, Henrik starts by saying “I have a nice big robot next to me, I might be able to show you, if I can just…”

I then cut 15 seconds of him fumbling around in the services directory and muttering to himself, while hunting for the webcam interface. So it went from:

“I might be able to show you, if I can just” …. 15 seconds of fumbling and muttering … “there you go! [image appears]”

to:

“I might be able to show you, if I can just … there you go! [image appears]”

This is partly about respect for the viewer’s time, because people have better things to do than watch and listen to 15 seconds of fumbling and muttering. And it’s partly about keeping the storyline moving forward in an engaging way.

A subtle point here is that I left in just enough of the fumbling and muttering. If I had reduced it to:

“I might just be able to show you…there you go!”

then it would have felt overproduced and inauthentic. I want Henrik to fumble and mutter a little bit, that’s part of the whole charm of the thing. But I want to limit the fumbling and muttering to a reasonable length. I think that leaving in “if I can just…” retains just enough of that quality — but not too much:

“I might be able to show you…if I can just…there you go! [image appears]”

During the next stretch I made no major cuts, but lots of little ones in the range of 2 to 5 seconds. These are places where the audio pauses because Henrik is thinking, or waiting for the computer to respond. And they are also places where he’s just verbally warming up to what he really wants to say — or where I am doing the same.

Example: “…and so, um…” –> “” == 2 seconds saved

Example: “And what we have here is that, um, and so, everybody has seen a web server” –> “Everybody has seen a web server” == 5 seconds saved

These internal cuts are completely inaudible and, so long as they don’t interrupt the onscreen action, also invisible. Since a typical screencast is often visually quiescent there are many opportunities to make these cuts. They not only reduce the end-to-end time, but also — just as important — they make the video far more watchable.

The next major cut was a 20-second setup leading to the statement: “In our model, everybody is a client and everybody is a server.” In the setup, Henrik talked about how typical web apps (like home banking) exhibit a more classical client/server architecture. It was a judgement call but, in this case, I decided that the kinds of folks who will care about RESTful interfaces to a robotic services fabric didn’t need the setup, and that it was more valuable to shave those 20 seconds than to keep them.

It’s worth noting how the context supported making this cut. Originally:

“We have services that talk to each other, that wire each other up, and use each other to construct and compose applications.”

… 20-second setup ….

“In our model, everybody is a client and everybody is a server.”

Finally:

“We have services that talk to each other, that wire each other up, and use each other to construct and compose applications. In our model, everybody is a client and everybody is a server.”

It flows perfectly.

Next I cut a restatement of “everybody is a client and a server” which chewed up 5 or 6 seconds without adding anything new. In doing so I ran into a logistical problem. When trying to make precise audio cuts in Camtasia you can run into trouble in tight spaces. (I keep meaning — and keep forgetting — to mitigate this problem by capturing at a higher frame rate than the one at which I finally produce.) A workaround is to silence a region that’s too small to accurately cut.

So, for example, after cutting that restatement I wound up with:

...at the same time same time. That has some great benefits.
                    ---------

I wasn’t able to cut the redundant “same time” without affecting the “That has” — but I was able to replace the redundant “same time” with silence:

...at the same time. ________ That has some great benefits.

That left a perfectly natural-sounding 1-second pause.
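The trade-off between cutting and silencing can be sketched in a few lines of Python. This is only an illustration, not a description of how Camtasia works internally. It assumes that cuts have to land on video frame boundaries, while audio can be silenced at exact times. The 10 fps frame rate, the word timings, the toy sample rate, and the helper names are all made up for the example.

```python
FRAME_MS = 100  # assumed: 10 frames per second, one frame every 100 ms


def snap_cut_outward(start_ms, end_ms, frame_ms=FRAME_MS):
    """Widen a cut so both edges land on frame boundaries (assumed constraint)."""
    snapped_start = (start_ms // frame_ms) * frame_ms
    snapped_end = -(-end_ms // frame_ms) * frame_ms  # ceiling division
    return snapped_start, snapped_end


def silence(samples, start_ms, end_ms, sample_rate):
    """Return a copy with the span zeroed; the overall length is unchanged."""
    first = start_ms * sample_rate // 1000
    last = end_ms * sample_rate // 1000
    return samples[:first] + [0] * (last - first) + samples[last:]


# Hypothetical timings: the redundant "same time" runs from 12.34 s to 13.31 s,
# and "That has" begins at 13.35 s.
redundant = (12_340, 13_310)
next_phrase_start = 13_350

cut = snap_cut_outward(*redundant)
cut_clips_next_phrase = cut[1] > next_phrase_start

# Toy audio: 16 seconds of non-zero samples at 1000 samples per second.
audio = [1] * 16_000
quieted = silence(audio, *redundant, sample_rate=1000)
```

With these made-up numbers the frame-aligned cut ends at 13.4 s, which bites into "That has", whereas the silenced version keeps the following phrase intact and leaves the track length (and so the sync with the video) unchanged.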

(Length now: 49:38)

Through the next section I made assorted internal cuts, and one major cut. After Henrik contrasted OO-style inheritance with the additive composition of RESTful services, which is the extension pattern for the Robotics Studio, we got into a several-minute discussion about the tradeoffs between these approaches. It wasn’t really conclusive, though, and I realized that it would be better to factor that out. In fact, while recording, I decided at this point to do a separate podcast in which we’ll drill down on these more abstract points. In a screencast, you want to keep the visuals moving along.

(Length now: 46:38)

For the remainder, more of the same: internal cuts, plus 20- or 30-second chunks that were disposable.

Final length: 34:30
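The running lengths noted in the log make it easy to tally where the time went. Here is a small Python sketch that uses only the four checkpoint times quoted above. The stage labels and helper names are just illustrative.

```python
def to_seconds(mmss):
    """Convert an 'MM:SS' string to a number of seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


def to_mmss(total_seconds):
    """Convert a number of seconds back to 'M:SS'."""
    return f"{total_seconds // 60}:{total_seconds % 60:02d}"


# Checkpoint lengths as logged; the labels are informal descriptions.
checkpoints = [
    ("initial", "53:45"),
    ("after the early cuts", "49:38"),
    ("after factoring out the tradeoffs discussion", "46:38"),
    ("final", "34:30"),
]

# Time saved between each pair of consecutive checkpoints.
savings = []
for (_, before), (label, after) in zip(checkpoints, checkpoints[1:]):
    saved = to_seconds(before) - to_seconds(after)
    savings.append((label, to_mmss(saved)))

total_saved = to_seconds(checkpoints[0][1]) - to_seconds(checkpoints[-1][1])
percent_saved = round(100 * total_saved / to_seconds(checkpoints[0][1]))

for label, saved in savings:
    print(f"{label}: {saved} saved")
print(f"total: {to_mmss(total_saved)} saved ({percent_saved}%)")
```

By this tally the cuts described in detail account for a little over 7 minutes, and the "more of the same" remainder for about 12.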

7 Comments

  1. Good intro into trimming the audio fat, but I think you could have been a little more ruthless — too many pauses, and 10 second diversions! Also, as a note for future casts, you might want to ensure that both speakers are approximately the same volume — you are very quiet in comparison to Henrik.

  2. “Good intro into trimming the audio fat, but I think you could have been a little more ruthless — too many pauses, and 10 second diversions!”

    Hmm. OK, noted.

    “Also, as a note for future casts, you might want to ensure that both speakers are approximately the same volume — you are very quiet in comparison to Henrik.”

    Yeah, sorry about that. Normally I record to separate stereo tracks, and I thought Camtasia was configured that way, but it wasn’t so there was no way to separately adjust the tracks as I usually do.

  3. and one other major reflection and ah for me. Perhaps just the difference in the techniques. I tend to edit at the storyboard and scripting level … but with an interview, I guess you flip it – capture everything and go back and edit.

    So, here’s the real thing I want to know:

    -You must have had some bullet points jotted down someplace before you started to interview?
    -As you were doing the interview, did you jot down any place in the audio that was like the “keepers” or did you just focus on listening and watching and interviewing – and then while listening to the whole tape started to mark it?
    -You gave us the log of your edits, but did you do those as you went along – did the use of markers make it more efficient?

    inquiring minds want to know ..

  4. “What I want to know is how long it took you to edit out the twenty minutes — was it longer than twenty minutes?”

    Oh sure. Maybe an hour or so if I’d only been editing, but since I was also logging the edits it was way longer than usual.

    “or did you just focus on listening and watching and interviewing”

    Yes.

    “You gave us the log of your edits, but did you do those as you went along – did the use of markers make it more efficient?”

    I think it’s an excellent discipline to mark everything, and sometimes I do, but sometimes I am impatient and just plunge in and do things directly. This was one of the latter cases.
