My Screencasting Process

A bunch of people have asked about my screencasting process for RubyTapas. My process is a work in progress that I frequently iterate on, so this is really just a snapshot of my process as of January 2013. I don’t think of this as “screencasting best practices” or anything like that; I’m still a rank amateur at the screencasting game. But for whatever it’s worth, here’s how I do it.

It begins with an idea for an episode. I keep a lengthy and ever-growing idea list, and that is often where I start. But as often as not I’ll pick a topic that isn’t on the list at all, instead going with something that’s on my mind after a pair programming session or an online discussion.

Once I have an idea, I write the script. If I’m unsure about the code examples I’ll start with them and then write words around them; otherwise I’ll just write straight through, alternating narrative and code examples. I write the script in Emacs Org-Mode, making use of its excellent support for editing embedded code samples. I try to write the text in a conversational tone, and I try to hear how it will sound as voice-over (VO) in my head.

VO text differs a little from writing a blog post. One thing I’ve realized is that where in a blog post I would just say “…as you can see in the following code:” and then let the code sample speak for itself, on video it works better to have a running commentary of the code as I’m writing it. So I try to remember to write out a description of the code to follow.

After the script writing comes the screen capture. I use an Emacs function I wrote to create a special RubyTapas frame (window) with my modified Molokai color scheme and some other customizations. I begin recording the frame using ffmpeg, via a script I hacked together to automate the setup. Among other things, this script makes sure that the Emacs window fits my 960×540 target size, and that only the window client area (without title bar, etc.) is recorded.
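The capture script itself isn't reproduced in this post, but here is a minimal sketch of what that kind of setup might look like. It assumes a Linux desktop running X11 and ffmpeg's x11grab input device. The function name, the default display ":0.0", the 30 fps frame rate, and the idea of passing in the client area's top-left corner as x and y are all illustrative assumptions, not details of the real script.

```python
from typing import List


def build_capture_command(
    x: int,
    y: int,
    output: str,
    width: int = 960,
    height: int = 540,
    display: str = ":0.0",   # assumed X display
    framerate: int = 30,     # assumed frame rate
) -> List[str]:
    """Build an ffmpeg argument list that records a fixed-size screen region.

    (x, y) is the top-left corner of the window's client area, so the
    title bar and borders stay outside the recorded rectangle.
    """
    if x < 0 or y < 0:
        raise ValueError("capture offsets must be non-negative")
    if width <= 0 or height <= 0:
        raise ValueError("capture size must be positive")
    return [
        "ffmpeg",
        "-f", "x11grab",                     # X11 screen-grab input device
        "-framerate", str(framerate),
        "-video_size", f"{width}x{height}",  # 960x540 target size by default
        "-i", f"{display}+{x},{y}",          # display plus region offset
        output,
    ]


if __name__ == "__main__":
    # Print the command instead of running it, so it can be inspected first.
    print(" ".join(build_capture_command(100, 80, "episode.mkv")))
```

Building the command as a list keeps it easy to hand to subprocess.run later without worrying about shell quoting. Finding the window's actual offsets (for example from a window manager tool) is left out here.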

I typically record all of the code examples in a single take. If I screw one up I mention something about it into the mic and do it over. I try to remember to leave some padding around demonstrations, leaving the screen at rest for a few seconds, to make editing easier. Lately I’ve also been experimenting with doing one take for each step in the demonstration.

After recording, I switch over to my dedicated video editing machine. The files are already synced over thanks to Dropbox. I use Sony Vegas Pro 12 on Windows 7 to edit video. I used to do this in a virtual machine, but now that I’ve finally replaced my development workstation I can dedicate a machine to running Windows on the bare metal, which is a lot faster.

I use a Production Assistant 2 template to start up a new video project with a bunch of boilerplate setup already done. Specifically:

  • A title track with a placeholder title card
  • A screen capture track
  • A VO track with an effects stack already configured. I use a noise gate followed by a Wave Hammer compressor.
  • An ambient noise track (recorded in my office) to ensure that breaks in the VO track don’t sound jarring.

I record the VO inside Vegas. I record into a Blue Snowball mic with a Nady pop filter. [UPDATE October 2013: I’ve since upgraded to a Rode Podcaster, including the Rode PSA1 desk mount arm and PSM1 shock mount. I now use a lighter weight pop filter so as not to over-weight the arm. With the Podcaster’s headphone port I’m able to monitor my input in realtime alongside the PC audio output. For this I use a pair of Sony MDR-7506 studio monitor headphones.]

Typically I’ll record a separate clip for each section of narration in between code examples. The VO becomes the backbone of the video, for pacing purposes. I may add some silence in cases where the code example is just too long to synchronize with narration, but otherwise the VO sets the pace and I edit the video around it.

This part is also where I do the final edits to the script. Sometimes I’ll read a line and it just won’t sound right, so I’ll edit it and re-record.

Once the VO is laid down, it’s simply a matter of editing the screen capture to fit. This means cutting up the screen recording into clips based on code edits, and then squeezing or stretching them to fit the narration. This is very easy in Vegas; you can speed up or slow down a clip by simply control-clicking on one of the edges and dragging it to the desired length. Occasionally I’ll need to freeze the frame for a while as the VO does some exposition; in these cases I’ll do a frame grab and insert the still image into the timeline to fill the space.
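The arithmetic behind squeezing and stretching is simple, and it can help to see it written down. The sketch below is only an illustration of the idea, not something Vegas exposes: given the length of a screen capture clip and the length of the narration it has to cover, it works out the playback rate, and when the clip would have to be slowed too far, it pads the remainder with a freeze frame instead. The function names and the 0.5 minimum rate are assumptions chosen for the example.

```python
def playback_rate(clip_seconds: float, narration_seconds: float) -> float:
    """Rate at which the clip must play to last exactly as long as the narration.

    A value above 1.0 means the clip is sped up; below 1.0, slowed down.
    """
    if clip_seconds <= 0 or narration_seconds <= 0:
        raise ValueError("durations must be positive")
    return clip_seconds / narration_seconds


def plan_clip(clip_seconds: float, narration_seconds: float,
              min_rate: float = 0.5) -> dict:
    """Decide how to fit a clip to its narration.

    min_rate is a hypothetical threshold: below it, slow typing starts to
    look odd, so the clip is slowed only to min_rate and the remaining
    time is filled with a still of the last frame.
    """
    rate = playback_rate(clip_seconds, narration_seconds)
    if rate >= min_rate:
        return {"rate": rate, "freeze_seconds": 0.0}
    stretched = clip_seconds / min_rate  # clip length after maximum slowdown
    return {"rate": min_rate, "freeze_seconds": narration_seconds - stretched}


if __name__ == "__main__":
    print(plan_clip(12.0, 8.0))   # 12 s of typing under 8 s of narration
    print(plan_clip(2.0, 10.0))   # a short edit under a long explanation
```

In the first case the clip plays at 1.5x. In the second, the 2-second clip is slowed to half speed (4 seconds) and the remaining 6 seconds are covered by a still frame.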

Sometimes this process is quick and just takes a few edits. Other times, when I have a lot of code changes in the recording and I want the video to be closely synchronized with the VO, it may take dozens of cuts and adjustments to ensure that each code change takes place on the screen at the same time as my narration of it.

My editing workflow has been made more enjoyable by the acquisition of a Contour Design ShuttlePRO 2. Between the shuttle wheel, jog dial, and the many programmable buttons, I’m able to edit with one hand on the ShuttlePRO and one hand on the mouse. This has been a lot more efficient than mouse-and-keyboard, especially for quickly dialing in on exactly the frame I want to begin or end a clip on.

Once I finish editing to my satisfaction, I render to MP4. Then I export the HTML version of the script, as well as the source code file, using Org-Mode commands. I upload them all to DPD, add a synopsis, and schedule the episode for release.

I’d estimate the whole process takes roughly 2-4 hours for one ~5 minute episode. Sometimes an episode can take a whole day if I need to do a lot of research for the script, or if I do some fancy post-production effects like doodling circles and arrows on top of the code.

And that’s it! It’s a lot of work, but I’m reasonably happy with the outcome.