By Jason Snell
September 3, 2020 1:37 PM PT
Descript: Making podcast editing more like text editing
As a part of my ongoing 20 Macs for 2020 project, I’ve been releasing weekly podcast episodes featuring the voices of a bunch of people I know discussing the Macs I’ve placed on my list of the 20 most notable Macs of all time.
Though I’ve probably hosted and edited more than a thousand podcast episodes at this point, the 20 Macs for 2020 podcast is a very different kind of beast. It’s scripted, and I’m weaving short clips from the interviews I recorded together with my own words to tell a story. In terms of podcasts, it’s more of a public-radio sort of style. In terms of my own workflow, it’s got more in common with the Six Colors Apple Report Card series, in which I take a very large body of written commentary from my panel and boil it down into a few key phrases.
But working with audio is different from working with text. Building the Apple Report Card calls on my skills as a journalist, taking relevant quotes and dropping them in as I write. But when I decided what I wanted the 20 Macs for 2020 podcast to sound like, I was baffled about how I would achieve a similar effect.
For a while, I resigned myself to using an AI-based transcription tool like Rev for my interviews. Then I’d copy and paste from those transcripts, assemble my script, and pay someone [1] to go back to the source audio file and piece the quotes I used back together in an audio editing app. It was doable, but it seemed like an awful lot of work.
Then I realized that there was a perfect tool for this project, and it was one that I had looked at months before and discounted as being irrelevant for the kind of podcasts I do: Descript. This is an app that consumes audio files, generates text transcripts, and then lets you edit audio by editing the text transcriptions. Delete a sentence in the Descript text editor, and that sentence is edited out of the audio.
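To make the idea concrete, here is a minimal sketch of how text-based audio editing can work in principle. It is not Descript's actual implementation; it assumes each transcribed word carries start and end timestamps into the source audio, and the names `Word` and `keep_ranges` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    start: float  # seconds into the source audio (assumed to come from the transcript)
    end: float


def keep_ranges(words, deleted):
    """Return the (start, end) spans of source audio that survive the text edit.

    `deleted` is a set of word indexes removed in the text editor.
    Consecutive kept words are merged into a single span.
    """
    ranges = []
    prev_kept = False
    for i, word in enumerate(words):
        if i in deleted:
            prev_kept = False
            continue
        if prev_kept:
            # Extend the current span through the end of this word.
            ranges[-1] = (ranges[-1][0], word.end)
        else:
            ranges.append((word.start, word.end))
        prev_kept = True
    return ranges


transcript = [
    Word("It", 0.0, 0.2),
    Word("was", 0.2, 0.4),
    Word("um", 0.4, 0.9),
    Word("great", 0.9, 1.4),
]

# Deleting the word "um" in the text leaves two spans of audio to play back.
print(keep_ranges(transcript, deleted={2}))  # [(0.0, 0.4), (0.9, 1.4)]
```

Deleting a whole sentence is the same operation with a larger set of indexes; the source audio itself is never modified.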
Even better: You can create new documents inside Descript and copy and paste text from other documents inside, and their audio comes along for the ride. So, for example, I was able to paste all the interview comments about the Blue and White Power Mac G3 from all my interviews into a single document, and then craft my script within that document, switching back and forth between different comments from different voices. (Descript also supports entering text into documents by typing, so I was able to write all of my lines within the app, too.)
So far so good, but I was anticipating a painful process when it came to building the final version of the podcast. While Descript is a more full-featured audio editor than you might think, I wanted the complete control that I can get from Logic Pro X. I expected to have to export a raw set of interview clips from Descript and then clean the whole thing up in Logic, but I wasn’t counting on just how good Descript’s export features would be.
Not only was it easy to export my script out as a Word file (optionally with time codes), but Descript will bundle up everything and export it into a format that can be read by pretty much every popular audio or video editor. (Yes, it works with video too—and I’ve got to imagine Descript would be a boon to anyone editing a documentary with hours of interview footage to wade through.)
This is not a “flattened” version of the project, either. It’s every single edit, but tied to the underlying source file. This is vitally important, because if Descript’s approach to editing has a flaw, it’s that trimming audio by deleting text doesn’t always create the most natural-sounding edits. But once that project is imported into Logic, I can expand or contract every single edit as needed, until it’s all perfect.
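The difference between a flattened export and an edit that stays tied to its source can be sketched in a few lines. This is an illustration under assumptions, not the actual export format or how Logic stores regions: the `Clip` type, the `widen` helper, and the file name are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Clip:
    source: str   # path to the untouched source recording (hypothetical name)
    start: float  # seconds into the source
    end: float


def widen(clip, lead=0.0, tail=0.0, source_length=None):
    """Return a new clip with its edges moved outward, clamped to the source.

    Negative values move an edge inward. Because the clip only points at the
    source file, the audio on either side of the cut is still available.
    """
    new_start = max(0.0, clip.start - lead)
    new_end = clip.end + tail
    if source_length is not None:
        new_end = min(source_length, new_end)
    if new_end <= new_start:
        raise ValueError("clip would be empty")
    return Clip(clip.source, new_start, new_end)


quote = Clip("interview.wav", 12.5, 18.0)

# Let the quote breathe: start a quarter second earlier, end half a second later.
print(widen(quote, lead=0.25, tail=0.5))
# Clip(source='interview.wav', start=12.25, end=18.5)
```

A flattened export would have baked the 12.5 to 18.0 second span into a new file, leaving nothing outside the cut to recover.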
Here’s a look at what a Descript export looks like when it’s opened in Logic, with a separate track for every single interview, and every single edit visible and editable:
This is a starting point. From here, I’ll record my own narration—Descript estimates the length of time all of my narration will run based on my script and leaves empty space for it—add in music, turn on some plug-ins to enhance the sound, and do all the other nitty gritty things that lead to the final version. (I also turned to my pal Brian Hamilton to do a pass that smooths out all the dialogue edits and musical transitions—thanks, Brian!)
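How Descript arrives at its narration estimate isn't documented here, but a rough version of the same idea is easy to sketch: count the words in the script and divide by a speaking rate. The 150 words-per-minute default below is an assumption for illustration, and `estimate_seconds` is a hypothetical name.

```python
def estimate_seconds(script, words_per_minute=150):
    """Estimate how long a script takes to read aloud, in seconds.

    The default speaking rate is an assumed value, not one taken from Descript.
    """
    word_count = len(script.split())
    return word_count / words_per_minute * 60


# A 300-word stretch of narration at 150 words per minute runs about two minutes.
narration = "word " * 300
print(estimate_seconds(narration))  # 120.0
```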
While I couldn’t imagine using Descript for my conversational podcasts, I can’t imagine not using it for this project. It has let me turn my decades-honed skill as a writer who trims and edits quotes to tell a story into a podcast editing tool.
Descript is free to try (including three hours of gratis transcription time), but I’m using the $15 monthly/$144 annual plan. A more expensive plan adds more powerful features, including the ability to synthesize a host’s voice from text—useful if you’re part of a big production and can’t get the talent in a studio to record a filler word or three.
[1] Possibly my daughter, who was looking for a summer job at the time.