Tool to bulk speed up/convert an audio file

I want to listen to certain podcasts on my phone but I have two common problems:

  1. The audio is in some weird format (some don't play on my phone).
  2. The audio is slow.

I want to use something like sox or avconv to bulk convert the files. Since this is just voice and going on a cell phone, small low-quality files would be best for me. I had some good success using avconv:

avconv -i weird.wma normal.ogg

Unfortunately, this command creates an enormous ogg file and I can't get it to play faster. Ideally, this particular file would play at 170% of the original speed.


Solution 1:

Convert with FFmpeg

FFmpeg has a built-in audio filter for changing the tempo without changing the pitch. We need to encode the file to some format your phone plays. This depends on the phone of course. Many modern smartphones like AAC audio:

ffmpeg -i weird.wma -filter:a "atempo=1.7" -c:a libfaac -q:a 100 final.m4a

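Here, you can change the quality with the -q:a option, where the value is in percent and higher means better. Note that newer FFmpeg releases no longer include libfaac; there, the built-in encoder (-c:a aac, with a bitrate option such as -b:a 64k) is the usual replacement.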

Or, MP3 audio with an (average) quality of 4, where less means better (0 resulting in around 245 kBit/s):

ffmpeg -i weird.wma -filter:a "atempo=1.7" -c:a libmp3lame -q:a 4 final.mp3

If your phone doesn't support any of these, we'll have to dig further. Oh, and I use ffmpeg synonymously with avconv here. They're not quite the same, but for the above cases you can use either tool. If your version of FFmpeg or avconv doesn't bundle FAAC or LAME, go get a static Linux build from the FFmpeg download page.
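
Since the question is about bulk conversion and the commands above handle one file at a time, here is a minimal sketch of a loop around the MP3 command. The function names mp3_name and speed_up_all and the DRY_RUN variable are made up for this example; they are not part of FFmpeg. With DRY_RUN=1 the commands are only printed, so you can check them before converting anything.

```shell
# Hypothetical helper: swap the file extension for .mp3.
mp3_name() {
    printf '%s.mp3\n' "${1%.*}"
}

# Hypothetical helper: speed_up_all TEMPO FILE...
# Runs the MP3 command from above on every file given.
# With DRY_RUN=1 the ffmpeg commands are printed instead of executed.
speed_up_all() {
    tempo=$1
    shift
    for src in "$@"; do
        out=$(mp3_name "$src")
        if [ "${DRY_RUN:-0}" = 1 ]; then
            echo "ffmpeg -i $src -filter:a atempo=$tempo -c:a libmp3lame -q:a 4 $out"
        else
            ffmpeg -i "$src" -filter:a "atempo=$tempo" -c:a libmp3lame -q:a 4 "$out"
        fi
    done
}
```

You would call it as speed_up_all 1.7 *.wma from the directory holding the podcasts. Raising the -q:a value towards 9 gives smaller, lower-quality MP3 files, which may be fine for voice-only audio.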


If you don't like the FFmpeg filters, here is another approach:

Extract raw audio

First of all, you need to extract the raw audio stream in an uncompressed format, e.g. PCM Stereo 16-bit audio in a WAV container.

ffmpeg -i weird.wma temp.wav

Now we can use the file temp.wav to shorten the audio. We have a few options for that:

Option 1: SoX

SoX offers a few different effects that allow you to change speed, pitch or tempo. Simply changing the speed also raises the pitch (like playing a tape too fast), so this might sound unnatural.

The tempo filter uses an advanced algorithm to shorten files but keep the pitch, by dividing it into smaller time windows and then "merging" them, thus speeding up the tempo. For example:

sox temp.wav output.wav tempo 1.7

This might sound a little weird. If it does, resort to option 2.

Option 2: Paul's Extreme Sound Stretch

This program promises to offer better quality than SoX, and there is a command line version written in Python available from GitHub. A command could look like this. Keep in mind that by default it stretches the file, so to shorten it we calculate the inverse of 1.7, which is about 0.59:

python paulstretch_stereo.py -s 0.59 temp.wav output.wav
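
If you want other speeds, the stretch factor is simply 1 divided by the speed-up. A small sketch of that arithmetic with awk follows; stretch_factor is a made-up helper name, not something that ships with Paul's Extreme Sound Stretch.

```shell
# Hypothetical helper: print the stretch factor (1 / speed-up),
# rounded to two decimals. LC_ALL=C keeps the decimal point a dot.
stretch_factor() {
    LC_ALL=C awk -v s="$1" 'BEGIN { printf "%.2f\n", 1 / s }'
}

stretch_factor 1.7   # prints 0.59
```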

Convert raw audio to a compressed file

Now we have a shortened WAV file, but this is still uncompressed, so we need to compress it again. Refer to the options at the top of this post for various formats.

ffmpeg -i output.wav -c:a …

The non-ffmpeg methods outlined above will lose your metadata; you can add it in again when converting output.wav to a compressed format like so:

ffmpeg -i output.wav -i weird.wma -map 0 -map_metadata 1 -c:a ...
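
Put together, the SoX route could be scripted roughly like this. This is only a sketch under a few assumptions: it picks the MP3 settings from the top of this post as the final format, it uses the fixed temporary names temp.wav and output.wav from above, and the names run, speed_up_via_sox and DRY_RUN are invented for the example. With DRY_RUN=1 it prints the commands instead of running them.

```shell
# Hypothetical helper: run a command, or just print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical helper: speed_up_via_sox SOURCE TEMPO OUTPUT
speed_up_via_sox() {
    src=$1
    tempo=$2
    out=$3
    # 1. Decode to uncompressed WAV (-y overwrites a leftover temp file).
    run ffmpeg -y -i "$src" temp.wav &&
    # 2. Change the tempo without changing the pitch.
    run sox temp.wav output.wav tempo "$tempo" &&
    # 3. Re-encode, copying the metadata from the original file.
    run ffmpeg -i output.wav -i "$src" -map 0 -map_metadata 1 -c:a libmp3lame -q:a 4 "$out" &&
    # 4. Remove the temporary files.
    run rm -f temp.wav output.wav
}
```

For example: speed_up_via_sox weird.wma 1.7 final.mp3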