Which VoIP SIP audio codec should I choose for high-quality calls?

Solution 1:

As of today, one would most likely use Opus, which outperforms most other codecs, as can be seen in the following chart (from Wikipedia):

Opus covers the entire audio bandwidth range (from narrowband to fullband) and generally provides better quality than even dedicated speech codecs, thanks to its ability to switch encoding modes dynamically depending on the bitrate and bandwidth.
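In SIP, the codec is negotiated through SDP, and the order of payload types on the m= line expresses preference. Here is a minimal sketch of an audio media section that prefers Opus and falls back to G.711. The port 49170 and the dynamic payload type 111 are arbitrary choices for illustration (any value from 96 to 127 would do), and `audio_media_section` is a hypothetical helper, not part of any SIP library.

```python
# Codecs in order of preference: (RTP payload type, rtpmap encoding).
CODECS = [
    (111, "opus/48000/2"),  # dynamic payload type; 111 is an arbitrary pick
    (0, "PCMU/8000"),       # G.711 mu-law, static payload type 0
    (8, "PCMA/8000"),       # G.711 A-law, static payload type 8
]

def audio_media_section(port, codecs=CODECS):
    """Build the m=audio section of an SDP offer (hypothetical helper)."""
    payload_types = " ".join(str(pt) for pt, _ in codecs)
    lines = [f"m=audio {port} RTP/AVP {payload_types}"]
    lines += [f"a=rtpmap:{pt} {encoding}" for pt, encoding in codecs]
    # SDP lines are terminated with CRLF.
    return "\r\n".join(lines) + "\r\n"

print(audio_media_section(49170))
```

Whether Opus actually gets used still depends on the other endpoint supporting it; otherwise the call falls back to the next codec both sides share.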

My old answer from 2013 is below.


According to a few studies I've read, G.711 seems to provide the best tradeoff between required bandwidth, compression delay and subjective audio quality.

Cisco published an article in 2006, Understanding Codecs: Complexity, Hardware Support, MOS, and Negotiation, in which they also evaluated the subjective Mean Opinion Score (MOS) for several codecs, including most of those you mention. MOS values range from 1 to 5, with 5 being the best quality, averaged over a number of listeners (usually more than 12–15).
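To make the MOS idea concrete: it is simply the arithmetic mean of the individual listener ratings. A small sketch follows; the ratings are made up for illustration and are not taken from the Cisco article, and `mean_opinion_score` is a hypothetical helper name.

```python
from statistics import mean

def mean_opinion_score(ratings):
    """Average listener ratings on the 1 (bad) to 5 (excellent) scale."""
    if not ratings:
        raise ValueError("at least one rating is required")
    if any(not 1 <= r <= 5 for r in ratings):
        raise ValueError("ratings must be between 1 and 5")
    return mean(ratings)

# Hypothetical panel of 12 listeners rating one codec sample.
ratings = [4, 5, 4, 4, 3, 4, 5, 4, 4, 4, 4, 4]
print(f"MOS = {mean_opinion_score(ratings):.2f}")
```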

The results were as follows:

+---------------------+------------------+------------+-------------------------+
| Compression Method  |  Bit Rate (kbps) |  MOS Score |  Compression Delay (ms) |
+---------------------+------------------+------------+-------------------------+
| G.711 PCM           |  64              |  4.1       |  0.75                   |
| G.726 ADPCM         |  32              |  3.85      |  1                      |
| G.728 LD-CELP       |  16              |  3.61      |  3 to 5                 |
| G.729 CS-ACELP      |  8               |  3.92      |  10                     |
| G.729 x 2 Encodings |  8               |  3.27      |  10                     |
| G.729 x 3 Encodings |  8               |  2.68      |  10                     |
| G.729a CS-ACELP     |  8               |  3.7       |  10                     |
| G.723.1 MP-MLQ      |  6.3             |  3.9       |  30                     |
| G.723.1 ACELP       |  5.3             |  3.65      |  30                     |
+---------------------+------------------+------------+-------------------------+

As you can see, G.711 still requires more bandwidth than the other codecs, which were developed for ultra-low bandwidth applications, but in your case, with ADSL, this is no issue. What you get is a very low delay with good MOS values.
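Keep in mind that the bit rates in the table are codec payload rates only. On the wire, each packet also carries RTP, UDP and IP headers. A rough sketch of the per-call bandwidth, assuming 20 ms packetization, IPv4 without options, and ignoring any layer-2 (Ethernet, PPPoE, ATM) overhead, which is not covered here:

```python
# Per-packet header overhead: IPv4 (20) + UDP (8) + RTP (12) bytes.
# Assumes no IP options and no RTP header extensions.
IP_UDP_RTP_HEADER_BYTES = 20 + 8 + 12

def ip_bandwidth_kbps(codec_kbps, ptime_ms=20):
    """IP-level bandwidth for one direction of a call, in kbps."""
    payload_bytes = codec_kbps * 1000 / 8 * ptime_ms / 1000
    packets_per_second = 1000 / ptime_ms
    packet_bytes = payload_bytes + IP_UDP_RTP_HEADER_BYTES
    return packet_bytes * 8 * packets_per_second / 1000

for name, kbps in [("G.711", 64), ("G.726", 32), ("G.729", 8)]:
    print(f"{name}: {ip_bandwidth_kbps(kbps):.1f} kbps per direction")
```

Under these assumptions G.711 comes out at about 80 kbps per direction, G.726 at 48 kbps and G.729 at 24 kbps, so the headers matter proportionally much more for the low-bitrate codecs than the table suggests.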

A more recent 2009 study by Karapantazis et al. gives an even better overview:

As you can see, there are also certain wideband codecs that you could take into account, Speex being a very popular one.

Solution 2:

I suspect that the latest audio codecs worth considering are, in this order: Opus, SILK and Speex.

  • Opus

Opus is a lossy audio coding format developed by the Internet Engineering Task Force (IETF) that is particularly suitable for interactive real-time applications over the Internet.

All known software patents which cover Opus are licensed under royalty-free terms.

Opus incorporates technology from two other audio coding formats: the speech-oriented SILK and the low-latency CELT.

CELT itself was originally developed by the Xiph.Org Foundation (as part of the Ogg codec family).

  • SILK

SILK is an audio compression format and audio codec developed by Skype Limited.

Since being licensed out, it has also been used by others. It has been extended into the Internet standard Opus codec.

  • Speex

Speex is a patent-free audio compression format designed for speech, and also a free software speech codec that may be used in VoIP applications and podcasts. It is based on the CELP speech coding algorithm. Speex claims to be free of any patent restrictions and is licensed under the revised (3-clause) BSD license. It may be used with the Ogg container format or transmitted directly over UDP/RTP.

The Speex designers see their project as complementary to the Vorbis general-purpose audio compression project.

Xiph.Org now considers Speex obsolete; its successor is the more modern Opus codec, which surpasses its performance in all areas.