What is a Codec (e.g. DivX?), and how does it differ from a File Format (e.g. MPG)?

I'm so confused... what is the difference between an audio/video codec (which apparently is a shorthand for "encoder/decoder", kind of like how "modem" is really "modulator/demodulator") and an audio/video format?
(Am I even using the correct terminology?)

i.e.: What is the difference between saying something is "MPEG-4" and saying something uses the "DivX" codec? Why does Windows Media Player sometimes run .mpg files, and sometimes not?

Also, which of the following are codecs, which ones are file formats, and which ones are neither?

  • Quicktime MOV
  • MPEG (1, 2, 3, 4)
  • WMV
  • FFmpeg
  • AVC
  • Xvid
  • DivX (how is it different from its palindrome, Xvid?)
  • H.264

Solution 1:

Some main definitions:

  • A codec (e.g., H.264, HEVC, VP9) is only responsible for the video or audio part, and one or more codecs can be merged into a container.
  • A container (e.g., MP4, MKV) is responsible for keeping them together and this is also what you usually open up in your media player of choice.
  • A particular encoder (e.g., x264, libvpx) is responsible for turning an input stream into a codec-compliant bitstream. There are often multiple encoders for one particular codec.

As you can see, we'll have to explain a few things here.

What is a codec?

A codec is short for encoder/decoder, which basically just means the following: Data generated by an encoder can always be decoded by an appropriate decoder. This holds for video and audio, but you could also think of cryptography (an encrypted message needs an appropriate decoder before it can be displayed).

Nowadays, when a video codec is specified, the institutions that take part in it usually only specify the syntax of the standard. For example, they will say: "The bitstream format has to be like this", "The 0x810429AAB here will be translated into that", etc. Often they supply a reference encoder and decoder, but how an encoder is then written to match such a format completely is up to manufacturers.

This is the reason why you will find so many encoders for the very same codec, some of them commercial.

A case example – H.264

Before we mix up terminology, let's take an example. Consider the case for H.264. The standard's name is H.264 – that's not the name of the actual encoder. Mainconcept is a very good commercial encoder, whereas x264 is a free and open source one. Both claim to deliver good quality, of course.

The mere fact that you can optimize encoding makes for a competition here. Both encoders will deliver a standardized bitstream that can always be decoded by an H.264-compliant decoder.

To summarize

So, all in all, let's just say that an encoder will:

  • take video frames
  • produce a valid bitstream

The bitstream is then multiplexed into a container.

The decoder will:

  • take that valid bitstream
  • reconstruct the video frames from it

They both conform to a codec standard. That's all!
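
To make the "one standard, many encoders" idea concrete, here is a minimal sketch using a made-up toy codec (run-length pairs). It has nothing to do with how H.264 actually works; the bitstream rule and the function names are my own assumptions, chosen only to show two different encoders feeding the same decoder.

```python
# Toy "codec" (hypothetical): the bitstream is a sequence of (count, value)
# byte pairs with 1 <= count <= 255. Anything that emits such pairs is a
# valid encoder for this toy standard.

def decode(bitstream: bytes) -> bytes:
    """The one decoder: expand every (count, value) pair."""
    if len(bitstream) % 2 != 0:
        raise ValueError("bitstream must consist of (count, value) pairs")
    out = bytearray()
    for i in range(0, len(bitstream), 2):
        count, value = bitstream[i], bitstream[i + 1]
        if count == 0:
            raise ValueError("count must be at least 1")
        out.extend([value] * count)
    return bytes(out)

def encode_naive(data: bytes) -> bytes:
    """Encoder A: valid but wasteful, one pair per input byte."""
    out = bytearray()
    for value in data:
        out.extend((1, value))
    return bytes(out)

def encode_greedy(data: bytes) -> bytes:
    """Encoder B: merges runs of equal bytes (at most 255 per pair)."""
    out = bytearray()
    i = 0
    while i < len(data):
        run = 1
        while i + run < len(data) and data[i + run] == data[i] and run < 255:
            run += 1
        out.extend((run, data[i]))
        i += run
    return bytes(out)

frames = b"\x00" * 300 + b"\x10\x10\x20"   # stand-in for "video frames"
a = encode_naive(frames)
b = encode_greedy(frames)
print(len(a), len(b))                       # 606 8
print(decode(a) == decode(b) == frames)     # True
```

Both outputs are valid bitstreams for the toy standard and decode to the same data; the encoders only differ in how cleverly they use the syntax. That difference is what real encoders compete on.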


Current Codecs

These days, you will probably only find videos encoded with the codecs I will mention below. Interestingly, almost all of them were created by the Moving Picture Experts Group (MPEG). But there are some other, royalty-free codecs as well, e.g. those created by Google or the Alliance for Open Media, which are competitors to the MPEG standards.

Note that "MPEG" can refer to both codecs and containers, as you will see below. This adds to the confusion, but just know that "MPEG" alone doesn't mean anything, e.g. "I have a file in MPEG format" is very ambiguous.

MPEG-2

MPEG-2 is quite old. Its first public release is from 1996. MPEG-2 video is mostly used for DVDs and TV broadcasting, e.g. DVB-T or satellite, and legacy applications where compatibility is important. MPEG-2 videos are mostly found in an .MPG container.

MPEG-4 Part 2

This is probably the one that was used most to encode videos for the web in the mid-2000s, but it has been superseded in the meantime. It offers good quality at practical file sizes, which meant that you could burn a whole 90-minute movie onto a 600 MB CD (whereas with MPEG-2 you would have needed a DVD, see my answer here). It doesn't work so well for HD or 4K content anymore.
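
To get a feeling for what "a 90-minute movie on a 600 MB CD" means, here is a small back-of-the-envelope calculation. It assumes 1 MB = 10^6 bytes and that the whole disc is available for the movie (audio and container overhead share that budget); the 4.7 GB figure is the nominal capacity of a single-layer DVD, added by me for comparison.

```python
def average_bitrate_kbps(size_bytes: float, duration_s: float) -> float:
    """Average bitrate (kbit/s) that fits size_bytes into duration_s."""
    return size_bytes * 8 / duration_s / 1000

movie_seconds = 90 * 60          # 90 minutes
cd_bytes = 600 * 10**6           # 600 MB CD, assuming 1 MB = 10^6 bytes
dvd_bytes = 4.7 * 10**9          # single-layer DVD (assumption for comparison)

print(round(average_bitrate_kbps(cd_bytes, movie_seconds)))   # 889
print(round(average_bitrate_kbps(dvd_bytes, movie_seconds)))  # 6963
```

So the CD leaves you with less than 1 Mbit/s for video and audio together, roughly an eighth of what the DVD would allow for the same movie.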

Some encoders that output MPEG-4 Part 2 video are DivX, its open sourced ripoff XviD, and Nero Digital.

MPEG-4 Part 2 videos mostly come in an AVI container, but MP4 is also seen often.

MPEG-4 Part 10 / AVC / H.264

This is also known as MPEG-4 Advanced Video Coding (AVC) or H.264; it is the most used codec today. It offers good quality at small file sizes and therefore is perfectly suited for all kinds of video for the Internet or mobile devices. You will find H.264 in almost every modern application, from phones to camcorders. On Blu-ray discs, video is now encoded in H.264.

Some encoders for it are: x264, NVENC (from NVIDIA), Mainconcept. The videos mostly come in MP4, MKV or MOV containers.

HEVC / H.265

Also called MPEG-H Part 2, this is the successor of MPEG-4 Part 10 / AVC / H.264. It is aimed at higher resolutions (up to 8K) and can offer up to 50% higher encoding performance (in terms of quality vs. bitrate) compared to H.264 (see this paper, for example).

The standard was published in 2013, and slowly, the codec is beginning to be used more and more, e.g., for IPTV or online video transmissions. HEVC is also used by Apple to store videos and images (using HEIF) on iOS. However, the fact that there are multiple patent pools associated with HEVC has many companies (almost all but Apple) shifting to royalty-free alternatives. HEVC is also not natively supported by all browsers, making it unusable for web streaming.

The best-known encoder is x265. There's also NVENC. The videos usually come in MP4 containers.

VP9 and AV1

VP9 (the successor of VP8) is a codec mainly developed by Google. It is open and royalty-free, and implemented in many browsers. Its quality is almost as good as HEVC, and sometimes even better (see this paper by Netflix). VP9 is what you get when you watch YouTube on a browser that supports it.

VP9 can be encoded with the libvpx encoder, and it often comes in WebM or MKV containers.

Some companies got together to form an even stronger competitor to HEVC – but as a royalty-free alternative. AV1 will be the successor of VP9, and it is based on what was supposed to become VP10. It is backed by the Alliance for Open Media (founded by Amazon, Cisco, Google, Intel, Microsoft, Mozilla, and Netflix). Read more about it here.

The libaom encoder can be used to generate AV1 bitstreams, but it is still experimental.


What is a format (container)?

Until now we've only explained the raw "bitstream", which is basically just the encoded video data with nothing else around it. You could actually go ahead and watch the video using such a raw bitstream. But in most cases that's just not enough or not practical.

Therefore, you need to wrap the video in a container. There are several reasons why:

  • Maybe you want some audio along with the video
  • Maybe you want to skip to a certain part in the video (like, "go to 1:32:20.12")
  • Both audio and video should be perfectly synchronized
  • The video might need to be transmitted over a reliable network and split into packets beforehand
  • The video might even be sent over a lossy network (like 3G) and split into packets beforehand

For all of those reasons, container formats were invented, some simple, some more advanced. What they all do is "wrap" the video bitstream into another bitstream.

A container will synchronize video and audio frames according to their Presentation Time Stamp (PTS), which makes sure they are displayed at exactly the same time. It would also take care of adding information for streaming servers, if necessary, so that a streaming server knows when to send which part of the file.
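
As a rough illustration of the interleaving part, here is a toy muxer that merges audio and video packets by their PTS. It is only a sketch: the `Packet` class, the millisecond timestamps and the packet spacing are made up by me, and real containers store far more than this (headers, indexes, codec parameters).

```python
import heapq
from dataclasses import dataclass

@dataclass(frozen=True)
class Packet:
    pts_ms: int      # presentation timestamp in milliseconds (hypothetical unit)
    stream: str      # "video" or "audio"
    payload: bytes   # encoded data from the respective codec

def mux(*streams):
    """Interleave per-stream packet lists (each already sorted by PTS)."""
    return list(heapq.merge(*streams, key=lambda p: p.pts_ms))

def demux(packets, stream):
    """Pull one stream's packets back out of the interleaved sequence."""
    return [p for p in packets if p.stream == stream]

video = [Packet(t, "video", b"V") for t in (0, 40, 80)]       # 25 fps spacing
audio = [Packet(t, "audio", b"A") for t in (0, 21, 43, 64)]   # made-up spacing

muxed = mux(video, audio)
print([(p.pts_ms, p.stream) for p in muxed])
```

A player reading the muxed sequence front to back always finds the audio and video it needs for a given moment close together, and `demux` gives each decoder its own stream back unchanged.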

Let's take a look at some popular containers.


Popular containers

You will find videos mostly wrapped in the following containers. There are other less popular ones as well, but as I said, mostly, it's those:

AVI

Audio Video Interleave: this is the most basic container, it's just there to interleave audio and video. It was introduced in 1992 and is still used today, but it is considered legacy, so do not use it anymore.

MP4

MP4 is also known as MPEG-4 Part 14 and is based on the QuickTime file format. This is the go-to format for H.264 video, but it also wraps HEVC, MPEG-4 Part 2 and MPEG-2.

This container might also wrap audio only, which is why you'll find so many .mp4 files that are not videos but rather AAC-encoded audio, also found as .m4a files (just a different extension). The extension .m4v is usually taken for video bitstreams.

MKV and WebM

Matroska Video (MKV) is an open sourced and free file format that is often found nowadays, as it supports basically any codec, from H.264 to VP9, and of course also many audio codecs.

WebM is based on MKV and is primarily used for VP9 video and Opus audio – it is the container of choice for web streaming video when these codecs are used.

Ogg

The Ogg container is the container of choice for the Theora video codec (and the Vorbis audio codec), also created by the Xiph.Org Foundation. It's also free and open source (just like the codec).

FLV

The Flash video format was created by Adobe, for use in their streaming applications. It isn't used that much anymore, as the way streaming is done has changed significantly over the last years.
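
Since the file extension is only a hint, the container is better recognized from the first bytes of the file. The sketch below checks a few well-known signatures; the function names are my own, the list is far from complete, and some files (e.g. older QuickTime .mov files that don't start with an `ftyp` box) will come back as "unknown".

```python
def sniff_container(header: bytes) -> str:
    """Guess the container from the first 12 bytes of a file."""
    if header[:4] == b"RIFF" and header[8:12] == b"AVI ":
        return "AVI"
    if header[4:8] == b"ftyp":
        return "MP4/MOV family"
    if header[:4] == b"\x1a\x45\xdf\xa3":      # EBML header
        return "Matroska/WebM"
    if header[:4] == b"OggS":
        return "Ogg"
    if header[:3] == b"FLV":
        return "FLV"
    if header[:4] == b"\x00\x00\x01\xba":      # pack start code
        return "MPEG program stream"
    return "unknown"

def sniff_file(path: str) -> str:
    """Read just enough of the file to run sniff_container on it."""
    with open(path, "rb") as f:
        return sniff_container(f.read(12))

print(sniff_container(b"RIFF\x00\x00\x00\x00AVI "))        # AVI
print(sniff_container(b"\x00\x00\x00\x18ftypisom"))        # MP4/MOV family
```

Note that this only tells you the container. Which codecs are inside is a separate question, and that is usually where a player fails when it opens a file but cannot play it.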


Popular codecs and formats

Also, which of the following are codecs, which ones are file formats, and which ones are neither?

  • QuickTime MOV: .mov is the file extension for the QuickTime File Format, which is a container created by Apple. This container was later adapted for MP4. It can carry all kinds of codecs. QuickTime is actually a whole media framework; it doesn't really specify any codec itself as far as I know.
  • MPEG (1, 2, 3, 4): Standards defined by the Moving Picture Experts Group. See my post above for details.
  • WMV: Windows Media Video. It's actually a codec wrapped in an Advanced Systems Format container, which uses the .wmv extension again. Weird, but that's the way it is.
  • FFmpeg: This is neither a codec nor a container. It is a collection of video tools and libraries that also allows conversion between different codecs and containers. FFmpeg relies on the open source libavcodec and libavformat libraries for handling codecs and containers, respectively. Most of the video tools you find today are based on it.
  • AVC: Synonym for MPEG-4 Part 10 or H.264.
  • DivX: Another type of encoder for MPEG-4 Part 2 video.
  • Xvid: One type of encoder for MPEG-4 Part 2 video. It's just the open source, free version of DivX, which of course led to some controversy.
  • H.264: Synonym for MPEG-4 Part 10 or AVC.

On a side note:

Am I even using the correct terminology?

I guess one would prefer to specifically use "codec" and "container" instead of "format" to avoid misunderstandings. A format can theoretically be anything, because both codecs and containers specify a format (i.e. how data should be represented).

That being said, the FFmpeg terminology would be to use "format" for the container. This is also because of the distinction between:

  • libavcodec, the library for encoding/decoding
  • libavformat, the library for the containers
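
You can see this split directly in what ffprobe (which ships with FFmpeg) reports: one "format" entry for the container and one "streams" entry per codec stream. The following is a sketch under assumptions: the helper names are mine, it needs ffprobe on your PATH to actually probe a file, and the sample dictionary is hand-written in the shape of ffprobe's JSON output rather than taken from a real run.

```python
import json
import subprocess

def ffprobe_command(path):
    """Command line asking ffprobe for container and stream info as JSON."""
    return ["ffprobe", "-v", "error", "-print_format", "json",
            "-show_format", "-show_streams", path]

def summarize(probe):
    """Reduce ffprobe's JSON to 'which container, which codecs'."""
    return {
        "container": probe.get("format", {}).get("format_name"),
        "codecs": [(s.get("codec_type"), s.get("codec_name"))
                   for s in probe.get("streams", [])],
    }

def probe_file(path):
    """Run ffprobe on a file (requires FFmpeg to be installed)."""
    result = subprocess.run(ffprobe_command(path),
                            capture_output=True, text=True, check=True)
    return summarize(json.loads(result.stdout))

# Hand-written sample in the shape of ffprobe's output, not a real run.
sample = {
    "format": {"format_name": "mov,mp4,m4a,3gp,3g2,mj2"},
    "streams": [
        {"codec_type": "video", "codec_name": "h264"},
        {"codec_type": "audio", "codec_name": "aac"},
    ],
}
print(summarize(sample))
```

The container name comes from the libavformat side, the codec names from the libavcodec side, which is exactly the "format" versus "codec" distinction described above.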

Solution 2:

In general, a media 'format' is really a container, containing an audio stream (of some audio codec) and a video stream (of some video codec) and sometimes additional information. Most 'files' you have get their file type from the container and not the codec.

FFmpeg is neither a container nor a codec - it's a versatile suite of libraries, codecs and software for conversion of files that underlies many converters and music players.

H.264/AVC and xvid/divx are codecs

AVI (which divx/xvid files are), mp4, mpeg are containers.

I'm not sure about quicktime mov - .mov is a container, quicktime is a codec.

Solution 3:

There are codecs and containers (file formats). The codec describes how the data is encoded/decoded. The container describes how the encoded data is placed inside the file.

Most media players support multiple codec and container types. This is confusing, so I suggest you read my references for more information.