What is a container in YARN?
What is a container in YARN? Is it the same as the child JVM in which tasks run on the NodeManager, or is it something different?
A container represents an allocation of resources (memory) on a single node of a given cluster.
A container is
- supervised by the NodeManager
- scheduled by the ResourceManager
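Each MapReduce task runs in one such container, so the container is the resource allocation and the task's JVM is the process launched inside it.
A single node can host multiple containers (or one very large container).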
Every node in the system is considered to be composed of multiple containers of a minimum memory size (say 512 MB or 1 GB). The ApplicationMaster can request a container of any size that is a multiple of that minimum.
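To make the arithmetic concrete, here is a minimal sketch of the sizing rule described above. It assumes a request is rounded up to the next multiple of the minimum container size; the function names are hypothetical and are not part of the YARN API.

```python
def round_up_to_minimum(requested_mb: int, minimum_mb: int) -> int:
    """Round a memory request up to the next multiple of the minimum size.

    Assumption: the scheduler only hands out containers whose size is a
    whole multiple of the minimum allocation.
    """
    if requested_mb <= 0 or minimum_mb <= 0:
        raise ValueError("sizes must be positive")
    multiples = -(-requested_mb // minimum_mb)  # ceiling division
    return multiples * minimum_mb


def containers_that_fit(node_mb: int, container_mb: int) -> int:
    """How many containers of the given size fit in a node's memory."""
    if node_mb < 0 or container_mb <= 0:
        raise ValueError("invalid sizes")
    return node_mb // container_mb


if __name__ == "__main__":
    minimum_mb = 512
    granted = round_up_to_minimum(1500, minimum_mb)
    print(granted)                             # size of the granted container
    print(containers_that_fit(8192, granted))  # containers per 8 GB node
```

With a 512 MB minimum, a request for 1500 MB would become a 1536 MB container under this assumption, and a node offering 8192 MB could hold five of them.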
Source, see section ResourceManager/Resource Model.