ffmpeg not using full CPU power
I have a Dell Precision 490 workstation running Ubuntu 11.10, with two dual-core Xeon 5160 CPUs (so 2 physical CPUs with 2 cores each), which I am using to encode TV recordings (MPEG-2) to H.264 using ffmpeg and x264. ffmpeg is encoding at approximately 180 fps and everything works well, but CPU usage is a little low, hovering around 30% per core, when I would expect close to 100%. Does anyone know why this is?
This is the command I'm running
time ffmpeg -i ch31.m2t -s 640x360 -acodec libfaac -ac 1 -ar 44100 -b:a 56k -vcodec libx264 -preset superfast -b:v 744k ch31_superfast_800.mp4
Output of uname -a
Linux dell 3.0.0-14-generic #23-Ubuntu SMP Mon Nov 21 20:28:43 UTC 2011 x86_64 x86_64 x86_64 GNU/Linux
Output of grep -i core /proc/cpuinfo
core id : 0
cpu cores : 2
core id : 0
cpu cores : 2
core id : 1
cpu cores : 2
core id : 1
cpu cores : 2
Output of mpstat -P ALL
Linux 3.0.0-14-generic (dell)  18/12/11  _x86_64_  (4 CPU)
15:46:52  CPU   %usr  %nice  %sys  %iowait  %irq  %soft  %steal  %guest  %idle
15:46:52  all  16.37  10.29  1.20     3.46  0.00   0.13    0.00    0.00  68.54
15:46:52    0  21.43  10.91  1.74     1.06  0.00   0.39    0.00    0.00  64.47
15:46:52    1  15.64   9.95  1.08     5.29  0.00   0.07    0.00    0.00  67.97
15:46:52    2  14.65  10.29  1.03     5.88  0.00   0.03    0.00    0.00  68.11
15:46:52    3  13.79  10.03  0.97     1.59  0.00   0.02    0.00    0.00  73.60
Output of top whilst ffmpeg is running
top - 15:35:59 up 1:11, 3 users, load average: 0.00, 0.15, 0.29
Tasks: 162 total, 2 running, 159 sleeping, 0 stopped, 1 zombie
Cpu0 : 29.9%us, 1.7%sy, 31.6%ni, 36.9%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu1 : 30.4%us, 1.3%sy, 25.8%ni, 42.5%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu2 : 38.7%us, 2.0%sy, 22.7%ni, 36.0%id, 0.7%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu3 : 27.2%us, 1.0%sy, 27.9%ni, 43.5%id, 0.3%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 4056484k total, 3939144k used, 117340k free, 16532k buffers
Swap: 4190204k total, 9348k used, 4180856k free, 3024616k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
3559 decrypti 20 0 321m 91m 4708 R 240 2.3 45:52.11 ffmpeg
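For reference when reading these figures: top's %CPU column is per process and summed over all cores, so it can exceed 100 on a multi-core machine. The sketch below just does that arithmetic for the 240 shown for ffmpeg on this 4-CPU box. The function name `cpu_share` is made up for illustration and is not part of any tool.

```shell
#!/bin/sh
# Hypothetical helper: convert top's per-process %CPU (summed across cores)
# into "cores kept busy" and "share of the whole machine".
#   $1 = %CPU value from top, $2 = number of logical CPUs
cpu_share() {
    awk -v p="$1" -v n="$2" 'BEGIN {
        printf "%.1f cores busy (%.0f%% of %d cores)\n", p / 100, p / n, n
    }'
}

# Values taken from the top and mpstat output in the question.
cpu_share 240 4
```

With the numbers above this prints `2.4 cores busy (60% of 4 cores)`, which lines up with the roughly 40% idle that top shows on each core.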
Solution 1:
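Like most software, ffmpeg is not doing a good job of taking advantage of multiple cores here. The scheduler may move its threads between cores very rapidly (around 100 times per second), which is why the load looks evenly spread across all four, but the process as a whole is not keeping every core busy: top reports 240% CPU for ffmpeg, which is more than one core's worth but well short of the 400% that four fully loaded cores would show.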
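One thing that may be worth trying, offered as a sketch rather than a guaranteed fix, is to give ffmpeg an explicit thread count with its `-threads` option and then watch per-core usage while the encode runs. Whether this helps is an assumption: if a stage other than the x264 encode is the bottleneck, extra encoder threads will not change much. The helper name `build_cmd` is hypothetical; it only prints the command from the question with `-threads` added, so it can be reviewed before running.

```shell
#!/bin/sh
# Hypothetical helper: print the ffmpeg command from the question with an
# explicit -threads value added as an output option. Nothing is executed.
#   $1 = number of threads to request
build_cmd() {
    threads="$1"
    echo "ffmpeg -i ch31.m2t -s 640x360 -acodec libfaac -ac 1 -ar 44100 -b:a 56k" \
         "-vcodec libx264 -preset superfast -b:v 744k -threads $threads" \
         "ch31_superfast_800.mp4"
}

# Use the number of logical CPUs if nproc is available; otherwise fall back
# to 4, the CPU count shown in the question.
ncpu=$(nproc 2>/dev/null || echo 4)
build_cmd "$ncpu"
```

Paste the printed command into a shell to run it. Note that mpstat run without an interval reports averages since boot, so running `mpstat -P ALL 1` in a second terminal gives a better picture of per-core load during the encode.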