Very uneven CPU utilization with SQL Server 2012 on 2 processor computer with 16 cores / processor

The uneven performance was likely a combination of the 20-core limit combined with the way that sql server schedules threads on NUMA machines. Unfortunately, SQL Server 2012 doesn't use any intelligence in deciding which 20 cores to utilize, resulting in an unbalanced number of cores per NUMA node. With 32 cores spread across 2 NUMA Nodes, you'll likely end up with a 16/4 split. This is problematic because SQL will try to balance activity equally across NUMA nodes in a round-robin fashion (assuming that you're not forcing affinity w/resource governor).

In your case, 1/2 the load is assigned to 4 cores, and 1/2 to 16 cores. The bottleneck on the 4-core node effectively acts as a throttle, limiting the capacity of the machine to 2x 4 cores = 8 cores = 25% CPU usage.

Once you upgraded to core edition, then sql utilized all 32 cores across 2 numa nodes (16/16 split). Performance improved, etc.

One option that could have improved your performance would have been to utilize sql server resource governor to affinitize the majority of your workload to one numa node. For example, you could create a resource pool WEB_APP, and affinitize it to only run on the 16 core numa node. The load assigned to the WEB_APP pool could utilize 50% of the server capacity, plus the remaining 12.5% capacity from the 4-core node.

The other option would be limit the cores available to sql server to only be 10 from each numa node.