Performance of Dedicated vs Virtual Server

Solution 1:

One difference may be that most hypervisors do not make all performance-enhancing CPU features available to their guests. (Among other reasons, this is because live migration would fail if a guest were using CPU extensions present on the old host but missing on the target host.)

By default, most hypervisors expose only a limited but almost universally compatible subset of CPU features. If your application benefits from features outside that subset, they may have been available only on your bare-metal server at the time of testing.

A good example of a CPU feature that can be missing in both KVM (QEMU) and Hyper-V guests with default settings is the AES-NI instruction set, which normally speeds up AES encryption/decryption significantly.
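
A quick way to check is to compare the CPU flags the kernel reports on the dedicated machine with those reported inside the guest. This is a minimal sketch assuming Linux on both sides; it only looks for a few illustrative flags in `/proc/cpuinfo`:

```python
#!/usr/bin/env python3
# Check whether AES-NI and a few related CPU features are visible to this system.
# Run the same script on the bare-metal server and inside the guest and compare.

def cpu_flags(path="/proc/cpuinfo"):
    """Return the set of CPU feature flags reported by the kernel."""
    with open(path) as f:
        for line in f:
            if line.startswith("flags"):
                return set(line.split(":", 1)[1].split())
    return set()

flags = cpu_flags()
for feature in ("aes", "pclmulqdq", "avx", "avx2"):
    status = "present" if feature in flags else "MISSING"
    print(f"{feature:12s} {status}")
```

If a flag shows up on the host but not in the guest, the hypervisor's default CPU model is hiding it; exposing the host CPU model to the guest (e.g. "host-passthrough" in libvirt terms) usually brings it back, at the cost of live-migration flexibility.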

Solution 2:

This is, unfortunately, expected. Virtualization has a significant impact on the performance of CPU-intensive applications, especially when there is a high level of process concurrency. I have been testing the performance impact of virtualization over the years, and while things have improved over the past decade, they haven't improved all that much. Even the current generation of AMD CPUs, which are unburdened by Spectre and Meltdown mitigations, exhibit a performance penalty of 17-25% when virtualized on a purely CPU-bound workload.
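
A rough way to reproduce this kind of comparison yourself is to run an identical, purely CPU-bound micro-benchmark on the dedicated server and in the guest. The sketch below is only illustrative; the workload, iteration count, and timing method are arbitrary choices, not the benchmark these figures are based on:

```python
#!/usr/bin/env python3
# Purely CPU-bound micro-benchmark: time a fixed amount of hashing work.
# Run the same script on bare metal and in the VM and compare the wall-clock times.
import hashlib
import time

ITERATIONS = 2_000_000  # arbitrary; pick a value that runs for a few seconds

def burn(iterations):
    data = b"x" * 64
    for _ in range(iterations):
        data = hashlib.sha256(data).digest()
    return data

start = time.perf_counter()
burn(ITERATIONS)
elapsed = time.perf_counter() - start
print(f"{ITERATIONS} sha256 rounds in {elapsed:.2f} s "
      f"({ITERATIONS / elapsed:,.0f} ops/s)")
```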

The overhead manifests particularly badly under heavy load with high concurrency, because context switching is much more expensive under a hypervisor. A light workload with low concurrency suffers a relatively small performance penalty, but a heavily concurrent workload that saturates the guest CPUs performs significantly worse.
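
To see the concurrency effect specifically, the same kind of workload can be swept across an increasing number of worker processes until the guest CPUs are saturated; under a hypervisor the throughput curve typically flattens earlier and degrades more. A hedged sketch (the worker counts and iteration figures are my own placeholders):

```python
#!/usr/bin/env python3
# Sweep a CPU-bound workload across increasing process counts to see how
# throughput scales with concurrency. Compare the curve on bare metal vs. in a VM.
import hashlib
import os
import time
from multiprocessing import Pool

WORK_PER_PROC = 500_000  # arbitrary iteration count per worker

def burn(iterations):
    data = b"x" * 64
    for _ in range(iterations):
        data = hashlib.sha256(data).digest()

if __name__ == "__main__":
    ncpus = os.cpu_count() or 1
    for workers in (1, 2, ncpus, ncpus * 2):
        start = time.perf_counter()
        with Pool(workers) as pool:
            pool.map(burn, [WORK_PER_PROC] * workers)
        elapsed = time.perf_counter() - start
        total = WORK_PER_PROC * workers
        print(f"{workers:3d} workers: {total / elapsed:,.0f} ops/s total")
```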

Some of the performance overhead can be mitigated by:

  1. Pinning vCPUs to physical CPUs
  2. Allocating the guest memory from huge memory pages (2MB vs. 4KB).

On the latest AMD CPUs, these measures typically limit the overhead to no more than 20%, even under a heavily concurrent workload.
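
For item 1, libvirt users would normally pin vCPUs via `<cputune>`/`vcpupin` in the domain XML or with `virsh vcpupin`. The sketch below does the equivalent directly for a plain QEMU/KVM process, assuming QEMU's usual convention of naming its vCPU threads `CPU n/KVM`; the PID and the vCPU-to-physical-CPU mapping are placeholders you would adapt to your own host:

```python
#!/usr/bin/env python3
# Pin each QEMU vCPU thread to a dedicated physical CPU (equivalent in spirit
# to libvirt's <cputune>/vcpupin). Assumes the vCPU threads are named
# "CPU <n>/KVM", which is QEMU's usual convention; adjust QEMU_PID and the
# mapping for your own setup.
import os
import re

QEMU_PID = 12345                         # placeholder: PID of the qemu-kvm process
VCPU_TO_PCPU = {0: 2, 1: 3, 2: 4, 3: 5}  # placeholder: vCPU index -> host CPU

task_dir = f"/proc/{QEMU_PID}/task"
for tid in os.listdir(task_dir):
    with open(f"{task_dir}/{tid}/comm") as f:
        name = f.read().strip()
    match = re.match(r"CPU (\d+)/KVM", name)
    if not match:
        continue                         # not a vCPU thread (I/O, worker threads, ...)
    vcpu = int(match.group(1))
    pcpu = VCPU_TO_PCPU[vcpu]
    os.sched_setaffinity(int(tid), {pcpu})
    print(f"pinned vCPU {vcpu} (tid {tid}) to physical CPU {pcpu}")
```

For item 2, whether the guest memory actually ended up backed by huge pages can be verified on the host, for example by watching the `HugePages_*` counters in `/proc/meminfo` (for explicitly reserved huge pages) or `AnonHugePages` in the QEMU process's `/proc/<pid>/smaps_rollup` (for transparent huge pages).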