vCPU performance: 1 vCPU vs. 2 vCPUs
Solution 1:
Not enough of a difference to make an impact. The distinction matters more for licensing. For example, Windows Server is licensed per processor socket, so you'd pay more for 4 CPUs with 1 core each than for 1 CPU with 4 cores.
The same goes for other products whose costs rise quickly with additional processors (looking at you, Oracle).
Solution 2:
Short answer: Probably none that you would notice.
Long answer: Maybe. The issue that comes to mind first and foremost is that modern CPUs operate much faster than the main memory they use, which is the primary reason NUMA (non-uniform memory access) was invented. Cores on the same die (e.g. two cores on the same chip) share the same NUMA node, and they can both access memory attached to that node faster than they can access memory belonging to another NUMA node. So if you are building a machine that will have many cores and many physical processors, keep NUMA node locality in mind: if a processor has to access memory that is far away, it will be slower.
NUMA is not going to be an issue for you if your machine only has one or two processors, but I thought I would mention it anyway, just for completeness' sake.
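If you're curious what the NUMA layout of a Linux host actually looks like, here is a minimal sketch, assuming the kernel exposes topology under /sys/devices/system/node (the same information `numactl --hardware` reports):

```python
#!/usr/bin/env python3
"""Minimal sketch: print each NUMA node's CPU list and inter-node distances.

Assumes a Linux host exposing NUMA topology under /sys/devices/system/node.
"""
import glob
import os

for node_path in sorted(glob.glob("/sys/devices/system/node/node[0-9]*")):
    node = os.path.basename(node_path)
    # CPUs that belong to this NUMA node, e.g. "0-3"
    with open(os.path.join(node_path, "cpulist")) as f:
        cpulist = f.read().strip()
    # Relative access cost from this node to every node (10 = local memory)
    with open(os.path.join(node_path, "distance")) as f:
        distances = f.read().split()
    print(f"{node}: cpus={cpulist} distances={' '.join(distances)}")
```

On a single-socket box you'll typically see just node0, which is exactly why NUMA locality won't matter in that case.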
Solution 3:
While there shouldn't be a difference here, my benchmarks have shown a slight (but consistent) performance increase in Windows guests when using single-core, multi-socket emulation (e.g. 4 vCPUs mapped as 4 sockets, each with a single core and a single thread). No visible difference in Linux guests, though.
Tests were done on KVM, using Windows Server 2003 R2 and 2008 R2 guests, plus RHEL 5 and RHEL 6 guests for the Linux side of things. My guess is that Windows tries some extra scheduling tricks that either excel on multiple sockets or fall flat on multiple cores.
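For reference, the single-core, multi-socket mapping described above can be expressed in a libvirt domain definition roughly like this; the 4-socket layout is just the example from the benchmark description, not a recommendation (this is only a fragment of the guest's domain XML):

```xml
<!-- 4 vCPUs presented to the guest as 4 single-core, single-thread sockets
     (equivalent to qemu's "-smp 4,sockets=4,cores=1,threads=1") -->
<vcpu>4</vcpu>
<cpu>
  <topology sockets='4' cores='1' threads='1'/>
</cpu>
```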