How to utilize the High Performance cores on Apple Silicon

I have developed a macOS app which is heavily relying on multithreading (a call center simulator). It runs fine on my iMac 2019 and fills up all cores nicely. In my test scenario it simulates app. 1.4 mio. telephone calls in total in 100 iterations, each iteration as a dispatch item on a parallel dispatch queue.

Core usage on Intel iMac

Now I have bought a new Mac mini with M1 Apple Silicon and I was eager to see how the performance develops on that test machine. Well, it’s not bad but not as good as I expected:

System Duration
iMac 2019, Intel 6-core i5, 3.0 GHz, Catalina macOS 10.15.7 19.95 s
Mac mini, M1 8-core, Big Sur macOS 11.2, Rosetta2 26.85 s
Mac mini, M1 8-core, Big Sur macOS 11.2, native ARM 17.07 s

Investigating a little bit further I noticed that at the start of the simulation all 8 cores of the M1 Mac are filled up properly but after a few seconds only the 4 high efficiency cores are used any more.

Core usage on M1 Mac mini

I have read the Apple docs „Optimize for Apple Silicon with performance and efficiency cores“ and double checked that the dispatch queue for the iterations is set up properly:

let simQueue = DispatchQueue.global(qos: .userInitiated)

But no success. After a few seconds of running the high performance cores are obviously not utilized any more. I even tried to set up the queue with qos set to .userInteracive up that didn’t help either. I also flagged the dispatch items with proper qos but that didn’t change anything. It looks to me that other apps (e.g. XCode) do utilize the high performance cores even for a longer time.

Does anybody know how to force a M1 Mac to utilize the high performance cores?


Solution 1:

"M1 8 core" is really "M1 4 performance + 4 power saving cores". I expect it to have be a bit more performance than an Intel 6 core, but not much. Exactly has you see, 15% faster than six Intel cores or about as fast as 7 Intel cores would be. The current M1 chips are low end processors. "A bit better than Intel six cores" is quite good.

Your code must be running on the performance cores, otherwise there would be no chance at all to come close to the Intel performance. In that graph, nothing tells you which cores are used.

What happens most likely is that all cores start running, each trying to do one eighth of the work, and after about 8 seconds the performance cores have their work done. Then the power saving cores move their work to the performance cores. And you are just misinterpreting the image as only low performance cores doing the work.