How large is my 2012 MacBook Air's L1 cache? Are they larger now?

Your current MacBook Air has an L1 cache of 128 kB. So your guess of 10^5 bytes is not far off!

The latest MacBook Air has the same amount of L1 cache, so in that respect it is the same. However, other parts of the CPU are vastly different.

Note that you might see some of these CPUs listed as having "Smart Cache", which essentially means that some caches are shared among multiple cores. In your case the L1 cache is never shared - it is always a private cache for each core. This means that if your code is single threaded, the amount of L1 cache available to that program is 64 kB, not 128 kB.

In addition to that, the L1 cache is not really a single cache. It is split into separate portions for instructions and data. So of those 64 kB, 32 kB are used for data and 32 kB for instructions (program code).
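If you want to confirm these numbers on your own machine, macOS exposes the cache geometry through sysctl. Here is a minimal Python sketch, assuming a Mac that reports the usual hw.* keys (on a 2012 Intel MacBook Air, hw.l1icachesize and hw.l1dcachesize should each come back as 32768 bytes):

```python
import subprocess

def sysctl_bytes(name):
    """Read an integer sysctl value; return None if the key is unavailable."""
    try:
        return int(subprocess.check_output(["sysctl", "-n", name], text=True).strip())
    except (subprocess.CalledProcessError, ValueError, FileNotFoundError):
        return None

# Cache sizes as reported by macOS (values are in bytes).
for key in ("hw.l1icachesize", "hw.l1dcachesize", "hw.l2cachesize", "hw.l3cachesize"):
    value = sysctl_bytes(key)
    print(key, "=", f"{value // 1024} kB" if value else "not reported")
```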

You might wonder why the new MacBook Air has the same amount of L1 cache and conclude that it hasn't kept up with the rest of the industry - but that's not really the case. L1 cache sizes have been remarkably stable for years: 20 years ago you might have seen 32 kB of L1 cache per core, and for many years now 64 kB per core has been pretty much the standard.

For comparison, the latest Mac Pro (a desktop computer) with the most expensive CPU option (the 24-core Intel Xeon W-3265M) also has 64 kB of L1 cache per core.

Just increasing the size of the L1 cache won't really do you any good in terms of overall performance. L1 cache is the most "expensive" type of cache, both in physical space taken up on the die and in power usage. In addition, the L1 has to respond within just a few clock cycles (its lookup even overlaps with the TLB lookup), and a larger L1 would have a higher latency, so the benefit of extra size quickly diminishes. In essence, you'll get more benefit from a larger L2 or L3 cache than from increasing the size of the L1 cache - and those are much "cheaper" to enlarge.
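A rough way to see these diminishing returns yourself is to time a fixed number of random reads against lookup tables of growing size: once the table no longer fits in a given cache level, each read gets noticeably slower, and the jump from "fits in L1" to "fits in L2" is far smaller than the jump out to main memory. A sketch of such a measurement (the sizes and access count are arbitrary choices, and the timing also includes building the gathered copy, so only the trend across sizes is meaningful):

```python
import time
import numpy as np

rng = np.random.default_rng(0)
N_ACCESSES = 5_000_000  # same amount of work for every table size

def gather_seconds(table_bytes):
    """Time N_ACCESSES random reads from a lookup table of the given size."""
    table = np.ones(table_bytes // 8, dtype=np.float64)  # 8 bytes per element
    idx = rng.integers(0, table.size, size=N_ACCESSES)
    start = time.perf_counter()
    table[idx].sum()  # gather + reduce; the random reads dominate
    return time.perf_counter() - start

# Working sets from well inside L1 (16 kB) to far beyond L3 (256 MB).
for kb in (16, 32, 64, 256, 2048, 16384, 262144):
    print(f"{kb:>7} kB table: {gather_seconds(kb * 1024):.3f} s")
```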

All in all, this seems to indicate that L1 cache size isn't the deciding factor that makes your computer slow and others fast.

I would guess that a big part of the difference lies in the fact that your old processor does not support the same "number-crunching" instruction sets as a new processor does. This small test example doesn't really show it, but from your links it appears that you're really doing computations with vectors. Specialized CPU instructions exist that greatly benefit those types of computations - and numpy can take advantage of at least some of them. Your current CPU supports SSE4 and the original AVX instruction set, whereas a new MacBook Air additionally supports AVX2 and FMA.
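If you want to see which of these instruction sets your NumPy build actually detected on each machine, recent NumPy versions (roughly 1.20 and later) keep an internal table of CPU features. The module path below is an implementation detail rather than a public API, so treat this as a best-effort sketch:

```python
# __cpu_features__ is an internal NumPy detail that maps SIMD feature names
# to whether the running CPU supports them (available in recent versions).
from numpy.core._multiarray_umath import __cpu_features__

for feature in ("SSE41", "SSE42", "AVX", "FMA3", "AVX2", "AVX512F"):
    print(f"{feature:<8} {'yes' if __cpu_features__.get(feature) else 'no'}")
```

On the 2012 machine you should expect AVX to be reported, but not AVX2 or FMA3.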

For comparison's sake, you could try disabling the newer SIMD features (AVX2/FMA) in numpy on the fast machine and rerun the calculation, to see whether that accounts for the majority of the time difference.
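Recent NumPy versions let you do this at runtime via the NPY_DISABLE_CPU_FEATURES environment variable, which switches off individual dispatched SIMD paths. It has to be set before NumPy is imported, and it only applies to features that weren't compiled into the build's baseline, so consider this a sketch of the experiment rather than a guaranteed recipe:

```python
import os

# Must be set before NumPy is imported; only affects features that NumPy
# dispatches at runtime (not the compile-time baseline).
os.environ["NPY_DISABLE_CPU_FEATURES"] = "AVX2 FMA3"

import numpy as np

# Rerun (roughly) the same kind of vector computation as your benchmark and
# compare the timing against a run without the environment variable set.
a = np.random.rand(10_000_000)
b = np.random.rand(10_000_000)
print((a * b).sum())
```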