Why has the size of L1 cache not increased very much over the last 20 years?

The Intel i486 has 8 KB of L1 cache. The Intel Nehalem has 32 KB L1 instruction cache and 32 KB L1 data cache per core.

The amount of L1 cache hasn't increased at nearly the rate the clockrate has increased.

Why not?


Solution 1:

30K of Wikipedia text isn't as helpful as an explanation of why too large of a cache is less optimal. When the cache gets too large the latency to find an item in the cache (factoring in cache misses) begins to approach the latency of looking up the item in main memory. I don't know what proportions CPU designers aim for, but I would think it is something analogous to the 80-20 guideline: You'd like to find your most common data in the cache 80% of the time, and the other 20% of the time you'll have to go to main memory to find it. (or whatever the CPU designers intended proportions may be.)

EDIT: I'm sure it's nowhere near 80%/20%, so substitute X and 1-X. :)

Solution 2:

One factor is that L1 fetches start before the TLB translations are complete so as to decrease latency. With a small enough cache and high enough way the index bits for the cache will be the same between virtual and physical addresses. This probably decreases the cost of maintaining memory coherency with a virtually-indexed, physically-tagged cache.