Are processor caches L1, L2 and L3 all made of SRAM?
In general they are all implemented with SRAM.
(IBM's POWER and zArchitecture chips use DRAM for L3. This is called embedded DRAM (eDRAM) because it is implemented in the same type of process technology as logic, allowing fast logic to be integrated on the same chip as the DRAM. For POWER4 the off-chip L3 used eDRAM; POWER7 has the L3 on the same chip as the processing cores.)
Although they all use SRAM, they do not all use the same SRAM design. SRAM for L2 and L3 is optimized for size (to increase the capacity within a limited manufacturable chip size, or to reduce the cost of a given capacity), while SRAM for L1 is more likely to be optimized for speed.
More importantly, the access time is related to the physical size of the storage. With a two-dimensional layout, one can expect physical access latency to be roughly proportional to the square root of the capacity. (Non-uniform cache architecture exploits this to provide a subset of the cache at lower latency. The L3 slices of recent Intel processors have a similar effect; a hit in the local slice has significantly lower latency.) This effect can make a DRAM cache faster than an SRAM cache at high capacities because the DRAM is physically smaller.
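As a rough back-of-the-envelope sketch of that square-root rule (the 32 KiB baseline and 1 ns access time below are illustrative assumptions, not figures from any particular processor, and the real constant of proportionality depends on the actual floorplan):

```c
/* Toy model: wire-delay-limited access latency scales roughly with the
 * linear dimension of the array, i.e. with sqrt(capacity).
 * Compile with: cc sqrt_model.c -lm */
#include <math.h>
#include <stdio.h>

int main(void) {
    const double l1_kib = 32.0;   /* assumed L1 capacity (illustrative) */
    const double l1_ns  = 1.0;    /* assumed L1 access time (illustrative) */
    const double caps_kib[] = {256.0, 2048.0, 8192.0};  /* L2/L3-like sizes */

    for (int i = 0; i < 3; i++) {
        double scale = sqrt(caps_kib[i] / l1_kib);
        printf("%6.0f KiB -> ~%4.1fx the L1 latency (~%4.1f ns in this model)\n",
               caps_kib[i], scale, scale * l1_ns);
    }
    return 0;
}
```

In this toy model an 8 MiB array is roughly 16 times slower to reach than a 32 KiB one, which is why a very large last-level cache can tolerate, or even benefit from, a denser but slower cell technology such as eDRAM.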
Another factor is that most L2 and L3 caches use serial access of tags and data, whereas most L1 caches access tags and data in parallel. This is a power optimization (L2 miss rates are higher than L1 miss rates, so a speculative data access is more likely to be wasted work; L2 data access generally requires more energy, since that energy scales with capacity; and L2 caches usually have higher associativity, which means that more data entries would have to be read speculatively). Obviously, having to wait for the tag match before accessing the data adds to the time required to retrieve the data. (L2 access also typically only begins after an L1 miss is confirmed, so the latency of L1 miss detection is added to the total L2 access latency.)
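The difference between the two policies is easier to see in software form. The sketch below is a simplified, purely illustrative model of one set of a set-associative cache; the structure and function names, the 8-way associativity and the 64-byte lines are assumptions made for the example, and real hardware implements this with parallel SRAM arrays and way-select multiplexers rather than loops:

```c
#include <stdbool.h>
#include <stdint.h>

#define WAYS 8
#define LINE_BYTES 64

/* One set of a set-associative cache (illustrative only). */
struct set {
    uint64_t tag[WAYS];
    bool     valid[WAYS];
    uint8_t  data[WAYS][LINE_BYTES];
};

/* Parallel (L1-style) lookup: start reading the data of every way while the
 * tags are compared, then select the way that hit.  Lower latency, but the
 * WAYS-1 data reads that lose the comparison are wasted energy. */
const uint8_t *lookup_parallel(const struct set *s, uint64_t tag) {
    const uint8_t *speculative[WAYS];
    for (int w = 0; w < WAYS; w++)
        speculative[w] = s->data[w];       /* data reads begin immediately */
    for (int w = 0; w < WAYS; w++)
        if (s->valid[w] && s->tag[w] == tag)
            return speculative[w];         /* select the hitting way */
    return NULL;                           /* miss */
}

/* Serial (typical L2/L3-style) lookup: compare the tags first, then read
 * data only for the hitting way.  No wasted data reads, but the tag latency
 * is added in front of the data latency. */
const uint8_t *lookup_serial(const struct set *s, uint64_t tag) {
    int hit_way = -1;
    for (int w = 0; w < WAYS; w++)
        if (s->valid[w] && s->tag[w] == tag) { hit_way = w; break; }
    if (hit_way < 0)
        return NULL;                       /* miss: no data array access at all */
    return s->data[hit_way];               /* single data read */
}

int main(void) {
    struct set s = {0};
    s.valid[3] = true;
    s.tag[3] = 0xABC;
    s.data[3][0] = 42;

    const uint8_t *hit = lookup_serial(&s, 0xABC);
    return (hit && hit[0] == 42 && lookup_parallel(&s, 0xABC) == hit) ? 0 : 1;
}
```

The parallel version hides the tag-compare latency but touches every way's data on each access; the serial version touches only one but serializes the tag and data latencies, which is exactly the latency/energy trade-off described above.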
In addition, the L2 cache is physically more distant from the execution engine. Placing the L1 data cache close to the execution engine (so that the common case of an L1 hit is fast) generally means that the L2 must be placed farther away.