Instructions per Cycle vs. Increased Cycle Count

Increasing instructions per cycle (IPC) and increasing the cycle count (i.e., the clock frequency) are both valid design choices for processor manufacturers. I understand the theory, but it would be much clearer if I had some real-life examples.

So, can anyone give me some examples of how each of these design choices pays off? Like, which applications or types of applications/processes take advantage of a higher IPC, and which take advantage of a higher cycle count?


The Computer Architect

It takes much more engineering effort to increase IPC than to simply increase the clock frequency. E.g. pipelining, caches, and multiple cores--all introduced to increase IPC--get very complex and require many transistors.
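
To make the cache point concrete, here is a minimal sketch (the 4096x4096 matrix and the timing scaffolding are my own illustration, not anything processor-specific): both loops execute essentially the same instructions, but the column-major traversal misses the cache far more often, so the core completes far fewer instructions per cycle while it waits on memory.

    // Illustrative sketch only: same arithmetic, very different memory
    // behaviour, hence very different effective IPC.
    #include <chrono>
    #include <cstdio>
    #include <vector>

    int main() {
        const std::size_t n = 4096;          // assumed matrix size
        std::vector<double> m(n * n, 1.0);
        double sum = 0.0;

        auto t0 = std::chrono::steady_clock::now();
        for (std::size_t i = 0; i < n; ++i)      // row-major: sequential,
            for (std::size_t j = 0; j < n; ++j)  // cache-friendly accesses
                sum += m[i * n + j];

        auto t1 = std::chrono::steady_clock::now();
        for (std::size_t j = 0; j < n; ++j)      // column-major: strided,
            for (std::size_t i = 0; i < n; ++i)  // cache-hostile accesses
                sum += m[i * n + j];

        auto t2 = std::chrono::steady_clock::now();
        std::printf("row-major:    %lld ms\n", (long long)
            std::chrono::duration_cast<std::chrono::milliseconds>(t1 - t0).count());
        std::printf("column-major: %lld ms\n", (long long)
            std::chrono::duration_cast<std::chrono::milliseconds>(t2 - t1).count());
        return sum > 0.0 ? 0 : 1;  // keep sum live so the loops aren't optimized away
    }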

The maximum clock frequency is restricted by the length of the critical path of a given design, but if you're lucky, you can increase the clock frequency without any refactoring. And even if you have to shorten some paths, the changes are not as profound as those required by the techniques mentioned above.

With current processors, however, clock frequencies have already been pushed to their economic limits. Here, speed gains stem solely from IPC increases.

The Programmer

From the programmer's point of view, this is an issue insofar as they have to adjust their programming style to the new systems computer architects create. E.g. concurrent programming will become more and more unavoidable in order to take advantage of high IPC values.
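
As a rough illustration of that shift (the workload and the way the sum is split are made up for this sketch), the same reduction can be spread across however many hardware threads the machine reports; a single-threaded version would leave the extra cores idle no matter how capable each one is:

    // Hypothetical sketch: split a sum over all available hardware threads.
    #include <algorithm>
    #include <cstdio>
    #include <numeric>
    #include <thread>
    #include <vector>

    int main() {
        std::vector<int> data(1 << 24, 1);  // assumed workload: 16M ones
        const unsigned workers = std::max(1u, std::thread::hardware_concurrency());
        std::vector<long long> partial(workers, 0);
        std::vector<std::thread> pool;

        const std::size_t chunk = data.size() / workers;
        for (unsigned w = 0; w < workers; ++w) {
            const std::size_t begin = w * chunk;
            const std::size_t end = (w + 1 == workers) ? data.size() : begin + chunk;
            // Each worker sums its own slice into its own slot: no sharing, no locks.
            pool.emplace_back([&, w, begin, end] {
                partial[w] = std::accumulate(data.begin() + begin,
                                             data.begin() + end, 0LL);
            });
        }
        for (auto& t : pool) t.join();

        const long long total = std::accumulate(partial.begin(), partial.end(), 0LL);
        std::printf("total = %lld\n", total);
        return 0;
    }

On a single-core machine this does no better than a plain loop; the payoff only appears when there are multiple cores to keep busy.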


I've actually designed a couple of processors (many years ago) and have a little bit of experience in the trade-offs.

To increase the instructions per cycle (or, more likely, reduce the cycles per instruction) you generally have to "throw hardware" at the problem -- add more gates and latches and multiplexers. Beyond a certain point (which was passed about a decade ago) you must "pipeline" and work on several instructions at once. This increase in complexity not only drives up basic costs (since the cost of a chip is related to the area it occupies), but also increases the likelihood that a bug will make it through the initial design review and result in a bad chip that must be "respun" -- a major cost and schedule hit. In addition, the increase in complexity increases electrical loads such that, absent even more hardware, the length of a cycle actually increases. You could conceivably encounter the situation where adding hardware slows things down. (In fact, I saw this happen in one case.)
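
To see the "working on several instructions at once" point from the software side, here is a hypothetical sketch (the function names and the four-way split are mine): both functions perform the same additions, but the first forms one long dependency chain, while the second exposes four independent chains that a pipelined core can overlap.

    #include <cstddef>
    #include <vector>

    // One accumulator: every add waits on the previous one, so the
    // pipelined adder spends most cycles stalled on that chain.
    double sum_serial(const std::vector<double>& v) {
        double s = 0.0;
        for (double x : v) s += x;
        return s;
    }

    // Four independent accumulators: up to four adds can be in flight
    // at once, so the same hardware retires more instructions per cycle.
    double sum_unrolled(const std::vector<double>& v) {
        double s0 = 0, s1 = 0, s2 = 0, s3 = 0;
        std::size_t i = 0;
        for (; i + 4 <= v.size(); i += 4) {
            s0 += v[i];
            s1 += v[i + 1];
            s2 += v[i + 2];
            s3 += v[i + 3];
        }
        for (; i < v.size(); ++i) s0 += v[i];  // leftover elements
        return (s0 + s1) + (s2 + s3);
    }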

Additionally, "pipelining" can encounter conditions where the pipeline is "broken" because of frequent (and unanticipated) branches and other such problems, causing the processor to slow to a crawl. So there's a limit to how much of this can be done productively.
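
Here is a small, hypothetical sketch of that effect (the names and the >= 128 threshold are mine): fed random data, the first version's branch is essentially unpredictable, so the pipeline is flushed on every mispredict, while the second does the same counting with no conditional jump at all.

    #include <cstdint>
    #include <vector>

    // Branchy: with unpredictable data, the taken/not-taken guess is
    // often wrong and the pipeline must be flushed and refilled.
    std::int64_t count_big_branchy(const std::vector<int>& v) {
        std::int64_t n = 0;
        for (int x : v)
            if (x >= 128) ++n;
        return n;
    }

    // Branchless: the comparison result (0 or 1) is added directly,
    // so there is no conditional jump for the predictor to get wrong.
    std::int64_t count_big_branchless(const std::vector<int>& v) {
        std::int64_t n = 0;
        for (int x : v)
            n += (x >= 128);
        return n;
    }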

To speed up individual cycles you need to do one of three things:

  1. Use a faster technology (a "no-brainer" if the technology is available, but new, faster technologies are not showing up as frequently as they used to)
  2. Somehow remove logic from the "critical path" (possibly by deleting complex instructions from the instruction set or adding other limitations at the software level).
  3. Reduce the propagation delay through the slowest data paths (which usually means "throwing hardware" at the problem again -- and again with the chance that this would backfire and slow things down).

So it's a lot of trade-offs, and a bit of a tap dance through a minefield.