Why is serial data transmission faster than parallel?
You cannot formulate it this way.
Serial transmission is slower than parallel transmission given the same signal frequency. With a parallel transmission you can transfer one word per cycle (e.g. 1 byte = 8 bits) but with a serial transmission only a fraction of it (e.g. 1 bit).
The reason modern devices use serial transmission is the following:
You cannot increase the signal frequency for a parallel transmission without limit, because, by design, all signals from the transmitter need to arrive at the receiver at the same time. This cannot be guaranteed for high frequencies, as you cannot guarantee that the signal transit time is equal for all signal lines (think of different paths on the mainboard). The higher the frequency, the more tiny differences matter. Hence the receiver has to wait until all signal lines are settled -- obviously, waiting lowers the transfer rate.
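To put rough numbers on the skew argument, here is a minimal sketch. It assumes a deliberately simplified model in which one bus cycle must be at least as long as the worst-case line-to-line skew plus the time the receiver needs stable data. The function names and all timing values are hypothetical, chosen only for illustration.

```python
def max_parallel_rate_hz(skew_s: float, setup_hold_s: float) -> float:
    """Highest word rate at which all lines can still be sampled together.

    Simplified model (an assumption, not a real bus spec): each cycle must
    last at least the worst-case skew plus the receiver's setup/hold window.
    """
    return 1.0 / (skew_s + setup_hold_s)


def throughput_bits_per_s(lines: int, rate_hz: float) -> float:
    """One bit per line per cycle."""
    return lines * rate_hz


# Hypothetical numbers for illustration only.
SETUP_HOLD_S = 0.2e-9  # receiver needs 0.2 ns of stable data

if __name__ == "__main__":
    for skew_s in (0.0, 0.8e-9, 4.8e-9):
        rate = max_parallel_rate_hz(skew_s, SETUP_HOLD_S)
        mbit = throughput_bits_per_s(8, rate) / 1e6
        print(f"skew {skew_s * 1e9:.1f} ns -> {rate / 1e6:.0f} MHz, "
              f"8 lines carry {mbit:.0f} Mbit/s")
```

In this model the skew term does not shrink when you raise the clock, so it ends up dominating the cycle time.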
Another good point (from this post) is that one needs to consider crosstalk with parallel signal lines. The higher the frequency, the more pronounced crosstalk gets and with it the higher the probability of a corrupted word and the need to retransmit it. [1]
So, even if you transfer less data per cycle with a serial transmission, you can go to much higher frequencies which results in a higher net transfer rate.
[1] This also explains why UDMA cables (Parallel ATA with increased transfer speed) had twice as many wires as pins: every second wire was grounded to reduce crosstalk.
The problem is synchronization.
When you send in parallel, you must sample all of the lines at the exact same moment. As you go faster, the window for that moment gets smaller and smaller. Eventually it gets so small that some of the wires are still stabilizing while others have already finished, and you run out of time.
By sending in serial you no longer need to worry about all of the lines stabilizing, just one line. And it is more cost efficient to make one line stabilize 10 times faster than to add 10 lines at the same speed.
Some interfaces, like PCI Express, get the best of both worlds by running a parallel set of serial connections (the x16 slot on your motherboard has 16 serial lanes). Each lane does not need to be in perfect sync with the others, as long as the controller at the other end can put the "packets" of data back into the correct order as they come in.
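The idea of striping data over independent serial lanes and reassembling it at the far end can be sketched in a few lines. This is only a toy model: the function names are made up, the lanes are plain byte strings, and real PCI Express hardware handles lane deskew at a much lower level than this.

```python
import random
from typing import List, Tuple


def stripe(data: bytes, lanes: int) -> List[bytes]:
    """Byte i of the payload goes to lane i % lanes."""
    return [data[i::lanes] for i in range(lanes)]


def reassemble(arrivals: List[Tuple[int, bytes]], lanes: int) -> bytes:
    """Rebuild the payload from (lane_index, lane_bytes) pairs.

    The pairs may arrive in any order; only the lane index matters.
    """
    total = sum(len(chunk) for _, chunk in arrivals)
    out = bytearray(total)
    for lane, chunk in arrivals:
        out[lane::lanes] = chunk  # put each lane's bytes back in place
    return bytes(out)


if __name__ == "__main__":
    payload = b"serial lanes, sent side by side"
    arrivals = list(enumerate(stripe(payload, 4)))
    random.shuffle(arrivals)  # lanes finish at different times
    print(reassemble(arrivals, 4))
```

Because each lane is tagged with its index, the receiver does not care which lane finishes first.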
The How Stuff Works page for PCI Express explains in depth how serial PCI Express can be faster than parallel PCI or PCI-X.
TL;DR Version: It is easier to make a single connection go 16 times faster than 8 connections go 2 times faster once you get to very high frequencies.
Parallel isn't inherently slower, but it does introduce challenges that serial communication does not.
But many of the fastest links are still parallel: The front-side bus in your computer is typically highly parallel, and is usually among the fastest interlinks in a computer. Fiber optic connections can also be highly parallel by carrying multiple wavelengths over a single fiber. This is expensive and therefore not typical, though. The most common form of Gigabit Ethernet (1000BASE-T) actually runs 4 parallel channels of 250 Mbit/s each, one per wire pair, in a single cable.
The most pronounced challenge introduced by parallelism is "crosstalk": when signal current starts or stops, it momentarily induces a small current on the wires next to it. The faster the signal, the more often this happens, and the more difficult it gets to filter out. Parallel IDE attempted to minimize this problem by doubling the number of wires in the ribbon cable, and connecting every other wire to ground. But that solution only gets you so far. Long cables, folds and loops, and proximity to other ribbon cables all make this an unreliable solution for very high-speed signals.
But if you go with only one signal line, well then you're free to switch it as fast as your hardware will allow. It also solves subtle synchronization issues with some signals travelling faster than others.
Two wires are always theoretically twice as fast as one, but each signal line you add subtly complicates the physics, and that complication may be better avoided.
Serial data transmission isn't faster than parallel. It's more convenient and so development has gone into making fast external serial interfacing between equipment units. Nobody wants to deal with ribbon cables that have 50 or more conductors.
Between chips on a circuit board, a serial protocol like I2C that needs only two wires is much easier to deal with than routing numerous parallel traces.
But there are plenty of examples inside your computer where parallelism is used to massively increase the bandwidth. For instance, words are not read one bit at a time from memory. And in fact, caches are refilled in large blocks. Raster displays are another example: parallel access to multiple memory banks to get the pixels faster, in parallel. Memory bandwidth depends critically on parallelism.
This DAC device touted by Tektronix as "the world’s fastest commercially available 10-bit high speed DAC" makes heavy use of parallelism to bring in the data, which comes into the DAC over 320 lines, which are reduced to 10 through two stages of multiplexing driven by different divisions of the master 12 GHz clock. If the world's fastest 10-bit DAC could be made using a single serial input line, then it probably would be.
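The arithmetic behind that multiplexing is simple enough to sketch. The line counts (320 in, 10 out) come from the description above; the output rate passed in below is an assumption for illustration only, since the text gives the master clock but not the actual sample rate, and the function name is made up.

```python
from typing import Tuple


def mux_budget(input_lines: int, output_bits: int,
               output_rate_hz: float) -> Tuple[int, float]:
    """Return (overall mux ratio, required rate per input line)."""
    if input_lines % output_bits != 0:
        raise ValueError("input lines must be a multiple of output bits")
    ratio = input_lines // output_bits
    return ratio, output_rate_hz / ratio


if __name__ == "__main__":
    # ASSUMPTION: one 10-bit sample per cycle of a 12 GHz clock.
    ratio, per_line = mux_budget(320, 10, 12e9)
    print(f"{ratio}:1 multiplexing, each input line runs at "
          f"{per_line / 1e6:.0f} Mbit/s")
```

Under that assumption, each of the 320 input lines only has to toggle at a small fraction of the rate at which the 10 output bits change.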
Parallel was the obvious way to increase speed when logic gates were slow enough that you could use similar electrical techniques for buses/cables and on-chip transmission. If you're already toggling the wire as fast as your transistor allows, the only way to scale is to use more wires.
With time, Moore's law outpaced the electromagnetic constraints, so transmissions over cables, or even on-board buses, became a bottleneck compared to on-chip speeds. OTOH, the speed disparity allows sophisticated processing at the ends to use the channel more effectively.
Once propagation delay approaches the order of a few clocks, you start worrying about analogue effects like reflections => you need matched impedances along the way (especially tricky for connectors) and prefer point-to-point wires over multi-point buses. That's why SCSI needed termination, and that's why USB needs hubs instead of simple splitters.
-
At higher speeds you have multiple bits in flight at any given moment along the wire => you need to use pipelined protocols (which is why Intel's FSB protocols became frightfully complicated; I think packetized protocols like PCIe were a reaction to this complexity).
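A quick way to see how many bits are "in flight" is to multiply the bit rate by the one-way propagation delay. The sketch below assumes a signal velocity of 2e8 m/s (roughly two-thirds of the speed of light, a common rule of thumb for cables); the function name and the example figures are illustrative, not taken from any particular interface spec.

```python
def bits_in_flight(bit_rate_hz: float, length_m: float,
                   velocity_m_per_s: float = 2.0e8) -> float:
    """Bits present on the wire at once: bit rate times one-way delay."""
    delay_s = length_m / velocity_m_per_s
    return bit_rate_hz * delay_s


if __name__ == "__main__":
    # Hypothetical links: (label, bit rate in bit/s, cable length in m)
    for label, rate, length in (("slow, short", 10e6, 1.0),
                                ("fast, short", 5e9, 1.0),
                                ("fast, long", 5e9, 3.0)):
        print(f"{label}: {bits_in_flight(rate, length):.2f} bits on the wire")
```

When the result is well below 1, the wire behaves like a simple node that settles before the next bit; once it exceeds 1, the sender has already moved on before the receiver sees the first bit.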
Another effect is a multi-cycle penalty for switching the direction of signal flow—that's why Firewire and SATA and PCIe using dedicated wires per direction outperformed USB 2.0.
-
Induced noise, aka crosstalk, goes up with frequency. The single biggest advance in speeds came from the adoption of differential signalling, which dramatically reduced crosstalk (mathematically, an unbalanced charge's field falls off as 1/R^2, but a dipole's field falls off as 1/R^3).
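The difference between the two falloff laws can be checked numerically. This is a minimal sketch using idealized point charges in arbitrary units (constants dropped), with the dipole modelled as a +1 and a -1 charge a small distance apart and the field measured along their axis. It illustrates the scaling only and is not a model of a real cable.

```python
def monopole_field(r: float) -> float:
    """Field magnitude of a single unit charge at distance r."""
    return 1.0 / r ** 2


def dipole_field_on_axis(r: float, d: float) -> float:
    """Net field of charges +1 and -1 separated by d, measured on the
    axis at distance r from the midpoint (requires r > d / 2)."""
    return 1.0 / (r - d / 2) ** 2 - 1.0 / (r + d / 2) ** 2


if __name__ == "__main__":
    d = 1.0  # spacing between the two conductors, arbitrary units
    for r in (10.0, 20.0, 40.0, 80.0):
        print(f"r={r:5.1f}  single: {monopole_field(r):.2e}  "
              f"pair: {dipole_field_on_axis(r, d):.2e}")
```

Each doubling of the distance cuts the single-charge field to about a quarter but the pair's field to about an eighth, which is why a tightly coupled differential pair disturbs its neighbours so much less.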
I think this is what caused the "serial is faster than parallel" impression: the jump was so large that you could go down to 1 or 2 differential pairs and still be faster than LPT or IDE cables. There was also a crosstalk win from having only one signal pair in the cable, but that's minor.
-
Wire propagation delay varies (both because wire lengths are hard to match across 90° turns, connectors etc. and because of parasitic effects from other conductors), which made synchronization an issue.
The solution was to have tunable delays at every receiver, and tune them at startup and/or continually from the data itself. Encoding the data to avoid streaks of 0s or 1s incurs a small overhead but has electrical benefits (avoids DC drift, controls spectrum) and, most importantly, allows dropping the clock wire(s) altogether. That saving isn't a big deal on top of 40 signals, but it is a huge deal for a serial cable to have 1 or 2 pairs instead of 2 or 3.
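As one concrete illustration of "encoding to avoid streaks", here is a minimal sketch of a USB-style scheme: bit stuffing (insert a 0 after six consecutive 1s) followed by NRZI (a 0 toggles the line, a 1 leaves it unchanged). This particular scheme is swapped in as an example; the text above does not name a specific code, and other links use different ones such as 8b/10b. The helper names are made up.

```python
from typing import List

MAX_ONES = 6  # USB-style limit: stuff a 0 after six consecutive 1s


def bit_stuff(bits: List[int]) -> List[int]:
    out, ones = [], 0
    for b in bits:
        out.append(b)
        ones = ones + 1 if b == 1 else 0
        if ones == MAX_ONES:
            out.append(0)  # forced 0 guarantees a transition after NRZI
            ones = 0
    return out


def bit_unstuff(bits: List[int]) -> List[int]:
    out, ones, skip = [], 0, False
    for b in bits:
        if skip:  # this is the stuffed 0: drop it
            skip, ones = False, 0
            continue
        out.append(b)
        ones = ones + 1 if b == 1 else 0
        skip = ones == MAX_ONES
    return out


def nrzi_encode(bits: List[int], level: int = 1) -> List[int]:
    out = []
    for b in bits:
        if b == 0:
            level ^= 1  # a 0 toggles the line level
        out.append(level)
    return out


def nrzi_decode(levels: List[int], level: int = 1) -> List[int]:
    out = []
    for cur in levels:
        out.append(1 if cur == level else 0)  # no change means 1
        level = cur
    return out


def longest_run(seq: List[int]) -> int:
    best, run, prev = 0, 0, None
    for x in seq:
        run = run + 1 if x == prev else 1
        best, prev = max(best, run), x
    return best


if __name__ == "__main__":
    payload = [1] * 40  # worst case: the line would never toggle
    line = nrzi_encode(bit_stuff(payload))
    print("longest run without stuffing:", longest_run(nrzi_encode(payload)))
    print("longest run with stuffing:   ", longest_run(line))
    print("round trip ok:", bit_unstuff(nrzi_decode(line)) == payload)
```

Because the line is guaranteed to toggle regularly, the receiver can recover bit timing from the data itself, which is what makes a separate clock wire unnecessary.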
Note that we are throwing parallelism at the bottleneck — today's BGA chips have hundreds or thousands of pins, PCBs have more and more layers. Compare this to old 40-pin microcontrollers and 2 layer PCBs...
Most of the above techniques became indispensable for both parallel and serial transmission. It's just that the longer the wires, the more attractive it becomes to push higher rates through fewer wires.