Where does PCI-E link-width negotiation occur?
I'm trying to diagnose an underperforming PCI-E card in my system, and I've realized that it's negotiating the wrong link-width. Specifically, from running lspci -vv, I see:
LnkCap: Port #1, Speed 8GT/s, Width x8, ASPM L1, Exit Latency L0s <4us, L1 <4us
ClockPM- Surprise- LLActRep- BwNot-
while
LnkSta: Speed 8GT/s, Width x4, TrErr- Train- SlotClk- DLActive- BWMgmt- ABWMgmt-
My question is: does this negotiation happen at the hardware level or at the software level? Put another way, does the card negotiate directly with the PCI-E slot, or does this happen somewhere in the drivers?
(If this turns out to be an obvious answer, please forgive me...after trying to diagnose this for a week, my mind is a bit fried.)
Solution 1:
It's done at the electrical level, not by software. The two registers you've listed above, LNK_CAP and LNK_STA, are what you correctly noted: 'here's what the link is capable of' and 'here's the current status'. There are also SLT_CAP and SLT_STA, which may be worth a look as they are specific to a given 'slot' in the machine.
The PCIe spec defines an LTSSM -- Link Training and Status State Machine. At the PHY/device level, this is what determines the maximum speed both devices support and the maximum link width both devices support, and this is also where polarity reversal / lane reversal is handled (to make layout easier for us, the spec allows P/N to be swapped, etc.).
The devices send known, ordered sets of symbols to each other and the hardware works its way up from 2.5 GT/s. Either device can request a speed change from the other, and this is also where the channel equalization settings are worked out.
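As a rough illustration of the outcome of that training (not of the LTSSM itself), here is a toy Python model. It assumes the result is simply the widest width both ends support that still fits the lanes that trained successfully. The function name `negotiate_width` and the `usable_lanes` parameter are hypothetical, and the real state machine also handles lane numbering, reversal, and retries that this ignores.

```python
def negotiate_width(upstream_widths, downstream_widths, usable_lanes):
    """Toy model: pick the widest width both ports support that fits
    within the number of lanes that trained successfully.

    This is a simplification for illustration, not the real LTSSM.
    """
    common = set(upstream_widths) & set(downstream_widths)
    candidates = [w for w in common if w <= usable_lanes]
    if not candidates:
        raise ValueError("no common link width fits the usable lanes")
    return max(candidates)


# Both ends support x8 and all 8 lanes train: the link comes up x8.
print(negotiate_width({1, 2, 4, 8}, {1, 4, 8, 16}, usable_lanes=8))

# One bad lane out of 8: the link falls back to the next common width, x4.
print(negotiate_width({1, 4, 8}, {1, 4, 8, 16}, usable_lanes=7))
```

The second call shows why a single marginal lane, or a slot that is physically x8 but only wired x4, can produce exactly the x8-capable, x4-negotiated symptom in the question.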
If you're linking up at the wrong width, it may be that the PCIe root port is configured wrong, or that there's a signal integrity issue forcing a lower link width. In my experience, if you were linking up at 5 GT/s instead of 8 GT/s, that's more of an SI issue -- linking up at x4 8 GT/s instead of x8 8 GT/s seems like a configuration issue, or perhaps a card added to a slot that doesn't support x8 width.
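To check which of those two cases you are in, a small Python sketch like the following can pull the speed and width out of `lspci -vv` text and flag a downgrade. The sample text is the output from the question; the helper names `parse_link` and `link_issues` are my own, and the regular expression assumes the `Speed …GT/s, Width x…` layout shown there.

```python
import re

# LnkCap / LnkSta lines copied from the question's lspci -vv output.
SAMPLE = """\
LnkCap: Port #1, Speed 8GT/s, Width x8, ASPM L1, Exit Latency L0s <4us, L1 <4us
ClockPM- Surprise- LLActRep- BwNot-
LnkSta: Speed 8GT/s, Width x4, TrErr- Train- SlotClk- DLActive- BWMgmt- ABWMgmt-
"""

# Matches only "LnkCap:" and "LnkSta:" (not "LnkCap2:" etc.), one line at a time.
LINK_RE = re.compile(r"(LnkCap|LnkSta):.*?Speed\s+([\d.]+)GT/s.*?Width\s+x(\d+)")


def parse_link(text):
    """Return {'LnkCap': (speed, width), 'LnkSta': (speed, width)}."""
    return {name: (float(speed), int(width))
            for name, speed, width in LINK_RE.findall(text)}


def link_issues(text):
    """List the ways the negotiated link falls short of the capability."""
    link = parse_link(text)
    cap_speed, cap_width = link["LnkCap"]
    sta_speed, sta_width = link["LnkSta"]
    issues = []
    if sta_speed < cap_speed:
        issues.append(f"speed {sta_speed:g}GT/s below capable {cap_speed:g}GT/s")
    if sta_width < cap_width:
        issues.append(f"width x{sta_width} below capable x{cap_width}")
    return issues


print(parse_link(SAMPLE))
print(link_issues(SAMPLE))
```

Since the negotiated width is limited by both ends of the link, it is worth running the same comparison on the upstream port (the root port or switch port the card hangs off) as well as on the card itself.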
The root port's Link Capabilities register (offset 0Ch in its PCI Express capability structure) will reveal the maximum width it supports, which might help with your diagnostics. For lspci, -x dumps only the first 64 bytes of config space, -xxx dumps the whole 256-byte PCI config space, and -xxxx dumps the 4K PCIe extended config space. If you dump your entire config space here / pastebin it, I can possibly dig through it for you, but Linux does a decent job of decoding what the registers do.
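If you'd rather decode the raw dump yourself, here is a minimal Python sketch. It relies on the layout as I understand it from the spec: the capability list starts at the pointer in config offset 34h, the PCI Express capability has ID 10h, Link Capabilities sits at +0Ch and Link Status at +12h within it, and in both registers bits 3:0 encode the speed and bits 9:4 the width. The helper names and the synthetic config space at the bottom are made up for the example.

```python
import struct

PCIE_CAP_ID = 0x10
# Conventional encoding of the link speed field.
SPEEDS = {1: "2.5GT/s", 2: "5GT/s", 3: "8GT/s", 4: "16GT/s", 5: "32GT/s"}


def find_capability(cfg, cap_id):
    """Walk the PCI capability list; return the capability's offset or None."""
    ptr = cfg[0x34] & 0xFC
    seen = set()  # guards against a malformed, looping list
    while ptr and ptr not in seen and ptr + 1 < len(cfg):
        seen.add(ptr)
        if cfg[ptr] == cap_id:
            return ptr
        ptr = cfg[ptr + 1] & 0xFC
    return None


def decode_link(cfg):
    """Decode speed and width from Link Capabilities and Link Status."""
    base = find_capability(cfg, PCIE_CAP_ID)
    if base is None:
        raise ValueError("no PCI Express capability found")
    (lnkcap,) = struct.unpack_from("<I", cfg, base + 0x0C)  # 32-bit register
    (lnksta,) = struct.unpack_from("<H", cfg, base + 0x12)  # 16-bit register

    def fields(reg):
        return SPEEDS.get(reg & 0xF, "unknown"), (reg >> 4) & 0x3F

    return {"LnkCap": fields(lnkcap), "LnkSta": fields(lnksta)}


# Synthetic 256-byte config space (illustrative values only): a PCIe
# capability at 40h that is x8-capable at 8GT/s but trained to x4.
cfg = bytearray(256)
cfg[0x34] = 0x40
cfg[0x40] = PCIE_CAP_ID
struct.pack_into("<I", cfg, 0x40 + 0x0C, (8 << 4) | 3)
struct.pack_into("<H", cfg, 0x40 + 0x12, (4 << 4) | 3)
print(decode_link(cfg))
```

On Linux the same bytes can be read from `/sys/bus/pci/devices/<address>/config` (opened in binary mode, with the address of your own device substituted in); reading past the first 64 bytes generally requires root.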