Linux will not recognize Thunderbolt 3 (Titan Ridge) card on Supermicro motherboard
I have a server with a Supermicro C9Z390-PGW motherboard to which I needed to add Thunderbolt 3 connectivity, so I purchased a Gigabyte GC-Titan Ridge Rev 2.0 add-in card. However, after powering on the machine, the card does not show up in lspci, and if I plug in a Thunderbolt device, the device receives power but Linux does not recognize it.
There are many guides online for issues like these, but a lot of them are outdated, so hopefully this is helpful for people encountering this issue in 2021 or later. These instructions were tested on Ubuntu 21.04 with Linux kernel 5.11 and are likely not applicable to older kernels (which were missing patches).
The issue here is that many motherboard BIOSes do not properly initialize the Thunderbolt controller, and thus Linux cannot make use of it. In particular, the BIOS needs to perform two functions:
- Power on the Thunderbolt controller so Linux can recognize it
- Reserve a PCIe bus number for any hotplug device to be added
Fortunately, both of these can be worked around without requiring the manufacturer to provide a BIOS update. For the first issue, the add-in cards have a GPIO header that, if bridged, will force power to the Thunderbolt controller. On the GC-Titan Ridge Rev 2.0 card, the appropriate pins to bridge are pins 3 and 5 (as counted from the bottom/left). Here's a picture of the correct pins to bridge. On other cards, the pinout of the GPIO header may be different. Make absolutely sure that your card matches before attempting this, since incorrect bridging may damage your device. (Of course, never try to do any bridging while the device is powered on.)
Once this is done, if you boot the server, the Thunderbolt controller should show up in the lspci output:
$ lspci | grep "USB controller"
37:00.0 USB controller: Intel Corporation JHL7540 Thunderbolt 3 USB Controller [Titan Ridge 4C 2018] (rev 06)
However, hotplug will likely still not work, causing errors like:
No bus number available for hot-added bridge.
This is issue #2 above. To fix this, we need to change the kernel parameters to reserve bus numbers for hotplug devices. This can be accomplished by editing the GRUB_CMDLINE_LINUX_DEFAULT variable in /etc/default/grub and running sudo update-grub afterwards.
In particular, for our issue, we can add the following to GRUB_CMDLINE_LINUX_DEFAULT:
pci=assign-busses,realloc,hpbussize=0x10,hpmmiosize=128M,hpmmioprefsize=1G
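If you'd rather script the edit than do it by hand, here is a minimal sketch. It assumes GNU sed (as shipped on Ubuntu) and that the variable is written on a single line with a double-quoted value; add_pci_params is a hypothetical helper name of my own, not something provided by GRUB or Ubuntu.

```shell
#!/bin/sh
# Sketch only: append the pci= options to GRUB_CMDLINE_LINUX_DEFAULT.
# add_pci_params is a hypothetical helper; it edits the file passed as $1.
PCI_PARAMS='pci=assign-busses,realloc,hpbussize=0x10,hpmmiosize=128M,hpmmioprefsize=1G'

add_pci_params() {
    file=$1
    # Skip the edit if the line already carries a pci= option, so that
    # running the helper twice does not duplicate the parameters.
    if grep -q '^GRUB_CMDLINE_LINUX_DEFAULT=.*pci=' "$file"; then
        return 0
    fi
    # Insert the options just before the closing double quote of the value.
    sed -i "s|^\(GRUB_CMDLINE_LINUX_DEFAULT=\"[^\"]*\)\"|\1 ${PCI_PARAMS}\"|" "$file"
}

# Usage, from a root shell (back up the file first):
#   cp /etc/default/grub /etc/default/grub.bak
#   add_pci_params /etc/default/grub
#   update-grub
# After rebooting, the options should be visible in /proc/cmdline.
```

If a pci= option is already present, the helper leaves the file alone rather than trying to merge the values, so in that case you would need to edit the line by hand.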
These options do the following:
- assign-busses - Forcibly overrides the PCIe bus assignment that the firmware did (which would have failed to assign any buses to the Thunderbolt device)
- realloc - Forces reallocation of PCIe bridge ranges
- hpbussize - The number of buses to reserve for hotplug. This depends on the devices you want to add. I set it to 0x10 above, which seems reasonable, but if you see the out-of-bus-numbers error again you may need to bump this.
- hpmmiosize - The amount of non-prefetchable address space to reserve for PCIe devices being hotplugged. This again depends on the devices you want to add. This sets it to 128M, but if you have a large number of devices, or devices with large BARs, you may need to increase this value. Setting this value too large may exhaust available PCIe MMIO space and prevent devices from working correctly.
- hpmmioprefsize - Similar to hpmmiosize, except that this pre-allocates prefetchable memory (MMIO_PREF) for the PCIe BARs. This space is separate, and devices generally need more MMIO_PREF space than regular MMIO space.
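To check whether the bus reservation actually took effect after a reboot, you can look at the bus ranges each bridge was given. The sketch below assumes the "Bus: primary=..., secondary=..., subordinate=..." lines that lspci -vv prints for bridges; bus_span is a hypothetical helper name of my own that reads that text on stdin and prints how many bus numbers each bridge spans.

```shell
#!/bin/sh
# Sketch only: summarize the bus range of each bridge in `lspci -vv` output.
# bus_span is a hypothetical helper; it reads lspci text on stdin.
bus_span() {
    # Pull the secondary and subordinate bus numbers (hex) out of each
    # "Bus: primary=.., secondary=.., subordinate=.." line.
    sed -n 's/.*secondary=\([0-9a-fA-F]*\), subordinate=\([0-9a-fA-F]*\).*/\1 \2/p' |
    while read -r sec sub; do
        # A bridge covers the buses secondary..subordinate, inclusive.
        echo "secondary=$sec subordinate=$sub buses=$(( 0x$sub - 0x$sec + 1 ))"
    done
}

# Usage:
#   sudo lspci -vv | bus_span
```

The bridges above the Thunderbolt controller should span more than one bus number if the reservation worked. The output only lists ranges, so you still need lspci -tv or similar to tell which bridge is which.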
Note that before kernel version 5.6, hpmmiosize and hpmmioprefsize used to be simply hpmemsize, which would override both, but it was easy to get into situations where hpmmiosize was too large to fit, while the devices behind the bridge only really needed more MMIO_PREF space, so the parameter was split.
The person who wrote said kernel patch also has a guide to make this work for older kernels and other add-in cards with appropriate kernel patches, which I will link here: https://egpu.io/forums/thunderbolt-enclosures/pdf-guide-and-patches-for-making-linux-v5-3-kernel-to-work-with-thunderbolt-3-add-in-card/.
Addendum
Here's some more information about how the connector bridging thing works. The connector we're bridging is referred to as the THB-C connector, or TBHEADER. It is not documented, but from public motherboard documentation, we can see that the pinout is
5 Force Power (Thunderbolt Controller POC_GPIO_3 aka TBT_FORCE_PWR)
4 Plug Event (Thunderbolt Controller GPIO_5 aka TBT_CIO_PLUG_EVENT#)
3 S3 Sleep Indication (Thunderbolt Controller POC_GPIO_5 aka TBT_SLP_S3#)
2 S4_S5 (Not much info on this, potentially wired to RESET_N on the Thunderbolt controller)
1 GND
The S3 Sleep indication appears to usually just be wired to the regular mainboard SLP3# signal (on the motherboard side), while the other two are wired to PCH GPIOs. In particular, it appears that these controllers implement a power saving mode where they stop decoding PCIe transactions to the NHI, which is the PCIe device that the Linux drivers are looking for. Thus, what we're doing here is really forcing the Force Power pin high (which works because TBT_SLP_S3# happens to be on a pull-up on the device side), which forces the controller to stay out of power saving mode. In other words, any 3.3V source would probably do here, though if you find one on the mainboard, you may want to put a resistor in series to be safe.
Overall, it seems that this bridging should be perfectly safe on motherboards that don't have the special Thunderbolt header, though the system might have issues with S3 sleep states not properly functioning if devices are plugged into the Thunderbolt bridge.