How do I get a virtualized SR-IOV InfiniBand interface UP?

I've spent several days on this now, and I've managed to get SR-IOV working with the Mellanox ConnectX-3 InfiniBand card using the latest firmware.

The Virtual Functions appear in Dom0 as

06:00.1 Network controller: Mellanox Technologies MT27500 Family [ConnectX-3 Virtual Function]
06:00.2 Network controller: Mellanox Technologies MT27500 Family [ConnectX-3 Virtual Function]
06:00.3 Network controller: Mellanox Technologies MT27500 Family [ConnectX-3 Virtual Function]
06:00.4 Network controller: Mellanox Technologies MT27500 Family [ConnectX-3 Virtual Function]
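
For reference, the VFs were created by loading mlx4_core on Dom0 with SR-IOV parameters, roughly as below. This is a sketch; the file name and values are assumptions from my setup, and the card's firmware must already have SR-IOV enabled.

# /etc/modprobe.d/mlx4_core.conf on Dom0 (sketch -- adjust the VF count to taste)
# num_vfs=4 creates four Virtual Functions on the HCA.
# probe_vf=0 stops Dom0 from binding the mlx4 driver to the VFs,
# leaving them free for PCI passthrough.
options mlx4_core num_vfs=4 probe_vf=0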

I've then detached 06:00.1 from Dom0 and assigned it to xen-pciback.

I've passed this into a Xen test domain.
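
Roughly, the passthrough looked like this (a sketch using the xl toolstack; the BDF is from my setup and "testdomu" is just a placeholder domain name):

# On Dom0: make the VF assignable (binds it to xen-pciback)
xl pci-assignable-add 06:00.1

# In the DomU config file: pass the VF through at boot
pci = [ '06:00.1' ]

# Or hot-plug it into an already-running domain
xl pci-attach testdomu 06:00.1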

lspci inside the test DomU shows:

00:01.1 Network controller: Mellanox Technologies MT27500 Family [ConnectX-3 Virtual Function]

I have the following modules loaded in the DomU (loading sketch below):

mlx4_ib
rdma_ucm
ib_umad
ib_uverbs
ib_ipoib
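
For completeness, this is roughly how they get loaded (a sketch; your distribution may load them differently, e.g. via /etc/modules):

# Load the IB stack inside the DomU
for m in mlx4_ib rdma_ucm ib_umad ib_uverbs ib_ipoib; do
    modprobe $m
done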

dmesg output for mlx4 drivers shows:

[   11.956787] mlx4_core: Mellanox ConnectX core driver v1.1 (Dec, 2011)
[   11.956789] mlx4_core: Initializing 0000:00:01.1
[   11.956859] mlx4_core 0000:00:01.1: enabling device (0000 -> 0002)
[   11.957242] mlx4_core 0000:00:01.1: Xen PCI mapped GSI0 to IRQ30
[   11.957581] mlx4_core 0000:00:01.1: Detected virtual function - running in slave mode
[   11.957606] mlx4_core 0000:00:01.1: Sending reset
[   11.957699] mlx4_core 0000:00:01.1: Sending vhcr0
[   11.976090] mlx4_core 0000:00:01.1: HCA minimum page size:512
[   11.976672] mlx4_core 0000:00:01.1: Timestamping is not supported in slave mode.
[   12.068079] <mlx4_ib> mlx4_ib_add: mlx4_ib: Mellanox ConnectX InfiniBand driver v1.0 (April 4, 2008)
[   12.184072] mlx4_core 0000:00:01.1: mlx4_ib: multi-function enabled
[   12.184075] mlx4_core 0000:00:01.1: mlx4_ib: operating in qp1 tunnel mode

I've even got the ib0 device appearing.

ib0       Link encap:UNSPEC  HWaddr 80-00-05-49-FE-80-00-00-00-00-00-00-00-00-00-00  
          inet addr:10.10.10.10  Bcast:10.10.10.255  Mask:255.255.255.0
          UP BROADCAST MULTICAST  MTU:2044  Metric:1
          RX packets:117303 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:256 
          RX bytes:6576132 (6.5 MB)  TX bytes:0 (0.0 B)

I can even ping 10.10.10.10 locally.

However, those pings aren't sent out on the InfiniBand fabric.

It appears to be because the link is down. ibstat shows:

CA 'mlx4_0'
    CA type: MT4100
    Number of ports: 1
    Firmware version: 2.30.3000
    Hardware version: 0
    Node GUID: 0x001405005ef41f25
    System image GUID: 0x002590ffff175727
    Port 1:
        State: Down
        Physical state: LinkUp
        Rate: 10
        Base lid: 9
        LMC: 0
        SM lid: 1
        Capability mask: 0x02514868
        Port GUID: 0x0000000000000000

How do I get it UP? The ib0 interface in the DomU is UP, but the VF's InfiniBand port state stays Down.


The answer is actually found in this linux-rdma mailing list thread: http://www.spinics.net/lists/linux-rdma/msg13307.html

What do I need for the slave VF's port to become active? I'm running opensm 3.3.13 on a different box; is that new enough? (Does SR-IOV require any SM support?)

Yes, as Hal noted, at minimum you need opensm 3.3.14 (http://marc.info/?l=linux-rdma&m=133819320432335&w=2), as it is the first version to support the alias-GUID handling needed for SR-IOV; 3.3.15 is also out now, so you want the second version that supports this. Basically, you need an IB link for the PPF and for the slave to get an alias GUID registered for it at the SM.

I've now upgraded OpenSM and will report back soon.
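
To confirm which version is actually in play before and after the upgrade, something like this should do (a sketch; sminfo comes from the infiniband-diags package):

# On the subnet manager host: print the installed OpenSM version
opensm --version

# From any node on the fabric: report the master SM's LID, GUID and state
sminfo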


EDIT: OK, it's now working. However, I'm now getting a flood of log output from opensm. The OpenSM process is writing hundreds of entries per second of the form:

Sep 30 20:36:26 707784 [7DC1700] 0x01 -> validate_requested_mgid: ERR 1B01: Wrong MGID Prefix 0x8000 must be 0xFF
Sep 30 20:36:26 707810 [7DC1700] 0x01 -> mcmr_rcv_create_new_mgrp: ERR 1B22: Invalid requested MGID
Sep 30 20:36:26 708096 [8DC3700] 0x01 -> validate_requested_mgid: ERR 1B01: Wrong MGID Prefix 0x8000 must be 0xFF
Sep 30 20:36:26 708119 [8DC3700] 0x01 -> mcmr_rcv_create_new_mgrp: ERR 1B22: Invalid requested MGID
Sep 30 20:36:26 708391 [FF5B0700] 0x01 -> validate_requested_mgid: ERR 1B01: Wrong MGID Prefix 0x8000 must be 0xFF
Sep 30 20:36:26 708421 [FF5B0700] 0x01 -> mcmr_rcv_create_new_mgrp: ERR 1B22: Invalid requested MGID
Sep 30 20:36:26 708696 [3DB9700] 0x01 -> validate_requested_mgid: ERR 1B01: Wrong MGID Prefix 0x8000 must be 0xFF
Sep 30 20:36:26 708719 [3DB9700] 0x01 -> mcmr_rcv_create_new_mgrp: ERR 1B22: Invalid requested MGID

The above error messages went away when I rebooted and gave Dom0 more memory; I currently have 2 GB allocated to it with autoballooning off. Unfortunately, they have come back with no obvious cause, so I've asked a new question about that here.
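
For reference, fixing Dom0's memory and disabling autoballooning looked roughly like this (a sketch; the files are Debian-style and the sizes are from my setup):

# /etc/default/grub -- give Dom0 a fixed 2 GB, then run update-grub and reboot
GRUB_CMDLINE_XEN_DEFAULT="dom0_mem=2048M,max:2048M"

# /etc/xen/xl.conf -- stop xl from ballooning Dom0 down when guests start
autoballoon="off"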

I'm not really sure why running it on Dom0 makes the difference, but in my case I have to have OpenSM running on the Dom0 that has the VFs. I presume this is because the OpenSM instance running on Dom0 knows about the VFs and can advertise them, while a subnet manager on another node doesn't; that's my guess. I hope the other Xen node will pick up its VFs as well; that may end up becoming another question. For now it's working with a single Xen node.


OpenSM must be installed and started on the hypervisor host to bring the port state up. Start OpenSM with the option PORTS="ALL" so it manages all local ports.
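
On Debian/Ubuntu that option lives in the opensm defaults file; on other distributions the file and mechanism differ, so treat this as a sketch:

# /etc/default/opensm
# "ALL" makes the init script start an opensm instance on every local HCA port,
# so the PF port backing the VFs is always managed.
PORTS="ALL"

Restart the service afterwards (e.g. service opensm restart) and re-check the VF port with ibstat in the DomU; it should move from Down to Active once the SM registers the alias GUIDs.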