Using SR-IOV I350 embedded switch for KVM virtual network -- is an external switch required?

I'm connecting several KVM VMs to a virtual network that is routed to a 1Gbit physical network. The router uses netfilter/iptables to filter traffic between the real and virtual networks. For the virtual network switch I'm using SR-IOV with PCI-passthrough. Compared to using a Linux bridge, this setup allows higher throughput (limited by PCIe bandwidth) with lower CPU overhead (ref: pp 22-23 of Toshiaki Makita's presentation at LinuxCon Japan, 2014).

I've assigned one VF of the same port of an Intel I350 NIC to each VM and to the KVM host. (Each I350 port has a maximum of 7 VFs, so 6 VMs + host is the maximum size of this virtual network.) This setup works as expected except for an annoying quirk: the I350's embedded switch only functions when the associated I350 physical port is connected by a patch cable to a powered external switch (whose other ports are all empty). With the external switch on, the virtual network works fine; but when the external switch is powered down, the PF link status changes to "NO-CARRIER" and the virtual network no longer passes packets.

Does anyone know of a way to get the I350's embedded switch to function without an active link on the physical port?

The VM host runs Debian 10 (Buster), if that matters.

Thanks for any light you can shed!

Further notes:

  • A loopback plug like the Smartronix SuperLooper would be less clunky than an external switch, and might work. However according to the manufacturer it's "intended solely for testing systems where the Near End Crosstalk (NEXT) function can be disabled" -- and the I350 datasheet does not mention such functionality.

  • Section 3.7.6 of the I350 datasheet describes 4 different supported internal loopback modes. (The same internal loopback functionality is also present in the I210, and likely other Intel chips as well.) However I have not found any information on using Linux tools to configure the I350/I210/etc to use internal loopback. It's also unclear whether activating one of the internal loopback modes would also activate the I350's embedded switch...?

Updates:

  • Thanks to @Tomek, I tried

    # ip link set dev eth1 vf 0 state enable
    RTNETLINK answers: Operation not supported
    

    ip link set dev eth1 vf 0 trust on does work, so the syntax is correct and the driver is working. I got curious whether it is the igb driver or the I350 hardware that prevents setting the VF link state to enable. Looking at i40e_main.c (for example), struct net_device_ops i40e_netdev_ops contains

    .ndo_set_vf_link_state  = i40e_ndo_set_vf_link_state,
    .ndo_set_vf_spoofchk    = i40e_ndo_set_vf_spoofchk,
    .ndo_set_vf_trust       = i40e_ndo_set_vf_trust,
    

    whereas in igb_main.c, struct net_device_ops igb_netdev_ops has

    .ndo_set_vf_spoofchk    = igb_ndo_set_vf_spoofchk,
    .ndo_set_vf_trust       = igb_ndo_set_vf_trust,
    

    but is missing .ndo_set_vf_link_state. So it looks like igb does not support setting the VF link state to enable. Whether the I350 hardware could support this feature is another question. It seems this standard way of enabling the embedded switch will not work for the I350. Maybe there's another way?

  • The I350 datasheet makes some discouraging statements:

    7.8.3.1 Packet Switching (VMDq) Model: VMDq Assumptions

    1. When the link is down, the Tx flow is stopped, and thus the local switching traffic is stopped also.

    and

    7.3.3.5 TX Packets Switching

    The following rules apply to loopback traffic:

    • Loopback is disabled when the network link is disconnected.

    Whereas the Intel 710 datasheet reads quite differently:

    Table 1-7. Internal Switching Features

    Internal switching operates independently of the state of the LAN ports (also when LAN ports are down).

    It's increasingly looking like the answer to my question is: Yes, an I350 does need an external switch attached in order to switch VM-VM traffic. I'd love for someone to prove me wrong!
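For anyone who wants to reproduce the symptom, here is a minimal sketch that reads the PF's link state from sysfs. It assumes the PF is named eth1 as above; the function name pf_link_report and the SYSFS override are made up for illustration (the override only exists so the function can be pointed at a fake directory).

```shell
#!/bin/sh
# Sketch: report the link state of an SR-IOV PF from sysfs.
# SYSFS is a hypothetical override, not a standard variable.
SYSFS=${SYSFS:-/sys/class/net}

pf_link_report() {
    pf=$1
    dir="$SYSFS/$pf"
    [ -d "$dir" ] || { echo "$pf: no such interface" >&2; return 2; }

    state=$(cat "$dir/operstate" 2>/dev/null || echo unknown)
    # Reading carrier fails while the interface is administratively down.
    carrier=$(cat "$dir/carrier" 2>/dev/null || echo "?")
    # Number of VFs currently enabled on this PF (0 if not an SR-IOV PF).
    numvfs=$(cat "$dir/device/sriov_numvfs" 2>/dev/null || echo 0)

    echo "$pf operstate=$state carrier=$carrier vfs=$numvfs"
    # Exit status 0 only when the PF has carrier.
    [ "$carrier" = 1 ]
}

# Run only when a PF name is given, e.g.: sh pf-link-report.sh eth1
if [ $# -ge 1 ]; then
    pf_link_report "$1"
fi
```

With the external switch powered down, I would expect this to report carrier=0 and return a non-zero status, matching the NO-CARRIER state shown by ip link.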


Solution 1:

I'm not sure whether this will work on the I350, but I think the answer is in the ip-link man page:

vf NUM specify a Virtual Function device to be configured. The associated PF device
must be specified using the dev parameter.
[--cut--]
    state auto|enable|disable - set the virtual link state as seen by
    the specified VF. Setting to auto means a reflection of the PF link state,
    enable lets the VF to communicate with other VFs on this host even
    if the PF link state is down, disable causes the HW to drop any packets
    sent by the VF.

Setting the VF state to enable should force all VFs up regardless of the link state and allow switching between them even without a cable.
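As a rough sketch of how that could be applied to every VF at once: the loop below reads the VF count from sysfs and issues the ip link command from the man page excerpt for each one. The function name enable_vf_links and the IP and SYSFS overrides are hypothetical (the overrides are only there so the loop can be tried without real hardware), and whether the command is accepted depends on the PF driver implementing VF link state.

```shell
#!/bin/sh
# Sketch: try to set every VF of a PF to link state "enable".
# IP and SYSFS are hypothetical overrides, not standard variables.
IP=${IP:-ip}
SYSFS=${SYSFS:-/sys/class/net}

enable_vf_links() {
    pf=$1
    numvfs_file="$SYSFS/$pf/device/sriov_numvfs"
    if [ ! -r "$numvfs_file" ]; then
        echo "$pf: no sriov_numvfs (not an SR-IOV PF?)" >&2
        return 2
    fi

    numvfs=$(cat "$numvfs_file")
    failed=0
    vf=0
    while [ "$vf" -lt "$numvfs" ]; do
        # Same command as in the man page excerpt, once per VF.
        if $IP link set dev "$pf" vf "$vf" state enable 2>/dev/null; then
            echo "vf $vf: state enable accepted"
        else
            echo "vf $vf: state enable rejected (driver may not support VF link state)"
            failed=1
        fi
        vf=$((vf + 1))
    done
    # 0 if every VF accepted the setting, 1 if any was rejected.
    return "$failed"
}

# Run only when a PF name is given, e.g.: sh enable-vf-links.sh eth1
if [ $# -ge 1 ]; then
    enable_vf_links "$1"
fi
```

If the driver rejects the setting (for example with "Operation not supported"), each VF is reported as rejected and the function returns 1, so it is easy to tell whether this approach is available on a given NIC.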