Infiniband port status UP but can't open UMAD port ((null):0)
My system has two InfiniBand devices, one of which has both ports up.
$> ibstatus
Infiniband device 'mlx4_0' port 1 status:
default gid: fe80:0000:0000:0000:0002:c903:000f:0a9f
base lid: 0x22
sm lid: 0x1
state: 4: ACTIVE
phys state: 5: LinkUp
rate: 20 Gb/sec (4X DDR)
link_layer: IB
Infiniband device 'mlx4_0' port 2 status:
default gid: fe80:0000:0000:0000:0002:c903:000f:0aa0
base lid: 0x23
sm lid: 0x1
state: 4: ACTIVE
phys state: 5: LinkUp
rate: 20 Gb/sec (4X DDR)
link_layer: IB
Infiniband device 'mlx4_1' port 1 status:
default gid: fe80:0000:0000:0000:0002:c903:000f:0a6b
base lid: 0x0
sm lid: 0x0
state: 1: DOWN
phys state: 2: Polling
rate: 10 Gb/sec (4X)
link_layer: IB
Infiniband device 'mlx4_1' port 2 status:
default gid: fe80:0000:0000:0000:0002:c903:000f:0a6c
base lid: 0xd
sm lid: 0x2
state: 4: ACTIVE
phys state: 5: LinkUp
rate: 10 Gb/sec (4X)
link_layer: IB
Now, when I check the IB port state by LID, I get:
$> ibportstate -L 10x22 enable
ibwarn: [14836] mad_rpc_open_port: can't open UMAD port ((null):0)
ibportstate: iberror: failed: Failed to open '(null)' port '0'
I am not sure about the reason for this error message. Am I missing something?
Does the corresponding umad device file exist (this is typically /dev/infiniband/umad0)?
Also, on the system I have access to, the permissions of /dev/infiniband/umad0 are set by default such that normal users can't access it:
crw-rw---- 1 root root 231, 0 Feb 1 16:00 /dev/infiniband/umad0
so you could use sudo to run your command (or relax the permissions of /dev/infiniband/umad0).
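If it helps, the two checks above (does the device file exist, and can the current user open it) can be sketched as a small POSIX shell function. The function name check_umad and its messages are made up for illustration; the default path is simply the typical one mentioned above and may differ on your system.

```shell
# Hypothetical helper: report whether a umad device file exists and
# whether the current user can read and write it.
# Returns 0 if accessible, 1 if missing, 2 if present but not accessible.
check_umad() {
  dev=${1:-/dev/infiniband/umad0}   # assumed default path

  if [ ! -e "$dev" ]; then
    echo "missing: $dev"
    return 1
  fi

  if [ -r "$dev" ] && [ -w "$dev" ]; then
    echo "ok: $dev"
    return 0
  fi

  echo "no-access: $dev"
  return 2
}

# Example use (not run here):
#   check_umad                          # checks the assumed default path
#   check_umad /dev/infiniband/umad1    # checks another device file
```

A "missing" result would suggest the umad kernel module is not loaded or the device node was never created, while "no-access" points at the permissions issue, in which case running the original command with sudo is the quickest test.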
It may be just a typo here on SO, but you are specifying the LID as 10x22. Since the LID is meant to be the hexadecimal value 0x22 reported by ibstatus, the leading 1 is extraneous. It should be just 0x22.
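To catch this kind of typo before it reaches the tool, a wrapper script could check the LID string first. The sketch below is only an illustration: is_hex_lid is a made-up name, and it deliberately accepts only the 0x-prefixed hexadecimal form that ibstatus prints, which is stricter than what the tool itself may accept.

```shell
# Hypothetical helper: succeed only if the argument looks like a
# 0x-prefixed hexadecimal number such as 0x22.
is_hex_lid() {
  case "$1" in
    0x*|0X*) digits=${1#0[xX]} ;;   # strip the 0x / 0X prefix
    *) return 1 ;;                  # no prefix, e.g. 10x22 or 22
  esac

  case "$digits" in
    ''|*[!0-9a-fA-F]*) return 1 ;;  # empty or contains a non-hex character
    *) return 0 ;;
  esac
}

# Example use (not run here):
#   is_hex_lid 0x22  && echo "looks fine"
#   is_hex_lid 10x22 || echo "malformed LID"
```

With this check, 0x22 passes and 10x22 is rejected because it does not start with the 0x prefix.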