NIC with SR-IOV capabilities work by introducing the idea of physical functions (PFs) and virtual functions (VFs).
A PF is used by host and VF configurations are applied through the PF. Each VF can be treated as a separate physical NIC and assigned to one container, and configured with it's own MAC, VLAN and IP, etc.
SR-IOV CNI plugin works with SR-IOV device plugin for VF allocation for a container. A metaplugin such as Multus gets the allocated VF's deviceID
(PCI address) and responsible to invoke SR-IOV cni plugin with that deviceID
.
This plugin requires Go 1.5+ to build.
Go 1.5 users will need to set GO15VENDOREXPERIMENT=1
to get vendored dependencies. This flag is set by default in 1.6.
To build the plugin binary:
# make
Upon successful build the plugin binary will be available in build/sriov
.
Given Intel ixgbe NIC on CentOS, Fedora or RHEL:
# vi /etc/modprobe.conf
options ixgbe max_vfs=8,8
SRIOV-CNI support Mellanox ConnectX®-4 Lx and ConnectX®-5 adapter cards. To enable SR-IOV functionality the following steps are required:
1- Enable SR-IOV in the NIC's Firmware.
Installing Mellanox Management Tools (MFT) or mstflint is a pre-requisite, MFT can be downloaded from here, mstflint package available in the various distros and can be downloaded from here.
Use Mellanox Firmware Tools package to enable and configure SR-IOV in firmware
# mst start
Starting MST (Mellanox Software Tools) driver set
Loading MST PCI module - Success
Loading MST PCI configuration module - Success
Create devices
Locate the HCA device on the desired PCI slot
# mst status
MST modules:
------------
MST PCI module loaded
MST PCI configuration module loaded
MST devices:
------------
/dev/mst/mt4115_pciconf0 - PCI configuration cycles access.
...
Enable SR-IOV
# mlxconfig -d /dev/mst/mt4115_pciconf0 q set SRIOV_EN=1 NUM_OF_VFS=8
...
Apply new Configuration? ? (y/n) [n] : y
Applying... Done!
-I- Please reboot machine to load new configurations.
Reboot the machine
# reboot
2- Enable SR-IOV in the NIC's Driver.
# ibdev2netdev
mlx5_0 port 1 ==> enp2s0f0 (Up)
mlx5_1 port 1 ==> enp2s0f1 (Up)
# echo 4 > /sys/class/net/enp2s0f0/device/sriov_numvfs
# ibdev2netdev -v
0000:02:00.0 mlx5_0 (MT4115 - MT1523X04353) CX456A - ConnectX-4 QSFP fw 12.23.1020 port 1 (ACTIVE) ==> enp2s0f0 (Up)
0000:02:00.1 mlx5_1 (MT4115 - MT1523X04353) CX456A - ConnectX-4 QSFP fw 12.23.1020 port 1 (ACTIVE) ==> enp2s0f1 (Up)
0000:02:00.5 mlx5_2 (MT4116 - NA) fw 12.23.1020 port 1 (DOWN ) ==> enp2s0f2 (Down)
0000:02:00.6 mlx5_3 (MT4116 - NA) fw 12.23.1020 port 1 (DOWN ) ==> enp2s0f3 (Down)
0000:02:00.7 mlx5_4 (MT4116 - NA) fw 12.23.1020 port 1 (DOWN ) ==> enp2s0f4 (Down)
0000:02:00.2 mlx5_5 (MT4116 - NA) fw 12.23.1020 port 1 (DOWN ) ==> enp2s0f5 (Down)
# lspci | grep Mellanox
02:00.0 Ethernet controller: Mellanox Technologies MT27700 Family [ConnectX-4]
02:00.1 Ethernet controller: Mellanox Technologies MT27700 Family [ConnectX-4]
02:00.2 Ethernet controller: Mellanox Technologies MT27700 Family [ConnectX-4 Virtual Function]
02:00.3 Ethernet controller: Mellanox Technologies MT27700 Family [ConnectX-4 Virtual Function]
02:00.4 Ethernet controller: Mellanox Technologies MT27700 Family [ConnectX-4 Virtual Function]
02:00.5 Ethernet controller: Mellanox Technologies MT27700 Family [ConnectX-4 Virtual Function]
# ip link show
...
enp2s0f2: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/ether c6:6d:7d:dd:2a:d5 brd ff:ff:ff:ff:ff:ff
enp2s0f3: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/ether 42:3e:07:68:da:fb brd ff:ff:ff:ff:ff:ff
enp2s0f4: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/ether 42:68:f2:aa:c2:27 brd ff:ff:ff:ff:ff:ff
enp2s0f5: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
...
To change the number of VFs reset the number to 0 then set the needed number
echo 0 > /sys/class/net/enp2s0f0/device/sriov_numvfs
echo 8 > /sys/class/net/enp2s0f0/device/sriov_numvfs
Note: In case spoofchk is enabled for VF, a valid administrative MAC needs to be specified.
-
name
(string, required): the name of the network -
type
(string, required): "sriov" -
deviceID
(string, required): A valid pci address of an SRIOV NIC's VF. e.g. "0000:03:02.3" -
vlan
(int, optional): VLAN ID to assign for the VF. Value must be in the range 0-4094 (0 for disabled, 1-4094 for valid VLAN IDs). -
vlanQoS
(int, optional): VLAN QoS to assign for the VF. Value must be in the range 0-7. This option requiresvlan
field to be set to a non-zero value. Otherwise, the error will be returned. -
mac
(string, optional): MAC address to assign for the VF -
ipam
(dictionary, optional): IPAM configuration to be used for this network. -
spoofchk
(string, optional): turn packet spoof checking on or off for the VF -
trust
(string, optional): turn trust setting on or off for the VF -
link_state
(string, optional): enforce link state for the VF. Allowed values: auto, enable, disable. Note that driver support may differ for this feature. For example,i40e
is known to work butigb
doesn't. -
min_tx_rate
(int, optional): change the allowed minimum transmit bandwidth, in Mbps, for the VF. Setting this to 0 disables rate limiting. The min_tx_rate value should be <= max_tx_rate. Support of this feature depends on NICs and drivers. -
max_tx_rate
(int, optional): change the allowed maximum transmit bandwidth, in Mbps, for the VF. Setting this to 0 disables rate limiting.
If this plugin is used with a VF bound to a dpdk driver then the IPAM configuration will be ignored.
# cat > /etc/cni/net.d/10-mynet.conf <<EOF
{
"cniVersion": "0.3.1",
"name": "sriov-net",
"type": "sriov",
"deviceID": "0000:03:02.0",
"vlan": 2222,
"ipam": {
"type": "host-local",
"subnet": "10.56.217.0/24",
"rangeStart": "10.56.217.171",
"rangeEnd": "10.56.217.181",
"routes": [
{ "dst": "0.0.0.0/0" }
],
"gateway": "10.56.217.1"
}
}
EOF
eth0 Link encap:Ethernet HWaddr 66:D8:02:77:AA:AA
inet addr:10.55.206.46 Bcast:0.0.0.0 Mask:255.255.255.192
inet6 addr: fe80::64d8:2ff:fe77:aaaa/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:7 errors:0 dropped:0 overruns:0 frame:0
TX packets:14 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:530 (530.0 b) TX bytes:988 (988.0 b)
lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
inet6 addr: ::1/128 Scope:Host
UP LOOPBACK RUNNING MTU:65536 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:0 (0.0 b) TX bytes:0 (0.0 b)
Note: In dpdk mode IPAM configuration is ignored.
# cat > /etc/cni/net.d/20-mynet-dpdk.conf <<EOF
{
"cniVersion": "0.3.1",
"name": "sriov-dpdk",
"type": "sriov",
"deviceID": "0000:03:02.0",
"vlan": 1000
}
EOF
# cat > /etc/cni/net.d/20-mynet-dpdk.conf <<EOF
{
"cniVersion": "0.3.1",
"name": "sriov-dpdk",
"type": "sriov",
"deviceID": "0000:03:02.0",
"vlan": 1000,
"min_tx_rate": 100,
"max_tx_rate": 200,
"spoofchk": "off",
"trust": "on",
"link_state": "enable"
}
EOF
For any questions about Multus CNI, please reach out on github issue or feel free to contact the developers @kural OR @ahalim in our Intel-Corp Slack