[SRU,F,J,K,0/2] iavf: SR-IOV VFs error with no traffic flow when MTU greater than 1500

Message ID 20221004044436.15046-1-matthew.ruffell@canonical.com

Message

Matthew Ruffell Oct. 4, 2022, 4:44 a.m. UTC
BugLink: https://bugs.launchpad.net/bugs/1983656

[Impact]

Virtual Machines with SR-IOV VFs from an Intel E810-XXV [8086:159b] get no 
traffic flow and produce error messages in both the host and guest during
network configuration.

Environment: Ubuntu OpenStack Focal-Ussuri with OVN
Host Kernel: v5.15.0-41-generic 20.04 Focal-HWE
Guest Kernels: v5.4.x Focal, v5.15.0-41-generic Jammy

Host Error Messages:
ice 0000:98:00.1: VF 7 failed opcode 6, retval: -5

Guest Error Messages:
iavf 0000:00:05.0: PF returned error -5 (IAVF_ERR_PARAM) to our request 6

In the context of these errors, "6" refers to the value of
VIRTCHNL_OP_CONFIG_VSI_QUEUES.
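
As a rough triage aid, the guest error line can be decoded with a small shell
helper. This is only a sketch: the function names virtchnl_op_name and
decode_iavf_error are hypothetical, and apart from opcode 6, which is stated
above, the opcode number for GET_VF_RESOURCES is recalled from the upstream
virtchnl header and should be checked against the kernel source.

```shell
#!/bin/sh
# Hypothetical helper: map a virtchnl opcode number to its name.
# Only opcode 6 is confirmed by this report; 3 is an assumption
# recalled from include/linux/avf/virtchnl.h.
virtchnl_op_name() {
    case "$1" in
        3) echo "VIRTCHNL_OP_GET_VF_RESOURCES" ;;
        6) echo "VIRTCHNL_OP_CONFIG_VSI_QUEUES" ;;
        *) echo "unknown opcode $1" ;;
    esac
}

# Hypothetical helper: extract the request number from an iavf error line
# of the form "... PF returned error -5 (IAVF_ERR_PARAM) to our request 6"
# and print the matching opcode name. Returns 1 if the line does not match.
decode_iavf_error() {
    req=$(printf '%s\n' "$1" | sed -n 's/.*to our request \([0-9][0-9]*\).*/\1/p')
    [ -n "$req" ] || return 1
    virtchnl_op_name "$req"
}

decode_iavf_error "iavf 0000:00:05.0: PF returned error -5 (IAVF_ERR_PARAM) to our request 6"
```

Run against the guest error message above, this prints the opcode name rather
than the bare number.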

In these cases the VM can transmit packets successfully but never receives any,
and the RX packet drop counter for the VF in "ip link" on the host increases in
step with the RX packet count.

There is a prior commit e6ba5273d4ede03d075d7a116b8edad1f6115f4d claiming to
resolve this error in some cases. It is already included in 5.15.0-41-generic
and did not resolve the issue.

The following conditions are required to trigger the bug:
- A port VLAN must be assigned by the host
- The MTU must be set >1500 by the guest

There is no workaround: Intel E810 SR-IOV VFs with MTU >1500 cannot be
used without these patches.

[Fix]

iavf currently sets the maximum packet size to IAVF_MAX_RXBUFFER, but the ice
PF driver reduces its maximum frame size by VLAN_HLEN to leave room for the
port VLAN header. iavf makes no such adjustment, so the VF requests a packet
size larger than the PF allows, and the PF rejects the request with
IAVF_ERR_PARAM.

The fix is to change the maximum packet size from IAVF_MAX_RXBUFFER to the
max_mtu value received from the PF in the GET_VF_RESOURCES message.
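
The size mismatch and the fix can be illustrated with a little shell
arithmetic. This is a sketch of the idea, not the driver code: the value 9728
for IAVF_MAX_RXBUFFER is an assumption recalled from the iavf headers, the
fallback rule in vf_max_frame is an assumption about how the fix handles a
missing or oversized max_mtu, and both function names are hypothetical.

```shell
#!/bin/sh
# Assumed constants (not stated in this report; check the kernel source).
IAVF_MAX_RXBUFFER=9728   # assumed largest RX buffer for a single descriptor
VLAN_HLEN=4              # length of an 802.1Q VLAN header in bytes

# Hypothetical model of the PF side: with a port VLAN assigned, the PF
# accepts frames only up to its maximum minus VLAN_HLEN.
pf_max_frame() {
    port_vlan=$1
    if [ "$port_vlan" -eq 1 ]; then
        echo $((IAVF_MAX_RXBUFFER - VLAN_HLEN))
    else
        echo "$IAVF_MAX_RXBUFFER"
    fi
}

# Hypothetical model of the fixed VF side: prefer the max_mtu advertised
# by the PF, falling back to IAVF_MAX_RXBUFFER if it is zero or too large.
vf_max_frame() {
    max_mtu=$1
    if [ "$max_mtu" -eq 0 ] || [ "$max_mtu" -gt "$IAVF_MAX_RXBUFFER" ]; then
        echo "$IAVF_MAX_RXBUFFER"
    else
        echo "$max_mtu"
    fi
}

limit=$(pf_max_frame 1)
echo "PF limit with port VLAN: $limit"
if [ "$IAVF_MAX_RXBUFFER" -gt "$limit" ]; then
    echo "old VF request of $IAVF_MAX_RXBUFFER exceeds the PF limit"
fi
echo "new VF request: $(vf_max_frame "$limit")"
```

Under these assumptions the old request overshoots the PF limit by exactly
VLAN_HLEN, while the new request matches what the PF advertises.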

Also pick up a necessary commit for i40e so that it announces the correct
maximum packet size in the GET_VF_RESOURCES message.

This has been fixed by the following commits:

commit 399c98c4dc50b7eb7e9f24da7ffdda6f025676ef
Author: Michal Jaron <michalx.jaron@intel.com>
Date:   Tue Sep 13 15:38:35 2022 +0200
Subject: iavf: Fix set max MTU size with port VLAN and jumbo frames
Link: https://github.com/torvalds/linux/commit/399c98c4dc50b7eb7e9f24da7ffdda6f025676ef

commit 372539def2824c43b6afe2403045b140f65c5acc
Author: Michal Jaron <michalx.jaron@intel.com>
Date:   Tue Sep 13 15:38:36 2022 +0200
Subject: i40e: Fix VF set max MTU size
Link: https://github.com/torvalds/linux/commit/372539def2824c43b6afe2403045b140f65c5acc

A test kernel is available in the following ppa:

https://launchpad.net/~arif-ali/+archive/ubuntu/sf00343742

If you install the test kernel to a compute host and VM, when you attach a 
VF and set the MTU to 9000, it succeeds, and traffic can flow.

[Test Plan]

Create a Focal VM and assign it an Intel E810 (ice) SR-IOV VF with a port VLAN.

OpenStack works for this, as does creating a VM directly with uvtool/libvirt:

$ uvt-kvm create focal-test release=focal

The document linked below explains the basics of SR-IOV:

https://www.intel.com/content/www/us/en/developer/articles/technical/configure-sr-iov-network-virtual-functions-in-linux-kvm.html

The following command shows the bus info for all the network devices:

$ lshw -c network -businfo

Choose one VF from the output, for example:

pci@0000:17:01.4  ens2f0v4     network        Ethernet Adaptive Virtual Function

We can then add the following to the XML definition via "virsh edit focal-test":

<interface type='hostdev' managed='yes'>
  <source>
    <address type='pci' domain='0x0000' bus='0x17' slot='0x01' function='0x4'/>
  </source>
  <vlan>
    <tag id='998'/>
  </vlan>
</interface>

Then we stop and start the VM via "virsh shutdown focal-test" followed by
"virsh start focal-test". We can then log in to the VM using the command below:

$ uvt-kvm ssh focal-test

Once you have logged in, run the following ip commands:

$ sudo ip a a 192.168.1.7/24 dev enp7s0
$ sudo ip link set up dev enp7s0
$ sudo ip link set mtu 9000 dev enp7s0

Now check dmesg, and we will find the error:

[   61.529605] iavf 0000:07:00.0: PF returned error -5 (IAVF_ERR_PARAM) to our request 6

For context, this is the log from setting the IP and bringing the link up:

[   36.228877] iavf 0000:07:00.0 enp7s0: NIC Link is Up Speed is 25 Gbps Full Duplex
[   36.228887] IPv6: ADDRCONF(NETDEV_CHANGE): enp7s0: link becomes ready
[   45.740100] crng init done
[   45.740102] random: 7 urandom warning(s) missed due to ratelimiting

And this is the log from then setting the MTU:

[   61.433706] iavf 0000:07:00.0: Received 16 queues, but can only have a max of 4
[   61.433707] iavf 0000:07:00.0: Fixing by reducing queues to 4
[   61.529605] iavf 0000:07:00.0: PF returned error -5 (IAVF_ERR_PARAM) to our request 6
[   61.552890] iavf 0000:07:00.0 enp7s0: NIC Link is Up Speed is 25 Gbps Full Duplex

There is a test kernel available in the following ppa:

https://launchpad.net/~arif-ali/+archive/ubuntu/sf00343742

If you install the test kernel, setting the MTU to 9000 works as expected and
traffic can flow.
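
The dmesg check in the steps above could be scripted along these lines. This is
a minimal sketch, and check_iavf_log is a hypothetical helper name. It only
looks for the exact error string shown in this report, so it cannot confirm
that traffic actually flows.

```shell
#!/bin/sh
# Hypothetical helper: read kernel log text on stdin and fail if the
# IAVF_ERR_PARAM rejection of request 6 appears.
check_iavf_log() {
    if grep -q 'PF returned error -5 (IAVF_ERR_PARAM) to our request 6'; then
        echo "FAIL: PF rejected the queue configuration"
        return 1
    fi
    echo "PASS: no IAVF_ERR_PARAM for request 6"
}

# Typical use inside the guest after setting the MTU (not run here):
#   sudo dmesg | check_iavf_log
```

A connectivity test with large packets between two hosts on the port VLAN is
still needed to confirm the fix end to end.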

[Where problems could occur]

We are changing how maximum MTU is calculated and applied to VFs in the iavf and
i40e drivers. Currently, any MTU over 1500 does not work at all when a port
VLAN is enabled, but if someone has somehow got their setup to work, they may
see a difference in MTU with these patches applied.

The iavf and i40e drivers are popular drivers, and if a regression were to
occur, initialisation and bringup of these network devices and VFs might fail.

Most users currently using MTUs of 1500 are unlikely to see any difference or
be at risk of regression.

[Other Info]

Both patches were developed by Intel, have been accepted into v6.0-rc7, and
are already released in upstream stable v5.4.215, v5.15.71 and v5.19.12. These
patches are well tested by the community and considered safe.

Michal Jaron (2):
  iavf: Fix set max MTU size with port VLAN and jumbo frames
  i40e: Fix VF set max MTU size

 .../ethernet/intel/i40e/i40e_virtchnl_pf.c    | 20 +++++++++++++++++++
 .../net/ethernet/intel/iavf/iavf_virtchnl.c   |  7 +++++--
 2 files changed, 25 insertions(+), 2 deletions(-)

Comments

Tim Gardner Oct. 4, 2022, 2:24 p.m. UTC | #1
Acked-by: Tim Gardner <tim.gardner@canonical.com>
Stefan Bader Oct. 7, 2022, 12:32 p.m. UTC | #2
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Andrea Righi Oct. 12, 2022, 6:50 a.m. UTC | #3
Applied to kinetic/linux.

Thanks,
-Andrea
Stefan Bader Oct. 19, 2022, 9:54 a.m. UTC | #4
Applied to jammy,focal:linux/master-next. Thanks.

-Stefan