[1/2] igb: Disable threaded IRQ for igb_msix_other

Message ID 20240920185918.616302-2-wander@redhat.com
State Under Review
Delegated to: Anthony Nguyen
Series Fixes in igbvf driver

Commit Message

Wander Lairson Costa Sept. 20, 2024, 6:59 p.m. UTC
During SR-IOV testing, Red Hat QE found that bringing up igbvf
interfaces with ip link intermittently fails on PREEMPT_RT kernels.
Investigation revealed that e1000_write_posted_mbx returns an error
because e1000_poll_for_ack times out without seeing an ACK.

The underlying issue arises from the fact that IRQs are threaded by
default under PREEMPT_RT. While the exact hardware details are not
available, it appears that the IRQ handled by igb_msix_other must
be processed before e1000_poll_for_ack times out. However,
e1000_write_posted_mbx is called with preemption disabled, leading
to a scenario where the IRQ is serviced only after the failure of
e1000_write_posted_mbx.

To resolve this, we set IRQF_NO_THREAD for the affected interrupt so
that its handler is not force-threaded and runs in hard interrupt
context, which prevents the error described above.
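
A rough user-space model of this timing problem, not the driver code, is
sketched below. It assumes the ACK only becomes visible once the interrupt
handler has run, and it measures the handler's delay in poll iterations
rather than real time. All names (struct fake_mbx, maybe_run_handler,
poll_for_ack, simulate, self_check) are hypothetical and exist only for
illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for mailbox state; not the driver's structure. */
struct fake_mbx {
	bool acked;        /* set once the modeled interrupt handler has run */
	int handler_delay; /* polls that must elapse before the handler runs */
	int polls;         /* polls performed so far */
};

/* Models the handler: it sets the ACK only once it is allowed to run. */
static void maybe_run_handler(struct fake_mbx *mbx)
{
	if (mbx->polls >= mbx->handler_delay)
		mbx->acked = true;
}

/* Bounded poll: returns 0 on ACK, -1 if the countdown expires first. */
static int poll_for_ack(struct fake_mbx *mbx, int countdown)
{
	while (countdown > 0) {
		maybe_run_handler(mbx);
		if (mbx->acked)
			return 0;
		mbx->polls++;
		countdown--;
	}
	return -1;
}

/* Runs one poll with the given handler delay and poll budget. */
static int simulate(int handler_delay, int budget)
{
	struct fake_mbx mbx = {
		.acked = false,
		.handler_delay = handler_delay,
		.polls = 0,
	};

	return poll_for_ack(&mbx, budget);
}

/* Usage example: contrasts a prompt handler with a deferred one. */
void self_check(void)
{
	/* Handler runs promptly (models hard interrupt context). */
	assert(simulate(0, 10) == 0);
	/* Handler deferred past the poll budget (models a threaded handler
	 * that cannot run while the poller is non-preemptible). */
	assert(simulate(20, 10) == -1);
}
```

In this model the only difference between the two cases is when the
handler is allowed to run relative to the poll budget, which is the
property IRQF_NO_THREAD is meant to change.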

Reproducer:

    #!/bin/bash

    # echo 2 > /sys/class/net/ens14f0/device/sriov_numvfs
    ipaddr_vlan=3
    nic_test=ens14f0
    vf=${nic_test}v0

    while true; do
        ip link set ${nic_test} mtu 1500
        ip link set ${vf} mtu 1500
        ip link set ${vf} up
        ip link set ${nic_test} vf 0 vlan ${ipaddr_vlan}
        ip addr add 172.30.${ipaddr_vlan}.1/24 dev ${vf}
        ip addr add 2021:db8:${ipaddr_vlan}::1/64 dev ${vf}
        if ! ip link show ${vf} | grep 'state UP'; then
            echo 'Error found'
            break
        fi
        ip link set ${vf} down
    done

Signed-off-by: Wander Lairson Costa <wander@redhat.com>
Reported-by: Yuying Ma <yuma@redhat.com>
---
 drivers/net/ethernet/intel/igb/igb_main.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Przemek Kitszel Sept. 23, 2024, 9:07 a.m. UTC | #1
On 9/20/24 20:59, Wander Lairson Costa wrote:
> [...]

Thank you for the small, localized fix with a good description.
Our VAL will also check it on a non-RT OS.
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>

PS: for future Intel Ethernet submissions, please split out fixes and
refactors, and tag each commit with the [iwl-net] or [iwl-next] tag.

Romanowski, Rafal Oct. 9, 2024, 8:54 a.m. UTC | #2
Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>

Patch

diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c
index 1ef4cb871452..8a1696d7289f 100644
--- a/drivers/net/ethernet/intel/igb/igb_main.c
+++ b/drivers/net/ethernet/intel/igb/igb_main.c
@@ -907,7 +907,7 @@  static int igb_request_msix(struct igb_adapter *adapter)
 	int i, err = 0, vector = 0, free_vector = 0;
 
 	err = request_irq(adapter->msix_entries[vector].vector,
-			  igb_msix_other, 0, netdev->name, adapter);
+			  igb_msix_other, IRQF_NO_THREAD, netdev->name, adapter);
 	if (err)
 		goto err_out;