diff mbox series

[iwl-next,v2,1/1] igc: Improve XDP_SETUP_PROG process

Message ID 20241211134532.3489335-1-yoong.siang.song@intel.com
State Superseded
Headers show
Series [iwl-next,v2,1/1] igc: Improve XDP_SETUP_PROG process | expand

Commit Message

Song Yoong Siang Dec. 11, 2024, 1:45 p.m. UTC
Improve XDP_SETUP_PROG process by avoiding unnecessary link down event.

This patch is tested by using ip link set xdpdrv command to attach a simple
XDP program which always return XDP_PASS.

Before this patch, attaching xdp program will cause ptp4l to lost sync for
few seconds, as shown in ptp4l log below:
  ptp4l[198.082]: rms    4 max    8 freq   +906 +/-   2 delay    12 +/-   0
  ptp4l[199.082]: rms    3 max    4 freq   +906 +/-   3 delay    12 +/-   0
  ptp4l[199.536]: port 1 (enp2s0): link down
  ptp4l[199.536]: port 1 (enp2s0): SLAVE to FAULTY on FAULT_DETECTED (FT_UNSPECIFIED)
  ptp4l[199.600]: selected local clock 22abbc.fffe.bb1234 as best master
  ptp4l[199.600]: port 1 (enp2s0): assuming the grand master role
  ptp4l[199.600]: port 1 (enp2s0): master state recommended in slave only mode
  ptp4l[199.600]: port 1 (enp2s0): defaultDS.priority1 probably misconfigured
  ptp4l[202.266]: port 1 (enp2s0): link up
  ptp4l[202.300]: port 1 (enp2s0): FAULTY to LISTENING on INIT_COMPLETE
  ptp4l[205.558]: port 1 (enp2s0): new foreign master 44abbc.fffe.bb2144-1
  ptp4l[207.558]: selected best master clock 44abbc.fffe.bb2144
  ptp4l[207.559]: port 1 (enp2s0): LISTENING to UNCALIBRATED on RS_SLAVE
  ptp4l[208.308]: port 1 (enp2s0): UNCALIBRATED to SLAVE on MASTER_CLOCK_SELECTED
  ptp4l[208.933]: rms  742 max 1303 freq   -195 +/- 682 delay    12 +/-   0
  ptp4l[209.933]: rms  178 max  274 freq   +387 +/- 243 delay    12 +/-   0

After this patch, attaching xdp program no longer cause ptp4l to lost sync,
as shown on ptp4l log below:
  ptp4l[201.183]: rms    1 max    3 freq   +959 +/-   1 delay     8 +/-   0
  ptp4l[202.183]: rms    1 max    3 freq   +961 +/-   2 delay     8 +/-   0
  ptp4l[203.183]: rms    2 max    3 freq   +958 +/-   2 delay     8 +/-   0
  ptp4l[204.183]: rms    3 max    5 freq   +961 +/-   3 delay     8 +/-   0
  ptp4l[205.183]: rms    2 max    4 freq   +964 +/-   3 delay     8 +/-   0

Besides, before this patch, attaching xdp program will cause flood ping to
loss 10 packets, as shown in ping statistics below:
  --- 169.254.1.2 ping statistics ---
  100000 packets transmitted, 99990 received, +6 errors, 0.01% packet loss, time 34001ms
  rtt min/avg/max/mdev = 0.028/0.301/3104.360/13.838 ms, pipe 10, ipg/ewma 0.340/0.243 ms

After this patch, attaching xdp program no longer cause flood ping to loss
any packets, as shown in ping statistics below:
  --- 169.254.1.2 ping statistics ---
  100000 packets transmitted, 100000 received, 0% packet loss, time 32326ms
  rtt min/avg/max/mdev = 0.027/0.231/19.589/0.155 ms, pipe 2, ipg/ewma 0.323/0.322 ms

On the other hand, this patch is also tested with tools/testing/selftests/
bpf/xdp_hw_metadata app to make sure XDP zero-copy is working fine with
XDP Tx and Rx metadata. Below is the result of last packet after received
10000 UDP packets with interval 1 ms:
  poll: 1 (0) skip=0 fail=0 redir=10000
  xsk_ring_cons__peek: 1
  0x55881c7ef7a8: rx_desc[9999]->addr=8f110 addr=8f110 comp_addr=8f110 EoP
  rx_hash: 0xFB9BB6A3 with RSS type:0x1
  HW RX-time:   1733923136269470866 (sec:1733923136.2695) delta to User RX-time sec:0.0000 (43.280 usec)
  XDP RX-time:   1733923136269482482 (sec:1733923136.2695) delta to User RX-time sec:0.0000 (31.664 usec)
  No rx_vlan_tci or rx_vlan_proto, err=-95
  0x55881c7ef7a8: ping-pong with csum=ab19 (want 315b) csum_start=34 csum_offset=6
  0x55881c7ef7a8: complete tx idx=9999 addr=f010
  HW TX-complete-time:   1733923136269591637 (sec:1733923136.2696) delta to User TX-complete-time sec:0.0001 (108.571 usec)
  XDP RX-time:   1733923136269482482 (sec:1733923136.2695) delta to User TX-complete-time sec:0.0002 (217.726 usec)
  HW RX-time:   1733923136269470866 (sec:1733923136.2695) delta to HW TX-complete-time sec:0.0001 (120.771 usec)
  0x55881c7ef7a8: complete rx idx=10127 addr=8f110

Signed-off-by: Song Yoong Siang <yoong.siang.song@intel.com>
---
V2 changelog:
 - show some examples of problem in commit msg. (Vinicius)
 - igc_close()/igc_open() are too big a hammer for installing a new XDP
   program. Only do we we really need. (Vinicius)
---
 drivers/net/ethernet/intel/igc/igc_xdp.c | 19 +++++++++++++++----
 1 file changed, 15 insertions(+), 4 deletions(-)

Comments

Maciej Fijalkowski Dec. 12, 2024, 2:08 p.m. UTC | #1
On Wed, Dec 11, 2024 at 09:45:32PM +0800, Song Yoong Siang wrote:
> Improve XDP_SETUP_PROG process by avoiding unnecessary link down event.
> 
> This patch is tested by using ip link set xdpdrv command to attach a simple
> XDP program which always return XDP_PASS.
> 
> Before this patch, attaching xdp program will cause ptp4l to lost sync for
> few seconds, as shown in ptp4l log below:
>   ptp4l[198.082]: rms    4 max    8 freq   +906 +/-   2 delay    12 +/-   0
>   ptp4l[199.082]: rms    3 max    4 freq   +906 +/-   3 delay    12 +/-   0
>   ptp4l[199.536]: port 1 (enp2s0): link down
>   ptp4l[199.536]: port 1 (enp2s0): SLAVE to FAULTY on FAULT_DETECTED (FT_UNSPECIFIED)
>   ptp4l[199.600]: selected local clock 22abbc.fffe.bb1234 as best master
>   ptp4l[199.600]: port 1 (enp2s0): assuming the grand master role
>   ptp4l[199.600]: port 1 (enp2s0): master state recommended in slave only mode
>   ptp4l[199.600]: port 1 (enp2s0): defaultDS.priority1 probably misconfigured
>   ptp4l[202.266]: port 1 (enp2s0): link up
>   ptp4l[202.300]: port 1 (enp2s0): FAULTY to LISTENING on INIT_COMPLETE
>   ptp4l[205.558]: port 1 (enp2s0): new foreign master 44abbc.fffe.bb2144-1
>   ptp4l[207.558]: selected best master clock 44abbc.fffe.bb2144
>   ptp4l[207.559]: port 1 (enp2s0): LISTENING to UNCALIBRATED on RS_SLAVE
>   ptp4l[208.308]: port 1 (enp2s0): UNCALIBRATED to SLAVE on MASTER_CLOCK_SELECTED
>   ptp4l[208.933]: rms  742 max 1303 freq   -195 +/- 682 delay    12 +/-   0
>   ptp4l[209.933]: rms  178 max  274 freq   +387 +/- 243 delay    12 +/-   0
> 
> After this patch, attaching xdp program no longer cause ptp4l to lost sync,
> as shown on ptp4l log below:
>   ptp4l[201.183]: rms    1 max    3 freq   +959 +/-   1 delay     8 +/-   0
>   ptp4l[202.183]: rms    1 max    3 freq   +961 +/-   2 delay     8 +/-   0
>   ptp4l[203.183]: rms    2 max    3 freq   +958 +/-   2 delay     8 +/-   0
>   ptp4l[204.183]: rms    3 max    5 freq   +961 +/-   3 delay     8 +/-   0
>   ptp4l[205.183]: rms    2 max    4 freq   +964 +/-   3 delay     8 +/-   0
> 
> Besides, before this patch, attaching xdp program will cause flood ping to
> loss 10 packets, as shown in ping statistics below:
>   --- 169.254.1.2 ping statistics ---
>   100000 packets transmitted, 99990 received, +6 errors, 0.01% packet loss, time 34001ms
>   rtt min/avg/max/mdev = 0.028/0.301/3104.360/13.838 ms, pipe 10, ipg/ewma 0.340/0.243 ms
> 
> After this patch, attaching xdp program no longer cause flood ping to loss
> any packets, as shown in ping statistics below:
>   --- 169.254.1.2 ping statistics ---
>   100000 packets transmitted, 100000 received, 0% packet loss, time 32326ms
>   rtt min/avg/max/mdev = 0.027/0.231/19.589/0.155 ms, pipe 2, ipg/ewma 0.323/0.322 ms
> 
> On the other hand, this patch is also tested with tools/testing/selftests/
> bpf/xdp_hw_metadata app to make sure XDP zero-copy is working fine with
> XDP Tx and Rx metadata. Below is the result of last packet after received
> 10000 UDP packets with interval 1 ms:
>   poll: 1 (0) skip=0 fail=0 redir=10000
>   xsk_ring_cons__peek: 1
>   0x55881c7ef7a8: rx_desc[9999]->addr=8f110 addr=8f110 comp_addr=8f110 EoP
>   rx_hash: 0xFB9BB6A3 with RSS type:0x1
>   HW RX-time:   1733923136269470866 (sec:1733923136.2695) delta to User RX-time sec:0.0000 (43.280 usec)
>   XDP RX-time:   1733923136269482482 (sec:1733923136.2695) delta to User RX-time sec:0.0000 (31.664 usec)
>   No rx_vlan_tci or rx_vlan_proto, err=-95
>   0x55881c7ef7a8: ping-pong with csum=ab19 (want 315b) csum_start=34 csum_offset=6
>   0x55881c7ef7a8: complete tx idx=9999 addr=f010
>   HW TX-complete-time:   1733923136269591637 (sec:1733923136.2696) delta to User TX-complete-time sec:0.0001 (108.571 usec)
>   XDP RX-time:   1733923136269482482 (sec:1733923136.2695) delta to User TX-complete-time sec:0.0002 (217.726 usec)
>   HW RX-time:   1733923136269470866 (sec:1733923136.2695) delta to HW TX-complete-time sec:0.0001 (120.771 usec)
>   0x55881c7ef7a8: complete rx idx=10127 addr=8f110
> 
> Signed-off-by: Song Yoong Siang <yoong.siang.song@intel.com>
> ---
> V2 changelog:
>  - show some examples of problem in commit msg. (Vinicius)
>  - igc_close()/igc_open() are too big a hammer for installing a new XDP
>    program. Only do we we really need. (Vinicius)
> ---
>  drivers/net/ethernet/intel/igc/igc_xdp.c | 19 +++++++++++++++----
>  1 file changed, 15 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/net/ethernet/intel/igc/igc_xdp.c b/drivers/net/ethernet/intel/igc/igc_xdp.c
> index 869815f48ac1..64b04aad614c 100644
> --- a/drivers/net/ethernet/intel/igc/igc_xdp.c
> +++ b/drivers/net/ethernet/intel/igc/igc_xdp.c
> @@ -14,6 +14,7 @@ int igc_xdp_set_prog(struct igc_adapter *adapter, struct bpf_prog *prog,
>  	bool if_running = netif_running(dev);
>  	struct bpf_prog *old_prog;
>  	bool need_update;
> +	int i;
>  
>  	if (dev->mtu > ETH_DATA_LEN) {
>  		/* For now, the driver doesn't support XDP functionality with
> @@ -24,8 +25,13 @@ int igc_xdp_set_prog(struct igc_adapter *adapter, struct bpf_prog *prog,
>  	}
>  
>  	need_update = !!adapter->xdp_prog != !!prog;
> -	if (if_running && need_update)
> -		igc_close(dev);
> +	if (if_running && need_update) {
> +		for (i = 0; i < adapter->num_rx_queues; i++) {
> +			igc_disable_rx_ring(adapter->rx_ring[i]);
> +			igc_disable_tx_ring(adapter->tx_ring[i]);
> +			napi_disable(&adapter->rx_ring[i]->q_vector->napi);
> +		}
> +	}
>  
>  	old_prog = xchg(&adapter->xdp_prog, prog);
>  	if (old_prog)
> @@ -36,8 +42,13 @@ int igc_xdp_set_prog(struct igc_adapter *adapter, struct bpf_prog *prog,
>  	else
>  		xdp_features_clear_redirect_target(dev);
>  
> -	if (if_running && need_update)
> -		igc_open(dev);
> +	if (if_running && need_update) {
> +		for (i = 0; i < adapter->num_rx_queues; i++) {
> +			napi_enable(&adapter->rx_ring[i]->q_vector->napi);
> +			igc_enable_tx_ring(adapter->tx_ring[i]);
> +			igc_enable_rx_ring(adapter->rx_ring[i]);

I agree we could do better than igc_close/igc_open pair, but have you
tried igc_down/igc_open instead?

> +		}
> +	}
>  
>  	return 0;
>  }
> -- 
> 2.34.1
>
Song Yoong Siang Dec. 12, 2024, 3:21 p.m. UTC | #2
On Thursday, December 12, 2024 10:09 PM, Fijalkowski, Maciej wrote:
>On Wed, Dec 11, 2024 at 09:45:32PM +0800, Song Yoong Siang wrote:
>> Improve XDP_SETUP_PROG process by avoiding unnecessary link down event.
>>

[...]

>
>I agree we could do better than igc_close/igc_open pair, but have you
>tried igc_down/igc_open instead?
>

Hi Fijalkowski Maciej,

Thanks for your comment. I did 2 experiments on your suggestion:
1. with igc_down/igc_open, kernel dump observed after load xdp prog,
    as shown in dmesg below:

[   87.957526] ------------[ cut here ]------------
[   87.962168] remove_proc_entry: removing non-empty directory 'irq/150', leaking at least 'enp2s0'
[   87.971014] WARNING: CPU: 17 PID: 3690 at fs/proc/generic.c:717 remove_proc_entry+0x19a/0x1b0
[   87.979581] Modules linked in: algif_hash(E) af_alg(E) nf_conntrack_netlink(E) xfrm_user(E) xfrm_algo(E) xt_addrtype(E) br_netfilter(E) nft_masq(E) xt_CHECKSUM(E) xt_MASQUERADE(E) xt_conntrack(E) ipt_REJECT(E) nf_reject_ipv4(E) xt_tcpudp(E) nft_compat(E) nft_chain_nat(E) nf_nat(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) nf_tables(E) nfnetlink(E) bridge(E) stp(E) llc(E) overlay(E) binfmt_misc(E) intel_uncore_frequency(E) intel_uncore_frequency_common(E) x86_pkg_temp_thermal(E) intel_powerclamp(E) coretemp(E) kvm_intel(E) processor_thermal_device_pci(E) kvm(E) processor_thermal_device(E) nls_iso8859_1(E) mei_hdcp(E) cmdlinepart(E) intel_rapl_msr(E) processor_thermal_wt_hint(E) hid_sensor_als(E) processor_thermal_rfim(E) rapl(E) spi_nor(E) processor_thermal_rapl(E) hid_sensor_trigger(E) intel_cstate(E) wmi_bmof(E) mtd(E) industrialio_triggered_buffer(E) intel_rapl_common(E) kfifo_buf(E) mei_me(E) hid_sensor_iio_common(E) processor_thermal_wt_req(E) processor_thermal_power_floor(E) idma64(E)
[   87.979631]  processor_thermal_mbox(E) mei(E) industrialio(E) int340x_thermal_zone(E) virt_dma(E) intel_pmc_core(E) int3400_thermal(E) pmt_telemetry(E) intel_hid(E) acpi_thermal_rel(E) mac_hid(E) pmt_class(E) sparse_keymap(E) acpi_pad(E) acpi_tad(E) sch_fq_codel(E) msr(E) parport_pc(E) ppdev(E) lp(E) parport(E) ramoops(E) reed_solomon(E) pstore_blk(E) pstore_zone(E) efi_pstore(E) ip_tables(E) x_tables(E) autofs4(E) btrfs(E) blake2b_generic(E) xor(E) raid6_pq(E) libcrc32c(E) dm_mirror(E) dm_region_hash(E) dm_log(E) hid_sensor_custom(E) hid_sensor_hub(E) hid_generic(E) intel_ishtp_hid(E) hid(E) i915(E) prime_numbers(E) drm_buddy(E) ttm(E) i2c_algo_bit(E) mxl_gpy(E) polynomial(E) drm_display_helper(E) crct10dif_pclmul(E) cec(E) dwc3(E) dwmac_intel(E) crc32_pclmul(E) stmmac(E) ghash_clmulni_intel(E) rc_core(E) ulpi(E) spi_pxa2xx_platform(E) udc_core(E) sha512_ssse3(E) i2c_i801(E) dw_dmac(E) sha256_ssse3(E) pcs_xpcs(E) i2c_mux(E) nvme(E) dw_dmac_core(E) drm_kms_helper(E) sha1_ssse3(E) e1000e(E) phylink(E) i2c_smbus(E)
[   88.068050]  spi_pxa2xx_core(E) igc(E) nvme_core(E) ahci(E) libahci(E) intel_ish_ipc(E) drm(E) dwc3_pci(E) intel_ishtp(E) video(E) pinctrl_alderlake(E) wmi(E) pinctrl_intel(E) ax88179_178a(E) usbnet(E) mii(E) uas(E) usb_storage(E) aesni_intel(E) crypto_simd(E) cryptd(E)
[   88.181533] CPU: 17 UID: 0 PID: 3690 Comm: ip Tainted: G     U      E      6.13.0-rc1-siang+ #36
[   88.190327] Tainted: [U]=USER, [E]=UNSIGNED_MODULE
[   88.195136] Hardware name: Intel Corporation Alder Lake Client Platform/AlderLake-S ADP-S DDR5 UDIMM CRB, BIOS ADLSFWI1.R00.3032.B00.2201111106 01/11/2022
[   88.208957] RIP: 0010:remove_proc_entry+0x19a/0x1b0
[   88.213855] Code: 80 d8 5c b5 48 85 c0 48 8d 90 78 ff ff ff 48 0f 45 c2 49 8b 54 24 78 4c 8b 80 a0 00 00 00 48 8b 92 a0 00 00 00 e8 a6 c9 bb ff <0f> 0b e9 40 ff ff ff e8 aa 4b a8 00 66 2e 0f 1f 84 00 00 00 00 00
[   88.232613] RSP: 0018:ffffba1a83c1f3a8 EFLAGS: 00010282
[   88.237851] RAX: 0000000000000000 RBX: ffff8dc340294e40 RCX: 0000000000000000
[   88.245003] RDX: 0000000000000002 RSI: ffffffffb55a0e36 RDI: 00000000ffffffff
[   88.252153] RBP: ffffba1a83c1f3d8 R08: 0000000000000000 R09: ffffba1a83c1f210
[   88.259302] R10: ffffba1a83c1f208 R11: ffffffffb5f50ec8 R12: ffff8dc3c2606d80
[   88.266470] R13: ffff8dc3c2606e00 R14: 0000000000000000 R15: 0000000000000097
[   88.273626] FS:  00007f7018866b80(0000) GS:ffff8dcacf880000(0000) knlGS:0000000000000000
[   88.281734] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   88.287505] CR2: 00005603b5622d88 CR3: 0000000195582000 CR4: 0000000000752ef0
[   88.294663] PKRU: 55555554
[   88.297400] Call Trace:
[   88.299878]  <TASK>
[   88.302009]  ? show_regs+0x69/0x80
[   88.305444]  ? __warn+0x8d/0x140
[   88.308702]  ? remove_proc_entry+0x19a/0x1b0
[   88.312999]  ? report_bug+0x1c5/0x1d0
[   88.316692]  ? irq_work_queue+0x40/0x60
[   88.320558]  ? handle_bug+0x63/0xa0
[   88.324075]  ? exc_invalid_op+0x19/0x70
[   88.327938]  ? asm_exc_invalid_op+0x1b/0x20
[   88.332154]  ? remove_proc_entry+0x19a/0x1b0
[   88.336449]  ? remove_proc_entry+0x19a/0x1b0
[   88.340745]  unregister_irq_proc+0xe4/0x110
[   88.344960]  free_desc+0x41/0xd0
[   88.348218]  ? irq_domain_free_irqs+0x138/0x1b0
[   88.352773]  ? kfree+0x25b/0x2f0
[   88.356031]  irq_free_descs+0x4c/0x70
[   88.359720]  irq_domain_free_irqs+0x151/0x1b0
[   88.364107]  msi_domain_free_locked.part.0+0x186/0x1b0
[   88.369270]  msi_domain_free_irqs_all_locked+0x6d/0x90
[   88.374433]  pci_msi_teardown_msi_irqs+0x3f/0x50
[   88.379078]  pci_free_msi_irqs+0x12/0x40
[   88.383032]  pci_disable_msix+0x51/0x70
[   88.386894]  igc_reset_interrupt_capability+0x2c/0xc0 [igc]
ptp4l[88.192]: p[   88.392520]  __igc_open+0x356/0x630 [igc]
[   88.397927]  igc_open+0x31/0x60 [igc]
ort 1 (enp2s0): [   88.401635]  igc_xdp_set_prog+0xa0/0xf0 [igc]
[   88.407389]  igc_bpf+0x46/0x50 [igc]
SLAVE to LISTENI[   88.411012]  dev_xdp_install+0x85/0x140
[   88.416247]  ? __pfx_igc_bpf+0x10/0x10 [igc]
[   88.420546]  dev_xdp_attach+0x20f/0x5f0
[   88.424411]  dev_change_xdp_fd+0x1f3/0x230
[   88.428534]  do_setlink.constprop.0+0x695/0x11b0
[   88.433180]  ? _raw_spin_unlock+0x19/0x40
[   88.437220]  ? __nla_validate_parse+0x5a/0xe80
[   88.441694]  ? __handle_mm_fault+0xbb2/0x1060
[   88.446076]  ? ns_capable+0x2d/0x60
[   88.449592]  ? netlink_ns_capable+0x3c/0x50
[   88.453807]  rtnl_newlink+0x8bb/0xd10
[   88.457497]  ? count_memcg_events.constprop.0+0x1e/0x40
[   88.462750]  ? security_capable+0x69/0x190
[   88.466879]  ? __pfx_rtnl_newlink+0x10/0x10
[   88.471091]  rtnetlink_rcv_msg+0x35b/0x400
[   88.475212]  ? __pfx_rtnetlink_rcv_msg+0x10/0x10
[   88.479856]  netlink_rcv_skb+0x59/0x110
[   88.483721]  rtnetlink_rcv+0x15/0x20
[   88.487325]  netlink_unicast+0x23c/0x350
[   88.491273]  netlink_sendmsg+0x208/0x460
[   88.495228]  ____sys_sendmsg+0x3c1/0x3f0
[   88.499183]  ___sys_sendmsg+0x88/0xd0
[   88.502872]  ? debug_smp_processor_id+0x17/0x20
[   88.507426]  ? mod_objcg_state+0xb6/0x2f0
[   88.511464]  ? mod_objcg_state+0xb6/0x2f0
[   88.515502]  ? __memcg_slab_free_hook+0xdf/0x130
[   88.520147]  __sys_sendmsg+0x74/0xd0
[   88.523754]  __x64_sys_sendmsg+0x1d/0x30
[   88.527702]  x64_sys_call+0x92e/0x2140
[   88.531482]  do_syscall_64+0x4d/0x120
[   88.535170]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
[   88.540247] RIP: 0033:0x7f7018727b17
[   88.543853] Code: 0f 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b9 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 89 54 24 1c 48 89 74 24 10
[   88.562625] RSP: 002b:00007ffd53158938 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
[   88.570215] RAX: ffffffffffffffda RBX: 00000000675af93b RCX: 00007f7018727b17
[   88.577369] RDX: 0000000000000000 RSI: 00007ffd531589a0 RDI: 0000000000000003
[   88.584528] RBP: 0000000000000000 R08: 0000000000000001 R09: 00005603bd0ecd90
[   88.591686] R10: 00005603bd0ec360 R11: 0000000000000246 R12: 0000000000000001
[   88.598844] R13: 00007ffd53158a50 R14: 0000000000000000 R15: 00005603b55d8040
NG on ANNOUNCE_R[   88.606005]  </TASK>
[   88.609592] ---[ end trace 0000000000000000 ]---


2. with igc_down/igc_up, there is still link down and lost sync event,
    as shown in ptp4l log below:

ptp4l[183.164]: rms    8 max   15 freq  +1012 +/-   4 delay     8 +/-   0
ptp4l[184.164]: rms    3 max    6 freq  +1018 +/-   5 delay     8 +/-   0
ptp4l[185.164]: rms    4 max    6 freq  +1026 +/-   3 delay     8 +/-   0
[  185.998988] NET: Registered PF_ALG protocol family
ptp4l[186.064]: port 1 (enp2s0): link down
ptp4l[186.064]: port 1 (enp2s0): SLAVE to FAULTY on FAULT_DETECTED (FT_UNSPECIFIED)
ptp4l[186.124]: selected local clock 22abbc.fffe.bb1234 as best master
ptp4l[186.124]: port 1 (enp2s0): assuming the grand master role
ptp4l[186.124]: port 1 (enp2s0): master state recommended in slave only mode
ptp4l[186.124]: port 1 (enp2s0): defaultDS.priority1 probably misconfigured
[  188.375104] igc 0000:02:00.0 enp2s0: NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX
ptp4l[188.896]: port 1 (enp2s0): link up
ptp4l[188.928]: port 1 (enp2s0): FAULTY to LISTENING on INIT_COMPLETE
ptp4l[192.100]: port 1 (enp2s0): new foreign master 44abbc.fffe.bb2144-1
ptp4l[194.100]: selected best master clock 44abbc.fffe.bb2144
ptp4l[194.100]: port 1 (enp2s0): LISTENING to UNCALIBRATED on RS_SLAVE
ptp4l[194.974]: port 1 (enp2s0): UNCALIBRATED to SLAVE on MASTER_CLOCK_SELECTED
ptp4l[195.099]: rms  172 max  375 freq   +895 +/- 225 delay    10 +/-   2
ptp4l[196.099]: rms  126 max  245 freq   +739 +/-  98 delay    13 +/-   0
ptp4l[197.099]: rms   80 max   97 freq   +986 +/-  45 delay    13 +/-   0

Thanks & Regards
Siang

>> +		}
>> +	}
>>
>>  	return 0;
>>  }
>> --
>> 2.34.1
>>
Paul Menzel Dec. 13, 2024, 8:09 a.m. UTC | #3
Dear Song,


Thank you for your patch. Maybe for the summary/title you could be more 
specific:

igc: Avoid unnecessary link down event in XDP_SETUP_PROG process


Am 11.12.24 um 14:45 schrieb Song Yoong Siang:
> Improve XDP_SETUP_PROG process by avoiding unnecessary link down event.
> 
> This patch is tested by using ip link set xdpdrv command to attach a simple
> XDP program which always return XDP_PASS.

return*s*

> Before this patch, attaching xdp program will cause ptp4l to lost sync for

to lose

> few seconds, as shown in ptp4l log below:
>    ptp4l[198.082]: rms    4 max    8 freq   +906 +/-   2 delay    12 +/-   0
>    ptp4l[199.082]: rms    3 max    4 freq   +906 +/-   3 delay    12 +/-   0
>    ptp4l[199.536]: port 1 (enp2s0): link down
>    ptp4l[199.536]: port 1 (enp2s0): SLAVE to FAULTY on FAULT_DETECTED (FT_UNSPECIFIED)
>    ptp4l[199.600]: selected local clock 22abbc.fffe.bb1234 as best master
>    ptp4l[199.600]: port 1 (enp2s0): assuming the grand master role
>    ptp4l[199.600]: port 1 (enp2s0): master state recommended in slave only mode
>    ptp4l[199.600]: port 1 (enp2s0): defaultDS.priority1 probably misconfigured
>    ptp4l[202.266]: port 1 (enp2s0): link up
>    ptp4l[202.300]: port 1 (enp2s0): FAULTY to LISTENING on INIT_COMPLETE
>    ptp4l[205.558]: port 1 (enp2s0): new foreign master 44abbc.fffe.bb2144-1
>    ptp4l[207.558]: selected best master clock 44abbc.fffe.bb2144
>    ptp4l[207.559]: port 1 (enp2s0): LISTENING to UNCALIBRATED on RS_SLAVE
>    ptp4l[208.308]: port 1 (enp2s0): UNCALIBRATED to SLAVE on MASTER_CLOCK_SELECTED
>    ptp4l[208.933]: rms  742 max 1303 freq   -195 +/- 682 delay    12 +/-   0
>    ptp4l[209.933]: rms  178 max  274 freq   +387 +/- 243 delay    12 +/-   0
> 
> After this patch, attaching xdp program no longer cause ptp4l to lost sync,

to lose

> as shown on ptp4l log below:
>    ptp4l[201.183]: rms    1 max    3 freq   +959 +/-   1 delay     8 +/-   0
>    ptp4l[202.183]: rms    1 max    3 freq   +961 +/-   2 delay     8 +/-   0
>    ptp4l[203.183]: rms    2 max    3 freq   +958 +/-   2 delay     8 +/-   0
>    ptp4l[204.183]: rms    3 max    5 freq   +961 +/-   3 delay     8 +/-   0
>    ptp4l[205.183]: rms    2 max    4 freq   +964 +/-   3 delay     8 +/-   0
> 
> Besides, before this patch, attaching xdp program will cause flood ping to
> loss 10 packets, as shown in ping statistics below:

to lose

>    --- 169.254.1.2 ping statistics ---
>    100000 packets transmitted, 99990 received, +6 errors, 0.01% packet loss, time 34001ms
>    rtt min/avg/max/mdev = 0.028/0.301/3104.360/13.838 ms, pipe 10, ipg/ewma 0.340/0.243 ms
> 
> After this patch, attaching xdp program no longer cause flood ping to loss

cause*s*, to lose

> any packets, as shown in ping statistics below:
>    --- 169.254.1.2 ping statistics ---
>    100000 packets transmitted, 100000 received, 0% packet loss, time 32326ms
>    rtt min/avg/max/mdev = 0.027/0.231/19.589/0.155 ms, pipe 2, ipg/ewma 0.323/0.322 ms
> 
> On the other hand, this patch is also tested with tools/testing/selftests/
> bpf/xdp_hw_metadata app to make sure XDP zero-copy is working fine with
> XDP Tx and Rx metadata. Below is the result of last packet after received
> 10000 UDP packets with interval 1 ms:
>    poll: 1 (0) skip=0 fail=0 redir=10000
>    xsk_ring_cons__peek: 1
>    0x55881c7ef7a8: rx_desc[9999]->addr=8f110 addr=8f110 comp_addr=8f110 EoP
>    rx_hash: 0xFB9BB6A3 with RSS type:0x1
>    HW RX-time:   1733923136269470866 (sec:1733923136.2695) delta to User RX-time sec:0.0000 (43.280 usec)
>    XDP RX-time:   1733923136269482482 (sec:1733923136.2695) delta to User RX-time sec:0.0000 (31.664 usec)
>    No rx_vlan_tci or rx_vlan_proto, err=-95
>    0x55881c7ef7a8: ping-pong with csum=ab19 (want 315b) csum_start=34 csum_offset=6
>    0x55881c7ef7a8: complete tx idx=9999 addr=f010
>    HW TX-complete-time:   1733923136269591637 (sec:1733923136.2696) delta to User TX-complete-time sec:0.0001 (108.571 usec)
>    XDP RX-time:   1733923136269482482 (sec:1733923136.2695) delta to User TX-complete-time sec:0.0002 (217.726 usec)
>    HW RX-time:   1733923136269470866 (sec:1733923136.2695) delta to HW TX-complete-time sec:0.0001 (120.771 usec)
>    0x55881c7ef7a8: complete rx idx=10127 addr=8f110
> 
> Signed-off-by: Song Yoong Siang <yoong.siang.song@intel.com>
> ---
> V2 changelog:
>   - show some examples of problem in commit msg. (Vinicius)
>   - igc_close()/igc_open() are too big a hammer for installing a new XDP
>     program. Only do we we really need. (Vinicius)

The first sentence of the second item could go into the commit message 
with a note, why `igc_close()`/`igc_open()` are not needed.

> ---
>   drivers/net/ethernet/intel/igc/igc_xdp.c | 19 +++++++++++++++----
>   1 file changed, 15 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/net/ethernet/intel/igc/igc_xdp.c b/drivers/net/ethernet/intel/igc/igc_xdp.c
> index 869815f48ac1..64b04aad614c 100644
> --- a/drivers/net/ethernet/intel/igc/igc_xdp.c
> +++ b/drivers/net/ethernet/intel/igc/igc_xdp.c
> @@ -14,6 +14,7 @@ int igc_xdp_set_prog(struct igc_adapter *adapter, struct bpf_prog *prog,
>   	bool if_running = netif_running(dev);
>   	struct bpf_prog *old_prog;
>   	bool need_update;
> +	int i;

I’d use unsigned int as it’s used to access array elements.

>   
>   	if (dev->mtu > ETH_DATA_LEN) {
>   		/* For now, the driver doesn't support XDP functionality with
> @@ -24,8 +25,13 @@ int igc_xdp_set_prog(struct igc_adapter *adapter, struct bpf_prog *prog,
>   	}
>   
>   	need_update = !!adapter->xdp_prog != !!prog;
> -	if (if_running && need_update)
> -		igc_close(dev);
> +	if (if_running && need_update) {
> +		for (i = 0; i < adapter->num_rx_queues; i++) {
> +			igc_disable_rx_ring(adapter->rx_ring[i]);
> +			igc_disable_tx_ring(adapter->tx_ring[i]);
> +			napi_disable(&adapter->rx_ring[i]->q_vector->napi);
> +		}
> +	}
>   
>   	old_prog = xchg(&adapter->xdp_prog, prog);
>   	if (old_prog)
> @@ -36,8 +42,13 @@ int igc_xdp_set_prog(struct igc_adapter *adapter, struct bpf_prog *prog,
>   	else
>   		xdp_features_clear_redirect_target(dev);
>   
> -	if (if_running && need_update)
> -		igc_open(dev);
> +	if (if_running && need_update) {
> +		for (i = 0; i < adapter->num_rx_queues; i++) {
> +			napi_enable(&adapter->rx_ring[i]->q_vector->napi);
> +			igc_enable_tx_ring(adapter->tx_ring[i]);
> +			igc_enable_rx_ring(adapter->rx_ring[i]);
> +		}
> +	}
>   
>   	return 0;
>   }


Kind regards,

Paul
Song Yoong Siang Dec. 17, 2024, 1:28 a.m. UTC | #4
On Friday, December 13, 2024 4:10 PM, Paul Menzel <pmenzel@molgen.mpg.de> wrote:
>Dear Song,
>
>
>Thank you for your patch. Maybe for the summary/title you could be more
>specific:
>
>igc: Avoid unnecessary link down event in XDP_SETUP_PROG process
>
>
>Am 11.12.24 um 14:45 schrieb Song Yoong Siang:
>> Improve XDP_SETUP_PROG process by avoiding unnecessary link down event.
>>
>> This patch is tested by using ip link set xdpdrv command to attach a simple
>> XDP program which always return XDP_PASS.
>
>return*s*
>
>> Before this patch, attaching xdp program will cause ptp4l to lost sync for
>
>to lose
>
>> few seconds, as shown in ptp4l log below:
>>    ptp4l[198.082]: rms    4 max    8 freq   +906 +/-   2 delay    12 +/-   0
>>    ptp4l[199.082]: rms    3 max    4 freq   +906 +/-   3 delay    12 +/-   0
>>    ptp4l[199.536]: port 1 (enp2s0): link down
>>    ptp4l[199.536]: port 1 (enp2s0): SLAVE to FAULTY on FAULT_DETECTED
>(FT_UNSPECIFIED)
>>    ptp4l[199.600]: selected local clock 22abbc.fffe.bb1234 as best master
>>    ptp4l[199.600]: port 1 (enp2s0): assuming the grand master role
>>    ptp4l[199.600]: port 1 (enp2s0): master state recommended in slave only mode
>>    ptp4l[199.600]: port 1 (enp2s0): defaultDS.priority1 probably misconfigured
>>    ptp4l[202.266]: port 1 (enp2s0): link up
>>    ptp4l[202.300]: port 1 (enp2s0): FAULTY to LISTENING on INIT_COMPLETE
>>    ptp4l[205.558]: port 1 (enp2s0): new foreign master 44abbc.fffe.bb2144-1
>>    ptp4l[207.558]: selected best master clock 44abbc.fffe.bb2144
>>    ptp4l[207.559]: port 1 (enp2s0): LISTENING to UNCALIBRATED on RS_SLAVE
>>    ptp4l[208.308]: port 1 (enp2s0): UNCALIBRATED to SLAVE on
>MASTER_CLOCK_SELECTED
>>    ptp4l[208.933]: rms  742 max 1303 freq   -195 +/- 682 delay    12 +/-   0
>>    ptp4l[209.933]: rms  178 max  274 freq   +387 +/- 243 delay    12 +/-   0
>>
>> After this patch, attaching xdp program no longer cause ptp4l to lost sync,
>
>to lose
>
>> as shown on ptp4l log below:
>>    ptp4l[201.183]: rms    1 max    3 freq   +959 +/-   1 delay     8 +/-   0
>>    ptp4l[202.183]: rms    1 max    3 freq   +961 +/-   2 delay     8 +/-   0
>>    ptp4l[203.183]: rms    2 max    3 freq   +958 +/-   2 delay     8 +/-   0
>>    ptp4l[204.183]: rms    3 max    5 freq   +961 +/-   3 delay     8 +/-   0
>>    ptp4l[205.183]: rms    2 max    4 freq   +964 +/-   3 delay     8 +/-   0
>>
>> Besides, before this patch, attaching xdp program will cause flood ping to
>> loss 10 packets, as shown in ping statistics below:
>
>to lose
>
>>    --- 169.254.1.2 ping statistics ---
>>    100000 packets transmitted, 99990 received, +6 errors, 0.01% packet loss, time
>34001ms
>>    rtt min/avg/max/mdev = 0.028/0.301/3104.360/13.838 ms, pipe 10, ipg/ewma
>0.340/0.243 ms
>>
>> After this patch, attaching xdp program no longer cause flood ping to loss
>
>cause*s*, to lose
>
>> any packets, as shown in ping statistics below:
>>    --- 169.254.1.2 ping statistics ---
>>    100000 packets transmitted, 100000 received, 0% packet loss, time 32326ms
>>    rtt min/avg/max/mdev = 0.027/0.231/19.589/0.155 ms, pipe 2, ipg/ewma
>0.323/0.322 ms
>>
>> On the other hand, this patch is also tested with tools/testing/selftests/
>> bpf/xdp_hw_metadata app to make sure XDP zero-copy is working fine with
>> XDP Tx and Rx metadata. Below is the result of last packet after received
>> 10000 UDP packets with interval 1 ms:
>>    poll: 1 (0) skip=0 fail=0 redir=10000
>>    xsk_ring_cons__peek: 1
>>    0x55881c7ef7a8: rx_desc[9999]->addr=8f110 addr=8f110 comp_addr=8f110 EoP
>>    rx_hash: 0xFB9BB6A3 with RSS type:0x1
>>    HW RX-time:   1733923136269470866 (sec:1733923136.2695) delta to User RX-
>time sec:0.0000 (43.280 usec)
>>    XDP RX-time:   1733923136269482482 (sec:1733923136.2695) delta to User RX-
>time sec:0.0000 (31.664 usec)
>>    No rx_vlan_tci or rx_vlan_proto, err=-95
>>    0x55881c7ef7a8: ping-pong with csum=ab19 (want 315b) csum_start=34
>csum_offset=6
>>    0x55881c7ef7a8: complete tx idx=9999 addr=f010
>>    HW TX-complete-time:   1733923136269591637 (sec:1733923136.2696) delta to
>User TX-complete-time sec:0.0001 (108.571 usec)
>>    XDP RX-time:   1733923136269482482 (sec:1733923136.2695) delta to User TX-
>complete-time sec:0.0002 (217.726 usec)
>>    HW RX-time:   1733923136269470866 (sec:1733923136.2695) delta to HW TX-
>complete-time sec:0.0001 (120.771 usec)
>>    0x55881c7ef7a8: complete rx idx=10127 addr=8f110
>>
>> Signed-off-by: Song Yoong Siang <yoong.siang.song@intel.com>
>> ---
>> V2 changelog:
>>   - show some examples of problem in commit msg. (Vinicius)
>>   - igc_close()/igc_open() are too big a hammer for installing a new XDP
>>     program. Only do we we really need. (Vinicius)
>
>The first sentence of the second item could go into the commit message
>with a note, why `igc_close()`/`igc_open()` are not needed.
>
>> ---
>>   drivers/net/ethernet/intel/igc/igc_xdp.c | 19 +++++++++++++++----
>>   1 file changed, 15 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/net/ethernet/intel/igc/igc_xdp.c
>b/drivers/net/ethernet/intel/igc/igc_xdp.c
>> index 869815f48ac1..64b04aad614c 100644
>> --- a/drivers/net/ethernet/intel/igc/igc_xdp.c
>> +++ b/drivers/net/ethernet/intel/igc/igc_xdp.c
>> @@ -14,6 +14,7 @@ int igc_xdp_set_prog(struct igc_adapter *adapter, struct
>bpf_prog *prog,
>>   	bool if_running = netif_running(dev);
>>   	struct bpf_prog *old_prog;
>>   	bool need_update;
>> +	int i;
>
>I’d use unsigned int as it’s used to access array elements.
>
>>
>>   	if (dev->mtu > ETH_DATA_LEN) {
>>   		/* For now, the driver doesn't support XDP functionality with
>> @@ -24,8 +25,13 @@ int igc_xdp_set_prog(struct igc_adapter *adapter, struct
>bpf_prog *prog,
>>   	}
>>
>>   	need_update = !!adapter->xdp_prog != !!prog;
>> -	if (if_running && need_update)
>> -		igc_close(dev);
>> +	if (if_running && need_update) {
>> +		for (i = 0; i < adapter->num_rx_queues; i++) {
>> +			igc_disable_rx_ring(adapter->rx_ring[i]);
>> +			igc_disable_tx_ring(adapter->tx_ring[i]);
>> +			napi_disable(&adapter->rx_ring[i]->q_vector->napi);
>> +		}
>> +	}
>>
>>   	old_prog = xchg(&adapter->xdp_prog, prog);
>>   	if (old_prog)
>> @@ -36,8 +42,13 @@ int igc_xdp_set_prog(struct igc_adapter *adapter, struct
>bpf_prog *prog,
>>   	else
>>   		xdp_features_clear_redirect_target(dev);
>>
>> -	if (if_running && need_update)
>> -		igc_open(dev);
>> +	if (if_running && need_update) {
>> +		for (i = 0; i < adapter->num_rx_queues; i++) {
>> +			napi_enable(&adapter->rx_ring[i]->q_vector->napi);
>> +			igc_enable_tx_ring(adapter->tx_ring[i]);
>> +			igc_enable_rx_ring(adapter->rx_ring[i]);
>> +		}
>> +	}
>>
>>   	return 0;
>>   }
>
>
>Kind regards,
>
>Paul

Hi Paul,

Thank you for your valuable review comments.
FYI, I submitted v3 [1] to address your comments.
Please feel free to review again.

[1] https://patchwork.ozlabs.org/project/intel-wired-lan/patch/20241216073849.3555825-1-yoong.siang.song@intel.com/

Thanks & Regards
Siang
diff mbox series

Patch

diff --git a/drivers/net/ethernet/intel/igc/igc_xdp.c b/drivers/net/ethernet/intel/igc/igc_xdp.c
index 869815f48ac1..64b04aad614c 100644
--- a/drivers/net/ethernet/intel/igc/igc_xdp.c
+++ b/drivers/net/ethernet/intel/igc/igc_xdp.c
@@ -14,6 +14,7 @@  int igc_xdp_set_prog(struct igc_adapter *adapter, struct bpf_prog *prog,
 	bool if_running = netif_running(dev);
 	struct bpf_prog *old_prog;
 	bool need_update;
+	int i;
 
 	if (dev->mtu > ETH_DATA_LEN) {
 		/* For now, the driver doesn't support XDP functionality with
@@ -24,8 +25,13 @@  int igc_xdp_set_prog(struct igc_adapter *adapter, struct bpf_prog *prog,
 	}
 
 	need_update = !!adapter->xdp_prog != !!prog;
-	if (if_running && need_update)
-		igc_close(dev);
+	if (if_running && need_update) {
+		for (i = 0; i < adapter->num_rx_queues; i++) {
+			igc_disable_rx_ring(adapter->rx_ring[i]);
+			igc_disable_tx_ring(adapter->tx_ring[i]);
+			napi_disable(&adapter->rx_ring[i]->q_vector->napi);
+		}
+	}
 
 	old_prog = xchg(&adapter->xdp_prog, prog);
 	if (old_prog)
@@ -36,8 +42,13 @@  int igc_xdp_set_prog(struct igc_adapter *adapter, struct bpf_prog *prog,
 	else
 		xdp_features_clear_redirect_target(dev);
 
-	if (if_running && need_update)
-		igc_open(dev);
+	if (if_running && need_update) {
+		for (i = 0; i < adapter->num_rx_queues; i++) {
+			napi_enable(&adapter->rx_ring[i]->q_vector->napi);
+			igc_enable_tx_ring(adapter->tx_ring[i]);
+			igc_enable_rx_ring(adapter->rx_ring[i]);
+		}
+	}
 
 	return 0;
 }