diff mbox series

[net,1/4] net/mlx5e: Force CHECKSUM_UNNECESSARY for short ethernet frames

Message ID 20190119002523.16097-2-saeedm@mellanox.com
State Accepted
Delegated to: David Miller
Series [net,1/4] net/mlx5e: Force CHECKSUM_UNNECESSARY for short ethernet frames

Commit Message

Saeed Mahameed Jan. 19, 2019, 12:25 a.m. UTC
From: Cong Wang <xiyou.wangcong@gmail.com>

When an ethernet frame is padded to meet the minimum ethernet frame
size, the padding octets are not covered by the hardware checksum.
Fortunately the padding octets are usually zeros, which don't affect
the checksum. However, we have a switch which pads with non-zero
octets, and this causes kernel hardware checksum faults repeatedly.

Prior to:
commit '88078d98d1bb ("net: pskb_trim_rcsum() and CHECKSUM_COMPLETE ...")'
skb checksum was forced to be CHECKSUM_NONE when padding is detected.
After it, we need to keep skb->csum updated, like what we do for RXFCS.
However, fixing up CHECKSUM_COMPLETE requires verifying and parsing IP
headers, which is not worth the effort as the packets are so small that
CHECKSUM_COMPLETE can't save anything.

Fixes: 88078d98d1bb ("net: pskb_trim_rcsum() and CHECKSUM_COMPLETE are friends")
Cc: Eric Dumazet <edumazet@google.com>
Cc: Tariq Toukan <tariqt@mellanox.com>
Cc: Nikola Ciprich <nikola.ciprich@linuxbox.cz>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en_rx.c | 13 +++++++++++++
 1 file changed, 13 insertions(+)
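
Editor's illustration of the commit message above: a minimal userspace sketch, not driver code, of why zero padding is harmless to a 16-bit one's-complement checksum while non-zero padding is not. The names csum16, padding_changes_csum, DATA_LEN and PAD_LEN are hypothetical and chosen only for this example. The data bytes are arbitrary, and the sum is a plain RFC 1071 style sum rather than the exact value the hardware reports.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DATA_LEN 50   /* illustrative: bytes the hardware checksum covers */
#define PAD_LEN  10   /* illustrative: padding up to a 60-byte frame */

/* Folded 16-bit one's-complement sum over len bytes (RFC 1071 style). */
static uint16_t csum16(const uint8_t *buf, size_t len)
{
	uint32_t sum = 0;
	size_t i;

	for (i = 0; i + 1 < len; i += 2)
		sum += ((uint32_t)buf[i] << 8) | buf[i + 1];
	if (len & 1)
		sum += (uint32_t)buf[len - 1] << 8;
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}

/* Returns 1 if padding with pad_byte makes the sum over the whole frame
 * differ from the sum over the data alone, 0 otherwise.
 */
static int padding_changes_csum(uint8_t pad_byte)
{
	uint8_t frame[DATA_LEN + PAD_LEN];
	size_t i;

	/* The comparison assumes padding starts on a 16-bit boundary. */
	assert(DATA_LEN % 2 == 0);

	for (i = 0; i < DATA_LEN; i++)
		frame[i] = (uint8_t)(i * 7 + 1);   /* arbitrary data bytes */
	memset(frame + DATA_LEN, pad_byte, PAD_LEN);

	return csum16(frame, DATA_LEN) != csum16(frame, sizeof(frame));
}
```

Calling padding_changes_csum(0x00) reports no difference, because zero words add nothing to the sum. A non-zero pad byte such as 0xab shifts the sum, which is the mismatch that the stack then reports as a hardware checksum fault.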

Comments

Cong Wang Jan. 22, 2019, 9:30 p.m. UTC | #1
On Fri, Jan 18, 2019 at 4:25 PM Saeed Mahameed <saeedm@mellanox.com> wrote:
>
> From: Cong Wang <xiyou.wangcong@gmail.com>

I don't know why you want to make me as the author here, but I never
agree on _your_ updates on my previous patch.

Please see below.


>  drivers/net/ethernet/mellanox/mlx5/core/en_rx.c | 13 +++++++++++++
>  1 file changed, 13 insertions(+)
>
> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
> index 1d0bb5ff8c26..f86e4804e83e 100644
> --- a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
> @@ -732,6 +732,8 @@ static u8 get_ip_proto(struct sk_buff *skb, int network_depth, __be16 proto)
>                                             ((struct ipv6hdr *)ip_p)->nexthdr;
>  }
>
> +#define short_frame(size) ((size) <= ETH_ZLEN + ETH_FCS_LEN)
> +


I don't agree on unconditionally comparing with ETH_ZLEN + ETH_FCS_LEN.


>  static inline void mlx5e_handle_csum(struct net_device *netdev,
>                                      struct mlx5_cqe64 *cqe,
>                                      struct mlx5e_rq *rq,
> @@ -754,6 +756,17 @@ static inline void mlx5e_handle_csum(struct net_device *netdev,
>         if (unlikely(test_bit(MLX5E_RQ_STATE_NO_CSUM_COMPLETE, &rq->state)))
>                 goto csum_unnecessary;
>
> +       /* CQE csum doesn't cover padding octets in short ethernet
> +        * frames. And the pad field is appended prior to calculating
> +        * and appending the FCS field.
> +        *
> +        * Detecting these padded frames requires to verify and parse
> +        * IP headers, so we simply force all those small frames to be
> +        * CHECKSUM_UNNECESSARY even if they are not padded.

This is inaccurate and misleading, it is unnecessary only if the packet
passes the if check right below the goto label 'csum_unnecessary',
otherwise still a CHECKSUM_NONE. IOW, you are not forcing anything
here.

> +        */
> +       if (short_frame(skb->len))

Missed an "unlikely()". Short frames are rare, comparing to non-short
ones.

I respect your judgement on CHECKSUM_UNNECESSARY, even when
I still disagree with you. Please respect me by not forcing me to accept
any updates from you, IOW, kindly removing my name from anything
in this commit, SoB and authorship.

Thank you for your understanding!

Saeed Mahameed Jan. 25, 2019, 6:22 p.m. UTC | #2
On Tue, 2019-01-22 at 13:30 -0800, Cong Wang wrote:
> On Fri, Jan 18, 2019 at 4:25 PM Saeed Mahameed <saeedm@mellanox.com>
> wrote:
> > From: Cong Wang <xiyou.wangcong@gmail.com>
> 
> I don't know why you want to make me as the author here, but I never
> agree on _your_ updates on my previous patch.
> 
> Please see below.
> 

sorry, i just took your patch and worked on top of it, i thought you
would like to get the credit for this.


> 
> >  drivers/net/ethernet/mellanox/mlx5/core/en_rx.c | 13 +++++++++++++
> >  1 file changed, 13 insertions(+)
> > 
> > diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
> > b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
> > index 1d0bb5ff8c26..f86e4804e83e 100644
> > --- a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
> > +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
> > @@ -732,6 +732,8 @@ static u8 get_ip_proto(struct sk_buff *skb, int
> > network_depth, __be16 proto)
> >                                             ((struct ipv6hdr
> > *)ip_p)->nexthdr;
> >  }
> > 
> > +#define short_frame(size) ((size) <= ETH_ZLEN + ETH_FCS_LEN)
> > +
> 
> I don't agree on unconditionally comparing with ETH_ZLEN +
> ETH_FCS_LEN.
> 

This is more relaxed and it covers both cases unconditionally. 

> 
> >  static inline void mlx5e_handle_csum(struct net_device *netdev,
> >                                      struct mlx5_cqe64 *cqe,
> >                                      struct mlx5e_rq *rq,
> > @@ -754,6 +756,17 @@ static inline void mlx5e_handle_csum(struct
> > net_device *netdev,
> >         if (unlikely(test_bit(MLX5E_RQ_STATE_NO_CSUM_COMPLETE, &rq-
> > >state)))
> >                 goto csum_unnecessary;
> > 
> > +       /* CQE csum doesn't cover padding octets in short ethernet
> > +        * frames. And the pad field is appended prior to
> > calculating
> > +        * and appending the FCS field.
> > +        *
> > +        * Detecting these padded frames requires to verify and
> > parse
> > +        * IP headers, so we simply force all those small frames to
> > be
> > +        * CHECKSUM_UNNECESSARY even if they are not padded.
> 
> This is inaccurate and misleading, it is unnecessary only if the
> packet
> passes the if check right below the goto label 'csum_unnecessary',
> otherwise still a CHECKSUM_NONE. IOW, you are not forcing anything
> here.
> 

yes, the comment is not 100% accurate, but it delivers the message.

> > +        */
> > +       if (short_frame(skb->len))
> 
> Missed an "unlikely()". Short frames are rare, comparing to non-short
> ones.
> 
> I respect your judgement on CHECKSUM_UNNECESSARY, even when
> I still disagree with you. Please respect me by not forcing me to
> accept
> any updates from you, IOW, kindly removing my name from anything
> in this commit, SoB and authorship.
> 
> Thank you for your understanding!

Again sorry about this, will be more careful in the future.

Thanks for your support and great work.

Eric Dumazet Jan. 25, 2019, 6:31 p.m. UTC | #3
On 01/25/2019 10:22 AM, Saeed Mahameed wrote:
> On Tue, 2019-01-22 at 13:30 -0800, Cong Wang wrote:
>> On Fri, Jan 18, 2019 at 4:25 PM Saeed Mahameed <saeedm@mellanox.com>
>> wrote:
>>> From: Cong Wang <xiyou.wangcong@gmail.com>
>>
>> I don't know why you want to make me as the author here, but I never
>> agree on _your_ updates on my previous patch.
>>
>> Please see below.
>>
> 
> sorry, i just took your patch and worked on top of it, i thought you
> would like to get the credit for this.
>

I thought the issue was that the hardware csum provided by both mlx4 and mlx5 only
 covered the bytes included in the IP (v4 or v6) frame.

Meaning that any non zero padding bytes are not checksummed.

If this can not be fixed by a firmware change, then the fix has nothing to do with a frame being
smaller than ETH_ZLEN + ETH_FCS_LEN

Alternative would be for the driver to trim the frame (pretend the skb->len is exactly the expected one),
but one could argue that tcpdump should be able to see padding bytes.

Saeed Mahameed Jan. 25, 2019, 6:58 p.m. UTC | #4
On Fri, 2019-01-25 at 10:31 -0800, Eric Dumazet wrote:
> 
> On 01/25/2019 10:22 AM, Saeed Mahameed wrote:
> > On Tue, 2019-01-22 at 13:30 -0800, Cong Wang wrote:
> > > On Fri, Jan 18, 2019 at 4:25 PM Saeed Mahameed <
> > > saeedm@mellanox.com>
> > > wrote:
> > > > From: Cong Wang <xiyou.wangcong@gmail.com>
> > > 
> > > I don't know why you want to make me as the author here, but I
> > > never
> > > agree on _your_ updates on my previous patch.
> > > 
> > > Please see below.
> > > 
> > 
> > sorry, i just took your patch and worked on top of it, i thought
> > you
> > would like to get the credit for this.
> > 
> 
> I thought the issue was that the hardware csum provided by both mlx4
> and mlx5 only
>  covered the bytes included in the IP (v4 or v6) frame.
> 
> Meaning that any non zero padding bytes are not checksummed.

in case of non IP, mlx5 will provide csum complete on the whole frame.

> If this can not be fixed by a firmware change, then the fix has
> nothing to do with a frame being
> smaller than ETH_ZLEN + ETH_FCS_LEN
> 

Again, most of the switches/routers that have non-zero padding bug are
padding only small frames.

> Alternative would be for the driver to trim the frame (pretend the
> skb->len is exactly the expected one),
> but one could argue that tcpdump should be able to see padding bytes.
> 

That requires parsing the IP headers in the driver, we are trying to
avoid that, this patch is not perfect but eliminates many of the csum
warnings seen on mlx5.

>
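
Editor's illustration of the trade-off discussed in this thread: a userspace sketch, under stated assumptions, that contrasts the patch's length-only test with exact padding detection. The short_frame macro is copied from the patch, and the ETH_* values match the kernel's <linux/if_ether.h>. The helper ipv4_pad_len is hypothetical and is not part of mlx5. It assumes an untagged IPv4 frame whose FCS has already been stripped, and it has to read the IP header to find the expected length, which is the parsing the patch tries to avoid.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Values as defined in the kernel's <linux/if_ether.h>. */
#define ETH_HLEN    14  /* Ethernet header length */
#define ETH_ZLEN    60  /* minimum frame length without FCS */
#define ETH_FCS_LEN 4   /* frame check sequence length */

/* Same test as in the patch: a pure length check, no header parsing. */
#define short_frame(size) ((size) <= ETH_ZLEN + ETH_FCS_LEN)

/* Hypothetical helper: number of trailing pad bytes in an untagged
 * IPv4-over-Ethernet frame without FCS, or -1 if the frame is not one
 * this sketch can parse.
 */
static int ipv4_pad_len(const uint8_t *frame, size_t len)
{
	size_t tot_len;

	assert(frame != NULL);

	if (len < (size_t)ETH_HLEN + 20)              /* no room for IPv4 header */
		return -1;
	if (frame[12] != 0x08 || frame[13] != 0x00)   /* EtherType is not IPv4 */
		return -1;
	if ((frame[ETH_HLEN] >> 4) != 4)              /* IP version field */
		return -1;

	/* IPv4 total length, big endian, at offset 2 of the IP header. */
	tot_len = ((size_t)frame[ETH_HLEN + 2] << 8) | frame[ETH_HLEN + 3];
	if (tot_len < 20 || (size_t)ETH_HLEN + tot_len > len)
		return -1;

	return (int)(len - ETH_HLEN - tot_len);
}
```

For a 60-byte frame carrying a 40-byte IPv4 packet, ipv4_pad_len reports 6 pad bytes, while short_frame is true for any length up to 64 whether or not padding is present. The length check is cheaper but over-inclusive, which matches the "not perfect" caveat given in the reply above.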

Patch

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
index 1d0bb5ff8c26..f86e4804e83e 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
@@ -732,6 +732,8 @@  static u8 get_ip_proto(struct sk_buff *skb, int network_depth, __be16 proto)
 					    ((struct ipv6hdr *)ip_p)->nexthdr;
 }
 
+#define short_frame(size) ((size) <= ETH_ZLEN + ETH_FCS_LEN)
+
 static inline void mlx5e_handle_csum(struct net_device *netdev,
 				     struct mlx5_cqe64 *cqe,
 				     struct mlx5e_rq *rq,
@@ -754,6 +756,17 @@  static inline void mlx5e_handle_csum(struct net_device *netdev,
 	if (unlikely(test_bit(MLX5E_RQ_STATE_NO_CSUM_COMPLETE, &rq->state)))
 		goto csum_unnecessary;
 
+	/* CQE csum doesn't cover padding octets in short ethernet
+	 * frames. And the pad field is appended prior to calculating
+	 * and appending the FCS field.
+	 *
+	 * Detecting these padded frames requires to verify and parse
+	 * IP headers, so we simply force all those small frames to be
+	 * CHECKSUM_UNNECESSARY even if they are not padded.
+	 */
+	if (short_frame(skb->len))
+		goto csum_unnecessary;
+
 	if (likely(is_last_ethertype_ip(skb, &network_depth, &proto))) {
 		if (unlikely(get_ip_proto(skb, network_depth, proto) == IPPROTO_SCTP))
 			goto csum_unnecessary;