diff mbox series

[bpf] xsk: do not discard packet when QUEUE_STATE_FROZEN

Message ID 1594287951-27479-1-git-send-email-magnus.karlsson@intel.com
State Changes Requested
Delegated to: BPF Maintainers
Series [bpf] xsk: do not discard packet when QUEUE_STATE_FROZEN

Commit Message

Magnus Karlsson July 9, 2020, 9:45 a.m. UTC
In the skb Tx path, transmission of a packet is performed with
dev_direct_xmit(). When QUEUE_STATE_FROZEN is set in the transmit
routines, it returns NETDEV_TX_BUSY, signifying that the packet could
not be sent right now and that the caller should try again later.
Unfortunately, the xsk transmit code discarded the packet and returned
EBUSY to the application. Fix this unnecessary packet loss by not
discarding the packet and returning EAGAIN instead. As EAGAIN is
returned to the application, it
can then retry the send operation and the packet will finally be sent
as we will likely not be in the QUEUE_STATE_FROZEN state anymore. So
EAGAIN tells the application that the packet was not discarded from
the Tx ring and that it needs to call send() again. EBUSY, on the
other hand, signifies that the packet was not sent and discarded from
the Tx ring. The application needs to put the packet on the Tx ring
again if it wants it to be sent.
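
To illustrate the two return codes from the application's side, here is
a minimal user-space sketch. It is not part of the patch. The callback
types, kick_tx() and the fake_* helpers are hypothetical names made up
for this example; in a real program the kick would typically be a
sendto() on the AF_XDP socket (with the error taken from errno), and
the requeue step would write the descriptor back onto the Tx ring.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical callbacks: kick returns 0 or a negative errno value. */
typedef int (*kick_fn)(void *ctx);
typedef void (*requeue_fn)(void *ctx);

enum tx_result { TX_SENT, TX_GAVE_UP, TX_ERROR };

/*
 * Kick the Tx path, retrying up to max_retries times.
 * -EAGAIN: the descriptor is still on the Tx ring, so just kick again.
 * -EBUSY:  the descriptor was dropped from the Tx ring, so it has to be
 *          put back before the next kick.
 */
static enum tx_result kick_tx(kick_fn kick, requeue_fn requeue,
			      void *ctx, int max_retries)
{
	assert(kick != NULL && requeue != NULL);

	for (int attempt = 0; attempt <= max_retries; attempt++) {
		int err = kick(ctx);

		if (err == 0)
			return TX_SENT;
		if (err == -EAGAIN)
			continue;
		if (err == -EBUSY) {
			requeue(ctx);
			continue;
		}
		return TX_ERROR;
	}
	return TX_GAVE_UP;
}

/* Stand-in for a socket, used only to exercise kick_tx(). */
struct fake_txq {
	int busy_kicks;	/* upcoming kicks that return -EAGAIN */
	int drop_kicks;	/* upcoming kicks that return -EBUSY */
	int requeues;	/* how many times the descriptor was re-enqueued */
};

static int fake_kick(void *ctx)
{
	struct fake_txq *q = ctx;

	if (q->busy_kicks > 0) {
		q->busy_kicks--;
		return -EAGAIN;
	}
	if (q->drop_kicks > 0) {
		q->drop_kicks--;
		return -EBUSY;
	}
	return 0;
}

static void fake_requeue(void *ctx)
{
	struct fake_txq *q = ctx;

	q->requeues++;
}
```

The retry bound is an arbitrary choice for the sketch; an application
could equally well poll() for writability before kicking again.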

This unnecessary packet loss has been reported by the user below to
occur at high transmit loads.

Fixes: 35fcde7f8deb ("xsk: support for Tx")
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Reported-by: Arkadiusz Zema <A.Zema@falconvsystems.com>
Suggested-by: Arkadiusz Zema <A.Zema@falconvsystems.com>
---
 net/xdp/xsk.c | 13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)

Comments

Jonathan Lemon July 9, 2020, 5:03 p.m. UTC | #1
On Thu, Jul 09, 2020 at 11:45:51AM +0200, Magnus Karlsson wrote:
> In the skb Tx path, transmission of a packet is performed with
> dev_direct_xmit(). When QUEUE_STATE_FROZEN is set in the transmit
> routines, it returns NETDEV_TX_BUSY signifying that it was not
> possible to send the packet now, please try later. Unfortunately, the
> xsk transmit code discarded the packet and returned EBUSY to the
> application. Fix this unnecessary packet loss, by not discarding the
> packet and return EAGAIN. As EAGAIN is returned to the application, it
> can then retry the send operation and the packet will finally be sent
> as we will likely not be in the QUEUE_STATE_FROZEN state anymore. So
> EAGAIN tells the application that the packet was not discarded from
> the Tx ring and that it needs to call send() again. EBUSY, on the
> other hand, signifies that the packet was not sent and discarded from
> the Tx ring. The application needs to put the packet on the Tx ring
> again if it wants it to be sent.

Doesn't the original code leak the skb if NETDEV_TX_BUSY is returned?
I'm not seeing where it was released.  The new code looks correct.


> Fixes: 35fcde7f8deb ("xsk: support for Tx")
> Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
> Reported-by: Arkadiusz Zema <A.Zema@falconvsystems.com>
> Suggested-by: Arkadiusz Zema <A.Zema@falconvsystems.com>
> ---
>  net/xdp/xsk.c | 13 +++++++++++--
>  1 file changed, 11 insertions(+), 2 deletions(-)
> 
> diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
> index 3700266..5304250 100644
> --- a/net/xdp/xsk.c
> +++ b/net/xdp/xsk.c
> @@ -376,13 +376,22 @@ static int xsk_generic_xmit(struct sock *sk)
>  		skb->destructor = xsk_destruct_skb;
>  
>  		err = dev_direct_xmit(skb, xs->queue_id);
> -		xskq_cons_release(xs->tx);
>  		/* Ignore NET_XMIT_CN as packet might have been sent */
> -		if (err == NET_XMIT_DROP || err == NETDEV_TX_BUSY) {
> +		if (err == NET_XMIT_DROP) {
>  			/* SKB completed but not sent */
> +			xskq_cons_release(xs->tx);
>  			err = -EBUSY;
>  			goto out;
> +		} else if  (err == NETDEV_TX_BUSY) {

Should be "if (err == ..." here, no else.


> +			/* QUEUE_STATE_FROZEN, tell application to
> +			 * retry sending the packet
> +			 */
> +			skb->destructor = NULL;
> +			kfree_skb(skb);
> +			err = -EAGAIN;
> +			goto out;
>  		}
> +		xskq_cons_release(xs->tx);
>  
>  		sent_frame = true;
>  	}
> -- 
> 2.7.4
>
Magnus Karlsson July 9, 2020, 5:10 p.m. UTC | #2
On Thu, Jul 9, 2020 at 7:06 PM Jonathan Lemon <jonathan.lemon@gmail.com> wrote:
>
> On Thu, Jul 09, 2020 at 11:45:51AM +0200, Magnus Karlsson wrote:
> > In the skb Tx path, transmission of a packet is performed with
> > dev_direct_xmit(). When QUEUE_STATE_FROZEN is set in the transmit
> > routines, it returns NETDEV_TX_BUSY signifying that it was not
> > possible to send the packet now, please try later. Unfortunately, the
> > xsk transmit code discarded the packet and returned EBUSY to the
> > application. Fix this unnecessary packet loss, by not discarding the
> > packet and return EAGAIN. As EAGAIN is returned to the application, it
> > can then retry the send operation and the packet will finally be sent
> > as we will likely not be in the QUEUE_STATE_FROZEN state anymore. So
> > EAGAIN tells the application that the packet was not discarded from
> > the Tx ring and that it needs to call send() again. EBUSY, on the
> > other hand, signifies that the packet was not sent and discarded from
> > the Tx ring. The application needs to put the packet on the Tx ring
> > again if it wants it to be sent.
>
> Doesn't the original code leak the skb if NETDEV_TX_BUSY is returned?
> I'm not seeing where it was released.  The new code looks correct.

You are correct. Should also have mentioned that in the commit message.

/Magnus

Magnus Karlsson July 9, 2020, 7:30 p.m. UTC | #3
On Thu, Jul 9, 2020 at 7:10 PM Magnus Karlsson
<magnus.karlsson@gmail.com> wrote:
>
> On Thu, Jul 9, 2020 at 7:06 PM Jonathan Lemon <jonathan.lemon@gmail.com> wrote:
> >
> > On Thu, Jul 09, 2020 at 11:45:51AM +0200, Magnus Karlsson wrote:
> > > In the skb Tx path, transmission of a packet is performed with
> > > dev_direct_xmit(). When QUEUE_STATE_FROZEN is set in the transmit
> > > routines, it returns NETDEV_TX_BUSY signifying that it was not
> > > possible to send the packet now, please try later. Unfortunately, the
> > > xsk transmit code discarded the packet and returned EBUSY to the
> > > application. Fix this unnecessary packet loss, by not discarding the
> > > packet and return EAGAIN. As EAGAIN is returned to the application, it
> > > can then retry the send operation and the packet will finally be sent
> > > as we will likely not be in the QUEUE_STATE_FROZEN state anymore. So
> > > EAGAIN tells the application that the packet was not discarded from
> > > the Tx ring and that it needs to call send() again. EBUSY, on the
> > > other hand, signifies that the packet was not sent and discarded from
> > > the Tx ring. The application needs to put the packet on the Tx ring
> > > again if it wants it to be sent.
> >
> > Doesn't the original code leak the skb if NETDEV_TX_BUSY is returned?
> > I'm not seeing where it was released.  The new code looks correct.
>
> You are correct. Should also have mentioned that in the commit message.

Jonathan,

Some context here. The bug report from Arkadiusz started out with the
unnecessary packet loss. While fixing it, I discovered that it was
actually leaking memory too. If you want, I can send a v2 that has a
commit message that mentions both problems? Let me know what you
prefer.

Jonathan Lemon July 9, 2020, 7:34 p.m. UTC | #4
On Thu, Jul 09, 2020 at 09:30:42PM +0200, Magnus Karlsson wrote:
> On Thu, Jul 9, 2020 at 7:10 PM Magnus Karlsson
> <magnus.karlsson@gmail.com> wrote:
> >
> > On Thu, Jul 9, 2020 at 7:06 PM Jonathan Lemon <jonathan.lemon@gmail.com> wrote:
> > >
> > > On Thu, Jul 09, 2020 at 11:45:51AM +0200, Magnus Karlsson wrote:
> > > > In the skb Tx path, transmission of a packet is performed with
> > > > dev_direct_xmit(). When QUEUE_STATE_FROZEN is set in the transmit
> > > > routines, it returns NETDEV_TX_BUSY signifying that it was not
> > > > possible to send the packet now, please try later. Unfortunately, the
> > > > xsk transmit code discarded the packet and returned EBUSY to the
> > > > application. Fix this unnecessary packet loss, by not discarding the
> > > > packet and return EAGAIN. As EAGAIN is returned to the application, it
> > > > can then retry the send operation and the packet will finally be sent
> > > > as we will likely not be in the QUEUE_STATE_FROZEN state anymore. So
> > > > EAGAIN tells the application that the packet was not discarded from
> > > > the Tx ring and that it needs to call send() again. EBUSY, on the
> > > > other hand, signifies that the packet was not sent and discarded from
> > > > the Tx ring. The application needs to put the packet on the Tx ring
> > > > again if it wants it to be sent.
> > >
> > > Doesn't the original code leak the skb if NETDEV_TX_BUSY is returned?
> > > I'm not seeing where it was released.  The new code looks correct.
> >
> > You are correct. Should also have mentioned that in the commit message.
> 
> Jonathan,
> 
> Some context here. The bug report from Arkadiusz started out with the
> unnecessary packet loss. While fixing it, I discovered that it was
> actually leaking memory too. If you want, I can send a v2 that has a
> commit message that mentions both problems? Let me know what you
> prefer.

I think it would be best to mention both problems for the benefit of
future readers.
Magnus Karlsson July 9, 2020, 7:39 p.m. UTC | #5
On Thu, Jul 9, 2020 at 9:34 PM Jonathan Lemon <jonathan.lemon@gmail.com> wrote:
>
> On Thu, Jul 09, 2020 at 09:30:42PM +0200, Magnus Karlsson wrote:
> > On Thu, Jul 9, 2020 at 7:10 PM Magnus Karlsson
> > <magnus.karlsson@gmail.com> wrote:
> > >
> > > On Thu, Jul 9, 2020 at 7:06 PM Jonathan Lemon <jonathan.lemon@gmail.com> wrote:
> > > >
> > > > On Thu, Jul 09, 2020 at 11:45:51AM +0200, Magnus Karlsson wrote:
> > > > > In the skb Tx path, transmission of a packet is performed with
> > > > > dev_direct_xmit(). When QUEUE_STATE_FROZEN is set in the transmit
> > > > > routines, it returns NETDEV_TX_BUSY signifying that it was not
> > > > > possible to send the packet now, please try later. Unfortunately, the
> > > > > xsk transmit code discarded the packet and returned EBUSY to the
> > > > > application. Fix this unnecessary packet loss, by not discarding the
> > > > > packet and return EAGAIN. As EAGAIN is returned to the application, it
> > > > > can then retry the send operation and the packet will finally be sent
> > > > > as we will likely not be in the QUEUE_STATE_FROZEN state anymore. So
> > > > > EAGAIN tells the application that the packet was not discarded from
> > > > > the Tx ring and that it needs to call send() again. EBUSY, on the
> > > > > other hand, signifies that the packet was not sent and discarded from
> > > > > the Tx ring. The application needs to put the packet on the Tx ring
> > > > > again if it wants it to be sent.
> > > >
> > > > Doesn't the original code leak the skb if NETDEV_TX_BUSY is returned?
> > > > I'm not seeing where it was released.  The new code looks correct.
> > >
> > > You are correct. Should also have mentioned that in the commit message.
> >
> > Jonathan,
> >
> > Some context here. The bug report from Arkadiusz started out with the
> > unnecessary packet loss. While fixing it, I discovered that it was
> > actually leaking memory too. If you want, I can send a v2 that has a
> > commit message that mentions both problems? Let me know what you
> > prefer.
>
> I think it would be best to mention both problems for the benefit of
> future readers.

You will get a v2 tomorrow.

/Magnus

> --
> Jonathan

Patch

diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
index 3700266..5304250 100644
--- a/net/xdp/xsk.c
+++ b/net/xdp/xsk.c
@@ -376,13 +376,22 @@  static int xsk_generic_xmit(struct sock *sk)
 		skb->destructor = xsk_destruct_skb;
 
 		err = dev_direct_xmit(skb, xs->queue_id);
-		xskq_cons_release(xs->tx);
 		/* Ignore NET_XMIT_CN as packet might have been sent */
-		if (err == NET_XMIT_DROP || err == NETDEV_TX_BUSY) {
+		if (err == NET_XMIT_DROP) {
 			/* SKB completed but not sent */
+			xskq_cons_release(xs->tx);
 			err = -EBUSY;
 			goto out;
+		} else if (err == NETDEV_TX_BUSY) {
+			/* QUEUE_STATE_FROZEN, tell application to
+			 * retry sending the packet
+			 */
+			skb->destructor = NULL;
+			kfree_skb(skb);
+			err = -EAGAIN;
+			goto out;
 		}
+		xskq_cons_release(xs->tx);
 
 		sent_frame = true;
 	}
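
The decision the patch makes for each dev_direct_xmit() return value
can be summarised in a small standalone sketch. This is not kernel
code: struct xmit_outcome and classify_xmit() are hypothetical names
used only for illustration, and the constants are redefined locally
with the values they have in include/linux/netdevice.h. It is written
with a plain "if" for the second test, as suggested in review, since
the first branch already leaves the function.

```c
#include <errno.h>
#include <stdbool.h>

/* Local copies of the kernel's transmit return codes. */
#define NET_XMIT_SUCCESS	0x00
#define NET_XMIT_DROP		0x01
#define NET_XMIT_CN		0x02
#define NETDEV_TX_BUSY		0x10

/* Hypothetical summary of what the xsk Tx loop does next. */
struct xmit_outcome {
	int err;		/* value reported to the application */
	bool release_desc;	/* is the Tx ring entry released? */
	bool sent_frame;	/* does the frame count as sent? */
};

static struct xmit_outcome classify_xmit(int xmit_ret)
{
	struct xmit_outcome o = {
		.err = 0,
		.release_desc = true,
		.sent_frame = true,
	};

	if (xmit_ret == NET_XMIT_DROP) {
		/* Packet dropped: entry is released, report EBUSY. */
		o.err = -EBUSY;
		o.sent_frame = false;
		return o;
	}
	if (xmit_ret == NETDEV_TX_BUSY) {
		/* Queue frozen: keep the entry so a retry can send it. */
		o.err = -EAGAIN;
		o.release_desc = false;
		o.sent_frame = false;
		return o;
	}
	/* NET_XMIT_CN is ignored, as the packet might have been sent. */
	return o;
}
```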