diff mbox series

[bpf,v2] xsk: fix memory leak and packet loss in Tx skb path

Message ID 1594363554-4076-1-git-send-email-magnus.karlsson@intel.com
State Changes Requested
Delegated to: BPF Maintainers
Headers show
Series [bpf,v2] xsk: fix memory leak and packet loss in Tx skb path | expand

Commit Message

Magnus Karlsson July 10, 2020, 6:45 a.m. UTC
In the skb Tx path, transmission of a packet is performed with
dev_direct_xmit(). When QUEUE_STATE_FROZEN is set in the transmit
routines, it returns NETDEV_TX_BUSY signifying that it was not
possible to send the packet now, please try later. Unfortunately, the
xsk transmit code discarded the packet, missed to free the skb, and
returned EBUSY to the application. Fix this memory leak and
unnecessary packet loss, by not discarding the packet in the Tx ring,
freeing the allocated skb, and return EAGAIN. As EAGAIN is returned to the
application, it can then retry the send operation and the packet will
finally be sent as we will likely not be in the QUEUE_STATE_FROZEN
state anymore. So EAGAIN tells the application that the packet was not
discarded from the Tx ring and that it needs to call send()
again. EBUSY, on the other hand, signifies that the packet was not
sent and discarded from the Tx ring. The application needs to put the
packet on the Tx ring again if it wants it to be sent.

Fixes: 35fcde7f8deb ("xsk: support for Tx")
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Reported-by: Arkadiusz Zema <A.Zema@falconvsystems.com>
Suggested-by: Arkadiusz Zema <A.Zema@falconvsystems.com>
---
The v1 of this patch was called "xsk: do not discard packet when
QUEUE_STATE_FROZEN".
---
 net/xdp/xsk.c | 13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)

Comments

Jonathan Lemon July 10, 2020, 4:34 p.m. UTC | #1
On Fri, Jul 10, 2020 at 08:45:54AM +0200, Magnus Karlsson wrote:
> In the skb Tx path, transmission of a packet is performed with
> dev_direct_xmit(). When QUEUE_STATE_FROZEN is set in the transmit
> routines, it returns NETDEV_TX_BUSY signifying that it was not
> possible to send the packet now, please try later. Unfortunately, the
> xsk transmit code discarded the packet, missed to free the skb, and
> returned EBUSY to the application. Fix this memory leak and
> unnecessary packet loss, by not discarding the packet in the Tx ring,
> freeing the allocated skb, and return EAGAIN. As EAGAIN is returned to the
> application, it can then retry the send operation and the packet will
> finally be sent as we will likely not be in the QUEUE_STATE_FROZEN
> state anymore. So EAGAIN tells the application that the packet was not
> discarded from the Tx ring and that it needs to call send()
> again. EBUSY, on the other hand, signifies that the packet was not
> sent and discarded from the Tx ring. The application needs to put the
> packet on the Tx ring again if it wants it to be sent.
> 
> Fixes: 35fcde7f8deb ("xsk: support for Tx")
> Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
> Reported-by: Arkadiusz Zema <A.Zema@falconvsystems.com>
> Suggested-by: Arkadiusz Zema <A.Zema@falconvsystems.com>

Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Daniel Borkmann July 10, 2020, 11:26 p.m. UTC | #2
Hi Magnus,

On 7/10/20 8:45 AM, Magnus Karlsson wrote:
> In the skb Tx path, transmission of a packet is performed with
> dev_direct_xmit(). When QUEUE_STATE_FROZEN is set in the transmit
> routines, it returns NETDEV_TX_BUSY signifying that it was not
> possible to send the packet now, please try later. Unfortunately, the
> xsk transmit code discarded the packet, missed to free the skb, and
> returned EBUSY to the application. Fix this memory leak and
> unnecessary packet loss, by not discarding the packet in the Tx ring,
> freeing the allocated skb, and return EAGAIN. As EAGAIN is returned to the
> application, it can then retry the send operation and the packet will
> finally be sent as we will likely not be in the QUEUE_STATE_FROZEN
> state anymore. So EAGAIN tells the application that the packet was not
> discarded from the Tx ring and that it needs to call send()
> again. EBUSY, on the other hand, signifies that the packet was not
> sent and discarded from the Tx ring. The application needs to put the
> packet on the Tx ring again if it wants it to be sent.
> 
> Fixes: 35fcde7f8deb ("xsk: support for Tx")
> Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
> Reported-by: Arkadiusz Zema <A.Zema@falconvsystems.com>
> Suggested-by: Arkadiusz Zema <A.Zema@falconvsystems.com>
> ---
> The v1 of this patch was called "xsk: do not discard packet when
> QUEUE_STATE_FROZEN".
> ---
>   net/xdp/xsk.c | 13 +++++++++++--
>   1 file changed, 11 insertions(+), 2 deletions(-)
> 
> diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
> index 3700266..5304250 100644
> --- a/net/xdp/xsk.c
> +++ b/net/xdp/xsk.c
> @@ -376,13 +376,22 @@ static int xsk_generic_xmit(struct sock *sk)
>   		skb->destructor = xsk_destruct_skb;
>   
>   		err = dev_direct_xmit(skb, xs->queue_id);
> -		xskq_cons_release(xs->tx);
>   		/* Ignore NET_XMIT_CN as packet might have been sent */
> -		if (err == NET_XMIT_DROP || err == NETDEV_TX_BUSY) {
> +		if (err == NET_XMIT_DROP) {
>   			/* SKB completed but not sent */
> +			xskq_cons_release(xs->tx);
>   			err = -EBUSY;
>   			goto out;
> +		} else if  (err == NETDEV_TX_BUSY) {
> +			/* QUEUE_STATE_FROZEN, tell application to
> +			 * retry sending the packet
> +			 */
> +			skb->destructor = NULL;
> +			kfree_skb(skb);
> +			err = -EAGAIN;
> +			goto out;

Hmm, I'm probably missing something or I should blame my current lack of coffee,
but I'll ask anyway.. What is the relation here to the kfree_skb{,_list}() in
dev_direct_xmit() when we have NETDEV_TX_BUSY condition? Wouldn't the patch above
double-free with NETDEV_TX_BUSY?

>   		}
> +		xskq_cons_release(xs->tx);
>   
>   		sent_frame = true;
>   	}
> 

Thanks,
Daniel
Magnus Karlsson July 11, 2020, 7:39 a.m. UTC | #3
On Sat, Jul 11, 2020 at 1:28 AM Daniel Borkmann <daniel@iogearbox.net> wrote:
>
> Hi Magnus,
>
> On 7/10/20 8:45 AM, Magnus Karlsson wrote:
> > In the skb Tx path, transmission of a packet is performed with
> > dev_direct_xmit(). When QUEUE_STATE_FROZEN is set in the transmit
> > routines, it returns NETDEV_TX_BUSY signifying that it was not
> > possible to send the packet now, please try later. Unfortunately, the
> > xsk transmit code discarded the packet, missed to free the skb, and
> > returned EBUSY to the application. Fix this memory leak and
> > unnecessary packet loss, by not discarding the packet in the Tx ring,
> > freeing the allocated skb, and return EAGAIN. As EAGAIN is returned to the
> > application, it can then retry the send operation and the packet will
> > finally be sent as we will likely not be in the QUEUE_STATE_FROZEN
> > state anymore. So EAGAIN tells the application that the packet was not
> > discarded from the Tx ring and that it needs to call send()
> > again. EBUSY, on the other hand, signifies that the packet was not
> > sent and discarded from the Tx ring. The application needs to put the
> > packet on the Tx ring again if it wants it to be sent.
> >
> > Fixes: 35fcde7f8deb ("xsk: support for Tx")
> > Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
> > Reported-by: Arkadiusz Zema <A.Zema@falconvsystems.com>
> > Suggested-by: Arkadiusz Zema <A.Zema@falconvsystems.com>
> > ---
> > The v1 of this patch was called "xsk: do not discard packet when
> > QUEUE_STATE_FROZEN".
> > ---
> >   net/xdp/xsk.c | 13 +++++++++++--
> >   1 file changed, 11 insertions(+), 2 deletions(-)
> >
> > diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
> > index 3700266..5304250 100644
> > --- a/net/xdp/xsk.c
> > +++ b/net/xdp/xsk.c
> > @@ -376,13 +376,22 @@ static int xsk_generic_xmit(struct sock *sk)
> >               skb->destructor = xsk_destruct_skb;
> >
> >               err = dev_direct_xmit(skb, xs->queue_id);
> > -             xskq_cons_release(xs->tx);
> >               /* Ignore NET_XMIT_CN as packet might have been sent */
> > -             if (err == NET_XMIT_DROP || err == NETDEV_TX_BUSY) {
> > +             if (err == NET_XMIT_DROP) {
> >                       /* SKB completed but not sent */
> > +                     xskq_cons_release(xs->tx);
> >                       err = -EBUSY;
> >                       goto out;
> > +             } else if  (err == NETDEV_TX_BUSY) {
> > +                     /* QUEUE_STATE_FROZEN, tell application to
> > +                      * retry sending the packet
> > +                      */
> > +                     skb->destructor = NULL;
> > +                     kfree_skb(skb);
> > +                     err = -EAGAIN;
> > +                     goto out;
>
> Hmm, I'm probably missing something or I should blame my current lack of coffee,
> but I'll ask anyway.. What is the relation here to the kfree_skb{,_list}() in
> dev_direct_xmit() when we have NETDEV_TX_BUSY condition? Wouldn't the patch above
> double-free with NETDEV_TX_BUSY?

I think you are correct even without coffee :-). I misinterpreted the
following piece of code in dev_direct_xmit():

if (!dev_xmit_complete(ret))
     kfree_skb(skb);

If the skb was NOT consumed by the transmit, then it goes and frees
the skb. NETDEV_TX_BUSY as a return value will make
dev_xmit_complete() return false which triggers the freeing of the
skb. So if I now understand dev_direct_xmit() correctly, it will
always consume the skb, even when NETDEV_TX_BUSY is returned. And this
is what I would like to avoid. If the skb is freed, the destructor is
triggered and it will complete the packet to user-space, which is the
same thing as dropping it, which is what I want to avoid in the first
place since it is completely unnecessary.

So what would be the best way to solve this? Prefer to share the code
with AF_PACKET if possible. Introduce a boolean function parameter to
indicate if it should be freed in this case? Other ideas? Here are the
users of dev_direct_xmit():

drivers/net/ethernet/stmicro/stmmac/stmmac_selftests.c

line 349
line 939
line 1033
line 1303
line 1665

include/linux/netdevice.h, line 2719
net/core/dev.c

line 4095
line 4132

net/packet/af_packet.c, line 240
net/xdp/xsk.c, line 425

Thanks: Magnus

> >               }
> > +             xskq_cons_release(xs->tx);
> >
> >               sent_frame = true;
> >       }
> >
>
> Thanks,
> Daniel
Jonathan Lemon July 13, 2020, 4:53 p.m. UTC | #4
On Sat, Jul 11, 2020 at 09:39:58AM +0200, Magnus Karlsson wrote:
> On Sat, Jul 11, 2020 at 1:28 AM Daniel Borkmann <daniel@iogearbox.net> wrote:
> >
> > Hi Magnus,
> >
> > On 7/10/20 8:45 AM, Magnus Karlsson wrote:
> > > In the skb Tx path, transmission of a packet is performed with
> > > dev_direct_xmit(). When QUEUE_STATE_FROZEN is set in the transmit
> > > routines, it returns NETDEV_TX_BUSY signifying that it was not
> > > possible to send the packet now, please try later. Unfortunately, the
> > > xsk transmit code discarded the packet, missed to free the skb, and
> > > returned EBUSY to the application. Fix this memory leak and
> > > unnecessary packet loss, by not discarding the packet in the Tx ring,
> > > freeing the allocated skb, and return EAGAIN. As EAGAIN is returned to the
> > > application, it can then retry the send operation and the packet will
> > > finally be sent as we will likely not be in the QUEUE_STATE_FROZEN
> > > state anymore. So EAGAIN tells the application that the packet was not
> > > discarded from the Tx ring and that it needs to call send()
> > > again. EBUSY, on the other hand, signifies that the packet was not
> > > sent and discarded from the Tx ring. The application needs to put the
> > > packet on the Tx ring again if it wants it to be sent.
> > >
> > > Fixes: 35fcde7f8deb ("xsk: support for Tx")
> > > Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
> > > Reported-by: Arkadiusz Zema <A.Zema@falconvsystems.com>
> > > Suggested-by: Arkadiusz Zema <A.Zema@falconvsystems.com>
> > > ---
> > > The v1 of this patch was called "xsk: do not discard packet when
> > > QUEUE_STATE_FROZEN".
> > > ---
> > >   net/xdp/xsk.c | 13 +++++++++++--
> > >   1 file changed, 11 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
> > > index 3700266..5304250 100644
> > > --- a/net/xdp/xsk.c
> > > +++ b/net/xdp/xsk.c
> > > @@ -376,13 +376,22 @@ static int xsk_generic_xmit(struct sock *sk)
> > >               skb->destructor = xsk_destruct_skb;
> > >
> > >               err = dev_direct_xmit(skb, xs->queue_id);
> > > -             xskq_cons_release(xs->tx);
> > >               /* Ignore NET_XMIT_CN as packet might have been sent */
> > > -             if (err == NET_XMIT_DROP || err == NETDEV_TX_BUSY) {
> > > +             if (err == NET_XMIT_DROP) {
> > >                       /* SKB completed but not sent */
> > > +                     xskq_cons_release(xs->tx);
> > >                       err = -EBUSY;
> > >                       goto out;
> > > +             } else if  (err == NETDEV_TX_BUSY) {
> > > +                     /* QUEUE_STATE_FROZEN, tell application to
> > > +                      * retry sending the packet
> > > +                      */
> > > +                     skb->destructor = NULL;
> > > +                     kfree_skb(skb);
> > > +                     err = -EAGAIN;
> > > +                     goto out;
> >
> > Hmm, I'm probably missing something or I should blame my current lack of coffee,
> > but I'll ask anyway.. What is the relation here to the kfree_skb{,_list}() in
> > dev_direct_xmit() when we have NETDEV_TX_BUSY condition? Wouldn't the patch above
> > double-free with NETDEV_TX_BUSY?
> 
> I think you are correct even without coffee :-). I misinterpreted the
> following piece of code in dev_direct_xmit():
> 
> if (!dev_xmit_complete(ret))
>      kfree_skb(skb);

I did look carefuly at this, but apparently forgot about the "!" part of
the conditional while looking at dev_xmit_complete() internals:

    return (NETDEV_TX_BUSY < NET_XMIT_MASK)
    return (0x10 < 0x0f)
    return false;
Daniel Borkmann July 15, 2020, 6:36 p.m. UTC | #5
On 7/11/20 9:39 AM, Magnus Karlsson wrote:
> On Sat, Jul 11, 2020 at 1:28 AM Daniel Borkmann <daniel@iogearbox.net> wrote:
>> On 7/10/20 8:45 AM, Magnus Karlsson wrote:
>>> In the skb Tx path, transmission of a packet is performed with
>>> dev_direct_xmit(). When QUEUE_STATE_FROZEN is set in the transmit
>>> routines, it returns NETDEV_TX_BUSY signifying that it was not
>>> possible to send the packet now, please try later. Unfortunately, the
>>> xsk transmit code discarded the packet, missed to free the skb, and
>>> returned EBUSY to the application. Fix this memory leak and
>>> unnecessary packet loss, by not discarding the packet in the Tx ring,
>>> freeing the allocated skb, and return EAGAIN. As EAGAIN is returned to the
>>> application, it can then retry the send operation and the packet will
>>> finally be sent as we will likely not be in the QUEUE_STATE_FROZEN
>>> state anymore. So EAGAIN tells the application that the packet was not
>>> discarded from the Tx ring and that it needs to call send()
>>> again. EBUSY, on the other hand, signifies that the packet was not
>>> sent and discarded from the Tx ring. The application needs to put the
>>> packet on the Tx ring again if it wants it to be sent.
>>>
>>> Fixes: 35fcde7f8deb ("xsk: support for Tx")
>>> Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
>>> Reported-by: Arkadiusz Zema <A.Zema@falconvsystems.com>
>>> Suggested-by: Arkadiusz Zema <A.Zema@falconvsystems.com>
>>> ---
>>> The v1 of this patch was called "xsk: do not discard packet when
>>> QUEUE_STATE_FROZEN".
>>> ---
>>>    net/xdp/xsk.c | 13 +++++++++++--
>>>    1 file changed, 11 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
>>> index 3700266..5304250 100644
>>> --- a/net/xdp/xsk.c
>>> +++ b/net/xdp/xsk.c
>>> @@ -376,13 +376,22 @@ static int xsk_generic_xmit(struct sock *sk)
>>>                skb->destructor = xsk_destruct_skb;
>>>
>>>                err = dev_direct_xmit(skb, xs->queue_id);
>>> -             xskq_cons_release(xs->tx);
>>>                /* Ignore NET_XMIT_CN as packet might have been sent */
>>> -             if (err == NET_XMIT_DROP || err == NETDEV_TX_BUSY) {
>>> +             if (err == NET_XMIT_DROP) {
>>>                        /* SKB completed but not sent */
>>> +                     xskq_cons_release(xs->tx);
>>>                        err = -EBUSY;
>>>                        goto out;
>>> +             } else if  (err == NETDEV_TX_BUSY) {
>>> +                     /* QUEUE_STATE_FROZEN, tell application to
>>> +                      * retry sending the packet
>>> +                      */
>>> +                     skb->destructor = NULL;
>>> +                     kfree_skb(skb);
>>> +                     err = -EAGAIN;
>>> +                     goto out;
>>
>> Hmm, I'm probably missing something or I should blame my current lack of coffee,
>> but I'll ask anyway.. What is the relation here to the kfree_skb{,_list}() in
>> dev_direct_xmit() when we have NETDEV_TX_BUSY condition? Wouldn't the patch above
>> double-free with NETDEV_TX_BUSY?
> 
> I think you are correct even without coffee :-). I misinterpreted the
> following piece of code in dev_direct_xmit():
> 
> if (!dev_xmit_complete(ret))
>       kfree_skb(skb);
> 
> If the skb was NOT consumed by the transmit, then it goes and frees
> the skb. NETDEV_TX_BUSY as a return value will make
> dev_xmit_complete() return false which triggers the freeing of the
> skb. So if I now understand dev_direct_xmit() correctly, it will
> always consume the skb, even when NETDEV_TX_BUSY is returned. And this
> is what I would like to avoid. If the skb is freed, the destructor is
> triggered and it will complete the packet to user-space, which is the
> same thing as dropping it, which is what I want to avoid in the first
> place since it is completely unnecessary.
> 
> So what would be the best way to solve this? Prefer to share the code
> with AF_PACKET if possible. Introduce a boolean function parameter to
> indicate if it should be freed in this case? Other ideas? Here are the
> users of dev_direct_xmit():

Another option could be looking at pktgen which mangles skb->users to keep
the skb alive.

Thanks,
Daniel
Magnus Karlsson July 16, 2020, 4:43 a.m. UTC | #6
On Wed, Jul 15, 2020 at 8:36 PM Daniel Borkmann <daniel@iogearbox.net> wrote:
>
> On 7/11/20 9:39 AM, Magnus Karlsson wrote:
> > On Sat, Jul 11, 2020 at 1:28 AM Daniel Borkmann <daniel@iogearbox.net> wrote:
> >> On 7/10/20 8:45 AM, Magnus Karlsson wrote:
> >>> In the skb Tx path, transmission of a packet is performed with
> >>> dev_direct_xmit(). When QUEUE_STATE_FROZEN is set in the transmit
> >>> routines, it returns NETDEV_TX_BUSY signifying that it was not
> >>> possible to send the packet now, please try later. Unfortunately, the
> >>> xsk transmit code discarded the packet, missed to free the skb, and
> >>> returned EBUSY to the application. Fix this memory leak and
> >>> unnecessary packet loss, by not discarding the packet in the Tx ring,
> >>> freeing the allocated skb, and return EAGAIN. As EAGAIN is returned to the
> >>> application, it can then retry the send operation and the packet will
> >>> finally be sent as we will likely not be in the QUEUE_STATE_FROZEN
> >>> state anymore. So EAGAIN tells the application that the packet was not
> >>> discarded from the Tx ring and that it needs to call send()
> >>> again. EBUSY, on the other hand, signifies that the packet was not
> >>> sent and discarded from the Tx ring. The application needs to put the
> >>> packet on the Tx ring again if it wants it to be sent.
> >>>
> >>> Fixes: 35fcde7f8deb ("xsk: support for Tx")
> >>> Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
> >>> Reported-by: Arkadiusz Zema <A.Zema@falconvsystems.com>
> >>> Suggested-by: Arkadiusz Zema <A.Zema@falconvsystems.com>
> >>> ---
> >>> The v1 of this patch was called "xsk: do not discard packet when
> >>> QUEUE_STATE_FROZEN".
> >>> ---
> >>>    net/xdp/xsk.c | 13 +++++++++++--
> >>>    1 file changed, 11 insertions(+), 2 deletions(-)
> >>>
> >>> diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
> >>> index 3700266..5304250 100644
> >>> --- a/net/xdp/xsk.c
> >>> +++ b/net/xdp/xsk.c
> >>> @@ -376,13 +376,22 @@ static int xsk_generic_xmit(struct sock *sk)
> >>>                skb->destructor = xsk_destruct_skb;
> >>>
> >>>                err = dev_direct_xmit(skb, xs->queue_id);
> >>> -             xskq_cons_release(xs->tx);
> >>>                /* Ignore NET_XMIT_CN as packet might have been sent */
> >>> -             if (err == NET_XMIT_DROP || err == NETDEV_TX_BUSY) {
> >>> +             if (err == NET_XMIT_DROP) {
> >>>                        /* SKB completed but not sent */
> >>> +                     xskq_cons_release(xs->tx);
> >>>                        err = -EBUSY;
> >>>                        goto out;
> >>> +             } else if  (err == NETDEV_TX_BUSY) {
> >>> +                     /* QUEUE_STATE_FROZEN, tell application to
> >>> +                      * retry sending the packet
> >>> +                      */
> >>> +                     skb->destructor = NULL;
> >>> +                     kfree_skb(skb);
> >>> +                     err = -EAGAIN;
> >>> +                     goto out;
> >>
> >> Hmm, I'm probably missing something or I should blame my current lack of coffee,
> >> but I'll ask anyway.. What is the relation here to the kfree_skb{,_list}() in
> >> dev_direct_xmit() when we have NETDEV_TX_BUSY condition? Wouldn't the patch above
> >> double-free with NETDEV_TX_BUSY?
> >
> > I think you are correct even without coffee :-). I misinterpreted the
> > following piece of code in dev_direct_xmit():
> >
> > if (!dev_xmit_complete(ret))
> >       kfree_skb(skb);
> >
> > If the skb was NOT consumed by the transmit, then it goes and frees
> > the skb. NETDEV_TX_BUSY as a return value will make
> > dev_xmit_complete() return false which triggers the freeing of the
> > skb. So if I now understand dev_direct_xmit() correctly, it will
> > always consume the skb, even when NETDEV_TX_BUSY is returned. And this
> > is what I would like to avoid. If the skb is freed, the destructor is
> > triggered and it will complete the packet to user-space, which is the
> > same thing as dropping it, which is what I want to avoid in the first
> > place since it is completely unnecessary.
> >
> > So what would be the best way to solve this? Prefer to share the code
> > with AF_PACKET if possible. Introduce a boolean function parameter to
> > indicate if it should be freed in this case? Other ideas? Here are the
> > users of dev_direct_xmit():
>
> Another option could be looking at pktgen which mangles skb->users to keep
> the skb alive.

Thanks. Will take a look at that and give it a try.

/Magnus

> Thanks,
> Daniel
diff mbox series

Patch

diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
index 3700266..5304250 100644
--- a/net/xdp/xsk.c
+++ b/net/xdp/xsk.c
@@ -376,13 +376,22 @@  static int xsk_generic_xmit(struct sock *sk)
 		skb->destructor = xsk_destruct_skb;
 
 		err = dev_direct_xmit(skb, xs->queue_id);
-		xskq_cons_release(xs->tx);
 		/* Ignore NET_XMIT_CN as packet might have been sent */
-		if (err == NET_XMIT_DROP || err == NETDEV_TX_BUSY) {
+		if (err == NET_XMIT_DROP) {
 			/* SKB completed but not sent */
+			xskq_cons_release(xs->tx);
 			err = -EBUSY;
 			goto out;
+		} else if  (err == NETDEV_TX_BUSY) {
+			/* QUEUE_STATE_FROZEN, tell application to
+			 * retry sending the packet
+			 */
+			skb->destructor = NULL;
+			kfree_skb(skb);
+			err = -EAGAIN;
+			goto out;
 		}
+		xskq_cons_release(xs->tx);
 
 		sent_frame = true;
 	}