diff mbox series

net: correct zerocopy refcnt with newly allocated UDP or RAW uarg

Message ID 20200813115800.4546-1-linmiaohe@huawei.com
State Rejected
Delegated to: David Miller
Headers show
Series net: correct zerocopy refcnt with newly allocated UDP or RAW uarg | expand

Commit Message

Miaohe Lin Aug. 13, 2020, 11:58 a.m. UTC
The var extra_uref is introduced to pass the initial reference taken in
sock_zerocopy_alloc to the first generated skb. But now we may fail to pass
the initial reference with newly allocated UDP or RAW uarg when the skb is
zcopied.

If the skb is zcopied, we always set extra_uref to false. This is fine with
reallocted uarg because no extra ref is taken by UDP and RAW zerocopy. But
if uarg is newly allocated via sock_zerocopy_alloc(), we lost the initial
reference because extra_uref is false and we missed to pass it to the first
generated skb.

To fix this, we should set extra_uref to true if UDP or RAW uarg is newly
allocated when the skb is zcopied.

Fixes: 522924b58308 ("net: correct udp zerocopy refcnt also when zerocopy only on append")
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
---
 net/ipv4/ip_output.c  | 4 +++-
 net/ipv6/ip6_output.c | 4 +++-
 2 files changed, 6 insertions(+), 2 deletions(-)

Comments

Willem de Bruijn Aug. 13, 2020, 12:34 p.m. UTC | #1
On Thu, Aug 13, 2020 at 1:59 PM Miaohe Lin <linmiaohe@huawei.com> wrote:
>
> The var extra_uref is introduced to pass the initial reference taken in
> sock_zerocopy_alloc to the first generated skb. But now we may fail to pass
> the initial reference with newly allocated UDP or RAW uarg when the skb is
> zcopied.
>
> If the skb is zcopied, we always set extra_uref to false. This is fine with
> reallocted uarg because no extra ref is taken by UDP and RAW zerocopy. But
> if uarg is newly allocated via sock_zerocopy_alloc(), we lost the initial
> reference because extra_uref is false and we missed to pass it to the first
> generated skb.
>
> To fix this, we should set extra_uref to true if UDP or RAW uarg is newly
> allocated when the skb is zcopied.

extra_uref is true if there is no previous skb to append to or there
is a previous skb, but that does not have zerocopy data associated yet
(because the previous call(s) did not set MSG_ZEROCOPY).

In other words, when first (allocating and) associating a zerocopy
struct with the skb.



> Fixes: 522924b58308 ("net: correct udp zerocopy refcnt also when zerocopy only on append")
> Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
> ---
>  net/ipv4/ip_output.c  | 4 +++-
>  net/ipv6/ip6_output.c | 4 +++-
>  2 files changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
> index 61f802d5350c..78d3b5d48617 100644
> --- a/net/ipv4/ip_output.c
> +++ b/net/ipv4/ip_output.c
> @@ -1019,7 +1019,9 @@ static int __ip_append_data(struct sock *sk,
>                 uarg = sock_zerocopy_realloc(sk, length, skb_zcopy(skb));
>                 if (!uarg)
>                         return -ENOBUFS;
> -               extra_uref = !skb_zcopy(skb);   /* only ref on new uarg */
> +               /* Only ref on newly allocated uarg. */
> +               if (!skb_zcopy(skb) || (sk->sk_type != SOCK_STREAM && skb_zcopy(skb) != uarg))
> +                       extra_uref = true;

SOCK_STREAM does not use __ip_append_data.

This leaves as new branch skb_zcopy(skb) && skb_zcopy(skb) != uarg.

This function can only acquire a uarg through sock_zerocopy_realloc,
which on skb_zcopy(skb) only returns the existing uarg or NULL (for
not SOCK_STREAM).

So I don't see when that condition can happen.
diff mbox series

Patch

diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
index 61f802d5350c..78d3b5d48617 100644
--- a/net/ipv4/ip_output.c
+++ b/net/ipv4/ip_output.c
@@ -1019,7 +1019,9 @@  static int __ip_append_data(struct sock *sk,
 		uarg = sock_zerocopy_realloc(sk, length, skb_zcopy(skb));
 		if (!uarg)
 			return -ENOBUFS;
-		extra_uref = !skb_zcopy(skb);	/* only ref on new uarg */
+		/* Only ref on newly allocated uarg. */
+		if (!skb_zcopy(skb) || (sk->sk_type != SOCK_STREAM && skb_zcopy(skb) != uarg))
+			extra_uref = true;
 		if (rt->dst.dev->features & NETIF_F_SG &&
 		    csummode == CHECKSUM_PARTIAL) {
 			paged = true;
diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
index c78e67d7747f..0f82923239a9 100644
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -1476,7 +1476,9 @@  static int __ip6_append_data(struct sock *sk,
 		uarg = sock_zerocopy_realloc(sk, length, skb_zcopy(skb));
 		if (!uarg)
 			return -ENOBUFS;
-		extra_uref = !skb_zcopy(skb);	/* only ref on new uarg */
+		/* Only ref on newly allocated uarg. */
+		if (!skb_zcopy(skb) || (sk->sk_type != SOCK_STREAM && skb_zcopy(skb) != uarg))
+			extra_uref = true;
 		if (rt->dst.dev->features & NETIF_F_SG &&
 		    csummode == CHECKSUM_PARTIAL) {
 			paged = true;