diff mbox

[RFC] tcp md5 use of alloc_percpu

Message ID 5447FDB2.2010906@gmail.com
State RFC, archived
Delegated to: David Miller
Headers show

Commit Message

Leonard Crestez Oct. 22, 2014, 6:55 p.m. UTC
Hello,

It seems that the TCP MD5 feature allocates a percpu struct tcp_md5sig_pool and uses part of that memory for a scratch buffer to do crypto on. Here is the relevant code:

static int tcp_v4_md5_hash_pseudoheader(struct tcp_md5sig_pool *hp,
                                        __be32 daddr, __be32 saddr, int nbytes)
{
        struct tcp4_pseudohdr *bp;
        struct scatterlist sg;

        bp = &hp->md5_blk.ip4;

        /*
         * 1. the TCP pseudo-header (in the order: source IP address,
         * destination IP address, zero-padded protocol number, and
         * segment length)
         */
        bp->saddr = saddr;
        bp->daddr = daddr;
        bp->pad = 0;
        bp->protocol = IPPROTO_TCP;
        bp->len = cpu_to_be16(nbytes);

        sg_init_one(&sg, bp, sizeof(*bp));
        return crypto_hash_update(&hp->md5_desc, &sg, sizeof(*bp));
}

sg_init_one does virt_addr on the pointer which assumes it is directly accessible. But the tcp_md5sig_pool pointer comes from alloc_percpu which can return memory from the vmalloc area after the pcpu_first_chunk is exhausted. This looks wrong to me. I'm am getting crashes on mips and I believe this to be the cause.

Allocating a scratch buffer this way is very peculiar. The tcp4_pseudohdr struct is only 12 bytes in length. Similar code in tcp_v6_md5_hash_pseudoheader uses a 40 byte tcp6_pseudohdr. I think it is perfectly reasonable to allocate this kind of stuff on the stack, right? These pseudohdr structs are not used at all outside these two static functions and it would simplify the code.

The whole notion of struct tcp_md5sig_pool seems dubious. This is a very tiny struct already and after removing the pseudohdr it shrinks to a percpu hash_desc for md5 (8 or 16 bytes). Wouldn't DEFINE_PERCPU be more appropriate? Before commit 71cea17ed39fdf1c0634f530ddc6a2c2fc601c2b the struct tcp_md5sig_pool structs were freed when all users were gone, but that functionality seems to have been dropped.

I'm not familiar with the linux crypto API. Isn't there an easier way to get a temporary md5 hasher?

Here's what I mean by allocating tcp{4,6}_pseudohdr on the stack:

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Eric Dumazet Oct. 22, 2014, 7:12 p.m. UTC | #1
On Wed, 2014-10-22 at 21:55 +0300, Crestez Dan Leonard wrote:
> Hello,
> 
> It seems that the TCP MD5 feature allocates a percpu struct
> tcp_md5sig_pool and uses part of that memory for a scratch buffer to
> do crypto on. Here is the relevant code:
> 
> static int tcp_v4_md5_hash_pseudoheader(struct tcp_md5sig_pool *hp,
>                                         __be32 daddr, __be32 saddr,
> int nbytes)
> {
>         struct tcp4_pseudohdr *bp;
>         struct scatterlist sg;
> 
>         bp = &hp->md5_blk.ip4;
> 
>         /*
>          * 1. the TCP pseudo-header (in the order: source IP address,
>          * destination IP address, zero-padded protocol number, and
>          * segment length)
>          */
>         bp->saddr = saddr;
>         bp->daddr = daddr;
>         bp->pad = 0;
>         bp->protocol = IPPROTO_TCP;
>         bp->len = cpu_to_be16(nbytes);
> 
>         sg_init_one(&sg, bp, sizeof(*bp));
>         return crypto_hash_update(&hp->md5_desc, &sg, sizeof(*bp));
> }
> 
> sg_init_one does virt_addr on the pointer which assumes it is directly
> accessible. But the tcp_md5sig_pool pointer comes from alloc_percpu
> which can return memory from the vmalloc area after the
> pcpu_first_chunk is exhausted. This looks wrong to me. I'm am getting
> crashes on mips and I believe this to be the cause.



Then just remove the alloc_percpu() from __tcp_alloc_md5sig_pool() and
make this a static per cpu definition (not a dynamic allocation)



> 
> Allocating a scratch buffer this way is very peculiar. The
> tcp4_pseudohdr struct is only 12 bytes in length. Similar code in
> tcp_v6_md5_hash_pseudoheader uses a 40 byte tcp6_pseudohdr. I think it
> is perfectly reasonable to allocate this kind of stuff on the stack,
> right? These pseudohdr structs are not used at all outside these two
> static functions and it would simplify the code.
> 
Yep, but the sg stuff does not allow for stack variables. Because of
possible offloading and DMA, I dont know...

> The whole notion of struct tcp_md5sig_pool seems dubious. This is a
> very tiny struct already and after removing the pseudohdr it shrinks
> to a percpu hash_desc for md5 (8 or 16 bytes). Wouldn't DEFINE_PERCPU
> be more appropriate?

Sure. this would be the more appropriate fix IMO.

>  Before commit 71cea17ed39fdf1c0634f530ddc6a2c2fc601c2b the struct
> tcp_md5sig_pool structs were freed when all users were gone, but that
> functionality seems to have been dropped.
> 
> I'm not familiar with the linux crypto API. Isn't there an easier way
> to get a temporary md5 hasher?

You should CC crypto guys maybe ...

> 
> Here's what I mean by allocating tcp{4,6}_pseudohdr on the stack:

Your patch is quite invasive, you should so something simpler to ease
backports.

Thanks


--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jonathan Toppins Oct. 22, 2014, 9:35 p.m. UTC | #2
On 10/22/14, 3:12 PM, Eric Dumazet wrote:
> On Wed, 2014-10-22 at 21:55 +0300, Crestez Dan Leonard wrote:
>> Hello,
>>
>> It seems that the TCP MD5 feature allocates a percpu struct
>> tcp_md5sig_pool and uses part of that memory for a scratch buffer to
>> do crypto on. Here is the relevant code:
>>
>> static int tcp_v4_md5_hash_pseudoheader(struct tcp_md5sig_pool *hp,
>>                                         __be32 daddr, __be32 saddr,
>> int nbytes)
>> {
>>         struct tcp4_pseudohdr *bp;
>>         struct scatterlist sg;
>>
>>         bp = &hp->md5_blk.ip4;
>>
>>         /*
>>          * 1. the TCP pseudo-header (in the order: source IP address,
>>          * destination IP address, zero-padded protocol number, and
>>          * segment length)
>>          */
>>         bp->saddr = saddr;
>>         bp->daddr = daddr;
>>         bp->pad = 0;
>>         bp->protocol = IPPROTO_TCP;
>>         bp->len = cpu_to_be16(nbytes);
>>
>>         sg_init_one(&sg, bp, sizeof(*bp));
>>         return crypto_hash_update(&hp->md5_desc, &sg, sizeof(*bp));
>> }
>>
>> sg_init_one does virt_addr on the pointer which assumes it is directly
>> accessible. But the tcp_md5sig_pool pointer comes from alloc_percpu
>> which can return memory from the vmalloc area after the
>> pcpu_first_chunk is exhausted. This looks wrong to me. I'm am getting
>> crashes on mips and I believe this to be the cause.

I can confirm this created an issue on our powerpc based switches. My
solution in our 3.2 kernel was to allocate the buffer on the stack. I
like this solution better.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Miller Oct. 22, 2014, 9:53 p.m. UTC | #3
From: Crestez Dan Leonard <cdleonard@gmail.com>
Date: Wed, 22 Oct 2014 21:55:46 +0300

>  static int tcp_v4_md5_hash_pseudoheader(struct tcp_md5sig_pool *hp,
>  					__be32 daddr, __be32 saddr, int nbytes)
>  {
> -	struct tcp4_pseudohdr *bp;
> +	struct tcp4_pseudohdr bp;
...
> +	sg_init_one(&sg, &bp, sizeof(bp));
> +	return crypto_hash_update(&hp->md5_desc, &sg, sizeof(bp));

As others have mentioned, you cannot do this.

On some architectures the kernel stack comes from vmalloc()
memory too.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Leonard Crestez Oct. 22, 2014, 11:05 p.m. UTC | #4
On 10/22/2014 10:12 PM, Eric Dumazet wrote:
> On Wed, 2014-10-22 at 21:55 +0300, Crestez Dan Leonard wrote:
>> Hello,
>>
>> It seems that the TCP MD5 feature allocates a percpu struct
>> tcp_md5sig_pool and uses part of that memory for a scratch buffer to
>> do crypto on. Here is the relevant code:
>>
>> static int tcp_v4_md5_hash_pseudoheader(struct tcp_md5sig_pool *hp,
>>                                          __be32 daddr, __be32 saddr,
>> int nbytes)
>> {
>>          struct tcp4_pseudohdr *bp;
>>          struct scatterlist sg;
>>
>>          bp = &hp->md5_blk.ip4;
>>
>>          /*
>>           * 1. the TCP pseudo-header (in the order: source IP address,
>>           * destination IP address, zero-padded protocol number, and
>>           * segment length)
>>           */
>>          bp->saddr = saddr;
>>          bp->daddr = daddr;
>>          bp->pad = 0;
>>          bp->protocol = IPPROTO_TCP;
>>          bp->len = cpu_to_be16(nbytes);
>>
>>          sg_init_one(&sg, bp, sizeof(*bp));
>>          return crypto_hash_update(&hp->md5_desc, &sg, sizeof(*bp));
>> }
>>
>> sg_init_one does virt_addr on the pointer which assumes it is directly
>> accessible. But the tcp_md5sig_pool pointer comes from alloc_percpu
>> which can return memory from the vmalloc area after the
>> pcpu_first_chunk is exhausted. This looks wrong to me. I'm am getting
>> crashes on mips and I believe this to be the cause.
>>
>> Allocating a scratch buffer this way is very peculiar. The
>> tcp4_pseudohdr struct is only 12 bytes in length. Similar code in
>> tcp_v6_md5_hash_pseudoheader uses a 40 byte tcp6_pseudohdr. I think it
>> is perfectly reasonable to allocate this kind of stuff on the stack,
>> right? These pseudohdr structs are not used at all outside these two
>> static functions and it would simplify the code.
>>
> Yep, but the sg stuff does not allow for stack variables. Because of
> possible offloading and DMA, I dont know...
A stack buffer is used in tcp_md5_hash_header to add a tcphdr to the 
hash. A quick grep for sg_init_one find a couple of additional instances 
of what looks like doing crypto on small stack buffers:

net/bluetooth/smp.e:110
net/sunrpc/auth_gss/gss_krb5_crypto.c:194
net/rxrpc/rxkad.c:multiple

But those might also be bugs.

If the buffers passed to the crypto api need to be DMA-ble then wouldn't 
this also exclude DEFINE_PERCPU? The DMA-API-HOWTO mentions that items 
in data/text/bss might not be DMA-able, presumably depending on the 
architecture.

>> I'm not familiar with the linux crypto API. Isn't there an easier way
>> to get a temporary md5 hasher?
>
> You should CC crypto guys maybe ...
Added linux-crypto in CC. To summarize the question: What kind of memory 
can be passed to crypto api functions like crypto_hash_update?

 >> The whole notion of struct tcp_md5sig_pool seems dubious. This is a
 >> very tiny struct already and after removing the pseudohdr it shrinks
 >> to a percpu hash_desc for md5 (8 or 16 bytes). Wouldn't DEFINE_PERCPU
 >> be more appropriate?
 >
 > Sure. this would be the more appropriate fix IMO.
I'll post this as a patch if somebody can confirm that it is correct and 
portable.

Doing a temp kmalloc/kfree would also work, but it would hurt 
performance. It would be nice to have a generic way to ask for a small 
temporary DMA-ble buffer.

If DEFINE_PERCPU is not suitable then the tcp_md5sig_pool structs should 
be allocated via individual kmallocs for each cpu.

Regards,
Leonard
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Herbert Xu Oct. 24, 2014, 9:33 a.m. UTC | #5
Crestez Dan Leonard <cdleonard@gmail.com> wrote:
>
>> Yep, but the sg stuff does not allow for stack variables. Because of
>> possible offloading and DMA, I dont know...
> A stack buffer is used in tcp_md5_hash_header to add a tcphdr to the 
> hash. A quick grep for sg_init_one find a couple of additional instances 
> of what looks like doing crypto on small stack buffers:

First of all crypto_hash_update is obsolete, don't use it in any
new code.  Thanks for reminding me to get rid of existing users.

You should either use crypto_shash_update for small data, e.g., headers
or crypto_ahash_update for large data such as whole packets.

If you use shash then you may allocate your buffer on the stack.  With
ahash stack memory is not allowed.

I hope this clears things up for you.

Cheers,
diff mbox

Patch

diff --git a/include/net/tcp.h b/include/net/tcp.h
index 4062b4f..beabd7b 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -1266,33 +1266,9 @@  struct tcp_md5sig_info {
 	struct rcu_head		rcu;
 };
 
-/* - pseudo header */
-struct tcp4_pseudohdr {
-	__be32		saddr;
-	__be32		daddr;
-	__u8		pad;
-	__u8		protocol;
-	__be16		len;
-};
-
-struct tcp6_pseudohdr {
-	struct in6_addr	saddr;
-	struct in6_addr daddr;
-	__be32		len;
-	__be32		protocol;	/* including padding */
-};
-
-union tcp_md5sum_block {
-	struct tcp4_pseudohdr ip4;
-#if IS_ENABLED(CONFIG_IPV6)
-	struct tcp6_pseudohdr ip6;
-#endif
-};
-
 /* - pool: digest algorithm, hash description and scratch buffer */
 struct tcp_md5sig_pool {
 	struct hash_desc	md5_desc;
-	union tcp_md5sum_block	md5_blk;
 };
 
 /* - functions */
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 94d1a77..e716a67 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -1041,27 +1041,33 @@  static int tcp_v4_parse_md5_keys(struct sock *sk, char __user *optval,
 			      GFP_KERNEL);
 }
 
+struct tcp4_pseudohdr {
+	__be32		saddr;
+	__be32		daddr;
+	__u8		pad;
+	__u8		protocol;
+	__be16		len;
+};
+
 static int tcp_v4_md5_hash_pseudoheader(struct tcp_md5sig_pool *hp,
 					__be32 daddr, __be32 saddr, int nbytes)
 {
-	struct tcp4_pseudohdr *bp;
+	struct tcp4_pseudohdr bp;
 	struct scatterlist sg;
 
-	bp = &hp->md5_blk.ip4;
-
 	/*
 	 * 1. the TCP pseudo-header (in the order: source IP address,
 	 * destination IP address, zero-padded protocol number, and
 	 * segment length)
 	 */
-	bp->saddr = saddr;
-	bp->daddr = daddr;
-	bp->pad = 0;
-	bp->protocol = IPPROTO_TCP;
-	bp->len = cpu_to_be16(nbytes);
-
-	sg_init_one(&sg, bp, sizeof(*bp));
-	return crypto_hash_update(&hp->md5_desc, &sg, sizeof(*bp));
+	bp.saddr = saddr;
+	bp.daddr = daddr;
+	bp.pad = 0;
+	bp.protocol = IPPROTO_TCP;
+	bp.len = cpu_to_be16(nbytes);
+
+	sg_init_one(&sg, &bp, sizeof(bp));
+	return crypto_hash_update(&hp->md5_desc, &sg, sizeof(bp));
 }
 
 static int tcp_v4_md5_hash_hdr(char *md5_hash, const struct tcp_md5sig_key *key,
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index 8314955..87a9126 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -568,22 +568,28 @@  static int tcp_v6_parse_md5_keys(struct sock *sk, char __user *optval,
 			      AF_INET6, cmd.tcpm_key, cmd.tcpm_keylen, GFP_KERNEL);
 }
 
+struct tcp6_pseudohdr {
+	struct in6_addr	saddr;
+	struct in6_addr daddr;
+	__be32		len;
+	__be32		protocol;	/* including padding */
+};
+
 static int tcp_v6_md5_hash_pseudoheader(struct tcp_md5sig_pool *hp,
 					const struct in6_addr *daddr,
 					const struct in6_addr *saddr, int nbytes)
 {
-	struct tcp6_pseudohdr *bp;
+	struct tcp6_pseudohdr bp;
 	struct scatterlist sg;
 
-	bp = &hp->md5_blk.ip6;
 	/* 1. TCP pseudo-header (RFC2460) */
-	bp->saddr = *saddr;
-	bp->daddr = *daddr;
-	bp->protocol = cpu_to_be32(IPPROTO_TCP);
-	bp->len = cpu_to_be32(nbytes);
+	bp.saddr = *saddr;
+	bp.daddr = *daddr;
+	bp.protocol = cpu_to_be32(IPPROTO_TCP);
+	bp.len = cpu_to_be32(nbytes);
 
-	sg_init_one(&sg, bp, sizeof(*bp));
-	return crypto_hash_update(&hp->md5_desc, &sg, sizeof(*bp));
+	sg_init_one(&sg, &bp, sizeof(bp));
+	return crypto_hash_update(&hp->md5_desc, &sg, sizeof(bp));
 }
 
 static int tcp_v6_md5_hash_hdr(char *md5_hash, struct tcp_md5sig_key *key,