diff mbox

xfrm: take iphdr size into account for esp payload size calculation

Message ID 1336602952-10479-1-git-send-email-bpoirier@suse.de
State Changes Requested, archived
Delegated to: David Miller
Headers show

Commit Message

Benjamin Poirier May 9, 2012, 10:35 p.m. UTC
Corrects the function that determines the esp payload size.
The calculations done in esp4_get_mtu lead to overlength frames in transport
mode for certain mtu values (eg. 1499 leads to FRAGFAILS) and suboptimal
frames for others (eg. 1500, where the addition of padding in the esp header
can be avoided).

According to what is done, mainly in esp_output(), net_header_len aka
sizeof(struct iphdr) must be taken into account before doing the alignment
calculation.

Signed-off-by: Benjamin Poirier <bpoirier@suse.de>

---
Tested with
* transport mode E
* transport mode EA
* transport mode E + ah
* tunnel mode E

Not tested with BEET, but it should be the same as transport mode
	draft-nikander-esp-beet-mode-03.txt Section 5.2:
	"The wire packet format is identical to the ESP transport mode"
---
 net/ipv4/esp4.c |   24 +++---------------------
 1 files changed, 3 insertions(+), 21 deletions(-)

Comments

Steffen Klassert May 10, 2012, 12:18 p.m. UTC | #1
On Wed, May 09, 2012 at 06:35:52PM -0400, Benjamin Poirier wrote:
> 
> According to what is done, mainly in esp_output(), net_header_len aka
> sizeof(struct iphdr) must be taken into account before doing the alignment
> calculation.

Why do you need to take the ip header into account here? Your patch breaks
pmtu discovery, at least on tunnel mode with aes-sha1 (aes blocksize 16 bytes).

With your patch applied:

tracepath -n 192.168.1.2
 1?: [LOCALHOST]     pmtu 1442
 1:  send failed
 1:  send failed
     Resume: pmtu 1442

Without your patch:

tracepath -n 192.168.1.2
 1?: [LOCALHOST]     pmtu 1438
 1:  192.168.1.2       0.736ms reached
 1:  192.168.1.2       0.390ms reached
     Resume: pmtu 1438 hops 1 back 64 

Your patch increases the mtu by 4 bytes. Be aware that adding
one byte of payload may increase the packet size up to 16 bytes
in the case of aes, as we have to pad the encryption payload
always to a multiple of the cipher blocksize.

> -
> -	switch (x->props.mode) {
> -	case XFRM_MODE_TUNNEL:
> -		break;
> -	default:
> -	case XFRM_MODE_TRANSPORT:
> -		/* The worst case */
> -		mtu -= blksize - 4;
> -		mtu += min_t(u32, blksize - 4, rem);
> -		break;

Btw. why we are doing the calculation above for transport mode?
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Benjamin Poirier May 11, 2012, 1:02 a.m. UTC | #2
On 2012/05/10 14:18, Steffen Klassert wrote:
> On Wed, May 09, 2012 at 06:35:52PM -0400, Benjamin Poirier wrote:
> > 
> > According to what is done, mainly in esp_output(), net_header_len aka
> > sizeof(struct iphdr) must be taken into account before doing the alignment
> > calculation.
> 
> Why do you need to take the ip header into account here?

The value returned by this function is tuned for tcp segment size:
1) from tcp_mtu_to_mss()
mss = pmtu - tcp_hlen - net_hlen
2) frame structure for transport mode
mtu = mss + tcp_hlen + esp_header_len(esp_payload_len) + ah_len + net_hlen

The "mtu" parameter of esp4_get_mtu is in fact mtu - ah_len.
The return value of esp4_get_mtu is put into pmtu.

If we put 1 and 2 together we have:
pmtu = mtu - ah_len - esp_header_len(esp_payload_len)
with esp_payload_len = mss + tcp_hlen

This formula expands to:
pmtu = mtu - ah_len - (header_len + align(align(pmtu - net_hlen + 2, blksize),
	esp->padlen) - (pmtu - net_hlen) + alen)

and simplifies to:
pmtu = (mtu - ah_len - net_hlen - header_len - alen) & ~(max(blksize,
	esp->padlen) - 1) + (net_hlen - 2)

which, in the context of esp4_get_mtu, becomes:
((mtu - x->props.header_len - crypto_aead_authsize(esp->aead) - sizeof(struct
iphdr)) & ~(align - 1)) + (sizeof(struct iphdr) - 2)

This is the same formula as before, except for sizeof(struct iphdr) which was
missing.

> Your patch breaks
> pmtu discovery, at least on tunnel mode with aes-sha1 (aes blocksize 16 bytes).
> 
> With your patch applied:
> 
> tracepath -n 192.168.1.2
>  1?: [LOCALHOST]     pmtu 1442
>  1:  send failed
>  1:  send failed
>      Resume: pmtu 1442
> 
> Without your patch:
> 
> tracepath -n 192.168.1.2
>  1?: [LOCALHOST]     pmtu 1438
>  1:  192.168.1.2       0.736ms reached
>  1:  192.168.1.2       0.390ms reached
>      Resume: pmtu 1438 hops 1 back 64 
> 
> Your patch increases the mtu by 4 bytes. Be aware that adding
> one byte of payload may increase the packet size up to 16 bytes
> in the case of aes, as we have to pad the encryption payload
> always to a multiple of the cipher blocksize.

Thanks for testing. My own testing had been limited to rsync'ing large
files.

The problem with tunnel mode in v1 of the patch was twofold.
First, we don't play games with mss and tcp_hlen. esp_payload_len = pmtu
directly, instead of esp_payload_len = pmtu - net_hlen.
Second, in the tunnel case, header_len already includes net_hlen, see
esp_init_state().
The net result is that the formula for tunnel mode was already correct.

> 
> > -
> > -	switch (x->props.mode) {
> > -	case XFRM_MODE_TUNNEL:
> > -		break;
> > -	default:
> > -	case XFRM_MODE_TRANSPORT:
> > -		/* The worst case */
> > -		mtu -= blksize - 4;
> > -		mtu += min_t(u32, blksize - 4, rem);
> > -		break;
> 
> Btw. why we are doing the calculation above for transport mode?
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Steffen Klassert May 11, 2012, 10:39 a.m. UTC | #3
On Thu, May 10, 2012 at 09:02:49PM -0400, Benjamin Poirier wrote:
> 
> The value returned by this function is tuned for tcp segment size:
> 1) from tcp_mtu_to_mss()
> mss = pmtu - tcp_hlen - net_hlen
> 2) frame structure for transport mode
> mtu = mss + tcp_hlen + esp_header_len(esp_payload_len) + ah_len + net_hlen

I think you can simplify the calculations here, this
calculation should not depend on any special layer 4
protocol.

> 
> The "mtu" parameter of esp4_get_mtu is in fact mtu - ah_len.
> The return value of esp4_get_mtu is put into pmtu.
> 
> If we put 1 and 2 together we have:
> pmtu = mtu - ah_len - esp_header_len(esp_payload_len)
> with esp_payload_len = mss + tcp_hlen
> 
> This formula expands to:
> pmtu = mtu - ah_len - (header_len + align(align(pmtu - net_hlen + 2, blksize),
> 	esp->padlen) - (pmtu - net_hlen) + alen)
> 
> and simplifies to:
> pmtu = (mtu - ah_len - net_hlen - header_len - alen) & ~(max(blksize,
> 	esp->padlen) - 1) + (net_hlen - 2)
> 
> which, in the context of esp4_get_mtu, becomes:
> ((mtu - x->props.header_len - crypto_aead_authsize(esp->aead) - sizeof(struct
> iphdr)) & ~(align - 1)) + (sizeof(struct iphdr) - 2)
> 
> This is the same formula as before, except for sizeof(struct iphdr) which was
> missing.
> 

Well, makes sense. I use transport mode very rarely, so I never noticed this.
But I was sure that it worked correct in tunnel mode. 

Thanks.

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/net/ipv4/esp4.c b/net/ipv4/esp4.c
index 89a47b3..60f2738 100644
--- a/net/ipv4/esp4.c
+++ b/net/ipv4/esp4.c
@@ -459,28 +459,10 @@  static u32 esp4_get_mtu(struct xfrm_state *x, int mtu)
 	struct esp_data *esp = x->data;
 	u32 blksize = ALIGN(crypto_aead_blocksize(esp->aead), 4);
 	u32 align = max_t(u32, blksize, esp->padlen);
-	u32 rem;
 
-	mtu -= x->props.header_len + crypto_aead_authsize(esp->aead);
-	rem = mtu & (align - 1);
-	mtu &= ~(align - 1);
-
-	switch (x->props.mode) {
-	case XFRM_MODE_TUNNEL:
-		break;
-	default:
-	case XFRM_MODE_TRANSPORT:
-		/* The worst case */
-		mtu -= blksize - 4;
-		mtu += min_t(u32, blksize - 4, rem);
-		break;
-	case XFRM_MODE_BEET:
-		/* The worst case. */
-		mtu += min_t(u32, IPV4_BEET_PHMAXLEN, rem);
-		break;
-	}
-
-	return mtu - 2;
+	return ((mtu - x->props.header_len - crypto_aead_authsize(esp->aead) -
+			sizeof(struct iphdr)) & ~(align - 1)) + (sizeof(struct
+			iphdr) - 2);
 }
 
 static void esp4_err(struct sk_buff *skb, u32 info)