Message ID | 1336602952-10479-1-git-send-email-bpoirier@suse.de |
---|---|
State | Changes Requested, archived |
Delegated to: | David Miller |
Headers | show |
On Wed, May 09, 2012 at 06:35:52PM -0400, Benjamin Poirier wrote: > > According to what is done, mainly in esp_output(), net_header_len aka > sizeof(struct iphdr) must be taken into account before doing the alignment > calculation. Why do you need to take the ip header into account here? Your patch breaks pmtu discovery, at least on tunnel mode with aes-sha1 (aes blocksize 16 bytes). With your patch applied: tracepath -n 192.168.1.2 1?: [LOCALHOST] pmtu 1442 1: send failed 1: send failed Resume: pmtu 1442 Without your patch: tracepath -n 192.168.1.2 1?: [LOCALHOST] pmtu 1438 1: 192.168.1.2 0.736ms reached 1: 192.168.1.2 0.390ms reached Resume: pmtu 1438 hops 1 back 64 Your patch increases the mtu by 4 bytes. Be aware that adding one byte of payload may increase the packet size up to 16 bytes in the case of aes, as we have to pad the encryption payload always to a multiple of the cipher blocksize. > - > - switch (x->props.mode) { > - case XFRM_MODE_TUNNEL: > - break; > - default: > - case XFRM_MODE_TRANSPORT: > - /* The worst case */ > - mtu -= blksize - 4; > - mtu += min_t(u32, blksize - 4, rem); > - break; Btw. why we are doing the calculation above for transport mode? -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 2012/05/10 14:18, Steffen Klassert wrote: > On Wed, May 09, 2012 at 06:35:52PM -0400, Benjamin Poirier wrote: > > > > According to what is done, mainly in esp_output(), net_header_len aka > > sizeof(struct iphdr) must be taken into account before doing the alignment > > calculation. > > Why do you need to take the ip header into account here? The value returned by this function is tuned for tcp segment size: 1) from tcp_mtu_to_mss() mss = pmtu - tcp_hlen - net_hlen 2) frame structure for transport mode mtu = mss + tcp_hlen + esp_header_len(esp_payload_len) + ah_len + net_hlen The "mtu" parameter of esp4_get_mtu is in fact mtu - ah_len. The return value of esp4_get_mtu is put into pmtu. If we put 1 and 2 together we have: pmtu = mtu - ah_len - esp_header_len(esp_payload_len) with esp_payload_len = mss + tcp_hlen This formula expands to: pmtu = mtu - ah_len - (header_len + align(align(pmtu - net_hlen + 2, blksize), esp->padlen) - (pmtu - net_hlen) + alen) and simplifies to: pmtu = (mtu - ah_len - net_hlen - header_len - alen) & ~(max(blksize, esp->padlen) - 1) + (net_hlen - 2) which, in the context of esp4_get_mtu, becomes: ((mtu - x->props.header_len - crypto_aead_authsize(esp->aead) - sizeof(struct iphdr)) & ~(align - 1)) + (sizeof(struct iphdr) - 2) This is the same formula as before, except for sizeof(struct iphdr) which was missing. > Your patch breaks > pmtu discovery, at least on tunnel mode with aes-sha1 (aes blocksize 16 bytes). > > With your patch applied: > > tracepath -n 192.168.1.2 > 1?: [LOCALHOST] pmtu 1442 > 1: send failed > 1: send failed > Resume: pmtu 1442 > > Without your patch: > > tracepath -n 192.168.1.2 > 1?: [LOCALHOST] pmtu 1438 > 1: 192.168.1.2 0.736ms reached > 1: 192.168.1.2 0.390ms reached > Resume: pmtu 1438 hops 1 back 64 > > Your patch increases the mtu by 4 bytes. Be aware that adding > one byte of payload may increase the packet size up to 16 bytes > in the case of aes, as we have to pad the encryption payload > always to a multiple of the cipher blocksize. Thanks for testing. My own testing had been limited to rsync'ing large files. The problem with tunnel mode in v1 of the patch was twofold. First, we don't play games with mss and tcp_hlen. esp_payload_len = pmtu directly, instead of esp_payload_len = pmtu - net_hlen. Second, in the tunnel case, header_len already includes net_hlen, see esp_init_state(). The net result is that the formula for tunnel mode was already correct. > > > - > > - switch (x->props.mode) { > > - case XFRM_MODE_TUNNEL: > > - break; > > - default: > > - case XFRM_MODE_TRANSPORT: > > - /* The worst case */ > > - mtu -= blksize - 4; > > - mtu += min_t(u32, blksize - 4, rem); > > - break; > > Btw. why we are doing the calculation above for transport mode? -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, May 10, 2012 at 09:02:49PM -0400, Benjamin Poirier wrote: > > The value returned by this function is tuned for tcp segment size: > 1) from tcp_mtu_to_mss() > mss = pmtu - tcp_hlen - net_hlen > 2) frame structure for transport mode > mtu = mss + tcp_hlen + esp_header_len(esp_payload_len) + ah_len + net_hlen I think you can simplify the calculations here, this calculation should not depend on any special layer 4 protocol. > > The "mtu" parameter of esp4_get_mtu is in fact mtu - ah_len. > The return value of esp4_get_mtu is put into pmtu. > > If we put 1 and 2 together we have: > pmtu = mtu - ah_len - esp_header_len(esp_payload_len) > with esp_payload_len = mss + tcp_hlen > > This formula expands to: > pmtu = mtu - ah_len - (header_len + align(align(pmtu - net_hlen + 2, blksize), > esp->padlen) - (pmtu - net_hlen) + alen) > > and simplifies to: > pmtu = (mtu - ah_len - net_hlen - header_len - alen) & ~(max(blksize, > esp->padlen) - 1) + (net_hlen - 2) > > which, in the context of esp4_get_mtu, becomes: > ((mtu - x->props.header_len - crypto_aead_authsize(esp->aead) - sizeof(struct > iphdr)) & ~(align - 1)) + (sizeof(struct iphdr) - 2) > > This is the same formula as before, except for sizeof(struct iphdr) which was > missing. > Well, makes sense. I use transport mode very rarely, so I never noticed this. But I was sure that it worked correct in tunnel mode. Thanks. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/net/ipv4/esp4.c b/net/ipv4/esp4.c index 89a47b3..60f2738 100644 --- a/net/ipv4/esp4.c +++ b/net/ipv4/esp4.c @@ -459,28 +459,10 @@ static u32 esp4_get_mtu(struct xfrm_state *x, int mtu) struct esp_data *esp = x->data; u32 blksize = ALIGN(crypto_aead_blocksize(esp->aead), 4); u32 align = max_t(u32, blksize, esp->padlen); - u32 rem; - mtu -= x->props.header_len + crypto_aead_authsize(esp->aead); - rem = mtu & (align - 1); - mtu &= ~(align - 1); - - switch (x->props.mode) { - case XFRM_MODE_TUNNEL: - break; - default: - case XFRM_MODE_TRANSPORT: - /* The worst case */ - mtu -= blksize - 4; - mtu += min_t(u32, blksize - 4, rem); - break; - case XFRM_MODE_BEET: - /* The worst case. */ - mtu += min_t(u32, IPV4_BEET_PHMAXLEN, rem); - break; - } - - return mtu - 2; + return ((mtu - x->props.header_len - crypto_aead_authsize(esp->aead) - + sizeof(struct iphdr)) & ~(align - 1)) + (sizeof(struct + iphdr) - 2); } static void esp4_err(struct sk_buff *skb, u32 info)
Corrects the function that determines the esp payload size. The calculations done in esp4_get_mtu lead to overlength frames in transport mode for certain mtu values (eg. 1499 leads to FRAGFAILS) and suboptimal frames for others (eg. 1500, where the addition of padding in the esp header can be avoided). According to what is done, mainly in esp_output(), net_header_len aka sizeof(struct iphdr) must be taken into account before doing the alignment calculation. Signed-off-by: Benjamin Poirier <bpoirier@suse.de> --- Tested with * transport mode E * transport mode EA * transport mode E + ah * tunnel mode E Not tested with BEET, but it should be the same as transport mode draft-nikander-esp-beet-mode-03.txt Section 5.2: "The wire packet format is identical to the ESP transport mode" --- net/ipv4/esp4.c | 24 +++--------------------- 1 files changed, 3 insertions(+), 21 deletions(-)