
ip_tunnel mtu calculation

Message ID 20130629175701.73c17a16@vostro
State RFC, archived
Delegated to: David Miller

Commit Message

Timo Teras June 29, 2013, 2:57 p.m. UTC
Hi,

I'm reviewing changes since 3.9 to net-next and observed that the
tunnel refactoring made the following change in the ip_gre xmit path.

In ip_tunnel_xmit() mtu is now calculated as:
        if (df)
                mtu = dst_mtu(&rt->dst) - dev->hard_header_len
                                        - sizeof(struct iphdr);
        else
                mtu = skb_dst(skb) ? dst_mtu(skb_dst(skb)) : dev->mtu;

And it used to be in ip_gre.c: ipgre_tunnel_xmit():
        if (df)
                mtu = dst_mtu(&rt->dst) - dev->hard_header_len - tunnel->hlen;
        else
                mtu = skb_dst(skb) ? dst_mtu(skb_dst(skb)) : dev->mtu;

I notice that tunnel->hlen is replaced with sizeof(struct iphdr), but
in the case of GRE those are not the same thing. And the refactored
ip_gre.c does not set hard_header_len either. So it would seem the mtu
is now miscalculated (I am planning to give net-next a full test-spin
next week).

It seems the tunnel->hlen used to be the full length, including
sizeof(struct iphdr).

But the new, refactored code seems to exclude sizeof(struct iphdr) from
the tunnel->hlen. So would the following be appropriate?

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
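
To make the arithmetic in the message above concrete, here is a small
stand-alone C sketch of the three calculations under discussion. It is
not kernel code: the function names, the 20-byte IPv4 header (no
options) and the 4-byte GRE header (no key, checksum or sequence
number) are assumptions chosen for illustration, and it takes at face
value the observation that the 3.9 tunnel->hlen included the IPv4
header while the refactored one does not.

```c
/* Illustrative constants, not kernel definitions. */
enum {
	IPV4_HDR_LEN = 20,	/* IPv4 header without options (assumed) */
	GRE_BASE_LEN = 4	/* GRE header without optional fields (assumed) */
};

/* 3.9-style calculation: hlen is assumed to include the IPv4 header. */
int mtu_pre_refactor(int path_mtu, int hard_header_len, int hlen_with_ip)
{
	return path_mtu - hard_header_len - hlen_with_ip;
}

/* Refactored calculation as quoted: only the IPv4 header is subtracted. */
int mtu_refactored(int path_mtu, int hard_header_len)
{
	return path_mtu - hard_header_len - IPV4_HDR_LEN;
}

/* Proposed calculation: hlen (now without IPv4) is subtracted as well. */
int mtu_proposed(int path_mtu, int hard_header_len, int hlen_without_ip)
{
	return path_mtu - hard_header_len - hlen_without_ip - IPV4_HDR_LEN;
}
```

With a 1500-byte path mtu and hard_header_len of 0, the first and third
functions both give 1476 for plain GRE, while the refactored form gives
1480, i.e. it differs by exactly the GRE header length.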

Comments

Pravin B Shelar June 30, 2013, 4:36 a.m. UTC | #1
On Sat, Jun 29, 2013 at 7:57 AM, Timo Teras <timo.teras@iki.fi> wrote:
> Hi,
>
> I'm reviewing changes since 3.9 to net-next and observed that the
> tunnel refactoring made the following change in the ip_gre xmit path.
>
> In ip_tunnel_xmit() mtu is now calculated as:
>         if (df)
>                 mtu = dst_mtu(&rt->dst) - dev->hard_header_len
>                                         - sizeof(struct iphdr);
>         else
>                 mtu = skb_dst(skb) ? dst_mtu(skb_dst(skb)) : dev->mtu;
>
> And it used to be in ip_gre.c: ipgre_tunnel_xmit():
>         if (df)
>                 mtu = dst_mtu(&rt->dst) - dev->hard_header_len - tunnel->hlen;
>         else
>                 mtu = skb_dst(skb) ? dst_mtu(skb_dst(skb)) : dev->mtu;
>
> I notice that tunnel->hlen is replaced with sizeof(struct iphdr), but
> in the case of GRE those are not the same thing. And the refactored
> ip_gre.c does not set hard_header_len either. So it would seem the mtu
> is now miscalculated (I am planning to give net-next a full test-spin
> next week).
>
> It seems the tunnel->hlen used to be the full length, including
> sizeof(struct iphdr).
>
> But the new, refactored code seems to exclude sizeof(struct iphdr) from
> the tunnel->hlen. So would the following be appropriate?
>
This is the ip-tunnel layer: the skb already has the GRE header pushed,
so the mtu does not need to account for the GRE header when compared to
skb->len.
But I missed one comparison in the mtu check where iph->tot_len is used
rather than skb->len, which is the correct length.

The gre module uses iph->tot_len for the pmtu check, which is wrong for
gre-tap devices. This bug was there even before the restructuring.
I will send out a patch for the ip-tunnels code for now.

Thanks.
> diff --git a/net/ipv4/ip_tunnel.c b/net/ipv4/ip_tunnel.c
> index 394cebc..ac3a9a1 100644
> --- a/net/ipv4/ip_tunnel.c
> +++ b/net/ipv4/ip_tunnel.c
> @@ -564,7 +564,7 @@ void ip_tunnel_xmit(struct sk_buff *skb, struct net_device *dev,
>
>         if (df)
>                 mtu = dst_mtu(&rt->dst) - dev->hard_header_len
> -                                       - sizeof(struct iphdr);
> +                       - tunnel->hlen - sizeof(struct iphdr);
>         else
>                 mtu = skb_dst(skb) ? dst_mtu(skb_dst(skb)) : dev->mtu;
>
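
The difference between checking skb->len and checking iph->tot_len can
be sketched with a toy model. The struct and function names below are
hypothetical stand-ins, not kernel structures, and the 14-byte Ethernet
header for gre-tap and the 4-byte GRE header are assumed values. The
model only illustrates the reply above: once the GRE header has been
pushed, the length seen at the tunnel layer covers everything that
follows the outer IPv4 header, whereas the inner IP total length does
not.

```c
#include <stdbool.h>

/* Hypothetical model of the lengths that matter for the check. */
struct pkt_lens {
	int inner_ip_tot_len;	/* what the inner iph->tot_len would hold */
	int inner_l2_len;	/* 14 for gre-tap (Ethernet), 0 for plain gre */
	int gre_hdr_len;	/* GRE header already pushed onto the packet */
};

/* Length as seen at the tunnel layer, GRE header included. */
int skb_len_at_tunnel_layer(const struct pkt_lens *p)
{
	return p->gre_hdr_len + p->inner_l2_len + p->inner_ip_tot_len;
}

/* Check against the full length of the packet to be encapsulated. */
bool too_big_by_skb_len(const struct pkt_lens *p, int mtu)
{
	return skb_len_at_tunnel_layer(p) > mtu;
}

/* Check against the inner IP length only, as described for the bug. */
bool too_big_by_tot_len(const struct pkt_lens *p, int mtu)
{
	return p->inner_ip_tot_len > mtu;
}
```

For a 1470-byte inner IP packet on a gre-tap device and an mtu of 1480
(1500 minus the outer IPv4 header), the full length is 1488, so the
first check flags the packet while the second lets it through.
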
Timo Teras June 30, 2013, 7:46 a.m. UTC | #2
On Sat, 29 Jun 2013 21:36:39 -0700
Pravin Shelar <pshelar@nicira.com> wrote:

> On Sat, Jun 29, 2013 at 7:57 AM, Timo Teras <timo.teras@iki.fi> wrote:
> > Hi,
> >
> > I'm reviewing changes since 3.9 to net-next and observed that the
> > tunnel refactoring made the following change in the ip_gre xmit path.
> >
> > In ip_tunnel_xmit() mtu is now calculated as:
> >         if (df)
> >                 mtu = dst_mtu(&rt->dst) - dev->hard_header_len
> >                                         - sizeof(struct iphdr);
> >         else
> >                 mtu = skb_dst(skb) ? dst_mtu(skb_dst(skb)) : dev->mtu;
> >
> > And it used to be in ip_gre.c: ipgre_tunnel_xmit():
> >         if (df)
> >                 mtu = dst_mtu(&rt->dst) - dev->hard_header_len - tunnel->hlen;
> >         else
> >                 mtu = skb_dst(skb) ? dst_mtu(skb_dst(skb)) : dev->mtu;
> >
> > I notice that tunnel->hlen is replaced with sizeof(struct iphdr),
> > but in the case of GRE those are not the same thing. And the
> > refactored ip_gre.c does not set hard_header_len either. So it would
> > seem the mtu is now miscalculated (I am planning to give net-next a
> > full test-spin next week).
> >
> > It seems the tunnel->hlen used to be the full length, including
> > sizeof(struct iphdr).
> >
> > But the new, refactored code seems to exclude sizeof(struct iphdr)
> > from the tunnel->hlen. So would the following be appropriate?
> >
> This is the ip-tunnel layer: the skb already has the GRE header pushed,
> so the mtu does not need to account for the GRE header when compared to
> skb->len.
>
> But I missed one comparison in the mtu check where iph->tot_len is used
> rather than skb->len, which is the correct length.
>
> The gre module uses iph->tot_len for the pmtu check, which is wrong for
> gre-tap devices. This bug was there even before the restructuring.
> I will send out a patch for the ip-tunnels code for now.

This fixes only the first part of the problem.

The mtu is sent out a few lines below in an ICMP message. That MTU also
needs to account for the tunnel header's length. Otherwise the remote
gets the wrong impression of the path mtu.

- Timo
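
The concern about the advertised value can be checked with a few lines
of C. This is a sketch under stated assumptions, not the kernel's ICMP
path: the function names are hypothetical, the outer IPv4 header is
taken as 20 bytes, and the tunnel header as a 4-byte GRE header. It
asks whether an inner packet of exactly the advertised size still fits
the path mtu once the tunnel and outer IPv4 headers are added.

```c
#include <stdbool.h>

enum { IPV4_HDR_LEN = 20 };	/* outer IPv4 header without options (assumed) */

/* Size on the wire after encapsulating an inner packet of inner_len bytes. */
int outer_len(int inner_len, int tunnel_hlen)
{
	return inner_len + tunnel_hlen + IPV4_HDR_LEN;
}

/* True if a sender obeying the advertised mtu produces packets that fit. */
bool advertised_mtu_is_usable(int advertised_mtu, int tunnel_hlen, int path_mtu)
{
	return outer_len(advertised_mtu, tunnel_hlen) <= path_mtu;
}
```

With a 1500-byte path mtu, advertising 1480 leads to 1504-byte outer
packets, which do not fit, whereas advertising 1476 does fit.
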

Patch

diff --git a/net/ipv4/ip_tunnel.c b/net/ipv4/ip_tunnel.c
index 394cebc..ac3a9a1 100644
--- a/net/ipv4/ip_tunnel.c
+++ b/net/ipv4/ip_tunnel.c
@@ -564,7 +564,7 @@  void ip_tunnel_xmit(struct sk_buff *skb, struct net_device *dev,
 
 	if (df)
 		mtu = dst_mtu(&rt->dst) - dev->hard_header_len
-					- sizeof(struct iphdr);
+			- tunnel->hlen - sizeof(struct iphdr);
 	else
 		mtu = skb_dst(skb) ? dst_mtu(skb_dst(skb)) : dev->mtu;