diff mbox series

[ovs-dev,v5,1/3] userspace: Adjust segment size on encapsulation.

Message ID 20240712213010.1780089-1-mkp@redhat.com
State Superseded
Delegated to: Ilya Maximets
Headers show
Series [ovs-dev,v5,1/3] userspace: Adjust segment size on encapsulation. | expand

Checks

Context Check Description
ovsrobot/apply-robot success apply and check: success
ovsrobot/github-robot-_Build_and_Test success github build: passed

Commit Message

Mike Pattrick July 12, 2024, 9:30 p.m. UTC
When prepending a tunnel header to a packet marked for segmentation, we
need to adjust the segment size. Failure to do so can result in packets
that are larger than the intended MTU post segmentation.

Fixes: 084c8087292c ("userspace: Support VXLAN and GENEVE TSO.")
Signed-off-by: Mike Pattrick <mkp@redhat.com>
---
 lib/netdev-native-tnl.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

Comments

David Marchand July 13, 2024, 1:45 p.m. UTC | #1
On Fri, Jul 12, 2024 at 11:30 PM Mike Pattrick <mkp@redhat.com> wrote:
>
> When prepending a tunnel header to a packet marked for segmentation, we
> need to adjust the segment size. Failure to do so can result in packets
> that are larger than the intended MTU post segmentation.
>
> Fixes: 084c8087292c ("userspace: Support VXLAN and GENEVE TSO.")
> Signed-off-by: Mike Pattrick <mkp@redhat.com>

I see no difference with v4, so just repeating:
Reviewed-by: David Marchand <david.marchand@redhat.com>
Ilya Maximets July 24, 2024, 7:56 p.m. UTC | #2
On 7/12/24 23:30, Mike Pattrick wrote:
> When prepending a tunnel header to a packet marked for segmentation, we
> need to adjust the segment size. Failure to do so can result in packets
> that are larger than the intended MTU post segmentation.

Hi, Mike.  Not a full review yet, but could you, please, explain in
more details why we need to adjust TSO segment size here?

In order to ensure packet delivery, MTU inside VM / container must be
smaller than the physical network MTU by the size of the tunnel header.
So, tso_segsz of the packet received from such a VM or container should
be already small enough.  Or am I missing something?

Best regards, Ilya Maximets.

> 
> Fixes: 084c8087292c ("userspace: Support VXLAN and GENEVE TSO.")
> Signed-off-by: Mike Pattrick <mkp@redhat.com>
> ---
>  lib/netdev-native-tnl.c | 11 +++++++++++
>  1 file changed, 11 insertions(+)
> 
> diff --git a/lib/netdev-native-tnl.c b/lib/netdev-native-tnl.c
> index 16c56608d..78c91b5fa 100644
> --- a/lib/netdev-native-tnl.c
> +++ b/lib/netdev-native-tnl.c
> @@ -161,6 +161,17 @@ netdev_tnl_push_ip_header(struct dp_packet *packet, const void *header,
>      *ip_tot_size = dp_packet_size(packet) - sizeof (struct eth_header);
>  
>      memcpy(eth, header, size);
> +
> +    /* The prepended header may cause TSO marked packets to exceed the intended
> +     * MTU on segmentation. */
> +    if (dp_packet_hwol_is_tso(packet)) {
> +        uint16_t tso_segsz = dp_packet_get_tso_segsz(packet);
> +        if (tso_segsz > size) {
> +            tso_segsz -= size;
> +            dp_packet_set_tso_segsz(packet, tso_segsz);
> +        }
> +    }
> +
>      /* The encapsulated packet has type Ethernet. Adjust dp_packet. */
>      packet->packet_type = htonl(PT_ETH);
>      dp_packet_reset_offsets(packet);
Mike Pattrick July 24, 2024, 10:45 p.m. UTC | #3
On Wed, Jul 24, 2024 at 3:56 PM Ilya Maximets <i.maximets@ovn.org> wrote:
>
> On 7/12/24 23:30, Mike Pattrick wrote:
> > When prepending a tunnel header to a packet marked for segmentation, we
> > need to adjust the segment size. Failure to do so can result in packets
> > that are larger than the intended MTU post segmentation.
>
> Hi, Mike.  Not a full review yet, but could you, please, explain in
> more details why we need to adjust TSO segment size here?
>
> In order to ensure packet delivery, MTU inside VM / container must be
> smaller than the physical network MTU by the size of the tunnel header.
> So, tso_segsz of the packet received from such a VM or container should
> be already small enough.  Or am I missing something?

My rationale here is the VM can't know if OVS will be encapsulating a
packet in a udp tunnel, or how big that encapsulation would be. The
UDP encapsulation is pretty large and potentially variable, it may be
overly burdensome if the VM MTU must be adjusted to anticipate any UDP
tunnel.


Thanks,
Mike

>
> Best regards, Ilya Maximets.
>
> >
> > Fixes: 084c8087292c ("userspace: Support VXLAN and GENEVE TSO.")
> > Signed-off-by: Mike Pattrick <mkp@redhat.com>
> > ---
> >  lib/netdev-native-tnl.c | 11 +++++++++++
> >  1 file changed, 11 insertions(+)
> >
> > diff --git a/lib/netdev-native-tnl.c b/lib/netdev-native-tnl.c
> > index 16c56608d..78c91b5fa 100644
> > --- a/lib/netdev-native-tnl.c
> > +++ b/lib/netdev-native-tnl.c
> > @@ -161,6 +161,17 @@ netdev_tnl_push_ip_header(struct dp_packet *packet, const void *header,
> >      *ip_tot_size = dp_packet_size(packet) - sizeof (struct eth_header);
> >
> >      memcpy(eth, header, size);
> > +
> > +    /* The prepended header may cause TSO marked packets to exceed the intended
> > +     * MTU on segmentation. */
> > +    if (dp_packet_hwol_is_tso(packet)) {
> > +        uint16_t tso_segsz = dp_packet_get_tso_segsz(packet);
> > +        if (tso_segsz > size) {
> > +            tso_segsz -= size;
> > +            dp_packet_set_tso_segsz(packet, tso_segsz);
> > +        }
> > +    }
> > +
> >      /* The encapsulated packet has type Ethernet. Adjust dp_packet. */
> >      packet->packet_type = htonl(PT_ETH);
> >      dp_packet_reset_offsets(packet);
>
Ilya Maximets July 25, 2024, 10:22 a.m. UTC | #4
On 7/25/24 00:45, Mike Pattrick wrote:
> On Wed, Jul 24, 2024 at 3:56 PM Ilya Maximets <i.maximets@ovn.org> wrote:
>>
>> On 7/12/24 23:30, Mike Pattrick wrote:
>>> When prepending a tunnel header to a packet marked for segmentation, we
>>> need to adjust the segment size. Failure to do so can result in packets
>>> that are larger than the intended MTU post segmentation.
>>
>> Hi, Mike.  Not a full review yet, but could you, please, explain in
>> more details why we need to adjust TSO segment size here?
>>
>> In order to ensure packet delivery, MTU inside VM / container must be
>> smaller than the physical network MTU by the size of the tunnel header.
>> So, tso_segsz of the packet received from such a VM or container should
>> be already small enough.  Or am I missing something?
> 
> My rationale here is the VM can't know if OVS will be encapsulating a
> packet in a udp tunnel, or how big that encapsulation would be. The
> UDP encapsulation is pretty large and potentially variable, it may be
> overly burdensome if the VM MTU must be adjusted to anticipate any UDP
> tunnel.

It is a burden, sure.  But that's how it works for kernel datapath as well
and for networking in general.  VM has to know the MTU of the underlying
network the same way as a physical host has to know the MTU of the physical
network it is connected to, there is no way around this.  All the common
cluster management systems like OpenStack or ovn-kubernetes CNI will set
MTUs for VMs and container interfaces appropriately when tunneling is
involved.  OVS on its own can't make this decision, especially because we
don't actually know the MTU of the destination or any hops in-between anyway.
VM has to know the MTU or has to run some form of path MTU discovery to not
send oversized packets.  Userspace tunnels are not good for PMTUD since
they will not forward or generate ICMP errors, but that's a separate issue
and there are ways to not rely on ICMP for PMTUD.

Best regards, Ilya Maximets.

> 
> 
> Thanks,
> Mike
> 
>>
>> Best regards, Ilya Maximets.
>>
>>>
>>> Fixes: 084c8087292c ("userspace: Support VXLAN and GENEVE TSO.")
>>> Signed-off-by: Mike Pattrick <mkp@redhat.com>
>>> ---
>>>  lib/netdev-native-tnl.c | 11 +++++++++++
>>>  1 file changed, 11 insertions(+)
>>>
>>> diff --git a/lib/netdev-native-tnl.c b/lib/netdev-native-tnl.c
>>> index 16c56608d..78c91b5fa 100644
>>> --- a/lib/netdev-native-tnl.c
>>> +++ b/lib/netdev-native-tnl.c
>>> @@ -161,6 +161,17 @@ netdev_tnl_push_ip_header(struct dp_packet *packet, const void *header,
>>>      *ip_tot_size = dp_packet_size(packet) - sizeof (struct eth_header);
>>>
>>>      memcpy(eth, header, size);
>>> +
>>> +    /* The prepended header may cause TSO marked packets to exceed the intended
>>> +     * MTU on segmentation. */
>>> +    if (dp_packet_hwol_is_tso(packet)) {
>>> +        uint16_t tso_segsz = dp_packet_get_tso_segsz(packet);
>>> +        if (tso_segsz > size) {
>>> +            tso_segsz -= size;
>>> +            dp_packet_set_tso_segsz(packet, tso_segsz);
>>> +        }
>>> +    }
>>> +
>>>      /* The encapsulated packet has type Ethernet. Adjust dp_packet. */
>>>      packet->packet_type = htonl(PT_ETH);
>>>      dp_packet_reset_offsets(packet);
>>
>
diff mbox series

Patch

diff --git a/lib/netdev-native-tnl.c b/lib/netdev-native-tnl.c
index 16c56608d..78c91b5fa 100644
--- a/lib/netdev-native-tnl.c
+++ b/lib/netdev-native-tnl.c
@@ -161,6 +161,17 @@  netdev_tnl_push_ip_header(struct dp_packet *packet, const void *header,
     *ip_tot_size = dp_packet_size(packet) - sizeof (struct eth_header);
 
     memcpy(eth, header, size);
+
+    /* The prepended header may cause TSO marked packets to exceed the intended
+     * MTU on segmentation. */
+    if (dp_packet_hwol_is_tso(packet)) {
+        uint16_t tso_segsz = dp_packet_get_tso_segsz(packet);
+        if (tso_segsz > size) {
+            tso_segsz -= size;
+            dp_packet_set_tso_segsz(packet, tso_segsz);
+        }
+    }
+
     /* The encapsulated packet has type Ethernet. Adjust dp_packet. */
     packet->packet_type = htonl(PT_ETH);
     dp_packet_reset_offsets(packet);