diff mbox series

[ovs-dev,v1,1/1] datapath-windows: Merge split dis-continuous net-buf.

Message ID 20240717114229.95763-1-svc.ovs-community@vmware.com
State Superseded
Delegated to: Alin Gabriel Serdean
Headers show
Series [ovs-dev,v1,1/1] datapath-windows: Merge split dis-continuous net-buf. | expand

Checks

Context Check Description
ovsrobot/apply-robot success apply and check: success
ovsrobot/github-robot-_Build_and_Test success github build: passed
ovsrobot/intel-ovs-compilation fail test: fail

Commit Message

Wilson Peng July 17, 2024, 11:42 a.m. UTC
From: Wilson Peng <pweisong@vmware.com>

NdisGetDataBuffer() is called without providing a buffer to copy packet
data in case it is not contiguous.  So, it fails in some scenarios
where the packet is handled by the general network stack before OVS
and headers become split in multiple buffers.

In the fix it will supply the stack buffer to copy packet data when
call NdisGetDataBuffer().

In the conntrack Action process, it will do OvsPartialCopyNBL firstly
with the size of layers l7offsets. If the header is split the header
will be merged to one continuous buffer.

But IPV6 traffic is not handed in this case.

Reported-at: https://github.com/openvswitch/ovs-issues/issues/323
Signed-off-by: Wilson Peng <pweisong@vmware.com>
---
 datapath-windows/ovsext/Actions.c    |  7 +++++++
 datapath-windows/ovsext/BufferMgmt.c | 16 +++++++++++++---
 datapath-windows/ovsext/Conntrack.c  |  4 +++-
 datapath-windows/ovsext/IpFragment.c |  8 ++++++--
 4 files changed, 29 insertions(+), 6 deletions(-)

Comments

Ilya Maximets July 17, 2024, 12:10 p.m. UTC | #1
On 7/17/24 13:42, Wilson Peng via dev wrote:
> From: Wilson Peng <pweisong@vmware.com>
> 
> NdisGetDataBuffer() is called without providing a buffer to copy packet
> data in case it is not contiguous.  So, it fails in some scenarios
> where the packet is handled by the general network stack before OVS
> and headers become split in multiple buffers.
> 
> In the fix it will supply the stack buffer to copy packet data when
> call NdisGetDataBuffer().
> 
> In the conntrack Action process, it will do OvsPartialCopyNBL firstly
> with the size of layers l7offsets. If the header is split the header
> will be merged to one continuous buffer.
> 
> But IPV6 traffic is not handed in this case.
> 
> Reported-at: https://github.com/openvswitch/ovs-issues/issues/323
> Signed-off-by: Wilson Peng <pweisong@vmware.com>
> ---
>  datapath-windows/ovsext/Actions.c    |  7 +++++++
>  datapath-windows/ovsext/BufferMgmt.c | 16 +++++++++++++---
>  datapath-windows/ovsext/Conntrack.c  |  4 +++-
>  datapath-windows/ovsext/IpFragment.c |  8 ++++++--
>  4 files changed, 29 insertions(+), 6 deletions(-)

Hi, Wilson.  Thanks for the patch!

Though I wonder why are we using NdisGetDataBuffer directly at all?
We have helper functions OvsGetTcp or OvsGetPacketBytes and these
should be able to handle non-contiguous memory.  I tried to use them
in my previous attempt to fix this issue here:
  https://patchwork.ozlabs.org/project/openvswitch/patch/20240606105348.2354265-1-i.maximets@ovn.org/
Though I definitely missed a few places and so the patch doesn't work
in its current form.

Best regards, Ilya Maximets.
Wilson Peng July 17, 2024, 1:18 p.m. UTC | #2
Hi, Ilya,

    The patch is adding a fix for the current IPv4 traffic not supporting
non-contiguous
header issues on WIndows2019 and Windows2022.  Actually there are several
places
in ovs-windows still using NdisGetDataBuffer directly. We would replace
them with the
mentioned helper function like OvsGetTcp or OvsGetPacketBytes in other
patches after
The needed testing is done. Not sure if this kind of strategy is accepted?

Regards
Wilson

On Wed, Jul 17, 2024 at 8:10 PM Ilya Maximets <i.maximets@ovn.org> wrote:

> On 7/17/24 13:42, Wilson Peng via dev wrote:
> > From: Wilson Peng <pweisong@vmware.com>
> >
> > NdisGetDataBuffer() is called without providing a buffer to copy packet
> > data in case it is not contiguous.  So, it fails in some scenarios
> > where the packet is handled by the general network stack before OVS
> > and headers become split in multiple buffers.
> >
> > In the fix it will supply the stack buffer to copy packet data when
> > call NdisGetDataBuffer().
> >
> > In the conntrack Action process, it will do OvsPartialCopyNBL firstly
> > with the size of layers l7offsets. If the header is split the header
> > will be merged to one continuous buffer.
> >
> > But IPV6 traffic is not handed in this case.
> >
> > Reported-at: https://github.com/openvswitch/ovs-issues/issues/323
> > Signed-off-by: Wilson Peng <pweisong@vmware.com>
> > ---
> >  datapath-windows/ovsext/Actions.c    |  7 +++++++
> >  datapath-windows/ovsext/BufferMgmt.c | 16 +++++++++++++---
> >  datapath-windows/ovsext/Conntrack.c  |  4 +++-
> >  datapath-windows/ovsext/IpFragment.c |  8 ++++++--
> >  4 files changed, 29 insertions(+), 6 deletions(-)
>
> Hi, Wilson.  Thanks for the patch!
>
> Though I wonder why are we using NdisGetDataBuffer directly at all?
> We have helper functions OvsGetTcp or OvsGetPacketBytes and these
> should be able to handle non-contiguous memory.  I tried to use them
> in my previous attempt to fix this issue here:
>
> https://patchwork.ozlabs.org/project/openvswitch/patch/20240606105348.2354265-1-i.maximets@ovn.org/
> Though I definitely missed a few places and so the patch doesn't work
> in its current form.
>
> Best regards, Ilya Maximets.
>
Wenying Dong Aug. 5, 2024, 6:37 a.m. UTC | #3
Hi Ilya,

I am working on Antrea, which has a strong dependency on Windows OVS. We
are now expecting to bump up our Windows Node OS version from Server 2019
to Server 2022. During this work, we found the Windows BSoD issue happened
very frequently on Server 2022. And now it is a blocker to run OVS on
Windows Server 2022

We have tried Wilson's patch on our local testbed, and it can resolve the
BSoD issue. The patch is in the "review" status for a long time, and it
seems you have a different preference on how to parse the IP Header rather
than Windows API "NdisGetDataBuffer". May I know what your concern is?

If the thing is about the choices on Windows APIs, I find that "
NdisGetDataBuffer" is called not only in the place where the bugfix is
referred, then it means even if switching to your suggested function
doesn't completely replace the API calls, and some other change is still
needed for the purpose. Then is it possible to merge the patch to unblock
running OVS on Windows Server 2022 first, then the OVS community can use
another change to unify the API calls at one time?

Best,
Wenying

On Wed, Jul 17, 2024 at 8:10 PM Ilya Maximets <i.maximets@ovn.org> wrote:

> On 7/17/24 13:42, Wilson Peng via dev wrote:
> > From: Wilson Peng <pweisong@vmware.com>
> >
> > NdisGetDataBuffer() is called without providing a buffer to copy packet
> > data in case it is not contiguous.  So, it fails in some scenarios
> > where the packet is handled by the general network stack before OVS
> > and headers become split in multiple buffers.
> >
> > In the fix it will supply the stack buffer to copy packet data when
> > call NdisGetDataBuffer().
> >
> > In the conntrack Action process, it will do OvsPartialCopyNBL firstly
> > with the size of layers l7offsets. If the header is split the header
> > will be merged to one continuous buffer.
> >
> > But IPV6 traffic is not handed in this case.
> >
> > Reported-at: https://github.com/openvswitch/ovs-issues/issues/323
> > Signed-off-by: Wilson Peng <pweisong@vmware.com>
> > ---
> >  datapath-windows/ovsext/Actions.c    |  7 +++++++
> >  datapath-windows/ovsext/BufferMgmt.c | 16 +++++++++++++---
> >  datapath-windows/ovsext/Conntrack.c  |  4 +++-
> >  datapath-windows/ovsext/IpFragment.c |  8 ++++++--
> >  4 files changed, 29 insertions(+), 6 deletions(-)
>
> Hi, Wilson.  Thanks for the patch!
>
> Though I wonder why are we using NdisGetDataBuffer directly at all?
> We have helper functions OvsGetTcp or OvsGetPacketBytes and these
> should be able to handle non-contiguous memory.  I tried to use them
> in my previous attempt to fix this issue here:
>
> https://patchwork.ozlabs.org/project/openvswitch/patch/20240606105348.2354265-1-i.maximets@ovn.org/
> Though I definitely missed a few places and so the patch doesn't work
> in its current form.
>
> Best regards, Ilya Maximets.
>
Ilya Maximets Aug. 5, 2024, 7:02 p.m. UTC | #4
On 8/5/24 08:37, Wenying Dong wrote:
> Hi Ilya,
> 
> I am working on Antrea, which has a strong dependency on Windows OVS.
> We are now expecting to bump up our Windows Node OS version from Server
> 2019 to Server 2022. During this work, we found the Windows BSoD issue
> happened very frequently on Server 2022. And now it is a blocker to run
> OVS on Windows Server 2022
> 
> We have tried Wilson's patch on our local testbed, and it can resolve the
> BSoD issue. The patch is in the "review" status for a long time, and it
> seems you have a different preference on how to parse the IP Header rather
> than Windows API "NdisGetDataBuffer". May I know what your concern is? 
> 
> If the thing is about the choices on Windows APIs, I find that
> "NdisGetDataBuffer" is called not only in the place where the bugfix is
> referred, then it means even if switching to your suggested function doesn't
> completely replace the API calls, and some other change is still needed for
> the purpose. Then is it possible to merge the patch to unblock running OVS
> on Windows Server 2022 first, then the OVS community can use another change
> to unify the API calls at one time?

Hi, Wenying.

The thing about API call is that we have a group of helper functions that
supposed to abstract this kind of header accesses and ideally we should use
those instead of accessing headers directly.

In general, it seems OK to get the change as-is without moving to those
helpers.  The main problem I have is that I have no real way to test
Windows driver code and so I'm not comfortable merging it.  Alin planned
to look at the currently under-review windows patches.  Hopefully he can
get to them this week.

If not, I can consider applying these patches myself, but I'll be fully
relying on your testing, which may be OK, since you're the main and possibly
the only users of the Windows datapath.  But it's still not the best option
since we don't have any kind of CI, except for checking that it compiles.

Best regards, Ilya Maximets.

> 
> Best,
> Wenying
> 
> On Wed, Jul 17, 2024 at 8:10 PM Ilya Maximets <i.maximets@ovn.org <mailto:i.maximets@ovn.org>> wrote:
> 
>     On 7/17/24 13:42, Wilson Peng via dev wrote:
>     > From: Wilson Peng <pweisong@vmware.com <mailto:pweisong@vmware.com>>
>     >
>     > NdisGetDataBuffer() is called without providing a buffer to copy packet
>     > data in case it is not contiguous.  So, it fails in some scenarios
>     > where the packet is handled by the general network stack before OVS
>     > and headers become split in multiple buffers.
>     >
>     > In the fix it will supply the stack buffer to copy packet data when
>     > call NdisGetDataBuffer().
>     >
>     > In the conntrack Action process, it will do OvsPartialCopyNBL firstly
>     > with the size of layers l7offsets. If the header is split the header
>     > will be merged to one continuous buffer.
>     >
>     > But IPV6 traffic is not handed in this case.
>     >
>     > Reported-at: https://github.com/openvswitch/ovs-issues/issues/323 <https://github.com/openvswitch/ovs-issues/issues/323>
>     > Signed-off-by: Wilson Peng <pweisong@vmware.com <mailto:pweisong@vmware.com>>
>     > ---
>     >  datapath-windows/ovsext/Actions.c    |  7 +++++++
>     >  datapath-windows/ovsext/BufferMgmt.c | 16 +++++++++++++---
>     >  datapath-windows/ovsext/Conntrack.c  |  4 +++-
>     >  datapath-windows/ovsext/IpFragment.c |  8 ++++++--
>     >  4 files changed, 29 insertions(+), 6 deletions(-)
> 
>     Hi, Wilson.  Thanks for the patch!
> 
>     Though I wonder why are we using NdisGetDataBuffer directly at all?
>     We have helper functions OvsGetTcp or OvsGetPacketBytes and these
>     should be able to handle non-contiguous memory.  I tried to use them
>     in my previous attempt to fix this issue here:
>       https://patchwork.ozlabs.org/project/openvswitch/patch/20240606105348.2354265-1-i.maximets@ovn.org/ <https://patchwork.ozlabs.org/project/openvswitch/patch/20240606105348.2354265-1-i.maximets@ovn.org/>
>     Though I definitely missed a few places and so the patch doesn't work
>     in its current form.
> 
>     Best regards, Ilya Maximets.
> 
> 
> This electronic communication and the information and any files transmitted with it, or attached to it, are confidential and are intended solely for the use of the individual or entity to whom it is addressed and may contain information that is confidential, legally privileged, protected by privacy laws, or otherwise restricted from disclosure to anyone else. If you are not the intended recipient or the person responsible for delivering the e-mail to the intended recipient, you are hereby notified that any use, copying, distributing, dissemination, forwarding, printing, or copying of this e-mail is strictly prohibited. If you received this e-mail in error, please return the e-mail to the sender, delete it from your computer, and destroy any printed copy of it.
diff mbox series

Patch

diff --git a/datapath-windows/ovsext/Actions.c b/datapath-windows/ovsext/Actions.c
index 97029b0f4..4bc2a6b65 100644
--- a/datapath-windows/ovsext/Actions.c
+++ b/datapath-windows/ovsext/Actions.c
@@ -2414,6 +2414,13 @@  OvsDoExecuteActions(POVS_SWITCH_CONTEXT switchContext,
             }
 
             PNET_BUFFER_LIST oldNbl = ovsFwdCtx.curNbl;
+            PUINT8 bufferStart = NULL;
+
+            bufferStart = OvsGetHeaderBySize(&ovsFwdCtx, layers->l7Offset);
+            if (!bufferStart) {
+                 dropReason = L"OVS-Netbuf reallocated failed";
+                 goto dropit;
+            }
             status = OvsExecuteConntrackAction(&ovsFwdCtx, key,
                                                (const PNL_ATTR)a);
             if (status != NDIS_STATUS_SUCCESS) {
diff --git a/datapath-windows/ovsext/BufferMgmt.c b/datapath-windows/ovsext/BufferMgmt.c
index 5c52757a0..40faaefd6 100644
--- a/datapath-windows/ovsext/BufferMgmt.c
+++ b/datapath-windows/ovsext/BufferMgmt.c
@@ -1110,12 +1110,22 @@  GetIpHeaderInfo(PNET_BUFFER_LIST curNbl,
     IPHdr *ipHdr;
     IPv6Hdr *ipv6Hdr;
     PNET_BUFFER curNb;
+    CHAR tempBuf[MAX_IPV4_HLEN];
 
     curNb = NET_BUFFER_LIST_FIRST_NB(curNbl);
     ASSERT(NET_BUFFER_NEXT_NB(curNb) == NULL);
-    eth = (EthHdr *)NdisGetDataBuffer(curNb,
-                                      hdrInfo->l4Offset,
-                                      NULL, 1, 0);
+
+    if (hdrInfo->isIPv6) {
+        eth = (EthHdr *)NdisGetDataBuffer(curNb,
+                                          hdrInfo->l4Offset,
+                                          NULL, 1, 0);
+    } else {
+        NdisZeroMemory(tempBuf, MAX_IPV4_HLEN);
+        eth = (EthHdr *)NdisGetDataBuffer(curNb,
+                                          hdrInfo->l4Offset,
+                                          (PVOID)tempBuf, 1, 0);
+    }
+
     if (eth == NULL) {
         return NDIS_STATUS_INVALID_PACKET;
     }
diff --git a/datapath-windows/ovsext/Conntrack.c b/datapath-windows/ovsext/Conntrack.c
index 39ba5cc10..1a1794fc1 100644
--- a/datapath-windows/ovsext/Conntrack.c
+++ b/datapath-windows/ovsext/Conntrack.c
@@ -683,6 +683,7 @@  OvsGetTcpHeader(PNET_BUFFER_LIST nbl,
     TCPHdr *tcp;
     VOID *dest = storage;
     uint16_t ipv6ExtLength = 0;
+    CHAR tempBuf[MAX_IPV4_HLEN];
 
     if (layers->isIPv6) {
         ipv6Hdr = NdisGetDataBuffer(NET_BUFFER_LIST_FIRST_NB(nbl),
@@ -701,9 +702,10 @@  OvsGetTcpHeader(PNET_BUFFER_LIST nbl,
             return storage;
         }
     } else {
+        NdisZeroMemory(tempBuf, MAX_IPV4_HLEN);
         ipHdr = NdisGetDataBuffer(NET_BUFFER_LIST_FIRST_NB(nbl),
                                   layers->l4Offset + sizeof(TCPHdr),
-                                  NULL, 1 /*no align*/, 0);
+                                  (PVOID)tempBuf, 1 /*no align*/, 0);
         if (ipHdr == NULL) {
             return NULL;
         }
diff --git a/datapath-windows/ovsext/IpFragment.c b/datapath-windows/ovsext/IpFragment.c
index afb8e50d6..4e119d82a 100644
--- a/datapath-windows/ovsext/IpFragment.c
+++ b/datapath-windows/ovsext/IpFragment.c
@@ -153,12 +153,14 @@  OvsIpv4Reassemble(POVS_SWITCH_CONTEXT switchContext,
     PNET_BUFFER_LIST newNbl = NULL;
     UINT16 ipHdrLen, packetHeader;
     UINT32 packetLen;
+    CHAR tempBuf[MAX_IPV4_HLEN];
 
     curNb = NET_BUFFER_LIST_FIRST_NB(*curNbl);
     ASSERT(NET_BUFFER_NEXT_NB(curNb) == NULL);
 
+    NdisZeroMemory(tempBuf, MAX_IPV4_HLEN);
     eth = (EthHdr*)NdisGetDataBuffer(curNb, layers->l4Offset,
-                                     NULL, 1, 0);
+                                     (PVOID)tempBuf, 1, 0);
     if (eth == NULL) {
         return NDIS_STATUS_INVALID_PACKET;
     }
@@ -253,12 +255,14 @@  OvsProcessIpv4Fragment(POVS_SWITCH_CONTEXT switchContext,
     POVS_IPFRAG_ENTRY entry;
     POVS_FRAGMENT_LIST fragStorage;
     LOCK_STATE_EX htLockState;
+    CHAR tempBuf[MAX_IPV4_HLEN];
 
     curNb = NET_BUFFER_LIST_FIRST_NB(*curNbl);
     ASSERT(NET_BUFFER_NEXT_NB(curNb) == NULL);
 
+    NdisZeroMemory(tempBuf, MAX_IPV4_HLEN);
     eth = (EthHdr*)NdisGetDataBuffer(curNb, layers->l4Offset,
-                                     NULL, 1, 0);
+                                     (PVOID)tempBuf, 1, 0);
     if (eth == NULL) {
         return NDIS_STATUS_INVALID_PACKET;
     }