diff mbox series

[ovs-dev,v2,1/1] datapath-windows : Avoid a deadlock when processing TFTP conntrack.

Message ID 20240605053552.17478-1-svc.ovs-community@vmware.com
State Changes Requested
Delegated to: Alin Gabriel Serdean
Headers show
Series [ovs-dev,v2,1/1] datapath-windows : Avoid a deadlock when processing TFTP conntrack. | expand

Checks

Context Check Description
ovsrobot/apply-robot success apply and check: success
ovsrobot/github-robot-_Build_and_Test success github build: passed
ovsrobot/intel-ovs-compilation success test: success

Commit Message

Wilson Peng June 5, 2024, 5:35 a.m. UTC
From: Wilson Peng <pweisong@vmware.com>

It is found the TFTP reply packet with source port 69 will trigger host hang
And the possible coredump.

According to part 4 in TFTP RFC https://datatracker.ietf.org/doc/html/rfc1350,
The TFTP reply packet should use a new source-port(not 69) to connect to
Client.

Upon this TFTP reply packet ovs-windows kernel part will meet deadlock as it
Does set entry->parent to be equal to entry itself.

Reproducing step(installing ovsext driver on Win2019 server):

Topo: podvif38--192.168.10.38-----ovs----192.168.10.40----podvif40

Setup flow on local setup,
$Vif38Name="podvif38"
$Vif40Name="podvif40"
ovs-ofctl del-flows br-int --strict "table=0,priority=0"
ovs-ofctl add-flow br-int "table=0,in_port=$Vif38Name,udp,
    actions=ct(commit,alg=tftp),output:$Vif40Name"

Constructing a TFTP request and reply packet using scapy below,
TFTPPacket1=Ether(dst="00:15:5d:04:a0:a2",src="00:15:5d:04:a0:a1")/
            IP(src="192.168.10.38",dst="192.168.10.40")/
            UDP(sport=51000,dport=69)/TFTP()/
            TFTP_RRQ(filename="OVSIM_CERT.cer",mode="octet")

TFTPPacket2=Ether(dst="00:15:5d:04:a0:a1",src="00:15:5d:04:a0:a2")/
            IP(src="192.168.10.40",dst="192.168.10.38")/
            UDP(sport=69,dport=51000)/TFTP()/
            TFTP_ERROR(errorcode=4,errormsg=b'Access violation')

Sending the packet via ovs CMD,
ovs-ofctl packet-out br-int 1 'resubmit(,0)'
00155D04A0A800155D04A0A7080045000033000100004011E51AC0A80A26C0A80A28C7380045001'
FBFCC00014F5653494D5F434552542E636572006F6374657400

ovs-ofctl packet-out br-int 1 'resubmit(,0)'
00155D04A0A700155D04A0A8080045000031000100004011E51CC0A80A28C0A80A260045C73800'
1DB033000500044163636573732076696F6C6174696F6E2E00

Without the fix, Windows node will HANG and not responding.

It is good to backport the fix to main and backported until 3.1

Signed-off-by: Wilson Peng <pweisong@vmware.com>
---
 datapath-windows/ovsext/Conntrack.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

Comments

Simon Horman June 5, 2024, 12:32 p.m. UTC | #1
On Wed, Jun 05, 2024 at 01:35:52PM +0800, Wilson Peng via dev wrote:
> From: Wilson Peng <pweisong@vmware.com>
> 
> It is found the TFTP reply packet with source port 69 will trigger host hang
> And the possible coredump.
> 
> According to part 4 in TFTP RFC https://datatracker.ietf.org/doc/html/rfc1350,
> The TFTP reply packet should use a new source-port(not 69) to connect to
> Client.
> 
> Upon this TFTP reply packet ovs-windows kernel part will meet deadlock as it
> Does set entry->parent to be equal to entry itself.
> 
> Reproducing step(installing ovsext driver on Win2019 server):
> 
> Topo: podvif38--192.168.10.38-----ovs----192.168.10.40----podvif40
> 
> Setup flow on local setup,
> $Vif38Name="podvif38"
> $Vif40Name="podvif40"
> ovs-ofctl del-flows br-int --strict "table=0,priority=0"
> ovs-ofctl add-flow br-int "table=0,in_port=$Vif38Name,udp,
>     actions=ct(commit,alg=tftp),output:$Vif40Name"
> 
> Constructing a TFTP request and reply packet using scapy below,
> TFTPPacket1=Ether(dst="00:15:5d:04:a0:a2",src="00:15:5d:04:a0:a1")/
>             IP(src="192.168.10.38",dst="192.168.10.40")/
>             UDP(sport=51000,dport=69)/TFTP()/
>             TFTP_RRQ(filename="OVSIM_CERT.cer",mode="octet")
> 
> TFTPPacket2=Ether(dst="00:15:5d:04:a0:a1",src="00:15:5d:04:a0:a2")/
>             IP(src="192.168.10.40",dst="192.168.10.38")/
>             UDP(sport=69,dport=51000)/TFTP()/
>             TFTP_ERROR(errorcode=4,errormsg=b'Access violation')

Hi Wilson,

From an L4 perspective the above packet flow seems very normal -
the src/dst addresses and ports are reversed. So I am curious
to know what it is about TFTP - other than the RFC reference above -
that makes this special from an OVS PoV.

In other words, why does parent == entry in this case,
but, I assume, not for other Ether/IPv4/UDP cases?

> 
> Sending the packet via ovs CMD,
> ovs-ofctl packet-out br-int 1 'resubmit(,0)'
> 00155D04A0A800155D04A0A7080045000033000100004011E51AC0A80A26C0A80A28C7380045001'
> FBFCC00014F5653494D5F434552542E636572006F6374657400
> 
> ovs-ofctl packet-out br-int 1 'resubmit(,0)'
> 00155D04A0A700155D04A0A8080045000031000100004011E51CC0A80A28C0A80A260045C73800'
> 1DB033000500044163636573732076696F6C6174696F6E2E00
> 
> Without the fix, Windows node will HANG and not responding.
> 
> It is good to backport the fix to main and backported until 3.1
> 
> Signed-off-by: Wilson Peng <pweisong@vmware.com>
> ---
>  datapath-windows/ovsext/Conntrack.c | 7 +++++--
>  1 file changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/datapath-windows/ovsext/Conntrack.c b/datapath-windows/ovsext/Conntrack.c
> index 39ba5cc10..17a1257f6 100644
> --- a/datapath-windows/ovsext/Conntrack.c
> +++ b/datapath-windows/ovsext/Conntrack.c
> @@ -410,6 +410,8 @@ OvsCtEntryCreate(OvsForwardingContext *fwdCtx,
>      }
>      if (state != OVS_CS_F_INVALID && commit) {
>          if (entry) {
> +            /* make sure that parentEntry is not equal to entry*/
> +            ASSERT(!parentEntry || (parentEntry != entry));
>              entry->parent = parentEntry;
>              if (OvsCtAddEntry(entry, ctx, natInfo, currentTime)) {
>                  *entryCreated = TRUE;
> @@ -1033,8 +1035,9 @@ OvsProcessConntrackEntry(OvsForwardingContext *fwdCtx,
>              } else {
>                  POVS_CT_ENTRY parentEntry;
>                  parentEntry = OvsCtRelatedLookup(ctx->key, currentTime);
> -                entry->parent = parentEntry;
> -                if (parentEntry != NULL) {
> +                /* make sure that parentEntry is not equal to entry*/
> +                if ((parentEntry != NULL) && (parentEntry != entry)) {
> +                    entry->parent = parentEntry;
>                      state |= OVS_CS_F_RELATED;
>                  }
>              }
> -- 
> 2.39.2 (Apple Git-143)
> 
> _______________________________________________
> dev mailing list
> dev@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>
Wilson Peng June 5, 2024, 1:18 p.m. UTC | #2
Hi, Simon,
    In this case, for tftp packet processing, it does have a child-parent
processing logic just like ftp in tcp.
Tftp packet1 from port1 to 69 and it will create one new conntrack entry
and create one related conntrack
entry also with src-port 0 and dst-port port1.   Original idea is if the
tftp reply packet is from port2 to port1 and
then it will create one new conntrack entry then it could set this
conntrack entry(port2->port1) parent be equal
to  conntrack_entry(port1-->69).

    In the processing part, if the tftp reply packet is from 69 to port1
then it would hit the original conntrack entry
(port1-->69) and lead to conntrack_entry parent be equal to itself.

    Only ftp/tftp traffic processing will have parent-child logic.

Regards
Wilson




On Wed, Jun 5, 2024 at 8:32 PM Simon Horman <horms@ovn.org> wrote:

> On Wed, Jun 05, 2024 at 01:35:52PM +0800, Wilson Peng via dev wrote:
> > From: Wilson Peng <pweisong@vmware.com>
> >
> > It is found the TFTP reply packet with source port 69 will trigger host
> hang
> > And the possible coredump.
> >
> > According to part 4 in TFTP RFC
> https://datatracker.ietf.org/doc/html/rfc1350,
> > The TFTP reply packet should use a new source-port(not 69) to connect to
> > Client.
> >
> > Upon this TFTP reply packet ovs-windows kernel part will meet deadlock
> as it
> > Does set entry->parent to be equal to entry itself.
> >
> > Reproducing step(installing ovsext driver on Win2019 server):
> >
> > Topo: podvif38--192.168.10.38-----ovs----192.168.10.40----podvif40
> >
> > Setup flow on local setup,
> > $Vif38Name="podvif38"
> > $Vif40Name="podvif40"
> > ovs-ofctl del-flows br-int --strict "table=0,priority=0"
> > ovs-ofctl add-flow br-int "table=0,in_port=$Vif38Name,udp,
> >     actions=ct(commit,alg=tftp),output:$Vif40Name"
> >
> > Constructing a TFTP request and reply packet using scapy below,
> > TFTPPacket1=Ether(dst="00:15:5d:04:a0:a2",src="00:15:5d:04:a0:a1")/
> >             IP(src="192.168.10.38",dst="192.168.10.40")/
> >             UDP(sport=51000,dport=69)/TFTP()/
> >             TFTP_RRQ(filename="OVSIM_CERT.cer",mode="octet")
> >
> > TFTPPacket2=Ether(dst="00:15:5d:04:a0:a1",src="00:15:5d:04:a0:a2")/
> >             IP(src="192.168.10.40",dst="192.168.10.38")/
> >             UDP(sport=69,dport=51000)/TFTP()/
> >             TFTP_ERROR(errorcode=4,errormsg=b'Access violation')
>
> Hi Wilson,
>
> From an L4 perspective the above packet flow seems very normal -
> the src/dst addresses and ports are reversed. So I am curious
> to know what it is about TFTP - other than the RFC reference above -
> that makes this special from an OVS PoV.
>
> In other words, why does parent == entry in this case,
> but, I assume, not for other Ether/IPv4/UDP cases?
>
> >
> > Sending the packet via ovs CMD,
> > ovs-ofctl packet-out br-int 1 'resubmit(,0)'
> >
> 00155D04A0A800155D04A0A7080045000033000100004011E51AC0A80A26C0A80A28C7380045001'
> > FBFCC00014F5653494D5F434552542E636572006F6374657400
> >
> > ovs-ofctl packet-out br-int 1 'resubmit(,0)'
> >
> 00155D04A0A700155D04A0A8080045000031000100004011E51CC0A80A28C0A80A260045C73800'
> > 1DB033000500044163636573732076696F6C6174696F6E2E00
> >
> > Without the fix, Windows node will HANG and not responding.
> >
> > It is good to backport the fix to main and backported until 3.1
> >
> > Signed-off-by: Wilson Peng <pweisong@vmware.com>
> > ---
> >  datapath-windows/ovsext/Conntrack.c | 7 +++++--
> >  1 file changed, 5 insertions(+), 2 deletions(-)
> >
> > diff --git a/datapath-windows/ovsext/Conntrack.c
> b/datapath-windows/ovsext/Conntrack.c
> > index 39ba5cc10..17a1257f6 100644
> > --- a/datapath-windows/ovsext/Conntrack.c
> > +++ b/datapath-windows/ovsext/Conntrack.c
> > @@ -410,6 +410,8 @@ OvsCtEntryCreate(OvsForwardingContext *fwdCtx,
> >      }
> >      if (state != OVS_CS_F_INVALID && commit) {
> >          if (entry) {
> > +            /* make sure that parentEntry is not equal to entry*/
> > +            ASSERT(!parentEntry || (parentEntry != entry));
> >              entry->parent = parentEntry;
> >              if (OvsCtAddEntry(entry, ctx, natInfo, currentTime)) {
> >                  *entryCreated = TRUE;
> > @@ -1033,8 +1035,9 @@ OvsProcessConntrackEntry(OvsForwardingContext
> *fwdCtx,
> >              } else {
> >                  POVS_CT_ENTRY parentEntry;
> >                  parentEntry = OvsCtRelatedLookup(ctx->key, currentTime);
> > -                entry->parent = parentEntry;
> > -                if (parentEntry != NULL) {
> > +                /* make sure that parentEntry is not equal to entry*/
> > +                if ((parentEntry != NULL) && (parentEntry != entry)) {
> > +                    entry->parent = parentEntry;
> >                      state |= OVS_CS_F_RELATED;
> >                  }
> >              }
> > --
> > 2.39.2 (Apple Git-143)
> >
> > _______________________________________________
> > dev mailing list
> > dev@openvswitch.org
> > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
> >
> _______________________________________________
> dev mailing list
> dev@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>
Wilson Peng June 5, 2024, 1:26 p.m. UTC | #3
Hi, Simon,
    In this case, for tftp packet processing, it does have a child-parent
processing logic just like ftp in tcp.
Tftp packet1 from port1 to 69 and it will create one new conntrack entry
and create one related conntrack
entry also with src-port 0 and dst-port port1.   Original idea is if the
tftp reply packet is from port2 to port1 and
then it will create one new conntrack entry then it could set this
conntrack entry(port2->port1) parent be equal
to  conntrack_entry(port1-->69).

    In the processing part, if the tftp reply packet is from 69 to port1
then it would hit the original conntrack entry
(port1-->69) and lead to conntrack_entry parent be equal to itself.

    Only ftp/tftp traffic processing will have parent-child logic.

Regards
Wilson

On Wed, Jun 5, 2024 at 8:32 PM Simon Horman <horms@ovn.org> wrote:

> On Wed, Jun 05, 2024 at 01:35:52PM +0800, Wilson Peng via dev wrote:
> > From: Wilson Peng <pweisong@vmware.com>
> >
> > It is found the TFTP reply packet with source port 69 will trigger host
> hang
> > And the possible coredump.
> >
> > According to part 4 in TFTP RFC
> https://datatracker.ietf.org/doc/html/rfc1350,
> > The TFTP reply packet should use a new source-port(not 69) to connect to
> > Client.
> >
> > Upon this TFTP reply packet ovs-windows kernel part will meet deadlock
> as it
> > Does set entry->parent to be equal to entry itself.
> >
> > Reproducing step(installing ovsext driver on Win2019 server):
> >
> > Topo: podvif38--192.168.10.38-----ovs----192.168.10.40----podvif40
> >
> > Setup flow on local setup,
> > $Vif38Name="podvif38"
> > $Vif40Name="podvif40"
> > ovs-ofctl del-flows br-int --strict "table=0,priority=0"
> > ovs-ofctl add-flow br-int "table=0,in_port=$Vif38Name,udp,
> >     actions=ct(commit,alg=tftp),output:$Vif40Name"
> >
> > Constructing a TFTP request and reply packet using scapy below,
> > TFTPPacket1=Ether(dst="00:15:5d:04:a0:a2",src="00:15:5d:04:a0:a1")/
> >             IP(src="192.168.10.38",dst="192.168.10.40")/
> >             UDP(sport=51000,dport=69)/TFTP()/
> >             TFTP_RRQ(filename="OVSIM_CERT.cer",mode="octet")
> >
> > TFTPPacket2=Ether(dst="00:15:5d:04:a0:a1",src="00:15:5d:04:a0:a2")/
> >             IP(src="192.168.10.40",dst="192.168.10.38")/
> >             UDP(sport=69,dport=51000)/TFTP()/
> >             TFTP_ERROR(errorcode=4,errormsg=b'Access violation')
>
> Hi Wilson,
>
> From an L4 perspective the above packet flow seems very normal -
> the src/dst addresses and ports are reversed. So I am curious
> to know what it is about TFTP - other than the RFC reference above -
> that makes this special from an OVS PoV.
>
> In other words, why does parent == entry in this case,
> but, I assume, not for other Ether/IPv4/UDP cases?
>
> >
> > Sending the packet via ovs CMD,
> > ovs-ofctl packet-out br-int 1 'resubmit(,0)'
> >
> 00155D04A0A800155D04A0A7080045000033000100004011E51AC0A80A26C0A80A28C7380045001'
> > FBFCC00014F5653494D5F434552542E636572006F6374657400
> >
> > ovs-ofctl packet-out br-int 1 'resubmit(,0)'
> >
> 00155D04A0A700155D04A0A8080045000031000100004011E51CC0A80A28C0A80A260045C73800'
> > 1DB033000500044163636573732076696F6C6174696F6E2E00
> >
> > Without the fix, Windows node will HANG and not responding.
> >
> > It is good to backport the fix to main and backported until 3.1
> >
> > Signed-off-by: Wilson Peng <pweisong@vmware.com>
> > ---
> >  datapath-windows/ovsext/Conntrack.c | 7 +++++--
> >  1 file changed, 5 insertions(+), 2 deletions(-)
> >
> > diff --git a/datapath-windows/ovsext/Conntrack.c
> b/datapath-windows/ovsext/Conntrack.c
> > index 39ba5cc10..17a1257f6 100644
> > --- a/datapath-windows/ovsext/Conntrack.c
> > +++ b/datapath-windows/ovsext/Conntrack.c
> > @@ -410,6 +410,8 @@ OvsCtEntryCreate(OvsForwardingContext *fwdCtx,
> >      }
> >      if (state != OVS_CS_F_INVALID && commit) {
> >          if (entry) {
> > +            /* make sure that parentEntry is not equal to entry*/
> > +            ASSERT(!parentEntry || (parentEntry != entry));
> >              entry->parent = parentEntry;
> >              if (OvsCtAddEntry(entry, ctx, natInfo, currentTime)) {
> >                  *entryCreated = TRUE;
> > @@ -1033,8 +1035,9 @@ OvsProcessConntrackEntry(OvsForwardingContext
> *fwdCtx,
> >              } else {
> >                  POVS_CT_ENTRY parentEntry;
> >                  parentEntry = OvsCtRelatedLookup(ctx->key, currentTime);
> > -                entry->parent = parentEntry;
> > -                if (parentEntry != NULL) {
> > +                /* make sure that parentEntry is not equal to entry*/
> > +                if ((parentEntry != NULL) && (parentEntry != entry)) {
> > +                    entry->parent = parentEntry;
> >                      state |= OVS_CS_F_RELATED;
> >                  }
> >              }
> > --
> > 2.39.2 (Apple Git-143)
> >
> > _______________________________________________
> > dev mailing list
> > dev@openvswitch.org
> > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
> >
> _______________________________________________
> dev mailing list
> dev@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>
Simon Horman June 6, 2024, 12:56 p.m. UTC | #4
On Wed, Jun 05, 2024 at 09:26:02PM +0800, Wilson Peng wrote:
> Hi, Simon,
>     In this case, for tftp packet processing, it does have a child-parent
> processing logic just like ftp in tcp.
> Tftp packet1 from port1 to 69 and it will create one new conntrack entry
> and create one related conntrack
> entry also with src-port 0 and dst-port port1.   Original idea is if the
> tftp reply packet is from port2 to port1 and
> then it will create one new conntrack entry then it could set this
> conntrack entry(port2->port1) parent be equal
> to  conntrack_entry(port1-->69).
> 
>     In the processing part, if the tftp reply packet is from 69 to port1
> then it would hit the original conntrack entry
> (port1-->69) and lead to conntrack_entry parent be equal to itself.
> 
>     Only ftp/tftp traffic processing will have parent-child logic.

Hi Wilson,

Thanks for the explanation.

I understand that this patch will prevent OVS from crashing (good!).  But
I'm unclear how flows that would have caused a crash will be handled by
conntrack with this patch present.  Could you shed some light on that too?
Wilson Peng June 6, 2024, 1:47 p.m. UTC | #5
Hi, Simon,

Below list why it would go into deadlock,

1). 1st place OvsCtExecute_()->OvsProcessConntrackEntry().   entry->parent
= parentEntry;(here parentEntry == entry)
2).  After it return back from OvsProcessConntrackEntry(), it goes to the
lines below,(Still on function OvsCtExecute_())
     1263     if (entry == NULL) {
1264         return status;
1265     }
1266
1267     /*
1268      * Note that natInfo is not the same as entry->natInfo here.
natInfo
1269      * is decided by action in the openflow rule, entry->natInfo is
decided
1270      * when the entry is created. In the reverse NAT case, natInfo is
1271      * NAT_ACTION_REVERSE, yet entry->natInfo is NAT_ACTION_SRC or
1272      * NAT_ACTION_DST without NAT_ACTION_REVERSE
1273      */
1274     KIRQL irql = KeGetCurrentIrql();
1275     OVS_ACQUIRE_SPIN_LOCK(&(entry->lock), irql);
------>SPIN_LOCK 1st time
1276     if (natInfo->natAction != NAT_ACTION_NONE) {
1277         OvsNatPacket(fwdCtx, entry, entry->natInfo.natAction,
1278                      key, ctx.reply);
1279     }

3, Then it will goto the next same SPIN_LOCK for then lines below(Still on
function OvsCtExecute_())
1340     /* Add original tuple information to flow Key */
1341     if (entry->key.dl_type == ntohs(ETH_TYPE_IPV4)) {
1342         if (parent != NULL) {
1343             OVS_ACQUIRE_SPIN_LOCK(&(parent->lock), irql);
 --->>SPIN_LOCK 2nd time causing deadlock
1344             OvsCtUpdateTuple(key, &parent->key);
1345             OVS_RELEASE_SPIN_LOCK(&(parent->lock), irql);
1346         } else {
1347             OvsCtUpdateTuple(key, &entry->key);
1348         }
...

1365     OVS_RELEASE_SPIN_LOCK(&(entry->lock), irql);      --->this is the
SPIN_LOCK release place.

Regards
Wilson

On Thu, Jun 6, 2024 at 8:56 PM Simon Horman <horms@ovn.org> wrote:

> On Wed, Jun 05, 2024 at 09:26:02PM +0800, Wilson Peng wrote:
> > Hi, Simon,
> >     In this case, for tftp packet processing, it does have a child-parent
> > processing logic just like ftp in tcp.
> > Tftp packet1 from port1 to 69 and it will create one new conntrack entry
> > and create one related conntrack
> > entry also with src-port 0 and dst-port port1.   Original idea is if the
> > tftp reply packet is from port2 to port1 and
> > then it will create one new conntrack entry then it could set this
> > conntrack entry(port2->port1) parent be equal
> > to  conntrack_entry(port1-->69).
> >
> >     In the processing part, if the tftp reply packet is from 69 to port1
> > then it would hit the original conntrack entry
> > (port1-->69) and lead to conntrack_entry parent be equal to itself.
> >
> >     Only ftp/tftp traffic processing will have parent-child logic.
>
> Hi Wilson,
>
> Thanks for the explanation.
>
> I understand that this patch will prevent OVS from crashing (good!).  But
> I'm unclear how flows that would have caused a crash will be handled by
> conntrack with this patch present.  Could you shed some light on that too?
>
Alin Serdean Aug. 5, 2024, 10:47 p.m. UTC | #6
Hi Wilson,

Apologies for the delay and thank you for your patch.

Its really hard for me to wrap my head around how this issue occurred.

Would it be possible to reproduce the issue and run the !locks command inside WinDbg and provide the output?

Thank you,
Alin.
Wilson Peng Oct. 10, 2024, 10:07 a.m. UTC | #7
Hi, Alin,
    I have explained the issue in this thread below(
https://patchwork.ozlabs.org/project/openvswitch/patch/20240605053552.17478-1-svc.ovs-community@vmware.com/).


    And also I have supplied the reproducing CMD for this hang issue.

    If it is constructing some TFTP (request/reply) packet to inject into
the ovs windows setup,  the WIndows node will
 hange. In a simple word, the SPIN_LOCK will be called twice on the same
conntrack entry if the entry is equal to parent
entry.( this is possible for TFTP packet handling).

   The TFTP packet logic is added around 2022, this patch would fix the
potential hang issue with the fix.

Regards
Wilson

on datapath-windows/ovsext/Conntrack.c OvsCtExecute_() function

1275     OVS_ACQUIRE_SPIN_LOCK(&(entry->lock), irql);  ------>SPIN_LOCK 1st time

1343            OVS_ACQUIRE_SPIN_LOCK(&(parent->lock), irql);
--->>SPIN_LOCK 2nd time causing deadlock


On Tue, Aug 6, 2024 at 6:48 AM Alin Serdean <alinserdean@gmail.com> wrote:

> Hi Wilson,
>
> Apologies for the delay and thank you for your patch.
>
> Its really hard for me to wrap my head around how this issue occurred.
>
> Would it be possible to reproduce the issue and run the !locks command
> inside WinDbg and provide the output?
>
> Thank you,
> Alin.
> ------------------------------
> *From:* dev <ovs-dev-bounces@openvswitch.org> on behalf of Wilson Peng
> via dev <ovs-dev@openvswitch.org>
> *Sent:* Thursday, June 6, 2024 3:47 PM
> *To:* Simon Horman <horms@ovn.org>; dev@openvswitch.org <
> dev@openvswitch.org>
> *Subject:* Re: [ovs-dev] [PATCH v2 1/1] datapath-windows : Avoid a
> deadlock when processing TFTP conntrack.
>
> Hi, Simon,
>
> Below list why it would go into deadlock,
>
> 1). 1st place OvsCtExecute_()->OvsProcessConntrackEntry().   entry->parent
> = parentEntry;(here parentEntry == entry)
> 2).  After it return back from OvsProcessConntrackEntry(), it goes to the
> lines below,(Still on function OvsCtExecute_())
>      1263     if (entry == NULL) {
> 1264         return status;
> 1265     }
> 1266
> 1267     /*
> 1268      * Note that natInfo is not the same as entry->natInfo here.
> natInfo
> 1269      * is decided by action in the openflow rule, entry->natInfo is
> decided
> 1270      * when the entry is created. In the reverse NAT case, natInfo is
> 1271      * NAT_ACTION_REVERSE, yet entry->natInfo is NAT_ACTION_SRC or
> 1272      * NAT_ACTION_DST without NAT_ACTION_REVERSE
> 1273      */
> 1274     KIRQL irql = KeGetCurrentIrql();
> 1275     OVS_ACQUIRE_SPIN_LOCK(&(entry->lock), irql);
> ------>SPIN_LOCK 1st time
> 1276     if (natInfo->natAction != NAT_ACTION_NONE) {
> 1277         OvsNatPacket(fwdCtx, entry, entry->natInfo.natAction,
> 1278                      key, ctx.reply);
> 1279     }
>
> 3, Then it will goto the next same SPIN_LOCK for then lines below(Still on
> function OvsCtExecute_())
> 1340     /* Add original tuple information to flow Key */
> 1341     if (entry->key.dl_type == ntohs(ETH_TYPE_IPV4)) {
> 1342         if (parent != NULL) {
> 1343             OVS_ACQUIRE_SPIN_LOCK(&(parent->lock), irql);
>  --->>SPIN_LOCK 2nd time causing deadlock
> 1344             OvsCtUpdateTuple(key, &parent->key);
> 1345             OVS_RELEASE_SPIN_LOCK(&(parent->lock), irql);
> 1346         } else {
> 1347             OvsCtUpdateTuple(key, &entry->key);
> 1348         }
> ...
>
> 1365     OVS_RELEASE_SPIN_LOCK(&(entry->lock), irql);      --->this is the
> SPIN_LOCK release place.
>
> Regards
> Wilson
>
> On Thu, Jun 6, 2024 at 8:56 PM Simon Horman <horms@ovn.org> wrote:
>
> > On Wed, Jun 05, 2024 at 09:26:02PM +0800, Wilson Peng wrote:
> > > Hi, Simon,
> > >     In this case, for tftp packet processing, it does have a
> child-parent
> > > processing logic just like ftp in tcp.
> > > Tftp packet1 from port1 to 69 and it will create one new conntrack
> entry
> > > and create one related conntrack
> > > entry also with src-port 0 and dst-port port1.   Original idea is if
> the
> > > tftp reply packet is from port2 to port1 and
> > > then it will create one new conntrack entry then it could set this
> > > conntrack entry(port2->port1) parent be equal
> > > to  conntrack_entry(port1-->69).
> > >
> > >     In the processing part, if the tftp reply packet is from 69 to
> port1
> > > then it would hit the original conntrack entry
> > > (port1-->69) and lead to conntrack_entry parent be equal to itself.
> > >
> > >     Only ftp/tftp traffic processing will have parent-child logic.
> >
> > Hi Wilson,
> >
> > Thanks for the explanation.
> >
> > I understand that this patch will prevent OVS from crashing (good!).  But
> > I'm unclear how flows that would have caused a crash will be handled by
> > conntrack with this patch present.  Could you shed some light on that
> too?
> >
>
> --
> This electronic communication and the information and any files
> transmitted
> with it, or attached to it, are confidential and are intended solely for
> the use of the individual or entity to whom it is addressed and may
> contain
> information that is confidential, legally privileged, protected by privacy
> laws, or otherwise restricted from disclosure to anyone else. If you are
> not the intended recipient or the person responsible for delivering the
> e-mail to the intended recipient, you are hereby notified that any use,
> copying, distributing, dissemination, forwarding, printing, or copying of
> this e-mail is strictly prohibited. If you received this e-mail in error,
> please return the e-mail to the sender, delete it from your computer, and
> destroy any printed copy of it.
> _______________________________________________
> dev mailing list
> dev@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>
diff mbox series

Patch

diff --git a/datapath-windows/ovsext/Conntrack.c b/datapath-windows/ovsext/Conntrack.c
index 39ba5cc10..17a1257f6 100644
--- a/datapath-windows/ovsext/Conntrack.c
+++ b/datapath-windows/ovsext/Conntrack.c
@@ -410,6 +410,8 @@  OvsCtEntryCreate(OvsForwardingContext *fwdCtx,
     }
     if (state != OVS_CS_F_INVALID && commit) {
         if (entry) {
+            /* make sure that parentEntry is not equal to entry*/
+            ASSERT(!parentEntry || (parentEntry != entry));
             entry->parent = parentEntry;
             if (OvsCtAddEntry(entry, ctx, natInfo, currentTime)) {
                 *entryCreated = TRUE;
@@ -1033,8 +1035,9 @@  OvsProcessConntrackEntry(OvsForwardingContext *fwdCtx,
             } else {
                 POVS_CT_ENTRY parentEntry;
                 parentEntry = OvsCtRelatedLookup(ctx->key, currentTime);
-                entry->parent = parentEntry;
-                if (parentEntry != NULL) {
+                /* make sure that parentEntry is not equal to entry*/
+                if ((parentEntry != NULL) && (parentEntry != entry)) {
+                    entry->parent = parentEntry;
                     state |= OVS_CS_F_RELATED;
                 }
             }