diff mbox series

[ovs-dev,net-next,2/3] net: sched: set IPS_CONFIRMED in tmpl status only when commit is set in act_ct

Message ID 4ffd82b3acc34ebd09855a26eb148fcd59fa872c.1689541664.git.lucien.xin@gmail.com
State Handled Elsewhere
Headers show
Series net: handle the exp removal problem with ovs upcall properly | expand

Commit Message

Xin Long July 16, 2023, 9:09 p.m. UTC
With the following flows, the packets will be dropped if OVS TC offload is
enabled.

  'ip,ct_state=-trk,in_port=1 actions=ct(zone=1)'
  'ip,ct_state=+trk+new+rel,in_port=1 actions=ct(commit,zone=1)'
  'ip,ct_state=+trk+new+rel,in_port=1 actions=ct(commit,zone=2),normal'

In the 1st flow, it finds the exp from the hashtable and removes it then
creates the ct with this exp in act_ct. However, in the 2nd flow it goes
to the OVS upcall at the 1st time. When the skb comes back from userspace,
it has to create the ct again without exp(the exp was removed last time).
With no 'rel' set in the ct, the 3rd flow can never get matched.

In OVS conntrack, it works around it by adding its own exp lookup function
ovs_ct_expect_find() where it doesn't remove the exp. Instead of creating
a real ct, it only updates its keys with the exp and its master info. So
when the skb comes back, the exp is still in the hashtable.

However, we can't do this trick in act_ct, as tc flower match is using a
real ct, and passing the exp and its master info to flower parsing via
tc_skb_cb is also not possible (tc_skb_cb size is not big enough).

The simple and clear fix is to not remove the exp at the 1st flow, namely,
not set IPS_CONFIRMED in tmpl when commit is not set in act_ct.

Reported-by: Shuang Li <shuali@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
---
 net/sched/act_ct.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

Comments

Aaron Conole July 19, 2023, 4:07 p.m. UTC | #1
Xin Long <lucien.xin@gmail.com> writes:

> With the following flows, the packets will be dropped if OVS TC offload is
> enabled.
>
>   'ip,ct_state=-trk,in_port=1 actions=ct(zone=1)'
>   'ip,ct_state=+trk+new+rel,in_port=1 actions=ct(commit,zone=1)'
>   'ip,ct_state=+trk+new+rel,in_port=1 actions=ct(commit,zone=2),normal'
>
> In the 1st flow, it finds the exp from the hashtable and removes it then
> creates the ct with this exp in act_ct. However, in the 2nd flow it goes
> to the OVS upcall at the 1st time. When the skb comes back from userspace,
> it has to create the ct again without exp(the exp was removed last time).
> With no 'rel' set in the ct, the 3rd flow can never get matched.
>
> In OVS conntrack, it works around it by adding its own exp lookup function
> ovs_ct_expect_find() where it doesn't remove the exp. Instead of creating
> a real ct, it only updates its keys with the exp and its master info. So
> when the skb comes back, the exp is still in the hashtable.
>
> However, we can't do this trick in act_ct, as tc flower match is using a
> real ct, and passing the exp and its master info to flower parsing via
> tc_skb_cb is also not possible (tc_skb_cb size is not big enough).
>
> The simple and clear fix is to not remove the exp at the 1st flow, namely,
> not set IPS_CONFIRMED in tmpl when commit is not set in act_ct.
>
> Reported-by: Shuang Li <shuali@redhat.com>
> Signed-off-by: Xin Long <lucien.xin@gmail.com>
> ---

Acked-by: Aaron Conole <aconole@redhat.com>
Davide Caratti July 19, 2023, 4:44 p.m. UTC | #2
On Sun, Jul 16, 2023 at 05:09:18PM -0400, Xin Long wrote:
> With the following flows, the packets will be dropped if OVS TC offload is
> enabled.

[...]
> 
> The simple and clear fix is to not remove the exp at the 1st flow, namely,
> not set IPS_CONFIRMED in tmpl when commit is not set in act_ct.
> 
> Reported-by: Shuang Li <shuali@redhat.com>
> Signed-off-by: Xin Long <lucien.xin@gmail.com>

Reviewed-by: Davide Caratti <dcaratti@redhat.com>
diff mbox series

Patch

diff --git a/net/sched/act_ct.c b/net/sched/act_ct.c
index abc71a06d634..7c652d14528b 100644
--- a/net/sched/act_ct.c
+++ b/net/sched/act_ct.c
@@ -1238,7 +1238,8 @@  static int tcf_ct_fill_params(struct net *net,
 		}
 	}
 
-	__set_bit(IPS_CONFIRMED_BIT, &tmpl->status);
+	if (p->ct_action & TCA_CT_ACT_COMMIT)
+		__set_bit(IPS_CONFIRMED_BIT, &tmpl->status);
 	return 0;
 err:
 	nf_ct_put(p->tmpl);