Message ID | 20210917215602.10633-1-odivlad@gmail.com |
---|---|
State | Accepted |
Series | [ovs-dev] northd: support HW VTEP with stateful datapath |
Context | Check | Description |
---|---|---|
ovsrobot/apply-robot | success | apply and check: success |
ovsrobot/github-robot-_Build_and_Test | success | github build: passed |
ovsrobot/github-robot-_ovn-kubernetes | fail | github build: failed |
On Fri, Sep 17, 2021 at 5:56 PM Vladislav Odintsov <odivlad@gmail.com> wrote:
>
> A packet going from a HW VTEP device to a VIF port, when it arrives at
> the hypervisor chassis, should go through the LS ingress pipeline to the
> l2_lkp stage without any match. In the l2_lkp stage an output port is
> determined, and the packet is then passed to the LS egress pipeline for
> further processing and delivery to the VIF port.
>
> Prior to this commit, a packet received from a HW VTEP device was
> dropped in the LS ingress pipeline of a datapath where stateful
> services (ACLs, LBs) were defined.
>
> To fix this issue we add a special flag bit which can be used in LS
> pipelines to check whether the packet came from a HW VTEP device.
> In ls_in_pre_acl and ls_in_pre_lb we add a new flow with priority 110
> to skip such packets.
>
> Signed-off-by: Vladislav Odintsov <odivlad@gmail.com>

Thanks. I applied this patch to master and to the newly created
branch-21.09 (considering it as a bug fix).

I didn't backport to other branches. Let me know if you need
backports to other branches.

I applied with the below changes

--------------
diff --git a/northd/ovn-northd.8.xml b/northd/ovn-northd.8.xml
index 7bb39d2ab..39f4eaa0c 100644
--- a/northd/ovn-northd.8.xml
+++ b/northd/ovn-northd.8.xml
@@ -263,16 +263,14 @@
         packets that match the <code>inport</code>.
       </li>
       <li>
-        Logical flows for RAMP (controller-vtep) devices are created for each
-        physical switch. Packets came from such devices hit these flows and set
-        the 14'th bit of OVS register 0 (REG0[14]) to 1. This regbit indicates
-        that packet came from RAMP (controller-vtep) device. Later in logical
-        switch ingress pipeline this register is checked in ls_in_acl_pre and
-        ls_in_lb_pre stages whether to skip sending packet to conntrack in
-        ingress pipeline or not. Packets from RAMP devices should go though
-        ingress pipeline without any flow match till ls_in_l2_lkup stage to
-        determine output port. Stateful ACLs for coming from RAMP device
-        packets are checked within logical switch egress pipeline.
+        For logical ports of type <code>vtep</code>, the above logical flow
+        will also apply the action <code>REGBIT_FROM_RAMP = 1;</code> to
+        indicate that the packet is coming from a RAMP (controller-vtep)
+        device. Later pipelines will use this information to skip
+        sending the packet to the conntrack. Packets from <code>vtep</code>
+        logical ports should go though ingress pipeline only to determine
+        the output port and they should not be subjected to any ACL checks.
+        Egress pipeline will do the ACL checks.
       </li>
     </ul>
 
@@ -467,10 +465,11 @@
 
     <p>
       This table has a priority-110 flow with the match
-      <code>reg0[14] == 1</code> for all logical switch datapaths to resubmit
-      traffic to the next table. <code>reg0[14]</code> is the register bit,
-      which indicates that packet was received from RAMP device. Packets from
-      RAMP device are handled by ACLs only in Logical Switch egress pipeline.
+      <code>REGBIT_FROM_RAMP == 1</code> for all logical switch datapaths to
+      resubmit traffic to the next table. <code>REGBIT_FROM_RAMP</code>
+      indicates that packet was received from <code>vtep</code> logical ports
+      and it can be skipped from the stateful ACL processing in the ingress
+      pipeline.
     </p>
 
     <p>
@@ -534,11 +533,11 @@
 
     <p>
       This table has a priority-110 flow with the match
-      <code>reg0[14] == 1</code> for all logical switch datapaths to resubmit
-      traffic to the next table. <code>reg0[14]</code> is the register bit,
-      which indicates that packet was received from RAMP device. Packets from
-      RAMP device could be handled by load balancing flows only in Logical
-      Switch egress pipeline.
+      <code>REGBIT_FROM_RAMP == 1</code> for all logical switch datapaths to
+      resubmit traffic to the next table. <code>REGBIT_FROM_RAMP</code>
+      indicates that packet was received from <code>vtep</code> logical ports
+      and it can be skipped from the load balancer processing in the ingress
+      pipeline.
     </p>
 
     <p>
--------------------

Numan
Hi Numan,

thanks. I’m okay with your changes.

Recently I’ve seen a report about this problem with RAMP/VTEP on the list,
so since it’s a bugfix, I think it would be great to backport it to the
older branches. Though there are a lot of conflicts with older branches,
I’ve submitted the backport for 21.06 here:
https://patchwork.ozlabs.org/project/ovn/patch/20210918125121.8257-1-odivlad@gmail.com/

21.03 and older branches have more non-trivial conflicts, and backporting
should be done more carefully. If anyone needs that, they can try to do it
on their own.

Regards,
Vladislav Odintsov

> On 18 Sep 2021, at 04:04, Numan Siddique <numans@ovn.org> wrote:
>
> Thanks. I applied this patch to master and to the newly created
> branch-21.09 (considering it as a bug fix).
>
> I didn't backport to other branches. Let me know if you need
> backports to other branches.
diff --git a/northd/northd.c b/northd/northd.c
index 688a6e4ef..1b84874a7 100644
--- a/northd/northd.c
+++ b/northd/northd.c
@@ -196,6 +196,7 @@ enum ovn_stage {
 #define REGBIT_LKUP_FDB           "reg0[11]"
 #define REGBIT_HAIRPIN_REPLY      "reg0[12]"
 #define REGBIT_ACL_LABEL          "reg0[13]"
+#define REGBIT_FROM_RAMP          "reg0[14]"
 
 #define REG_ORIG_DIP_IPV4         "reg1"
 #define REG_ORIG_DIP_IPV6         "xxreg1"
@@ -5112,6 +5113,11 @@ build_lswitch_input_port_sec_op(
         if (queue_id) {
             ds_put_format(actions, "set_queue(%s); ", queue_id);
         }
+
+        if (!strcmp(op->nbsp->type, "vtep")) {
+            ds_put_format(actions, REGBIT_FROM_RAMP" = 1; ");
+        }
+
         ds_put_cstr(actions, "next;");
         ovn_lflow_add_with_lport_and_hint(lflows, op->od, S_SWITCH_IN_PORT_SEC_L2,
                                           50, ds_cstr(match), ds_cstr(actions),
@@ -5359,6 +5365,10 @@ build_pre_acls(struct ovn_datapath *od, struct hmap *port_groups,
                       "nd || nd_rs || nd_ra || mldv1 || mldv2 || "
                       "(udp && udp.src == 546 && udp.dst == 547)", "next;");
 
+        /* Do not send coming from RAMP switch packets to conntrack. */
+        ovn_lflow_add(lflows, od, S_SWITCH_IN_PRE_ACL, 110,
+                      REGBIT_FROM_RAMP" == 1", "next;");
+
         /* Ingress and Egress Pre-ACL Table (Priority 100).
          *
          * Regardless of whether the ACL is "from-lport" or "to-lport",
@@ -5463,6 +5473,10 @@ build_pre_lb(struct ovn_datapath *od, struct hmap *lflows,
     ovn_lflow_add(lflows, od, S_SWITCH_OUT_PRE_LB, 110,
                   "eth.src == $svc_monitor_mac", "next;");
 
+    /* Do not send coming from RAMP switch packets to conntrack. */
+    ovn_lflow_add(lflows, od, S_SWITCH_IN_PRE_LB, 110,
+                  REGBIT_FROM_RAMP" == 1", "next;");
+
     /* Allow all packets to go to next tables by default. */
     ovn_lflow_add(lflows, od, S_SWITCH_IN_PRE_LB, 0, "1", "next;");
     ovn_lflow_add(lflows, od, S_SWITCH_OUT_PRE_LB, 0, "1", "next;");
diff --git a/northd/ovn-northd.8.xml b/northd/ovn-northd.8.xml
index eebf0d717..7bb39d2ab 100644
--- a/northd/ovn-northd.8.xml
+++ b/northd/ovn-northd.8.xml
@@ -262,6 +262,18 @@
         logical ports on which port security is not enabled, these advance all
         packets that match the <code>inport</code>.
       </li>
+      <li>
+        Logical flows for RAMP (controller-vtep) devices are created for each
+        physical switch. Packets came from such devices hit these flows and set
+        the 14'th bit of OVS register 0 (REG0[14]) to 1. This regbit indicates
+        that packet came from RAMP (controller-vtep) device. Later in logical
+        switch ingress pipeline this register is checked in ls_in_acl_pre and
+        ls_in_lb_pre stages whether to skip sending packet to conntrack in
+        ingress pipeline or not. Packets from RAMP devices should go though
+        ingress pipeline without any flow match till ls_in_l2_lkup stage to
+        determine output port. Stateful ACLs for coming from RAMP device
+        packets are checked within logical switch egress pipeline.
+      </li>
     </ul>
 
     <p>
@@ -453,6 +465,14 @@
       processing.
     </p>
 
+    <p>
+      This table has a priority-110 flow with the match
+      <code>reg0[14] == 1</code> for all logical switch datapaths to resubmit
+      traffic to the next table. <code>reg0[14]</code> is the register bit,
+      which indicates that packet was received from RAMP device. Packets from
+      RAMP device are handled by ACLs only in Logical Switch egress pipeline.
+    </p>
+
     <p>
       This table also has a priority-110 flow with the match
       <code>eth.dst == <var>E</var></code> for all logical switch
@@ -512,6 +532,15 @@
       configured. We can now add a lflow to drop ct.inv packets.
     </p>
 
+    <p>
+      This table has a priority-110 flow with the match
+      <code>reg0[14] == 1</code> for all logical switch datapaths to resubmit
+      traffic to the next table. <code>reg0[14]</code> is the register bit,
+      which indicates that packet was received from RAMP device. Packets from
+      RAMP device could be handled by load balancing flows only in Logical
+      Switch egress pipeline.
+    </p>
+
     <p>
       This table also has a priority-110 flow with the match
       <code>eth.dst == <var>E</var></code> for all logical switch
diff --git a/northd/ovn_northd.dl b/northd/ovn_northd.dl
index 669728497..0202af5dc 100644
--- a/northd/ovn_northd.dl
+++ b/northd/ovn_northd.dl
@@ -1631,6 +1631,7 @@ function rEGBIT_ACL_HINT_BLOCK() : istring = i"reg0[10]"
 function rEGBIT_LKUP_FDB()        : istring = i"reg0[11]"
 function rEGBIT_HAIRPIN_REPLY()   : istring = i"reg0[12]"
 function rEGBIT_ACL_LABEL()       : istring = i"reg0[13]"
+function rEGBIT_FROM_RAMP()       : istring = i"reg0[14]"
 
 function rEG_ORIG_DIP_IPV4()      : istring = i"reg1"
 function rEG_ORIG_DIP_IPV6()      : istring = i"xxreg1"
@@ -2070,6 +2071,16 @@ for (&Switch(._uuid = ls_uuid, .has_stateful_acl = true)) {
          .io_port          = None,
          .controller_meter = None);
 
+    /* Do not send coming from RAMP switch packets to conntrack. */
+    Flow(.logical_datapath = ls_uuid,
+         .stage            = s_SWITCH_IN_PRE_ACL(),
+         .priority         = 110,
+         .__match          = i"${rEGBIT_FROM_RAMP()} == 1",
+         .actions          = i"next;",
+         .stage_hint       = 0,
+         .io_port          = None,
+         .controller_meter = None);
+
     /* Ingress and Egress Pre-ACL Table (Priority 100).
      *
      * Regardless of whether the ACL is "from-lport" or "to-lport",
@@ -2136,6 +2147,16 @@ for (&Switch(._uuid = ls_uuid)) {
          .io_port          = None,
          .controller_meter = None);
 
+    /* Do not send coming from RAMP switch packets to conntrack. */
+    Flow(.logical_datapath = ls_uuid,
+         .stage            = s_SWITCH_IN_PRE_LB(),
+         .priority         = 110,
+         .__match          = i"${rEGBIT_FROM_RAMP()} == 1",
+         .actions          = i"next;",
+         .stage_hint       = 0,
+         .io_port          = None,
+         .controller_meter = None);
+
     /* Allow all packets to go to next tables by default. */
     Flow(.logical_datapath = ls_uuid,
          .stage            = s_SWITCH_IN_PRE_LB(),
@@ -3361,10 +3382,18 @@ for (&SwitchPort(.lsp = lsp, .sw = sw, .json_name = json_name, .ps_eth_addresses
         } else {
             i"inport == ${json_name} && eth.src == {${ps_eth_addresses.join(\" \")}}"
         } in
-        var actions = match (pbinding.options.get(i"qdisc_queue_id")) {
+        var actions = {
+            var ramp = if (lsp.__type == i"vtep") {
+                i"${rEGBIT_FROM_RAMP()} = 1; "
+            } else {
+                i""
+            };
+            var queue = match (pbinding.options.get(i"qdisc_queue_id")) {
             None -> i"next;",
             Some{id} -> i"set_queue(${id}); next;"
-        } in
+            };
+            i"${ramp}${queue}"
+        } in
         Flow(.logical_datapath = sw._uuid,
              .stage            = s_SWITCH_IN_PORT_SEC_L2(),
              .priority         = 50,
diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
index 2af3f2096..5de554455 100644
--- a/tests/ovn-northd.at
+++ b/tests/ovn-northd.at
@@ -3597,6 +3597,7 @@ check_stateful_flows() {
   table=6 (ls_in_pre_lb       ), priority=110  , match=(eth.dst == $svc_monitor_mac), action=(next;)
   table=6 (ls_in_pre_lb       ), priority=110  , match=(ip && inport == "sw0-lr0"), action=(next;)
   table=6 (ls_in_pre_lb       ), priority=110  , match=(nd || nd_rs || nd_ra || mldv1 || mldv2), action=(next;)
+  table=6 (ls_in_pre_lb       ), priority=110  , match=(reg0[[14]] == 1), action=(next;)
 ])
 
 AT_CHECK([grep "ls_in_pre_stateful" sw0flows | sort], [0], [dnl
@@ -3660,6 +3661,7 @@ AT_CHECK([grep "ls_in_pre_lb" sw0flows | sort], [0], [dnl
   table=6 (ls_in_pre_lb       ), priority=110  , match=(eth.dst == $svc_monitor_mac), action=(next;)
   table=6 (ls_in_pre_lb       ), priority=110  , match=(ip && inport == "sw0-lr0"), action=(next;)
   table=6 (ls_in_pre_lb       ), priority=110  , match=(nd || nd_rs || nd_ra || mldv1 || mldv2), action=(next;)
+  table=6 (ls_in_pre_lb       ), priority=110  , match=(reg0[[14]] == 1), action=(next;)
 ])
 
 AT_CHECK([grep "ls_in_pre_stateful" sw0flows | sort], [0], [dnl
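For readers who want to see what action string the port-security hunk in northd.c ends up producing, here is a minimal standalone sketch. It is not the northd code: `build_port_sec_actions` is a hypothetical helper, plain `snprintf` stands in for the `ds_put_format`/`ds_put_cstr` calls, and the `port_type` and `queue_id` parameters only model `op->nbsp->type` and the queue id from the patch.

```c
#include <stdio.h>
#include <string.h>

#define REGBIT_FROM_RAMP "reg0[14]"

/* Hypothetical helper (not part of OVN): composes the ls_in_port_sec_l2
 * action string in the same order as the C hunk of the patch: optional
 * set_queue(), then the RAMP flag for "vtep" ports, then "next;".
 * queue_id may be NULL. Returns the snprintf-style length. */
int
build_port_sec_actions(char *buf, size_t size, const char *port_type,
                       const char *queue_id)
{
    const int is_vtep = port_type && !strcmp(port_type, "vtep");

    return snprintf(buf, size, "%s%s%s%snext;",
                    queue_id ? "set_queue(" : "",
                    queue_id ? queue_id : "",
                    queue_id ? "); " : "",
                    is_vtep ? REGBIT_FROM_RAMP " = 1; " : "");
}
```

Under these assumptions a `vtep` port without a queue yields `reg0[14] = 1; next;`, while any other port type yields the same actions as before the patch.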
A packet going from a HW VTEP device to a VIF port, when it arrives at
the hypervisor chassis, should go through the LS ingress pipeline to the
l2_lkp stage without any match. In the l2_lkp stage an output port is
determined, and the packet is then passed to the LS egress pipeline for
further processing and delivery to the VIF port.

Prior to this commit, a packet received from a HW VTEP device was
dropped in the LS ingress pipeline of a datapath where stateful
services (ACLs, LBs) were defined.

To fix this issue we add a special flag bit which can be used in LS
pipelines to check whether the packet came from a HW VTEP device.
In ls_in_pre_acl and ls_in_pre_lb we add a new flow with priority 110
to skip such packets.

Signed-off-by: Vladislav Odintsov <odivlad@gmail.com>
---
 northd/northd.c         | 14 ++++++++++++++
 northd/ovn-northd.8.xml | 29 +++++++++++++++++++++++++++++
 northd/ovn_northd.dl    | 33 +++++++++++++++++++++++++++++++--
 tests/ovn-northd.at     |  2 ++
 4 files changed, 76 insertions(+), 2 deletions(-)
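The effect of the priority-110 flow can be illustrated with a toy model of a single pipeline table. This is only a sketch under stated assumptions: the structs, the `lookup` function, and the reduction of the priority-100 rule to "IP traffic goes to conntrack" are all hypothetical simplifications, not OVN data structures. The only things taken from the patch are the flag bit (bit 14 of reg0), the priority 110, and the `next;` action.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define FROM_RAMP_BIT 14   /* REGBIT_FROM_RAMP is reg0[14] in the patch. */

/* Toy packet metadata: only the fields this model needs. */
struct packet {
    uint32_t reg0;
    bool is_ip;
};

enum action { ACT_DROP, ACT_NEXT, ACT_CONNTRACK };

struct flow {
    int priority;
    bool (*match)(const struct packet *);
    enum action action;
};

static bool match_from_ramp(const struct packet *p)
{
    return (p->reg0 >> FROM_RAMP_BIT) & 1u;
}

static bool match_ip(const struct packet *p) { return p->is_ip; }
static bool match_any(const struct packet *p) { (void) p; return true; }

/* Simplified stand-in for ls_in_pre_acl on a switch with stateful ACLs. */
const struct flow pre_acl_table[] = {
    { 0,   match_any,       ACT_NEXT },
    { 100, match_ip,        ACT_CONNTRACK }, /* assumed simplification */
    { 110, match_from_ramp, ACT_NEXT },      /* flow added by the patch */
};

/* Returns the action of the highest-priority matching flow. */
enum action
lookup(const struct flow *table, size_t n, const struct packet *p)
{
    const struct flow *best = NULL;

    for (size_t i = 0; i < n; i++) {
        if (table[i].match(p)
            && (!best || table[i].priority > best->priority)) {
            best = &table[i];
        }
    }
    return best ? best->action : ACT_DROP;
}
```

In this model an IP packet with bit 14 of reg0 set hits the priority-110 flow and advances with `ACT_NEXT`, while the same packet without the bit is selected by the priority-100 flow and marked for conntrack.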