Message ID | 20240108131243.53816-1-priyankar.jain@nutanix.com |
---|---|
State | Changes Requested |
Series | [ovs-dev] northd: Added lb_vip_mac config option in Logical_Switch. |
Context | Check | Description |
---|---|---|
ovsrobot/apply-robot | warning | apply and check: warning |
ovsrobot/github-robot-_Build_and_Test | fail | github build: failed |
ovsrobot/github-robot-_ovn-kubernetes | success | github build: passed |
References: <20240108131243.53816-1-priyankar.jain@nutanix.com>

Bleep bloop. Greetings Priyankar Jain, I am a robot and I have tried out your patch.
Thanks for your contribution.

I encountered some error that I wasn't expecting. See the details below.

checkpatch:
WARNING: Line is 93 characters long (recommended limit is 79)
#157 FILE: northd/ovn-northd.8.xml:1624:
    <ref column="other_config:lb_vip_mac" table="Logical_Switch" db="OVN_Northbound"/>,

Lines checked: 326, Warnings: 1, Errors: 0

Please check this out. If you feel there has been an error, please email aconole@redhat.com

Thanks,
0-day Robot
On Mon, Jan 8, 2024 at 8:13 AM Priyankar Jain <priyankar.jain@nutanix.com> wrote: > > Currently load balancer applied to a logical switch has the > following restriction: > - VIP of the load balancer cannot reside in the subnet prefix as the > clients as OVN does not install ARP responder flows for the LB VIP. > Hi Priyankar, Sorry for the late reviews. The above statement is actually not correct. OVN does allow the VIP of the load balancer to be from the same subnet prefix. But in order for this to work, this logical switch has to be associated with a logical router. Eg. -------------------------- ovn-nbctl ls-add sw0 ovn-nbctl lsp-add sw0 sw0-p1 ovn-nbctl lsp-set-addresses sw0-p1 "50:54:00:00:00:03 10.0.0.3" ovn-nbctl lsp-set-port-security sw0-p1 "50:54:00:00:00:03 10.0.0.3" ovn-nbctl lr-add lr0 ovn-nbctl lrp-add lr0 lr0-sw0 00:00:00:00:ff:01 10.0.0.1/24 ovn-nbctl lsp-add sw0 sw0-lr0 ovn-nbctl lsp-set-type sw0-lr0 router ovn-nbctl lsp-set-addresses sw0-lr0 router ovn-nbctl lsp-set-options sw0-lr0 router-port=lr0-sw0 ovn-nbctl lb-add lb1 10.0.0.10:80 10.0.0.3:80,20.0.0.3:80 ovn-nbctl --wait=sb ls-lb-add sw0 lb1 # This will work once you do ovn-nbctl lr-lb-add lr0 lb1 --------------------------- Do you have a use case where you attach a load balancer to a logical switch and this logical switch doesn't connect to any logical router ? If so, then perhaps we can consider this patch. However if you always attach a logical router to a logical switch (like how its done in the test case added in this patch), then just attaching the lb to the router would do. Can you please confirm if this is sufficient for your use case ? Perhaps we should document it in OVN :) Thanks Numan > This change adds a new config option "lb_vip_mac" in the logical_switch > table which is expected to be a MAC address. If the logical_switch has > this option configured, northd will program an ARP responder flow for > all the LB VIPs of the logical_switch with this MAC address. > > Usecase: With this change, CMS can set the lb_vip_mac value to same as > the default gateway MAC. This allows CMS to allocate VIP of the Load > balancer from any subnet prefix. 
> > Signed-off-by: Priyankar Jain <priyankar.jain@nutanix.com> > --- > northd/northd.c | 71 ++++++++++++++++++++++++++ > northd/northd.h | 2 + > northd/ovn-northd.8.xml | 49 ++++++++++++++++++ > tests/ovn.at | 109 ++++++++++++++++++++++++++++++++++++++++ > 4 files changed, 231 insertions(+) > > diff --git a/northd/northd.c b/northd/northd.c > index db3cd272e..ebca2c073 100644 > --- a/northd/northd.c > +++ b/northd/northd.c > @@ -790,8 +790,11 @@ init_lb_for_datapath(struct ovn_datapath *od) > { > if (od->nbs) { > od->has_lb_vip = ls_has_lb_vip(od); > + od->lb_vip_mac = nullable_xstrdup( > + smap_get(&od->nbs->other_config, "lb_vip_mac")); > } else { > od->has_lb_vip = lr_has_lb_vip(od); > + od->lb_vip_mac = NULL; > } > } > > @@ -800,6 +803,9 @@ destroy_lb_for_datapath(struct ovn_datapath *od) > { > ovn_lb_ip_set_destroy(od->lb_ips); > od->lb_ips = NULL; > + > + free(od->lb_vip_mac); > + od->lb_vip_mac = NULL; > } > > /* A group of logical router datapaths which are connected - either > @@ -12204,6 +12210,70 @@ build_lrouter_nat_flows_for_lb(struct ovn_lb_vip *lb_vip, > } > } > > +static void > +build_lb_rules_arp_nd_rsp(struct hmap *lflows, struct ovn_lb_datapaths *lb_dps, > + const struct ovn_datapaths *ls_datapaths, > + struct ds *match, struct ds *actions) > +{ > + if (!lb_dps->n_nb_ls) { > + return; > + } > + > + const struct ovn_northd_lb *lb = lb_dps->lb; > + for (size_t i = 0; i < lb->n_vips; i++) { > + struct ovn_lb_vip *lb_vip = &lb->vips[i]; > + > + size_t index; > + BITMAP_FOR_EACH_1 (index, ods_size(ls_datapaths), lb_dps->nb_ls_map) { > + struct ovn_datapath *od = ls_datapaths->array[index]; > + if (!od->lb_vip_mac) { > + continue; > + } > + ds_clear(match); > + ds_clear(actions); > + if (IN6_IS_ADDR_V4MAPPED(&lb_vip->vip)) { > + ds_put_format(match, "arp.tpa == %s && arp.op == 1", > + lb_vip->vip_str); > + ds_put_format(actions, > + "eth.dst = eth.src; " > + "eth.src = %s; " > + "arp.op = 2; /* ARP reply */ " > + "arp.tha = arp.sha; " > + "arp.sha = %s; " > + "arp.tpa = arp.spa; " > + "arp.spa = %s; " > + "outport = inport; " > + "flags.loopback = 1; " > + "output;", > + od->lb_vip_mac, od->lb_vip_mac, > + lb_vip->vip_str); > + } else { > + ds_put_format(match, "nd_ns && nd.target == %s", > + lb_vip->vip_str); > + ds_put_format(actions, > + "nd_na { " > + "eth.dst = eth.src; " > + "eth.src = %s; " > + "ip6.src = %s; " > + "nd.target = %s; " > + "nd.tll = %s; " > + "outport = inport; " > + "flags.loopback = 1; " > + "output; " > + "};", > + od->lb_vip_mac, > + lb_vip->vip_str, > + lb_vip->vip_str, > + od->lb_vip_mac); > + } > + ovn_lflow_add_with_hint(lflows, od, > + S_SWITCH_IN_ARP_ND_RSP, 130, > + ds_cstr(match), ds_cstr(actions), > + &lb->nlb->header_); > + } > + } > +} > + > static void > build_lswitch_flows_for_lb(struct ovn_lb_datapaths *lb_dps, > struct hmap *lflows, > @@ -12255,6 +12325,7 @@ build_lswitch_flows_for_lb(struct ovn_lb_datapaths *lb_dps, > ls_datapaths, match, action); > build_lb_rules(lflows, lb_dps, ls_datapaths, features, match, action, > meter_groups, svc_monitor_map); > + build_lb_rules_arp_nd_rsp(lflows, lb_dps, ls_datapaths, match, action); > } > > /* If there are any load balancing rules, we should send the packet to > diff --git a/northd/northd.h b/northd/northd.h > index 5be7b5384..3e1b24e2c 100644 > --- a/northd/northd.h > +++ b/northd/northd.h > @@ -262,6 +262,8 @@ struct ovn_datapath { > bool has_vtep_lports; > bool has_arp_proxy_port; > > + char *lb_vip_mac; > + > /* IPAM data. 
*/ > struct ipam_info ipam_info; > > diff --git a/northd/ovn-northd.8.xml b/northd/ovn-northd.8.xml > index 98cf7adb4..94daf47fb 100644 > --- a/northd/ovn-northd.8.xml > +++ b/northd/ovn-northd.8.xml > @@ -1618,6 +1618,55 @@ output; > </p> > </li> > > + <li> > + <p> > + If <var>E</var> is defined in the value of > + <ref column="other_config:lb_vip_mac" table="Logical_Switch" db="OVN_Northbound"/>, > + For each <var>VIP</var> defined in the value of the > + <ref column="vips" table="Load_Balancer" db="OVN_Northbound"/> > + column of <ref table="Load_Balancer" db="OVN_Northbound"/> table, > + priority-130 logical flow is added with the match > + <code>arp.tpa == <var>VIP</var> > + && && arp.op == 1</code> and applies the action > + </p> > + > + <pre> > +eth.dst = eth.src; > +eth.src = <var>E</var>; > +arp.op = 2; /* ARP reply. */ > +arp.tha = arp.sha; > +arp.sha = <var>E</var>; > +arp.tpa = arp.spa; > +arp.spa = <var>VIP</var>; > +outport = inport; > +flags.loopback = 1; > +output; > + </pre> > + > + <p> > + These flows are required if an ARP request is sent for the > + <var>VIP</var>. This enables CMS to have VIP allocated from > + the same subnet prefix as the clients. > + </p> > + > + <p> > + For IPv6 the similar flow is added with the following action > + </p> > + > + <pre> > +nd_na { > + eth.dst = eth.src; > + eth.src = <var>E</var>; > + ip6.src = <var>VIP</var>; > + nd.target = <var>VIP</var>; > + nd.tll = <var>E</var>; > + outport = inport; > + flags.loopback = 1; > + output; > +}; > + </pre> > + </li> > + > <li> > One priority-0 fallback flow that matches all packets and advances to > the next table. > diff --git a/tests/ovn.at b/tests/ovn.at > index 5615ba1a9..f25791d3f 100644 > --- a/tests/ovn.at > +++ b/tests/ovn.at > @@ -37524,3 +37524,112 @@ wait_for_ports_up > OVN_CLEANUP([hv1]) > AT_CLEANUP > ]) > + > +OVN_FOR_EACH_NORTHD([ > +AT_SETUP([Logical Switch lb_vip_mac - IPv4]) > +AT_KEYWORDS([lb]) > +ovn_start > + > +net_add n1 > + > +sim_add hv1 > +as hv1 > +ovs-vsctl add-br br-phys > +ovn_attach n1 br-phys 192.168.0.1 > +check ovs-vsctl -- add-port br-int hv1-vif1 -- \ > + set interface hv1-vif1 external-ids:iface-id=sw0-p1 \ > + options:tx_pcap=hv1/vif1-tx.pcap \ > + options:rxq_pcap=hv1/vif1-rx.pcap \ > + ofport-request=1 > +check ovs-vsctl -- add-port br-int hv1-vif2 -- \ > + set interface hv1-vif2 external-ids:iface-id=sw0-p2 \ > + options:tx_pcap=hv1/vif2-tx.pcap \ > + options:rxq_pcap=hv1/vif2-rx.pcap \ > + ofport-request=2 > + > +sim_add hv2 > +as hv2 > +check ovs-vsctl add-br br-phys > +ovn_attach n1 br-phys 192.168.0.2 > +check ovs-vsctl -- add-port br-int hv2-vif1 -- \ > + set interface hv2-vif1 external-ids:iface-id=sw1-p1 \ > + options:tx_pcap=hv2/vif1-tx.pcap \ > + options:rxq_pcap=hv2/vif1-rx.pcap \ > + ofport-request=1 > + > +check ovn-nbctl ls-add sw0 > + > +check ovn-nbctl lsp-add sw0 sw0-p1 > +check ovn-nbctl lsp-set-addresses sw0-p1 "50:54:00:00:00:03 10.0.0.3" > +check ovn-nbctl lsp-set-port-security sw0-p1 "50:54:00:00:00:03 10.0.0.3" > + > +# Create the second logical switch with one port > +check ovn-nbctl ls-add sw1 > +check ovn-nbctl lsp-add sw1 sw1-p1 > +check ovn-nbctl lsp-set-addresses sw1-p1 "40:54:00:00:00:03 20.0.0.3" > +check ovn-nbctl lsp-set-port-security sw1-p1 "40:54:00:00:00:03 20.0.0.3" > + > +OVN_SW0_ID=$(ovn-nbctl --bare --column _uuid find logical_switch name=sw0) > +OVN_SW1_ID=$(ovn-nbctl --bare --column _uuid find logical_switch name=sw1) > + > +# Create a logical router and attach both logical switches > +check ovn-nbctl lr-add lr0 > 
+check ovn-nbctl lrp-add lr0 lr0-sw0 00:00:00:00:ff:01 10.0.0.1/24 > +check ovn-nbctl lsp-add sw0 sw0-lr0 > +check ovn-nbctl lsp-set-type sw0-lr0 router > +check ovn-nbctl lsp-set-addresses sw0-lr0 router > +check ovn-nbctl lsp-set-options sw0-lr0 router-port=lr0-sw0 > +check ovn-nbctl set Logical_Switch ${OVN_SW0_ID} other_config:lb_vip_mac=00:00:00:00:ff:01 > + > +check ovn-nbctl lrp-add lr0 lr0-sw1 00:00:00:00:ff:02 20.0.0.1/24 > +check ovn-nbctl lsp-add sw1 sw1-lr0 > +check ovn-nbctl lsp-set-type sw1-lr0 router > +check ovn-nbctl lsp-set-addresses sw1-lr0 router > +check ovn-nbctl lsp-set-options sw1-lr0 router-port=lr0-sw1 > +check ovn-nbctl set Logical_Switch ${OVN_SW1_ID} other_config:lb_vip_mac=00:00:00:00:ff:02 > + > +check ovn-nbctl lb-add lb1 10.0.0.10:80 10.0.0.3:80,20.0.0.3:80 > +OVN_LB_ID=$(ovn-nbctl --bare --column _uuid find load_balancer name=lb1) > + > +check ovn-nbctl --wait=sb ls-lb-add sw0 lb1 > +check ovn-nbctl --wait=sb ls-lb-add sw1 lb1 > + > +OVN_POPULATE_ARP > +wait_for_ports_up > +check ovn-nbctl --wait=hv sync > + > +AT_CAPTURE_FILE([sbflows]) > +OVS_WAIT_FOR_OUTPUT( > + [ovn-sbctl dump-flows > sbflows > + ovn-sbctl dump-flows sw0 | grep ct_lb_mark | grep priority=120 | sed 's/table=..//'], 0, > + [dnl > + (ls_in_pre_stateful ), priority=120 , match=(reg0[[2]] == 1 && ip4.dst == 10.0.0.10 && tcp.dst == 80), action=(reg1 = 10.0.0.10; reg2[[0..15]] = 80; ct_lb_mark;) > + (ls_in_lb ), priority=120 , match=(ct.new && ip4.dst == 10.0.0.10 && tcp.dst == 80), action=(reg0[[1]] = 0; ct_lb_mark(backends=10.0.0.3:80,20.0.0.3:80);) > +]) > + > +AT_CAPTURE_FILE([sbflows-arp]) > +OVS_WAIT_FOR_OUTPUT( > + [ovn-sbctl dump-flows sw0 | grep 00:00:00:00:ff:01 | grep 10.0.0.10 | grep priority=130 | sed 's/table=..//'], 0, > + [dnl > + (ls_in_arp_rsp ), priority=130 , match=(arp.tpa == 10.0.0.10 && arp.op == 1), action=(eth.dst = eth.src; eth.src = 00:00:00:00:ff:01; arp.op = 2; /* ARP reply */ arp.tha = arp.sha; arp.sha = 00:00:00:00:ff:01; arp.tpa = arp.spa; arp.spa = 10.0.0.10; outport = inport; flags.loopback = 1; output;) > +]) > + > +AT_CAPTURE_FILE([sbflows]) > +OVS_WAIT_FOR_OUTPUT( > + [ovn-sbctl dump-flows > sbflows > + ovn-sbctl dump-flows sw1 | grep ct_lb_mark | grep priority=120 | sed 's/table=..//'], 0, > + [dnl > + (ls_in_pre_stateful ), priority=120 , match=(reg0[[2]] == 1 && ip4.dst == 10.0.0.10 && tcp.dst == 80), action=(reg1 = 10.0.0.10; reg2[[0..15]] = 80; ct_lb_mark;) > + (ls_in_lb ), priority=120 , match=(ct.new && ip4.dst == 10.0.0.10 && tcp.dst == 80), action=(reg0[[1]] = 0; ct_lb_mark(backends=10.0.0.3:80,20.0.0.3:80);) > +]) > + > +AT_CAPTURE_FILE([sbflows-arp2]) > +OVS_WAIT_FOR_OUTPUT( > + [ovn-sbctl dump-flows sw1 | grep 00:00:00:00:ff:02 | grep 10.0.0.10 | grep priority=130 | sed 's/table=..//'], 0, > + [dnl > + (ls_in_arp_rsp ), priority=130 , match=(arp.tpa == 10.0.0.10 && arp.op == 1), action=(eth.dst = eth.src; eth.src = 00:00:00:00:ff:02; arp.op = 2; /* ARP reply */ arp.tha = arp.sha; arp.sha = 00:00:00:00:ff:02; arp.tpa = arp.spa; arp.spa = 10.0.0.10; outport = inport; flags.loopback = 1; output;) > +]) > + > +OVN_CLEANUP([hv1], [hv2]) > +AT_CLEANUP > +]) > -- > 2.39.2 (Apple Git-143) > > _______________________________________________ > dev mailing list > dev@openvswitch.org > https://mail.openvswitch.org/mailman/listinfo/ovs-dev >
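A quick way to inspect which datapath handles ARP for the VIP in the example above, once the LB is also attached to the router (a sketch; exact stage and flow names can differ between OVN versions):

--------------------------
ovn-nbctl lr-lb-add lr0 lb1
# Look for the VIP in the switch and router logical flows.
ovn-sbctl dump-flows sw0 | grep 10.0.0.10
ovn-sbctl dump-flows lr0 | grep 10.0.0.10
--------------------------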
Hi, On 01/02/24 5:15 am, Numan Siddique wrote: > On Mon, Jan 8, 2024 at 8:13 AM Priyankar Jain > <priyankar.jain@nutanix.com> wrote: >> Currently load balancer applied to a logical switch has the >> following restriction: >> - VIP of the load balancer cannot reside in the subnet prefix as the >> clients as OVN does not install ARP responder flows for the LB VIP. >> > Hi Priyankar, > > Sorry for the late reviews. No worries. Thanks for you comments on the patch. > > The above statement is actually not correct. OVN does allow the VIP > of the load balancer > to be from the same subnet prefix. But in order for this to work, > this logical switch > has to be associated with a logical router. My bad. I would repurpose the commit message to make it more clearer. > > Eg. > -------------------------- > ovn-nbctl ls-add sw0 > > ovn-nbctl lsp-add sw0 sw0-p1 > ovn-nbctl lsp-set-addresses sw0-p1 "50:54:00:00:00:03 10.0.0.3" > ovn-nbctl lsp-set-port-security sw0-p1 "50:54:00:00:00:03 10.0.0.3" > > ovn-nbctl lr-add lr0 > ovn-nbctl lrp-add lr0 lr0-sw0 00:00:00:00:ff:01 10.0.0.1/24 > ovn-nbctl lsp-add sw0 sw0-lr0 > ovn-nbctl lsp-set-type sw0-lr0 router > ovn-nbctl lsp-set-addresses sw0-lr0 router > ovn-nbctl lsp-set-options sw0-lr0 router-port=lr0-sw0 > > ovn-nbctl lb-add lb1 10.0.0.10:80 10.0.0.3:80,20.0.0.3:80 > ovn-nbctl --wait=sb ls-lb-add sw0 lb1 > > # This will work once you do > ovn-nbctl lr-lb-add lr0 lb1 > --------------------------- > > Do you have a use case where you attach a load balancer to a logical switch > and this logical switch doesn't connect to any logical router ? If > so, then perhaps we can consider > this patch. However if you always attach a logical router to a > logical switch (like how its > done in the test case added in this patch), then just attaching the > lb to the router would do. There are actually 2 usecases for us: 1. Logical switch (with localnet port) is not connected to any logical router. Routing is handled by the underlay here. And we have LB attached to logical switch. 2. Load balancer only applied to logical switches. Say we don't want the load balancer to be accessible from outside world (NS connectivity). Initially i thought it logical_router_policy can help but these gets applied (lr_in_policy) only after the LB (lr_in_defrag). Hence, policies see only translated addresses. In this case the LB is only applied to the logical switches and not to the logical routers. > > Can you please confirm if this is sufficient for your use case ? > Perhaps we should document it in OVN :) > > Thanks > Numan > > >> This change adds a new config option "lb_vip_mac" in the logical_switch >> table which is expected to be a MAC address. If the logical_switch has >> this option configured, northd will program an ARP responder flow for >> all the LB VIPs of the logical_switch with this MAC address. >> >> Usecase: With this change, CMS can set the lb_vip_mac value to same as >> the default gateway MAC. This allows CMS to allocate VIP of the Load >> balancer from any subnet prefix. 
>> >> Signed-off-by: Priyankar Jain <priyankar.jain@nutanix.com> > > > >> --- >> northd/northd.c | 71 ++++++++++++++++++++++++++ >> northd/northd.h | 2 + >> northd/ovn-northd.8.xml | 49 ++++++++++++++++++ >> tests/ovn.at | 109 ++++++++++++++++++++++++++++++++++++++++ >> 4 files changed, 231 insertions(+) >> >> diff --git a/northd/northd.c b/northd/northd.c >> index db3cd272e..ebca2c073 100644 >> --- a/northd/northd.c >> +++ b/northd/northd.c >> @@ -790,8 +790,11 @@ init_lb_for_datapath(struct ovn_datapath *od) >> { >> if (od->nbs) { >> od->has_lb_vip = ls_has_lb_vip(od); >> + od->lb_vip_mac = nullable_xstrdup( >> + smap_get(&od->nbs->other_config, "lb_vip_mac")); >> } else { >> od->has_lb_vip = lr_has_lb_vip(od); >> + od->lb_vip_mac = NULL; >> } >> } >> >> @@ -800,6 +803,9 @@ destroy_lb_for_datapath(struct ovn_datapath *od) >> { >> ovn_lb_ip_set_destroy(od->lb_ips); >> od->lb_ips = NULL; >> + >> + free(od->lb_vip_mac); >> + od->lb_vip_mac = NULL; >> } >> >> /* A group of logical router datapaths which are connected - either >> @@ -12204,6 +12210,70 @@ build_lrouter_nat_flows_for_lb(struct ovn_lb_vip *lb_vip, >> } >> } >> >> +static void >> +build_lb_rules_arp_nd_rsp(struct hmap *lflows, struct ovn_lb_datapaths *lb_dps, >> + const struct ovn_datapaths *ls_datapaths, >> + struct ds *match, struct ds *actions) >> +{ >> + if (!lb_dps->n_nb_ls) { >> + return; >> + } >> + >> + const struct ovn_northd_lb *lb = lb_dps->lb; >> + for (size_t i = 0; i < lb->n_vips; i++) { >> + struct ovn_lb_vip *lb_vip = &lb->vips[i]; >> + >> + size_t index; >> + BITMAP_FOR_EACH_1 (index, ods_size(ls_datapaths), lb_dps->nb_ls_map) { >> + struct ovn_datapath *od = ls_datapaths->array[index]; >> + if (!od->lb_vip_mac) { >> + continue; >> + } >> + ds_clear(match); >> + ds_clear(actions); >> + if (IN6_IS_ADDR_V4MAPPED(&lb_vip->vip)) { >> + ds_put_format(match, "arp.tpa == %s && arp.op == 1", >> + lb_vip->vip_str); >> + ds_put_format(actions, >> + "eth.dst = eth.src; " >> + "eth.src = %s; " >> + "arp.op = 2; /* ARP reply */ " >> + "arp.tha = arp.sha; " >> + "arp.sha = %s; " >> + "arp.tpa = arp.spa; " >> + "arp.spa = %s; " >> + "outport = inport; " >> + "flags.loopback = 1; " >> + "output;", >> + od->lb_vip_mac, od->lb_vip_mac, >> + lb_vip->vip_str); >> + } else { >> + ds_put_format(match, "nd_ns && nd.target == %s", >> + lb_vip->vip_str); >> + ds_put_format(actions, >> + "nd_na { " >> + "eth.dst = eth.src; " >> + "eth.src = %s; " >> + "ip6.src = %s; " >> + "nd.target = %s; " >> + "nd.tll = %s; " >> + "outport = inport; " >> + "flags.loopback = 1; " >> + "output; " >> + "};", >> + od->lb_vip_mac, >> + lb_vip->vip_str, >> + lb_vip->vip_str, >> + od->lb_vip_mac); >> + } >> + ovn_lflow_add_with_hint(lflows, od, >> + S_SWITCH_IN_ARP_ND_RSP, 130, >> + ds_cstr(match), ds_cstr(actions), >> + &lb->nlb->header_); >> + } >> + } >> +} >> + >> static void >> build_lswitch_flows_for_lb(struct ovn_lb_datapaths *lb_dps, >> struct hmap *lflows, >> @@ -12255,6 +12325,7 @@ build_lswitch_flows_for_lb(struct ovn_lb_datapaths *lb_dps, >> ls_datapaths, match, action); >> build_lb_rules(lflows, lb_dps, ls_datapaths, features, match, action, >> meter_groups, svc_monitor_map); >> + build_lb_rules_arp_nd_rsp(lflows, lb_dps, ls_datapaths, match, action); >> } >> >> /* If there are any load balancing rules, we should send the packet to >> diff --git a/northd/northd.h b/northd/northd.h >> index 5be7b5384..3e1b24e2c 100644 >> --- a/northd/northd.h >> +++ b/northd/northd.h >> @@ -262,6 +262,8 @@ struct ovn_datapath { >> bool has_vtep_lports; >> bool 
has_arp_proxy_port; >> >> + char *lb_vip_mac; >> + >> /* IPAM data. */ >> struct ipam_info ipam_info; >> >> diff --git a/northd/ovn-northd.8.xml b/northd/ovn-northd.8.xml >> index 98cf7adb4..94daf47fb 100644 >> --- a/northd/ovn-northd.8.xml >> +++ b/northd/ovn-northd.8.xml >> @@ -1618,6 +1618,55 @@ output; >> </p> >> </li> >> >> + <li> >> + <p> >> + If <var>E</var> is defined in the value of >> + <ref column="other_config:lb_vip_mac" table="Logical_Switch" db="OVN_Northbound"/>, >> + For each <var>VIP</var> defined in the value of the >> + <ref column="vips" table="Load_Balancer" db="OVN_Northbound"/> >> + column of <ref table="Load_Balancer" db="OVN_Northbound"/> table, >> + priority-130 logical flow is added with the match >> + <code>arp.tpa == <var>VIP</var> >> + && && arp.op == 1</code> and applies the action >> + </p> >> + >> + <pre> >> +eth.dst = eth.src; >> +eth.src = <var>E</var>; >> +arp.op = 2; /* ARP reply. */ >> +arp.tha = arp.sha; >> +arp.sha = <var>E</var>; >> +arp.tpa = arp.spa; >> +arp.spa = <var>VIP</var>; >> +outport = inport; >> +flags.loopback = 1; >> +output; >> + </pre> >> + >> + <p> >> + These flows are required if an ARP request is sent for the >> + <var>VIP</var>. This enables CMS to have VIP allocated from >> + the same subnet prefix as the clients. >> + </p> >> + >> + <p> >> + For IPv6 the similar flow is added with the following action >> + </p> >> + >> + <pre> >> +nd_na { >> + eth.dst = eth.src; >> + eth.src = <var>E</var>; >> + ip6.src = <var>VIP</var>; >> + nd.target = <var>VIP</var>; >> + nd.tll = <var>E</var>; >> + outport = inport; >> + flags.loopback = 1; >> + output; >> +}; >> + </pre> >> + </li> >> + >> <li> >> One priority-0 fallback flow that matches all packets and advances to >> the next table. >> diff --git a/tests/ovn.at b/tests/ovn.at >> index 5615ba1a9..f25791d3f 100644 >> --- a/tests/ovn.at >> +++ b/tests/ovn.at >> @@ -37524,3 +37524,112 @@ wait_for_ports_up >> OVN_CLEANUP([hv1]) >> AT_CLEANUP >> ]) >> + >> +OVN_FOR_EACH_NORTHD([ >> +AT_SETUP([Logical Switch lb_vip_mac - IPv4]) >> +AT_KEYWORDS([lb]) >> +ovn_start >> + >> +net_add n1 >> + >> +sim_add hv1 >> +as hv1 >> +ovs-vsctl add-br br-phys >> +ovn_attach n1 br-phys 192.168.0.1 >> +check ovs-vsctl -- add-port br-int hv1-vif1 -- \ >> + set interface hv1-vif1 external-ids:iface-id=sw0-p1 \ >> + options:tx_pcap=hv1/vif1-tx.pcap \ >> + options:rxq_pcap=hv1/vif1-rx.pcap \ >> + ofport-request=1 >> +check ovs-vsctl -- add-port br-int hv1-vif2 -- \ >> + set interface hv1-vif2 external-ids:iface-id=sw0-p2 \ >> + options:tx_pcap=hv1/vif2-tx.pcap \ >> + options:rxq_pcap=hv1/vif2-rx.pcap \ >> + ofport-request=2 >> + >> +sim_add hv2 >> +as hv2 >> +check ovs-vsctl add-br br-phys >> +ovn_attach n1 br-phys 192.168.0.2 >> +check ovs-vsctl -- add-port br-int hv2-vif1 -- \ >> + set interface hv2-vif1 external-ids:iface-id=sw1-p1 \ >> + options:tx_pcap=hv2/vif1-tx.pcap \ >> + options:rxq_pcap=hv2/vif1-rx.pcap \ >> + ofport-request=1 >> + >> +check ovn-nbctl ls-add sw0 >> + >> +check ovn-nbctl lsp-add sw0 sw0-p1 >> +check ovn-nbctl lsp-set-addresses sw0-p1 "50:54:00:00:00:03 10.0.0.3" >> +check ovn-nbctl lsp-set-port-security sw0-p1 "50:54:00:00:00:03 10.0.0.3" >> + >> +# Create the second logical switch with one port >> +check ovn-nbctl ls-add sw1 >> +check ovn-nbctl lsp-add sw1 sw1-p1 >> +check ovn-nbctl lsp-set-addresses sw1-p1 "40:54:00:00:00:03 20.0.0.3" >> +check ovn-nbctl lsp-set-port-security sw1-p1 "40:54:00:00:00:03 20.0.0.3" >> + >> +OVN_SW0_ID=$(ovn-nbctl --bare --column _uuid find logical_switch 
name=sw0) >> +OVN_SW1_ID=$(ovn-nbctl --bare --column _uuid find logical_switch name=sw1) >> + >> +# Create a logical router and attach both logical switches >> +check ovn-nbctl lr-add lr0 >> +check ovn-nbctl lrp-add lr0 lr0-sw0 00:00:00:00:ff:01 10.0.0.1/24 >> +check ovn-nbctl lsp-add sw0 sw0-lr0 >> +check ovn-nbctl lsp-set-type sw0-lr0 router >> +check ovn-nbctl lsp-set-addresses sw0-lr0 router >> +check ovn-nbctl lsp-set-options sw0-lr0 router-port=lr0-sw0 >> +check ovn-nbctl set Logical_Switch ${OVN_SW0_ID} other_config:lb_vip_mac=00:00:00:00:ff:01 >> + >> +check ovn-nbctl lrp-add lr0 lr0-sw1 00:00:00:00:ff:02 20.0.0.1/24 >> +check ovn-nbctl lsp-add sw1 sw1-lr0 >> +check ovn-nbctl lsp-set-type sw1-lr0 router >> +check ovn-nbctl lsp-set-addresses sw1-lr0 router >> +check ovn-nbctl lsp-set-options sw1-lr0 router-port=lr0-sw1 >> +check ovn-nbctl set Logical_Switch ${OVN_SW1_ID} other_config:lb_vip_mac=00:00:00:00:ff:02 >> + >> +check ovn-nbctl lb-add lb1 10.0.0.10:80 10.0.0.3:80,20.0.0.3:80 >> +OVN_LB_ID=$(ovn-nbctl --bare --column _uuid find load_balancer name=lb1) >> + >> +check ovn-nbctl --wait=sb ls-lb-add sw0 lb1 >> +check ovn-nbctl --wait=sb ls-lb-add sw1 lb1 >> + >> +OVN_POPULATE_ARP >> +wait_for_ports_up >> +check ovn-nbctl --wait=hv sync >> + >> +AT_CAPTURE_FILE([sbflows]) >> +OVS_WAIT_FOR_OUTPUT( >> + [ovn-sbctl dump-flows > sbflows >> + ovn-sbctl dump-flows sw0 | grep ct_lb_mark | grep priority=120 | sed 's/table=..//'], 0, >> + [dnl >> + (ls_in_pre_stateful ), priority=120 , match=(reg0[[2]] == 1 && ip4.dst == 10.0.0.10 && tcp.dst == 80), action=(reg1 = 10.0.0.10; reg2[[0..15]] = 80; ct_lb_mark;) >> + (ls_in_lb ), priority=120 , match=(ct.new && ip4.dst == 10.0.0.10 && tcp.dst == 80), action=(reg0[[1]] = 0; ct_lb_mark(backends=10.0.0.3:80,20.0.0.3:80);) >> +]) >> + >> +AT_CAPTURE_FILE([sbflows-arp]) >> +OVS_WAIT_FOR_OUTPUT( >> + [ovn-sbctl dump-flows sw0 | grep 00:00:00:00:ff:01 | grep 10.0.0.10 | grep priority=130 | sed 's/table=..//'], 0, >> + [dnl >> + (ls_in_arp_rsp ), priority=130 , match=(arp.tpa == 10.0.0.10 && arp.op == 1), action=(eth.dst = eth.src; eth.src = 00:00:00:00:ff:01; arp.op = 2; /* ARP reply */ arp.tha = arp.sha; arp.sha = 00:00:00:00:ff:01; arp.tpa = arp.spa; arp.spa = 10.0.0.10; outport = inport; flags.loopback = 1; output;) >> +]) >> + >> +AT_CAPTURE_FILE([sbflows]) >> +OVS_WAIT_FOR_OUTPUT( >> + [ovn-sbctl dump-flows > sbflows >> + ovn-sbctl dump-flows sw1 | grep ct_lb_mark | grep priority=120 | sed 's/table=..//'], 0, >> + [dnl >> + (ls_in_pre_stateful ), priority=120 , match=(reg0[[2]] == 1 && ip4.dst == 10.0.0.10 && tcp.dst == 80), action=(reg1 = 10.0.0.10; reg2[[0..15]] = 80; ct_lb_mark;) >> + (ls_in_lb ), priority=120 , match=(ct.new && ip4.dst == 10.0.0.10 && tcp.dst == 80), action=(reg0[[1]] = 0; ct_lb_mark(backends=10.0.0.3:80,20.0.0.3:80);) >> +]) >> + >> +AT_CAPTURE_FILE([sbflows-arp2]) >> +OVS_WAIT_FOR_OUTPUT( >> + [ovn-sbctl dump-flows sw1 | grep 00:00:00:00:ff:02 | grep 10.0.0.10 | grep priority=130 | sed 's/table=..//'], 0, >> + [dnl >> + (ls_in_arp_rsp ), priority=130 , match=(arp.tpa == 10.0.0.10 && arp.op == 1), action=(eth.dst = eth.src; eth.src = 00:00:00:00:ff:02; arp.op = 2; /* ARP reply */ arp.tha = arp.sha; arp.sha = 00:00:00:00:ff:02; arp.tpa = arp.spa; arp.spa = 10.0.0.10; outport = inport; flags.loopback = 1; output;) >> +]) >> + >> +OVN_CLEANUP([hv1], [hv2]) >> +AT_CLEANUP >> +]) >> -- >> 2.39.2 (Apple Git-143) >> >> _______________________________________________ >> dev mailing list >> dev@openvswitch.org >> 
https://urldefense.proofpoint.com/v2/url?u=https-3A__mail.openvswitch.org_mailman_listinfo_ovs-2Ddev&d=DwIFaQ&c=s883GpUCOChKOHiocYtGcg&r=BWfApgwqQ_BTXd_yEq_kwWLM-KTqkdPvolDMCJmUagg&m=BTv1Q6JqT7HSTeR7S-vi-AW1n2nxQAc6LgOL377VnC6VrcrjQGBeftTQ2HHdjeZC&s=KKX-3tpcW-bcP2X2u7oY0ZcCp3YXSgLD-SWH1kQSaO4&e= >>
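A minimal configuration sketch of use case 1 above (a logical switch with a localnet port, no logical router, and an LB attached), assuming the lb_vip_mac option from this patch; the network name "physnet1", the MAC, and all addresses are illustrative only:

--------------------------
ovn-nbctl ls-add ls0
ovn-nbctl lsp-add ls0 ln-ls0
ovn-nbctl lsp-set-type ln-ls0 localnet
ovn-nbctl lsp-set-addresses ln-ls0 unknown
ovn-nbctl lsp-set-options ln-ls0 network_name=physnet1
ovn-nbctl lsp-add ls0 ls0-p1
ovn-nbctl lsp-set-addresses ls0-p1 "50:54:00:00:00:03 10.0.0.3"
ovn-nbctl lb-add lb1 10.0.0.10:80 10.0.0.3:80
ovn-nbctl ls-lb-add ls0 lb1
# No router port MAC is available here, so the MAC used by the ARP
# responder has to be supplied explicitly:
ovn-nbctl set Logical_Switch ls0 other_config:lb_vip_mac=0a:58:0a:00:00:01
--------------------------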
On Mon, Feb 5, 2024 at 11:45 PM Priyankar Jain <priyankar.jain@nutanix.com> wrote: > > Hi, > > On 01/02/24 5:15 am, Numan Siddique wrote: > > On Mon, Jan 8, 2024 at 8:13 AM Priyankar Jain > > <priyankar.jain@nutanix.com> wrote: > >> Currently load balancer applied to a logical switch has the > >> following restriction: > >> - VIP of the load balancer cannot reside in the subnet prefix as the > >> clients as OVN does not install ARP responder flows for the LB VIP. > >> > > Hi Priyankar, > > > > Sorry for the late reviews. > No worries. Thanks for you comments on the patch. > > > > The above statement is actually not correct. OVN does allow the VIP > > of the load balancer > > to be from the same subnet prefix. But in order for this to work, > > this logical switch > > has to be associated with a logical router. > My bad. I would repurpose the commit message to make it more clearer. > > > > Eg. > > -------------------------- > > ovn-nbctl ls-add sw0 > > > > ovn-nbctl lsp-add sw0 sw0-p1 > > ovn-nbctl lsp-set-addresses sw0-p1 "50:54:00:00:00:03 10.0.0.3" > > ovn-nbctl lsp-set-port-security sw0-p1 "50:54:00:00:00:03 10.0.0.3" > > > > ovn-nbctl lr-add lr0 > > ovn-nbctl lrp-add lr0 lr0-sw0 00:00:00:00:ff:01 10.0.0.1/24 > > ovn-nbctl lsp-add sw0 sw0-lr0 > > ovn-nbctl lsp-set-type sw0-lr0 router > > ovn-nbctl lsp-set-addresses sw0-lr0 router > > ovn-nbctl lsp-set-options sw0-lr0 router-port=lr0-sw0 > > > > ovn-nbctl lb-add lb1 10.0.0.10:80 10.0.0.3:80,20.0.0.3:80 > > ovn-nbctl --wait=sb ls-lb-add sw0 lb1 > > > > # This will work once you do > > ovn-nbctl lr-lb-add lr0 lb1 > > --------------------------- > > > > Do you have a use case where you attach a load balancer to a logical switch > > and this logical switch doesn't connect to any logical router ? If > > so, then perhaps we can consider > > this patch. However if you always attach a logical router to a > > logical switch (like how its > > done in the test case added in this patch), then just attaching the > > lb to the router would do. > > There are actually 2 usecases for us: > > 1. Logical switch (with localnet port) is not connected to any logical > router. Routing is handled by the underlay here. And we have LB attached > to logical switch. > > 2. Load balancer only applied to logical switches. Say we don't want the > load balancer to be accessible from outside world (NS connectivity). > Initially i thought it logical_router_policy can help but these gets > applied (lr_in_policy) only after the LB (lr_in_defrag). Hence, policies > see only translated addresses. In this case the LB is only applied to > the logical switches and not to the logical routers. '> Thanks for the explanation. I think it makes sense to add this feature. I'm afraid you have to rework a bit now that the northd lflow I-P patches are merged. Few pointers 1. Please move the 'lb_vip_mac' from 'struct od' to 'struct ls_stateful_record' in northd/en-ls-stateful.h 2. Init this variable here - https://github.com/ovn-org/ovn/blob/main/northd/en-ls-stateful.c#L314 3. Call the function build_lb_rules_arp_nd_rsp() which you added from build_ls_stateful_flows() - https://github.com/ovn-org/ovn/blob/main/northd/northd.c#L15490 Please let me know if you have questions. Overall the patch LGTM. Thanks Numan > > > > Can you please confirm if this is sufficient for your use case ? > > Perhaps we should document it in OVN :) > > > > Thanks > > Numan > > > > > >> This change adds a new config option "lb_vip_mac" in the logical_switch > >> table which is expected to be a MAC address. 
If the logical_switch has > >> this option configured, northd will program an ARP responder flow for > >> all the LB VIPs of the logical_switch with this MAC address. > >> > >> Usecase: With this change, CMS can set the lb_vip_mac value to same as > >> the default gateway MAC. This allows CMS to allocate VIP of the Load > >> balancer from any subnet prefix. > >> > >> Signed-off-by: Priyankar Jain <priyankar.jain@nutanix.com> > > > > > > > >> --- > >> northd/northd.c | 71 ++++++++++++++++++++++++++ > >> northd/northd.h | 2 + > >> northd/ovn-northd.8.xml | 49 ++++++++++++++++++ > >> tests/ovn.at | 109 ++++++++++++++++++++++++++++++++++++++++ > >> 4 files changed, 231 insertions(+) > >> > >> diff --git a/northd/northd.c b/northd/northd.c > >> index db3cd272e..ebca2c073 100644 > >> --- a/northd/northd.c > >> +++ b/northd/northd.c > >> @@ -790,8 +790,11 @@ init_lb_for_datapath(struct ovn_datapath *od) > >> { > >> if (od->nbs) { > >> od->has_lb_vip = ls_has_lb_vip(od); > >> + od->lb_vip_mac = nullable_xstrdup( > >> + smap_get(&od->nbs->other_config, "lb_vip_mac")); > >> } else { > >> od->has_lb_vip = lr_has_lb_vip(od); > >> + od->lb_vip_mac = NULL; > >> } > >> } > >> > >> @@ -800,6 +803,9 @@ destroy_lb_for_datapath(struct ovn_datapath *od) > >> { > >> ovn_lb_ip_set_destroy(od->lb_ips); > >> od->lb_ips = NULL; > >> + > >> + free(od->lb_vip_mac); > >> + od->lb_vip_mac = NULL; > >> } > >> > >> /* A group of logical router datapaths which are connected - either > >> @@ -12204,6 +12210,70 @@ build_lrouter_nat_flows_for_lb(struct ovn_lb_vip *lb_vip, > >> } > >> } > >> > >> +static void > >> +build_lb_rules_arp_nd_rsp(struct hmap *lflows, struct ovn_lb_datapaths *lb_dps, > >> + const struct ovn_datapaths *ls_datapaths, > >> + struct ds *match, struct ds *actions) > >> +{ > >> + if (!lb_dps->n_nb_ls) { > >> + return; > >> + } > >> + > >> + const struct ovn_northd_lb *lb = lb_dps->lb; > >> + for (size_t i = 0; i < lb->n_vips; i++) { > >> + struct ovn_lb_vip *lb_vip = &lb->vips[i]; > >> + > >> + size_t index; > >> + BITMAP_FOR_EACH_1 (index, ods_size(ls_datapaths), lb_dps->nb_ls_map) { > >> + struct ovn_datapath *od = ls_datapaths->array[index]; > >> + if (!od->lb_vip_mac) { > >> + continue; > >> + } > >> + ds_clear(match); > >> + ds_clear(actions); > >> + if (IN6_IS_ADDR_V4MAPPED(&lb_vip->vip)) { > >> + ds_put_format(match, "arp.tpa == %s && arp.op == 1", > >> + lb_vip->vip_str); > >> + ds_put_format(actions, > >> + "eth.dst = eth.src; " > >> + "eth.src = %s; " > >> + "arp.op = 2; /* ARP reply */ " > >> + "arp.tha = arp.sha; " > >> + "arp.sha = %s; " > >> + "arp.tpa = arp.spa; " > >> + "arp.spa = %s; " > >> + "outport = inport; " > >> + "flags.loopback = 1; " > >> + "output;", > >> + od->lb_vip_mac, od->lb_vip_mac, > >> + lb_vip->vip_str); > >> + } else { > >> + ds_put_format(match, "nd_ns && nd.target == %s", > >> + lb_vip->vip_str); > >> + ds_put_format(actions, > >> + "nd_na { " > >> + "eth.dst = eth.src; " > >> + "eth.src = %s; " > >> + "ip6.src = %s; " > >> + "nd.target = %s; " > >> + "nd.tll = %s; " > >> + "outport = inport; " > >> + "flags.loopback = 1; " > >> + "output; " > >> + "};", > >> + od->lb_vip_mac, > >> + lb_vip->vip_str, > >> + lb_vip->vip_str, > >> + od->lb_vip_mac); > >> + } > >> + ovn_lflow_add_with_hint(lflows, od, > >> + S_SWITCH_IN_ARP_ND_RSP, 130, > >> + ds_cstr(match), ds_cstr(actions), > >> + &lb->nlb->header_); > >> + } > >> + } > >> +} > >> + > >> static void > >> build_lswitch_flows_for_lb(struct ovn_lb_datapaths *lb_dps, > >> struct hmap *lflows, > >> @@ -12255,6 +12325,7 
@@ build_lswitch_flows_for_lb(struct ovn_lb_datapaths *lb_dps, > >> ls_datapaths, match, action); > >> build_lb_rules(lflows, lb_dps, ls_datapaths, features, match, action, > >> meter_groups, svc_monitor_map); > >> + build_lb_rules_arp_nd_rsp(lflows, lb_dps, ls_datapaths, match, action); > >> } > >> > >> /* If there are any load balancing rules, we should send the packet to > >> diff --git a/northd/northd.h b/northd/northd.h > >> index 5be7b5384..3e1b24e2c 100644 > >> --- a/northd/northd.h > >> +++ b/northd/northd.h > >> @@ -262,6 +262,8 @@ struct ovn_datapath { > >> bool has_vtep_lports; > >> bool has_arp_proxy_port; > >> > >> + char *lb_vip_mac; > >> + > >> /* IPAM data. */ > >> struct ipam_info ipam_info; > >> > >> diff --git a/northd/ovn-northd.8.xml b/northd/ovn-northd.8.xml > >> index 98cf7adb4..94daf47fb 100644 > >> --- a/northd/ovn-northd.8.xml > >> +++ b/northd/ovn-northd.8.xml > >> @@ -1618,6 +1618,55 @@ output; > >> </p> > >> </li> > >> > >> + <li> > >> + <p> > >> + If <var>E</var> is defined in the value of > >> + <ref column="other_config:lb_vip_mac" table="Logical_Switch" db="OVN_Northbound"/>, > >> + For each <var>VIP</var> defined in the value of the > >> + <ref column="vips" table="Load_Balancer" db="OVN_Northbound"/> > >> + column of <ref table="Load_Balancer" db="OVN_Northbound"/> table, > >> + priority-130 logical flow is added with the match > >> + <code>arp.tpa == <var>VIP</var> > >> + && && arp.op == 1</code> and applies the action > >> + </p> > >> + > >> + <pre> > >> +eth.dst = eth.src; > >> +eth.src = <var>E</var>; > >> +arp.op = 2; /* ARP reply. */ > >> +arp.tha = arp.sha; > >> +arp.sha = <var>E</var>; > >> +arp.tpa = arp.spa; > >> +arp.spa = <var>VIP</var>; > >> +outport = inport; > >> +flags.loopback = 1; > >> +output; > >> + </pre> > >> + > >> + <p> > >> + These flows are required if an ARP request is sent for the > >> + <var>VIP</var>. This enables CMS to have VIP allocated from > >> + the same subnet prefix as the clients. > >> + </p> > >> + > >> + <p> > >> + For IPv6 the similar flow is added with the following action > >> + </p> > >> + > >> + <pre> > >> +nd_na { > >> + eth.dst = eth.src; > >> + eth.src = <var>E</var>; > >> + ip6.src = <var>VIP</var>; > >> + nd.target = <var>VIP</var>; > >> + nd.tll = <var>E</var>; > >> + outport = inport; > >> + flags.loopback = 1; > >> + output; > >> +}; > >> + </pre> > >> + </li> > >> + > >> <li> > >> One priority-0 fallback flow that matches all packets and advances to > >> the next table. 
> >> diff --git a/tests/ovn.at b/tests/ovn.at > >> index 5615ba1a9..f25791d3f 100644 > >> --- a/tests/ovn.at > >> +++ b/tests/ovn.at > >> @@ -37524,3 +37524,112 @@ wait_for_ports_up > >> OVN_CLEANUP([hv1]) > >> AT_CLEANUP > >> ]) > >> + > >> +OVN_FOR_EACH_NORTHD([ > >> +AT_SETUP([Logical Switch lb_vip_mac - IPv4]) > >> +AT_KEYWORDS([lb]) > >> +ovn_start > >> + > >> +net_add n1 > >> + > >> +sim_add hv1 > >> +as hv1 > >> +ovs-vsctl add-br br-phys > >> +ovn_attach n1 br-phys 192.168.0.1 > >> +check ovs-vsctl -- add-port br-int hv1-vif1 -- \ > >> + set interface hv1-vif1 external-ids:iface-id=sw0-p1 \ > >> + options:tx_pcap=hv1/vif1-tx.pcap \ > >> + options:rxq_pcap=hv1/vif1-rx.pcap \ > >> + ofport-request=1 > >> +check ovs-vsctl -- add-port br-int hv1-vif2 -- \ > >> + set interface hv1-vif2 external-ids:iface-id=sw0-p2 \ > >> + options:tx_pcap=hv1/vif2-tx.pcap \ > >> + options:rxq_pcap=hv1/vif2-rx.pcap \ > >> + ofport-request=2 > >> + > >> +sim_add hv2 > >> +as hv2 > >> +check ovs-vsctl add-br br-phys > >> +ovn_attach n1 br-phys 192.168.0.2 > >> +check ovs-vsctl -- add-port br-int hv2-vif1 -- \ > >> + set interface hv2-vif1 external-ids:iface-id=sw1-p1 \ > >> + options:tx_pcap=hv2/vif1-tx.pcap \ > >> + options:rxq_pcap=hv2/vif1-rx.pcap \ > >> + ofport-request=1 > >> + > >> +check ovn-nbctl ls-add sw0 > >> + > >> +check ovn-nbctl lsp-add sw0 sw0-p1 > >> +check ovn-nbctl lsp-set-addresses sw0-p1 "50:54:00:00:00:03 10.0.0.3" > >> +check ovn-nbctl lsp-set-port-security sw0-p1 "50:54:00:00:00:03 10.0.0.3" > >> + > >> +# Create the second logical switch with one port > >> +check ovn-nbctl ls-add sw1 > >> +check ovn-nbctl lsp-add sw1 sw1-p1 > >> +check ovn-nbctl lsp-set-addresses sw1-p1 "40:54:00:00:00:03 20.0.0.3" > >> +check ovn-nbctl lsp-set-port-security sw1-p1 "40:54:00:00:00:03 20.0.0.3" > >> + > >> +OVN_SW0_ID=$(ovn-nbctl --bare --column _uuid find logical_switch name=sw0) > >> +OVN_SW1_ID=$(ovn-nbctl --bare --column _uuid find logical_switch name=sw1) > >> + > >> +# Create a logical router and attach both logical switches > >> +check ovn-nbctl lr-add lr0 > >> +check ovn-nbctl lrp-add lr0 lr0-sw0 00:00:00:00:ff:01 10.0.0.1/24 > >> +check ovn-nbctl lsp-add sw0 sw0-lr0 > >> +check ovn-nbctl lsp-set-type sw0-lr0 router > >> +check ovn-nbctl lsp-set-addresses sw0-lr0 router > >> +check ovn-nbctl lsp-set-options sw0-lr0 router-port=lr0-sw0 > >> +check ovn-nbctl set Logical_Switch ${OVN_SW0_ID} other_config:lb_vip_mac=00:00:00:00:ff:01 > >> + > >> +check ovn-nbctl lrp-add lr0 lr0-sw1 00:00:00:00:ff:02 20.0.0.1/24 > >> +check ovn-nbctl lsp-add sw1 sw1-lr0 > >> +check ovn-nbctl lsp-set-type sw1-lr0 router > >> +check ovn-nbctl lsp-set-addresses sw1-lr0 router > >> +check ovn-nbctl lsp-set-options sw1-lr0 router-port=lr0-sw1 > >> +check ovn-nbctl set Logical_Switch ${OVN_SW1_ID} other_config:lb_vip_mac=00:00:00:00:ff:02 > >> + > >> +check ovn-nbctl lb-add lb1 10.0.0.10:80 10.0.0.3:80,20.0.0.3:80 > >> +OVN_LB_ID=$(ovn-nbctl --bare --column _uuid find load_balancer name=lb1) > >> + > >> +check ovn-nbctl --wait=sb ls-lb-add sw0 lb1 > >> +check ovn-nbctl --wait=sb ls-lb-add sw1 lb1 > >> + > >> +OVN_POPULATE_ARP > >> +wait_for_ports_up > >> +check ovn-nbctl --wait=hv sync > >> + > >> +AT_CAPTURE_FILE([sbflows]) > >> +OVS_WAIT_FOR_OUTPUT( > >> + [ovn-sbctl dump-flows > sbflows > >> + ovn-sbctl dump-flows sw0 | grep ct_lb_mark | grep priority=120 | sed 's/table=..//'], 0, > >> + [dnl > >> + (ls_in_pre_stateful ), priority=120 , match=(reg0[[2]] == 1 && ip4.dst == 10.0.0.10 && tcp.dst == 80), action=(reg1 = 
10.0.0.10; reg2[[0..15]] = 80; ct_lb_mark;) > >> + (ls_in_lb ), priority=120 , match=(ct.new && ip4.dst == 10.0.0.10 && tcp.dst == 80), action=(reg0[[1]] = 0; ct_lb_mark(backends=10.0.0.3:80,20.0.0.3:80);) > >> +]) > >> + > >> +AT_CAPTURE_FILE([sbflows-arp]) > >> +OVS_WAIT_FOR_OUTPUT( > >> + [ovn-sbctl dump-flows sw0 | grep 00:00:00:00:ff:01 | grep 10.0.0.10 | grep priority=130 | sed 's/table=..//'], 0, > >> + [dnl > >> + (ls_in_arp_rsp ), priority=130 , match=(arp.tpa == 10.0.0.10 && arp.op == 1), action=(eth.dst = eth.src; eth.src = 00:00:00:00:ff:01; arp.op = 2; /* ARP reply */ arp.tha = arp.sha; arp.sha = 00:00:00:00:ff:01; arp.tpa = arp.spa; arp.spa = 10.0.0.10; outport = inport; flags.loopback = 1; output;) > >> +]) > >> + > >> +AT_CAPTURE_FILE([sbflows]) > >> +OVS_WAIT_FOR_OUTPUT( > >> + [ovn-sbctl dump-flows > sbflows > >> + ovn-sbctl dump-flows sw1 | grep ct_lb_mark | grep priority=120 | sed 's/table=..//'], 0, > >> + [dnl > >> + (ls_in_pre_stateful ), priority=120 , match=(reg0[[2]] == 1 && ip4.dst == 10.0.0.10 && tcp.dst == 80), action=(reg1 = 10.0.0.10; reg2[[0..15]] = 80; ct_lb_mark;) > >> + (ls_in_lb ), priority=120 , match=(ct.new && ip4.dst == 10.0.0.10 && tcp.dst == 80), action=(reg0[[1]] = 0; ct_lb_mark(backends=10.0.0.3:80,20.0.0.3:80);) > >> +]) > >> + > >> +AT_CAPTURE_FILE([sbflows-arp2]) > >> +OVS_WAIT_FOR_OUTPUT( > >> + [ovn-sbctl dump-flows sw1 | grep 00:00:00:00:ff:02 | grep 10.0.0.10 | grep priority=130 | sed 's/table=..//'], 0, > >> + [dnl > >> + (ls_in_arp_rsp ), priority=130 , match=(arp.tpa == 10.0.0.10 && arp.op == 1), action=(eth.dst = eth.src; eth.src = 00:00:00:00:ff:02; arp.op = 2; /* ARP reply */ arp.tha = arp.sha; arp.sha = 00:00:00:00:ff:02; arp.tpa = arp.spa; arp.spa = 10.0.0.10; outport = inport; flags.loopback = 1; output;) > >> +]) > >> + > >> +OVN_CLEANUP([hv1], [hv2]) > >> +AT_CLEANUP > >> +]) > >> -- > >> 2.39.2 (Apple Git-143) > >> > >> _______________________________________________ > >> dev mailing list > >> dev@openvswitch.org > >> https://urldefense.proofpoint.com/v2/url?u=https-3A__mail.openvswitch.org_mailman_listinfo_ovs-2Ddev&d=DwIFaQ&c=s883GpUCOChKOHiocYtGcg&r=BWfApgwqQ_BTXd_yEq_kwWLM-KTqkdPvolDMCJmUagg&m=BTv1Q6JqT7HSTeR7S-vi-AW1n2nxQAc6LgOL377VnC6VrcrjQGBeftTQ2HHdjeZC&s=KKX-3tpcW-bcP2X2u7oY0ZcCp3YXSgLD-SWH1kQSaO4&e= > >> > _______________________________________________ > dev mailing list > dev@openvswitch.org > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
diff --git a/northd/northd.c b/northd/northd.c index db3cd272e..ebca2c073 100644 --- a/northd/northd.c +++ b/northd/northd.c @@ -790,8 +790,11 @@ init_lb_for_datapath(struct ovn_datapath *od) { if (od->nbs) { od->has_lb_vip = ls_has_lb_vip(od); + od->lb_vip_mac = nullable_xstrdup( + smap_get(&od->nbs->other_config, "lb_vip_mac")); } else { od->has_lb_vip = lr_has_lb_vip(od); + od->lb_vip_mac = NULL; } } @@ -800,6 +803,9 @@ destroy_lb_for_datapath(struct ovn_datapath *od) { ovn_lb_ip_set_destroy(od->lb_ips); od->lb_ips = NULL; + + free(od->lb_vip_mac); + od->lb_vip_mac = NULL; } /* A group of logical router datapaths which are connected - either @@ -12204,6 +12210,70 @@ build_lrouter_nat_flows_for_lb(struct ovn_lb_vip *lb_vip, } } +static void +build_lb_rules_arp_nd_rsp(struct hmap *lflows, struct ovn_lb_datapaths *lb_dps, + const struct ovn_datapaths *ls_datapaths, + struct ds *match, struct ds *actions) +{ + if (!lb_dps->n_nb_ls) { + return; + } + + const struct ovn_northd_lb *lb = lb_dps->lb; + for (size_t i = 0; i < lb->n_vips; i++) { + struct ovn_lb_vip *lb_vip = &lb->vips[i]; + + size_t index; + BITMAP_FOR_EACH_1 (index, ods_size(ls_datapaths), lb_dps->nb_ls_map) { + struct ovn_datapath *od = ls_datapaths->array[index]; + if (!od->lb_vip_mac) { + continue; + } + ds_clear(match); + ds_clear(actions); + if (IN6_IS_ADDR_V4MAPPED(&lb_vip->vip)) { + ds_put_format(match, "arp.tpa == %s && arp.op == 1", + lb_vip->vip_str); + ds_put_format(actions, + "eth.dst = eth.src; " + "eth.src = %s; " + "arp.op = 2; /* ARP reply */ " + "arp.tha = arp.sha; " + "arp.sha = %s; " + "arp.tpa = arp.spa; " + "arp.spa = %s; " + "outport = inport; " + "flags.loopback = 1; " + "output;", + od->lb_vip_mac, od->lb_vip_mac, + lb_vip->vip_str); + } else { + ds_put_format(match, "nd_ns && nd.target == %s", + lb_vip->vip_str); + ds_put_format(actions, + "nd_na { " + "eth.dst = eth.src; " + "eth.src = %s; " + "ip6.src = %s; " + "nd.target = %s; " + "nd.tll = %s; " + "outport = inport; " + "flags.loopback = 1; " + "output; " + "};", + od->lb_vip_mac, + lb_vip->vip_str, + lb_vip->vip_str, + od->lb_vip_mac); + } + ovn_lflow_add_with_hint(lflows, od, + S_SWITCH_IN_ARP_ND_RSP, 130, + ds_cstr(match), ds_cstr(actions), + &lb->nlb->header_); + } + } +} + static void build_lswitch_flows_for_lb(struct ovn_lb_datapaths *lb_dps, struct hmap *lflows, @@ -12255,6 +12325,7 @@ build_lswitch_flows_for_lb(struct ovn_lb_datapaths *lb_dps, ls_datapaths, match, action); build_lb_rules(lflows, lb_dps, ls_datapaths, features, match, action, meter_groups, svc_monitor_map); + build_lb_rules_arp_nd_rsp(lflows, lb_dps, ls_datapaths, match, action); } /* If there are any load balancing rules, we should send the packet to diff --git a/northd/northd.h b/northd/northd.h index 5be7b5384..3e1b24e2c 100644 --- a/northd/northd.h +++ b/northd/northd.h @@ -262,6 +262,8 @@ struct ovn_datapath { bool has_vtep_lports; bool has_arp_proxy_port; + char *lb_vip_mac; + /* IPAM data. 
*/ struct ipam_info ipam_info; diff --git a/northd/ovn-northd.8.xml b/northd/ovn-northd.8.xml index 98cf7adb4..94daf47fb 100644 --- a/northd/ovn-northd.8.xml +++ b/northd/ovn-northd.8.xml @@ -1618,6 +1618,55 @@ output; </p> </li> + <li> + <p> + If <var>E</var> is defined in the value of + <ref column="other_config:lb_vip_mac" table="Logical_Switch" db="OVN_Northbound"/>, + For each <var>VIP</var> defined in the value of the + <ref column="vips" table="Load_Balancer" db="OVN_Northbound"/> + column of <ref table="Load_Balancer" db="OVN_Northbound"/> table, + priority-130 logical flow is added with the match + <code>arp.tpa == <var>VIP</var> + && && arp.op == 1</code> and applies the action + </p> + + <pre> +eth.dst = eth.src; +eth.src = <var>E</var>; +arp.op = 2; /* ARP reply. */ +arp.tha = arp.sha; +arp.sha = <var>E</var>; +arp.tpa = arp.spa; +arp.spa = <var>VIP</var>; +outport = inport; +flags.loopback = 1; +output; + </pre> + + <p> + These flows are required if an ARP request is sent for the + <var>VIP</var>. This enables CMS to have VIP allocated from + the same subnet prefix as the clients. + </p> + + <p> + For IPv6 the similar flow is added with the following action + </p> + + <pre> +nd_na { + eth.dst = eth.src; + eth.src = <var>E</var>; + ip6.src = <var>VIP</var>; + nd.target = <var>VIP</var>; + nd.tll = <var>E</var>; + outport = inport; + flags.loopback = 1; + output; +}; + </pre> + </li> + <li> One priority-0 fallback flow that matches all packets and advances to the next table. diff --git a/tests/ovn.at b/tests/ovn.at index 5615ba1a9..f25791d3f 100644 --- a/tests/ovn.at +++ b/tests/ovn.at @@ -37524,3 +37524,112 @@ wait_for_ports_up OVN_CLEANUP([hv1]) AT_CLEANUP ]) + +OVN_FOR_EACH_NORTHD([ +AT_SETUP([Logical Switch lb_vip_mac - IPv4]) +AT_KEYWORDS([lb]) +ovn_start + +net_add n1 + +sim_add hv1 +as hv1 +ovs-vsctl add-br br-phys +ovn_attach n1 br-phys 192.168.0.1 +check ovs-vsctl -- add-port br-int hv1-vif1 -- \ + set interface hv1-vif1 external-ids:iface-id=sw0-p1 \ + options:tx_pcap=hv1/vif1-tx.pcap \ + options:rxq_pcap=hv1/vif1-rx.pcap \ + ofport-request=1 +check ovs-vsctl -- add-port br-int hv1-vif2 -- \ + set interface hv1-vif2 external-ids:iface-id=sw0-p2 \ + options:tx_pcap=hv1/vif2-tx.pcap \ + options:rxq_pcap=hv1/vif2-rx.pcap \ + ofport-request=2 + +sim_add hv2 +as hv2 +check ovs-vsctl add-br br-phys +ovn_attach n1 br-phys 192.168.0.2 +check ovs-vsctl -- add-port br-int hv2-vif1 -- \ + set interface hv2-vif1 external-ids:iface-id=sw1-p1 \ + options:tx_pcap=hv2/vif1-tx.pcap \ + options:rxq_pcap=hv2/vif1-rx.pcap \ + ofport-request=1 + +check ovn-nbctl ls-add sw0 + +check ovn-nbctl lsp-add sw0 sw0-p1 +check ovn-nbctl lsp-set-addresses sw0-p1 "50:54:00:00:00:03 10.0.0.3" +check ovn-nbctl lsp-set-port-security sw0-p1 "50:54:00:00:00:03 10.0.0.3" + +# Create the second logical switch with one port +check ovn-nbctl ls-add sw1 +check ovn-nbctl lsp-add sw1 sw1-p1 +check ovn-nbctl lsp-set-addresses sw1-p1 "40:54:00:00:00:03 20.0.0.3" +check ovn-nbctl lsp-set-port-security sw1-p1 "40:54:00:00:00:03 20.0.0.3" + +OVN_SW0_ID=$(ovn-nbctl --bare --column _uuid find logical_switch name=sw0) +OVN_SW1_ID=$(ovn-nbctl --bare --column _uuid find logical_switch name=sw1) + +# Create a logical router and attach both logical switches +check ovn-nbctl lr-add lr0 +check ovn-nbctl lrp-add lr0 lr0-sw0 00:00:00:00:ff:01 10.0.0.1/24 +check ovn-nbctl lsp-add sw0 sw0-lr0 +check ovn-nbctl lsp-set-type sw0-lr0 router +check ovn-nbctl lsp-set-addresses sw0-lr0 router +check ovn-nbctl lsp-set-options sw0-lr0 
router-port=lr0-sw0 +check ovn-nbctl set Logical_Switch ${OVN_SW0_ID} other_config:lb_vip_mac=00:00:00:00:ff:01 + +check ovn-nbctl lrp-add lr0 lr0-sw1 00:00:00:00:ff:02 20.0.0.1/24 +check ovn-nbctl lsp-add sw1 sw1-lr0 +check ovn-nbctl lsp-set-type sw1-lr0 router +check ovn-nbctl lsp-set-addresses sw1-lr0 router +check ovn-nbctl lsp-set-options sw1-lr0 router-port=lr0-sw1 +check ovn-nbctl set Logical_Switch ${OVN_SW1_ID} other_config:lb_vip_mac=00:00:00:00:ff:02 + +check ovn-nbctl lb-add lb1 10.0.0.10:80 10.0.0.3:80,20.0.0.3:80 +OVN_LB_ID=$(ovn-nbctl --bare --column _uuid find load_balancer name=lb1) + +check ovn-nbctl --wait=sb ls-lb-add sw0 lb1 +check ovn-nbctl --wait=sb ls-lb-add sw1 lb1 + +OVN_POPULATE_ARP +wait_for_ports_up +check ovn-nbctl --wait=hv sync + +AT_CAPTURE_FILE([sbflows]) +OVS_WAIT_FOR_OUTPUT( + [ovn-sbctl dump-flows > sbflows + ovn-sbctl dump-flows sw0 | grep ct_lb_mark | grep priority=120 | sed 's/table=..//'], 0, + [dnl + (ls_in_pre_stateful ), priority=120 , match=(reg0[[2]] == 1 && ip4.dst == 10.0.0.10 && tcp.dst == 80), action=(reg1 = 10.0.0.10; reg2[[0..15]] = 80; ct_lb_mark;) + (ls_in_lb ), priority=120 , match=(ct.new && ip4.dst == 10.0.0.10 && tcp.dst == 80), action=(reg0[[1]] = 0; ct_lb_mark(backends=10.0.0.3:80,20.0.0.3:80);) +]) + +AT_CAPTURE_FILE([sbflows-arp]) +OVS_WAIT_FOR_OUTPUT( + [ovn-sbctl dump-flows sw0 | grep 00:00:00:00:ff:01 | grep 10.0.0.10 | grep priority=130 | sed 's/table=..//'], 0, + [dnl + (ls_in_arp_rsp ), priority=130 , match=(arp.tpa == 10.0.0.10 && arp.op == 1), action=(eth.dst = eth.src; eth.src = 00:00:00:00:ff:01; arp.op = 2; /* ARP reply */ arp.tha = arp.sha; arp.sha = 00:00:00:00:ff:01; arp.tpa = arp.spa; arp.spa = 10.0.0.10; outport = inport; flags.loopback = 1; output;) +]) + +AT_CAPTURE_FILE([sbflows]) +OVS_WAIT_FOR_OUTPUT( + [ovn-sbctl dump-flows > sbflows + ovn-sbctl dump-flows sw1 | grep ct_lb_mark | grep priority=120 | sed 's/table=..//'], 0, + [dnl + (ls_in_pre_stateful ), priority=120 , match=(reg0[[2]] == 1 && ip4.dst == 10.0.0.10 && tcp.dst == 80), action=(reg1 = 10.0.0.10; reg2[[0..15]] = 80; ct_lb_mark;) + (ls_in_lb ), priority=120 , match=(ct.new && ip4.dst == 10.0.0.10 && tcp.dst == 80), action=(reg0[[1]] = 0; ct_lb_mark(backends=10.0.0.3:80,20.0.0.3:80);) +]) + +AT_CAPTURE_FILE([sbflows-arp2]) +OVS_WAIT_FOR_OUTPUT( + [ovn-sbctl dump-flows sw1 | grep 00:00:00:00:ff:02 | grep 10.0.0.10 | grep priority=130 | sed 's/table=..//'], 0, + [dnl + (ls_in_arp_rsp ), priority=130 , match=(arp.tpa == 10.0.0.10 && arp.op == 1), action=(eth.dst = eth.src; eth.src = 00:00:00:00:ff:02; arp.op = 2; /* ARP reply */ arp.tha = arp.sha; arp.sha = 00:00:00:00:ff:02; arp.tpa = arp.spa; arp.spa = 10.0.0.10; outport = inport; flags.loopback = 1; output;) +]) + +OVN_CLEANUP([hv1], [hv2]) +AT_CLEANUP +])
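The test above exercises only the IPv4/ARP responder. A sketch of how the IPv6/ND path added by this patch could be exercised; switch and LB names and the addresses are illustrative, not part of the patch:

--------------------------
ovn-nbctl ls-add sw2
ovn-nbctl lsp-add sw2 sw2-p1
ovn-nbctl lsp-set-addresses sw2-p1 "50:54:00:00:00:13 aef0::3"
ovn-nbctl set Logical_Switch sw2 other_config:lb_vip_mac=00:00:00:00:ff:03
ovn-nbctl lb-add lb2 '[aef0::10]:80' '[aef0::3]:80'
ovn-nbctl --wait=sb ls-lb-add sw2 lb2
# Per the patch, this should produce a priority-130 flow in ls_in_arp_rsp
# matching "nd_ns && nd.target == aef0::10" whose action builds an nd_na
# reply with eth.src/nd.tll set to 00:00:00:00:ff:03.
ovn-sbctl dump-flows sw2 | grep priority=130 | grep 'nd.target == aef0::10'
--------------------------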
Currently load balancer applied to a logical switch has the following
restriction:
- VIP of the load balancer cannot reside in the subnet prefix as the
  clients as OVN does not install ARP responder flows for the LB VIP.

This change adds a new config option "lb_vip_mac" in the logical_switch
table which is expected to be a MAC address. If the logical_switch has
this option configured, northd will program an ARP responder flow for
all the LB VIPs of the logical_switch with this MAC address.

Usecase: With this change, CMS can set the lb_vip_mac value to same as
the default gateway MAC. This allows CMS to allocate VIP of the Load
balancer from any subnet prefix.

Signed-off-by: Priyankar Jain <priyankar.jain@nutanix.com>
---
 northd/northd.c         |  71 ++++++++++++++++++++++++++
 northd/northd.h         |   2 +
 northd/ovn-northd.8.xml |  49 ++++++++++++++++++
 tests/ovn.at            | 109 ++++++++++++++++++++++++++++++++++++++++
 4 files changed, 231 insertions(+)
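A short sketch of the resulting CMS workflow and one way to confirm the new flow, using the commands from the test case above (switch and LB names are taken from that test):

--------------------------
# Point lb_vip_mac at the default gateway MAC, per the use case above ...
ovn-nbctl set Logical_Switch sw0 other_config:lb_vip_mac=00:00:00:00:ff:01
# ... and check that northd programmed the priority-130 ARP responder:
ovn-sbctl dump-flows sw0 | grep ls_in_arp_rsp | grep priority=130
--------------------------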