Message ID | 2129b838a2c925360f910e86a19bdf701df013ce.1733403411.git.felix.huettner@stackit.cloud |
---|---|
State | Superseded |
Headers | show |
Series | OVN Fabric integration: Route table. | expand |
Context | Check | Description |
---|---|---|
ovsrobot/apply-robot | success | apply and check: success |
ovsrobot/github-robot-_Build_and_Test | success | github build: passed |
ovsrobot/github-robot-_ovn-kubernetes | success | github build: passed |
Hi Felix and Dumitru, Thanks for posting this standalone change Felix. I'm going to take it for a spin and try to integrate it to Frode's original "route leaking" RFC. I have one question/suggestion for the schema that I'll leave below. On Thu, 2024-12-05 at 14:00 +0100, Felix Huettner via dev wrote: > The Route table will be used in the future to coordinate routing > information between northd and ovn-controller. > Northd will insert routes that should be advertised to the outside > fabric. > Ovn-controller will insert routes that have been learned from the > outside fabric. > > Signed-off-by: Felix Huettner <felix.huettner@stackit.cloud> > --- > v2->v3: > * refType is now strong > * fixed typos > > > ovn-sb.ovsschema | 29 ++++++++++++++++-- > ovn-sb.xml | 80 > ++++++++++++++++++++++++++++++++++++++++++++++++ > 2 files changed, 107 insertions(+), 2 deletions(-) > > diff --git a/ovn-sb.ovsschema b/ovn-sb.ovsschema > index 73abf2c8d..88763f429 100644 > --- a/ovn-sb.ovsschema > +++ b/ovn-sb.ovsschema > @@ -1,7 +1,7 @@ > { > "name": "OVN_Southbound", > - "version": "20.37.0", > - "cksum": "1950136776 31493", > + "version": "20.38.0", > + "cksum": "1392903395 32893", > "tables": { > "SB_Global": { > "columns": { > @@ -617,6 +617,31 @@ > "type": {"key": "string", "value": "string", > "min": 0, "max": "unlimited"}}}, > "indexes": [["chassis"]], > + "isRoot": true}, > + "Route": { > + "columns": { > + "datapath": > + {"type": {"key": {"type": "uuid", > + "refTable": > "Datapath_Binding"}}}, > + "logical_port": {"type": {"key": {"type": "uuid", > + "refTable": > "Port_Binding", > + "refType": > "strong"}}}, > + "ip_prefix": {"type": "string"}, > + "nexthop": {"type": "string"}, > + "tracked_port": {"type": {"key": {"type": "uuid", > + "refTable": > "Port_Binding", > + "refType": > "strong"}, > + "min": 0, > + "max": 1}}, > + "type": {"type": {"key": {"type": "string", > + "enum": ["set", > ["advertise", > + > "receive"]]}, > + "min": 1, "max": 1}}, I wonder if this field is necessary. We could use presence or absence of "nexthop" value to determine whether the route is being learned or advertised. I'm not sure if ovsdb is doing something clever with enums or stores them as raw values in each row, but removing this could save us some space. What do you think? Martin. > + "external_ids": { > + "type": {"key": "string", "value": "string", > + "min": 0, "max": "unlimited"}}}, > + "indexes": [["datapath", "logical_port", "ip_prefix", > "nexthop", > + "type"]], > "isRoot": true} > } > } > diff --git a/ovn-sb.xml b/ovn-sb.xml > index ea4adc1c3..e435cf243 100644 > --- a/ovn-sb.xml > +++ b/ovn-sb.xml > @@ -5217,4 +5217,84 @@ tcp.flags = RST; > The set of variable values for a given chassis. > </column> > </table> > + > + <table name="Route"> > + <p> > + Each record represents a route that is exported from ovn or > imported to > + ovn using some dynamic routing logic outside of ovn. > + It is populated on the one hand by <code>ovn-northd</code> > based on the > + addresses, routes and NAT Entries of a > + <code>OVN_Northbound.Logical_Router_Port</code>. > + On the other hand <code>ovn-controller</code> populates it > with routes > + learned from outside of OVN. > + </p> > + > + <column name="datapath"> > + The datapath belonging to the > + <code>OVN_Northbound.Logical_Router</code> that this route is > valid > + for. > + </column> > + > + <column name="logical_port"> > + <p> > + If the type is <code>advertise</code> then this is the > logical_port > + the router will send packets out. > + </p> > + > + <p> > + If the type is <code>receive</code> then this is the > logical_port > + the route was learned on. > + </p> > + </column> > + > + <column name="ip_prefix"> > + <p> > + IP prefix of this route (e.g. 192.168.100.0/24). > + </p> > + </column> > + > + <column name="nexthop"> > + <p> > + If the type is <code>advertise</code> then this is empty. > + </p> > + > + <p> > + If the type is <code>receive</code> then this is the nexthop > ip we > + from the outside. > + </p> > + </column> > + > + <column name="tracked_port"> > + <p> > + Only relevant for type <code>advertise</code>. > + > + In combination with a host <code>ip_prefix</code> this > tracks the port > + OVN will forward the packets for this destination to. > + > + An announcing chassis can use this information to check if > this > + destination is local and adjust the route priorities based > on that. > + </p> > + </column> > + > + <column name="type"> > + <p> > + If the route is to be exported from OVN to the outside > network or if > + it is imported from the outside network. > + </p> > + <ul> > + <li> > + <code>advertise</code>: This route should be advertised to > the > + outside network. > + </li> > + <li> > + <code>receive</code>: This route has been learned from the > outside > + network. > + </li> > + </ul> > + </column> > + > + <column name="external_ids"> > + See <em>External IDs</em> at the beginning of this document. > + </column> > + </table> > </database>
On Thu, Dec 05, 2024 at 02:08:37PM +0100, martin.kalcok@canonical.com wrote: > Hi Felix and Dumitru, > Thanks for posting this standalone change Felix. I'm going to take it > for a spin and try to integrate it to Frode's original "route leaking" > RFC. > I have one question/suggestion for the schema that I'll leave below. > > On Thu, 2024-12-05 at 14:00 +0100, Felix Huettner via dev wrote: > > The Route table will be used in the future to coordinate routing > > information between northd and ovn-controller. > > Northd will insert routes that should be advertised to the outside > > fabric. > > Ovn-controller will insert routes that have been learned from the > > outside fabric. > > > > Signed-off-by: Felix Huettner <felix.huettner@stackit.cloud> > > --- > > v2->v3: > > * refType is now strong > > * fixed typos > > > > > > ovn-sb.ovsschema | 29 ++++++++++++++++-- > > ovn-sb.xml | 80 > > ++++++++++++++++++++++++++++++++++++++++++++++++ > > 2 files changed, 107 insertions(+), 2 deletions(-) > > > > diff --git a/ovn-sb.ovsschema b/ovn-sb.ovsschema > > index 73abf2c8d..88763f429 100644 > > --- a/ovn-sb.ovsschema > > +++ b/ovn-sb.ovsschema > > @@ -1,7 +1,7 @@ > > { > > "name": "OVN_Southbound", > > - "version": "20.37.0", > > - "cksum": "1950136776 31493", > > + "version": "20.38.0", > > + "cksum": "1392903395 32893", > > "tables": { > > "SB_Global": { > > "columns": { > > @@ -617,6 +617,31 @@ > > "type": {"key": "string", "value": "string", > > "min": 0, "max": "unlimited"}}}, > > "indexes": [["chassis"]], > > + "isRoot": true}, > > + "Route": { > > + "columns": { > > + "datapath": > > + {"type": {"key": {"type": "uuid", > > + "refTable": > > "Datapath_Binding"}}}, > > + "logical_port": {"type": {"key": {"type": "uuid", > > + "refTable": > > "Port_Binding", > > + "refType": > > "strong"}}}, > > + "ip_prefix": {"type": "string"}, > > + "nexthop": {"type": "string"}, > > + "tracked_port": {"type": {"key": {"type": "uuid", > > + "refTable": > > "Port_Binding", > > + "refType": > > "strong"}, > > + "min": 0, > > + "max": 1}}, > > + "type": {"type": {"key": {"type": "string", > > + "enum": ["set", > > ["advertise", > > + > > "receive"]]}, > > + "min": 1, "max": 1}}, > > I wonder if this field is necessary. We could use presence or absence > of "nexthop" value to determine whether the route is being learned or > advertised. I'm not sure if ovsdb is doing something clever with enums > or stores them as raw values in each row, but removing this could save > us some space. What do you think? Hi Martin, i think generally we could use the information in the "nexthop" column for this. However i would find this rather surprising if i am not deeply in the implementation. For debugging at least it is not obvious. I am not sure how much space saving we would get out of this, but i don't think that it is worth the cost of less debuggability. Especially since each learned/announced route will cause additionally once or multiple logical flows to be created which are significantly larger than this type field. But i am open to other opinions as well. Thanks a lot Felix > > Martin. > > > + "external_ids": { > > + "type": {"key": "string", "value": "string", > > + "min": 0, "max": "unlimited"}}}, > > + "indexes": [["datapath", "logical_port", "ip_prefix", > > "nexthop", > > + "type"]], > > "isRoot": true} > > } > > } > > diff --git a/ovn-sb.xml b/ovn-sb.xml > > index ea4adc1c3..e435cf243 100644 > > --- a/ovn-sb.xml > > +++ b/ovn-sb.xml > > @@ -5217,4 +5217,84 @@ tcp.flags = RST; > > The set of variable values for a given chassis. > > </column> > > </table> > > + > > + <table name="Route"> > > + <p> > > + Each record represents a route that is exported from ovn or > > imported to > > + ovn using some dynamic routing logic outside of ovn. > > + It is populated on the one hand by <code>ovn-northd</code> > > based on the > > + addresses, routes and NAT Entries of a > > + <code>OVN_Northbound.Logical_Router_Port</code>. > > + On the other hand <code>ovn-controller</code> populates it > > with routes > > + learned from outside of OVN. > > + </p> > > + > > + <column name="datapath"> > > + The datapath belonging to the > > + <code>OVN_Northbound.Logical_Router</code> that this route is > > valid > > + for. > > + </column> > > + > > + <column name="logical_port"> > > + <p> > > + If the type is <code>advertise</code> then this is the > > logical_port > > + the router will send packets out. > > + </p> > > + > > + <p> > > + If the type is <code>receive</code> then this is the > > logical_port > > + the route was learned on. > > + </p> > > + </column> > > + > > + <column name="ip_prefix"> > > + <p> > > + IP prefix of this route (e.g. 192.168.100.0/24). > > + </p> > > + </column> > > + > > + <column name="nexthop"> > > + <p> > > + If the type is <code>advertise</code> then this is empty. > > + </p> > > + > > + <p> > > + If the type is <code>receive</code> then this is the nexthop > > ip we > > + from the outside. > > + </p> > > + </column> > > + > > + <column name="tracked_port"> > > + <p> > > + Only relevant for type <code>advertise</code>. > > + > > + In combination with a host <code>ip_prefix</code> this > > tracks the port > > + OVN will forward the packets for this destination to. > > + > > + An announcing chassis can use this information to check if > > this > > + destination is local and adjust the route priorities based > > on that. > > + </p> > > + </column> > > + > > + <column name="type"> > > + <p> > > + If the route is to be exported from OVN to the outside > > network or if > > + it is imported from the outside network. > > + </p> > > + <ul> > > + <li> > > + <code>advertise</code>: This route should be advertised to > > the > > + outside network. > > + </li> > > + <li> > > + <code>receive</code>: This route has been learned from the > > outside > > + network. > > + </li> > > + </ul> > > + </column> > > + > > + <column name="external_ids"> > > + See <em>External IDs</em> at the beginning of this document. > > + </column> > > + </table> > > </database> > > _______________________________________________ > dev mailing list > dev@openvswitch.org > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
On Thu, 2024-12-05 at 15:28 +0100, Felix Huettner wrote: > On Thu, Dec 05, 2024 at 02:08:37PM +0100, > martin.kalcok@canonical.com wrote: > > Hi Felix and Dumitru, > > Thanks for posting this standalone change Felix. I'm going to take > > it > > for a spin and try to integrate it to Frode's original "route > > leaking" > > RFC. > > I have one question/suggestion for the schema that I'll leave > > below. > > > > On Thu, 2024-12-05 at 14:00 +0100, Felix Huettner via dev wrote: > > > The Route table will be used in the future to coordinate routing > > > information between northd and ovn-controller. > > > Northd will insert routes that should be advertised to the > > > outside > > > fabric. > > > Ovn-controller will insert routes that have been learned from the > > > outside fabric. > > > > > > Signed-off-by: Felix Huettner <felix.huettner@stackit.cloud> > > > --- > > > v2->v3: > > > * refType is now strong > > > * fixed typos > > > > > > > > > ovn-sb.ovsschema | 29 ++++++++++++++++-- > > > ovn-sb.xml | 80 > > > ++++++++++++++++++++++++++++++++++++++++++++++++ > > > 2 files changed, 107 insertions(+), 2 deletions(-) > > > > > > diff --git a/ovn-sb.ovsschema b/ovn-sb.ovsschema > > > index 73abf2c8d..88763f429 100644 > > > --- a/ovn-sb.ovsschema > > > +++ b/ovn-sb.ovsschema > > > @@ -1,7 +1,7 @@ > > > { > > > "name": "OVN_Southbound", > > > - "version": "20.37.0", > > > - "cksum": "1950136776 31493", > > > + "version": "20.38.0", > > > + "cksum": "1392903395 32893", > > > "tables": { > > > "SB_Global": { > > > "columns": { > > > @@ -617,6 +617,31 @@ > > > "type": {"key": "string", "value": "string", > > > "min": 0, "max": "unlimited"}}}, > > > "indexes": [["chassis"]], > > > + "isRoot": true}, > > > + "Route": { > > > + "columns": { > > > + "datapath": > > > + {"type": {"key": {"type": "uuid", > > > + "refTable": > > > "Datapath_Binding"}}}, > > > + "logical_port": {"type": {"key": {"type": > > > "uuid", > > > + "refTable": > > > "Port_Binding", > > > + "refType": > > > "strong"}}}, > > > + "ip_prefix": {"type": "string"}, > > > + "nexthop": {"type": "string"}, > > > + "tracked_port": {"type": {"key": {"type": > > > "uuid", > > > + "refTable": > > > "Port_Binding", > > > + "refType": > > > "strong"}, > > > + "min": 0, > > > + "max": 1}}, > > > + "type": {"type": {"key": {"type": "string", > > > + "enum": ["set", > > > ["advertise", > > > + > > > "receive"]]}, > > > + "min": 1, "max": 1}}, > > > > I wonder if this field is necessary. We could use presence or > > absence > > of "nexthop" value to determine whether the route is being learned > > or > > advertised. I'm not sure if ovsdb is doing something clever with > > enums > > or stores them as raw values in each row, but removing this could > > save > > us some space. What do you think? > > Hi Martin, > > i think generally we could use the information in the "nexthop" > column > for this. However i would find this rather surprising if i am not > deeply > in the implementation. For debugging at least it is not obvious. > > I am not sure how much space saving we would get out of this, but i > don't think that it is worth the cost of less debuggability. > Especially > since each learned/announced route will cause additionally once or > multiple logical flows to be created which are significantly larger > than > this type field. > > But i am open to other opinions as well. > > Thanks a lot > Felix Hi Felix, I'm not hell-bent on this change and I agree that it would make the code less explicit. It just occurred to me because I recently saw similar pattern applied in NAT table [0] where NAT rules is considered "distributed" if the record contains both logical_port and external_nat. I don't have enough experience with ovsdb in large deployments, it was just a gut feel that potentially saving ~8bytes for a table that could have potentially 100s or 1000s of records could be impactful. There's also the fact that without the server-side validation (i.e. if tpye is "advertised" then nexthop must not be empty), it's up to the clients to enforce that invalid records are not created. The issue of transparency could be solved with documentation and abstraction in parsing method (like in the case of the NAT). Anyway, as I said, this is not a blocker from my side, just a suggestion for consideration. Martin [0] https://github.com/ovn-org/ovn/blob/3b32b7d08ebcd005681db5131c798da07a89133a/northd/northd.c#L16506-L16530 > > > > > Martin. > > > > > + "external_ids": { > > > + "type": {"key": "string", "value": "string", > > > + "min": 0, "max": "unlimited"}}}, > > > + "indexes": [["datapath", "logical_port", > > > "ip_prefix", > > > "nexthop", > > > + "type"]], > > > "isRoot": true} > > > } > > > } > > > diff --git a/ovn-sb.xml b/ovn-sb.xml > > > index ea4adc1c3..e435cf243 100644 > > > --- a/ovn-sb.xml > > > +++ b/ovn-sb.xml > > > @@ -5217,4 +5217,84 @@ tcp.flags = RST; > > > The set of variable values for a given chassis. > > > </column> > > > </table> > > > + > > > + <table name="Route"> > > > + <p> > > > + Each record represents a route that is exported from ovn > > > or > > > imported to > > > + ovn using some dynamic routing logic outside of ovn. > > > + It is populated on the one hand by <code>ovn-northd</code> > > > based on the > > > + addresses, routes and NAT Entries of a > > > + <code>OVN_Northbound.Logical_Router_Port</code>. > > > + On the other hand <code>ovn-controller</code> populates it > > > with routes > > > + learned from outside of OVN. > > > + </p> > > > + > > > + <column name="datapath"> > > > + The datapath belonging to the > > > + <code>OVN_Northbound.Logical_Router</code> that this route > > > is > > > valid > > > + for. > > > + </column> > > > + > > > + <column name="logical_port"> > > > + <p> > > > + If the type is <code>advertise</code> then this is the > > > logical_port > > > + the router will send packets out. > > > + </p> > > > + > > > + <p> > > > + If the type is <code>receive</code> then this is the > > > logical_port > > > + the route was learned on. > > > + </p> > > > + </column> > > > + > > > + <column name="ip_prefix"> > > > + <p> > > > + IP prefix of this route (e.g. 192.168.100.0/24). > > > + </p> > > > + </column> > > > + > > > + <column name="nexthop"> > > > + <p> > > > + If the type is <code>advertise</code> then this is > > > empty. > > > + </p> > > > + > > > + <p> > > > + If the type is <code>receive</code> then this is the > > > nexthop > > > ip we > > > + from the outside. > > > + </p> > > > + </column> > > > + > > > + <column name="tracked_port"> > > > + <p> > > > + Only relevant for type <code>advertise</code>. > > > + > > > + In combination with a host <code>ip_prefix</code> this > > > tracks the port > > > + OVN will forward the packets for this destination to. > > > + > > > + An announcing chassis can use this information to check > > > if > > > this > > > + destination is local and adjust the route priorities > > > based > > > on that. > > > + </p> > > > + </column> > > > + > > > + <column name="type"> > > > + <p> > > > + If the route is to be exported from OVN to the outside > > > network or if > > > + it is imported from the outside network. > > > + </p> > > > + <ul> > > > + <li> > > > + <code>advertise</code>: This route should be > > > advertised to > > > the > > > + outside network. > > > + </li> > > > + <li> > > > + <code>receive</code>: This route has been learned from > > > the > > > outside > > > + network. > > > + </li> > > > + </ul> > > > + </column> > > > + > > > + <column name="external_ids"> > > > + See <em>External IDs</em> at the beginning of this > > > document. > > > + </column> > > > + </table> > > > </database> > > > > _______________________________________________ > > dev mailing list > > dev@openvswitch.org > > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
On Fri, Dec 06, 2024 at 10:10:21AM +0100, martin.kalcok@canonical.com wrote: > On Thu, 2024-12-05 at 15:28 +0100, Felix Huettner wrote: > > On Thu, Dec 05, 2024 at 02:08:37PM +0100, > > martin.kalcok@canonical.com wrote: > > > Hi Felix and Dumitru, > > > Thanks for posting this standalone change Felix. I'm going to take > > > it > > > for a spin and try to integrate it to Frode's original "route > > > leaking" > > > RFC. > > > I have one question/suggestion for the schema that I'll leave > > > below. > > > > > > On Thu, 2024-12-05 at 14:00 +0100, Felix Huettner via dev wrote: > > > > The Route table will be used in the future to coordinate routing > > > > information between northd and ovn-controller. > > > > Northd will insert routes that should be advertised to the > > > > outside > > > > fabric. > > > > Ovn-controller will insert routes that have been learned from the > > > > outside fabric. > > > > > > > > Signed-off-by: Felix Huettner <felix.huettner@stackit.cloud> > > > > --- > > > > v2->v3: > > > > * refType is now strong > > > > * fixed typos > > > > > > > > > > > > ovn-sb.ovsschema | 29 ++++++++++++++++-- > > > > ovn-sb.xml | 80 > > > > ++++++++++++++++++++++++++++++++++++++++++++++++ > > > > 2 files changed, 107 insertions(+), 2 deletions(-) > > > > > > > > diff --git a/ovn-sb.ovsschema b/ovn-sb.ovsschema > > > > index 73abf2c8d..88763f429 100644 > > > > --- a/ovn-sb.ovsschema > > > > +++ b/ovn-sb.ovsschema > > > > @@ -1,7 +1,7 @@ > > > > { > > > > "name": "OVN_Southbound", > > > > - "version": "20.37.0", > > > > - "cksum": "1950136776 31493", > > > > + "version": "20.38.0", > > > > + "cksum": "1392903395 32893", > > > > "tables": { > > > > "SB_Global": { > > > > "columns": { > > > > @@ -617,6 +617,31 @@ > > > > "type": {"key": "string", "value": "string", > > > > "min": 0, "max": "unlimited"}}}, > > > > "indexes": [["chassis"]], > > > > + "isRoot": true}, > > > > + "Route": { > > > > + "columns": { > > > > + "datapath": > > > > + {"type": {"key": {"type": "uuid", > > > > + "refTable": > > > > "Datapath_Binding"}}}, > > > > + "logical_port": {"type": {"key": {"type": > > > > "uuid", > > > > + "refTable": > > > > "Port_Binding", > > > > + "refType": > > > > "strong"}}}, > > > > + "ip_prefix": {"type": "string"}, > > > > + "nexthop": {"type": "string"}, > > > > + "tracked_port": {"type": {"key": {"type": > > > > "uuid", > > > > + "refTable": > > > > "Port_Binding", > > > > + "refType": > > > > "strong"}, > > > > + "min": 0, > > > > + "max": 1}}, > > > > + "type": {"type": {"key": {"type": "string", > > > > + "enum": ["set", > > > > ["advertise", > > > > + > > > > "receive"]]}, > > > > + "min": 1, "max": 1}}, > > > > > > I wonder if this field is necessary. We could use presence or > > > absence > > > of "nexthop" value to determine whether the route is being learned > > > or > > > advertised. I'm not sure if ovsdb is doing something clever with > > > enums > > > or stores them as raw values in each row, but removing this could > > > save > > > us some space. What do you think? > > > > Hi Martin, > > > > i think generally we could use the information in the "nexthop" > > column > > for this. However i would find this rather surprising if i am not > > deeply > > in the implementation. For debugging at least it is not obvious. > > > > I am not sure how much space saving we would get out of this, but i > > don't think that it is worth the cost of less debuggability. > > Especially > > since each learned/announced route will cause additionally once or > > multiple logical flows to be created which are significantly larger > > than > > this type field. > > > > But i am open to other opinions as well. > > > > Thanks a lot > > Felix > > Hi Felix, > I'm not hell-bent on this change and I agree that it would make the > code less explicit. It just occurred to me because I recently saw > similar pattern applied in NAT table [0] where NAT rules is considered > "distributed" if the record contains both logical_port and > external_nat. Hi Martin, i think a big difference here is that we need to have both ovn-northd and ovn-controller agree to what each of these things mean. In the above example it looked like something only northd needs to know internally to derive the correct flows. > > I don't have enough experience with ovsdb in large deployments, it was > just a gut feel that potentially saving ~8bytes for a table that could > have potentially 100s or 1000s of records could be impactful. > There's also the fact that without the server-side validation (i.e. if > tpye is "advertised" then nexthop must not be empty), it's up to the > clients to enforce that invalid records are not created. I just spend some time measureing it by adding a few thousand Routes and checking the RSS difference. With the current schema (actually with Dumitru's requested adjustments) we are at ~3230 Byte per route. If we remove the "type" we are at ~2514 Byte. So ~716 Byte difference. If we calculate with 10.000 Routes then we end up at roughly 7MB of difference. I would think this is worth the clarity. Additionally it allows us to later actually use nexthop also for advertised routes. Allthough honestly i don't know a usecase for this. Whats your opinion on that one? Thanks a lot Felix > > The issue of transparency could be solved with documentation and > abstraction in parsing method (like in the case of the NAT). > > Anyway, as I said, this is not a blocker from my side, just a > suggestion for consideration. > > Martin > > [0] > https://github.com/ovn-org/ovn/blob/3b32b7d08ebcd005681db5131c798da07a89133a/northd/northd.c#L16506-L16530 > > > > > > > > Martin. > > > > > > > + "external_ids": { > > > > + "type": {"key": "string", "value": "string", > > > > + "min": 0, "max": "unlimited"}}}, > > > > + "indexes": [["datapath", "logical_port", > > > > "ip_prefix", > > > > "nexthop", > > > > + "type"]], > > > > "isRoot": true} > > > > } > > > > } > > > > diff --git a/ovn-sb.xml b/ovn-sb.xml > > > > index ea4adc1c3..e435cf243 100644 > > > > --- a/ovn-sb.xml > > > > +++ b/ovn-sb.xml > > > > @@ -5217,4 +5217,84 @@ tcp.flags = RST; > > > > The set of variable values for a given chassis. > > > > </column> > > > > </table> > > > > + > > > > + <table name="Route"> > > > > + <p> > > > > + Each record represents a route that is exported from ovn > > > > or > > > > imported to > > > > + ovn using some dynamic routing logic outside of ovn. > > > > + It is populated on the one hand by <code>ovn-northd</code> > > > > based on the > > > > + addresses, routes and NAT Entries of a > > > > + <code>OVN_Northbound.Logical_Router_Port</code>. > > > > + On the other hand <code>ovn-controller</code> populates it > > > > with routes > > > > + learned from outside of OVN. > > > > + </p> > > > > + > > > > + <column name="datapath"> > > > > + The datapath belonging to the > > > > + <code>OVN_Northbound.Logical_Router</code> that this route > > > > is > > > > valid > > > > + for. > > > > + </column> > > > > + > > > > + <column name="logical_port"> > > > > + <p> > > > > + If the type is <code>advertise</code> then this is the > > > > logical_port > > > > + the router will send packets out. > > > > + </p> > > > > + > > > > + <p> > > > > + If the type is <code>receive</code> then this is the > > > > logical_port > > > > + the route was learned on. > > > > + </p> > > > > + </column> > > > > + > > > > + <column name="ip_prefix"> > > > > + <p> > > > > + IP prefix of this route (e.g. 192.168.100.0/24). > > > > + </p> > > > > + </column> > > > > + > > > > + <column name="nexthop"> > > > > + <p> > > > > + If the type is <code>advertise</code> then this is > > > > empty. > > > > + </p> > > > > + > > > > + <p> > > > > + If the type is <code>receive</code> then this is the > > > > nexthop > > > > ip we > > > > + from the outside. > > > > + </p> > > > > + </column> > > > > + > > > > + <column name="tracked_port"> > > > > + <p> > > > > + Only relevant for type <code>advertise</code>. > > > > + > > > > + In combination with a host <code>ip_prefix</code> this > > > > tracks the port > > > > + OVN will forward the packets for this destination to. > > > > + > > > > + An announcing chassis can use this information to check > > > > if > > > > this > > > > + destination is local and adjust the route priorities > > > > based > > > > on that. > > > > + </p> > > > > + </column> > > > > + > > > > + <column name="type"> > > > > + <p> > > > > + If the route is to be exported from OVN to the outside > > > > network or if > > > > + it is imported from the outside network. > > > > + </p> > > > > + <ul> > > > > + <li> > > > > + <code>advertise</code>: This route should be > > > > advertised to > > > > the > > > > + outside network. > > > > + </li> > > > > + <li> > > > > + <code>receive</code>: This route has been learned from > > > > the > > > > outside > > > > + network. > > > > + </li> > > > > + </ul> > > > > + </column> > > > > + > > > > + <column name="external_ids"> > > > > + See <em>External IDs</em> at the beginning of this > > > > document. > > > > + </column> > > > > + </table> > > > > </database> > > > > > > _______________________________________________ > > > dev mailing list > > > dev@openvswitch.org > > > https://mail.openvswitch.org/mailman/listinfo/ovs-dev > > _______________________________________________ > dev mailing list > dev@openvswitch.org > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
On Mon, 2024-12-09 at 13:37 +0100, Felix Huettner wrote: > On Fri, Dec 06, 2024 at 10:10:21AM +0100, > martin.kalcok@canonical.com wrote: > > On Thu, 2024-12-05 at 15:28 +0100, Felix Huettner wrote: > > > On Thu, Dec 05, 2024 at 02:08:37PM +0100, > > > martin.kalcok@canonical.com wrote: > > > > Hi Felix and Dumitru, > > > > Thanks for posting this standalone change Felix. I'm going to > > > > take > > > > it > > > > for a spin and try to integrate it to Frode's original "route > > > > leaking" > > > > RFC. > > > > I have one question/suggestion for the schema that I'll leave > > > > below. > > > > > > > > On Thu, 2024-12-05 at 14:00 +0100, Felix Huettner via dev > > > > wrote: > > > > > The Route table will be used in the future to coordinate > > > > > routing > > > > > information between northd and ovn-controller. > > > > > Northd will insert routes that should be advertised to the > > > > > outside > > > > > fabric. > > > > > Ovn-controller will insert routes that have been learned from > > > > > the > > > > > outside fabric. > > > > > > > > > > Signed-off-by: Felix Huettner <felix.huettner@stackit.cloud> > > > > > --- > > > > > v2->v3: > > > > > * refType is now strong > > > > > * fixed typos > > > > > > > > > > > > > > > ovn-sb.ovsschema | 29 ++++++++++++++++-- > > > > > ovn-sb.xml | 80 > > > > > ++++++++++++++++++++++++++++++++++++++++++++++++ > > > > > 2 files changed, 107 insertions(+), 2 deletions(-) > > > > > > > > > > diff --git a/ovn-sb.ovsschema b/ovn-sb.ovsschema > > > > > index 73abf2c8d..88763f429 100644 > > > > > --- a/ovn-sb.ovsschema > > > > > +++ b/ovn-sb.ovsschema > > > > > @@ -1,7 +1,7 @@ > > > > > { > > > > > "name": "OVN_Southbound", > > > > > - "version": "20.37.0", > > > > > - "cksum": "1950136776 31493", > > > > > + "version": "20.38.0", > > > > > + "cksum": "1392903395 32893", > > > > > "tables": { > > > > > "SB_Global": { > > > > > "columns": { > > > > > @@ -617,6 +617,31 @@ > > > > > "type": {"key": "string", "value": > > > > > "string", > > > > > "min": 0, "max": > > > > > "unlimited"}}}, > > > > > "indexes": [["chassis"]], > > > > > + "isRoot": true}, > > > > > + "Route": { > > > > > + "columns": { > > > > > + "datapath": > > > > > + {"type": {"key": {"type": "uuid", > > > > > + "refTable": > > > > > "Datapath_Binding"}}}, > > > > > + "logical_port": {"type": {"key": {"type": > > > > > "uuid", > > > > > + > > > > > "refTable": > > > > > "Port_Binding", > > > > > + "refType": > > > > > "strong"}}}, > > > > > + "ip_prefix": {"type": "string"}, > > > > > + "nexthop": {"type": "string"}, > > > > > + "tracked_port": {"type": {"key": {"type": > > > > > "uuid", > > > > > + > > > > > "refTable": > > > > > "Port_Binding", > > > > > + "refType": > > > > > "strong"}, > > > > > + "min": 0, > > > > > + "max": 1}}, > > > > > + "type": {"type": {"key": {"type": "string", > > > > > + "enum": ["set", > > > > > ["advertise", > > > > > + > > > > > "receive"]]}, > > > > > + "min": 1, "max": 1}}, > > > > > > > > I wonder if this field is necessary. We could use presence or > > > > absence > > > > of "nexthop" value to determine whether the route is being > > > > learned > > > > or > > > > advertised. I'm not sure if ovsdb is doing something clever > > > > with > > > > enums > > > > or stores them as raw values in each row, but removing this > > > > could > > > > save > > > > us some space. What do you think? > > > > > > Hi Martin, > > > > > > i think generally we could use the information in the "nexthop" > > > column > > > for this. However i would find this rather surprising if i am not > > > deeply > > > in the implementation. For debugging at least it is not obvious. > > > > > > I am not sure how much space saving we would get out of this, but > > > i > > > don't think that it is worth the cost of less debuggability. > > > Especially > > > since each learned/announced route will cause additionally once > > > or > > > multiple logical flows to be created which are significantly > > > larger > > > than > > > this type field. > > > > > > But i am open to other opinions as well. > > > > > > Thanks a lot > > > Felix > > > > Hi Felix, > > I'm not hell-bent on this change and I agree that it would make the > > code less explicit. It just occurred to me because I recently saw > > similar pattern applied in NAT table [0] where NAT rules is > > considered > > "distributed" if the record contains both logical_port and > > external_nat. > > Hi Martin, > > i think a big difference here is that we need to have both ovn-northd > and ovn-controller agree to what each of these things mean. > In the above example it looked like something only northd needs to > know > internally to derive the correct flows. > > > > > I don't have enough experience with ovsdb in large deployments, it > > was > > just a gut feel that potentially saving ~8bytes for a table that > > could > > have potentially 100s or 1000s of records could be impactful. > > There's also the fact that without the server-side validation (i.e. > > if > > tpye is "advertised" then nexthop must not be empty), it's up to > > the > > clients to enforce that invalid records are not created. > > I just spend some time measureing it by adding a few thousand Routes > and > checking the RSS difference. > With the current schema (actually with Dumitru's requested > adjustments) > we are at ~3230 Byte per route. If we remove the "type" we are at > ~2514 > Byte. So ~716 Byte difference. > > If we calculate with 10.000 Routes then we end up at roughly 7MB of > difference. > I would think this is worth the clarity. > > Additionally it allows us to later actually use nexthop also for > advertised routes. Allthough honestly i don't know a usecase for > this. > > Whats your opinion on that one? > > Thanks a lot > Felix Hi Felix, thanks for running numbers on this. The difference indeed seems negligible even on large deployments. You are right that the trade-off for removing the "type" column is probably not worth it. I've put this version (v3) to work on Friday by integrating it to fnordahl's original route leaking RFC and one more though occurred to me. Would it be worth to split the `Route` table to `Advertised_Route` and `Learned_Route`? I think this could be beneficial because: * Both "types" of routes have different "owners". While "Learned routes" are primarily managed "from the bottom", by the controller, "Advertised routes" are managed "from the top", by northd. Having two different components being in charge of keeping the table up to date seems to complicate things. * "Advertised routes" could do with just following attributes: * IP prefix * logical_port (strong ref to LRP that's supposed to advertise this prefix) * With two separate tables, development of learning/advertising would be more independent as changing requirements in one feature wouldn't affect development of the other. With the simplified structure of the `Advertised_Route` table that I mentioned above, northd would be the only component that adds data to this table and records would be either explicitly removed by northd (when NAT or LB gets removed) or implicitly when the LRP is removed and `logical_port` reference goes away. What do you think? Thanks, Martin. > > > > > The issue of transparency could be solved with documentation and > > abstraction in parsing method (like in the case of the NAT). > > > > Anyway, as I said, this is not a blocker from my side, just a > > suggestion for consideration. > > > > Martin > > > > [0] > > https://github.com/ovn-org/ovn/blob/3b32b7d08ebcd005681db5131c798da07a89133a/northd/northd.c#L16506-L16530 > > > > > > > > > > > Martin. > > > > > > > > > + "external_ids": { > > > > > + "type": {"key": "string", "value": > > > > > "string", > > > > > + "min": 0, "max": > > > > > "unlimited"}}}, > > > > > + "indexes": [["datapath", "logical_port", > > > > > "ip_prefix", > > > > > "nexthop", > > > > > + "type"]], > > > > > "isRoot": true} > > > > > } > > > > > } > > > > > diff --git a/ovn-sb.xml b/ovn-sb.xml > > > > > index ea4adc1c3..e435cf243 100644 > > > > > --- a/ovn-sb.xml > > > > > +++ b/ovn-sb.xml > > > > > @@ -5217,4 +5217,84 @@ tcp.flags = RST; > > > > > The set of variable values for a given chassis. > > > > > </column> > > > > > </table> > > > > > + > > > > > + <table name="Route"> > > > > > + <p> > > > > > + Each record represents a route that is exported from > > > > > ovn > > > > > or > > > > > imported to > > > > > + ovn using some dynamic routing logic outside of ovn. > > > > > + It is populated on the one hand by <code>ovn- > > > > > northd</code> > > > > > based on the > > > > > + addresses, routes and NAT Entries of a > > > > > + <code>OVN_Northbound.Logical_Router_Port</code>. > > > > > + On the other hand <code>ovn-controller</code> > > > > > populates it > > > > > with routes > > > > > + learned from outside of OVN. > > > > > + </p> > > > > > + > > > > > + <column name="datapath"> > > > > > + The datapath belonging to the > > > > > + <code>OVN_Northbound.Logical_Router</code> that this > > > > > route > > > > > is > > > > > valid > > > > > + for. > > > > > + </column> > > > > > + > > > > > + <column name="logical_port"> > > > > > + <p> > > > > > + If the type is <code>advertise</code> then this is > > > > > the > > > > > logical_port > > > > > + the router will send packets out. > > > > > + </p> > > > > > + > > > > > + <p> > > > > > + If the type is <code>receive</code> then this is the > > > > > logical_port > > > > > + the route was learned on. > > > > > + </p> > > > > > + </column> > > > > > + > > > > > + <column name="ip_prefix"> > > > > > + <p> > > > > > + IP prefix of this route (e.g. 192.168.100.0/24). > > > > > + </p> > > > > > + </column> > > > > > + > > > > > + <column name="nexthop"> > > > > > + <p> > > > > > + If the type is <code>advertise</code> then this is > > > > > empty. > > > > > + </p> > > > > > + > > > > > + <p> > > > > > + If the type is <code>receive</code> then this is the > > > > > nexthop > > > > > ip we > > > > > + from the outside. > > > > > + </p> > > > > > + </column> > > > > > + > > > > > + <column name="tracked_port"> > > > > > + <p> > > > > > + Only relevant for type <code>advertise</code>. > > > > > + > > > > > + In combination with a host <code>ip_prefix</code> > > > > > this > > > > > tracks the port > > > > > + OVN will forward the packets for this destination > > > > > to. > > > > > + > > > > > + An announcing chassis can use this information to > > > > > check > > > > > if > > > > > this > > > > > + destination is local and adjust the route priorities > > > > > based > > > > > on that. > > > > > + </p> > > > > > + </column> > > > > > + > > > > > + <column name="type"> > > > > > + <p> > > > > > + If the route is to be exported from OVN to the > > > > > outside > > > > > network or if > > > > > + it is imported from the outside network. > > > > > + </p> > > > > > + <ul> > > > > > + <li> > > > > > + <code>advertise</code>: This route should be > > > > > advertised to > > > > > the > > > > > + outside network. > > > > > + </li> > > > > > + <li> > > > > > + <code>receive</code>: This route has been learned > > > > > from > > > > > the > > > > > outside > > > > > + network. > > > > > + </li> > > > > > + </ul> > > > > > + </column> > > > > > + > > > > > + <column name="external_ids"> > > > > > + See <em>External IDs</em> at the beginning of this > > > > > document. > > > > > + </column> > > > > > + </table> > > > > > </database> > > > > > > > > _______________________________________________ > > > > dev mailing list > > > > dev@openvswitch.org > > > > https://mail.openvswitch.org/mailman/listinfo/ovs-dev > > > > _______________________________________________ > > dev mailing list > > dev@openvswitch.org > > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
On 12/9/24 2:14 PM, martin.kalcok@canonical.com wrote: > On Mon, 2024-12-09 at 13:37 +0100, Felix Huettner wrote: >> On Fri, Dec 06, 2024 at 10:10:21AM +0100, >> martin.kalcok@canonical.com wrote: >>> On Thu, 2024-12-05 at 15:28 +0100, Felix Huettner wrote: >>>> On Thu, Dec 05, 2024 at 02:08:37PM +0100, >>>> martin.kalcok@canonical.com wrote: >>>>> Hi Felix and Dumitru, >>>>> Thanks for posting this standalone change Felix. I'm going to >>>>> take >>>>> it >>>>> for a spin and try to integrate it to Frode's original "route >>>>> leaking" >>>>> RFC. >>>>> I have one question/suggestion for the schema that I'll leave >>>>> below. >>>>> >>>>> On Thu, 2024-12-05 at 14:00 +0100, Felix Huettner via dev >>>>> wrote: >>>>>> The Route table will be used in the future to coordinate >>>>>> routing >>>>>> information between northd and ovn-controller. >>>>>> Northd will insert routes that should be advertised to the >>>>>> outside >>>>>> fabric. >>>>>> Ovn-controller will insert routes that have been learned from >>>>>> the >>>>>> outside fabric. >>>>>> >>>>>> Signed-off-by: Felix Huettner <felix.huettner@stackit.cloud> >>>>>> --- >>>>>> v2->v3: >>>>>> * refType is now strong >>>>>> * fixed typos >>>>>> >>>>>> >>>>>> ovn-sb.ovsschema | 29 ++++++++++++++++-- >>>>>> ovn-sb.xml | 80 >>>>>> ++++++++++++++++++++++++++++++++++++++++++++++++ >>>>>> 2 files changed, 107 insertions(+), 2 deletions(-) >>>>>> >>>>>> diff --git a/ovn-sb.ovsschema b/ovn-sb.ovsschema >>>>>> index 73abf2c8d..88763f429 100644 >>>>>> --- a/ovn-sb.ovsschema >>>>>> +++ b/ovn-sb.ovsschema >>>>>> @@ -1,7 +1,7 @@ >>>>>> { >>>>>> "name": "OVN_Southbound", >>>>>> - "version": "20.37.0", >>>>>> - "cksum": "1950136776 31493", >>>>>> + "version": "20.38.0", >>>>>> + "cksum": "1392903395 32893", >>>>>> "tables": { >>>>>> "SB_Global": { >>>>>> "columns": { >>>>>> @@ -617,6 +617,31 @@ >>>>>> "type": {"key": "string", "value": >>>>>> "string", >>>>>> "min": 0, "max": >>>>>> "unlimited"}}}, >>>>>> "indexes": [["chassis"]], >>>>>> + "isRoot": true}, >>>>>> + "Route": { >>>>>> + "columns": { >>>>>> + "datapath": >>>>>> + {"type": {"key": {"type": "uuid", >>>>>> + "refTable": >>>>>> "Datapath_Binding"}}}, >>>>>> + "logical_port": {"type": {"key": {"type": >>>>>> "uuid", >>>>>> + >>>>>> "refTable": >>>>>> "Port_Binding", >>>>>> + "refType": >>>>>> "strong"}}}, >>>>>> + "ip_prefix": {"type": "string"}, >>>>>> + "nexthop": {"type": "string"}, >>>>>> + "tracked_port": {"type": {"key": {"type": >>>>>> "uuid", >>>>>> + >>>>>> "refTable": >>>>>> "Port_Binding", >>>>>> + "refType": >>>>>> "strong"}, >>>>>> + "min": 0, >>>>>> + "max": 1}}, >>>>>> + "type": {"type": {"key": {"type": "string", >>>>>> + "enum": ["set", >>>>>> ["advertise", >>>>>> + >>>>>> "receive"]]}, >>>>>> + "min": 1, "max": 1}}, >>>>> >>>>> I wonder if this field is necessary. We could use presence or >>>>> absence >>>>> of "nexthop" value to determine whether the route is being >>>>> learned >>>>> or >>>>> advertised. I'm not sure if ovsdb is doing something clever >>>>> with >>>>> enums >>>>> or stores them as raw values in each row, but removing this >>>>> could >>>>> save >>>>> us some space. What do you think? >>>> >>>> Hi Martin, >>>> >>>> i think generally we could use the information in the "nexthop" >>>> column >>>> for this. However i would find this rather surprising if i am not >>>> deeply >>>> in the implementation. For debugging at least it is not obvious. >>>> >>>> I am not sure how much space saving we would get out of this, but >>>> i >>>> don't think that it is worth the cost of less debuggability. >>>> Especially >>>> since each learned/announced route will cause additionally once >>>> or >>>> multiple logical flows to be created which are significantly >>>> larger >>>> than >>>> this type field. >>>> >>>> But i am open to other opinions as well. >>>> >>>> Thanks a lot >>>> Felix >>> >>> Hi Felix, >>> I'm not hell-bent on this change and I agree that it would make the >>> code less explicit. It just occurred to me because I recently saw >>> similar pattern applied in NAT table [0] where NAT rules is >>> considered >>> "distributed" if the record contains both logical_port and >>> external_nat. >> >> Hi Martin, >> >> i think a big difference here is that we need to have both ovn-northd >> and ovn-controller agree to what each of these things mean. >> In the above example it looked like something only northd needs to >> know >> internally to derive the correct flows. >> >>> >>> I don't have enough experience with ovsdb in large deployments, it >>> was >>> just a gut feel that potentially saving ~8bytes for a table that >>> could >>> have potentially 100s or 1000s of records could be impactful. >>> There's also the fact that without the server-side validation (i.e. >>> if >>> tpye is "advertised" then nexthop must not be empty), it's up to >>> the >>> clients to enforce that invalid records are not created. >> >> I just spend some time measureing it by adding a few thousand Routes >> and >> checking the RSS difference. >> With the current schema (actually with Dumitru's requested >> adjustments) >> we are at ~3230 Byte per route. If we remove the "type" we are at >> ~2514 >> Byte. So ~716 Byte difference. >> >> If we calculate with 10.000 Routes then we end up at roughly 7MB of >> difference. >> I would think this is worth the clarity. >> >> Additionally it allows us to later actually use nexthop also for >> advertised routes. Allthough honestly i don't know a usecase for >> this. >> >> Whats your opinion on that one? >> >> Thanks a lot >> Felix > > Hi Felix, > thanks for running numbers on this. The difference indeed seems > negligible even on large deployments. You are right that the trade-off > for removing the "type" column is probably not worth it. > > I've put this version (v3) to work on Friday by integrating it to > fnordahl's original route leaking RFC and one more though occurred to > me. > > Would it be worth to split the `Route` table to `Advertised_Route` and > `Learned_Route`? I think this could be beneficial because: > * Both "types" of routes have different "owners". While "Learned > routes" are primarily managed "from the bottom", by the controller, > "Advertised routes" are managed "from the top", by northd. Having two > different components being in charge of keeping the table up to date > seems to complicate things. > * "Advertised routes" could do with just following attributes: > * IP prefix > * logical_port (strong ref to LRP that's supposed to advertise this > prefix) > * With two separate tables, development of learning/advertising would > be more independent as changing requirements in one feature wouldn't > affect development of the other. > > With the simplified structure of the `Advertised_Route` table that I > mentioned above, northd would be the only component that adds data to > this table and records would be either explicitly removed by northd > (when NAT or LB gets removed) or implicitly when the LRP is removed and > `logical_port` reference goes away. > > What do you think? > Good point, Martin. Now that you mention it it kind of makes sense that we don't mix Advertised with Learnt. It reduces a bit complexity and chance of getting I-P wrong in ovn-controller / ovn-northd too. +1 from me for separate tables. Regards, Dumitru > Thanks, > Martin. > >> >>> >>> The issue of transparency could be solved with documentation and >>> abstraction in parsing method (like in the case of the NAT). >>> >>> Anyway, as I said, this is not a blocker from my side, just a >>> suggestion for consideration. >>> >>> Martin >>> >>> [0] >>> https://github.com/ovn-org/ovn/blob/3b32b7d08ebcd005681db5131c798da07a89133a/northd/northd.c#L16506-L16530 >>>> >>>>> >>>>> Martin. >>>>> >>>>>> + "external_ids": { >>>>>> + "type": {"key": "string", "value": >>>>>> "string", >>>>>> + "min": 0, "max": >>>>>> "unlimited"}}}, >>>>>> + "indexes": [["datapath", "logical_port", >>>>>> "ip_prefix", >>>>>> "nexthop", >>>>>> + "type"]], >>>>>> "isRoot": true} >>>>>> } >>>>>> } >>>>>> diff --git a/ovn-sb.xml b/ovn-sb.xml >>>>>> index ea4adc1c3..e435cf243 100644 >>>>>> --- a/ovn-sb.xml >>>>>> +++ b/ovn-sb.xml >>>>>> @@ -5217,4 +5217,84 @@ tcp.flags = RST; >>>>>> The set of variable values for a given chassis. >>>>>> </column> >>>>>> </table> >>>>>> + >>>>>> + <table name="Route"> >>>>>> + <p> >>>>>> + Each record represents a route that is exported from >>>>>> ovn >>>>>> or >>>>>> imported to >>>>>> + ovn using some dynamic routing logic outside of ovn. >>>>>> + It is populated on the one hand by <code>ovn- >>>>>> northd</code> >>>>>> based on the >>>>>> + addresses, routes and NAT Entries of a >>>>>> + <code>OVN_Northbound.Logical_Router_Port</code>. >>>>>> + On the other hand <code>ovn-controller</code> >>>>>> populates it >>>>>> with routes >>>>>> + learned from outside of OVN. >>>>>> + </p> >>>>>> + >>>>>> + <column name="datapath"> >>>>>> + The datapath belonging to the >>>>>> + <code>OVN_Northbound.Logical_Router</code> that this >>>>>> route >>>>>> is >>>>>> valid >>>>>> + for. >>>>>> + </column> >>>>>> + >>>>>> + <column name="logical_port"> >>>>>> + <p> >>>>>> + If the type is <code>advertise</code> then this is >>>>>> the >>>>>> logical_port >>>>>> + the router will send packets out. >>>>>> + </p> >>>>>> + >>>>>> + <p> >>>>>> + If the type is <code>receive</code> then this is the >>>>>> logical_port >>>>>> + the route was learned on. >>>>>> + </p> >>>>>> + </column> >>>>>> + >>>>>> + <column name="ip_prefix"> >>>>>> + <p> >>>>>> + IP prefix of this route (e.g. 192.168.100.0/24). >>>>>> + </p> >>>>>> + </column> >>>>>> + >>>>>> + <column name="nexthop"> >>>>>> + <p> >>>>>> + If the type is <code>advertise</code> then this is >>>>>> empty. >>>>>> + </p> >>>>>> + >>>>>> + <p> >>>>>> + If the type is <code>receive</code> then this is the >>>>>> nexthop >>>>>> ip we >>>>>> + from the outside. >>>>>> + </p> >>>>>> + </column> >>>>>> + >>>>>> + <column name="tracked_port"> >>>>>> + <p> >>>>>> + Only relevant for type <code>advertise</code>. >>>>>> + >>>>>> + In combination with a host <code>ip_prefix</code> >>>>>> this >>>>>> tracks the port >>>>>> + OVN will forward the packets for this destination >>>>>> to. >>>>>> + >>>>>> + An announcing chassis can use this information to >>>>>> check >>>>>> if >>>>>> this >>>>>> + destination is local and adjust the route priorities >>>>>> based >>>>>> on that. >>>>>> + </p> >>>>>> + </column> >>>>>> + >>>>>> + <column name="type"> >>>>>> + <p> >>>>>> + If the route is to be exported from OVN to the >>>>>> outside >>>>>> network or if >>>>>> + it is imported from the outside network. >>>>>> + </p> >>>>>> + <ul> >>>>>> + <li> >>>>>> + <code>advertise</code>: This route should be >>>>>> advertised to >>>>>> the >>>>>> + outside network. >>>>>> + </li> >>>>>> + <li> >>>>>> + <code>receive</code>: This route has been learned >>>>>> from >>>>>> the >>>>>> outside >>>>>> + network. >>>>>> + </li> >>>>>> + </ul> >>>>>> + </column> >>>>>> + >>>>>> + <column name="external_ids"> >>>>>> + See <em>External IDs</em> at the beginning of this >>>>>> document. >>>>>> + </column> >>>>>> + </table> >>>>>> </database> >>>>> >>>>> _______________________________________________ >>>>> dev mailing list >>>>> dev@openvswitch.org >>>>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev >>> >>> _______________________________________________ >>> dev mailing list >>> dev@openvswitch.org >>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev > > _______________________________________________ > dev mailing list > dev@openvswitch.org > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
On Mon, Dec 09, 2024 at 02:20:03PM +0100, Dumitru Ceara wrote: > On 12/9/24 2:14 PM, martin.kalcok@canonical.com wrote: > > On Mon, 2024-12-09 at 13:37 +0100, Felix Huettner wrote: > >> On Fri, Dec 06, 2024 at 10:10:21AM +0100, > >> martin.kalcok@canonical.com wrote: > >>> On Thu, 2024-12-05 at 15:28 +0100, Felix Huettner wrote: > >>>> On Thu, Dec 05, 2024 at 02:08:37PM +0100, > >>>> martin.kalcok@canonical.com wrote: > >>>>> Hi Felix and Dumitru, > >>>>> Thanks for posting this standalone change Felix. I'm going to > >>>>> take > >>>>> it > >>>>> for a spin and try to integrate it to Frode's original "route > >>>>> leaking" > >>>>> RFC. > >>>>> I have one question/suggestion for the schema that I'll leave > >>>>> below. > >>>>> > >>>>> On Thu, 2024-12-05 at 14:00 +0100, Felix Huettner via dev > >>>>> wrote: > >>>>>> The Route table will be used in the future to coordinate > >>>>>> routing > >>>>>> information between northd and ovn-controller. > >>>>>> Northd will insert routes that should be advertised to the > >>>>>> outside > >>>>>> fabric. > >>>>>> Ovn-controller will insert routes that have been learned from > >>>>>> the > >>>>>> outside fabric. > >>>>>> > >>>>>> Signed-off-by: Felix Huettner <felix.huettner@stackit.cloud> > >>>>>> --- > >>>>>> v2->v3: > >>>>>> * refType is now strong > >>>>>> * fixed typos > >>>>>> > >>>>>> > >>>>>> ovn-sb.ovsschema | 29 ++++++++++++++++-- > >>>>>> ovn-sb.xml | 80 > >>>>>> ++++++++++++++++++++++++++++++++++++++++++++++++ > >>>>>> 2 files changed, 107 insertions(+), 2 deletions(-) > >>>>>> > >>>>>> diff --git a/ovn-sb.ovsschema b/ovn-sb.ovsschema > >>>>>> index 73abf2c8d..88763f429 100644 > >>>>>> --- a/ovn-sb.ovsschema > >>>>>> +++ b/ovn-sb.ovsschema > >>>>>> @@ -1,7 +1,7 @@ > >>>>>> { > >>>>>> "name": "OVN_Southbound", > >>>>>> - "version": "20.37.0", > >>>>>> - "cksum": "1950136776 31493", > >>>>>> + "version": "20.38.0", > >>>>>> + "cksum": "1392903395 32893", > >>>>>> "tables": { > >>>>>> "SB_Global": { > >>>>>> "columns": { > >>>>>> @@ -617,6 +617,31 @@ > >>>>>> "type": {"key": "string", "value": > >>>>>> "string", > >>>>>> "min": 0, "max": > >>>>>> "unlimited"}}}, > >>>>>> "indexes": [["chassis"]], > >>>>>> + "isRoot": true}, > >>>>>> + "Route": { > >>>>>> + "columns": { > >>>>>> + "datapath": > >>>>>> + {"type": {"key": {"type": "uuid", > >>>>>> + "refTable": > >>>>>> "Datapath_Binding"}}}, > >>>>>> + "logical_port": {"type": {"key": {"type": > >>>>>> "uuid", > >>>>>> + > >>>>>> "refTable": > >>>>>> "Port_Binding", > >>>>>> + "refType": > >>>>>> "strong"}}}, > >>>>>> + "ip_prefix": {"type": "string"}, > >>>>>> + "nexthop": {"type": "string"}, > >>>>>> + "tracked_port": {"type": {"key": {"type": > >>>>>> "uuid", > >>>>>> + > >>>>>> "refTable": > >>>>>> "Port_Binding", > >>>>>> + "refType": > >>>>>> "strong"}, > >>>>>> + "min": 0, > >>>>>> + "max": 1}}, > >>>>>> + "type": {"type": {"key": {"type": "string", > >>>>>> + "enum": ["set", > >>>>>> ["advertise", > >>>>>> + > >>>>>> "receive"]]}, > >>>>>> + "min": 1, "max": 1}}, > >>>>> > >>>>> I wonder if this field is necessary. We could use presence or > >>>>> absence > >>>>> of "nexthop" value to determine whether the route is being > >>>>> learned > >>>>> or > >>>>> advertised. I'm not sure if ovsdb is doing something clever > >>>>> with > >>>>> enums > >>>>> or stores them as raw values in each row, but removing this > >>>>> could > >>>>> save > >>>>> us some space. What do you think? > >>>> > >>>> Hi Martin, > >>>> > >>>> i think generally we could use the information in the "nexthop" > >>>> column > >>>> for this. However i would find this rather surprising if i am not > >>>> deeply > >>>> in the implementation. For debugging at least it is not obvious. > >>>> > >>>> I am not sure how much space saving we would get out of this, but > >>>> i > >>>> don't think that it is worth the cost of less debuggability. > >>>> Especially > >>>> since each learned/announced route will cause additionally once > >>>> or > >>>> multiple logical flows to be created which are significantly > >>>> larger > >>>> than > >>>> this type field. > >>>> > >>>> But i am open to other opinions as well. > >>>> > >>>> Thanks a lot > >>>> Felix > >>> > >>> Hi Felix, > >>> I'm not hell-bent on this change and I agree that it would make the > >>> code less explicit. It just occurred to me because I recently saw > >>> similar pattern applied in NAT table [0] where NAT rules is > >>> considered > >>> "distributed" if the record contains both logical_port and > >>> external_nat. > >> > >> Hi Martin, > >> > >> i think a big difference here is that we need to have both ovn-northd > >> and ovn-controller agree to what each of these things mean. > >> In the above example it looked like something only northd needs to > >> know > >> internally to derive the correct flows. > >> > >>> > >>> I don't have enough experience with ovsdb in large deployments, it > >>> was > >>> just a gut feel that potentially saving ~8bytes for a table that > >>> could > >>> have potentially 100s or 1000s of records could be impactful. > >>> There's also the fact that without the server-side validation (i.e. > >>> if > >>> tpye is "advertised" then nexthop must not be empty), it's up to > >>> the > >>> clients to enforce that invalid records are not created. > >> > >> I just spend some time measureing it by adding a few thousand Routes > >> and > >> checking the RSS difference. > >> With the current schema (actually with Dumitru's requested > >> adjustments) > >> we are at ~3230 Byte per route. If we remove the "type" we are at > >> ~2514 > >> Byte. So ~716 Byte difference. > >> > >> If we calculate with 10.000 Routes then we end up at roughly 7MB of > >> difference. > >> I would think this is worth the clarity. > >> > >> Additionally it allows us to later actually use nexthop also for > >> advertised routes. Allthough honestly i don't know a usecase for > >> this. > >> > >> Whats your opinion on that one? > >> > >> Thanks a lot > >> Felix > > > > Hi Felix, > > thanks for running numbers on this. The difference indeed seems > > negligible even on large deployments. You are right that the trade-off > > for removing the "type" column is probably not worth it. > > > > I've put this version (v3) to work on Friday by integrating it to > > fnordahl's original route leaking RFC and one more though occurred to > > me. > > > > Would it be worth to split the `Route` table to `Advertised_Route` and > > `Learned_Route`? I think this could be beneficial because: > > * Both "types" of routes have different "owners". While "Learned > > routes" are primarily managed "from the bottom", by the controller, > > "Advertised routes" are managed "from the top", by northd. Having two > > different components being in charge of keeping the table up to date > > seems to complicate things. > > * "Advertised routes" could do with just following attributes: > > * IP prefix > > * logical_port (strong ref to LRP that's supposed to advertise this > > prefix) > > * With two separate tables, development of learning/advertising would > > be more independent as changing requirements in one feature wouldn't > > affect development of the other. > > > > With the simplified structure of the `Advertised_Route` table that I > > mentioned above, northd would be the only component that adds data to > > this table and records would be either explicitly removed by northd > > (when NAT or LB gets removed) or implicitly when the LRP is removed and > > `logical_port` reference goes away. > > > > What do you think? > > > > Good point, Martin. Now that you mention it it kind of makes sense that > we don't mix Advertised with Learnt. It reduces a bit complexity and > chance of getting I-P wrong in ovn-controller / ovn-northd too. > > +1 from me for separate tables. Hi Martin, Hi Dumitru, thanks for the comment. This should also make the logic in northd and ovn-controller more simple. I will try to incorporate this and the comments on v2 into my current code to see if i find any kind of issue with that and would afterwards post the next version. Thanks a lot Felix > > Regards, > Dumitru > > > Thanks, > > Martin. > > > >> > >>> > >>> The issue of transparency could be solved with documentation and > >>> abstraction in parsing method (like in the case of the NAT). > >>> > >>> Anyway, as I said, this is not a blocker from my side, just a > >>> suggestion for consideration. > >>> > >>> Martin > >>> > >>> [0] > >>> https://github.com/ovn-org/ovn/blob/3b32b7d08ebcd005681db5131c798da07a89133a/northd/northd.c#L16506-L16530 > >>>> > >>>>> > >>>>> Martin. > >>>>> > >>>>>> + "external_ids": { > >>>>>> + "type": {"key": "string", "value": > >>>>>> "string", > >>>>>> + "min": 0, "max": > >>>>>> "unlimited"}}}, > >>>>>> + "indexes": [["datapath", "logical_port", > >>>>>> "ip_prefix", > >>>>>> "nexthop", > >>>>>> + "type"]], > >>>>>> "isRoot": true} > >>>>>> } > >>>>>> } > >>>>>> diff --git a/ovn-sb.xml b/ovn-sb.xml > >>>>>> index ea4adc1c3..e435cf243 100644 > >>>>>> --- a/ovn-sb.xml > >>>>>> +++ b/ovn-sb.xml > >>>>>> @@ -5217,4 +5217,84 @@ tcp.flags = RST; > >>>>>> The set of variable values for a given chassis. > >>>>>> </column> > >>>>>> </table> > >>>>>> + > >>>>>> + <table name="Route"> > >>>>>> + <p> > >>>>>> + Each record represents a route that is exported from > >>>>>> ovn > >>>>>> or > >>>>>> imported to > >>>>>> + ovn using some dynamic routing logic outside of ovn. > >>>>>> + It is populated on the one hand by <code>ovn- > >>>>>> northd</code> > >>>>>> based on the > >>>>>> + addresses, routes and NAT Entries of a > >>>>>> + <code>OVN_Northbound.Logical_Router_Port</code>. > >>>>>> + On the other hand <code>ovn-controller</code> > >>>>>> populates it > >>>>>> with routes > >>>>>> + learned from outside of OVN. > >>>>>> + </p> > >>>>>> + > >>>>>> + <column name="datapath"> > >>>>>> + The datapath belonging to the > >>>>>> + <code>OVN_Northbound.Logical_Router</code> that this > >>>>>> route > >>>>>> is > >>>>>> valid > >>>>>> + for. > >>>>>> + </column> > >>>>>> + > >>>>>> + <column name="logical_port"> > >>>>>> + <p> > >>>>>> + If the type is <code>advertise</code> then this is > >>>>>> the > >>>>>> logical_port > >>>>>> + the router will send packets out. > >>>>>> + </p> > >>>>>> + > >>>>>> + <p> > >>>>>> + If the type is <code>receive</code> then this is the > >>>>>> logical_port > >>>>>> + the route was learned on. > >>>>>> + </p> > >>>>>> + </column> > >>>>>> + > >>>>>> + <column name="ip_prefix"> > >>>>>> + <p> > >>>>>> + IP prefix of this route (e.g. 192.168.100.0/24). > >>>>>> + </p> > >>>>>> + </column> > >>>>>> + > >>>>>> + <column name="nexthop"> > >>>>>> + <p> > >>>>>> + If the type is <code>advertise</code> then this is > >>>>>> empty. > >>>>>> + </p> > >>>>>> + > >>>>>> + <p> > >>>>>> + If the type is <code>receive</code> then this is the > >>>>>> nexthop > >>>>>> ip we > >>>>>> + from the outside. > >>>>>> + </p> > >>>>>> + </column> > >>>>>> + > >>>>>> + <column name="tracked_port"> > >>>>>> + <p> > >>>>>> + Only relevant for type <code>advertise</code>. > >>>>>> + > >>>>>> + In combination with a host <code>ip_prefix</code> > >>>>>> this > >>>>>> tracks the port > >>>>>> + OVN will forward the packets for this destination > >>>>>> to. > >>>>>> + > >>>>>> + An announcing chassis can use this information to > >>>>>> check > >>>>>> if > >>>>>> this > >>>>>> + destination is local and adjust the route priorities > >>>>>> based > >>>>>> on that. > >>>>>> + </p> > >>>>>> + </column> > >>>>>> + > >>>>>> + <column name="type"> > >>>>>> + <p> > >>>>>> + If the route is to be exported from OVN to the > >>>>>> outside > >>>>>> network or if > >>>>>> + it is imported from the outside network. > >>>>>> + </p> > >>>>>> + <ul> > >>>>>> + <li> > >>>>>> + <code>advertise</code>: This route should be > >>>>>> advertised to > >>>>>> the > >>>>>> + outside network. > >>>>>> + </li> > >>>>>> + <li> > >>>>>> + <code>receive</code>: This route has been learned > >>>>>> from > >>>>>> the > >>>>>> outside > >>>>>> + network. > >>>>>> + </li> > >>>>>> + </ul> > >>>>>> + </column> > >>>>>> + > >>>>>> + <column name="external_ids"> > >>>>>> + See <em>External IDs</em> at the beginning of this > >>>>>> document. > >>>>>> + </column> > >>>>>> + </table> > >>>>>> </database> > >>>>> > >>>>> _______________________________________________ > >>>>> dev mailing list > >>>>> dev@openvswitch.org > >>>>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev > >>> > >>> _______________________________________________ > >>> dev mailing list > >>> dev@openvswitch.org > >>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev > > > > _______________________________________________ > > dev mailing list > > dev@openvswitch.org > > https://mail.openvswitch.org/mailman/listinfo/ovs-dev > > _______________________________________________ > dev mailing list > dev@openvswitch.org > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
diff --git a/ovn-sb.ovsschema b/ovn-sb.ovsschema index 73abf2c8d..88763f429 100644 --- a/ovn-sb.ovsschema +++ b/ovn-sb.ovsschema @@ -1,7 +1,7 @@ { "name": "OVN_Southbound", - "version": "20.37.0", - "cksum": "1950136776 31493", + "version": "20.38.0", + "cksum": "1392903395 32893", "tables": { "SB_Global": { "columns": { @@ -617,6 +617,31 @@ "type": {"key": "string", "value": "string", "min": 0, "max": "unlimited"}}}, "indexes": [["chassis"]], + "isRoot": true}, + "Route": { + "columns": { + "datapath": + {"type": {"key": {"type": "uuid", + "refTable": "Datapath_Binding"}}}, + "logical_port": {"type": {"key": {"type": "uuid", + "refTable": "Port_Binding", + "refType": "strong"}}}, + "ip_prefix": {"type": "string"}, + "nexthop": {"type": "string"}, + "tracked_port": {"type": {"key": {"type": "uuid", + "refTable": "Port_Binding", + "refType": "strong"}, + "min": 0, + "max": 1}}, + "type": {"type": {"key": {"type": "string", + "enum": ["set", ["advertise", + "receive"]]}, + "min": 1, "max": 1}}, + "external_ids": { + "type": {"key": "string", "value": "string", + "min": 0, "max": "unlimited"}}}, + "indexes": [["datapath", "logical_port", "ip_prefix", "nexthop", + "type"]], "isRoot": true} } } diff --git a/ovn-sb.xml b/ovn-sb.xml index ea4adc1c3..e435cf243 100644 --- a/ovn-sb.xml +++ b/ovn-sb.xml @@ -5217,4 +5217,84 @@ tcp.flags = RST; The set of variable values for a given chassis. </column> </table> + + <table name="Route"> + <p> + Each record represents a route that is exported from ovn or imported to + ovn using some dynamic routing logic outside of ovn. + It is populated on the one hand by <code>ovn-northd</code> based on the + addresses, routes and NAT Entries of a + <code>OVN_Northbound.Logical_Router_Port</code>. + On the other hand <code>ovn-controller</code> populates it with routes + learned from outside of OVN. + </p> + + <column name="datapath"> + The datapath belonging to the + <code>OVN_Northbound.Logical_Router</code> that this route is valid + for. + </column> + + <column name="logical_port"> + <p> + If the type is <code>advertise</code> then this is the logical_port + the router will send packets out. + </p> + + <p> + If the type is <code>receive</code> then this is the logical_port + the route was learned on. + </p> + </column> + + <column name="ip_prefix"> + <p> + IP prefix of this route (e.g. 192.168.100.0/24). + </p> + </column> + + <column name="nexthop"> + <p> + If the type is <code>advertise</code> then this is empty. + </p> + + <p> + If the type is <code>receive</code> then this is the nexthop ip we + from the outside. + </p> + </column> + + <column name="tracked_port"> + <p> + Only relevant for type <code>advertise</code>. + + In combination with a host <code>ip_prefix</code> this tracks the port + OVN will forward the packets for this destination to. + + An announcing chassis can use this information to check if this + destination is local and adjust the route priorities based on that. + </p> + </column> + + <column name="type"> + <p> + If the route is to be exported from OVN to the outside network or if + it is imported from the outside network. + </p> + <ul> + <li> + <code>advertise</code>: This route should be advertised to the + outside network. + </li> + <li> + <code>receive</code>: This route has been learned from the outside + network. + </li> + </ul> + </column> + + <column name="external_ids"> + See <em>External IDs</em> at the beginning of this document. + </column> + </table> </database>
The Route table will be used in the future to coordinate routing information between northd and ovn-controller. Northd will insert routes that should be advertised to the outside fabric. Ovn-controller will insert routes that have been learned from the outside fabric. Signed-off-by: Felix Huettner <felix.huettner@stackit.cloud> --- v2->v3: * refType is now strong * fixed typos ovn-sb.ovsschema | 29 ++++++++++++++++-- ovn-sb.xml | 80 ++++++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 107 insertions(+), 2 deletions(-)