
[ovs-dev] DHCP Relay Agent support for overlay subnets

Message ID 20231212180402.155433-1-naveen.yerramneni@nutanix.com
State Changes Requested
Delegated to: Numan Siddique

Checks

Context Check Description
ovsrobot/apply-robot warning apply and check: warning
ovsrobot/github-robot-_Build_and_Test fail github build: failed
ovsrobot/github-robot-_ovn-kubernetes success github build: passed

Commit Message

Naveen Yerramneni Dec. 12, 2023, 6:04 p.m. UTC
This patch contains changes to enable DHCP Relay Agent support for overlay subnets.

    USE CASE:
    ----------
      - Enable IP address assignment for overlay subnets from the centralized DHCP server present in the underlay network.

    PREREQUISITES
    --------------
      - The Logical Router Port (LRP) IP should be assigned statically from the same overlay subnet that is managed by the DHCP server.
      - The LRP IP is used for the GIADDR field when relaying DHCP packets, and the same IP needs to be configured as the default gateway for the overlay subnet.
      - Overlay subnets managed by an external DHCP server are expected to be directly reachable from the underlay network.

    EXPECTED PACKET FLOW:
    ----------------------
    The following is the expected packet flow in order to support DHCP relay functionality in OVN.
      1. DHCP client originates DHCP discovery (broadcast).
      2. DHCP relay (running on OVN) receives the broadcast and forwards the packet to the DHCP server by converting it to unicast. While forwarding the packet, it sets the GIADDR in the DHCP header to the
         IP of the interface on which the DHCP packet was received.
      3. DHCP server uses the GIADDR field to decide the IP address pool from which an IP has to be assigned, and the DHCP offer is sent to the same IP (GIADDR).
      4. DHCP relay agent forwards the offer to the client, resetting the GIADDR field as it does so.
      5. DHCP client sends DHCP request (broadcast) packet.
      6. DHCP relay (running on OVN) receives the broadcast and forwards the packet to the DHCP server by converting it to unicast. While forwarding the packet, it sets the GIADDR in the DHCP header to the
         IP of the interface on which the DHCP packet was received.
      7. DHCP server sends the ACK packet.
      8. DHCP relay agent forwards the ACK packet to the client, resetting the GIADDR field as it does so.
      9. All future renew/release packets are exchanged directly between the DHCP client and the DHCP server.
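
The GIADDR handling in the steps above can be sketched in a few lines of C. This is only an illustration under stated assumptions, not code from this patch: struct bootp_hdr, relay_to_server() and relay_to_client() are hypothetical names, the header layout follows the fixed BOOTP/DHCP header of RFC 2131, and addresses are treated as opaque 32-bit values in network byte order.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define BOOTREQUEST 1   /* op value for client -> server messages */
#define BOOTREPLY   2   /* op value for server -> client messages */

/* Fixed part of a BOOTP/DHCP message (RFC 2131); options follow it.
 * Hypothetical type for illustration only. */
struct bootp_hdr {
    uint8_t  op;
    uint8_t  htype;
    uint8_t  hlen;
    uint8_t  hops;
    uint32_t xid;
    uint16_t secs;
    uint16_t flags;
    uint32_t ciaddr;
    uint32_t yiaddr;
    uint32_t siaddr;
    uint32_t giaddr;        /* relay agent IP, 0 when not relayed */
    uint8_t  chaddr[16];
    uint8_t  sname[64];
    uint8_t  file[128];
};

/* Steps 2 and 6: stamp the relay's interface IP into GIADDR.
 * Returns false (caller drops the packet) if this is not a client
 * request or if another relay has already set GIADDR. */
bool
relay_to_server(struct bootp_hdr *h, uint32_t relay_ip)
{
    assert(h != NULL);
    if (h->op != BOOTREQUEST || h->giaddr != 0) {
        return false;
    }
    h->giaddr = relay_ip;
    return true;
}

/* Steps 4 and 8: accept only replies addressed to this relay, then
 * reset GIADDR before handing the reply to the client. */
bool
relay_to_client(struct bootp_hdr *h, uint32_t relay_ip)
{
    assert(h != NULL);
    if (h->op != BOOTREPLY || h->giaddr != relay_ip) {
        return false;
    }
    h->giaddr = 0;
    return true;
}
```

A real relay would also rewrite the IP/UDP headers and fix up the UDP checksum; the sketch covers only the GIADDR bookkeeping.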

    OVN DHCP RELAY PACKET FLOW:
    ----------------------------
    To add DHCP Relay support on OVN, we need to replicate all the behavior described above using a distributed logical switch and logical router.
    At a high level, packet processing is distributed among the Logical Switch and Logical Router on the source node (where the VM is deployed) and the redirect chassis (RC) node.
      1. The request packet is processed on the source node where the VM is deployed, which relays the packet to the DHCP server.
      2. The response packet is first processed on the RC node (which first receives the packet from the underlay network). The RC node forwards the packet to the right node by filling in the destination MAC and IP.

    OVN Packet flow with DHCP relay is explained below.
      1. DHCP client (VM) sends the DHCP discover packet (broadcast).
      2. Logical switch converts the packet to L2 unicast by setting the destination MAC to the LRP's MAC.
      3. Logical router receives the packet and redirects it to the OVN controller.
      4. OVN controller updates the required information (GIADDR) in the DHCP payload after doing the required checks. If any check fails, the packet is dropped.
      5. Logical router converts the packet to L3 unicast and forwards it to the server. This packet gets routed like any other packet (via the RC node).
      6. Server replies with DHCP offer.
      7. RC node processes the DHCP offer and forwards it to the OVN controller.
      8. OVN controller does sanity checks, updates the destination MAC and destination IP (both available in the DHCP header), resets GIADDR, and reinjects the packet into the datapath.
         If any check fails, the packet is dropped.
      9. Logical router updates the source IP and port and forwards the packet to the logical switch.
      10. Logical switch delivers the packet to the DHCP client.
      11. Similar steps are performed for Request and Ack packets.
      12. All future renew/release packets are exchanged directly between the DHCP client and the DHCP server.

    NEW OVN ACTIONS
    ---------------

      1. dhcp_relay_req(<relay-ip>, <server-ip>)
          - This action executes on the source node on which the DHCP request originated.
          - This action relays the DHCP request coming from the client to the server. Relay-ip is used to update GIADDR in the DHCP header.
      2. dhcp_relay_resp_fwd(<relay-ip>, <server-ip>)
          - This action executes on the first node (RC node) which processes the DHCP response from the server.
          - This action updates the destination MAC and destination IP so that the response can be forwarded to the node from which the request originated.
          - Relay-ip and server-ip are used to validate GIADDR and SERVER ID in the DHCP payload.

    FLOWS
    -----
    Following are the flows required for one overlay subnet.

      1. table=27(ls_in_l2_lkup      ), priority=100  , match=(inport == <vm_port> && eth.src == <vm_mac> && ip4.src == 0.0.0.0 && ip4.dst == 255.255.255.255 && udp.src == 68 && udp.dst == 67), action=(eth.dst=<lrp_mac>;outport=<lrp_port>;next;/* DHCP_RELAY_REQ */)
      2. table=3 (lr_in_ip_input     ), priority=110  , match=(inport == <lrp_port> && ip4.src == 0.0.0.0 && ip4.dst == 255.255.255.255 && udp.src == 68 && udp.dst == 67), action=(dhcp_relay_req(<lrp_ip>,<dhcp_server_ip>);ip4.src=<lrp_ip>;ip4.dst=<dhcp_server_ip>;udp.src=67;next; /* DHCP_RELAY_REQ */)
      3. table=3 (lr_in_ip_input     ), priority=110  , match=(ip4.src == <dhcp_server_ip> && ip4.dst == <lrp_ip> && udp.src == 67 && udp.dst == 67), action=(next;/* DHCP_RELAY_RESP */)
      4. table=17(lr_in_dhcp_relay_resp_fwd), priority=110  , match=(ip4.src == <dhcp_server_ip> && ip4.dst == <lrp_ip> && udp.src == 67 && udp.dst == 67), action=(dhcp_relay_resp_fwd(<lrp_ip>,<dhcp_server_ip>);ip4.src=<lrp_ip>;udp.dst=68;outport=<lrp_port>;output; /* DHCP_RELAY_RESP */)

    NEW PIPELINE STAGES
    -------------------
    The following stage is added for the DHCP relay feature. Some of the flows are fitted into existing pipeline stages.
      1. lr_in_dhcp_relay_resp_fwd
          - Forward the DHCP response to the appropriate node

    NB SCHEMA CHANGES
    ----------------
      1. New DHCP_Relay table
          "DHCP_Relay": {
                "columns": {
                    "name": {"type": "string"},
                    "servers": {"type": {"key": "string",
                                           "min": 0,
                                           "max": 1}},
                    "external_ids": {
                        "type": {"key": "string", "value": "string",
                                "min": 0, "max": "unlimited"}}},
                "isRoot": true},
      2. New column to Logical_Router_Port table
          "dhcp_relay": {"type": {"key": {"type": "uuid",
                                "refTable": "DHCP_Relay",
                                "refType": "weak"},
                                "min": 0,
                                "max": 1}},
      3. New column to Logical_Switch table
          "dhcp_relay_port": {"type": {"key": {"type": "uuid",
                                        "refTable": "Logical_Router_Port",
                                        "refType": "weak"},
                                         "min": 0,
                                         "max": 1}}},

    Commands to enable the feature:
    ------------------------------
      - ovn-nbctl create DHCP_Relay servers=<ip>
      - ovn-nbctl set Logical_Router_port <lrp_uuid> dhcp_relay=<dhcp_relay_uuid>
      - ovn-nbctl set Logical_Switch <ls_uuid> dhcp_relay_port=<lrp_uuid>

    Example:
    -------
     ovn-nbctl ls-add sw1
     ovn-nbctl lsp-add sw1 sw1-port1
     ovn-nbctl lsp-set-addresses sw1-port1 <MAC> #Only MAC address has to be specified when logical ports are created.
     ovn-nbctl lr-add lr1
     ovn-nbctl lrp-add lr1 lr1-port1 <MAC> <GATEWAY_IP/Prefix> #GATEWAY IP is set in GIADDR field when relaying the DHCP requests to server.
     ovn-nbctl lsp-add sw1 lr1-attachment
     ovn-nbctl lsp-set-type lr1-attachment router
     ovn-nbctl lsp-set-addresses lr1-attachment <MAC>
     ovn-nbctl lsp-set-options lr1-attachment router-port=lr1-port1
     ovn-nbctl create DHCP_Relay servers=<DHCP_SERVER_IP>
     ovn-nbctl set Logical_Router_port <lrp_uuid> dhcp_relay=<relay_uuid>
     ovn-nbctl set Logical_Switch <ls_uuid> dhcp_relay_port=<lrp_uuid>

    Limitations:
    ------------
      - OVN features that need an IP address to be configured on the logical port (such as proxy ARP) will not be supported for overlay subnets on which DHCP relay is enabled.

    References:
    ----------
      - RFC 1541, RFC 1542, RFC 2131

Signed-off-by: Naveen Yerramneni <naveen.yerramneni@nutanix.com>
Co-authored-by: Huzaifa Calcuttawala <huzaifa.c@nutanix.com>
Signed-off-by: Huzaifa Calcuttawala <huzaifa.c@nutanix.com>
CC: Mary Manohar <mary.manohar@nutanix.com>
---
 controller/pinctrl.c  | 441 ++++++++++++++++++++++++++++++++++++++++++
 include/ovn/actions.h |  26 +++
 lib/actions.c         | 117 +++++++++++
 lib/ovn-l7.h          |   1 +
 northd/northd.c       | 177 ++++++++++++++++-
 ovn-nb.ovsschema      |  25 ++-
 ovn-nb.xml            |  28 +++
 tests/atlocal.in      |   3 +
 tests/ovn-northd.at   |  41 +++-
 tests/ovn.at          |  12 +-
 tests/system-ovn.at   | 150 ++++++++++++++
 utilities/ovn-trace.c |  28 +++
 12 files changed, 1032 insertions(+), 17 deletions(-)

Comments

0-day Robot Dec. 12, 2023, 6:30 p.m. UTC | #1
References:  <20231212180402.155433-1-naveen.yerramneni@nutanix.com>
 

Bleep bloop.  Greetings Naveen Yerramneni, I am a robot and I have tried out your patch.
Thanks for your contribution.

I encountered some error that I wasn't expecting.  See the details below.


checkpatch:
WARNING: Line is 80 characters long (recommended limit is 79)
#214 FILE: controller/pinctrl.c:1951:
     *| UDP HEADER  | DHCP HEADER  | 4 Byte DHCP Cookie | DHCP OPTIONS(var len)|

WARNING: Line is 80 characters long (recommended limit is 79)
#400 FILE: controller/pinctrl.c:2137:
     *| UDP HEADER  | DHCP HEADER  | 4 Byte DHCP Cookie | DHCP OPTIONS(var len)|

Lines checked: 1484, Warnings: 2, Errors: 0


Please check this out.  If you feel there has been an error, please email aconole@redhat.com

Thanks,
0-day Robot

Numan Siddique Jan. 15, 2024, 9 p.m. UTC | #2
On Tue, Dec 12, 2023 at 1:05 PM Naveen Yerramneni
<naveen.yerramneni@nutanix.com> wrote:
> [...]

Hi Naveen,

Thanks for the patch.  Sorry for the delayed response.

I've a few comments.

1.  Regarding the newly added table DHCP_Relay in the NB DB and the newly added
    columns in the Logical_Switch and Logical_Router_Port tables.

    I don't think there is a need to add the new DHCP_Relay table, since it only
    stores the DHCP relay agent server IP.
    It could also complicate northd incremental processing.

    For example, suppose we have the logical switches and router below:

    ovn-nbctl lr-add R1
    ovn-nbctl ls-add sw0
    ovn-nbctl ls-add sw1
    ovn-nbctl ls-add sw-ext
    ovn-nbctl lrp-add R1 rp-sw0 00:00:01:01:02:03 192.168.1.1/24
    ovn-nbctl lrp-add R1 rp-sw1 00:00:03:01:02:03 192.168.2.1/24
    ovn-nbctl lrp-add R1 rp-ext 00:00:02:01:02:03 172.16.1.254/24

    ovn-nbctl lsp-add sw0 sw0-rp -- set Logical_Switch_Port sw0-rp \
    type=router options:router-port=rp-sw0 \
    -- lsp-set-addresses sw0-rp router

    ovn-nbctl lsp-add sw1 sw1-rp -- set Logical_Switch_Port sw1-rp \
    type=router options:router-port=rp-sw1 \
    -- lsp-set-addresses sw1-rp router

    I'd suggest doing something like below to enable this feature.

    ovn-nbctl set Logical_Switch_Port sw0-rp options:dhcp_relay=true
    ovn-nbctl set Logical_Switch_Port sw1-rp options:dhcp_relay=true

    (Make sure that only one logical switch port of type router can have this
     flag, dhcp_relay, set for a given logical switch, and document this limitation.)

    ovn-nbctl set Logical_Router_port rp-sw0 options:dhcp_relay_ip=172.16.1.1
    ovn-nbctl set Logical_Router_port rp-sw1 options:dhcp_relay_ip=172.16.1.1

    Let me know if there are any limitations with this.

2.  Regarding the newly added actions dhcp_relay_req() and dhcp_relay_resp_fwd().

    Both of these actions are encoded as OVS controller actions with pause enabled,
    which means ovs-vswitchd has to freeze the flow translation and resume it once
    ovn-controller resumes the packet.  But the functions
    pinctrl_handle_dhcp_relay_req() and pinctrl_handle_dhcp_relay_resp_fwd() do not
    resume the packet if the packet has errors.  This is wrong: the packet must
    always be resumed, otherwise vswitchd will never thaw the frozen translation.

    You can see the existing OVN actions, put_dhcp_opts() and a few others, which
    use the controller action with pause.  In such actions, the result of the action
    is stored in a register bit (i.e. whether put_dhcp_opts() was successful or not),
    and in the next stage we take a decision based on the result.

    For the action dhcp_relay_req(relay_ip, server_ip), I don't think you should use
    the pause flag.  Also, in this action the argument server_ip is never used in the
    function pinctrl_handle_dhcp_relay_req() other than for logging.

     I'd suggest you do something like this:

    table=3 (lr_in_ip_input     ), priority=110  ,
      match=(inport == "lrp1" && ip4.src == 0.0.0.0 && ip4.dst == 255.255.255.255 && udp.src == 68 && udp.dst == 67),
      action=(dhcp_relay_req { ip4.src = 192.168.1.1; ip4.dst = 172.16.1.1; udp.src = 67; dhcp_header.giaddr = <relay_ip>; next(pipeline=ingress,table=S_ROUTER_IN_UNSNAT);  /* DHCP_RELAY_REQ */ };)

    The dhcp_relay_req action would get translated into a controller action with
    pause=false, and all the inner actions are encoded as normal actions and stored
    in the userdata of the controller action.  Please see icmp4_error {} as an example.
    Add a new OVN field 'dhcp_header.giaddr' which gets translated as a controller
    action with the pause flag set.  Please see the existing OVN field icmp4.frag_mtu
    as an example, and see this commit for reference [1].
    When encoding this new OVN field, store the relay_ip in the userdata buffer, and
    in pinctrl.c get the relay_ip value and store it in the DHCP header field.


    For the action dhcp_relay_resp_fwd,  I'd suggest something like below:

      table=17 (lr_in_dhcp_relay_resp_chk), priority=110  ,
        match=(ip4.src == 172.16.1.1 && ip4.dst == 192.168.1.1 && udp.src == 67 && udp.dst == 67),
        action=(reg0[0] = dhcp_relay_resp_chk(dhcp_header.giaddr == <relay_ip>); next;)
      table=17 (lr_in_dhcp_relay_resp), priority=110  ,
        match=(ip4.src == 172.16.1.1 && ip4.dst == 192.168.1.1 && udp.src == 67 && udp.dst == 67 && reg0[0] == 1),
        action=(ip4.src = 192.168.1.1; udp.dst = 68; outport = "lrp1"; output; /* DHCP_RELAY_RESP */)

      I used reg0[0] as an example.  You may need to find a free register bit
      and use it.

      You need to encode dhcp_relay_resp_chk as a controller action with pause=true,
      and store the relay_ip in the userdata buffer.  Then, in pinctrl.c, check whether
      'dhcp_header.giaddr == relay_ip'.  If so, set the result register bit to 1,
      else to 0.
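
As a rough illustration of this check-and-resume pattern, here is a minimal, self-contained C sketch. It is not OVS or OVN code: struct paused_pkt, resume_pkt() and handle_dhcp_relay_resp_chk() are hypothetical names, and the logical register is modelled as a plain integer. The only point it makes is that the handler records the outcome in a result bit and resumes the packet on every path, including the failure path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a packet whose flow translation is paused. */
struct paused_pkt {
    uint32_t reg0;      /* models the register that carries the result bit */
    uint32_t giaddr;    /* GIADDR read from the DHCP header */
    bool resumed;       /* true once the packet went back to the datapath */
};

/* Models handing the packet back so the frozen translation can thaw. */
void
resume_pkt(struct paused_pkt *pkt)
{
    pkt->resumed = true;
}

/* Compares GIADDR with the configured relay IP, stores the outcome in
 * bit 0 of reg0, and always resumes.  A later stage decides whether to
 * forward or drop based on that bit. */
void
handle_dhcp_relay_resp_chk(struct paused_pkt *pkt, uint32_t relay_ip)
{
    assert(pkt != NULL);

    if (pkt->giaddr == relay_ip) {
        pkt->reg0 |= UINT32_C(1);
    } else {
        pkt->reg0 &= ~UINT32_C(1);
    }

    /* No early return above: success and failure both reach this call. */
    resume_pkt(pkt);
}
```

With this shape, a malformed or mismatched reply still leaves the handler with the result bit cleared, and the drop decision moves into a logical flow that matches on the bit.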

   Let me know if you've any questions.

3.  The newly added functions in pinctrl.c have a lot of repetitive code, and it is
    very similar to the existing pinctrl_handle_put_dhcp_opts().
    Please see if the duplicate code can be avoided.
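
One possible way to share the option-walking logic is a small lookup helper that each handler calls for the options it cares about. The sketch below is only an assumption about what such a helper could look like: dhcp_find_opt() and the OPT_* macros are hypothetical names, not existing OVN functions, and the buffer is assumed to start right after the 4-byte magic cookie.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define OPT_PAD 0     /* single-byte padding option */
#define OPT_END 255   /* end-of-options marker */

/* Walks a DHCP options area (code, length, value triples) and returns a
 * pointer to the value of the first option with the given code, storing
 * its length in *len.  Returns NULL if the option is absent or the
 * options area is truncated. */
const uint8_t *
dhcp_find_opt(const uint8_t *opts, size_t opts_len, uint8_t code,
              uint8_t *len)
{
    assert(opts != NULL && len != NULL);

    size_t i = 0;
    while (i < opts_len) {
        uint8_t cur = opts[i];
        if (cur == OPT_END) {
            break;
        }
        if (cur == OPT_PAD) {
            i++;
            continue;
        }
        if (opts_len - i < 2) {
            break;                      /* no room for the length byte */
        }
        uint8_t cur_len = opts[i + 1];
        if (opts_len - i - 2 < cur_len) {
            break;                      /* value runs past the buffer */
        }
        if (cur == code) {
            *len = cur_len;
            return &opts[i + 2];
        }
        i += 2 + (size_t) cur_len;
    }
    return NULL;
}
```

Each caller could then fetch the message type (53), requested IP (50) and server identifier (54) with one call apiece, at the cost of walking the options more than once.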

[1] - https://github.com/ovn-org/ovn/commit/3d9fec3fd5992e1201b4d4fdf43f1f397e8d5ea1

Thanks
Numan

> ---
>  controller/pinctrl.c  | 441 ++++++++++++++++++++++++++++++++++++++++++
>  include/ovn/actions.h |  26 +++
>  lib/actions.c         | 117 +++++++++++
>  lib/ovn-l7.h          |   1 +
>  northd/northd.c       | 177 ++++++++++++++++-
>  ovn-nb.ovsschema      |  25 ++-
>  ovn-nb.xml            |  28 +++
>  tests/atlocal.in      |   3 +
>  tests/ovn-northd.at   |  41 +++-
>  tests/ovn.at          |  12 +-
>  tests/system-ovn.at   | 150 ++++++++++++++
>  utilities/ovn-trace.c |  28 +++
>  12 files changed, 1032 insertions(+), 17 deletions(-)
>
> diff --git a/controller/pinctrl.c b/controller/pinctrl.c
> index 5a35d56f6..45240f01d 100644
> --- a/controller/pinctrl.c
> +++ b/controller/pinctrl.c
> @@ -1897,6 +1897,437 @@ is_dhcp_flags_broadcast(ovs_be16 flags)
>      return flags & htons(DHCP_BROADCAST_FLAG);
>  }
>
> +static const char *dhcp_msg_str[] = {
> +[0] = "INVALID",
> +[DHCP_MSG_DISCOVER] = "DISCOVER",
> +[DHCP_MSG_OFFER] = "OFFER",
> +[DHCP_MSG_REQUEST] = "REQUEST",
> +[OVN_DHCP_MSG_DECLINE] = "DECLINE",
> +[DHCP_MSG_ACK] = "ACK",
> +[DHCP_MSG_NAK] = "NAK",
> +[OVN_DHCP_MSG_RELEASE] = "RELEASE",
> +[OVN_DHCP_MSG_INFORM] = "INFORM"
> +};
> +
> +static bool
> +dhcp_relay_is_msg_type_supported(uint8_t msg_type)
> +{
> +    return (msg_type >= DHCP_MSG_DISCOVER && msg_type <= OVN_DHCP_MSG_RELEASE);
> +}
> +
> +static const char *dhcp_msg_str_get(uint8_t msg_type)
> +{
> +    if (!dhcp_relay_is_msg_type_supported(msg_type)) {
> +        return "INVALID";
> +    }
> +    return dhcp_msg_str[msg_type];
> +}
> +
> +/* Called with in the pinctrl_handler thread context. */
> +static void
> +pinctrl_handle_dhcp_relay_req(
> +    struct rconn *swconn,
> +    struct dp_packet *pkt_in, struct ofputil_packet_in *pin,
> +    struct ofpbuf *userdata,
> +    struct ofpbuf *continuation)
> +{
> +    enum ofp_version version = rconn_get_version(swconn);
> +    enum ofputil_protocol proto = ofputil_protocol_from_ofp_version(version);
> +    struct dp_packet *pkt_out_ptr = NULL;
> +
> +    /* Parse relay IP and server IP. */
> +    ovs_be32 *relay_ip = ofpbuf_try_pull(userdata, sizeof *relay_ip);
> +    ovs_be32 *server_ip = ofpbuf_try_pull(userdata, sizeof *server_ip);
> +    if (!relay_ip || !server_ip) {
> +        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
> +        VLOG_WARN_RL(&rl, "DHCP_RELAY_REQ: relay ip or server ip "
> +                  "not present in the userdata");
> +        return;
> +    }
> +
> +    /* Validate the DHCP request packet.
> +     * Format of the DHCP packet is
> +     * ------------------------------------------------------------------------
> +     *| UDP HEADER  | DHCP HEADER  | 4 Byte DHCP Cookie | DHCP OPTIONS(var len)|
> +     * ------------------------------------------------------------------------
> +     */
> +
> +    size_t in_l4_size = dp_packet_l4_size(pkt_in);
> +    const char *end = (char *) dp_packet_l4(pkt_in) + in_l4_size;
> +    const char *in_dhcp_ptr = dp_packet_get_udp_payload(pkt_in);
> +    if (!in_dhcp_ptr) {
> +        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
> +        VLOG_WARN_RL(&rl, "DHCP_RELAY_REQ: invalid or incomplete "
> +                  "DHCP packet received");
> +        return;
> +    }
> +
> +    const struct dhcp_header *in_dhcp_data
> +        = (const struct dhcp_header *) in_dhcp_ptr;
> +    in_dhcp_ptr += sizeof *in_dhcp_data;
> +    if (in_dhcp_ptr > end) {
> +        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
> +        VLOG_WARN_RL(&rl, "DHCP_RELAY_REQ: invalid or incomplete "
> +                "DHCP packet received, bad data length");
> +        return;
> +    }
> +    if (in_dhcp_data->op != DHCP_OP_REQUEST) {
> +        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
> +        VLOG_WARN_RL(&rl, "DHCP_RELAY_REQ: invalid opcode in the "
> +                "DHCP packet: %d", in_dhcp_data->op);
> +        return;
> +    }
> +
> +    /* DHCP options follow the DHCP header. The first 4 bytes of the DHCP
> +     * options is the DHCP magic cookie followed by the actual DHCP options.
> +     */
> +    ovs_be32 magic_cookie = htonl(DHCP_MAGIC_COOKIE);
> +    if (in_dhcp_ptr + sizeof magic_cookie > end ||
> +        get_unaligned_be32((const void *) in_dhcp_ptr) != magic_cookie) {
> +        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
> +        VLOG_WARN_RL(&rl, "DHCP_RELAY_REQ: magic cookie not present "
> +                "in the packet");
> +        return;
> +    }
> +
> +    if (in_dhcp_data->giaddr) {
> +        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
> +        VLOG_WARN_RL(&rl, "DHCP_RELAY_REQ: giaddr is already set");
> +        return;
> +    }
> +
> +    if (in_dhcp_data->htype != 0x1) {
> +        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
> +        VLOG_WARN_RL(&rl, "DHCP_RELAY_REQ: packet is recieved with "
> +                "unsupported hardware type");
> +        return;
> +    }
> +
> +    ovs_be32 *server_id_ptr = NULL;
> +    const uint8_t *in_dhcp_msg_type = NULL;
> +
> +    in_dhcp_ptr += sizeof magic_cookie;
> +    ovs_be32 request_ip = in_dhcp_data->ciaddr;
> +    while (in_dhcp_ptr < end) {
> +        const struct dhcp_opt_header *in_dhcp_opt =
> +            (const struct dhcp_opt_header *) in_dhcp_ptr;
> +        if (in_dhcp_opt->code == DHCP_OPT_END) {
> +            break;
> +        }
> +        if (in_dhcp_opt->code == DHCP_OPT_PAD) {
> +            in_dhcp_ptr += 1;
> +            continue;
> +        }
> +        in_dhcp_ptr += sizeof *in_dhcp_opt;
> +        if (in_dhcp_ptr > end) {
> +            break;
> +        }
> +        in_dhcp_ptr += in_dhcp_opt->len;
> +        if (in_dhcp_ptr > end) {
> +            break;
> +        }
> +
> +        switch (in_dhcp_opt->code) {
> +        case DHCP_OPT_MSG_TYPE:
> +            if (in_dhcp_opt->len == 1) {
> +                in_dhcp_msg_type = DHCP_OPT_PAYLOAD(in_dhcp_opt);
> +            }
> +            break;
> +        case DHCP_OPT_REQ_IP:
> +            if (in_dhcp_opt->len == 4) {
> +                request_ip = get_unaligned_be32(DHCP_OPT_PAYLOAD(in_dhcp_opt));
> +            }
> +            break;
> +        /* Server Identifier */
> +        case OVN_DHCP_OPT_CODE_SERVER_ID:
> +            if (in_dhcp_opt->len == 4) {
> +                server_id_ptr = DHCP_OPT_PAYLOAD(in_dhcp_opt);
> +            }
> +            break;
> +        default:
> +            break;
> +        }
> +    }
> +
> +    /* Check whether the DHCP Message Type (opt 53) is present or not */
> +    if (!in_dhcp_msg_type) {
> +        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
> +        VLOG_WARN_RL(&rl, "DHCP_RELAY_REQ: missing message type");
> +        return;
> +    }
> +
> +    /* Relay the DHCP request packet */
> +    uint16_t new_l4_size = in_l4_size;
> +    size_t new_packet_size = pkt_in->l4_ofs + new_l4_size;
> +
> +    struct dp_packet pkt_out;
> +    dp_packet_init(&pkt_out, new_packet_size);
> +    dp_packet_clear(&pkt_out);
> +    dp_packet_prealloc_tailroom(&pkt_out, new_packet_size);
> +    pkt_out_ptr = &pkt_out;
> +
> +    /* Copy the L2 and L3 headers from pkt_in, as they remain the same. */
> +    dp_packet_put(
> +        &pkt_out, dp_packet_pull(pkt_in, pkt_in->l4_ofs), pkt_in->l4_ofs);
> +
> +    pkt_out.l2_5_ofs = pkt_in->l2_5_ofs;
> +    pkt_out.l2_pad_size = pkt_in->l2_pad_size;
> +    pkt_out.l3_ofs = pkt_in->l3_ofs;
> +    pkt_out.l4_ofs = pkt_in->l4_ofs;
> +
> +    struct udp_header *udp = dp_packet_put(
> +        &pkt_out, dp_packet_pull(pkt_in, UDP_HEADER_LEN), UDP_HEADER_LEN);
> +
> +    struct dhcp_header *dhcp_data = dp_packet_put(&pkt_out,
> +        dp_packet_pull(pkt_in, new_l4_size - UDP_HEADER_LEN),
> +        new_l4_size - UDP_HEADER_LEN);
> +    dhcp_data->giaddr = *relay_ip;
> +    if (udp->udp_csum) {
> +        udp->udp_csum = recalc_csum32(udp->udp_csum,
> +            0, dhcp_data->giaddr);
> +    }
> +    pin->packet = dp_packet_data(&pkt_out);
> +    pin->packet_len = dp_packet_size(&pkt_out);
> +
> +    /* Log the DHCP message. */
> +    static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(20, 40);
> +    VLOG_INFO_RL(&rl, "DHCP_RELAY_REQ:: MSG_TYPE:%s MAC:"ETH_ADDR_FMT
> +                " XID:%u"
> +                " REQ_IP:"IP_FMT
> +                " GIADDR:"IP_FMT
> +                " SERVER_ADDR:"IP_FMT,
> +                dhcp_msg_str_get(*in_dhcp_msg_type),
> +                ETH_ADDR_BYTES_ARGS(dhcp_data->chaddr), ntohl(dhcp_data->xid),
> +                IP_ARGS(request_ip), IP_ARGS(dhcp_data->giaddr),
> +                IP_ARGS(*server_ip));
> +    queue_msg(swconn, ofputil_encode_resume(pin, continuation, proto));
> +    if (pkt_out_ptr) {
> +        dp_packet_uninit(pkt_out_ptr);
> +    }
> +}
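
The option loop above walks the usual code/length/value layout of DHCP options (RFC 2132). For comparison, here is a minimal standalone sketch of the same bounds-checked walk. The names `dhcp_find_opt`, `OPT_PAD` and `OPT_END` are hypothetical and not part of the patch, and unlike the patch, which collects several options in one pass, this version stops at the first match:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { OPT_PAD = 0, OPT_END = 255 };    /* Option codes from RFC 2132. */

/* Returns a pointer to the payload of the first option with 'code' and
 * stores its length in '*len'.  Returns NULL if the option is absent or
 * the buffer is truncated.  'opts' points just past the magic cookie. */
const uint8_t *
dhcp_find_opt(const uint8_t *opts, size_t size, uint8_t code, uint8_t *len)
{
    size_t i = 0;

    assert(opts && len);
    while (i < size) {
        uint8_t cur = opts[i];

        if (cur == OPT_END) {
            break;
        }
        if (cur == OPT_PAD) {
            i++;                        /* PAD has no length byte. */
            continue;
        }
        if (size - i < 2) {
            break;                      /* Truncated: no length byte. */
        }
        uint8_t cur_len = opts[i + 1];
        if (size - i - 2 < cur_len) {
            break;                      /* Truncated payload. */
        }
        if (cur == code) {
            *len = cur_len;
            return &opts[i + 2];
        }
        i += 2 + (size_t) cur_len;
    }
    return NULL;
}
```

Both bounds checks happen before the payload pointer is handed out, so a caller never sees a pointer whose payload runs past the end of the buffer.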
> +
> +/* Called within the pinctrl_handler thread context. */
> +static void
> +pinctrl_handle_dhcp_relay_resp_fwd(
> +    struct rconn *swconn,
> +    struct dp_packet *pkt_in, struct ofputil_packet_in *pin,
> +    struct ofpbuf *userdata,
> +    struct ofpbuf *continuation)
> +{
> +    enum ofp_version version = rconn_get_version(swconn);
> +    enum ofputil_protocol proto = ofputil_protocol_from_ofp_version(version);
> +    struct dp_packet *pkt_out_ptr = NULL;
> +
> +    /* Parse relay IP and server IP. */
> +    ovs_be32 *relay_ip = ofpbuf_try_pull(userdata, sizeof *relay_ip);
> +    ovs_be32 *server_ip = ofpbuf_try_pull(userdata, sizeof *server_ip);
> +    if (!relay_ip || !server_ip) {
> +        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
> +        VLOG_WARN_RL(&rl, "DHCP_RELAY_RESP_FWD: relay ip or server ip "
> +                "not present in the userdata");
> +        return;
> +    }
> +
> +    /* Validate the DHCP response packet.
> +     * Format of the DHCP packet is
> +     * ------------------------------------------------------------------------
> +     *| UDP HEADER  | DHCP HEADER  | 4 Byte DHCP Cookie | DHCP OPTIONS(var len)|
> +     * ------------------------------------------------------------------------
> +     */
> +
> +    size_t in_l4_size = dp_packet_l4_size(pkt_in);
> +    const char *end = (char *) dp_packet_l4(pkt_in) + in_l4_size;
> +    const char *in_dhcp_ptr = dp_packet_get_udp_payload(pkt_in);
> +    if (!in_dhcp_ptr) {
> +        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
> +        VLOG_WARN_RL(&rl, "DHCP_RELAY_RESP_FWD: invalid or incomplete "
> +                "packet received");
> +        return;
> +    }
> +
> +    const struct dhcp_header *in_dhcp_data
> +        = (const struct dhcp_header *) in_dhcp_ptr;
> +    in_dhcp_ptr += sizeof *in_dhcp_data;
> +    if (in_dhcp_ptr > end) {
> +        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
> +        VLOG_WARN_RL(&rl, "DHCP_RELAY_RESP_FWD: invalid or incomplete "
> +                    "packet received, bad data length");
> +        return;
> +    }
> +    if (in_dhcp_data->op != DHCP_OP_REPLY) {
> +        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
> +        VLOG_WARN_RL(&rl, "DHCP_RELAY_RESP_FWD: invalid opcode "
> +                "in the packet: %d", in_dhcp_data->op);
> +        return;
> +    }
> +
> +    /* DHCP options follow the DHCP header. The first 4 bytes of the DHCP
> +     * options is the DHCP magic cookie followed by the actual DHCP options.
> +     */
> +    ovs_be32 magic_cookie = htonl(DHCP_MAGIC_COOKIE);
> +    if (in_dhcp_ptr + sizeof magic_cookie > end ||
> +        get_unaligned_be32((const void *) in_dhcp_ptr) != magic_cookie) {
> +        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
> +        VLOG_WARN_RL(&rl, "DHCP_RELAY_RESP_FWD: magic cookie not present "
> +                "in the packet");
> +        return;
> +    }
> +
> +    if (!in_dhcp_data->giaddr) {
> +        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
> +        VLOG_WARN_RL(&rl, "DHCP_RELAY_RESP_FWD: giaddr is "
> +                    "not set in response");
> +        return;
> +    }
> +    ovs_be32 giaddr = in_dhcp_data->giaddr;
> +
> +    ovs_be32 *server_id_ptr = NULL;
> +    ovs_be32 lease_time = 0;
> +    const uint8_t *in_dhcp_msg_type = NULL;
> +
> +    in_dhcp_ptr += sizeof magic_cookie;
> +    while (in_dhcp_ptr < end) {
> +        const struct dhcp_opt_header *in_dhcp_opt =
> +            (const struct dhcp_opt_header *) in_dhcp_ptr;
> +        if (in_dhcp_opt->code == DHCP_OPT_END) {
> +            break;
> +        }
> +        if (in_dhcp_opt->code == DHCP_OPT_PAD) {
> +            in_dhcp_ptr += 1;
> +            continue;
> +        }
> +        in_dhcp_ptr += sizeof *in_dhcp_opt;
> +        if (in_dhcp_ptr > end) {
> +            break;
> +        }
> +        in_dhcp_ptr += in_dhcp_opt->len;
> +        if (in_dhcp_ptr > end) {
> +            break;
> +        }
> +
> +        switch (in_dhcp_opt->code) {
> +        case DHCP_OPT_MSG_TYPE:
> +            if (in_dhcp_opt->len == 1) {
> +                in_dhcp_msg_type = DHCP_OPT_PAYLOAD(in_dhcp_opt);
> +            }
> +            break;
> +        /* Server Identifier */
> +        case OVN_DHCP_OPT_CODE_SERVER_ID:
> +            if (in_dhcp_opt->len == 4) {
> +                server_id_ptr = DHCP_OPT_PAYLOAD(in_dhcp_opt);
> +            }
> +            break;
> +        case OVN_DHCP_OPT_CODE_LEASE_TIME:
> +            if (in_dhcp_opt->len == 4) {
> +                lease_time = get_unaligned_be32(DHCP_OPT_PAYLOAD(in_dhcp_opt));
> +            }
> +            break;
> +        default:
> +            break;
> +        }
> +    }
> +
> +    /* Check whether the DHCP Message Type (opt 53) is present or not */
> +    if (!in_dhcp_msg_type) {
> +        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
> +        VLOG_WARN_RL(&rl, "DHCP_RELAY_RESP_FWD: missing message type");
> +        return;
> +    }
> +
> +    if (!server_id_ptr) {
> +        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
> +        VLOG_WARN_RL(&rl, "DHCP_RELAY_RESP_FWD: missing server identifier");
> +        return;
> +    }
> +
> +    if (*server_id_ptr != *server_ip) {
> +        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
> +        VLOG_WARN_RL(&rl, "DHCP_RELAY_RESP_FWD: server identifier mismatch");
> +        return;
> +    }
> +
> +    if (giaddr != *relay_ip) {
> +        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
> +        VLOG_WARN_RL(&rl, "DHCP_RELAY_RESP_FWD: giaddr mismatch");
> +        return;
> +    }
> +
> +
> +    /* Update destination MAC & IP so that the packet is forwarded to the
> +     * right destination node.
> +     */
> +    uint16_t new_l4_size = in_l4_size;
> +    size_t new_packet_size = pkt_in->l4_ofs + new_l4_size;
> +
> +    struct dp_packet pkt_out;
> +    dp_packet_init(&pkt_out, new_packet_size);
> +    dp_packet_clear(&pkt_out);
> +    dp_packet_prealloc_tailroom(&pkt_out, new_packet_size);
> +    pkt_out_ptr = &pkt_out;
> +
> +    /* Copy the L2 and L3 headers from pkt_in, as they remain the same. */
> +    struct eth_header *eth = dp_packet_put(
> +        &pkt_out, dp_packet_pull(pkt_in, pkt_in->l4_ofs), pkt_in->l4_ofs);
> +
> +    pkt_out.l2_5_ofs = pkt_in->l2_5_ofs;
> +    pkt_out.l2_pad_size = pkt_in->l2_pad_size;
> +    pkt_out.l3_ofs = pkt_in->l3_ofs;
> +    pkt_out.l4_ofs = pkt_in->l4_ofs;
> +
> +    struct udp_header *udp = dp_packet_put(
> +        &pkt_out, dp_packet_pull(pkt_in, UDP_HEADER_LEN), UDP_HEADER_LEN);
> +
> +    struct dhcp_header *dhcp_data = dp_packet_put(
> +        &pkt_out, dp_packet_pull(pkt_in, new_l4_size - UDP_HEADER_LEN),
> +        new_l4_size - UDP_HEADER_LEN);
> +    memcpy(&eth->eth_dst, dhcp_data->chaddr, sizeof(eth->eth_dst));
> +
> +    /* Send a broadcast IP frame when BROADCAST flag is set. */
> +    struct ip_header *out_ip = dp_packet_l3(&pkt_out);
> +    ovs_be32 ip_dst;
> +    ovs_be32 ip_dst_orig = get_16aligned_be32(&out_ip->ip_dst);
> +    if (!is_dhcp_flags_broadcast(dhcp_data->flags)) {
> +        ip_dst = dhcp_data->yiaddr;
> +    } else {
> +        ip_dst = htonl(0xffffffff);
> +    }
> +    put_16aligned_be32(&out_ip->ip_dst, ip_dst);
> +    out_ip->ip_csum = recalc_csum32(out_ip->ip_csum,
> +              ip_dst_orig, ip_dst);
> +    if (udp->udp_csum) {
> +        udp->udp_csum = recalc_csum32(udp->udp_csum,
> +            ip_dst_orig, ip_dst);
> +    }
> +    /* Reset giaddr */
> +    dhcp_data->giaddr = htonl(0x0);
> +    if (udp->udp_csum) {
> +        udp->udp_csum = recalc_csum32(udp->udp_csum,
> +            giaddr, 0);
> +    }
> +    pin->packet = dp_packet_data(&pkt_out);
> +    pin->packet_len = dp_packet_size(&pkt_out);
> +
> +    /* Log the DHCP message. */
> +    static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(20, 40);
> +    VLOG_INFO_RL(&rl, "DHCP_RELAY_RESP_FWD:: MSG_TYPE:%s MAC:"ETH_ADDR_FMT
> +             " XID:%u"
> +             " YIADDR:"IP_FMT
> +             " GIADDR:"IP_FMT
> +             " SERVER_ADDR:"IP_FMT,
> +             dhcp_msg_str_get(*in_dhcp_msg_type),
> +             ETH_ADDR_BYTES_ARGS(dhcp_data->chaddr), ntohl(dhcp_data->xid),
> +             IP_ARGS(dhcp_data->yiaddr),
> +             IP_ARGS(giaddr), IP_ARGS(*server_id_ptr));
> +    queue_msg(swconn, ofputil_encode_resume(pin, continuation, proto));
> +    if (pkt_out_ptr) {
> +        dp_packet_uninit(pkt_out_ptr);
> +    }
> +}
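
Both handlers patch the UDP checksum incrementally instead of recomputing it over the whole datagram. The arithmetic behind that is RFC 1624, equation 3: HC' = ~(~HC + ~m + m'). Below is a self-contained sketch of the idea. `csum_full`, `csum_update16` and `csum_update32` are hypothetical names used only for illustration; the patch itself relies on OVS's `recalc_csum32()`:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Folds carries back into the low 16 bits (ones' complement addition). */
static uint16_t
csum_fold(uint32_t sum)
{
    while (sum >> 16) {
        sum = (sum & 0xffff) + (sum >> 16);
    }
    return (uint16_t) sum;
}

/* Full Internet checksum (RFC 1071) over an even number of bytes. */
uint16_t
csum_full(const uint8_t *data, size_t len)
{
    uint32_t sum = 0;

    assert(data && len % 2 == 0);
    for (size_t i = 0; i < len; i += 2) {
        sum += (uint32_t) data[i] << 8 | data[i + 1];
    }
    return (uint16_t) ~csum_fold(sum);
}

/* RFC 1624 eqn. 3 for one 16-bit word: HC' = ~(~HC + ~m + m'). */
uint16_t
csum_update16(uint16_t csum, uint16_t old_w, uint16_t new_w)
{
    uint32_t sum = (uint16_t) ~csum;

    sum += (uint16_t) ~old_w;
    sum += new_w;
    return (uint16_t) ~csum_fold(sum);
}

/* Applies the 16-bit update to both halves of a 32-bit field. */
uint16_t
csum_update32(uint16_t csum, uint32_t old_v, uint32_t new_v)
{
    csum = csum_update16(csum, (uint16_t) (old_v >> 16),
                         (uint16_t) (new_v >> 16));
    return csum_update16(csum, (uint16_t) (old_v & 0xffff),
                         (uint16_t) (new_v & 0xffff));
}
```

The sketch works on host-order values for readability. Setting giaddr from 0 to the relay address, or clearing it again on the response path, is then a single `csum_update32()` call with the old and new field values.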
> +
>  /* Called with in the pinctrl_handler thread context. */
>  static void
>  pinctrl_handle_put_dhcp_opts(
> @@ -3203,6 +3634,16 @@ process_packet_in(struct rconn *swconn, const struct ofp_header *msg)
>          ovs_mutex_unlock(&pinctrl_mutex);
>          break;
>
> +    case ACTION_OPCODE_DHCP_RELAY_REQ:
> +        pinctrl_handle_dhcp_relay_req(swconn, &packet, &pin,
> +                                     &userdata, &continuation);
> +        break;
> +
> +    case ACTION_OPCODE_DHCP_RELAY_RESP_FWD:
> +        pinctrl_handle_dhcp_relay_resp_fwd(swconn, &packet, &pin,
> +                                     &userdata, &continuation);
> +        break;
> +
>      case ACTION_OPCODE_PUT_DHCP_OPTS:
>          pinctrl_handle_put_dhcp_opts(swconn, &packet, &pin, &headers,
>                                       &userdata, &continuation);
> diff --git a/include/ovn/actions.h b/include/ovn/actions.h
> index 49cfe0624..47d41b90f 100644
> --- a/include/ovn/actions.h
> +++ b/include/ovn/actions.h
> @@ -95,6 +95,8 @@ struct collector_set_ids;
>      OVNACT(LOOKUP_ND_IP,      ovnact_lookup_mac_bind_ip) \
>      OVNACT(PUT_DHCPV4_OPTS,   ovnact_put_opts)        \
>      OVNACT(PUT_DHCPV6_OPTS,   ovnact_put_opts)        \
> +    OVNACT(DHCPV4_RELAY_REQ,  ovnact_dhcp_relay)      \
> +    OVNACT(DHCPV4_RELAY_RESP_FWD, ovnact_dhcp_relay)      \
>      OVNACT(SET_QUEUE,         ovnact_set_queue)       \
>      OVNACT(DNS_LOOKUP,        ovnact_result)          \
>      OVNACT(LOG,               ovnact_log)             \
> @@ -387,6 +389,14 @@ struct ovnact_put_opts {
>      size_t n_options;
>  };
>
> +/* OVNACT_DHCP_RELAY. */
> +struct ovnact_dhcp_relay {
> +    struct ovnact ovnact;
> +    int family;
> +    ovs_be32 relay_ipv4;
> +    ovs_be32 server_ipv4;
> +};
> +
>  /* Valid arguments to SET_QUEUE action.
>   *
>   * QDISC_MIN_QUEUE_ID is the default queue, so user-defined queues should
> @@ -750,6 +760,22 @@ enum action_opcode {
>
>      /* multicast group split buffer action. */
>      ACTION_OPCODE_MG_SPLIT_BUF,
> +
> +    /* "dhcp_relay_req(relay_ip, server_ip)".
> +     *
> +     * Arguments follow the action_header, in this format:
> +     *   - The 32-bit DHCP relay IP.
> +     *   - The 32-bit DHCP server IP.
> +     */
> +    ACTION_OPCODE_DHCP_RELAY_REQ,
> +
> +    /* "dhcp_relay_resp_fwd(relay_ip, server_ip)".
> +     *
> +     * Arguments follow the action_header, in this format:
> +     *   - The 32-bit DHCP relay IP.
> +     *   - The 32-bit DHCP server IP.
> +     */
> +    ACTION_OPCODE_DHCP_RELAY_RESP_FWD,
>  };
>
>  /* Header. */
> diff --git a/lib/actions.c b/lib/actions.c
> index a73fe1a1e..69df428c6 100644
> --- a/lib/actions.c
> +++ b/lib/actions.c
> @@ -2629,6 +2629,118 @@ ovnact_controller_event_free(struct ovnact_controller_event *event)
>      free_gen_options(event->options, event->n_options);
>  }
>
> +static void
> +format_DHCPV4_RELAY_REQ(const struct ovnact_dhcp_relay *dhcp_relay,
> +                struct ds *s)
> +{
> +    ds_put_format(s, "dhcp_relay_req("IP_FMT","IP_FMT");",
> +                  IP_ARGS(dhcp_relay->relay_ipv4),
> +                  IP_ARGS(dhcp_relay->server_ipv4));
> +}
> +
> +static void
> +parse_dhcp_relay_req(struct action_context *ctx,
> +               struct ovnact_dhcp_relay *dhcp_relay)
> +{
> +    /* Skip dhcp_relay_req( */
> +    lexer_force_match(ctx->lexer, LEX_T_LPAREN);
> +
> +    /* Parse relay ip and server ip. */
> +    if (ctx->lexer->token.format == LEX_F_IPV4) {
> +        dhcp_relay->family = AF_INET;
> +        dhcp_relay->relay_ipv4 = ctx->lexer->token.value.ipv4;
> +        lexer_get(ctx->lexer);
> +        lexer_match(ctx->lexer, LEX_T_COMMA);
> +        if (ctx->lexer->token.format == LEX_F_IPV4) {
> +            dhcp_relay->family = AF_INET;
> +            dhcp_relay->server_ipv4 = ctx->lexer->token.value.ipv4;
> +            lexer_get(ctx->lexer);
> +        } else {
> +            lexer_syntax_error(ctx->lexer, "expecting IPv4 dhcp server ip");
> +            return;
> +        }
> +    } else {
> +        lexer_syntax_error(ctx->lexer, "expecting IPv4 dhcp relay "
> +                           "and server ips");
> +        return;
> +    }
> +    lexer_force_match(ctx->lexer, LEX_T_RPAREN);
> +}
> +
> +static void
> +encode_DHCPV4_RELAY_REQ(const struct ovnact_dhcp_relay *dhcp_relay,
> +                    const struct ovnact_encode_params *ep,
> +                    struct ofpbuf *ofpacts)
> +{
> +    size_t oc_offset = encode_start_controller_op(ACTION_OPCODE_DHCP_RELAY_REQ,
> +                                                  true, ep->ctrl_meter_id,
> +                                                  ofpacts);
> +    ofpbuf_put(ofpacts, &dhcp_relay->relay_ipv4,
> +            sizeof(dhcp_relay->relay_ipv4));
> +    ofpbuf_put(ofpacts, &dhcp_relay->server_ipv4,
> +            sizeof(dhcp_relay->server_ipv4));
> +    encode_finish_controller_op(oc_offset, ofpacts);
> +}
> +
> +static void
> +format_DHCPV4_RELAY_RESP_FWD(const struct ovnact_dhcp_relay *dhcp_relay,
> +                    struct ds *s)
> +{
> +    ds_put_format(s, "dhcp_relay_resp_fwd("IP_FMT","IP_FMT");",
> +                  IP_ARGS(dhcp_relay->relay_ipv4),
> +                  IP_ARGS(dhcp_relay->server_ipv4));
> +}
> +
> +static void
> +parse_dhcp_relay_resp_fwd(struct action_context *ctx,
> +               struct ovnact_dhcp_relay *dhcp_relay)
> +{
> +    /* Skip dhcp_relay_resp_fwd( */
> +    lexer_force_match(ctx->lexer, LEX_T_LPAREN);
> +
> +    /* Parse relay ip and server ip. */
> +    if (ctx->lexer->token.format == LEX_F_IPV4) {
> +        dhcp_relay->family = AF_INET;
> +        dhcp_relay->relay_ipv4 = ctx->lexer->token.value.ipv4;
> +        lexer_get(ctx->lexer);
> +        lexer_match(ctx->lexer, LEX_T_COMMA);
> +        if (ctx->lexer->token.format == LEX_F_IPV4) {
> +            dhcp_relay->family = AF_INET;
> +            dhcp_relay->server_ipv4 = ctx->lexer->token.value.ipv4;
> +            lexer_get(ctx->lexer);
> +        } else {
> +            lexer_syntax_error(ctx->lexer, "expecting IPv4 dhcp server ip");
> +            return;
> +        }
> +    } else {
> +        lexer_syntax_error(ctx->lexer, "expecting IPv4 dhcp relay and "
> +                           "server ips");
> +        return;
> +    }
> +    lexer_force_match(ctx->lexer, LEX_T_RPAREN);
> +}
> +
> +static void
> +encode_DHCPV4_RELAY_RESP_FWD(const struct ovnact_dhcp_relay *dhcp_relay,
> +                    const struct ovnact_encode_params *ep,
> +                    struct ofpbuf *ofpacts)
> +{
> +    size_t oc_offset = encode_start_controller_op(
> +                                ACTION_OPCODE_DHCP_RELAY_RESP_FWD,
> +                                true, ep->ctrl_meter_id,
> +                                ofpacts);
> +    ofpbuf_put(ofpacts, &dhcp_relay->relay_ipv4,
> +                  sizeof(dhcp_relay->relay_ipv4));
> +    ofpbuf_put(ofpacts, &dhcp_relay->server_ipv4,
> +                  sizeof(dhcp_relay->server_ipv4));
> +    encode_finish_controller_op(oc_offset, ofpacts);
> +}
> +
> +static void
> +ovnact_dhcp_relay_free(struct ovnact_dhcp_relay *dhcp_relay OVS_UNUSED)
> +{
> +}
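
Both encoders append the same two arguments after the action header: the 32-bit relay IP followed by the 32-bit server IP, each in network byte order. The handlers in pinctrl.c pull them back in the same order with ofpbuf_try_pull(). A small standalone sketch of that layout, using hypothetical helpers (`relay_args_encode`, `relay_args_decode`) that only stand in for the ofpbuf calls:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define RELAY_ARGS_LEN 8    /* Two 32-bit IPv4 addresses. */

/* Writes the relay and server addresses, given as network-order byte
 * arrays, into 'buf'.  Returns the number of bytes written. */
size_t
relay_args_encode(uint8_t buf[RELAY_ARGS_LEN],
                  const uint8_t relay_ip[4], const uint8_t server_ip[4])
{
    assert(buf && relay_ip && server_ip);
    memcpy(buf, relay_ip, 4);
    memcpy(buf + 4, server_ip, 4);
    return RELAY_ARGS_LEN;
}

/* Reads both addresses back.  Returns 0 on success, or -1 if 'len' is
 * too short, which corresponds to the NULL check in the handlers. */
int
relay_args_decode(const uint8_t *buf, size_t len,
                  uint8_t relay_ip[4], uint8_t server_ip[4])
{
    assert(buf && relay_ip && server_ip);
    if (len < RELAY_ARGS_LEN) {
        return -1;
    }
    memcpy(relay_ip, buf, 4);
    memcpy(server_ip, buf + 4, 4);
    return 0;
}
```

Since the request and response actions share this layout, the two parse and encode functions in the patch could probably share one helper that takes the opcode as a parameter.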
> +
>  static void
>  parse_put_opts(struct action_context *ctx, const struct expr_field *dst,
>                 struct ovnact_put_opts *po, const struct hmap *gen_opts,
> @@ -5451,6 +5563,11 @@ parse_action(struct action_context *ctx)
>          parse_sample(ctx);
>      } else if (lexer_match_id(ctx->lexer, "mac_cache_use")) {
>          ovnact_put_MAC_CACHE_USE(ctx->ovnacts);
> +    } else if (lexer_match_id(ctx->lexer, "dhcp_relay_req")) {
> +        parse_dhcp_relay_req(ctx, ovnact_put_DHCPV4_RELAY_REQ(ctx->ovnacts));
> +    } else if (lexer_match_id(ctx->lexer, "dhcp_relay_resp_fwd")) {
> +        parse_dhcp_relay_resp_fwd(ctx,
> +              ovnact_put_DHCPV4_RELAY_RESP_FWD(ctx->ovnacts));
>      } else {
>          lexer_syntax_error(ctx->lexer, "expecting action");
>      }
> diff --git a/lib/ovn-l7.h b/lib/ovn-l7.h
> index ad514a922..e08581123 100644
> --- a/lib/ovn-l7.h
> +++ b/lib/ovn-l7.h
> @@ -69,6 +69,7 @@ struct gen_opts_map {
>   */
>  #define OVN_DHCP_OPT_CODE_NETMASK      1
>  #define OVN_DHCP_OPT_CODE_LEASE_TIME   51
> +#define OVN_DHCP_OPT_CODE_SERVER_ID    54
>  #define OVN_DHCP_OPT_CODE_T1           58
>  #define OVN_DHCP_OPT_CODE_T2           59
>
> diff --git a/northd/northd.c b/northd/northd.c
> index 07dffb15a..7ac831fae 100644
> --- a/northd/northd.c
> +++ b/northd/northd.c
> @@ -181,11 +181,13 @@ enum ovn_stage {
>      PIPELINE_STAGE(ROUTER, IN,  IP_ROUTING_ECMP, 14, "lr_in_ip_routing_ecmp") \
>      PIPELINE_STAGE(ROUTER, IN,  POLICY,          15, "lr_in_policy")          \
>      PIPELINE_STAGE(ROUTER, IN,  POLICY_ECMP,     16, "lr_in_policy_ecmp")     \
> -    PIPELINE_STAGE(ROUTER, IN,  ARP_RESOLVE,     17, "lr_in_arp_resolve")     \
> -    PIPELINE_STAGE(ROUTER, IN,  CHK_PKT_LEN,     18, "lr_in_chk_pkt_len")     \
> -    PIPELINE_STAGE(ROUTER, IN,  LARGER_PKTS,     19, "lr_in_larger_pkts")     \
> -    PIPELINE_STAGE(ROUTER, IN,  GW_REDIRECT,     20, "lr_in_gw_redirect")     \
> -    PIPELINE_STAGE(ROUTER, IN,  ARP_REQUEST,     21, "lr_in_arp_request")     \
> +    PIPELINE_STAGE(ROUTER, IN,  DHCP_RELAY_RESP_FWD, 17,                      \
> +                  "lr_in_dhcp_relay_resp_fwd")                                \
> +    PIPELINE_STAGE(ROUTER, IN,  ARP_RESOLVE,     18, "lr_in_arp_resolve")     \
> +    PIPELINE_STAGE(ROUTER, IN,  CHK_PKT_LEN,     19, "lr_in_chk_pkt_len")     \
> +    PIPELINE_STAGE(ROUTER, IN,  LARGER_PKTS,     20, "lr_in_larger_pkts")     \
> +    PIPELINE_STAGE(ROUTER, IN,  GW_REDIRECT,     21, "lr_in_gw_redirect")     \
> +    PIPELINE_STAGE(ROUTER, IN,  ARP_REQUEST,     22, "lr_in_arp_request")     \
>                                                                        \
>      /* Logical router egress stages. */                               \
>      PIPELINE_STAGE(ROUTER, OUT, CHECK_DNAT_LOCAL,   0,                       \
> @@ -9610,6 +9612,80 @@ build_dhcpv6_options_flows(struct ovn_port *op,
>      ds_destroy(&match);
>  }
>
> +static void
> +build_lswitch_dhcp_relay_flows(struct ovn_port *op,
> +                           const struct hmap *lr_ports,
> +                           const struct hmap *lflows,
> +                           const struct shash *meter_groups OVS_UNUSED)
> +{
> +    if (op->nbrp || !op->nbsp) {
> +        return;
> +    }
> +    /* Consider only ports attached to VMs. */
> +    if (strcmp(op->nbsp->type, "")) {
> +        return;
> +    }
> +
> +    if (!op->od || !op->od->n_router_ports ||
> +        !op->od->nbs || !op->od->nbs->dhcp_relay_port) {
> +        return;
> +    }
> +
> +    struct ds match = DS_EMPTY_INITIALIZER;
> +    struct ds action = DS_EMPTY_INITIALIZER;
> +    struct nbrec_logical_router_port *lrp = op->od->nbs->dhcp_relay_port;
> +    struct ovn_port *rp = ovn_port_find(lr_ports, lrp->name);
> +
> +    if (!rp || !rp->nbrp || !rp->nbrp->dhcp_relay) {
> +        return;
> +    }
> +
> +    struct ovn_port *sp = NULL;
> +    struct nbrec_dhcp_relay *dhcp_relay = rp->nbrp->dhcp_relay;
> +
> +    for (int i = 0; i < op->od->n_router_ports; i++) {
> +        struct ovn_port *sp_tmp = op->od->router_ports[i];
> +        if (sp_tmp->peer == rp) {
> +            sp = sp_tmp;
> +            break;
> +        }
> +    }
> +    if (!sp) {
> +        return;
> +    }
> +
> +    char *server_ip_str = NULL;
> +    uint16_t port;
> +    int addr_family;
> +    struct in6_addr server_ip;
> +
> +    if (!ip_address_and_port_from_lb_key(dhcp_relay->servers, &server_ip_str,
> +                                         &server_ip, &port, &addr_family)) {
> +        return;
> +    }
> +
> +    if (server_ip_str == NULL) {
> +        return;
> +    }
> +
> +    ds_put_format(
> +        &match, "inport == %s && eth.src == %s && "
> +        "ip4.src == 0.0.0.0 && ip4.dst == 255.255.255.255 && "
> +        "udp.src == 68 && udp.dst == 67",
> +        op->json_key, op->lsp_addrs[0].ea_s);
> +    ds_put_format(&action,
> +                  "eth.dst=%s;outport=%s;next;/* DHCP_RELAY_REQ */",
> +                  rp->lrp_networks.ea_s, sp->json_key);
> +    ovn_lflow_add_with_hint__(lflows, op->od,
> +                              S_SWITCH_IN_L2_LKUP, 100,
> +                              ds_cstr(&match),
> +                              ds_cstr(&action),
> +                              op->key,
> +                              NULL,
> +                              &lrp->header_);
> +    ds_destroy(&match);
> +    ds_destroy(&action);
> +    free(server_ip_str);
> +}
> +
>  static void
>  build_drop_arp_nd_flows_for_unbound_router_ports(struct ovn_port *op,
>                                                   const struct ovn_port *port,
> @@ -10181,6 +10257,13 @@ build_lswitch_dhcp_options_and_response(struct ovn_port *op,
>          return;
>      }
>
> +    if (op->od && op->od->nbs
> +        && op->od->nbs->dhcp_relay_port) {
> +        /* Don't add the DHCP server flows if DHCP Relay is enabled on the
> +         * logical switch. */
> +        return;
> +    }
> +
>      bool is_external = lsp_is_external(op->nbsp);
>      if (is_external && (!op->od->n_localnet_ports ||
>                          !op->nbsp->ha_chassis_group)) {
> @@ -14458,6 +14541,86 @@ build_dhcpv6_reply_flows_for_lrouter_port(
>      }
>  }
>
> +static void
> +build_dhcp_relay_flows_for_lrouter_port(
> +        struct ovn_port *op, struct hmap *lflows,
> +        struct ds *match)
> +{
> +    if (!op->nbrp || !op->nbrp->dhcp_relay) {
> +        return;
> +    }
> +    struct nbrec_dhcp_relay *dhcp_relay = op->nbrp->dhcp_relay;
> +    if (!dhcp_relay->servers) {
> +        return;
> +    }
> +
> +    int addr_family;
> +    /* Custom server ports are not supported yet. */
> +    uint16_t port;
> +    char *server_ip_str = NULL;
> +    struct in6_addr server_ip;
> +
> +    if (!ip_address_and_port_from_lb_key(dhcp_relay->servers, &server_ip_str,
> +                                         &server_ip, &port, &addr_family)) {
> +        return;
> +    }
> +
> +    if (server_ip_str == NULL) {
> +        return;
> +    }
> +
> +    struct ds dhcp_action = DS_EMPTY_INITIALIZER;
> +    ds_clear(match);
> +    ds_put_format(
> +        match, "inport == %s && "
> +        "ip4.src == 0.0.0.0 && ip4.dst == 255.255.255.255 && "
> +        "udp.src == 68 && udp.dst == 67",
> +        op->json_key);
> +    ds_put_format(&dhcp_action,
> +                "dhcp_relay_req(%s,%s);"
> +                "ip4.src=%s;ip4.dst=%s;udp.src=67;next; /* DHCP_RELAY_REQ */",
> +                op->lrp_networks.ipv4_addrs[0].addr_s, server_ip_str,
> +                op->lrp_networks.ipv4_addrs[0].addr_s, server_ip_str);
> +
> +    ovn_lflow_add_with_hint(lflows, op->od, S_ROUTER_IN_IP_INPUT, 110,
> +                            ds_cstr(match), ds_cstr(&dhcp_action),
> +                            &op->nbrp->header_);
> +
> +    ds_clear(match);
> +    ds_clear(&dhcp_action);
> +
> +    ds_put_format(
> +        match, "ip4.src == %s && ip4.dst == %s && "
> +        "udp.src == 67 && udp.dst == 67",
> +        server_ip_str, op->lrp_networks.ipv4_addrs[0].addr_s);
> +    ds_put_format(&dhcp_action, "next;/* DHCP_RELAY_RESP */");
> +    ovn_lflow_add_with_hint(lflows, op->od, S_ROUTER_IN_IP_INPUT, 110,
> +                            ds_cstr(match), ds_cstr(&dhcp_action),
> +                            &op->nbrp->header_);
> +
> +    ds_clear(match);
> +    ds_clear(&dhcp_action);
> +
> +    ds_put_format(
> +        match, "ip4.src == %s && ip4.dst == %s && "
> +        "udp.src == 67 && udp.dst == 67",
> +        server_ip_str, op->lrp_networks.ipv4_addrs[0].addr_s);
> +    ds_put_format(&dhcp_action,
> +          "dhcp_relay_resp_fwd(%s,%s);ip4.src=%s;udp.dst=68;"
> +          "outport=%s;output; /* DHCP_RELAY_RESP */",
> +          op->lrp_networks.ipv4_addrs[0].addr_s, server_ip_str,
> +          op->lrp_networks.ipv4_addrs[0].addr_s, op->json_key);
> +    ovn_lflow_add_with_hint(lflows, op->od, S_ROUTER_IN_DHCP_RELAY_RESP_FWD,
> +                            110,
> +                            ds_cstr(match), ds_cstr(&dhcp_action),
> +                            &op->nbrp->header_);
> +
> +    ds_clear(match);
> +    ds_destroy(&dhcp_action);
> +
> +    free(server_ip_str);
> +}
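
To make it easier to see what this function installs, here is a standalone sketch that renders the request-side match and action strings for sample values. The helper names and the sample addresses are hypothetical. The format strings are copied from the hunk above, and `inport_json` is assumed to be the already-quoted port name, as `op->json_key` appears to be:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Renders the lr_in_ip_input match for relayed DHCP requests.
 * Returns the snprintf() result. */
int
relay_req_match(char *buf, size_t size, const char *inport_json)
{
    assert(buf && inport_json);
    return snprintf(buf, size,
                    "inport == %s && "
                    "ip4.src == 0.0.0.0 && ip4.dst == 255.255.255.255 && "
                    "udp.src == 68 && udp.dst == 67",
                    inport_json);
}

/* Renders the matching action: relay the request, then rewrite the IP
 * addresses and UDP source port so the packet is routed as unicast. */
int
relay_req_action(char *buf, size_t size,
                 const char *relay_ip, const char *server_ip)
{
    assert(buf && relay_ip && server_ip);
    return snprintf(buf, size,
                    "dhcp_relay_req(%s,%s);"
                    "ip4.src=%s;ip4.dst=%s;udp.src=67;next; "
                    "/* DHCP_RELAY_REQ */",
                    relay_ip, server_ip, relay_ip, server_ip);
}
```

With a router port named lrp1 at 192.168.1.1 and a server at 172.16.1.1, the action renders as `dhcp_relay_req(192.168.1.1,172.16.1.1);ip4.src=192.168.1.1;ip4.dst=172.16.1.1;udp.src=67;next; /* DHCP_RELAY_REQ */`.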
> +
>  static void
>  build_ipv6_input_flows_for_lrouter_port(
>          struct ovn_port *op, struct hmap *lflows,
> @@ -15673,6 +15836,8 @@ build_lrouter_nat_defrag_and_lb(struct ovn_datapath *od, struct hmap *lflows,
>      ovn_lflow_add(lflows, od, S_ROUTER_OUT_POST_SNAT, 0, "1", "next;");
>      ovn_lflow_add(lflows, od, S_ROUTER_OUT_EGR_LOOP, 0, "1", "next;");
>      ovn_lflow_add(lflows, od, S_ROUTER_IN_ECMP_STATEFUL, 0, "1", "next;");
> +    ovn_lflow_add(lflows, od, S_ROUTER_IN_DHCP_RELAY_RESP_FWD, 0, "1",
> +                  "next;");
>
>      const char *ct_flag_reg = features->ct_no_masked_label
>                                ? "ct_mark"
> @@ -16154,6 +16319,7 @@ build_lswitch_and_lrouter_iterate_by_lsp(struct ovn_port *op,
>      build_lswitch_dhcp_options_and_response(op, lflows, meter_groups);
>      build_lswitch_external_port(op, lflows);
>      build_lswitch_ip_unicast_lookup(op, lflows, actions, match);
> +    build_lswitch_dhcp_relay_flows(op, lr_ports, lflows, meter_groups);
>
>      /* Build Logical Router Flows. */
>      build_ip_routing_flows_for_router_type_lsp(op, lr_ports, lflows);
> @@ -16183,6 +16349,7 @@ build_lswitch_and_lrouter_iterate_by_lrp(struct ovn_port *op,
>      build_egress_delivery_flows_for_lrouter_port(op, lsi->lflows, &lsi->match,
>                                                   &lsi->actions);
>      build_dhcpv6_reply_flows_for_lrouter_port(op, lsi->lflows, &lsi->match);
> +    build_dhcp_relay_flows_for_lrouter_port(op, lsi->lflows, &lsi->match);
>      build_ipv6_input_flows_for_lrouter_port(op, lsi->lflows,
>                                              &lsi->match, &lsi->actions,
>                                              lsi->meter_groups);
> diff --git a/ovn-nb.ovsschema b/ovn-nb.ovsschema
> index b2e0993e0..6863d52cd 100644
> --- a/ovn-nb.ovsschema
> +++ b/ovn-nb.ovsschema
> @@ -1,7 +1,7 @@
>  {
>      "name": "OVN_Northbound",
> -    "version": "7.2.0",
> -    "cksum": "1069338687 34162",
> +    "version": "7.3.0",
> +    "cksum": "2325497400 35185",
>      "tables": {
>          "NB_Global": {
>              "columns": {
> @@ -89,7 +89,12 @@
>                      "type": {"key": {"type": "uuid",
>                                       "refTable": "Forwarding_Group",
>                                       "refType": "strong"},
> -                                     "min": 0, "max": "unlimited"}}},
> +                                     "min": 0, "max": "unlimited"}},
> +                "dhcp_relay_port": {"type": {"key": {"type": "uuid",
> +                                            "refTable": "Logical_Router_Port",
> +                                            "refType": "weak"},
> +                                            "min": 0,
> +                                            "max": 1}}},
>              "isRoot": true},
>          "Logical_Switch_Port": {
>              "columns": {
> @@ -436,6 +441,11 @@
>                  "ipv6_prefix": {"type": {"key": "string",
>                                        "min": 0,
>                                        "max": "unlimited"}},
> +                "dhcp_relay": {"type": {"key": {"type": "uuid",
> +                                            "refTable": "DHCP_Relay",
> +                                            "refType": "weak"},
> +                                            "min": 0,
> +                                            "max": 1}},
>                  "external_ids": {
>                      "type": {"key": "string", "value": "string",
>                               "min": 0, "max": "unlimited"}},
> @@ -529,6 +539,15 @@
>                      "type": {"key": "string", "value": "string",
>                               "min": 0, "max": "unlimited"}}},
>              "isRoot": true},
> +        "DHCP_Relay": {
> +            "columns": {
> +                "servers": {"type": {"key": "string",
> +                                       "min": 0,
> +                                       "max": 1}},
> +                "external_ids": {
> +                    "type": {"key": "string", "value": "string",
> +                             "min": 0, "max": "unlimited"}}},
> +            "isRoot": true},
>          "Connection": {
>              "columns": {
>                  "target": {"type": "string"},
> diff --git a/ovn-nb.xml b/ovn-nb.xml
> index fcb1c6ecc..dc20892e1 100644
> --- a/ovn-nb.xml
> +++ b/ovn-nb.xml
> @@ -608,6 +608,11 @@
>        Please see the <ref table="DNS"/> table.
>      </column>
>
> +    <column name="dhcp_relay_port">
> +      This column defines the <ref table="Logical_Router_Port"/> on which
> +      DHCP relay is enabled.
> +    </column>
> +
>      <column name="forwarding_groups">
>        Groups a set of logical port endpoints for traffic going out of the
>        logical switch.
> @@ -2980,6 +2985,11 @@ or
>        port has all ingress and egress traffic dropped.
>      </column>
>
> +    <column name="dhcp_relay">
> +      This column is used to enable DHCP Relay. Please refer
> +      to <ref table="DHCP_Relay"/> table.
> +    </column>
> +
>      <group title="Distributed Gateway Ports">
>        <p>
>          Gateways, as documented under <code>Gateways</code> in the OVN
> @@ -4286,6 +4296,24 @@ or
>      </group>
>    </table>
>
> +  <table name="DHCP_Relay" title="DHCP Relay">
> +    <p>
> +      OVN implements native DHCPv4 relay support which caters to the common
> +      use case of relaying the DHCP requests to external DHCP server.
> +    </p>
> +
> +    <column name="servers">
> +      <p>
> +        The DHCPv4 server IP address.
> +      </p>
> +    </column>
> +    <group title="Common Columns">
> +      <column name="external_ids">
> +        See <em>External IDs</em> at the beginning of this document.
> +      </column>
> +    </group>
> +  </table>
> +
>    <table name="Connection" title="OVSDB client connections.">
>      <p>
>        Configuration for a database connection to an Open vSwitch database
> diff --git a/tests/atlocal.in b/tests/atlocal.in
> index 63d891b89..32d1c374e 100644
> --- a/tests/atlocal.in
> +++ b/tests/atlocal.in
> @@ -187,6 +187,9 @@ fi
>  # Set HAVE_DHCPD
>  find_command dhcpd
>
> +# Set HAVE_DHCLIENT
> +find_command dhclient
> +
>  # Set HAVE_BFDD_BEACON
>  find_command bfdd-beacon
>
> diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
> index 19e4f1263..4d8c9ff26 100644
> --- a/tests/ovn-northd.at
> +++ b/tests/ovn-northd.at
> @@ -8786,9 +8786,9 @@ ovn-nbctl --wait=sb set logical_router_port R1-PUB options:redirect-type=bridged
>  ovn-sbctl dump-flows R1 > R1flows
>  AT_CAPTURE_FILE([R1flows])
>
> -AT_CHECK([grep "lr_in_arp_resolve" R1flows | grep priority=90 | sort], [0], [dnl
> -  table=17(lr_in_arp_resolve  ), priority=90   , match=(outport == "R1-PUB" && ip4.src == 10.0.0.3 && is_chassis_resident("S0-P0")), action=(get_arp(outport, reg0); next;)
> -  table=17(lr_in_arp_resolve  ), priority=90   , match=(outport == "R1-PUB" && ip6.src == 1000::3 && is_chassis_resident("S0-P0")), action=(get_nd(outport, xxreg0); next;)
> +AT_CHECK([grep "lr_in_arp_resolve" R1flows | grep priority=90 | sed 's/table=../table=??/' | sort], [0], [dnl
> +  table=??(lr_in_arp_resolve  ), priority=90   , match=(outport == "R1-PUB" && ip4.src == 10.0.0.3 && is_chassis_resident("S0-P0")), action=(get_arp(outport, reg0); next;)
> +  table=??(lr_in_arp_resolve  ), priority=90   , match=(outport == "R1-PUB" && ip6.src == 1000::3 && is_chassis_resident("S0-P0")), action=(get_nd(outport, xxreg0); next;)
>  ])
>
>  AT_CLEANUP
> @@ -10966,3 +10966,38 @@ Status: active
>
>  AT_CLEANUP
>  ])
> +
> +OVN_FOR_EACH_NORTHD_NO_HV([
> +AT_SETUP([check DHCP RELAY AGENT])
> +ovn_start NORTHD_TYPE
> +
> +check ovn-nbctl ls-add ls0
> +check ovn-nbctl lsp-add ls0 ls0-port1
> +check ovn-nbctl lsp-set-addresses ls0-port1 02:00:00:00:00:10
> +check ovn-nbctl lr-add lr0
> +check ovn-nbctl lrp-add lr0 lrp1 02:00:00:00:00:01 192.168.1.1/24
> +check ovn-nbctl lsp-add ls0 lrp1-attachment
> +check ovn-nbctl lsp-set-type lrp1-attachment router
> +check ovn-nbctl lsp-set-addresses lrp1-attachment 00:00:00:00:ff:02
> +check ovn-nbctl lsp-set-options lrp1-attachment router-port=lrp1
> +check ovn-nbctl lrp-add lr0 lrp-ext 02:00:00:00:00:02 192.168.2.1/24
> +
> +dhcp_relay=$(ovn-nbctl create DHCP_Relay servers=172.16.1.1)
> +check ovn-nbctl set Logical_Router_port lrp1 dhcp_relay=$dhcp_relay
> +rp_uuid=$(ovn-nbctl --bare --colum=_uuid list logical_router_port lrp1)
> +check ovn-nbctl set Logical_Switch ls0 dhcp_relay_port=$rp_uuid
> +
> +check ovn-nbctl --wait=sb sync
> +
> +ovn-sbctl lflow-list > lflows
> +AT_CAPTURE_FILE([lflows])
> +
> +AT_CHECK([grep -e "DHCP_RELAY_" lflows | sed 's/table=../table=??/'], [0], [dnl
> +  table=??(lr_in_ip_input     ), priority=110  , match=(inport == "lrp1" && ip4.src == 0.0.0.0 && ip4.dst == 255.255.255.255 && udp.src == 68 && udp.dst == 67), action=(dhcp_relay_req(192.168.1.1,172.16.1.1);ip4.src=192.168.1.1;ip4.dst=172.16.1.1;udp.src=67;next; /* DHCP_RELAY_REQ */)
> +  table=??(lr_in_ip_input     ), priority=110  , match=(ip4.src == 172.16.1.1 && ip4.dst == 192.168.1.1 && udp.src == 67 && udp.dst == 67), action=(next;/* DHCP_RELAY_RESP */)
> +  table=??(lr_in_dhcp_relay_resp_fwd), priority=110  , match=(ip4.src == 172.16.1.1 && ip4.dst == 192.168.1.1 && udp.src == 67 && udp.dst == 67), action=(dhcp_relay_resp_fwd(192.168.1.1,172.16.1.1);ip4.src=192.168.1.1;udp.dst=68;outport="lrp1";output; /* DHCP_RELAY_RESP */)
> +  table=??(ls_in_l2_lkup      ), priority=100  , match=(inport == "ls0-port1" && eth.src == 02:00:00:00:00:10 && ip4.src == 0.0.0.0 && ip4.dst == 255.255.255.255 && udp.src == 68 && udp.dst == 67), action=(eth.dst=02:00:00:00:00:01;outport="lrp1-attachment";next;/* DHCP_RELAY_REQ */)
> +])
> +
> +AT_CLEANUP
> +])
> diff --git a/tests/ovn.at b/tests/ovn.at
> index e8c79512b..839c07ce2 100644
> --- a/tests/ovn.at
> +++ b/tests/ovn.at
> @@ -21905,7 +21905,7 @@ eth_dst=00000000ff01
>  ip_src=$(ip_to_hex 10 0 0 10)
>  ip_dst=$(ip_to_hex 172 168 0 101)
>  send_icmp_packet 1 1 $eth_src $eth_dst $ip_src $ip_dst c4c9 0000000000000000000000
> -AT_CHECK_UNQUOTED([as hv1 ovs-ofctl dump-flows br-int metadata=0x$lr0_dp_key | awk '/table=28, n_packets=1, n_bytes=45/{print $7" "$8}'],[0],[dnl
> +AT_CHECK_UNQUOTED([as hv1 ovs-ofctl dump-flows br-int metadata=0x$lr0_dp_key | awk '/table=29, n_packets=1, n_bytes=45/{print $7" "$8}'],[0],[dnl
>  priority=80,ip,reg15=0x$lr0_public_dp_key,metadata=0x$lr0_dp_key,nw_src=10.0.0.10 actions=drop
>  ])
>
> @@ -28964,7 +28964,7 @@ AT_CHECK([
>          grep "priority=100" | \
>          grep -c "ct(commit,zone=NXM_NX_REG11\\[[0..15\\]],.*exec(move:NXM_OF_ETH_SRC\\[[\\]]->NXM_NX_CT_LABEL\\[[32..79\\]],load:0x[[0-9]]->NXM_NX_CT_MARK\\[[16..31\\]]))"
>
> -        grep table=25 hv${hv}flows | \
> +        grep table=26 hv${hv}flows | \
>          grep "priority=200" | \
>          grep -c "move:NXM_NX_CT_LABEL\\[[\\]]->NXM_NX_XXREG1\\[[\\]],move:NXM_NX_XXREG1\\[[32..79\\]]->NXM_OF_ETH_DST"
>      done; :], [0], [dnl
> @@ -29089,7 +29089,7 @@ AT_CHECK([
>          grep "priority=100" | \
>          grep -c "ct(commit,zone=NXM_NX_REG11\\[[0..15\\]],.*exec(move:NXM_OF_ETH_SRC\\[[\\]]->NXM_NX_CT_LABEL\\[[32..79\\]],load:0x[[0-9]]->NXM_NX_CT_MARK\\[[16..31\\]]))"
>
> -        grep table=25 hv${hv}flows | \
> +        grep table=26 hv${hv}flows | \
>          grep "priority=200" | \
>          grep -c "move:NXM_NX_CT_LABEL\\[[\\]]->NXM_NX_XXREG1\\[[\\]],move:NXM_NX_XXREG1\\[[32..79\\]]->NXM_OF_ETH_DST"
>      done; :], [0], [dnl
> @@ -29586,7 +29586,7 @@ if test X"$1" = X"DGP"; then
>  else
>      prio=2
>  fi
> -AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep -E "table=25, n_packets=1,.* priority=$prio,ip,$inport.*$outport.*metadata=0x${sw_key},nw_dst=10.0.1.1 actions=drop" -c], [0], [dnl
> +AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep -E "table=26, n_packets=1,.* priority=$prio,ip,$inport.*$outport.*metadata=0x${sw_key},nw_dst=10.0.1.1 actions=drop" -c], [0], [dnl
>  1
>  ])
>
> @@ -29605,13 +29605,13 @@ AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep "actions=controller" | grep
>
>  if test X"$1" = X"DGP"; then
>      # The packet dst should be resolved once for E/W centralized NAT purpose.
> -    AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep -E "table=25, n_packets=1,.* priority=100,reg0=0xa000101,reg15=.*metadata=0x${sw_key} actions=mod_dl_dst:00:00:00:00:01:01,resubmit" -c], [0], [dnl
> +    AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep -E "table=26, n_packets=1,.* priority=100,reg0=0xa000101,reg15=.*metadata=0x${sw_key} actions=mod_dl_dst:00:00:00:00:01:01,resubmit" -c], [0], [dnl
>  1
>  ])
>  fi
>
>  # The packet should've been finally dropped in the lr_in_arp_resolve stage.
> -AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep -E "table=25, n_packets=2,.* priority=$prio,ip,$inport.*$outport.*metadata=0x${sw_key},nw_dst=10.0.1.1 actions=drop" -c], [0], [dnl
> +AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep -E "table=26, n_packets=2,.* priority=$prio,ip,$inport.*$outport.*metadata=0x${sw_key},nw_dst=10.0.1.1 actions=drop" -c], [0], [dnl
>  1
>  ])
>  OVN_CLEANUP([hv1])
> diff --git a/tests/system-ovn.at b/tests/system-ovn.at
> index 7b9daba0d..591933a95 100644
> --- a/tests/system-ovn.at
> +++ b/tests/system-ovn.at
> @@ -12032,3 +12032,153 @@ as
>  OVS_TRAFFIC_VSWITCHD_STOP(["/failed to query port patch-.*/d
>  /connection dropped.*/d"])
>  AT_CLEANUP
> +
> +OVN_FOR_EACH_NORTHD([
> +AT_SETUP([DHCP RELAY AGENT])
> +AT_SKIP_IF([test $HAVE_DHCPD = no])
> +AT_SKIP_IF([test $HAVE_DHCLIENT = no])
> +AT_SKIP_IF([test $HAVE_TCPDUMP = no])
> +ovn_start
> +OVS_TRAFFIC_VSWITCHD_START()
> +
> +ADD_BR([br-int])
> +ADD_BR([br-ext])
> +
> +ovs-ofctl add-flow br-ext action=normal
> +# Set external-ids in br-int needed for ovn-controller
> +ovs-vsctl \
> +        -- set Open_vSwitch . external-ids:system-id=hv1 \
> +        -- set Open_vSwitch . external-ids:ovn-remote=unix:$ovs_base/ovn-sb/ovn-sb.sock \
> +        -- set Open_vSwitch . external-ids:ovn-encap-type=geneve \
> +        -- set Open_vSwitch . external-ids:ovn-encap-ip=169.0.0.1 \
> +        -- set bridge br-int fail-mode=secure other-config:disable-in-band=true
> +
> +# Start ovn-controller
> +start_daemon ovn-controller
> +
> +ADD_NAMESPACES(sw01)
> +ADD_VETH(sw01, sw01, br-int, "0", "f0:00:00:01:02:03")
> +ADD_NAMESPACES(sw11)
> +ADD_VETH(sw11, sw11, br-int, "0", "f0:00:00:02:02:03")
> +ADD_NAMESPACES(server)
> +ADD_VETH(s1, server, br-ext, "172.16.1.1/24", "f0:00:00:01:02:05", \
> +         "172.16.1.254")
> +
> +check ovn-nbctl lr-add R1
> +
> +check ovn-nbctl ls-add sw0
> +check ovn-nbctl ls-add sw1
> +check ovn-nbctl ls-add sw-ext
> +
> +check ovn-nbctl lrp-add R1 rp-sw0 00:00:01:01:02:03 192.168.1.1/24
> +check ovn-nbctl lrp-add R1 rp-sw1 00:00:03:01:02:03 192.168.2.1/24
> +check ovn-nbctl lrp-add R1 rp-ext 00:00:02:01:02:03 172.16.1.254/24
> +
> +dhcp_relay=$(ovn-nbctl create DHCP_Relay servers=172.16.1.1)
> +check ovn-nbctl set Logical_Router_port rp-sw0 dhcp_relay=$dhcp_relay
> +check ovn-nbctl set Logical_Router_port rp-sw1 dhcp_relay=$dhcp_relay
> +check ovn-nbctl lrp-set-gateway-chassis rp-ext hv1
> +
> +check ovn-nbctl lsp-add sw0 sw0-rp -- set Logical_Switch_Port sw0-rp \
> +    type=router options:router-port=rp-sw0 \
> +    -- lsp-set-addresses sw0-rp router
> +check ovn-nbctl lsp-add sw1 sw1-rp -- set Logical_Switch_Port sw1-rp \
> +    type=router options:router-port=rp-sw1 \
> +    -- lsp-set-addresses sw1-rp router
> +
> +rp_uuid=$(ovn-nbctl --bare --colum=_uuid list logical_router_port rp-sw0)
> +check ovn-nbctl set Logical_Switch sw0 dhcp_relay_port=$rp_uuid
> +rp_uuid=$(ovn-nbctl --bare --colum=_uuid list logical_router_port rp-sw1)
> +check ovn-nbctl set Logical_Switch sw1 dhcp_relay_port=$rp_uuid
> +
> +check ovn-nbctl lsp-add sw-ext ext-rp -- set Logical_Switch_Port ext-rp \
> +    type=router options:router-port=rp-ext \
> +    -- lsp-set-addresses ext-rp router
> +check ovn-nbctl lsp-add sw-ext lnet \
> +        -- lsp-set-addresses lnet unknown \
> +        -- lsp-set-type lnet localnet \
> +        -- lsp-set-options lnet network_name=phynet
> +
> +check ovn-nbctl lsp-add sw0 sw01 \
> +    -- lsp-set-addresses sw01 "f0:00:00:01:02:03"
> +
> +check ovn-nbctl lsp-add sw1 sw11 \
> +    -- lsp-set-addresses sw11 "f0:00:00:02:02:03"
> +
> +AT_CHECK([ovs-vsctl set Open_vSwitch . external-ids:ovn-bridge-mappings=phynet:br-ext])
> +
> +OVN_POPULATE_ARP
> +
> +check ovn-nbctl --wait=hv sync
> +
> +DHCP_TEST_DIR="/tmp/dhcp-test"
> +rm -rf $DHCP_TEST_DIR
> +mkdir $DHCP_TEST_DIR
> +cat > $DHCP_TEST_DIR/dhcpd.conf <<EOF
> +subnet 172.16.1.0 netmask 255.255.255.0 {
> +}
> +subnet 192.168.1.0 netmask 255.255.255.0 {
> +  range 192.168.1.10 192.168.1.10;
> +  option routers 192.168.1.1;
> +  option broadcast-address 192.168.1.255;
> +  default-lease-time 60;
> +  max-lease-time 120;
> +}
> +subnet 192.168.2.0 netmask 255.255.255.0 {
> +  range 192.168.2.10 192.168.2.10;
> +  option routers 192.168.2.1;
> +  option broadcast-address 192.168.2.255;
> +  default-lease-time 60;
> +  max-lease-time 120;
> +}
> +EOF
> +cat > $DHCP_TEST_DIR/dhclien.conf <<EOF
> +timeout 2
> +EOF
> +
> +touch $DHCP_TEST_DIR/dhcpd.leases
> +chown root:dhcpd $DHCP_TEST_DIR $DHCP_TEST_DIR/dhcpd.leases
> +chmod 775 $DHCP_TEST_DIR
> +chmod 664 $DHCP_TEST_DIR/dhcpd.leases
> +
> +
> +NETNS_DAEMONIZE([server], [dhcpd -4 -f -cf $DHCP_TEST_DIR/dhcpd.conf s1 > dhcpd.log 2>&1], [dhcpd.pid])
> +
> +NS_CHECK_EXEC([server], [tcpdump -l -nvv -i s1  udp > pkt.pcap 2>tcpdump_err &])
> +OVS_WAIT_UNTIL([grep "listening" tcpdump_err])
> +on_exit 'kill $(pidof tcpdump)'
> +
> +NS_CHECK_EXEC([sw01], [dhclient -1 -q -lf $DHCP_TEST_DIR/dhclient-sw01.lease -pf $DHCP_TEST_DIR/dhclient-sw01.pid -cf $DHCP_TEST_DIR/dhclien.conf sw01])
> +NS_CHECK_EXEC([sw11], [dhclient -1 -q -lf $DHCP_TEST_DIR/dhclient-sw11.lease -pf $DHCP_TEST_DIR/dhclient-sw11.pid -cf $DHCP_TEST_DIR/dhclien.conf sw11])
> +
> +OVS_WAIT_UNTIL([
> +    total_pkts=$(cat pkt.pcap | wc -l)
> +    test ${total_pkts} -ge 8
> +])
> +
> +on_exit 'kill `cat $DHCP_TEST_DIR/dhclient-sw01.pid` &&
> +kill `cat $DHCP_TEST_DIR/dhclient-sw11.pid` && rm -rf $DHCP_TEST_DIR'
> +
> +NS_CHECK_EXEC([sw01], [ip addr show sw01 | grep -oP '(?<=inet\s)\d+(\.\d+){3}'], [0], [dnl
> +192.168.1.10
> +])
> +NS_CHECK_EXEC([sw11], [ip addr show sw11 | grep -oP '(?<=inet\s)\d+(\.\d+){3}'], [0], [dnl
> +192.168.2.10
> +])
> +OVS_APP_EXIT_AND_WAIT([ovn-controller])
> +
> +as ovn-sb
> +OVS_APP_EXIT_AND_WAIT([ovsdb-server])
> +
> +as ovn-nb
> +OVS_APP_EXIT_AND_WAIT([ovsdb-server])
> +
> +as northd
> +OVS_APP_EXIT_AND_WAIT([NORTHD_TYPE])
> +
> +as
> +OVS_TRAFFIC_VSWITCHD_STOP(["/.*error receiving.*/d
> +/failed to query port patch-.*/d
> +/.*terminating with signal 15.*/d"])
> +AT_CLEANUP
> +])
> diff --git a/utilities/ovn-trace.c b/utilities/ovn-trace.c
> index 0b86eae7b..ae9dd77de 100644
> --- a/utilities/ovn-trace.c
> +++ b/utilities/ovn-trace.c
> @@ -2328,6 +2328,25 @@ execute_put_dhcp_opts(const struct ovnact_put_opts *pdo,
>      execute_put_opts(pdo, name, uflow, super);
>  }
>
> +static void
> +execute_dhcpv4_relay_resp_fwd(const struct ovnact_dhcp_relay *dr,
> +                                const char *name, struct flow *uflow,
> +                                struct ovs_list *super)
> +{
> +    ovntrace_node_append(
> +        super, OVNTRACE_NODE_ERROR,
> +        "/* We assume that this packet is DHCPOFFER or DHCPACK and "
> +            "DHCP broadcast flag is set. Dest IP is set to broadcast. "
> +            "Dest MAC is set to broadcast but in real network this is unicast "
> +            "which is extracted from DHCP header. */");
> +
> +    /* Assume DHCP broadcast flag is set */
> +    uflow->nw_dst = 0xFFFFFFFF;
> +    /* Dest MAC is set to broadcast but in real network this is unicast */
> +    struct eth_addr bcast_mac = {0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF};
> +    uflow->dl_dst = bcast_mac;
> +}
> +
>  static void
>  execute_put_nd_ra_opts(const struct ovnact_put_opts *pdo,
>                         const char *name, struct flow *uflow,
> @@ -3215,6 +3234,15 @@ trace_actions(const struct ovnact *ovnacts, size_t ovnacts_len,
>                                    "put_dhcpv6_opts", uflow, super);
>              break;
>
> +        case OVNACT_DHCPV4_RELAY_REQ:
> +            /* Nothing to do for tracing. */
> +            break;
> +
> +        case OVNACT_DHCPV4_RELAY_RESP_FWD:
> +            execute_dhcpv4_relay_resp_fwd(ovnact_get_DHCPV4_RELAY_RESP_FWD(a),
> +                                    "dhcp_relay_resp_fwd", uflow, super);
> +            break;
> +
>          case OVNACT_PUT_ND_RA_OPTS:
>              execute_put_nd_ra_opts(ovnact_get_PUT_DHCPV6_OPTS(a),
>                                     "put_nd_ra_opts", uflow, super);
> --
> 2.36.6
>
> _______________________________________________
> dev mailing list
> dev@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>
Naveen Yerramneni Jan. 24, 2024, 12:33 a.m. UTC | #3
> On 16-Jan-2024, at 2:30 AM, Numan Siddique <numans@ovn.org> wrote:
> 
> On Tue, Dec 12, 2023 at 1:05 PM Naveen Yerramneni
> <naveen.yerramneni@nutanix.com> wrote:
>> 
>>    This patch contains changes to enable DHCP Relay Agent support for overlay subnets.
>> 
>>    USE CASE:
>>    ----------
>>      - Enable IP address assignment for overlay subnets from the centralized DHCP server present in the underlay network.
>> 
>>    PREREQUISITES
>>    --------------
>>      - Logical Router Port IP should be assigned (statically) from the same overlay subnet which is managed by DHCP server.
>>      - LRP IP is used for the GIADDR field when relaying the DHCP packets, and the same IP also needs to be configured as the default gateway for the overlay subnet.
>>      - Overlay subnets managed by external DHCP server are expected to be directly reachable from the underlay network.
>> 
>>    EXPECTED PACKET FLOW:
>>    ----------------------
>>    The following is the expected packet flow in order to support DHCP relay functionality in OVN.
>>      1. DHCP client originates DHCP discovery (broadcast).
>>      2. DHCP relay (running on the OVN) receives the broadcast and forwards the packet to the DHCP server by converting it to unicast. While forwarding the packet, it updates the GIADDR in DHCP header to its
>>         interface IP on which DHCP packet is received.
>>      3. DHCP server uses GIADDR field to decide the IP address pool from which IP has to be assigned and DHCP offer is sent to the same IP (GIADDR).
>>      4. DHCP relay agent forwards the offer to the client, it resets the GIADDR field when forwarding the offer to the client.
>>      5. DHCP client sends DHCP request (broadcast) packet.
>>      6. DHCP relay (running on the OVN) receives the broadcast and forwards the packet to the DHCP server by converting it to unicast. While forwarding the packet, it updates the GIADDR in DHCP header to its
>>         interface IP on which DHCP packet is received.
>>      7. DHCP Server sends the ACK packet.
>>      8. DHCP relay agent forwards the ACK packet to the client, it resets the GIADDR field when forwarding the ACK to the client.
>>      9. All the future renew/release packets are directly exchanged between DHCP client and DHCP server.
>> 
>>    OVN DHCP RELAY PACKET FLOW:
>>    ----------------------------
>>    To add DHCP Relay support on OVN, we need to replicate all the behavior described above using distributed logical switch and logical router.
>>    At a high level, the packet flow is distributed among the Logical Switch and Logical Router on the source node (where the VM is deployed) and the redirect chassis (RC) node.
>>      1. The request packet gets processed on the source node where the VM is deployed, which relays the packet to the DHCP server.
>>      2. The response packet is first processed on the RC node (which first receives the packet from the underlay network). The RC node forwards the packet to the right node by filling in the dest MAC and IP.
>> 
>>    OVN Packet flow with DHCP relay is explained below.
>>      1. DHCP client (VM) sends the DHCP discover packet (broadcast).
>>      2. Logical switch converts the packet to L2 unicast by setting the destination MAC to LRP's MAC
>>      3. Logical Router receives the packet and redirects it to the OVN controller.
>>      4. OVN controller updates the required information(GIADDR) in the DHCP payload after doing the required checks. If any check fails, packet is dropped.
>>      5. Logical Router converts the packet to L3 unicast and forwards it to the server. This packet gets routed like any other packet (via RC node).
>>      6. Server replies with DHCP offer.
>>      7. RC node processes the DHCP offer and forwards it to the OVN controller.
>>      8. OVN controller does sanity checks, updates the destination MAC (available in DHCP header) and destination IP (available in DHCP header), resets GIADDR, and reinjects the packet to datapath.
>>         If any check fails, packet is dropped.
>>      9. Logical router updates the source IP and port and forwards the packet to logical switch.
>>      10. Logical switch delivers the packet to the DHCP client.
>>      11. Similar steps are performed for Request and Ack packets.
>>      12. All the future renew/release packets are directly exchanged between DHCP client and DHCP server
>> 
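For illustration, step 8 above (choosing the destination MAC and IP from the DHCP header) could be sketched as below. This is a minimal, standalone sketch and not the code in this patch: the struct, the function name and the host-byte-order fields are hypothetical, and the unicast branch (using yiaddr when the broadcast flag is clear) is an assumption based on RFC 1542 rather than something stated in the commit message. DHCP_BROADCAST_FLAG is defined locally with the RFC 2131 broadcast bit value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Broadcast bit of the BOOTP 'flags' field (RFC 2131); defined locally
 * so that the sketch is self-contained. */
#define DHCP_BROADCAST_FLAG 0x8000

/* Hypothetical, trimmed-down view of the DHCP header fields used here.
 * All multi-byte values are kept in host byte order for simplicity. */
struct relay_reply_hdr {
    uint16_t flags;      /* BOOTP flags. */
    uint32_t yiaddr;     /* Address offered to the client. */
    uint8_t chaddr[6];   /* Client hardware (MAC) address. */
};

struct relay_dst {
    uint32_t ip;
    uint8_t mac[6];
};

/* Picks the destination for a relayed OFFER/ACK.  The MAC always comes
 * from chaddr.  The IP is the limited broadcast address when the client
 * set the broadcast flag, otherwise yiaddr (assumption, see above). */
static struct relay_dst
relay_reply_dst(const struct relay_reply_hdr *h)
{
    struct relay_dst dst;

    assert(h != NULL);
    memcpy(dst.mac, h->chaddr, sizeof dst.mac);
    if (h->flags & DHCP_BROADCAST_FLAG) {
        dst.ip = UINT32_C(0xffffffff);
    } else {
        dst.ip = h->yiaddr;
    }
    return dst;
}
```

Keeping the MAC unicast even in the broadcast case matches the note in the ovn-trace change, which says the real destination MAC is extracted from the DHCP header.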
>>    NEW OVN ACTIONS
>>    ---------------
>> 
>>      1. dhcp_relay_req(<relay-ip>, <server-ip>)
>>          - This action executes on the source node on which the DHCP request originated.
>>          - This action relays the DHCP request coming from client to the server. Relay-ip is used to update GIADDR in the DHCP header.
>>      2. dhcp_relay_resp_fwd(<relay-ip>, <server-ip>)
>>          - This action executes on the first node (RC node) which processes the DHCP response from the server.
>>          - This action updates the destination MAC and destination IP so that the response can be forwarded to the node from which the request originated.
>>          - Relay-ip, server-ip are used to validate GIADDR and SERVER ID in the DHCP payload.
>> 
>>    FLOWS
>>    -----
>>    Following are the flows required for one overlay subnet.
>> 
>>      1. table=27(ls_in_l2_lkup      ), priority=100  , match=(inport == <vm_port> && eth.src == <vm_mac> && ip4.src == 0.0.0.0 && ip4.dst == 255.255.255.255 && udp.src == 68 && udp.dst == 67), action=(eth.dst=<lrp_mac>;outport=<lrp-port>;next;/* DHCP_RELAY_REQ */)
>>      2. table=3 (lr_in_ip_input     ), priority=110  , match=(inport == <lrp_port> && ip4.src == 0.0.0.0 && ip4.dst == 255.255.255.255 && udp.src == 68 && udp.dst == 67), action=(dhcp_relay_req(<lrp_ip>,<dhcp_server_ip>);ip4.src=<lrp_ip>;ip4.dst=<dhcp_server_ip>;udp.src=67;next; /* DHCP_RELAY_REQ */)
>>      3. table=3 (lr_in_ip_input     ), priority=110  , match=(ip4.src == <dhcp_server_ip> && ip4.dst ==<lrp_ip> && udp.src == 67 && udp.dst == 67), action=(next;/* DHCP_RELAY_RESP */)
>>      4. table=17(lr_in_dhcp_relay_resp_fwd), priority=110  , match=(ip4.src == <dhcp_server_ip> && ip4.dst == <lrp_ip> && udp.src == 67 && udp.dst == 67), action=(dhcp_relay_resp_fwd(<lrp_ip>,<dhcp_server_ip>);ip4.src=<lrp_ip>;udp.dst=68;outport=<lrp_port>;output; /* DHCP_RELAY_RESP */)
>> 
>>    NEW PIPELINE STAGES
>>    -------------------
>>    The following stage is added for the DHCP relay feature. Some of the flows are fitted into the existing pipeline stages.
>>      1. lr_in_dhcp_relay_resp_fwd
>>          - Forward the DHCP response to the appropriate node
>> 
>>    NB SCHEMA CHANGES
>>    ----------------
>>      1. New DHCP_Relay table
>>          "DHCP_Relay": {
>>                "columns": {
>>            "name": {"type": "string"},
>>                    "servers": {"type": {"key": "string",
>>                                           "min": 0,
>>                                           "max": 1}},
>>                    "external_ids": {
>>                        "type": {"key": "string", "value": "string",
>>                                "min": 0, "max": "unlimited"}}},
>>                "isRoot": true},
>>      2. New column in Logical_Router_Port table
>>          "dhcp_relay": {"type": {"key": {"type": "uuid",
>>                                "refTable": "DHCP_Relay",
>>                                "refType": "weak"},
>>                                "min": 0,
>>                                "max": 1}},
>>      3. New column in Logical_Switch table
>>          "dhcp_relay_port": {"type": {"key": {"type": "uuid",
>>                                        "refTable": "Logical_Router_Port",
>>                                        "refType": "weak"},
>>                                         "min": 0,
>>                                         "max": 1}}},
>> 
>>    Commands to enable the feature:
>>    ------------------------------
>>      - ovn-nbctl create DHCP_Relay servers=<ip>
>>      - ovn-nbctl set Logical_Router_port <lrp_uuid> dhcp_relay=<dhcp_relay_uuid>
>>      - ovn-nbctl set Logical_Switch <ls_uuid> dhcp_relay_port=<lrp_uuid>
>> 
>>    Example:
>>    -------
>>     ovn-nbctl ls-add sw1
>>     ovn-nbctl lsp-add sw1 sw1-port1
>>     ovn-nbctl lsp-set-addresses sw1-port1 <MAC> #Only MAC address has to be specified when logical ports are created.
>>     ovn-nbctl lr-add lr1
>>     ovn-nbctl lrp-add lr1 lr1-port1 <MAC> <GATEWAY_IP/Prefix> #GATEWAY IP is set in GIADDR field when relaying the DHCP requests to server.
>>     ovn-nbctl lsp-add sw1 lr1-attachment
>>     ovn-nbctl lsp-set-type lr1-attachment router
>>     ovn-nbctl lsp-set-addresses lr1-attachment <MAC>
>>     ovn-nbctl lsp-set-options lr1-attachment router-port=lr1-port1
>>     ovn-nbctl create DHCP_Relay servers=<DHCP_SERVER_IP>
>>     ovn-nbctl set Logical_Router_port <lrp_uuid> dhcp_relay=<relay_uuid>
>>     ovn-nbctl set Logical_Switch <ls_uuid> dhcp_relay_port=<lrp_uuid>
>> 
>>    Limitations:
>>    ------------
>>      - All OVN features that need an IP address to be configured on the logical port (like proxy ARP, etc.) will not be supported for overlay subnets on which DHCP relay is enabled.
>> 
>>    References:
>>    ----------
>>      - rfc1541, rfc1542, rfc2131
>> 
>> Signed-off-by: Naveen Yerramneni <naveen.yerramneni@nutanix.com>
>> Co-authored-by: Huzaifa Calcuttawala <huzaifa.c@nutanix.com>
>> Signed-off-by: Huzaifa Calcuttawala <huzaifa.c@nutanix.com>
>> CC: Mary Manohar <mary.manohar@nutanix.com>
> 
> Hi Naveen,
> 
> Thanks for the patch.  Sorry for the delayed response.
> 
> I've a few comments.
> 
> 1.  Regarding the newly added Table - DHCP_Relay in NB DB and the
> newly added columns in Logical_Switch and
>    Logical_Router table.
> 
>    I don't think there is a need to add the new table DHCP_Relay
> since it only stores the dhcp relay agent server ip.
>    Also it could complicate the northd incremental processing.
> 
>    If for example we have below logical switches and router
> 
>    ovn-nbctl lr-add R1
>    ovn-nbctl ls-add sw0
>    ovn-nbctl ls-add sw1
>    ovn-nbctl ls-add sw-ext
>    ovn-nbctl lrp-add R1 rp-sw0 00:00:01:01:02:03 192.168.1.1/24
>    ovn-nbctl lrp-add R1 rp-sw1 00:00:03:01:02:03 192.168.2.1/24
>    ovn-nbctl lrp-add R1 rp-ext 00:00:02:01:02:03 172.16.1.254/24
> 
>    ovn-nbctl lsp-add sw0 sw0-rp -- set Logical_Switch_Port sw0-rp \
>    type=router options:router-port=rp-sw0 \
>    -- lsp-set-addresses sw0-rp router
> 
>    ovn-nbctl lsp-add sw1 sw1-rp -- set Logical_Switch_Port sw1-rp \
>    type=router options:router-port=rp-sw1 \
>    -- lsp-set-addresses sw1-rp router
> 
>    I'd suggest doing something like below to enable this feature.
> 
>    ovn-nbctl set Logical_Switch_Port sw0-rp options:dhcp_relay=true
>    ovn-nbctl set Logical_Switch_Port sw1-rp options:dhcp_relay=true
> 
>    (Make sure that only one logical switch port of type router can
> have this flag - dhcp_relay set
>     for a given logical switch and document this limitation.)

Ack. This suggestion looks good.

>    ovn-nbctl set Logical_Router_port rp-sw0 options:dhcp_relay_ip=172.16.1.1
>    ovn-nbctl set Logical_Router_port rp-sw1 options:dhcp_relay_ip=172.16.1.1
> 
>    Let me know if there are any limitations with this.

The reason I added a new table is that it would be useful in the future if we add
additional options (like setting the hop count in the DHCP header, etc.) to the
DHCP relay functionality. What do you recommend if we have to add more options
in the future?
 


> 2.  Regarding the newly added actions - dhcp_relay_req() and
> dhcp_relay_resp_fwd().
>     Both of these actions are encoded as OVS controller actions with
>     pause enabled, which means ovs-vswitchd has to freeze the flow
>     translation and resume it once ovn-controller resumes the packet.
>     But the functions pinctrl_handle_dhcp_relay_req() and
>     pinctrl_handle_dhcp_relay_resp_fwd() do not resume the packet if
>     the packet has some errors.  This is wrong: the packet must always
>     be resumed, otherwise vswitchd will never thaw the frozen
>     translation.
> 
>     You can see the existing OVN actions - put_dhcp_opts() and few others which
>     use controller action with pause.  In such actions, the result of
> these actions
>     are stored in a register bit (i.e if put_dhcp_opts() was successful or not)
>     and in the next stage we take a decision based on the result.
> 
>     For the action dhcp_relay_req(relay_ip, server_ip),  I don't
> think you should use the pause flag.
>     Also in this action the argument server_ip is never used in the
> function pinctrl_handle_dhcp_relay_req()
>     other than to just log.
> 
>     I'd suggest you do something like this:
> 
>    table=3 (lr_in_ip_input     ), priority=110  , match=(inport ==
> "lrp1" && ip4.src == 0.0.0.0 && ip4.dst == 255.255.255.255 && udp.src
> == 68 && udp.dst == 67),
>    action=(dhcp_relay_req { ip4.src = 192.168.1.1; ip4.dst =
> 172.16.1.1; udp.src = 67; dhcp_header.giaddr = <relay_ip>;
> next(pipeline=ingress,table=S_ROUTER_IN_UNSNAT);  /* DHCP_RELAY_REQ */
> }
> 
>    dhcp_relay_req action would get translated into a controller
> action with pause=false and all the inner actions of this are encoded
> as
>    normal actions and stored in the userdata of controller action.
> Please see icmp4_error {} as an example.
>    Add a new OVN field 'dhcp_header.giaddr' which gets translated as
> controller action with pause flag set.
>    Please see the existing OVN field - icmp4.frag_mtu as an example
> and see this commit for reference [1]
>    When encoding this new OVN field, store the relay_ip in the
> userdata buffer and in pinctrl.c
>    get the relay_ip value and store it in the dhcp header field.
> 
> 
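For illustration, the pinctrl side of the suggested 'dhcp_header.giaddr' field (take the relay_ip from the userdata and store it in the DHCP header) could look roughly like the sketch below. It is standalone and deliberately simplified: the struct and function names are hypothetical, addresses are in host byte order, and the two sanity checks (the message is a BOOTREQUEST and giaddr is still zero) are assumptions, since the thread only says that "required checks" are done.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define BOOTP_OP_REQUEST 1   /* BOOTREQUEST op code (RFC 2131). */

/* Hypothetical, trimmed-down view of the DHCP header fields used here. */
struct relay_req_hdr {
    uint8_t op;        /* Message op code. */
    uint32_t giaddr;   /* Relay agent IP address, host byte order. */
};

/* Writes 'relay_ip' into giaddr and returns true when the header passes
 * the (assumed) checks.  Returns false and leaves the header untouched
 * otherwise, so that the caller can decide how to handle the packet. */
static bool
relay_req_set_giaddr(struct relay_req_hdr *h, uint32_t relay_ip)
{
    assert(h != NULL);
    if (h->op != BOOTP_OP_REQUEST || h->giaddr != 0) {
        return false;
    }
    h->giaddr = relay_ip;
    return true;
}
```

Returning a status instead of silently dropping keeps the decision about the packet outside the helper, which is the direction the review comments point in.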
>    For the action dhcp_relay_resp_fwd,  I'd suggest something like below:
> 
>      table=17 (lr_in_dhcp_relay_resp_chk), priority=110  ,
> match=(ip4.src == 172.16.1.1 && ip4.dst == 192.168.1.1 && udp.src ==
> 67 && udp.dst == 67),
>      action=(reg0[0] = dhcp_relay_resp_chk(dhcp_header.giaddr ==
> <relay_ip>); next;)
>      table=17 (lr_in_dhcp_relay_resp), priority=110  , match=(ip4.src
> == 172.16.1.1 && ip4.dst == 192.168.1.1 && udp.src == 67 && udp.dst ==
> 67 && reg0[0] == 1),
>      action=(ip4.src = 192.168.1.1; udp.dst = 68; outport = "lrp1";
> output; /* DHCP_RELAY_RESP */)
> 
>       I used reg0[0] as an example.  You may need to check the free
> register bit and use it.
> 
>      You need to encode dhcp_relay_resp_chk as controller action with
> pause=true, and store the relay_ip in the userdata buffer.
>      And in pinctrl.c  check that  'dhcp_header.giaddr == relay_ip'
> or not.  If so, set the result register bit to 1, else to 0.
> 
>   Let me know if you've any questions.
> 

Ack. Thanks for the suggestions and the detailed explanation.
Before the implementation I had referred to the icmp4_error and native
dhcp_server flows, but I had a slight misunderstanding about the pause flag.
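For illustration, the check-and-store-result pattern suggested for dhcp_relay_resp_chk could be sketched as below. This is a standalone sketch under stated assumptions, not the eventual pinctrl.c code: the struct and both function names are hypothetical, addresses are in host byte order, and a plain uint32_t stands in for the register that carries the result bit (reg0[0] in the example flows above).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, trimmed-down view of the DHCP header field used here. */
struct relay_resp_hdr {
    uint32_t giaddr;   /* Relay agent IP address, host byte order. */
};

/* Returns 1 if the response carries the expected relay IP in giaddr,
 * 0 otherwise.  The packet is never dropped here; the caller always
 * resumes it and only the result bit differs. */
static uint8_t
relay_resp_chk(const struct relay_resp_hdr *h, uint32_t relay_ip)
{
    assert(h != NULL);
    return h->giaddr == relay_ip ? 1 : 0;
}

/* Stores 'ok' in bit 0 of 'reg' and leaves the other bits unchanged. */
static uint32_t
store_result_bit(uint32_t reg, uint8_t ok)
{
    return (reg & ~UINT32_C(1)) | (uint32_t) (ok & 1);
}
```

With this shape, a failed check still lets the frozen translation continue, and the next stage can match on the result bit to forward or drop the packet.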


> 3.  The newly added functions in pinctrl.c have a lot of repetitive
> code and it is very much similar to existing
> pinctrl_handle_put_dhcp_opts()
>    Please see if the duplicate code can be avoided.

Ack.



> [1] - https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_ovn-2Dorg_ovn_commit_3d9fec3fd5992e1201b4d4fdf43f1f397e8d5ea1&d=DwIFaQ&c=s883GpUCOChKOHiocYtGcg&r=2PQjSDR7A28z1kXE1ptSm6X36oL_nCq1XxeEt7FkLmA&m=jUP6tr4FN6iSRj6v8rdyetsvEpT13QUHVMbw__3u6Sm7qAhyuu9tBdezdVmkqt0p&s=xAleLPNTzueIGuScqWZRp7ppL2D7bbjqLZc6q4xk3Rg&e= 
> 
> Thanks
> Numan
> 
>> ---
>> controller/pinctrl.c  | 441 ++++++++++++++++++++++++++++++++++++++++++
>> include/ovn/actions.h |  26 +++
>> lib/actions.c         | 117 +++++++++++
>> lib/ovn-l7.h          |   1 +
>> northd/northd.c       | 177 ++++++++++++++++-
>> ovn-nb.ovsschema      |  25 ++-
>> ovn-nb.xml            |  28 +++
>> tests/atlocal.in      |   3 +
>> tests/ovn-northd.at   |  41 +++-
>> tests/ovn.at          |  12 +-
>> tests/system-ovn.at   | 150 ++++++++++++++
>> utilities/ovn-trace.c |  28 +++
>> 12 files changed, 1032 insertions(+), 17 deletions(-)
>> 
>> diff --git a/controller/pinctrl.c b/controller/pinctrl.c
>> index 5a35d56f6..45240f01d 100644
>> --- a/controller/pinctrl.c
>> +++ b/controller/pinctrl.c
>> @@ -1897,6 +1897,437 @@ is_dhcp_flags_broadcast(ovs_be16 flags)
>>     return flags & htons(DHCP_BROADCAST_FLAG);
>> }
>> 
>> +static const char *dhcp_msg_str[] = {
>> +[0] = "INVALID",
>> +[DHCP_MSG_DISCOVER] = "DISCOVER",
>> +[DHCP_MSG_OFFER] = "OFFER",
>> +[DHCP_MSG_REQUEST] = "REQUEST",
>> +[OVN_DHCP_MSG_DECLINE] = "DECLINE",
>> +[DHCP_MSG_ACK] = "ACK",
>> +[DHCP_MSG_NAK] = "NAK",
>> +[OVN_DHCP_MSG_RELEASE] = "RELEASE",
>> +[OVN_DHCP_MSG_INFORM] = "INFORM"
>> +};
>> +
>> +static bool
>> +dhcp_relay_is_msg_type_supported(uint8_t msg_type)
>> +{
>> +    return (msg_type >= DHCP_MSG_DISCOVER && msg_type <= OVN_DHCP_MSG_RELEASE);
>> +}
>> +
>> +static const char *dhcp_msg_str_get(uint8_t msg_type)
>> +{
>> +    if (!dhcp_relay_is_msg_type_supported(msg_type)) {
>> +        return "INVALID";
>> +    }
>> +    return dhcp_msg_str[msg_type];
>> +}
>> +
>> +/* Called with in the pinctrl_handler thread context. */
>> +static void
>> +pinctrl_handle_dhcp_relay_req(
>> +    struct rconn *swconn,
>> +    struct dp_packet *pkt_in, struct ofputil_packet_in *pin,
>> +    struct ofpbuf *userdata,
>> +    struct ofpbuf *continuation)
>> +{
>> +    enum ofp_version version = rconn_get_version(swconn);
>> +    enum ofputil_protocol proto = ofputil_protocol_from_ofp_version(version);
>> +    struct dp_packet *pkt_out_ptr = NULL;
>> +
>> +    /* Parse relay IP and server IP. */
>> +    ovs_be32 *relay_ip = ofpbuf_try_pull(userdata, sizeof *relay_ip);
>> +    ovs_be32 *server_ip = ofpbuf_try_pull(userdata, sizeof *server_ip);
>> +    if (!relay_ip || !server_ip) {
>> +        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
>> +        VLOG_WARN_RL(&rl, "DHCP_RELAY_REQ: relay ip or server ip "
>> +                  "not present in the userdata");
>> +        return;
>> +    }
>> +
>> +    /* Validate the DHCP request packet.
>> +     * Format of the DHCP packet is
>> +     * ------------------------------------------------------------------------
>> +     *| UDP HEADER  | DHCP HEADER  | 4 Byte DHCP Cookie | DHCP OPTIONS(var len)|
>> +     * ------------------------------------------------------------------------
>> +     */
>> +
>> +    size_t in_l4_size = dp_packet_l4_size(pkt_in);
>> +    const char *end = (char *) dp_packet_l4(pkt_in) + in_l4_size;
>> +    const char *in_dhcp_ptr = dp_packet_get_udp_payload(pkt_in);
>> +    if (!in_dhcp_ptr) {
>> +        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
>> +        VLOG_WARN_RL(&rl, "DHCP_RELAY_REQ: invalid or incomplete "
>> +                  "DHCP packet received");
>> +        return;
>> +    }
>> +
>> +    const struct dhcp_header *in_dhcp_data
>> +        = (const struct dhcp_header *) in_dhcp_ptr;
>> +    in_dhcp_ptr += sizeof *in_dhcp_data;
>> +    if (in_dhcp_ptr > end) {
>> +        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
>> +        VLOG_WARN_RL(&rl, "DHCP_RELAY_REQ: invalid or incomplete "
>> +                "DHCP packet received, bad data length");
>> +        return;
>> +    }
>> +    if (in_dhcp_data->op != DHCP_OP_REQUEST) {
>> +        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
>> +        VLOG_WARN_RL(&rl, "DHCP_RELAY_REQ: invalid opcode in the "
>> +                "DHCP packet: %d", in_dhcp_data->op);
>> +        return;
>> +    }
>> +
>> +    /* DHCP options follow the DHCP header. The first 4 bytes of the DHCP
>> +     * options is the DHCP magic cookie followed by the actual DHCP options.
>> +     */
>> +    ovs_be32 magic_cookie = htonl(DHCP_MAGIC_COOKIE);
>> +    if (in_dhcp_ptr + sizeof magic_cookie > end ||
>> +        get_unaligned_be32((const void *) in_dhcp_ptr) != magic_cookie) {
>> +        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
>> +        VLOG_WARN_RL(&rl, "DHCP_RELAY_REQ: magic cookie not present "
>> +                "in the packet");
>> +        return;
>> +    }
>> +
>> +    if (in_dhcp_data->giaddr) {
>> +        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
>> +        VLOG_WARN_RL(&rl, "DHCP_RELAY_REQ: giaddr is already set");
>> +        return;
>> +    }
>> +
>> +    if (in_dhcp_data->htype != 0x1) {
>> +        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
>> +        VLOG_WARN_RL(&rl, "DHCP_RELAY_REQ: packet is received with "
>> +                "unsupported hardware type");
>> +        return;
>> +    }
>> +
>> +    ovs_be32 *server_id_ptr = NULL;
>> +    const uint8_t *in_dhcp_msg_type = NULL;
>> +
>> +    in_dhcp_ptr += sizeof magic_cookie;
>> +    ovs_be32 request_ip = in_dhcp_data->ciaddr;
>> +    while (in_dhcp_ptr < end) {
>> +        const struct dhcp_opt_header *in_dhcp_opt =
>> +            (const struct dhcp_opt_header *) in_dhcp_ptr;
>> +        if (in_dhcp_opt->code == DHCP_OPT_END) {
>> +            break;
>> +        }
>> +        if (in_dhcp_opt->code == DHCP_OPT_PAD) {
>> +            in_dhcp_ptr += 1;
>> +            continue;
>> +        }
>> +        in_dhcp_ptr += sizeof *in_dhcp_opt;
>> +        if (in_dhcp_ptr > end) {
>> +            break;
>> +        }
>> +        in_dhcp_ptr += in_dhcp_opt->len;
>> +        if (in_dhcp_ptr > end) {
>> +            break;
>> +        }
>> +
>> +        switch (in_dhcp_opt->code) {
>> +        case DHCP_OPT_MSG_TYPE:
>> +            if (in_dhcp_opt->len == 1) {
>> +                in_dhcp_msg_type = DHCP_OPT_PAYLOAD(in_dhcp_opt);
>> +            }
>> +            break;
>> +        case DHCP_OPT_REQ_IP:
>> +            if (in_dhcp_opt->len == 4) {
>> +                request_ip = get_unaligned_be32(DHCP_OPT_PAYLOAD(in_dhcp_opt));
>> +            }
>> +            break;
>> +        /* Server Identifier */
>> +        case OVN_DHCP_OPT_CODE_SERVER_ID:
>> +            if (in_dhcp_opt->len == 4) {
>> +                server_id_ptr = DHCP_OPT_PAYLOAD(in_dhcp_opt);
>> +            }
>> +            break;
>> +        default:
>> +            break;
>> +        }
>> +    }
>> +
>> +    /* Check whether the DHCP Message Type (opt 53) is present or not */
>> +    if (!in_dhcp_msg_type) {
>> +        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
>> +        VLOG_WARN_RL(&rl, "DHCP_RELAY_REQ: missing message type");
>> +        return;
>> +    }
>> +
>> +    /* Relay the DHCP request packet */
>> +    uint16_t new_l4_size = in_l4_size;
>> +    size_t new_packet_size = pkt_in->l4_ofs + new_l4_size;
>> +
>> +    struct dp_packet pkt_out;
>> +    dp_packet_init(&pkt_out, new_packet_size);
>> +    dp_packet_clear(&pkt_out);
>> +    dp_packet_prealloc_tailroom(&pkt_out, new_packet_size);
>> +    pkt_out_ptr = &pkt_out;
>> +
>> +    /* Copy the L2 and L3 headers from pkt_in, as they remain the same. */
>> +    dp_packet_put(
>> +        &pkt_out, dp_packet_pull(pkt_in, pkt_in->l4_ofs), pkt_in->l4_ofs);
>> +
>> +    pkt_out.l2_5_ofs = pkt_in->l2_5_ofs;
>> +    pkt_out.l2_pad_size = pkt_in->l2_pad_size;
>> +    pkt_out.l3_ofs = pkt_in->l3_ofs;
>> +    pkt_out.l4_ofs = pkt_in->l4_ofs;
>> +
>> +    struct ip_header *out_ip = dp_packet_l3(&pkt_out);
>> +
>> +    struct udp_header *udp = dp_packet_put(
>> +        &pkt_out, dp_packet_pull(pkt_in, UDP_HEADER_LEN), UDP_HEADER_LEN);
>> +
>> +    struct dhcp_header *dhcp_data = dp_packet_put(&pkt_out,
>> +        dp_packet_pull(pkt_in, new_l4_size - UDP_HEADER_LEN),
>> +        new_l4_size - UDP_HEADER_LEN);
>> +    dhcp_data->giaddr = *relay_ip;
>> +    if (udp->udp_csum) {
>> +        udp->udp_csum = recalc_csum32(udp->udp_csum,
>> +            0, dhcp_data->giaddr);
>> +    }
>> +    pin->packet = dp_packet_data(&pkt_out);
>> +    pin->packet_len = dp_packet_size(&pkt_out);
>> +
>> +    /* Log the DHCP message. */
>> +    static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(20, 40);
>> +    const struct eth_header *l2 = dp_packet_eth(&pkt_out);
>> +    VLOG_INFO_RL(&rl, "DHCP_RELAY_REQ:: MSG_TYPE:%s MAC:"ETH_ADDR_FMT
>> +                " XID:%u"
>> +                " REQ_IP:"IP_FMT
>> +                " GIADDR:"IP_FMT
>> +                " SERVER_ADDR:"IP_FMT,
>> +                dhcp_msg_str_get(*in_dhcp_msg_type),
>> +                ETH_ADDR_BYTES_ARGS(dhcp_data->chaddr), ntohl(dhcp_data->xid),
>> +                IP_ARGS(request_ip), IP_ARGS(dhcp_data->giaddr),
>> +                IP_ARGS(*server_ip));
>> +    queue_msg(swconn, ofputil_encode_resume(pin, continuation, proto));
>> +    if (pkt_out_ptr) {
>> +        dp_packet_uninit(pkt_out_ptr);
>> +    }
>> +}
>> +
>> +/* Called with in the pinctrl_handler thread context. */
>> +static void
>> +pinctrl_handle_dhcp_relay_resp_fwd(
>> +    struct rconn *swconn,
>> +    struct dp_packet *pkt_in, struct ofputil_packet_in *pin,
>> +    struct ofpbuf *userdata,
>> +    struct ofpbuf *continuation)
>> +{
>> +    enum ofp_version version = rconn_get_version(swconn);
>> +    enum ofputil_protocol proto = ofputil_protocol_from_ofp_version(version);
>> +    struct dp_packet *pkt_out_ptr = NULL;
>> +
>> +    /* Parse relay IP and server IP. */
>> +    ovs_be32 *relay_ip = ofpbuf_try_pull(userdata, sizeof *relay_ip);
>> +    ovs_be32 *server_ip = ofpbuf_try_pull(userdata, sizeof *server_ip);
>> +    if (!relay_ip || !server_ip) {
>> +        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
>> +        VLOG_WARN_RL(&rl, "DHCP_RELAY_RESP: relay ip or server ip "
>> +                "not present in the userdata");
>> +        return;
>> +    }
>> +
>> +    /* Validate the DHCP response packet.
>> +     * Format of the DHCP packet is
>> +     * ------------------------------------------------------------------------
>> +     *| UDP HEADER  | DHCP HEADER  | 4 Byte DHCP Cookie | DHCP OPTIONS(var len)|
>> +     * ------------------------------------------------------------------------
>> +     */
>> +
>> +    size_t in_l4_size = dp_packet_l4_size(pkt_in);
>> +    const char *end = (char *) dp_packet_l4(pkt_in) + in_l4_size;
>> +    const char *in_dhcp_ptr = dp_packet_get_udp_payload(pkt_in);
>> +    if (!in_dhcp_ptr) {
>> +        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
>> +        VLOG_WARN_RL(&rl, "DHCP_RELAY_RESP_FWD: invalid or incomplete "
>> +                "packet received");
>> +        return;
>> +    }
>> +
>> +    const struct dhcp_header *in_dhcp_data
>> +        = (const struct dhcp_header *) in_dhcp_ptr;
>> +    in_dhcp_ptr += sizeof *in_dhcp_data;
>> +    if (in_dhcp_ptr > end) {
>> +        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
>> +        VLOG_WARN_RL(&rl, "DHCP_RELAY_RESP_FWD: invalid or incomplete "
>> +                    "packet received, bad data length");
>> +        return;
>> +    }
>> +    if (in_dhcp_data->op != DHCP_OP_REPLY) {
>> +        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
>> +        VLOG_WARN_RL(&rl, "DHCP_RELAY_RESP_FWD: invalid opcode "
>> +                "in the packet: %d", in_dhcp_data->op);
>> +        return;
>> +    }
>> +
>> +    /* DHCP options follow the DHCP header. The first 4 bytes of the DHCP
>> +     * options is the DHCP magic cookie followed by the actual DHCP options.
>> +     */
>> +    ovs_be32 magic_cookie = htonl(DHCP_MAGIC_COOKIE);
>> +    if (in_dhcp_ptr + sizeof magic_cookie > end ||
>> +        get_unaligned_be32((const void *) in_dhcp_ptr) != magic_cookie) {
>> +        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
>> +        VLOG_WARN_RL(&rl, "DHCP_RELAY_RESP_FWD: magic cookie not present "
>> +                "in the packet");
>> +        return;
>> +    }
>> +
>> +    if (!in_dhcp_data->giaddr) {
>> +        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
>> +        VLOG_WARN_RL(&rl, "DHCP_RELAY_RESP_FWD: giaddr is "
>> +                    "not set in request");
>> +        return;
>> +    }
>> +    ovs_be32 giaddr = in_dhcp_data->giaddr;
>> +
>> +    ovs_be32 *server_id_ptr = NULL;
>> +    ovs_be32 lease_time = 0;
>> +    const uint8_t *in_dhcp_msg_type = NULL;
>> +
>> +    in_dhcp_ptr += sizeof magic_cookie;
>> +    while (in_dhcp_ptr < end) {
>> +        const struct dhcp_opt_header *in_dhcp_opt =
>> +            (const struct dhcp_opt_header *) in_dhcp_ptr;
>> +        if (in_dhcp_opt->code == DHCP_OPT_END) {
>> +            break;
>> +        }
>> +        if (in_dhcp_opt->code == DHCP_OPT_PAD) {
>> +            in_dhcp_ptr += 1;
>> +            continue;
>> +        }
>> +        in_dhcp_ptr += sizeof *in_dhcp_opt;
>> +        if (in_dhcp_ptr > end) {
>> +            break;
>> +        }
>> +        in_dhcp_ptr += in_dhcp_opt->len;
>> +        if (in_dhcp_ptr > end) {
>> +            break;
>> +        }
>> +
>> +        switch (in_dhcp_opt->code) {
>> +        case DHCP_OPT_MSG_TYPE:
>> +            if (in_dhcp_opt->len == 1) {
>> +                in_dhcp_msg_type = DHCP_OPT_PAYLOAD(in_dhcp_opt);
>> +            }
>> +            break;
>> +        /* Server Identifier */
>> +        case OVN_DHCP_OPT_CODE_SERVER_ID:
>> +            if (in_dhcp_opt->len == 4) {
>> +                server_id_ptr = DHCP_OPT_PAYLOAD(in_dhcp_opt);
>> +            }
>> +            break;
>> +        case OVN_DHCP_OPT_CODE_LEASE_TIME:
>> +            if (in_dhcp_opt->len == 4) {
>> +                lease_time = get_unaligned_be32(DHCP_OPT_PAYLOAD(in_dhcp_opt));
>> +            }
>> +            break;
>> +        default:
>> +            break;
>> +        }
>> +    }
>> +
>> +    /* Check whether the DHCP Message Type (opt 53) is present or not */
>> +    if (!in_dhcp_msg_type) {
>> +        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
>> +        VLOG_WARN_RL(&rl, "DHCP_RELAY_RESP: missing message type");
>> +        return;
>> +    }
>> +
>> +    if (!server_id_ptr) {
>> +        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
>> +        VLOG_WARN_RL(&rl, "DHCP_RELAY_RESP: missing server identifier");
>> +        return;
>> +    }
>> +
>> +    if (*server_id_ptr != *server_ip) {
>> +        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
>> +        VLOG_WARN_RL(&rl, "DHCP_RELAY_RESP: server identifier mismatch");
>> +        return;
>> +    }
>> +
>> +    if (giaddr != *relay_ip) {
>> +        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
>> +        VLOG_WARN_RL(&rl, "DHCP_RELAY_RESP: giaddr mismatch");
>> +        return;
>> +    }
>> +
>> +
>> +    /* Update destination MAC & IP so that the packet is forwarded to the
>> +     * right destination node.
>> +     */
>> +    uint16_t new_l4_size = in_l4_size;
>> +    size_t new_packet_size = pkt_in->l4_ofs + new_l4_size;
>> +
>> +    struct dp_packet pkt_out;
>> +    dp_packet_init(&pkt_out, new_packet_size);
>> +    dp_packet_clear(&pkt_out);
>> +    dp_packet_prealloc_tailroom(&pkt_out, new_packet_size);
>> +    pkt_out_ptr = &pkt_out;
>> +
>> +    /* Copy the L2 and L3 headers from pkt_in; dst fields are rewritten below. */
>> +    struct eth_header *eth = dp_packet_put(
>> +        &pkt_out, dp_packet_pull(pkt_in, pkt_in->l4_ofs), pkt_in->l4_ofs);
>> +
>> +    pkt_out.l2_5_ofs = pkt_in->l2_5_ofs;
>> +    pkt_out.l2_pad_size = pkt_in->l2_pad_size;
>> +    pkt_out.l3_ofs = pkt_in->l3_ofs;
>> +    pkt_out.l4_ofs = pkt_in->l4_ofs;
>> +
>> +    struct udp_header *udp = dp_packet_put(
>> +        &pkt_out, dp_packet_pull(pkt_in, UDP_HEADER_LEN), UDP_HEADER_LEN);
>> +
>> +    struct dhcp_header *dhcp_data = dp_packet_put(
>> +        &pkt_out, dp_packet_pull(pkt_in, new_l4_size - UDP_HEADER_LEN),
>> +        new_l4_size - UDP_HEADER_LEN);
>> +    memcpy(&eth->eth_dst, dhcp_data->chaddr, sizeof(eth->eth_dst));
>> +
>> +    /* Send a broadcast IP frame when BROADCAST flag is set. */
>> +    struct ip_header *out_ip = dp_packet_l3(&pkt_out);
>> +    ovs_be32 ip_dst;
>> +    ovs_be32 ip_dst_orig = get_16aligned_be32(&out_ip->ip_dst);
>> +    if (!is_dhcp_flags_broadcast(dhcp_data->flags)) {
>> +        ip_dst = dhcp_data->yiaddr;
>> +    } else {
>> +        ip_dst = htonl(0xffffffff);
>> +    }
>> +    put_16aligned_be32(&out_ip->ip_dst, ip_dst);
>> +    out_ip->ip_csum = recalc_csum32(out_ip->ip_csum,
>> +              ip_dst_orig, ip_dst);
>> +    if (udp->udp_csum) {
>> +        udp->udp_csum = recalc_csum32(udp->udp_csum,
>> +            ip_dst_orig, ip_dst);
>> +    }
>> +    /* Reset giaddr */
>> +    dhcp_data->giaddr = htonl(0x0);
>> +    if (udp->udp_csum) {
>> +        udp->udp_csum = recalc_csum32(udp->udp_csum,
>> +            giaddr, 0);
>> +    }
>> +    pin->packet = dp_packet_data(&pkt_out);
>> +    pin->packet_len = dp_packet_size(&pkt_out);
>> +
>> +    /* Log the DHCP message. */
>> +    static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(20, 40);
>> +    const struct eth_header *l2 = dp_packet_eth(&pkt_out);
>> +    VLOG_INFO_RL(&rl, "DHCP_RELAY_RESP_FWD:: MSG_TYPE:%s MAC:"ETH_ADDR_FMT
>> +             " XID:%u"
>> +             " YIADDR:"IP_FMT
>> +             " GIADDR:"IP_FMT
>> +             " SERVER_ADDR:"IP_FMT,
>> +             dhcp_msg_str_get(*in_dhcp_msg_type),
>> +             ETH_ADDR_BYTES_ARGS(dhcp_data->chaddr), ntohl(dhcp_data->xid),
>> +             IP_ARGS(dhcp_data->yiaddr),
>> +             IP_ARGS(giaddr), IP_ARGS(*server_id_ptr));
>> +    queue_msg(swconn, ofputil_encode_resume(pin, continuation, proto));
>> +    if (pkt_out_ptr) {
>> +        dp_packet_uninit(pkt_out_ptr);
>> +    }
>> +}
>> +
>> /* Called with in the pinctrl_handler thread context. */
>> static void
>> pinctrl_handle_put_dhcp_opts(
>> @@ -3203,6 +3634,16 @@ process_packet_in(struct rconn *swconn, const struct ofp_header *msg)
>>         ovs_mutex_unlock(&pinctrl_mutex);
>>         break;
>> 
>> +    case ACTION_OPCODE_DHCP_RELAY_REQ:
>> +        pinctrl_handle_dhcp_relay_req(swconn, &packet, &pin,
>> +                                     &userdata, &continuation);
>> +        break;
>> +
>> +    case ACTION_OPCODE_DHCP_RELAY_RESP_FWD:
>> +        pinctrl_handle_dhcp_relay_resp_fwd(swconn, &packet, &pin,
>> +                                     &userdata, &continuation);
>> +        break;
>> +
>>     case ACTION_OPCODE_PUT_DHCP_OPTS:
>>         pinctrl_handle_put_dhcp_opts(swconn, &packet, &pin, &headers,
>>                                      &userdata, &continuation);
>> diff --git a/include/ovn/actions.h b/include/ovn/actions.h
>> index 49cfe0624..47d41b90f 100644
>> --- a/include/ovn/actions.h
>> +++ b/include/ovn/actions.h
>> @@ -95,6 +95,8 @@ struct collector_set_ids;
>>     OVNACT(LOOKUP_ND_IP,      ovnact_lookup_mac_bind_ip) \
>>     OVNACT(PUT_DHCPV4_OPTS,   ovnact_put_opts)        \
>>     OVNACT(PUT_DHCPV6_OPTS,   ovnact_put_opts)        \
>> +    OVNACT(DHCPV4_RELAY_REQ,  ovnact_dhcp_relay)      \
>> +    OVNACT(DHCPV4_RELAY_RESP_FWD, ovnact_dhcp_relay)      \
>>     OVNACT(SET_QUEUE,         ovnact_set_queue)       \
>>     OVNACT(DNS_LOOKUP,        ovnact_result)          \
>>     OVNACT(LOG,               ovnact_log)             \
>> @@ -387,6 +389,14 @@ struct ovnact_put_opts {
>>     size_t n_options;
>> };
>> 
>> +/* OVNACT_DHCP_RELAY. */
>> +struct ovnact_dhcp_relay {
>> +    struct ovnact ovnact;
>> +    int family;
>> +    ovs_be32 relay_ipv4;
>> +    ovs_be32 server_ipv4;
>> +};
>> +
>> /* Valid arguments to SET_QUEUE action.
>>  *
>>  * QDISC_MIN_QUEUE_ID is the default queue, so user-defined queues should
>> @@ -750,6 +760,22 @@ enum action_opcode {
>> 
>>     /* multicast group split buffer action. */
>>     ACTION_OPCODE_MG_SPLIT_BUF,
>> +
>> +    /* "dhcp_relay_req(relay_ip, server_ip)".
>> +     *
>> +     * Arguments follow the action_header, in this format:
>> +     *   - The 32-bit DHCP relay IP.
>> +     *   - The 32-bit DHCP server IP.
>> +     */
>> +    ACTION_OPCODE_DHCP_RELAY_REQ,
>> +
>> +    /* "dhcp_relay_resp_fwd(relay_ip, server_ip)".
>> +     *
>> +     * Arguments follow the action_header, in this format:
>> +     *   - The 32-bit DHCP relay IP.
>> +     *   - The 32-bit DHCP server IP.
>> +     */
>> +    ACTION_OPCODE_DHCP_RELAY_RESP_FWD,
>> };
>> 
>> /* Header. */
>> diff --git a/lib/actions.c b/lib/actions.c
>> index a73fe1a1e..69df428c6 100644
>> --- a/lib/actions.c
>> +++ b/lib/actions.c
>> @@ -2629,6 +2629,118 @@ ovnact_controller_event_free(struct ovnact_controller_event *event)
>>     free_gen_options(event->options, event->n_options);
>> }
>> 
>> +static void
>> +format_DHCPV4_RELAY_REQ(const struct ovnact_dhcp_relay *dhcp_relay,
>> +                struct ds *s)
>> +{
>> +    ds_put_format(s, "dhcp_relay_req("IP_FMT","IP_FMT");",
>> +                  IP_ARGS(dhcp_relay->relay_ipv4),
>> +                  IP_ARGS(dhcp_relay->server_ipv4));
>> +}
>> +
>> +static void
>> +parse_dhcp_relay_req(struct action_context *ctx,
>> +               struct ovnact_dhcp_relay *dhcp_relay)
>> +{
>> +    /* Skip dhcp_relay_req( */
>> +    lexer_force_match(ctx->lexer, LEX_T_LPAREN);
>> +
>> +    /* Parse relay ip and server ip. */
>> +    if (ctx->lexer->token.format == LEX_F_IPV4) {
>> +        dhcp_relay->family = AF_INET;
>> +        dhcp_relay->relay_ipv4 = ctx->lexer->token.value.ipv4;
>> +        lexer_get(ctx->lexer);
>> +        lexer_match(ctx->lexer, LEX_T_COMMA);
>> +        if (ctx->lexer->token.format == LEX_F_IPV4) {
>> +            dhcp_relay->family = AF_INET;
>> +            dhcp_relay->server_ipv4 = ctx->lexer->token.value.ipv4;
>> +            lexer_get(ctx->lexer);
>> +        } else {
>> +            lexer_syntax_error(ctx->lexer, "expecting IPv4 dhcp server ip");
>> +            return;
>> +        }
>> +    } else {
>> +        lexer_syntax_error(ctx->lexer, "expecting IPv4 dhcp relay "
>> +                           "and server ips");
>> +        return;
>> +    }
>> +    lexer_force_match(ctx->lexer, LEX_T_RPAREN);
>> +}
>> +
>> +static void
>> +encode_DHCPV4_RELAY_REQ(const struct ovnact_dhcp_relay *dhcp_relay,
>> +                    const struct ovnact_encode_params *ep,
>> +                    struct ofpbuf *ofpacts)
>> +{
>> +    size_t oc_offset = encode_start_controller_op(ACTION_OPCODE_DHCP_RELAY_REQ,
>> +                                                  true, ep->ctrl_meter_id,
>> +                                                  ofpacts);
>> +    ofpbuf_put(ofpacts, &dhcp_relay->relay_ipv4,
>> +            sizeof(dhcp_relay->relay_ipv4));
>> +    ofpbuf_put(ofpacts, &dhcp_relay->server_ipv4,
>> +            sizeof(dhcp_relay->server_ipv4));
>> +    encode_finish_controller_op(oc_offset, ofpacts);
>> +}
>> +
>> +static void
>> +format_DHCPV4_RELAY_RESP_FWD(const struct ovnact_dhcp_relay *dhcp_relay,
>> +                    struct ds *s)
>> +{
>> +    ds_put_format(s, "dhcp_relay_resp_fwd("IP_FMT","IP_FMT");",
>> +                  IP_ARGS(dhcp_relay->relay_ipv4),
>> +                  IP_ARGS(dhcp_relay->server_ipv4));
>> +}
>> +
>> +static void
>> +parse_dhcp_relay_resp_fwd(struct action_context *ctx,
>> +               struct ovnact_dhcp_relay *dhcp_relay)
>> +{
>> +    /* Skip dhcp_relay_resp_fwd( */
>> +    lexer_force_match(ctx->lexer, LEX_T_LPAREN);
>> +
>> +    /* Parse relay ip and server ip. */
>> +    if (ctx->lexer->token.format == LEX_F_IPV4) {
>> +        dhcp_relay->family = AF_INET;
>> +        dhcp_relay->relay_ipv4 = ctx->lexer->token.value.ipv4;
>> +        lexer_get(ctx->lexer);
>> +        lexer_match(ctx->lexer, LEX_T_COMMA);
>> +        if (ctx->lexer->token.format == LEX_F_IPV4) {
>> +            dhcp_relay->family = AF_INET;
>> +            dhcp_relay->server_ipv4 = ctx->lexer->token.value.ipv4;
>> +            lexer_get(ctx->lexer);
>> +        } else {
>> +            lexer_syntax_error(ctx->lexer, "expecting IPv4 dhcp server ip");
>> +            return;
>> +        }
>> +    } else {
>> +        lexer_syntax_error(ctx->lexer, "expecting IPv4 dhcp relay and "
>> +                           "server ips");
>> +        return;
>> +    }
>> +    lexer_force_match(ctx->lexer, LEX_T_RPAREN);
>> +}
>> +
>> +static void
>> +encode_DHCPV4_RELAY_RESP_FWD(const struct ovnact_dhcp_relay *dhcp_relay,
>> +                    const struct ovnact_encode_params *ep,
>> +                    struct ofpbuf *ofpacts)
>> +{
>> +    size_t oc_offset = encode_start_controller_op(
>> +                                ACTION_OPCODE_DHCP_RELAY_RESP_FWD,
>> +                                true, ep->ctrl_meter_id,
>> +                                ofpacts);
>> +    ofpbuf_put(ofpacts, &dhcp_relay->relay_ipv4,
>> +                  sizeof(dhcp_relay->relay_ipv4));
>> +    ofpbuf_put(ofpacts, &dhcp_relay->server_ipv4,
>> +                  sizeof(dhcp_relay->server_ipv4));
>> +    encode_finish_controller_op(oc_offset, ofpacts);
>> +}
>> +
>> +static void ovnact_dhcp_relay_free(
>> +          struct ovnact_dhcp_relay *dhcp_relay OVS_UNUSED)
>> +{
>> +}
>> +
>> static void
>> parse_put_opts(struct action_context *ctx, const struct expr_field *dst,
>>                struct ovnact_put_opts *po, const struct hmap *gen_opts,
>> @@ -5451,6 +5563,11 @@ parse_action(struct action_context *ctx)
>>         parse_sample(ctx);
>>     } else if (lexer_match_id(ctx->lexer, "mac_cache_use")) {
>>         ovnact_put_MAC_CACHE_USE(ctx->ovnacts);
>> +    } else if (lexer_match_id(ctx->lexer, "dhcp_relay_req")) {
>> +        parse_dhcp_relay_req(ctx, ovnact_put_DHCPV4_RELAY_REQ(ctx->ovnacts));
>> +    } else if (lexer_match_id(ctx->lexer, "dhcp_relay_resp_fwd")) {
>> +        parse_dhcp_relay_resp_fwd(ctx,
>> +              ovnact_put_DHCPV4_RELAY_RESP_FWD(ctx->ovnacts));
>>     } else {
>>         lexer_syntax_error(ctx->lexer, "expecting action");
>>     }
>> diff --git a/lib/ovn-l7.h b/lib/ovn-l7.h
>> index ad514a922..e08581123 100644
>> --- a/lib/ovn-l7.h
>> +++ b/lib/ovn-l7.h
>> @@ -69,6 +69,7 @@ struct gen_opts_map {
>>  */
>> #define OVN_DHCP_OPT_CODE_NETMASK      1
>> #define OVN_DHCP_OPT_CODE_LEASE_TIME   51
>> +#define OVN_DHCP_OPT_CODE_SERVER_ID    54
>> #define OVN_DHCP_OPT_CODE_T1           58
>> #define OVN_DHCP_OPT_CODE_T2           59
>> 
>> diff --git a/northd/northd.c b/northd/northd.c
>> index 07dffb15a..7ac831fae 100644
>> --- a/northd/northd.c
>> +++ b/northd/northd.c
>> @@ -181,11 +181,13 @@ enum ovn_stage {
>>     PIPELINE_STAGE(ROUTER, IN,  IP_ROUTING_ECMP, 14, "lr_in_ip_routing_ecmp") \
>>     PIPELINE_STAGE(ROUTER, IN,  POLICY,          15, "lr_in_policy")          \
>>     PIPELINE_STAGE(ROUTER, IN,  POLICY_ECMP,     16, "lr_in_policy_ecmp")     \
>> -    PIPELINE_STAGE(ROUTER, IN,  ARP_RESOLVE,     17, "lr_in_arp_resolve")     \
>> -    PIPELINE_STAGE(ROUTER, IN,  CHK_PKT_LEN,     18, "lr_in_chk_pkt_len")     \
>> -    PIPELINE_STAGE(ROUTER, IN,  LARGER_PKTS,     19, "lr_in_larger_pkts")     \
>> -    PIPELINE_STAGE(ROUTER, IN,  GW_REDIRECT,     20, "lr_in_gw_redirect")     \
>> -    PIPELINE_STAGE(ROUTER, IN,  ARP_REQUEST,     21, "lr_in_arp_request")     \
>> +    PIPELINE_STAGE(ROUTER, IN,  DHCP_RELAY_RESP_FWD, 17,                      \
>> +                  "lr_in_dhcp_relay_resp_fwd")                                \
>> +    PIPELINE_STAGE(ROUTER, IN,  ARP_RESOLVE,     18, "lr_in_arp_resolve")     \
>> +    PIPELINE_STAGE(ROUTER, IN,  CHK_PKT_LEN,     19, "lr_in_chk_pkt_len")     \
>> +    PIPELINE_STAGE(ROUTER, IN,  LARGER_PKTS,     20, "lr_in_larger_pkts")     \
>> +    PIPELINE_STAGE(ROUTER, IN,  GW_REDIRECT,     21, "lr_in_gw_redirect")     \
>> +    PIPELINE_STAGE(ROUTER, IN,  ARP_REQUEST,     22, "lr_in_arp_request")     \
>>                                                                       \
>>     /* Logical router egress stages. */                               \
>>     PIPELINE_STAGE(ROUTER, OUT, CHECK_DNAT_LOCAL,   0,                       \
>> @@ -9610,6 +9612,80 @@ build_dhcpv6_options_flows(struct ovn_port *op,
>>     ds_destroy(&match);
>> }
>> 
>> +static void
>> +build_lswitch_dhcp_relay_flows(struct ovn_port *op,
>> +                           const struct hmap *lr_ports,
>> +                           const struct hmap *lflows,
>> +                           const struct shash *meter_groups OVS_UNUSED)
>> +{
>> +    if (op->nbrp || !op->nbsp) {
>> +        return;
>> +    }
>> +    /* consider only ports attached to VMs */
>> +    if (strcmp(op->nbsp->type, "")) {
>> +        return;
>> +    }
>> +
>> +    if (!op->od || !op->od->n_router_ports ||
>> +        !op->od->nbs || !op->od->nbs->dhcp_relay_port) {
>> +        return;
>> +    }
>> +
>> +    struct ds match = DS_EMPTY_INITIALIZER;
>> +    struct ds action = DS_EMPTY_INITIALIZER;
>> +    struct nbrec_logical_router_port *lrp = op->od->nbs->dhcp_relay_port;
>> +    struct ovn_port *rp = ovn_port_find(lr_ports, lrp->name);
>> +
>> +    if (!rp || !rp->nbrp || !rp->nbrp->dhcp_relay) {
>> +        return;
>> +    }
>> +
>> +    struct ovn_port *sp = NULL;
>> +    struct nbrec_dhcp_relay *dhcp_relay = rp->nbrp->dhcp_relay;
>> +
>> +    for (int i = 0; i < op->od->n_router_ports; i++) {
>> +        struct ovn_port *sp_tmp = op->od->router_ports[i];
>> +        if (sp_tmp->peer == rp) {
>> +            sp = sp_tmp;
>> +            break;
>> +        }
>> +    }
>> +    if (!sp) {
>> +        return;
>> +    }
>> +
>> +    char *server_ip_str = NULL;
>> +    uint16_t port;
>> +    int addr_family;
>> +    struct in6_addr server_ip;
>> +
>> +    if (!ip_address_and_port_from_lb_key(dhcp_relay->servers, &server_ip_str,
>> +                                         &server_ip, &port, &addr_family)) {
>> +        return;
>> +    }
>> +
>> +    if (server_ip_str == NULL) {
>> +        return;
>> +    }
>> +
>> +    ds_put_format(
>> +        &match, "inport == %s && eth.src == %s && "
>> +        "ip4.src == 0.0.0.0 && ip4.dst == 255.255.255.255 && "
>> +        "udp.src == 68 && udp.dst == 67",
>> +        op->json_key, op->lsp_addrs[0].ea_s);
>> +    ds_put_format(&action,
>> +                  "eth.dst=%s;outport=%s;next;/* DHCP_RELAY_REQ */",
>> +                  rp->lrp_networks.ea_s,sp->json_key);
>> +    ovn_lflow_add_with_hint__(lflows, op->od,
>> +                              S_SWITCH_IN_L2_LKUP, 100,
>> +                              ds_cstr(&match),
>> +                              ds_cstr(&action),
>> +                              op->key,
>> +                              NULL,
>> +                              &lrp->header_);
>> +    free(server_ip_str);
>> +}
>> +
>> static void
>> build_drop_arp_nd_flows_for_unbound_router_ports(struct ovn_port *op,
>>                                                  const struct ovn_port *port,
>> @@ -10181,6 +10257,13 @@ build_lswitch_dhcp_options_and_response(struct ovn_port *op,
>>         return;
>>     }
>> 
>> +    if (op->od && op->od->nbs
>> +        && op->od->nbs->dhcp_relay_port) {
>> +        /* Don't add the DHCP server flows if DHCP Relay is enabled on the
>> +         * logical switch. */
>> +        return;
>> +    }
>> +
>>     bool is_external = lsp_is_external(op->nbsp);
>>     if (is_external && (!op->od->n_localnet_ports ||
>>                         !op->nbsp->ha_chassis_group)) {
>> @@ -14458,6 +14541,86 @@ build_dhcpv6_reply_flows_for_lrouter_port(
>>     }
>> }
>> 
>> +static void
>> +build_dhcp_relay_flows_for_lrouter_port(
>> +        struct ovn_port *op, struct hmap *lflows,
>> +        struct ds *match)
>> +{
>> +    if (!op->nbrp || !op->nbrp->dhcp_relay) {
>> +        return;
>> +    }
>> +    struct nbrec_dhcp_relay *dhcp_relay = op->nbrp->dhcp_relay;
>> +    if (!dhcp_relay->servers) {
>> +        return;
>> +    }
>> +
>> +    int addr_family;
>> +    /* currently not supporting custom port */
>> +    uint16_t port;
>> +    char *server_ip_str = NULL;
>> +    struct in6_addr server_ip;
>> +
>> +    if (!ip_address_and_port_from_lb_key(dhcp_relay->servers, &server_ip_str,
>> +                                         &server_ip, &port, &addr_family)) {
>> +        return;
>> +    }
>> +
>> +    if (server_ip_str == NULL) {
>> +        return;
>> +    }
>> +
>> +    struct ds dhcp_action = DS_EMPTY_INITIALIZER;
>> +    ds_clear(match);
>> +    ds_put_format(
>> +        match, "inport == %s && "
>> +        "ip4.src == 0.0.0.0 && ip4.dst == 255.255.255.255 && "
>> +        "udp.src == 68 && udp.dst == 67",
>> +        op->json_key);
>> +    ds_put_format(&dhcp_action,
>> +                "dhcp_relay_req(%s,%s);"
>> +                "ip4.src=%s;ip4.dst=%s;udp.src=67;next; /* DHCP_RELAY_REQ */",
>> +                op->lrp_networks.ipv4_addrs[0].addr_s, server_ip_str,
>> +                op->lrp_networks.ipv4_addrs[0].addr_s, server_ip_str);
>> +
>> +    ovn_lflow_add_with_hint(lflows, op->od, S_ROUTER_IN_IP_INPUT, 110,
>> +                            ds_cstr(match), ds_cstr(&dhcp_action),
>> +                            &op->nbrp->header_);
>> +
>> +    ds_clear(match);
>> +    ds_clear(&dhcp_action);
>> +
>> +    ds_put_format(
>> +        match, "ip4.src == %s && ip4.dst == %s && "
>> +        "udp.src == 67 && udp.dst == 67",
>> +        server_ip_str, op->lrp_networks.ipv4_addrs[0].addr_s);
>> +    ds_put_format(&dhcp_action, "next;/* DHCP_RELAY_RESP */");
>> +    ovn_lflow_add_with_hint(lflows, op->od, S_ROUTER_IN_IP_INPUT, 110,
>> +                            ds_cstr(match), ds_cstr(&dhcp_action),
>> +                            &op->nbrp->header_);
>> +
>> +    ds_clear(match);
>> +    ds_clear(&dhcp_action);
>> +
>> +    ds_put_format(
>> +        match, "ip4.src == %s && ip4.dst == %s && "
>> +        "udp.src == 67 && udp.dst == 67",
>> +        server_ip_str, op->lrp_networks.ipv4_addrs[0].addr_s);
>> +    ds_put_format(&dhcp_action,
>> +          "dhcp_relay_resp_fwd(%s,%s);ip4.src=%s;udp.dst=68;"
>> +          "outport=%s;output; /* DHCP_RELAY_RESP */",
>> +          op->lrp_networks.ipv4_addrs[0].addr_s, server_ip_str,
>> +          op->lrp_networks.ipv4_addrs[0].addr_s, op->json_key);
>> +    ovn_lflow_add_with_hint(lflows, op->od, S_ROUTER_IN_DHCP_RELAY_RESP_FWD,
>> +                            110,
>> +                            ds_cstr(match), ds_cstr(&dhcp_action),
>> +                            &op->nbrp->header_);
>> +
>> +    ds_clear(match);
>> +    ds_clear(&dhcp_action);
>> +
>> +    free(server_ip_str);
>> +}
>> +
>> static void
>> build_ipv6_input_flows_for_lrouter_port(
>>         struct ovn_port *op, struct hmap *lflows,
>> @@ -15673,6 +15836,8 @@ build_lrouter_nat_defrag_and_lb(struct ovn_datapath *od, struct hmap *lflows,
>>     ovn_lflow_add(lflows, od, S_ROUTER_OUT_POST_SNAT, 0, "1", "next;");
>>     ovn_lflow_add(lflows, od, S_ROUTER_OUT_EGR_LOOP, 0, "1", "next;");
>>     ovn_lflow_add(lflows, od, S_ROUTER_IN_ECMP_STATEFUL, 0, "1", "next;");
>> +    ovn_lflow_add(lflows, od, S_ROUTER_IN_DHCP_RELAY_RESP_FWD, 0, "1",
>> +                  "next;");
>> 
>>     const char *ct_flag_reg = features->ct_no_masked_label
>>                               ? "ct_mark"
>> @@ -16154,6 +16319,7 @@ build_lswitch_and_lrouter_iterate_by_lsp(struct ovn_port *op,
>>     build_lswitch_dhcp_options_and_response(op, lflows, meter_groups);
>>     build_lswitch_external_port(op, lflows);
>>     build_lswitch_ip_unicast_lookup(op, lflows, actions, match);
>> +    build_lswitch_dhcp_relay_flows(op, lr_ports, lflows, meter_groups);
>> 
>>     /* Build Logical Router Flows. */
>>     build_ip_routing_flows_for_router_type_lsp(op, lr_ports, lflows);
>> @@ -16183,6 +16349,7 @@ build_lswitch_and_lrouter_iterate_by_lrp(struct ovn_port *op,
>>     build_egress_delivery_flows_for_lrouter_port(op, lsi->lflows, &lsi->match,
>>                                                  &lsi->actions);
>>     build_dhcpv6_reply_flows_for_lrouter_port(op, lsi->lflows, &lsi->match);
>> +    build_dhcp_relay_flows_for_lrouter_port(op, lsi->lflows, &lsi->match);
>>     build_ipv6_input_flows_for_lrouter_port(op, lsi->lflows,
>>                                             &lsi->match, &lsi->actions,
>>                                             lsi->meter_groups);
>> diff --git a/ovn-nb.ovsschema b/ovn-nb.ovsschema
>> index b2e0993e0..6863d52cd 100644
>> --- a/ovn-nb.ovsschema
>> +++ b/ovn-nb.ovsschema
>> @@ -1,7 +1,7 @@
>> {
>>     "name": "OVN_Northbound",
>> -    "version": "7.2.0",
>> -    "cksum": "1069338687 34162",
>> +    "version": "7.3.0",
>> +    "cksum": "2325497400 35185",
>>     "tables": {
>>         "NB_Global": {
>>             "columns": {
>> @@ -89,7 +89,12 @@
>>                     "type": {"key": {"type": "uuid",
>>                                      "refTable": "Forwarding_Group",
>>                                      "refType": "strong"},
>> -                                     "min": 0, "max": "unlimited"}}},
>> +                                     "min": 0, "max": "unlimited"}},
>> +                "dhcp_relay_port": {"type": {"key": {"type": "uuid",
>> +                                            "refTable": "Logical_Router_Port",
>> +                                            "refType": "weak"},
>> +                                            "min": 0,
>> +                                            "max": 1}}},
>>             "isRoot": true},
>>         "Logical_Switch_Port": {
>>             "columns": {
>> @@ -436,6 +441,11 @@
>>                 "ipv6_prefix": {"type": {"key": "string",
>>                                       "min": 0,
>>                                       "max": "unlimited"}},
>> +                "dhcp_relay": {"type": {"key": {"type": "uuid",
>> +                                            "refTable": "DHCP_Relay",
>> +                                            "refType": "weak"},
>> +                                            "min": 0,
>> +                                            "max": 1}},
>>                 "external_ids": {
>>                     "type": {"key": "string", "value": "string",
>>                              "min": 0, "max": "unlimited"}},
>> @@ -529,6 +539,15 @@
>>                     "type": {"key": "string", "value": "string",
>>                              "min": 0, "max": "unlimited"}}},
>>             "isRoot": true},
>> +        "DHCP_Relay": {
>> +            "columns": {
>> +                "servers": {"type": {"key": "string",
>> +                                       "min": 0,
>> +                                       "max": 1}},
>> +                "external_ids": {
>> +                    "type": {"key": "string", "value": "string",
>> +                             "min": 0, "max": "unlimited"}}},
>> +            "isRoot": true},
>>         "Connection": {
>>             "columns": {
>>                 "target": {"type": "string"},
>> diff --git a/ovn-nb.xml b/ovn-nb.xml
>> index fcb1c6ecc..dc20892e1 100644
>> --- a/ovn-nb.xml
>> +++ b/ovn-nb.xml
>> @@ -608,6 +608,11 @@
>>       Please see the <ref table="DNS"/> table.
>>     </column>
>> 
>> +    <column name="dhcp_relay_port">
>> +      This column defines the <ref table="Logical_Router_Port"/> on which
>> +      DHCP relay is enabled.
>> +    </column>
>> +
>>     <column name="forwarding_groups">
>>       Groups a set of logical port endpoints for traffic going out of the
>>       logical switch.
>> @@ -2980,6 +2985,11 @@ or
>>       port has all ingress and egress traffic dropped.
>>     </column>
>> 
>> +    <column name="dhcp_relay">
>> +      This column is used to enable DHCP Relay. Please refer
>> +      to the <ref table="DHCP_Relay"/> table.
>> +    </column>
>> +
>>     <group title="Distributed Gateway Ports">
>>       <p>
>>         Gateways, as documented under <code>Gateways</code> in the OVN
>> @@ -4286,6 +4296,24 @@ or
>>     </group>
>>   </table>
>> 
>> +  <table name="DHCP_Relay" title="DHCP Relay">
>> +    <p>
>> +      OVN implements native DHCPv4 relay support which caters to the common
>> +      use case of relaying the DHCP requests to external DHCP server.
>> +    </p>
>> +
>> +    <column name="servers">
>> +      <p>
>> +        The DHCPv4 server IP address.
>> +      </p>
>> +    </column>
>> +    <group title="Common Columns">
>> +      <column name="external_ids">
>> +        See <em>External IDs</em> at the beginning of this document.
>> +      </column>
>> +    </group>
>> +  </table>
>> +
>>   <table name="Connection" title="OVSDB client connections.">
>>     <p>
>>       Configuration for a database connection to an Open vSwitch database
>> diff --git a/tests/atlocal.in b/tests/atlocal.in
>> index 63d891b89..32d1c374e 100644
>> --- a/tests/atlocal.in
>> +++ b/tests/atlocal.in
>> @@ -187,6 +187,9 @@ fi
>> # Set HAVE_DHCPD
>> find_command dhcpd
>> 
>> +# Set HAVE_DHCLIENT
>> +find_command dhclient
>> +
>> # Set HAVE_BFDD_BEACON
>> find_command bfdd-beacon
>> 
>> diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
>> index 19e4f1263..4d8c9ff26 100644
>> --- a/tests/ovn-northd.at
>> +++ b/tests/ovn-northd.at
>> @@ -8786,9 +8786,9 @@ ovn-nbctl --wait=sb set logical_router_port R1-PUB options:redirect-type=bridged
>> ovn-sbctl dump-flows R1 > R1flows
>> AT_CAPTURE_FILE([R1flows])
>> 
>> -AT_CHECK([grep "lr_in_arp_resolve" R1flows | grep priority=90 | sort], [0], [dnl
>> -  table=17(lr_in_arp_resolve  ), priority=90   , match=(outport == "R1-PUB" && ip4.src == 10.0.0.3 && is_chassis_resident("S0-P0")), action=(get_arp(outport, reg0); next;)
>> -  table=17(lr_in_arp_resolve  ), priority=90   , match=(outport == "R1-PUB" && ip6.src == 1000::3 && is_chassis_resident("S0-P0")), action=(get_nd(outport, xxreg0); next;)
>> +AT_CHECK([grep "lr_in_arp_resolve" R1flows | grep priority=90 | sed 's/table=../table=??/' | sort], [0], [dnl
>> +  table=??(lr_in_arp_resolve  ), priority=90   , match=(outport == "R1-PUB" && ip4.src == 10.0.0.3 && is_chassis_resident("S0-P0")), action=(get_arp(outport, reg0); next;)
>> +  table=??(lr_in_arp_resolve  ), priority=90   , match=(outport == "R1-PUB" && ip6.src == 1000::3 && is_chassis_resident("S0-P0")), action=(get_nd(outport, xxreg0); next;)
>> ])
>> 
>> AT_CLEANUP
>> @@ -10966,3 +10966,38 @@ Status: active
>> 
>> AT_CLEANUP
>> ])
>> +
>> +OVN_FOR_EACH_NORTHD_NO_HV([
>> +AT_SETUP([check DHCP RELAY AGENT])
>> +ovn_start NORTHD_TYPE
>> +
>> +check ovn-nbctl ls-add ls0
>> +check ovn-nbctl lsp-add ls0 ls0-port1
>> +check ovn-nbctl lsp-set-addresses ls0-port1 02:00:00:00:00:10
>> +check ovn-nbctl lr-add lr0
>> +check ovn-nbctl lrp-add lr0 lrp1 02:00:00:00:00:01 192.168.1.1/24
>> +check ovn-nbctl lsp-add ls0 lrp1-attachment
>> +check ovn-nbctl lsp-set-type lrp1-attachment router
>> +check ovn-nbctl lsp-set-addresses lrp1-attachment 00:00:00:00:ff:02
>> +check ovn-nbctl lsp-set-options lrp1-attachment router-port=lrp1
>> +check ovn-nbctl lrp-add lr0 lrp-ext 02:00:00:00:00:02 192.168.2.1/24
>> +
>> +dhcp_relay=$(ovn-nbctl create DHCP_Relay servers=172.16.1.1)
>> +check ovn-nbctl set Logical_Router_port lrp1 dhcp_relay=$dhcp_relay
>> +rp_uuid=$(ovn-nbctl --bare --colum=_uuid list logical_router_port lrp1)
>> +check ovn-nbctl set Logical_Switch ls0 dhcp_relay_port=$rp_uuid
>> +
>> +check ovn-nbctl --wait=sb sync
>> +
>> +ovn-sbctl lflow-list > lflows
>> +AT_CAPTURE_FILE([lflows])
>> +
>> +AT_CHECK([grep -e "DHCP_RELAY_" lflows | sed 's/table=../table=??/'], [0], [dnl
>> +  table=??(lr_in_ip_input     ), priority=110  , match=(inport == "lrp1" && ip4.src == 0.0.0.0 && ip4.dst == 255.255.255.255 && udp.src == 68 && udp.dst == 67), action=(dhcp_relay_req(192.168.1.1,172.16.1.1);ip4.src=192.168.1.1;ip4.dst=172.16.1.1;udp.src=67;next; /* DHCP_RELAY_REQ */)
>> +  table=??(lr_in_ip_input     ), priority=110  , match=(ip4.src == 172.16.1.1 && ip4.dst == 192.168.1.1 && udp.src == 67 && udp.dst == 67), action=(next;/* DHCP_RELAY_RESP */)
>> +  table=??(lr_in_dhcp_relay_resp_fwd), priority=110  , match=(ip4.src == 172.16.1.1 && ip4.dst == 192.168.1.1 && udp.src == 67 && udp.dst == 67), action=(dhcp_relay_resp_fwd(192.168.1.1,172.16.1.1);ip4.src=192.168.1.1;udp.dst=68;outport="lrp1";output; /* DHCP_RELAY_RESP */)
>> +  table=??(ls_in_l2_lkup      ), priority=100  , match=(inport == "ls0-port1" && eth.src == 02:00:00:00:00:10 && ip4.src == 0.0.0.0 && ip4.dst == 255.255.255.255 && udp.src == 68 && udp.dst == 67), action=(eth.dst=02:00:00:00:00:01;outport="lrp1-attachment";next;/* DHCP_RELAY_REQ */)
>> +])
>> +
>> +AT_CLEANUP
>> +])
>> diff --git a/tests/ovn.at b/tests/ovn.at
>> index e8c79512b..839c07ce2 100644
>> --- a/tests/ovn.at
>> +++ b/tests/ovn.at
>> @@ -21905,7 +21905,7 @@ eth_dst=00000000ff01
>> ip_src=$(ip_to_hex 10 0 0 10)
>> ip_dst=$(ip_to_hex 172 168 0 101)
>> send_icmp_packet 1 1 $eth_src $eth_dst $ip_src $ip_dst c4c9 0000000000000000000000
>> -AT_CHECK_UNQUOTED([as hv1 ovs-ofctl dump-flows br-int metadata=0x$lr0_dp_key | awk '/table=28, n_packets=1, n_bytes=45/{print $7" "$8}'],[0],[dnl
>> +AT_CHECK_UNQUOTED([as hv1 ovs-ofctl dump-flows br-int metadata=0x$lr0_dp_key | awk '/table=29, n_packets=1, n_bytes=45/{print $7" "$8}'],[0],[dnl
>> priority=80,ip,reg15=0x$lr0_public_dp_key,metadata=0x$lr0_dp_key,nw_src=10.0.0.10 actions=drop
>> ])
>> 
>> @@ -28964,7 +28964,7 @@ AT_CHECK([
>>         grep "priority=100" | \
>>         grep -c "ct(commit,zone=NXM_NX_REG11\\[[0..15\\]],.*exec(move:NXM_OF_ETH_SRC\\[[\\]]->NXM_NX_CT_LABEL\\[[32..79\\]],load:0x[[0-9]]->NXM_NX_CT_MARK\\[[16..31\\]]))"
>> 
>> -        grep table=25 hv${hv}flows | \
>> +        grep table=26 hv${hv}flows | \
>>         grep "priority=200" | \
>>         grep -c "move:NXM_NX_CT_LABEL\\[[\\]]->NXM_NX_XXREG1\\[[\\]],move:NXM_NX_XXREG1\\[[32..79\\]]->NXM_OF_ETH_DST"
>>     done; :], [0], [dnl
>> @@ -29089,7 +29089,7 @@ AT_CHECK([
>>         grep "priority=100" | \
>>         grep -c "ct(commit,zone=NXM_NX_REG11\\[[0..15\\]],.*exec(move:NXM_OF_ETH_SRC\\[[\\]]->NXM_NX_CT_LABEL\\[[32..79\\]],load:0x[[0-9]]->NXM_NX_CT_MARK\\[[16..31\\]]))"
>> 
>> -        grep table=25 hv${hv}flows | \
>> +        grep table=26 hv${hv}flows | \
>>         grep "priority=200" | \
>>         grep -c "move:NXM_NX_CT_LABEL\\[[\\]]->NXM_NX_XXREG1\\[[\\]],move:NXM_NX_XXREG1\\[[32..79\\]]->NXM_OF_ETH_DST"
>>     done; :], [0], [dnl
>> @@ -29586,7 +29586,7 @@ if test X"$1" = X"DGP"; then
>> else
>>     prio=2
>> fi
>> -AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep -E "table=25, n_packets=1,.* priority=$prio,ip,$inport.*$outport.*metadata=0x${sw_key},nw_dst=10.0.1.1 actions=drop" -c], [0], [dnl
>> +AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep -E "table=26, n_packets=1,.* priority=$prio,ip,$inport.*$outport.*metadata=0x${sw_key},nw_dst=10.0.1.1 actions=drop" -c], [0], [dnl
>> 1
>> ])
>> 
>> @@ -29605,13 +29605,13 @@ AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep "actions=controller" | grep
>> 
>> if test X"$1" = X"DGP"; then
>>     # The packet dst should be resolved once for E/W centralized NAT purpose.
>> -    AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep -E "table=25, n_packets=1,.* priority=100,reg0=0xa000101,reg15=.*metadata=0x${sw_key} actions=mod_dl_dst:00:00:00:00:01:01,resubmit" -c], [0], [dnl
>> +    AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep -E "table=26, n_packets=1,.* priority=100,reg0=0xa000101,reg15=.*metadata=0x${sw_key} actions=mod_dl_dst:00:00:00:00:01:01,resubmit" -c], [0], [dnl
>> 1
>> ])
>> fi
>> 
>> # The packet should've been finally dropped in the lr_in_arp_resolve stage.
>> -AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep -E "table=25, n_packets=2,.* priority=$prio,ip,$inport.*$outport.*metadata=0x${sw_key},nw_dst=10.0.1.1 actions=drop" -c], [0], [dnl
>> +AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep -E "table=26, n_packets=2,.* priority=$prio,ip,$inport.*$outport.*metadata=0x${sw_key},nw_dst=10.0.1.1 actions=drop" -c], [0], [dnl
>> 1
>> ])
>> OVN_CLEANUP([hv1])
>> diff --git a/tests/system-ovn.at b/tests/system-ovn.at
>> index 7b9daba0d..591933a95 100644
>> --- a/tests/system-ovn.at
>> +++ b/tests/system-ovn.at
>> @@ -12032,3 +12032,153 @@ as
>> OVS_TRAFFIC_VSWITCHD_STOP(["/failed to query port patch-.*/d
>> /connection dropped.*/d"])
>> AT_CLEANUP
>> +
>> +OVN_FOR_EACH_NORTHD([
>> +AT_SETUP([DHCP RELAY AGENT])
>> +AT_SKIP_IF([test $HAVE_DHCPD = no])
>> +AT_SKIP_IF([test $HAVE_DHCLIENT = no])
>> +AT_SKIP_IF([test $HAVE_TCPDUMP = no])
>> +ovn_start
>> +OVS_TRAFFIC_VSWITCHD_START()
>> +
>> +ADD_BR([br-int])
>> +ADD_BR([br-ext])
>> +
>> +ovs-ofctl add-flow br-ext action=normal
>> +# Set external-ids in br-int needed for ovn-controller
>> +ovs-vsctl \
>> +        -- set Open_vSwitch . external-ids:system-id=hv1 \
>> +        -- set Open_vSwitch . external-ids:ovn-remote=unix:$ovs_base/ovn-sb/ovn-sb.sock \
>> +        -- set Open_vSwitch . external-ids:ovn-encap-type=geneve \
>> +        -- set Open_vSwitch . external-ids:ovn-encap-ip=169.0.0.1 \
>> +        -- set bridge br-int fail-mode=secure other-config:disable-in-band=true
>> +
>> +# Start ovn-controller
>> +start_daemon ovn-controller
>> +
>> +ADD_NAMESPACES(sw01)
>> +ADD_VETH(sw01, sw01, br-int, "0", "f0:00:00:01:02:03")
>> +ADD_NAMESPACES(sw11)
>> +ADD_VETH(sw11, sw11, br-int, "0", "f0:00:00:02:02:03")
>> +ADD_NAMESPACES(server)
>> +ADD_VETH(s1, server, br-ext, "172.16.1.1/24", "f0:00:00:01:02:05", \
>> +         "172.16.1.254")
>> +
>> +check ovn-nbctl lr-add R1
>> +
>> +check ovn-nbctl ls-add sw0
>> +check ovn-nbctl ls-add sw1
>> +check ovn-nbctl ls-add sw-ext
>> +
>> +check ovn-nbctl lrp-add R1 rp-sw0 00:00:01:01:02:03 192.168.1.1/24
>> +check ovn-nbctl lrp-add R1 rp-sw1 00:00:03:01:02:03 192.168.2.1/24
>> +check ovn-nbctl lrp-add R1 rp-ext 00:00:02:01:02:03 172.16.1.254/24
>> +
>> +dhcp_relay=$(ovn-nbctl create DHCP_Relay servers=172.16.1.1)
>> +check ovn-nbctl set Logical_Router_port rp-sw0 dhcp_relay=$dhcp_relay
>> +check ovn-nbctl set Logical_Router_port rp-sw1 dhcp_relay=$dhcp_relay
>> +check ovn-nbctl lrp-set-gateway-chassis rp-ext hv1
>> +
>> +check ovn-nbctl lsp-add sw0 sw0-rp -- set Logical_Switch_Port sw0-rp \
>> +    type=router options:router-port=rp-sw0 \
>> +    -- lsp-set-addresses sw0-rp router
>> +check ovn-nbctl lsp-add sw1 sw1-rp -- set Logical_Switch_Port sw1-rp \
>> +    type=router options:router-port=rp-sw1 \
>> +    -- lsp-set-addresses sw1-rp router
>> +
>> +rp_uuid=$(ovn-nbctl --bare --colum=_uuid list logical_router_port rp-sw0)
>> +check ovn-nbctl set Logical_Switch sw0 dhcp_relay_port=$rp_uuid
>> +rp_uuid=$(ovn-nbctl --bare --colum=_uuid list logical_router_port rp-sw1)
>> +check ovn-nbctl set Logical_Switch sw1 dhcp_relay_port=$rp_uuid
>> +
>> +check ovn-nbctl lsp-add sw-ext ext-rp -- set Logical_Switch_Port ext-rp \
>> +    type=router options:router-port=rp-ext \
>> +    -- lsp-set-addresses ext-rp router
>> +check ovn-nbctl lsp-add sw-ext lnet \
>> +        -- lsp-set-addresses lnet unknown \
>> +        -- lsp-set-type lnet localnet \
>> +        -- lsp-set-options lnet network_name=phynet
>> +
>> +check ovn-nbctl lsp-add sw0 sw01 \
>> +    -- lsp-set-addresses sw01 "f0:00:00:01:02:03"
>> +
>> +check ovn-nbctl lsp-add sw1 sw11 \
>> +    -- lsp-set-addresses sw11 "f0:00:00:02:02:03"
>> +
>> +AT_CHECK([ovs-vsctl set Open_vSwitch . external-ids:ovn-bridge-mappings=phynet:br-ext])
>> +
>> +OVN_POPULATE_ARP
>> +
>> +check ovn-nbctl --wait=hv sync
>> +
>> +DHCP_TEST_DIR="/tmp/dhcp-test"
>> +rm -rf $DHCP_TEST_DIR
>> +mkdir $DHCP_TEST_DIR
>> +cat > $DHCP_TEST_DIR/dhcpd.conf <<EOF
>> +subnet 172.16.1.0 netmask 255.255.255.0 {
>> +}
>> +subnet 192.168.1.0 netmask 255.255.255.0 {
>> +  range 192.168.1.10 192.168.1.10;
>> +  option routers 192.168.1.1;
>> +  option broadcast-address 192.168.1.255;
>> +  default-lease-time 60;
>> +  max-lease-time 120;
>> +}
>> +subnet 192.168.2.0 netmask 255.255.255.0 {
>> +  range 192.168.2.10 192.168.2.10;
>> +  option routers 192.168.2.1;
>> +  option broadcast-address 192.168.2.255;
>> +  default-lease-time 60;
>> +  max-lease-time 120;
>> +}
>> +EOF
>> +cat > $DHCP_TEST_DIR/dhclien.conf <<EOF
>> +timeout 2
>> +EOF
>> +
>> +touch $DHCP_TEST_DIR/dhcpd.leases
>> +chown root:dhcpd $DHCP_TEST_DIR $DHCP_TEST_DIR/dhcpd.leases
>> +chmod 775 $DHCP_TEST_DIR
>> +chmod 664 $DHCP_TEST_DIR/dhcpd.leases
>> +
>> +
>> +NETNS_DAEMONIZE([server], [dhcpd -4 -f -cf $DHCP_TEST_DIR/dhcpd.conf s1 > dhcpd.log 2>&1], [dhcpd.pid])
>> +
>> +NS_CHECK_EXEC([server], [tcpdump -l -nvv -i s1  udp > pkt.pcap 2>tcpdump_err &])
>> +OVS_WAIT_UNTIL([grep "listening" tcpdump_err])
>> +on_exit 'kill $(pidof tcpdump)'
>> +
>> +NS_CHECK_EXEC([sw01], [dhclient -1 -q -lf $DHCP_TEST_DIR/dhclient-sw01.lease -pf $DHCP_TEST_DIR/dhclient-sw01.pid -cf $DHCP_TEST_DIR/dhclien.conf sw01])
>> +NS_CHECK_EXEC([sw11], [dhclient -1 -q -lf $DHCP_TEST_DIR/dhclient-sw11.lease -pf $DHCP_TEST_DIR/dhclient-sw11.pid -cf $DHCP_TEST_DIR/dhclien.conf sw11])
>> +
>> +OVS_WAIT_UNTIL([
>> +    total_pkts=$(cat pkt.pcap | wc -l)
>> +    test ${total_pkts} -ge 8
>> +])
>> +
>> +on_exit 'kill `cat $DHCP_TEST_DIR/dhclient-sw01.pid` &&
>> +kill `cat $DHCP_TEST_DIR/dhclient-sw11.pid` && rm -rf $DHCP_TEST_DIR'
>> +
>> +NS_CHECK_EXEC([sw01], [ip addr show sw01 | grep -oP '(?<=inet\s)\d+(\.\d+){3}'], [0], [dnl
>> +192.168.1.10
>> +])
>> +NS_CHECK_EXEC([sw11], [ip addr show sw11 | grep -oP '(?<=inet\s)\d+(\.\d+){3}'], [0], [dnl
>> +192.168.2.10
>> +])
>> +OVS_APP_EXIT_AND_WAIT([ovn-controller])
>> +
>> +as ovn-sb
>> +OVS_APP_EXIT_AND_WAIT([ovsdb-server])
>> +
>> +as ovn-nb
>> +OVS_APP_EXIT_AND_WAIT([ovsdb-server])
>> +
>> +as northd
>> +OVS_APP_EXIT_AND_WAIT([NORTHD_TYPE])
>> +
>> +as
>> +OVS_TRAFFIC_VSWITCHD_STOP(["/.*error receiving.*/d
>> +/failed to query port patch-.*/d
>> +/.*terminating with signal 15.*/d"])
>> +AT_CLEANUP
>> +])
>> diff --git a/utilities/ovn-trace.c b/utilities/ovn-trace.c
>> index 0b86eae7b..ae9dd77de 100644
>> --- a/utilities/ovn-trace.c
>> +++ b/utilities/ovn-trace.c
>> @@ -2328,6 +2328,25 @@ execute_put_dhcp_opts(const struct ovnact_put_opts *pdo,
>>     execute_put_opts(pdo, name, uflow, super);
>> }
>> 
>> +static void
>> +execute_dhcpv4_relay_resp_fwd(const struct ovnact_dhcp_relay *dr,
>> +                                const char *name, struct flow *uflow,
>> +                                struct ovs_list *super)
>> +{
>> +    ovntrace_node_append(
>> +        super, OVNTRACE_NODE_ERROR,
>> +        "/* We assume that this packet is DHCPOFFER or DHCPACK and "
>> +            "DHCP broadcast flag is set. Dest IP is set to broadcast. "
>> +            "Dest MAC is set to broadcast but in real network this is unicast "
>> +            "which is extracted from DHCP header. */");
>> +
>> +    /* Assume DHCP broadcast flag is set */
>> +    uflow->nw_dst = 0xFFFFFFFF;
>> +    /* Dest MAC is set to broadcast but in real network this is unicast */
>> +    struct eth_addr bcast_mac = {0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF};
>> +    uflow->dl_dst = bcast_mac;
>> +}
>> +
>> static void
>> execute_put_nd_ra_opts(const struct ovnact_put_opts *pdo,
>>                        const char *name, struct flow *uflow,
>> @@ -3215,6 +3234,15 @@ trace_actions(const struct ovnact *ovnacts, size_t ovnacts_len,
>>                                   "put_dhcpv6_opts", uflow, super);
>>             break;
>> 
>> +        case OVNACT_DHCPV4_RELAY_REQ:
>> +            /* Nothing to do for tracing. */
>> +            break;
>> +
>> +        case OVNACT_DHCPV4_RELAY_RESP_FWD:
>> +            execute_dhcpv4_relay_resp_fwd(ovnact_get_DHCPV4_RELAY_RESP_FWD(a),
>> +                                    "dhcp_relay_resp_fwd", uflow, super);
>> +            break;
>> +
>>         case OVNACT_PUT_ND_RA_OPTS:
>>             execute_put_nd_ra_opts(ovnact_get_PUT_DHCPV6_OPTS(a),
>>                                    "put_nd_ra_opts", uflow, super);
>> --
>> 2.36.6
>> 
>> _______________________________________________
>> dev mailing list
>> dev@openvswitch.org
>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Numan Siddique Jan. 24, 2024, 3:29 a.m. UTC | #4
On Tue, Jan 23, 2024 at 8:02 PM Naveen Yerramneni
<naveen.yerramneni@nutanix.com> wrote:
>
>
>
> > On 16-Jan-2024, at 2:30 AM, Numan Siddique <numans@ovn.org> wrote:
> >
> > On Tue, Dec 12, 2023 at 1:05 PM Naveen Yerramneni
> > <naveen.yerramneni@nutanix.com> wrote:
> >>
> >>    This patch contains changes to enable DHCP Relay Agent support for overlay subnets.
> >>
> >>    USE CASE:
> >>    ----------
> >>      - Enable IP address assignment for overlay subnets from the centralized DHCP server present in the underlay network.
> >>
> >>    PREREQUISITES
> >>    --------------
> >>      - Logical Router Port IP should be assigned (statically) from the same overlay subnet which is managed by DHCP server.
> >>      - LRP IP is used for the GIADDR field when relaying the DHCP packets, and the same IP needs to be configured as the default gateway for the overlay subnet.
> >>      - Overlay subnets managed by external DHCP server are expected to be directly reachable from the underlay network.
> >>
> >>    EXPECTED PACKET FLOW:
> >>    ----------------------
> >>    The following is the expected packet flow in order to support DHCP relay functionality in OVN.
> >>      1. DHCP client originates DHCP discovery (broadcast).
> >>      2. DHCP relay (running on the OVN) receives the broadcast and forwards the packet to the DHCP server by converting it to unicast. While forwarding the packet, it updates the GIADDR in DHCP header to its
> >>         interface IP on which DHCP packet is received.
> >>      3. DHCP server uses GIADDR field to decide the IP address pool from which IP has to be assigned and DHCP offer is sent to the same IP (GIADDR).
> >>      4. DHCP relay agent forwards the offer to the client, it resets the GIADDR field when forwarding the offer to the client.
> >>      5. DHCP client sends DHCP request (broadcast) packet.
> >>      6. DHCP relay (running on the OVN) receives the broadcast and forwards the packet to the DHCP server by converting it to unicast. While forwarding the packet, it updates the GIADDR in DHCP header to its
> >>         interface IP on which DHCP packet is received.
> >>      7. DHCP Server sends the ACK packet.
> >>      8. DHCP relay agent forwards the ACK packet to the client, it resets the GIADDR field when forwarding the ACK to the client.
> >>      9. All the future renew/release packets are directly exchanged between DHCP client and DHCP server.
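The GIADDR handling in the steps above (fill it in on the way to the server, reset it on the way back to the client) can be sketched in C as follows. This is only an illustration under stated assumptions, not the code from this patch: `struct dhcp_fields`, `relay_request()` and `relay_response()` are hypothetical names, only the two header fields the sketch touches are modelled, and refusing requests that already carry a GIADDR is a simplification.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define BOOTREQUEST 1
#define BOOTREPLY   2

/* Hypothetical, reduced view of the BOOTP/DHCP fixed header: only the
 * fields this sketch touches. A real header (RFC 2131) has many more. */
struct dhcp_fields {
    uint8_t op;        /* BOOTREQUEST or BOOTREPLY. */
    uint32_t giaddr;   /* Relay agent IP, treated as an opaque value. */
};

/* Client -> server direction: stamp the relay's interface IP into GIADDR.
 * Assumption: a request that already has a GIADDR is refused here. */
bool
relay_request(struct dhcp_fields *h, uint32_t relay_ip)
{
    assert(h != NULL);
    if (h->op != BOOTREQUEST || h->giaddr != 0) {
        return false;
    }
    h->giaddr = relay_ip;
    return true;
}

/* Server -> client direction: accept only replies addressed to this relay,
 * then reset GIADDR before the reply is forwarded to the client. */
bool
relay_response(struct dhcp_fields *h, uint32_t relay_ip)
{
    assert(h != NULL);
    if (h->op != BOOTREPLY || h->giaddr != relay_ip) {
        return false;
    }
    h->giaddr = 0;
    return true;
}
```

A caller would drop the packet whenever either function returns false, and rewrite the IP/UDP headers only after it returns true.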
> >>
> >>    OVN DHCP RELAY PACKET FLOW:
> >>    ----------------------------
> >>    To add DHCP Relay support on OVN, we need to replicate all the behavior described above using distributed logical switch and logical router.
> >>    At a high level, the packet flow is distributed among the Logical Switch and Logical Router on the source node (where the VM is deployed) and the redirect chassis (RC) node.
> >>      1. The request packet is processed on the source node where the VM is deployed, which relays it to the DHCP server.
> >>      2. The response packet is first processed on the RC node (which first receives the packet from the underlay network). The RC node forwards the packet to the right node by filling in the dest MAC and IP.
> >>
> >>    OVN Packet flow with DHCP relay is explained below.
> >>      1. DHCP client (VM) sends the DHCP discover packet (broadcast).
> >>      2. Logical switch converts the packet to L2 unicast by setting the destination MAC to LRP's MAC
> >>      3. Logical Router receives the packet and redirects it to the OVN controller.
> >>      4. OVN controller updates the required information(GIADDR) in the DHCP payload after doing the required checks. If any check fails, packet is dropped.
> >>      5. Logical Router converts the packet to L3 unicast and forwards it to the server. This packet gets routed like any other packet (via RC node).
> >>      6. Server replies with DHCP offer.
> >>      7. RC node processes the DHCP offer and forwards it to the OVN controller.
> >>      8. OVN controller does sanity checks and updates the destination MAC (available in DHCP header), destination IP (available in DHCP header), resets GIADDR and reinjects the packet to datapath.
> >>         If any check fails, packet is dropped.
> >>      9. Logical router updates the source IP and port and forwards the packet to logical switch.
> >>      10. Logical switch delivers the packet to the DHCP client.
> >>      11. Similar steps are performed for Request and Ack packets.
> >>      12. All the future renew/release packets are directly exchanged between DHCP client and DHCP server
> >>
> >>    NEW OVN ACTIONS
> >>    ---------------
> >>
> >>      1. dhcp_relay_req(<relay-ip>, <server-ip>)
> >>          - This action executes on the source node on which the DHCP request originated.
> >>          - This action relays the DHCP request coming from client to the server. Relay-ip is used to update GIADDR in the DHCP header.
> >>      2. dhcp_relay_resp_fwd(<relay-ip>, <server-ip>)
> >>          - This action executes on the first node (RC node) which processes the DHCP response from the server.
> >>          - This action updates the destination MAC and destination IP so that the response can be forwarded to the appropriate node from which the request originated.
> >>          - Relay-ip, server-ip are used to validate GIADDR and SERVER ID in the DHCP payload.
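The checks described for dhcp_relay_resp_fwd can be sketched in C as follows. This is a hedged illustration, not the pinctrl code from this patch: `find_server_id()`, `resp_fwd_destination()` and `struct resp_dst` are hypothetical names, the options buffer is assumed to start right after the DHCP magic cookie, and `flags` is assumed to be in host byte order. The destination choice (client MAC from the DHCP header; broadcast IP when the DHCP broadcast flag is set, otherwise the offered address) follows the description above and the ovn-trace comment in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DHCP_OPT_PAD        0
#define DHCP_OPT_SERVER_ID  54
#define DHCP_OPT_END        255
#define DHCP_BROADCAST_FLAG 0x8000

/* Hypothetical result type: where the reply should be sent. */
struct resp_dst {
    uint8_t mac[6];
    uint8_t ip[4];
};

/* Walks a DHCP options buffer and copies the 4-byte server identifier
 * (option 54) into 'server_id'.  Returns false if absent or malformed. */
bool
find_server_id(const uint8_t *opts, size_t len, uint8_t server_id[4])
{
    size_t i = 0;
    while (i < len) {
        uint8_t code = opts[i];
        if (code == DHCP_OPT_END) {
            break;
        }
        if (code == DHCP_OPT_PAD) {
            i++;
            continue;
        }
        if (i + 1 >= len) {
            return false;               /* Truncated: no length byte. */
        }
        size_t opt_len = opts[i + 1];
        if (i + 2 + opt_len > len) {
            return false;               /* Truncated: value runs past end. */
        }
        if (code == DHCP_OPT_SERVER_ID) {
            if (opt_len != 4) {
                return false;
            }
            memcpy(server_id, &opts[i + 2], 4);
            return true;
        }
        i += 2 + opt_len;
    }
    return false;
}

/* Validates GIADDR and the server identifier, then fills in the
 * destination MAC and IP for forwarding the reply to the client. */
bool
resp_fwd_destination(uint16_t flags, const uint8_t yiaddr[4],
                     const uint8_t chaddr[6], const uint8_t giaddr[4],
                     const uint8_t relay_ip[4], const uint8_t *opts,
                     size_t opts_len, const uint8_t server_ip[4],
                     struct resp_dst *dst)
{
    assert(dst != NULL);
    uint8_t server_id[4];

    if (memcmp(giaddr, relay_ip, 4) != 0) {
        return false;                   /* Reply is not for this relay. */
    }
    if (!find_server_id(opts, opts_len, server_id)
        || memcmp(server_id, server_ip, 4) != 0) {
        return false;                   /* Unknown or missing server ID. */
    }

    memcpy(dst->mac, chaddr, 6);        /* Client MAC from the DHCP header. */
    if (flags & DHCP_BROADCAST_FLAG) {
        memset(dst->ip, 0xff, 4);       /* 255.255.255.255 */
    } else {
        memcpy(dst->ip, yiaddr, 4);     /* Address being offered/acked. */
    }
    return true;
}
```

Returning false corresponds to the "if any check fails, packet is dropped" behaviour in the commit message.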
> >>
> >>    FLOWS
> >>    -----
> >>    Following are the flows required for one overlay subnet.
> >>
> >>      1. table=27(ls_in_l2_lkup      ), priority=100  , match=(inport == <vm_port> && eth.src == <vm_mac> && ip4.src == 0.0.0.0 && ip4.dst == 255.255.255.255 && udp.src == 68 && udp.dst == 67), action=(eth.dst=<lrp_mac>;outport=<lrp-port>;next;/* DHCP_RELAY_REQ */)
> >>      2. table=3 (lr_in_ip_input     ), priority=110  , match=(inport == <lrp_port> && ip4.src == 0.0.0.0 && ip4.dst == 255.255.255.255 && udp.src == 68 && udp.dst == 67), action=(dhcp_relay_req(<lrp_ip>,<dhcp_server_ip>);ip4.src=<lrp_ip>;ip4.dst=<dhcp_server_ip>;udp.src=67;next; /* DHCP_RELAY_REQ */)
> >>      3. table=3 (lr_in_ip_input     ), priority=110  , match=(ip4.src == <dhcp_server_ip> && ip4.dst ==<lrp_ip> && udp.src == 67 && udp.dst == 67), action=(next;/* DHCP_RELAY_RESP */)
> >>      4. table=17(lr_in_dhcp_relay_resp_fwd), priority=110  , match=(ip4.src == <dhcp_server_ip> && ip4.dst == <lrp_ip> && udp.src == 67 && udp.dst == 67), action=(dhcp_relay_resp_fwd(<lrp_ip>,<dhcp_server_ip>);ip4.src=<lrp_ip>;udp.dst=68;outport=<lrp_port>;output; /* DHCP_RELAY_RESP */)
> >>
> >>    NEW PIPELINE STAGES
> >>    -------------------
> >>    The following stage is added for the DHCP relay feature. Some of the flows are fitted into the existing pipeline stages.
> >>      1. lr_in_dhcp_relay_resp_fwd
> >>          - Forward the DHCP response to the appropriate node
> >>
> >>    NB SCHEMA CHANGES
> >>    ----------------
> >>      1. New DHCP_Relay table
> >>          "DHCP_Relay": {
> >>                "columns": {
> >>            "name": {"type": "string"},
> >>                    "servers": {"type": {"key": "string",
> >>                                           "min": 0,
> >>                                           "max": 1}},
> >>                    "external_ids": {
> >>                        "type": {"key": "string", "value": "string",
> >>                                "min": 0, "max": "unlimited"}}},
> >>                "isRoot": true},
> >>      2. New column to Logical_Router_Port table
> >>          "dhcp_relay": {"type": {"key": {"type": "uuid",
> >>                                "refTable": "DHCP_Relay",
> >>                                "refType": "weak"},
> >>                                "min": 0,
> >>                                "max": 1}},
> >>      3. New column to Logical_Switch table
> >>          "dhcp_relay_port": {"type": {"key": {"type": "uuid",
> >>                                        "refTable": "Logical_Router_Port",
> >>                                        "refType": "weak"},
> >>                                         "min": 0,
> >>                                         "max": 1}}},
> >>
> >>    Commands to enable the feature:
> >>    ------------------------------
> >>      - ovn-nbctl create DHCP_Relay servers=<ip>
> >>      - ovn-nbctl set Logical_Router_port <lrp_uuid> dhcp_relay=<dhcp_relay_uuid>
> >>      - ovn-nbctl set Logical_Switch <ls_uuid> dhcp_relay_port=<lrp_uuid>
> >>
> >>    Example:
> >>    -------
> >>     ovn-nbctl ls-add sw1
> >>     ovn-nbctl lsp-add sw1 sw1-port1
> >>     ovn-nbctl lsp-set-addresses sw1-port1 <MAC> #Only MAC address has to be specified when logical ports are created.
> >>     ovn-nbctl lr-add lr1
> >>     ovn-nbctl lrp-add lr1 lr1-port1 <MAC> <GATEWAY_IP/Prefix> #GATEWAY IP is set in GIADDR field when relaying the DHCP requests to server.
> >>     ovn-nbctl lsp-add sw1 lr1-attachment
> >>     ovn-nbctl lsp-set-type lr1-attachment router
> >>     ovn-nbctl lsp-set-addresses lr1-attachment <MAC>
> >>     ovn-nbctl lsp-set-options lr1-attachment router-port=lr1-port1
> >>     ovn-nbctl create DHCP_Relay servers=<DHCP_SERVER_IP>
> >>     ovn-nbctl set Logical_Router_port <lrp_uuid> dhcp_relay=<relay_uuid>
> >>     ovn-nbctl set Logical_Switch <ls_uuid> dhcp_relay_port=<lrp_uuid>
> >>
> >>    Limitations:
> >>    ------------
> >>      - All OVN features that need an IP address to be configured on the logical port (like proxy ARP, etc.) will not be supported for overlay subnets on which DHCP relay is enabled.
> >>
> >>    References:
> >>    ----------
> >>      - rfc1541, rfc1542, rfc2131
> >>
> >> Signed-off-by: Naveen Yerramneni <naveen.yerramneni@nutanix.com>
> >> Co-authored-by: Huzaifa Calcuttawala <huzaifa.c@nutanix.com>
> >> Signed-off-by: Huzaifa Calcuttawala <huzaifa.c@nutanix.com>
> >> CC: Mary Manohar <mary.manohar@nutanix.com>
> >
> > Hi Naveen,
> >
> > Thanks for the patch.  Sorry for the delayed response.
> >
> > I've a few comments.
> >
> > 1.  Regarding the newly added Table - DHCP_Relay in NB DB and the
> > newly added columns in Logical_Switch and
> >    Logical_Router table.
> >
> >    I don't think there is a need to add the new table DHCP_Relay
> > since it only stores the dhcp relay agent server ip.
> >    Also it could complicate the northd incremental processing.
> >
> >    If for example we have below logical switches and router
> >
> >    ovn-nbctl lr-add R1
> >    ovn-nbctl ls-add sw0
> >    ovn-nbctl ls-add sw1
> >    ovn-nbctl ls-add sw-ext
> >    ovn-nbctl lrp-add R1 rp-sw0 00:00:01:01:02:03 192.168.1.1/24
> >    ovn-nbctl lrp-add R1 rp-sw1 00:00:03:01:02:03 192.168.2.1/24
> >    ovn-nbctl lrp-add R1 rp-ext 00:00:02:01:02:03 172.16.1.254/24
> >
> >    ovn-nbctl lsp-add sw0 sw0-rp -- set Logical_Switch_Port sw0-rp \
> >    type=router options:router-port=rp-sw0 \
> >    -- lsp-set-addresses sw0-rp router
> >
> >    ovn-nbctl lsp-add sw1 sw1-rp -- set Logical_Switch_Port sw1-rp \
> >    type=router options:router-port=rp-sw1 \
> >    -- lsp-set-addresses sw1-rp router
> >
> >    I'd suggest doing something like below to enable this feature.
> >
> >    ovn-nbctl set Logical_Switch_Port sw0-rp options:dhcp_relay=true
> >    ovn-nbctl set Logical_Switch_Port sw1-rp options:dhcp_relay=true
> >
> >    (Make sure that only one logical switch port of type router can
> > have this flag - dhcp_relay set
> >     for a given logical switch and document this limitation.)
>
> Ack. This suggestion looks good.
>
> >    ovn-nbctl set Logical_Router_port rp-sw0 options:dhcp_relay_ip=172.16.1.1
> >    ovn-nbctl set Logical_Router_port rp-sw1 options:dhcp_relay_ip=172.16.1.1
> >
> >    Let me know if there are any limitations with this.
>
> The reason I added a new table is that it would be useful in the future if we add
> additional options (like setting the hop count in the DHCP header, etc.) to the DHCP relay
> functionality. What do you recommend if we have to add more options
> in the future?

I see.  If there is a possibility of adding more options, then having
a separate table makes sense.
I'd suggest adding the options column to the DHCP_Relay table even if
this patch does not presently use
any.  This would help with upgrades.

But I don't think there is a need to add a new column in the logical
switch port table to enable DHCP relay.

Thanks
Numan

>
>
>
> > 2.  Regarding the newly added actions - dhcp_relay_req() and
> > dhcp_relay_resp_fwd().
> >     Both of these actions are encoded as OVS controller action with
> > pause enabled.
> >     Which means ovs-vswitchd has to freeze the flow translation and
> > resume the flow translation
> >     once the ovn-controller resumes it.  But the functions
> > pinctrl_handle_dhcp_relay_req()
> >     and pinctrl_handle_dhcp_relay_resp_fwd() do not resume the packet
> > if the packet
> >     has some errors.  This is wrong: the packet must be resumed even on
> >     error, otherwise vswitchd will never thaw the frozen translation.
> >
> >     See the existing OVN actions, put_dhcp_opts() and a few others, which
> >     use a controller action with pause.  In such actions, the result
> >     (i.e. whether put_dhcp_opts() was successful or not) is stored in a
> >     register bit, and the next stage takes a decision based on that result.
> >
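A minimal sketch of the "always resume, report the outcome in a register bit" pattern described above. The types and helpers here (fake_pin, resume_packet, handle_paused_action) are hypothetical stand-ins for illustration only and are not the OVS/OVN API; the validation rules are likewise assumed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the state of a paused packet-in. */
struct fake_pin {
    uint32_t result_reg;   /* Register whose bit 0 carries the result. */
    bool resumed;          /* Set once the packet has been handed back. */
};

/* Hypothetical stand-in for queueing the resume message. */
static void
resume_packet(struct fake_pin *pin)
{
    pin->resumed = true;
}

/* Validates 'payload' and resumes the packet on every path.
 * Returns the success flag that was stored in bit 0 of the register. */
static bool
handle_paused_action(struct fake_pin *pin, const uint8_t *payload, size_t len)
{
    bool success = false;

    assert(pin != NULL);
    if (!payload || len < 4) {
        goto out;              /* Too short: still resumed below. */
    }
    if (payload[0] != 1) {     /* Assumed rule: first byte must be 1. */
        goto out;
    }
    success = true;

out:
    /* Record the result in bit 0, then thaw the frozen translation. */
    pin->result_reg = (pin->result_reg & ~UINT32_C(1)) | (success ? 1 : 0);
    resume_packet(pin);
    return success;
}
```

With a single exit label, an error path cannot skip the resume, and the next pipeline stage can match on the result bit to drop or forward the packet.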
> >     For the action dhcp_relay_req(relay_ip, server_ip),  I don't
> > think you should use the pause flag.
> >     Also in this action the argument server_ip is never used in the
> > function pinctrl_handle_dhcp_relay_req()
> >     other than to just log.
> >
> >     I'd suggest you do something like this:
> >
> >    table=3 (lr_in_ip_input     ), priority=110  , match=(inport ==
> > "lrp1" && ip4.src == 0.0.0.0 && ip4.dst == 255.255.255.255 && udp.src
> > == 68 && udp.dst == 67),
> >    action=(dhcp_relay_req { ip4.src = 192.168.1.1; ip4.dst =
> > 172.16.1.1; udp.src = 67; dhcp_header.giaddr = <relay_ip>;
> > next(pipeline=ingress,table=S_ROUTER_IN_UNSNAT);  /* DHCP_RELAY_REQ */
> > })
> >
> >    dhcp_relay_req action would get translated into a controller
> > action with pause=false and all the inner actions of this are encoded
> > as
> >    normal actions and stored in the userdata of controller action.
> > Please see icmp4_error {} as an example.
> >    Add a new OVN field 'dhcp_header.giaddr' which gets translated as
> > controller action with pause flag set.
> >    Please see the existing OVN field - icmp4.frag_mtu as an example
> > and see this commit for reference [1]
> >    When encoding this new OVN field, store the relay_ip in the
> > userdata buffer and in pinctrl.c
> >    get the relay_ip value and store it in the dhcp header field.
> >
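Writing relay_ip into the DHCP header changes bytes covered by the UDP checksum, so the checksum has to be patched as well. The sketch below shows one way to do that with an RFC 1624 style incremental update on a plain byte buffer. All function names are hypothetical, and it assumes the field offset is even and that the checksum is kept outside the buffer; it does not use the OVS checksum helpers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Folds a 32-bit accumulator into a 16-bit ones' complement sum. */
static uint16_t
fold16(uint32_t sum)
{
    while (sum >> 16) {
        sum = (sum & 0xffff) + (sum >> 16);
    }
    return (uint16_t) sum;
}

/* Full Internet checksum over 'len' bytes ('len' must be even here). */
static uint16_t
csum_full(const uint8_t *data, size_t len)
{
    uint32_t sum = 0;

    assert(len % 2 == 0);
    for (size_t i = 0; i < len; i += 2) {
        sum += (uint32_t) (data[i] << 8 | data[i + 1]);
    }
    return (uint16_t) ~fold16(sum);
}

/* RFC 1624 incremental update for one 16-bit word:
 * HC' = ~(~HC + ~m + m'). */
static uint16_t
csum_update16(uint16_t csum, uint16_t old_word, uint16_t new_word)
{
    uint32_t sum = (uint16_t) ~csum;

    sum += (uint16_t) ~old_word;
    sum += new_word;
    return (uint16_t) ~fold16(sum);
}

/* Writes a 4-byte giaddr at 'offset' (must be even) and returns the
 * checksum patched for the two 16-bit words that changed. */
static uint16_t
set_giaddr(uint8_t *pkt, size_t offset, const uint8_t giaddr[4],
           uint16_t csum)
{
    for (size_t i = 0; i < 4; i += 2) {
        uint16_t old_word = (uint16_t) (pkt[offset + i] << 8
                                        | pkt[offset + i + 1]);
        uint16_t new_word = (uint16_t) (giaddr[i] << 8 | giaddr[i + 1]);

        csum = csum_update16(csum, old_word, new_word);
    }
    memcpy(pkt + offset, giaddr, 4);
    return csum;
}
```

In the RFC 2131 message layout giaddr sits at byte offset 24 of the DHCP header, so a caller would pass that offset relative to the start of the DHCP payload.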
> >
> >    For the action dhcp_relay_resp_fwd,  I'd suggest something like below:
> >
> >      table=17 (lr_in_dhcp_relay_resp_chk), priority=110  ,
> > match=(ip4.src == 172.16.1.1 && ip4.dst == 192.168.1.1 && udp.src ==
> > 67 && udp.dst == 67),
> >      action=(reg0[0] = dhcp_relay_resp_chk(dhcp_header.giaddr ==
> > <relay_ip>); next;)
> >      table=17 (lr_in_dhcp_relay_resp), priority=110  , match=(ip4.src
> > == 172.16.1.1 && ip4.dst == 192.168.1.1 && udp.src == 67 && udp.dst ==
> > 67 && reg0[0] == 1),
> >      action=(ip4.src = 192.168.1.1; udp.dst = 68; outport = "lrp1";
> > output; /* DHCP_RELAY_RESP */)
> >
> >       I used reg0[0] as an example.  You will need to find a free
> > register bit and use that instead.
> >
> >      You need to encode dhcp_relay_resp_chk as a controller action with
> > pause=true, and store the relay_ip in the userdata buffer.
> >      Then in pinctrl.c, check whether 'dhcp_header.giaddr == relay_ip'.
> > If so, set the result register bit to 1, else to 0.
> >
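The giaddr comparison suggested for dhcp_relay_resp_chk can be sketched on a raw DHCP payload as follows. The function name and constants are hypothetical; the offsets follow the RFC 2131 fixed header layout (op at byte 0, giaddr at byte 24, 236 bytes before the magic cookie), and addresses are compared in network byte order.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BOOTP_OP_REPLY      2    /* RFC 2131 'op' value for BOOTREPLY. */
#define BOOTP_GIADDR_OFFSET 24   /* Byte offset of 'giaddr' in the header. */
#define BOOTP_FIXED_LEN     236  /* Fixed header length, without options. */

/* Returns true if 'dhcp' is a complete BOOTREPLY whose giaddr equals
 * 'relay_ip'.  Any malformed input yields false instead of an early
 * bail-out, so the caller can always store a result and resume. */
static bool
dhcp_giaddr_matches(const uint8_t *dhcp, size_t len,
                    const uint8_t relay_ip[4])
{
    if (!dhcp || len < BOOTP_FIXED_LEN) {
        return false;
    }
    if (dhcp[0] != BOOTP_OP_REPLY) {
        return false;
    }
    return !memcmp(dhcp + BOOTP_GIADDR_OFFSET, relay_ip, 4);
}
```

The boolean result maps directly onto the register bit: 1 when the reply carries the expected relay address, 0 otherwise.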
> >   Let me know if you've any questions.
> >
>
> Ack. Thanks for the suggestions and detailed explanation.
> Before implementation I had referred to the icmp4_error and native dhcp_server flows,
> but I had a slight misunderstanding about the pause flag.
>
>
> > 3.  The newly added functions in pinctrl.c have a lot of repetitive
> > code and it is very much similar to existing
> > pinctrl_handle_put_dhcp_opts()
> >    Please see if the duplicate code can be avoided.
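One way to cut the duplication is a single options walker shared by the request and response handlers. The sketch below is a standalone illustration of that idea, not code from pinctrl.c: the struct and function names are hypothetical, and only the option codes (0 pad, 50 requested IP, 53 message type, 54 server identifier, 255 end) come from the DHCP options specification.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define OPT_PAD       0
#define OPT_REQ_IP    50
#define OPT_MSG_TYPE  53
#define OPT_SERVER_ID 54
#define OPT_END       255

/* Hypothetical summary of the options both handlers care about. */
struct dhcp_opts_summary {
    bool has_msg_type;
    uint8_t msg_type;
    bool has_server_id;
    uint8_t server_id[4];    /* Network byte order. */
    bool has_req_ip;
    uint8_t req_ip[4];       /* Network byte order. */
};

/* Walks the options that follow the magic cookie.  Returns false if an
 * option header or value runs past 'len'. */
static bool
dhcp_parse_options(const uint8_t *opts, size_t len,
                   struct dhcp_opts_summary *out)
{
    size_t i = 0;

    memset(out, 0, sizeof *out);
    while (i < len) {
        uint8_t code = opts[i];

        if (code == OPT_END) {
            break;
        }
        if (code == OPT_PAD) {
            i++;
            continue;
        }
        if (len - i < 2) {
            return false;            /* No room for the length byte. */
        }
        uint8_t optlen = opts[i + 1];
        if (len - i - 2 < optlen) {
            return false;            /* Value is truncated. */
        }
        const uint8_t *val = opts + i + 2;

        if (code == OPT_MSG_TYPE && optlen == 1) {
            out->has_msg_type = true;
            out->msg_type = val[0];
        } else if (code == OPT_SERVER_ID && optlen == 4) {
            out->has_server_id = true;
            memcpy(out->server_id, val, 4);
        } else if (code == OPT_REQ_IP && optlen == 4) {
            out->has_req_ip = true;
            memcpy(out->req_ip, val, 4);
        }
        i += 2 + (size_t) optlen;
    }
    return true;
}
```

Each handler would then call the walker once and apply only its own checks (for example, the server identifier comparison on the response path) to the returned summary.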
>
> Ack.
>
>
>
> > [1] - https://github.com/ovn-org/ovn/commit/3d9fec3fd5992e1201b4d4fdf43f1f397e8d5ea1
> >
> > Thanks
> > Numan
> >
> >> ---
> >> controller/pinctrl.c  | 441 ++++++++++++++++++++++++++++++++++++++++++
> >> include/ovn/actions.h |  26 +++
> >> lib/actions.c         | 117 +++++++++++
> >> lib/ovn-l7.h          |   1 +
> >> northd/northd.c       | 177 ++++++++++++++++-
> >> ovn-nb.ovsschema      |  25 ++-
> >> ovn-nb.xml            |  28 +++
> >> tests/atlocal.in      |   3 +
> >> tests/ovn-northd.at   |  41 +++-
> >> tests/ovn.at          |  12 +-
> >> tests/system-ovn.at   | 150 ++++++++++++++
> >> utilities/ovn-trace.c |  28 +++
> >> 12 files changed, 1032 insertions(+), 17 deletions(-)
> >>
> >> diff --git a/controller/pinctrl.c b/controller/pinctrl.c
> >> index 5a35d56f6..45240f01d 100644
> >> --- a/controller/pinctrl.c
> >> +++ b/controller/pinctrl.c
> >> @@ -1897,6 +1897,437 @@ is_dhcp_flags_broadcast(ovs_be16 flags)
> >>     return flags & htons(DHCP_BROADCAST_FLAG);
> >> }
> >>
> >> +static const char *dhcp_msg_str[] = {
> >> +[0] = "INVALID",
> >> +[DHCP_MSG_DISCOVER] = "DISCOVER",
> >> +[DHCP_MSG_OFFER] = "OFFER",
> >> +[DHCP_MSG_REQUEST] = "REQUEST",
> >> +[OVN_DHCP_MSG_DECLINE] = "DECLINE",
> >> +[DHCP_MSG_ACK] = "ACK",
> >> +[DHCP_MSG_NAK] = "NAK",
> >> +[OVN_DHCP_MSG_RELEASE] = "RELEASE",
> >> +[OVN_DHCP_MSG_INFORM] = "INFORM"
> >> +};
> >> +
> >> +static bool
> >> +dhcp_relay_is_msg_type_supported(uint8_t msg_type)
> >> +{
> >> +    return (msg_type >= DHCP_MSG_DISCOVER && msg_type <= OVN_DHCP_MSG_RELEASE);
> >> +}
> >> +
> >> +static const char *dhcp_msg_str_get(uint8_t msg_type)
> >> +{
> >> +    if (!dhcp_relay_is_msg_type_supported(msg_type)) {
> >> +        return "INVALID";
> >> +    }
> >> +    return dhcp_msg_str[msg_type];
> >> +}
> >> +
> >> +/* Called with in the pinctrl_handler thread context. */
> >> +static void
> >> +pinctrl_handle_dhcp_relay_req(
> >> +    struct rconn *swconn,
> >> +    struct dp_packet *pkt_in, struct ofputil_packet_in *pin,
> >> +    struct ofpbuf *userdata,
> >> +    struct ofpbuf *continuation)
> >> +{
> >> +    enum ofp_version version = rconn_get_version(swconn);
> >> +    enum ofputil_protocol proto = ofputil_protocol_from_ofp_version(version);
> >> +    struct dp_packet *pkt_out_ptr = NULL;
> >> +
> >> +    /* Parse relay IP and server IP. */
> >> +    ovs_be32 *relay_ip = ofpbuf_try_pull(userdata, sizeof *relay_ip);
> >> +    ovs_be32 *server_ip = ofpbuf_try_pull(userdata, sizeof *server_ip);
> >> +    if (!relay_ip || !server_ip) {
> >> +        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
> >> +        VLOG_WARN_RL(&rl, "DHCP_RELAY_REQ: relay ip or server ip "
> >> +                  "not present in the userdata");
> >> +        return;
> >> +    }
> >> +
> >> +    /* Validate the DHCP request packet.
> >> +     * Format of the DHCP packet is
> >> +     * ------------------------------------------------------------------------
> >> +     *| UDP HEADER  | DHCP HEADER  | 4 Byte DHCP Cookie | DHCP OPTIONS(var len)|
> >> +     * ------------------------------------------------------------------------
> >> +     */
> >> +
> >> +    size_t in_l4_size = dp_packet_l4_size(pkt_in);
> >> +    const char *end = (char *) dp_packet_l4(pkt_in) + in_l4_size;
> >> +    const char *in_dhcp_ptr = dp_packet_get_udp_payload(pkt_in);
> >> +    if (!in_dhcp_ptr) {
> >> +        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
> >> +        VLOG_WARN_RL(&rl, "DHCP_RELAY_REQ: invalid or incomplete "
> >> +                  "DHCP packet received");
> >> +        return;
> >> +    }
> >> +
> >> +    const struct dhcp_header *in_dhcp_data
> >> +        = (const struct dhcp_header *) in_dhcp_ptr;
> >> +    in_dhcp_ptr += sizeof *in_dhcp_data;
> >> +    if (in_dhcp_ptr > end) {
> >> +        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
> >> +        VLOG_WARN_RL(&rl, "DHCP_RELAY_REQ: invalid or incomplete "
> >> +                "DHCP packet received, bad data length");
> >> +        return;
> >> +    }
> >> +    if (in_dhcp_data->op != DHCP_OP_REQUEST) {
> >> +        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
> >> +        VLOG_WARN_RL(&rl, "DHCP_RELAY_REQ: invalid opcode in the "
> >> +                "DHCP packet: %d", in_dhcp_data->op);
> >> +        return;
> >> +    }
> >> +
> >> +    /* DHCP options follow the DHCP header. The first 4 bytes of the DHCP
> >> +     * options is the DHCP magic cookie followed by the actual DHCP options.
> >> +     */
> >> +    ovs_be32 magic_cookie = htonl(DHCP_MAGIC_COOKIE);
> >> +    if (in_dhcp_ptr + sizeof magic_cookie > end ||
> >> +        get_unaligned_be32((const void *) in_dhcp_ptr) != magic_cookie) {
> >> +        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
> >> +        VLOG_WARN_RL(&rl, "DHCP_RELAY_REQ: magic cookie not present "
> >> +                "in the packet");
> >> +        return;
> >> +    }
> >> +
> >> +    if (in_dhcp_data->giaddr) {
> >> +        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
> >> +        VLOG_WARN_RL(&rl, "DHCP_RELAY_REQ: giaddr is already set");
> >> +        return;
> >> +    }
> >> +
> >> +    if (in_dhcp_data->htype != 0x1) {
> >> +        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
> >> +        VLOG_WARN_RL(&rl, "DHCP_RELAY_REQ: packet is received with "
> >> +                "unsupported hardware type");
> >> +        return;
> >> +    }
> >> +
> >> +    ovs_be32 *server_id_ptr = NULL;
> >> +    const uint8_t *in_dhcp_msg_type = NULL;
> >> +
> >> +    in_dhcp_ptr += sizeof magic_cookie;
> >> +    ovs_be32 request_ip = in_dhcp_data->ciaddr;
> >> +    while (in_dhcp_ptr < end) {
> >> +        const struct dhcp_opt_header *in_dhcp_opt =
> >> +            (const struct dhcp_opt_header *) in_dhcp_ptr;
> >> +        if (in_dhcp_opt->code == DHCP_OPT_END) {
> >> +            break;
> >> +        }
> >> +        if (in_dhcp_opt->code == DHCP_OPT_PAD) {
> >> +            in_dhcp_ptr += 1;
> >> +            continue;
> >> +        }
> >> +        in_dhcp_ptr += sizeof *in_dhcp_opt;
> >> +        if (in_dhcp_ptr > end) {
> >> +            break;
> >> +        }
> >> +        in_dhcp_ptr += in_dhcp_opt->len;
> >> +        if (in_dhcp_ptr > end) {
> >> +            break;
> >> +        }
> >> +
> >> +        switch (in_dhcp_opt->code) {
> >> +        case DHCP_OPT_MSG_TYPE:
> >> +            if (in_dhcp_opt->len == 1) {
> >> +                in_dhcp_msg_type = DHCP_OPT_PAYLOAD(in_dhcp_opt);
> >> +            }
> >> +            break;
> >> +        case DHCP_OPT_REQ_IP:
> >> +            if (in_dhcp_opt->len == 4) {
> >> +                request_ip = get_unaligned_be32(DHCP_OPT_PAYLOAD(in_dhcp_opt));
> >> +            }
> >> +            break;
> >> +        /* Server Identifier */
> >> +        case OVN_DHCP_OPT_CODE_SERVER_ID:
> >> +            if (in_dhcp_opt->len == 4) {
> >> +                server_id_ptr = DHCP_OPT_PAYLOAD(in_dhcp_opt);
> >> +            }
> >> +            break;
> >> +        default:
> >> +            break;
> >> +        }
> >> +    }
> >> +
> >> +    /* Check whether the DHCP Message Type (opt 53) is present or not */
> >> +    if (!in_dhcp_msg_type) {
> >> +        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
> >> +        VLOG_WARN_RL(&rl, "DHCP_RELAY_REQ: missing message type");
> >> +        return;
> >> +    }
> >> +
> >> +    /* Relay the DHCP request packet */
> >> +    uint16_t new_l4_size = in_l4_size;
> >> +    size_t new_packet_size = pkt_in->l4_ofs + new_l4_size;
> >> +
> >> +    struct dp_packet pkt_out;
> >> +    dp_packet_init(&pkt_out, new_packet_size);
> >> +    dp_packet_clear(&pkt_out);
> >> +    dp_packet_prealloc_tailroom(&pkt_out, new_packet_size);
> >> +    pkt_out_ptr = &pkt_out;
> >> +
> >> +    /* Copy the L2 and L3 headers from the pkt_in as they would remain same*/
> >> +    dp_packet_put(
> >> +        &pkt_out, dp_packet_pull(pkt_in, pkt_in->l4_ofs), pkt_in->l4_ofs);
> >> +
> >> +    pkt_out.l2_5_ofs = pkt_in->l2_5_ofs;
> >> +    pkt_out.l2_pad_size = pkt_in->l2_pad_size;
> >> +    pkt_out.l3_ofs = pkt_in->l3_ofs;
> >> +    pkt_out.l4_ofs = pkt_in->l4_ofs;
> >> +
> >> +    struct ip_header *out_ip = dp_packet_l3(&pkt_out);
> >> +
> >> +    struct udp_header *udp = dp_packet_put(
> >> +        &pkt_out, dp_packet_pull(pkt_in, UDP_HEADER_LEN), UDP_HEADER_LEN);
> >> +
> >> +    struct dhcp_header *dhcp_data = dp_packet_put(&pkt_out,
> >> +        dp_packet_pull(pkt_in, new_l4_size - UDP_HEADER_LEN),
> >> +        new_l4_size - UDP_HEADER_LEN);
> >> +    dhcp_data->giaddr = *relay_ip;
> >> +    if (udp->udp_csum) {
> >> +        udp->udp_csum = recalc_csum32(udp->udp_csum,
> >> +            0, dhcp_data->giaddr);
> >> +    }
> >> +    pin->packet = dp_packet_data(&pkt_out);
> >> +    pin->packet_len = dp_packet_size(&pkt_out);
> >> +
> >> +    /* Log the DHCP message. */
> >> +    static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(20, 40);
> >> +    const struct eth_header *l2 = dp_packet_eth(&pkt_out);
> >> +    VLOG_INFO_RL(&rl, "DHCP_RELAY_REQ:: MSG_TYPE:%s MAC:"ETH_ADDR_FMT
> >> +                " XID:%u"
> >> +                " REQ_IP:"IP_FMT
> >> +                " GIADDR:"IP_FMT
> >> +                " SERVER_ADDR:"IP_FMT,
> >> +                dhcp_msg_str_get(*in_dhcp_msg_type),
> >> +                ETH_ADDR_BYTES_ARGS(dhcp_data->chaddr), ntohl(dhcp_data->xid),
> >> +                IP_ARGS(request_ip), IP_ARGS(dhcp_data->giaddr),
> >> +                IP_ARGS(*server_ip));
> >> +    queue_msg(swconn, ofputil_encode_resume(pin, continuation, proto));
> >> +    if (pkt_out_ptr) {
> >> +        dp_packet_uninit(pkt_out_ptr);
> >> +    }
> >> +}
> >> +
> >> +/* Called with in the pinctrl_handler thread context. */
> >> +static void
> >> +pinctrl_handle_dhcp_relay_resp_fwd(
> >> +    struct rconn *swconn,
> >> +    struct dp_packet *pkt_in, struct ofputil_packet_in *pin,
> >> +    struct ofpbuf *userdata,
> >> +    struct ofpbuf *continuation)
> >> +{
> >> +    enum ofp_version version = rconn_get_version(swconn);
> >> +    enum ofputil_protocol proto = ofputil_protocol_from_ofp_version(version);
> >> +    struct dp_packet *pkt_out_ptr = NULL;
> >> +
> >> +    /* Parse relay IP and server IP. */
> >> +    ovs_be32 *relay_ip = ofpbuf_try_pull(userdata, sizeof *relay_ip);
> >> +    ovs_be32 *server_ip = ofpbuf_try_pull(userdata, sizeof *server_ip);
> >> +    if (!relay_ip || !server_ip) {
> >> +        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
> >> +        VLOG_WARN_RL(&rl, "DHCP_RELAY_RESP: relay ip or server ip "
> >> +                "not present in the userdata");
> >> +        return;
> >> +    }
> >> +
> >> +    /* Validate the DHCP response packet.
> >> +     * Format of the DHCP packet is
> >> +     * ------------------------------------------------------------------------
> >> +     *| UDP HEADER  | DHCP HEADER  | 4 Byte DHCP Cookie | DHCP OPTIONS(var len)|
> >> +     * ------------------------------------------------------------------------
> >> +     */
> >> +
> >> +    size_t in_l4_size = dp_packet_l4_size(pkt_in);
> >> +    const char *end = (char *) dp_packet_l4(pkt_in) + in_l4_size;
> >> +    const char *in_dhcp_ptr = dp_packet_get_udp_payload(pkt_in);
> >> +    if (!in_dhcp_ptr) {
> >> +        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
> >> +        VLOG_WARN_RL(&rl, "DHCP_RELAY_RESP_FWD: invalid or incomplete "
> >> +                "packet received");
> >> +        return;
> >> +    }
> >> +
> >> +    const struct dhcp_header *in_dhcp_data
> >> +        = (const struct dhcp_header *) in_dhcp_ptr;
> >> +    in_dhcp_ptr += sizeof *in_dhcp_data;
> >> +    if (in_dhcp_ptr > end) {
> >> +        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
> >> +        VLOG_WARN_RL(&rl, "DHCP_RELAY_RESP_FWD: invalid or incomplete "
> >> +                    "packet received, bad data length");
> >> +        return;
> >> +    }
> >> +    if (in_dhcp_data->op != DHCP_OP_REPLY) {
> >> +        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
> >> +        VLOG_WARN_RL(&rl, "DHCP_RELAY_RESP_FWD: invalid opcode "
> >> +                "in the packet: %d", in_dhcp_data->op);
> >> +        return;
> >> +    }
> >> +
> >> +    /* DHCP options follow the DHCP header. The first 4 bytes of the DHCP
> >> +     * options is the DHCP magic cookie followed by the actual DHCP options.
> >> +     */
> >> +    ovs_be32 magic_cookie = htonl(DHCP_MAGIC_COOKIE);
> >> +    if (in_dhcp_ptr + sizeof magic_cookie > end ||
> >> +        get_unaligned_be32((const void *) in_dhcp_ptr) != magic_cookie) {
> >> +        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
> >> +        VLOG_WARN_RL(&rl, "DHCP_RELAY_RESP_FWD: magic cookie not present "
> >> +                "in the packet");
> >> +        return;
> >> +    }
> >> +
> >> +    if (!in_dhcp_data->giaddr) {
> >> +        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
> >> +        VLOG_WARN_RL(&rl, "DHCP_RELAY_RESP_FWD: giaddr is "
> >> +                    "not set in request");
> >> +        return;
> >> +    }
> >> +    ovs_be32 giaddr = in_dhcp_data->giaddr;
> >> +
> >> +    ovs_be32 *server_id_ptr = NULL;
> >> +    ovs_be32 lease_time = 0;
> >> +    const uint8_t *in_dhcp_msg_type = NULL;
> >> +
> >> +    in_dhcp_ptr += sizeof magic_cookie;
> >> +    while (in_dhcp_ptr < end) {
> >> +        const struct dhcp_opt_header *in_dhcp_opt =
> >> +            (const struct dhcp_opt_header *) in_dhcp_ptr;
> >> +        if (in_dhcp_opt->code == DHCP_OPT_END) {
> >> +            break;
> >> +        }
> >> +        if (in_dhcp_opt->code == DHCP_OPT_PAD) {
> >> +            in_dhcp_ptr += 1;
> >> +            continue;
> >> +        }
> >> +        in_dhcp_ptr += sizeof *in_dhcp_opt;
> >> +        if (in_dhcp_ptr > end) {
> >> +            break;
> >> +        }
> >> +        in_dhcp_ptr += in_dhcp_opt->len;
> >> +        if (in_dhcp_ptr > end) {
> >> +            break;
> >> +        }
> >> +
> >> +        switch (in_dhcp_opt->code) {
> >> +        case DHCP_OPT_MSG_TYPE:
> >> +            if (in_dhcp_opt->len == 1) {
> >> +                in_dhcp_msg_type = DHCP_OPT_PAYLOAD(in_dhcp_opt);
> >> +            }
> >> +            break;
> >> +        /* Server Identifier */
> >> +        case OVN_DHCP_OPT_CODE_SERVER_ID:
> >> +            if (in_dhcp_opt->len == 4) {
> >> +                server_id_ptr = DHCP_OPT_PAYLOAD(in_dhcp_opt);
> >> +            }
> >> +            break;
> >> +        case OVN_DHCP_OPT_CODE_LEASE_TIME:
> >> +            if (in_dhcp_opt->len == 4) {
> >> +                lease_time = get_unaligned_be32(DHCP_OPT_PAYLOAD(in_dhcp_opt));
> >> +            }
> >> +            break;
> >> +        default:
> >> +            break;
> >> +        }
> >> +    }
> >> +
> >> +    /* Check whether the DHCP Message Type (opt 53) is present or not */
> >> +    if (!in_dhcp_msg_type) {
> >> +        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
> >> +        VLOG_WARN_RL(&rl, "DHCP_RELAY_RESP: missing message type");
> >> +        return;
> >> +    }
> >> +
> >> +    if (!server_id_ptr) {
> >> +        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
> >> +        VLOG_WARN_RL(&rl, "DHCP_RELAY_RESP: missing server identifier");
> >> +        return;
> >> +    }
> >> +
> >> +    if (*server_id_ptr != *server_ip) {
> >> +        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
> >> +        VLOG_WARN_RL(&rl, "DHCP_RELAY_RESP: server identifier mismatch");
> >> +        return;
> >> +    }
> >> +
> >> +    if (giaddr != *relay_ip) {
> >> +        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
> >> +        VLOG_WARN_RL(&rl, "DHCP_RELAY_RESP: giaddr mismatch");
> >> +        return;
> >> +    }
> >> +
> >> +
> >> +    /* Update destination MAC & IP so that the packet is forwarded to the
> >> +     * right destination node.
> >> +     */
> >> +    uint16_t new_l4_size = in_l4_size;
> >> +    size_t new_packet_size = pkt_in->l4_ofs + new_l4_size;
> >> +
> >> +    struct dp_packet pkt_out;
> >> +    dp_packet_init(&pkt_out, new_packet_size);
> >> +    dp_packet_clear(&pkt_out);
> >> +    dp_packet_prealloc_tailroom(&pkt_out, new_packet_size);
> >> +    pkt_out_ptr = &pkt_out;
> >> +
> >> +    /* Copy the L2 and L3 headers from the pkt_in as they would remain same*/
> >> +    struct eth_header *eth = dp_packet_put(
> >> +        &pkt_out, dp_packet_pull(pkt_in, pkt_in->l4_ofs), pkt_in->l4_ofs);
> >> +
> >> +    pkt_out.l2_5_ofs = pkt_in->l2_5_ofs;
> >> +    pkt_out.l2_pad_size = pkt_in->l2_pad_size;
> >> +    pkt_out.l3_ofs = pkt_in->l3_ofs;
> >> +    pkt_out.l4_ofs = pkt_in->l4_ofs;
> >> +
> >> +    struct udp_header *udp = dp_packet_put(
> >> +        &pkt_out, dp_packet_pull(pkt_in, UDP_HEADER_LEN), UDP_HEADER_LEN);
> >> +
> >> +    struct dhcp_header *dhcp_data = dp_packet_put(
> >> +        &pkt_out, dp_packet_pull(pkt_in, new_l4_size - UDP_HEADER_LEN),
> >> +        new_l4_size - UDP_HEADER_LEN);
> >> +    memcpy(&eth->eth_dst, dhcp_data->chaddr, sizeof(eth->eth_dst));
> >> +
> >> +    /* Send a broadcast IP frame when BROADCAST flag is set. */
> >> +    struct ip_header *out_ip = dp_packet_l3(&pkt_out);
> >> +    ovs_be32 ip_dst;
> >> +    ovs_be32 ip_dst_orig = get_16aligned_be32(&out_ip->ip_dst);
> >> +    if (!is_dhcp_flags_broadcast(dhcp_data->flags)) {
> >> +        ip_dst = dhcp_data->yiaddr;
> >> +    } else {
> >> +        ip_dst = htonl(0xffffffff);
> >> +    }
> >> +    put_16aligned_be32(&out_ip->ip_dst, ip_dst);
> >> +    out_ip->ip_csum = recalc_csum32(out_ip->ip_csum,
> >> +              ip_dst_orig, ip_dst);
> >> +    if (udp->udp_csum) {
> >> +        udp->udp_csum = recalc_csum32(udp->udp_csum,
> >> +            ip_dst_orig, ip_dst);
> >> +    }
> >> +    /* Reset giaddr */
> >> +    dhcp_data->giaddr = htonl(0x0);
> >> +    if (udp->udp_csum) {
> >> +        udp->udp_csum = recalc_csum32(udp->udp_csum,
> >> +            giaddr, 0);
> >> +    }
> >> +    pin->packet = dp_packet_data(&pkt_out);
> >> +    pin->packet_len = dp_packet_size(&pkt_out);
> >> +
> >> +    /* Log the DHCP message. */
> >> +    static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(20, 40);
> >> +    const struct eth_header *l2 = dp_packet_eth(&pkt_out);
> >> +    VLOG_INFO_RL(&rl, "DHCP_RELAY_RESP_FWD:: MSG_TYPE:%s MAC:"ETH_ADDR_FMT
> >> +             " XID:%u"
> >> +             " YIADDR:"IP_FMT
> >> +             " GIADDR:"IP_FMT
> >> +             " SERVER_ADDR:"IP_FMT,
> >> +             dhcp_msg_str_get(*in_dhcp_msg_type),
> >> +             ETH_ADDR_BYTES_ARGS(dhcp_data->chaddr), ntohl(dhcp_data->xid),
> >> +             IP_ARGS(dhcp_data->yiaddr),
> >> +             IP_ARGS(giaddr), IP_ARGS(*server_id_ptr));
> >> +    queue_msg(swconn, ofputil_encode_resume(pin, continuation, proto));
> >> +    if (pkt_out_ptr) {
> >> +        dp_packet_uninit(pkt_out_ptr);
> >> +    }
> >> +}
> >> +
> >> /* Called with in the pinctrl_handler thread context. */
> >> static void
> >> pinctrl_handle_put_dhcp_opts(
> >> @@ -3203,6 +3634,16 @@ process_packet_in(struct rconn *swconn, const struct ofp_header *msg)
> >>         ovs_mutex_unlock(&pinctrl_mutex);
> >>         break;
> >>
> >> +    case ACTION_OPCODE_DHCP_RELAY_REQ:
> >> +        pinctrl_handle_dhcp_relay_req(swconn, &packet, &pin,
> >> +                                     &userdata, &continuation);
> >> +        break;
> >> +
> >> +    case ACTION_OPCODE_DHCP_RELAY_RESP_FWD:
> >> +        pinctrl_handle_dhcp_relay_resp_fwd(swconn, &packet, &pin,
> >> +                                     &userdata, &continuation);
> >> +        break;
> >> +
> >>     case ACTION_OPCODE_PUT_DHCP_OPTS:
> >>         pinctrl_handle_put_dhcp_opts(swconn, &packet, &pin, &headers,
> >>                                      &userdata, &continuation);
> >> diff --git a/include/ovn/actions.h b/include/ovn/actions.h
> >> index 49cfe0624..47d41b90f 100644
> >> --- a/include/ovn/actions.h
> >> +++ b/include/ovn/actions.h
> >> @@ -95,6 +95,8 @@ struct collector_set_ids;
> >>     OVNACT(LOOKUP_ND_IP,      ovnact_lookup_mac_bind_ip) \
> >>     OVNACT(PUT_DHCPV4_OPTS,   ovnact_put_opts)        \
> >>     OVNACT(PUT_DHCPV6_OPTS,   ovnact_put_opts)        \
> >> +    OVNACT(DHCPV4_RELAY_REQ,  ovnact_dhcp_relay)      \
> >> +    OVNACT(DHCPV4_RELAY_RESP_FWD, ovnact_dhcp_relay)      \
> >>     OVNACT(SET_QUEUE,         ovnact_set_queue)       \
> >>     OVNACT(DNS_LOOKUP,        ovnact_result)          \
> >>     OVNACT(LOG,               ovnact_log)             \
> >> @@ -387,6 +389,14 @@ struct ovnact_put_opts {
> >>     size_t n_options;
> >> };
> >>
> >> +/* OVNACT_DHCP_RELAY. */
> >> +struct ovnact_dhcp_relay {
> >> +    struct ovnact ovnact;
> >> +    int family;
> >> +    ovs_be32 relay_ipv4;
> >> +    ovs_be32 server_ipv4;
> >> +};
> >> +
> >> /* Valid arguments to SET_QUEUE action.
> >>  *
> >>  * QDISC_MIN_QUEUE_ID is the default queue, so user-defined queues should
> >> @@ -750,6 +760,22 @@ enum action_opcode {
> >>
> >>     /* multicast group split buffer action. */
> >>     ACTION_OPCODE_MG_SPLIT_BUF,
> >> +
> >> +    /* "dhcp_relay_req(relay_ip, server_ip)".
> >> +     *
> >> +     * Arguments follow the action_header, in this format:
> >> +     *   - The 32-bit DHCP relay IP.
> >> +     *   - The 32-bit DHCP server IP.
> >> +     */
> >> +    ACTION_OPCODE_DHCP_RELAY_REQ,
> >> +
> >> +    /* "dhcp_relay_resp_fwd(relay_ip, server_ip)".
> >> +     *
> >> +     * Arguments follow the action_header, in this format:
> >> +     *   - The 32-bit DHCP relay IP.
> >> +     *   - The 32-bit DHCP server IP.
> >> +     */
> >> +    ACTION_OPCODE_DHCP_RELAY_RESP_FWD,
> >> };
> >>
> >> /* Header. */
> >> diff --git a/lib/actions.c b/lib/actions.c
> >> index a73fe1a1e..69df428c6 100644
> >> --- a/lib/actions.c
> >> +++ b/lib/actions.c
> >> @@ -2629,6 +2629,118 @@ ovnact_controller_event_free(struct ovnact_controller_event *event)
> >>     free_gen_options(event->options, event->n_options);
> >> }
> >>
> >> +static void
> >> +format_DHCPV4_RELAY_REQ(const struct ovnact_dhcp_relay *dhcp_relay,
> >> +                struct ds *s)
> >> +{
> >> +    ds_put_format(s, "dhcp_relay_req("IP_FMT","IP_FMT");",
> >> +                  IP_ARGS(dhcp_relay->relay_ipv4),
> >> +                  IP_ARGS(dhcp_relay->server_ipv4));
> >> +}
> >> +
> >> +static void
> >> +parse_dhcp_relay_req(struct action_context *ctx,
> >> +               struct ovnact_dhcp_relay *dhcp_relay)
> >> +{
> >> +    /* Skip dhcp_relay_req( */
> >> +    lexer_force_match(ctx->lexer, LEX_T_LPAREN);
> >> +
> >> +    /* Parse relay ip and server ip. */
> >> +    if (ctx->lexer->token.format == LEX_F_IPV4) {
> >> +        dhcp_relay->family = AF_INET;
> >> +        dhcp_relay->relay_ipv4 = ctx->lexer->token.value.ipv4;
> >> +        lexer_get(ctx->lexer);
> >> +        lexer_match(ctx->lexer, LEX_T_COMMA);
> >> +        if (ctx->lexer->token.format == LEX_F_IPV4) {
> >> +            dhcp_relay->family = AF_INET;
> >> +            dhcp_relay->server_ipv4 = ctx->lexer->token.value.ipv4;
> >> +            lexer_get(ctx->lexer);
> >> +        } else {
> >> +            lexer_syntax_error(ctx->lexer, "expecting IPv4 dhcp server ip");
> >> +            return;
> >> +        }
> >> +    } else {
> >> +          lexer_syntax_error(ctx->lexer, "expecting IPv4 dhcp relay "
> >> +                          "and server ips");
> >> +          return;
> >> +    }
> >> +    lexer_force_match(ctx->lexer, LEX_T_RPAREN);
> >> +}
> >> +
> >> +static void
> >> +encode_DHCPV4_RELAY_REQ(const struct ovnact_dhcp_relay *dhcp_relay,
> >> +                    const struct ovnact_encode_params *ep,
> >> +                    struct ofpbuf *ofpacts)
> >> +{
> >> +    size_t oc_offset = encode_start_controller_op(ACTION_OPCODE_DHCP_RELAY_REQ,
> >> +                                                  true, ep->ctrl_meter_id,
> >> +                                                  ofpacts);
> >> +    ofpbuf_put(ofpacts, &dhcp_relay->relay_ipv4,
> >> +            sizeof(dhcp_relay->relay_ipv4));
> >> +    ofpbuf_put(ofpacts, &dhcp_relay->server_ipv4,
> >> +            sizeof(dhcp_relay->server_ipv4));
> >> +    encode_finish_controller_op(oc_offset, ofpacts);
> >> +}
> >> +
> >> +static void
> >> +format_DHCPV4_RELAY_RESP_FWD(const struct ovnact_dhcp_relay *dhcp_relay,
> >> +                    struct ds *s)
> >> +{
> >> +    ds_put_format(s, "dhcp_relay_resp_fwd("IP_FMT","IP_FMT");",
> >> +                  IP_ARGS(dhcp_relay->relay_ipv4),
> >> +                  IP_ARGS(dhcp_relay->server_ipv4));
> >> +}
> >> +
> >> +static void
> >> +parse_dhcp_relay_resp_fwd(struct action_context *ctx,
> >> +               struct ovnact_dhcp_relay *dhcp_relay)
> >> +{
> >> +    /* Skip dhcp_relay_resp_fwd( */
> >> +    lexer_force_match(ctx->lexer, LEX_T_LPAREN);
> >> +
> >> +    /* Parse relay ip and server ip. */
> >> +    if (ctx->lexer->token.format == LEX_F_IPV4) {
> >> +        dhcp_relay->family = AF_INET;
> >> +        dhcp_relay->relay_ipv4 = ctx->lexer->token.value.ipv4;
> >> +        lexer_get(ctx->lexer);
> >> +        if (!lexer_force_match(ctx->lexer, LEX_T_COMMA)) {
> >> +            return;
> >> +        }
> >> +        if (ctx->lexer->token.format == LEX_F_IPV4) {
> >> +            dhcp_relay->family = AF_INET;
> >> +            dhcp_relay->server_ipv4 = ctx->lexer->token.value.ipv4;
> >> +            lexer_get(ctx->lexer);
> >> +        } else {
> >> +            lexer_syntax_error(ctx->lexer, "expecting IPv4 dhcp server ip");
> >> +            return;
> >> +        }
> >> +    } else {
> >> +        lexer_syntax_error(ctx->lexer, "expecting IPv4 dhcp relay and "
> >> +                           "server ips");
> >> +        return;
> >> +    }
> >> +    lexer_force_match(ctx->lexer, LEX_T_RPAREN);
> >> +}
> >> +
> >> +static void
> >> +encode_DHCPV4_RELAY_RESP_FWD(const struct ovnact_dhcp_relay *dhcp_relay,
> >> +                    const struct ovnact_encode_params *ep,
> >> +                    struct ofpbuf *ofpacts)
> >> +{
> >> +    size_t oc_offset = encode_start_controller_op(
> >> +                                ACTION_OPCODE_DHCP_RELAY_RESP_FWD,
> >> +                                true, ep->ctrl_meter_id,
> >> +                                ofpacts);
> >> +    ofpbuf_put(ofpacts, &dhcp_relay->relay_ipv4,
> >> +                  sizeof(dhcp_relay->relay_ipv4));
> >> +    ofpbuf_put(ofpacts, &dhcp_relay->server_ipv4,
> >> +                  sizeof(dhcp_relay->server_ipv4));
> >> +    encode_finish_controller_op(oc_offset, ofpacts);
> >> +}
> >> +
> >> +static void ovnact_dhcp_relay_free(
> >> +          struct ovnact_dhcp_relay *dhcp_relay OVS_UNUSED)
> >> +{
> >> +}
> >> +
> >> static void
> >> parse_put_opts(struct action_context *ctx, const struct expr_field *dst,
> >>                struct ovnact_put_opts *po, const struct hmap *gen_opts,
> >> @@ -5451,6 +5563,11 @@ parse_action(struct action_context *ctx)
> >>         parse_sample(ctx);
> >>     } else if (lexer_match_id(ctx->lexer, "mac_cache_use")) {
> >>         ovnact_put_MAC_CACHE_USE(ctx->ovnacts);
> >> +    } else if (lexer_match_id(ctx->lexer, "dhcp_relay_req")) {
> >> +        parse_dhcp_relay_req(ctx, ovnact_put_DHCPV4_RELAY_REQ(ctx->ovnacts));
> >> +    } else if (lexer_match_id(ctx->lexer, "dhcp_relay_resp_fwd")) {
> >> +        parse_dhcp_relay_resp_fwd(ctx,
> >> +              ovnact_put_DHCPV4_RELAY_RESP_FWD(ctx->ovnacts));
> >>     } else {
> >>         lexer_syntax_error(ctx->lexer, "expecting action");
> >>     }
> >> diff --git a/lib/ovn-l7.h b/lib/ovn-l7.h
> >> index ad514a922..e08581123 100644
> >> --- a/lib/ovn-l7.h
> >> +++ b/lib/ovn-l7.h
> >> @@ -69,6 +69,7 @@ struct gen_opts_map {
> >>  */
> >> #define OVN_DHCP_OPT_CODE_NETMASK      1
> >> #define OVN_DHCP_OPT_CODE_LEASE_TIME   51
> >> +#define OVN_DHCP_OPT_CODE_SERVER_ID    54
> >> #define OVN_DHCP_OPT_CODE_T1           58
> >> #define OVN_DHCP_OPT_CODE_T2           59
> >>
> >> diff --git a/northd/northd.c b/northd/northd.c
> >> index 07dffb15a..7ac831fae 100644
> >> --- a/northd/northd.c
> >> +++ b/northd/northd.c
> >> @@ -181,11 +181,13 @@ enum ovn_stage {
> >>     PIPELINE_STAGE(ROUTER, IN,  IP_ROUTING_ECMP, 14, "lr_in_ip_routing_ecmp") \
> >>     PIPELINE_STAGE(ROUTER, IN,  POLICY,          15, "lr_in_policy")          \
> >>     PIPELINE_STAGE(ROUTER, IN,  POLICY_ECMP,     16, "lr_in_policy_ecmp")     \
> >> -    PIPELINE_STAGE(ROUTER, IN,  ARP_RESOLVE,     17, "lr_in_arp_resolve")     \
> >> -    PIPELINE_STAGE(ROUTER, IN,  CHK_PKT_LEN,     18, "lr_in_chk_pkt_len")     \
> >> -    PIPELINE_STAGE(ROUTER, IN,  LARGER_PKTS,     19, "lr_in_larger_pkts")     \
> >> -    PIPELINE_STAGE(ROUTER, IN,  GW_REDIRECT,     20, "lr_in_gw_redirect")     \
> >> -    PIPELINE_STAGE(ROUTER, IN,  ARP_REQUEST,     21, "lr_in_arp_request")     \
> >> +    PIPELINE_STAGE(ROUTER, IN,  DHCP_RELAY_RESP_FWD, 17,                      \
> >> +                  "lr_in_dhcp_relay_resp_fwd")                                \
> >> +    PIPELINE_STAGE(ROUTER, IN,  ARP_RESOLVE,     18, "lr_in_arp_resolve")     \
> >> +    PIPELINE_STAGE(ROUTER, IN,  CHK_PKT_LEN,     19, "lr_in_chk_pkt_len")     \
> >> +    PIPELINE_STAGE(ROUTER, IN,  LARGER_PKTS,     20, "lr_in_larger_pkts")     \
> >> +    PIPELINE_STAGE(ROUTER, IN,  GW_REDIRECT,     21, "lr_in_gw_redirect")     \
> >> +    PIPELINE_STAGE(ROUTER, IN,  ARP_REQUEST,     22, "lr_in_arp_request")     \
> >>                                                                       \
> >>     /* Logical router egress stages. */                               \
> >>     PIPELINE_STAGE(ROUTER, OUT, CHECK_DNAT_LOCAL,   0,                       \
> >> @@ -9610,6 +9612,80 @@ build_dhcpv6_options_flows(struct ovn_port *op,
> >>     ds_destroy(&match);
> >> }
> >>
> >> +static void
> >> +build_lswitch_dhcp_relay_flows(struct ovn_port *op,
> >> +                           const struct hmap *lr_ports,
> >> +                           const struct hmap *lflows,
> >> +                           const struct shash *meter_groups OVS_UNUSED)
> >> +{
> >> +    if (op->nbrp || !op->nbsp) {
> >> +        return;
> >> +    }
> >> +    /* Consider only ports attached to VMs. */
> >> +    if (strcmp(op->nbsp->type, "")) {
> >> +        return;
> >> +    }
> >> +
> >> +    if (!op->od || !op->od->n_router_ports ||
> >> +        !op->od->nbs || !op->od->nbs->dhcp_relay_port) {
> >> +        return;
> >> +    }
> >> +
> >> +    struct ds match = DS_EMPTY_INITIALIZER;
> >> +    struct ds action = DS_EMPTY_INITIALIZER;
> >> +    struct nbrec_logical_router_port *lrp = op->od->nbs->dhcp_relay_port;
> >> +    struct ovn_port *rp = ovn_port_find(lr_ports, lrp->name);
> >> +
> >> +    if (!rp || !rp->nbrp || !rp->nbrp->dhcp_relay) {
> >> +        return;
> >> +    }
> >> +
> >> +    struct ovn_port *sp = NULL;
> >> +    struct nbrec_dhcp_relay *dhcp_relay = rp->nbrp->dhcp_relay;
> >> +
> >> +    for (int i = 0; i < op->od->n_router_ports; i++) {
> >> +        struct ovn_port *sp_tmp = op->od->router_ports[i];
> >> +        if (sp_tmp->peer == rp) {
> >> +            sp = sp_tmp;
> >> +            break;
> >> +        }
> >> +    }
> >> +    if (!sp) {
> >> +        return;
> >> +    }
> >> +
> >> +    char *server_ip_str = NULL;
> >> +    uint16_t port;
> >> +    int addr_family;
> >> +    struct in6_addr server_ip;
> >> +
> >> +    if (!ip_address_and_port_from_lb_key(dhcp_relay->servers, &server_ip_str,
> >> +                                         &server_ip, &port, &addr_family)) {
> >> +        return;
> >> +    }
> >> +
> >> +    if (server_ip_str == NULL) {
> >> +        return;
> >> +    }
> >> +
> >> +    ds_put_format(
> >> +        &match, "inport == %s && eth.src == %s && "
> >> +        "ip4.src == 0.0.0.0 && ip4.dst == 255.255.255.255 && "
> >> +        "udp.src == 68 && udp.dst == 67",
> >> +        op->json_key, op->lsp_addrs[0].ea_s);
> >> +    ds_put_format(&action,
> >> +                  "eth.dst=%s;outport=%s;next;/* DHCP_RELAY_REQ */",
> >> +                  rp->lrp_networks.ea_s, sp->json_key);
> >> +    ovn_lflow_add_with_hint__(lflows, op->od,
> >> +                              S_SWITCH_IN_L2_LKUP, 100,
> >> +                              ds_cstr(&match),
> >> +                              ds_cstr(&action),
> >> +                              op->key,
> >> +                              NULL,
> >> +                              &lrp->header_);
> >> +    ds_destroy(&match);
> >> +    ds_destroy(&action);
> >> +    free(server_ip_str);
> >> +}
> >> +
> >> static void
> >> build_drop_arp_nd_flows_for_unbound_router_ports(struct ovn_port *op,
> >>                                                  const struct ovn_port *port,
> >> @@ -10181,6 +10257,13 @@ build_lswitch_dhcp_options_and_response(struct ovn_port *op,
> >>         return;
> >>     }
> >>
> >> +    if (op->od && op->od->nbs
> >> +        && op->od->nbs->dhcp_relay_port) {
> >> +        /* Don't add the DHCP server flows if DHCP Relay is enabled on the
> >> +         * logical switch. */
> >> +        return;
> >> +    }
> >> +
> >>     bool is_external = lsp_is_external(op->nbsp);
> >>     if (is_external && (!op->od->n_localnet_ports ||
> >>                         !op->nbsp->ha_chassis_group)) {
> >> @@ -14458,6 +14541,86 @@ build_dhcpv6_reply_flows_for_lrouter_port(
> >>     }
> >> }
> >>
> >> +static void
> >> +build_dhcp_relay_flows_for_lrouter_port(
> >> +        struct ovn_port *op, struct hmap *lflows,
> >> +        struct ds *match)
> >> +{
> >> +    if (!op->nbrp || !op->nbrp->dhcp_relay) {
> >> +        return;
> >> +    }
> >> +    struct nbrec_dhcp_relay *dhcp_relay = op->nbrp->dhcp_relay;
> >> +    if (!dhcp_relay->servers) {
> >> +        return;
> >> +    }
> >> +
> >> +    int addr_family;
> >> +    /* A custom server port is not supported currently. */
> >> +    uint16_t port;
> >> +    char *server_ip_str = NULL;
> >> +    struct in6_addr server_ip;
> >> +
> >> +    if (!ip_address_and_port_from_lb_key(dhcp_relay->servers, &server_ip_str,
> >> +                                         &server_ip, &port, &addr_family)) {
> >> +        return;
> >> +    }
> >> +
> >> +    if (server_ip_str == NULL) {
> >> +        return;
> >> +    }
> >> +
> >> +    struct ds dhcp_action = DS_EMPTY_INITIALIZER;
> >> +    ds_clear(match);
> >> +    ds_put_format(
> >> +        match, "inport == %s && "
> >> +        "ip4.src == 0.0.0.0 && ip4.dst == 255.255.255.255 && "
> >> +        "udp.src == 68 && udp.dst == 67",
> >> +        op->json_key);
> >> +    ds_put_format(&dhcp_action,
> >> +                "dhcp_relay_req(%s,%s);"
> >> +                "ip4.src=%s;ip4.dst=%s;udp.src=67;next; /* DHCP_RELAY_REQ */",
> >> +                op->lrp_networks.ipv4_addrs[0].addr_s, server_ip_str,
> >> +                op->lrp_networks.ipv4_addrs[0].addr_s, server_ip_str);
> >> +
> >> +    ovn_lflow_add_with_hint(lflows, op->od, S_ROUTER_IN_IP_INPUT, 110,
> >> +                            ds_cstr(match), ds_cstr(&dhcp_action),
> >> +                            &op->nbrp->header_);
> >> +
> >> +    ds_clear(match);
> >> +    ds_clear(&dhcp_action);
> >> +
> >> +    ds_put_format(
> >> +        match, "ip4.src == %s && ip4.dst == %s && "
> >> +        "udp.src == 67 && udp.dst == 67",
> >> +        server_ip_str, op->lrp_networks.ipv4_addrs[0].addr_s);
> >> +    ds_put_format(&dhcp_action, "next;/* DHCP_RELAY_RESP */");
> >> +    ovn_lflow_add_with_hint(lflows, op->od, S_ROUTER_IN_IP_INPUT, 110,
> >> +                            ds_cstr(match), ds_cstr(&dhcp_action),
> >> +                            &op->nbrp->header_);
> >> +
> >> +    ds_clear(match);
> >> +    ds_clear(&dhcp_action);
> >> +
> >> +    ds_put_format(
> >> +        match, "ip4.src == %s && ip4.dst == %s && "
> >> +        "udp.src == 67 && udp.dst == 67",
> >> +        server_ip_str, op->lrp_networks.ipv4_addrs[0].addr_s);
> >> +    ds_put_format(&dhcp_action,
> >> +          "dhcp_relay_resp_fwd(%s,%s);ip4.src=%s;udp.dst=68;"
> >> +          "outport=%s;output; /* DHCP_RELAY_RESP */",
> >> +          op->lrp_networks.ipv4_addrs[0].addr_s, server_ip_str,
> >> +          op->lrp_networks.ipv4_addrs[0].addr_s, op->json_key);
> >> +    ovn_lflow_add_with_hint(lflows, op->od, S_ROUTER_IN_DHCP_RELAY_RESP_FWD,
> >> +                            110,
> >> +                            ds_cstr(match), ds_cstr(&dhcp_action),
> >> +                            &op->nbrp->header_);
> >> +
> >> +    ds_clear(match);
> >> +    ds_destroy(&dhcp_action);
> >> +
> >> +    free(server_ip_str);
> >> +}
> >> +
> >> static void
> >> build_ipv6_input_flows_for_lrouter_port(
> >>         struct ovn_port *op, struct hmap *lflows,
> >> @@ -15673,6 +15836,8 @@ build_lrouter_nat_defrag_and_lb(struct ovn_datapath *od, struct hmap *lflows,
> >>     ovn_lflow_add(lflows, od, S_ROUTER_OUT_POST_SNAT, 0, "1", "next;");
> >>     ovn_lflow_add(lflows, od, S_ROUTER_OUT_EGR_LOOP, 0, "1", "next;");
> >>     ovn_lflow_add(lflows, od, S_ROUTER_IN_ECMP_STATEFUL, 0, "1", "next;");
> >> +    ovn_lflow_add(lflows, od, S_ROUTER_IN_DHCP_RELAY_RESP_FWD, 0, "1",
> >> +                  "next;");
> >>
> >>     const char *ct_flag_reg = features->ct_no_masked_label
> >>                               ? "ct_mark"
> >> @@ -16154,6 +16319,7 @@ build_lswitch_and_lrouter_iterate_by_lsp(struct ovn_port *op,
> >>     build_lswitch_dhcp_options_and_response(op, lflows, meter_groups);
> >>     build_lswitch_external_port(op, lflows);
> >>     build_lswitch_ip_unicast_lookup(op, lflows, actions, match);
> >> +    build_lswitch_dhcp_relay_flows(op, lr_ports, lflows, meter_groups);
> >>
> >>     /* Build Logical Router Flows. */
> >>     build_ip_routing_flows_for_router_type_lsp(op, lr_ports, lflows);
> >> @@ -16183,6 +16349,7 @@ build_lswitch_and_lrouter_iterate_by_lrp(struct ovn_port *op,
> >>     build_egress_delivery_flows_for_lrouter_port(op, lsi->lflows, &lsi->match,
> >>                                                  &lsi->actions);
> >>     build_dhcpv6_reply_flows_for_lrouter_port(op, lsi->lflows, &lsi->match);
> >> +    build_dhcp_relay_flows_for_lrouter_port(op, lsi->lflows, &lsi->match);
> >>     build_ipv6_input_flows_for_lrouter_port(op, lsi->lflows,
> >>                                             &lsi->match, &lsi->actions,
> >>                                             lsi->meter_groups);
> >> diff --git a/ovn-nb.ovsschema b/ovn-nb.ovsschema
> >> index b2e0993e0..6863d52cd 100644
> >> --- a/ovn-nb.ovsschema
> >> +++ b/ovn-nb.ovsschema
> >> @@ -1,7 +1,7 @@
> >> {
> >>     "name": "OVN_Northbound",
> >> -    "version": "7.2.0",
> >> -    "cksum": "1069338687 34162",
> >> +    "version": "7.3.0",
> >> +    "cksum": "2325497400 35185",
> >>     "tables": {
> >>         "NB_Global": {
> >>             "columns": {
> >> @@ -89,7 +89,12 @@
> >>                     "type": {"key": {"type": "uuid",
> >>                                      "refTable": "Forwarding_Group",
> >>                                      "refType": "strong"},
> >> -                                     "min": 0, "max": "unlimited"}}},
> >> +                                     "min": 0, "max": "unlimited"}},
> >> +                "dhcp_relay_port": {"type": {"key": {"type": "uuid",
> >> +                                            "refTable": "Logical_Router_Port",
> >> +                                            "refType": "weak"},
> >> +                                            "min": 0,
> >> +                                            "max": 1}}},
> >>             "isRoot": true},
> >>         "Logical_Switch_Port": {
> >>             "columns": {
> >> @@ -436,6 +441,11 @@
> >>                 "ipv6_prefix": {"type": {"key": "string",
> >>                                       "min": 0,
> >>                                       "max": "unlimited"}},
> >> +                "dhcp_relay": {"type": {"key": {"type": "uuid",
> >> +                                            "refTable": "DHCP_Relay",
> >> +                                            "refType": "weak"},
> >> +                                            "min": 0,
> >> +                                            "max": 1}},
> >>                 "external_ids": {
> >>                     "type": {"key": "string", "value": "string",
> >>                              "min": 0, "max": "unlimited"}},
> >> @@ -529,6 +539,15 @@
> >>                     "type": {"key": "string", "value": "string",
> >>                              "min": 0, "max": "unlimited"}}},
> >>             "isRoot": true},
> >> +        "DHCP_Relay": {
> >> +            "columns": {
> >> +                "servers": {"type": {"key": "string",
> >> +                                       "min": 0,
> >> +                                       "max": 1}},
> >> +                "external_ids": {
> >> +                    "type": {"key": "string", "value": "string",
> >> +                             "min": 0, "max": "unlimited"}}},
> >> +            "isRoot": true},
> >>         "Connection": {
> >>             "columns": {
> >>                 "target": {"type": "string"},
> >> diff --git a/ovn-nb.xml b/ovn-nb.xml
> >> index fcb1c6ecc..dc20892e1 100644
> >> --- a/ovn-nb.xml
> >> +++ b/ovn-nb.xml
> >> @@ -608,6 +608,11 @@
> >>       Please see the <ref table="DNS"/> table.
> >>     </column>
> >>
> >> +    <column name="dhcp_relay_port">
> >> +      This column defines the <ref table="Logical_Router_Port"/> on which
> >> +      DHCP relay is enabled.
> >> +    </column>
> >> +
> >>     <column name="forwarding_groups">
> >>       Groups a set of logical port endpoints for traffic going out of the
> >>       logical switch.
> >> @@ -2980,6 +2985,11 @@ or
> >>       port has all ingress and egress traffic dropped.
> >>     </column>
> >>
> >> +    <column name="dhcp_relay">
> >> +      This column is used to enable DHCP Relay. Please refer
> >> +      to the <ref table="DHCP_Relay"/> table.
> >> +    </column>
> >> +
> >>     <group title="Distributed Gateway Ports">
> >>       <p>
> >>         Gateways, as documented under <code>Gateways</code> in the OVN
> >> @@ -4286,6 +4296,24 @@ or
> >>     </group>
> >>   </table>
> >>
> >> +  <table name="DHCP_Relay" title="DHCP Relay">
> >> +    <p>
> >> +      OVN implements native DHCPv4 relay support, which caters to the common
> >> +      use case of relaying DHCP requests to an external DHCP server.
> >> +    </p>
> >> +
> >> +    <column name="servers">
> >> +      <p>
> >> +        The DHCPv4 server IP address.
> >> +      </p>
> >> +    </column>
> >> +    <group title="Common Columns">
> >> +      <column name="external_ids">
> >> +        See <em>External IDs</em> at the beginning of this document.
> >> +      </column>
> >> +    </group>
> >> +  </table>
> >> +
> >>   <table name="Connection" title="OVSDB client connections.">
> >>     <p>
> >>       Configuration for a database connection to an Open vSwitch database
> >> diff --git a/tests/atlocal.in b/tests/atlocal.in
> >> index 63d891b89..32d1c374e 100644
> >> --- a/tests/atlocal.in
> >> +++ b/tests/atlocal.in
> >> @@ -187,6 +187,9 @@ fi
> >> # Set HAVE_DHCPD
> >> find_command dhcpd
> >>
> >> +# Set HAVE_DHCLIENT
> >> +find_command dhclient
> >> +
> >> # Set HAVE_BFDD_BEACON
> >> find_command bfdd-beacon
> >>
> >> diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
> >> index 19e4f1263..4d8c9ff26 100644
> >> --- a/tests/ovn-northd.at
> >> +++ b/tests/ovn-northd.at
> >> @@ -8786,9 +8786,9 @@ ovn-nbctl --wait=sb set logical_router_port R1-PUB options:redirect-type=bridged
> >> ovn-sbctl dump-flows R1 > R1flows
> >> AT_CAPTURE_FILE([R1flows])
> >>
> >> -AT_CHECK([grep "lr_in_arp_resolve" R1flows | grep priority=90 | sort], [0], [dnl
> >> -  table=17(lr_in_arp_resolve  ), priority=90   , match=(outport == "R1-PUB" && ip4.src == 10.0.0.3 && is_chassis_resident("S0-P0")), action=(get_arp(outport, reg0); next;)
> >> -  table=17(lr_in_arp_resolve  ), priority=90   , match=(outport == "R1-PUB" && ip6.src == 1000::3 && is_chassis_resident("S0-P0")), action=(get_nd(outport, xxreg0); next;)
> >> +AT_CHECK([grep "lr_in_arp_resolve" R1flows | grep priority=90 | sed 's/table=../table=??/' | sort], [0], [dnl
> >> +  table=??(lr_in_arp_resolve  ), priority=90   , match=(outport == "R1-PUB" && ip4.src == 10.0.0.3 && is_chassis_resident("S0-P0")), action=(get_arp(outport, reg0); next;)
> >> +  table=??(lr_in_arp_resolve  ), priority=90   , match=(outport == "R1-PUB" && ip6.src == 1000::3 && is_chassis_resident("S0-P0")), action=(get_nd(outport, xxreg0); next;)
> >> ])
> >>
> >> AT_CLEANUP
> >> @@ -10966,3 +10966,38 @@ Status: active
> >>
> >> AT_CLEANUP
> >> ])
> >> +
> >> +OVN_FOR_EACH_NORTHD_NO_HV([
> >> +AT_SETUP([check DHCP RELAY AGENT])
> >> +ovn_start NORTHD_TYPE
> >> +
> >> +check ovn-nbctl ls-add ls0
> >> +check ovn-nbctl lsp-add ls0 ls0-port1
> >> +check ovn-nbctl lsp-set-addresses ls0-port1 02:00:00:00:00:10
> >> +check ovn-nbctl lr-add lr0
> >> +check ovn-nbctl lrp-add lr0 lrp1 02:00:00:00:00:01 192.168.1.1/24
> >> +check ovn-nbctl lsp-add ls0 lrp1-attachment
> >> +check ovn-nbctl lsp-set-type lrp1-attachment router
> >> +check ovn-nbctl lsp-set-addresses lrp1-attachment 00:00:00:00:ff:02
> >> +check ovn-nbctl lsp-set-options lrp1-attachment router-port=lrp1
> >> +check ovn-nbctl lrp-add lr0 lrp-ext 02:00:00:00:00:02 192.168.2.1/24
> >> +
> >> +dhcp_relay=$(ovn-nbctl create DHCP_Relay servers=172.16.1.1)
> >> +check ovn-nbctl set Logical_Router_port lrp1 dhcp_relay=$dhcp_relay
> >> +rp_uuid=$(ovn-nbctl --bare --columns=_uuid list logical_router_port lrp1)
> >> +check ovn-nbctl set Logical_Switch ls0 dhcp_relay_port=$rp_uuid
> >> +
> >> +check ovn-nbctl --wait=sb sync
> >> +
> >> +ovn-sbctl lflow-list > lflows
> >> +AT_CAPTURE_FILE([lflows])
> >> +
> >> +AT_CHECK([grep -e "DHCP_RELAY_" lflows | sed 's/table=../table=??/'], [0], [dnl
> >> +  table=??(lr_in_ip_input     ), priority=110  , match=(inport == "lrp1" && ip4.src == 0.0.0.0 && ip4.dst == 255.255.255.255 && udp.src == 68 && udp.dst == 67), action=(dhcp_relay_req(192.168.1.1,172.16.1.1);ip4.src=192.168.1.1;ip4.dst=172.16.1.1;udp.src=67;next; /* DHCP_RELAY_REQ */)
> >> +  table=??(lr_in_ip_input     ), priority=110  , match=(ip4.src == 172.16.1.1 && ip4.dst == 192.168.1.1 && udp.src == 67 && udp.dst == 67), action=(next;/* DHCP_RELAY_RESP */)
> >> +  table=??(lr_in_dhcp_relay_resp_fwd), priority=110  , match=(ip4.src == 172.16.1.1 && ip4.dst == 192.168.1.1 && udp.src == 67 && udp.dst == 67), action=(dhcp_relay_resp_fwd(192.168.1.1,172.16.1.1);ip4.src=192.168.1.1;udp.dst=68;outport="lrp1";output; /* DHCP_RELAY_RESP */)
> >> +  table=??(ls_in_l2_lkup      ), priority=100  , match=(inport == "ls0-port1" && eth.src == 02:00:00:00:00:10 && ip4.src == 0.0.0.0 && ip4.dst == 255.255.255.255 && udp.src == 68 && udp.dst == 67), action=(eth.dst=02:00:00:00:00:01;outport="lrp1-attachment";next;/* DHCP_RELAY_REQ */)
> >> +])
> >> +
> >> +AT_CLEANUP
> >> +])
> >> diff --git a/tests/ovn.at b/tests/ovn.at
> >> index e8c79512b..839c07ce2 100644
> >> --- a/tests/ovn.at
> >> +++ b/tests/ovn.at
> >> @@ -21905,7 +21905,7 @@ eth_dst=00000000ff01
> >> ip_src=$(ip_to_hex 10 0 0 10)
> >> ip_dst=$(ip_to_hex 172 168 0 101)
> >> send_icmp_packet 1 1 $eth_src $eth_dst $ip_src $ip_dst c4c9 0000000000000000000000
> >> -AT_CHECK_UNQUOTED([as hv1 ovs-ofctl dump-flows br-int metadata=0x$lr0_dp_key | awk '/table=28, n_packets=1, n_bytes=45/{print $7" "$8}'],[0],[dnl
> >> +AT_CHECK_UNQUOTED([as hv1 ovs-ofctl dump-flows br-int metadata=0x$lr0_dp_key | awk '/table=29, n_packets=1, n_bytes=45/{print $7" "$8}'],[0],[dnl
> >> priority=80,ip,reg15=0x$lr0_public_dp_key,metadata=0x$lr0_dp_key,nw_src=10.0.0.10 actions=drop
> >> ])
> >>
> >> @@ -28964,7 +28964,7 @@ AT_CHECK([
> >>         grep "priority=100" | \
> >>         grep -c "ct(commit,zone=NXM_NX_REG11\\[[0..15\\]],.*exec(move:NXM_OF_ETH_SRC\\[[\\]]->NXM_NX_CT_LABEL\\[[32..79\\]],load:0x[[0-9]]->NXM_NX_CT_MARK\\[[16..31\\]]))"
> >>
> >> -        grep table=25 hv${hv}flows | \
> >> +        grep table=26 hv${hv}flows | \
> >>         grep "priority=200" | \
> >>         grep -c "move:NXM_NX_CT_LABEL\\[[\\]]->NXM_NX_XXREG1\\[[\\]],move:NXM_NX_XXREG1\\[[32..79\\]]->NXM_OF_ETH_DST"
> >>     done; :], [0], [dnl
> >> @@ -29089,7 +29089,7 @@ AT_CHECK([
> >>         grep "priority=100" | \
> >>         grep -c "ct(commit,zone=NXM_NX_REG11\\[[0..15\\]],.*exec(move:NXM_OF_ETH_SRC\\[[\\]]->NXM_NX_CT_LABEL\\[[32..79\\]],load:0x[[0-9]]->NXM_NX_CT_MARK\\[[16..31\\]]))"
> >>
> >> -        grep table=25 hv${hv}flows | \
> >> +        grep table=26 hv${hv}flows | \
> >>         grep "priority=200" | \
> >>         grep -c "move:NXM_NX_CT_LABEL\\[[\\]]->NXM_NX_XXREG1\\[[\\]],move:NXM_NX_XXREG1\\[[32..79\\]]->NXM_OF_ETH_DST"
> >>     done; :], [0], [dnl
> >> @@ -29586,7 +29586,7 @@ if test X"$1" = X"DGP"; then
> >> else
> >>     prio=2
> >> fi
> >> -AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep -E "table=25, n_packets=1,.* priority=$prio,ip,$inport.*$outport.*metadata=0x${sw_key},nw_dst=10.0.1.1 actions=drop" -c], [0], [dnl
> >> +AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep -E "table=26, n_packets=1,.* priority=$prio,ip,$inport.*$outport.*metadata=0x${sw_key},nw_dst=10.0.1.1 actions=drop" -c], [0], [dnl
> >> 1
> >> ])
> >>
> >> @@ -29605,13 +29605,13 @@ AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep "actions=controller" | grep
> >>
> >> if test X"$1" = X"DGP"; then
> >>     # The packet dst should be resolved once for E/W centralized NAT purpose.
> >> -    AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep -E "table=25, n_packets=1,.* priority=100,reg0=0xa000101,reg15=.*metadata=0x${sw_key} actions=mod_dl_dst:00:00:00:00:01:01,resubmit" -c], [0], [dnl
> >> +    AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep -E "table=26, n_packets=1,.* priority=100,reg0=0xa000101,reg15=.*metadata=0x${sw_key} actions=mod_dl_dst:00:00:00:00:01:01,resubmit" -c], [0], [dnl
> >> 1
> >> ])
> >> fi
> >>
> >> # The packet should've been finally dropped in the lr_in_arp_resolve stage.
> >> -AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep -E "table=25, n_packets=2,.* priority=$prio,ip,$inport.*$outport.*metadata=0x${sw_key},nw_dst=10.0.1.1 actions=drop" -c], [0], [dnl
> >> +AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep -E "table=26, n_packets=2,.* priority=$prio,ip,$inport.*$outport.*metadata=0x${sw_key},nw_dst=10.0.1.1 actions=drop" -c], [0], [dnl
> >> 1
> >> ])
> >> OVN_CLEANUP([hv1])
> >> diff --git a/tests/system-ovn.at b/tests/system-ovn.at
> >> index 7b9daba0d..591933a95 100644
> >> --- a/tests/system-ovn.at
> >> +++ b/tests/system-ovn.at
> >> @@ -12032,3 +12032,153 @@ as
> >> OVS_TRAFFIC_VSWITCHD_STOP(["/failed to query port patch-.*/d
> >> /connection dropped.*/d"])
> >> AT_CLEANUP
> >> +
> >> +OVN_FOR_EACH_NORTHD([
> >> +AT_SETUP([DHCP RELAY AGENT])
> >> +AT_SKIP_IF([test $HAVE_DHCPD = no])
> >> +AT_SKIP_IF([test $HAVE_DHCLIENT = no])
> >> +AT_SKIP_IF([test $HAVE_TCPDUMP = no])
> >> +ovn_start
> >> +OVS_TRAFFIC_VSWITCHD_START()
> >> +
> >> +ADD_BR([br-int])
> >> +ADD_BR([br-ext])
> >> +
> >> +ovs-ofctl add-flow br-ext action=normal
> >> +# Set external-ids in br-int needed for ovn-controller
> >> +ovs-vsctl \
> >> +        -- set Open_vSwitch . external-ids:system-id=hv1 \
> >> +        -- set Open_vSwitch . external-ids:ovn-remote=unix:$ovs_base/ovn-sb/ovn-sb.sock \
> >> +        -- set Open_vSwitch . external-ids:ovn-encap-type=geneve \
> >> +        -- set Open_vSwitch . external-ids:ovn-encap-ip=169.0.0.1 \
> >> +        -- set bridge br-int fail-mode=secure other-config:disable-in-band=true
> >> +
> >> +# Start ovn-controller
> >> +start_daemon ovn-controller
> >> +
> >> +ADD_NAMESPACES(sw01)
> >> +ADD_VETH(sw01, sw01, br-int, "0", "f0:00:00:01:02:03")
> >> +ADD_NAMESPACES(sw11)
> >> +ADD_VETH(sw11, sw11, br-int, "0", "f0:00:00:02:02:03")
> >> +ADD_NAMESPACES(server)
> >> +ADD_VETH(s1, server, br-ext, "172.16.1.1/24", "f0:00:00:01:02:05", \
> >> +         "172.16.1.254")
> >> +
> >> +check ovn-nbctl lr-add R1
> >> +
> >> +check ovn-nbctl ls-add sw0
> >> +check ovn-nbctl ls-add sw1
> >> +check ovn-nbctl ls-add sw-ext
> >> +
> >> +check ovn-nbctl lrp-add R1 rp-sw0 00:00:01:01:02:03 192.168.1.1/24
> >> +check ovn-nbctl lrp-add R1 rp-sw1 00:00:03:01:02:03 192.168.2.1/24
> >> +check ovn-nbctl lrp-add R1 rp-ext 00:00:02:01:02:03 172.16.1.254/24
> >> +
> >> +dhcp_relay=$(ovn-nbctl create DHCP_Relay servers=172.16.1.1)
> >> +check ovn-nbctl set Logical_Router_port rp-sw0 dhcp_relay=$dhcp_relay
> >> +check ovn-nbctl set Logical_Router_port rp-sw1 dhcp_relay=$dhcp_relay
> >> +check ovn-nbctl lrp-set-gateway-chassis rp-ext hv1
> >> +
> >> +check ovn-nbctl lsp-add sw0 sw0-rp -- set Logical_Switch_Port sw0-rp \
> >> +    type=router options:router-port=rp-sw0 \
> >> +    -- lsp-set-addresses sw0-rp router
> >> +check ovn-nbctl lsp-add sw1 sw1-rp -- set Logical_Switch_Port sw1-rp \
> >> +    type=router options:router-port=rp-sw1 \
> >> +    -- lsp-set-addresses sw1-rp router
> >> +
> >> +rp_uuid=$(ovn-nbctl --bare --columns=_uuid list logical_router_port rp-sw0)
> >> +check ovn-nbctl set Logical_Switch sw0 dhcp_relay_port=$rp_uuid
> >> +rp_uuid=$(ovn-nbctl --bare --columns=_uuid list logical_router_port rp-sw1)
> >> +check ovn-nbctl set Logical_Switch sw1 dhcp_relay_port=$rp_uuid
> >> +
> >> +check ovn-nbctl lsp-add sw-ext ext-rp -- set Logical_Switch_Port ext-rp \
> >> +    type=router options:router-port=rp-ext \
> >> +    -- lsp-set-addresses ext-rp router
> >> +check ovn-nbctl lsp-add sw-ext lnet \
> >> +        -- lsp-set-addresses lnet unknown \
> >> +        -- lsp-set-type lnet localnet \
> >> +        -- lsp-set-options lnet network_name=phynet
> >> +
> >> +check ovn-nbctl lsp-add sw0 sw01 \
> >> +    -- lsp-set-addresses sw01 "f0:00:00:01:02:03"
> >> +
> >> +check ovn-nbctl lsp-add sw1 sw11 \
> >> +    -- lsp-set-addresses sw11 "f0:00:00:02:02:03"
> >> +
> >> +AT_CHECK([ovs-vsctl set Open_vSwitch . external-ids:ovn-bridge-mappings=phynet:br-ext])
> >> +
> >> +OVN_POPULATE_ARP
> >> +
> >> +check ovn-nbctl --wait=hv sync
> >> +
> >> +DHCP_TEST_DIR="/tmp/dhcp-test"
> >> +rm -rf $DHCP_TEST_DIR
> >> +mkdir $DHCP_TEST_DIR
> >> +cat > $DHCP_TEST_DIR/dhcpd.conf <<EOF
> >> +subnet 172.16.1.0 netmask 255.255.255.0 {
> >> +}
> >> +subnet 192.168.1.0 netmask 255.255.255.0 {
> >> +  range 192.168.1.10 192.168.1.10;
> >> +  option routers 192.168.1.1;
> >> +  option broadcast-address 192.168.1.255;
> >> +  default-lease-time 60;
> >> +  max-lease-time 120;
> >> +}
> >> +subnet 192.168.2.0 netmask 255.255.255.0 {
> >> +  range 192.168.2.10 192.168.2.10;
> >> +  option routers 192.168.2.1;
> >> +  option broadcast-address 192.168.2.255;
> >> +  default-lease-time 60;
> >> +  max-lease-time 120;
> >> +}
> >> +EOF
> >> +cat > $DHCP_TEST_DIR/dhclient.conf <<EOF
> >> +timeout 2
> >> +EOF
> >> +
> >> +touch $DHCP_TEST_DIR/dhcpd.leases
> >> +chown root:dhcpd $DHCP_TEST_DIR $DHCP_TEST_DIR/dhcpd.leases
> >> +chmod 775 $DHCP_TEST_DIR
> >> +chmod 664 $DHCP_TEST_DIR/dhcpd.leases
> >> +
> >> +
> >> +NETNS_DAEMONIZE([server], [dhcpd -4 -f -cf $DHCP_TEST_DIR/dhcpd.conf s1 > dhcpd.log 2>&1], [dhcpd.pid])
> >> +
> >> +NS_CHECK_EXEC([server], [tcpdump -l -nvv -i s1  udp > pkt.pcap 2>tcpdump_err &])
> >> +OVS_WAIT_UNTIL([grep "listening" tcpdump_err])
> >> +on_exit 'kill $(pidof tcpdump)'
> >> +
> >> +NS_CHECK_EXEC([sw01], [dhclient -1 -q -lf $DHCP_TEST_DIR/dhclient-sw01.lease -pf $DHCP_TEST_DIR/dhclient-sw01.pid -cf $DHCP_TEST_DIR/dhclient.conf sw01])
> >> +NS_CHECK_EXEC([sw11], [dhclient -1 -q -lf $DHCP_TEST_DIR/dhclient-sw11.lease -pf $DHCP_TEST_DIR/dhclient-sw11.pid -cf $DHCP_TEST_DIR/dhclient.conf sw11])
> >> +
> >> +OVS_WAIT_UNTIL([
> >> +    total_pkts=$(cat pkt.pcap | wc -l)
> >> +    test ${total_pkts} -ge 8
> >> +])
> >> +
> >> +on_exit 'kill `cat $DHCP_TEST_DIR/dhclient-sw01.pid` &&
> >> +kill `cat $DHCP_TEST_DIR/dhclient-sw11.pid` && rm -rf $DHCP_TEST_DIR'
> >> +
> >> +NS_CHECK_EXEC([sw01], [ip addr show sw01 | grep -oP '(?<=inet\s)\d+(\.\d+){3}'], [0], [dnl
> >> +192.168.1.10
> >> +])
> >> +NS_CHECK_EXEC([sw11], [ip addr show sw11 | grep -oP '(?<=inet\s)\d+(\.\d+){3}'], [0], [dnl
> >> +192.168.2.10
> >> +])
> >> +OVS_APP_EXIT_AND_WAIT([ovn-controller])
> >> +
> >> +as ovn-sb
> >> +OVS_APP_EXIT_AND_WAIT([ovsdb-server])
> >> +
> >> +as ovn-nb
> >> +OVS_APP_EXIT_AND_WAIT([ovsdb-server])
> >> +
> >> +as northd
> >> +OVS_APP_EXIT_AND_WAIT([NORTHD_TYPE])
> >> +
> >> +as
> >> +OVS_TRAFFIC_VSWITCHD_STOP(["/.*error receiving.*/d
> >> +/failed to query port patch-.*/d
> >> +/.*terminating with signal 15.*/d"])
> >> +AT_CLEANUP
> >> +])
> >> diff --git a/utilities/ovn-trace.c b/utilities/ovn-trace.c
> >> index 0b86eae7b..ae9dd77de 100644
> >> --- a/utilities/ovn-trace.c
> >> +++ b/utilities/ovn-trace.c
> >> @@ -2328,6 +2328,25 @@ execute_put_dhcp_opts(const struct ovnact_put_opts *pdo,
> >>     execute_put_opts(pdo, name, uflow, super);
> >> }
> >>
> >> +static void
> >> +execute_dhcpv4_relay_resp_fwd(const struct ovnact_dhcp_relay *dr,
> >> +                                const char *name, struct flow *uflow,
> >> +                                struct ovs_list *super)
> >> +{
> >> +    ovntrace_node_append(
> >> +        super, OVNTRACE_NODE_ERROR,
> >> +        "/* We assume that this packet is DHCPOFFER or DHCPACK and "
> >> +            "DHCP broadcast flag is set. Dest IP is set to broadcast. "
> >> +            "Dest MAC is set to broadcast but in real network this is unicast "
> >> +            "which is extracted from DHCP header. */");
> >> +
> >> +    /* Assume DHCP broadcast flag is set */
> >> +    uflow->nw_dst = 0xFFFFFFFF;
> >> +    /* Dest MAC is set to broadcast but in real network this is unicast */
> >> +    struct eth_addr bcast_mac = {0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF};
> >> +    uflow->dl_dst = bcast_mac;
> >> +}
> >> +
> >> static void
> >> execute_put_nd_ra_opts(const struct ovnact_put_opts *pdo,
> >>                        const char *name, struct flow *uflow,
> >> @@ -3215,6 +3234,15 @@ trace_actions(const struct ovnact *ovnacts, size_t ovnacts_len,
> >>                                   "put_dhcpv6_opts", uflow, super);
> >>             break;
> >>
> >> +        case OVNACT_DHCPV4_RELAY_REQ:
> >> +            /* Nothing to do for tracing. */
> >> +            break;
> >> +
> >> +        case OVNACT_DHCPV4_RELAY_RESP_FWD:
> >> +            execute_dhcpv4_relay_resp_fwd(ovnact_get_DHCPV4_RELAY_RESP_FWD(a),
> >> +                                    "dhcp_relay_resp_fwd", uflow, super);
> >> +            break;
> >> +
> >>         case OVNACT_PUT_ND_RA_OPTS:
> >>             execute_put_nd_ra_opts(ovnact_get_PUT_DHCPV6_OPTS(a),
> >>                                    "put_nd_ra_opts", uflow, super);
> >> --
> >> 2.36.6
> >>
> >> _______________________________________________
> >> dev mailing list
> >> dev@openvswitch.org
> >> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>
Naveen Yerramneni Jan. 24, 2024, 1 p.m. UTC | #5
> On 24-Jan-2024, at 8:59 AM, Numan Siddique <numans@ovn.org> wrote:
> 
> On Tue, Jan 23, 2024 at 8:02 PM Naveen Yerramneni
> <naveen.yerramneni@nutanix.com> wrote:
>> 
>> 
>> 
>>> On 16-Jan-2024, at 2:30 AM, Numan Siddique <numans@ovn.org> wrote:
>>> 
>>> On Tue, Dec 12, 2023 at 1:05 PM Naveen Yerramneni
>>> <naveen.yerramneni@nutanix.com> wrote:
>>>> 
>>>>   This patch contains changes to enable DHCP Relay Agent support for overlay subnets.
>>>> 
>>>>   USE CASE:
>>>>   ----------
>>>>     - Enable IP address assignment for overlay subnets from the centralized DHCP server present in the underlay network.
>>>> 
>>>>   PREREQUISITES
>>>>   --------------
>>>>     - Logical Router Port IP should be assigned (statically) from the same overlay subnet which is managed by DHCP server.
>>>>     - LRP IP is used for the GIADDR field when relaying DHCP packets; the same IP must also be configured as the default gateway for the overlay subnet.
>>>>     - Overlay subnets managed by external DHCP server are expected to be directly reachable from the underlay network.
>>>> 
>>>>   EXPECTED PACKET FLOW:
>>>>   ----------------------
>>>>   Following is the expected packet flow in order to support DHCP relay functionality in OVN.
>>>>     1. DHCP client originates DHCP discovery (broadcast).
>>>>     2. DHCP relay (running on the OVN) receives the broadcast and forwards the packet to the DHCP server by converting it to unicast. While forwarding the packet, it updates the GIADDR in DHCP header to its
>>>>        interface IP on which DHCP packet is received.
>>>>     3. DHCP server uses GIADDR field to decide the IP address pool from which IP has to be assigned and DHCP offer is sent to the same IP (GIADDR).
>>>>     4. DHCP relay agent forwards the offer to the client, it resets the GIADDR field when forwarding the offer to the client.
>>>>     5. DHCP client sends DHCP request (broadcast) packet.
>>>>     6. DHCP relay (running on the OVN) receives the broadcast and forwards the packet to the DHCP server by converting it to unicast. While forwarding the packet, it updates the GIADDR in DHCP header to its
>>>>        interface IP on which DHCP packet is received.
>>>>     7. DHCP Server sends the ACK packet.
>>>>     8. DHCP relay agent forwards the ACK packet to the client, it resets the GIADDR field when forwarding the ACK to the client.
>>>>     9. All the future renew/release packets are directly exchanged between DHCP client and DHCP server.
>>>> 
>>>>   OVN DHCP RELAY PACKET FLOW:
>>>>   ----------------------------
>>>>   To add DHCP Relay support on OVN, we need to replicate all the behavior described above using distributed logical switch and logical router.
>>>>   At a high level, the packet flow is distributed among the Logical Switch and Logical Router on the source node (where the VM is deployed) and the redirect chassis (RC) node.
>>>>     1. Request packet gets processed on the source node where VM is deployed and relays the packet to DHCP server.
>>>>     2. Response packet is first processed on the RC node (which first receives the packet from the underlay network). The RC node forwards the packet to the right node by filling in the dest MAC and IP.
>>>> 
>>>>   OVN Packet flow with DHCP relay is explained below.
>>>>     1. DHCP client (VM) sends the DHCP discover packet (broadcast).
>>>>     2. Logical switch converts the packet to L2 unicast by setting the destination MAC to LRP's MAC
>>>>     3. Logical Router receives the packet and redirects it to the OVN controller.
>>>>     4. OVN controller updates the required information(GIADDR) in the DHCP payload after doing the required checks. If any check fails, packet is dropped.
>>>>     5. Logical Router converts the packet to L3 unicast and forwards it to the server. This packet gets routed like any other packet (via RC node).
>>>>     6. Server replies with DHCP offer.
>>>>     7. RC node processes the DHCP offer and forwards it to the OVN controller.
>>>>     8. OVN controller does sanity checks and  updates the destination MAC (available in DHCP header), destination IP (available in DHCP header), resets GIADDR  and reinjects the packet to datapath.
>>>>        If any check fails, packet is dropped.
>>>>     9. Logical router updates the source IP and port and forwards the packet to logical switch.
>>>>     10. Logical switch delivers the packet to the DHCP client.
>>>>     11. Similar steps are performed for Request and Ack packets.
>>>>     12. All the future renew/release packets are directly exchanged between DHCP client and DHCP server
>>>> 
>>>>   NEW OVN ACTIONS
>>>>   ---------------
>>>> 
>>>>     1. dhcp_relay_req(<relay-ip>, <server-ip>)
>>>>         - This action executes on the source node on which the DHCP request originated.
>>>>         - This action relays the DHCP request coming from client to the server. Relay-ip is used to update GIADDR in the DHCP header.
>>>>     2. dhcp_relay_resp_fwd(<relay-ip>, <server-ip>)
>>>>         - This action executes on the first node (RC node) which processes the DHCP response from the server.
>>>>         - This action updates  the destination MAC and destination IP so that the response can be forwarded to the appropriate node from which request was originated.
>>>>         - Relay-ip, server-ip are used to validate GIADDR and SERVER ID in the DHCP payload.
>>>> 
>>>>   FLOWS
>>>>   -----
>>>>   Following are the flows required for one overlay subnet.
>>>> 
>>>>     1. table=27(ls_in_l2_lkup      ), priority=100  , match=(inport == <vm_port> && eth.src == <vm_mac> && ip4.src == 0.0.0.0 && ip4.dst == 255.255.255.255 && udp.src == 68 && udp.dst == 67), action=(eth.dst=<lrp_mac>;outport=<lrp-port>;next;/* DHCP_RELAY_REQ */)
>>>>     2. table=3 (lr_in_ip_input     ), priority=110  , match=(inport == <lrp_port> && ip4.src == 0.0.0.0 && ip4.dst == 255.255.255.255 && udp.src == 68 && udp.dst == 67), action=(dhcp_relay_req(<lrp_ip>,<dhcp_server_ip>);ip4.src=<lrp_ip>;ip4.dst=<dhcp_server_ip>;udp.src=67;next; /* DHCP_RELAY_REQ */)
>>>>     3. table=3 (lr_in_ip_input     ), priority=110  , match=(ip4.src == <dhcp_server_ip> && ip4.dst ==<lrp_ip> && udp.src == 67 && udp.dst == 67), action=(next;/* DHCP_RELAY_RESP */)
>>>>     4. table=17(lr_in_dhcp_relay_resp_fwd), priority=110  , match=(ip4.src == <dhcp_server_ip> && ip4.dst == <lrp_ip> && udp.src == 67 && udp.dst == 67), action=(dhcp_relay_resp_fwd(<lrp_ip>,<dhcp_server_ip>);ip4.src=<lrp_ip>;udp.dst=68;outport=<lrp_port>;output; /* DHCP_RELAY_RESP */)
>>>> 
>>>>   NEW PIPELINE STAGES
>>>>   -------------------
>>>>   The following stage is added for the DHCP relay feature. Some of the flows are fitted into the existing pipeline stages.
>>>>     1. lr_in_dhcp_relay_resp_fwd
>>>>         - Forward the DHCP response to the appropriate node
>>>> 
>>>>   NB SCHEMA CHANGES
>>>>   ----------------
>>>>     1. New DHCP_Relay table
>>>>         "DHCP_Relay": {
>>>>               "columns": {
>>>>           "name": {"type": "string"},
>>>>                   "servers": {"type": {"key": "string",
>>>>                                          "min": 0,
>>>>                                          "max": 1}},
>>>>                   "external_ids": {
>>>>                       "type": {"key": "string", "value": "string",
>>>>                               "min": 0, "max": "unlimited"}}},
>>>>               "isRoot": true},
>>>>     2. New column to Logical_Router_Port table
>>>>         "dhcp_relay": {"type": {"key": {"type": "uuid",
>>>>                               "refTable": "DHCP_Relay",
>>>>                               "refType": "weak"},
>>>>                               "min": 0,
>>>>                               "max": 1}},
>>>>     3. New column to Logical_Switch table
>>>>         "dhcp_relay_port": {"type": {"key": {"type": "uuid",
>>>>                                       "refTable": "Logical_Router_Port",
>>>>                                       "refType": "weak"},
>>>>                                        "min": 0,
>>>>                                        "max": 1}}},
>>>> 
>>>>   Commands to enable the feature:
>>>>   ------------------------------
>>>>     - ovn-nbctl create DHCP_Relay servers=<ip>
>>>>     - ovn-nbctl set Logical_Router_port <lrp_uuid> dhcp_relay=<dhcp_relay_uuid>
>>>>     - ovn-nbctl set Logical_Switch <ls_uuid> dhcp_relay_port=<lrp_uuid>
>>>> 
>>>>   Example:
>>>>   -------
>>>>    ovn-nbctl ls-add sw1
>>>>    ovn-nbctl lsp-add sw1 sw1-port1
>>>>    ovn-nbctl lsp-set-addresses sw1-port1 <MAC> #Only MAC address has to be specified when logical ports are created.
>>>>    ovn-nbctl lr-add lr1
>>>>    ovn-nbctl lrp-add lr1 lr1-port1 <MAC> <GATEWAY_IP/Prefix> #GATEWAY IP is set in GIADDR field when relaying the DHCP requests to server.
>>>>    ovn-nbctl lsp-add sw1 lr1-attachment
>>>>    ovn-nbctl lsp-set-type lr1-attachment router
>>>>    ovn-nbctl lsp-set-addresses lr1-attachment <MAC>
>>>>    ovn-nbctl lsp-set-options lr1-attachment router-port=lr1-port1
>>>>    ovn-nbctl create DHCP_Relay servers=<DHCP_SERVER_IP>
>>>>    ovn-nbctl set Logical_Router_port <lrp_uuid> dhcp_relay=<relay_uuid>
>>>>    ovn-nbctl set Logical_Switch <ls_uuid> dhcp_relay_port=<lrp_uuid>
>>>> 
>>>>   Limitations:
>>>>   ------------
>>>>     - All OVN features that need an IP address to be configured on the logical port (like proxy ARP, etc.) will not be supported for overlay subnets on which DHCP relay is enabled.
>>>> 
>>>>   References:
>>>>   ----------
>>>>     - rfc1541, rfc1542, rfc2131
>>>> 
>>>> Signed-off-by: Naveen Yerramneni <naveen.yerramneni@nutanix.com>
>>>> Co-authored-by: Huzaifa Calcuttawala <huzaifa.c@nutanix.com>
>>>> Signed-off-by: Huzaifa Calcuttawala <huzaifa.c@nutanix.com>
>>>> CC: Mary Manohar <mary.manohar@nutanix.com>
>>> 
>>> Hi Naveen,
>>> 
>>> Thanks for the patch.  Sorry for the delayed response.
>>> 
>>> I've a few comments.
>>> 
>>> 1.  Regarding the newly added Table - DHCP_Relay in NB DB and the
>>> newly added columns in Logical_Switch and
>>>   Logical_Router table.
>>> 
>>>   I don't think there is a need to add the new table DHCP_Relay
>>> since it only stores the dhcp relay agent server ip.
>>>   Also it could complicate the northd incremental processing.
>>> 
>>>   If for example we have below logical switches and router
>>> 
>>>   ovn-nbctl lr-add R1
>>>   ovn-nbctl ls-add sw0
>>>   ovn-nbctl ls-add sw1
>>>   ovn-nbctl ls-add sw-ext
>>>   ovn-nbctl lrp-add R1 rp-sw0 00:00:01:01:02:03 192.168.1.1/24
>>>   ovn-nbctl lrp-add R1 rp-sw1 00:00:03:01:02:03 192.168.2.1/24
>>>   ovn-nbctl lrp-add R1 rp-ext 00:00:02:01:02:03 172.16.1.254/24
>>> 
>>>   ovn-nbctl lsp-add sw0 sw0-rp -- set Logical_Switch_Port sw0-rp \
>>>   type=router options:router-port=rp-sw0 \
>>>   -- lsp-set-addresses sw0-rp router
>>> 
>>>   ovn-nbctl lsp-add sw1 sw1-rp -- set Logical_Switch_Port sw1-rp \
>>>   type=router options:router-port=rp-sw1 \
>>>   -- lsp-set-addresses sw1-rp router
>>> 
>>>   I'd suggest doing something like below to enable this feature.
>>> 
>>>   ovn-nbctl set Logical_Switch_Port sw0-rp options:dhcp_relay=true
>>>   ovn-nbctl set Logical_Switch_Port sw1-rp options:dhcp_relay=true
>>> 
>>>   (Make sure that only one logical switch port of type router can
>>> have this flag - dhcp_relay set
>>>    for a given logical switch and document this limitation.)
>> 
>> Ack. This suggestion looks good.
>> 
>>>   ovn-nbctl set Logical_Router_port rp-sw0 options:dhcp_relay_ip=172.16.1.1
>>>   ovn-nbctl set Logical_Router_port rp-sw1 options:dhcp_relay_ip=172.16.1.1
>>> 
>>>   Let me know if there are any limitations with this.
>> 
>> The reason I added a new table is that it would be useful in the future if we
>> add additional options (like setting the hop count in the DHCP header, etc.) to
>> the DHCP relay functionality. What do you recommend if we have to add more
>> options in the future?
> 
> I see.  If there is a possibility of adding more options, then having
> a separate table makes sense.
> I'd suggest to add the options column to the DHCP_Relay table even if
> this patch presently is not using
> any.  This would help in upgrades.
> 
> But I don't think there is a need to add a new column in the logical
> switch port table to enable dhcp relay.
> 
> Thanks
> Numan

Sure, I will add an options column to the DHCP_Relay table.
I will use options:dhcp_relay on the LSP instead of a new column, as you suggested.

Thanks,
Naveen


> 
>> 
>> 
>> 
>>> 2.  Regarding the newly added actions - dhcp_relay_req() and
>>> dhcp_relay_resp_fwd().
>>>    Both of these actions are encoded as OVS controller action with
>>> pause enabled.
>>>    Which means ovs-vswitchd has to freeze the flow translation and
>>> resume the flow translation
>>>    once the ovn-controller resumes it.  But the functions
>>> pinctrl_handle_dhcp_relay_req()
>>>    and pinctrl_handle_dhcp_relay_resp_fwd() do not resume the packet
>>> if the packet
>>>    has some errors.  This is wrong: the packet must always be resumed,
>>>    otherwise vswitchd will never thaw the frozen translation.
>>> 
>>>    You can see the existing OVN actions - put_dhcp_opts() and few others which
>>>    use controller action with pause.  In such actions, the result of
>>> these actions
>>>    are stored in a register bit (i.e if put_dhcp_opts() was successful or not)
>>>    and in the next stage we take a decision based on the result.
>>> 
>>>    For the action dhcp_relay_req(relay_ip, server_ip),  I don't
>>> think you should use the pause flag.
>>>    Also in this action the argument server_ip is never used in the
>>> function pinctrl_handle_dhcp_relay_req()
>>>    other than to just log.
>>> 
>>>    I'd suggest you do something like this:
>>> 
>>>   table=3 (lr_in_ip_input     ), priority=110  , match=(inport ==
>>> "lrp1" && ip4.src == 0.0.0.0 && ip4.dst == 255.255.255.255 && udp.src
>>> == 68 && udp.dst == 67),
>>>   action=(dhcp_relay_req { ip4.src = 192.168.1.1; ip4.dst =
>>> 172.16.1.1; udp.src = 67; dhcp_header.giaddr = <relay_ip>;
>>> next(pipeline=ingress,table=S_ROUTER_IN_UNSNAT);  /* DHCP_RELAY_REQ */
>>> }
>>> 
>>>   dhcp_relay_req action would get translated into a controller
>>> action with pause=false and all the inner actions of this are encoded
>>> as
>>>   normal actions and stored in the userdata of controller action.
>>> Please see icmp4_error {} as an example.
>>>   Add a new OVN field 'dhcp_header.giaddr' which gets translated as
>>> controller action with pause flag set.
>>>   Please see the existing OVN field - icmp4.frag_mtu as an example
>>> and see this commit for reference [1]
>>>   When encoding this new OVN field, store the relay_ip in the
>>> userdata buffer and in pinctrl.c
>>>   get the relay_ip value and store it in the dhcp header field.
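
Whichever way giaddr ends up being written in pinctrl.c, the UDP payload
changes, so a non-zero UDP checksum has to be adjusted; the posted patch does
this with recalc_csum32().  Below is a minimal standalone sketch of that kind
of incremental update (RFC 1624 style).  csum_fold(), csum_full() and
csum_update16() are illustrative names made up for this sketch, not OVS
functions, and the buffer stands in for the checksummed data.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fold a 32-bit accumulator into a 16-bit one's complement sum. */
static uint16_t
csum_fold(uint32_t sum)
{
    while (sum >> 16) {
        sum = (sum & 0xffff) + (sum >> 16);
    }
    return (uint16_t) sum;
}

/* Reference: checksum computed from scratch over 'len' bytes. */
static uint16_t
csum_full(const uint8_t *data, size_t len)
{
    assert(data != NULL || len == 0);

    uint32_t sum = 0;
    size_t i;
    for (i = 0; i + 1 < len; i += 2) {
        sum += ((uint32_t) data[i] << 8) | data[i + 1];
    }
    if (len & 1) {
        sum += (uint32_t) data[len - 1] << 8;
    }
    return (uint16_t) ~csum_fold(sum);
}

/* Incremental update when one 16-bit word changes from 'old_word' to
 * 'new_word': HC' = ~(~HC + ~m + m'). */
static uint16_t
csum_update16(uint16_t csum, uint16_t old_word, uint16_t new_word)
{
    uint32_t sum = (uint16_t) ~csum;
    sum += (uint16_t) ~old_word;
    sum += new_word;
    return (uint16_t) ~csum_fold(sum);
}
```

A 32-bit field such as giaddr is handled by applying csum_update16() to each
16-bit half.  A zero UDP checksum means "no checksum" and is left alone, as
the patch already does.
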
>>> 
>>> 
>>>   For the action dhcp_relay_resp_fwd,  I'd suggest something like below:
>>> 
>>>     table=17 (lr_in_dhcp_relay_resp_chk), priority=110  ,
>>> match=(ip4.src == 172.16.1.1 && ip4.dst == 192.168.1.1 && udp.src ==
>>> 67 && udp.dst == 67),
>>>     action=(reg0[0] = dhcp_relay_resp_chk(dhcp_header.giaddr ==
>>> <relay_ip>); next;)
>>>     table=17 (lr_in_dhcp_relay_resp), priority=110  , match=(ip4.src
>>> == 172.16.1.1 && ip4.dst == 192.168.1.1 && udp.src == 67 && udp.dst ==
>>> 67 && reg0[0] == 1),
>>>     action=(ip4.src = 192.168.1.1; udp.dst = 68; outport = "lrp1";
>>> output; /* DHCP_RELAY_RESP */)
>>> 
>>>      I used reg0[0] as an example.  You may need to check the free
>>> register bit and use it.
>>> 
>>>     You need to encode dhcp_relay_resp_chk as controller action with
>>> pause=true, and store the relay_ip in the userdata buffer.
>>>     And in pinctrl.c  check that  'dhcp_header.giaddr == relay_ip'
>>> or not.  If so, set the result register bit to 1, else to 0.
>>> 
>>>  Let me know if you've any questions.
>>> 
>> 
>> Ack. Thanks for the suggestions and detailed explanation.
>> Before implementation I had referred to icmp4_error and native dhcp_server flows
>> but I had a slight misunderstanding about the pause flag.
>> 
>> 
>>> 3.  The newly added functions in pinctrl.c have a lot of repetitive
>>> code and it is very much similar to existing
>>> pinctrl_handle_put_dhcp_opts()
>>>   Please see if the duplicate code can be avoided.
>> 
>> Ack.
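
One way the duplication could be reduced is a shared helper for the DHCP
option walk, since both new handlers repeat the same loop.  The following is
a standalone sketch under that assumption: struct relay_opts and
parse_relay_opts() are hypothetical names, and the option codes are written
out as plain constants (pad 0, end 255, message type 53, server identifier
54) instead of using the OVN macros.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define OPT_PAD        0
#define OPT_END        255
#define OPT_MSG_TYPE   53
#define OPT_SERVER_ID  54

/* Pointers into the option area; NULL when the option is absent or has
 * an unexpected length. */
struct relay_opts {
    const uint8_t *msg_type;    /* 1-byte payload of option 53 */
    const uint8_t *server_id;   /* 4-byte payload of option 54 */
};

/* Walk the code/length/value options in [p, end), i.e. the bytes after
 * the magic cookie, and stop at the end option or at the first option
 * that would run past 'end'. */
static void
parse_relay_opts(const uint8_t *p, const uint8_t *end,
                 struct relay_opts *out)
{
    assert(p <= end);

    out->msg_type = NULL;
    out->server_id = NULL;

    while (p < end) {
        uint8_t code = p[0];
        if (code == OPT_END) {
            break;
        }
        if (code == OPT_PAD) {
            p++;
            continue;
        }
        if (end - p < 2) {
            break;              /* No room for the length byte. */
        }
        uint8_t len = p[1];
        const uint8_t *payload = p + 2;
        if ((size_t) (end - payload) < len) {
            break;              /* Truncated option. */
        }
        if (code == OPT_MSG_TYPE && len == 1) {
            out->msg_type = payload;
        } else if (code == OPT_SERVER_ID && len == 4) {
            out->server_id = payload;
        }
        p = payload + len;
    }
}
```

Each handler would then only keep its own checks (opcode, giaddr, message
type) on top of the parsed result.
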
>> 
>> 
>> 
>>> [1] - https://github.com/ovn-org/ovn/commit/3d9fec3fd5992e1201b4d4fdf43f1f397e8d5ea1
>>> 
>>> Thanks
>>> Numan
>>> 
>>>> ---
>>>> controller/pinctrl.c  | 441 ++++++++++++++++++++++++++++++++++++++++++
>>>> include/ovn/actions.h |  26 +++
>>>> lib/actions.c         | 117 +++++++++++
>>>> lib/ovn-l7.h          |   1 +
>>>> northd/northd.c       | 177 ++++++++++++++++-
>>>> ovn-nb.ovsschema      |  25 ++-
>>>> ovn-nb.xml            |  28 +++
>>>> tests/atlocal.in      |   3 +
>>>> tests/ovn-northd.at   |  41 +++-
>>>> tests/ovn.at          |  12 +-
>>>> tests/system-ovn.at   | 150 ++++++++++++++
>>>> utilities/ovn-trace.c |  28 +++
>>>> 12 files changed, 1032 insertions(+), 17 deletions(-)
>>>> 
>>>> diff --git a/controller/pinctrl.c b/controller/pinctrl.c
>>>> index 5a35d56f6..45240f01d 100644
>>>> --- a/controller/pinctrl.c
>>>> +++ b/controller/pinctrl.c
>>>> @@ -1897,6 +1897,437 @@ is_dhcp_flags_broadcast(ovs_be16 flags)
>>>>    return flags & htons(DHCP_BROADCAST_FLAG);
>>>> }
>>>> 
>>>> +static const char *dhcp_msg_str[] = {
>>>> +[0] = "INVALID",
>>>> +[DHCP_MSG_DISCOVER] = "DISCOVER",
>>>> +[DHCP_MSG_OFFER] = "OFFER",
>>>> +[DHCP_MSG_REQUEST] = "REQUEST",
>>>> +[OVN_DHCP_MSG_DECLINE] = "DECLINE",
>>>> +[DHCP_MSG_ACK] = "ACK",
>>>> +[DHCP_MSG_NAK] = "NAK",
>>>> +[OVN_DHCP_MSG_RELEASE] = "RELEASE",
>>>> +[OVN_DHCP_MSG_INFORM] = "INFORM"
>>>> +};
>>>> +
>>>> +static bool
>>>> +dhcp_relay_is_msg_type_supported(uint8_t msg_type)
>>>> +{
>>>> +    return (msg_type >= DHCP_MSG_DISCOVER && msg_type <= OVN_DHCP_MSG_RELEASE);
>>>> +}
>>>> +
>>>> +static const char *dhcp_msg_str_get(uint8_t msg_type)
>>>> +{
>>>> +    if (!dhcp_relay_is_msg_type_supported(msg_type)) {
>>>> +        return "INVALID";
>>>> +    }
>>>> +    return dhcp_msg_str[msg_type];
>>>> +}
>>>> +
>>>> +/* Called with in the pinctrl_handler thread context. */
>>>> +static void
>>>> +pinctrl_handle_dhcp_relay_req(
>>>> +    struct rconn *swconn,
>>>> +    struct dp_packet *pkt_in, struct ofputil_packet_in *pin,
>>>> +    struct ofpbuf *userdata,
>>>> +    struct ofpbuf *continuation)
>>>> +{
>>>> +    enum ofp_version version = rconn_get_version(swconn);
>>>> +    enum ofputil_protocol proto = ofputil_protocol_from_ofp_version(version);
>>>> +    struct dp_packet *pkt_out_ptr = NULL;
>>>> +
>>>> +    /* Parse relay IP and server IP. */
>>>> +    ovs_be32 *relay_ip = ofpbuf_try_pull(userdata, sizeof *relay_ip);
>>>> +    ovs_be32 *server_ip = ofpbuf_try_pull(userdata, sizeof *server_ip);
>>>> +    if (!relay_ip || !server_ip) {
>>>> +        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
>>>> +        VLOG_WARN_RL(&rl, "DHCP_RELAY_REQ: relay ip or server ip "
>>>> +                  "not present in the userdata");
>>>> +        return;
>>>> +    }
>>>> +
>>>> +    /* Validate the DHCP request packet.
>>>> +     * Format of the DHCP packet is
>>>> +     * ------------------------------------------------------------------------
>>>> +     *| UDP HEADER  | DHCP HEADER  | 4 Byte DHCP Cookie | DHCP OPTIONS(var len)|
>>>> +     * ------------------------------------------------------------------------
>>>> +     */
>>>> +
>>>> +    size_t in_l4_size = dp_packet_l4_size(pkt_in);
>>>> +    const char *end = (char *) dp_packet_l4(pkt_in) + in_l4_size;
>>>> +    const char *in_dhcp_ptr = dp_packet_get_udp_payload(pkt_in);
>>>> +    if (!in_dhcp_ptr) {
>>>> +        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
>>>> +        VLOG_WARN_RL(&rl, "DHCP_RELAY_REQ: invalid or incomplete "
>>>> +                  "DHCP packet received");
>>>> +        return;
>>>> +    }
>>>> +
>>>> +    const struct dhcp_header *in_dhcp_data
>>>> +        = (const struct dhcp_header *) in_dhcp_ptr;
>>>> +    in_dhcp_ptr += sizeof *in_dhcp_data;
>>>> +    if (in_dhcp_ptr > end) {
>>>> +        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
>>>> +        VLOG_WARN_RL(&rl, "DHCP_RELAY_REQ: invalid or incomplete "
>>>> +                "DHCP packet received, bad data length");
>>>> +        return;
>>>> +    }
>>>> +    if (in_dhcp_data->op != DHCP_OP_REQUEST) {
>>>> +        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
>>>> +        VLOG_WARN_RL(&rl, "DHCP_RELAY_REQ: invalid opcode in the "
>>>> +                "DHCP packet: %d", in_dhcp_data->op);
>>>> +        return;
>>>> +    }
>>>> +
>>>> +    /* DHCP options follow the DHCP header. The first 4 bytes of the DHCP
>>>> +     * options is the DHCP magic cookie followed by the actual DHCP options.
>>>> +     */
>>>> +    ovs_be32 magic_cookie = htonl(DHCP_MAGIC_COOKIE);
>>>> +    if (in_dhcp_ptr + sizeof magic_cookie > end ||
>>>> +        get_unaligned_be32((const void *) in_dhcp_ptr) != magic_cookie) {
>>>> +        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
>>>> +        VLOG_WARN_RL(&rl, "DHCP_RELAY_REQ: magic cookie not present "
>>>> +                "in the packet");
>>>> +        return;
>>>> +    }
>>>> +
>>>> +    if (in_dhcp_data->giaddr) {
>>>> +        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
>>>> +        VLOG_WARN_RL(&rl, "DHCP_RELAY_REQ: giaddr is already set");
>>>> +        return;
>>>> +    }
>>>> +
>>>> +    if (in_dhcp_data->htype != 0x1) {
>>>> +        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
>>>> +        VLOG_WARN_RL(&rl, "DHCP_RELAY_REQ: packet is received with "
>>>> +                "unsupported hardware type");
>>>> +        return;
>>>> +    }
>>>> +
>>>> +    ovs_be32 *server_id_ptr = NULL;
>>>> +    const uint8_t *in_dhcp_msg_type = NULL;
>>>> +
>>>> +    in_dhcp_ptr += sizeof magic_cookie;
>>>> +    ovs_be32 request_ip = in_dhcp_data->ciaddr;
>>>> +    while (in_dhcp_ptr < end) {
>>>> +        const struct dhcp_opt_header *in_dhcp_opt =
>>>> +            (const struct dhcp_opt_header *) in_dhcp_ptr;
>>>> +        if (in_dhcp_opt->code == DHCP_OPT_END) {
>>>> +            break;
>>>> +        }
>>>> +        if (in_dhcp_opt->code == DHCP_OPT_PAD) {
>>>> +            in_dhcp_ptr += 1;
>>>> +            continue;
>>>> +        }
>>>> +        in_dhcp_ptr += sizeof *in_dhcp_opt;
>>>> +        if (in_dhcp_ptr > end) {
>>>> +            break;
>>>> +        }
>>>> +        in_dhcp_ptr += in_dhcp_opt->len;
>>>> +        if (in_dhcp_ptr > end) {
>>>> +            break;
>>>> +        }
>>>> +
>>>> +        switch (in_dhcp_opt->code) {
>>>> +        case DHCP_OPT_MSG_TYPE:
>>>> +            if (in_dhcp_opt->len == 1) {
>>>> +                in_dhcp_msg_type = DHCP_OPT_PAYLOAD(in_dhcp_opt);
>>>> +            }
>>>> +            break;
>>>> +        case DHCP_OPT_REQ_IP:
>>>> +            if (in_dhcp_opt->len == 4) {
>>>> +                request_ip = get_unaligned_be32(DHCP_OPT_PAYLOAD(in_dhcp_opt));
>>>> +            }
>>>> +            break;
>>>> +        /* Server Identifier */
>>>> +        case OVN_DHCP_OPT_CODE_SERVER_ID:
>>>> +            if (in_dhcp_opt->len == 4) {
>>>> +                server_id_ptr = DHCP_OPT_PAYLOAD(in_dhcp_opt);
>>>> +            }
>>>> +            break;
>>>> +        default:
>>>> +            break;
>>>> +        }
>>>> +    }
>>>> +
>>>> +    /* Check whether the DHCP Message Type (opt 53) is present or not */
>>>> +    if (!in_dhcp_msg_type) {
>>>> +        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
>>>> +        VLOG_WARN_RL(&rl, "DHCP_RELAY_REQ: missing message type");
>>>> +        return;
>>>> +    }
>>>> +
>>>> +    /* Relay the DHCP request packet */
>>>> +    uint16_t new_l4_size = in_l4_size;
>>>> +    size_t new_packet_size = pkt_in->l4_ofs + new_l4_size;
>>>> +
>>>> +    struct dp_packet pkt_out;
>>>> +    dp_packet_init(&pkt_out, new_packet_size);
>>>> +    dp_packet_clear(&pkt_out);
>>>> +    dp_packet_prealloc_tailroom(&pkt_out, new_packet_size);
>>>> +    pkt_out_ptr = &pkt_out;
>>>> +
>>>> +    /* Copy the L2 and L3 headers from the pkt_in as they would remain same*/
>>>> +    dp_packet_put(
>>>> +        &pkt_out, dp_packet_pull(pkt_in, pkt_in->l4_ofs), pkt_in->l4_ofs);
>>>> +
>>>> +    pkt_out.l2_5_ofs = pkt_in->l2_5_ofs;
>>>> +    pkt_out.l2_pad_size = pkt_in->l2_pad_size;
>>>> +    pkt_out.l3_ofs = pkt_in->l3_ofs;
>>>> +    pkt_out.l4_ofs = pkt_in->l4_ofs;
>>>> +
>>>> +    struct ip_header *out_ip = dp_packet_l3(&pkt_out);
>>>> +
>>>> +    struct udp_header *udp = dp_packet_put(
>>>> +        &pkt_out, dp_packet_pull(pkt_in, UDP_HEADER_LEN), UDP_HEADER_LEN);
>>>> +
>>>> +    struct dhcp_header *dhcp_data = dp_packet_put(&pkt_out,
>>>> +        dp_packet_pull(pkt_in, new_l4_size - UDP_HEADER_LEN),
>>>> +        new_l4_size - UDP_HEADER_LEN);
>>>> +    dhcp_data->giaddr = *relay_ip;
>>>> +    if (udp->udp_csum) {
>>>> +        udp->udp_csum = recalc_csum32(udp->udp_csum,
>>>> +            0, dhcp_data->giaddr);
>>>> +    }
>>>> +    pin->packet = dp_packet_data(&pkt_out);
>>>> +    pin->packet_len = dp_packet_size(&pkt_out);
>>>> +
>>>> +    /* Log the DHCP message. */
>>>> +    static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(20, 40);
>>>> +    const struct eth_header *l2 = dp_packet_eth(&pkt_out);
>>>> +    VLOG_INFO_RL(&rl, "DHCP_RELAY_REQ:: MSG_TYPE:%s MAC:"ETH_ADDR_FMT
>>>> +                " XID:%u"
>>>> +                " REQ_IP:"IP_FMT
>>>> +                " GIADDR:"IP_FMT
>>>> +                " SERVER_ADDR:"IP_FMT,
>>>> +                dhcp_msg_str_get(*in_dhcp_msg_type),
>>>> +                ETH_ADDR_BYTES_ARGS(dhcp_data->chaddr), ntohl(dhcp_data->xid),
>>>> +                IP_ARGS(request_ip), IP_ARGS(dhcp_data->giaddr),
>>>> +                IP_ARGS(*server_ip));
>>>> +    queue_msg(swconn, ofputil_encode_resume(pin, continuation, proto));
>>>> +    if (pkt_out_ptr) {
>>>> +        dp_packet_uninit(pkt_out_ptr);
>>>> +    }
>>>> +}
>>>> +
>>>> +/* Called with in the pinctrl_handler thread context. */
>>>> +static void
>>>> +pinctrl_handle_dhcp_relay_resp_fwd(
>>>> +    struct rconn *swconn,
>>>> +    struct dp_packet *pkt_in, struct ofputil_packet_in *pin,
>>>> +    struct ofpbuf *userdata,
>>>> +    struct ofpbuf *continuation)
>>>> +{
>>>> +    enum ofp_version version = rconn_get_version(swconn);
>>>> +    enum ofputil_protocol proto = ofputil_protocol_from_ofp_version(version);
>>>> +    struct dp_packet *pkt_out_ptr = NULL;
>>>> +
>>>> +    /* Parse relay IP and server IP. */
>>>> +    ovs_be32 *relay_ip = ofpbuf_try_pull(userdata, sizeof *relay_ip);
>>>> +    ovs_be32 *server_ip = ofpbuf_try_pull(userdata, sizeof *server_ip);
>>>> +    if (!relay_ip || !server_ip) {
>>>> +        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
>>>> +        VLOG_WARN_RL(&rl, "DHCP_RELAY_RESP: relay ip or server ip "
>>>> +                "not present in the userdata");
>>>> +        return;
>>>> +    }
>>>> +
>>>> +    /* Validate the DHCP response packet.
>>>> +     * Format of the DHCP packet is
>>>> +     * ------------------------------------------------------------------------
>>>> +     *| UDP HEADER  | DHCP HEADER  | 4 Byte DHCP Cookie | DHCP OPTIONS(var len)|
>>>> +     * ------------------------------------------------------------------------
>>>> +     */
>>>> +
>>>> +    size_t in_l4_size = dp_packet_l4_size(pkt_in);
>>>> +    const char *end = (char *) dp_packet_l4(pkt_in) + in_l4_size;
>>>> +    const char *in_dhcp_ptr = dp_packet_get_udp_payload(pkt_in);
>>>> +    if (!in_dhcp_ptr) {
>>>> +        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
>>>> +        VLOG_WARN_RL(&rl, "DHCP_RELAY_RESP_FWD: invalid or incomplete "
>>>> +                "packet received");
>>>> +        return;
>>>> +    }
>>>> +
>>>> +    const struct dhcp_header *in_dhcp_data
>>>> +        = (const struct dhcp_header *) in_dhcp_ptr;
>>>> +    in_dhcp_ptr += sizeof *in_dhcp_data;
>>>> +    if (in_dhcp_ptr > end) {
>>>> +        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
>>>> +        VLOG_WARN_RL(&rl, "DHCP_RELAY_RESP_FWD: invalid or incomplete "
>>>> +                    "packet received, bad data length");
>>>> +        return;
>>>> +    }
>>>> +    if (in_dhcp_data->op != DHCP_OP_REPLY) {
>>>> +        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
>>>> +        VLOG_WARN_RL(&rl, "DHCP_RELAY_RESP_FWD: invalid opcode "
>>>> +                "in the packet: %d", in_dhcp_data->op);
>>>> +        return;
>>>> +    }
>>>> +
>>>> +    /* DHCP options follow the DHCP header. The first 4 bytes of the DHCP
>>>> +     * options is the DHCP magic cookie followed by the actual DHCP options.
>>>> +     */
>>>> +    ovs_be32 magic_cookie = htonl(DHCP_MAGIC_COOKIE);
>>>> +    if (in_dhcp_ptr + sizeof magic_cookie > end ||
>>>> +        get_unaligned_be32((const void *) in_dhcp_ptr) != magic_cookie) {
>>>> +        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
>>>> +        VLOG_WARN_RL(&rl, "DHCP_RELAY_RESP_FWD: magic cookie not present "
>>>> +                "in the packet");
>>>> +        return;
>>>> +    }
>>>> +
>>>> +    if (!in_dhcp_data->giaddr) {
>>>> +        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
>>>> +        VLOG_WARN_RL(&rl, "DHCP_RELAY_RESP_FWD: giaddr is "
>>>> +                "not set in response");
>>>> +        return;
>>>> +    }
>>>> +    ovs_be32 giaddr = in_dhcp_data->giaddr;
>>>> +
>>>> +    ovs_be32 *server_id_ptr = NULL;
>>>> +    ovs_be32 lease_time = 0;
>>>> +    const uint8_t *in_dhcp_msg_type = NULL;
>>>> +
>>>> +    in_dhcp_ptr += sizeof magic_cookie;
>>>> +    while (in_dhcp_ptr < end) {
>>>> +        const struct dhcp_opt_header *in_dhcp_opt =
>>>> +            (const struct dhcp_opt_header *) in_dhcp_ptr;
>>>> +        if (in_dhcp_opt->code == DHCP_OPT_END) {
>>>> +            break;
>>>> +        }
>>>> +        if (in_dhcp_opt->code == DHCP_OPT_PAD) {
>>>> +            in_dhcp_ptr += 1;
>>>> +            continue;
>>>> +        }
>>>> +        in_dhcp_ptr += sizeof *in_dhcp_opt;
>>>> +        if (in_dhcp_ptr > end) {
>>>> +            break;
>>>> +        }
>>>> +        in_dhcp_ptr += in_dhcp_opt->len;
>>>> +        if (in_dhcp_ptr > end) {
>>>> +            break;
>>>> +        }
>>>> +
>>>> +        switch (in_dhcp_opt->code) {
>>>> +        case DHCP_OPT_MSG_TYPE:
>>>> +            if (in_dhcp_opt->len == 1) {
>>>> +                in_dhcp_msg_type = DHCP_OPT_PAYLOAD(in_dhcp_opt);
>>>> +            }
>>>> +            break;
>>>> +        /* Server Identifier */
>>>> +        case OVN_DHCP_OPT_CODE_SERVER_ID:
>>>> +            if (in_dhcp_opt->len == 4) {
>>>> +                server_id_ptr = DHCP_OPT_PAYLOAD(in_dhcp_opt);
>>>> +            }
>>>> +            break;
>>>> +        case OVN_DHCP_OPT_CODE_LEASE_TIME:
>>>> +            if (in_dhcp_opt->len == 4) {
>>>> +                lease_time = get_unaligned_be32(DHCP_OPT_PAYLOAD(in_dhcp_opt));
>>>> +            }
>>>> +            break;
>>>> +        default:
>>>> +            break;
>>>> +        }
>>>> +    }
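
For reference while reviewing the loop above, here is a minimal standalone sketch of a bounds-checked walk over the DHCP options area (code/length/payload triples, with PAD and END as single-byte options per RFC 2132). The names find_dhcp_opt, OPT_PAD and OPT_END are hypothetical and are not taken from the patch or from OVS; the sketch only illustrates the checks that need to happen before the length byte and the payload are read.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names for illustration; values follow RFC 2132. */
enum { OPT_PAD = 0, OPT_END = 255 };

/* Returns a pointer to the payload of the first option with the given
 * code, or NULL if the option is absent or the options area is truncated.
 * On success, *len_out holds the payload length. */
static const uint8_t *
find_dhcp_opt(const uint8_t *opts, size_t size, uint8_t code,
              uint8_t *len_out)
{
    size_t i = 0;

    while (i < size) {
        uint8_t c = opts[i];

        if (c == OPT_END) {
            break;
        }
        if (c == OPT_PAD) {
            i++;                    /* PAD has no length byte. */
            continue;
        }
        if (size - i < 2) {
            break;                  /* No room for the length byte. */
        }
        uint8_t len = opts[i + 1];
        if (size - i - 2 < len) {
            break;                  /* Payload runs past the buffer. */
        }
        if (c == code) {
            *len_out = len;
            return &opts[i + 2];
        }
        i += 2 + (size_t) len;
    }
    return NULL;
}
```

A caller would still have to check the returned length (1 for the message type, 4 for the server identifier and lease time) before interpreting the payload, as the switch statement in the patch does.
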
>>>> +
>>>> +    /* Check whether the DHCP Message Type (opt 53) is present or not */
>>>> +    if (!in_dhcp_msg_type) {
>>>> +        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
>>>> +        VLOG_WARN_RL(&rl, "DHCP_RELAY_RESP_FWD: missing message type");
>>>> +        return;
>>>> +    }
>>>> +
>>>> +    if (!server_id_ptr) {
>>>> +        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
>>>> +        VLOG_WARN_RL(&rl, "DHCP_RELAY_RESP_FWD: missing server identifier");
>>>> +        return;
>>>> +    }
>>>> +
>>>> +    if (*server_id_ptr != *server_ip) {
>>>> +        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
>>>> +        VLOG_WARN_RL(&rl, "DHCP_RELAY_RESP_FWD: server identifier mismatch");
>>>> +        return;
>>>> +    }
>>>> +
>>>> +    if (giaddr != *relay_ip) {
>>>> +        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
>>>> +        VLOG_WARN_RL(&rl, "DHCP_RELAY_RESP_FWD: giaddr mismatch");
>>>> +        return;
>>>> +    }
>>>> +
>>>> +
>>>> +    /* Update destination MAC & IP so that the packet is forwarded to the
>>>> +     * right destination node.
>>>> +     */
>>>> +    uint16_t new_l4_size = in_l4_size;
>>>> +    size_t new_packet_size = pkt_in->l4_ofs + new_l4_size;
>>>> +
>>>> +    struct dp_packet pkt_out;
>>>> +    dp_packet_init(&pkt_out, new_packet_size);
>>>> +    dp_packet_clear(&pkt_out);
>>>> +    dp_packet_prealloc_tailroom(&pkt_out, new_packet_size);
>>>> +    pkt_out_ptr = &pkt_out;
>>>> +
>>>> +    /* Copy the L2 and L3 headers from pkt_in, as they remain the same. */
>>>> +    struct eth_header *eth = dp_packet_put(
>>>> +        &pkt_out, dp_packet_pull(pkt_in, pkt_in->l4_ofs), pkt_in->l4_ofs);
>>>> +
>>>> +    pkt_out.l2_5_ofs = pkt_in->l2_5_ofs;
>>>> +    pkt_out.l2_pad_size = pkt_in->l2_pad_size;
>>>> +    pkt_out.l3_ofs = pkt_in->l3_ofs;
>>>> +    pkt_out.l4_ofs = pkt_in->l4_ofs;
>>>> +
>>>> +    struct udp_header *udp = dp_packet_put(
>>>> +        &pkt_out, dp_packet_pull(pkt_in, UDP_HEADER_LEN), UDP_HEADER_LEN);
>>>> +
>>>> +    struct dhcp_header *dhcp_data = dp_packet_put(
>>>> +        &pkt_out, dp_packet_pull(pkt_in, new_l4_size - UDP_HEADER_LEN),
>>>> +        new_l4_size - UDP_HEADER_LEN);
>>>> +    memcpy(&eth->eth_dst, dhcp_data->chaddr, sizeof(eth->eth_dst));
>>>> +
>>>> +    /* Send a broadcast IP frame when BROADCAST flag is set. */
>>>> +    struct ip_header *out_ip = dp_packet_l3(&pkt_out);
>>>> +    ovs_be32 ip_dst;
>>>> +    ovs_be32 ip_dst_orig = get_16aligned_be32(&out_ip->ip_dst);
>>>> +    if (!is_dhcp_flags_broadcast(dhcp_data->flags)) {
>>>> +        ip_dst = dhcp_data->yiaddr;
>>>> +    } else {
>>>> +        ip_dst = htonl(0xffffffff);
>>>> +    }
>>>> +    put_16aligned_be32(&out_ip->ip_dst, ip_dst);
>>>> +    out_ip->ip_csum = recalc_csum32(out_ip->ip_csum,
>>>> +              ip_dst_orig, ip_dst);
>>>> +    if (udp->udp_csum) {
>>>> +        udp->udp_csum = recalc_csum32(udp->udp_csum,
>>>> +            ip_dst_orig, ip_dst);
>>>> +    }
>>>> +    /* Reset giaddr */
>>>> +    dhcp_data->giaddr = htonl(0x0);
>>>> +    if (udp->udp_csum) {
>>>> +        udp->udp_csum = recalc_csum32(udp->udp_csum,
>>>> +            giaddr, 0);
>>>> +    }
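
As a sanity check on the checksum handling above (one incremental update for the rewritten destination IP, one for the cleared giaddr), here is a small standalone sketch of the incremental update from RFC 1624, eqn. 3: HC' = ~(~HC + ~m + m'). The helpers csum_full, csum_update16 and csum_update32 are hypothetical illustrations working on host-order 16-bit words; they are not the OVS recalc_csum32() implementation, which operates on network-order values.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Full Internet checksum (RFC 1071) over an array of 16-bit words. */
static uint16_t
csum_full(const uint16_t *words, size_t n)
{
    uint32_t sum = 0;

    for (size_t i = 0; i < n; i++) {
        sum += words[i];
    }
    while (sum >> 16) {
        sum = (sum & 0xffff) + (sum >> 16);   /* Fold the carries. */
    }
    return (uint16_t) ~sum;
}

/* RFC 1624, eqn. 3: update 'csum' after one 16-bit word changed from
 * 'old_w' to 'new_w', without re-reading the rest of the data. */
static uint16_t
csum_update16(uint16_t csum, uint16_t old_w, uint16_t new_w)
{
    uint32_t sum = (uint16_t) ~csum;

    sum += (uint16_t) ~old_w;
    sum += new_w;
    while (sum >> 16) {
        sum = (sum & 0xffff) + (sum >> 16);
    }
    return (uint16_t) ~sum;
}

/* A 32-bit field is handled as two consecutive 16-bit updates. */
static uint16_t
csum_update32(uint16_t csum, uint32_t old_v, uint32_t new_v)
{
    csum = csum_update16(csum, (uint16_t) (old_v >> 16),
                         (uint16_t) (new_v >> 16));
    return csum_update16(csum, (uint16_t) (old_v & 0xffff),
                         (uint16_t) (new_v & 0xffff));
}
```

The point of the sketch is that each changed field needs exactly one update with its old and new value, which matches the structure of the code above: ip_dst_orig to ip_dst for the pseudo-header, then giaddr to 0 for the DHCP payload.
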
>>>> +    pin->packet = dp_packet_data(&pkt_out);
>>>> +    pin->packet_len = dp_packet_size(&pkt_out);
>>>> +
>>>> +    /* Log the DHCP message. */
>>>> +    static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(20, 40);
>>>> +    const struct eth_header *l2 = dp_packet_eth(&pkt_out);
>>>> +    VLOG_INFO_RL(&rl, "DHCP_RELAY_RESP_FWD: MSG_TYPE:%s MAC:"ETH_ADDR_FMT
>>>> +             " XID:%u"
>>>> +             " YIADDR:"IP_FMT
>>>> +             " GIADDR:"IP_FMT
>>>> +             " SERVER_ADDR:"IP_FMT,
>>>> +             dhcp_msg_str_get(*in_dhcp_msg_type),
>>>> +             ETH_ADDR_BYTES_ARGS(dhcp_data->chaddr), ntohl(dhcp_data->xid),
>>>> +             IP_ARGS(dhcp_data->yiaddr),
>>>> +             IP_ARGS(giaddr), IP_ARGS(*server_id_ptr));
>>>> +    queue_msg(swconn, ofputil_encode_resume(pin, continuation, proto));
>>>> +    if (pkt_out_ptr) {
>>>> +        dp_packet_uninit(pkt_out_ptr);
>>>> +    }
>>>> +}
>>>> +
>>>> /* Called with in the pinctrl_handler thread context. */
>>>> static void
>>>> pinctrl_handle_put_dhcp_opts(
>>>> @@ -3203,6 +3634,16 @@ process_packet_in(struct rconn *swconn, const struct ofp_header *msg)
>>>>        ovs_mutex_unlock(&pinctrl_mutex);
>>>>        break;
>>>> 
>>>> +    case ACTION_OPCODE_DHCP_RELAY_REQ:
>>>> +        pinctrl_handle_dhcp_relay_req(swconn, &packet, &pin,
>>>> +                                     &userdata, &continuation);
>>>> +        break;
>>>> +
>>>> +    case ACTION_OPCODE_DHCP_RELAY_RESP_FWD:
>>>> +        pinctrl_handle_dhcp_relay_resp_fwd(swconn, &packet, &pin,
>>>> +                                     &userdata, &continuation);
>>>> +        break;
>>>> +
>>>>    case ACTION_OPCODE_PUT_DHCP_OPTS:
>>>>        pinctrl_handle_put_dhcp_opts(swconn, &packet, &pin, &headers,
>>>>                                     &userdata, &continuation);
>>>> diff --git a/include/ovn/actions.h b/include/ovn/actions.h
>>>> index 49cfe0624..47d41b90f 100644
>>>> --- a/include/ovn/actions.h
>>>> +++ b/include/ovn/actions.h
>>>> @@ -95,6 +95,8 @@ struct collector_set_ids;
>>>>    OVNACT(LOOKUP_ND_IP,      ovnact_lookup_mac_bind_ip) \
>>>>    OVNACT(PUT_DHCPV4_OPTS,   ovnact_put_opts)        \
>>>>    OVNACT(PUT_DHCPV6_OPTS,   ovnact_put_opts)        \
>>>> +    OVNACT(DHCPV4_RELAY_REQ,  ovnact_dhcp_relay)      \
>>>> +    OVNACT(DHCPV4_RELAY_RESP_FWD, ovnact_dhcp_relay)      \
>>>>    OVNACT(SET_QUEUE,         ovnact_set_queue)       \
>>>>    OVNACT(DNS_LOOKUP,        ovnact_result)          \
>>>>    OVNACT(LOG,               ovnact_log)             \
>>>> @@ -387,6 +389,14 @@ struct ovnact_put_opts {
>>>>    size_t n_options;
>>>> };
>>>> 
>>>> +/* OVNACT_DHCP_RELAY. */
>>>> +struct ovnact_dhcp_relay {
>>>> +    struct ovnact ovnact;
>>>> +    int family;
>>>> +    ovs_be32 relay_ipv4;
>>>> +    ovs_be32 server_ipv4;
>>>> +};
>>>> +
>>>> /* Valid arguments to SET_QUEUE action.
>>>> *
>>>> * QDISC_MIN_QUEUE_ID is the default queue, so user-defined queues should
>>>> @@ -750,6 +760,22 @@ enum action_opcode {
>>>> 
>>>>    /* multicast group split buffer action. */
>>>>    ACTION_OPCODE_MG_SPLIT_BUF,
>>>> +
>>>> +    /* "dhcp_relay_req(relay_ip, server_ip)".
>>>> +     *
>>>> +     * Arguments follow the action_header, in this format:
>>>> +     *   - The 32-bit DHCP relay IP.
>>>> +     *   - The 32-bit DHCP server IP.
>>>> +     */
>>>> +    ACTION_OPCODE_DHCP_RELAY_REQ,
>>>> +
>>>> +    /* "dhcp_relay_resp_fwd(relay_ip, server_ip)".
>>>> +     *
>>>> +     * Arguments follow the action_header, in this format:
>>>> +     *   - The 32-bit DHCP relay IP.
>>>> +     *   - The 32-bit DHCP server IP.
>>>> +     */
>>>> +    ACTION_OPCODE_DHCP_RELAY_RESP_FWD,
>>>> };
>>>> 
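
To make the argument layout described in the two comments above concrete, here is a minimal sketch of writing and reading the two 32-bit addresses that follow the action header. It is an illustration only: put_relay_args, pull_relay_args and RELAY_ARGS_LEN are hypothetical names, the addresses are kept as 4-byte arrays already in network order, and the real code uses ofpbuf_put() and ofpbuf_try_pull() rather than raw buffers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Two IPv4 addresses in network byte order: relay IP, then server IP. */
#define RELAY_ARGS_LEN 8

/* Encoder side: append both addresses after the action header. */
static bool
put_relay_args(uint8_t *buf, size_t cap,
               const uint8_t relay_ip[4], const uint8_t server_ip[4])
{
    if (cap < RELAY_ARGS_LEN) {
        return false;
    }
    memcpy(buf, relay_ip, 4);
    memcpy(buf + 4, server_ip, 4);
    return true;
}

/* Handler side: refuse truncated userdata before touching either field. */
static bool
pull_relay_args(const uint8_t *buf, size_t len,
                uint8_t relay_ip[4], uint8_t server_ip[4])
{
    if (len < RELAY_ARGS_LEN) {
        return false;
    }
    memcpy(relay_ip, buf, 4);
    memcpy(server_ip, buf + 4, 4);
    return true;
}
```

Using memcpy() here sidesteps alignment assumptions about the userdata buffer; whether that matters for the ofpbuf-based code depends on how the buffer is laid out, which I have not checked.
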
>>>> /* Header. */
>>>> diff --git a/lib/actions.c b/lib/actions.c
>>>> index a73fe1a1e..69df428c6 100644
>>>> --- a/lib/actions.c
>>>> +++ b/lib/actions.c
>>>> @@ -2629,6 +2629,118 @@ ovnact_controller_event_free(struct ovnact_controller_event *event)
>>>>    free_gen_options(event->options, event->n_options);
>>>> }
>>>> 
>>>> +static void
>>>> +format_DHCPV4_RELAY_REQ(const struct ovnact_dhcp_relay *dhcp_relay,
>>>> +                struct ds *s)
>>>> +{
>>>> +    ds_put_format(s, "dhcp_relay_req("IP_FMT","IP_FMT");",
>>>> +                  IP_ARGS(dhcp_relay->relay_ipv4),
>>>> +                  IP_ARGS(dhcp_relay->server_ipv4));
>>>> +}
>>>> +
>>>> +static void
>>>> +parse_dhcp_relay_req(struct action_context *ctx,
>>>> +               struct ovnact_dhcp_relay *dhcp_relay)
>>>> +{
>>>> +    /* Skip dhcp_relay_req( */
>>>> +    lexer_force_match(ctx->lexer, LEX_T_LPAREN);
>>>> +
>>>> +    /* Parse relay ip and server ip. */
>>>> +    if (ctx->lexer->token.format == LEX_F_IPV4) {
>>>> +        dhcp_relay->family = AF_INET;
>>>> +        dhcp_relay->relay_ipv4 = ctx->lexer->token.value.ipv4;
>>>> +        lexer_get(ctx->lexer);
>>>> +        lexer_force_match(ctx->lexer, LEX_T_COMMA);
>>>> +        if (ctx->lexer->token.format == LEX_F_IPV4) {
>>>> +            dhcp_relay->family = AF_INET;
>>>> +            dhcp_relay->server_ipv4 = ctx->lexer->token.value.ipv4;
>>>> +            lexer_get(ctx->lexer);
>>>> +        } else {
>>>> +            lexer_syntax_error(ctx->lexer, "expecting IPv4 dhcp server ip");
>>>> +            return;
>>>> +        }
>>>> +    } else {
>>>> +        lexer_syntax_error(ctx->lexer, "expecting IPv4 dhcp relay "
>>>> +                           "and server ips");
>>>> +        return;
>>>> +    }
>>>> +    lexer_force_match(ctx->lexer, LEX_T_RPAREN);
>>>> +}
>>>> +
>>>> +static void
>>>> +encode_DHCPV4_RELAY_REQ(const struct ovnact_dhcp_relay *dhcp_relay,
>>>> +                    const struct ovnact_encode_params *ep,
>>>> +                    struct ofpbuf *ofpacts)
>>>> +{
>>>> +    size_t oc_offset = encode_start_controller_op(ACTION_OPCODE_DHCP_RELAY_REQ,
>>>> +                                                  true, ep->ctrl_meter_id,
>>>> +                                                  ofpacts);
>>>> +    ofpbuf_put(ofpacts, &dhcp_relay->relay_ipv4,
>>>> +            sizeof(dhcp_relay->relay_ipv4));
>>>> +    ofpbuf_put(ofpacts, &dhcp_relay->server_ipv4,
>>>> +            sizeof(dhcp_relay->server_ipv4));
>>>> +    encode_finish_controller_op(oc_offset, ofpacts);
>>>> +}
>>>> +
>>>> +static void
>>>> +format_DHCPV4_RELAY_RESP_FWD(const struct ovnact_dhcp_relay *dhcp_relay,
>>>> +                    struct ds *s)
>>>> +{
>>>> +    ds_put_format(s, "dhcp_relay_resp_fwd("IP_FMT","IP_FMT");",
>>>> +                  IP_ARGS(dhcp_relay->relay_ipv4),
>>>> +                  IP_ARGS(dhcp_relay->server_ipv4));
>>>> +}
>>>> +
>>>> +static void
>>>> +parse_dhcp_relay_resp_fwd(struct action_context *ctx,
>>>> +               struct ovnact_dhcp_relay *dhcp_relay)
>>>> +{
>>>> +    /* Skip dhcp_relay_resp_fwd( */
>>>> +    lexer_force_match(ctx->lexer, LEX_T_LPAREN);
>>>> +
>>>> +    /* Parse relay ip and server ip. */
>>>> +    if (ctx->lexer->token.format == LEX_F_IPV4) {
>>>> +        dhcp_relay->family = AF_INET;
>>>> +        dhcp_relay->relay_ipv4 = ctx->lexer->token.value.ipv4;
>>>> +        lexer_get(ctx->lexer);
>>>> +        lexer_force_match(ctx->lexer, LEX_T_COMMA);
>>>> +        if (ctx->lexer->token.format == LEX_F_IPV4) {
>>>> +            dhcp_relay->family = AF_INET;
>>>> +            dhcp_relay->server_ipv4 = ctx->lexer->token.value.ipv4;
>>>> +            lexer_get(ctx->lexer);
>>>> +        } else {
>>>> +            lexer_syntax_error(ctx->lexer, "expecting IPv4 dhcp server ip");
>>>> +            return;
>>>> +        }
>>>> +    } else {
>>>> +        lexer_syntax_error(ctx->lexer, "expecting IPv4 dhcp relay and "
>>>> +                           "server ips");
>>>> +        return;
>>>> +    }
>>>> +    lexer_force_match(ctx->lexer, LEX_T_RPAREN);
>>>> +}
>>>> +
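
One property worth keeping in mind for the format/parse pairs above is that the formatted string should be something the parser accepts again, under the same action name. Below is a rough standalone sketch of that round trip using plain snprintf()/sscanf(); format_relay_action and parse_relay_action are hypothetical helpers for illustration and have nothing to do with the OVN lexer API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Formats "<name>(<relay_ip>,<server_ip>);" into 'buf'. */
static int
format_relay_action(char *buf, size_t cap, const char *name,
                    const char *relay_ip, const char *server_ip)
{
    return snprintf(buf, cap, "%s(%s,%s);", name, relay_ip, server_ip);
}

/* Accepts only "<name>(<ipv4-ish>,<ipv4-ish>);" with nothing trailing.
 * The address check is deliberately loose (digits and dots only). */
static bool
parse_relay_action(const char *s, const char *name,
                   char relay_ip[16], char server_ip[16])
{
    size_t n = strlen(name);
    int consumed = 0;

    if (strncmp(s, name, n) != 0 || s[n] != '(') {
        return false;
    }
    /* %n is only stored if the literal ");" matched as well. */
    if (sscanf(s + n, "(%15[0-9.],%15[0-9.]);%n",
               relay_ip, server_ip, &consumed) != 2) {
        return false;
    }
    return consumed > 0 && s[n + consumed] == '\0';
}
```

In this sketch a string formatted under one name and parsed under another is rejected, and so is input with the comma missing, which are the two cases I would expect the action tests to cover.
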
>>>> +static void
>>>> +encode_DHCPV4_RELAY_RESP_FWD(const struct ovnact_dhcp_relay *dhcp_relay,
>>>> +                    const struct ovnact_encode_params *ep,
>>>> +                    struct ofpbuf *ofpacts)
>>>> +{
>>>> +    size_t oc_offset = encode_start_controller_op(
>>>> +                                ACTION_OPCODE_DHCP_RELAY_RESP_FWD,
>>>> +                                true, ep->ctrl_meter_id,
>>>> +                                ofpacts);
>>>> +    ofpbuf_put(ofpacts, &dhcp_relay->relay_ipv4,
>>>> +                  sizeof(dhcp_relay->relay_ipv4));
>>>> +    ofpbuf_put(ofpacts, &dhcp_relay->server_ipv4,
>>>> +                  sizeof(dhcp_relay->server_ipv4));
>>>> +    encode_finish_controller_op(oc_offset, ofpacts);
>>>> +}
>>>> +
>>>> +static void
>>>> +ovnact_dhcp_relay_free(struct ovnact_dhcp_relay *dhcp_relay OVS_UNUSED)
>>>> +{
>>>> +}
>>>> +
>>>> static void
>>>> parse_put_opts(struct action_context *ctx, const struct expr_field *dst,
>>>>               struct ovnact_put_opts *po, const struct hmap *gen_opts,
>>>> @@ -5451,6 +5563,11 @@ parse_action(struct action_context *ctx)
>>>>        parse_sample(ctx);
>>>>    } else if (lexer_match_id(ctx->lexer, "mac_cache_use")) {
>>>>        ovnact_put_MAC_CACHE_USE(ctx->ovnacts);
>>>> +    } else if (lexer_match_id(ctx->lexer, "dhcp_relay_req")) {
>>>> +        parse_dhcp_relay_req(ctx, ovnact_put_DHCPV4_RELAY_REQ(ctx->ovnacts));
>>>> +    } else if (lexer_match_id(ctx->lexer, "dhcp_relay_resp_fwd")) {
>>>> +        parse_dhcp_relay_resp_fwd(ctx,
>>>> +              ovnact_put_DHCPV4_RELAY_RESP_FWD(ctx->ovnacts));
>>>>    } else {
>>>>        lexer_syntax_error(ctx->lexer, "expecting action");
>>>>    }
>>>> diff --git a/lib/ovn-l7.h b/lib/ovn-l7.h
>>>> index ad514a922..e08581123 100644
>>>> --- a/lib/ovn-l7.h
>>>> +++ b/lib/ovn-l7.h
>>>> @@ -69,6 +69,7 @@ struct gen_opts_map {
>>>> */
>>>> #define OVN_DHCP_OPT_CODE_NETMASK      1
>>>> #define OVN_DHCP_OPT_CODE_LEASE_TIME   51
>>>> +#define OVN_DHCP_OPT_CODE_SERVER_ID    54
>>>> #define OVN_DHCP_OPT_CODE_T1           58
>>>> #define OVN_DHCP_OPT_CODE_T2           59
>>>> 
>>>> diff --git a/northd/northd.c b/northd/northd.c
>>>> index 07dffb15a..7ac831fae 100644
>>>> --- a/northd/northd.c
>>>> +++ b/northd/northd.c
>>>> @@ -181,11 +181,13 @@ enum ovn_stage {
>>>>    PIPELINE_STAGE(ROUTER, IN,  IP_ROUTING_ECMP, 14, "lr_in_ip_routing_ecmp") \
>>>>    PIPELINE_STAGE(ROUTER, IN,  POLICY,          15, "lr_in_policy")          \
>>>>    PIPELINE_STAGE(ROUTER, IN,  POLICY_ECMP,     16, "lr_in_policy_ecmp")     \
>>>> -    PIPELINE_STAGE(ROUTER, IN,  ARP_RESOLVE,     17, "lr_in_arp_resolve")     \
>>>> -    PIPELINE_STAGE(ROUTER, IN,  CHK_PKT_LEN,     18, "lr_in_chk_pkt_len")     \
>>>> -    PIPELINE_STAGE(ROUTER, IN,  LARGER_PKTS,     19, "lr_in_larger_pkts")     \
>>>> -    PIPELINE_STAGE(ROUTER, IN,  GW_REDIRECT,     20, "lr_in_gw_redirect")     \
>>>> -    PIPELINE_STAGE(ROUTER, IN,  ARP_REQUEST,     21, "lr_in_arp_request")     \
>>>> +    PIPELINE_STAGE(ROUTER, IN,  DHCP_RELAY_RESP_FWD, 17,                      \
>>>> +                  "lr_in_dhcp_relay_resp_fwd")                                \
>>>> +    PIPELINE_STAGE(ROUTER, IN,  ARP_RESOLVE,     18, "lr_in_arp_resolve")     \
>>>> +    PIPELINE_STAGE(ROUTER, IN,  CHK_PKT_LEN,     19, "lr_in_chk_pkt_len")     \
>>>> +    PIPELINE_STAGE(ROUTER, IN,  LARGER_PKTS,     20, "lr_in_larger_pkts")     \
>>>> +    PIPELINE_STAGE(ROUTER, IN,  GW_REDIRECT,     21, "lr_in_gw_redirect")     \
>>>> +    PIPELINE_STAGE(ROUTER, IN,  ARP_REQUEST,     22, "lr_in_arp_request")     \
>>>>                                                                      \
>>>>    /* Logical router egress stages. */                               \
>>>>    PIPELINE_STAGE(ROUTER, OUT, CHECK_DNAT_LOCAL,   0,                       \
>>>> @@ -9610,6 +9612,80 @@ build_dhcpv6_options_flows(struct ovn_port *op,
>>>>    ds_destroy(&match);
>>>> }
>>>> 
>>>> +static void
>>>> +build_lswitch_dhcp_relay_flows(struct ovn_port *op,
>>>> +                           const struct hmap *lr_ports,
>>>> +                           struct hmap *lflows,
>>>> +                           const struct shash *meter_groups OVS_UNUSED)
>>>> +{
>>>> +    if (op->nbrp || !op->nbsp) {
>>>> +        return;
>>>> +    }
>>>> +    /* Consider only ports attached to VMs. */
>>>> +    if (strcmp(op->nbsp->type, "")) {
>>>> +        return;
>>>> +    }
>>>> +
>>>> +    if (!op->od || !op->od->n_router_ports ||
>>>> +        !op->od->nbs || !op->od->nbs->dhcp_relay_port) {
>>>> +        return;
>>>> +    }
>>>> +
>>>> +    struct ds match = DS_EMPTY_INITIALIZER;
>>>> +    struct ds action = DS_EMPTY_INITIALIZER;
>>>> +    struct nbrec_logical_router_port *lrp = op->od->nbs->dhcp_relay_port;
>>>> +    struct ovn_port *rp = ovn_port_find(lr_ports, lrp->name);
>>>> +
>>>> +    if (!rp || !rp->nbrp || !rp->nbrp->dhcp_relay) {
>>>> +        return;
>>>> +    }
>>>> +
>>>> +    struct ovn_port *sp = NULL;
>>>> +    struct nbrec_dhcp_relay *dhcp_relay = rp->nbrp->dhcp_relay;
>>>> +
>>>> +    for (int i = 0; i < op->od->n_router_ports; i++) {
>>>> +        struct ovn_port *sp_tmp = op->od->router_ports[i];
>>>> +        if (sp_tmp->peer == rp) {
>>>> +            sp = sp_tmp;
>>>> +            break;
>>>> +        }
>>>> +    }
>>>> +    if (!sp) {
>>>> +        return;
>>>> +    }
>>>> +
>>>> +    char *server_ip_str = NULL;
>>>> +    uint16_t port;
>>>> +    int addr_family;
>>>> +    struct in6_addr server_ip;
>>>> +
>>>> +    if (!ip_address_and_port_from_lb_key(dhcp_relay->servers, &server_ip_str,
>>>> +                                         &server_ip, &port, &addr_family)) {
>>>> +        return;
>>>> +    }
>>>> +
>>>> +    if (server_ip_str == NULL) {
>>>> +        return;
>>>> +    }
>>>> +
>>>> +    ds_put_format(
>>>> +        &match, "inport == %s && eth.src == %s && "
>>>> +        "ip4.src == 0.0.0.0 && ip4.dst == 255.255.255.255 && "
>>>> +        "udp.src == 68 && udp.dst == 67",
>>>> +        op->json_key, op->lsp_addrs[0].ea_s);
>>>> +    ds_put_format(&action,
>>>> +                  "eth.dst=%s;outport=%s;next;/* DHCP_RELAY_REQ */",
>>>> +                  rp->lrp_networks.ea_s, sp->json_key);
>>>> +    ovn_lflow_add_with_hint__(lflows, op->od,
>>>> +                              S_SWITCH_IN_L2_LKUP, 100,
>>>> +                              ds_cstr(&match),
>>>> +                              ds_cstr(&action),
>>>> +                              op->key,
>>>> +                              NULL,
>>>> +                              &lrp->header_);
>>>> +    ds_destroy(&match);
>>>> +    ds_destroy(&action);
>>>> +    free(server_ip_str);
>>>> +}
>>>> +
>>>> static void
>>>> build_drop_arp_nd_flows_for_unbound_router_ports(struct ovn_port *op,
>>>>                                                 const struct ovn_port *port,
>>>> @@ -10181,6 +10257,13 @@ build_lswitch_dhcp_options_and_response(struct ovn_port *op,
>>>>        return;
>>>>    }
>>>> 
>>>> +    if (op->od && op->od->nbs
>>>> +        && op->od->nbs->dhcp_relay_port) {
>>>> +        /* Don't add the DHCP server flows if DHCP Relay is enabled on the
>>>> +         * logical switch. */
>>>> +        return;
>>>> +    }
>>>> +
>>>>    bool is_external = lsp_is_external(op->nbsp);
>>>>    if (is_external && (!op->od->n_localnet_ports ||
>>>>                        !op->nbsp->ha_chassis_group)) {
>>>> @@ -14458,6 +14541,86 @@ build_dhcpv6_reply_flows_for_lrouter_port(
>>>>    }
>>>> }
>>>> 
>>>> +static void
>>>> +build_dhcp_relay_flows_for_lrouter_port(
>>>> +        struct ovn_port *op, struct hmap *lflows,
>>>> +        struct ds *match)
>>>> +{
>>>> +    if (!op->nbrp || !op->nbrp->dhcp_relay) {
>>>> +        return;
>>>> +    }
>>>> +    struct nbrec_dhcp_relay *dhcp_relay = op->nbrp->dhcp_relay;
>>>> +    if (!dhcp_relay->servers) {
>>>> +        return;
>>>> +    }
>>>> +
>>>> +    int addr_family;
>>>> +    /* Custom ports are not currently supported. */
>>>> +    uint16_t port;
>>>> +    char *server_ip_str = NULL;
>>>> +    struct in6_addr server_ip;
>>>> +
>>>> +    if (!ip_address_and_port_from_lb_key(dhcp_relay->servers, &server_ip_str,
>>>> +                                         &server_ip, &port, &addr_family)) {
>>>> +        return;
>>>> +    }
>>>> +
>>>> +    if (server_ip_str == NULL) {
>>>> +        return;
>>>> +    }
>>>> +
>>>> +    struct ds dhcp_action = DS_EMPTY_INITIALIZER;
>>>> +    ds_clear(match);
>>>> +    ds_put_format(
>>>> +        match, "inport == %s && "
>>>> +        "ip4.src == 0.0.0.0 && ip4.dst == 255.255.255.255 && "
>>>> +        "udp.src == 68 && udp.dst == 67",
>>>> +        op->json_key);
>>>> +    ds_put_format(&dhcp_action,
>>>> +                "dhcp_relay_req(%s,%s);"
>>>> +                "ip4.src=%s;ip4.dst=%s;udp.src=67;next; /* DHCP_RELAY_REQ */",
>>>> +                op->lrp_networks.ipv4_addrs[0].addr_s, server_ip_str,
>>>> +                op->lrp_networks.ipv4_addrs[0].addr_s, server_ip_str);
>>>> +
>>>> +    ovn_lflow_add_with_hint(lflows, op->od, S_ROUTER_IN_IP_INPUT, 110,
>>>> +                            ds_cstr(match), ds_cstr(&dhcp_action),
>>>> +                            &op->nbrp->header_);
>>>> +
>>>> +    ds_clear(match);
>>>> +    ds_clear(&dhcp_action);
>>>> +
>>>> +    ds_put_format(
>>>> +        match, "ip4.src == %s && ip4.dst == %s && "
>>>> +        "udp.src == 67 && udp.dst == 67",
>>>> +        server_ip_str, op->lrp_networks.ipv4_addrs[0].addr_s);
>>>> +    ds_put_format(&dhcp_action, "next;/* DHCP_RELAY_RESP */");
>>>> +    ovn_lflow_add_with_hint(lflows, op->od, S_ROUTER_IN_IP_INPUT, 110,
>>>> +                            ds_cstr(match), ds_cstr(&dhcp_action),
>>>> +                            &op->nbrp->header_);
>>>> +
>>>> +    ds_clear(match);
>>>> +    ds_clear(&dhcp_action);
>>>> +
>>>> +    ds_put_format(
>>>> +        match, "ip4.src == %s && ip4.dst == %s && "
>>>> +        "udp.src == 67 && udp.dst == 67",
>>>> +        server_ip_str, op->lrp_networks.ipv4_addrs[0].addr_s);
>>>> +    ds_put_format(&dhcp_action,
>>>> +          "dhcp_relay_resp_fwd(%s,%s);ip4.src=%s;udp.dst=68;"
>>>> +          "outport=%s;output; /* DHCP_RELAY_RESP */",
>>>> +          op->lrp_networks.ipv4_addrs[0].addr_s, server_ip_str,
>>>> +          op->lrp_networks.ipv4_addrs[0].addr_s, op->json_key);
>>>> +    ovn_lflow_add_with_hint(lflows, op->od, S_ROUTER_IN_DHCP_RELAY_RESP_FWD,
>>>> +                            110,
>>>> +                            ds_cstr(match), ds_cstr(&dhcp_action),
>>>> +                            &op->nbrp->header_);
>>>> +
>>>> +    ds_clear(match);
>>>> +    ds_destroy(&dhcp_action);
>>>> +
>>>> +    free(server_ip_str);
>>>> +}
>>>> +
>>>> static void
>>>> build_ipv6_input_flows_for_lrouter_port(
>>>>        struct ovn_port *op, struct hmap *lflows,
>>>> @@ -15673,6 +15836,8 @@ build_lrouter_nat_defrag_and_lb(struct ovn_datapath *od, struct hmap *lflows,
>>>>    ovn_lflow_add(lflows, od, S_ROUTER_OUT_POST_SNAT, 0, "1", "next;");
>>>>    ovn_lflow_add(lflows, od, S_ROUTER_OUT_EGR_LOOP, 0, "1", "next;");
>>>>    ovn_lflow_add(lflows, od, S_ROUTER_IN_ECMP_STATEFUL, 0, "1", "next;");
>>>> +    ovn_lflow_add(lflows, od, S_ROUTER_IN_DHCP_RELAY_RESP_FWD, 0, "1",
>>>> +                  "next;");
>>>> 
>>>>    const char *ct_flag_reg = features->ct_no_masked_label
>>>>                              ? "ct_mark"
>>>> @@ -16154,6 +16319,7 @@ build_lswitch_and_lrouter_iterate_by_lsp(struct ovn_port *op,
>>>>    build_lswitch_dhcp_options_and_response(op, lflows, meter_groups);
>>>>    build_lswitch_external_port(op, lflows);
>>>>    build_lswitch_ip_unicast_lookup(op, lflows, actions, match);
>>>> +    build_lswitch_dhcp_relay_flows(op, lr_ports, lflows, meter_groups);
>>>> 
>>>>    /* Build Logical Router Flows. */
>>>>    build_ip_routing_flows_for_router_type_lsp(op, lr_ports, lflows);
>>>> @@ -16183,6 +16349,7 @@ build_lswitch_and_lrouter_iterate_by_lrp(struct ovn_port *op,
>>>>    build_egress_delivery_flows_for_lrouter_port(op, lsi->lflows, &lsi->match,
>>>>                                                 &lsi->actions);
>>>>    build_dhcpv6_reply_flows_for_lrouter_port(op, lsi->lflows, &lsi->match);
>>>> +    build_dhcp_relay_flows_for_lrouter_port(op, lsi->lflows, &lsi->match);
>>>>    build_ipv6_input_flows_for_lrouter_port(op, lsi->lflows,
>>>>                                            &lsi->match, &lsi->actions,
>>>>                                            lsi->meter_groups);
>>>> diff --git a/ovn-nb.ovsschema b/ovn-nb.ovsschema
>>>> index b2e0993e0..6863d52cd 100644
>>>> --- a/ovn-nb.ovsschema
>>>> +++ b/ovn-nb.ovsschema
>>>> @@ -1,7 +1,7 @@
>>>> {
>>>>    "name": "OVN_Northbound",
>>>> -    "version": "7.2.0",
>>>> -    "cksum": "1069338687 34162",
>>>> +    "version": "7.3.0",
>>>> +    "cksum": "2325497400 35185",
>>>>    "tables": {
>>>>        "NB_Global": {
>>>>            "columns": {
>>>> @@ -89,7 +89,12 @@
>>>>                    "type": {"key": {"type": "uuid",
>>>>                                     "refTable": "Forwarding_Group",
>>>>                                     "refType": "strong"},
>>>> -                                     "min": 0, "max": "unlimited"}}},
>>>> +                                     "min": 0, "max": "unlimited"}},
>>>> +                "dhcp_relay_port": {"type": {"key": {"type": "uuid",
>>>> +                                            "refTable": "Logical_Router_Port",
>>>> +                                            "refType": "weak"},
>>>> +                                            "min": 0,
>>>> +                                            "max": 1}}},
>>>>            "isRoot": true},
>>>>        "Logical_Switch_Port": {
>>>>            "columns": {
>>>> @@ -436,6 +441,11 @@
>>>>                "ipv6_prefix": {"type": {"key": "string",
>>>>                                      "min": 0,
>>>>                                      "max": "unlimited"}},
>>>> +                "dhcp_relay": {"type": {"key": {"type": "uuid",
>>>> +                                            "refTable": "DHCP_Relay",
>>>> +                                            "refType": "weak"},
>>>> +                                            "min": 0,
>>>> +                                            "max": 1}},
>>>>                "external_ids": {
>>>>                    "type": {"key": "string", "value": "string",
>>>>                             "min": 0, "max": "unlimited"}},
>>>> @@ -529,6 +539,15 @@
>>>>                    "type": {"key": "string", "value": "string",
>>>>                             "min": 0, "max": "unlimited"}}},
>>>>            "isRoot": true},
>>>> +        "DHCP_Relay": {
>>>> +            "columns": {
>>>> +                "servers": {"type": {"key": "string",
>>>> +                                       "min": 0,
>>>> +                                       "max": 1}},
>>>> +                "external_ids": {
>>>> +                    "type": {"key": "string", "value": "string",
>>>> +                             "min": 0, "max": "unlimited"}}},
>>>> +            "isRoot": true},
>>>>        "Connection": {
>>>>            "columns": {
>>>>                "target": {"type": "string"},
>>>> diff --git a/ovn-nb.xml b/ovn-nb.xml
>>>> index fcb1c6ecc..dc20892e1 100644
>>>> --- a/ovn-nb.xml
>>>> +++ b/ovn-nb.xml
>>>> @@ -608,6 +608,11 @@
>>>>      Please see the <ref table="DNS"/> table.
>>>>    </column>
>>>> 
>>>> +    <column name="dhcp_relay_port">
>>>> +      This column defines the <ref table="Logical_Router_Port"/> on which
>>>> +      DHCP relay is enabled.
>>>> +    </column>
>>>> +
>>>>    <column name="forwarding_groups">
>>>>      Groups a set of logical port endpoints for traffic going out of the
>>>>      logical switch.
>>>> @@ -2980,6 +2985,11 @@ or
>>>>      port has all ingress and egress traffic dropped.
>>>>    </column>
>>>> 
>>>> +    <column name="dhcp_relay">
>>>> +      This column is used to enable DHCP Relay. Please refer
>>>> +      to <ref table="DHCP_Relay"/> table.
>>>> +    </column>
>>>> +
>>>>    <group title="Distributed Gateway Ports">
>>>>      <p>
>>>>        Gateways, as documented under <code>Gateways</code> in the OVN
>>>> @@ -4286,6 +4296,24 @@ or
>>>>    </group>
>>>>  </table>
>>>> 
>>>> +  <table name="DHCP_Relay" title="DHCP Relay">
>>>> +    <p>
>>>> +      OVN implements native DHCPv4 relay support, which caters to the common
>>>> +      use case of relaying DHCP requests to an external DHCP server.
>>>> +    </p>
>>>> +
>>>> +    <column name="servers">
>>>> +      <p>
>>>> +        The DHCPv4 server IP address.
>>>> +      </p>
>>>> +    </column>
>>>> +    <group title="Common Columns">
>>>> +      <column name="external_ids">
>>>> +        See <em>External IDs</em> at the beginning of this document.
>>>> +      </column>
>>>> +    </group>
>>>> +  </table>
>>>> +
>>>>  <table name="Connection" title="OVSDB client connections.">
>>>>    <p>
>>>>      Configuration for a database connection to an Open vSwitch database
>>>> diff --git a/tests/atlocal.in b/tests/atlocal.in
>>>> index 63d891b89..32d1c374e 100644
>>>> --- a/tests/atlocal.in
>>>> +++ b/tests/atlocal.in
>>>> @@ -187,6 +187,9 @@ fi
>>>> # Set HAVE_DHCPD
>>>> find_command dhcpd
>>>> 
>>>> +# Set HAVE_DHCLIENT
>>>> +find_command dhclient
>>>> +
>>>> # Set HAVE_BFDD_BEACON
>>>> find_command bfdd-beacon
>>>> 
>>>> diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
>>>> index 19e4f1263..4d8c9ff26 100644
>>>> --- a/tests/ovn-northd.at
>>>> +++ b/tests/ovn-northd.at
>>>> @@ -8786,9 +8786,9 @@ ovn-nbctl --wait=sb set logical_router_port R1-PUB options:redirect-type=bridged
>>>> ovn-sbctl dump-flows R1 > R1flows
>>>> AT_CAPTURE_FILE([R1flows])
>>>> 
>>>> -AT_CHECK([grep "lr_in_arp_resolve" R1flows | grep priority=90 | sort], [0], [dnl
>>>> -  table=17(lr_in_arp_resolve  ), priority=90   , match=(outport == "R1-PUB" && ip4.src == 10.0.0.3 && is_chassis_resident("S0-P0")), action=(get_arp(outport, reg0); next;)
>>>> -  table=17(lr_in_arp_resolve  ), priority=90   , match=(outport == "R1-PUB" && ip6.src == 1000::3 && is_chassis_resident("S0-P0")), action=(get_nd(outport, xxreg0); next;)
>>>> +AT_CHECK([grep "lr_in_arp_resolve" R1flows | grep priority=90 | sed 's/table=../table=??/' | sort], [0], [dnl
>>>> +  table=??(lr_in_arp_resolve  ), priority=90   , match=(outport == "R1-PUB" && ip4.src == 10.0.0.3 && is_chassis_resident("S0-P0")), action=(get_arp(outport, reg0); next;)
>>>> +  table=??(lr_in_arp_resolve  ), priority=90   , match=(outport == "R1-PUB" && ip6.src == 1000::3 && is_chassis_resident("S0-P0")), action=(get_nd(outport, xxreg0); next;)
>>>> ])
>>>> 
>>>> AT_CLEANUP
>>>> @@ -10966,3 +10966,38 @@ Status: active
>>>> 
>>>> AT_CLEANUP
>>>> ])
>>>> +
>>>> +OVN_FOR_EACH_NORTHD_NO_HV([
>>>> +AT_SETUP([check DHCP RELAY AGENT])
>>>> +ovn_start NORTHD_TYPE
>>>> +
>>>> +check ovn-nbctl ls-add ls0
>>>> +check ovn-nbctl lsp-add ls0 ls0-port1
>>>> +check ovn-nbctl lsp-set-addresses ls0-port1 02:00:00:00:00:10
>>>> +check ovn-nbctl lr-add lr0
>>>> +check ovn-nbctl lrp-add lr0 lrp1 02:00:00:00:00:01 192.168.1.1/24
>>>> +check ovn-nbctl lsp-add ls0 lrp1-attachment
>>>> +check ovn-nbctl lsp-set-type lrp1-attachment router
>>>> +check ovn-nbctl lsp-set-addresses lrp1-attachment 00:00:00:00:ff:02
>>>> +check ovn-nbctl lsp-set-options lrp1-attachment router-port=lrp1
>>>> +check ovn-nbctl lrp-add lr0 lrp-ext 02:00:00:00:00:02 192.168.2.1/24
>>>> +
>>>> +dhcp_relay=$(ovn-nbctl create DHCP_Relay servers=172.16.1.1)
>>>> +check ovn-nbctl set Logical_Router_port lrp1 dhcp_relay=$dhcp_relay
>>>> +rp_uuid=$(ovn-nbctl --bare --colum=_uuid list logical_router_port lrp1)
>>>> +check ovn-nbctl set Logical_Switch ls0 dhcp_relay_port=$rp_uuid
>>>> +
>>>> +check ovn-nbctl --wait=sb sync
>>>> +
>>>> +ovn-sbctl lflow-list > lflows
>>>> +AT_CAPTURE_FILE([lflows])
>>>> +
>>>> +AT_CHECK([grep -e "DHCP_RELAY_" lflows | sed 's/table=../table=??/'], [0], [dnl
>>>> +  table=??(lr_in_ip_input     ), priority=110  , match=(inport == "lrp1" && ip4.src == 0.0.0.0 && ip4.dst == 255.255.255.255 && udp.src == 68 && udp.dst == 67), action=(dhcp_relay_req(192.168.1.1,172.16.1.1);ip4.src=192.168.1.1;ip4.dst=172.16.1.1;udp.src=67;next; /* DHCP_RELAY_REQ */)
>>>> +  table=??(lr_in_ip_input     ), priority=110  , match=(ip4.src == 172.16.1.1 && ip4.dst == 192.168.1.1 && udp.src == 67 && udp.dst == 67), action=(next;/* DHCP_RELAY_RESP */)
>>>> +  table=??(lr_in_dhcp_relay_resp_fwd), priority=110  , match=(ip4.src == 172.16.1.1 && ip4.dst == 192.168.1.1 && udp.src == 67 && udp.dst == 67), action=(dhcp_relay_resp_fwd(192.168.1.1,172.16.1.1);ip4.src=192.168.1.1;udp.dst=68;outport="lrp1";output; /* DHCP_RELAY_RESP */)
>>>> +  table=??(ls_in_l2_lkup      ), priority=100  , match=(inport == "ls0-port1" && eth.src == 02:00:00:00:00:10 && ip4.src == 0.0.0.0 && ip4.dst == 255.255.255.255 && udp.src == 68 && udp.dst == 67), action=(eth.dst=02:00:00:00:00:01;outport="lrp1-attachment";next;/* DHCP_RELAY_REQ */)
>>>> +])
>>>> +
>>>> +AT_CLEANUP
>>>> +])
>>>> diff --git a/tests/ovn.at b/tests/ovn.at
>>>> index e8c79512b..839c07ce2 100644
>>>> --- a/tests/ovn.at
>>>> +++ b/tests/ovn.at
>>>> @@ -21905,7 +21905,7 @@ eth_dst=00000000ff01
>>>> ip_src=$(ip_to_hex 10 0 0 10)
>>>> ip_dst=$(ip_to_hex 172 168 0 101)
>>>> send_icmp_packet 1 1 $eth_src $eth_dst $ip_src $ip_dst c4c9 0000000000000000000000
>>>> -AT_CHECK_UNQUOTED([as hv1 ovs-ofctl dump-flows br-int metadata=0x$lr0_dp_key | awk '/table=28, n_packets=1, n_bytes=45/{print $7" "$8}'],[0],[dnl
>>>> +AT_CHECK_UNQUOTED([as hv1 ovs-ofctl dump-flows br-int metadata=0x$lr0_dp_key | awk '/table=29, n_packets=1, n_bytes=45/{print $7" "$8}'],[0],[dnl
>>>> priority=80,ip,reg15=0x$lr0_public_dp_key,metadata=0x$lr0_dp_key,nw_src=10.0.0.10 actions=drop
>>>> ])
>>>> 
>>>> @@ -28964,7 +28964,7 @@ AT_CHECK([
>>>>        grep "priority=100" | \
>>>>        grep -c "ct(commit,zone=NXM_NX_REG11\\[[0..15\\]],.*exec(move:NXM_OF_ETH_SRC\\[[\\]]->NXM_NX_CT_LABEL\\[[32..79\\]],load:0x[[0-9]]->NXM_NX_CT_MARK\\[[16..31\\]]))"
>>>> 
>>>> -        grep table=25 hv${hv}flows | \
>>>> +        grep table=26 hv${hv}flows | \
>>>>        grep "priority=200" | \
>>>>        grep -c "move:NXM_NX_CT_LABEL\\[[\\]]->NXM_NX_XXREG1\\[[\\]],move:NXM_NX_XXREG1\\[[32..79\\]]->NXM_OF_ETH_DST"
>>>>    done; :], [0], [dnl
>>>> @@ -29089,7 +29089,7 @@ AT_CHECK([
>>>>        grep "priority=100" | \
>>>>        grep -c "ct(commit,zone=NXM_NX_REG11\\[[0..15\\]],.*exec(move:NXM_OF_ETH_SRC\\[[\\]]->NXM_NX_CT_LABEL\\[[32..79\\]],load:0x[[0-9]]->NXM_NX_CT_MARK\\[[16..31\\]]))"
>>>> 
>>>> -        grep table=25 hv${hv}flows | \
>>>> +        grep table=26 hv${hv}flows | \
>>>>        grep "priority=200" | \
>>>>        grep -c "move:NXM_NX_CT_LABEL\\[[\\]]->NXM_NX_XXREG1\\[[\\]],move:NXM_NX_XXREG1\\[[32..79\\]]->NXM_OF_ETH_DST"
>>>>    done; :], [0], [dnl
>>>> @@ -29586,7 +29586,7 @@ if test X"$1" = X"DGP"; then
>>>> else
>>>>    prio=2
>>>> fi
>>>> -AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep -E "table=25, n_packets=1,.* priority=$prio,ip,$inport.*$outport.*metadata=0x${sw_key},nw_dst=10.0.1.1 actions=drop" -c], [0], [dnl
>>>> +AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep -E "table=26, n_packets=1,.* priority=$prio,ip,$inport.*$outport.*metadata=0x${sw_key},nw_dst=10.0.1.1 actions=drop" -c], [0], [dnl
>>>> 1
>>>> ])
>>>> 
>>>> @@ -29605,13 +29605,13 @@ AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep "actions=controller" | grep
>>>> 
>>>> if test X"$1" = X"DGP"; then
>>>>    # The packet dst should be resolved once for E/W centralized NAT purpose.
>>>> -    AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep -E "table=25, n_packets=1,.* priority=100,reg0=0xa000101,reg15=.*metadata=0x${sw_key} actions=mod_dl_dst:00:00:00:00:01:01,resubmit" -c], [0], [dnl
>>>> +    AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep -E "table=26, n_packets=1,.* priority=100,reg0=0xa000101,reg15=.*metadata=0x${sw_key} actions=mod_dl_dst:00:00:00:00:01:01,resubmit" -c], [0], [dnl
>>>> 1
>>>> ])
>>>> fi
>>>> 
>>>> # The packet should've been finally dropped in the lr_in_arp_resolve stage.
>>>> -AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep -E "table=25, n_packets=2,.* priority=$prio,ip,$inport.*$outport.*metadata=0x${sw_key},nw_dst=10.0.1.1 actions=drop" -c], [0], [dnl
>>>> +AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep -E "table=26, n_packets=2,.* priority=$prio,ip,$inport.*$outport.*metadata=0x${sw_key},nw_dst=10.0.1.1 actions=drop" -c], [0], [dnl
>>>> 1
>>>> ])
>>>> OVN_CLEANUP([hv1])
>>>> diff --git a/tests/system-ovn.at b/tests/system-ovn.at
>>>> index 7b9daba0d..591933a95 100644
>>>> --- a/tests/system-ovn.at
>>>> +++ b/tests/system-ovn.at
>>>> @@ -12032,3 +12032,153 @@ as
>>>> OVS_TRAFFIC_VSWITCHD_STOP(["/failed to query port patch-.*/d
>>>> /connection dropped.*/d"])
>>>> AT_CLEANUP
>>>> +
>>>> +OVN_FOR_EACH_NORTHD([
>>>> +AT_SETUP([DHCP RELAY AGENT])
>>>> +AT_SKIP_IF([test $HAVE_DHCPD = no])
>>>> +AT_SKIP_IF([test $HAVE_DHCLIENT = no])
>>>> +AT_SKIP_IF([test $HAVE_TCPDUMP = no])
>>>> +ovn_start
>>>> +OVS_TRAFFIC_VSWITCHD_START()
>>>> +
>>>> +ADD_BR([br-int])
>>>> +ADD_BR([br-ext])
>>>> +
>>>> +ovs-ofctl add-flow br-ext action=normal
>>>> +# Set external-ids in br-int needed for ovn-controller
>>>> +ovs-vsctl \
>>>> +        -- set Open_vSwitch . external-ids:system-id=hv1 \
>>>> +        -- set Open_vSwitch . external-ids:ovn-remote=unix:$ovs_base/ovn-sb/ovn-sb.sock \
>>>> +        -- set Open_vSwitch . external-ids:ovn-encap-type=geneve \
>>>> +        -- set Open_vSwitch . external-ids:ovn-encap-ip=169.0.0.1 \
>>>> +        -- set bridge br-int fail-mode=secure other-config:disable-in-band=true
>>>> +
>>>> +# Start ovn-controller
>>>> +start_daemon ovn-controller
>>>> +
>>>> +ADD_NAMESPACES(sw01)
>>>> +ADD_VETH(sw01, sw01, br-int, "0", "f0:00:00:01:02:03")
>>>> +ADD_NAMESPACES(sw11)
>>>> +ADD_VETH(sw11, sw11, br-int, "0", "f0:00:00:02:02:03")
>>>> +ADD_NAMESPACES(server)
>>>> +ADD_VETH(s1, server, br-ext, "172.16.1.1/24", "f0:00:00:01:02:05", \
>>>> +         "172.16.1.254")
>>>> +
>>>> +check ovn-nbctl lr-add R1
>>>> +
>>>> +check ovn-nbctl ls-add sw0
>>>> +check ovn-nbctl ls-add sw1
>>>> +check ovn-nbctl ls-add sw-ext
>>>> +
>>>> +check ovn-nbctl lrp-add R1 rp-sw0 00:00:01:01:02:03 192.168.1.1/24
>>>> +check ovn-nbctl lrp-add R1 rp-sw1 00:00:03:01:02:03 192.168.2.1/24
>>>> +check ovn-nbctl lrp-add R1 rp-ext 00:00:02:01:02:03 172.16.1.254/24
>>>> +
>>>> +dhcp_relay=$(ovn-nbctl create DHCP_Relay servers=172.16.1.1)
>>>> +check ovn-nbctl set Logical_Router_port rp-sw0 dhcp_relay=$dhcp_relay
>>>> +check ovn-nbctl set Logical_Router_port rp-sw1 dhcp_relay=$dhcp_relay
>>>> +check ovn-nbctl lrp-set-gateway-chassis rp-ext hv1
>>>> +
>>>> +check ovn-nbctl lsp-add sw0 sw0-rp -- set Logical_Switch_Port sw0-rp \
>>>> +    type=router options:router-port=rp-sw0 \
>>>> +    -- lsp-set-addresses sw0-rp router
>>>> +check ovn-nbctl lsp-add sw1 sw1-rp -- set Logical_Switch_Port sw1-rp \
>>>> +    type=router options:router-port=rp-sw1 \
>>>> +    -- lsp-set-addresses sw1-rp router
>>>> +
>>>> +rp_uuid=$(ovn-nbctl --bare --colum=_uuid list logical_router_port rp-sw0)
>>>> +check ovn-nbctl set Logical_Switch sw0 dhcp_relay_port=$rp_uuid
>>>> +rp_uuid=$(ovn-nbctl --bare --colum=_uuid list logical_router_port rp-sw1)
>>>> +check ovn-nbctl set Logical_Switch sw1 dhcp_relay_port=$rp_uuid
>>>> +
>>>> +check ovn-nbctl lsp-add sw-ext ext-rp -- set Logical_Switch_Port ext-rp \
>>>> +    type=router options:router-port=rp-ext \
>>>> +    -- lsp-set-addresses ext-rp router
>>>> +check ovn-nbctl lsp-add sw-ext lnet \
>>>> +        -- lsp-set-addresses lnet unknown \
>>>> +        -- lsp-set-type lnet localnet \
>>>> +        -- lsp-set-options lnet network_name=phynet
>>>> +
>>>> +check ovn-nbctl lsp-add sw0 sw01 \
>>>> +    -- lsp-set-addresses sw01 "f0:00:00:01:02:03"
>>>> +
>>>> +check ovn-nbctl lsp-add sw1 sw11 \
>>>> +    -- lsp-set-addresses sw11 "f0:00:00:02:02:03"
>>>> +
>>>> +AT_CHECK([ovs-vsctl set Open_vSwitch . external-ids:ovn-bridge-mappings=phynet:br-ext])
>>>> +
>>>> +OVN_POPULATE_ARP
>>>> +
>>>> +check ovn-nbctl --wait=hv sync
>>>> +
>>>> +DHCP_TEST_DIR="/tmp/dhcp-test"
>>>> +rm -rf $DHCP_TEST_DIR
>>>> +mkdir $DHCP_TEST_DIR
>>>> +cat > $DHCP_TEST_DIR/dhcpd.conf <<EOF
>>>> +subnet 172.16.1.0 netmask 255.255.255.0 {
>>>> +}
>>>> +subnet 192.168.1.0 netmask 255.255.255.0 {
>>>> +  range 192.168.1.10 192.168.1.10;
>>>> +  option routers 192.168.1.1;
>>>> +  option broadcast-address 192.168.1.255;
>>>> +  default-lease-time 60;
>>>> +  max-lease-time 120;
>>>> +}
>>>> +subnet 192.168.2.0 netmask 255.255.255.0 {
>>>> +  range 192.168.2.10 192.168.2.10;
>>>> +  option routers 192.168.2.1;
>>>> +  option broadcast-address 192.168.2.255;
>>>> +  default-lease-time 60;
>>>> +  max-lease-time 120;
>>>> +}
>>>> +EOF
>>>> +cat > $DHCP_TEST_DIR/dhclien.conf <<EOF
>>>> +timeout 2
>>>> +EOF
>>>> +
>>>> +touch $DHCP_TEST_DIR/dhcpd.leases
>>>> +chown root:dhcpd $DHCP_TEST_DIR $DHCP_TEST_DIR/dhcpd.leases
>>>> +chmod 775 $DHCP_TEST_DIR
>>>> +chmod 664 $DHCP_TEST_DIR/dhcpd.leases
>>>> +
>>>> +
>>>> +NETNS_DAEMONIZE([server], [dhcpd -4 -f -cf $DHCP_TEST_DIR/dhcpd.conf s1 > dhcpd.log 2>&1], [dhcpd.pid])
>>>> +
>>>> +NS_CHECK_EXEC([server], [tcpdump -l -nvv -i s1  udp > pkt.pcap 2>tcpdump_err &])
>>>> +OVS_WAIT_UNTIL([grep "listening" tcpdump_err])
>>>> +on_exit 'kill $(pidof tcpdump)'
>>>> +
>>>> +NS_CHECK_EXEC([sw01], [dhclient -1 -q -lf $DHCP_TEST_DIR/dhclient-sw01.lease -pf $DHCP_TEST_DIR/dhclient-sw01.pid -cf $DHCP_TEST_DIR/dhclien.conf sw01])
>>>> +NS_CHECK_EXEC([sw11], [dhclient -1 -q -lf $DHCP_TEST_DIR/dhclient-sw11.lease -pf $DHCP_TEST_DIR/dhclient-sw11.pid -cf $DHCP_TEST_DIR/dhclien.conf sw11])
>>>> +
>>>> +OVS_WAIT_UNTIL([
>>>> +    total_pkts=$(cat pkt.pcap | wc -l)
>>>> +    test ${total_pkts} -ge 8
>>>> +])
>>>> +
>>>> +on_exit 'kill `cat $DHCP_TEST_DIR/dhclient-sw01.pid` &&
>>>> +kill `cat $DHCP_TEST_DIR/dhclient-sw11.pid` && rm -rf $DHCP_TEST_DIR'
>>>> +
>>>> +NS_CHECK_EXEC([sw01], [ip addr show sw01 | grep -oP '(?<=inet\s)\d+(\.\d+){3}'], [0], [dnl
>>>> +192.168.1.10
>>>> +])
>>>> +NS_CHECK_EXEC([sw11], [ip addr show sw11 | grep -oP '(?<=inet\s)\d+(\.\d+){3}'], [0], [dnl
>>>> +192.168.2.10
>>>> +])
>>>> +OVS_APP_EXIT_AND_WAIT([ovn-controller])
>>>> +
>>>> +as ovn-sb
>>>> +OVS_APP_EXIT_AND_WAIT([ovsdb-server])
>>>> +
>>>> +as ovn-nb
>>>> +OVS_APP_EXIT_AND_WAIT([ovsdb-server])
>>>> +
>>>> +as northd
>>>> +OVS_APP_EXIT_AND_WAIT([NORTHD_TYPE])
>>>> +
>>>> +as
>>>> +OVS_TRAFFIC_VSWITCHD_STOP(["/.*error receiving.*/d
>>>> +/failed to query port patch-.*/d
>>>> +/.*terminating with signal 15.*/d"])
>>>> +AT_CLEANUP
>>>> +])
>>>> diff --git a/utilities/ovn-trace.c b/utilities/ovn-trace.c
>>>> index 0b86eae7b..ae9dd77de 100644
>>>> --- a/utilities/ovn-trace.c
>>>> +++ b/utilities/ovn-trace.c
>>>> @@ -2328,6 +2328,25 @@ execute_put_dhcp_opts(const struct ovnact_put_opts *pdo,
>>>>    execute_put_opts(pdo, name, uflow, super);
>>>> }
>>>> 
>>>> +static void
>>>> +execute_dhcpv4_relay_resp_fwd(const struct ovnact_dhcp_relay *dr,
>>>> +                                const char *name, struct flow *uflow,
>>>> +                                struct ovs_list *super)
>>>> +{
>>>> +    ovntrace_node_append(
>>>> +        super, OVNTRACE_NODE_ERROR,
>>>> +        "/* We assume that this packet is DHCPOFFER or DHCPACK and "
>>>> +            "DHCP broadcast flag is set. Dest IP is set to broadcast. "
>>>> +            "Dest MAC is set to broadcast but in real network this is unicast "
>>>> +            "which is extracted from DHCP header. */");
>>>> +
>>>> +    /* Assume DHCP broadcast flag is set */
>>>> +    uflow->nw_dst = 0xFFFFFFFF;
>>>> +    /* Dest MAC is set to broadcast but in real network this is unicast */
>>>> +    struct eth_addr bcast_mac = {0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF};
>>>> +    uflow->dl_dst = bcast_mac;
>>>> +}
>>>> +
>>>> static void
>>>> execute_put_nd_ra_opts(const struct ovnact_put_opts *pdo,
>>>>                       const char *name, struct flow *uflow,
>>>> @@ -3215,6 +3234,15 @@ trace_actions(const struct ovnact *ovnacts, size_t ovnacts_len,
>>>>                                  "put_dhcpv6_opts", uflow, super);
>>>>            break;
>>>> 
>>>> +        case OVNACT_DHCPV4_RELAY_REQ:
>>>> +            /* Nothing to do for tracing. */
>>>> +            break;
>>>> +
>>>> +        case OVNACT_DHCPV4_RELAY_RESP_FWD:
>>>> +            execute_dhcpv4_relay_resp_fwd(ovnact_get_DHCPV4_RELAY_RESP_FWD(a),
>>>> +                                    "dhcp_relay_resp_fwd", uflow, super);
>>>> +            break;
>>>> +
>>>>        case OVNACT_PUT_ND_RA_OPTS:
>>>>            execute_put_nd_ra_opts(ovnact_get_PUT_DHCPV6_OPTS(a),
>>>>                                   "put_nd_ra_opts", uflow, super);
>>>> --
>>>> 2.36.6
>>>> 
>>>> _______________________________________________
>>>> dev mailing list
>>>> dev@openvswitch.org
>>>> https://urldefense.proofpoint.com/v2/url?u=https-3A__mail.openvswitch.org_mailman_listinfo_ovs-2Ddev&d=DwIFaQ&c=s883GpUCOChKOHiocYtGcg&r=2PQjSDR7A28z1kXE1ptSm6X36oL_nCq1XxeEt7FkLmA&m=jUP6tr4FN6iSRj6v8rdyetsvEpT13QUHVMbw__3u6Sm7qAhyuu9tBdezdVmkqt0p&s=jJ3kFCf5o6dc-gW8diGvfaIQVC0Gwhe2y5aJYZJo0Rk&e=
>> 
Naveen Yerramneni Feb. 26, 2024, 4:41 p.m. UTC | #6
> On 24-Jan-2024, at 6:30 PM, Naveen Yerramneni <naveen.yerramneni@nutanix.com> wrote:
> 
> 
> 
>> On 24-Jan-2024, at 8:59 AM, Numan Siddique <numans@ovn.org> wrote:
>> 
>> On Tue, Jan 23, 2024 at 8:02 PM Naveen Yerramneni
>> <naveen.yerramneni@nutanix.com> wrote:
>>> 
>>> 
>>> 
>>>> On 16-Jan-2024, at 2:30 AM, Numan Siddique <numans@ovn.org> wrote:
>>>> 
>>>> On Tue, Dec 12, 2023 at 1:05 PM Naveen Yerramneni
>>>> <naveen.yerramneni@nutanix.com> wrote:
>>>>> 
>>>>>  This patch contains changes to enable DHCP Relay Agent support for overlay subnets.
>>>>> 
>>>>>  USE CASE:
>>>>>  ----------
>>>>>    - Enable IP address assignment for overlay subnets from the centralized DHCP server present in the underlay network.
>>>>> 
>>>>>  PREREQUISITES
>>>>>  --------------
>>>>>    - Logical Router Port IP should be assigned (statically) from the same overlay subnet which is managed by DHCP server.
>>>>>    - The LRP IP is used as the GIADDR field when relaying DHCP packets, and the same IP must be configured as the default gateway for the overlay subnet.
>>>>>    - Overlay subnets managed by external DHCP server are expected to be directly reachable from the underlay network.
>>>>> 
>>>>>  EXPECTED PACKET FLOW:
>>>>>  ----------------------
>>>>>  Following is the expected packet flow in order to support DHCP relay functionality in OVN.
>>>>>    1. DHCP client originates DHCP discovery (broadcast).
>>>>>    2. DHCP relay (running on the OVN) receives the broadcast and forwards the packet to the DHCP server by converting it to unicast. While forwarding the packet, it updates the GIADDR in DHCP header to its
>>>>>       interface IP on which DHCP packet is received.
>>>>>    3. DHCP server uses GIADDR field to decide the IP address pool from which IP has to be assigned and DHCP offer is sent to the same IP (GIADDR).
>>>>>    4. DHCP relay agent forwards the offer to the client, it resets the GIADDR field when forwarding the offer to the client.
>>>>>    5. DHCP client sends DHCP request (broadcast) packet.
>>>>>    6. DHCP relay (running on the OVN) receives the broadcast and forwards the packet to the DHCP server by converting it to unicast. While forwarding the packet, it updates the GIADDR in DHCP header to its
>>>>>       interface IP on which DHCP packet is received.
>>>>>    7. DHCP Server sends the ACK packet.
>>>>>    8. DHCP relay agent forwards the ACK packet to the client, it resets the GIADDR field when forwarding the ACK to the client.
>>>>>    9. All the future renew/release packets are directly exchanged between DHCP client and DHCP server.
>>>>> 
>>>>>  OVN DHCP RELAY PACKET FLOW:
>>>>>  ----------------------------
>>>>>  To add DHCP Relay support on OVN, we need to replicate all the behavior described above using distributed logical switch and logical router.
>>>>>  At a high level, the packet flow is distributed among the Logical Switch and Logical Router on the source node (where the VM is deployed) and the redirect chassis (RC) node.
>>>>>    1. Request packet gets processed on the source node where VM is deployed and relays the packet to DHCP server.
>>>>>    2. Response packet is first processed on RC node (which first receives the packet from underlay network). RC node forwards the packet to the right node by filling in the dest MAC and IP.
>>>>> 
>>>>>  OVN Packet flow with DHCP relay is explained below.
>>>>>    1. DHCP client (VM) sends the DHCP discover packet (broadcast).
>>>>>    2. Logical switch converts the packet to L2 unicast by setting the destination MAC to LRP's MAC
>>>>>    3. Logical Router receives the packet and redirects it to the OVN controller.
>>>>>    4. OVN controller updates the required information(GIADDR) in the DHCP payload after doing the required checks. If any check fails, packet is dropped.
>>>>>    5. Logical Router converts the packet to L3 unicast and forwards it to the server. This packet gets routed like any other packet (via RC node).
>>>>>    6. Server replies with DHCP offer.
>>>>>    7. RC node processes the DHCP offer and forwards it to the OVN controller.
>>>>>    8. OVN controller does sanity checks and  updates the destination MAC (available in DHCP header), destination IP (available in DHCP header), resets GIADDR  and reinjects the packet to datapath.
>>>>>       If any check fails, packet is dropped.
>>>>>    9. Logical router updates the source IP and port and forwards the packet to logical switch.
>>>>>    10. Logical switch delivers the packet to the DHCP client.
>>>>>    11. Similar steps are performed for Request and Ack packets.
>>>>>    12. All the future renew/release packets are directly exchanged between DHCP client and DHCP server
>>>>> 
>>>>>  NEW OVN ACTIONS
>>>>>  ---------------
>>>>> 
>>>>>    1. dhcp_relay_req(<relay-ip>, <server-ip>)
>>>>>        - This action executes on the source node on which the DHCP request originated.
>>>>>        - This action relays the DHCP request coming from client to the server. Relay-ip is used to update GIADDR in the DHCP header.
>>>>>    2. dhcp_relay_resp_fwd(<relay-ip>, <server-ip>)
>>>>>        - This action executes on the first node (RC node) which processes the DHCP response from the server.
>>>>>        - This action updates the destination MAC and destination IP so that the response can be forwarded to the appropriate node from which the request originated.
>>>>>        - Relay-ip, server-ip are used to validate GIADDR and SERVER ID in the DHCP payload.
>>>>> 
>>>>>  FLOWS
>>>>>  -----
>>>>>  Following are the flows required for one overlay subnet.
>>>>> 
>>>>>    1. table=27(ls_in_l2_lkup      ), priority=100  , match=(inport == <vm_port> && eth.src == <vm_mac> && ip4.src == 0.0.0.0 && ip4.dst == 255.255.255.255 && udp.src == 68 && udp.dst == 67), action=(eth.dst=<lrp_mac>;outport=<lrp-port>;next;/* DHCP_RELAY_REQ */)
>>>>>    2. table=3 (lr_in_ip_input     ), priority=110  , match=(inport == <lrp_port> && ip4.src == 0.0.0.0 && ip4.dst == 255.255.255.255 && udp.src == 68 && udp.dst == 67), action=(dhcp_relay_req(<lrp_ip>,<dhcp_server_ip>);ip4.src=<lrp_ip>;ip4.dst=<dhcp_server_ip>;udp.src=67;next; /* DHCP_RELAY_REQ */)
>>>>>    3. table=3 (lr_in_ip_input     ), priority=110  , match=(ip4.src == <dhcp_server_ip> && ip4.dst ==<lrp_ip> && udp.src == 67 && udp.dst == 67), action=(next;/* DHCP_RELAY_RESP */)
>>>>>    4. table=17(lr_in_dhcp_relay_resp_fwd), priority=110  , match=(ip4.src == <dhcp_server_ip> && ip4.dst == <lrp_ip> && udp.src == 67 && udp.dst == 67), action=(dhcp_relay_resp_fwd(<lrp_ip>,<dhcp_server_ip>);ip4.src=<lrp_ip>;udp.dst=68;outport=<lrp_port>;output; /* DHCP_RELAY_RESP */)
>>>>> 
>>>>>  NEW PIPELINE STAGES
>>>>>  -------------------
>>>>>  Following stage is added for DHCP relay feature. Some of the flows are fitted into the existing pipeline stages.
>>>>>    1. lr_in_dhcp_relay_resp_fwd
>>>>>        - Forward the DHCP response to the appropriate node
>>>>> 
>>>>>  NB SCHEMA CHANGES
>>>>>  ----------------
>>>>>    1. New DHCP_Relay table
>>>>>        "DHCP_Relay": {
>>>>>              "columns": {
>>>>>          "name": {"type": "string"},
>>>>>                  "servers": {"type": {"key": "string",
>>>>>                                         "min": 0,
>>>>>                                         "max": 1}},
>>>>>                  "external_ids": {
>>>>>                      "type": {"key": "string", "value": "string",
>>>>>                              "min": 0, "max": "unlimited"}}},
>>>>>              "isRoot": true},
>>>>>    2. New column to Logical_Router_Port table
>>>>>        "dhcp_relay": {"type": {"key": {"type": "uuid",
>>>>>                              "refTable": "DHCP_Relay",
>>>>>                              "refType": "weak"},
>>>>>                              "min": 0,
>>>>>                              "max": 1}},
>>>>>    3. New column to Logical_Switch table
>>>>>        "dhcp_relay_port": {"type": {"key": {"type": "uuid",
>>>>>                                      "refTable": "Logical_Router_Port",
>>>>>                                      "refType": "weak"},
>>>>>                                       "min": 0,
>>>>>                                       "max": 1}}},
>>>>> 
>>>>>  Commands to enable the feature:
>>>>>  ------------------------------
>>>>>    - ovn-nbctl create DHCP_Relay servers=<ip>
>>>>>    - ovn-nbctl set Logical_Router_port <lrp_uuid> dhcp_relay=<dhcp_relay_uuid>
>>>>>    - ovn-nbctl set Logical_Switch <ls_uuid> dhcp_relay_port=<lrp_uuid>
>>>>> 
>>>>>  Example:
>>>>>  -------
>>>>>   ovn-nbctl ls-add sw1
>>>>>   ovn-nbctl lsp-add sw1 sw1-port1
>>>>>   ovn-nbctl lsp-set-addresses sw1-port1 <MAC> #Only MAC address has to be specified when logical ports are created.
>>>>>   ovn-nbctl lr-add lr1
>>>>>   ovn-nbctl lrp-add lr1 lr1-port1 <MAC> <GATEWAY_IP/Prefix> #GATEWAY IP is set in GIADDR field when relaying the DHCP requests to server.
>>>>>   ovn-nbctl lsp-add sw1 lr1-attachment
>>>>>   ovn-nbctl lsp-set-type lr1-attachment router
>>>>>   ovn-nbctl lsp-set-addresses lr1-attachment <MAC>
>>>>>   ovn-nbctl lsp-set-options lr1-attachment router-port=lr1-port1
>>>>>   ovn-nbctl create DHCP_Relay servers=<DHCP_SERVER_IP>
>>>>>   ovn-nbctl set Logical_Router_port <lrp_uuid> dhcp_relay=<relay_uuid>
>>>>>   ovn-nbctl set Logical_Switch <ls_uuid> dhcp_relay_port=<lrp_uuid>
>>>>> 
>>>>>  Limitations:
>>>>>  ------------
>>>>>    - All OVN features that need an IP address to be configured on the logical port (like proxy ARP, etc.) will not be supported for overlay subnets on which DHCP relay is enabled.
>>>>> 
>>>>>  References:
>>>>>  ----------
>>>>>    - rfc1541, rfc1542, rfc2131
>>>>> 
>>>>> Signed-off-by: Naveen Yerramneni <naveen.yerramneni@nutanix.com>
>>>>> Co-authored-by: Huzaifa Calcuttawala <huzaifa.c@nutanix.com>
>>>>> Signed-off-by: Huzaifa Calcuttawala <huzaifa.c@nutanix.com>
>>>>> CC: Mary Manohar <mary.manohar@nutanix.com>
>>>> 
>>>> Hi Naveen,
>>>> 
>>>> Thanks for the patch.  Sorry for the delayed response.
>>>> 
>>>> I've a few comments.
>>>> 
>>>> 1.  Regarding the newly added Table - DHCP_Relay in NB DB and the
>>>> newly added columns in Logical_Switch and
>>>>  Logical_Router table.
>>>> 
>>>>  I don't think there is a need to add the new table DHCP_Relay
>>>> since it only stores the dhcp relay agent server ip.
>>>>  Also it could complicate the northd incremental processing.
>>>> 
>>>>  If for example we have below logical switches and router
>>>> 
>>>>  ovn-nbctl lr-add R1
>>>>  ovn-nbctl ls-add sw0
>>>>  ovn-nbctl ls-add sw1
>>>>  ovn-nbctl ls-add sw-ext
>>>>  ovn-nbctl lrp-add R1 rp-sw0 00:00:01:01:02:03 192.168.1.1/24
>>>>  ovn-nbctl lrp-add R1 rp-sw1 00:00:03:01:02:03 192.168.2.1/24
>>>>  ovn-nbctl lrp-add R1 rp-ext 00:00:02:01:02:03 172.16.1.254/24
>>>> 
>>>>  ovn-nbctl lsp-add sw0 sw0-rp -- set Logical_Switch_Port sw0-rp \
>>>>  type=router options:router-port=rp-sw0 \
>>>>  -- lsp-set-addresses sw0-rp router
>>>> 
>>>>  ovn-nbctl lsp-add sw1 sw1-rp -- set Logical_Switch_Port sw1-rp \
>>>>  type=router options:router-port=rp-sw1 \
>>>>  -- lsp-set-addresses sw1-rp router
>>>> 
>>>>  I'd suggest doing something like below to enable this feature.
>>>> 
>>>>  ovn-nbctl set Logical_Switch_Port sw0-rp options:dhcp_relay=true
>>>>  ovn-nbctl set Logical_Switch_Port sw1-rp options:dhcp_relay=true
>>>> 
>>>>  (Make sure that only one logical switch port of type router can
>>>> have this flag - dhcp_relay set
>>>>   for a given logical switch and document this limitation.)
>>> 
>>> Ack. This suggestion looks good.
>>> 
>>>>  ovn-nbctl set Logical_Router_port rp-sw0 options:dhcp_relay_ip=172.16.1.1
>>>>  ovn-nbctl set Logical_Router_port rp-sw1 options:dhcp_relay_ip=172.16.1.1
>>>> 
>>>>  Let me know if there are any limitations with this.
>>> 
>>> The reason I added a new table is that it would be useful in the future if we add
>>> additional options (like setting the hop count in the DHCP header, etc.) to the DHCP relay
>>> functionality. What do you recommend if we have to add more options
>>> in the future?
>> 
>> I see.  If there is a possibility of adding more options, then having
>> a separate table makes sense.
>> I'd suggest to add the options column to the DHCP_Relay table even if
>> this patch presently is not using
>> any.  This would help in upgrades.
>> 
>> But I don't think there is a need to add a new column in the logical
>> switch port table to enable dhcp relay.
>> 
>> Thanks
>> Numan
> 
> Sure, I will add an options column to the DHCP_Relay table.
> I will use options:dhcp_relay for the LSP instead of a new column, as you suggested.
> 
> Thanks,
> Naveen
> 

Hi Numan, 

I started working on your comments.

Regarding options:dhcp_relay for the LSP: since DHCP relay applies at the logical
switch level (to the entire subnet), what if we instead add options:dhcp_relay to the
Logical Switch table, with a string value naming the LSP of type router?
Please let me know your thoughts on this.


> 
>> 
>>> 
>>> 
>>> 
>>>> 2.  Regarding the newly added actions - dhcp_relay_req() and
>>>> dhcp_relay_resp_fwd().
>>>>   Both of these actions are encoded as OVS controller action with
>>>> pause enabled.
>>>>   Which means ovs-vswitchd has to freeze the flow translation and
>>>> resume the flow translation
>>>>   once the ovn-controller resumes it.  But the functions
>>>> pinctrl_handle_dhcp_relay_req()
>>>>   and pinctrl_handle_dhcp_relay_resp_fwd() do not resume the packet
>>>> if the packet
>>>>   has some errors.  This is wrong.  Otherwise vswitchd will never thaw the
>>>>   frozen translation.
>>>> 
>>>>   You can see the existing OVN actions - put_dhcp_opts() and few others which
>>>>   use controller action with pause.  In such actions, the result of
>>>> these actions
>>>>   are stored in a register bit (i.e if put_dhcp_opts() was successful or not)
>>>>   and in the next stage we take a decision based on the result.
>>>> 
>>>>   For the action dhcp_relay_req(relay_ip, server_ip),  I don't
>>>> think you should use the pause flag.
>>>>   Also in this action the argument server_ip is never used in the
>>>> function pinctrl_handle_dhcp_relay_req()
>>>>   other than to just log.
>>>> 
>>>>   I'd suggest you do something like this:
>>>> 
>>>>  table=3 (lr_in_ip_input     ), priority=110  , match=(inport ==
>>>> "lrp1" && ip4.src == 0.0.0.0 && ip4.dst == 255.255.255.255 && udp.src
>>>> == 68 && udp.dst == 67),
>>>>  action=(dhcp_relay_req { ip4.src = 192.168.1.1; ip4.dst =
>>>> 172.16.1.1; udp.src = 67; dhcp_header.giaddr = <relay_ip>;
>>>> next(pipeline=ingress,table=S_ROUTER_IN_UNSNAT);  /* DHCP_RELAY_REQ */
>>>> }
>>>> 
>>>>  dhcp_relay_req action would get translated into a controller
>>>> action with pause=false and all the inner actions of this are encoded
>>>> as
>>>>  normal actions and stored in the userdata of controller action.
>>>> Please see icmp4_error {} as an example.
>>>>  Add a new OVN field 'dhcp_header.giaddr' which gets translated as
>>>> controller action with pause flag set.
>>>>  Please see the existing OVN field - icmp4.frag_mtu as an example
>>>> and see this commit for reference [1]
>>>>  When encoding this new OVN field, store the relay_ip in the
>>>> userdata buffer and in pinctrl.c
>>>>  get the relay_ip value and store it in the dhcp header field.
>>>> 
>>>> 
>>>>  For the action dhcp_relay_resp_fwd,  I'd suggest something like below:
>>>> 
>>>>    table=17 (lr_in_dhcp_relay_resp_chk), priority=110  ,
>>>> match=(ip4.src == 172.16.1.1 && ip4.dst == 192.168.1.1 && udp.src ==
>>>> 67 && udp.dst == 67),
>>>>    action=(reg0[0] = dhcp_relay_resp_chk(dhcp_header.giaddr ==
>>>> <relay_ip>); next;)
>>>>    table=17 (lr_in_dhcp_relay_resp), priority=110  , match=(ip4.src
>>>> == 172.16.1.1 && ip4.dst == 192.168.1.1 && udp.src == 67 && udp.dst ==
>>>> 67 && reg0[0] == 1),
>>>>    action=(ip4.src = 192.168.1.1; udp.dst = 68; outport = "lrp1";
>>>> output; /* DHCP_RELAY_RESP */)
>>>> 
>>>>     I used reg0[0] as an example.  You may need to check the free
>>>> register bit and use it.
>>>> 
>>>>    You need to encode dhcp_relay_resp_chk as controller action with
>>>> pause=true, and store the relay_ip in the userdata buffer.
>>>>    And in pinctrl.c  check that  'dhcp_header.giaddr == relay_ip'
>>>> or not.  If so, set the result register bit to 1, else to 0.
>>>> 
>>>> Let me know if you've any questions.
>>>> 
>>> 
>>> Ack. Thanks for the suggestions and detailed explanation.
>>> Before implementation I had referred to icmp4_error and native dhcp_server flows
>>> but I had slight misunderstanding about pause flag.
>>> 


Regarding dhcp_relay_req: I think it might be better to implement two-stage processing
for dhcp_relay_req, similar to dhcp_relay_resp (something like below). This avoids needing
multiple OVN actions/fields if we update additional fields (like the hop count in the DHCP header) in the future.

table=3 (lr_in_ip_input     ), priority=110  , match=(inport == "lrp1" && ip4.src == 0.0.0.0 && 
ip4.dst == 255.255.255.255 && udp.src == 68 && udp.dst == 67), 
action=(regx[y] = dhcp_relay_req_chk(192.168.1.1,172.16.1.1);next; /* DHCP_RELAY_REQ */)

table=4 (lr_in_ip_dhcp_req     ), priority=110  , match=(inport == "lrp1" && ip4.src == 0.0.0.0 && 
ip4.dst == 255.255.255.255 && udp.src == 68 && udp.dst == 67 && regx[y]), 
action=(ip4.src=192.168.1.1;ip4.dst=172.16.1.1;
udp.src=67;next; /* DHCP_RELAY_REQ */)

table=4 (lr_in_ip_dhcp_req     ), priority=1  , match=(inport == "lrp1" && ip4.src == 0.0.0.0 && 
ip4.dst == 255.255.255.255 && udp.src == 68 && udp.dst == 67 && regx[y] == 0), 
action=drop;

Please let me know if you are fine with this approach.
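
To make the split of responsibilities concrete, here is a rough C model of the two stages. It is a sketch under assumptions, not proposed code: struct sketch_pkt and both function names are hypothetical, addresses are kept in host byte order for readability, and the validation mirrors only the op and giaddr checks from the current patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified view of the fields the two stages touch. */
struct sketch_pkt {
    uint32_t ip_src, ip_dst;
    uint16_t udp_src, udp_dst;
    uint8_t dhcp_op;          /* 1 = BOOTREQUEST */
    uint32_t dhcp_giaddr;
};

enum sketch_verdict { SKETCH_DROP, SKETCH_FORWARD };

/* Stage 1 (controller side, models dhcp_relay_req_chk): validate the
 * request and update the DHCP header only.  The return value models the
 * result register bit regx[y]. */
static bool
sketch_relay_req_chk(struct sketch_pkt *p, uint32_t relay_ip)
{
    if (p->dhcp_op != 1 || p->dhcp_giaddr != 0) {
        return false;
    }
    p->dhcp_giaddr = relay_ip;
    return true;
}

/* Stage 2 (logical flow side): rewrite the L3/L4 headers when the bit is
 * set, otherwise drop, as in the two table=4 flows. */
static enum sketch_verdict
sketch_relay_req_apply(struct sketch_pkt *p, bool chk_bit,
                       uint32_t relay_ip, uint32_t server_ip)
{
    if (!chk_bit) {
        return SKETCH_DROP;
    }
    p->ip_src = relay_ip;
    p->ip_dst = server_ip;
    p->udp_src = 67;
    return SKETCH_FORWARD;
}
```

With this split, any further DHCP header update (for example the hop count) would stay inside stage 1, and the logical flows would not need a new action or field.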



>>> 
>>>> 3.  The newly added functions in pinctrl.c have a lot of repetitive
>>>> code and it is very much similar to existing
>>>> pinctrl_handle_put_dhcp_opts()
>>>>  Please see if the duplicate code can be avoided.
>>> 
>>> Ack.
>>> 
>>> 
>>> 
>>>> [1] - https://github.com/ovn-org/ovn/commit/3d9fec3fd5992e1201b4d4fdf43f1f397e8d5ea1
>>>> 
>>>> Thanks
>>>> Numan
>>>> 
>>>>> ---
>>>>> controller/pinctrl.c  | 441 ++++++++++++++++++++++++++++++++++++++++++
>>>>> include/ovn/actions.h |  26 +++
>>>>> lib/actions.c         | 117 +++++++++++
>>>>> lib/ovn-l7.h          |   1 +
>>>>> northd/northd.c       | 177 ++++++++++++++++-
>>>>> ovn-nb.ovsschema      |  25 ++-
>>>>> ovn-nb.xml            |  28 +++
>>>>> tests/atlocal.in      |   3 +
>>>>> tests/ovn-northd.at   |  41 +++-
>>>>> tests/ovn.at          |  12 +-
>>>>> tests/system-ovn.at   | 150 ++++++++++++++
>>>>> utilities/ovn-trace.c |  28 +++
>>>>> 12 files changed, 1032 insertions(+), 17 deletions(-)
>>>>> 
>>>>> diff --git a/controller/pinctrl.c b/controller/pinctrl.c
>>>>> index 5a35d56f6..45240f01d 100644
>>>>> --- a/controller/pinctrl.c
>>>>> +++ b/controller/pinctrl.c
>>>>> @@ -1897,6 +1897,437 @@ is_dhcp_flags_broadcast(ovs_be16 flags)
>>>>>   return flags & htons(DHCP_BROADCAST_FLAG);
>>>>> }
>>>>> 
>>>>> +static const char *dhcp_msg_str[] = {
>>>>> +[0] = "INVALID",
>>>>> +[DHCP_MSG_DISCOVER] = "DISCOVER",
>>>>> +[DHCP_MSG_OFFER] = "OFFER",
>>>>> +[DHCP_MSG_REQUEST] = "REQUEST",
>>>>> +[OVN_DHCP_MSG_DECLINE] = "DECLINE",
>>>>> +[DHCP_MSG_ACK] = "ACK",
>>>>> +[DHCP_MSG_NAK] = "NAK",
>>>>> +[OVN_DHCP_MSG_RELEASE] = "RELEASE",
>>>>> +[OVN_DHCP_MSG_INFORM] = "INFORM"
>>>>> +};
>>>>> +
>>>>> +static bool
>>>>> +dhcp_relay_is_msg_type_supported(uint8_t msg_type)
>>>>> +{
>>>>> +    return (msg_type >= DHCP_MSG_DISCOVER && msg_type <= OVN_DHCP_MSG_RELEASE);
>>>>> +}
>>>>> +
>>>>> +static const char *dhcp_msg_str_get(uint8_t msg_type)
>>>>> +{
>>>>> +    if (!dhcp_relay_is_msg_type_supported(msg_type)) {
>>>>> +        return "INVALID";
>>>>> +    }
>>>>> +    return dhcp_msg_str[msg_type];
>>>>> +}
>>>>> +
>>>>> +/* Called with in the pinctrl_handler thread context. */
>>>>> +static void
>>>>> +pinctrl_handle_dhcp_relay_req(
>>>>> +    struct rconn *swconn,
>>>>> +    struct dp_packet *pkt_in, struct ofputil_packet_in *pin,
>>>>> +    struct ofpbuf *userdata,
>>>>> +    struct ofpbuf *continuation)
>>>>> +{
>>>>> +    enum ofp_version version = rconn_get_version(swconn);
>>>>> +    enum ofputil_protocol proto = ofputil_protocol_from_ofp_version(version);
>>>>> +    struct dp_packet *pkt_out_ptr = NULL;
>>>>> +
>>>>> +    /* Parse relay IP and server IP. */
>>>>> +    ovs_be32 *relay_ip = ofpbuf_try_pull(userdata, sizeof *relay_ip);
>>>>> +    ovs_be32 *server_ip = ofpbuf_try_pull(userdata, sizeof *server_ip);
>>>>> +    if (!relay_ip || !server_ip) {
>>>>> +        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
>>>>> +        VLOG_WARN_RL(&rl, "DHCP_RELAY_REQ: relay ip or server ip "
>>>>> +                  "not present in the userdata");
>>>>> +        return;
>>>>> +    }
>>>>> +
>>>>> +    /* Validate the DHCP request packet.
>>>>> +     * Format of the DHCP packet is
>>>>> +     * ------------------------------------------------------------------------
>>>>> +     *| UDP HEADER  | DHCP HEADER  | 4 Byte DHCP Cookie | DHCP OPTIONS(var len)|
>>>>> +     * ------------------------------------------------------------------------
>>>>> +     */
>>>>> +
>>>>> +    size_t in_l4_size = dp_packet_l4_size(pkt_in);
>>>>> +    const char *end = (char *) dp_packet_l4(pkt_in) + in_l4_size;
>>>>> +    const char *in_dhcp_ptr = dp_packet_get_udp_payload(pkt_in);
>>>>> +    if (!in_dhcp_ptr) {
>>>>> +        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
>>>>> +        VLOG_WARN_RL(&rl, "DHCP_RELAY_REQ: invalid or incomplete "
>>>>> +                  "DHCP packet received");
>>>>> +        return;
>>>>> +    }
>>>>> +
>>>>> +    const struct dhcp_header *in_dhcp_data
>>>>> +        = (const struct dhcp_header *) in_dhcp_ptr;
>>>>> +    in_dhcp_ptr += sizeof *in_dhcp_data;
>>>>> +    if (in_dhcp_ptr > end) {
>>>>> +        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
>>>>> +        VLOG_WARN_RL(&rl, "DHCP_RELAY_REQ: invalid or incomplete "
>>>>> +                "DHCP packet received, bad data length");
>>>>> +        return;
>>>>> +    }
>>>>> +    if (in_dhcp_data->op != DHCP_OP_REQUEST) {
>>>>> +        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
>>>>> +        VLOG_WARN_RL(&rl, "DHCP_RELAY_REQ: invalid opcode in the "
>>>>> +                "DHCP packet: %d", in_dhcp_data->op);
>>>>> +        return;
>>>>> +    }
>>>>> +
>>>>> +    /* DHCP options follow the DHCP header. The first 4 bytes of the DHCP
>>>>> +     * options is the DHCP magic cookie followed by the actual DHCP options.
>>>>> +     */
>>>>> +    ovs_be32 magic_cookie = htonl(DHCP_MAGIC_COOKIE);
>>>>> +    if (in_dhcp_ptr + sizeof magic_cookie > end ||
>>>>> +        get_unaligned_be32((const void *) in_dhcp_ptr) != magic_cookie) {
>>>>> +        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
>>>>> +        VLOG_WARN_RL(&rl, "DHCP_RELAY_REQ: magic cookie not present "
>>>>> +                "in the packet");
>>>>> +        return;
>>>>> +    }
>>>>> +
>>>>> +    if (in_dhcp_data->giaddr) {
>>>>> +        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
>>>>> +        VLOG_WARN_RL(&rl, "DHCP_RELAY_REQ: giaddr is already set");
>>>>> +        return;
>>>>> +    }
>>>>> +
>>>>> +    if (in_dhcp_data->htype != 0x1) {
>>>>> +        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
>>>>> +        VLOG_WARN_RL(&rl, "DHCP_RELAY_REQ: packet is received with "
>>>>> +                "unsupported hardware type");
>>>>> +        return;
>>>>> +    }
>>>>> +
>>>>> +    ovs_be32 *server_id_ptr = NULL;
>>>>> +    const uint8_t *in_dhcp_msg_type = NULL;
>>>>> +
>>>>> +    in_dhcp_ptr += sizeof magic_cookie;
>>>>> +    ovs_be32 request_ip = in_dhcp_data->ciaddr;
>>>>> +    while (in_dhcp_ptr < end) {
>>>>> +        const struct dhcp_opt_header *in_dhcp_opt =
>>>>> +            (const struct dhcp_opt_header *) in_dhcp_ptr;
>>>>> +        if (in_dhcp_opt->code == DHCP_OPT_END) {
>>>>> +            break;
>>>>> +        }
>>>>> +        if (in_dhcp_opt->code == DHCP_OPT_PAD) {
>>>>> +            in_dhcp_ptr += 1;
>>>>> +            continue;
>>>>> +        }
>>>>> +        in_dhcp_ptr += sizeof *in_dhcp_opt;
>>>>> +        if (in_dhcp_ptr > end) {
>>>>> +            break;
>>>>> +        }
>>>>> +        in_dhcp_ptr += in_dhcp_opt->len;
>>>>> +        if (in_dhcp_ptr > end) {
>>>>> +            break;
>>>>> +        }
>>>>> +
>>>>> +        switch (in_dhcp_opt->code) {
>>>>> +        case DHCP_OPT_MSG_TYPE:
>>>>> +            if (in_dhcp_opt->len == 1) {
>>>>> +                in_dhcp_msg_type = DHCP_OPT_PAYLOAD(in_dhcp_opt);
>>>>> +            }
>>>>> +            break;
>>>>> +        case DHCP_OPT_REQ_IP:
>>>>> +            if (in_dhcp_opt->len == 4) {
>>>>> +                request_ip = get_unaligned_be32(DHCP_OPT_PAYLOAD(in_dhcp_opt));
>>>>> +            }
>>>>> +            break;
>>>>> +        /* Server Identifier */
>>>>> +        case OVN_DHCP_OPT_CODE_SERVER_ID:
>>>>> +            if (in_dhcp_opt->len == 4) {
>>>>> +                server_id_ptr = DHCP_OPT_PAYLOAD(in_dhcp_opt);
>>>>> +            }
>>>>> +            break;
>>>>> +        default:
>>>>> +            break;
>>>>> +        }
>>>>> +    }
>>>>> +
>>>>> +    /* Check whether the DHCP Message Type (opt 53) is present or not */
>>>>> +    if (!in_dhcp_msg_type) {
>>>>> +        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
>>>>> +        VLOG_WARN_RL(&rl, "DHCP_RELAY_REQ: missing message type");
>>>>> +        return;
>>>>> +    }
>>>>> +
>>>>> +    /* Relay the DHCP request packet */
>>>>> +    uint16_t new_l4_size = in_l4_size;
>>>>> +    size_t new_packet_size = pkt_in->l4_ofs + new_l4_size;
>>>>> +
>>>>> +    struct dp_packet pkt_out;
>>>>> +    dp_packet_init(&pkt_out, new_packet_size);
>>>>> +    dp_packet_clear(&pkt_out);
>>>>> +    dp_packet_prealloc_tailroom(&pkt_out, new_packet_size);
>>>>> +    pkt_out_ptr = &pkt_out;
>>>>> +
>>>>> +    /* Copy the L2 and L3 headers from pkt_in, as they remain the same. */
>>>>> +    dp_packet_put(
>>>>> +        &pkt_out, dp_packet_pull(pkt_in, pkt_in->l4_ofs), pkt_in->l4_ofs);
>>>>> +
>>>>> +    pkt_out.l2_5_ofs = pkt_in->l2_5_ofs;
>>>>> +    pkt_out.l2_pad_size = pkt_in->l2_pad_size;
>>>>> +    pkt_out.l3_ofs = pkt_in->l3_ofs;
>>>>> +    pkt_out.l4_ofs = pkt_in->l4_ofs;
>>>>> +
>>>>> +    struct ip_header *out_ip = dp_packet_l3(&pkt_out);
>>>>> +
>>>>> +    struct udp_header *udp = dp_packet_put(
>>>>> +        &pkt_out, dp_packet_pull(pkt_in, UDP_HEADER_LEN), UDP_HEADER_LEN);
>>>>> +
>>>>> +    struct dhcp_header *dhcp_data = dp_packet_put(&pkt_out,
>>>>> +        dp_packet_pull(pkt_in, new_l4_size - UDP_HEADER_LEN),
>>>>> +        new_l4_size - UDP_HEADER_LEN);
>>>>> +    dhcp_data->giaddr = *relay_ip;
>>>>> +    if (udp->udp_csum) {
>>>>> +        udp->udp_csum = recalc_csum32(udp->udp_csum,
>>>>> +            0, dhcp_data->giaddr);
>>>>> +    }
>>>>> +    pin->packet = dp_packet_data(&pkt_out);
>>>>> +    pin->packet_len = dp_packet_size(&pkt_out);
>>>>> +
>>>>> +    /* Log the DHCP message. */
>>>>> +    static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(20, 40);
>>>>> +    const struct eth_header *l2 = dp_packet_eth(&pkt_out);
>>>>> +    VLOG_INFO_RL(&rl, "DHCP_RELAY_REQ:: MSG_TYPE:%s MAC:"ETH_ADDR_FMT
>>>>> +                " XID:%u"
>>>>> +                " REQ_IP:"IP_FMT
>>>>> +                " GIADDR:"IP_FMT
>>>>> +                " SERVER_ADDR:"IP_FMT,
>>>>> +                dhcp_msg_str_get(*in_dhcp_msg_type),
>>>>> +                ETH_ADDR_BYTES_ARGS(dhcp_data->chaddr), ntohl(dhcp_data->xid),
>>>>> +                IP_ARGS(request_ip), IP_ARGS(dhcp_data->giaddr),
>>>>> +                IP_ARGS(*server_ip));
>>>>> +    queue_msg(swconn, ofputil_encode_resume(pin, continuation, proto));
>>>>> +    if (pkt_out_ptr) {
>>>>> +        dp_packet_uninit(pkt_out_ptr);
>>>>> +    }
>>>>> +}
>>>>> +
>>>>> +/* Called with in the pinctrl_handler thread context. */
>>>>> +static void
>>>>> +pinctrl_handle_dhcp_relay_resp_fwd(
>>>>> +    struct rconn *swconn,
>>>>> +    struct dp_packet *pkt_in, struct ofputil_packet_in *pin,
>>>>> +    struct ofpbuf *userdata,
>>>>> +    struct ofpbuf *continuation)
>>>>> +{
>>>>> +    enum ofp_version version = rconn_get_version(swconn);
>>>>> +    enum ofputil_protocol proto = ofputil_protocol_from_ofp_version(version);
>>>>> +    struct dp_packet *pkt_out_ptr = NULL;
>>>>> +
>>>>> +    /* Parse relay IP and server IP. */
>>>>> +    ovs_be32 *relay_ip = ofpbuf_try_pull(userdata, sizeof *relay_ip);
>>>>> +    ovs_be32 *server_ip = ofpbuf_try_pull(userdata, sizeof *server_ip);
>>>>> +    if (!relay_ip || !server_ip) {
>>>>> +        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
>>>>> +        VLOG_WARN_RL(&rl, "DHCP_RELAY_RESP: relay ip or server ip "
>>>>> +                "not present in the userdata");
>>>>> +        return;
>>>>> +    }
>>>>> +
>>>>> +    /* Validate the DHCP response packet.
>>>>> +     * Format of the DHCP packet is
>>>>> +     * ------------------------------------------------------------------------
>>>>> +     *| UDP HEADER  | DHCP HEADER  | 4 Byte DHCP Cookie | DHCP OPTIONS(var len)|
>>>>> +     * ------------------------------------------------------------------------
>>>>> +     */
>>>>> +
>>>>> +    size_t in_l4_size = dp_packet_l4_size(pkt_in);
>>>>> +    const char *end = (char *) dp_packet_l4(pkt_in) + in_l4_size;
>>>>> +    const char *in_dhcp_ptr = dp_packet_get_udp_payload(pkt_in);
>>>>> +    if (!in_dhcp_ptr) {
>>>>> +        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
>>>>> +        VLOG_WARN_RL(&rl, "DHCP_RELAY_RESP_FWD: invalid or incomplete "
>>>>> +                "packet received");
>>>>> +        return;
>>>>> +    }
>>>>> +
>>>>> +    const struct dhcp_header *in_dhcp_data
>>>>> +        = (const struct dhcp_header *) in_dhcp_ptr;
>>>>> +    in_dhcp_ptr += sizeof *in_dhcp_data;
>>>>> +    if (in_dhcp_ptr > end) {
>>>>> +        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
>>>>> +        VLOG_WARN_RL(&rl, "DHCP_RELAY_RESP_FWD: invalid or incomplete "
>>>>> +                    "packet received, bad data length");
>>>>> +        return;
>>>>> +    }
>>>>> +    if (in_dhcp_data->op != DHCP_OP_REPLY) {
>>>>> +        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
>>>>> +        VLOG_WARN_RL(&rl, "DHCP_RELAY_RESP_FWD: invalid opcode "
>>>>> +                "in the packet: %d", in_dhcp_data->op);
>>>>> +        return;
>>>>> +    }
>>>>> +
>>>>> +    /* DHCP options follow the DHCP header. The first 4 bytes of the DHCP
>>>>> +     * options is the DHCP magic cookie followed by the actual DHCP options.
>>>>> +     */
>>>>> +    ovs_be32 magic_cookie = htonl(DHCP_MAGIC_COOKIE);
>>>>> +    if (in_dhcp_ptr + sizeof magic_cookie > end ||
>>>>> +        get_unaligned_be32((const void *) in_dhcp_ptr) != magic_cookie) {
>>>>> +        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
>>>>> +        VLOG_WARN_RL(&rl, "DHCP_RELAY_RESP_FWD: magic cookie not present "
>>>>> +                "in the packet");
>>>>> +        return;
>>>>> +    }
>>>>> +
>>>>> +    if (!in_dhcp_data->giaddr) {
>>>>> +        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
>>>>> +        VLOG_WARN_RL(&rl, "DHCP_RELAY_RESP_FWD: giaddr is "
>>>>> +                    "not set in response");
>>>>> +        return;
>>>>> +    }
>>>>> +    ovs_be32 giaddr = in_dhcp_data->giaddr;
>>>>> +
>>>>> +    ovs_be32 *server_id_ptr = NULL;
>>>>> +    ovs_be32 lease_time = 0;
>>>>> +    const uint8_t *in_dhcp_msg_type = NULL;
>>>>> +
>>>>> +    in_dhcp_ptr += sizeof magic_cookie;
>>>>> +    while (in_dhcp_ptr < end) {
>>>>> +        const struct dhcp_opt_header *in_dhcp_opt =
>>>>> +            (const struct dhcp_opt_header *) in_dhcp_ptr;
>>>>> +        if (in_dhcp_opt->code == DHCP_OPT_END) {
>>>>> +            break;
>>>>> +        }
>>>>> +        if (in_dhcp_opt->code == DHCP_OPT_PAD) {
>>>>> +            in_dhcp_ptr += 1;
>>>>> +            continue;
>>>>> +        }
>>>>> +        in_dhcp_ptr += sizeof *in_dhcp_opt;
>>>>> +        if (in_dhcp_ptr > end) {
>>>>> +            break;
>>>>> +        }
>>>>> +        in_dhcp_ptr += in_dhcp_opt->len;
>>>>> +        if (in_dhcp_ptr > end) {
>>>>> +            break;
>>>>> +        }
>>>>> +
>>>>> +        switch (in_dhcp_opt->code) {
>>>>> +        case DHCP_OPT_MSG_TYPE:
>>>>> +            if (in_dhcp_opt->len == 1) {
>>>>> +                in_dhcp_msg_type = DHCP_OPT_PAYLOAD(in_dhcp_opt);
>>>>> +            }
>>>>> +            break;
>>>>> +        /* Server Identifier */
>>>>> +        case OVN_DHCP_OPT_CODE_SERVER_ID:
>>>>> +            if (in_dhcp_opt->len == 4) {
>>>>> +                server_id_ptr = DHCP_OPT_PAYLOAD(in_dhcp_opt);
>>>>> +            }
>>>>> +            break;
>>>>> +        case OVN_DHCP_OPT_CODE_LEASE_TIME:
>>>>> +            if (in_dhcp_opt->len == 4) {
>>>>> +                lease_time = get_unaligned_be32(DHCP_OPT_PAYLOAD(in_dhcp_opt));
>>>>> +            }
>>>>> +            break;
>>>>> +        default:
>>>>> +            break;
>>>>> +        }
>>>>> +    }
>>>>> +
>>>>> +    /* Check whether the DHCP Message Type (opt 53) is present or not */
>>>>> +    if (!in_dhcp_msg_type) {
>>>>> +        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
>>>>> +        VLOG_WARN_RL(&rl, "DHCP_RELAY_RESP: missing message type");
>>>>> +        return;
>>>>> +    }
>>>>> +
>>>>> +    if (!server_id_ptr) {
>>>>> +        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
>>>>> +        VLOG_WARN_RL(&rl, "DHCP_RELAY_RESP: missing server identifier");
>>>>> +        return;
>>>>> +    }
>>>>> +
>>>>> +    if (*server_id_ptr != *server_ip) {
>>>>> +        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
>>>>> +        VLOG_WARN_RL(&rl, "DHCP_RELAY_RESP: server identifier mismatch");
>>>>> +        return;
>>>>> +    }
>>>>> +
>>>>> +    if (giaddr != *relay_ip) {
>>>>> +        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
>>>>> +        VLOG_WARN_RL(&rl, "DHCP_RELAY_RESP: giaddr mismatch");
>>>>> +        return;
>>>>> +    }
>>>>> +
>>>>> +
>>>>> +    /* Update destination MAC & IP so that the packet is forwarded to the
>>>>> +     * right destination node.
>>>>> +     */
>>>>> +    uint16_t new_l4_size = in_l4_size;
>>>>> +    size_t new_packet_size = pkt_in->l4_ofs + new_l4_size;
>>>>> +
>>>>> +    struct dp_packet pkt_out;
>>>>> +    dp_packet_init(&pkt_out, new_packet_size);
>>>>> +    dp_packet_clear(&pkt_out);
>>>>> +    dp_packet_prealloc_tailroom(&pkt_out, new_packet_size);
>>>>> +    pkt_out_ptr = &pkt_out;
>>>>> +
>>>>> +    /* Copy the L2 and L3 headers from pkt_in, as they remain the same. */
>>>>> +    struct eth_header *eth = dp_packet_put(
>>>>> +        &pkt_out, dp_packet_pull(pkt_in, pkt_in->l4_ofs), pkt_in->l4_ofs);
>>>>> +
>>>>> +    pkt_out.l2_5_ofs = pkt_in->l2_5_ofs;
>>>>> +    pkt_out.l2_pad_size = pkt_in->l2_pad_size;
>>>>> +    pkt_out.l3_ofs = pkt_in->l3_ofs;
>>>>> +    pkt_out.l4_ofs = pkt_in->l4_ofs;
>>>>> +
>>>>> +    struct udp_header *udp = dp_packet_put(
>>>>> +        &pkt_out, dp_packet_pull(pkt_in, UDP_HEADER_LEN), UDP_HEADER_LEN);
>>>>> +
>>>>> +    struct dhcp_header *dhcp_data = dp_packet_put(
>>>>> +        &pkt_out, dp_packet_pull(pkt_in, new_l4_size - UDP_HEADER_LEN),
>>>>> +        new_l4_size - UDP_HEADER_LEN);
>>>>> +    memcpy(&eth->eth_dst, dhcp_data->chaddr, sizeof(eth->eth_dst));
>>>>> +
>>>>> +    /* Send a broadcast IP frame when BROADCAST flag is set. */
>>>>> +    struct ip_header *out_ip = dp_packet_l3(&pkt_out);
>>>>> +    ovs_be32 ip_dst;
>>>>> +    ovs_be32 ip_dst_orig = get_16aligned_be32(&out_ip->ip_dst);
>>>>> +    if (!is_dhcp_flags_broadcast(dhcp_data->flags)) {
>>>>> +        ip_dst = dhcp_data->yiaddr;
>>>>> +    } else {
>>>>> +        ip_dst = htonl(0xffffffff);
>>>>> +    }
>>>>> +    put_16aligned_be32(&out_ip->ip_dst, ip_dst);
>>>>> +    out_ip->ip_csum = recalc_csum32(out_ip->ip_csum,
>>>>> +              ip_dst_orig, ip_dst);
>>>>> +    if (udp->udp_csum) {
>>>>> +        udp->udp_csum = recalc_csum32(udp->udp_csum,
>>>>> +            ip_dst_orig, ip_dst);
>>>>> +    }
>>>>> +    /* Reset giaddr */
>>>>> +    dhcp_data->giaddr = htonl(0x0);
>>>>> +    if (udp->udp_csum) {
>>>>> +        udp->udp_csum = recalc_csum32(udp->udp_csum,
>>>>> +            giaddr, 0);
>>>>> +    }
>>>>> +    pin->packet = dp_packet_data(&pkt_out);
>>>>> +    pin->packet_len = dp_packet_size(&pkt_out);
>>>>> +
>>>>> +    /* Log the DHCP message. */
>>>>> +    static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(20, 40);
>>>>> +    const struct eth_header *l2 = dp_packet_eth(&pkt_out);
>>>>> +    VLOG_INFO_RL(&rl, "DHCP_RELAY_RESP_FWD:: MSG_TYPE:%s MAC:"ETH_ADDR_FMT
>>>>> +             " XID:%u"
>>>>> +             " YIADDR:"IP_FMT
>>>>> +             " GIADDR:"IP_FMT
>>>>> +             " SERVER_ADDR:"IP_FMT,
>>>>> +             dhcp_msg_str_get(*in_dhcp_msg_type),
>>>>> +             ETH_ADDR_BYTES_ARGS(dhcp_data->chaddr), ntohl(dhcp_data->xid),
>>>>> +             IP_ARGS(dhcp_data->yiaddr),
>>>>> +             IP_ARGS(giaddr), IP_ARGS(*server_id_ptr));
>>>>> +    queue_msg(swconn, ofputil_encode_resume(pin, continuation, proto));
>>>>> +    if (pkt_out_ptr) {
>>>>> +        dp_packet_uninit(pkt_out_ptr);
>>>>> +    }
>>>>> +}
>>>>> +
>>>>> /* Called with in the pinctrl_handler thread context. */
>>>>> static void
>>>>> pinctrl_handle_put_dhcp_opts(
>>>>> @@ -3203,6 +3634,16 @@ process_packet_in(struct rconn *swconn, const struct ofp_header *msg)
>>>>>       ovs_mutex_unlock(&pinctrl_mutex);
>>>>>       break;
>>>>> 
>>>>> +    case ACTION_OPCODE_DHCP_RELAY_REQ:
>>>>> +        pinctrl_handle_dhcp_relay_req(swconn, &packet, &pin,
>>>>> +                                     &userdata, &continuation);
>>>>> +        break;
>>>>> +
>>>>> +    case ACTION_OPCODE_DHCP_RELAY_RESP_FWD:
>>>>> +        pinctrl_handle_dhcp_relay_resp_fwd(swconn, &packet, &pin,
>>>>> +                                     &userdata, &continuation);
>>>>> +        break;
>>>>> +
>>>>>   case ACTION_OPCODE_PUT_DHCP_OPTS:
>>>>>       pinctrl_handle_put_dhcp_opts(swconn, &packet, &pin, &headers,
>>>>>                                    &userdata, &continuation);
>>>>> diff --git a/include/ovn/actions.h b/include/ovn/actions.h
>>>>> index 49cfe0624..47d41b90f 100644
>>>>> --- a/include/ovn/actions.h
>>>>> +++ b/include/ovn/actions.h
>>>>> @@ -95,6 +95,8 @@ struct collector_set_ids;
>>>>>   OVNACT(LOOKUP_ND_IP,      ovnact_lookup_mac_bind_ip) \
>>>>>   OVNACT(PUT_DHCPV4_OPTS,   ovnact_put_opts)        \
>>>>>   OVNACT(PUT_DHCPV6_OPTS,   ovnact_put_opts)        \
>>>>> +    OVNACT(DHCPV4_RELAY_REQ,  ovnact_dhcp_relay)      \
>>>>> +    OVNACT(DHCPV4_RELAY_RESP_FWD, ovnact_dhcp_relay)      \
>>>>>   OVNACT(SET_QUEUE,         ovnact_set_queue)       \
>>>>>   OVNACT(DNS_LOOKUP,        ovnact_result)          \
>>>>>   OVNACT(LOG,               ovnact_log)             \
>>>>> @@ -387,6 +389,14 @@ struct ovnact_put_opts {
>>>>>   size_t n_options;
>>>>> };
>>>>> 
>>>>> +/* OVNACT_DHCP_RELAY. */
>>>>> +struct ovnact_dhcp_relay {
>>>>> +    struct ovnact ovnact;
>>>>> +    int family;
>>>>> +    ovs_be32 relay_ipv4;
>>>>> +    ovs_be32 server_ipv4;
>>>>> +};
>>>>> +
>>>>> /* Valid arguments to SET_QUEUE action.
>>>>> *
>>>>> * QDISC_MIN_QUEUE_ID is the default queue, so user-defined queues should
>>>>> @@ -750,6 +760,22 @@ enum action_opcode {
>>>>> 
>>>>>   /* multicast group split buffer action. */
>>>>>   ACTION_OPCODE_MG_SPLIT_BUF,
>>>>> +
>>>>> +    /* "dhcp_relay_req(relay_ip, server_ip)".
>>>>> +     *
>>>>> +     * Arguments follow the action_header, in this format:
>>>>> +     *   - The 32-bit DHCP relay IP.
>>>>> +     *   - The 32-bit DHCP server IP.
>>>>> +     */
>>>>> +    ACTION_OPCODE_DHCP_RELAY_REQ,
>>>>> +
>>>>> +    /* "dhcp_relay_resp_fwd(relay_ip, server_ip)".
>>>>> +     *
>>>>> +     * Arguments follow the action_header, in this format:
>>>>> +     *   - The 32-bit DHCP relay IP.
>>>>> +     *   - The 32-bit DHCP server IP.
>>>>> +     */
>>>>> +    ACTION_OPCODE_DHCP_RELAY_RESP_FWD,
>>>>> };
>>>>> 
>>>>> /* Header. */
>>>>> diff --git a/lib/actions.c b/lib/actions.c
>>>>> index a73fe1a1e..69df428c6 100644
>>>>> --- a/lib/actions.c
>>>>> +++ b/lib/actions.c
>>>>> @@ -2629,6 +2629,118 @@ ovnact_controller_event_free(struct ovnact_controller_event *event)
>>>>>   free_gen_options(event->options, event->n_options);
>>>>> }
>>>>> 
>>>>> +static void
>>>>> +format_DHCPV4_RELAY_REQ(const struct ovnact_dhcp_relay *dhcp_relay,
>>>>> +                struct ds *s)
>>>>> +{
>>>>> +    ds_put_format(s, "dhcp_relay_req("IP_FMT","IP_FMT");",
>>>>> +                  IP_ARGS(dhcp_relay->relay_ipv4),
>>>>> +                  IP_ARGS(dhcp_relay->server_ipv4));
>>>>> +}
>>>>> +
>>>>> +static void
>>>>> +parse_dhcp_relay_req(struct action_context *ctx,
>>>>> +               struct ovnact_dhcp_relay *dhcp_relay)
>>>>> +{
>>>>> +    /* Skip dhcp_relay_req( */
>>>>> +    lexer_force_match(ctx->lexer, LEX_T_LPAREN);
>>>>> +
>>>>> +    /* Parse relay ip and server ip. */
>>>>> +    if (ctx->lexer->token.format == LEX_F_IPV4) {
>>>>> +        dhcp_relay->family = AF_INET;
>>>>> +        dhcp_relay->relay_ipv4 = ctx->lexer->token.value.ipv4;
>>>>> +        lexer_get(ctx->lexer);
>>>>> +        lexer_force_match(ctx->lexer, LEX_T_COMMA);
>>>>> +        if (ctx->lexer->token.format == LEX_F_IPV4) {
>>>>> +            dhcp_relay->family = AF_INET;
>>>>> +            dhcp_relay->server_ipv4 = ctx->lexer->token.value.ipv4;
>>>>> +            lexer_get(ctx->lexer);
>>>>> +        } else {
>>>>> +            lexer_syntax_error(ctx->lexer, "expecting IPv4 dhcp server ip");
>>>>> +            return;
>>>>> +        }
>>>>> +    } else {
>>>>> +          lexer_syntax_error(ctx->lexer, "expecting IPv4 dhcp relay "
>>>>> +                          "and server ips");
>>>>> +          return;
>>>>> +    }
>>>>> +    lexer_force_match(ctx->lexer, LEX_T_RPAREN);
>>>>> +}
>>>>> +
>>>>> +static void
>>>>> +encode_DHCPV4_RELAY_REQ(const struct ovnact_dhcp_relay *dhcp_relay,
>>>>> +                    const struct ovnact_encode_params *ep,
>>>>> +                    struct ofpbuf *ofpacts)
>>>>> +{
>>>>> +    size_t oc_offset = encode_start_controller_op(ACTION_OPCODE_DHCP_RELAY_REQ,
>>>>> +                                                  true, ep->ctrl_meter_id,
>>>>> +                                                  ofpacts);
>>>>> +    ofpbuf_put(ofpacts, &dhcp_relay->relay_ipv4,
>>>>> +            sizeof(dhcp_relay->relay_ipv4));
>>>>> +    ofpbuf_put(ofpacts, &dhcp_relay->server_ipv4,
>>>>> +            sizeof(dhcp_relay->server_ipv4));
>>>>> +    encode_finish_controller_op(oc_offset, ofpacts);
>>>>> +}
>>>>> +
>>>>> +static void
>>>>> +format_DHCPV4_RELAY_RESP_FWD(const struct ovnact_dhcp_relay *dhcp_relay,
>>>>> +                    struct ds *s)
>>>>> +{
>>>>> +    ds_put_format(s, "dhcp_relay_resp_fwd("IP_FMT","IP_FMT");",
>>>>> +                  IP_ARGS(dhcp_relay->relay_ipv4),
>>>>> +                  IP_ARGS(dhcp_relay->server_ipv4));
>>>>> +}
>>>>> +
>>>>> +static void
>>>>> +parse_dhcp_relay_resp_fwd(struct action_context *ctx,
>>>>> +               struct ovnact_dhcp_relay *dhcp_relay)
>>>>> +{
>>>>> +    /* Skip dhcp_relay_resp_fwd( */
>>>>> +    lexer_force_match(ctx->lexer, LEX_T_LPAREN);
>>>>> +
>>>>> +    /* Parse relay ip and server ip. */
>>>>> +    if (ctx->lexer->token.format == LEX_F_IPV4) {
>>>>> +        dhcp_relay->family = AF_INET;
>>>>> +        dhcp_relay->relay_ipv4 = ctx->lexer->token.value.ipv4;
>>>>> +        lexer_get(ctx->lexer);
>>>>> +        lexer_force_match(ctx->lexer, LEX_T_COMMA);
>>>>> +        if (ctx->lexer->token.format == LEX_F_IPV4) {
>>>>> +            dhcp_relay->family = AF_INET;
>>>>> +            dhcp_relay->server_ipv4 = ctx->lexer->token.value.ipv4;
>>>>> +            lexer_get(ctx->lexer);
>>>>> +        } else {
>>>>> +            lexer_syntax_error(ctx->lexer, "expecting IPv4 dhcp server ip");
>>>>> +            return;
>>>>> +        }
>>>>> +    } else {
>>>>> +          lexer_syntax_error(ctx->lexer, "expecting IPv4 dhcp relay and "
>>>>> +                          "server ips");
>>>>> +          return;
>>>>> +    }
>>>>> +    lexer_force_match(ctx->lexer, LEX_T_RPAREN);
>>>>> +}
>>>>> +
>>>>> +static void
>>>>> +encode_DHCPV4_RELAY_RESP_FWD(const struct ovnact_dhcp_relay *dhcp_relay,
>>>>> +                    const struct ovnact_encode_params *ep,
>>>>> +                    struct ofpbuf *ofpacts)
>>>>> +{
>>>>> +    size_t oc_offset = encode_start_controller_op(
>>>>> +                                ACTION_OPCODE_DHCP_RELAY_RESP_FWD,
>>>>> +                                true, ep->ctrl_meter_id,
>>>>> +                                ofpacts);
>>>>> +    ofpbuf_put(ofpacts, &dhcp_relay->relay_ipv4,
>>>>> +                  sizeof(dhcp_relay->relay_ipv4));
>>>>> +    ofpbuf_put(ofpacts, &dhcp_relay->server_ipv4,
>>>>> +                  sizeof(dhcp_relay->server_ipv4));
>>>>> +    encode_finish_controller_op(oc_offset, ofpacts);
>>>>> +}
>>>>> +
>>>>> +static void ovnact_dhcp_relay_free(
>>>>> +          struct ovnact_dhcp_relay *dhcp_relay OVS_UNUSED)
>>>>> +{
>>>>> +}
>>>>> +
>>>>> static void
>>>>> parse_put_opts(struct action_context *ctx, const struct expr_field *dst,
>>>>>              struct ovnact_put_opts *po, const struct hmap *gen_opts,
>>>>> @@ -5451,6 +5563,11 @@ parse_action(struct action_context *ctx)
>>>>>       parse_sample(ctx);
>>>>>   } else if (lexer_match_id(ctx->lexer, "mac_cache_use")) {
>>>>>       ovnact_put_MAC_CACHE_USE(ctx->ovnacts);
>>>>> +    } else if (lexer_match_id(ctx->lexer, "dhcp_relay_req")) {
>>>>> +        parse_dhcp_relay_req(ctx, ovnact_put_DHCPV4_RELAY_REQ(ctx->ovnacts));
>>>>> +    } else if (lexer_match_id(ctx->lexer, "dhcp_relay_resp_fwd")) {
>>>>> +        parse_dhcp_relay_resp_fwd(ctx,
>>>>> +              ovnact_put_DHCPV4_RELAY_RESP_FWD(ctx->ovnacts));
>>>>>   } else {
>>>>>       lexer_syntax_error(ctx->lexer, "expecting action");
>>>>>   }
>>>>> diff --git a/lib/ovn-l7.h b/lib/ovn-l7.h
>>>>> index ad514a922..e08581123 100644
>>>>> --- a/lib/ovn-l7.h
>>>>> +++ b/lib/ovn-l7.h
>>>>> @@ -69,6 +69,7 @@ struct gen_opts_map {
>>>>> */
>>>>> #define OVN_DHCP_OPT_CODE_NETMASK      1
>>>>> #define OVN_DHCP_OPT_CODE_LEASE_TIME   51
>>>>> +#define OVN_DHCP_OPT_CODE_SERVER_ID    54
>>>>> #define OVN_DHCP_OPT_CODE_T1           58
>>>>> #define OVN_DHCP_OPT_CODE_T2           59
>>>>> 
>>>>> diff --git a/northd/northd.c b/northd/northd.c
>>>>> index 07dffb15a..7ac831fae 100644
>>>>> --- a/northd/northd.c
>>>>> +++ b/northd/northd.c
>>>>> @@ -181,11 +181,13 @@ enum ovn_stage {
>>>>>   PIPELINE_STAGE(ROUTER, IN,  IP_ROUTING_ECMP, 14, "lr_in_ip_routing_ecmp") \
>>>>>   PIPELINE_STAGE(ROUTER, IN,  POLICY,          15, "lr_in_policy")          \
>>>>>   PIPELINE_STAGE(ROUTER, IN,  POLICY_ECMP,     16, "lr_in_policy_ecmp")     \
>>>>> -    PIPELINE_STAGE(ROUTER, IN,  ARP_RESOLVE,     17, "lr_in_arp_resolve")     \
>>>>> -    PIPELINE_STAGE(ROUTER, IN,  CHK_PKT_LEN,     18, "lr_in_chk_pkt_len")     \
>>>>> -    PIPELINE_STAGE(ROUTER, IN,  LARGER_PKTS,     19, "lr_in_larger_pkts")     \
>>>>> -    PIPELINE_STAGE(ROUTER, IN,  GW_REDIRECT,     20, "lr_in_gw_redirect")     \
>>>>> -    PIPELINE_STAGE(ROUTER, IN,  ARP_REQUEST,     21, "lr_in_arp_request")     \
>>>>> +    PIPELINE_STAGE(ROUTER, IN,  DHCP_RELAY_RESP_FWD, 17,                      \
>>>>> +                  "lr_in_dhcp_relay_resp_fwd")                                \
>>>>> +    PIPELINE_STAGE(ROUTER, IN,  ARP_RESOLVE,     18, "lr_in_arp_resolve")     \
>>>>> +    PIPELINE_STAGE(ROUTER, IN,  CHK_PKT_LEN,     19, "lr_in_chk_pkt_len")     \
>>>>> +    PIPELINE_STAGE(ROUTER, IN,  LARGER_PKTS,     20, "lr_in_larger_pkts")     \
>>>>> +    PIPELINE_STAGE(ROUTER, IN,  GW_REDIRECT,     21, "lr_in_gw_redirect")     \
>>>>> +    PIPELINE_STAGE(ROUTER, IN,  ARP_REQUEST,     22, "lr_in_arp_request")     \
>>>>>                                                                     \
>>>>>   /* Logical router egress stages. */                               \
>>>>>   PIPELINE_STAGE(ROUTER, OUT, CHECK_DNAT_LOCAL,   0,                       \
>>>>> @@ -9610,6 +9612,80 @@ build_dhcpv6_options_flows(struct ovn_port *op,
>>>>>   ds_destroy(&match);
>>>>> }
>>>>> 
>>>>> +static void
>>>>> +build_lswitch_dhcp_relay_flows(struct ovn_port *op,
>>>>> +                           const struct hmap *lr_ports,
>>>>> +                           const struct hmap *lflows,
>>>>> +                           const struct shash *meter_groups OVS_UNUSED)
>>>>> +{
>>>>> +    if (op->nbrp || !op->nbsp) {
>>>>> +        return;
>>>>> +    }
>>>>> +    /* consider only ports attached to VMs */
>>>>> +    if (strcmp(op->nbsp->type, "")) {
>>>>> +        return;
>>>>> +    }
>>>>> +
>>>>> +    if (!op->od || !op->od->n_router_ports ||
>>>>> +        !op->od->nbs || !op->od->nbs->dhcp_relay_port) {
>>>>> +        return;
>>>>> +    }
>>>>> +
>>>>> +    struct ds match = DS_EMPTY_INITIALIZER;
>>>>> +    struct ds action = DS_EMPTY_INITIALIZER;
>>>>> +    struct nbrec_logical_router_port *lrp = op->od->nbs->dhcp_relay_port;
>>>>> +    struct ovn_port *rp = ovn_port_find(lr_ports, lrp->name);
>>>>> +
>>>>> +    if (!rp || !rp->nbrp || !rp->nbrp->dhcp_relay) {
>>>>> +        return;
>>>>> +    }
>>>>> +
>>>>> +    struct ovn_port *sp = NULL;
>>>>> +    struct nbrec_dhcp_relay *dhcp_relay = rp->nbrp->dhcp_relay;
>>>>> +
>>>>> +    for (int i = 0; i < op->od->n_router_ports; i++) {
>>>>> +        struct ovn_port *sp_tmp = op->od->router_ports[i];
>>>>> +        if (sp_tmp->peer == rp) {
>>>>> +            sp = sp_tmp;
>>>>> +            break;
>>>>> +        }
>>>>> +    }
>>>>> +    if (!sp) {
>>>>> +      return;
>>>>> +    }
>>>>> +
>>>>> +    char *server_ip_str = NULL;
>>>>> +    uint16_t port;
>>>>> +    int addr_family;
>>>>> +    struct in6_addr server_ip;
>>>>> +
>>>>> +    if (!ip_address_and_port_from_lb_key(dhcp_relay->servers, &server_ip_str,
>>>>> +                                         &server_ip, &port, &addr_family)) {
>>>>> +        return;
>>>>> +    }
>>>>> +
>>>>> +    if (server_ip_str == NULL) {
>>>>> +        return;
>>>>> +    }
>>>>> +
>>>>> +    ds_put_format(
>>>>> +        &match, "inport == %s && eth.src == %s && "
>>>>> +        "ip4.src == 0.0.0.0 && ip4.dst == 255.255.255.255 && "
>>>>> +        "udp.src == 68 && udp.dst == 67",
>>>>> +        op->json_key, op->lsp_addrs[0].ea_s);
>>>>> +    ds_put_format(&action,
>>>>> +                  "eth.dst=%s;outport=%s;next;/* DHCP_RELAY_REQ */",
>>>>> +                  rp->lrp_networks.ea_s,sp->json_key);
>>>>> +    ovn_lflow_add_with_hint__(lflows, op->od,
>>>>> +                              S_SWITCH_IN_L2_LKUP, 100,
>>>>> +                              ds_cstr(&match),
>>>>> +                              ds_cstr(&action),
>>>>> +                              op->key,
>>>>> +                              NULL,
>>>>> +                              &lrp->header_);
>>>>> +    free(server_ip_str);
>>>>> +}
>>>>> +
>>>>> static void
>>>>> build_drop_arp_nd_flows_for_unbound_router_ports(struct ovn_port *op,
>>>>>                                                const struct ovn_port *port,
>>>>> @@ -10181,6 +10257,13 @@ build_lswitch_dhcp_options_and_response(struct ovn_port *op,
>>>>>       return;
>>>>>   }
>>>>> 
>>>>> +    if (op->od && op->od->nbs
>>>>> +        && op->od->nbs->dhcp_relay_port) {
>>>>> +        /* Don't add the DHCP server flows if DHCP Relay is enabled on the
>>>>> +         * logical switch. */
>>>>> +        return;
>>>>> +    }
>>>>> +
>>>>>   bool is_external = lsp_is_external(op->nbsp);
>>>>>   if (is_external && (!op->od->n_localnet_ports ||
>>>>>                       !op->nbsp->ha_chassis_group)) {
>>>>> @@ -14458,6 +14541,86 @@ build_dhcpv6_reply_flows_for_lrouter_port(
>>>>>   }
>>>>> }
>>>>> 
>>>>> +static void
>>>>> +build_dhcp_relay_flows_for_lrouter_port(
>>>>> +        struct ovn_port *op, struct hmap *lflows,
>>>>> +        struct ds *match)
>>>>> +{
>>>>> +    if (!op->nbrp || !op->nbrp->dhcp_relay) {
>>>>> +        return;
>>>>> +    }
>>>>> +    struct nbrec_dhcp_relay *dhcp_relay = op->nbrp->dhcp_relay;
>>>>> +    if (!dhcp_relay->servers) {
>>>>> +        return;
>>>>> +    }
>>>>> +
>>>>> +    int addr_family;
>>>>> +    /* currently not supporting custom port */
>>>>> +    uint16_t port;
>>>>> +    char *server_ip_str = NULL;
>>>>> +    struct in6_addr server_ip;
>>>>> +
>>>>> +    if (!ip_address_and_port_from_lb_key(dhcp_relay->servers, &server_ip_str,
>>>>> +                                         &server_ip, &port, &addr_family)) {
>>>>> +        return;
>>>>> +    }
>>>>> +
>>>>> +    if (server_ip_str == NULL) {
>>>>> +        return;
>>>>> +    }
>>>>> +
>>>>> +    struct ds dhcp_action = DS_EMPTY_INITIALIZER;
>>>>> +    ds_clear(match);
>>>>> +    ds_put_format(
>>>>> +        match, "inport == %s && "
>>>>> +        "ip4.src == 0.0.0.0 && ip4.dst == 255.255.255.255 && "
>>>>> +        "udp.src == 68 && udp.dst == 67",
>>>>> +        op->json_key);
>>>>> +    ds_put_format(&dhcp_action,
>>>>> +                "dhcp_relay_req(%s,%s);"
>>>>> +                "ip4.src=%s;ip4.dst=%s;udp.src=67;next; /* DHCP_RELAY_REQ */",
>>>>> +                op->lrp_networks.ipv4_addrs[0].addr_s, server_ip_str,
>>>>> +                op->lrp_networks.ipv4_addrs[0].addr_s, server_ip_str);
>>>>> +
>>>>> +    ovn_lflow_add_with_hint(lflows, op->od, S_ROUTER_IN_IP_INPUT, 110,
>>>>> +                            ds_cstr(match), ds_cstr(&dhcp_action),
>>>>> +                            &op->nbrp->header_);
>>>>> +
>>>>> +    ds_clear(match);
>>>>> +    ds_clear(&dhcp_action);
>>>>> +
>>>>> +    ds_put_format(
>>>>> +        match, "ip4.src == %s && ip4.dst == %s && "
>>>>> +        "udp.src == 67 && udp.dst == 67",
>>>>> +        server_ip_str, op->lrp_networks.ipv4_addrs[0].addr_s);
>>>>> +    ds_put_format(&dhcp_action, "next;/* DHCP_RELAY_RESP */");
>>>>> +    ovn_lflow_add_with_hint(lflows, op->od, S_ROUTER_IN_IP_INPUT, 110,
>>>>> +                            ds_cstr(match), ds_cstr(&dhcp_action),
>>>>> +                            &op->nbrp->header_);
>>>>> +
>>>>> +    ds_clear(match);
>>>>> +    ds_clear(&dhcp_action);
>>>>> +
>>>>> +    ds_put_format(
>>>>> +        match, "ip4.src == %s && ip4.dst == %s && "
>>>>> +        "udp.src == 67 && udp.dst == 67",
>>>>> +        server_ip_str, op->lrp_networks.ipv4_addrs[0].addr_s);
>>>>> +    ds_put_format(&dhcp_action,
>>>>> +          "dhcp_relay_resp_fwd(%s,%s);ip4.src=%s;udp.dst=68;"
>>>>> +          "outport=%s;output; /* DHCP_RELAY_RESP */",
>>>>> +          op->lrp_networks.ipv4_addrs[0].addr_s, server_ip_str,
>>>>> +          op->lrp_networks.ipv4_addrs[0].addr_s, op->json_key);
>>>>> +    ovn_lflow_add_with_hint(lflows, op->od, S_ROUTER_IN_DHCP_RELAY_RESP_FWD,
>>>>> +                            110,
>>>>> +                            ds_cstr(match), ds_cstr(&dhcp_action),
>>>>> +                            &op->nbrp->header_);
>>>>> +
>>>>> +    ds_clear(match);
>>>>> +    ds_clear(&dhcp_action);
>>>>> +
>>>>> +    free(server_ip_str);
>>>>> +}
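
For readers following the router-side flows, here is a minimal standalone sketch of the two match strings the function above builds. It uses plain snprintf() instead of the OVS "struct ds" helpers, and the names format_relay_req_match() and format_relay_resp_match() are hypothetical, not part of the patch. The format strings are copied from the hunk above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not in the patch): formats the router-side match
 * for a client broadcast.  'inport_json' is the already-quoted port name,
 * e.g. "\"lrp1\"", which the patch takes from op->json_key.
 * Returns 0 on success, -1 if the buffer is too small. */
static int
format_relay_req_match(char *buf, size_t len, const char *inport_json)
{
    assert(buf && inport_json);
    int n = snprintf(buf, len,
                     "inport == %s && "
                     "ip4.src == 0.0.0.0 && ip4.dst == 255.255.255.255 && "
                     "udp.src == 68 && udp.dst == 67",
                     inport_json);
    return (n >= 0 && (size_t) n < len) ? 0 : -1;
}

/* Hypothetical helper (not in the patch): formats the match for the
 * server's unicast reply addressed to the LRP IP (the GIADDR). */
static int
format_relay_resp_match(char *buf, size_t len,
                        const char *server_ip, const char *lrp_ip)
{
    assert(buf && server_ip && lrp_ip);
    int n = snprintf(buf, len,
                     "ip4.src == %s && ip4.dst == %s && "
                     "udp.src == 67 && udp.dst == 67",
                     server_ip, lrp_ip);
    return (n >= 0 && (size_t) n < len) ? 0 : -1;
}
```

With inport "lrp1", server 172.16.1.1 and LRP IP 192.168.1.1, these give the same match text that the ovn-northd.at test in this patch expects for the lr_in_ip_input flows.
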
>>>>> +
>>>>> static void
>>>>> build_ipv6_input_flows_for_lrouter_port(
>>>>>       struct ovn_port *op, struct hmap *lflows,
>>>>> @@ -15673,6 +15836,8 @@ build_lrouter_nat_defrag_and_lb(struct ovn_datapath *od, struct hmap *lflows,
>>>>>   ovn_lflow_add(lflows, od, S_ROUTER_OUT_POST_SNAT, 0, "1", "next;");
>>>>>   ovn_lflow_add(lflows, od, S_ROUTER_OUT_EGR_LOOP, 0, "1", "next;");
>>>>>   ovn_lflow_add(lflows, od, S_ROUTER_IN_ECMP_STATEFUL, 0, "1", "next;");
>>>>> +    ovn_lflow_add(lflows, od, S_ROUTER_IN_DHCP_RELAY_RESP_FWD, 0, "1",
>>>>> +                  "next;");
>>>>> 
>>>>>   const char *ct_flag_reg = features->ct_no_masked_label
>>>>>                             ? "ct_mark"
>>>>> @@ -16154,6 +16319,7 @@ build_lswitch_and_lrouter_iterate_by_lsp(struct ovn_port *op,
>>>>>   build_lswitch_dhcp_options_and_response(op, lflows, meter_groups);
>>>>>   build_lswitch_external_port(op, lflows);
>>>>>   build_lswitch_ip_unicast_lookup(op, lflows, actions, match);
>>>>> +    build_lswitch_dhcp_relay_flows(op, lr_ports, lflows, meter_groups);
>>>>> 
>>>>>   /* Build Logical Router Flows. */
>>>>>   build_ip_routing_flows_for_router_type_lsp(op, lr_ports, lflows);
>>>>> @@ -16183,6 +16349,7 @@ build_lswitch_and_lrouter_iterate_by_lrp(struct ovn_port *op,
>>>>>   build_egress_delivery_flows_for_lrouter_port(op, lsi->lflows, &lsi->match,
>>>>>                                                &lsi->actions);
>>>>>   build_dhcpv6_reply_flows_for_lrouter_port(op, lsi->lflows, &lsi->match);
>>>>> +    build_dhcp_relay_flows_for_lrouter_port(op, lsi->lflows, &lsi->match);
>>>>>   build_ipv6_input_flows_for_lrouter_port(op, lsi->lflows,
>>>>>                                           &lsi->match, &lsi->actions,
>>>>>                                           lsi->meter_groups);
>>>>> diff --git a/ovn-nb.ovsschema b/ovn-nb.ovsschema
>>>>> index b2e0993e0..6863d52cd 100644
>>>>> --- a/ovn-nb.ovsschema
>>>>> +++ b/ovn-nb.ovsschema
>>>>> @@ -1,7 +1,7 @@
>>>>> {
>>>>>   "name": "OVN_Northbound",
>>>>> -    "version": "7.2.0",
>>>>> -    "cksum": "1069338687 34162",
>>>>> +    "version": "7.3.0",
>>>>> +    "cksum": "2325497400 35185",
>>>>>   "tables": {
>>>>>       "NB_Global": {
>>>>>           "columns": {
>>>>> @@ -89,7 +89,12 @@
>>>>>                   "type": {"key": {"type": "uuid",
>>>>>                                    "refTable": "Forwarding_Group",
>>>>>                                    "refType": "strong"},
>>>>> -                                     "min": 0, "max": "unlimited"}}},
>>>>> +                                     "min": 0, "max": "unlimited"}},
>>>>> +                "dhcp_relay_port": {"type": {"key": {"type": "uuid",
>>>>> +                                            "refTable": "Logical_Router_Port",
>>>>> +                                            "refType": "weak"},
>>>>> +                                            "min": 0,
>>>>> +                                            "max": 1}}},
>>>>>           "isRoot": true},
>>>>>       "Logical_Switch_Port": {
>>>>>           "columns": {
>>>>> @@ -436,6 +441,11 @@
>>>>>               "ipv6_prefix": {"type": {"key": "string",
>>>>>                                     "min": 0,
>>>>>                                     "max": "unlimited"}},
>>>>> +                "dhcp_relay": {"type": {"key": {"type": "uuid",
>>>>> +                                            "refTable": "DHCP_Relay",
>>>>> +                                            "refType": "weak"},
>>>>> +                                            "min": 0,
>>>>> +                                            "max": 1}},
>>>>>               "external_ids": {
>>>>>                   "type": {"key": "string", "value": "string",
>>>>>                            "min": 0, "max": "unlimited"}},
>>>>> @@ -529,6 +539,15 @@
>>>>>                   "type": {"key": "string", "value": "string",
>>>>>                            "min": 0, "max": "unlimited"}}},
>>>>>           "isRoot": true},
>>>>> +        "DHCP_Relay": {
>>>>> +            "columns": {
>>>>> +                "servers": {"type": {"key": "string",
>>>>> +                                       "min": 0,
>>>>> +                                       "max": 1}},
>>>>> +                "external_ids": {
>>>>> +                    "type": {"key": "string", "value": "string",
>>>>> +                             "min": 0, "max": "unlimited"}}},
>>>>> +            "isRoot": true},
>>>>>       "Connection": {
>>>>>           "columns": {
>>>>>               "target": {"type": "string"},
>>>>> diff --git a/ovn-nb.xml b/ovn-nb.xml
>>>>> index fcb1c6ecc..dc20892e1 100644
>>>>> --- a/ovn-nb.xml
>>>>> +++ b/ovn-nb.xml
>>>>> @@ -608,6 +608,11 @@
>>>>>     Please see the <ref table="DNS"/> table.
>>>>>   </column>
>>>>> 
>>>>> +    <column name="dhcp_relay_port">
>>>>> +      This column defines the <ref table="Logical_Router_Port"/> on which
>>>>> +      DHCP relay is enabled.
>>>>> +    </column>
>>>>> +
>>>>>   <column name="forwarding_groups">
>>>>>     Groups a set of logical port endpoints for traffic going out of the
>>>>>     logical switch.
>>>>> @@ -2980,6 +2985,11 @@ or
>>>>>     port has all ingress and egress traffic dropped.
>>>>>   </column>
>>>>> 
>>>>> +    <column name="dhcp_relay">
>>>>> +      This column is used to enable DHCP Relay. Please refer
>>>>> +      to <ref table="DHCP_Relay"/> table.
>>>>> +    </column>
>>>>> +
>>>>>   <group title="Distributed Gateway Ports">
>>>>>     <p>
>>>>>       Gateways, as documented under <code>Gateways</code> in the OVN
>>>>> @@ -4286,6 +4296,24 @@ or
>>>>>   </group>
>>>>> </table>
>>>>> 
>>>>> +  <table name="DHCP_Relay" title="DHCP Relay">
>>>>> +    <p>
>>>>> +      OVN implements native DHCPv4 relay support which caters to the common
>>>>> +      use case of relaying the DHCP requests to external DHCP server.
>>>>> +    </p>
>>>>> +
>>>>> +    <column name="servers">
>>>>> +      <p>
>>>>> +        The DHCPv4 server IP address.
>>>>> +      </p>
>>>>> +    </column>
>>>>> +    <group title="Common Columns">
>>>>> +      <column name="external_ids">
>>>>> +        See <em>External IDs</em> at the beginning of this document.
>>>>> +      </column>
>>>>> +    </group>
>>>>> +  </table>
>>>>> +
>>>>> <table name="Connection" title="OVSDB client connections.">
>>>>>   <p>
>>>>>     Configuration for a database connection to an Open vSwitch database
>>>>> diff --git a/tests/atlocal.in b/tests/atlocal.in
>>>>> index 63d891b89..32d1c374e 100644
>>>>> --- a/tests/atlocal.in
>>>>> +++ b/tests/atlocal.in
>>>>> @@ -187,6 +187,9 @@ fi
>>>>> # Set HAVE_DHCPD
>>>>> find_command dhcpd
>>>>> 
>>>>> +# Set HAVE_DHCLIENT
>>>>> +find_command dhclient
>>>>> +
>>>>> # Set HAVE_BFDD_BEACON
>>>>> find_command bfdd-beacon
>>>>> 
>>>>> diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
>>>>> index 19e4f1263..4d8c9ff26 100644
>>>>> --- a/tests/ovn-northd.at
>>>>> +++ b/tests/ovn-northd.at
>>>>> @@ -8786,9 +8786,9 @@ ovn-nbctl --wait=sb set logical_router_port R1-PUB options:redirect-type=bridged
>>>>> ovn-sbctl dump-flows R1 > R1flows
>>>>> AT_CAPTURE_FILE([R1flows])
>>>>> 
>>>>> -AT_CHECK([grep "lr_in_arp_resolve" R1flows | grep priority=90 | sort], [0], [dnl
>>>>> -  table=17(lr_in_arp_resolve  ), priority=90   , match=(outport == "R1-PUB" && ip4.src == 10.0.0.3 && is_chassis_resident("S0-P0")), action=(get_arp(outport, reg0); next;)
>>>>> -  table=17(lr_in_arp_resolve  ), priority=90   , match=(outport == "R1-PUB" && ip6.src == 1000::3 && is_chassis_resident("S0-P0")), action=(get_nd(outport, xxreg0); next;)
>>>>> +AT_CHECK([grep "lr_in_arp_resolve" R1flows | grep priority=90 | sed 's/table=../table=??/' | sort], [0], [dnl
>>>>> +  table=??(lr_in_arp_resolve  ), priority=90   , match=(outport == "R1-PUB" && ip4.src == 10.0.0.3 && is_chassis_resident("S0-P0")), action=(get_arp(outport, reg0); next;)
>>>>> +  table=??(lr_in_arp_resolve  ), priority=90   , match=(outport == "R1-PUB" && ip6.src == 1000::3 && is_chassis_resident("S0-P0")), action=(get_nd(outport, xxreg0); next;)
>>>>> ])
>>>>> 
>>>>> AT_CLEANUP
>>>>> @@ -10966,3 +10966,38 @@ Status: active
>>>>> 
>>>>> AT_CLEANUP
>>>>> ])
>>>>> +
>>>>> +OVN_FOR_EACH_NORTHD_NO_HV([
>>>>> +AT_SETUP([check DHCP RELAY AGENT])
>>>>> +ovn_start NORTHD_TYPE
>>>>> +
>>>>> +check ovn-nbctl ls-add ls0
>>>>> +check ovn-nbctl lsp-add ls0 ls0-port1
>>>>> +check ovn-nbctl lsp-set-addresses ls0-port1 02:00:00:00:00:10
>>>>> +check ovn-nbctl lr-add lr0
>>>>> +check ovn-nbctl lrp-add lr0 lrp1 02:00:00:00:00:01 192.168.1.1/24
>>>>> +check ovn-nbctl lsp-add ls0 lrp1-attachment
>>>>> +check ovn-nbctl lsp-set-type lrp1-attachment router
>>>>> +check ovn-nbctl lsp-set-addresses lrp1-attachment 00:00:00:00:ff:02
>>>>> +check ovn-nbctl lsp-set-options lrp1-attachment router-port=lrp1
>>>>> +check ovn-nbctl lrp-add lr0 lrp-ext 02:00:00:00:00:02 192.168.2.1/24
>>>>> +
>>>>> +dhcp_relay=$(ovn-nbctl create DHCP_Relay servers=172.16.1.1)
>>>>> +check ovn-nbctl set Logical_Router_port lrp1 dhcp_relay=$dhcp_relay
>>>>> +rp_uuid=$(ovn-nbctl --bare --colum=_uuid list logical_router_port lrp1)
>>>>> +check ovn-nbctl set Logical_Switch ls0 dhcp_relay_port=$rp_uuid
>>>>> +
>>>>> +check ovn-nbctl --wait=sb sync
>>>>> +
>>>>> +ovn-sbctl lflow-list > lflows
>>>>> +AT_CAPTURE_FILE([lflows])
>>>>> +
>>>>> +AT_CHECK([grep -e "DHCP_RELAY_" lflows | sed 's/table=../table=??/'], [0], [dnl
>>>>> +  table=??(lr_in_ip_input     ), priority=110  , match=(inport == "lrp1" && ip4.src == 0.0.0.0 && ip4.dst == 255.255.255.255 && udp.src == 68 && udp.dst == 67), action=(dhcp_relay_req(192.168.1.1,172.16.1.1);ip4.src=192.168.1.1;ip4.dst=172.16.1.1;udp.src=67;next; /* DHCP_RELAY_REQ */)
>>>>> +  table=??(lr_in_ip_input     ), priority=110  , match=(ip4.src == 172.16.1.1 && ip4.dst == 192.168.1.1 && udp.src == 67 && udp.dst == 67), action=(next;/* DHCP_RELAY_RESP */)
>>>>> +  table=??(lr_in_dhcp_relay_resp_fwd), priority=110  , match=(ip4.src == 172.16.1.1 && ip4.dst == 192.168.1.1 && udp.src == 67 && udp.dst == 67), action=(dhcp_relay_resp_fwd(192.168.1.1,172.16.1.1);ip4.src=192.168.1.1;udp.dst=68;outport="lrp1";output; /* DHCP_RELAY_RESP */)
>>>>> +  table=??(ls_in_l2_lkup      ), priority=100  , match=(inport == "ls0-port1" && eth.src == 02:00:00:00:00:10 && ip4.src == 0.0.0.0 && ip4.dst == 255.255.255.255 && udp.src == 68 && udp.dst == 67), action=(eth.dst=02:00:00:00:00:01;outport="lrp1-attachment";next;/* DHCP_RELAY_REQ */)
>>>>> +])
>>>>> +
>>>>> +AT_CLEANUP
>>>>> +])
>>>>> diff --git a/tests/ovn.at b/tests/ovn.at
>>>>> index e8c79512b..839c07ce2 100644
>>>>> --- a/tests/ovn.at
>>>>> +++ b/tests/ovn.at
>>>>> @@ -21905,7 +21905,7 @@ eth_dst=00000000ff01
>>>>> ip_src=$(ip_to_hex 10 0 0 10)
>>>>> ip_dst=$(ip_to_hex 172 168 0 101)
>>>>> send_icmp_packet 1 1 $eth_src $eth_dst $ip_src $ip_dst c4c9 0000000000000000000000
>>>>> -AT_CHECK_UNQUOTED([as hv1 ovs-ofctl dump-flows br-int metadata=0x$lr0_dp_key | awk '/table=28, n_packets=1, n_bytes=45/{print $7" "$8}'],[0],[dnl
>>>>> +AT_CHECK_UNQUOTED([as hv1 ovs-ofctl dump-flows br-int metadata=0x$lr0_dp_key | awk '/table=29, n_packets=1, n_bytes=45/{print $7" "$8}'],[0],[dnl
>>>>> priority=80,ip,reg15=0x$lr0_public_dp_key,metadata=0x$lr0_dp_key,nw_src=10.0.0.10 actions=drop
>>>>> ])
>>>>> 
>>>>> @@ -28964,7 +28964,7 @@ AT_CHECK([
>>>>>       grep "priority=100" | \
>>>>>       grep -c "ct(commit,zone=NXM_NX_REG11\\[[0..15\\]],.*exec(move:NXM_OF_ETH_SRC\\[[\\]]->NXM_NX_CT_LABEL\\[[32..79\\]],load:0x[[0-9]]->NXM_NX_CT_MARK\\[[16..31\\]]))"
>>>>> 
>>>>> -        grep table=25 hv${hv}flows | \
>>>>> +        grep table=26 hv${hv}flows | \
>>>>>       grep "priority=200" | \
>>>>>       grep -c "move:NXM_NX_CT_LABEL\\[[\\]]->NXM_NX_XXREG1\\[[\\]],move:NXM_NX_XXREG1\\[[32..79\\]]->NXM_OF_ETH_DST"
>>>>>   done; :], [0], [dnl
>>>>> @@ -29089,7 +29089,7 @@ AT_CHECK([
>>>>>       grep "priority=100" | \
>>>>>       grep -c "ct(commit,zone=NXM_NX_REG11\\[[0..15\\]],.*exec(move:NXM_OF_ETH_SRC\\[[\\]]->NXM_NX_CT_LABEL\\[[32..79\\]],load:0x[[0-9]]->NXM_NX_CT_MARK\\[[16..31\\]]))"
>>>>> 
>>>>> -        grep table=25 hv${hv}flows | \
>>>>> +        grep table=26 hv${hv}flows | \
>>>>>       grep "priority=200" | \
>>>>>       grep -c "move:NXM_NX_CT_LABEL\\[[\\]]->NXM_NX_XXREG1\\[[\\]],move:NXM_NX_XXREG1\\[[32..79\\]]->NXM_OF_ETH_DST"
>>>>>   done; :], [0], [dnl
>>>>> @@ -29586,7 +29586,7 @@ if test X"$1" = X"DGP"; then
>>>>> else
>>>>>   prio=2
>>>>> fi
>>>>> -AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep -E "table=25, n_packets=1,.* priority=$prio,ip,$inport.*$outport.*metadata=0x${sw_key},nw_dst=10.0.1.1 actions=drop" -c], [0], [dnl
>>>>> +AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep -E "table=26, n_packets=1,.* priority=$prio,ip,$inport.*$outport.*metadata=0x${sw_key},nw_dst=10.0.1.1 actions=drop" -c], [0], [dnl
>>>>> 1
>>>>> ])
>>>>> 
>>>>> @@ -29605,13 +29605,13 @@ AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep "actions=controller" | grep
>>>>> 
>>>>> if test X"$1" = X"DGP"; then
>>>>>   # The packet dst should be resolved once for E/W centralized NAT purpose.
>>>>> -    AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep -E "table=25, n_packets=1,.* priority=100,reg0=0xa000101,reg15=.*metadata=0x${sw_key} actions=mod_dl_dst:00:00:00:00:01:01,resubmit" -c], [0], [dnl
>>>>> +    AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep -E "table=26, n_packets=1,.* priority=100,reg0=0xa000101,reg15=.*metadata=0x${sw_key} actions=mod_dl_dst:00:00:00:00:01:01,resubmit" -c], [0], [dnl
>>>>> 1
>>>>> ])
>>>>> fi
>>>>> 
>>>>> # The packet should've been finally dropped in the lr_in_arp_resolve stage.
>>>>> -AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep -E "table=25, n_packets=2,.* priority=$prio,ip,$inport.*$outport.*metadata=0x${sw_key},nw_dst=10.0.1.1 actions=drop" -c], [0], [dnl
>>>>> +AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep -E "table=26, n_packets=2,.* priority=$prio,ip,$inport.*$outport.*metadata=0x${sw_key},nw_dst=10.0.1.1 actions=drop" -c], [0], [dnl
>>>>> 1
>>>>> ])
>>>>> OVN_CLEANUP([hv1])
>>>>> diff --git a/tests/system-ovn.at b/tests/system-ovn.at
>>>>> index 7b9daba0d..591933a95 100644
>>>>> --- a/tests/system-ovn.at
>>>>> +++ b/tests/system-ovn.at
>>>>> @@ -12032,3 +12032,153 @@ as
>>>>> OVS_TRAFFIC_VSWITCHD_STOP(["/failed to query port patch-.*/d
>>>>> /connection dropped.*/d"])
>>>>> AT_CLEANUP
>>>>> +
>>>>> +OVN_FOR_EACH_NORTHD([
>>>>> +AT_SETUP([DHCP RELAY AGENT])
>>>>> +AT_SKIP_IF([test $HAVE_DHCPD = no])
>>>>> +AT_SKIP_IF([test $HAVE_DHCLIENT = no])
>>>>> +AT_SKIP_IF([test $HAVE_TCPDUMP = no])
>>>>> +ovn_start
>>>>> +OVS_TRAFFIC_VSWITCHD_START()
>>>>> +
>>>>> +ADD_BR([br-int])
>>>>> +ADD_BR([br-ext])
>>>>> +
>>>>> +ovs-ofctl add-flow br-ext action=normal
>>>>> +# Set external-ids in br-int needed for ovn-controller
>>>>> +ovs-vsctl \
>>>>> +        -- set Open_vSwitch . external-ids:system-id=hv1 \
>>>>> +        -- set Open_vSwitch . external-ids:ovn-remote=unix:$ovs_base/ovn-sb/ovn-sb.sock \
>>>>> +        -- set Open_vSwitch . external-ids:ovn-encap-type=geneve \
>>>>> +        -- set Open_vSwitch . external-ids:ovn-encap-ip=169.0.0.1 \
>>>>> +        -- set bridge br-int fail-mode=secure other-config:disable-in-band=true
>>>>> +
>>>>> +# Start ovn-controller
>>>>> +start_daemon ovn-controller
>>>>> +
>>>>> +ADD_NAMESPACES(sw01)
>>>>> +ADD_VETH(sw01, sw01, br-int, "0", "f0:00:00:01:02:03")
>>>>> +ADD_NAMESPACES(sw11)
>>>>> +ADD_VETH(sw11, sw11, br-int, "0", "f0:00:00:02:02:03")
>>>>> +ADD_NAMESPACES(server)
>>>>> +ADD_VETH(s1, server, br-ext, "172.16.1.1/24", "f0:00:00:01:02:05", \
>>>>> +         "172.16.1.254")
>>>>> +
>>>>> +check ovn-nbctl lr-add R1
>>>>> +
>>>>> +check ovn-nbctl ls-add sw0
>>>>> +check ovn-nbctl ls-add sw1
>>>>> +check ovn-nbctl ls-add sw-ext
>>>>> +
>>>>> +check ovn-nbctl lrp-add R1 rp-sw0 00:00:01:01:02:03 192.168.1.1/24
>>>>> +check ovn-nbctl lrp-add R1 rp-sw1 00:00:03:01:02:03 192.168.2.1/24
>>>>> +check ovn-nbctl lrp-add R1 rp-ext 00:00:02:01:02:03 172.16.1.254/24
>>>>> +
>>>>> +dhcp_relay=$(ovn-nbctl create DHCP_Relay servers=172.16.1.1)
>>>>> +check ovn-nbctl set Logical_Router_port rp-sw0 dhcp_relay=$dhcp_relay
>>>>> +check ovn-nbctl set Logical_Router_port rp-sw1 dhcp_relay=$dhcp_relay
>>>>> +check ovn-nbctl lrp-set-gateway-chassis rp-ext hv1
>>>>> +
>>>>> +check ovn-nbctl lsp-add sw0 sw0-rp -- set Logical_Switch_Port sw0-rp \
>>>>> +    type=router options:router-port=rp-sw0 \
>>>>> +    -- lsp-set-addresses sw0-rp router
>>>>> +check ovn-nbctl lsp-add sw1 sw1-rp -- set Logical_Switch_Port sw1-rp \
>>>>> +    type=router options:router-port=rp-sw1 \
>>>>> +    -- lsp-set-addresses sw1-rp router
>>>>> +
>>>>> +rp_uuid=$(ovn-nbctl --bare --colum=_uuid list logical_router_port rp-sw0)
>>>>> +check ovn-nbctl set Logical_Switch sw0 dhcp_relay_port=$rp_uuid
>>>>> +rp_uuid=$(ovn-nbctl --bare --colum=_uuid list logical_router_port rp-sw1)
>>>>> +check ovn-nbctl set Logical_Switch sw1 dhcp_relay_port=$rp_uuid
>>>>> +
>>>>> +check ovn-nbctl lsp-add sw-ext ext-rp -- set Logical_Switch_Port ext-rp \
>>>>> +    type=router options:router-port=rp-ext \
>>>>> +    -- lsp-set-addresses ext-rp router
>>>>> +check ovn-nbctl lsp-add sw-ext lnet \
>>>>> +        -- lsp-set-addresses lnet unknown \
>>>>> +        -- lsp-set-type lnet localnet \
>>>>> +        -- lsp-set-options lnet network_name=phynet
>>>>> +
>>>>> +check ovn-nbctl lsp-add sw0 sw01 \
>>>>> +    -- lsp-set-addresses sw01 "f0:00:00:01:02:03"
>>>>> +
>>>>> +check ovn-nbctl lsp-add sw1 sw11 \
>>>>> +    -- lsp-set-addresses sw11 "f0:00:00:02:02:03"
>>>>> +
>>>>> +AT_CHECK([ovs-vsctl set Open_vSwitch . external-ids:ovn-bridge-mappings=phynet:br-ext])
>>>>> +
>>>>> +OVN_POPULATE_ARP
>>>>> +
>>>>> +check ovn-nbctl --wait=hv sync
>>>>> +
>>>>> +DHCP_TEST_DIR="/tmp/dhcp-test"
>>>>> +rm -rf $DHCP_TEST_DIR
>>>>> +mkdir $DHCP_TEST_DIR
>>>>> +cat > $DHCP_TEST_DIR/dhcpd.conf <<EOF
>>>>> +subnet 172.16.1.0 netmask 255.255.255.0 {
>>>>> +}
>>>>> +subnet 192.168.1.0 netmask 255.255.255.0 {
>>>>> +  range 192.168.1.10 192.168.1.10;
>>>>> +  option routers 192.168.1.1;
>>>>> +  option broadcast-address 192.168.1.255;
>>>>> +  default-lease-time 60;
>>>>> +  max-lease-time 120;
>>>>> +}
>>>>> +subnet 192.168.2.0 netmask 255.255.255.0 {
>>>>> +  range 192.168.2.10 192.168.2.10;
>>>>> +  option routers 192.168.2.1;
>>>>> +  option broadcast-address 192.168.2.255;
>>>>> +  default-lease-time 60;
>>>>> +  max-lease-time 120;
>>>>> +}
>>>>> +EOF
>>>>> +cat > $DHCP_TEST_DIR/dhclien.conf <<EOF
>>>>> +timeout 2
>>>>> +EOF
>>>>> +
>>>>> +touch $DHCP_TEST_DIR/dhcpd.leases
>>>>> +chown root:dhcpd $DHCP_TEST_DIR $DHCP_TEST_DIR/dhcpd.leases
>>>>> +chmod 775 $DHCP_TEST_DIR
>>>>> +chmod 664 $DHCP_TEST_DIR/dhcpd.leases
>>>>> +
>>>>> +
>>>>> +NETNS_DAEMONIZE([server], [dhcpd -4 -f -cf $DHCP_TEST_DIR/dhcpd.conf s1 > dhcpd.log 2>&1], [dhcpd.pid])
>>>>> +
>>>>> +NS_CHECK_EXEC([server], [tcpdump -l -nvv -i s1  udp > pkt.pcap 2>tcpdump_err &])
>>>>> +OVS_WAIT_UNTIL([grep "listening" tcpdump_err])
>>>>> +on_exit 'kill $(pidof tcpdump)'
>>>>> +
>>>>> +NS_CHECK_EXEC([sw01], [dhclient -1 -q -lf $DHCP_TEST_DIR/dhclient-sw01.lease -pf $DHCP_TEST_DIR/dhclient-sw01.pid -cf $DHCP_TEST_DIR/dhclien.conf sw01])
>>>>> +NS_CHECK_EXEC([sw11], [dhclient -1 -q -lf $DHCP_TEST_DIR/dhclient-sw11.lease -pf $DHCP_TEST_DIR/dhclient-sw11.pid -cf $DHCP_TEST_DIR/dhclien.conf sw11])
>>>>> +
>>>>> +OVS_WAIT_UNTIL([
>>>>> +    total_pkts=$(cat pkt.pcap | wc -l)
>>>>> +    test ${total_pkts} -ge 8
>>>>> +])
>>>>> +
>>>>> +on_exit 'kill `cat $DHCP_TEST_DIR/dhclient-sw01.pid` &&
>>>>> +kill `cat $DHCP_TEST_DIR/dhclient-sw11.pid` && rm -rf $DHCP_TEST_DIR'
>>>>> +
>>>>> +NS_CHECK_EXEC([sw01], [ip addr show sw01 | grep -oP '(?<=inet\s)\d+(\.\d+){3}'], [0], [dnl
>>>>> +192.168.1.10
>>>>> +])
>>>>> +NS_CHECK_EXEC([sw11], [ip addr show sw11 | grep -oP '(?<=inet\s)\d+(\.\d+){3}'], [0], [dnl
>>>>> +192.168.2.10
>>>>> +])
>>>>> +OVS_APP_EXIT_AND_WAIT([ovn-controller])
>>>>> +
>>>>> +as ovn-sb
>>>>> +OVS_APP_EXIT_AND_WAIT([ovsdb-server])
>>>>> +
>>>>> +as ovn-nb
>>>>> +OVS_APP_EXIT_AND_WAIT([ovsdb-server])
>>>>> +
>>>>> +as northd
>>>>> +OVS_APP_EXIT_AND_WAIT([NORTHD_TYPE])
>>>>> +
>>>>> +as
>>>>> +OVS_TRAFFIC_VSWITCHD_STOP(["/.*error receiving.*/d
>>>>> +/failed to query port patch-.*/d
>>>>> +/.*terminating with signal 15.*/d"])
>>>>> +AT_CLEANUP
>>>>> +])
>>>>> diff --git a/utilities/ovn-trace.c b/utilities/ovn-trace.c
>>>>> index 0b86eae7b..ae9dd77de 100644
>>>>> --- a/utilities/ovn-trace.c
>>>>> +++ b/utilities/ovn-trace.c
>>>>> @@ -2328,6 +2328,25 @@ execute_put_dhcp_opts(const struct ovnact_put_opts *pdo,
>>>>>   execute_put_opts(pdo, name, uflow, super);
>>>>> }
>>>>> 
>>>>> +static void
>>>>> +execute_dhcpv4_relay_resp_fwd(const struct ovnact_dhcp_relay *dr,
>>>>> +                                const char *name, struct flow *uflow,
>>>>> +                                struct ovs_list *super)
>>>>> +{
>>>>> +    ovntrace_node_append(
>>>>> +        super, OVNTRACE_NODE_ERROR,
>>>>> +        "/* We assume that this packet is DHCPOFFER or DHCPACK and "
>>>>> +            "DHCP broadcast flag is set. Dest IP is set to broadcast. "
>>>>> +            "Dest MAC is set to broadcast but in real network this is unicast "
>>>>> +            "which is extracted from DHCP header. */");
>>>>> +
>>>>> +    /* Assume DHCP broadcast flag is set */
>>>>> +    uflow->nw_dst = 0xFFFFFFFF;
>>>>> +    /* Dest MAC is set to broadcast but in real network this is unicast */
>>>>> +    struct eth_addr bcast_mac = {0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF};
>>>>> +    uflow->dl_dst = bcast_mac;
>>>>> +}
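
The trace helper above always assumes the broadcast flag is set. For comparison, here is a rough sketch of the destination choice that RFC 1542 describes for a relay agent forwarding a reply. It is not the patch's ovn-controller code, which is not shown in this hunk. The struct and function names are hypothetical, and the fields are kept in host byte order for readability.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define DHCP_BROADCAST_FLAG 0x8000  /* RFC 2131: leftmost bit of 'flags'. */

/* Hypothetical, simplified view of the reply fields a relay looks at. */
struct relay_reply {
    uint16_t flags;      /* DHCP 'flags' field. */
    uint32_t yiaddr;     /* Address being offered to the client. */
    uint8_t chaddr[6];   /* Client Ethernet address. */
};

/* Hypothetical result type: where the relayed reply should be sent. */
struct reply_dst {
    uint32_t ip;
    uint8_t mac[6];
};

static struct reply_dst
choose_reply_dst(const struct relay_reply *r)
{
    assert(r != NULL);
    struct reply_dst dst;

    if (r->flags & DHCP_BROADCAST_FLAG) {
        /* Client asked for broadcast: limited broadcast IP and MAC. */
        dst.ip = 0xFFFFFFFFu;
        memset(dst.mac, 0xFF, sizeof dst.mac);
    } else {
        /* Otherwise unicast to the offered address and client MAC. */
        dst.ip = r->yiaddr;
        memcpy(dst.mac, r->chaddr, sizeof dst.mac);
    }
    return dst;
}
```

The ovn-trace helper corresponds to the first branch only, which is why its message spells out the assumption.
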
>>>>> +
>>>>> static void
>>>>> execute_put_nd_ra_opts(const struct ovnact_put_opts *pdo,
>>>>>                      const char *name, struct flow *uflow,
>>>>> @@ -3215,6 +3234,15 @@ trace_actions(const struct ovnact *ovnacts, size_t ovnacts_len,
>>>>>                                 "put_dhcpv6_opts", uflow, super);
>>>>>           break;
>>>>> 
>>>>> +        case OVNACT_DHCPV4_RELAY_REQ:
>>>>> +            /* Nothing to do for tracing. */
>>>>> +            break;
>>>>> +
>>>>> +        case OVNACT_DHCPV4_RELAY_RESP_FWD:
>>>>> +            execute_dhcpv4_relay_resp_fwd(ovnact_get_DHCPV4_RELAY_RESP_FWD(a),
>>>>> +                                    "dhcp_relay_resp_fwd", uflow, super);
>>>>> +            break;
>>>>> +
>>>>>       case OVNACT_PUT_ND_RA_OPTS:
>>>>>           execute_put_nd_ra_opts(ovnact_get_PUT_DHCPV6_OPTS(a),
>>>>>                                  "put_nd_ra_opts", uflow, super);
>>>>> --
>>>>> 2.36.6
>>>>> 
>>>>> _______________________________________________
>>>>> dev mailing list
>>>>> dev@openvswitch.org
>>>>> https://urldefense.proofpoint.com/v2/url?u=https-3A__mail.openvswitch.org_mailman_listinfo_ovs-2Ddev&d=DwIFaQ&c=s883GpUCOChKOHiocYtGcg&r=2PQjSDR7A28z1kXE1ptSm6X36oL_nCq1XxeEt7FkLmA&m=jUP6tr4FN6iSRj6v8rdyetsvEpT13QUHVMbw__3u6Sm7qAhyuu9tBdezdVmkqt0p&s=jJ3kFCf5o6dc-gW8diGvfaIQVC0Gwhe2y5aJYZJo0Rk&e=
Numan Siddique Feb. 27, 2024, 6:15 p.m. UTC | #7
On Mon, Feb 26, 2024, 11:41 AM Naveen Yerramneni <
naveen.yerramneni@nutanix.com> wrote:

>
>
> > On 24-Jan-2024, at 6:30 PM, Naveen Yerramneni <
> naveen.yerramneni@nutanix.com> wrote:
> >
> >
> >
> >> On 24-Jan-2024, at 8:59 AM, Numan Siddique <numans@ovn.org> wrote:
> >>
> >> On Tue, Jan 23, 2024 at 8:02 PM Naveen Yerramneni
> >> <naveen.yerramneni@nutanix.com> wrote:
> >>>
> >>>
> >>>
> >>>> On 16-Jan-2024, at 2:30 AM, Numan Siddique <numans@ovn.org> wrote:
> >>>>
> >>>> On Tue, Dec 12, 2023 at 1:05 PM Naveen Yerramneni
> >>>> <naveen.yerramneni@nutanix.com> wrote:
> >>>>>
> >>>>>  This patch contains changes to enable DHCP Relay Agent support for
> overlay subnets.
> >>>>>
> >>>>>  USE CASE:
> >>>>>  ----------
> >>>>>    - Enable IP address assignment for overlay subnets from the
> centralized DHCP server present in the underlay network.
> >>>>>
> >>>>>  PREREQUISITES
> >>>>>  --------------
> >>>>>    - Logical Router Port IP should be assigned (statically) from the
> same overlay subnet which is managed by DHCP server.
> >>>>>    - The LRP IP is used for the GIADDR field when relaying DHCP packets,
> and the same IP needs to be configured as the default gateway for the overlay
> subnet.
> >>>>>    - Overlay subnets managed by external DHCP server are expected to
> be directly reachable from the underlay network.
> >>>>>
> >>>>>  EXPECTED PACKET FLOW:
> >>>>>  ----------------------
> >>>>>  Following is the expected packet flow in order to support DHCP relay
> functionality in OVN.
> >>>>>    1. DHCP client originates DHCP discovery (broadcast).
> >>>>>    2. DHCP relay (running on the OVN) receives the broadcast and
> forwards the packet to the DHCP server by converting it to unicast. While
> forwarding the packet, it updates the GIADDR in DHCP header to its
> >>>>>       interface IP on which DHCP packet is received.
> >>>>>    3. DHCP server uses GIADDR field to decide the IP address pool
> from which IP has to be assigned and DHCP offer is sent to the same IP
> (GIADDR).
> >>>>>    4. DHCP relay agent forwards the offer to the client, it resets
> the GIADDR field when forwarding the offer to the client.
> >>>>>    5. DHCP client sends DHCP request (broadcast) packet.
> >>>>>    6. DHCP relay (running on the OVN) receives the broadcast and
> forwards the packet to the DHCP server by converting it to unicast. While
> forwarding the packet, it updates the GIADDR in DHCP header to its
> >>>>>       interface IP on which DHCP packet is received.
> >>>>>    7. DHCP Server sends the ACK packet.
> >>>>>    8. DHCP relay agent forwards the ACK packet to the client, it
> resets the GIADDR field when forwarding the ACK to the client.
> >>>>>    9. All the future renew/release packets are directly exchanged
> between DHCP client and DHCP server.
> >>>>>
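A minimal C sketch of the GIADDR handling in the expected packet flow above (steps 2, 4, 6 and 8). This is not code from the patch: `struct relay_hdr`, `relay_stamp_request()` and `relay_clear_reply()` are hypothetical names, and the header is trimmed to the two fields involved.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, trimmed-down BOOTP/DHCP fixed header: only the fields this
 * sketch touches.  Addresses are treated as opaque 32-bit values. */
struct relay_hdr {
    uint8_t op;        /* 1 = BOOTREQUEST, 2 = BOOTREPLY. */
    uint32_t giaddr;   /* Relay agent (gateway) address; 0 when unset. */
};

enum { OP_REQUEST = 1, OP_REPLY = 2 };

/* Steps 2 and 6: stamp the relay's interface IP into GIADDR before the
 * request is unicast to the server.  Returns false (nothing is relayed)
 * when the packet is not a request or GIADDR is already set. */
static bool
relay_stamp_request(struct relay_hdr *h, uint32_t relay_ip)
{
    assert(h != NULL);
    if (h->op != OP_REQUEST || h->giaddr != 0) {
        return false;
    }
    h->giaddr = relay_ip;
    return true;
}

/* Steps 4 and 8: accept a reply only if its GIADDR names this relay, then
 * reset GIADDR before the reply is forwarded to the client. */
static bool
relay_clear_reply(struct relay_hdr *h, uint32_t relay_ip)
{
    assert(h != NULL);
    if (h->op != OP_REPLY || h->giaddr != relay_ip) {
        return false;
    }
    h->giaddr = 0;
    return true;
}
```

Returning a boolean instead of dropping inside the helper leaves the forward-or-drop decision to the caller.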
> >>>>>  OVN DHCP RELAY PACKET FLOW:
> >>>>>  ----------------------------
> >>>>>  To add DHCP Relay support on OVN, we need to replicate all the behavior described above using distributed logical switch and logical router.
> >>>>>  At a high level, the packet flow is distributed among the Logical Switch and Logical Router on the source node (where the VM is deployed) and the redirect chassis (RC) node.
> >>>>>    1. The request packet is processed on the source node where the VM is deployed, which relays the packet to the DHCP server.
> >>>>>    2. The response packet is first processed on the RC node (which first receives the packet from the underlay network). The RC node forwards the packet to the right node by filling in the dest MAC and IP.
> >>>>>
> >>>>>  OVN Packet flow with DHCP relay is explained below.
> >>>>>    1. DHCP client (VM) sends the DHCP discover packet (broadcast).
> >>>>>    2. Logical switch converts the packet to L2 unicast by setting the destination MAC to LRP's MAC.
> >>>>>    3. Logical Router receives the packet and redirects it to the OVN controller.
> >>>>>    4. OVN controller updates the required information (GIADDR) in the DHCP payload after doing the required checks. If any check fails, the packet is dropped.
> >>>>>    5. Logical Router converts the packet to L3 unicast and forwards it to the server. This packet gets routed like any other packet (via RC node).
> >>>>>    6. Server replies with DHCP offer.
> >>>>>    7. RC node processes the DHCP offer and forwards it to the OVN controller.
> >>>>>    8. OVN controller does sanity checks, updates the destination MAC (available in DHCP header) and destination IP (available in DHCP header), resets GIADDR and reinjects the packet to the datapath.
> >>>>>       If any check fails, the packet is dropped.
> >>>>>    9. Logical router updates the source IP and port and forwards the packet to the logical switch.
> >>>>>    10. Logical switch delivers the packet to the DHCP client.
> >>>>>    11. Similar steps are performed for Request and Ack packets.
> >>>>>    12. All the future renew/release packets are directly exchanged between DHCP client and DHCP server.
> >>>>>
> >>>>>  NEW OVN ACTIONS
> >>>>>  ---------------
> >>>>>
> >>>>>    1. dhcp_relay_req(<relay-ip>, <server-ip>)
> >>>>>        - This action executes on the source node on which the DHCP request originated.
> >>>>>        - This action relays the DHCP request coming from client to the server. Relay-ip is used to update GIADDR in the DHCP header.
> >>>>>    2. dhcp_relay_resp_fwd(<relay-ip>, <server-ip>)
> >>>>>        - This action executes on the first node (RC node) which processes the DHCP response from the server.
> >>>>>        - This action updates the destination MAC and destination IP so that the response can be forwarded to the appropriate node from which request was originated.
> >>>>>        - Relay-ip, server-ip are used to validate GIADDR and SERVER ID in the DHCP payload.
> >>>>>
> >>>>>  FLOWS
> >>>>>  -----
> >>>>>  Following are the flows required for one overlay subnet.
> >>>>>
> >>>>>    1. table=27(ls_in_l2_lkup      ), priority=100  , match=(inport == <vm_port> && eth.src == <vm_mac> && ip4.src == 0.0.0.0 && ip4.dst == 255.255.255.255 && udp.src == 68 && udp.dst == 67), action=(eth.dst=<lrp_mac>;outport=<lrp-port>;next;/* DHCP_RELAY_REQ */)
> >>>>>    2. table=3 (lr_in_ip_input     ), priority=110  , match=(inport == <lrp_port> && ip4.src == 0.0.0.0 && ip4.dst == 255.255.255.255 && udp.src == 68 && udp.dst == 67), action=(dhcp_relay_req(<lrp_ip>,<dhcp_server_ip>);ip4.src=<lrp_ip>;ip4.dst=<dhcp_server_ip>;udp.src=67;next; /* DHCP_RELAY_REQ */)
> >>>>>    3. table=3 (lr_in_ip_input     ), priority=110  , match=(ip4.src == <dhcp_server_ip> && ip4.dst == <lrp_ip> && udp.src == 67 && udp.dst == 67), action=(next;/* DHCP_RELAY_RESP */)
> >>>>>    4. table=17(lr_in_dhcp_relay_resp_fwd), priority=110  , match=(ip4.src == <dhcp_server_ip> && ip4.dst == <lrp_ip> && udp.src == 67 && udp.dst == 67), action=(dhcp_relay_resp_fwd(<lrp_ip>,<dhcp_server_ip>);ip4.src=<lrp_ip>;udp.dst=68;outport=<lrp_port>;output; /* DHCP_RELAY_RESP */)
> >>>>>
> >>>>>  NEW PIPELINE STAGES
> >>>>>  -------------------
> >>>>>  Following stage is added for DHCP relay feature. Some of the flows are fitted into the existing pipeline stages.
> >>>>>    1. lr_in_dhcp_relay_resp_fwd
> >>>>>        - Forward the DHCP response to the appropriate node
> >>>>>
> >>>>>  NB SCHEMA CHANGES
> >>>>>  ----------------
> >>>>>    1. New DHCP_Relay table
> >>>>>        "DHCP_Relay": {
> >>>>>              "columns": {
> >>>>>          "name": {"type": "string"},
> >>>>>                  "servers": {"type": {"key": "string",
> >>>>>                                         "min": 0,
> >>>>>                                         "max": 1}},
> >>>>>                  "external_ids": {
> >>>>>                      "type": {"key": "string", "value": "string",
> >>>>>                              "min": 0, "max": "unlimited"}}},
> >>>>>              "isRoot": true},
> >>>>>    2. New column to Logical_Router_Port table
> >>>>>        "dhcp_relay": {"type": {"key": {"type": "uuid",
> >>>>>                              "refTable": "DHCP_Relay",
> >>>>>                              "refType": "weak"},
> >>>>>                              "min": 0,
> >>>>>                              "max": 1}},
> >>>>>    3. New column to Logical_Switch table
> >>>>>        "dhcp_relay_port": {"type": {"key": {"type": "uuid",
> >>>>>                                      "refTable": "Logical_Router_Port",
> >>>>>                                      "refType": "weak"},
> >>>>>                                       "min": 0,
> >>>>>                                       "max": 1}}},
> >>>>>
> >>>>>  Commands to enable the feature:
> >>>>>  ------------------------------
> >>>>>    - ovn-nbctl create DHCP_Relay servers=<ip>
> >>>>>    - ovn-nbctl set Logical_Router_port <lrp_uuid> dhcp_relay=<dhcp_relay_uuid>
> >>>>>    - ovn-nbctl set Logical_Switch <ls_uuid> dhcp_relay_port=<lrp_uuid>
> >>>>>
> >>>>>  Example:
> >>>>>  -------
> >>>>>   ovn-nbctl ls-add sw1
> >>>>>   ovn-nbctl lsp-add sw1 sw1-port1
> >>>>>   ovn-nbctl lsp-set-addresses sw1-port1 <MAC> #Only MAC address has to be specified when logical ports are created.
> >>>>>   ovn-nbctl lr-add lr1
> >>>>>   ovn-nbctl lrp-add lr1 lr1-port1 <MAC> <GATEWAY_IP/Prefix> #GATEWAY IP is set in GIADDR field when relaying the DHCP requests to server.
> >>>>>   ovn-nbctl lsp-add sw1 lr1-attachment
> >>>>>   ovn-nbctl lsp-set-type lr1-attachment router
> >>>>>   ovn-nbctl lsp-set-addresses lr1-attachment <MAC>
> >>>>>   ovn-nbctl lsp-set-options lr1-attachment router-port=lr1-port1
> >>>>>   ovn-nbctl create DHCP_Relay servers=<DHCP_SERVER_IP>
> >>>>>   ovn-nbctl set Logical_Router_port <lrp_uuid> dhcp_relay=<relay_uuid>
> >>>>>   ovn-nbctl set Logical_Switch <ls_uuid> dhcp_relay_port=<lrp_uuid>
> >>>>>
> >>>>>  Limitations:
> >>>>>  ------------
> >>>>>    - All OVN features that need an IP address to be configured on the logical port (like proxy ARP, etc.) will not be supported for overlay subnets on which DHCP relay is enabled.
> >>>>>
> >>>>>  References:
> >>>>>  ----------
> >>>>>    - rfc1541, rfc1542, rfc2131
> >>>>>
> >>>>> Signed-off-by: Naveen Yerramneni <naveen.yerramneni@nutanix.com>
> >>>>> Co-authored-by: Huzaifa Calcuttawala <huzaifa.c@nutanix.com>
> >>>>> Signed-off-by: Huzaifa Calcuttawala <huzaifa.c@nutanix.com>
> >>>>> CC: Mary Manohar <mary.manohar@nutanix.com>
> >>>>
> >>>> Hi Naveen,
> >>>>
> >>>> Thanks for the patch.  Sorry for the delayed response.
> >>>>
> >>>> I've a few comments.
> >>>>
> >>>> 1.  Regarding the newly added Table - DHCP_Relay in NB DB and the
> >>>> newly added columns in Logical_Switch and
> >>>>  Logical_Router table.
> >>>>
> >>>>  I don't think there is a need to add the new table DHCP_Relay
> >>>> since it only stores the dhcp relay agent server ip.
> >>>>  Also it could complicate the northd incremental processing.
> >>>>
> >>>>  If for example we have below logical switches and router
> >>>>
> >>>>  ovn-nbctl lr-add R1
> >>>>  ovn-nbctl ls-add sw0
> >>>>  ovn-nbctl ls-add sw1
> >>>>  ovn-nbctl ls-add sw-ext
> >>>>  ovn-nbctl lrp-add R1 rp-sw0 00:00:01:01:02:03 192.168.1.1/24
> >>>>  ovn-nbctl lrp-add R1 rp-sw1 00:00:03:01:02:03 192.168.2.1/24
> >>>>  ovn-nbctl lrp-add R1 rp-ext 00:00:02:01:02:03 172.16.1.254/24
> >>>>
> >>>>  ovn-nbctl lsp-add sw0 sw0-rp -- set Logical_Switch_Port sw0-rp \
> >>>>  type=router options:router-port=rp-sw0 \
> >>>>  -- lsp-set-addresses sw0-rp router
> >>>>
> >>>>  ovn-nbctl lsp-add sw1 sw1-rp -- set Logical_Switch_Port sw1-rp \
> >>>>  type=router options:router-port=rp-sw1 \
> >>>>  -- lsp-set-addresses sw1-rp router
> >>>>
> >>>>  I'd suggest doing something like below to enable this feature.
> >>>>
> >>>>  ovn-nbctl set Logical_Switch_Port sw0-rp options:dhcp_relay=true
> >>>>  ovn-nbctl set Logical_Switch_Port sw1-rp options:dhcp_relay=true
> >>>>
> >>>>  (Make sure that only one logical switch port of type router can
> >>>> have this flag - dhcp_relay set
> >>>>   for a given logical switch and document this limitation.)
> >>>
> >>> Ack. This suggestion looks good.
> >>>
> >>>>  ovn-nbctl set Logical_Router_port rp-sw0 options:dhcp_relay_ip=172.16.1.1
> >>>>  ovn-nbctl set Logical_Router_port rp-sw1 options:dhcp_relay_ip=172.16.1.1
> >>>>
> >>>>  Let me know if there are any limitations with this.
> >>>
> >>> The reason I added a new table is that it would be useful in the future
> >>> if we add additional options (like setting the hop count in the DHCP
> >>> header, etc.) to the DHCP relay functionality. What do you recommend if
> >>> we have to add more options in the future?
> >>
> >> I see.  If there is a possibility of adding more options, then having
> >> a separate table makes sense.
> >> I'd suggest to add the options column to the DHCP_Relay table even if
> >> this patch presently is not using
> >> any.  This would help in upgrades.
> >>
> >> But I don't think there is a need to add a new column in the logical
> >> switch port table to enable dhcp relay.
> >>
> >> Thanks
> >> Numan
> >
> > Sure, I will add an options column to the DHCP_Relay table.
> > I will use options:dhcp_relay for LSP instead of a new column, as you
> > suggested.
> >
> > Thanks,
> > Naveen
> >
>
> Hi Numan,
>
> I started working on your comments.
>
> Regarding options:dhcp_relay for LSP: since DHCP relay is applicable at the
> logical switch level (for the entire subnet), what if we add
> options:dhcp_relay with a string value (the name of the LSP of type router)
> to the Logical Switch table?
> Please let me know your thoughts on this.



Sounds good to me.



>
>
> >
> >>
> >>>
> >>>
> >>>
> >>>> 2.  Regarding the newly added actions - dhcp_relay_req() and
> >>>> dhcp_relay_resp_fwd().
> >>>>   Both of these actions are encoded as OVS controller actions with
> >>>>   pause enabled, which means ovs-vswitchd has to freeze the flow
> >>>>   translation and resume it once ovn-controller resumes the packet.
> >>>>   But the functions pinctrl_handle_dhcp_relay_req() and
> >>>>   pinctrl_handle_dhcp_relay_resp_fwd() do not resume the packet if
> >>>>   the packet has some errors.  This is wrong: the packet must be
> >>>>   resumed on every path, otherwise vswitchd will never thaw the
> >>>>   frozen translation.
> >>>>
> >>>>   You can see the existing OVN actions - put_dhcp_opts() and a few
> >>>>   others - which use the controller action with pause.  In such
> >>>>   actions, the result of the action is stored in a register bit
> >>>>   (i.e. whether put_dhcp_opts() was successful or not), and in the
> >>>>   next stage we take a decision based on the result.
> >>>>
> >>>>   For the action dhcp_relay_req(relay_ip, server_ip),  I don't
> >>>> think you should use the pause flag.
> >>>>   Also in this action the argument server_ip is never used in the
> >>>> function pinctrl_handle_dhcp_relay_req()
> >>>>   other than to just log.
> >>>>
> >>>>   I'd suggest you do something like this:
> >>>>
> >>>>  table=3 (lr_in_ip_input     ), priority=110  , match=(inport ==
> >>>> "lrp1" && ip4.src == 0.0.0.0 && ip4.dst == 255.255.255.255 && udp.src
> >>>> == 68 && udp.dst == 67),
> >>>>  action=(dhcp_relay_req { ip4.src = 192.168.1.1; ip4.dst =
> >>>> 172.16.1.1; udp.src = 67; dhcp_header.giaddr = <relay_ip>;
> >>>> next(pipeline=ingress,table=S_ROUTER_IN_UNSNAT);  /* DHCP_RELAY_REQ */
> >>>> })
> >>>>
> >>>>  dhcp_relay_req action would get translated into a controller
> >>>> action with pause=false and all the inner actions of this are encoded
> >>>> as
> >>>>  normal actions and stored in the userdata of controller action.
> >>>> Please see icmp4_error {} as an example.
> >>>>  Add a new OVN field 'dhcp_header.giaddr' which gets translated as
> >>>> controller action with pause flag set.
> >>>>  Please see the existing OVN field - icmp4.frag_mtu as an example
> >>>> and see this commit for reference [1]
> >>>>  When encoding this new OVN field, store the relay_ip in the
> >>>> userdata buffer and in pinctrl.c
> >>>>  get the relay_ip value and store it in the dhcp header field.
> >>>>
> >>>>
> >>>>  For the action dhcp_relay_resp_fwd,  I'd suggest something like
> below:
> >>>>
> >>>>    table=17 (lr_in_dhcp_relay_resp_chk), priority=110  ,
> >>>> match=(ip4.src == 172.16.1.1 && ip4.dst == 192.168.1.1 && udp.src ==
> >>>> 67 && udp.dst == 67),
> >>>>    action=(reg0[0] = dhcp_relay_resp_chk(dhcp_header.giaddr ==
> >>>> <relay_ip>); next;)
> >>>>    table=17 (lr_in_dhcp_relay_resp), priority=110  , match=(ip4.src
> >>>> == 172.16.1.1 && ip4.dst == 192.168.1.1 && udp.src == 67 && udp.dst ==
> >>>> 67 && reg0[0] == 1),
> >>>>    action=(ip4.src = 192.168.1.1; udp.dst = 68; outport = "lrp1";
> >>>> output; /* DHCP_RELAY_RESP */)
> >>>>
> >>>>     I used reg0[0] as an example.  You may need to check the free
> >>>> register bit and use it.
> >>>>
> >>>>    You need to encode dhcp_relay_resp_chk as controller action with
> >>>> pause=true, and store the relay_ip in the userdata buffer.
> >>>>    And in pinctrl.c  check that  'dhcp_header.giaddr == relay_ip'
> >>>> or not.  If so, set the result register bit to 1, else to 0.
> >>>>
> >>>> Let me know if you've any questions.
> >>>>
> >>>
> >>> Ack. Thanks for the suggestions and detailed explanation.
> >>> Before implementation I had referred to icmp4_error and native
> dhcp_server flows
> >>> but I had slight misunderstanding about pause flag.
> >>>
>
>
> Regarding dhcp_relay_req: I think it might be better to implement two-stage
> processing for dhcp_relay_req, similar to dhcp_relay_resp (something like
> below). This avoids adding multiple OVN actions/fields if we update
> additional fields (like the hop count in the DHCP header) in the future.
>
> table=3 (lr_in_ip_input     ), priority=110  , match=(inport == "lrp1" &&
> ip4.src == 0.0.0.0 &&
> ip4.dst == 255.255.255.255 && udp.src == 68 && udp.dst == 67),
> action=(regx[y] = dhcp_relay_req_chk(192.168.1.1,172.16.1.1);next; /*
> DHCP_RELAY_REQ */)
>
> table=4 (lr_in_ip_dhcp_req     ), priority=110  , match=(inport == "lrp1"
> && ip4.src == 0.0.0.0 &&
> ip4.dst == 255.255.255.255 && udp.src == 68 && udp.dst == 67 && regx[y]),
> action=(ip4.src=192.168.1.1;ip4.dst=172.16.1.1;
> udp.src=67;next; /* DHCP_RELAY_REQ */)
>
> table=4 (lr_in_ip_dhcp_req     ), priority=1  , match=(inport == "lrp1" &&
> ip4.src == 0.0.0.0 &&
> ip4.dst == 255.255.255.255 && udp.src == 68 && udp.dst == 67 && regx[y] ==
> 0),
> action=drop;
>
> Please let me know if you are fine with this approach.
>


Having two stages makes sense if the dhcp_relay_req_chk() action
in pinctrl.c would ignore the packet when it has errors.
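
To illustrate the two-stage idea, here is a small C model. It assumes, as with put_dhcp_opts(), that the check only records its result in a register bit and that the packet is resumed regardless of the outcome. It is not OVN code: `struct req_summary`, `relay_req_chk_to_reg()` and `stage2_decide()` are hypothetical names, and the register is modelled as a plain uint32_t.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical summary of what the controller learned while parsing. */
struct req_summary {
    bool parsed_ok;    /* Header, magic cookie and message type all present. */
    uint8_t op;        /* 1 = BOOTREQUEST. */
    uint32_t giaddr;   /* Must still be 0 for a packet we are asked to relay. */
};

enum stage2_verdict { STAGE2_DROP, STAGE2_FORWARD };

/* Stage 1 (controller side): record success or failure in bit 'bit' of
 * '*reg'.  No path skips writing the result, so the caller can resume the
 * paused packet unconditionally after this returns. */
static void
relay_req_chk_to_reg(const struct req_summary *s, uint32_t *reg, unsigned bit)
{
    assert(s != NULL && reg != NULL && bit < 32);
    bool ok = s->parsed_ok && s->op == 1 && s->giaddr == 0;
    if (ok) {
        *reg |= UINT32_C(1) << bit;
    } else {
        *reg &= ~(UINT32_C(1) << bit);
    }
}

/* Stage 2 (next table): forward when the bit is set, drop otherwise. */
static enum stage2_verdict
stage2_decide(uint32_t reg, unsigned bit)
{
    assert(bit < 32);
    return (reg >> bit) & 1 ? STAGE2_FORWARD : STAGE2_DROP;
}
```

In this model the drop decision lives only in the second stage, so a malformed packet still leaves stage 1 with a well-defined result.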

Looking forward to the patches.


Thanks
Numan



>
>
> >>>
> >>>> 3.  The newly added functions in pinctrl.c have a lot of repetitive
> >>>> code and it is very much similar to existing
> >>>> pinctrl_handle_put_dhcp_opts()
> >>>>  Please see if the duplicate code can be avoided.
> >>>
> >>> Ack.
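
One possible way to cut the duplication, sketched under assumptions: both handlers could share a single option walker. The helper below is hypothetical (`struct relay_opts` and `relay_parse_opts()` are not in the patch or in pinctrl.c); the option codes are the standard RFC 2132 values. Unlike the loops in the patch, it reports a truncated option as an error instead of silently stopping.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Standard DHCP option codes (RFC 2132). */
enum { OPT_PAD = 0, OPT_MSG_TYPE = 53, OPT_SERVER_ID = 54, OPT_END = 255 };

/* Hypothetical result of one pass over the options. */
struct relay_opts {
    bool has_msg_type;
    uint8_t msg_type;
    bool has_server_id;
    uint8_t server_id[4];   /* Bytes as they appear on the wire. */
};

/* Walks the code/length/value options in 'p' (the bytes that follow the
 * magic cookie).  Returns false if an option runs past the buffer. */
static bool
relay_parse_opts(const uint8_t *p, size_t len, struct relay_opts *out)
{
    assert(p != NULL && out != NULL);
    memset(out, 0, sizeof *out);

    size_t i = 0;
    while (i < len) {
        uint8_t code = p[i];
        if (code == OPT_END) {
            break;
        }
        if (code == OPT_PAD) {      /* PAD is a single byte with no length. */
            i++;
            continue;
        }
        if (len - i < 2) {          /* No room for the length byte. */
            return false;
        }
        uint8_t olen = p[i + 1];
        if (len - i - 2 < olen) {   /* Value would overrun the buffer. */
            return false;
        }
        const uint8_t *val = p + i + 2;
        if (code == OPT_MSG_TYPE && olen == 1) {
            out->has_msg_type = true;
            out->msg_type = val[0];
        } else if (code == OPT_SERVER_ID && olen == 4) {
            out->has_server_id = true;
            memcpy(out->server_id, val, 4);
        }
        i += 2 + (size_t) olen;
    }
    return true;
}
```

The request and response handlers would then differ only in which fields of the result they insist on.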
> >>>
> >>>
> >>>
> >>>> [1] - https://github.com/ovn-org/ovn/commit/3d9fec3fd5992e1201b4d4fdf43f1f397e8d5ea1
> >>>>
> >>>> Thanks
> >>>> Numan
> >>>>
> >>>>> ---
> >>>>> controller/pinctrl.c  | 441 ++++++++++++++++++++++++++++++++++++++++++
> >>>>> include/ovn/actions.h |  26 +++
> >>>>> lib/actions.c         | 117 +++++++++++
> >>>>> lib/ovn-l7.h          |   1 +
> >>>>> northd/northd.c       | 177 ++++++++++++++++-
> >>>>> ovn-nb.ovsschema      |  25 ++-
> >>>>> ovn-nb.xml            |  28 +++
> >>>>> tests/atlocal.in      |   3 +
> >>>>> tests/ovn-northd.at   |  41 +++-
> >>>>> tests/ovn.at          |  12 +-
> >>>>> tests/system-ovn.at   | 150 ++++++++++++++
> >>>>> utilities/ovn-trace.c |  28 +++
> >>>>> 12 files changed, 1032 insertions(+), 17 deletions(-)
> >>>>>
> >>>>> diff --git a/controller/pinctrl.c b/controller/pinctrl.c
> >>>>> index 5a35d56f6..45240f01d 100644
> >>>>> --- a/controller/pinctrl.c
> >>>>> +++ b/controller/pinctrl.c
> >>>>> @@ -1897,6 +1897,437 @@ is_dhcp_flags_broadcast(ovs_be16 flags)
> >>>>>   return flags & htons(DHCP_BROADCAST_FLAG);
> >>>>> }
> >>>>>
> >>>>> +static const char *dhcp_msg_str[] = {
> >>>>> +[0] = "INVALID",
> >>>>> +[DHCP_MSG_DISCOVER] = "DISCOVER",
> >>>>> +[DHCP_MSG_OFFER] = "OFFER",
> >>>>> +[DHCP_MSG_REQUEST] = "REQUEST",
> >>>>> +[OVN_DHCP_MSG_DECLINE] = "DECLINE",
> >>>>> +[DHCP_MSG_ACK] = "ACK",
> >>>>> +[DHCP_MSG_NAK] = "NAK",
> >>>>> +[OVN_DHCP_MSG_RELEASE] = "RELEASE",
> >>>>> +[OVN_DHCP_MSG_INFORM] = "INFORM"
> >>>>> +};
> >>>>> +
> >>>>> +static bool
> >>>>> +dhcp_relay_is_msg_type_supported(uint8_t msg_type)
> >>>>> +{
> >>>>> +    return (msg_type >= DHCP_MSG_DISCOVER && msg_type <= OVN_DHCP_MSG_RELEASE);
> >>>>> +}
> >>>>> +
> >>>>> +static const char *dhcp_msg_str_get(uint8_t msg_type)
> >>>>> +{
> >>>>> +    if (!dhcp_relay_is_msg_type_supported(msg_type)) {
> >>>>> +        return "INVALID";
> >>>>> +    }
> >>>>> +    return dhcp_msg_str[msg_type];
> >>>>> +}
> >>>>> +
> >>>>> +/* Called with in the pinctrl_handler thread context. */
> >>>>> +static void
> >>>>> +pinctrl_handle_dhcp_relay_req(
> >>>>> +    struct rconn *swconn,
> >>>>> +    struct dp_packet *pkt_in, struct ofputil_packet_in *pin,
> >>>>> +    struct ofpbuf *userdata,
> >>>>> +    struct ofpbuf *continuation)
> >>>>> +{
> >>>>> +    enum ofp_version version = rconn_get_version(swconn);
> >>>>> +    enum ofputil_protocol proto =
> ofputil_protocol_from_ofp_version(version);
> >>>>> +    struct dp_packet *pkt_out_ptr = NULL;
> >>>>> +
> >>>>> +    /* Parse relay IP and server IP. */
> >>>>> +    ovs_be32 *relay_ip = ofpbuf_try_pull(userdata, sizeof
> *relay_ip);
> >>>>> +    ovs_be32 *server_ip = ofpbuf_try_pull(userdata, sizeof
> *server_ip);
> >>>>> +    if (!relay_ip || !server_ip) {
> >>>>> +        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1,
> 5);
> >>>>> +        VLOG_WARN_RL(&rl, "DHCP_RELAY_REQ: relay ip or server ip "
> >>>>> +                  "not present in the userdata");
> >>>>> +        return;
> >>>>> +    }
> >>>>> +
> >>>>> +    /* Validate the DHCP request packet.
> >>>>> +     * Format of the DHCP packet is
> >>>>> +     *
> ------------------------------------------------------------------------
> >>>>> +     *| UDP HEADER  | DHCP HEADER  | 4 Byte DHCP Cookie | DHCP
> OPTIONS(var len)|
> >>>>> +     *
> ------------------------------------------------------------------------
> >>>>> +     */
> >>>>> +
> >>>>> +    size_t in_l4_size = dp_packet_l4_size(pkt_in);
> >>>>> +    const char *end = (char *) dp_packet_l4(pkt_in) + in_l4_size;
> >>>>> +    const char *in_dhcp_ptr = dp_packet_get_udp_payload(pkt_in);
> >>>>> +    if (!in_dhcp_ptr) {
> >>>>> +        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1,
> 5);
> >>>>> +        VLOG_WARN_RL(&rl, "DHCP_RELAY_REQ: invalid or incomplete "
> >>>>> +                  "DHCP packet received");
> >>>>> +        return;
> >>>>> +    }
> >>>>> +
> >>>>> +    const struct dhcp_header *in_dhcp_data
> >>>>> +        = (const struct dhcp_header *) in_dhcp_ptr;
> >>>>> +    in_dhcp_ptr += sizeof *in_dhcp_data;
> >>>>> +    if (in_dhcp_ptr > end) {
> >>>>> +        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1,
> 5);
> >>>>> +        VLOG_WARN_RL(&rl, "DHCP_RELAY_REQ: invalid or incomplete "
> >>>>> +                "DHCP packet received, bad data length");
> >>>>> +        return;
> >>>>> +    }
> >>>>> +    if (in_dhcp_data->op != DHCP_OP_REQUEST) {
> >>>>> +        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1,
> 5);
> >>>>> +        VLOG_WARN_RL(&rl, "DHCP_RELAY_REQ: invalid opcode in the "
> >>>>> +                "DHCP packet: %d", in_dhcp_data->op);
> >>>>> +        return;
> >>>>> +    }
> >>>>> +
> >>>>> +    /* DHCP options follow the DHCP header. The first 4 bytes of
> the DHCP
> >>>>> +     * options is the DHCP magic cookie followed by the actual DHCP
> options.
> >>>>> +     */
> >>>>> +    ovs_be32 magic_cookie = htonl(DHCP_MAGIC_COOKIE);
> >>>>> +    if (in_dhcp_ptr + sizeof magic_cookie > end ||
> >>>>> +        get_unaligned_be32((const void *) in_dhcp_ptr) !=
> magic_cookie) {
> >>>>> +        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1,
> 5);
> >>>>> +        VLOG_WARN_RL(&rl, "DHCP_RELAY_REQ: magic cookie not present
> "
> >>>>> +                "in the packet");
> >>>>> +        return;
> >>>>> +    }
> >>>>> +
> >>>>> +    if (in_dhcp_data->giaddr) {
> >>>>> +        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1,
> 5);
> >>>>> +        VLOG_WARN_RL(&rl, "DHCP_RELAY_REQ: giaddr is already set");
> >>>>> +        return;
> >>>>> +    }
> >>>>> +
> >>>>> +    if (in_dhcp_data->htype != 0x1) {
> >>>>> +        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1,
> 5);
> >>>>> +        VLOG_WARN_RL(&rl, "DHCP_RELAY_REQ: packet is received with "
> >>>>> +                "unsupported hardware type");
> >>>>> +        return;
> >>>>> +    }
> >>>>> +
> >>>>> +    ovs_be32 *server_id_ptr = NULL;
> >>>>> +    const uint8_t *in_dhcp_msg_type = NULL;
> >>>>> +
> >>>>> +    in_dhcp_ptr += sizeof magic_cookie;
> >>>>> +    ovs_be32 request_ip = in_dhcp_data->ciaddr;
> >>>>> +    while (in_dhcp_ptr < end) {
> >>>>> +        const struct dhcp_opt_header *in_dhcp_opt =
> >>>>> +            (const struct dhcp_opt_header *) in_dhcp_ptr;
> >>>>> +        if (in_dhcp_opt->code == DHCP_OPT_END) {
> >>>>> +            break;
> >>>>> +        }
> >>>>> +        if (in_dhcp_opt->code == DHCP_OPT_PAD) {
> >>>>> +            in_dhcp_ptr += 1;
> >>>>> +            continue;
> >>>>> +        }
> >>>>> +        in_dhcp_ptr += sizeof *in_dhcp_opt;
> >>>>> +        if (in_dhcp_ptr > end) {
> >>>>> +            break;
> >>>>> +        }
> >>>>> +        in_dhcp_ptr += in_dhcp_opt->len;
> >>>>> +        if (in_dhcp_ptr > end) {
> >>>>> +            break;
> >>>>> +        }
> >>>>> +
> >>>>> +        switch (in_dhcp_opt->code) {
> >>>>> +        case DHCP_OPT_MSG_TYPE:
> >>>>> +            if (in_dhcp_opt->len == 1) {
> >>>>> +                in_dhcp_msg_type = DHCP_OPT_PAYLOAD(in_dhcp_opt);
> >>>>> +            }
> >>>>> +            break;
> >>>>> +        case DHCP_OPT_REQ_IP:
> >>>>> +            if (in_dhcp_opt->len == 4) {
> >>>>> +                request_ip =
> get_unaligned_be32(DHCP_OPT_PAYLOAD(in_dhcp_opt));
> >>>>> +            }
> >>>>> +            break;
> >>>>> +        /* Server Identifier */
> >>>>> +        case OVN_DHCP_OPT_CODE_SERVER_ID:
> >>>>> +            if (in_dhcp_opt->len == 4) {
> >>>>> +                server_id_ptr = DHCP_OPT_PAYLOAD(in_dhcp_opt);
> >>>>> +            }
> >>>>> +            break;
> >>>>> +        default:
> >>>>> +            break;
> >>>>> +        }
> >>>>> +    }
> >>>>> +
> >>>>> +    /* Check whether the DHCP Message Type (opt 53) is present or
> not */
> >>>>> +    if (!in_dhcp_msg_type) {
> >>>>> +        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1,
> 5);
> >>>>> +        VLOG_WARN_RL(&rl, "DHCP_RELAY_REQ: missing message type");
> >>>>> +        return;
> >>>>> +    }
> >>>>> +
> >>>>> +    /* Relay the DHCP request packet */
> >>>>> +    uint16_t new_l4_size = in_l4_size;
> >>>>> +    size_t new_packet_size = pkt_in->l4_ofs + new_l4_size;
> >>>>> +
> >>>>> +    struct dp_packet pkt_out;
> >>>>> +    dp_packet_init(&pkt_out, new_packet_size);
> >>>>> +    dp_packet_clear(&pkt_out);
> >>>>> +    dp_packet_prealloc_tailroom(&pkt_out, new_packet_size);
> >>>>> +    pkt_out_ptr = &pkt_out;
> >>>>> +
> >>>>> +    /* Copy the L2 and L3 headers from the pkt_in as they would remain same */
> >>>>> +    dp_packet_put(
> >>>>> +        &pkt_out, dp_packet_pull(pkt_in, pkt_in->l4_ofs),
> pkt_in->l4_ofs);
> >>>>> +
> >>>>> +    pkt_out.l2_5_ofs = pkt_in->l2_5_ofs;
> >>>>> +    pkt_out.l2_pad_size = pkt_in->l2_pad_size;
> >>>>> +    pkt_out.l3_ofs = pkt_in->l3_ofs;
> >>>>> +    pkt_out.l4_ofs = pkt_in->l4_ofs;
> >>>>> +
> >>>>> +    struct ip_header *out_ip = dp_packet_l3(&pkt_out);
> >>>>> +
> >>>>> +    struct udp_header *udp = dp_packet_put(
> >>>>> +        &pkt_out, dp_packet_pull(pkt_in, UDP_HEADER_LEN),
> UDP_HEADER_LEN);
> >>>>> +
> >>>>> +    struct dhcp_header *dhcp_data = dp_packet_put(&pkt_out,
> >>>>> +        dp_packet_pull(pkt_in, new_l4_size - UDP_HEADER_LEN),
> >>>>> +        new_l4_size - UDP_HEADER_LEN);
> >>>>> +    dhcp_data->giaddr = *relay_ip;
> >>>>> +    if (udp->udp_csum) {
> >>>>> +        udp->udp_csum = recalc_csum32(udp->udp_csum,
> >>>>> +            0, dhcp_data->giaddr);
> >>>>> +    }
> >>>>> +    pin->packet = dp_packet_data(&pkt_out);
> >>>>> +    pin->packet_len = dp_packet_size(&pkt_out);
> >>>>> +
> >>>>> +    /* Log the DHCP message. */
> >>>>> +    static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(20, 40);
> >>>>> +    const struct eth_header *l2 = dp_packet_eth(&pkt_out);
> >>>>> +    VLOG_INFO_RL(&rl, "DHCP_RELAY_REQ:: MSG_TYPE:%s
> MAC:"ETH_ADDR_FMT
> >>>>> +                " XID:%u"
> >>>>> +                " REQ_IP:"IP_FMT
> >>>>> +                " GIADDR:"IP_FMT
> >>>>> +                " SERVER_ADDR:"IP_FMT,
> >>>>> +                dhcp_msg_str_get(*in_dhcp_msg_type),
> >>>>> +                ETH_ADDR_BYTES_ARGS(dhcp_data->chaddr),
> ntohl(dhcp_data->xid),
> >>>>> +                IP_ARGS(request_ip), IP_ARGS(dhcp_data->giaddr),
> >>>>> +                IP_ARGS(*server_ip));
> >>>>> +    queue_msg(swconn, ofputil_encode_resume(pin, continuation,
> proto));
> >>>>> +    if (pkt_out_ptr) {
> >>>>> +        dp_packet_uninit(pkt_out_ptr);
> >>>>> +    }
> >>>>> +}
> >>>>> +
> >>>>> +/* Called with in the pinctrl_handler thread context. */
> >>>>> +static void
> >>>>> +pinctrl_handle_dhcp_relay_resp_fwd(
> >>>>> +    struct rconn *swconn,
> >>>>> +    struct dp_packet *pkt_in, struct ofputil_packet_in *pin,
> >>>>> +    struct ofpbuf *userdata,
> >>>>> +    struct ofpbuf *continuation)
> >>>>> +{
> >>>>> +    enum ofp_version version = rconn_get_version(swconn);
> >>>>> +    enum ofputil_protocol proto =
> ofputil_protocol_from_ofp_version(version);
> >>>>> +    struct dp_packet *pkt_out_ptr = NULL;
> >>>>> +
> >>>>> +    /* Parse relay IP and server IP. */
> >>>>> +    ovs_be32 *relay_ip = ofpbuf_try_pull(userdata, sizeof
> *relay_ip);
> >>>>> +    ovs_be32 *server_ip = ofpbuf_try_pull(userdata, sizeof
> *server_ip);
> >>>>> +    if (!relay_ip || !server_ip) {
> >>>>> +        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1,
> 5);
> >>>>> +        VLOG_WARN_RL(&rl, "DHCP_RELAY_RESP: relay ip or server ip "
> >>>>> +                "not present in the userdata");
> >>>>> +        return;
> >>>>> +    }
> >>>>> +
> >>>>> +    /* Validate the DHCP response packet.
> >>>>> +     * Format of the DHCP packet is
> >>>>> +     *
> ------------------------------------------------------------------------
> >>>>> +     *| UDP HEADER  | DHCP HEADER  | 4 Byte DHCP Cookie | DHCP
> OPTIONS(var len)|
> >>>>> +     *
> ------------------------------------------------------------------------
> >>>>> +     */
> >>>>> +
> >>>>> +    size_t in_l4_size = dp_packet_l4_size(pkt_in);
> >>>>> +    const char *end = (char *) dp_packet_l4(pkt_in) + in_l4_size;
> >>>>> +    const char *in_dhcp_ptr = dp_packet_get_udp_payload(pkt_in);
> >>>>> +    if (!in_dhcp_ptr) {
> >>>>> +        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1,
> 5);
> >>>>> +        VLOG_WARN_RL(&rl, "DHCP_RELAY_RESP_FWD: invalid or
> incomplete "
> >>>>> +                "packet received");
> >>>>> +        return;
> >>>>> +    }
> >>>>> +
> >>>>> +    const struct dhcp_header *in_dhcp_data
> >>>>> +        = (const struct dhcp_header *) in_dhcp_ptr;
> >>>>> +    in_dhcp_ptr += sizeof *in_dhcp_data;
> >>>>> +    if (in_dhcp_ptr > end) {
> >>>>> +        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1,
> 5);
> >>>>> +        VLOG_WARN_RL(&rl, "DHCP_RELAY_RESP_FWD: invalid or
> incomplete "
> >>>>> +                    "packet received, bad data length");
> >>>>> +        return;
> >>>>> +    }
> >>>>> +    if (in_dhcp_data->op != DHCP_OP_REPLY) {
> >>>>> +        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1,
> 5);
> >>>>> +        VLOG_WARN_RL(&rl, "DHCP_RELAY_RESP_FWD: invalid opcode "
> >>>>> +                "in the packet: %d", in_dhcp_data->op);
> >>>>> +        return;
> >>>>> +    }
> >>>>> +
> >>>>> +    /* DHCP options follow the DHCP header. The first 4 bytes of
> the DHCP
> >>>>> +     * options is the DHCP magic cookie followed by the actual DHCP
> options.
> >>>>> +     */
> >>>>> +    ovs_be32 magic_cookie = htonl(DHCP_MAGIC_COOKIE);
> >>>>> +    if (in_dhcp_ptr + sizeof magic_cookie > end ||
> >>>>> +        get_unaligned_be32((const void *) in_dhcp_ptr) !=
> magic_cookie) {
> >>>>> +        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1,
> 5);
> >>>>> +        VLOG_WARN_RL(&rl, "DHCP_RELAY_RESP_FWD: magic cookie not
> present "
> >>>>> +                "in the packet");
> >>>>> +        return;
> >>>>> +    }
> >>>>> +
> >>>>> +    if (!in_dhcp_data->giaddr) {
> >>>>> +        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1,
> 5);
> >>>>> +        VLOG_WARN_RL(&rl, "DHCP_RELAY_RESP_FWD: giaddr is "
> >>>>> +                    "not set in response");
> >>>>> +        return;
> >>>>> +    }
> >>>>> +    ovs_be32 giaddr = in_dhcp_data->giaddr;
> >>>>> +
> >>>>> +    ovs_be32 *server_id_ptr = NULL;
> >>>>> +    ovs_be32 lease_time = 0;
> >>>>> +    const uint8_t *in_dhcp_msg_type = NULL;
> >>>>> +
> >>>>> +    in_dhcp_ptr += sizeof magic_cookie;
> >>>>> +    while (in_dhcp_ptr < end) {
> >>>>> +        const struct dhcp_opt_header *in_dhcp_opt =
> >>>>> +            (const struct dhcp_opt_header *) in_dhcp_ptr;
> >>>>> +        if (in_dhcp_opt->code == DHCP_OPT_END) {
> >>>>> +            break;
> >>>>> +        }
> >>>>> +        if (in_dhcp_opt->code == DHCP_OPT_PAD) {
> >>>>> +            in_dhcp_ptr += 1;
> >>>>> +            continue;
> >>>>> +        }
> >>>>> +        in_dhcp_ptr += sizeof *in_dhcp_opt;
> >>>>> +        if (in_dhcp_ptr > end) {
> >>>>> +            break;
> >>>>> +        }
> >>>>> +        in_dhcp_ptr += in_dhcp_opt->len;
> >>>>> +        if (in_dhcp_ptr > end) {
> >>>>> +            break;
> >>>>> +        }
> >>>>> +
> >>>>> +        switch (in_dhcp_opt->code) {
> >>>>> +        case DHCP_OPT_MSG_TYPE:
> >>>>> +            if (in_dhcp_opt->len == 1) {
> >>>>> +                in_dhcp_msg_type = DHCP_OPT_PAYLOAD(in_dhcp_opt);
> >>>>> +            }
> >>>>> +            break;
> >>>>> +        /* Server Identifier */
> >>>>> +        case OVN_DHCP_OPT_CODE_SERVER_ID:
> >>>>> +            if (in_dhcp_opt->len == 4) {
> >>>>> +                server_id_ptr = DHCP_OPT_PAYLOAD(in_dhcp_opt);
> >>>>> +            }
> >>>>> +            break;
> >>>>> +        case OVN_DHCP_OPT_CODE_LEASE_TIME:
> >>>>> +            if (in_dhcp_opt->len == 4) {
> >>>>> +                lease_time = get_unaligned_be32(DHCP_OPT_PAYLOAD(in_dhcp_opt));
> >>>>> +            }
> >>>>> +            break;
> >>>>> +        default:
> >>>>> +            break;
> >>>>> +        }
> >>>>> +    }
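
For readers following the option-walking loop above, here is a standalone sketch of the same bounds-checked walk over a plain byte buffer instead of the OVS types. find_dhcp_opt() and the OPT_* macros are hypothetical names used only for illustration and are not part of the patch. The only thing taken as given is the standard DHCP option layout: a code byte, a length byte, then the payload, with PAD (0) and END (255) carrying no length byte.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define OPT_PAD 0     /* Single-byte filler option. */
#define OPT_END 255   /* Single-byte end-of-options marker. */

/* Hypothetical helper (not in the patch): returns a pointer to the payload
 * of the first option with the given 'code' and stores its length in
 * '*opt_len', or returns NULL if the option is absent or truncated.
 * 'opts' points just past the magic cookie; 'len' is the bytes available. */
static const uint8_t *
find_dhcp_opt(const uint8_t *opts, size_t len, uint8_t code, uint8_t *opt_len)
{
    assert(opts != NULL && opt_len != NULL);

    size_t i = 0;
    while (i < len) {
        uint8_t c = opts[i];
        if (c == OPT_END) {
            break;
        }
        if (c == OPT_PAD) {
            i++;
            continue;
        }
        if (len - i < 2) {
            break;              /* No room for the length byte. */
        }
        uint8_t l = opts[i + 1];
        if (len - i - 2 < l) {
            break;              /* Payload runs past the buffer. */
        }
        if (c == code) {
            *opt_len = l;
            return &opts[i + 2];
        }
        i += 2 + (size_t) l;
    }
    return NULL;
}
```

Both length checks happen before the payload is touched, so a truncated option ends the walk instead of being read out of bounds.
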
> >>>>> +
> >>>>> +    /* Check whether the DHCP Message Type (opt 53) is present or not */
> >>>>> +    if (!in_dhcp_msg_type) {
> >>>>> +        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
> >>>>> +        VLOG_WARN_RL(&rl, "DHCP_RELAY_RESP: missing message type");
> >>>>> +        return;
> >>>>> +    }
> >>>>> +
> >>>>> +    if (!server_id_ptr) {
> >>>>> +        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
> >>>>> +        VLOG_WARN_RL(&rl, "DHCP_RELAY_RESP: missing server identifier");
> >>>>> +        return;
> >>>>> +    }
> >>>>> +
> >>>>> +    if (*server_id_ptr != *server_ip) {
> >>>>> +        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
> >>>>> +        VLOG_WARN_RL(&rl, "DHCP_RELAY_RESP: server identifier mismatch");
> >>>>> +        return;
> >>>>> +    }
> >>>>> +
> >>>>> +    if (giaddr != *relay_ip) {
> >>>>> +        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
> >>>>> +        VLOG_WARN_RL(&rl, "DHCP_RELAY_RESP: giaddr mismatch");
> >>>>> +        return;
> >>>>> +    }
> >>>>> +
> >>>>> +
> >>>>> +    /* Update destination MAC & IP so that the packet is forwarded to the
> >>>>> +     * right destination node.
> >>>>> +     */
> >>>>> +    uint16_t new_l4_size = in_l4_size;
> >>>>> +    size_t new_packet_size = pkt_in->l4_ofs + new_l4_size;
> >>>>> +
> >>>>> +    struct dp_packet pkt_out;
> >>>>> +    dp_packet_init(&pkt_out, new_packet_size);
> >>>>> +    dp_packet_clear(&pkt_out);
> >>>>> +    dp_packet_prealloc_tailroom(&pkt_out, new_packet_size);
> >>>>> +    pkt_out_ptr = &pkt_out;
> >>>>> +
> >>>>> +    /* Copy the L2 and L3 headers from pkt_in as they remain the same. */
> >>>>> +    struct eth_header *eth = dp_packet_put(
> >>>>> +        &pkt_out, dp_packet_pull(pkt_in, pkt_in->l4_ofs), pkt_in->l4_ofs);
> >>>>> +
> >>>>> +    pkt_out.l2_5_ofs = pkt_in->l2_5_ofs;
> >>>>> +    pkt_out.l2_pad_size = pkt_in->l2_pad_size;
> >>>>> +    pkt_out.l3_ofs = pkt_in->l3_ofs;
> >>>>> +    pkt_out.l4_ofs = pkt_in->l4_ofs;
> >>>>> +
> >>>>> +    struct udp_header *udp = dp_packet_put(
> >>>>> +        &pkt_out, dp_packet_pull(pkt_in, UDP_HEADER_LEN), UDP_HEADER_LEN);
> >>>>> +
> >>>>> +    struct dhcp_header *dhcp_data = dp_packet_put(
> >>>>> +        &pkt_out, dp_packet_pull(pkt_in, new_l4_size - UDP_HEADER_LEN),
> >>>>> +        new_l4_size - UDP_HEADER_LEN);
> >>>>> +    memcpy(&eth->eth_dst, dhcp_data->chaddr, sizeof(eth->eth_dst));
> >>>>> +
> >>>>> +    /* Send a broadcast IP frame when BROADCAST flag is set. */
> >>>>> +    struct ip_header *out_ip = dp_packet_l3(&pkt_out);
> >>>>> +    ovs_be32 ip_dst;
> >>>>> +    ovs_be32 ip_dst_orig = get_16aligned_be32(&out_ip->ip_dst);
> >>>>> +    if (!is_dhcp_flags_broadcast(dhcp_data->flags)) {
> >>>>> +        ip_dst = dhcp_data->yiaddr;
> >>>>> +    } else {
> >>>>> +        ip_dst = htonl(0xffffffff);
> >>>>> +    }
> >>>>> +    put_16aligned_be32(&out_ip->ip_dst, ip_dst);
> >>>>> +    out_ip->ip_csum = recalc_csum32(out_ip->ip_csum,
> >>>>> +              ip_dst_orig, ip_dst);
> >>>>> +    if (udp->udp_csum) {
> >>>>> +        udp->udp_csum = recalc_csum32(udp->udp_csum,
> >>>>> +            ip_dst_orig, ip_dst);
> >>>>> +    }
> >>>>> +    /* Reset giaddr */
> >>>>> +    dhcp_data->giaddr = htonl(0x0);
> >>>>> +    if (udp->udp_csum) {
> >>>>> +        udp->udp_csum = recalc_csum32(udp->udp_csum,
> >>>>> +            giaddr, 0);
> >>>>> +    }
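
The recalc_csum32() calls above adjust the checksums incrementally instead of recomputing them over the whole packet. Below is a minimal standalone sketch of that idea, using the RFC 1624 update HC' = ~(~HC + ~m + m') on 16-bit words in host byte order. The helper names are made up for illustration; this is not the OVS implementation, which operates on network-order types.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fold the carries of a 32-bit accumulator back into 16 bits. */
static uint16_t
csum_fold(uint32_t sum)
{
    while (sum >> 16) {
        sum = (sum & 0xffff) + (sum >> 16);
    }
    return (uint16_t) sum;
}

/* Reference: one's-complement checksum computed over all 'n' words. */
static uint16_t
csum_full(const uint16_t *words, size_t n)
{
    assert(words != NULL);
    uint32_t sum = 0;
    for (size_t i = 0; i < n; i++) {
        sum += words[i];
    }
    return (uint16_t) ~csum_fold(sum);
}

/* Incremental update after one 16-bit word changes from 'old_w' to 'new_w'. */
static uint16_t
csum_update16(uint16_t csum, uint16_t old_w, uint16_t new_w)
{
    uint32_t sum = (uint16_t) ~csum;
    sum += (uint16_t) ~old_w;
    sum += new_w;
    return (uint16_t) ~csum_fold(sum);
}

/* Same update for a 32-bit field, handled as two 16-bit halves. */
static uint16_t
csum_update32(uint16_t csum, uint32_t old_v, uint32_t new_v)
{
    csum = csum_update16(csum, (uint16_t) (old_v >> 16),
                         (uint16_t) (new_v >> 16));
    return csum_update16(csum, (uint16_t) (old_v & 0xffff),
                         (uint16_t) (new_v & 0xffff));
}
```

Since giaddr sits inside the UDP payload, clearing it changes only the UDP checksum, while rewriting ip_dst changes both the IP header checksum and the UDP checksum through the pseudo-header. That matches the three updates in the quoted code.
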
> >>>>> +    pin->packet = dp_packet_data(&pkt_out);
> >>>>> +    pin->packet_len = dp_packet_size(&pkt_out);
> >>>>> +
> >>>>> +    /* Log the DHCP message. */
> >>>>> +    static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(20, 40);
> >>>>> +    const struct eth_header *l2 = dp_packet_eth(&pkt_out);
> >>>>> +    VLOG_INFO_RL(&rl, "DHCP_RELAY_RESP_FWD:: MSG_TYPE:%s MAC:"ETH_ADDR_FMT
> >>>>> +             " XID:%u"
> >>>>> +             " YIADDR:"IP_FMT
> >>>>> +             " GIADDR:"IP_FMT
> >>>>> +             " SERVER_ADDR:"IP_FMT,
> >>>>> +             dhcp_msg_str_get(*in_dhcp_msg_type),
> >>>>> +             ETH_ADDR_BYTES_ARGS(dhcp_data->chaddr), ntohl(dhcp_data->xid),
> >>>>> +             IP_ARGS(dhcp_data->yiaddr),
> >>>>> +             IP_ARGS(giaddr), IP_ARGS(*server_id_ptr));
> >>>>> +    queue_msg(swconn, ofputil_encode_resume(pin, continuation, proto));
> >>>>> +    if (pkt_out_ptr) {
> >>>>> +        dp_packet_uninit(pkt_out_ptr);
> >>>>> +    }
> >>>>> +}
> >>>>> +
> >>>>> /* Called with in the pinctrl_handler thread context. */
> >>>>> static void
> >>>>> pinctrl_handle_put_dhcp_opts(
> >>>>> @@ -3203,6 +3634,16 @@ process_packet_in(struct rconn *swconn, const struct ofp_header *msg)
> >>>>>       ovs_mutex_unlock(&pinctrl_mutex);
> >>>>>       break;
> >>>>>
> >>>>> +    case ACTION_OPCODE_DHCP_RELAY_REQ:
> >>>>> +        pinctrl_handle_dhcp_relay_req(swconn, &packet, &pin,
> >>>>> +                                     &userdata, &continuation);
> >>>>> +        break;
> >>>>> +
> >>>>> +    case ACTION_OPCODE_DHCP_RELAY_RESP_FWD:
> >>>>> +        pinctrl_handle_dhcp_relay_resp_fwd(swconn, &packet, &pin,
> >>>>> +                                     &userdata, &continuation);
> >>>>> +        break;
> >>>>> +
> >>>>>   case ACTION_OPCODE_PUT_DHCP_OPTS:
> >>>>>       pinctrl_handle_put_dhcp_opts(swconn, &packet, &pin, &headers,
> >>>>>                                    &userdata, &continuation);
> >>>>> diff --git a/include/ovn/actions.h b/include/ovn/actions.h
> >>>>> index 49cfe0624..47d41b90f 100644
> >>>>> --- a/include/ovn/actions.h
> >>>>> +++ b/include/ovn/actions.h
> >>>>> @@ -95,6 +95,8 @@ struct collector_set_ids;
> >>>>>   OVNACT(LOOKUP_ND_IP,      ovnact_lookup_mac_bind_ip) \
> >>>>>   OVNACT(PUT_DHCPV4_OPTS,   ovnact_put_opts)        \
> >>>>>   OVNACT(PUT_DHCPV6_OPTS,   ovnact_put_opts)        \
> >>>>> +    OVNACT(DHCPV4_RELAY_REQ,  ovnact_dhcp_relay)      \
> >>>>> +    OVNACT(DHCPV4_RELAY_RESP_FWD, ovnact_dhcp_relay)      \
> >>>>>   OVNACT(SET_QUEUE,         ovnact_set_queue)       \
> >>>>>   OVNACT(DNS_LOOKUP,        ovnact_result)          \
> >>>>>   OVNACT(LOG,               ovnact_log)             \
> >>>>> @@ -387,6 +389,14 @@ struct ovnact_put_opts {
> >>>>>   size_t n_options;
> >>>>> };
> >>>>>
> >>>>> +/* OVNACT_DHCP_RELAY. */
> >>>>> +struct ovnact_dhcp_relay {
> >>>>> +    struct ovnact ovnact;
> >>>>> +    int family;
> >>>>> +    ovs_be32 relay_ipv4;
> >>>>> +    ovs_be32 server_ipv4;
> >>>>> +};
> >>>>> +
> >>>>> /* Valid arguments to SET_QUEUE action.
> >>>>> *
> >>>>> * QDISC_MIN_QUEUE_ID is the default queue, so user-defined queues should
> >>>>> @@ -750,6 +760,22 @@ enum action_opcode {
> >>>>>
> >>>>>   /* multicast group split buffer action. */
> >>>>>   ACTION_OPCODE_MG_SPLIT_BUF,
> >>>>> +
> >>>>> +    /* "dhcp_relay_req(relay_ip, server_ip)".
> >>>>> +     *
> >>>>> +     * Arguments follow the action_header, in this format:
> >>>>> +     *   - The 32-bit DHCP relay IP.
> >>>>> +     *   - The 32-bit DHCP server IP.
> >>>>> +     */
> >>>>> +    ACTION_OPCODE_DHCP_RELAY_REQ,
> >>>>> +
> >>>>> +    /* "dhcp_relay_resp_fwd(relay_ip, server_ip)".
> >>>>> +     *
> >>>>> +     * Arguments follow the action_header, in this format:
> >>>>> +     *   - The 32-bit DHCP relay IP.
> >>>>> +     *   - The 32-bit DHCP server IP.
> >>>>> +     */
> >>>>> +    ACTION_OPCODE_DHCP_RELAY_RESP_FWD,
> >>>>> };
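
Per the comments above, both opcodes carry two 32-bit IPv4 addresses after the action header. Here is a small sketch of packing and unpacking such a pair with explicit length checks. struct relay_args, relay_args_put() and relay_args_pull() are hypothetical names and not part of the patch, and the action header itself is deliberately left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical container: both addresses kept as 4 bytes in network order. */
struct relay_args {
    uint8_t relay_ip[4];
    uint8_t server_ip[4];
};

/* Writes relay IP then server IP into 'buf'.  Returns the number of bytes
 * written, or 0 if 'cap' is too small. */
static size_t
relay_args_put(uint8_t *buf, size_t cap, const struct relay_args *args)
{
    assert(buf != NULL && args != NULL);
    if (cap < sizeof args->relay_ip + sizeof args->server_ip) {
        return 0;
    }
    memcpy(buf, args->relay_ip, sizeof args->relay_ip);
    memcpy(buf + sizeof args->relay_ip, args->server_ip,
           sizeof args->server_ip);
    return sizeof args->relay_ip + sizeof args->server_ip;
}

/* Reads the two addresses back.  Returns 1 on success, 0 if 'len' is too
 * short to hold both. */
static int
relay_args_pull(const uint8_t *buf, size_t len, struct relay_args *args)
{
    assert(buf != NULL && args != NULL);
    if (len < sizeof args->relay_ip + sizeof args->server_ip) {
        return 0;
    }
    memcpy(args->relay_ip, buf, sizeof args->relay_ip);
    memcpy(args->server_ip, buf + sizeof args->relay_ip,
           sizeof args->server_ip);
    return 1;
}
```

Using memcpy() on byte arrays sidesteps alignment concerns on the receiving side, which matters because the arguments follow a variable-length message rather than living in an aligned struct.
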
> >>>>>
> >>>>> /* Header. */
> >>>>> diff --git a/lib/actions.c b/lib/actions.c
> >>>>> index a73fe1a1e..69df428c6 100644
> >>>>> --- a/lib/actions.c
> >>>>> +++ b/lib/actions.c
> >>>>> @@ -2629,6 +2629,118 @@ ovnact_controller_event_free(struct ovnact_controller_event *event)
> >>>>>   free_gen_options(event->options, event->n_options);
> >>>>> }
> >>>>>
> >>>>> +static void
> >>>>> +format_DHCPV4_RELAY_REQ(const struct ovnact_dhcp_relay *dhcp_relay,
> >>>>> +                struct ds *s)
> >>>>> +{
> >>>>> +    ds_put_format(s, "dhcp_relay_req("IP_FMT","IP_FMT");",
> >>>>> +                  IP_ARGS(dhcp_relay->relay_ipv4),
> >>>>> +                  IP_ARGS(dhcp_relay->server_ipv4));
> >>>>> +}
> >>>>> +
> >>>>> +static void
> >>>>> +parse_dhcp_relay_req(struct action_context *ctx,
> >>>>> +               struct ovnact_dhcp_relay *dhcp_relay)
> >>>>> +{
> >>>>> +    /* Skip dhcp_relay_req( */
> >>>>> +    lexer_force_match(ctx->lexer, LEX_T_LPAREN);
> >>>>> +
> >>>>> +    /* Parse relay ip and server ip. */
> >>>>> +    if (ctx->lexer->token.format == LEX_F_IPV4) {
> >>>>> +        dhcp_relay->family = AF_INET;
> >>>>> +        dhcp_relay->relay_ipv4 = ctx->lexer->token.value.ipv4;
> >>>>> +        lexer_get(ctx->lexer);
> >>>>> +        lexer_match(ctx->lexer, LEX_T_COMMA);
> >>>>> +        if (ctx->lexer->token.format == LEX_F_IPV4) {
> >>>>> +            dhcp_relay->family = AF_INET;
> >>>>> +            dhcp_relay->server_ipv4 = ctx->lexer->token.value.ipv4;
> >>>>> +            lexer_get(ctx->lexer);
> >>>>> +        } else {
> >>>>> +            lexer_syntax_error(ctx->lexer, "expecting IPv4 dhcp server ip");
> >>>>> +            return;
> >>>>> +        }
> >>>>> +    } else {
> >>>>> +          lexer_syntax_error(ctx->lexer, "expecting IPv4 dhcp relay "
> >>>>> +                          "and server ips");
> >>>>> +          return;
> >>>>> +    }
> >>>>> +    lexer_force_match(ctx->lexer, LEX_T_RPAREN);
> >>>>> +}
> >>>>> +
> >>>>> +static void
> >>>>> +encode_DHCPV4_RELAY_REQ(const struct ovnact_dhcp_relay *dhcp_relay,
> >>>>> +                    const struct ovnact_encode_params *ep,
> >>>>> +                    struct ofpbuf *ofpacts)
> >>>>> +{
> >>>>> +    size_t oc_offset = encode_start_controller_op(ACTION_OPCODE_DHCP_RELAY_REQ,
> >>>>> +                                                  true, ep->ctrl_meter_id,
> >>>>> +                                                  ofpacts);
> >>>>> +    ofpbuf_put(ofpacts, &dhcp_relay->relay_ipv4,
> >>>>> +            sizeof(dhcp_relay->relay_ipv4));
> >>>>> +    ofpbuf_put(ofpacts, &dhcp_relay->server_ipv4,
> >>>>> +            sizeof(dhcp_relay->server_ipv4));
> >>>>> +    encode_finish_controller_op(oc_offset, ofpacts);
> >>>>> +}
> >>>>> +
> >>>>> +static void
> >>>>> +format_DHCPV4_RELAY_RESP_FWD(const struct ovnact_dhcp_relay *dhcp_relay,
> >>>>> +                    struct ds *s)
> >>>>> +{
> >>>>> +    ds_put_format(s, "dhcp_relay_resp_fwd("IP_FMT","IP_FMT");",
> >>>>> +                  IP_ARGS(dhcp_relay->relay_ipv4),
> >>>>> +                  IP_ARGS(dhcp_relay->server_ipv4));
> >>>>> +}
> >>>>> +
> >>>>> +static void
> >>>>> +parse_dhcp_relay_resp_fwd(struct action_context *ctx,
> >>>>> +               struct ovnact_dhcp_relay *dhcp_relay)
> >>>>> +{
> >>>>> +    /* Skip dhcp_relay_resp_fwd( */
> >>>>> +    lexer_force_match(ctx->lexer, LEX_T_LPAREN);
> >>>>> +
> >>>>> +    /* Parse relay ip and server ip. */
> >>>>> +    if (ctx->lexer->token.format == LEX_F_IPV4) {
> >>>>> +        dhcp_relay->family = AF_INET;
> >>>>> +        dhcp_relay->relay_ipv4 = ctx->lexer->token.value.ipv4;
> >>>>> +        lexer_get(ctx->lexer);
> >>>>> +        lexer_match(ctx->lexer, LEX_T_COMMA);
> >>>>> +        if (ctx->lexer->token.format == LEX_F_IPV4) {
> >>>>> +            dhcp_relay->family = AF_INET;
> >>>>> +            dhcp_relay->server_ipv4 = ctx->lexer->token.value.ipv4;
> >>>>> +            lexer_get(ctx->lexer);
> >>>>> +        } else {
> >>>>> +            lexer_syntax_error(ctx->lexer, "expecting IPv4 dhcp server ip");
> >>>>> +            return;
> >>>>> +        }
> >>>>> +    } else {
> >>>>> +          lexer_syntax_error(ctx->lexer, "expecting IPv4 dhcp relay and "
> >>>>> +                          "server ips");
> >>>>> +          return;
> >>>>> +    }
> >>>>> +    lexer_force_match(ctx->lexer, LEX_T_RPAREN);
> >>>>> +}
> >>>>> +
> >>>>> +static void
> >>>>> +encode_DHCPV4_RELAY_RESP_FWD(const struct ovnact_dhcp_relay *dhcp_relay,
> >>>>> +                    const struct ovnact_encode_params *ep,
> >>>>> +                    struct ofpbuf *ofpacts)
> >>>>> +{
> >>>>> +    size_t oc_offset = encode_start_controller_op(
> >>>>> +                                ACTION_OPCODE_DHCP_RELAY_RESP_FWD,
> >>>>> +                                true, ep->ctrl_meter_id,
> >>>>> +                                ofpacts);
> >>>>> +    ofpbuf_put(ofpacts, &dhcp_relay->relay_ipv4,
> >>>>> +                  sizeof(dhcp_relay->relay_ipv4));
> >>>>> +    ofpbuf_put(ofpacts, &dhcp_relay->server_ipv4,
> >>>>> +                  sizeof(dhcp_relay->server_ipv4));
> >>>>> +    encode_finish_controller_op(oc_offset, ofpacts);
> >>>>> +}
> >>>>> +
> >>>>> +static void ovnact_dhcp_relay_free(
> >>>>> +          struct ovnact_dhcp_relay *dhcp_relay OVS_UNUSED)
> >>>>> +{
> >>>>> +}
> >>>>> +
> >>>>> static void
> >>>>> parse_put_opts(struct action_context *ctx, const struct expr_field *dst,
> >>>>>              struct ovnact_put_opts *po, const struct hmap *gen_opts,
> >>>>> @@ -5451,6 +5563,11 @@ parse_action(struct action_context *ctx)
> >>>>>       parse_sample(ctx);
> >>>>>   } else if (lexer_match_id(ctx->lexer, "mac_cache_use")) {
> >>>>>       ovnact_put_MAC_CACHE_USE(ctx->ovnacts);
> >>>>> +    } else if (lexer_match_id(ctx->lexer, "dhcp_relay_req")) {
> >>>>> +        parse_dhcp_relay_req(ctx, ovnact_put_DHCPV4_RELAY_REQ(ctx->ovnacts));
> >>>>> +    } else if (lexer_match_id(ctx->lexer, "dhcp_relay_resp_fwd")) {
> >>>>> +        parse_dhcp_relay_resp_fwd(ctx,
> >>>>> +              ovnact_put_DHCPV4_RELAY_RESP_FWD(ctx->ovnacts));
> >>>>>   } else {
> >>>>>       lexer_syntax_error(ctx->lexer, "expecting action");
> >>>>>   }
> >>>>> diff --git a/lib/ovn-l7.h b/lib/ovn-l7.h
> >>>>> index ad514a922..e08581123 100644
> >>>>> --- a/lib/ovn-l7.h
> >>>>> +++ b/lib/ovn-l7.h
> >>>>> @@ -69,6 +69,7 @@ struct gen_opts_map {
> >>>>> */
> >>>>> #define OVN_DHCP_OPT_CODE_NETMASK      1
> >>>>> #define OVN_DHCP_OPT_CODE_LEASE_TIME   51
> >>>>> +#define OVN_DHCP_OPT_CODE_SERVER_ID    54
> >>>>> #define OVN_DHCP_OPT_CODE_T1           58
> >>>>> #define OVN_DHCP_OPT_CODE_T2           59
> >>>>>
> >>>>> diff --git a/northd/northd.c b/northd/northd.c
> >>>>> index 07dffb15a..7ac831fae 100644
> >>>>> --- a/northd/northd.c
> >>>>> +++ b/northd/northd.c
> >>>>> @@ -181,11 +181,13 @@ enum ovn_stage {
> >>>>>   PIPELINE_STAGE(ROUTER, IN,  IP_ROUTING_ECMP, 14, "lr_in_ip_routing_ecmp") \
> >>>>>   PIPELINE_STAGE(ROUTER, IN,  POLICY,          15, "lr_in_policy")          \
> >>>>>   PIPELINE_STAGE(ROUTER, IN,  POLICY_ECMP,     16, "lr_in_policy_ecmp")     \
> >>>>> -    PIPELINE_STAGE(ROUTER, IN,  ARP_RESOLVE,     17, "lr_in_arp_resolve")     \
> >>>>> -    PIPELINE_STAGE(ROUTER, IN,  CHK_PKT_LEN,     18, "lr_in_chk_pkt_len")     \
> >>>>> -    PIPELINE_STAGE(ROUTER, IN,  LARGER_PKTS,     19, "lr_in_larger_pkts")     \
> >>>>> -    PIPELINE_STAGE(ROUTER, IN,  GW_REDIRECT,     20, "lr_in_gw_redirect")     \
> >>>>> -    PIPELINE_STAGE(ROUTER, IN,  ARP_REQUEST,     21, "lr_in_arp_request")     \
> >>>>> +    PIPELINE_STAGE(ROUTER, IN,  DHCP_RELAY_RESP_FWD, 17,                      \
> >>>>> +                  "lr_in_dhcp_relay_resp_fwd")                                \
> >>>>> +    PIPELINE_STAGE(ROUTER, IN,  ARP_RESOLVE,     18, "lr_in_arp_resolve")     \
> >>>>> +    PIPELINE_STAGE(ROUTER, IN,  CHK_PKT_LEN,     19, "lr_in_chk_pkt_len")     \
> >>>>> +    PIPELINE_STAGE(ROUTER, IN,  LARGER_PKTS,     20, "lr_in_larger_pkts")     \
> >>>>> +    PIPELINE_STAGE(ROUTER, IN,  GW_REDIRECT,     21, "lr_in_gw_redirect")     \
> >>>>> +    PIPELINE_STAGE(ROUTER, IN,  ARP_REQUEST,     22, "lr_in_arp_request")     \
> >>>>>                                                                     \
> >>>>>   /* Logical router egress stages. */                               \
> >>>>>   PIPELINE_STAGE(ROUTER, OUT, CHECK_DNAT_LOCAL,   0,                       \
> >>>>> @@ -9610,6 +9612,80 @@ build_dhcpv6_options_flows(struct ovn_port *op,
> >>>>>   ds_destroy(&match);
> >>>>> }
> >>>>>
> >>>>> +static void
> >>>>> +build_lswitch_dhcp_relay_flows(struct ovn_port *op,
> >>>>> +                           const struct hmap *lr_ports,
> >>>>> +                           const struct hmap *lflows,
> >>>>> +                           const struct shash *meter_groups OVS_UNUSED)
> >>>>> +{
> >>>>> +    if (op->nbrp || !op->nbsp) {
> >>>>> +        return;
> >>>>> +    }
> >>>>> +    /* consider only ports attached to VMs */
> >>>>> +    if (strcmp(op->nbsp->type, "")) {
> >>>>> +        return;
> >>>>> +    }
> >>>>> +
> >>>>> +    if (!op->od || !op->od->n_router_ports ||
> >>>>> +        !op->od->nbs || !op->od->nbs->dhcp_relay_port) {
> >>>>> +        return;
> >>>>> +    }
> >>>>> +
> >>>>> +    struct ds match = DS_EMPTY_INITIALIZER;
> >>>>> +    struct ds action = DS_EMPTY_INITIALIZER;
> >>>>> +    struct nbrec_logical_router_port *lrp = op->od->nbs->dhcp_relay_port;
> >>>>> +    struct ovn_port *rp = ovn_port_find(lr_ports, lrp->name);
> >>>>> +
> >>>>> +    if (!rp || !rp->nbrp || !rp->nbrp->dhcp_relay) {
> >>>>> +        return;
> >>>>> +    }
> >>>>> +
> >>>>> +    struct ovn_port *sp = NULL;
> >>>>> +    struct nbrec_dhcp_relay *dhcp_relay = rp->nbrp->dhcp_relay;
> >>>>> +
> >>>>> +    for (int i = 0; i < op->od->n_router_ports; i++) {
> >>>>> +        struct ovn_port *sp_tmp = op->od->router_ports[i];
> >>>>> +        if (sp_tmp->peer == rp) {
> >>>>> +            sp = sp_tmp;
> >>>>> +            break;
> >>>>> +        }
> >>>>> +    }
> >>>>> +    if (!sp) {
> >>>>> +        return;
> >>>>> +    }
> >>>>> +
> >>>>> +    char *server_ip_str = NULL;
> >>>>> +    uint16_t port;
> >>>>> +    int addr_family;
> >>>>> +    struct in6_addr server_ip;
> >>>>> +
> >>>>> +    if (!ip_address_and_port_from_lb_key(dhcp_relay->servers, &server_ip_str,
> >>>>> +                                         &server_ip, &port, &addr_family)) {
> >>>>> +        return;
> >>>>> +    }
> >>>>> +
> >>>>> +    if (server_ip_str == NULL) {
> >>>>> +        return;
> >>>>> +    }
> >>>>> +
> >>>>> +    ds_put_format(
> >>>>> +        &match, "inport == %s && eth.src == %s && "
> >>>>> +        "ip4.src == 0.0.0.0 && ip4.dst == 255.255.255.255 && "
> >>>>> +        "udp.src == 68 && udp.dst == 67",
> >>>>> +        op->json_key, op->lsp_addrs[0].ea_s);
> >>>>> +    ds_put_format(&action,
> >>>>> +                  "eth.dst=%s;outport=%s;next;/* DHCP_RELAY_REQ */",
> >>>>> +                  rp->lrp_networks.ea_s,sp->json_key);
> >>>>> +    ovn_lflow_add_with_hint__(lflows, op->od,
> >>>>> +                              S_SWITCH_IN_L2_LKUP, 100,
> >>>>> +                              ds_cstr(&match),
> >>>>> +                              ds_cstr(&action),
> >>>>> +                              op->key,
> >>>>> +                              NULL,
> >>>>> +                              &lrp->header_);
> >>>>> +    ds_destroy(&match);
> >>>>> +    ds_destroy(&action);
> >>>>> +    free(server_ip_str);
> >>>>> +}
> >>>>> +
> >>>>> static void
> >>>>> build_drop_arp_nd_flows_for_unbound_router_ports(struct ovn_port *op,
> >>>>>                                                const struct ovn_port *port,
> >>>>> @@ -10181,6 +10257,13 @@ build_lswitch_dhcp_options_and_response(struct ovn_port *op,
> >>>>>       return;
> >>>>>   }
> >>>>>
> >>>>> +    if (op->od && op->od->nbs
> >>>>> +        && op->od->nbs->dhcp_relay_port) {
> >>>>> +        /* Don't add the DHCP server flows if DHCP Relay is enabled on the
> >>>>> +         * logical switch. */
> >>>>> +        return;
> >>>>> +    }
> >>>>> +
> >>>>>   bool is_external = lsp_is_external(op->nbsp);
> >>>>>   if (is_external && (!op->od->n_localnet_ports ||
> >>>>>                       !op->nbsp->ha_chassis_group)) {
> >>>>> @@ -14458,6 +14541,86 @@ build_dhcpv6_reply_flows_for_lrouter_port(
> >>>>>   }
> >>>>> }
> >>>>>
> >>>>> +static void
> >>>>> +build_dhcp_relay_flows_for_lrouter_port(
> >>>>> +        struct ovn_port *op, struct hmap *lflows,
> >>>>> +        struct ds *match)
> >>>>> +{
> >>>>> +    if (!op->nbrp || !op->nbrp->dhcp_relay) {
> >>>>> +        return;
> >>>>> +    }
> >>>>> +    struct nbrec_dhcp_relay *dhcp_relay = op->nbrp->dhcp_relay;
> >>>>> +    if (!dhcp_relay->servers) {
> >>>>> +        return;
> >>>>> +    }
> >>>>> +
> >>>>> +    int addr_family;
> >>>>> +    /* currently not supporting custom port */
> >>>>> +    uint16_t port;
> >>>>> +    char *server_ip_str = NULL;
> >>>>> +    struct in6_addr server_ip;
> >>>>> +
> >>>>> +    if (!ip_address_and_port_from_lb_key(dhcp_relay->servers, &server_ip_str,
> >>>>> +                                         &server_ip, &port, &addr_family)) {
> >>>>> +        return;
> >>>>> +    }
> >>>>> +
> >>>>> +    if (server_ip_str == NULL) {
> >>>>> +        return;
> >>>>> +    }
> >>>>> +
> >>>>> +    struct ds dhcp_action = DS_EMPTY_INITIALIZER;
> >>>>> +    ds_clear(match);
> >>>>> +    ds_put_format(
> >>>>> +        match, "inport == %s && "
> >>>>> +        "ip4.src == 0.0.0.0 && ip4.dst == 255.255.255.255 && "
> >>>>> +        "udp.src == 68 && udp.dst == 67",
> >>>>> +        op->json_key);
> >>>>> +    ds_put_format(&dhcp_action,
> >>>>> +                "dhcp_relay_req(%s,%s);"
> >>>>> +                "ip4.src=%s;ip4.dst=%s;udp.src=67;next; /* DHCP_RELAY_REQ */",
> >>>>> +                op->lrp_networks.ipv4_addrs[0].addr_s, server_ip_str,
> >>>>> +                op->lrp_networks.ipv4_addrs[0].addr_s, server_ip_str);
> >>>>> +
> >>>>> +    ovn_lflow_add_with_hint(lflows, op->od, S_ROUTER_IN_IP_INPUT, 110,
> >>>>> +                            ds_cstr(match), ds_cstr(&dhcp_action),
> >>>>> +                            &op->nbrp->header_);
> >>>>> +
> >>>>> +    ds_clear(match);
> >>>>> +    ds_clear(&dhcp_action);
> >>>>> +
> >>>>> +    ds_put_format(
> >>>>> +        match, "ip4.src == %s && ip4.dst == %s && "
> >>>>> +        "udp.src == 67 && udp.dst == 67",
> >>>>> +        server_ip_str, op->lrp_networks.ipv4_addrs[0].addr_s);
> >>>>> +    ds_put_format(&dhcp_action, "next;/* DHCP_RELAY_RESP */");
> >>>>> +    ovn_lflow_add_with_hint(lflows, op->od, S_ROUTER_IN_IP_INPUT, 110,
> >>>>> +                            ds_cstr(match), ds_cstr(&dhcp_action),
> >>>>> +                            &op->nbrp->header_);
> >>>>> +
> >>>>> +    ds_clear(match);
> >>>>> +    ds_clear(&dhcp_action);
> >>>>> +
> >>>>> +    ds_put_format(
> >>>>> +        match, "ip4.src == %s && ip4.dst == %s && "
> >>>>> +        "udp.src == 67 && udp.dst == 67",
> >>>>> +        server_ip_str, op->lrp_networks.ipv4_addrs[0].addr_s);
> >>>>> +    ds_put_format(&dhcp_action,
> >>>>> +          "dhcp_relay_resp_fwd(%s,%s);ip4.src=%s;udp.dst=68;"
> >>>>> +          "outport=%s;output; /* DHCP_RELAY_RESP */",
> >>>>> +          op->lrp_networks.ipv4_addrs[0].addr_s, server_ip_str,
> >>>>> +          op->lrp_networks.ipv4_addrs[0].addr_s, op->json_key);
> >>>>> +    ovn_lflow_add_with_hint(lflows, op->od, S_ROUTER_IN_DHCP_RELAY_RESP_FWD,
> >>>>> +                            110,
> >>>>> +                            ds_cstr(match), ds_cstr(&dhcp_action),
> >>>>> +                            &op->nbrp->header_);
> >>>>> +
> >>>>> +    ds_clear(match);
> >>>>> +    ds_destroy(&dhcp_action);
> >>>>> +
> >>>>> +    free(server_ip_str);
> >>>>> +}
> >>>>> +
> >>>>> static void
> >>>>> build_ipv6_input_flows_for_lrouter_port(
> >>>>>       struct ovn_port *op, struct hmap *lflows,
> >>>>> @@ -15673,6 +15836,8 @@ build_lrouter_nat_defrag_and_lb(struct ovn_datapath *od, struct hmap *lflows,
> >>>>>   ovn_lflow_add(lflows, od, S_ROUTER_OUT_POST_SNAT, 0, "1", "next;");
> >>>>>   ovn_lflow_add(lflows, od, S_ROUTER_OUT_EGR_LOOP, 0, "1", "next;");
> >>>>>   ovn_lflow_add(lflows, od, S_ROUTER_IN_ECMP_STATEFUL, 0, "1", "next;");
> >>>>> +    ovn_lflow_add(lflows, od, S_ROUTER_IN_DHCP_RELAY_RESP_FWD, 0, "1",
> >>>>> +                  "next;");
> >>>>>
> >>>>>   const char *ct_flag_reg = features->ct_no_masked_label
> >>>>>                             ? "ct_mark"
> >>>>> @@ -16154,6 +16319,7 @@ build_lswitch_and_lrouter_iterate_by_lsp(struct ovn_port *op,
> >>>>>   build_lswitch_dhcp_options_and_response(op, lflows, meter_groups);
> >>>>>   build_lswitch_external_port(op, lflows);
> >>>>>   build_lswitch_ip_unicast_lookup(op, lflows, actions, match);
> >>>>> +    build_lswitch_dhcp_relay_flows(op, lr_ports, lflows, meter_groups);
> >>>>>
> >>>>>   /* Build Logical Router Flows. */
> >>>>>   build_ip_routing_flows_for_router_type_lsp(op, lr_ports, lflows);
> >>>>> @@ -16183,6 +16349,7 @@ build_lswitch_and_lrouter_iterate_by_lrp(struct ovn_port *op,
> >>>>>   build_egress_delivery_flows_for_lrouter_port(op, lsi->lflows, &lsi->match,
> >>>>>                                                &lsi->actions);
> >>>>>   build_dhcpv6_reply_flows_for_lrouter_port(op, lsi->lflows, &lsi->match);
> >>>>> +    build_dhcp_relay_flows_for_lrouter_port(op, lsi->lflows, &lsi->match);
> >>>>>   build_ipv6_input_flows_for_lrouter_port(op, lsi->lflows,
> >>>>>                                           &lsi->match, &lsi->actions,
> >>>>>                                           lsi->meter_groups);
> >>>>> diff --git a/ovn-nb.ovsschema b/ovn-nb.ovsschema
> >>>>> index b2e0993e0..6863d52cd 100644
> >>>>> --- a/ovn-nb.ovsschema
> >>>>> +++ b/ovn-nb.ovsschema
> >>>>> @@ -1,7 +1,7 @@
> >>>>> {
> >>>>>   "name": "OVN_Northbound",
> >>>>> -    "version": "7.2.0",
> >>>>> -    "cksum": "1069338687 34162",
> >>>>> +    "version": "7.3.0",
> >>>>> +    "cksum": "2325497400 35185",
> >>>>>   "tables": {
> >>>>>       "NB_Global": {
> >>>>>           "columns": {
> >>>>> @@ -89,7 +89,12 @@
> >>>>>                   "type": {"key": {"type": "uuid",
> >>>>>                                    "refTable": "Forwarding_Group",
> >>>>>                                    "refType": "strong"},
> >>>>> -                                     "min": 0, "max": "unlimited"}}},
> >>>>> +                                     "min": 0, "max": "unlimited"}},
> >>>>> +                "dhcp_relay_port": {"type": {"key": {"type": "uuid",
> >>>>> +                                            "refTable": "Logical_Router_Port",
> >>>>> +                                            "refType": "weak"},
> >>>>> +                                            "min": 0,
> >>>>> +                                            "max": 1}}},
> >>>>>           "isRoot": true},
> >>>>>       "Logical_Switch_Port": {
> >>>>>           "columns": {
> >>>>> @@ -436,6 +441,11 @@
> >>>>>               "ipv6_prefix": {"type": {"key": "string",
> >>>>>                                     "min": 0,
> >>>>>                                     "max": "unlimited"}},
> >>>>> +                "dhcp_relay": {"type": {"key": {"type": "uuid",
> >>>>> +                                            "refTable": "DHCP_Relay",
> >>>>> +                                            "refType": "weak"},
> >>>>> +                                            "min": 0,
> >>>>> +                                            "max": 1}},
> >>>>>               "external_ids": {
> >>>>>                   "type": {"key": "string", "value": "string",
> >>>>>                            "min": 0, "max": "unlimited"}},
> >>>>> @@ -529,6 +539,15 @@
> >>>>>                   "type": {"key": "string", "value": "string",
> >>>>>                            "min": 0, "max": "unlimited"}}},
> >>>>>           "isRoot": true},
> >>>>> +        "DHCP_Relay": {
> >>>>> +            "columns": {
> >>>>> +                "servers": {"type": {"key": "string",
> >>>>> +                                       "min": 0,
> >>>>> +                                       "max": 1}},
> >>>>> +                "external_ids": {
> >>>>> +                    "type": {"key": "string", "value": "string",
> >>>>> +                             "min": 0, "max": "unlimited"}}},
> >>>>> +            "isRoot": true},
> >>>>>       "Connection": {
> >>>>>           "columns": {
> >>>>>               "target": {"type": "string"},
> >>>>> diff --git a/ovn-nb.xml b/ovn-nb.xml
> >>>>> index fcb1c6ecc..dc20892e1 100644
> >>>>> --- a/ovn-nb.xml
> >>>>> +++ b/ovn-nb.xml
> >>>>> @@ -608,6 +608,11 @@
> >>>>>     Please see the <ref table="DNS"/> table.
> >>>>>   </column>
> >>>>>
> >>>>> +    <column name="dhcp_relay_port">
> >>>>> +      This column defines the <ref table="Logical_Router_Port"/> on which
> >>>>> +      DHCP relay is enabled.
> >>>>> +    </column>
> >>>>> +
> >>>>>   <column name="forwarding_groups">
> >>>>>     Groups a set of logical port endpoints for traffic going out of the
> >>>>>     logical switch.
> >>>>> @@ -2980,6 +2985,11 @@ or
> >>>>>     port has all ingress and egress traffic dropped.
> >>>>>   </column>
> >>>>>
> >>>>> +    <column name="dhcp_relay">
> >>>>> +      This column is used to enable DHCP Relay. Please refer
> >>>>> +      to the <ref table="DHCP_Relay"/> table.
> >>>>> +    </column>
> >>>>> +
> >>>>>   <group title="Distributed Gateway Ports">
> >>>>>     <p>
> >>>>>       Gateways, as documented under <code>Gateways</code> in the OVN
> >>>>> @@ -4286,6 +4296,24 @@ or
> >>>>>   </group>
> >>>>> </table>
> >>>>>
> >>>>> +  <table name="DHCP_Relay" title="DHCP Relay">
> >>>>> +    <p>
> >>>>> +      OVN implements native DHCPv4 relay support which caters to the common
> >>>>> +      use case of relaying DHCP requests to an external DHCP server.
> >>>>> +    </p>
> >>>>> +
> >>>>> +    <column name="servers">
> >>>>> +      <p>
> >>>>> +        The DHCPv4 server IP address.
> >>>>> +      </p>
> >>>>> +    </column>
> >>>>> +    <group title="Common Columns">
> >>>>> +      <column name="external_ids">
> >>>>> +        See <em>External IDs</em> at the beginning of this document.
> >>>>> +      </column>
> >>>>> +    </group>
> >>>>> +  </table>
> >>>>> +
> >>>>> <table name="Connection" title="OVSDB client connections.">
> >>>>>   <p>
> >>>>>     Configuration for a database connection to an Open vSwitch database
> >>>>> diff --git a/tests/atlocal.in b/tests/atlocal.in
> >>>>> index 63d891b89..32d1c374e 100644
> >>>>> --- a/tests/atlocal.in
> >>>>> +++ b/tests/atlocal.in
> >>>>> @@ -187,6 +187,9 @@ fi
> >>>>> # Set HAVE_DHCPD
> >>>>> find_command dhcpd
> >>>>>
> >>>>> +# Set HAVE_DHCLIENT
> >>>>> +find_command dhclient
> >>>>> +
> >>>>> # Set HAVE_BFDD_BEACON
> >>>>> find_command bfdd-beacon
> >>>>>
> >>>>> diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
> >>>>> index 19e4f1263..4d8c9ff26 100644
> >>>>> --- a/tests/ovn-northd.at
> >>>>> +++ b/tests/ovn-northd.at
> >>>>> @@ -8786,9 +8786,9 @@ ovn-nbctl --wait=sb set logical_router_port R1-PUB options:redirect-type=bridged
> >>>>> ovn-sbctl dump-flows R1 > R1flows
> >>>>> AT_CAPTURE_FILE([R1flows])
> >>>>>
> >>>>> -AT_CHECK([grep "lr_in_arp_resolve" R1flows | grep priority=90 | sort], [0], [dnl
> >>>>> -  table=17(lr_in_arp_resolve  ), priority=90   , match=(outport == "R1-PUB" && ip4.src == 10.0.0.3 && is_chassis_resident("S0-P0")), action=(get_arp(outport, reg0); next;)
> >>>>> -  table=17(lr_in_arp_resolve  ), priority=90   , match=(outport == "R1-PUB" && ip6.src == 1000::3 && is_chassis_resident("S0-P0")), action=(get_nd(outport, xxreg0); next;)
> >>>>> +AT_CHECK([grep "lr_in_arp_resolve" R1flows | grep priority=90 | sed 's/table=../table=??/' | sort], [0], [dnl
> >>>>> +  table=??(lr_in_arp_resolve  ), priority=90   , match=(outport == "R1-PUB" && ip4.src == 10.0.0.3 && is_chassis_resident("S0-P0")), action=(get_arp(outport, reg0); next;)
> >>>>> +  table=??(lr_in_arp_resolve  ), priority=90   , match=(outport == "R1-PUB" && ip6.src == 1000::3 && is_chassis_resident("S0-P0")), action=(get_nd(outport, xxreg0); next;)
> >>>>> ])
> >>>>>
> >>>>> AT_CLEANUP
> >>>>> @@ -10966,3 +10966,38 @@ Status: active
> >>>>>
> >>>>> AT_CLEANUP
> >>>>> ])
> >>>>> +
> >>>>> +OVN_FOR_EACH_NORTHD_NO_HV([
> >>>>> +AT_SETUP([check DHCP RELAY AGENT])
> >>>>> +ovn_start NORTHD_TYPE
> >>>>> +
> >>>>> +check ovn-nbctl ls-add ls0
> >>>>> +check ovn-nbctl lsp-add ls0 ls0-port1
> >>>>> +check ovn-nbctl lsp-set-addresses ls0-port1 02:00:00:00:00:10
> >>>>> +check ovn-nbctl lr-add lr0
> >>>>> +check ovn-nbctl lrp-add lr0 lrp1 02:00:00:00:00:01 192.168.1.1/24
> >>>>> +check ovn-nbctl lsp-add ls0 lrp1-attachment
> >>>>> +check ovn-nbctl lsp-set-type lrp1-attachment router
> >>>>> +check ovn-nbctl lsp-set-addresses lrp1-attachment 00:00:00:00:ff:02
> >>>>> +check ovn-nbctl lsp-set-options lrp1-attachment router-port=lrp1
> >>>>> +check ovn-nbctl lrp-add lr0 lrp-ext 02:00:00:00:00:02 192.168.2.1/24
> >>>>> +
> >>>>> +dhcp_relay=$(ovn-nbctl create DHCP_Relay servers=172.16.1.1)
> >>>>> +check ovn-nbctl set Logical_Router_port lrp1 dhcp_relay=$dhcp_relay
> >>>>> +rp_uuid=$(ovn-nbctl --bare --columns=_uuid list logical_router_port lrp1)
> >>>>> +check ovn-nbctl set Logical_Switch ls0 dhcp_relay_port=$rp_uuid
> >>>>> +
> >>>>> +check ovn-nbctl --wait=sb sync
> >>>>> +
> >>>>> +ovn-sbctl lflow-list > lflows
> >>>>> +AT_CAPTURE_FILE([lflows])
> >>>>> +
> >>>>> +AT_CHECK([grep -e "DHCP_RELAY_" lflows | sed 's/table=../table=??/'], [0], [dnl
> >>>>> +  table=??(lr_in_ip_input     ), priority=110  , match=(inport == "lrp1" && ip4.src == 0.0.0.0 && ip4.dst == 255.255.255.255 && udp.src == 68 && udp.dst == 67), action=(dhcp_relay_req(192.168.1.1,172.16.1.1);ip4.src=192.168.1.1;ip4.dst=172.16.1.1;udp.src=67;next; /* DHCP_RELAY_REQ */)
> >>>>> +  table=??(lr_in_ip_input     ), priority=110  , match=(ip4.src == 172.16.1.1 && ip4.dst == 192.168.1.1 && udp.src == 67 && udp.dst == 67), action=(next;/* DHCP_RELAY_RESP */)
> >>>>> +  table=??(lr_in_dhcp_relay_resp_fwd), priority=110  , match=(ip4.src == 172.16.1.1 && ip4.dst == 192.168.1.1 && udp.src == 67 && udp.dst == 67), action=(dhcp_rel

Patch

diff --git a/controller/pinctrl.c b/controller/pinctrl.c
index 5a35d56f6..45240f01d 100644
--- a/controller/pinctrl.c
+++ b/controller/pinctrl.c
@@ -1897,6 +1897,437 @@  is_dhcp_flags_broadcast(ovs_be16 flags)
     return flags & htons(DHCP_BROADCAST_FLAG);
 }
 
+static const char *dhcp_msg_str[] = {
+[0] = "INVALID",
+[DHCP_MSG_DISCOVER] = "DISCOVER",
+[DHCP_MSG_OFFER] = "OFFER",
+[DHCP_MSG_REQUEST] = "REQUEST",
+[OVN_DHCP_MSG_DECLINE] = "DECLINE",
+[DHCP_MSG_ACK] = "ACK",
+[DHCP_MSG_NAK] = "NAK",
+[OVN_DHCP_MSG_RELEASE] = "RELEASE",
+[OVN_DHCP_MSG_INFORM] = "INFORM"
+};
+
+static bool
+dhcp_relay_is_msg_type_supported(uint8_t msg_type)
+{
+    return (msg_type >= DHCP_MSG_DISCOVER && msg_type <= OVN_DHCP_MSG_RELEASE);
+}
+
+static const char *dhcp_msg_str_get(uint8_t msg_type)
+{
+    if (!dhcp_relay_is_msg_type_supported(msg_type)) {
+        return "INVALID";
+    }
+    return dhcp_msg_str[msg_type];
+}
+
+/* Called within the pinctrl_handler thread context. */
+static void
+pinctrl_handle_dhcp_relay_req(
+    struct rconn *swconn,
+    struct dp_packet *pkt_in, struct ofputil_packet_in *pin,
+    struct ofpbuf *userdata,
+    struct ofpbuf *continuation)
+{
+    enum ofp_version version = rconn_get_version(swconn);
+    enum ofputil_protocol proto = ofputil_protocol_from_ofp_version(version);
+    struct dp_packet *pkt_out_ptr = NULL;
+
+    /* Parse relay IP and server IP. */
+    ovs_be32 *relay_ip = ofpbuf_try_pull(userdata, sizeof *relay_ip);
+    ovs_be32 *server_ip = ofpbuf_try_pull(userdata, sizeof *server_ip);
+    if (!relay_ip || !server_ip) {
+        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
+        VLOG_WARN_RL(&rl, "DHCP_RELAY_REQ: relay ip or server ip "
+                  "not present in the userdata");
+        return;
+    }
+
+    /* Validate the DHCP request packet.
+     * Format of the DHCP packet is
+     * ------------------------------------------------------------------------
+     *| UDP HEADER  | DHCP HEADER  | 4 Byte DHCP Cookie | DHCP OPTIONS(var len)|
+     * ------------------------------------------------------------------------
+     */
+
+    size_t in_l4_size = dp_packet_l4_size(pkt_in);
+    const char *end = (char *) dp_packet_l4(pkt_in) + in_l4_size;
+    const char *in_dhcp_ptr = dp_packet_get_udp_payload(pkt_in);
+    if (!in_dhcp_ptr) {
+        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
+        VLOG_WARN_RL(&rl, "DHCP_RELAY_REQ: invalid or incomplete "
+                  "DHCP packet received");
+        return;
+    }
+
+    const struct dhcp_header *in_dhcp_data
+        = (const struct dhcp_header *) in_dhcp_ptr;
+    in_dhcp_ptr += sizeof *in_dhcp_data;
+    if (in_dhcp_ptr > end) {
+        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
+        VLOG_WARN_RL(&rl, "DHCP_RELAY_REQ: invalid or incomplete "
+                "DHCP packet received, bad data length");
+        return;
+    }
+    if (in_dhcp_data->op != DHCP_OP_REQUEST) {
+        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
+        VLOG_WARN_RL(&rl, "DHCP_RELAY_REQ: invalid opcode in the "
+                "DHCP packet: %d", in_dhcp_data->op);
+        return;
+    }
+
+    /* DHCP options follow the DHCP header. The first 4 bytes of the DHCP
+     * options is the DHCP magic cookie followed by the actual DHCP options.
+     */
+    ovs_be32 magic_cookie = htonl(DHCP_MAGIC_COOKIE);
+    if (in_dhcp_ptr + sizeof magic_cookie > end ||
+        get_unaligned_be32((const void *) in_dhcp_ptr) != magic_cookie) {
+        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
+        VLOG_WARN_RL(&rl, "DHCP_RELAY_REQ: magic cookie not present "
+                "in the packet");
+        return;
+    }
+
+    if (in_dhcp_data->giaddr) {
+        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
+        VLOG_WARN_RL(&rl, "DHCP_RELAY_REQ: giaddr is already set");
+        return;
+    }
+
+    if (in_dhcp_data->htype != 0x1) {
+        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
+        VLOG_WARN_RL(&rl, "DHCP_RELAY_REQ: packet is received with "
+                "unsupported hardware type");
+        return;
+    }
+
+    ovs_be32 *server_id_ptr = NULL;
+    const uint8_t *in_dhcp_msg_type = NULL;
+
+    in_dhcp_ptr += sizeof magic_cookie;
+    ovs_be32 request_ip = in_dhcp_data->ciaddr;
+    while (in_dhcp_ptr < end) {
+        const struct dhcp_opt_header *in_dhcp_opt =
+            (const struct dhcp_opt_header *) in_dhcp_ptr;
+        if (in_dhcp_opt->code == DHCP_OPT_END) {
+            break;
+        }
+        if (in_dhcp_opt->code == DHCP_OPT_PAD) {
+            in_dhcp_ptr += 1;
+            continue;
+        }
+        in_dhcp_ptr += sizeof *in_dhcp_opt;
+        if (in_dhcp_ptr > end) {
+            break;
+        }
+        in_dhcp_ptr += in_dhcp_opt->len;
+        if (in_dhcp_ptr > end) {
+            break;
+        }
+
+        switch (in_dhcp_opt->code) {
+        case DHCP_OPT_MSG_TYPE:
+            if (in_dhcp_opt->len == 1) {
+                in_dhcp_msg_type = DHCP_OPT_PAYLOAD(in_dhcp_opt);
+            }
+            break;
+        case DHCP_OPT_REQ_IP:
+            if (in_dhcp_opt->len == 4) {
+                request_ip = get_unaligned_be32(DHCP_OPT_PAYLOAD(in_dhcp_opt));
+            }
+            break;
+        /* Server Identifier */
+        case OVN_DHCP_OPT_CODE_SERVER_ID:
+            if (in_dhcp_opt->len == 4) {
+                server_id_ptr = DHCP_OPT_PAYLOAD(in_dhcp_opt);
+            }
+            break;
+        default:
+            break;
+        }
+    }
+
+    /* Check whether the DHCP Message Type (opt 53) is present or not */
+    if (!in_dhcp_msg_type) {
+        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
+        VLOG_WARN_RL(&rl, "DHCP_RELAY_REQ: missing message type");
+        return;
+    }
+
+    /* Relay the DHCP request packet */
+    uint16_t new_l4_size = in_l4_size;
+    size_t new_packet_size = pkt_in->l4_ofs + new_l4_size;
+
+    struct dp_packet pkt_out;
+    dp_packet_init(&pkt_out, new_packet_size);
+    dp_packet_clear(&pkt_out);
+    dp_packet_prealloc_tailroom(&pkt_out, new_packet_size);
+    pkt_out_ptr = &pkt_out;
+
+    /* Copy the L2 and L3 headers from pkt_in, as they remain the same. */
+    dp_packet_put(
+        &pkt_out, dp_packet_pull(pkt_in, pkt_in->l4_ofs), pkt_in->l4_ofs);
+
+    pkt_out.l2_5_ofs = pkt_in->l2_5_ofs;
+    pkt_out.l2_pad_size = pkt_in->l2_pad_size;
+    pkt_out.l3_ofs = pkt_in->l3_ofs;
+    pkt_out.l4_ofs = pkt_in->l4_ofs;
+
+    struct ip_header *out_ip = dp_packet_l3(&pkt_out);
+
+    struct udp_header *udp = dp_packet_put(
+        &pkt_out, dp_packet_pull(pkt_in, UDP_HEADER_LEN), UDP_HEADER_LEN);
+
+    struct dhcp_header *dhcp_data = dp_packet_put(&pkt_out,
+        dp_packet_pull(pkt_in, new_l4_size - UDP_HEADER_LEN),
+        new_l4_size - UDP_HEADER_LEN);
+    dhcp_data->giaddr = *relay_ip;
+    if (udp->udp_csum) {
+        udp->udp_csum = recalc_csum32(udp->udp_csum,
+            0, dhcp_data->giaddr);
+    }
+    pin->packet = dp_packet_data(&pkt_out);
+    pin->packet_len = dp_packet_size(&pkt_out);
+
+    /* Log the DHCP message. */
+    static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(20, 40);
+    const struct eth_header *l2 = dp_packet_eth(&pkt_out);
+    VLOG_INFO_RL(&rl, "DHCP_RELAY_REQ:: MSG_TYPE:%s MAC:"ETH_ADDR_FMT
+                " XID:%u"
+                " REQ_IP:"IP_FMT
+                " GIADDR:"IP_FMT
+                " SERVER_ADDR:"IP_FMT,
+                dhcp_msg_str_get(*in_dhcp_msg_type),
+                ETH_ADDR_BYTES_ARGS(dhcp_data->chaddr), ntohl(dhcp_data->xid),
+                IP_ARGS(request_ip), IP_ARGS(dhcp_data->giaddr),
+                IP_ARGS(*server_ip));
+    queue_msg(swconn, ofputil_encode_resume(pin, continuation, proto));
+    if (pkt_out_ptr) {
+        dp_packet_uninit(pkt_out_ptr);
+    }
+}
+
+/* Called within the pinctrl_handler thread context. */
+static void
+pinctrl_handle_dhcp_relay_resp_fwd(
+    struct rconn *swconn,
+    struct dp_packet *pkt_in, struct ofputil_packet_in *pin,
+    struct ofpbuf *userdata,
+    struct ofpbuf *continuation)
+{
+    enum ofp_version version = rconn_get_version(swconn);
+    enum ofputil_protocol proto = ofputil_protocol_from_ofp_version(version);
+    struct dp_packet *pkt_out_ptr = NULL;
+
+    /* Parse relay IP and server IP. */
+    ovs_be32 *relay_ip = ofpbuf_try_pull(userdata, sizeof *relay_ip);
+    ovs_be32 *server_ip = ofpbuf_try_pull(userdata, sizeof *server_ip);
+    if (!relay_ip || !server_ip) {
+        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
+        VLOG_WARN_RL(&rl, "DHCP_RELAY_RESP: relay ip or server ip "
+                "not present in the userdata");
+        return;
+    }
+
+    /* Validate the DHCP response packet.
+     * Format of the DHCP packet is
+     * ------------------------------------------------------------------------
+     *| UDP HEADER  | DHCP HEADER  | 4 Byte DHCP Cookie | DHCP OPTIONS(var len)|
+     * ------------------------------------------------------------------------
+     */
+
+    size_t in_l4_size = dp_packet_l4_size(pkt_in);
+    const char *end = (char *) dp_packet_l4(pkt_in) + in_l4_size;
+    const char *in_dhcp_ptr = dp_packet_get_udp_payload(pkt_in);
+    if (!in_dhcp_ptr) {
+        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
+        VLOG_WARN_RL(&rl, "DHCP_RELAY_RESP_FWD: invalid or incomplete "
+                "packet received");
+        return;
+    }
+
+    const struct dhcp_header *in_dhcp_data
+        = (const struct dhcp_header *) in_dhcp_ptr;
+    in_dhcp_ptr += sizeof *in_dhcp_data;
+    if (in_dhcp_ptr > end) {
+        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
+        VLOG_WARN_RL(&rl, "DHCP_RELAY_RESP_FWD: invalid or incomplete "
+                    "packet received, bad data length");
+        return;
+    }
+    if (in_dhcp_data->op != DHCP_OP_REPLY) {
+        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
+        VLOG_WARN_RL(&rl, "DHCP_RELAY_RESP_FWD: invalid opcode "
+                "in the packet: %d", in_dhcp_data->op);
+        return;
+    }
+
+    /* DHCP options follow the DHCP header. The first 4 bytes of the DHCP
+     * options is the DHCP magic cookie followed by the actual DHCP options.
+     */
+    ovs_be32 magic_cookie = htonl(DHCP_MAGIC_COOKIE);
+    if (in_dhcp_ptr + sizeof magic_cookie > end ||
+        get_unaligned_be32((const void *) in_dhcp_ptr) != magic_cookie) {
+        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
+        VLOG_WARN_RL(&rl, "DHCP_RELAY_RESP_FWD: magic cookie not present "
+                "in the packet");
+        return;
+    }
+
+    if (!in_dhcp_data->giaddr) {
+        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
+        VLOG_WARN_RL(&rl, "DHCP_RELAY_RESP_FWD: giaddr is "
+                    "not set in response");
+        return;
+    }
+    ovs_be32 giaddr = in_dhcp_data->giaddr;
+
+    ovs_be32 *server_id_ptr = NULL;
+    ovs_be32 lease_time = 0;
+    const uint8_t *in_dhcp_msg_type = NULL;
+
+    in_dhcp_ptr += sizeof magic_cookie;
+    while (in_dhcp_ptr < end) {
+        const struct dhcp_opt_header *in_dhcp_opt =
+            (const struct dhcp_opt_header *) in_dhcp_ptr;
+        if (in_dhcp_opt->code == DHCP_OPT_END) {
+            break;
+        }
+        if (in_dhcp_opt->code == DHCP_OPT_PAD) {
+            in_dhcp_ptr += 1;
+            continue;
+        }
+        in_dhcp_ptr += sizeof *in_dhcp_opt;
+        if (in_dhcp_ptr > end) {
+            break;
+        }
+        in_dhcp_ptr += in_dhcp_opt->len;
+        if (in_dhcp_ptr > end) {
+            break;
+        }
+
+        switch (in_dhcp_opt->code) {
+        case DHCP_OPT_MSG_TYPE:
+            if (in_dhcp_opt->len == 1) {
+                in_dhcp_msg_type = DHCP_OPT_PAYLOAD(in_dhcp_opt);
+            }
+            break;
+        /* Server Identifier */
+        case OVN_DHCP_OPT_CODE_SERVER_ID:
+            if (in_dhcp_opt->len == 4) {
+                server_id_ptr = DHCP_OPT_PAYLOAD(in_dhcp_opt);
+            }
+            break;
+        case OVN_DHCP_OPT_CODE_LEASE_TIME:
+            if (in_dhcp_opt->len == 4) {
+                lease_time = get_unaligned_be32(DHCP_OPT_PAYLOAD(in_dhcp_opt));
+            }
+            break;
+        default:
+            break;
+        }
+    }
+
+    /* Check whether the DHCP Message Type (opt 53) is present or not */
+    if (!in_dhcp_msg_type) {
+        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
+        VLOG_WARN_RL(&rl, "DHCP_RELAY_RESP: missing message type");
+        return;
+    }
+
+    if (!server_id_ptr) {
+        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
+        VLOG_WARN_RL(&rl, "DHCP_RELAY_RESP: missing server identifier");
+        return;
+    }
+
+    if (*server_id_ptr != *server_ip) {
+        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
+        VLOG_WARN_RL(&rl, "DHCP_RELAY_RESP: server identifier mismatch");
+        return;
+    }
+
+    if (giaddr != *relay_ip) {
+        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
+        VLOG_WARN_RL(&rl, "DHCP_RELAY_RESP: giaddr mismatch");
+        return;
+    }
+
+
+    /* Update destination MAC & IP so that the packet is forwarded to the
+     * right destination node.
+     */
+    uint16_t new_l4_size = in_l4_size;
+    size_t new_packet_size = pkt_in->l4_ofs + new_l4_size;
+
+    struct dp_packet pkt_out;
+    dp_packet_init(&pkt_out, new_packet_size);
+    dp_packet_clear(&pkt_out);
+    dp_packet_prealloc_tailroom(&pkt_out, new_packet_size);
+    pkt_out_ptr = &pkt_out;
+
+    /* Copy the L2 and L3 headers from pkt_in, as they remain the same. */
+    struct eth_header *eth = dp_packet_put(
+        &pkt_out, dp_packet_pull(pkt_in, pkt_in->l4_ofs), pkt_in->l4_ofs);
+
+    pkt_out.l2_5_ofs = pkt_in->l2_5_ofs;
+    pkt_out.l2_pad_size = pkt_in->l2_pad_size;
+    pkt_out.l3_ofs = pkt_in->l3_ofs;
+    pkt_out.l4_ofs = pkt_in->l4_ofs;
+
+    struct udp_header *udp = dp_packet_put(
+        &pkt_out, dp_packet_pull(pkt_in, UDP_HEADER_LEN), UDP_HEADER_LEN);
+
+    struct dhcp_header *dhcp_data = dp_packet_put(
+        &pkt_out, dp_packet_pull(pkt_in, new_l4_size - UDP_HEADER_LEN),
+        new_l4_size - UDP_HEADER_LEN);
+    memcpy(&eth->eth_dst, dhcp_data->chaddr, sizeof(eth->eth_dst));
+
+    /* Send a broadcast IP frame when BROADCAST flag is set. */
+    struct ip_header *out_ip = dp_packet_l3(&pkt_out);
+    ovs_be32 ip_dst;
+    ovs_be32 ip_dst_orig = get_16aligned_be32(&out_ip->ip_dst);
+    if (!is_dhcp_flags_broadcast(dhcp_data->flags)) {
+        ip_dst = dhcp_data->yiaddr;
+    } else {
+        ip_dst = htonl(0xffffffff);
+    }
+    put_16aligned_be32(&out_ip->ip_dst, ip_dst);
+    out_ip->ip_csum = recalc_csum32(out_ip->ip_csum,
+              ip_dst_orig, ip_dst);
+    if (udp->udp_csum) {
+        udp->udp_csum = recalc_csum32(udp->udp_csum,
+            ip_dst_orig, ip_dst);
+    }
+    /* Reset giaddr */
+    dhcp_data->giaddr = htonl(0x0);
+    if (udp->udp_csum) {
+        udp->udp_csum = recalc_csum32(udp->udp_csum,
+            giaddr, 0);
+    }
+    pin->packet = dp_packet_data(&pkt_out);
+    pin->packet_len = dp_packet_size(&pkt_out);
+
+    /* Log the DHCP message. */
+    static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(20, 40);
+    const struct eth_header *l2 = dp_packet_eth(&pkt_out);
+    VLOG_INFO_RL(&rl, "DHCP_RELAY_RESP_FWD:: MSG_TYPE:%s MAC:"ETH_ADDR_FMT
+             " XID:%u"
+             " YIADDR:"IP_FMT
+             " GIADDR:"IP_FMT
+             " SERVER_ADDR:"IP_FMT,
+             dhcp_msg_str_get(*in_dhcp_msg_type),
+             ETH_ADDR_BYTES_ARGS(dhcp_data->chaddr), ntohl(dhcp_data->xid),
+             IP_ARGS(dhcp_data->yiaddr),
+             IP_ARGS(giaddr), IP_ARGS(*server_id_ptr));
+    queue_msg(swconn, ofputil_encode_resume(pin, continuation, proto));
+    if (pkt_out_ptr) {
+        dp_packet_uninit(pkt_out_ptr);
+    }
+}
+
 /* Called with in the pinctrl_handler thread context. */
 static void
 pinctrl_handle_put_dhcp_opts(
@@ -3203,6 +3634,16 @@  process_packet_in(struct rconn *swconn, const struct ofp_header *msg)
         ovs_mutex_unlock(&pinctrl_mutex);
         break;
 
+    case ACTION_OPCODE_DHCP_RELAY_REQ:
+        pinctrl_handle_dhcp_relay_req(swconn, &packet, &pin,
+                                     &userdata, &continuation);
+        break;
+
+    case ACTION_OPCODE_DHCP_RELAY_RESP_FWD:
+        pinctrl_handle_dhcp_relay_resp_fwd(swconn, &packet, &pin,
+                                     &userdata, &continuation);
+        break;
+
     case ACTION_OPCODE_PUT_DHCP_OPTS:
         pinctrl_handle_put_dhcp_opts(swconn, &packet, &pin, &headers,
                                      &userdata, &continuation);
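
Reviewer note on the option loops in pinctrl_handle_dhcp_relay_req() and pinctrl_handle_dhcp_relay_resp_fwd() above: both walk the DHCP options as code/length/payload triples, treating PAD (0) as a single byte and END (255) as the terminator. The following is a minimal standalone sketch of that walk, not code from the patch. The helper name dhcp_opt_find() and the OPT_* constants are hypothetical, and it operates on a plain byte buffer instead of a dp_packet.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define OPT_PAD 0     /* One-byte padding option, carries no length field. */
#define OPT_END 255   /* Terminates the option list. */

/* Hypothetical helper: returns a pointer to the payload of the first option
 * whose code matches, or NULL if the option is absent or the buffer is
 * truncated.  On success, *len_out holds the payload length. */
static const uint8_t *
dhcp_opt_find(const uint8_t *opts, size_t size, uint8_t code,
              uint8_t *len_out)
{
    size_t i = 0;

    while (i < size) {
        uint8_t cur = opts[i];

        if (cur == OPT_END) {
            break;
        }
        if (cur == OPT_PAD) {
            i++;
            continue;
        }
        /* Both the code and length bytes must be present. */
        if (size - i < 2) {
            break;
        }
        uint8_t len = opts[i + 1];
        /* The payload must fit in what is left of the buffer. */
        if (size - i - 2 < len) {
            break;
        }
        if (cur == code) {
            *len_out = len;
            return &opts[i + 2];
        }
        i += 2 + (size_t) len;
    }
    return NULL;
}
```

A caller would look up option 53 (message type) and option 54 (server identifier) this way and check the returned length (1 and 4 respectively) before reading the payload, which is what the handlers above do inline.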
diff --git a/include/ovn/actions.h b/include/ovn/actions.h
index 49cfe0624..47d41b90f 100644
--- a/include/ovn/actions.h
+++ b/include/ovn/actions.h
@@ -95,6 +95,8 @@  struct collector_set_ids;
     OVNACT(LOOKUP_ND_IP,      ovnact_lookup_mac_bind_ip) \
     OVNACT(PUT_DHCPV4_OPTS,   ovnact_put_opts)        \
     OVNACT(PUT_DHCPV6_OPTS,   ovnact_put_opts)        \
+    OVNACT(DHCPV4_RELAY_REQ,  ovnact_dhcp_relay)      \
+    OVNACT(DHCPV4_RELAY_RESP_FWD, ovnact_dhcp_relay)      \
     OVNACT(SET_QUEUE,         ovnact_set_queue)       \
     OVNACT(DNS_LOOKUP,        ovnact_result)          \
     OVNACT(LOG,               ovnact_log)             \
@@ -387,6 +389,14 @@  struct ovnact_put_opts {
     size_t n_options;
 };
 
+/* OVNACT_DHCP_RELAY. */
+struct ovnact_dhcp_relay {
+    struct ovnact ovnact;
+    int family;
+    ovs_be32 relay_ipv4;
+    ovs_be32 server_ipv4;
+};
+
 /* Valid arguments to SET_QUEUE action.
  *
  * QDISC_MIN_QUEUE_ID is the default queue, so user-defined queues should
@@ -750,6 +760,22 @@  enum action_opcode {
 
     /* multicast group split buffer action. */
     ACTION_OPCODE_MG_SPLIT_BUF,
+
+    /* "dhcp_relay_req(relay_ip, server_ip)".
+     *
+     * Arguments follow the action_header, in this format:
+     *   - The 32-bit DHCP relay IP.
+     *   - The 32-bit DHCP server IP.
+     */
+    ACTION_OPCODE_DHCP_RELAY_REQ,
+
+    /* "dhcp_relay_resp_fwd(relay_ip, server_ip)".
+     *
+     * Arguments follow the action_header, in this format:
+     *   - The 32-bit DHCP relay IP.
+     *   - The 32-bit DHCP server IP.
+     */
+    ACTION_OPCODE_DHCP_RELAY_RESP_FWD,
 };
 
 /* Header. */
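
Reviewer note on the two new opcodes above: their comments say the arguments that follow the action_header are the 32-bit relay IP and then the 32-bit server IP. A minimal sketch of that 8-byte layout is shown here, assuming plain byte buffers in place of ofpbuf. The names struct relay_args, relay_args_encode() and relay_args_decode() are illustrative and do not exist in OVN.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define RELAY_ARGS_LEN 8   /* 4-byte relay IP followed by 4-byte server IP. */

/* Illustrative container; both addresses stay in network byte order. */
struct relay_args {
    uint8_t relay_ip[4];
    uint8_t server_ip[4];
};

/* Writes the two addresses back to back.  Returns the number of bytes
 * written, or 0 if the buffer is too small. */
static size_t
relay_args_encode(uint8_t *buf, size_t size, const struct relay_args *args)
{
    if (size < RELAY_ARGS_LEN) {
        return 0;
    }
    memcpy(buf, args->relay_ip, 4);
    memcpy(buf + 4, args->server_ip, 4);
    return RELAY_ARGS_LEN;
}

/* Reads the two addresses.  Returns 0 on success, -1 if either address is
 * missing, which corresponds to the "relay ip or server ip not present in
 * the userdata" check in the handlers. */
static int
relay_args_decode(const uint8_t *buf, size_t size, struct relay_args *args)
{
    if (size < RELAY_ARGS_LEN) {
        return -1;
    }
    memcpy(args->relay_ip, buf, 4);
    memcpy(args->server_ip, buf + 4, 4);
    return 0;
}
```

Copying with memcpy avoids unaligned loads, since nothing guarantees the arguments are 4-byte aligned within the userdata.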
diff --git a/lib/actions.c b/lib/actions.c
index a73fe1a1e..69df428c6 100644
--- a/lib/actions.c
+++ b/lib/actions.c
@@ -2629,6 +2629,118 @@  ovnact_controller_event_free(struct ovnact_controller_event *event)
     free_gen_options(event->options, event->n_options);
 }
 
+static void
+format_DHCPV4_RELAY_REQ(const struct ovnact_dhcp_relay *dhcp_relay,
+                struct ds *s)
+{
+    ds_put_format(s, "dhcp_relay_req("IP_FMT","IP_FMT");",
+                  IP_ARGS(dhcp_relay->relay_ipv4),
+                  IP_ARGS(dhcp_relay->server_ipv4));
+}
+
+static void
+parse_dhcp_relay_req(struct action_context *ctx,
+               struct ovnact_dhcp_relay *dhcp_relay)
+{
+    /* Skip dhcp_relay_req( */
+    lexer_force_match(ctx->lexer, LEX_T_LPAREN);
+
+    /* Parse relay ip and server ip. */
+    if (ctx->lexer->token.format == LEX_F_IPV4) {
+        dhcp_relay->family = AF_INET;
+        dhcp_relay->relay_ipv4 = ctx->lexer->token.value.ipv4;
+        lexer_get(ctx->lexer);
+        lexer_match(ctx->lexer, LEX_T_COMMA);
+        if (ctx->lexer->token.format == LEX_F_IPV4) {
+            dhcp_relay->family = AF_INET;
+            dhcp_relay->server_ipv4 = ctx->lexer->token.value.ipv4;
+            lexer_get(ctx->lexer);
+        } else {
+            lexer_syntax_error(ctx->lexer, "expecting IPv4 dhcp server ip");
+            return;
+        }
+    } else {
+        lexer_syntax_error(ctx->lexer, "expecting IPv4 dhcp relay "
+                           "and server ips");
+        return;
+    }
+    lexer_force_match(ctx->lexer, LEX_T_RPAREN);
+}
+
+static void
+encode_DHCPV4_RELAY_REQ(const struct ovnact_dhcp_relay *dhcp_relay,
+                    const struct ovnact_encode_params *ep,
+                    struct ofpbuf *ofpacts)
+{
+    size_t oc_offset = encode_start_controller_op(ACTION_OPCODE_DHCP_RELAY_REQ,
+                                                  true, ep->ctrl_meter_id,
+                                                  ofpacts);
+    ofpbuf_put(ofpacts, &dhcp_relay->relay_ipv4,
+            sizeof(dhcp_relay->relay_ipv4));
+    ofpbuf_put(ofpacts, &dhcp_relay->server_ipv4,
+            sizeof(dhcp_relay->server_ipv4));
+    encode_finish_controller_op(oc_offset, ofpacts);
+}
+
+static void
+format_DHCPV4_RELAY_RESP_FWD(const struct ovnact_dhcp_relay *dhcp_relay,
+                    struct ds *s)
+{
+    ds_put_format(s, "dhcp_relay_resp_fwd("IP_FMT","IP_FMT");",
+                  IP_ARGS(dhcp_relay->relay_ipv4),
+                  IP_ARGS(dhcp_relay->server_ipv4));
+}
+
+static void
+parse_dhcp_relay_resp_fwd(struct action_context *ctx,
+               struct ovnact_dhcp_relay *dhcp_relay)
+{
+    /* Skip dhcp_relay_resp_fwd( */
+    lexer_force_match(ctx->lexer, LEX_T_LPAREN);
+
+    /* Parse relay ip and server ip. */
+    if (ctx->lexer->token.format == LEX_F_IPV4) {
+        dhcp_relay->family = AF_INET;
+        dhcp_relay->relay_ipv4 = ctx->lexer->token.value.ipv4;
+        lexer_get(ctx->lexer);
+        lexer_match(ctx->lexer, LEX_T_COMMA);
+        if (ctx->lexer->token.format == LEX_F_IPV4) {
+            dhcp_relay->family = AF_INET;
+            dhcp_relay->server_ipv4 = ctx->lexer->token.value.ipv4;
+            lexer_get(ctx->lexer);
+        } else {
+            lexer_syntax_error(ctx->lexer, "expecting IPv4 dhcp server ip");
+            return;
+        }
+    } else {
+        lexer_syntax_error(ctx->lexer, "expecting IPv4 dhcp relay and "
+                           "server ips");
+        return;
+    }
+    lexer_force_match(ctx->lexer, LEX_T_RPAREN);
+}
+
+static void
+encode_DHCPV4_RELAY_RESP_FWD(const struct ovnact_dhcp_relay *dhcp_relay,
+                    const struct ovnact_encode_params *ep,
+                    struct ofpbuf *ofpacts)
+{
+    size_t oc_offset = encode_start_controller_op(
+                                ACTION_OPCODE_DHCP_RELAY_RESP_FWD,
+                                true, ep->ctrl_meter_id,
+                                ofpacts);
+    ofpbuf_put(ofpacts, &dhcp_relay->relay_ipv4,
+                  sizeof(dhcp_relay->relay_ipv4));
+    ofpbuf_put(ofpacts, &dhcp_relay->server_ipv4,
+                  sizeof(dhcp_relay->server_ipv4));
+    encode_finish_controller_op(oc_offset, ofpacts);
+}
+
+static void ovnact_dhcp_relay_free(
+          struct ovnact_dhcp_relay *dhcp_relay OVS_UNUSED)
+{
+}
+
 static void
 parse_put_opts(struct action_context *ctx, const struct expr_field *dst,
                struct ovnact_put_opts *po, const struct hmap *gen_opts,
@@ -5451,6 +5563,11 @@  parse_action(struct action_context *ctx)
         parse_sample(ctx);
     } else if (lexer_match_id(ctx->lexer, "mac_cache_use")) {
         ovnact_put_MAC_CACHE_USE(ctx->ovnacts);
+    } else if (lexer_match_id(ctx->lexer, "dhcp_relay_req")) {
+        parse_dhcp_relay_req(ctx, ovnact_put_DHCPV4_RELAY_REQ(ctx->ovnacts));
+    } else if (lexer_match_id(ctx->lexer, "dhcp_relay_resp_fwd")) {
+        parse_dhcp_relay_resp_fwd(ctx,
+              ovnact_put_DHCPV4_RELAY_RESP_FWD(ctx->ovnacts));
     } else {
         lexer_syntax_error(ctx->lexer, "expecting action");
     }
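
Reviewer note on the format_*/parse_* pairs above: a formatted action is expected to parse back to the same action, so the name the formatter emits has to be exactly the keyword matched in parse_action(). The sketch below illustrates that round-trip property with plain snprintf()/sscanf(). It does not use the OVN lexer, and relay_action_format() and relay_action_parse() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Formats "<name>(<relay>,<server>);" and returns snprintf()'s result. */
static int
relay_action_format(char *buf, size_t size, const char *name,
                    const uint8_t relay[4], const uint8_t server[4])
{
    return snprintf(buf, size, "%s(%u.%u.%u.%u,%u.%u.%u.%u);", name,
                    (unsigned) relay[0], (unsigned) relay[1],
                    (unsigned) relay[2], (unsigned) relay[3],
                    (unsigned) server[0], (unsigned) server[1],
                    (unsigned) server[2], (unsigned) server[3]);
}

/* Parses a string produced by relay_action_format().  Returns 0 on success,
 * -1 if the keyword differs or the arguments are malformed. */
static int
relay_action_parse(const char *s, const char *name,
                   uint8_t relay[4], uint8_t server[4])
{
    size_t n = strlen(name);
    unsigned int v[8];
    int consumed = 0;

    /* The keyword must match exactly and be followed by '('. */
    if (strncmp(s, name, n) != 0 || s[n] != '(') {
        return -1;
    }
    /* %n is only stored if the trailing ");" matched as well. */
    if (sscanf(s + n, "(%u.%u.%u.%u,%u.%u.%u.%u);%n",
               &v[0], &v[1], &v[2], &v[3],
               &v[4], &v[5], &v[6], &v[7], &consumed) != 8
        || consumed == 0 || s[n + consumed] != '\0') {
        return -1;
    }
    for (int i = 0; i < 8; i++) {
        if (v[i] > 255) {
            return -1;
        }
    }
    for (int i = 0; i < 4; i++) {
        relay[i] = (uint8_t) v[i];
        server[i] = (uint8_t) v[i + 4];
    }
    return 0;
}
```

With this sketch, a string formatted under the name "dhcp_relay_resp" is rejected when parsed with the keyword "dhcp_relay_resp_fwd", which is the kind of mismatch a round-trip test of the real actions would catch.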
diff --git a/lib/ovn-l7.h b/lib/ovn-l7.h
index ad514a922..e08581123 100644
--- a/lib/ovn-l7.h
+++ b/lib/ovn-l7.h
@@ -69,6 +69,7 @@  struct gen_opts_map {
  */
 #define OVN_DHCP_OPT_CODE_NETMASK      1
 #define OVN_DHCP_OPT_CODE_LEASE_TIME   51
+#define OVN_DHCP_OPT_CODE_SERVER_ID    54
 #define OVN_DHCP_OPT_CODE_T1           58
 #define OVN_DHCP_OPT_CODE_T2           59
 
diff --git a/northd/northd.c b/northd/northd.c
index 07dffb15a..7ac831fae 100644
--- a/northd/northd.c
+++ b/northd/northd.c
@@ -181,11 +181,13 @@  enum ovn_stage {
     PIPELINE_STAGE(ROUTER, IN,  IP_ROUTING_ECMP, 14, "lr_in_ip_routing_ecmp") \
     PIPELINE_STAGE(ROUTER, IN,  POLICY,          15, "lr_in_policy")          \
     PIPELINE_STAGE(ROUTER, IN,  POLICY_ECMP,     16, "lr_in_policy_ecmp")     \
-    PIPELINE_STAGE(ROUTER, IN,  ARP_RESOLVE,     17, "lr_in_arp_resolve")     \
-    PIPELINE_STAGE(ROUTER, IN,  CHK_PKT_LEN,     18, "lr_in_chk_pkt_len")     \
-    PIPELINE_STAGE(ROUTER, IN,  LARGER_PKTS,     19, "lr_in_larger_pkts")     \
-    PIPELINE_STAGE(ROUTER, IN,  GW_REDIRECT,     20, "lr_in_gw_redirect")     \
-    PIPELINE_STAGE(ROUTER, IN,  ARP_REQUEST,     21, "lr_in_arp_request")     \
+    PIPELINE_STAGE(ROUTER, IN,  DHCP_RELAY_RESP_FWD, 17,                      \
+                  "lr_in_dhcp_relay_resp_fwd")                                \
+    PIPELINE_STAGE(ROUTER, IN,  ARP_RESOLVE,     18, "lr_in_arp_resolve")     \
+    PIPELINE_STAGE(ROUTER, IN,  CHK_PKT_LEN,     19, "lr_in_chk_pkt_len")     \
+    PIPELINE_STAGE(ROUTER, IN,  LARGER_PKTS,     20, "lr_in_larger_pkts")     \
+    PIPELINE_STAGE(ROUTER, IN,  GW_REDIRECT,     21, "lr_in_gw_redirect")     \
+    PIPELINE_STAGE(ROUTER, IN,  ARP_REQUEST,     22, "lr_in_arp_request")     \
                                                                       \
     /* Logical router egress stages. */                               \
     PIPELINE_STAGE(ROUTER, OUT, CHECK_DNAT_LOCAL,   0,                       \
@@ -9610,6 +9612,80 @@  build_dhcpv6_options_flows(struct ovn_port *op,
     ds_destroy(&match);
 }
 
+static void
+build_lswitch_dhcp_relay_flows(struct ovn_port *op,
+                           const struct hmap *lr_ports,
+                           const struct hmap *lflows,
+                           const struct shash *meter_groups OVS_UNUSED)
+{
+    if (op->nbrp || !op->nbsp) {
+        return;
+    }
+    /* consider only ports attached to VMs */
+    if (strcmp(op->nbsp->type, "")) {
+        return;
+    }
+
+    if (!op->od || !op->od->n_router_ports ||
+        !op->od->nbs || !op->od->nbs->dhcp_relay_port) {
+        return;
+    }
+
+    struct ds match = DS_EMPTY_INITIALIZER;
+    struct ds action = DS_EMPTY_INITIALIZER;
+    struct nbrec_logical_router_port *lrp = op->od->nbs->dhcp_relay_port;
+    struct ovn_port *rp = ovn_port_find(lr_ports, lrp->name);
+
+    if (!rp || !rp->nbrp || !rp->nbrp->dhcp_relay) {
+        return;
+    }
+
+    struct ovn_port *sp = NULL;
+    struct nbrec_dhcp_relay *dhcp_relay = rp->nbrp->dhcp_relay;
+
+    for (int i = 0; i < op->od->n_router_ports; i++) {
+        struct ovn_port *sp_tmp = op->od->router_ports[i];
+        if (sp_tmp->peer == rp) {
+            sp = sp_tmp;
+            break;
+        }
+    }
+    if (!sp) {
+        return;
+    }
+
+    char *server_ip_str = NULL;
+    uint16_t port;
+    int addr_family;
+    struct in6_addr server_ip;
+
+    if (!ip_address_and_port_from_lb_key(dhcp_relay->servers, &server_ip_str,
+                                         &server_ip, &port, &addr_family)) {
+        return;
+    }
+
+    if (server_ip_str == NULL) {
+        return;
+    }
+
+    ds_put_format(
+        &match, "inport == %s && eth.src == %s && "
+        "ip4.src == 0.0.0.0 && ip4.dst == 255.255.255.255 && "
+        "udp.src == 68 && udp.dst == 67",
+        op->json_key, op->lsp_addrs[0].ea_s);
+    ds_put_format(&action,
+                  "eth.dst=%s;outport=%s;next;/* DHCP_RELAY_REQ */",
+                  rp->lrp_networks.ea_s, sp->json_key);
+    ovn_lflow_add_with_hint__(lflows, op->od,
+                              S_SWITCH_IN_L2_LKUP, 100,
+                              ds_cstr(&match),
+                              ds_cstr(&action),
+                              op->key,
+                              NULL,
+                              &lrp->header_);
+    free(server_ip_str);
+}
+
 static void
 build_drop_arp_nd_flows_for_unbound_router_ports(struct ovn_port *op,
                                                  const struct ovn_port *port,
@@ -10181,6 +10257,13 @@  build_lswitch_dhcp_options_and_response(struct ovn_port *op,
         return;
     }
 
+    if (op->od && op->od->nbs
+        && op->od->nbs->dhcp_relay_port) {
+        /* Don't add the DHCP server flows if DHCP Relay is enabled on the
+         * logical switch. */
+        return;
+    }
+
     bool is_external = lsp_is_external(op->nbsp);
     if (is_external && (!op->od->n_localnet_ports ||
                         !op->nbsp->ha_chassis_group)) {
@@ -14458,6 +14541,86 @@  build_dhcpv6_reply_flows_for_lrouter_port(
     }
 }
 
+static void
+build_dhcp_relay_flows_for_lrouter_port(
+        struct ovn_port *op, struct hmap *lflows,
+        struct ds *match)
+{
+    if (!op->nbrp || !op->nbrp->dhcp_relay) {
+        return;
+    }
+    struct nbrec_dhcp_relay *dhcp_relay = op->nbrp->dhcp_relay;
+    if (!dhcp_relay->servers) {
+        return;
+    }
+
+    int addr_family;
+    /* Custom server ports are not currently supported. */
+    uint16_t port;
+    char *server_ip_str = NULL;
+    struct in6_addr server_ip;
+
+    if (!ip_address_and_port_from_lb_key(dhcp_relay->servers, &server_ip_str,
+                                         &server_ip, &port, &addr_family)) {
+        return;
+    }
+
+    if (addr_family != AF_INET || !op->lrp_networks.n_ipv4_addrs) {
+        free(server_ip_str);
+        return;
+    }
+    struct ds dhcp_action = DS_EMPTY_INITIALIZER;
+    ds_clear(match);
+    ds_put_format(
+        match, "inport == %s && "
+        "ip4.src == 0.0.0.0 && ip4.dst == 255.255.255.255 && "
+        "udp.src == 68 && udp.dst == 67",
+        op->json_key);
+    ds_put_format(&dhcp_action,
+                "dhcp_relay_req(%s,%s);"
+                "ip4.src=%s;ip4.dst=%s;udp.src=67;next; /* DHCP_RELAY_REQ */",
+                op->lrp_networks.ipv4_addrs[0].addr_s, server_ip_str,
+                op->lrp_networks.ipv4_addrs[0].addr_s, server_ip_str);
+
+    ovn_lflow_add_with_hint(lflows, op->od, S_ROUTER_IN_IP_INPUT, 110,
+                            ds_cstr(match), ds_cstr(&dhcp_action),
+                            &op->nbrp->header_);
+
+    ds_clear(match);
+    ds_clear(&dhcp_action);
+
+    ds_put_format(
+        match, "ip4.src == %s && ip4.dst == %s && "
+        "udp.src == 67 && udp.dst == 67",
+        server_ip_str, op->lrp_networks.ipv4_addrs[0].addr_s);
+    ds_put_format(&dhcp_action, "next;/* DHCP_RELAY_RESP */");
+    ovn_lflow_add_with_hint(lflows, op->od, S_ROUTER_IN_IP_INPUT, 110,
+                            ds_cstr(match), ds_cstr(&dhcp_action),
+                            &op->nbrp->header_);
+
+    ds_clear(match);
+    ds_clear(&dhcp_action);
+
+    ds_put_format(
+        match, "ip4.src == %s && ip4.dst == %s && "
+        "udp.src == 67 && udp.dst == 67",
+        server_ip_str, op->lrp_networks.ipv4_addrs[0].addr_s);
+    ds_put_format(&dhcp_action,
+          "dhcp_relay_resp_fwd(%s,%s);ip4.src=%s;udp.dst=68;"
+          "outport=%s;output; /* DHCP_RELAY_RESP */",
+          op->lrp_networks.ipv4_addrs[0].addr_s, server_ip_str,
+          op->lrp_networks.ipv4_addrs[0].addr_s, op->json_key);
+    ovn_lflow_add_with_hint(lflows, op->od, S_ROUTER_IN_DHCP_RELAY_RESP_FWD,
+                            110,
+                            ds_cstr(match), ds_cstr(&dhcp_action),
+                            &op->nbrp->header_);
+
+    ds_clear(match);
+    ds_destroy(&dhcp_action);
+
+    free(server_ip_str);
+}
+
 static void
 build_ipv6_input_flows_for_lrouter_port(
         struct ovn_port *op, struct hmap *lflows,
@@ -15673,6 +15836,8 @@  build_lrouter_nat_defrag_and_lb(struct ovn_datapath *od, struct hmap *lflows,
     ovn_lflow_add(lflows, od, S_ROUTER_OUT_POST_SNAT, 0, "1", "next;");
     ovn_lflow_add(lflows, od, S_ROUTER_OUT_EGR_LOOP, 0, "1", "next;");
     ovn_lflow_add(lflows, od, S_ROUTER_IN_ECMP_STATEFUL, 0, "1", "next;");
+    ovn_lflow_add(lflows, od, S_ROUTER_IN_DHCP_RELAY_RESP_FWD, 0, "1",
+                  "next;");
 
     const char *ct_flag_reg = features->ct_no_masked_label
                               ? "ct_mark"
@@ -16154,6 +16319,7 @@  build_lswitch_and_lrouter_iterate_by_lsp(struct ovn_port *op,
     build_lswitch_dhcp_options_and_response(op, lflows, meter_groups);
     build_lswitch_external_port(op, lflows);
     build_lswitch_ip_unicast_lookup(op, lflows, actions, match);
+    build_lswitch_dhcp_relay_flows(op, lr_ports, lflows, meter_groups);
 
     /* Build Logical Router Flows. */
     build_ip_routing_flows_for_router_type_lsp(op, lr_ports, lflows);
@@ -16183,6 +16349,7 @@  build_lswitch_and_lrouter_iterate_by_lrp(struct ovn_port *op,
     build_egress_delivery_flows_for_lrouter_port(op, lsi->lflows, &lsi->match,
                                                  &lsi->actions);
     build_dhcpv6_reply_flows_for_lrouter_port(op, lsi->lflows, &lsi->match);
+    build_dhcp_relay_flows_for_lrouter_port(op, lsi->lflows, &lsi->match);
     build_ipv6_input_flows_for_lrouter_port(op, lsi->lflows,
                                             &lsi->match, &lsi->actions,
                                             lsi->meter_groups);
diff --git a/ovn-nb.ovsschema b/ovn-nb.ovsschema
index b2e0993e0..6863d52cd 100644
--- a/ovn-nb.ovsschema
+++ b/ovn-nb.ovsschema
@@ -1,7 +1,7 @@ 
 {
     "name": "OVN_Northbound",
-    "version": "7.2.0",
-    "cksum": "1069338687 34162",
+    "version": "7.3.0",
+    "cksum": "2325497400 35185",
     "tables": {
         "NB_Global": {
             "columns": {
@@ -89,7 +89,12 @@ 
                     "type": {"key": {"type": "uuid",
                                      "refTable": "Forwarding_Group",
                                      "refType": "strong"},
-                                     "min": 0, "max": "unlimited"}}},
+                                     "min": 0, "max": "unlimited"}},
+                "dhcp_relay_port": {"type": {"key": {"type": "uuid",
+                                            "refTable": "Logical_Router_Port",
+                                            "refType": "weak"},
+                                            "min": 0,
+                                            "max": 1}}},
             "isRoot": true},
         "Logical_Switch_Port": {
             "columns": {
@@ -436,6 +441,11 @@ 
                 "ipv6_prefix": {"type": {"key": "string",
                                       "min": 0,
                                       "max": "unlimited"}},
+                "dhcp_relay": {"type": {"key": {"type": "uuid",
+                                            "refTable": "DHCP_Relay",
+                                            "refType": "weak"},
+                                            "min": 0,
+                                            "max": 1}},
                 "external_ids": {
                     "type": {"key": "string", "value": "string",
                              "min": 0, "max": "unlimited"}},
@@ -529,6 +539,15 @@ 
                     "type": {"key": "string", "value": "string",
                              "min": 0, "max": "unlimited"}}},
             "isRoot": true},
+        "DHCP_Relay": {
+            "columns": {
+                "servers": {"type": {"key": "string",
+                                       "min": 0,
+                                       "max": 1}},
+                "external_ids": {
+                    "type": {"key": "string", "value": "string",
+                             "min": 0, "max": "unlimited"}}},
+            "isRoot": true},
         "Connection": {
             "columns": {
                 "target": {"type": "string"},
diff --git a/ovn-nb.xml b/ovn-nb.xml
index fcb1c6ecc..dc20892e1 100644
--- a/ovn-nb.xml
+++ b/ovn-nb.xml
@@ -608,6 +608,11 @@ 
       Please see the <ref table="DNS"/> table.
     </column>
 
+    <column name="dhcp_relay_port">
+      This column defines the <ref table="Logical_Router_Port"/> on which
+      DHCP relay is enabled.
+    </column>
+
     <column name="forwarding_groups">
       Groups a set of logical port endpoints for traffic going out of the
       logical switch.
@@ -2980,6 +2985,11 @@  or
       port has all ingress and egress traffic dropped.
     </column>
 
+    <column name="dhcp_relay">
+      This column is used to enable DHCP Relay. Please refer
+      to the <ref table="DHCP_Relay"/> table.
+    </column>
+
     <group title="Distributed Gateway Ports">
       <p>
         Gateways, as documented under <code>Gateways</code> in the OVN
@@ -4286,6 +4296,24 @@  or
     </group>
   </table>
 
+  <table name="DHCP_Relay" title="DHCP Relay">
+    <p>
+      OVN implements native DHCPv4 relay support, which caters to the common
+      use case of relaying DHCP requests to an external DHCP server.
+    </p>
+
+    <column name="servers">
+      <p>
+        The DHCPv4 server IP address.
+      </p>
+    </column>
+    <group title="Common Columns">
+      <column name="external_ids">
+        See <em>External IDs</em> at the beginning of this document.
+      </column>
+    </group>
+  </table>
+
   <table name="Connection" title="OVSDB client connections.">
     <p>
       Configuration for a database connection to an Open vSwitch database
diff --git a/tests/atlocal.in b/tests/atlocal.in
index 63d891b89..32d1c374e 100644
--- a/tests/atlocal.in
+++ b/tests/atlocal.in
@@ -187,6 +187,9 @@  fi
 # Set HAVE_DHCPD
 find_command dhcpd
 
+# Set HAVE_DHCLIENT
+find_command dhclient
+
 # Set HAVE_BFDD_BEACON
 find_command bfdd-beacon
 
diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
index 19e4f1263..4d8c9ff26 100644
--- a/tests/ovn-northd.at
+++ b/tests/ovn-northd.at
@@ -8786,9 +8786,9 @@  ovn-nbctl --wait=sb set logical_router_port R1-PUB options:redirect-type=bridged
 ovn-sbctl dump-flows R1 > R1flows
 AT_CAPTURE_FILE([R1flows])
 
-AT_CHECK([grep "lr_in_arp_resolve" R1flows | grep priority=90 | sort], [0], [dnl
-  table=17(lr_in_arp_resolve  ), priority=90   , match=(outport == "R1-PUB" && ip4.src == 10.0.0.3 && is_chassis_resident("S0-P0")), action=(get_arp(outport, reg0); next;)
-  table=17(lr_in_arp_resolve  ), priority=90   , match=(outport == "R1-PUB" && ip6.src == 1000::3 && is_chassis_resident("S0-P0")), action=(get_nd(outport, xxreg0); next;)
+AT_CHECK([grep "lr_in_arp_resolve" R1flows | grep priority=90 | sed 's/table=../table=??/' | sort], [0], [dnl
+  table=??(lr_in_arp_resolve  ), priority=90   , match=(outport == "R1-PUB" && ip4.src == 10.0.0.3 && is_chassis_resident("S0-P0")), action=(get_arp(outport, reg0); next;)
+  table=??(lr_in_arp_resolve  ), priority=90   , match=(outport == "R1-PUB" && ip6.src == 1000::3 && is_chassis_resident("S0-P0")), action=(get_nd(outport, xxreg0); next;)
 ])
 
 AT_CLEANUP
@@ -10966,3 +10966,38 @@  Status: active
 
 AT_CLEANUP
 ])
+
+OVN_FOR_EACH_NORTHD_NO_HV([
+AT_SETUP([check DHCP RELAY AGENT])
+ovn_start NORTHD_TYPE
+
+check ovn-nbctl ls-add ls0
+check ovn-nbctl lsp-add ls0 ls0-port1
+check ovn-nbctl lsp-set-addresses ls0-port1 02:00:00:00:00:10
+check ovn-nbctl lr-add lr0
+check ovn-nbctl lrp-add lr0 lrp1 02:00:00:00:00:01 192.168.1.1/24
+check ovn-nbctl lsp-add ls0 lrp1-attachment
+check ovn-nbctl lsp-set-type lrp1-attachment router
+check ovn-nbctl lsp-set-addresses lrp1-attachment 00:00:00:00:ff:02
+check ovn-nbctl lsp-set-options lrp1-attachment router-port=lrp1
+check ovn-nbctl lrp-add lr0 lrp-ext 02:00:00:00:00:02 192.168.2.1/24
+
+dhcp_relay=$(ovn-nbctl create DHCP_Relay servers=172.16.1.1)
+check ovn-nbctl set Logical_Router_port lrp1 dhcp_relay=$dhcp_relay
+rp_uuid=$(ovn-nbctl --bare --columns=_uuid list logical_router_port lrp1)
+check ovn-nbctl set Logical_Switch ls0 dhcp_relay_port=$rp_uuid
+
+check ovn-nbctl --wait=sb sync
+
+ovn-sbctl lflow-list > lflows
+AT_CAPTURE_FILE([lflows])
+
+AT_CHECK([grep -e "DHCP_RELAY_" lflows | sed 's/table=../table=??/'], [0], [dnl
+  table=??(lr_in_ip_input     ), priority=110  , match=(inport == "lrp1" && ip4.src == 0.0.0.0 && ip4.dst == 255.255.255.255 && udp.src == 68 && udp.dst == 67), action=(dhcp_relay_req(192.168.1.1,172.16.1.1);ip4.src=192.168.1.1;ip4.dst=172.16.1.1;udp.src=67;next; /* DHCP_RELAY_REQ */)
+  table=??(lr_in_ip_input     ), priority=110  , match=(ip4.src == 172.16.1.1 && ip4.dst == 192.168.1.1 && udp.src == 67 && udp.dst == 67), action=(next;/* DHCP_RELAY_RESP */)
+  table=??(lr_in_dhcp_relay_resp_fwd), priority=110  , match=(ip4.src == 172.16.1.1 && ip4.dst == 192.168.1.1 && udp.src == 67 && udp.dst == 67), action=(dhcp_relay_resp_fwd(192.168.1.1,172.16.1.1);ip4.src=192.168.1.1;udp.dst=68;outport="lrp1";output; /* DHCP_RELAY_RESP */)
+  table=??(ls_in_l2_lkup      ), priority=100  , match=(inport == "ls0-port1" && eth.src == 02:00:00:00:00:10 && ip4.src == 0.0.0.0 && ip4.dst == 255.255.255.255 && udp.src == 68 && udp.dst == 67), action=(eth.dst=02:00:00:00:00:01;outport="lrp1-attachment";next;/* DHCP_RELAY_REQ */)
+])
+
+AT_CLEANUP
+])
diff --git a/tests/ovn.at b/tests/ovn.at
index e8c79512b..839c07ce2 100644
--- a/tests/ovn.at
+++ b/tests/ovn.at
@@ -21905,7 +21905,7 @@  eth_dst=00000000ff01
 ip_src=$(ip_to_hex 10 0 0 10)
 ip_dst=$(ip_to_hex 172 168 0 101)
 send_icmp_packet 1 1 $eth_src $eth_dst $ip_src $ip_dst c4c9 0000000000000000000000
-AT_CHECK_UNQUOTED([as hv1 ovs-ofctl dump-flows br-int metadata=0x$lr0_dp_key | awk '/table=28, n_packets=1, n_bytes=45/{print $7" "$8}'],[0],[dnl
+AT_CHECK_UNQUOTED([as hv1 ovs-ofctl dump-flows br-int metadata=0x$lr0_dp_key | awk '/table=29, n_packets=1, n_bytes=45/{print $7" "$8}'],[0],[dnl
 priority=80,ip,reg15=0x$lr0_public_dp_key,metadata=0x$lr0_dp_key,nw_src=10.0.0.10 actions=drop
 ])
 
@@ -28964,7 +28964,7 @@  AT_CHECK([
         grep "priority=100" | \
         grep -c "ct(commit,zone=NXM_NX_REG11\\[[0..15\\]],.*exec(move:NXM_OF_ETH_SRC\\[[\\]]->NXM_NX_CT_LABEL\\[[32..79\\]],load:0x[[0-9]]->NXM_NX_CT_MARK\\[[16..31\\]]))"
 
-        grep table=25 hv${hv}flows | \
+        grep table=26 hv${hv}flows | \
         grep "priority=200" | \
         grep -c "move:NXM_NX_CT_LABEL\\[[\\]]->NXM_NX_XXREG1\\[[\\]],move:NXM_NX_XXREG1\\[[32..79\\]]->NXM_OF_ETH_DST"
     done; :], [0], [dnl
@@ -29089,7 +29089,7 @@  AT_CHECK([
         grep "priority=100" | \
         grep -c "ct(commit,zone=NXM_NX_REG11\\[[0..15\\]],.*exec(move:NXM_OF_ETH_SRC\\[[\\]]->NXM_NX_CT_LABEL\\[[32..79\\]],load:0x[[0-9]]->NXM_NX_CT_MARK\\[[16..31\\]]))"
 
-        grep table=25 hv${hv}flows | \
+        grep table=26 hv${hv}flows | \
         grep "priority=200" | \
         grep -c "move:NXM_NX_CT_LABEL\\[[\\]]->NXM_NX_XXREG1\\[[\\]],move:NXM_NX_XXREG1\\[[32..79\\]]->NXM_OF_ETH_DST"
     done; :], [0], [dnl
@@ -29586,7 +29586,7 @@  if test X"$1" = X"DGP"; then
 else
     prio=2
 fi
-AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep -E "table=25, n_packets=1,.* priority=$prio,ip,$inport.*$outport.*metadata=0x${sw_key},nw_dst=10.0.1.1 actions=drop" -c], [0], [dnl
+AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep -E "table=26, n_packets=1,.* priority=$prio,ip,$inport.*$outport.*metadata=0x${sw_key},nw_dst=10.0.1.1 actions=drop" -c], [0], [dnl
 1
 ])
 
@@ -29605,13 +29605,13 @@  AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep "actions=controller" | grep
 
 if test X"$1" = X"DGP"; then
     # The packet dst should be resolved once for E/W centralized NAT purpose.
-    AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep -E "table=25, n_packets=1,.* priority=100,reg0=0xa000101,reg15=.*metadata=0x${sw_key} actions=mod_dl_dst:00:00:00:00:01:01,resubmit" -c], [0], [dnl
+    AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep -E "table=26, n_packets=1,.* priority=100,reg0=0xa000101,reg15=.*metadata=0x${sw_key} actions=mod_dl_dst:00:00:00:00:01:01,resubmit" -c], [0], [dnl
 1
 ])
 fi
 
 # The packet should've been finally dropped in the lr_in_arp_resolve stage.
-AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep -E "table=25, n_packets=2,.* priority=$prio,ip,$inport.*$outport.*metadata=0x${sw_key},nw_dst=10.0.1.1 actions=drop" -c], [0], [dnl
+AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep -E "table=26, n_packets=2,.* priority=$prio,ip,$inport.*$outport.*metadata=0x${sw_key},nw_dst=10.0.1.1 actions=drop" -c], [0], [dnl
 1
 ])
 OVN_CLEANUP([hv1])
diff --git a/tests/system-ovn.at b/tests/system-ovn.at
index 7b9daba0d..591933a95 100644
--- a/tests/system-ovn.at
+++ b/tests/system-ovn.at
@@ -12032,3 +12032,153 @@  as
 OVS_TRAFFIC_VSWITCHD_STOP(["/failed to query port patch-.*/d
 /connection dropped.*/d"])
 AT_CLEANUP
+
+OVN_FOR_EACH_NORTHD([
+AT_SETUP([DHCP RELAY AGENT])
+AT_SKIP_IF([test $HAVE_DHCPD = no])
+AT_SKIP_IF([test $HAVE_DHCLIENT = no])
+AT_SKIP_IF([test $HAVE_TCPDUMP = no])
+ovn_start
+OVS_TRAFFIC_VSWITCHD_START()
+
+ADD_BR([br-int])
+ADD_BR([br-ext])
+
+ovs-ofctl add-flow br-ext action=normal
+# Set external-ids in br-int needed for ovn-controller
+ovs-vsctl \
+        -- set Open_vSwitch . external-ids:system-id=hv1 \
+        -- set Open_vSwitch . external-ids:ovn-remote=unix:$ovs_base/ovn-sb/ovn-sb.sock \
+        -- set Open_vSwitch . external-ids:ovn-encap-type=geneve \
+        -- set Open_vSwitch . external-ids:ovn-encap-ip=169.0.0.1 \
+        -- set bridge br-int fail-mode=secure other-config:disable-in-band=true
+
+# Start ovn-controller
+start_daemon ovn-controller
+
+ADD_NAMESPACES(sw01)
+ADD_VETH(sw01, sw01, br-int, "0", "f0:00:00:01:02:03")
+ADD_NAMESPACES(sw11)
+ADD_VETH(sw11, sw11, br-int, "0", "f0:00:00:02:02:03")
+ADD_NAMESPACES(server)
+ADD_VETH(s1, server, br-ext, "172.16.1.1/24", "f0:00:00:01:02:05", \
+         "172.16.1.254")
+
+check ovn-nbctl lr-add R1
+
+check ovn-nbctl ls-add sw0
+check ovn-nbctl ls-add sw1
+check ovn-nbctl ls-add sw-ext
+
+check ovn-nbctl lrp-add R1 rp-sw0 00:00:01:01:02:03 192.168.1.1/24
+check ovn-nbctl lrp-add R1 rp-sw1 00:00:03:01:02:03 192.168.2.1/24
+check ovn-nbctl lrp-add R1 rp-ext 00:00:02:01:02:03 172.16.1.254/24
+
+dhcp_relay=$(ovn-nbctl create DHCP_Relay servers=172.16.1.1)
+check ovn-nbctl set Logical_Router_port rp-sw0 dhcp_relay=$dhcp_relay
+check ovn-nbctl set Logical_Router_port rp-sw1 dhcp_relay=$dhcp_relay
+check ovn-nbctl lrp-set-gateway-chassis rp-ext hv1
+
+check ovn-nbctl lsp-add sw0 sw0-rp -- set Logical_Switch_Port sw0-rp \
+    type=router options:router-port=rp-sw0 \
+    -- lsp-set-addresses sw0-rp router
+check ovn-nbctl lsp-add sw1 sw1-rp -- set Logical_Switch_Port sw1-rp \
+    type=router options:router-port=rp-sw1 \
+    -- lsp-set-addresses sw1-rp router
+
+rp_uuid=$(ovn-nbctl --bare --columns=_uuid list logical_router_port rp-sw0)
+check ovn-nbctl set Logical_Switch sw0 dhcp_relay_port=$rp_uuid
+rp_uuid=$(ovn-nbctl --bare --columns=_uuid list logical_router_port rp-sw1)
+check ovn-nbctl set Logical_Switch sw1 dhcp_relay_port=$rp_uuid
+
+check ovn-nbctl lsp-add sw-ext ext-rp -- set Logical_Switch_Port ext-rp \
+    type=router options:router-port=rp-ext \
+    -- lsp-set-addresses ext-rp router
+check ovn-nbctl lsp-add sw-ext lnet \
+        -- lsp-set-addresses lnet unknown \
+        -- lsp-set-type lnet localnet \
+        -- lsp-set-options lnet network_name=phynet
+
+check ovn-nbctl lsp-add sw0 sw01 \
+    -- lsp-set-addresses sw01 "f0:00:00:01:02:03"
+
+check ovn-nbctl lsp-add sw1 sw11 \
+    -- lsp-set-addresses sw11 "f0:00:00:02:02:03"
+
+AT_CHECK([ovs-vsctl set Open_vSwitch . external-ids:ovn-bridge-mappings=phynet:br-ext])
+
+OVN_POPULATE_ARP
+
+check ovn-nbctl --wait=hv sync
+
+DHCP_TEST_DIR="/tmp/dhcp-test"
+rm -rf $DHCP_TEST_DIR
+mkdir $DHCP_TEST_DIR
+cat > $DHCP_TEST_DIR/dhcpd.conf <<EOF
+subnet 172.16.1.0 netmask 255.255.255.0 {
+}
+subnet 192.168.1.0 netmask 255.255.255.0 {
+  range 192.168.1.10 192.168.1.10;
+  option routers 192.168.1.1;
+  option broadcast-address 192.168.1.255;
+  default-lease-time 60;
+  max-lease-time 120;
+}
+subnet 192.168.2.0 netmask 255.255.255.0 {
+  range 192.168.2.10 192.168.2.10;
+  option routers 192.168.2.1;
+  option broadcast-address 192.168.2.255;
+  default-lease-time 60;
+  max-lease-time 120;
+}
+EOF
+cat > $DHCP_TEST_DIR/dhclient.conf <<EOF
+timeout 2
+EOF
+
+touch $DHCP_TEST_DIR/dhcpd.leases
+chown root:dhcpd $DHCP_TEST_DIR $DHCP_TEST_DIR/dhcpd.leases
+chmod 775 $DHCP_TEST_DIR
+chmod 664 $DHCP_TEST_DIR/dhcpd.leases
+
+
+NETNS_DAEMONIZE([server], [dhcpd -4 -f -cf $DHCP_TEST_DIR/dhcpd.conf s1 > dhcpd.log 2>&1], [dhcpd.pid])
+
+NS_CHECK_EXEC([server], [tcpdump -l -nvv -i s1  udp > pkt.pcap 2>tcpdump_err &])
+OVS_WAIT_UNTIL([grep "listening" tcpdump_err])
+on_exit 'kill $(pidof tcpdump)'
+
+NS_CHECK_EXEC([sw01], [dhclient -1 -q -lf $DHCP_TEST_DIR/dhclient-sw01.lease -pf $DHCP_TEST_DIR/dhclient-sw01.pid -cf $DHCP_TEST_DIR/dhclient.conf sw01])
+NS_CHECK_EXEC([sw11], [dhclient -1 -q -lf $DHCP_TEST_DIR/dhclient-sw11.lease -pf $DHCP_TEST_DIR/dhclient-sw11.pid -cf $DHCP_TEST_DIR/dhclient.conf sw11])
+
+OVS_WAIT_UNTIL([
+    total_pkts=$(cat pkt.pcap | wc -l)
+    test ${total_pkts} -ge 8
+])
+
+on_exit 'kill `cat $DHCP_TEST_DIR/dhclient-sw01.pid` &&
+kill `cat $DHCP_TEST_DIR/dhclient-sw11.pid` && rm -rf $DHCP_TEST_DIR'
+
+NS_CHECK_EXEC([sw01], [ip addr show sw01 | grep -oP '(?<=inet\s)\d+(\.\d+){3}'], [0], [dnl
+192.168.1.10
+])
+NS_CHECK_EXEC([sw11], [ip addr show sw11 | grep -oP '(?<=inet\s)\d+(\.\d+){3}'], [0], [dnl
+192.168.2.10
+])
+OVS_APP_EXIT_AND_WAIT([ovn-controller])
+
+as ovn-sb
+OVS_APP_EXIT_AND_WAIT([ovsdb-server])
+
+as ovn-nb
+OVS_APP_EXIT_AND_WAIT([ovsdb-server])
+
+as northd
+OVS_APP_EXIT_AND_WAIT([NORTHD_TYPE])
+
+as
+OVS_TRAFFIC_VSWITCHD_STOP(["/.*error receiving.*/d
+/failed to query port patch-.*/d
+/.*terminating with signal 15.*/d"])
+AT_CLEANUP
+])
diff --git a/utilities/ovn-trace.c b/utilities/ovn-trace.c
index 0b86eae7b..ae9dd77de 100644
--- a/utilities/ovn-trace.c
+++ b/utilities/ovn-trace.c
@@ -2328,6 +2328,25 @@  execute_put_dhcp_opts(const struct ovnact_put_opts *pdo,
     execute_put_opts(pdo, name, uflow, super);
 }
 
+static void
+execute_dhcpv4_relay_resp_fwd(const struct ovnact_dhcp_relay *dr OVS_UNUSED,
+                              const char *name OVS_UNUSED, struct flow *uflow,
+                              struct ovs_list *super)
+{
+    ovntrace_node_append(
+        super, OVNTRACE_NODE_ERROR,
+        "/* We assume that this packet is DHCPOFFER or DHCPACK and that the "
+            "DHCP broadcast flag is set. Dest IP is set to broadcast. "
+            "Dest MAC is set to broadcast, but in a real network it is the "
+            "unicast address extracted from the DHCP header. */");
+
+    /* Assume the DHCP broadcast flag is set. */
+    uflow->nw_dst = OVS_BE32_MAX;
+    /* Dest MAC is set to broadcast; in a real network this is unicast. */
+    struct eth_addr bcast_mac = {0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF};
+    uflow->dl_dst = bcast_mac;
+}
+
 static void
 execute_put_nd_ra_opts(const struct ovnact_put_opts *pdo,
                        const char *name, struct flow *uflow,
@@ -3215,6 +3234,15 @@  trace_actions(const struct ovnact *ovnacts, size_t ovnacts_len,
                                   "put_dhcpv6_opts", uflow, super);
             break;
 
+        case OVNACT_DHCPV4_RELAY_REQ:
+            /* Nothing to do for tracing. */
+            break;
+
+        case OVNACT_DHCPV4_RELAY_RESP_FWD:
+            execute_dhcpv4_relay_resp_fwd(ovnact_get_DHCPV4_RELAY_RESP_FWD(a),
+                                    "dhcp_relay_resp_fwd", uflow, super);
+            break;
+
         case OVNACT_PUT_ND_RA_OPTS:
             execute_put_nd_ra_opts(ovnact_get_PUT_DHCPV6_OPTS(a),
                                    "put_nd_ra_opts", uflow, super);