mbox series

[SRU,J:linux-bluefield,0/1] openvswitch gentling validation warning: missing .resv_start_op Edit

Message ID 20240311140256.2405-1-witu@nvidia.com
Headers show
Series openvswitch gentling validation warning: missing .resv_start_op Edit | expand

Message

William Tu March 11, 2024, 2:02 p.m. UTC
BugLink: https://bugs.launchpad.net/bugs/2056718

Intro:
======
When hit a kernel warning when loading openvswitch kernel module. Digging into the source code, we found it's due to the code snippet
        if (WARN_ON(i.cmd >= family->resv_start_op &&
                   (i.doit.validate || i.dumpit.validate)))
                 return -EINVAL;

in the gene_validate_ops() in net/netlink/genetlink.c, introduced in
108880a07bab genetlink: add iterator for walking family ops
from buglink about DPLL/SynCE
https://bugs.launchpad.net/bugs/2053155

How to fix:
===========
We need to cherry-pick the missing patch
Fixes: e4ba4554209f ("net: openvswitch: add missing .resv_start_op")

Author: Jakub Kicinski <kuba@kernel.org>
Date: Thu Oct 27 20:25:01 2022 -0700

    net: openvswitch: add missing .resv_start_op

    I missed one of the families in OvS when annotating .resv_start_op.
    This triggers the warning added in commit ce48ebdd5651 ("genetlink:
    limit the use of validation workarounds to old ops").

    Reported-by: syzbot+40eb8c0447c0e47a7e9b@syzkaller.appspotmail.com
    Fixes: 9c5d03d36251 ("genetlink: start to validate reserved header bytes")
    Link: https://lore.kernel.org/r/20221028032501.2724270-1-kuba@kernel.org
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Thanks!

How to reproduce:
=================
simply load the openvswitch.ko and dmesg

[ 1083.518212] WARNING: CPU: 2 PID: 17269 at net/netlink/genetlink.c:554 genl_validate_ops+0x134/0x254
...
[ 1083.518306] CPU: 2 PID: 17269 Comm: modprobe Tainted: G W OE 5.15.0-1037.39.10.g319565b-bluefield #g319565b
[ 1083.518309] Hardware name: https://www.mellanox.com BlueField SoC/BlueField SoC, BIOS 4.7.0.13056 Feb 28 2024
[ 1083.518311] pstate: 00400009 (nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 1083.518313] pc : genl_validate_ops+0x134/0x254
[ 1083.518315] lr : genl_validate_ops+0x68/0x254
[ 1083.518317] sp : ffff80000a773810
[ 1083.518318] x29: ffff80000a773810 x28: ffff80000a773ba0 x27: ffffb1ea36f87318
[ 1083.518321] x26: ffffb1ea36f8cd20 x25: 0000000000000001 x24: ffffb1ea36f8cda8
[ 1083.518323] x23: 0000000000000000 x22: 0000000000000001 x21: ffffb1ea36f87210
[ 1083.518325] x20: ffffb1ea36f8b410 x19: 0000000000000001 x18: 0000000000000000
[ 1083.518328] x17: 0000000d00020008 x16: ffffb1ea4b70c2d0 x15: 0000003c00010006
[ 1083.518330] x14: 0000000068746170 x13: 0000000000000000 x12: 0000000000000001
[ 1083.518332] x11: 0000000000000000 x10: 0000000000000000 x9 : ffffb1ea4b709a5c
[ 1083.518335] x8 : 0000000000000000 x7 : 0000000000000000 x6 : ffffb1ea4d4218c0
[ 1083.518337] x5 : 0000000000000004 x4 : 0000000000000000 x3 : 0000000000000001
[ 1083.518339] x2 : 0000000000000000 x1 : 0000000000000000 x0 : 0000000000000003
[ 1083.518341] Call trace:
[ 1083.518343] genl_validate_ops+0x134/0x254
[ 1083.518344] genl_register_family+0x30/0x1f4
[ 1083.518347] dp_init+0xd4/0x174 [openvswitch]
[ 1083.518360] do_one_initcall+0x4c/0x250
[ 1083.518364] do_init_module+0x50/0x260
[ 1083.518368] load_module+0x9fc/0xbe0
[ 1083.518370] __do_sys_finit_module+0xa8/0x114
[ 1083.518372] __arm64_sys_finit_module+0x28/0x3c
[ 1083.518375] invoke_syscall+0x78/0x100
[ 1083.518379] el0_svc_common.constprop.0+0x54/0x184
[ 1083.518381] do_el0_svc+0x30/0xac
[ 1083.518383] el0_svc+0x48/0x160
[ 1083.518387] el0t_64_sync_handler+0xa4/0x12c
[ 1083.518390] el0t_64_sync+0x1a4/0x1a8
[ 1083.518392] ---[ end trace ec4279298c2ae7be ]---
[ 1083.830668] openvswitch: Open vSwitch switching datapath


Jakub Kicinski (1):
  net: openvswitch: add missing .resv_start_op

 net/openvswitch/datapath.c | 1 +
 1 file changed, 1 insertion(+)

Comments

Bartlomiej Zolnierkiewicz March 12, 2024, 12:33 p.m. UTC | #1
Acked-by: Bartlomiej Zolnierkiewicz <bartlomiej.zolnierkiewicz@canonical.com>

On Mon, Mar 11, 2024 at 3:04 PM William Tu <witu@nvidia.com> wrote:
>
> BugLink: https://bugs.launchpad.net/bugs/2056718
>
> Intro:
> ======
> When hit a kernel warning when loading openvswitch kernel module. Digging into the source code, we found it's due to the code snippet
>         if (WARN_ON(i.cmd >= family->resv_start_op &&
>                    (i.doit.validate || i.dumpit.validate)))
>                  return -EINVAL;
>
> in the gene_validate_ops() in net/netlink/genetlink.c, introduced in
> 108880a07bab genetlink: add iterator for walking family ops
> from buglink about DPLL/SynCE
> https://bugs.launchpad.net/bugs/2053155
>
> How to fix:
> ===========
> We need to cherry-pick the missing patch
> Fixes: e4ba4554209f ("net: openvswitch: add missing .resv_start_op")
>
> Author: Jakub Kicinski <kuba@kernel.org>
> Date: Thu Oct 27 20:25:01 2022 -0700
>
>     net: openvswitch: add missing .resv_start_op
>
>     I missed one of the families in OvS when annotating .resv_start_op.
>     This triggers the warning added in commit ce48ebdd5651 ("genetlink:
>     limit the use of validation workarounds to old ops").
>
>     Reported-by: syzbot+40eb8c0447c0e47a7e9b@syzkaller.appspotmail.com
>     Fixes: 9c5d03d36251 ("genetlink: start to validate reserved header bytes")
>     Link: https://lore.kernel.org/r/20221028032501.2724270-1-kuba@kernel.org
>     Signed-off-by: Jakub Kicinski <kuba@kernel.org>
>
> Thanks!
>
> How to reproduce:
> =================
> simply load the openvswitch.ko and dmesg
>
> [ 1083.518212] WARNING: CPU: 2 PID: 17269 at net/netlink/genetlink.c:554 genl_validate_ops+0x134/0x254
> ...
> [ 1083.518306] CPU: 2 PID: 17269 Comm: modprobe Tainted: G W OE 5.15.0-1037.39.10.g319565b-bluefield #g319565b
> [ 1083.518309] Hardware name: https://www.mellanox.com BlueField SoC/BlueField SoC, BIOS 4.7.0.13056 Feb 28 2024
> [ 1083.518311] pstate: 00400009 (nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> [ 1083.518313] pc : genl_validate_ops+0x134/0x254
> [ 1083.518315] lr : genl_validate_ops+0x68/0x254
> [ 1083.518317] sp : ffff80000a773810
> [ 1083.518318] x29: ffff80000a773810 x28: ffff80000a773ba0 x27: ffffb1ea36f87318
> [ 1083.518321] x26: ffffb1ea36f8cd20 x25: 0000000000000001 x24: ffffb1ea36f8cda8
> [ 1083.518323] x23: 0000000000000000 x22: 0000000000000001 x21: ffffb1ea36f87210
> [ 1083.518325] x20: ffffb1ea36f8b410 x19: 0000000000000001 x18: 0000000000000000
> [ 1083.518328] x17: 0000000d00020008 x16: ffffb1ea4b70c2d0 x15: 0000003c00010006
> [ 1083.518330] x14: 0000000068746170 x13: 0000000000000000 x12: 0000000000000001
> [ 1083.518332] x11: 0000000000000000 x10: 0000000000000000 x9 : ffffb1ea4b709a5c
> [ 1083.518335] x8 : 0000000000000000 x7 : 0000000000000000 x6 : ffffb1ea4d4218c0
> [ 1083.518337] x5 : 0000000000000004 x4 : 0000000000000000 x3 : 0000000000000001
> [ 1083.518339] x2 : 0000000000000000 x1 : 0000000000000000 x0 : 0000000000000003
> [ 1083.518341] Call trace:
> [ 1083.518343] genl_validate_ops+0x134/0x254
> [ 1083.518344] genl_register_family+0x30/0x1f4
> [ 1083.518347] dp_init+0xd4/0x174 [openvswitch]
> [ 1083.518360] do_one_initcall+0x4c/0x250
> [ 1083.518364] do_init_module+0x50/0x260
> [ 1083.518368] load_module+0x9fc/0xbe0
> [ 1083.518370] __do_sys_finit_module+0xa8/0x114
> [ 1083.518372] __arm64_sys_finit_module+0x28/0x3c
> [ 1083.518375] invoke_syscall+0x78/0x100
> [ 1083.518379] el0_svc_common.constprop.0+0x54/0x184
> [ 1083.518381] do_el0_svc+0x30/0xac
> [ 1083.518383] el0_svc+0x48/0x160
> [ 1083.518387] el0t_64_sync_handler+0xa4/0x12c
> [ 1083.518390] el0t_64_sync+0x1a4/0x1a8
> [ 1083.518392] ---[ end trace ec4279298c2ae7be ]---
> [ 1083.830668] openvswitch: Open vSwitch switching datapath
>
>
> Jakub Kicinski (1):
>   net: openvswitch: add missing .resv_start_op
>
>  net/openvswitch/datapath.c | 1 +
>  1 file changed, 1 insertion(+)
>
Tim Gardner March 14, 2024, 11:05 p.m. UTC | #2
On 3/11/24 09:02, William Tu wrote:
> BugLink: https://bugs.launchpad.net/bugs/2056718
> 
> Intro:
> ======
> When hit a kernel warning when loading openvswitch kernel module. Digging into the source code, we found it's due to the code snippet
>          if (WARN_ON(i.cmd >= family->resv_start_op &&
>                     (i.doit.validate || i.dumpit.validate)))
>                   return -EINVAL;
> 
> in the gene_validate_ops() in net/netlink/genetlink.c, introduced in
> 108880a07bab genetlink: add iterator for walking family ops
> from buglink about DPLL/SynCE
> https://bugs.launchpad.net/bugs/2053155
> 
> How to fix:
> ===========
> We need to cherry-pick the missing patch
> Fixes: e4ba4554209f ("net: openvswitch: add missing .resv_start_op")
> 
> Author: Jakub Kicinski <kuba@kernel.org>
> Date: Thu Oct 27 20:25:01 2022 -0700
> 
>      net: openvswitch: add missing .resv_start_op
> 
>      I missed one of the families in OvS when annotating .resv_start_op.
>      This triggers the warning added in commit ce48ebdd5651 ("genetlink:
>      limit the use of validation workarounds to old ops").
> 
>      Reported-by: syzbot+40eb8c0447c0e47a7e9b@syzkaller.appspotmail.com
>      Fixes: 9c5d03d36251 ("genetlink: start to validate reserved header bytes")
>      Link: https://lore.kernel.org/r/20221028032501.2724270-1-kuba@kernel.org
>      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
> 
> Thanks!
> 
> How to reproduce:
> =================
> simply load the openvswitch.ko and dmesg
> 
> [ 1083.518212] WARNING: CPU: 2 PID: 17269 at net/netlink/genetlink.c:554 genl_validate_ops+0x134/0x254
> ...
> [ 1083.518306] CPU: 2 PID: 17269 Comm: modprobe Tainted: G W OE 5.15.0-1037.39.10.g319565b-bluefield #g319565b
> [ 1083.518309] Hardware name: https://www.mellanox.com BlueField SoC/BlueField SoC, BIOS 4.7.0.13056 Feb 28 2024
> [ 1083.518311] pstate: 00400009 (nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> [ 1083.518313] pc : genl_validate_ops+0x134/0x254
> [ 1083.518315] lr : genl_validate_ops+0x68/0x254
> [ 1083.518317] sp : ffff80000a773810
> [ 1083.518318] x29: ffff80000a773810 x28: ffff80000a773ba0 x27: ffffb1ea36f87318
> [ 1083.518321] x26: ffffb1ea36f8cd20 x25: 0000000000000001 x24: ffffb1ea36f8cda8
> [ 1083.518323] x23: 0000000000000000 x22: 0000000000000001 x21: ffffb1ea36f87210
> [ 1083.518325] x20: ffffb1ea36f8b410 x19: 0000000000000001 x18: 0000000000000000
> [ 1083.518328] x17: 0000000d00020008 x16: ffffb1ea4b70c2d0 x15: 0000003c00010006
> [ 1083.518330] x14: 0000000068746170 x13: 0000000000000000 x12: 0000000000000001
> [ 1083.518332] x11: 0000000000000000 x10: 0000000000000000 x9 : ffffb1ea4b709a5c
> [ 1083.518335] x8 : 0000000000000000 x7 : 0000000000000000 x6 : ffffb1ea4d4218c0
> [ 1083.518337] x5 : 0000000000000004 x4 : 0000000000000000 x3 : 0000000000000001
> [ 1083.518339] x2 : 0000000000000000 x1 : 0000000000000000 x0 : 0000000000000003
> [ 1083.518341] Call trace:
> [ 1083.518343] genl_validate_ops+0x134/0x254
> [ 1083.518344] genl_register_family+0x30/0x1f4
> [ 1083.518347] dp_init+0xd4/0x174 [openvswitch]
> [ 1083.518360] do_one_initcall+0x4c/0x250
> [ 1083.518364] do_init_module+0x50/0x260
> [ 1083.518368] load_module+0x9fc/0xbe0
> [ 1083.518370] __do_sys_finit_module+0xa8/0x114
> [ 1083.518372] __arm64_sys_finit_module+0x28/0x3c
> [ 1083.518375] invoke_syscall+0x78/0x100
> [ 1083.518379] el0_svc_common.constprop.0+0x54/0x184
> [ 1083.518381] do_el0_svc+0x30/0xac
> [ 1083.518383] el0_svc+0x48/0x160
> [ 1083.518387] el0t_64_sync_handler+0xa4/0x12c
> [ 1083.518390] el0t_64_sync+0x1a4/0x1a8
> [ 1083.518392] ---[ end trace ec4279298c2ae7be ]---
> [ 1083.830668] openvswitch: Open vSwitch switching datapath
> 
> 
> Jakub Kicinski (1):
>    net: openvswitch: add missing .resv_start_op
> 
>   net/openvswitch/datapath.c | 1 +
>   1 file changed, 1 insertion(+)
> 
Acked-by: Tim Gardner <tim.gardner@canonical.com>
Bartlomiej Zolnierkiewicz March 18, 2024, 3:30 p.m. UTC | #3
Applied to jammy:linux-bluefield/master-next. Thanks.

--
Best regards,
Bartlomiej

On Mon, Mar 11, 2024 at 3:04 PM William Tu <witu@nvidia.com> wrote:
>
> BugLink: https://bugs.launchpad.net/bugs/2056718
>
> Intro:
> ======
> When hit a kernel warning when loading openvswitch kernel module. Digging into the source code, we found it's due to the code snippet
>         if (WARN_ON(i.cmd >= family->resv_start_op &&
>                    (i.doit.validate || i.dumpit.validate)))
>                  return -EINVAL;
>
> in the gene_validate_ops() in net/netlink/genetlink.c, introduced in
> 108880a07bab genetlink: add iterator for walking family ops
> from buglink about DPLL/SynCE
> https://bugs.launchpad.net/bugs/2053155
>
> How to fix:
> ===========
> We need to cherry-pick the missing patch
> Fixes: e4ba4554209f ("net: openvswitch: add missing .resv_start_op")
>
> Author: Jakub Kicinski <kuba@kernel.org>
> Date: Thu Oct 27 20:25:01 2022 -0700
>
>     net: openvswitch: add missing .resv_start_op
>
>     I missed one of the families in OvS when annotating .resv_start_op.
>     This triggers the warning added in commit ce48ebdd5651 ("genetlink:
>     limit the use of validation workarounds to old ops").
>
>     Reported-by: syzbot+40eb8c0447c0e47a7e9b@syzkaller.appspotmail.com
>     Fixes: 9c5d03d36251 ("genetlink: start to validate reserved header bytes")
>     Link: https://lore.kernel.org/r/20221028032501.2724270-1-kuba@kernel.org
>     Signed-off-by: Jakub Kicinski <kuba@kernel.org>
>
> Thanks!
>
> How to reproduce:
> =================
> simply load the openvswitch.ko and dmesg
>
> [ 1083.518212] WARNING: CPU: 2 PID: 17269 at net/netlink/genetlink.c:554 genl_validate_ops+0x134/0x254
> ...
> [ 1083.518306] CPU: 2 PID: 17269 Comm: modprobe Tainted: G W OE 5.15.0-1037.39.10.g319565b-bluefield #g319565b
> [ 1083.518309] Hardware name: https://www.mellanox.com BlueField SoC/BlueField SoC, BIOS 4.7.0.13056 Feb 28 2024
> [ 1083.518311] pstate: 00400009 (nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> [ 1083.518313] pc : genl_validate_ops+0x134/0x254
> [ 1083.518315] lr : genl_validate_ops+0x68/0x254
> [ 1083.518317] sp : ffff80000a773810
> [ 1083.518318] x29: ffff80000a773810 x28: ffff80000a773ba0 x27: ffffb1ea36f87318
> [ 1083.518321] x26: ffffb1ea36f8cd20 x25: 0000000000000001 x24: ffffb1ea36f8cda8
> [ 1083.518323] x23: 0000000000000000 x22: 0000000000000001 x21: ffffb1ea36f87210
> [ 1083.518325] x20: ffffb1ea36f8b410 x19: 0000000000000001 x18: 0000000000000000
> [ 1083.518328] x17: 0000000d00020008 x16: ffffb1ea4b70c2d0 x15: 0000003c00010006
> [ 1083.518330] x14: 0000000068746170 x13: 0000000000000000 x12: 0000000000000001
> [ 1083.518332] x11: 0000000000000000 x10: 0000000000000000 x9 : ffffb1ea4b709a5c
> [ 1083.518335] x8 : 0000000000000000 x7 : 0000000000000000 x6 : ffffb1ea4d4218c0
> [ 1083.518337] x5 : 0000000000000004 x4 : 0000000000000000 x3 : 0000000000000001
> [ 1083.518339] x2 : 0000000000000000 x1 : 0000000000000000 x0 : 0000000000000003
> [ 1083.518341] Call trace:
> [ 1083.518343] genl_validate_ops+0x134/0x254
> [ 1083.518344] genl_register_family+0x30/0x1f4
> [ 1083.518347] dp_init+0xd4/0x174 [openvswitch]
> [ 1083.518360] do_one_initcall+0x4c/0x250
> [ 1083.518364] do_init_module+0x50/0x260
> [ 1083.518368] load_module+0x9fc/0xbe0
> [ 1083.518370] __do_sys_finit_module+0xa8/0x114
> [ 1083.518372] __arm64_sys_finit_module+0x28/0x3c
> [ 1083.518375] invoke_syscall+0x78/0x100
> [ 1083.518379] el0_svc_common.constprop.0+0x54/0x184
> [ 1083.518381] do_el0_svc+0x30/0xac
> [ 1083.518383] el0_svc+0x48/0x160
> [ 1083.518387] el0t_64_sync_handler+0xa4/0x12c
> [ 1083.518390] el0t_64_sync+0x1a4/0x1a8
> [ 1083.518392] ---[ end trace ec4279298c2ae7be ]---
> [ 1083.830668] openvswitch: Open vSwitch switching datapath
>
>
> Jakub Kicinski (1):
>   net: openvswitch: add missing .resv_start_op
>
>  net/openvswitch/datapath.c | 1 +
>  1 file changed, 1 insertion(+)
>