Message ID | 20240531192357.34798-1-bethany.jamison@canonical.com |
---|---|
Series | CVE-2024-26586 |
On 5/31/24 1:23 PM, Bethany Jamison wrote:
> [Impact]
>
> mlxsw: spectrum_acl_tcam: Fix stack corruption
>
> When tc filters are first added to a net device, the corresponding local
> port gets bound to an ACL group in the device. The group contains a list
> of ACLs. In turn, each ACL points to a different TCAM region where the
> filters are stored. During forwarding, the ACLs are sequentially
> evaluated until a match is found.
>
> One reason to place filters in different regions is when they are added
> with decreasing priorities and in an alternating order so that two
> consecutive filters can never fit in the same region because of their
> key usage.
>
> In Spectrum-2 and newer ASICs the firmware started to report that the
> maximum number of ACLs in a group is more than 16, but the layout of the
> register that configures ACL groups (PAGT) was not updated to account
> for that. It is therefore possible to hit stack corruption [1] in the
> rare case where more than 16 ACLs in a group are required.
>
> Fix by limiting the maximum ACL group size to the minimum between what
> the firmware reports and the maximum ACLs that fit in the PAGT register.
>
> Add a test case to make sure the machine does not crash when this
> condition is hit.
>
> [1]
> Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: mlxsw_sp_acl_tcam_group_update+0x116/0x120
> [...]
> dump_stack_lvl+0x36/0x50
> panic+0x305/0x330
> __stack_chk_fail+0x15/0x20
> mlxsw_sp_acl_tcam_group_update+0x116/0x120
> mlxsw_sp_acl_tcam_group_region_attach+0x69/0x110
> mlxsw_sp_acl_tcam_vchunk_get+0x492/0xa20
> mlxsw_sp_acl_tcam_ventry_add+0x25/0xe0
> mlxsw_sp_acl_rule_add+0x47/0x240
> mlxsw_sp_flower_replace+0x1a9/0x1d0
> tc_setup_cb_add+0xdc/0x1c0
> fl_hw_replace_filter+0x146/0x1f0
> fl_change+0xc17/0x1360
> tc_new_tfilter+0x472/0xb90
> rtnetlink_rcv_msg+0x313/0x3b0
> netlink_rcv_skb+0x58/0x100
> netlink_unicast+0x244/0x390
> netlink_sendmsg+0x1e4/0x440
> ____sys_sendmsg+0x164/0x260
> ___sys_sendmsg+0x9a/0xe0
> __sys_sendmsg+0x7a/0xc0
> do_syscall_64+0x40/0xe0
> entry_SYSCALL_64_after_hwframe+0x63/0x6b
>
> [Fix]
>
> Noble: pending
> Mantic: pending
> Jammy: pending
> Focal: backport - function lived in a different spot in the file
> and cherry-pick couldn't find it
> Bionic: not-affected
> Xenial: not-affected
> Trusty: not-affected
>
> [Test Case]
>
> Compile and boot tested
>
> [Where problems could occur]
>
> This fix affects those who use the Mellanox mlxsw driver, an issue with
> this fix would be visible to the user via unexpected behavior or a
> system crash.
>
> Ido Schimmel (1):
> mlxsw: spectrum_acl_tcam: Fix stack corruption
>
> .../mellanox/mlxsw/spectrum_acl_tcam.c | 2 +
> .../drivers/net/mlxsw/spectrum-2/tc_flower.sh | 56 ++++++++++++++++++-
> 2 files changed, 57 insertions(+), 1 deletion(-)
>
Acked-by: Tim Gardner <tim.gardner@canonical.com>
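
For readers following the thread, the clamp described in the quoted [Impact] text can be sketched in plain userspace C. This is only an illustration under stated assumptions, not the driver code: `PAGT_ACL_MAX_NUM`, `struct pagt_payload`, `clamp_group_size()` and `pagt_pack()` are hypothetical names, and the capacity of 16 is taken from the "more than 16" wording in the description.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical: number of ACL slots the PAGT register layout can hold
 * (16 is inferred from the patch description, not from the driver). */
#define PAGT_ACL_MAX_NUM 16u

/* Hypothetical stand-in for the fixed-size, on-stack register payload. */
struct pagt_payload {
	uint16_t acl_id[PAGT_ACL_MAX_NUM];
	unsigned int size;
};

/* The fix in miniature: never trust a firmware-reported group size that
 * is larger than what the register layout can describe. */
static unsigned int clamp_group_size(unsigned int fw_reported)
{
	return fw_reported < PAGT_ACL_MAX_NUM ? fw_reported : PAGT_ACL_MAX_NUM;
}

/* Pack ACL ids into the payload. Returns 0 on success, -1 when the group
 * would exceed max_group_size. Without the clamp, a max_group_size above
 * PAGT_ACL_MAX_NUM would let the loop write past acl_id[]. */
static int pagt_pack(struct pagt_payload *p, const uint16_t *ids,
		     unsigned int count, unsigned int max_group_size)
{
	unsigned int i;

	assert(max_group_size <= PAGT_ACL_MAX_NUM);
	if (count > max_group_size)
		return -1;
	for (i = 0; i < count; i++)
		p->acl_id[i] = ids[i];
	p->size = count;
	return 0;
}
```

With a firmware-reported limit of 32, `clamp_group_size(32)` yields 16, so a request to pack a 17th ACL is rejected with an error instead of overrunning the payload.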
On Fri, May 31, 2024 at 02:23:56PM -0500, Bethany Jamison wrote:
> [...]

Acked-by: Portia Stephens <portia.stephens@canonical.com>
On 31.05.24 21:23, Bethany Jamison wrote:
> [...]

Applied to focal:linux/master-next. Thanks.

-Stefan