
[SRU,Jammy,Noble,Unstable,0/1] net/sched: Fix conntrack use-after-free

Message ID 20240716045845.1961853-1-gerald.yang@canonical.com

Message

Gerald Yang July 16, 2024, 4:58 a.m. UTC
BugLink: https://bugs.launchpad.net/bugs/2073092

[Impact]

We hit a conntrack refcount use-after-free:
refcount_t: addition on 0; use-after-free.

Call Trace:
<IRQ>
? show_regs+0x6d/0x80
? __warn+0x89/0x160
? refcount_warn_saturate+0x12e/0x150
? report_bug+0x17e/0x1b0
? handle_bug+0x46/0x90
? exc_invalid_op+0x18/0x80
? asm_exc_invalid_op+0x1b/0x20
? refcount_warn_saturate+0x12e/0x150
flow_offload_alloc+0xe5/0xf0 [nf_flow_table]
tcf_ct_flow_table_process_conn+0xc2/0x1e0 [act_ct]
tcf_ct_act+0x6c8/0xaa0 [act_ct]
tcf_action_exec+0xbc/0x1a0
fl_classify+0x1f8/0x200 [cls_flower]
__tcf_classify+0x169/0x200
tcf_classify+0xff/0x250
sch_handle_ingress.constprop.0+0x11f/0x290
? srso_alias_return_thunk+0x5/0x7f
__netif_receive_skb_core.constprop.0+0x60b/0xd70
? __udp4_lib_lookup+0x25f/0x2a0
__netif_receive_skb_list_core+0xfd/0x250
netif_receive_skb_list_internal+0x1a3/0x2d0
? srso_alias_return_thunk+0x5/0x7f
? dev_gro_receive+0x196/0x350
napi_complete_done+0x74/0x1c0
gro_cell_poll+0x7c/0xb0
__napi_poll+0x33/0x1f0
net_rx_action+0x181/0x2e0
__do_softirq+0xdc/0x349
? srso_alias_return_thunk+0x5/0x7f
? handle_irq_event+0x52/0x80
? handle_edge_irq+0xda/0x250
__irq_exit_rcu+0x75/0xa0
irq_exit_rcu+0xe/0x20
common_interrupt+0xa4/0xb0
</IRQ>
<TASK>

[Fix]
With KASAN enabled, I get:
BUG: KASAN: slab-use-after-free in tcf_ct_flow_table_process_conn+0x12b/0x380 [act_ct]
Read of size 1 at addr ffff888c07603600 by task handler130/6469

Call Trace:
<IRQ>
dump_stack_lvl+0x48/0x70
print_address_description.constprop.0+0x33/0x3d0
print_report+0xc0/0x2b0
kasan_report+0xd0/0x120
__asan_load1+0x6c/0x80
tcf_ct_flow_table_process_conn+0x12b/0x380 [act_ct]
tcf_ct_act+0x886/0x1350 [act_ct]
tcf_action_exec+0xf8/0x1f0
fl_classify+0x355/0x360 [cls_flower]
__tcf_classify+0x1fd/0x330
tcf_classify+0x21c/0x3c0
sch_handle_ingress.constprop.0+0x2c5/0x500
__netif_receive_skb_core.constprop.0+0xb25/0x1510
__netif_receive_skb_list_core+0x220/0x4c0
netif_receive_skb_list_internal+0x446/0x620
napi_complete_done+0x157/0x3d0
gro_cell_poll+0xcf/0x100
__napi_poll+0x65/0x310
net_rx_action+0x30c/0x5c0
__do_softirq+0x14f/0x491
__irq_exit_rcu+0x82/0xc0
irq_exit_rcu+0xe/0x20
common_interrupt+0xa1/0xb0
</IRQ>

Allocated by task 6469:
kasan_save_stack+0x38/0x70
kasan_set_track+0x25/0x40
kasan_save_alloc_info+0x1e/0x40
__kasan_krealloc+0x133/0x190
krealloc+0xaa/0x130
nf_ct_ext_add+0xed/0x230 [nf_conntrack]
tcf_ct_act+0x1095/0x1350 [act_ct]
tcf_action_exec+0xf8/0x1f0
fl_classify+0x355/0x360 [cls_flower]
__tcf_classify+0x1fd/0x330
tcf_classify+0x21c/0x3c0
sch_handle_ingress.constprop.0+0x2c5/0x500
__netif_receive_skb_core.constprop.0+0xb25/0x1510
__netif_receive_skb_list_core+0x220/0x4c0
netif_receive_skb_list_internal+0x446/0x620
napi_complete_done+0x157/0x3d0
gro_cell_poll+0xcf/0x100
__napi_poll+0x65/0x310
net_rx_action+0x30c/0x5c0
__do_softirq+0x14f/0x491

Freed by task 6469:
kasan_save_stack+0x38/0x70
kasan_set_track+0x25/0x40
kasan_save_free_info+0x2b/0x60
____kasan_slab_free+0x180/0x1f0
__kasan_slab_free+0x12/0x30
slab_free_freelist_hook+0xd2/0x1a0
__kmem_cache_free+0x1a2/0x2f0
kfree+0x78/0x120
nf_conntrack_free+0x74/0x130 [nf_conntrack]
nf_ct_destroy+0xb2/0x140 [nf_conntrack]
__nf_ct_resolve_clash+0x529/0x5d0 [nf_conntrack]
nf_ct_resolve_clash+0xf6/0x490 [nf_conntrack]
__nf_conntrack_confirm+0x2c6/0x770 [nf_conntrack]
tcf_ct_act+0x12ad/0x1350 [act_ct]
tcf_action_exec+0xf8/0x1f0
fl_classify+0x355/0x360 [cls_flower]
__tcf_classify+0x1fd/0x330
tcf_classify+0x21c/0x3c0
sch_handle_ingress.constprop.0+0x2c5/0x500
__netif_receive_skb_core.constprop.0+0xb25/0x1510
__netif_receive_skb_list_core+0x220/0x4c0
netif_receive_skb_list_internal+0x446/0x620
napi_complete_done+0x157/0x3d0
gro_cell_poll+0xcf/0x100
__napi_poll+0x65/0x310
net_rx_action+0x30c/0x5c0
__do_softirq+0x14f/0x491

When a clash is resolved, the duplicate conntrack is freed,
but tcf_ct_act keeps using the freed conntrack instead of the correct one.
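
As a rough userspace illustration of the pattern described above (this is
not the kernel code), the sketch below caches a pointer, runs a confirm
step that may free the cached object and re-point the packet at a surviving
entry, and then re-reads the pointer from the packet before using it.
All names here (struct conn, struct packet, packet_conn, confirm_conn,
process_packet) are hypothetical stand-ins chosen for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for illustration; not the kernel's structures. */
struct conn {
	int id;
};

struct packet {
	struct conn *ct;	/* connection entry attached to the packet */
};

/* Returns the entry currently attached to the packet (may be NULL). */
static struct conn *packet_conn(const struct packet *pkt)
{
	return pkt->ct;
}

/*
 * Models a confirm step with clash resolution: if another entry ('winner')
 * already owns the flow, the packet's duplicate entry is freed and the
 * packet is re-pointed at the winner.
 */
static void confirm_conn(struct packet *pkt, struct conn *winner)
{
	if (winner != NULL && pkt->ct != winner) {
		free(pkt->ct);
		pkt->ct = winner;
	}
}

/* Returns the id of the entry the packet ends up with, or -1 if none. */
static int process_packet(struct packet *pkt, struct conn *winner)
{
	struct conn *ct;

	assert(pkt != NULL);
	ct = packet_conn(pkt);	/* cached before the confirm step */
	if (ct == NULL)
		return -1;

	confirm_conn(pkt, winner);

	/*
	 * 'ct' may now point to freed memory. Re-read the entry from the
	 * packet rather than reusing the cached pointer.
	 */
	ct = packet_conn(pkt);
	if (ct == NULL)
		return -1;

	return ct->id;
}
```

Without the second packet_conn() call, process_packet() would dereference
the freed duplicate whenever a clash occurs, which is the kind of access
KASAN reports above.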

We sent a fix upstream, and it has been merged:
commit 26488172b0292bed837b95a006a3f3431d1898c3
Author: Chengen Du <chengen.du@canonical.com>
Date: Wed Jul 10 13:37:47 2024 +0800

    net/sched: Fix UAF when resolving a clash

Cherry-pick this commit to fix the conntrack slab use-after-free issue.

[Testcase]
Built a test kernel and verified on our environment.

[Where problems could occur]
This patch ensures that when a clash happens and the duplicate conntrack is freed,
nf_ct_get is called to get the correct conntrack.
The freed conntrack is no longer used, and the rest of the code follows the original path.
This should not cause other issues.

Chengen Du (1):
  net/sched: Fix UAF when resolving a clash

 net/sched/act_ct.c | 8 ++++++++
 1 file changed, 8 insertions(+)

Comments

Manuel Diewald July 16, 2024, 11:07 a.m. UTC | #1
On Tue, Jul 16, 2024 at 12:58:36PM +0800, Gerald Yang wrote:
> [...]

I guess this should only be applied to jammy and noble as the patch is
part of mainline 6.10 already and will be in oracular/unstable.

Acked-by: Manuel Diewald <manuel.diewald@canonical.com>
Stefan Bader July 16, 2024, 1:49 p.m. UTC | #2
On 16.07.24 13:07, Manuel Diewald wrote:
> [...]
> I guess this should only be applied to jammy and noble as the patch is
> part of mainline 6.10 already and will be in oracular/unstable.
> 
> Acked-by: Manuel Diewald <manuel.diewald@canonical.com>
> 
> 
I am losing track of things a bit; could this be related to what
bluefield reported? It was at least also conntrack...
Manuel Diewald July 16, 2024, 2:27 p.m. UTC | #3
On Tue, Jul 16, 2024 at 03:49:15PM +0200, Stefan Bader wrote:
> On 16.07.24 13:07, Manuel Diewald wrote:
> > [...]
> > I guess this should only be applied to jammy and noble as the patch is
> > part of mainline 6.10 already and will be in oracular/unstable.
> > 
> > Acked-by: Manuel Diewald <manuel.diewald@canonical.com>
> > 
> > 
> I am losing track of things a bit; could this be related to what bluefield
> reported? It was at least also conntrack...
> 

I think this is a different issue. The fix for what bluefield reported
is this:

https://lists.ubuntu.com/archives/kernel-team/2024-July/152061.html

That commit has been upstream since v6.0-rc1, so if it fixed the same
problem, noble would not have been affected by the issue this submission
addresses. Also, mainline 6.10 carries both commits (for the bluefield
issue and this one).

Gerald Yang July 16, 2024, 4:04 p.m. UTC | #4
Hi Stefan, Manuel,

Thanks for checking this. This is a different issue, for PS6, not for
bluefield.
I checked oracular yesterday: 6.10.0-17.17 already has this patch, so it is
only needed for jammy and noble.

Thanks,
Gerald

Kevin Becker July 16, 2024, 7:08 p.m. UTC | #5
On Tue, Jul 16, 2024 at 12:59 AM Gerald Yang <gerald.yang@canonical.com> wrote:
>
> BugLink: https://bugs.launchpad.net/bugs/2073092
>
> [Impact]
>
> Hit conntrack refcount use-after-free issue:
> refcount_t: addition on 0; use-after-free.
>
> Call Trace:
> <IRQ>
> ? show_regs+0x6d/0x80
> ? __warn+0x89/0x160
> ? refcount_warn_saturate+0x12e/0x150
> ? report_bug+0x17e/0x1b0
> ? handle_bug+0x46/0x90
> ? exc_invalid_op+0x18/0x80
> ? asm_exc_invalid_op+0x1b/0x20
> ? refcount_warn_saturate+0x12e/0x150
> flow_offload_alloc+0xe5/0xf0 [nf_flow_table]
> tcf_ct_flow_table_process_conn+0xc2/0x1e0 [act_ct]
> tcf_ct_act+0x6c8/0xaa0 [act_ct]
> tcf_action_exec+0xbc/0x1a0
> fl_classify+0x1f8/0x200 [cls_flower]
> __tcf_classify+0x169/0x200
> tcf_classify+0xff/0x250
> sch_handle_ingress.constprop.0+0x11f/0x290
> ? srso_alias_return_thunk+0x5/0x7f
> __netif_receive_skb_core.constprop.0+0x60b/0xd70
> ? __udp4_lib_lookup+0x25f/0x2a0
> __netif_receive_skb_list_core+0xfd/0x250
> netif_receive_skb_list_internal+0x1a3/0x2d0
> ? srso_alias_return_thunk+0x5/0x7f
> ? dev_gro_receive+0x196/0x350
> napi_complete_done+0x74/0x1c0
> gro_cell_poll+0x7c/0xb0
> __napi_poll+0x33/0x1f0
> net_rx_action+0x181/0x2e0
> __do_softirq+0xdc/0x349
> ? srso_alias_return_thunk+0x5/0x7f
> ? handle_irq_event+0x52/0x80
> ? handle_edge_irq+0xda/0x250
> __irq_exit_rcu+0x75/0xa0
> irq_exit_rcu+0xe/0x20
> common_interrupt+0xa4/0xb0
> </IRQ>
> <TASK>
>
> [Fix]
> I enabled KASAN and got:
> BUG: KASAN: slab-use-after-free in tcf_ct_flow_table_process_conn+0x12b/0x380 [act_ct]
> Read of size 1 at addr ffff888c07603600 by task handler130/6469
>
> Call Trace:
> <IRQ>
> dump_stack_lvl+0x48/0x70
> print_address_description.constprop.0+0x33/0x3d0
> print_report+0xc0/0x2b0
> kasan_report+0xd0/0x120
> __asan_load1+0x6c/0x80
> tcf_ct_flow_table_process_conn+0x12b/0x380 [act_ct]
> tcf_ct_act+0x886/0x1350 [act_ct]
> tcf_action_exec+0xf8/0x1f0
> fl_classify+0x355/0x360 [cls_flower]
> __tcf_classify+0x1fd/0x330
> tcf_classify+0x21c/0x3c0
> sch_handle_ingress.constprop.0+0x2c5/0x500
> __netif_receive_skb_core.constprop.0+0xb25/0x1510
> __netif_receive_skb_list_core+0x220/0x4c0
> netif_receive_skb_list_internal+0x446/0x620
> napi_complete_done+0x157/0x3d0
> gro_cell_poll+0xcf/0x100
> __napi_poll+0x65/0x310
> net_rx_action+0x30c/0x5c0
> __do_softirq+0x14f/0x491
> __irq_exit_rcu+0x82/0xc0
> irq_exit_rcu+0xe/0x20
> common_interrupt+0xa1/0xb0
> </IRQ>
>
> Allocated by task 6469:
> kasan_save_stack+0x38/0x70
> kasan_set_track+0x25/0x40
> kasan_save_alloc_info+0x1e/0x40
> __kasan_krealloc+0x133/0x190
> krealloc+0xaa/0x130
> nf_ct_ext_add+0xed/0x230 [nf_conntrack]
> tcf_ct_act+0x1095/0x1350 [act_ct]
> tcf_action_exec+0xf8/0x1f0
> fl_classify+0x355/0x360 [cls_flower]
> __tcf_classify+0x1fd/0x330
> tcf_classify+0x21c/0x3c0
> sch_handle_ingress.constprop.0+0x2c5/0x500
> __netif_receive_skb_core.constprop.0+0xb25/0x1510
> __netif_receive_skb_list_core+0x220/0x4c0
> netif_receive_skb_list_internal+0x446/0x620
> napi_complete_done+0x157/0x3d0
> gro_cell_poll+0xcf/0x100
> __napi_poll+0x65/0x310
> net_rx_action+0x30c/0x5c0
> __do_softirq+0x14f/0x491
>
> Freed by task 6469:
> kasan_save_stack+0x38/0x70
> kasan_set_track+0x25/0x40
> kasan_save_free_info+0x2b/0x60
> ____kasan_slab_free+0x180/0x1f0
> __kasan_slab_free+0x12/0x30
> slab_free_freelist_hook+0xd2/0x1a0
> __kmem_cache_free+0x1a2/0x2f0
> kfree+0x78/0x120
> nf_conntrack_free+0x74/0x130 [nf_conntrack]
> nf_ct_destroy+0xb2/0x140 [nf_conntrack]
> __nf_ct_resolve_clash+0x529/0x5d0 [nf_conntrack]
> nf_ct_resolve_clash+0xf6/0x490 [nf_conntrack]
> __nf_conntrack_confirm+0x2c6/0x770 [nf_conntrack]
> tcf_ct_act+0x12ad/0x1350 [act_ct]
> tcf_action_exec+0xf8/0x1f0
> fl_classify+0x355/0x360 [cls_flower]
> __tcf_classify+0x1fd/0x330
> tcf_classify+0x21c/0x3c0
> sch_handle_ingress.constprop.0+0x2c5/0x500
> __netif_receive_skb_core.constprop.0+0xb25/0x1510
> __netif_receive_skb_list_core+0x220/0x4c0
> netif_receive_skb_list_internal+0x446/0x620
> napi_complete_done+0x157/0x3d0
> gro_cell_poll+0xcf/0x100
> __napi_poll+0x65/0x310
> net_rx_action+0x30c/0x5c0
> __do_softirq+0x14f/0x491
>
> When resolving a clash, the duplicate conntrack is freed,
> but tcf_ct_act keeps using the freed conntrack instead of the correct conntrack.
>
> We sent a patch upstream to fix it, and it was merged:
> commit 26488172b0292bed837b95a006a3f3431d1898c3
> Author: Chengen Du <chengen.du@canonical.com>
> Date: Wed Jul 10 13:37:47 2024 +0800
>
>     net/sched: Fix UAF when resolving a clash
>
> Cherry-pick this commit to fix the conntrack slab use-after-free issue.
>
> [Testcase]
> Built a test kernel and verified it in our environment.
>
> [Where problems could occur]
> This patch ensures that when a clash happens and the duplicate conntrack is freed,
> nf_ct_get is called to get the correct conntrack,
> so the freed conntrack is not used and the rest of the code follows the original path.
> This won't cause other issues.
>
> Chengen Du (1):
>   net/sched: Fix UAF when resolving a clash
>
>  net/sched/act_ct.c | 8 ++++++++
>  1 file changed, 8 insertions(+)
>
> --

Acked-by: Kevin Becker <kevin.becker@canonical.com>
Stefan Bader July 19, 2024, 9:40 a.m. UTC | #6
On 16.07.24 06:58, Gerald Yang wrote:
> BugLink: https://bugs.launchpad.net/bugs/2073092
> 
> [Impact]
> 
> Hit conntrack refcount use-after-free issue:
> refcount_t: addition on 0; use-after-free.
> 
> Call Trace:
> <IRQ>
> ? show_regs+0x6d/0x80
> ? __warn+0x89/0x160
> ? refcount_warn_saturate+0x12e/0x150
> ? report_bug+0x17e/0x1b0
> ? handle_bug+0x46/0x90
> ? exc_invalid_op+0x18/0x80
> ? asm_exc_invalid_op+0x1b/0x20
> ? refcount_warn_saturate+0x12e/0x150
> flow_offload_alloc+0xe5/0xf0 [nf_flow_table]
> tcf_ct_flow_table_process_conn+0xc2/0x1e0 [act_ct]
> tcf_ct_act+0x6c8/0xaa0 [act_ct]
> tcf_action_exec+0xbc/0x1a0
> fl_classify+0x1f8/0x200 [cls_flower]
> __tcf_classify+0x169/0x200
> tcf_classify+0xff/0x250
> sch_handle_ingress.constprop.0+0x11f/0x290
> ? srso_alias_return_thunk+0x5/0x7f
> __netif_receive_skb_core.constprop.0+0x60b/0xd70
> ? __udp4_lib_lookup+0x25f/0x2a0
> __netif_receive_skb_list_core+0xfd/0x250
> netif_receive_skb_list_internal+0x1a3/0x2d0
> ? srso_alias_return_thunk+0x5/0x7f
> ? dev_gro_receive+0x196/0x350
> napi_complete_done+0x74/0x1c0
> gro_cell_poll+0x7c/0xb0
> __napi_poll+0x33/0x1f0
> net_rx_action+0x181/0x2e0
> __do_softirq+0xdc/0x349
> ? srso_alias_return_thunk+0x5/0x7f
> ? handle_irq_event+0x52/0x80
> ? handle_edge_irq+0xda/0x250
> __irq_exit_rcu+0x75/0xa0
> irq_exit_rcu+0xe/0x20
> common_interrupt+0xa4/0xb0
> </IRQ>
> <TASK>
> 
> [Fix]
> I enabled KASAN and got:
> BUG: KASAN: slab-use-after-free in tcf_ct_flow_table_process_conn+0x12b/0x380 [act_ct]
> Read of size 1 at addr ffff888c07603600 by task handler130/6469
> 
> Call Trace:
> <IRQ>
> dump_stack_lvl+0x48/0x70
> print_address_description.constprop.0+0x33/0x3d0
> print_report+0xc0/0x2b0
> kasan_report+0xd0/0x120
> __asan_load1+0x6c/0x80
> tcf_ct_flow_table_process_conn+0x12b/0x380 [act_ct]
> tcf_ct_act+0x886/0x1350 [act_ct]
> tcf_action_exec+0xf8/0x1f0
> fl_classify+0x355/0x360 [cls_flower]
> __tcf_classify+0x1fd/0x330
> tcf_classify+0x21c/0x3c0
> sch_handle_ingress.constprop.0+0x2c5/0x500
> __netif_receive_skb_core.constprop.0+0xb25/0x1510
> __netif_receive_skb_list_core+0x220/0x4c0
> netif_receive_skb_list_internal+0x446/0x620
> napi_complete_done+0x157/0x3d0
> gro_cell_poll+0xcf/0x100
> __napi_poll+0x65/0x310
> net_rx_action+0x30c/0x5c0
> __do_softirq+0x14f/0x491
> __irq_exit_rcu+0x82/0xc0
> irq_exit_rcu+0xe/0x20
> common_interrupt+0xa1/0xb0
> </IRQ>
> 
> Allocated by task 6469:
> kasan_save_stack+0x38/0x70
> kasan_set_track+0x25/0x40
> kasan_save_alloc_info+0x1e/0x40
> __kasan_krealloc+0x133/0x190
> krealloc+0xaa/0x130
> nf_ct_ext_add+0xed/0x230 [nf_conntrack]
> tcf_ct_act+0x1095/0x1350 [act_ct]
> tcf_action_exec+0xf8/0x1f0
> fl_classify+0x355/0x360 [cls_flower]
> __tcf_classify+0x1fd/0x330
> tcf_classify+0x21c/0x3c0
> sch_handle_ingress.constprop.0+0x2c5/0x500
> __netif_receive_skb_core.constprop.0+0xb25/0x1510
> __netif_receive_skb_list_core+0x220/0x4c0
> netif_receive_skb_list_internal+0x446/0x620
> napi_complete_done+0x157/0x3d0
> gro_cell_poll+0xcf/0x100
> __napi_poll+0x65/0x310
> net_rx_action+0x30c/0x5c0
> __do_softirq+0x14f/0x491
> 
> Freed by task 6469:
> kasan_save_stack+0x38/0x70
> kasan_set_track+0x25/0x40
> kasan_save_free_info+0x2b/0x60
> ____kasan_slab_free+0x180/0x1f0
> __kasan_slab_free+0x12/0x30
> slab_free_freelist_hook+0xd2/0x1a0
> __kmem_cache_free+0x1a2/0x2f0
> kfree+0x78/0x120
> nf_conntrack_free+0x74/0x130 [nf_conntrack]
> nf_ct_destroy+0xb2/0x140 [nf_conntrack]
> __nf_ct_resolve_clash+0x529/0x5d0 [nf_conntrack]
> nf_ct_resolve_clash+0xf6/0x490 [nf_conntrack]
> __nf_conntrack_confirm+0x2c6/0x770 [nf_conntrack]
> tcf_ct_act+0x12ad/0x1350 [act_ct]
> tcf_action_exec+0xf8/0x1f0
> fl_classify+0x355/0x360 [cls_flower]
> __tcf_classify+0x1fd/0x330
> tcf_classify+0x21c/0x3c0
> sch_handle_ingress.constprop.0+0x2c5/0x500
> __netif_receive_skb_core.constprop.0+0xb25/0x1510
> __netif_receive_skb_list_core+0x220/0x4c0
> netif_receive_skb_list_internal+0x446/0x620
> napi_complete_done+0x157/0x3d0
> gro_cell_poll+0xcf/0x100
> __napi_poll+0x65/0x310
> net_rx_action+0x30c/0x5c0
> __do_softirq+0x14f/0x491
> 
> When resolving a clash, the duplicate conntrack is freed,
> but tcf_ct_act keeps using the freed conntrack instead of the correct conntrack.
> 
> We sent a patch upstream to fix it, and it was merged:
> commit 26488172b0292bed837b95a006a3f3431d1898c3
> Author: Chengen Du <chengen.du@canonical.com>
> Date: Wed Jul 10 13:37:47 2024 +0800
> 
>      net/sched: Fix UAF when resolving a clash
> 
> Cherry-pick this commit to fix the conntrack slab use-after-free issue.
> 
> [Testcase]
> Built a test kernel and verified it in our environment.
> 
> [Where problems could occur]
> This patch ensures that when a clash happens and the duplicate conntrack is freed,
> nf_ct_get is called to get the correct conntrack,
> so the freed conntrack is not used and the rest of the code follows the original path.
> This won't cause other issues.
> 
> Chengen Du (1):
>    net/sched: Fix UAF when resolving a clash
> 
>   net/sched/act_ct.c | 8 ++++++++
>   1 file changed, 8 insertions(+)
> 

This was applied ahead of time for a re-spin in 2024.07.08 to 
jammy:linux-hwe-6.8 and is now applied to noble,jammy:linux/master-next 
(for upcoming SRU cycle). For Oracular / Unstable this is already 
included. Thanks.

-Stefan
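
For readers following the thread, the pattern described in the cover letter (a cached conntrack pointer going stale once clash resolution frees the duplicate, and the fix of fetching the pointer again) can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not the kernel code: struct conn, struct packet, packet_get_conn(), confirm() and process() are hypothetical stand-ins for the conntrack entry, the skb, nf_ct_get(), the confirm step and tcf_ct_act() respectively.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a conntrack entry. */
struct conn {
	int id;
	int confirmed;
};

/* Hypothetical stand-in for an skb carrying a conntrack reference. */
struct packet {
	struct conn *ct;
};

/* Models the role of nf_ct_get(): return the entry attached to the packet. */
static struct conn *packet_get_conn(const struct packet *pkt)
{
	return pkt->ct;
}

/*
 * Models confirmation with clash resolution: if an equivalent entry
 * already exists, the packet's duplicate is freed and the packet is
 * re-pointed at the existing entry.
 */
static void confirm(struct packet *pkt, struct conn *existing)
{
	assert(pkt->ct != NULL);

	if (existing != NULL && existing != pkt->ct) {
		free(pkt->ct);		/* the duplicate is gone */
		pkt->ct = existing;	/* the packet now refers to the winner */
		return;
	}
	pkt->ct->confirmed = 1;
}

/* Models the caller: returns the id of the entry finally in use, or -1. */
int process(struct packet *pkt, struct conn *existing)
{
	struct conn *ct = packet_get_conn(pkt);

	if (ct == NULL)
		return -1;

	if (!ct->confirmed)
		confirm(pkt, existing);

	/*
	 * The local pointer may be dangling after confirm(), so fetch it
	 * from the packet again instead of reusing the cached value.
	 */
	ct = packet_get_conn(pkt);
	if (ct == NULL)
		return -1;

	return ct->id;
}
```

Without the second packet_get_conn() call, process() would read ct->id from freed memory whenever a clash is resolved, which corresponds to the kind of access KASAN flagged in the report.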