diff mbox series

powerpc/pseries: Whitelist dtl slub object for copying to userspace

Message ID 20240614173844.746818-1-anjalik@linux.ibm.com (mailing list archive)
State Accepted
Commit 1a14150e1656f7a332a943154fc486504db4d586
Headers show
Series powerpc/pseries: Whitelist dtl slub object for copying to userspace | expand

Checks

Context Check Description
snowpatch_ozlabs/github-powerpc_selftests success Successfully ran 8 jobs.
snowpatch_ozlabs/github-powerpc_ppctests success Successfully ran 8 jobs.
snowpatch_ozlabs/github-powerpc_sparse success Successfully ran 4 jobs.
snowpatch_ozlabs/github-powerpc_clang success Successfully ran 6 jobs.
snowpatch_ozlabs/github-powerpc_kernel_qemu success Successfully ran 23 jobs.

Commit Message

Anjali K June 14, 2024, 5:38 p.m. UTC
Reading the dispatch trace log from /sys/kernel/debug/powerpc/dtl/cpu-*
results in a BUG() when the config CONFIG_HARDENED_USERCOPY is enabled as
shown below.

    kernel BUG at mm/usercopy.c:102!
    Oops: Exception in kernel mode, sig: 5 [#1]
    LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA pSeries
    Modules linked in: xfs libcrc32c dm_service_time sd_mod t10_pi sg ibmvfc
    scsi_transport_fc ibmveth pseries_wdt dm_multipath dm_mirror dm_region_hash dm_log dm_mod fuse
    CPU: 27 PID: 1815 Comm: python3 Not tainted 6.10.0-rc3 #85
    Hardware name: IBM,9040-MRX POWER10 (raw) 0x800200 0xf000006 of:IBM,FW1060.00 (NM1060_042) hv:phyp pSeries
    NIP:  c0000000005d23d4 LR: c0000000005d23d0 CTR: 00000000006ee6f8
    REGS: c000000120c078c0 TRAP: 0700   Not tainted  (6.10.0-rc3)
    MSR:  8000000000029033 <SF,EE,ME,IR,DR,RI,LE>  CR: 2828220f  XER: 0000000e
    CFAR: c0000000001fdc80 IRQMASK: 0
    [ ... GPRs omitted ... ]
    NIP [c0000000005d23d4] usercopy_abort+0x78/0xb0
    LR [c0000000005d23d0] usercopy_abort+0x74/0xb0
    Call Trace:
     usercopy_abort+0x74/0xb0 (unreliable)
     __check_heap_object+0xf8/0x120
     check_heap_object+0x218/0x240
     __check_object_size+0x84/0x1a4
     dtl_file_read+0x17c/0x2c4
     full_proxy_read+0x8c/0x110
     vfs_read+0xdc/0x3a0
     ksys_read+0x84/0x144
     system_call_exception+0x124/0x330
     system_call_vectored_common+0x15c/0x2ec
    --- interrupt: 3000 at 0x7fff81f3ab34

Commit 6d07d1cd300f ("usercopy: Restrict non-usercopy caches to size 0")
requires that only whitelisted areas in slab/slub objects can be copied to
userspace when usercopy hardening is enabled using CONFIG_HARDENED_USERCOPY.
Dtl contains hypervisor dispatch events which are expected to be read by
privileged users. Hence mark this safe for user access.
Specify useroffset=0 and usersize=DISPATCH_LOG_BYTES to whitelist the
entire object.

Co-developed-by: Vishal Chourasia <vishalc@linux.ibm.com>
Signed-off-by: Vishal Chourasia <vishalc@linux.ibm.com>
Signed-off-by: Anjali K <anjalik@linux.ibm.com>
---
 arch/powerpc/platforms/pseries/setup.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)


base-commit: 83a7eefedc9b56fe7bfeff13b6c7356688ffa670

Comments

Srikar Dronamraju June 17, 2024, 10:37 a.m. UTC | #1
> Commit 6d07d1cd300f ("usercopy: Restrict non-usercopy caches to size 0")
> requires that only whitelisted areas in slab/slub objects can be copied to
> userspace when usercopy hardening is enabled using CONFIG_HARDENED_USERCOPY.
> Dtl contains hypervisor dispatch events which are expected to be read by
> privileged users. Hence mark this safe for user access.
> Specify useroffset=0 and usersize=DISPATCH_LOG_BYTES to whitelist the
> entire object.
> 
> Co-developed-by: Vishal Chourasia <vishalc@linux.ibm.com>
> Signed-off-by: Vishal Chourasia <vishalc@linux.ibm.com>
> Signed-off-by: Anjali K <anjalik@linux.ibm.com>
> ---
>  arch/powerpc/platforms/pseries/setup.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 

Looks good to me.

Reviewed-by: Srikar Dronamraju <srikar@linux.ibm.com>
Kees Cook June 17, 2024, 5:59 p.m. UTC | #2
On Fri, Jun 14, 2024 at 11:08:44PM +0530, Anjali K wrote:
> Reading the dispatch trace log from /sys/kernel/debug/powerpc/dtl/cpu-*
> results in a BUG() when the config CONFIG_HARDENED_USERCOPY is enabled as
> shown below.
> 
>     kernel BUG at mm/usercopy.c:102!
>     Oops: Exception in kernel mode, sig: 5 [#1]
>     LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA pSeries
>     Modules linked in: xfs libcrc32c dm_service_time sd_mod t10_pi sg ibmvfc
>     scsi_transport_fc ibmveth pseries_wdt dm_multipath dm_mirror dm_region_hash dm_log dm_mod fuse
>     CPU: 27 PID: 1815 Comm: python3 Not tainted 6.10.0-rc3 #85
>     Hardware name: IBM,9040-MRX POWER10 (raw) 0x800200 0xf000006 of:IBM,FW1060.00 (NM1060_042) hv:phyp pSeries
>     NIP:  c0000000005d23d4 LR: c0000000005d23d0 CTR: 00000000006ee6f8
>     REGS: c000000120c078c0 TRAP: 0700   Not tainted  (6.10.0-rc3)
>     MSR:  8000000000029033 <SF,EE,ME,IR,DR,RI,LE>  CR: 2828220f  XER: 0000000e
>     CFAR: c0000000001fdc80 IRQMASK: 0
>     [ ... GPRs omitted ... ]
>     NIP [c0000000005d23d4] usercopy_abort+0x78/0xb0
>     LR [c0000000005d23d0] usercopy_abort+0x74/0xb0
>     Call Trace:
>      usercopy_abort+0x74/0xb0 (unreliable)
>      __check_heap_object+0xf8/0x120
>      check_heap_object+0x218/0x240
>      __check_object_size+0x84/0x1a4
>      dtl_file_read+0x17c/0x2c4
>      full_proxy_read+0x8c/0x110
>      vfs_read+0xdc/0x3a0
>      ksys_read+0x84/0x144
>      system_call_exception+0x124/0x330
>      system_call_vectored_common+0x15c/0x2ec
>     --- interrupt: 3000 at 0x7fff81f3ab34
> 
> Commit 6d07d1cd300f ("usercopy: Restrict non-usercopy caches to size 0")
> requires that only whitelisted areas in slab/slub objects can be copied to
> userspace when usercopy hardening is enabled using CONFIG_HARDENED_USERCOPY.
> Dtl contains hypervisor dispatch events which are expected to be read by
> privileged users. Hence mark this safe for user access.
> Specify useroffset=0 and usersize=DISPATCH_LOG_BYTES to whitelist the
> entire object.
> 
> Co-developed-by: Vishal Chourasia <vishalc@linux.ibm.com>
> Signed-off-by: Vishal Chourasia <vishalc@linux.ibm.com>
> Signed-off-by: Anjali K <anjalik@linux.ibm.com>
> ---
>  arch/powerpc/platforms/pseries/setup.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/powerpc/platforms/pseries/setup.c b/arch/powerpc/platforms/pseries/setup.c
> index 284a6fa04b0c..cba40d9d1284 100644
> --- a/arch/powerpc/platforms/pseries/setup.c
> +++ b/arch/powerpc/platforms/pseries/setup.c
> @@ -343,8 +343,8 @@ static int alloc_dispatch_log_kmem_cache(void)
>  {
>  	void (*ctor)(void *) = get_dtl_cache_ctor();
>  
> -	dtl_cache = kmem_cache_create("dtl", DISPATCH_LOG_BYTES,
> -						DISPATCH_LOG_BYTES, 0, ctor);
> +	dtl_cache = kmem_cache_create_usercopy("dtl", DISPATCH_LOG_BYTES,
> +						DISPATCH_LOG_BYTES, 0, 0, DISPATCH_LOG_BYTES, ctor);
>  	if (!dtl_cache) {
>  		pr_warn("Failed to create dispatch trace log buffer cache\n");
>  		pr_warn("Stolen time statistics will be unreliable\n");

Are you sure you want to universally expose this memory region? It
sounds like it's only exposed via a debug interface. Maybe it'd be
better to use a bounce buffer in the debug interface instead?


diff --git a/arch/powerpc/platforms/pseries/dtl.c b/arch/powerpc/platforms/pseries/dtl.c
index 3f1cdccebc9c..3adcff5cc4b2 100644
--- a/arch/powerpc/platforms/pseries/dtl.c
+++ b/arch/powerpc/platforms/pseries/dtl.c
@@ -257,6 +257,22 @@ static int dtl_file_release(struct inode *inode, struct file *filp)
 	return 0;
 }
 
+static inline int bounce_copy(char __user *buf, void *src, size_t size)
+{
+	u8 *bounce;
+	int rc;
+
+	bounce = kmalloc(size, GFP_KERNEL);
+	if (!bounce)
+		return -ENOMEM;
+
+	memcpy(bounce, src, size);
+	rc = copy_to_user(buf, bounce, size);
+
+	kfree(bounce);
+	return rc;
+}
+
 static ssize_t dtl_file_read(struct file *filp, char __user *buf, size_t len,
 		loff_t *pos)
 {
@@ -300,7 +316,7 @@ static ssize_t dtl_file_read(struct file *filp, char __user *buf, size_t len,
 	if (i + n_req > dtl->buf_entries) {
 		read_size = dtl->buf_entries - i;
 
-		rc = copy_to_user(buf, &dtl->buf[i],
+		rc = bounce_copy(buf, &dtl->buf[i],
 				read_size * sizeof(struct dtl_entry));
 		if (rc)
 			return -EFAULT;
@@ -312,7 +328,7 @@ static ssize_t dtl_file_read(struct file *filp, char __user *buf, size_t len,
 	}
 
 	/* .. and now the head */
-	rc = copy_to_user(buf, &dtl->buf[i], n_req * sizeof(struct dtl_entry));
+	rc = bounce_copy(buf, &dtl->buf[i], n_req * sizeof(struct dtl_entry));
 	if (rc)
 		return -EFAULT;
Michael Ellerman June 18, 2024, 7:11 a.m. UTC | #3
Kees Cook <kees@kernel.org> writes:
> On Fri, Jun 14, 2024 at 11:08:44PM +0530, Anjali K wrote:
>> Reading the dispatch trace log from /sys/kernel/debug/powerpc/dtl/cpu-*
>> results in a BUG() when the config CONFIG_HARDENED_USERCOPY is enabled as
>> shown below.
>> 
>>     kernel BUG at mm/usercopy.c:102!
>>     Oops: Exception in kernel mode, sig: 5 [#1]
>>     LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA pSeries
>>     Modules linked in: xfs libcrc32c dm_service_time sd_mod t10_pi sg ibmvfc
>>     scsi_transport_fc ibmveth pseries_wdt dm_multipath dm_mirror dm_region_hash dm_log dm_mod fuse
>>     CPU: 27 PID: 1815 Comm: python3 Not tainted 6.10.0-rc3 #85
>>     Hardware name: IBM,9040-MRX POWER10 (raw) 0x800200 0xf000006 of:IBM,FW1060.00 (NM1060_042) hv:phyp pSeries
>>     NIP:  c0000000005d23d4 LR: c0000000005d23d0 CTR: 00000000006ee6f8
>>     REGS: c000000120c078c0 TRAP: 0700   Not tainted  (6.10.0-rc3)
>>     MSR:  8000000000029033 <SF,EE,ME,IR,DR,RI,LE>  CR: 2828220f  XER: 0000000e
>>     CFAR: c0000000001fdc80 IRQMASK: 0
>>     [ ... GPRs omitted ... ]
>>     NIP [c0000000005d23d4] usercopy_abort+0x78/0xb0
>>     LR [c0000000005d23d0] usercopy_abort+0x74/0xb0
>>     Call Trace:
>>      usercopy_abort+0x74/0xb0 (unreliable)
>>      __check_heap_object+0xf8/0x120
>>      check_heap_object+0x218/0x240
>>      __check_object_size+0x84/0x1a4
>>      dtl_file_read+0x17c/0x2c4
>>      full_proxy_read+0x8c/0x110
>>      vfs_read+0xdc/0x3a0
>>      ksys_read+0x84/0x144
>>      system_call_exception+0x124/0x330
>>      system_call_vectored_common+0x15c/0x2ec
>>     --- interrupt: 3000 at 0x7fff81f3ab34
>> 
>> Commit 6d07d1cd300f ("usercopy: Restrict non-usercopy caches to size 0")
>> requires that only whitelisted areas in slab/slub objects can be copied to
>> userspace when usercopy hardening is enabled using CONFIG_HARDENED_USERCOPY.
>> Dtl contains hypervisor dispatch events which are expected to be read by
>> privileged users. Hence mark this safe for user access.
>> Specify useroffset=0 and usersize=DISPATCH_LOG_BYTES to whitelist the
>> entire object.
>> 
>> Co-developed-by: Vishal Chourasia <vishalc@linux.ibm.com>
>> Signed-off-by: Vishal Chourasia <vishalc@linux.ibm.com>
>> Signed-off-by: Anjali K <anjalik@linux.ibm.com>
>> ---
>>  arch/powerpc/platforms/pseries/setup.c | 4 ++--
>>  1 file changed, 2 insertions(+), 2 deletions(-)
>> 
>> diff --git a/arch/powerpc/platforms/pseries/setup.c b/arch/powerpc/platforms/pseries/setup.c
>> index 284a6fa04b0c..cba40d9d1284 100644
>> --- a/arch/powerpc/platforms/pseries/setup.c
>> +++ b/arch/powerpc/platforms/pseries/setup.c
>> @@ -343,8 +343,8 @@ static int alloc_dispatch_log_kmem_cache(void)
>>  {
>>  	void (*ctor)(void *) = get_dtl_cache_ctor();
>>  
>> -	dtl_cache = kmem_cache_create("dtl", DISPATCH_LOG_BYTES,
>> -						DISPATCH_LOG_BYTES, 0, ctor);
>> +	dtl_cache = kmem_cache_create_usercopy("dtl", DISPATCH_LOG_BYTES,
>> +						DISPATCH_LOG_BYTES, 0, 0, DISPATCH_LOG_BYTES, ctor);
>>  	if (!dtl_cache) {
>>  		pr_warn("Failed to create dispatch trace log buffer cache\n");
>>  		pr_warn("Stolen time statistics will be unreliable\n");
>
> Are you sure you want to universally expose this memory region? It
> sounds like it's only exposed via a debug interface. Maybe it'd be
> better to use a bounce buffer in the debug interface instead?

I'm not sure what the threat is?

The log entries are written by the hypervisor, but never read.

That kmem_cache is only used to allocate the array of dtl_entry structs,
the ring buffer itself (struct dtl) is allocated statically. So
overwriting the dtl_entries can't corrupt the structure of the ring
buffer, just the content.

An attacker could read the entries and see some kernel pointers, but
those are everywhere.

So it seems pretty harmless.

I guess there isn't a kmem_cache_create_user_readonly() ?

> diff --git a/arch/powerpc/platforms/pseries/dtl.c b/arch/powerpc/platforms/pseries/dtl.c
> index 3f1cdccebc9c..3adcff5cc4b2 100644
> --- a/arch/powerpc/platforms/pseries/dtl.c
> +++ b/arch/powerpc/platforms/pseries/dtl.c
> @@ -257,6 +257,22 @@ static int dtl_file_release(struct inode *inode, struct file *filp)
>  	return 0;
>  }
>  
> +static inline int bounce_copy(char __user *buf, void *src, size_t size)
> +{
> +	u8 *bounce;
> +	int rc;
> +
> +	bounce = kmalloc(size, GFP_KERNEL);
> +	if (!bounce)
> +		return -ENOMEM;
> +
> +	memcpy(bounce, src, size);
> +	rc = copy_to_user(buf, bounce, size);
> +
> +	kfree(bounce);
> +	return rc;
> +}

Is there no generic version of that?

> @@ -300,7 +316,7 @@ static ssize_t dtl_file_read(struct file *filp, char __user *buf, size_t len,
>  	if (i + n_req > dtl->buf_entries) {
>  		read_size = dtl->buf_entries - i;
>  
> -		rc = copy_to_user(buf, &dtl->buf[i],
> +		rc = bounce_copy(buf, &dtl->buf[i],
>  				read_size * sizeof(struct dtl_entry));
>  		if (rc)
>  			return -EFAULT;
> @@ -312,7 +328,7 @@ static ssize_t dtl_file_read(struct file *filp, char __user *buf, size_t len,
>  	}
>  
>  	/* .. and now the head */
> -	rc = copy_to_user(buf, &dtl->buf[i], n_req * sizeof(struct dtl_entry));
> +	rc = bounce_copy(buf, &dtl->buf[i], n_req * sizeof(struct dtl_entry));
>  	if (rc)
>  		return -EFAULT;
>  
>

cheers
Anjali K June 20, 2024, 5:28 p.m. UTC | #4
Hi Kees
Thank you for your review.

On 17/06/24 23:29, Kees Cook wrote:
> On Fri, Jun 14, 2024 at 11:08:44PM +0530, Anjali K wrote:
>> Reading the dispatch trace log from /sys/kernel/debug/powerpc/dtl/cpu-*
>> results in a BUG() when the config CONFIG_HARDENED_USERCOPY is enabled as
>> shown below.
>>
>>     kernel BUG at mm/usercopy.c:102!
>>     Oops: Exception in kernel mode, sig: 5 [#1]
>>     LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA pSeries
>>     Modules linked in: xfs libcrc32c dm_service_time sd_mod t10_pi sg ibmvfc
>>     scsi_transport_fc ibmveth pseries_wdt dm_multipath dm_mirror dm_region_hash dm_log dm_mod fuse
>>     CPU: 27 PID: 1815 Comm: python3 Not tainted 6.10.0-rc3 #85
>>     Hardware name: IBM,9040-MRX POWER10 (raw) 0x800200 0xf000006 of:IBM,FW1060.00 (NM1060_042) hv:phyp pSeries
>>     NIP:  c0000000005d23d4 LR: c0000000005d23d0 CTR: 00000000006ee6f8
>>     REGS: c000000120c078c0 TRAP: 0700   Not tainted  (6.10.0-rc3)
>>     MSR:  8000000000029033 <SF,EE,ME,IR,DR,RI,LE>  CR: 2828220f  XER: 0000000e
>>     CFAR: c0000000001fdc80 IRQMASK: 0
>>     [ ... GPRs omitted ... ]
>>     NIP [c0000000005d23d4] usercopy_abort+0x78/0xb0
>>     LR [c0000000005d23d0] usercopy_abort+0x74/0xb0
>>     Call Trace:
>>      usercopy_abort+0x74/0xb0 (unreliable)
>>      __check_heap_object+0xf8/0x120
>>      check_heap_object+0x218/0x240
>>      __check_object_size+0x84/0x1a4
>>      dtl_file_read+0x17c/0x2c4
>>      full_proxy_read+0x8c/0x110
>>      vfs_read+0xdc/0x3a0
>>      ksys_read+0x84/0x144
>>      system_call_exception+0x124/0x330
>>      system_call_vectored_common+0x15c/0x2ec
>>     --- interrupt: 3000 at 0x7fff81f3ab34
>>
>> Commit 6d07d1cd300f ("usercopy: Restrict non-usercopy caches to size 0")
>> requires that only whitelisted areas in slab/slub objects can be copied to
>> userspace when usercopy hardening is enabled using CONFIG_HARDENED_USERCOPY.
>> Dtl contains hypervisor dispatch events which are expected to be read by
>> privileged users. Hence mark this safe for user access.
>> Specify useroffset=0 and usersize=DISPATCH_LOG_BYTES to whitelist the
>> entire object.
>>
>> Co-developed-by: Vishal Chourasia <vishalc@linux.ibm.com>
>> Signed-off-by: Vishal Chourasia <vishalc@linux.ibm.com>
>> Signed-off-by: Anjali K <anjalik@linux.ibm.com>
>> ---
>>  arch/powerpc/platforms/pseries/setup.c | 4 ++--
>>  1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/arch/powerpc/platforms/pseries/setup.c b/arch/powerpc/platforms/pseries/setup.c
>> index 284a6fa04b0c..cba40d9d1284 100644
>> --- a/arch/powerpc/platforms/pseries/setup.c
>> +++ b/arch/powerpc/platforms/pseries/setup.c
>> @@ -343,8 +343,8 @@ static int alloc_dispatch_log_kmem_cache(void)
>>  {
>>  	void (*ctor)(void *) = get_dtl_cache_ctor();
>>  
>> -	dtl_cache = kmem_cache_create("dtl", DISPATCH_LOG_BYTES,
>> -						DISPATCH_LOG_BYTES, 0, ctor);
>> +	dtl_cache = kmem_cache_create_usercopy("dtl", DISPATCH_LOG_BYTES,
>> +						DISPATCH_LOG_BYTES, 0, 0, DISPATCH_LOG_BYTES, ctor);
>>  	if (!dtl_cache) {
>>  		pr_warn("Failed to create dispatch trace log buffer cache\n");
>>  		pr_warn("Stolen time statistics will be unreliable\n");
> Are you sure you want to universally expose this memory region? It
> sounds like it's only exposed via a debug interface. Maybe it'd be
> better to use a bounce buffer in the debug interface instead
>
> diff --git a/arch/powerpc/platforms/pseries/dtl.c b/arch/powerpc/platforms/pseries/dtl.c
> index 3f1cdccebc9c..3adcff5cc4b2 100644
> --- a/arch/powerpc/platforms/pseries/dtl.c
> +++ b/arch/powerpc/platforms/pseries/dtl.c
> @@ -257,6 +257,22 @@ static int dtl_file_release(struct inode *inode, struct file *filp)
>  	return 0;
>  }
>  
> +static inline int bounce_copy(char __user *buf, void *src, size_t size)
> +{
> +	u8 *bounce;
> +	int rc;
> +
> +	bounce = kmalloc(size, GFP_KERNEL);
> +	if (!bounce)
> +		return -ENOMEM;
> +
> +	memcpy(bounce, src, size);
> +	rc = copy_to_user(buf, bounce, size);
> +
> +	kfree(bounce);
> +	return rc;
> +}
> +
>  static ssize_t dtl_file_read(struct file *filp, char __user *buf, size_t len,
>  		loff_t *pos)
>  {
> @@ -300,7 +316,7 @@ static ssize_t dtl_file_read(struct file *filp, char __user *buf, size_t len,
>  	if (i + n_req > dtl->buf_entries) {
>  		read_size = dtl->buf_entries - i;
>  
> -		rc = copy_to_user(buf, &dtl->buf[i],
> +		rc = bounce_copy(buf, &dtl->buf[i],
>  				read_size * sizeof(struct dtl_entry));
>  		if (rc)
>  			return -EFAULT;
> @@ -312,7 +328,7 @@ static ssize_t dtl_file_read(struct file *filp, char __user *buf, size_t len,
>  	}
>  
>  	/* .. and now the head */
> -	rc = copy_to_user(buf, &dtl->buf[i], n_req * sizeof(struct dtl_entry));
> +	rc = bounce_copy(buf, &dtl->buf[i], n_req * sizeof(struct dtl_entry));                              
>  	if (rc)
>  		return -EFAULT;
>  
>
>
I have verified that the above patch is working.                           
As you mentioned, an advantage of using the bounce buffer approach is that
instead of whitelisting the full dtl slub object at boot, we only whitelist
the subset of entries.                                                     
                                                                               
However given that:                                                        
(i) The dtl buffer is read-only. The dtl trace is a set of metrics which   
are collected to be read by privileged users.                              
(ii) Users usually reads all the dtl entries, not a subset.                                             
(iii) Read overflows are unlikely to expose anything useful to attackers        
since we are whitelisting the complete slub object and there are no        
contiguous memory locations which need to be hidden.                       
Can we go ahead with the whitelisting using kmem_cache_create_usercopy()   
approach?                                                                  
Or are there other reasons to prefer the bounce buffer approach?

Thank you,
Anjali K
Kees Cook June 20, 2024, 6:59 p.m. UTC | #5
On Thu, Jun 20, 2024 at 10:58:49PM +0530, Anjali K wrote:
> However given that:                                                        
> (i) The dtl buffer is read-only. The dtl trace is a set of metrics which   
> are collected to be read by privileged users.                              
> (ii) Users usually reads all the dtl entries, not a subset.                                             
> (iii) Read overflows are unlikely to expose anything useful to attackers        
> since we are whitelisting the complete slub object and there are no        
> contiguous memory locations which need to be hidden.                       
> Can we go ahead with the whitelisting using kmem_cache_create_usercopy()   
> approach?                                                                  
> Or are there other reasons to prefer the bounce buffer approach?

Yeah, based on this and what mpe said, I have no objection to just
allowing it in kmem_cache_create_usercopy(). I was mainly just curious
what the threat model was. :)

Reviewed-by: Kees Cook <kees@kernel.org>
Anjali K June 21, 2024, 8:12 a.m. UTC | #6
Hi Michael

On 18/06/24 12:41, Michael Ellerman wrote:
> I guess there isn't a kmem_cache_create_user_readonly() ?
Thank you for your review.                                                    
My understanding of the question is whether there's a way to whitelist a   
region such that it can be copied to userspace, but not written to using   
copy_from_user().                                                             
No, we don't have a function to whitelist only for copy_to_user() and not  
copy_from_user().

Thank you
Anjali K
Michael Ellerman June 21, 2024, 11:38 a.m. UTC | #7
Anjali K <anjalik@linux.ibm.com> writes:
> Hi Michael
>
> On 18/06/24 12:41, Michael Ellerman wrote:
>> I guess there isn't a kmem_cache_create_user_readonly() ?

> Thank you for your review.                                                    
> 
> My understanding of the question is whether there's a way to whitelist a   
> region such that it can be copied to userspace, but not written to using   
> copy_from_user().                                                             
 
Yes that's what I meant, and I pretty much worked that out from looking
at the implementation, but was hoping Kees would tell me it was there
somewhere, or implement it :)  Apologies for being cryptic.

> No, we don't have a function to whitelist only for copy_to_user() and not  
> copy_from_user().

Yep. I'll take this patch as-is, I think we've established that it's
pretty low risk to whitelist the whole cache.

cheers
Michael Ellerman June 24, 2024, 12:30 p.m. UTC | #8
On Fri, 14 Jun 2024 23:08:44 +0530, Anjali K wrote:
> Reading the dispatch trace log from /sys/kernel/debug/powerpc/dtl/cpu-*
> results in a BUG() when the config CONFIG_HARDENED_USERCOPY is enabled as
> shown below.
> 
>     kernel BUG at mm/usercopy.c:102!
>     Oops: Exception in kernel mode, sig: 5 [#1]
>     LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA pSeries
>     Modules linked in: xfs libcrc32c dm_service_time sd_mod t10_pi sg ibmvfc
>     scsi_transport_fc ibmveth pseries_wdt dm_multipath dm_mirror dm_region_hash dm_log dm_mod fuse
>     CPU: 27 PID: 1815 Comm: python3 Not tainted 6.10.0-rc3 #85
>     Hardware name: IBM,9040-MRX POWER10 (raw) 0x800200 0xf000006 of:IBM,FW1060.00 (NM1060_042) hv:phyp pSeries
>     NIP:  c0000000005d23d4 LR: c0000000005d23d0 CTR: 00000000006ee6f8
>     REGS: c000000120c078c0 TRAP: 0700   Not tainted  (6.10.0-rc3)
>     MSR:  8000000000029033 <SF,EE,ME,IR,DR,RI,LE>  CR: 2828220f  XER: 0000000e
>     CFAR: c0000000001fdc80 IRQMASK: 0
>     [ ... GPRs omitted ... ]
>     NIP [c0000000005d23d4] usercopy_abort+0x78/0xb0
>     LR [c0000000005d23d0] usercopy_abort+0x74/0xb0
>     Call Trace:
>      usercopy_abort+0x74/0xb0 (unreliable)
>      __check_heap_object+0xf8/0x120
>      check_heap_object+0x218/0x240
>      __check_object_size+0x84/0x1a4
>      dtl_file_read+0x17c/0x2c4
>      full_proxy_read+0x8c/0x110
>      vfs_read+0xdc/0x3a0
>      ksys_read+0x84/0x144
>      system_call_exception+0x124/0x330
>      system_call_vectored_common+0x15c/0x2ec
>     --- interrupt: 3000 at 0x7fff81f3ab34
> 
> [...]

Applied to powerpc/fixes.

[1/1] powerpc/pseries: Whitelist dtl slub object for copying to userspace
      https://git.kernel.org/powerpc/c/1a14150e1656f7a332a943154fc486504db4d586

cheers
diff mbox series

Patch

diff --git a/arch/powerpc/platforms/pseries/setup.c b/arch/powerpc/platforms/pseries/setup.c
index 284a6fa04b0c..cba40d9d1284 100644
--- a/arch/powerpc/platforms/pseries/setup.c
+++ b/arch/powerpc/platforms/pseries/setup.c
@@ -343,8 +343,8 @@  static int alloc_dispatch_log_kmem_cache(void)
 {
 	void (*ctor)(void *) = get_dtl_cache_ctor();
 
-	dtl_cache = kmem_cache_create("dtl", DISPATCH_LOG_BYTES,
-						DISPATCH_LOG_BYTES, 0, ctor);
+	dtl_cache = kmem_cache_create_usercopy("dtl", DISPATCH_LOG_BYTES,
+						DISPATCH_LOG_BYTES, 0, 0, DISPATCH_LOG_BYTES, ctor);
 	if (!dtl_cache) {
 		pr_warn("Failed to create dispatch trace log buffer cache\n");
 		pr_warn("Stolen time statistics will be unreliable\n");