diff mbox series

vsprintf: Do not break early boot with probing addresses

Message ID 20190510084213.22149-1-pmladek@suse.com (mailing list archive)
State Not Applicable
Headers show
Series vsprintf: Do not break early boot with probing addresses | expand

Checks

Context Check Description
snowpatch_ozlabs/apply_patch warning Failed to apply on branch next (8150a153c013aa2dd1ffae43370b89ac1347a7fb)
snowpatch_ozlabs/apply_patch fail Failed to apply to any branch

Commit Message

Petr Mladek May 10, 2019, 8:42 a.m. UTC
The commit 3e5903eb9cff70730 ("vsprintf: Prevent crash when dereferencing
invalid pointers") broke boot on several architectures. The common
pattern is that probe_kernel_read() is not working during early
boot because userspace access framework is not ready.

It is a generic problem. We have to avoid any complex external
functions in vsprintf() code, especially in the common path.
They might break printk() easily and are hard to debug.

Replace probe_kernel_read() with some simple checks for obvious
problems.

Details:

1. Report on Power:

Kernel crashes very early during boot with with CONFIG_PPC_KUAP and
CONFIG_JUMP_LABEL_FEATURE_CHECK_DEBUG

The problem is the combination of some new code called via printk(),
check_pointer() which calls probe_kernel_read(). That then calls
allow_user_access() (PPC_KUAP) and that uses mmu_has_feature() too early
(before we've patched features). With the JUMP_LABEL debug enabled that
causes us to call printk() & dump_stack() and we end up recursing and
overflowing the stack.

Because it happens so early you don't get any output, just an apparently
dead system.

The stack trace (which you don't see) is something like:

  ...
  dump_stack+0xdc
  probe_kernel_read+0x1a4
  check_pointer+0x58
  string+0x3c
  vsnprintf+0x1bc
  vscnprintf+0x20
  printk_safe_log_store+0x7c
  printk+0x40
  dump_stack_print_info+0xbc
  dump_stack+0x8
  probe_kernel_read+0x1a4
  probe_kernel_read+0x19c
  check_pointer+0x58
  string+0x3c
  vsnprintf+0x1bc
  vscnprintf+0x20
  vprintk_store+0x6c
  vprintk_emit+0xec
  vprintk_func+0xd4
  printk+0x40
  cpufeatures_process_feature+0xc8
  scan_cpufeatures_subnodes+0x380
  of_scan_flat_dt_subnodes+0xb4
  dt_cpu_ftrs_scan_callback+0x158
  of_scan_flat_dt+0xf0
  dt_cpu_ftrs_scan+0x3c
  early_init_devtree+0x360
  early_setup+0x9c

2. Report on s390:

vsnprintf invocations, are broken on s390. For example, the early boot
output now looks like this where the first (efault) should be
the linux_banner:

[    0.099985] (efault)
[    0.099985] setup: Linux is running as a z/VM guest operating system in 64-bit mode
[    0.100066] setup: The maximum memory size is 8192MB
[    0.100070] cma: Reserved 4 MiB at (efault)
[    0.100100] numa: NUMA mode: (efault)

The reason for this, is that the code assumes that
probe_kernel_address() works very early. This however is not true on
at least s390. Uaccess on KERNEL_DS works only after page tables have
been setup on s390, which happens with setup_arch()->paging_init().

Any probe_kernel_address() invocation before that will return -EFAULT.

Fixes: 3e5903eb9cff70730 ("vsprintf: Prevent crash when dereferencing invalid pointers")
Signed-off-by: Petr Mladek <pmladek@suse.com>
---
 lib/vsprintf.c | 11 ++++-------
 1 file changed, 4 insertions(+), 7 deletions(-)

Comments

Sergey Senozhatsky May 10, 2019, 8:51 a.m. UTC | #1
On (05/10/19 10:42), Petr Mladek wrote:
[..]
> Fixes: 3e5903eb9cff70730 ("vsprintf: Prevent crash when dereferencing invalid pointers")
> Signed-off-by: Petr Mladek <pmladek@suse.com>

FWIW
Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>

	-ss
Petr Mladek May 10, 2019, 2:49 p.m. UTC | #2
On Fri 2019-05-10 10:42:13, Petr Mladek wrote:
> The commit 3e5903eb9cff70730 ("vsprintf: Prevent crash when dereferencing
> invalid pointers") broke boot on several architectures. The common
> pattern is that probe_kernel_read() is not working during early
> boot because userspace access framework is not ready.
> 
> It is a generic problem. We have to avoid any complex external
> functions in vsprintf() code, especially in the common path.
> They might break printk() easily and are hard to debug.
> 
> Replace probe_kernel_read() with some simple checks for obvious
> problems.

JFYI, I have sent a pull request with this patch, see
https://lkml.kernel.org/r/20190510144718.riyy72g4cy5nkggx@pathway.suse.cz

Best Regards,
Petr
Steven Rostedt May 10, 2019, 4:24 p.m. UTC | #3
On Fri, 10 May 2019 10:42:13 +0200
Petr Mladek <pmladek@suse.com> wrote:

>  static const char *check_pointer_msg(const void *ptr)
>  {
> -	char byte;
> -
>  	if (!ptr)
>  		return "(null)";
>  
> -	if (probe_kernel_address(ptr, byte))
> +	if ((unsigned long)ptr < PAGE_SIZE || IS_ERR_VALUE(ptr))
>  		return "(efault)";
>  


	< PAGE_SIZE ?

do you mean: < TASK_SIZE ?

-- Steve
Martin Schwidefsky May 10, 2019, 4:32 p.m. UTC | #4
On Fri, 10 May 2019 12:24:01 -0400
Steven Rostedt <rostedt@goodmis.org> wrote:

> On Fri, 10 May 2019 10:42:13 +0200
> Petr Mladek <pmladek@suse.com> wrote:
> 
> >  static const char *check_pointer_msg(const void *ptr)
> >  {
> > -	char byte;
> > -
> >  	if (!ptr)
> >  		return "(null)";
> >  
> > -	if (probe_kernel_address(ptr, byte))
> > +	if ((unsigned long)ptr < PAGE_SIZE || IS_ERR_VALUE(ptr))
> >  		return "(efault)";
> >    
> 
> 
> 	< PAGE_SIZE ?
> 
> do you mean: < TASK_SIZE ?

The check with < TASK_SIZE would break on s390. The 'ptr' is
in the kernel address space, *not* in the user address space.
Remember s390 has two separate address spaces for kernel/user
the check < TASK_SIZE only makes sense with a __user pointer.
Steven Rostedt May 10, 2019, 4:40 p.m. UTC | #5
On Fri, 10 May 2019 18:32:58 +0200
Martin Schwidefsky <schwidefsky@de.ibm.com> wrote:

> On Fri, 10 May 2019 12:24:01 -0400
> Steven Rostedt <rostedt@goodmis.org> wrote:
> 
> > On Fri, 10 May 2019 10:42:13 +0200
> > Petr Mladek <pmladek@suse.com> wrote:
> >   
> > >  static const char *check_pointer_msg(const void *ptr)
> > >  {
> > > -	char byte;
> > > -
> > >  	if (!ptr)
> > >  		return "(null)";
> > >  
> > > -	if (probe_kernel_address(ptr, byte))
> > > +	if ((unsigned long)ptr < PAGE_SIZE || IS_ERR_VALUE(ptr))
> > >  		return "(efault)";
> > >      
> > 
> > 
> > 	< PAGE_SIZE ?
> > 
> > do you mean: < TASK_SIZE ?  
> 
> The check with < TASK_SIZE would break on s390. The 'ptr' is
> in the kernel address space, *not* in the user address space.
> Remember s390 has two separate address spaces for kernel/user
> the check < TASK_SIZE only makes sense with a __user pointer.
> 

So we allow this to read user addresses? Can't that cause a fault?

If the condition is true, we return "(efault)".

-- Steve
Andy Shevchenko May 10, 2019, 4:41 p.m. UTC | #6
On Fri, May 10, 2019 at 12:24:01PM -0400, Steven Rostedt wrote:
> On Fri, 10 May 2019 10:42:13 +0200
> Petr Mladek <pmladek@suse.com> wrote:
> 
> >  static const char *check_pointer_msg(const void *ptr)
> >  {
> > -	char byte;
> > -
> >  	if (!ptr)
> >  		return "(null)";
> >  
> > -	if (probe_kernel_address(ptr, byte))
> > +	if ((unsigned long)ptr < PAGE_SIZE || IS_ERR_VALUE(ptr))
> >  		return "(efault)";
> >  
> 
> 
> 	< PAGE_SIZE ?
> 
> do you mean: < TASK_SIZE ?

Original code used PAGE_SIZE. If it needs to be changed, that it might be a
separate explanation / patch.
Martin Schwidefsky May 10, 2019, 4:45 p.m. UTC | #7
On Fri, 10 May 2019 12:40:58 -0400
Steven Rostedt <rostedt@goodmis.org> wrote:

> On Fri, 10 May 2019 18:32:58 +0200
> Martin Schwidefsky <schwidefsky@de.ibm.com> wrote:
> 
> > On Fri, 10 May 2019 12:24:01 -0400
> > Steven Rostedt <rostedt@goodmis.org> wrote:
> >   
> > > On Fri, 10 May 2019 10:42:13 +0200
> > > Petr Mladek <pmladek@suse.com> wrote:
> > >     
> > > >  static const char *check_pointer_msg(const void *ptr)
> > > >  {
> > > > -	char byte;
> > > > -
> > > >  	if (!ptr)
> > > >  		return "(null)";
> > > >  
> > > > -	if (probe_kernel_address(ptr, byte))
> > > > +	if ((unsigned long)ptr < PAGE_SIZE || IS_ERR_VALUE(ptr))
> > > >  		return "(efault)";
> > > >        
> > > 
> > > 
> > > 	< PAGE_SIZE ?
> > > 
> > > do you mean: < TASK_SIZE ?    
> > 
> > The check with < TASK_SIZE would break on s390. The 'ptr' is
> > in the kernel address space, *not* in the user address space.
> > Remember s390 has two separate address spaces for kernel/user
> > the check < TASK_SIZE only makes sense with a __user pointer.
> >   
> 
> So we allow this to read user addresses? Can't that cause a fault?
> 
> If the condition is true, we return "(efault)".

On x86 this would allow a user space access as kernel and user live
in the same address space, on s390 it would not.
h
Christophe Leroy May 10, 2019, 5:35 p.m. UTC | #8
Le 10/05/2019 à 18:24, Steven Rostedt a écrit :
> On Fri, 10 May 2019 10:42:13 +0200
> Petr Mladek <pmladek@suse.com> wrote:
> 
>>   static const char *check_pointer_msg(const void *ptr)
>>   {
>> -	char byte;
>> -
>>   	if (!ptr)
>>   		return "(null)";
>>   
>> -	if (probe_kernel_address(ptr, byte))
>> +	if ((unsigned long)ptr < PAGE_SIZE || IS_ERR_VALUE(ptr))
>>   		return "(efault)";
>>   
> 
> 
> 	< PAGE_SIZE ?
> 
> do you mean: < TASK_SIZE ?

I guess not.

Usually, < PAGE_SIZE means NULL pointer dereference (via the member of a 
struct)

Christophe

---
L'absence de virus dans ce courrier électronique a été vérifiée par le logiciel antivirus Avast.
https://www.avast.com/antivirus
David Laight May 13, 2019, 8:52 a.m. UTC | #9
From: christophe leroy
> Sent: 10 May 2019 18:35
> Le 10/05/2019 à 18:24, Steven Rostedt a écrit :
> > On Fri, 10 May 2019 10:42:13 +0200
> > Petr Mladek <pmladek@suse.com> wrote:
> >
> >>   static const char *check_pointer_msg(const void *ptr)
> >>   {
> >> -	char byte;
> >> -
> >>   	if (!ptr)
> >>   		return "(null)";
> >>
> >> -	if (probe_kernel_address(ptr, byte))
> >> +	if ((unsigned long)ptr < PAGE_SIZE || IS_ERR_VALUE(ptr))
> >>   		return "(efault)";

"efault" looks a bit like a spellling mistake for "default".

> > 	< PAGE_SIZE ?
> >
> > do you mean: < TASK_SIZE ?
> 
> I guess not.
> 
> Usually, < PAGE_SIZE means NULL pointer dereference (via the member of a
> struct)

Maybe the caller should pass in a short buffer so that you can return
"(err-%d)" or "(null+%#x)" ?

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
Andy Shevchenko May 13, 2019, 9:13 a.m. UTC | #10
On Mon, May 13, 2019 at 08:52:41AM +0000, David Laight wrote:
> From: christophe leroy
> > Sent: 10 May 2019 18:35
> > Le 10/05/2019 à 18:24, Steven Rostedt a écrit :
> > > On Fri, 10 May 2019 10:42:13 +0200
> > > Petr Mladek <pmladek@suse.com> wrote:

> > >> -	if (probe_kernel_address(ptr, byte))
> > >> +	if ((unsigned long)ptr < PAGE_SIZE || IS_ERR_VALUE(ptr))
> > >>   		return "(efault)";
> 
> "efault" looks a bit like a spellling mistake for "default".

It's a special, thus it's in parenthesis, though somebody can be
misguided.

> > Usually, < PAGE_SIZE means NULL pointer dereference (via the member of a
> > struct)
> 
> Maybe the caller should pass in a short buffer so that you can return
> "(err-%d)"
> or "(null+%#x)" ?

In both cases it should be limited to the size of pointer (8 or 16
characters). Something like "(e:%4d)" would work for error codes.

The "(null)" is good enough by itself and already an established
practice..
Petr Mladek May 13, 2019, 12:24 p.m. UTC | #11
On Fri 2019-05-10 12:40:58, Steven Rostedt wrote:
> On Fri, 10 May 2019 18:32:58 +0200
> Martin Schwidefsky <schwidefsky@de.ibm.com> wrote:
> 
> > On Fri, 10 May 2019 12:24:01 -0400
> > Steven Rostedt <rostedt@goodmis.org> wrote:
> > 
> > > On Fri, 10 May 2019 10:42:13 +0200
> > > Petr Mladek <pmladek@suse.com> wrote:
> > >   
> > > >  static const char *check_pointer_msg(const void *ptr)
> > > >  {
> > > > -	char byte;
> > > > -
> > > >  	if (!ptr)
> > > >  		return "(null)";
> > > >  
> > > > -	if (probe_kernel_address(ptr, byte))
> > > > +	if ((unsigned long)ptr < PAGE_SIZE || IS_ERR_VALUE(ptr))
> > > >  		return "(efault)";
> > > >      
> > > 
> > > 
> > > 	< PAGE_SIZE ?
> > > 
> > > do you mean: < TASK_SIZE ?  
> > 
> > The check with < TASK_SIZE would break on s390. The 'ptr' is
> > in the kernel address space, *not* in the user address space.
> > Remember s390 has two separate address spaces for kernel/user
> > the check < TASK_SIZE only makes sense with a __user pointer.
> > 
> 
> So we allow this to read user addresses? Can't that cause a fault?

I did some quick check and did not found anywhere a user pointer
being dereferenced via vsprintf().

In each case, %s did the check (ptr < PAGE_SIZE) even before this
patchset. The other checks are in %p format modifiers that are
used to print various kernel structures.

Finally, it accesses the pointer directly. I am not completely sure
but I think that it would not work that easily with an address
from the user address space.

Best Regards,
Petr
Petr Mladek May 13, 2019, 12:42 p.m. UTC | #12
On Mon 2019-05-13 12:13:20, Andy Shevchenko wrote:
> On Mon, May 13, 2019 at 08:52:41AM +0000, David Laight wrote:
> > From: christophe leroy
> > > Sent: 10 May 2019 18:35
> > > Le 10/05/2019 à 18:24, Steven Rostedt a écrit :
> > > > On Fri, 10 May 2019 10:42:13 +0200
> > > > Petr Mladek <pmladek@suse.com> wrote:
> 
> > > >> -	if (probe_kernel_address(ptr, byte))
> > > >> +	if ((unsigned long)ptr < PAGE_SIZE || IS_ERR_VALUE(ptr))
> > > >>   		return "(efault)";
> > 
> > "efault" looks a bit like a spellling mistake for "default".
> 
> It's a special, thus it's in parenthesis, though somebody can be
> misguided.
> 
> > > Usually, < PAGE_SIZE means NULL pointer dereference (via the member of a
> > > struct)
> > 
> > Maybe the caller should pass in a short buffer so that you can return
> > "(err-%d)"
> > or "(null+%#x)" ?

There are many vsprintf() users. I am afraid that nobody would want
to define extra buffers for error messages. It must fit into
the one for the formatted string.


> In both cases it should be limited to the size of pointer (8 or 16
> characters). Something like "(e:%4d)" would work for error codes.

I am afraid that we could get 10 different proposals from a huge
enough group of people. I wanted to start with something simple
(source code). I know that (efault) might be confused with
"default" but it is still just one string to grep.


> The "(null)" is good enough by itself and already an established
> practice..

(efault) made more sense with the probe_kernel_read() that
checked wide range of addresses. Well, I still think that
it makes sense to distinguish a pure NULL. And it still
used also for IS_ERR_VALUE().

Best Regards,
Petr
Steven Rostedt May 13, 2019, 2:15 p.m. UTC | #13
On Mon, 13 May 2019 14:42:20 +0200
Petr Mladek <pmladek@suse.com> wrote:

> > The "(null)" is good enough by itself and already an established
> > practice..  
> 
> (efault) made more sense with the probe_kernel_read() that
> checked wide range of addresses. Well, I still think that
> it makes sense to distinguish a pure NULL. And it still
> used also for IS_ERR_VALUE().

Why not just "(fault)"? That is self descriptive enough.

-- Steve
Sergey Senozhatsky May 14, 2019, 2:07 a.m. UTC | #14
On (05/13/19 14:42), Petr Mladek wrote:
> > The "(null)" is good enough by itself and already an established
> > practice..
> 
> (efault) made more sense with the probe_kernel_read() that
> checked wide range of addresses. Well, I still think that
> it makes sense to distinguish a pure NULL. And it still
> used also for IS_ERR_VALUE().

Wouldn't anything within first PAGE_SIZE bytes be reported as
a NULL deref?

       char *p = (char *)(PAGE_SIZE - 2);
       *p = 'a';

gives

 kernel: BUG: kernel NULL pointer dereference, address = 0000000000000ffe
 kernel: #PF: supervisor-privileged write access from kernel code
 kernel: #PF: error_code(0x0002) - not-present page


And I like Steven's "(fault)" idea.
How about this:

	if ptr < PAGE_SIZE		-> "(null)"
	if IS_ERR_VALUE(ptr)		-> "(fault)"

	-ss
Sergey Senozhatsky May 14, 2019, 2:25 a.m. UTC | #15
On (05/14/19 11:07), Sergey Senozhatsky wrote:
> How about this:
> 
> 	if ptr < PAGE_SIZE		-> "(null)"

No, this is totally stupid. Forget about it. Sorry.


> 	if IS_ERR_VALUE(ptr)		-> "(fault)"

But Steven's "(fault)" is nice.

	-ss
David Laight May 14, 2019, 8:28 a.m. UTC | #16
> And I like Steven's "(fault)" idea.
> How about this:
> 
> 	if ptr < PAGE_SIZE		-> "(null)"
> 	if IS_ERR_VALUE(ptr)		-> "(fault)"
> 
> 	-ss

Or:
	if (ptr < PAGE_SIZE)
		return ptr ? "(null+)" : "(null)";
	if IS_ERR_VALUE(ptr)
		return "(errno)"

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
Geert Uytterhoeven May 14, 2019, 9:02 a.m. UTC | #17
On Tue, May 14, 2019 at 10:29 AM David Laight <David.Laight@aculab.com> wrote:
> > And I like Steven's "(fault)" idea.
> > How about this:
> >
> >       if ptr < PAGE_SIZE              -> "(null)"
> >       if IS_ERR_VALUE(ptr)            -> "(fault)"
> >
> >       -ss
>
> Or:
>         if (ptr < PAGE_SIZE)
>                 return ptr ? "(null+)" : "(null)";
>         if IS_ERR_VALUE(ptr)
>                 return "(errno)"

Do we care about the value? "(-E%u)"?

Gr{oetje,eeting}s,

                        Geert
Steven Rostedt May 14, 2019, 6:37 p.m. UTC | #18
[ Purple is a nice shade on the bike shed. ;-) ]

On Tue, 14 May 2019 11:02:17 +0200
Geert Uytterhoeven <geert@linux-m68k.org> wrote:

> On Tue, May 14, 2019 at 10:29 AM David Laight <David.Laight@aculab.com> wrote:
> > > And I like Steven's "(fault)" idea.
> > > How about this:
> > >
> > >       if ptr < PAGE_SIZE              -> "(null)"
> > >       if IS_ERR_VALUE(ptr)            -> "(fault)"
> > >
> > >       -ss  
> >
> > Or:
> >         if (ptr < PAGE_SIZE)
> >                 return ptr ? "(null+)" : "(null)";

Hmm, that is useful.

> >         if IS_ERR_VALUE(ptr)
> >                 return "(errno)"  

I still prefer "(fault)" as is pretty much all I would expect from a
pointer dereference, even if it is just bad parsing of, say, a parsing
an MAC address. "fault" is generic enough. "errno" will be confusing,
because that's normally a variable not a output.

> 
> Do we care about the value? "(-E%u)"?

That too could be confusing. What would (-E22) be considered by a user
doing an sprintf() on some string. I know that would confuse me, or I
would think that it was what the %pX displayed, and wonder why it
displayed it that way. Whereas "(fault)" is quite obvious for any %p
use case.

-- Steve
Geert Uytterhoeven May 14, 2019, 7:13 p.m. UTC | #19
Hi Steve,

On Tue, May 14, 2019 at 8:37 PM Steven Rostedt <rostedt@goodmis.org> wrote:
> On Tue, 14 May 2019 11:02:17 +0200
> Geert Uytterhoeven <geert@linux-m68k.org> wrote:
> > On Tue, May 14, 2019 at 10:29 AM David Laight <David.Laight@aculab.com> wrote:
> > > > And I like Steven's "(fault)" idea.
> > > > How about this:
> > > >
> > > >       if ptr < PAGE_SIZE              -> "(null)"
> > > >       if IS_ERR_VALUE(ptr)            -> "(fault)"
> > > >
> > > >       -ss
> > >
> > > Or:
> > >         if (ptr < PAGE_SIZE)
> > >                 return ptr ? "(null+)" : "(null)";
>
> Hmm, that is useful.
>
> > >         if IS_ERR_VALUE(ptr)
> > >                 return "(errno)"
>
> I still prefer "(fault)" as is pretty much all I would expect from a
> pointer dereference, even if it is just bad parsing of, say, a parsing
> an MAC address. "fault" is generic enough. "errno" will be confusing,
> because that's normally a variable not a output.
>
> >
> > Do we care about the value? "(-E%u)"?
>
> That too could be confusing. What would (-E22) be considered by a user
> doing an sprintf() on some string. I know that would confuse me, or I
> would think that it was what the %pX displayed, and wonder why it
> displayed it that way. Whereas "(fault)" is quite obvious for any %p
> use case.

I would immediately understand there's a missing IS_ERR() check in a
function that can return  -EINVAL, without having to add a new printk()
to find out what kind of bogus value has been received, and without
having to reboot, and trying to reproduce...

Gr{oetje,eeting}s,

                        Geert
Steven Rostedt May 14, 2019, 7:35 p.m. UTC | #20
On Tue, 14 May 2019 21:13:06 +0200
Geert Uytterhoeven <geert@linux-m68k.org> wrote:

> > > Do we care about the value? "(-E%u)"?  
> >
> > That too could be confusing. What would (-E22) be considered by a user
> > doing an sprintf() on some string. I know that would confuse me, or I
> > would think that it was what the %pX displayed, and wonder why it
> > displayed it that way. Whereas "(fault)" is quite obvious for any %p
> > use case.  
> 
> I would immediately understand there's a missing IS_ERR() check in a
> function that can return  -EINVAL, without having to add a new printk()
> to find out what kind of bogus value has been received, and without
> having to reboot, and trying to reproduce...

Hi Geert,

I have to ask. Has there actually been a case that you used a %pX and
it faulted, and you had to go back to find what the value of the
failure was?

IMO, sprintf() should not be a tool to do this, because then people
will not add their IS_ERR() and just let sprintf() do the job for them.
I don't think that would be wise to allow.

-- Steve
Sergey Senozhatsky May 15, 2019, 6:21 a.m. UTC | #21
On (05/14/19 21:13), Geert Uytterhoeven wrote:
> I would immediately understand there's a missing IS_ERR() check in a
> function that can return  -EINVAL, without having to add a new printk()
> to find out what kind of bogus value has been received, and without
> having to reboot, and trying to reproduce...

But chances are that missing IS_ERR() will crash the kernel sooner
or later (in general case), if not in sprintf() then somewhere else.

	-ss
Geert Uytterhoeven May 15, 2019, 7:23 a.m. UTC | #22
Hi Steve,

On Tue, May 14, 2019 at 9:35 PM Steven Rostedt <rostedt@goodmis.org> wrote:
> On Tue, 14 May 2019 21:13:06 +0200
> Geert Uytterhoeven <geert@linux-m68k.org> wrote:
> > > > Do we care about the value? "(-E%u)"?
> > >
> > > That too could be confusing. What would (-E22) be considered by a user
> > > doing an sprintf() on some string. I know that would confuse me, or I
> > > would think that it was what the %pX displayed, and wonder why it
> > > displayed it that way. Whereas "(fault)" is quite obvious for any %p
> > > use case.
> >
> > I would immediately understand there's a missing IS_ERR() check in a
> > function that can return  -EINVAL, without having to add a new printk()
> > to find out what kind of bogus value has been received, and without
> > having to reboot, and trying to reproduce...
>
> I have to ask. Has there actually been a case that you used a %pX and
> it faulted, and you had to go back to find what the value of the
> failure was?

If it faulted, the bad pointer value is obvious from the backtrace.
If the code avoids the fault by verifying the pointer and returning
"(efault)" instead, the bad pointer value is lost.

Or am I missing something?

Thanks!

Gr{oetje,eeting}s,

                        Geert
Petr Mladek May 15, 2019, 7:35 a.m. UTC | #23
On Tue 2019-05-14 14:37:51, Steven Rostedt wrote:
> 
> [ Purple is a nice shade on the bike shed. ;-) ]
> 
> On Tue, 14 May 2019 11:02:17 +0200
> Geert Uytterhoeven <geert@linux-m68k.org> wrote:
> 
> > On Tue, May 14, 2019 at 10:29 AM David Laight <David.Laight@aculab.com> wrote:
> > > > And I like Steven's "(fault)" idea.
> > > > How about this:
> > > >
> > > >       if ptr < PAGE_SIZE              -> "(null)"
> > > >       if IS_ERR_VALUE(ptr)            -> "(fault)"
> > > >
> > > >       -ss  
> > >
> > > Or:
> > >         if (ptr < PAGE_SIZE)
> > >                 return ptr ? "(null+)" : "(null)";
> 
> Hmm, that is useful.
> 
> > >         if IS_ERR_VALUE(ptr)
> > >                 return "(errno)"  
> 
> I still prefer "(fault)" as is pretty much all I would expect from a
> pointer dereference, even if it is just bad parsing of, say, a parsing
> an MAC address. "fault" is generic enough. "errno" will be confusing,
> because that's normally a variable not a output.
> 
> > 
> > Do we care about the value? "(-E%u)"?
> 
> That too could be confusing. What would (-E22) be considered by a user
> doing an sprintf() on some string. I know that would confuse me, or I
> would think that it was what the %pX displayed, and wonder why it
> displayed it that way. Whereas "(fault)" is quite obvious for any %p
> use case.

This discussion clearly shows that it is hard to make anyone happy.

I considered switching to "(fault)" because there seems to be more
people in favor of this.

But there is used also "(einval)" when an unsupported pointer
modifier is passed. The idea is to show error codes that people
are familiar with.

It might have been better to use the uppercase "(EFAULT)" and
"(EINVAL)" to make it more obvious. But I wanted to follow
the existing style with the lowercase "(null)".

As of now, I think that we should keep it as is unless there is
some wider agreement on a change.

Best Regards,
Petr
Petr Mladek May 15, 2019, 7:53 a.m. UTC | #24
On Wed 2019-05-15 09:23:05, Geert Uytterhoeven wrote:
> Hi Steve,
> 
> On Tue, May 14, 2019 at 9:35 PM Steven Rostedt <rostedt@goodmis.org> wrote:
> > On Tue, 14 May 2019 21:13:06 +0200
> > Geert Uytterhoeven <geert@linux-m68k.org> wrote:
> > > > > Do we care about the value? "(-E%u)"?
> > > >
> > > > That too could be confusing. What would (-E22) be considered by a user
> > > > doing an sprintf() on some string. I know that would confuse me, or I
> > > > would think that it was what the %pX displayed, and wonder why it
> > > > displayed it that way. Whereas "(fault)" is quite obvious for any %p
> > > > use case.
> > >
> > > I would immediately understand there's a missing IS_ERR() check in a
> > > function that can return  -EINVAL, without having to add a new printk()
> > > to find out what kind of bogus value has been received, and without
> > > having to reboot, and trying to reproduce...
> >
> > I have to ask. Has there actually been a case that you used a %pX and
> > it faulted, and you had to go back to find what the value of the
> > failure was?
> 
> If it faulted, the bad pointer value is obvious from the backtrace.
> If the code avoids the fault by verifying the pointer and returning
> "(efault)" instead, the bad pointer value is lost.
> 
> Or am I missing something?

Should buggy printk() crash the system?

Another problem is that vsprintf() is called in printk() under
lockbuf_lock. The messages are stored into printk_safe per CPU
buffers. It allows to see the nested messages. But there is still
a bigger risk of missing them than with a "normal" fault.

Finally, various variants of these checks were already used
in "random" printf formats. The only change is that we are
using them consistently everywhere[*] a pointer is accessed.

[*] Just the top level pointer is checked. Some pointer modifiers
are accessing ptr->ptr->val. The lower level pointers are not
checked to avoid too much complexity.

Best Regards,
Petr
David Laight May 15, 2019, 9 a.m. UTC | #25
From: Petr Mladek
> Sent: 15 May 2019 08:36
> On Tue 2019-05-14 14:37:51, Steven Rostedt wrote:
> >
> > [ Purple is a nice shade on the bike shed. ;-) ]
> >
> > On Tue, 14 May 2019 11:02:17 +0200
> > Geert Uytterhoeven <geert@linux-m68k.org> wrote:
> >
> > > On Tue, May 14, 2019 at 10:29 AM David Laight <David.Laight@aculab.com> wrote:
> > > > > And I like Steven's "(fault)" idea.
> > > > > How about this:
> > > > >
> > > > >       if ptr < PAGE_SIZE              -> "(null)"
> > > > >       if IS_ERR_VALUE(ptr)            -> "(fault)"
> > > > >
> > > > >       -ss
> > > >
> > > > Or:
> > > >         if (ptr < PAGE_SIZE)
> > > >                 return ptr ? "(null+)" : "(null)";
> >
> > Hmm, that is useful.
> >
> > > >         if IS_ERR_VALUE(ptr)
> > > >                 return "(errno)"
> >
> > I still prefer "(fault)" as is pretty much all I would expect from a
> > pointer dereference, even if it is just bad parsing of, say, a parsing
> > an MAC address. "fault" is generic enough. "errno" will be confusing,
> > because that's normally a variable not a output.
> >
> > >
> > > Do we care about the value? "(-E%u)"?
> >
> > That too could be confusing. What would (-E22) be considered by a user
> > doing an sprintf() on some string. I know that would confuse me, or I
> > would think that it was what the %pX displayed, and wonder why it
> > displayed it that way. Whereas "(fault)" is quite obvious for any %p
> > use case.
> 
> This discussion clearly shows that it is hard to make anyone happy.
> 
> I considered switching to "(fault)" because there seems to be more
> people in favor of this.
> 
> But there is used also "(einval)" when an unsupported pointer
> modifier is passed. The idea is to show error codes that people
> are familiar with.
> 
> It might have been better to use the uppercase "(EFAULT)" and
> "(EINVAL)" to make it more obvious. But I wanted to follow
> the existing style with the lowercase "(null)".

Printing 'fault' when the code was (trying to) validate the
address was ok.
When the only check is for an -errno value it seems wrong as
most invalid addresses will actually fault (and panic).

The reason modern printf generate "(null)" is that it is far too
easy for a diagnostic print to fail to test a pointer.
It also makes it easier when 'throwing in' printf while debugging
to add a single trace that will work regardless of whether a
call had succeeded or not.

With the Linux kernel putting errno values into pointers it
seems likely that most invalid pointers in printf will actaully
be error values.
Printing the value will be helpful during debugging - as a
trace can be put after a call and show the parameters and result.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
diff mbox series

Patch

diff --git a/lib/vsprintf.c b/lib/vsprintf.c
index 7b0a6140bfad..2f003cfe340e 100644
--- a/lib/vsprintf.c
+++ b/lib/vsprintf.c
@@ -628,19 +628,16 @@  static char *error_string(char *buf, char *end, const char *s,
 }
 
 /*
- * This is not a fool-proof test. 99% of the time that this will fault is
- * due to a bad pointer, not one that crosses into bad memory. Just test
- * the address to make sure it doesn't fault due to a poorly added printk
- * during debugging.
+ * Do not call any complex external code here. Nested printk()/vsprintf()
+ * might cause infinite loops. Failures might break printk() and would
+ * be hard to debug.
  */
 static const char *check_pointer_msg(const void *ptr)
 {
-	char byte;
-
 	if (!ptr)
 		return "(null)";
 
-	if (probe_kernel_address(ptr, byte))
+	if ((unsigned long)ptr < PAGE_SIZE || IS_ERR_VALUE(ptr))
 		return "(efault)";
 
 	return NULL;