mbox series

[0/2] mm: make copy_to_kernel_nofault() not fault on user addresses

Message ID cover.1725223574.git.osandov@fb.com
Headers show
Series mm: make copy_to_kernel_nofault() not fault on user addresses | expand

Message

Omar Sandoval Sept. 2, 2024, 5:31 a.m. UTC
From: Omar Sandoval <osandov@fb.com>

Hi,

I hit a case where copy_to_kernel_nofault() will fault (lol): if the
destination address is in userspace and x86 Supervisor Mode Access
Prevention is enabled. Patch 2 has the details and the fix. Patch 1
renames a helper function so that its use in patch 2 makes more sense.
If the rename is too intrusive, I can drop it.

Thanks,
Omar

Omar Sandoval (2):
  mm: rename copy_from_kernel_nofault_allowed() to
    copy_kernel_nofault_allowed()
  mm: make copy_to_kernel_nofault() not fault on user addresses

 arch/arm/mm/fault.c         |  2 +-
 arch/loongarch/mm/maccess.c |  2 +-
 arch/mips/mm/maccess.c      |  2 +-
 arch/parisc/lib/memcpy.c    |  2 +-
 arch/powerpc/mm/maccess.c   |  2 +-
 arch/um/kernel/maccess.c    |  2 +-
 arch/x86/mm/maccess.c       |  4 ++--
 include/linux/uaccess.h     |  2 +-
 mm/maccess.c                | 10 ++++++----
 9 files changed, 15 insertions(+), 13 deletions(-)

Comments

Christophe Leroy Sept. 2, 2024, 6:19 a.m. UTC | #1
Le 02/09/2024 à 07:31, Omar Sandoval a écrit :
> [Vous ne recevez pas souvent de courriers de osandov@osandov.com. Découvrez pourquoi ceci est important à https://aka.ms/LearnAboutSenderIdentification ]
> 
> From: Omar Sandoval <osandov@fb.com>
> 
> Hi,
> 
> I hit a case where copy_to_kernel_nofault() will fault (lol): if the
> destination address is in userspace and x86 Supervisor Mode Access
> Prevention is enabled. Patch 2 has the details and the fix. Patch 1
> renames a helper function so that its use in patch 2 makes more sense.
> If the rename is too intrusive, I can drop it.

The name of the function is "copy_to_kernel". If the destination is a 
user address, it is not a copy to kernel but a copy to user and you 
already have the function copy_to_user() for that. copy_to_user() 
properly handles SMAP.

Christophe


> 
> Thanks,
> Omar
> 
> Omar Sandoval (2):
>    mm: rename copy_from_kernel_nofault_allowed() to
>      copy_kernel_nofault_allowed()
>    mm: make copy_to_kernel_nofault() not fault on user addresses
> 
>   arch/arm/mm/fault.c         |  2 +-
>   arch/loongarch/mm/maccess.c |  2 +-
>   arch/mips/mm/maccess.c      |  2 +-
>   arch/parisc/lib/memcpy.c    |  2 +-
>   arch/powerpc/mm/maccess.c   |  2 +-
>   arch/um/kernel/maccess.c    |  2 +-
>   arch/x86/mm/maccess.c       |  4 ++--
>   include/linux/uaccess.h     |  2 +-
>   mm/maccess.c                | 10 ++++++----
>   9 files changed, 15 insertions(+), 13 deletions(-)
> 
> --
> 2.46.0
> 
>
Omar Sandoval Sept. 2, 2024, 6:31 a.m. UTC | #2
On Mon, Sep 02, 2024 at 08:19:33AM +0200, Christophe Leroy wrote:
> 
> 
> Le 02/09/2024 à 07:31, Omar Sandoval a écrit :
> > [Vous ne recevez pas souvent de courriers de osandov@osandov.com. Découvrez pourquoi ceci est important à https://aka.ms/LearnAboutSenderIdentification ]
> > 
> > From: Omar Sandoval <osandov@fb.com>
> > 
> > Hi,
> > 
> > I hit a case where copy_to_kernel_nofault() will fault (lol): if the
> > destination address is in userspace and x86 Supervisor Mode Access
> > Prevention is enabled. Patch 2 has the details and the fix. Patch 1
> > renames a helper function so that its use in patch 2 makes more sense.
> > If the rename is too intrusive, I can drop it.
> 
> The name of the function is "copy_to_kernel". If the destination is a user
> address, it is not a copy to kernel but a copy to user and you already have
> the function copy_to_user() for that. copy_to_user() properly handles SMAP.

I'm not trying to copy to user. I am (well, KDB is) trying to copy to an
arbitrary address, and I want it to return an error instead of crashing
if the address is not a valid kernel address. As far as I can tell, that
is the whole point of copy_to_kernel_nofault().

Thanks,
Omar
David Hildenbrand Sept. 2, 2024, 8:56 a.m. UTC | #3
On 02.09.24 08:31, Omar Sandoval wrote:
> On Mon, Sep 02, 2024 at 08:19:33AM +0200, Christophe Leroy wrote:
>>
>>
>> Le 02/09/2024 à 07:31, Omar Sandoval a écrit :
>>> [Vous ne recevez pas souvent de courriers de osandov@osandov.com. Découvrez pourquoi ceci est important à https://aka.ms/LearnAboutSenderIdentification ]
>>>
>>> From: Omar Sandoval <osandov@fb.com>
>>>
>>> Hi,
>>>
>>> I hit a case where copy_to_kernel_nofault() will fault (lol): if the
>>> destination address is in userspace and x86 Supervisor Mode Access
>>> Prevention is enabled. Patch 2 has the details and the fix. Patch 1
>>> renames a helper function so that its use in patch 2 makes more sense.
>>> If the rename is too intrusive, I can drop it.
>>
>> The name of the function is "copy_to_kernel". If the destination is a user
>> address, it is not a copy to kernel but a copy to user and you already have
>> the function copy_to_user() for that. copy_to_user() properly handles SMAP.
> 
> I'm not trying to copy to user. I am (well, KDB is) trying to copy to an
> arbitrary address, and I want it to return an error instead of crashing
> if the address is not a valid kernel address. As far as I can tell, that
> is the whole point of copy_to_kernel_nofault().

The thing is that you (well, KDB) triggers something that would be 
considered a real BUG when triggered from "ordinary" (non-debugging) code.

But now I am confused: "if the destination address is in userspace" does 
not really make sense in the context of KDB, no?

   [15]kdb> mm 0 1234
   [   94.652476] BUG: kernel NULL pointer dereference, address: 
0000000000000000

Why is address 0 in "user space"? "Which" user space?

Isn't the problem here that KDB lets you blindly write to any 
non-existing memory address?


Likely it should do some proper filtering like we do in fs/proc/kcore.c:

Take a look at the KCORE_RAM case where we make sure the page exists, is 
online and may be accessed. Only then, we trigger a 
copy_from_kernel_nofault(). Note that the KCORE_USER is a corner case 
only for some special thingies on x86 (vsyscall), and can be ignored for 
our case here.
Omar Sandoval Sept. 2, 2024, 3:26 p.m. UTC | #4
On Mon, Sep 02, 2024 at 10:56:27AM +0200, David Hildenbrand wrote:
> On 02.09.24 08:31, Omar Sandoval wrote:
> > On Mon, Sep 02, 2024 at 08:19:33AM +0200, Christophe Leroy wrote:
> > > 
> > > 
> > > Le 02/09/2024 à 07:31, Omar Sandoval a écrit :
> > > > [Vous ne recevez pas souvent de courriers de osandov@osandov.com. Découvrez pourquoi ceci est important à https://aka.ms/LearnAboutSenderIdentification ]
> > > > 
> > > > From: Omar Sandoval <osandov@fb.com>
> > > > 
> > > > Hi,
> > > > 
> > > > I hit a case where copy_to_kernel_nofault() will fault (lol): if the
> > > > destination address is in userspace and x86 Supervisor Mode Access
> > > > Prevention is enabled. Patch 2 has the details and the fix. Patch 1
> > > > renames a helper function so that its use in patch 2 makes more sense.
> > > > If the rename is too intrusive, I can drop it.
> > > 
> > > The name of the function is "copy_to_kernel". If the destination is a user
> > > address, it is not a copy to kernel but a copy to user and you already have
> > > the function copy_to_user() for that. copy_to_user() properly handles SMAP.
> > 
> > I'm not trying to copy to user. I am (well, KDB is) trying to copy to an
> > arbitrary address, and I want it to return an error instead of crashing
> > if the address is not a valid kernel address. As far as I can tell, that
> > is the whole point of copy_to_kernel_nofault().
> 
> The thing is that you (well, KDB) triggers something that would be
> considered a real BUG when triggered from "ordinary" (non-debugging) code.

If that's the case, then it's a really weird inconsistency that it's OK
to call copy_from_kernel_nofault() with an invalid address but a bug to
call copy_to_kernel_nofault() on the same address. Again, isn't the
whole point of these functions to fail gracefully instead of crashing on
invalid addresses? (Modulo the offline and hwpoison cases you mention
for /proc/kcore.)

> But now I am confused: "if the destination address is in userspace" does not
> really make sense in the context of KDB, no?
> 
>   [15]kdb> mm 0 1234
>   [   94.652476] BUG: kernel NULL pointer dereference, address:
> 0000000000000000
> 
> Why is address 0 in "user space"? "Which" user space?

Sure, it's not really user space, but it's below TASK_SIZE_MAX, so
things like handle_page_fault() and fault_in_kernel_space() treat it as
if it were a user address. I could
s/userspace address/address that is less than TASK_SIZE_MAX or is_vsyscall_vaddr(address)/.

> Isn't the problem here that KDB lets you blindly write to any non-existing
> memory address?
> 
> 
> Likely it should do some proper filtering like we do in fs/proc/kcore.c:
> 
> Take a look at the KCORE_RAM case where we make sure the page exists, is
> online and may be accessed. Only then, we trigger a
> copy_from_kernel_nofault(). Note that the KCORE_USER is a corner case only
> for some special thingies on x86 (vsyscall), and can be ignored for our case
> here.

Sure, it would be better to harden KDB against all of these special
cases. But you can break things in all sorts of fun ways with a
debugger, anyways. The point of this patch is that it's nonsense that a
function named copy_to_kernel_nofault() does indeed fault in a trivial
case like address < TASK_SIZE_MAX.

Thanks,
Omar
David Hildenbrand Sept. 2, 2024, 4:39 p.m. UTC | #5
On 02.09.24 17:26, Omar Sandoval wrote:
> On Mon, Sep 02, 2024 at 10:56:27AM +0200, David Hildenbrand wrote:
>> On 02.09.24 08:31, Omar Sandoval wrote:
>>> On Mon, Sep 02, 2024 at 08:19:33AM +0200, Christophe Leroy wrote:
>>>>
>>>>
>>>> Le 02/09/2024 à 07:31, Omar Sandoval a écrit :
>>>>> [Vous ne recevez pas souvent de courriers de osandov@osandov.com. Découvrez pourquoi ceci est important à https://aka.ms/LearnAboutSenderIdentification ]
>>>>>
>>>>> From: Omar Sandoval <osandov@fb.com>
>>>>>
>>>>> Hi,
>>>>>
>>>>> I hit a case where copy_to_kernel_nofault() will fault (lol): if the
>>>>> destination address is in userspace and x86 Supervisor Mode Access
>>>>> Prevention is enabled. Patch 2 has the details and the fix. Patch 1
>>>>> renames a helper function so that its use in patch 2 makes more sense.
>>>>> If the rename is too intrusive, I can drop it.
>>>>
>>>> The name of the function is "copy_to_kernel". If the destination is a user
>>>> address, it is not a copy to kernel but a copy to user and you already have
>>>> the function copy_to_user() for that. copy_to_user() properly handles SMAP.
>>>
>>> I'm not trying to copy to user. I am (well, KDB is) trying to copy to an
>>> arbitrary address, and I want it to return an error instead of crashing
>>> if the address is not a valid kernel address. As far as I can tell, that
>>> is the whole point of copy_to_kernel_nofault().
>>
>> The thing is that you (well, KDB) triggers something that would be
>> considered a real BUG when triggered from "ordinary" (non-debugging) code.
> 
> If that's the case, then it's a really weird inconsistency that it's OK
> to call copy_from_kernel_nofault() with an invalid address but a bug to
> call copy_to_kernel_nofault() on the same address. Again, isn't the
> whole point of these functions to fail gracefully instead of crashing on
> invalid addresses? (Modulo the offline and hwpoison cases you mention
> for /proc/kcore.)

I assume the difference is mostly historically, because usually, when 
modifying something (ftrace, live patch, kdb) you better know what you 
want to modify actually exist and can be modified. IOW, you usually 
read-before-weite.

In contrast, things like /proc/kcore, (I think) while limiting it to 
sane addresses, might still read from areas where we remove entries from 
the directmap (e.g., secretmem), I think.

Like, in a compiler, modifying a variable you didn't read before is 
rather rare as well. If you would have tried to read it, the 
copy_from_kernel_nofault() would have failed.

I agree that the difference is weird, and likely really "nobody ran into 
this before in sane use cases".

> 
>> But now I am confused: "if the destination address is in userspace" does not
>> really make sense in the context of KDB, no?
>>
>>    [15]kdb> mm 0 1234
>>    [   94.652476] BUG: kernel NULL pointer dereference, address:
>> 0000000000000000
>>
>> Why is address 0 in "user space"? "Which" user space?
> 
> Sure, it's not really user space, but it's below TASK_SIZE_MAX, so
> things like handle_page_fault() and fault_in_kernel_space() treat it as
> if it were a user address. I could
> s/userspace address/address that is less than TASK_SIZE_MAX or is_vsyscall_vaddr(address)/.

Ah, okay, that's x86 specifics detail in 
copy_from_kernel_nofault_allowed(), thanks.

> 
>> Isn't the problem here that KDB lets you blindly write to any non-existing
>> memory address?
>>
>>
>> Likely it should do some proper filtering like we do in fs/proc/kcore.c:
>>
>> Take a look at the KCORE_RAM case where we make sure the page exists, is
>> online and may be accessed. Only then, we trigger a
>> copy_from_kernel_nofault(). Note that the KCORE_USER is a corner case only
>> for some special thingies on x86 (vsyscall), and can be ignored for our case
>> here.
> 
> Sure, it would be better to harden KDB against all of these special
> cases. But you can break things in all sorts of fun ways with a
> debugger, anyways. The point of this patch is that it's nonsense that a
> function named copy_to_kernel_nofault() does indeed fault in a trivial
> case like address < TASK_SIZE_MAX.

Yes, because the write-without-read to kernel memory "you don't know 
even exists" is rather ... weird :)

Anyhow, no strong opinion here. Patches look simple enough.