diff mbox series

[v2,05/17] vdso: Avoid call to memset() by getrandom

Message ID 5deb67090b214f0e6eae96b7c406546d1a16f89b.1724309198.git.christophe.leroy@csgroup.eu (mailing list archive)
State Handled Elsewhere
Headers show
Series Wire up getrandom() vDSO implementation on powerpc | expand

Commit Message

Christophe Leroy Aug. 22, 2024, 7:13 a.m. UTC
With the current implementation, __cvdso_getrandom_data() calls
memset(), which is unexpected in the VDSO.

Rewrite opaque data initialisation to avoid memset().

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
 lib/vdso/getrandom.c | 15 ++++++++++-----
 1 file changed, 10 insertions(+), 5 deletions(-)

Comments

Jason A. Donenfeld Aug. 26, 2024, 8:01 a.m. UTC | #1
On Thu, Aug 22, 2024 at 09:13:13AM +0200, Christophe Leroy wrote:
> With the current implementation, __cvdso_getrandom_data() calls
> memset(), which is unexpected in the VDSO.
> 
> Rewrite opaque data initialisation to avoid memset().

I think of the various ways I've seen of fixing this -- e.g. adding a
memset() arch-by-arch -- this is the cleanest and simplest. I'll add
this as a fix to random.git now.

Jason
Eric Biggers Aug. 27, 2024, 6:08 p.m. UTC | #2
On Thu, Aug 22, 2024 at 09:13:13AM +0200, Christophe Leroy wrote:
> With the current implementation, __cvdso_getrandom_data() calls
> memset(), which is unexpected in the VDSO.
> 
> Rewrite opaque data initialisation to avoid memset().
> 
> Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
> ---
>  lib/vdso/getrandom.c | 15 ++++++++++-----
>  1 file changed, 10 insertions(+), 5 deletions(-)
> 
> diff --git a/lib/vdso/getrandom.c b/lib/vdso/getrandom.c
> index cab153c5f9be..4a56f45141b4 100644
> --- a/lib/vdso/getrandom.c
> +++ b/lib/vdso/getrandom.c
> @@ -4,6 +4,7 @@
>   */
>  
>  #include <linux/minmax.h>
> +#include <linux/array_size.h>
>  #include <vdso/datapage.h>
>  #include <vdso/getrandom.h>
>  #include <vdso/unaligned.h>
> @@ -74,11 +75,15 @@ __cvdso_getrandom_data(const struct vdso_rng_data *rng_info, void *buffer, size_
>  	u32 counter[2] = { 0 };
>  
>  	if (unlikely(opaque_len == ~0UL && !buffer && !len && !flags)) {
> -		*(struct vgetrandom_opaque_params *)opaque_state = (struct vgetrandom_opaque_params) {
> -			.size_of_opaque_state = sizeof(*state),
> -			.mmap_prot = PROT_READ | PROT_WRITE,
> -			.mmap_flags = MAP_DROPPABLE | MAP_ANONYMOUS
> -		};
> +		struct vgetrandom_opaque_params *params = opaque_state;
> +		int i;
> +
> +		params->size_of_opaque_state = sizeof(*state);
> +		params->mmap_prot = PROT_READ | PROT_WRITE;
> +		params->mmap_flags = MAP_DROPPABLE | MAP_ANONYMOUS;
> +		for (i = 0; i < ARRAY_SIZE(params->reserved); i++)
> +			params->reserved[i] = 0;
> +
>  		return 0;
>  	}

Is there a compiler flag that could be used to disable the generation of calls
to memset?

- Eric
Jason A. Donenfeld Aug. 27, 2024, 6:22 p.m. UTC | #3
On Tue, Aug 27, 2024 at 11:08:19AM -0700, Eric Biggers wrote:
> On Thu, Aug 22, 2024 at 09:13:13AM +0200, Christophe Leroy wrote:
> > With the current implementation, __cvdso_getrandom_data() calls
> > memset(), which is unexpected in the VDSO.
> > 
> > Rewrite opaque data initialisation to avoid memset().
> > 
> > Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
> > ---
> >  lib/vdso/getrandom.c | 15 ++++++++++-----
> >  1 file changed, 10 insertions(+), 5 deletions(-)
> > 
> > diff --git a/lib/vdso/getrandom.c b/lib/vdso/getrandom.c
> > index cab153c5f9be..4a56f45141b4 100644
> > --- a/lib/vdso/getrandom.c
> > +++ b/lib/vdso/getrandom.c
> > @@ -4,6 +4,7 @@
> >   */
> >  
> >  #include <linux/minmax.h>
> > +#include <linux/array_size.h>
> >  #include <vdso/datapage.h>
> >  #include <vdso/getrandom.h>
> >  #include <vdso/unaligned.h>
> > @@ -74,11 +75,15 @@ __cvdso_getrandom_data(const struct vdso_rng_data *rng_info, void *buffer, size_
> >  	u32 counter[2] = { 0 };
> >  
> >  	if (unlikely(opaque_len == ~0UL && !buffer && !len && !flags)) {
> > -		*(struct vgetrandom_opaque_params *)opaque_state = (struct vgetrandom_opaque_params) {
> > -			.size_of_opaque_state = sizeof(*state),
> > -			.mmap_prot = PROT_READ | PROT_WRITE,
> > -			.mmap_flags = MAP_DROPPABLE | MAP_ANONYMOUS
> > -		};
> > +		struct vgetrandom_opaque_params *params = opaque_state;
> > +		int i;
> > +
> > +		params->size_of_opaque_state = sizeof(*state);
> > +		params->mmap_prot = PROT_READ | PROT_WRITE;
> > +		params->mmap_flags = MAP_DROPPABLE | MAP_ANONYMOUS;
> > +		for (i = 0; i < ARRAY_SIZE(params->reserved); i++)
> > +			params->reserved[i] = 0;
> > +
> >  		return 0;
> >  	}
> 
> Is there a compiler flag that could be used to disable the generation of calls
> to memset?

Apparently not: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90701

-ffreestanding disables the most but still can generate calls to memcpy
and memset, and the bug was closed as "RESOLVED INVALID". Surprising,
but maybe it's one of those "functions are part of language" things.
Segher Boessenkool Aug. 27, 2024, 10:53 p.m. UTC | #4
On Tue, Aug 27, 2024 at 11:08:19AM -0700, Eric Biggers wrote:
> On Thu, Aug 22, 2024 at 09:13:13AM +0200, Christophe Leroy wrote:
> > With the current implementation, __cvdso_getrandom_data() calls
> > memset(), which is unexpected in the VDSO.
> > 
> > Rewrite opaque data initialisation to avoid memset().
> > 
> > Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
> > ---
> >  lib/vdso/getrandom.c | 15 ++++++++++-----
> >  1 file changed, 10 insertions(+), 5 deletions(-)
> > 
> > diff --git a/lib/vdso/getrandom.c b/lib/vdso/getrandom.c
> > index cab153c5f9be..4a56f45141b4 100644
> > --- a/lib/vdso/getrandom.c
> > +++ b/lib/vdso/getrandom.c
> > @@ -4,6 +4,7 @@
> >   */
> >  
> >  #include <linux/minmax.h>
> > +#include <linux/array_size.h>
> >  #include <vdso/datapage.h>
> >  #include <vdso/getrandom.h>
> >  #include <vdso/unaligned.h>
> > @@ -74,11 +75,15 @@ __cvdso_getrandom_data(const struct vdso_rng_data *rng_info, void *buffer, size_
> >  	u32 counter[2] = { 0 };
> >  
> >  	if (unlikely(opaque_len == ~0UL && !buffer && !len && !flags)) {
> > -		*(struct vgetrandom_opaque_params *)opaque_state = (struct vgetrandom_opaque_params) {
> > -			.size_of_opaque_state = sizeof(*state),
> > -			.mmap_prot = PROT_READ | PROT_WRITE,
> > -			.mmap_flags = MAP_DROPPABLE | MAP_ANONYMOUS
> > -		};
> > +		struct vgetrandom_opaque_params *params = opaque_state;
> > +		int i;
> > +
> > +		params->size_of_opaque_state = sizeof(*state);
> > +		params->mmap_prot = PROT_READ | PROT_WRITE;
> > +		params->mmap_flags = MAP_DROPPABLE | MAP_ANONYMOUS;
> > +		for (i = 0; i < ARRAY_SIZE(params->reserved); i++)
> > +			params->reserved[i] = 0;
> > +
> >  		return 0;
> >  	}
> 
> Is there a compiler flag that could be used to disable the generation of calls
> to memset?

-fno-tree-loop-distribute-patterns .  But, as always, read up on it, see
what it actually does (and how it avoids your problem, and mostly: learn
what the actual problem *was*!)

Have fun,


Segher
Jason A. Donenfeld Aug. 28, 2024, 11:18 a.m. UTC | #5
On Tue, Aug 27, 2024 at 05:53:30PM -0500, Segher Boessenkool wrote:
> On Tue, Aug 27, 2024 at 11:08:19AM -0700, Eric Biggers wrote:
> > On Thu, Aug 22, 2024 at 09:13:13AM +0200, Christophe Leroy wrote:
> > > With the current implementation, __cvdso_getrandom_data() calls
> > > memset(), which is unexpected in the VDSO.
> > > 
> > > Rewrite opaque data initialisation to avoid memset().
> > > 
> > > Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
> > > ---
> > >  lib/vdso/getrandom.c | 15 ++++++++++-----
> > >  1 file changed, 10 insertions(+), 5 deletions(-)
> > > 
> > > diff --git a/lib/vdso/getrandom.c b/lib/vdso/getrandom.c
> > > index cab153c5f9be..4a56f45141b4 100644
> > > --- a/lib/vdso/getrandom.c
> > > +++ b/lib/vdso/getrandom.c
> > > @@ -4,6 +4,7 @@
> > >   */
> > >  
> > >  #include <linux/minmax.h>
> > > +#include <linux/array_size.h>
> > >  #include <vdso/datapage.h>
> > >  #include <vdso/getrandom.h>
> > >  #include <vdso/unaligned.h>
> > > @@ -74,11 +75,15 @@ __cvdso_getrandom_data(const struct vdso_rng_data *rng_info, void *buffer, size_
> > >  	u32 counter[2] = { 0 };
> > >  
> > >  	if (unlikely(opaque_len == ~0UL && !buffer && !len && !flags)) {
> > > -		*(struct vgetrandom_opaque_params *)opaque_state = (struct vgetrandom_opaque_params) {
> > > -			.size_of_opaque_state = sizeof(*state),
> > > -			.mmap_prot = PROT_READ | PROT_WRITE,
> > > -			.mmap_flags = MAP_DROPPABLE | MAP_ANONYMOUS
> > > -		};
> > > +		struct vgetrandom_opaque_params *params = opaque_state;
> > > +		int i;
> > > +
> > > +		params->size_of_opaque_state = sizeof(*state);
> > > +		params->mmap_prot = PROT_READ | PROT_WRITE;
> > > +		params->mmap_flags = MAP_DROPPABLE | MAP_ANONYMOUS;
> > > +		for (i = 0; i < ARRAY_SIZE(params->reserved); i++)
> > > +			params->reserved[i] = 0;
> > > +
> > >  		return 0;
> > >  	}
> > 
> > Is there a compiler flag that could be used to disable the generation of calls
> > to memset?
> 
> -fno-tree-loop-distribute-patterns .  But, as always, read up on it, see
> what it actually does (and how it avoids your problem, and mostly: learn
> what the actual problem *was*!)

This might help with various loops, but it doesn't help with the matter
that this patch fixes, which is struct initialization. I just tried it
with the arm64 patch to no avail.
Arnd Bergmann Aug. 28, 2024, 12:24 p.m. UTC | #6
On Wed, Aug 28, 2024, at 11:18, Jason A. Donenfeld wrote:
> On Tue, Aug 27, 2024 at 05:53:30PM -0500, Segher Boessenkool wrote:
>> On Tue, Aug 27, 2024 at 11:08:19AM -0700, Eric Biggers wrote:
>> > 
>> > Is there a compiler flag that could be used to disable the generation of calls
>> > to memset?
>> 
>> -fno-tree-loop-distribute-patterns .  But, as always, read up on it, see
>> what it actually does (and how it avoids your problem, and mostly: learn
>> what the actual problem *was*!)
>
> This might help with various loops, but it doesn't help with the matter
> that this patch fixes, which is struct initialization. I just tried it
> with the arm64 patch to no avail.

Maybe -ffreestanding can help here? That should cause the vdso to be built
with the assumption that there is no libc, so it would neither add nor
remove standard library calls. Not sure if that causes other problems,
e.g. if the calling conventions are different.

       Arnd
Jason A. Donenfeld Aug. 28, 2024, 12:26 p.m. UTC | #7
On Wed, Aug 28, 2024 at 2:24 PM Arnd Bergmann <arnd@arndb.de> wrote:
>
> On Wed, Aug 28, 2024, at 11:18, Jason A. Donenfeld wrote:
> > On Tue, Aug 27, 2024 at 05:53:30PM -0500, Segher Boessenkool wrote:
> >> On Tue, Aug 27, 2024 at 11:08:19AM -0700, Eric Biggers wrote:
> >> >
> >> > Is there a compiler flag that could be used to disable the generation of calls
> >> > to memset?
> >>
> >> -fno-tree-loop-distribute-patterns .  But, as always, read up on it, see
> >> what it actually does (and how it avoids your problem, and mostly: learn
> >> what the actual problem *was*!)
> >
> > This might help with various loops, but it doesn't help with the matter
> > that this patch fixes, which is struct initialization. I just tried it
> > with the arm64 patch to no avail.
>
> Maybe -ffreestanding can help here? That should cause the vdso to be built
> with the assumption that there is no libc, so it would neither add nor
> remove standard library calls. Not sure if that causes other problems,
> e.g. if the calling conventions are different.

From https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90701:

| You need -ffreestanding but that is documented to emit memset and
memcpy still.
Segher Boessenkool Aug. 28, 2024, 12:33 p.m. UTC | #8
On Wed, Aug 28, 2024 at 01:18:34PM +0200, Jason A. Donenfeld wrote:
> On Tue, Aug 27, 2024 at 05:53:30PM -0500, Segher Boessenkool wrote:
> > On Tue, Aug 27, 2024 at 11:08:19AM -0700, Eric Biggers wrote:
> > > > +		for (i = 0; i < ARRAY_SIZE(params->reserved); i++)
> > > > +			params->reserved[i] = 0;

This is a loop.  With -ftree-loop-distribute-patterns, the default at
-O2, this is optimised to

    memset(params->reserved, 0, ...);

(which is perfectly fine, since memset is required to be there even
for freestanding environments, this is documented!)

> > -fno-tree-loop-distribute-patterns .  But, as always, read up on it, see
> > what it actually does (and how it avoids your problem, and mostly: learn
> > what the actual problem *was*!)
> 
> This might help with various loops, but it doesn't help with the matter
> that this patch fixes, which is struct initialization. I just tried it
> with the arm64 patch to no avail.

It very much *does* help.  Try harder?  Maybe you typoed?


Segher
Segher Boessenkool Aug. 28, 2024, 12:45 p.m. UTC | #9
On Wed, Aug 28, 2024 at 12:24:12PM +0000, Arnd Bergmann wrote:
> On Wed, Aug 28, 2024, at 11:18, Jason A. Donenfeld wrote:
> > On Tue, Aug 27, 2024 at 05:53:30PM -0500, Segher Boessenkool wrote:
> >> On Tue, Aug 27, 2024 at 11:08:19AM -0700, Eric Biggers wrote:
> >> > 
> >> > Is there a compiler flag that could be used to disable the generation of calls
> >> > to memset?
> >> 
> >> -fno-tree-loop-distribute-patterns .  But, as always, read up on it, see
> >> what it actually does (and how it avoids your problem, and mostly: learn
> >> what the actual problem *was*!)
> >
> > This might help with various loops, but it doesn't help with the matter
> > that this patch fixes, which is struct initialization. I just tried it
> > with the arm64 patch to no avail.
> 
> Maybe -ffreestanding can help here? That should cause the vdso to be built
> with the assumption that there is no libc, so it would neither add nor
> remove standard library calls. Not sure if that causes other problems,
> e.g. if the calling conventions are different.

"GCC requires the freestanding
environment provide 'memcpy', 'memmove', 'memset' and 'memcmp'."

This is precisely to implement things like struct initialisation.  Maybe
we should have a "-ffreeerstanding" or "-ffreefloating" or think of
something funnier still environment as well, this problem has been there
since the -ffreestanding flag has existed, but the problem is as old as
the night.

-fno-builtin might help a bit more, but just attack the problem at
its root, like I suggested?

(This isn't a new problem, originally it showed up as "GCC replaces
(part of) my memcpy() implementation by a (recursive) call to memcpy()"
and, well, that doesn't quite work!)


Segher
Jason A. Donenfeld Aug. 28, 2024, 12:51 p.m. UTC | #10
On Wed, Aug 28, 2024 at 07:33:13AM -0500, Segher Boessenkool wrote:
> On Wed, Aug 28, 2024 at 01:18:34PM +0200, Jason A. Donenfeld wrote:
> > On Tue, Aug 27, 2024 at 05:53:30PM -0500, Segher Boessenkool wrote:
> > > On Tue, Aug 27, 2024 at 11:08:19AM -0700, Eric Biggers wrote:
> > > > > +		for (i = 0; i < ARRAY_SIZE(params->reserved); i++)
> > > > > +			params->reserved[i] = 0;
> 
> This is a loop.  With -ftree-loop-distribute-patterns, the default at
> -O2, this is optimised to
> 
>     memset(params->reserved, 0, ...);
> 
> (which is perfectly fine, since memset is required to be there even
> for freestanding environments, this is documented!)
> 
> > > -fno-tree-loop-distribute-patterns .  But, as always, read up on it, see
> > > what it actually does (and how it avoids your problem, and mostly: learn
> > > what the actual problem *was*!)
> > 
> > This might help with various loops, but it doesn't help with the matter
> > that this patch fixes, which is struct initialization. I just tried it
> > with the arm64 patch to no avail.
> 
> It very much *does* help.  Try harder?  Maybe you typoed?

Scroll up and reread the original patch. You are confused. The loop is
what fixes the matter. It's struct initialization that generates the
memset.
Segher Boessenkool Aug. 28, 2024, 12:51 p.m. UTC | #11
On Wed, Aug 28, 2024 at 02:26:08PM +0200, Jason A. Donenfeld wrote:
> On Wed, Aug 28, 2024 at 2:24 PM Arnd Bergmann <arnd@arndb.de> wrote:
> >
> > On Wed, Aug 28, 2024, at 11:18, Jason A. Donenfeld wrote:
> > > On Tue, Aug 27, 2024 at 05:53:30PM -0500, Segher Boessenkool wrote:
> > >> On Tue, Aug 27, 2024 at 11:08:19AM -0700, Eric Biggers wrote:
> > >> >
> > >> > Is there a compiler flag that could be used to disable the generation of calls
> > >> > to memset?
> > >>
> > >> -fno-tree-loop-distribute-patterns .  But, as always, read up on it, see
> > >> what it actually does (and how it avoids your problem, and mostly: learn
> > >> what the actual problem *was*!)
> > >
> > > This might help with various loops, but it doesn't help with the matter
> > > that this patch fixes, which is struct initialization. I just tried it
> > > with the arm64 patch to no avail.
> >
> > Maybe -ffreestanding can help here? That should cause the vdso to be built
> > with the assumption that there is no libc, so it would neither add nor
> > remove standard library calls. Not sure if that causes other problems,
> > e.g. if the calling conventions are different.
> 
> >From https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90701:
> 
> | You need -ffreestanding but that is documented to emit memset and
> memcpy still.

Yeah.

'-nostdlib'
     Do not use the standard system startup files or libraries when
     linking.

This won't help a bit, the compiler will still optimise your
initialisation loop to a memset() call, it just won't link to libgcc.a
and crt*.o and its ilk (which is not where mem* are implemented in the
first place!)


Segher
Ard Biesheuvel Aug. 28, 2024, 3:40 p.m. UTC | #12
On Wed, 28 Aug 2024 at 14:57, Segher Boessenkool
<segher@kernel.crashing.org> wrote:
>
> On Wed, Aug 28, 2024 at 12:24:12PM +0000, Arnd Bergmann wrote:
> > On Wed, Aug 28, 2024, at 11:18, Jason A. Donenfeld wrote:
> > > On Tue, Aug 27, 2024 at 05:53:30PM -0500, Segher Boessenkool wrote:
> > >> On Tue, Aug 27, 2024 at 11:08:19AM -0700, Eric Biggers wrote:
> > >> >
> > >> > Is there a compiler flag that could be used to disable the generation of calls
> > >> > to memset?
> > >>
> > >> -fno-tree-loop-distribute-patterns .  But, as always, read up on it, see
> > >> what it actually does (and how it avoids your problem, and mostly: learn
> > >> what the actual problem *was*!)
> > >
> > > This might help with various loops, but it doesn't help with the matter
> > > that this patch fixes, which is struct initialization. I just tried it
> > > with the arm64 patch to no avail.
> >
> > Maybe -ffreestanding can help here? That should cause the vdso to be built
> > with the assumption that there is no libc, so it would neither add nor
> > remove standard library calls. Not sure if that causes other problems,
> > e.g. if the calling conventions are different.
>
> "GCC requires the freestanding
> environment provide 'memcpy', 'memmove', 'memset' and 'memcmp'."
>
> This is precisely to implement things like struct initialisation.  Maybe
> we should have a "-ffreeerstanding" or "-ffreefloating" or think of
> something funnier still environment as well, this problem has been there
> since the -ffreestanding flag has existed, but the problem is as old as
> the night.
>
> -fno-builtin might help a bit more, but just attack the problem at
> its root, like I suggested?
>

In my experience, this is likely to do the opposite: it causes the
compiler to 'forget' the semantics of memcpy() and memset(), so that
explicit trivial calls will no longer be elided and replaced with
plain loads and stores (as it can no longer guarantee the equivalence)

> (This isn't a new problem, originally it showed up as "GCC replaces
> (part of) my memcpy() implementation by a (recursive) call to memcpy()"
> and, well, that doesn't quite work!)
>

This needs to be fixed for Clang as well, so throwing GCC specific
flags at it will at best be a partial solution.

Omitting the struct assignment is a reasonable way to reduce the
likelihood that a memset() will be emitted, so for this patch

Acked-by: Ard Biesheuvel <ardb@kernel.org>

It is not a complete solution, unfortunately, and I guess there may be
other situations (compiler/arch combinations) where this might pop up
again.
Segher Boessenkool Aug. 28, 2024, 4:20 p.m. UTC | #13
Hi!

On Wed, Aug 28, 2024 at 05:40:23PM +0200, Ard Biesheuvel wrote:
> On Wed, 28 Aug 2024 at 14:57, Segher Boessenkool
> <segher@kernel.crashing.org> wrote:
> >
> > On Wed, Aug 28, 2024 at 12:24:12PM +0000, Arnd Bergmann wrote:
> > > On Wed, Aug 28, 2024, at 11:18, Jason A. Donenfeld wrote:
> > > > On Tue, Aug 27, 2024 at 05:53:30PM -0500, Segher Boessenkool wrote:
> > > >> On Tue, Aug 27, 2024 at 11:08:19AM -0700, Eric Biggers wrote:
> > > >> >
> > > >> > Is there a compiler flag that could be used to disable the generation of calls
> > > >> > to memset?
> > > >>
> > > >> -fno-tree-loop-distribute-patterns .  But, as always, read up on it, see
> > > >> what it actually does (and how it avoids your problem, and mostly: learn
> > > >> what the actual problem *was*!)
> > > >
> > > > This might help with various loops, but it doesn't help with the matter
> > > > that this patch fixes, which is struct initialization. I just tried it
> > > > with the arm64 patch to no avail.
> > >
> > > Maybe -ffreestanding can help here? That should cause the vdso to be built
> > > with the assumption that there is no libc, so it would neither add nor
> > > remove standard library calls. Not sure if that causes other problems,
> > > e.g. if the calling conventions are different.
> >
> > "GCC requires the freestanding
> > environment provide 'memcpy', 'memmove', 'memset' and 'memcmp'."
> >
> > This is precisely to implement things like struct initialisation.  Maybe
> > we should have a "-ffreeerstanding" or "-ffreefloating" or think of
> > something funnier still environment as well, this problem has been there
> > since the -ffreestanding flag has existed, but the problem is as old as
> > the night.
> >
> > -fno-builtin might help a bit more, but just attack the problem at
> > its root, like I suggested?
> >
> 
> In my experience, this is likely to do the opposite: it causes the
> compiler to 'forget' the semantics of memcpy() and memset(), so that
> explicit trivial calls will no longer be elided and replaced with
> plain loads and stores (as it can no longer guarantee the equivalence)

No, the compiler will never forget those semantics.  But if you tell it
your function named memset() is not the actual standard memset -- via
-fno-builtin-memset for example -- the compiler won't optimise things
involving it quite as much.  You told it so eh?

You can also tell it not to have a __builtin_memset function, but in
this particular case that won;t quite work, since the compiler does need
to have that builtin available to do struct and array initialisations
and the like.

> > (This isn't a new problem, originally it showed up as "GCC replaces
> > (part of) my memcpy() implementation by a (recursive) call to memcpy()"
> > and, well, that doesn't quite work!)
> >
> 
> This needs to be fixed for Clang as well, so throwing GCC specific
> flags at it will at best be a partial solution.

clang says it is a 100% plug-in replacement for GCC, so they will have
to accept all GCC flags.  And in many cases they do.  Cases where they
don't are bugs.

> It is not a complete solution, unfortunately, and I guess there may be
> other situations (compiler/arch combinations) where this might pop up
> again.

Why do mem* not work in VDSOs?  Fix that, and all these problems
disappear, and you do not need workrarounds :-)


Segher
Ard Biesheuvel Aug. 28, 2024, 5:12 p.m. UTC | #14
On Wed, 28 Aug 2024 at 18:24, Segher Boessenkool
<segher@kernel.crashing.org> wrote:
>
> Hi!
>
> On Wed, Aug 28, 2024 at 05:40:23PM +0200, Ard Biesheuvel wrote:
> > On Wed, 28 Aug 2024 at 14:57, Segher Boessenkool
> > <segher@kernel.crashing.org> wrote:
> > >
> > > On Wed, Aug 28, 2024 at 12:24:12PM +0000, Arnd Bergmann wrote:
> > > > On Wed, Aug 28, 2024, at 11:18, Jason A. Donenfeld wrote:
> > > > > On Tue, Aug 27, 2024 at 05:53:30PM -0500, Segher Boessenkool wrote:
> > > > >> On Tue, Aug 27, 2024 at 11:08:19AM -0700, Eric Biggers wrote:
> > > > >> >
> > > > >> > Is there a compiler flag that could be used to disable the generation of calls
> > > > >> > to memset?
> > > > >>
> > > > >> -fno-tree-loop-distribute-patterns .  But, as always, read up on it, see
> > > > >> what it actually does (and how it avoids your problem, and mostly: learn
> > > > >> what the actual problem *was*!)
> > > > >
> > > > > This might help with various loops, but it doesn't help with the matter
> > > > > that this patch fixes, which is struct initialization. I just tried it
> > > > > with the arm64 patch to no avail.
> > > >
> > > > Maybe -ffreestanding can help here? That should cause the vdso to be built
> > > > with the assumption that there is no libc, so it would neither add nor
> > > > remove standard library calls. Not sure if that causes other problems,
> > > > e.g. if the calling conventions are different.
> > >
> > > "GCC requires the freestanding
> > > environment provide 'memcpy', 'memmove', 'memset' and 'memcmp'."
> > >
> > > This is precisely to implement things like struct initialisation.  Maybe
> > > we should have a "-ffreeerstanding" or "-ffreefloating" or think of
> > > something funnier still environment as well, this problem has been there
> > > since the -ffreestanding flag has existed, but the problem is as old as
> > > the night.
> > >
> > > -fno-builtin might help a bit more, but just attack the problem at
> > > its root, like I suggested?
> > >
> >
> > In my experience, this is likely to do the opposite: it causes the
> > compiler to 'forget' the semantics of memcpy() and memset(), so that
> > explicit trivial calls will no longer be elided and replaced with
> > plain loads and stores (as it can no longer guarantee the equivalence)
>
> No, the compiler will never forget those semantics.  But if you tell it
> your function named memset() is not the actual standard memset -- via
> -fno-builtin-memset for example -- the compiler won't optimise things
> involving it quite as much.  You told it so eh?
>

That is exactly the point I am making. So how would this help in this case?

> You can also tell it not to have a __builtin_memset function, but in
> this particular case that won;t quite work, since the compiler does need
> to have that builtin available to do struct and array initialisations
> and the like.
>
> > > (This isn't a new problem, originally it showed up as "GCC replaces
> > > (part of) my memcpy() implementation by a (recursive) call to memcpy()"
> > > and, well, that doesn't quite work!)
> > >
> >
> > This needs to be fixed for Clang as well, so throwing GCC specific
> > flags at it will at best be a partial solution.
>
> clang says it is a 100% plug-in replacement for GCC, so they will have
> to accept all GCC flags.  And in many cases they do.  Cases where they
> don't are bugs.
>

Even if this were true, we support Clang versions today that do not
support -fno-tree-loop-distribute-patterns so my point stands.

> > It is not a complete solution, unfortunately, and I guess there may be
> > other situations (compiler/arch combinations) where this might pop up
> > again.
>
> Why do mem* not work in VDSOs?  Fix that, and all these problems
> disappear, and you do not need workrarounds :-)
>

Good point. We should just import memcpy and memset in the VDSO ELF metadata.

Not sure about static binaries, though: do those even use the VDSO?
Segher Boessenkool Aug. 28, 2024, 5:25 p.m. UTC | #15
On Wed, Aug 28, 2024 at 07:12:55PM +0200, Ard Biesheuvel wrote:
> On Wed, 28 Aug 2024 at 18:24, Segher Boessenkool
> <segher@kernel.crashing.org> wrote:
> > > In my experience, this is likely to do the opposite: it causes the
> > > compiler to 'forget' the semantics of memcpy() and memset(), so that
> > > explicit trivial calls will no longer be elided and replaced with
> > > plain loads and stores (as it can no longer guarantee the equivalence)
> >
> > No, the compiler will never forget those semantics.  But if you tell it
> > your function named memset() is not the actual standard memset -- via
> > -fno-builtin-memset for example -- the compiler won't optimise things
> > involving it quite as much.  You told it so eh?
> 
> That is exactly the point I am making. So how would this help in this case?

I think we agree?  :-)

> > > This needs to be fixed for Clang as well, so throwing GCC specific
> > > flags at it will at best be a partial solution.
> >
> > clang says it is a 100% plug-in replacement for GCC, so they will have
> > to accept all GCC flags.  And in many cases they do.  Cases where they
> > don't are bugs.
> 
> Even if this were true, we support Clang versions today that do not
> support -fno-tree-loop-distribute-patterns so my point stands.

It is true.  Yes, this cause problems for their users.

> > > It is not a complete solution, unfortunately, and I guess there may be
> > > other situations (compiler/arch combinations) where this might pop up
> > > again.
> >
> > Why do mem* not work in VDSOs?  Fix that, and all these problems
> > disappear, and you do not need workrarounds :-)
> 
> Good point. We should just import memcpy and memset in the VDSO ELF metadata.

Yeah.  In many cases GCC will replace such calls by (faster and/or
smaller) inline code anyway, but when it does leave a call, there needs
to be an external function implementing it :-)

> Not sure about static binaries, though: do those even use the VDSO?

With "static binary" people usually mean "a binary not using any DSOs",
I think the VDSO is a DSO, also in this respect?  As always, -static
builds are *way* less problematic (and faster and smaller :-) )


Segher
Christophe Leroy Aug. 29, 2024, 5:36 p.m. UTC | #16
Le 28/08/2024 à 19:25, Segher Boessenkool a écrit :
> 
>> Not sure about static binaries, though: do those even use the VDSO?
> 
> With "static binary" people usually mean "a binary not using any DSOs",
> I think the VDSO is a DSO, also in this respect?  As always, -static
> builds are *way* less problematic (and faster and smaller :-) )
> 

AFAIK on powerpc even static binaries use the vDSO, otherwise signals 
don't work.

Christophe
Segher Boessenkool Aug. 29, 2024, 6:02 p.m. UTC | #17
On Thu, Aug 29, 2024 at 07:36:38PM +0200, Christophe Leroy wrote:
> 
> 
> Le 28/08/2024 à 19:25, Segher Boessenkool a écrit :
> >
> >>Not sure about static binaries, though: do those even use the VDSO?
> >
> >With "static binary" people usually mean "a binary not using any DSOs",
> >I think the VDSO is a DSO, also in this respect?  As always, -static
> >builds are *way* less problematic (and faster and smaller :-) )
> >
> 
> AFAIK on powerpc even static binaries use the vDSO, otherwise signals 
> don't work.

How can that work?  Non-dynamic binaries do not use ld.so (that is the
definition of a dynamic binary, even).  So they cannot link (at runtime)
to any DSO (unless that is done manually?!)

Maybe there is something at a fixed offset in the vDSO, or something
like that?  Is this documented somewhere?


Segher
Christophe Leroy Aug. 29, 2024, 6:50 p.m. UTC | #18
Le 29/08/2024 à 20:02, Segher Boessenkool a écrit :
> On Thu, Aug 29, 2024 at 07:36:38PM +0200, Christophe Leroy wrote:
>>
>>
>> Le 28/08/2024 à 19:25, Segher Boessenkool a écrit :
>>>
>>>> Not sure about static binaries, though: do those even use the VDSO?
>>>
>>> With "static binary" people usually mean "a binary not using any DSOs",
>>> I think the VDSO is a DSO, also in this respect?  As always, -static
>>> builds are *way* less problematic (and faster and smaller :-) )
>>>
>>
>> AFAIK on powerpc even static binaries use the vDSO, otherwise signals
>> don't work.
> 
> How can that work?  Non-dynamic binaries do not use ld.so (that is the
> definition of a dynamic binary, even).  So they cannot link (at runtime)
> to any DSO (unless that is done manually?!)
> 
> Maybe there is something at a fixed offset in the vDSO, or something
> like that?  Is this documented somewhere?
> 

You've got some explanation here : 
https://github.com/torvalds/linux/blob/master/Documentation/ABI/stable/vdso
Michael Ellerman Aug. 30, 2024, 10:01 a.m. UTC | #19
Segher Boessenkool <segher@kernel.crashing.org> writes:
> On Thu, Aug 29, 2024 at 07:36:38PM +0200, Christophe Leroy wrote:
>> 
>> 
>> Le 28/08/2024 à 19:25, Segher Boessenkool a écrit :
>> >
>> >>Not sure about static binaries, though: do those even use the VDSO?
>> >
>> >With "static binary" people usually mean "a binary not using any DSOs",
>> >I think the VDSO is a DSO, also in this respect?  As always, -static
>> >builds are *way* less problematic (and faster and smaller :-) )
>> >
>> 
>> AFAIK on powerpc even static binaries use the vDSO, otherwise signals 
>> don't work.
>
> How can that work?  Non-dynamic binaries do not use ld.so (that is the
> definition of a dynamic binary, even).  So they cannot link (at runtime)
> to any DSO (unless that is done manually?!)

At least for signals I don't think the application needs to know
anything about the VDSO. The kernel sets up the return to the signal
trampoline (in the VDSO), and as long as userspace returns from its
signal handler with blr it will land back on the trampoline.

cheers
diff mbox series

Patch

diff --git a/lib/vdso/getrandom.c b/lib/vdso/getrandom.c
index cab153c5f9be..4a56f45141b4 100644
--- a/lib/vdso/getrandom.c
+++ b/lib/vdso/getrandom.c
@@ -4,6 +4,7 @@ 
  */
 
 #include <linux/minmax.h>
+#include <linux/array_size.h>
 #include <vdso/datapage.h>
 #include <vdso/getrandom.h>
 #include <vdso/unaligned.h>
@@ -74,11 +75,15 @@  __cvdso_getrandom_data(const struct vdso_rng_data *rng_info, void *buffer, size_
 	u32 counter[2] = { 0 };
 
 	if (unlikely(opaque_len == ~0UL && !buffer && !len && !flags)) {
-		*(struct vgetrandom_opaque_params *)opaque_state = (struct vgetrandom_opaque_params) {
-			.size_of_opaque_state = sizeof(*state),
-			.mmap_prot = PROT_READ | PROT_WRITE,
-			.mmap_flags = MAP_DROPPABLE | MAP_ANONYMOUS
-		};
+		struct vgetrandom_opaque_params *params = opaque_state;
+		int i;
+
+		params->size_of_opaque_state = sizeof(*state);
+		params->mmap_prot = PROT_READ | PROT_WRITE;
+		params->mmap_flags = MAP_DROPPABLE | MAP_ANONYMOUS;
+		for (i = 0; i < ARRAY_SIZE(params->reserved); i++)
+			params->reserved[i] = 0;
+
 		return 0;
 	}