Message ID | 20230218203717.373211-8-bugaevc@gmail.com |
---|---|
State | New |
Headers | show |
Series | More x86_64-gnu glibc work | expand |
Sergey Bugaev, le sam. 18 févr. 2023 23:37:15 +0300, a ecrit: > Signed-off-by: Sergey Bugaev <bugaevc@gmail.com> > --- > sysdeps/mach/hurd/{i386 => x86}/init-first.c | 17 ++++++++++++++--- > 1 file changed, 14 insertions(+), 3 deletions(-) > rename sysdeps/mach/hurd/{i386 => x86}/init-first.c (96%) > > diff --git a/sysdeps/mach/hurd/i386/init-first.c b/sysdeps/mach/hurd/x86/init-first.c > similarity index 96% > rename from sysdeps/mach/hurd/i386/init-first.c > rename to sysdeps/mach/hurd/x86/init-first.c > index a558da16..75ac1ff2 100644 > --- a/sysdeps/mach/hurd/i386/init-first.c > +++ b/sysdeps/mach/hurd/x86/init-first.c > @@ -227,10 +227,15 @@ init (int *data) > values we set just above. We have stashed in %eax the user code > return address. Push it on the top of the stack so it acts as > init1's return address, and then jump there. */ > +#ifdef __x86_64__ > +asm ("call_init1:\n" > + " push %rax\n" > + " jmp *%rcx\n"); > +#else > asm ("call_init1:\n" > " push %eax\n" > " jmp *%ecx\n"); > - > +#endif > > /* Do the first essential initializations that must precede all else. */ > static inline void > @@ -242,7 +247,7 @@ first_init (void) > #ifndef SHARED > /* In the static case, we need to set up TLS early so that the stack > protection guard can be read at gs:0x14 by the gcc-generated snippets. */ > - _hurd_tls_init(&__init1_tcbhead); > + _hurd_tls_init (&__init1_tcbhead); > asm ("movw %%gs,%w0" : "=m" (__init1_desc)); > #endif > > @@ -300,7 +305,7 @@ _hurd_stack_setup (void) > { > /* If we use ``__builtin_frame_address (0) + 2'' here, GCC gets > confused. */ > - init ((int *) &argc); > + init (&argc); That won't work on x86_64: there, arguments are passed mostly through registers, so &argc won't actually give you the address of arguments on the stack. This can probably be checked on linux x86_64 to make sure how arguments are passed. > } > > /* Push the user return address after the argument data, and then > @@ -308,9 +313,15 @@ _hurd_stack_setup (void) > caller had called `doinit1' with the argument data already on the > stack. */ > *--data = caller; > +# ifdef __x86_64__ > + asm volatile ("movq %0, %%rsp\n" /* Switch to new outermost stack. */ > + "movq $0, %%rbp\n" /* Clear outermost frame pointer. */ > + "jmp *%1" : : "r" (data), "r" (&doinit1)); > +# else > asm volatile ("movl %0, %%esp\n" /* Switch to new outermost stack. */ > "movl $0, %%ebp\n" /* Clear outermost frame pointer. */ > "jmp *%1" : : "r" (data), "r" (&doinit1)); > +# endif > /* NOTREACHED */ > } > > -- > 2.39.2 > >
On Mon, Feb 20, 2023 at 3:01 AM Samuel Thibault <samuel.thibault@gnu.org> wrote: > That won't work on x86_64: there, arguments are passed mostly through > registers, so &argc won't actually give you the address of arguments on > the stack. Right, good point. I wish I had a better understanding of just what's going on in this file. Maybe a lot of the hacks can be rewritten in a nicer way. For instance, do we really need to hijack the return addresses and jump to init1 in this weird way, only to enable it to access argc/arg0? Since we know where they are on our stack (__builtin_frame_address (0) + 2 or something like that), can't we just pass it a pointer? Let me actually try just that... Sergey
> I wish I had a better understanding of just what's going on in this > file. Maybe a lot of the hacks can be rewritten in a nicer way. For > instance, do we really need to hijack the return addresses and jump to > init1 in this weird way, only to enable it to access argc/arg0? Since > we know where they are on our stack (__builtin_frame_address (0) + 2 > or something like that), can't we just pass it a pointer? > > Let me actually try just that... Eh, no, let's start even earlier than that. Please correct me if I (inevitably) get things wrong. _dl_init_first is Hurd-specific. It is called from the assembly code in dl-machine.h, specifically there's a RTLD_START_SPECIAL_INIT macro that's defined to call _dl_init_first on the Hurd, and to nothing otherwise. This RTLD_START_SPECIAL_INIT is used ("invoked") in i386/dl-machine.h, s390/s390-{32,64}/dl-machine.h, ia64/dl-machine.h, and alpha/dl-machine.h (but notably not in x86_64/dl-machine.h). In all cases, it's emphasized that "The special initializer gets called with the stack just as the application's entry point will see it; it can switch stacks if it moves these contents over." But I conclude that: - s390-gnu, ia64-gnu, and alpha-gnu ports are nonexistent, nor are they realistically ever going to happen, so we can ignore them completely - the implementation does not seem to actually switch stacks (in fact, I have removed the unused switch_stacks function in the last commit) -- so the "gets called with the stack just as the application's entry point will see it" property may not be important anymore? The only thing it really needs, it seems, is a pointer to argc/argv/envp & Hurd data block *somewhere*. It does not have to be on the stack (though where else would it be), or immediately preceding its call frame -- all that really matters is that there's a pointer. So my thinking goes, why don't we just hook into _dl_start, which already has this very pointer? And in fact, _dl_start calls _dl_sysdep_start, for which there is already a Hurd version. Can't we just call our logic from there, and not worry about the stack layout and overwriting return addresses? That would work for the SHARED case; we also need to do something for the static case. In that case, we are invoked by static-start.S; do I understand it right that the argc/argv/whatever is still located on-stack even on x86_64 in this case, and not passed in registers according to the calling convention? Then again, we should be able to just use __builtin_frame_address (0) + 2 and avoid most of the hacks? Please tell me if this makes any sense. Sergey
On Sat, Feb 18, 2023 at 2:40 PM Sergey Bugaev via Libc-alpha <libc-alpha@sourceware.org> wrote: > > Signed-off-by: Sergey Bugaev <bugaevc@gmail.com> > --- > sysdeps/mach/hurd/{i386 => x86}/init-first.c | 17 ++++++++++++++--- > 1 file changed, 14 insertions(+), 3 deletions(-) > rename sysdeps/mach/hurd/{i386 => x86}/init-first.c (96%) > > diff --git a/sysdeps/mach/hurd/i386/init-first.c b/sysdeps/mach/hurd/x86/init-first.c > similarity index 96% > rename from sysdeps/mach/hurd/i386/init-first.c > rename to sysdeps/mach/hurd/x86/init-first.c > index a558da16..75ac1ff2 100644 > --- a/sysdeps/mach/hurd/i386/init-first.c > +++ b/sysdeps/mach/hurd/x86/init-first.c > @@ -227,10 +227,15 @@ init (int *data) > values we set just above. We have stashed in %eax the user code > return address. Push it on the top of the stack so it acts as > init1's return address, and then jump there. */ > +#ifdef __x86_64__ > +asm ("call_init1:\n" > + " push %rax\n" > + " jmp *%rcx\n"); > +#else > asm ("call_init1:\n" > " push %eax\n" > " jmp *%ecx\n"); > - > +#endif > > /* Do the first essential initializations that must precede all else. */ > static inline void > @@ -242,7 +247,7 @@ first_init (void) > #ifndef SHARED > /* In the static case, we need to set up TLS early so that the stack > protection guard can be read at gs:0x14 by the gcc-generated snippets. */ > - _hurd_tls_init(&__init1_tcbhead); > + _hurd_tls_init (&__init1_tcbhead); > asm ("movw %%gs,%w0" : "=m" (__init1_desc)); > #endif > > @@ -300,7 +305,7 @@ _hurd_stack_setup (void) > { > /* If we use ``__builtin_frame_address (0) + 2'' here, GCC gets > confused. */ > - init ((int *) &argc); > + init (&argc); > } > > /* Push the user return address after the argument data, and then > @@ -308,9 +313,15 @@ _hurd_stack_setup (void) > caller had called `doinit1' with the argument data already on the > stack. */ > *--data = caller; > +# ifdef __x86_64__ > + asm volatile ("movq %0, %%rsp\n" /* Switch to new outermost stack. */ > + "movq $0, %%rbp\n" /* Clear outermost frame pointer. */ Unless you are intentionally preserving flags, `xorl %%ebp, %%ebp` is better way to zero a register. > + "jmp *%1" : : "r" (data), "r" (&doinit1)); > +# else > asm volatile ("movl %0, %%esp\n" /* Switch to new outermost stack. */ > "movl $0, %%ebp\n" /* Clear outermost frame pointer. */ > "jmp *%1" : : "r" (data), "r" (&doinit1)); > +# endif > /* NOTREACHED */ > } > > -- > 2.39.2 >
Sergey Bugaev, le lun. 20 févr. 2023 20:10:29 +0300, a ecrit: > - the implementation does not seem to actually switch stacks (in fact, > I have removed the unused switch_stacks function in the last commit) > -- so the "gets called with the stack just as the application's entry > point will see it" property may not be important anymore? That's possible, yes. > So my thinking goes, why don't we just hook into _dl_start, which > already has this very pointer? And in fact, _dl_start calls > _dl_sysdep_start, for which there is already a Hurd version. Can't we > just call our logic from there, and not worry about the stack layout > and overwriting return addresses? Possibly, yes. > That would work for the SHARED case; we also need to do something for > the static case. In that case, we are invoked by static-start.S; do I > understand it right that the argc/argv/whatever is still located > on-stack even on x86_64 in this case, and not passed in registers > according to the calling convention? I don't know. But you can check on Linux, we'll probably want to use the same ABI. Samuel
diff --git a/sysdeps/mach/hurd/i386/init-first.c b/sysdeps/mach/hurd/x86/init-first.c similarity index 96% rename from sysdeps/mach/hurd/i386/init-first.c rename to sysdeps/mach/hurd/x86/init-first.c index a558da16..75ac1ff2 100644 --- a/sysdeps/mach/hurd/i386/init-first.c +++ b/sysdeps/mach/hurd/x86/init-first.c @@ -227,10 +227,15 @@ init (int *data) values we set just above. We have stashed in %eax the user code return address. Push it on the top of the stack so it acts as init1's return address, and then jump there. */ +#ifdef __x86_64__ +asm ("call_init1:\n" + " push %rax\n" + " jmp *%rcx\n"); +#else asm ("call_init1:\n" " push %eax\n" " jmp *%ecx\n"); - +#endif /* Do the first essential initializations that must precede all else. */ static inline void @@ -242,7 +247,7 @@ first_init (void) #ifndef SHARED /* In the static case, we need to set up TLS early so that the stack protection guard can be read at gs:0x14 by the gcc-generated snippets. */ - _hurd_tls_init(&__init1_tcbhead); + _hurd_tls_init (&__init1_tcbhead); asm ("movw %%gs,%w0" : "=m" (__init1_desc)); #endif @@ -300,7 +305,7 @@ _hurd_stack_setup (void) { /* If we use ``__builtin_frame_address (0) + 2'' here, GCC gets confused. */ - init ((int *) &argc); + init (&argc); } /* Push the user return address after the argument data, and then @@ -308,9 +313,15 @@ _hurd_stack_setup (void) caller had called `doinit1' with the argument data already on the stack. */ *--data = caller; +# ifdef __x86_64__ + asm volatile ("movq %0, %%rsp\n" /* Switch to new outermost stack. */ + "movq $0, %%rbp\n" /* Clear outermost frame pointer. */ + "jmp *%1" : : "r" (data), "r" (&doinit1)); +# else asm volatile ("movl %0, %%esp\n" /* Switch to new outermost stack. */ "movl $0, %%ebp\n" /* Clear outermost frame pointer. */ "jmp *%1" : : "r" (data), "r" (&doinit1)); +# endif /* NOTREACHED */ }
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com> --- sysdeps/mach/hurd/{i386 => x86}/init-first.c | 17 ++++++++++++++--- 1 file changed, 14 insertions(+), 3 deletions(-) rename sysdeps/mach/hurd/{i386 => x86}/init-first.c (96%)