Message ID | 1251980117-10089-1-git-send-email-kirill@shutemov.name |
---|---|
State | Superseded |
On Thu, Sep 03, 2009 at 03:15:17PM +0300, Kirill A. Shutemov wrote:
> Now we can drop link hack for i386 and fix text relocations on i386 host.
>
> v2:
> - Add configure options to enable/disable PIE for usermode targets.
>   Disabling can be useful if you build using a toolchain which has
>   broken PIE support. PIE for usermode targets enabled by default.

Hm. Would be nice if the commit message said more about the "why". What
is the advantage of PIE (I mean, is there something special about qemu
that makes it particularly useful)? Is there any measurable speed
difference between PIE and no PIE?
(sorry if it was explained for v1, I must have missed that one)
On Thu, Sep 3, 2009 at 3:07 PM, Juan Quintela<quintela@trasno.org> wrote:
> "Kirill A. Shutemov" <kirill@shutemov.name> wrote:
>> Now we can drop link hack for i386 and fix text relocations on i386
>> host.
>
> Still not good enough :(
>
> Fedora 11 here. I got this error:
>
> /usr/bin/ld: main.o: relocation R_X86_64_TPO LINK arm-linux-user/qemu-arm
> /usr/bin/ld: main.o: relocation R_X86_64_TPOFF32 against `thread_env' can not be used when making a shared object; recompile with -fPIC
> main.o: could not read symbols: Bad value
> collect2: ld returned 1 exit status
> make[1]: *** [qemu-arm] Error 1
> make: *** [subdir-arm-linux-user] Error 2
>
> (I got it for all the -linux-user targets)

What version of binutils do you have? It seems your binutils is buggy.
On Thu, Sep 3, 2009 at 3:00 PM, Reimar Döffinger<Reimar.Doeffinger@gmx.de> wrote:
> On Thu, Sep 03, 2009 at 03:15:17PM +0300, Kirill A. Shutemov wrote:
>> Now we can drop link hack for i386 and fix text relocations on i386 host.
>>
>> v2:
>> - Add configure options to enable/disable PIE for usermode targets.
>>   Disabling can be useful if you build using a toolchain which has
>>   broken PIE support. PIE for usermode targets enabled by default.
>
> Hm. Would be nice if the commit message said more about the "why". What
> is the advantage of PIE (I mean, is there something special about qemu
> that makes it particularly useful)?

The main advantage is that we can drop the linking hack for i386 (and keep
qemu self-virtualizable) and get rid of text relocations.

The other advantage is security: since qemu is a PIE, the kernel can load
it at a random position in memory, which makes many types of attacks on
qemu harder.

> Is there any measurable speed
> difference between PIE and no PIE?

Actually, I have no numbers for qemu.

PIE code is usually a bit slower: approximately 1% on i386 according to
some tests. RISC architectures should be affected less, since they have
more registers. On the other hand, we are getting rid of text relocations
on i386, which make executable loading slower. So...
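One way a reader could check whether a built usermode binary is eligible for load-address randomization is to look at the ELF type in the `readelf -h` output: a PIE (like a shared object) is reported as DYN, a fixed-address executable as EXEC. Below is a minimal shell sketch; the helper name `elf_type` is hypothetical and not part of the patch. Note that a binary produced by the old `-Wl,-shared` hack is also reported as DYN, so this check alone does not tell a true PIE from the hack.

```shell
#!/bin/sh
# elf_type (hypothetical helper, not part of the patch):
# reads `readelf -h` output on stdin and prints the ELF type field,
# e.g. DYN for a PIE or shared object, EXEC for a fixed-address executable.
# Prints "unknown" if no "Type:" line is found.
elf_type() {
    awk '
        $1 == "Type:" { print $2; found = 1; exit }
        END { if (!found) print "unknown" }
    '
}
```

Possible usage, assuming a build tree like the one in the error log above: `readelf -h arm-linux-user/qemu-arm | elf_type`.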
> PIE code is usually a bit slower: approximately 1% on i386 according to
> some tests. RISC architectures should be affected less, since they have
> more registers. On the other hand, we are getting rid of text relocations
> on i386, which make executable loading slower. So...

I think you've got that backwards.
A traditional (fixed address) executable requires no load-time relocation for
internal references because all addresses are known at static link time. PIE
requires the dynamic linker to adjust all absolute addresses.

Paul
On Thursday 03 September 2009, Kirill A. Shutemov wrote:
> Now we can drop link hack for i386 and fix text relocations on i386 host.
>
> v2:
> - Add configure options to enable/disable PIE for usermode targets.
>   Disabling can be useful if you build using a toolchain which has
>   broken PIE support. PIE for usermode targets enabled by default.

This isn't as useful as you might think.

How do you stop the host dynamic linker loading qemu where the guest
application expects to be loaded?

Paul
On 09/03/2009 04:38 PM, Paul Brook wrote:
>> PIE code is usually a bit slower: approximately 1% on i386 according to
>> some tests. RISC architectures should be affected less, since they have
>> more registers. On the other hand, we are getting rid of text relocations
>> on i386, which make executable loading slower. So...
>
> I think you've got that backwards.
> A traditional (fixed address) executable requires no load-time relocation for
> internal references because all addresses are known at static link time. PIE
> requires the dynamic linker to adjust all absolute addresses.

Yes, but since it's also compiled as PIE, there are no absolute
addresses. Previously QEMU was linked with -shared but compiled as
non-position-independent code.

I am not sure whether only the self-virtualized machine would be subject
to relocation, or also the outer one (maybe address space virtualization
would also have to be taken into account?). Anyway, as far as text
relocations are concerned, Kirill's patch cannot make things worse.

Paolo
On Thu, Sep 3, 2009 at 5:38 PM, Paul Brook<paul@codesourcery.com> wrote:
>> PIE code is usually a bit slower: approximately 1% on i386 according to
>> some tests. RISC architectures should be affected less, since they have
>> more registers. On the other hand, we are getting rid of text relocations
>> on i386, which make executable loading slower. So...
>
> I think you've got that backwards.
> A traditional (fixed address) executable requires no load-time relocation for
> internal references because all addresses are known at static link time. PIE
> requires the dynamic linker to adjust all absolute addresses.

Usermode qemu on i386 is not a traditional executable, since it uses
-Wl,-shared for linking. As a result we get an executable which looks
like a PIE, but the dynamic linker has to resolve text relocations. I think
the best way is to create a true PIE without text relocations.

P.S. I pressed "reply" instead of "reply all" the first time.
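The difference described here, an object that looks like a PIE but still needs text relocations versus a true PIE, can be observed in the dynamic section: `readelf -d` lists a TEXTREL entry (or a TEXTREL flag) when the dynamic linker has to patch the text segment. A minimal shell sketch follows; the helper name `has_textrel` is hypothetical and not part of the patch.

```shell
#!/bin/sh
# has_textrel (hypothetical helper, not part of the patch):
# reads `readelf -d` output on stdin and prints "yes" if the dynamic
# section carries a TEXTREL entry or flag, "no" otherwise.
has_textrel() {
    if grep -q 'TEXTREL' ; then
        echo yes
    else
        echo no
    fi
}
```

Possible usage: `readelf -d i386-linux-user/qemu-i386 | has_textrel`, where the binary path is only an example. A binary built with the old `-Wl,-shared` hack from non-PIC objects would be expected to print "yes", a true PIE "no".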
On Thu, Sep 3, 2009 at 5:39 PM, Paul Brook<paul@codesourcery.com> wrote:
> On Thursday 03 September 2009, Kirill A. Shutemov wrote:
>> Now we can drop link hack for i386 and fix text relocations on i386 host.
>>
>> v2:
>> - Add configure options to enable/disable PIE for usermode targets.
>>   Disabling can be useful if you build using a toolchain which has
>>   broken PIE support. PIE for usermode targets enabled by default.
>
> This isn't as useful as you might think.
>
> How do you stop the host dynamic linker loading qemu where the guest
> application expects to be loaded?

At least it is not worse than it was. To the kernel, qemu with the linking
hack looks like a PIE and it can load it at a random address, doesn't it?

P.S. I pressed "reply" instead of "reply all" the first time.
On Thu, Sep 03, 2009 at 06:07:21PM +0300, Kirill A. Shutemov wrote:
> On Thu, Sep 3, 2009 at 5:38 PM, Paul Brook<paul@codesourcery.com> wrote:
> >> PIE code is usually a bit slower: approximately 1% on i386 according to
> >> some tests. RISC architectures should be affected less, since they have
> >> more registers. On the other hand, we are getting rid of text relocations
> >> on i386, which make executable loading slower. So...
> >
> > I think you've got that backwards.
> > A traditional (fixed address) executable requires no load-time relocation for
> > internal references because all addresses are known at static link time. PIE
> > requires the dynamic linker to adjust all absolute addresses.
>
> Usermode qemu on i386 is not a traditional executable, since it uses
> -Wl,-shared for linking. As a result we get an executable which looks
> like a PIE, but the dynamic linker has to resolve text relocations. I think
> the best way is to create a true PIE without text relocations.

It is close to getting off topic, but since you state it, why try so hard
to avoid text relocations?
Sure, there are advantages (the biggest one is fewer issues with
mis-/insufficiently configured SELinux, I think), possibly better sharing
of pages when many instances are run and better delayed loading, but on
x86/i386 that doesn't sound like a clear advantage compared to the speed
loss, which in some cases is quite relevant.

Also, since this patch adds --disable-pie, isn't the hack currently used
still necessary for that case?
Or is --disable-pie supposed to disable self-hosting? Then maybe the
option should be named --disable-self-hosting (and if that is indeed the
only side-effect it might be better to disable it by default)?
On Thu, Sep 3, 2009 at 8:17 PM, Reimar Döffinger<Reimar.Doeffinger@gmx.de> wrote:
> On Thu, Sep 03, 2009 at 06:07:21PM +0300, Kirill A. Shutemov wrote:
>> On Thu, Sep 3, 2009 at 5:38 PM, Paul Brook<paul@codesourcery.com> wrote:
>> >> PIE code is usually a bit slower: approximately 1% on i386 according to
>> >> some tests. RISC architectures should be affected less, since they have
>> >> more registers. On the other hand, we are getting rid of text relocations
>> >> on i386, which make executable loading slower. So...
>> >
>> > I think you've got that backwards.
>> > A traditional (fixed address) executable requires no load-time relocation for
>> > internal references because all addresses are known at static link time. PIE
>> > requires the dynamic linker to adjust all absolute addresses.
>>
>> Usermode qemu on i386 is not a traditional executable, since it uses
>> -Wl,-shared for linking. As a result we get an executable which looks
>> like a PIE, but the dynamic linker has to resolve text relocations. I think
>> the best way is to create a true PIE without text relocations.
>
> It is close to getting off topic, but since you state it, why try so hard
> to avoid text relocations?
> Sure, there are advantages (the biggest one is fewer issues with
> mis-/insufficiently configured SELinux, I think), possibly better sharing
> of pages when many instances are run and better delayed loading, but on
> x86/i386 that doesn't sound like a clear advantage compared to the speed
> loss, which in some cases is quite relevant.

Do you have any numbers about speed loss?

> Also, since this patch adds --disable-pie, isn't the hack currently used
> still necessary for that case?
> Or is --disable-pie supposed to disable self-hosting? Then maybe the
> option should be named --disable-self-hosting (and if that is indeed the
> only side-effect it might be better to disable it by default)?
On Fri, Sep 04, 2009 at 07:33:25AM +0300, Kirill A. Shutemov wrote:
> On Thu, Sep 3, 2009 at 8:17 PM, Reimar
> Döffinger<Reimar.Doeffinger@gmx.de> wrote:
> > It is close to getting off topic, but since you state it, why try so hard
> > to avoid text relocations?
> > Sure, there are advantages (the biggest one is fewer issues with
> > mis-/insufficiently configured SELinux, I think), possibly better sharing
> > of pages when many instances are run and better delayed loading, but on
> > x86/i386 that doesn't sound like a clear advantage compared to the speed
> > loss, which in some cases is quite relevant.
>
> Do you have any numbers about speed loss?

No, I was getting a bit off-topic. At least with KVM I doubt there
is any relevant speed loss for qemu, though for MPlayer/FFmpeg (very different
situation) it could be about 10%, going by the last tests I did.
On Fri, Sep 4, 2009 at 10:51 AM, Reimar Döffinger<Reimar.Doeffinger@gmx.de> wrote:
> On Fri, Sep 04, 2009 at 07:33:25AM +0300, Kirill A. Shutemov wrote:
>> On Thu, Sep 3, 2009 at 8:17 PM, Reimar
>> Döffinger<Reimar.Doeffinger@gmx.de> wrote:
>> > It is close to getting off topic, but since you state it, why try so hard
>> > to avoid text relocations?
>> > Sure, there are advantages (the biggest one is fewer issues with
>> > mis-/insufficiently configured SELinux, I think), possibly better sharing
>> > of pages when many instances are run and better delayed loading, but on
>> > x86/i386 that doesn't sound like a clear advantage compared to the speed
>> > loss, which in some cases is quite relevant.
>>
>> Do you have any numbers about speed loss?
>
> No, I was getting a bit off-topic. At least with KVM I doubt there
> is any relevant speed loss for qemu, though for MPlayer/FFmpeg (very different
> situation) it could be about 10%, going by the last tests I did.

My patch compiles only usermode targets as PIE, so it will not affect KVM.
diff --git a/Makefile b/Makefile
index bdac9b3..634ea81 100644
--- a/Makefile
+++ b/Makefile
@@ -39,8 +39,6 @@ subdir-%:
 	$(call quiet-command,$(MAKE) $(SUBDIR_MAKEFLAGS) -C $* V="$(V)" TARGET_DIR="$*/" all,)
 
 $(filter %-softmmu,$(SUBDIR_RULES)): libqemu_common.a
 
-$(filter %-user,$(SUBDIR_RULES)): libqemu_user.a
-
 ROMSUBDIR_RULES=$(patsubst %,romsubdir-%, $(ROMS))
 romsubdir-%:
@@ -74,7 +72,7 @@ block-obj-y += $(addprefix block/, $(block-nested-y))
 # CPUs and machines.
 
 obj-y = $(block-obj-y)
-obj-y += readline.o console.o host-utils.o
+obj-y += readline.o console.o
 
 obj-y += irq.o ptimer.o
 obj-y += i2c.o smbus.o smbus_eeprom.o max7310.o max111x.o wm8750.o
@@ -161,12 +159,6 @@ bt-host.o: QEMU_CFLAGS += $(BLUEZ_CFLAGS)
 
 libqemu_common.a: $(obj-y)
 
-#######################################################################
-# user-obj-y is code used by qemu userspace emulation
-user-obj-y = cutils.o cache-utils.o path.o envlist.o host-utils.o
-
-libqemu_user.a: $(user-obj-y)
-
 ######################################################################
 
 qemu-img.o: qemu-img-cmds.h
diff --git a/Makefile.target b/Makefile.target
index f7d1919..f738617 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -31,7 +31,7 @@ all: $(PROGS)
 
 #########################################################
 # cpu emulator library
-libobj-y = exec.o translate-all.o cpu-exec.o translate.o
+libobj-y = exec.o translate-all.o cpu-exec.o translate.o host-utils.o
 libobj-y += tcg/tcg.o tcg/tcg-runtime.o
 libobj-$(CONFIG_SOFTFLOAT) += fpu/softfloat.o
 libobj-$(CONFIG_NOSOFTFLOAT) += fpu/softfloat-native.o
@@ -80,9 +80,9 @@ ifdef CONFIG_LINUX_USER
 
 VPATH+=:$(SRC_PATH)/linux-user:$(SRC_PATH)/linux-user/$(TARGET_ABI_DIR)
 QEMU_CFLAGS+=-I$(SRC_PATH)/linux-user -I$(SRC_PATH)/linux-user/$(TARGET_ABI_DIR)
-
 obj-y = main.o syscall.o strace.o mmap.o signal.o thunk.o \
       elfload.o linuxload.o uaccess.o gdbstub.o gdbstub-xml.o
+obj-y += envlist.o path.o
 
 obj-$(TARGET_HAS_BFLT) += flatload.o
 obj-$(TARGET_HAS_ELFLOAD32) += elfload32.o
@@ -98,7 +98,7 @@ obj-arm-y += arm-semi.o
 
 obj-m68k-y += m68k-sim.o m68k-semi.o
 
-ARLIBS=../libqemu_user.a libqemu.a
+ARLIBS=libqemu.a
 endif #CONFIG_LINUX_USER
 
 #########################################################
@@ -116,6 +116,7 @@ LIBS+=-lmx
 
 obj-y = main.o commpage.o machload.o mmap.o signal.o syscall.o thunk.o \
         gdbstub.o gdbstub-xml.o
+obj-y += envlist.o path.o
 
 obj-i386-y += ioport-user.o
 
@@ -133,13 +134,23 @@ QEMU_CFLAGS+=-I$(SRC_PATH)/bsd-user -I$(SRC_PATH)/bsd-user/$(TARGET_ARCH)
 
 obj-y = main.o bsdload.o elfload.o mmap.o signal.o strace.o syscall.o \
         gdbstub.o gdbstub-xml.o uaccess.o
+obj-y += envlist.o path.o
 
 obj-i386-y += ioport-user.o
 
-ARLIBS=libqemu.a ../libqemu_user.a
+ARLIBS=libqemu.a
 
 endif #CONFIG_BSD_USER
 
+ifdef CONFIG_USER_ONLY
+# hack to compile with -fpie for *-user targets
+obj-y += cutils-user.o cache-utils-user.o
+cutils-user.c cache-utils-user.c:
+	@echo " LN $(TARGET_DIR)$@"
+	@ln -s $(SRC_PATH)/$(@:%-user.c=%.c) $@
+endif
+
+
 #########################################################
 # System emulator target
 ifdef CONFIG_SOFTMMU
diff --git a/configure b/configure
index 0d0162a..4f5850c 100755
--- a/configure
+++ b/configure
@@ -221,6 +221,7 @@ kerneldir=""
 aix="no"
 blobs="yes"
 pkgversion=""
+user_pie="yes"
 
 # OS specific
 if check_define __linux__ ; then
@@ -498,6 +499,10 @@ for opt do
   ;;
   --disable-guest-base) guest_base="no"
   ;;
+  --enable-user-pie) user_pie="yes"
+  ;;
+  --disable-user-pie) user_pie="no"
+  ;;
   --enable-uname-release=*) uname_release="$optarg"
   ;;
   --sparc_cpu=*)
@@ -672,6 +677,8 @@ echo " --disable-bsd-user disable all BSD usermode emulation targets"
 echo " --enable-guest-base enable GUEST_BASE support for usermode"
 echo " emulation targets"
 echo " --disable-guest-base disable GUEST_BASE support"
+echo " --enable-user-pie build usermode emulation targets as PIE"
+echo " --disable-user-pie do not build usermode emulation targets as PIE"
 echo " --fmod-lib path to FMOD library"
 echo " --fmod-inc path to FMOD includes"
 echo " --oss-lib path to OSS library"
@@ -1678,6 +1685,7 @@ echo "Documentation $docs"
 echo "uname -r $uname_release"
 echo "NPTL support $nptl"
 echo "GUEST_BASE $guest_base"
+echo "PIE user targets $user_pie"
 echo "vde support $vde"
 echo "IO thread $io_thread"
 echo "Linux AIO support $linux_aio"
@@ -2302,6 +2310,12 @@ if test "$target_softmmu" = "yes" ; then
   esac
 fi
 
+if test "$target_user_only" = "yes" -a "$static" = "no" -a \
+	"$user_pie" = "yes" ; then
+  cflags="-fpie $cflags"
+  ldflags="-pie $ldflags"
+fi
+
 if test "$target_softmmu" = "yes" -a \( \
         "$TARGET_ARCH" = "microblaze" -o \
         "$TARGET_ARCH" = "cris" \) ; then
@@ -2323,16 +2337,6 @@ fi
 linker_script="-Wl,-T../config-host.ld -Wl,-T,\$(SRC_PATH)/\$(ARCH).ld"
 if test "$target_linux_user" = "yes" -o "$target_bsd_user" = "yes" ; then
   case "$ARCH" in
-  i386)
-    if test "$gprof" = "yes" -o "$static" = "yes" ; then
-      ldflags="$linker_script $ldflags"
-    else
-      # WARNING: this LDFLAGS is _very_ tricky : qemu is an ELF shared object
-      # that the kernel ELF loader considers as an executable. I think this
-      # is the simplest way to make it self virtualizable!
-      ldflags="-Wl,-shared $ldflags"
-    fi
-    ;;
   sparc)
     # -static is used to avoid g1/g3 usage by the dynamic linker
     ldflags="$linker_script -static $ldflags"
@@ -2340,7 +2344,7 @@ if test "$target_linux_user" = "yes" -o "$target_bsd_user" = "yes" ; then
   ia64)
     ldflags="-Wl,-G0 $linker_script -static $ldflags"
     ;;
-  x86_64|ppc|ppc64|s390|sparc64|alpha|arm|m68k|mips|mips64)
+  i386|x86_64|ppc|ppc64|s390|sparc64|alpha|arm|m68k|mips|mips64)
     ldflags="$linker_script $ldflags"
     ;;
   esac
diff --git a/linux-user/main.c b/linux-user/main.c
index a628c01..d3af2e2 100644
--- a/linux-user/main.c
+++ b/linux-user/main.c
@@ -54,26 +54,6 @@ const char *qemu_uname_release = CONFIG_UNAME_RELEASE;
 const char interp[] __attribute__((section(".interp"))) = "/lib/ld-linux.so.2";
 #endif
 
-/* for recent libc, we add these dummy symbols which are not declared
-   when generating a linked object (bug in ld ?) */
-#if (__GLIBC__ > 2 || (__GLIBC__ == 2 && __GLIBC_MINOR__ >= 3)) && !defined(CONFIG_STATIC)
-asm(".globl __preinit_array_start\n"
-    ".globl __preinit_array_end\n"
-    ".globl __init_array_start\n"
-    ".globl __init_array_end\n"
-    ".globl __fini_array_start\n"
-    ".globl __fini_array_end\n"
-    ".section \".rodata\"\n"
-    "__preinit_array_start:\n"
-    "__preinit_array_end:\n"
-    "__init_array_start:\n"
-    "__init_array_end:\n"
-    "__fini_array_start:\n"
-    "__fini_array_end:\n"
-    ".long 0\n"
-    ".previous\n");
-#endif
-
 /* XXX: on x86 MAP_GROWSDOWN only works if ESP <= address + 32, so
    we allocate a bigger stack. Need a better solution, for example
    by remapping the process stack directly at the right place */
Now we can drop link hack for i386 and fix text relocations on i386 host.

v2:
 - Add configure options to enable/disable PIE for usermode targets.
   Disabling can be useful if you build using a toolchain which has
   broken PIE support. PIE for usermode targets enabled by default.

Signed-off-by: Kirill A. Shutemov <kirill@shutemov.name>
---
 Makefile          |   10 +---------
 Makefile.target   |   19 +++++++++++++++----
 configure         |   26 +++++++++++++++-----------
 linux-user/main.c |   20 --------------------
 4 files changed, 31 insertions(+), 44 deletions(-)
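As a reading aid, the condition the patch adds to configure (PIE only for usermode targets, only for non-static builds, and only when `user_pie` is "yes") can be isolated in a small shell function. This is a sketch under stated assumptions: the function name `user_pie_flags` and its output format are hypothetical, while the three inputs and the `-fpie` / `-pie` flags are taken from the patch.

```shell
#!/bin/sh
# user_pie_flags (hypothetical helper, not part of the patch):
# mirrors the condition the patch adds to configure.
# Arguments, each "yes" or "no":
#   $1  target_user_only
#   $2  static
#   $3  user_pie
# Prints the extra flags when a usermode target should be built as PIE,
# and nothing otherwise.
user_pie_flags() {
    if [ "$1" = "yes" ] && [ "$2" = "no" ] && [ "$3" = "yes" ] ; then
        echo "cflags=-fpie ldflags=-pie"
    fi
}
```

For example, `user_pie_flags yes no yes` prints the flags, while a static build (`user_pie_flags yes yes yes`) or a build configured with `--disable-user-pie` (`user_pie_flags yes no no`) prints nothing.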