Message ID | ffd7fc255e194d1e2b0aa3d9d129e826c53219d4.1725611321.git.christophe.leroy@csgroup.eu (mailing list archive) |
---|---|
State | Handled Elsewhere |
Series | [1/2] powerpc/vdso: Fix VDSO data access when running in a non-root time namespace |
On Fri, Sep 06, 2024 at 10:33:44AM +0200, Christophe Leroy wrote:
> Use the new get_realdatapage macro instead of get_datapage
>
> Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
> ---
>  arch/powerpc/kernel/vdso/getrandom.S | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/powerpc/kernel/vdso/getrandom.S b/arch/powerpc/kernel/vdso/getrandom.S
> index a957cd2b2b03..f3bbf931931c 100644
> --- a/arch/powerpc/kernel/vdso/getrandom.S
> +++ b/arch/powerpc/kernel/vdso/getrandom.S
> @@ -31,7 +31,7 @@
>          PPC_STL         r2, PPC_MIN_STKFRM + STK_GOT(r1)
>    .cfi_rel_offset r2, PPC_MIN_STKFRM + STK_GOT
>  #endif
> -        get_datapage    r8
> +        get_realdatapage        r8, r11
>          addi            r8, r8, VDSO_RNG_DATA_OFFSET
>          bl              CFUNC(DOTSYM(\funct))
>          PPC_LL          r0, PPC_MIN_STKFRM + PPC_LR_STKOFF(r1)

I tested that this is working as intended on powerpc, powerpc64, and
powerpc64le. Thanks for writing the patch so quickly.

Jason
On 06/09/2024 at 16:07, Jason A. Donenfeld wrote:
> On Fri, Sep 06, 2024 at 10:33:44AM +0200, Christophe Leroy wrote:
>> Use the new get_realdatapage macro instead of get_datapage
>>
>> Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
>> ---
>>  arch/powerpc/kernel/vdso/getrandom.S | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/arch/powerpc/kernel/vdso/getrandom.S b/arch/powerpc/kernel/vdso/getrandom.S
>> index a957cd2b2b03..f3bbf931931c 100644
>> --- a/arch/powerpc/kernel/vdso/getrandom.S
>> +++ b/arch/powerpc/kernel/vdso/getrandom.S
>> @@ -31,7 +31,7 @@
>>          PPC_STL         r2, PPC_MIN_STKFRM + STK_GOT(r1)
>>    .cfi_rel_offset r2, PPC_MIN_STKFRM + STK_GOT
>>  #endif
>> -        get_datapage    r8
>> +        get_realdatapage        r8, r11
>>          addi            r8, r8, VDSO_RNG_DATA_OFFSET
>>          bl              CFUNC(DOTSYM(\funct))
>>          PPC_LL          r0, PPC_MIN_STKFRM + PPC_LR_STKOFF(r1)
>
> I tested that this is working as intended on powerpc, powerpc64, and
> powerpc64le. Thanks for writing the patch so quickly.

You are welcome. And thanks for playing around with it while I was
sleeping and getting ideas too.

Did you learn powerpc assembly during the night or did you know it
already?

In the end I ended up with something which I think is simple enough for
a backport to stable.

In the long run I wonder if we should try to find a more generic
solution for getrandom instead of requiring each architecture to handle
it. On gettimeofday the selection of the right page is embedded in the
generic part, see for instance:

static __maybe_unused __kernel_old_time_t
__cvdso_time_data(const struct vdso_data *vd, __kernel_old_time_t *time)
{
        __kernel_old_time_t t;

        if (IS_ENABLED(CONFIG_TIME_NS) &&
            vd->clock_mode == VDSO_CLOCKMODE_TIMENS)
                vd = __arch_get_timens_vdso_data(vd);

        t = READ_ONCE(vd[CS_HRES_COARSE].basetime[CLOCK_REALTIME].sec);

        if (time)
                *time = t;

        return t;
}

and powerpc just provides:

static __always_inline
const struct vdso_data *__arch_get_timens_vdso_data(const struct vdso_data *vd)
{
        return (void *)vd + (1U << CONFIG_PAGE_SHIFT);
}

I know it may not be that simple for getrandom but it's probably worth
trying.

Or another solution could be to put random data in a third page that is
always at the same place regardless of timens?

Christophe
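[Editorial sketch] The page-selection pattern described in the message above can be modelled in user space roughly as follows. This is only an illustration under stated assumptions: a 4 KiB page, two adjacent pages, and a reduced data structure. Every `sketch_`/`SKETCH_` name and value is hypothetical and not a kernel identifier.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed page geometry for this sketch only (4 KiB pages). */
#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PAGE_SIZE (1UL << SKETCH_PAGE_SHIFT)

/* Hypothetical stand-ins for the clock modes; values are assumptions. */
enum { SKETCH_CLOCKMODE_NONE = 0, SKETCH_CLOCKMODE_TIMENS = 1 };

/* Reduced stand-in for the vDSO data structure. */
struct sketch_vdso_data {
	int32_t clock_mode;
	int64_t realtime_sec;
};

/* One page worth of storage holding the data at its start. */
union sketch_page {
	struct sketch_vdso_data data;
	unsigned char pad[SKETCH_PAGE_SIZE];
};

/* Arch hook: the real data lives exactly one page after the first page. */
static const struct sketch_vdso_data *
sketch_get_timens_vdso_data(const struct sketch_vdso_data *vd)
{
	return (const struct sketch_vdso_data *)
		((const char *)vd + SKETCH_PAGE_SIZE);
}

/* Generic part: switch pages only when the first page is marked TIMENS. */
static int64_t sketch_time(const struct sketch_vdso_data *vd)
{
	if (vd->clock_mode == SKETCH_CLOCKMODE_TIMENS)
		vd = sketch_get_timens_vdso_data(vd);
	return vd->realtime_sec;
}
```

With two adjacent `union sketch_page` objects, `sketch_time()` reads from the first page unless that page is marked `SKETCH_CLOCKMODE_TIMENS`, in which case it reads from the second.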
On Fri, Sep 06, 2024 at 04:26:32PM +0200, Christophe Leroy wrote:
> And thanks for playing around with it while I was sleeping and getting
> ideas too.
>
> Did you learn powerpc assembly during the night or did you know it
> already?

I don't really know ppc assembly. I had perused the tree over the last
week and gotten some feel for it when reviewing patches, but I don't
have anything memorized (except, perhaps, the eieio instruction [1,2]).
Last night after sending the first broken patch I went out to play (I
play jazz guitar ~every night these days), and the whole time I kept
thinking about the problem. So first thing I did when I got home was try
to fake my way through some ppc asm. A fun mini project for me.

[1] https://lore.kernel.org/lkml/Pine.LNX.4.33.0110120919130.31677-100000@penguin.transmeta.com/
[2] https://lore.kernel.org/lkml/alpine.LFD.2.00.0904141006170.18124@localhost.localdomain/

> In the end I ended up with something which I think is simple enough for
> a backport to stable.

It seems like a good patch indeed, and hopefully small enough that
Michael will let me carry it in my tree for 6.12, per the plan.

> In the long run I wonder if we should try to find a more generic
> solution for getrandom instead of requiring each architecture to handle
> it. On gettimeofday the selection of the right page is embedded in the
> generic part, see for instance:
>
> static __maybe_unused __kernel_old_time_t
> __cvdso_time_data(const struct vdso_data *vd, __kernel_old_time_t *time)
> {
>         __kernel_old_time_t t;
>
>         if (IS_ENABLED(CONFIG_TIME_NS) &&
>             vd->clock_mode == VDSO_CLOCKMODE_TIMENS)
>                 vd = __arch_get_timens_vdso_data(vd);
>
>         t = READ_ONCE(vd[CS_HRES_COARSE].basetime[CLOCK_REALTIME].sec);
>
>         if (time)
>                 *time = t;
>
>         return t;
> }
>
> and powerpc just provides:
>
> static __always_inline
> const struct vdso_data *__arch_get_timens_vdso_data(const struct vdso_data *vd)
> {
>         return (void *)vd + (1U << CONFIG_PAGE_SHIFT);
> }

It's tempting, but maybe a bit tricky. LoongArch, for example, doesn't
have this problem at all, because the layout of their vvars doesn't
require it. So the vd->clock_mode access is unnecessary.

> Or another solution could be to put random data in a third page that is
> always at the same place regardless of timens?

Maybe that's the easier way, yea. Potentially wasteful, though.

Jason
On 06/09/2024 at 16:46, Jason A. Donenfeld wrote:
> On Fri, Sep 06, 2024 at 04:26:32PM +0200, Christophe Leroy wrote:
>
>> In the long run I wonder if we should try to find a more generic
>> solution for getrandom instead of requiring each architecture to handle
>> it. On gettimeofday the selection of the right page is embedded in the
>> generic part, see for instance:
>>
>> static __maybe_unused __kernel_old_time_t
>> __cvdso_time_data(const struct vdso_data *vd, __kernel_old_time_t *time)
>> {
>>         __kernel_old_time_t t;
>>
>>         if (IS_ENABLED(CONFIG_TIME_NS) &&
>>             vd->clock_mode == VDSO_CLOCKMODE_TIMENS)
>>                 vd = __arch_get_timens_vdso_data(vd);
>>
>>         t = READ_ONCE(vd[CS_HRES_COARSE].basetime[CLOCK_REALTIME].sec);
>>
>>         if (time)
>>                 *time = t;
>>
>>         return t;
>> }
>>
>> and powerpc just provides:
>>
>> static __always_inline
>> const struct vdso_data *__arch_get_timens_vdso_data(const struct vdso_data *vd)
>> {
>>         return (void *)vd + (1U << CONFIG_PAGE_SHIFT);
>> }
>
> It's tempting, but maybe a bit tricky. LoongArch, for example, doesn't
> have this problem at all, because the layout of their vvars doesn't
> require it. So the vd->clock_mode access is unnecessary.
>
>> Or another solution could be to put random data in a third page that is
>> always at the same place regardless of timens?
>
> Maybe that's the easier way, yea. Potentially wasteful, though.
>

Indeed I just looked at LoongArch and that's exactly what they do: they
have a third page after the two pages dedicated to TIME for arch
specific data, and they have added getrandom data there.

The third page is common to every process so it won't waste more than a
few bytes. It doesn't worry me even on the older boards that only have
32 Mbytes of RAM.

So yes, I may have a look at that in the future; what we have at the
moment is good enough to move forward.

Christophe
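[Editorial sketch] The "third page" idea discussed above might look roughly like this in a user-space model. It is not the LoongArch or powerpc implementation: the page size, the enum entries, the structure layout and every `sketch_`/`SKETCH_` name are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed page geometry for this sketch only (4 KiB pages). */
#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PAGE_SIZE (1UL << SKETCH_PAGE_SHIFT)

/* Hypothetical vvar layout: two time pages followed by one arch page. */
enum sketch_vvar_pages {
	SKETCH_VVAR_DATA_PAGE,
	SKETCH_VVAR_TIMENS_PAGE,
	SKETCH_VVAR_ARCH_PAGE,	/* RNG data would live here */
	SKETCH_VVAR_NR_PAGES,
};

/* Reduced, illustrative stand-in for the RNG data. */
struct sketch_rng_data {
	uint64_t generation;
	uint8_t is_ready;
};

/* One page worth of storage holding the RNG data at its start. */
union sketch_page {
	struct sketch_rng_data rng;
	unsigned char pad[SKETCH_PAGE_SIZE];
};

/*
 * The RNG data sits at a fixed offset from the vvar base, so no
 * clock_mode check is needed: the time pages may describe a time
 * namespace or not, the third page is found the same way either way.
 */
static const struct sketch_rng_data *sketch_get_rng_data(const void *vvar_base)
{
	return (const struct sketch_rng_data *)
		((const char *)vvar_base +
		 ((unsigned long)SKETCH_VVAR_ARCH_PAGE << SKETCH_PAGE_SHIFT));
}
```

The design point is that the lookup is a single constant addition, independent of the time namespace state.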
On Fri, Sep 06, 2024 at 05:14:43PM +0200, Christophe Leroy wrote:
>
>
> On 06/09/2024 at 16:46, Jason A. Donenfeld wrote:
> > On Fri, Sep 06, 2024 at 04:26:32PM +0200, Christophe Leroy wrote:
> >
> >> In the long run I wonder if we should try to find a more generic
> >> solution for getrandom instead of requiring each architecture to handle
> >> it. On gettimeofday the selection of the right page is embedded in the
> >> generic part, see for instance:
> >>
> >> static __maybe_unused __kernel_old_time_t
> >> __cvdso_time_data(const struct vdso_data *vd, __kernel_old_time_t *time)
> >> {
> >>         __kernel_old_time_t t;
> >>
> >>         if (IS_ENABLED(CONFIG_TIME_NS) &&
> >>             vd->clock_mode == VDSO_CLOCKMODE_TIMENS)
> >>                 vd = __arch_get_timens_vdso_data(vd);
> >>
> >>         t = READ_ONCE(vd[CS_HRES_COARSE].basetime[CLOCK_REALTIME].sec);
> >>
> >>         if (time)
> >>                 *time = t;
> >>
> >>         return t;
> >> }
> >>
> >> and powerpc just provides:
> >>
> >> static __always_inline
> >> const struct vdso_data *__arch_get_timens_vdso_data(const struct vdso_data *vd)
> >> {
> >>         return (void *)vd + (1U << CONFIG_PAGE_SHIFT);
> >> }
> >
> > It's tempting, but maybe a bit tricky. LoongArch, for example, doesn't
> > have this problem at all, because the layout of their vvars doesn't
> > require it. So the vd->clock_mode access is unnecessary.
> >
> >> Or another solution could be to put random data in a third page that is
> >> always at the same place regardless of timens?
> >
> > Maybe that's the easier way, yea. Potentially wasteful, though.
> >
>
> Indeed I just looked at LoongArch and that's exactly what they do: they
> have a third page after the two pages dedicated to TIME for arch
> specific data, and they have added getrandom data there.
>
> The third page is common to every process so it won't waste more than a
> few bytes. It doesn't worry me even on the older boards that only have
> 32 Mbytes of RAM.
>
> So yes, I may have a look at that in the future; what we have at the
> moment is good enough to move forward.

My x86 code is kind of icky for this:

static __always_inline const struct vdso_rng_data *__arch_get_vdso_rng_data(void)
{
        if (IS_ENABLED(CONFIG_TIME_NS) && __vdso_data->clock_mode == VDSO_CLOCKMODE_TIMENS)
                return (void *)&__vdso_rng_data + ((void *)&__timens_vdso_data - (void *)&__vdso_data);
        return &__vdso_rng_data;
}

Doing the subtraction like that means that this is more clearly correct.
But it also makes the compiler insert two jumps for the branch, and then
reads the addresses of those variables and such.

If I change it to:

static __always_inline const struct vdso_rng_data *__arch_get_vdso_rng_data(void)
{
        if (IS_ENABLED(CONFIG_TIME_NS) && __vdso_data->clock_mode == VDSO_CLOCKMODE_TIMENS)
                return (void *)&__vdso_rng_data + (3UL << CONFIG_PAGE_SHIFT);
        return &__vdso_rng_data;
}

Then there's a much nicer single `cmov` with no branching.

But if I want to do that for real, I'll have to figure out what set of
nice compile-time constants I can use. I haven't looked into this yet.

Jason
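[Editorial sketch] The trade-off described above, between deriving the offset by subtracting two addresses and hard-coding a page-count constant, can be illustrated in user space as below. This is not the x86 code: the layout structure, the 4 KiB page size, the three-page distance and all `sketch_`/`SKETCH_` names are assumptions. The compile-time assertion shows one possible way to keep a hard-coded constant honest against the layout it is meant to describe.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed page geometry and distance for this sketch only. */
#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PAGE_SIZE (1UL << SKETCH_PAGE_SHIFT)
#define SKETCH_TIMENS_OFFSET (3UL << SKETCH_PAGE_SHIFT)

/* Hypothetical layout: the timens page sits three pages after the data page. */
struct sketch_layout {
	unsigned char data_page[SKETCH_PAGE_SIZE];
	unsigned char other_pages[2][SKETCH_PAGE_SIZE];
	unsigned char timens_page[SKETCH_PAGE_SIZE];
};

/* Fail the build if the constant ever drifts from the layout. */
_Static_assert(offsetof(struct sketch_layout, timens_page) -
	       offsetof(struct sketch_layout, data_page) == SKETCH_TIMENS_OFFSET,
	       "constant offset must match the layout");

/* Variant 1: compute the shift from the two page addresses at run time. */
static const void *sketch_shift_by_subtraction(const struct sketch_layout *l,
					       const void *p)
{
	return (const char *)p +
	       ((const char *)l->timens_page - (const char *)l->data_page);
}

/* Variant 2: apply the compile-time constant directly. */
static const void *sketch_shift_by_constant(const void *p)
{
	return (const char *)p + SKETCH_TIMENS_OFFSET;
}
```

Both variants return the same pointer for any `p` inside `data_page`; what differs is whether the compiler sees a constant or has to load two addresses first.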
On Fri, Sep 06, 2024 at 08:54:49PM +0200, Jason A. Donenfeld wrote:
> On Fri, Sep 06, 2024 at 05:14:43PM +0200, Christophe Leroy wrote:
> >
> >
> > On 06/09/2024 at 16:46, Jason A. Donenfeld wrote:
> > > On Fri, Sep 06, 2024 at 04:26:32PM +0200, Christophe Leroy wrote:
> > >
> > >> In the long run I wonder if we should try to find a more generic
> > >> solution for getrandom instead of requiring each architecture to handle
> > >> it. On gettimeofday the selection of the right page is embedded in the
> > >> generic part, see for instance:
> > >>
> > >> static __maybe_unused __kernel_old_time_t
> > >> __cvdso_time_data(const struct vdso_data *vd, __kernel_old_time_t *time)
> > >> {
> > >>         __kernel_old_time_t t;
> > >>
> > >>         if (IS_ENABLED(CONFIG_TIME_NS) &&
> > >>             vd->clock_mode == VDSO_CLOCKMODE_TIMENS)
> > >>                 vd = __arch_get_timens_vdso_data(vd);
> > >>
> > >>         t = READ_ONCE(vd[CS_HRES_COARSE].basetime[CLOCK_REALTIME].sec);
> > >>
> > >>         if (time)
> > >>                 *time = t;
> > >>
> > >>         return t;
> > >> }
> > >>
> > >> and powerpc just provides:
> > >>
> > >> static __always_inline
> > >> const struct vdso_data *__arch_get_timens_vdso_data(const struct vdso_data *vd)
> > >> {
> > >>         return (void *)vd + (1U << CONFIG_PAGE_SHIFT);
> > >> }
> > >
> > > It's tempting, but maybe a bit tricky. LoongArch, for example, doesn't
> > > have this problem at all, because the layout of their vvars doesn't
> > > require it. So the vd->clock_mode access is unnecessary.
> > >
> > >> Or another solution could be to put random data in a third page that is
> > >> always at the same place regardless of timens?
> > >
> > > Maybe that's the easier way, yea. Potentially wasteful, though.
> > >
> >
> > Indeed I just looked at LoongArch and that's exactly what they do: they
> > have a third page after the two pages dedicated to TIME for arch
> > specific data, and they have added getrandom data there.
> >
> > The third page is common to every process so it won't waste more than a
> > few bytes. It doesn't worry me even on the older boards that only have
> > 32 Mbytes of RAM.
> >
> > So yes, I may have a look at that in the future; what we have at the
> > moment is good enough to move forward.
>
> My x86 code is kind of icky for this:
>
> static __always_inline const struct vdso_rng_data *__arch_get_vdso_rng_data(void)
> {
>         if (IS_ENABLED(CONFIG_TIME_NS) && __vdso_data->clock_mode == VDSO_CLOCKMODE_TIMENS)
>                 return (void *)&__vdso_rng_data + ((void *)&__timens_vdso_data - (void *)&__vdso_data);
>         return &__vdso_rng_data;
> }
>
> Doing the subtraction like that means that this is more clearly correct.
> But it also makes the compiler insert two jumps for the branch, and then
> reads the addresses of those variables and such.
>
> If I change it to:
>
> static __always_inline const struct vdso_rng_data *__arch_get_vdso_rng_data(void)
> {
>         if (IS_ENABLED(CONFIG_TIME_NS) && __vdso_data->clock_mode == VDSO_CLOCKMODE_TIMENS)
>                 return (void *)&__vdso_rng_data + (3UL << CONFIG_PAGE_SHIFT);
>         return &__vdso_rng_data;
> }
>
> Then there's a much nicer single `cmov` with no branching.
>
> But if I want to do that for real, I'll have to figure out what set of
> nice compile-time constants I can use. I haven't looked into this yet.

https://lore.kernel.org/all/20240906190655.2777023-1-Jason@zx2c4.com/
On 07/09/2024 at 16:35, Jason A. Donenfeld wrote:
> On Fri, Sep 06, 2024 at 08:54:49PM +0200, Jason A. Donenfeld wrote:
>> On Fri, Sep 06, 2024 at 05:14:43PM +0200, Christophe Leroy wrote:
>>>
>>>
>>> On 06/09/2024 at 16:46, Jason A. Donenfeld wrote:
>>>> On Fri, Sep 06, 2024 at 04:26:32PM +0200, Christophe Leroy wrote:
>>>>
>>>>> In the long run I wonder if we should try to find a more generic
>>>>> solution for getrandom instead of requiring each architecture to handle
>>>>> it. On gettimeofday the selection of the right page is embedded in the
>>>>> generic part, see for instance:
>>>>>
>>>>> static __maybe_unused __kernel_old_time_t
>>>>> __cvdso_time_data(const struct vdso_data *vd, __kernel_old_time_t *time)
>>>>> {
>>>>>         __kernel_old_time_t t;
>>>>>
>>>>>         if (IS_ENABLED(CONFIG_TIME_NS) &&
>>>>>             vd->clock_mode == VDSO_CLOCKMODE_TIMENS)
>>>>>                 vd = __arch_get_timens_vdso_data(vd);
>>>>>
>>>>>         t = READ_ONCE(vd[CS_HRES_COARSE].basetime[CLOCK_REALTIME].sec);
>>>>>
>>>>>         if (time)
>>>>>                 *time = t;
>>>>>
>>>>>         return t;
>>>>> }
>>>>>
>>>>> and powerpc just provides:
>>>>>
>>>>> static __always_inline
>>>>> const struct vdso_data *__arch_get_timens_vdso_data(const struct vdso_data *vd)
>>>>> {
>>>>>         return (void *)vd + (1U << CONFIG_PAGE_SHIFT);
>>>>> }
>>>>
>>>> It's tempting, but maybe a bit tricky. LoongArch, for example, doesn't
>>>> have this problem at all, because the layout of their vvars doesn't
>>>> require it. So the vd->clock_mode access is unnecessary.
>>>>
>>>>> Or another solution could be to put random data in a third page that is
>>>>> always at the same place regardless of timens?
>>>>
>>>> Maybe that's the easier way, yea. Potentially wasteful, though.
>>>>
>>>
>>> Indeed I just looked at LoongArch and that's exactly what they do: they
>>> have a third page after the two pages dedicated to TIME for arch
>>> specific data, and they have added getrandom data there.
>>>
>>> The third page is common to every process so it won't waste more than a
>>> few bytes. It doesn't worry me even on the older boards that only have
>>> 32 Mbytes of RAM.
>>>
>>> So yes, I may have a look at that in the future; what we have at the
>>> moment is good enough to move forward.
>>
>> My x86 code is kind of icky for this:
>>
>> static __always_inline const struct vdso_rng_data *__arch_get_vdso_rng_data(void)
>> {
>>         if (IS_ENABLED(CONFIG_TIME_NS) && __vdso_data->clock_mode == VDSO_CLOCKMODE_TIMENS)
>>                 return (void *)&__vdso_rng_data + ((void *)&__timens_vdso_data - (void *)&__vdso_data);
>>         return &__vdso_rng_data;
>> }
>>
>> Doing the subtraction like that means that this is more clearly correct.
>> But it also makes the compiler insert two jumps for the branch, and then
>> reads the addresses of those variables and such.
>>
>> If I change it to:
>>
>> static __always_inline const struct vdso_rng_data *__arch_get_vdso_rng_data(void)
>> {
>>         if (IS_ENABLED(CONFIG_TIME_NS) && __vdso_data->clock_mode == VDSO_CLOCKMODE_TIMENS)
>>                 return (void *)&__vdso_rng_data + (3UL << CONFIG_PAGE_SHIFT);
>>         return &__vdso_rng_data;
>> }
>>
>> Then there's a much nicer single `cmov` with no branching.
>>
>> But if I want to do that for real, I'll have to figure out what set of
>> nice compile-time constants I can use. I haven't looked into this yet.
>
> https://lore.kernel.org/all/20240906190655.2777023-1-Jason@zx2c4.com/

Looks good.

Although other architectures don't use defines but enums for that:

arch/arm64/kernel/vdso.c-36-
arch/arm64/kernel/vdso.c-37-enum vvar_pages {
arch/arm64/kernel/vdso.c:38:    VVAR_DATA_PAGE_OFFSET,
arch/arm64/kernel/vdso.c:39:    VVAR_TIMENS_PAGE_OFFSET,
arch/arm64/kernel/vdso.c-40-    VVAR_NR_PAGES,
arch/arm64/kernel/vdso.c-41-};
--
arch/loongarch/include/asm/vdso/vdso.h-36-
arch/loongarch/include/asm/vdso/vdso.h-37-enum vvar_pages {
arch/loongarch/include/asm/vdso/vdso.h:38:      VVAR_GENERIC_PAGE_OFFSET,
arch/loongarch/include/asm/vdso/vdso.h:39:      VVAR_TIMENS_PAGE_OFFSET,
arch/loongarch/include/asm/vdso/vdso.h-40-      VVAR_LOONGARCH_PAGES_START,
arch/loongarch/include/asm/vdso/vdso.h-41-      VVAR_LOONGARCH_PAGES_END = VVAR_LOONGARCH_PAGES_START + LOONGARCH_VDSO_DATA_PAGES - 1,
--
arch/powerpc/kernel/vdso.c-54-
arch/powerpc/kernel/vdso.c-55-enum vvar_pages {
arch/powerpc/kernel/vdso.c:56:  VVAR_DATA_PAGE_OFFSET,
arch/powerpc/kernel/vdso.c:57:  VVAR_TIMENS_PAGE_OFFSET,
arch/powerpc/kernel/vdso.c-58-  VVAR_NR_PAGES,
arch/powerpc/kernel/vdso.c-59-};
--
arch/riscv/kernel/vdso.c-19-
arch/riscv/kernel/vdso.c-20-enum vvar_pages {
arch/riscv/kernel/vdso.c:21:    VVAR_DATA_PAGE_OFFSET,
arch/riscv/kernel/vdso.c:22:    VVAR_TIMENS_PAGE_OFFSET,
arch/riscv/kernel/vdso.c-23-    VVAR_NR_PAGES,
arch/riscv/kernel/vdso.c-24-};
--
arch/s390/kernel/vdso.c-31-
arch/s390/kernel/vdso.c-32-enum vvar_pages {
arch/s390/kernel/vdso.c:33:     VVAR_DATA_PAGE_OFFSET,
arch/s390/kernel/vdso.c:34:     VVAR_TIMENS_PAGE_OFFSET,
arch/s390/kernel/vdso.c-35-     VVAR_NR_PAGES,
arch/s390/kernel/vdso.c-36-};

Christophe
diff --git a/arch/powerpc/kernel/vdso/getrandom.S b/arch/powerpc/kernel/vdso/getrandom.S
index a957cd2b2b03..f3bbf931931c 100644
--- a/arch/powerpc/kernel/vdso/getrandom.S
+++ b/arch/powerpc/kernel/vdso/getrandom.S
@@ -31,7 +31,7 @@
         PPC_STL         r2, PPC_MIN_STKFRM + STK_GOT(r1)
   .cfi_rel_offset r2, PPC_MIN_STKFRM + STK_GOT
 #endif
-        get_datapage    r8
+        get_realdatapage        r8, r11
         addi            r8, r8, VDSO_RNG_DATA_OFFSET
         bl              CFUNC(DOTSYM(\funct))
         PPC_LL          r0, PPC_MIN_STKFRM + PPC_LR_STKOFF(r1)
Use the new get_realdatapage macro instead of get_datapage

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
 arch/powerpc/kernel/vdso/getrandom.S | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)