Message ID | 20171219114112.939391-1-arnd@arndb.de |
---|---|
State | New |
Headers | show |
Series | bug.h: Work around GCC PR82365 in BUG() | expand |
Hi Arnd, On Tue, Dec 19, 2017 at 12:39 PM, Arnd Bergmann <arnd@arndb.de> wrote: > The name barrier_before_unreachable() is a bit suboptimal here, > as it fails to describe the fact that it is needed for both > __builtin_unreachable() and for calling noreturn functions. Any other > suggestions would be welcome here. /me joins bikeshedding... barrier_before_noreturn()? barrier_before_dead_end()? Gr{oetje,eeting}s, Geert -- Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org In personal conversations with technical people, I call myself a hacker. But when I'm talking to journalists I just say "programmer" or something like that. -- Linus Torvalds
On 12/19/2017 03:41 AM, Arnd Bergmann wrote: > Looking at functions with large stack frames across all architectures > led me discovering that BUG() suffers from the same problem as > fortify_panic(), which I've added a workaround for already. In short, > variables that go out of scope by calling a noreturn function or > __builtin_unreachable() keep using stack space in functions afterwards. > > A workaround that was identified is to insert an empty assembler statement > just before calling the function that doesn't return. I'm adding a macro > "barrier_before_unreachable()" to document this, and insert calls to > that in all instances of BUG() that currently suffer from this problem. > > The files that saw the largest change from this had these frame sizes > before, and much less with my patch: > > fs/ext4/inode.c:82:1: warning: the frame size of 1672 bytes is larger than 800 bytes [-Wframe-larger-than=] > fs/ext4/namei.c:434:1: warning: the frame size of 904 bytes is larger than 800 bytes [-Wframe-larger-than=] > fs/ext4/super.c:2279:1: warning: the frame size of 1160 bytes is larger than 800 bytes [-Wframe-larger-than=] > fs/ext4/xattr.c:146:1: warning: the frame size of 1168 bytes is larger than 800 bytes [-Wframe-larger-than=] > fs/f2fs/inode.c:152:1: warning: the frame size of 1424 bytes is larger than 800 bytes [-Wframe-larger-than=] > net/netfilter/ipvs/ip_vs_core.c:1195:1: warning: the frame size of 1068 bytes is larger than 800 bytes [-Wframe-larger-than=] > net/netfilter/ipvs/ip_vs_core.c:395:1: warning: the frame size of 1084 bytes is larger than 800 bytes [-Wframe-larger-than=] > net/netfilter/ipvs/ip_vs_ftp.c:298:1: warning: the frame size of 928 bytes is larger than 800 bytes [-Wframe-larger-than=] > net/netfilter/ipvs/ip_vs_ftp.c:418:1: warning: the frame size of 908 bytes is larger than 800 bytes [-Wframe-larger-than=] > net/netfilter/ipvs/ip_vs_lblcr.c:718:1: warning: the frame size of 960 bytes is larger than 800 bytes [-Wframe-larger-than=] > drivers/net/xen-netback/netback.c:1500:1: warning: the frame size of 1088 bytes is larger than 800 bytes [-Wframe-larger-than=] > > In case of ARC and CRIS, it turns out that the BUG() implementation > actually does return (or at least the compiler thinks it does), resulting > in lots of warnings about uninitialized variable use and leaving noreturn > functions, such as: > > block/cfq-iosched.c: In function 'cfq_async_queue_prio': > block/cfq-iosched.c:3804:1: error: control reaches end of non-void function [-Werror=return-type] > include/linux/dmaengine.h: In function 'dma_maxpq': > include/linux/dmaengine.h:1123:1: error: control reaches end of non-void function [-Werror=return-type] > > This makes them call __builtin_trap() instead, which should normally > dump the stack and kill the current process, like some of the other > architectures already do. > > I tried adding barrier_before_unreachable() to panic() and fortify_panic() > as well, but that had very little effect, so I'm not submitting that > patch. > > Link: https://urldefense.proofpoint.com/v2/url?u=https-3A__gcc.gnu.org_bugzilla_show-5Fbug.cgi-3Fid-3D82365&d=DwIBAg&c=DPL6_X_6JkXFx7AXWqB0tg&r=c14YS-cH-kdhTOW89KozFhBtBJgs1zXscZojEZQ0THs&m=3Iu4HWDn1cXkYBpSFh5I80IzDKJi33hs5DbfGM-b3mI&s=sTrcyN5ej_ION8hJvF9eGLUZYwdlwI50vXUp3MK-XWY&e= > Signed-off-by: Arnd Bergmann <arnd@arndb.de> > --- > The name barrier_before_unreachable() is a bit suboptimal here, > as it fails to describe the fact that it is needed for both > __builtin_unreachable() and for calling noreturn functions. Any other > suggestions would be welcome here. > --- > arch/arc/include/asm/bug.h | 3 ++- > arch/cris/include/arch-v10/arch/bug.h | 11 +++++++++-- > arch/ia64/include/asm/bug.h | 6 +++++- > arch/m68k/include/asm/bug.h | 3 +++ > arch/sparc/include/asm/bug.h | 6 +++++- > include/asm-generic/bug.h | 1 + > include/linux/compiler-gcc.h | 15 ++++++++++++++- > include/linux/compiler.h | 5 +++++ > 8 files changed, 44 insertions(+), 6 deletions(-) > > diff --git a/arch/arc/include/asm/bug.h b/arch/arc/include/asm/bug.h > index ea022d47896c..21ec82466d62 100644 > --- a/arch/arc/include/asm/bug.h > +++ b/arch/arc/include/asm/bug.h > @@ -23,7 +23,8 @@ void die(const char *str, struct pt_regs *regs, unsigned long address); > > #define BUG() do { \ > pr_warn("BUG: failure at %s:%d/%s()!\n", __FILE__, __LINE__, __func__); \ > - dump_stack(); \ > + barrier_before_unreachable(); \ > + __builtin_trap(); \ > } while (0) > > #define HAVE_ARCH_BUG > diff --git a/arch/cris/include/arch-v10/arch/bug.h b/arch/cris/include/arch-v10/arch/bug.h > index 905afeacfedf..06da9d49152a 100644 > --- a/arch/cris/include/arch-v10/arch/bug.h > +++ b/arch/cris/include/arch-v10/arch/bug.h > @@ -44,18 +44,25 @@ struct bug_frame { > * not be used like this with newer versions of gcc. > */ > #define BUG() \ > +do { \ > __asm__ __volatile__ ("clear.d [" __stringify(BUG_MAGIC) "]\n\t"\ > "movu.w " __stringify(__LINE__) ",$r0\n\t"\ > "jump 0f\n\t" \ > ".section .rodata\n" \ > "0:\t.string \"" __FILE__ "\"\n\t" \ > - ".previous") > + ".previous"); \ > + unreachable(); \ > +} while (0) > #endif > > #else > > /* This just causes an oops. */ > -#define BUG() (*(int *)0 = 0) > +#define BUG() \ > +do { \ > + barrier_before_unreachable(); \ > + __builtin_trap(); \ I suppose BUG() implies "dead end" like semantics - which ARC was lacking before ? > +} while (0) > > #endif > > diff --git a/arch/ia64/include/asm/bug.h b/arch/ia64/include/asm/bug.h > index bd3eeb8d1cfa..66b37a532765 100644 > --- a/arch/ia64/include/asm/bug.h > +++ b/arch/ia64/include/asm/bug.h > @@ -4,7 +4,11 @@ > > #ifdef CONFIG_BUG > #define ia64_abort() __builtin_trap() > -#define BUG() do { printk("kernel BUG at %s:%d!\n", __FILE__, __LINE__); ia64_abort(); } while (0) > +#define BUG() do { \ > + printk("kernel BUG at %s:%d!\n", __FILE__, __LINE__); \ > + barrier_before_unreachable(); \ > + ia64_abort(); \ > +} while (0) > > /* should this BUG be made generic? */ > #define HAVE_ARCH_BUG > diff --git a/arch/m68k/include/asm/bug.h b/arch/m68k/include/asm/bug.h > index b7e2bf1ba4a6..275dca1435bf 100644 > --- a/arch/m68k/include/asm/bug.h > +++ b/arch/m68k/include/asm/bug.h > @@ -8,16 +8,19 @@ > #ifndef CONFIG_SUN3 > #define BUG() do { \ > pr_crit("kernel BUG at %s:%d!\n", __FILE__, __LINE__); \ > + barrier_before_unreachable(); \ > __builtin_trap(); \ > } while (0) > #else > #define BUG() do { \ > pr_crit("kernel BUG at %s:%d!\n", __FILE__, __LINE__); \ > + barrier_before_unreachable(); \ > panic("BUG!"); \ > } while (0) > #endif > #else > #define BUG() do { \ > + barrier_before_unreachable(); \ > __builtin_trap(); \ > } while (0) > #endif > diff --git a/arch/sparc/include/asm/bug.h b/arch/sparc/include/asm/bug.h > index 6f17528356b2..ea53e418f6c0 100644 > --- a/arch/sparc/include/asm/bug.h > +++ b/arch/sparc/include/asm/bug.h > @@ -9,10 +9,14 @@ > void do_BUG(const char *file, int line); > #define BUG() do { \ > do_BUG(__FILE__, __LINE__); \ > + barrier_before_unreachable(); \ > __builtin_trap(); \ > } while (0) > #else > -#define BUG() __builtin_trap() > +#define BUG() do { \ > + barrier_before_unreachable(); \ > + __builtin_trap(); \ > +} while (0) > #endif > > #define HAVE_ARCH_BUG > diff --git a/include/asm-generic/bug.h b/include/asm-generic/bug.h > index 963b755d19b0..a7613e1b0c87 100644 > --- a/include/asm-generic/bug.h > +++ b/include/asm-generic/bug.h > @@ -52,6 +52,7 @@ struct bug_entry { > #ifndef HAVE_ARCH_BUG > #define BUG() do { \ > printk("BUG: failure at %s:%d/%s()!\n", __FILE__, __LINE__, __func__); \ > + barrier_before_unreachable(); \ > panic("BUG!"); \ > } while (0) > #endif > diff --git a/include/linux/compiler-gcc.h b/include/linux/compiler-gcc.h > index 5d595cfdb2c4..66cfdad68f7e 100644 > --- a/include/linux/compiler-gcc.h > +++ b/include/linux/compiler-gcc.h > @@ -205,6 +205,15 @@ > #endif > > /* > + * calling noreturn functions, __builtin_unreachable() and __builtin_trap() > + * confuse the stack allocation in gcc, leading to overly large stack > + * frames, see https://urldefense.proofpoint.com/v2/url?u=https-3A__gcc.gnu.org_bugzilla_show-5Fbug.cgi-3Fid-3D82365&d=DwIBAg&c=DPL6_X_6JkXFx7AXWqB0tg&r=c14YS-cH-kdhTOW89KozFhBtBJgs1zXscZojEZQ0THs&m=3Iu4HWDn1cXkYBpSFh5I80IzDKJi33hs5DbfGM-b3mI&s=sTrcyN5ej_ION8hJvF9eGLUZYwdlwI50vXUp3MK-XWY&e= > + * > + * Adding an empty inline assembly before it works around the problem > + */ > +#define barrier_before_unreachable() asm volatile("") > + > +/* > * Mark a position in code as unreachable. This can be used to > * suppress control flow warnings after asm blocks that transfer > * control elsewhere. > @@ -214,7 +223,11 @@ > * unreleased. Really, we need to have autoconf for the kernel. > */ > #define unreachable() \ > - do { annotate_unreachable(); __builtin_unreachable(); } while (0) > + do { \ > + annotate_unreachable(); \ > + barrier_before_unreachable(); \ > + __builtin_unreachable(); \ > + } while (0) > > /* Mark a function definition as prohibited from being cloned. */ > #define __noclone __attribute__((__noclone__, __optimize__("no-tracer"))) > diff --git a/include/linux/compiler.h b/include/linux/compiler.h > index 52e611ab9a6c..97847f2f86cf 100644 > --- a/include/linux/compiler.h > +++ b/include/linux/compiler.h > @@ -86,6 +86,11 @@ void ftrace_likely_update(struct ftrace_likely_data *f, int val, > # define barrier_data(ptr) barrier() > #endif > > +/* workaround for GCC PR82365 if needed */ > +#ifndef barrier_before_unreachable > +# define barrier_before_unreachable() do { } while (0) > +#endif > + > /* Unreachable code */ > #ifdef CONFIG_STACK_VALIDATION > /*
On Tue, Dec 19, 2017 at 5:57 PM, Vineet Gupta <Vineet.Gupta1@synopsys.com> wrote: > On 12/19/2017 03:41 AM, Arnd Bergmann wrote: >> In case of ARC and CRIS, it turns out that the BUG() implementation >> actually does return (or at least the compiler thinks it does), resulting >> in lots of warnings about uninitialized variable use and leaving noreturn >> functions, such as: >> >> block/cfq-iosched.c: In function 'cfq_async_queue_prio': >> block/cfq-iosched.c:3804:1: error: control reaches end of non-void >> function [-Werror=return-type] >> include/linux/dmaengine.h: In function 'dma_maxpq': >> include/linux/dmaengine.h:1123:1: error: control reaches end of non-void >> function [-Werror=return-type] >> diff --git a/arch/arc/include/asm/bug.h b/arch/arc/include/asm/bug.h >> index ea022d47896c..21ec82466d62 100644 >> --- a/arch/arc/include/asm/bug.h >> +++ b/arch/arc/include/asm/bug.h >> @@ -23,7 +23,8 @@ void die(const char *str, struct pt_regs *regs, unsigned >> long address); >> #define BUG() do { >> \ >> pr_warn("BUG: failure at %s:%d/%s()!\n", __FILE__, __LINE__, >> __func__); \ >> - dump_stack(); >> \ >> + barrier_before_unreachable(); >> \ >> + __builtin_trap(); >> \ >> } while (0) >> #define HAVE_ARCH_BUG > > > I suppose BUG() implies "dead end" like semantics - which ARC was lacking > before ? Correct. Using __builtin_trap() here avoids the 'control reaches end of non-void function' warnings, but then makes us run into the stack size problem that I work around with the barrier_before_unreachable(). It would be good if you could give this a quick test to see if you get sensible output from the __builtin_trap(); Arnd
On 12/19/2017 12:13 PM, Arnd Bergmann wrote: > >> I suppose BUG() implies "dead end" like semantics - which ARC was lacking >> before ? > Correct. Using __builtin_trap() here avoids the 'control reaches end of non-void > function' warnings, but then makes us run into the stack size problem that > I work around with the barrier_before_unreachable(). > > It would be good if you could give this a quick test to see if you get sensible > output from the __builtin_trap(); It does, added a BUG() arbit, hits an abort() ... ISA Extn : atomic ll64 unalign (not used) : mpy[opt 9] div_rem norm barrel-shift swap minmax swape BPU : partial match, cache:2048, Predict Table:16384 BUG: failure at ../arch/arc/mm/tlb.c:827/arc_mmu_init()! Tested-by: Vineet Gupta <vgupta@synopsys.com> FWIW newer ARC gcc actually implements the builtin so we get a trap 5 instruction now, vs., abort() calls before. BTW I missed reading the hunk of your changelog where this addresses the long standing mystery with ARC builds and numerous -Wreturn-type warnings. I always wondered why they were not fixed upstream already, being too lazy to investigate myself, and turns out this was due to this BUG() thingy. phew ! -Vineet
On Tue, Dec 19, 2017 at 11:38 PM, Vineet Gupta <Vineet.Gupta1@synopsys.com> wrote: > On 12/19/2017 12:13 PM, Arnd Bergmann wrote: >> >> >>> I suppose BUG() implies "dead end" like semantics - which ARC was lacking >>> before ? >> >> Correct. Using __builtin_trap() here avoids the 'control reaches end of >> non-void >> function' warnings, but then makes us run into the stack size problem that >> I work around with the barrier_before_unreachable(). >> >> It would be good if you could give this a quick test to see if you get >> sensible >> output from the __builtin_trap(); > > > It does, added a BUG() arbit, hits an abort() > > ... > ISA Extn : atomic ll64 unalign (not used) > : mpy[opt 9] div_rem norm barrel-shift swap minmax swape > BPU : partial match, cache:2048, Predict Table:16384 > BUG: failure at ../arch/arc/mm/tlb.c:827/arc_mmu_init()! > > > Tested-by: Vineet Gupta <vgupta@synopsys.com> I meant whether it prints the right registers and stack trace, but I assume you tested that and just did not list it above. > FWIW newer ARC gcc actually implements the builtin so we get a trap 5 > instruction now, vs., abort() calls before. > > BTW I missed reading the hunk of your changelog where this addresses the > long standing mystery with ARC builds and numerous -Wreturn-type warnings. I > always wondered why they were not fixed upstream already, being too lazy to > investigate myself, and turns out this was due to this BUG() thingy. phew ! Hmm, so with the new definition of abort(), +__weak void abort(void) +{ + BUG(); + + /* if that doesn't kill us, halt */ + panic("Oops failed to kill thread"); +} won't that run into an endless recursion? Or do you then override abort() for ARC? Arnd
On 12/20/2017 01:01 AM, Arnd Bergmann wrote: > On Tue, Dec 19, 2017 at 11:38 PM, Vineet Gupta > <Vineet.Gupta1@synopsys.com> wrote: >> On 12/19/2017 12:13 PM, Arnd Bergmann wrote: >>> >>> >>>> I suppose BUG() implies "dead end" like semantics - which ARC was lacking >>>> before ? >>> >>> Correct. Using __builtin_trap() here avoids the 'control reaches end of >>> non-void >>> function' warnings, but then makes us run into the stack size problem that >>> I work around with the barrier_before_unreachable(). >>> >>> It would be good if you could give this a quick test to see if you get >>> sensible >>> output from the __builtin_trap(); >> >> >> It does, added a BUG() arbit, hits an abort() >> >> ... >> ISA Extn : atomic ll64 unalign (not used) >> : mpy[opt 9] div_rem norm barrel-shift swap minmax swape >> BPU : partial match, cache:2048, Predict Table:16384 >> BUG: failure at ../arch/arc/mm/tlb.c:827/arc_mmu_init()! >> >> >> Tested-by: Vineet Gupta <vgupta@synopsys.com> > > I meant whether it prints the right registers and stack trace, but I > assume you tested that and just did not list it above. Sorry, I didn't realize we are missing the stack trace now which you removed from the patch - why ? Did u intend to reduce inline generated code for the stack dump calls - which sounds like a great idea. But it would only work for the synchronous abort() but not when builtin translates to actual trap inducing instruction. > Hmm, so with the new definition of abort(), > > +__weak void abort(void) > +{ > + BUG(); > + > + /* if that doesn't kill us, halt */ > + panic("Oops failed to kill thread"); > +} > > won't that run into an endless recursion? Or do you then override abort() > for ARC? Indeed so. I didn't run into this in my testing as my for-curr has an ARC specific version (predating Sudip's generic version- because of build failures in our internal regression jobs etc). That version only calls panic. abort() is only likely to be called due to __builtin_trap() for arches where gcc doesn't have a target specific defn of it. And thus adding the call from BUG() will cause the recursion as you found out with Sudip's generic version and thus needs a fixup. Thx, -Vineet
On Wed, Dec 20, 2017 at 7:52 PM, Vineet Gupta <Vineet.Gupta1@synopsys.com> wrote: > On 12/20/2017 01:01 AM, Arnd Bergmann wrote: >> >> On Tue, Dec 19, 2017 at 11:38 PM, Vineet Gupta >> <Vineet.Gupta1@synopsys.com> wrote: >>> >>> On 12/19/2017 12:13 PM, Arnd Bergmann wrote: >>>> >>>> >>>> >>>>> I suppose BUG() implies "dead end" like semantics - which ARC was >>>>> lacking >>>>> before ? >>>> >>>> >>>> Correct. Using __builtin_trap() here avoids the 'control reaches end of >>>> non-void >>>> function' warnings, but then makes us run into the stack size problem >>>> that >>>> I work around with the barrier_before_unreachable(). >>>> >>>> It would be good if you could give this a quick test to see if you get >>>> sensible >>>> output from the __builtin_trap(); >>> >>> >>> >>> It does, added a BUG() arbit, hits an abort() >>> >>> ... >>> ISA Extn : atomic ll64 unalign (not used) >>> : mpy[opt 9] div_rem norm barrel-shift swap minmax swape >>> BPU : partial match, cache:2048, Predict Table:16384 >>> BUG: failure at ../arch/arc/mm/tlb.c:827/arc_mmu_init()! >>> >>> >>> Tested-by: Vineet Gupta <vgupta@synopsys.com> >> >> >> I meant whether it prints the right registers and stack trace, but I >> assume you tested that and just did not list it above. > > > Sorry, I didn't realize we are missing the stack trace now which you removed > from the patch - why ? Did u intend to reduce inline generated code for the > stack dump calls - which sounds like a great idea. But it would only work > for the synchronous abort() but not when builtin translates to actual trap > inducing instruction. I assumed that the trap instruction would trigger the register and stack dump, as it does on all other architectures. The most common way this is handled is to have one instruction that is known to trap, and use that to trigger a BUG(), and have __builtin_trap() issue that instruction as well. You might also want to implement CONFIG_DEBUG_BUGVERBOSE support to attach further data to it. >> Hmm, so with the new definition of abort(), >> >> +__weak void abort(void) >> +{ >> + BUG(); >> + >> + /* if that doesn't kill us, halt */ >> + panic("Oops failed to kill thread"); >> +} >> >> won't that run into an endless recursion? Or do you then override abort() >> for ARC? > > > Indeed so. I didn't run into this in my testing as my for-curr has an ARC > specific version (predating Sudip's generic version- because of build > failures in our internal regression jobs etc). That version only calls > panic. > > abort() is only likely to be called due to __builtin_trap() for arches where > gcc doesn't have a target specific defn of it. And thus adding the call from > BUG() will cause the recursion as you found out with Sudip's generic version > and thus needs a fixup. How about overriding abort() with the same instruction that __builtin_trap() inserts on newer compilers then? That should make the behavior consistent. Arnd
On 12/20/2017 12:12 PM, Arnd Bergmann wrote: >> Sorry, I didn't realize we are missing the stack trace now which you removed >> from the patch - why ? Did u intend to reduce inline generated code for the >> stack dump calls - which sounds like a great idea. But it would only work >> for the synchronous abort() but not when builtin translates to actual trap >> inducing instruction. > > I assumed that the trap instruction would trigger the register and > stack dump, as it does on all other architectures. Only if __builtin_trap() translated to that instruction. Otherwise we need to do this inside abort() > The most common > way this is handled is to have one instruction that is known to trap, > and use that to trigger a BUG(), and have __builtin_trap() issue > that instruction as well. Good point. So we'll need ARC specific abort anyways. > You might also want to implement CONFIG_DEBUG_BUGVERBOSE > support to attach further data to it. OK I'll take a look ! > How about overriding abort() with the same instruction that > __builtin_trap() inserts on newer compilers then? That should > make the behavior consistent. Yeah this is a great point ! Will do thx, -Vineet
On Wed, 20 Dec 2017 12:29:02 -0800 Vineet Gupta <Vineet.Gupta1@synopsys.com> wrote: > On 12/20/2017 12:12 PM, Arnd Bergmann wrote: > >> Sorry, I didn't realize we are missing the stack trace now which you removed > >> from the patch - why ? Did u intend to reduce inline generated code for the > >> stack dump calls - which sounds like a great idea. But it would only work > >> for the synchronous abort() but not when builtin translates to actual trap > >> inducing instruction. > > > > I assumed that the trap instruction would trigger the register and > > stack dump, as it does on all other architectures. > > Only if __builtin_trap() translated to that instruction. Otherwise we need to do > this inside abort() > > > The most common > > way this is handled is to have one instruction that is known to trap, > > and use that to trigger a BUG(), and have __builtin_trap() issue > > that instruction as well. > > Good point. So we'll need ARC specific abort anyways. > > > > You might also want to implement CONFIG_DEBUG_BUGVERBOSE > > support to attach further data to it. > > OK I'll take a look ! > > > How about overriding abort() with the same instruction that > > __builtin_trap() inserts on newer compilers then? That should > > make the behavior consistent. > > Yeah this is a great point ! Will do Didn't do ;) Is Arnd's patch good to merge or do we need a fixup? From: Arnd Bergmann <arnd@arndb.de> Subject: bug.h: work around GCC PR82365 in BUG() Looking at functions with large stack frames across all architectures led me discovering that BUG() suffers from the same problem as fortify_panic(), which I've added a workaround for already. In short, variables that go out of scope by calling a noreturn function or __builtin_unreachable() keep using stack space in functions afterwards. A workaround that was identified is to insert an empty assembler statement just before calling the function that doesn't return. I'm adding a macro "barrier_before_unreachable()" to document this, and insert calls to that in all instances of BUG() that currently suffer from this problem. The files that saw the largest change from this had these frame sizes before, and much less with my patch: fs/ext4/inode.c:82:1: warning: the frame size of 1672 bytes is larger than 800 bytes [-Wframe-larger-than=] fs/ext4/namei.c:434:1: warning: the frame size of 904 bytes is larger than 800 bytes [-Wframe-larger-than=] fs/ext4/super.c:2279:1: warning: the frame size of 1160 bytes is larger than 800 bytes [-Wframe-larger-than=] fs/ext4/xattr.c:146:1: warning: the frame size of 1168 bytes is larger than 800 bytes [-Wframe-larger-than=] fs/f2fs/inode.c:152:1: warning: the frame size of 1424 bytes is larger than 800 bytes [-Wframe-larger-than=] net/netfilter/ipvs/ip_vs_core.c:1195:1: warning: the frame size of 1068 bytes is larger than 800 bytes [-Wframe-larger-than=] net/netfilter/ipvs/ip_vs_core.c:395:1: warning: the frame size of 1084 bytes is larger than 800 bytes [-Wframe-larger-than=] net/netfilter/ipvs/ip_vs_ftp.c:298:1: warning: the frame size of 928 bytes is larger than 800 bytes [-Wframe-larger-than=] net/netfilter/ipvs/ip_vs_ftp.c:418:1: warning: the frame size of 908 bytes is larger than 800 bytes [-Wframe-larger-than=] net/netfilter/ipvs/ip_vs_lblcr.c:718:1: warning: the frame size of 960 bytes is larger than 800 bytes [-Wframe-larger-than=] drivers/net/xen-netback/netback.c:1500:1: warning: the frame size of 1088 bytes is larger than 800 bytes [-Wframe-larger-than=] In case of ARC and CRIS, it turns out that the BUG() implementation actually does return (or at least the compiler thinks it does), resulting in lots of warnings about uninitialized variable use and leaving noreturn functions, such as: block/cfq-iosched.c: In function 'cfq_async_queue_prio': block/cfq-iosched.c:3804:1: error: control reaches end of non-void function [-Werror=return-type] include/linux/dmaengine.h: In function 'dma_maxpq': include/linux/dmaengine.h:1123:1: error: control reaches end of non-void function [-Werror=return-type] This makes them call __builtin_trap() instead, which should normally dump the stack and kill the current process, like some of the other architectures already do. I tried adding barrier_before_unreachable() to panic() and fortify_panic() as well, but that had very little effect, so I'm not submitting that patch. Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82365 Link: http://lkml.kernel.org/r/20171219114112.939391-1-arnd@arndb.de Signed-off-by: Arnd Bergmann <arnd@arndb.de> Tested-by: Vineet Gupta <vgupta@synopsys.com> Cc: Mikael Starvik <starvik@axis.com> Cc: Jesper Nilsson <jesper.nilsson@axis.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Christopher Li <sparse@chrisli.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Kees Cook <keescook@chromium.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Will Deacon <will.deacon@arm.com> Cc: "Steven Rostedt (VMware)" <rostedt@goodmis.org> Cc: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> --- arch/arc/include/asm/bug.h | 3 ++- arch/cris/include/arch-v10/arch/bug.h | 11 +++++++++-- arch/ia64/include/asm/bug.h | 6 +++++- arch/m68k/include/asm/bug.h | 3 +++ arch/sparc/include/asm/bug.h | 6 +++++- include/asm-generic/bug.h | 1 + include/linux/compiler-gcc.h | 15 ++++++++++++++- include/linux/compiler.h | 5 +++++ 8 files changed, 44 insertions(+), 6 deletions(-) diff -puN arch/arc/include/asm/bug.h~bugh-work-around-gcc-pr82365-in-bug arch/arc/include/asm/bug.h --- a/arch/arc/include/asm/bug.h~bugh-work-around-gcc-pr82365-in-bug +++ a/arch/arc/include/asm/bug.h @@ -23,7 +23,8 @@ void die(const char *str, struct pt_regs #define BUG() do { \ pr_warn("BUG: failure at %s:%d/%s()!\n", __FILE__, __LINE__, __func__); \ - dump_stack(); \ + barrier_before_unreachable(); \ + __builtin_trap(); \ } while (0) #define HAVE_ARCH_BUG diff -puN arch/cris/include/arch-v10/arch/bug.h~bugh-work-around-gcc-pr82365-in-bug arch/cris/include/arch-v10/arch/bug.h --- a/arch/cris/include/arch-v10/arch/bug.h~bugh-work-around-gcc-pr82365-in-bug +++ a/arch/cris/include/arch-v10/arch/bug.h @@ -44,18 +44,25 @@ struct bug_frame { * not be used like this with newer versions of gcc. */ #define BUG() \ +do { \ __asm__ __volatile__ ("clear.d [" __stringify(BUG_MAGIC) "]\n\t"\ "movu.w " __stringify(__LINE__) ",$r0\n\t"\ "jump 0f\n\t" \ ".section .rodata\n" \ "0:\t.string \"" __FILE__ "\"\n\t" \ - ".previous") + ".previous"); \ + unreachable(); \ +} while (0) #endif #else /* This just causes an oops. */ -#define BUG() (*(int *)0 = 0) +#define BUG() \ +do { \ + barrier_before_unreachable(); \ + __builtin_trap(); \ +} while (0) #endif diff -puN arch/ia64/include/asm/bug.h~bugh-work-around-gcc-pr82365-in-bug arch/ia64/include/asm/bug.h --- a/arch/ia64/include/asm/bug.h~bugh-work-around-gcc-pr82365-in-bug +++ a/arch/ia64/include/asm/bug.h @@ -4,7 +4,11 @@ #ifdef CONFIG_BUG #define ia64_abort() __builtin_trap() -#define BUG() do { printk("kernel BUG at %s:%d!\n", __FILE__, __LINE__); ia64_abort(); } while (0) +#define BUG() do { \ + printk("kernel BUG at %s:%d!\n", __FILE__, __LINE__); \ + barrier_before_unreachable(); \ + ia64_abort(); \ +} while (0) /* should this BUG be made generic? */ #define HAVE_ARCH_BUG diff -puN arch/m68k/include/asm/bug.h~bugh-work-around-gcc-pr82365-in-bug arch/m68k/include/asm/bug.h --- a/arch/m68k/include/asm/bug.h~bugh-work-around-gcc-pr82365-in-bug +++ a/arch/m68k/include/asm/bug.h @@ -8,16 +8,19 @@ #ifndef CONFIG_SUN3 #define BUG() do { \ pr_crit("kernel BUG at %s:%d!\n", __FILE__, __LINE__); \ + barrier_before_unreachable(); \ __builtin_trap(); \ } while (0) #else #define BUG() do { \ pr_crit("kernel BUG at %s:%d!\n", __FILE__, __LINE__); \ + barrier_before_unreachable(); \ panic("BUG!"); \ } while (0) #endif #else #define BUG() do { \ + barrier_before_unreachable(); \ __builtin_trap(); \ } while (0) #endif diff -puN arch/sparc/include/asm/bug.h~bugh-work-around-gcc-pr82365-in-bug arch/sparc/include/asm/bug.h --- a/arch/sparc/include/asm/bug.h~bugh-work-around-gcc-pr82365-in-bug +++ a/arch/sparc/include/asm/bug.h @@ -9,10 +9,14 @@ void do_BUG(const char *file, int line); #define BUG() do { \ do_BUG(__FILE__, __LINE__); \ + barrier_before_unreachable(); \ __builtin_trap(); \ } while (0) #else -#define BUG() __builtin_trap() +#define BUG() do { \ + barrier_before_unreachable(); \ + __builtin_trap(); \ +} while (0) #endif #define HAVE_ARCH_BUG diff -puN include/asm-generic/bug.h~bugh-work-around-gcc-pr82365-in-bug include/asm-generic/bug.h --- a/include/asm-generic/bug.h~bugh-work-around-gcc-pr82365-in-bug +++ a/include/asm-generic/bug.h @@ -52,6 +52,7 @@ struct bug_entry { #ifndef HAVE_ARCH_BUG #define BUG() do { \ printk("BUG: failure at %s:%d/%s()!\n", __FILE__, __LINE__, __func__); \ + barrier_before_unreachable(); \ panic("BUG!"); \ } while (0) #endif diff -puN include/linux/compiler-gcc.h~bugh-work-around-gcc-pr82365-in-bug include/linux/compiler-gcc.h --- a/include/linux/compiler-gcc.h~bugh-work-around-gcc-pr82365-in-bug +++ a/include/linux/compiler-gcc.h @@ -205,6 +205,15 @@ #endif /* + * calling noreturn functions, __builtin_unreachable() and __builtin_trap() + * confuse the stack allocation in gcc, leading to overly large stack + * frames, see https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82365 + * + * Adding an empty inline assembly before it works around the problem + */ +#define barrier_before_unreachable() asm volatile("") + +/* * Mark a position in code as unreachable. This can be used to * suppress control flow warnings after asm blocks that transfer * control elsewhere. @@ -214,7 +223,11 @@ * unreleased. Really, we need to have autoconf for the kernel. */ #define unreachable() \ - do { annotate_unreachable(); __builtin_unreachable(); } while (0) + do { \ + annotate_unreachable(); \ + barrier_before_unreachable(); \ + __builtin_unreachable(); \ + } while (0) /* Mark a function definition as prohibited from being cloned. */ #define __noclone __attribute__((__noclone__, __optimize__("no-tracer"))) diff -puN include/linux/compiler.h~bugh-work-around-gcc-pr82365-in-bug include/linux/compiler.h --- a/include/linux/compiler.h~bugh-work-around-gcc-pr82365-in-bug +++ a/include/linux/compiler.h @@ -86,6 +86,11 @@ void ftrace_likely_update(struct ftrace_ # define barrier_data(ptr) barrier() #endif +/* workaround for GCC PR82365 if needed */ +#ifndef barrier_before_unreachable +# define barrier_before_unreachable() do { } while (0) +#endif + /* Unreachable code */ #ifdef CONFIG_STACK_VALIDATION /*
On 02/07/2018 04:01 PM, Andrew Morton wrote: > On Wed, 20 Dec 2017 12:29:02 -0800 Vineet Gupta <Vineet.Gupta1@synopsys.com> wrote: > >> On 12/20/2017 12:12 PM, Arnd Bergmann wrote: >>>> Sorry, I didn't realize we are missing the stack trace now which you removed >>>> from the patch - why ? Did u intend to reduce inline generated code for the >>>> stack dump calls - which sounds like a great idea. But it would only work >>>> for the synchronous abort() but not when builtin translates to actual trap >>>> inducing instruction. >>> >>> I assumed that the trap instruction would trigger the register and >>> stack dump, as it does on all other architectures. >> >> Only if __builtin_trap() translated to that instruction. Otherwise we need to do >> this inside abort() >> >>> The most common >>> way this is handled is to have one instruction that is known to trap, >>> and use that to trigger a BUG(), and have __builtin_trap() issue >>> that instruction as well. >> >> Good point. So we'll need ARC specific abort anyways. >> >> >>> You might also want to implement CONFIG_DEBUG_BUGVERBOSE >>> support to attach further data to it. >> >> OK I'll take a look ! >> >>> How about overriding abort() with the same instruction that >>> __builtin_trap() inserts on newer compilers then? That should >>> make the behavior consistent. >> >> Yeah this is a great point ! Will do > > Didn't do ;) Actually I did :-) The discussion above digressed from original intent of the patch and instead veered of into need for handling __builtin_trap() for ARC and a fallback abort() with same semantics etc which I'd agreed to do above. They are upstream 2017-12-08 af1be2e21203 ARC: handle gcc generated __builtin_trap for older compiler 2017-12-20 f5a16b93e629 ARC: handle gcc generated __builtin_trap() However we missed this large stack issue and the -Werror=return-type warning for ARC. Let me check the patch again and get back here ! > > Is Arnd's patch good to merge or do we need a fixup? > > > From: Arnd Bergmann <arnd@arndb.de> > Subject: bug.h: work around GCC PR82365 in BUG() > > Looking at functions with large stack frames across all architectures led > me discovering that BUG() suffers from the same problem as > fortify_panic(), which I've added a workaround for already. In short, > variables that go out of scope by calling a noreturn function or > __builtin_unreachable() keep using stack space in functions afterwards. > > A workaround that was identified is to insert an empty assembler statement > just before calling the function that doesn't return. I'm adding a macro > "barrier_before_unreachable()" to document this, and insert calls to that > in all instances of BUG() that currently suffer from this problem. > > The files that saw the largest change from this had these frame sizes > before, and much less with my patch: > > fs/ext4/inode.c:82:1: warning: the frame size of 1672 bytes is larger than 800 bytes [-Wframe-larger-than=] > fs/ext4/namei.c:434:1: warning: the frame size of 904 bytes is larger than 800 bytes [-Wframe-larger-than=] [...] > In case of ARC and CRIS, it turns out that the BUG() implementation > actually does return (or at least the compiler thinks it does), resulting > in lots of warnings about uninitialized variable use and leaving noreturn > functions, such as: > > block/cfq-iosched.c: In function 'cfq_async_queue_prio': > block/cfq-iosched.c:3804:1: error: control reaches end of non-void function [-Werror=return-type] > include/linux/dmaengine.h: In function 'dma_maxpq': > include/linux/dmaengine.h:1123:1: error: control reaches end of non-void function [-Werror=return-type] > > This makes them call __builtin_trap() instead, which should normally dump > the stack and kill the current process, like some of the other > architectures already do. > > I tried adding barrier_before_unreachable() to panic() and fortify_panic() > as well, but that had very little effect, so I'm not submitting that > patch. [...] > > diff -puN arch/arc/include/asm/bug.h~bugh-work-around-gcc-pr82365-in-bug arch/arc/include/asm/bug.h > --- a/arch/arc/include/asm/bug.h~bugh-work-around-gcc-pr82365-in-bug > +++ a/arch/arc/include/asm/bug.h > @@ -23,7 +23,8 @@ void die(const char *str, struct pt_regs > > #define BUG() do { \ > pr_warn("BUG: failure at %s:%d/%s()!\n", __FILE__, __LINE__, __func__); \ > - dump_stack(); \ > + barrier_before_unreachable(); \ > + __builtin_trap(); \ > } while (0) >
On 02/07/2018 04:01 PM, Andrew Morton wrote: > On Wed, 20 Dec 2017 12:29:02 -0800 Vineet Gupta <Vineet.Gupta1@synopsys.com> wrote: > >> On 12/20/2017 12:12 PM, Arnd Bergmann wrote: >>>> Sorry, I didn't realize we are missing the stack trace now which you removed >>>> from the patch - why ? Did u intend to reduce inline generated code for the >>>> stack dump calls - which sounds like a great idea. But it would only work >>>> for the synchronous abort() but not when builtin translates to actual trap >>>> inducing instruction. >>> >>> I assumed that the trap instruction would trigger the register and >>> stack dump, as it does on all other architectures. >> >> Only if __builtin_trap() translated to that instruction. Otherwise we need to do >> this inside abort() >> >>> The most common >>> way this is handled is to have one instruction that is known to trap, >>> and use that to trigger a BUG(), and have __builtin_trap() issue >>> that instruction as well. >> >> Good point. So we'll need ARC specific abort anyways. >> >> >>> You might also want to implement CONFIG_DEBUG_BUGVERBOSE >>> support to attach further data to it. >> >> OK I'll take a look ! >> >>> How about overriding abort() with the same instruction that >>> __builtin_trap() inserts on newer compilers then? That should >>> make the behavior consistent. >> >> Yeah this is a great point ! Will do > > Didn't do ;) > > Is Arnd's patch good to merge or do we need a fixup? > > > From: Arnd Bergmann <arnd@arndb.de> > Subject: bug.h: work around GCC PR82365 in BUG() > > Looking at functions with large stack frames across all architectures led > me discovering that BUG() suffers from the same problem as > fortify_panic(), which I've added a workaround for already. In short, > variables that go out of scope by calling a noreturn function or > __builtin_unreachable() keep using stack space in functions afterwards. > > A workaround that was identified is to insert an empty assembler statement > just before calling the function that doesn't return. I'm adding a macro > "barrier_before_unreachable()" to document this, and insert calls to that > in all instances of BUG() that currently suffer from this problem. > > The files that saw the largest change from this had these frame sizes > before, and much less with my patch: > > fs/ext4/inode.c:82:1: warning: the frame size of 1672 bytes is larger than 800 bytes [-Wframe-larger-than=] > fs/ext4/namei.c:434:1: warning: the frame size of 904 bytes is larger than 800 bytes [-Wframe-larger-than=] > fs/ext4/super.c:2279:1: warning: the frame size of 1160 bytes is larger than 800 bytes [-Wframe-larger-than=] > fs/ext4/xattr.c:146:1: warning: the frame size of 1168 bytes is larger than 800 bytes [-Wframe-larger-than=] > fs/f2fs/inode.c:152:1: warning: the frame size of 1424 bytes is larger than 800 bytes [-Wframe-larger-than=] > net/netfilter/ipvs/ip_vs_core.c:1195:1: warning: the frame size of 1068 bytes is larger than 800 bytes [-Wframe-larger-than=] > net/netfilter/ipvs/ip_vs_core.c:395:1: warning: the frame size of 1084 bytes is larger than 800 bytes [-Wframe-larger-than=] > net/netfilter/ipvs/ip_vs_ftp.c:298:1: warning: the frame size of 928 bytes is larger than 800 bytes [-Wframe-larger-than=] > net/netfilter/ipvs/ip_vs_ftp.c:418:1: warning: the frame size of 908 bytes is larger than 800 bytes [-Wframe-larger-than=] > net/netfilter/ipvs/ip_vs_lblcr.c:718:1: warning: the frame size of 960 bytes is larger than 800 bytes [-Wframe-larger-than=] > drivers/net/xen-netback/netback.c:1500:1: warning: the frame size of 1088 bytes is larger than 800 bytes [-Wframe-larger-than=] > > In case of ARC and CRIS, it turns out that the BUG() implementation > actually does return (or at least the compiler thinks it does), resulting > in lots of warnings about uninitialized variable use and leaving noreturn > functions, such as: > > block/cfq-iosched.c: In function 'cfq_async_queue_prio': > block/cfq-iosched.c:3804:1: error: control reaches end of non-void function [-Werror=return-type] > include/linux/dmaengine.h: In function 'dma_maxpq': > include/linux/dmaengine.h:1123:1: error: control reaches end of non-void function [-Werror=return-type] > > This makes them call __builtin_trap() instead, which should normally dump > the stack and kill the current process, like some of the other > architectures already do. > > I tried adding barrier_before_unreachable() to panic() and fortify_panic() > as well, but that had very little effect, so I'm not submitting that > patch. > > Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82365 > Link: http://lkml.kernel.org/r/20171219114112.939391-1-arnd@arndb.de > Signed-off-by: Arnd Bergmann <arnd@arndb.de> > Tested-by: Vineet Gupta <vgupta@synopsys.com> > Cc: Mikael Starvik <starvik@axis.com> > Cc: Jesper Nilsson <jesper.nilsson@axis.com> > Cc: Tony Luck <tony.luck@intel.com> > Cc: Fenghua Yu <fenghua.yu@intel.com> > Cc: Geert Uytterhoeven <geert@linux-m68k.org> > Cc: "David S. Miller" <davem@davemloft.net> > Cc: Christopher Li <sparse@chrisli.org> > Cc: Thomas Gleixner <tglx@linutronix.de> > Cc: Peter Zijlstra <peterz@infradead.org> > Cc: Kees Cook <keescook@chromium.org> > Cc: Ingo Molnar <mingo@kernel.org> > Cc: Josh Poimboeuf <jpoimboe@redhat.com> > Cc: Will Deacon <will.deacon@arm.com> > Cc: "Steven Rostedt (VMware)" <rostedt@goodmis.org> > Cc: Mark Rutland <mark.rutland@arm.com> > Signed-off-by: Andrew Morton <akpm@linux-foundation.org> > --- > > arch/arc/include/asm/bug.h | 3 ++- > arch/cris/include/arch-v10/arch/bug.h | 11 +++++++++-- > arch/ia64/include/asm/bug.h | 6 +++++- > arch/m68k/include/asm/bug.h | 3 +++ > arch/sparc/include/asm/bug.h | 6 +++++- > include/asm-generic/bug.h | 1 + > include/linux/compiler-gcc.h | 15 ++++++++++++++- > include/linux/compiler.h | 5 +++++ > 8 files changed, 44 insertions(+), 6 deletions(-) > > diff -puN arch/arc/include/asm/bug.h~bugh-work-around-gcc-pr82365-in-bug arch/arc/include/asm/bug.h > --- a/arch/arc/include/asm/bug.h~bugh-work-around-gcc-pr82365-in-bug > +++ a/arch/arc/include/asm/bug.h > @@ -23,7 +23,8 @@ void die(const char *str, struct pt_regs > > #define BUG() do { \ > pr_warn("BUG: failure at %s:%d/%s()!\n", __FILE__, __LINE__, __func__); \ > - dump_stack(); \ > + barrier_before_unreachable(); \ > + __builtin_trap(); \ > } while (0) > For ARC, it is double win. 1. Fixes 3 -Wreturn-type warnings | ../net/core/ethtool.c:311:1: warning: control reaches end of non-void function [-Wreturn-type] | ../kernel/sched/core.c:3246:1: warning: control reaches end of non-void function [-Wreturn-type] | ../include/linux/sunrpc/svc_xprt.h:180:1: warning: control reaches end of non-void function [-Wreturn-type] 2. bloat-o-meter reports code size improvements as gcc elides the generated code for stack return. Acked-by: Vineet Gupta <vgupta@synopsys.com> # for arch/arc Tested-by: Vineet Gupta <vgupta@synopsys.com> # for arch/arc
On 02/07/2018 05:20 PM, Vineet Gupta wrote: > >> Didn't do ;) >> >> Is Arnd's patch good to merge or do we need a fixup? >> >> >> From: Arnd Bergmann <arnd@arndb.de> >> Subject: bug.h: work around GCC PR82365 in BUG() >> >> Looking at functions with large stack frames across all architectures led >> me discovering that BUG() suffers from the same problem as >> fortify_panic(), which I've added a workaround for already. In short, >> variables that go out of scope by calling a noreturn function or >> __builtin_unreachable() keep using stack space in functions afterwards. >> >> A workaround that was identified is to insert an empty assembler statement >> just before calling the function that doesn't return. I'm adding a macro >> "barrier_before_unreachable()" to document this, and insert calls to that >> in all instances of BUG() that currently suffer from this problem. >> .. .. >> --- >> >> arch/arc/include/asm/bug.h | 3 ++- >> arch/cris/include/arch-v10/arch/bug.h | 11 +++++++++-- >> arch/ia64/include/asm/bug.h | 6 +++++- >> arch/m68k/include/asm/bug.h | 3 +++ >> arch/sparc/include/asm/bug.h | 6 +++++- >> include/asm-generic/bug.h | 1 + >> include/linux/compiler-gcc.h | 15 ++++++++++++++- >> include/linux/compiler.h | 5 +++++ >> 8 files changed, 44 insertions(+), 6 deletions(-) >> >> diff -puN arch/arc/include/asm/bug.h~bugh-work-around-gcc-pr82365-in-bug >> arch/arc/include/asm/bug.h >> --- a/arch/arc/include/asm/bug.h~bugh-work-around-gcc-pr82365-in-bug >> +++ a/arch/arc/include/asm/bug.h >> @@ -23,7 +23,8 @@ void die(const char *str, struct pt_regs >> #define BUG() do { \ >> pr_warn("BUG: failure at %s:%d/%s()!\n", __FILE__, __LINE__, __func__); \ >> - dump_stack(); \ >> + barrier_before_unreachable(); \ >> + __builtin_trap(); \ >> } while (0) > > For ARC, it is double win. > > 1. Fixes 3 -Wreturn-type warnings > > | ../net/core/ethtool.c:311:1: warning: control reaches end of non-void function > [-Wreturn-type] > | ../kernel/sched/core.c:3246:1: warning: control reaches end of non-void > function [-Wreturn-type] > | ../include/linux/sunrpc/svc_xprt.h:180:1: warning: control reaches end of > non-void function [-Wreturn-type] > > 2. bloat-o-meter reports code size improvements as gcc elides the generated code > for stack return. > > > Acked-by: Vineet Gupta <vgupta@synopsys.com> # for arch/arc > Tested-by: Vineet Gupta <vgupta@synopsys.com> # for arch/arc Ping ?
Hi Arnd, On Tue, Dec 19, 2017 at 12:39:33PM +0100, Arnd Bergmann wrote: > diff --git a/include/linux/compiler-gcc.h b/include/linux/compiler-gcc.h > index 5d595cfdb2c4..66cfdad68f7e 100644 > --- a/include/linux/compiler-gcc.h > +++ b/include/linux/compiler-gcc.h > @@ -205,6 +205,15 @@ > #endif > > /* > + * calling noreturn functions, __builtin_unreachable() and __builtin_trap() > + * confuse the stack allocation in gcc, leading to overly large stack > + * frames, see https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82365 > + * > + * Adding an empty inline assembly before it works around the problem > + */ > +#define barrier_before_unreachable() asm volatile("") > + > +/* > * Mark a position in code as unreachable. This can be used to > * suppress control flow warnings after asm blocks that transfer > * control elsewhere. > @@ -214,7 +223,11 @@ > * unreleased. Really, we need to have autoconf for the kernel. > */ > #define unreachable() \ > - do { annotate_unreachable(); __builtin_unreachable(); } while (0) > + do { \ > + annotate_unreachable(); \ > + barrier_before_unreachable(); \ > + __builtin_unreachable(); \ > + } while (0) Unfortunately this breaks microMIPS builds (e.g. MIPS micro32r2_defconfig and micro32r2el_defconfig) on gcc 7.2, due to the lack of .insn in the asm volatile. Because of the __builtin_unreachable() there is no code following it. Without the empty asm the compiler will apparently put the .insn there automatically, but with the empty asm it doesn't. Therefore the assembler won't treat an immediately preceeding label as pointing at 16-bit microMIPS instructions which need the ISA bit set, i.e. bit 0 of the address. This causes assembler errors since the branch target is treated as a different ISA mode: arch/mips/mm/dma-default.s:3265: Error: branch to a symbol in another ISA mode arch/mips/mm/dma-default.s:5027: Error: branch to a symbol in another ISA mode Due to a compiler bug on gcc 4.9.2 -> somewhere before 7.2, Paul submitted these patches a while back: https://patchwork.linux-mips.org/patch/13360/ https://patchwork.linux-mips.org/patch/13361/ Your patch (suitably fixed for microMIPS) would I imagine fix that issue too (it certainly fixes the resulting link error on microMIPS builds with an old toolchain). Before I forward port those patches to add .insn for MIPS, is that sort of approach (an arch specific asm/compiler-gcc.h to allow MIPS to override barrier_before_unreachable()) an acceptable fix? Thanks James
On Wed, Apr 11, 2018 at 12:48 AM, James Hogan <jhogan@kernel.org> wrote: > Hi Arnd, > > On Tue, Dec 19, 2017 at 12:39:33PM +0100, Arnd Bergmann wrote: >> diff --git a/include/linux/compiler-gcc.h b/include/linux/compiler-gcc.h >> index 5d595cfdb2c4..66cfdad68f7e 100644 >> --- a/include/linux/compiler-gcc.h >> +++ b/include/linux/compiler-gcc.h >> @@ -205,6 +205,15 @@ >> #endif >> >> /* >> + * calling noreturn functions, __builtin_unreachable() and __builtin_trap() >> + * confuse the stack allocation in gcc, leading to overly large stack >> + * frames, see https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82365 >> + * >> + * Adding an empty inline assembly before it works around the problem >> + */ >> +#define barrier_before_unreachable() asm volatile("") >> + >> +/* >> * Mark a position in code as unreachable. This can be used to >> * suppress control flow warnings after asm blocks that transfer >> * control elsewhere. >> @@ -214,7 +223,11 @@ >> * unreleased. Really, we need to have autoconf for the kernel. >> */ >> #define unreachable() \ >> - do { annotate_unreachable(); __builtin_unreachable(); } while (0) >> + do { \ >> + annotate_unreachable(); \ >> + barrier_before_unreachable(); \ >> + __builtin_unreachable(); \ >> + } while (0) > > Unfortunately this breaks microMIPS builds (e.g. MIPS > micro32r2_defconfig and micro32r2el_defconfig) on gcc 7.2, due to the > lack of .insn in the asm volatile. Because of the > __builtin_unreachable() there is no code following it. Without the empty > asm the compiler will apparently put the .insn there automatically, but > with the empty asm it doesn't. Therefore the assembler won't treat an > immediately preceeding label as pointing at 16-bit microMIPS > instructions which need the ISA bit set, i.e. bit 0 of the address. > This causes assembler errors since the branch target is treated as a > different ISA mode: > > arch/mips/mm/dma-default.s:3265: Error: branch to a symbol in another ISA mode > arch/mips/mm/dma-default.s:5027: Error: branch to a symbol in another ISA mode Ok, I see. > Due to a compiler bug on gcc 4.9.2 -> somewhere before 7.2, Paul > submitted these patches a while back: > https://patchwork.linux-mips.org/patch/13360/ > https://patchwork.linux-mips.org/patch/13361/ > > Your patch (suitably fixed for microMIPS) would I imagine fix that issue > too (it certainly fixes the resulting link error on microMIPS builds > with an old toolchain). > > Before I forward port those patches to add .insn for MIPS, is that sort > of approach (an arch specific asm/compiler-gcc.h to allow MIPS to > override barrier_before_unreachable()) an acceptable fix? That sounds fine to me. However, I would suggest making that asm/compiler.h instead of asm/compiler-gcc.h, so we can also use the same file to include workarounds for clang if needed. Arnd
On Wed, Apr 11, 2018 at 09:30:56AM +0200, Arnd Bergmann wrote: > On Wed, Apr 11, 2018 at 12:48 AM, James Hogan <jhogan@kernel.org> wrote: > > Before I forward port those patches to add .insn for MIPS, is that sort > > of approach (an arch specific asm/compiler-gcc.h to allow MIPS to > > override barrier_before_unreachable()) an acceptable fix? > > That sounds fine to me. However, I would suggest making that > asm/compiler.h instead of asm/compiler-gcc.h, so we can also > use the same file to include workarounds for clang if needed. Yes, though there are a few asm/compiler.h's already, and the alpha one includes linux/compiler.h before undefining inline, so seems to have its own specific purpose... Cheers James
On Wed, Apr 11, 2018 at 11:54 AM, James Hogan <jhogan@kernel.org> wrote: > On Wed, Apr 11, 2018 at 09:30:56AM +0200, Arnd Bergmann wrote: >> On Wed, Apr 11, 2018 at 12:48 AM, James Hogan <jhogan@kernel.org> wrote: >> > Before I forward port those patches to add .insn for MIPS, is that sort >> > of approach (an arch specific asm/compiler-gcc.h to allow MIPS to >> > override barrier_before_unreachable()) an acceptable fix? >> >> That sounds fine to me. However, I would suggest making that >> asm/compiler.h instead of asm/compiler-gcc.h, so we can also >> use the same file to include workarounds for clang if needed. > > Yes, though there are a few asm/compiler.h's already, and the alpha one > includes linux/compiler.h before undefining inline, so seems to have its > own specific purpose... Interesting. For the other ones, including asm/compiler.h from linux/compiler.h seems appropriate though, so the question would be what to do with the alpha case. I think we can simply remove that header file and replace it with this patch: diff --git a/arch/alpha/Kconfig b/arch/alpha/Kconfig index b2022885ced8..5502404f54cd 100644 --- a/arch/alpha/Kconfig +++ b/arch/alpha/Kconfig @@ -81,6 +81,9 @@ config PGTABLE_LEVELS int default 3 +config OPTIMIZE_INLINING + def_bool y + source "init/Kconfig" source "kernel/Kconfig.freezer" which should have the same effect.
On Wed, Apr 11, 2018 at 12:08:51PM +0200, Arnd Bergmann wrote: > On Wed, Apr 11, 2018 at 11:54 AM, James Hogan <jhogan@kernel.org> wrote: > > On Wed, Apr 11, 2018 at 09:30:56AM +0200, Arnd Bergmann wrote: > >> On Wed, Apr 11, 2018 at 12:48 AM, James Hogan <jhogan@kernel.org> wrote: > >> > Before I forward port those patches to add .insn for MIPS, is that sort > >> > of approach (an arch specific asm/compiler-gcc.h to allow MIPS to > >> > override barrier_before_unreachable()) an acceptable fix? > >> > >> That sounds fine to me. However, I would suggest making that > >> asm/compiler.h instead of asm/compiler-gcc.h, so we can also > >> use the same file to include workarounds for clang if needed. > > > > Yes, though there are a few asm/compiler.h's already, and the alpha one > > includes linux/compiler.h before undefining inline, so seems to have its > > own specific purpose... > > Interesting. For the other ones, including asm/compiler.h from linux/compiler.h > seems appropriate though, so the question would be what to do with the > alpha case. I think we can simply remove that header file and replace > it with this patch: > > diff --git a/arch/alpha/Kconfig b/arch/alpha/Kconfig > index b2022885ced8..5502404f54cd 100644 > --- a/arch/alpha/Kconfig > +++ b/arch/alpha/Kconfig > @@ -81,6 +81,9 @@ config PGTABLE_LEVELS > int > default 3 > > +config OPTIMIZE_INLINING > + def_bool y > + > source "init/Kconfig" > source "kernel/Kconfig.freezer" > > which should have the same effect. Hmm yes, and I suppose alpha would need ARCH_SUPPORTS_OPTIMIZED_INLINING too. I'll give it a try. Cheers James
diff --git a/arch/arc/include/asm/bug.h b/arch/arc/include/asm/bug.h index ea022d47896c..21ec82466d62 100644 --- a/arch/arc/include/asm/bug.h +++ b/arch/arc/include/asm/bug.h @@ -23,7 +23,8 @@ void die(const char *str, struct pt_regs *regs, unsigned long address); #define BUG() do { \ pr_warn("BUG: failure at %s:%d/%s()!\n", __FILE__, __LINE__, __func__); \ - dump_stack(); \ + barrier_before_unreachable(); \ + __builtin_trap(); \ } while (0) #define HAVE_ARCH_BUG diff --git a/arch/cris/include/arch-v10/arch/bug.h b/arch/cris/include/arch-v10/arch/bug.h index 905afeacfedf..06da9d49152a 100644 --- a/arch/cris/include/arch-v10/arch/bug.h +++ b/arch/cris/include/arch-v10/arch/bug.h @@ -44,18 +44,25 @@ struct bug_frame { * not be used like this with newer versions of gcc. */ #define BUG() \ +do { \ __asm__ __volatile__ ("clear.d [" __stringify(BUG_MAGIC) "]\n\t"\ "movu.w " __stringify(__LINE__) ",$r0\n\t"\ "jump 0f\n\t" \ ".section .rodata\n" \ "0:\t.string \"" __FILE__ "\"\n\t" \ - ".previous") + ".previous"); \ + unreachable(); \ +} while (0) #endif #else /* This just causes an oops. */ -#define BUG() (*(int *)0 = 0) +#define BUG() \ +do { \ + barrier_before_unreachable(); \ + __builtin_trap(); \ +} while (0) #endif diff --git a/arch/ia64/include/asm/bug.h b/arch/ia64/include/asm/bug.h index bd3eeb8d1cfa..66b37a532765 100644 --- a/arch/ia64/include/asm/bug.h +++ b/arch/ia64/include/asm/bug.h @@ -4,7 +4,11 @@ #ifdef CONFIG_BUG #define ia64_abort() __builtin_trap() -#define BUG() do { printk("kernel BUG at %s:%d!\n", __FILE__, __LINE__); ia64_abort(); } while (0) +#define BUG() do { \ + printk("kernel BUG at %s:%d!\n", __FILE__, __LINE__); \ + barrier_before_unreachable(); \ + ia64_abort(); \ +} while (0) /* should this BUG be made generic? */ #define HAVE_ARCH_BUG diff --git a/arch/m68k/include/asm/bug.h b/arch/m68k/include/asm/bug.h index b7e2bf1ba4a6..275dca1435bf 100644 --- a/arch/m68k/include/asm/bug.h +++ b/arch/m68k/include/asm/bug.h @@ -8,16 +8,19 @@ #ifndef CONFIG_SUN3 #define BUG() do { \ pr_crit("kernel BUG at %s:%d!\n", __FILE__, __LINE__); \ + barrier_before_unreachable(); \ __builtin_trap(); \ } while (0) #else #define BUG() do { \ pr_crit("kernel BUG at %s:%d!\n", __FILE__, __LINE__); \ + barrier_before_unreachable(); \ panic("BUG!"); \ } while (0) #endif #else #define BUG() do { \ + barrier_before_unreachable(); \ __builtin_trap(); \ } while (0) #endif diff --git a/arch/sparc/include/asm/bug.h b/arch/sparc/include/asm/bug.h index 6f17528356b2..ea53e418f6c0 100644 --- a/arch/sparc/include/asm/bug.h +++ b/arch/sparc/include/asm/bug.h @@ -9,10 +9,14 @@ void do_BUG(const char *file, int line); #define BUG() do { \ do_BUG(__FILE__, __LINE__); \ + barrier_before_unreachable(); \ __builtin_trap(); \ } while (0) #else -#define BUG() __builtin_trap() +#define BUG() do { \ + barrier_before_unreachable(); \ + __builtin_trap(); \ +} while (0) #endif #define HAVE_ARCH_BUG diff --git a/include/asm-generic/bug.h b/include/asm-generic/bug.h index 963b755d19b0..a7613e1b0c87 100644 --- a/include/asm-generic/bug.h +++ b/include/asm-generic/bug.h @@ -52,6 +52,7 @@ struct bug_entry { #ifndef HAVE_ARCH_BUG #define BUG() do { \ printk("BUG: failure at %s:%d/%s()!\n", __FILE__, __LINE__, __func__); \ + barrier_before_unreachable(); \ panic("BUG!"); \ } while (0) #endif diff --git a/include/linux/compiler-gcc.h b/include/linux/compiler-gcc.h index 5d595cfdb2c4..66cfdad68f7e 100644 --- a/include/linux/compiler-gcc.h +++ b/include/linux/compiler-gcc.h @@ -205,6 +205,15 @@ #endif /* + * calling noreturn functions, __builtin_unreachable() and __builtin_trap() + * confuse the stack allocation in gcc, leading to overly large stack + * frames, see https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82365 + * + * Adding an empty inline assembly before it works around the problem + */ +#define barrier_before_unreachable() asm volatile("") + +/* * Mark a position in code as unreachable. This can be used to * suppress control flow warnings after asm blocks that transfer * control elsewhere. @@ -214,7 +223,11 @@ * unreleased. Really, we need to have autoconf for the kernel. */ #define unreachable() \ - do { annotate_unreachable(); __builtin_unreachable(); } while (0) + do { \ + annotate_unreachable(); \ + barrier_before_unreachable(); \ + __builtin_unreachable(); \ + } while (0) /* Mark a function definition as prohibited from being cloned. */ #define __noclone __attribute__((__noclone__, __optimize__("no-tracer"))) diff --git a/include/linux/compiler.h b/include/linux/compiler.h index 52e611ab9a6c..97847f2f86cf 100644 --- a/include/linux/compiler.h +++ b/include/linux/compiler.h @@ -86,6 +86,11 @@ void ftrace_likely_update(struct ftrace_likely_data *f, int val, # define barrier_data(ptr) barrier() #endif +/* workaround for GCC PR82365 if needed */ +#ifndef barrier_before_unreachable +# define barrier_before_unreachable() do { } while (0) +#endif + /* Unreachable code */ #ifdef CONFIG_STACK_VALIDATION /*
Looking at functions with large stack frames across all architectures led me discovering that BUG() suffers from the same problem as fortify_panic(), which I've added a workaround for already. In short, variables that go out of scope by calling a noreturn function or __builtin_unreachable() keep using stack space in functions afterwards. A workaround that was identified is to insert an empty assembler statement just before calling the function that doesn't return. I'm adding a macro "barrier_before_unreachable()" to document this, and insert calls to that in all instances of BUG() that currently suffer from this problem. The files that saw the largest change from this had these frame sizes before, and much less with my patch: fs/ext4/inode.c:82:1: warning: the frame size of 1672 bytes is larger than 800 bytes [-Wframe-larger-than=] fs/ext4/namei.c:434:1: warning: the frame size of 904 bytes is larger than 800 bytes [-Wframe-larger-than=] fs/ext4/super.c:2279:1: warning: the frame size of 1160 bytes is larger than 800 bytes [-Wframe-larger-than=] fs/ext4/xattr.c:146:1: warning: the frame size of 1168 bytes is larger than 800 bytes [-Wframe-larger-than=] fs/f2fs/inode.c:152:1: warning: the frame size of 1424 bytes is larger than 800 bytes [-Wframe-larger-than=] net/netfilter/ipvs/ip_vs_core.c:1195:1: warning: the frame size of 1068 bytes is larger than 800 bytes [-Wframe-larger-than=] net/netfilter/ipvs/ip_vs_core.c:395:1: warning: the frame size of 1084 bytes is larger than 800 bytes [-Wframe-larger-than=] net/netfilter/ipvs/ip_vs_ftp.c:298:1: warning: the frame size of 928 bytes is larger than 800 bytes [-Wframe-larger-than=] net/netfilter/ipvs/ip_vs_ftp.c:418:1: warning: the frame size of 908 bytes is larger than 800 bytes [-Wframe-larger-than=] net/netfilter/ipvs/ip_vs_lblcr.c:718:1: warning: the frame size of 960 bytes is larger than 800 bytes [-Wframe-larger-than=] drivers/net/xen-netback/netback.c:1500:1: warning: the frame size of 1088 bytes is larger than 800 bytes [-Wframe-larger-than=] In case of ARC and CRIS, it turns out that the BUG() implementation actually does return (or at least the compiler thinks it does), resulting in lots of warnings about uninitialized variable use and leaving noreturn functions, such as: block/cfq-iosched.c: In function 'cfq_async_queue_prio': block/cfq-iosched.c:3804:1: error: control reaches end of non-void function [-Werror=return-type] include/linux/dmaengine.h: In function 'dma_maxpq': include/linux/dmaengine.h:1123:1: error: control reaches end of non-void function [-Werror=return-type] This makes them call __builtin_trap() instead, which should normally dump the stack and kill the current process, like some of the other architectures already do. I tried adding barrier_before_unreachable() to panic() and fortify_panic() as well, but that had very little effect, so I'm not submitting that patch. Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82365 Signed-off-by: Arnd Bergmann <arnd@arndb.de> --- The name barrier_before_unreachable() is a bit suboptimal here, as it fails to describe the fact that it is needed for both __builtin_unreachable() and for calling noreturn functions. Any other suggestions would be welcome here. --- arch/arc/include/asm/bug.h | 3 ++- arch/cris/include/arch-v10/arch/bug.h | 11 +++++++++-- arch/ia64/include/asm/bug.h | 6 +++++- arch/m68k/include/asm/bug.h | 3 +++ arch/sparc/include/asm/bug.h | 6 +++++- include/asm-generic/bug.h | 1 + include/linux/compiler-gcc.h | 15 ++++++++++++++- include/linux/compiler.h | 5 +++++ 8 files changed, 44 insertions(+), 6 deletions(-)