Message ID | alpine.DEB.2.02.1905191752230.13168@grove.saclay.inria.fr |
---|---|
State | New |
Headers | show |
Series | Simplify more EXACT_DIV_EXPR comparisons | expand |
On Sun, May 19, 2019 at 6:16 PM Marc Glisse <marc.glisse@inria.fr> wrote: > > Hello, > > 2 pieces: > > - the first one handles the case where the denominator is negative. It > doesn't happen often with exact_div, so I don't handle it everywhere, but > this one looked trivial > > - handle the case where a pointer difference is cast to an unsigned type > before being compared to a constant (I hit this in std::vector). With some > range info we could probably handle some non-constant cases as well... > > The second piece breaks Walloca-13.c (-Walloca-larger-than=100 -O2) > > void f (void*); > void g (int *p, int *q) > { > __SIZE_TYPE__ n = (__SIZE_TYPE__)(p - q); > if (n < 100) > f (__builtin_alloca (n)); > } > > At the time of walloca2, we have > > _1 = p_5(D) - q_6(D); > # RANGE [-2305843009213693952, 2305843009213693951] > _2 = _1 /[ex] 4; > # RANGE ~[2305843009213693952, 16140901064495857663] > n_7 = (long unsigned intD.10) _2; > _11 = (long unsigned intD.10) _1; > if (_11 <= 396) > [...] > _3 = allocaD.1059 (n_7); > > and warn. That's indeed to complicated relation of _11 to n_7 for VRP predicate discovery. > However, DOM3 later produces > > _1 = p_5(D) - q_6(D); > _11 = (long unsigned intD.10) _1; > if (_11 <= 396) while _11 vs. _1 works fine. > [...] > # RANGE [0, 99] NONZERO 127 > _2 = _1 /[ex] 4; > # RANGE [0, 99] NONZERO 127 > n_7 = (long unsigned intD.10) _2; > _3 = allocaD.1059 (n_7); > > so I am tempted to say that the walloca2 pass is too early, xfail the > testcase and file an issue... Hmm, there's a DOM pass before walloca2 already and moving walloca2 after loop opts doesn't look like the best thing to do? I suppose it's not DOM but sinking that does the important transform here? That is, Index: gcc/passes.def =================================================================== --- gcc/passes.def (revision 271395) +++ gcc/passes.def (working copy) @@ -241,9 +241,9 @@ along with GCC; see the file COPYING3. NEXT_PASS (pass_optimize_bswap); NEXT_PASS (pass_laddress); NEXT_PASS (pass_lim); - NEXT_PASS (pass_walloca, false); NEXT_PASS (pass_pre); NEXT_PASS (pass_sink_code); + NEXT_PASS (pass_walloca, false); NEXT_PASS (pass_sancov); NEXT_PASS (pass_asan); NEXT_PASS (pass_tsan); fixes it? > A single_use restriction would also probably fix this testcase, but I > don't think that's a good idea, the new code is better because the > division is now in the branch. OK. Can you test the pass motion above? Thanks, Richard. > 2019-05-20 Marc Glisse <marc.glisse@inria.fr> > > gcc/ > * match.pd (X/[ex]D<Y/[ex]D): Handle negative denominator. > ((size_t)(A /[ex] B) CMP C): New transformation. > > gcc/testsuite/ > * gcc.dg/tree-ssa/cmpexactdiv-3.c: New file. > * gcc.dg/tree-ssa/cmpexactdiv-4.c: New file. > > -- > Marc Glisse
On Mon, 20 May 2019, Richard Biener wrote: > On Sun, May 19, 2019 at 6:16 PM Marc Glisse <marc.glisse@inria.fr> wrote: >> >> Hello, >> >> 2 pieces: >> >> - the first one handles the case where the denominator is negative. It >> doesn't happen often with exact_div, so I don't handle it everywhere, but >> this one looked trivial >> >> - handle the case where a pointer difference is cast to an unsigned type >> before being compared to a constant (I hit this in std::vector). With some >> range info we could probably handle some non-constant cases as well... >> >> The second piece breaks Walloca-13.c (-Walloca-larger-than=100 -O2) >> >> void f (void*); >> void g (int *p, int *q) >> { >> __SIZE_TYPE__ n = (__SIZE_TYPE__)(p - q); >> if (n < 100) >> f (__builtin_alloca (n)); >> } >> >> At the time of walloca2, we have >> >> _1 = p_5(D) - q_6(D); >> # RANGE [-2305843009213693952, 2305843009213693951] >> _2 = _1 /[ex] 4; >> # RANGE ~[2305843009213693952, 16140901064495857663] >> n_7 = (long unsigned intD.10) _2; >> _11 = (long unsigned intD.10) _1; >> if (_11 <= 396) >> [...] >> _3 = allocaD.1059 (n_7); >> >> and warn. > > That's indeed to complicated relation of _11 to n_7 for > VRP predicate discovery. > >> However, DOM3 later produces >> >> _1 = p_5(D) - q_6(D); >> _11 = (long unsigned intD.10) _1; >> if (_11 <= 396) > > while _11 vs. _1 works fine. > >> [...] >> # RANGE [0, 99] NONZERO 127 >> _2 = _1 /[ex] 4; >> # RANGE [0, 99] NONZERO 127 >> n_7 = (long unsigned intD.10) _2; >> _3 = allocaD.1059 (n_7); >> >> so I am tempted to say that the walloca2 pass is too early, xfail the >> testcase and file an issue... > > Hmm, there's a DOM pass before walloca2 already and moving > walloca2 after loop opts doesn't look like the best thing to do? > I suppose it's not DOM but sinking that does the important transform > here? That is, > > Index: gcc/passes.def > =================================================================== > --- gcc/passes.def (revision 271395) > +++ gcc/passes.def (working copy) > @@ -241,9 +241,9 @@ along with GCC; see the file COPYING3. > NEXT_PASS (pass_optimize_bswap); > NEXT_PASS (pass_laddress); > NEXT_PASS (pass_lim); > - NEXT_PASS (pass_walloca, false); > NEXT_PASS (pass_pre); > NEXT_PASS (pass_sink_code); > + NEXT_PASS (pass_walloca, false); > NEXT_PASS (pass_sancov); > NEXT_PASS (pass_asan); > NEXT_PASS (pass_tsan); > > fixes it? I will check, but I don't think walloca uses any kind of on-demand VRP, so we still need some pass to update the ranges after sinking, which doesn't seem to happen until the next DOM pass.
On Mon, May 20, 2019 at 10:16 AM Marc Glisse <marc.glisse@inria.fr> wrote: > > On Mon, 20 May 2019, Richard Biener wrote: > > > On Sun, May 19, 2019 at 6:16 PM Marc Glisse <marc.glisse@inria.fr> wrote: > >> > >> Hello, > >> > >> 2 pieces: > >> > >> - the first one handles the case where the denominator is negative. It > >> doesn't happen often with exact_div, so I don't handle it everywhere, but > >> this one looked trivial > >> > >> - handle the case where a pointer difference is cast to an unsigned type > >> before being compared to a constant (I hit this in std::vector). With some > >> range info we could probably handle some non-constant cases as well... > >> > >> The second piece breaks Walloca-13.c (-Walloca-larger-than=100 -O2) > >> > >> void f (void*); > >> void g (int *p, int *q) > >> { > >> __SIZE_TYPE__ n = (__SIZE_TYPE__)(p - q); > >> if (n < 100) > >> f (__builtin_alloca (n)); > >> } > >> > >> At the time of walloca2, we have > >> > >> _1 = p_5(D) - q_6(D); > >> # RANGE [-2305843009213693952, 2305843009213693951] > >> _2 = _1 /[ex] 4; > >> # RANGE ~[2305843009213693952, 16140901064495857663] > >> n_7 = (long unsigned intD.10) _2; > >> _11 = (long unsigned intD.10) _1; > >> if (_11 <= 396) > >> [...] > >> _3 = allocaD.1059 (n_7); > >> > >> and warn. > > > > That's indeed to complicated relation of _11 to n_7 for > > VRP predicate discovery. > > > >> However, DOM3 later produces > >> > >> _1 = p_5(D) - q_6(D); > >> _11 = (long unsigned intD.10) _1; > >> if (_11 <= 396) > > > > while _11 vs. _1 works fine. > > > >> [...] > >> # RANGE [0, 99] NONZERO 127 > >> _2 = _1 /[ex] 4; > >> # RANGE [0, 99] NONZERO 127 > >> n_7 = (long unsigned intD.10) _2; > >> _3 = allocaD.1059 (n_7); > >> > >> so I am tempted to say that the walloca2 pass is too early, xfail the > >> testcase and file an issue... > > > > Hmm, there's a DOM pass before walloca2 already and moving > > walloca2 after loop opts doesn't look like the best thing to do? > > I suppose it's not DOM but sinking that does the important transform > > here? That is, > > > > Index: gcc/passes.def > > =================================================================== > > --- gcc/passes.def (revision 271395) > > +++ gcc/passes.def (working copy) > > @@ -241,9 +241,9 @@ along with GCC; see the file COPYING3. > > NEXT_PASS (pass_optimize_bswap); > > NEXT_PASS (pass_laddress); > > NEXT_PASS (pass_lim); > > - NEXT_PASS (pass_walloca, false); > > NEXT_PASS (pass_pre); > > NEXT_PASS (pass_sink_code); > > + NEXT_PASS (pass_walloca, false); > > NEXT_PASS (pass_sancov); > > NEXT_PASS (pass_asan); > > NEXT_PASS (pass_tsan); > > > > fixes it? > > I will check, but I don't think walloca uses any kind of on-demand VRP, so > we still need some pass to update the ranges after sinking, which doesn't > seem to happen until the next DOM pass. Oh, ok... Aldy, why's this a separate pass anyways? I think similar other warnigns are emitted from RTL expansion? So maybe we can indeed move the pass towards warn_restrict or late_warn_uninit. I also see that the Og pass pipeline misses the second walloca pass completely (and also the warn_restrict pass). Given code sinkings obvious effects on SSA value-range representation it may make sense to add another instance of that pass earlier. > > -- > Marc Glisse
On 5/20/19 3:16 AM, Richard Biener wrote: > On Mon, May 20, 2019 at 10:16 AM Marc Glisse <marc.glisse@inria.fr> wrote: >> >> On Mon, 20 May 2019, Richard Biener wrote: >> >>> On Sun, May 19, 2019 at 6:16 PM Marc Glisse <marc.glisse@inria.fr> wrote: >>>> >>>> Hello, >>>> >>>> 2 pieces: >>>> >>>> - the first one handles the case where the denominator is negative. It >>>> doesn't happen often with exact_div, so I don't handle it everywhere, but >>>> this one looked trivial >>>> >>>> - handle the case where a pointer difference is cast to an unsigned type >>>> before being compared to a constant (I hit this in std::vector). With some >>>> range info we could probably handle some non-constant cases as well... >>>> >>>> The second piece breaks Walloca-13.c (-Walloca-larger-than=100 -O2) >>>> >>>> void f (void*); >>>> void g (int *p, int *q) >>>> { >>>> __SIZE_TYPE__ n = (__SIZE_TYPE__)(p - q); >>>> if (n < 100) >>>> f (__builtin_alloca (n)); >>>> } >>>> >>>> At the time of walloca2, we have >>>> >>>> _1 = p_5(D) - q_6(D); >>>> # RANGE [-2305843009213693952, 2305843009213693951] >>>> _2 = _1 /[ex] 4; >>>> # RANGE ~[2305843009213693952, 16140901064495857663] >>>> n_7 = (long unsigned intD.10) _2; >>>> _11 = (long unsigned intD.10) _1; >>>> if (_11 <= 396) >>>> [...] >>>> _3 = allocaD.1059 (n_7); >>>> >>>> and warn. >>> >>> That's indeed to complicated relation of _11 to n_7 for >>> VRP predicate discovery. >>> >>>> However, DOM3 later produces >>>> >>>> _1 = p_5(D) - q_6(D); >>>> _11 = (long unsigned intD.10) _1; >>>> if (_11 <= 396) >>> >>> while _11 vs. _1 works fine. >>> >>>> [...] >>>> # RANGE [0, 99] NONZERO 127 >>>> _2 = _1 /[ex] 4; >>>> # RANGE [0, 99] NONZERO 127 >>>> n_7 = (long unsigned intD.10) _2; >>>> _3 = allocaD.1059 (n_7); >>>> >>>> so I am tempted to say that the walloca2 pass is too early, xfail the >>>> testcase and file an issue... >>> >>> Hmm, there's a DOM pass before walloca2 already and moving >>> walloca2 after loop opts doesn't look like the best thing to do? >>> I suppose it's not DOM but sinking that does the important transform >>> here? That is, >>> >>> Index: gcc/passes.def >>> =================================================================== >>> --- gcc/passes.def (revision 271395) >>> +++ gcc/passes.def (working copy) >>> @@ -241,9 +241,9 @@ along with GCC; see the file COPYING3. >>> NEXT_PASS (pass_optimize_bswap); >>> NEXT_PASS (pass_laddress); >>> NEXT_PASS (pass_lim); >>> - NEXT_PASS (pass_walloca, false); >>> NEXT_PASS (pass_pre); >>> NEXT_PASS (pass_sink_code); >>> + NEXT_PASS (pass_walloca, false); >>> NEXT_PASS (pass_sancov); >>> NEXT_PASS (pass_asan); >>> NEXT_PASS (pass_tsan); >>> >>> fixes it? >> >> I will check, but I don't think walloca uses any kind of on-demand VRP, so >> we still need some pass to update the ranges after sinking, which doesn't >> seem to happen until the next DOM pass. > > Oh, ok... Aldy, why's this a separate pass anyways? I think similar > other warnigns are emitted from RTL expansion? So maybe we can > indeed move the pass towards warn_restrict or late_warn_uninit. > I also see that the Og pass pipeline misses the second walloca pass > completely (and also the warn_restrict pass). I'd tend to agree that pushing it down would be a good thing. It's probably a separate pass because I suggested it not be embedded within tree-vrp.c which I felt had way too much stuff in it already. > > Given code sinkings obvious effects on SSA value-range representation > it may make sense to add another instance of that pass earlier. There is an early pass already. jeff
On 5/20/19 2:04 AM, Richard Biener wrote: > On Sun, May 19, 2019 at 6:16 PM Marc Glisse <marc.glisse@inria.fr> wrote: >> >> Hello, >> >> 2 pieces: >> >> - the first one handles the case where the denominator is negative. It >> doesn't happen often with exact_div, so I don't handle it everywhere, but >> this one looked trivial >> >> - handle the case where a pointer difference is cast to an unsigned type >> before being compared to a constant (I hit this in std::vector). With some >> range info we could probably handle some non-constant cases as well... >> >> The second piece breaks Walloca-13.c (-Walloca-larger-than=100 -O2) >> >> void f (void*); >> void g (int *p, int *q) >> { >> __SIZE_TYPE__ n = (__SIZE_TYPE__)(p - q); >> if (n < 100) >> f (__builtin_alloca (n)); >> } >> >> At the time of walloca2, we have >> >> _1 = p_5(D) - q_6(D); >> # RANGE [-2305843009213693952, 2305843009213693951] >> _2 = _1 /[ex] 4; >> # RANGE ~[2305843009213693952, 16140901064495857663] >> n_7 = (long unsigned intD.10) _2; >> _11 = (long unsigned intD.10) _1; >> if (_11 <= 396) >> [...] >> _3 = allocaD.1059 (n_7); >> >> and warn. > > That's indeed to complicated relation of _11 to n_7 for > VRP predicate discovery. > >> However, DOM3 later produces >> >> _1 = p_5(D) - q_6(D); >> _11 = (long unsigned intD.10) _1; >> if (_11 <= 396) > > while _11 vs. _1 works fine. > >> [...] >> # RANGE [0, 99] NONZERO 127 >> _2 = _1 /[ex] 4; >> # RANGE [0, 99] NONZERO 127 >> n_7 = (long unsigned intD.10) _2; >> _3 = allocaD.1059 (n_7); >> >> so I am tempted to say that the walloca2 pass is too early, xfail the >> testcase and file an issue... > > Hmm, there's a DOM pass before walloca2 already and moving > walloca2 after loop opts doesn't look like the best thing to do? > I suppose it's not DOM but sinking that does the important transform > here? That is, It could well be sinking that's critical here. It certainly was in cases I looked at in our <vec> code and how we needed to transform the pointer subtraction and subsequent testing to discover infeasible paths. Jeff
On Mon, 20 May 2019, Jeff Law wrote: > On 5/20/19 3:16 AM, Richard Biener wrote: >> Given code sinkings obvious effects on SSA value-range representation >> it may make sense to add another instance of that pass earlier. > There is an early pass already. IIUC Richard was talking about adding an early 'sink' pass and you are saying that there is already an early 'walloca' pass? Or are you talking about a different pass not called 'sink' that does something similar?
On 5/20/19 11:37 AM, Marc Glisse wrote: > On Mon, 20 May 2019, Jeff Law wrote: > >> On 5/20/19 3:16 AM, Richard Biener wrote: >>> Given code sinkings obvious effects on SSA value-range representation >>> it may make sense to add another instance of that pass earlier. >> There is an early pass already. > > IIUC Richard was talking about adding an early 'sink' pass and you are > saying that there is already an early 'walloca' pass? Or are you talking > about a different pass not called 'sink' that does something similar? Oh, I misunderstood. I meant there was an early Walloca pass. jeff
On 5/20/19 3:16 AM, Richard Biener wrote: > On Mon, May 20, 2019 at 10:16 AM Marc Glisse <marc.glisse@inria.fr> wrote: >> >> On Mon, 20 May 2019, Richard Biener wrote: >> >>> On Sun, May 19, 2019 at 6:16 PM Marc Glisse <marc.glisse@inria.fr> wrote: >>>> >>>> Hello, >>>> >>>> 2 pieces: >>>> >>>> - the first one handles the case where the denominator is negative. It >>>> doesn't happen often with exact_div, so I don't handle it everywhere, but >>>> this one looked trivial >>>> >>>> - handle the case where a pointer difference is cast to an unsigned type >>>> before being compared to a constant (I hit this in std::vector). With some >>>> range info we could probably handle some non-constant cases as well... >>>> >>>> The second piece breaks Walloca-13.c (-Walloca-larger-than=100 -O2) >>>> >>>> void f (void*); >>>> void g (int *p, int *q) >>>> { >>>> __SIZE_TYPE__ n = (__SIZE_TYPE__)(p - q); >>>> if (n < 100) >>>> f (__builtin_alloca (n)); >>>> } >>>> >>>> At the time of walloca2, we have >>>> >>>> _1 = p_5(D) - q_6(D); >>>> # RANGE [-2305843009213693952, 2305843009213693951] >>>> _2 = _1 /[ex] 4; >>>> # RANGE ~[2305843009213693952, 16140901064495857663] >>>> n_7 = (long unsigned intD.10) _2; >>>> _11 = (long unsigned intD.10) _1; >>>> if (_11 <= 396) >>>> [...] >>>> _3 = allocaD.1059 (n_7); >>>> >>>> and warn. >>> >>> That's indeed to complicated relation of _11 to n_7 for >>> VRP predicate discovery. >>> >>>> However, DOM3 later produces >>>> >>>> _1 = p_5(D) - q_6(D); >>>> _11 = (long unsigned intD.10) _1; >>>> if (_11 <= 396) >>> >>> while _11 vs. _1 works fine. >>> >>>> [...] >>>> # RANGE [0, 99] NONZERO 127 >>>> _2 = _1 /[ex] 4; >>>> # RANGE [0, 99] NONZERO 127 >>>> n_7 = (long unsigned intD.10) _2; >>>> _3 = allocaD.1059 (n_7); >>>> >>>> so I am tempted to say that the walloca2 pass is too early, xfail the >>>> testcase and file an issue... >>> >>> Hmm, there's a DOM pass before walloca2 already and moving >>> walloca2 after loop opts doesn't look like the best thing to do? >>> I suppose it's not DOM but sinking that does the important transform >>> here? That is, >>> >>> Index: gcc/passes.def >>> =================================================================== >>> --- gcc/passes.def (revision 271395) >>> +++ gcc/passes.def (working copy) >>> @@ -241,9 +241,9 @@ along with GCC; see the file COPYING3. >>> NEXT_PASS (pass_optimize_bswap); >>> NEXT_PASS (pass_laddress); >>> NEXT_PASS (pass_lim); >>> - NEXT_PASS (pass_walloca, false); >>> NEXT_PASS (pass_pre); >>> NEXT_PASS (pass_sink_code); >>> + NEXT_PASS (pass_walloca, false); >>> NEXT_PASS (pass_sancov); >>> NEXT_PASS (pass_asan); >>> NEXT_PASS (pass_tsan); >>> >>> fixes it? >> >> I will check, but I don't think walloca uses any kind of on-demand VRP, so >> we still need some pass to update the ranges after sinking, which doesn't >> seem to happen until the next DOM pass. > > Oh, ok... Aldy, why's this a separate pass anyways? I think similar > other warnigns are emitted from RTL expansion? So maybe we can > indeed move the pass towards warn_restrict or late_warn_uninit. I thought there was a preference to add new middle-end warnings into passes of their own rather than into existing passes. Is that not so (either in general or in this specific case)? From my POV, the main (only?) benefit of putting warnings in their own passes is modularity. Are there any others? The biggest drawback I see is that it makes it hard to then share data across multiple passes. The sharing can help not just warnings (reduce both false positive and false negative rates) but also optimization. That's why I'm merging the strlen and sprintf passes, and want to eventually also look into merging the -Wstringop-overflow warnings there (also emitted just before RTL expansion. Did I miss any downsides? I don't know if there's the -Walloca pass would benefit from merging with any of the others or vice versa, but superficially it seems like it might be worth thinking about integrating the -Walloc-larger-than warnings into the -Walloca pass, if only to keep similar functionality in the same place. > I also see that the Og pass pipeline misses the second walloca pass > completely (and also the warn_restrict pass). That's worth fixing. Martin > > Given code sinkings obvious effects on SSA value-range representation > it may make sense to add another instance of that pass earlier. > > > >> >> -- >> Marc Glisse
On Tue, May 21, 2019 at 4:13 AM Martin Sebor <msebor@gmail.com> wrote: > > On 5/20/19 3:16 AM, Richard Biener wrote: > > On Mon, May 20, 2019 at 10:16 AM Marc Glisse <marc.glisse@inria.fr> wrote: > >> > >> On Mon, 20 May 2019, Richard Biener wrote: > >> > >>> On Sun, May 19, 2019 at 6:16 PM Marc Glisse <marc.glisse@inria.fr> wrote: > >>>> > >>>> Hello, > >>>> > >>>> 2 pieces: > >>>> > >>>> - the first one handles the case where the denominator is negative. It > >>>> doesn't happen often with exact_div, so I don't handle it everywhere, but > >>>> this one looked trivial > >>>> > >>>> - handle the case where a pointer difference is cast to an unsigned type > >>>> before being compared to a constant (I hit this in std::vector). With some > >>>> range info we could probably handle some non-constant cases as well... > >>>> > >>>> The second piece breaks Walloca-13.c (-Walloca-larger-than=100 -O2) > >>>> > >>>> void f (void*); > >>>> void g (int *p, int *q) > >>>> { > >>>> __SIZE_TYPE__ n = (__SIZE_TYPE__)(p - q); > >>>> if (n < 100) > >>>> f (__builtin_alloca (n)); > >>>> } > >>>> > >>>> At the time of walloca2, we have > >>>> > >>>> _1 = p_5(D) - q_6(D); > >>>> # RANGE [-2305843009213693952, 2305843009213693951] > >>>> _2 = _1 /[ex] 4; > >>>> # RANGE ~[2305843009213693952, 16140901064495857663] > >>>> n_7 = (long unsigned intD.10) _2; > >>>> _11 = (long unsigned intD.10) _1; > >>>> if (_11 <= 396) > >>>> [...] > >>>> _3 = allocaD.1059 (n_7); > >>>> > >>>> and warn. > >>> > >>> That's indeed to complicated relation of _11 to n_7 for > >>> VRP predicate discovery. > >>> > >>>> However, DOM3 later produces > >>>> > >>>> _1 = p_5(D) - q_6(D); > >>>> _11 = (long unsigned intD.10) _1; > >>>> if (_11 <= 396) > >>> > >>> while _11 vs. _1 works fine. > >>> > >>>> [...] > >>>> # RANGE [0, 99] NONZERO 127 > >>>> _2 = _1 /[ex] 4; > >>>> # RANGE [0, 99] NONZERO 127 > >>>> n_7 = (long unsigned intD.10) _2; > >>>> _3 = allocaD.1059 (n_7); > >>>> > >>>> so I am tempted to say that the walloca2 pass is too early, xfail the > >>>> testcase and file an issue... > >>> > >>> Hmm, there's a DOM pass before walloca2 already and moving > >>> walloca2 after loop opts doesn't look like the best thing to do? > >>> I suppose it's not DOM but sinking that does the important transform > >>> here? That is, > >>> > >>> Index: gcc/passes.def > >>> =================================================================== > >>> --- gcc/passes.def (revision 271395) > >>> +++ gcc/passes.def (working copy) > >>> @@ -241,9 +241,9 @@ along with GCC; see the file COPYING3. > >>> NEXT_PASS (pass_optimize_bswap); > >>> NEXT_PASS (pass_laddress); > >>> NEXT_PASS (pass_lim); > >>> - NEXT_PASS (pass_walloca, false); > >>> NEXT_PASS (pass_pre); > >>> NEXT_PASS (pass_sink_code); > >>> + NEXT_PASS (pass_walloca, false); > >>> NEXT_PASS (pass_sancov); > >>> NEXT_PASS (pass_asan); > >>> NEXT_PASS (pass_tsan); > >>> > >>> fixes it? > >> > >> I will check, but I don't think walloca uses any kind of on-demand VRP, so > >> we still need some pass to update the ranges after sinking, which doesn't > >> seem to happen until the next DOM pass. > > > > Oh, ok... Aldy, why's this a separate pass anyways? I think similar > > other warnigns are emitted from RTL expansion? So maybe we can > > indeed move the pass towards warn_restrict or late_warn_uninit. > > I thought there was a preference to add new middle-end warnings > into passes of their own rather than into existing passes. Is > that not so (either in general or in this specific case)? The preference was to add them not into optimization passes. But of course having 10+ warning passes, each going over the whole IL is excessive. Also each of the locally computing ranges or so. Given the simplicity of Walloca I wonder why it's not part of another warning pass - since it's about tracking "sizes" again there are plenty that fit ;) > From my POV, the main (only?) benefit of putting warnings in their > own passes is modularity. Are there any others? > > The biggest drawback I see is that it makes it hard to then share > data across multiple passes. The sharing can help not just > warnings (reduce both false positive and false negative rates) but > also optimization. That's why I'm merging the strlen and sprintf > passes, and want to eventually also look into merging > the -Wstringop-overflow warnings there (also emitted just before > RTL expansion. Did I miss any downsides? When things fit together they are fine to merge obviously. One may not like -Warray-bounds inside VRP but it really "fits". OTOH making a warning part of an optimization pass naturally limits its effect to when the specific optimization is enabled. In theory it's possible to do -Warray-bounds at -O0 - we are in SSA form after all - but of course you don't want to enable VRP at -O0. > I don't know if there's the -Walloca pass would benefit from merging > with any of the others or vice versa, but superficially it seems like > it might be worth thinking about integrating the -Walloc-larger-than > warnings into the -Walloca pass, if only to keep similar functionality > in the same place. -Wrestrict and all the format warning stuff seems related as well. > > I also see that the Og pass pipeline misses the second walloca pass > > completely (and also the warn_restrict pass). > > That's worth fixing. > > Martin > > > > > Given code sinkings obvious effects on SSA value-range representation > > it may make sense to add another instance of that pass earlier. > > > > > > > >> > >> -- > >> Marc Glisse >
On 5/21/19 5:53 AM, Richard Biener wrote: > On Tue, May 21, 2019 at 4:13 AM Martin Sebor <msebor@gmail.com> wrote: >> >> On 5/20/19 3:16 AM, Richard Biener wrote: >>> On Mon, May 20, 2019 at 10:16 AM Marc Glisse <marc.glisse@inria.fr> wrote: >>>> >>>> On Mon, 20 May 2019, Richard Biener wrote: >>>> >>>>> On Sun, May 19, 2019 at 6:16 PM Marc Glisse <marc.glisse@inria.fr> wrote: >>>>>> >>>>>> Hello, >>>>>> >>>>>> 2 pieces: >>>>>> >>>>>> - the first one handles the case where the denominator is negative. It >>>>>> doesn't happen often with exact_div, so I don't handle it everywhere, but >>>>>> this one looked trivial >>>>>> >>>>>> - handle the case where a pointer difference is cast to an unsigned type >>>>>> before being compared to a constant (I hit this in std::vector). With some >>>>>> range info we could probably handle some non-constant cases as well... >>>>>> >>>>>> The second piece breaks Walloca-13.c (-Walloca-larger-than=100 -O2) >>>>>> >>>>>> void f (void*); >>>>>> void g (int *p, int *q) >>>>>> { >>>>>> __SIZE_TYPE__ n = (__SIZE_TYPE__)(p - q); >>>>>> if (n < 100) >>>>>> f (__builtin_alloca (n)); >>>>>> } >>>>>> >>>>>> At the time of walloca2, we have >>>>>> >>>>>> _1 = p_5(D) - q_6(D); >>>>>> # RANGE [-2305843009213693952, 2305843009213693951] >>>>>> _2 = _1 /[ex] 4; >>>>>> # RANGE ~[2305843009213693952, 16140901064495857663] >>>>>> n_7 = (long unsigned intD.10) _2; >>>>>> _11 = (long unsigned intD.10) _1; >>>>>> if (_11 <= 396) >>>>>> [...] >>>>>> _3 = allocaD.1059 (n_7); >>>>>> >>>>>> and warn. >>>>> >>>>> That's indeed to complicated relation of _11 to n_7 for >>>>> VRP predicate discovery. >>>>> >>>>>> However, DOM3 later produces >>>>>> >>>>>> _1 = p_5(D) - q_6(D); >>>>>> _11 = (long unsigned intD.10) _1; >>>>>> if (_11 <= 396) >>>>> >>>>> while _11 vs. _1 works fine. >>>>> >>>>>> [...] >>>>>> # RANGE [0, 99] NONZERO 127 >>>>>> _2 = _1 /[ex] 4; >>>>>> # RANGE [0, 99] NONZERO 127 >>>>>> n_7 = (long unsigned intD.10) _2; >>>>>> _3 = allocaD.1059 (n_7); >>>>>> >>>>>> so I am tempted to say that the walloca2 pass is too early, xfail the >>>>>> testcase and file an issue... >>>>> >>>>> Hmm, there's a DOM pass before walloca2 already and moving >>>>> walloca2 after loop opts doesn't look like the best thing to do? >>>>> I suppose it's not DOM but sinking that does the important transform >>>>> here? That is, >>>>> >>>>> Index: gcc/passes.def >>>>> =================================================================== >>>>> --- gcc/passes.def (revision 271395) >>>>> +++ gcc/passes.def (working copy) >>>>> @@ -241,9 +241,9 @@ along with GCC; see the file COPYING3. >>>>> NEXT_PASS (pass_optimize_bswap); >>>>> NEXT_PASS (pass_laddress); >>>>> NEXT_PASS (pass_lim); >>>>> - NEXT_PASS (pass_walloca, false); >>>>> NEXT_PASS (pass_pre); >>>>> NEXT_PASS (pass_sink_code); >>>>> + NEXT_PASS (pass_walloca, false); >>>>> NEXT_PASS (pass_sancov); >>>>> NEXT_PASS (pass_asan); >>>>> NEXT_PASS (pass_tsan); >>>>> >>>>> fixes it? >>>> >>>> I will check, but I don't think walloca uses any kind of on-demand VRP, so >>>> we still need some pass to update the ranges after sinking, which doesn't >>>> seem to happen until the next DOM pass. >>> >>> Oh, ok... Aldy, why's this a separate pass anyways? I think similar >>> other warnigns are emitted from RTL expansion? So maybe we can >>> indeed move the pass towards warn_restrict or late_warn_uninit. >> >> I thought there was a preference to add new middle-end warnings >> into passes of their own rather than into existing passes. Is >> that not so (either in general or in this specific case)? > > The preference was to add them not into optimization passes. But > of course having 10+ warning passes, each going over the whole IL > is excessive. Also each of the locally computing ranges or so. > > Given the simplicity of Walloca I wonder why it's not part of another > warning pass - since it's about tracking "sizes" again there are plenty > that fit ;) > >> From my POV, the main (only?) benefit of putting warnings in their >> own passes is modularity. Are there any others? >> >> The biggest drawback I see is that it makes it hard to then share >> data across multiple passes. The sharing can help not just >> warnings (reduce both false positive and false negative rates) but >> also optimization. That's why I'm merging the strlen and sprintf >> passes, and want to eventually also look into merging >> the -Wstringop-overflow warnings there (also emitted just before >> RTL expansion. Did I miss any downsides? > > When things fit together they are fine to merge obviously. > > One may not like -Warray-bounds inside VRP but it really "fits". > > OTOH making a warning part of an optimization pass naturally > limits its effect to when the specific optimization is enabled. > In theory it's possible to do -Warray-bounds at -O0 - we are in > SSA form after all - but of course you don't want to enable VRP at -O0. > >> I don't know if there's the -Walloca pass would benefit from merging >> with any of the others or vice versa, but superficially it seems like >> it might be worth thinking about integrating the -Walloc-larger-than >> warnings into the -Walloca pass, if only to keep similar functionality >> in the same place. My original code was in the VRP pass and I can't remember whether it was you or Jeff that suggested a separate pass so we'd stop polluting VRP with everything but the kitchen sink. Aldy
On Mon, May 27, 2019 at 3:09 PM Aldy Hernandez <aldyh@redhat.com> wrote: > > > > On 5/21/19 5:53 AM, Richard Biener wrote: > > On Tue, May 21, 2019 at 4:13 AM Martin Sebor <msebor@gmail.com> wrote: > >> > >> On 5/20/19 3:16 AM, Richard Biener wrote: > >>> On Mon, May 20, 2019 at 10:16 AM Marc Glisse <marc.glisse@inria.fr> wrote: > >>>> > >>>> On Mon, 20 May 2019, Richard Biener wrote: > >>>> > >>>>> On Sun, May 19, 2019 at 6:16 PM Marc Glisse <marc.glisse@inria.fr> wrote: > >>>>>> > >>>>>> Hello, > >>>>>> > >>>>>> 2 pieces: > >>>>>> > >>>>>> - the first one handles the case where the denominator is negative. It > >>>>>> doesn't happen often with exact_div, so I don't handle it everywhere, but > >>>>>> this one looked trivial > >>>>>> > >>>>>> - handle the case where a pointer difference is cast to an unsigned type > >>>>>> before being compared to a constant (I hit this in std::vector). With some > >>>>>> range info we could probably handle some non-constant cases as well... > >>>>>> > >>>>>> The second piece breaks Walloca-13.c (-Walloca-larger-than=100 -O2) > >>>>>> > >>>>>> void f (void*); > >>>>>> void g (int *p, int *q) > >>>>>> { > >>>>>> __SIZE_TYPE__ n = (__SIZE_TYPE__)(p - q); > >>>>>> if (n < 100) > >>>>>> f (__builtin_alloca (n)); > >>>>>> } > >>>>>> > >>>>>> At the time of walloca2, we have > >>>>>> > >>>>>> _1 = p_5(D) - q_6(D); > >>>>>> # RANGE [-2305843009213693952, 2305843009213693951] > >>>>>> _2 = _1 /[ex] 4; > >>>>>> # RANGE ~[2305843009213693952, 16140901064495857663] > >>>>>> n_7 = (long unsigned intD.10) _2; > >>>>>> _11 = (long unsigned intD.10) _1; > >>>>>> if (_11 <= 396) > >>>>>> [...] > >>>>>> _3 = allocaD.1059 (n_7); > >>>>>> > >>>>>> and warn. > >>>>> > >>>>> That's indeed to complicated relation of _11 to n_7 for > >>>>> VRP predicate discovery. > >>>>> > >>>>>> However, DOM3 later produces > >>>>>> > >>>>>> _1 = p_5(D) - q_6(D); > >>>>>> _11 = (long unsigned intD.10) _1; > >>>>>> if (_11 <= 396) > >>>>> > >>>>> while _11 vs. _1 works fine. > >>>>> > >>>>>> [...] > >>>>>> # RANGE [0, 99] NONZERO 127 > >>>>>> _2 = _1 /[ex] 4; > >>>>>> # RANGE [0, 99] NONZERO 127 > >>>>>> n_7 = (long unsigned intD.10) _2; > >>>>>> _3 = allocaD.1059 (n_7); > >>>>>> > >>>>>> so I am tempted to say that the walloca2 pass is too early, xfail the > >>>>>> testcase and file an issue... > >>>>> > >>>>> Hmm, there's a DOM pass before walloca2 already and moving > >>>>> walloca2 after loop opts doesn't look like the best thing to do? > >>>>> I suppose it's not DOM but sinking that does the important transform > >>>>> here? That is, > >>>>> > >>>>> Index: gcc/passes.def > >>>>> =================================================================== > >>>>> --- gcc/passes.def (revision 271395) > >>>>> +++ gcc/passes.def (working copy) > >>>>> @@ -241,9 +241,9 @@ along with GCC; see the file COPYING3. > >>>>> NEXT_PASS (pass_optimize_bswap); > >>>>> NEXT_PASS (pass_laddress); > >>>>> NEXT_PASS (pass_lim); > >>>>> - NEXT_PASS (pass_walloca, false); > >>>>> NEXT_PASS (pass_pre); > >>>>> NEXT_PASS (pass_sink_code); > >>>>> + NEXT_PASS (pass_walloca, false); > >>>>> NEXT_PASS (pass_sancov); > >>>>> NEXT_PASS (pass_asan); > >>>>> NEXT_PASS (pass_tsan); > >>>>> > >>>>> fixes it? > >>>> > >>>> I will check, but I don't think walloca uses any kind of on-demand VRP, so > >>>> we still need some pass to update the ranges after sinking, which doesn't > >>>> seem to happen until the next DOM pass. > >>> > >>> Oh, ok... Aldy, why's this a separate pass anyways? I think similar > >>> other warnigns are emitted from RTL expansion? So maybe we can > >>> indeed move the pass towards warn_restrict or late_warn_uninit. > >> > >> I thought there was a preference to add new middle-end warnings > >> into passes of their own rather than into existing passes. Is > >> that not so (either in general or in this specific case)? > > > > The preference was to add them not into optimization passes. But > > of course having 10+ warning passes, each going over the whole IL > > is excessive. Also each of the locally computing ranges or so. > > > > Given the simplicity of Walloca I wonder why it's not part of another > > warning pass - since it's about tracking "sizes" again there are plenty > > that fit ;) > > > >> From my POV, the main (only?) benefit of putting warnings in their > >> own passes is modularity. Are there any others? > >> > >> The biggest drawback I see is that it makes it hard to then share > >> data across multiple passes. The sharing can help not just > >> warnings (reduce both false positive and false negative rates) but > >> also optimization. That's why I'm merging the strlen and sprintf > >> passes, and want to eventually also look into merging > >> the -Wstringop-overflow warnings there (also emitted just before > >> RTL expansion. Did I miss any downsides? > > > > When things fit together they are fine to merge obviously. > > > > One may not like -Warray-bounds inside VRP but it really "fits". > > > > OTOH making a warning part of an optimization pass naturally > > limits its effect to when the specific optimization is enabled. > > In theory it's possible to do -Warray-bounds at -O0 - we are in > > SSA form after all - but of course you don't want to enable VRP at -O0. > > > >> I don't know if there's the -Walloca pass would benefit from merging > >> with any of the others or vice versa, but superficially it seems like > >> it might be worth thinking about integrating the -Walloc-larger-than > >> warnings into the -Walloca pass, if only to keep similar functionality > >> in the same place. > > My original code was in the VRP pass and I can't remember whether it was > you or Jeff that suggested a separate pass so we'd stop polluting VRP > with everything but the kitchen sink. Specifically for VRP the reason is I'd like to see it go away... Richard. > > Aldy
>>>> I don't know if there's the -Walloca pass would benefit from merging >>>> with any of the others or vice versa, but superficially it seems like >>>> it might be worth thinking about integrating the -Walloc-larger-than >>>> warnings into the -Walloca pass, if only to keep similar functionality >>>> in the same place. >> >> My original code was in the VRP pass and I can't remember whether it was >> you or Jeff that suggested a separate pass so we'd stop polluting VRP >> with everything but the kitchen sink. > > Specifically for VRP the reason is I'd like to see it go away... You're preaching to the choir here :). Aldy
On 5/21/19 3:53 AM, Richard Biener wrote: > On Tue, May 21, 2019 at 4:13 AM Martin Sebor <msebor@gmail.com> wrote: >> >> On 5/20/19 3:16 AM, Richard Biener wrote: >>> On Mon, May 20, 2019 at 10:16 AM Marc Glisse <marc.glisse@inria.fr> wrote: >>>> >>>> On Mon, 20 May 2019, Richard Biener wrote: >>>> >>>>> On Sun, May 19, 2019 at 6:16 PM Marc Glisse <marc.glisse@inria.fr> wrote: >>>>>> >>>>>> Hello, >>>>>> >>>>>> 2 pieces: >>>>>> >>>>>> - the first one handles the case where the denominator is negative. It >>>>>> doesn't happen often with exact_div, so I don't handle it everywhere, but >>>>>> this one looked trivial >>>>>> >>>>>> - handle the case where a pointer difference is cast to an unsigned type >>>>>> before being compared to a constant (I hit this in std::vector). With some >>>>>> range info we could probably handle some non-constant cases as well... >>>>>> >>>>>> The second piece breaks Walloca-13.c (-Walloca-larger-than=100 -O2) >>>>>> >>>>>> void f (void*); >>>>>> void g (int *p, int *q) >>>>>> { >>>>>> __SIZE_TYPE__ n = (__SIZE_TYPE__)(p - q); >>>>>> if (n < 100) >>>>>> f (__builtin_alloca (n)); >>>>>> } >>>>>> >>>>>> At the time of walloca2, we have >>>>>> >>>>>> _1 = p_5(D) - q_6(D); >>>>>> # RANGE [-2305843009213693952, 2305843009213693951] >>>>>> _2 = _1 /[ex] 4; >>>>>> # RANGE ~[2305843009213693952, 16140901064495857663] >>>>>> n_7 = (long unsigned intD.10) _2; >>>>>> _11 = (long unsigned intD.10) _1; >>>>>> if (_11 <= 396) >>>>>> [...] >>>>>> _3 = allocaD.1059 (n_7); >>>>>> >>>>>> and warn. >>>>> >>>>> That's indeed to complicated relation of _11 to n_7 for >>>>> VRP predicate discovery. >>>>> >>>>>> However, DOM3 later produces >>>>>> >>>>>> _1 = p_5(D) - q_6(D); >>>>>> _11 = (long unsigned intD.10) _1; >>>>>> if (_11 <= 396) >>>>> >>>>> while _11 vs. _1 works fine. >>>>> >>>>>> [...] >>>>>> # RANGE [0, 99] NONZERO 127 >>>>>> _2 = _1 /[ex] 4; >>>>>> # RANGE [0, 99] NONZERO 127 >>>>>> n_7 = (long unsigned intD.10) _2; >>>>>> _3 = allocaD.1059 (n_7); >>>>>> >>>>>> so I am tempted to say that the walloca2 pass is too early, xfail the >>>>>> testcase and file an issue... >>>>> >>>>> Hmm, there's a DOM pass before walloca2 already and moving >>>>> walloca2 after loop opts doesn't look like the best thing to do? >>>>> I suppose it's not DOM but sinking that does the important transform >>>>> here? That is, >>>>> >>>>> Index: gcc/passes.def >>>>> =================================================================== >>>>> --- gcc/passes.def (revision 271395) >>>>> +++ gcc/passes.def (working copy) >>>>> @@ -241,9 +241,9 @@ along with GCC; see the file COPYING3. >>>>> NEXT_PASS (pass_optimize_bswap); >>>>> NEXT_PASS (pass_laddress); >>>>> NEXT_PASS (pass_lim); >>>>> - NEXT_PASS (pass_walloca, false); >>>>> NEXT_PASS (pass_pre); >>>>> NEXT_PASS (pass_sink_code); >>>>> + NEXT_PASS (pass_walloca, false); >>>>> NEXT_PASS (pass_sancov); >>>>> NEXT_PASS (pass_asan); >>>>> NEXT_PASS (pass_tsan); >>>>> >>>>> fixes it? >>>> >>>> I will check, but I don't think walloca uses any kind of on-demand VRP, so >>>> we still need some pass to update the ranges after sinking, which doesn't >>>> seem to happen until the next DOM pass. >>> >>> Oh, ok... Aldy, why's this a separate pass anyways? I think similar >>> other warnigns are emitted from RTL expansion? So maybe we can >>> indeed move the pass towards warn_restrict or late_warn_uninit. >> >> I thought there was a preference to add new middle-end warnings >> into passes of their own rather than into existing passes. Is >> that not so (either in general or in this specific case)? > > The preference was to add them not into optimization passes. But > of course having 10+ warning passes, each going over the whole IL > is excessive. Also each of the locally computing ranges or so. > > Given the simplicity of Walloca I wonder why it's not part of another > warning pass - since it's about tracking "sizes" again there are plenty > that fit ;) -Walloca doesn't need to track object sizes in the same sense as objsize and strlen do. It just examines calls to allocation functions, same as -Walloc-larger-than. It would make sense to merge the implementation of two warnings. They don't need to run as a pass of their own. >> From my POV, the main (only?) benefit of putting warnings in their >> own passes is modularity. Are there any others? >> >> The biggest drawback I see is that it makes it hard to then share >> data across multiple passes. The sharing can help not just >> warnings (reduce both false positive and false negative rates) but >> also optimization. That's why I'm merging the strlen and sprintf >> passes, and want to eventually also look into merging >> the -Wstringop-overflow warnings there (also emitted just before >> RTL expansion. Did I miss any downsides? > > When things fit together they are fine to merge obviously. > > One may not like -Warray-bounds inside VRP but it really "fits". It fits because it has access to data that's not (yet) available outside VRP, right? It's not an ideal fit because by being in VRP it doesn't have access to allocated object sizes so out-of-bounds access to dynamically allocated objects aren't detected. To detect those, I think the only pass it could go in is strlen. So again an optimization pass. It doesn't fit there very well at all, but there is no other pass that tracks sizes of all objects (objsize does, though only in response to calls to __builtin_object_size). I think just like a general purpose API to query value ranges that cam be called from any pass, another API to query object sizes, and another to query string lengths, would be useful. > OTOH making a warning part of an optimization pass naturally > limits its effect to when the specific optimization is enabled. > In theory it's possible to do -Warray-bounds at -O0 - we are in > SSA form after all - but of course you don't want to enable VRP at -O0. True. Better warnings at -O0 would be great. The drawback would be that the warnings would then have to implement all the same analyses as the optimization passes they rely on. We'd end up having to implement a static analyzer on top of basic GIMPLE. >> I don't know if there's the -Walloca pass would benefit from merging >> with any of the others or vice versa, but superficially it seems like >> it might be worth thinking about integrating the -Walloc-larger-than >> warnings into the -Walloca pass, if only to keep similar functionality >> in the same place. > > -Wrestrict and all the format warning stuff seems related as well. I think so. All of strlen, sprintf, Wrestrict, -Wstringop-overflow and -Walloca/-Walloc-larger-than, and even objsize, would benefit from sharing the same IL traversal and the same data. As would -Warray-bounds to detect out-of-bounds accesses to dynamically allocated objects. There probably are other warnings and likely also optimizations that would benefit from closer coupling. Martin > >>> I also see that the Og pass pipeline misses the second walloca pass >>> completely (and also the warn_restrict pass). >> >> That's worth fixing. >> >> Martin >> >>> >>> Given code sinkings obvious effects on SSA value-range representation >>> it may make sense to add another instance of that pass earlier. >>> >>> >>> >>>> >>>> -- >>>> Marc Glisse >>
On Tue, May 28, 2019 at 5:34 PM Martin Sebor <msebor@gmail.com> wrote: > > On 5/21/19 3:53 AM, Richard Biener wrote: > > On Tue, May 21, 2019 at 4:13 AM Martin Sebor <msebor@gmail.com> wrote: > >> > >> On 5/20/19 3:16 AM, Richard Biener wrote: > >>> On Mon, May 20, 2019 at 10:16 AM Marc Glisse <marc.glisse@inria.fr> wrote: > >>>> > >>>> On Mon, 20 May 2019, Richard Biener wrote: > >>>> > >>>>> On Sun, May 19, 2019 at 6:16 PM Marc Glisse <marc.glisse@inria.fr> wrote: > >>>>>> > >>>>>> Hello, > >>>>>> > >>>>>> 2 pieces: > >>>>>> > >>>>>> - the first one handles the case where the denominator is negative. It > >>>>>> doesn't happen often with exact_div, so I don't handle it everywhere, but > >>>>>> this one looked trivial > >>>>>> > >>>>>> - handle the case where a pointer difference is cast to an unsigned type > >>>>>> before being compared to a constant (I hit this in std::vector). With some > >>>>>> range info we could probably handle some non-constant cases as well... > >>>>>> > >>>>>> The second piece breaks Walloca-13.c (-Walloca-larger-than=100 -O2) > >>>>>> > >>>>>> void f (void*); > >>>>>> void g (int *p, int *q) > >>>>>> { > >>>>>> __SIZE_TYPE__ n = (__SIZE_TYPE__)(p - q); > >>>>>> if (n < 100) > >>>>>> f (__builtin_alloca (n)); > >>>>>> } > >>>>>> > >>>>>> At the time of walloca2, we have > >>>>>> > >>>>>> _1 = p_5(D) - q_6(D); > >>>>>> # RANGE [-2305843009213693952, 2305843009213693951] > >>>>>> _2 = _1 /[ex] 4; > >>>>>> # RANGE ~[2305843009213693952, 16140901064495857663] > >>>>>> n_7 = (long unsigned intD.10) _2; > >>>>>> _11 = (long unsigned intD.10) _1; > >>>>>> if (_11 <= 396) > >>>>>> [...] > >>>>>> _3 = allocaD.1059 (n_7); > >>>>>> > >>>>>> and warn. > >>>>> > >>>>> That's indeed to complicated relation of _11 to n_7 for > >>>>> VRP predicate discovery. > >>>>> > >>>>>> However, DOM3 later produces > >>>>>> > >>>>>> _1 = p_5(D) - q_6(D); > >>>>>> _11 = (long unsigned intD.10) _1; > >>>>>> if (_11 <= 396) > >>>>> > >>>>> while _11 vs. _1 works fine. > >>>>> > >>>>>> [...] > >>>>>> # RANGE [0, 99] NONZERO 127 > >>>>>> _2 = _1 /[ex] 4; > >>>>>> # RANGE [0, 99] NONZERO 127 > >>>>>> n_7 = (long unsigned intD.10) _2; > >>>>>> _3 = allocaD.1059 (n_7); > >>>>>> > >>>>>> so I am tempted to say that the walloca2 pass is too early, xfail the > >>>>>> testcase and file an issue... > >>>>> > >>>>> Hmm, there's a DOM pass before walloca2 already and moving > >>>>> walloca2 after loop opts doesn't look like the best thing to do? > >>>>> I suppose it's not DOM but sinking that does the important transform > >>>>> here? That is, > >>>>> > >>>>> Index: gcc/passes.def > >>>>> =================================================================== > >>>>> --- gcc/passes.def (revision 271395) > >>>>> +++ gcc/passes.def (working copy) > >>>>> @@ -241,9 +241,9 @@ along with GCC; see the file COPYING3. > >>>>> NEXT_PASS (pass_optimize_bswap); > >>>>> NEXT_PASS (pass_laddress); > >>>>> NEXT_PASS (pass_lim); > >>>>> - NEXT_PASS (pass_walloca, false); > >>>>> NEXT_PASS (pass_pre); > >>>>> NEXT_PASS (pass_sink_code); > >>>>> + NEXT_PASS (pass_walloca, false); > >>>>> NEXT_PASS (pass_sancov); > >>>>> NEXT_PASS (pass_asan); > >>>>> NEXT_PASS (pass_tsan); > >>>>> > >>>>> fixes it? > >>>> > >>>> I will check, but I don't think walloca uses any kind of on-demand VRP, so > >>>> we still need some pass to update the ranges after sinking, which doesn't > >>>> seem to happen until the next DOM pass. > >>> > >>> Oh, ok... Aldy, why's this a separate pass anyways? I think similar > >>> other warnigns are emitted from RTL expansion? So maybe we can > >>> indeed move the pass towards warn_restrict or late_warn_uninit. > >> > >> I thought there was a preference to add new middle-end warnings > >> into passes of their own rather than into existing passes. Is > >> that not so (either in general or in this specific case)? > > > > The preference was to add them not into optimization passes. But > > of course having 10+ warning passes, each going over the whole IL > > is excessive. Also each of the locally computing ranges or so. > > > > Given the simplicity of Walloca I wonder why it's not part of another > > warning pass - since it's about tracking "sizes" again there are plenty > > that fit ;) > > -Walloca doesn't need to track object sizes in the same sense > as objsize and strlen do. It just examines calls to allocation > functions, same as -Walloc-larger-than. It would make sense to > merge the implementation of two warnings. They don't need to > run as a pass of their own. > > >> From my POV, the main (only?) benefit of putting warnings in their > >> own passes is modularity. Are there any others? > >> > >> The biggest drawback I see is that it makes it hard to then share > >> data across multiple passes. The sharing can help not just > >> warnings (reduce both false positive and false negative rates) but > >> also optimization. That's why I'm merging the strlen and sprintf > >> passes, and want to eventually also look into merging > >> the -Wstringop-overflow warnings there (also emitted just before > >> RTL expansion. Did I miss any downsides? > > > > When things fit together they are fine to merge obviously. > > > > One may not like -Warray-bounds inside VRP but it really "fits". > > It fits because it has access to data that's not (yet) available > outside VRP, right? It's not an ideal fit because by being in VRP > it doesn't have access to allocated object sizes so out-of-bounds > access to dynamically allocated objects aren't detected. To detect > those, I think the only pass it could go in is strlen. So again > an optimization pass. It doesn't fit there very well at all, but > there is no other pass that tracks sizes of all objects (objsize > does, though only in response to calls to __builtin_object_size). All true - now that we have warning passes computing range-information it definitely fits better there. At the time we added the warning there wasn't even per-SSA name range information. So - if you want to move it I'd applaud such effort! > I think just like a general purpose API to query value ranges that > cam be called from any pass, another API to query object sizes, and > another to query string lengths, would be useful. > > > OTOH making a warning part of an optimization pass naturally > > limits its effect to when the specific optimization is enabled. > > In theory it's possible to do -Warray-bounds at -O0 - we are in > > SSA form after all - but of course you don't want to enable VRP at -O0. > > True. Better warnings at -O0 would be great. The drawback would > be that the warnings would then have to implement all the same > analyses as the optimization passes they rely on. We'd end up > having to implement a static analyzer on top of basic GIMPLE. > > >> I don't know if there's the -Walloca pass would benefit from merging > >> with any of the others or vice versa, but superficially it seems like > >> it might be worth thinking about integrating the -Walloc-larger-than > >> warnings into the -Walloca pass, if only to keep similar functionality > >> in the same place. > > > > -Wrestrict and all the format warning stuff seems related as well. > > I think so. All of strlen, sprintf, Wrestrict, -Wstringop-overflow > and -Walloca/-Walloc-larger-than, and even objsize, would benefit > from sharing the same IL traversal and the same data. As would > -Warray-bounds to detect out-of-bounds accesses to dynamically > allocated objects. There probably are other warnings and likely > also optimizations that would benefit from closer coupling. > > Martin > > > > > >>> I also see that the Og pass pipeline misses the second walloca pass > >>> completely (and also the warn_restrict pass). > >> > >> That's worth fixing. > >> > >> Martin > >> > >>> > >>> Given code sinkings obvious effects on SSA value-range representation > >>> it may make sense to add another instance of that pass earlier. > >>> > >>> > >>> > >>>> > >>>> -- > >>>> Marc Glisse > >> >
On 5/29/19 1:21 AM, Richard Biener wrote: > On Tue, May 28, 2019 at 5:34 PM Martin Sebor <msebor@gmail.com> wrote: >> >> On 5/21/19 3:53 AM, Richard Biener wrote: >>> On Tue, May 21, 2019 at 4:13 AM Martin Sebor <msebor@gmail.com> wrote: >>>> >>>> On 5/20/19 3:16 AM, Richard Biener wrote: >>>>> On Mon, May 20, 2019 at 10:16 AM Marc Glisse <marc.glisse@inria.fr> wrote: >>>>>> >>>>>> On Mon, 20 May 2019, Richard Biener wrote: >>>>>> >>>>>>> On Sun, May 19, 2019 at 6:16 PM Marc Glisse <marc.glisse@inria.fr> wrote: >>>>>>>> >>>>>>>> Hello, >>>>>>>> >>>>>>>> 2 pieces: >>>>>>>> >>>>>>>> - the first one handles the case where the denominator is negative. It >>>>>>>> doesn't happen often with exact_div, so I don't handle it everywhere, but >>>>>>>> this one looked trivial >>>>>>>> >>>>>>>> - handle the case where a pointer difference is cast to an unsigned type >>>>>>>> before being compared to a constant (I hit this in std::vector). With some >>>>>>>> range info we could probably handle some non-constant cases as well... >>>>>>>> >>>>>>>> The second piece breaks Walloca-13.c (-Walloca-larger-than=100 -O2) >>>>>>>> >>>>>>>> void f (void*); >>>>>>>> void g (int *p, int *q) >>>>>>>> { >>>>>>>> __SIZE_TYPE__ n = (__SIZE_TYPE__)(p - q); >>>>>>>> if (n < 100) >>>>>>>> f (__builtin_alloca (n)); >>>>>>>> } >>>>>>>> >>>>>>>> At the time of walloca2, we have >>>>>>>> >>>>>>>> _1 = p_5(D) - q_6(D); >>>>>>>> # RANGE [-2305843009213693952, 2305843009213693951] >>>>>>>> _2 = _1 /[ex] 4; >>>>>>>> # RANGE ~[2305843009213693952, 16140901064495857663] >>>>>>>> n_7 = (long unsigned intD.10) _2; >>>>>>>> _11 = (long unsigned intD.10) _1; >>>>>>>> if (_11 <= 396) >>>>>>>> [...] >>>>>>>> _3 = allocaD.1059 (n_7); >>>>>>>> >>>>>>>> and warn. >>>>>>> >>>>>>> That's indeed to complicated relation of _11 to n_7 for >>>>>>> VRP predicate discovery. >>>>>>> >>>>>>>> However, DOM3 later produces >>>>>>>> >>>>>>>> _1 = p_5(D) - q_6(D); >>>>>>>> _11 = (long unsigned intD.10) _1; >>>>>>>> if (_11 <= 396) >>>>>>> >>>>>>> while _11 vs. _1 works fine. >>>>>>> >>>>>>>> [...] >>>>>>>> # RANGE [0, 99] NONZERO 127 >>>>>>>> _2 = _1 /[ex] 4; >>>>>>>> # RANGE [0, 99] NONZERO 127 >>>>>>>> n_7 = (long unsigned intD.10) _2; >>>>>>>> _3 = allocaD.1059 (n_7); >>>>>>>> >>>>>>>> so I am tempted to say that the walloca2 pass is too early, xfail the >>>>>>>> testcase and file an issue... >>>>>>> >>>>>>> Hmm, there's a DOM pass before walloca2 already and moving >>>>>>> walloca2 after loop opts doesn't look like the best thing to do? >>>>>>> I suppose it's not DOM but sinking that does the important transform >>>>>>> here? That is, >>>>>>> >>>>>>> Index: gcc/passes.def >>>>>>> =================================================================== >>>>>>> --- gcc/passes.def (revision 271395) >>>>>>> +++ gcc/passes.def (working copy) >>>>>>> @@ -241,9 +241,9 @@ along with GCC; see the file COPYING3. >>>>>>> NEXT_PASS (pass_optimize_bswap); >>>>>>> NEXT_PASS (pass_laddress); >>>>>>> NEXT_PASS (pass_lim); >>>>>>> - NEXT_PASS (pass_walloca, false); >>>>>>> NEXT_PASS (pass_pre); >>>>>>> NEXT_PASS (pass_sink_code); >>>>>>> + NEXT_PASS (pass_walloca, false); >>>>>>> NEXT_PASS (pass_sancov); >>>>>>> NEXT_PASS (pass_asan); >>>>>>> NEXT_PASS (pass_tsan); >>>>>>> >>>>>>> fixes it? >>>>>> >>>>>> I will check, but I don't think walloca uses any kind of on-demand VRP, so >>>>>> we still need some pass to update the ranges after sinking, which doesn't >>>>>> seem to happen until the next DOM pass. >>>>> >>>>> Oh, ok... Aldy, why's this a separate pass anyways? I think similar >>>>> other warnigns are emitted from RTL expansion? So maybe we can >>>>> indeed move the pass towards warn_restrict or late_warn_uninit. >>>> >>>> I thought there was a preference to add new middle-end warnings >>>> into passes of their own rather than into existing passes. Is >>>> that not so (either in general or in this specific case)? >>> >>> The preference was to add them not into optimization passes. But >>> of course having 10+ warning passes, each going over the whole IL >>> is excessive. Also each of the locally computing ranges or so. >>> >>> Given the simplicity of Walloca I wonder why it's not part of another >>> warning pass - since it's about tracking "sizes" again there are plenty >>> that fit ;) >> >> -Walloca doesn't need to track object sizes in the same sense >> as objsize and strlen do. It just examines calls to allocation >> functions, same as -Walloc-larger-than. It would make sense to >> merge the implementation of two warnings. They don't need to >> run as a pass of their own. >> >>>> From my POV, the main (only?) benefit of putting warnings in their >>>> own passes is modularity. Are there any others? >>>> >>>> The biggest drawback I see is that it makes it hard to then share >>>> data across multiple passes. The sharing can help not just >>>> warnings (reduce both false positive and false negative rates) but >>>> also optimization. That's why I'm merging the strlen and sprintf >>>> passes, and want to eventually also look into merging >>>> the -Wstringop-overflow warnings there (also emitted just before >>>> RTL expansion. Did I miss any downsides? >>> >>> When things fit together they are fine to merge obviously. >>> >>> One may not like -Warray-bounds inside VRP but it really "fits". >> >> It fits because it has access to data that's not (yet) available >> outside VRP, right? It's not an ideal fit because by being in VRP >> it doesn't have access to allocated object sizes so out-of-bounds >> access to dynamically allocated objects aren't detected. To detect >> those, I think the only pass it could go in is strlen. So again >> an optimization pass. It doesn't fit there very well at all, but >> there is no other pass that tracks sizes of all objects (objsize >> does, though only in response to calls to __builtin_object_size). > > All true - now that we have warning passes computing range-information > it definitely fits better there. At the time we added the warning there > wasn't even per-SSA name range information. Right. Global ranges attached to SSA_NAMEs is a relatively recent capability and one that I believe we'll continue to find uses for. WRT out of bounds array indexing. What we do right now in VRP is so lame it makes me want to cry. Essentially we only warn if the entire range of the index is out of bounds. We can and should be able to do better. Global ranges aren't the solution for this problem. They're too imprecise, particularly after join points. Living inside VRP is problematical, not just because it's gross, but because I think we should be turning out of bounds array references into traps (after warning of course). In fact, it was problems in this space that were the prime motivation behind the ranger project :-) A common idiom I've seen during analysis is to use -1 to denote an error case. It's relatively common as a result to end up with stuff like this: index = PHI (1, 2, 3, -1) y = array[index] I think we want a warning for this kind of stuff. Furthermore, I think we want to isolate the path which results in -1 being assigned to index. Once that path is isolated, we can insert a trap on the isolated path and cleanup the CFG. Jeff
On Mon, 20 May 2019, Richard Biener wrote: > On Mon, May 20, 2019 at 10:16 AM Marc Glisse <marc.glisse@inria.fr> wrote: >> >> On Mon, 20 May 2019, Richard Biener wrote: >> >>> On Sun, May 19, 2019 at 6:16 PM Marc Glisse <marc.glisse@inria.fr> wrote: >>>> >>>> Hello, >>>> >>>> 2 pieces: >>>> >>>> - the first one handles the case where the denominator is negative. It >>>> doesn't happen often with exact_div, so I don't handle it everywhere, but >>>> this one looked trivial >>>> >>>> - handle the case where a pointer difference is cast to an unsigned type >>>> before being compared to a constant (I hit this in std::vector). With some >>>> range info we could probably handle some non-constant cases as well... >>>> >>>> The second piece breaks Walloca-13.c (-Walloca-larger-than=100 -O2) >>>> >>>> void f (void*); >>>> void g (int *p, int *q) >>>> { >>>> __SIZE_TYPE__ n = (__SIZE_TYPE__)(p - q); >>>> if (n < 100) >>>> f (__builtin_alloca (n)); >>>> } >>>> >>>> At the time of walloca2, we have >>>> >>>> _1 = p_5(D) - q_6(D); >>>> # RANGE [-2305843009213693952, 2305843009213693951] >>>> _2 = _1 /[ex] 4; >>>> # RANGE ~[2305843009213693952, 16140901064495857663] >>>> n_7 = (long unsigned intD.10) _2; >>>> _11 = (long unsigned intD.10) _1; >>>> if (_11 <= 396) >>>> [...] >>>> _3 = allocaD.1059 (n_7); >>>> >>>> and warn. >>> >>> That's indeed to complicated relation of _11 to n_7 for >>> VRP predicate discovery. >>> >>>> However, DOM3 later produces >>>> >>>> _1 = p_5(D) - q_6(D); >>>> _11 = (long unsigned intD.10) _1; >>>> if (_11 <= 396) >>> >>> while _11 vs. _1 works fine. >>> >>>> [...] >>>> # RANGE [0, 99] NONZERO 127 >>>> _2 = _1 /[ex] 4; >>>> # RANGE [0, 99] NONZERO 127 >>>> n_7 = (long unsigned intD.10) _2; >>>> _3 = allocaD.1059 (n_7); >>>> >>>> so I am tempted to say that the walloca2 pass is too early, xfail the >>>> testcase and file an issue... >>> >>> Hmm, there's a DOM pass before walloca2 already and moving >>> walloca2 after loop opts doesn't look like the best thing to do? >>> I suppose it's not DOM but sinking that does the important transform >>> here? That is, >>> >>> Index: gcc/passes.def >>> =================================================================== >>> --- gcc/passes.def (revision 271395) >>> +++ gcc/passes.def (working copy) >>> @@ -241,9 +241,9 @@ along with GCC; see the file COPYING3. >>> NEXT_PASS (pass_optimize_bswap); >>> NEXT_PASS (pass_laddress); >>> NEXT_PASS (pass_lim); >>> - NEXT_PASS (pass_walloca, false); >>> NEXT_PASS (pass_pre); >>> NEXT_PASS (pass_sink_code); >>> + NEXT_PASS (pass_walloca, false); >>> NEXT_PASS (pass_sancov); >>> NEXT_PASS (pass_asan); >>> NEXT_PASS (pass_tsan); >>> >>> fixes it? >> >> I will check, but I don't think walloca uses any kind of on-demand VRP, so >> we still need some pass to update the ranges after sinking, which doesn't >> seem to happen until the next DOM pass. > > Oh, ok... Aldy, why's this a separate pass anyways? I think similar > other warnigns are emitted from RTL expansion? So maybe we can > indeed move the pass towards warn_restrict or late_warn_uninit. I tried moving it after 'sink' and that didn't help. Moving it next to warn_restrict works for this test but breaks 2 others that currently "work" by accident (+ one where the message changes between "unbounded" and "too large", it isn't clear what the difference is between those messages). My suggestion, in addition to the original patch, is * gcc.dg/Walloca-13.c: Xfail. --- Walloca-13.c (revision 271742) +++ Walloca-13.c (working copy) @@ -1,12 +1,12 @@ /* { dg-do compile } */ /* { dg-require-effective-target alloca } */ /* { dg-options "-Walloca-larger-than=100 -O2" } */ void f (void*); void g (int *p, int *q) { __SIZE_TYPE__ n = (__SIZE_TYPE__)(p - q); if (n < 100) - f (__builtin_alloca (n)); + f (__builtin_alloca (n)); // { dg-bogus "may be too large due to conversion" "" { xfail { *-*-* } } } } Is that ok? > I also see that the Og pass pipeline misses the second walloca pass > completely (and also the warn_restrict pass). > > Given code sinkings obvious effects on SSA value-range representation > it may make sense to add another instance of that pass earlier.
On 5/29/19 3:27 PM, Marc Glisse wrote: > On Mon, 20 May 2019, Richard Biener wrote: > >> On Mon, May 20, 2019 at 10:16 AM Marc Glisse <marc.glisse@inria.fr> >> wrote: >>> >>> On Mon, 20 May 2019, Richard Biener wrote: >>> >>>> On Sun, May 19, 2019 at 6:16 PM Marc Glisse <marc.glisse@inria.fr> >>>> wrote: >>>>> >>>>> Hello, >>>>> >>>>> 2 pieces: >>>>> >>>>> - the first one handles the case where the denominator is negative. It >>>>> doesn't happen often with exact_div, so I don't handle it >>>>> everywhere, but >>>>> this one looked trivial >>>>> >>>>> - handle the case where a pointer difference is cast to an unsigned >>>>> type >>>>> before being compared to a constant (I hit this in std::vector). >>>>> With some >>>>> range info we could probably handle some non-constant cases as well... >>>>> >>>>> The second piece breaks Walloca-13.c (-Walloca-larger-than=100 -O2) >>>>> >>>>> void f (void*); >>>>> void g (int *p, int *q) >>>>> { >>>>> __SIZE_TYPE__ n = (__SIZE_TYPE__)(p - q); >>>>> if (n < 100) >>>>> f (__builtin_alloca (n)); >>>>> } >>>>> >>>>> At the time of walloca2, we have >>>>> >>>>> _1 = p_5(D) - q_6(D); >>>>> # RANGE [-2305843009213693952, 2305843009213693951] >>>>> _2 = _1 /[ex] 4; >>>>> # RANGE ~[2305843009213693952, 16140901064495857663] >>>>> n_7 = (long unsigned intD.10) _2; >>>>> _11 = (long unsigned intD.10) _1; >>>>> if (_11 <= 396) >>>>> [...] >>>>> _3 = allocaD.1059 (n_7); >>>>> >>>>> and warn. >>>> >>>> That's indeed to complicated relation of _11 to n_7 for >>>> VRP predicate discovery. >>>> >>>>> However, DOM3 later produces >>>>> >>>>> _1 = p_5(D) - q_6(D); >>>>> _11 = (long unsigned intD.10) _1; >>>>> if (_11 <= 396) >>>> >>>> while _11 vs. _1 works fine. >>>> >>>>> [...] >>>>> # RANGE [0, 99] NONZERO 127 >>>>> _2 = _1 /[ex] 4; >>>>> # RANGE [0, 99] NONZERO 127 >>>>> n_7 = (long unsigned intD.10) _2; >>>>> _3 = allocaD.1059 (n_7); >>>>> >>>>> so I am tempted to say that the walloca2 pass is too early, xfail the >>>>> testcase and file an issue... >>>> >>>> Hmm, there's a DOM pass before walloca2 already and moving >>>> walloca2 after loop opts doesn't look like the best thing to do? >>>> I suppose it's not DOM but sinking that does the important transform >>>> here? That is, >>>> >>>> Index: gcc/passes.def >>>> =================================================================== >>>> --- gcc/passes.def (revision 271395) >>>> +++ gcc/passes.def (working copy) >>>> @@ -241,9 +241,9 @@ along with GCC; see the file COPYING3. >>>> NEXT_PASS (pass_optimize_bswap); >>>> NEXT_PASS (pass_laddress); >>>> NEXT_PASS (pass_lim); >>>> - NEXT_PASS (pass_walloca, false); >>>> NEXT_PASS (pass_pre); >>>> NEXT_PASS (pass_sink_code); >>>> + NEXT_PASS (pass_walloca, false); >>>> NEXT_PASS (pass_sancov); >>>> NEXT_PASS (pass_asan); >>>> NEXT_PASS (pass_tsan); >>>> >>>> fixes it? >>> >>> I will check, but I don't think walloca uses any kind of on-demand >>> VRP, so >>> we still need some pass to update the ranges after sinking, which >>> doesn't >>> seem to happen until the next DOM pass. >> >> Oh, ok... Aldy, why's this a separate pass anyways? I think similar >> other warnigns are emitted from RTL expansion? So maybe we can >> indeed move the pass towards warn_restrict or late_warn_uninit. > > I tried moving it after 'sink' and that didn't help. Moving it next to > warn_restrict works for this test but breaks 2 others that currently > "work" by accident (+ one where the message changes between "unbounded" > and "too large", it isn't clear what the difference is between those > messages). So "unbounded" means there was no controlling predicate that would allow an upper bound on the size of the alloca to be determined. "too large" breaks down into two cases. One where we know it's too large, the other when it may be too large. The latter can happen due to casts and perhaps other things (controlling predicate with a dynamic upper bound maybe?) > > My suggestion, in addition to the original patch, is > > * gcc.dg/Walloca-13.c: Xfail. > > --- Walloca-13.c (revision 271742) > +++ Walloca-13.c (working copy) > @@ -1,12 +1,12 @@ > /* { dg-do compile } */ > /* { dg-require-effective-target alloca } */ > /* { dg-options "-Walloca-larger-than=100 -O2" } */ > > void f (void*); > > void g (int *p, int *q) > { > __SIZE_TYPE__ n = (__SIZE_TYPE__)(p - q); > if (n < 100) > - f (__builtin_alloca (n)); > + f (__builtin_alloca (n)); // { dg-bogus "may be too large due to > conversion" "" { xfail { *-*-* } } } > } > > Is that ok? Aldy would need to chime in -- at first glance it looks like we've lost precision here. jeff
On Wed, May 29, 2019 at 11:28 PM Marc Glisse <marc.glisse@inria.fr> wrote: > > On Mon, 20 May 2019, Richard Biener wrote: > > > On Mon, May 20, 2019 at 10:16 AM Marc Glisse <marc.glisse@inria.fr> wrote: > >> > >> On Mon, 20 May 2019, Richard Biener wrote: > >> > >>> On Sun, May 19, 2019 at 6:16 PM Marc Glisse <marc.glisse@inria.fr> wrote: > >>>> > >>>> Hello, > >>>> > >>>> 2 pieces: > >>>> > >>>> - the first one handles the case where the denominator is negative. It > >>>> doesn't happen often with exact_div, so I don't handle it everywhere, but > >>>> this one looked trivial > >>>> > >>>> - handle the case where a pointer difference is cast to an unsigned type > >>>> before being compared to a constant (I hit this in std::vector). With some > >>>> range info we could probably handle some non-constant cases as well... > >>>> > >>>> The second piece breaks Walloca-13.c (-Walloca-larger-than=100 -O2) > >>>> > >>>> void f (void*); > >>>> void g (int *p, int *q) > >>>> { > >>>> __SIZE_TYPE__ n = (__SIZE_TYPE__)(p - q); > >>>> if (n < 100) > >>>> f (__builtin_alloca (n)); > >>>> } > >>>> > >>>> At the time of walloca2, we have > >>>> > >>>> _1 = p_5(D) - q_6(D); > >>>> # RANGE [-2305843009213693952, 2305843009213693951] > >>>> _2 = _1 /[ex] 4; > >>>> # RANGE ~[2305843009213693952, 16140901064495857663] > >>>> n_7 = (long unsigned intD.10) _2; > >>>> _11 = (long unsigned intD.10) _1; > >>>> if (_11 <= 396) > >>>> [...] > >>>> _3 = allocaD.1059 (n_7); > >>>> > >>>> and warn. > >>> > >>> That's indeed to complicated relation of _11 to n_7 for > >>> VRP predicate discovery. > >>> > >>>> However, DOM3 later produces > >>>> > >>>> _1 = p_5(D) - q_6(D); > >>>> _11 = (long unsigned intD.10) _1; > >>>> if (_11 <= 396) > >>> > >>> while _11 vs. _1 works fine. > >>> > >>>> [...] > >>>> # RANGE [0, 99] NONZERO 127 > >>>> _2 = _1 /[ex] 4; > >>>> # RANGE [0, 99] NONZERO 127 > >>>> n_7 = (long unsigned intD.10) _2; > >>>> _3 = allocaD.1059 (n_7); > >>>> > >>>> so I am tempted to say that the walloca2 pass is too early, xfail the > >>>> testcase and file an issue... > >>> > >>> Hmm, there's a DOM pass before walloca2 already and moving > >>> walloca2 after loop opts doesn't look like the best thing to do? > >>> I suppose it's not DOM but sinking that does the important transform > >>> here? That is, > >>> > >>> Index: gcc/passes.def > >>> =================================================================== > >>> --- gcc/passes.def (revision 271395) > >>> +++ gcc/passes.def (working copy) > >>> @@ -241,9 +241,9 @@ along with GCC; see the file COPYING3. > >>> NEXT_PASS (pass_optimize_bswap); > >>> NEXT_PASS (pass_laddress); > >>> NEXT_PASS (pass_lim); > >>> - NEXT_PASS (pass_walloca, false); > >>> NEXT_PASS (pass_pre); > >>> NEXT_PASS (pass_sink_code); > >>> + NEXT_PASS (pass_walloca, false); > >>> NEXT_PASS (pass_sancov); > >>> NEXT_PASS (pass_asan); > >>> NEXT_PASS (pass_tsan); > >>> > >>> fixes it? > >> > >> I will check, but I don't think walloca uses any kind of on-demand VRP, so > >> we still need some pass to update the ranges after sinking, which doesn't > >> seem to happen until the next DOM pass. > > > > Oh, ok... Aldy, why's this a separate pass anyways? I think similar > > other warnigns are emitted from RTL expansion? So maybe we can > > indeed move the pass towards warn_restrict or late_warn_uninit. > > I tried moving it after 'sink' and that didn't help. Moving it next to > warn_restrict works for this test but breaks 2 others that currently > "work" by accident (+ one where the message changes between "unbounded" > and "too large", it isn't clear what the difference is between those > messages). > > My suggestion, in addition to the original patch, is > > * gcc.dg/Walloca-13.c: Xfail. > > --- Walloca-13.c (revision 271742) > +++ Walloca-13.c (working copy) > @@ -1,12 +1,12 @@ > /* { dg-do compile } */ > /* { dg-require-effective-target alloca } */ > /* { dg-options "-Walloca-larger-than=100 -O2" } */ > > void f (void*); > > void g (int *p, int *q) > { > __SIZE_TYPE__ n = (__SIZE_TYPE__)(p - q); > if (n < 100) > - f (__builtin_alloca (n)); > + f (__builtin_alloca (n)); // { dg-bogus "may be too large due to conversion" "" { xfail { *-*-* } } } > } > > Is that ok? It works for me. Any pass ordering changes should probably be done as followup anyways since they might expose other issues (as you've seen...) Richard. > > I also see that the Og pass pipeline misses the second walloca pass > > completely (and also the warn_restrict pass). > > > > Given code sinkings obvious effects on SSA value-range representation > > it may make sense to add another instance of that pass earlier. > > -- > Marc Glisse
I've never been too happy with the too large due to cast warnings. For that matter, it seems like a lot of the unbounded alloca warning variants were artifacts of the way we couldn't get precise ranges after vrp asserts had disappeared and we were trying to guess at what the actual range in the original code was. It's fragile at best. I haven't been paying too much attention to walloca because the ranger gets considerably better context ranges in the ranger walloca version, and we are getting correct warnings for a variety of things we couldn't before. So I was hoping to ignore this until we all agreed on what range, vrp etc will look like going forward. That being said, I could take a closer look at this xfail on Monday if y'all would like. But I don't currently have strong opinions either way. I guess it'll all change in the next few months. Aldy On Thu, May 30, 2019, 18:26 Jeff Law <law@redhat.com> wrote: > On 5/29/19 3:27 PM, Marc Glisse wrote: > > On Mon, 20 May 2019, Richard Biener wrote: > > > >> On Mon, May 20, 2019 at 10:16 AM Marc Glisse <marc.glisse@inria.fr> > >> wrote: > >>> > >>> On Mon, 20 May 2019, Richard Biener wrote: > >>> > >>>> On Sun, May 19, 2019 at 6:16 PM Marc Glisse <marc.glisse@inria.fr> > >>>> wrote: > >>>>> > >>>>> Hello, > >>>>> > >>>>> 2 pieces: > >>>>> > >>>>> - the first one handles the case where the denominator is negative. > It > >>>>> doesn't happen often with exact_div, so I don't handle it > >>>>> everywhere, but > >>>>> this one looked trivial > >>>>> > >>>>> - handle the case where a pointer difference is cast to an unsigned > >>>>> type > >>>>> before being compared to a constant (I hit this in std::vector). > >>>>> With some > >>>>> range info we could probably handle some non-constant cases as > well... > >>>>> > >>>>> The second piece breaks Walloca-13.c (-Walloca-larger-than=100 -O2) > >>>>> > >>>>> void f (void*); > >>>>> void g (int *p, int *q) > >>>>> { > >>>>> __SIZE_TYPE__ n = (__SIZE_TYPE__)(p - q); > >>>>> if (n < 100) > >>>>> f (__builtin_alloca (n)); > >>>>> } > >>>>> > >>>>> At the time of walloca2, we have > >>>>> > >>>>> _1 = p_5(D) - q_6(D); > >>>>> # RANGE [-2305843009213693952, 2305843009213693951] > >>>>> _2 = _1 /[ex] 4; > >>>>> # RANGE ~[2305843009213693952, 16140901064495857663] > >>>>> n_7 = (long unsigned intD.10) _2; > >>>>> _11 = (long unsigned intD.10) _1; > >>>>> if (_11 <= 396) > >>>>> [...] > >>>>> _3 = allocaD.1059 (n_7); > >>>>> > >>>>> and warn. > >>>> > >>>> That's indeed to complicated relation of _11 to n_7 for > >>>> VRP predicate discovery. > >>>> > >>>>> However, DOM3 later produces > >>>>> > >>>>> _1 = p_5(D) - q_6(D); > >>>>> _11 = (long unsigned intD.10) _1; > >>>>> if (_11 <= 396) > >>>> > >>>> while _11 vs. _1 works fine. > >>>> > >>>>> [...] > >>>>> # RANGE [0, 99] NONZERO 127 > >>>>> _2 = _1 /[ex] 4; > >>>>> # RANGE [0, 99] NONZERO 127 > >>>>> n_7 = (long unsigned intD.10) _2; > >>>>> _3 = allocaD.1059 (n_7); > >>>>> > >>>>> so I am tempted to say that the walloca2 pass is too early, xfail the > >>>>> testcase and file an issue... > >>>> > >>>> Hmm, there's a DOM pass before walloca2 already and moving > >>>> walloca2 after loop opts doesn't look like the best thing to do? > >>>> I suppose it's not DOM but sinking that does the important transform > >>>> here? That is, > >>>> > >>>> Index: gcc/passes.def > >>>> =================================================================== > >>>> --- gcc/passes.def (revision 271395) > >>>> +++ gcc/passes.def (working copy) > >>>> @@ -241,9 +241,9 @@ along with GCC; see the file COPYING3. > >>>> NEXT_PASS (pass_optimize_bswap); > >>>> NEXT_PASS (pass_laddress); > >>>> NEXT_PASS (pass_lim); > >>>> - NEXT_PASS (pass_walloca, false); > >>>> NEXT_PASS (pass_pre); > >>>> NEXT_PASS (pass_sink_code); > >>>> + NEXT_PASS (pass_walloca, false); > >>>> NEXT_PASS (pass_sancov); > >>>> NEXT_PASS (pass_asan); > >>>> NEXT_PASS (pass_tsan); > >>>> > >>>> fixes it? > >>> > >>> I will check, but I don't think walloca uses any kind of on-demand > >>> VRP, so > >>> we still need some pass to update the ranges after sinking, which > >>> doesn't > >>> seem to happen until the next DOM pass. > >> > >> Oh, ok... Aldy, why's this a separate pass anyways? I think similar > >> other warnigns are emitted from RTL expansion? So maybe we can > >> indeed move the pass towards warn_restrict or late_warn_uninit. > > > > I tried moving it after 'sink' and that didn't help. Moving it next to > > warn_restrict works for this test but breaks 2 others that currently > > "work" by accident (+ one where the message changes between "unbounded" > > and "too large", it isn't clear what the difference is between those > > messages). > So "unbounded" means there was no controlling predicate that would allow > an upper bound on the size of the alloca to be determined. > > "too large" breaks down into two cases. One where we know it's too > large, the other when it may be too large. The latter can happen due to > casts and perhaps other things (controlling predicate with a dynamic > upper bound maybe?) > > > > > > My suggestion, in addition to the original patch, is > > > > * gcc.dg/Walloca-13.c: Xfail. > > > > --- Walloca-13.c (revision 271742) > > +++ Walloca-13.c (working copy) > > @@ -1,12 +1,12 @@ > > /* { dg-do compile } */ > > /* { dg-require-effective-target alloca } */ > > /* { dg-options "-Walloca-larger-than=100 -O2" } */ > > > > void f (void*); > > > > void g (int *p, int *q) > > { > > __SIZE_TYPE__ n = (__SIZE_TYPE__)(p - q); > > if (n < 100) > > - f (__builtin_alloca (n)); > > + f (__builtin_alloca (n)); // { dg-bogus "may be too large due to > > conversion" "" { xfail { *-*-* } } } > > } > > > > Is that ok? > Aldy would need to chime in -- at first glance it looks like we've lost > precision here. > > jeff > >
On Fri, 31 May 2019, Aldy Hernandez wrote: > I've never been too happy with the too large due to cast warnings. For that > matter, it seems like a lot of the unbounded alloca warning variants were > artifacts of the way we couldn't get precise ranges after vrp asserts had > disappeared and we were trying to guess at what the actual range in the > original code was. It's fragile at best. Yes, very fragile. > I haven't been paying too much attention to walloca because the ranger gets > considerably better context ranges in the ranger walloca version, and we > are getting correct warnings for a variety of things we couldn't before. So > I was hoping to ignore this until we all agreed on what range, vrp etc will > look like going forward. Seems sensible. > That being said, I could take a closer look at this xfail on Monday if > y'all would like. But I don't currently have strong opinions either way. I > guess it'll all change in the next few months. As long as you are ok with one Walloca testcase being xfailed until the VRP work lands, I don't think there is a need to spend time on it now.
Sure. No problem. Thanks for looking at this. Aldy On Fri, May 31, 2019, 17:48 Marc Glisse <marc.glisse@inria.fr> wrote: > On Fri, 31 May 2019, Aldy Hernandez wrote: > > > I've never been too happy with the too large due to cast warnings. For > that > > matter, it seems like a lot of the unbounded alloca warning variants were > > artifacts of the way we couldn't get precise ranges after vrp asserts had > > disappeared and we were trying to guess at what the actual range in the > > original code was. It's fragile at best. > > Yes, very fragile. > > > I haven't been paying too much attention to walloca because the ranger > gets > > considerably better context ranges in the ranger walloca version, and we > > are getting correct warnings for a variety of things we couldn't before. > So > > I was hoping to ignore this until we all agreed on what range, vrp etc > will > > look like going forward. > > Seems sensible. > > > That being said, I could take a closer look at this xfail on Monday if > > y'all would like. But I don't currently have strong opinions either way. > I > > guess it'll all change in the next few months. > > As long as you are ok with one Walloca testcase being xfailed until the > VRP work lands, I don't think there is a need to spend time on it now. > > -- > Marc Glisse >
Index: gcc/match.pd =================================================================== --- gcc/match.pd (revision 271376) +++ gcc/match.pd (working copy) @@ -1490,21 +1490,23 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) && (wi::to_wide (@2) == wi::max_value (TYPE_PRECISION (TREE_TYPE (@0)), SIGNED) - 1)) (with { tree stype = signed_type_for (TREE_TYPE (@0)); } (icmp (convert:stype @0) { build_int_cst (stype, 0); }))))) /* X / 4 < Y / 4 iff X < Y when the division is known to be exact. */ (for cmp (simple_comparison) (simplify (cmp (exact_div @0 INTEGER_CST@2) (exact_div @1 @2)) (if (wi::gt_p (wi::to_wide (@2), 0, TYPE_SIGN (TREE_TYPE (@2)))) - (cmp @0 @1)))) + (cmp @0 @1) + (if (wi::lt_p (wi::to_wide (@2), 0, TYPE_SIGN (TREE_TYPE (@2)))) + (cmp @1 @0))))) /* X / C1 op C2 into a simple range test. */ (for cmp (simple_comparison) (simplify (cmp (trunc_div:s @0 INTEGER_CST@1) INTEGER_CST@2) (if (INTEGRAL_TYPE_P (TREE_TYPE (@0)) && integer_nonzerop (@1) && !TREE_OVERFLOW (@1) && !TREE_OVERFLOW (@2)) (with { tree lo, hi; bool neg_overflow; @@ -3626,20 +3634,47 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) wi::overflow_type ovf; wide_int prod = wi::mul (wi::to_wide (@2), wi::to_wide (@1), TYPE_SIGN (TREE_TYPE (@1)), &ovf); } (if (ovf) { constant_boolean_node (wi::lt_p (wi::to_wide (@2), 0, TYPE_SIGN (TREE_TYPE (@2))) != (cmp == LT_EXPR || cmp == LE_EXPR), type); } (cmp @0 { wide_int_to_tree (TREE_TYPE (@0), prod); })))))) +/* Fold (size_t)(A /[ex] B) CMP C to (size_t)A CMP (size_t)B * C or A CMP' 0. + + For small C (less than max/B), this is (size_t)A CMP (size_t)B * C. + For large C (more than min/B+2^size), this is also true, with the + multiplication computed modulo 2^size. + For intermediate C, this just tests the sign of A. */ +(for cmp (lt le gt ge) + cmp2 (ge ge lt lt) + (simplify + (cmp (convert (exact_div @0 INTEGER_CST@1)) INTEGER_CST@2) + (if (tree_nop_conversion_p (TREE_TYPE (@0), TREE_TYPE (@2)) + && TYPE_UNSIGNED (TREE_TYPE (@2)) && !TYPE_UNSIGNED (TREE_TYPE (@0)) + && wi::gt_p (wi::to_wide (@1), 0, TYPE_SIGN (TREE_TYPE (@1)))) + (with + { + tree utype = TREE_TYPE (@2); + wide_int denom = wi::to_wide (@1); + wide_int right = wi::to_wide (@2); + wide_int smax = wi::sdiv_trunc (wi::max_value (TREE_TYPE (@0)), denom); + wide_int smin = wi::sdiv_trunc (wi::min_value (TREE_TYPE (@0)), denom); + bool small = wi::leu_p (right, smax); + bool large = wi::geu_p (right, smin); + } + (if (small || large) + (cmp (convert:utype @0) (mult @2 (convert @1))) + (cmp2 @0 { build_zero_cst (TREE_TYPE (@0)); })))))) + /* Unordered tests if either argument is a NaN. */ (simplify (bit_ior (unordered @0 @0) (unordered @1 @1)) (if (types_match (@0, @1)) (unordered @0 @1))) (simplify (bit_and (ordered @0 @0) (ordered @1 @1)) (if (types_match (@0, @1)) (ordered @0 @1))) (simplify Index: gcc/testsuite/gcc.dg/tree-ssa/cmpexactdiv-3.c =================================================================== --- gcc/testsuite/gcc.dg/tree-ssa/cmpexactdiv-3.c (nonexistent) +++ gcc/testsuite/gcc.dg/tree-ssa/cmpexactdiv-3.c (working copy) @@ -0,0 +1,10 @@ +/* { dg-do compile } */ +/* { dg-options "-O -fdump-tree-optimized-raw" } */ + +int f(int*a,int*b){ + if(sizeof(__SIZE_TYPE__)!=sizeof(__PTRDIFF_TYPE__)) return -1; + __SIZE_TYPE__ d = b - a; + return d >= 5; +} + +/* { dg-final { scan-tree-dump-not "exact_div_expr" "optimized" } } */ Index: gcc/testsuite/gcc.dg/tree-ssa/cmpexactdiv-4.c =================================================================== --- gcc/testsuite/gcc.dg/tree-ssa/cmpexactdiv-4.c (nonexistent) +++ gcc/testsuite/gcc.dg/tree-ssa/cmpexactdiv-4.c (working copy) @@ -0,0 +1,10 @@ +/* { dg-do compile } */ +/* { dg-options "-O -fdump-tree-optimized-raw" } */ + +int f(int*a,int*b,int*c){ + __PTRDIFF_TYPE__ x = -(b - a); + __PTRDIFF_TYPE__ y = -(c - a); + return x < y; +} + +/* { dg-final { scan-tree-dump-not "exact_div_expr" "optimized" } } */