Message ID | ZLUZbRiErCZ1pnYK@kam.mff.cuni.cz |
---|---|
State | New |
Headers | show |
Series | Fix optimize_mask_stores profile update | expand |
On Mon, Jul 17, 2023 at 12:36 PM Jan Hubicka via Gcc-patches <gcc-patches@gcc.gnu.org> wrote: > > Hi, > While looking into sphinx3 regression I noticed that vectorizer produces > BBs with overall probability count 120%. This patch fixes it. > Richi, I don't know how to create a testcase, but having one would > be nice. > > Bootstrapped/regtested x86_64-linux, commited last night (sorry for > late email) This should trigger with sth like for (i) if (cond[i]) out[i] = 1.; so a masked store and then using AVX2+. ISTR we disable AVX masked stores on zen (but not AVX512). Richard. > gcc/ChangeLog: > > PR tree-optimization/110649 > * tree-vect-loop.cc (optimize_mask_stores): Set correctly > probability of the if-then-else construct. > > diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc > index 7d917bfd72c..b44fb9c7712 100644 > --- a/gcc/tree-vect-loop.cc > +++ b/gcc/tree-vect-loop.cc > @@ -11680,6 +11679,7 @@ optimize_mask_stores (class loop *loop) > efalse = make_edge (bb, store_bb, EDGE_FALSE_VALUE); > /* Put STORE_BB to likely part. */ > efalse->probability = profile_probability::unlikely (); > + e->probability = efalse->probability.invert (); > store_bb->count = efalse->count (); isn't the count also wrong? Or rather efalse should be likely(). We're testing doing if (!mask all zeros) masked-store because a masked store with all zero mask can end up invoking COW page fault handling multiple times (because it doesn't actually write). Note -Ofast allows store data races and thus does RMW instead of a masked store. > make_single_succ_edge (store_bb, join_bb, EDGE_FALLTHRU); > if (dom_info_available_p (CDI_DOMINATORS))
> On Mon, Jul 17, 2023 at 12:36 PM Jan Hubicka via Gcc-patches > <gcc-patches@gcc.gnu.org> wrote: > > > > Hi, > > While looking into sphinx3 regression I noticed that vectorizer produces > > BBs with overall probability count 120%. This patch fixes it. > > Richi, I don't know how to create a testcase, but having one would > > be nice. > > > > Bootstrapped/regtested x86_64-linux, commited last night (sorry for > > late email) > > This should trigger with sth like > > for (i) > if (cond[i]) > out[i] = 1.; > > so a masked store and then using AVX2+. ISTR we disable AVX masked > stores on zen (but not AVX512). OK, let me see if I can get a testcase out of that. > > efalse = make_edge (bb, store_bb, EDGE_FALSE_VALUE); > > /* Put STORE_BB to likely part. */ > > efalse->probability = profile_probability::unlikely (); > > + e->probability = efalse->probability.invert (); > > store_bb->count = efalse->count (); > > isn't the count also wrong? Or rather efalse should be likely(). We're > testing doing > > if (!mask all zeros) > masked-store > > because a masked store with all zero mask can end up invoking COW page fault > handling multiple times (because it doesn't actually write). Hmm, I only fixed the profile, efalse was already set to unlikely, but indeed I think it should be likely. Maybe we can compute some bound on actual probability by knowing if(cond[i]) probability. If the loop always does factor many ones or zeros, the probability would remain the same. If that is p and they are all independent, the outcome would be (1-p)^factor sp we know the conditoinal shoul dbe in ragne (1-p)^factor....(1-p), right? Honza > > Note -Ofast allows store data races and thus does RMW instead of a masked store. > > > make_single_succ_edge (store_bb, join_bb, EDGE_FALLTHRU); > > if (dom_info_available_p (CDI_DOMINATORS))
> Am 17.07.2023 um 14:38 schrieb Jan Hubicka <hubicka@ucw.cz>: > > >> >>> On Mon, Jul 17, 2023 at 12:36 PM Jan Hubicka via Gcc-patches >>> <gcc-patches@gcc.gnu.org> wrote: >>> >>> Hi, >>> While looking into sphinx3 regression I noticed that vectorizer produces >>> BBs with overall probability count 120%. This patch fixes it. >>> Richi, I don't know how to create a testcase, but having one would >>> be nice. >>> >>> Bootstrapped/regtested x86_64-linux, commited last night (sorry for >>> late email) >> >> This should trigger with sth like >> >> for (i) >> if (cond[i]) >> out[i] = 1.; >> >> so a masked store and then using AVX2+. ISTR we disable AVX masked >> stores on zen (but not AVX512). > > OK, let me see if I can get a testcase out of that. >>> efalse = make_edge (bb, store_bb, EDGE_FALSE_VALUE); >>> /* Put STORE_BB to likely part. */ >>> efalse->probability = profile_probability::unlikely (); >>> + e->probability = efalse->probability.invert (); >>> store_bb->count = efalse->count (); >> >> isn't the count also wrong? Or rather efalse should be likely(). We're >> testing doing >> >> if (!mask all zeros) >> masked-store >> >> because a masked store with all zero mask can end up invoking COW page fault >> handling multiple times (because it doesn't actually write). > > Hmm, I only fixed the profile, efalse was already set to unlikely, but > indeed I think it should be likely. Maybe we can compute some bound on > actual probability by knowing if(cond[i]) probability. > If the loop always does factor many ones or zeros, the probability would > remain the same. > If that is p and they are all independent, the outcome would be > (1-p)^factor > > sp we know the conditoinal shoul dbe in ragne (1-p)^factor....(1-p), > right? Yes. I think the heuristic was added for The case of bigger ranges with all 0/1 for Purely random one wouldn’t expect all zeros ever in practice. Maybe the probability was also set with that special case in mind (which is of course broken) Richard > Honza > >> >> Note -Ofast allows store data races and thus does RMW instead of a masked store. >> >>> make_single_succ_edge (store_bb, join_bb, EDGE_FALLTHRU); >>> if (dom_info_available_p (CDI_DOMINATORS))
> On Mon, Jul 17, 2023 at 12:36 PM Jan Hubicka via Gcc-patches > <gcc-patches@gcc.gnu.org> wrote: > > > > Hi, > > While looking into sphinx3 regression I noticed that vectorizer produces > > BBs with overall probability count 120%. This patch fixes it. > > Richi, I don't know how to create a testcase, but having one would > > be nice. > > > > Bootstrapped/regtested x86_64-linux, commited last night (sorry for > > late email) > > This should trigger with sth like > > for (i) > if (cond[i]) > out[i] = 1.; > > so a masked store and then using AVX2+. ISTR we disable AVX masked > stores on zen (but not AVX512). Richard, if we know probability of if (cond[i]) to be p, then we know that the combined conditional is somewhere between low = p (the strategy packing true and falses into VF sized blocks) and high = min (p*vf,1) (the stragegy doing only one true per block if possible) Likely value is likely = 1-pow(1-p, vf) I wonder if we can work out p at least in common cases. Making store unlikely as we do right now will place it offline with extra jump. Making it likely is better unless p is very small. I think if p is close to 0 or 1 which may be common case the analysis above may be useful. If range [low...high] is small, we can use likely and keep it as reliable. If it is high, we can probably just end up with guessed value close but above 50% so the store stays inline. Honza
On Fri, Jul 21, 2023 at 7:34 PM Jan Hubicka <hubicka@ucw.cz> wrote: > > > On Mon, Jul 17, 2023 at 12:36 PM Jan Hubicka via Gcc-patches > > <gcc-patches@gcc.gnu.org> wrote: > > > > > > Hi, > > > While looking into sphinx3 regression I noticed that vectorizer produces > > > BBs with overall probability count 120%. This patch fixes it. > > > Richi, I don't know how to create a testcase, but having one would > > > be nice. > > > > > > Bootstrapped/regtested x86_64-linux, commited last night (sorry for > > > late email) > > > > This should trigger with sth like > > > > for (i) > > if (cond[i]) > > out[i] = 1.; > > > > so a masked store and then using AVX2+. ISTR we disable AVX masked > > stores on zen (but not AVX512). > > Richard, > if we know probability of if (cond[i]) to be p, > then we know that the combined conditional is somewhere between > low = p (the strategy packing true and falses into VF sized > blocks) > and > high = min (p*vf,1) > (the stragegy doing only one true per block if possible) > Likely value is > > likely = 1-pow(1-p, vf) > > I wonder if we can work out p at least in common cases. > Making store unlikely as we do right now will place it offline with > extra jump. Making it likely is better unless p is very small. > > I think if p is close to 0 or 1 which may be common case the analysis > above may be useful. If range [low...high] is small, we can use likely > and keep it as reliable. > If it is high, we can probably just end up with guessed value close but > above 50% so the store stays inline. I'd say we want to keep the store inline in all cases (we likely lost any explicit profile info during if-conversion), not sure what we gain with providing a better "guess" here. So I think we should simply go with 'likely' derived from statistical independent events. If in future we can tie if-conversion and vectorization even closer we might be able to preserve profile data here. Richard. > > Honza
diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc index 7d917bfd72c..b44fb9c7712 100644 --- a/gcc/tree-vect-loop.cc +++ b/gcc/tree-vect-loop.cc @@ -11680,6 +11679,7 @@ optimize_mask_stores (class loop *loop) efalse = make_edge (bb, store_bb, EDGE_FALSE_VALUE); /* Put STORE_BB to likely part. */ efalse->probability = profile_probability::unlikely (); + e->probability = efalse->probability.invert (); store_bb->count = efalse->count (); make_single_succ_edge (store_bb, join_bb, EDGE_FALLTHRU); if (dom_info_available_p (CDI_DOMINATORS))