Message ID | 55352944.8070109@arm.com |
---|---|
State | New |
Headers | show |
Ping. https://gcc.gnu.org/ml/gcc-patches/2015-04/msg01047.html Thanks, Kyrill On 20/04/15 17:28, Kyrill Tkachov wrote: > Hi all, > > A pet project of mine is to get to the point where backend rtx costs functions won't have > to handle rtxes that don't match down to any patterns/expanders we have. Or at least limit such cases. > A case dealt with in this patch is QImode PLUS. We don't actually generate or handle these anywhere in > the arm backend *except* in sync.md where, for example, atomic_<sync_optab><mode> matches: > (set (match_operand:QHSD 0 "mem_noofs_operand" "+Ua") > (unspec_volatile:QHSD > [(syncop:QHSD (match_dup 0) > (match_operand:QHSD 1 "<atomic_op_operand>" "<atomic_op_str>")) > (match_operand:SI 2 "const_int_operand")] ;; model > VUNSPEC_ATOMIC_OP)) > > Here QHSD can contain QImode and HImode while syncop can be PLUS. > Now immediately during splitting in arm_split_atomic_op we convert that > QImode PLUS into an SImode one, so we never actually generate any kind of QImode add operations > (how would we? we don't have define_insns for such things) but the RTL optimisers will get a hold > of the UNSPEC_VOLATILE in the meantime and ask for it's cost (for example, cse when building libatomic). > Currently we don't handle UNSPEC_VOLATILE (VUNSPEC_ATOMIC_OP) so the arm rtx costs function just recurses > into the QImode PLUS that I'd like to avoid. > This patch stops that by passing the VUNSPEC_ATOMIC_OP into arm_unspec_cost and handling it there > (very straightforwardly just returning COSTS_N_INSNS (2); there's no indication that we want to do anything > smarter here) and stopping the recursion. > > This is a small step in the direction of not having to care about obviously useless rtxes in the backend. > The astute reader might notice that in sync.md we also have the pattern atomic_fetch_<sync_optab><mode> > which expands to/matches this: > (set (match_operand:QHSD 0 "s_register_operand" "=&r") > (match_operand:QHSD 1 "mem_noofs_operand" "+Ua")) > (set (match_dup 1) > (unspec_volatile:QHSD > [(syncop:QHSD (match_dup 1) > (match_operand:QHSD 2 "<atomic_op_operand>" "<atomic_op_str>")) > (match_operand:SI 3 "const_int_operand")] ;; model > VUNSPEC_ATOMIC_OP)) > > > Here the QImode PLUS is in a PARALLEL together with the UNSPEC, so it might have rtx costs called on it > as well. This will always be a (plus (reg) (mem)) rtx, which is unlike any other normal rtx we generate > in the arm backend. I'll try to get a patch to handle that case, but I'm still thinking on how to best > do that. > > Tested arm-none-eabi, I didn't see any codegen differences in some compiled codebases. > > Ok for trunk? > > P.S. I know that expmed creates all kinds of irregular rtxes and asks for their costs. I'm hoping to clean that > up at some point... > > 2015-04-20 Kyrylo Tkachov <kyrylo.tkachov@arm.com> > > * config/arm/arm.c (arm_new_rtx_costs): Handle UNSPEC_VOLATILE. > (arm_unspec_cost): Allos UNSPEC_VOLATILE. Do not recurse inside > unknown unspecs.
Ping^2. Thanks, Kyrill On 30/04/15 13:01, Kyrill Tkachov wrote: > Ping. > https://gcc.gnu.org/ml/gcc-patches/2015-04/msg01047.html > > Thanks, > Kyrill > > On 20/04/15 17:28, Kyrill Tkachov wrote: >> Hi all, >> >> A pet project of mine is to get to the point where backend rtx costs functions won't have >> to handle rtxes that don't match down to any patterns/expanders we have. Or at least limit such cases. >> A case dealt with in this patch is QImode PLUS. We don't actually generate or handle these anywhere in >> the arm backend *except* in sync.md where, for example, atomic_<sync_optab><mode> matches: >> (set (match_operand:QHSD 0 "mem_noofs_operand" "+Ua") >> (unspec_volatile:QHSD >> [(syncop:QHSD (match_dup 0) >> (match_operand:QHSD 1 "<atomic_op_operand>" "<atomic_op_str>")) >> (match_operand:SI 2 "const_int_operand")] ;; model >> VUNSPEC_ATOMIC_OP)) >> >> Here QHSD can contain QImode and HImode while syncop can be PLUS. >> Now immediately during splitting in arm_split_atomic_op we convert that >> QImode PLUS into an SImode one, so we never actually generate any kind of QImode add operations >> (how would we? we don't have define_insns for such things) but the RTL optimisers will get a hold >> of the UNSPEC_VOLATILE in the meantime and ask for it's cost (for example, cse when building libatomic). >> Currently we don't handle UNSPEC_VOLATILE (VUNSPEC_ATOMIC_OP) so the arm rtx costs function just recurses >> into the QImode PLUS that I'd like to avoid. >> This patch stops that by passing the VUNSPEC_ATOMIC_OP into arm_unspec_cost and handling it there >> (very straightforwardly just returning COSTS_N_INSNS (2); there's no indication that we want to do anything >> smarter here) and stopping the recursion. >> >> This is a small step in the direction of not having to care about obviously useless rtxes in the backend. >> The astute reader might notice that in sync.md we also have the pattern atomic_fetch_<sync_optab><mode> >> which expands to/matches this: >> (set (match_operand:QHSD 0 "s_register_operand" "=&r") >> (match_operand:QHSD 1 "mem_noofs_operand" "+Ua")) >> (set (match_dup 1) >> (unspec_volatile:QHSD >> [(syncop:QHSD (match_dup 1) >> (match_operand:QHSD 2 "<atomic_op_operand>" "<atomic_op_str>")) >> (match_operand:SI 3 "const_int_operand")] ;; model >> VUNSPEC_ATOMIC_OP)) >> >> >> Here the QImode PLUS is in a PARALLEL together with the UNSPEC, so it might have rtx costs called on it >> as well. This will always be a (plus (reg) (mem)) rtx, which is unlike any other normal rtx we generate >> in the arm backend. I'll try to get a patch to handle that case, but I'm still thinking on how to best >> do that. >> >> Tested arm-none-eabi, I didn't see any codegen differences in some compiled codebases. >> >> Ok for trunk? >> >> P.S. I know that expmed creates all kinds of irregular rtxes and asks for their costs. I'm hoping to clean that >> up at some point... >> >> 2015-04-20 Kyrylo Tkachov <kyrylo.tkachov@arm.com> >> >> * config/arm/arm.c (arm_new_rtx_costs): Handle UNSPEC_VOLATILE. >> (arm_unspec_cost): Allos UNSPEC_VOLATILE. Do not recurse inside >> unknown unspecs.
Ping^3. Thanks, Kyrill On 12/05/15 10:08, Kyrill Tkachov wrote: > Ping^2. > > Thanks, > Kyrill > On 30/04/15 13:01, Kyrill Tkachov wrote: >> Ping. >> https://gcc.gnu.org/ml/gcc-patches/2015-04/msg01047.html >> >> Thanks, >> Kyrill >> >> On 20/04/15 17:28, Kyrill Tkachov wrote: >>> Hi all, >>> >>> A pet project of mine is to get to the point where backend rtx costs functions won't have >>> to handle rtxes that don't match down to any patterns/expanders we have. Or at least limit such cases. >>> A case dealt with in this patch is QImode PLUS. We don't actually generate or handle these anywhere in >>> the arm backend *except* in sync.md where, for example, atomic_<sync_optab><mode> matches: >>> (set (match_operand:QHSD 0 "mem_noofs_operand" "+Ua") >>> (unspec_volatile:QHSD >>> [(syncop:QHSD (match_dup 0) >>> (match_operand:QHSD 1 "<atomic_op_operand>" "<atomic_op_str>")) >>> (match_operand:SI 2 "const_int_operand")] ;; model >>> VUNSPEC_ATOMIC_OP)) >>> >>> Here QHSD can contain QImode and HImode while syncop can be PLUS. >>> Now immediately during splitting in arm_split_atomic_op we convert that >>> QImode PLUS into an SImode one, so we never actually generate any kind of QImode add operations >>> (how would we? we don't have define_insns for such things) but the RTL optimisers will get a hold >>> of the UNSPEC_VOLATILE in the meantime and ask for it's cost (for example, cse when building libatomic). >>> Currently we don't handle UNSPEC_VOLATILE (VUNSPEC_ATOMIC_OP) so the arm rtx costs function just recurses >>> into the QImode PLUS that I'd like to avoid. >>> This patch stops that by passing the VUNSPEC_ATOMIC_OP into arm_unspec_cost and handling it there >>> (very straightforwardly just returning COSTS_N_INSNS (2); there's no indication that we want to do anything >>> smarter here) and stopping the recursion. >>> >>> This is a small step in the direction of not having to care about obviously useless rtxes in the backend. >>> The astute reader might notice that in sync.md we also have the pattern atomic_fetch_<sync_optab><mode> >>> which expands to/matches this: >>> (set (match_operand:QHSD 0 "s_register_operand" "=&r") >>> (match_operand:QHSD 1 "mem_noofs_operand" "+Ua")) >>> (set (match_dup 1) >>> (unspec_volatile:QHSD >>> [(syncop:QHSD (match_dup 1) >>> (match_operand:QHSD 2 "<atomic_op_operand>" "<atomic_op_str>")) >>> (match_operand:SI 3 "const_int_operand")] ;; model >>> VUNSPEC_ATOMIC_OP)) >>> >>> >>> Here the QImode PLUS is in a PARALLEL together with the UNSPEC, so it might have rtx costs called on it >>> as well. This will always be a (plus (reg) (mem)) rtx, which is unlike any other normal rtx we generate >>> in the arm backend. I'll try to get a patch to handle that case, but I'm still thinking on how to best >>> do that. >>> >>> Tested arm-none-eabi, I didn't see any codegen differences in some compiled codebases. >>> >>> Ok for trunk? >>> >>> P.S. I know that expmed creates all kinds of irregular rtxes and asks for their costs. I'm hoping to clean that >>> up at some point... >>> >>> 2015-04-20 Kyrylo Tkachov <kyrylo.tkachov@arm.com> >>> >>> * config/arm/arm.c (arm_new_rtx_costs): Handle UNSPEC_VOLATILE. >>> (arm_unspec_cost): Allos UNSPEC_VOLATILE. Do not recurse inside >>> unknown unspecs.
On Mon, Apr 20, 2015 at 5:28 PM, Kyrill Tkachov <kyrylo.tkachov@arm.com> wrote: > Hi all, > > A pet project of mine is to get to the point where backend rtx costs > functions won't have > to handle rtxes that don't match down to any patterns/expanders we have. Or > at least limit such cases. > A case dealt with in this patch is QImode PLUS. We don't actually generate > or handle these anywhere in > the arm backend *except* in sync.md where, for example, > atomic_<sync_optab><mode> matches: > (set (match_operand:QHSD 0 "mem_noofs_operand" "+Ua") > (unspec_volatile:QHSD > [(syncop:QHSD (match_dup 0) > (match_operand:QHSD 1 "<atomic_op_operand>" "<atomic_op_str>")) > (match_operand:SI 2 "const_int_operand")] ;; model > VUNSPEC_ATOMIC_OP)) > > Here QHSD can contain QImode and HImode while syncop can be PLUS. > Now immediately during splitting in arm_split_atomic_op we convert that > QImode PLUS into an SImode one, so we never actually generate any kind of > QImode add operations > (how would we? we don't have define_insns for such things) but the RTL > optimisers will get a hold > of the UNSPEC_VOLATILE in the meantime and ask for it's cost (for example, > cse when building libatomic). > Currently we don't handle UNSPEC_VOLATILE (VUNSPEC_ATOMIC_OP) so the arm rtx > costs function just recurses > into the QImode PLUS that I'd like to avoid. > This patch stops that by passing the VUNSPEC_ATOMIC_OP into arm_unspec_cost > and handling it there > (very straightforwardly just returning COSTS_N_INSNS (2); there's no > indication that we want to do anything > smarter here) and stopping the recursion. > > This is a small step in the direction of not having to care about obviously > useless rtxes in the backend. > The astute reader might notice that in sync.md we also have the pattern > atomic_fetch_<sync_optab><mode> > which expands to/matches this: > (set (match_operand:QHSD 0 "s_register_operand" "=&r") > (match_operand:QHSD 1 "mem_noofs_operand" "+Ua")) > (set (match_dup 1) > (unspec_volatile:QHSD > [(syncop:QHSD (match_dup 1) > (match_operand:QHSD 2 "<atomic_op_operand>" "<atomic_op_str>")) > (match_operand:SI 3 "const_int_operand")] ;; model > VUNSPEC_ATOMIC_OP)) > > > Here the QImode PLUS is in a PARALLEL together with the UNSPEC, so it might > have rtx costs called on it > as well. This will always be a (plus (reg) (mem)) rtx, which is unlike any > other normal rtx we generate > in the arm backend. I'll try to get a patch to handle that case, but I'm > still thinking on how to best > do that. > > Tested arm-none-eabi, I didn't see any codegen differences in some compiled > codebases. > > Ok for trunk? OK Ramana > > P.S. I know that expmed creates all kinds of irregular rtxes and asks for > their costs. I'm hoping to clean that > up at some point... > > 2015-04-20 Kyrylo Tkachov <kyrylo.tkachov@arm.com> > > * config/arm/arm.c (arm_new_rtx_costs): Handle UNSPEC_VOLATILE. > (arm_unspec_cost): Allos UNSPEC_VOLATILE. Do not recurse inside > unknown unspecs.
commit b86d4036fbbbd15956595ef5a158e1a6881356c1 Author: Kyrylo Tkachov <kyrylo.tkachov@arm.com> Date: Thu Apr 16 11:17:47 2015 +0100 [ARM] Handle UNSPEC_VOLATILE in costs and don't recurse inside the unspec diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c index 5116817..8f31be0 100644 --- a/gcc/config/arm/arm.c +++ b/gcc/config/arm/arm.c @@ -9607,7 +9607,8 @@ static bool arm_unspec_cost (rtx x, enum rtx_code /* outer_code */, bool speed_p, int *cost) { const struct cpu_cost_table *extra_cost = current_tune->insn_extra_cost; - gcc_assert (GET_CODE (x) == UNSPEC); + rtx_code code = GET_CODE (x); + gcc_assert (code == UNSPEC || code == UNSPEC_VOLATILE); switch (XINT (x, 1)) { @@ -9653,7 +9654,7 @@ arm_unspec_cost (rtx x, enum rtx_code /* outer_code */, bool speed_p, int *cost) *cost = COSTS_N_INSNS (2); break; } - return false; + return true; } /* Cost of a libcall. We assume one insn per argument, an amount for the @@ -11121,6 +11122,7 @@ arm_new_rtx_costs (rtx x, enum rtx_code code, enum rtx_code outer_code, *cost = LIBCALL_COST (1); return false; + case UNSPEC_VOLATILE: case UNSPEC: return arm_unspec_cost (x, outer_code, speed_p, cost);