Message ID | 55433834.10206@arm.com |
---|---|
State | New |
Headers | show |
Ping. https://gcc.gnu.org/ml/gcc-patches/2015-05/msg00013.html Thanks, Kyrill On 01/05/15 09:24, Kyrill Tkachov wrote: > Hi all, > > It occurs to me that in the IOR-of-shifts form of the rev16 operation we should be costing the operand properly. > For that we'd want to reuse the aarch_rev16_p function that does all the heavy lifting and get it to write the > innermost operand of the rev16 for further costing. In the process we relax that function a bit to accept any > rtx as the operand, not just REGs so that we can calculate the cost of moving them in a register appropriately. > > This patch does just that and updates the arm and aarch64 callsites appropriately so that the operands are > processed properly. > > In practice I don't expect this to make much difference since this patterns occurs rarely anyway, but it seems > like the 'right thing to do' (TM). > > Bootstrapped and tested on arm,aarch64. > > Ok for trunk? > > Thanks, > Kyrill > > 2015-05-01 Kyrylo Tkachov <kyrylo.tkachov@arm.com> > > * config/arm/aarch-common-protos.h (aarch_rev16_p): Update signature > to take a second argument. > * config/arm/aarch-common.c (aarch_rev16_p): Add second argument. > Write inner-most rev16 argument to it if recognised. > (aarch_rev16_p_1): Likewise. > * config/arm/arm.c (arm_new_rtx_costs): Properly cost rev16 operand > in the IOR case. > * config/aarch64/aarch64.c (aarch64_rtx_costs): Likewise.
Ping. Thanks, Kyrill On 12/05/15 10:07, Kyrill Tkachov wrote: > Ping. > https://gcc.gnu.org/ml/gcc-patches/2015-05/msg00013.html > > Thanks, > Kyrill > On 01/05/15 09:24, Kyrill Tkachov wrote: >> Hi all, >> >> It occurs to me that in the IOR-of-shifts form of the rev16 operation we should be costing the operand properly. >> For that we'd want to reuse the aarch_rev16_p function that does all the heavy lifting and get it to write the >> innermost operand of the rev16 for further costing. In the process we relax that function a bit to accept any >> rtx as the operand, not just REGs so that we can calculate the cost of moving them in a register appropriately. >> >> This patch does just that and updates the arm and aarch64 callsites appropriately so that the operands are >> processed properly. >> >> In practice I don't expect this to make much difference since this patterns occurs rarely anyway, but it seems >> like the 'right thing to do' (TM). >> >> Bootstrapped and tested on arm,aarch64. >> >> Ok for trunk? >> >> Thanks, >> Kyrill >> >> 2015-05-01 Kyrylo Tkachov <kyrylo.tkachov@arm.com> >> >> * config/arm/aarch-common-protos.h (aarch_rev16_p): Update signature >> to take a second argument. >> * config/arm/aarch-common.c (aarch_rev16_p): Add second argument. >> Write inner-most rev16 argument to it if recognised. >> (aarch_rev16_p_1): Likewise. >> * config/arm/arm.c (arm_new_rtx_costs): Properly cost rev16 operand >> in the IOR case. >> * config/aarch64/aarch64.c (aarch64_rtx_costs): Likewise.
On Fri, May 01, 2015 at 09:24:20AM +0100, Kyrill Tkachov wrote: > Hi all, > > It occurs to me that in the IOR-of-shifts form of the rev16 operation we > should be costing the operand properly. For that we'd want to reuse the > aarch_rev16_p function that does all the heavy lifting and get it to write > the innermost operand of the rev16 for further costing. In the process we > relax that function a bit to accept any rtx as the operand, not just REGs so > that we can calculate the cost of moving them in a register appropriately. > > This patch does just that and updates the arm and aarch64 callsites > appropriately so that the operands are processed properly. > > In practice I don't expect this to make much difference since this patterns > occurs rarely anyway, but it seems like the 'right thing to do' (TM). > > Bootstrapped and tested on arm,aarch64. > > Ok for trunk? Looks OK for aarch64, please wait for an Ack from an arm maintainer. Cheers, James > 2015-05-01 Kyrylo Tkachov <kyrylo.tkachov@arm.com> > > * config/arm/aarch-common-protos.h (aarch_rev16_p): Update signature > to take a second argument. > * config/arm/aarch-common.c (aarch_rev16_p): Add second argument. > Write inner-most rev16 argument to it if recognised. > (aarch_rev16_p_1): Likewise. > * config/arm/arm.c (arm_new_rtx_costs): Properly cost rev16 operand > in the IOR case. > * config/aarch64/aarch64.c (aarch64_rtx_costs): Likewise.
commit 297534c524af2aac4c67279909c5e54221a46b22 Author: Kyrylo Tkachov <kyrylo.tkachov@arm.com> Date: Mon Mar 2 17:55:30 2015 +0000 [ARM/AArch64] Properly cost rev16 operand. diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c index 0345b93..6083fd4 100644 --- a/gcc/config/aarch64/aarch64.c +++ b/gcc/config/aarch64/aarch64.c @@ -6003,9 +6003,9 @@ cost_plus: return false; case IOR: - if (aarch_rev16_p (x)) + if (aarch_rev16_p (x, &op0)) { - *cost = COSTS_N_INSNS (1); + *cost += rtx_cost (op0, BSWAP, 0, speed); if (speed) *cost += extra_cost->alu.rev; diff --git a/gcc/config/arm/aarch-common-protos.h b/gcc/config/arm/aarch-common-protos.h index 3ee7ebf..04b1f1d 100644 --- a/gcc/config/arm/aarch-common-protos.h +++ b/gcc/config/arm/aarch-common-protos.h @@ -24,7 +24,7 @@ #define GCC_AARCH_COMMON_PROTOS_H extern int aarch_crypto_can_dual_issue (rtx_insn *, rtx_insn *); -extern bool aarch_rev16_p (rtx); +extern bool aarch_rev16_p (rtx, rtx*); extern bool aarch_rev16_shleft_mask_imm_p (rtx, machine_mode); extern bool aarch_rev16_shright_mask_imm_p (rtx, machine_mode); extern int arm_early_load_addr_dep (rtx, rtx); diff --git a/gcc/config/arm/aarch-common.c b/gcc/config/arm/aarch-common.c index b2bb859..f1c267c 100644 --- a/gcc/config/arm/aarch-common.c +++ b/gcc/config/arm/aarch-common.c @@ -181,28 +181,31 @@ aarch_rev16_shleft_mask_imm_p (rtx val, machine_mode mode) static bool -aarch_rev16_p_1 (rtx lhs, rtx rhs, machine_mode mode) +aarch_rev16_p_1 (rtx lhs, rtx rhs, machine_mode mode, rtx *op) { if (GET_CODE (lhs) == AND && GET_CODE (XEXP (lhs, 0)) == ASHIFT && CONST_INT_P (XEXP (XEXP (lhs, 0), 1)) && INTVAL (XEXP (XEXP (lhs, 0), 1)) == 8 - && REG_P (XEXP (XEXP (lhs, 0), 0)) && CONST_INT_P (XEXP (lhs, 1)) && GET_CODE (rhs) == AND && GET_CODE (XEXP (rhs, 0)) == LSHIFTRT - && REG_P (XEXP (XEXP (rhs, 0), 0)) && CONST_INT_P (XEXP (XEXP (rhs, 0), 1)) && INTVAL (XEXP (XEXP (rhs, 0), 1)) == 8 && CONST_INT_P (XEXP (rhs, 1)) - && REGNO (XEXP (XEXP (rhs, 0), 0)) == REGNO (XEXP (XEXP (lhs, 0), 0))) + && rtx_equal_p (XEXP (XEXP (rhs, 0), 0), XEXP (XEXP (lhs, 0), 0))) { rtx lhs_mask = XEXP (lhs, 1); rtx rhs_mask = XEXP (rhs, 1); - - return aarch_rev16_shright_mask_imm_p (rhs_mask, mode) - && aarch_rev16_shleft_mask_imm_p (lhs_mask, mode); + rtx inner = XEXP (XEXP (lhs, 0), 0); + + if (aarch_rev16_shright_mask_imm_p (rhs_mask, mode) + && aarch_rev16_shleft_mask_imm_p (lhs_mask, mode)) + { + *op = inner; + return true; + } } return false; @@ -214,14 +217,18 @@ aarch_rev16_p_1 (rtx lhs, rtx rhs, machine_mode mode) | ((x << 8) & 0xff00ff00) for SImode and with similar but wider bitmasks for DImode. The two sub-expressions of the IOR can appear on either side so check both - permutations with the help of aarch_rev16_p_1 above. */ + permutations with the help of aarch_rev16_p_1 above. Write into OP the + rtx that has rev16 applied to it to be used for further operand costing. + OP will hold NULL_RTX if the operation is not recognised. */ bool -aarch_rev16_p (rtx x) +aarch_rev16_p (rtx x, rtx *op) { rtx left_sub_rtx, right_sub_rtx; bool is_rev = false; + *op = NULL_RTX; + if (GET_CODE (x) != IOR) return false; @@ -230,10 +237,10 @@ aarch_rev16_p (rtx x) /* There are no canonicalisation rules for the position of the two shifts involved in a rev, so try both permutations. */ - is_rev = aarch_rev16_p_1 (left_sub_rtx, right_sub_rtx, GET_MODE (x)); + is_rev = aarch_rev16_p_1 (left_sub_rtx, right_sub_rtx, GET_MODE (x), op); if (!is_rev) - is_rev = aarch_rev16_p_1 (right_sub_rtx, left_sub_rtx, GET_MODE (x)); + is_rev = aarch_rev16_p_1 (right_sub_rtx, left_sub_rtx, GET_MODE (x), op); return is_rev; } diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c index 4081680..b3c65d5 100644 --- a/gcc/config/arm/arm.c +++ b/gcc/config/arm/arm.c @@ -10389,14 +10389,19 @@ arm_new_rtx_costs (rtx x, enum rtx_code code, enum rtx_code outer_code, *cost = LIBCALL_COST (2); return false; case IOR: - if (mode == SImode && arm_arch6 && aarch_rev16_p (x)) - { - *cost = COSTS_N_INSNS (1); - if (speed_p) - *cost += extra_cost->alu.rev; + { + rtx inner; + if (mode == SImode && arm_arch6 && aarch_rev16_p (x, &inner)) + { + *cost = COSTS_N_INSNS (1); + *cost += rtx_cost (inner, BSWAP, 0 , speed_p); - return true; - } + if (speed_p) + *cost += extra_cost->alu.rev; + + return true; + } + } /* Fall through. */ case AND: case XOR: if (mode == SImode)