Message ID | 53EA2767.5060301@arm.com |
---|---|
State | New |
Headers | show |
> On Aug 12, 2014, at 7:40 AM, Alan Lawrence <alan.lawrence@arm.com> wrote: > > (It is no more expensive.) Yes on some processors it could be. Thanks, Andrew > > gcc/ChangeLog: > > * config/aarch64/aarch64.md (subdi3, adddi3_aarch64): Don't penalize > SIMD reg variant. > diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md index f8eb305140e7b0aed006b33f1724a90939e48316..0a8ca4bcc7941f912c8d42200b33206d4188fa48 100644 --- a/gcc/config/aarch64/aarch64.md +++ b/gcc/config/aarch64/aarch64.md @@ -1188,10 +1188,10 @@ (define_insn "*adddi3_aarch64" [(set - (match_operand:DI 0 "register_operand" "=rk,rk,rk,!w") + (match_operand:DI 0 "register_operand" "=rk,rk,rk,w") (plus:DI - (match_operand:DI 1 "register_operand" "%rk,rk,rk,!w") - (match_operand:DI 2 "aarch64_plus_operand" "I,r,J,!w")))] + (match_operand:DI 1 "register_operand" "%rk,rk,rk,w") + (match_operand:DI 2 "aarch64_plus_operand" "I,r,J,w")))] "" "@ add\\t%x0, %x1, %2 @@ -1662,9 +1662,9 @@ ) (define_insn "subdi3" - [(set (match_operand:DI 0 "register_operand" "=rk,!w") - (minus:DI (match_operand:DI 1 "register_operand" "r,!w") - (match_operand:DI 2 "register_operand" "r,!w")))] + [(set (match_operand:DI 0 "register_operand" "=rk,w") + (minus:DI (match_operand:DI 1 "register_operand" "r,w") + (match_operand:DI 2 "register_operand" "r,w")))] "" "@ sub\\t%x0, %x1, %x2
On Tue, Aug 12, 2014 at 04:53:38PM +0100, pinskia@gmail.com wrote: > > > > On Aug 12, 2014, at 7:40 AM, Alan Lawrence <alan.lawrence@arm.com> wrote: > > > > (It is no more expensive.) > > Yes on some processors it could be. Haven't we been here before [1]? This disparaging mechanism is still not going to give what we are trying to achieve (assigning cost at the -mcpu/-mtune level, rather than the target level). Recall that '!' gives a static cost of 600 [2] and that this cost is only applied to the alternative if any operand in that alternative needs reloading - otherwise, LRA sees a set of matching operands and does not bother checking costs [3]. IRA, as far as I can see, does not care about '!', but unconditionally adds 2 to the cost of an alternative for '?' [4]. Even if LRA did try to do a more complete job of always picking the alternative with lowest cost (rather than the current first-matching behaviour) "600" would be far too high a cost for the operation. If IRA took '!' into account, we would be as well to remove the alternative entirely. So, I still can't agree that we want these exclamation marks - and we are now in a halfway state where some instructions have them and some don't. We have to pick a consistent policy or we are going to see some very poor code generation. In an ideal world, we would have a sensible way of describing a (per-core granularity) alternative cost, which would be considered by the register allocators. I've played about with doing this using attributes, but it ends up as a messy patch and I can't bring myself to add yet another cost framework to the back end. Do you have any ideas as to how we can make some progress? Maybe Vladimir has some suggestions? Thanks, James [1] https://gcc.gnu.org/ml/gcc-patches/2014-03/msg01627.html [2] regoc.c::preprocess_constraints [3] I may have misread this, but that seems to be what the final condition of the main loop of lra-constraints::process_alt_operands implies. [4] ira-costs.c::record_reg_classes
On 2014-08-13, 4:36 AM, James Greenhalgh wrote: > On Tue, Aug 12, 2014 at 04:53:38PM +0100, pinskia@gmail.com wrote: >> >> >>> On Aug 12, 2014, at 7:40 AM, Alan Lawrence <alan.lawrence@arm.com> wrote: >>> >>> (It is no more expensive.) >> >> Yes on some processors it could be. > > Haven't we been here before [1]? > > This disparaging mechanism is still not going to give what we are trying to > achieve (assigning cost at the -mcpu/-mtune level, rather than the > target level). > > Recall that '!' gives a static cost of 600 [2] and that this cost is > only applied to the alternative if any operand in that alternative needs > reloading - otherwise, LRA sees a set of matching operands and does not > bother checking costs [3]. IRA, as far as I can see, does not care about > '!', but unconditionally adds 2 to the cost of an alternative for '?' [4]. > Yes, correct. IRA follows the old global RA and the documentation. > Even if LRA did try to do a more complete job of always picking the > alternative with lowest cost (rather than the current first-matching > behaviour) "600" would be far too high a cost for the operation. > > If IRA took '!' into account, we would be as well to remove the alternative > entirely. > Yes, that is right. > So, I still can't agree that we want these exclamation marks - and we are > now in a halfway state where some instructions have them and some don't. > We have to pick a consistent policy or we are going to see some very poor > code generation. > > In an ideal world, we would have a sensible way of describing a > (per-core granularity) alternative cost, which would be considered by > the register allocators. I've played about with doing this using attributes, > but it ends up as a messy patch and I can't bring myself to add yet another > cost framework to the back end. > > Do you have any ideas as to how we can make some progress? Maybe Vladimir > has some suggestions? > Yes, I have some thoughts. We could provide machine-dependent hooks to treat different costs for '!' (which will work only for LRA/reload), and '?', and even '*' (to ignore or not the next register constraint for register preference). The hook could get insn for which we calculate the costs and of course to have the current default values) for compatibility. It could provide a lot of flexibility for machine-description writers. Although it would create some conflicts with insn cost attribute calculation by alternatives and would create more complication for less experienced md writer. So we need some more consistent solution. It is all about code selection which currently is done in many places (combiner, register preferences and finally in reload/LRA). A lot of thinking should be done how to better approach to the solution. The right approach would be reconsider all the pipeline and probably rewriting combiner/register preferencing (which is moved from the old regclass.c practically without changes and ignore the fact that the alternative should be the same for different operands) / partially LRA. So it is complex task and a big project. But I am going to work in this direction when I start to be less busy. I am only afraid that the quick solutions as mentioned by me above could create a lot of complications for the long-term project. > Thanks, > James > > [1] https://gcc.gnu.org/ml/gcc-patches/2014-03/msg01627.html > [2] regoc.c::preprocess_constraints > [3] I may have misread this, but that seems to be what the final condition > of the main loop of lra-constraints::process_alt_operands implies. > [4] ira-costs.c::record_reg_classes >
Well, you're right that it could be. So I presented the wrong justification. Clearly we would benefit from some better cost infrastructure here, ideally that is expressive, taken into account at all appropriate stages of the compiler, and tunable per core. I imagine that steps (patches) towards such infrastructure would be welcomed by both AArch64 maintainers and more widely. In the meantime, however, we must work with what we have. I'll still argue that we should remove the '!' (as per patch), however. As James has said, even if your add is more expensive in SIMD registers, the '!' still doesn't express that; and leaving it in affects code-generation on all cores. And it is inconsistent with other instructions. --Alan pinskia@gmail.com wrote: > >> On Aug 12, 2014, at 7:40 AM, Alan Lawrence <alan.lawrence@arm.com> wrote: >> >> (It is no more expensive.) > > Yes on some processors it could be. > > Thanks, > Andrew > > >> gcc/ChangeLog: >> >> * config/aarch64/aarch64.md (subdi3, adddi3_aarch64): Don't penalize >> SIMD reg variant. >> diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md index f8eb305140e7b0aed006b33f1724a90939e48316..0a8ca4bcc7941f912c8d42200b33206d4188fa48 100644 --- a/gcc/config/aarch64/aarch64.md +++ b/gcc/config/aarch64/aarch64.md @@ -1188,10 +1188,10 @@ (define_insn "*adddi3_aarch64" [(set - (match_operand:DI 0 "register_operand" "=rk,rk,rk,!w") + (match_operand:DI 0 "register_operand" "=rk,rk,rk,w") (plus:DI - (match_operand:DI 1 "register_operand" "%rk,rk,rk,!w") - (match_operand:DI 2 "aarch64_plus_operand" "I,r,J,!w")))] + (match_operand:DI 1 "register_operand" "%rk,rk,rk,w") + (match_operand:DI 2 "aarch64_plus_operand" "I,r,J,w")))] "" "@ add\\t%x0, %x1, %2 @@ -1662,9 +1662,9 @@ ) (define_insn "subdi3" - [(set (match_operand:DI 0 "register_operand" "=rk,!w") - (minus:DI (match_operand:DI 1 "register_operand" "r,!w") - (match_operand:DI 2 "register_operand" "r,!w")))] + [(set (match_operand:DI 0 "register_operand" "=rk,w") + (minus:DI (match_operand:DI 1 " register_operand" "r,w") + (match_operand:DI 2 "register_operand" "r,w")))] "" "@ sub\\t%x0, %x1, %x2 >
On 18 August 2014 17:50, Alan Lawrence <alan.lawrence@arm.com> wrote: > Well, you're right that it could be. So I presented the wrong justification. > > Clearly we would benefit from some better cost infrastructure here, ideally > that is expressive, taken into account at all appropriate stages of the > compiler, and tunable per core. I imagine that steps (patches) towards such > infrastructure would be welcomed by both AArch64 maintainers and more > widely. > > In the meantime, however, we must work with what we have. I'll still argue > that we should remove the '!' (as per patch), however. As James has said, > even if your add is more expensive in SIMD registers, the '!' still doesn't > express that; and leaving it in affects code-generation on all cores. And it > is inconsistent with other instructions. Agreed and OK. /Marcus
diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md index f8eb305140e7b0aed006b33f1724a90939e48316..0a8ca4bcc7941f912c8d42200b33206d4188fa48 100644 --- a/gcc/config/aarch64/aarch64.md +++ b/gcc/config/aarch64/aarch64.md @@ -1188,10 +1188,10 @@ (define_insn "*adddi3_aarch64" [(set - (match_operand:DI 0 "register_operand" "=rk,rk,rk,!w") + (match_operand:DI 0 "register_operand" "=rk,rk,rk,w") (plus:DI - (match_operand:DI 1 "register_operand" "%rk,rk,rk,!w") - (match_operand:DI 2 "aarch64_plus_operand" "I,r,J,!w")))] + (match_operand:DI 1 "register_operand" "%rk,rk,rk,w") + (match_operand:DI 2 "aarch64_plus_operand" "I,r,J,w")))] "" "@ add\\t%x0, %x1, %2 @@ -1662,9 +1662,9 @@ ) (define_insn "subdi3" - [(set (match_operand:DI 0 "register_operand" "=rk,!w") - (minus:DI (match_operand:DI 1 "register_operand" "r,!w") - (match_operand:DI 2 "register_operand" "r,!w")))] + [(set (match_operand:DI 0 "register_operand" "=rk,w") + (minus:DI (match_operand:DI 1 "register_operand" "r,w") + (match_operand:DI 2 "register_operand" "r,w")))] "" "@ sub\\t%x0, %x1, %x2