Message ID | 20231218134251.1513432-1-xry111@xry111.site |
---|---|
State | New |
Headers | show |
Series | middle-end: Call negate_rtx instead of simplify_gen_unary expanding rotate shift [PR113033] | expand |
On 12/18/23 06:42, Xi Ruoyao wrote: > With simplify_gen_unary we end up with a not fully expanded RTX like > > (set (reg:SI 90) (and:SI (neg:SI (reg:SI 80)) (const_int 63))) > > Then it will cause an ICE with unrecognizable insn. > > gcc/ChangeLog: > > PR middle-end/113033 > * expmed.cc (expand_shift_1): When expanding rotate shift, call > negate_rtx instead of simplify_gen_unary (NEG, ...). The key difference being that using negate_rtx will go through the expander which knows how to synthesize negation whereas simplify_gen_unary will just generate a (neg ...) and assume it matches something in the backend, right? If so, this patch is fine for the trunk. jeff
On Mon, 2023-12-18 at 08:39 -0700, Jeff Law wrote: > > > On 12/18/23 06:42, Xi Ruoyao wrote: > > With simplify_gen_unary we end up with a not fully expanded RTX like > > > > (set (reg:SI 90) (and:SI (neg:SI (reg:SI 80)) (const_int 63))) > > > > Then it will cause an ICE with unrecognizable insn. > > > > gcc/ChangeLog: > > > > PR middle-end/113033 > > * expmed.cc (expand_shift_1): When expanding rotate shift, call > > negate_rtx instead of simplify_gen_unary (NEG, ...). > The key difference being that using negate_rtx will go through the > expander which knows how to synthesize negation whereas > simplify_gen_unary will just generate a (neg ...) and assume it matches > something in the backend, right? For PR113033 the key difference (to me) is negate_rtx emits an insn to set a new pseudo reg to -x. So the result will be (set (reg:SI 81) (neg:SI (reg:SI 80))) then (and (reg:SI 81) (const_int 31)) instead of a consolidated (and:SI (neg:SI (reg:SI IN)) (const_int 63)) AFAIK no backends have an instruction doing "negate an operand then and bitwisely". To me, technically the following operation other_amount = simplify_gen_binary (AND, GET_MODE (op1), other_amount, gen_int_mode (mask, GET_MODE (op1))); should also be something negate_rtx too or we may still end up with an ICE with backends incapable to match (set (reg:SI XX) (and (reg:SI 81) (const_int 31))) But fortunately most backends has an immediate and operation so it's unlikely to blow up... > If so, this patch is fine for the trunk.
On Tue, Dec 19, 2023 at 12:48:46AM +0800, Xi Ruoyao wrote: > > > gcc/ChangeLog: > > > > > > PR middle-end/113033 > > > * expmed.cc (expand_shift_1): When expanding rotate shift, call > > > negate_rtx instead of simplify_gen_unary (NEG, ...). > > > The key difference being that using negate_rtx will go through the > > expander which knows how to synthesize negation whereas > > simplify_gen_unary will just generate a (neg ...) and assume it matches > > something in the backend, right? > > For PR113033 the key difference (to me) is negate_rtx emits an insn to > set a new pseudo reg to -x. So the result will be > > (set (reg:SI 81) (neg:SI (reg:SI 80))) > > then > > (and (reg:SI 81) (const_int 31)) > > instead of a consolidated > > (and:SI (neg:SI (reg:SI IN)) (const_int 63)) > > AFAIK no backends have an instruction doing "negate an operand then and > bitwisely". Can you explain why it doesn't work as is though? I mean, expand_shift_1 with that (and (neg (reg ...)) (const_int ...)) should try to legitimize the operand (e.g. in maybe_legitimize_operand -> force_operand and force_operand should be able to deal with that, AND is binary op, so it recurses on the 2 operands and NEG is UNARY_P, so the recursion should deal with that if it is not general_operand. Jakub
On Mon, 2023-12-18 at 18:45 +0100, Jakub Jelinek wrote: > On Tue, Dec 19, 2023 at 12:48:46AM +0800, Xi Ruoyao wrote: > > > > gcc/ChangeLog: > > > > > > > > PR middle-end/113033 > > > > * expmed.cc (expand_shift_1): When expanding rotate shift, call > > > > negate_rtx instead of simplify_gen_unary (NEG, ...). > > > > > The key difference being that using negate_rtx will go through the > > > expander which knows how to synthesize negation whereas > > > simplify_gen_unary will just generate a (neg ...) and assume it matches > > > something in the backend, right? > > > > For PR113033 the key difference (to me) is negate_rtx emits an insn to > > set a new pseudo reg to -x. So the result will be > > > > (set (reg:SI 81) (neg:SI (reg:SI 80))) > > > > then > > > > (and (reg:SI 81) (const_int 31)) > > > > instead of a consolidated > > > > (and:SI (neg:SI (reg:SI IN)) (const_int 63)) > > > > AFAIK no backends have an instruction doing "negate an operand then and > > bitwisely". > > Can you explain why it doesn't work as is though? > I mean, expand_shift_1 with that (and (neg (reg ...)) (const_int ...)) > should try to legitimize the operand (e.g. in maybe_legitimize_operand > -> force_operand and force_operand should be able to deal with that, > AND is binary op, so it recurses on the 2 operands and NEG is UNARY_P, > so the recursion should deal with that if it is not general_operand. It happens with vector left rotate: V test (V a, int x) { int _1; V _4; <bb 2> [local count: 1073741824]: _1 = x_2(D) & 31; _4 = a_3(D) r<< _1; return _4; } Here V is in V4SImode. With other_amount = (and (neg (reg 85)) (const_int 31)), we end up calling expand_shift_1 ( code = RSHIFT_EXPR, mode = V4SImode, shifted = (reg:V4SI 82), amount = (and:SI (neg:SI (reg:SI 85)) (const_int 31)), target = (reg:V4SI 84), unsignedp = true, may_fail = false) It then calls expand_binop ( mode = V4SImode, lshr_optab, op0 = (reg:V4SI 82), op1 = (and:SI (neg:SI (reg:SI 85)) (const_int 31)), target = (reg:V4SI 84), unsignedp=1, methods=OPTAB_DIRECT) In expand_binop: rtx vop1 = expand_vector_broadcast (mode, op1); LoongArch backend don't have vec_duplicate (well, broadcasting is implemented as a special case of vec_init and maybe this is not so good...), so finally we get: vec = rtvec_alloc (n); for (int i = 0; i < n; ++i) RTVEC_ELT (vec, i) = op; rtx ret = gen_reg_rtx (vmode); emit_insn (GEN_FCN (icode) (ret, gen_rtx_PARALLEL (vmode, vec))); here "op" is (and:SI (neg:SI (reg:SI 85)) (const_int 31)), thus it evaded expansion :(.
On Tue, Dec 19, 2023 at 04:01:52AM +0800, Xi Ruoyao wrote: > On Mon, 2023-12-18 at 18:45 +0100, Jakub Jelinek wrote: > > On Tue, Dec 19, 2023 at 12:48:46AM +0800, Xi Ruoyao wrote: > > > > > gcc/ChangeLog: > > > > > > > > > > PR middle-end/113033 > > > > > * expmed.cc (expand_shift_1): When expanding rotate shift, call > > > > > negate_rtx instead of simplify_gen_unary (NEG, ...). > > > > > > > The key difference being that using negate_rtx will go through the > > > > expander which knows how to synthesize negation whereas > > > > simplify_gen_unary will just generate a (neg ...) and assume it matches > > > > something in the backend, right? > > > > > > For PR113033 the key difference (to me) is negate_rtx emits an insn to > > > set a new pseudo reg to -x. So the result will be > > > > > > (set (reg:SI 81) (neg:SI (reg:SI 80))) > > > > > > then > > > > > > (and (reg:SI 81) (const_int 31)) > > > > > > instead of a consolidated > > > > > > (and:SI (neg:SI (reg:SI IN)) (const_int 63)) > > > > > > AFAIK no backends have an instruction doing "negate an operand then and > > > bitwisely". > > > > Can you explain why it doesn't work as is though? > > I mean, expand_shift_1 with that (and (neg (reg ...)) (const_int ...)) > > should try to legitimize the operand (e.g. in maybe_legitimize_operand > > -> force_operand and force_operand should be able to deal with that, > > AND is binary op, so it recurses on the 2 operands and NEG is UNARY_P, > > so the recursion should deal with that if it is not general_operand. > > It happens with vector left rotate: > > V test (V a, int x) > { > int _1; > V _4; > > <bb 2> [local count: 1073741824]: > _1 = x_2(D) & 31; > _4 = a_3(D) r<< _1; > return _4; > > } > > Here V is in V4SImode. With other_amount = (and (neg (reg 85)) > (const_int 31)), we end up calling > > expand_shift_1 ( > code = RSHIFT_EXPR, > mode = V4SImode, > shifted = (reg:V4SI 82), > amount = (and:SI (neg:SI (reg:SI 85)) (const_int 31)), > target = (reg:V4SI 84), > unsignedp = true, > may_fail = false) > > It then calls > > expand_binop ( > mode = V4SImode, > lshr_optab, > op0 = (reg:V4SI 82), > op1 = (and:SI (neg:SI (reg:SI 85)) (const_int 31)), > target = (reg:V4SI 84), > unsignedp=1, > methods=OPTAB_DIRECT) > > In expand_binop: > > rtx vop1 = expand_vector_broadcast (mode, op1); > > LoongArch backend don't have vec_duplicate (well, broadcasting is > implemented as a special case of vec_init and maybe this is not so > good...), so finally we get: > > vec = rtvec_alloc (n); > for (int i = 0; i < n; ++i) > RTVEC_ELT (vec, i) = op; > rtx ret = gen_reg_rtx (vmode); > emit_insn (GEN_FCN (icode) (ret, gen_rtx_PARALLEL (vmode, vec))); > > here "op" is (and:SI (neg:SI (reg:SI 85)) (const_int 31)), thus it > evaded expansion :(. Then that seems like a bug in the loongarch vec_init pattern(s). Those really don't have a predicate in any of the backends on the input operand, so they need to force_reg it if it is something it can't handle. I've looked e.g. at i386 vec_init and that is exactly what it does, see the various tests + force_reg calls in ix86_expand_vector_init*. Jakub
On Mon, 2023-12-18 at 21:15 +0100, Jakub Jelinek wrote: > > Then that seems like a bug in the loongarch vec_init pattern(s). > Those really don't have a predicate in any of the backends on the input > operand, so they need to force_reg it if it is something it can't handle. > I've looked e.g. at i386 vec_init and that is exactly what it does, > see the various tests + force_reg calls in ix86_expand_vector_init*. Ok, I'm abandoning abandon this patch and I'll rework.
diff --git a/gcc/expmed.cc b/gcc/expmed.cc index 05331dd5d82..f9e416b9549 100644 --- a/gcc/expmed.cc +++ b/gcc/expmed.cc @@ -2634,9 +2634,7 @@ expand_shift_1 (enum tree_code code, machine_mode mode, rtx shifted, (mode, GET_MODE_BITSIZE (scalar_mode) - INTVAL (op1)); else { - other_amount - = simplify_gen_unary (NEG, GET_MODE (op1), - op1, GET_MODE (op1)); + other_amount = negate_rtx (GET_MODE (op1), op1); HOST_WIDE_INT mask = GET_MODE_PRECISION (scalar_mode) - 1; other_amount = simplify_gen_binary (AND, GET_MODE (op1), other_amount,