Vector Comparison patch

I'll just send you my current version. I'll be a little bit more specific.

The problem starts when you try to lower the following expression:

x = a > b;
x1 = vcond <x != 0, -1, 0>
vcond <x1, c, d>

Now, you go from the beginning to the end of the block, and you cannot
leave a > b, because only vconds are valid expressions to expand.

Now, you meet a > b first. You try to transform it into vcond <a > b,
-1, 0>, you build this expression, then you try to gimplify it, and
you see that you have something like:

x' = a >b;
x = vcond <x', -1, 0>
x1 = vcond <x != 0, -1, 0>
vcond <x1, c, d>

and your gsi stands at the x1 now, so the gimplification created a
comparison that optab would not understand. And I am not really sure
that you would be able to solve this problem easily.

It would helpr, if you could create vcond<x, op, y, op0, op1>, but you
cant and x op y is a single tree that must be gimplified, and I am not
sure that you can persuade gimplifier to leave this expression
untouched.

In the attachment the current version of the patch.

Thanks,
Artem.

On Mon, Aug 22, 2011 at 9:58 PM, Richard Guenther
<richard.guenther@gmail.com> wrote:
> On Mon, Aug 22, 2011 at 10:49 PM, Artem Shinkarov
> <artyom.shinkaroff@gmail.com> wrote:
>> On Mon, Aug 22, 2011 at 9:42 PM, Richard Guenther
>> <richard.guenther@gmail.com> wrote:
>>> On Mon, Aug 22, 2011 at 5:58 PM, Artem Shinkarov
>>> <artyom.shinkaroff@gmail.com> wrote:
>>>> On Mon, Aug 22, 2011 at 4:50 PM, Richard Guenther
>>>> <richard.guenther@gmail.com> wrote:
>>>>> On Mon, Aug 22, 2011 at 5:43 PM, Artem Shinkarov
>>>>> <artyom.shinkaroff@gmail.com> wrote:
>>>>>> On Mon, Aug 22, 2011 at 4:34 PM, Richard Guenther
>>>>>> <richard.guenther@gmail.com> wrote:
>>>>>>> On Mon, Aug 22, 2011 at 5:21 PM, Artem Shinkarov
>>>>>>> <artyom.shinkaroff@gmail.com> wrote:
>>>>>>>> On Mon, Aug 22, 2011 at 4:01 PM, Richard Guenther
>>>>>>>> <richard.guenther@gmail.com> wrote:
>>>>>>>>> On Mon, Aug 22, 2011 at 2:05 PM, Artem Shinkarov
>>>>>>>>> <artyom.shinkaroff@gmail.com> wrote:
>>>>>>>>>> On Mon, Aug 22, 2011 at 12:25 PM, Richard Guenther
>>>>>>>>>> <richard.guenther@gmail.com> wrote:
>>>>>>>>>>> On Mon, Aug 22, 2011 at 12:53 AM, Artem Shinkarov
>>>>>>>>>>> <artyom.shinkaroff@gmail.com> wrote:
>>>>>>>>>>>> Richard
>>>>>>>>>>>>
>>>>>>>>>>>> I formalized an approach a little-bit, now it works without target
>>>>>>>>>>>> hooks, but some polishing is still required. I want you to comment on
>>>>>>>>>>>> the several important approaches that I use in the patch.
>>>>>>>>>>>>
>>>>>>>>>>>> So how does it work.
>>>>>>>>>>>> 1) All the vector comparisons at the level of  type-checker are
>>>>>>>>>>>> introduced using VEC_COND_EXPR with constant selection operands being
>>>>>>>>>>>> {-1} and {0}. For example v0 > v1 is transformed into VEC_COND_EXPR<v0
>>>>>>>>>>>>> v1, {-1}, {0}>.
>>>>>>>>>>>>
>>>>>>>>>>>> 2) When optabs expand VEC_COND_EXPR, two cases are considered:
>>>>>>>>>>>> 2.a) first operand of VEC_COND_EXPR is comparison, in that case nothing changes.
>>>>>>>>>>>> 2.b) first operand is something else, in that case, we specially mark
>>>>>>>>>>>> this case, recognize it in the backend, and do not create a
>>>>>>>>>>>> comparison, but use the mask as it was a result of some comparison.
>>>>>>>>>>>>
>>>>>>>>>>>> 3) In order to make sure that mask in VEC_COND_EXPR<mask, v0, v1> is a
>>>>>>>>>>>> vector comparison we use is_vector_comparison function, if it returns
>>>>>>>>>>>> false, then we replace mask with mask != {0}.
>>>>>>>>>>>>
>>>>>>>>>>>> So we end-up with the following functionality:
>>>>>>>>>>>> VEC_COND_EXPR<mask, v0,v1> -- if we know that mask is a result of
>>>>>>>>>>>> comparison of two vectors, we leave it as it is, otherwise change with
>>>>>>>>>>>> mask != {0}.
>>>>>>>>>>>>
>>>>>>>>>>>> Plain vector comparison a <op> b is represented with VEC_COND_EXPR,
>>>>>>>>>>>> which correctly expands, without creating useless masking.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Basically for me there are two questions:
>>>>>>>>>>>> 1) Can we perform information passing in optabs in a nicer way?
>>>>>>>>>>>> 2) How is_vector_comparison could be improved? I have several ideas,
>>>>>>>>>>>> like checking if constant vector all consists of 0 and -1, and so on.
>>>>>>>>>>>> But first is it conceptually fine.
>>>>>>>>>>>>
>>>>>>>>>>>> P.S. I tired to put the functionality of is_vector_comparison in
>>>>>>>>>>>> tree-ssa-forwprop, but the thing is that it is called only with -On,
>>>>>>>>>>>> which I find inappropriate, and the functionality gets more
>>>>>>>>>>>> complicated.
>>>>>>>>>>>
>>>>>>>>>>> Why is it inappropriate to not optimize it at -O0?  If the user
>>>>>>>>>>> separates comparison and ?: expression it's his own fault.
>>>>>>>>>>
>>>>>>>>>> Well, because all the information is there, and I perfectly envision
>>>>>>>>>> the case when user expressed comparison separately, just to avoid code
>>>>>>>>>> duplication.
>>>>>>>>>>
>>>>>>>>>> Like:
>>>>>>>>>> mask = a > b;
>>>>>>>>>> res1 = mask ? v0 : v1;
>>>>>>>>>> res2 = mask ? v2 : v3;
>>>>>>>>>>
>>>>>>>>>> Which in this case would be different from
>>>>>>>>>> res1 = a > b ? v0 : v1;
>>>>>>>>>> res2 = a > b ? v2 : v3;
>>>>>>>>>>
>>>>>>>>>>> Btw, the new hook is still in the patch.
>>>>>>>>>>>
>>>>>>>>>>> I would simply always create != 0 if it isn't and let optimizers
>>>>>>>>>>> (tree-ssa-forwprop.c) optimize this.  You still have to deal with
>>>>>>>>>>> non-comparison operands during expansion though, but if
>>>>>>>>>>> you always forced a != 0 from the start you can then simply
>>>>>>>>>>> interpret it as a proper comparison result (in which case I'd
>>>>>>>>>>> modify the backends to have an alternate pattern or directly
>>>>>>>>>>> expand to masking operations - using the fake comparison
>>>>>>>>>>> RTX is too much of a hack).
>>>>>>>>>>
>>>>>>>>>> Richard, I think you didn't get the problem.
>>>>>>>>>> I really need the way, to pass the information, that the expression
>>>>>>>>>> that is in the first operand of vcond is an appropriate mask. I though
>>>>>>>>>> for quite a while and this hack is the only answer I found, is there a
>>>>>>>>>> better way to do that. I could for example introduce another
>>>>>>>>>> tree-node, but it would be overkill as well.
>>>>>>>>>>
>>>>>>>>>> Now why do I need it so much:
>>>>>>>>>> I want to implement the comparison in a way that {1, 5, 0, -1} is
>>>>>>>>>> actually {-1,-1,-1,-1}. So whenever I am not sure that mask of
>>>>>>>>>> VEC_COND_EXPR is a real comparison I transform it to mask != {0} (not
>>>>>>>>>> always). To check the stuff, I use is_vector_comparison in
>>>>>>>>>> tree-vect-generic.
>>>>>>>>>>
>>>>>>>>>> So I really have the difference between mask ? x : y and mask != {0} ?
>>>>>>>>>> x : y, otherwise I could treat mask != {0} in the backend as just
>>>>>>>>>> mask.
>>>>>>>>>>
>>>>>>>>>> If this link between optabs and backend breaks, then the patch falls
>>>>>>>>>> apart. Because every time the comparison is taken out VEC_COND_EXPR, I
>>>>>>>>>> will have to put != {0}. Keep in mind, that I cannot always put the
>>>>>>>>>> comparison inside the VEC_COND_EXPR, what if it is defined in a
>>>>>>>>>> function-comparison, or somehow else?
>>>>>>>>>>
>>>>>>>>>> So what would be an appropriate way to connect optabs and the backend?
>>>>>>>>>
>>>>>>>>> Well, there is no problem in having the only valid mask operand for
>>>>>>>>> VEC_COND_EXPRs being either a comparison or a {-1,...} / {0,....} mask.
>>>>>>>>> Just the C parser has to transform mask ? vec1 : vec2 to mask != 0 ?
>>>>>>>>> vec1 : vec2.
>>>>>>>>
>>>>>>>> This happens already in the new version of patch (not submitted yet).
>>>>>>>>
>>>>>>>>> This comparison can be eliminated by optimization passes
>>>>>>>>> that
>>>>>>>>> either replace it by the real comparison computing the mask or just
>>>>>>>>> propagating the information this mask is already {-1,...} / {0,....} by simply
>>>>>>>>> dropping the comparison against zero.
>>>>>>>>
>>>>>>>> This is not a problem, because the backend recognizes these patterns,
>>>>>>>> so no optimization is needed in this part.
>>>>>>>
>>>>>>> I mean for
>>>>>>>
>>>>>>>  mask = v1 < v2 ? {-1,...} : {0,...};
>>>>>>>  val = VCOND_EXPR <mask != 0, v3, v4>;
>>>>>>>
>>>>>>> optimizers can see how mask is defined and drop the != 0 test or replace
>>>>>>> it by v1 < v2.
>>>>>>
>>>>>> Yes, sure.
>>>>>>
>>>>>>>>> For the backends I'd have vcond patterns for both an embedded comparison
>>>>>>>>> and for a mask.  (Now we can rewind the discussion a bit and allow
>>>>>>>>> arbitrary masks and define a vcond with a mask operand to do bitwise
>>>>>>>>> selection - what matters is the C frontend semantics which we need to
>>>>>>>>> translate to what the middle-end thinks of a VEC_COND_EXPR, they
>>>>>>>>> do not have to agree).
>>>>>>>>
>>>>>>>> But it seems like another combinatorial explosion here. Considering
>>>>>>>> what Richard said in his e-mail, in order to support "generic" vcond,
>>>>>>>> we just need to enumerate all the possible cases. Or I didn't
>>>>>>>> understand right?
>>>>>>>
>>>>>>> Well, the question is still what VCOND_EXPR and thus the vcond pattern
>>>>>>> semantically does for a non-comparison operand.  I'd argue that using
>>>>>>> the bitwise selection semantic gives us maximum flexibility and a native
>>>>>>> instruction with AMD XOP.  This non-comparison VCOND_EXPR is
>>>>>>> also easy to implement in the middle-end expansion code if there is
>>>>>>> no native instruction for it - by simply emitting the bitwise operations.
>>>>>>>
>>>>>>> But I have the feeling we are talking past each other ...?
>>>>>>
>>>>>> I am all for the bitwise behaviour in the backend pattern, that is
>>>>>> something that I rely on at the moment. What I don't want to have is
>>>>>> the same behaviour in the frontend. So If we can guarantee, that we
>>>>>> add != 0, when we don't know the "nature" of the mask, then I am
>>>>>> perfectly fine with the back-end having bitwise-selection behaviour.
>>>>>
>>>>> Well, the C frontend would simply always add that != 0 (because it
>>>>> doesn't know).
>>>>>
>>>>>>>> I mean, I don't mind of course, but it seems to me that it would be
>>>>>>>> cleaner to have one generic enough pattern.
>>>>>>>>
>>>>>>>> Is there seriously no way to pass something from optab into the backend??
>>>>>>>
>>>>>>> You can pass operands.  And information is implicitly encoded in the name.
>>>>>>
>>>>>> I didn't quite get that, could you give an example?
>>>>>
>>>>> It was a larger variant of "no, apart from what is obvious".
>>>>
>>>> Ha, joking again. :)
>>>>
>>>>>>>>> If the mask is computed by a function you are of course out of luck,
>>>>>>>>> but I don't see how you'd manage to infer knowledge from nowhere either.
>>>>>>>>
>>>>>>>> Well, take simpler example
>>>>>>>>
>>>>>>>> a = {0};
>>>>>>>> for ( ; *p; p += 16)
>>>>>>>>  a &= pattern > (vec)*p;
>>>>>>>>
>>>>>>>> res = a ? v0 : v1;
>>>>>>>>
>>>>>>>> In this case it is simple to analyse that a is a comparison, but you
>>>>>>>> cannot embed the operations of a into VEC_COND_EXPR.
>>>>>>>
>>>>>>> Sure, but if the above is C source the frontend would generate
>>>>>>> res = a != 0 ? v0 : v1; initially.  An optimization pass could still
>>>>>>> track this information and replace VEC_COND_EXPR <a != 0, v0, v1>
>>>>>>> with VEC_COND_EXPR <a, v0, v1> (no existing one would track
>>>>>>> vector contents though).
>>>>>>
>>>>>> Yeah, sure. My point is, that we must be able to pass this information
>>>>>> in the backend, that we checked everything, and we are sure that a is
>>>>>> a corerct mask, please don't add any != 0 to it.
>>>>>
>>>>> But all masks are correct as soon as they appear in a VEC_COND_EXPR.
>>>>> That's the whole point of the bitwise semantics.  It's only the C frontend
>>>>> that needs to be careful to impose its stricter semantics.
>>>>>
>>>>> Richard.
>>>>>
>>>>
>>>> Ok, I see the last difference in the approaches we envision.
>>>> I am assuming, that frontend does not put != 0, but the later
>>>> optimisations (veclower in my case) check every mask in VEC_COND_EXPR
>>>> and does the same functionality as you describe. So the philosophical
>>>> question why it is better to first add and then remove, rather than
>>>> just add if needed?
>>>
>>> Well, it's "better be right than sorry".  Thus, default to the
>>> conservatively correct
>>> way and let optimizers "optimize" it.
>>
>> How can we get sorry, it is impossible to skip the vcond during the
>> optimisation, but whatever, it is not really so important when to add.
>> Currently I have a bigger problem, see below.
>>>
>>>> In all the rest I think we agreed.
>>>
>>> Fine.
>>>
>>> Thanks,
>>> Richard.
>>>
>>>>
>>>> Artem.
>>>>
>>>
>>
>> I found out that I cannot really gimplify correctly the vcond<a >b ,
>> c, d> expression when a > b is vcond<a > b, -1, 0>. The problem is
>> that gimplifier pulls a > b always as a separate expression during the
>> gimplification, and I don't think that we can avoid it. So what
>> happens is:
>>
>> vcond <a > b , c , d>
>> transformed to
>> x = b > c;
>> x1 = vcond <x , -1, 0>
>> vcond <x1, c, d>
>>
>> and so on, infinitely long.
>
> Sounds like a bug that is possible to fix.
>
>> In order to fix the problem we need whether to introduce a new code
>> like VEC_COMP_LT, VEC_COMP_GT, and so on
>> whether a builtin function which we would lower
>> whether stick back to the idea of hook.
>>
>> Anyway, representing a >b using vcond does not work.
>
> Well, sure it will work, it just needs some work appearantly.
>
>> What would be your thinking here?
>
> Do you have a patch that exposes this problem?  I can have a look
> tomorrow.
>
> Richard.
>
>>
>> Thanks,
>> Artem.
>>
>

Vector Comparison patch

Commit Message

Comments

Patch