diff mbox

[AArch64] Allow const0_rtx operand for atomic compare-exchange patterns

Message ID 58B56D3E.3090704@foss.arm.com
State New
Headers show

Commit Message

Kyrill Tkachov Feb. 28, 2017, 12:29 p.m. UTC
Hi all,

For the testcase in this patch we currently generate:
foo:
         mov     w1, 0
         ldaxr   w2, [x0]
         cmp     w2, 3
         bne     .L2
         stxr    w3, w1, [x0]
         cmp     w3, 0
.L2:
         cset    w0, eq
         ret

Note that the STXR could have been storing the WZR register instead of moving zero into w1.
This is due to overly strict predicates and constraints in the store exclusive pattern and the
atomic compare exchange expanders and splitters.
This simple patch fixes that in the patterns concerned and with it we can generate:
foo:
         ldaxr   w1, [x0]
         cmp     w1, 3
         bne     .L2
         stxr    w2, wzr, [x0]
         cmp     w2, 0
.L2:
         cset    w0, eq
         ret


Bootstrapped and tested on aarch64-none-linux-gnu.
Ok for GCC 8?

Thanks,
Kyrill

2017-02-28  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>

     * config/aarch64/atomics.md (atomic_compare_and_swap<mode> expander):
     Use aarch64_reg_or_zero predicate for operand 4.
     (aarch64_compare_and_swap<mode> define_insn_and_split):
     Use aarch64_reg_or_zero predicate for operand 3.  Add 'Z' constraint.
     (aarch64_store_exclusive<mode>): Likewise for operand 2.

2017-02-28  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>

     * gcc.target/aarch64/atomic_cmp_exchange_zero_reg_1.c: New test.

Comments

Kyrill Tkachov April 24, 2017, 9:37 a.m. UTC | #1
Pinging this back into context so that I don't forget about it...

https://gcc.gnu.org/ml/gcc-patches/2017-02/msg01648.html

Thanks,
Kyrill

On 28/02/17 12:29, Kyrill Tkachov wrote:
> Hi all,
>
> For the testcase in this patch we currently generate:
> foo:
>         mov     w1, 0
>         ldaxr   w2, [x0]
>         cmp     w2, 3
>         bne     .L2
>         stxr    w3, w1, [x0]
>         cmp     w3, 0
> .L2:
>         cset    w0, eq
>         ret
>
> Note that the STXR could have been storing the WZR register instead of moving zero into w1.
> This is due to overly strict predicates and constraints in the store exclusive pattern and the
> atomic compare exchange expanders and splitters.
> This simple patch fixes that in the patterns concerned and with it we can generate:
> foo:
>         ldaxr   w1, [x0]
>         cmp     w1, 3
>         bne     .L2
>         stxr    w2, wzr, [x0]
>         cmp     w2, 0
> .L2:
>         cset    w0, eq
>         ret
>
>
> Bootstrapped and tested on aarch64-none-linux-gnu.
> Ok for GCC 8?
>
> Thanks,
> Kyrill
>
> 2017-02-28  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>
>
>     * config/aarch64/atomics.md (atomic_compare_and_swap<mode> expander):
>     Use aarch64_reg_or_zero predicate for operand 4.
>     (aarch64_compare_and_swap<mode> define_insn_and_split):
>     Use aarch64_reg_or_zero predicate for operand 3.  Add 'Z' constraint.
>     (aarch64_store_exclusive<mode>): Likewise for operand 2.
>
> 2017-02-28  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>
>
>     * gcc.target/aarch64/atomic_cmp_exchange_zero_reg_1.c: New test.
Kyrill Tkachov May 8, 2017, 10:59 a.m. UTC | #2
Ping.

Thanks,
Kyrill

On 24/04/17 10:37, Kyrill Tkachov wrote:
> Pinging this back into context so that I don't forget about it...
>
> https://gcc.gnu.org/ml/gcc-patches/2017-02/msg01648.html
>
> Thanks,
> Kyrill
>
> On 28/02/17 12:29, Kyrill Tkachov wrote:
>> Hi all,
>>
>> For the testcase in this patch we currently generate:
>> foo:
>>         mov     w1, 0
>>         ldaxr   w2, [x0]
>>         cmp     w2, 3
>>         bne     .L2
>>         stxr    w3, w1, [x0]
>>         cmp     w3, 0
>> .L2:
>>         cset    w0, eq
>>         ret
>>
>> Note that the STXR could have been storing the WZR register instead of moving zero into w1.
>> This is due to overly strict predicates and constraints in the store exclusive pattern and the
>> atomic compare exchange expanders and splitters.
>> This simple patch fixes that in the patterns concerned and with it we can generate:
>> foo:
>>         ldaxr   w1, [x0]
>>         cmp     w1, 3
>>         bne     .L2
>>         stxr    w2, wzr, [x0]
>>         cmp     w2, 0
>> .L2:
>>         cset    w0, eq
>>         ret
>>
>>
>> Bootstrapped and tested on aarch64-none-linux-gnu.
>> Ok for GCC 8?
>>
>> Thanks,
>> Kyrill
>>
>> 2017-02-28  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>
>>
>>     * config/aarch64/atomics.md (atomic_compare_and_swap<mode> expander):
>>     Use aarch64_reg_or_zero predicate for operand 4.
>>     (aarch64_compare_and_swap<mode> define_insn_and_split):
>>     Use aarch64_reg_or_zero predicate for operand 3.  Add 'Z' constraint.
>>     (aarch64_store_exclusive<mode>): Likewise for operand 2.
>>
>> 2017-02-28  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>
>>
>>     * gcc.target/aarch64/atomic_cmp_exchange_zero_reg_1.c: New test.
>
Kyrill Tkachov June 1, 2017, 2:02 p.m. UTC | #3
Ping.

Thanks,
Kyrill

On 08/05/17 11:59, Kyrill Tkachov wrote:
> Ping.
>
> Thanks,
> Kyrill
>
> On 24/04/17 10:37, Kyrill Tkachov wrote:
>> Pinging this back into context so that I don't forget about it...
>>
>> https://gcc.gnu.org/ml/gcc-patches/2017-02/msg01648.html
>>
>> Thanks,
>> Kyrill
>>
>> On 28/02/17 12:29, Kyrill Tkachov wrote:
>>> Hi all,
>>>
>>> For the testcase in this patch we currently generate:
>>> foo:
>>>         mov     w1, 0
>>>         ldaxr   w2, [x0]
>>>         cmp     w2, 3
>>>         bne     .L2
>>>         stxr    w3, w1, [x0]
>>>         cmp     w3, 0
>>> .L2:
>>>         cset    w0, eq
>>>         ret
>>>
>>> Note that the STXR could have been storing the WZR register instead of moving zero into w1.
>>> This is due to overly strict predicates and constraints in the store exclusive pattern and the
>>> atomic compare exchange expanders and splitters.
>>> This simple patch fixes that in the patterns concerned and with it we can generate:
>>> foo:
>>>         ldaxr   w1, [x0]
>>>         cmp     w1, 3
>>>         bne     .L2
>>>         stxr    w2, wzr, [x0]
>>>         cmp     w2, 0
>>> .L2:
>>>         cset    w0, eq
>>>         ret
>>>
>>>
>>> Bootstrapped and tested on aarch64-none-linux-gnu.
>>> Ok for GCC 8?
>>>
>>> Thanks,
>>> Kyrill
>>>
>>> 2017-02-28  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>
>>>
>>>     * config/aarch64/atomics.md (atomic_compare_and_swap<mode> expander):
>>>     Use aarch64_reg_or_zero predicate for operand 4.
>>>     (aarch64_compare_and_swap<mode> define_insn_and_split):
>>>     Use aarch64_reg_or_zero predicate for operand 3.  Add 'Z' constraint.
>>>     (aarch64_store_exclusive<mode>): Likewise for operand 2.
>>>
>>> 2017-02-28  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>
>>>
>>>     * gcc.target/aarch64/atomic_cmp_exchange_zero_reg_1.c: New test.
>>
>
James Greenhalgh June 6, 2017, 1:16 p.m. UTC | #4
On Tue, Feb 28, 2017 at 12:29:50PM +0000, Kyrill Tkachov wrote:
> Hi all,
> 
> For the testcase in this patch we currently generate:
> foo:
>         mov     w1, 0
>         ldaxr   w2, [x0]
>         cmp     w2, 3
>         bne     .L2
>         stxr    w3, w1, [x0]
>         cmp     w3, 0
> .L2:
>         cset    w0, eq
>         ret
> 
> Note that the STXR could have been storing the WZR register instead of moving zero into w1.
> This is due to overly strict predicates and constraints in the store exclusive pattern and the
> atomic compare exchange expanders and splitters.
> This simple patch fixes that in the patterns concerned and with it we can generate:
> foo:
>         ldaxr   w1, [x0]
>         cmp     w1, 3
>         bne     .L2
>         stxr    w2, wzr, [x0]
>         cmp     w2, 0
> .L2:
>         cset    w0, eq
>         ret
> 
> 
> Bootstrapped and tested on aarch64-none-linux-gnu.
> Ok for GCC 8?

OK.

Thanks,
James

> 2017-02-28  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>
> 
>     * config/aarch64/atomics.md (atomic_compare_and_swap<mode> expander):
>     Use aarch64_reg_or_zero predicate for operand 4.
>     (aarch64_compare_and_swap<mode> define_insn_and_split):
>     Use aarch64_reg_or_zero predicate for operand 3.  Add 'Z' constraint.
>     (aarch64_store_exclusive<mode>): Likewise for operand 2.
> 
> 2017-02-28  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>
> 
>     * gcc.target/aarch64/atomic_cmp_exchange_zero_reg_1.c: New test.
Andrew Pinski June 20, 2017, 5:06 a.m. UTC | #5
On Tue, Feb 28, 2017 at 4:29 AM, Kyrill Tkachov
<kyrylo.tkachov@foss.arm.com> wrote:
> Hi all,
>
> For the testcase in this patch we currently generate:
> foo:
>         mov     w1, 0
>         ldaxr   w2, [x0]
>         cmp     w2, 3
>         bne     .L2
>         stxr    w3, w1, [x0]
>         cmp     w3, 0
> .L2:
>         cset    w0, eq
>         ret
>
> Note that the STXR could have been storing the WZR register instead of
> moving zero into w1.
> This is due to overly strict predicates and constraints in the store
> exclusive pattern and the
> atomic compare exchange expanders and splitters.
> This simple patch fixes that in the patterns concerned and with it we can
> generate:
> foo:
>         ldaxr   w1, [x0]
>         cmp     w1, 3
>         bne     .L2
>         stxr    w2, wzr, [x0]
>         cmp     w2, 0
> .L2:
>         cset    w0, eq
>         ret
>
>
> Bootstrapped and tested on aarch64-none-linux-gnu.
> Ok for GCC 8?


This patch broke compiling with -march=+lse

./home/apinski/src/local5/gcc/gcc/testsuite/gcc.target/aarch64/atomic_cmp_exchange_zero_reg_1.c:9:1:
error: unrecognizable insn:
 }
 ^
(insn 6 3 7 2 (parallel [
            (set (reg:CC 66 cc)
                (unspec_volatile:CC [
                        (const_int 0 [0])
                    ] UNSPECV_ATOMIC_CMPSW))
            (set (reg:SI 78)
                (mem/v:SI (reg/v/f:DI 77 [ a ]) [-1  S4 A32]))
            (set (mem/v:SI (reg/v/f:DI 77 [ a ]) [-1  S4 A32])
                (unspec_volatile:SI [
                        (const_int 3 [0x3])
                        (const_int 0 [0])
                        (const_int 1 [0x1])
                        (const_int 2 [0x2])
                        (const_int 2 [0x2])
                    ] UNSPECV_ATOMIC_CMPSW))
        ]) "/home/apinski/src/local5/gcc/gcc/testsuite/gcc.target/aarch64/atomic_cmp_exchange_zero_reg_1.c":8
-1
     (nil))
during RTL pass: vregs

Note also your new testcase is broken even for defaulting to +lse as
it is not going to match stxr.  I might be the only person who tests
+lse by default :).

Thanks,
Andrew Pinski

>
> Thanks,
> Kyrill
>
> 2017-02-28  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>
>
>     * config/aarch64/atomics.md (atomic_compare_and_swap<mode> expander):
>     Use aarch64_reg_or_zero predicate for operand 4.
>     (aarch64_compare_and_swap<mode> define_insn_and_split):
>     Use aarch64_reg_or_zero predicate for operand 3.  Add 'Z' constraint.
>     (aarch64_store_exclusive<mode>): Likewise for operand 2.
>
> 2017-02-28  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>
>
>     * gcc.target/aarch64/atomic_cmp_exchange_zero_reg_1.c: New test.
Kyrill Tkachov June 20, 2017, 8:30 a.m. UTC | #6
Hi Andrew,

On 20/06/17 06:06, Andrew Pinski wrote:
> On Tue, Feb 28, 2017 at 4:29 AM, Kyrill Tkachov
> <kyrylo.tkachov@foss.arm.com> wrote:
>> Hi all,
>>
>> For the testcase in this patch we currently generate:
>> foo:
>>          mov     w1, 0
>>          ldaxr   w2, [x0]
>>          cmp     w2, 3
>>          bne     .L2
>>          stxr    w3, w1, [x0]
>>          cmp     w3, 0
>> .L2:
>>          cset    w0, eq
>>          ret
>>
>> Note that the STXR could have been storing the WZR register instead of
>> moving zero into w1.
>> This is due to overly strict predicates and constraints in the store
>> exclusive pattern and the
>> atomic compare exchange expanders and splitters.
>> This simple patch fixes that in the patterns concerned and with it we can
>> generate:
>> foo:
>>          ldaxr   w1, [x0]
>>          cmp     w1, 3
>>          bne     .L2
>>          stxr    w2, wzr, [x0]
>>          cmp     w2, 0
>> .L2:
>>          cset    w0, eq
>>          ret
>>
>>
>> Bootstrapped and tested on aarch64-none-linux-gnu.
>> Ok for GCC 8?
>
> This patch broke compiling with -march=+lse
>
> ./home/apinski/src/local5/gcc/gcc/testsuite/gcc.target/aarch64/atomic_cmp_exchange_zero_reg_1.c:9:1:
> error: unrecognizable insn:
>   }
>   ^
> (insn 6 3 7 2 (parallel [
>              (set (reg:CC 66 cc)
>                  (unspec_volatile:CC [
>                          (const_int 0 [0])
>                      ] UNSPECV_ATOMIC_CMPSW))
>              (set (reg:SI 78)
>                  (mem/v:SI (reg/v/f:DI 77 [ a ]) [-1  S4 A32]))
>              (set (mem/v:SI (reg/v/f:DI 77 [ a ]) [-1  S4 A32])
>                  (unspec_volatile:SI [
>                          (const_int 3 [0x3])
>                          (const_int 0 [0])
>                          (const_int 1 [0x1])
>                          (const_int 2 [0x2])
>                          (const_int 2 [0x2])
>                      ] UNSPECV_ATOMIC_CMPSW))
>          ]) "/home/apinski/src/local5/gcc/gcc/testsuite/gcc.target/aarch64/atomic_cmp_exchange_zero_reg_1.c":8
> -1
>       (nil))
> during RTL pass: vregs
>
> Note also your new testcase is broken even for defaulting to +lse as
> it is not going to match stxr.  I might be the only person who tests
> +lse by default :).

I reproduced the ICE, sorry for the trouble.
I believe the fix is as simple as relaxing the register_operand predicate
on the "value" operand of the LSE cas* patterns to aarch64_reg_or_zero
(and extending the constraint as well). This fixes the ICE for me.

I'll test a patch and submit ASAP.

Kyrill

> Thanks,
> Andrew Pinski
>
>> Thanks,
>> Kyrill
>>
>> 2017-02-28  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>
>>
>>      * config/aarch64/atomics.md (atomic_compare_and_swap<mode> expander):
>>      Use aarch64_reg_or_zero predicate for operand 4.
>>      (aarch64_compare_and_swap<mode> define_insn_and_split):
>>      Use aarch64_reg_or_zero predicate for operand 3.  Add 'Z' constraint.
>>      (aarch64_store_exclusive<mode>): Likewise for operand 2.
>>
>> 2017-02-28  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>
>>
>>      * gcc.target/aarch64/atomic_cmp_exchange_zero_reg_1.c: New test.
diff mbox

Patch

diff --git a/gcc/config/aarch64/atomics.md b/gcc/config/aarch64/atomics.md
index 09d441075f0383420d26b7d3ccb62da2a3c44422..27fc1933ce39b6eddde9c092fa849e5f6645bea3 100644
--- a/gcc/config/aarch64/atomics.md
+++ b/gcc/config/aarch64/atomics.md
@@ -25,7 +25,7 @@  (define_expand "atomic_compare_and_swap<mode>"
    (match_operand:ALLI 1 "register_operand" "")			;; val out
    (match_operand:ALLI 2 "aarch64_sync_memory_operand" "")	;; memory
    (match_operand:ALLI 3 "general_operand" "")			;; expected
-   (match_operand:ALLI 4 "register_operand" "")			;; desired
+   (match_operand:ALLI 4 "aarch64_reg_or_zero" "")			;; desired
    (match_operand:SI 5 "const_int_operand")			;; is_weak
    (match_operand:SI 6 "const_int_operand")			;; mod_s
    (match_operand:SI 7 "const_int_operand")]			;; mod_f
@@ -45,7 +45,7 @@  (define_insn_and_split "aarch64_compare_and_swap<mode>"
    (set (match_dup 1)
     (unspec_volatile:SHORT
       [(match_operand:SI 2 "aarch64_plus_operand" "rI")	;; expected
-       (match_operand:SHORT 3 "register_operand" "r")	;; desired
+       (match_operand:SHORT 3 "aarch64_reg_or_zero" "rZ")	;; desired
        (match_operand:SI 4 "const_int_operand")		;; is_weak
        (match_operand:SI 5 "const_int_operand")		;; mod_s
        (match_operand:SI 6 "const_int_operand")]	;; mod_f
@@ -69,7 +69,7 @@  (define_insn_and_split "aarch64_compare_and_swap<mode>"
    (set (match_dup 1)
     (unspec_volatile:GPI
       [(match_operand:GPI 2 "aarch64_plus_operand" "rI")	;; expect
-       (match_operand:GPI 3 "register_operand" "r")		;; desired
+       (match_operand:GPI 3 "aarch64_reg_or_zero" "rZ")		;; desired
        (match_operand:SI 4 "const_int_operand")			;; is_weak
        (match_operand:SI 5 "const_int_operand")			;; mod_s
        (match_operand:SI 6 "const_int_operand")]		;; mod_f
@@ -534,7 +534,7 @@  (define_insn "aarch64_store_exclusive<mode>"
     (unspec_volatile:SI [(const_int 0)] UNSPECV_SX))
    (set (match_operand:ALLI 1 "aarch64_sync_memory_operand" "=Q")
     (unspec_volatile:ALLI
-      [(match_operand:ALLI 2 "register_operand" "r")
+      [(match_operand:ALLI 2 "aarch64_reg_or_zero" "rZ")
        (match_operand:SI 3 "const_int_operand")]
       UNSPECV_SX))]
   ""
diff --git a/gcc/testsuite/gcc.target/aarch64/atomic_cmp_exchange_zero_reg_1.c b/gcc/testsuite/gcc.target/aarch64/atomic_cmp_exchange_zero_reg_1.c
new file mode 100644
index 0000000000000000000000000000000000000000..15606b6899012dcb6616e0313d343b77249e1b24
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/atomic_cmp_exchange_zero_reg_1.c
@@ -0,0 +1,12 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+int
+foo (int *a)
+{
+  int x = 3;
+  return __atomic_compare_exchange_n (a, &x, 0, 1, __ATOMIC_ACQUIRE, __ATOMIC_ACQUIRE);
+}
+
+/* { dg-final { scan-assembler "stxr\\tw\[0-9\]+, wzr,.*" } } */
+/* { dg-final { scan-assembler-not "mov\\tw\[0-9\]+, 0" } } */