Message ID | 20111218.091549.446387858.kkojima@rr.iij4u.or.jp |
---|---|
State | New |
Headers | show |
On Sun, 2011-12-18 at 09:15 +0900, Kaz Kojima wrote: > Your patch doesn't work because SH soft atomic sequences have > another constraint that label 1 should have been 4-byte aligned. I know. That's what I was actually doing. Maybe I should have commented it, as it is not so obvious. The aligns are placed in such a way, that label 1 will always end up being 4-byte aligned. Example (atomic_compare_and_swap<mode>_soft): mova 1f, r0 <i124extend_insn> %2, %4 .align 2 ! align mov r15, r1 ! 4n + 0 mov #(0f-1f), r15 ! 4n + 2 0: mov.<i124suffix> @%1,%0 ! 4n + 4 cmp/eq %0, %4 ! 4n + 6 bf 1f ! 4n + 8 mov.<i124suffix> %3, @%1 ! 4n + 10 1: mov r1, r15 ! 4n + 12 = 4-byte aligned Example (atomic_fetch_<fetchop_name><mode>_soft): mova 1f, r0 .align 2 ! align mov r15, r1 ! 4n + 0 mov #(0f-1f), r15 ! 4n + 2 0: mov.<i124suffix> @%1, %0 ! 4n + 4 mov %0, %3 ! 4n + 6 <fetchop_insn> %2, %3 ! 4n + 8 mov.<i124suffix> %3, @%1 ! 4n + 10 1: mov r1, r15 ! 4n + 12 = 4-byte aligned +.align\\t2\\n\\ > +\\tmova\\t1f, r0\\n\\ > +\\tnop\\n\\ > \\tmov\\tr15, r1\\n\\ > \\tmov\\t#(0f-1f), r15\\n\\ > 0:\\tmov.<i124suffix>\\t@%1, %0\\n\\ > This might end up as... nop mova 1f,r0 nop mov r15,r1 ... One of the nops can be avoided by placing the align after the mova insn as shown above. Then the insn length also shouldn't need to be increased. > And fetchop_name for the logical or operation uses ior instead > of or. See genoptint.c, for example. Having this in sync.md: (define_code_attr fetchop_name [(plus "add") (minus "sub") (ior "ior") (xor "xor") (and "and")]) the following happens: int test (std::atomic<int>& val, int x) { return val |= x; } expand pass log: ;; Function int test(std::atomic<int>&, int) (_Z4testRSt6atomicIiEi, funcdef_no=319, decl_uid=7596, cgraph_uid=130) int test(std::atomic<int>&, int) (struct atomic & val, int x) { __int_type * D.7693; unsigned int __i.0; unsigned int D.7691; __int_type D.7690; # BLOCK 2 freq:10000 # PRED: ENTRY [100.0%] (fallthru,exec) D.7693_7 = &MEM[(struct __atomic_base *)val_1(D)]._M_i; __i.0_8 = (unsigned int) x_3(D); D.7691_9 = __atomic_or_fetch_4 (D.7693_7, __i.0_8, 5); [tail call] D.7690_10 = (__int_type) D.7691_9; return D.7690_10; # SUCC: EXIT [100.0%] } final output: __Z4testRSt6atomicIiEi: .LFB319: .cfi_startproc mov.l @r4,r2 ! 9 movsi_ie/7 [length = 2] .L2: mov r2,r3 ! 45 movsi_ie/2 [length = 2] or r5,r3 ! 7 *iorsi3_compact/1 [length = 2] mova 1f, r0 ! 12 atomic_compare_and_swapsi_soft [length = 20] mov r2, r7 mov r15, r1 mov #(0f-1f), r15 0: mov.l @r4, r6 cmp/eq r6, r7 bf 1f mov.l r3, @r4 .align 2 1: mov r1, r15 movt r0 ! 13 movqi_i/5 [length = 2] tst #255,r0 ! 15 tstqi_t_zero [length = 2] bt/s .L2 ! 16 branch_true [length = 2] mov r6,r2 ! 46 movsi_ie/2 [length = 2] rts ! 49 *return_i [length = 2] mov r3,r0 ! 21 movsi_ie/2 [length = 2] .cfi_endproc .LFE319: Changing "ior" to "or" in fetchop_name fixed that. Cheers, Oleg
Oleg Endo <oleg.endo@t-online.de> wrote: > I know. That's what I was actually doing. Maybe I should have > commented it, as it is not so obvious. The aligns are placed in such a > way, that label 1 will always end up being 4-byte aligned. You are right. I've misunderstood your changes. Sorry for my mess. > Having this in sync.md: > > (define_code_attr fetchop_name > [(plus "add") (minus "sub") (ior "ior") (xor "xor") (and "and")]) > > > the following happens: > > > int test (std::atomic<int>& val, int x) > { > return val |= x; > } [snip] > Changing "ior" to "or" in fetchop_name fixed that. Ugh, I've read the middle end code wrongly. Then we can remove fetchop_insn which now becomes to be the same one with fetchop_name. Could you propose the patch with that change and the backslash changes rth suggested? It's pre-approved with those changes. Regards, kaz
On Mon, 2011-12-19 at 07:13 +0900, Kaz Kojima wrote: > You are right. I've misunderstood your changes. > Sorry for my mess. Uhm, I could just have left the comments in the patch, but for some mysterious reasons I deleted them before sending it :T Sorry for that. If OK, I'd like to add some more verbose comments to sync.md, which also describe the ABI that is used for the atomic sequences. Just to have it in one place. > Ugh, I've read the middle end code wrongly. Then we can remove > fetchop_insn which now becomes to be the same one with fetchop_name. > Could you propose the patch with that change and the backslash > changes rth suggested? It's pre-approved with those changes. > Sure, no problem. Cheers, Oleg
Oleg Endo <oleg.endo@t-online.de> wrote: > If OK, I'd like to add some more verbose comments to sync.md, which also > describe the ABI that is used for the atomic sequences. Just to have it > in one place. Sure. Such comments will be fine. Regards, kaz
--- ORIG/trunk/gcc/config/sh/sync.md 2011-12-05 10:04:44.000000000 +0900 +++ trunk/gcc/config/sh/sync.md 2011-12-18 07:21:56.000000000 +0900 @@ -88,7 +88,8 @@ "* { return \"\\ -mova\\t1f, r0\\n\\ +.align\\t2\\n\\ +\\tmova\\t1f, r0\\n\\ \\t<i124extend_insn>\\t%2, %4\\n\\ \\tmov\\tr15, r1\\n\\ \\tmov\\t#(0f-1f), r15\\n\\ @@ -96,7 +97,6 @@ mova\\t1f, r0\\n\\ \\tcmp/eq\\t%0, %4\\n\\ \\tbf\\t1f\\n\\ \\tmov.<i124suffix>\\t%3, @%1\\n\\ -\\t.align\\t2\\n\\ 1:\\tmov\tr1, r15\"; }" [(set_attr "length" "20")]) @@ -141,17 +141,18 @@ mova\\t1f, r0\\n\\ "* { return \"\\ -mova\\t1f, r0\\n\\ +.align\\t2\\n\\ +\\tmova\\t1f, r0\\n\\ +\\tnop\\n\\ \\tmov\\tr15, r1\\n\\ \\tmov\\t#(0f-1f), r15\\n\\ 0:\\tmov.<i124suffix>\\t@%1, %0\\n\\ \\tmov\\t%0, %3\\n\\ \\t<fetchop_insn>\\t%2, %3\\n\\ \\tmov.<i124suffix>\\t%3, @%1\\n\\ -\\t.align\\t2\\n\\ 1:\\tmov\tr1, r15\"; }" - [(set_attr "length" "18")]) + [(set_attr "length" "20")]) (define_expand "atomic_fetch_nand<mode>" [(set (match_operand:I124 0 "register_operand" "") @@ -193,7 +194,8 @@ mova\\t1f, r0\\n\\ "* { return \"\\ -mova\\t1f, r0\\n\\ +.align\\t2\\n\\ +\\tmova\\t1f, r0\\n\\ \\tmov\\tr15, r1\\n\\ \\tmov\\t#(0f-1f), r15\\n\\ 0:\\tmov.<i124suffix>\\t@%1, %0\\n\\ @@ -201,7 +203,6 @@ mova\\t1f, r0\\n\\ \\tand\\t%0, %3\\n\\ \\tnot\\t%3, %3\\n\\ \\tmov.<i124suffix>\\t%3, @%1\\n\\ -\\t.align\\t2\\n\\ 1:\\tmov\tr1, r15\"; }" [(set_attr "length" "20")]) @@ -247,13 +248,13 @@ mova\\t1f, r0\\n\\ "* { return \"\\ -mova\\t1f, r0\\n\\ +.align\\t2\\n\\ +\\tmova\\t1f, r0\\n\\ \\tmov\\tr15, r1\\n\\ \\tmov\\t#(0f-1f), r15\\n\\ 0:\\tmov.<i124suffix>\\t@%1, %0\\n\\ \\t<fetchop_insn>\\t%2, %0\\n\\ \\tmov.<i124suffix>\\t%0, @%1\\n\\ -\\t.align\\t2\\n\\ 1:\\tmov\tr1, r15\"; }" [(set_attr "length" "16")]) @@ -299,14 +300,15 @@ mova\\t1f, r0\\n\\ "* { return \"\\ -mova\\t1f, r0\\n\\ +.align\\t2\\n\\ +\\tmova\\t1f, r0\\n\\ +\\tnop\\n\\ \\tmov\\tr15, r1\\n\\ \\tmov\\t#(0f-1f), r15\\n\\ 0:\\tmov.<i124suffix>\\t@%1, %0\\n\\ \\tand\\t%2, %0\\n\\ \\tnot\\t%0, %0\\n\\ \\tmov.<i124suffix>\\t%0, @%1\\n\\ -\\t.align\\t2\\n\\ 1:\\tmov\tr1, r15\"; }" - [(set_attr "length" "18")]) + [(set_attr "length" "20")])