Message ID | or4mgtmpgs.fsf@livre.home |
---|---|
State | New |
Headers | show |
On 11/10/2015 03:58 PM, Alexandre Oliva wrote: > On Nov 10, 2015, Alan Lawrence <alan.lawrence@arm.com> wrote: > >> FAIL: gcc.target/aarch64/aapcs64/func-ret-4.c execution, -O2 > > Ugh, sorry. I even checked that testcase by hand before submitting the > patch, because I knew it took the paths I was changing, but I didn't > realize the stack store and load would amount to shifts when the stack > slot was bypassed. > > With the following patch, we get a lsr and a ubfx, without the sp > adjustments. Please let me know if it causes any further problems. So > far, I've tested it on x86_64-linux-gnu, i686-linux-gnu, and > ppc64le-linux-gnu; the ppc64-linux-gnu test run is running slower and > probably won't be done before I call it a day, but I wanted to give you > something before taking off for the day. > > Is this ok to install if ppc64-linux-gnu also regstraps successfully? > > > [PR67753] adjust for padding when bypassing memory in assign_parm_setup_block > > From: Alexandre Oliva <aoliva@redhat.com> > > Storing a register in memory as a full word and then accessing the > same memory address under a smaller-than-word mode amounts to > right-shifting of the register word on big endian machines. So, if > BLOCK_REG_PADDING chooses upward padding for BYTES_BIG_ENDIAN, and > we're copying from the entry_parm REG directly to a pseudo, bypassing > any stack slot, perform the shifting explicitly. > > This fixes the miscompile of function_return_val_10 in > gcc.target/aarch64/aapcs64/func-ret-4.c for target aarch64_be-elf > introduced in the first patch for 67753. > > for gcc/ChangeLog > > PR rtl-optimization/67753 > PR rtl-optimization/64164 > * function.c (assign_parm_setup_block): Right-shift > upward-padded big-endian args when bypassing the stack slot. Don't you need to check the value of BLOCK_REG_PADDING at runtime? The padding is essentially allowed to vary. If you look at the other places where BLOCK_REG_PADDING is used, it's checked in a #ifdef, then again inside a if conditional. Jeff
On Nov 10, 2015, Jeff Law <law@redhat.com> wrote: >> * function.c (assign_parm_setup_block): Right-shift >> upward-padded big-endian args when bypassing the stack slot. > Don't you need to check the value of BLOCK_REG_PADDING at runtime? > The padding is essentially allowed to vary. Well, yeah, it's the result of BLOCK_REG_PADDING that tells whether upward-padding occurred and shifting is required. > If you look at the other places where BLOCK_REG_PADDING is used, it's > checked in a #ifdef, then again inside a if conditional. That's what I do in the patch too. That said, the initial conditions in the if/else-if/else chain for the no-larger-than-a-word case cover all of the non-BLOCK_REG_PADDING cases correctly, so that, if BLOCK_REG_PADDING is not defined, we can just skip the !MEM_P block altogether. That's also the reason why we can go straight to shifting when we get there. I tried to document my reasoning in the comments, but maybe it was still too obscure?
On 11/11/2015 11:10 AM, Alexandre Oliva wrote: > On Nov 10, 2015, Jeff Law <law@redhat.com> wrote: > >>> * function.c (assign_parm_setup_block): Right-shift >>> upward-padded big-endian args when bypassing the stack slot. >> Don't you need to check the value of BLOCK_REG_PADDING at runtime? >> The padding is essentially allowed to vary. > > Well, yeah, it's the result of BLOCK_REG_PADDING that tells whether > upward-padding occurred and shifting is required. > >> If you look at the other places where BLOCK_REG_PADDING is used, it's >> checked in a #ifdef, then again inside a if conditional. > > That's what I do in the patch too. ? I don't see the runtime check in your patch. I see a couple gcc_asserts, but no runtime check of BLOCK_REG_PADDING. > > That said, the initial conditions in the if/else-if/else chain for the > no-larger-than-a-word case cover all of the non-BLOCK_REG_PADDING cases > correctly, so that, if BLOCK_REG_PADDING is not defined, we can just > skip the !MEM_P block altogether. That's also the reason why we can go > straight to shifting when we get there. > > I tried to document my reasoning in the comments, but maybe it was still > too obscure? Certainly seems that way. Is it your assertion that the new code is what we want regardless of the *value* of REG_BLOCK_PADDING? Essentially meaning the check in the IF is covering both cases? What am I missing here? Jeff
On Nov 13, 2015, Jeff Law <law@redhat.com> wrote: > On 11/11/2015 11:10 AM, Alexandre Oliva wrote: >> On Nov 10, 2015, Jeff Law <law@redhat.com> wrote: >> >>>> * function.c (assign_parm_setup_block): Right-shift >>>> upward-padded big-endian args when bypassing the stack slot. >>> Don't you need to check the value of BLOCK_REG_PADDING at runtime? >>> The padding is essentially allowed to vary. >> >> Well, yeah, it's the result of BLOCK_REG_PADDING that tells whether >> upward-padding occurred and shifting is required. >> >>> If you look at the other places where BLOCK_REG_PADDING is used, it's >>> checked in a #ifdef, then again inside a if conditional. >> >> That's what I do in the patch too. > ? I don't see the runtime check in your patch. I see a couple > gcc_asserts, but no runtime check of BLOCK_REG_PADDING. The check is not in my patch, indeed. That's because the previous block performs the runtime check, and it only lets through two cases: the one we handle, and the one nobody uses. The previous block tests this: if (mode != BLKmode #ifdef BLOCK_REG_PADDING && (size == UNITS_PER_WORD || (BLOCK_REG_PADDING (mode, data->passed_type, 1) != (BYTES_BIG_ENDIAN ? upward : downward))) #endif ) i.e., whether we know the mode of the passed value, and its word-sized, or its padded such that the passed value is in the lowpart of the word. Since this is in a block that runs when size <= UNITS_PER_WORD, this catches (and works for) all cases of default padding (when BLOCK_REG_PADDING is not defined), and for cases in which BLOCK_REG_PADDING is defined so as to behave like the default padding, at least for smaller-than-word modes. So, since this handles little-endian bytes with upward padding and big-endian bytes with downward padding, what remains to be handled is little-endian bytes with downward padding and big-endian bytes with upward padding. I found no evidence that the former is ever used anywhere, or why anyone would ever force shifting for both REG and MEM use, and I don't see how the code would have dealt with this case anyway, so I left it unhandled. The other case, big-endian bytes with upward padding, is precisely the one that my previous patch broke on AArch64: we have the passed values pushed to the upper part of the REG so that it could be stored in memory as a whole word and then accessed in the smaller mode at the same address. After checking that this is the case at hand, we shift the value as if we stored it in memory as a word and loaded it in the value mode. >> That said, the initial conditions in the if/else-if/else chain for the >> no-larger-than-a-word case cover all of the non-BLOCK_REG_PADDING cases >> correctly, so that, if BLOCK_REG_PADDING is not defined, we can just >> skip the !MEM_P block altogether. That's also the reason why we can go >> straight to shifting when we get there. >> >> I tried to document my reasoning in the comments, but maybe it was still >> too obscure? > Certainly seems that way. Is it your assertion that the new code is > what we want regardless of the *value* of REG_BLOCK_PADDING? Sort of. If we get to that point, there's only one reasonable value of BLOCK_REG_PADDING (*), although there's another possible value that we historically haven't handled and that makes very little sense to support. We'd have silently corrupted it before, while now we'd get an assertion failure. I count that as an improvement, though it's unlikely we'd ever hit it: anyone trying to define BLOCK_REG_PADDING so as to pad small args downward on little-endian bytes would AFAICT soon find out it doesn't work. (*) unless mode is BLKmode, which the newly-added code implicitly excludes by testing that we don't have a MEM, but rather a REG. > Essentially meaning the check in the IF is covering both cases? Among all cases for arguments that are word-sized or smaller, the initial IF (not present in the patch) covers all of the "usual" cases. The remaining blocks, including the one I added, cover the remaining handled case, namely, BIG_ENDIAN_BYTES and upward BLOCK_REG_PADDING, or BLKmode BIG_ENDIAN_BYTES and downward BLOCK_REG_PADDING (that needs the opposite padding when storing to big-endian mem, so that the value can be accessed at the address in which the full word is stored). > What am I missing here? I agree that the way the remaining tests are written doesn't make it clear that they're all handling a single case, which makes things confusing. In part, that's because they really aren't; they also deal with BLKmode MEMs with "usual" padding. But that's not a case that the patch affects, because we don't have BLKmode REGs. Any suggestions on how to improve the comments so that they convey enough of this reasoning to make sense, without our having to write a book :-) on the topic? Thanks,
On 11/16/2015 05:07 PM, Alexandre Oliva wrote: > > The check is not in my patch, indeed. That's because the previous block > performs the runtime check, and it only lets through two cases: the one > we handle, and the one nobody uses. That was the conclusion I was starting to come to, but expressed so poorly in my last message. Sadly it was non-obvious from staring at the current code. Though I must admit that after a week, I can see it better now. Maybe that's a result of re-reading your message a half-dozen more times with the current code and your patch all visible in windows next to each other :-) Prior to your change we'd just blindly copy from ENTRY_PARM to MEM, which would result in missing the implicit shift if MEM wasn't actually a memory. You're just moving that conditional up and handling the shift explicitly. You've got asserts for the cases you're not handling (and no, I'm not aware of the need for this on any LE architecture, while I am aware of BE architectures that align in both directions). > Any suggestions on how to improve the comments so that they convey > enough of this reasoning to make sense, without our having to write a > book :-) on the topic? Refer back to this thread? :-) Seriously though, looking at things a week later, I can see it much better now. Thanks for your patience on this. OK for the trunk, jeff
diff --git a/gcc/function.c b/gcc/function.c index a637cb3..1ee092c 100644 --- a/gcc/function.c +++ b/gcc/function.c @@ -3002,6 +3002,38 @@ assign_parm_setup_block (struct assign_parm_data_all *all, emit_move_insn (change_address (mem, mode, 0), reg); } +#ifdef BLOCK_REG_PADDING + /* Storing the register in memory as a full word, as + move_block_from_reg below would do, and then using the + MEM in a smaller mode, has the effect of shifting right + if BYTES_BIG_ENDIAN. If we're bypassing memory, the + shifting must be explicit. */ + else if (!MEM_P (mem)) + { + rtx x; + + /* If the assert below fails, we should have taken the + mode != BLKmode path above, unless we have downward + padding of smaller-than-word arguments on a machine + with little-endian bytes, which would likely require + additional changes to work correctly. */ + gcc_checking_assert (BYTES_BIG_ENDIAN + && (BLOCK_REG_PADDING (mode, + data->passed_type, 1) + == upward)); + + int by = (UNITS_PER_WORD - size) * BITS_PER_UNIT; + + x = gen_rtx_REG (word_mode, REGNO (entry_parm)); + x = expand_shift (RSHIFT_EXPR, word_mode, x, by, + NULL_RTX, 1); + x = force_reg (word_mode, x); + x = gen_lowpart_SUBREG (GET_MODE (mem), x); + + emit_move_insn (mem, x); + } +#endif + /* Blocks smaller than a word on a BYTES_BIG_ENDIAN machine must be aligned to the left before storing to memory. Note that the previous test doesn't @@ -3023,14 +3055,20 @@ assign_parm_setup_block (struct assign_parm_data_all *all, tem = change_address (mem, word_mode, 0); emit_move_insn (tem, x); } - else if (!MEM_P (mem)) - emit_move_insn (mem, entry_parm); else move_block_from_reg (REGNO (entry_parm), mem, size_stored / UNITS_PER_WORD); } else if (!MEM_P (mem)) - emit_move_insn (mem, entry_parm); + { + gcc_checking_assert (size > UNITS_PER_WORD); +#ifdef BLOCK_REG_PADDING + gcc_checking_assert (BLOCK_REG_PADDING (GET_MODE (mem), + data->passed_type, 0) + == upward); +#endif + emit_move_insn (mem, entry_parm); + } else move_block_from_reg (REGNO (entry_parm), mem, size_stored / UNITS_PER_WORD);