Message ID | 20140429.224046.1363822337091646659.davem@davemloft.net |
---|---|
State | New |
Headers | show |
On Tue, Apr 29, 2014 at 10:40:46PM -0400, David Miller wrote: > From: Aurelien Jarno <aurelien@aurel32.net> > Date: Tue, 29 Apr 2014 11:53:39 +0200 > > > This method doesn't work when comparing a 0x00 char in string 1 and 0x01 > > char in string 2. In that case the mask for this byte is 0x01 and the > > corresponding xor is also 0x01. The result of the comparison therefore > > depends on the garbage after the end of the string. > > > > On Debian [1] this causes for example debian-installer to fail to build > > [2], and it might be the source of the random segfaults which we are > > trying to debug for a few years. > > > > [1] http://bugs.debian.org/746310 > > [2] http://bugs.debian.org/731806 > > Right you are, if there are any zeros after the first zero we will > potentially miscompare. > > The only solution I can see at the moment is to clear all except the > topmost bit in the mask. > > Can you please test this patch? Thanks a lot for this quick patch. I confirm it works correctly. That said as the patch needs to test the mask byte by byte, I do wonder if overall it wont be easier to directly test byte by byte the string word loaded in the register. My SPARC assembly knowledge is relatively limited, so I am not sure it is actually the case.
From: Aurelien Jarno <aurelien@aurel32.net> Date: Wed, 30 Apr 2014 18:09:33 +0200 > That said as the patch needs to test the mask byte by byte, I do wonder > if overall it wont be easier to directly test byte by byte the string > word loaded in the register. My SPARC assembly knowledge is relatively > limited, so I am not sure it is actually the case. There is also the possibility to do divide-and-conquer, for example determining that the low 32-bits have not bits set and then just checking the high 32-bits. But all such schemes add branches, and the whole idea of this code is for it to be straight line and branchless, thus avoiding thread switching on Niagara. Anyways, thanks for testing I'll get this fix merged.
diff --git a/sysdeps/sparc/sparc64/strcmp.S b/sysdeps/sparc/sparc64/strcmp.S index 8925396..312924a 100644 --- a/sysdeps/sparc/sparc64/strcmp.S +++ b/sysdeps/sparc/sparc64/strcmp.S @@ -121,6 +121,37 @@ ENTRY(strcmp) movleu %xcc, -1, %o0 srlx rTMP1, 7, rTMP1 + /* In order not to be influenced by bytes after the zero byte, we + * have to retain only the highest bit in the mask for the comparison + * with rSTRXOR to work properly. + */ + mov 0, rTMP2 + andcc rTMP1, 0x0100, %g0 + + movne %xcc, 8, rTMP2 + sllx rTMP1, 63 - 16, %o1 + + movrlz %o1, 16, rTMP2 + sllx rTMP1, 63 - 24, %o1 + + movrlz %o1, 24, rTMP2 + sllx rTMP1, 63 - 32, %o1 + + movrlz %o1, 32, rTMP2 + sllx rTMP1, 63 - 40, %o1 + + movrlz %o1, 40, rTMP2 + sllx rTMP1, 63 - 48, %o1 + + movrlz %o1, 48, rTMP2 + sllx rTMP1, 63 - 56, %o1 + + movrlz %o1, 56, rTMP2 + + srlx rTMP1, rTMP2, rTMP1 + + sllx rTMP1, rTMP2, rTMP1 + cmp rTMP1, rSTRXOR retl movgu %xcc, 0, %o0