Message ID | a321b583-d108-3c45-3354-fa80ccef8171@us.ibm.com |
---|---|
State | New |
Headers | show |
On 12/20/2016 09:19 AM, Paul Clarke wrote: > This change ensures that the "size" passed to strncmp() contains the > strings within the page. If there is a known difference or a known EOS before the end of the page, then SIZE can legitimately be beyond the end of the page and the routine should not SEGV. Are you certain you're not removing valid testing for this function? r~
On 12/20/2016 11:29 AM, Richard Henderson wrote: > On 12/20/2016 09:19 AM, Paul Clarke wrote: >> This change ensures that the "size" passed to strncmp() contains the >> strings within the page. > > If there is a known difference or a known EOS before the end of the page, then > SIZE can legitimately be beyond the end of the page and the routine should not > SEGV. > > Are you certain you're not removing valid testing for this function? That's a fair question. do_random_tests() checks if strncmp() gives correct results for different alignment and length combinations. It checks if strncmp() is not issuing unaligned loads beyond a native alignment: by placing the strings near the end of a page, such loads will provoke a seg fault. This test already takes pains to ensure the string proper does not cross a page boundary by doing various checks, but one check missing is for variable 'size'. The 'size' parameter can be used in the implementation of strncmp() to decide the loop value for the number of loads. So these checks help to make the testcase valid thereby any failure would be in the implementation of strncmp(). Regards, PC
On 12/20/2016 10:46 AM, Paul Clarke wrote: > The 'size' parameter can be used in the implementation of strncmp() to decide > the loop value for the number of loads. No, it can't. I know because I've made that mistake before. r~
On 12/20/2016 08:15 PM, Richard Henderson wrote: > On 12/20/2016 10:46 AM, Paul Clarke wrote: >> The 'size' parameter can be used in the implementation of strncmp() to decide >> the loop value for the number of loads. > > No, it can't. I know because I've made that mistake before. Richard is right. Even C11 is pretty clear that this optimization is not permitted because it says that “characters that follow a null character are not compared”. It's unspecified whether reading stops at the first difference. But considering that you need to check for NUL bytes anyway, I don't think this matters in practice because any size-based optimization would be invalid anyway. This is also relevant to real-world code, which may use strncmp (s, "expected-prefix", strlen ("expected-prefix")) == 0 to determine whether s begins with "expected-prefix". s could point close to the end of the heap. Thanks, Florian
On 12/21/2016 03:44 AM, Florian Weimer wrote: > On 12/20/2016 08:15 PM, Richard Henderson wrote: >> On 12/20/2016 10:46 AM, Paul Clarke wrote: >>> The 'size' parameter can be used in the implementation of strncmp() to decide >>> the loop value for the number of loads. >> >> No, it can't. I know because I've made that mistake before. > > Richard is right. Even C11 is pretty clear that this optimization is > not permitted because it says that “characters that follow a > null character are not compared”. > > It's unspecified whether reading stops at the first difference. But > considering that you need to check for NUL bytes anyway, I don't think > this matters in practice because any size-based optimization would be > invalid anyway. > > This is also relevant to real-world code, which may use > > strncmp (s, "expected-prefix", strlen ("expected-prefix")) == 0 > > to determine whether s begins with "expected-prefix". s could point > close to the end of the heap. Thanks for the thorough reviews. Looking more carefully this time, I agree. Since I sifted through the fairly complex code and found it challenging, let me try to preserve some value from the effort and submit a patch with just the explanatory comments, in case someone else needs to understand the code in the future. PC
diff --git a/string/test-strncmp.c b/string/test-strncmp.c index fb57a9b..97768f9 100644 --- a/string/test-strncmp.c +++ b/string/test-strncmp.c @@ -264,33 +264,58 @@ do_random_tests (void) size_t i, j, n, align1, align2, pos, len1, len2, size; int result; long r; + /* Get pointers near the end of a "page". */ UCHAR *p1 = (UCHAR *) (buf1 + page_size - 512 * CHARBYTES); UCHAR *p2 = (UCHAR *) (buf2 + page_size - 512 * CHARBYTES); for (n = 0; n < ITERATIONS; n++) { + /* Start index for first string within p1. */ align1 = random () & 31; + + /* Start index for second string within p2. */ if (random () & 1) align2 = random () & 31; else align2 = align1 + (random () & 24); - pos = random () & 511; - size = random () & 511; + + /* Compute larger offset into buffers. */ j = align1 > align2 ? align1 : align2; - if (pos + j >= 511) + + /* Position of difference between strings. */ + pos = random () & 511; + + /* Ensure pos within range for strings. */ + if (j + pos >= 511) pos = 510 - j - (random () & 7); + + /* Maximum size of operation. */ + size = random () & 511; + + /* Ensure size within range for buffers. */ + if (j + size >= 512) + size = 512 - j; + + /* Actual length of 1st string. */ len1 = random () & 511; if (pos >= len1 && (random () & 1)) len1 = pos + (random () & 7); if (len1 + j >= 512) len1 = 511 - j - (random () & 7); + + /* Actual length of 2nd string. */ + /* Note len2 >= len1. */ if (pos >= len1) len2 = len1; else len2 = len1 + (len1 != 511 - j ? random () % (511 - j - len1) : 0); + + /* Compute max length of strings, plus margin of dword. */ j = (pos > len2 ? pos : len2) + align1 + 64; if (j > 512) j = 512; + + /* Fill p1 with random data. */ for (i = 0; i < j; ++i) { p1[i] = random () & 255; @@ -301,6 +326,8 @@ do_random_tests (void) p1[i] = 1 + (random () & 127); } } + + /* Fill p2 with random data. */ for (i = 0; i < j; ++i) { p2[i] = random () & 255; @@ -313,11 +340,15 @@ do_random_tests (void) } result = 0; + + /* Make strings the same, up to position pos. */ MEMCPY (p2 + align2, p1 + align1, pos); + if (pos < len1) { if (p2[align2 + pos] == p1[align1 + pos]) { + /* Insert difference. */ p2[align2 + pos] = random () & 255; if (p2[align2 + pos] == p1[align1 + pos]) p2[align2 + pos] = p1[align1 + pos] + 3 + (random () & 127); @@ -325,12 +356,15 @@ do_random_tests (void) if (pos < size) { + /* Set expectations. */ if (p1[align1 + pos] < p2[align2 + pos]) result = -1; else result = 1; } } + + /* Null terminate strings. */ p1[len1 + align1] = 0; p2[len2 + align2] = 0;