Message ID | 20210607083011.855616-1-goldstein.w.n@gmail.com |
---|---|
State | New |
Headers | show |
Series | [v1] x86: memcmp-avx2-movbe.S and memcmp-evex-movbe.S fix overflow bug. | expand |
On Mon, Jun 7, 2021 at 1:30 AM Noah Goldstein <goldstein.w.n@gmail.com> wrote: > > Fix bugs introducted in commits: > > author Noah Goldstein <goldstein.w.n@gmail.com> > Mon, 17 May 2021 17:57:24 +0000 (13:57 -0400) > commit 4ad473e97acdc5f6d811755b67c09f2128a644ce > > And > > author Noah Goldstein <goldstein.w.n@gmail.com> > Mon, 17 May 2021 17:56:52 +0000 (13:56 -0400) > commit 16d12015c57701b08d7bbed6ec536641bcafb428 > > Which added a bug which would cause pointer + length overflow to lead > to an early return as opposed to a Segmentation Fault. > > Signed-off-by: Noah Goldstein <goldstein.w.n@gmail.com> > --- > sysdeps/x86_64/multiarch/memcmp-avx2-movbe.S | 13 ++++++++----- > sysdeps/x86_64/multiarch/memcmp-evex-movbe.S | 15 ++++++++------- > 2 files changed, 16 insertions(+), 12 deletions(-) > Please open glibc bugs for both memcmp and memset overflow issue. Thanks.
On 6/7/21 2:00 PM, Noah Goldstein via Libc-alpha wrote: > Fix bugs introducted in commits: > > author Noah Goldstein <goldstein.w.n@gmail.com> > Mon, 17 May 2021 17:57:24 +0000 (13:57 -0400) > commit 4ad473e97acdc5f6d811755b67c09f2128a644ce > > And > > author Noah Goldstein <goldstein.w.n@gmail.com> > Mon, 17 May 2021 17:56:52 +0000 (13:56 -0400) > commit 16d12015c57701b08d7bbed6ec536641bcafb428 > > Which added a bug which would cause pointer + length overflow to lead > to an early return as opposed to a Segmentation Fault. If we end up making this change, IMO it should come with an explicit note that this behaviour is not guaranteed for invalid inputs in other implementations of memcmp or for that matter, in future versions of this memcmp. An input that causes pointer + length overflow is undefined behaviour and IMO we shouldn't try to define it for glibc. This change may not have a noticeable performance impact but future requests to guarantee this behaviour may not necessarily be this straightforward and IMO we should not bind ourselves to it. Siddhesh
On Mon, Jun 7, 2021 at 9:45 AM H.J. Lu <hjl.tools@gmail.com> wrote: > On Mon, Jun 7, 2021 at 1:30 AM Noah Goldstein <goldstein.w.n@gmail.com> > wrote: > > > > Fix bugs introducted in commits: > > > > author Noah Goldstein <goldstein.w.n@gmail.com> > > Mon, 17 May 2021 17:57:24 +0000 (13:57 -0400) > > commit 4ad473e97acdc5f6d811755b67c09f2128a644ce > > > > And > > > > author Noah Goldstein <goldstein.w.n@gmail.com> > > Mon, 17 May 2021 17:56:52 +0000 (13:56 -0400) > > commit 16d12015c57701b08d7bbed6ec536641bcafb428 > > > > Which added a bug which would cause pointer + length overflow to lead > > to an early return as opposed to a Segmentation Fault. > > > > Signed-off-by: Noah Goldstein <goldstein.w.n@gmail.com> > > --- > > sysdeps/x86_64/multiarch/memcmp-avx2-movbe.S | 13 ++++++++----- > > sysdeps/x86_64/multiarch/memcmp-evex-movbe.S | 15 ++++++++------- > > 2 files changed, 16 insertions(+), 12 deletions(-) > > > > Please open glibc bugs for both memcmp and memset overflow issue. > Done: Memcmp: https://sourceware.org/bugzilla/show_bug.cgi?id=27961 Memset: https://sourceware.org/bugzilla/show_bug.cgi?id=27960 > > Thanks. > > -- > H.J. >
On Mon, Jun 7, 2021 at 10:21 AM Siddhesh Poyarekar <siddhesh@gotplt.org> wrote: > On 6/7/21 2:00 PM, Noah Goldstein via Libc-alpha wrote: > > Fix bugs introducted in commits: > > > > author Noah Goldstein <goldstein.w.n@gmail.com> > > Mon, 17 May 2021 17:57:24 +0000 (13:57 -0400) > > commit 4ad473e97acdc5f6d811755b67c09f2128a644ce > > > > And > > > > author Noah Goldstein <goldstein.w.n@gmail.com> > > Mon, 17 May 2021 17:56:52 +0000 (13:56 -0400) > > commit 16d12015c57701b08d7bbed6ec536641bcafb428 > > > > Which added a bug which would cause pointer + length overflow to lead > > to an early return as opposed to a Segmentation Fault. > > If we end up making this change, IMO it should come with an explicit > note that this behaviour is not guaranteed for invalid inputs in other > implementations of memcmp or for that matter, in future versions of this > memcmp. > > An input that causes pointer + length overflow is undefined behaviour > and IMO we shouldn't try to define it for glibc. Is it actually UB? The caller is not causing overflow. The implementation method is. It is possible to implement without overflow. > This change may not > have a noticeable performance impact Minor impact actually for memcmp. > but future requests to guarantee > this behaviour may not necessarily be this straightforward and IMO we > should not bind ourselves to it. Agreed. > > Siddhesh >
On 6/7/21 10:28 AM, Noah Goldstein via Libc-alpha wrote: > > Is it actually UB? The caller is not causing overflow. The implementation > method is. It is possible to implement without overflow. The C Standard says that unless otherwise specified, when you pass an array (pointer + size) to a standard function, all the addresses in the array must be valid. It's valid if (say) memcmp is multithreaded and compares the first halves of the two arrays in parallel with comparing the second halves. If I understand things correctly this patch isn't fixing a conformance bug; it's merely a QoI issue, where by "quality" one means "I want this particular undefined behavior to cause a core dump".
On Mon, Jun 7, 2021 at 1:45 PM Paul Eggert <eggert@cs.ucla.edu> wrote: > On 6/7/21 10:28 AM, Noah Goldstein via Libc-alpha wrote: > > > > Is it actually UB? The caller is not causing overflow. The implementation > > method is. It is possible to implement without overflow. > > The C Standard says that unless otherwise specified, when you pass an > array (pointer + size) to a standard function, all the addresses in the > array must be valid. It's valid if (say) memcmp is multithreaded and > compares the first halves of the two arrays in parallel with comparing > the second halves. > If I understand things correctly this patch isn't fixing a conformance > bug; it's merely a QoI issue, where by "quality" one means "I want this > particular undefined behavior to cause a core dump". > I see. Good to hear there wasn't technically a bug :) What do we want the behavior to be? Generally I would think failing silently can be quite painful for users.
On 6/7/21 11:07 AM, Noah Goldstein wrote: > What do we want the behavior to be? Generally I would think > failing silently can be quite painful for users. Traditionally we have typically said to go for the best typical-case performance, and not to worry about how that affects undefined behavior. So, for example, glibc's memcmp ((void *) 1, NULL, 0) does not dump core even though its behavior is undefined, because it would slow glibc down to force memcmp to dump core for that case. A debugging library like valgrind's might choose to crash more reliably.
On Mon, Jun 7, 2021 at 3:51 PM Paul Eggert <eggert@cs.ucla.edu> wrote: > On 6/7/21 11:07 AM, Noah Goldstein wrote: > > What do we want the behavior to be? Generally I would think > > failing silently can be quite painful for users. > > Traditionally we have typically said to go for the best typical-case > performance, and not to worry about how that affects undefined behavior. > That works for me. The old versions are slightly faster / use less code. Panicked a bit last night when I thought I might have pushed a bug :/ > > So, for example, glibc's memcmp ((void *) 1, NULL, 0) does not dump core > even though its behavior is undefined, because it would slow glibc down > to force memcmp to dump core for that case. > > A debugging library like valgrind's might choose to crash more reliably. >
On Mon, Jun 7, 2021 at 1:45 PM Paul Eggert <eggert@cs.ucla.edu> wrote: > On 6/7/21 10:28 AM, Noah Goldstein via Libc-alpha wrote: > > > > Is it actually UB? The caller is not causing overflow. The implementation > > method is. It is possible to implement without overflow. > > The C Standard says that unless otherwise specified, when you pass an > array (pointer + size) to a standard function, all the addresses in the > array must be valid. It's valid if (say) memcmp is multithreaded and > compares the first halves of the two arrays in parallel with comparing > the second halves. > Does this mean there is an issue with functions like wcsnlen? They do say that they only need to be able to access memory up to first null terminator but currently the x86_64 wcsnlen-avx2.S implementation will multiple length by 4 which could cause overflow. For example with a string length 1000 and maxlen passed as 2^62 + 1 the return will be 1. Is that an issue? It's a pretty simple fix although I think this idiom of multiply length by 4 to get byte count is in a lot of similarly specified files. > > If I understand things correctly this patch isn't fixing a conformance > bug; it's merely a QoI issue, where by "quality" one means "I want this > particular undefined behavior to cause a core dump". >
On 6/9/21 10:45 AM, Noah Goldstein wrote: > > > On Mon, Jun 7, 2021 at 1:45 PM Paul Eggert <eggert@cs.ucla.edu > <mailto:eggert@cs.ucla.edu>> wrote: > > On 6/7/21 10:28 AM, Noah Goldstein via Libc-alpha wrote: > > > > Is it actually UB? The caller is not causing overflow. The > implementation > > method is. It is possible to implement without overflow. > > The C Standard says that unless otherwise specified, when you pass an > array (pointer + size) to a standard function, all the addresses in the > array must be valid. It's valid if (say) memcmp is multithreaded and > compares the first halves of the two arrays in parallel with comparing > the second halves. > > > Does this mean there is an issue with functions like wcsnlen? > > They do say that they only need to be able to access memory up > to first null terminator but currently the x86_64 wcsnlen-avx2.S > implementation will multiple length by 4 which could cause overflow. > > For example with a string length 1000 and maxlen passed as > 2^62 + 1 the return will be 1. The maxlen causes undefined behaviour; it is the responsibility of the caller to ensure that it doesn't and the way to do that for wcsnlen is to ensure that s+maxlen*sizeof(wchar_t) is within bounds of the object and hence does not cause overflow. Siddhesh
On Wed, Jun 9, 2021 at 1:25 AM Siddhesh Poyarekar <siddhesh@gotplt.org> wrote: > On 6/9/21 10:45 AM, Noah Goldstein wrote: > > > > > > On Mon, Jun 7, 2021 at 1:45 PM Paul Eggert <eggert@cs.ucla.edu > > <mailto:eggert@cs.ucla.edu>> wrote: > > > > On 6/7/21 10:28 AM, Noah Goldstein via Libc-alpha wrote: > > > > > > Is it actually UB? The caller is not causing overflow. The > > implementation > > > method is. It is possible to implement without overflow. > > > > The C Standard says that unless otherwise specified, when you pass an > > array (pointer + size) to a standard function, all the addresses in > the > > array must be valid. It's valid if (say) memcmp is multithreaded and > > compares the first halves of the two arrays in parallel with > comparing > > the second halves. > > > > > > Does this mean there is an issue with functions like wcsnlen? > > > > They do say that they only need to be able to access memory up > > to first null terminator but currently the x86_64 wcsnlen-avx2.S > > implementation will multiple length by 4 which could cause overflow. > > > > For example with a string length 1000 and maxlen passed as > > 2^62 + 1 the return will be 1. > > The maxlen causes undefined behaviour; it is the responsibility of the > caller to ensure that it doesn't and the way to do that for wcsnlen is > to ensure that s+maxlen*sizeof(wchar_t) is within bounds of the object > and hence does not cause overflow. > Siddhesh > Got it and thanks! Sorry for all the questions. Was a bit thrown off because we have overflow tests for strncat / strnlen so it appears we are testing UB.
On 6/9/21 11:13 AM, Noah Goldstein wrote: > Got it and thanks! Sorry for all the questions. > > Was a bit thrown off because we have overflow tests for strncat / strnlen > so it appears we are testing UB. Hmm, I couldn't spot any overflow tests from a quick skim; there are tests that set maxlen as SIZE_MAX (but should terminate before reaching SIZE_MAX and hence not cause an overflow) and page boundary tests in strnlen, but none of them should result in undefined behaviour. Could you point out the ones that invoke undefined behaviour? Thanks, Siddhesh
On Wed, Jun 9, 2021 at 2:02 AM Siddhesh Poyarekar <siddhesh@gotplt.org> wrote: > On 6/9/21 11:13 AM, Noah Goldstein wrote: > > Got it and thanks! Sorry for all the questions. > > > > Was a bit thrown off because we have overflow tests for strncat / strnlen > > so it appears we are testing UB. > > Hmm, I couldn't spot any overflow tests from a quick skim; there are > tests that set maxlen as SIZE_MAX (but should terminate before reaching > SIZE_MAX and hence not cause an overflow) and page boundary tests in > strnlen, but none of them should result in undefined behaviour. Could > you point out the ones that invoke undefined behaviour? > Wait are you saying that the return value overflowing causes UB or that the act of passing a maxlen where s + maxlen * sizeof(wchar_t) is outside range object is UB? If the former then why is the follow okay: Previous example with a string whose length is 1000 but because wcslen is passed maxlen where maxlen * sizeof(wchar_t) overflows and leads to a result less than 1000 the implementation of wcslen in wcsnlen-avx2.S will return a length less than 1000. If the latter then: For test-wcsnlen which redirects to test-strnlen If the UB is when is s+maxlen*sizeof(wchar_t) is outside object bound then s + SIZE_MAX + sizeof(wchar_t) surely is. Although even then test-strnlen s + SIZE_MAX will also overflow if s non null. > > Thanks, > Siddhesh >
On 6/9/21 12:02 PM, Noah Goldstein wrote: > Wait are you saying that the return value overflowing causes UB or > that the act of passing a maxlen where s + maxlen * sizeof(wchar_t) is > outside range object is UB? Not specifically the return value, but more broadly, causing wcsnlen to invoke undefined behaviour, which could include the former. > If the former then why is the follow okay: > > Previous example with a string whose length is 1000 > but because wcslen is passed maxlen where maxlen * sizeof(wchar_t) > overflows and leads to a result less than 1000 the implementation of > wcslen in wcsnlen-avx2.S will return a length less than 1000. > > > If the latter then: > > For test-wcsnlen which redirects to test-strnlen > If the UB is when is s+maxlen*sizeof(wchar_t) is outside object bound > then s + SIZE_MAX + sizeof(wchar_t) surely is. > > Although even then test-strnlen s + SIZE_MAX will also overflow if s non > null. That is a good catch; I did not notice that. That should be SIZE_MAX / sizeof (CHAR). Thanks, Siddhesh
On Wed, Jun 9, 2021 at 2:48 AM Siddhesh Poyarekar <siddhesh@gotplt.org> wrote: > On 6/9/21 12:02 PM, Noah Goldstein wrote: > > Wait are you saying that the return value overflowing causes UB or > > that the act of passing a maxlen where s + maxlen * sizeof(wchar_t) is > > outside range object is UB? > > Not specifically the return value, but more broadly, causing wcsnlen to > invoke undefined behaviour, which could include the former. > > > If the former then why is the follow okay: > > > > Previous example with a string whose length is 1000 > > but because wcslen is passed maxlen where maxlen * sizeof(wchar_t) > > overflows and leads to a result less than 1000 the implementation of > > wcslen in wcsnlen-avx2.S will return a length less than 1000. > > > > > > If the latter then: > > > > For test-wcsnlen which redirects to test-strnlen > > If the UB is when is s+maxlen*sizeof(wchar_t) is outside object bound > > then s + SIZE_MAX + sizeof(wchar_t) surely is. > > > > Although even then test-strnlen s + SIZE_MAX will also overflow if s non > > null. > > That is a good catch; I did not notice that. That should be SIZE_MAX / > sizeof (CHAR). > Do we want to support s + maxlen + sizeof(CHAR) overflowing? If not we can speed up the AVX2/EVEX implementation of strnlen/wcsnlen/memchr/wmemchr. (Assuming memchr is of same standard where s + n + sizeof(CHAR) can't overflow. If not it has this bug). > Thanks, > Siddhesh >
On 6/9/21 12:24 PM, Noah Goldstein wrote: > Do we want to support s + maxlen + sizeof(CHAR) overflowing? If not we can > speed up the AVX2/EVEX implementation of strnlen/wcsnlen/memchr/wmemchr. We don't want to specify any behaviour that is undefined by the standard so it's perfectly OK for the algorithm to assume that s + (maxlen - 1) * sizeof(CHAR) (off by one in my previous comment, sorry :)) points to a valid address if that makes the function go faster. Siddhesh
On 6/9/21 12:31 PM, Siddhesh Poyarekar wrote: > On 6/9/21 12:24 PM, Noah Goldstein wrote: >> Do we want to support s + maxlen + sizeof(CHAR) overflowing? If not we >> can >> speed up the AVX2/EVEX implementation of strnlen/wcsnlen/memchr/wmemchr. > > We don't want to specify any behaviour that is undefined by the standard > so it's perfectly OK for the algorithm to assume that s + (maxlen - 1) * > sizeof(CHAR) (off by one in my previous comment, sorry :)) points to a > valid address if that makes the function go faster. s/valid address/valid reference in the object referred to by s/ In fact, like implementations for strlen, it is OK to assume that reads beyond the object bounds are also OK as long as they're within the last page that s is in. Likewise for reads before object bounds as long as they're within the first page that s is in. Siddhesh
On Jun 09 2021, Siddhesh Poyarekar wrote: > On 6/9/21 12:02 PM, Noah Goldstein wrote: >> Wait are you saying that the return value overflowing causes UB or >> that the act of passing a maxlen where s + maxlen * sizeof(wchar_t) is >> outside range object is UB? > > Not specifically the return value, but more broadly, causing wcsnlen to > invoke undefined behaviour, which could include the former. I don't think wcsnlen can assume that such an overflow doesn't happen. Andreas.
On 6/9/21 1:09 PM, Andreas Schwab wrote: > On Jun 09 2021, Siddhesh Poyarekar wrote: > >> On 6/9/21 12:02 PM, Noah Goldstein wrote: >>> Wait are you saying that the return value overflowing causes UB or >>> that the act of passing a maxlen where s + maxlen * sizeof(wchar_t) is >>> outside range object is UB? >> >> Not specifically the return value, but more broadly, causing wcsnlen to >> invoke undefined behaviour, which could include the former. > > I don't think wcsnlen can assume that such an overflow doesn't happen. Hmm, I just noticed that wcsnlen is not in the ISO C draft (at least the one I have from 2011) and is defined in POSIX. The description doesn't seem to specify any access semantics for wcsnlen + maxlen. Are you referring to the fact that it's unspecified or are you aware of anywhere else in the spec that requires the implementation to ensure valid access? Siddhesh
On Jun 09 2021, Siddhesh Poyarekar wrote: > Hmm, I just noticed that wcsnlen is not in the ISO C draft (at least the > one I have from 2011) and is defined in POSIX. The description doesn't > seem to specify any access semantics for wcsnlen + maxlen. Are you > referring to the fact that it's unspecified or are you aware of anywhere > else in the spec that requires the implementation to ensure valid access? Does the sentence "The wcsnlen() function shall never examine more than the first maxlen characters of the wide-character array pointed to by ws." constitute a limit on maxlen, or that ws+maxlen must be valid? Andreas.
* Andreas Schwab: > On Jun 09 2021, Siddhesh Poyarekar wrote: > >> Hmm, I just noticed that wcsnlen is not in the ISO C draft (at least the >> one I have from 2011) and is defined in POSIX. The description doesn't >> seem to specify any access semantics for wcsnlen + maxlen. Are you >> referring to the fact that it's unspecified or are you aware of anywhere >> else in the spec that requires the implementation to ensure valid access? > > Does the sentence "The wcsnlen() function shall never examine more than > the first maxlen characters of the wide-character array pointed to by > ws." constitute a limit on maxlen, or that ws+maxlen must be valid? Is this bug related? wcsrtombs calls wcsnlen on input data which is not an array <https://sourceware.org/bugzilla/show_bug.cgi?id=23711> Thanks, Florian
On 6/9/21 2:44 PM, Andreas Schwab wrote: > On Jun 09 2021, Siddhesh Poyarekar wrote: > >> Hmm, I just noticed that wcsnlen is not in the ISO C draft (at least the >> one I have from 2011) and is defined in POSIX. The description doesn't >> seem to specify any access semantics for wcsnlen + maxlen. Are you >> referring to the fact that it's unspecified or are you aware of anywhere >> else in the spec that requires the implementation to ensure valid access? > > Does the sentence "The wcsnlen() function shall never examine more than > the first maxlen characters of the wide-character array pointed to by > ws." constitute a limit on maxlen, or that ws+maxlen must be valid? It does not specify any of that; that's what I said. My question was: do you think we should interpret that as a constraint the implementation has to take care of or are you aware of another text in the standard that makes it clearer? Siddhesh
On 6/9/21 2:50 PM, Florian Weimer wrote: > Is this bug related? > > wcsrtombs calls wcsnlen on input data which is not an array > <https://sourceware.org/bugzilla/show_bug.cgi?id=23711> Your interpretation that the input has to be an array of size maxlen appears to put constraints on maxlen. Siddhesh
On Jun 09 2021, Siddhesh Poyarekar wrote: > On 6/9/21 2:44 PM, Andreas Schwab wrote: >> On Jun 09 2021, Siddhesh Poyarekar wrote: >> >>> Hmm, I just noticed that wcsnlen is not in the ISO C draft (at least the >>> one I have from 2011) and is defined in POSIX. The description doesn't >>> seem to specify any access semantics for wcsnlen + maxlen. Are you >>> referring to the fact that it's unspecified or are you aware of anywhere >>> else in the spec that requires the implementation to ensure valid access? >> Does the sentence "The wcsnlen() function shall never examine more than >> the first maxlen characters of the wide-character array pointed to by >> ws." constitute a limit on maxlen, or that ws+maxlen must be valid? > > It does not specify any of that; that's what I said. Then you cannot assume that maxlen*sizeof(wchar_t) does not wrap around, and using that as a limit would be a bug. Andreas.
* Siddhesh Poyarekar: > On 6/9/21 2:50 PM, Florian Weimer wrote: >> Is this bug related? >> wcsrtombs calls wcsnlen on input data which is not an array >> <https://sourceware.org/bugzilla/show_bug.cgi?id=23711> > > Your interpretation that the input has to be an array of size maxlen > appears to put constraints on maxlen. But I think I later realized that GNU supports non-array use as an extension, and that would mean that there is no such constraint. Thanks, Florian
On 6/9/21 3:07 PM, Florian Weimer wrote: > * Siddhesh Poyarekar: > >> On 6/9/21 2:50 PM, Florian Weimer wrote: >>> Is this bug related? >>> wcsrtombs calls wcsnlen on input data which is not an array >>> <https://sourceware.org/bugzilla/show_bug.cgi?id=23711> >> >> Your interpretation that the input has to be an array of size maxlen >> appears to put constraints on maxlen. > > But I think I later realized that GNU supports non-array use as an > extension, and that would mean that there is no such constraint. OK got it. Thanks, Siddhesh
On 6/9/21 3:05 PM, Andreas Schwab wrote: > Then you cannot assume that maxlen*sizeof(wchar_t) does not wrap around, > and using that as a limit would be a bug. Thanks for clarifying. Siddhesh
So do we need a patch to fix this? I see the issue of assuming maxlen * sizeof(wchar_t) does not wrap in the following files: wmemchr-sse4_1 wmemchr-avx2 wcsnlen-sse2 wcsnlen-avx2 I can get one ready pretty easily if we do need one. On Wed, Jun 9, 2021 at 5:35 AM Andreas Schwab <schwab@linux-m68k.org> wrote: > On Jun 09 2021, Siddhesh Poyarekar wrote: > > > On 6/9/21 2:44 PM, Andreas Schwab wrote: > >> On Jun 09 2021, Siddhesh Poyarekar wrote: > >> > >>> Hmm, I just noticed that wcsnlen is not in the ISO C draft (at least > the > >>> one I have from 2011) and is defined in POSIX. The description doesn't > >>> seem to specify any access semantics for wcsnlen + maxlen. Are you > >>> referring to the fact that it's unspecified or are you aware of > anywhere > >>> else in the spec that requires the implementation to ensure valid > access? > >> Does the sentence "The wcsnlen() function shall never examine more than > >> the first maxlen characters of the wide-character array pointed to by > >> ws." constitute a limit on maxlen, or that ws+maxlen must be valid? > > > > It does not specify any of that; that's what I said. > > Then you cannot assume that maxlen*sizeof(wchar_t) does not wrap around, > and using that as a limit would be a bug. > > Andreas. > > -- > Andreas Schwab, schwab@linux-m68k.org > GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510 2552 DF73 E780 A9DA AEC1 > "And now for something completely different." >
On Wed, Jun 9, 2021 at 1:34 PM Noah Goldstein via Libc-alpha <libc-alpha@sourceware.org> wrote: > > So do we need a patch to fix this? > > I see the issue of assuming maxlen * sizeof(wchar_t) does not wrap > in the following files: > > wmemchr-sse4_1 > wmemchr-avx2 > > wcsnlen-sse2 > wcsnlen-avx2 > > I can get one ready pretty easily if we do need one. Can we create tests without undefined behavior to show the current implementation is wrong? If yes, please add such testcases and fix the implementations. Thanks. > > On Wed, Jun 9, 2021 at 5:35 AM Andreas Schwab <schwab@linux-m68k.org> wrote: > > > On Jun 09 2021, Siddhesh Poyarekar wrote: > > > > > On 6/9/21 2:44 PM, Andreas Schwab wrote: > > >> On Jun 09 2021, Siddhesh Poyarekar wrote: > > >> > > >>> Hmm, I just noticed that wcsnlen is not in the ISO C draft (at least > > the > > >>> one I have from 2011) and is defined in POSIX. The description doesn't > > >>> seem to specify any access semantics for wcsnlen + maxlen. Are you > > >>> referring to the fact that it's unspecified or are you aware of > > anywhere > > >>> else in the spec that requires the implementation to ensure valid > > access? > > >> Does the sentence "The wcsnlen() function shall never examine more than > > >> the first maxlen characters of the wide-character array pointed to by > > >> ws." constitute a limit on maxlen, or that ws+maxlen must be valid? > > > > > > It does not specify any of that; that's what I said. > > > > Then you cannot assume that maxlen*sizeof(wchar_t) does not wrap around, > > and using that as a limit would be a bug. > > > > Andreas. > > > > -- > > Andreas Schwab, schwab@linux-m68k.org > > GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510 2552 DF73 E780 A9DA AEC1 > > "And now for something completely different." > >
On 6/10/21 2:03 AM, Noah Goldstein wrote: > So do we need a patch to fix this? > We don't. As Andreas clarified, wcsnlen behaviour avoids overflow. Siddhesh
On 6/10/21 2:03 AM, Noah Goldstein wrote: > So do we need a patch to fix this? > > I see the issue of assuming maxlen * sizeof(wchar_t) does not wrap > in the following files: > > wmemchr-sse4_1 > wmemchr-avx2 > > wcsnlen-sse2 > wcsnlen-avx2 > > I can get one ready pretty easily if we do need one. Sorry I misunderstood your question. H.J. has the right advice for you. Thanks, Siddhesh
diff --git a/sysdeps/x86_64/multiarch/memcmp-avx2-movbe.S b/sysdeps/x86_64/multiarch/memcmp-avx2-movbe.S index 2621ec907a..4a9414ff61 100644 --- a/sysdeps/x86_64/multiarch/memcmp-avx2-movbe.S +++ b/sysdeps/x86_64/multiarch/memcmp-avx2-movbe.S @@ -60,6 +60,7 @@ # endif # define VEC_SIZE 32 +# define LOG_VEC_SIZE 7 # define PAGE_SIZE 4096 /* Warning! @@ -200,8 +201,7 @@ L(return_vec_2): # endif VZEROUPPER_RETURN - /* NB: p2align 5 here to ensure 4x loop is 32 byte aligned. */ - .p2align 5 + .p2align 4 L(8x_return_vec_0_1_2_3): /* Returning from L(more_8x_vec) requires restoring rsi. */ addq %rdi, %rsi @@ -232,7 +232,7 @@ L(return_vec_3): # endif VZEROUPPER_RETURN - .p2align 4 + .p2align 5 L(more_8x_vec): /* Set end of s1 in rdx. */ leaq -(VEC_SIZE * 4)(%rdi, %rdx), %rdx @@ -241,8 +241,11 @@ L(more_8x_vec): subq %rdi, %rsi /* Align s1 pointer. */ andq $-VEC_SIZE, %rdi + leaq -1(%rdx), %rax + subq %rdi, %rax /* Adjust because first 4x vec where check already. */ subq $-(VEC_SIZE * 4), %rdi + sarq $LOG_VEC_SIZE, %rax .p2align 4 L(loop_4x_vec): /* rsi has s2 - s1 so get correct address by adding s1 (in rdi). @@ -267,8 +270,8 @@ L(loop_4x_vec): jnz L(8x_return_vec_0_1_2_3) subq $-(VEC_SIZE * 4), %rdi /* Check if s1 pointer at end. */ - cmpq %rdx, %rdi - jb L(loop_4x_vec) + decq %rax + jne L(loop_4x_vec) subq %rdx, %rdi /* rdi has 4 * VEC_SIZE - remaining length. */ diff --git a/sysdeps/x86_64/multiarch/memcmp-evex-movbe.S b/sysdeps/x86_64/multiarch/memcmp-evex-movbe.S index 654dc7ac8c..60be3f43e7 100644 --- a/sysdeps/x86_64/multiarch/memcmp-evex-movbe.S +++ b/sysdeps/x86_64/multiarch/memcmp-evex-movbe.S @@ -53,6 +53,7 @@ # endif # define VEC_SIZE 32 +# define LOG_VEC_SIZE 7 # define PAGE_SIZE 4096 # define CHAR_PER_VEC (VEC_SIZE / CHAR_SIZE) @@ -163,10 +164,7 @@ ENTRY (MEMCMP) /* NB: eax must be zero to reach here. */ ret - /* NB: aligning 32 here allows for the rest of the jump targets - to be tuned for 32 byte alignment. Most important this ensures - the L(more_8x_vec) loop is 32 byte aligned. */ - .p2align 5 + .p2align 4 L(less_vec): /* Check if one or less CHAR. This is necessary for size = 0 but is also faster for size = CHAR_SIZE. */ @@ -277,7 +275,7 @@ L(return_vec_3): # endif ret - .p2align 4 + .p2align 5 L(more_8x_vec): /* Set end of s1 in rdx. */ leaq -(VEC_SIZE * 4)(%rdi, %rdx, CHAR_SIZE), %rdx @@ -286,8 +284,11 @@ L(more_8x_vec): subq %rdi, %rsi /* Align s1 pointer. */ andq $-VEC_SIZE, %rdi + leaq -1(%rdx), %rax + subq %rdi, %rax /* Adjust because first 4x vec where check already. */ subq $-(VEC_SIZE * 4), %rdi + sarq $LOG_VEC_SIZE, %rax .p2align 4 L(loop_4x_vec): VMOVU (%rsi, %rdi), %YMM1 @@ -307,8 +308,8 @@ L(loop_4x_vec): testl %ecx, %ecx jnz L(8x_return_vec_0_1_2_3) subq $-(VEC_SIZE * 4), %rdi - cmpq %rdx, %rdi - jb L(loop_4x_vec) + decq %rax + jnz L(loop_4x_vec) subq %rdx, %rdi /* rdi has 4 * VEC_SIZE - remaining length. */
Fix bugs introducted in commits: author Noah Goldstein <goldstein.w.n@gmail.com> Mon, 17 May 2021 17:57:24 +0000 (13:57 -0400) commit 4ad473e97acdc5f6d811755b67c09f2128a644ce And author Noah Goldstein <goldstein.w.n@gmail.com> Mon, 17 May 2021 17:56:52 +0000 (13:56 -0400) commit 16d12015c57701b08d7bbed6ec536641bcafb428 Which added a bug which would cause pointer + length overflow to lead to an early return as opposed to a Segmentation Fault. Signed-off-by: Noah Goldstein <goldstein.w.n@gmail.com> --- sysdeps/x86_64/multiarch/memcmp-avx2-movbe.S | 13 ++++++++----- sysdeps/x86_64/multiarch/memcmp-evex-movbe.S | 15 ++++++++------- 2 files changed, 16 insertions(+), 12 deletions(-)