Message ID | 20240705130249.14116-2-alx@kernel.org |
---|---|
State | New |
Series | [v1] Remove 'restrict' from 'nptr' in strtol(3)-like functions |
On Fri, 2024-07-05 at 15:03 +0200, Alejandro Colomar wrote:
> ISO C specifies these APIs as accepting a restricted pointer in their
> first parameter:
>
>     $ stdc c99 strtol
>     long int strtol(const char *restrict nptr, char **restrict endptr, int base);
>     $ stdc c11 strtol
>     long int strtol(const char *restrict nptr, char **restrict endptr, int base);
>
> However, it should be considered a defect in ISO C.  It's common to see
> code that aliases it:
>
>     char str[] = "10 20";
>
>     p = str;
>     a = strtol(p, &p, 0);  // Let's ignore error handling for
>     b = strtol(p, &p, 0);  // simplicity.

Why this is wrong?

During the execution of strtol() the only expression accessing the
object "p" is *endptr.  When the body of strtol() refers "nptr" it
accesses a different object, not "p".

And if this is really wrong you should report it to WG14 before changing
glibc.
On Fri, Jul 05, 2024 at 09:38:21PM +0800, Xi Ruoyao via Gcc wrote:
> On Fri, 2024-07-05 at 15:03 +0200, Alejandro Colomar wrote:
> > ISO C specifies these APIs as accepting a restricted pointer in their
> > first parameter:
> >
> >     $ stdc c99 strtol
> >     long int strtol(const char *restrict nptr, char **restrict endptr, int base);
> >     $ stdc c11 strtol
> >     long int strtol(const char *restrict nptr, char **restrict endptr, int base);
> >
> > However, it should be considered a defect in ISO C.  It's common to see
> > code that aliases it:
> >
> >     char str[] = "10 20";
> >
> >     p = str;
> >     a = strtol(p, &p, 0);  // Let's ignore error handling for
> >     b = strtol(p, &p, 0);  // simplicity.
>
> Why this is wrong?

I don't see anything wrong with it either.  The function only reads the
string starting with nptr and then stores some pointer to *endptr, if the
caller doesn't make nptr point to what endptr points to or vice versa,
that should be fine.

    Jakub
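The calling pattern under discussion can be sketched as a small helper. This is only an illustration: the name parse_two and its return convention are hypothetical and do not come from the thread. It reuses one cursor `p` both as the input string and as the end-pointer output, which is the aliasing that the original post calls common.

```c
#include <stdlib.h>

/* Hypothetical helper (not from the thread): parses up to two integers
 * from s, passing the same cursor as nptr and as the target of endptr.
 * Returns how many values were converted. */
int parse_two(const char *s, long *a, long *b)
{
        char *p = (char *) s;   /* strtol() only reads through nptr */
        char *prev;

        prev = p;
        *a = strtol(p, &p, 0);  /* reads the string, then writes the object p */
        if (p == prev)
                return 0;       /* no digits were consumed */

        prev = p;
        *b = strtol(p, &p, 0);
        if (p == prev)
                return 1;

        return 2;
}
```

The string contents and the pointer object `p` are distinct objects, so the read through nptr and the write through endptr never touch the same storage, which is the point made in the two replies above.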
[CC += linux-man@, since we're discussing an API documented there, and
the manual page would also need to be updated]

Hi Xi, Jakub,

On Fri, Jul 05, 2024 at 09:38:21PM GMT, Xi Ruoyao wrote:
> On Fri, 2024-07-05 at 15:03 +0200, Alejandro Colomar wrote:
> > ISO C specifies these APIs as accepting a restricted pointer in their
> > first parameter:
> >
> >     $ stdc c99 strtol
> >     long int strtol(const char *restrict nptr, char **restrict endptr, int base);
> >     $ stdc c11 strtol
> >     long int strtol(const char *restrict nptr, char **restrict endptr, int base);
> >
> > However, it should be considered a defect in ISO C.  It's common to see
> > code that aliases it:
> >
> >     char str[] = "10 20";
> >
> >     p = str;
> >     a = strtol(p, &p, 0);  // Let's ignore error handling for
> >     b = strtol(p, &p, 0);  // simplicity.
>
> Why this is wrong?
>
> During the execution of strtol() the only expression accessing the
> object "p" is *endptr.  When the body of strtol() refers "nptr" it
> accesses a different object, not "p".

<http://port70.net/~nsz/c/c11/n1570.html#6.7.3p8>

Theoretically, 'restrict' is defined in terms of accesses, not just
references, so it's fine for strtol(3) to hold two references of p in
restrict pointers.  That is, the following code is valid:

    int
    dumb(int *restrict a, int *restrict also_a)
    {
            // We don't access the objects
            return a == also_a;
    }

    int
    main(void)
    {
            int x = 3;

            return dumb(&x, &x);
    }

However, in practice that's dumb.  The caller cannot know that the
function doesn't access the object, so it must be cautious and enable
-Wrestrict, which should be paranoid and not allow passing references
to the same object in different arguments, just in case the function
decides to access the objects.  Of course, GCC reports a diagnostic for
the previous code:

    $ cc -Wall -Wextra dumb.c
    dumb.c: In function ‘main’:
    dumb.c:13:21: warning: passing argument 1 to ‘restrict’-qualified parameter aliases with argument 2 [-Wrestrict]
       13 |         return dumb(&x, &x);
          |                     ^~  ~~

... even when there's no UB, since the object is not being accessed.

But when the thing gets non-trivial, as in strtol(3), GCC misses the
-Wrestrict diagnostic, as reported in
<https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112833>.

Let's write a reproducer by altering the dumb.c program from above, with
just another reference:

    int
    dumb2(int *restrict a, int *restrict *restrict ap)
    {
            // We don't access the objects
            return a == *ap;
    }

    int
    main(void)
    {
            int x = 3;
            int *xp = &x;

            return dumb2(&x, &xp);
    }

GCC doesn't report anything bad here, even though it's basically the
same as the program from above:

    $ cc -Wall -Wextra dumb2.c
    $

Again, there's no UB, but we really want to be cautious and get a
diagnostic as callers, just in case the callee decides to access the
object; we never know.

So, GCC should be patched to report a warning in the program above.
That will also cause strtol(3) to start issuing warnings in use cases
like the one I showed.

Even further, let's try something really weird: inequality comparison,
which is only defined for pointers to the same array object:

    int
    dumb3(int *restrict a, int *restrict *restrict ap)
    {
            // We don't access the objects
            return a > *ap;
    }

    int
    main(void)
    {
            int x = 3;
            int *xp = &x;

            return dumb3(&x, &xp);
    }

The behavior is still defined, since the objects are not accessed, but
the compiler should really warn, on both sides:

-  The caller is passing references to the same object in restricted
   parameters, which is a red flag.

-  The callee is comparing for inequality pointers that should, under
   normal circumstances, cause Undefined Behavior.

> And if this is really wrong you should report it to WG14 before changing
> glibc.

Well, I don't know how to report that defect to WG14.  If you help me,
I'll be pleased to do so.  Do they have a public mailing list or
anything like that?

Cheers,
Alex
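For contrast with the dumb*() functions, which never touch the pointed-to objects, here is a minimal sketch of a callee that does access them. The function add_twice is hypothetical and not from the thread. Because it writes through dst and reads through src, calling it with both arguments pointing to the same int would modify an object that is also accessed through another restrict pointer, which is the situation 6.7.3.1 makes undefined.

```c
/* Hypothetical example (not from the thread): both restrict pointers
 * are dereferenced, so the caller must pass distinct objects. */
int add_twice(int *restrict dst, const int *restrict src)
{
        /* With restrict, a compiler may load *src once and reuse it. */
        *dst += *src;
        *dst += *src;
        return *dst;
}
```

With distinct objects the result is simply dst + 2 * src. A call such as add_twice(&x, &x) is the kind of aliasing -Wrestrict is meant to catch, and unlike dumb(&x, &x) it would have undefined behavior.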
Am Freitag, dem 05.07.2024 um 16:37 +0200 schrieb Alejandro Colomar via Gcc: > [CC += linux-man@, since we're discussing an API documented there, and > the manual page would also need to be updated] > > Hi Xi, Jakub, > > On Fri, Jul 05, 2024 at 09:38:21PM GMT, Xi Ruoyao wrote: > > On Fri, 2024-07-05 at 15:03 +0200, Alejandro Colomar wrote: > > > ISO C specifies these APIs as accepting a restricted pointer in their > > > first parameter: > > > > > > $ stdc c99 strtol > > > long int strtol(const char *restrict nptr, char **restrict endptr, int base); > > > $ stdc c11 strtol > > > long int strtol(const char *restrict nptr, char **restrict endptr, int base); > > > > > > However, it should be considered a defect in ISO C. It's common to see > > > code that aliases it: > > > > > > char str[] = "10 20"; > > > > > > p = str; > > > a = strtol(p, &p, 0); // Let's ignore error handling for > > > b = strtol(p, &p, 0); // simplicity. > > > > Why this is wrong? > > > > During the execution of strtol() the only expression accessing the > > object "p" is *endptr. When the body of strtol() refers "nptr" it > > accesses a different object, not "p". > > <http://port70.net/~nsz/c/c11/n1570.html#6.7.3p8> > > Theoretically, 'restrict' is defined in terms of accesses, not just > references, so it's fine for strtol(3) to hold two references of p in > restrict pointers. That is, the following code is valid: > > int > dumb(int *restrict a, int *restrict also_a) > { > // We don't access the objects > return a == also_a; > } > > int > main(void) > { > int x = 3; > > return dumb(&x, &x); > } > > However, in practice that's dumb. The caller cannot know that the > function doesn't access the object, so it must be cautious and enable > -Wrestrict, which should be paranoid and do not allow passing references > to the same object in different arguments, just in case the function > decides to access to objects. 
Of course, GCC reports a diagnostic for > the previous code: > > $ cc -Wall -Wextra dumb.c > dumb.c: In function ‘main’: > dumb.c:13:21: warning: passing argument 1 to ‘restrict’-qualified parameter aliases with argument 2 [-Wrestrict] > 13 | return dumb(&x, &x); > | ^~ ~~ > > ... even when there's no UB, since the object is not being accessed. > > But when the thing gets non-trivial, as in strtol(3), GCC misses the > -Wrestrict diagnostic, as reported in > <https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112833>. > > Let's write a reproducer by altering the dumb.c program from above, with > just another reference: > > int > dumb2(int *restrict a, int *restrict *restrict ap) > { > // We don't access the objects > return a == *ap; > } > > int > main(void) > { > int x = 3; > int *xp = &x; > > return dumb2(&x, &xp); > } > > GCC doesn't report anything bad here, even though it's basically the > same as the program from above: > > $ cc -Wall -Wextra dumb2.c > $ strtol does have a "char * restrict * restrict" though, so the situation is different. A "char **" and a "const char *" shouldn't alias anyway. > > Again, there's no UB, but we really want to be cautious and get a > diagnostic as callers, just in case the callee decides to access the > object; we never know. > > So, GCC should be patched to report a warning in the program above. > That will also cause strtol(3) to start issuing warnings in use cases > like the one I showed. 
> > Even further, let's try something really weird: inequality comparison, > which is only defined for pointers to the same array object: > > int > dumb3(int *restrict a, int *restrict *restrict ap) > { > // We don't access the objects > return a > *ap; > } > > int > main(void) > { > int x = 3; > int *xp = &x; > > return dumb3(&x, &xp); > } > > The behavior is still defined, since the obnjects are not accessed, but > the compiler should really warn, on both sides: > > - The caller is passing references to the same object in restricted > parameters, which is a red flag. > > - The callee is comparing for inequality pointers that should, under > normal circumstances, cause Undefined Behavior. > > > > And if this is really wrong you should report it to WG14 before changing > > glibc. > > Well, I don't know how to report that defect to WG14. If you help me, > I'll be pleased to do so. Do they have a public mailing list or > anything like that? One can submit clarification or change requests: https://www.open-std.org/jtc1/sc22/wg14/www/contributing.html Martin
Hi Martin,

On Fri, Jul 05, 2024 at 05:02:15PM GMT, Martin Uecker wrote:
> > But when the thing gets non-trivial, as in strtol(3), GCC misses the
> > -Wrestrict diagnostic, as reported in
> > <https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112833>.
> >
> > Let's write a reproducer by altering the dumb.c program from above, with
> > just another reference:
> >
> >     int
> >     dumb2(int *restrict a, int *restrict *restrict ap)
> >     {
> >             // We don't access the objects
> >             return a == *ap;
> >     }
> >
> >     int
> >     main(void)
> >     {
> >             int x = 3;
> >             int *xp = &x;
> >
> >             return dumb2(&x, &xp);
> >     }
> >
> > GCC doesn't report anything bad here, even though it's basically the
> > same as the program from above:
> >
> >     $ cc -Wall -Wextra dumb2.c
> >     $
>
> strtol does have a "char * restrict * restrict" though, so the
> situation is different.  A "char **" and a "const char *"
> shouldn't alias anyway.

Pedantically, it is actually declared as 'char **restrict' (the inner
one is not declared as restrict, even though it will be restricted,
since there are no other unrestricted pointers).

I've written functions that more closely resemble strtol(3), to show
that in the end they all share the same issue regarding const-ness:

    $ cat d.c
    int d(const char *restrict ca, char *restrict a)
    {
            return ca > a;
    }

    int main(void)
    {
            char x = 3;
            char *xp = &x;
            d(xp, xp);
    }
    $ cc -Wall -Wextra d.c
    d.c: In function ‘main’:
    d.c:10:9: warning: passing argument 2 to ‘restrict’-qualified parameter aliases with argument 1 [-Wrestrict]
       10 |         d(xp, xp);
          |         ^

This trivial program causes a diagnostic.  (Although I think the '>'
should also cause a diagnostic!!)

Let's add a reference, to resemble strtol(3):

    $ cat d2.c
    int d2(const char *restrict ca, char *restrict *restrict ap)
    {
            return ca > *ap;
    }

    int main(void)
    {
            char x = 3;
            char *xp = &x;
            d2(xp, &xp);
    }
    $ cc -Wall -Wextra d2.c
    $

Why does this not cause a -Wrestrict diagnostic, while d.c does?  How
are these programs any different regarding pointer restrict-ness?

> > Well, I don't know how to report that defect to WG14.  If you help me,
> > I'll be pleased to do so.  Do they have a public mailing list or
> > anything like that?
>
> One can submit clarification or change requests:
>
> https://www.open-std.org/jtc1/sc22/wg14/www/contributing.html

Thanks!  Will do.  Anyway, I think this should be discussed in glibc/gcc
in parallel, since it's clearly a missed diagnostic, and possibly a
dangerous use of restrict if the compiler makes any assumptions that
shouldn't be made.

Have a lovely day!
Alex
On Friday, 05.07.2024 at 17:23 +0200, Alejandro Colomar wrote:
> Hi Martin,
>
> On Fri, Jul 05, 2024 at 05:02:15PM GMT, Martin Uecker wrote:
> > > But when the thing gets non-trivial, as in strtol(3), GCC misses the
> > > -Wrestrict diagnostic, as reported in
> > > <https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112833>.
> > >
> > > Let's write a reproducer by altering the dumb.c program from above, with
> > > just another reference:
> > >
> > >     int
> > >     dumb2(int *restrict a, int *restrict *restrict ap)
> > >     {
> > >             // We don't access the objects
> > >             return a == *ap;
> > >     }
> > >
> > >     int
> > >     main(void)
> > >     {
> > >             int x = 3;
> > >             int *xp = &x;
> > >
> > >             return dumb2(&x, &xp);
> > >     }
> > >
> > > GCC doesn't report anything bad here, even though it's basically the
> > > same as the program from above:
> > >
> > >     $ cc -Wall -Wextra dumb2.c
> > >     $
> >
> > strtol does have a "char * restrict * restrict" though, so the
> > situation is different.  A "char **" and a "const char *"
> > shouldn't alias anyway.
>
> Pedantically, it is actually declared as 'char **restrict' (the inner
> one is not declared as restrict, even though it will be restricted,
> since there are no other unrestricted pointers).
>
> I've written functions that more closely resemble strtol(3), to show
> that in the end they all share the same issue regarding const-ness:
>
>     $ cat d.c
>     int d(const char *restrict ca, char *restrict a)
>     {
>             return ca > a;
>     }
>
>     int main(void)
>     {
>             char x = 3;
>             char *xp = &x;
>             d(xp, xp);
>     }
>     $ cc -Wall -Wextra d.c
>     d.c: In function ‘main’:
>     d.c:10:9: warning: passing argument 2 to ‘restrict’-qualified parameter aliases with argument 1 [-Wrestrict]
>        10 |         d(xp, xp);
>           |         ^
>
> This trivial program causes a diagnostic.  (Although I think the '>'
> should also cause a diagnostic!!)
>
> Let's add a reference, to resemble strtol(3):
>
>     $ cat d2.c
>     int d2(const char *restrict ca, char *restrict *restrict ap)
>     {
>             return ca > *ap;
>     }
>
>     int main(void)
>     {
>             char x = 3;
>             char *xp = &x;
>             d2(xp, &xp);
>     }
>     $ cc -Wall -Wextra d2.c
>     $
>
> Why does this not cause a -Wrestrict diagnostic, while d.c does?  How
> are these programs any different regarding pointer restrict-ness?

It would require data flow analysis to produce the diagnostic while
the first can simply be diagnosed by comparing arguments.

Martin

> > > Well, I don't know how to report that defect to WG14.  If you help me,
> > > I'll be pleased to do so.  Do they have a public mailing list or
> > > anything like that?
> >
> > One can submit clarification or change requests:
> >
> > https://www.open-std.org/jtc1/sc22/wg14/www/contributing.html
>
> Thanks!  Will do.  Anyway, I think this should be discussed in glibc/gcc
> in parallel, since it's clearly a missed diagnostic, and possibly a
> dangerous use of restrict if the compiler does any assumptions that
> shouldn't be done.
>
> Have a lovely day!
> Alex
Hi Martin, On Fri, Jul 05, 2024 at 05:34:55PM GMT, Martin Uecker wrote: > > I've written functions that more closely resemble strtol(3), to show > > that in the end they all share the same issue regarding const-ness: (Above I meant s/const/restrict/) > > > > $ cat d.c > > int d(const char *restrict ca, char *restrict a) > > { > > return ca > a; > > } > > > > int main(void) > > { > > char x = 3; > > char *xp = &x; > > d(xp, xp); > > } > > $ cc -Wall -Wextra d.c > > d.c: In function ‘main’: > > d.c:10:9: warning: passing argument 2 to ‘restrict’-qualified parameter aliases with argument 1 [-Wrestrict] > > 10 | d(xp, xp); > > | ^ > > > > This trivial program causes a diagnostic. (Although I think the '>' > > should also cause a diagnostic!!) > > > > Let's add a reference, to resemble strtol(3): > > > > $ cat d2.c > > int d2(const char *restrict ca, char *restrict *restrict ap) > > { > > return ca > *ap; > > } > > > > int main(void) > > { > > char x = 3; > > char *xp = &x; > > d2(xp, &xp); > > } > > $ cc -Wall -Wextra d2.c > > $ > > > > Why does this not cause a -Wrestrict diagnostic, while d.c does? How > > are these programs any different regarding pointer restrict-ness? > > It would require data flow anaylsis to produce the diagnostic while > the first can simply be diagnosed by comparing arguments. Agree. It seems like a task for -fanalyzer. $ cc -Wall -Wextra -fanalyzer -fuse-linker-plugin -flto d2.c $ I'm unable to trigger that at all. It's probably not implemented, I guess. I've updated the bug report <https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112833> to change the component to 'analyzer'. At least, I hope there's consensus that while current GCC doesn't warn about this, ideally it should, which means it should warn for valid uses of strtol(3), which means strtol(3) should be fixed, in all of ISO, POSIX, and glibc. Cheers, Alex > > > > Well, I don't know how to report that defect to WG14. If you help me, > > > > I'll be pleased to do so. 
Do they have a public mailing list or > > > > anything like that? > > > > > > One can submit clarification or change requests: > > > > > > https://www.open-std.org/jtc1/sc22/wg14/www/contributing.html P.S.: I've sent a mail to UNE (the Spanish National Body for ISO), and asked them about joining WG14. Let's see what they say. P.S. 2: I'm also preparing a paper; would you mind championing it if I'm not yet able to do it when it's ready? P.S. 3: Do you know of any Spanish member of WG14? Maybe I can talk with them to have more information about how they work.
On 2024-07-05 23:23, Alejandro Colomar via Gcc wrote:
> Hi Martin,
>
> On Fri, Jul 05, 2024 at 05:02:15PM GMT, Martin Uecker wrote:
>>> But when the thing gets non-trivial, as in strtol(3), GCC misses the
>>> -Wrestrict diagnostic, as reported in

A pointer to `char` can alias any object, so in theory one could write
code that looks like below.  This piece of code is probably nonsense, but
it illustrates the exact necessity of the `restrict` qualifiers:

    char* dumb(char* p)
    {
        strtol((const char*) &p, &p, 0);
        return p;
    }

    // warning: passing argument 2 to 'restrict'-qualified parameter
    // aliases with argument 1
On Fri, 2024-07-05 at 17:23 +0200, Alejandro Colomar wrote:
> > strtol does have a "char * restrict * restrict" though, so the
> > situation is different.  A "char **" and a "const char *"
> > shouldn't alias anyway.
>
> Pedantically, it is actually declared as 'char **restrict' (the inner
> one is not declared as restrict, even though it will be restricted,
> since there are no other unrestricted pointers).

So how's the following implementation of strtol (10-based, no negative
number handling, no overflow handling, ASCII-only) wrong?

    long int my_strtol(const char *restrict nptr, char **restrict endptr)
    {
            long ret = 0;

            while (isdigit(*nptr))
                    ret = ret * 10 + (*nptr++ - '0');

            *endptr = (char *)nptr;
            return ret;
    }

There's no dumb thing, there's no restrict violation (unless it's called
in a stupid way, see below), and there **shouldn't** be a -Wrestrict
warning.

If you do

    char *x = NULL;
    strtol((char *)&x, &x, 10);

it'll violate restrict.  Nobody sane should write this, and it's warned
anyway:

    t.c: In function 'main':
    t.c:6:28: warning: passing argument 2 to 'restrict'-qualified parameter aliases with argument 1 [-Wrestrict]
        6 |         strtol((char *)&x, &x, 10);
          |                ~~~~~~~~~~  ^~
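A self-contained variant of the my_strtol() above, called in the aliased style the thread is about, might look like the sketch below. The <ctype.h> include, the unsigned char cast for isdigit(), and the sum_fields helper are additions made for illustration; they are not part of the original mail.

```c
#include <ctype.h>

/* Same simplified parser as in the mail above: base 10, no sign, no
 * overflow handling.  The cast to unsigned char is an addition that
 * keeps isdigit() well-defined for negative char values. */
long my_strtol(const char *restrict nptr, char **restrict endptr)
{
        long ret = 0;

        while (isdigit((unsigned char) *nptr))
                ret = ret * 10 + (*nptr++ - '0');

        *endptr = (char *) nptr;
        return ret;
}

/* Hypothetical caller: sums all digit runs in s, passing the same
 * cursor as nptr and as the target of endptr on every call. */
long sum_fields(char *s)
{
        long total = 0;
        char *p = s;

        while (*p != '\0') {
                char *start = p;

                total += my_strtol(p, &p);
                if (p == start)
                        p++;    /* not a digit: skip one separator */
        }
        return total;
}
```

Inside my_strtol() the string is only read through nptr and the caller's cursor is only written through endptr, so the two restrict pointers never access the same object even though the call site looks aliased.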
On Fri, 2024-07-05 at 17:53 +0200, Alejandro Colomar wrote:
> At least, I hope there's consensus that while current GCC doesn't warn
> about this, ideally it should, which means it should warn for valid uses
> of strtol(3), which means strtol(3) should be fixed, in all of ISO,
> POSIX, and glibc.

It **shouldn't**.  strtol will only violate restrict if it's wrongly
implemented, or something dumb is done like "strtol((const char*) &p,
&p, 0)".

See my previous reply.
Am Freitag, dem 05.07.2024 um 17:53 +0200 schrieb Alejandro Colomar: > Hi Martin, > > On Fri, Jul 05, 2024 at 05:34:55PM GMT, Martin Uecker wrote: > > > I've written functions that more closely resemble strtol(3), to show > > > that in the end they all share the same issue regarding const-ness: > > (Above I meant s/const/restrict/) > > > > > > > $ cat d.c > > > int d(const char *restrict ca, char *restrict a) > > > { > > > return ca > a; > > > } > > > > > > int main(void) > > > { > > > char x = 3; > > > char *xp = &x; > > > d(xp, xp); > > > } > > > $ cc -Wall -Wextra d.c > > > d.c: In function ‘main’: > > > d.c:10:9: warning: passing argument 2 to ‘restrict’-qualified parameter aliases with argument 1 [-Wrestrict] > > > 10 | d(xp, xp); > > > | ^ > > > > > > This trivial program causes a diagnostic. (Although I think the '>' > > > should also cause a diagnostic!!) > > > > > > Let's add a reference, to resemble strtol(3): > > > > > > $ cat d2.c > > > int d2(const char *restrict ca, char *restrict *restrict ap) > > > { > > > return ca > *ap; > > > } > > > > > > int main(void) > > > { > > > char x = 3; > > > char *xp = &x; > > > d2(xp, &xp); > > > } > > > $ cc -Wall -Wextra d2.c > > > $ > > > > > > Why does this not cause a -Wrestrict diagnostic, while d.c does? How > > > are these programs any different regarding pointer restrict-ness? > > > > It would require data flow anaylsis to produce the diagnostic while > > the first can simply be diagnosed by comparing arguments. > > Agree. It seems like a task for -fanalyzer. > > $ cc -Wall -Wextra -fanalyzer -fuse-linker-plugin -flto d2.c > $ > > I'm unable to trigger that at all. It's probably not implemented, I > guess. I've updated the bug report > <https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112833> to change the > component to 'analyzer'. 
> > At least, I hope there's consensus that while current GCC doesn't warn > about this, ideally it should, which means it should warn for valid uses > of strtol(3), which means strtol(3) should be fixed, in all of ISO, > POSIX, and glibc. I am not sure. > > > > > > Well, I don't know how to report that defect to WG14. If you help me, > > > > > I'll be pleased to do so. Do they have a public mailing list or > > > > > anything like that? > > > > > > > > One can submit clarification or change requests: > > > > > > > > https://www.open-std.org/jtc1/sc22/wg14/www/contributing.html > > P.S.: > > I've sent a mail to UNE (the Spanish National Body for ISO), and > asked them about joining WG14. Let's see what they say. > > P.S. 2: > > I'm also preparing a paper; would you mind championing it if I'm not yet > able to do it when it's ready? Guests can present too. > > P.S. 3: > > Do you know of any Spanish member of WG14? Maybe I can talk with them > to have more information about how they work. You could ask Miguel Ojeda. Martin >
On Fri, 5 Jul 2024 at 16:54, Alejandro Colomar via Gcc <gcc@gcc.gnu.org> wrote:
> At least, I hope there's consensus that while current GCC doesn't warn
> about this, ideally it should, which means it should warn for valid uses
> of strtol(3), which means strtol(3) should be fixed, in all of ISO,
> POSIX, and glibc.

I'm not convinced.  It doesn't look like anybody else is convinced.  I
wouldn't call that consensus.
On Sat, 2024-07-06 at 00:01 +0800, Xi Ruoyao wrote:
> On Fri, 2024-07-05 at 17:53 +0200, Alejandro Colomar wrote:
> > At least, I hope there's consensus that while current GCC doesn't warn
> > about this, ideally it should, which means it should warn for valid uses
> > of strtol(3), which means strtol(3) should be fixed, in all of ISO,
> > POSIX, and glibc.
>
> It **shouldn't**.  strtol will only violate restrict if it's wrongly
> implemented, or something dumb is done like "strtol((const char*) &p,
> &p, 0)".
>
> See my previous reply.

Also, even if we introduce an over-eager warning with many false
positives, there are still many possibilities besides changing the
standard to satisfy such a warning.  For example, g++
-Wdangling-reference has many false positives (IMO it's over-eager) so a
no_dangling attribute has been added to allow suppressing it.
On 05/07/2024 17:11, Jonathan Wakely via Gcc wrote:
> On Fri, 5 Jul 2024 at 16:54, Alejandro Colomar via Gcc <gcc@gcc.gnu.org> wrote:
>> At least, I hope there's consensus that while current GCC doesn't warn
>> about this, ideally it should, which means it should warn for valid uses
>> of strtol(3), which means strtol(3) should be fixed, in all of ISO,
>> POSIX, and glibc.
>
> I'm not convinced.  It doesn't look like anybody else is convinced.  I
> wouldn't call that consensus.

And what's more this prototype is defined by the C standard.  If you
have a beef with that, then you should take it up with the committee; I
can't see GCC wanting to be different on this.

R.
On Fri, 5 Jul 2024 at 17:02, Xi Ruoyao via Gcc <gcc@gcc.gnu.org> wrote:
>
> On Fri, 2024-07-05 at 17:53 +0200, Alejandro Colomar wrote:
> > At least, I hope there's consensus that while current GCC doesn't warn
> > about this, ideally it should, which means it should warn for valid uses
> > of strtol(3), which means strtol(3) should be fixed, in all of ISO,
> > POSIX, and glibc.
>
> It **shouldn't**.  strtol will only violate restrict if it's wrongly
> implemented, or something dumb is done like "strtol((const char*) &p,
> &p, 0)".
>
> See my previous reply.

Right, is there a valid use of strtol where a warning would be justified?

Showing that you can contrive a case where a const char* restrict and
char** restrict can alias doesn't mean there's a problem with strtol.
On Friday, 05.07.2024 at 17:24 +0100, Jonathan Wakely wrote:
> On Fri, 5 Jul 2024 at 17:02, Xi Ruoyao via Gcc <gcc@gcc.gnu.org> wrote:
> >
> > On Fri, 2024-07-05 at 17:53 +0200, Alejandro Colomar wrote:
> > > At least, I hope there's consensus that while current GCC doesn't warn
> > > about this, ideally it should, which means it should warn for valid uses
> > > of strtol(3), which means strtol(3) should be fixed, in all of ISO,
> > > POSIX, and glibc.
> >
> > It **shouldn't**.  strtol will only violate restrict if it's wrongly
> > implemented, or something dumb is done like "strtol((const char*) &p,
> > &p, 0)".
> >
> > See my previous reply.
>
> Right, is there a valid use of strtol where a warning would be justified?
>
> Showing that you can contrive a case where a const char* restrict and
> char** restrict can alias doesn't mean there's a problem with strtol.

I think his point is that a const char* restrict and something which
is stored in a char* whose address is then passed can alias and there
a warning would make sense in other situations.

But I am also not convinced removing restrict would be an improvement.
It would make more sense to have an annotation that indicates that
endptr is only used as output.

Martin
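One existing annotation in that direction is GCC's `access` function attribute (available since GCC 10), which can mark a pointer parameter as write-only. Whether this is the kind of annotation Martin has in mind is an assumption, and the attribute says nothing about aliasing by itself. The sketch below applies it to a hypothetical wrapper named parse_long; the OUT_PARAM macro is also made up for illustration and expands to nothing on compilers without the attribute.

```c
#include <stdlib.h>

#ifndef __has_attribute
# define __has_attribute(x) 0
#endif

/* Hypothetical macro: marks parameter n as written but never read by
 * the callee, when the compiler supports GCC's access attribute. */
#if __has_attribute(access)
# define OUT_PARAM(n) __attribute__((access(write_only, n)))
#else
# define OUT_PARAM(n)
#endif

/* Hypothetical wrapper (not a glibc or ISO C function): endptr is
 * declared as output-only and nptr carries no restrict qualifier. */
OUT_PARAM(2)
long parse_long(const char *nptr, char **endptr, int base);

long parse_long(const char *nptr, char **endptr, int base)
{
        return strtol(nptr, endptr, base);
}
```

A caller can then pass an uninitialized `char *end;` as `&end` without the compiler assuming it is read first, which matches how endptr is used in practice.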
Jonathan Wakely via Gcc <gcc@gcc.gnu.org> writes:
> On Fri, 5 Jul 2024 at 17:02, Xi Ruoyao via Gcc <gcc@gcc.gnu.org> wrote:
>>
>> On Fri, 2024-07-05 at 17:53 +0200, Alejandro Colomar wrote:
>> > At least, I hope there's consensus that while current GCC doesn't warn
>> > about this, ideally it should, which means it should warn for valid uses
>> > of strtol(3), which means strtol(3) should be fixed, in all of ISO,
>> > POSIX, and glibc.
>>
>> It **shouldn't**.  strtol will only violate restrict if it's wrongly
>> implemented, or something dumb is done like "strtol((const char*) &p,
>> &p, 0)".
>>
>> See my previous reply.
>
> Right, is there a valid use of strtol where a warning would be justified?
>
> Showing that you can contrive a case where a const char* restrict and
> char** restrict can alias doesn't mean there's a problem with strtol.

I still don't understand why it'd be appropriate for GCC and glibc to
override this without it even being *brought to* the committee, either.
Hi Xi,

On Fri, Jul 05, 2024 at 11:55:05PM GMT, Xi Ruoyao wrote:
> On Fri, 2024-07-05 at 17:23 +0200, Alejandro Colomar wrote:
> > > strtol does have a "char * restrict * restrict" though, so the
> > > situation is different.  A "char **" and a "const char *"
> > > shouldn't alias anyway.
> >
> > Pedantically, it is actually declared as 'char **restrict' (the inner
> > one is not declared as restrict, even though it will be restricted,
> > since there are no other unrestricted pointers).
>
> So how's the following implementation of strtol (10-based, no negative
> number handling, no overflow handling, ASCII-only) wrong?
>
>     long int my_strtol(const char *restrict nptr, char **restrict endptr)
>     {
>             long ret = 0;
>
>             while (isdigit(*nptr))
>                     ret = ret * 10 + (*nptr++ - '0');
>
>             *endptr = (char *)nptr;
>             return ret;
>     }
>
> There's no dumb thing, there's no restrict violation (unless it's called
> in a stupid way, see below), and there **shouldn't** be a -Wrestrict
> warning.
>
> If you do
>
>     char *x = NULL;
>     strtol((char *)&x, &x, 10);

The restrict in `char **restrict endptr` already protects you from this.
You don't need to make the first parameter also restricted.  See:

    $ cat r.c
    long alx_strtol(const char *nptr, char **restrict endp);

    int main(void)
    {
            char x = 3;
            char *xp = &x;

            alx_strtol(xp, &xp);  // Fine.
            alx_strtol(xp, (char **) xp);  // Bug.
            alx_strtol((char *) &xp, &xp);  // Bug.
    }
    $ cc -Wall -Wextra -S r.c
    r.c: In function ‘main’:
    r.c:9:24: warning: passing argument 2 to ‘restrict’-qualified parameter aliases with argument 1 [-Wrestrict]
        9 |         alx_strtol(xp, (char **) xp);  // Bug.
          |                        ^~~~~~~~~~~~
    r.c:10:34: warning: passing argument 2 to ‘restrict’-qualified parameter aliases with argument 1 [-Wrestrict]
       10 |         alx_strtol((char *) &xp, &xp);  // Bug.
          |                    ~~~~~~~~~~~~  ^~~

Using my proposed prototype wouldn't cause any warnings with a powerful
-fanalyzer that would be able to emit diagnostics with the current
prototype.

In strtol(3), there are 3 pointers:

-  nptr
-  *endptr
-  endptr

The first two should be allowed to alias each other (and at the end of
the call, they definitely alias each other).  It is only the third one
which must not alias any of the others, which is why my patch (v2) keeps
that restrict, but removes the other one.

Does that make sense?

Cheers,
Alex

> it'll violate restrict.  Nobody sane should write this, and it's warned
> anyway:
>
>     t.c: In function 'main':
>     t.c:6:28: warning: passing argument 2 to 'restrict'-qualified parameter
>     aliases with argument 1 [-Wrestrict]
>         6 |         strtol((char *)&x, &x, 10);
>           |                ~~~~~~~~~~  ^~
>
> --
> Xi Ruoyao <xry111@xry111.site>
> School of Aerospace Science and Technology, Xidian University
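The relationship between the three pointers can be sketched with a definition for the alx_strtol() prototype shown in r.c. The body below (a thin wrapper over strtol(3) in base 10) and the consumed() helper are assumptions made for illustration; the thread only gives the declaration.

```c
#include <stddef.h>
#include <stdlib.h>

/* Prototype as proposed in the thread: nptr carries no restrict, endp
 * keeps it.  Implementing it as a base-10 strtol() wrapper is an
 * assumption made for this sketch. */
long alx_strtol(const char *nptr, char **restrict endp)
{
        return strtol(nptr, endp, 10);
}

/* Hypothetical helper: returns how many characters of s were parsed.
 * Before the call, nptr and *endp hold the same address; after it,
 * *endp still points into the same string. */
ptrdiff_t consumed(const char *s)
{
        char *p = (char *) s;

        alx_strtol(p, &p);
        return p - s;
}
```

Here nptr and *endp refer to the same array throughout, while endp itself points at the separate object `p`, which is the only one of the three that must stay distinct.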
On Fri, Jul 05, 2024 at 06:32:26PM GMT, Alejandro Colomar wrote: > Hi Xi, > > On Fri, Jul 05, 2024 at 11:55:05PM GMT, Xi Ruoyao wrote: > > On Fri, 2024-07-05 at 17:23 +0200, Alejandro Colomar wrote: > > > > strtol does have a "char * restrict * restrict" though, so the > > > > situation is different. A "char **" and a "const char *" > > > > shouldn't alias anyway. > > > > > > Pedantically, it is actually declared as 'char **restrict' (the inner > > > one is not declared as restrict, even though it will be restricted, > > > since there are no other unrestricted pointers). > > > > So how's the following implementation of strtol (10-based, no negative > > number handling, no overflow handling, ASCII-only) wrong? > > > > long int my_strtol(const char *restrict nptr, char **restrict endptr) > > { > > long ret = 0; > > > > while (isdigit(*nptr)) > > ret = ret * 10 + (*nptr++ - '0'); > > > > *endptr = (char *)nptr; > > return ret; > > } > > > > There's no dumb thing, there's no restrict violation (unless it's called > > in a stupid way, see below), and there **shouldn't** be a -Wrestrict > > warning. > > > > If you do > > > > char *x = NULL; > > strtol((char *)&x, &x, 10); > > The restrict in `char **restrict endptr` already protects you from this. > You don't need to make the first parameter also restricted. See: > > $ cat r.c > long alx_strtol(const char *nptr, char **restrict endp); > > int main(void) > { > char x = 3; > char *xp = &x; > > alx_strtol(xp, &xp); // Fine. > alx_strtol(xp, (char **) xp); // Bug. > alx_strtol((char *) &xp, &xp); // Bug. > } > $ cc -Wall -Wextra -S r.c > r.c: In function ‘main’: > r.c:9:24: warning: passing argument 2 to ‘restrict’-qualified parameter aliases with argument 1 [-Wrestrict] > 9 | alx_strtol(xp, (char **) xp); // Bug. > | ^~~~~~~~~~~~ > r.c:10:34: warning: passing argument 2 to ‘restrict’-qualified parameter aliases with argument 1 [-Wrestrict] > 10 | alx_strtol((char *) &xp, &xp); // Bug. 
> | ~~~~~~~~~~~~ ^~~ > > Using my proposed prototype wouldn't case any warnings with a powerful > -fanalyzer that would be able to emit diagnostics with the current > prototype. > > In strtol(3), there are 3 pointers: > > - nptr > - *endptr > - endptr > > The first two should be allowed to alias each other (and at the end of > the call, they definitely alias each other). It is only the third one > which must not alias any of the other, which is why my patch (v2) keeps (Whoops; forget about that v2; that was about a similar patch to strsep(3). In this case we're in patch v1; which already had that into consideration. Please read it as s/v2/v1/.) > that restrict, but removes the other one. > > Does that make sense? > > Cheers, > Alex > > > > > it'll violate restrict. Nobody sane should write this, and it's warned > > anyway: > > > > t.c: In function 'main': > > t.c:6:28: warning: passing argument 2 to 'restrict'-qualified parameter > > aliases with argument 1 [-Wrestrict] > > 6 | strtol((char *)&x, &x, 10); > > | ~~~~~~~~~~ ^~ > > > > -- > > Xi Ruoyao <xry111@xry111.site> > > School of Aerospace Science and Technology, Xidian University > > -- > <https://www.alejandro-colomar.es/>
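The claim that nptr and *endptr end up pointing into the same string can be seen with a minimal sketch. The helper names sum_two() and consumed() are hypothetical and exist only for illustration; error handling is ignored, as in the example that started the thread.

```c
#include <stdlib.h>

/* Hypothetical helper: adds the two integers in a string such as "10 20".
 * The same variable p is passed as nptr and, by address, as endptr. */
long sum_two(const char *str)
{
    char *p = (char *)str;   /* strtol() only reads through nptr */
    long a, b;

    a = strtol(p, &p, 0);    /* nptr == p, endptr == &p */
    b = strtol(p, &p, 0);    /* p now points just past the first number */

    return a + b;
}

/* Hypothetical helper: after the call, *endptr points into the buffer
 * that nptr pointed to, so the two can be subtracted. */
size_t consumed(const char *s)
{
    char *end;

    strtol(s, &end, 10);
    return (size_t)(end - s);
}
```

Here sum_two("10 20") adds 10 and 20, and consumed("123abc") reports that three characters were parsed. Neither call modifies the string itself, which is the property the discussion about restrict on nptr turns on.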
Hi, On Fri, Jul 05, 2024 at 06:30:50PM GMT, Martin Uecker wrote: > Am Freitag, dem 05.07.2024 um 17:24 +0100 schrieb Jonathan Wakely: > > On Fri, 5 Jul 2024 at 17:02, Xi Ruoyao via Gcc <gcc@gcc.gnu.org> wrote: > > > > > > On Fri, 2024-07-05 at 17:53 +0200, Alejandro Colomar wrote: > > > > At least, I hope there's consensus that while current GCC doesn't warn > > > > about this, ideally it should, which means it should warn for valid uses > > > > of strtol(3), which means strtol(3) should be fixed, in all of ISO, > > > > POSIX, and glibc. > > > > > > It **shouldn't**. strtol will only violate restrict if it's wrongly > > > implemented, or something dumb is done like "strtol((const char*) &p, > > > &p, 0)". > > > > > > See my previous reply. That's not right. See my reply to yours, Xi. The restrict in char **endptr already prevents calls such as strtol(x, x, 0). The restrict in const char *nptr provides nothing. > > > > Right, is there a valid use of strtol where a warning would be justified? Is there any valid reason to have restrict in the _first_ parameter of strtol(3)? Other than "ISO C says so"? I'll take my beef with ISO C to WG14, and hopefully get that fixed. Can we please discuss this technically, ignoring the existence of ISO C, for the time being? > > Showing that you can contrive a case where a const char* restrict and > > char** restrict can alias doesn't mean there's a problem with strtol. > > I think his point is that a const char* restrict and something which > is stored in a char* whose address is then passed can alias and there > a warning would make sense in other situations. Indeed. > But I am also not convinced removing restrict would be an improvement. > It would make more sense to have an annotation that indicates that > endptr is only used as output. What is the benefit of keeping restrict there? It doesn't provide any benefits, AFAICS. I've prepared a paper for wg14. I'll ask for a number, but will attach it here already. 
I also attach the man(7) source code for it. Cheers, Alex
On Fri, 5 Jul 2024 at 20:28, Alejandro Colomar <alx@kernel.org> wrote: > > Hi, > > On Fri, Jul 05, 2024 at 06:30:50PM GMT, Martin Uecker wrote: > > Am Freitag, dem 05.07.2024 um 17:24 +0100 schrieb Jonathan Wakely: > > > On Fri, 5 Jul 2024 at 17:02, Xi Ruoyao via Gcc <gcc@gcc.gnu.org> wrote: > > > > > > > > On Fri, 2024-07-05 at 17:53 +0200, Alejandro Colomar wrote: > > > > > At least, I hope there's consensus that while current GCC doesn't warn > > > > > about this, ideally it should, which means it should warn for valid uses > > > > > of strtol(3), which means strtol(3) should be fixed, in all of ISO, > > > > > POSIX, and glibc. > > > > > > > > It **shouldn't**. strtol will only violate restrict if it's wrongly > > > > implemented, or something dumb is done like "strtol((const char*) &p, > > > > &p, 0)". > > > > > > > > See my previous reply. > > That's not right. See my reply to yours, Xi. The restrict in > > char **endptr > > already prevents calls such as strtol(x, x, 0). That seems to contradict footnote 153 in C23.
Hi, I have a paper for removing restrict from the first parameter of strtol(3) et al. The title is "strtol(3) et al. shouldn't have a restricted first parameter". If it helps, I already have a draft of the paper, which I attach (both the PDF, and the man(7) source). Cheers, Alex
Hi Jonathan, On Fri, Jul 05, 2024 at 08:38:15PM GMT, Jonathan Wakely wrote: > On Fri, 5 Jul 2024 at 20:28, Alejandro Colomar <alx@kernel.org> wrote: > > > > Hi, > > > > On Fri, Jul 05, 2024 at 06:30:50PM GMT, Martin Uecker wrote: > > > Am Freitag, dem 05.07.2024 um 17:24 +0100 schrieb Jonathan Wakely: > > > > On Fri, 5 Jul 2024 at 17:02, Xi Ruoyao via Gcc <gcc@gcc.gnu.org> wrote: > > > > > > > > > > On Fri, 2024-07-05 at 17:53 +0200, Alejandro Colomar wrote: > > > > > > At least, I hope there's consensus that while current GCC doesn't warn > > > > > > about this, ideally it should, which means it should warn for valid uses > > > > > > of strtol(3), which means strtol(3) should be fixed, in all of ISO, > > > > > > POSIX, and glibc. > > > > > > > > > > It **shouldn't**. strtol will only violate restrict if it's wrongly > > > > > implemented, or something dumb is done like "strtol((const char*) &p, > > > > > &p, 0)". > > > > > > > > > > See my previous reply. > > > > That's not right. See my reply to yours, Xi. The restrict in > > > > char **endptr > > > > already prevents calls such as strtol(x, x, 0). > > That seems to contradict footnote 153 in C23. Did you mean a different footnote number? Here's 153 in N3047: 153) An implementation can delay the choice of which integer type until all enumeration constants have been seen. which seems completely unrelated. Cheers, Alex
On Fri, 5 Jul 2024 at 20:47, Alejandro Colomar <alx@kernel.org> wrote: > > Hi Jonathan, > > On Fri, Jul 05, 2024 at 08:38:15PM GMT, Jonathan Wakely wrote: > > On Fri, 5 Jul 2024 at 20:28, Alejandro Colomar <alx@kernel.org> wrote: > > > > > > Hi, > > > > > > On Fri, Jul 05, 2024 at 06:30:50PM GMT, Martin Uecker wrote: > > > > Am Freitag, dem 05.07.2024 um 17:24 +0100 schrieb Jonathan Wakely: > > > > > On Fri, 5 Jul 2024 at 17:02, Xi Ruoyao via Gcc <gcc@gcc.gnu.org> wrote: > > > > > > > > > > > > On Fri, 2024-07-05 at 17:53 +0200, Alejandro Colomar wrote: > > > > > > > At least, I hope there's consensus that while current GCC doesn't warn > > > > > > > about this, ideally it should, which means it should warn for valid uses > > > > > > > of strtol(3), which means strtol(3) should be fixed, in all of ISO, > > > > > > > POSIX, and glibc. > > > > > > > > > > > > It **shouldn't**. strtol will only violate restrict if it's wrongly > > > > > > implemented, or something dumb is done like "strtol((const char*) &p, > > > > > > &p, 0)". > > > > > > > > > > > > See my previous reply. > > > > > > That's not right. See my reply to yours, Xi. The restrict in > > > > > > char **endptr > > > > > > already prevents calls such as strtol(x, x, 0). > > > > That seems to contradict footnote 153 in C23. > > Did you mean a different footnote number? No. > > Here's 153 in N3047: That draft is nearly two years old. > > 153) An implementation can delay the choice of which integer type until > all enumeration constants have been seen. > > which seems completely unrelated. Because you're looking at a draft from nearly two years ago. Try N3220.
Hi Jonathan,

On Fri, Jul 05, 2024 at 08:52:30PM GMT, Jonathan Wakely wrote:
> > > > > > > It **shouldn't**. strtol will only violate restrict if it's wrongly
> > > > > > > implemented, or something dumb is done like "strtol((const char*) &p,
> > > > > > > &p, 0)".
> > > > > > >
> > > > > > > See my previous reply.
> > > >
> > > > That's not right. See my reply to yours, Xi. The restrict in
> > > >
> > > >     char **endptr
> > > >
> > > > already prevents calls such as strtol(x, x, 0).
> > >
> > > That seems to contradict footnote 153 in C23.
> >
> > Did you mean a different footnote number?
>
> No.
>
> >
> > Here's 153 in N3047:
>
> That draft is nearly two years old.
>
> >
> > 153) An implementation can delay the choice of which integer type until
> > all enumeration constants have been seen.
> >
> > which seems completely unrelated.
>
> Because you're looking at a draft from nearly two years ago. Try N3220.

Ahhh, sorry! Indeed. Let's quote it here, for others to not need to
find it:

    153) In other words, E depends on the value of P itself rather than
    on the value of an object referenced indirectly through P. For
    example, if identifier p has type (int **restrict), then the pointer
    expressions p and p+1 are based on the restricted pointer object
    designated by p, but the pointer expressions *p and p[1] are not.

I don't think footnote 153 is problematic here. Let's have this
prototype:

    long int alx_strtol(const char *nptr, char **restrict endptr, int base);

and let's discuss some example of bad usage:

    char str[] = "1";

    s = str;
    alx_strtol(s, (char **)s, 0);

According to 153, the pointer expression endptr is based on the
restricted pointer object, but *endptr and nptr are not. The user has
passed s as endptr, and also s as nptr. Thus, the object s is being
accessed via a restricted pointer, endptr, and a non-restricted one,
nptr. That's UB.

Let's see a different example of bad usage:

    char str[] = "1";

    s = str;
    alx_strtol((char *)&s, &s, 0);

For similar reasons, it's also UB.

The compiler diagnoses both:

    $ cat r.c
    long alx_strtol(const char *s, char **restrict endp, int base);

    int main(void)
    {
            char x = 3;
            char *xp = &x;

            alx_strtol(xp, &xp, 0); // Fine.
            alx_strtol(xp, (char **) xp, 0); // Bug.
            alx_strtol((char *) &xp, &xp, 0); // Bug.
    }
    $ cc -Wall -Wextra -S r.c
    r.c: In function ‘main’:
    r.c:9:24: warning: passing argument 2 to ‘restrict’-qualified parameter aliases with argument 1 [-Wrestrict]
        9 |         alx_strtol(xp, (char **) xp, 0); // Bug.
          |                        ^~~~~~~~~~~~
    r.c:10:34: warning: passing argument 2 to ‘restrict’-qualified parameter aliases with argument 1 [-Wrestrict]
       10 |         alx_strtol((char *) &xp, &xp, 0); // Bug.
          |                    ~~~~~~~~~~~~  ^~~

Cheers,
Alex
On Fri, Jul 05, 2024 at 08:52:30PM +0100, Jonathan Wakely wrote: > On Fri, 5 Jul 2024 at 20:47, Alejandro Colomar <alx@kernel.org> wrote: > > > > Here's 153 in N3047: > > That draft is nearly two years old. > > > > > 153) An implementation can delay the choice of which integer type until > > all enumeration constants have been seen. > > > > which seems completely unrelated. > > Because you're looking at a draft from nearly two years ago. Try N3220. That is 6.7.3.1p3: In what follows, a pointer expression E is said to be based on object P if (at some sequence point in the execution of B prior to the evaluation of E) modifying P to point to a copy of the array object into which it formerly pointed would change the value of E.153) Note that "based" is defined only for expressions with pointer types. Footnote 153) In other words, E depends on the value of P itself rather than on the value of an object referenced indirectly through P. For example, if identifier p has type (int **restrict), then the pointer expressions p and p+1 are based on the restricted pointer object designated by p, but the pointer expressions *p and p[1] are not. Which would be the same paragraph of the same section on N3047, but footnote number 168. o/ emanuele6
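The distinction drawn in footnote 153 can be illustrated with a small sketch. The function sum_via_table() is a hypothetical name, not part of any library, and the comments restate the footnote rather than add to it.

```c
#include <stddef.h>

/* Hypothetical helper illustrating footnote 153.
 * pp has type (int **restrict):
 *   - the expressions pp and pp + i are based on the restricted pointer pp;
 *   - the expressions *pp and pp[i] are values loaded from the table, and
 *     are not based on pp, so restrict on pp says nothing about what
 *     those inner pointers may alias. */
int sum_via_table(int **restrict pp, size_t n)
{
    int total = 0;

    for (size_t i = 0; i < n; i++) {
        int *elem = pp[i];   /* not based on pp */

        total += *elem;      /* read only; nothing is modified here */
    }
    return total;
}
```

A table that lists the same int twice is acceptable input here, since the inner pointers are not restricted and no object is modified. The same reasoning is why `char **restrict endptr` constrains endptr but not *endptr.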
On Friday, 05.07.2024 at 21:28 +0200, Alejandro Colomar wrote:

...
>
> > > Showing that you can contrive a case where a const char* restrict and
> > > char** restrict can alias doesn't mean there's a problem with strtol.
> >
> > I think his point is that a const char* restrict and something which
> > is stored in a char* whose address is then passed can alias and there
> > a warning would make sense in other situations.
>
> Indeed.
>
> > But I am also not convinced removing restrict would be an improvement.
> > It would make more sense to have an annotation that indicates that
> > endptr is only used as output.
>
> What is the benefit of keeping restrict there? It doesn't provide any
> benefits, AFAICS.

Not really, I think. I am generally not a fan of restrict.
IMHO it is misdesigned and I would like to see it replaced
with something better. But I am also not convinced it really
helps to remove it here.

If we marked endptr as "write_only" (which it might already
be) then a future warning mechanism for -Wrestrict could
ignore the content of *endptr.

Martin
On Fri, 5 Jul 2024 at 21:26, Martin Uecker <muecker@gwdg.de> wrote: > > Am Freitag, dem 05.07.2024 um 21:28 +0200 schrieb Alejandro Colomar: > > ... > > > > > > Showing that you can contrive a case where a const char* restrict and > > > > char** restrict can alias doesn't mean there's a problem with strtol. > > > > > > I think his point is that a const char* restrict and something which > > > is stored in a char* whose address is then passed can alias and there > > > a warning would make sense in other situations. > > > > Indeed. > > > > > But I am also not convinced removing restrict would be an improvement. > > > It would make more sense to have an annotation that indicates that > > > endptr is only used as output. > > > > What is the benefit of keeping restrict there? It doesn't provide any > > benefits, AFAICS. > > Not really I think. I am generally not a fan of restrict. > IMHO it is misdesigned and I would like to see it replaced > with something better. But I also not convinced it really > helps to remove it here. > > If we marked endptr as "write_only" (which it might already > be) then a future warning mechanism for -Wrestrict could > ignore the content of *endptr. That seems more useful. Add semantic information instead of taking it away. If the concern is a hypothetical future compiler warning that would give false positives for perfectly valid uses of strtol, then the problem is the compiler warning, not strtol. If additional information can be added to avoid the false positives (and also possibly optimize the code better), then that wouldn't require a change to the standard motivated by a hypothetical compiler warning.
On Fri, Jul 05, 2024 at 22:15:44 +0200, Emanuele Torre via Gcc wrote: > That is 6.7.3.1p3: > > > > In what follows, a pointer expression E is said to be based on object P > if (at some sequence point in the execution of B prior to the > evaluation of E) modifying P to point to a copy of the array object > into which it formerly pointed would change the value of E.153) Note > that "based" is defined only for expressions with pointer types. > > Footnote 153) In other words, E depends on the value of P itself rather > than on the value of an object referenced indirectly through P. For > example, if identifier p has type (int **restrict), then the pointer > expressions p and p+1 are based on the restricted pointer object > designated by p, but the pointer expressions *p and p[1] are not. > > > > Which would be the same paragraph of the same section on N3047, but > footnote number 168. Obviously, we need better types on our standardese pointers. Jonathan used a pointer of type `footnote _N3220*` while Alejandro was expecting a pointer of type `footnote _N3047*`. Of course, given that they end up referring to the same thing, this may have interesting implications when they are used with `restrict`. --Ben
On Fri, Jul 05, 2024 at 09:28:46PM GMT, Jonathan Wakely wrote: > On Fri, 5 Jul 2024 at 21:26, Martin Uecker <muecker@gwdg.de> wrote: > > > > Am Freitag, dem 05.07.2024 um 21:28 +0200 schrieb Alejandro Colomar: > > > > ... > > > > > > > > Showing that you can contrive a case where a const char* restrict and > > > > > char** restrict can alias doesn't mean there's a problem with strtol. > > > > > > > > I think his point is that a const char* restrict and something which > > > > is stored in a char* whose address is then passed can alias and there > > > > a warning would make sense in other situations. > > > > > > Indeed. > > > > > > > But I am also not convinced removing restrict would be an improvement. > > > > It would make more sense to have an annotation that indicates that > > > > endptr is only used as output. > > > > > > What is the benefit of keeping restrict there? It doesn't provide any > > > benefits, AFAICS. > > > > Not really I think. I am generally not a fan of restrict. > > IMHO it is misdesigned and I would like to see it replaced > > with something better. But I also not convinced it really > > helps to remove it here. > > > > If we marked endptr as "write_only" (which it might already > > be) then a future warning mechanism for -Wrestrict could > > ignore the content of *endptr. > > > That seems more useful. Add semantic information instead of taking it > away. How does restrict on nptr (or conversely on *endptr) add any semantic information? Can you please phrase the semantic information provided by it? How is it useful to the caller? How is it useful to the callee? Cheers, Alex > If the concern is a hypothetical future compiler warning that > would give false positives for perfectly valid uses of strtol, then > the problem is the compiler warning, not strtol. 
If additional > information can be added to avoid the false positives (and also > possibly optimize the code better), then that wouldn't require a > change to the standard motivated by a hypothetical compiler warning.
On Fri, Jul 05, 2024 at 09:28:46PM GMT, Jonathan Wakely wrote:
> > If we marked endptr as "write_only" (which it might already
> > be) then a future warning mechanism for -Wrestrict could
> > ignore the content of *endptr.
>
>
> That seems more useful. Add semantic information instead of taking it
> away. If the concern is a hypothetical future compiler warning that
> would give false positives for perfectly valid uses of strtol, then
> the problem is the compiler warning, not strtol. If additional
> information can be added to avoid the false positives (and also
> possibly optimize the code better), then that wouldn't require a
> change to the standard motivated by a hypothetical compiler warning.

Let me be a little bit sarcastic.

If so, let's take down -Wrestrict altogether, because it triggers false
positives at the same rate. How is it even in -Wall and not just
-Wextra?

Here's a false positive:

    $ cat d.c
    int is_same_pointer(const char *restrict ca, char *restrict a)
    {
            return ca == a;
    }

    int main(void)
    {
            char x = 3;
            char *xp = &x;
            is_same_pointer(xp, xp);
    }
    $ cc -Wall d.c
    d.c: In function ‘main’:
    d.c:10:9: warning: passing argument 2 to ‘restrict’-qualified parameter aliases with argument 1 [-Wrestrict]
       10 |         is_same_pointer(xp, xp);
          |         ^~~~~~~~~~~~~~~

It's impossible to know if a use of restrict causes UB without reading
both the source code of the caller and the callee, so except for
-fanalyzer, it's impossible to diagnose something with certainty.

So, it's certainly not something we want in -Wall.

Or should I remove the 'restrict' qualifier from that function, losing
"precious" semantic information, just because the compiler doesn't like
it?

Cheers,
Alex
On Fri, 5 Jul 2024 at 21:55, Alejandro Colomar <alx@kernel.org> wrote: > > On Fri, Jul 05, 2024 at 09:28:46PM GMT, Jonathan Wakely wrote: > > > If we marked endptr as "write_only" (which it might already > > > be) then a future warning mechanism for -Wrestrict could > > > ignore the content of *endptr. > > > > > > That seems more useful. Add semantic information instead of taking it > > away. If the concern is a hypothetical future compiler warning that > > would give false positives for perfectly valid uses of strtol, then > > the problem is the compiler warning, not strtol. If additional > > information can be added to avoid the false positives (and also > > possibly optimize the code better), then that wouldn't require a > > change to the standard motivated by a hypothetical compiler warning. > > Let me be a little bit sarcastic. > > If so, let's take down -Wrestrict at all, because it triggers false > positives at the same rate. How is it even in -Wall and not just > -Wextra? > > Here's a false positive: > > $ cat d.c > int is_same_pointer(const char *restrict ca, char *restrict a) > { > return ca == a; This is a strawman argument, because all your example functions have been pretty useless and/or not good uses of restrict. Yes, if you put restrict on functions where you don't ever access any objects through the pointers, the restrict qualifiers are misleading and so compilers might give bad warnings for your bad API. > } > > int main(void) > { > char x = 3; > char *xp = &x; > is_same_pointer(xp, xp); > } > $ cc -Wall d.c > d.c: In function ‘main’: > d.c:10:9: warning: passing argument 2 to ‘restrict’-qualified parameter aliases with argument 1 [-Wrestrict] > 10 | d(xp, xp); > | ^~~~~~~~~~~~~~~ > > It's impossible to know if a use of restrict causes UB without reading > both the source code of the caller and the callee, so except for > -fanalyzer, it's impossible to diagnose something with certainty. > > So, it's certainly not something we want in -Wall. 
> > Or should I remove the 'restrict' qualifier from that function, loosing > "precious" semantic information, just because the compiler doesn't like > it? > > Cheers, > Alex > > > -- > <https://www.alejandro-colomar.es/>
Hi Jonathan, On Fri, Jul 05, 2024 at 10:39:52PM GMT, Jonathan Wakely wrote: > On Fri, 5 Jul 2024 at 21:55, Alejandro Colomar <alx@kernel.org> wrote: > > > > On Fri, Jul 05, 2024 at 09:28:46PM GMT, Jonathan Wakely wrote: > > > > If we marked endptr as "write_only" (which it might already > > > > be) then a future warning mechanism for -Wrestrict could > > > > ignore the content of *endptr. > > > > > > > > > That seems more useful. Add semantic information instead of taking it > > > away. If the concern is a hypothetical future compiler warning that > > > would give false positives for perfectly valid uses of strtol, then > > > the problem is the compiler warning, not strtol. If additional > > > information can be added to avoid the false positives (and also > > > possibly optimize the code better), then that wouldn't require a > > > change to the standard motivated by a hypothetical compiler warning. > > > > Let me be a little bit sarcastic. > > > > If so, let's take down -Wrestrict at all, because it triggers false > > positives at the same rate. How is it even in -Wall and not just > > -Wextra? > > > > Here's a false positive: > > > > $ cat d.c > > int is_same_pointer(const char *restrict ca, char *restrict a) > > { > > return ca == a; > > This is a strawman argument, because all your example functions have > been pretty useless and/or not good uses of restrict. > > Yes, if you put restrict on functions where you don't ever access any > objects through the pointers, the restrict qualifiers are misleading That's precisely the case with strtol(3): it doesn't access any objects through *endptr, and so that pointer need not be restrict. Then, nptr is a read-only pointer, so is doesn't matter either if it's accessed or not. 
Let's say we add as many attributes as possible to strtol(3): [[gnu::access(read_only, 1)]] [[gnu::access(write_only, 1)]] [[gnu::leaf]] [[gnu::nothrow]] [[gnu::null_terminated_string_arg(1)]] // [[gnu::access(none, *1)]] long alx_strtol(const char *nptr, char **_Nullable restrict endp, int base); Let's say we could mark *endptr as a 'access(none)' pointer, since it's not accessed. Let's say we do that with [[gnu::access(none, *1)]]. Then, do you think the information of that prototype is any different than a prototype with restrict on the remaining pointers? [[gnu::access(read_only, 1)]] [[gnu::access(write_only, 1)]] [[gnu::leaf]] [[gnu::nothrow]] [[gnu::null_terminated_string_arg(1)]] // [[gnu::access(none, *1)]] long alx_strtol(const char *restrict nptr, char *restrict *_Nullable restrict endp, int base); I don't think so. Since *endptr is access(none), it certainly cannot access nptr, and thus the qualifier on nptr is superfluous. And even without the hypothetical [[gnu::access(none, *1)]]: - The callee doesn't care about restrict, because it doesn't access any objects via *endptr, so it certainly knows that nptr can be read without any concerns about optimization. - The caller can't know if strtol(3) accesses *endptr, or nptr, and so it can't optimize. Unless it passes an uninitialized value in *endptr, which means the caller knows for sure that nptr won't be written, regardless of restrict on it or not. Please, describe what's the information you think is being added by having restrict on nptr, on how it would be lost if we remove it. Cheers, Alex > and so compilers might give bad warnings for your bad API.
On Sat, Jul 06, 2024 at 12:02:06AM GMT, Alejandro Colomar wrote: > Hi Jonathan, > > On Fri, Jul 05, 2024 at 10:39:52PM GMT, Jonathan Wakely wrote: > > On Fri, 5 Jul 2024 at 21:55, Alejandro Colomar <alx@kernel.org> wrote: > > > > > > On Fri, Jul 05, 2024 at 09:28:46PM GMT, Jonathan Wakely wrote: > > > > > If we marked endptr as "write_only" (which it might already > > > > > be) then a future warning mechanism for -Wrestrict could > > > > > ignore the content of *endptr. > > > > > > > > > > > > That seems more useful. Add semantic information instead of taking it > > > > away. If the concern is a hypothetical future compiler warning that > > > > would give false positives for perfectly valid uses of strtol, then > > > > the problem is the compiler warning, not strtol. If additional > > > > information can be added to avoid the false positives (and also > > > > possibly optimize the code better), then that wouldn't require a > > > > change to the standard motivated by a hypothetical compiler warning. > > > > > > Let me be a little bit sarcastic. > > > > > > If so, let's take down -Wrestrict at all, because it triggers false > > > positives at the same rate. How is it even in -Wall and not just > > > -Wextra? > > > > > > Here's a false positive: > > > > > > $ cat d.c > > > int is_same_pointer(const char *restrict ca, char *restrict a) > > > { > > > return ca == a; > > > > This is a strawman argument, because all your example functions have > > been pretty useless and/or not good uses of restrict. > > > > Yes, if you put restrict on functions where you don't ever access any > > objects through the pointers, the restrict qualifiers are misleading > > That's precisely the case with strtol(3): it doesn't access any objects > through *endptr, and so that pointer need not be restrict. > > Then, nptr is a read-only pointer, so is doesn't matter either if it's > accessed or not. 
> > Let's say we add as many attributes as possible to strtol(3): > > [[gnu::access(read_only, 1)]] > [[gnu::access(write_only, 1)]] s/1/2/ > [[gnu::leaf]] > [[gnu::nothrow]] > [[gnu::null_terminated_string_arg(1)]] > // [[gnu::access(none, *1)]] s/*1/*2/ > long > alx_strtol(const char *nptr, char **_Nullable restrict endp, int base); > > Let's say we could mark *endptr as a 'access(none)' pointer, since it's > not accessed. Let's say we do that with [[gnu::access(none, *1)]]. s/*1/*2/ > Then, do you think the information of that prototype is any different > than a prototype with restrict on the remaining pointers? > > [[gnu::access(read_only, 1)]] > [[gnu::access(write_only, 1)]] s/1/2/ > [[gnu::leaf]] > [[gnu::nothrow]] > [[gnu::null_terminated_string_arg(1)]] > // [[gnu::access(none, *1)]] s/*1/*2/ > long > alx_strtol(const char *restrict nptr, > char *restrict *_Nullable restrict endp, int base); > > I don't think so. Since *endptr is access(none), it certainly cannot > access nptr, and thus the qualifier on nptr is superfluous. > > And even without the hypothetical [[gnu::access(none, *1)]]: s/*1/*2/ > > - The callee doesn't care about restrict, because it doesn't access > any objects via *endptr, so it certainly knows that nptr can be read > without any concerns about optimization. > > - The caller can't know if strtol(3) accesses *endptr, or nptr, and so > it can't optimize. Unless it passes an uninitialized value in > *endptr, which means the caller knows for sure that nptr won't be > written, regardless of restrict on it or not. > > Please, describe what's the information you think is being added by > having restrict on nptr, on how it would be lost if we remove it. > > Cheers, > Alex > > > and so compilers might give bad warnings for your bad API. > > -- > <https://www.alejandro-colomar.es/>
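With the argument indices corrected as in the message above (read_only on parameter 1, write_only on parameter 2), the two access attributes that exist today can be tried out in a small sketch. Everything here is an assumption for illustration: sketch_strtol() and the SKETCH_ACCESS macro are hypothetical names, the wrapper is not a glibc declaration, and the hypothetical access(none, *2) annotation is left out because no such syntax exists. The attribute is guarded so that it expands to nothing on compilers other than GCC 10 or later.

```c
#include <stdlib.h>

/* access() is a GCC function attribute (GCC 10 and later).
 * On other compilers the macro expands to nothing. */
#if defined(__GNUC__) && !defined(__clang__) && (__GNUC__ >= 10)
# define SKETCH_ACCESS(mode, idx)  __attribute__((access(mode, idx)))
#else
# define SKETCH_ACCESS(mode, idx)
#endif

/* Hypothetical prototype: nptr is only read, endp is only written,
 * and only endp keeps its restrict qualifier. */
SKETCH_ACCESS(read_only, 1)
SKETCH_ACCESS(write_only, 2)
long sketch_strtol(const char *nptr, char **restrict endp, int base);

long sketch_strtol(const char *nptr, char **restrict endp, int base)
{
    /* Forward to the real function; the sketch only changes the prototype. */
    return strtol(nptr, endp, base);
}
```

Calling `sketch_strtol(p, &p, 0)` twice on a buffer holding "7 8" walks through both numbers, exactly as with strtol(3). Whether a compiler makes any use of the attributes for diagnostics or optimization is not something this sketch demonstrates.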
On Sat, 2024-07-06 at 00:02 +0200, Alejandro Colomar wrote:
> That's precisely the case with strtol(3): it doesn't access any objects
> through *endptr, and so that pointer need not be restrict.
>
> Then, nptr is a read-only pointer, so is doesn't matter either if it's
> accessed or not.

Restrict then allows the compiler to reorder any write to another object
with a read from nptr. In strtol, at least errno can be written, and
depending on how the locale handling is implemented there may be more.

TBAA (type-based alias analysis) does not help here because char may
alias anything.
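The reordering argument can be made concrete with a sketch that is deliberately not strtol itself. The function sum_digits() and its parameters are hypothetical; the comments describe what a compiler is permitted to assume, not what any particular compiler does.

```c
/* Hypothetical function: reads characters through s while writing an int
 * through count on every iteration.
 *
 * Without restrict on s, the compiler has to allow for the possibility
 * that the store to *count changes the bytes read through s, because a
 * char lvalue may alias any object.
 * With restrict on s, passing a count that overlaps the string would be
 * undefined, so the compiler may reorder the reads and the writes. */
long sum_digits(const char *restrict s, int *count)
{
    long sum = 0;

    *count = 0;
    while (*s >= '0' && *s <= '9') {
        sum += *s - '0';   /* read through s */
        ++*count;          /* write to another object */
        s++;
    }
    return sum;
}
```

For the input "123x" this returns 6 and stores 3 in *count. In strtol(3) the role of *count would be played by errno or by locale state, which is the case raised above.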
On Sat, 2024-07-06 at 10:24 +0800, Xi Ruoyao wrote:
> On Sat, 2024-07-06 at 00:02 +0200, Alejandro Colomar wrote:
> > That's precisely the case with strtol(3): it doesn't access any objects
> > through *endptr, and so that pointer need not be restrict.
> >
> > Then, nptr is a read-only pointer, so is doesn't matter either if it's
> > accessed or not.
>
> Restrict allows to reorder any writes to other objects with an read from
> nptr then. In strtol at least errno can be written, and depending on the
> implementation of locale things there may be more.
>
> TBAA does not help here because char aliases with anything.

Also, if the implementation of strtol passes nptr to an auxiliary
function that has no const or access attributes, the compiler must
assume that function may write into the buffer nptr points to, because
in C you can legally write through a const T * unless the pointed-to
object is itself defined as const T (rather than unqualified T).

BTW among your list:

> > [[gnu::access(read_only, 1)]]
> > [[gnu::access(write_only, 2)]]
> > [[gnu::leaf]]
> > [[gnu::nothrow]]
> > [[gnu::null_terminated_string_arg(1)]]

IMO we should add these access attributes; they'll definitely help
optimization (for example, optimizing away the initialization of a
pointer).

We already have __THROW which expands to nothrow and leaf.

I'm not sure if null_terminated_string_arg is correct: is the following
invalid or not?

    char p[] = {'1', ')'};
    char *q;
    strtol(p, &q, 10);
    assert(q == &p[1]);

If this is invalid, we should have null_terminated_string_arg so that we
at least get a warning for it.
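The point about writing through a `const T *` can be shown with a minimal sketch. The function names are hypothetical; the behavior is defined only because the buffer itself is not const-qualified.

```c
#include <string.h>

/* Hypothetical auxiliary function: the const in the parameter type does
 * not prevent a write, provided the object pointed to is not itself
 * defined as const. */
void overwrite_first(const char *p)
{
    *(char *)p = 'X';
}

/* Returns 1 if the write through the const-qualified pointer took effect. */
int demo_const_write(void)
{
    char buf[] = "123";   /* unqualified char array, so the write is valid */

    overwrite_first(buf);
    return strcmp(buf, "X23") == 0;
}
```

Had buf been declared `const char buf[]`, the same write would be undefined behavior. A compiler that sees only the prototype of overwrite_first() cannot tell the two cases apart, which is why the const qualifier alone gives it no guarantee.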
Hi Xi, On Sat, Jul 06, 2024 at 10:39:41AM GMT, Xi Ruoyao wrote: > BTW among your list: > > > > [[gnu::access(read_only, 1)]] > > > [[gnu::access(write_only, 2)]] > > > [[gnu::leaf]] > > > [[gnu::nothrow]] > > > [[gnu::null_terminated_string_arg(1)]] > > IMO we should add these access attributes, they'll definitely help the > optimization (like, optimize away the initialization of a pointer). > > We already have __THROW which expands to nothrow and leaf. > > I'm not sure if null_terminated_string_arg is correct: is the following > invalid or not? > > char p[] = {'1', ')'}; > char *q; > strtol(p, &q, 10); > assert(q == &p[1]); > > If this is invalid we should have null_terminated_string_arg so at least > we'll get a warning against this. ISO C says: """ The strtol, strtoll, strtoul, and strtoull functions convert the initial portion of the string pointed to by nptr to long int, long long int, unsigned long int, and unsigned long long int representation, respectively. First, they decompose the input string into three parts: an initial, possibly empty, sequence of white-space characters (as specified by the isspace function), a subject sequence resembling an integer represented in some radix determined by the value of base, and a final string of one or more unrecognized characters, including the terminating null character of the input string. Then, they attempt to convert the subject sequence to an integer, and return the result. """ <http://port70.net/~nsz/c/c11/n1570.html#7.22.1.4p2> I'd say it's a string. Have a lovely day! Alex
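The three-part decomposition in the quoted paragraph (leading white space, subject sequence, final string) can be observed through endptr. This is a minimal sketch with a hypothetical helper, parse_prefix(), and it uses a properly terminated string; it does not settle the question about the unterminated array.

```c
#include <stdlib.h>

/* Hypothetical helper: converts the subject sequence and reports where the
 * final string (the unrecognized remainder) begins. */
long parse_prefix(const char *s, const char **rest)
{
    char *end;
    long value = strtol(s, &end, 10);

    *rest = end;   /* points at the first unrecognized character */
    return value;
}
```

For the input "  42abc", the two leading spaces are skipped, "42" is the subject sequence, and *rest is left pointing at "abc", whose last element is the terminating null character of the input string.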
Hi Xi, On Sat, Jul 06, 2024 at 10:24:16AM GMT, Xi Ruoyao wrote: > On Sat, 2024-07-06 at 00:02 +0200, Alejandro Colomar wrote: > > That's precisely the case with strtol(3): it doesn't access any objects > > through *endptr, and so that pointer need not be restrict. > > > > Then, nptr is a read-only pointer, so is doesn't matter either if it's > > accessed or not. > > Restrict allows to reorder any writes to other objects with an read from > nptr then. In strtol at least errno can be written, and depending on the > implementation of locale things there may be more. This does not apply here, I think. Let's include errno in the list of objects that strtol(3) takes, and list their access modes: - nptr access(read_only) - *endptr access(none) - endptr access(read_write) [it checks for NULL; I had forgotten] - errno access(read_write) In the callee: ~~~~~~~~~~~~~~ The access modes are known by the callee, because of course it knows what it does, so even without the attributes, it knows that. strtol(3) cannot write to errno until it has parsed the string. And once it knows it has failed (so wants to set errno), it has no reasons to read nptr again. Thus, even without knowing if 'errno' and 'nptr' are the same thing, there's nothing that could be optimized. *endptr is access(none), so it is implicitly restricted even without specifying restrict on it; the callee couldn't care less about it. endptr is 'restrict', so it can be treated separately from the rest. In the caller: ~~~~~~~~~~~~~~ We can't specify the access mode of *endptr nor errno, so the caller must assume they are read_write. endptr is 'restrict', but this is only useful for having warnings against dumb things such as strtol(x, x, 0). Other than that, the caller knows what it has passed in endptr, so it knows it's a different thing. The caller also knows that it hasn't passed errno as nptr. 
Then we must make a distinction in what the caller passes in *endptr: *endptr is uninitialized: ~~~~~~~~~~~~~~~~~~~~~~~~~ The caller knows that nptr is restricted even without the qualifier, since all other objects are either restricted, or known to be different. *endptr points to the same thing as nptr: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Regardless of the 'restrict' qualifier being specified or not, the caller has no way to determine if the callee accesses the object via nptr or via *endptr, so it must assume the worst case: *endptr; and so it must assume it could have written to it (since *endptr is non-const --and even if it were const, as you said, it means nothing--). So, even considering errno in the game, I don't see any difference if we specify nptr to be restrict or not. Thanks for the feedback! I'll incorporate the discussion about errno in the paper for WG14. Have a lovely day! Alex > > TBAA does not help here because char aliases with anything.
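The case where *endptr points to the same thing as nptr is the common idiom this thread started from. A minimal sketch of it, with the error handling the original example left out, might look as follows; `parse_pair` is a hypothetical helper, and the error policy (treat "no digits" or any errno value as failure) is an assumption made for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parse two integers from a string such as
 * "10 20".  Returns 0 on success and -1 on failure.
 */
int
parse_pair(char *str, long *a, long *b)
{
    char *p = str;
    char *start;

    assert(str != NULL && a != NULL && b != NULL);

    errno = 0;
    start = p;
    *a = strtol(p, &p, 0);      /* *endptr aliases nptr here */
    if (p == start || errno != 0)
        return -1;

    errno = 0;
    start = p;
    *b = strtol(p, &p, 0);      /* continue where the first call stopped */
    if (p == start || errno != 0)
        return -1;

    return 0;
}
```

The caller reads errno only after strtol() has returned, which is consistent with the observation above that the callee has no reason to read nptr again once it decides to set errno.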
On Sat, Jul 06, 2024 at 08:10:28AM GMT, Alejandro Colomar wrote: > Hi Xi, > > On Sat, Jul 06, 2024 at 10:24:16AM GMT, Xi Ruoyao wrote: > > On Sat, 2024-07-06 at 00:02 +0200, Alejandro Colomar wrote: > > > That's precisely the case with strtol(3): it doesn't access any objects > > > through *endptr, and so that pointer need not be restrict. > > > > > > Then, nptr is a read-only pointer, so is doesn't matter either if it's > > > accessed or not. > > > > Restrict allows to reorder any writes to other objects with an read from > > nptr then. In strtol at least errno can be written, and depending on the > > implementation of locale things there may be more. > > This does not apply here, I think. Let's include errno in the list of > objects that strtol(3) takes, and list their access modes: > > - nptr access(read_only) > - *endptr access(none) > - endptr access(read_write) [it checks for NULL; I had forgotten] Sorry, I was right the first time; it's write_only. The NULL check is on the pointer; not the pointee. > - errno access(read_write) > > In the callee: > ~~~~~~~~~~~~~~ > > The access modes are known by the callee, because of course it knows > what it does, so even without the attributes, it knows that. > > strtol(3) cannot write to errno until it has parsed the string. And > once it knows it has failed (so wants to set errno), it has no reasons > to read nptr again. Thus, even without knowing if 'errno' and 'nptr' > are the same thing, there's nothing that could be optimized. > > *endptr is access(none), so it is implicitly restricted even without > specifying restrict on it; the callee couldn't care less about it. > > endptr is 'restrict', so it can be treated separately from the rest. > > In the caller: > ~~~~~~~~~~~~~~ > > We can't specify the access mode of *endptr nor errno, so the caller > must assume they are read_write. > > endptr is 'restrict', but this is only useful for having warnings > against dumb things such as strtol(x, x, 0). 
Other than that, the > caller knows what it has passed in endptr, so it knows it's a different > thing. > > The caller also knows that it hasn't passed errno as nptr. > > Then we must make a distinction in what the caller passes in *endptr: > > *endptr is uninitialized: > ~~~~~~~~~~~~~~~~~~~~~~~~~ > > The caller knows that nptr is restricted even without the qualifier, > since all other objects are either restricted, or known to be different. > > *endptr points to the same thing as nptr: > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > > Regardless of the 'restrict' qualifier being specified or not, the > caller has no way to determine if the callee accesses the object via > nptr or via *endptr, so it must assume the worst case: *endptr; and so > it must assume it could have written to it (since *endptr is non-const > --and even if it were const, as you said, it means nothing--). > > > So, even considering errno in the game, I don't see any difference if we > specify nptr to be restrict or not. > > Thanks for the feedback! I'll incorporate the discussion about errno in > the paper for WG14. > > Have a lovely day! > Alex > > > > > TBAA does not help here because char aliases with anything. > > -- > <https://www.alejandro-colomar.es/>
Hi, I've incorporated feedback, and here's a new revision, let's call it v0.2, of the draft for a WG14 paper. I've attached the man(7) source, and the generated PDF. Cheers, Alex
Hi Alejandro, if in caller it is known that endptr has access mode "write_only" then it can conclude that the content of *endptr has access mode "none", couldn't it? You also need to discuss backwards compatibility. Changing the type of those functions can break valid programs. You would need to make a case that this is unlikely to affect any real world program. Martin Am Sonntag, dem 07.07.2024 um 03:58 +0200 schrieb Alejandro Colomar: > Hi, > > I've incorporated feedback, and here's a new revision, let's call it > v0.2, of the draft for a WG14 paper. I've attached the man(7) source, > and the generated PDF. > > Cheers, > Alex > >
On 7/7/24 03:58, Alejandro Colomar wrote: > I've incorporated feedback, and here's a new revision, let's call it > v0.2, of the draft for a WG14 paper. Although I have not followed the email discussion closely, I read v0.2 and think that as stated there is little chance that its proposal will be accepted. Fundamentally the proposal is trying to say that there are two styles X and Y for declaring strtol and similar functions, and that although both styles are correct in some sense, style Y is better than style X. However, the advantages of Y are not clearly stated and the advantages of style X over Y are not admitted, so the proposal is not making the case clearly and fairly. One other thing: to maximize the chance of a proposal being accepted, please tailor it for its expected readership. The C committee is expert on ‘restrict’, so don’t try to define ‘restrict’ in your own way. Unless merely repeating the language of the standard, any definition given for ‘restrict’ is likely to cause the committee to quibble with the restatement of the standard wording. (It is OK to mention some corollaries of the standard definition, so long as the corollaries are not immediately obvious.) Here are some comments about the proposal. At the start these comments are detailed; towards the end, as I could see the direction the proposal was headed and was convinced it wouldn’t be accepted as stated, the comments are less detailed. "The API may copy" One normally doesn’t think of the application programming interface as copying. Please replace the phrase “the API” with “the caller” or “the callee” as appropriate. (Although ‘restrict’ can be used in places other than function parameters, I don’t think the proposal is concerned about those cases and so it doesn’t need to go into that.) "To avoid violations of for example C11::6.5.16.1p3," Code that violates C11::6.5.16.1p3 will do so regardless of whether ‘restrict’ is present. I would not mention C11::6.5.16.1p3 as it’s a red herring. 
Fundamentally, ‘restrict’ is not about the consequences of caching when one does overlapping moves; it’s about caching in a more general sense. “As long as an object is only accessed via one restricted pointer, other restricted pointers are allowed to point to the same object.” “only accessed” → “accessed only” “This is less strict than I think it should be, but this proposal doesn’t attempt to change that definition.” I would omit this sentence and all similar sentences. Don’t distract the reader with other potential proposals. The proposal as it stands is complicated enough. “return ca > a;” “return ca > *ap;” I fail to understand why these examples are present. It’s not simply that nobody writes code like that: the examples are not on point. I would remove the entire programs containing them, along with the sections that discuss them. When writing to the C committee one can assume the reader is expert in ‘restrict’, there is no need for examples such as these. “strtol(3) accepts 4 objects via pointer parameters and global variables.” Omit the “(3)”, here and elsewhere, as the audience is the C standard committee. “accepts” is a strange word to use here: normally one says “accepts” to talk about parameters, not global variables. Also, “global variables” is not right here. The C standard allows strtol, for example, to read and write an internal static cache. (Yes, that would be weird, but it’s allowed.) I suggest rephrasing this sentence to talk about accessing, not accepting. “endptr access(write_only) ... *endptr access(none)” This is true for glibc, but it’s not necessarily true for all conforming strtol implementations. If endptr is non-null, a conforming strtol implementation can both read and write *endptr; it can also both read and write **endptr. (Although it would need to write before reading, reading is allowed.) “This qualifier helps catch obvious bugs such as strtol(p, p, 0) and strtol(&p, &p, 0) .” No it doesn’t. 
Ordinary type checking catches those obvious bugs, and ‘restrict’ provides no additional help there. Please complicate the examples to make the point more clearly. “The caller knows that errno doesn’t alias any of the function arguments.” Only because all args are declared with ‘restrict’. So if the proposal is accepted, the caller doesn’t necessarily know that. “The callee knows that *endptr is not accessed.” This is true for glibc, but not necessarily true for every conforming strtol implementation. “It might seem that it’s a problem that the callee doesn’t know if nptr can alias errno or not. However, the callee will not write to the latter directly until it knows it has failed,” Again this is true for glibc, but not necessarily true for every conforming strtol implementation. To my mind this is the most serious objection. The current standard prohibits calls like strtol((char *) &errno, 0, 0). The proposal would relax the standard to allow such calls. In other words, the proposal would constrain implementations to support such calls. Why is this change worth making? Real-world programs do not make calls like that. “But nothing prohibits those internal helper functions to specify that nptr is restrict and thus distinct from errno.” Although true, it’s also the case that the C standard does not *require* internal helper functions to use ‘restrict’. All that matters is the accesses. So I’m not sure what the point of this statement is. “m = strtol(p, &p, 0); An analyzer more powerful than the current ones could extend the current -Wrestrict diagnostic to also diagnose this case.” Why would an analyzer want to do that? This case is a perfectly normal thing to do and it has well-defined behavior. 
“To prevent triggering diagnostics in a powerful analyzer that would be smart enough to diagnose the example function g(), the prototype of strtol(3) should be changed to ‘long int strtol(const char *nptr, char **restrict endptr, int base);’” Sorry, but the case has not been made to make any such change to strtol’s prototype. On the contrary, what I’m mostly gathering from the discussion is that ‘restrict’ can be confusing, which is not news. n3220 §6.7.4.2 examples 5 through 7 demonstrate that the C committee has thought through the points you’re making. (These examples were not present in C11.) This may help to explain why the standard specifies strtol with ‘restrict’ on both arguments.
Hi Martin,

On Sun, Jul 07, 2024 at 09:15:23AM GMT, Martin Uecker wrote:
>
> Hi Alejandro,
>
> if in caller it is known that endptr has access mode "write_only"
> then it can conclude that the content of *endptr has access mode
> "none", couldn't it?

Hmmmm. I think you're correct. I'll incorporate that and see how it
affects the caller.

At first glance, I think it would result in

    nptr      access(read_only)    alias *endptr
    endptr    access(write_only)   unique
    errno     access(read_write)   unique
    *endptr   access(none)         alias nptr

Which is actually having perfect information, regardless of 'restrict'
on nptr. :-)

> You also need to discuss backwards compatibility. Changing
> the type of those functions can break valid programs.

I might be forgetting about other possibilities, but the only one I had
in mind that could break API would be function pointers. However, a
small experiment seems to say it doesn't:

    $ cat strtolp.c
    #include <stdlib.h>

    long
    alx_strtol(const char *nptr, char **restrict endptr, int base)
    {
        return strtol(nptr, endptr, base);
    }

    typedef long (*strtolp_t)(const char *restrict nptr,
                              char **restrict endptr, int base);
    typedef long (*strtolpnr_t)(const char *nptr,
                                char **restrict endptr, int base);

    int
    main(void)
    {
        [[maybe_unused]] strtolp_t a = &strtol;
        [[maybe_unused]] strtolpnr_t b = &strtol;
        [[maybe_unused]] strtolp_t c = &alx_strtol;
        [[maybe_unused]] strtolpnr_t d = &alx_strtol;
    }

    $ cc -Wall -Wextra strtolp.c
    $

Anyway, I'll say that it doesn't seem to break API.

> You would
> need to make a case that this is unlikely to affect any real
> world program.

If you have something else in mind that could break API, please let me
know, and I'll add it to the experiments.

Thanks!

Have a lovely day!
Alex
Am Sonntag, dem 07.07.2024 um 13:07 +0200 schrieb Alejandro Colomar via Gcc: > Hi Martin, > > On Sun, Jul 07, 2024 at 09:15:23AM GMT, Martin Uecker wrote: > > > > Hi Alejandro, > > > > if in caller it is known that endptr has access mode "write_only" > > then it can conclude that the content of *endptr has access mode > > "none", couldn't it? > > Hmmmm. I think you're correct. I'll incorporate that and see how it > affects the caller. > > At first glance, I think it would result in > > nptr access(read_only) alias *endptr > endptr access(write_only) unique > errno access(read_write) unique > *endptr access(none) alias nptr > > Which is actually having perfect information, regardless of 'restrict' > on nptr. :-) Yes, but my point is that even with "restrict" a smarter compiler could then also be smart enough not to warn even when *endptr aliases nptr. > > > You also need to discuss backwards compatibility. Changing > > the type of those functions can break valid programs. > > I might be forgetting about other possibilities, but the only one I had > in mind that could break API would be function pointers. However, a > small experiment seems to say it doesn't: Right, the outermost qualifiers are ignored, so this is not a compatibility problem. So I think this is not an issue, but it is worth pointing it out. Martin > > $ cat strtolp.c > #include <stdlib.h> > > long > alx_strtol(const char *nptr, char **restrict endptr, int base) > { > return strtol(nptr, endptr, base); > } > > typedef long (*strtolp_t)(const char *restrict nptr, > char **restrict endptr, int base); > typedef long (*strtolpnr_t)(const char *nptr, > char **restrict endptr, int base); > > int > main(void) > { > [[maybe_unused]] strtolp_t a = &strtol; > [[maybe_unused]] strtolpnr_t b = &strtol; > [[maybe_unused]] strtolp_t c = &alx_strtol; > [[maybe_unused]] strtolpnr_t d = &alx_strtol; > } > > $ cc -Wall -Wextra strtolp.c > $ > > Anyway, I'll say that it doesn't seem to break API. 
> > > You would > > need to make a case that this is unlikely to affect any real > > world program. > > If you have something else in mind that could break API, please let me > know, and I'll add it to the experiments. > > Thanks! > > Have a lovely day! > Alex >
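The point that outermost parameter qualifiers are ignored can also be checked at compile time. The sketch below is an assumption-laden illustration rather than part of the experiment above: `alx_parse`, `with_restrict_t`, `without_restrict_t`, and `restrict_is_ignored` are hypothetical names, and the code relies on C11 `_Generic` and `_Static_assert`.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical function declared with 'restrict' on nptr ... */
long alx_parse(const char *restrict nptr, char **restrict endptr, int base);

/* ... and defined without it.  The two declarations are compatible,
 * because top-level parameter qualifiers do not take part in
 * function type compatibility. */
long
alx_parse(const char *nptr, char **restrict endptr, int base)
{
    return strtol(nptr, endptr, base);
}

typedef long (*with_restrict_t)(const char *restrict, char **restrict, int);
typedef long (*without_restrict_t)(const char *, char **restrict, int);

/* Selects the first association only if the two pointer types are compatible. */
_Static_assert(_Generic((with_restrict_t) 0,
                        without_restrict_t: 1,
                        default: 0),
               "restrict on a parameter changes the function type");

/* Run-time view of the same check, for use in tests. */
int
restrict_is_ignored(void)
{
    return _Generic((with_restrict_t) 0, without_restrict_t: 1, default: 0);
}
```

If the translation unit compiles, a function pointer of either type can hold the address of either variant, which is what the strtolp.c experiment showed by other means.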
Hi Paul, On Sun, Jul 07, 2024 at 12:42:51PM GMT, Paul Eggert wrote: > On 7/7/24 03:58, Alejandro Colomar wrote: > > > I've incorporated feedback, and here's a new revision, let's call it > > v0.2, of the draft for a WG14 paper. > Although I have not followed the email discussion closely, I read v0.2 and > think that as stated there is little chance that its proposal will be > accepted. Thanks for reading thoroughly, and the feedback! > Fundamentally the proposal is trying to say that there are two styles X and > Y for declaring strtol and similar functions, and that although both styles > are correct in some sense, style Y is better than style X. However, the > advantages of Y are not clearly stated and the advantages of style X over Y > are not admitted, so the proposal is not making the case clearly and fairly. > > One other thing: to maximize the chance of a proposal being accepted, please > tailor it for its expected readership. The C committee is expert on > ‘restrict’, so don’t try to define ‘restrict’ in your own way. Unless merely > repeating the language of the standard, any definition given for ‘restrict’ > is likely to cause the committee to quibble with the restatement of the > standard wording. (It is OK to mention some corollaries of the standard > definition, so long as the corollaries are not immediately obvious.) > > Here are some comments about the proposal. At the start these comments are > detailed; towards the end, as I could see the direction the proposal was > headed and was convinced it wouldn’t be accepted as stated, the comments are > less detailed. > > > "The API may copy" > > One normally doesn’t think of the application programming interface as > copying. Please replace the phrase “the API” with “the caller” or “the > callee” as appropriate. (Although ‘restrict’ can be used in places other > than function parameters, I don’t think the proposal is concerned about > those cases and so it doesn’t need to go into that.) Ok. 
> "To avoid violations of for example C11::6.5.16.1p3," > > Code that violates C11::6.5.16.1p3 will do so regardless of whether > ‘restrict’ is present. I would not mention C11::6.5.16.1p3 as it’s a red > herring. Fundamentally, ‘restrict’ is not about the consequences of caching > when one does overlapping moves; it’s about caching in a more general sense. The violations are UB regardless of restrict, but consistent use of restrict allows the caller to have a rough model of what the callee will do with the objects, and prevent those violations via compiler diagnostics. I've reworded that part to make it more clear why I'm mentioning that. > “As long as an object is only accessed via one restricted pointer, other > restricted pointers are allowed to point to the same object.” > > “only accessed” → “accessed only” Ok. > “This is less strict than I think it should be, but this proposal doesn’t > attempt to change that definition.” > > I would omit this sentence and all similar sentences. Don’t distract the > reader with other potential proposals. The proposal as it stands is > complicated enough. Ok. > “return ca > a;” > “return ca > *ap;” > > I fail to understand why these examples are present. It’s not simply that > nobody writes code like that: the examples are not on point. I would remove > the entire programs containing them, along with the sections that discuss > them. When writing to the C committee one can assume the reader is expert in > ‘restrict’, there is no need for examples such as these. Those are examples of how consistent use of restrict can --or could, in the case of g()-- detect, via compiler diagnostics, (likely) violations of seemingly unrelated parts of the standard, such as the referenced C11::6.5.16.1p3, or in this case, C11::6.5.8p5. > “strtol(3) accepts 4 objects via pointer parameters and global variables.” > > Omit the “(3)”, here and elsewhere, as the audience is the C standard > committee. 
The C standard committee doesn't know about historic use of (3)? That predates the standard, and they built on top of that (C originated in Unix). While they probably don't care about it anymore, I expect my paper to be read by other audience, including GCC and glibc, and I prefer to keep it readable for that audience. I expect the standard committee to at least have a rough idea of the existence of this syntax, and respect it, even if they don't use it or like it. > “accepts” is a strange word to use here: normally one says “accepts” to talk > about parameters, not global variables. The thing is, strtol(3) does not actually access *endptr. I thought that might cause more confusion than using "accepts". > Also, “global variables” is not > right here. The C standard allows strtol, for example, to read and write an > internal static cache. (Yes, that would be weird, but it’s allowed.) That's not part of the API. A user must not access internal static cache, and so the implementation is free to assume that it doesn't, regardless of the use of restrict in the API, so it is not relevant for the purpose of this discussion, I think. > I > suggest rephrasing this sentence to talk about accessing, not accepting. I don't want to use accessing, for it would be inconsistent with later saying that *endptr is not accessed. However, I'm open to other suggested terms that might be more appropriate than both. > “endptr access(write_only) ... *endptr access(none)” > > This is true for glibc, but it’s not necessarily true for all conforming > strtol implementations. If endptr is non-null, a conforming strtol > implementation can both read and write *endptr; It can't, I think. It's perfectly valid to pass an uninitialized endptr, which means the callee must not read the original value. char *end; strtol("0", &end, 0); If strtol(3) would be allowed to read it, the user would need to initialize it. > it can also both read and > write **endptr. 
(Although it would need to write before reading, reading is > allowed.) Here, we need to consider two separate objects. The object pointed-to by *endptr _before_ the object pointed to by endptr is written to, and the object pointed-to by *endptr _after_ the object pointed to by endptr is written to. For the former (the original *endptr): Since *endptr might be uninitialized, strtol(3) must NOT access the object pointed to by an uninitialized pointer. For the latter (the final *endptr): The callee cannot write to it, since the specification of the function is that the string will not be modified. And in any case, such an access is ultimately derived from nptr, not from *endptr, so it does not matter for the discussion of *endptr. Of course, that's derived from the specification of the function, and not from its prototype, since ISO C doesn't provide such detailed prototypes (since it doesn't have the [[gnu::access()]] attribute). But the standard must abide by its own specification of functions, anyway. > “This qualifier helps catch obvious bugs such as strtol(p, p, 0) and > strtol(&p, &p, 0) .” > > No it doesn’t. Ordinary type checking catches those obvious bugs, and > ‘restrict’ provides no additional help there. Please complicate the examples > to make the point more clearly. To be pedantic, I didn't specify the type of p, so it might be (void *), and thus avoid type checking at all. However, to avoid second guessing from the standards committee, I'll add casts, to make it more obvious that restrict is catching those. > “The caller knows that errno doesn’t alias any of the function arguments.” > > Only because all args are declared with ‘restrict’. So if the proposal is > accepted, the caller doesn’t necessarily know that. Not really. The caller has created the string (or has received it via a restricted pointer), and so it knows it's not derived from errno. 
char buf[LINE_MAX + 1]; fgets(...); n = strtol(buf, ...); This caller knows with certainty that errno does not alias buf. Of course, in some complex cases, it might not know, but I omitted that for simplicity. And in any case, I don't think any optimizations are affected by that in the caller. > > > “The callee knows that *endptr is not accessed.” > > This is true for glibc, but not necessarily true for every conforming strtol > implementation. The original *endptr may be uninitialized, and so must not be accessed. > “It might seem that it’s a problem that the callee doesn’t know if nptr can > alias errno or not. However, the callee will not write to the latter > directly until it knows it has failed,” > > Again this is true for glibc, but not necessarily true for every conforming > strtol implementation. An implementation is free to set errno = EDEADLK in the middle of it, as long as it later removes that. However, I don't see how it would make any sense. If that's done, it's probably done via a helper internal function, which as said below, can use restrict for nptr, and thus know with certainty that nptr is distinct from errno. If that's done directly in the body of strtol(3) (the only place where it's not known that nptr is distinct from errno) we can probably agree that the implementation is doing that just for fun, and doesn't care about optimization, and thus we can safely ignore it. > To my mind this is the most serious objection. The current standard > prohibits calls like strtol((char *) &errno, 0, 0). The proposal would relax > the standard to allow such calls. In other words, the proposal would > constrain implementations to support such calls. I don't think it does. ISO C specifies that strtol(3) takes a string as its first parameter, and errno is not (unless you do this:). (char *)&errno = "111"; Okay, let's assume you're allowed to do that, since a char* can alias anything.
I still don't think ISO C constrains implementations to allow passing (char *)&errno as a char*, just because it's not restrict. Let's find an ISO C function that accepts a non-restrict string: int system(const char *string); Does ISO C constrain implementations to support system((char *)&errno)? I don't think so. Maybe it does implicitly because of a defect in the wording, but even then it's widely understood that it doesn't. > Why is this change worth > making? Real-world programs do not make calls like that. Because it makes analysis of 'restrict' more consistent. The obvious improvement of GCC's analyzer to catch restrict violations will trigger false positives in normal uses of strtol(3). > “But nothing prohibits those internal helper functions to specify that nptr > is restrict and thus distinct from errno.” > > Although true, it’s also the case that the C standard does not *require* > internal helper functions to use ‘restrict’. All that matters is the > accesses. So I’m not sure what the point of this statement is. If an implementation wants to optimize, it should be careful and use restrict. If it doesn't, then it can't complain that ISO C doesn't allow it to. It's actually allowed to optimize, but it has to do some work for it. > “m = strtol(p, &p, 0); An analyzer more powerful than the current ones > could extend the current -Wrestrict diagnostic to also diagnose this case.” > > Why would an analyzer want to do that? This case is a perfectly normal thing > to do and it has well-defined behavior. Because without an analyzer, restrict cannot emit many useful diagnostics. It's a qualifier that's all about data flow analysis, and normal diagnostics aren't able to do that. A qualifier that enables optimizations but doesn't enable diagnostics is quite dangerous, and probably better not used. If however, the analyzer emits advanced diagnostics for misuses of it, then it's a good qualifier. Have a lovely day! 
Alex > > “To prevent triggering diagnostics in a powerful analyzer that would be > smart enough to diagnose the example function g(), the prototype of > strtol(3) should be changed to ‘long int strtol(const char *nptr, char > **restrict endptr, int base);’” > > Sorry, but the case has not been made to make any such change to strtol’s > prototype. On the contrary, what I’m mostly gathering from the discussion is > that ‘restrict’ can be confusing, which is not news. > > n3220 §6.7.4.2 examples 5 through 7 demonstrate that the C committee has > thought through the points you’re making. (These examples were not present > in C11.) This may help to explain why the standard specifies strtol with > ‘restrict’ on both arguments. >
Hi Martin, On Sun, Jul 07, 2024 at 02:21:17PM GMT, Martin Uecker wrote: > Am Sonntag, dem 07.07.2024 um 13:07 +0200 schrieb Alejandro Colomar via Gcc: > > Which is actually having perfect information, regardless of 'restrict' > > on nptr. :-) > > Yes, but my point is that even with "restrict" a smarter > compiler could then also be smart enough not to warn even > when *endptr aliases nptr. Hmmm, this is a valid argument. I feel less strongly about this proposal now. I'll document this in the proposal. Your analyzer would need to be more complex to be able to not trigger false positives here, but it's possible, so I guess I'm happy with either case. Still, removing restrict from strtol(3) would allow to change the semantics of restrict to be more restrictive (and easier to understand), so that passing aliasing pointers as restrict pointers would already be Undefined Behavior, regardless of the accesses by the callee. But yeah, either way it's good, as far as strtol(3) and gcc-20 are concerned. :) Have a lovely day! Alex > > > You also need to discuss backwards compatibility. Changing > > > the type of those functions can break valid programs. > > > > I might be forgetting about other possibilities, but the only one I had > > in mind that could break API would be function pointers. However, a > > small experiment seems to say it doesn't: > > Right, the outermost qualifiers are ignored, so this is not a > compatibility problem. So I think this is not an issue, but > it is worth pointing it out. Yup. > > Martin
Alex,
Your document number is below:
n3294 - strtol(3) et al. shouldn't have a restricted first parameter
Please return the updated document with this number
Best regards,
Dan
Technical Director - Enabling Mission Capability at Scale
Principal Member of the Technical Staff
Software Engineering Institute
Carnegie Mellon University
4500 Fifth Avenue
Pittsburgh, PA 15213
-----Original Message-----
From: Alejandro Colomar <alx@kernel.org>
Sent: Friday, July 5, 2024 3:42 PM
To: dplakosh@cert.org
Cc: Martin Uecker <muecker@gwdg.de>; Jonathan Wakely <jwakely.gcc@gmail.com>; Xi Ruoyao <xry111@xry111.site>; Jakub Jelinek <jakub@redhat.com>; libc-alpha@sourceware.org; gcc@gcc.gnu.org; Paul Eggert <eggert@cs.ucla.edu>; linux-man@vger.kernel.org; LIU Hao <lh_mouse@126.com>; Richard Earnshaw <Richard.Earnshaw@arm.com>; Sam James <sam@gentoo.org>
Subject: [WG14] Request for document number; strtol restrictness
Hi,
I have a paper for removing restrict from the first parameter of
strtol(3) et al. The title is
strtol(3) et al. shouldn't have a restricted first parameter.
If it helps, I already have a draft of the paper, which I attach (both the PDF, and the man(7) source).
Cheers,
Alex
--
<https://www.alejandro-colomar.es/>
On 7/7/24 14:42, Alejandro Colomar wrote: > On Sun, Jul 07, 2024 at 12:42:51PM GMT, Paul Eggert wrote: >> Also, “global variables” is not >> right here. The C standard allows strtol, for example, to read and write an >> internal static cache. (Yes, that would be weird, but it’s allowed.) > > That's not part of the API. A user must not access internal static > cache Although true in the normal (sane) case, as an extension the implementation can make such a static cache visible to the user, and in this case the caller must not pass cache addresses as arguments to strtol. For other functions this point is not purely academic. For example, the C standard specifies the signature "FILE *fopen(const char *restrict, const char *restrict);". If I understand your argument correctly, it says that the "restrict"s can be omitted there without changing the set of valid programs. But that can't be right, as omitting the "restrict"s would make the following code be valid in any platform where sizeof(int)>1: char *p = (char *) &errno; p[0] = 'r'; p[1] = 0; FILE *f = fopen (p, p); even though the current standard says this code is invalid. >> “endptr access(write_only) ... *endptr access(none)” >> >> This is true for glibc, but it’s not necessarily true for all conforming >> strtol implementations. If endptr is non-null, a conforming strtol >> implementation can both read and write *endptr; > > It can't, I think. It's perfectly valid to pass an uninitialized > endptr, which means the callee must not read the original value. Sure, but the callee can do something silly like "*endptr = p + 1; *endptr = *endptr - 1;". That is, it can read *endptr after writing it, without any undefined behavior. (And if the callee is written in assembly language it can read *endptr even before writing it - but I digress.) The point is that it is not correct to say that *endptr cannot be read from; it can. Similarly for **endptr. > Here, we need to consider two separate objects. 
The object pointed-to > by *endptr _before_ the object pointed to by endptr is written to, and > the object pointed-to by *endptr _after_ the object pointed to by endptr > is written to. Those are not the only possibilities. The C standard also permits strtol to set *endptr to some other pointer value, not pointing anywhere into the string being scanned, so long as it sets *endptr correctly before it returns. >> “The caller knows that errno doesn’t alias any of the function arguments.” >> >> Only because all args are declared with ‘restrict’. So if the proposal is >> accepted, the caller doesn’t necessarily know that. > > Not really. The caller has created the string (or has received it via a > restricted pointer) v0.2 doesn't state the assumption that the caller either created the string or received it via a restricted pointer. If this assumption were stated clearly, that would address the objection here. >> “The callee knows that *endptr is not accessed.” >> >> This is true for glibc, but not necessarily true for every conforming strtol >> implementation. > > The original *endptr may be uninitialized, and so must not be accessed. **endptr can be read once the callee sets *endptr. **endptr can even be written, if the callee temporarily sets *endptr to point to a writable buffer; admittedly this would be weird but it's allowed. >> “It might seem that it’s a problem that the callee doesn’t know if nptr can >> alias errno or not. However, the callee will not write to the latter >> directly until it knows it has failed,” >> >> Again this is true for glibc, but not necessarily true for every conforming >> strtol implementation. > > An implementation is free to set errno = EDEADLK in the middle of it, as > long as it later removes that. However, I don't see how it would make > any sense. It could make sense in some cases. 
Here the spec is a bit tricky, but an implementation is allowed to set errno = EINVAL first thing, and then set errno to some other nonzero value if it determines that the arguments are valid. I wouldn't implement strtol that way, but I can see where someone else might do that. > Let's find > an ISO C function that accepts a non-restrict string: > > int system(const char *string); > > Does ISO C constrain implementations to support system((char *)&errno)? > I don't think so. Maybe it does implicitly because of a defect in the > wording, but even then it's widely understood that it doesn't. 'system' is a special case since the C standard says 'system' can do pretty much anything it likes. That being said, I agree that implementations shouldn't need to support calls like atol((char *) &errno). Certainly the C standard's description of atol, which defines atol's behavior in terms of a call to strtol, means that atol's argument in practice must follow the 'restrict' rules. Perhaps we should report this sort of thing as a defect in the standard. It is odd, for example, that fopen's two arguments are both const char *restrict, but system's argument lacks the "restrict". >> Why is this change worth >> making? Real-world programs do not make calls like that. > > Because it makes analysis of 'restrict' more consistent. The obvious > improvement of GCC's analyzer to catch restrict violations will trigger > false positives in normal uses of strtol(3). v0.2 does not support this line of reasoning. On the contrary, v0.2 suggests that a compiler should diagnose calls like "strtol(p, &p, 0)", which would be wrong as that call is perfectly reasonable. Another way to put it: v0.2 does not clearly state the advantages of the proposed change, and in at least one area what it states as an advantage would actually be a disadvantage. 
>> “m = strtol(p, &p, 0); An analyzer more powerful than the current ones >> could extend the current -Wrestrict diagnostic to also diagnose this case.” >> >> Why would an analyzer want to do that? This case is a perfectly normal thing >> to do and it has well-defined behavior. > > Because without an analyzer, restrict cannot emit many useful > diagnostics. It's a qualifier that's all about data flow analysis, and > normal diagnostics aren't able to do that. > > A qualifier that enables optimizations but doesn't enable diagnostics is > quite dangerous, and probably better not used. If however, the analyzer > emits advanced diagnostics for misuses of it, then it's a good > qualifier. Sorry, but I don't understand what you're trying to say here. Really, I can't make heads or tails of it. As-is, 'restrict' can be useful both for optimization and for generating diagnostics, and GCC does both of these things right now even if you don't use -fanalyzer. Perhaps adding an example or two would help explain your point. But they'd need to be better examples than what's in v0.2 because v0.2 is unclear about this quality-of-diagnostics issue, as it relates to strtol.
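Paul's remark that a callee may read *endptr after writing it can be made concrete with a small sketch. toy_strtol() below is a hypothetical stand-in, not glibc's strtol: it parses decimal digits only and ignores whitespace, signs, overflow, and errno. It uses the same "write, then read back" pattern as the "*endptr = p + 1; *endptr = *endptr - 1;" example, and it never reads the value *endptr had on entry.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical toy parser: decimal digits only, no error handling. */
long toy_strtol(const char *restrict nptr, char **restrict endptr)
{
    const char *s = nptr;
    long v = 0;

    assert(nptr != NULL);

    while (isdigit((unsigned char)*s))
        v = v * 10 + (*s++ - '0');

    if (endptr != NULL) {
        *endptr = (char *)s + 1;  /* first write; old value is never read */
        *endptr = *endptr - 1;    /* reads back what was just written */
    }
    return v;
}
```

Called as toy_strtol(p, &p) with p pointing at "10 20", this returns 10 and leaves p pointing at the space. The object p is accessed only through endptr, and the string only through nptr, which is the reading of 'restrict' that Xi, Jakub, and Paul give in this thread.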
Hi Paul, On Sun, Jul 07, 2024 at 07:30:43PM GMT, Paul Eggert wrote: > On 7/7/24 14:42, Alejandro Colomar wrote: > > On Sun, Jul 07, 2024 at 12:42:51PM GMT, Paul Eggert wrote: > > > Also, “global variables” is not > > > right here. The C standard allows strtol, for example, to read and write an > > > internal static cache. (Yes, that would be weird, but it’s allowed.) > > > > That's not part of the API. A user must not access internal static > > cache > > Although true in the normal (sane) case, as an extension the implementation > can make such a static cache visible to the user, and in this case the > caller must not pass cache addresses as arguments to strtol. > > For other functions this point is not purely academic. For example, the C > standard specifies the signature "FILE *fopen(const char *restrict, const > char *restrict);". If I understand your argument correctly, it says that the > "restrict"s can be omitted there without changing the set of valid programs. No, I didn't say that restrict can be removed from fopen(3). What I'm saying is that in functions that accept pointers that alias each other, those aliasing pointers should not be restrict. Usually, pointers that alias are accessed, and thus they are not specified as restrict, such as in memmove(3). However, a small set of functions accept pointers that alias each other, but one of them is never accessed; in those few cases, restrict was added to the parameters in ISO C, but I claim it would be better removed. We're lucky, and the small set of functions where this happens don't seem to use any state, so we don't need to care about implementations using internal buffers that are passed somehow to the user. From ISO C, IIRC, the only examples are strtol(3) et al. Another example is Plan9's seprint(3) family of functions. However, Plan9 doesn't use restrict, so it doesn't have it.
> But that can't be right, as omitting the "restrict"s would make the > following code be valid in any platform where sizeof(int)>1: > > char *p = (char *) &errno; > p[0] = 'r'; > p[1] = 0; > FILE *f = fopen (p, p); > > even though the current standard says this code is invalid. No, I wouldn't remove any of the restrict qualifiers in fopen(3). Only from pointers that alias an access(none) pointer. > > > “endptr access(write_only) ... *endptr access(none)” > > > > > > This is true for glibc, but it’s not necessarily true for all conforming > > > strtol implementations. If endptr is non-null, a conforming strtol > > > implementation can both read and write *endptr; > > > > It can't, I think. It's perfectly valid to pass an uninitialized > > endptr, which means the callee must not read the original value. > > Sure, but the callee can do something silly like "*endptr = p + 1; *endptr = > *endptr - 1;". That is, it can read *endptr after writing it, without any > undefined behavior. (And if the callee is written in assembly language it > can read *endptr even before writing it - but I digress.) But once you modify the pointer provenance, you don't care anymore about it. We need to consider the pointers that a function receives, which are the ones the callee needs to know their provenance. Of course, a callee knows what it does, and so doesn't need restrict in local variables. C23/N3220::6.7.4.1p9 says: An object that is accessed through a restrict-qualified pointer has a special association with that pointer. This association, defined in 6.7.4.2, requires that all accesses to that object use, directly or indirectly, the value of that pointer. When you set *endptr = nptr + x, and use the lvalue **endptr, you're still accessing the object indirectly using the value of nptr. So, strtol(3) gets 4 objects, let's call them A, B, C, and D. A is gotten via its pointer nptr. B is gotten via its pointer endptr. C is gotten via its pointer *endptr. 
D is gotten via the global variable errno. Object A may be the same as object C. Object B is unique inside the callee. Its pointer endptr must be restrict-qualified to denote its uniqueness. Object D is unique, but there's no way to specify that. Object C must NOT be read or written. The function is of course allowed to set *endptr to whatever it likes, and then access it however it likes, but object C must still NOT be accessed, since its pointer may be uninitialized, and thus point to no object at all. Maybe I should use abstract names for the objects, to avoid confusing them with the pointer variables that are used to pass them? The formal definition of restrict refers to the "object into which it formerly [in the list of parameter declarations of a function definition] pointed". I'm not 100% certain, because this formal definition is quite unreadable, though. The more I read it, the less sure I am about it. BTW, I noticed something I didn't know: If L is used to access the value of the object X that it designates, and X is also modified (by any means), then the following requirements apply: T shall not be const-qualified This reads to me as "const variables are not writable when they are accessed via a restricted pointer; casting away is not enough". Am I reading this correctly? > The point is that it is not correct to say that *endptr cannot be read from; > it can. Similarly for **endptr. Some better wording: the object pointed-to by *endptr at function entry cannot be accessed. > > Here, we need to consider two separate objects. The object pointed-to > > by *endptr _before_ the object pointed to by endptr is written to, and > > the object pointed-to by *endptr _after_ the object pointed to by endptr > > is written to. > > Those are not the only possibilities. The C standard also permits strtol to > set *endptr to some other pointer value, not pointing anywhere into the > string being scanned, so long as it sets *endptr correctly before it > returns. 
Let's reword. The initial object pointed-to by it, and everything else. > > > “The caller knows that errno doesn’t alias any of the function arguments.” > > > > > > Only because all args are declared with ‘restrict’. So if the proposal is > > > accepted, the caller doesn’t necessarily know that. > > > > Not really. The caller has created the string (or has received it via a > > restricted pointer) > > v0.2 doesn't state the assumption that the caller either created the string > or received it via a restricted pointer. If this assumption were stated > clearly, that would address the objection here. Ok. > > > “The callee knows that *endptr is not accessed.” > > > > > > This is true for glibc, but not necessarily true for every conforming strtol > > > implementation. > > > > The original *endptr may be uninitialized, and so must not be accessed. > > **endptr can be read once the callee sets *endptr. **endptr can even be > written, if the callee temporarily sets *endptr to point to a writable > buffer; admittedly this would be weird but it's allowed. The object originally pointed-to by *endptr (C) is what we care about. Subsequently reusing the same pointer variable for pointing to different objects is uninteresting for the purposes of knowing which objects are accessed and in which way. > > > “It might seem that it’s a problem that the callee doesn’t know if nptr can > > > alias errno or not. However, the callee will not write to the latter > > > directly until it knows it has failed,” > > > > > > Again this is true for glibc, but not necessarily true for every conforming > > > strtol implementation. > > > > An implementation is free to set errno = EDEADLK in the middle of it, as > > long as it later removes that. However, I don't see how it would make > > any sense. > > It could make sense in some cases. 
Here the spec is a bit tricky, but an > implementation is allowed to set errno = EINVAL first thing, and then set > errno to some other nonzero value if it determines that the arguments are > valid. I wouldn't implement strtol that way, but I can see where someone > else might do that. In any case an implementation is not obliged to pessimize strtol(3). It is only allowed to. Should we not allow them to do so? > > Let's find > > an ISO C function that accepts a non-restrict string: > > > > int system(const char *string); > > > > Does ISO C constrain implementations to support system((char *)&errno)? > > I don't think so. Maybe it does implicitly because of a defect in the > > wording, but even then it's widely understood that it doesn't. > > 'system' is a special case since the C standard says 'system' can do pretty > much anything it likes. That being said, I agree that implementations > shouldn't need to support calls like atol((char *) &errno). Certainly the C > standard's description of atol, which defines atol's behavior in terms of a > call to strtol, means that atol's argument in practice must follow the > 'restrict' rules. Let's take a simpler one: rename(2). Is it allowed to receive &errno? Hopefully not. > > Perhaps we should report this sort of thing as a defect in the standard. It > is odd, for example, that fopen's two arguments are both const char > *restrict, but system's argument lacks the "restrict". > Meh, I don't care enough, I think. > > > Why is this change worth > > > making? Real-world programs do not make calls like that. > > > > Because it makes analysis of 'restrict' more consistent. The obvious > > improvement of GCC's analyzer to catch restrict violations will trigger > > false positives in normal uses of strtol(3). > > v0.2 does not support this line of reasoning. On the contrary, v0.2 suggests > that a compiler should diagnose calls like "strtol(p, &p, 0)", which would > be wrong as that call is perfectly reasonable. 
That call is perfectly reasonable, which is why I suggest that the standard should modify the prototype, so that a compiler that diagnoses aliasing restrict arguments would not warn about strtol(p, &p, 0). That is, just by reading the prototypes: void foo(int *restrict x, int **restrict p); and void bar(int *x, int **restrict endp); one should be able to determine that foo(p, &p); is probably causing UB (and thus trigger a warning) but bar(p, &p); is fine. > > Another way to put it: v0.2 does not clearly state the advantages of the > proposed change, and in at least one area what it states as an advantage > would actually be a disadvantage. The advantage is having more information in the caller. As a caller, I want to distinguish calls where it's ok to pass pointers that alias, and where not. And I want my compiler to be able to help me there. If restrict is overapplied, then an analyzer cannot determine that. Or as Martin noted, it can, if it takes both the restrict qualifiers _and_ the access attributes into account, and performs some non-trivial deduction. I'd rather have a simple analyzer, which will produce fewer false positives and negatives. > > > “m = strtol(p, &p, 0); An analyzer more powerful than the current ones > > > could extend the current -Wrestrict diagnostic to also diagnose this case.” > > > > > > Why would an analyzer want to do that? This case is a perfectly normal thing > > > to do and it has well-defined behavior. > > > > Because without an analyzer, restrict cannot emit many useful > > diagnostics. It's a qualifier that's all about data flow analysis, and > > normal diagnostics aren't able to do that. > > > > A qualifier that enables optimizations but doesn't enable diagnostics is > > quite dangerous, and probably better not used. If however, the analyzer > > emits advanced diagnostics for misuses of it, then it's a good > > qualifier. > > Sorry, but I don't understand what you're trying to say here.
Really, I > can't make heads or tails of it. As-is, 'restrict' can be useful both for > optimization and for generating diagnostics, and GCC does both of these > things right now even if you don't use -fanalyzer. GCC can only catch the most obvious violations of restrict. #include <string.h> typedef struct { int x; } T; [[gnu::access(read_only, 1)]] [[gnu::access(read_only, 2)]] void replace(T *restrict *restrict ls, const T *restrict new, size_t pos) { memcpy(ls[pos], new, sizeof(T)); } void f(T *restrict *restrict ls) { replace(ls, ls[0], 1); } $ gcc-14 -Wall -Wextra -fanalyzer replace.c -S $ The above program causes UB, and uses an API that is so similar to strtol(3), that for writing an analyzer that triggers on this code, it will trigger on strtol(3) too. Only if it's smart enough to also consider the GNU access attributes, it will be able to differentiate the two cases, as Martin suggested. > Perhaps adding an example or two would help explain your point. But they'd > need to be better examples than what's in v0.2 because v0.2 is unclear about > this quality-of-diagnostics issue, as it relates to strtol. Maybe the above? Cheers, Alex
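The foo()/bar() distinction drawn in this message can be sketched with a body for bar(). Only the prototype comes from the message; the body and the caller sum_two() are assumptions made for illustration. The first parameter is deliberately not restrict, which is how the prototype would signal that passing p together with &p is intended.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical body for the bar() prototype quoted above:
 * reads through x, stores through endp, and never reads the
 * value *endp had on entry. */
int bar(int *x, int **restrict endp)
{
    int v;

    assert(x != NULL && endp != NULL);
    v = *x;         /* access through x */
    *endp = x + 1;  /* write-only access through endp */
    return v;
}

/* Caller in the style of "a = strtol(p, &p, 0); b = strtol(p, &p, 0);".
 * p must point at two or more ints. */
int sum_two(int *p)
{
    int a = bar(p, &p);  /* p aliases *endp by design */
    int b = bar(p, &p);

    return a + b;
}
```

With int v[] = {10, 20};, sum_two(v) walks both elements. Under the convention proposed in the message, a reader of the prototypes alone would treat foo(p, &p) as suspicious and bar(p, &p) as fine, without needing to see either function body.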
On Sun, 2024-07-07 at 14:42 +0200, Alejandro Colomar wrote: > Hi Paul, > > On Sun, Jul 07, 2024 at 12:42:51PM GMT, Paul Eggert wrote: > > On 7/7/24 03:58, Alejandro Colomar wrote: > > > > > I've incorporated feedback, and here's a new revision, let's call > > > it > > > v0.2, of the draft for a WG14 paper. > > Although I have not followed the email discussion closely, I read > > v0.2 and > > think that as stated there is little chance that its proposal will > > be > > accepted. > > Thanks for reading thoroughly, and the feedback! > > > Fundamentally the proposal is trying to say that there are two > > styles X and > > Y for declaring strtol and similar functions, and that although > > both styles > > are correct in some sense, style Y is better than style X. However, > > the > > advantages of Y are not clearly stated and the advantages of style > > X over Y > > are not admitted, so the proposal is not making the case clearly > > and fairly. > > > > One other thing: to maximize the chance of a proposal being > > accepted, please > > tailor it for its expected readership. The C committee is expert on > > ‘restrict’, so don’t try to define ‘restrict’ in your own way. > > Unless merely > > repeating the language of the standard, any definition given for > > ‘restrict’ > > is likely to cause the committee to quibble with the restatement of > > the > > standard wording. (It is OK to mention some corollaries of the > > standard > > definition, so long as the corollaries are not immediately > > obvious.) > > > > Here are some comments about the proposal. At the start these > > comments are > > detailed; towards the end, as I could see the direction the > > proposal was > > headed and was convinced it wouldn’t be accepted as stated, the > > comments are > > less detailed. > > > > > > "The API may copy" > > > > One normally doesn’t think of the application programming interface > > as > > copying. 
Please replace the phrase “the API” with “the caller” or > > “the > > callee” as appropriate. (Although ‘restrict’ can be used in places > > other > > than function parameters, I don’t think the proposal is concerned > > about > > those cases and so it doesn’t need to go into that.) > > Ok. > > > "To avoid violations of for example C11::6.5.16.1p3," > > > > Code that violates C11::6.5.16.1p3 will do so regardless of whether > > ‘restrict’ is present. I would not mention C11::6.5.16.1p3 as it’s > > a red > > herring. Fundamentally, ‘restrict’ is not about the consequences of > > caching > > when one does overlapping moves; it’s about caching in a more > > general sense. > > The violations are UB regardless of restrict, but consistent use of > restrict allows the caller to have a rough model of what the callee > will > do with the objects, and prevent those violations via compiler > diagnostics. I've reworded that part to make it more clear why I'm > mentioning that. > > > “As long as an object is only accessed via one restricted pointer, > > other > > restricted pointers are allowed to point to the same object.” > > > > “only accessed” → “accessed only” > > Ok. > > > “This is less strict than I think it should be, but this proposal > > doesn’t > > attempt to change that definition.” > > > > I would omit this sentence and all similar sentences. Don’t > > distract the > > reader with other potential proposals. The proposal as it stands is > > complicated enough. > > Ok. > > > “return ca > a;” > > “return ca > *ap;” > > > > I fail to understand why these examples are present. It’s not > > simply that > > nobody writes code like that: the examples are not on point. I > > would remove > > the entire programs containing them, along with the sections that > > discuss > > them. When writing to the C committee one can assume the reader is > > expert in > > ‘restrict’, there is no need for examples such as these. 
> > Those are examples of how consistent use of restrict can --or could, > in > the case of g()-- detect, via compiler diagnostics, (likely) > violations > of seemingly unrelated parts of the standard, such as the referenced > C11::6.5.16.1p3, or in this case, C11::6.5.8p5. > > > “strtol(3) accepts 4 objects via pointer parameters and global > > variables.” > > > > Omit the “(3)”, here and elsewhere, as the audience is the C > > standard > > committee. > > The C standard committee doesn't know about historic use of (3)? > That > predates the standard, and they built on top of that (C originated in > Unix). While they probably don't care about it anymore, I expect my > paper to be read by other audience, including GCC and glibc, and I > prefer to keep it readable for that audience. I expect the standard > committee to at least have a rough idea of the existence of this > syntax, > and respect it, even if they don't use it or like it. > > > “accepts” is a strange word to use here: normally one says > > “accepts” to talk > > about parameters, not global variables. > > The thing is, strtol(3) does not actually access *endptr. I thought > that might cause more confusion than using "accepts". > > > Also, “global variables” is not > > right here. The C standard allows strtol, for example, to read and > > write an > > internal static cache. (Yes, that would be weird, but it’s > > allowed.) > > That's not part of the API. A user must not access internal static > cache, and so the implementation is free to assume that it doesn't, > regardless of the use of restrict in the API, so it is not relevant > for > the purpose of this discussion, I think. > > > I > > suggest rephrasing this sentence to talk about accessing, not > > accepting. > > I don't want to use accessing, for it would be inconsistent with > later > saying that *endptr is not accessed. However, I'm open to other > suggested terms that might be more appropriate than both. > > > “endptr access(write_only) ... 
*endptr access(none)” > > > > This is true for glibc, but it’s not necessarily true for all > > conforming > > strtol implementations. If endptr is non-null, a conforming strtol > > implementation can both read and write *endptr; > > It can't, I think. It's perfectly valid to pass an uninitialized > endptr, which means the callee must not read the original value. > > char *end; > strtol("0", &end, 0); > > If strtol(3) would be allowed to read it, the user would need to > initialize it. > > > it can also both read and > > write **endptr. (Although it would need to write before reading, > > reading is > > allowed.) > > Here, we need to consider two separate objects. The object pointed- > to > by *endptr _before_ the object pointed to by endptr is written to, > and > the object pointed-to by *endptr _after_ the object pointed to by > endptr > is written to. > > For the former (the original *endptr): > > Since *endptr might be uninitialized, strtol(3) must NOT > access > the object pointed to by an uninitialized pointer. > > For the latter (the final *endptr): > > The callee cannot write to it, since the specification of the > function is that the string will not be modified. And in any > case, such an access is ultimately derived from nptr, not > from > *endptr, so it does not matter for the discussion of *endptr. > > Of course, that's derived from the specification of the function, and > not from its prototype, since ISO C doesn't provide such detailed > prototypes (since it doesn't have the [[gnu::access()]] attribute). > But > the standard must abide by its own specification of functions, > anyway. > > > “This qualifier helps catch obvious bugs such as strtol(p, p, 0) > > and > > strtol(&p, &p, 0) .” > > > > No it doesn’t. Ordinary type checking catches those obvious bugs, > > and > > ‘restrict’ provides no additional help there. Please complicate the > > examples > > to make the point more clearly. 
> > To be pedantic, I didn't specify the type of p, so it might be (void > *), > and thus avoid type checking at all. However, to avoid second > guessing > from the standards committee, I'll add casts, to make it more obvious > that restrict is catching those. > > > “The caller knows that errno doesn’t alias any of the function > > arguments.” > > > > Only because all args are declared with ‘restrict’. So if the > > proposal is > > accepted, the caller doesn’t necessarily know that. > > Not really. The caller has created the string (or has received it > via a > restricted pointer), and so it knows it's not derived from errno. > > char buf[LINE_MAX + 1]; > > fgets(...); > n = strtol(buf, ...); > > This caller knows with certainty that errno does not alias buf. Of > course, in some complex cases, it might not know, but I ommitted that > for simplicity. And in any case, I don't think any optimizations are > affected by that in the caller. > > > > > > > “The callee knows that *endptr is not accessed.” > > > > This is true for glibc, but not necessarily true for every > > conforming strtol > > implementation. > > The original *endptr may be uninitialized, and so must not be > accessed. > > > “It might seem that it’s a problem that the callee doesn’t know if > > nptr can > > alias errno or not. However, the callee will not write to the > > latter > > directly until it knows it has failed,” > > > > Again this is true for glibc, but not necessarily true for every > > conforming > > strtol implementation. > > An implementation is free to set errno = EDEADLK in the middle of it, > as > long as it later removes that. However, I don't see how it would > make > any sense. > > If that's done, it's probably done via a helper internal function, > which > as said below, can use restrict for nptr, and thus know with > certainty > that nptr is distinct from errno. 
> > If that's done directly in the body of strtol(3) (the only place > where > it's not known that nptr is distinct from errno) we can probably > agree > that the implementation is doing that just for fun, and doesn't care > about optimization, and thus we can safely ignore it. > > > To my mind this is the most serious objection. The current standard > > prohibits calls like strtol((char *) &errno, 0, 0). The proposal > > would relax > > the standard to allow such calls. In other words, the proposal > > would > > constrain implementations to support such calls. > > I don't think it does. ISO C specifies that strtol(3) takes a string > as > its first parameter, and errno is not (unless you do this:). > > (char *)&errno = "111"; > > Okay, let's assume you're allowed to do that, since a char* can alias > anything. > > I still don't think ISO C constrains implementations to allow passing > (char *)&errno as a char*, just because it's not restrict. Let's > find > an ISO C function that accepts a non-restrict string: > > int system(const char *string); > > Does ISO C constrain implementations to support system((char > *)&errno)? > I don't think so. Maybe it does implicitly because of a defect in > the > wording, but even then it's widely understood that it doesn't. > > > Why is this change worth > > making? Real-world programs do not make calls like that. > > Because it makes analysis of 'restrict' more consistent. The obvious > improvement of GCC's analyzer to catch restrict violations will > trigger > false positives in normal uses of strtol(3). Hi Alejandro I'm author/maintainer of GCC's -fanalyzer option, which is presumably why you CCed me on this. One of my GSoC 2022 students (Tim Lange) looked at making use of 'restrict' in -fanalyzer, see e.g. 
https://lists.gnu.org/archive/html/bug-gnulib/2022-07/msg00062.html Based on Paul's comment here: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99860#c2 (and its references) I came to the conclusion at the time that we should work on something else, as the meaning of 'restrict' is too ambiguous. Later, I added a new -Wanalyzer-overlapping-buffers warning in GCC 14, which simply has a hardcoded set of standard library functions that it "knows" to warn about. Has the C standard clarified the meaning of 'restrict' since that discussion? Without that, I wasn't planning to touch 'restrict' in GCC's -fanalyzer. Sorry if I'm missing anything here; I confess I've skimmed through parts of this thread. Dave > > > “But nothing prohibits those internal helper functions to specify > > that nptr > > is restrict and thus distinct from errno.” > > > > Although true, it’s also the case that the C standard does not > > *require* > > internal helper functions to use ‘restrict’. All that matters is > > the > > accesses. So I’m not sure what the point of this statement is. > > If an implementation wants to optimize, it should be careful and use > restrict. If it doesn't, then it can't complain that ISO C doesn't > allow it to. It's actually allowed to optimize, but it has to do > some > work for it. > > > “m = strtol(p, &p, 0); An analyzer more powerful than the current > > ones > > could extend the current -Wrestrict diagnostic to also diagnose > > this case.” > > > > Why would an analyzer want to do that? This case is a perfectly > > normal thing > > to do and it has well-defined behavior. > > Because without an analyzer, restrict cannot emit many useful > diagnostics. It's a qualifier that's all about data flow analysis, > and > normal diagnostics aren't able to do that. > > A qualifier that enables optimizations but doesn't enable diagnostics > is > quite dangerous, and probably better not used. 
If however, the > analyzer > emits advanced diagnostics for misuses of it, then it's a good > qualifier. > > Have a lovely day! > Alex > > > > > “To prevent triggering diagnostics in a powerful analyzer that > > would be > > smart enough to diagnose the example function g(), the prototype of > > strtol(3) should be changed to ‘long int strtol(const char *nptr, > > char > > **restrict endptr, int base);’” > > > > Sorry, but the case has not been made to make any such change to > > strtol’s > > prototype. On the contrary, what I’m mostly gathering from the > > discussion is > > that ‘restrict’ can be confusing, which is not news. > > > > n3220 §6.7.4.2 examples 5 through 7 demonstrate that the C > > committee has > > thought through the points you’re making. (These examples were not > > present > > in C11.) This may help to explain why the standard specifies strtol > > with > > ‘restrict’ on both arguments. > > >
On Mon, Jul 08, 2024 at 10:30:48AM GMT, David Malcolm wrote: > > > Why is this change worth > > > making? Real-world programs do not make calls like that. > > > > Because it makes analysis of 'restrict' more consistent. The obvious > > improvement of GCC's analyzer to catch restrict violations will > > trigger > > false positives in normal uses of strtol(3). > > Hi Alejandro Hi Dave, > I'm author/maintainer of GCC's -fanalyzer option, which is presumably > why you CCed me on this. Yup. > One of my GSoC 2022 students (Tim Lange) > looked at making use of 'restrict' in -fanalyzer, see e.g. > https://lists.gnu.org/archive/html/bug-gnulib/2022-07/msg00062.html > > Based on Paul's comment here: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99860#c2 (and its > references) I came to the conclusion at the time that we should work on > something else, as the meaning of 'restrict' is too ambiguous. restrict, as of the formal definition of ISO C is useless crap. The more I read it, the more I agree. restrict, as of what -Wrestrict warns about, seems a reasonable thing. How about a [[gnu::restrict()]] attribute, similar to [[gnu::access()]], which is simpler than the qualifier? Since restrict is only meaningful in function boundaries, it would make sense to have a function attribute. We don't want a qualifier that must follow discarding rules. And then have it mean something strict, such as: The object pointed to by the pointer is not pointed to by any other pointer; period. This definition is already what -Wrestrict seems to understand. > Later, I added a new -Wanalyzer-overlapping-buffers warning in GCC 14, > which simply has a hardcoded set of standard library functions that it > "knows" to warn about. Hmmm, so it doesn't help at all for anything other than libc. Ok. > Has the C standard clarified the meaning of 'restrict' since that > discussion? Without that, I wasn't planning to touch 'restrict' in > GCC's -fanalyzer. Meh; no they didn't. I understand. 
That's why I don't like innovations in ISO C, and prefer that implementations innovate with real stuff. > Sorry if I'm missing anything here; I confess I've skimmed through > parts of this thread. Nope; all's good. > > Dave
Am Montag, dem 08.07.2024 um 17:01 +0200 schrieb Alejandro Colomar: > On Mon, Jul 08, 2024 at 10:30:48AM GMT, David Malcolm wrote: ... > And then have it mean something strict, such as: The object pointed to > by the pointer is not pointed to by any other pointer; period. > > This definition is already what -Wrestrict seems to understand. One of the main uses of restrict is scientific computing. In this context such a definition of "restrict" would not work for many important use cases. But I agree that for warning purposes the definition of "restrict" in ISO C is not helpful. > > > Later, I added a new -Wanalyzer-overlapping-buffers warning in GCC 14, > > which simply has a hardcoded set of standard library functions that it > > "knows" to warn about. > > Hmmm, so it doesn't help at all for anything other than libc. Ok. > > > Has the C standard clarified the meaning of 'restrict' since that > > discussion? Without that, I wasn't planning to touch 'restrict' in > > GCC's -fanalyzer. > > Meh; no they didn't. There were examples added in C23 and there are now several papers being under discussion. > I understand. That's why I don't like innovations > in ISO C, and prefer that implementations innovate with real stuff.
Hi Martin, On Mon, Jul 08, 2024 at 06:05:08PM GMT, Martin Uecker wrote: > Am Montag, dem 08.07.2024 um 17:01 +0200 schrieb Alejandro Colomar: > > On Mon, Jul 08, 2024 at 10:30:48AM GMT, David Malcolm wrote: > > ... > > And then have it mean something strict, such as: The object pointed to > > by the pointer is not pointed to by any other pointer; period. > > > > This definition is already what -Wrestrict seems to understand. > > One of the main uses of restrict is scientific computing. In this > context such a definition of "restrict" would not work for many > important use cases. But I agree that for warning purposes the > definition of "restrict" in ISO C is not helpful. Do you have some examples of functions where this matters and is important? I'm curious to see them. Maybe we find some alternative. > > > Has the C standard clarified the meaning of 'restrict' since that > > > discussion? Without that, I wasn't planning to touch 'restrict' in > > > GCC's -fanalyzer. > > > > Meh; no they didn't. > > There were examples added in C23 and there are now several papers > being under discussion. Hmm, yeah, the examples help with the formal definition. I was thinking of the definition itself, which I still find quite confusing. :-) Have a lovely night! Alex
On Mon, 2024-07-08 at 17:01 +0200, Alejandro Colomar wrote: > On Mon, Jul 08, 2024 at 10:30:48AM GMT, David Malcolm wrote: > > > > Why is this change worth > > > > making? Real-world programs do not make calls like that. > > > > > > Because it makes analysis of 'restrict' more consistent. The > > > obvious > > > improvement of GCC's analyzer to catch restrict violations will > > > trigger > > > false positives in normal uses of strtol(3). > > > > Hi Alejandro > > Hi Dave, > > > I'm author/maintainer of GCC's -fanalyzer option, which is > > presumably > > why you CCed me on this. > > Yup. > > > One of my GSoC 2022 students (Tim Lange) > > looked at making use of 'restrict' in -fanalyzer, see e.g. > > https://lists.gnu.org/archive/html/bug-gnulib/2022-07/msg00062.html > > > > Based on Paul's comment here: > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99860#c2 (and its > > references) I came to the conclusion at the time that we should > > work on > > something else, as the meaning of 'restrict' is too ambiguous. > > restrict, as of the formal definition of ISO C is useless crap. The > more I read it, the more I agree. Please note that "useless crap" was your wording, not mine. > > restrict, as of what -Wrestrict warns about, seems a reasonable > thing. > > How about a [[gnu::restrict()]] attribute, similar to > [[gnu::access()]], > which is simpler than the qualifier? Since restrict is only > meaningful > in function boundaries, it would make sense to have a function > attribute. We don't want a qualifier that must follow discarding > rules. If it doesn't have the same meaning as "restrict" then perhaps call the proposed attribute something other than "restrict"? 
That said, I don't have strong opinions on any of this, except to note that I have more than enough *other* work on improvements to GCC's static analyzer and usability to keep me busy, so getting sucked into discussion/implementation on 'restrict' is something I want to avoid, and -Wanalyzer-overlapping-buffers is getting the job done for me at the moment. [...snip...] Hope this is constructive; sorry again if I missed anything due to only skimming the thread Dave
Am Montag, dem 08.07.2024 um 22:17 +0200 schrieb Alejandro Colomar: > Hi Martin, > > On Mon, Jul 08, 2024 at 06:05:08PM GMT, Martin Uecker wrote: > > Am Montag, dem 08.07.2024 um 17:01 +0200 schrieb Alejandro Colomar: > > > On Mon, Jul 08, 2024 at 10:30:48AM GMT, David Malcolm wrote: > > > > ... > > > And then have it mean something strict, such as: The object pointed to > > > by the pointer is not pointed to by any other pointer; period. > > > > > > This definition is already what -Wrestrict seems to understand. > > > > One of the main uses of restrict is scientific computing. In this > > context such a definition of "restrict" would not work for many > > important use cases. But I agree that for warning purposes the > > definition of "restrict" in ISO C is not helpful. > > Do you have some examples of functions where this matters and is > important? I'm curious to see them. Maybe we find some alternative. In many numerical algorithms you want to operate on different parts of the same array object. E.g. for matrix decompositions you want to take a row / column and add it to another. Other examples are algorithms that decompose some input (e.g. high and low band in a wavelet transform) and store it into the same output array, etc. Without new notation for strided array slicing, one fundamentally needs the flexibility of restrict that only guarantees that actual accesses do not conflict. But this then implies that one cannot use restrict as a contract specification on function prototypes, but has to analyze the implementation of a function to see if it is used correctly. But I would not see it as a design problem of restrict. It was simply not the intended use case when originally designed. > > > > > Has the C standard clarified the meaning of 'restrict' since that > > > > discussion? Without that, I wasn't planning to touch 'restrict' in > > > > GCC's -fanalyzer. > > > > > > Meh; no they didn't.
> > > > There were examples added in C23 and there are now several papers > > being under discussion. > > Hmm, yeah, the examples help with the formal definition. I was thinking > of the definition itself, which I still find quite confusing. :-) Indeed. Martin > > Have a lovely night! > Alex >
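As a rough illustration of the row-update case Martin describes, here is a minimal sketch in standard C. The names axpy and add_row are hypothetical and do not come from the thread. Both restrict pointers point into the same array object, which is fine under the access-based definition as long as the elements actually accessed through them are disjoint.

```c
#include <stddef.h>

/* Hypothetical helper: dst[i] += scale * src[i] for i in [0, n).
 * 'restrict' only promises that the elements actually accessed
 * through dst and src do not overlap. */
void axpy(double *restrict dst, const double *restrict src,
          double scale, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] += scale * src[i];
}

/* Add 'scale' times row 'from' to row 'to' of a row-major matrix
 * with 'cols' columns, stored in one array object. */
void add_row(double *m, size_t cols, size_t to, size_t from, double scale)
{
    /* Both arguments point into the same array, but the accessed
     * ranges are disjoint whenever to != from. */
    if (to != from)
        axpy(&m[to * cols], &m[from * cols], scale, cols);
}
```

A definition of the form "no other pointer points to the same object" would reject the call inside add_row, even though no accessed element is shared between the two rows.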
Hi Dave, On Mon, Jul 08, 2024 at 06:48:51PM GMT, David Malcolm wrote: > > restrict, as of the formal definition of ISO C is useless crap. The > > more I read it, the more I agree. > > Please note that "useless crap" was your wording, not mine. Yup. :) > > > > > restrict, as of what -Wrestrict warns about, seems a reasonable > > thing. > > > > How about a [[gnu::restrict()]] attribute, similar to > > [[gnu::access()]], > > which is simpler than the qualifier? Since restrict is only > > meaningful > > in function boundaries, it would make sense to have a function > > attribute. We don't want a qualifier that must follow discarding > > rules. > > If it doesn't have the same meaning as "restrict" then perhaps call the > proposed attribute something other than "restrict"? Yup, I was thinking that maybe noalias is a better name. > > That said, I don't have strong opinions on any of this, except to note > that I have more than enough *other* work on improvements to GCC's > static analyzer and usability to keep me busy, so getting sucked into > discussion/implementation on 'restrict' is something I want to avoid, > and -Wanalyzer-overlapping-buffers is getting the job done for me at > the moment. > > [...snip...] > > Hope this is constructive; sorry again if I missed anything due to only > skimming the thread It is. I don't want you to work on this if you don't have time or interest. Just having the idea floating around, and if somebody finds time to have a look at it in the next decade, maybe try it. :-) Does that make sense? Cheers, Alex > > Dave > >
On Tue, Jul 09, 2024 at 11:07:59AM +0200, Alejandro Colomar wrote: > > > restrict, as of what -Wrestrict warns about, seems a reasonable > > > thing. > > > > > > How about a [[gnu::restrict()]] attribute, similar to > > > [[gnu::access()]], > > > which is simpler than the qualifier? Since restrict is only > > > meaningful > > > in function boundaries, it would make sense to have a function > > > attribute. We don't want a qualifier that must follow discarding > > > rules. > > > > If it doesn't have the same meaning as "restrict" then perhaps call the > > proposed attribute something other than "restrict"? > > Yup, I was thinking that maybe noalias is a better name. Name is one thing, but you'd also need to clearly define what it means. When restrict is access based, it is clear what it means. If you want something else which is not based on accesses and which should allow warnings in the callers, I suppose you need to specify not just the pointer but the extent as well (and maybe stride) or that it is an '\0' terminated string, because if you want to say that for void foo (char *, const char *, int); the 2 pointers don't really alias, the size information is missing. So, shall the new warning warn on struct S { char a[1024]; char b[1024]; } s; foo (s.a, s.b, 512); or not? Or foo (s.a, s.a + 512, 512); Jakub
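To make the role of the extent concrete, here is a minimal sketch of the question a caller-side check would have to answer for Jakub's two calls. The helper extents_overlap is hypothetical and exists in neither GCC nor the thread. It assumes a flat address space, where converting pointers to uintptr_t gives comparable addresses, and non-empty ranges that do not wrap around.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical check: do the byte ranges [a, a+na) and [b, b+nb)
 * overlap?  The pointers are converted to uintptr_t so that
 * pointers to distinct objects may be compared; the result is
 * implementation-defined but meaningful on flat address spaces. */
bool extents_overlap(const void *a, size_t na, const void *b, size_t nb)
{
    uintptr_t ua = (uintptr_t) a;
    uintptr_t ub = (uintptr_t) b;

    return ua < ub + nb && ub < ua + na;
}

/* The struct from Jakub's example. */
struct S { char a[1024]; char b[1024]; };
```

Under this reading neither foo (s.a, s.b, 512) nor foo (s.a, s.a + 512, 512) overlaps, while a call such as foo (s.a, s.a + 256, 512) would. Without the third argument, or some other statement of the extent, the check has nothing to compare.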
Hi Martin, On Tue, Jul 09, 2024 at 07:58:40AM GMT, Martin Uecker wrote: > Am Montag, dem 08.07.2024 um 22:17 +0200 schrieb Alejandro Colomar: > > Hi Martin, > > > > On Mon, Jul 08, 2024 at 06:05:08PM GMT, Martin Uecker wrote: > > > Am Montag, dem 08.07.2024 um 17:01 +0200 schrieb Alejandro Colomar: > > > > On Mon, Jul 08, 2024 at 10:30:48AM GMT, David Malcolm wrote: > > > > > > ... > > > > And then have it mean something strict, such as: The object pointed to > > > > by the pointer is not pointed to by any other pointer; period. > > > > > > > > This definition is already what -Wrestrict seems to understand. > > > > > > One of the main uses of restrict is scientific computing. In this > > > context such a definition of "restrict" would not work for many > > > important use cases. But I agree that for warning purposes the > > > definition of "restrict" in ISO C is not helpful. > > > > Do you have some examples of functions where this matters and is > > important? I'm curious to see them. Maybe we find some alternative. > > In many numerical algorithms you want to operate on > different parts of the same array object. E.g. for matrix > decompositions you want to take a row / column and add it > to another. Other examples are algorithms that decompose > some input (e.g. high and low band in a wavelet transform) > and store it into the same output array, etc. > > Without new notation for strided array slicing, one I'll have to remove dust from my old proposal of [.nmemb]? :) > fundamentally needs the flexibility of restrict that > only guarantees that actual accesses do not conflict. I guess a combination of [.nmemb] (or the third argument of [[gnu::access()]]) and [[alx::noalias]] could be good enough for such a use case?
[[alx::noalias(1)]] [[alx::noalias(2)]] void add(int a[.n], const int b[.n], size_t n); //or [[alx::noalias(1)]] [[alx::noalias(2)]] [[gnu::access(read_write, 1, 3)]] [[gnu::access(read_only, 2, 3)]] void add(int a[], const int b[], size_t n); add(&arr[0], &arr[50], 50); The caller should be able to know that while the pointers can alias within the caller, they don't alias within the callee, since the callee has no right to access past the specified bound*. * The standard would have to tighten bounds in function interfaces, since right now, the value is just ignored if it doesn't come with 'static' (which I never understood), and if specified with 'static', it's just a lower bound, not a strict bound. I would propose changing the meaning of [N] in a function prototype to mean a strict bound. Does that make sense? Have a lovely day! Alex > But this then implies that one can not use restrict as a > contract specification on function prototypes, but has > to analyze the implementation of a function to see if > it is used correctly. But I would not see it as a design > problem of restrict. It was simply not the intended use > case when originally designed.
Hi Jakub, On Tue, Jul 09, 2024 at 11:18:11AM GMT, Jakub Jelinek wrote: > On Tue, Jul 09, 2024 at 11:07:59AM +0200, Alejandro Colomar wrote: > > Yup, I was thinking that maybe noalias is a better name. > > Name is one thing, but you'd also need to clearly define what it means. > When restrict is access based, it is clear what it means. > > If you want something else which is not based on accesses and which should > allow warnings in the callers, I suppose you need to specify not just the > pointer but the extent as well (and maybe stride) or that it is an '\0' Agree. Here's how I'd define it as an attribute: noalias The noalias function attribute specifies that the pointer to which it applies is the only reference to the array object that it points to (except that a pointer to one past the last element may overlap another object). If the number of elements is specified with array notation, the array object to be considered is a subobject of the original array object, which is limited to the number of elements specified in the function prototype. Example: [[alx::noalias(1)]] [[alx::noalias(2)]] [[gnu::access(read_write, 1)]] [[gnu::access(read_only, 2)]] void add_inplace(int a[n], const int b[n], size_t n); int arr[100] = ...; add_inplace(arr, arr + 50, 50); In the example above, the parameters a and b don't alias inside the function, since the subobjects of 50 elements do not overlap each other, even though they are one single array object to the outer function. It may need some adjustment, to avoid conflicts with other parts of ISO C, but this is the idea I have in mind. > terminated string, because if you want to say that for > void foo (char *, const char *, int); > the 2 pointers don't really alias, the size information is missing. So, > shall the new warning warn on > struct S { char a[1024]; char b[1024]; } s; > foo (s.a, s.b, 512); This does not need clarification of bounds.
You're passing separate objects, and thus cannot alias (except that maybe you're able to cast to the struct type, and then access s.b from a pointer derived from s.a; I never know that rule too well). > or not? Or foo (s.a, s.a + 512, 512); According to the definition I provide in this email, the above is just fine. Thanks! Have a lovely day! Alex > > Jakub > >
On Tue, Jul 09, 2024 at 12:28:18PM GMT, Alejandro Colomar wrote: > Hi Jakub, > > On Tue, Jul 09, 2024 at 11:18:11AM GMT, Jakub Jelinek wrote: > > On Tue, Jul 09, 2024 at 11:07:59AM +0200, Alejandro Colomar wrote: > > > Yup, I was thinking that maybe noalias is a better name. > > > > Name is one thing, but you'd also need to clearly define what it means. > > When restrict is access based, it is clear what it means. > > > > If you want something else which is not based on accesses and which should > > allow warnings in the callers, I suppose you need to specify not just the > > pointer but the extent as well (and maybe stride) or that it is an '\0' > > Agree. Here's how I'd define it as an attribute: > > noalias > > The noalias function attribute specifies that the pointer to > which it applies is the only reference to the array object that > it points to (except that a pointer to one past the last > element may overlap another object). > > If the number of elements is specified with array notation, the > array object to be considered is a subobject of the original > array object, which is limited to the number of elements > specified in the function prototype. > > Example: > > [[alx::noalias(1)]] [[alx::noalias(2)]] > [[gnu::access(read_write, 1)]] [[gnu::access(read_only, 2)]] > void add_inplace(int a[n], const int b[n], size_t n); Ooops, I meant 'n' to be the first parameter. > > char arr[100] = ...; > > add_inplace(arr, arr + 50, 50); > > In the example above, the parameters a and b don't alias inside > the function, since the subobjects of 50 elements do not overlap > eachother, even though they are one single array object to the > outer function. > > It may need some adjustment, to avoid conflicts with other parts of > ISO C, but this is the idea I have in mind. > > > terminated string, because if you want to say that for > > void foo (char *, const char *, int); > > the 2 pointers don't really alias, the size information is missing. 
So, > > shall the new warning warn on > > struct S { char a[1024]; char b[1024]; } s; > > foo (s.a, s.b, 512); > > This does not need clarification of bounds. You're passing separate > objects, and thus cannot alias (except that maybe you're able to cast > to the struct type, and then access s.b from a pointer derived from > s.a; I never know that rule too well). > > > or not? Or foo (s.a, s.a + 512, 512); > > According to the definition I provide in this email, the above is just > fine. > > Thanks! > > Have a lovely day! > Alex > > > > > Jakub > > > > > > -- > <https://www.alejandro-colomar.es/>
On 7/8/24 00:52, Alejandro Colomar wrote: > a small set of functions > accept pointers that alias each other, but one of them is never > accessed; in those few cases, restrict was added to the parameters in > ISO C, but I claim it would be better removed. Are these aliasing pointers the nptr and initial *endptr of strtol? That is, are you saying that the last line in the following example, which is currently invalid, should become valid and should be implementable as ‘end = s; long l = 0;’? char *end; char *s = (char *) &end; *s = '\0'; long l = strtol (s, &end, 0); If so, I fail to see the motivation for the proposed change, as nobody writes (or should write) code like that. And if not, evidently I misunderstand the proposal. > the small set of functions where this happens don't seem to use any state, > so we don't need to care about implementations using internal buffers > that are passed somehow to the user. For strtol (nptr, endptr, 10) evidently the state that’s of concern is the value of *endptr. But here it’s possible that an implementation could use that state, even if the source code of the implementation's strtol does not. For example, suppose the key part of the base 10 implementation in strtol.c is this: bool overflow = false; long int n = 0; for (; '0' <= *nptr && *nptr <= '9'; nptr++) { overflow |= ckd_mul (&n, n, 10); overflow |= ckd_add (&n, n, *nptr - '0'); } *endptr = (char *) nptr; ... more code goes here ... Currently, on typical platforms where CHAR_WIDTH < INT_WIDTH and INT_WIDTH == UINT_WIDTH, the C standard lets the compiler compile this code as if it were the following instead. bool overflow = false; long int n = 0; *endptr = (char *) nptr; unsigned int digit = *nptr++ - '0'; if (digit <= 9) { n = digit; while ((digit = *nptr++ - '0') <= 9) { overflow |= ckd_mul (&n, n, 10); overflow |= ckd_add (&n, n, digit); } *endptr = (char *) nptr; } ... more code goes here ... This sort of thing might make sense on some architectures.
However, the proposed change would not allow this optimization, because it’s invalid when nptr points into *endptr. For strtol I suppose this is not that big a deal; strtol is kinda slow anyway so who cares if it’s a bit slower? But surely we wouldn’t want to give up even this minor performance win unless we get something in return, and I’m still not seeing what we get in return. > Maybe I should use abstract names for the objects, to avoid confusing > them with the pointer variables that are used to pass them? That might help, yes, since v0.2 is unclear on this point. > this formal > definition is quite unreadable, though. The more I read it, the less > sure I am about it. Yes, it’s lovely isn’t it? One must understand what the C committee intended in order to read and understand that part of the standard. > If L is used to access the value of the object X that it > designates, and X is also modified (by any means), then the > following requirements apply: T shall not be const-qualified > > This reads to me as "const variables are not writable when they are > accessed via a restricted pointer; casting away is not enough". Am I > reading this correctly? In that quoted statement, the restricted pointer is not allowed to be pointer-to-const. However, I’m not quite sure what your question means, as the phrase “const variables” does not appear in the standard. Perhaps give an example to clarify the question? >> an implementation is allowed to set errno = EINVAL first thing, and then set >> errno to some other nonzero value if it determines that the arguments are >> valid. I wouldn't implement strtol that way, but I can see where someone >> else might do that. > > In any case an implementation is not obliged to pessimize strtol(3). It > is only allowed to. Should we not allow them to do so? Of course the standard should allow suboptimal implementations. However, I’m not sure what the point of the question is. 
The “errno = EINVAL first thing” comment says that removing ‘restrict’ obliges the implementation to support obviously-bogus calls like strtol(&errno, ...), which might make the implementation less efficient. I don’t see how the question is relevant to that comment. > Let's take a simpler one: rename(2). Is it allowed to receive &errno? > Hopefully not. I agree with that hope, but the current C standard seems to allow it. I think we both agree this is a defect in the standard. >>>> Why is this change worth >>>> making? Real-world programs do not make calls like that. >>> >>> Because it makes analysis of 'restrict' more consistent. The obvious >>> improvement of GCC's analyzer to catch restrict violations will trigger >>> false positives in normal uses of strtol(3). >> >> v0.2 does not support this line of reasoning. On the contrary, v0.2 suggests >> that a compiler should diagnose calls like "strtol(p, &p, 0)", which would >> be wrong as that call is perfectly reasonable. > > That call is perfectly, reasonable, which is why I suggest that the > standard should modify the prototype so that strtol(p, &p, 0), which is > a reasonable call, should not be warned by a compiler that would > diagnose such calls. Of course they shouldn’t warn. But where are these compilers? v0.2 asserts that “An analyzer more powerful than the current ones could extend the current -Wrestrict diagnostic to also diagnose this case.” But why would an analyzer want to do that? v0.2 doesn’t say. The proposal merely asks to change prototypes for the C standard functions strtol, strtoul, etc. But if that is the only change needed then why bother? C compilers already do special-case analysis for functions defined by the C standard, and they can suppress undesirable diagnostics for these special cases. If you’ve identified a more general problem with ‘restrict’ then welcome to the club! 
The experts already know it’s confusing and limited, and are discussing about whether and how to improve things in the next C standard. I am sure you’d be welcome to those discussions. > That is, just by reading the prototypes: > > void foo(int *restrict x, int **restrict p); > > and > > void bar(int *x, int **restrict endp); > > one should be able to determine that > > foo(p, &p); > > is probably causing UB (and thus trigger a warning) but > > bar(p, &p); > > is fine. Sure, but this is a discussion we should be having with the compiler writers, no? Is this the main motivation for the proposal? If so, how would weakening the spec for strtol etc. affect that discussion with the compiler writers? v0.2 does not make this clear. >> Another way to put it: v0.2 does not clearly state the advantages of the >> proposed change, and in at least one area what it states as an advantage >> would actually be a disadvantage. > > The advantage is having more information in the caller. As a caller, I > want to distinguish calls where it's ok to pass pointers that alias, and > where not. And I want my compiler to be able to help me there. I’m still not understanding. Removing ‘restrict’ from strtol’s first arg gives the caller less information, not more. > I'd rather have a simple analyzer, which will provide for > less false positives and negatives. The C committee appears to have the opposite opinion, as when they were asked about this matter they added Examples 5 through 7 to what is now §6.7.4.2 (Formal definition of restrict). These examples say that Example 2 (which uses ‘restrict’ on all arguments) is the simplest and most effective way to use ‘restrict’, even though a smarter compiler can still make some good inferences when some pointer args are ‘restrict’ and others are merely pointers to const. If the proposal is disagreeing with Examples 5 through 7, this point needs to be thoroughly discussed in the proposal. > GCC can only catch the most obvious violations of restrict. 
Yes, but I fail to see how changing the API for strtol etc. would improve that situation. > #include <string.h> > > typedef struct { > int x; > } T; > > [[gnu::access(read_only, 1)]] > [[gnu::access(read_only, 2)]] > void > replace(T *restrict *restrict ls, const T *restrict new, size_t pos) > { > memcpy(ls[pos], new, sizeof(T)); > } > > void > f(T *restrict *restrict ls) > { > replace(ls, ls[0], 1); > } > > $ gcc-14 -Wall -Wextra -fanalyzer replace.c -S > $ > > The above program causes UB, It’s not a complete program and I don’t see the undefined behavior. If behavior is undefined because it violates the [[gnu::access(...)]] restrictions, that is not the sort of example that would convince the C standardization committee; they’d want to see a standard C program. I tried to write a standard C program to illustrate the issue, and came up with the following. #include <string.h> typedef struct { int x; } T; static void replace (T *restrict const *restrict ls, T const *restrict new, size_t pos) { memcpy (ls[pos], new, sizeof (T)); } static void f (T *restrict const *restrict ls) { replace (ls, ls[0], 1); } int main () { T u = {100}, v = {200}, *a[2] = {&u, &v}; f (a); return a[0]->x ^ a[1]->x; } However, I still don’t see undefined behavior there. gcc -O2 compiles this as if it were ‘int main () { return 0; }’ which is the only possible correct behavior. > writing an analyzer that triggers on this code, it > will trigger on strtol(3) too. Sorry, I’m still not following the motivation for the proposed change. It appears to be something like “if we removed ‘restrict’ from strtol etc’s first arg, compilers could generate better ‘restrict’ diagnostics everywhere” but none of this is clear or making sense to me. And if I’m missing the point I have little doubt that the C committee will miss it too.
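For reference, the call pattern that both sides agree is reasonable, strtol(p, &p, 0), looks like this when used to walk a string of numbers. This is only a sketch: the function name parse_longs is hypothetical, and overflow handling (errno and ERANGE) is left out for brevity, as in the example from the original patch.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: parse up to 'max' whitespace-separated
 * integers from 'str' into 'out' and return how many were parsed.
 * Each call passes the value of p as nptr and the address of p as
 * endptr; strtol reads the string and writes only to p itself. */
size_t parse_longs(char *str, long *out, size_t max)
{
    char *p = str;
    size_t n = 0;

    while (n < max) {
        char *start = p;
        long v = strtol(p, &p, 0);

        if (p == start)     /* no conversion was performed */
            break;
        out[n++] = v;
    }
    return n;
}
```

With char str[] = "10 20", this stores 10 and 20 and then stops, because the third call performs no conversion and leaves p equal to start.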
Hi Paul, On Tue, Jul 09, 2024 at 02:09:24PM GMT, Paul Eggert wrote: > On 7/8/24 00:52, Alejandro Colomar wrote: > > a small set of functions > > accept pointers that alias each other, but one of them is never > > accessed; in those few cases, restrict was added to the parameters in > > ISO C, but I claim it would be better removed. > > Are these aliasing pointers the nptr and initial *endptr of strtol? Yes. > That is, > are you saying that last line in the following example, which is currently > invalid, should become valid and should be implementable as ‘end = s; long l > = 0;’? No. I don't think this is a consequence of the previous statement. > > char *end; > char *s = (char *) &end; > *s = '\0'; > long l = strtol (s, &end, 0); > > If so, I fail to see the motivation for the proposed change, as nobody > writes (or should write) code like that. And if not, evidently I > misunderstand the proposal. My proposal is: long int -strtol(const char *restrict nptr, char **restrict endptr, int base); +strtol(const char *nptr, char **restrict endptr, int base); My proposal doesn't make valid the example above. To make that example valid, you'd need: long int strtol(const char *nptr, char **endptr, int base); Because in the example above, you're aliasing nptr with endptr, not with *endptr. Thus, endptr cannot be a restricted pointer for that example to be valid. [... snip ...] I'm not sure I understood that part, but it's probably a consequence of the misunderstanding from above. Let's ignore it for now, and please resend if you think it's still a concern. > > > Maybe I should use abstract names for the objects, to avoid confusing > > them with the pointer variables that are used to pass them? > > That might help, yes, since v0.2 is unclear on this point. Ok; will do. > > this formal > > definition is quite unreadable, though. The more I read it, the less > > sure I am about it. > > Yes, it’s lovely isn’t it?
> One must understand what the C committee
> intended in order to read and understand that part of the standard. :-)

> > If L is used to access the value of the object X that it
> > designates, and X is also modified (by any means), then the
> > following requirements apply: T shall not be const-qualified
> >
> > This reads to me as "const variables are not writable when they are
> > accessed via a restricted pointer; casting away is not enough". Am I
> > reading this correctly?
>
> In that quoted statement, the restricted pointer is not allowed to be
> pointer-to-const. However, I’m not quite sure what your question means, as
> the phrase “const variables” does not appear in the standard. Perhaps give
> an example to clarify the question?

I should have said

"An object pointed to by a pointer-to-const cannot be written if the pointer
is a restricted one; casting const away is not enough."

Is this interpretation of restrict correct?

> >> an implementation is allowed to set errno = EINVAL first thing, and then set
> >> errno to some other nonzero value if it determines that the arguments are
> >> valid. I wouldn't implement strtol that way, but I can see where someone
> >> else might do that.
> >
> > In any case an implementation is not obliged to pessimize strtol(3). It
> > is only allowed to. Should we not allow them to do so?
>
> Of course the standard should allow suboptimal implementations. However, I’m
> not sure what the point of the question is. The “errno = EINVAL first thing”
> comment says that removing ‘restrict’ obliges the implementation to support
> obviously-bogus calls like strtol(&errno, ...), which might make the
> implementation less efficient.
See for example how musl implements strtol(3):

    $ grepc strtox src/stdlib/strtol.c
    src/stdlib/strtol.c:static unsigned long long strtox(const char *s, char **p, int base, unsigned long long lim)
    {
        FILE f;
        sh_fromstring(&f, s);
        shlim(&f, 0);
        unsigned long long y = __intscan(&f, base, 1, lim);
        if (p) {
            size_t cnt = shcnt(&f);
            *p = (char *)s + cnt;
        }
        return y;
    }

The work is done within __intscan(), which could be prototyped as

    hidden unsigned long long __intscan(FILE *restrict, unsigned, int, unsigned long long);

And now you're able to optimize internally, since thanks to that helper
function you know it doesn't alias errno, regardless of the external API.

BTW, now I remember that strtol(3) says:

    ERRORS
        This function does not modify errno on success.

Which means that setting errno at function start wouldn't make much sense.
Although there's probably a contrived way of doing it while still being
conformant (plus, I think ISO C doesn't say that about errno).

> I don’t see how the question is relevant to
> that comment.
>
>
> > Let's take a simpler one: rename(2). Is it allowed to receive &errno?
> > Hopefully not.
>
> I agree with that hope, but the current C standard seems to allow it. I
> think we both agree this is a defect in the standard.

Yup. :)

> >>>> Why is this change worth
> >>>> making? Real-world programs do not make calls like that.
> >>>
> >>> Because it makes analysis of 'restrict' more consistent. The obvious
> >>> improvement of GCC's analyzer to catch restrict violations will trigger
> >>> false positives in normal uses of strtol(3).
> >>
> >> v0.2 does not support this line of reasoning. On the contrary, v0.2 suggests
> >> that a compiler should diagnose calls like "strtol(p, &p, 0)", which would
> >> be wrong as that call is perfectly reasonable.
> >
> > That call is perfectly, reasonable, which is why I suggest that the
> > standard should modify the prototype so that strtol(p, &p, 0), which is
> > a reasonable call, should not be warned by a compiler that would
> > diagnose such calls.
>
> Of course they shouldn’t warn. But where are these compilers?
>
> v0.2 asserts that “An analyzer more powerful than the current ones could
> extend the current -Wrestrict diagnostic to also diagnose this case.” But
> why would an analyzer want to do that? v0.2 doesn’t say.

True.

> The proposal merely asks to change prototypes for the C standard functions
> strtol, strtoul, etc. But if that is the only change needed then why bother?
> C compilers already do special-case analysis for functions defined by the C
> standard, and they can suppress undesirable diagnostics for these special
> cases.
>
> If you’ve identified a more general problem with ‘restrict’ then welcome to
> the club! The experts already know it’s confusing and limited, and are
> discussing about whether and how to improve things in the next C standard. I
> am sure you’d be welcome to those discussions.

Thanks! I'm thinking I'll drop my proposal and redirect it toward replacing
restrict with something better.

> > That is, just by reading the prototypes:
> >
> >     void foo(int *restrict x, int **restrict p);
> >
> > and
> >
> >     void bar(int *x, int **restrict endp);
> >
> > one should be able to determine that
> >
> >     foo(p, &p);
> >
> > is probably causing UB (and thus trigger a warning) but
> >
> >     bar(p, &p);
> >
> > is fine.
>
> Sure, but this is a discussion we should be having with the compiler
> writers, no?
>
> Is this the main motivation for the proposal?

Yep.

> If so, how would weakening the
> spec for strtol etc. affect that discussion with the compiler writers? v0.2
> does not make this clear.
> >
> >> Another way to put it: v0.2 does not clearly state the advantages of the
> >> proposed change, and in at least one area what it states as an advantage
> >> would actually be a disadvantage.
> >
> > The advantage is having more information in the caller. As a caller, I
> > want to distinguish calls where it's ok to pass pointers that alias, and
> > where not. And I want my compiler to be able to help me there.
>
> I’m still not understanding. Removing ‘restrict’ from strtol’s first arg
> gives the caller less information, not more.

Actually, the caller seems to have perfect information about strtol(3),
regardless of restrict. (As long as strtol(3) uses gnu access attributes.)
However, in this paragraph I was not talking about strtol(3) in particular,
but in general: if a caller can know whether two arguments to a function are
allowed to alias just by seeing the uses of restrict in the prototype, it
can turn on strict diagnostics about it to catch UB.

> > I'd rather have a simple analyzer, which will provide for
> > less false positives and negatives.
>
> The C committee appears to have the opposite opinion, as when they were
> asked about this matter they added Examples 5 through 7 to what is now
> §6.7.4.2 (Formal definition of restrict). These examples say that Example 2
> (which uses ‘restrict’ on all arguments) is the simplest and most effective
> way to use ‘restrict’, even though a smarter compiler can still make some
> good inferences when some pointer args are ‘restrict’ and others are merely
> pointers to const.
>
> If the proposal is disagreeing with Examples 5 through 7, this point needs
> to be thoroughly discussed in the proposal.

My thinking now is that restrict is a dead end, and must be replaced by
something better.

> > GCC can only catch the most obvious violations of restrict.
>
> Yes, but I fail to see how changing the API for strtol etc. would improve
> that situation.
> >
> >     #include <string.h>
> >
> >     typedef struct {
> >         int x;
> >     } T;
> >
> >     [[gnu::access(read_only, 1)]]
> >     [[gnu::access(read_only, 2)]]
> >     void
> >     replace(T *restrict *restrict ls, const T *restrict new, size_t pos)
> >     {
> >         memcpy(ls[pos], new, sizeof(T));
> >     }
> >
> >     void
> >     f(T *restrict *restrict ls)
> >     {
> >         replace(ls, ls[0], 1);
> >     }
> >
> >     $ gcc-14 -Wall -Wextra -fanalyzer replace.c -S
> >     $
> >
> > The above program causes UB,
>
> It’s not a complete program and I don’t see the undefined behavior.

I should have said s/program/code/

> If
> behavior is undefined because it violates the [[gnu::access(...)]]
> restrictions,

It does not violate the gnu::access restrictions. It actually only reads the
objects pointed to by ls and new. It is the object pointed to by *ls that is
written to, but that's fine.

When I wrote it, I was thinking that the behavior was undefined because the
object pointed to by *ls is aliased by the object pointed to by new.
However, it is not UB; I forgot that restrict doesn't care if the pointer
aliases; it only cares if an access does alias, which does not happen.

Let's s/0/1/ in that code to make it UB: with that change, my code is UB.
I'd like a substitute for restrict to reject that code because both new and
ls are derived from the same pointer in the caller. That is, I'd like
passing two references to the same object to be UB, via some attribute,
regardless of accesses. More or less what Rust does, but opt-in in a
controlled way.

> that is not the sort of example that would convince the C
> standardization committee; they’d want to see a standard C program.
>
> I tried to write a standard C program to illustrate the issue, and came up
> with the following.

[...]

Have a lovely day!
Alex
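Two points from the message above can be sketched together in C: the `f(p, &p)` calling idiom, where the string pointer aliases `*endp` but not `endp`, and the musl-style split between a public entry point and an internal helper that keeps `restrict`. The names `scan_decimal` and `parse_decimal` are hypothetical and are not musl or glibc code; the parser handles only unsigned decimal digits and does no overflow checking. Whether a compiler actually optimizes better because of the helper's qualifiers is not something this sketch demonstrates.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical internal helper (not musl's code).  Both pointers are
 * restrict-qualified: the helper reaches the string only through s
 * and the counter only through consumed.
 */
static unsigned long
scan_decimal(const char *restrict s, size_t *restrict consumed)
{
	unsigned long v = 0;
	size_t i = 0;

	assert(s != NULL && consumed != NULL);

	while (isspace((unsigned char) s[i]))
		i++;

	size_t start = i;

	while (isdigit((unsigned char) s[i])) {
		v = v * 10 + (unsigned long) (s[i] - '0');  /* no overflow check */
		i++;
	}
	*consumed = (i == start) ? 0 : i;  /* no digits: consume nothing */
	return v;
}

/*
 * Hypothetical public entry point, shaped like the prototype proposed
 * in the thread: no restrict on str, restrict kept on endp.
 */
unsigned long
parse_decimal(const char *str, char **restrict endp)
{
	size_t n;
	unsigned long v = scan_decimal(str, &n);

	if (endp != NULL)
		*endp = (char *) str + n;
	return v;
}
```

A caller can then write `a = parse_decimal(p, &p); b = parse_decimal(p, &p);` on a buffer holding "10 20". Inside the callee the object `p` is accessed only through `endp`, while `str` is merely a copy of its value.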
Hi Daniel, On Sun, Jul 07, 2024 at 03:46:48PM GMT, Daniel Plakosh wrote: > Alex, > > Your document number is below: > > n3294 - strtol(3) et al. shouldn't have a restricted first parameter > > Please return the updated document with this number Am I allowed to retitle the paper? n3294 - [[noalias()]] function attribute as a replacement of restrict Sorry for any inconveniences. Thanks, Alex > > Best regards, > > Dan > > Technical Director - Enabling Mission Capability at Scale > Principal Member of the Technical Staff > Software Engineering Institute > Carnegie Mellon University > 4500 Fifth Avenue > Pittsburgh, PA 15213 > WORK: 412-268-7197 > CELL: 412-427-4606 > > -----Original Message----- > From: Alejandro Colomar <alx@kernel.org> > Sent: Friday, July 5, 2024 3:42 PM > To: dplakosh@cert.org > Cc: Martin Uecker <muecker@gwdg.de>; Jonathan Wakely <jwakely.gcc@gmail.com>; Xi Ruoyao <xry111@xry111.site>; Jakub Jelinek <jakub@redhat.com>; libc-alpha@sourceware.org; gcc@gcc.gnu.org; Paul Eggert <eggert@cs.ucla.edu>; linux-man@vger.kernel.org; LIU Hao <lh_mouse@126.com>; Richard Earnshaw <Richard.Earnshaw@arm.com>; Sam James <sam@gentoo.org> > Subject: [WG14] Request for document number; strtol restrictness > > Hi, > > I have a paper for removing restrict from the first parameter of > strtol(3) et al. The title is > > strtol(3) et al. should’t have a restricted first parameter. > > If it helps, I already have a draft of the paper, which I attach (both the PDF, and the man(7) source). > > Cheers, > Alex > > -- > <https://www.alejandro-colomar.es/>
Alejandro, Sure please remind me when you submit Best regards, Dan Technical Director - Enabling Mission Capability at Scale Principal Member of the Technical Staff Software Engineering Institute Carnegie Mellon University 4500 Fifth Avenue Pittsburgh, PA 15213 WORK: 412-268-7197 CELL: 412-427-4606 -----Original Message----- From: Alejandro Colomar <alx@kernel.org> Sent: Tuesday, July 09, 2024 3:00 PM To: Daniel Plakosh <dplakosh@sei.cmu.edu> Cc: dplakosh@cert.org; Martin Uecker <muecker@gwdg.de>; Jonathan Wakely <jwakely.gcc@gmail.com>; Xi Ruoyao <xry111@xry111.site>; Jakub Jelinek <jakub@redhat.com>; libc-alpha@sourceware.org; gcc@gcc.gnu.org; Paul Eggert <eggert@cs.ucla.edu>; linux-man@vger.kernel.org; LIU Hao <lh_mouse@126.com>; Richard Earnshaw <Richard.Earnshaw@arm.com>; Sam James <sam@gentoo.org> Subject: Re: [WG14] Request for document number; strtol restrictness [...]
Here's a proposal for adding a function attribute for replacing the
restrict qualifier. It's v0.3 of n3294 (now we have a document number).

I was going to name it [[noalias()]], but I thought that it would be
possible to mark several pointers as possibly referencing the same object,
and then the name [[restrict()]] made more sense.

It's based on a proposal I sent to Martin recently in this discussion.

Do you have any feedback for this? I've attached the man(7) source and the
resulting PDF, and below goes a plain text rendering (formatting is lost).

Have a lovely night!
Alex

---

N3294 (WG14)                 Proposal for C2y                 N3294 (WG14)

Name
       n3294 - The [[restrict()]] function attribute as a replacement of
       the restrict qualifier

Category
       Feature and deprecation.

Author
       Alejandro Colomar Andres; maintainer of the Linux man-pages project.

Cc
       GNU C library
       GNU Compiler Collection
       Linux man-pages
       Paul Eggert
       Xi Ruoyao
       Jakub Jelinek
       Martin Uecker
       LIU Hao
       Jonathan Wakely
       Richard Earnshaw
       Sam James
       Emanuele Torre
       Ben Boeckel
       "Eissfeldt, Heiko"
       David Malcolm

Description
   restrict qualifier
       The restrict qualifier is not useful for diagnostics. Being defined
       in terms of accesses, the API is not enough for a caller to know
       what the function will do with the objects it receives. That is, a
       caller cannot know if the following call is correct:

           void f(const int *restrict a, int **restrict b);

           f(a, &a);

       Having no way to determine if a call will result in Undefined
       Behavior makes it a dangerous qualifier.

       The reader might notice that this prototype and call are very
       similar to the prototype of strtol(3), and that the call resembles a
       relatively common use of that function.

   Diagnostics
       A good replacement of the restrict qualifier should make it possible
       to specify in the API of the following function that it doesn’t
       accept pointers that alias.
           void
           replace(const T *restrict new, T **restrict ls, size_t pos)
           {
               memcpy(ls[pos], new, sizeof(T));
           }

       This proposal suggests the following:

           [[restrict(1)]] [[restrict(2)]]
           void replace(const T *restrict new, T **restrict ls, size_t pos);

           replace(arr[3], arr, 2);  // UB; can be diagnosed

   Qualifiers
       It is also unfortunate that restrict is a qualifier, since it
       doesn’t follow the rules that apply to all other qualifiers. While
       it is discarded easily, its semantics make it as if it couldn’t be
       discarded.

   Function attribute
       The purpose of restrict is to

       •  Allow functions to optimize based on the knowledge that certain
          objects are not accessed by any other object in the same scope;
          usually a function boundary, which is the most opaque boundary,
          and where this information is not otherwise available.

       •  Diagnose calls that would result in Undefined Behavior under this
          memory model.

       Qualifiers don’t seem to be good for carrying this information, but
       function attributes are precisely for adding information that cannot
       be expressed by just using the type system.

       An attribute would need to be more strict than the restrict
       qualifier to allow diagnosing non-trivial cases, such as the call
       shown above.

       A caller only knows what the callee receives, not what it does with
       it. Thus, for diagnostics to work, the semantics of a function
       attribute should be specified in terms of what a function is allowed
       to receive.

   [[restrict]]
       The [[restrict]] function attribute specifies that the pointer to
       which it applies is the only reference to the array object to which
       it points (except that a pointer to one past the last element may
       overlap another object).

       If the number of elements is specified with array notation or a
       compiler-specific attribute, the array object to be considered is a
       subobject of the original array object, which is limited by the
       number of elements specified in the function prototype.
       For the following prototype:

           [[restrict(1)]] [[restrict(2)]]
           void add_inplace(size_t n, int a[n], const int b[n]);

       In the following calls, the caller is able to determine with
       certainty if the behavior is defined or undefined:

           char a[100] = ...;
           char b[50] = ...;

           add_inplace(50, a, a + 50);  // Ok
           add_inplace(50, a, b);       // Ok
           add_inplace(50, a, a);       // UB

       In the first of the three calls, the parameters don’t alias inside
       the function, since the subobjects of 50 elements do not overlap
       each other, even though they are one single array object to the
       outer function.

   Optimizations
       This function attribute allows optimizations similar to those
       allowed by the restrict qualifier.

       strtol(3)
              In some cases, such as the strtol(3) function, the prototype
              will be different, since this attribute is stricter than
              restrict, and can’t be applied to the same parameters.

              For example, the prototype for strtol(3) would be

                  [[restrict(2)]]
                  long
                  strtol(const char *str, char **endp, int base);

              This could affect optimizations, since now it’s not clear to
              the implementation that str is not modified by any other
              reference. Compiler-specific attributes can help with that.
              For example, the [[gnu::access()]] attribute can be used in
              this function to give more information:

                  [[restrict(2)]]
                  [[gnu::access(read_only, 1)]]
                  [[gnu::access(write_only, 2)]]
                  long
                  strtol(const char *str, char **endp, int base);

              The fact that endp is write-only lets the callee deduce that
              *endp cannot be used to write to the string (since the callee
              is not allowed to inspect *endp).

              Another concern is that a global variable such as errno
              might alias the string. This is already a concern in several
              ISO C calls, such as rename(2). But in the case of
              strtol(3), it would be a regression. There are ways to
              overcome that, such as designing helper functions in a way
              that the attribute can be applied to add extra information.
       It is important that diagnostics are easy to determine, to avoid
       false negatives and false positives, so that code can easily be made
       safe. Optimizations, while important, need not be as easy to apply
       as diagnostics. If an implementation wants to be optimal, it will do
       the extra work for being fast.

   Multiple aliasing pointers
       In some cases, it might be useful to allow specifying that some
       pointers may alias each other, but not others.

   Strings
       Another way to determine that str cannot be aliased by any other
       object such as errno would be to use an attribute that marks str as
       a string. An object of type int shouldn’t be allowed to represent a
       string, so regardless of character types being allowed to alias any
       other type, an attribute such as
       [[gnu::null_terminated_string_arg()]] might be used to determine
       that the global errno does not alias the string.

   Deprecation
       The restrict qualifier would be deprecated by this attribute,
       similar to how the noreturn function specifier was superseded by
       the [[noreturn]] function attribute.

   Backwards compatibility
       Removing the restrict qualifier from function prototypes does not
       cause problems in most functions. Only functions with restrict
       applied to a pointee would have incompatible definitions. The only
       standard functions where this would happen are:

           tmpfile_s()
           fopen_s()
           freopen_s()

       Those functions are not widely adopted, so the problem would likely
       be minimal.

Proposal
   6.7.13.x The restrict function attribute
       Constraints
           The restrict attribute shall be applied to a function.

           A 1-based index can be specified in an attribute argument
           clause, to associate the attribute with the corresponding
           parameter of the function, which must be of a pointer type.

           (Optional.) Several indices can be specified, separated by
           commas.

           The attribute can be applied several times to the same
           function, to mark several parameters with the attribute.

           (Optional.)
           The argument attribute clause may be omitted, which is
           equivalent to specifying the attribute once for each parameter
           that is a pointer.

       Semantics
           If a function is defined with the restrict attribute, the
           corresponding parameter shall be the only reference to the
           array object that it points to. If the function receives
           another reference to the same array object, the behavior is
           undefined. If the function accesses the array object through an
           lvalue that is not derived from that pointer, the behavior is
           undefined.

           (Optional.) If more than one parameter is specified in the same
           attribute argument clause, then all of those pointers are
           allowed to point to the same array object.

           If the number of elements is specified with array notation (or
           a compiler-specific attribute), the array object to be
           considered for aliasing is a sub-object of the original array
           object, limited by the number of elements specified [1].

           [1] For the following prototype:

                   [[restrict(1)]] [[restrict(2)]]
                   void f(size_t n, int a[n], const int b[n]);

               In the following calls, the caller is able to determine if
               the behavior is defined or undefined:

                   char a[100] = /*...*/;
                   char b[50] = /*...*/;

                   f(50, a, a + 50);  // Ok
                   f(50, a, b);       // UB; a diagnostic is recommended
                   f(50, a, a + 2);   // UB; a diagnostic is recommended

History
       Revisions of this paper:

       0.1    Original draft for removing restrict from the first
              parameter of strtol(3).

       0.2    Incorporate feedback from glibc and gcc mailing lists.

       0.3    Re-purpose, to deprecate restrict and propose [[restrict()]]
              instead.

See also
       The original discussion about restrict and strtol(3).

ISO/IEC 9899                    2024-07-09                    N3294 (WG14)
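The proposed [[restrict()]] attribute does not exist in any compiler, so it cannot be shown directly. What can be sketched is the caller-side reasoning that the Description relies on for the add_inplace() calls: two n-element subarrays either overlap or they do not, and that can be decided from the arguments alone. The helper name `subarrays_overlap` is hypothetical. It uses `int` arrays to match the prototype, and it compares addresses as `uintptr_t`, which assumes a flat address space, because relational comparison of pointers into different objects is undefined in ISO C.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: report whether the n-element subarrays starting
 * at a and b share any element.  Addresses are compared as integers,
 * which is implementation-defined but avoids comparing unrelated
 * pointers with < directly.
 */
static bool
subarrays_overlap(const int *a, const int *b, size_t n)
{
	assert(a != NULL && b != NULL);

	uintptr_t a0 = (uintptr_t) a;
	uintptr_t b0 = (uintptr_t) b;
	uintptr_t len = (uintptr_t) (n * sizeof(int));

	/* Half-open ranges [a0, a0+len) and [b0, b0+len) intersect. */
	return a0 < b0 + len && b0 < a0 + len;
}
```

With `int a[100]` and `int b[50]`, the pairs `(a, a + 50)` and `(a, b)` do not overlap for n = 50, while `(a, a)` and `(a, a + 2)` do. That is the distinction the Description draws for add_inplace().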
On Wed, 10 Jul 2024, Alejandro Colomar via Gcc wrote: > 6.7.13.x The restrict function attribute > Constraints > The restrict attribute shall be applied to a function. > > A 1‐based index can be specified in an attribute argument > clause, to associate the attribute with the corresponding > parameter of the function, which must be of a pointer type. It's more appropriate to say "shall", and you need a requirement for the pointer to be a pointer to a complete object type (it makes no sense with function pointers, or void). That is, something like "If an attribute argument clause is present, it shall have the form: ( constant-expression ) The constant expression shall be an integer constant expression with positive value. It shall be the index, counting starting from 1, of a function parameter whose type is a pointer to a complete object type.". (That might not quite be sufficient - there are the usual questions of exactly *when* the type needs to be complete, if it's completed part way through the function definition, but the standard already doesn't tend to specify such things very precisely.) > (Optional.) The argument attribute clause may be omitted, > which is equivalent to specifying the attribute once for > each parameter that is a pointer. For each parameter that is a pointer to a complete object type, or should there be a constraint violation in this case if some are pointers to such types and some are pointers to other types? > If the number of elements is specified with array notation > (or a compiler‐specific attribute), the array object to be > considered for aliasing is a sub‐object of the original ar‐ > ray object, limited by the number of elements specifiedr > [1]. This is semantically problematic in the absence of something like N2906 (different declarations could use different numbers of elements), and even N2906 wouldn't help for the VLA case. 
> [1] For the following prototype: > > [[restrict(1)]] [[restrict(2)]] > void f(size_t n, int a[n], const int b[n]); That declaration currently means void f(size_t n, int a[*], const int b[*]); (that is, the expression giving a VLA size is ignored). It's equivalent to e.g. void f(size_t n, int a[n + foo()], const int b[n + bar()]); where because the size expressions are never evaluated and there's no time defined for evaluation, it's far from clear what anything talking about them giving an array size would even mean. I know that "noalias" was included in some C89 drafts but removed from the final standard after objections. Maybe someone who was around then could explain what "noalias" was, what the problems with it were and how it differs from "restrict", so we can make sure that any new proposals in this area don't suffer from whatever the perceived deficiencies of "noalias" were?
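The adjustment described above can be observed from inside a function. The following is a minimal sketch with a hypothetical function name: it shows only that a parameter declared as `int a[n]` has type `int *` in the function body, so the bound carries no type information. It does not demonstrate that the size expression is never evaluated.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical probe: the parameter is written with array notation,
 * but its type is adjusted to int *, so &a has type int **.
 * _Generic selects on that type at compile time.
 */
static int
param_is_plain_pointer(size_t n, int a[n])
{
	assert(a != NULL);
	(void) n;  /* the bound plays no role in the parameter's type */

	return _Generic(&a, int **: 1, default: 0);
}
```

Calling `param_is_plain_pointer(3, arr)` with `int arr[3]` yields 1: nothing in the type of `a` records that three elements were promised.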
At 2024-07-26T16:24:14+0000, Joseph Myers wrote: > I know that "noalias" was included in some C89 drafts but removed from > the final standard after objections. Maybe someone who was around > then could explain what "noalias" was, what the problems with it were For this part, I think the source most often cited is Dennis Ritchie's thunderbolt aimed directly at "noalias". https://www.lysator.liu.se/c/dmr-on-noalias.html > and how it differs from "restrict", I can only disqualify myself as an authority here. > To comprehensively address this demands so we can make sure that any > new proposals in this area don't suffer from whatever the perceived > deficiencies of "noalias" were? I think it would be valuable to get such a discussion into the rationale of the next C standard. Regards, Branden
On 7/26/24 09:24, Joseph Myers wrote: > Maybe someone who was around then could > explain what "noalias" was, what the problems with it were and how it > differs from "restrict" You can get a hint by reading Dennis Ritchie's 1988 email with the unforgettable bottom line "Noalias must go. This is non-negotiable." https://www.lysator.liu.se/c/dmr-on-noalias.html ... and this is partly why I haven't read Alejandro's proposal. Fiddling with 'restrict' should be done only very carefully and only if there are really important advantages to messing around with it.
Hi Branden! On Fri, Jul 26, 2024 at 11:35:51AM GMT, G. Branden Robinson wrote: > At 2024-07-26T16:24:14+0000, Joseph Myers wrote: > > I know that "noalias" was included in some C89 drafts but removed from > > the final standard after objections. Maybe someone who was around > > then could explain what "noalias" was, what the problems with it were > > For this part, I think the source most often cited is Dennis Ritchie's > thunderbolt aimed directly at "noalias". > > https://www.lysator.liu.se/c/dmr-on-noalias.html Thanks! It seems Dennis's concern was that it was a qualifier. Probably the reason why restrict ended up being a qualifier on the pointer (and thus easily ignored), instead of the pointee (it would have caused the problems that Dennis mentioned and which anyone can guess). Since I'm suggesting an attribute, we are pretty much safe from type rules, and thus safe from Dennis's concerns, I think. Have a lovely night! Alex > > > and how it differs from "restrict", > > I can only disqualify myself as an authority here. > > > To comprehensively address this demands so we can make sure that any > > new proposals in this area don't suffer from whatever the perceived > > deficiencies of "noalias" were? > > I think it would be valuable to get such a discussion into the rationale > of the next C standard. > > Regards, > Branden
Hi Joseph, On Fri, Jul 26, 2024 at 04:24:14PM GMT, Joseph Myers wrote: > On Wed, 10 Jul 2024, Alejandro Colomar via Gcc wrote: > > > 6.7.13.x The restrict function attribute > > Constraints > > The restrict attribute shall be applied to a function. > > > > A 1‐based index can be specified in an attribute argument > > clause, to associate the attribute with the corresponding > > parameter of the function, which must be of a pointer type. > > It's more appropriate to say "shall", and you need a requirement for the > pointer to be a pointer to a complete object type (it makes no sense with > function pointers, or void). I don't see why it should not apply to void*. memcpy(3) should get this attribute: [[alx::restrict(1)]] [[alx::restrict(2)]] void *memcpy(void *dst, const void *src, size_t n); The index to which the text above refers is that '(1)' and '(2)'. > That is, something like "If an attribute > argument clause is present, it shall have the form: > > ( constant-expression ) > > The constant expression shall be an integer constant expression with > positive value. It shall be the index, counting starting from 1, of a > function parameter whose type is a pointer to a complete object type.". > > (That might not quite be sufficient - there are the usual questions of > exactly *when* the type needs to be complete, if it's completed part way > through the function definition, but the standard already doesn't tend to > specify such things very precisely.) > > > (Optional.) The argument attribute clause may be omitted, > > which is equivalent to specifying the attribute once for > > each parameter that is a pointer. > > For each parameter that is a pointer to a complete object type, or should > there be a constraint violation in this case if some are pointers to such > types and some are pointers to other types? 
> > > If the number of elements is specified with array notation > > (or a compiler‐specific attribute), the array object to be > > considered for aliasing is a sub‐object of the original ar‐ > > ray object, limited by the number of elements specifiedr > > [1]. > > This is semantically problematic in the absence of something like N2906 > (different declarations could use different numbers of elements), Agree. I think arrays should be fixed in C. n2906 is a good step towards that. Thanks Martin! :) BTW, the author of n2529 didn't follow up, right? I'd like that in, so I'll prepare something after n2906 is merged. Martin, would you mind pinging me about it? For what this [[alx::restrict]] proposal is concerned, I'd wait after n2906 is merged for proposing that extension. > and even > N2906 wouldn't help for the VLA case. I'd basically propose that [3] or [n] means the same as [static 3] and [static n], except for the nonnull implications of static. Is there any such paper? I'm interested in presenting one for that. Maybe it would also be interesting to wait after n2906 for that too. > > [1] For the following prototype: > > > > [[restrict(1)]] [[restrict(2)]] > > void f(size_t n, int a[n], const int b[n]); > > That declaration currently means > > void f(size_t n, int a[*], const int b[*]); Yeah, that should be fixed in the standard. I'll keep that extension of restrict out of a proposal until array parameters are fixed in that regard. > (that is, the expression giving a VLA size is ignored). It's equivalent > to e.g. > > void f(size_t n, int a[n + foo()], const int b[n + bar()]); > > where because the size expressions are never evaluated and there's no time > defined for evaluation, it's far from clear what anything talking about > them giving an array size would even mean. Yup. > I know that "noalias" was included in some C89 drafts but removed from the > final standard after objections. 
Maybe someone who was around then could > explain what "noalias" was, what the problems with it were and how it > differs from "restrict", so we can make sure that any new proposals in > this area don't suffer from whatever the perceived deficiencies of > "noalias" were? As I said in reply to Branden's response, it seems Dennis's concern was that the noalias proposal was a qualifier, which admittedly makes little sense (very much like the problems restrict has, but applied to the pointee, which makes them much worse). That in fact led me recently to think that an _Optional qualifier (similar to Clang's _Nullable) as is being proposed at the moment in n3222 is similarly DOA. Those qualities of pointers are attributes, which cannot be specified in the type system. Have a lovely night! Alex > > -- > Joseph S. Myers > josmyers@redhat.com
On Fri, 26 Jul 2024, Alejandro Colomar via Gcc wrote:

> I don't see why it should not apply to void*. memcpy(3) should get this
> attribute:
>
>     [[alx::restrict(1)]]
>     [[alx::restrict(2)]]
>     void *memcpy(void *dst, const void *src, size_t n);

That would disallow copying between disjoint subarrays within the same
toplevel object (and there's no way to specify an array size for void *),
which hardly seems right.

> BTW, the author of n2529 didn't follow up, right? I'd like that in, so
> I'll prepare something after n2906 is merged. Martin, would you mind
> pinging me about it?

See reflector message SC22WG14.18575, 17 Nov 2020 (the former convenor
replying when I asked about just that paper). As far as I know the author
has not yet provided an updated version / asked for it to be added to a
meeting agenda.
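
The objection about disjoint subarrays can be illustrated with a short sketch. The function name copy_first_half_to_second is hypothetical. Both memcpy(3) arguments point into the same top-level array, but the two regions do not overlap, so the call is valid under the current rules.

```c
#include <string.h>

/* Hypothetical helper, for illustration only.  Copies the first half
 * of a 4-element array over its second half.  'buf + 2' and 'buf'
 * point into the same top-level object, but the two 2-element regions
 * are disjoint, so this memcpy() call is valid today. */
int copy_first_half_to_second(void)
{
	int buf[4] = {1, 2, 0, 0};

	memcpy(buf + 2, buf, 2 * sizeof(buf[0]));

	/* buf is now {1, 2, 1, 2} */
	return buf[2] * 10 + buf[3];
}
```

A rule that treats the whole top-level object as the unit of aliasing would reject this call, even though no byte is both read and written.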
Hi Joseph,

On Fri, Jul 26, 2024 at 08:30:33PM GMT, Joseph Myers wrote:
> On Fri, 26 Jul 2024, Alejandro Colomar via Gcc wrote:
>
> > I don't see why it should not apply to void*. memcpy(3) should get this
> > attribute:
> >
> >     [[alx::restrict(1)]]
> >     [[alx::restrict(2)]]
> >     void *memcpy(void *dst, const void *src, size_t n);
>
> That would disallow copying between disjoint subarrays within the same
> toplevel object (and there's no way to specify an array size for void *),
> which hardly seems right.

Hmmm, I sometimes forget that ISO C is so painful about void. Has WG14
discussed in the past about the GNU extension that defines
sizeof(void) == 1?

Maybe wording that also considers compiler-specific attributes and
extensions would allow for the following:

    [[gnu::access(write_only, 1, 3)]]
    [[gnu::access(read_only, 2, 3)]]
    [[alx::restrict(1)]]
    [[alx::restrict(2)]]
    void *memcpy(void *dst, const void *src, size_t n);

The GNU attribute specifies the number of elements of the subarrays, and
the GNU extension sizeof(void)==1 specifies the size of each element.
That gives us the size of the subarrays to be considered for the
restrictness.

So, ISO C wouldn't be allowed to mark malloc(3) as [[alx::restrict]]
(unless they add these GNU extensions), but GNU C could.

> > BTW, the author of n2529 didn't follow up, right? I'd like that in, so
> > I'll prepare something after n2906 is merged. Martin, would you mind
> > pinging me about it?
>
> See reflector message SC22WG14.18575, 17 Nov 2020 (the former convenor
> replying when I asked about just that paper).

Where can I find reflector messages?

> As far as I know the author
> has not yet provided an updated version / asked for it to be added to a
> meeting agenda.

I think you mentioned that to me some time ago. I guess I'll take over
then. I'll ask for a number to propose _Nitems(). And another one to
propose that [n] means the same as [static n] except for the nonnull
property of static.

Have a lovely night!
Alex

>
> --
> Joseph S. Myers
> josmyers@redhat.com
>
>
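
A hedged sketch of how the existing GCC 'access' attribute ties a pointer parameter to a size parameter, which is the part of the idea above that compilers already implement. The wrapper copy_bytes and the macro ACCESS are hypothetical names. The [[alx::restrict]] attribute is only a proposal and is deliberately left out of the code. The attribute is guarded with __has_attribute because not every compiler provides it.

```c
#include <stddef.h>
#include <string.h>

/* GCC's 'access' attribute takes an access mode, the index of the
 * pointer parameter, and the index of the size parameter.  Fall back
 * to nothing on compilers that do not provide it. */
#if defined(__has_attribute)
# if __has_attribute(access)
#  define ACCESS(mode, ptr, size) __attribute__((access(mode, ptr, size)))
# endif
#endif
#ifndef ACCESS
# define ACCESS(mode, ptr, size)
#endif

/* Hypothetical wrapper, for illustration only: 'dst' is written and
 * 'src' is read, each for 'n' bytes. */
ACCESS(write_only, 1, 3)
ACCESS(read_only, 2, 3)
void *copy_bytes(void *dst, const void *src, size_t n);

void *copy_bytes(void *dst, const void *src, size_t n)
{
	return memcpy(dst, src, n);
}
```

With sizes attached this way, a restrict-like rule could in principle be limited to the n-byte regions rather than to whole top-level objects, which is what the message above suggests.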
On Fri, 26 Jul 2024, Alejandro Colomar via Gcc wrote:

> > See reflector message SC22WG14.18575, 17 Nov 2020 (the former convenor
> > replying when I asked about just that paper).
>
> Where can I find reflector messages?

https://www.open-std.org/jtc1/sc22/wg14/18575

> And another one to propose that [n] means the same as [static n] except
> for the nonnull property of static.

I'm not convinced that introducing extra undefined behavior for things
that have been valid since C89 (which would be the effect of such a change
for any code that passes a smaller array) is a good idea - the general
mood is to *reduce* undefined behavior.
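
A small sketch of the kind of code the objection above refers to. The names first_of and call_with_smaller_array are hypothetical. The call passes a one-element array to a parameter declared with bound 4, which is valid today because the bound is not part of the function type; compilers may still warn about it.

```c
/* Hypothetical function, for illustration only.  The bound in
 * 'int v[4]' is not part of the function's type; the parameter is an
 * 'int *', and only v[0] is accessed. */
int first_of(int v[4])
{
	return v[0];
}

int call_with_smaller_array(void)
{
	int one[1] = {7};

	/* Valid today.  If [4] meant the same as [static 4], this call
	 * would have undefined behavior. */
	return first_of(one);
}
```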
On Fri, Jul 26, 2024 at 09:22:42PM GMT, Joseph Myers wrote:
> On Fri, 26 Jul 2024, Alejandro Colomar via Gcc wrote:
>
> > > See reflector message SC22WG14.18575, 17 Nov 2020 (the former convenor
> > > replying when I asked about just that paper).
> >
> > Where can I find reflector messages?
>
> https://www.open-std.org/jtc1/sc22/wg14/18575

Thanks!

>
> > And another one to propose that [n] means the same as [static n] except
> > for the nonnull property of static.
>
> I'm not convinced that introducing extra undefined behavior for things
> that have been valid since C89 (which would be the effect of such a change
> for any code that passes a smaller array) is a good idea - the general
> mood is to *reduce* undefined behavior.

While [n] has always _officially_ meant the same as [], it has never
made any sense to write code like that. Unofficially, it has always
meant the obvious thing.

Maybe if GNU C compilers (GCC and Clang) add it first as an extension,
adding diagnostics, it would help.

Does anyone know of any existing code that uses [n] for meaning anything
other than "n elements are available to the function"?

Functions that specify [n] most likely (definitely?) already mean that
n elements are accessed, and thus passing something different than n
elements results in UB one way or another. Having the compiler enforce
that via diagnostics and UB is probably an improvement.

Cheers,
Alex

>
> --
> Joseph S. Myers
> josmyers@redhat.com
>
On Friday, 26.07.2024 at 23:49 +0200, Alejandro Colomar via Gcc wrote:
> On Fri, Jul 26, 2024 at 09:22:42PM GMT, Joseph Myers wrote:
> > On Fri, 26 Jul 2024, Alejandro Colomar via Gcc wrote:
> >
> > > > See reflector message SC22WG14.18575, 17 Nov 2020 (the former convenor
> > > > replying when I asked about just that paper).
> > >
> > > Where can I find reflector messages?
> >
> > https://www.open-std.org/jtc1/sc22/wg14/18575
>
> Thanks!
>
> >
> > > And another one to propose that [n] means the same as [static n] except
> > > for the nonnull property of static.
> >
> > I'm not convinced that introducing extra undefined behavior for things
> > that have been valid since C89 (which would be the effect of such a change
> > for any code that passes a smaller array) is a good idea - the general
> > mood is to *reduce* undefined behavior.
>
> While [n] has always _officially_ meant the same as [], it has never
> made any sense to write code like that. Unofficially, it has always
> meant the obvious thing.
>
> Maybe if GNU C compilers (GCC and Clang) add it first as an extension,
> adding diagnostics, it would help.

Both GCC and Clang already have such diagnostics and/or run-time checks:

https://godbolt.org/z/MPnxqb9h7

Martin

>
> Does anyone know of any existing code that uses [n] for meaning anything
> other than "n elements are available to the function"?
>
> Functions that specify [n] most likely (definitely?) already mean that
> n elements are accessed, and thus passing something different than n
> elements results in UB one way or another. Having the compiler enforce
> that via diagnostics and UB is probably an improvement.
>
> Cheers,
> Alex
>
> >
> > --
> > Joseph S. Myers
> > josmyers@redhat.com
> >
>
On Sat, Jul 27, 2024 at 12:03:20AM GMT, Martin Uecker wrote:
> > Maybe if GNU C compilers (GCC and Clang) add it first as an extension,
> > adding diagnostics, it would help.
>
> Both GCC and Clang already have such diagnostics and/or run-time checks:
>
> https://godbolt.org/z/MPnxqb9h7

Hi Martin,

I guess that's prior art enough to make this UB in ISO C. Is there any
paper for this already? Do any of your papers cover that? Should I
prepare one?

Have a lovely night!
Alex
On Saturday, 27.07.2024 at 00:26 +0200, Alejandro Colomar wrote:
> On Sat, Jul 27, 2024 at 12:03:20AM GMT, Martin Uecker wrote:
> > > Maybe if GNU C compilers (GCC and Clang) add it first as an extension,
> > > adding diagnostics, it would help.
> >
> > Both GCC and Clang already have such diagnostics and/or run-time checks:
> >
> > https://godbolt.org/z/MPnxqb9h7
>
> Hi Martin,
>
> I guess that's prior art enough to make this UB in ISO C. Is there any
> paper for this already? Do any of your papers cover that? Should I
> prepare one?
>

What do you mean by "this"?

Adding UB would likely see a lot of opposition, even where this could
enable run-time checks.

N2906 would make

    int foo(char f[4]);
    int foo(char f[5]);

a constraint violation (although having those types be incompatible
could also cause UB indirectly, this would not be its main effect).

So I think bringing a new version of this paper forward would be
a possible next step, addressing the issues raised in the past.

Martin
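
For comparison with the N2906 behavior described above, a minimal sketch of how the same pair of declarations is treated today. The body given to foo is an assumption made only so the example links. Both declarations currently denote a function taking 'char *', so a translation unit containing both is valid, although compilers may warn about the mismatched bounds.

```c
/* Today these two declarations are compatible: each parameter is
 * adjusted to 'char *', so the bounds 4 and 5 are not compared. */
int foo(char f[4]);
int foo(char f[5]);

/* Assumed body, for illustration only. */
int foo(char f[4])
{
	return f[0];
}
```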
Hi Martin,

On Sat, Jul 27, 2024 at 12:59:34AM GMT, Martin Uecker wrote:
> On Saturday, 27.07.2024 at 00:26 +0200, Alejandro Colomar wrote:
> > On Sat, Jul 27, 2024 at 12:03:20AM GMT, Martin Uecker wrote:
> > > > Maybe if GNU C compilers (GCC and Clang) add it first as an extension,
> > > > adding diagnostics, it would help.
> > >
> > > Both GCC and Clang already have such diagnostics and/or run-time checks:
> > >
> > > https://godbolt.org/z/MPnxqb9h7
> >
> > Hi Martin,
> >
> > I guess that's prior art enough to make this UB in ISO C. Is there any
> > paper for this already? Do any of your papers cover that? Should I
> > prepare one?
> >
>
> What do you mean by "this"?

Adding UB.

> Adding UB would likely see a lot
> of opposition,

But UB allows for safer code. It's the lack of UB that reduces the
quality of diagnostics, which results in worse code.

I understand it will see opposition, so we'd better wait for the path to
be prepared (i.e., n2906 already merged before presenting a paper), but
once that's done, I'd try to add UB.

> even where this could enable run-time checks.

(And build-time too.)

> N2906 would make
>
>     int foo(char f[4]);
>     int foo(char f[5]);
>
> a constraint violation (although having those types be incompatible
> could also cause UB indirectly, this would not be its main effect).
>
> So I think bringing a new version of this paper forward would be
> a possible next step, addressing the issues raised in the past.

Yeah, that would be a good next step. And when the array type is part
of the function type, it'll be easier to convince that [n] can only
mean [n].

Have a lovely day!
Alex

> Martin
>
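
For reference while reading the patch that follows, a minimal sketch of the calling pattern this series is concerned with: one pointer variable is passed as nptr while its address is passed as endptr. The function name sum_two is hypothetical, and error handling is omitted for brevity. Whether this pattern actually conflicts with 'restrict' on nptr is the point debated earlier in the thread.

```c
#include <stdlib.h>

/* Hypothetical helper, for illustration only.  Parses two integers
 * from 'str', reusing 'p' both as the input position and as the
 * object that receives the end pointer. */
long sum_two(char *str)
{
	char *p = str;
	long a, b;

	a = strtol(p, &p, 0);   /* nptr gets a copy of p; endptr points to p */
	b = strtol(p, &p, 0);   /* continues where the first call stopped */

	return a + b;
}
```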
diff --git a/include/stdlib.h b/include/stdlib.h index 0cab3f5b56..c3f61f6891 100644 --- a/include/stdlib.h +++ b/include/stdlib.h @@ -189,32 +189,31 @@ libc_hidden_proto (__arc4random_uniform); extern void __arc4random_buf_internal (void *buffer, size_t len) attribute_hidden; -extern double __strtod_internal (const char *__restrict __nptr, +extern double __strtod_internal (const char *__nptr, char **__restrict __endptr, int __group) __THROW __nonnull ((1)) __wur; -extern float __strtof_internal (const char *__restrict __nptr, +extern float __strtof_internal (const char *__nptr, char **__restrict __endptr, int __group) __THROW __nonnull ((1)) __wur; -extern long double __strtold_internal (const char *__restrict __nptr, +extern long double __strtold_internal (const char *__nptr, char **__restrict __endptr, int __group) __THROW __nonnull ((1)) __wur; -extern long int __strtol_internal (const char *__restrict __nptr, +extern long int __strtol_internal (const char *__nptr, char **__restrict __endptr, int __base, int __group) __THROW __nonnull ((1)) __wur; -extern unsigned long int __strtoul_internal (const char *__restrict __nptr, +extern unsigned long int __strtoul_internal (const char *__nptr, char **__restrict __endptr, int __base, int __group) __THROW __nonnull ((1)) __wur; __extension__ -extern long long int __strtoll_internal (const char *__restrict __nptr, +extern long long int __strtoll_internal (const char *__nptr, char **__restrict __endptr, int __base, int __group) __THROW __nonnull ((1)) __wur; __extension__ -extern unsigned long long int __strtoull_internal (const char * - __restrict __nptr, +extern unsigned long long int __strtoull_internal (const char *__nptr, char **__restrict __endptr, int __base, int __group) __THROW __nonnull ((1)) __wur; @@ -226,33 +225,31 @@ libc_hidden_proto (__strtoll_internal) libc_hidden_proto (__strtoul_internal) libc_hidden_proto (__strtoull_internal) -extern double ____strtod_l_internal (const char *__restrict __nptr, 
+extern double ____strtod_l_internal (const char *__nptr, char **__restrict __endptr, int __group, locale_t __loc); -extern float ____strtof_l_internal (const char *__restrict __nptr, +extern float ____strtof_l_internal (const char *__nptr, char **__restrict __endptr, int __group, locale_t __loc); -extern long double ____strtold_l_internal (const char *__restrict __nptr, +extern long double ____strtold_l_internal (const char *__nptr, char **__restrict __endptr, int __group, locale_t __loc); -extern long int ____strtol_l_internal (const char *__restrict __nptr, +extern long int ____strtol_l_internal (const char *__nptr, char **__restrict __endptr, int __base, int __group, bool __bin_cst, locale_t __loc); -extern unsigned long int ____strtoul_l_internal (const char * - __restrict __nptr, +extern unsigned long int ____strtoul_l_internal (const char *__nptr, char **__restrict __endptr, int __base, int __group, bool __bin_cst, locale_t __loc); __extension__ -extern long long int ____strtoll_l_internal (const char *__restrict __nptr, +extern long long int ____strtoll_l_internal (const char *__nptr, char **__restrict __endptr, int __base, int __group, bool __bin_cst, locale_t __loc); __extension__ -extern unsigned long long int ____strtoull_l_internal (const char * - __restrict __nptr, +extern unsigned long long int ____strtoull_l_internal (const char *__nptr, char ** __restrict __endptr, int __base, int __group, @@ -309,12 +306,12 @@ extern _Float128 __wcstof128_nan (const wchar_t *, wchar_t **, wchar_t); libc_hidden_proto (__strtof128_nan) libc_hidden_proto (__wcstof128_nan) -extern _Float128 __strtof128_internal (const char *__restrict __nptr, +extern _Float128 __strtof128_internal (const char *__nptr, char **__restrict __endptr, int __group); libc_hidden_proto (__strtof128_internal) -extern _Float128 ____strtof128_l_internal (const char *__restrict __nptr, +extern _Float128 ____strtof128_l_internal (const char *__nptr, char **__restrict __endptr, int __group, locale_t 
__loc); diff --git a/include/wchar.h b/include/wchar.h index bf32625736..386f3ebd19 100644 --- a/include/wchar.h +++ b/include/wchar.h @@ -76,28 +76,27 @@ libc_hidden_proto (__isoc23_wcstoull_l) #endif -extern double __wcstod_internal (const wchar_t *__restrict __nptr, +extern double __wcstod_internal (const wchar_t *__nptr, wchar_t **__restrict __endptr, int __group) __THROW; -extern float __wcstof_internal (const wchar_t *__restrict __nptr, +extern float __wcstof_internal (const wchar_t *__nptr, wchar_t **__restrict __endptr, int __group) __THROW; -extern long double __wcstold_internal (const wchar_t *__restrict __nptr, +extern long double __wcstold_internal (const wchar_t *__nptr, wchar_t **__restrict __endptr, int __group) __THROW; -extern long int __wcstol_internal (const wchar_t *__restrict __nptr, +extern long int __wcstol_internal (const wchar_t *__nptr, wchar_t **__restrict __endptr, int __base, int __group) __THROW; -extern unsigned long int __wcstoul_internal (const wchar_t *__restrict __npt, +extern unsigned long int __wcstoul_internal (const wchar_t *__npt, wchar_t **__restrict __endptr, int __base, int __group) __THROW; __extension__ -extern long long int __wcstoll_internal (const wchar_t *__restrict __nptr, +extern long long int __wcstoll_internal (const wchar_t *__nptr, wchar_t **__restrict __endptr, int __base, int __group) __THROW; __extension__ -extern unsigned long long int __wcstoull_internal (const wchar_t * - __restrict __nptr, +extern unsigned long long int __wcstoull_internal (const wchar_t *__nptr, wchar_t ** __restrict __endptr, int __base, @@ -143,7 +142,7 @@ extern unsigned long long int ____wcstoull_l_internal (const wchar_t *, #if __HAVE_DISTINCT_FLOAT128 extern __typeof (wcstof128_l) __wcstof128_l; libc_hidden_proto (__wcstof128_l) -extern _Float128 __wcstof128_internal (const wchar_t *__restrict __nptr, +extern _Float128 __wcstof128_internal (const wchar_t *__nptr, wchar_t **__restrict __endptr, int __group) __THROW; diff --git 
a/manual/arith.texi b/manual/arith.texi index 0742c08ac4..656c0723be 100644 --- a/manual/arith.texi +++ b/manual/arith.texi @@ -2631,7 +2631,7 @@ functions in this section. It is seemingly useless but the @w{ISO C} standard uses it (for the functions defined there) so we have to do it as well. -@deftypefun {long int} strtol (const char *restrict @var{string}, char **restrict @var{tailptr}, int @var{base}) +@deftypefun {long int} strtol (const char *@var{string}, char **restrict @var{tailptr}, int @var{base}) @standards{ISO, stdlib.h} @safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}} @c strtol uses the thread-local pointer to the locale in effect, and @@ -2705,7 +2705,7 @@ case there was overflow. There is an example at the end of this section. @end deftypefun -@deftypefun {long int} wcstol (const wchar_t *restrict @var{string}, wchar_t **restrict @var{tailptr}, int @var{base}) +@deftypefun {long int} wcstol (const wchar_t *@var{string}, wchar_t **restrict @var{tailptr}, int @var{base}) @standards{ISO, wchar.h} @safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}} The @code{wcstol} function is equivalent to the @code{strtol} function @@ -2714,7 +2714,7 @@ in nearly all aspects but handles wide character strings. The @code{wcstol} function was introduced in @w{Amendment 1} of @w{ISO C90}. @end deftypefun -@deftypefun {unsigned long int} strtoul (const char *restrict @var{string}, char **restrict @var{tailptr}, int @var{base}) +@deftypefun {unsigned long int} strtoul (const char *@var{string}, char **restrict @var{tailptr}, int @var{base}) @standards{ISO, stdlib.h} @safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}} The @code{strtoul} (``string-to-unsigned-long'') function is like @@ -2732,7 +2732,7 @@ and an input more negative than @code{LONG_MIN} returns range, or @code{ERANGE} on overflow. 
@end deftypefun -@deftypefun {unsigned long int} wcstoul (const wchar_t *restrict @var{string}, wchar_t **restrict @var{tailptr}, int @var{base}) +@deftypefun {unsigned long int} wcstoul (const wchar_t *@var{string}, wchar_t **restrict @var{tailptr}, int @var{base}) @standards{ISO, wchar.h} @safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}} The @code{wcstoul} function is equivalent to the @code{strtoul} function @@ -2741,7 +2741,7 @@ in nearly all aspects but handles wide character strings. The @code{wcstoul} function was introduced in @w{Amendment 1} of @w{ISO C90}. @end deftypefun -@deftypefun {long long int} strtoll (const char *restrict @var{string}, char **restrict @var{tailptr}, int @var{base}) +@deftypefun {long long int} strtoll (const char *@var{string}, char **restrict @var{tailptr}, int @var{base}) @standards{ISO, stdlib.h} @safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}} The @code{strtoll} function is like @code{strtol} except that it returns @@ -2757,7 +2757,7 @@ appropriate for the sign of the value. It also sets @code{errno} to The @code{strtoll} function was introduced in @w{ISO C99}. @end deftypefun -@deftypefun {long long int} wcstoll (const wchar_t *restrict @var{string}, wchar_t **restrict @var{tailptr}, int @var{base}) +@deftypefun {long long int} wcstoll (const wchar_t *@var{string}, wchar_t **restrict @var{tailptr}, int @var{base}) @standards{ISO, wchar.h} @safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}} The @code{wcstoll} function is equivalent to the @code{strtoll} function @@ -2766,13 +2766,13 @@ in nearly all aspects but handles wide character strings. The @code{wcstoll} function was introduced in @w{Amendment 1} of @w{ISO C90}. 
@end deftypefun -@deftypefun {long long int} strtoq (const char *restrict @var{string}, char **restrict @var{tailptr}, int @var{base}) +@deftypefun {long long int} strtoq (const char *@var{string}, char **restrict @var{tailptr}, int @var{base}) @standards{BSD, stdlib.h} @safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}} @code{strtoq} (``string-to-quad-word'') is the BSD name for @code{strtoll}. @end deftypefun -@deftypefun {long long int} wcstoq (const wchar_t *restrict @var{string}, wchar_t **restrict @var{tailptr}, int @var{base}) +@deftypefun {long long int} wcstoq (const wchar_t *@var{string}, wchar_t **restrict @var{tailptr}, int @var{base}) @standards{GNU, wchar.h} @safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}} The @code{wcstoq} function is equivalent to the @code{strtoq} function @@ -2781,7 +2781,7 @@ in nearly all aspects but handles wide character strings. The @code{wcstoq} function is a GNU extension. @end deftypefun -@deftypefun {unsigned long long int} strtoull (const char *restrict @var{string}, char **restrict @var{tailptr}, int @var{base}) +@deftypefun {unsigned long long int} strtoull (const char *@var{string}, char **restrict @var{tailptr}, int @var{base}) @standards{ISO, stdlib.h} @safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}} The @code{strtoull} function is related to @code{strtoll} the same way @@ -2790,7 +2790,7 @@ The @code{strtoull} function is related to @code{strtoll} the same way The @code{strtoull} function was introduced in @w{ISO C99}. 
@end deftypefun -@deftypefun {unsigned long long int} wcstoull (const wchar_t *restrict @var{string}, wchar_t **restrict @var{tailptr}, int @var{base}) +@deftypefun {unsigned long long int} wcstoull (const wchar_t *@var{string}, wchar_t **restrict @var{tailptr}, int @var{base}) @standards{ISO, wchar.h} @safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}} The @code{wcstoull} function is equivalent to the @code{strtoull} function @@ -2799,13 +2799,13 @@ in nearly all aspects but handles wide character strings. The @code{wcstoull} function was introduced in @w{Amendment 1} of @w{ISO C90}. @end deftypefun -@deftypefun {unsigned long long int} strtouq (const char *restrict @var{string}, char **restrict @var{tailptr}, int @var{base}) +@deftypefun {unsigned long long int} strtouq (const char *@var{string}, char **restrict @var{tailptr}, int @var{base}) @standards{BSD, stdlib.h} @safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}} @code{strtouq} is the BSD name for @code{strtoull}. @end deftypefun -@deftypefun {unsigned long long int} wcstouq (const wchar_t *restrict @var{string}, wchar_t **restrict @var{tailptr}, int @var{base}) +@deftypefun {unsigned long long int} wcstouq (const wchar_t *@var{string}, wchar_t **restrict @var{tailptr}, int @var{base}) @standards{GNU, wchar.h} @safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}} The @code{wcstouq} function is equivalent to the @code{strtouq} function @@ -2814,7 +2814,7 @@ in nearly all aspects but handles wide character strings. The @code{wcstouq} function is a GNU extension. 
@end deftypefun -@deftypefun intmax_t strtoimax (const char *restrict @var{string}, char **restrict @var{tailptr}, int @var{base}) +@deftypefun intmax_t strtoimax (const char *@var{string}, char **restrict @var{tailptr}, int @var{base}) @standards{ISO, inttypes.h} @safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}} The @code{strtoimax} function is like @code{strtol} except that it returns @@ -2830,7 +2830,7 @@ See @ref{Integers} for a description of the @code{intmax_t} type. The @code{strtoimax} function was introduced in @w{ISO C99}. @end deftypefun -@deftypefun intmax_t wcstoimax (const wchar_t *restrict @var{string}, wchar_t **restrict @var{tailptr}, int @var{base}) +@deftypefun intmax_t wcstoimax (const wchar_t *@var{string}, wchar_t **restrict @var{tailptr}, int @var{base}) @standards{ISO, wchar.h} @safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}} The @code{wcstoimax} function is equivalent to the @code{strtoimax} function @@ -2839,7 +2839,7 @@ in nearly all aspects but handles wide character strings. The @code{wcstoimax} function was introduced in @w{ISO C99}. @end deftypefun -@deftypefun uintmax_t strtoumax (const char *restrict @var{string}, char **restrict @var{tailptr}, int @var{base}) +@deftypefun uintmax_t strtoumax (const char *@var{string}, char **restrict @var{tailptr}, int @var{base}) @standards{ISO, inttypes.h} @safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}} The @code{strtoumax} function is related to @code{strtoimax} @@ -2849,7 +2849,7 @@ See @ref{Integers} for a description of the @code{intmax_t} type. The @code{strtoumax} function was introduced in @w{ISO C99}. 
@end deftypefun -@deftypefun uintmax_t wcstoumax (const wchar_t *restrict @var{string}, wchar_t **restrict @var{tailptr}, int @var{base}) +@deftypefun uintmax_t wcstoumax (const wchar_t *@var{string}, wchar_t **restrict @var{tailptr}, int @var{base}) @standards{ISO, wchar.h} @safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}} The @code{wcstoumax} function is equivalent to the @code{strtoumax} function @@ -2939,7 +2939,7 @@ functions in this section. It is seemingly useless but the @w{ISO C} standard uses it (for the functions defined there) so we have to do it as well. -@deftypefun double strtod (const char *restrict @var{string}, char **restrict @var{tailptr}) +@deftypefun double strtod (const char *@var{string}, char **restrict @var{tailptr}) @standards{ISO, stdlib.h} @safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}} @c Besides the unsafe-but-ruled-safe locale uses, this uses a lot of @@ -3075,7 +3075,7 @@ They were introduced in @w{ISO/IEC TS 18661-3} and are available on machines that support the related types; @pxref{Mathematics}. @end deftypefun -@deftypefun double wcstod (const wchar_t *restrict @var{string}, wchar_t **restrict @var{tailptr}) +@deftypefun double wcstod (const wchar_t *@var{string}, wchar_t **restrict @var{tailptr}) @deftypefunx float wcstof (const wchar_t *@var{string}, wchar_t **@var{tailptr}) @deftypefunx {long double} wcstold (const wchar_t *@var{string}, wchar_t **@var{tailptr}) @deftypefunx _FloatN wcstofN (const wchar_t *@var{string}, wchar_t **@var{tailptr}) diff --git a/stdlib/inttypes.h b/stdlib/inttypes.h index cfda146aa9..ba6b903dff 100644 --- a/stdlib/inttypes.h +++ b/stdlib/inttypes.h @@ -355,20 +355,20 @@ extern imaxdiv_t imaxdiv (intmax_t __numer, intmax_t __denom) __THROW __attribute__ ((__const__)); /* Like `strtol' but convert to `intmax_t'. 
*/ -extern intmax_t strtoimax (const char *__restrict __nptr, +extern intmax_t strtoimax (const char *__nptr, char **__restrict __endptr, int __base) __THROW; /* Like `strtoul' but convert to `uintmax_t'. */ -extern uintmax_t strtoumax (const char *__restrict __nptr, +extern uintmax_t strtoumax (const char *__nptr, char ** __restrict __endptr, int __base) __THROW; /* Like `wcstol' but convert to `intmax_t'. */ -extern intmax_t wcstoimax (const __gwchar_t *__restrict __nptr, +extern intmax_t wcstoimax (const __gwchar_t *__nptr, __gwchar_t **__restrict __endptr, int __base) __THROW; /* Like `wcstoul' but convert to `uintmax_t'. */ -extern uintmax_t wcstoumax (const __gwchar_t *__restrict __nptr, +extern uintmax_t wcstoumax (const __gwchar_t *__nptr, __gwchar_t ** __restrict __endptr, int __base) __THROW; @@ -376,32 +376,32 @@ extern uintmax_t wcstoumax (const __gwchar_t *__restrict __nptr, in base 0 or 2. */ #if __GLIBC_USE (C23_STRTOL) # ifdef __REDIRECT -extern intmax_t __REDIRECT_NTH (strtoimax, (const char *__restrict __nptr, +extern intmax_t __REDIRECT_NTH (strtoimax, (const char *__nptr, char **__restrict __endptr, int __base), __isoc23_strtoimax); -extern uintmax_t __REDIRECT_NTH (strtoumax, (const char *__restrict __nptr, +extern uintmax_t __REDIRECT_NTH (strtoumax, (const char *__nptr, char **__restrict __endptr, int __base), __isoc23_strtoumax); extern intmax_t __REDIRECT_NTH (wcstoimax, - (const __gwchar_t *__restrict __nptr, + (const __gwchar_t *__nptr, __gwchar_t **__restrict __endptr, int __base), __isoc23_wcstoimax); extern uintmax_t __REDIRECT_NTH (wcstoumax, - (const __gwchar_t *__restrict __nptr, + (const __gwchar_t *__nptr, __gwchar_t **__restrict __endptr, int __base), __isoc23_wcstoumax); # else -extern intmax_t __isoc23_strtoimax (const char *__restrict __nptr, +extern intmax_t __isoc23_strtoimax (const char *__nptr, char **__restrict __endptr, int __base) __THROW; -extern uintmax_t __isoc23_strtoumax (const char *__restrict __nptr, +extern 
uintmax_t __isoc23_strtoumax (const char *__nptr, char ** __restrict __endptr, int __base) __THROW; -extern intmax_t __isoc23_wcstoimax (const __gwchar_t *__restrict __nptr, +extern intmax_t __isoc23_wcstoimax (const __gwchar_t *__nptr, __gwchar_t **__restrict __endptr, int __base) __THROW; -extern uintmax_t __isoc23_wcstoumax (const __gwchar_t *__restrict __nptr, +extern uintmax_t __isoc23_wcstoumax (const __gwchar_t *__nptr, __gwchar_t ** __restrict __endptr, int __base) __THROW; diff --git a/stdlib/stdlib.h b/stdlib/stdlib.h index 901926e893..3602157fcb 100644 --- a/stdlib/stdlib.h +++ b/stdlib/stdlib.h @@ -115,82 +115,73 @@ __extension__ extern long long int atoll (const char *__nptr) #endif /* Convert a string to a floating-point number. */ -extern double strtod (const char *__restrict __nptr, - char **__restrict __endptr) +extern double strtod (const char *__nptr, char **__restrict __endptr) __THROW __nonnull ((1)); #ifdef __USE_ISOC99 /* Likewise for `float' and `long double' sizes of floating-point numbers. */ -extern float strtof (const char *__restrict __nptr, - char **__restrict __endptr) __THROW __nonnull ((1)); +extern float strtof (const char *__nptr, char **__restrict __endptr) + __THROW __nonnull ((1)); -extern long double strtold (const char *__restrict __nptr, - char **__restrict __endptr) +extern long double strtold (const char *__nptr, char **__restrict __endptr) __THROW __nonnull ((1)); #endif /* Likewise for '_FloatN' and '_FloatNx'. 
*/ #if __HAVE_FLOAT16 && __GLIBC_USE (IEC_60559_TYPES_EXT) -extern _Float16 strtof16 (const char *__restrict __nptr, - char **__restrict __endptr) +extern _Float16 strtof16 (const char *__nptr, char **__restrict __endptr) __THROW __nonnull ((1)); #endif #if __HAVE_FLOAT32 && __GLIBC_USE (IEC_60559_TYPES_EXT) -extern _Float32 strtof32 (const char *__restrict __nptr, - char **__restrict __endptr) +extern _Float32 strtof32 (const char *__nptr, char **__restrict __endptr) __THROW __nonnull ((1)); #endif #if __HAVE_FLOAT64 && __GLIBC_USE (IEC_60559_TYPES_EXT) -extern _Float64 strtof64 (const char *__restrict __nptr, - char **__restrict __endptr) +extern _Float64 strtof64 (const char *__nptr, char **__restrict __endptr) __THROW __nonnull ((1)); #endif #if __HAVE_FLOAT128 && __GLIBC_USE (IEC_60559_TYPES_EXT) -extern _Float128 strtof128 (const char *__restrict __nptr, - char **__restrict __endptr) +extern _Float128 strtof128 (const char *__nptr, char **__restrict __endptr) __THROW __nonnull ((1)); #endif #if __HAVE_FLOAT32X && __GLIBC_USE (IEC_60559_TYPES_EXT) -extern _Float32x strtof32x (const char *__restrict __nptr, - char **__restrict __endptr) +extern _Float32x strtof32x (const char *__nptr, char **__restrict __endptr) __THROW __nonnull ((1)); #endif #if __HAVE_FLOAT64X && __GLIBC_USE (IEC_60559_TYPES_EXT) -extern _Float64x strtof64x (const char *__restrict __nptr, - char **__restrict __endptr) +extern _Float64x strtof64x (const char *__nptr, char **__restrict __endptr) __THROW __nonnull ((1)); #endif #if __HAVE_FLOAT128X && __GLIBC_USE (IEC_60559_TYPES_EXT) -extern _Float128x strtof128x (const char *__restrict __nptr, - char **__restrict __endptr) +extern _Float128x strtof128x (const char *__nptr, char **__restrict __endptr) __THROW __nonnull ((1)); #endif /* Convert a string to a long integer. 
*/ -extern long int strtol (const char *__restrict __nptr, +extern long int strtol (const char *__nptr, char **__restrict __endptr, int __base) __THROW __nonnull ((1)); /* Convert a string to an unsigned long integer. */ -extern unsigned long int strtoul (const char *__restrict __nptr, +extern unsigned long int strtoul (const char *__nptr, char **__restrict __endptr, int __base) __THROW __nonnull ((1)); #ifdef __USE_MISC /* Convert a string to a quadword integer. */ __extension__ -extern long long int strtoq (const char *__restrict __nptr, +extern long long int strtoq (const char *__nptr, char **__restrict __endptr, int __base) __THROW __nonnull ((1)); /* Convert a string to an unsigned quadword integer. */ __extension__ -extern unsigned long long int strtouq (const char *__restrict __nptr, +extern unsigned long long int strtouq (const char *__nptr, char **__restrict __endptr, int __base) __THROW __nonnull ((1)); #endif /* Use misc. */ @@ -198,12 +189,12 @@ extern unsigned long long int strtouq (const char *__restrict __nptr, #ifdef __USE_ISOC99 /* Convert a string to a quadword integer. */ __extension__ -extern long long int strtoll (const char *__restrict __nptr, +extern long long int strtoll (const char *__nptr, char **__restrict __endptr, int __base) __THROW __nonnull ((1)); /* Convert a string to an unsigned quadword integer. */ __extension__ -extern unsigned long long int strtoull (const char *__restrict __nptr, +extern unsigned long long int strtoull (const char *__nptr, char **__restrict __endptr, int __base) __THROW __nonnull ((1)); #endif /* ISO C99 or use MISC. */ @@ -212,53 +203,53 @@ extern unsigned long long int strtoull (const char *__restrict __nptr, in base 0 or 2. 
 */
 #if __GLIBC_USE (C23_STRTOL)
 # ifdef __REDIRECT
-extern long int __REDIRECT_NTH (strtol, (const char *__restrict __nptr,
+extern long int __REDIRECT_NTH (strtol, (const char *__nptr,
     char **__restrict __endptr, int __base), __isoc23_strtol) __nonnull ((1));
 extern unsigned long int __REDIRECT_NTH (strtoul,
-    (const char *__restrict __nptr,
+    (const char *__nptr,
     char **__restrict __endptr, int __base), __isoc23_strtoul) __nonnull ((1));
 # ifdef __USE_MISC
 __extension__
-extern long long int __REDIRECT_NTH (strtoq, (const char *__restrict __nptr,
+extern long long int __REDIRECT_NTH (strtoq, (const char *__nptr,
     char **__restrict __endptr, int __base), __isoc23_strtoll) __nonnull ((1));
 __extension__
 extern unsigned long long int __REDIRECT_NTH (strtouq,
-    (const char *__restrict __nptr,
+    (const char *__nptr,
     char **__restrict __endptr, int __base), __isoc23_strtoull) __nonnull ((1));
 # endif
 __extension__
-extern long long int __REDIRECT_NTH (strtoll, (const char *__restrict __nptr,
+extern long long int __REDIRECT_NTH (strtoll, (const char *__nptr,
     char **__restrict __endptr, int __base), __isoc23_strtoll) __nonnull ((1));
 __extension__
 extern unsigned long long int __REDIRECT_NTH (strtoull,
-    (const char *__restrict __nptr,
+    (const char *__nptr,
     char **__restrict __endptr, int __base), __isoc23_strtoull) __nonnull ((1));
 # else
-extern long int __isoc23_strtol (const char *__restrict __nptr,
+extern long int __isoc23_strtol (const char *__nptr,
     char **__restrict __endptr, int __base) __THROW __nonnull ((1));
-extern unsigned long int __isoc23_strtoul (const char *__restrict __nptr,
+extern unsigned long int __isoc23_strtoul (const char *__nptr,
     char **__restrict __endptr, int __base) __THROW __nonnull ((1));
 __extension__
-extern long long int __isoc23_strtoll (const char *__restrict __nptr,
+extern long long int __isoc23_strtoll (const char *__nptr,
     char **__restrict __endptr, int __base) __THROW __nonnull ((1));
 __extension__
-extern unsigned long long int __isoc23_strtoull (const char *__restrict __nptr,
+extern unsigned long long int __isoc23_strtoull (const char *__nptr,
     char **__restrict __endptr, int __base) __THROW __nonnull ((1));
@@ -337,23 +328,23 @@ extern int strfromf128x (char *__dest, size_t __size, const char * __format,
 by the POSIX.1-2008 extended locale API. */
 # include <bits/types/locale_t.h>
-extern long int strtol_l (const char *__restrict __nptr,
+extern long int strtol_l (const char *__nptr,
     char **__restrict __endptr, int __base, locale_t __loc) __THROW __nonnull ((1, 4));
-extern unsigned long int strtoul_l (const char *__restrict __nptr,
+extern unsigned long int strtoul_l (const char *__nptr,
     char **__restrict __endptr, int __base, locale_t __loc) __THROW __nonnull ((1, 4));
 __extension__
-extern long long int strtoll_l (const char *__restrict __nptr,
+extern long long int strtoll_l (const char *__nptr,
     char **__restrict __endptr, int __base, locale_t __loc) __THROW __nonnull ((1, 4));
 __extension__
-extern unsigned long long int strtoull_l (const char *__restrict __nptr,
+extern unsigned long long int strtoull_l (const char *__nptr,
     char **__restrict __endptr, int __base, locale_t __loc) __THROW __nonnull ((1, 4));
@@ -362,19 +353,19 @@ extern unsigned long long int strtoull_l (const char *__restrict __nptr,
 in base 0 or 2.
 */
 # if __GLIBC_USE (C23_STRTOL)
 # ifdef __REDIRECT
-extern long int __REDIRECT_NTH (strtol_l, (const char *__restrict __nptr,
+extern long int __REDIRECT_NTH (strtol_l, (const char *__nptr,
     char **__restrict __endptr, int __base, locale_t __loc), __isoc23_strtol_l)
     __nonnull ((1, 4));
 extern unsigned long int __REDIRECT_NTH (strtoul_l,
-    (const char *__restrict __nptr,
+    (const char *__nptr,
     char **__restrict __endptr, int __base, locale_t __loc), __isoc23_strtoul_l)
     __nonnull ((1, 4));
 __extension__
-extern long long int __REDIRECT_NTH (strtoll_l, (const char *__restrict __nptr,
+extern long long int __REDIRECT_NTH (strtoll_l, (const char *__nptr,
     char **__restrict __endptr, int __base, locale_t __loc),
@@ -382,26 +373,26 @@ extern long long int __REDIRECT_NTH (strtoll_l, (const char *__restrict __nptr,
     __nonnull ((1, 4));
 __extension__
 extern unsigned long long int __REDIRECT_NTH (strtoull_l,
-    (const char *__restrict __nptr,
+    (const char *__nptr,
     char **__restrict __endptr, int __base, locale_t __loc), __isoc23_strtoull_l)
     __nonnull ((1, 4));
 # else
-extern long int __isoc23_strtol_l (const char *__restrict __nptr,
+extern long int __isoc23_strtol_l (const char *__nptr,
     char **__restrict __endptr, int __base, locale_t __loc) __THROW __nonnull ((1, 4));
-extern unsigned long int __isoc23_strtoul_l (const char *__restrict __nptr,
+extern unsigned long int __isoc23_strtoul_l (const char *__nptr,
     char **__restrict __endptr, int __base, locale_t __loc) __THROW __nonnull ((1, 4));
 __extension__
-extern long long int __isoc23_strtoll_l (const char *__restrict __nptr,
+extern long long int __isoc23_strtoll_l (const char *__nptr,
     char **__restrict __endptr, int __base, locale_t __loc) __THROW __nonnull ((1, 4));
 __extension__
-extern unsigned long long int __isoc23_strtoull_l (const char *__restrict __nptr,
+extern unsigned long long int __isoc23_strtoull_l (const char *__nptr,
     char **__restrict __endptr, int __base, locale_t __loc) __THROW __nonnull ((1, 4));
@@ -412,64 +403,56 @@ extern unsigned long long int __isoc23_strtoull_l (const char *__restrict __nptr
 # endif
 # endif
-extern double strtod_l (const char *__restrict __nptr,
-    char **__restrict __endptr, locale_t __loc)
+extern double strtod_l (const char *__nptr, char **__restrict __endptr,
+    locale_t __loc)
     __THROW __nonnull ((1, 3));
-extern float strtof_l (const char *__restrict __nptr,
-    char **__restrict __endptr, locale_t __loc)
+extern float strtof_l (const char *__nptr, char **__restrict __endptr,
+    locale_t __loc)
     __THROW __nonnull ((1, 3));
-extern long double strtold_l (const char *__restrict __nptr,
-    char **__restrict __endptr,
+extern long double strtold_l (const char *__nptr, char **__restrict __endptr,
     locale_t __loc) __THROW __nonnull ((1, 3));
 # if __HAVE_FLOAT16
-extern _Float16 strtof16_l (const char *__restrict __nptr,
-    char **__restrict __endptr,
+extern _Float16 strtof16_l (const char *__nptr, char **__restrict __endptr,
     locale_t __loc) __THROW __nonnull ((1, 3));
 # endif
 # if __HAVE_FLOAT32
-extern _Float32 strtof32_l (const char *__restrict __nptr,
-    char **__restrict __endptr,
+extern _Float32 strtof32_l (const char *__nptr, char **__restrict __endptr,
     locale_t __loc) __THROW __nonnull ((1, 3));
 # endif
 # if __HAVE_FLOAT64
-extern _Float64 strtof64_l (const char *__restrict __nptr,
-    char **__restrict __endptr,
+extern _Float64 strtof64_l (const char *__nptr, char **__restrict __endptr,
     locale_t __loc) __THROW __nonnull ((1, 3));
 # endif
 # if __HAVE_FLOAT128
-extern _Float128 strtof128_l (const char *__restrict __nptr,
-    char **__restrict __endptr,
+extern _Float128 strtof128_l (const char *__nptr, char **__restrict __endptr,
     locale_t __loc) __THROW __nonnull ((1, 3));
 # endif
 # if __HAVE_FLOAT32X
-extern _Float32x strtof32x_l (const char *__restrict __nptr,
-    char **__restrict __endptr,
+extern _Float32x strtof32x_l (const char *__nptr, char **__restrict __endptr,
     locale_t __loc) __THROW __nonnull ((1, 3));
 # endif
 # if __HAVE_FLOAT64X
-extern _Float64x strtof64x_l (const char *__restrict __nptr,
-    char **__restrict __endptr,
+extern _Float64x strtof64x_l (const char *__nptr, char **__restrict __endptr,
     locale_t __loc) __THROW __nonnull ((1, 3));
 # endif
 # if __HAVE_FLOAT128X
-extern _Float128x strtof128x_l (const char *__restrict __nptr,
-    char **__restrict __endptr,
+extern _Float128x strtof128x_l (const char *__nptr, char **__restrict __endptr,
     locale_t __loc) __THROW __nonnull ((1, 3));
 # endif
diff --git a/sysdeps/ieee754/ldbl-opt/nldbl-strtold_l.c b/sysdeps/ieee754/ldbl-opt/nldbl-strtold_l.c
index 29ad60c8a5..44dc7bface 100644
--- a/sysdeps/ieee754/ldbl-opt/nldbl-strtold_l.c
+++ b/sysdeps/ieee754/ldbl-opt/nldbl-strtold_l.c
@@ -7,7 +7,7 @@
 #undef __strtod_l
 extern double
-__strtod_l (const char *__restrict __nptr, char **__restrict __endptr,
+__strtod_l (const char *__nptr, char **__restrict __endptr,
     locale_t __loc);
 double
diff --git a/wcsmbs/wchar.h b/wcsmbs/wchar.h
index 554d811a22..c2c74d6f81 100644
--- a/wcsmbs/wchar.h
+++ b/wcsmbs/wchar.h
@@ -399,14 +399,14 @@ extern int wcswidth (const wchar_t *__s, size_t __n) __THROW;
 /* Convert initial portion of the wide string NPTR to `double' representation. */
-extern double wcstod (const wchar_t *__restrict __nptr,
+extern double wcstod (const wchar_t *__nptr,
     wchar_t **__restrict __endptr) __THROW;
 #ifdef __USE_ISOC99
 /* Likewise for `float' and `long double' sizes of floating-point numbers. */
-extern float wcstof (const wchar_t *__restrict __nptr,
+extern float wcstof (const wchar_t *__nptr,
     wchar_t **__restrict __endptr) __THROW;
-extern long double wcstold (const wchar_t *__restrict __nptr,
+extern long double wcstold (const wchar_t *__nptr,
     wchar_t **__restrict __endptr) __THROW;
 #endif /* C99 */
@@ -414,37 +414,37 @@ extern long double wcstold (const wchar_t *__restrict __nptr,
 /* Likewise for `_FloatN' and `_FloatNx' when support is enabled.
 */
 # if __HAVE_FLOAT16
-extern _Float16 wcstof16 (const wchar_t *__restrict __nptr,
+extern _Float16 wcstof16 (const wchar_t *__nptr,
     wchar_t **__restrict __endptr) __THROW;
 # endif
 # if __HAVE_FLOAT32
-extern _Float32 wcstof32 (const wchar_t *__restrict __nptr,
+extern _Float32 wcstof32 (const wchar_t *__nptr,
     wchar_t **__restrict __endptr) __THROW;
 # endif
 # if __HAVE_FLOAT64
-extern _Float64 wcstof64 (const wchar_t *__restrict __nptr,
+extern _Float64 wcstof64 (const wchar_t *__nptr,
     wchar_t **__restrict __endptr) __THROW;
 # endif
 # if __HAVE_FLOAT128
-extern _Float128 wcstof128 (const wchar_t *__restrict __nptr,
+extern _Float128 wcstof128 (const wchar_t *__nptr,
     wchar_t **__restrict __endptr) __THROW;
 # endif
 # if __HAVE_FLOAT32X
-extern _Float32x wcstof32x (const wchar_t *__restrict __nptr,
+extern _Float32x wcstof32x (const wchar_t *__nptr,
     wchar_t **__restrict __endptr) __THROW;
 # endif
 # if __HAVE_FLOAT64X
-extern _Float64x wcstof64x (const wchar_t *__restrict __nptr,
+extern _Float64x wcstof64x (const wchar_t *__nptr,
     wchar_t **__restrict __endptr) __THROW;
 # endif
 # if __HAVE_FLOAT128X
-extern _Float128x wcstof128x (const wchar_t *__restrict __nptr,
+extern _Float128x wcstof128x (const wchar_t *__nptr,
     wchar_t **__restrict __endptr) __THROW;
 # endif
 #endif /* __GLIBC_USE (IEC_60559_TYPES_EXT) && __GLIBC_USE (ISOC23) */
@@ -452,12 +452,12 @@ extern _Float128x wcstof128x (const wchar_t *__restrict __nptr,
 /* Convert initial portion of wide string NPTR to `long int' representation. */
-extern long int wcstol (const wchar_t *__restrict __nptr,
+extern long int wcstol (const wchar_t *__nptr,
     wchar_t **__restrict __endptr, int __base) __THROW;
 /* Convert initial portion of wide string NPTR to `unsigned long int' representation.
 */
-extern unsigned long int wcstoul (const wchar_t *__restrict __nptr,
+extern unsigned long int wcstoul (const wchar_t *__nptr,
     wchar_t **__restrict __endptr, int __base) __THROW;
@@ -465,14 +465,14 @@ extern unsigned long int wcstoul (const wchar_t *__restrict __nptr,
 /* Convert initial portion of wide string NPTR to `long long int' representation. */
 __extension__
-extern long long int wcstoll (const wchar_t *__restrict __nptr,
+extern long long int wcstoll (const wchar_t *__nptr,
     wchar_t **__restrict __endptr, int __base) __THROW;
 /* Convert initial portion of wide string NPTR to `unsigned long long int' representation. */
 __extension__
-extern unsigned long long int wcstoull (const wchar_t *__restrict __nptr,
+extern unsigned long long int wcstoull (const wchar_t *__nptr,
     wchar_t **__restrict __endptr, int __base) __THROW;
 #endif /* ISO C99. */
@@ -481,14 +481,14 @@ extern unsigned long long int wcstoull (const wchar_t *__restrict __nptr,
 /* Convert initial portion of wide string NPTR to `long long int' representation. */
 __extension__
-extern long long int wcstoq (const wchar_t *__restrict __nptr,
+extern long long int wcstoq (const wchar_t *__nptr,
     wchar_t **__restrict __endptr, int __base) __THROW;
 /* Convert initial portion of wide string NPTR to `unsigned long long int' representation. */
 __extension__
-extern unsigned long long int wcstouq (const wchar_t *__restrict __nptr,
+extern unsigned long long int wcstouq (const wchar_t *__nptr,
     wchar_t **__restrict __endptr, int __base) __THROW;
 #endif /* Use GNU. */
@@ -497,49 +497,49 @@ extern unsigned long long int wcstouq (const wchar_t *__restrict __nptr,
 in base 0 or 2.
 */
 #if __GLIBC_USE (C23_STRTOL)
 # ifdef __REDIRECT
-extern long int __REDIRECT_NTH (wcstol, (const wchar_t *__restrict __nptr,
+extern long int __REDIRECT_NTH (wcstol, (const wchar_t *__nptr,
     wchar_t **__restrict __endptr, int __base), __isoc23_wcstol);
 extern unsigned long int __REDIRECT_NTH (wcstoul,
-    (const wchar_t *__restrict __nptr,
+    (const wchar_t *__nptr,
     wchar_t **__restrict __endptr, int __base), __isoc23_wcstoul);
 __extension__
 extern long long int __REDIRECT_NTH (wcstoll,
-    (const wchar_t *__restrict __nptr,
+    (const wchar_t *__nptr,
     wchar_t **__restrict __endptr, int __base), __isoc23_wcstoll);
 __extension__
 extern unsigned long long int __REDIRECT_NTH (wcstoull,
-    (const wchar_t *__restrict __nptr,
+    (const wchar_t *__nptr,
     wchar_t **__restrict __endptr, int __base), __isoc23_wcstoull);
 # ifdef __USE_GNU
 __extension__
-extern long long int __REDIRECT_NTH (wcstoq, (const wchar_t *__restrict __nptr,
+extern long long int __REDIRECT_NTH (wcstoq, (const wchar_t *__nptr,
     wchar_t **__restrict __endptr, int __base), __isoc23_wcstoll);
 __extension__
 extern unsigned long long int __REDIRECT_NTH (wcstouq,
-    (const wchar_t *__restrict __nptr,
+    (const wchar_t *__nptr,
     wchar_t **__restrict __endptr, int __base), __isoc23_wcstoull);
 # endif
 # else
-extern long int __isoc23_wcstol (const wchar_t *__restrict __nptr,
+extern long int __isoc23_wcstol (const wchar_t *__nptr,
     wchar_t **__restrict __endptr, int __base) __THROW;
-extern unsigned long int __isoc23_wcstoul (const wchar_t *__restrict __nptr,
+extern unsigned long int __isoc23_wcstoul (const wchar_t *__nptr,
     wchar_t **__restrict __endptr, int __base) __THROW;
 __extension__
-extern long long int __isoc23_wcstoll (const wchar_t *__restrict __nptr,
+extern long long int __isoc23_wcstoll (const wchar_t *__nptr,
     wchar_t **__restrict __endptr, int __base) __THROW;
 __extension__
-extern unsigned long long int __isoc23_wcstoull (const wchar_t *__restrict __nptr,
+extern unsigned long long int __isoc23_wcstoull (const wchar_t *__nptr,
     wchar_t **__restrict __endptr, int __base) __THROW;
@@ -558,21 +558,21 @@ extern unsigned long long int __isoc23_wcstoull (const wchar_t *__restrict __npt
 /* Parallel versions of the functions above which take the locale to use as an additional parameter. These are GNU extensions inspired by the POSIX.1-2008 extended locale API. */
-extern long int wcstol_l (const wchar_t *__restrict __nptr,
+extern long int wcstol_l (const wchar_t *__nptr,
     wchar_t **__restrict __endptr, int __base, locale_t __loc) __THROW;
-extern unsigned long int wcstoul_l (const wchar_t *__restrict __nptr,
+extern unsigned long int wcstoul_l (const wchar_t *__nptr,
     wchar_t **__restrict __endptr, int __base, locale_t __loc) __THROW;
 __extension__
-extern long long int wcstoll_l (const wchar_t *__restrict __nptr,
+extern long long int wcstoll_l (const wchar_t *__nptr,
     wchar_t **__restrict __endptr, int __base, locale_t __loc) __THROW;
 __extension__
-extern unsigned long long int wcstoull_l (const wchar_t *__restrict __nptr,
+extern unsigned long long int wcstoull_l (const wchar_t *__nptr,
     wchar_t **__restrict __endptr, int __base, locale_t __loc) __THROW;
@@ -581,42 +581,42 @@ extern unsigned long long int wcstoull_l (const wchar_t *__restrict __nptr,
 in base 0 or 2.
 */
 # if __GLIBC_USE (C23_STRTOL)
 # ifdef __REDIRECT
-extern long int __REDIRECT_NTH (wcstol_l, (const wchar_t *__restrict __nptr,
+extern long int __REDIRECT_NTH (wcstol_l, (const wchar_t *__nptr,
     wchar_t **__restrict __endptr, int __base, locale_t __loc), __isoc23_wcstol_l);
 extern unsigned long int __REDIRECT_NTH (wcstoul_l,
-    (const wchar_t *__restrict __nptr,
+    (const wchar_t *__nptr,
     wchar_t **__restrict __endptr, int __base, locale_t __loc), __isoc23_wcstoul_l);
 __extension__
 extern long long int __REDIRECT_NTH (wcstoll_l,
-    (const wchar_t *__restrict __nptr,
+    (const wchar_t *__nptr,
     wchar_t **__restrict __endptr, int __base, locale_t __loc), __isoc23_wcstoll_l);
 __extension__
 extern unsigned long long int __REDIRECT_NTH (wcstoull_l,
-    (const wchar_t *__restrict __nptr,
+    (const wchar_t *__nptr,
     wchar_t **__restrict __endptr, int __base, locale_t __loc), __isoc23_wcstoull_l);
 # else
-extern long int __isoc23_wcstol_l (const wchar_t *__restrict __nptr,
+extern long int __isoc23_wcstol_l (const wchar_t *__nptr,
     wchar_t **__restrict __endptr, int __base, locale_t __loc) __THROW;
-extern unsigned long int __isoc23_wcstoul_l (const wchar_t *__restrict __nptr,
+extern unsigned long int __isoc23_wcstoul_l (const wchar_t *__nptr,
     wchar_t **__restrict __endptr, int __base, locale_t __loc) __THROW;
 __extension__
-extern long long int __isoc23_wcstoll_l (const wchar_t *__restrict __nptr,
+extern long long int __isoc23_wcstoll_l (const wchar_t *__nptr,
     wchar_t **__restrict __endptr, int __base, locale_t __loc) __THROW;
 __extension__
-extern unsigned long long int __isoc23_wcstoull_l (const wchar_t *__restrict __nptr,
+extern unsigned long long int __isoc23_wcstoull_l (const wchar_t *__nptr,
     wchar_t **__restrict __endptr, int __base, locale_t __loc) __THROW;
@@ -627,56 +627,56 @@ extern unsigned long long int __isoc23_wcstoull_l (const wchar_t *__restrict __n
 # endif
 # endif
-extern double wcstod_l (const wchar_t *__restrict __nptr,
+extern double wcstod_l (const wchar_t *__nptr,
     wchar_t **__restrict __endptr, locale_t __loc) __THROW;
-extern float wcstof_l (const wchar_t *__restrict __nptr,
+extern float wcstof_l (const wchar_t *__nptr,
     wchar_t **__restrict __endptr, locale_t __loc) __THROW;
-extern long double wcstold_l (const wchar_t *__restrict __nptr,
+extern long double wcstold_l (const wchar_t *__nptr,
     wchar_t **__restrict __endptr, locale_t __loc) __THROW;
 # if __HAVE_FLOAT16
-extern _Float16 wcstof16_l (const wchar_t *__restrict __nptr,
+extern _Float16 wcstof16_l (const wchar_t *__nptr,
     wchar_t **__restrict __endptr, locale_t __loc) __THROW;
 # endif
 # if __HAVE_FLOAT32
-extern _Float32 wcstof32_l (const wchar_t *__restrict __nptr,
+extern _Float32 wcstof32_l (const wchar_t *__nptr,
     wchar_t **__restrict __endptr, locale_t __loc) __THROW;
 # endif
 # if __HAVE_FLOAT64
-extern _Float64 wcstof64_l (const wchar_t *__restrict __nptr,
+extern _Float64 wcstof64_l (const wchar_t *__nptr,
     wchar_t **__restrict __endptr, locale_t __loc) __THROW;
 # endif
 # if __HAVE_FLOAT128
-extern _Float128 wcstof128_l (const wchar_t *__restrict __nptr,
+extern _Float128 wcstof128_l (const wchar_t *__nptr,
     wchar_t **__restrict __endptr, locale_t __loc) __THROW;
 # endif
 # if __HAVE_FLOAT32X
-extern _Float32x wcstof32x_l (const wchar_t *__restrict __nptr,
+extern _Float32x wcstof32x_l (const wchar_t *__nptr,
     wchar_t **__restrict __endptr, locale_t __loc) __THROW;
 # endif
 # if __HAVE_FLOAT64X
-extern _Float64x wcstof64x_l (const wchar_t *__restrict __nptr,
+extern _Float64x wcstof64x_l (const wchar_t *__nptr,
     wchar_t **__restrict __endptr, locale_t __loc) __THROW;
 # endif
 # if __HAVE_FLOAT128X
-extern _Float128x wcstof128x_l (const wchar_t *__restrict __nptr,
+extern _Float128x wcstof128x_l (const wchar_t *__nptr,
     wchar_t **__restrict __endptr, locale_t __loc) __THROW;
 # endif
ISO C specifies these APIs as accepting a restricted pointer in their
first parameter:

    $ stdc c99 strtol
    long int strtol(const char *restrict nptr, char **restrict endptr, int base);
    $ stdc c11 strtol
    long int strtol(const char *restrict nptr, char **restrict endptr, int base);

However, it should be considered a defect in ISO C. It's common to see
code that aliases it:

    char str[] = "10 20";

    p = str;
    a = strtol(p, &p, 0);  // Let's ignore error handling for
    b = strtol(p, &p, 0);  // simplicity.

strtol(3) doesn't write to the string at all, so it shouldn't care
whether there's any aliasing. Requiring that the user use a distinct
pointer for the second argument is an artificial imposition with no
justification, and it is often violated by real code, so let's lift
that restriction.

For example, in the shadow project, there were two violations of this
restriction (as of shadow-4.14.8; they are probably still there in more
recent versions, but those now use some wrapper functions that make
them more complex to show):

    $ grep -rn strto.*pos.*pos lib* src/ | sed 's/\t\t*/\t/'
    src/usermod.c:322: last = strtoll(pos + 1, &pos, 10);
    $ grep -rn strto.*end.*end lib* src/ | sed 's/\t\t*/\t/'
    lib/getrange.c:83: n = strtoul (endptr, &endptr, 10);

Link: <https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112833>
Cc: <gcc@gcc.gnu.org>
Cc: Paul Eggert <eggert@cs.ucla.edu>
Signed-off-by: Alejandro Colomar <alx@kernel.org>
---
 include/stdlib.h                           |  35 +++---
 include/wchar.h                            |  17 ++-
 manual/arith.texi                          |  36 +++----
 stdlib/inttypes.h                          |  24 ++---
 stdlib/stdlib.h                            | 119 +++++++++------------
 sysdeps/ieee754/ldbl-opt/nldbl-strtold_l.c |   2 +-
 wcsmbs/wchar.h                             |  96 ++++++++---------
 7 files changed, 154 insertions(+), 175 deletions(-)
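
For illustration only, here is a minimal sketch of the aliasing pattern
the commit message describes, with the "no digits consumed" check that
the two-line example leaves out. parse_pair() is a hypothetical helper
written for this note; it is not part of glibc or shadow. It passes the
same pointer variable as nptr and, by address, as endptr, which is the
usage the patch wants the prototypes to permit without a -Wrestrict
diagnostic. Range checking via errno is deliberately omitted.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Hypothetical helper (assumption, not from the patch): parse two
 * integers from s, advancing a single cursor p that is used both as
 * nptr and, through &p, as endptr.
 * Returns 0 on success, -1 if either number is missing.
 */
int
parse_pair(char *s, long *a, long *b)
{
	char *p = s;
	char *start;

	assert(s != NULL && a != NULL && b != NULL);

	start = p;
	*a = strtol(p, &p, 0);	/* nptr is p; endptr is &p */
	if (p == start)
		return -1;	/* no digits were consumed */

	start = p;
	*b = strtol(p, &p, 0);	/* continue from where the first call stopped */
	if (p == start)
		return -1;

	return 0;
}
```

Called on a writable copy of "10 20", the helper is expected to store
10 and 20 and return 0; on "10 x" the second conversion consumes
nothing, so it returns -1.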