Message ID | 56607CD1.3050209@cs.ucla.edu |
---|---|
State | New |
On Thu, Dec 3, 2015 at 12:33 PM, Paul Eggert <eggert@cs.ucla.edu> wrote:
> Come to think of it, this is related to the confusion between one of the
> main goals of strlcpy (namely, the result is always null-terminated), and
> the weird special case where the destination size is zero (where strlcpy
> cannot null-terminate the destination). In practice user code does not and
> should not depend on this weird special case.

I would not at all be surprised to find code deliberately calling
strlcpy with destination size zero in order to learn how big the
destination buffer needed to be.

zw
On 12/03/2015 09:36 AM, Zack Weinberg wrote:
> I would not at all be surprised to find code deliberately calling
> strlcpy with destination size zero in order to learn how big the
> destination buffer needed to be.

Calling strlcpy (DST, SRC, 0) instead of the more-obvious strlen (SRC)?
Really? That would be bizarre. I'd like to see an example.

Also, would you be surprised by a call strlcpy (NULL, SRC, 0) instead of
the more-obvious strlen (SRC)? Florian's proposed wording (like my
proposal) would prohibit such a call. Even though strlcpy (NULL, SRC, 0)
works just fine on BSD and the OpenBSD and FreeBSD man pages don't
prohibit it, surely glibc doesn't need to support this usage, and surely
the same is true for strlcpy (DST, SRC, 0) when DST is non-NULL.
On Thu, Dec 3, 2015 at 12:56 PM, Paul Eggert <eggert@cs.ucla.edu> wrote:
> On 12/03/2015 09:36 AM, Zack Weinberg wrote:
>>
>> I would not at all be surprised to find code deliberately calling
>> strlcpy with destination size zero in order to learn how big the
>> destination buffer needed to be.
>
> Calling strlcpy (DST, SRC, 0) instead of the more-obvious strlen (SRC)?
> Really? That would be bizarre. I'd like to see an example.

    char *my_strdup(char *s)
    {
      size_t bufsiz = strlcpy(0, s, 0) + 1;
      char *buf = malloc(bufsiz);
      if (!buf) return 0;
      strlcpy(buf, s, bufsiz);
      return buf;
    }

If you're inclined to write this function using strlcpy in the first
place, using it both to compute the buffer size and to do the copy is
natural.

> Also, would you be surprised by a call strlcpy (NULL, SRC, 0) instead of the
> more-obvious strlen (SRC)?

I would actively expect strlcpy (0, SRC, 0) to be used; indeed, I would
expect it to be *more* common than passing a non-null buffer with a
zero size.

> Florian's proposed wording (like my proposal)
> would prohibit such a call. Even though strlcpy (NULL, SRC, 0) works just
> fine on BSD and the OpenBSD and FreeBSD man pages don't prohibit it,
> surely glibc doesn't need to support this usage

If we're going to have these functions at all, they need to work
_exactly_ as they do on *BSD. Also, accepting a null destination buffer
along with a zero destination size is consistent with snprintf.

zw
On 12/03/2015 10:08 AM, Zack Weinberg wrote:
> char *my_strdup(char *s)

Sorry, I should have been clearer. I'd like to see real-world
applications that care about these weird corner cases. I'm skeptical
that such applications exist. I'm sure that one can find contrived
examples and feature-test programs and suchlike, but production code?

> I would actively expect strlcpy (0, SRC, 0) to be used

Any such usage would not work under NetBSD's current strlcpy
implementation, which does not allow the destination to be a null
pointer. So strlcpy (0, SRC, 0) is already not portable, and we don't
need to support it.

> If we're going to have these functions at all, they need to work
> _exactly_ as they do on *BSD

There is no "_exactly_". The BSDs differ.

As you know, I'd rather we didn't support these poorly-designed APIs;
but if we do support them at least we can help out a bit by catching any
apps that fall into these APIs' weirder cracks.
On 12/03/2015 11:42 PM, Paul Eggert wrote:
> On 12/03/2015 10:08 AM, Zack Weinberg wrote:
>> If we're going to have these functions at all, they need to work
>> _exactly_ as they do on *BSD
>
> There is no "_exactly_". The BSDs differ.

I think the behavior is the same. The FreeBSD manpage deviation you
mentioned could be a documentation bug.

> As you know, I'd rather we didn't support these poorly-designed APIs;
> but if we do support them at least we can help out a bit by catching any
> apps that fall into these APIs' weirder cracks.

We'd have to coordinate this with the OpenBSD folks at least. OpenBSD
made overlapped strings explicitly undefined in 2014, so there is some
precedent for clarification.

Not sure if a direct conversation is possible. I've reached out.

Florian
On 12/03/2015 06:33 PM, Paul Eggert wrote:
> Thanks for the review. Revised patchset attached. At this point I'm
> inclined to install the first two patches (which don't change semantics)
> to make it easier to review and maintain the third one (which adds
> strlcpy+strlcat), but I'll hold off a bit longer on that to get more
> feedback.
> Replying to your comments:
>> I don't understand the “and as for strings usually pointers … are used”
>> part.
>
> Changed to "A wide-string variable is usually declared to be a pointer
> of type @code{wchar_t *}, by analogy with string variables and
> @code{char *}." Hope this makes it clear.

Oh, I see now. A few commas in the original version would have helped,
but this is much better.

>> Standard uses “wide string” by the way, not “wide character string”.
>
> C11 says "wide string", POSIX says "wide-character string". True, the
> more-concise form is better here, so I've changed it to that. Similarly,
> I changed "multibyte character string" (POSIX wording) to "multibyte
> string" (C Standard wording). I wish the standards could standardize the
> wording...

Oh, how unfortunate.

>> +literal. Strings can also be formed by @dfn{string concatenation}:
>> +@code{"a" "b"} is the
>>
>> The original had “string literal”. As this only works for string
>> literals, it's best to keep it.
>
> The original was incorrect; it said "string literals can also be formed
> by @dfn{string concatenation}" but string literals are the input to
> string concatenation, not the output from it. I changed the wording to
> "String literals can also contribute to @dfn{string concatenation}:...".

Okay, fine.

>> @@ -309,7 +318,8 @@ returns @var{maxlen}. Therefore this function is
>> equivalent to
>> @code{(strlen (@var{s}) < @var{maxlen} ? strlen (@var{s}) :
>> @var{maxlen})}
>> but it
>> is more efficient and works even if the string @var{s} is not
>> -null-terminated.
>> +null-terminated so long as @var{maxlen} does not exceed the
>> +size of @var{s}'s array.
>>
>> This doesn't make much sense anymore because strings are defined to be
>> always null-terminated.
>
> Reworded to "If the array @var{s} of size @var{maxlen} contains a null
> byte, the @code{strnlen} function returns the length of the string
> @var{s} in bytes. Otherwise it returns @var{maxlen}. Therefore this
> function is equivalent to @code{(strlen (@var{s}) < @var{maxlen} ?
> strlen (@var{s}) : @var{maxlen})} but it is more efficient and works
> even if @var{s} is not null-terminated so long as @var{maxlen} does not
> exceed the size of @var{s}'s array."

Looks good.

I'm reading the second patch now. This claim about _FORTIFY_SOURCE is
misleading:

+Although these string-truncation functions have been used to fend off
+some buffer overruns, nowadays more-systematic techniques are often
+available, such as defining the @code{_FORTIFY_SOURCE} macro or using
+GCC's @option{-fsanitize=address} option. Ironically, use of these
+string-truncation functions can mask application bugs that would
+otherwise be caught by the more-systematic techniques.

Unfortunately, source fortification catches few buffer overflows in this
area. For example, this code:

    char *
    concat (const char *a, const char *b)
    {
      size_t alen = strlen (a);
      size_t blen = strlen (b);
      size_t len = alen + blen;
      char *result = malloc (len + 1);
      if (result != NULL)
        {
          strcpy (result, a);
          strcat (result, b);
        }
      return result;
    }

compiles to (using GCC 5.1.1):

    concat:
            pushl   %ebp
            pushl   %edi
            pushl   %esi
            pushl   %ebx
            subl    $24, %esp
            movl    44(%esp), %ebp
            pushl   %ebp
            call    strlen
            popl    %edx
            pushl   48(%esp)
            movl    %eax, %ebx
            call    strlen
            movl    %eax, %esi
            leal    1(%ebx,%eax), %eax
            movl    %eax, (%esp)
            call    malloc
            addl    $16, %esp
            testl   %eax, %eax
            movl    %eax, %edi
            je      .L2
            subl    $4, %esp
            addl    $1, %esi
            pushl   %ebx
            pushl   %ebp
            addl    %edi, %ebx
            pushl   %eax
            call    memcpy
            addl    $12, %esp
            pushl   %esi
            pushl   44(%esp)
            pushl   %ebx
            call    memcpy
            addl    $16, %esp
    .L2:
            addl    $12, %esp
            movl    %edi, %eax
            popl    %ebx
            popl    %esi
            popl    %edi
            popl    %ebp
            ret

strcpy and strcat have been replaced by memcpy, but this eliminates
only one source of buffer overflows here.

In my experience, _FORTIFY_SOURCE mostly covers the
write-to-statically-sized-buffer case. The manual should not make
promises the current GCC/glibc combinations cannot deliver.
_FORTIFY_SOURCE will always be brittle for the more complex cases
because of the dependency on GCC optimization behavior.

It's also highly application-specific whether a crash (induced by
_FORTIFY_SOURCE) or truncation (from strncpy or strlcpy) is better.

Florian
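For illustration only, the concat function from the message above can be written with both lengths carried through explicitly, which is roughly what the compiler output ends up doing with its two memcpy calls. This is a sketch, not part of any proposed patch: the name concat_checked is hypothetical, and the check against the summed lengths wrapping around is an assumption added here that the original message does not discuss.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical variant of concat: returns a newly allocated string
   holding A followed by B, or NULL on allocation failure or if the
   combined size cannot be represented.  The caller frees the result.  */
char *
concat_checked (const char *a, const char *b)
{
  assert (a != NULL && b != NULL);

  size_t alen = strlen (a);
  size_t blen = strlen (b);

  /* Assumption added for this sketch: refuse inputs for which
     alen + blen + 1 would wrap around SIZE_MAX.  */
  if (blen >= SIZE_MAX - alen)
    return NULL;

  char *result = malloc (alen + blen + 1);
  if (result != NULL)
    {
      memcpy (result, a, alen);
      /* Copy blen + 1 bytes so that B's terminating null comes along.  */
      memcpy (result + alen, b, blen + 1);
    }
  return result;
}
```

Because both lengths are known before the allocation, neither copy can run past the buffer, and no truncating function is involved.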
On 12/03/2015 05:42 PM, Paul Eggert wrote:
> On 12/03/2015 10:08 AM, Zack Weinberg wrote:
>> char *my_strdup(char *s)
>
> Sorry, I should have been clearer. I'd like to see real-world
> applications that care about these weird corner cases.

I really do not see this as a weird corner case. I see it as the
expected, natural behavior - if dstsize=0, nothing is written to the
destination, and the return value is unaffected; because nothing is
written to the destination when dstsize=0, it is safe to pass dest=NULL
when dstsize=0. Again, this is consistent with snprintf, and I think
maintaining that consistency would be a compelling argument even if it
wasn't the *BSD behavior.

> I'm skeptical that such applications exist.

I do not have time to do more than a cursory search, which turns up no
such applications; however, I do not consider "we don't know of any such
applications" an acceptable excuse for deviating from the *BSD behavior.
I reiterate that

>> If we're going to have these functions at all, they need to work
>> _exactly_ as they do on *BSD.

If they don't work _exactly_ as they do on *BSD, people will not stop
using substitute definitions with glibc and there will have been no
point in adding them!

> There is no "_exactly_". The BSDs differ.

I would consider the example you gave (NetBSD not allowing dst=NULL when
dstsize=0) a bug in NetBSD, not to be emulated.

zw
On 12/04/2015 03:12 AM, Florian Weimer wrote:
>> There is no "_exactly_". The BSDs differ.
> I think the behavior is the same.

No, as I mentioned elsewhere, the NetBSD implementation does not allow
strlcpy(0, SRC, 0) whereas the FreeBSD implementation does (and this
disagrees with the FreeBSD documentation where it says the destination
must be a string). I haven't investigated strlcat but I expect that it
differs too. It is a bit of a mess.

It's certainly OK to reach out to the OpenBSD and FreeBSD and NetBSD
folks here, in particular as to whether applications rely on these weird
corner cases. I am pretty sure they do not, and that it'd be better to
make them errors.
On 12/04/2015 06:20 AM, Zack Weinberg wrote:
> I really do not see this as a weird corner case.
It's a weird corner case because it violates the natural programmer
expectation that strlcpy and strlcat produce null-terminated strings.
This is hardwired into what programmers expect, and this expectation
is documented by the FreeBSD man pages. And it is why the BSD man pages
say "this should not happen" when talking about these weird corner cases.
There's no way that a typical C programmer will call strlcat and think,
"OK, now I have to worry about whether the result is null-terminated".
It's a misuse of programmer time to even raise the possibility. The
whole *point* of these poorly-designed functions is to generate
null-terminated strings come hell or high water, regardless of how long
the inputs are.
I accept that in this sense strlcpy+strlcat differ from snprintf but
there's a good reason for that. With snprintf, one can't easily compute
the needed length without calling snprintf, so there's a natural pattern
of calling snprintf with a zero size, then allocating N+1 bytes, then
calling snprintf again. With strlcpy this pattern does not arise.
First, anybody who's trying to allocate a buffer of the proper size will
not be calling strlcpy in the first place; they'll be calling plain
strcpy or memcpy or whatever because the output buffer will be big
enough once it's allocated. Second, the natural way to compute a string
length is to call strlen, not to call strlcpy (NULL, SRC, 0) which is a
weird corner case and does not work in NetBSD anyway.
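The two patterns contrasted in the message above can be sketched as follows. This is a minimal illustration under stated assumptions, not code from any of the patches: format_pair and dup_string are hypothetical helper names, and the "%s=%d" format is made up for the example. The first helper uses the standard two-pass snprintf idiom (a zero size on the first call, then an allocation of N+1 bytes); the second sizes its buffer with strlen and copies with memcpy.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: format "NAME=VALUE" into a freshly allocated
   buffer.  The caller frees the result.  */
char *
format_pair (const char *name, int value)
{
  assert (name != NULL);

  /* First pass: the size is 0, so nothing is stored; the return value
     is the number of bytes needed, not counting the terminating null.  */
  int needed = snprintf (NULL, 0, "%s=%d", name, value);
  if (needed < 0)
    return NULL;

  char *buf = malloc ((size_t) needed + 1);
  if (buf == NULL)
    return NULL;

  /* Second pass: the buffer is now known to be large enough.  */
  snprintf (buf, (size_t) needed + 1, "%s=%d", name, value);
  return buf;
}

/* Hypothetical helper: duplicate a string whose length is computed
   with strlen, so no truncating copy function is needed.  */
char *
dup_string (const char *s)
{
  assert (s != NULL);

  size_t len = strlen (s);
  char *buf = malloc (len + 1);
  if (buf != NULL)
    memcpy (buf, s, len + 1);   /* len + 1 includes the terminator */
  return buf;
}
```

With snprintf the first pass is the only practical way to learn the formatted length; with a plain string, strlen already gives it.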
On 12/03/2015 07:08 PM, Zack Weinberg wrote:
>> Calling strlcpy (DST, SRC, 0) instead of the more-obvious strlen (SRC)?
>> Really? That would be bizarre. I'd like to see an example.
>
> char *my_strdup(char *s)
> {
>   size_t bufsiz = strlcpy(0, s, 0) + 1;
>   char *buf = malloc(bufsiz);
>   if (!buf) return 0;
>   strlcpy(buf, s, bufsiz);
>   return buf;
> }

The implementation I posted does not actually support that because (like
virtually all C standard functions), a null pointer is not allowed even
if the buffer size is zero.

This is incompatible with the C++ standard library, but I did not have
any luck so far with convincing people that this needs to be fixed in
the C standard. The strlcpy/strlcat patch is not the place for this
general discussion, I think.

Florian
On 12/04/2015 01:01 PM, Florian Weimer wrote:
> On 12/03/2015 07:08 PM, Zack Weinberg wrote:
>
>>> Calling strlcpy (DST, SRC, 0) instead of the more-obvious strlen (SRC)?
>>> Really? That would be bizarre. I'd like to see an example.
>>
>> char *my_strdup(char *s)
>> {
>>   size_t bufsiz = strlcpy(0, s, 0) + 1;
>>   char *buf = malloc(bufsiz);
>>   if (!buf) return 0;
>>   strlcpy(buf, s, bufsiz);
>>   return buf;
>> }
>
> The implementation I posted does not actually support that because (like
> virtually all C standard functions), a null pointer is not allowed even
> if the buffer size is zero.

I think that's a showstopper bug in your implementation; as I've been
saying to Paul, the semantics should _exactly_ match *BSD (OpenBSD,
specifically) or we shouldn't have these functions at all.

If I'm reading the OpenBSD manpages correctly, this applies to both
strlcpy and strlcat.

> This is incompatible with the C++ standard library, but I did not have
> any luck so far with convincing people that this needs to be fixed in
> the C standard. The strlcpy/strlcat patch is not the place for this
> general discussion, I think.

FWIW I would support a general policy of "if a buffer size is specified
as zero, the associated pointer(s) are not dereferenced". But agree
that this is a separate discussion.

zw
On 12/04/2015 11:19 AM, Paul Eggert wrote:
> On 12/04/2015 06:20 AM, Zack Weinberg wrote:
>> I really do not see this as a weird corner case.
>
> It's a weird corner case because it violates the natural programmer
> expectation that strlcpy and strlcat produce null-terminated strings.

When there is no space to write a nul-terminated string into, there is
no such expectation. And "no space" is what bufsiz=0 *means*.

Moreover, there is another - and, I'd argue, stronger - natural
programmer expectation, which is that when a buffer size is zero, the
associated pointer(s) will not be dereferenced. I know that the C
standard does not require this for e.g. memcpy, but I very strongly
suspect that if you polled a random sample of experienced C programmers
they would react to that basically the same way they react to being told
that signed integer overflow is undefined, i.e. "that's stupid and the
standard should be changed."

> There's no way that a typical C programmer will call strlcat and think,
> "OK, now I have to worry about whether the result is null-terminated".
> It's a misuse of programmer time to even raise the possibility. The
> whole *point* of these poorly-designed functions is to generate
> null-terminated strings come hell or high water, regardless of how long
> the inputs are.

I could be convinced that the OpenBSD semantics are wrong, but only if
you (being the person demanding a change from those semantics) find a
piece of existing code that can *accidentally* call either strlcpy or
strlcat with a zero buffer-size argument. That means code that was
intended to always call one of these functions with a nonzero
buffer-size argument, but, depending on input and other external
factors, might in practice call it with a zero buffer-size argument.

Failing such an exhibition, I will continue to insist on _exactly_
matching the OpenBSD semantics or else not having these functions at all.

zw
On 12/07/2015 05:57 AM, Zack Weinberg wrote:
> I will continue to insist on _exactly_
> matching the OpenBSD semantics or else not having these functions at all.

(I agree, as I think glibc shouldn't have these functions at all. That
being said...)

If I understand the above comment correctly, you have at least three
reasons to object to the proposed strlcpy+strlcat implementation.

First, it rejects null pointer arguments.

Second, when arguments overlap it doesn't necessarily have the same
behavior as the OpenBSD implementation. For example, the OpenBSD strlcpy
implementation always has well-defined behavior when source and
destination overlap, but the proposed implementation does not. The
situation with strlcat is more complicated; still, the OpenBSD
implementation has well-defined behavior in cases where the proposed
glibc implementation does not.

Third, the OpenBSD implementation declares strlcpy and strlcat to have
__attribute__ ((__bounded__ ...)), an OpenBSD extension that generates
warnings when compiling with gcc -Wbounded (an OpenBSD GCC option that
is on by default). The proposed implementation doesn't do that so it by
default does not diagnose bugs that the OpenBSD implementation does
diagnose.

> When there is no space to write a nul-terminated string into, there is
> no such expectation.

Sometimes there *is* space; that is, the destination size is nonzero,
but the proposed strlcat *still* doesn't store a terminating null.
Doesn't this give you qualms? It does me.

> if you polled a random sample of experienced C programmers
> they would react to that basically the same way they react to being told
> that signed integer overflow is undefined, i.e. "that's stupid and the
> standard should be changed."

If you polled a random sample of experienced C programmers and told them
that neither strlcpy nor strlcat were guaranteed to null-terminate the
result even when given valid arguments, they would also say "that's
stupid and the spec should be changed". And they'd have a point.

> I could be convinced that the OpenBSD semantics are wrong, but only if
> you (being the person demanding a change from those semantics)

I'm not demanding any change to the OpenBSD semantics. The OpenBSD
implementation would conform to the spec that I'm proposing. I'm merely
asking the documentation to permit not only the OpenBSD implementation,
but also the NetBSD implementation that catches the stupid mistake of a
null-pointer argument, and an implementation that also catches the
stupid mistake of output that is not null-terminated. It's reasonable to
have debugging implementations that catch stupid mistakes like this.

> find a
> piece of existing code that can *accidentally* call either strlcpy or
> strlcat with a zero buffer-size argument.

OpenSSH 7.1p1 progressmeter.c on a platform where struct winsize's
ws_col member is an int and is -1, so setscreensize sets win_size to
zero, so refresh_progress_meter's call to strlcat(buf, " ", win_size)
has a zero buffer-size argument. Admittedly implausible on typical
environments, but there it is.
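The strlcat corner cases argued over in this message can be made concrete with a small sketch. It is written from the semantics described in the thread and is not the OpenBSD source or the proposed glibc code; sketch_strlcat is a hypothetical name, and the assertions on the pointer arguments are an assumption of this sketch. When the destination has no null byte within its first SIZE bytes (which includes SIZE being zero), nothing is stored at all, so the destination stays unterminated even though SIZE may be nonzero.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical sketch of BSD-style strlcat semantics.  */
size_t
sketch_strlcat (char *dst, const char *src, size_t size)
{
  assert (src != NULL);
  assert (dst != NULL || size == 0);

  /* Look for DST's terminator, but never beyond the first SIZE bytes.
     With SIZE == 0 the loop body never runs and DST is not read.  */
  size_t dlen = 0;
  while (dlen < size && dst[dlen] != '\0')
    dlen++;

  size_t slen = strlen (src);

  /* No terminator within SIZE bytes: store nothing, not even a
     null byte, and report SIZE plus the length of SRC.  */
  if (dlen == size)
    return size + slen;

  /* Copy as much of SRC as fits, leaving room for the terminator.  */
  size_t room = size - dlen - 1;
  size_t ncopy = slen < room ? slen : room;
  memcpy (dst + dlen, src, ncopy);
  dst[dlen + ncopy] = '\0';

  /* The length the result would have had without truncation.  */
  return dlen + slen;
}
```

A return value that is not less than SIZE signals truncation in the ordinary case, but it is also what the caller sees in the unterminated-destination case, which is why that case is easy to overlook.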
On 12/04/2015 05:06 PM, Paul Eggert wrote:
> On 12/04/2015 03:12 AM, Florian Weimer wrote:
>>> There is no "_exactly_". The BSDs differ.
>> I think the behavior is the same.
>
> No, as I mentioned elsewhere, the NetBSD implementation does not allow
> strlcpy(0, SRC, 0) whereas the FreeBSD implementation does (and this
> disagrees with the FreeBSD documentation where it says the destination
> must be a string).

Hmm. I looked at this, and everything points towards the NetBSD
implementation being buggy in this regard. Destination-NULL-size-0 is
supported by snprintf, so it should be supported by strlcpy as well (and
strlcat, by analogy).

I will update my patch accordingly.

The question of what to do if the terminator is not found in the
destination buffer (for strlcat) is still open.

Florian
On 12/10/2015 08:44 AM, Florian Weimer wrote:
> everything points towards the NetBSD
> implementation being buggy in this regard

"everything"? Nothing indicates that the NetBSD behavior is a bug; it
doesn't cause real applications to break. And the NetBSD behavior can
help catch buggy usage, such as "char *buf = malloc (n); strlcpy (buf,
src, n - m);" on platforms where NULL points to accessible storage,
malloc (0) returns NULL, and n happens to be 0.

The main point of strlcpy+strlcat is to guarantee that the output is
always a null-terminated string that fits. This is what programmers
understandably expect. This main point is *far* more important than
obscure details about weird corner cases that should never happen
anyway. The spec should allow an implementation that guarantees the main
point, and that terminates the program if the program tries to exploit
the corner cases by passing bad pointers or null pointers or
unterminated strings or size-zero buffers that prevent the main point
from being safely satisfied.
On 12/07/2015 10:05 PM, Paul Eggert wrote:
> On 12/07/2015 05:57 AM, Zack Weinberg wrote:
>> I will continue to insist on _exactly_
>> matching the OpenBSD semantics or else not having these functions at all.
>
> (I agree, as I think glibc shouldn't have these functions at all. That
> being said...)
>
> If I understand the above comment correctly, you have at least three
> reasons to object to the proposed strlcpy+strlcat implementation.
>
> First, it rejects null pointer arguments.

Yes, I'm going to fix that.

> Second, when arguments overlap it doesn't necessarily have the same
> behavior as the OpenBSD implementation. For example, the OpenBSD strlcpy
> implementation always has well-defined behavior when source and
> destination overlap, but the proposed implementation does not.

The OpenBSD implementation is defined to be undefined with overlapping
inputs, too.

> Third, the OpenBSD implementation declares strlcpy and strlcat to have
> __attribute__ ((__bounded__ ...)), an OpenBSD extension that generates
> warnings when compiling with gcc -Wbounded (an OpenBSD GCC option that
> is on by default). The proposed implementation doesn't do that so it by
> default does not diagnose bugs that the OpenBSD implementation does
> diagnose.

Doesn't the _FORTIFY_SOURCE wrapper do something similar?

>> When there is no space to write a nul-terminated string into, there is
>> no such expectation.
>
> Sometimes there *is* space; that is, the destination size is nonzero,
> but the proposed strlcat *still* doesn't store a terminating null.
> Doesn't this give you qualms? It does me.

Yes, it's annoying, particularly since it is inconsistent with strlcpy
and snprintf. I'm not sure if we can change that, I'll ask.

Florian
On 12/10/2015 11:38 AM, Florian Weimer wrote:
>> the OpenBSD strlcpy
>> implementation always has well-defined behavior when source and
>> destination overlap, but the proposed implementation does not.
>
> The OpenBSD implementation is defined to be undefined with overlapping
> inputs, too.

No, the OpenBSD implementation has well-defined behavior then. strlcpy
is declared like this:

    size_t strlcpy (char *, const char *, size_t)
      __attribute__ ((__bounded__ (__string__, 1, 3)));

http://cvsweb.openbsd.org/cgi-bin/cvsweb/~checkout~/src/include/string.h

and is implemented like this:

    size_t
    strlcpy (char *dst, const char *src, size_t size)
    {
      const char *s = src;
      size_t n = size;
      if (n)
        while (--n && (*dst++ = *s++))
          continue;
      if (!n)
        {
          if (size)
            *dst = '\0';
          while (*s++)
            continue;
        }
      return s - src - 1;
    }

http://cvsweb.openbsd.org/cgi-bin/cvsweb/~checkout~/src/lib/libc/string/strlcpy.c

Nothing in this implementation prohibits overlapping the source and
destination. There is no use of 'restrict' or of 'memcpy' or anything
like that. Behavior is perfectly well-defined when source and
destination overlap.

>> Third, the OpenBSD implementation declares strlcpy and strlcat to have
>> __attribute__ ((__bounded__ ...)), an OpenBSD extension that generates
>> warnings when compiling with gcc -Wbounded (an OpenBSD GCC option that
>> is on by default). The proposed implementation doesn't do that so it by
>> default does not diagnose bugs that the OpenBSD implementation does
>> diagnose.
>
> Doesn't the _FORTIFY_SOURCE wrapper do something similar?

Yes, but the proposed glibc implementation is not "exactly matching the
OpenBSD semantics" as Zack insisted upon. For example, whether
diagnostics are issued differs by default. There are probably other
differences (sorry, I don't know what the __bounded__ attribute does,
exactly).
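As a rough illustration of how the return value discussed throughout this thread is meant to be used, here is a sketch that restates the copy loop in a simpler shape and adds a truncation check. Everything named here is hypothetical (sketch_strlcpy, copy_fits), the code is not the OpenBSD or glibc implementation, and it makes no claim about overlapping arguments. It accepts a null destination only when the size is zero, which is the behavior Zack argues for; that choice is an assumption of the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical sketch of BSD-style strlcpy semantics for
   non-overlapping arguments.  */
size_t
sketch_strlcpy (char *dst, const char *src, size_t size)
{
  assert (src != NULL);
  assert (dst != NULL || size == 0);

  size_t i = 0;
  if (size != 0)
    {
      /* Copy at most SIZE - 1 bytes, one at a time, then terminate.  */
      for (; i < size - 1 && src[i] != '\0'; i++)
        dst[i] = src[i];
      dst[i] = '\0';
    }

  /* Always report the full length of SRC, copied or not.  */
  return i + strlen (src + i);
}

/* Hypothetical helper: true if SRC fit into DST without truncation,
   that is, if the returned length is less than SIZE.  */
bool
copy_fits (char *dst, size_t size, const char *src)
{
  return sketch_strlcpy (dst, src, size) < size;
}
```

With a zero size the destination is never touched, so a call such as sketch_strlcpy (NULL, s, 0) simply returns the length of s; with a four-byte buffer and the source "hello", the buffer ends up holding "hel" and the return value 5 reports the truncation.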
On 12/10/2015 11:27 PM, Paul Eggert wrote:
> http://cvsweb.openbsd.org/cgi-bin/cvsweb/~checkout~/src/lib/libc/string/strlcpy.c
> Nothing in this implementation prohibits overlapping the source and
> destination.

This implementation comes with documentation and it says, among other
things:

“If the src and dst strings overlap, the behavior is undefined.”

<http://cvsweb.openbsd.org/cgi-bin/cvsweb/~checkout~/src/lib/libc/string/strlcpy.3>

Florian
On 12/11/2015 04:09 AM, Florian Weimer wrote:
> This implementation comes with documentation and it says

My comments were about the OpenBSD implementation, not the
documentation. If implementation compatibility is required (which I
think was Zack's point), we need to fix incompatibilities with the
OpenBSD implementation, even if these incompatibilities aren't
documented.

Conversely, if implementation compatibility is not required, then we
needn't reproduce OpenBSD's behavior exactly on weird corner cases that
come up only with buggy applications. NetBSD does this to some extent,
and if we're going to implement strlcpy+strlcat at all, we could do so
as well. So, for example, it would be fine if _FORTIFY_SOURCE caused
strlcpy to report an error when given overlapping arguments, even though
OpenBSD's strlcpy implementation has well-defined behavior in that
situation.

Either approach would be better than a randomish glibc implementation,
partially compatible with OpenBSD's weird quirks and partially not, with
no principle for when we're compatible and when not.

> “If the src and dst strings overlap, the behavior is undefined.”
>
> <http://cvsweb.openbsd.org/cgi-bin/cvsweb/~checkout~/src/lib/libc/string/strlcpy.3>

If the goal is to implement this spec (and implementation compatibility
is not required), then this quote says the destination is a string,
which means the destination buffer is null-terminated for strlcat, and
also means SIZE is nonzero for both strlcpy and strlcat. And that is
what my proposed strlcpy+strlcat doc patch says, albeit more clearly.

Of course other parts of the OpenBSD spec talk about what to do when
SIZE is zero or the strlcat destination is not a string, but what can I
say? The OpenBSD spec is confused and contradicts itself, and this gives
us wiggle room to interpret it reasonably for these weird corner cases.

Thanks, by the way, for being so patient with all this. This API is such
a pain.
From 69684f3044af8b1554696054e2f8d6f20bab79c2 Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert@cs.ucla.edu>
Date: Thu, 3 Dec 2015 09:14:57 -0800
Subject: [PATCH 3/3] Add strlcpy, strlcat

[BZ #178]
This patch was derived from text by Florian Weimer in:
https://sourceware.org/ml/libc-alpha/2015-11/msg00558.html
* manual/string.texi (Truncating Strings): New functions from BSD.
---
 ChangeLog          |  6 ++++
 manual/string.texi | 83 +++++++++++++++++++++++++++++++++++++++++++++++++++++-
 2 files changed, 88 insertions(+), 1 deletion(-)

diff --git a/ChangeLog b/ChangeLog
index d02f7ad..fc2f093 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,5 +1,11 @@
 2015-12-03 Paul Eggert <eggert@cs.ucla.edu>
 
+	Add strlcpy, strlcat
+	[BZ #178]
+	This patch was derived from text by Florian Weimer in:
+	https://sourceware.org/ml/libc-alpha/2015-11/msg00558.html
+	* manual/string.texi (Truncating Strings): New functions from BSD.
+
 	Split large string section; add truncation advice
 	* manual/string.texi (Copying Strings and Arrays): Split
 	into three sections Copying Strings and Arrays, Concatenating Strings,
diff --git a/manual/string.texi b/manual/string.texi
index eec17f7..d32e956 100644
--- a/manual/string.texi
+++ b/manual/string.texi
@@ -745,7 +745,7 @@ As noted below, this function has significant performance issues.
 @end deftypefun
 
 Programmers using the @code{strcat} or @code{wcscat} function (or the
-@code{strncat} or @code{wcsncat} functions defined in
+@code{strlcat}, @code{strncat}, or @code{wcsncat} functions defined in
 a later section, for that matter) can easily be recognized as lazy and
 reckless. In almost all situations the lengths of the participating
 strings are known (it better should be
@@ -906,6 +906,48 @@ greater than the length of @var{from}. As noted below, this function is
 generally a poor choice for processing text.
 @end deftypefun
 
+@comment string.h
+@comment BSD
+@deftypefun size_t strlcpy (char *restrict @var{to}, const char *restrict @var{from}, size_t @var{size})
+@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
+This function is similar to @code{strcpy}, but copies at most
+@var{size} bytes from the string @var{from} into the destination
+buffer @var{to}, including a terminating null byte if @var{size} is
+nonzero.
+
+If @var{size} is greater than the length of @var{from}, @code{strlcpy}
+copies all of the string @var{from} to the destination buffer
+@var{to}, including the terminating null byte. Like other string
+functions such as @code{strcpy}, but unlike @code{strncpy}, any
+remaining bytes in the destination buffer remain unchanged.
+
+If @var{size} is nonzero and is not greater than the length of the
+string @var{from}, @code{strlcpy} copies only the first
+@samp{@var{size} - 1} bytes to the destination buffer @var{to}, and
+writes a terminating null byte to the last byte in the buffer.
+
+This function returns the length of @var{from}. This means that
+truncation occurs whenever the returned value is not less than
+@var{size}.
+
+The behavior of @code{strlcpy} is undefined if @var{size} is zero, if
+the strings overlap, or if the source or destination are null
+pointers.
+
+Unlike @code{strncpy}, @code{strlcpy} always null-terminates the
+destination buffer, does not zero-fill the destination buffer,
+requires @var{size} to be nonzero, requires @var{from} to be a
+null-terminated string, and always computes @var{from}'s length even
+when this length is greater than @var{size}.
+
+This function was designed as a stopgap for quickly retrofitting
+possibly-unsafe uses of @code{strcpy} on platforms lacking
+buffer-overrun protection. As noted below, this function is generally
+a poor choice for processing text.
+
+@code{strlcpy} is derived from BSD.
+@end deftypefun
+
 @comment wchar.h
 @comment ISO
 @deftypefun {wchar_t *} wcsncpy (wchar_t *restrict @var{wto}, const wchar_t *restrict @var{wfrom}, size_t @var{size})
@@ -1064,6 +1106,45 @@ choice for processing text. Also, this function has significant
 performance issues. @xref{Concatenating Strings}.
 @end deftypefun
 
+@comment string.h
+@comment BSD
+@deftypefun size_t strlcat (char *restrict @var{to}, const char *restrict @var{from}, size_t @var{size})
+@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
+This function is similar to @code{strcat}, except that it truncates
+the result to at most @var{size} bytes (including the terminating null
+byte) when appending the string @var{from} to the string @var{to}.
+
+The @code{strlcat} function appends a prefix of the string @var{from}
+to the string @var{to}. The prefix contains all of @var{from} if that
+can be appended to @var{to} without requiring more than @var{size}
+bytes total. Otherwise, it contains only as many leading bytes of
+@var{from} as will fit. The resulting string in @var{to} is always
+null-terminated, and any excess trailing bytes of @var{from} are not
+copied.
+
+@code{strlcat} returns the sum of the original length of @var{to} and
+the length of @var{from}. This means that truncation occurs whenever
+the returned value is not less than @var{size}.
+
+The behavior of @code{strlcat} is undefined if @var{from} does not
+contain a null byte in its first @var{size} bytes, if the strings
+overlap, or if the source or destination are null pointers.
+
+Unlike @code{strncat}, @code{strlcat} requires @var{to} and @var{from}
+to be null-terminated, ensures that @var{to} stays null-terminated,
+and always computes @var{from}'s length even when this length is
+greater than that of the appended string.
+
+As a companion to @code{strlcpy}, @code{strlcat} was designed as a
+stopgap for quickly retrofitting possibly-unsafe uses of @code{strcat}
+on platforms lacking buffer-overrun protection. As noted below, this
+function is generally a poor choice for processing text. Also, this
+function has significant performance issues. @xref{Concatenating
+Strings}.
+
+@code{strlcat} is derived from BSD.
+@end deftypefun
+
 @comment wchar.h
 @comment ISO
 @deftypefun {wchar_t *} wcsncat (wchar_t *restrict @var{wto}, const wchar_t *restrict @var{wfrom}, size_t @var{size})
-- 
2.1.0