2016-01-04 Florian Weimer <fweimer@redhat.com>
* manual/memory.texi (Variable Size Automatic): Document
interaction between alloca and variable length arrays. Mention
function inlining. Remove obsolete warning about alloca in
function parameter lists.
(Advantages of Alloca): Note that alloca is async-signal-safe.
Mention C++ exceptions and lack of length checking in open2.
(Disadvantages of Alloca): Clarify consequences of the lack of
error checking. Do not mention the non-existent alloca emulation.
(GNU C Variable-Size Arrays): Switch terminology from GNU C
variable-sized arrays to ISO C variable length arrays. Mention
security aspect and aliasing violations. Clarify loop behavior.
Remove NB, now part of the alloca documentation.
* manual/string.texi (Copying Strings and Arrays): Add warning
about alloca and length checking to strdupa. Drop restriction to
GNU CC.
(Truncating Strings): Add warning to strndupa. Drop restriction
to GNU CC.
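
Reviewer sketch (not part of the patch): the size check that the revised
"Disadvantages of Alloca" text recommends could look roughly like the
following. The names ALLOCA_LIMIT and concat_equals are hypothetical and
used only for illustration; the limit of 4096 is the arbitrary value the
manual text itself suggests, and <alloca.h> is assumed to be available as
on GNU systems.

```c
#include <alloca.h>
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical limit, taken from the "say, 4096" in the manual text.  */
#define ALLOCA_LIMIT 4096

/* Hypothetical helper: returns 1 if STR1 followed by STR2 equals
   EXPECTED, 0 if it does not, and -1 if no scratch buffer could be
   obtained.  */
int
concat_equals (const char *str1, const char *str2, const char *expected)
{
  assert (str1 != NULL && str2 != NULL && expected != NULL);

  size_t len1 = strlen (str1);
  size_t len2 = strlen (str2);
  if (len2 >= SIZE_MAX - len1)
    return -1;                  /* len1 + len2 + 1 would overflow.  */
  size_t size = len1 + len2 + 1;

  int on_heap = size > ALLOCA_LIMIT;
  char *buf;
  if (on_heap)
    {
      /* Large request: malloc reports failure instead of overrunning
         the stack.  */
      buf = malloc (size);
      if (buf == NULL)
        return -1;
    }
  else
    buf = alloca (size);        /* Small request: freed on return.  */

  memcpy (buf, str1, len1);
  memcpy (buf + len1, str2, len2 + 1);  /* Copies the terminator too.  */
  int result = strcmp (buf, expected) == 0;

  if (on_heap)
    free (buf);
  return result;
}
```

The buffer is only used inside the function, so the alloca branch never
lets a pointer escape the frame that owns it.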
@@ -2745,10 +2745,24 @@ The function @code{alloca} supports a kind of half-dynamic allocation in
which blocks are allocated dynamically but freed automatically.
Allocating a block with @code{alloca} is an explicit action; you can
-allocate as many blocks as you wish, and compute the size at run time. But
-all the blocks are freed when you exit the function that @code{alloca} was
-called from, just as if they were automatic variables declared in that
-function. There is no way to free the space explicitly.
+allocate as many blocks as you wish, and compute the size at run time.
+Memory allocated this way is freed automatically, at some point after
+the scope which contains the @code{alloca} call is left:
+
+@itemize @bullet
+@item
+@cindex variable length arrays
+If the scope calling @code{alloca} contains a variable length array, or
+is nested in such a scope, then the object allocated with @code{alloca}
+is deallocated when the closest enclosing scope which defines a
+variable length array is left.
+
+@item
+If no enclosing scope with a variable length array exists, the allocated
+object is deallocated when the function is exited, either normally or
+abnormally (for example, by throwing a C++ exception). The lifetime of
+such objects is not extended by function inlining.
+@end itemize
The prototype for @code{alloca} is in @file{stdlib.h}. This function is
a BSD extension.
@@ -2762,21 +2776,11 @@ The return value of @code{alloca} is the address of a block of @var{size}
bytes of memory, allocated in the stack frame of the calling function.
@end deftypefun
-Do not use @code{alloca} inside the arguments of a function call---you
-will get unpredictable results, because the stack space for the
-@code{alloca} would appear on the stack in the middle of the space for
-the function arguments. An example of what to avoid is @code{foo (x,
-alloca (4), y)}.
-@c This might get fixed in future versions of GCC, but that won't make
-@c it safe with compilers generally.
-
@menu
* Alloca Example:: Example of using @code{alloca}.
* Advantages of Alloca:: Reasons to use @code{alloca}.
* Disadvantages of Alloca:: Reasons to avoid @code{alloca}.
-* GNU C Variable-Size Arrays:: Only in GNU C, here is an alternative
- method of allocating dynamically and
- freeing automatically.
+* GNU C Variable-Size Arrays:: On-stack dynamic allocation in ISO C.
@end menu
@node Alloca Example
@@ -2834,6 +2838,14 @@ block, space used for any size block can be reused for any other size.
@code{alloca} does not cause memory fragmentation.
@item
+@cindex mmap
+The @code{alloca} function can be safely called from a signal handler.
+But signal handlers may run with little stack space available, so
+it is unclear how much memory can be safely allocated with @code{alloca}.
+This means that robust code may have to use @code{mmap} instead.
+@xref{Memory-mapped I/O}.
+
+@item
@cindex longjmp
Nonlocal exits done with @code{longjmp} (@pxref{Non-Local Exits})
automatically free the space allocated with @code{alloca} when they exit
@@ -2865,7 +2877,13 @@ freed even when an error occurs, with no special effort required.
By contrast, the previous definition of @code{open2} (which uses
@code{malloc} and @code{free}) would develop a memory leak if it were
changed in this way. Even if you are willing to make more changes to
-fix it, there is no easy way to do so.
+fix it, there is no easy way to do so (except to switch to C++ and
+exceptions).
+
+Note that the @code{open2} example with @code{alloca} is incorrect if
+@code{str1} and @code{str2} can be very long strings because
+@code{alloca} does not fail gracefully in case too many bytes are
+requested (see below).
@end itemize
@node Disadvantages of Alloca
@@ -2879,22 +2897,38 @@ These are the disadvantages of @code{alloca} in comparison with
@itemize @bullet
@item
If you try to allocate more memory than the machine can provide, you
-don't get a clean error message. Instead you get a fatal signal like
-the one you would get from an infinite recursion; probably a
-segmentation violation (@pxref{Program Error Signals}).
+don't get a clean error message. Instead, you end up with undefined
+behavior. In many cases, the program will just crash (which can still
+result in a denial-of-service vulnerability), but sometimes, it is
+possible to abuse an unbounded @code{alloca} to cause other security
+vulnerabilities such as information disclosure or arbitrary code
+execution.
@item
Some @nongnusystems{} fail to support @code{alloca}, so it is less
-portable. However, a slower emulation of @code{alloca} written in C
-is available for use on systems with this deficiency.
+portable.
@end itemize
+Due to the lack of error checking, security-sensitive code must ensure
+that no large objects are allocated with @code{alloca}. In general, this
+means checking the size argument against a fixed limit (say, 4096
+bytes) and, if the limit is exceeded, either returning an error or
+falling back to @code{malloc}.
+
+Extra care is required when @code{alloca} is called from a function
+called recursively or from within a loop. In this case, depending on
+the depth of the recursion or the loop iteration count, even small
+allocation sizes can exhaust the stack and trigger undefined behavior.
+To a lesser degree, this problem also exists with callback functions.
+
+@c Node name preserved for backwards compatibility; the correct
+@c terminology is ``variable length array''.
@node GNU C Variable-Size Arrays
-@subsubsection GNU C Variable-Size Arrays
-@cindex variable-sized arrays
+@subsubsection ISO C Variable Length Arrays
+@cindex variable length arrays
-In GNU C, you can replace most uses of @code{alloca} with an array of
-variable size. Here is how @code{open2} would look then:
+In ISO C, you can replace most uses of @code{alloca} with an array of
+variable length. Here is how @code{open2} would look then:
@smallexample
int open2 (char *str1, char *str2, int flags, int mode)
@@ -2905,26 +2939,39 @@ int open2 (char *str1, char *str2, int flags, int mode)
@}
@end smallexample
+Compared to @code{malloc}, variable length arrays have the same
+advantages and disadvantages as @code{alloca}. In particular, there is
+no error checking (and security vulnerabilities can result from large
+allocation requests), and some @nongnusystems{} do not support variable
+length arrays because they only implement versions of ISO C earlier
+than C99, which introduced them.
+
+The variable length array version of @code{open2}, as shown above, still
+suffers from the same problem as the @code{alloca}-based variant: it
+does not check that the strings are short enough to avoid the undefined
+behavior that results from large allocation requests.
+
But @code{alloca} is not always equivalent to a variable-sized array, for
several reasons:
@itemize @bullet
@item
-A variable size array's space is freed at the end of the scope of the
-name of the array. The space allocated with @code{alloca}
-remains until the end of the function.
+Memory returned by @code{alloca} is untyped. A variable length array
+always has a specific type (even if it is an array of characters), and
+using it with another type can introduce aliasing violations into the
+program.
@item
-It is possible to use @code{alloca} within a loop, allocating an
-additional block on each iteration. This is impossible with
-variable-sized arrays.
+A variable length array is deallocated at the end of the scope of the
+name of the array. The space allocated with @code{alloca} usually
+remains until the end of the function (but see above for mixed use).
@end itemize
-@strong{NB:} If you mix use of @code{alloca} and variable-sized arrays
-within one function, exiting a scope in which a variable-sized array was
-declared frees all blocks allocated with @code{alloca} during the
-execution of that scope.
-
+The second difference is most pronounced in loops: With @code{alloca},
+the allocated object can be referenced from later iterations and after
+the loop body has been exited. But a loop with a variable length array
+can execute an arbitrary number of times, without exhausting the
+available stack, as long as the individual arrays are short enough.
@node Resizing the Data Segment
@section Resizing the Data Segment
@@ -643,24 +643,17 @@ The behavior of @code{wcpcpy} is undefined if the strings overlap.
This macro is similar to @code{strdup} but allocates the new string
using @code{alloca} instead of @code{malloc} (@pxref{Variable Size
Automatic}). This means of course the returned string has the same
-limitations as any block of memory allocated using @code{alloca}.
+limitations as any block of memory allocated using @code{alloca}, and
+@code{strdupa} can introduce security vulnerabilities due to the lack of
+failure checking.
-For obvious reasons @code{strdupa} is implemented only as a macro;
-you cannot get the address of this function. Despite this limitation
-it is a useful function. The following code shows a situation where
-using @code{malloc} would be a lot more expensive.
+For obvious reasons @code{strdupa} is implemented only as a macro; you
+cannot get the address of this function. The following code shows an
+example of its use:
@smallexample
@include strdupa.c.texi
@end smallexample
-
-Please note that calling @code{strtok} using @var{path} directly is
-invalid. It is also not allowed to call @code{strdupa} in the argument
-list of @code{strtok} since @code{strdupa} uses @code{alloca}
-(@pxref{Variable Size Automatic}) can interfere with the parameter
-passing.
-
-This function is only available if GNU CC is used.
@end deftypefn
@comment string.h
@@ -958,16 +951,13 @@ processing text.
This function is similar to @code{strndup} but like @code{strdupa} it
allocates the new string using @code{alloca} @pxref{Variable Size
Automatic}. The same advantages and limitations of @code{strdupa} are
-valid for @code{strndupa}, too.
+valid for @code{strndupa}. In particular, @code{strndupa} can introduce
+security vulnerabilities due to the lack of error checking.
This function is implemented only as a macro, just like @code{strdupa}.
-Just as @code{strdupa} this macro also must not be used inside the
-parameter list in a function call.
As noted below, this function is generally a poor choice for
processing text.
-
-@code{strndupa} is only available if GNU CC is used.
@end deftypefn
@comment string.h
--
2.4.3
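
Reviewer sketch (not part of the patch): the loop behavior described in
the revised variable length array text can be illustrated as below. The
function count_palindromes and its max_len parameter are hypothetical
names chosen for this example; the bound on the array length reflects the
patch's advice that the individual arrays must be short enough.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: counts how many of the COUNT strings are
   palindromes.  Strings longer than MAX_LEN are skipped so that the
   variable length array stays small.  */
size_t
count_palindromes (const char *const *strings, size_t count,
                   size_t max_len)
{
  size_t found = 0;
  for (size_t i = 0; i < count; i++)
    {
      size_t len = strlen (strings[i]);
      if (len > max_len)
        continue;               /* Refuse to create a large array.  */

      /* The array is deallocated at the end of each iteration, so the
         stack use does not grow with COUNT.  */
      char reversed[len + 1];
      for (size_t j = 0; j < len; j++)
        reversed[j] = strings[i][len - 1 - j];
      reversed[len] = '\0';

      if (strcmp (reversed, strings[i]) == 0)
        found++;
    }
  return found;
}
```

If the scratch buffer came from alloca instead, each iteration would add
another block that stays allocated until the function returns, which is
the difference the last paragraph of the node describes.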