Message ID | 55DF6F59.3000404@linaro.org |
---|---|
State | New |
Headers | show |
On Thu, Aug 27, 2015 at 05:13:13PM -0300, Adhemerval Zanella wrote: > This patch fixes the default wordsize-32 mmap implementation offset > calculation for negative values. Current code uses signed shift > operation to calculate the multiple size to use with syscall and > it is implementation defined. Change it to use a division base > on mmap page size (default being as before, 4096). > > Tested on armv7hf. > > [BZ #18877] > * misc/Makefile (tests): Add tst-mmap > * sysdeps/unix/sysv/linux/generic/wordsize-32/mmap.c (__mmap): Fix > offset calculation for negative values. > * misc/tst-mmap.c: New file. As we have posix/tst-mmap.c already, my idea was to call it tst-mmap-offset.c, > + char fname[] = "tst-mmap-offset-XXXXXX"; hence the name of this temporary file. > - offset >> MMAP_PAGE_SHIFT); > + offset/MMAP_PAGE_UNIT); offset / MMAP_PAGE_UNIT
I'm looking at failures of this new test for MIPS, but am puzzled by the descriptions of this issue in terms of things being incorrect and fixed without any statement of what is actually correct or why that is correct. The expectation in the test seems to be that negative offsets passed to mmap should be interpreted as large unsigned values. But I can't find anything in POSIX, or in the glibc manual, or in the Linux man-pages collection, to justify those semantics. So what is the basis for that interpretation as part of the glibc API (and for its being compatible with POSIX), rather than negative arguments being considered invalid? (The MIPS mmap syscall uses a signed argument and does a signed arithmetic shift on it, meaning that matching the MIPS semantics to other architectures will involve using mmap2 for o32 and making mmap not an alias for mmap64 for n32 because mmap for n32 will need to zero-extend its offset argument before calling the syscall.)
On 20-01-2016 19:44, Joseph Myers wrote: > I'm looking at failures of this new test for MIPS, but am puzzled by the > descriptions of this issue in terms of things being incorrect and fixed > without any statement of what is actually correct or why that is correct. > > The expectation in the test seems to be that negative offsets passed to > mmap should be interpreted as large unsigned values. But I can't find > anything in POSIX, or in the glibc manual, or in the Linux man-pages > collection, to justify those semantics. So what is the basis for that > interpretation as part of the glibc API (and for its being compatible with > POSIX), rather than negative arguments being considered invalid? > You are correct, I bluntly used the bz report testcase as the one for the patch. And checking both Linux and POSIX negative offset are not really specified, so the issue at BZ#18877 can't really being considered a regression since the offset usage is undefined behaviour. The fix itself on the wordsize-32 I still consider valid since it was using a shift arithmetic signed is implementation-defined behaviour (and the divide operation is assumed to not be implementation-defined). About the test I think we should just remove it and state that invalid offsets are undefined or architecture define. > (The MIPS mmap syscall uses a signed argument and does a signed arithmetic > shift on it, meaning that matching the MIPS semantics to other > architectures will involve using mmap2 for o32 and making mmap not an > alias for mmap64 for n32 because mmap for n32 will need to zero-extend its > offset argument before calling the syscall.) I think this can be a future cleanup to simplify implementations.
On Thu, 21 Jan 2016, Adhemerval Zanella wrote: > You are correct, I bluntly used the bz report testcase as the one for the > patch. And checking both Linux and POSIX negative offset are not really > specified, so the issue at BZ#18877 can't really being considered a regression > since the offset usage is undefined behaviour. The fix itself on the > wordsize-32 I still consider valid since it was using a shift arithmetic > signed is implementation-defined behaviour (and the divide operation is > assumed to not be implementation-defined). The shift is fully defined in GNU C, which is what is used by glibc. The actual fix isn't the use of the divide (if the low bits are zero, signed division and signed arithmetic right shift have the same effect) - it's that you're dividing by an *unsigned* quantity whose type is such that C type promotion rules cause the signed offset to be converted to unsigned before the division (and converting to an unsigned type before shifting right would have just the same effect). > About the test I think we should just remove it and state that invalid > offsets are undefined or architecture define. I don't think saying they are architecture-defined makes sense - in general the glibc API should be consistent between architectures (and we test for the glibc API, not just for the POSIX API). And if it's undefined (as opposed to invalid and requiring an error), again, what is the basis for that in POSIX?
On 21-01-2016 12:12, Joseph Myers wrote: > On Thu, 21 Jan 2016, Adhemerval Zanella wrote: > >> You are correct, I bluntly used the bz report testcase as the one for the >> patch. And checking both Linux and POSIX negative offset are not really >> specified, so the issue at BZ#18877 can't really being considered a regression >> since the offset usage is undefined behaviour. The fix itself on the >> wordsize-32 I still consider valid since it was using a shift arithmetic >> signed is implementation-defined behaviour (and the divide operation is >> assumed to not be implementation-defined). > > The shift is fully defined in GNU C, which is what is used by glibc. The > actual fix isn't the use of the divide (if the low bits are zero, signed > division and signed arithmetic right shift have the same effect) - it's > that you're dividing by an *unsigned* quantity whose type is such that C > type promotion rules cause the signed offset to be converted to unsigned > before the division (and converting to an unsigned type before shifting > right would have just the same effect). Right I oversee the unsigned promotion rule, so seems the right signed shift should be safe. > >> About the test I think we should just remove it and state that invalid >> offsets are undefined or architecture define. > > I don't think saying they are architecture-defined makes sense - in > general the glibc API should be consistent between architectures (and we > test for the glibc API, not just for the POSIX API). And if it's > undefined (as opposed to invalid and requiring an error), again, what is > the basis for that in POSIX? > OK I am open to suggestion in such case to how to handle negative offsets.
Does anyone else have comments on my question <https://sourceware.org/ml/libc-alpha/2016-01/msg00534.html> of what the desired architecture-independent mmap semantics should be for negative offsets?
On Wed, Jan 20, 2016 at 09:44:26PM +0000, Joseph Myers wrote: > I'm looking at failures of this new test for MIPS, but am puzzled by the > descriptions of this issue in terms of things being incorrect and fixed > without any statement of what is actually correct or why that is correct. > > The expectation in the test seems to be that negative offsets passed to > mmap should be interpreted as large unsigned values. But I can't find > anything in POSIX, or in the glibc manual, or in the Linux man-pages > collection, to justify those semantics. So what is the basis for that > interpretation as part of the glibc API (and for its being compatible with > POSIX), rather than negative arguments being considered invalid? Is there any 32-bit kernel currently supported by glibc where underlying system call doesn't operate with 4K or pagesize units for offsets?
On 28-01-2016 15:45, Joseph Myers wrote: > Does anyone else have comments on my question > <https://sourceware.org/ml/libc-alpha/2016-01/msg00534.html> of what the > desired architecture-independent mmap semantics should be for negative > offsets? > I recheck Linux handling of mmap syscall and I would say we should handle it an unsigned integer since it is what general mmap implementation is being based current (mmap_pgoff - mm/mmap.c). The problem is some architecture implements mmap (either default or for compatibility) as signed and others as unsigned. As far I could check mmap2 is handled as unsigned for all the ports. Now from GLIBC side I see we need to start use mmap2 where is possible for the ports that already are not using it (mips for instance). The problem is ports that only provides mmap with signed offset and for such cases I am not sure how we will enforce architecture-independent semantics for negative offset.
diff --git a/misc/Makefile b/misc/Makefile index aecb0da..2929d61 100644 --- a/misc/Makefile +++ b/misc/Makefile @@ -76,7 +76,8 @@ install-lib := libg.a gpl2lgpl := error.c error.h tests := tst-dirname tst-tsearch tst-fdset tst-efgcvt tst-mntent tst-hsearch \ - tst-error1 tst-pselect tst-insremque tst-mntent2 bug-hsearch1 + tst-error1 tst-pselect tst-insremque tst-mntent2 bug-hsearch1 \ + tst-mmap ifeq ($(run-built-tests),yes) tests-special += $(objpfx)tst-error1-mem.out endif diff --git a/misc/tst-mmap.c b/misc/tst-mmap.c new file mode 100644 index 0000000..8b006b5 --- /dev/null +++ b/misc/tst-mmap.c @@ -0,0 +1,67 @@ +/* mmap offset test. + + Copyright (C) 2015 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + <http://www.gnu.org/licenses/>. */ + +#include <stdio.h> +#include <stdlib.h> +#include <string.h> +#include <unistd.h> +#include <sys/mman.h> + +static int +printmsg (int rc, const char *msg) +{ + printf ("%s failed: %m\n", msg); + return rc; +} + +/* Check if negative offset are handled correctly by mmap. */; +static int +do_test_bz18877 (void) +{ + const int prot = PROT_READ | PROT_WRITE; + const int flags = MAP_SHARED; + const unsigned long length = 0x10000; + const unsigned long offset = 0xace00000; + const unsigned long size = offset + length; + void *addr; + int fd; + char fname[] = "tst-mmap-offset-XXXXXX"; + + fd = mkstemp64 (fname); + if (fd < 0) + return printmsg (1, "mkstemp"); + + if (unlink (fname)) + return printmsg (1, "unlink"); + + if (ftruncate64 (fd, size)) + return printmsg (0, "ftruncate64"); + + addr = mmap (NULL, length, prot, flags, fd, offset); + if (MAP_FAILED == addr) + return printmsg (1, "mmap"); + + /* This memcpy is likely to SIGBUS if mmap has messed up with offset. */ + memcpy (addr, fname, sizeof (fname)); + + return 0; +} + +#define TEST_FUNCTION do_test_bz18877 () +#include "../test-skeleton.c" diff --git a/sysdeps/unix/sysv/linux/generic/wordsize-32/mmap.c b/sysdeps/unix/sysv/linux/generic/wordsize-32/mmap.c index 24835ce..f5a4c32 100644 --- a/sysdeps/unix/sysv/linux/generic/wordsize-32/mmap.c +++ b/sysdeps/unix/sysv/linux/generic/wordsize-32/mmap.c @@ -21,20 +21,20 @@ #include <errno.h> #include <sysdep.h> -#ifndef MMAP_PAGE_SHIFT -#define MMAP_PAGE_SHIFT 12 +#ifndef MMAP_PAGE_UNIT +#define MMAP_PAGE_UNIT 4096UL #endif __ptr_t __mmap (__ptr_t addr, size_t len, int prot, int flags, int fd, off_t offset) { - if (offset & ((1 << MMAP_PAGE_SHIFT) - 1)) + if (offset & (MMAP_PAGE_UNIT - 1)) { __set_errno (EINVAL); return MAP_FAILED; } return (__ptr_t) INLINE_SYSCALL (mmap2, 6, addr, len, prot, flags, fd, - offset >> MMAP_PAGE_SHIFT); + offset/MMAP_PAGE_UNIT); } weak_alias (__mmap, mmap)