diff mbox

Fix wordsize-32 mmap offset for negative value (BZ#18877)

Message ID 55DF6F59.3000404@linaro.org
State New
Headers show

Commit Message

Adhemerval Zanella Netto Aug. 27, 2015, 8:13 p.m. UTC
This patch fixes the default wordsize-32 mmap implementation offset
calculation for negative values.  Current code uses signed shift
operation to calculate the multiple size to use with syscall and
it is implementation defined.  Change it to use a division base
on mmap page size (default being as before, 4096).

Tested on armv7hf.

       [BZ #18877]
       *  misc/Makefile (tests): Add tst-mmap
       * sysdeps/unix/sysv/linux/generic/wordsize-32/mmap.c (__mmap): Fix
       offset calculation for negative values.
       * misc/tst-mmap.c: New file.

--


--

Comments

Dmitry V. Levin Aug. 27, 2015, 8:51 p.m. UTC | #1
On Thu, Aug 27, 2015 at 05:13:13PM -0300, Adhemerval Zanella wrote:
> This patch fixes the default wordsize-32 mmap implementation offset
> calculation for negative values.  Current code uses signed shift
> operation to calculate the multiple size to use with syscall and
> it is implementation defined.  Change it to use a division base
> on mmap page size (default being as before, 4096).
> 
> Tested on armv7hf.
> 
>        [BZ #18877]
>        *  misc/Makefile (tests): Add tst-mmap
>        * sysdeps/unix/sysv/linux/generic/wordsize-32/mmap.c (__mmap): Fix
>        offset calculation for negative values.
>        * misc/tst-mmap.c: New file.

As we have posix/tst-mmap.c already, my idea was to call it
tst-mmap-offset.c,

> +  char fname[] = "tst-mmap-offset-XXXXXX";

hence the name of this temporary file.

> -                                   offset >> MMAP_PAGE_SHIFT);
> +                                   offset/MMAP_PAGE_UNIT);

offset / MMAP_PAGE_UNIT
Joseph Myers Jan. 20, 2016, 9:44 p.m. UTC | #2
I'm looking at failures of this new test for MIPS, but am puzzled by the 
descriptions of this issue in terms of things being incorrect and fixed 
without any statement of what is actually correct or why that is correct.

The expectation in the test seems to be that negative offsets passed to 
mmap should be interpreted as large unsigned values.  But I can't find 
anything in POSIX, or in the glibc manual, or in the Linux man-pages 
collection, to justify those semantics.  So what is the basis for that 
interpretation as part of the glibc API (and for its being compatible with 
POSIX), rather than negative arguments being considered invalid?

(The MIPS mmap syscall uses a signed argument and does a signed arithmetic 
shift on it, meaning that matching the MIPS semantics to other 
architectures will involve using mmap2 for o32 and making mmap not an 
alias for mmap64 for n32 because mmap for n32 will need to zero-extend its 
offset argument before calling the syscall.)
Adhemerval Zanella Netto Jan. 21, 2016, 1:37 p.m. UTC | #3
On 20-01-2016 19:44, Joseph Myers wrote:
> I'm looking at failures of this new test for MIPS, but am puzzled by the 
> descriptions of this issue in terms of things being incorrect and fixed 
> without any statement of what is actually correct or why that is correct.
> 
> The expectation in the test seems to be that negative offsets passed to 
> mmap should be interpreted as large unsigned values.  But I can't find 
> anything in POSIX, or in the glibc manual, or in the Linux man-pages 
> collection, to justify those semantics.  So what is the basis for that 
> interpretation as part of the glibc API (and for its being compatible with 
> POSIX), rather than negative arguments being considered invalid?
> 

You are correct, I bluntly used the bz report testcase as the one for the
patch.  And checking both Linux and POSIX negative offset are not really
specified, so the issue at BZ#18877 can't really being considered a regression
since the offset usage is undefined behaviour.  The fix itself on the
wordsize-32 I still consider valid since it was using a shift arithmetic
signed is implementation-defined behaviour (and the divide operation is
assumed to not be implementation-defined).

About the test I think we should just remove it and state that invalid
offsets are undefined or architecture define.

> (The MIPS mmap syscall uses a signed argument and does a signed arithmetic 
> shift on it, meaning that matching the MIPS semantics to other 
> architectures will involve using mmap2 for o32 and making mmap not an 
> alias for mmap64 for n32 because mmap for n32 will need to zero-extend its 
> offset argument before calling the syscall.)

I think this can be a future cleanup to simplify implementations.
Joseph Myers Jan. 21, 2016, 2:12 p.m. UTC | #4
On Thu, 21 Jan 2016, Adhemerval Zanella wrote:

> You are correct, I bluntly used the bz report testcase as the one for the
> patch.  And checking both Linux and POSIX negative offset are not really
> specified, so the issue at BZ#18877 can't really being considered a regression
> since the offset usage is undefined behaviour.  The fix itself on the
> wordsize-32 I still consider valid since it was using a shift arithmetic
> signed is implementation-defined behaviour (and the divide operation is
> assumed to not be implementation-defined).

The shift is fully defined in GNU C, which is what is used by glibc.  The 
actual fix isn't the use of the divide (if the low bits are zero, signed 
division and signed arithmetic right shift have the same effect) - it's 
that you're dividing by an *unsigned* quantity whose type is such that C 
type promotion rules cause the signed offset to be converted to unsigned 
before the division (and converting to an unsigned type before shifting 
right would have just the same effect).

> About the test I think we should just remove it and state that invalid
> offsets are undefined or architecture define.

I don't think saying they are architecture-defined makes sense - in 
general the glibc API should be consistent between architectures (and we 
test for the glibc API, not just for the POSIX API).  And if it's 
undefined (as opposed to invalid and requiring an error), again, what is 
the basis for that in POSIX?
Adhemerval Zanella Netto Jan. 21, 2016, 2:19 p.m. UTC | #5
On 21-01-2016 12:12, Joseph Myers wrote:
> On Thu, 21 Jan 2016, Adhemerval Zanella wrote:
> 
>> You are correct, I bluntly used the bz report testcase as the one for the
>> patch.  And checking both Linux and POSIX negative offset are not really
>> specified, so the issue at BZ#18877 can't really being considered a regression
>> since the offset usage is undefined behaviour.  The fix itself on the
>> wordsize-32 I still consider valid since it was using a shift arithmetic
>> signed is implementation-defined behaviour (and the divide operation is
>> assumed to not be implementation-defined).
> 
> The shift is fully defined in GNU C, which is what is used by glibc.  The 
> actual fix isn't the use of the divide (if the low bits are zero, signed 
> division and signed arithmetic right shift have the same effect) - it's 
> that you're dividing by an *unsigned* quantity whose type is such that C 
> type promotion rules cause the signed offset to be converted to unsigned 
> before the division (and converting to an unsigned type before shifting 
> right would have just the same effect).

Right I oversee the unsigned promotion rule, so seems the right signed
shift should be safe.

> 
>> About the test I think we should just remove it and state that invalid
>> offsets are undefined or architecture define.
> 
> I don't think saying they are architecture-defined makes sense - in 
> general the glibc API should be consistent between architectures (and we 
> test for the glibc API, not just for the POSIX API).  And if it's 
> undefined (as opposed to invalid and requiring an error), again, what is 
> the basis for that in POSIX?
> 

OK I am open to suggestion in such case to how to handle negative offsets.
Joseph Myers Jan. 28, 2016, 5:45 p.m. UTC | #6
Does anyone else have comments on my question 
<https://sourceware.org/ml/libc-alpha/2016-01/msg00534.html> of what the 
desired architecture-independent mmap semantics should be for negative 
offsets?
Dmitry V. Levin Jan. 28, 2016, 7:43 p.m. UTC | #7
On Wed, Jan 20, 2016 at 09:44:26PM +0000, Joseph Myers wrote:
> I'm looking at failures of this new test for MIPS, but am puzzled by the 
> descriptions of this issue in terms of things being incorrect and fixed 
> without any statement of what is actually correct or why that is correct.
> 
> The expectation in the test seems to be that negative offsets passed to 
> mmap should be interpreted as large unsigned values.  But I can't find 
> anything in POSIX, or in the glibc manual, or in the Linux man-pages 
> collection, to justify those semantics.  So what is the basis for that 
> interpretation as part of the glibc API (and for its being compatible with 
> POSIX), rather than negative arguments being considered invalid?

Is there any 32-bit kernel currently supported by glibc where underlying
system call doesn't operate with 4K or pagesize units for offsets?
Adhemerval Zanella Netto Jan. 28, 2016, 7:50 p.m. UTC | #8
On 28-01-2016 15:45, Joseph Myers wrote:
> Does anyone else have comments on my question 
> <https://sourceware.org/ml/libc-alpha/2016-01/msg00534.html> of what the 
> desired architecture-independent mmap semantics should be for negative 
> offsets?
> 

I recheck Linux handling of mmap syscall and I would say we should handle
it an unsigned integer since it is what general mmap implementation is
being based current (mmap_pgoff - mm/mmap.c).

The problem is some architecture implements mmap (either default or
for compatibility) as signed and others as unsigned.  As far I could
check mmap2 is handled as unsigned for all the ports.

Now from GLIBC side I see we need to start use mmap2 where is possible
for the ports that already are not using it (mips for instance). The
problem is ports that only provides mmap with signed offset and for
such cases I am not sure how we will enforce architecture-independent
semantics for negative offset.
diff mbox

Patch

diff --git a/misc/Makefile b/misc/Makefile
index aecb0da..2929d61 100644
--- a/misc/Makefile
+++ b/misc/Makefile
@@ -76,7 +76,8 @@  install-lib := libg.a
 gpl2lgpl := error.c error.h
 
 tests := tst-dirname tst-tsearch tst-fdset tst-efgcvt tst-mntent tst-hsearch \
-	 tst-error1 tst-pselect tst-insremque tst-mntent2 bug-hsearch1
+	 tst-error1 tst-pselect tst-insremque tst-mntent2 bug-hsearch1 \
+	 tst-mmap
 ifeq ($(run-built-tests),yes)
 tests-special += $(objpfx)tst-error1-mem.out
 endif
diff --git a/misc/tst-mmap.c b/misc/tst-mmap.c
new file mode 100644
index 0000000..8b006b5
--- /dev/null
+++ b/misc/tst-mmap.c
@@ -0,0 +1,67 @@ 
+/* mmap offset test.
+
+   Copyright (C) 2015 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <unistd.h>
+#include <sys/mman.h>
+
+static int
+printmsg (int rc, const char *msg)
+{
+  printf ("%s failed: %m\n", msg);
+  return rc;
+}
+
+/* Check if negative offset are handled correctly by mmap.  */;
+static int
+do_test_bz18877 (void)
+{
+  const int prot = PROT_READ | PROT_WRITE;
+  const int flags = MAP_SHARED;
+  const unsigned long length = 0x10000;
+  const unsigned long offset = 0xace00000;
+  const unsigned long size = offset + length;
+  void *addr;
+  int fd;
+  char fname[] = "tst-mmap-offset-XXXXXX";
+
+  fd = mkstemp64 (fname);
+  if (fd < 0)
+    return printmsg (1, "mkstemp");
+
+  if (unlink (fname))
+    return printmsg (1, "unlink");
+
+  if (ftruncate64 (fd, size))
+    return printmsg (0, "ftruncate64");
+
+  addr = mmap (NULL, length, prot, flags, fd, offset);
+  if (MAP_FAILED == addr)
+    return printmsg (1, "mmap");
+
+  /* This memcpy is likely to SIGBUS if mmap has messed up with offset.  */
+  memcpy (addr, fname, sizeof (fname));
+
+  return 0;
+}
+
+#define TEST_FUNCTION do_test_bz18877 ()
+#include "../test-skeleton.c"
diff --git a/sysdeps/unix/sysv/linux/generic/wordsize-32/mmap.c b/sysdeps/unix/sysv/linux/generic/wordsize-32/mmap.c
index 24835ce..f5a4c32 100644
--- a/sysdeps/unix/sysv/linux/generic/wordsize-32/mmap.c
+++ b/sysdeps/unix/sysv/linux/generic/wordsize-32/mmap.c
@@ -21,20 +21,20 @@ 
 #include <errno.h>
 #include <sysdep.h>
 
-#ifndef MMAP_PAGE_SHIFT
-#define MMAP_PAGE_SHIFT 12
+#ifndef MMAP_PAGE_UNIT
+#define MMAP_PAGE_UNIT 4096UL
 #endif
 
 __ptr_t
 __mmap (__ptr_t addr, size_t len, int prot, int flags, int fd, off_t offset)
 {
-  if (offset & ((1 << MMAP_PAGE_SHIFT) - 1))
+  if (offset & (MMAP_PAGE_UNIT - 1))
     {
       __set_errno (EINVAL);
       return MAP_FAILED;
     }
   return (__ptr_t) INLINE_SYSCALL (mmap2, 6, addr, len, prot, flags, fd,
-                                   offset >> MMAP_PAGE_SHIFT);
+                                   offset/MMAP_PAGE_UNIT);
 }
 
 weak_alias (__mmap, mmap)