diff mbox series

[v1] move_pages04: check for "invalid area", "no page mapped" and "shared zero page mapped"

Message ID 20240829141002.1962303-1-david@redhat.com
State New
Headers show
Series [v1] move_pages04: check for "invalid area", "no page mapped" and "shared zero page mapped" | expand

Commit Message

David Hildenbrand Aug. 29, 2024, 2:10 p.m. UTC
While the kernel commit d899844e9c98 ("mm: fix status code which
move_pages() returns for zero page") fixed the return value when the
shared zero page was encountered to match what was state in the man page,
it unfortunately also changed the behavior when no page is mapped yet --
when no page was faulted in/populated on demand.

Then, this test started failing, and we thought we would be testing for
the "zero page" case, but actually we were testing for the "no page mapped"
case, and didn't realize that the kernel commit had unintended side
effects.

As we are changing the behavior back to return "-ENOENT" in the kernel
for the "no page mapped" case, while still keeping the "shared zero
page" case to return "-EFAULT" the test starts failing again ...

The man page clearly spells out that the expectation for the zero page is
"-EFAULT", and that "-EFAULT" can also be returned if "the memory area is
not mapped by the process" -- which means that there is no VMA/mmap()
covering that address.

The man page also documents that "-ENOENT" is returned when "The page is
not present", which includes "there is nothing mapped".

So let's fix the test and properly testing for all three separate things:
"invalid area/page", "no page mapped" and "shared zero page mapped">

Cc: Cyril Hrubis <chrubis@suse.cz>
Signed-off-by: David Hildenbrand <david@redhat.com>
---

The kernel change[1] that will revert to the documented behavior -- return
-ENOENT when no page is mapped yet -- is not upstream yet, but the
assumption is that it will go upstream in the next merge window, to land
in v6.12.

Consequently, this test will now fail (as expected) on kernels between
v4.3 and current mainline.

We seemed to have "hacked" the test to make kernels < 4.3 still pass,
even though they were handling zero pages the wrong way.

Should we add a similar "hack" for kernels >= 4.3 up to the one where
the kernel behavior will change? (likely v6.12)?

On current kernels:

#./move_pages04
move_pages04    1  TFAIL  :  move_pages04.c:184: status[1] is EFAULT, expected ENOENT
move_pages04    2  TPASS  :  status[2] has expected value
move_pages04    3  TPASS  :  status[3] has expected value

On kernels with [1]:

# ./move_pages04
move_pages04    1  TPASS  :  status[1] has expected value
move_pages04    2  TPASS  :  status[2] has expected value
move_pages04    3  TPASS  :  status[3] has expected value


Note that I tried converting this test to the new API, but the shared
code move_pages_support.c also till uses the old API, so it's not
as straight-forward as it seems.

[1 https://lkml.kernel.org/r/20240802155524.517137-5-david@redhat.com

---
 .../kernel/syscalls/move_pages/move_pages04.c | 104 ++++++++++++++----
 1 file changed, 83 insertions(+), 21 deletions(-)
diff mbox series

Patch

diff --git a/testcases/kernel/syscalls/move_pages/move_pages04.c b/testcases/kernel/syscalls/move_pages/move_pages04.c
index f53453ab4..7a20a1328 100644
--- a/testcases/kernel/syscalls/move_pages/move_pages04.c
+++ b/testcases/kernel/syscalls/move_pages/move_pages04.c
@@ -26,15 +26,24 @@ 
  *	move_pages04.c
  *
  * DESCRIPTION
- *      Failure when page does not exit.
+ *      Failure when the memory area is not valid, no page is mapped yet or
+ *      the shared zero page is mapped.
  *
  * ALGORITHM
  *
- *      1. Pass zero page (allocated, but not written to) as one of the
- *         page addresses to move_pages().
- *      2. Check if the corresponding status is set to:
+ *      1. Pass the address of a valid memory area where no where no page is
+ *         mapped yet (not read/written), the address of a valid memory area
+ *         where the shared zero page is mapped (read, but not written to)
+ *         and the address of an invalid memory area as page addresses to
+ *         move_pages().
+ *      2. Check if the corresponding status for "no page mapped" is set to
+ *         -ENOENT. Note that [1] changed the behavior to return -EFAULT by
+ *         accident.
+ *      3. Check if the corresponding status for "shared zero page" is set to:
  *         -ENOENT for kernels < 4.3
  *         -EFAULT for kernels >= 4.3 [1]
+ *      4. Check if the corresponding status for "invalid memory area" is set
+ *         to -EFAULT.
  *
  * [1]
  * d899844e9c98 "mm: fix status code which move_pages() returns for zero page"
@@ -64,10 +73,12 @@ 
 #include "test.h"
 #include "move_pages_support.h"
 
-#define TEST_PAGES 2
+#define TEST_PAGES 4
 #define TEST_NODES 2
 #define TOUCHED_PAGES 1
-#define UNTOUCHED_PAGE (TEST_PAGES - 1)
+#define NO_PAGE TOUCHED_PAGES
+#define ZERO_PAGE (NO_PAGE + 1)
+#define INVALID_PAGE (ZERO_PAGE + 1)
 
 void setup(void);
 void cleanup(void);
@@ -89,12 +100,12 @@  int main(int argc, char **argv)
 	int lc;
 	unsigned int from_node;
 	unsigned int to_node;
-	int ret, exp_status;
+	int ret, exp_zero_page_status;
 
 	if ((tst_kvercmp(4, 3, 0)) >= 0)
-		exp_status = -EFAULT;
+		exp_zero_page_status = -EFAULT;
 	else
-		exp_status = -ENOENT;
+		exp_zero_page_status = -ENOENT;
 
 	ret = get_allowed_nodes(NH_MEMS, 2, &from_node, &to_node);
 	if (ret < 0)
@@ -106,6 +117,7 @@  int main(int argc, char **argv)
 		int nodes[TEST_PAGES];
 		int status[TEST_PAGES];
 		unsigned long onepage = get_page_size();
+		char tmp;
 
 		/* reset tst_count in case we are looping */
 		tst_count = 0;
@@ -114,14 +126,44 @@  int main(int argc, char **argv)
 		if (ret == -1)
 			continue;
 
-		/* Allocate page and do not touch it. */
-		pages[UNTOUCHED_PAGE] = numa_alloc_onnode(onepage, from_node);
-		if (pages[UNTOUCHED_PAGE] == NULL) {
-			tst_resm(TBROK, "failed allocating page on node %d",
+		/*
+		 * Allocate memory and do not touch it. Consequently, no
+		 * page will be faulted in / mapped into the page tables.
+		 */
+		pages[NO_PAGE] = numa_alloc_onnode(onepage, from_node);
+		if (pages[NO_PAGE] == NULL) {
+			tst_resm(TBROK, "failed allocating memory on node %d",
 				 from_node);
 			goto err_free_pages;
 		}
 
+		/*
+		 * Allocate memory, read from it, but do not write to it. This
+		 * will populate the shared zeropage.
+		 */
+		pages[ZERO_PAGE] = numa_alloc_onnode(onepage, from_node);
+		if (pages[ZERO_PAGE] == NULL) {
+			tst_resm(TBROK, "failed allocating memory on node %d",
+				 from_node);
+			goto err_free_pages;
+		}
+		/* Make the compiler not optimize-out the read. */
+		tmp = *((char *)pages[ZERO_PAGE]);
+		asm volatile("" : "+r" (tmp));
+
+		/*
+		 * Temporarily allocate memory and free it immediately. Do this
+		 * as the last step so the area won't get reused before we're
+		 * done.
+		 */
+		pages[INVALID_PAGE] = numa_alloc_onnode(onepage, from_node);
+		if (pages[INVALID_PAGE] == NULL) {
+			tst_resm(TBROK, "failed allocating memory on node %d",
+				 from_node);
+			goto err_free_pages;
+		}
+		numa_free(pages[INVALID_PAGE], onepage);
+
 		for (i = 0; i < TEST_PAGES; i++)
 			nodes[i] = to_node;
 
@@ -135,20 +177,40 @@  int main(int argc, char **argv)
 			tst_resm(TINFO, "move_pages() returned %d", ret);
 		}
 
-		if (status[UNTOUCHED_PAGE] == exp_status) {
+		if (status[NO_PAGE] == -ENOENT) {
 			tst_resm(TPASS, "status[%d] has expected value",
-				 UNTOUCHED_PAGE);
+				 NO_PAGE);
 		} else {
 			tst_resm(TFAIL, "status[%d] is %s, expected %s",
-				UNTOUCHED_PAGE,
-				tst_strerrno(-status[UNTOUCHED_PAGE]),
-				tst_strerrno(-exp_status));
+				NO_PAGE,
+				tst_strerrno(-status[NO_PAGE]),
+				tst_strerrno(ENOENT));
+		}
+
+		if (status[ZERO_PAGE] == exp_zero_page_status) {
+			tst_resm(TPASS, "status[%d] has expected value",
+				 ZERO_PAGE);
+		} else {
+			tst_resm(TFAIL, "status[%d] is %s, expected %s",
+				ZERO_PAGE,
+				tst_strerrno(-status[ZERO_PAGE]),
+				tst_strerrno(-exp_zero_page_status));
+		}
+
+		if (status[INVALID_PAGE] == -EFAULT) {
+			tst_resm(TPASS, "status[%d] has expected value",
+				 INVALID_PAGE);
+		} else {
+			tst_resm(TFAIL, "status[%d] is %s, expected %s",
+				INVALID_PAGE,
+				tst_strerrno(-status[INVALID_PAGE]),
+				tst_strerrno(EFAULT));
 		}
 
 err_free_pages:
-		/* This is capable of freeing both the touched and
-		 * untouched pages.
-		 */
+		/* Memory for the invalid page was already freed. */
+		pages[INVALID_PAGE] = NULL;
+		/* This is capable of freeing all memory we allocated. */
 		free_pages(pages, TEST_PAGES);
 	}
 #else