Message ID | 20240829141002.1962303-1-david@redhat.com |
---|---|
State | Superseded |
Headers | show |
Series | [v1] move_pages04: check for "invalid area", "no page mapped" and "shared zero page mapped" | expand |
On Thu, Aug 29, 2024 at 4:10 PM David Hildenbrand <david@redhat.com> wrote: > > While the kernel commit d899844e9c98 ("mm: fix status code which > move_pages() returns for zero page") fixed the return value when the > shared zero page was encountered to match what was state in the man page, > it unfortunately also changed the behavior when no page is mapped yet -- > when no page was faulted in/populated on demand. > > Then, this test started failing, and we thought we would be testing for > the "zero page" case, but actually we were testing for the "no page mapped" > case, and didn't realize that the kernel commit had unintended side > effects. > > As we are changing the behavior back to return "-ENOENT" in the kernel > for the "no page mapped" case, while still keeping the "shared zero > page" case to return "-EFAULT" the test starts failing again ... > > The man page clearly spells out that the expectation for the zero page is > "-EFAULT", and that "-EFAULT" can also be returned if "the memory area is > not mapped by the process" -- which means that there is no VMA/mmap() > covering that address. > > The man page also documents that "-ENOENT" is returned when "The page is > not present", which includes "there is nothing mapped". > > So let's fix the test and properly testing for all three separate things: > "invalid area/page", "no page mapped" and "shared zero page mapped"> > > Cc: Cyril Hrubis <chrubis@suse.cz> > Signed-off-by: David Hildenbrand <david@redhat.com> > --- > > The kernel change[1] that will revert to the documented behavior -- return > -ENOENT when no page is mapped yet -- is not upstream yet, but the > assumption is that it will go upstream in the next merge window, to land > in v6.12. Thanks for patch, looking at the Linus' tree I think this landed already. > > Consequently, this test will now fail (as expected) on kernels between > v4.3 and current mainline. > > We seemed to have "hacked" the test to make kernels < 4.3 still pass, > even though they were handling zero pages the wrong way. > > Should we add a similar "hack" for kernels >= 4.3 up to the one where > the kernel behavior will change? (likely v6.12)? I'm leaning towards removing the "< 4.3 hack" (in follow-up patch), because on distros that do backports it's going to be even more confusing. > > On current kernels: > > #./move_pages04 > move_pages04 1 TFAIL : move_pages04.c:184: status[1] is EFAULT, expected ENOENT > move_pages04 2 TPASS : status[2] has expected value > move_pages04 3 TPASS : status[3] has expected value > > On kernels with [1]: > > # ./move_pages04 > move_pages04 1 TPASS : status[1] has expected value > move_pages04 2 TPASS : status[2] has expected value > move_pages04 3 TPASS : status[3] has expected value > > > Note that I tried converting this test to the new API, but the shared > code move_pages_support.c also till uses the old API, so it's not > as straight-forward as it seems. > > [1 https://lkml.kernel.org/r/20240802155524.517137-5-david@redhat.com > > --- > .../kernel/syscalls/move_pages/move_pages04.c | 104 ++++++++++++++---- > 1 file changed, 83 insertions(+), 21 deletions(-) > > diff --git a/testcases/kernel/syscalls/move_pages/move_pages04.c b/testcases/kernel/syscalls/move_pages/move_pages04.c > index f53453ab4..7a20a1328 100644 > --- a/testcases/kernel/syscalls/move_pages/move_pages04.c > +++ b/testcases/kernel/syscalls/move_pages/move_pages04.c > @@ -26,15 +26,24 @@ > * move_pages04.c > * > * DESCRIPTION > - * Failure when page does not exit. > + * Failure when the memory area is not valid, no page is mapped yet or > + * the shared zero page is mapped. > * > * ALGORITHM > * > - * 1. Pass zero page (allocated, but not written to) as one of the > - * page addresses to move_pages(). > - * 2. Check if the corresponding status is set to: > + * 1. Pass the address of a valid memory area where no where no page is > + * mapped yet (not read/written), the address of a valid memory area > + * where the shared zero page is mapped (read, but not written to) > + * and the address of an invalid memory area as page addresses to > + * move_pages(). > + * 2. Check if the corresponding status for "no page mapped" is set to > + * -ENOENT. Note that [1] changed the behavior to return -EFAULT by > + * accident. > + * 3. Check if the corresponding status for "shared zero page" is set to: > * -ENOENT for kernels < 4.3 > * -EFAULT for kernels >= 4.3 [1] > + * 4. Check if the corresponding status for "invalid memory area" is set > + * to -EFAULT. > * > * [1] > * d899844e9c98 "mm: fix status code which move_pages() returns for zero page" > @@ -64,10 +73,12 @@ > #include "test.h" > #include "move_pages_support.h" > > -#define TEST_PAGES 2 > +#define TEST_PAGES 4 > #define TEST_NODES 2 > #define TOUCHED_PAGES 1 > -#define UNTOUCHED_PAGE (TEST_PAGES - 1) > +#define NO_PAGE TOUCHED_PAGES > +#define ZERO_PAGE (NO_PAGE + 1) > +#define INVALID_PAGE (ZERO_PAGE + 1) > > void setup(void); > void cleanup(void); > @@ -89,12 +100,12 @@ int main(int argc, char **argv) > int lc; > unsigned int from_node; > unsigned int to_node; > - int ret, exp_status; > + int ret, exp_zero_page_status; > > if ((tst_kvercmp(4, 3, 0)) >= 0) > - exp_status = -EFAULT; > + exp_zero_page_status = -EFAULT; > else > - exp_status = -ENOENT; > + exp_zero_page_status = -ENOENT; > > ret = get_allowed_nodes(NH_MEMS, 2, &from_node, &to_node); > if (ret < 0) > @@ -106,6 +117,7 @@ int main(int argc, char **argv) > int nodes[TEST_PAGES]; > int status[TEST_PAGES]; > unsigned long onepage = get_page_size(); > + char tmp; > > /* reset tst_count in case we are looping */ > tst_count = 0; > @@ -114,14 +126,44 @@ int main(int argc, char **argv) > if (ret == -1) > continue; > > - /* Allocate page and do not touch it. */ > - pages[UNTOUCHED_PAGE] = numa_alloc_onnode(onepage, from_node); > - if (pages[UNTOUCHED_PAGE] == NULL) { > - tst_resm(TBROK, "failed allocating page on node %d", > + /* > + * Allocate memory and do not touch it. Consequently, no > + * page will be faulted in / mapped into the page tables. > + */ > + pages[NO_PAGE] = numa_alloc_onnode(onepage, from_node); > + if (pages[NO_PAGE] == NULL) { > + tst_resm(TBROK, "failed allocating memory on node %d", > from_node); > goto err_free_pages; > } > > + /* > + * Allocate memory, read from it, but do not write to it. This > + * will populate the shared zeropage. > + */ > + pages[ZERO_PAGE] = numa_alloc_onnode(onepage, from_node); > + if (pages[ZERO_PAGE] == NULL) { > + tst_resm(TBROK, "failed allocating memory on node %d", > + from_node); > + goto err_free_pages; > + } > + /* Make the compiler not optimize-out the read. */ > + tmp = *((char *)pages[ZERO_PAGE]); > + asm volatile("" : "+r" (tmp)); > + > + /* > + * Temporarily allocate memory and free it immediately. Do this > + * as the last step so the area won't get reused before we're > + * done. > + */ > + pages[INVALID_PAGE] = numa_alloc_onnode(onepage, from_node); > + if (pages[INVALID_PAGE] == NULL) { > + tst_resm(TBROK, "failed allocating memory on node %d", > + from_node); > + goto err_free_pages; > + } > + numa_free(pages[INVALID_PAGE], onepage); > + > for (i = 0; i < TEST_PAGES; i++) > nodes[i] = to_node; > > @@ -135,20 +177,40 @@ int main(int argc, char **argv) > tst_resm(TINFO, "move_pages() returned %d", ret); > } > > - if (status[UNTOUCHED_PAGE] == exp_status) { > + if (status[NO_PAGE] == -ENOENT) { > tst_resm(TPASS, "status[%d] has expected value", > - UNTOUCHED_PAGE); > + NO_PAGE); > } else { > tst_resm(TFAIL, "status[%d] is %s, expected %s", > - UNTOUCHED_PAGE, > - tst_strerrno(-status[UNTOUCHED_PAGE]), > - tst_strerrno(-exp_status)); > + NO_PAGE, > + tst_strerrno(-status[NO_PAGE]), > + tst_strerrno(ENOENT)); > + } > + > + if (status[ZERO_PAGE] == exp_zero_page_status) { > + tst_resm(TPASS, "status[%d] has expected value", > + ZERO_PAGE); > + } else { > + tst_resm(TFAIL, "status[%d] is %s, expected %s", > + ZERO_PAGE, > + tst_strerrno(-status[ZERO_PAGE]), > + tst_strerrno(-exp_zero_page_status)); > + } > + > + if (status[INVALID_PAGE] == -EFAULT) { > + tst_resm(TPASS, "status[%d] has expected value", > + INVALID_PAGE); > + } else { > + tst_resm(TFAIL, "status[%d] is %s, expected %s", > + INVALID_PAGE, > + tst_strerrno(-status[INVALID_PAGE]), > + tst_strerrno(EFAULT)); > } > > err_free_pages: > - /* This is capable of freeing both the touched and > - * untouched pages. > - */ > + /* Memory for the invalid page was already freed. */ > + pages[INVALID_PAGE] = NULL; > + /* This is capable of freeing all memory we allocated. */ > free_pages(pages, TEST_PAGES); > } > #else > -- > 2.46.0 > > > -- > Mailing list info: https://lists.linux.it/listinfo/ltp >
On 26.09.24 11:09, Jan Stancek wrote: > On Thu, Aug 29, 2024 at 4:10 PM David Hildenbrand <david@redhat.com> wrote: >> >> While the kernel commit d899844e9c98 ("mm: fix status code which >> move_pages() returns for zero page") fixed the return value when the >> shared zero page was encountered to match what was state in the man page, >> it unfortunately also changed the behavior when no page is mapped yet -- >> when no page was faulted in/populated on demand. >> >> Then, this test started failing, and we thought we would be testing for >> the "zero page" case, but actually we were testing for the "no page mapped" >> case, and didn't realize that the kernel commit had unintended side >> effects. >> >> As we are changing the behavior back to return "-ENOENT" in the kernel >> for the "no page mapped" case, while still keeping the "shared zero >> page" case to return "-EFAULT" the test starts failing again ... >> >> The man page clearly spells out that the expectation for the zero page is >> "-EFAULT", and that "-EFAULT" can also be returned if "the memory area is >> not mapped by the process" -- which means that there is no VMA/mmap() >> covering that address. >> >> The man page also documents that "-ENOENT" is returned when "The page is >> not present", which includes "there is nothing mapped". >> >> So let's fix the test and properly testing for all three separate things: >> "invalid area/page", "no page mapped" and "shared zero page mapped"> >> >> Cc: Cyril Hrubis <chrubis@suse.cz> >> Signed-off-by: David Hildenbrand <david@redhat.com> >> --- >> >> The kernel change[1] that will revert to the documented behavior -- return >> -ENOENT when no page is mapped yet -- is not upstream yet, but the >> assumption is that it will go upstream in the next merge window, to land >> in v6.12. > > Thanks for patch, looking at the Linus' tree I think this landed already. Yes, it's upstream. > >> >> Consequently, this test will now fail (as expected) on kernels between >> v4.3 and current mainline. >> >> We seemed to have "hacked" the test to make kernels < 4.3 still pass, >> even though they were handling zero pages the wrong way. >> >> Should we add a similar "hack" for kernels >= 4.3 up to the one where >> the kernel behavior will change? (likely v6.12)? > > I'm leaning towards removing the "< 4.3 hack" (in follow-up patch), because > on distros that do backports it's going to be even more confusing. Makes sense, so we would really test against the documented+expected behavior. I will resend with: (1) This patch, including the proper patch description (2) A patch removing the < 4.3 hack (3) A patch to convert this test to the new API Sounds good?
On Thu, Sep 26, 2024 at 6:51 PM David Hildenbrand <david@redhat.com> wrote: > > On 26.09.24 11:09, Jan Stancek wrote: > > On Thu, Aug 29, 2024 at 4:10 PM David Hildenbrand <david@redhat.com> wrote: > >> > >> While the kernel commit d899844e9c98 ("mm: fix status code which > >> move_pages() returns for zero page") fixed the return value when the > >> shared zero page was encountered to match what was state in the man page, > >> it unfortunately also changed the behavior when no page is mapped yet -- > >> when no page was faulted in/populated on demand. > >> > >> Then, this test started failing, and we thought we would be testing for > >> the "zero page" case, but actually we were testing for the "no page mapped" > >> case, and didn't realize that the kernel commit had unintended side > >> effects. > >> > >> As we are changing the behavior back to return "-ENOENT" in the kernel > >> for the "no page mapped" case, while still keeping the "shared zero > >> page" case to return "-EFAULT" the test starts failing again ... > >> > >> The man page clearly spells out that the expectation for the zero page is > >> "-EFAULT", and that "-EFAULT" can also be returned if "the memory area is > >> not mapped by the process" -- which means that there is no VMA/mmap() > >> covering that address. > >> > >> The man page also documents that "-ENOENT" is returned when "The page is > >> not present", which includes "there is nothing mapped". > >> > >> So let's fix the test and properly testing for all three separate things: > >> "invalid area/page", "no page mapped" and "shared zero page mapped"> > >> > >> Cc: Cyril Hrubis <chrubis@suse.cz> > >> Signed-off-by: David Hildenbrand <david@redhat.com> > >> --- > >> > >> The kernel change[1] that will revert to the documented behavior -- return > >> -ENOENT when no page is mapped yet -- is not upstream yet, but the > >> assumption is that it will go upstream in the next merge window, to land > >> in v6.12. > > > > Thanks for patch, looking at the Linus' tree I think this landed already. > > Yes, it's upstream. > > > > >> > >> Consequently, this test will now fail (as expected) on kernels between > >> v4.3 and current mainline. > >> > >> We seemed to have "hacked" the test to make kernels < 4.3 still pass, > >> even though they were handling zero pages the wrong way. > >> > >> Should we add a similar "hack" for kernels >= 4.3 up to the one where > >> the kernel behavior will change? (likely v6.12)? > > > > I'm leaning towards removing the "< 4.3 hack" (in follow-up patch), because > > on distros that do backports it's going to be even more confusing. > > Makes sense, so we would really test against the documented+expected > behavior. > > I will resend with: > > (1) This patch, including the proper patch description > (2) A patch removing the < 4.3 hack > (3) A patch to convert this test to the new API > > > Sounds good? Sounds good to me.
diff --git a/testcases/kernel/syscalls/move_pages/move_pages04.c b/testcases/kernel/syscalls/move_pages/move_pages04.c index f53453ab4..7a20a1328 100644 --- a/testcases/kernel/syscalls/move_pages/move_pages04.c +++ b/testcases/kernel/syscalls/move_pages/move_pages04.c @@ -26,15 +26,24 @@ * move_pages04.c * * DESCRIPTION - * Failure when page does not exit. + * Failure when the memory area is not valid, no page is mapped yet or + * the shared zero page is mapped. * * ALGORITHM * - * 1. Pass zero page (allocated, but not written to) as one of the - * page addresses to move_pages(). - * 2. Check if the corresponding status is set to: + * 1. Pass the address of a valid memory area where no where no page is + * mapped yet (not read/written), the address of a valid memory area + * where the shared zero page is mapped (read, but not written to) + * and the address of an invalid memory area as page addresses to + * move_pages(). + * 2. Check if the corresponding status for "no page mapped" is set to + * -ENOENT. Note that [1] changed the behavior to return -EFAULT by + * accident. + * 3. Check if the corresponding status for "shared zero page" is set to: * -ENOENT for kernels < 4.3 * -EFAULT for kernels >= 4.3 [1] + * 4. Check if the corresponding status for "invalid memory area" is set + * to -EFAULT. * * [1] * d899844e9c98 "mm: fix status code which move_pages() returns for zero page" @@ -64,10 +73,12 @@ #include "test.h" #include "move_pages_support.h" -#define TEST_PAGES 2 +#define TEST_PAGES 4 #define TEST_NODES 2 #define TOUCHED_PAGES 1 -#define UNTOUCHED_PAGE (TEST_PAGES - 1) +#define NO_PAGE TOUCHED_PAGES +#define ZERO_PAGE (NO_PAGE + 1) +#define INVALID_PAGE (ZERO_PAGE + 1) void setup(void); void cleanup(void); @@ -89,12 +100,12 @@ int main(int argc, char **argv) int lc; unsigned int from_node; unsigned int to_node; - int ret, exp_status; + int ret, exp_zero_page_status; if ((tst_kvercmp(4, 3, 0)) >= 0) - exp_status = -EFAULT; + exp_zero_page_status = -EFAULT; else - exp_status = -ENOENT; + exp_zero_page_status = -ENOENT; ret = get_allowed_nodes(NH_MEMS, 2, &from_node, &to_node); if (ret < 0) @@ -106,6 +117,7 @@ int main(int argc, char **argv) int nodes[TEST_PAGES]; int status[TEST_PAGES]; unsigned long onepage = get_page_size(); + char tmp; /* reset tst_count in case we are looping */ tst_count = 0; @@ -114,14 +126,44 @@ int main(int argc, char **argv) if (ret == -1) continue; - /* Allocate page and do not touch it. */ - pages[UNTOUCHED_PAGE] = numa_alloc_onnode(onepage, from_node); - if (pages[UNTOUCHED_PAGE] == NULL) { - tst_resm(TBROK, "failed allocating page on node %d", + /* + * Allocate memory and do not touch it. Consequently, no + * page will be faulted in / mapped into the page tables. + */ + pages[NO_PAGE] = numa_alloc_onnode(onepage, from_node); + if (pages[NO_PAGE] == NULL) { + tst_resm(TBROK, "failed allocating memory on node %d", from_node); goto err_free_pages; } + /* + * Allocate memory, read from it, but do not write to it. This + * will populate the shared zeropage. + */ + pages[ZERO_PAGE] = numa_alloc_onnode(onepage, from_node); + if (pages[ZERO_PAGE] == NULL) { + tst_resm(TBROK, "failed allocating memory on node %d", + from_node); + goto err_free_pages; + } + /* Make the compiler not optimize-out the read. */ + tmp = *((char *)pages[ZERO_PAGE]); + asm volatile("" : "+r" (tmp)); + + /* + * Temporarily allocate memory and free it immediately. Do this + * as the last step so the area won't get reused before we're + * done. + */ + pages[INVALID_PAGE] = numa_alloc_onnode(onepage, from_node); + if (pages[INVALID_PAGE] == NULL) { + tst_resm(TBROK, "failed allocating memory on node %d", + from_node); + goto err_free_pages; + } + numa_free(pages[INVALID_PAGE], onepage); + for (i = 0; i < TEST_PAGES; i++) nodes[i] = to_node; @@ -135,20 +177,40 @@ int main(int argc, char **argv) tst_resm(TINFO, "move_pages() returned %d", ret); } - if (status[UNTOUCHED_PAGE] == exp_status) { + if (status[NO_PAGE] == -ENOENT) { tst_resm(TPASS, "status[%d] has expected value", - UNTOUCHED_PAGE); + NO_PAGE); } else { tst_resm(TFAIL, "status[%d] is %s, expected %s", - UNTOUCHED_PAGE, - tst_strerrno(-status[UNTOUCHED_PAGE]), - tst_strerrno(-exp_status)); + NO_PAGE, + tst_strerrno(-status[NO_PAGE]), + tst_strerrno(ENOENT)); + } + + if (status[ZERO_PAGE] == exp_zero_page_status) { + tst_resm(TPASS, "status[%d] has expected value", + ZERO_PAGE); + } else { + tst_resm(TFAIL, "status[%d] is %s, expected %s", + ZERO_PAGE, + tst_strerrno(-status[ZERO_PAGE]), + tst_strerrno(-exp_zero_page_status)); + } + + if (status[INVALID_PAGE] == -EFAULT) { + tst_resm(TPASS, "status[%d] has expected value", + INVALID_PAGE); + } else { + tst_resm(TFAIL, "status[%d] is %s, expected %s", + INVALID_PAGE, + tst_strerrno(-status[INVALID_PAGE]), + tst_strerrno(EFAULT)); } err_free_pages: - /* This is capable of freeing both the touched and - * untouched pages. - */ + /* Memory for the invalid page was already freed. */ + pages[INVALID_PAGE] = NULL; + /* This is capable of freeing all memory we allocated. */ free_pages(pages, TEST_PAGES); } #else
While the kernel commit d899844e9c98 ("mm: fix status code which move_pages() returns for zero page") fixed the return value when the shared zero page was encountered to match what was state in the man page, it unfortunately also changed the behavior when no page is mapped yet -- when no page was faulted in/populated on demand. Then, this test started failing, and we thought we would be testing for the "zero page" case, but actually we were testing for the "no page mapped" case, and didn't realize that the kernel commit had unintended side effects. As we are changing the behavior back to return "-ENOENT" in the kernel for the "no page mapped" case, while still keeping the "shared zero page" case to return "-EFAULT" the test starts failing again ... The man page clearly spells out that the expectation for the zero page is "-EFAULT", and that "-EFAULT" can also be returned if "the memory area is not mapped by the process" -- which means that there is no VMA/mmap() covering that address. The man page also documents that "-ENOENT" is returned when "The page is not present", which includes "there is nothing mapped". So let's fix the test and properly testing for all three separate things: "invalid area/page", "no page mapped" and "shared zero page mapped"> Cc: Cyril Hrubis <chrubis@suse.cz> Signed-off-by: David Hildenbrand <david@redhat.com> --- The kernel change[1] that will revert to the documented behavior -- return -ENOENT when no page is mapped yet -- is not upstream yet, but the assumption is that it will go upstream in the next merge window, to land in v6.12. Consequently, this test will now fail (as expected) on kernels between v4.3 and current mainline. We seemed to have "hacked" the test to make kernels < 4.3 still pass, even though they were handling zero pages the wrong way. Should we add a similar "hack" for kernels >= 4.3 up to the one where the kernel behavior will change? (likely v6.12)? On current kernels: #./move_pages04 move_pages04 1 TFAIL : move_pages04.c:184: status[1] is EFAULT, expected ENOENT move_pages04 2 TPASS : status[2] has expected value move_pages04 3 TPASS : status[3] has expected value On kernels with [1]: # ./move_pages04 move_pages04 1 TPASS : status[1] has expected value move_pages04 2 TPASS : status[2] has expected value move_pages04 3 TPASS : status[3] has expected value Note that I tried converting this test to the new API, but the shared code move_pages_support.c also till uses the old API, so it's not as straight-forward as it seems. [1 https://lkml.kernel.org/r/20240802155524.517137-5-david@redhat.com --- .../kernel/syscalls/move_pages/move_pages04.c | 104 ++++++++++++++---- 1 file changed, 83 insertions(+), 21 deletions(-)