Message ID | CA+B+MYQPoqtKrnTiVsqyJEvo1_=r3-EJqCQT+RxqfaSG0xXgYw@mail.gmail.com |
---|---|
State | Changes Requested |
Headers | show |
Series | memcg/memcontrol04: Fix the judgment error in test_memcg_low() | expand |
Hi! Thanks for checking this test, but did you take a look at the explanation given in the test "tags" ? -> https://bugzilla.suse.com/show_bug.cgi?id=1196298 Regards, Andrea On 11/21/24 04:05, Jin Guojie wrote: > When running memcontrol04, TFAIL results will appear on various Linux > distributions, kernel versions, and CPUs: > > (1) Test platform > > * Linux distribution: Ubuntu 24.10 > * CPU: X86_64, Arm64 > * Kernel: 6.6 longterm > * glibc: 2.40 > * LTP version: commit ec4161186e5, Oct 24 12:18:17 2024 > > (2) Error logs > > During the operation of memcontrol04, file systems such as ext2, ext3, > ext4, xfs, ntfs, and vfat will be tested. > For any of the file system, the same TFAIL result will appear: > > root@vm:~/ltp/testcases/kernel/controllers/memcg# ./memcontrol04 > > tst_test.c:1823: TINFO: === Testing on ext2 === > memcontrol04.c:208: TPASS: Expect: (C oom events=0) == 0 > memcontrol04.c:211: TPASS: Expect: (C low events=437) > 0 > memcontrol04.c:208: TPASS: Expect: (D oom events=0) == 0 > memcontrol04.c:211: TPASS: Expect: (D low events=437) > 0 > memcontrol04.c:208: TPASS: Expect: (E oom events=0) == 0 > memcontrol04.c:214: TPASS: Expect: (E low events=0) == 0 > memcontrol04.c:208: TPASS: Expect: (F oom events=0) == 0 > memcontrol04.c:214: TFAIL: Expect: (F low events=412) == 0 > > tst_test.c:1823: TINFO: === Testing on ext3 === > memcontrol04.c:208: TPASS: Expect: (C oom events=0) == 0 > memcontrol04.c:211: TPASS: Expect: (C low events=437) > 0 > memcontrol04.c:208: TPASS: Expect: (D oom events=0) == 0 > memcontrol04.c:211: TPASS: Expect: (D low events=437) > 0 > memcontrol04.c:208: TPASS: Expect: (E oom events=0) == 0 > memcontrol04.c:214: TPASS: Expect: (E low events=0) == 0 > memcontrol04.c:208: TPASS: Expect: (F oom events=0) == 0 > memcontrol04.c:214: TFAIL: Expect: (F low events=411) == 0 > > ...... > > Summary: > passed 55 > failed 5 > broken 0 > skipped 0 > warnings 0 > > It looks like there is an error in the processing logic of cgroup F. > > (3) Cause analysis > > In the test_memcg_low() function, 4 subgroups (C, D, E, F) are created under B, > and 50MB pagecache is allocated in C, D, and F. Therefore, when checking whether > it is successful at the end, only E should be judged to have low_events==0, > and the judgment conditions for all other subgroups should be low_events > 0. > > (4) Fix issure > > #1209 > https://github.com/linux-test-project/ltp/issues/1209 > > Signed-off-by: Jin Guojie <guojie.jin@gmail.com> > > --- > testcases/kernel/controllers/memcg/memcontrol04.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/testcases/kernel/controllers/memcg/memcontrol04.c > b/testcases/kernel/controllers/memcg/memcontrol04.c > index 1b8d115f8..0dddb7449 100644 > --- a/testcases/kernel/controllers/memcg/memcontrol04.c > +++ b/testcases/kernel/controllers/memcg/memcontrol04.c > @@ -207,7 +207,7 @@ static void test_memcg_low(void) > > TST_EXP_EXPR(oom == 0, "(%c oom events=%ld) == 0", id, oom); > > - if (i < E) { > + if (i != E) { > TST_EXP_EXPR(low > 0, > "(%c low events=%ld) > 0", id, low); > } else { > -- > 2.45.2 >
When I visit https://bugzilla.suse.com/show_bug.cgi?id=1196298, an error occurs showing that "You are not authorized to access bug #1196298." I tried to register an account by myself, but the error did not change. So far, I still can't view the link. Please see if there is a way to give me permission to view this bug. Or, could you please provide me the explanation of this bug in the email? On Thu, Nov 21, 2024 at 5:44 PM Andrea Cervesato <andrea.cervesato@suse.com> wrote: > > Hi! > > Thanks for checking this test, but did you take a look at the > explanation given in the test "tags" ? -> > https://bugzilla.suse.com/show_bug.cgi?id=1196298 > > Regards, > Andrea > > On 11/21/24 04:05, Jin Guojie wrote: > > When running memcontrol04, TFAIL results will appear on various Linux > > distributions, kernel versions, and CPUs: > > > > (1) Test platform > > > > * Linux distribution: Ubuntu 24.10 > > * CPU: X86_64, Arm64 > > * Kernel: 6.6 longterm > > * glibc: 2.40 > > * LTP version: commit ec4161186e5, Oct 24 12:18:17 2024 > > > > (2) Error logs > > > > During the operation of memcontrol04, file systems such as ext2, ext3, > > ext4, xfs, ntfs, and vfat will be tested. > > For any of the file system, the same TFAIL result will appear: > > > > root@vm:~/ltp/testcases/kernel/controllers/memcg# ./memcontrol04 > > > > tst_test.c:1823: TINFO: === Testing on ext2 === > > memcontrol04.c:208: TPASS: Expect: (C oom events=0) == 0 > > memcontrol04.c:211: TPASS: Expect: (C low events=437) > 0 > > memcontrol04.c:208: TPASS: Expect: (D oom events=0) == 0 > > memcontrol04.c:211: TPASS: Expect: (D low events=437) > 0 > > memcontrol04.c:208: TPASS: Expect: (E oom events=0) == 0 > > memcontrol04.c:214: TPASS: Expect: (E low events=0) == 0 > > memcontrol04.c:208: TPASS: Expect: (F oom events=0) == 0 > > memcontrol04.c:214: TFAIL: Expect: (F low events=412) == 0 > > > > tst_test.c:1823: TINFO: === Testing on ext3 === > > memcontrol04.c:208: TPASS: Expect: (C oom events=0) == 0 > > memcontrol04.c:211: TPASS: Expect: (C low events=437) > 0 > > memcontrol04.c:208: TPASS: Expect: (D oom events=0) == 0 > > memcontrol04.c:211: TPASS: Expect: (D low events=437) > 0 > > memcontrol04.c:208: TPASS: Expect: (E oom events=0) == 0 > > memcontrol04.c:214: TPASS: Expect: (E low events=0) == 0 > > memcontrol04.c:208: TPASS: Expect: (F oom events=0) == 0 > > memcontrol04.c:214: TFAIL: Expect: (F low events=411) == 0 > > > > ...... > > > > Summary: > > passed 55 > > failed 5 > > broken 0 > > skipped 0 > > warnings 0 > > > > It looks like there is an error in the processing logic of cgroup F. > > > > (3) Cause analysis > > > > In the test_memcg_low() function, 4 subgroups (C, D, E, F) are created under B, > > and 50MB pagecache is allocated in C, D, and F. Therefore, when checking whether > > it is successful at the end, only E should be judged to have low_events==0, > > and the judgment conditions for all other subgroups should be low_events > 0. > > > > (4) Fix issure > > > > #1209 > > https://github.com/linux-test-project/ltp/issues/1209 > > > > Signed-off-by: Jin Guojie <guojie.jin@gmail.com> > > > > --- > > testcases/kernel/controllers/memcg/memcontrol04.c | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > diff --git a/testcases/kernel/controllers/memcg/memcontrol04.c > > b/testcases/kernel/controllers/memcg/memcontrol04.c > > index 1b8d115f8..0dddb7449 100644 > > --- a/testcases/kernel/controllers/memcg/memcontrol04.c > > +++ b/testcases/kernel/controllers/memcg/memcontrol04.c > > @@ -207,7 +207,7 @@ static void test_memcg_low(void) > > > > TST_EXP_EXPR(oom == 0, "(%c oom events=%ld) == 0", id, oom); > > > > - if (i < E) { > > + if (i != E) { > > TST_EXP_EXPR(low > 0, > > "(%c low events=%ld) > 0", id, low); > > } else { > > -- > > 2.45.2 > >
Hi Jin, I can't share the bugzilla content, but the test has been analyzed quite a lot and people didn't figure out a proper fix for it. Linux is highly asynchronous and the memory management is not always predictable. Most of the times, testing memory requires a good knowledge of the underlying algorithms and, even so, it's not so obvious to write tests for it. Sometimes we end up having a test result based on **average reply** from the test (i.e. if we obtain TPASS 70 times out of 100, we can consider test pass). As far as I see, devs think this test should be rewritten or simplified, in order to reduce the noise in LTP, testing only features which can be actually tested. Feel free to take a look if you think it's worth the energy, but I also suggest to take a look at the kernel, because it could be there's an underlying bug somewhere. Kind regards, Andrea Cervesato On 11/21/24 14:18, Jin Guojie wrote: > When I visit https://bugzilla.suse.com/show_bug.cgi?id=1196298, an > error occurs showing that > > "You are not authorized to access bug #1196298." > > I tried to register an account by myself, but the error did not change. > > So far, I still can't view the link. > > Please see if there is a way to give me permission to view this bug. > > Or, could you please provide me the explanation of this bug in the email? > > On Thu, Nov 21, 2024 at 5:44 PM Andrea Cervesato > <andrea.cervesato@suse.com> wrote: >> Hi! >> >> Thanks for checking this test, but did you take a look at the >> explanation given in the test "tags" ? -> >> https://bugzilla.suse.com/show_bug.cgi?id=1196298 >> >> Regards, >> Andrea >> >> On 11/21/24 04:05, Jin Guojie wrote: >>> When running memcontrol04, TFAIL results will appear on various Linux >>> distributions, kernel versions, and CPUs: >>> >>> (1) Test platform >>> >>> * Linux distribution: Ubuntu 24.10 >>> * CPU: X86_64, Arm64 >>> * Kernel: 6.6 longterm >>> * glibc: 2.40 >>> * LTP version: commit ec4161186e5, Oct 24 12:18:17 2024 >>> >>> (2) Error logs >>> >>> During the operation of memcontrol04, file systems such as ext2, ext3, >>> ext4, xfs, ntfs, and vfat will be tested. >>> For any of the file system, the same TFAIL result will appear: >>> >>> root@vm:~/ltp/testcases/kernel/controllers/memcg# ./memcontrol04 >>> >>> tst_test.c:1823: TINFO: === Testing on ext2 === >>> memcontrol04.c:208: TPASS: Expect: (C oom events=0) == 0 >>> memcontrol04.c:211: TPASS: Expect: (C low events=437) > 0 >>> memcontrol04.c:208: TPASS: Expect: (D oom events=0) == 0 >>> memcontrol04.c:211: TPASS: Expect: (D low events=437) > 0 >>> memcontrol04.c:208: TPASS: Expect: (E oom events=0) == 0 >>> memcontrol04.c:214: TPASS: Expect: (E low events=0) == 0 >>> memcontrol04.c:208: TPASS: Expect: (F oom events=0) == 0 >>> memcontrol04.c:214: TFAIL: Expect: (F low events=412) == 0 >>> >>> tst_test.c:1823: TINFO: === Testing on ext3 === >>> memcontrol04.c:208: TPASS: Expect: (C oom events=0) == 0 >>> memcontrol04.c:211: TPASS: Expect: (C low events=437) > 0 >>> memcontrol04.c:208: TPASS: Expect: (D oom events=0) == 0 >>> memcontrol04.c:211: TPASS: Expect: (D low events=437) > 0 >>> memcontrol04.c:208: TPASS: Expect: (E oom events=0) == 0 >>> memcontrol04.c:214: TPASS: Expect: (E low events=0) == 0 >>> memcontrol04.c:208: TPASS: Expect: (F oom events=0) == 0 >>> memcontrol04.c:214: TFAIL: Expect: (F low events=411) == 0 >>> >>> ...... >>> >>> Summary: >>> passed 55 >>> failed 5 >>> broken 0 >>> skipped 0 >>> warnings 0 >>> >>> It looks like there is an error in the processing logic of cgroup F. >>> >>> (3) Cause analysis >>> >>> In the test_memcg_low() function, 4 subgroups (C, D, E, F) are created under B, >>> and 50MB pagecache is allocated in C, D, and F. Therefore, when checking whether >>> it is successful at the end, only E should be judged to have low_events==0, >>> and the judgment conditions for all other subgroups should be low_events > 0. >>> >>> (4) Fix issure >>> >>> #1209 >>> https://github.com/linux-test-project/ltp/issues/1209 >>> >>> Signed-off-by: Jin Guojie <guojie.jin@gmail.com> >>> >>> --- >>> testcases/kernel/controllers/memcg/memcontrol04.c | 2 +- >>> 1 file changed, 1 insertion(+), 1 deletion(-) >>> >>> diff --git a/testcases/kernel/controllers/memcg/memcontrol04.c >>> b/testcases/kernel/controllers/memcg/memcontrol04.c >>> index 1b8d115f8..0dddb7449 100644 >>> --- a/testcases/kernel/controllers/memcg/memcontrol04.c >>> +++ b/testcases/kernel/controllers/memcg/memcontrol04.c >>> @@ -207,7 +207,7 @@ static void test_memcg_low(void) >>> >>> TST_EXP_EXPR(oom == 0, "(%c oom events=%ld) == 0", id, oom); >>> >>> - if (i < E) { >>> + if (i != E) { >>> TST_EXP_EXPR(low > 0, >>> "(%c low events=%ld) > 0", id, low); >>> } else { >>> -- >>> 2.45.2 >>>
Hi Andera, Thank you for the reply. Here are some additional comments, 1. In the function test_memcg_low(), due to the judgment of if (i == E), it is obvious that no memory is allocated for E. For other groups, 50MB memory is allocated. ---- test_memcg_low() ---- for (i = 0; i < ARRAY_SIZE(leaf_cg); i++) { // leaf_cg[] contains C,D,E,F leaf_cg[i] = tst_cg_group_mk(trunk_cg[B], "leaf_%c", 'C' + i); if (i == E) // no memory allocated in E continue; alloc_pagecache_in_child(leaf_cg[i], MB(50)); // 50MB allocated for C, D, F } Therefore, it's obvious that the change from if (i < E) to if (i != E) in my patch is consistent with the memory allocation . 2. Since this problem is relatively clear and reproducible in various distributions, it should be easy to check if someone has discussed any topics related to my patch. Could you copy the text in https://bugzilla.suse.com/show_bug.cgi?id=1196298 and send it to me separately? Kind regards, Jin Guojie On Mon, Nov 25, 2024 at 5:45 PM Andrea Cervesato <andrea.cervesato@suse.com> wrote: > > Hi Jin, > > I can't share the bugzilla content, but the test has been analyzed quite > a lot and people didn't figure out a proper fix for it. > Linux is highly asynchronous and the memory management is not always > predictable. Most of the times, testing memory requires a good knowledge > of the underlying algorithms and, even so, it's not so obvious to write > tests for it. Sometimes we end up having a test result based on > **average reply** from the test (i.e. if we obtain TPASS 70 times out of > 100, we can consider test pass). > > As far as I see, devs think this test should be rewritten or simplified, > in order to reduce the noise in LTP, testing only features which can be > actually tested. > > Feel free to take a look if you think it's worth the energy, but I also > suggest to take a look at the kernel, because it could be there's an > underlying bug somewhere. > > Kind regards, > Andrea Cervesato > > On 11/21/24 14:18, Jin Guojie wrote: > > When I visit https://bugzilla.suse.com/show_bug.cgi?id=1196298, an > > error occurs showing that > > > > "You are not authorized to access bug #1196298." > > > > I tried to register an account by myself, but the error did not change. > > > > So far, I still can't view the link. > > > > Please see if there is a way to give me permission to view this bug. > > > > Or, could you please provide me the explanation of this bug in the email? > > > > On Thu, Nov 21, 2024 at 5:44 PM Andrea Cervesato > > <andrea.cervesato@suse.com> wrote: > >> Hi! > >> > >> Thanks for checking this test, but did you take a look at the > >> explanation given in the test "tags" ? -> > >> https://bugzilla.suse.com/show_bug.cgi?id=1196298 > >> > >> Regards, > >> Andrea > >> > >> On 11/21/24 04:05, Jin Guojie wrote: > >>> When running memcontrol04, TFAIL results will appear on various Linux > >>> distributions, kernel versions, and CPUs: > >>> > >>> (1) Test platform > >>> > >>> * Linux distribution: Ubuntu 24.10 > >>> * CPU: X86_64, Arm64 > >>> * Kernel: 6.6 longterm > >>> * glibc: 2.40 > >>> * LTP version: commit ec4161186e5, Oct 24 12:18:17 2024 > >>> > >>> (2) Error logs > >>> > >>> During the operation of memcontrol04, file systems such as ext2, ext3, > >>> ext4, xfs, ntfs, and vfat will be tested. > >>> For any of the file system, the same TFAIL result will appear: > >>> > >>> root@vm:~/ltp/testcases/kernel/controllers/memcg# ./memcontrol04 > >>> > >>> tst_test.c:1823: TINFO: === Testing on ext2 === > >>> memcontrol04.c:208: TPASS: Expect: (C oom events=0) == 0 > >>> memcontrol04.c:211: TPASS: Expect: (C low events=437) > 0 > >>> memcontrol04.c:208: TPASS: Expect: (D oom events=0) == 0 > >>> memcontrol04.c:211: TPASS: Expect: (D low events=437) > 0 > >>> memcontrol04.c:208: TPASS: Expect: (E oom events=0) == 0 > >>> memcontrol04.c:214: TPASS: Expect: (E low events=0) == 0 > >>> memcontrol04.c:208: TPASS: Expect: (F oom events=0) == 0 > >>> memcontrol04.c:214: TFAIL: Expect: (F low events=412) == 0 > >>> > >>> tst_test.c:1823: TINFO: === Testing on ext3 === > >>> memcontrol04.c:208: TPASS: Expect: (C oom events=0) == 0 > >>> memcontrol04.c:211: TPASS: Expect: (C low events=437) > 0 > >>> memcontrol04.c:208: TPASS: Expect: (D oom events=0) == 0 > >>> memcontrol04.c:211: TPASS: Expect: (D low events=437) > 0 > >>> memcontrol04.c:208: TPASS: Expect: (E oom events=0) == 0 > >>> memcontrol04.c:214: TPASS: Expect: (E low events=0) == 0 > >>> memcontrol04.c:208: TPASS: Expect: (F oom events=0) == 0 > >>> memcontrol04.c:214: TFAIL: Expect: (F low events=411) == 0 > >>> > >>> ...... > >>> > >>> Summary: > >>> passed 55 > >>> failed 5 > >>> broken 0 > >>> skipped 0 > >>> warnings 0 > >>> > >>> It looks like there is an error in the processing logic of cgroup F. > >>> > >>> (3) Cause analysis > >>> > >>> In the test_memcg_low() function, 4 subgroups (C, D, E, F) are created under B, > >>> and 50MB pagecache is allocated in C, D, and F. Therefore, when checking whether > >>> it is successful at the end, only E should be judged to have low_events==0, > >>> and the judgment conditions for all other subgroups should be low_events > 0. > >>> > >>> (4) Fix issure > >>> > >>> #1209 > >>> https://github.com/linux-test-project/ltp/issues/1209 > >>> > >>> Signed-off-by: Jin Guojie <guojie.jin@gmail.com> > >>> > >>> --- > >>> testcases/kernel/controllers/memcg/memcontrol04.c | 2 +- > >>> 1 file changed, 1 insertion(+), 1 deletion(-) > >>> > >>> diff --git a/testcases/kernel/controllers/memcg/memcontrol04.c > >>> b/testcases/kernel/controllers/memcg/memcontrol04.c > >>> index 1b8d115f8..0dddb7449 100644 > >>> --- a/testcases/kernel/controllers/memcg/memcontrol04.c > >>> +++ b/testcases/kernel/controllers/memcg/memcontrol04.c > >>> @@ -207,7 +207,7 @@ static void test_memcg_low(void) > >>> > >>> TST_EXP_EXPR(oom == 0, "(%c oom events=%ld) == 0", id, oom); > >>> > >>> - if (i < E) { > >>> + if (i != E) { > >>> TST_EXP_EXPR(low > 0, > >>> "(%c low events=%ld) > 0", id, low); > >>> } else { > >>> -- > >>> 2.45.2 > >>>
Hi Jin, Sorry for the late reply but I was sick. If I understand the test right, the memory pressure on the leaves should change the amount of memory allocated by A/B/F . The expected result is to have 0 memory allocation in A/B/F , due to the memory.low boundary, which is the target of our test. As it's written in the test description, the purpose of the test is to verify that memory.low boundary works correctly during memory pressure. To sum up the ticket, this normally happens, unless we use "memory_recursiveprot" for cgroups. It seems cgroup A/B/F folder is unjustifiably protecting its memory. This is probably a test that verifies an unrealistic and an unpractical situation, so the testing model should be changed. Kind regards, Andrea Cervesato On 11/25/24 14:19, Jin Guojie wrote: > Hi Andera, > > Thank you for the reply. > > Here are some additional comments, > > 1. In the function test_memcg_low(), due to the judgment of if (i == > E), it is obvious that no memory is allocated for E. For other groups, > 50MB memory is allocated. > > ---- test_memcg_low() ---- > for (i = 0; i < ARRAY_SIZE(leaf_cg); i++) { // leaf_cg[] contains C,D,E,F > leaf_cg[i] = tst_cg_group_mk(trunk_cg[B], "leaf_%c", 'C' + i); > > if (i == E) // no memory allocated in E > continue; > > alloc_pagecache_in_child(leaf_cg[i], MB(50)); // 50MB allocated for C, D, F > } > > Therefore, it's obvious that the change from if (i < E) to if (i != E) > in my patch is consistent with the memory allocation . > > 2. Since this problem is relatively clear and reproducible in various > distributions, it should be easy to check if someone has discussed any > topics related to my patch. > > Could you copy the text in > https://bugzilla.suse.com/show_bug.cgi?id=1196298 and send it to me > separately? > > Kind regards, > > Jin Guojie > > On Mon, Nov 25, 2024 at 5:45 PM Andrea Cervesato > <andrea.cervesato@suse.com> wrote: >> Hi Jin, >> >> I can't share the bugzilla content, but the test has been analyzed quite >> a lot and people didn't figure out a proper fix for it. >> Linux is highly asynchronous and the memory management is not always >> predictable. Most of the times, testing memory requires a good knowledge >> of the underlying algorithms and, even so, it's not so obvious to write >> tests for it. Sometimes we end up having a test result based on >> **average reply** from the test (i.e. if we obtain TPASS 70 times out of >> 100, we can consider test pass). >> >> As far as I see, devs think this test should be rewritten or simplified, >> in order to reduce the noise in LTP, testing only features which can be >> actually tested. >> >> Feel free to take a look if you think it's worth the energy, but I also >> suggest to take a look at the kernel, because it could be there's an >> underlying bug somewhere. >> >> Kind regards, >> Andrea Cervesato >> >> On 11/21/24 14:18, Jin Guojie wrote: >>> When I visithttps://bugzilla.suse.com/show_bug.cgi?id=1196298, an >>> error occurs showing that >>> >>> "You are not authorized to access bug #1196298." >>> >>> I tried to register an account by myself, but the error did not change. >>> >>> So far, I still can't view the link. >>> >>> Please see if there is a way to give me permission to view this bug. >>> >>> Or, could you please provide me the explanation of this bug in the email? >>> >>> On Thu, Nov 21, 2024 at 5:44 PM Andrea Cervesato >>> <andrea.cervesato@suse.com> wrote: >>>> Hi! >>>> >>>> Thanks for checking this test, but did you take a look at the >>>> explanation given in the test "tags" ? -> >>>> https://bugzilla.suse.com/show_bug.cgi?id=1196298 >>>> >>>> Regards, >>>> Andrea >>>> >>>> On 11/21/24 04:05, Jin Guojie wrote: >>>>> When running memcontrol04, TFAIL results will appear on various Linux >>>>> distributions, kernel versions, and CPUs: >>>>> >>>>> (1) Test platform >>>>> >>>>> * Linux distribution: Ubuntu 24.10 >>>>> * CPU: X86_64, Arm64 >>>>> * Kernel: 6.6 longterm >>>>> * glibc: 2.40 >>>>> * LTP version: commit ec4161186e5, Oct 24 12:18:17 2024 >>>>> >>>>> (2) Error logs >>>>> >>>>> During the operation of memcontrol04, file systems such as ext2, ext3, >>>>> ext4, xfs, ntfs, and vfat will be tested. >>>>> For any of the file system, the same TFAIL result will appear: >>>>> >>>>> root@vm:~/ltp/testcases/kernel/controllers/memcg# ./memcontrol04 >>>>> >>>>> tst_test.c:1823: TINFO: === Testing on ext2 === >>>>> memcontrol04.c:208: TPASS: Expect: (C oom events=0) == 0 >>>>> memcontrol04.c:211: TPASS: Expect: (C low events=437) > 0 >>>>> memcontrol04.c:208: TPASS: Expect: (D oom events=0) == 0 >>>>> memcontrol04.c:211: TPASS: Expect: (D low events=437) > 0 >>>>> memcontrol04.c:208: TPASS: Expect: (E oom events=0) == 0 >>>>> memcontrol04.c:214: TPASS: Expect: (E low events=0) == 0 >>>>> memcontrol04.c:208: TPASS: Expect: (F oom events=0) == 0 >>>>> memcontrol04.c:214: TFAIL: Expect: (F low events=412) == 0 >>>>> >>>>> tst_test.c:1823: TINFO: === Testing on ext3 === >>>>> memcontrol04.c:208: TPASS: Expect: (C oom events=0) == 0 >>>>> memcontrol04.c:211: TPASS: Expect: (C low events=437) > 0 >>>>> memcontrol04.c:208: TPASS: Expect: (D oom events=0) == 0 >>>>> memcontrol04.c:211: TPASS: Expect: (D low events=437) > 0 >>>>> memcontrol04.c:208: TPASS: Expect: (E oom events=0) == 0 >>>>> memcontrol04.c:214: TPASS: Expect: (E low events=0) == 0 >>>>> memcontrol04.c:208: TPASS: Expect: (F oom events=0) == 0 >>>>> memcontrol04.c:214: TFAIL: Expect: (F low events=411) == 0 >>>>> >>>>> ...... >>>>> >>>>> Summary: >>>>> passed 55 >>>>> failed 5 >>>>> broken 0 >>>>> skipped 0 >>>>> warnings 0 >>>>> >>>>> It looks like there is an error in the processing logic of cgroup F. >>>>> >>>>> (3) Cause analysis >>>>> >>>>> In the test_memcg_low() function, 4 subgroups (C, D, E, F) are created under B, >>>>> and 50MB pagecache is allocated in C, D, and F. Therefore, when checking whether >>>>> it is successful at the end, only E should be judged to have low_events==0, >>>>> and the judgment conditions for all other subgroups should be low_events > 0. >>>>> >>>>> (4) Fix issure >>>>> >>>>> #1209 >>>>> https://github.com/linux-test-project/ltp/issues/1209 >>>>> >>>>> Signed-off-by: Jin Guojie<guojie.jin@gmail.com> >>>>> >>>>> --- >>>>> testcases/kernel/controllers/memcg/memcontrol04.c | 2 +- >>>>> 1 file changed, 1 insertion(+), 1 deletion(-) >>>>> >>>>> diff --git a/testcases/kernel/controllers/memcg/memcontrol04.c >>>>> b/testcases/kernel/controllers/memcg/memcontrol04.c >>>>> index 1b8d115f8..0dddb7449 100644 >>>>> --- a/testcases/kernel/controllers/memcg/memcontrol04.c >>>>> +++ b/testcases/kernel/controllers/memcg/memcontrol04.c >>>>> @@ -207,7 +207,7 @@ static void test_memcg_low(void) >>>>> >>>>> TST_EXP_EXPR(oom == 0, "(%c oom events=%ld) == 0", id, oom); >>>>> >>>>> - if (i < E) { >>>>> + if (i != E) { >>>>> TST_EXP_EXPR(low > 0, >>>>> "(%c low events=%ld) > 0", id, low); >>>>> } else { >>>>> -- >>>>> 2.45.2 >>>>>
Hi Andrea, I carefully analyzed the description in memcontrol04.c and the memory allocation in the code, and found no problems with the case. In addition, memcontrol04.c comes from test_memcontrol.c in kernel selftests, and the contents of the two are exactly the same. By the way, test_memcontrol.c in kselftest also failed the test. The error is "nonzero memory.events: low in the unprotected A/B/E group", exactly the same as memcontrol04.c. It is a reported problem in the kernel mailing lists, but no one has solved it. (https://lore.kernel.org/all/20230522095233.4246-2-haifeng.xu@shopee.com/T/#mfda771d975e71816ae70d4ac35f957a56cb12bb1) According to the hint in your previous email, "to take a look at the kernel, because it could be there's an underlying bug somewhere", I checked the processing of memory.low in the kernel code and found a key problem. When the kernel reclaims memory, it determines the low boundary of a cgroup not based on the memory.low set by the user, but based on the "effective-low-boundary". The calculation of effective-low-boundary is done by the function effective_protection() in mm/page_counter.c. The comments of effective_protection() say that, " a cgroup's effective-low-boundary is derived from its own memory.low, its parent's and siblings' memory.low, as well as the actual memory usage in the tree. If the children aren't claiming (all of) the protection afforded to them by the parent, distribute the remainder in proportion to the (unprotected) memory of each cgroup." The following is a code snippet from effective_protection(), if (parent_effective > siblings_protected && parent_usage > siblings_protected && usage > protected) { unsigned long unclaimed; unclaimed = parent_effective - siblings_protected; unclaimed *= usage - protected; unclaimed /= parent_usage - siblings_protected; ep += unclaimed; } According to the above algorithm, several children will share the remaining space of the parent in proportion. For a cgroup with memory.low=0, this code will cause the effective-low-boundary calculated to be greater than 0. This leads to a series of unexpected results: 1. When running memcontrol04, even though we specify memory.low of group F to be 0, but the effective-low-boundary of group F in kernel is greater than 0. 2. There will always be some allocated memory(about 100KB) in group F that is not completely released。 3. When reclaim occurs, the "low" value in group F's memory.events file will increase, resulting in a non-zero result. In the cgroup documentation, (https://www.kernel.org/doc/html/latest/admin-guide/cgroup-v2.html) it is said that the meaning of "low" in the memory.events file is "The number of times the cgroup is reclaimed due to high memory pressure even though its usage is under the low boundary." This means that, if the memory.low value is set to be 0, the "low" value in memory.events should never be increased. To fix this problem, the key is that effective-low-boundary should not be greater than memory.low at any time. I tried to modify the calculation method of effective-low-boundary in the kernel, by simply adding a line of judgment. The following experiments are conducted in the latest kernel (6.13.0-rc1+) (1) Code change to effective_protection() diff --git a/mm/page_counter.c b/mm/page_counter.c index b249d15af9dd..0c86b7885143 100644 --- a/mm/page_counter.c +++ b/mm/page_counter.c @@ -404,6 +404,7 @@ static unsigned long effective_protection(unsigned long usage, ep += unclaimed; } + ep = (ep > setting) ? setting : ep; return ep; } (2) with the modified kernel, memcontrol04 can pass, and groups F's memory.events:low remains 0. memcontrol04.c:208: TPASS: Expect: (C oom events=0) == 0 memcontrol04.c:211: TPASS: Expect: (C low events=479) > 0 memcontrol04.c:208: TPASS: Expect: (D oom events=0) == 0 memcontrol04.c:211: TPASS: Expect: (D low events=479) > 0 memcontrol04.c:208: TPASS: Expect: (E oom events=0) == 0 memcontrol04.c:214: TPASS: Expect: (E low events=0) == 0 memcontrol04.c:208: TPASS: Expect: (F oom events=0) == 0 memcontrol04.c:214: TPASS: Expect: (F low events=0) == 0 Summary: passed 60 failed 0 broken 0 skipped 0 Please take a look at the above analysis and modification of the kernel and see if it is correct. If it is reasonable, I may submit this issue to the kernel mailing lists. Jin
diff --git a/testcases/kernel/controllers/memcg/memcontrol04.c b/testcases/kernel/controllers/memcg/memcontrol04.c index 1b8d115f8..0dddb7449 100644 --- a/testcases/kernel/controllers/memcg/memcontrol04.c +++ b/testcases/kernel/controllers/memcg/memcontrol04.c @@ -207,7 +207,7 @@ static void test_memcg_low(void) TST_EXP_EXPR(oom == 0, "(%c oom events=%ld) == 0", id, oom); - if (i < E) { + if (i != E) { TST_EXP_EXPR(low > 0, "(%c low events=%ld) > 0", id, low); } else {
When running memcontrol04, TFAIL results will appear on various Linux distributions, kernel versions, and CPUs: (1) Test platform * Linux distribution: Ubuntu 24.10 * CPU: X86_64, Arm64 * Kernel: 6.6 longterm * glibc: 2.40 * LTP version: commit ec4161186e5, Oct 24 12:18:17 2024 (2) Error logs During the operation of memcontrol04, file systems such as ext2, ext3, ext4, xfs, ntfs, and vfat will be tested. For any of the file system, the same TFAIL result will appear: root@vm:~/ltp/testcases/kernel/controllers/memcg# ./memcontrol04 tst_test.c:1823: TINFO: === Testing on ext2 === memcontrol04.c:208: TPASS: Expect: (C oom events=0) == 0 memcontrol04.c:211: TPASS: Expect: (C low events=437) > 0 memcontrol04.c:208: TPASS: Expect: (D oom events=0) == 0 memcontrol04.c:211: TPASS: Expect: (D low events=437) > 0 memcontrol04.c:208: TPASS: Expect: (E oom events=0) == 0 memcontrol04.c:214: TPASS: Expect: (E low events=0) == 0 memcontrol04.c:208: TPASS: Expect: (F oom events=0) == 0 memcontrol04.c:214: TFAIL: Expect: (F low events=412) == 0 tst_test.c:1823: TINFO: === Testing on ext3 === memcontrol04.c:208: TPASS: Expect: (C oom events=0) == 0 memcontrol04.c:211: TPASS: Expect: (C low events=437) > 0 memcontrol04.c:208: TPASS: Expect: (D oom events=0) == 0 memcontrol04.c:211: TPASS: Expect: (D low events=437) > 0 memcontrol04.c:208: TPASS: Expect: (E oom events=0) == 0 memcontrol04.c:214: TPASS: Expect: (E low events=0) == 0 memcontrol04.c:208: TPASS: Expect: (F oom events=0) == 0 memcontrol04.c:214: TFAIL: Expect: (F low events=411) == 0 ...... Summary: passed 55 failed 5 broken 0 skipped 0 warnings 0 It looks like there is an error in the processing logic of cgroup F. (3) Cause analysis In the test_memcg_low() function, 4 subgroups (C, D, E, F) are created under B, and 50MB pagecache is allocated in C, D, and F. Therefore, when checking whether it is successful at the end, only E should be judged to have low_events==0, and the judgment conditions for all other subgroups should be low_events > 0. (4) Fix issure #1209 https://github.com/linux-test-project/ltp/issues/1209 Signed-off-by: Jin Guojie <guojie.jin@gmail.com> --- testcases/kernel/controllers/memcg/memcontrol04.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) -- 2.45.2