Message ID | 20180902095909.5375-1-svaidy@linux.vnet.ibm.com |
---|---|
State | Superseded |
Headers | show |
Series | core/cpu: Fix memory allocation for job array | expand |
Context | Check | Description |
---|---|---|
snowpatch_ozlabs/apply_patch | success | master/apply_patch Successfully applied |
snowpatch_ozlabs/make_check | success | Test make_check on branch master |
On 09/02/2018 03:29 PM, Vaidyanathan Srinivasan wrote: > fixes: 7a3f307e core/cpu: parallelise global CPU register setting jobs > > This bug would result in boot-hang on some configurations due to > cpu_wait_job() endlessly waiting for the last bogus jobs[cpu->pir] pointer. > > Reported-by: Stephanie Swanson <swanman@us.ibm.com> > Reported-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> > Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com> > --- > core/cpu.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/core/cpu.c b/core/cpu.c > index 88477f82..cc5fba32 100644 > --- a/core/cpu.c > +++ b/core/cpu.c > @@ -1373,7 +1373,7 @@ static int64_t cpu_change_all_hid0(struct hid0_change_req *req) > struct cpu_thread *cpu; > struct cpu_job **jobs; > > - jobs = zalloc(sizeof(struct cpu_job *) * cpu_max_pir + 1); > + jobs = zalloc(sizeof(struct cpu_job *) * (cpu_max_pir + 1)); Nice catch :-) Reviewed-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Thanks, -Mahesh.
On 09/02/2018 03:29 PM, Vaidyanathan Srinivasan wrote: > fixes: 7a3f307e core/cpu: parallelise global CPU register setting jobs > > This bug would result in boot-hang on some configurations due to > cpu_wait_job() endlessly waiting for the last bogus jobs[cpu->pir] pointer. > > Reported-by: Stephanie Swanson <swanman@us.ibm.com> > Reported-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> > Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com> Good catch. I had gone through this code at least twice .. Somehow this didn't strike me. Reviewed-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> -Vasant
On Sun, Sep 2, 2018 at 7:59 PM, Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com> wrote: > fixes: 7a3f307e core/cpu: parallelise global CPU register setting jobs > > This bug would result in boot-hang on some configurations due to > cpu_wait_job() endlessly waiting for the last bogus jobs[cpu->pir] pointer. Under what circumstances does it fail? The minimum allocation block is sizeof(long) so even with the undersized allocation it I would expect it to still work. I might be missing something. > Reported-by: Stephanie Swanson <swanman@us.ibm.com> > Reported-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> > Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com> > --- > core/cpu.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/core/cpu.c b/core/cpu.c > index 88477f82..cc5fba32 100644 > --- a/core/cpu.c > +++ b/core/cpu.c > @@ -1373,7 +1373,7 @@ static int64_t cpu_change_all_hid0(struct hid0_change_req *req) > struct cpu_thread *cpu; > struct cpu_job **jobs; > > - jobs = zalloc(sizeof(struct cpu_job *) * cpu_max_pir + 1); > + jobs = zalloc(sizeof(struct cpu_job *) * (cpu_max_pir + 1)); Looks like this code was copied and pasted form cpu_cleanup_all() since the same bug exists there. Can you add a fix there too? We might want to just refactor this so that we have a common cpu_run_on_all() or something. I'm not too bothered though. > assert(jobs); > > for_each_available_cpu(cpu) { > -- > 2.17.1 > > _______________________________________________ > Skiboot mailing list > Skiboot@lists.ozlabs.org > https://lists.ozlabs.org/listinfo/skiboot
On 09/03/2018 11:05 AM, Vasant Hegde wrote: > On 09/02/2018 03:29 PM, Vaidyanathan Srinivasan wrote: >> fixes: 7a3f307e core/cpu: parallelise global CPU register setting jobs >> >> This bug would result in boot-hang on some configurations due to >> cpu_wait_job() endlessly waiting for the last bogus jobs[cpu->pir] pointer. >> >> Reported-by: Stephanie Swanson <swanman@us.ibm.com> >> Reported-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> >> Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com> > > Good catch. I had gone through this code at least twice .. Somehow this didn't > strike me. > > Reviewed-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> We have similar issue in cpu_cleanup_all() as well. Can you include below changes ? diff --git a/core/cpu.c b/core/cpu.c index 88477f821..2813360b4 100644 --- a/core/cpu.c +++ b/core/cpu.c @@ -1424,7 +1424,7 @@ static int64_t cpu_cleanup_all(void) struct cpu_thread *cpu; struct cpu_job **jobs; - jobs = zalloc(sizeof(struct cpu_job *) * cpu_max_pir + 1); + jobs = zalloc(sizeof(struct cpu_job *) * (cpu_max_pir + 1)); assert(jobs); for_each_available_cpu(cpu) { -Vasant
On Sun, 2 Sep 2018 15:29:09 +0530 Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com> wrote: > fixes: 7a3f307e core/cpu: parallelise global CPU register setting jobs > > This bug would result in boot-hang on some configurations due to > cpu_wait_job() endlessly waiting for the last bogus jobs[cpu->pir] pointer. Dang, good catch, so *that's* what the bug was. You'd better fix my broken cpu_cleanup_all as well. Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Thanks, Nick > > Reported-by: Stephanie Swanson <swanman@us.ibm.com> > Reported-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> > Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com> > --- > core/cpu.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/core/cpu.c b/core/cpu.c > index 88477f82..cc5fba32 100644 > --- a/core/cpu.c > +++ b/core/cpu.c > @@ -1373,7 +1373,7 @@ static int64_t cpu_change_all_hid0(struct hid0_change_req *req) > struct cpu_thread *cpu; > struct cpu_job **jobs; > > - jobs = zalloc(sizeof(struct cpu_job *) * cpu_max_pir + 1); > + jobs = zalloc(sizeof(struct cpu_job *) * (cpu_max_pir + 1)); > assert(jobs); > > for_each_available_cpu(cpu) { > -- > 2.17.1 >
On Mon, Sep 3, 2018 at 3:37 PM, Oliver <oohall@gmail.com> wrote: > On Sun, Sep 2, 2018 at 7:59 PM, Vaidyanathan Srinivasan > <svaidy@linux.vnet.ibm.com> wrote: >> fixes: 7a3f307e core/cpu: parallelise global CPU register setting jobs >> >> This bug would result in boot-hang on some configurations due to >> cpu_wait_job() endlessly waiting for the last bogus jobs[cpu->pir] pointer. > > Under what circumstances does it fail? The minimum allocation block is > sizeof(long) so even with the undersized allocation it I would expect > it to still work. I might be missing something. Pointers are hard. Ignore me. > >> Reported-by: Stephanie Swanson <swanman@us.ibm.com> >> Reported-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> >> Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com> >> --- >> core/cpu.c | 2 +- >> 1 file changed, 1 insertion(+), 1 deletion(-) >> >> diff --git a/core/cpu.c b/core/cpu.c >> index 88477f82..cc5fba32 100644 >> --- a/core/cpu.c >> +++ b/core/cpu.c >> @@ -1373,7 +1373,7 @@ static int64_t cpu_change_all_hid0(struct hid0_change_req *req) >> struct cpu_thread *cpu; >> struct cpu_job **jobs; >> >> - jobs = zalloc(sizeof(struct cpu_job *) * cpu_max_pir + 1); >> + jobs = zalloc(sizeof(struct cpu_job *) * (cpu_max_pir + 1)); > > Looks like this code was copied and pasted form cpu_cleanup_all() > since the same bug exists there. Can you add a fix there too? > > We might want to just refactor this so that we have a common > cpu_run_on_all() or something. I'm not too bothered though. > >> assert(jobs); >> >> for_each_available_cpu(cpu) { >> -- >> 2.17.1 >> >> _______________________________________________ >> Skiboot mailing list >> Skiboot@lists.ozlabs.org >> https://lists.ozlabs.org/listinfo/skiboot
* Nicholas Piggin <npiggin@gmail.com> [2018-09-03 15:46:28]: > On Sun, 2 Sep 2018 15:29:09 +0530 > Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com> wrote: > > > fixes: 7a3f307e core/cpu: parallelise global CPU register setting jobs > > > > This bug would result in boot-hang on some configurations due to > > cpu_wait_job() endlessly waiting for the last bogus jobs[cpu->pir] pointer. > > Dang, good catch, so *that's* what the bug was. You'd better > fix my broken cpu_cleanup_all as well. Yes, I will add same fix in cpu_cleanup_all(). I should have looked other places as well. I just stopped after this fix booted the machine. --Vaidy
diff --git a/core/cpu.c b/core/cpu.c index 88477f82..cc5fba32 100644 --- a/core/cpu.c +++ b/core/cpu.c @@ -1373,7 +1373,7 @@ static int64_t cpu_change_all_hid0(struct hid0_change_req *req) struct cpu_thread *cpu; struct cpu_job **jobs; - jobs = zalloc(sizeof(struct cpu_job *) * cpu_max_pir + 1); + jobs = zalloc(sizeof(struct cpu_job *) * (cpu_max_pir + 1)); assert(jobs); for_each_available_cpu(cpu) {
fixes: 7a3f307e core/cpu: parallelise global CPU register setting jobs This bug would result in boot-hang on some configurations due to cpu_wait_job() endlessly waiting for the last bogus jobs[cpu->pir] pointer. Reported-by: Stephanie Swanson <swanman@us.ibm.com> Reported-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com> --- core/cpu.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)