Message ID | 20200912113804.6465-1-anant.thazhemadam@gmail.com |
---|---|
State | Changes Requested |
Delegated to: | BPF Maintainers |
Series | Using a pointer and kzalloc in place of a struct directly |
On Sat, Sep 12, 2020 at 05:08:04PM +0530, Anant Thazhemadam wrote:
> Updated the usage of a struct variable directly, in bpf_link_get_info_by_fd
> to using a pointer of the same type instead, which points to a memory
> location allocated using kzalloc.
>
> Signed-off-by: Anant Thazhemadam <anant.thazhemadam@gmail.com>

Note, your "To:" line seemed corrupted, and why not cc: the bpf mailing
list as well?

Anyway, comment on your patch below:

> diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
> index 4108ef3b828b..01b9c203ef65 100644
> --- a/kernel/bpf/syscall.c
> +++ b/kernel/bpf/syscall.c
> @@ -3605,30 +3605,31 @@ static int bpf_link_get_info_by_fd(struct file *file,
>  				    union bpf_attr __user *uattr)
>  {
>  	struct bpf_link_info __user *uinfo = u64_to_user_ptr(attr->info.info);
> -	struct bpf_link_info info;
> +	struct bpf_link_info *info = NULL;
>  	u32 info_len = attr->info.info_len;
>  	int err;
>
> -	err = bpf_check_uarg_tail_zero(uinfo, sizeof(info), info_len);
> +	err = bpf_check_uarg_tail_zero(uinfo, sizeof(struct bpf_link_info), info_len);
> +
>  	if (err)
>  		return err;
>  	info_len = min_t(u32, sizeof(info), info_len);
>
> -	memset(&info, 0, sizeof(info));
> -	if (copy_from_user(&info, uinfo, info_len))
> +	info = kzalloc(sizeof(struct bpf_link_info), GFP_KERNEL);
> +	if (copy_from_user(info, uinfo, info_len))
>  		return -EFAULT;

You leaked memory :(

Did you test this patch? Where do you free this memory, I don't see
that happening anywhere in this patch, did I miss it?

And odds are this change will slow things down, right? Why make this
change, what's wrong with the structure being on the stack?

thanks,

greg k-h

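To illustrate the leak pointed out above: once the buffer comes from an allocator instead of the stack, every exit path after the allocation has to release it, and the allocation itself has to be checked for failure (the patch as posted does neither). The following is only a userspace sketch of the usual single-exit cleanup pattern, not kernel code. `struct link_info_sketch`, `tracked_zalloc()`, `tracked_free()`, `copy_in()` and `get_info_sketch()` are all hypothetical names made up for this example; `calloc()`/`free()` stand in for `kzalloc()`/`kfree()`, and `copy_in()` stands in for `copy_from_user()`.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for struct bpf_link_info; fields are illustrative. */
struct link_info_sketch {
	unsigned int type;
	unsigned int id;
	unsigned int prog_id;
};

/* Number of allocations not yet freed, so a leak becomes observable. */
int live_allocs;

/* Userspace stand-in for kzalloc(): zeroed allocation, counted. */
static void *tracked_zalloc(size_t size)
{
	void *p = calloc(1, size);

	if (p)
		live_allocs++;
	return p;
}

/* Userspace stand-in for kfree(): accepts NULL, keeps the count in sync. */
static void tracked_free(void *p)
{
	if (p)
		live_allocs--;
	free(p);
}

/* Hypothetical stand-in for copy_from_user(): nonzero means failure. */
static int copy_in(void *dst, const void *src, size_t len)
{
	if (!src)
		return 1;
	memcpy(dst, src, len);
	return 0;
}

/* Every path after the allocation leaves through out_free. */
int get_info_sketch(const struct link_info_sketch *uinfo,
		    struct link_info_sketch *out)
{
	struct link_info_sketch *info;
	int err = 0;

	assert(out != NULL);

	info = tracked_zalloc(sizeof(*info));
	if (!info)
		return -ENOMEM;

	if (copy_in(info, uinfo, sizeof(*info))) {
		err = -EFAULT;
		goto out_free;	/* an early "return -EFAULT" here would leak */
	}

	*out = *info;

out_free:
	tracked_free(info);
	return err;
}
```

With this shape the success path and the error path both end with no live allocation, which is the property the posted patch lacks. It also shows the extra cost compared to the on-stack structure: an allocator call, a failure check, and a cleanup label.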
On 12/09/20 5:17 pm, Greg KH wrote:
> Note, your "To:" line seemed corrupted, and why not cc: the bpf mailing
> list as well?

Oh, I'm sorry about that. I pulled the emails of all the people to whom
this mail was sent off from the header in lkml mail, and just cc-ed
everyone.

> You leaked memory :(
>
> Did you test this patch? Where do you free this memory, I don't see
> that happening anywhere in this patch, did I miss it?

Yes, I did test this patch, which didn't seem to trigger any issues.
It surprised me so much, that I ended up sending it in, to have
it checked out.

I wasn't sure where exactly the memory allocated here was
supposed to be freed (might be why the current implementation
isn't exactly using kzalloc). I forgot to mention it in the initial mail,
and I was hoping that someone would point me in the right direction
(if this approach was actually going to be considered, that is, which in
retrospect I now feel might not be the best thing)

> And odds are this change will slow things down, right? Why make this
> change, what's wrong with the structure being on the stack?

For more clarity, I'm not exactly pushing for this patch to get accepted,
as much as I'm trying to understand what exactly is going on, and maybe
even understand syzbot's working a little better in the process.

At the time when I did send in this patch, the error seemed to be present
as far as syzbot was concerned. (I had sent in a test request not too long
before I sent this in, which returned a positive).

I just wanted to know, in the off-chance that the commit fix that was
pointed out wasn't merged in the tree yet when syzbot tested it, why exactly
would a patch like this lead to no issues getting triggered?
(I understand that if the fix was in the tree when syzbot ran the next test,
this patch immediately is rendered obsolete, of course)

It felt somewhat a bit like an anomaly to me, and I figured it might be worth
investigating, is all; and I'd either infer something about syzbot, or about
whatever just happened there.

Now that I say it out loud, I realize it might sound a little silly, but then
again, I had tested the 'validity' of the bug, not too long before I sent in
the patch for syzbot to test too, and it seemed to be present when I did.

Thanks,
Anant

On Sat, Sep 12, 2020 at 05:43:38PM +0530, Anant Thazhemadam wrote:
>
> On 12/09/20 5:17 pm, Greg KH wrote:
> > Note, your "To:" line seemed corrupted, and why not cc: the bpf mailing
> > list as well?
> Oh, I'm sorry about that. I pulled the emails of all the people to whom
> this mail was sent off from the header in lkml mail, and just cc-ed
> everyone.
>
> > You leaked memory :(
> >
> > Did you test this patch? Where do you free this memory, I don't see
> > that happening anywhere in this patch, did I miss it?
>
> Yes, I did test this patch, which didn't seem to trigger any issues.
> It surprised me so much, that I ended up sending it in, to have
> it checked out.

You might not have noticed the memory leak if you were not looking for
it.

How did you test this?

> I wasn't sure where exactly the memory allocated here was
> supposed to be freed (might be why the current implementation
> isn't exactly using kzalloc). I forgot to mention it in the initial mail,
> and I was hoping that someone would point me in the right direction
> (if this approach was actually going to be considered, that is, which in
> retrospect I now feel might not be the best thing)

It has to be freed somewhere, you wrote the patch :)

But back to the original question here, why do you feel this change is
needed? What does this do better/faster/more correct than the code that
is currently there? Unless you can provide that, the change should not
be needed, right?

thanks,

greg k-h

On 12/09/20 8:25 pm, Greg KH wrote:
> On Sat, Sep 12, 2020 at 05:43:38PM +0530, Anant Thazhemadam wrote:
>> On 12/09/20 5:17 pm, Greg KH wrote:
>>> Note, your "To:" line seemed corrupted, and why not cc: the bpf mailing
>>> list as well?
>> Oh, I'm sorry about that. I pulled the emails of all the people to whom
>> this mail was sent off from the header in lkml mail, and just cc-ed
>> everyone.
>>
>>> You leaked memory :(
>>>
>>> Did you test this patch? Where do you free this memory, I don't see
>>> that happening anywhere in this patch, did I miss it?
>> Yes, I did test this patch, which didn't seem to trigger any issues.
>> It surprised me so much, that I ended up sending it in, to have
>> it checked out.
> You might not have noticed the memory leak if you were not looking for
> it.
>
> How did you test this?

Ah, that must be it. I tested this using syzbot, which wouldn't have looked
for memory leaks, but only the issue that was reported. My apologies.

>> I wasn't sure where exactly the memory allocated here was
>> supposed to be freed (might be why the current implementation
>> isn't exactly using kzalloc). I forgot to mention it in the initial mail,
>> and I was hoping that someone would point me in the right direction
>> (if this approach was actually going to be considered, that is, which in
>> retrospect I now feel might not be the best thing)
> It has to be freed somewhere, you wrote the patch :)
>
> But back to the original question here, why do you feel this change is
> needed? What does this do better/faster/more correct than the code that
> is currently there? Unless you can provide that, the change should not
> be needed, right?

I was initially trying to see if allocating memory would be an appropriate
heuristic in trying to get a better sense of the bug and crash report, and
at that moment, that was my goal, and figured that I'd deal with rest
(such as freeing the memory) later on, if this was a something that could work.

I was surprised when the patch (although it caused a memory leak), seemed
to pass the test for the bug, without triggering any issues; since this patch
basically only allocates memory as compared to locally declaring variables.

I wanted some input or explanation, about how is it that doing this no longer
triggers the bug?
It felt (and still feels) extremely unlikely to me, that allocating memory
also prevents the issue, which is why I figured it might do some help asking
someone, if it does, and I just felt sending in the patch might make it at
least a little less absurd sounding.

Also, if simply allocating memory provides this security (which syzbot seems
to approve, but I still do not understand fully how), wouldn't it be a welcome
change?

Like I said, I'm trying to understand how things work, a little better here,
and I apologize for any confusion that I may have caused.

TLDR; I tried allocating memory as a heuristic while trying to understand the
bug and the bpf-next tree a little better. Surprisingly the bug didn't seem to
get triggered. I would like to know the reason why the bug didn't get triggered
when syzbot applied this patch to the bpf-next tree.
If the reason, and allocating memory approach seems sensible enough, (or
provides some sort of security that I seem to be oblivious to), I will try and
come up with a way to free the allocated memory, and send in a v2 as well.

(For anyone who might say that there is another commit that fixes this - yes,
I am aware. However, if you take a look at the bug at
https://syzkaller.appspot.com/bug?extid=976d5ecfab0c7eb43ac3
you can see that a generic test (no patch attached) to see if the bug was still
valid was issued much later, and it still turned out to trigger an issue)

On Sun, Sep 13, 2020 at 01:32:43AM +0530, Anant Thazhemadam wrote:
> On 12/09/20 8:25 pm, Greg KH wrote:
> > On Sat, Sep 12, 2020 at 05:43:38PM +0530, Anant Thazhemadam wrote:
> >> On 12/09/20 5:17 pm, Greg KH wrote:
> >>> Note, your "To:" line seemed corrupted, and why not cc: the bpf mailing
> >>> list as well?
> >> Oh, I'm sorry about that. I pulled the emails of all the people to whom
> >> this mail was sent off from the header in lkml mail, and just cc-ed
> >> everyone.
> >>
> >>> You leaked memory :(
> >>>
> >>> Did you test this patch? Where do you free this memory, I don't see
> >>> that happening anywhere in this patch, did I miss it?
> >> Yes, I did test this patch, which didn't seem to trigger any issues.
> >> It surprised me so much, that I ended up sending it in, to have
> >> it checked out.
> > You might not have noticed the memory leak if you were not looking for
> > it.
> >
> > How did you test this?
> Ah, that must be it. I tested this using syzbot, which wouldn't have looked
> for memory leaks, but only the issue that was reported. My apologies.
> >> I wasn't sure where exactly the memory allocated here was
> >> supposed to be freed (might be why the current implementation
> >> isn't exactly using kzalloc). I forgot to mention it in the initial mail,
> >> and I was hoping that someone would point me in the right direction
> >> (if this approach was actually going to be considered, that is, which in
> >> retrospect I now feel might not be the best thing)
> > It has to be freed somewhere, you wrote the patch :)
> >
> > But back to the original question here, why do you feel this change is
> > needed? What does this do better/faster/more correct than the code that
> > is currently there? Unless you can provide that, the change should not
> > be needed, right?
> I was initially trying to see if allocating memory would be an appropriate
> heuristic in trying to get a better sense of the bug and crash report, and
> at that moment, that was my goal, and figured that I'd deal with rest
> (such as freeing the memory) later on, if this was a something that could work.
>
> I was surprised when the patch (although it caused a memory leak), seemed
> to pass the test for the bug, without triggering any issues; since this patch
> basically only allocates memory as compared to locally declaring variables.
>
> I wanted some input or explanation, about how is it that doing this no longer
> triggers the bug?

That really is up to you to work out, sorry. Look at what the syzbot is
testing, and look at the code change to see the difference, and you
should notice what memory is now being cleared that previously was not.

good luck!

greg k-h

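One difference that can be read straight off the diff, offered here only as a hedged observation (the thread never says whether this is the difference Greg means): the context line `info_len = min_t(u32, sizeof(info), info_len);` is unchanged by the patch, but `info` changes from a structure to a pointer. In C, `sizeof` applied to a pointer yields the size of the pointer, not of the object it points to, so the clamp would no longer be against the structure size. The userspace sketch below shows the effect; `struct link_info_demo`, `min_u32()`, `clamp_len_stack()` and `clamp_len_pointer()` are hypothetical names, and the 64-byte structure is an arbitrary size chosen for the demonstration, not the real size of `struct bpf_link_info`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical 64-byte stand-in for struct bpf_link_info. */
struct link_info_demo {
	unsigned int type;
	unsigned int id;
	unsigned int prog_id;
	char rest[52];
};

/* Userspace equivalent of the kernel's min_t(u32, a, b). */
static unsigned int min_u32(unsigned int a, unsigned int b)
{
	return a < b ? a : b;
}

/* Before the patch: info is a structure, sizeof(info) is the struct size. */
unsigned int clamp_len_stack(unsigned int user_len)
{
	struct link_info_demo info;

	return min_u32((unsigned int)sizeof(info), user_len);
}

/* After the patch: info is a pointer, sizeof(info) is the pointer size. */
unsigned int clamp_len_pointer(unsigned int user_len)
{
	struct link_info_demo *info = NULL;

	return min_u32((unsigned int)sizeof(info), user_len);
}
```

Under these assumptions, a caller passing the full structure length gets that length back from `clamp_len_stack()` but only the pointer size from `clamp_len_pointer()`. In the patched function that would mean fewer bytes are copied in from user space, with the remainder of the `kzalloc`-ed buffer left zeroed. Writing `sizeof(*info)` would keep the clamp at the structure size.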
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 4108ef3b828b..01b9c203ef65 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -3605,30 +3605,31 @@ static int bpf_link_get_info_by_fd(struct file *file,
 				    union bpf_attr __user *uattr)
 {
 	struct bpf_link_info __user *uinfo = u64_to_user_ptr(attr->info.info);
-	struct bpf_link_info info;
+	struct bpf_link_info *info = NULL;
 	u32 info_len = attr->info.info_len;
 	int err;
 
-	err = bpf_check_uarg_tail_zero(uinfo, sizeof(info), info_len);
+	err = bpf_check_uarg_tail_zero(uinfo, sizeof(struct bpf_link_info), info_len);
+
 	if (err)
 		return err;
 	info_len = min_t(u32, sizeof(info), info_len);
 
-	memset(&info, 0, sizeof(info));
-	if (copy_from_user(&info, uinfo, info_len))
+	info = kzalloc(sizeof(struct bpf_link_info), GFP_KERNEL);
+	if (copy_from_user(info, uinfo, info_len))
 		return -EFAULT;
-	info.type = link->type;
-	info.id = link->id;
-	info.prog_id = link->prog->aux->id;
+	info->type = link->type;
+	info->id = link->id;
+	info->prog_id = link->prog->aux->id;
 	if (link->ops->fill_link_info) {
-		err = link->ops->fill_link_info(link, &info);
+		err = link->ops->fill_link_info(link, info);
 		if (err)
 			return err;
 	}
-	if (copy_to_user(uinfo, &info, info_len) ||
+	if (copy_to_user(uinfo, info, info_len) ||
 	    put_user(info_len, &uattr->info.info_len))
 		return -EFAULT;

Updated the usage of a struct variable directly, in bpf_link_get_info_by_fd
to using a pointer of the same type instead, which points to a memory
location allocated using kzalloc.

Signed-off-by: Anant Thazhemadam <anant.thazhemadam@gmail.com>
---
I saw this bug (https://syzkaller.appspot.com/bug?extid=976d5ecfab0c7eb43ac3),
and tried to come up with a patch for it (before I saw that this had already
been taken care of).
Although I don't think it fundamentally changes how things work much, it still
seems to have fixed the error on its own too.
I'd like to hear anyone's 2c on this, and know if this method of using info
(of type bpf_link_info) instead would be a welcome change in general, even if
it was not centered around fixing the bug.
If instead, as an unwelcome consequence, this patch might make something go
wrong somewhere, or passing the syzbot test was a false positive, I would
appreciate it if you could shed some light on that for me as well.
If this patch seems acceptable, then I'll send in a cleaner v2 that's a little
more articulate, if required.
Just trying to understand how things work, and sometimes why things work in and
around the kernel.

Thanks,
Anant

 kernel/bpf/syscall.c | 19 ++++++++++---------
 1 file changed, 10 insertions(+), 9 deletions(-)