diff mbox

Re: [PATCH] fix memory leak on kvm_vm_ioctl_create_spapr_tce

Message ID E490CD805F7529488761C40FD9D26EF1299B4667@DGGEMA505-MBX.china.huawei.com (mailing list archive)
State Superseded, archived
Delegated to: Paul Mackerras
Headers show

Commit Message

Xiaoming Ni Aug. 23, 2017, 1:43 a.m. UTC
>On 22.08.2017 17:15, David Hildenbrand wrote:
>> On 22.08.2017 16:28, nixiaoming wrote:
>>> miss kfree(stt) when anon_inode_getfd return fail so add check 
>>> anon_inode_getfd return val, and kfree stt
>>>
>>> Signed-off-by: nixiaoming <nixiaoming@huawei.com>
>>> ---
>>>  arch/powerpc/kvm/book3s_64_vio.c | 5 ++++-
>>>  1 file changed, 4 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/arch/powerpc/kvm/book3s_64_vio.c 
>>> b/arch/powerpc/kvm/book3s_64_vio.c
>>> index a160c14..a0b4459 100644
>>> --- a/arch/powerpc/kvm/book3s_64_vio.c
>>> +++ b/arch/powerpc/kvm/book3s_64_vio.c
>>> @@ -341,8 +341,11 @@ long kvm_vm_ioctl_create_spapr_tce(struct kvm 
>>> *kvm,
>>>  
>>>  	mutex_unlock(&kvm->lock);
>>>  
>>> -	return anon_inode_getfd("kvm-spapr-tce", &kvm_spapr_tce_fops,
>>> +	ret = anon_inode_getfd("kvm-spapr-tce", &kvm_spapr_tce_fops,
>>>  				stt, O_RDWR | O_CLOEXEC);
>>> +	if (ret < 0)
>>> +		goto fail;
>>> +	return ret;
>>>  
>>>  fail:
>>>  	if (stt) {
>>>
>> 
>> 
>> stt has already been added to kvm->arch.spapr_tce_tables, so freeing 
>> it is evil IMHO. I don't know that code, so I don't know if there is 
>> some other place that will make sure that everything in
>> kvm->arch.spapr_tce_tables will properly get freed, even when no 
>> kvm->release
>> function has been called (kvm_spapr_tce_release).
>> 
>
>If it is really not freed, than also kvm_put_kvm(stt->kvm) is missing.
>
>-- 
>
>Thanks,
>
>David
>

if (!stt) return -ENOMEM;
kvm_get_kvm(kvm);
if anon_inode_getfd return -ENOMEM
The user can not determine whether kvm_get_kvm has been called
so need add kvm_pet_kvm when anon_inode_getfd fail

stt has already been added to kvm->arch.spapr_tce_tables,
but if anon_inode_getfd fail, stt is unused val, 
so call list_del_rcu, and  free as quickly as possible

new patch:

---
 arch/powerpc/kvm/book3s_64_vio.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

-- 
2.11.0.1

Comments

Paul Mackerras Aug. 23, 2017, 6:06 a.m. UTC | #1
On Wed, Aug 23, 2017 at 01:43:08AM +0000, Nixiaoming wrote:
> >On 22.08.2017 17:15, David Hildenbrand wrote:
> >> On 22.08.2017 16:28, nixiaoming wrote:
> >>> miss kfree(stt) when anon_inode_getfd return fail so add check 
> >>> anon_inode_getfd return val, and kfree stt
> >>>
> >>> Signed-off-by: nixiaoming <nixiaoming@huawei.com>
> >>> ---
> >>>  arch/powerpc/kvm/book3s_64_vio.c | 5 ++++-
> >>>  1 file changed, 4 insertions(+), 1 deletion(-)
> >>>
> >>> diff --git a/arch/powerpc/kvm/book3s_64_vio.c 
> >>> b/arch/powerpc/kvm/book3s_64_vio.c
> >>> index a160c14..a0b4459 100644
> >>> --- a/arch/powerpc/kvm/book3s_64_vio.c
> >>> +++ b/arch/powerpc/kvm/book3s_64_vio.c
> >>> @@ -341,8 +341,11 @@ long kvm_vm_ioctl_create_spapr_tce(struct kvm 
> >>> *kvm,
> >>>  
> >>>  	mutex_unlock(&kvm->lock);
> >>>  
> >>> -	return anon_inode_getfd("kvm-spapr-tce", &kvm_spapr_tce_fops,
> >>> +	ret = anon_inode_getfd("kvm-spapr-tce", &kvm_spapr_tce_fops,
> >>>  				stt, O_RDWR | O_CLOEXEC);
> >>> +	if (ret < 0)
> >>> +		goto fail;
> >>> +	return ret;
> >>>  
> >>>  fail:
> >>>  	if (stt) {
> >>>
> >> 
> >> 
> >> stt has already been added to kvm->arch.spapr_tce_tables, so freeing 
> >> it is evil IMHO. I don't know that code, so I don't know if there is 
> >> some other place that will make sure that everything in
> >> kvm->arch.spapr_tce_tables will properly get freed, even when no 
> >> kvm->release
> >> function has been called (kvm_spapr_tce_release).
> >> 
> >
> >If it is really not freed, than also kvm_put_kvm(stt->kvm) is missing.
> >
> >-- 
> >
> >Thanks,
> >
> >David
> >
> 
> if (!stt) return -ENOMEM;
> kvm_get_kvm(kvm);
> if anon_inode_getfd return -ENOMEM
> The user can not determine whether kvm_get_kvm has been called
> so need add kvm_pet_kvm when anon_inode_getfd fail
> 
> stt has already been added to kvm->arch.spapr_tce_tables,
> but if anon_inode_getfd fail, stt is unused val, 
> so call list_del_rcu, and  free as quickly as possible
> 
> new patch:
> 
> ---
>  arch/powerpc/kvm/book3s_64_vio.c | 10 +++++++++-
>  1 file changed, 9 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/powerpc/kvm/book3s_64_vio.c b/arch/powerpc/kvm/book3s_64_vio.c
> index a160c14..e2228f1 100644
> --- a/arch/powerpc/kvm/book3s_64_vio.c
> +++ b/arch/powerpc/kvm/book3s_64_vio.c
> @@ -341,8 +341,16 @@ long kvm_vm_ioctl_create_spapr_tce(struct kvm *kvm,
> 
>         mutex_unlock(&kvm->lock);
> 
> -       return anon_inode_getfd("kvm-spapr-tce", &kvm_spapr_tce_fops,
> +       ret = anon_inode_getfd("kvm-spapr-tce", &kvm_spapr_tce_fops,
>                                 stt, O_RDWR | O_CLOEXEC);
> +       if (ret < 0) {
> +               mutex_lock(&kvm->lock);
> +               list_del_rcu(&stt->list);
> +               mutex_unlock(&kvm->lock);
> +               kvm_put_kvm(kvm);
> +               goto fail;
> +       }
> +       return ret;

It seems to me that it would be better to do the anon_inode_getfd()
call before the kvm_get_kvm() call, and go to the fail label if it
fails.

Paul.
Al Viro Aug. 27, 2017, 9:02 p.m. UTC | #2
On Wed, Aug 23, 2017 at 04:06:24PM +1000, Paul Mackerras wrote:

> It seems to me that it would be better to do the anon_inode_getfd()
> call before the kvm_get_kvm() call, and go to the fail label if it
> fails.

And what happens if another thread does close() on the (guessed) fd?
Paul Mackerras Aug. 28, 2017, 4:38 a.m. UTC | #3
On Sun, Aug 27, 2017 at 10:02:20PM +0100, Al Viro wrote:
> On Wed, Aug 23, 2017 at 04:06:24PM +1000, Paul Mackerras wrote:
> 
> > It seems to me that it would be better to do the anon_inode_getfd()
> > call before the kvm_get_kvm() call, and go to the fail label if it
> > fails.
> 
> And what happens if another thread does close() on the (guessed) fd?

Chaos ensues, but mostly because we don't have proper mutual exclusion
on the modifications to the list.  I'll add a mutex_lock/unlock to
kvm_spapr_tce_release() and move the anon_inode_getfd() call inside
the mutex.

It looks like the other possible uses of the fd (mmap, and passing it
as a parameter to the KVM_DEV_VFIO_GROUP_SET_SPAPR_TCE ioctl on a KVM
device fd) are safe.

Thanks,
Paul.
Al Viro Aug. 28, 2017, 5:28 a.m. UTC | #4
On Mon, Aug 28, 2017 at 02:38:37PM +1000, Paul Mackerras wrote:
> On Sun, Aug 27, 2017 at 10:02:20PM +0100, Al Viro wrote:
> > On Wed, Aug 23, 2017 at 04:06:24PM +1000, Paul Mackerras wrote:
> > 
> > > It seems to me that it would be better to do the anon_inode_getfd()
> > > call before the kvm_get_kvm() call, and go to the fail label if it
> > > fails.
> > 
> > And what happens if another thread does close() on the (guessed) fd?
> 
> Chaos ensues, but mostly because we don't have proper mutual exclusion
> on the modifications to the list.  I'll add a mutex_lock/unlock to
> kvm_spapr_tce_release() and move the anon_inode_getfd() call inside
> the mutex.
> 
> It looks like the other possible uses of the fd (mmap, and passing it
> as a parameter to the KVM_DEV_VFIO_GROUP_SET_SPAPR_TCE ioctl on a KVM
> device fd) are safe.

Frankly, it's a lot saner to have "no failure points past anon_inode_getfd()"
policy...
Paul Mackerras Aug. 28, 2017, 6:06 a.m. UTC | #5
On Mon, Aug 28, 2017 at 06:28:08AM +0100, Al Viro wrote:
> On Mon, Aug 28, 2017 at 02:38:37PM +1000, Paul Mackerras wrote:
> > On Sun, Aug 27, 2017 at 10:02:20PM +0100, Al Viro wrote:
> > > On Wed, Aug 23, 2017 at 04:06:24PM +1000, Paul Mackerras wrote:
> > > 
> > > > It seems to me that it would be better to do the anon_inode_getfd()
> > > > call before the kvm_get_kvm() call, and go to the fail label if it
> > > > fails.
> > > 
> > > And what happens if another thread does close() on the (guessed) fd?
> > 
> > Chaos ensues, but mostly because we don't have proper mutual exclusion
> > on the modifications to the list.  I'll add a mutex_lock/unlock to
> > kvm_spapr_tce_release() and move the anon_inode_getfd() call inside
> > the mutex.
> > 
> > It looks like the other possible uses of the fd (mmap, and passing it
> > as a parameter to the KVM_DEV_VFIO_GROUP_SET_SPAPR_TCE ioctl on a KVM
> > device fd) are safe.
> 
> Frankly, it's a lot saner to have "no failure points past anon_inode_getfd()"
> policy...

Right.  In my latest patch, there are no failure points past
anon_inode_getfd().

Paul.
Michael Ellerman Aug. 28, 2017, 11:31 a.m. UTC | #6
Al Viro <viro@ZenIV.linux.org.uk> writes:

> On Mon, Aug 28, 2017 at 02:38:37PM +1000, Paul Mackerras wrote:
>> On Sun, Aug 27, 2017 at 10:02:20PM +0100, Al Viro wrote:
>> > On Wed, Aug 23, 2017 at 04:06:24PM +1000, Paul Mackerras wrote:
>> > 
>> > > It seems to me that it would be better to do the anon_inode_getfd()
>> > > call before the kvm_get_kvm() call, and go to the fail label if it
>> > > fails.
>> > 
>> > And what happens if another thread does close() on the (guessed) fd?
>> 
>> Chaos ensues, but mostly because we don't have proper mutual exclusion
>> on the modifications to the list.  I'll add a mutex_lock/unlock to
>> kvm_spapr_tce_release() and move the anon_inode_getfd() call inside
>> the mutex.
>> 
>> It looks like the other possible uses of the fd (mmap, and passing it
>> as a parameter to the KVM_DEV_VFIO_GROUP_SET_SPAPR_TCE ioctl on a KVM
>> device fd) are safe.
>
> Frankly, it's a lot saner to have "no failure points past anon_inode_getfd()"
> policy...

Actually I thought that was a hard rule. But I don't see it documented
or mentioned anywhere so I'm not sure now why I thought that.

cheers
diff mbox

Patch

diff --git a/arch/powerpc/kvm/book3s_64_vio.c b/arch/powerpc/kvm/book3s_64_vio.c
index a160c14..e2228f1 100644
--- a/arch/powerpc/kvm/book3s_64_vio.c
+++ b/arch/powerpc/kvm/book3s_64_vio.c
@@ -341,8 +341,16 @@  long kvm_vm_ioctl_create_spapr_tce(struct kvm *kvm,

        mutex_unlock(&kvm->lock);

-       return anon_inode_getfd("kvm-spapr-tce", &kvm_spapr_tce_fops,
+       ret = anon_inode_getfd("kvm-spapr-tce", &kvm_spapr_tce_fops,
                                stt, O_RDWR | O_CLOEXEC);
+       if (ret < 0) {
+               mutex_lock(&kvm->lock);
+               list_del_rcu(&stt->list);
+               mutex_unlock(&kvm->lock);
+               kvm_put_kvm(kvm);
+               goto fail;
+       }
+       return ret;

 fail:
        if (stt) {