[libnetfilter_queue] Stop a memory leak in nfq_close

Message ID 20240506231719.9589-1-duncan_roe@optusnet.com.au
State New
Series [libnetfilter_queue] Stop a memory leak in nfq_close

Commit Message

Duncan Roe May 6, 2024, 11:17 p.m. UTC
0c5e5fb introduced struct nfqnl_q_handle *qh_list which can point to
dynamically acquired memory. Without this patch, that memory is not freed.

Fixes: 0c5e5fb15205 ("sync with all 'upstream' changes in libnfnetlink_log")
Signed-off-by: Duncan Roe <duncan_roe@optusnet.com.au>
---
 src/libnetfilter_queue.c | 6 ++++++
 1 file changed, 6 insertions(+)

Comments

Pablo Neira Ayuso June 5, 2024, 3:19 p.m. UTC | #1
Hi Duncan,

On Tue, May 07, 2024 at 09:17:19AM +1000, Duncan Roe wrote:
> 0c5e5fb introduced struct nfqnl_q_handle *qh_list which can point to
> dynamically acquired memory. Without this patch, that memory is not freed.

Indeed.

Looking at the example available at utils, I can see this assumes
that:

        nfq_destroy_queue(qh);

needs to be called.

qh->data can be also set to heap structure, in that case this would leak too.

It seems nfq_destroy_queue() needs to be called before nfq_close() by design.

Probably add:

        assert(h->qh_list == NULL);

at the top of nfq_close() instead to give a chance to users of this to
fix their code in case they are leaking qh?
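
A minimal sketch of the ordering described above, using hypothetical mock_*
stand-ins rather than the real libnetfilter_queue types (the real structs are
opaque, and the real calls need a netlink socket). It only models the list
bookkeeping: every queue handle is destroyed first, so the assert at the top
of the close routine holds.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for struct nfq_handle / struct nfq_q_handle. */
struct mock_handle;

struct mock_q_handle {
	struct mock_q_handle *next;
	struct mock_handle *h;		/* owning library handle */
	unsigned int id;		/* queue number */
};

struct mock_handle {
	struct mock_q_handle *qh_list;	/* queue handles still alive */
};

struct mock_handle *mock_open(void)
{
	return calloc(1, sizeof(struct mock_handle));
}

/* Allocate a queue handle and push it onto the head of h->qh_list. */
struct mock_q_handle *mock_create_queue(struct mock_handle *h, unsigned int id)
{
	struct mock_q_handle *qh = calloc(1, sizeof(*qh));

	if (!qh)
		return NULL;
	qh->h = h;
	qh->id = id;
	qh->next = h->qh_list;
	h->qh_list = qh;
	return qh;
}

/* Unlink qh from its owner's list, then free it. */
void mock_destroy_queue(struct mock_q_handle *qh)
{
	struct mock_q_handle **pp = &qh->h->qh_list;

	while (*pp && *pp != qh)
		pp = &(*pp)->next;
	if (*pp)
		*pp = qh->next;
	free(qh);
}

/* Count the queue handles that have not been destroyed yet. */
int mock_queues_left(const struct mock_handle *h)
{
	const struct mock_q_handle *qh;
	int n = 0;

	for (qh = h->qh_list; qh; qh = qh->next)
		n++;
	return n;
}

/* Close with the suggested check: aborts if a queue handle was leaked. */
void mock_close(struct mock_handle *h)
{
	assert(h->qh_list == NULL);
	free(h);
}
```

A caller that creates queues 0 and 1 would call mock_destroy_queue() on both
before mock_close(); skipping either call trips the assert.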

Thanks

> Fixes: 0c5e5fb15205 ("sync with all 'upstream' changes in libnfnetlink_log")
> Signed-off-by: Duncan Roe <duncan_roe@optusnet.com.au>
> ---
>  src/libnetfilter_queue.c | 6 ++++++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/src/libnetfilter_queue.c b/src/libnetfilter_queue.c
> index bf67a19..f152efb 100644
> --- a/src/libnetfilter_queue.c
> +++ b/src/libnetfilter_queue.c
> @@ -481,7 +481,13 @@ EXPORT_SYMBOL
>  int nfq_close(struct nfq_handle *h)
>  {
>  	int ret;
> +	struct nfq_q_handle *qh;
>  
> +	while (h->qh_list) {
> +		qh = h->qh_list;
> +		h->qh_list = qh->next;
> +		free(qh);
> +	}
>  	ret = nfnl_close(h->nfnlh);
>  	if (ret == 0)
>  		free(h);
> -- 
> 2.35.8
> 
>
Duncan Roe June 11, 2024, 2:46 a.m. UTC | #2
Hi Pablo,

On Wed, Jun 05, 2024 at 05:19:23PM +0200, Pablo Neira Ayuso wrote:
> Hi Duncan,
>
> On Tue, May 07, 2024 at 09:17:19AM +1000, Duncan Roe wrote:
> > 0c5e5fb introduced struct nfqnl_q_handle *qh_list which can point to
> > dynamically acquired memory. Without this patch, that memory is not freed.
>
> Indeed.
>
> Looking at the example available at utils, I can see this assumes
> that:
>
>         nfq_destroy_queue(qh);
>
> needs to be called.
>
> qh->data can be also set to heap structure, in that case this would leak too.
>
> It seems nfq_destroy_queue() needs to be called before nfq_close() by design.

Oh sorry, I missed that. Anyone starting with the example available at utils as
a template will be OK then.
But someone carefully checking each line of code might do a
`man nfq_destroy_queue` and see:
       Removes the binding for the specified queue handle. This call also
       unbind from the nfqueue handler, so you don't have to call
       nfq_unbind_pf.
And on then doing `man nfq_unbind_pf` that person would see:
       Unbinds the given queue connection handle from processing packets
       belonging to the given protocol family.

       This call is obsolete, Linux kernels from 3.8 onwards ignore it.
And might draw the conclusion that the call to nfq_destroy_queue is unnecessary,
especially if planning to call exit after calling nfq_close.
>
> Probably add:
>
>         assert(h->qh_list == NULL);

I don't like that. It would be the first assert() in libnetfilter_queue.
libnfnetlink is peppered with asserts: I removed them in the replacement
libmnl-using code because libmnl doesn't have them. Have you looked at the v2
patches BTW? I'd really appreciate some feedback.
>
> at the top of nfq_close() instead to give a chance to users of this to
> fix their code in case they are leaking qh?

It's not as important to call nfq_destroy_queue as it used to be. Why not just
free the memory? I could send a v2 with the Fixes: tag removed and a commit
message that mentions the change is a backstop in case nfq_destroy_queue was not
called.

Either way, `man nfq_destroy_queue` could be improved e.g.:
       Removes the binding for the specified queue handle. This call also
       releases associated internal memory.
While being about it, how about removing the obsolete code snippet at the
head of Library initialisation (that details calls to nfq_[un]bind_pf)?
Perhaps a separate doc: patch?

Cheers ... Duncan.
>
> Thanks
>
> > Fixes: 0c5e5fb15205 ("sync with all 'upstream' changes in libnfnetlink_log")
> > Signed-off-by: Duncan Roe <duncan_roe@optusnet.com.au>
> > ---
> >  src/libnetfilter_queue.c | 6 ++++++
> >  1 file changed, 6 insertions(+)
> >
> > diff --git a/src/libnetfilter_queue.c b/src/libnetfilter_queue.c
> > index bf67a19..f152efb 100644
> > --- a/src/libnetfilter_queue.c
> > +++ b/src/libnetfilter_queue.c
> > @@ -481,7 +481,13 @@ EXPORT_SYMBOL
> >  int nfq_close(struct nfq_handle *h)
> >  {
> >  	int ret;
> > +	struct nfq_q_handle *qh;
> >
> > +	while (h->qh_list) {
> > +		qh = h->qh_list;
> > +		h->qh_list = qh->next;
> > +		free(qh);
> > +	}
> >  	ret = nfnl_close(h->nfnlh);
> >  	if (ret == 0)
> >  		free(h);
> > --
> > 2.35.8
> >
> >
Pablo Neira Ayuso June 11, 2024, 10:21 p.m. UTC | #3
On Tue, Jun 11, 2024 at 12:46:39PM +1000, Duncan Roe wrote:
> Hi Pablo,
> 
> On Wed, Jun 05, 2024 at 05:19:23PM +0200, Pablo Neira Ayuso wrote:
> > Hi Duncan,
> >
> > On Tue, May 07, 2024 at 09:17:19AM +1000, Duncan Roe wrote:
> > > 0c5e5fb introduced struct nfqnl_q_handle *qh_list which can point to
> > > dynamically acquired memory. Without this patch, that memory is not freed.
> >
> > Indeed.
> >
> > Looking at the example available at utils, I can see this assumes
> > that:
> >
> >         nfq_destroy_queue(qh);
> >
> > needs to be called.
> >
> > qh->data can be also set to heap structure, in that case this would leak too.
> >
> > It seems nfq_destroy_queue() needs to be called before nfq_close() by design.
> 
> Oh sorry, I missed that. Anyone starting with the example available at utils as
> a template will be OK then.
> But someone carefully checking each line of code might do a
> `man nfq_destroy_queue` and see:
>        Removes the binding for the specified queue handle. This call also
>        unbind from the nfqueue handler, so you don't have to call
>        nfq_unbind_pf.
> And on then doing `man nfq_unbind_pf` that person would see:
>        Unbinds the given queue connection handle from processing packets
>        belonging to the given protocol family.
> 
>        This call is obsolete, Linux kernels from 3.8 onwards ignore it.
> And might draw the conclusion that the call to nfq_destroy_queue is unnecessary,
> especially if planning to call exit after calling nfq_close.

Then, update documentation.

> > Probably add:
> >
> >         assert(h->qh_list == NULL);
> 
> I don't like that. It would be the first assert() in libnetfilter_queue.
> libnfnetlink is peppered with asserts: I removed them in the replacement
> libmnl-using code because libmnl doesn't have them. Have you looked at the v2
> patches BTW? I'd really appreciate some feedback.
>
> >
> > at the top of nfq_close() instead to give a chance to users of this to
> > fix their code in case they are leaking qh?
> 
> It's not as important to call nfq_destroy_queue as it used to be. Why not just
> free the memory?

It is not possible to know if qh->data is stored in the bss, on the stack
or on the heap; it is up to the user to decide this.

> I could send a v2 with the Fixes: tag removed and a commit
> message that mentions the change is a backstop in case nfq_destroy_queue was not
> called.
> 
> Either way, `man nfq_destroy_queue` could be improved e.g.:
>        Removes the binding for the specified queue handle. This call also
>        releases associated internal memory.
> While being about it, how about removing the obsolete code snippet at the
> head of Library initialisation (that details calls to nfq_[un]bind_pf)?
> Perhaps a separate doc: patch?

I'd suggest to address this by updating documentation.

Thanks.
Duncan Roe June 13, 2024, 3:09 a.m. UTC | #4
On Wed, Jun 12, 2024 at 12:21:41AM +0200, Pablo Neira Ayuso wrote:
> On Tue, Jun 11, 2024 at 12:46:39PM +1000, Duncan Roe wrote:
> > Hi Pablo,
> >
> > On Wed, Jun 05, 2024 at 05:19:23PM +0200, Pablo Neira Ayuso wrote:
> > > Hi Duncan,
> > >
> > > On Tue, May 07, 2024 at 09:17:19AM +1000, Duncan Roe wrote:
> > > > 0c5e5fb introduced struct nfqnl_q_handle *qh_list which can point to
> > > > dynamically acquired memory. Without this patch, that memory is not freed.
> > >
> > > Indeed.
> > >
> > > Looking at the example available at utils, I can see this assumes
> > > that:
> > >
> > >         nfq_destroy_queue(qh);
> > >
> > > needs to be called.
> > >
> > > qh->data can be also set to heap structure, in that case this would leak too.
> > >
> > > It seems nfq_destroy_queue() needs to be called before nfq_close() by design.
> >
> > Oh sorry, I missed that. Anyone starting with the example available at utils as
> > a template will be OK then.
> > But someone carefully checking each line of code might do a
> > `man nfq_destroy_queue` and see:
> >        Removes the binding for the specified queue handle. This call also
> >        unbind from the nfqueue handler, so you don't have to call
> >        nfq_unbind_pf.
> > And on then doing `man nfq_unbind_pf` that person would see:
> >        Unbinds the given queue connection handle from processing packets
> >        belonging to the given protocol family.
> >
> >        This call is obsolete, Linux kernels from 3.8 onwards ignore it.
> > And might draw the conclusion that the call to nfq_destroy_queue is unnecessary,
> > especially if planning to call exit after calling nfq_close.
>
> Then, update documentation.
>
> > > Probably add:
> > >
> > >         assert(h->qh_list == NULL);
> >
> > I don't like that. It would be the first assert() in libnetfilter_queue.
> > libnfnetlink is peppered with asserts: I removed them in the replacement
> > libmnl-using code because libmnl doesn't have them. Have you looked at the v2
> > patches BTW? I'd really appreciate some feedback.
> >
> > >
> > > at the top of nfq_close() instead to give a chance to users of this to
> > > fix their code in case they are leaking qh?
> >
> > It's not as important to call nfq_destroy_queue as it used to be. Why not just
> > free the memory?
>
> It is not possible to know if qh->data is stored in the bss, on the stack
> or on the heap; it is up to the user to decide this.

qh->data is a pointer which is assigned in nfq_create_queue() at
libnetfilter_queue.c:584. I was never proposing to free what qh->data points to,
only to free any left-over qh structs.

The user cannot access qh->data directly because qh (struct nfq_q_handle) is
opaque.

nfq_destroy_queue(qh) will free qh at libnetfilter_queue.c:619. I'm just
proposing to free qh's for which nfq_destroy_queue was not called. In
nfq_close(h), h->qh_list can only have struct nfq_q_handles if it has anything.
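
To illustrate the distinction, here is a small model, assuming hypothetical
own_* types in place of the opaque library structs. Releasing the left-over
handle structs never dereferences or frees the data pointer, so it does not
matter whether the caller's data lives in static storage, on the stack or on
the heap.

```c
#include <stdlib.h>

/* Hypothetical model of a queue handle; not the library's real layout. */
struct own_q_handle {
	struct own_q_handle *next;
	void *data;	/* caller-owned: static, stack or heap storage */
};

struct own_handle {
	struct own_q_handle *qh_list;
};

/* Register a queue handle carrying the caller's data pointer. */
int own_add_queue(struct own_handle *h, void *data)
{
	struct own_q_handle *qh = malloc(sizeof(*qh));

	if (!qh)
		return -1;
	qh->data = data;
	qh->next = h->qh_list;
	h->qh_list = qh;
	return 0;
}

/*
 * Free every left-over handle struct and return how many were freed.
 * qh->data is deliberately left alone: only the caller knows how that
 * memory was obtained.
 */
int own_release_handles(struct own_handle *h)
{
	struct own_q_handle *qh;
	int freed = 0;

	while (h->qh_list) {
		qh = h->qh_list;
		h->qh_list = qh->next;
		free(qh);
		freed++;
	}
	return freed;
}
```

Data registered from a static variable and from a stack variable is still
intact after own_release_handles() returns; a heap block would likewise
remain the caller's to free.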
>
> > I could send a v2 with the Fixes: tag removed and a commit
> > message that mentions the change is a backstop in case nfq_destroy_queue was not
> > called.

I can still do that. It's an enhancement now.
> >
> > Either way, `man nfq_destroy_queue` could be improved e.g.:
> >        Removes the binding for the specified queue handle. This call also
> >        releases associated internal memory.
> > While being about it, how about removing the obsolete code snippet at the
> > head of Library initialisation (that details calls to nfq_[un]bind_pf)?
> > Perhaps a separate doc: patch?
>
> I'd suggest to address this by updating documentation.

Yes it could do with updating. My first instinct would be to remove the
nfq_unbind_pf comments and code snippet as I mentioned originally. Kernel 3.8 is
way out of LTS.

But, there have been quite recent emails on the lists from folks who are stuck
with these old kernels owing to having proprietary closed-source binary blobs.

I could leave these old comments and doxygen lines in but with extra lines
highlighting they are for pre-3.8 only. Do you have a preference?

Cheers ... Duncan.
>
> Thanks.
>
Patch

diff --git a/src/libnetfilter_queue.c b/src/libnetfilter_queue.c
index bf67a19..f152efb 100644
--- a/src/libnetfilter_queue.c
+++ b/src/libnetfilter_queue.c
@@ -481,7 +481,13 @@  EXPORT_SYMBOL
 int nfq_close(struct nfq_handle *h)
 {
 	int ret;
+	struct nfq_q_handle *qh;
 
+	while (h->qh_list) {
+		qh = h->qh_list;
+		h->qh_list = qh->next;
+		free(qh);
+	}
 	ret = nfnl_close(h->nfnlh);
 	if (ret == 0)
 		free(h);
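
The control flow of the patched function can be modelled outside the library
as follows. This is a sketch under stated assumptions: the sk_* names are
hypothetical, and sk_transport_close() is a stub standing in for nfnl_close()
whose result is injected through a field so that both outcomes can be
exercised.

```c
#include <stdlib.h>

/* Hypothetical stand-ins; the real structs carry more fields. */
struct sk_q_handle {
	struct sk_q_handle *next;
};

struct sk_handle {
	struct sk_q_handle *qh_list;
	int close_result;	/* value the stubbed transport close returns */
};

/* Stub for nfnl_close(): just reports the injected result. */
static int sk_transport_close(struct sk_handle *h)
{
	return h->close_result;
}

/* Same shape as the patched function: sweep the list, then close. */
int sk_close(struct sk_handle *h)
{
	int ret;
	struct sk_q_handle *qh;

	while (h->qh_list) {
		qh = h->qh_list;
		h->qh_list = qh->next;
		free(qh);
	}
	ret = sk_transport_close(h);
	if (ret == 0)
		free(h);
	return ret;
}

/* Build a handle with nqueues left-over queue handles on its list. */
struct sk_handle *sk_open_with_queues(int nqueues, int close_result)
{
	struct sk_handle *h = calloc(1, sizeof(*h));
	struct sk_q_handle *qh;
	int i;

	if (!h)
		return NULL;
	h->close_result = close_result;
	for (i = 0; i < nqueues; i++) {
		qh = calloc(1, sizeof(*qh));
		if (!qh) {
			/* Undo the partial setup; force the free of h. */
			h->close_result = 0;
			sk_close(h);
			return NULL;
		}
		qh->next = h->qh_list;
		h->qh_list = qh;
	}
	return h;
}
```

Because the sweep runs before the transport close, a failed close leaves the
handle allocated but with an empty list, so a retry cannot free a queue
handle twice.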