Message ID | 157383036409.3173.14386381829936652438.stgit@firesoul |
---|---|
State | Superseded |
Delegated to: | David Miller |
Headers | show |
Series | page_pool: followup changes to restore tracepoint features | expand |
Jesper Dangaard Brouer <brouer@redhat.com> writes: > When Jonathan change the page_pool to become responsible to its > own shutdown via deferred work queue, then the disconnect_cnt > counter was removed from xdp memory model tracepoint. > > This patch change the page_pool_inflight tracepoint name to > page_pool_release, because it reflects the new responsability > better. And it reintroduces a counter that reflect the number of > times page_pool_release have been tried. > > The counter is also used by the code, to only empty the alloc > cache once. With a stuck work queue running every second and > counter being 64-bit, it will overrun in approx 584 billion > years. For comparison, Earth lifetime expectancy is 7.5 billion > years, before the Sun will engulf, and destroy, the Earth. I love how you just casually threw that last bit in there; and now I'm thinking about under which conditions that would not be enough. Maybe someone will put this code on a space probe bound for interstellar space, which will escape the death of our solar system only to be destined to increment this counter forever in the cold, dead void of space? I think that is a risk we can live with, so: Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
On Fri, 15 Nov 2019 17:33:18 +0100 Toke Høiland-Jørgensen <toke@redhat.com> wrote: > Jesper Dangaard Brouer <brouer@redhat.com> writes: > > > When Jonathan change the page_pool to become responsible to its > > own shutdown via deferred work queue, then the disconnect_cnt > > counter was removed from xdp memory model tracepoint. > > > > This patch change the page_pool_inflight tracepoint name to > > page_pool_release, because it reflects the new responsability > > better. And it reintroduces a counter that reflect the number of > > times page_pool_release have been tried. > > > > The counter is also used by the code, to only empty the alloc > > cache once. With a stuck work queue running every second and > > counter being 64-bit, it will overrun in approx 584 billion > > years. For comparison, Earth lifetime expectancy is 7.5 billion > > years, before the Sun will engulf, and destroy, the Earth. > > I love how you just casually threw that last bit in there; and now I'm > thinking about under which conditions that would not be enough. Maybe > someone will put this code on a space probe bound for interstellar > space, which will escape the death of our solar system only to be > destined to increment this counter forever in the cold, dead void of > space? Like with performance numbers, when ever presenting a number, I always strive to relate it some something else, as without that the number is just a number. > I think that is a risk we can live with, so: > > Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Thx
diff --git a/include/net/page_pool.h b/include/net/page_pool.h index 1121faa99c12..ace881c15dcb 100644 --- a/include/net/page_pool.h +++ b/include/net/page_pool.h @@ -112,6 +112,8 @@ struct page_pool { * refcnt serves purpose is to simplify drivers error handling. */ refcount_t user_cnt; + + u64 destroy_cnt; }; struct page *page_pool_alloc_pages(struct page_pool *pool, gfp_t gfp); diff --git a/include/trace/events/page_pool.h b/include/trace/events/page_pool.h index 47b5ee880aa9..ee7f1aca7839 100644 --- a/include/trace/events/page_pool.h +++ b/include/trace/events/page_pool.h @@ -10,7 +10,7 @@ #include <net/page_pool.h> -TRACE_EVENT(page_pool_inflight, +TRACE_EVENT(page_pool_release, TP_PROTO(const struct page_pool *pool, s32 inflight, u32 hold, u32 release), @@ -22,6 +22,7 @@ TRACE_EVENT(page_pool_inflight, __field(s32, inflight) __field(u32, hold) __field(u32, release) + __field(u64, cnt) ), TP_fast_assign( @@ -29,10 +30,12 @@ TRACE_EVENT(page_pool_inflight, __entry->inflight = inflight; __entry->hold = hold; __entry->release = release; + __entry->cnt = pool->destroy_cnt; ), - TP_printk("page_pool=%p inflight=%d hold=%u release=%u", - __entry->pool, __entry->inflight, __entry->hold, __entry->release) + TP_printk("page_pool=%p inflight=%d hold=%u release=%u cnt=%llu", + __entry->pool, __entry->inflight, __entry->hold, + __entry->release, __entry->cnt) ); TRACE_EVENT(page_pool_state_release, diff --git a/net/core/page_pool.c b/net/core/page_pool.c index dfc2501c35d9..e28db2ef8e12 100644 --- a/net/core/page_pool.c +++ b/net/core/page_pool.c @@ -200,7 +200,7 @@ static s32 page_pool_inflight(struct page_pool *pool) inflight = _distance(hold_cnt, release_cnt); - trace_page_pool_inflight(pool, inflight, hold_cnt, release_cnt); + trace_page_pool_release(pool, inflight, hold_cnt, release_cnt); WARN(inflight < 0, "Negative(%d) inflight packet-pages", inflight); return inflight; @@ -349,10 +349,13 @@ static void page_pool_free(struct page_pool *pool) kfree(pool); } -static void page_pool_scrub(struct page_pool *pool) +static void page_pool_empty_alloc_cache_once(struct page_pool *pool) { struct page *page; + if (pool->destroy_cnt) + return; + /* Empty alloc cache, assume caller made sure this is * no-longer in use, and page_pool_alloc_pages() cannot be * call concurrently. @@ -361,6 +364,12 @@ static void page_pool_scrub(struct page_pool *pool) page = pool->alloc.cache[--pool->alloc.count]; __page_pool_return_page(pool, page); } +} + +static void page_pool_scrub(struct page_pool *pool) +{ + page_pool_empty_alloc_cache_once(pool); + pool->destroy_cnt++; /* No more consumers should exist, but producers could still * be in-flight.
When Jonathan change the page_pool to become responsible to its own shutdown via deferred work queue, then the disconnect_cnt counter was removed from xdp memory model tracepoint. This patch change the page_pool_inflight tracepoint name to page_pool_release, because it reflects the new responsability better. And it reintroduces a counter that reflect the number of times page_pool_release have been tried. The counter is also used by the code, to only empty the alloc cache once. With a stuck work queue running every second and counter being 64-bit, it will overrun in approx 584 billion years. For comparison, Earth lifetime expectancy is 7.5 billion years, before the Sun will engulf, and destroy, the Earth. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> --- include/net/page_pool.h | 2 ++ include/trace/events/page_pool.h | 9 ++++++--- net/core/page_pool.c | 13 +++++++++++-- 3 files changed, 19 insertions(+), 5 deletions(-)