Message ID: 20191224010103.56407-1-mcroce@redhat.com
Series: mvpp2: page_pool support
On Tue, Dec 24, 2019 at 02:01:01AM +0100, Matteo Croce wrote:
> These patches change the memory allocator of mvpp2 from the frag allocator to
> the page_pool API. This change is needed to later add XDP support to mvpp2.
>
> The reason I send it as RFC is that with this changeset, mvpp2 performs much
> slower. This is the tc drop rate measured with a single flow:
>
> stock net-next with frag allocator:
> rx: 900.7 Mbps 1877 Kpps
>
> this patchset with page_pool:
> rx: 423.5 Mbps 882.3 Kpps
>
> This is the perf top when receiving traffic:
>
>   27.68% [kernel]   [k] __page_pool_clean_page

This seems extremely high on the list.

>    9.79% [kernel]   [k] get_page_from_freelist
>    7.18% [kernel]   [k] free_unref_page
>    4.64% [kernel]   [k] build_skb
>    4.63% [kernel]   [k] __netif_receive_skb_core
>    3.83% [mvpp2]    [k] mvpp2_poll
>    3.64% [kernel]   [k] eth_type_trans
>    3.61% [kernel]   [k] kmem_cache_free
>    3.03% [kernel]   [k] kmem_cache_alloc
>    2.76% [kernel]   [k] dev_gro_receive
>    2.69% [mvpp2]    [k] mvpp2_bm_pool_put
>    2.68% [kernel]   [k] page_frag_free
>    1.83% [kernel]   [k] inet_gro_receive
>    1.74% [kernel]   [k] page_pool_alloc_pages
>    1.70% [kernel]   [k] __build_skb
>    1.47% [kernel]   [k] __alloc_pages_nodemask
>    1.36% [mvpp2]    [k] mvpp2_buf_alloc.isra.0
>    1.29% [kernel]   [k] tcf_action_exec
>
> I tried Ilias' patches for page_pool recycling, I get an improvement
> to ~1100 Kpps, but I'm still far from the original allocator.

Can you post the recycling perf for comparison?

> Any idea on why I get such bad numbers?

Nope, but it's indeed strange.

> Another reason to send it as RFC is that I'm not fully convinced on how to
> use the page_pool given the HW limitation of the BM.

I'll have a look right after the holidays.

> The driver currently uses, for every CPU, a page_pool for short packets and
> another for long ones. The driver also has 4 rx queues per port, so every
> RXQ #1 will share the short and long page pools of CPU #1.

I am not sure I am following the hardware config here.

> This means that for every RX queue I call xdp_rxq_info_reg_mem_model() twice,
> on two different page_pools; can this be a problem?
>
> As usual, ideas are welcome.
>
> Matteo Croce (2):
>   mvpp2: use page_pool allocator
>   mvpp2: memory accounting
>
>  drivers/net/ethernet/marvell/Kconfig        |   1 +
>  drivers/net/ethernet/marvell/mvpp2/mvpp2.h  |   7 +
>  .../net/ethernet/marvell/mvpp2/mvpp2_main.c | 142 +++++++++++++++---
>  3 files changed, 125 insertions(+), 25 deletions(-)
>
> --
> 2.24.1

Cheers
/Ilias
On Tue, Dec 24, 2019 at 10:52 AM Ilias Apalodimas
<ilias.apalodimas@linaro.org> wrote:
>
> On Tue, Dec 24, 2019 at 02:01:01AM +0100, Matteo Croce wrote:
> [...]
> > This is the perf top when receiving traffic:
> >
> >   27.68% [kernel]   [k] __page_pool_clean_page
>
> This seems extremely high on the list.
>
> [...]
> > I tried Ilias' patches for page_pool recycling, I get an improvement
> > to ~1100 Kpps, but I'm still far from the original allocator.
>
> Can you post the recycling perf for comparison?
>

  12.00% [kernel]        [k] get_page_from_freelist
   9.25% [kernel]        [k] free_unref_page
   6.83% [kernel]        [k] eth_type_trans
   5.33% [kernel]        [k] __netif_receive_skb_core
   4.96% [mvpp2]         [k] mvpp2_poll
   4.64% [kernel]        [k] kmem_cache_free
   4.06% [kernel]        [k] __xdp_return
   3.60% [kernel]        [k] kmem_cache_alloc
   3.31% [kernel]        [k] dev_gro_receive
   3.29% [kernel]        [k] __page_pool_clean_page
   3.25% [mvpp2]         [k] mvpp2_bm_pool_put
   2.73% [kernel]        [k] __page_pool_put_page
   2.33% [kernel]        [k] __alloc_pages_nodemask
   2.33% [kernel]        [k] inet_gro_receive
   2.05% [kernel]        [k] __build_skb
   1.95% [kernel]        [k] build_skb
   1.89% [cls_matchall]  [k] mall_classify
   1.83% [kernel]        [k] page_pool_alloc_pages
   1.80% [kernel]        [k] tcf_action_exec
   1.70% [mvpp2]         [k] mvpp2_buf_alloc.isra.0
   1.63% [kernel]        [k] free_unref_page_prepare.part.0
   1.45% [kernel]        [k] page_pool_return_skb_page
   1.42% [act_gact]      [k] tcf_gact_act
   1.16% [kernel]        [k] netif_receive_skb_list_internal
   1.08% [kernel]        [k] kfree_skb
   1.07% [kernel]        [k] skb_release_data

> > Any idea on why I get such bad numbers?
>
> Nope, but it's indeed strange.
>
> > Another reason to send it as RFC is that I'm not fully convinced on how to
> > use the page_pool given the HW limitation of the BM.
>
> I'll have a look right after the holidays.
>

Thanks

> > The driver currently uses, for every CPU, a page_pool for short packets and
> > another for long ones. The driver also has 4 rx queues per port, so every
> > RXQ #1 will share the short and long page pools of CPU #1.
>
> I am not sure I am following the hardware config here.
>

Never mind, it's quite a mess, I needed a lot of time to get it :)

The HW puts the packets in different buffer pools depending on their size:

short: 64..128
long:  128..1664
jumbo: 1664..9856

Let's skip the jumbo pool for now and assume we have 4 CPUs: the driver
allocates 4 short and 4 long buffer pools. Each port has 4 RX queues, and
each one uses a short and a long pool.

With the page_pool API, we have 8 struct page_pool, 4 for the short and 4 for
the long buffers.

> > This means that for every RX queue I call xdp_rxq_info_reg_mem_model() twice,
> > on two different page_pools; can this be a problem?
>
> [...]
>
> Cheers
> /Ilias

Bye,
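To picture the layout described above, here is a rough sketch of how one short
and one long page_pool per CPU could be created and attached to an RX queue's
xdp_rxq_info. This is not the actual patchset code: the helper names, the pool
size, and the error handling are made up for illustration; only
page_pool_create(), PP_FLAG_DMA_MAP and xdp_rxq_info_reg_mem_model() with
MEM_TYPE_PAGE_POOL are real kernel APIs.

#include <net/page_pool.h>
#include <net/xdp.h>
#include <linux/topology.h>

/* Illustrative: one pool per CPU and per size class (short/long). */
static struct page_pool *mvpp2_pool_create_sketch(struct device *dev, int cpu)
{
	struct page_pool_params pp_params = {
		.order		= 0,			/* one page per buffer */
		.flags		= PP_FLAG_DMA_MAP,	/* pool owns the DMA mapping */
		.pool_size	= 2048,			/* illustrative size */
		.nid		= cpu_to_node(cpu),
		.dev		= dev,
		.dma_dir	= DMA_FROM_DEVICE,
	};

	return page_pool_create(&pp_params);
}

/* Each RXQ uses both the short and the long pool of "its" CPU, so the mem
 * model would be registered twice on the same xdp_rxq_info -- which is
 * exactly the open question raised in the cover letter. */
static int mvpp2_rxq_register_pools_sketch(struct xdp_rxq_info *xdp_rxq,
					   struct page_pool *pp_short,
					   struct page_pool *pp_long)
{
	int err;

	err = xdp_rxq_info_reg_mem_model(xdp_rxq, MEM_TYPE_PAGE_POOL, pp_short);
	if (err)
		return err;

	return xdp_rxq_info_reg_mem_model(xdp_rxq, MEM_TYPE_PAGE_POOL, pp_long);
}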
On Tue, 24 Dec 2019 11:52:29 +0200
Ilias Apalodimas <ilias.apalodimas@linaro.org> wrote:

> On Tue, Dec 24, 2019 at 02:01:01AM +0100, Matteo Croce wrote:
> > These patches change the memory allocator of mvpp2 from the frag allocator to
> > the page_pool API. This change is needed to later add XDP support to mvpp2.
> [...]
> > This is the perf top when receiving traffic:
> >
> >   27.68% [kernel]   [k] __page_pool_clean_page
>
> This seems extremely high on the list.

This looks related to the cost of the dma unmap, as the page_pool is created
with PP_FLAG_DMA_MAP. (It is a little strange, as page_pool uses
DMA_ATTR_SKIP_CPU_SYNC, which should make it less expensive.)

> >    9.79% [kernel]   [k] get_page_from_freelist

You are clearly hitting the page allocator every time, because you are not
using the page_pool recycle facility.

> >    7.18% [kernel]   [k] free_unref_page
> [...]
> > I tried Ilias' patches for page_pool recycling, I get an improvement
> > to ~1100 Kpps, but I'm still far from the original allocator.
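To make this point a bit more concrete, below is a hedged sketch, not driver
or kernel source, of the two possible fates of a buffer when the pool is
created with PP_FLAG_DMA_MAP. The function name and the "consumed" flag are
made up; page_pool_recycle_direct() and page_pool_release_page() are the real
calls involved.

/* Illustrative only. Recycling keeps the DMA mapping and avoids the page
 * allocator; releasing the page from the pool is what unmaps it
 * (__page_pool_clean_page) and later sends it through free_unref_page /
 * get_page_from_freelist on the next refill. */
static void rx_buffer_done_sketch(struct page_pool *pp, struct page *page,
				  bool consumed_by_stack)
{
	if (!consumed_by_stack) {
		/* packet dropped in the driver (e.g. XDP_DROP): put the page
		 * straight back into the pool, DMA mapping stays valid */
		page_pool_recycle_direct(pp, page);
		return;
	}

	/* page handed to the skb path without recycling support: the pool
	 * unmaps it now and the page later goes back to the buddy allocator */
	page_pool_release_page(pp, page);
}

In this RFC every packet that becomes an skb takes the second path, which
matches the profile above.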
On Tue, 24 Dec 2019 14:34:07 +0100
Matteo Croce <mcroce@redhat.com> wrote:

> On Tue, Dec 24, 2019 at 10:52 AM Ilias Apalodimas
> <ilias.apalodimas@linaro.org> wrote:
> >
> > On Tue, Dec 24, 2019 at 02:01:01AM +0100, Matteo Croce wrote:
> > [...]
> > > I tried Ilias' patches for page_pool recycling, I get an improvement
> > > to ~1100 Kpps, but I'm still far from the original allocator.
> >
> > Can you post the recycling perf for comparison?
> >
>
>   12.00% [kernel]        [k] get_page_from_freelist
>    9.25% [kernel]        [k] free_unref_page

Hmm, this indicates pages are not getting recycled.

>    6.83% [kernel]        [k] eth_type_trans
>    5.33% [kernel]        [k] __netif_receive_skb_core
>    4.96% [mvpp2]         [k] mvpp2_poll
>    4.64% [kernel]        [k] kmem_cache_free
>    4.06% [kernel]        [k] __xdp_return

You do invoke the __xdp_return() code, but it might find that the page cannot
be recycled...

>    3.60% [kernel]        [k] kmem_cache_alloc
>    3.31% [kernel]        [k] dev_gro_receive
>    3.29% [kernel]        [k] __page_pool_clean_page
>    3.25% [mvpp2]         [k] mvpp2_bm_pool_put
>    2.73% [kernel]        [k] __page_pool_put_page
> [...]
On Tue, Dec 24, 2019 at 3:01 PM Jesper Dangaard Brouer
<brouer@redhat.com> wrote:
>
> On Tue, 24 Dec 2019 11:52:29 +0200
> Ilias Apalodimas <ilias.apalodimas@linaro.org> wrote:
> [...]
>
> You are clearly hitting the page allocator every time, because you are not
> using the page_pool recycle facility.
>
> [...]
> --
> Best regards,
>   Jesper Dangaard Brouer
>   MSc.CS, Principal Kernel Engineer at Red Hat
>   LinkedIn: http://www.linkedin.com/in/brouer
>

The change I did to use the recycling is the following:

--- a/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c
+++ b/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c
@@ -3071,7 +3071,7 @@ static int mvpp2_rx(struct mvpp2_port *port, struct napi_struct *napi,
 		if (pp)
-			page_pool_release_page(pp, virt_to_page(data));
+			skb_mark_for_recycle(skb, virt_to_page(data), &rxq->xdp_rxq.mem);
 		else
 			dma_unmap_single_attrs(dev->dev.parent, dma_addr,

--
Matteo Croce
per aspera ad upstream
On Tue, Dec 24, 2019 at 03:37:49PM +0100, Matteo Croce wrote:
> On Tue, Dec 24, 2019 at 3:01 PM Jesper Dangaard Brouer
> <brouer@redhat.com> wrote:
> [...]
>
> The change I did to use the recycling is the following:
>
> --- a/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c
> +++ b/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c
> @@ -3071,7 +3071,7 @@ static int mvpp2_rx(struct mvpp2_port *port, struct napi_struct *napi,
>  		if (pp)
> -			page_pool_release_page(pp, virt_to_page(data));
> +			skb_mark_for_recycle(skb, virt_to_page(data), &rxq->xdp_rxq.mem);
>  		else
>  			dma_unmap_single_attrs(dev->dev.parent, dma_addr,
>

Jesper is right, you aren't recycling anything. The skb_mark_for_recycle()
usage seems correct, but there are a few more cases where we refuse to
recycle (for example, coalescing page_pool and slab-allocated pages is
forbidden). I wonder if you hit one of those cases and recycling doesn't
take place.

We'll hopefully release updated code shortly. I'll ping you and we can test
with that.

Thanks
/Ilias