Message ID | 20231124154732.1623518-2-aleksander.lobakin@intel.com |
---|---|
State | Handled Elsewhere |
Series | net: intel: start The Great Code Dedup + Page Pool for iavf |
On 2023/11/24 23:47, Alexander Lobakin wrote:
> After commit 5027ec19f104 ("net: page_pool: split the page_pool_params
> into fast and slow") that made &page_pool contain only "hot" params at
> the start, cacheline boundary chops frag API fields group in the middle
> again.
> To not bother with this each time fast params get expanded or shrunk,
> let's just align them to `4 * sizeof(long)`, the closest upper pow-2 to
> their actual size (2 longs + 2 ints). This ensures 16-byte alignment for
> the 32-bit architectures and 32-byte alignment for the 64-bit ones,
> excluding unnecessary false-sharing.
>
> Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
> ---
>  include/net/page_pool/types.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/include/net/page_pool/types.h b/include/net/page_pool/types.h
> index e1bb92c192de..989d07b831fc 100644
> --- a/include/net/page_pool/types.h
> +++ b/include/net/page_pool/types.h
> @@ -127,7 +127,7 @@ struct page_pool {
>
>  	bool			has_init_callback;

It seems odd to have only a slow field between two fast
field groups, isn't it better to move it to the end of
page_pool or wherever is more appropriate?

>
> -	long			frag_users;
> +	long			frag_users __aligned(4 * sizeof(long));

If we need that, why not just use '____cacheline_aligned_in_smp'?

>  	struct page		*frag_page;
>  	unsigned int		frag_offset;
>  	u32			pages_state_hold_cnt;
>
On Fri, 24 Nov 2023 16:47:19 +0100 Alexander Lobakin wrote:
> -	long			frag_users;
> +	long			frag_users __aligned(4 * sizeof(long));

A comment for the somewhat unusual alignment size would be good.
From: Yunsheng Lin <linyunsheng@huawei.com>
Date: Sat, 25 Nov 2023 20:29:22 +0800

> On 2023/11/24 23:47, Alexander Lobakin wrote:
>> After commit 5027ec19f104 ("net: page_pool: split the page_pool_params
>> into fast and slow") that made &page_pool contain only "hot" params at
>> the start, cacheline boundary chops frag API fields group in the middle
>> again.
>> To not bother with this each time fast params get expanded or shrunk,
>> let's just align them to `4 * sizeof(long)`, the closest upper pow-2 to
>> their actual size (2 longs + 2 ints). This ensures 16-byte alignment for
>> the 32-bit architectures and 32-byte alignment for the 64-bit ones,
>> excluding unnecessary false-sharing.
>>
>> Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
>> ---
>>  include/net/page_pool/types.h | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/include/net/page_pool/types.h b/include/net/page_pool/types.h
>> index e1bb92c192de..989d07b831fc 100644
>> --- a/include/net/page_pool/types.h
>> +++ b/include/net/page_pool/types.h
>> @@ -127,7 +127,7 @@ struct page_pool {
>>
>>  	bool			has_init_callback;
>
> It seems odd to have only a slow field between two fast
> field groups, isn't it better to move it to the end of
> page_pool or wherever is more appropriate?

1. There will be more in the subsequent patches.
2. ::has_init_callback is checked on each new page allocation, so it's
   not slow. Jakub put it here on purpose.

>
>>
>> -	long			frag_users;
>> +	long			frag_users __aligned(4 * sizeof(long));
>
> If we need that, why not just use '____cacheline_aligned_in_smp'?

It can be overkill. We don't need a full cacheline, we only need these
fields to stay within one, no matter whether they are at the beginning
of it or at the end.

>
>>  	struct page		*frag_page;
>>  	unsigned int		frag_offset;
>>  	u32			pages_state_hold_cnt;
>>

Thanks,
Olek
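To make the "overkill" point in the reply above concrete, here is a rough userspace sketch rather than the kernel's actual layout. The two demo structs, the `hot[8]` stand-in for the preceding fields, and the 64-byte `DEMO_CACHELINE` are assumptions for illustration only; the GCC/Clang `aligned` attribute is what the kernel's `__aligned()` expands to.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_CACHELINE 64	/* assumed cacheline size, not taken from the thread */

struct page;			/* opaque here, only a pointer to it is stored */

/* Frag block aligned to 4 * sizeof(long), as in the patch. */
struct demo_pow2 {
	char hot[8];		/* hypothetical stand-in for earlier fields */
	bool has_init_callback;
	long frag_users __attribute__((aligned(4 * sizeof(long))));
	struct page *frag_page;
	unsigned int frag_offset;
	unsigned int pages_state_hold_cnt;
};

/* Same fields, but the frag block starts on a full cacheline boundary. */
struct demo_cacheline {
	char hot[8];
	bool has_init_callback;
	long frag_users __attribute__((aligned(DEMO_CACHELINE)));
	struct page *frag_page;
	unsigned int frag_offset;
	unsigned int pages_state_hold_cnt;
};

/* True if bytes [start, end) all live in one DEMO_CACHELINE-sized line. */
static inline bool demo_same_line(size_t start, size_t end)
{
	return start / DEMO_CACHELINE == (end - 1) / DEMO_CACHELINE;
}

/* Offset one past the last byte of the frag block in struct demo_pow2. */
static inline size_t demo_pow2_block_end(void)
{
	return offsetof(struct demo_pow2, pages_state_hold_cnt) +
	       sizeof(unsigned int);
}
```

With these assumed layouts, the smaller alignment keeps the whole frag block inside a single line while inserting less padding than the full-cacheline variant, which is the trade-off described in the reply.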
From: Jakub Kicinski <kuba@kernel.org>
Date: Sun, 26 Nov 2023 14:54:57 -0800

> On Fri, 24 Nov 2023 16:47:19 +0100 Alexander Lobakin wrote:
>> -	long			frag_users;
>> +	long			frag_users __aligned(4 * sizeof(long));
>
> A comment for the somewhat unusual alignment size would be good.

Roger that. Will paste a couple of words from the commit message.

FYI, I had an idea of doing something like

	__aligned(roundup_pow_of_two(2 * sizeof(long) + 2 * sizeof(int)))

but that looks horrible, so I settled on the current one :D
There are no functional changes between them either way.

Thanks,
Olek
On 2023/11/27 22:08, Alexander Lobakin wrote:
> From: Yunsheng Lin <linyunsheng@huawei.com>
> Date: Sat, 25 Nov 2023 20:29:22 +0800
>
>> On 2023/11/24 23:47, Alexander Lobakin wrote:
>>> After commit 5027ec19f104 ("net: page_pool: split the page_pool_params
>>> into fast and slow") that made &page_pool contain only "hot" params at
>>> the start, cacheline boundary chops frag API fields group in the middle
>>> again.
>>> To not bother with this each time fast params get expanded or shrunk,
>>> let's just align them to `4 * sizeof(long)`, the closest upper pow-2 to
>>> their actual size (2 longs + 2 ints). This ensures 16-byte alignment for
>>> the 32-bit architectures and 32-byte alignment for the 64-bit ones,
>>> excluding unnecessary false-sharing.
>>>
>>> Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
>>> ---
>>>  include/net/page_pool/types.h | 2 +-
>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/include/net/page_pool/types.h b/include/net/page_pool/types.h
>>> index e1bb92c192de..989d07b831fc 100644
>>> --- a/include/net/page_pool/types.h
>>> +++ b/include/net/page_pool/types.h
>>> @@ -127,7 +127,7 @@ struct page_pool {
>>>
>>>  	bool			has_init_callback;
>>
>> It seems odd to have only a slow field between two fast
>> field groups, isn't it better to move it to the end of
>> page_pool or wherever is more appropriate?
>
> 1. There will be more in the subsequent patches.
> 2. ::has_init_callback is checked on each new page allocation, so it's
>    not slow. Jakub put it here on purpose.
>
>>
>>>
>>> -	long			frag_users;
>>> +	long			frag_users __aligned(4 * sizeof(long));
>>
>> If we need that, why not just use '____cacheline_aligned_in_smp'?
>
> It can be overkill. We don't need a full cacheline, we only need these
> fields to stay within one, no matter whether they are at the beginning
> of it or at the end.

I am still a little lost here. A comment explaining why '4' is used
above would be really helpful.
From: Yunsheng Lin <linyunsheng@huawei.com>
Date: Wed, 29 Nov 2023 10:55:00 +0800

> On 2023/11/27 22:08, Alexander Lobakin wrote:
>> From: Yunsheng Lin <linyunsheng@huawei.com>
>> Date: Sat, 25 Nov 2023 20:29:22 +0800
>>
>>> On 2023/11/24 23:47, Alexander Lobakin wrote:
>>>> After commit 5027ec19f104 ("net: page_pool: split the page_pool_params
>>>> into fast and slow") that made &page_pool contain only "hot" params at
>>>> the start, cacheline boundary chops frag API fields group in the middle
>>>> again.
>>>> To not bother with this each time fast params get expanded or shrunk,
>>>> let's just align them to `4 * sizeof(long)`, the closest upper pow-2 to
>>>> their actual size (2 longs + 2 ints). This ensures 16-byte alignment for
>>>> the 32-bit architectures and 32-byte alignment for the 64-bit ones,
>>>> excluding unnecessary false-sharing.
>>>>
>>>> Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
>>>> ---
>>>>  include/net/page_pool/types.h | 2 +-
>>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>>
>>>> diff --git a/include/net/page_pool/types.h b/include/net/page_pool/types.h
>>>> index e1bb92c192de..989d07b831fc 100644
>>>> --- a/include/net/page_pool/types.h
>>>> +++ b/include/net/page_pool/types.h
>>>> @@ -127,7 +127,7 @@ struct page_pool {
>>>>
>>>>  	bool			has_init_callback;
>>>
>>> It seems odd to have only a slow field between two fast
>>> field groups, isn't it better to move it to the end of
>>> page_pool or wherever is more appropriate?
>>
>> 1. There will be more in the subsequent patches.
>> 2. ::has_init_callback is checked on each new page allocation, so it's
>>    not slow. Jakub put it here on purpose.
>>
>>>
>>>>
>>>> -	long			frag_users;
>>>> +	long			frag_users __aligned(4 * sizeof(long));
>>>
>>> If we need that, why not just use '____cacheline_aligned_in_smp'?
>>
>> It can be overkill. We don't need a full cacheline, we only need these
>> fields to stay within one, no matter whether they are at the beginning
>> of it or at the end.
>
> I am still a little lost here. A comment explaining why '4' is used
> above would be really helpful.

The block is: 2 longs (users, frag pointer) and 2 ints (offset, cnt).
On 32-bit architectures, long and int have the same size, so this
effectively means 4 longs. On 64-bit architectures, long is 8 bytes and
int is 4, so the value becomes 2 * 8 + 2 * 4 = 24, but the alignment
must be a pow-2. The closest upper pow-2 to 24 is 32, which equals
4 * 8 = 4 longs.

In the end, regardless of the architecture, the desired alignment ends
up as 4 longs. As I wrote earlier, we could do something like

	__aligned(roundup_pow_of_two(2 * sizeof(long) + 2 * sizeof(int)))

but doesn't that seem ugly as hell?

As I replied to Jakub, I'll add a comment in the code (so that you
wouldn't need to refer to the Git history / commit message) in the next
version.

Thanks,
Olek
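The arithmetic in the reply above can be checked with a small userspace sketch. This is only an illustration: `demo_round_up_pow2()` is a hypothetical helper standing in for the kernel's `roundup_pow_of_two()`, and the two accessor functions are made-up names that just spell out the two expressions being compared.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: smallest power of two that is >= n (n must be > 0). */
static inline size_t demo_round_up_pow2(size_t n)
{
	size_t p = 1;

	while (p < n)
		p <<= 1;
	return p;
}

/* Raw size of the frag block: 2 longs + 2 ints. */
static inline size_t demo_frag_block_size(void)
{
	return 2 * sizeof(long) + 2 * sizeof(int);
}

/* Alignment used by the patch. */
static inline size_t demo_frag_block_align(void)
{
	return 4 * sizeof(long);
}
```

On an ABI with 8-byte long and 4-byte int, the block is 24 bytes and rounds up to 32; with 4-byte long and int, it is 16 bytes and stays at 16. In both cases the rounded value matches `4 * sizeof(long)`, which is why the shorter expression can be used.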
diff --git a/include/net/page_pool/types.h b/include/net/page_pool/types.h
index e1bb92c192de..989d07b831fc 100644
--- a/include/net/page_pool/types.h
+++ b/include/net/page_pool/types.h
@@ -127,7 +127,7 @@ struct page_pool {
 
 	bool			has_init_callback;
 
-	long			frag_users;
+	long			frag_users __aligned(4 * sizeof(long));
 	struct page		*frag_page;
 	unsigned int		frag_offset;
 	u32			pages_state_hold_cnt;
After commit 5027ec19f104 ("net: page_pool: split the page_pool_params
into fast and slow") that made &page_pool contain only "hot" params at
the start, cacheline boundary chops frag API fields group in the middle
again.
To not bother with this each time fast params get expanded or shrunk,
let's just align them to `4 * sizeof(long)`, the closest upper pow-2 to
their actual size (2 longs + 2 ints). This ensures 16-byte alignment for
the 32-bit architectures and 32-byte alignment for the 64-bit ones,
excluding unnecessary false-sharing.

Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
---
 include/net/page_pool/types.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
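The commit message relies on a general property: a block no larger than its power-of-two alignment cannot cross the boundary of a cacheline that is at least that large. A brute-force sketch of that claim follows; the helper names and the sizes used with them are assumptions for illustration, not anything from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* True if bytes [start, start + len) span more than one line of size `line`. */
static inline bool demo_crosses_line(size_t start, size_t len, size_t line)
{
	return start / line != (start + len - 1) / line;
}

/*
 * Try every start offset below `limit` that is a multiple of `align` and
 * report whether a `len`-byte block placed there always fits in one line.
 */
static inline bool demo_never_straddles(size_t align, size_t len,
					size_t line, size_t limit)
{
	for (size_t start = 0; start < limit; start += align)
		if (demo_crosses_line(start, len, line))
			return false;
	return true;
}
```

For example, a 24-byte block aligned to 32 bytes never crosses a 64-byte line, whereas the same block aligned to only 8 bytes can (starting at offset 48, it runs into the next line).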