diff mbox series

[bpf,v2,1/2] libbpf: remove likely/unlikely in xsk.h

Message ID 1554880154-30791-2-git-send-email-magnus.karlsson@intel.com
State Changes Requested
Delegated to: BPF Maintainers
Headers show
Series libbpf: remove two dependencies on Linux kernel headers and improve performance as a bonus | expand

Commit Message

Magnus Karlsson April 10, 2019, 7:09 a.m. UTC
This patch removes the use of likely and unlikely in xsk.h since they
create a dependency on Linux headers as reported by several
users. There have also been reports that the use of these decreases
performance as the compiler puts the code on two different cache lines
instead of on a single one. All in all, I think we are better off
without them.

Fixes: 1cad07884239 ("libbpf: add support for using AF_XDP sockets")
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
---
 tools/lib/bpf/xsk.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

Comments

Y Song April 10, 2019, 5:56 p.m. UTC | #1
On Wed, Apr 10, 2019 at 12:21 AM Magnus Karlsson
<magnus.karlsson@intel.com> wrote:
>
> This patch removes the use of likely and unlikely in xsk.h since they
> create a dependency on Linux headers as reported by several
> users. There have also been reports that the use of these decreases
> performance as the compiler puts the code on two different cache lines
> instead of on a single one. All in all, I think we are better off
> without them.

The change looks good to me.
  Acked-by: Yonghong Song <yhs@fb.com>

The libbpf repo (https://github.com/libbpf/libbpf/) solved this issue by
providing a custom implementation just to satisfy compilation. I guess
the users here do not use the libbpf repo and instead extract the
kernel source directly and try to build?

Just curious: do you have detailed info about which code ends up on
two different cache lines instead of one, and how much performance
degradation that causes?

>
> Fixes: 1cad07884239 ("libbpf: add support for using AF_XDP sockets")
> Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
> ---
>  tools/lib/bpf/xsk.h | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/tools/lib/bpf/xsk.h b/tools/lib/bpf/xsk.h
> index a497f00..3638147 100644
> --- a/tools/lib/bpf/xsk.h
> +++ b/tools/lib/bpf/xsk.h
> @@ -105,7 +105,7 @@ static inline __u32 xsk_cons_nb_avail(struct xsk_ring_cons *r, __u32 nb)
>  static inline size_t xsk_ring_prod__reserve(struct xsk_ring_prod *prod,
>                                             size_t nb, __u32 *idx)
>  {
> -       if (unlikely(xsk_prod_nb_free(prod, nb) < nb))
> +       if (xsk_prod_nb_free(prod, nb) < nb)
>                 return 0;
>
>         *idx = prod->cached_prod;
> @@ -129,7 +129,7 @@ static inline size_t xsk_ring_cons__peek(struct xsk_ring_cons *cons,
>  {
>         size_t entries = xsk_cons_nb_avail(cons, nb);
>
> -       if (likely(entries > 0)) {
> +       if (entries > 0) {
>                 /* Make sure we do not speculatively read the data before
>                  * we have received the packet buffers from the ring.
>                  */
> --
> 2.7.4
>
Magnus Karlsson April 11, 2019, 6:20 a.m. UTC | #2
On Wed, Apr 10, 2019 at 9:08 PM Y Song <ys114321@gmail.com> wrote:
>
> On Wed, Apr 10, 2019 at 12:21 AM Magnus Karlsson
> <magnus.karlsson@intel.com> wrote:
> >
> > This patch removes the use of likely and unlikely in xsk.h since they
> > create a dependency on Linux headers as reported by several
> > users. There have also been reports that the use of these decreases
> > performance as the compiler puts the code on two different cache lines
> > instead of on a single one. All in all, I think we are better off
> > without them.
>
> The change looks good to me.
>   Acked-by: Yonghong Song <yhs@fb.com>
>
> The libbpf repo (https://github.com/libbpf/libbpf/) solved this issue by
> providing a custom implementation just to satisfy compilation. I guess
> the users here do not use the libbpf repo and instead extract the
> kernel source directly and try to build?

That is correct. Quite a number of people did not even know it existed
in the first place. Maybe we need more pointers to the repo.

> Just curious: do you have detailed info about which code ends up on
> two different cache lines instead of one, and how much performance
> degradation that causes?

Sorry, I have no detailed info. It is very dependent on the exact
access rates to and from the queues and the way the instructions line
up in the cache in the application. But the more often the unlikely
paths are taken, the more performance can be degraded. The
unlikely/likely paths in the ring access code are just not
unlikely/likely enough to warrant these annotations. They will occur
every time the locally cached values have to be refreshed from the
shared ones.

/Magnus

> >
> > Fixes: 1cad07884239 ("libbpf: add support for using AF_XDP sockets")
> > Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
> > ---
> >  tools/lib/bpf/xsk.h | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/tools/lib/bpf/xsk.h b/tools/lib/bpf/xsk.h
> > index a497f00..3638147 100644
> > --- a/tools/lib/bpf/xsk.h
> > +++ b/tools/lib/bpf/xsk.h
> > @@ -105,7 +105,7 @@ static inline __u32 xsk_cons_nb_avail(struct xsk_ring_cons *r, __u32 nb)
> >  static inline size_t xsk_ring_prod__reserve(struct xsk_ring_prod *prod,
> >                                             size_t nb, __u32 *idx)
> >  {
> > -       if (unlikely(xsk_prod_nb_free(prod, nb) < nb))
> > +       if (xsk_prod_nb_free(prod, nb) < nb)
> >                 return 0;
> >
> >         *idx = prod->cached_prod;
> > @@ -129,7 +129,7 @@ static inline size_t xsk_ring_cons__peek(struct xsk_ring_cons *cons,
> >  {
> >         size_t entries = xsk_cons_nb_avail(cons, nb);
> >
> > -       if (likely(entries > 0)) {
> > +       if (entries > 0) {
> >                 /* Make sure we do not speculatively read the data before
> >                  * we have received the packet buffers from the ring.
> >                  */
> > --
> > 2.7.4
> >
Patch

diff --git a/tools/lib/bpf/xsk.h b/tools/lib/bpf/xsk.h
index a497f00..3638147 100644
--- a/tools/lib/bpf/xsk.h
+++ b/tools/lib/bpf/xsk.h
@@ -105,7 +105,7 @@  static inline __u32 xsk_cons_nb_avail(struct xsk_ring_cons *r, __u32 nb)
 static inline size_t xsk_ring_prod__reserve(struct xsk_ring_prod *prod,
 					    size_t nb, __u32 *idx)
 {
-	if (unlikely(xsk_prod_nb_free(prod, nb) < nb))
+	if (xsk_prod_nb_free(prod, nb) < nb)
 		return 0;
 
 	*idx = prod->cached_prod;
@@ -129,7 +129,7 @@  static inline size_t xsk_ring_cons__peek(struct xsk_ring_cons *cons,
 {
 	size_t entries = xsk_cons_nb_avail(cons, nb);
 
-	if (likely(entries > 0)) {
+	if (entries > 0) {
 		/* Make sure we do not speculatively read the data before
 		 * we have received the packet buffers from the ring.
 		 */