Message ID | 49D3A0C2.9000403@cosmosbay.com |
---|---|
State | Not Applicable, archived |
Delegated to: | David Miller |

Eric Dumazet wrote:
> +#define percpu_inc(var) percpu_to_op0("inc", per_cpu__##var)
> +#define percpu_dec(var) percpu_to_op0("dec", per_cpu__##var)
>

There's probably not a lot of value in this. The Intel and AMD optimisation guides tend to deprecate inc/dec in favour of using add/sub, because the former can cause pipeline stalls due to its partial flags update.

J
--
To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html

Jeremy Fitzhardinge wrote:
> Eric Dumazet wrote:
>> +#define percpu_inc(var) percpu_to_op0("inc", per_cpu__##var)
>> +#define percpu_dec(var) percpu_to_op0("dec", per_cpu__##var)
>>
>
> There's probably not a lot of value in this. The Intel and AMD
> optimisation guides tend to deprecate inc/dec in favour of using
> add/sub, because the former can cause pipeline stalls due to its partial
> flags update.
>
> J

Sure, but this saves one byte per call, which is probably why we still use inc/dec in so many places...
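
The one-byte saving mentioned above can be checked against the instruction encodings. The sketch below lists the 32-bit memory forms for a plain `(%rax)` operand; that operand is an assumption chosen for brevity, and the array and function names are hypothetical. A real per-CPU access adds a segment prefix and a displacement to both forms equally, so the difference should still be the single immediate byte.

```c
#include <stddef.h>
#include <stdint.h>

/* incl (%rax): opcode FF /0, ModRM 0x00 */
static const uint8_t enc_incl[] = { 0xFF, 0x00 };
/* addl $1,(%rax): opcode 83 /0 ib, ModRM 0x00, imm8 0x01 */
static const uint8_t enc_addl[] = { 0x83, 0x00, 0x01 };
/* decl (%rax): opcode FF /1, ModRM 0x08 */
static const uint8_t enc_decl[] = { 0xFF, 0x08 };
/* subl $1,(%rax): opcode 83 /5 ib, ModRM 0x28, imm8 0x01 */
static const uint8_t enc_subl[] = { 0x83, 0x28, 0x01 };

/* Bytes saved by using inc instead of add $1 (hypothetical helper). */
static size_t bytes_saved_inc(void)
{
	return sizeof(enc_addl) - sizeof(enc_incl);
}

/* Bytes saved by using dec instead of sub $1 (hypothetical helper). */
static size_t bytes_saved_dec(void)
{
	return sizeof(enc_subl) - sizeof(enc_decl);
}
```

The extra byte in the add/sub forms is the 8-bit immediate; inc/dec encode the constant 1 implicitly.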

Jeremy Fitzhardinge <jeremy@goop.org> wrote:
>
> There's probably not a lot of value in this. The Intel and AMD
> optimisation guides tend to deprecate inc/dec in favour of using
> add/sub, because the former can cause pipeline stalls due to its partial
> flags update.

Is this still the case on the latest Intel CPUs?

Cheers,

Herbert Xu wrote:
> Jeremy Fitzhardinge <jeremy@goop.org> wrote:
>
>> There's probably not a lot of value in this. The Intel and AMD
>> optimisation guides tend to deprecate inc/dec in favour of using
>> add/sub, because the former can cause pipeline stalls due to its partial
>> flags update.
>>
>
> Is this still the case on the latest Intel CPUs?
>

Yes:

Assembly/Compiler Coding Rule 32. (M impact, H generality) INC and DEC instructions should be replaced with ADD or SUB instructions, because ADD and SUB overwrite all flags, whereas INC and DEC do not, therefore creating false dependencies on earlier instructions that set the flags.

J
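
To make the quoted rule concrete, here is a toy model in portable C of how the two instructions treat the flags. It is only a sketch: `struct flags`, `model_add` and `model_inc` are hypothetical names, only two of the status flags are modelled, and nothing here measures real pipeline behaviour. The point is that the flags produced by INC take the previous flags as an input, while those produced by ADD do not.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy model of two x86 status flags; the real flags register has more. */
struct flags {
	bool cf;	/* carry flag */
	bool zf;	/* zero flag */
};

/* ADD writes every status flag, so its output ignores the old flags. */
static struct flags model_add(uint32_t *val, uint32_t imm, struct flags old)
{
	struct flags out;
	uint32_t sum = *val + imm;

	(void)old;			/* not an input: no dependency */
	out.cf = sum < *val;		/* unsigned wrap-around sets CF */
	out.zf = (sum == 0);
	*val = sum;
	return out;
}

/* INC updates ZF but leaves CF alone, so the old CF must be carried over. */
static struct flags model_inc(uint32_t *val, struct flags old)
{
	struct flags out;

	*val += 1;
	out.cf = old.cf;		/* preserved: depends on an earlier instruction */
	out.zf = (*val == 0);
	return out;
}
```

In this model, incrementing `0xFFFFFFFF` gives zero either way, but `model_add` reports a carry while `model_inc` hands back whatever carry it was given. That carried-over value is the dependency on earlier flag-setting instructions that the rule describes.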

diff --git a/arch/x86/include/asm/percpu.h b/arch/x86/include/asm/percpu.h
index aee103b..248be11 100644
--- a/arch/x86/include/asm/percpu.h
+++ b/arch/x86/include/asm/percpu.h
@@ -103,6 +103,29 @@ do {							\
 	}						\
 } while (0)
 
+#define percpu_to_op0(op, var)				\
+do {							\
+	switch (sizeof(var)) {				\
+	case 1:						\
+		asm(op "b "__percpu_arg(0)		\
+		    : "+m" (var));			\
+		break;					\
+	case 2:						\
+		asm(op "w "__percpu_arg(0)		\
+		    : "+m" (var));			\
+		break;					\
+	case 4:						\
+		asm(op "l "__percpu_arg(0)		\
+		    : "+m" (var));			\
+		break;					\
+	case 8:						\
+		asm(op "q "__percpu_arg(0)		\
+		    : "+m" (var));			\
+		break;					\
+	default: __bad_percpu_size();			\
+	}						\
+} while (0)
+
 #define percpu_from_op(op, var)				\
 ({							\
 	typeof(var) ret__;				\
@@ -139,6 +162,8 @@ do {							\
 #define percpu_and(var, val)	percpu_to_op("and", per_cpu__##var, val)
 #define percpu_or(var, val)	percpu_to_op("or", per_cpu__##var, val)
 #define percpu_xor(var, val)	percpu_to_op("xor", per_cpu__##var, val)
+#define percpu_inc(var)		percpu_to_op0("inc", per_cpu__##var)
+#define percpu_dec(var)		percpu_to_op0("dec", per_cpu__##var)
 
 /* This is not atomic against other CPUs -- CPU preemption needs to be off */
 #define x86_test_and_clear_bit_percpu(bit, var)				\
diff --git a/include/asm-generic/percpu.h b/include/asm-generic/percpu.h
index 00f45ff..c57357e 100644
--- a/include/asm-generic/percpu.h
+++ b/include/asm-generic/percpu.h
@@ -120,6 +120,14 @@ do {							\
 # define percpu_sub(var, val)	__percpu_generic_to_op(var, (val), -=)
 #endif
 
+#ifndef percpu_inc
+# define percpu_inc(var)	do { percpu_add(var, 1); } while (0)
+#endif
+
+#ifndef percpu_dec
+# define percpu_dec(var)	do { percpu_sub(var, 1); } while (0)
+#endif
+
 #ifndef percpu_and
 # define percpu_and(var, val)	__percpu_generic_to_op(var, (val), &=)
 #endif
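
The generic half of the patch relies on an `#ifndef` fallback: an architecture header may supply its own inc/dec, and otherwise they are built from add/sub. A minimal userspace sketch of that pattern follows. The `counter_*` macros and `bump()` are hypothetical stand-ins working on an ordinary variable, not the kernel's per-CPU accessors.

```c
#include <stdint.h>

/* Hypothetical stand-ins; the kernel versions operate on per-CPU data. */
#define counter_add(var, val)	do { (var) += (val); } while (0)
#define counter_sub(var, val)	do { (var) -= (val); } while (0)

/* An architecture header could define counter_inc/counter_dec before here. */
#ifndef counter_inc
# define counter_inc(var)	do { counter_add(var, 1); } while (0)
#endif

#ifndef counter_dec
# define counter_dec(var)	do { counter_sub(var, 1); } while (0)
#endif

/* Example caller: the do/while (0) wrapper keeps an unbraced if/else valid. */
static uint32_t bump(uint32_t packets, int received)
{
	if (received)
		counter_inc(packets);
	else
		counter_dec(packets);
	return packets;
}
```

With no earlier definition in scope, `bump(5, 1)` goes through the add-based fallback. Defining `counter_inc` ahead of the `#ifndef` would swap in a specialised version without touching any callers, which is how the x86 header overrides the generic one in the patch.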