Message ID: 49D3B61F.8010507@cosmosbay.com
State: RFC, archived
Delegated to: David Miller
Hello, Eric, Ingo.

Eric Dumazet wrote:
> diff --git a/arch/x86/include/asm/percpu.h b/arch/x86/include/asm/percpu.h
> index aee103b..6b82f6b 100644
> --- a/arch/x86/include/asm/percpu.h
> +++ b/arch/x86/include/asm/percpu.h
> @@ -135,6 +135,9 @@ do { \
>  #define percpu_read(var)	percpu_from_op("mov", per_cpu__##var)
>  #define percpu_write(var, val)	percpu_to_op("mov", per_cpu__##var, val)
>  #define percpu_add(var, val)	percpu_to_op("add", per_cpu__##var, val)
> +#define indir_percpu_add(var, val)	percpu_to_op("add", *(var), val)
> +#define indir_percpu_inc(var)	percpu_to_op("add", *(var), 1)
> +#define indir_percpu_dec(var)	percpu_to_op("add", *(var), -1)
>  #define percpu_sub(var, val)	percpu_to_op("sub", per_cpu__##var, val)
>  #define percpu_and(var, val)	percpu_to_op("and", per_cpu__##var, val)
>  #define percpu_or(var, val)	percpu_to_op("or", per_cpu__##var, val)

The final goal is to unify static and dynamic accesses but we aren't
there yet, so, for the time being, we'll need some interim solutions.
I would prefer percpu_ptr_add() tho.

Thanks.
* Tejun Heo <htejun@gmail.com> wrote:

> Hello, Eric, Ingo.
>
> Eric Dumazet wrote:
> > diff --git a/arch/x86/include/asm/percpu.h b/arch/x86/include/asm/percpu.h
> > index aee103b..6b82f6b 100644
> > --- a/arch/x86/include/asm/percpu.h
> > +++ b/arch/x86/include/asm/percpu.h
> > @@ -135,6 +135,9 @@ do { \
> >  #define percpu_read(var)	percpu_from_op("mov", per_cpu__##var)
> >  #define percpu_write(var, val)	percpu_to_op("mov", per_cpu__##var, val)
> >  #define percpu_add(var, val)	percpu_to_op("add", per_cpu__##var, val)
> > +#define indir_percpu_add(var, val)	percpu_to_op("add", *(var), val)
> > +#define indir_percpu_inc(var)	percpu_to_op("add", *(var), 1)
> > +#define indir_percpu_dec(var)	percpu_to_op("add", *(var), -1)
> >  #define percpu_sub(var, val)	percpu_to_op("sub", per_cpu__##var, val)
> >  #define percpu_and(var, val)	percpu_to_op("and", per_cpu__##var, val)
> >  #define percpu_or(var, val)	percpu_to_op("or", per_cpu__##var, val)
>
> The final goal is to unify static and dynamic accesses but we
> aren't there yet, so, for the time being, we'll need some interim
> solutions. I would prefer percpu_ptr_add() tho.

Yep, that's the standard naming scheme for new APIs: generic to
specific, left to right.

	Ingo
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thursday 02 April 2009 05:14:47 Eric Dumazet wrote:
> Here is a preliminary patch for SNMP mibs that seems to work well on x86_32
>
> [RFC] percpu: convert SNMP mibs to new infra

OK, I have a whole heap of "convert to dynamic per-cpu" patches waiting in
the wings too, once Tejun's conversion is complete.

Also, what is optimal depends on the arch: we had a long discussion on this
(it's what local_t was supposed to do, with cpu_local_inc() etc: see
Subject: local_add_return 2008-12-16 thread).

eg. on S/390, atomic_inc is a win over the two-counter version. On Sparc,
two-counter wins. On x86, inc wins (obviously).

But efforts to create a single primitive have been problematic: maybe
open-coding it like this is the Right Thing.

Cheers,
Rusty.
Rusty Russell wrote:
> eg. on S/390, atomic_inc is a win over the two-counter version. On Sparc,
> two-counter wins. On x86, inc wins (obviously).
>
> But efforts to create a single primitive have been problematic: maybe
> open-coding it like this is the Right Thing.

I tried to find a generic CONFIG_ define that would announce that an arch
has a fast percpu_add() implementation (faster than __raw_get_cpu_var, for
example, when we already are in a preempt disabled section).

Any idea ?

For example, net/ipv4/route.c has :

static DEFINE_PER_CPU(struct rt_cache_stat, rt_cache_stat);
#define RT_CACHE_STAT_INC(field) \
	(__raw_get_cpu_var(rt_cache_stat).field++)

We could use percpu_add(rt_cache_stat.field, 1) instead, only if
percpu_add() is not the generic one.

#define __percpu_generic_to_op(var, val, op)	\
do {						\
	get_cpu_var(var) op val;		\
	put_cpu_var(var);			\
} while (0)

#ifndef percpu_add
# define percpu_add(var, val)	__percpu_generic_to_op(var, (val), +=)
#endif
On Thursday 02 April 2009 15:49:19 Eric Dumazet wrote:
> Rusty Russell wrote:
> > eg. on S/390, atomic_inc is a win over the two-counter version. On Sparc,
> > two-counter wins. On x86, inc wins (obviously).
> >
> > But efforts to create a single primitive have been problematic: maybe
> > open-coding it like this is the Right Thing.
>
> I tried to find a generic CONFIG_ define that would announce that an arch
> has a fast percpu_add() implementation (faster than __raw_get_cpu_var,
> for example, when we already are in a preempt disabled section).

Nope, we don't have one.

It was supposed to work like this:

	DEFINE_PER_CPU(local_t, counter);

	cpu_local_inc(counter);

That would do incl on x86, and local_t could even be a long[3] (one for
hardirq, one for softirq, one for user context).

But there were issues:
1) It didn't work on dynamic percpu allocs, which was much of the
   interesting use (Tejun is fixing this bit right now),
2) The x86 version wasn't optimized anyway,
3) Everyone did atomic_long_inc(), so the ftrace code assumed it would be
   nmi safe (tho atomic_t isn't nmi-safe on some archs anyway), so the
   long[3] method would break them,
4) The long[3] version was overkill for networking, which doesn't need
   hardirq, so we'd want another variant of local_t plus all the ops,
5) Some people didn't want long: Christoph had a more generic but more
   complex version,
6) It's still not used anywhere in the tree (tho local_t is), so there's
   no reason to stick to the current semantics.

> For example, net/ipv4/route.c has :
>
> static DEFINE_PER_CPU(struct rt_cache_stat, rt_cache_stat);
> #define RT_CACHE_STAT_INC(field) \
> 	(__raw_get_cpu_var(rt_cache_stat).field++)
>
> We could use percpu_add(rt_cache_stat.field, 1) instead, only if
> percpu_add() is not the generic one.

Yep, but this one is different from the SNMP stats, which need softirq vs
user context safety. This is where I start wondering how many interfaces
we're going to have...
Sorry to add more questions than answers :(

Rusty.
diff --git a/arch/x86/include/asm/percpu.h b/arch/x86/include/asm/percpu.h
index aee103b..6b82f6b 100644
--- a/arch/x86/include/asm/percpu.h
+++ b/arch/x86/include/asm/percpu.h
@@ -135,6 +135,9 @@ do {							\
 #define percpu_read(var)	percpu_from_op("mov", per_cpu__##var)
 #define percpu_write(var, val)	percpu_to_op("mov", per_cpu__##var, val)
 #define percpu_add(var, val)	percpu_to_op("add", per_cpu__##var, val)
+#define indir_percpu_add(var, val)	percpu_to_op("add", *(var), val)
+#define indir_percpu_inc(var)	percpu_to_op("add", *(var), 1)
+#define indir_percpu_dec(var)	percpu_to_op("add", *(var), -1)
 #define percpu_sub(var, val)	percpu_to_op("sub", per_cpu__##var, val)
 #define percpu_and(var, val)	percpu_to_op("and", per_cpu__##var, val)
 #define percpu_or(var, val)	percpu_to_op("or", per_cpu__##var, val)
diff --git a/include/net/snmp.h b/include/net/snmp.h
index 57c9362..ef9ed31 100644
--- a/include/net/snmp.h
+++ b/include/net/snmp.h
@@ -123,15 +123,31 @@ struct linux_xfrm_mib {
 };
 
 /*
- * FIXME: On x86 and some other CPUs the split into user and softirq parts
+ * On x86 and some other CPUs the split into user and softirq parts
  * is not needed because addl $1,memory is atomic against interrupts (but
- * atomic_inc would be overkill because of the lock cycles). Wants new
- * nonlocked_atomic_inc() primitives -AK
+ * atomic_inc would be overkill because of the lock cycles).
  */
+#ifdef CONFIG_X86
+# define SNMP_ARRAY_SZ 1
+#else
+# define SNMP_ARRAY_SZ 2
+#endif
+
 #define DEFINE_SNMP_STAT(type, name)	\
-	__typeof__(type) *name[2]
+	__typeof__(type) *name[SNMP_ARRAY_SZ]
 #define DECLARE_SNMP_STAT(type, name)	\
-	extern __typeof__(type) *name[2]
+	extern __typeof__(type) *name[SNMP_ARRAY_SZ]
+
+#if SNMP_ARRAY_SZ == 1
+#define SNMP_INC_STATS(mib, field)	indir_percpu_inc(&mib[0]->mibs[field])
+#define SNMP_INC_STATS_BH(mib, field)	SNMP_INC_STATS(mib, field)
+#define SNMP_INC_STATS_USER(mib, field)	SNMP_INC_STATS(mib, field)
+#define SNMP_DEC_STATS(mib, field)	indir_percpu_dec(&mib[0]->mibs[field])
+#define SNMP_ADD_STATS_BH(mib, field, addend)	\
+	indir_percpu_add(&mib[0]->mibs[field], addend)
+#define SNMP_ADD_STATS_USER(mib, field, addend)	\
+	indir_percpu_add(&mib[0]->mibs[field], addend)
+#else
 
 #define SNMP_STAT_BHPTR(name)	(name[0])
 #define SNMP_STAT_USRPTR(name)	(name[1])
@@ -160,5 +176,6 @@ struct linux_xfrm_mib {
 		per_cpu_ptr(mib[1], get_cpu())->mibs[field] += addend; \
 		put_cpu(); \
 	} while (0)
+#endif
 
 #endif
diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
index 7f03373..badb568 100644
--- a/net/ipv4/af_inet.c
+++ b/net/ipv4/af_inet.c
@@ -1366,27 +1366,37 @@ unsigned long snmp_fold_field(void *mib[], int offt)
 	for_each_possible_cpu(i) {
 		res += *(((unsigned long *) per_cpu_ptr(mib[0], i)) + offt);
+#if SNMP_ARRAY_SZ == 2
 		res += *(((unsigned long *) per_cpu_ptr(mib[1], i)) + offt);
+#endif
 	}
 	return res;
 }
 EXPORT_SYMBOL_GPL(snmp_fold_field);
 
-int snmp_mib_init(void *ptr[2], size_t mibsize)
+int snmp_mib_init(void *ptr[SNMP_ARRAY_SZ], size_t mibsize)
 {
 	BUG_ON(ptr == NULL);
 	ptr[0] = __alloc_percpu(mibsize, __alignof__(unsigned long long));
 	if (!ptr[0])
-		goto err0;
+		return -ENOMEM;
+#if SNMP_ARRAY_SZ == 2
 	ptr[1] = __alloc_percpu(mibsize, __alignof__(unsigned long long));
-	if (!ptr[1])
-		goto err1;
+	if (!ptr[1]) {
+		free_percpu(ptr[0]);
+		ptr[0] = NULL;
+		return -ENOMEM;
+	}
+#endif
+	{
+		int i;
+		printk(KERN_INFO "snmp_mib_init(%u) %p ",
+		       (unsigned int)mibsize, ptr[0]);
+		for_each_possible_cpu(i) {
+			printk(KERN_INFO "%p ", per_cpu_ptr(ptr[0], i));
+		}
+		printk(KERN_INFO "\n");
+	}
 	return 0;
-err1:
-	free_percpu(ptr[0]);
-	ptr[0] = NULL;
-err0:
-	return -ENOMEM;
 }
 EXPORT_SYMBOL_GPL(snmp_mib_init);