Message ID | 20130529063925.27486.46649.stgit@ladj378.jer.intel.com |
---|---|
State | Superseded, archived |
Delegated to: | David Miller |
On Wed, 2013-05-29 at 09:39 +0300, Eliezer Tamir wrote:
> Adds a napi_id and a hashing mechanism to lookup a napi by id.
> This will be used by subsequent patches to implement low latency
> Ethernet device polling.
> Based on a code sample by Eric Dumazet.
>
> Signed-off-by: Eliezer Tamir <eliezer.tamir@linux.intel.com>
> ---

OK, this looks good enough for inclusion.

If a v7 is ever submitted, please add a 'static' for

static DEFINE_SPINLOCK(napi_hash_lock);
static unsigned int napi_gen_id;
static DEFINE_HASHTABLE(napi_hash, 8);

If David chooses to apply v6, I'll submit a patch for this.

Signed-off-by: Eric Dumazet <edumazet@google.com>

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, 2013-05-29 at 14:09 +0100, David Laight wrote:
> > Adds a napi_id and a hashing mechanism to lookup a napi by id.
>
> Is this one of the places where the 'id' can be selected
> so that the 'hash' lookup never collides?

Very few devices will ever call napi_hash_add()
[ Real NIC RX queues, not virtual devices ]

We use a hash table with 256 slots; the chance of collision is about 0%.

Let's not over-engineer the thing before it's even used.
On 29/05/2013 15:56, Eric Dumazet wrote:
> On Wed, 2013-05-29 at 09:39 +0300, Eliezer Tamir wrote:
>> Adds a napi_id and a hashing mechanism to lookup a napi by id.
>> This will be used by subsequent patches to implement low latency
>> Ethernet device polling.
>> Based on a code sample by Eric Dumazet.
>>
>> Signed-off-by: Eliezer Tamir <eliezer.tamir@linux.intel.com>
>> ---
>
> OK this looks good enough for inclusion.
>
> If a v7 ever is submitted, please add a 'static' for
>
> static DEFINE_SPINLOCK(napi_hash_lock);
> static unsigned int napi_gen_id;
> static DEFINE_HASHTABLE(napi_hash, 8);

I will post a v7 along with the changes you suggested to 2/5; I will
wait a bit to see if there are other things to fix.

Thanks,
Eliezer
On Wed, 2013-05-29 at 09:39 +0300, Eliezer Tamir wrote:
> Adds a napi_id and a hashing mechanism to lookup a napi by id.
> This will be used by subsequent patches to implement low latency
> Ethernet device polling.
> Based on a code sample by Eric Dumazet.
>
> Signed-off-by: Eliezer Tamir <eliezer.tamir@linux.intel.com>
[...]
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
[...]
> @@ -4136,6 +4143,53 @@ void napi_complete(struct napi_struct *n)
>  }
>  EXPORT_SYMBOL(napi_complete);
>
> +void napi_hash_add(struct napi_struct *napi)
> +{
> +	if (!test_and_set_bit(NAPI_STATE_HASHED, &napi->state)) {
> +
> +		spin_lock(&napi_hash_lock);
> +
> +		/* 0 is not a valid id */
> +		napi->napi_id = 0;
> +		while (!napi->napi_id)
> +			napi->napi_id = ++napi_gen_id;

Suppose we're loading/unloading one driver repeatedly while another one
remains loaded the whole time. Then once napi_gen_id wraps around, the
same ID can be assigned to multiple contexts.

So far as I can see, assigning the same ID twice will just make polling
stop working for one of the NAPI contexts; I don't think it causes a
crash. And it is exceedingly unlikely to happen in production. But if
you're going to the trouble of handling wrap-around at all, you'd
better handle this.

[...]
> +/* must be called under rcu_read_lock(), as we dont take a reference */
> +struct napi_struct *napi_by_id(int napi_id)
> +{
> +	unsigned int hash = napi_id % HASH_SIZE(napi_hash);
[...]

napi_id should be declared unsigned int here, as elsewhere. The
division can't actually yield a negative result because HASH_SIZE() has
type size_t and napi_id is promoted to match, but I had to go and look
at hashtable.h to check that.

Ben.
On 29/05/2013 23:09, Ben Hutchings wrote:
> On Wed, 2013-05-29 at 09:39 +0300, Eliezer Tamir wrote:
>> +void napi_hash_add(struct napi_struct *napi)
>> +{
>> +	if (!test_and_set_bit(NAPI_STATE_HASHED, &napi->state)) {
>> +
>> +		spin_lock(&napi_hash_lock);
>> +
>> +		/* 0 is not a valid id */
>> +		napi->napi_id = 0;
>> +		while (!napi->napi_id)
>> +			napi->napi_id = ++napi_gen_id;
>
> Suppose we're loading/unloading one driver repeatedly while another one
> remains loaded the whole time. Then once napi_gen_id wraps around, the
> same ID can be assigned to multiple contexts.
>
> So far as I can see, assigning the same ID twice will just make polling
> stop working for one of the NAPI contexts; I don't think it causes a
> crash. And it is exceedingly unlikely to happen in production. But if
> you're going to the trouble of handling wrap-around at all, you'd
> better handle this.

OK

> [...]
>> +/* must be called under rcu_read_lock(), as we dont take a reference */
>> +struct napi_struct *napi_by_id(int napi_id)
>> +{
>> +	unsigned int hash = napi_id % HASH_SIZE(napi_hash);
> [...]
>
> napi_id should be declared unsigned int here, as elsewhere. The
> division can't actually yield a negative result because HASH_SIZE() has
> type size_t and napi_id is promoted to match, but I had to go and look
> at hashtable.h to check that.

Good catch,
Thanks,
Eliezer
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 8f967e3..964648e 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -324,12 +324,15 @@ struct napi_struct {
 	struct sk_buff *gro_list;
 	struct sk_buff *skb;
 	struct list_head dev_list;
+	struct hlist_node napi_hash_node;
+	unsigned int napi_id;
 };

 enum {
 	NAPI_STATE_SCHED,	/* Poll is scheduled */
 	NAPI_STATE_DISABLE,	/* Disable pending */
 	NAPI_STATE_NPSVC,	/* Netpoll - don't dequeue from poll_list */
+	NAPI_STATE_HASHED,	/* In NAPI hash */
 };

 enum gro_result {
@@ -446,6 +449,32 @@ extern void __napi_complete(struct napi_struct *n);
 extern void napi_complete(struct napi_struct *n);

 /**
+ * napi_hash_add - add a NAPI to global hashtable
+ * @napi: napi context
+ *
+ * generate a new napi_id and store a @napi under it in napi_hash
+ */
+extern void napi_hash_add(struct napi_struct *napi);
+
+/**
+ * napi_hash_del - remove a NAPI from global table
+ * @napi: napi context
+ *
+ * Warning: caller must observe rcu grace period
+ * before freeing memory containing @napi
+ */
+extern void napi_hash_del(struct napi_struct *napi);
+
+/**
+ * napi_by_id - lookup a NAPI by napi_id
+ * @napi_id: hashed napi_id
+ *
+ * lookup @napi_id in napi_hash table
+ * must be called under rcu_read_lock()
+ */
+extern struct napi_struct *napi_by_id(int napi_id);
+
+/**
  * napi_disable - prevent NAPI from scheduling
  * @n: napi context
  *
diff --git a/net/core/dev.c b/net/core/dev.c
index b2e9057..0f39481 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -129,6 +129,7 @@
 #include <linux/inetdevice.h>
 #include <linux/cpu_rmap.h>
 #include <linux/static_key.h>
+#include <linux/hashtable.h>

 #include "net-sysfs.h"

@@ -166,6 +167,12 @@ static struct list_head offload_base __read_mostly;
 DEFINE_RWLOCK(dev_base_lock);
 EXPORT_SYMBOL(dev_base_lock);

+/* protects napi_hash addition/deletion and napi_gen_id */
+DEFINE_SPINLOCK(napi_hash_lock);
+
+unsigned int napi_gen_id;
+DEFINE_HASHTABLE(napi_hash, 8);
+
 seqcount_t devnet_rename_seq;

 static inline void dev_base_seq_inc(struct net *net)
@@ -4136,6 +4143,53 @@ void napi_complete(struct napi_struct *n)
 }
 EXPORT_SYMBOL(napi_complete);

+void napi_hash_add(struct napi_struct *napi)
+{
+	if (!test_and_set_bit(NAPI_STATE_HASHED, &napi->state)) {
+
+		spin_lock(&napi_hash_lock);
+
+		/* 0 is not a valid id */
+		napi->napi_id = 0;
+		while (!napi->napi_id)
+			napi->napi_id = ++napi_gen_id;
+
+		hlist_add_head_rcu(&napi->napi_hash_node,
+			&napi_hash[napi->napi_id % HASH_SIZE(napi_hash)]);
+
+		spin_unlock(&napi_hash_lock);
+	}
+}
+EXPORT_SYMBOL_GPL(napi_hash_add);
+
+/* Warning : caller is responsible to make sure rcu grace period
+ * is respected before freeing memory containing @napi
+ */
+void napi_hash_del(struct napi_struct *napi)
+{
+	spin_lock(&napi_hash_lock);
+
+	if (test_and_clear_bit(NAPI_STATE_HASHED, &napi->state))
+		hlist_del_rcu(&napi->napi_hash_node);
+
+	spin_unlock(&napi_hash_lock);
+}
+EXPORT_SYMBOL_GPL(napi_hash_del);
+
+/* must be called under rcu_read_lock(), as we dont take a reference */
+struct napi_struct *napi_by_id(int napi_id)
+{
+	unsigned int hash = napi_id % HASH_SIZE(napi_hash);
+	struct napi_struct *napi;
+
+	hlist_for_each_entry_rcu(napi, &napi_hash[hash], napi_hash_node)
+		if (napi->napi_id == napi_id)
+			return napi;
+
+	return NULL;
+}
+EXPORT_SYMBOL_GPL(napi_by_id);
+
 void netif_napi_add(struct net_device *dev, struct napi_struct *napi,
 		    int (*poll)(struct napi_struct *, int), int weight)
 {
Adds a napi_id and a hashing mechanism to lookup a napi by id.
This will be used by subsequent patches to implement low latency
Ethernet device polling.
Based on a code sample by Eric Dumazet.

Signed-off-by: Eliezer Tamir <eliezer.tamir@linux.intel.com>
---
 include/linux/netdevice.h |   29 ++++++++++++++++++++++++
 net/core/dev.c            |   54 +++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 83 insertions(+), 0 deletions(-)