Message ID | 200910252158.53921.opurdila@ixiacom.com |
---|---|
State | Superseded, archived |
Delegated to: | David Miller |
Headers | show |
* Octavian Purdila | 2009-10-25 21:58:53 [+0200]: > >The current dev_name_hash is not very good at spreading entries when a >large number of interfaces of the same type (e.g. ethXXXXX) are used. > >Here are some performance numbers for creating 16000 dummy interfaces with >and without the patch (with per device sysctl entries disabled) > > With patch Without patch > > real 0m 2.27s real 0m 4.32s > user 0m 0.00s user 0m 0.00s > sys 0m 1.13s sys 0m 2.16s Can you rerun the test with jhash() as the hash function? Just for clearness - the spreading of jhash should be more uniformly distributed. At the end: where is the threshold where a more accurate hash function is superior. HGN
Octavian Purdila a écrit : > The current dev_name_hash is not very good at spreading entries when a > large number of interfaces of the same type (e.g. ethXXXXX) are used. Very true > > Here are some performance numbers for creating 16000 dummy interfaces with > and without the patch (with per device sysctl entries disabled) > > With patch Without patch > > real 0m 2.27s real 0m 4.32s > user 0m 0.00s user 0m 0.00s > sys 0m 1.13s sys 0m 2.16s > 1) A program to show hash distribution would be more convincing, and could show that 17 multiplier is better, not 31 :) 2) Why full_name_hash() not changed as well ? $ cat test.c #include <stdio.h> #define init_name_hash() 0 /* partial hash update function. Assume roughly 4 bits per character */ static inline unsigned long partial_name_hash(unsigned long c, unsigned long prevhash) { return (prevhash + (c << 4) + (c >> 4)) * 11; } /* * Finally: cut down the number of bits to a int value (and try to avoid * losing bits) */ static inline unsigned long end_name_hash(unsigned long hash) { return (unsigned int) hash; } /* Compute the hash for a name string. */ static inline unsigned int full_name_hash(const unsigned char *name, unsigned int len) { unsigned long hash = init_name_hash(); while (len--) hash = partial_name_hash(*name++, hash); return end_name_hash(hash); } static inline unsigned int string_hash31(const unsigned char *name, unsigned int len) { unsigned hash = 0; int i; for (i = 0; i < len; ++i) hash = 31 * hash + name[i]; return hash; } static inline unsigned int string_hash17(const unsigned char *name, unsigned int len) { unsigned hash = 0; int i; for (i = 0; i < len; ++i) hash = 17 * hash + name[i]; return hash; } unsigned int hashdist1[256], hashdist2[256], hashdist3[256]; void print_dist(unsigned int *dist, const char *name) { unsigned int i; printf("%s[] = {\n", name); for (i = 0; i < 256; i++) { printf("%d, ", dist[i]); if ((i & 15) == 15) putchar('\n'); } printf("};\n"); } int main() { int i; unsigned int hash; unsigned char name[16]; for (i = 0; i < 16000; i++) { unsigned int len = sprintf(name, "dummy%d", i); hash = full_name_hash(name, len); hashdist1[hash & 255]++; hash = string_hash31(name, len); hashdist2[hash & 255]++; hash = string_hash17(name, len); hashdist3[hash & 255]++; } print_dist(hashdist1, "full_name_hash"); print_dist(hashdist2, "string_hash31"); print_dist(hashdist3, "string_hash17"); return 0; } $ gcc -o test test.c $ ./test full_name_hash[] = { 0, 0, 0, 375, 0, 0, 562, 0, 0, 0, 6, 0, 0, 0, 0, 56, 0, 0, 0, 374, 0, 0, 562, 0, 0, 0, 5, 0, 0, 0, 0, 57, 0, 0, 0, 375, 0, 0, 562, 0, 0, 0, 6, 1, 0, 0, 0, 56, 0, 0, 0, 375, 0, 0, 563, 0, 0, 0, 5, 1, 0, 0, 0, 56, 0, 0, 0, 375, 0, 0, 563, 0, 0, 0, 6, 1, 0, 0, 0, 56, 0, 0, 0, 375, 0, 0, 562, 0, 0, 0, 6, 0, 0, 0, 0, 57, 0, 0, 0, 375, 0, 0, 563, 0, 0, 0, 5, 0, 0, 0, 0, 56, 0, 0, 0, 376, 0, 0, 563, 0, 0, 0, 6, 1, 0, 0, 0, 56, 0, 0, 0, 375, 0, 0, 562, 0, 0, 0, 5, 1, 0, 0, 0, 57, 0, 0, 0, 374, 0, 0, 563, 0, 0, 0, 6, 1, 0, 0, 0, 56, 0, 0, 0, 375, 0, 0, 563, 0, 0, 0, 6, 1, 0, 0, 0, 56, 0, 0, 0, 376, 0, 0, 562, 0, 0, 0, 6, 0, 0, 0, 0, 56, 0, 0, 0, 375, 0, 0, 562, 0, 0, 0, 5, 0, 0, 0, 0, 56, 0, 0, 0, 375, 0, 0, 562, 0, 0, 0, 5, 1, 0, 0, 0, 56, 0, 0, 0, 375, 0, 0, 563, 0, 0, 0, 6, 1, 0, 0, 0, 57, 0, 0, 0, 375, 0, 0, 563, 0, 0, 0, 6, 1, 0, 0, 0, 56, }; string_hash31[] = { 42, 54, 67, 81, 92, 103, 116, 125, 131, 131, 133, 128, 121, 110, 101, 87, 72, 59, 47, 36, 25, 17, 12, 8, 4, 4, 6, 7, 11, 15, 24, 32, 42, 54, 68, 79, 91, 105, 117, 127, 130, 135, 133, 129, 120, 113, 100, 86, 71, 58, 46, 33, 24, 16, 11, 6, 4, 4, 5, 7, 10, 16, 23, 32, 41, 54, 66, 78, 91, 104, 117, 123, 130, 133, 134, 127, 122, 112, 101, 86, 72, 61, 47, 35, 25, 19, 12, 8, 5, 6, 6, 7, 11, 16, 24, 31, 43, 54, 66, 78, 92, 106, 116, 125, 131, 136, 132, 129, 121, 112, 98, 84, 72, 58, 44, 32, 24, 16, 11, 6, 5, 5, 4, 7, 10, 17, 23, 32, 41, 54, 65, 78, 92, 105, 116, 123, 132, 133, 133, 127, 122, 111, 99, 86, 74, 60, 45, 35, 26, 19, 12, 8, 7, 5, 5, 7, 12, 17, 24, 31, 43, 54, 66, 80, 94, 107, 116, 126, 131, 135, 131, 129, 120, 110, 97, 85, 71, 56, 44, 33, 25, 16, 11, 7, 5, 3, 4, 7, 11, 17, 22, 32, 43, 54, 66, 80, 94, 105, 115, 124, 133, 132, 132, 127, 121, 110, 99, 87, 74, 59, 46, 37, 26, 19, 12, 9, 6, 4, 5, 8, 12, 16, 23, 32, 43, 53, 67, 81, 94, 105, 116, 128, 132, 135, 133, 130, 121, 112, 100, 88, 72, 57, 46, 34, 25, 17, 11, 7, 4, 3, 5, 8, 11, 16, 22, 33, }; string_hash17[] = { 68, 73, 72, 69, 62, 57, 52, 53, 60, 66, 71, 69, 69, 68, 67, 64, 67, 68, 70, 68, 63, 56, 53, 51, 57, 64, 68, 71, 68, 67, 66, 65, 62, 65, 67, 68, 64, 59, 55, 52, 54, 60, 67, 69, 69, 65, 65, 61, 59, 60, 63, 64, 64, 60, 55, 54, 54, 61, 67, 72, 72, 71, 66, 64, 60, 59, 60, 64, 64, 62, 58, 56, 55, 59, 66, 70, 73, 69, 67, 62, 58, 53, 56, 57, 60, 59, 57, 54, 53, 56, 62, 69, 71, 72, 67, 64, 57, 53, 51, 54, 58, 60, 59, 57, 57, 56, 63, 69, 74, 74, 71, 65, 60, 53, 50, 52, 55, 58, 59, 58, 57, 58, 61, 68, 73, 80, 77, 73, 66, 59, 52, 52, 54, 60, 61, 58, 58, 58, 59, 64, 71, 73, 78, 71, 66, 57, 50, 46, 50, 54, 59, 61, 58, 59, 60, 65, 70, 75, 78, 80, 72, 65, 56, 50, 49, 53, 59, 63, 62, 60, 62, 63, 68, 72, 77, 77, 76, 67, 58, 49, 46, 49, 55, 60, 62, 62, 59, 62, 65, 70, 72, 78, 75, 73, 62, 53, 47, 47, 52, 58, 63, 62, 63, 61, 64, 67, 71, 73, 76, 72, 68, 57, 49, 46, 50, 57, 62, 65, 65, 65, 64, 67, 69, 73, 76, 76, 71, 65, 54, 49, 49, 55, 62, 65, 65, 64, 65, 63, 66, 67, 71, 71, 70, 63, 57, 49, 47, 52, 58, 65, 66, 67, 65, 67, 65, 67, }; -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Sunday 25 October 2009 23:24:10 you wrote: > Octavian Purdila a écrit : > > The current dev_name_hash is not very good at spreading entries when a > > large number of interfaces of the same type (e.g. ethXXXXX) are used. > > Very true > > > Here are some performance numbers for creating 16000 dummy interfaces > > with and without the patch (with per device sysctl entries disabled) > > > > With patch Without patch > > > > real 0m 2.27s real 0m 4.32s > > user 0m 0.00s user 0m 0.00s > > sys 0m 1.13s sys 0m 2.16s > > 1) A program to show hash distribution would be more convincing, > and could show that 17 multiplier is better, not 31 :) > You beat me to it :) But anyways, I've attached mine as well , which compares orig, jhash, new17, new31, with different number of interfaces as well as different hash bits. The score it shows its the number of iterations needed to find all elements in the hash (good enough metric?) My results shows that new17 is better or very close to jhash2. And I think its lighter then jhash too. > 2) Why full_name_hash() not changed as well ? I don't think it is going to be as easy to "benchmark" this, as it is used with more diverse inputs. AFAIK, full_name_hash is used by the dentry code, meaning that it cashes path names? If so, perhaps we can use find {/etc,/bin,/sbin,/usr} as input patterns? My results: $ ./dev_name_hash ixint 255 0 8 score 1140 $ ./dev_name_hash ixint 255 1 8 score 379 $ ./dev_name_hash ixint 255 2 8 score 461 $ ./dev_name_hash ixint 255 3 8 score 329 $ ./dev_name_hash ixint 1000 0 8 score 26074 $ ./dev_name_hash ixint 1000 1 8 score 2887 $ ./dev_name_hash ixint 1000 2 8 score 2853 $ ./dev_name_hash ixint 1000 3 8 score 3713 $ ./dev_name_hash ixint 16000 0 8 score 3689828 $ ./dev_name_hash ixint 16000 1 8 score 516389 $ ./dev_name_hash ixint 16000 2 8 score 515292 $ ./dev_name_hash ixint 16000 3 8 score 554042 $ ./dev_name_hash ixint 16000 0 16 score 24741 $ ./dev_name_hash ixint 16000 1 16 score 17949 $ ./dev_name_hash ixint 16000 2 16 score 16715 $ ./dev_name_hash ixint 16000 3 16 score 18125
* Octavian Purdila | 2009-10-25 23:55:32 [+0200]: >My results shows that new17 is better or very close to jhash2. And I think its >lighter then jhash too. If new17 is very close to jhash/jhash2 then the cycles comes into play. Anyway, there is already a very potent hash interface in form of jhash{,2}. +1 for jhash2 HGN PS: great work! ;) PPS: http://libhashish.sourceforge.net/ have some real hash benchmarks in form of avalanche test and some others too. It does not really matter here because Jenkins performs _nearly_ perfect in all cases. ;)
On Monday 26 October 2009 00:41:54 you wrote: > * Octavian Purdila | 2009-10-25 23:55:32 [+0200]: > >My results shows that new17 is better or very close to jhash2. And I think > > its lighter then jhash too. > > If new17 is very close to jhash/jhash2 then the cycles comes into play. > Anyway, there is already a very potent hash interface in form of jhash{,2}. > Ah, sorry for the confusion, jhash2 was a typo, I've only tested jhash. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Sun, 25 Oct 2009 21:58:53 +0200 Octavian Purdila <opurdila@ixiacom.com> wrote: > > The current dev_name_hash is not very good at spreading entries when a > large number of interfaces of the same type (e.g. ethXXXXX) are used. > > Here are some performance numbers for creating 16000 dummy interfaces with > and without the patch (with per device sysctl entries disabled) > > With patch Without patch > > real 0m 2.27s real 0m 4.32s > user 0m 0.00s user 0m 0.00s > sys 0m 1.13s sys 0m 2.16s > > Signed-off-by: Octavian Purdila <opurdila@ixiacom.com> > --- > net/core/dev.c | 8 +++++++- > 1 files changed, 7 insertions(+), 1 deletions(-) My $.02 would for fixing full name hash because other usages would benefit as well. Inventing special case for network devices seems unnecessary.
I overkilled this with more functions and compared filenames as well. genarated names (dummyNNNN) Algorithm Time (us) Ratio Max StdDev kr_hash 277925 152408.6 468448 543.19 string_hash31 329356 5859.4 16042 44.18 SuperFastHash 324570 4885.9 10502 2.29 djb2 327908 5608.5 15210 38.08 string_hash17 326769 4883.6 9896 0.76 full_name_hash 343196 63921.0 140628 343.62 jhash_string 463801 4883.8 10085 1.02 sdbm 398587 9801.7 29634 99.18 filesystem names Algorithm Time (us) Ratio Max StdDev kr_hash 278840 152314.9 468717 543.01 string_hash31 331206 5802.1 16004 42.87 SuperFastHash 325938 4887.5 13528 2.88 djb2 330621 5607.1 15333 38.05 string_hash17 331181 4884.9 13274 1.78 full_name_hash 347312 63972.2 141336 343.77 jhash_string 466799 4885.2 13275 1.92 sdbm 403691 9771.7 29629 98.88 Ratio is the average number of buckets examined when scanning the whole set of names. 1) Increased hash buckets to 1024 which seems reasonable if we are going to test that many names. 2) Increased name size to 256 so that longer filenames could be checked and name blocks were not in same cache line * SuperFastHash is too big to put inline
Stephen Hemminger a écrit : > I overkilled this with more functions and compared filenames as well. > > > genarated names (dummyNNNN) > Algorithm Time (us) Ratio Max StdDev > kr_hash 277925 152408.6 468448 543.19 > string_hash31 329356 5859.4 16042 44.18 > SuperFastHash 324570 4885.9 10502 2.29 > djb2 327908 5608.5 15210 38.08 > string_hash17 326769 4883.6 9896 0.76 > full_name_hash 343196 63921.0 140628 343.62 > jhash_string 463801 4883.8 10085 1.02 > sdbm 398587 9801.7 29634 99.18 > > filesystem names > Algorithm Time (us) Ratio Max StdDev > kr_hash 278840 152314.9 468717 543.01 > string_hash31 331206 5802.1 16004 42.87 > SuperFastHash 325938 4887.5 13528 2.88 > djb2 330621 5607.1 15333 38.05 > string_hash17 331181 4884.9 13274 1.78 > full_name_hash 347312 63972.2 141336 343.77 > jhash_string 466799 4885.2 13275 1.92 > sdbm 403691 9771.7 29629 98.88 > > Ratio is the average number of buckets examined when scanning > the whole set of names. > > > 1) Increased hash buckets to 1024 which seems reasonable if we are > going to test that many names. > 2) Increased name size to 256 so that longer filenames could be > checked and name blocks were not in same cache line > > * SuperFastHash is too big to put inline > > Thanks Stephen 1) dcache hash is very big on average machines. 2) dcache : We hash last component, against its parent, acting as a base. Your hashtest program considers the name as a single entity, giving different hash distribution. 3) netdev names are special, since we have only one parent, and smaller hash table. 4) jhash is not that expensive, but it might be because of huge working set of your test program : strings are not in cpu caches and speed is mostly driven by ram bandwidth. But current full_name_hash() seems a pretty bad choice ! -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/net/core/dev.c b/net/core/dev.c index fa88dcd..af3cab3 100644 --- a/net/core/dev.c +++ b/net/core/dev.c @@ -198,7 +198,13 @@ EXPORT_SYMBOL(dev_base_lock); static inline struct hlist_head *dev_name_hash(struct net *net, const char *name) { - unsigned hash = full_name_hash(name, strnlen(name, IFNAMSIZ)); + unsigned hash = 0; + int len = strnlen(name, IFNAMSIZ); + int i; + + for (i = 0; i < len; ++i) + hash = 31 * hash + name[i]; + return &net->dev_name_head[hash & ((1 << NETDEV_HASHBITS) - 1)]; }
The current dev_name_hash is not very good at spreading entries when a large number of interfaces of the same type (e.g. ethXXXXX) are used. Here are some performance numbers for creating 16000 dummy interfaces with and without the patch (with per device sysctl entries disabled) With patch Without patch real 0m 2.27s real 0m 4.32s user 0m 0.00s user 0m 0.00s sys 0m 1.13s sys 0m 2.16s Signed-off-by: Octavian Purdila <opurdila@ixiacom.com> --- net/core/dev.c | 8 +++++++- 1 files changed, 7 insertions(+), 1 deletions(-)