diff mbox

[next-next-2.6] netdev: better dev_name_hash

Message ID 200910252158.53921.opurdila@ixiacom.com
State Superseded, archived
Delegated to: David Miller
Headers show

Commit Message

Octavian Purdila Oct. 25, 2009, 7:58 p.m. UTC
The current dev_name_hash is not very good at spreading entries when a
large number of interfaces of the same type (e.g. ethXXXXX) are used.

Here are some performance numbers for creating 16000 dummy interfaces with
and without the patch (with per device sysctl entries disabled)

    With patch                      Without patch

    real    0m 2.27s                real    0m 4.32s
    user    0m 0.00s                user    0m 0.00s
    sys     0m 1.13s                sys     0m 2.16s

Signed-off-by: Octavian Purdila <opurdila@ixiacom.com>
---
 net/core/dev.c |    8 +++++++-
 1 files changed, 7 insertions(+), 1 deletions(-)

Comments

Hagen Paul Pfeifer Oct. 25, 2009, 8:17 p.m. UTC | #1
* Octavian Purdila | 2009-10-25 21:58:53 [+0200]:

>
>The current dev_name_hash is not very good at spreading entries when a
>large number of interfaces of the same type (e.g. ethXXXXX) are used.
>
>Here are some performance numbers for creating 16000 dummy interfaces with
>and without the patch (with per device sysctl entries disabled)
>
>    With patch                      Without patch
>
>    real    0m 2.27s                real    0m 4.32s
>    user    0m 0.00s                user    0m 0.00s
>    sys     0m 1.13s                sys     0m 2.16s

Can you rerun the test with jhash() as the hash function? Just for clearness -
the spreading of jhash should be more uniformly distributed. At the end: where
is the threshold where a more accurate hash function is superior.

HGN
Eric Dumazet Oct. 25, 2009, 9:24 p.m. UTC | #2
Octavian Purdila a écrit :
> The current dev_name_hash is not very good at spreading entries when a
> large number of interfaces of the same type (e.g. ethXXXXX) are used.

Very true

> 
> Here are some performance numbers for creating 16000 dummy interfaces with
> and without the patch (with per device sysctl entries disabled)
> 
>     With patch                      Without patch
> 
>     real    0m 2.27s                real    0m 4.32s
>     user    0m 0.00s                user    0m 0.00s
>     sys     0m 1.13s                sys     0m 2.16s
> 

1) A program to show hash distribution would be more convincing,
and could show that 17 multiplier is better, not 31 :)

2) Why full_name_hash() not changed as well ?


$ cat test.c
#include <stdio.h>
#define init_name_hash()                0

/* partial hash update function. Assume roughly 4 bits per character */
static inline unsigned long
partial_name_hash(unsigned long c, unsigned long prevhash)
{
        return (prevhash + (c << 4) + (c >> 4)) * 11;
}

/*
 * Finally: cut down the number of bits to a int value (and try to avoid
 * losing bits)
 */
static inline unsigned long end_name_hash(unsigned long hash)
{
        return (unsigned int) hash;
}

/* Compute the hash for a name string. */
static inline unsigned int
full_name_hash(const unsigned char *name, unsigned int len)
{
        unsigned long hash = init_name_hash();
        while (len--)
                hash = partial_name_hash(*name++, hash);
        return end_name_hash(hash);
}

static inline unsigned int
string_hash31(const unsigned char *name, unsigned int len)
{
        unsigned hash = 0;
        int i;

        for (i = 0; i < len; ++i)
                hash = 31 * hash + name[i];
        return hash;
}

static inline unsigned int
string_hash17(const unsigned char *name, unsigned int len)
{
        unsigned hash = 0;
        int i;

        for (i = 0; i < len; ++i)
                hash = 17 * hash + name[i];
        return hash;
}

unsigned int hashdist1[256], hashdist2[256], hashdist3[256];

void print_dist(unsigned int *dist, const char *name)
{
        unsigned int i;

        printf("%s[] = {\n", name);
        for (i = 0; i < 256; i++) {
                printf("%d, ", dist[i]);
                if ((i & 15) == 15) putchar('\n');
        }
        printf("};\n");
}

int main()
{
        int i;
        unsigned int hash;
        unsigned char name[16];

        for (i = 0; i < 16000; i++) {
                unsigned int len = sprintf(name, "dummy%d", i);

                hash = full_name_hash(name, len);
                hashdist1[hash & 255]++;

                hash = string_hash31(name, len);
                hashdist2[hash & 255]++;

                hash = string_hash17(name, len);
                hashdist3[hash & 255]++;
                }
        print_dist(hashdist1, "full_name_hash");
        print_dist(hashdist2, "string_hash31");
        print_dist(hashdist3, "string_hash17");
        return 0;
}

$ gcc -o test test.c
$ ./test
full_name_hash[] = {
0, 0, 0, 375, 0, 0, 562, 0, 0, 0, 6, 0, 0, 0, 0, 56,
0, 0, 0, 374, 0, 0, 562, 0, 0, 0, 5, 0, 0, 0, 0, 57,
0, 0, 0, 375, 0, 0, 562, 0, 0, 0, 6, 1, 0, 0, 0, 56,
0, 0, 0, 375, 0, 0, 563, 0, 0, 0, 5, 1, 0, 0, 0, 56,
0, 0, 0, 375, 0, 0, 563, 0, 0, 0, 6, 1, 0, 0, 0, 56,
0, 0, 0, 375, 0, 0, 562, 0, 0, 0, 6, 0, 0, 0, 0, 57,
0, 0, 0, 375, 0, 0, 563, 0, 0, 0, 5, 0, 0, 0, 0, 56,
0, 0, 0, 376, 0, 0, 563, 0, 0, 0, 6, 1, 0, 0, 0, 56,
0, 0, 0, 375, 0, 0, 562, 0, 0, 0, 5, 1, 0, 0, 0, 57,
0, 0, 0, 374, 0, 0, 563, 0, 0, 0, 6, 1, 0, 0, 0, 56,
0, 0, 0, 375, 0, 0, 563, 0, 0, 0, 6, 1, 0, 0, 0, 56,
0, 0, 0, 376, 0, 0, 562, 0, 0, 0, 6, 0, 0, 0, 0, 56,
0, 0, 0, 375, 0, 0, 562, 0, 0, 0, 5, 0, 0, 0, 0, 56,
0, 0, 0, 375, 0, 0, 562, 0, 0, 0, 5, 1, 0, 0, 0, 56,
0, 0, 0, 375, 0, 0, 563, 0, 0, 0, 6, 1, 0, 0, 0, 57,
0, 0, 0, 375, 0, 0, 563, 0, 0, 0, 6, 1, 0, 0, 0, 56,
};
string_hash31[] = {
42, 54, 67, 81, 92, 103, 116, 125, 131, 131, 133, 128, 121, 110, 101, 87,
72, 59, 47, 36, 25, 17, 12, 8, 4, 4, 6, 7, 11, 15, 24, 32,
42, 54, 68, 79, 91, 105, 117, 127, 130, 135, 133, 129, 120, 113, 100, 86,
71, 58, 46, 33, 24, 16, 11, 6, 4, 4, 5, 7, 10, 16, 23, 32,
41, 54, 66, 78, 91, 104, 117, 123, 130, 133, 134, 127, 122, 112, 101, 86,
72, 61, 47, 35, 25, 19, 12, 8, 5, 6, 6, 7, 11, 16, 24, 31,
43, 54, 66, 78, 92, 106, 116, 125, 131, 136, 132, 129, 121, 112, 98, 84,
72, 58, 44, 32, 24, 16, 11, 6, 5, 5, 4, 7, 10, 17, 23, 32,
41, 54, 65, 78, 92, 105, 116, 123, 132, 133, 133, 127, 122, 111, 99, 86,
74, 60, 45, 35, 26, 19, 12, 8, 7, 5, 5, 7, 12, 17, 24, 31,
43, 54, 66, 80, 94, 107, 116, 126, 131, 135, 131, 129, 120, 110, 97, 85,
71, 56, 44, 33, 25, 16, 11, 7, 5, 3, 4, 7, 11, 17, 22, 32,
43, 54, 66, 80, 94, 105, 115, 124, 133, 132, 132, 127, 121, 110, 99, 87,
74, 59, 46, 37, 26, 19, 12, 9, 6, 4, 5, 8, 12, 16, 23, 32,
43, 53, 67, 81, 94, 105, 116, 128, 132, 135, 133, 130, 121, 112, 100, 88,
72, 57, 46, 34, 25, 17, 11, 7, 4, 3, 5, 8, 11, 16, 22, 33,
};
string_hash17[] = {
68, 73, 72, 69, 62, 57, 52, 53, 60, 66, 71, 69, 69, 68, 67, 64,
67, 68, 70, 68, 63, 56, 53, 51, 57, 64, 68, 71, 68, 67, 66, 65,
62, 65, 67, 68, 64, 59, 55, 52, 54, 60, 67, 69, 69, 65, 65, 61,
59, 60, 63, 64, 64, 60, 55, 54, 54, 61, 67, 72, 72, 71, 66, 64,
60, 59, 60, 64, 64, 62, 58, 56, 55, 59, 66, 70, 73, 69, 67, 62,
58, 53, 56, 57, 60, 59, 57, 54, 53, 56, 62, 69, 71, 72, 67, 64,
57, 53, 51, 54, 58, 60, 59, 57, 57, 56, 63, 69, 74, 74, 71, 65,
60, 53, 50, 52, 55, 58, 59, 58, 57, 58, 61, 68, 73, 80, 77, 73,
66, 59, 52, 52, 54, 60, 61, 58, 58, 58, 59, 64, 71, 73, 78, 71,
66, 57, 50, 46, 50, 54, 59, 61, 58, 59, 60, 65, 70, 75, 78, 80,
72, 65, 56, 50, 49, 53, 59, 63, 62, 60, 62, 63, 68, 72, 77, 77,
76, 67, 58, 49, 46, 49, 55, 60, 62, 62, 59, 62, 65, 70, 72, 78,
75, 73, 62, 53, 47, 47, 52, 58, 63, 62, 63, 61, 64, 67, 71, 73,
76, 72, 68, 57, 49, 46, 50, 57, 62, 65, 65, 65, 64, 67, 69, 73,
76, 76, 71, 65, 54, 49, 49, 55, 62, 65, 65, 64, 65, 63, 66, 67,
71, 71, 70, 63, 57, 49, 47, 52, 58, 65, 66, 67, 65, 67, 65, 67,
};

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Octavian Purdila Oct. 25, 2009, 9:55 p.m. UTC | #3
On Sunday 25 October 2009 23:24:10 you wrote:
> Octavian Purdila a écrit :
> > The current dev_name_hash is not very good at spreading entries when a
> > large number of interfaces of the same type (e.g. ethXXXXX) are used.
> 
> Very true
> 
> > Here are some performance numbers for creating 16000 dummy interfaces
> > with and without the patch (with per device sysctl entries disabled)
> >
> >     With patch                      Without patch
> >
> >     real    0m 2.27s                real    0m 4.32s
> >     user    0m 0.00s                user    0m 0.00s
> >     sys     0m 1.13s                sys     0m 2.16s
> 
> 1) A program to show hash distribution would be more convincing,
> and could show that 17 multiplier is better, not 31 :)
> 

You beat me to it :) 

But anyways, I've attached mine as well , which compares orig, jhash, new17, 
new31, with different number of interfaces as well as different hash bits. 

The score it shows its the number of iterations needed to find all elements in 
the hash (good enough metric?)

My results shows that new17 is better or very close to jhash2. And I think its 
lighter then jhash too.

> 2) Why full_name_hash() not changed as well ?

I don't think it is going to be as easy to "benchmark" this, as it is used 
with more diverse inputs. 

AFAIK, full_name_hash is used by the dentry code, meaning that it cashes path 
names? If so, perhaps we can use find {/etc,/bin,/sbin,/usr} as input patterns? 

My results:

$ ./dev_name_hash ixint 255 0 8
score 1140
$ ./dev_name_hash ixint 255 1 8
score 379
$ ./dev_name_hash ixint 255 2 8
score 461
$ ./dev_name_hash ixint 255 3 8
score 329
$ ./dev_name_hash ixint 1000 0 8
score 26074
$ ./dev_name_hash ixint 1000 1 8
score 2887
$ ./dev_name_hash ixint 1000 2 8
score 2853
$ ./dev_name_hash ixint 1000 3 8
score 3713
$ ./dev_name_hash ixint 16000 0 8
score 3689828
$ ./dev_name_hash ixint 16000 1 8
score 516389
$ ./dev_name_hash ixint 16000 2 8
score 515292
$ ./dev_name_hash ixint 16000 3 8
score 554042
$ ./dev_name_hash ixint 16000 0 16
score 24741
$ ./dev_name_hash ixint 16000 1 16
score 17949
$ ./dev_name_hash ixint 16000 2 16
score 16715
$ ./dev_name_hash ixint 16000 3 16
score 18125
Hagen Paul Pfeifer Oct. 25, 2009, 10:41 p.m. UTC | #4
* Octavian Purdila | 2009-10-25 23:55:32 [+0200]:

>My results shows that new17 is better or very close to jhash2. And I think its 
>lighter then jhash too.

If new17 is very close to jhash/jhash2 then the cycles comes into play.
Anyway, there is already a very potent hash interface in form of jhash{,2}.

+1 for jhash2

HGN

PS: great work! ;)

PPS: http://libhashish.sourceforge.net/ have some real hash benchmarks in form
of avalanche test and some others too. It does not really matter here because
Jenkins performs _nearly_ perfect in all cases. ;)
Octavian Purdila Oct. 25, 2009, 10:45 p.m. UTC | #5
On Monday 26 October 2009 00:41:54 you wrote:
> * Octavian Purdila | 2009-10-25 23:55:32 [+0200]:
> >My results shows that new17 is better or very close to jhash2. And I think
> > its lighter then jhash too.
> 
> If new17 is very close to jhash/jhash2 then the cycles comes into play.
> Anyway, there is already a very potent hash interface in form of jhash{,2}.
> 

Ah, sorry for the confusion, jhash2 was a typo, I've only tested jhash.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
stephen hemminger Oct. 26, 2009, 4:43 a.m. UTC | #6
On Sun, 25 Oct 2009 21:58:53 +0200
Octavian Purdila <opurdila@ixiacom.com> wrote:

> 
> The current dev_name_hash is not very good at spreading entries when a
> large number of interfaces of the same type (e.g. ethXXXXX) are used.
> 
> Here are some performance numbers for creating 16000 dummy interfaces with
> and without the patch (with per device sysctl entries disabled)
> 
>     With patch                      Without patch
> 
>     real    0m 2.27s                real    0m 4.32s
>     user    0m 0.00s                user    0m 0.00s
>     sys     0m 1.13s                sys     0m 2.16s
> 
> Signed-off-by: Octavian Purdila <opurdila@ixiacom.com>
> ---
>  net/core/dev.c |    8 +++++++-
>  1 files changed, 7 insertions(+), 1 deletions(-)

My $.02 would for fixing full name hash because other usages would
benefit as well.  Inventing special case for network devices seems
unnecessary.
stephen hemminger Oct. 26, 2009, 6:30 a.m. UTC | #7
I overkilled this with more functions and compared filenames as well.


genarated names (dummyNNNN)
Algorithm             Time (us)     Ratio       Max   StdDev
kr_hash                  277925   152408.6   468448 543.19
string_hash31            329356     5859.4    16042  44.18
SuperFastHash            324570     4885.9    10502   2.29
djb2                     327908     5608.5    15210  38.08
string_hash17            326769     4883.6     9896   0.76
full_name_hash           343196    63921.0   140628 343.62
jhash_string             463801     4883.8    10085   1.02
sdbm                     398587     9801.7    29634  99.18

filesystem names
Algorithm             Time (us)     Ratio       Max   StdDev
kr_hash                  278840   152314.9   468717 543.01
string_hash31            331206     5802.1    16004  42.87
SuperFastHash            325938     4887.5    13528   2.88
djb2                     330621     5607.1    15333  38.05
string_hash17            331181     4884.9    13274   1.78
full_name_hash           347312    63972.2   141336 343.77
jhash_string             466799     4885.2    13275   1.92
sdbm                     403691     9771.7    29629  98.88

Ratio is the average number of buckets examined when scanning
the whole set of names.


1) Increased hash buckets to 1024 which seems reasonable if we are
   going to test that many names.
2) Increased name size to 256 so that longer filenames could be
   checked and name blocks were not in same cache line

* SuperFastHash is too big to put inline
Eric Dumazet Oct. 26, 2009, 7:48 a.m. UTC | #8
Stephen Hemminger a écrit :
> I overkilled this with more functions and compared filenames as well.
> 
> 
> genarated names (dummyNNNN)
> Algorithm             Time (us)     Ratio       Max   StdDev
> kr_hash                  277925   152408.6   468448 543.19
> string_hash31            329356     5859.4    16042  44.18
> SuperFastHash            324570     4885.9    10502   2.29
> djb2                     327908     5608.5    15210  38.08
> string_hash17            326769     4883.6     9896   0.76
> full_name_hash           343196    63921.0   140628 343.62
> jhash_string             463801     4883.8    10085   1.02
> sdbm                     398587     9801.7    29634  99.18
> 
> filesystem names
> Algorithm             Time (us)     Ratio       Max   StdDev
> kr_hash                  278840   152314.9   468717 543.01
> string_hash31            331206     5802.1    16004  42.87
> SuperFastHash            325938     4887.5    13528   2.88
> djb2                     330621     5607.1    15333  38.05
> string_hash17            331181     4884.9    13274   1.78
> full_name_hash           347312    63972.2   141336 343.77
> jhash_string             466799     4885.2    13275   1.92
> sdbm                     403691     9771.7    29629  98.88
> 
> Ratio is the average number of buckets examined when scanning
> the whole set of names.
> 
> 
> 1) Increased hash buckets to 1024 which seems reasonable if we are
>    going to test that many names.
> 2) Increased name size to 256 so that longer filenames could be
>    checked and name blocks were not in same cache line
> 
> * SuperFastHash is too big to put inline
> 
> 

Thanks Stephen

1) dcache hash is very big on average machines.

2) dcache : We hash last component, against its parent, acting as a base.
   Your hashtest program considers the name as a single entity, giving different
   hash distribution.

3) netdev names are special, since we have only one parent, and
 smaller hash table.

4) jhash is not that expensive, but it might be because of huge working set of your
test program : strings are not in cpu caches and speed is mostly driven by ram bandwidth.

But current full_name_hash() seems a pretty bad choice !
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/net/core/dev.c b/net/core/dev.c
index fa88dcd..af3cab3 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -198,7 +198,13 @@  EXPORT_SYMBOL(dev_base_lock);
 
 static inline struct hlist_head *dev_name_hash(struct net *net, const char *name)
 {
-	unsigned hash = full_name_hash(name, strnlen(name, IFNAMSIZ));
+	unsigned hash = 0;
+	int len = strnlen(name, IFNAMSIZ);
+	int i;
+
+	for (i = 0; i < len; ++i)
+		hash = 31 * hash + name[i];
+
 	return &net->dev_name_head[hash & ((1 << NETDEV_HASHBITS) - 1)];
 }