Message ID | 4EF16AA3.2070303@linux.vnet.ibm.com |
---|---|
State | Rejected, archived |
Delegated to: | David Miller |
Headers | show |
Please tell me if this patch is silly... I really want to know the reason we need to use ^ and | instead of if and &&. On 12/21/2011 01:12 PM, Michael Wang wrote: > From: Michael Wang <wangyun@linux.vnet.ibm.com> > > If previous condition doesn't meet, the later check will be cancelled. > So we don't need to do all the calculation. > > Signed-off-by: Michael Wang <wangyun@linux.vnet.ibm.com> > --- > net/ipv4/route.c | 8 ++++---- > 1 files changed, 4 insertions(+), 4 deletions(-) > > diff --git a/net/ipv4/route.c b/net/ipv4/route.c > index f30112f..2872bfb 100644 > --- a/net/ipv4/route.c > +++ b/net/ipv4/route.c > @@ -2362,10 +2362,10 @@ int ip_route_input_common(struct sk_buff *skb, __be32 daddr, __be32 saddr, > > for (rth = rcu_dereference(rt_hash_table[hash].chain); rth; > rth = rcu_dereference(rth->dst.rt_next)) { > - if ((((__force u32)rth->rt_key_dst ^ (__force u32)daddr) | And I also wonder whether I can use "rth->rt_key_dst == daddr" here or not... Thanks & Regards, Michael Wang > - ((__force u32)rth->rt_key_src ^ (__force u32)saddr) | > - (rth->rt_route_iif ^ iif) | > - (rth->rt_key_tos ^ tos)) == 0 && > + if (((__force u32)rth->rt_key_dst ^ (__force u32)daddr) == 0 && > + ((__force u32)rth->rt_key_src ^ (__force u32)saddr) == 0 && > + rth->rt_route_iif == iif && > + rth->rt_key_tos == tos && > rth->rt_mark == skb->mark && > net_eq(dev_net(rth->dst.dev), net) && > !rt_is_expired(rth)) { -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, 2011-12-21 at 13:12 +0800, Michael Wang wrote: > From: Michael Wang <wangyun@linux.vnet.ibm.com> > > If previous condition doesn't meet, the later check will be cancelled. > So we don't need to do all the calculation. Not sure about that. > Signed-off-by: Michael Wang <wangyun@linux.vnet.ibm.com> > --- > net/ipv4/route.c | 8 ++++---- > 1 files changed, 4 insertions(+), 4 deletions(-) > > diff --git a/net/ipv4/route.c b/net/ipv4/route.c > index f30112f..2872bfb 100644 > --- a/net/ipv4/route.c > +++ b/net/ipv4/route.c > @@ -2362,10 +2362,10 @@ int ip_route_input_common(struct sk_buff *skb, __be32 daddr, __be32 saddr, > > for (rth = rcu_dereference(rt_hash_table[hash].chain); rth; > rth = rcu_dereference(rth->dst.rt_next)) { > - if ((((__force u32)rth->rt_key_dst ^ (__force u32)daddr) | > - ((__force u32)rth->rt_key_src ^ (__force u32)saddr) | > - (rth->rt_route_iif ^ iif) | > - (rth->rt_key_tos ^ tos)) == 0 && > + if (((__force u32)rth->rt_key_dst ^ (__force u32)daddr) == 0 && > + ((__force u32)rth->rt_key_src ^ (__force u32)saddr) == 0 && > + rth->rt_route_iif == iif && > + rth->rt_key_tos == tos && > rth->rt_mark == skb->mark && > net_eq(dev_net(rth->dst.dev), net) && > !rt_is_expired(rth)) { See: commit c0b8c32b1c96afc9b32b717927330025cc1c501e Author: Stephen Hemminger <shemminger@vyatta.com> Date: Thu Apr 10 04:00:28 2008 -0700 IPV4: use xor rather than multiple ands for route compare The comparison in ip_route_input is a hot path, by recoding the C "and" as bit operations, fewer conditional branches get generated so the code should be faster. Maybe someday Gcc will be smart enough to do this? Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Acked-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net> -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 12/21/2011 01:23 PM, Joe Perches wrote: > On Wed, 2011-12-21 at 13:12 +0800, Michael Wang wrote: >> From: Michael Wang <wangyun@linux.vnet.ibm.com> >> >> If previous condition doesn't meet, the later check will be cancelled. >> So we don't need to do all the calculation. > > Not sure about that. > Hi, Joe Thanks for your reply :) >> Signed-off-by: Michael Wang <wangyun@linux.vnet.ibm.com> >> --- >> net/ipv4/route.c | 8 ++++---- >> 1 files changed, 4 insertions(+), 4 deletions(-) >> >> diff --git a/net/ipv4/route.c b/net/ipv4/route.c >> index f30112f..2872bfb 100644 >> --- a/net/ipv4/route.c >> +++ b/net/ipv4/route.c >> @@ -2362,10 +2362,10 @@ int ip_route_input_common(struct sk_buff *skb, __be32 daddr, __be32 saddr, >> >> for (rth = rcu_dereference(rt_hash_table[hash].chain); rth; >> rth = rcu_dereference(rth->dst.rt_next)) { >> - if ((((__force u32)rth->rt_key_dst ^ (__force u32)daddr) | >> - ((__force u32)rth->rt_key_src ^ (__force u32)saddr) | >> - (rth->rt_route_iif ^ iif) | >> - (rth->rt_key_tos ^ tos)) == 0 && >> + if (((__force u32)rth->rt_key_dst ^ (__force u32)daddr) == 0 && >> + ((__force u32)rth->rt_key_src ^ (__force u32)saddr) == 0 && >> + rth->rt_route_iif == iif && >> + rth->rt_key_tos == tos && >> rth->rt_mark == skb->mark && >> net_eq(dev_net(rth->dst.dev), net) && >> !rt_is_expired(rth)) { > > See: > > commit c0b8c32b1c96afc9b32b717927330025cc1c501e > Author: Stephen Hemminger <shemminger@vyatta.com> > Date: Thu Apr 10 04:00:28 2008 -0700 > > IPV4: use xor rather than multiple ands for route compare > > The comparison in ip_route_input is a hot path, by recoding the C > "and" as bit operations, fewer conditional branches get generated > so the code should be faster. Maybe someday Gcc will be smart > enough to do this? This is what confused me, why "fewer conditional branches get generated" will make code faster? In this example, I think the best condition when daddr is different, we only need to go to one branch do compare then quit, won't this be faster? Thanks, Michael Wang > > Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> > Acked-by: Eric Dumazet <dada1@cosmosbay.com> > Signed-off-by: David S. Miller <davem@davemloft.net> > > -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, 2011-12-21 at 13:39 +0800, Michael Wang wrote: > On 12/21/2011 01:23 PM, Joe Perches wrote: > > On Wed, 2011-12-21 at 13:12 +0800, Michael Wang wrote: > >> From: Michael Wang <wangyun@linux.vnet.ibm.com> > >> > >> If previous condition doesn't meet, the later check will be cancelled. > >> So we don't need to do all the calculation. [] > > commit c0b8c32b1c96afc9b32b717927330025cc1c501e > > Author: Stephen Hemminger <shemminger@vyatta.com> > > Date: Thu Apr 10 04:00:28 2008 -0700 > > > > IPV4: use xor rather than multiple ands for route compare > > > > The comparison in ip_route_input is a hot path, by recoding the C > > "and" as bit operations, fewer conditional branches get generated > > so the code should be faster. Maybe someday Gcc will be smart > > enough to do this? > This is what confused me, why "fewer conditional branches get generated" > will make code faster? > In this example, I think the best condition when daddr is different, we > only need to go to one branch do compare then quit, won't this be faster? if (a && b) ... pseudo-codes to: if (!a) goto fail; if (!b) goto fail; ... fail: Each of those conditional branches has a cost. Combining tests of variables in the same cache lines has relatively little cost compared to the conditional branches. That's the theory anyway. If you have tests that demonstrate otherwise, please provide them. cheers, Joe -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
From: Michael Wang <wangyun@linux.vnet.ibm.com> Date: Wed, 21 Dec 2011 13:12:03 +0800 > From: Michael Wang <wangyun@linux.vnet.ibm.com> > > If previous condition doesn't meet, the later check will be cancelled. > So we don't need to do all the calculation. > > Signed-off-by: Michael Wang <wangyun@linux.vnet.ibm.com> This is intentional to reduce the number of branch prediction misses, please don't change this. Once we read one of these values, the rest are incredibly cheap, the real cost is if we have tons of real branches here, each of which can be mispredicted. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 12/21/2011 01:50 PM, Joe Perches wrote: > On Wed, 2011-12-21 at 13:39 +0800, Michael Wang wrote: >> On 12/21/2011 01:23 PM, Joe Perches wrote: >>> On Wed, 2011-12-21 at 13:12 +0800, Michael Wang wrote: >>>> From: Michael Wang <wangyun@linux.vnet.ibm.com> >>>> >>>> If previous condition doesn't meet, the later check will be cancelled. >>>> So we don't need to do all the calculation. > [] >>> commit c0b8c32b1c96afc9b32b717927330025cc1c501e >>> Author: Stephen Hemminger <shemminger@vyatta.com> >>> Date: Thu Apr 10 04:00:28 2008 -0700 >>> >>> IPV4: use xor rather than multiple ands for route compare >>> >>> The comparison in ip_route_input is a hot path, by recoding the C >>> "and" as bit operations, fewer conditional branches get generated >>> so the code should be faster. Maybe someday Gcc will be smart >>> enough to do this? >> This is what confused me, why "fewer conditional branches get generated" >> will make code faster? >> In this example, I think the best condition when daddr is different, we >> only need to go to one branch do compare then quit, won't this be faster? > > if (a && b) > ... > pseudo-codes to: > if (!a) > goto fail; > if (!b) > goto fail; > ... > fail: > > Each of those conditional branches has a cost. > Combining tests of variables in the same cache lines > has relatively little cost compared to the conditional > branches. > That make sense :) > That's the theory anyway. > > If you have tests that demonstrate otherwise, please > provide them. > I think the previous patch should have done such test, otherwise they won't do this change. Thanks for your reply, that clear my confusion. Regards, Michael Wang > cheers, Joe > -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 12/21/2011 01:57 PM, David Miller wrote: > From: Michael Wang <wangyun@linux.vnet.ibm.com> > Date: Wed, 21 Dec 2011 13:12:03 +0800 > >> From: Michael Wang <wangyun@linux.vnet.ibm.com> >> >> If previous condition doesn't meet, the later check will be cancelled. >> So we don't need to do all the calculation. >> >> Signed-off-by: Michael Wang <wangyun@linux.vnet.ibm.com> > > This is intentional to reduce the number of branch prediction > misses, please don't change this. > > Once we read one of these values, the rest are incredibly cheap, > the real cost is if we have tons of real branches here, each > of which can be mispredicted. > Hi, David Thanks for your detailed explain, now I know the reason we use this style :) Thanks, Michael Wang -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/net/ipv4/route.c b/net/ipv4/route.c index f30112f..2872bfb 100644 --- a/net/ipv4/route.c +++ b/net/ipv4/route.c @@ -2362,10 +2362,10 @@ int ip_route_input_common(struct sk_buff *skb, __be32 daddr, __be32 saddr, for (rth = rcu_dereference(rt_hash_table[hash].chain); rth; rth = rcu_dereference(rth->dst.rt_next)) { - if ((((__force u32)rth->rt_key_dst ^ (__force u32)daddr) | - ((__force u32)rth->rt_key_src ^ (__force u32)saddr) | - (rth->rt_route_iif ^ iif) | - (rth->rt_key_tos ^ tos)) == 0 && + if (((__force u32)rth->rt_key_dst ^ (__force u32)daddr) == 0 && + ((__force u32)rth->rt_key_src ^ (__force u32)saddr) == 0 && + rth->rt_route_iif == iif && + rth->rt_key_tos == tos && rth->rt_mark == skb->mark && net_eq(dev_net(rth->dst.dev), net) && !rt_is_expired(rth)) {