Message ID | 1318697463.7169.21.camel@ganymede |
---|---|
State | Rejected, archived |
Delegated to: | David Miller |
Headers | show |
Le samedi 15 octobre 2011 à 12:51 -0400, Dan Siemon a écrit : > cls_flow.c: flow_get_proto_dst() > > The proto-dst key returns the destination port for UDP, TCP and a few > other protocols [see proto_ports_offset()]. For ICMP and IPIP it falls > back to: > > return addr_fold(skb_dst(skb)) ^ (__force u16)skb->protocol; > > Since Linux maintains a dst_entry for each TOS value this causes the > returned value to be affected by the TOS which is unexpected and > probably broken. Hi Dan I think Patrick did this on purpose, because of of the lack of perturbation in cls_flow.c : If all these frames were mapped to a single flow, they might interfere with an other regular flow and hurt it. I dont qualify existing code as buggy. Its about fallback behavior anyway (I dont think its even documented) If you have too many frames going to the fallback, then this classifier is probably not the one you should use ? Hint : You can change your filter to use this classifier only on TCP/UDP trafic, and use another one on other protocols : Coupled to your qdisc rules, you even can limit to X percent the bandwidth allocated to this trafic We could argue that if TOS value of two packets is different, then packets belong to different flows as well. [ It seems we currently lack a FLOW_KEY_TOS : that could be a usefull addition ] -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, 2011-10-17 at 08:09 +0200, Eric Dumazet wrote: > Le samedi 15 octobre 2011 à 12:51 -0400, Dan Siemon a écrit : > > cls_flow.c: flow_get_proto_dst() > > > > The proto-dst key returns the destination port for UDP, TCP and a few > > other protocols [see proto_ports_offset()]. For ICMP and IPIP it falls > > back to: > > > > return addr_fold(skb_dst(skb)) ^ (__force u16)skb->protocol; > > > > Since Linux maintains a dst_entry for each TOS value this causes the > > returned value to be affected by the TOS which is unexpected and > > probably broken. > > Hi Dan > > I think Patrick did this on purpose, because of of the lack of > perturbation in cls_flow.c : If all these frames were mapped to a single > flow, they might interfere with an other regular flow and hurt it. > > I dont qualify existing code as buggy. Its about fallback behavior > anyway (I dont think its even documented) Thanks for the review Eric. Won't virtually all uses of proto-dst also use the dst key anyway? In which case this fallback does nothing except make the TOS effect the hash output because the dst will be the same and dst_entry would be the same if it wasn't for the different TOS (by far the common case). I don't see the value of the unintuitive behavior. I'm not certain this is a problem but also note that including TOS will mean that packets within a tunnel will be reordered if 'tos inherit' is set on the tunnel and only the typical src,dst,proto,proto-src,proto-dst is used. Again, probably not expected. > If you have too many frames going to the fallback, then this classifier > is probably not the one you should use ? If you have significant traffic in tunnels then any 5-tuple approach is going to present problems unless you look into the tunnel (like my other patch :) ) > Hint : You can change your filter to use this classifier only on TCP/UDP > trafic, and use another one on other protocols : Coupled to your qdisc > rules, you even can limit to X percent the bandwidth allocated to this > trafic > > We could argue that if TOS value of two packets is different, then > packets belong to different flows as well. [ It seems we currently lack > a FLOW_KEY_TOS : that could be a usefull addition ]
Le dimanche 23 octobre 2011 à 21:03 -0400, Dan Siemon a écrit : > On Mon, 2011-10-17 at 08:09 +0200, Eric Dumazet wrote: > > Le samedi 15 octobre 2011 à 12:51 -0400, Dan Siemon a écrit : > > > cls_flow.c: flow_get_proto_dst() > > > > > > The proto-dst key returns the destination port for UDP, TCP and a few > > > other protocols [see proto_ports_offset()]. For ICMP and IPIP it falls > > > back to: > > > > > > return addr_fold(skb_dst(skb)) ^ (__force u16)skb->protocol; > > > > > > Since Linux maintains a dst_entry for each TOS value this causes the > > > returned value to be affected by the TOS which is unexpected and > > > probably broken. > > > > Hi Dan > > > > I think Patrick did this on purpose, because of of the lack of > > perturbation in cls_flow.c : If all these frames were mapped to a single > > flow, they might interfere with an other regular flow and hurt it. > > > > I dont qualify existing code as buggy. Its about fallback behavior > > anyway (I dont think its even documented) > > Thanks for the review Eric. > > Won't virtually all uses of proto-dst also use the dst key anyway? In > which case this fallback does nothing except make the TOS effect the > hash output because the dst will be the same and dst_entry would be the > same if it wasn't for the different TOS (by far the common case). I > don't see the value of the unintuitive behavior. > > I'm not certain this is a problem but also note that including TOS will > mean that packets within a tunnel will be reordered if 'tos inherit' is > set on the tunnel and only the typical src,dst,proto,proto-src,proto-dst > is used. Again, probably not expected. Hmm, it seems cls_flow now supports "perturbation", so my prior argument is not true anymore. I guess your patch would be fine, but I prefer waiting Patrick review on it ;) -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/net/sched/cls_flow.c b/net/sched/cls_flow.c index 6994214..7527e61 100644 --- a/net/sched/cls_flow.c +++ b/net/sched/cls_flow.c @@ -150,7 +150,7 @@ static u32 flow_get_proto_src(struct sk_buff *skb) } } - return addr_fold(skb->sk); + return 0; } static u32 flow_get_proto_dst(struct sk_buff *skb) @@ -192,7 +192,7 @@ static u32 flow_get_proto_dst(struct sk_buff *skb) } } - return addr_fold(skb_dst(skb)) ^ (__force u16)skb->protocol; + return 0; } static u32 flow_get_iif(const struct sk_buff *skb)
cls_flow.c: flow_get_proto_dst() The proto-dst key returns the destination port for UDP, TCP and a few other protocols [see proto_ports_offset()]. For ICMP and IPIP it falls back to: return addr_fold(skb_dst(skb)) ^ (__force u16)skb->protocol; Since Linux maintains a dst_entry for each TOS value this causes the returned value to be affected by the TOS which is unexpected and probably broken. Is there a reason why this doesn't return 0 for protocols that don't have a notion of source and destination ports? It seems very odd to me that a value which is not at all related to the traffic on the wire is returned for this key. There is a somewhat similar situation with flow_get_proto_src(). Here the fallback value is: return addr_fold(skb->sk); It looks like this is 0 when the traffic doesn't originate locally and even for local traffic I don't understand why the use of a effectively random number here is useful. For a long winded explanation of how I discovered this see: http://www.coverfire.com/archives/2011/10/15/linux-flow-classifier-proto-dst-and-tos/ Below is a simple patch which makes these functions fallback to returning 0 when the protocol doesn't have the notion of ports. Signed-off-by: Dan Siemon <dan@coverfire.com>