Message ID | 1255897680.4815.63.camel@dogo.mojatatu.com |
---|---|
State | RFC, archived |
Delegated to: | David Miller |
Headers | show |
I haven't looked at the patches, but I do not believe requiring marking to be symmetric to be a good idea. Example: - a relatively complex router/load balancer setup - normal (no mark) packets get routed/load balanced to destinations which are healthy - a separate job which health checks the destinations (and updates the no mark routing table on destination health state transitions) - it uses socket marking to select a separate routing table which throws all load at a specific destination host. ie. the socket marking is used to explicitly direct load at a specific destination host. Obviously only the transmit path is affected. There is no marking happening on the receive path. Another example would be tunneling. I'd envision something like: ip tunnel add vtun0 mode gre remote ... local ... tos inherit ttl 64 mark 0x1234 ip rule add fwmark 0x1234 lookup 250 ip route add 192.168.1.0/24 dev eth0 table 250 ip route add default via 192.168.1.1 dev eth0 table 250 ip route add default dev vtun0 table main (obviously this is just an example and not fully fleshed out, furthermore ip tunnel doesn't currently support mark, nor do the tunnel modules themselves) If you do want to allow explicit incoming mark filtering use another socket option (SO_MARK_RCV) and a different/new socket field. >> I vote for extending BPF, and not adding the price of a compare >> for each packet. Only users wanting mark filtering should pay the price. I agree that being able to filter on mark in bpf makes a lot of sense. I wonder if we're not hitting the filters potentially before the mark is set though (on receive at least)... I'm nowhere near sure but I think packets get diverted/cloned to tcpdump before they hit the ip stack (and thus potentially get marked by ip(6)table mangle rules) -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Sun, 2009-10-18 at 16:09 -0700, Maciej Żenczykowski wrote: > > I agree that being able to filter on mark in bpf makes a lot of sense. I agree as well - i posted a patch yesterday; i just tested it and it works so i will formally post it shortly. > I wonder if we're not hitting the filters potentially before the mark > is set though (on receive at least)... > I'm nowhere near sure but I think packets get diverted/cloned to > tcpdump before they hit the ip stack (and thus potentially get marked > by ip(6)table mangle rules) There are many ways to mark the packets before they get to the socket. tc ingress provides at least two ways (ipt action and recently posted patch by me on skbedit); iptables as well. cheers, jamal -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/include/linux/filter.h b/include/linux/filter.h index 1354aaf..909193e 100644 --- a/include/linux/filter.h +++ b/include/linux/filter.h @@ -123,7 +123,8 @@ struct sock_fprog /* Required for SO_ATTACH_FILTER. */ #define SKF_AD_IFINDEX 8 #define SKF_AD_NLATTR 12 #define SKF_AD_NLATTR_NEST 16 -#define SKF_AD_MAX 20 +#define SKF_AD_MARK 20 +#define SKF_AD_MAX 24 #define SKF_NET_OFF (-0x100000) #define SKF_LL_OFF (-0x200000) diff --git a/net/core/filter.c b/net/core/filter.c index d1d779c..e3987e1 100644 --- a/net/core/filter.c +++ b/net/core/filter.c @@ -303,6 +303,9 @@ load_b: case SKF_AD_IFINDEX: A = skb->dev->ifindex; continue; + case SKF_AD_MARK: + A = skb->mark; + continue; case SKF_AD_NLATTR: { struct nlattr *nla;