Message ID | 4168018.821328016111235.JavaMail.root@5-MeO-DMT.ynet.sk |
---|---|
State | Rejected, archived |
Delegated to: | David Miller |
From: Stefan Gula <steweg@ynet.sk>
Date: Tue, 31 Jan 2012 14:21:51 +0100 (CET)

> - neither NVGRE nor VXLAN are part of the openvswitch for now

Too bad, it means we'll have this new user API of yours so when
openvswitch does add the necessary code we CANNOT REMOVE your stuff.

I'm not applying this until you at least attempt an openvswitch
version, and that's basically the end of this discussion.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html

On Tue, 31 Jan 2012 14:21:51 +0100 (CET)
Stefan Gula <steweg@ynet.sk> wrote:

> From: Stefan Gula <steweg@gmail.com>
>
> This patch is an extension for current Ethernet over GRE
> implementation, which allows user to create virtual bridge (multipoint
> VPN) and forward traffic based on Ethernet MAC address information in
> it. It simulates the Bridge behavior learning mechanism, but instead
> of learning port ID from which given MAC address comes, it learns IP
> address of peer which encapsulated given packet. Multicast, Broadcast
> and unknown-multicast traffic is sent over network as multicast
> encapsulated GRE packet, so one Ethernet multipoint GRE tunnel can be
> represented as one single virtual switch on logical level and be also
> represented as one multicast IPv4 address on network level.
>
> Signed-off-by: Stefan Gula <steweg@gmail.com>

Have you looked at the NVGRE standard?
http://tools.ietf.org/html/draft-sridharan-virtualization-nvgre-00

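The learning behaviour described in the quoted patch summary (learn the peer IP address per source MAC on receive, send known unicast to that peer, send everything else to the multicast group) can be sketched in userspace C. This is only an illustration under simplifying assumptions, not the patch's code: the names `fdb_learn`, `fdb_lookup` and `FDB_SIZE` are hypothetical, the table is a fixed-size array with linear probing and no locking, RCU or ageing, and the hash is a plain FNV-1a rather than the kernel's jhash.

```c
#include <stdint.h>
#include <string.h>

#define ETH_ALEN 6
#define FDB_SIZE 256	/* hypothetical table size; must be a power of two */

struct fdb_entry {
	uint8_t  mac[ETH_ALEN];
	uint32_t peer_ip;	/* IPv4 address of the encapsulating peer */
	int      used;
};

static struct fdb_entry fdb[FDB_SIZE];

/* FNV-1a over the six MAC bytes, reduced to a table index. */
static unsigned int fdb_hash(const uint8_t *mac)
{
	uint32_t h = 2166136261u;
	int i;

	for (i = 0; i < ETH_ALEN; i++) {
		h ^= mac[i];
		h *= 16777619u;
	}
	return h & (FDB_SIZE - 1);
}

/* Receive path: remember which peer encapsulated a frame from src_mac. */
static void fdb_learn(const uint8_t *src_mac, uint32_t peer_ip)
{
	unsigned int start, n;

	if (src_mac[0] & 1)	/* multicast/broadcast source: never learned */
		return;
	start = fdb_hash(src_mac);
	for (n = 0; n < FDB_SIZE; n++) {
		struct fdb_entry *e = &fdb[(start + n) & (FDB_SIZE - 1)];

		if (!e->used || memcmp(e->mac, src_mac, ETH_ALEN) == 0) {
			memcpy(e->mac, src_mac, ETH_ALEN);
			e->peer_ip = peer_ip;	/* new entry, or peer moved */
			e->used = 1;
			return;
		}
	}
	/* Table full: the address is simply not learned. */
}

/* Transmit path: known unicast goes to the learned peer, everything
 * else (multicast, broadcast, unknown unicast) goes to the group. */
static uint32_t fdb_lookup(const uint8_t *dst_mac, uint32_t group_ip)
{
	unsigned int start, n;

	if (dst_mac[0] & 1)
		return group_ip;
	start = fdb_hash(dst_mac);
	for (n = 0; n < FDB_SIZE; n++) {
		const struct fdb_entry *e = &fdb[(start + n) & (FDB_SIZE - 1)];

		if (!e->used)
			break;
		if (memcmp(e->mac, dst_mac, ETH_ALEN) == 0)
			return e->peer_ip;
	}
	return group_ip;
}
```

In this sketch a lookup miss falls back to the group address, which mirrors the summary's statement that one multipoint tunnel is represented by one multicast IPv4 address at the network level.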
2012/1/31 David Miller <davem@davemloft.net>:
> From: Stefan Gula <steweg@ynet.sk>
> Date: Tue, 31 Jan 2012 14:21:51 +0100 (CET)
>
>> - neither NVGRE nor VXLAN are part of the openvswitch for now
>
> Too bad, it means we'll have this new user API of yours so when
> openvswitch does add the necessary code we CANNOT REMOVE your stuff.
>
> I'm not applying this until you at least attempt an openvswitch
> version, and that's basically the end of this discussion.

I actually tried to deploy it on one of my Linux-based APs, and that's
when I realized that openvswitch has several limitations that prevent
me from using it on its own in my scenario. I tried a standard
point-to-point GRE tunnel from openvswitch, without any modifications
of my own, and found that it will never work properly because the
security parts (ebtables/arptables/iptables) are missing. So in my
scenario I ended up with the original bridge for security and an
openvswitch bridge with an openvswitch GRE tunnel, all linked together
by a veth link. The resulting performance (veth code + openvswitch
bridge + openvswitch GRE code) was worse than using only my gretap
code. On the other hand, if I ignore the missing security features
(and so also drop the original bridge code), the result is in favor of
openvswitch (the same result that Joseph provided).

About the new API: openvswitch uses its own GRE code, with its own
API. So if NVGRE or VXLAN is eventually added to openvswitch, leaving
my API as is doesn't break anything. Those using my API at that time
could consider migrating their scripts from the standard bridge code
with gretap interfaces to openvswitch with NVGRE or VXLAN code
instead. Until then they can use the bridge code with gretap
interfaces or the openvswitch code with the same gretap interfaces =>
both switches can benefit from it. So no harm done on this side -
maybe I am not seeing something that you do, am I?

About porting my code into openvswitch directly: I believe that
developing/porting something that will most probably be replaced
eventually by something based on RFC standards like NVGRE (which is
still being developed), and that cannot really be used in managed
networks like my own at all due to other missing features, is a huge
waste of anybody's time.

2012/1/31 Stephen Hemminger <shemminger@vyatta.com>:
> On Tue, 31 Jan 2012 14:21:51 +0100 (CET)
> Stefan Gula <steweg@ynet.sk> wrote:
>
>> From: Stefan Gula <steweg@gmail.com>
>>
>> This patch is an extension for current Ethernet over GRE
>> implementation, which allows user to create virtual bridge (multipoint
>> VPN) and forward traffic based on Ethernet MAC address information in
>> it. It simulates the Bridge behavior learning mechanism, but instead
>> of learning port ID from which given MAC address comes, it learns IP
>> address of peer which encapsulated given packet. Multicast, Broadcast
>> and unknown-multicast traffic is sent over network as multicast
>> encapsulated GRE packet, so one Ethernet multipoint GRE tunnel can be
>> represented as one single virtual switch on logical level and be also
>> represented as one multicast IPv4 address on network level.
>>
>> Signed-off-by: Stefan Gula <steweg@gmail.com>
>
> Have you looked at the NVGRE standard?
> http://tools.ietf.org/html/draft-sridharan-virtualization-nvgre-00

Yes, I did. One passage from section 3.1, NVGRE Endpoint:

  To encapsulate an Ethernet frame, the endpoint needs to know location
  information for the destination address in the frame. The way to
  obtain this information is not covered in this document and will be
  covered in a different draft. Any number of techniques can be used in
  the control plane to configure, discover and distribute the policy
  information. For the rest of this document we assume that the
  location information including TNI is readily available to the NVGRE
  endpoint.

From: Štefan Gula <steweg@ynet.sk>
Date: Wed, 1 Feb 2012 00:32:04 +0100

> About the new API: openvswitch uses its own GRE code, with its own
> API. So if NVGRE or VXLAN is eventually added to openvswitch, leaving
> my API as is doesn't break anything.

You don't understand.

If your code is superfluous in the end, we shouldn't add it in
the first place.

But if I do relent and let your code in now, we have to live
with it, and its associated maintenance costs, FOREVER.

That's why I'm forcing this to be implemented properly from the start,
so we don't end up with two pieces of code that provide essentially
the same functionality.

2012/2/1 David Miller <davem@davemloft.net>:
> From: Štefan Gula <steweg@ynet.sk>
> Date: Wed, 1 Feb 2012 00:32:04 +0100
>
> You don't understand.
>
> If your code is superfluous in the end, we shouldn't add it in
> the first place.
>
> But if I do relent and let your code in now, we have to live
> with it, and its associated maintenance costs, FOREVER.
>
> That's why I'm forcing this to be implemented properly from the start,
> so we don't end up with two pieces of code that provide essentially
> the same functionality.

I understand your strategic point about maintenance here and partially
agree with it. If I understand it correctly, the goal is to one day
have openvswitch as a full replacement for the Linux bridge code. On
the other hand, the gretap interface already exists in the kernel, so
that part of the code is currently also superfluous - what's the plan
for that particular piece of code? So if this is now only about the
maintenance of my code, I'll be more than happy to continue
maintaining it myself together with you guys. And if in the future it
is decided to remove the whole gretap code (not just my part) and
replace it with something else that provides the same or even more
functionality, I have absolutely no problem with that.

From: Štefan Gula <steweg@ynet.sk>
Date: Wed, 1 Feb 2012 08:22:02 +0100

> So if this is now only about the
> maintenance of my code, I'll be more than happy to continue
> maintaining it myself together with you guys.

You have a very warped understanding of what maintenance cost means.

Every time someone wants to change a core API in the networking,
your new code will need to be considered. Every time someone wants to
audit how an interface is used, your code adds to the audit burden.

And this is burden placed upon other people, even if you personally
"promise" to maintain this specific code snippet. This promise is
completely meaningless from a global kernel maintenance standpoint.

More code has a cost, no matter how well that specific piece of code
is maintained.

Therefore we don't add spurious code, and your code is spurious if it
will end up duplicating a more desirable implementation and interface
for this functionality.

The world has spun successfully countless times in the 18 years that
Linux hasn't had support for the feature you are so gravely interested
in including "right now", and I suspect it will spin successfully a
few more times while a proper implementation is ironed out.

2012/2/1 David Miller <davem@davemloft.net>:
> From: Štefan Gula <steweg@ynet.sk>
> Date: Wed, 1 Feb 2012 08:22:02 +0100
>
>> So if this is now only about the
>> maintenance of my code, I'll be more than happy to continue
>> maintaining it myself together with you guys.
>
> You have a very warped understanding of what maintenance cost means.
>
> Every time someone wants to change a core API in the networking,
> your new code will need to be considered. Every time someone wants to
> audit how an interface is used, your code adds to the audit burden.

That's true. But as I said, the gretap interface already exists in the
Linux kernel. Without my patch it simply uses the logic of a
point-to-point tunnel, or of a static point-to-multipoint tunnel using
a multicast IP address as destination. The point here is that the
maintenance cost is already there:

From the kernel API point of view, the functions that enable use of
the gre or gretap interface (init/exit/xmit/receive...) are already
there and maintained. That part of the code was modified as little as
possible. If that kernel API changes, it will also need to be changed
for the standard gre or gretap code, which in the end takes almost the
same amount of time to figure out, because the same functions are
called.

From the user-space/netlink API point of view, those functions
(open/close/add/change/del...) are already maintained - my patch only
adds one keyword, "bridge", so backward compatibility is maintained.
If that part of the API changes, it again affects the whole gre/gretap
API, and therefore almost the same amount of code needs to be checked.

The last reason for auditing the code is that you are developing
something new that doesn't exist yet, or trying to port the code
somewhere else. In that case we are talking about maintaining the
gre/gretap code itself and not some global API change - that's the
only relevant case where more time is needed, but it is expected (or
at least should be, by the developer).

If I missed something, please feel free to highlight it.

> And this is burden placed upon other people, even if you personally
> "promise" to maintain this specific code snippet. This promise is
> completely meaningless from a global kernel maintenance standpoint.

Yes, the burden is there, but it's minimal from a global point of view.

> Therefore we don't add spurious code, and your code is spurious if it
> will end up duplicating a more desirable implementation and interface
> for this functionality.

It cannot duplicate something that doesn't currently exist. VXLAN and
NVGRE are still being developed/designed, so in the end it could
easily happen that those "standards" will not do the same thing, or
not by the same methodology - it's completely up to those
developers/designers whether they adopt my implementation/design or
use something else. The openvswitch gre interface is currently the
same thing as the current gretap interface in the Linux kernel, with
some kind of caching code - nothing more.

2012/2/1 Štefan Gula <steweg@ynet.sk>:
> 2012/2/1 David Miller <davem@davemloft.net>:
I think that everything that could be done and said from my side to
provide you guys answers to hopefully all your questions and to get
this into kernel was done. So I would like to ask you to provide me
final feedback. Thanks

2012/2/7 Štefan Gula <steweg@ynet.sk>:
> 2012/2/1 Štefan Gula <steweg@ynet.sk>:
>> 2012/2/1 David Miller <davem@davemloft.net>:
> I think that everything that could be done and said from my side to
> provide you guys answers to hopefully all your questions and to get
> this into kernel was done. So I would like to ask you to provide me
> final feedback. Thanks
>
>
After staring at your message over ten minutes I have to say that you
really need to stop overnight work ASAP and take a hot shower and a cup
of hot coffee, then try to sort out answer to my question, what Ingo
Molnar did, like ANK and old David, in the past ten years, and how?

From: Štefan Gula <steweg@ynet.sk>
Date: Tue, 7 Feb 2012 10:10:04 +0100

> 2012/2/1 Štefan Gula <steweg@ynet.sk>:
>> 2012/2/1 David Miller <davem@davemloft.net>:
> I think that everything that could be done and said from my side to
> provide you guys answers to hopefully all your questions and to get
> this into kernel was done. So I would like to ask you to provide me
> final feedback. Thanks

Your patch will not be applied, you haven't said anything new to me,
you haven't given me any new information that would change my
position, so it should be no surprise to you that I still want you to
work towards a solution that uses openvswitch.

And btw, your inability to see our point of view on this matter in any
way, shape, or form, is really working to your disadvantage and is
undermining your ultimate goals.

You should seriously reconsider how you are going about this, because
right now I cringe when I see messages from you in my inbox and I bet
a lot of other people feel this way right now as well.

diff -uprN -X linux-3.2.1-orig/Documentation/dontdiff linux-3.2.1-orig/include/linux/if_tunnel.h linux/include/linux/if_tunnel.h
--- linux-3.2.1-orig/include/linux/if_tunnel.h	2012-01-27 13:38:56.000000000 +0000
+++ linux/include/linux/if_tunnel.h	2012-01-30 14:10:01.000000000 +0000
@@ -75,6 +75,7 @@ enum {
 	IFLA_GRE_TTL,
 	IFLA_GRE_TOS,
 	IFLA_GRE_PMTUDISC,
+	IFLA_GRE_BRIDGE,
 	__IFLA_GRE_MAX,
 };
 
diff -uprN -X linux-3.2.1-orig/Documentation/dontdiff linux-3.2.1-orig/include/net/ipip.h linux/include/net/ipip.h
--- linux-3.2.1-orig/include/net/ipip.h	2012-01-27 13:38:57.000000000 +0000
+++ linux/include/net/ipip.h	2012-01-30 14:10:01.000000000 +0000
@@ -27,6 +27,15 @@ struct ip_tunnel {
 	__u32 o_seqno; /* The last output seqno */
 	int hlen; /* Precalculated GRE header length */
 	int mlink;
+#ifdef CONFIG_NET_IPGRE_BRIDGE
+#define GRETAP_BR_HASH_BITS 8
+#define GRETAP_BR_HASH_SIZE (1 << GRETAP_BR_HASH_BITS)
+	struct hlist_head hash[GRETAP_BR_HASH_SIZE];
+	spinlock_t hash_lock;
+	unsigned long ageing_time;
+	struct timer_list gc_timer;
+	bool br_enabled;
+#endif
 
 	struct ip_tunnel_parm parms;
 
diff -uprN -X linux-3.2.1-orig/Documentation/dontdiff linux-3.2.1-orig/net/ipv4/Kconfig linux/net/ipv4/Kconfig
--- linux-3.2.1-orig/net/ipv4/Kconfig	2012-01-27 13:39:00.000000000 +0000
+++ linux/net/ipv4/Kconfig	2012-01-30 14:10:01.000000000 +0000
@@ -211,6 +211,15 @@ config NET_IPGRE_BROADCAST
 	  Network), but can be distributed all over the Internet. If you want
 	  to do that, say Y here and to "IP multicast routing" below.
 
+config NET_IPGRE_BRIDGE
+	bool "IP: Ethernet over multipoint GRE over IP"
+	depends on IP_MULTICAST && NET_IPGRE && NET_IPGRE_BROADCAST
+	help
+	  Allows you to use multipoint GRE VPN as virtual switch and interconnect
+	  several L2 endpoints over L3 routed infrastructure. It is useful for
+	  creating multipoint L2 VPNs which can be later used inside bridge
+	  interfaces If you want to use. GRE multipoint L2 VPN feature say Y.
+
 config IP_MROUTE
 	bool "IP: multicast routing"
 	depends on IP_MULTICAST
diff -uprN -X linux-3.2.1-orig/Documentation/dontdiff linux-3.2.1-orig/net/ipv4/ip_gre.c linux/net/ipv4/ip_gre.c
--- linux-3.2.1-orig/net/ipv4/ip_gre.c	2012-01-27 13:39:00.000000000 +0000
+++ linux/net/ipv4/ip_gre.c	2012-01-30 15:10:35.000000000 +0000
@@ -52,6 +52,11 @@
 #include <net/ip6_route.h>
 #endif
 
+#ifdef CONFIG_NET_IPGRE_BRIDGE
+#include <linux/jhash.h>
+#include <asm/unaligned.h>
+#endif
+
 /*
    Problems & solutions
    --------------------
@@ -134,6 +139,203 @@ struct ipgre_net {
 	struct net_device *fb_tunnel_dev;
 };
 
+#ifdef CONFIG_NET_IPGRE_BRIDGE
+	/*
+	 * This part of code includes codes to enable L2 ethernet
+	 * switch virtualization over IP routed infrastructure with
+	 * utilization of multicast capable endpoint using Ethernet
+	 * over GRE
+	 *
+	 * Author: Stefan Gula
+	 * Signed-off-by: Stefan Gula <steweg@gmail.com>
+	 */
+struct ipgre_tap_bridge_entry {
+	struct hlist_node hlist;
+	__be32 raddr;
+	unsigned char addr[ETH_ALEN];
+	unsigned long updated;
+	struct rcu_head rcu;
+};
+
+static u32 ipgre_salt __read_mostly;
+
+static inline int ipgre_tap_bridge_hash(const unsigned char *mac)
+{
+	u32 key = get_unaligned((u32 *)(mac + 2));
+
+	return jhash_1word(key, ipgre_salt) & (GRETAP_BR_HASH_SIZE - 1);
+}
+
+static inline int ipgre_tap_bridge_has_expired(const struct ip_tunnel *tunnel,
+		const struct ipgre_tap_bridge_entry *entry)
+{
+	return time_before_eq(entry->updated + tunnel->ageing_time,
+		jiffies);
+}
+
+static inline void ipgre_tap_bridge_delete(struct ipgre_tap_bridge_entry *entry)
+{
+	hlist_del_rcu(&entry->hlist);
+	kfree_rcu(entry, rcu);
+}
+
+static void ipgre_tap_bridge_cleanup(unsigned long _data)
+{
+	struct ip_tunnel *tunnel = (struct ip_tunnel *)_data;
+	unsigned long delay = tunnel->ageing_time;
+	unsigned long next_timer = jiffies + tunnel->ageing_time;
+	int i;
+
+	spin_lock(&tunnel->hash_lock);
+	for (i = 0; i < GRETAP_BR_HASH_SIZE; i++) {
+		struct ipgre_tap_bridge_entry *entry;
+		struct hlist_node *h, *n;
+
+		hlist_for_each_entry_safe(entry, h, n,
+			&tunnel->hash[i], hlist)
+		{
+			unsigned long this_timer;
+			this_timer = entry->updated + delay;
+			if (time_before_eq(this_timer, jiffies))
+				ipgre_tap_bridge_delete(entry);
+			else if (time_before(this_timer, next_timer))
+				next_timer = this_timer;
+		}
+	}
+	spin_unlock(&tunnel->hash_lock);
+	mod_timer(&tunnel->gc_timer, round_jiffies_up(next_timer));
+}
+
+static void ipgre_tap_bridge_flush(struct ip_tunnel *tunnel)
+{
+	int i;
+
+	spin_lock_bh(&tunnel->hash_lock);
+	for (i = 0; i < GRETAP_BR_HASH_SIZE; i++) {
+		struct ipgre_tap_bridge_entry *entry;
+		struct hlist_node *h, *n;
+
+		hlist_for_each_entry_safe(entry, h, n,
+			&tunnel->hash[i], hlist)
+		{
+			ipgre_tap_bridge_delete(entry);
+		}
+	}
+	spin_unlock_bh(&tunnel->hash_lock);
+}
+
+static struct ipgre_tap_bridge_entry *__ipgre_tap_bridge_get(
+	struct ip_tunnel *tunnel, const unsigned char *addr)
+{
+	struct hlist_node *h;
+	struct ipgre_tap_bridge_entry *entry;
+
+	hlist_for_each_entry_rcu(entry, h,
+		&tunnel->hash[ipgre_tap_bridge_hash(addr)], hlist) {
+		if (!compare_ether_addr(entry->addr, addr)) {
+			if (unlikely(ipgre_tap_bridge_has_expired(tunnel,
+				entry)))
+				break;
+			return entry;
+		}
+	}
+
+	return NULL;
+}
+
+static struct ipgre_tap_bridge_entry *ipgre_tap_bridge_find(
+	struct hlist_head *head,
+	const unsigned char *addr)
+{
+	struct hlist_node *h;
+	struct ipgre_tap_bridge_entry *entry;
+
+	hlist_for_each_entry(entry, h, head, hlist) {
+		if (!compare_ether_addr(entry->addr, addr))
+			return entry;
+	}
+	return NULL;
+}
+
+
+static struct ipgre_tap_bridge_entry *ipgre_tap_bridge_find_rcu(
+	struct hlist_head *head,
+	const unsigned char *addr)
+{
+	struct hlist_node *h;
+	struct ipgre_tap_bridge_entry *entry;
+
+	hlist_for_each_entry_rcu(entry, h, head, hlist) {
+		if (!compare_ether_addr(entry->addr, addr))
+			return entry;
+	}
+	return NULL;
+}
+
+static struct ipgre_tap_bridge_entry *ipgre_tap_bridge_create(
+	struct hlist_head *head,
+	__be32 source,
+	const unsigned char *addr)
+{
+	struct ipgre_tap_bridge_entry *entry;
+
+	entry = kmalloc(sizeof(*entry), GFP_ATOMIC);
+	if (entry) {
+		memcpy(entry->addr, addr, ETH_ALEN);
+		entry->raddr = source;
+		entry->updated = jiffies;
+		hlist_add_head_rcu(&entry->hlist, head);
+	}
+	return entry;
+}
+
+static __be32 ipgre_tap_bridge_get_raddr(struct ip_tunnel *tunnel,
+	const unsigned char *addr)
+{
+	__be32 raddr = 0;
+	struct ipgre_tap_bridge_entry *entry;
+
+	rcu_read_lock();
+	entry = __ipgre_tap_bridge_get(tunnel, addr);
+	if (entry)
+		raddr = entry->raddr;
+	rcu_read_unlock();
+
+	return raddr;
+}
+
+static void ipgre_tap_bridge_rcv(struct ip_tunnel *tunnel,
+	struct sk_buff *skb,
+	__be32 orig_source)
+{
+	const struct ethhdr *tethhdr;
+	struct hlist_head *head;
+	struct ipgre_tap_bridge_entry *entry;
+
+	if (ipv4_is_multicast(tunnel->parms.iph.daddr)) {
+		tethhdr = eth_hdr(skb);
+		if (!is_multicast_ether_addr(
+			tethhdr->h_source)) {
+			head = &tunnel->hash[
+				ipgre_tap_bridge_hash(tethhdr->h_source)];
+			entry = ipgre_tap_bridge_find_rcu(head,
+				tethhdr->h_source);
+			if (likely(entry)) {
+				entry->raddr = orig_source;
+				entry->updated = jiffies;
+			} else {
+				spin_lock(&tunnel->hash_lock);
+				if (!ipgre_tap_bridge_find(head,
+					tethhdr->h_source))
+					ipgre_tap_bridge_create(head,
+						orig_source,
+						tethhdr->h_source);
+				spin_unlock(&tunnel->hash_lock);
+			}
+		}
+	}
+}
+#endif
 /* Tunnel hash table */
 
 /*
@@ -562,6 +764,9 @@ static int ipgre_rcv(struct sk_buff *skb
 	struct ip_tunnel *tunnel;
 	int offset = 4;
 	__be16 gre_proto;
+#ifdef CONFIG_NET_IPGRE_BRIDGE
+	__be32 orig_source;
+#endif
 
 	if (!pskb_may_pull(skb, 16))
 		goto drop_nolock;
@@ -654,6 +859,9 @@ static int ipgre_rcv(struct sk_buff *skb
 
 		/* Warning: All skb pointers will be invalidated! */
 		if (tunnel->dev->type == ARPHRD_ETHER) {
+#ifdef CONFIG_NET_IPGRE_BRIDGE
+			orig_source = iph->saddr;
+#endif
 			if (!pskb_may_pull(skb, ETH_HLEN)) {
 				tunnel->dev->stats.rx_length_errors++;
 				tunnel->dev->stats.rx_errors++;
@@ -663,6 +871,10 @@ static int ipgre_rcv(struct sk_buff *skb
 			iph = ip_hdr(skb);
 			skb->protocol = eth_type_trans(skb, tunnel->dev);
 			skb_postpull_rcsum(skb, eth_hdr(skb), ETH_HLEN);
+#ifdef CONFIG_NET_IPGRE_BRIDGE
+			if (tunnel->br_enabled)
+				ipgre_tap_bridge_rcv(tunnel, skb, orig_source);
+#endif
 		}
 
 		tstats = this_cpu_ptr(tunnel->dev->tstats);
@@ -702,7 +914,7 @@ static netdev_tx_t ipgre_tunnel_xmit(str
 	struct iphdr *iph; /* Our new IP header */
 	unsigned int max_headroom; /* The extra header space needed */
 	int gre_hlen;
-	__be32 dst;
+	__be32 dst = 0;
 	int mtu;
 
 	if (dev->type == ARPHRD_ETHER)
@@ -716,7 +928,15 @@ static netdev_tx_t ipgre_tunnel_xmit(str
 		tiph = &tunnel->parms.iph;
 	}
 
-	if ((dst = tiph->daddr) == 0) {
+#ifdef CONFIG_NET_IPGRE_BRIDGE
+	if (tunnel->br_enabled && (dev->type == ARPHRD_ETHER) &&
+		ipv4_is_multicast(tunnel->parms.iph.daddr))
+		dst = ipgre_tap_bridge_get_raddr(tunnel,
+			((struct ethhdr *)skb->data)->h_dest);
+#endif
+	if (dst == 0)
+		dst = tiph->daddr;
+	if (dst == 0) {
 		/* NBMA tunnel */
 
 		if (skb_dst(skb) == NULL) {
@@ -1209,6 +1429,16 @@ static int ipgre_open(struct net_device
 			return -EADDRNOTAVAIL;
 		t->mlink = dev->ifindex;
 		ip_mc_inc_group(__in_dev_get_rtnl(dev), t->parms.iph.daddr);
+#ifdef CONFIG_NET_IPGRE_BRIDGE
+		if (t->dev->type == ARPHRD_ETHER) {
+			INIT_HLIST_HEAD(t->hash);
+			spin_lock_init(&t->hash_lock);
+			t->ageing_time = 300 * HZ;
+			setup_timer(&t->gc_timer, ipgre_tap_bridge_cleanup,
+				(unsigned long) t);
+			mod_timer(&t->gc_timer, jiffies + t->ageing_time);
+		}
+#endif
 	}
 	return 0;
 }
@@ -1219,6 +1449,12 @@ static int ipgre_close(struct net_device
 
 	if (ipv4_is_multicast(t->parms.iph.daddr) && t->mlink) {
 		struct in_device *in_dev;
+#ifdef CONFIG_NET_IPGRE_BRIDGE
+		if (t->dev->type == ARPHRD_ETHER) {
+			ipgre_tap_bridge_flush(t);
+			del_timer_sync(&t->gc_timer);
+		}
+#endif
 		in_dev = inetdev_by_index(dev_net(dev), t->mlink);
 		if (in_dev)
 			ip_mc_dec_group(in_dev, t->parms.iph.daddr);
@@ -1488,6 +1724,10 @@ static int ipgre_tap_init(struct net_dev
 static const struct net_device_ops ipgre_tap_netdev_ops = {
 	.ndo_init = ipgre_tap_init,
 	.ndo_uninit = ipgre_tunnel_uninit,
+#ifdef CONFIG_NET_IPGRE_BRIDGE
+	.ndo_open = ipgre_open,
+	.ndo_stop = ipgre_close,
+#endif
 	.ndo_start_xmit = ipgre_tunnel_xmit,
 	.ndo_set_mac_address = eth_mac_addr,
 	.ndo_validate_addr = eth_validate_addr,
@@ -1532,6 +1772,13 @@ static int ipgre_newlink(struct net *src
 	/* Can use a lockless transmit, unless we generate output sequences */
 	if (!(nt->parms.o_flags & GRE_SEQ))
 		dev->features |= NETIF_F_LLTX;
+#ifdef CONFIG_NET_IPGRE_BRIDGE
+	if (data && (!data[IFLA_GRE_BRIDGE] ||
+		nla_get_u8(data[IFLA_GRE_BRIDGE])))
+		nt->br_enabled = true;
+	else
+		nt->br_enabled = false;
+#endif
 
 	err = register_netdevice(dev);
 	if (err)
@@ -1588,6 +1835,16 @@ static int ipgre_changelink(struct net_d
 			memcpy(dev->dev_addr, &p.iph.saddr, 4);
 			memcpy(dev->broadcast, &p.iph.daddr, 4);
 		}
+#ifdef CONFIG_NET_IPGRE_BRIDGE
+		if (data && (!data[IFLA_GRE_BRIDGE] ||
+			nla_get_u8(data[IFLA_GRE_BRIDGE]))) {
+			t->br_enabled = true;
+		} else {
+			if(t->br_enabled)
+				ipgre_tap_bridge_flush(t);
+			t->br_enabled = false;
+		}
+#endif
 		ipgre_tunnel_link(ign, t);
 		netdev_state_change(dev);
 	}
@@ -1629,8 +1886,12 @@ static size_t ipgre_get_size(const struc
 		nla_total_size(1) +
 		/* IFLA_GRE_TOS */
 		nla_total_size(1) +
-		/* IFLA_GRE_PMTUDISC */
+		/* IFLA_GREPMTUDISC */
 		nla_total_size(1) +
+#ifdef CONFIG_NET_IPGRE_BRIDGE
+		/* IFLA_GRE_BRIDGE */
+		nla_total_size(1) +
+#endif
 		0;
 }
 
@@ -1649,7 +1910,9 @@ static int ipgre_fill_info(struct sk_buf
 	NLA_PUT_U8(skb, IFLA_GRE_TTL, p->iph.ttl);
 	NLA_PUT_U8(skb, IFLA_GRE_TOS, p->iph.tos);
 	NLA_PUT_U8(skb, IFLA_GRE_PMTUDISC, !!(p->iph.frag_off & htons(IP_DF)));
-
+#ifdef CONFIG_NET_IPGRE_BRIDGE
+	NLA_PUT_U8(skb, IFLA_GRE_BRIDGE, t->br_enabled);
+#endif
 	return 0;
 
 nla_put_failure:
@@ -1667,6 +1930,7 @@ static const struct nla_policy ipgre_pol
 	[IFLA_GRE_TTL] = { .type = NLA_U8 },
 	[IFLA_GRE_TOS] = { .type = NLA_U8 },
 	[IFLA_GRE_PMTUDISC] = { .type = NLA_U8 },
+	[IFLA_GRE_BRIDGE] = { .type = NLA_U8 },
 };
 
 static struct rtnl_link_ops ipgre_link_ops __read_mostly = {
@@ -1705,6 +1969,9 @@ static int __init ipgre_init(void)
 
 	printk(KERN_INFO "GRE over IPv4 tunneling driver\n");
 
+#ifdef CONFIG_NET_IPGRE_BRIDGE
+	get_random_bytes(&ipgre_salt, sizeof(ipgre_salt));
+#endif
 	err = register_pernet_device(&ipgre_net_ops);
 	if (err < 0)
 		return err;
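
The ageing check in the patch (`ipgre_tap_bridge_has_expired`) compares `entry->updated + tunnel->ageing_time` against `jiffies` with `time_before_eq()` rather than a plain `<=`, because the tick counter wraps around. The following userspace sketch shows why that matters. It is an assumption-laden illustration, not kernel code: `tick_before`, `tick_before_eq` and `entry_has_expired` are hypothetical stand-ins written to mimic the kernel macros, and the cast from `unsigned long` to `long` relies on the two's-complement behaviour of common compilers.

```c
#include <limits.h>

/* Stand-in for time_before(a, b): true if tick a is earlier than tick b,
 * even when the counter has wrapped between them. */
static int tick_before(unsigned long a, unsigned long b)
{
	return (long)(a - b) < 0;
}

/* Stand-in for time_before_eq(a, b): true if a is earlier than or equal
 * to b. The unsigned difference is reinterpreted as signed, so the result
 * stays correct as long as the two ticks are less than half the counter
 * range apart. */
static int tick_before_eq(unsigned long a, unsigned long b)
{
	return (long)(b - a) >= 0;
}

/* Same shape as the patch's expiry test: an entry is stale once its
 * deadline (last update + ageing time) is not after the current tick. */
static int entry_has_expired(unsigned long updated, unsigned long ageing,
			     unsigned long now)
{
	return tick_before_eq(updated + ageing, now);
}
```

For an entry updated 10 ticks before the counter wraps, with an ageing time of 20 ticks, the deadline wraps to a small number; a naive `deadline <= now` would declare the entry expired immediately, while the signed-difference form keeps it alive until the deadline is actually reached.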