[v3,1/2] net: bridge: use mac_len in bridge forwarding

Message ID 20190902181000.25638-1-zahari.doychev@linux.com
State Changes Requested
Delegated to: David Miller

Commit Message

Zahari Doychev Sept. 2, 2019, 6:09 p.m. UTC
The bridge code cannot correctly forward packets that arrive from paths
which set up the SKBs in different ways. Some of these packets get corrupted
during forwarding because the amount pulled at the front is not always just
ETH_HLEN.

This happens e.g. when VLAN tags are pushed by using tc act_vlan on
ingress. Example configuration is provided below. The test setup consists
of two netdevs connected to external hosts. There is act_vlan on one of
them adding two vlan tags on ingress and removing the tags on egress.
The configuration is done using the following commands:

ip link add name br0 type bridge vlan_filtering 1
ip link set dev br0 up

ip link set dev net0 up
ip link set dev net0 master br0

ip link set dev net1 up
ip link set dev net1 master br0

bridge vlan add dev net0 vid 100 master
bridge vlan add dev br0 vid 100 self
bridge vlan add dev net1 vid 100 master

tc qdisc add dev net0 handle ffff: clsact
tc qdisc add dev net1 handle ffff: clsact

tc filter add dev net0 ingress pref 1 protocol all flower \
		  action vlan push id 10 pipe action vlan push id 100

tc filter add dev net0 egress pref 1 protocol 802.1q flower \
		  vlan_id 100 vlan_ethtype 802.1q cvlan_id 10 \
		  action vlan pop pipe action vlan pop

When using the setup above, the packets coming in on net0 get double tagged,
but the MAC header gets corrupted when the packets go out of net1.
In br_dev_queue_push_xmit, skb->data is pushed only by ETH_HLEN instead of
mac_len. This later causes validate_xmit_vlan to insert the outer vlan tag
behind the inner vlan tag, as skb->data does not point to the start of the
packet.

The problem is fixed by using skb->mac_len instead of ETH_HLEN, which makes
sure that the skb headers are correctly restored. This usually does not
change anything, except for local bridge transmits, which now need to set
skb->mac_len correctly in br_dev_xmit, and the broken case noted above.
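
The offset arithmetic can be illustrated with a small userspace sketch. This
is a toy model and not kernel code: struct toy_skb and the toy_* helpers are
hypothetical names, and it assumes that the ingress path left skb->data at
mac_header + mac_len with mac_len == ETH_HLEN + VLAN_HLEN after one tag was
moved into the payload.

```c
#include <assert.h>
#include <stddef.h>

#define ETH_HLEN  14 /* Ethernet header: DMAC + SMAC + EtherType */
#define VLAN_HLEN 4  /* one 802.1Q tag: TPID + TCI */

/* Toy stand-in for the few sk_buff fields discussed here (not the kernel struct). */
struct toy_skb {
	size_t mac_header; /* offset of the MAC header in the buffer */
	size_t data;       /* offset that skb->data would point at */
	size_t mac_len;    /* MAC header length, including in-payload tags */
};

/* Move data towards the start of the buffer, in the spirit of skb_push(). */
static void toy_push(struct toy_skb *skb, size_t len)
{
	assert(skb->data >= len);
	skb->data -= len;
}

/* Move data away from the start of the buffer, in the spirit of skb_pull(). */
static void toy_pull(struct toy_skb *skb, size_t len)
{
	skb->data += len;
}

/*
 * Distance between data and the MAC header after a pull of mac_len bytes
 * followed by a push of push_len bytes. Zero means data is back at the
 * start of the frame.
 */
static size_t toy_residual_offset(size_t mac_len, size_t push_len)
{
	struct toy_skb skb = { .mac_header = 32, .data = 32, .mac_len = mac_len };

	toy_pull(&skb, skb.mac_len); /* ingress side: data moves past the MAC header */
	toy_push(&skb, push_len);    /* egress side: try to restore the MAC header */
	return skb.data - skb.mac_header;
}
```

In this model, toy_residual_offset(ETH_HLEN + VLAN_HLEN, ETH_HLEN) leaves data
VLAN_HLEN bytes past the MAC header, while pushing by the same mac_len that was
pulled brings it back to offset zero.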

Signed-off-by: Zahari Doychev <zahari.doychev@linux.com>

---
v2->v3:
 - move cover letter description to commit message
---
 net/bridge/br_device.c  | 3 ++-
 net/bridge/br_forward.c | 4 ++--
 net/bridge/br_vlan.c    | 3 ++-
 3 files changed, 6 insertions(+), 4 deletions(-)

Comments

Toshiaki Makita Sept. 3, 2019, 11:37 a.m. UTC | #1
Hi Zahari,

Sorry for reviewing this late.

On 2019/09/03 3:09, Zahari Doychev wrote:
...
> @@ -466,13 +466,14 @@ static bool __allowed_ingress(const struct net_bridge *br,
>   		/* Tagged frame */
>   		if (skb->vlan_proto != br->vlan_proto) {
>   			/* Protocol-mismatch, empty out vlan_tci for new tag */
> -			skb_push(skb, ETH_HLEN);
> +			skb_push(skb, skb->mac_len);
>   			skb = vlan_insert_tag_set_proto(skb, skb->vlan_proto,
>   							skb_vlan_tag_get(skb));

I think we should insert vlan at skb->data, i.e. mac_header + mac_len, while this
function inserts the tag at mac_header + ETH_HLEN which is not always the correct
offset.

>   			if (unlikely(!skb))
>   				return false;
>   
>   			skb_pull(skb, ETH_HLEN);

Now skb->data is mac_header + ETH_HLEN which would be broken when mac_len is not
ETH_HLEN?

> +			skb_reset_network_header(skb);
>   			skb_reset_mac_len(skb);
>   			*vid = 0;
>   			tagged = false;
> 

Toshiaki Makita
Zahari Doychev Sept. 3, 2019, 1:36 p.m. UTC | #2
On Tue, Sep 03, 2019 at 08:37:36PM +0900, Toshiaki Makita wrote:
> Hi Zahari,
> 
> Sorry for reviewing this late.
> 
> On 2019/09/03 3:09, Zahari Doychev wrote:
> ...
> > @@ -466,13 +466,14 @@ static bool __allowed_ingress(const struct net_bridge *br,
> >   		/* Tagged frame */
> >   		if (skb->vlan_proto != br->vlan_proto) {
> >   			/* Protocol-mismatch, empty out vlan_tci for new tag */
> > -			skb_push(skb, ETH_HLEN);
> > +			skb_push(skb, skb->mac_len);
> >   			skb = vlan_insert_tag_set_proto(skb, skb->vlan_proto,
> >   							skb_vlan_tag_get(skb));
> 
> I think we should insert vlan at skb->data, i.e. mac_header + mac_len, while this
> function inserts the tag at mac_header + ETH_HLEN which is not always the correct
> offset.

Maybe I am misunderstanding the concern here, but this should make sure that
the VLAN tag from the skb is moved back into the payload as the outermost tag.
So it should follow the Ethernet header. For example, it looks like this:

VLAN1 in skb:
+------+------+-------+
| DMAC | SMAC | ETYPE |
+------+------+-------+

VLAN1 moved to payload:
+------+------+-------+-------+
| DMAC | SMAC | VLAN1 | ETYPE |
+------+------+-------+-------+

VLAN2 in skb:
+------+------+-------+-------+
| DMAC | SMAC | VLAN1 | ETYPE |
+------+------+-------+-------+

VLAN2 moved to payload:

+------+------+-------+-------+
| DMAC | SMAC | VLAN2 | VLAN1 | ....
+------+------+-------+-------+

Doing the skb push with mac_len makes sure that the VLAN tag is inserted at
the correct offset. For mac_len == ETH_HLEN this does not change the current
behaviour.
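
As a rough userspace illustration of the layouts above (a sketch on a plain
byte array, not the kernel's vlan_insert_tag_set_proto; insert_outer_tag and
tag_vid are hypothetical helper names), inserting each tag right after DMAC
and SMAC makes the most recently inserted tag the outermost one:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ETH_ALEN  6 /* length of one MAC address */
#define VLAN_HLEN 4 /* one 802.1Q tag: TPID + TCI */

/*
 * Insert an 802.1Q tag right after DMAC and SMAC, making it the outermost
 * tag. The buffer holds len valid bytes and must have room for VLAN_HLEN
 * more. Returns the new frame length.
 */
static size_t insert_outer_tag(uint8_t *frame, size_t len,
			       uint16_t tpid, uint16_t tci)
{
	const size_t off = 2 * ETH_ALEN;

	assert(len >= off);
	/* Shift everything after the addresses back by one tag length. */
	memmove(frame + off + VLAN_HLEN, frame + off, len - off);
	frame[off]     = (uint8_t)(tpid >> 8);
	frame[off + 1] = (uint8_t)(tpid & 0xff);
	frame[off + 2] = (uint8_t)(tci >> 8);
	frame[off + 3] = (uint8_t)(tci & 0xff);
	return len + VLAN_HLEN;
}

/* Read the VLAN ID of the tag at position index (0 is the outermost tag). */
static uint16_t tag_vid(const uint8_t *frame, size_t index)
{
	const uint8_t *tci = frame + 2 * ETH_ALEN + index * VLAN_HLEN + 2;

	return (uint16_t)(((tci[0] << 8) | tci[1]) & 0x0fff);
}
```

Calling insert_outer_tag() first with VID 10 and then with VID 100 on an
untagged frame gives the "VLAN2 | VLAN1" order shown in the last diagram,
with VID 100 outermost.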

> 
> >   			if (unlikely(!skb))
> >   				return false;
> >   			skb_pull(skb, ETH_HLEN);
> 
> Now skb->data is mac_header + ETH_HLEN which would be broken when mac_len is not
> ETH_HLEN?

I thought it would be better in this case to point to the outer tag, as otherwise,
if mac_len is used, skb->data will point to the next tag, which I find somewhat
inconsistent. Or do you see some case where this can cause problems?


> 
> > +			skb_reset_network_header(skb);
> >   			skb_reset_mac_len(skb);
> >   			*vid = 0;
> >   			tagged = false;
> > 
> 
> Toshiaki Makita
Toshiaki Makita Sept. 4, 2019, 7:14 a.m. UTC | #3
On 2019/09/03 22:36, Zahari Doychev wrote:
> On Tue, Sep 03, 2019 at 08:37:36PM +0900, Toshiaki Makita wrote:
>> Hi Zahari,
>>
>> Sorry for reviewing this late.
>>
>> On 2019/09/03 3:09, Zahari Doychev wrote:
>> ...
>>> @@ -466,13 +466,14 @@ static bool __allowed_ingress(const struct net_bridge *br,
>>>    		/* Tagged frame */
>>>    		if (skb->vlan_proto != br->vlan_proto) {
>>>    			/* Protocol-mismatch, empty out vlan_tci for new tag */
>>> -			skb_push(skb, ETH_HLEN);
>>> +			skb_push(skb, skb->mac_len);
>>>    			skb = vlan_insert_tag_set_proto(skb, skb->vlan_proto,
>>>    							skb_vlan_tag_get(skb));
>>
>> I think we should insert vlan at skb->data, i.e. mac_header + mac_len, while this
>> function inserts the tag at mac_header + ETH_HLEN which is not always the correct
>> offset.
> 
> Maybe I am misunderstanding the concern here but this should make sure that
> the VLAN tag from the skb is move back in the payload as the outer most tag.
> So it should follow the ethernet header. It looks like this e.g.,:
> 
> VLAN1 in skb:
> +------+------+-------+
> | DMAC | SMAC | ETYPE |
> +------+------+-------+
> 
> VLAN1 moved to payload:
> +------+------+-------+-------+
> | DMAC | SMAC | VLAN1 | ETYPE |
> +------+------+-------+-------+
> 
> VLAN2 in skb:
> +------+------+-------+-------+
> | DMAC | SMAC | VLAN1 | ETYPE |
> +------+------+-------+-------+
> 
> VLAN2 moved to payload:
> 
> +------+------+-------+-------+
> | DMAC | SMAC | VLAN2 | VLAN1 | ....
> +------+------+-------+-------+
> 
> Doing the skb push with mac_len makes sure that VLAN tag is inserted in the
> correct offset. For mac_len == ETH_HLEN this does not change the current
> behaviour.

Reordering VLAN headers here does not look correct to me. If skb->data points past ETH+VLAN,
then we should insert the vlan at that offset.
Vlan devices with reorder_hdr disabled produce packets whose mac_len includes the ETH+VLAN header,
and they expect vlan insertion after the outer vlan header.
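
A small userspace sketch of what inserting "at the offset" could mean (this is
one reading of the point above and not kernel code; tag_offset_at_data and
insert_tag_at are hypothetical names). It assumes the new TPID+TCI pair is
written just before the last EtherType field covered by mac_len:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ETH_HLEN  14 /* Ethernet header: DMAC + SMAC + EtherType */
#define VLAN_HLEN 4  /* one 802.1Q tag: TPID + TCI */

/*
 * Byte offset for a new TPID+TCI pair when the tag is inserted at the end of
 * the MAC header, i.e. just before the final EtherType covered by mac_len.
 */
static size_t tag_offset_at_data(size_t mac_len)
{
	assert(mac_len >= ETH_HLEN);
	return mac_len - 2;
}

/* Insert TPID+TCI at byte offset off; returns the new frame length. */
static size_t insert_tag_at(uint8_t *frame, size_t len, size_t off,
			    uint16_t tpid, uint16_t tci)
{
	assert(off <= len);
	/* Make room for the tag by shifting the rest of the frame back. */
	memmove(frame + off + VLAN_HLEN, frame + off, len - off);
	frame[off]     = (uint8_t)(tpid >> 8);
	frame[off + 1] = (uint8_t)(tpid & 0xff);
	frame[off + 2] = (uint8_t)(tci >> 8);
	frame[off + 3] = (uint8_t)(tci & 0xff);
	return len + VLAN_HLEN;
}
```

With mac_len == ETH_HLEN the offset is 12, the usual position after the MAC
addresses. With mac_len == ETH_HLEN + VLAN_HLEN it is 16, so the new tag ends
up behind the existing outer tag instead of in front of it.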

Also I'm not sure there is a standard Ethernet header in mac_len, as mac_len is not ETH_HLEN.
E.g. tun devices can produce vlan packets without an Ethernet header.

> 
>>
>>>    			if (unlikely(!skb))
>>>    				return false;
>>>    			skb_pull(skb, ETH_HLEN);
>>
>> Now skb->data is mac_header + ETH_HLEN which would be broken when mac_len is not
>> ETH_HLEN?
> 
> I thought it would be better to point in this case to the outer tag as otherwise
> if mac_len is used the skb->data will point to the next tag which I find somehow
> inconsistent or do you see some case where this can cause problems?

Vlan devices with reorder_hdr off will break because they rely on the skb->data offset,
as I described in the previous discussion.

Toshiaki Makita
Zahari Doychev Sept. 4, 2019, 2:32 p.m. UTC | #4
On Wed, Sep 04, 2019 at 04:14:28PM +0900, Toshiaki Makita wrote:
> On 2019/09/03 22:36, Zahari Doychev wrote:
> > On Tue, Sep 03, 2019 at 08:37:36PM +0900, Toshiaki Makita wrote:
> > > Hi Zahari,
> > > 
> > > Sorry for reviewing this late.
> > > 
> > > On 2019/09/03 3:09, Zahari Doychev wrote:
> > > ...
> > > > @@ -466,13 +466,14 @@ static bool __allowed_ingress(const struct net_bridge *br,
> > > >    		/* Tagged frame */
> > > >    		if (skb->vlan_proto != br->vlan_proto) {
> > > >    			/* Protocol-mismatch, empty out vlan_tci for new tag */
> > > > -			skb_push(skb, ETH_HLEN);
> > > > +			skb_push(skb, skb->mac_len);
> > > >    			skb = vlan_insert_tag_set_proto(skb, skb->vlan_proto,
> > > >    							skb_vlan_tag_get(skb));
> > > 
> > > I think we should insert vlan at skb->data, i.e. mac_header + mac_len, while this
> > > function inserts the tag at mac_header + ETH_HLEN which is not always the correct
> > > offset.
> > 
> > Maybe I am misunderstanding the concern here but this should make sure that
> > the VLAN tag from the skb is move back in the payload as the outer most tag.
> > So it should follow the ethernet header. It looks like this e.g.,:
> > 
> > VLAN1 in skb:
> > +------+------+-------+
> > | DMAC | SMAC | ETYPE |
> > +------+------+-------+
> > 
> > VLAN1 moved to payload:
> > +------+------+-------+-------+
> > | DMAC | SMAC | VLAN1 | ETYPE |
> > +------+------+-------+-------+
> > 
> > VLAN2 in skb:
> > +------+------+-------+-------+
> > | DMAC | SMAC | VLAN1 | ETYPE |
> > +------+------+-------+-------+
> > 
> > VLAN2 moved to payload:
> > 
> > +------+------+-------+-------+
> > | DMAC | SMAC | VLAN2 | VLAN1 | ....
> > +------+------+-------+-------+
> > 
> > Doing the skb push with mac_len makes sure that VLAN tag is inserted in the
> > correct offset. For mac_len == ETH_HLEN this does not change the current
> > behaviour.
> 
> Reordering VLAN headers here does not look correct to me. If skb->data points to ETH+VLAN,
> then we should insert the vlan at the offset.
> Vlan devices with reorder_hdr disabled produce packets whose mac_len includes ETH+VLAN header,
> and they expects vlan insertion after the outer vlan header.

I see, so this case should be handled differently, as it seems that sometimes
we have to insert after and sometimes before the tag in the packet. I am not
quite sure whether this can be detected here. I was trying to get bridging
working with VLAN devices with reorder_hdr disabled, but I was not able to get
mac_len longer than ETH_HLEN in any of the cases that I tried. Can you provide
an example of how I can try this out? It would really help me to understand
the problem better.

> 
> Also I'm not sure there is standard ethernet header in mac_len, as mac_len is not ETH_HLEN.
> E.g. tun devices can produce vlan packets without ehternet header.

How is the bridge forwarding decision done in this case when there are no
MAC addresses, vlan based only?

> 
> > 
> > > 
> > > >    			if (unlikely(!skb))
> > > >    				return false;
> > > >    			skb_pull(skb, ETH_HLEN);
> > > 
> > > Now skb->data is mac_header + ETH_HLEN which would be broken when mac_len is not
> > > ETH_HLEN?
> > 
> > I thought it would be better to point in this case to the outer tag as otherwise
> > if mac_len is used the skb->data will point to the next tag which I find somehow
> > inconsistent or do you see some case where this can cause problems?
> 
> Vlan devices with reorder_hdr off will break because it relies on skb->data offset
> as I described in the previous discussion.

I also see in vlan_do_receive that the VLAN tag is moved to the payload when
reorder_hdr is off and the vlan_dev is not a bridge port. So it seems that
I am misunderstanding the reorder_hdr option. If you could give me some more
details about how it is supposed to be used, it would be highly appreciated.

Thanks
Zahari

> 
> Toshiaki Makita
Toshiaki Makita Sept. 5, 2019, 11:20 a.m. UTC | #5
On 2019/09/04 23:32, Zahari Doychev wrote:
> On Wed, Sep 04, 2019 at 04:14:28PM +0900, Toshiaki Makita wrote:
>> On 2019/09/03 22:36, Zahari Doychev wrote:
>>> On Tue, Sep 03, 2019 at 08:37:36PM +0900, Toshiaki Makita wrote:
>>>> Hi Zahari,
>>>>
>>>> Sorry for reviewing this late.
>>>>
>>>> On 2019/09/03 3:09, Zahari Doychev wrote:
>>>> ...
>>>>> @@ -466,13 +466,14 @@ static bool __allowed_ingress(const struct net_bridge *br,
>>>>>     		/* Tagged frame */
>>>>>     		if (skb->vlan_proto != br->vlan_proto) {
>>>>>     			/* Protocol-mismatch, empty out vlan_tci for new tag */
>>>>> -			skb_push(skb, ETH_HLEN);
>>>>> +			skb_push(skb, skb->mac_len);
>>>>>     			skb = vlan_insert_tag_set_proto(skb, skb->vlan_proto,
>>>>>     							skb_vlan_tag_get(skb));
>>>>
>>>> I think we should insert vlan at skb->data, i.e. mac_header + mac_len, while this
>>>> function inserts the tag at mac_header + ETH_HLEN which is not always the correct
>>>> offset.
>>>
>>> Maybe I am misunderstanding the concern here but this should make sure that
>>> the VLAN tag from the skb is move back in the payload as the outer most tag.
>>> So it should follow the ethernet header. It looks like this e.g.,:
>>>
>>> VLAN1 in skb:
>>> +------+------+-------+
>>> | DMAC | SMAC | ETYPE |
>>> +------+------+-------+
>>>
>>> VLAN1 moved to payload:
>>> +------+------+-------+-------+
>>> | DMAC | SMAC | VLAN1 | ETYPE |
>>> +------+------+-------+-------+
>>>
>>> VLAN2 in skb:
>>> +------+------+-------+-------+
>>> | DMAC | SMAC | VLAN1 | ETYPE |
>>> +------+------+-------+-------+
>>>
>>> VLAN2 moved to payload:
>>>
>>> +------+------+-------+-------+
>>> | DMAC | SMAC | VLAN2 | VLAN1 | ....
>>> +------+------+-------+-------+
>>>
>>> Doing the skb push with mac_len makes sure that VLAN tag is inserted in the
>>> correct offset. For mac_len == ETH_HLEN this does not change the current
>>> behaviour.
>>
>> Reordering VLAN headers here does not look correct to me. If skb->data points to ETH+VLAN,
>> then we should insert the vlan at the offset.
>> Vlan devices with reorder_hdr disabled produce packets whose mac_len includes ETH+VLAN header,
>> and they expects vlan insertion after the outer vlan header.
> 
> I see so in this case we should handle differently as it seems sometimes
> we have to insert after or before the tag in the packet. I am not quite sure
> if this is possible to be detected here. I was trying to do bridging with VLAN
> devices with reorder_hdr disabled working but somehow I was not able to get
> mac_len longer then ETH_HLEN in all cases that I tried. Can you provide some
> example how can I try this out? It will really help me to understand the
> problem better.

I'm not sure if there is a case where we should insert tags before the data pointer.
Your case does not look valid to me because the skb is already broken in TC (I think I
explained this in the previous discussion). The bridge should not work around the broken skb.

>> Also I'm not sure there is standard ethernet header in mac_len, as mac_len is not ETH_HLEN.
>> E.g. tun devices can produce vlan packets without ehternet header.
> 
> How is the bridge forwarding decision done in this case when there are no
> MAC addresses, vlan based only?

Tun is just an example of a header shorter than we expect. It's more like an attack vector.
So maybe it's sufficient to make sure we don't crash or write data to an unexpected offset
for such packets. Or if such packets cannot make it to this point, that's ok.
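
As a sketch of the kind of guard I mean (userspace only; try_push_mac_header
is a hypothetical name, not an existing kernel helper, and the exact
conditions are an assumption), the push could be refused when mac_len cannot
describe a full Ethernet header or does not match the current data offset:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define ETH_HLEN 14 /* Ethernet header: DMAC + SMAC + EtherType */

/*
 * Hypothetical pre-check before pushing mac_len bytes. Offsets are relative
 * to the start of the buffer. On success, stores the data offset after the
 * push and returns true. On failure, leaves *new_data_off untouched so the
 * caller can drop the packet instead of writing at an unexpected offset.
 */
static bool try_push_mac_header(size_t mac_header_off, size_t data_off,
				size_t mac_len, size_t *new_data_off)
{
	assert(new_data_off != NULL);

	/* Too short to hold DMAC, SMAC and EtherType. */
	if (mac_len < ETH_HLEN)
		return false;

	/* The push must land exactly on the MAC header. */
	if (data_off < mac_header_off || data_off - mac_header_off != mac_len)
		return false;

	*new_data_off = data_off - mac_len;
	return true;
}
```

A packet with a 4-byte "MAC header" or with data not sitting at
mac_header + mac_len would be rejected here rather than forwarded.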

> 
>>
>>>
>>>>
>>>>>     			if (unlikely(!skb))
>>>>>     				return false;
>>>>>     			skb_pull(skb, ETH_HLEN);
>>>>
>>>> Now skb->data is mac_header + ETH_HLEN which would be broken when mac_len is not
>>>> ETH_HLEN?
>>>
>>> I thought it would be better to point in this case to the outer tag as otherwise
>>> if mac_len is used the skb->data will point to the next tag which I find somehow
>>> inconsistent or do you see some case where this can cause problems?
>>
>> Vlan devices with reorder_hdr off will break because it relies on skb->data offset
>> as I described in the previous discussion.
> 
> I also see in vlan_do_receive that the VLAN tag is moved to the payload when
> reorder_hdr is off and the vlan_dev is not a bridge port. So it seems that
> I am misunderstanding the reorder_hdr option so if you can give me some more
> details about how it is supposed to be used will be highly appreciated.

No, you don't misunderstand it. I just forgot that the condition was added.

Now that reorder_hdr does not look like a problem, I have lost my reason to handle
the mac_len != ETH_HLEN case, as I think this change should not be a workaround for your problem.
If we fix the broken data pointer in TC, there should not be problems with mac_len in the bridge.
Do you have any other possible cases this works for?

Toshiaki Makita

Patch

diff --git a/net/bridge/br_device.c b/net/bridge/br_device.c
index 681b72862c16..aeb77ff60311 100644
--- a/net/bridge/br_device.c
+++ b/net/bridge/br_device.c
@@ -55,8 +55,9 @@  netdev_tx_t br_dev_xmit(struct sk_buff *skb, struct net_device *dev)
 	BR_INPUT_SKB_CB(skb)->frag_max_size = 0;
 
 	skb_reset_mac_header(skb);
+	skb_reset_mac_len(skb);
 	eth = eth_hdr(skb);
-	skb_pull(skb, ETH_HLEN);
+	skb_pull(skb, skb->mac_len);
 
 	if (!br_allowed_ingress(br, br_vlan_group_rcu(br), skb, &vid))
 		goto out;
diff --git a/net/bridge/br_forward.c b/net/bridge/br_forward.c
index 86637000f275..edb4f3533f05 100644
--- a/net/bridge/br_forward.c
+++ b/net/bridge/br_forward.c
@@ -32,7 +32,7 @@  static inline int should_deliver(const struct net_bridge_port *p,
 
 int br_dev_queue_push_xmit(struct net *net, struct sock *sk, struct sk_buff *skb)
 {
-	skb_push(skb, ETH_HLEN);
+	skb_push(skb, skb->mac_len);
 	if (!is_skb_forwardable(skb->dev, skb))
 		goto drop;
 
@@ -94,7 +94,7 @@  static void __br_forward(const struct net_bridge_port *to,
 		net = dev_net(indev);
 	} else {
 		if (unlikely(netpoll_tx_running(to->br->dev))) {
-			skb_push(skb, ETH_HLEN);
+			skb_push(skb, skb->mac_len);
 			if (!is_skb_forwardable(skb->dev, skb))
 				kfree_skb(skb);
 			else
diff --git a/net/bridge/br_vlan.c b/net/bridge/br_vlan.c
index bb98984cd27d..419067b314d7 100644
--- a/net/bridge/br_vlan.c
+++ b/net/bridge/br_vlan.c
@@ -466,13 +466,14 @@  static bool __allowed_ingress(const struct net_bridge *br,
 		/* Tagged frame */
 		if (skb->vlan_proto != br->vlan_proto) {
 			/* Protocol-mismatch, empty out vlan_tci for new tag */
-			skb_push(skb, ETH_HLEN);
+			skb_push(skb, skb->mac_len);
 			skb = vlan_insert_tag_set_proto(skb, skb->vlan_proto,
 							skb_vlan_tag_get(skb));
 			if (unlikely(!skb))
 				return false;
 
 			skb_pull(skb, ETH_HLEN);
+			skb_reset_network_header(skb);
 			skb_reset_mac_len(skb);
 			*vid = 0;
 			tagged = false;