Message ID | trinity-782c45c1-ef94-4340-8def-f949a4a929a9-1550334156901@3c-app-gmx-bs35 |
---|---|
State | RFC |
Delegated to: | David Miller |
Headers | show |
Series | [BUG,FIX] net: dsa: oops in br_vlan_enabled | expand |
Hi Frank, On 2/16/2019 8:22 AM, Frank Wunderlich wrote: > Hi, > > i've found an oops in 4.19.23/10, seems to be fixed anyhow in 5.0 (also works in 4.14.101) > > root@bpi-r2:~# ip link add link lan0 name lan0.5 type vlan id 5 > root@bpi-r2:~# ip addr add 192.168.5.200/24 brd 192.168.5.255 dev lan0.5 > root@bpi-r2:~# ip link set dev lan0 up > root@bpi-r2:~# ip link set dev lan0.5 up So that these steps don't involve a bridge, and because we don't (yet) implment ndo_vlan_rx_{add,kill}_vid() netdevice_ops, there is no VLAN programming, it's all software > > 12: lan0.5@lan0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state LOWERLAYERDOWN group default qlen 1000 > link/ether 02:02:02:02:02:02 brd ff:ff:ff:ff:ff:ff > inet 192.168.5.200/24 brd 192.168.5.255 scope global lan0.5 > valid_lft forever preferred_lft forever > > root@bpi-r2:~# brctl addbr bridge_name > root@bpi-r2:~# brctl addif bridge_name lan0.5 Unless you changed the bridge to have VLAN filtering/awareness with: echo 1 > /sys/class/net/bridge_name/bridge/vlan_filtering There will be no VLAN configuration/objects pushed to DSA since commit 2ea7a679ca2abd251c1ec03f20508619707e1749 ("net: dsa: Don't add vlans when vlan filtering is disabled") so I am not sure how you got into that situation because prior to pushing a VLAN object, the port must be part of a bridge, so the steps typically look like: - port_bridge_join which assigns dp->bridge_dev - port_vlan_add Does your 4.19.23 kernel somehow change how VLAN objects are pushed down the switch driver? > [ 352.057128] bridge_name: port 1(lan0.5) entered blocking state > [ 352.063065] bridge_name: port 1(lan0.5) entered disabled state > [ 352.069181] device lan0.5 entered promiscuous mode > [ 352.074018] device lan0 entered promiscuous mode > [ 352.078906] Unable to handle kernel NULL pointer dereference at virtual address 00000558 > ... > [ 352.493085] [<bf0fde88>] (br_vlan_enabled [bridge]) from [<bf12c234>] (dsa_port_vlan_add+0x60/0xbc [dsa_core]) > [ 352.503050] [<bf12c234>] (dsa_port_vlan_add [dsa_core]) from [<bf12cb64>] (dsa_slave_port_obj_add+0x4c/0x50 [dsa_core]) > [ 352.513776] [<bf12cb64>] (dsa_slave_port_obj_add [dsa_core]) from [<c0b4e2d4>] (__switchdev_port_obj_add+0x50/0xc4) > [ 352.524138] [<c0b4e2d4>] (__switchdev_port_obj_add) from [<c0b4e324>] (__switchdev_port_obj_add+0xa0/0xc4) > [ 352.533721] [<c0b4e324>] (__switchdev_port_obj_add) from [<c0b4e3a8>] (switchdev_port_obj_add_now+0x60/0x130) > [ 352.543562] [<c0b4e3a8>] (switchdev_port_obj_add_now) from [<c0b4e7e4>] (switchdev_port_obj_add+0x44/0x190) > [ 352.553284] [<c0b4e7e4>] (switchdev_port_obj_add) from [<bf1013d0>] (br_switchdev_port_vlan_add+0x60/0x7c [bridge]) > [ 352.563733] [<bf1013d0>] (br_switchdev_port_vlan_add [bridge]) from [<bf0ff250>] (__vlan_add+0xb0/0x620 [bridge]) > [ 352.574007] [<bf0ff250>] (__vlan_add [bridge]) from [<bf0ffd04>] (nbp_vlan_add+0xc4/0x150 [bridge]) > [ 352.583073] [<bf0ffd04>] (nbp_vlan_add [bridge]) from [<bf0ffec4>] (nbp_vlan_init+0x134/0x164 [bridge]) > [ 352.592482] [<bf0ffec4>] (nbp_vlan_init [bridge]) from [<bf0edd4c>] (br_add_if+0x40c/0x5fc [bridge]) > [ 352.601632] [<bf0edd4c>] (br_add_if [bridge]) from [<bf0eeb14>] (add_del_if+0x6c/0x80 [bridge]) > [ 352.610351] [<bf0eeb14>] (add_del_if [bridge]) from [<bf0ef5b0>] (br_dev_ioctl+0x7c/0x9c [bridge]) > [ 352.619290] [<bf0ef5b0>] (br_dev_ioctl [bridge]) from [<c09583d4>] (dev_ifsioc+0x184/0x324) > [ 352.627582] [<c09583d4>] (dev_ifsioc) from [<c09589e8>] (dev_ioctl+0x32c/0x5cc) > [ 352.634837] [<c09589e8>] (dev_ioctl) from [<c090913c>] (sock_ioctl+0x3bc/0x580) > > > since my 4.19.23 kernel is modified a bit i tried with 4.19.10 without my net modifications and it is still reproducable with steps above (create a vlan on dsa-user-port and then use it in a bridge) > > i fixed it with these changes: > > diff --git a/net/dsa/port.c b/net/dsa/port.c > index ed0595459df1..962887752ae8 100644 > --- a/net/dsa/port.c > +++ b/net/dsa/port.c > @@ -255,8 +255,9 @@ int dsa_port_vlan_add(struct dsa_port *dp, > if (netif_is_bridge_master(vlan->obj.orig_dev)) > return -EOPNOTSUPP; > > - if (br_vlan_enabled(dp->bridge_dev)) > - return dsa_port_notify(dp, DSA_NOTIFIER_VLAN_ADD, &info); > + printk(KERN_ALERT "DEBUG: Passed %s %d 0x%x \n",__FUNCTION__,__LINE__,(unsigned int)dp->bridge_dev); > + if (!dp->bridge_dev || br_vlan_enabled(dp->bridge_dev)) > + return dsa_port_notify(dp, DSA_NOTIFIER_VLAN_DEL, &info); > > return 0; > } > @@ -273,7 +274,7 @@ int dsa_port_vlan_del(struct dsa_port *dp, > if (netif_is_bridge_master(vlan->obj.orig_dev)) > return -EOPNOTSUPP; > > - if (br_vlan_enabled(dp->bridge_dev)) > + if (!dp->bridge_dev || br_vlan_enabled(dp->bridge_dev)) > return dsa_port_notify(dp, DSA_NOTIFIER_VLAN_DEL, &info); > > return 0; > > i've found in a Patch from florian/vivien: https://www.mail-archive.com/netdev@vger.kernel.org/msg281415.html > > Strange that 5.0-rc1 does not crash,because these 2 code-sections are unchanged: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/net/dsa/port.c#n255 https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/net/bridge/br_vlan.c#n788 > > maybe you know why only 4.19 is affected... > > regards Frank >
On 2/17/2019 9:13 AM, Florian Fainelli wrote: > Hi Frank, > > On 2/16/2019 8:22 AM, Frank Wunderlich wrote: >> Hi, >> >> i've found an oops in 4.19.23/10, seems to be fixed anyhow in 5.0 (also works in 4.14.101) >> >> root@bpi-r2:~# ip link add link lan0 name lan0.5 type vlan id 5 >> root@bpi-r2:~# ip addr add 192.168.5.200/24 brd 192.168.5.255 dev lan0.5 >> root@bpi-r2:~# ip link set dev lan0 up >> root@bpi-r2:~# ip link set dev lan0.5 up > > So that these steps don't involve a bridge, and because we don't (yet) > implment ndo_vlan_rx_{add,kill}_vid() netdevice_ops, there is no VLAN > programming, it's all software > >> >> 12: lan0.5@lan0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state LOWERLAYERDOWN group default qlen 1000 >> link/ether 02:02:02:02:02:02 brd ff:ff:ff:ff:ff:ff >> inet 192.168.5.200/24 brd 192.168.5.255 scope global lan0.5 >> valid_lft forever preferred_lft forever >> >> root@bpi-r2:~# brctl addbr bridge_name >> root@bpi-r2:~# brctl addif bridge_name lan0.5 > > Unless you changed the bridge to have VLAN filtering/awareness with: > > echo 1 > /sys/class/net/bridge_name/bridge/vlan_filtering > > There will be no VLAN configuration/objects pushed to DSA since commit > 2ea7a679ca2abd251c1ec03f20508619707e1749 ("net: dsa: Don't add vlans > when vlan filtering is disabled") so I am not sure how you got into that > situation because prior to pushing a VLAN object, the port must be part > of a bridge, so the steps typically look like: > > - port_bridge_join which assigns dp->bridge_dev > - port_vlan_add > > Does your 4.19.23 kernel somehow change how VLAN objects are pushed down > the switch driver? > >> [ 352.057128] bridge_name: port 1(lan0.5) entered blocking state >> [ 352.063065] bridge_name: port 1(lan0.5) entered disabled state >> [ 352.069181] device lan0.5 entered promiscuous mode >> [ 352.074018] device lan0 entered promiscuous mode >> [ 352.078906] Unable to handle kernel NULL pointer dereference at virtual address 00000558 >> ... >> [ 352.493085] [<bf0fde88>] (br_vlan_enabled [bridge]) from [<bf12c234>] (dsa_port_vlan_add+0x60/0xbc [dsa_core]) >> [ 352.503050] [<bf12c234>] (dsa_port_vlan_add [dsa_core]) from [<bf12cb64>] (dsa_slave_port_obj_add+0x4c/0x50 [dsa_core]) >> [ 352.513776] [<bf12cb64>] (dsa_slave_port_obj_add [dsa_core]) from [<c0b4e2d4>] (__switchdev_port_obj_add+0x50/0xc4) >> [ 352.524138] [<c0b4e2d4>] (__switchdev_port_obj_add) from [<c0b4e324>] (__switchdev_port_obj_add+0xa0/0xc4) >> [ 352.533721] [<c0b4e324>] (__switchdev_port_obj_add) from [<c0b4e3a8>] (switchdev_port_obj_add_now+0x60/0x130) >> [ 352.543562] [<c0b4e3a8>] (switchdev_port_obj_add_now) from [<c0b4e7e4>] (switchdev_port_obj_add+0x44/0x190) >> [ 352.553284] [<c0b4e7e4>] (switchdev_port_obj_add) from [<bf1013d0>] (br_switchdev_port_vlan_add+0x60/0x7c [bridge]) >> [ 352.563733] [<bf1013d0>] (br_switchdev_port_vlan_add [bridge]) from [<bf0ff250>] (__vlan_add+0xb0/0x620 [bridge]) >> [ 352.574007] [<bf0ff250>] (__vlan_add [bridge]) from [<bf0ffd04>] (nbp_vlan_add+0xc4/0x150 [bridge]) >> [ 352.583073] [<bf0ffd04>] (nbp_vlan_add [bridge]) from [<bf0ffec4>] (nbp_vlan_init+0x134/0x164 [bridge]) >> [ 352.592482] [<bf0ffec4>] (nbp_vlan_init [bridge]) from [<bf0edd4c>] (br_add_if+0x40c/0x5fc [bridge]) >> [ 352.601632] [<bf0edd4c>] (br_add_if [bridge]) from [<bf0eeb14>] (add_del_if+0x6c/0x80 [bridge]) >> [ 352.610351] [<bf0eeb14>] (add_del_if [bridge]) from [<bf0ef5b0>] (br_dev_ioctl+0x7c/0x9c [bridge]) >> [ 352.619290] [<bf0ef5b0>] (br_dev_ioctl [bridge]) from [<c09583d4>] (dev_ifsioc+0x184/0x324) >> [ 352.627582] [<c09583d4>] (dev_ifsioc) from [<c09589e8>] (dev_ioctl+0x32c/0x5cc) >> [ 352.634837] [<c09589e8>] (dev_ioctl) from [<c090913c>] (sock_ioctl+0x3bc/0x580) >> >> >> since my 4.19.23 kernel is modified a bit i tried with 4.19.10 without my net modifications and it is still reproducable with steps above (create a vlan on dsa-user-port and then use it in a bridge) >> >> i fixed it with these changes: Your fix is undoing the commit from Andrew I just referenced, so it is definitively not the right fix because it will push VLAN objects down to a non-VLAN aware bridge, not that this is really a problem, but it should not be happening anyway. The problem appears to be the following though: you are enslaving a VLAN device, which does not have switchdev_ops, so we recurse into the lower device, which is the DSA network device, which does have switchdev_ops defined. Once we are there we check the orig_dev against being a bridge master network device, but we are not checking that it's a VLAN device so we end-up assuming it is a DSA network device, we de-reference garbage by checking br_vlan_enabled(dp->bridge_dev) because there is simply no such data structure associated with the VLAN device. The attached patch should help. This is no longer a problem in newer kernels because the switchdev operations use a notifier which checks the target network device to be DSA.
Hi, tried your Patch but crashes the same way: [ 107.416972] [<bf0fde88>] (br_vlan_enabled [bridge]) from [<bf14e234>] (dsa_po rt_vlan_add+0x60/0xbc [dsa_core]) [ 107.426939] [<bf14e234>] (dsa_port_vlan_add [dsa_core]) from [<bf14eba4>] (ds a_slave_port_obj_add+0x64/0x68 [dsa_core]) [ 107.437666] [<bf14eba4>] (dsa_slave_port_obj_add [dsa_core]) from [<c0b4e684> ] (__switchdev_port_obj_add+0x50/0xc4) [ 107.448029] [<c0b4e684>] (__switchdev_port_obj_add) from [<c0b4e6d4>] (__swit chdev_port_obj_add+0xa0/0xc4) [ 107.457612] [<c0b4e6d4>] (__switchdev_port_obj_add) from [<c0b4e758>] (switch dev_port_obj_add_now+0x60/0x130) [ 107.467453] [<c0b4e758>] (switchdev_port_obj_add_now) from [<c0b4eb94>] (swit chdev_port_obj_add+0x44/0x190) [ 107.477181] [<c0b4eb94>] (switchdev_port_obj_add) from [<bf1013d0>] (br_switc hdev_port_vlan_add+0x60/0x7c [bridge]) [ 107.487644] [<bf1013d0>] (br_switchdev_port_vlan_add [bridge]) from [<bf0ff25 0>] (__vlan_add+0xb0/0x620 [bridge]) regards Frank > Gesendet: Montag, 18. Februar 2019 um 00:20 Uhr > Von: "Florian Fainelli" <f.fainelli@gmail.com> > An: "Frank Wunderlich" <frank-w@public-files.de>, netdev@vger.kernel.org, "Andrew Lunn" <andrew@lunn.ch>, "Vivien Didelot" <vivien.didelot@gmail.com> > Betreff: Re: [BUG] [FIX] net: dsa: oops in br_vlan_enabled > > > > On 2/17/2019 9:13 AM, Florian Fainelli wrote: > > Hi Frank, > > > > On 2/16/2019 8:22 AM, Frank Wunderlich wrote: > >> Hi, > >> > >> i've found an oops in 4.19.23/10, seems to be fixed anyhow in 5.0 (also works in 4.14.101) > >> > >> root@bpi-r2:~# ip link add link lan0 name lan0.5 type vlan id 5 > >> root@bpi-r2:~# ip addr add 192.168.5.200/24 brd 192.168.5.255 dev lan0.5 > >> root@bpi-r2:~# ip link set dev lan0 up > >> root@bpi-r2:~# ip link set dev lan0.5 up > > > > So that these steps don't involve a bridge, and because we don't (yet) > > implment ndo_vlan_rx_{add,kill}_vid() netdevice_ops, there is no VLAN > > programming, it's all software > > > >> > >> 12: lan0.5@lan0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state LOWERLAYERDOWN group default qlen 1000 > >> link/ether 02:02:02:02:02:02 brd ff:ff:ff:ff:ff:ff > >> inet 192.168.5.200/24 brd 192.168.5.255 scope global lan0.5 > >> valid_lft forever preferred_lft forever > >> > >> root@bpi-r2:~# brctl addbr bridge_name > >> root@bpi-r2:~# brctl addif bridge_name lan0.5 > > > > Unless you changed the bridge to have VLAN filtering/awareness with: > > > > echo 1 > /sys/class/net/bridge_name/bridge/vlan_filtering > > > > There will be no VLAN configuration/objects pushed to DSA since commit > > 2ea7a679ca2abd251c1ec03f20508619707e1749 ("net: dsa: Don't add vlans > > when vlan filtering is disabled") so I am not sure how you got into that > > situation because prior to pushing a VLAN object, the port must be part > > of a bridge, so the steps typically look like: > > > > - port_bridge_join which assigns dp->bridge_dev > > - port_vlan_add > > > > Does your 4.19.23 kernel somehow change how VLAN objects are pushed down > > the switch driver? > > > >> [ 352.057128] bridge_name: port 1(lan0.5) entered blocking state > >> [ 352.063065] bridge_name: port 1(lan0.5) entered disabled state > >> [ 352.069181] device lan0.5 entered promiscuous mode > >> [ 352.074018] device lan0 entered promiscuous mode > >> [ 352.078906] Unable to handle kernel NULL pointer dereference at virtual address 00000558 > >> ... > >> [ 352.493085] [<bf0fde88>] (br_vlan_enabled [bridge]) from [<bf12c234>] (dsa_port_vlan_add+0x60/0xbc [dsa_core]) > >> [ 352.503050] [<bf12c234>] (dsa_port_vlan_add [dsa_core]) from [<bf12cb64>] (dsa_slave_port_obj_add+0x4c/0x50 [dsa_core]) > >> [ 352.513776] [<bf12cb64>] (dsa_slave_port_obj_add [dsa_core]) from [<c0b4e2d4>] (__switchdev_port_obj_add+0x50/0xc4) > >> [ 352.524138] [<c0b4e2d4>] (__switchdev_port_obj_add) from [<c0b4e324>] (__switchdev_port_obj_add+0xa0/0xc4) > >> [ 352.533721] [<c0b4e324>] (__switchdev_port_obj_add) from [<c0b4e3a8>] (switchdev_port_obj_add_now+0x60/0x130) > >> [ 352.543562] [<c0b4e3a8>] (switchdev_port_obj_add_now) from [<c0b4e7e4>] (switchdev_port_obj_add+0x44/0x190) > >> [ 352.553284] [<c0b4e7e4>] (switchdev_port_obj_add) from [<bf1013d0>] (br_switchdev_port_vlan_add+0x60/0x7c [bridge]) > >> [ 352.563733] [<bf1013d0>] (br_switchdev_port_vlan_add [bridge]) from [<bf0ff250>] (__vlan_add+0xb0/0x620 [bridge]) > >> [ 352.574007] [<bf0ff250>] (__vlan_add [bridge]) from [<bf0ffd04>] (nbp_vlan_add+0xc4/0x150 [bridge]) > >> [ 352.583073] [<bf0ffd04>] (nbp_vlan_add [bridge]) from [<bf0ffec4>] (nbp_vlan_init+0x134/0x164 [bridge]) > >> [ 352.592482] [<bf0ffec4>] (nbp_vlan_init [bridge]) from [<bf0edd4c>] (br_add_if+0x40c/0x5fc [bridge]) > >> [ 352.601632] [<bf0edd4c>] (br_add_if [bridge]) from [<bf0eeb14>] (add_del_if+0x6c/0x80 [bridge]) > >> [ 352.610351] [<bf0eeb14>] (add_del_if [bridge]) from [<bf0ef5b0>] (br_dev_ioctl+0x7c/0x9c [bridge]) > >> [ 352.619290] [<bf0ef5b0>] (br_dev_ioctl [bridge]) from [<c09583d4>] (dev_ifsioc+0x184/0x324) > >> [ 352.627582] [<c09583d4>] (dev_ifsioc) from [<c09589e8>] (dev_ioctl+0x32c/0x5cc) > >> [ 352.634837] [<c09589e8>] (dev_ioctl) from [<c090913c>] (sock_ioctl+0x3bc/0x580) > >> > >> > >> since my 4.19.23 kernel is modified a bit i tried with 4.19.10 without my net modifications and it is still reproducable with steps above (create a vlan on dsa-user-port and then use it in a bridge) > >> > >> i fixed it with these changes: > > Your fix is undoing the commit from Andrew I just referenced, so it is > definitively not the right fix because it will push VLAN objects down to > a non-VLAN aware bridge, not that this is really a problem, but it > should not be happening anyway. > > The problem appears to be the following though: you are enslaving a VLAN > device, which does not have switchdev_ops, so we recurse into the lower > device, which is the DSA network device, which does have switchdev_ops > defined. Once we are there we check the orig_dev against being a bridge > master network device, but we are not checking that it's a VLAN device > so we end-up assuming it is a DSA network device, we de-reference > garbage by checking br_vlan_enabled(dp->bridge_dev) because there is > simply no such data structure associated with the VLAN device. > > The attached patch should help. > > This is no longer a problem in newer kernels because the switchdev > operations use a notifier which checks the target network device to be DSA. > -- > Florian >
On February 18, 2019 9:11:27 AM PST, Frank Wunderlich <frank-w@public-files.de> wrote: >Hi, > >tried your Patch but crashes the same way: Yes the lower_dev targeted by switched does point to the DSA slave network device, but we really don't have dp->bridge_dev assigned since the physical DSA port was not enslaved in the bridge. Will follow up with a correct version late today. > >[ 107.416972] [<bf0fde88>] (br_vlan_enabled [bridge]) from >[<bf14e234>] (dsa_po >rt_vlan_add+0x60/0xbc [dsa_core]) > >[ 107.426939] [<bf14e234>] (dsa_port_vlan_add [dsa_core]) from >[<bf14eba4>] (ds >a_slave_port_obj_add+0x64/0x68 [dsa_core]) > >[ 107.437666] [<bf14eba4>] (dsa_slave_port_obj_add [dsa_core]) from >[<c0b4e684> >] (__switchdev_port_obj_add+0x50/0xc4) > >[ 107.448029] [<c0b4e684>] (__switchdev_port_obj_add) from >[<c0b4e6d4>] (__swit >chdev_port_obj_add+0xa0/0xc4) > >[ 107.457612] [<c0b4e6d4>] (__switchdev_port_obj_add) from >[<c0b4e758>] (switch >dev_port_obj_add_now+0x60/0x130) > >[ 107.467453] [<c0b4e758>] (switchdev_port_obj_add_now) from >[<c0b4eb94>] (swit >chdev_port_obj_add+0x44/0x190) > >[ 107.477181] [<c0b4eb94>] (switchdev_port_obj_add) from [<bf1013d0>] >(br_switc >hdev_port_vlan_add+0x60/0x7c [bridge]) > >[ 107.487644] [<bf1013d0>] (br_switchdev_port_vlan_add [bridge]) from >[<bf0ff25 >0>] (__vlan_add+0xb0/0x620 [bridge]) > >regards Frank > > >> Gesendet: Montag, 18. Februar 2019 um 00:20 Uhr >> Von: "Florian Fainelli" <f.fainelli@gmail.com> >> An: "Frank Wunderlich" <frank-w@public-files.de>, >netdev@vger.kernel.org, "Andrew Lunn" <andrew@lunn.ch>, "Vivien >Didelot" <vivien.didelot@gmail.com> >> Betreff: Re: [BUG] [FIX] net: dsa: oops in br_vlan_enabled >> >> >> >> On 2/17/2019 9:13 AM, Florian Fainelli wrote: >> > Hi Frank, >> > >> > On 2/16/2019 8:22 AM, Frank Wunderlich wrote: >> >> Hi, >> >> >> >> i've found an oops in 4.19.23/10, seems to be fixed anyhow in 5.0 >(also works in 4.14.101) >> >> >> >> root@bpi-r2:~# ip link add link lan0 name lan0.5 type vlan id 5 >> >> root@bpi-r2:~# ip addr add 192.168.5.200/24 brd 192.168.5.255 dev >lan0.5 >> >> root@bpi-r2:~# ip link set dev lan0 up >> >> root@bpi-r2:~# ip link set dev lan0.5 up >> > >> > So that these steps don't involve a bridge, and because we don't >(yet) >> > implment ndo_vlan_rx_{add,kill}_vid() netdevice_ops, there is no >VLAN >> > programming, it's all software >> > >> >> >> >> 12: lan0.5@lan0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 >qdisc noqueue state LOWERLAYERDOWN group default qlen 1000 >> >> link/ether 02:02:02:02:02:02 brd ff:ff:ff:ff:ff:ff >> >> inet 192.168.5.200/24 brd 192.168.5.255 scope global lan0.5 >> >> valid_lft forever preferred_lft forever >> >> >> >> root@bpi-r2:~# brctl addbr bridge_name >> >> root@bpi-r2:~# brctl addif bridge_name lan0.5 >> > >> > Unless you changed the bridge to have VLAN filtering/awareness >with: >> > >> > echo 1 > /sys/class/net/bridge_name/bridge/vlan_filtering >> > >> > There will be no VLAN configuration/objects pushed to DSA since >commit >> > 2ea7a679ca2abd251c1ec03f20508619707e1749 ("net: dsa: Don't add >vlans >> > when vlan filtering is disabled") so I am not sure how you got into >that >> > situation because prior to pushing a VLAN object, the port must be >part >> > of a bridge, so the steps typically look like: >> > >> > - port_bridge_join which assigns dp->bridge_dev >> > - port_vlan_add >> > >> > Does your 4.19.23 kernel somehow change how VLAN objects are pushed >down >> > the switch driver? >> > >> >> [ 352.057128] bridge_name: port 1(lan0.5) entered blocking state >> >> [ 352.063065] bridge_name: port 1(lan0.5) entered disabled state >> >> [ 352.069181] device lan0.5 entered promiscuous mode >> >> [ 352.074018] device lan0 entered promiscuous mode >> >> [ 352.078906] Unable to handle kernel NULL pointer dereference at >virtual address 00000558 >> >> ... >> >> [ 352.493085] [<bf0fde88>] (br_vlan_enabled [bridge]) from >[<bf12c234>] (dsa_port_vlan_add+0x60/0xbc [dsa_core]) >> >> [ 352.503050] [<bf12c234>] (dsa_port_vlan_add [dsa_core]) from >[<bf12cb64>] (dsa_slave_port_obj_add+0x4c/0x50 [dsa_core]) >> >> [ 352.513776] [<bf12cb64>] (dsa_slave_port_obj_add [dsa_core]) >from [<c0b4e2d4>] (__switchdev_port_obj_add+0x50/0xc4) >> >> [ 352.524138] [<c0b4e2d4>] (__switchdev_port_obj_add) from >[<c0b4e324>] (__switchdev_port_obj_add+0xa0/0xc4) >> >> [ 352.533721] [<c0b4e324>] (__switchdev_port_obj_add) from >[<c0b4e3a8>] (switchdev_port_obj_add_now+0x60/0x130) >> >> [ 352.543562] [<c0b4e3a8>] (switchdev_port_obj_add_now) from >[<c0b4e7e4>] (switchdev_port_obj_add+0x44/0x190) >> >> [ 352.553284] [<c0b4e7e4>] (switchdev_port_obj_add) from >[<bf1013d0>] (br_switchdev_port_vlan_add+0x60/0x7c [bridge]) >> >> [ 352.563733] [<bf1013d0>] (br_switchdev_port_vlan_add [bridge]) >from [<bf0ff250>] (__vlan_add+0xb0/0x620 [bridge]) >> >> [ 352.574007] [<bf0ff250>] (__vlan_add [bridge]) from >[<bf0ffd04>] (nbp_vlan_add+0xc4/0x150 [bridge]) >> >> [ 352.583073] [<bf0ffd04>] (nbp_vlan_add [bridge]) from >[<bf0ffec4>] (nbp_vlan_init+0x134/0x164 [bridge]) >> >> [ 352.592482] [<bf0ffec4>] (nbp_vlan_init [bridge]) from >[<bf0edd4c>] (br_add_if+0x40c/0x5fc [bridge]) >> >> [ 352.601632] [<bf0edd4c>] (br_add_if [bridge]) from [<bf0eeb14>] >(add_del_if+0x6c/0x80 [bridge]) >> >> [ 352.610351] [<bf0eeb14>] (add_del_if [bridge]) from >[<bf0ef5b0>] (br_dev_ioctl+0x7c/0x9c [bridge]) >> >> [ 352.619290] [<bf0ef5b0>] (br_dev_ioctl [bridge]) from >[<c09583d4>] (dev_ifsioc+0x184/0x324) >> >> [ 352.627582] [<c09583d4>] (dev_ifsioc) from [<c09589e8>] >(dev_ioctl+0x32c/0x5cc) >> >> [ 352.634837] [<c09589e8>] (dev_ioctl) from [<c090913c>] >(sock_ioctl+0x3bc/0x580) >> >> >> >> >> >> since my 4.19.23 kernel is modified a bit i tried with 4.19.10 >without my net modifications and it is still reproducable with steps >above (create a vlan on dsa-user-port and then use it in a bridge) >> >> >> >> i fixed it with these changes: >> >> Your fix is undoing the commit from Andrew I just referenced, so it >is >> definitively not the right fix because it will push VLAN objects down >to >> a non-VLAN aware bridge, not that this is really a problem, but it >> should not be happening anyway. >> >> The problem appears to be the following though: you are enslaving a >VLAN >> device, which does not have switchdev_ops, so we recurse into the >lower >> device, which is the DSA network device, which does have >switchdev_ops >> defined. Once we are there we check the orig_dev against being a >bridge >> master network device, but we are not checking that it's a VLAN >device >> so we end-up assuming it is a DSA network device, we de-reference >> garbage by checking br_vlan_enabled(dp->bridge_dev) because there is >> simply no such data structure associated with the VLAN device. >> >> The attached patch should help. >> >> This is no longer a problem in newer kernels because the switchdev >> operations use a notifier which checks the target network device to >be DSA. >> -- >> Florian >>
diff --git a/net/dsa/port.c b/net/dsa/port.c index ed0595459df1..962887752ae8 100644 --- a/net/dsa/port.c +++ b/net/dsa/port.c @@ -255,8 +255,9 @@ int dsa_port_vlan_add(struct dsa_port *dp, if (netif_is_bridge_master(vlan->obj.orig_dev)) return -EOPNOTSUPP; - if (br_vlan_enabled(dp->bridge_dev)) - return dsa_port_notify(dp, DSA_NOTIFIER_VLAN_ADD, &info); + printk(KERN_ALERT "DEBUG: Passed %s %d 0x%x \n",__FUNCTION__,__LINE__,(unsigned int)dp->bridge_dev); + if (!dp->bridge_dev || br_vlan_enabled(dp->bridge_dev)) + return dsa_port_notify(dp, DSA_NOTIFIER_VLAN_DEL, &info); return 0; } @@ -273,7 +274,7 @@ int dsa_port_vlan_del(struct dsa_port *dp, if (netif_is_bridge_master(vlan->obj.orig_dev)) return -EOPNOTSUPP; - if (br_vlan_enabled(dp->bridge_dev)) + if (!dp->bridge_dev || br_vlan_enabled(dp->bridge_dev)) return dsa_port_notify(dp, DSA_NOTIFIER_VLAN_DEL, &info); return 0;