Message ID | 20180605220404.6425-1-aring@mojatatu.com |
---|---|
State | Changes Requested, archived |
Delegated to: | David Miller |
Headers | show |
Series | [net] net: ipv6: ip6_output: alloc skb with tailroom | expand |
From: Alexander Aring <aring@mojatatu.com> Date: Tue, 5 Jun 2018 18:04:04 -0400 > This patch adds care about tailroom length for allocate a skb from ipv6 > level stack. In case of 6lowpan we had the problem the skb runs into a > skb_over_panic() in some special length cases. The root was there was no > tailroom allocated for the IEEE 802.15.4 checksum, although we had > the necessary tailroom specified inside the netdev structure. > > Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=195059 > Reported-by: David Palma <david.palma@ntnu.no> > Reported-by: Rabi Narayan Sahoo <rabinarayans0828@gmail.com> > Signed-off-by: Alexander Aring <aring@mojatatu.com> needed_tailroom is an optimization to avoid SKB reallocations and adjustments, it is not a guarantee. If you are seeing crashes, it means code is assuming something which is not to be assumed. Whatever code is involved, it needs to check that the necessary tailroom is there and reallocate if necessary, rather than blindly pushing past the end of the SKB data.
Hi, On Wed, Jun 06, 2018 at 01:53:39PM -0400, David Miller wrote: > From: Alexander Aring <aring@mojatatu.com> > Date: Tue, 5 Jun 2018 18:04:04 -0400 > > > This patch adds care about tailroom length for allocate a skb from ipv6 > > level stack. In case of 6lowpan we had the problem the skb runs into a > > skb_over_panic() in some special length cases. The root was there was no > > tailroom allocated for the IEEE 802.15.4 checksum, although we had > > the necessary tailroom specified inside the netdev structure. > > > > Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=195059 > > Reported-by: David Palma <david.palma@ntnu.no> > > Reported-by: Rabi Narayan Sahoo <rabinarayans0828@gmail.com> > > Signed-off-by: Alexander Aring <aring@mojatatu.com> > > needed_tailroom is an optimization to avoid SKB reallocations > and adjustments, it is not a guarantee. > okay, then you want to have this patch for net-next? As an optimization? Of course, when it's open again. > If you are seeing crashes, it means code is assuming something which > is not to be assumed. > > Whatever code is involved, it needs to check that the necessary > tailroom is there and reallocate if necessary, rather than > blindly pushing past the end of the SKB data. > I see, I will add checks and reallocs (if necessary) in the underlaying subsystem level. Thanks for clarifying this. - Alex
From: Alexander Aring <aring@mojatatu.com> Date: Wed, 6 Jun 2018 14:09:20 -0400 > okay, then you want to have this patch for net-next? As an optimization? > > Of course, when it's open again. Like you, I have questions about where this adjustment is applied and why. So I'm not sure yet. For example, only IPV6 really takes it into consideration and as you saw only really for the fragmentation path and not the normal output path. This needs more consideration and investigation.
On Wed, Jun 6, 2018 at 2:11 PM, David Miller <davem@davemloft.net> wrote: > From: Alexander Aring <aring@mojatatu.com> > Date: Wed, 6 Jun 2018 14:09:20 -0400 > >> okay, then you want to have this patch for net-next? As an optimization? >> >> Of course, when it's open again. > > Like you, I have questions about where this adjustment is applied and > why. So I'm not sure yet. > > For example, only IPV6 really takes it into consideration and as you > saw only really for the fragmentation path and not the normal output > path. > > This needs more consideration and investigation. This is the unconditional skb_put in ieee802154_tx. In many cases there is some tailroom due to SKB_DATA_ALIGN in __alloc_skb, so it may take a specific case to not have even 2 bytes of tailroom available.
Hi, On Wed, Jun 06, 2018 at 04:26:19PM -0400, Willem de Bruijn wrote: > On Wed, Jun 6, 2018 at 2:11 PM, David Miller <davem@davemloft.net> wrote: > > From: Alexander Aring <aring@mojatatu.com> > > Date: Wed, 6 Jun 2018 14:09:20 -0400 > > > >> okay, then you want to have this patch for net-next? As an optimization? > >> > >> Of course, when it's open again. > > > > Like you, I have questions about where this adjustment is applied and > > why. So I'm not sure yet. > > > > For example, only IPV6 really takes it into consideration and as you > > saw only really for the fragmentation path and not the normal output > > path. > > > > This needs more consideration and investigation. > > This is the unconditional skb_put in ieee802154_tx. In many cases > there is some tailroom due to SKB_DATA_ALIGN in __alloc_skb, > so it may take a specific case to not have even 2 bytes of tailroom > available. Yes it's in ieee802154_tx, but we need tailroom not just for checksum. The bugreport is related to the two bytes of tailroom, because virtual hardware doing checksum by software. The most real transceivers offload this feature, so zero tailroom is needed. I will of course add checks before adding L2 header for headroom and tailroom in related subsystem code. In IEEE 802.15.4 and secured enabled frames we need a MIC field at the end of the frame. In worst case this can be 16 bytes. I looked ethernet macsec feature and it seems they need to have a similar reseved tailroom which is 16 bytes by default (max 32 bytes). Maybe it's worth to take care for the tailroom in this path since it's not just 2 bytes in some cases. --- Meanwhile I think I found a bug in macsec, I cc Sabrina here: diff --git a/drivers/net/macsec.c b/drivers/net/macsec.c index 7de88b33d5b9..687323c0caf5 100644 --- a/drivers/net/macsec.c +++ b/drivers/net/macsec.c @@ -522,7 +522,7 @@ static bool macsec_validate_skb(struct sk_buff *skb, u16 icv_len) } #define MACSEC_NEEDED_HEADROOM (macsec_extra_len(true)) -#define MACSEC_NEEDED_TAILROOM MACSEC_STD_ICV_LEN +#define MACSEC_NEEDED_TAILROOM MACSEC_MAX_ICV_LEN static void macsec_fill_iv(unsigned char *iv, sci_t sci, u32 pn) { --- MACSEC_NEEDED_TAILROOM is the define to check and run skb_copy_expand() and should use the ?worst case? or the the value (icv_len + ?extra_foo?) is set as runtime generation on newlink. I see that in macsec_newlink() following code: if (data && data[IFLA_MACSEC_ICV_LEN]) icv_len = nla_get_u8(data[IFLA_MACSEC_ICV_LEN]); so the user can change it to (even a value above 32?, there is no check for that). Anyway everything higher than MACSEC_STD_ICV_LEN could run into a skb_over_panic(). - Alex
On Thu, Jun 7, 2018 at 12:22 PM, Alexander Aring <aring@mojatatu.com> wrote: > Hi, > > On Wed, Jun 06, 2018 at 04:26:19PM -0400, Willem de Bruijn wrote: >> On Wed, Jun 6, 2018 at 2:11 PM, David Miller <davem@davemloft.net> wrote: >> > From: Alexander Aring <aring@mojatatu.com> >> > Date: Wed, 6 Jun 2018 14:09:20 -0400 >> > >> >> okay, then you want to have this patch for net-next? As an optimization? >> >> >> >> Of course, when it's open again. >> > >> > Like you, I have questions about where this adjustment is applied and >> > why. So I'm not sure yet. >> > >> > For example, only IPV6 really takes it into consideration and as you >> > saw only really for the fragmentation path and not the normal output >> > path. >> > >> > This needs more consideration and investigation. >> >> This is the unconditional skb_put in ieee802154_tx. In many cases >> there is some tailroom due to SKB_DATA_ALIGN in __alloc_skb, >> so it may take a specific case to not have even 2 bytes of tailroom >> available. > > Yes it's in ieee802154_tx, but we need tailroom not just for checksum. > The bugreport is related to the two bytes of tailroom, because virtual > hardware doing checksum by software. The most real transceivers offload > this feature, so zero tailroom is needed. > > I will of course add checks before adding L2 header for headroom and > tailroom in related subsystem code. > > In IEEE 802.15.4 and secured enabled frames we need a MIC field at the > end of the frame. In worst case this can be 16 bytes. > > I looked ethernet macsec feature and it seems they need to have a similar > reseved tailroom which is 16 bytes by default (max 32 bytes). Allocating with necessary tailroom to avoid a realloc in the path sounds fine to me, too. Packet sockets also take needed_tailroom into account, to give another example. We just have to avoid papering over bugs. Fixing the locations that unconditionally move the tail pointer beyond the allocated region is more important.
diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c index 7b6d1689087b..b4e521cfe3cf 100644 --- a/net/ipv6/ip6_output.c +++ b/net/ipv6/ip6_output.c @@ -1262,6 +1262,7 @@ static int __ip6_append_data(struct sock *sk, int exthdrlen = 0; int dst_exthdrlen = 0; int hh_len; + int t_len; int copy; int err; int offset = 0; @@ -1283,6 +1284,7 @@ static int __ip6_append_data(struct sock *sk, orig_mtu = mtu; hh_len = LL_RESERVED_SPACE(rt->dst.dev); + t_len = rt->dst.dev->needed_tailroom; fragheaderlen = sizeof(struct ipv6hdr) + rt->rt6i_nfheader_len + (opt ? opt->opt_nflen : 0); @@ -1425,13 +1427,13 @@ static int __ip6_append_data(struct sock *sk, } if (transhdrlen) { skb = sock_alloc_send_skb(sk, - alloclen + hh_len, + alloclen + hh_len + t_len, (flags & MSG_DONTWAIT), &err); } else { skb = NULL; if (refcount_read(&sk->sk_wmem_alloc) + wmem_alloc_delta <= 2 * sk->sk_sndbuf) - skb = alloc_skb(alloclen + hh_len, + skb = alloc_skb(alloclen + hh_len + t_len, sk->sk_allocation); if (unlikely(!skb)) err = -ENOBUFS;
This patch adds care about tailroom length for allocate a skb from ipv6 level stack. In case of 6lowpan we had the problem the skb runs into a skb_over_panic() in some special length cases. The root was there was no tailroom allocated for the IEEE 802.15.4 checksum, although we had the necessary tailroom specified inside the netdev structure. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=195059 Reported-by: David Palma <david.palma@ntnu.no> Reported-by: Rabi Narayan Sahoo <rabinarayans0828@gmail.com> Signed-off-by: Alexander Aring <aring@mojatatu.com> --- Hi, nasty bug, I suppose this is the correct fix to my last question for what dev->needed_tailroom is designed for. I added two Reported-by here, David Palma reported this bug one year ago and I didn't had time to investigate. Interesting is that he told this bug doesn't occur (in case of 6lowpan 802.15.4) on 32 bit systems. Maybe alignment is related to the question why it works on 32 bit. Anyway, a week ago "Rabi Narayan Sahoo" reported the bug again and I needed to investigate something "why", since I also use a 64 bit vm. David Palma did a nice job for reproduce this bug and he (I think) lives at least one year with it, so I put him at first. Anyway, Rabi Narayan Sahoo was very very close to fix it and found the right code part which I also found. I read his mail afterwards because it was received messed on the linux-wpan mailinglist. So it's correct to give him credits too. :-) I hope there are no other cases where tailroom is missing. The second one is not needed to fix my bug but I think we need it there. Also hh_len is also used inside a skb_resever() in this function, but this is for headroom only. - Alex net/ipv6/ip6_output.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)