Message ID | 1355838711-5473-1-git-send-email-ian.campbell@citrix.com |
---|---|
State | Changes Requested, archived |
Delegated to: | David Miller |
On Tue, Dec 18, 2012 at 01:51:51PM +0000, Ian Campbell wrote: > Using RX_COPY_THRESHOLD is incorrect if the SKB is actually smaller > than that. We have already accounted for this in > NETFRONT_SKB_CB(skb)->pull_to so use that instead. > > Fixes WARN_ON from skb_try_coalesce. This should probably be also on the stable tree for 3.7 at least? > > Signed-off-by: Ian Campbell <ian.campbell@citrix.com> > Cc: Sander Eikelenboom <linux@eikelenboom.it> ^^ - Reported-by: > Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> ^^ - Acked-by: > Cc: annie li <annie.li@oracle.com> > Cc: xen-devel@lists.xensource.com > Cc: netdev@vger.kernel.org > Cc: stable@kernel.org # 3.7.x only > --- > drivers/net/xen-netfront.c | 15 +++++---------- > 1 files changed, 5 insertions(+), 10 deletions(-) > > diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c > index caa0110..b06ef81 100644 > --- a/drivers/net/xen-netfront.c > +++ b/drivers/net/xen-netfront.c > @@ -971,17 +971,12 @@ err: > * overheads. Here, we add the size of the data pulled > * in xennet_fill_frags(). > * > - * We also adjust for any unused space in the main > - * data area by subtracting (RX_COPY_THRESHOLD - > - * len). This is especially important with drivers > - * which split incoming packets into header and data, > - * using only 66 bytes of the main data area (see the > - * e1000 driver for example.) On such systems, > - * without this last adjustement, our achievable > - * receive throughout using the standard receive > - * buffer size was cut by 25%(!!!). > + * We also adjust for the __pskb_pull_tail done in > + * handle_incoming_queue which pulls data from the > + * frags into the head area, which is already > + * accounted in RX_COPY_THRESHOLD. 
> */ > - skb->truesize += skb->data_len - RX_COPY_THRESHOLD; > + skb->truesize += skb->data_len - NETFRONT_SKB_CB(skb)->pull_to; > skb->len += skb->data_len; > > if (rx->flags & XEN_NETRXF_csum_blank) > -- > 1.7.2.5 > -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, 2012-12-18 at 14:10 +0000, Konrad Rzeszutek Wilk wrote: > On Tue, Dec 18, 2012 at 01:51:51PM +0000, Ian Campbell wrote: > > Using RX_COPY_THRESHOLD is incorrect if the SKB is actually smaller > > than that. We have already accounted for this in > > NETFRONT_SKB_CB(skb)->pull_to so use that instead. > > > > Fixes WARN_ON from skb_try_coalesce. > > This should probably be also on the stable tree for 3.7 at least? Yes, hence "Cc: stable@kernel.org # 3.7.x only" below ;-) > > > > Signed-off-by: Ian Campbell <ian.campbell@citrix.com> > > Cc: Sander Eikelenboom <linux@eikelenboom.it> > ^^ - Reported-by: > > > Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> > ^^ - Acked-by: > > > Cc: annie li <annie.li@oracle.com> > > Cc: xen-devel@lists.xensource.com > > Cc: netdev@vger.kernel.org > > Cc: stable@kernel.org # 3.7.x only > > --- > > drivers/net/xen-netfront.c | 15 +++++---------- > > 1 files changed, 5 insertions(+), 10 deletions(-) > > > > diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c > > index caa0110..b06ef81 100644 > > --- a/drivers/net/xen-netfront.c > > +++ b/drivers/net/xen-netfront.c > > @@ -971,17 +971,12 @@ err: > > * overheads. Here, we add the size of the data pulled > > * in xennet_fill_frags(). > > * > > - * We also adjust for any unused space in the main > > - * data area by subtracting (RX_COPY_THRESHOLD - > > - * len). This is especially important with drivers > > - * which split incoming packets into header and data, > > - * using only 66 bytes of the main data area (see the > > - * e1000 driver for example.) On such systems, > > - * without this last adjustement, our achievable > > - * receive throughout using the standard receive > > - * buffer size was cut by 25%(!!!). > > + * We also adjust for the __pskb_pull_tail done in > > + * handle_incoming_queue which pulls data from the > > + * frags into the head area, which is already > > + * accounted in RX_COPY_THRESHOLD. 
> > 		 */
> > -		skb->truesize += skb->data_len - RX_COPY_THRESHOLD;
> > +		skb->truesize += skb->data_len - NETFRONT_SKB_CB(skb)->pull_to;
> > 		skb->len += skb->data_len;
> >
> > 		if (rx->flags & XEN_NETRXF_csum_blank)
> > --
> > 1.7.2.5
On Tue, 2012-12-18 at 13:51 +0000, Ian Campbell wrote: > Using RX_COPY_THRESHOLD is incorrect if the SKB is actually smaller > than that. We have already accounted for this in > NETFRONT_SKB_CB(skb)->pull_to so use that instead. > > Fixes WARN_ON from skb_try_coalesce. > > Signed-off-by: Ian Campbell <ian.campbell@citrix.com> > Cc: Sander Eikelenboom <linux@eikelenboom.it> > Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> > Cc: annie li <annie.li@oracle.com> > Cc: xen-devel@lists.xensource.com > Cc: netdev@vger.kernel.org > Cc: stable@kernel.org # 3.7.x only > --- > drivers/net/xen-netfront.c | 15 +++++---------- > 1 files changed, 5 insertions(+), 10 deletions(-) > > diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c > index caa0110..b06ef81 100644 > --- a/drivers/net/xen-netfront.c > +++ b/drivers/net/xen-netfront.c > @@ -971,17 +971,12 @@ err: > * overheads. Here, we add the size of the data pulled > * in xennet_fill_frags(). > * > - * We also adjust for any unused space in the main > - * data area by subtracting (RX_COPY_THRESHOLD - > - * len). This is especially important with drivers > - * which split incoming packets into header and data, > - * using only 66 bytes of the main data area (see the > - * e1000 driver for example.) On such systems, > - * without this last adjustement, our achievable > - * receive throughout using the standard receive > - * buffer size was cut by 25%(!!!). > + * We also adjust for the __pskb_pull_tail done in > + * handle_incoming_queue which pulls data from the > + * frags into the head area, which is already > + * accounted in RX_COPY_THRESHOLD. > */ > - skb->truesize += skb->data_len - RX_COPY_THRESHOLD; > + skb->truesize += skb->data_len - NETFRONT_SKB_CB(skb)->pull_to; > skb->len += skb->data_len; > > if (rx->flags & XEN_NETRXF_csum_blank) But skb truesize is not what you think. You must account the exact memory used by this skb, not only the used part of it. 
At the very minimum, it should be

	skb->truesize += skb->data_len;

But it really should be the allocated size of the fragment.

If it's a page, then it's a page, even if you use only a single byte of it.
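A minimal userspace sketch of the two accounting options named in this message, assuming order-0 pages of 4096 bytes. `struct sketch_skb`, `SKETCH_PAGE_SIZE` and both helper functions are hypothetical stand-ins for illustration, not kernel code.

```c
#include <assert.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumption: order-0 page size */

/* Hypothetical stand-in for the sk_buff fields discussed in the thread. */
struct sketch_skb {
	unsigned int truesize;	/* memory already charged for the head */
	unsigned int data_len;	/* bytes actually used in the page frags */
	unsigned int nr_frags;	/* number of page frags attached */
};

/* "At the very minimum": charge only the bytes used in the frags. */
static unsigned int truesize_used_bytes(const struct sketch_skb *skb)
{
	return skb->truesize + skb->data_len;
}

/* "If it's a page, then it's a page": charge every frag as a full page. */
static unsigned int truesize_full_pages(const struct sketch_skb *skb)
{
	/* The used bytes can never exceed the pages that hold them. */
	assert(skb->data_len <= SKETCH_PAGE_SIZE * skb->nr_frags);
	return skb->truesize + SKETCH_PAGE_SIZE * skb->nr_frags;
}
```

For a packet that uses a single byte of one page, the first helper adds 1 to the head's truesize while the second adds 4096, which is the gap being described here.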
On Tue, 2012-12-18 at 15:12 +0000, Eric Dumazet wrote: > On Tue, 2012-12-18 at 13:51 +0000, Ian Campbell wrote: > > Using RX_COPY_THRESHOLD is incorrect if the SKB is actually smaller > > than that. We have already accounted for this in > > NETFRONT_SKB_CB(skb)->pull_to so use that instead. > > > > Fixes WARN_ON from skb_try_coalesce. > > > > Signed-off-by: Ian Campbell <ian.campbell@citrix.com> > > Cc: Sander Eikelenboom <linux@eikelenboom.it> > > Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> > > Cc: annie li <annie.li@oracle.com> > > Cc: xen-devel@lists.xensource.com > > Cc: netdev@vger.kernel.org > > Cc: stable@kernel.org # 3.7.x only > > --- > > drivers/net/xen-netfront.c | 15 +++++---------- > > 1 files changed, 5 insertions(+), 10 deletions(-) > > > > diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c > > index caa0110..b06ef81 100644 > > --- a/drivers/net/xen-netfront.c > > +++ b/drivers/net/xen-netfront.c > > @@ -971,17 +971,12 @@ err: > > * overheads. Here, we add the size of the data pulled > > * in xennet_fill_frags(). > > * > > - * We also adjust for any unused space in the main > > - * data area by subtracting (RX_COPY_THRESHOLD - > > - * len). This is especially important with drivers > > - * which split incoming packets into header and data, > > - * using only 66 bytes of the main data area (see the > > - * e1000 driver for example.) On such systems, > > - * without this last adjustement, our achievable > > - * receive throughout using the standard receive > > - * buffer size was cut by 25%(!!!). > > + * We also adjust for the __pskb_pull_tail done in > > + * handle_incoming_queue which pulls data from the > > + * frags into the head area, which is already > > + * accounted in RX_COPY_THRESHOLD. 
> > 		 */
> > -		skb->truesize += skb->data_len - RX_COPY_THRESHOLD;
> > +		skb->truesize += skb->data_len - NETFRONT_SKB_CB(skb)->pull_to;
> > 		skb->len += skb->data_len;
> >
> > 		if (rx->flags & XEN_NETRXF_csum_blank)
>
> But skb truesize is not what you think.

Indeed, it seems I was completely backwards about what it means!

> You must account the exact memory used by this skb, not only the used
> part of it.
>
> At the very minimum, it should be
>
> skb->truesize += skb->data_len;
>
> But it really should be the allocated size of the fragment.
>
> If its a page, then its a page, even if you use one single byte in it.

So actually we want += PAGE_SIZE * skb_shinfo(skb)->nr_frags ?

Sander, can you try that change?

Ian.
On Tue, 2012-12-18 at 15:26 +0000, Ian Campbell wrote:

> So actually we want += PAGE_SIZE * skb_shinfo(skb)->nr_frags ?
>

I don't know what the real frag sizes are in your case.

Some drivers allocate a full page for an Ethernet frame, others use half
of a page; it really depends.

As the frag ABI doesn't contain the real size, it's OK in this case to
account the actual frag size.

(skb->data_len in your driver)
On Tue, 2012-12-18 at 16:13 +0000, Eric Dumazet wrote:
> On Tue, 2012-12-18 at 15:26 +0000, Ian Campbell wrote:
>
> > So actually we want += PAGE_SIZE * skb_shinfo(skb)->nr_frags ?
> >
>
> I dont know what are the real frag sizes in your case.

I think it's a page, see xennet_alloc_rx_buffers and the alloc_page
therein.

> Some drivers allocate a full page for an ethernet frame, others use half
> of a page, it really depends.
>
> As the frag ABI doesnt contain real size, its ok in this case to account
> the actual frag size.
>
> (skb->data_len in your driver)

I guess I'm a bit confused by what truesize means again then ;-),
because in that case the original patch is correct although it would
have been less confusing to do:
	skb->truesize += skb->data_len;
in xennet_poll() and then do the subtraction of
NETFRONT_SKB_CB(skb)->pull_to in handle_incoming_queue() where we
actually do the pull up.

Unless __pskb_pull_tail does that adjustment for us, but if it does I
can't see where.

Ian.
On Tue, 2012-12-18 at 16:22 +0000, Ian Campbell wrote:
> On Tue, 2012-12-18 at 16:13 +0000, Eric Dumazet wrote:
> > On Tue, 2012-12-18 at 15:26 +0000, Ian Campbell wrote:
> >
> > > So actually we want += PAGE_SIZE * skb_shinfo(skb)->nr_frags ?
> > >
> >
> > I dont know what are the real frag sizes in your case.
>
> I think it's a page, see xennet_alloc_rx_buffers and the alloc_page
> therein.
>

If they are order-0 pages, then PAGE_SIZE * nr_frags is OK.

> > Some drivers allocate a full page for an ethernet frame, others use half
> > of a page, it really depends.
> >
> > As the frag ABI doesnt contain real size, its ok in this case to account
> > the actual frag size.
> >
> > (skb->data_len in your driver)
>
> I guess I'm a bit confused by what truesize means again then ;-),
> because in that case the original patch is correct although it would
> have been less confusing to do:
> skb->truesize += skb->data_len;
> in xennet_poll() and then do the subtraction of
> NETFRONT_SKB_CB(skb)->pull_to in handle_incoming_queue() where we
> actually do the pull up.
>
> Unless __pskb_pull_tail does that adjustment for us, but if it does I
> can't see where.

That's because skb frags only contain:

a page pointer,
an offset,
a size (the exact number of used bytes in this frag),

and not the originally allocated size.

It could be 256, 768, 2048, 4096, or 65536 bytes; nobody but the driver
really knows.

So when we pull X bytes from a fragment to skb->head, there is no way to
remember what the original size of the fragment was.

Only the driver allocating the frag knows its truesize.

Once the skb is given to the stack, we lose this information and rely on
skb->truesize being an accurate estimate.
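The point about the frag descriptor can be sketched in userspace as follows. This is a simplified model under stated assumptions: `struct sketch_frag` and `sketch_pull()` are hypothetical names that mirror only the three fields listed above (page pointer, offset, size); they are not the kernel's types or its `__pskb_pull_tail` implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical frag descriptor: note there is no "allocated size" field. */
struct sketch_frag {
	const void *page;	/* page holding the data */
	unsigned int offset;	/* start of the used bytes within the page */
	unsigned int size;	/* number of used bytes in this frag */
};

/*
 * Pull up to n bytes out of the frag (as a pull into the skb head would).
 * Only offset and size change; nothing here can recover how much memory
 * the driver originally allocated, so truesize cannot be adjusted.
 */
static unsigned int sketch_pull(struct sketch_frag *frag, unsigned int n)
{
	assert(frag != NULL);
	if (n > frag->size)
		n = frag->size;
	frag->offset += n;
	frag->size -= n;
	return n;
}
```

After pulling 256 bytes from a 1500-byte frag the descriptor reads offset 256, size 1244, and it looks the same whether the driver backed it with a 2048-byte buffer or a full page.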
Tuesday, December 18, 2012, 4:26:38 PM, you wrote: > On Tue, 2012-12-18 at 15:12 +0000, Eric Dumazet wrote: >> On Tue, 2012-12-18 at 13:51 +0000, Ian Campbell wrote: >> > Using RX_COPY_THRESHOLD is incorrect if the SKB is actually smaller >> > than that. We have already accounted for this in >> > NETFRONT_SKB_CB(skb)->pull_to so use that instead. >> > >> > Fixes WARN_ON from skb_try_coalesce. >> > >> > Signed-off-by: Ian Campbell <ian.campbell@citrix.com> >> > Cc: Sander Eikelenboom <linux@eikelenboom.it> >> > Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> >> > Cc: annie li <annie.li@oracle.com> >> > Cc: xen-devel@lists.xensource.com >> > Cc: netdev@vger.kernel.org >> > Cc: stable@kernel.org # 3.7.x only >> > --- >> > drivers/net/xen-netfront.c | 15 +++++---------- >> > 1 files changed, 5 insertions(+), 10 deletions(-) >> > >> > diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c >> > index caa0110..b06ef81 100644 >> > --- a/drivers/net/xen-netfront.c >> > +++ b/drivers/net/xen-netfront.c >> > @@ -971,17 +971,12 @@ err: >> > * overheads. Here, we add the size of the data pulled >> > * in xennet_fill_frags(). >> > * >> > - * We also adjust for any unused space in the main >> > - * data area by subtracting (RX_COPY_THRESHOLD - >> > - * len). This is especially important with drivers >> > - * which split incoming packets into header and data, >> > - * using only 66 bytes of the main data area (see the >> > - * e1000 driver for example.) On such systems, >> > - * without this last adjustement, our achievable >> > - * receive throughout using the standard receive >> > - * buffer size was cut by 25%(!!!). >> > + * We also adjust for the __pskb_pull_tail done in >> > + * handle_incoming_queue which pulls data from the >> > + * frags into the head area, which is already >> > + * accounted in RX_COPY_THRESHOLD. 
>> > 		 */
>> > -		skb->truesize += skb->data_len - RX_COPY_THRESHOLD;
>> > +		skb->truesize += skb->data_len - NETFRONT_SKB_CB(skb)->pull_to;
>> > 		skb->len += skb->data_len;
>> >
>> > 		if (rx->flags & XEN_NETRXF_csum_blank)
>>
>>
>> But skb truesize is not what you think.

> Indeed, it seems I was completely backwards about what it means!

>> You must account the exact memory used by this skb, not only the used
>> part of it.
>>
>> At the very minimum, it should be
>>
>> skb->truesize += skb->data_len;
>>
>> But it really should be the allocated size of the fragment.
>>
>> If its a page, then its a page, even if you use one single byte in it.

> So actually we want += PAGE_SIZE * skb_shinfo(skb)->nr_frags ?

> Sander, can you try that change?

Hi Ian,

It ran overnight and I haven't seen the warn_once trigger
(but I also didn't with the previous patch).

--
Sander

> Ian.
On Wed, 2012-12-19 at 12:34 +0100, Sander Eikelenboom wrote:

> Hi Ian,
>
> It ran overnight and i haven't seen the warn_once trigger.
> (but i also didn't with the previous patch)
>

As I said, the minimum value that avoids triggering the warning is what
Ian's patch was doing, but it was still not an accurate estimate.

Doing the real accounting might cause slow transfers or dropped packets,
because the socket limits (SNDBUF / RCVBUF) are hit sooner.

So the real question was: when accounting for full pages, do your
applications run as smoothly as before, with no huge performance
regression?
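A rough way to see why the limits are hit sooner, as a userspace sketch only. It assumes a deliberately simplified model in which a receive buffer admits packets until the sum of their truesize values reaches the limit; the function name and every number used with it are hypothetical, and the kernel's real checks are more involved.

```c
#include <assert.h>

/*
 * Simplified model (an assumption, not the kernel's logic): how many
 * packets of a given truesize fit under a receive-buffer limit.
 */
static unsigned int sketch_packets_that_fit(unsigned int rcvbuf,
					    unsigned int truesize)
{
	assert(truesize > 0);
	return rcvbuf / truesize;
}
```

With a hypothetical 262144-byte limit and a 512-byte head, charging 1500 used bytes per packet (truesize 2012) admits 130 packets, while charging a full 4096-byte page (truesize 4608) admits 56. The same traffic therefore reaches the limit in well under half as many packets.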
diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c
index caa0110..b06ef81 100644
--- a/drivers/net/xen-netfront.c
+++ b/drivers/net/xen-netfront.c
@@ -971,17 +971,12 @@ err:
 		 * overheads. Here, we add the size of the data pulled
 		 * in xennet_fill_frags().
 		 *
-		 * We also adjust for any unused space in the main
-		 * data area by subtracting (RX_COPY_THRESHOLD -
-		 * len). This is especially important with drivers
-		 * which split incoming packets into header and data,
-		 * using only 66 bytes of the main data area (see the
-		 * e1000 driver for example.) On such systems,
-		 * without this last adjustement, our achievable
-		 * receive throughout using the standard receive
-		 * buffer size was cut by 25%(!!!).
+		 * We also adjust for the __pskb_pull_tail done in
+		 * handle_incoming_queue which pulls data from the
+		 * frags into the head area, which is already
+		 * accounted in RX_COPY_THRESHOLD.
 		 */
-		skb->truesize += skb->data_len - RX_COPY_THRESHOLD;
+		skb->truesize += skb->data_len - NETFRONT_SKB_CB(skb)->pull_to;
 		skb->len += skb->data_len;
 
 		if (rx->flags & XEN_NETRXF_csum_blank)
Using RX_COPY_THRESHOLD is incorrect if the SKB is actually smaller
than that. We have already accounted for this in
NETFRONT_SKB_CB(skb)->pull_to so use that instead.

Fixes WARN_ON from skb_try_coalesce.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Cc: Sander Eikelenboom <linux@eikelenboom.it>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: annie li <annie.li@oracle.com>
Cc: xen-devel@lists.xensource.com
Cc: netdev@vger.kernel.org
Cc: stable@kernel.org # 3.7.x only
---
 drivers/net/xen-netfront.c |   15 +++++----------
 1 files changed, 5 insertions(+), 10 deletions(-)
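The reasoning in this commit message can be illustrated with a small userspace sketch. It rests on two assumptions that the thread does not state: that the threshold is 256 bytes, and that `pull_to` is the packet length capped at the threshold. All names below are hypothetical. The sketch covers only the arithmetic of the patch, not the fuller truesize accounting raised in review.

```c
#include <assert.h>

#define SKETCH_RX_COPY_THRESHOLD 256	/* assumption: threshold value */

/* Assumed definition of pull_to: packet length capped at the threshold. */
static int sketch_pull_to(int packet_len)
{
	assert(packet_len >= 0);
	return packet_len < SKETCH_RX_COPY_THRESHOLD ?
	       packet_len : SKETCH_RX_COPY_THRESHOLD;
}

/* Old adjustment: always subtracts the full threshold. */
static int sketch_delta_old(int data_len)
{
	return data_len - SKETCH_RX_COPY_THRESHOLD;
}

/* Patched adjustment: subtracts only what will actually be pulled. */
static int sketch_delta_new(int data_len, int packet_len)
{
	return data_len - sketch_pull_to(packet_len);
}
```

For a 1500-byte packet both variants add 1244. For a 100-byte packet the old variant adds -156, shrinking truesize below the memory in use, while the patched variant adds 0.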