Message ID | 158757174774.1370371.14395462229209766397.stgit@firesoul |
---|---|
State | Changes Requested |
Delegated to: | BPF Maintainers |
Headers | show |
Series | [net-next,01/33] xdp: add frame size to xdp_buff | expand |
On 2020/4/23 上午12:09, Jesper Dangaard Brouer wrote: > The virtio_net driver is running inside the guest-OS. There are two > XDP receive code-paths in virtio_net, namely receive_small() and > receive_mergeable(). The receive_big() function does not support XDP. > > In receive_small() the frame size is available in buflen. The buffer > backing these frames are allocated in add_recvbuf_small() with same > size, except for the headroom, but tailroom have reserved room for > skb_shared_info. The headroom is encoded in ctx pointer as a value. > > In receive_mergeable() the frame size is more dynamic. There are two > basic cases: (1) buffer size is based on a exponentially weighted > moving average (see DECLARE_EWMA) of packet length. Or (2) in case > virtnet_get_headroom() have any headroom then buffer size is > PAGE_SIZE. The ctx pointer is this time used for encoding two values; > the buffer len "truesize" and headroom. In case (1) if the rx buffer > size is underestimated, the packet will have been split over more > buffers (num_buf info in virtio_net_hdr_mrg_rxbuf placed in top of > buffer area). If that happens the XDP path does a xdp_linearize_page > operation. > > Cc: Jason Wang <jasowang@redhat.com> > Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> > --- > drivers/net/virtio_net.c | 15 ++++++++++++--- > 1 file changed, 12 insertions(+), 3 deletions(-) > > diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c > index 11f722460513..1df3676da185 100644 > --- a/drivers/net/virtio_net.c > +++ b/drivers/net/virtio_net.c > @@ -689,6 +689,7 @@ static struct sk_buff *receive_small(struct net_device *dev, > xdp.data_end = xdp.data + len; > xdp.data_meta = xdp.data; > xdp.rxq = &rq->xdp_rxq; > + xdp.frame_sz = buflen; > orig_data = xdp.data; > act = bpf_prog_run_xdp(xdp_prog, &xdp); > stats->xdp_packets++; > @@ -797,10 +798,11 @@ static struct sk_buff *receive_mergeable(struct net_device *dev, > int offset = buf - page_address(page); > struct sk_buff *head_skb, *curr_skb; > struct bpf_prog *xdp_prog; > - unsigned int truesize; > + unsigned int truesize = mergeable_ctx_to_truesize(ctx); > unsigned int headroom = mergeable_ctx_to_headroom(ctx); > - int err; > unsigned int metasize = 0; > + unsigned int frame_sz; > + int err; > > head_skb = NULL; > stats->bytes += len - vi->hdr_len; > @@ -821,6 +823,11 @@ static struct sk_buff *receive_mergeable(struct net_device *dev, > if (unlikely(hdr->hdr.gso_type)) > goto err_xdp; > > + /* Buffers with headroom use PAGE_SIZE as alloc size, > + * see add_recvbuf_mergeable() + get_mergeable_buf_len() > + */ > + frame_sz = headroom ? PAGE_SIZE : truesize; > + > /* This happens when rx buffer size is underestimated > * or headroom is not enough because of the buffer > * was refilled before XDP is set. This should only > @@ -834,6 +841,8 @@ static struct sk_buff *receive_mergeable(struct net_device *dev, > page, offset, > VIRTIO_XDP_HEADROOM, > &len); > + frame_sz = PAGE_SIZE; Should this be PAGE_SIZE - SKB_DATA_ALIGN(sizeof(struct skb_shared_info))? > + > if (!xdp_page) > goto err_xdp; > offset = VIRTIO_XDP_HEADROOM; > @@ -850,6 +859,7 @@ static struct sk_buff *receive_mergeable(struct net_device *dev, > xdp.data_end = xdp.data + (len - vi->hdr_len); > xdp.data_meta = xdp.data; > xdp.rxq = &rq->xdp_rxq; > + xdp.frame_sz = frame_sz; Maybe we can easily check by xdp.frame_sz = (xdp_page == page) ? truesize : ... Thanks > > act = bpf_prog_run_xdp(xdp_prog, &xdp); > stats->xdp_packets++; > @@ -924,7 +934,6 @@ static struct sk_buff *receive_mergeable(struct net_device *dev, > } > rcu_read_unlock(); > > - truesize = mergeable_ctx_to_truesize(ctx); > if (unlikely(len > truesize)) { > pr_debug("%s: rx error: len %u exceeds truesize %lu\n", > dev->name, len, (unsigned long)ctx); > >
On Mon, 27 Apr 2020 15:21:02 +0800 Jason Wang <jasowang@redhat.com> wrote: > On 2020/4/23 上午12:09, Jesper Dangaard Brouer wrote: > > The virtio_net driver is running inside the guest-OS. There are two > > XDP receive code-paths in virtio_net, namely receive_small() and > > receive_mergeable(). The receive_big() function does not support XDP. > > > > In receive_small() the frame size is available in buflen. The buffer > > backing these frames are allocated in add_recvbuf_small() with same > > size, except for the headroom, but tailroom have reserved room for > > skb_shared_info. The headroom is encoded in ctx pointer as a value. > > > > In receive_mergeable() the frame size is more dynamic. There are two > > basic cases: (1) buffer size is based on a exponentially weighted > > moving average (see DECLARE_EWMA) of packet length. Or (2) in case > > virtnet_get_headroom() have any headroom then buffer size is > > PAGE_SIZE. The ctx pointer is this time used for encoding two values; > > the buffer len "truesize" and headroom. In case (1) if the rx buffer > > size is underestimated, the packet will have been split over more > > buffers (num_buf info in virtio_net_hdr_mrg_rxbuf placed in top of > > buffer area). If that happens the XDP path does a xdp_linearize_page > > operation. > > > > Cc: Jason Wang <jasowang@redhat.com> > > Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> > > --- > > drivers/net/virtio_net.c | 15 ++++++++++++--- > > 1 file changed, 12 insertions(+), 3 deletions(-) > > > > diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c > > index 11f722460513..1df3676da185 100644 > > --- a/drivers/net/virtio_net.c > > +++ b/drivers/net/virtio_net.c > > @@ -689,6 +689,7 @@ static struct sk_buff *receive_small(struct net_device *dev, > > xdp.data_end = xdp.data + len; > > xdp.data_meta = xdp.data; > > xdp.rxq = &rq->xdp_rxq; > > + xdp.frame_sz = buflen; > > orig_data = xdp.data; > > act = bpf_prog_run_xdp(xdp_prog, &xdp); > > stats->xdp_packets++; > > @@ -797,10 +798,11 @@ static struct sk_buff *receive_mergeable(struct net_device *dev, > > int offset = buf - page_address(page); > > struct sk_buff *head_skb, *curr_skb; > > struct bpf_prog *xdp_prog; > > - unsigned int truesize; > > + unsigned int truesize = mergeable_ctx_to_truesize(ctx); > > unsigned int headroom = mergeable_ctx_to_headroom(ctx); > > - int err; > > unsigned int metasize = 0; > > + unsigned int frame_sz; > > + int err; > > > > head_skb = NULL; > > stats->bytes += len - vi->hdr_len; > > @@ -821,6 +823,11 @@ static struct sk_buff *receive_mergeable(struct net_device *dev, > > if (unlikely(hdr->hdr.gso_type)) > > goto err_xdp; > > > > + /* Buffers with headroom use PAGE_SIZE as alloc size, > > + * see add_recvbuf_mergeable() + get_mergeable_buf_len() > > + */ > > + frame_sz = headroom ? PAGE_SIZE : truesize; > > + > > /* This happens when rx buffer size is underestimated > > * or headroom is not enough because of the buffer > > * was refilled before XDP is set. This should only > > @@ -834,6 +841,8 @@ static struct sk_buff *receive_mergeable(struct net_device *dev, > > page, offset, > > VIRTIO_XDP_HEADROOM, > > &len); > > + frame_sz = PAGE_SIZE; > > > Should this be PAGE_SIZE - SKB_DATA_ALIGN(sizeof(struct skb_shared_info))? No, frame_sz include the SKB_DATA_ALIGN(sizeof(struct skb_shared_info)) length.
On 2020/4/27 下午10:32, Jesper Dangaard Brouer wrote: > On Mon, 27 Apr 2020 15:21:02 +0800 > Jason Wang<jasowang@redhat.com> wrote: > >> On 2020/4/23 上午12:09, Jesper Dangaard Brouer wrote: >>> The virtio_net driver is running inside the guest-OS. There are two >>> XDP receive code-paths in virtio_net, namely receive_small() and >>> receive_mergeable(). The receive_big() function does not support XDP. >>> >>> In receive_small() the frame size is available in buflen. The buffer >>> backing these frames are allocated in add_recvbuf_small() with same >>> size, except for the headroom, but tailroom have reserved room for >>> skb_shared_info. The headroom is encoded in ctx pointer as a value. >>> >>> In receive_mergeable() the frame size is more dynamic. There are two >>> basic cases: (1) buffer size is based on a exponentially weighted >>> moving average (see DECLARE_EWMA) of packet length. Or (2) in case >>> virtnet_get_headroom() have any headroom then buffer size is >>> PAGE_SIZE. The ctx pointer is this time used for encoding two values; >>> the buffer len "truesize" and headroom. In case (1) if the rx buffer >>> size is underestimated, the packet will have been split over more >>> buffers (num_buf info in virtio_net_hdr_mrg_rxbuf placed in top of >>> buffer area). If that happens the XDP path does a xdp_linearize_page >>> operation. >>> >>> Cc: Jason Wang<jasowang@redhat.com> >>> Signed-off-by: Jesper Dangaard Brouer<brouer@redhat.com> >>> --- >>> drivers/net/virtio_net.c | 15 ++++++++++++--- >>> 1 file changed, 12 insertions(+), 3 deletions(-) >>> >>> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c >>> index 11f722460513..1df3676da185 100644 >>> --- a/drivers/net/virtio_net.c >>> +++ b/drivers/net/virtio_net.c >>> @@ -689,6 +689,7 @@ static struct sk_buff *receive_small(struct net_device *dev, >>> xdp.data_end = xdp.data + len; >>> xdp.data_meta = xdp.data; >>> xdp.rxq = &rq->xdp_rxq; >>> + xdp.frame_sz = buflen; >>> orig_data = xdp.data; >>> act = bpf_prog_run_xdp(xdp_prog, &xdp); >>> stats->xdp_packets++; >>> @@ -797,10 +798,11 @@ static struct sk_buff *receive_mergeable(struct net_device *dev, >>> int offset = buf - page_address(page); >>> struct sk_buff *head_skb, *curr_skb; >>> struct bpf_prog *xdp_prog; >>> - unsigned int truesize; >>> + unsigned int truesize = mergeable_ctx_to_truesize(ctx); >>> unsigned int headroom = mergeable_ctx_to_headroom(ctx); >>> - int err; >>> unsigned int metasize = 0; >>> + unsigned int frame_sz; >>> + int err; >>> >>> head_skb = NULL; >>> stats->bytes += len - vi->hdr_len; >>> @@ -821,6 +823,11 @@ static struct sk_buff *receive_mergeable(struct net_device *dev, >>> if (unlikely(hdr->hdr.gso_type)) >>> goto err_xdp; >>> >>> + /* Buffers with headroom use PAGE_SIZE as alloc size, >>> + * see add_recvbuf_mergeable() + get_mergeable_buf_len() >>> + */ >>> + frame_sz = headroom ? PAGE_SIZE : truesize; >>> + >>> /* This happens when rx buffer size is underestimated >>> * or headroom is not enough because of the buffer >>> * was refilled before XDP is set. This should only >>> @@ -834,6 +841,8 @@ static struct sk_buff *receive_mergeable(struct net_device *dev, >>> page, offset, >>> VIRTIO_XDP_HEADROOM, >>> &len); >>> + frame_sz = PAGE_SIZE; >> Should this be PAGE_SIZE - SKB_DATA_ALIGN(sizeof(struct skb_shared_info))? > No, frame_sz include the SKB_DATA_ALIGN(sizeof(struct skb_shared_info)) length. Ok, consider mergeable buffer path depends on the truesize which is encoded in ctx. It looks to the the calculation in add_recvfbuf_mergeable() is wrong, we need count both headroom and tailroom there. We probably need the attached 2 patches to fix this. (untested, will test it tomorrow). Thanks
On Tue, 28 Apr 2020 17:50:11 +0800 Jason Wang <jasowang@redhat.com> wrote: > We tried to reserve space for vnet header before > xdp.data_hard_start. But this is useless since the packet could be > modified by XDP which may invalidate the information stored in the > header and there's no way for XDP to know the existence of the vnet > header currently. I think this is wrong. We can reserve space for vnet header before xdp.data_hard_start, and it will be safe and cannot be modified by XDP as BPF program cannot access data before xdp.data_hard_start. > So let's just not reserve space for vnet header in this case. > > Signed-off-by: Jason Wang <jasowang@redhat.com> > --- > drivers/net/virtio_net.c | 6 +++--- > 1 file changed, 3 insertions(+), 3 deletions(-) > > diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c > index 2fe7a3188282..9bdaf2425e6e 100644 > --- a/drivers/net/virtio_net.c > +++ b/drivers/net/virtio_net.c > @@ -681,8 +681,8 @@ static struct sk_buff *receive_small(struct net_device *dev, > page = xdp_page; > } > > - xdp.data_hard_start = buf + VIRTNET_RX_PAD + vi->hdr_len; > - xdp.data = xdp.data_hard_start + xdp_headroom; > + xdp.data_hard_start = buf + VIRTNET_RX_PAD; > + xdp.data = xdp.data_hard_start + xdp_headroom + vi->hdr_len;; I think this is wrong. You are exposing the vi header info, which will be overwritten when code creates an xdp_frame. I find it very fragile, as later in the code the vi header info is copied, but only if xdp_prog is not loaded, so in principle it's okay, but when someone later figures out that we want to look at this area, we will be in trouble (and I expect this will be needed when we work towards multi-buffer XDP).
On 2020/4/28 下午5:50, Jason Wang wrote: > > On 2020/4/27 下午10:32, Jesper Dangaard Brouer wrote: >> On Mon, 27 Apr 2020 15:21:02 +0800 >> Jason Wang<jasowang@redhat.com> wrote: >> >>> On 2020/4/23 上午12:09, Jesper Dangaard Brouer wrote: >>>> The virtio_net driver is running inside the guest-OS. There are two >>>> XDP receive code-paths in virtio_net, namely receive_small() and >>>> receive_mergeable(). The receive_big() function does not support XDP. >>>> >>>> In receive_small() the frame size is available in buflen. The buffer >>>> backing these frames are allocated in add_recvbuf_small() with same >>>> size, except for the headroom, but tailroom have reserved room for >>>> skb_shared_info. The headroom is encoded in ctx pointer as a value. >>>> >>>> In receive_mergeable() the frame size is more dynamic. There are two >>>> basic cases: (1) buffer size is based on a exponentially weighted >>>> moving average (see DECLARE_EWMA) of packet length. Or (2) in case >>>> virtnet_get_headroom() have any headroom then buffer size is >>>> PAGE_SIZE. The ctx pointer is this time used for encoding two values; >>>> the buffer len "truesize" and headroom. In case (1) if the rx buffer >>>> size is underestimated, the packet will have been split over more >>>> buffers (num_buf info in virtio_net_hdr_mrg_rxbuf placed in top of >>>> buffer area). If that happens the XDP path does a xdp_linearize_page >>>> operation. >>>> >>>> Cc: Jason Wang<jasowang@redhat.com> >>>> Signed-off-by: Jesper Dangaard Brouer<brouer@redhat.com> >>>> --- >>>> drivers/net/virtio_net.c | 15 ++++++++++++--- >>>> 1 file changed, 12 insertions(+), 3 deletions(-) >>>> >>>> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c >>>> index 11f722460513..1df3676da185 100644 >>>> --- a/drivers/net/virtio_net.c >>>> +++ b/drivers/net/virtio_net.c >>>> @@ -689,6 +689,7 @@ static struct sk_buff *receive_small(struct >>>> net_device *dev, >>>> xdp.data_end = xdp.data + len; >>>> xdp.data_meta = xdp.data; >>>> xdp.rxq = &rq->xdp_rxq; >>>> + xdp.frame_sz = buflen; >>>> orig_data = xdp.data; >>>> act = bpf_prog_run_xdp(xdp_prog, &xdp); >>>> stats->xdp_packets++; >>>> @@ -797,10 +798,11 @@ static struct sk_buff >>>> *receive_mergeable(struct net_device *dev, >>>> int offset = buf - page_address(page); >>>> struct sk_buff *head_skb, *curr_skb; >>>> struct bpf_prog *xdp_prog; >>>> - unsigned int truesize; >>>> + unsigned int truesize = mergeable_ctx_to_truesize(ctx); >>>> unsigned int headroom = mergeable_ctx_to_headroom(ctx); >>>> - int err; >>>> unsigned int metasize = 0; >>>> + unsigned int frame_sz; >>>> + int err; >>>> head_skb = NULL; >>>> stats->bytes += len - vi->hdr_len; >>>> @@ -821,6 +823,11 @@ static struct sk_buff >>>> *receive_mergeable(struct net_device *dev, >>>> if (unlikely(hdr->hdr.gso_type)) >>>> goto err_xdp; >>>> + /* Buffers with headroom use PAGE_SIZE as alloc size, >>>> + * see add_recvbuf_mergeable() + get_mergeable_buf_len() >>>> + */ >>>> + frame_sz = headroom ? PAGE_SIZE : truesize; >>>> + >>>> /* This happens when rx buffer size is underestimated >>>> * or headroom is not enough because of the buffer >>>> * was refilled before XDP is set. This should only >>>> @@ -834,6 +841,8 @@ static struct sk_buff *receive_mergeable(struct >>>> net_device *dev, >>>> page, offset, >>>> VIRTIO_XDP_HEADROOM, >>>> &len); >>>> + frame_sz = PAGE_SIZE; >>> Should this be PAGE_SIZE - SKB_DATA_ALIGN(sizeof(struct >>> skb_shared_info))? >> No, frame_sz include the SKB_DATA_ALIGN(sizeof(struct >> skb_shared_info)) length. > > > Ok, consider mergeable buffer path depends on the truesize which is > encoded in ctx. > > It looks to the the calculation in add_recvfbuf_mergeable() is wrong, > we need count both headroom and tailroom there. > > We probably need the attached 2 patches to fix this. > > (untested, will test it tomorrow). Sorry for the late reply. I gave a test and post the attached two patches (with minor tweaks). It looks to me they are required for this patch to work since data_hard_start excludes vnet hdr len without the attached patches which means PAGE_SIZE could not be used as frame_sz. Thanks > > Thanks
diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c index 11f722460513..1df3676da185 100644 --- a/drivers/net/virtio_net.c +++ b/drivers/net/virtio_net.c @@ -689,6 +689,7 @@ static struct sk_buff *receive_small(struct net_device *dev, xdp.data_end = xdp.data + len; xdp.data_meta = xdp.data; xdp.rxq = &rq->xdp_rxq; + xdp.frame_sz = buflen; orig_data = xdp.data; act = bpf_prog_run_xdp(xdp_prog, &xdp); stats->xdp_packets++; @@ -797,10 +798,11 @@ static struct sk_buff *receive_mergeable(struct net_device *dev, int offset = buf - page_address(page); struct sk_buff *head_skb, *curr_skb; struct bpf_prog *xdp_prog; - unsigned int truesize; + unsigned int truesize = mergeable_ctx_to_truesize(ctx); unsigned int headroom = mergeable_ctx_to_headroom(ctx); - int err; unsigned int metasize = 0; + unsigned int frame_sz; + int err; head_skb = NULL; stats->bytes += len - vi->hdr_len; @@ -821,6 +823,11 @@ static struct sk_buff *receive_mergeable(struct net_device *dev, if (unlikely(hdr->hdr.gso_type)) goto err_xdp; + /* Buffers with headroom use PAGE_SIZE as alloc size, + * see add_recvbuf_mergeable() + get_mergeable_buf_len() + */ + frame_sz = headroom ? PAGE_SIZE : truesize; + /* This happens when rx buffer size is underestimated * or headroom is not enough because of the buffer * was refilled before XDP is set. This should only @@ -834,6 +841,8 @@ static struct sk_buff *receive_mergeable(struct net_device *dev, page, offset, VIRTIO_XDP_HEADROOM, &len); + frame_sz = PAGE_SIZE; + if (!xdp_page) goto err_xdp; offset = VIRTIO_XDP_HEADROOM; @@ -850,6 +859,7 @@ static struct sk_buff *receive_mergeable(struct net_device *dev, xdp.data_end = xdp.data + (len - vi->hdr_len); xdp.data_meta = xdp.data; xdp.rxq = &rq->xdp_rxq; + xdp.frame_sz = frame_sz; act = bpf_prog_run_xdp(xdp_prog, &xdp); stats->xdp_packets++; @@ -924,7 +934,6 @@ static struct sk_buff *receive_mergeable(struct net_device *dev, } rcu_read_unlock(); - truesize = mergeable_ctx_to_truesize(ctx); if (unlikely(len > truesize)) { pr_debug("%s: rx error: len %u exceeds truesize %lu\n", dev->name, len, (unsigned long)ctx);
The virtio_net driver is running inside the guest-OS. There are two XDP receive code-paths in virtio_net, namely receive_small() and receive_mergeable(). The receive_big() function does not support XDP. In receive_small() the frame size is available in buflen. The buffer backing these frames are allocated in add_recvbuf_small() with same size, except for the headroom, but tailroom have reserved room for skb_shared_info. The headroom is encoded in ctx pointer as a value. In receive_mergeable() the frame size is more dynamic. There are two basic cases: (1) buffer size is based on a exponentially weighted moving average (see DECLARE_EWMA) of packet length. Or (2) in case virtnet_get_headroom() have any headroom then buffer size is PAGE_SIZE. The ctx pointer is this time used for encoding two values; the buffer len "truesize" and headroom. In case (1) if the rx buffer size is underestimated, the packet will have been split over more buffers (num_buf info in virtio_net_hdr_mrg_rxbuf placed in top of buffer area). If that happens the XDP path does a xdp_linearize_page operation. Cc: Jason Wang <jasowang@redhat.com> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> --- drivers/net/virtio_net.c | 15 ++++++++++++--- 1 file changed, 12 insertions(+), 3 deletions(-)