Message ID: 20230825170448.1953409-1-i.maximets@ovn.org
State: New
Series: virtio: use shadow_avail_idx while checking number of heads
On 8/25/23 19:04, Ilya Maximets wrote:
> We do not need the most up to date number of heads, we only want to
> know if there is at least one.
>
> Use shadow variable as long as it is not equal to the last available
> index checked. This avoids expensive qatomic dereference of the
> RCU-protected memory region cache as well as the memory access itself
> and the subsequent memory barrier.
>
> The change improves performance of the af-xdp network backend by 2-3%.
>
> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
> ---

Kind reminder.

Best regards, Ilya Maximets.

> hw/virtio/virtio.c | 10 +++++++++-
> 1 file changed, 9 insertions(+), 1 deletion(-)
>
> diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
> index 309038fd46..04bf7cc977 100644
> --- a/hw/virtio/virtio.c
> +++ b/hw/virtio/virtio.c
> @@ -999,7 +999,15 @@ void virtqueue_push(VirtQueue *vq, const VirtQueueElement *elem,
>  /* Called within rcu_read_lock(). */
>  static int virtqueue_num_heads(VirtQueue *vq, unsigned int idx)
>  {
> -    uint16_t num_heads = vring_avail_idx(vq) - idx;
> +    uint16_t num_heads;
> +
> +    if (vq->shadow_avail_idx != idx) {
> +        num_heads = vq->shadow_avail_idx - idx;
> +
> +        return num_heads;
> +    }
> +
> +    num_heads = vring_avail_idx(vq) - idx;
>
>      /* Check it isn't doing very strange things with descriptor numbers. */
>      if (num_heads > vq->vring.num) {
On Fri, 25 Aug 2023 at 13:04, Ilya Maximets <i.maximets@ovn.org> wrote:
>
> We do not need the most up to date number of heads, we only want to
> know if there is at least one.
>
> Use shadow variable as long as it is not equal to the last available
> index checked. This avoids expensive qatomic dereference of the
> RCU-protected memory region cache as well as the memory access itself
> and the subsequent memory barrier.
>
> The change improves performance of the af-xdp network backend by 2-3%.
>
> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
> ---
> hw/virtio/virtio.c | 10 +++++++++-
> 1 file changed, 9 insertions(+), 1 deletion(-)
>
> diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
> index 309038fd46..04bf7cc977 100644
> --- a/hw/virtio/virtio.c
> +++ b/hw/virtio/virtio.c
> @@ -999,7 +999,15 @@ void virtqueue_push(VirtQueue *vq, const VirtQueueElement *elem,
>  /* Called within rcu_read_lock(). */
>  static int virtqueue_num_heads(VirtQueue *vq, unsigned int idx)
>  {
> -    uint16_t num_heads = vring_avail_idx(vq) - idx;
> +    uint16_t num_heads;
> +
> +    if (vq->shadow_avail_idx != idx) {
> +        num_heads = vq->shadow_avail_idx - idx;
> +
> +        return num_heads;

This still needs to check num_heads > vq->vring.num and return -EINVAL
as is done below.

> +    }
> +
> +    num_heads = vring_avail_idx(vq) - idx;
>
>      /* Check it isn't doing very strange things with descriptor numbers. */
>      if (num_heads > vq->vring.num) {
> --
> 2.40.1
On 9/25/23 16:23, Stefan Hajnoczi wrote:
> On Fri, 25 Aug 2023 at 13:04, Ilya Maximets <i.maximets@ovn.org> wrote:
>>
>> We do not need the most up to date number of heads, we only want to
>> know if there is at least one.
>>
[...]
>>  static int virtqueue_num_heads(VirtQueue *vq, unsigned int idx)
>>  {
>> -    uint16_t num_heads = vring_avail_idx(vq) - idx;
>> +    uint16_t num_heads;
>> +
>> +    if (vq->shadow_avail_idx != idx) {
>> +        num_heads = vq->shadow_avail_idx - idx;
>> +
>> +        return num_heads;
>
> This still needs to check num_heads > vq->vring.num and return -EINVAL
> as is done below.

Hmm, yes, you're right. If the value was incorrect initially, the shadow
will be incorrect. However, I think we should just not return here in this
case and let vring_avail_idx() grab an actual new value below. Otherwise
we may never break out of this error.

Does that make sense?

>
>> +    }
>> +
>> +    num_heads = vring_avail_idx(vq) - idx;
>>
>>      /* Check it isn't doing very strange things with descriptor numbers. */
>>      if (num_heads > vq->vring.num) {
>> --
>> 2.40.1
On Mon, 25 Sept 2023 at 11:02, Ilya Maximets <i.maximets@ovn.org> wrote: > > On 9/25/23 16:23, Stefan Hajnoczi wrote: > > On Fri, 25 Aug 2023 at 13:04, Ilya Maximets <i.maximets@ovn.org> wrote: > >> > >> We do not need the most up to date number of heads, we only want to > >> know if there is at least one. > >> > >> Use shadow variable as long as it is not equal to the last available > >> index checked. This avoids expensive qatomic dereference of the > >> RCU-protected memory region cache as well as the memory access itself > >> and the subsequent memory barrier. > >> > >> The change improves performance of the af-xdp network backend by 2-3%. > >> > >> Signed-off-by: Ilya Maximets <i.maximets@ovn.org> > >> --- > >> hw/virtio/virtio.c | 10 +++++++++- > >> 1 file changed, 9 insertions(+), 1 deletion(-) > >> > >> diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c > >> index 309038fd46..04bf7cc977 100644 > >> --- a/hw/virtio/virtio.c > >> +++ b/hw/virtio/virtio.c > >> @@ -999,7 +999,15 @@ void virtqueue_push(VirtQueue *vq, const VirtQueueElement *elem, > >> /* Called within rcu_read_lock(). */ > >> static int virtqueue_num_heads(VirtQueue *vq, unsigned int idx) > >> { > >> - uint16_t num_heads = vring_avail_idx(vq) - idx; > >> + uint16_t num_heads; > >> + > >> + if (vq->shadow_avail_idx != idx) { > >> + num_heads = vq->shadow_avail_idx - idx; > >> + > >> + return num_heads; > > > > This still needs to check num_heads > vq->vring.num and return -EINVAL > > as is done below. > > Hmm, yeas, you're right. If the value was incorrect initially, the shadow > will be incorrect. However, I think we should just not return here in this > case and let vring_avail_idx() to grab an actual new value below. Otherwise > we may never break out of this error. > > Does that make sense? No, because virtio_error() marks the device as broken. The device requires a reset in order to function again. Fetching vring_avail_idx() again won't help. 
> > > > >> + } > >> + > >> + num_heads = vring_avail_idx(vq) - idx; > >> > >> /* Check it isn't doing very strange things with descriptor numbers. */ > >> if (num_heads > vq->vring.num) { > >> -- > >> 2.40.1 > >> > >> >
On 9/25/23 17:12, Stefan Hajnoczi wrote:
> On Mon, 25 Sept 2023 at 11:02, Ilya Maximets <i.maximets@ovn.org> wrote:
>>
>> On 9/25/23 16:23, Stefan Hajnoczi wrote:
>>> This still needs to check num_heads > vq->vring.num and return -EINVAL
>>> as is done below.
>>
>> Hmm, yes, you're right. If the value was incorrect initially, the shadow
>> will be incorrect. However, I think we should just not return here in this
>> case and let vring_avail_idx() grab an actual new value below. Otherwise
>> we may never break out of this error.
>>
>> Does that make sense?
>
> No, because virtio_error() marks the device as broken. The device
> requires a reset in order to function again. Fetching
> vring_avail_idx() again won't help.

OK, I see. In this case we're talking about a situation where
vring_avail_idx() was called in some other place and stored a bad value
in the shadow variable, then virtqueue_num_heads() got called. Right?

AFAIU, we can still just fall through here and let vring_avail_idx()
read the index again and fail the existing check. That would happen
today without this patch applied.

I'm just trying to avoid duplication of the virtio_error call, i.e.:

    if (vq->shadow_avail_idx != idx) {
        num_heads = vq->shadow_avail_idx - idx;

        /* Check it isn't doing very strange things with descriptor numbers. */
        if (num_heads > vq->vring.num) {
            virtio_error(vq->vdev, "Guest moved used index from %u to %u",
                         idx, vq->shadow_avail_idx);
            return -EINVAL;
        }
        return num_heads;
    }

vs

    if (vq->shadow_avail_idx != idx) {
        num_heads = vq->shadow_avail_idx - idx;

        /* Only use the shadow value if it was good initially. */
        if (num_heads <= vq->vring.num) {
            return num_heads;
        }
    }

What do you think?

Best regards, Ilya Maximets.
On Mon, 25 Sept 2023 at 11:36, Ilya Maximets <i.maximets@ovn.org> wrote:
>
> On 9/25/23 17:12, Stefan Hajnoczi wrote:
> > No, because virtio_error() marks the device as broken. The device
> > requires a reset in order to function again. Fetching
> > vring_avail_idx() again won't help.
>
> OK, I see. In this case we're talking about a situation where
> vring_avail_idx() was called in some other place and stored a bad value
> in the shadow variable, then virtqueue_num_heads() got called. Right?
>
> AFAIU, we can still just fall through here and let vring_avail_idx()
> read the index again and fail the existing check. That would happen
> today without this patch applied.

Yes, that is fine.

> I'm just trying to avoid duplication of the virtio_error call, i.e.:
>
[... the two variants quoted above ...]
>
> What do you think?

Sounds good.

> Best regards, Ilya Maximets.
On 9/25/23 17:38, Stefan Hajnoczi wrote:
> On Mon, 25 Sept 2023 at 11:36, Ilya Maximets <i.maximets@ovn.org> wrote:
>>
>> On 9/25/23 17:12, Stefan Hajnoczi wrote:
>>> No, because virtio_error() marks the device as broken. The device
>>> requires a reset in order to function again. Fetching
>>> vring_avail_idx() again won't help.
>>
>> OK, I see. In this case we're talking about a situation where
>> vring_avail_idx() was called in some other place and stored a bad value
>> in the shadow variable, then virtqueue_num_heads() got called. Right?

Hmm, I suppose we also need a read barrier after all, even if we use
a shadow index. Assuming the index is correct, but the shadow variable
was updated by a call outside of this function, then we may miss a
barrier and read the descriptor out of order, in theory. The read barrier
is going to be a compiler barrier on x86, so the performance gain from
this patch should still be mostly there. I'll test that.

>>
>> AFAIU, we can still just fall through here and let vring_avail_idx()
>> read the index again and fail the existing check. That would happen
>> today without this patch applied.
>
> Yes, that is fine.
>
>> I'm just trying to avoid duplication of the virtio_error call, i.e.:
>>
[... the two variants quoted above ...]
>>
>> What do you think?
>
> Sounds good.
>
>> Best regards, Ilya Maximets.
On Mon, Sep 25, 2023 at 10:58:05PM +0200, Ilya Maximets wrote:
> On 9/25/23 17:38, Stefan Hajnoczi wrote:
> > On Mon, 25 Sept 2023 at 11:36, Ilya Maximets <i.maximets@ovn.org> wrote:
> >>
[...]
> >> OK, I see. In this case we're talking about a situation where
> >> vring_avail_idx() was called in some other place and stored a bad value
> >> in the shadow variable, then virtqueue_num_heads() got called. Right?
>
> Hmm, I suppose we also need a read barrier after all, even if we use
> a shadow index. Assuming the index is correct, but the shadow variable
> was updated by a call outside of this function, then we may miss a
> barrier and read the descriptor out of order, in theory. The read barrier
> is going to be a compiler barrier on x86, so the performance gain from
> this patch should still be mostly there. I'll test that.

I can't say I understand generally. The shadow is under QEMU control;
I don't think it can be updated concurrently by multiple CPUs.

> >>
> >> AFAIU, we can still just fall through here and let vring_avail_idx()
> >> read the index again and fail the existing check. That would happen
> >> today without this patch applied.
> >
> > Yes, that is fine.
> >
> >> I'm just trying to avoid duplication of the virtio_error call, i.e.:
> >>
[... the two variants quoted above ...]
> >>
> >> What do you think?
> >
> > Sounds good.
> >
> >> Best regards, Ilya Maximets.
On 9/25/23 23:24, Michael S. Tsirkin wrote:
> On Mon, Sep 25, 2023 at 10:58:05PM +0200, Ilya Maximets wrote:
>>
>> Hmm, I suppose we also need a read barrier after all, even if we use
>> a shadow index. Assuming the index is correct, but the shadow variable
>> was updated by a call outside of this function, then we may miss a
>> barrier and read the descriptor out of order, in theory. The read barrier
>> is going to be a compiler barrier on x86, so the performance gain from
>> this patch should still be mostly there. I'll test that.
>
> I can't say I understand generally. The shadow is under QEMU control;
> I don't think it can be updated concurrently by multiple CPUs.

It can't, I agree. The scenario I'm thinking about is the following:

1. vring_avail_idx() is called from one of the places other than
   virtqueue_num_heads(). The shadow is updated with the current value.
   Some users of vring_avail_idx() do not use barriers after the call.

2. virtqueue_split_get_avail_bytes() is called.

3. virtqueue_split_get_avail_bytes() calls virtqueue_num_heads().

4. virtqueue_num_heads() checks the shadow and returns early.

5. virtqueue_split_get_avail_bytes() calls vring_split_desc_read() and
   reads the descriptor.

If between steps 1 and 5 we do not have a read barrier, we potentially
risk reading descriptor data that is not yet fully written, because
there is no guarantee that reading the last_avail_idx on step 1 wasn't
reordered with the descriptor read.

In the current code we always have smp_rmb() in virtqueue_num_heads().
But if we return from this function without a barrier, we may have an
issue, IIUC.

I agree that it's kind of a very unlikely scenario and we will probably
have a control dependency between steps 1 and 5 that will prevent the
issue, but it might be safer to just have an explicit barrier in
virtqueue_num_heads().

Does that make sense? Or am I missing something else here?

[... earlier discussion and the two variants quoted above ...]
On Tue, Sep 26, 2023 at 12:13:11AM +0200, Ilya Maximets wrote:
> On 9/25/23 23:24, Michael S. Tsirkin wrote:
> > I can't say I understand generally. The shadow is under QEMU control;
> > I don't think it can be updated concurrently by multiple CPUs.
>
> It can't, I agree. The scenario I'm thinking about is the following:
>
[... steps 1-5 and the barrier reasoning quoted above ...]
>
> I agree that it's kind of a very unlikely scenario and we will probably
> have a control dependency between steps 1 and 5 that will prevent the
> issue, but it might be safer to just have an explicit barrier in
> virtqueue_num_heads().
>
> Does that make sense? Or am I missing something else here?

Aha, got it. Good point, yes. Pls document in a code comment.
On 9/26/23 00:24, Michael S. Tsirkin wrote:
> On Tue, Sep 26, 2023 at 12:13:11AM +0200, Ilya Maximets wrote:
>> On 9/25/23 23:24, Michael S. Tsirkin wrote:
>>> On Mon, Sep 25, 2023 at 10:58:05PM +0200, Ilya Maximets wrote:
>>>> On 9/25/23 17:38, Stefan Hajnoczi wrote:
>>>>> On Mon, 25 Sept 2023 at 11:36, Ilya Maximets <i.maximets@ovn.org> wrote:
>>>>>>
>>>>>> On 9/25/23 17:12, Stefan Hajnoczi wrote:
>>>>>>> On Mon, 25 Sept 2023 at 11:02, Ilya Maximets <i.maximets@ovn.org> wrote:
>>>>>>>>
>>>>>>>> On 9/25/23 16:23, Stefan Hajnoczi wrote:
>>>>>>>>> On Fri, 25 Aug 2023 at 13:04, Ilya Maximets <i.maximets@ovn.org> wrote:
>>>>>>>>>>
>>>>>>>>>> We do not need the most up to date number of heads, we only want to
>>>>>>>>>> know if there is at least one.
>>>>>>>>>>
>>>>>>>>>> Use shadow variable as long as it is not equal to the last available
>>>>>>>>>> index checked. This avoids expensive qatomic dereference of the
>>>>>>>>>> RCU-protected memory region cache as well as the memory access itself
>>>>>>>>>> and the subsequent memory barrier.
>>>>>>>>>>
>>>>>>>>>> The change improves performance of the af-xdp network backend by 2-3%.
>>>>>>>>>>
>>>>>>>>>> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
>>>>>>>>>> ---
>>>>>>>>>>  hw/virtio/virtio.c | 10 +++++++++-
>>>>>>>>>>  1 file changed, 9 insertions(+), 1 deletion(-)
>>>>>>>>>>
>>>>>>>>>> diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
>>>>>>>>>> index 309038fd46..04bf7cc977 100644
>>>>>>>>>> --- a/hw/virtio/virtio.c
>>>>>>>>>> +++ b/hw/virtio/virtio.c
>>>>>>>>>> @@ -999,7 +999,15 @@ void virtqueue_push(VirtQueue *vq, const VirtQueueElement *elem,
>>>>>>>>>>  /* Called within rcu_read_lock(). */
>>>>>>>>>>  static int virtqueue_num_heads(VirtQueue *vq, unsigned int idx)
>>>>>>>>>>  {
>>>>>>>>>> -    uint16_t num_heads = vring_avail_idx(vq) - idx;
>>>>>>>>>> +    uint16_t num_heads;
>>>>>>>>>> +
>>>>>>>>>> +    if (vq->shadow_avail_idx != idx) {
>>>>>>>>>> +        num_heads = vq->shadow_avail_idx - idx;
>>>>>>>>>> +
>>>>>>>>>> +        return num_heads;
>>>>>>>>>
>>>>>>>>> This still needs to check num_heads > vq->vring.num and return -EINVAL
>>>>>>>>> as is done below.
>>>>>>>>
>>>>>>>> Hmm, yes, you're right. If the value was incorrect initially, the shadow
>>>>>>>> will be incorrect. However, I think we should just not return here in this
>>>>>>>> case and let vring_avail_idx() grab an actual new value below. Otherwise
>>>>>>>> we may never break out of this error.
>>>>>>>>
>>>>>>>> Does that make sense?
>>>>>>>
>>>>>>> No, because virtio_error() marks the device as broken. The device
>>>>>>> requires a reset in order to function again. Fetching
>>>>>>> vring_avail_idx() again won't help.
>>>>>>
>>>>>> OK, I see. In this case we're talking about a situation where
>>>>>> vring_avail_idx() was called in some other place and stored a bad value
>>>>>> in the shadow variable, then virtqueue_num_heads() got called. Right?
>>>>
>>>> Hmm, I suppose we also need a read barrier after all even if we use
>>>> a shadow index. Assuming the index is correct, but the shadow variable
>>>> was updated by a call outside of this function, then we may miss a
>>>> barrier and read the descriptor out of order, in theory. Read barrier
>>>> is going to be a compiler barrier on x86, so the performance gain from
>>>> this patch should still be mostly there. I'll test that.
>>>
>>> I can't say I understand generally. shadow is under qemu control,
>>> I don't think it can be updated concurrently by multiple CPUs.
>>
>> It can't, I agree. The scenario I'm thinking about is the following:
>>
>> 1. vring_avail_idx() is called from one of the places other than
>>    virtqueue_num_heads(). Shadow is updated with the current value.
>>    Some users of vring_avail_idx() do not use barriers after the call.
>>
>> 2. virtqueue_split_get_avail_bytes() is called.
>>
>> 3. virtqueue_split_get_avail_bytes() calls virtqueue_num_heads().
>>
>> 4. virtqueue_num_heads() checks the shadow and returns early.
>>
>> 5. virtqueue_split_get_avail_bytes() calls vring_split_desc_read() and
>>    reads the descriptor.
>>
>> If between steps 1 and 5 we do not have a read barrier, we potentially
>> risk reading descriptor data that is not yet fully written, because
>> there is no guarantee that reading the last_avail_idx on step 1 wasn't
>> reordered with the descriptor read.
>>
>> In the current code we always have smp_rmb() in virtqueue_num_heads().
>> But if we return from this function without a barrier, we may have an
>> issue, IIUC.
>>
>> I agree that it's kind of a very unlikely scenario and we will probably
>> have a control dependency between steps 1 and 5 that will prevent the
>> issue, but it might be safer to just have an explicit barrier in
>> virtqueue_num_heads().
>>
>> Does that make sense? Or am I missing something else here?
>
> Aha, got it. Good point, yes. Pls document in a code comment.

Done. Posted v2:
  https://lore.kernel.org/qemu-devel/20230927135157.2316982-1-i.maximets@ovn.org/

>
>
>>>
>>>
>>>>>>
>>>>>> AFAIU, we can still just fall through here and let vring_avail_idx()
>>>>>> read the index again and fail the existing check. That would happen
>>>>>> today without this patch applied.
>>>>>
>>>>> Yes, that is fine.
>>>>>
>>>>>>
>>>>>> I'm just trying to avoid duplication of the virtio_error call, i.e.:
>>>>>>
>>>>>>     if (vq->shadow_avail_idx != idx) {
>>>>>>         num_heads = vq->shadow_avail_idx - idx;
>>>>>>
>>>>>>         /* Check it isn't doing very strange things with descriptor numbers. */
>>>>>>         if (num_heads > vq->vring.num) {
>>>>>>             virtio_error(vq->vdev, "Guest moved used index from %u to %u",
>>>>>>                          idx, vq->shadow_avail_idx);
>>>>>>             return -EINVAL;
>>>>>>         }
>>>>>>         return num_heads;
>>>>>>     }
>>>>>>
>>>>>> vs
>>>>>>
>>>>>>     if (vq->shadow_avail_idx != idx) {
>>>>>>         num_heads = vq->shadow_avail_idx - idx;
>>>>>>
>>>>>>         /* Only use the shadow value if it was good initially. */
>>>>>>         if (num_heads <= vq->vring.num) {
>>>>>>             return num_heads;
>>>>>>         }
>>>>>>     }
>>>>>>
>>>>>>
>>>>>> What do you think?
>>>>>
>>>>> Sounds good.
>>>>>
>>>>>> Best regards, Ilya Maximets.
>>>
>
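The shape the thread converges on (bounds-check the shadow value before trusting it, and keep a read barrier on the early-return path) can be sketched in a standalone model. Note this is a sketch of the discussed logic, not QEMU code: the `VirtQueue` fields, `vring_avail_idx()`, `smp_rmb()`, and `virtio_error()` below are simplified stand-ins for the real definitions.

```c
#include <errno.h>
#include <stdint.h>
#include <stdio.h>

/* Simplified stand-ins for the QEMU types and helpers. */
typedef struct {
    uint16_t shadow_avail_idx;   /* last avail index read from the ring */
    uint16_t ring_avail_idx;     /* models the guest-visible avail->idx */
    unsigned int vring_num;      /* queue size */
    int broken;                  /* models virtio_error() marking the device */
} VirtQueue;

/* The real smp_rmb() is a compiler barrier on x86; a no-op suffices here. */
#define smp_rmb() ((void)0)

static uint16_t vring_avail_idx(VirtQueue *vq)
{
    vq->shadow_avail_idx = vq->ring_avail_idx;
    return vq->shadow_avail_idx;
}

static void virtio_error(VirtQueue *vq, const char *msg)
{
    vq->broken = 1;
    fprintf(stderr, "virtio error: %s\n", msg);
}

static int virtqueue_num_heads(VirtQueue *vq, unsigned int idx)
{
    uint16_t num_heads;

    if (vq->shadow_avail_idx != idx) {
        num_heads = vq->shadow_avail_idx - idx;

        /* Only use the shadow value if it was good initially. */
        if (num_heads <= vq->vring_num) {
            /* Barrier discussed in the thread: the shadow may have been
             * updated by a caller that issued no barrier, so order the
             * index read against later descriptor reads here as well. */
            smp_rmb();
            return num_heads;
        }
    }

    num_heads = vring_avail_idx(vq) - idx;

    /* Check it isn't doing very strange things with descriptor numbers. */
    if (num_heads > vq->vring_num) {
        virtio_error(vq, "Guest moved used index");
        return -EINVAL;
    }
    /* Make sure a following descriptor read can't bypass the index read. */
    if (num_heads) {
        smp_rmb();
    }
    return num_heads;
}
```

With a 256-entry queue, a shadow index of 5 and `idx == 3` takes the fast path and returns 2 without touching the ring; when the shadow equals `idx`, the ring is re-read and the existing sanity check still fires for a bogus guest index.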
diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
index 309038fd46..04bf7cc977 100644
--- a/hw/virtio/virtio.c
+++ b/hw/virtio/virtio.c
@@ -999,7 +999,15 @@ void virtqueue_push(VirtQueue *vq, const VirtQueueElement *elem,
 /* Called within rcu_read_lock(). */
 static int virtqueue_num_heads(VirtQueue *vq, unsigned int idx)
 {
-    uint16_t num_heads = vring_avail_idx(vq) - idx;
+    uint16_t num_heads;
+
+    if (vq->shadow_avail_idx != idx) {
+        num_heads = vq->shadow_avail_idx - idx;
+
+        return num_heads;
+    }
+
+    num_heads = vring_avail_idx(vq) - idx;

     /* Check it isn't doing very strange things with descriptor numbers. */
     if (num_heads > vq->vring.num) {
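One detail that makes both the shadow fast path and the re-read correct is that `num_heads` is an unsigned 16-bit difference of free-running ring indices, so it stays valid even when the avail index wraps past 65535. A minimal illustration (the helper name is mine, not QEMU's):

```c
#include <stdint.h>

/* Ring indices are free-running uint16_t counters; their difference is
 * taken modulo 2^16, so it still counts pending heads after a wrap. */
static uint16_t pending_heads(uint16_t avail_idx, uint16_t last_avail_idx)
{
    return (uint16_t)(avail_idx - last_avail_idx);
}
```

For example, an avail index of 2 with a last-seen index of 65530 yields 8 pending heads, exactly as if the counters had not wrapped.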
We do not need the most up to date number of heads, we only want to
know if there is at least one.

Use shadow variable as long as it is not equal to the last available
index checked. This avoids expensive qatomic dereference of the
RCU-protected memory region cache as well as the memory access itself
and the subsequent memory barrier.

The change improves performance of the af-xdp network backend by 2-3%.

Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
---
 hw/virtio/virtio.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)