[0/5] VSOCK: support mergeable rx buffer in vhost-vsock

Message ID 5BDFF49C.3040603@huawei.com

Message

jiangyiwen Nov. 5, 2018, 7:43 a.m. UTC
Currently vsock only supports sending and receiving small packets, so it
can't achieve high performance. As previously discussed with Jason Wang, I
revisited the vhost-net idea of mergeable rx buffers and implemented them
in vhost-vsock. This allows a big packet to be scattered across several
buffers and improves performance significantly.
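
To illustrate the idea (this is a sketch, not the code from this series): with mergeable rx buffers, one large packet is spread over as many fixed-size receive buffers as it needs, and the receiver is told how many buffers belong together, much like the `num_buffers` count in virtio-net. The buffer size `RX_BUF_SIZE` and the helpers `bufs_needed()` and `scatter_packet()` below are hypothetical names chosen for the example, and the packet header is ignored for simplicity.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical receive-buffer size; the series may choose a different one. */
#define RX_BUF_SIZE 4096u

/* Number of rx buffers that a packet of `len` bytes would span. */
static size_t bufs_needed(size_t len)
{
    return (len + RX_BUF_SIZE - 1) / RX_BUF_SIZE;
}

/*
 * Scatter `len` bytes from `pkt` across up to `nbufs` fixed-size buffers.
 * Returns the number of buffers used. Returns 0 for an empty packet or
 * when the packet does not fit into the buffers provided.
 */
static size_t scatter_packet(const unsigned char *pkt, size_t len,
                             unsigned char bufs[][RX_BUF_SIZE], size_t nbufs)
{
    size_t used = bufs_needed(len);
    size_t off = 0;

    if (used > nbufs)
        return 0;

    for (size_t i = 0; i < used; i++) {
        size_t left = len - off;
        size_t chunk = left < RX_BUF_SIZE ? left : RX_BUF_SIZE;

        memcpy(bufs[i], pkt + off, chunk);
        off += chunk;
    }
    return used;
}
```

Under these assumptions a 64K packet spans 16 buffers of 4096 bytes, while a small packet occupies a single buffer instead of a large preallocated one.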

I wrote a tool to test vhost-vsock performance, mainly by sending big
packets (64K) in both directions, Guest->Host and Host->Guest. The results
are as follows:

Before performance:
              Single socket            Multiple sockets(Max Bandwidth)
Guest->Host   ~400MB/s                 ~480MB/s
Host->Guest   ~1450MB/s                ~1600MB/s

After performance:
              Single socket            Multiple sockets(Max Bandwidth)
Guest->Host   ~1700MB/s                ~2900MB/s
Host->Guest   ~1700MB/s                ~2900MB/s

The test results show a significant performance improvement, and guest
memory will not be wasted.

---

Yiwen Jiang (5):
  VSOCK: support fill mergeable rx buffer in guest
  VSOCK: support fill data to mergeable rx buffer in host
  VSOCK: support receive mergeable rx buffer in guest
  VSOCK: modify default rx buf size to improve performance
  VSOCK: batch sending rx buffer to increase bandwidth

 drivers/vhost/vsock.c                   | 135 +++++++++++++++++++++++------
 include/linux/virtio_vsock.h            |  15 +++-
 include/uapi/linux/virtio_vsock.h       |   5 ++
 net/vmw_vsock/virtio_transport.c        | 147 ++++++++++++++++++++++++++------
 net/vmw_vsock/virtio_transport_common.c |  59 +++++++++++--
 5 files changed, 300 insertions(+), 61 deletions(-)

Comments

Jason Wang Nov. 5, 2018, 9:21 a.m. UTC | #1
On 2018/11/5 3:43 PM, jiangyiwen wrote:
> Now vsock only support send/receive small packet, it can't achieve
> high performance. As previous discussed with Jason Wang, I revisit the
> idea of vhost-net about mergeable rx buffer and implement the mergeable
> rx buffer in vhost-vsock, it can allow big packet to be scattered in
> into different buffers and improve performance obviously.
>
> I write a tool to test the vhost-vsock performance, mainly send big
> packet(64K) included guest->Host and Host->Guest. The result as
> follows:
>
> Before performance:
>                Single socket            Multiple sockets(Max Bandwidth)
> Guest->Host   ~400MB/s                 ~480MB/s
> Host->Guest   ~1450MB/s                ~1600MB/s
>
> After performance:
>                Single socket            Multiple sockets(Max Bandwidth)
> Guest->Host   ~1700MB/s                ~2900MB/s
> Host->Guest   ~1700MB/s                ~2900MB/s
>
>  From the test results, the performance is improved obviously, and guest
> memory will not be wasted.


Hi:

Thanks for the patches and the numbers are really impressive.

But instead of duplicating code between vsock and net, I was considering
using virtio-net as a transport for vsock. Then we would get all the
existing features such as batching, mergeable rx buffers and multiqueue.
Would you consider this idea? Thoughts?


jiangyiwen Nov. 6, 2018, 2:17 a.m. UTC | #2
On 2018/11/5 17:21, Jason Wang wrote:
> 
> Hi:
> 
> Thanks for the patches and the numbers are really impressive.
> 
> But instead of duplicating codes between sock and net. I was considering to use virtio-net as a transport of vsock. Then we may have all existed features likes batching, mergeable rx buffers and multiqueue. Want to consider this idea? Thoughts?
> 
> 

Hi Jason,

I am not very familiar with virtio-net, so I am afraid I can't give much
useful advice. I do have several questions:

1. If we use virtio-net as a transport, the guest should see a virtio-net
device instead of a virtio-vsock device, right? Would vsock then serve only
as a transport between the socket and the net_device? Users should still
create sockets with the AF_VSOCK type, right?

2. Has work on this idea already started, and if so, how far along is it?

3. What does Stefan think?

Thanks,
Yiwen.

Jason Wang Nov. 6, 2018, 2:41 a.m. UTC | #3
On 2018/11/6 10:17 AM, jiangyiwen wrote:
> Hi Jason,
>
> I am not very familiar with virtio-net, so I am afraid I can't give too
> much effective advice. Then I have several problems:
>
> 1. If use virtio-net as a transport, guest should see a virtio-net
> device instead of virtio-vsock device, right? Is vsock only as a
> transport between socket and net_device? User should still use
> AF_VSOCK type to create socket, right?


Well, there are many choices. What you need is just to keep the socket
API and hide the implementation. For example, you can keep the vsock
device in the guest and switch to vhost-net in the host. We probably need a
new feature bit or header to let vhost know we are passing vsock packets.
vhost-net could then forward the packets to the vsock core on the host.


>
> 2. I want to know if this idea has already started, and how is
> the current progress?


Not yet started. I just want to hear from the community. If this sounds
good, are you interested in implementing it?


>
> 3. And what is stefan's idea?


I talked with Stefan a little about this during KVM Forum. I think he tends
to agree with this idea. Anyway, let's wait for his reply.


Thanks


jiangyiwen Nov. 6, 2018, 3:17 a.m. UTC | #4
On 2018/11/6 10:41, Jason Wang wrote:
> 
>> Hi Jason,
>>
>> I am not very familiar with virtio-net, so I am afraid I can't give too
>> much effective advice. Then I have several problems:
>>
>> 1. If use virtio-net as a transport, guest should see a virtio-net
>> device instead of virtio-vsock device, right? Is vsock only as a
>> transport between socket and net_device? User should still use
>> AF_VSOCK type to create socket, right?
> 
> 
> Well, there're many choices. What you need is just to keep the socket API and hide the implementation. For example, you can keep the vsock device in guest and switch to use vhost-net in host. We probably need a new feature bit or header to let vhost know we are passing vsock packet. And vhost-net could forward the packet to vsock core on host.
> 
> 
>>
>> 2. I want to know if this idea has already started, and how is
>> the current progress?
> 
> 
> Not yet started.  Just want to listen from the community. If this sounds good, do you have interest in implementing this?
> 
> 
>>
>> 3. And what is stefan's idea?
> 
> 
> Talk with Stefan a little on this during KVM Forum. I think he tends to agree on this idea. Anyway, let's wait for his reply.
> 
> 
> Thanks
> 
> 

Hi Jason,

Thanks for your reply. So what you want is to avoid duplicated code while
still using the existing virtio-net features.
Yes, if this sounds good and most people agree with the idea, I am very
happy to implement it.

In addition, I hope you can review these patches before the new idea is
implemented; after all, they already improve performance. :-)

Thanks,
Yiwen.

Jason Wang Nov. 6, 2018, 3:32 a.m. UTC | #5
On 2018/11/6 11:17 AM, jiangyiwen wrote:
> Hi Jason,
>
> Thanks your reply, what you want is try to avoid duplicate code, and still
> use the existed features with virtio-net.


Yes. Technically we could use the virtio-net driver in the guest as well,
but we can do it step by step.


> Yes, if this sounds good and most people can recognize this idea, I am very
> happy to implement this.


Cool, thanks.


>
> In addition, I hope you can review these patches before the new idea is
> implemented, after all the performance can be improved. :-)


Ok.


So the patch actually did three things:

- mergeable buffer implementation

- increase the default rx buffer size

- add used and signal guest in a batch

It would be helpful if you could measure the performance improvement of
each part independently. This would give reviewers a better understanding
of how much each part helps.

Thanks


jiangyiwen Nov. 6, 2018, 5:53 a.m. UTC | #6
On 2018/11/6 11:32, Jason Wang wrote:
> 
> So the patch actually did three things:
> 
> - mergeable buffer implementation
> 
> - increase the default rx buffer size
> 
> - add used and signal guest in a batch
> 
> It would be helpful if you can measure the performance improvement independently. This can give reviewer a better understanding on how much did each part help.
> 
> Thanks
> 
> 

Great, I will test the performance of each part independently in a later version.

Thanks,
Yiwen.

Stefan Hajnoczi Nov. 29, 2018, 2:19 p.m. UTC | #7
On Tue, Nov 06, 2018 at 01:53:54PM +0800, jiangyiwen wrote:
> 
> Great, I will test the performance independently in the later version.

I'm catching up on email so maybe you've already discussed this, but a
key design point in virtio-vsock is reliable in-order delivery.  When
using virtio-net code it's important to keep those properties so that
AF_VSOCK SOCK_STREAM sockets work as expected.  Packets must not be
reordered or dropped.

In addition, there's the virtio-vsock flow control scheme that allows
multiple sockets to share a ring without starvation or denial-of-service
problems.  The guest knows how much socket buffer space is available on
the host (and vice versa).  A well-behaved guest only sends up to the
available buffer space so that the host can copy the data into the
socket buffer and free up ring space for other sockets.  This scheme is
how virtio-vsock achieves guaranteed delivery while avoiding starvation
or denial-of-service.
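
The credit accounting described above can be sketched roughly as follows. This is an illustration under assumptions, not the kernel's implementation: the struct `credit_state` and the helpers `peer_free_space()` and `take_credit()` are hypothetical names, while `buf_alloc` and `fwd_cnt` are the counters carried in virtio-vsock packet headers. The sender tracks how many bytes it has transmitted and how many the peer reports having consumed, and never sends more than the difference allows.

```c
#include <assert.h>
#include <stdint.h>

/* One socket's view of the peer's receive buffer (hypothetical struct). */
struct credit_state {
    uint32_t peer_buf_alloc; /* total receive space advertised by the peer */
    uint32_t peer_fwd_cnt;   /* bytes the peer has consumed (free-running) */
    uint32_t tx_cnt;         /* bytes we have sent so far (free-running) */
};

/* Bytes that can still be sent without overrunning the peer's buffer. */
static uint32_t peer_free_space(const struct credit_state *cs)
{
    /* Unsigned subtraction stays correct when the counters wrap around. */
    return cs->peer_buf_alloc - (cs->tx_cnt - cs->peer_fwd_cnt);
}

/* Clamp a send request to the available credit and account for it. */
static uint32_t take_credit(struct credit_state *cs, uint32_t want)
{
    uint32_t free_space = peer_free_space(cs);
    uint32_t granted = want < free_space ? want : free_space;

    cs->tx_cnt += granted;
    return granted;
}
```

A sender that gets less than it asked for simply waits until a later packet from the peer carries a larger `fwd_cnt`, which is what keeps one busy socket from filling the shared ring.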

So you'll need to use some kind of framing (protocol) that preserves
these properties on top of virtio-net.  This framing could be based on
virtio-vsock's packet headers.
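
For reference, a sketch of what such a frame header might look like if it mirrored the virtio-vsock packet header. The field order follows `struct virtio_vsock_hdr` from the Linux UAPI header as I understand it; the struct name `vsock_hdr` and the helper `vsock_hdr_set_credit()` are hypothetical, and how the header would actually be carried over virtio-net is not decided in this thread. On the wire the fields are little-endian; the sketch stores host-order values and so assumes a little-endian host.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical frame header modelled on the virtio-vsock packet header. */
struct vsock_hdr {
    uint64_t src_cid;   /* source context ID */
    uint64_t dst_cid;   /* destination context ID */
    uint32_t src_port;
    uint32_t dst_port;
    uint32_t len;       /* payload length in bytes */
    uint16_t type;      /* socket type, e.g. stream */
    uint16_t op;        /* operation: request, response, data, ... */
    uint32_t flags;
    uint32_t buf_alloc; /* sender's total receive buffer space */
    uint32_t fwd_cnt;   /* bytes the sender has consumed so far */
} __attribute__((packed)); /* GCC/Clang extension */

/* The packed layout should be 44 bytes with no padding. */
_Static_assert(sizeof(struct vsock_hdr) == 44, "unexpected header size");

/* Stamp the flow-control fields that every outgoing packet carries. */
static void vsock_hdr_set_credit(struct vsock_hdr *hdr,
                                 uint32_t buf_alloc, uint32_t fwd_cnt)
{
    hdr->buf_alloc = buf_alloc; /* a portable version would convert to LE */
    hdr->fwd_cnt = fwd_cnt;
}
```

Because `buf_alloc` and `fwd_cnt` travel in every header, reusing the header would carry the flow-control information along with the data, whatever ring delivers it.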

Stefan
Jason Wang Nov. 30, 2018, 12:52 p.m. UTC | #8
On 2018/11/29 10:19 PM, Stefan Hajnoczi wrote:
> I'm catching up on email so maybe you've already discussed this, but a
> key design point in virtio-vsock is reliable in-order delivery.  When
> using virtio-net code it's important to keep those properties so that
> AF_VSOCK SOCK_STREAM sockets work as expected.  Packets must not be
> reordered or dropped.


Yes, vhost-net does not drop packets itself, and it's not hard to prevent
virtio-net from dropping packets.


>
> In addition, there's the virtio-vsock flow control scheme that allows
> multiple sockets to share a ring without starvation or denial-of-service
> problems.  The guest knows how much socket buffer space is available on
> the host (and vice versa).  A well-behaved guest only sends up to the
> available buffer space so that the host can copy the data into the
> socket buffer and free up ring space for other sockets.  This scheme is
> how virtio-vsock achieves guaranteed delivery while avoiding starvation
> or denial-of-service.
>
> So you'll need to use some kind of framing (protocol) that preserves
> these properties on top of virtio-net.  This framing could be based on
> virtio-vsock's packet headers.


The current plan is to reuse those headers.

Thanks


jiangyiwen Dec. 3, 2018, 6:08 a.m. UTC | #9
On 2018/11/29 22:19, Stefan Hajnoczi wrote:
> On Tue, Nov 06, 2018 at 01:53:54PM +0800, jiangyiwen wrote:
>> On 2018/11/6 11:32, Jason Wang wrote:
>>>
>>> On 2018/11/6 上午11:17, jiangyiwen wrote:
>>>> On 2018/11/6 10:41, Jason Wang wrote:
>>>>> On 2018/11/6 上午10:17, jiangyiwen wrote:
>>>>>> On 2018/11/5 17:21, Jason Wang wrote:
>>>>>>> On 2018/11/5 下午3:43, jiangyiwen wrote:
>>>>>>>> Now vsock only support send/receive small packet, it can't achieve
>>>>>>>> high performance. As previous discussed with Jason Wang, I revisit the
>>>>>>>> idea of vhost-net about mergeable rx buffer and implement the mergeable
>>>>>>>> rx buffer in vhost-vsock, it can allow big packet to be scattered in
>>>>>>>> into different buffers and improve performance obviously.
>>>>>>>>
>>>>>>>> I wrote a tool to test vhost-vsock performance, mainly sending big
>>>>>>>> packets (64K) in both the Guest->Host and Host->Guest directions. The
>>>>>>>> results are as follows:
>>>>>>>>
>>>>>>>> Before performance:
>>>>>>>>               Single socket            Multiple sockets(Max Bandwidth)
>>>>>>>> Guest->Host   ~400MB/s                 ~480MB/s
>>>>>>>> Host->Guest   ~1450MB/s                ~1600MB/s
>>>>>>>>
>>>>>>>> After performance:
>>>>>>>>               Single socket            Multiple sockets(Max Bandwidth)
>>>>>>>> Guest->Host   ~1700MB/s                ~2900MB/s
>>>>>>>> Host->Guest   ~1700MB/s                ~2900MB/s
>>>>>>>>
>>>>>>>> The test results show a clear performance improvement, and guest
>>>>>>>> memory will not be wasted.
>>>>>>> Hi:
>>>>>>>
>>>>>>> Thanks for the patches and the numbers are really impressive.
>>>>>>>
>>>>>>> But instead of duplicating code between vsock and net, I was considering using virtio-net as a transport for vsock. Then we would get all the existing features like batching, mergeable rx buffers and multiqueue. Would you consider this idea? Thoughts?
>>>>>>>
>>>>>>>
>>>>>> Hi Jason,
>>>>>>
>>>>>> I am not very familiar with virtio-net, so I am afraid I can't give
>>>>>> much useful advice. I do have several questions:
>>>>>>
>>>>>> 1. If we use virtio-net as a transport, the guest should see a virtio-net
>>>>>> device instead of a virtio-vsock device, right? Is vsock then only a
>>>>>> transport between the socket and the net_device? Users should still use
>>>>>> the AF_VSOCK type to create a socket, right?
>>>>>
>>>>> Well, there are many choices. What you need is just to keep the socket API and hide the implementation. For example, you can keep the vsock device in the guest and switch to using vhost-net in the host. We probably need a new feature bit or header to let vhost know we are passing vsock packets. And vhost-net could forward the packets to the vsock core on the host.
>>>>>
>>>>>
>>>>>> 2. I want to know whether work on this idea has already started, and
>>>>>> what the current progress is.
>>>>>
>>>>> Not yet started.  I just want to hear from the community. If this sounds good, are you interested in implementing this?
>>>>>
>>>>>
>>>>>> 3. And what is Stefan's opinion?
>>>>>
>>>>> I talked with Stefan a little about this during KVM Forum. I think he tends to agree with this idea. Anyway, let's wait for his reply.
>>>>>
>>>>>
>>>>> Thanks
>>>>>
>>>>>
>>>> Hi Jason,
>>>>
>>>> Thanks for your reply. What you want is to avoid duplicate code and still
>>>> use the existing virtio-net features.
>>>
>>>
>>> Yes, technically we can use the virtio-net driver in the guest as well, but we could do it step by step.
>>>
>>>
>>>> Yes, if this sounds good and most people agree with this idea, I am very
>>>> happy to implement this.
>>>
>>>
>>> Cool, thanks.
>>>
>>>
>>>>
>>>> In addition, I hope you can review these patches before the new idea is
>>>> implemented; after all, the performance can be improved. :-)
>>>
>>>
>>> Ok.
>>>
>>>
>>> So the patch actually did three things:
>>>
>>> - mergeable buffer implementation
>>>
>>> - increase the default rx buffer size
>>>
>>> - add used buffers and signal the guest in a batch
>>>
>>> It would be helpful if you could measure the performance improvement of each change independently. This would give reviewers a better understanding of how much each part helps.
>>>
>>> Thanks
>>>
>>>
>>
>> Great, I will test the performance of each part independently in a later version.
> 
> I'm catching up on email so maybe you've already discussed this, but a
> key design point in virtio-vsock is reliable in-order delivery.  When
> using virtio-net code it's important to keep those properties so that
> AF_VSOCK SOCK_STREAM sockets work as expected.  Packets must not be
> reordered or dropped.
> 
> In addition, there's the virtio-vsock flow control scheme that allows
> multiple sockets to share a ring without starvation or denial-of-service
> problems.  The guest knows how much socket buffer space is available on
> the host (and vice versa).  A well-behaved guest only sends up to the
> available buffer space so that the host can copy the data into the
> socket buffer and free up ring space for other sockets.  This scheme is
> how virtio-vsock achieves guaranteed delivery while avoiding starvation
> or denial-of-service.
> 
> So you'll need to use some kind of framing (protocol) that preserves
> these properties on top of virtio-net.  This framing could be based on
> virtio-vsock's packet headers.
> 
> Stefan
> 

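As a reading aid, the credit-based flow control Stefan describes can be sketched roughly as follows. This is a minimal sketch, not the kernel implementation: the struct and function names are hypothetical, and only the buf_alloc and fwd_cnt counters are taken from the virtio-vsock packet header. The sender tracks how many bytes it has transmitted and never lets the in-flight amount exceed the receive buffer the peer advertised.

```c
#include <stdint.h>

/* Hypothetical per-socket credit state. Only buf_alloc and fwd_cnt
 * correspond to fields carried in virtio-vsock packet headers. */
struct credit_state {
	uint32_t peer_buf_alloc; /* receive buffer size advertised by the peer */
	uint32_t peer_fwd_cnt;   /* bytes the peer reports having consumed */
	uint32_t tx_cnt;         /* bytes this side has sent so far */
};

/* Bytes that may still be sent without overrunning the peer's buffer.
 * Unsigned subtraction keeps the in-flight count valid across wraparound. */
static uint32_t credit_available(const struct credit_state *c)
{
	uint32_t in_flight = c->tx_cnt - c->peer_fwd_cnt;

	if (in_flight >= c->peer_buf_alloc)
		return 0;
	return c->peer_buf_alloc - in_flight;
}

/* Reserve up to 'want' bytes of credit; returns how many may be sent now. */
static uint32_t credit_take(struct credit_state *c, uint32_t want)
{
	uint32_t avail = credit_available(c);
	uint32_t granted = want < avail ? want : avail;

	c->tx_cnt += granted;
	return granted;
}

/* Apply the credit information found in a received packet header. */
static void credit_update(struct credit_state *c, uint32_t buf_alloc,
			  uint32_t fwd_cnt)
{
	c->peer_buf_alloc = buf_alloc;
	c->peer_fwd_cnt = fwd_cnt;
}
```

Because a sender that respects credit_take() never puts more data on the ring than the receiver can copy into its socket buffer, ring entries are freed promptly and one socket cannot starve the others. Any framing layered on virtio-net would need to carry equivalent counters.
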
Hi Stefan,

I found a different opinion from MST. In another discussion, he said that
using virtio-net as vsock's transport channel is not of much value.

So we may need to discuss which solution we should use:
1. vsock over virtio-net
2. add multiqueue and mergeable rx buffer features to the existing virtio-vsock.

Stefan, what's your suggestion?

Thanks,
Yiwen.
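
For context on option 2, the arithmetic behind mergeable rx buffers can be sketched as below. This is an illustration under stated assumptions, not code from the patch series: the function name is hypothetical, the header is assumed to sit at the start of the first buffer, and the 44-byte header and 4096-byte buffer used in the example are assumed values.

```c
#include <stdint.h>

/* Hypothetical helper: number of fixed-size rx buffers one packet spans
 * when it may be scattered across several buffers. Assumes the packet
 * header is placed at the start of the first buffer. */
static uint32_t rx_buffers_needed(uint32_t payload_len, uint32_t hdr_len,
				  uint32_t buf_size)
{
	uint64_t total = (uint64_t)payload_len + hdr_len;

	if (buf_size == 0)
		return 0; /* invalid configuration, nothing can be placed */

	/* Round up: a partially filled last buffer still counts. */
	return (uint32_t)((total + buf_size - 1) / buf_size);
}
```

With these assumed sizes, a 64K payload spans rx_buffers_needed(65536, 44, 4096) = 17 buffers. Without mergeable buffers, every posted rx buffer would instead have to be large enough for the biggest packet, even when it only receives a small one.
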