Message ID: 1262849506-27132-6-git-send-email-amit.shah@redhat.com
State: New
Amit Shah wrote:
> Guests send us one buffer at a time. Current guests send buffers sized
> 4K bytes. If guest userspace applications send out > 4K bytes in one
> write() syscall, the write request actually sends out multiple buffers,
> each of 4K in size.
>
> This usually isn't a problem, but for some apps, like VNC, the entire
> data has to be sent in one go to make copy/paste work fine. So if an app
> on the guest sends out guest clipboard contents, it has to be sent to
> the vnc server in one go, as the guest app sent it.
>
> For this to be done, we need the guest to send us START and END markers
> for each write request so that we can find out complete buffers and send
> them off to ports.

That looks very dubious. TCP/IP doesn't maintain write boundaries;
neither do pipes, unix domain sockets, pseudo-terminals, and almost
every other modern byte-oriented transport.

So how does VNC transmit the clipboard over TCP/IP to a VNC client,
without those boundaries, and why is it different with virtserialport?

-- Jamie
On (Fri) Jan 08 2010 [01:12:31], Jamie Lokier wrote:
> Amit Shah wrote:
> > Guests send us one buffer at a time. Current guests send buffers sized
> > 4K bytes. [...]
>
> That looks very dubious. TCP/IP doesn't maintain write boundaries;
> neither do pipes, unix domain sockets, pseudo-terminals, and almost
> every other modern byte-oriented transport.
>
> So how does VNC transmit the clipboard over TCP/IP to a VNC client,
> without those boundaries, and why is it different with virtserialport?

TCP does this in its stack: it waits for the number of bytes written to
be received and then notifies userspace of data availability.

In this case, consider the case where the guest writes 10k of data. The
guest gives us those 10k in 3 chunks: the first containing 4k (- header
size), the 2nd containing the next 4k (- header size), and the 3rd chunk
the remaining data.

I want to flush out this data only when I get all 10k.

Amit
Amit Shah wrote:
> On (Fri) Jan 08 2010 [01:12:31], Jamie Lokier wrote:
> > Amit Shah wrote:
> > > Guests send us one buffer at a time. [...]
> >
> > That looks very dubious. TCP/IP doesn't maintain write boundaries;
> > neither do pipes, unix domain sockets, pseudo-terminals, and almost
> > every other modern byte-oriented transport.
> >
> > So how does VNC transmit the clipboard over TCP/IP to a VNC client,
> > without those boundaries, and why is it different with virtserialport?
>
> TCP does this in its stack: it waits for the number of bytes written to
> be received and then notifies userspace of data availability.
>
> In this case, consider the case where the guest writes 10k of data. The
> guest gives us those 10k in 3 chunks: the first containing 4k (- header
> size), the 2nd containing the next 4k (- header size), and the 3rd chunk
> the remaining data.
>
> I want to flush out this data only when I get all 10k.

No, TCP does not do that. It does not maintain boundaries, or delay
delivery until a full write is transmitted. Even if you use TCP_CORK
(Linux specific), that is just a performance hint.

If the sender writes 10k of data in a single write() over TCP, and it is
split into packets sized 4k/4k/2k (assume just over 4k MSS :-), the
receiver will be notified of availability any time after the *first*
packet is received, and the read() call may indeed return less than 10k.
In fact it can be split at any byte position, depending on other
activity.

Applications handle this by using their own framing protocol on top of
the TCP byte stream: for example, a simple header saying "expect N
bytes" followed by N bytes, or line delimiters or escape characters.

Sometimes it looks like TCP is maintaining write boundaries, but that is
just an artifact of its behaviour on many systems, and is not reliable
even on those systems where it seems to happen most of the time. Even
when connecting to localhost, you cannot rely on it. I have seen people
write code assuming TCP keeps boundaries, and then some weeks later they
are very confused debugging their code because it is not reliable...

Since VNC is clearly designed to work over TCP, and is written by people
who know this, I'm wondering why you think it needs to be different for
virtio-serial.

-- Jamie
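[Editor's illustration, not part of the thread: the framing approach Jamie
describes — a length prefix that survives arbitrary stream splits — can be
sketched as below. The names (frame_reader, frame_feed) and the 4 KB cap
are invented for the example; this is not qemu or guest-agent code.]

    /* A stateful reader that reassembles length-prefixed messages from
     * a byte stream, no matter how the transport split the bytes. */
    #include <assert.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    #define MAX_MSG 4096

    struct frame_reader {
        uint8_t hdr[4];      /* big-endian length prefix */
        size_t  hdr_got;     /* header bytes received so far */
        uint8_t body[MAX_MSG];
        size_t  body_len;    /* expected body length */
        size_t  body_got;    /* body bytes received so far */
    };

    /* Feed an arbitrary chunk of the stream; call on_msg for each
     * complete message, however the chunks were split. */
    static void frame_feed(struct frame_reader *r,
                           const uint8_t *buf, size_t len,
                           void (*on_msg)(const uint8_t *, size_t))
    {
        while (len > 0) {
            if (r->hdr_got < 4) {          /* still reading the header */
                r->hdr[r->hdr_got++] = *buf++; len--;
                if (r->hdr_got == 4) {
                    r->body_len = ((size_t)r->hdr[0] << 24) |
                                  ((size_t)r->hdr[1] << 16) |
                                  ((size_t)r->hdr[2] << 8)  |
                                   (size_t)r->hdr[3];
                    assert(r->body_len <= MAX_MSG); /* validate first */
                    r->body_got = 0;
                }
            } else {                       /* reading the body */
                size_t want = r->body_len - r->body_got;
                size_t take = len < want ? len : want;
                memcpy(r->body + r->body_got, buf, take);
                r->body_got += take; buf += take; len -= take;
                if (r->body_got == r->body_len) {
                    on_msg(r->body, r->body_len);
                    r->hdr_got = 0;        /* reset for the next frame */
                }
            }
        }
    }

    static size_t msgs;
    static void on_msg(const uint8_t *m, size_t n)
    {
        msgs++;
        printf("complete message: %zu bytes: %.*s\n",
               n, (int)n, (const char *)m);
    }

    int main(void)
    {
        struct frame_reader r = {0};
        /* 10-byte payload, header arriving alone, body split 4/4/2 --
         * mirroring the 4k/4k/2k scenario above. */
        uint8_t hdr[4] = {0, 0, 0, 10};
        frame_feed(&r, hdr, 4, on_msg);
        frame_feed(&r, (const uint8_t *)"clip", 4, on_msg);
        frame_feed(&r, (const uint8_t *)"boar", 4, on_msg);
        frame_feed(&r, (const uint8_t *)"d!", 2, on_msg);
        assert(msgs == 1);
        return 0;
    }

However the stream is split, exactly one complete message is delivered,
which is why TCP applications don't need the transport to keep boundaries.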
On 01/08/2010 07:35 AM, Jamie Lokier wrote:
> Sometimes it looks like TCP is maintaining write boundaries, but it is
> just an artifact of its behaviour on many systems, and is not reliable
> even on those systems where it seems to happen most of the time. Even
> when connecting to localhost, you cannot rely on that. I have seen
> people write code assuming TCP keeps boundaries, and then some weeks
> later they are very confused debugging their code because it is not
> reliable...
>
> Since VNC is clearly designed to work over TCP, and is written by
> people who know this, I'm wondering why you think it needs to be
> different for virtio-serial.

I'm confused about why the buffering is needed in the first place.

I would think that any buffering should be pushed back to the guest.
IOW, if there's available data from the char driver but the guest
doesn't have a buffer, don't select on the char driver until the guest
has a buffer available. If the guest attempts to write data but the
char driver isn't ready to receive data, don't complete the request
until the char driver can accept data.

Where does buffering come in?

Regards,

Anthony Liguori
On (Fri) Jan 08 2010 [13:35:03], Jamie Lokier wrote:
> Since VNC is clearly designed to work over TCP, and is written by
> people who know this, I'm wondering why you think it needs to be
> different for virtio-serial.

For vnc putting stuff from a guest clipboard into the vnc client
clipboard using the ServerCutText command, the entire buffer has to be
provided after sending the command and the 'length' values.

In this case, if the data from the guest arrives in multiple packets, we
really don't want to call into the write function multiple times. A
single clipboard entry has to be created in the client with the entire
contents, so a single write operation has to be invoked.

For this to happen, there has to be some indication from the guest as to
how much data was written in one write() operation, which will let us
make a single write operation to the vnc client.

Amit
On (Fri) Jan 08 2010 [10:26:59], Anthony Liguori wrote:
> On 01/08/2010 07:35 AM, Jamie Lokier wrote:
>> Sometimes it looks like TCP is maintaining write boundaries, but it is
>> just an artifact of its behaviour on many systems [...]
>
> I'm confused about why the buffering is needed in the first place.
>
> I would think that any buffering should be pushed back to the guest.
> IOW, if there's available data from the char driver, but the guest
> doesn't have a buffer. Don't select on the char driver until the guest
> has a buffer available. If the guest attempts to write data but the
> char driver isn't ready to receive data, don't complete the request
> until the char driver can accept data.

This is a different thing from what Jamie's talking about. A guest or a
host might be interested in communicating data without waiting for the
other end to come up. The other end can just start consuming the data
(even the data that it missed while it wasn't connected) once it's up.

(I can remove this option for now and add it later, if you prefer it
that way.)

Amit
Amit Shah wrote:
> On (Fri) Jan 08 2010 [13:35:03], Jamie Lokier wrote:
> > Since VNC is clearly designed to work over TCP, and is written by
> > people who know this, I'm wondering why you think it needs to be
> > different for virtio-serial.
>
> For vnc putting stuff from a guest clipboard into vnc client clipboard
> using the ServerCutText command, the entire buffer has to be provided
> after sending the command and the 'length' values.

Are you talking about a VNC protocol command between qemu's VNC server
and the user's VNC client, or a private protocol between the guest and
qemu's VNC server?

> In this case, if the data from guest arrives in multiple packets, we
> really don't want to call into the write function multiple times. A
> single clipboard entry has to be created in the client with the entire
> contents, so a single write operation has to be invoked.

Same question again: *why do you think the VNC server (in qemu) needs
to see the entire clipboard in a single write from the guest?*

You have already told it the total length to expect. There is no
ambiguity about where it ends. There is no need to do any more, if the
receiver (in qemu) is implemented correctly with a sane protocol.

That's assuming the guest sends to qemu's VNC server, which then sends
it to the user's VNC client.

> For this to happen, there has to be some indication from the guest as
> to how much data was written in one write() operation, which will let
> us make a single write operation to the vnc client.

When it is sent to the user's VNC client, it will be split into multiple
packets by TCP. You *can't* send a single large write over TCP without
getting it split at arbitrary places. It's *impossible*. TCP doesn't
support that. It will split and merge your writes arbitrarily.

So the only interesting part is how it's transmitted from the guest to
qemu's VNC server first. Do you get to design that protocol yourself?

-- Jamie
On (Mon) Jan 11 2010 [10:45:53], Jamie Lokier wrote:
> Amit Shah wrote:
> > For vnc putting stuff from a guest clipboard into vnc client
> > clipboard using the ServerCutText command, the entire buffer has to
> > be provided after sending the command and the 'length' values.
>
> Are you talking about a VNC protocol command between qemu's VNC server
> and the user's VNC client, or a private protocol between the guest and
> qemu's VNC server?

What happens is:

1. Guest puts something on its clipboard
2. An agent on the guest gets notified of new clipboard contents
3. This agent sends over the entire clipboard contents to qemu via
   virtio-serial
4. virtio-serial sends off this data to the virtio-serial-vnc code
5. ServerCutText message from the vnc backend is sent to the vnc client
6. vnc client's clipboard gets updated
7. You can see guest's clipboard contents in your client's clipboard.

I'm talking about steps 3, 4, 5 here.

> > In this case, if the data from guest arrives in multiple packets, we
> > really don't want to call into the write function multiple times. A
> > single clipboard entry has to be created in the client with the
> > entire contents, so a single write operation has to be invoked.
>
> Same question again: *why do you think the VNC server (in qemu) needs
> to see the entire clipboard in a single write from the guest?*
>
> You have already told it the total length to expect. There is no
> ambiguity about where it ends.

Where does the total length come from? It has to come from the guest.
Otherwise, the vnc code will not know if a byte stream contains two
separate clipboard entries or just one huge clipboard entry.

Earlier, I used to send the length of one write, as issued by a guest,
to qemu. I just changed that to send a START and END flag so that I
don't have to send the length.

If this doesn't explain it, then I think we're not understanding each
other here.

Amit
Amit Shah wrote:
> > Are you talking about a VNC protocol command between qemu's VNC
> > server and the user's VNC client, or a private protocol between the
> > guest and qemu's VNC server?
>
> What happens is:
>
> 1. Guest puts something on its clipboard
> 2. An agent on the guest gets notified of new clipboard contents
> 3. This agent sends over the entire clipboard contents to qemu via
>    virtio-serial
> 4. virtio-serial sends off this data to the virtio-serial-vnc code
> 5. ServerCutText message from the vnc backend is sent to the vnc client
> 6. vnc client's clipboard gets updated
> 7. You can see guest's clipboard contents in your client's clipboard.
>
> I'm talking about steps 3, 4, 5 here.

Ok. Let's not worry about 5; it doesn't seem relevant, only that the
guest clipboard is sent to the host somehow.

> > You have already told it the total length to expect. There is no
> > ambiguity about where it ends.
>
> Where does the total length come from? It has to come from the guest.
> Otherwise, the vnc code will not know if a byte stream contains two
> separate clipboard entries or just one huge clipboard entry.

I see. So it's a *really simple* protocol where the clipboard entry is
sent by the guest agent with a single write() without any framing bytes?

> Earlier, I used to send the length of one write as issued by a guest
> to qemu. I just changed that to send a START and END flag so that I
> don't have to send the length.

Why not just have the guest agent send a 4-byte header which is the
integer length of the clipboard blob to follow?

I.e. instead of

    int guest_send_clipboard(const char *data, size_t length)
    {
        return write_full(virtio_fd, data, length);
    }

do this:

    int guest_send_clipboard(const char *data, size_t length)
    {
        u32 encoded_length = cpu_to_be32(length);
        int err = write_full(virtio_serial_fd, &encoded_length,
                             sizeof(encoded_length));
        if (err == 0)
            err = write_full(virtio_serial_fd, data, length);
        return err;
    }

> If this doesn't explain it, then I think we're not understanding each
> other here.

It does explain it very well, thanks. I think you're misguided about
the solution :-)

What confused me was you mentioned the VNC ServerCutText command having
to receive the whole data in one go. ServerCutText isn't really
relevant to this, and clearly is encoded with VNC protocol framing. If
it was RDP or the SDL client instead of VNC, it would be something else.
All that matters is getting the clipboard blob from guest to qemu in one
piece, right?

Having the guest agent send a few framing bytes seems very simple, and
would have the added bonus that the same guest agent protocol would work
on a "real" emulated serial port, guest->host TCP, etc. where
virtio-serial isn't available in the guest OS (e.g. older kernels).

I really can't see any merit in making virtio-serial not be a serial
port, being instead like a unix datagram socket, to support a specific
user of virtio-serial when a trivial 4-byte header in the guest agent
code would be easier for that user anyway.

If it did that, I think the name virtio-serial would have to change to
virtio-datagram, because it wouldn't behave like a serial port any more.
It would also be less useful for things that _do_ want something like a
pipe/serial port. But why bother?

-- Jamie
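[Editor's illustration, not part of the thread: the receiving half of the
4-byte-header scheme Jamie sketches above. write_full/cpu_to_be32 aren't
portable userspace names, so this standalone sketch uses POSIX read()/
write() over a pipe to stand in for the virtio-serial fd; read_full is an
invented helper.]

    #include <arpa/inet.h>   /* htonl/ntohl, like cpu_to_be32 */
    #include <assert.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>

    /* Loop until exactly len bytes are read; a real read() may return
     * fewer bytes than asked for, so one call is never enough. */
    static int read_full(int fd, void *buf, size_t len)
    {
        uint8_t *p = buf;
        while (len > 0) {
            ssize_t n = read(fd, p, len);
            if (n <= 0)
                return -1;   /* EOF or error mid-message */
            p += (size_t)n; len -= (size_t)n;
        }
        return 0;
    }

    int main(void)
    {
        int fds[2];
        assert(pipe(fds) == 0);   /* stands in for the virtio channel */

        /* Sender side: 4-byte big-endian length, then the payload. */
        const char *clip = "hello clipboard";
        uint32_t hdr = htonl((uint32_t)strlen(clip));
        assert(write(fds[1], &hdr, sizeof hdr) == sizeof hdr);
        assert(write(fds[1], clip, strlen(clip)) ==
               (ssize_t)strlen(clip));

        /* Receiver side: read the header, then exactly that many bytes;
         * no transport-level boundaries are needed. */
        uint32_t len_be;
        assert(read_full(fds[0], &len_be, sizeof len_be) == 0);
        uint32_t len = ntohl(len_be);
        char buf[256];
        assert(len < sizeof buf);  /* validate before trusting len */
        assert(read_full(fds[0], buf, len) == 0);
        buf[len] = '\0';
        printf("received: %s\n", buf);
        assert(strcmp(buf, clip) == 0);
        return 0;
    }

The receiver never depends on how the bytes were chunked in transit, only
on the length it was told to expect.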
On 01/11/2010 05:33 PM, Jamie Lokier wrote:
> It does explain it very well, thanks. I think you're misguided about
> the solution :-)
>
> [...]
>
> Having the guest agent send a few framing bytes seems very simple, and
> would have the added bonus that the same guest agent protocol would
> work on a "real" emulated serial port, guest->host TCP, etc. where
> virtio-serial isn't available in the guest OS (e.g. older kernels).
>
> I really can't see any merit in making virtio-serial not be a serial
> port, being instead like a unix datagram socket, to support a specific
> user of virtio-serial when a trivial 4-byte header in the guest agent
> code would be easier for that user anyway.
>
> If it did that, I think the name virtio-serial would have to change to
> virtio-datagram, because it wouldn't behave like a serial port any
> more. It would also be less useful for things that _do_ want
> something like a pipe/serial port. But why bother?

I agree wrt a streaming protocol versus a datagram protocol. The core
argument IMHO is that the userspace interface is a file descriptor.
Most programmers are used to assuming that boundaries aren't preserved
in read/write calls.

Regards,

Anthony Liguori
On 01/11/2010 02:39 AM, Amit Shah wrote:
> On (Fri) Jan 08 2010 [10:26:59], Anthony Liguori wrote:
>> I'm confused about why the buffering is needed in the first place.
>>
>> I would think that any buffering should be pushed back to the guest.
>> IOW, if there's available data from the char driver, but the guest
>> doesn't have a buffer. Don't select on the char driver until the
>> guest has a buffer available. If the guest attempts to write data
>> but the char driver isn't ready to receive data, don't complete the
>> request until the char driver can accept data.
>
> This is a different thing from what Jamie's talking about. A guest or
> a host might be interested in communicating data without waiting for
> the other end to come up. The other end can just start consuming the
> data (even the data that it missed while it wasn't connected) once
> it's up.
>
> (I can remove this option for now and add it later, if you prefer it
> that way.)

If it's not needed by your use case, please remove it. Doing buffering
gets tricky because you can't allow an infinite buffer for security
reasons. All you end up doing is increasing the size of the buffer
beyond what the guest and client are capable of doing. Since you still
can lose data, apps have to be written to handle this. I think it adds
complexity without a lot of benefit.

Regards,

Anthony Liguori
On (Mon) Jan 11 2010 [18:28:52], Anthony Liguori wrote:
>>> I would think that any buffering should be pushed back to the guest.
>>> IOW, if there's available data from the char driver, but the guest
>>> doesn't have a buffer. Don't select on the char driver until the
>>> guest has a buffer available. If the guest attempts to write data
>>> but the char driver isn't ready to receive data, don't complete the
>>> request until the char driver can accept data.
>>
>> This is a different thing from what Jamie's talking about. [...]
>>
>> (I can remove this option for now and add it later, if you prefer it
>> that way.)
>
> If it's not needed by your use case, please remove it. Doing buffering
> gets tricky because you can't allow an infinite buffer for security
> reasons. All you end up doing is increasing the size of the buffer
> beyond what the guest and client are capable of doing. Since you still
> can lose data, apps have to be written to handle this. I think it adds
> complexity without a lot of benefit.

The buffering has to remain anyway, since we can't assume that the ports
will consume the entire buffers we pass on to them. So we'll have to
buffer the data till the entire buffer is consumed.

That, or the buffer management should be passed off to individual ports,
which might result in a lot of code duplication, since we can have a lot
of these ports in different places in the qemu code. So I guess it's
better to leave the buffer management in the bus itself. Which means we
get the 'cache_buffers' functionality essentially for free.

Amit
On (Mon) Jan 11 2010 [23:33:56], Jamie Lokier wrote:
> Amit Shah wrote:
> > What happens is:
> >
> > 1. Guest puts something on its clipboard
> > 2. An agent on the guest gets notified of new clipboard contents
> > 3. This agent sends over the entire clipboard contents to qemu via
> >    virtio-serial
> > 4. virtio-serial sends off this data to the virtio-serial-vnc code
> > 5. ServerCutText message from the vnc backend is sent to the vnc
> >    client
> > 6. vnc client's clipboard gets updated
> > 7. You can see guest's clipboard contents in your client's clipboard.
> >
> > I'm talking about steps 3, 4, 5 here.
>
> Ok. Let's not worry about 5; it doesn't seem relevant, only that the
> guest clipboard is sent to the host somehow.

Actually, it is important...

> Why not just have the guest agent send a 4-byte header which is the
> integer length of the clipboard blob to follow?
>
> I.e. instead of
>
>     int guest_send_clipboard(const char *data, size_t length)
>     {
>         return write_full(virtio_fd, data, length);
>     }
>
> do this:
>
>     int guest_send_clipboard(const char *data, size_t length)
>     {
>         u32 encoded_length = cpu_to_be32(length);
>         int err = write_full(virtio_serial_fd, &encoded_length,
>                              sizeof(encoded_length));
>         if (err == 0)
>             err = write_full(virtio_serial_fd, data, length);
>         return err;
>     }
>
> It does explain it very well, thanks. I think you're misguided about
> the solution :-)

The above solution you specify works if it's assumed that we hold off
writes to the vnc client till we get a complete buffer according to the
header received.

Now, a header might contain the length 10000, meaning 10000 bytes are to
be expected. What if the write() on the guest fails after writing 8000
bytes? There's no way for us to signal that. So this vnc port might
just be waiting for all 10000 bytes to be received, and it may never
receive anything more. Or, it might receive the start of the next
clipboard entry, and it could be interpreted as data from the previous
copy.

> What confused me was you mentioned the VNC ServerCutText command
> having to receive the whole data in one go. ServerCutText isn't
> really relevant to this,

It is relevant. You can't split up one ServerCutText command in
multiple buffers. You also can't execute any other commands while one
command is in progress, so you have to hold off on executing
ServerCutText till all the data is available. And you can't reliably do
that from guest userspace because of the previously-mentioned scenario.

> I really can't see any merit in making virtio-serial not be a serial
> port, being instead like a unix datagram socket, to support a specific
> user of virtio-serial when a trivial 4-byte header in the guest agent
> code would be easier for that user anyway.

BTW I don't really want this too; I can get rid of it if everyone agrees
we won't support clipboard writes > 4k over vnc, or if there's a better
idea.

Amit
On 01/12/2010 01:16 AM, Amit Shah wrote:
> BTW I don't really want this too; I can get rid of it if everyone
> agrees we won't support clipboard writes > 4k over vnc, or if there's
> a better idea.

Why bother trying to preserve message boundaries? I think that's the
fundamental question.

Regards,

Anthony Liguori
On (Tue) Jan 12 2010 [09:00:52], Anthony Liguori wrote:
> On 01/12/2010 01:16 AM, Amit Shah wrote:
>> BTW I don't really want this too; I can get rid of it if everyone
>> agrees we won't support clipboard writes > 4k over vnc, or if there's
>> a better idea.
>
> Why bother trying to preserve message boundaries? I think that's the
> fundamental question.

For the vnc clipboard copy-paste case, I explained that in the couple of
mails before in this thread.

There might be other use-cases; I don't know about them, though.

Amit
On 01/12/2010 09:13 AM, Amit Shah wrote:
> On (Tue) Jan 12 2010 [09:00:52], Anthony Liguori wrote:
>> Why bother trying to preserve message boundaries? I think that's the
>> fundamental question.
>
> For the vnc clipboard copy-paste case, I explained that in the couple
> of mails before in this thread.
>
> There might be other use-cases, I don't know about them though.

It didn't make sense to me. I think the assumption has to be that the
client can send corrupt data and the host has to handle it.

Regards,

Anthony Liguori
On (Tue) Jan 12 2010 [09:46:55], Anthony Liguori wrote:
> On 01/12/2010 09:13 AM, Amit Shah wrote:
>> For the vnc clipboard copy-paste case, I explained that in the couple
>> of mails before in this thread.
>
> It didn't make sense to me. I think the assumption has to be that the
> client can send corrupt data and the host has to handle it.

You mean if the guest kernel sends the wrong flags? Or doesn't set the
flags? Can you explain what scenario you're talking about?

Amit
On 01/12/2010 09:49 AM, Amit Shah wrote:
> On (Tue) Jan 12 2010 [09:46:55], Anthony Liguori wrote:
>> It didn't make sense to me. I think the assumption has to be that
>> the client can send corrupt data and the host has to handle it.
>
> You mean if the guest kernel sends the wrong flags? Or doesn't set
> the flags? Can you explain what scenario you're talking about?

It's very likely that you'll have to implement some sort of protocol on
top of virtio-serial. It won't always just be simple strings.

If you have a simple datagram protocol that contains two ints and a
string, it's going to have to be encoded like
<int a><int b><int len><char data[len]>. You need to validate that len
fits within the boundaries and deal with len being less than the
boundary.

If you've got a command protocol where you send the guest something and
then expect a response, you have to deal with the fact that the guest
may never respond. Having well-defined message boundaries does not help
the general problem; it only helps in the most trivial cases.

Basically, it boils down to a lot of complexity for something that isn't
going to be helpful in most circumstances.

Regards,

Anthony Liguori
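[Editor's illustration, not part of the thread: the validation Anthony
describes for a <int a><int b><int len><char data[len]> encoding. The
struct and function names (msg, parse_msg) are invented, and host byte
order is assumed for simplicity; a real protocol would fix endianness.]

    #include <assert.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    struct msg {
        int32_t a, b;
        uint32_t len;
        const uint8_t *data;   /* points into the caller's buffer */
    };

    /* Returns 0 on success, -1 if the buffer is truncated or len
     * claims more bytes than were actually received. */
    static int parse_msg(const uint8_t *buf, size_t buflen,
                         struct msg *out)
    {
        if (buflen < 12)                 /* need a, b, len at minimum */
            return -1;
        memcpy(&out->a, buf, 4);
        memcpy(&out->b, buf + 4, 4);
        memcpy(&out->len, buf + 8, 4);
        if (out->len > buflen - 12)      /* len must fit what we got */
            return -1;
        out->data = buf + 12;
        return 0;
    }

    int main(void)
    {
        uint8_t buf[12 + 5];
        int32_t a = 1, b = 2;
        uint32_t len = 5;
        memcpy(buf, &a, 4);
        memcpy(buf + 4, &b, 4);
        memcpy(buf + 8, &len, 4);
        memcpy(buf + 12, "hello", 5);

        struct msg m;
        assert(parse_msg(buf, sizeof buf, &m) == 0);
        assert(m.a == 1 && m.b == 2 && m.len == 5);

        /* A corrupt or malicious datagram: len claims more data than
         * the buffer holds, and must be rejected, not trusted. */
        uint32_t bad = 4096;
        memcpy(buf + 8, &bad, 4);
        assert(parse_msg(buf, sizeof buf, &m) == -1);

        puts("validation ok");
        return 0;
    }

The point matches Anthony's: this validation is needed whether or not the
transport preserves write boundaries, so boundaries alone buy little.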
On (Tue) Jan 12 2010 [09:55:41], Anthony Liguori wrote:
> On 01/12/2010 09:49 AM, Amit Shah wrote:
>> You mean if the guest kernel sends the wrong flags? Or doesn't set the
>> flags? Can you explain what scenario you're talking about?
>
> It's very likely that you'll have to implement some sort of protocol on
> top of virtio-serial.  It won't always just be simple strings.

Yes, virtio-serial is just meant to be a transport, agnostic of whatever
data or protocols ride over it.

> If you have a simple datagram protocol, that contains two ints and a
> string, it's going to have to be encoded like <int a><int b><int
> len><char data[len]>.  You need to validate that len fits within the
> boundaries and deal with len being less than the boundary.
>
> If you've got a command protocol where you send the guest something
> and then expect a response, you have to deal with the fact that the
> guest may never respond.  Having well defined message boundaries does
> not help the general problem and it only helps in the most trivial cases.
>
> Basically, it boils down to a lot of complexity for something that isn't
> going to be helpful in most circumstances.

I don't know why you're saying virtio-serial-bus does (or needs to) do
any of this.

		Amit
Anthony Liguori <anthony@codemonkey.ws> writes:

> On 01/12/2010 01:16 AM, Amit Shah wrote:
>> BTW I don't really want this too, I can get rid of it if everyone agrees
>> we won't support clipboard writes > 4k over vnc or if there's a better
>> idea.
>
> Why bother trying to preserve message boundaries?  I think that's the
> fundamental question.

Yes.  Either it's a datagram or a stream pipe.  I always thought it
would be a stream pipe, as the name "serial" suggests.

As to the clipboard use case: same problem exists with any old stream
pipe, including TCP, same solutions apply.  If you told the peer "I'm
going to send you 12345 bytes now", and your stream pipe chokes after
7890 bytes, you retry until everything got through.  If you want to be
able to abort a partial transfer and start a new one, you layer a
protocol suitable for that on top of your stream pipe.
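Markus's "retry until everything got through" layering over a plain stream can be sketched as follows. The length-prefix framing and the function names are illustrative, not an existing QEMU API; the sketch works over any byte-stream fd (TCP, a pipe, or a virtserialport opened as a chardev):

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>

/* Keep writing until every byte is through; the stream may accept
 * fewer bytes than asked for (a "choke"), so loop on the remainder. */
static int write_all(int fd, const void *data, size_t len)
{
    const uint8_t *p = data;

    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR) {
                continue;        /* interrupted, just retry */
            }
            return -1;           /* the stream really failed */
        }
        p += n;                  /* partial write: send the rest */
        len -= n;
    }
    return 0;
}

/* "I'm going to send you len bytes now" ... then the bytes.  The
 * receiver reads the prefix and then reads exactly len bytes, so no
 * transport-level boundaries are needed.  (Host-endian prefix for
 * brevity; a real protocol would fix the byte order.) */
static int send_framed(int fd, const void *data, uint32_t len)
{
    if (write_all(fd, &len, sizeof(len)) < 0) {
        return -1;
    }
    return write_all(fd, data, len);
}
```

This is essentially what VNC already does for clipboard data over TCP, which is the crux of the objection to adding boundary markers to the transport.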
On 01/13/2010 11:14 AM, Markus Armbruster wrote:
> Anthony Liguori <anthony@codemonkey.ws> writes:
>> On 01/12/2010 01:16 AM, Amit Shah wrote:
>>> BTW I don't really want this too, I can get rid of it if everyone agrees
>>> we won't support clipboard writes > 4k over vnc or if there's a better
>>> idea.
>>
>> Why bother trying to preserve message boundaries?  I think that's the
>> fundamental question.
>
> Yes.  Either it's a datagram or a stream pipe.  I always thought it
> would be a stream pipe, as the name "serial" suggests.

And if it's a datagram, then we should accept that there will be a fixed
max message size, which is pretty common in all datagram protocols.  That
fixed size should be no larger than what the transport supports, so in
this case it would be 4k.

If a guest wants to send larger messages, it must build a continuation
protocol on top of the datagram protocol.

Regards,

Anthony Liguori
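The continuation protocol Anthony describes might look like this sketch: messages larger than the 4k datagram limit are split into chunks, and each chunk carries a "more follows" marker. The one-byte header, the callback type and all names are invented for illustration; they do not appear in the patch:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DGRAM_MAX     4096u                 /* fixed datagram limit */
#define CHUNK_HDR     1u                    /* one "more follows" byte */
#define CHUNK_PAYLOAD (DGRAM_MAX - CHUNK_HDR)

typedef int (*send_fn)(const uint8_t *dgram, size_t len, void *opaque);

/* Split a message into <= 4k datagrams; byte 0 of each datagram says
 * whether further chunks of the same message follow. */
static int send_message(const uint8_t *msg, size_t len,
                        send_fn send_datagram, void *opaque)
{
    uint8_t dgram[DGRAM_MAX];

    do {
        size_t chunk = len < CHUNK_PAYLOAD ? len : CHUNK_PAYLOAD;

        dgram[0] = (len > chunk);           /* 1 = continuation follows */
        memcpy(dgram + CHUNK_HDR, msg, chunk);
        if (send_datagram(dgram, CHUNK_HDR + chunk, opaque) < 0) {
            return -1;
        }
        msg += chunk;
        len -= chunk;
    } while (len > 0);

    return 0;
}

/* Toy receiver used in the usage example: reassembles the payload and
 * remembers how many datagrams arrived and the last "more" flag. */
static uint8_t collected[3 * DGRAM_MAX];
static size_t  collected_len;
static int     collected_count;
static int     collected_last_more;

static int collect_datagram(const uint8_t *dgram, size_t len, void *opaque)
{
    (void)opaque;
    collected_count++;
    collected_last_more = dgram[0];
    memcpy(collected + collected_len, dgram + CHUNK_HDR, len - CHUNK_HDR);
    collected_len += len - CHUNK_HDR;
    return 0;
}
```

The design point is that the continuation logic lives entirely in the guest and host applications; the transport itself only ever sees datagrams within its fixed limit.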
diff --git a/hw/virtio-serial-bus.c b/hw/virtio-serial-bus.c
index 20d9580..c947143 100644
--- a/hw/virtio-serial-bus.c
+++ b/hw/virtio-serial-bus.c
@@ -44,6 +44,20 @@ struct VirtIOSerial {
     struct virtio_console_config config;
 };
 
+/* This struct holds individual buffers received for each port */
+typedef struct VirtIOSerialPortBuffer {
+    QTAILQ_ENTRY(VirtIOSerialPortBuffer) next;
+
+    uint8_t *buf;
+
+    size_t len; /* length of the buffer */
+    size_t offset; /* offset from which to consume data in the buffer */
+
+    uint32_t flags; /* Sent by guest (start of data stream, end of stream) */
+
+    bool previous_failure; /* Did sending out this buffer fail previously? */
+} VirtIOSerialPortBuffer;
+
 static VirtIOSerialPort *find_port_by_id(VirtIOSerial *vser, uint32_t id)
 {
     VirtIOSerialPort *port;
@@ -157,6 +171,198 @@ static size_t send_control_event(VirtIOSerialPort *port, uint16_t event,
     return send_control_msg(port, &cpkt, sizeof(cpkt));
 }
 
+static void init_buf(VirtIOSerialPortBuffer *buf, uint8_t *buffer, size_t len)
+{
+    buf->buf = buffer;
+    buf->len = len;
+    buf->offset = 0;
+    buf->flags = 0;
+    buf->previous_failure = false;
+}
+
+static VirtIOSerialPortBuffer *alloc_buf(size_t len)
+{
+    VirtIOSerialPortBuffer *buf;
+
+    buf = qemu_malloc(sizeof(*buf));
+    buf->buf = qemu_malloc(len);
+
+    init_buf(buf, buf->buf, len);
+
+    return buf;
+}
+
+static void free_buf(VirtIOSerialPortBuffer *buf)
+{
+    qemu_free(buf->buf);
+    qemu_free(buf);
+}
+
+static size_t get_complete_data_size(VirtIOSerialPort *port)
+{
+    VirtIOSerialPortBuffer *buf;
+    size_t size;
+    bool is_complete, start_seen;
+
+    size = 0;
+    is_complete = false;
+    start_seen = false;
+    QTAILQ_FOREACH(buf, &port->unflushed_buffers, next) {
+        size += buf->len - buf->offset;
+
+        if (buf->flags & VIRTIO_CONSOLE_HDR_END_DATA) {
+            is_complete = true;
+            break;
+        }
+        if (buf == QTAILQ_FIRST(&port->unflushed_buffers)
+            && !(buf->flags & VIRTIO_CONSOLE_HDR_START_DATA)) {
+
+            /* There's some data that arrived without a START
+             * flag. Flush it. */
+            is_complete = true;
+            break;
+        }
+
+        if (buf->flags & VIRTIO_CONSOLE_HDR_START_DATA) {
+            if (start_seen) {
+                /*
+                 * There's some data that arrived without an END
+                 * flag. Flush it.
+                 */
+                size -= buf->len + buf->offset;
+                is_complete = true;
+                break;
+            }
+            start_seen = true;
+        }
+    }
+    return is_complete ? size : 0;
+}
+
+/*
+ * The guest could have sent the data corresponding to one write
+ * request split up in multiple buffers. The first buffer has the
+ * VIRTIO_CONSOLE_HDR_START_DATA flag set and the last buffer has the
+ * VIRTIO_CONSOLE_HDR_END_DATA flag set. Using this information, merge
+ * the parts into one buf here to process it for output.
+ */
+static VirtIOSerialPortBuffer *get_complete_buf(VirtIOSerialPort *port)
+{
+    VirtIOSerialPortBuffer *buf, *buf2;
+    uint8_t *outbuf;
+    size_t out_offset, out_size;
+
+    out_size = get_complete_data_size(port);
+    if (!out_size)
+        return NULL;
+
+    buf = QTAILQ_FIRST(&port->unflushed_buffers);
+    if (buf->len - buf->offset == out_size) {
+        QTAILQ_REMOVE(&port->unflushed_buffers, buf, next);
+        return buf;
+    }
+    out_offset = 0;
+    outbuf = qemu_malloc(out_size);
+
+    QTAILQ_FOREACH_SAFE(buf, &port->unflushed_buffers, next, buf2) {
+        size_t copy_size;
+
+        copy_size = buf->len - buf->offset;
+        memcpy(outbuf + out_offset, buf->buf + buf->offset, copy_size);
+        out_offset += copy_size;
+
+        QTAILQ_REMOVE(&port->unflushed_buffers, buf, next);
+        qemu_free(buf->buf);
+
+        if (out_offset == out_size) {
+            break;
+        }
+        qemu_free(buf);
+    }
+    init_buf(buf, outbuf, out_size);
+    buf->flags = VIRTIO_CONSOLE_HDR_START_DATA | VIRTIO_CONSOLE_HDR_END_DATA;
+
+    return buf;
+}
+
+/* Call with the unflushed_buffers_lock held */
+static void flush_queue(VirtIOSerialPort *port)
+{
+    VirtIOSerialPortBuffer *buf;
+    size_t out_size;
+    ssize_t ret;
+
+    /*
+     * If a device is interested in buffering packets till it's
+     * opened, cache the data the guest sends us till a connection is
+     * established.
+     */
+    if (!port->host_connected && port->cache_buffers) {
+        return;
+    }
+
+    while ((buf = get_complete_buf(port))) {
+        out_size = buf->len - buf->offset;
+        if (!port->host_connected) {
+            /*
+             * Caching is disabled and host is not connected, so
+             * discard the buffer. Do this only after merging the
+             * buffer as a port can get connected in the middle of
+             * dropping buffers and the port will end up getting the
+             * incomplete output.
+             */
+            port->nr_bytes -= buf->len + buf->offset;
+            free_buf(buf);
+            continue;
+        }
+
+        ret = port->info->have_data(port, buf->buf + buf->offset, out_size);
+        if (ret < out_size) {
+            QTAILQ_INSERT_HEAD(&port->unflushed_buffers, buf, next);
+        }
+        if (ret <= 0) {
+            /* We're not progressing at all */
+            if (buf->previous_failure) {
+                break;
+            }
+            buf->previous_failure = true;
+        } else {
+            buf->offset += ret;
+            port->nr_bytes -= ret;
+            buf->previous_failure = false;
+        }
+        if (!(buf->len - buf->offset)) {
+            free_buf(buf);
+        }
+    }
+
+    if (port->host_throttled && port->nr_bytes < port->byte_limit) {
+        port->host_throttled = false;
+        send_control_event(port, VIRTIO_CONSOLE_THROTTLE_PORT, 0);
+    }
+}
+
+static void flush_all_ports(VirtIOSerial *vser)
+{
+    struct VirtIOSerialPort *port;
+
+    QTAILQ_FOREACH(port, &vser->ports, next) {
+        if (port->has_activity) {
+            port->has_activity = false;
+            flush_queue(port);
+        }
+    }
+}
+
+static void remove_port_buffers(VirtIOSerialPort *port)
+{
+    struct VirtIOSerialPortBuffer *buf, *buf2;
+
+    QTAILQ_FOREACH_SAFE(buf, &port->unflushed_buffers, next, buf2) {
+        QTAILQ_REMOVE(&port->unflushed_buffers, buf, next);
+        free_buf(buf);
+    }
+}
+
 /* Functions for use inside qemu to open and read from/write to ports */
 int virtio_serial_open(VirtIOSerialPort *port)
 {
@@ -168,6 +374,10 @@ int virtio_serial_open(VirtIOSerialPort *port)
     port->host_connected = true;
     send_control_event(port, VIRTIO_CONSOLE_PORT_OPEN, 1);
 
+    /* Flush any buffers that were cached while the port was closed */
+    if (port->cache_buffers && port->info->have_data) {
+        flush_queue(port);
+    }
     return 0;
 }
 
@@ -176,6 +386,9 @@ int virtio_serial_close(VirtIOSerialPort *port)
     port->host_connected = false;
     send_control_event(port, VIRTIO_CONSOLE_PORT_OPEN, 0);
 
+    if (!port->cache_buffers) {
+        remove_port_buffers(port);
+    }
     return 0;
 }
 
@@ -265,6 +478,14 @@ static void handle_control_message(VirtIOSerial *vser, void *buf)
         qemu_free(buffer);
     }
 
+    /*
+     * We also want to signal to the guest whether or not the port
+     * is set to caching the buffers when disconnected.
+     */
+    if (port->cache_buffers) {
+        send_control_event(port, VIRTIO_CONSOLE_CACHE_BUFFERS, 1);
+    }
+
     if (port->host_connected) {
         send_control_event(port, VIRTIO_CONSOLE_PORT_OPEN, 1);
     }
@@ -315,6 +536,10 @@ static void control_out(VirtIODevice *vdev, VirtQueue *vq)
 
 /*
  * Guest wrote something to some port.
+ *
+ * Flush the data in the entire chunk that we received rather than
+ * splitting it into multiple buffers. VNC clients don't consume split
+ * buffers
  */
 static void handle_output(VirtIODevice *vdev, VirtQueue *vq)
 {
@@ -325,6 +550,7 @@ static void handle_output(VirtIODevice *vdev, VirtQueue *vq)
 
     while (virtqueue_pop(vq, &elem)) {
         VirtIOSerialPort *port;
+        VirtIOSerialPortBuffer *buf;
         struct virtio_console_header header;
         int header_len;
 
@@ -333,10 +559,14 @@ static void handle_output(VirtIODevice *vdev, VirtQueue *vq)
         if (elem.out_sg[0].iov_len < header_len) {
             goto next_buf;
         }
+        if (header_len) {
+            memcpy(&header, elem.out_sg[0].iov_base, header_len);
+        }
         port = find_port_by_vq(vser, vq);
         if (!port) {
             goto next_buf;
         }
+
         /*
          * A port may not have any handler registered for consuming the
          * data that the guest sends or it may not have a chardev associated
@@ -347,13 +577,38 @@ static void handle_output(VirtIODevice *vdev, VirtQueue *vq)
         }
 
         /* The guest always sends only one sg */
-        port->info->have_data(port, elem.out_sg[0].iov_base + header_len,
-                              elem.out_sg[0].iov_len - header_len);
+        buf = alloc_buf(elem.out_sg[0].iov_len - header_len);
+
+        memcpy(buf->buf, elem.out_sg[0].iov_base + header_len, buf->len);
+
+        if (header_len) {
+            /*
+             * Only the first buffer in a stream will have this
+             * set. This will help us identify the first buffer and
+             * the remaining buffers in the stream based on length
+             */
+            buf->flags = ldl_p(&header.flags)
+                & (VIRTIO_CONSOLE_HDR_START_DATA | VIRTIO_CONSOLE_HDR_END_DATA);
+        } else {
+            /* We always want to flush all the buffers in this case */
+            buf->flags = VIRTIO_CONSOLE_HDR_START_DATA
+                | VIRTIO_CONSOLE_HDR_END_DATA;
+        }
+
+        QTAILQ_INSERT_TAIL(&port->unflushed_buffers, buf, next);
+        port->nr_bytes += buf->len;
+        port->has_activity = true;
+        if (!port->host_throttled && port->byte_limit &&
+            port->nr_bytes >= port->byte_limit) {
+
+            port->host_throttled = true;
+            send_control_event(port, VIRTIO_CONSOLE_THROTTLE_PORT, 1);
+        }
     next_buf:
         virtqueue_push(vq, &elem, elem.out_sg[0].iov_len);
     }
     virtio_notify(vdev, vq);
+    flush_all_ports(vser);
 }
 
 static void handle_input(VirtIODevice *vdev, VirtQueue *vq)
@@ -386,6 +641,7 @@ static void virtio_serial_save(QEMUFile *f, void *opaque)
     VirtIOSerial *s = opaque;
     VirtIOSerialPort *port;
     uint32_t nr_active_ports;
+    unsigned int nr_bufs;
 
     /* The virtio device */
     virtio_save(&s->vdev, f);
@@ -408,14 +664,35 @@ static void virtio_serial_save(QEMUFile *f, void *opaque)
      * Items in struct VirtIOSerialPort.
      */
     QTAILQ_FOREACH(port, &s->ports, next) {
+        VirtIOSerialPortBuffer *buf;
+
         /*
         * We put the port number because we may not have an active
         * port at id 0 that's reserved for a console port, or in case
         * of ports that might have gotten unplugged
         */
         qemu_put_be32s(f, &port->id);
+        qemu_put_be64s(f, &port->byte_limit);
+        qemu_put_be64s(f, &port->nr_bytes);
         qemu_put_byte(f, port->guest_connected);
+        qemu_put_byte(f, port->host_throttled);
+
+        /* All the pending buffers from active ports */
+        nr_bufs = 0;
+        QTAILQ_FOREACH(buf, &port->unflushed_buffers, next) {
+            nr_bufs++;
+        }
+        qemu_put_be32s(f, &nr_bufs);
+        if (!nr_bufs) {
+            continue;
+        }
+        QTAILQ_FOREACH(buf, &port->unflushed_buffers, next) {
+            qemu_put_be64s(f, &buf->len);
+            qemu_put_be64s(f, &buf->offset);
+            qemu_put_be32s(f, &buf->flags);
+            qemu_put_buffer(f, buf->buf, buf->len);
+        }
     }
 }
 
@@ -448,13 +725,34 @@ static int virtio_serial_load(QEMUFile *f, void *opaque, int version_id)
 
     /* Items in struct VirtIOSerialPort */
     for (i = 0; i < nr_active_ports; i++) {
+        VirtIOSerialPortBuffer *buf;
         uint32_t id;
+        unsigned int nr_bufs;
 
         id = qemu_get_be32(f);
         port = find_port_by_id(s, id);
 
+        port->byte_limit = qemu_get_be64(f);
+        port->nr_bytes = qemu_get_be64(f);
         port->guest_connected = qemu_get_byte(f);
+        port->host_throttled = qemu_get_byte(f);
+
+        /* All the pending buffers from active ports */
+        qemu_get_be32s(f, &nr_bufs);
+        if (!nr_bufs) {
+            continue;
+        }
+        for (; nr_bufs; nr_bufs--) {
+            size_t len;
+            qemu_get_be64s(f, &len);
+            buf = alloc_buf(len);
+
+            qemu_get_be64s(f, &buf->offset);
+            qemu_get_be32s(f, &buf->flags);
+            qemu_get_buffer(f, buf->buf, buf->len);
+            QTAILQ_INSERT_TAIL(&port->unflushed_buffers, buf, next);
+        }
     }
 
     return 0;
@@ -490,6 +788,10 @@ static void virtser_bus_dev_print(Monitor *mon, DeviceState *qdev, int indent)
                    indent, "", port->guest_connected);
     monitor_printf(mon, "%*s dev-prop-int: host_connected: %d\n",
                    indent, "", port->host_connected);
+    monitor_printf(mon, "%*s dev-prop-int: host_throttled: %d\n",
+                   indent, "", port->host_throttled);
+    monitor_printf(mon, "%*s dev-prop-int: nr_bytes: %zu\n",
+                   indent, "", port->nr_bytes);
 }
 
 static int virtser_port_qdev_init(DeviceState *qdev, DeviceInfo *base)
@@ -520,6 +822,7 @@ static int virtser_port_qdev_init(DeviceState *qdev, DeviceInfo *base)
     if (ret) {
         return ret;
     }
+    QTAILQ_INIT(&port->unflushed_buffers);
 
     port->id = plugging_port0 ? 0 : port->vser->config.nr_ports++;
 
@@ -570,6 +873,8 @@ static int virtser_port_qdev_exit(DeviceState *qdev)
     if (port->info->exit)
         port->info->exit(dev);
 
+    remove_port_buffers(port);
+
     return 0;
 }
 
diff --git a/hw/virtio-serial.c b/hw/virtio-serial.c
index 470446b..fd27c33 100644
--- a/hw/virtio-serial.c
+++ b/hw/virtio-serial.c
@@ -66,13 +66,14 @@ static int virtconsole_initfn(VirtIOSerialDevice *dev)
 
     port->info = dev->info;
 
-    port->is_console = true;
-
     /*
-     * For console ports, just assume the guest is ready to accept our
-     * data.
+     * We're not interested in data the guest sends while nothing is
+     * connected on the host side. Just ignore it instead of saving it
+     * for later consumption.
      */
-    port->guest_connected = true;
+    port->cache_buffers = 0;
+
+    port->is_console = true;
 
     if (vcon->chr) {
         qemu_chr_add_handlers(vcon->chr, chr_can_read, chr_read, chr_event,
diff --git a/hw/virtio-serial.h b/hw/virtio-serial.h
index 5505841..acb601d 100644
--- a/hw/virtio-serial.h
+++ b/hw/virtio-serial.h
@@ -49,12 +49,18 @@ struct virtio_console_header {
     uint32_t flags; /* Some message between host and guest */
 };
 
+/* Messages between host and guest */
+#define VIRTIO_CONSOLE_HDR_START_DATA (1 << 0)
+#define VIRTIO_CONSOLE_HDR_END_DATA (1 << 1)
+
 /* Some events for the internal messages (control packets) */
 #define VIRTIO_CONSOLE_PORT_READY 0
 #define VIRTIO_CONSOLE_CONSOLE_PORT 1
 #define VIRTIO_CONSOLE_RESIZE 2
 #define VIRTIO_CONSOLE_PORT_OPEN 3
 #define VIRTIO_CONSOLE_PORT_NAME 4
+#define VIRTIO_CONSOLE_THROTTLE_PORT 5
+#define VIRTIO_CONSOLE_CACHE_BUFFERS 6
 
 /* == In-qemu interface == */
 
@@ -96,6 +102,13 @@ struct VirtIOSerialPort {
     char *name;
 
     /*
+     * This list holds buffers pushed by the guest in case the guest
+     * sent incomplete messages or the host connection was down and
+     * the device requested to cache the data.
+     */
+    QTAILQ_HEAD(, VirtIOSerialPortBuffer) unflushed_buffers;
+
+    /*
      * This id helps identify ports between the guest and the host.
      * The guest sends a "header" with this id with each data packet
     * that it sends and the host can then find out which associated
@@ -103,6 +116,27 @@ struct VirtIOSerialPort {
      */
     uint32_t id;
 
+    /*
+     * Each port can specify the limit on number of bytes that can be
+     * outstanding in the unread buffers. This is to prevent any OOM
+     * situtation if a rogue process on the guest keeps injecting
+     * data.
+     */
+    size_t byte_limit;
+
+    /*
+     * The number of bytes we have queued up in our unread queue
+     */
+    size_t nr_bytes;
+
+    /*
+     * This boolean, when set, means "queue data that gets sent to
+     * this port when the host is not connected". The queued data, if
+     * any, is then sent out to the port when the host connection is
+     * opened.
+     */
+    uint8_t cache_buffers;
+
     /* Identify if this is a port that binds with hvc in the guest */
     uint8_t is_console;
 
@@ -110,6 +144,11 @@ struct VirtIOSerialPort {
     bool guest_connected;
     /* Is this device open for IO on the host? */
     bool host_connected;
+    /* Have we sent a throttle message to the guest? */
+    bool host_throttled;
+
+    /* Did this port get data in the recent handle_output call? */
+    bool has_activity;
 };
 
 struct VirtIOSerialPortInfo {
Guests send us one buffer at a time, and current guests send buffers
sized 4K bytes.  If a guest userspace application sends out more than
4K bytes in one write() syscall, the write request actually goes out
as multiple buffers, each 4K in size.

This usually isn't a problem, but for some apps, like VNC, the entire
data has to be sent in one go for copy/paste to work fine.  So if an
app on the guest sends out guest clipboard contents, it has to reach
the vnc server in one piece, just as the guest app sent it.

For this to be done, we need the guest to send us START and END
markers for each write request so that we can identify complete
buffers and send them off to ports.  This requires us to buffer all
the data that comes in from the guests, hold it until we see all the
data corresponding to one write request, merge it all into one buffer
and then send it to the port the data was destined for.

Also, we add support for caching these buffers till a port indicates
it's ready to receive data.  We keep caching data the guest sends us
till a port accepts it.  However, this could lead to an OOM condition
where a rogue process on the guest keeps pumping in data while the
host continues to cache it.  We introduce a per-port byte-limit
property to alleviate this condition.  When this limit is reached, we
send a control message to the guest telling it not to send us any more
data till further indication.  When the number of bytes cached drops
below the specified limit, we tell the guest to restart sending data.

Signed-off-by: Amit Shah <amit.shah@redhat.com>
---
 hw/virtio-serial-bus.c |  309 +++++++++++++++++++++++++++++++++++++++++++++++-
 hw/virtio-serial.c     |   11 +-
 hw/virtio-serial.h     |   39 ++++++
 3 files changed, 352 insertions(+), 7 deletions(-)
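The START/END bookkeeping described above can be modelled in miniature. `struct chunk`, `complete_size` and the flag names below are illustrative stand-ins for the patch's `VirtIOSerialPortBuffer` queue and `get_complete_data_size()`; this sketch keeps only the happy path and omits the patch's missing-flag fallbacks:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Flag values mirror VIRTIO_CONSOLE_HDR_START_DATA / _END_DATA */
#define HDR_START (1u << 0)
#define HDR_END   (1u << 1)

/* Simplified stand-in for one queued VirtIOSerialPortBuffer */
struct chunk {
    size_t   len;
    uint32_t flags;
};

/*
 * Return the total size of one complete START..END run at the head of
 * the queue, or 0 if the END marker has not arrived yet, in which case
 * the caller keeps queueing and tries again on the next buffer.
 */
static size_t complete_size(const struct chunk *q, size_t n)
{
    size_t total = 0;

    for (size_t i = 0; i < n; i++) {
        total += q[i].len;
        if (q[i].flags & HDR_END) {
            return total;       /* saw END: the write is complete */
        }
    }
    return 0;                   /* still waiting for more buffers */
}
```

So a 10k guest write arriving as three buffers is flushed only once the third, END-flagged buffer shows up; until then the host just accumulates.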