diff mbox series

char-mux: Don't overwrite the receive buffer

Message ID 20240707111940.232549-3-lrh2000@pku.edu.cn
State New
Headers show
Series char-mux: Don't overwrite the receive buffer | expand

Commit Message

Ruihan Li July 7, 2024, 11:19 a.m. UTC
This commit fixes a bug that causes incorrect results when pasting more
than 32 bytes, the size of the receive buffer b->buffer, into the virtio
console.

Example (note that the last 32 bytes are always correct, but something
goes wrong just before the last 32 bytes):

	Pasting  abcdefghijklmnopqrstuvwxyz1234567890ABCDEFGHIJKLMNOPQRSTUVWXYZ!@#$%^&*()
	Received abcdefg)EFGHIJKLMNOPQRSTUVWXYZ!@#$%^&*()EFGHIJKLMNOPQRSTUVWXYZ!@#$%^&*()

The root cause of this bug is as follows:

The mux_chr_read function passes the data to the backend via
be->chr_read one byte at a time, either directly or via another
mux_chr_accept_input method. However, if the receive buffer is full,
there is a chance that the mux_chr_can_read method will return more than
one byte, because in this case the method directly returns whatever
be->chr_can_read returns.

This is problematic because if mux_chr_read passes a byte to the backend
by calling be->chr_read, it will consume the entire backend buffer, at
least in the case of virtio. Once all backend buffers are used,
mux_chr_read writes all remaining bytes to the receive buffer d->buffer,
but the number of remaining bytes can be larger than the buffer size.
This does not lead to security problems since it is a ring buffer, but
it does mess up the receive data.

This can be fixed by having mux_chr_can_read return either zero or one.
This fix is not very efficient, but it is quite reasonable since
mux_chr_read also passes the data to the backend one byte at a time.

Signed-off-by: Ruihan Li <lrh2000@pku.edu.cn>
---
 chardev/char-mux.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

Comments

Marc-André Lureau July 7, 2024, 4:28 p.m. UTC | #1
Hi

On Sun, Jul 7, 2024 at 3:26 PM Ruihan Li <lrh2000@pku.edu.cn> wrote:

> This commit fixes a bug that causes incorrect results when pasting more
> than 32 bytes, the size of the receive buffer b->buffer, into the virtio
> console.
>
> Example (note that the last 32 bytes are always correct, but something
> goes wrong just before the last 32 bytes):
>
>         Pasting
> abcdefghijklmnopqrstuvwxyz1234567890ABCDEFGHIJKLMNOPQRSTUVWXYZ!@#$%^&*()
>         Received
> abcdefg)EFGHIJKLMNOPQRSTUVWXYZ!@#$%^&*()EFGHIJKLMNOPQRSTUVWXYZ!@#$%^&*()
>
> The root cause of this bug is as follows:
>
> The mux_chr_read function passes the data to the backend via
> be->chr_read one byte at a time, either directly or via another
> mux_chr_accept_input method. However, if the receive buffer is full,
> there is a chance that the mux_chr_can_read method will return more than
> one byte, because in this case the method directly returns whatever
> be->chr_can_read returns.
>
> This is problematic because if mux_chr_read passes a byte to the backend
> by calling be->chr_read, it will consume the entire backend buffer, at
> least in the case of virtio. Once all backend buffers are used,
> mux_chr_read writes all remaining bytes to the receive buffer d->buffer,
>

My understanding of the code execution is:
- mux_chr_can_read() returns be->chr_can_read(), say N, because d->buffer
is already MUX_BUFFER_SIZE.
- mux_chr_read() is called with N bytes
- mux_chr_accept_input() flushes d->buffer, writing MUX_BUFFER_SIZE
- be should still accept N-MUX_BUFFER_SIZE
- mux_proc_byte() loops for N bytes
- chr_read() should accept the N-MUX_BUFFER_SIZE
- d->buffer is then filled with the remaining MUX_BUFFER_SIZE


> but the number of remaining bytes can be larger than the buffer size.
>

By the above description, I don't see how it happens.

This does not lead to security problems since it is a ring buffer, but
> it does mess up the receive data.
>
> This can be fixed by having mux_chr_can_read return either zero or one.
> This fix is not very efficient, but it is quite reasonable since
> mux_chr_read also passes the data to the backend one byte at a time.
>

Could you share your testing setup? Even better if you could write a test!


thanks


> Signed-off-by: Ruihan Li <lrh2000@pku.edu.cn>
> ---
>  chardev/char-mux.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/chardev/char-mux.c b/chardev/char-mux.c
> index ee2d47b..5c6eea2 100644
> --- a/chardev/char-mux.c
> +++ b/chardev/char-mux.c
> @@ -210,8 +210,8 @@ static int mux_chr_can_read(void *opaque)
>          return 1;
>      }
>
> -    if (be && be->chr_can_read) {
> -        return be->chr_can_read(be->opaque);
> +    if (be && be->chr_can_read && be->chr_can_read(be->opaque)) {
> +        return 1;
>      }
>
>      return 0;
> --
> 2.45.2
>
>
>
Ruihan Li July 7, 2024, 8:11 p.m. UTC | #2
Hi,

Thanks for your quick review!

On Sun, Jul 07, 2024 at 08:28:50PM GMT, Marc-André Lureau wrote:
> Hi
> 
> On Sun, Jul 7, 2024 at 3:26 PM Ruihan Li <lrh2000@pku.edu.cn> wrote:
> 
> > This commit fixes a bug that causes incorrect results when pasting more
> > than 32 bytes, the size of the receive buffer b->buffer, into the virtio
> > console.
> >
> > Example (note that the last 32 bytes are always correct, but something
> > goes wrong just before the last 32 bytes):
> >
> >         Pasting
> > abcdefghijklmnopqrstuvwxyz1234567890ABCDEFGHIJKLMNOPQRSTUVWXYZ!@#$%^&*()
> >         Received
> > abcdefg)EFGHIJKLMNOPQRSTUVWXYZ!@#$%^&*()EFGHIJKLMNOPQRSTUVWXYZ!@#$%^&*()
> >
> > The root cause of this bug is as follows:
> >
> > The mux_chr_read function passes the data to the backend via
> > be->chr_read one byte at a time, either directly or via another
> > mux_chr_accept_input method. However, if the receive buffer is full,
> > there is a chance that the mux_chr_can_read method will return more than
> > one byte, because in this case the method directly returns whatever
> > be->chr_can_read returns.
> >
> > This is problematic because if mux_chr_read passes a byte to the backend
> > by calling be->chr_read, it will consume the entire backend buffer, at
> > least in the case of virtio. Once all backend buffers are used,
> > mux_chr_read writes all remaining bytes to the receive buffer d->buffer,
> >
> 
> My understanding of the code execution is:
> - mux_chr_can_read() returns be->chr_can_read(), say N, because d->buffer
> is already MUX_BUFFER_SIZE.
> - mux_chr_read() is called with N bytes
> - mux_chr_accept_input() flushes d->buffer, writing MUX_BUFFER_SIZE
> - be should still accept N-MUX_BUFFER_SIZE
> - mux_proc_byte() loops for N bytes
> - chr_read() should accept the N-MUX_BUFFER_SIZE
> - d->buffer is then filled with the remaining MUX_BUFFER_SIZE

Note this:
	[..] if mux_chr_read passes a byte to the backend by calling
	be->chr_read, it will consume the entire backend buffer, at
	least in the case of virtio [..]

At least in the case of virtio, if the guest provides a buffer of length
4096, be->chr_can_read will report 4096. But if you then call
be->chr_read with one byte, the whole 4096 buffer will be used. After
that, be->chr_can_read will return zero instead of 4095.

This should make sense since the device cannot change the number of
bytes in the buffer after it has made the buffer available to the CPU.

> 
> 
> > but the number of remaining bytes can be larger than the buffer size.
> >
> 
> By the above description, I don't see how it happens.
> 
> This does not lead to security problems since it is a ring buffer, but
> > it does mess up the receive data.
> >
> > This can be fixed by having mux_chr_can_read return either zero or one.
> > This fix is not very efficient, but it is quite reasonable since
> > mux_chr_read also passes the data to the backend one byte at a time.
> >
> 
> Could you share your testing setup? Even better if you could write a test!

This happens in https://github.com/asterinas/asterinas. Sorry, but I
don't have a minimal reproducible example, and I don't think I can make
one anytime soon.

As for the tests, I'd like to know how to write such tests in QEMU. I
checked the documentation but didn't find anything, maybe I'm missing
something?

> 
> 
> thanks
> 
> 
> > Signed-off-by: Ruihan Li <lrh2000@pku.edu.cn>
> > ---
> >  chardev/char-mux.c | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/chardev/char-mux.c b/chardev/char-mux.c
> > index ee2d47b..5c6eea2 100644
> > --- a/chardev/char-mux.c
> > +++ b/chardev/char-mux.c
> > @@ -210,8 +210,8 @@ static int mux_chr_can_read(void *opaque)
> >          return 1;
> >      }
> >
> > -    if (be && be->chr_can_read) {
> > -        return be->chr_can_read(be->opaque);
> > +    if (be && be->chr_can_read && be->chr_can_read(be->opaque)) {
> > +        return 1;
> >      }
> >
> >      return 0;
> > --
> > 2.45.2
> >
> >
> >
> 
> -- 
> Marc-André Lureau

Thanks,
Ruihan Li
Marc-André Lureau July 8, 2024, 11:21 a.m. UTC | #3
Hi

On Mon, Jul 8, 2024 at 12:12 AM Ruihan Li <lrh2000@pku.edu.cn> wrote:

> Hi,
>
> Thanks for your quick review!
>
> On Sun, Jul 07, 2024 at 08:28:50PM GMT, Marc-André Lureau wrote:
> > Hi
> >
> > On Sun, Jul 7, 2024 at 3:26 PM Ruihan Li <lrh2000@pku.edu.cn> wrote:
> >
> > > This commit fixes a bug that causes incorrect results when pasting more
> > > than 32 bytes, the size of the receive buffer b->buffer, into the
> virtio
> > > console.
> > >
> > > Example (note that the last 32 bytes are always correct, but something
> > > goes wrong just before the last 32 bytes):
> > >
> > >         Pasting
> > >
> abcdefghijklmnopqrstuvwxyz1234567890ABCDEFGHIJKLMNOPQRSTUVWXYZ!@#$%^&*()
> > >         Received
> > >
> abcdefg)EFGHIJKLMNOPQRSTUVWXYZ!@#$%^&*()EFGHIJKLMNOPQRSTUVWXYZ!@#$%^&*()
> > >
> > > The root cause of this bug is as follows:
> > >
> > > The mux_chr_read function passes the data to the backend via
> > > be->chr_read one byte at a time, either directly or via another
> > > mux_chr_accept_input method. However, if the receive buffer is full,
> > > there is a chance that the mux_chr_can_read method will return more
> than
> > > one byte, because in this case the method directly returns whatever
> > > be->chr_can_read returns.
> > >
> > > This is problematic because if mux_chr_read passes a byte to the
> backend
> > > by calling be->chr_read, it will consume the entire backend buffer, at
> > > least in the case of virtio. Once all backend buffers are used,
> > > mux_chr_read writes all remaining bytes to the receive buffer
> d->buffer,
> > >
> >
> > My understanding of the code execution is:
> > - mux_chr_can_read() returns be->chr_can_read(), say N, because d->buffer
> > is already MUX_BUFFER_SIZE.
> > - mux_chr_read() is called with N bytes
> > - mux_chr_accept_input() flushes d->buffer, writing MUX_BUFFER_SIZE
> > - be should still accept N-MUX_BUFFER_SIZE
> > - mux_proc_byte() loops for N bytes
> > - chr_read() should accept the N-MUX_BUFFER_SIZE
> > - d->buffer is then filled with the remaining MUX_BUFFER_SIZE
>
> Note this:
>         [..] if mux_chr_read passes a byte to the backend by calling
>         be->chr_read, it will consume the entire backend buffer, at
>         least in the case of virtio [..]
>
> At least in the case of virtio, if the guest provides a buffer of length
> 4096, be->chr_can_read will report 4096. But if you then call
> be->chr_read with one byte, the whole 4096 buffer will be used. After
> that, be->chr_can_read will return zero instead of 4095.
>
> This should make sense since the device cannot change the number of
> bytes in the buffer after it has made the buffer available to the CPU.
>

Thanks, that helps explaining the incorrect behaviour.

I think the concept of extra buffer as introduced in commit
bd9bdce694ccb76facc882363e4c337e8a88c918 ("Add input buffer to mux chr
(patch by Tristan Gingold)") is flawed, as Jan Kiszka explained in commit
a80bf99fa3dd829ecea88b9bfb4f7cf146208f07 ("char-mux: Use separate input
buffers (Jan Kiszka)"):
    Note: In contrast to the original author's claim, the buffering concept
    still breaks down when the fifo of the currently active sub-device is
    full. As we cannot accept futher data from this point on without risking
    to loose it, we will also miss escape sequences, just like without all
    that buffering. In short: There is no reliable escape sequence handling
    without infinite buffers or the risk of loosing some data.

Maybe the best course is to remove the cycle buffer and either:
- drop the data that be can't accept, but have always responsive mux (by
default)
- blocking, including mux, until the be can accept more data (not friendly)
- or allow unlimited buffering?

Given that mux is meant for developers and qemu CLI users, I guess any of
this would be acceptable.


>
> >
> >
> > > but the number of remaining bytes can be larger than the buffer size.
> > >
> >
> > By the above description, I don't see how it happens.
> >
> > This does not lead to security problems since it is a ring buffer, but
> > > it does mess up the receive data.
> > >
> > > This can be fixed by having mux_chr_can_read return either zero or one.
> > > This fix is not very efficient, but it is quite reasonable since
> > > mux_chr_read also passes the data to the backend one byte at a time.
> > >
> >
> > Could you share your testing setup? Even better if you could write a
> test!
>
> This happens in https://github.com/asterinas/asterinas. Sorry, but I
> don't have a minimal reproducible example, and I don't think I can make
> one anytime soon.
>
> As for the tests, I'd like to know how to write such tests in QEMU. I
> checked the documentation but didn't find anything, maybe I'm missing
> something?
>
> >
> >
> > thanks
> >
> >
> > > Signed-off-by: Ruihan Li <lrh2000@pku.edu.cn>
> > > ---
> > >  chardev/char-mux.c | 4 ++--
> > >  1 file changed, 2 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/chardev/char-mux.c b/chardev/char-mux.c
> > > index ee2d47b..5c6eea2 100644
> > > --- a/chardev/char-mux.c
> > > +++ b/chardev/char-mux.c
> > > @@ -210,8 +210,8 @@ static int mux_chr_can_read(void *opaque)
> > >          return 1;
> > >      }
> > >
> > > -    if (be && be->chr_can_read) {
> > > -        return be->chr_can_read(be->opaque);
> > > +    if (be && be->chr_can_read && be->chr_can_read(be->opaque)) {
> > > +        return 1;
> > >      }
> > >
> > >      return 0;
> > > --
> > > 2.45.2
> > >
> > >
> > >
> >
> > --
> > Marc-André Lureau
>
> Thanks,
> Ruihan Li
>
>
Ruihan Li July 9, 2024, 2:41 p.m. UTC | #4
Hi,

On Mon, Jul 08, 2024 at 03:21:58PM GMT, Marc-André Lureau wrote:
> Hi
> 
> On Mon, Jul 8, 2024 at 12:12 AM Ruihan Li <lrh2000@pku.edu.cn> wrote:
> 
> > Hi,
> >
> > Thanks for your quick review!
> >
> > On Sun, Jul 07, 2024 at 08:28:50PM GMT, Marc-André Lureau wrote:
> > > Hi
> > >
> > > On Sun, Jul 7, 2024 at 3:26 PM Ruihan Li <lrh2000@pku.edu.cn> wrote:
> > >
> > > > This commit fixes a bug that causes incorrect results when pasting more
> > > > than 32 bytes, the size of the receive buffer b->buffer, into the
> > virtio
> > > > console.
> > > >
> > > > Example (note that the last 32 bytes are always correct, but something
> > > > goes wrong just before the last 32 bytes):
> > > >
> > > >         Pasting
> > > >
> > abcdefghijklmnopqrstuvwxyz1234567890ABCDEFGHIJKLMNOPQRSTUVWXYZ!@#$%^&*()
> > > >         Received
> > > >
> > abcdefg)EFGHIJKLMNOPQRSTUVWXYZ!@#$%^&*()EFGHIJKLMNOPQRSTUVWXYZ!@#$%^&*()
> > > >
> > > > The root cause of this bug is as follows:
> > > >
> > > > The mux_chr_read function passes the data to the backend via
> > > > be->chr_read one byte at a time, either directly or via another
> > > > mux_chr_accept_input method. However, if the receive buffer is full,
> > > > there is a chance that the mux_chr_can_read method will return more
> > than
> > > > one byte, because in this case the method directly returns whatever
> > > > be->chr_can_read returns.
> > > >
> > > > This is problematic because if mux_chr_read passes a byte to the
> > backend
> > > > by calling be->chr_read, it will consume the entire backend buffer, at
> > > > least in the case of virtio. Once all backend buffers are used,
> > > > mux_chr_read writes all remaining bytes to the receive buffer
> > d->buffer,
> > > >
> > >
> > > My understanding of the code execution is:
> > > - mux_chr_can_read() returns be->chr_can_read(), say N, because d->buffer
> > > is already MUX_BUFFER_SIZE.
> > > - mux_chr_read() is called with N bytes
> > > - mux_chr_accept_input() flushes d->buffer, writing MUX_BUFFER_SIZE
> > > - be should still accept N-MUX_BUFFER_SIZE
> > > - mux_proc_byte() loops for N bytes
> > > - chr_read() should accept the N-MUX_BUFFER_SIZE
> > > - d->buffer is then filled with the remaining MUX_BUFFER_SIZE
> >
> > Note this:
> >         [..] if mux_chr_read passes a byte to the backend by calling
> >         be->chr_read, it will consume the entire backend buffer, at
> >         least in the case of virtio [..]
> >
> > At least in the case of virtio, if the guest provides a buffer of length
> > 4096, be->chr_can_read will report 4096. But if you then call
> > be->chr_read with one byte, the whole 4096 buffer will be used. After
> > that, be->chr_can_read will return zero instead of 4095.
> >
> > This should make sense since the device cannot change the number of
> > bytes in the buffer after it has made the buffer available to the CPU.
> >
> 
> Thanks, that helps explaining the incorrect behaviour.
> 
> I think the concept of extra buffer as introduced in commit
> bd9bdce694ccb76facc882363e4c337e8a88c918 ("Add input buffer to mux chr
> (patch by Tristan Gingold)") is flawed, as Jan Kiszka explained in commit
> a80bf99fa3dd829ecea88b9bfb4f7cf146208f07 ("char-mux: Use separate input
> buffers (Jan Kiszka)"):
>     Note: In contrast to the original author's claim, the buffering concept
>     still breaks down when the fifo of the currently active sub-device is
>     full. As we cannot accept futher data from this point on without risking
>     to loose it, we will also miss escape sequences, just like without all
>     that buffering. In short: There is no reliable escape sequence handling
>     without infinite buffers or the risk of loosing some data.
> 
> Maybe the best course is to remove the cycle buffer and either:
> - drop the data that be can't accept, but have always responsive mux (by
> default)
> - blocking, including mux, until the be can accept more data (not friendly)
> - or allow unlimited buffering?
> 
> Given that mux is meant for developers and qemu CLI users, I guess any of
> this would be acceptable.

Thanks for your comments.

However, I'm not really sure what you're talking about. If we make
mux_chr_can_read return either zero or one (as I've done in the patch),
do you mean that we are still at risk of losing some escape sequences?

In mux_proc_byte, we set d->term_got_escape to 1 when we see the escape
character. As far as I can see, the escape sequence is always handled
correctly. So I don't understand how losing escape sequences can happen.

Would you mind explaining this in more detail?

> 
> 
> >
> > >
> > >
> > > > but the number of remaining bytes can be larger than the buffer size.
> > > >
> > >
> > > By the above description, I don't see how it happens.
> > >
> > > This does not lead to security problems since it is a ring buffer, but
> > > > it does mess up the receive data.
> > > >
> > > > This can be fixed by having mux_chr_can_read return either zero or one.
> > > > This fix is not very efficient, but it is quite reasonable since
> > > > mux_chr_read also passes the data to the backend one byte at a time.
> > > >
> > >
> > > Could you share your testing setup? Even better if you could write a
> > test!
> >
> > This happens in https://github.com/asterinas/asterinas. Sorry, but I
> > don't have a minimal reproducible example, and I don't think I can make
> > one anytime soon.
> >
> > As for the tests, I'd like to know how to write such tests in QEMU. I
> > checked the documentation but didn't find anything, maybe I'm missing
> > something?
> >
> > >
> > >
> > > thanks
> > >
> > >
> > > > Signed-off-by: Ruihan Li <lrh2000@pku.edu.cn>
> > > > ---
> > > >  chardev/char-mux.c | 4 ++--
> > > >  1 file changed, 2 insertions(+), 2 deletions(-)
> > > >
> > > > diff --git a/chardev/char-mux.c b/chardev/char-mux.c
> > > > index ee2d47b..5c6eea2 100644
> > > > --- a/chardev/char-mux.c
> > > > +++ b/chardev/char-mux.c
> > > > @@ -210,8 +210,8 @@ static int mux_chr_can_read(void *opaque)
> > > >          return 1;
> > > >      }
> > > >
> > > > -    if (be && be->chr_can_read) {
> > > > -        return be->chr_can_read(be->opaque);
> > > > +    if (be && be->chr_can_read && be->chr_can_read(be->opaque)) {
> > > > +        return 1;
> > > >      }
> > > >
> > > >      return 0;
> > > > --
> > > > 2.45.2
> > > >
> > > >
> > > >
> > >
> > > --
> > > Marc-André Lureau
> >
> > Thanks,
> > Ruihan Li
> >
> >
> 
> -- 
> Marc-André Lureau

Thanks,
Ruihan Li
Marc-André Lureau July 9, 2024, 2:58 p.m. UTC | #5
Hi

On Tue, Jul 9, 2024 at 6:41 PM Ruihan Li <lrh2000@pku.edu.cn> wrote:

> Hi,
>
> On Mon, Jul 08, 2024 at 03:21:58PM GMT, Marc-André Lureau wrote:
> > Hi
> >
> > On Mon, Jul 8, 2024 at 12:12 AM Ruihan Li <lrh2000@pku.edu.cn> wrote:
> >
> > > Hi,
> > >
> > > Thanks for your quick review!
> > >
> > > On Sun, Jul 07, 2024 at 08:28:50PM GMT, Marc-André Lureau wrote:
> > > > Hi
> > > >
> > > > On Sun, Jul 7, 2024 at 3:26 PM Ruihan Li <lrh2000@pku.edu.cn> wrote:
> > > >
> > > > > This commit fixes a bug that causes incorrect results when pasting
> more
> > > > > than 32 bytes, the size of the receive buffer b->buffer, into the
> > > virtio
> > > > > console.
> > > > >
> > > > > Example (note that the last 32 bytes are always correct, but
> something
> > > > > goes wrong just before the last 32 bytes):
> > > > >
> > > > >         Pasting
> > > > >
> > >
> abcdefghijklmnopqrstuvwxyz1234567890ABCDEFGHIJKLMNOPQRSTUVWXYZ!@#$%^&*()
> > > > >         Received
> > > > >
> > >
> abcdefg)EFGHIJKLMNOPQRSTUVWXYZ!@#$%^&*()EFGHIJKLMNOPQRSTUVWXYZ!@#$%^&*()
> > > > >
> > > > > The root cause of this bug is as follows:
> > > > >
> > > > > The mux_chr_read function passes the data to the backend via
> > > > > be->chr_read one byte at a time, either directly or via another
> > > > > mux_chr_accept_input method. However, if the receive buffer is
> full,
> > > > > there is a chance that the mux_chr_can_read method will return more
> > > than
> > > > > one byte, because in this case the method directly returns whatever
> > > > > be->chr_can_read returns.
> > > > >
> > > > > This is problematic because if mux_chr_read passes a byte to the
> > > backend
> > > > > by calling be->chr_read, it will consume the entire backend
> buffer, at
> > > > > least in the case of virtio. Once all backend buffers are used,
> > > > > mux_chr_read writes all remaining bytes to the receive buffer
> > > d->buffer,
> > > > >
> > > >
> > > > My understanding of the code execution is:
> > > > - mux_chr_can_read() returns be->chr_can_read(), say N, because
> d->buffer
> > > > is already MUX_BUFFER_SIZE.
> > > > - mux_chr_read() is called with N bytes
> > > > - mux_chr_accept_input() flushes d->buffer, writing MUX_BUFFER_SIZE
> > > > - be should still accept N-MUX_BUFFER_SIZE
> > > > - mux_proc_byte() loops for N bytes
> > > > - chr_read() should accept the N-MUX_BUFFER_SIZE
> > > > - d->buffer is then filled with the remaining MUX_BUFFER_SIZE
> > >
> > > Note this:
> > >         [..] if mux_chr_read passes a byte to the backend by calling
> > >         be->chr_read, it will consume the entire backend buffer, at
> > >         least in the case of virtio [..]
> > >
> > > At least in the case of virtio, if the guest provides a buffer of
> length
> > > 4096, be->chr_can_read will report 4096. But if you then call
> > > be->chr_read with one byte, the whole 4096 buffer will be used. After
> > > that, be->chr_can_read will return zero instead of 4095.
> > >
> > > This should make sense since the device cannot change the number of
> > > bytes in the buffer after it has made the buffer available to the CPU.
> > >
> >
> > Thanks, that helps explaining the incorrect behaviour.
> >
> > I think the concept of extra buffer as introduced in commit
> > bd9bdce694ccb76facc882363e4c337e8a88c918 ("Add input buffer to mux chr
> > (patch by Tristan Gingold)") is flawed, as Jan Kiszka explained in commit
> > a80bf99fa3dd829ecea88b9bfb4f7cf146208f07 ("char-mux: Use separate input
> > buffers (Jan Kiszka)"):
> >     Note: In contrast to the original author's claim, the buffering
> concept
> >     still breaks down when the fifo of the currently active sub-device is
> >     full. As we cannot accept futher data from this point on without
> risking
> >     to loose it, we will also miss escape sequences, just like without
> all
> >     that buffering. In short: There is no reliable escape sequence
> handling
> >     without infinite buffers or the risk of loosing some data.
> >
> > Maybe the best course is to remove the cycle buffer and either:
> > - drop the data that be can't accept, but have always responsive mux (by
> > default)
> > - blocking, including mux, until the be can accept more data (not
> friendly)
> > - or allow unlimited buffering?
> >
> > Given that mux is meant for developers and qemu CLI users, I guess any of
> > this would be acceptable.
>
> Thanks for your comments.
>
> However, I'm not really sure what you're talking about. If we make
> mux_chr_can_read return either zero or one (as I've done in the patch),
> do you mean that we are still at risk of losing some escape sequences?
>

> In mux_proc_byte, we set d->term_got_escape to 1 when we see the escape
> character. As far as I can see, the escape sequence is always handled
> correctly. So I don't understand how losing escape sequences can happen.
>
> Would you mind explaining this in more detail?
>


I agree with you that returning 0 or 1 in mux_chr_can_read() should solve
the issue (assuming future call to be can_read still return >= 1). But it's
not elegant to read/write by 1 bytes, especially as you explained, it takes
4k buffers for virtio-serial by write. My comment is more general also: the
32 bytes buffer isn't really helping, at some point it may be full and mux
will stop handling input...

(I think the quoted comment talks about escape sequences for the guest, not
the mux term_got_escape - unfortunately the original commit bd9bdce69
introducing the mux buffer doesn't have details)


> >
> >
> > >
> > > >
> > > >
> > > > > but the number of remaining bytes can be larger than the buffer
> size.
> > > > >
> > > >
> > > > By the above description, I don't see how it happens.
> > > >
> > > > This does not lead to security problems since it is a ring buffer,
> but
> > > > > it does mess up the receive data.
> > > > >
> > > > > This can be fixed by having mux_chr_can_read return either zero or
> one.
> > > > > This fix is not very efficient, but it is quite reasonable since
> > > > > mux_chr_read also passes the data to the backend one byte at a
> time.
> > > > >
> > > >
> > > > Could you share your testing setup? Even better if you could write a
> > > test!
> > >
> > > This happens in https://github.com/asterinas/asterinas. Sorry, but I
> > > don't have a minimal reproducible example, and I don't think I can make
> > > one anytime soon.
> > >
> > > As for the tests, I'd like to know how to write such tests in QEMU. I
> > > checked the documentation but didn't find anything, maybe I'm missing
> > > something?
> > >
> > > >
> > > >
> > > > thanks
> > > >
> > > >
> > > > > Signed-off-by: Ruihan Li <lrh2000@pku.edu.cn>
> > > > > ---
> > > > >  chardev/char-mux.c | 4 ++--
> > > > >  1 file changed, 2 insertions(+), 2 deletions(-)
> > > > >
> > > > > diff --git a/chardev/char-mux.c b/chardev/char-mux.c
> > > > > index ee2d47b..5c6eea2 100644
> > > > > --- a/chardev/char-mux.c
> > > > > +++ b/chardev/char-mux.c
> > > > > @@ -210,8 +210,8 @@ static int mux_chr_can_read(void *opaque)
> > > > >          return 1;
> > > > >      }
> > > > >
> > > > > -    if (be && be->chr_can_read) {
> > > > > -        return be->chr_can_read(be->opaque);
> > > > > +    if (be && be->chr_can_read && be->chr_can_read(be->opaque)) {
> > > > > +        return 1;
> > > > >      }
> > > > >
> > > > >      return 0;
> > > > > --
> > > > > 2.45.2
> > > > >
> > > > >
> > > > >
> > > >
> > > > --
> > > > Marc-André Lureau
> > >
> > > Thanks,
> > > Ruihan Li
> > >
> > >
> >
> > --
> > Marc-André Lureau
>
> Thanks,
> Ruihan Li
>
>
Ruihan Li July 9, 2024, 3:56 p.m. UTC | #6
Hi,

On Tue, Jul 09, 2024 at 06:58:41PM GMT, Marc-André Lureau wrote:
> Hi
> 
> On Tue, Jul 9, 2024 at 6:41 PM Ruihan Li <lrh2000@pku.edu.cn> wrote:
> 
> > Hi,
> >
> > On Mon, Jul 08, 2024 at 03:21:58PM GMT, Marc-André Lureau wrote:
> > > Hi
> > >
> > > On Mon, Jul 8, 2024 at 12:12 AM Ruihan Li <lrh2000@pku.edu.cn> wrote:
> > >
> > > > Hi,
> > > >
> > > > Thanks for your quick review!
> > > >
> > > > On Sun, Jul 07, 2024 at 08:28:50PM GMT, Marc-André Lureau wrote:
> > > > > Hi
> > > > >
> > > > > On Sun, Jul 7, 2024 at 3:26 PM Ruihan Li <lrh2000@pku.edu.cn> wrote:
> > > > >
> > > > > > This commit fixes a bug that causes incorrect results when pasting
> > more
> > > > > > than 32 bytes, the size of the receive buffer b->buffer, into the
> > > > virtio
> > > > > > console.
> > > > > >
> > > > > > Example (note that the last 32 bytes are always correct, but
> > something
> > > > > > goes wrong just before the last 32 bytes):
> > > > > >
> > > > > >         Pasting
> > > > > >
> > > >
> > abcdefghijklmnopqrstuvwxyz1234567890ABCDEFGHIJKLMNOPQRSTUVWXYZ!@#$%^&*()
> > > > > >         Received
> > > > > >
> > > >
> > abcdefg)EFGHIJKLMNOPQRSTUVWXYZ!@#$%^&*()EFGHIJKLMNOPQRSTUVWXYZ!@#$%^&*()
> > > > > >
> > > > > > The root cause of this bug is as follows:
> > > > > >
> > > > > > The mux_chr_read function passes the data to the backend via
> > > > > > be->chr_read one byte at a time, either directly or via another
> > > > > > mux_chr_accept_input method. However, if the receive buffer is
> > full,
> > > > > > there is a chance that the mux_chr_can_read method will return more
> > > > than
> > > > > > one byte, because in this case the method directly returns whatever
> > > > > > be->chr_can_read returns.
> > > > > >
> > > > > > This is problematic because if mux_chr_read passes a byte to the
> > > > backend
> > > > > > by calling be->chr_read, it will consume the entire backend
> > buffer, at
> > > > > > least in the case of virtio. Once all backend buffers are used,
> > > > > > mux_chr_read writes all remaining bytes to the receive buffer
> > > > d->buffer,
> > > > > >
> > > > >
> > > > > My understanding of the code execution is:
> > > > > - mux_chr_can_read() returns be->chr_can_read(), say N, because
> > d->buffer
> > > > > is already MUX_BUFFER_SIZE.
> > > > > - mux_chr_read() is called with N bytes
> > > > > - mux_chr_accept_input() flushes d->buffer, writing MUX_BUFFER_SIZE
> > > > > - be should still accept N-MUX_BUFFER_SIZE
> > > > > - mux_proc_byte() loops for N bytes
> > > > > - chr_read() should accept the N-MUX_BUFFER_SIZE
> > > > > - d->buffer is then filled with the remaining MUX_BUFFER_SIZE
> > > >
> > > > Note this:
> > > >         [..] if mux_chr_read passes a byte to the backend by calling
> > > >         be->chr_read, it will consume the entire backend buffer, at
> > > >         least in the case of virtio [..]
> > > >
> > > > At least in the case of virtio, if the guest provides a buffer of
> > length
> > > > 4096, be->chr_can_read will report 4096. But if you then call
> > > > be->chr_read with one byte, the whole 4096 buffer will be used. After
> > > > that, be->chr_can_read will return zero instead of 4095.
> > > >
> > > > This should make sense since the device cannot change the number of
> > > > bytes in the buffer after it has made the buffer available to the CPU.
> > > >
> > >
> > > Thanks, that helps explaining the incorrect behaviour.
> > >
> > > I think the concept of extra buffer as introduced in commit
> > > bd9bdce694ccb76facc882363e4c337e8a88c918 ("Add input buffer to mux chr
> > > (patch by Tristan Gingold)") is flawed, as Jan Kiszka explained in commit
> > > a80bf99fa3dd829ecea88b9bfb4f7cf146208f07 ("char-mux: Use separate input
> > > buffers (Jan Kiszka)"):
> > >     Note: In contrast to the original author's claim, the buffering
> > concept
> > >     still breaks down when the fifo of the currently active sub-device is
> > >     full. As we cannot accept futher data from this point on without
> > risking
> > >     to loose it, we will also miss escape sequences, just like without
> > all
> > >     that buffering. In short: There is no reliable escape sequence
> > handling
> > >     without infinite buffers or the risk of loosing some data.
> > >
> > > Maybe the best course is to remove the cycle buffer and either:
> > > - drop the data that be can't accept, but have always responsive mux (by
> > > default)
> > > - blocking, including mux, until the be can accept more data (not
> > friendly)
> > > - or allow unlimited buffering?
> > >
> > > Given that mux is meant for developers and qemu CLI users, I guess any of
> > > this would be acceptable.
> >
> > Thanks for your comments.
> >
> > However, I'm not really sure what you're talking about. If we make
> > mux_chr_can_read return either zero or one (as I've done in the patch),
> > do you mean that we are still at risk of losing some escape sequences?
> >
> 
> > In mux_proc_byte, we set d->term_got_escape to 1 when we see the escape
> > character. As far as I can see, the escape sequence is always handled
> > correctly. So I don't understand how losing escape sequences can happen.
> >
> > Would you mind explaining this in more detail?
> >
> 
> 
> I agree with you that returning 0 or 1 in mux_chr_can_read() should solve
> the issue (assuming future call to be can_read still return >= 1). But it's
> not elegant to read/write by 1 bytes, especially as you explained, it takes
> 4k buffers for virtio-serial by write. My comment is more general also: the
> 32 bytes buffer isn't really helping, at some point it may be full and mux
> will stop handling input...
> 
> (I think the quoted comment talks about escape sequences for the guest, not
> the mux term_got_escape - unfortunately the original commit bd9bdce69
> introducing the mux buffer doesn't have details)

Thanks for the explanation. Yes, the cycle buffer isn't helpful.

I think it is possible to remove the cycle buffer. Then, in
mux_chr_read, to remove the escape characters from the input string, we
can either mutate the input string in place (can we?) or allocate a new
buffer to store the escaped string. Then call be->chr_read *once* and
pass it the escaped string.

This should be fine as long as the escaped string cannot be longer than
the original string.

(This cannot handle another corner case where the input string contains
escaped sequences that switch the focus in the middle. But neither does
the current implementation).

> 
> 
> > >
> > >
> > > >
> > > > >
> > > > >
> > > > > > but the number of remaining bytes can be larger than the buffer
> > size.
> > > > > >
> > > > >
> > > > > By the above description, I don't see how it happens.
> > > > >
> > > > > This does not lead to security problems since it is a ring buffer,
> > but
> > > > > > it does mess up the receive data.
> > > > > >
> > > > > > This can be fixed by having mux_chr_can_read return either zero or
> > one.
> > > > > > This fix is not very efficient, but it is quite reasonable since
> > > > > > mux_chr_read also passes the data to the backend one byte at a
> > time.
> > > > > >
> > > > >
> > > > > Could you share your testing setup? Even better if you could write a
> > > > test!
> > > >
> > > > This happens in https://github.com/asterinas/asterinas. Sorry, but I
> > > > don't have a minimal reproducible example, and I don't think I can make
> > > > one anytime soon.
> > > >
> > > > As for the tests, I'd like to know how to write such tests in QEMU. I
> > > > checked the documentation but didn't find anything, maybe I'm missing
> > > > something?
> > > >
> > > > >
> > > > >
> > > > > thanks
> > > > >
> > > > >
> > > > > > Signed-off-by: Ruihan Li <lrh2000@pku.edu.cn>
> > > > > > ---
> > > > > >  chardev/char-mux.c | 4 ++--
> > > > > >  1 file changed, 2 insertions(+), 2 deletions(-)
> > > > > >
> > > > > > diff --git a/chardev/char-mux.c b/chardev/char-mux.c
> > > > > > index ee2d47b..5c6eea2 100644
> > > > > > --- a/chardev/char-mux.c
> > > > > > +++ b/chardev/char-mux.c
> > > > > > @@ -210,8 +210,8 @@ static int mux_chr_can_read(void *opaque)
> > > > > >          return 1;
> > > > > >      }
> > > > > >
> > > > > > -    if (be && be->chr_can_read) {
> > > > > > -        return be->chr_can_read(be->opaque);
> > > > > > +    if (be && be->chr_can_read && be->chr_can_read(be->opaque)) {
> > > > > > +        return 1;
> > > > > >      }
> > > > > >
> > > > > >      return 0;
> > > > > > --
> > > > > > 2.45.2
> > > > > >
> > > > > >
> > > > > >
> > > > >
> > > > > --
> > > > > Marc-André Lureau
> > > >
> > > > Thanks,
> > > > Ruihan Li
> > > >
> > > >
> > >
> > > --
> > > Marc-André Lureau
> >
> > Thanks,
> > Ruihan Li
> >
> >
> 
> -- 
> Marc-André Lureau

Thanks,
Ruihan Li
diff mbox series

Patch

diff --git a/chardev/char-mux.c b/chardev/char-mux.c
index ee2d47b..5c6eea2 100644
--- a/chardev/char-mux.c
+++ b/chardev/char-mux.c
@@ -210,8 +210,8 @@  static int mux_chr_can_read(void *opaque)
         return 1;
     }
 
-    if (be && be->chr_can_read) {
-        return be->chr_can_read(be->opaque);
+    if (be && be->chr_can_read && be->chr_can_read(be->opaque)) {
+        return 1;
     }
 
     return 0;