Message ID | 161224971122.79781.8594358970645859667.stgit@pasha-ThinkPad-X280 |
---|---|
State | New |
Headers | show |
Series | char: don't fail when client is not connected | expand |
Hi On Tue, Feb 2, 2021 at 11:18 AM Pavel Dovgalyuk <pavel.dovgalyuk@ispras.ru> wrote: > This patch checks that ioc is not null before > using it in tcp socket tcp_chr_add_watch function. > > Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgalyuk@ispras.ru> > Do you have a backtrace or a reproducer when this happens? thanks --- > chardev/char-socket.c | 3 +++ > 1 file changed, 3 insertions(+) > > diff --git a/chardev/char-socket.c b/chardev/char-socket.c > index 213a4c8dd0..cef1d9438f 100644 > --- a/chardev/char-socket.c > +++ b/chardev/char-socket.c > @@ -385,6 +385,9 @@ static ssize_t tcp_chr_recv(Chardev *chr, char *buf, > size_t len) > static GSource *tcp_chr_add_watch(Chardev *chr, GIOCondition cond) > { > SocketChardev *s = SOCKET_CHARDEV(chr); > + if (!s->ioc) { > + return NULL; > + } > return qio_channel_create_watch(s->ioc, cond); > } > > >
On 02.02.2021 10:27, Marc-André Lureau wrote: > Hi > > On Tue, Feb 2, 2021 at 11:18 AM Pavel Dovgalyuk > <pavel.dovgalyuk@ispras.ru <mailto:pavel.dovgalyuk@ispras.ru>> wrote: > > This patch checks that ioc is not null before > using it in tcp socket tcp_chr_add_watch function. > > Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgalyuk@ispras.ru > <mailto:Pavel.Dovgalyuk@ispras.ru>> > > > Do you have a backtrace or a reproducer when this happens? > thanks Here is the backtrace: Thread 4 "qemu-system-x86" received signal SIGSEGV, Segmentation fault. [Switching to Thread 0x7ffff2506700 (LWP 64988)] object_get_class (obj=obj@entry=0x0) at ../qom/object.c:999 999 return obj->class; (gdb) bt #0 object_get_class (obj=obj@entry=0x0) at ../qom/object.c:999 #1 0x0000555555b70e26 in QIO_CHANNEL_GET_CLASS (obj=0x0) at /home/pasha/ispras/qemu-test/include/io/channel.h:29 #2 qio_channel_create_watch (ioc=0x0, condition=(G_IO_OUT | G_IO_HUP)) at ../io/channel.c:281 #3 0x0000555555c1bf9b in qemu_chr_fe_add_watch (be=be@entry=0x555556981648, cond=cond@entry=(G_IO_OUT | G_IO_HUP), func=func@entry=0x55555597f170 <serial_watch_cb>, user_data=user_data@entry=0x5555569815a0) at /home/pasha/ispras/qemu-test/include/chardev/char.h:229 #4 0x000055555597f042 in serial_xmit (s=s@entry=0x5555569815a0) at ../hw/char/serial.c:265 #5 0x000055555597f437 in serial_ioport_write (opaque=0x5555569815a0, addr=<optimized out>, val=91, size=<optimized out>) at ../hw/char/serial.c:359 #6 0x0000555555ab95e0 in memory_region_write_accessor (mr=mr@entry=0x555556981700, addr=0, value=value@entry=0x7ffff2504fc8, size=size@entry=1, shift=<optimized out>, mask=mask@entry=255, attrs=...) at ../softmmu/memory.c:491 #7 0x0000555555ab807e in access_with_adjusted_size (addr=addr@entry=0, value=value@entry=0x7ffff2504fc8, size=size@entry=1, access_size_min=<optimized out>, access_size_max=<optimized out>, access_fn=access_fn@entry= 0x555555ab9550 <memory_region_write_accessor>, mr=0x555556981700, attrs=...) at ../softmmu/memory.c:552 #8 0x0000555555abb947 in memory_region_dispatch_write (mr=mr@entry=0x555556981700, addr=0, data=<optimized out>, data@entry=91, op=op@entry=MO_8, attrs=attrs@entry=...) at ../softmmu/memory.c:1501 #9 0x0000555555a721d8 in address_space_stb (as=<optimized out>, addr=<optimized out>, val=91, attrs=..., result=0x0) at /home/pasha/ispras/qemu-test/memory_ldst.c.inc:382 #10 0x00007fffa8b63022 in code_gen_buffer () #11 0x0000555555b10ab0 in cpu_tb_exec (tb_exit=<synthetic pointer>, itb=<optimized out>, cpu=0x7fffa8635b00 <code_gen_buffer+73620179>) at ../accel/tcg/cpu-exec.c:188 #12 cpu_loop_exec_tb (tb_exit=<synthetic pointer>, last_tb=<synthetic pointer>, tb=<optimized out>, cpu=0x7fffa8635b00 <code_gen_buffer+73620179>) at ../accel/tcg/cpu-exec.c:700 #13 cpu_exec (cpu=cpu@entry=0x5555566b4350) at ../accel/tcg/cpu-exec.c:811 #14 0x0000555555b0ce97 in tcg_cpus_exec (cpu=cpu@entry=0x5555566b4350) at ../accel/tcg/tcg-cpus.c:57 #15 0x0000555555abfa73 in rr_cpu_thread_fn (arg=arg@entry=0x5555566b4350) at ../accel/tcg/tcg-cpus-rr.c:217 #16 0x0000555555c80573 in qemu_thread_start (args=<optimized out>) at ../util/qemu-thread-posix.c:521 #17 0x00007ffff6302609 in start_thread (arg=<optimized out>) at pthread_create.c:477 #18 0x00007ffff6229293 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95 > > --- > chardev/char-socket.c | 3 +++ > 1 file changed, 3 insertions(+) > > diff --git a/chardev/char-socket.c b/chardev/char-socket.c > index 213a4c8dd0..cef1d9438f 100644 > --- a/chardev/char-socket.c > +++ b/chardev/char-socket.c > @@ -385,6 +385,9 @@ static ssize_t tcp_chr_recv(Chardev *chr, char > *buf, size_t len) > static GSource *tcp_chr_add_watch(Chardev *chr, GIOCondition cond) > { > SocketChardev *s = SOCKET_CHARDEV(chr); > + if (!s->ioc) { > + return NULL; > + } > return qio_channel_create_watch(s->ioc, cond); > } > >
Hi On Tue, Feb 2, 2021 at 11:33 AM Pavel Dovgalyuk <pavel.dovgalyuk@ispras.ru> wrote: > On 02.02.2021 10:27, Marc-André Lureau wrote: > > Hi > > > > On Tue, Feb 2, 2021 at 11:18 AM Pavel Dovgalyuk > > <pavel.dovgalyuk@ispras.ru <mailto:pavel.dovgalyuk@ispras.ru>> wrote: > > > > This patch checks that ioc is not null before > > using it in tcp socket tcp_chr_add_watch function. > > > > Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgalyuk@ispras.ru > > <mailto:Pavel.Dovgalyuk@ispras.ru>> > > > > > > Do you have a backtrace or a reproducer when this happens? > > thanks > > Here is the backtrace: > > Thread 4 "qemu-system-x86" received signal SIGSEGV, Segmentation fault. > [Switching to Thread 0x7ffff2506700 (LWP 64988)] > object_get_class (obj=obj@entry=0x0) at ../qom/object.c:999 > 999 return obj->class; > (gdb) bt > #0 object_get_class (obj=obj@entry=0x0) at ../qom/object.c:999 > #1 0x0000555555b70e26 in QIO_CHANNEL_GET_CLASS (obj=0x0) at > /home/pasha/ispras/qemu-test/include/io/channel.h:29 > #2 qio_channel_create_watch (ioc=0x0, condition=(G_IO_OUT | G_IO_HUP)) > at ../io/channel.c:281 > #3 0x0000555555c1bf9b in qemu_chr_fe_add_watch > (be=be@entry=0x555556981648, cond=cond@entry=(G_IO_OUT | G_IO_HUP), > func=func@entry=0x55555597f170 <serial_watch_cb>, > user_data=user_data@entry=0x5555569815a0) > at /home/pasha/ispras/qemu-test/include/chardev/char.h:229 > #4 0x000055555597f042 in serial_xmit (s=s@entry=0x5555569815a0) at > ../hw/char/serial.c:265 > #5 0x000055555597f437 in serial_ioport_write (opaque=0x5555569815a0, > addr=<optimized out>, val=91, size=<optimized out>) at > ../hw/char/serial.c:359 > Thanks, I don't understand how this could happen. serial_xmit: int rc = qemu_chr_fe_write(&s->chr, &s->tsr, 1); if ((rc == 0 || (rc == -1 && errno == EAGAIN)) && s->tsr_retry < MAX_XMIT_RETRY) { assert(s->watch_tag == 0); s->watch_tag = qemu_chr_fe_add_watch(&s->chr, G_IO_OUT | G_IO_HUP, serial_watch_cb, s); The watch is added only if fe_write() returned 0 || -1 with EAGAIN. But tcp_chr_write() should return -1 with EIO if the state is disconnected (and ioc is NULL), or other errors on disconnect. Can you provide a reproducer? thanks > > > > --- > > chardev/char-socket.c | 3 +++ > > 1 file changed, 3 insertions(+) > > > > diff --git a/chardev/char-socket.c b/chardev/char-socket.c > > index 213a4c8dd0..cef1d9438f 100644 > > --- a/chardev/char-socket.c > > +++ b/chardev/char-socket.c > > @@ -385,6 +385,9 @@ static ssize_t tcp_chr_recv(Chardev *chr, char > > *buf, size_t len) > > static GSource *tcp_chr_add_watch(Chardev *chr, GIOCondition cond) > > { > > SocketChardev *s = SOCKET_CHARDEV(chr); > > + if (!s->ioc) { > > + return NULL; > > + } > > return qio_channel_create_watch(s->ioc, cond); > > } > > > > > >
On 03.02.2021 11:13, Marc-André Lureau wrote: > Hi > > On Tue, Feb 2, 2021 at 11:33 AM Pavel Dovgalyuk > <pavel.dovgalyuk@ispras.ru <mailto:pavel.dovgalyuk@ispras.ru>> wrote: > > On 02.02.2021 10:27, Marc-André Lureau wrote: > > Hi > > > > On Tue, Feb 2, 2021 at 11:18 AM Pavel Dovgalyuk > > <pavel.dovgalyuk@ispras.ru <mailto:pavel.dovgalyuk@ispras.ru> > <mailto:pavel.dovgalyuk@ispras.ru > <mailto:pavel.dovgalyuk@ispras.ru>>> wrote: > > > > This patch checks that ioc is not null before > > using it in tcp socket tcp_chr_add_watch function. > > > > Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgalyuk@ispras.ru > <mailto:Pavel.Dovgalyuk@ispras.ru> > > <mailto:Pavel.Dovgalyuk@ispras.ru > <mailto:Pavel.Dovgalyuk@ispras.ru>>> > > > > > > Do you have a backtrace or a reproducer when this happens? > > thanks > > Here is the backtrace: > > Thread 4 "qemu-system-x86" received signal SIGSEGV, Segmentation fault. > [Switching to Thread 0x7ffff2506700 (LWP 64988)] > object_get_class (obj=obj@entry=0x0) at ../qom/object.c:999 > 999 return obj->class; > (gdb) bt > #0 object_get_class (obj=obj@entry=0x0) at ../qom/object.c:999 > #1 0x0000555555b70e26 in QIO_CHANNEL_GET_CLASS (obj=0x0) at > /home/pasha/ispras/qemu-test/include/io/channel.h:29 > #2 qio_channel_create_watch (ioc=0x0, condition=(G_IO_OUT | G_IO_HUP)) > at ../io/channel.c:281 > #3 0x0000555555c1bf9b in qemu_chr_fe_add_watch > (be=be@entry=0x555556981648, cond=cond@entry=(G_IO_OUT | > G_IO_HUP), > func=func@entry=0x55555597f170 <serial_watch_cb>, > user_data=user_data@entry=0x5555569815a0) > at /home/pasha/ispras/qemu-test/include/chardev/char.h:229 > #4 0x000055555597f042 in serial_xmit (s=s@entry=0x5555569815a0) at > ../hw/char/serial.c:265 > #5 0x000055555597f437 in serial_ioport_write (opaque=0x5555569815a0, > addr=<optimized out>, val=91, size=<optimized out>) at > ../hw/char/serial.c:359 > > > Thanks, I don't understand how this could happen. > > serial_xmit: > int rc = qemu_chr_fe_write(&s->chr, &s->tsr, 1); > > if ((rc == 0 || > (rc == -1 && errno == EAGAIN)) && > s->tsr_retry < MAX_XMIT_RETRY) { > assert(s->watch_tag == 0); > s->watch_tag = > qemu_chr_fe_add_watch(&s->chr, G_IO_OUT | G_IO_HUP, > serial_watch_cb, s); > > The watch is added only if fe_write() returned 0 || -1 with EAGAIN. > > But tcp_chr_write() should return -1 with EIO if the state is > disconnected (and ioc is NULL), or other errors on disconnect. > > Can you provide a reproducer? That was a record/replay scenario. I've used Fedora cloudinit images, that are used in acceptance tests: qemu-system-x86_64 \ -display none -vga none -machine pc -smp 1 -m 1024 \ -monitor tcp:127.0.0.1:8081,server,nowait \ -serial tcp:127.0.0.1:8082,server,nowait \ -object filter-replay,id=replay,netdev=hub0port0 \ -drive file=Fedora-Cloud-Base-31-1.9.x86_64.qcow2,snapshot,id=disk0,if=none \ -drive driver=blkreplay,id=disk0-rr,if=none,image=disk0 \ -device virtio-blk-pci,drive=disk0-rr,ioeventfd=false \ -icount shift=1,rr=record,rrfile=replay.bin \ -drive file=cloudinit.iso,snapshot,id=disk1,if=none \ -drive driver=blkreplay,id=disk1-rr,if=none,image=disk1 \ -device virtio-blk-pci,drive=disk1-rr,ioeventfd=false The failure occurred on replay stage: qemu-system-x86_64 \ -display none -vga none -machine pc -smp 1 -m 1024 \ -monitor tcp:127.0.0.1:8081,server,nowait \ -serial tcp:127.0.0.1:8082,server,nowait \ -object filter-replay,id=replay,netdev=hub0port0 \ -drive file=Fedora-Cloud-Base-31-1.9.x86_64.qcow2,snapshot,id=disk0,if=none \ -drive driver=blkreplay,id=disk0-rr,if=none,image=disk0 \ -device virtio-blk-pci,drive=disk0-rr,ioeventfd=false \ -icount shift=1,rr=replay,rrfile=replay.bin \ -drive file=cloudinit.iso,snapshot,id=disk1,if=none \ -drive driver=blkreplay,id=disk1-rr,if=none,image=disk1 \ -device virtio-blk-pci,drive=disk1-rr,ioeventfd=false > > thanks > > > > > > --- > > chardev/char-socket.c | 3 +++ > > 1 file changed, 3 insertions(+) > > > > diff --git a/chardev/char-socket.c b/chardev/char-socket.c > > index 213a4c8dd0..cef1d9438f 100644 > > --- a/chardev/char-socket.c > > +++ b/chardev/char-socket.c > > @@ -385,6 +385,9 @@ static ssize_t tcp_chr_recv(Chardev *chr, > char > > *buf, size_t len) > > static GSource *tcp_chr_add_watch(Chardev *chr, > GIOCondition cond) > > { > > SocketChardev *s = SOCKET_CHARDEV(chr); > > + if (!s->ioc) { > > + return NULL; > > + } > > return qio_channel_create_watch(s->ioc, cond); > > } > > > > >
Hi On Wed, Feb 3, 2021 at 12:22 PM Pavel Dovgalyuk <pavel.dovgalyuk@ispras.ru> wrote: > On 03.02.2021 11:13, Marc-André Lureau wrote: > > > Can you provide a reproducer? > > > That was a record/replay scenario. I've used Fedora cloudinit images, > that are used in acceptance tests: > > qemu-system-x86_64 \ > -display none -vga none -machine pc -smp 1 -m 1024 \ > -monitor tcp:127.0.0.1:8081,server,nowait \ > -serial tcp:127.0.0.1:8082,server,nowait \ > -object filter-replay,id=replay,netdev=hub0port0 \ > -drive > file=Fedora-Cloud-Base-31-1.9.x86_64.qcow2,snapshot,id=disk0,if=none \ > -drive driver=blkreplay,id=disk0-rr,if=none,image=disk0 \ > -device virtio-blk-pci,drive=disk0-rr,ioeventfd=false \ > -icount shift=1,rr=record,rrfile=replay.bin \ > -drive file=cloudinit.iso,snapshot,id=disk1,if=none \ > -drive driver=blkreplay,id=disk1-rr,if=none,image=disk1 \ > -device virtio-blk-pci,drive=disk1-rr,ioeventfd=false > > > The failure occurred on replay stage: > > qemu-system-x86_64 \ > -display none -vga none -machine pc -smp 1 -m 1024 \ > -monitor tcp:127.0.0.1:8081,server,nowait \ > -serial tcp:127.0.0.1:8082,server,nowait \ > -object filter-replay,id=replay,netdev=hub0port0 \ > -drive > file=Fedora-Cloud-Base-31-1.9.x86_64.qcow2,snapshot,id=disk0,if=none \ > -drive driver=blkreplay,id=disk0-rr,if=none,image=disk0 \ > -device virtio-blk-pci,drive=disk0-rr,ioeventfd=false \ > -icount shift=1,rr=replay,rrfile=replay.bin \ > -drive file=cloudinit.iso,snapshot,id=disk1,if=none \ > -drive driver=blkreplay,id=disk1-rr,if=none,image=disk1 \ > -device virtio-blk-pci,drive=disk1-rr,ioeventfd=false > Ah, that helps. qemu_chr_write() will return replay_char_write_event_load(), bypassing the current state. If the connected state during replay doesn't match the state during recording, it is possible to reach the bad condition. (there are probably many such corner-cases situations with replay..) Can you update the commit message to explain the relation with replay? with that: Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Daniel, could you review it too? thanks
diff --git a/chardev/char-socket.c b/chardev/char-socket.c index 213a4c8dd0..cef1d9438f 100644 --- a/chardev/char-socket.c +++ b/chardev/char-socket.c @@ -385,6 +385,9 @@ static ssize_t tcp_chr_recv(Chardev *chr, char *buf, size_t len) static GSource *tcp_chr_add_watch(Chardev *chr, GIOCondition cond) { SocketChardev *s = SOCKET_CHARDEV(chr); + if (!s->ioc) { + return NULL; + } return qio_channel_create_watch(s->ioc, cond); }
This patch checks that ioc is not null before using it in tcp socket tcp_chr_add_watch function. Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgalyuk@ispras.ru> --- chardev/char-socket.c | 3 +++ 1 file changed, 3 insertions(+)