| Message ID | 1495176212-14446-2-git-send-email-peterx@redhat.com |
|---|---|
| State | New |
On Fri, May 19, 2017 at 02:43:27PM +0800, Peter Xu wrote:
> We don't really have a return path for the other types yet. Let's check
> this when .get_return_path() is called.
>
> For this, we introduce a new feature bit, and set it up only for socket
> typed IO channels.
>
> This will help detect earlier failure for postcopy, e.g., logically
> speaking postcopy cannot work with "exec:". Before this patch, when we
> try to migrate with "migrate -d exec:cat>out", we'll hang the system.
> With this patch, we'll get:
>
> (qemu) migrate -d exec:cat>out
> Unable to open return-path for postcopy

This is wrong - post-copy migration *can* work with exec: - it just entirely
depends on what command you are running. Your example ran a command which is
unidirectional, but if you ran 'exec:socat ...' you would have a fully
bidirectional channel. Actually the channel is always bi-directional, but
'cat' simply won't ever send data back to QEMU.

If QEMU hangs when the other end doesn't send data back, that actually seems
like a potentially serious bug in migration code. Even if using the normal
'tcp' migration protocol, if the target QEMU server hangs and fails to
send data to QEMU on the return path, the source QEMU must never hang.

Regards,
Daniel
On Fri, May 19, 2017 at 09:25:38AM +0100, Daniel P. Berrange wrote:
> [...]
>
> If QEMU hangs when the other end doesn't send data back, that actually seems
> like a potentially serious bug in migration code. Even if using the normal
> 'tcp' migration protocol, if the target QEMU server hangs and fails to
> send data to QEMU on the return path, the source QEMU must never hang.

BTW, if you want to simplify the code in this area at all, then arguably
we should get rid of the "get_return_path" helper method entirely. We're
not actually opening any new connections - we're just creating a second
QEMUFile that uses the same underlying QIOChannel object. All we would
need is for the QEMUFile to manage separate 'buf' fields for the read &
write directions. Then all the code would be able to just use the single
QEMUFile for read & write, getting rid of this concept of "opening a
return path" which doesn't actually do anything at the underlying data
transport level.

Regards,
Daniel
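[To make the suggestion above concrete, here is a minimal sketch of what a
single bidirectional QEMUFile could look like. The struct name, field names,
and buffer sizes are all hypothetical - this is not QEMU's actual QEMUFile
layout - and it assumes QEMU's "io/channel.h" for the QIOChannel type.]

#include <stddef.h>
#include <stdint.h>
#include "io/channel.h"       /* QIOChannel, from QEMU's tree */

/* Hypothetical QEMUFile usable in both directions: one shared,
 * reference-counted QIOChannel, with independent buffer state per
 * direction, so no separate "return path" object is ever opened. */
typedef struct BiDirQEMUFile {
    QIOChannel *ioc;          /* same underlying transport for both sides */

    /* write direction: bytes queued here until explicitly flushed */
    uint8_t wbuf[32768];
    size_t  wbuf_used;

    /* read direction: bytes buffered from the channel */
    uint8_t rbuf[32768];
    size_t  rbuf_len;         /* valid bytes currently in rbuf */
    size_t  rbuf_pos;         /* next unread byte in rbuf */
} BiDirQEMUFile;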
On Fri, May 19, 2017 at 09:25:38AM +0100, Daniel P. Berrange wrote:
> On Fri, May 19, 2017 at 02:43:27PM +0800, Peter Xu wrote:
> > [...]
>
> This is wrong - post-copy migration *can* work with exec: - it just entirely
> depends on what command you are running. Your example ran a command which is
> unidirectional, but if you ran 'exec:socat ...' you would have a fully
> bidirectional channel. Actually the channel is always bi-directional, but
> 'cat' simply won't ever send data back to QEMU.

Indeed. I should not block postcopy if the user has set up a TCP tunnel
between the source and destination in some way using exec:. Thanks for
pointing that out.

However, I still think the idea is needed here. Say, we'd better know
whether the transport would be able to respond (though the current
approach of "assuming sockets are the only ones that can reply" is not a
good solution...). Please see below.

> If QEMU hangs when the other end doesn't send data back, that actually seems
> like a potentially serious bug in migration code. Even if using the normal
> 'tcp' migration protocol, if the target QEMU server hangs and fails to
> send data to QEMU on the return path, the source QEMU must never hang.

Firstly, I should not say it's a hang - it's actually by design here,
imho - the migration thread is in the last phase now, waiting for a SHUT
message from the destination (which I think is wise). But from the
behavior, indeed the source VM is not usable during that time, just like
what happens in most postcopy cases on the source side. So we can see
that postcopy "assumes" that the destination side can reply now.

Meanwhile, I see it as reasonable for postcopy to have such an
assumption. After all, postcopy means "start the VM on the destination
before pages are moved over completely", so there must be someone to
reply to the source, no matter what kind of IO channel the replies
travel over.

That's why I think we still need the general idea here: we need to know
whether the destination end is able to reply.

But I still have no good idea (now that I know this patch won't work) of
how we can do this... Any further suggestions would be greatly welcomed.

Thanks,
On Fri, May 19, 2017 at 09:30:10AM +0100, Daniel P. Berrange wrote:
> [...]
>
> BTW, if you want to simplify the code in this area at all, then arguably
> we should get rid of the "get_return_path" helper method entirely. We're
> not actually opening any new connections - we're just creating a second
> QEMUFile that uses the same underlying QIOChannel object.
> [...]

Makes sense. Noted. Thanks,
On Fri, May 19, 2017 at 05:51:43PM +0800, Peter Xu wrote:
> [...]
>
> That's why I think we still need the general idea here: we need to know
> whether the destination end is able to reply.
>
> But I still have no good idea (now that I know this patch won't work) of
> how we can do this... Any further suggestions would be greatly welcomed.

IMHO this is nothing more than a documentation issue for the 'exec'
protocol, i.e. document that you should provide a bi-directional
transport for live migration. A uni-directional transport is arguably
only valid if you're using migrate to save/restore the VM state to a
file.

Regards,
Daniel
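[For illustration, a bi-directional exec: transport of the kind such
documentation would call for might be set up as below. The socat addresses,
port number, and host name are assumptions for the example, not taken from
the thread.]

# destination: accept the migration stream on a TCP port via socat
qemu-system-x86_64 ... -incoming "exec:socat - TCP-LISTEN:4444"

# source: connect to it; socat relays both directions, so the
# return path that postcopy needs can carry data back
(qemu) migrate_set_capability postcopy-ram on
(qemu) migrate -d "exec:socat - TCP:desthost:4444"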
* Daniel P. Berrange (berrange@redhat.com) wrote:
> [...]
>
> This is wrong - post-copy migration *can* work with exec: - it just entirely
> depends on what command you are running. Your example ran a command which is
> unidirectional, but if you ran 'exec:socat ...' you would have a fully
> bidirectional channel. Actually the channel is always bi-directional, but
> 'cat' simply won't ever send data back to QEMU.

The thing is, it didn't use to be able to; prior to your conversion to
channel, postcopy would reject being started with exec: because it
couldn't open a return path, so it was safe.

> If QEMU hangs when the other end doesn't send data back, that actually seems
> like a potentially serious bug in migration code. Even if using the normal
> 'tcp' migration protocol, if the target QEMU server hangs and fails to
> send data to QEMU on the return path, the source QEMU must never hang.

Hmm, we shouldn't get a 'hang' with a postcopy on a link without a
return path; but it does depend on how the exec: behaves on the
destination. If the destination discards data written to it, then I
think the behaviour would be:

 a) Page requests would just get dropped; they'd eventually get
    fulfilled by the background page transmissions, but that could mean
    that a page request would wait for minutes for the page.
 b) The qemu main thread on the destination can be blocked by that, so
    the monitor might not respond until the page request is fulfilled.
 c) I'm not quite sure what would happen to the source return-path
    thread.

The behaviour seems to have changed between 2.9.0 (f26 package) and my
reasonably recent head build. 2.9.0 gives me:

(qemu) migrate_set_speed 1B
(qemu) migrate_set_capability postcopy-ram on
(qemu) migrate -d "exec:cat > out"
RP: Received invalid message 0x0000 length 0x0000
(qemu) info migrate
capabilities: xbzrle: off rdma-pin-all: off auto-converge: off
zero-blocks: off compress: off events: off postcopy-ram: on x-colo: off
release-ram: off
Migration status: failed
total time: 0 milliseconds

So that's the return path thread trying to read from the exec:, not
getting anything, and failing.

On head-ish it doesn't fail, the source qemu doesn't hang, but the
migration never completes - possibly because it's waiting for the
MIG_RP_MSG_SHUT from the destination.

A migration_cancel also doesn't work for 'exec' because it doesn't
support shutdown() - it just sticks in 'cancelling'. On a socket that
was broken like this the cancel would work, because it issues a
shutdown() which causes the socket to clean up.

Personally I'd rather fix this by still not supporting exec:. Making
shutdown() work on exec (by kill'ing the child process) means at least
cancel would work, but it still wouldn't be pretty for a postcopy, and
it still doesn't help Peter solve this problem, which is a nasty problem
QEMU has had for ages.

Dave

--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
* Daniel P. Berrange (berrange@redhat.com) wrote:
> [...]
>
> BTW, if you want to simplify the code in this area at all, then arguably
> we should get rid of the "get_return_path" helper method entirely. We're
> not actually opening any new connections - we're just creating a second
> QEMUFile that uses the same underlying QIOChannel object. All we would
> need is for the QEMUFile to manage separate 'buf' fields for the read &
> write directions. Then all the code would be able to just use the single
> QEMUFile for read & write, getting rid of this concept of "opening a
> return path" which doesn't actually do anything at the underlying data
> transport level.

No, I'd rather keep the get_return_path; we should keep each direction
separate.

Dave

--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
On Fri, May 19, 2017 at 01:51:43PM +0100, Dr. David Alan Gilbert wrote:
> * Daniel P. Berrange (berrange@redhat.com) wrote:
> [...]
>
> The thing is, it didn't use to be able to; prior to your conversion to
> channel, postcopy would reject being started with exec: because it
> couldn't open a return path, so it was safe.

It's safe but functionally broken, because it is valid to want to use
exec migration with post-copy.

> Hmm, we shouldn't get a 'hang' with a postcopy on a link without a
> return path; but it does depend on how the exec: behaves on the
> destination.
> [...]
>
> The behaviour seems to have changed between 2.9.0 (f26 package) and my
> reasonably recent head build.

That's due to the bug we just fixed, where we mistakenly didn't allow
bi-directional I/O for exec:

commit 062d81f0e968fe1597474735f3ea038065027372
Author: Daniel P. Berrange <berrange@redhat.com>
Date:   Fri Apr 21 12:12:20 2017 +0100

    migration: setup bi-directional I/O channel for exec: protocol

    Historically the migration data channel has only needed to be
    unidirectional. Thus the 'exec:' protocol was requesting an
    I/O channel with O_RDONLY on incoming side, and O_WRONLY on
    the outgoing side.

    This is fine for classic migration, but if you then try to run
    TLS over it, this fails because the TLS handshake requires a
    bi-directional channel.

    Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
    Reviewed-by: Juan Quintela <quintela@redhat.com>
    Signed-off-by: Juan Quintela <quintela@redhat.com>

> A migration_cancel also doesn't work for 'exec' because it doesn't
> support shutdown() - it just sticks in 'cancelling'. On a socket that
> was broken like this the cancel would work, because it issues a
> shutdown() which causes the socket to clean up.

I'm curious why migration_cancel requires shutdown() to work. Why isn't
it sufficient, from the source POV, to just close the socket entirely
straight away?

Regards,
Daniel
* Daniel P. Berrange (berrange@redhat.com) wrote:
> On Fri, May 19, 2017 at 01:51:43PM +0100, Dr. David Alan Gilbert wrote:
> [...]
>
> > A migration_cancel also doesn't work for 'exec' because it doesn't
> > support shutdown() - it just sticks in 'cancelling'. On a socket that
> > was broken like this the cancel would work, because it issues a
> > shutdown() which causes the socket to clean up.
>
> I'm curious why migration_cancel requires shutdown() to work. Why isn't
> it sufficient, from the source POV, to just close the socket entirely
> straight away?

close() closes the fd, so that any other uses of the fd get an error,
and you're at risk of the fd getting reallocated by something else. So
consider: if the main thread calls close(), the migration thread and the
return-path thread both carry on using that fd, which might have been
reallocated to something different. Worse, I think we came to the
conclusion that on some OSes a read()/write() blocked on an fd didn't
get kicked out by a close.

shutdown() is safe, in that it stops any other threads accessing the fd
but doesn't allow its reallocation until the close; we perform the close
only when we've joined all other threads that were using the fd. Any of
the threads that do new calls on the fd get an error and quickly fall
down their error paths.

Dave

--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
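[Dave's sequence can be sketched in plain POSIX calls. The function name and
the two thread handles are hypothetical stand-ins for the migration and
return-path threads, and error handling is omitted.]

#include <pthread.h>
#include <sys/socket.h>
#include <unistd.h>

/* Sketch of the shutdown-then-join-then-close cancel pattern. */
void cancel_migration(int fd, pthread_t mig_thread, pthread_t rp_thread)
{
    /* shutdown() forces any blocked or future read()/write() on fd to
     * fail, but keeps the descriptor number allocated, so nothing else
     * in the process can be handed the same number by an open(). */
    shutdown(fd, SHUT_RDWR);

    /* Wait until no thread can possibly still be using fd... */
    pthread_join(mig_thread, NULL);
    pthread_join(rp_thread, NULL);

    /* ...and only then release the descriptor number for reuse. */
    close(fd);
}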
On Fri, May 19, 2017 at 02:02:00PM +0100, Dr. David Alan Gilbert wrote:
> * Daniel P. Berrange (berrange@redhat.com) wrote:
> [...]
>
> close() closes the fd, so that any other uses of the fd get an error,
> and you're at risk of the fd getting reallocated by something else. So
> consider: if the main thread calls close(), the migration thread and the
> return-path thread both carry on using that fd, which might have been
> reallocated to something different. Worse, I think we came to the
> conclusion that on some OSes a read()/write() blocked on an fd didn't
> get kicked out by a close.

I'd be very surprised if close() didn't terminate any other threads doing
read/write, and even more surprised if they handed out the same FD to
another thread while something was still in the read.

> shutdown() is safe, in that it stops any other threads accessing the fd
> but doesn't allow its reallocation until the close; we perform the close
> only when we've joined all other threads that were using the fd. Any of
> the threads that do new calls on the fd get an error and quickly fall
> down their error paths.

Ahh, that's certainly an interesting scenario. That would certainly be
a problem with the migration code when this was originally written.
It had two QEMUFile structs, each with an 'int fd' field, so when you
closed 'fd' on one QEMUFile struct, it wouldn't update the other QEMUFile
used by another thread.

Since we switched over to use QIOChannel, though, I think the thread
scenario you describe should be avoided entirely. When you have multiple
QEMUFile objects, they each have a reference-counted pointer to the same
underlying QIOChannel object instance. So when QEMUFile triggers a call
to qio_channel_close() in one thread, that'll set fd=-1 in the QIOChannel.
Since the other threads have a reference to the same QIOChannel object,
they'll now see this fd == -1 straightaway.

So, IIUC, this should make the need for shutdown() redundant (at least
for the thread race conditions you describe).

Regards,
Daniel
* Daniel P. Berrange (berrange@redhat.com) wrote:
> On Fri, May 19, 2017 at 03:33:12PM +0100, Dr. David Alan Gilbert wrote:
> [...]
>
> I'd be very surprised if close() didn't terminate any other threads doing
> read/write

Prepare to be surprised then - that's the behaviour I found when I tried
it out in 2014. (Also, at the time we found cases where the close()
could hang.)

> and even more surprised if they handed out the same FD to
> another thread while something was still in the read.

Right, I don't think that will happen.

> [...]
>
> Since we switched over to use QIOChannel, though, I think the thread
> scenario you describe should be avoided entirely. When you have multiple
> QEMUFile objects, they each have a reference-counted pointer to the same
> underlying QIOChannel object instance. So when QEMUFile triggers a call
> to qio_channel_close() in one thread, that'll set fd=-1 in the QIOChannel.
> Since the other threads have a reference to the same QIOChannel object,
> they'll now see this fd == -1 straightaway.
>
> So, IIUC, this should make the need for shutdown() redundant (at least
> for the thread race conditions you describe).

That's not thread safe unless you're doing some very careful locking.
Consider:

    T1                  T2
    oldfd = fd          tmp = fd
    fd = -1
    close(oldfd)
    unrelated open()
                        read(tmp, ...)

In practice every use of fd will be a copy into a tmp and then the
syscall; the unrelated open() could happen in another thread. (OK, the
gap between the tmp and the read is tiny, although if we're doing
multiple operations, chances are the compiler will optimise it to the
top of a loop.)

There's no way to make that code safe.

Dave

--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
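[Spelled out as code, the T1/T2 race above looks something like the
following - a hypothetical illustration, not QEMU source.]

#include <unistd.h>

int migration_fd;              /* shared between threads, no locking */

/* T1: the cancel path */
void *cancel_thread(void *arg)
{
    int oldfd = migration_fd;
    migration_fd = -1;
    close(oldfd);              /* the kernel may now reuse this number */
    return NULL;
}

/* T2: the I/O path */
void *io_thread(void *arg)
{
    char buf[4096];
    int tmp = migration_fd;    /* may still be the old value */
    /* ...if any thread open()s here, 'tmp' now names that new file... */
    read(tmp, buf, sizeof(buf));   /* silently reads from the wrong fd */
    return NULL;
}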
On Fri, May 19, 2017 at 03:33:12PM +0100, Dr. David Alan Gilbert wrote:
> * Daniel P. Berrange (berrange@redhat.com) wrote:
> [...]
>
> That's not thread safe unless you're doing some very careful locking.
> [...]
>
> There's no way to make that code safe.

Urgh, yes, I see what you mean.

Currently the QIOChannelCommand implementation uses a pair of anonymous
pipes for stdin/out to the child process. I wonder if we could switch
that to use socketpair() instead, thus letting us shutdown() on it too.

Though I guess it would be sufficient for qio_channel_shutdown() to
merely kill the child PID, while leaving the FDs open, as then you'd
get EOF and/or EPIPE on the read/writes.

Regards,
Daniel
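[As a sketch, the kill-the-child variant could look roughly like this. The
'pid' field and the simplified hook signature are assumptions, not checked
against QEMU's actual QIOChannelCommand code.]

#include <signal.h>
#include "qapi/error.h"          /* Error, from QEMU's tree */
#include "io/channel-command.h"  /* QIOChannelCommand, from QEMU's tree */

/* Hypothetical shutdown hook for the exec: command channel: terminate
 * the child so any blocked read()/write() on the still-open pipe FDs
 * returns EOF or fails with EPIPE, letting migration cancel cleanly. */
int qio_channel_command_shutdown_sketch(QIOChannelCommand *cioc,
                                        Error **errp)
{
    if (cioc->pid > 0) {
        kill(cioc->pid, SIGTERM);   /* FDs stay open; I/O now unblocks */
    }
    return 0;
}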
* Daniel P. Berrange (berrange@redhat.com) wrote:
> On Fri, May 19, 2017 at 03:33:12PM +0100, Dr. David Alan Gilbert wrote:
> [...]
>
> Currently the QIOChannelCommand implementation uses a pair of anonymous
> pipes for stdin/out to the child process. I wonder if we could switch
> that to use socketpair() instead, thus letting us shutdown() on it too.
>
> Though I guess it would be sufficient for qio_channel_shutdown() to
> merely kill the child PID, while leaving the FDs open, as then you'd
> get EOF and/or EPIPE on the read/writes.

Yes, I guess it's a question of which one is more likely to actually
kill the exec child off; the socketpair is more likely to cause the
source side migration code to cancel cleanly, although a kill -9
should sort out a wayward exec child.

Dave

--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
On Fri, May 19, 2017 at 07:41:34PM +0100, Dr. David Alan Gilbert wrote:
> * Daniel P. Berrange (berrange@redhat.com) wrote:
> [...]
>
> Yes, I guess it's a question of which one is more likely to actually
> kill the exec child off; the socketpair is more likely to cause the
> source side migration code to cancel cleanly, although a kill -9
> should sort out a wayward exec child.

I'm also curious whether there's any real-world performance difference
between using a pipe vs. a socketpair to communicate with a child
process.

Regards,
Daniel
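[For reference, the socketpair() variant under discussion could be set up
along these lines - an illustrative POSIX sketch with a hypothetical helper
name, not QEMU's QIOChannelCommand code.]

#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Spawn 'cmd' with one end of an AF_UNIX socket pair as its stdin and
 * stdout; the returned parent-side fd supports shutdown(), just like
 * the TCP migration sockets do. */
int spawn_child_on_socketpair(const char *cmd)
{
    int sv[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0) {
        return -1;
    }
    pid_t pid = fork();
    if (pid < 0) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }
    if (pid == 0) {                    /* child: talk on sv[1] */
        close(sv[0]);
        dup2(sv[1], STDIN_FILENO);
        dup2(sv[1], STDOUT_FILENO);
        execlp("sh", "sh", "-c", cmd, (char *)NULL);
        _exit(127);
    }
    close(sv[1]);                      /* parent keeps sv[0] */
    return sv[0];                      /* shutdown()-capable end */
}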
diff --git a/include/io/channel.h b/include/io/channel.h
index db9bb02..7876534 100644
--- a/include/io/channel.h
+++ b/include/io/channel.h
@@ -45,6 +45,7 @@ enum QIOChannelFeature {
     QIO_CHANNEL_FEATURE_FD_PASS,
     QIO_CHANNEL_FEATURE_SHUTDOWN,
     QIO_CHANNEL_FEATURE_LISTEN,
+    QIO_CHANNEL_FEATURE_RETURN_PATH,
 };

diff --git a/io/channel-socket.c b/io/channel-socket.c
index 53386b7..ee81b2d 100644
--- a/io/channel-socket.c
+++ b/io/channel-socket.c
@@ -56,6 +56,7 @@ qio_channel_socket_new(void)
     ioc = QIO_CHANNEL(sioc);
     qio_channel_set_feature(ioc, QIO_CHANNEL_FEATURE_SHUTDOWN);
+    qio_channel_set_feature(ioc, QIO_CHANNEL_FEATURE_RETURN_PATH);

 #ifdef WIN32
     ioc->event = CreateEvent(NULL, FALSE, FALSE, NULL);

diff --git a/migration/qemu-file-channel.c b/migration/qemu-file-channel.c
index 45c13f1..3bd7940 100644
--- a/migration/qemu-file-channel.c
+++ b/migration/qemu-file-channel.c
@@ -23,6 +23,7 @@
  */

 #include "qemu/osdep.h"
+#include "qom/object.h"
 #include "migration/qemu-file.h"
 #include "io/channel-socket.h"
 #include "qemu/iov.h"
@@ -139,6 +140,10 @@ static QEMUFile *channel_get_input_return_path(void *opaque)
 {
     QIOChannel *ioc = QIO_CHANNEL(opaque);

+    if (!qio_channel_has_feature(ioc, QIO_CHANNEL_FEATURE_RETURN_PATH)) {
+        return NULL;
+    }
+
     return qemu_fopen_channel_output(ioc);
 }

@@ -146,6 +151,10 @@ static QEMUFile *channel_get_output_return_path(void *opaque)
 {
     QIOChannel *ioc = QIO_CHANNEL(opaque);

+    if (!qio_channel_has_feature(ioc, QIO_CHANNEL_FEATURE_RETURN_PATH)) {
+        return NULL;
+    }
+
     return qemu_fopen_channel_input(ioc);
 }
We don't really have a return path for the other types yet. Let's check
this when .get_return_path() is called.

For this, we introduce a new feature bit, and set it up only for socket
typed IO channels.

This will help detect earlier failure for postcopy, e.g., logically
speaking postcopy cannot work with "exec:". Before this patch, when we
try to migrate with "migrate -d exec:cat>out", we'll hang the system.
With this patch, we'll get:

(qemu) migrate -d exec:cat>out
Unable to open return-path for postcopy

CC: Daniel P. Berrange <berrange@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
---
 include/io/channel.h          | 1 +
 io/channel-socket.c           | 1 +
 migration/qemu-file-channel.c | 9 +++++++++
 3 files changed, 11 insertions(+)