Message ID | 1475146935-12118-1-git-send-email-den@openvz.org |
---|---|
State | New |
Headers | show |
On Thu, Sep 29, 2016 at 02:02:15PM +0300, Denis V. Lunev wrote: > From: Denis Plotnikov <dplotnikov@virtuozzo.com> > > Originally NBD server socket was created by qemu-nbd code. This leads to > the race when the management layer starts qemu-nbd server and allows a > client to connect to the server. In this case there is a possibility that > qemu-nbd does not open listening server socket yet. Creating listening > socket before starting of qemu-ndb and passing socket fd via command line > solves this issue completely. FWIW, this could be solved in qemu-nbd itself if we had a general ability to request "daemon" mode - currently it only daemonizes if attaching to a nbd block device. The key would be that qemu-nbd would open the listening socket before daemonizing. Thus when the mgmt application spawned qemu-nbd in daemon mode, it can be sure that the listener socket is present when waitpid() completes. Regards, Daniel
On 29/09/2016 13:13, Daniel P. Berrange wrote: > On Thu, Sep 29, 2016 at 02:02:15PM +0300, Denis V. Lunev wrote: >> From: Denis Plotnikov <dplotnikov@virtuozzo.com> >> >> Originally NBD server socket was created by qemu-nbd code. This leads to >> the race when the management layer starts qemu-nbd server and allows a >> client to connect to the server. In this case there is a possibility that >> qemu-nbd does not open listening server socket yet. Creating listening >> socket before starting of qemu-ndb and passing socket fd via command line >> solves this issue completely. > > FWIW, this could be solved in qemu-nbd itself if we had a general > ability to request "daemon" mode - currently it only daemonizes > if attaching to a nbd block device. > > The key would be that qemu-nbd would open the listening socket > before daemonizing. Thus when the mgmt application spawned > qemu-nbd in daemon mode, it can be sure that the listener > socket is present when waitpid() completes. This is 100% true, but I find daemonization to be a hack, so file descriptor passing has its place. Still, I'd prefer to have a --socket-activation option which uses the systemd socket activation protocol. The systemd protocol can be implemented easily by any management layer. Paolo
On Thu, Sep 29, 2016 at 02:18:54PM +0200, Paolo Bonzini wrote: > > > On 29/09/2016 13:13, Daniel P. Berrange wrote: > > On Thu, Sep 29, 2016 at 02:02:15PM +0300, Denis V. Lunev wrote: > >> From: Denis Plotnikov <dplotnikov@virtuozzo.com> > >> > >> Originally NBD server socket was created by qemu-nbd code. This leads to > >> the race when the management layer starts qemu-nbd server and allows a > >> client to connect to the server. In this case there is a possibility that > >> qemu-nbd does not open listening server socket yet. Creating listening > >> socket before starting of qemu-ndb and passing socket fd via command line > >> solves this issue completely. > > > > FWIW, this could be solved in qemu-nbd itself if we had a general > > ability to request "daemon" mode - currently it only daemonizes > > if attaching to a nbd block device. > > > > The key would be that qemu-nbd would open the listening socket > > before daemonizing. Thus when the mgmt application spawned > > qemu-nbd in daemon mode, it can be sure that the listener > > socket is present when waitpid() completes. > > This is 100% true, but I find daemonization to be a hack, so file > descriptor passing has its place. > > Still, I'd prefer to have a --socket-activation option which uses the > systemd socket activation protocol. The systemd protocol can be > implemented easily by any management layer. That's a nice idea - system socket activation is pretty simple todo Regards, Daniel
On Thu, Sep 29, 2016 at 12:13:18PM +0100, Daniel P. Berrange wrote: > On Thu, Sep 29, 2016 at 02:02:15PM +0300, Denis V. Lunev wrote: > > From: Denis Plotnikov <dplotnikov@virtuozzo.com> > > > > Originally NBD server socket was created by qemu-nbd code. This leads to > > the race when the management layer starts qemu-nbd server and allows a > > client to connect to the server. In this case there is a possibility that > > qemu-nbd does not open listening server socket yet. Creating listening > > socket before starting of qemu-ndb and passing socket fd via command line > > solves this issue completely. > > FWIW, this could be solved in qemu-nbd itself if we had a general > ability to request "daemon" mode - currently it only daemonizes > if attaching to a nbd block device. > > The key would be that qemu-nbd would open the listening socket > before daemonizing. Thus when the mgmt application spawned > qemu-nbd in daemon mode, it can be sure that the listener > socket is present when waitpid() completes. A patch to do that was just posted: [PATCH v4 1/3] qemu-nbd: Add --fork option
On 09/29/2016 06:02 AM, Denis V. Lunev wrote: > From: Denis Plotnikov <dplotnikov@virtuozzo.com> > > Originally NBD server socket was created by qemu-nbd code. This leads to > the race when the management layer starts qemu-nbd server and allows a > client to connect to the server. In this case there is a possibility that > qemu-nbd does not open listening server socket yet. Creating listening > socket before starting of qemu-ndb and passing socket fd via command line > solves this issue completely. > > Signed-off-by: Denis Plotnikov <dplotnikov@virtuozzo.com> > Reviewed-by: Roman Kagan <rkagan@virtuozzo.com> > Signed-off-by: Denis V. Lunev <den@openvz.org> > CC: Paolo Bonzini <pbonzini@redhat.com> > --- > @@ -78,6 +79,7 @@ static void usage(const char *name) > " -b, --bind=IFACE interface to bind to (default `0.0.0.0')\n" > " -k, --socket=PATH path to the unix socket\n" > " (default '"SOCKET_PATH"')\n" > +" --server-sock-fd=NUM pre-opened listening server socket file descriptor\n" > " -e, --shared=NUM device can be shared by NUM clients (default '1')\n" Looking at 'qemu-nbd --help', all other long-option-only options are indented so that the -- line up to the same column (see for example --cache and --aio). > > -static SocketAddress *nbd_build_socket_address(const char *sockpath, > +static SocketAddress *nbd_build_sock_fd(const char *sockpath, > const char *bindto, > const char *port) Indentation now looks off. > +static void setup_address_and_port(const char **address, const char **port) > +{ > + if (*address == NULL) { > + *address = "0.0.0.0"; > + } > + > + if (*port == NULL) { > + *port = g_strdup_printf("%d", NBD_DEFAULT_PORT);; s/;;/;/ Why is one of these parameters malloc'd, but the other pointing to const storage? Do you have a memory leak, or else a risk of corrupting read-only memory?
On 09/29/2016 02:02 PM, Denis V. Lunev wrote: > From: Denis Plotnikov <dplotnikov@virtuozzo.com> > > Originally NBD server socket was created by qemu-nbd code. This leads to > the race when the management layer starts qemu-nbd server and allows a > client to connect to the server. In this case there is a possibility that > qemu-nbd does not open listening server socket yet. Creating listening > socket before starting of qemu-ndb and passing socket fd via command line > solves this issue completely. > > Signed-off-by: Denis Plotnikov <dplotnikov@virtuozzo.com> > Reviewed-by: Roman Kagan <rkagan@virtuozzo.com> > Signed-off-by: Denis V. Lunev <den@openvz.org> > CC: Paolo Bonzini <pbonzini@redhat.com> This patch also solves another problem. Let us assume that qemu-nbd is started by the management software. After that the software should report the port qemu-nbd is bound to the another software. Right now as far as I know there is no good way to do that. That management software could try to bind to a port, release the port and start qemu-nbd. This is definitely racy. With the approach with passing file descriptor management layer could create a socket and pass it to qemu-nbd without a race at all. Den > --- > qemu-nbd.c | 88 ++++++++++++++++++++++++++++++++++++++++++++++++++++------- > qemu-nbd.texi | 6 ++++ > 2 files changed, 84 insertions(+), 10 deletions(-) > > diff --git a/qemu-nbd.c b/qemu-nbd.c > index 99297a5..99bba38 100644 > --- a/qemu-nbd.c > +++ b/qemu-nbd.c > @@ -48,6 +48,7 @@ > #define QEMU_NBD_OPT_OBJECT 260 > #define QEMU_NBD_OPT_TLSCREDS 261 > #define QEMU_NBD_OPT_IMAGE_OPTS 262 > +#define QEMU_NBD_OPT_SRV_SOCKET_FD 263 > > #define MBR_SIZE 512 > > @@ -78,6 +79,7 @@ static void usage(const char *name) > " -b, --bind=IFACE interface to bind to (default `0.0.0.0')\n" > " -k, --socket=PATH path to the unix socket\n" > " (default '"SOCKET_PATH"')\n" > +" --server-sock-fd=NUM pre-opened listening server socket file descriptor\n" > " -e, --shared=NUM device can be shared by NUM clients (default '1')\n" > " -t, --persistent don't exit on the last connection\n" > " -v, --verbose display extra debugging information\n" > @@ -382,7 +384,7 @@ static void nbd_update_server_watch(void) > } > > > -static SocketAddress *nbd_build_socket_address(const char *sockpath, > +static SocketAddress *nbd_build_sock_fd(const char *sockpath, > const char *bindto, > const char *port) > { > @@ -459,6 +461,41 @@ static QCryptoTLSCreds *nbd_get_tls_creds(const char *id, Error **errp) > return creds; > } > > +static void setup_address_and_port(const char **address, const char **port) > +{ > + if (*address == NULL) { > + *address = "0.0.0.0"; > + } > + > + if (*port == NULL) { > + *port = g_strdup_printf("%d", NBD_DEFAULT_PORT);; > + } > +} > + > +/* > ++ * Check socket parameters compatibility when sock_fd is given > ++ */ > +static const char *sock_fd_validate_opts(char *device, char *sockpath, > + const char *address, const char *port) > +{ > + if (device != NULL) { > + return "NBD device can't be set when socket fd is specified"; > + } > + > + if (sockpath != NULL) { > + return "Unix socket can't be set when socket fd is specified"; > + } > + > + if (address != NULL) { > + return "The interface can't be set when socket fd is specified"; > + } > + > + if (port != NULL) { > + return "TCP port number can't be set when socket fd is specified"; > + } > + > + return NULL; > +} > > int main(int argc, char **argv) > { > @@ -467,7 +504,7 @@ int main(int argc, char **argv) > off_t dev_offset = 0; > uint16_t nbdflags = 0; > bool disconnect = false; > - const char *bindto = "0.0.0.0"; > + const char *bindto = NULL; > const char *port = NULL; > char *sockpath = NULL; > char *device = NULL; > @@ -503,6 +540,8 @@ int main(int argc, char **argv) > { "tls-creds", required_argument, NULL, QEMU_NBD_OPT_TLSCREDS }, > { "image-opts", no_argument, NULL, QEMU_NBD_OPT_IMAGE_OPTS }, > { "trace", required_argument, NULL, 'T' }, > + { "server-sock-fd", required_argument, NULL, > + QEMU_NBD_OPT_SRV_SOCKET_FD }, > { NULL, 0, NULL, 0 } > }; > int ch; > @@ -524,6 +563,7 @@ int main(int argc, char **argv) > bool imageOpts = false; > bool writethrough = true; > char *trace_file = NULL; > + int sock_fd = -1; > > /* The client thread uses SIGTERM to interrupt the server. A signal > * handler ensures that "qemu-nbd -v -c" exits with a nice status code. > @@ -714,6 +754,15 @@ int main(int argc, char **argv) > g_free(trace_file); > trace_file = trace_opt_parse(optarg); > break; > + case QEMU_NBD_OPT_SRV_SOCKET_FD: > + sock_fd = qemu_parse_fd(optarg); > + > + if (sock_fd < 0) { > + error_report("Invalid socket fd value: " > + "it must be a non-negative integer"); > + exit(EXIT_FAILURE); > + } > + break; > } > } > > @@ -735,6 +784,17 @@ int main(int argc, char **argv) > trace_init_file(trace_file); > qemu_set_log(LOG_TRACE); > > + if (sock_fd != -1) { > + const char *err_msg = sock_fd_validate_opts(device, sockpath, > + bindto, port); > + if (err_msg != NULL) { > + error_report("%s", err_msg); > + exit(EXIT_FAILURE); > + } > + } else { > + setup_address_and_port(&bindto, &port); > + } > + > if (tlscredsid) { > if (sockpath) { > error_report("TLS is only supported with IPv4/IPv6"); > @@ -838,7 +898,22 @@ int main(int argc, char **argv) > snprintf(sockpath, 128, SOCKET_PATH, basename(device)); > } > > - saddr = nbd_build_socket_address(sockpath, bindto, port); > + if (sock_fd == -1) { > + server_ioc = qio_channel_socket_new(); > + saddr = nbd_build_sock_fd(sockpath, bindto, port); > + if (qio_channel_socket_listen_sync(server_ioc, saddr, &local_err) < 0) { > + object_unref(OBJECT(server_ioc)); > + error_report_err(local_err); > + return 1; > + } > + } else { > + server_ioc = qio_channel_socket_new_fd(sock_fd, &local_err); > + if (server_ioc == NULL) { > + error_report("Failed to use the given server socket fd: %s", > + error_get_pretty(local_err)); > + exit(EXIT_FAILURE); > + } > + } > > if (qemu_init_main_loop(&local_err)) { > error_report_err(local_err); > @@ -921,13 +996,6 @@ int main(int argc, char **argv) > newproto = true; > } > > - server_ioc = qio_channel_socket_new(); > - if (qio_channel_socket_listen_sync(server_ioc, saddr, &local_err) < 0) { > - object_unref(OBJECT(server_ioc)); > - error_report_err(local_err); > - return 1; > - } > - > if (device) { > int ret; > > diff --git a/qemu-nbd.texi b/qemu-nbd.texi > index 91ebf04..4c9ea00 100644 > --- a/qemu-nbd.texi > +++ b/qemu-nbd.texi > @@ -34,6 +34,12 @@ The offset into the image > The interface to bind to (default @samp{0.0.0.0}) > @item -k, --socket=@var{path} > Use a unix socket with path @var{path} > +@item --server-sock-fd=@var{fd_num} > +A server socket fd to be used for client connections. > +The socket fd must be configured and switched to listening > +state before passing to the program. The program uses > +the given socket fd as is, without any additional > +modifications. > @item --image-opts > Treat @var{filename} as a set of image options, instead of a plain > filename. If this flag is specified, the @var{-f} flag should
diff --git a/qemu-nbd.c b/qemu-nbd.c index 99297a5..99bba38 100644 --- a/qemu-nbd.c +++ b/qemu-nbd.c @@ -48,6 +48,7 @@ #define QEMU_NBD_OPT_OBJECT 260 #define QEMU_NBD_OPT_TLSCREDS 261 #define QEMU_NBD_OPT_IMAGE_OPTS 262 +#define QEMU_NBD_OPT_SRV_SOCKET_FD 263 #define MBR_SIZE 512 @@ -78,6 +79,7 @@ static void usage(const char *name) " -b, --bind=IFACE interface to bind to (default `0.0.0.0')\n" " -k, --socket=PATH path to the unix socket\n" " (default '"SOCKET_PATH"')\n" +" --server-sock-fd=NUM pre-opened listening server socket file descriptor\n" " -e, --shared=NUM device can be shared by NUM clients (default '1')\n" " -t, --persistent don't exit on the last connection\n" " -v, --verbose display extra debugging information\n" @@ -382,7 +384,7 @@ static void nbd_update_server_watch(void) } -static SocketAddress *nbd_build_socket_address(const char *sockpath, +static SocketAddress *nbd_build_sock_fd(const char *sockpath, const char *bindto, const char *port) { @@ -459,6 +461,41 @@ static QCryptoTLSCreds *nbd_get_tls_creds(const char *id, Error **errp) return creds; } +static void setup_address_and_port(const char **address, const char **port) +{ + if (*address == NULL) { + *address = "0.0.0.0"; + } + + if (*port == NULL) { + *port = g_strdup_printf("%d", NBD_DEFAULT_PORT);; + } +} + +/* ++ * Check socket parameters compatibility when sock_fd is given ++ */ +static const char *sock_fd_validate_opts(char *device, char *sockpath, + const char *address, const char *port) +{ + if (device != NULL) { + return "NBD device can't be set when socket fd is specified"; + } + + if (sockpath != NULL) { + return "Unix socket can't be set when socket fd is specified"; + } + + if (address != NULL) { + return "The interface can't be set when socket fd is specified"; + } + + if (port != NULL) { + return "TCP port number can't be set when socket fd is specified"; + } + + return NULL; +} int main(int argc, char **argv) { @@ -467,7 +504,7 @@ int main(int argc, char **argv) off_t dev_offset = 0; uint16_t nbdflags = 0; bool disconnect = false; - const char *bindto = "0.0.0.0"; + const char *bindto = NULL; const char *port = NULL; char *sockpath = NULL; char *device = NULL; @@ -503,6 +540,8 @@ int main(int argc, char **argv) { "tls-creds", required_argument, NULL, QEMU_NBD_OPT_TLSCREDS }, { "image-opts", no_argument, NULL, QEMU_NBD_OPT_IMAGE_OPTS }, { "trace", required_argument, NULL, 'T' }, + { "server-sock-fd", required_argument, NULL, + QEMU_NBD_OPT_SRV_SOCKET_FD }, { NULL, 0, NULL, 0 } }; int ch; @@ -524,6 +563,7 @@ int main(int argc, char **argv) bool imageOpts = false; bool writethrough = true; char *trace_file = NULL; + int sock_fd = -1; /* The client thread uses SIGTERM to interrupt the server. A signal * handler ensures that "qemu-nbd -v -c" exits with a nice status code. @@ -714,6 +754,15 @@ int main(int argc, char **argv) g_free(trace_file); trace_file = trace_opt_parse(optarg); break; + case QEMU_NBD_OPT_SRV_SOCKET_FD: + sock_fd = qemu_parse_fd(optarg); + + if (sock_fd < 0) { + error_report("Invalid socket fd value: " + "it must be a non-negative integer"); + exit(EXIT_FAILURE); + } + break; } } @@ -735,6 +784,17 @@ int main(int argc, char **argv) trace_init_file(trace_file); qemu_set_log(LOG_TRACE); + if (sock_fd != -1) { + const char *err_msg = sock_fd_validate_opts(device, sockpath, + bindto, port); + if (err_msg != NULL) { + error_report("%s", err_msg); + exit(EXIT_FAILURE); + } + } else { + setup_address_and_port(&bindto, &port); + } + if (tlscredsid) { if (sockpath) { error_report("TLS is only supported with IPv4/IPv6"); @@ -838,7 +898,22 @@ int main(int argc, char **argv) snprintf(sockpath, 128, SOCKET_PATH, basename(device)); } - saddr = nbd_build_socket_address(sockpath, bindto, port); + if (sock_fd == -1) { + server_ioc = qio_channel_socket_new(); + saddr = nbd_build_sock_fd(sockpath, bindto, port); + if (qio_channel_socket_listen_sync(server_ioc, saddr, &local_err) < 0) { + object_unref(OBJECT(server_ioc)); + error_report_err(local_err); + return 1; + } + } else { + server_ioc = qio_channel_socket_new_fd(sock_fd, &local_err); + if (server_ioc == NULL) { + error_report("Failed to use the given server socket fd: %s", + error_get_pretty(local_err)); + exit(EXIT_FAILURE); + } + } if (qemu_init_main_loop(&local_err)) { error_report_err(local_err); @@ -921,13 +996,6 @@ int main(int argc, char **argv) newproto = true; } - server_ioc = qio_channel_socket_new(); - if (qio_channel_socket_listen_sync(server_ioc, saddr, &local_err) < 0) { - object_unref(OBJECT(server_ioc)); - error_report_err(local_err); - return 1; - } - if (device) { int ret; diff --git a/qemu-nbd.texi b/qemu-nbd.texi index 91ebf04..4c9ea00 100644 --- a/qemu-nbd.texi +++ b/qemu-nbd.texi @@ -34,6 +34,12 @@ The offset into the image The interface to bind to (default @samp{0.0.0.0}) @item -k, --socket=@var{path} Use a unix socket with path @var{path} +@item --server-sock-fd=@var{fd_num} +A server socket fd to be used for client connections. +The socket fd must be configured and switched to listening +state before passing to the program. The program uses +the given socket fd as is, without any additional +modifications. @item --image-opts Treat @var{filename} as a set of image options, instead of a plain filename. If this flag is specified, the @var{-f} flag should