From patchwork Wed Jul 12 11:16:59 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Hanna Czenczek X-Patchwork-Id: 1806779 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: legolas.ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=redhat.com header.i=@redhat.com header.a=rsa-sha256 header.s=mimecast20190719 header.b=YVcA2p6P; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-ECDSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4R1FZx05cQz20cd for ; Wed, 12 Jul 2023 21:17:44 +1000 (AEST) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qJXqM-0003Fi-IZ; Wed, 12 Jul 2023 07:17:14 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qJXqL-0003F6-2E for qemu-devel@nongnu.org; Wed, 12 Jul 2023 07:17:13 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.133.124]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qJXqJ-0002zv-Da for qemu-devel@nongnu.org; Wed, 12 Jul 2023 07:17:12 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1689160630; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=lnrLkRMOkuC7p3z2QlwIhHUferjPvBkeXqWRW4xF8o8=; b=YVcA2p6PYHMjK/iwtk+EN6WP8yl/nntJtQRxKr9BWYVd6881LIn+0v0AvlDjA/6o2Or2qP itqMly5fdrbLYzMguZiageHLVlkdKPACVzaWhUXWm07acSE/1mpocH1/WlC/uY2jFGixRW 1TcsMGngJOsr2SVe1Tc8qp0gnHBLem0= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-644-dyzPPG8CMueRyfZ-4iqo7Q-1; Wed, 12 Jul 2023 07:17:07 -0400 X-MC-Unique: dyzPPG8CMueRyfZ-4iqo7Q-1 Received: from smtp.corp.redhat.com (int-mx08.intmail.prod.int.rdu2.redhat.com [10.11.54.8]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id BA721800CAC for ; Wed, 12 Jul 2023 11:17:06 +0000 (UTC) Received: from localhost (unknown [10.39.193.231]) by smtp.corp.redhat.com (Postfix) with ESMTPS id 49CCBC1ED97; Wed, 12 Jul 2023 11:17:06 +0000 (UTC) From: Hanna Czenczek To: qemu-devel@nongnu.org, virtio-fs@redhat.com Cc: Hanna Czenczek , "Michael S . Tsirkin" , Stefan Hajnoczi , German Maglione , =?utf-8?q?Eugenio_P=C3=A9rez?= Subject: [PATCH v2 1/4] vhost-user.rst: Migrating back-end-internal state Date: Wed, 12 Jul 2023 13:16:59 +0200 Message-ID: <20230712111703.28031-2-hreitz@redhat.com> In-Reply-To: <20230712111703.28031-1-hreitz@redhat.com> References: <20230712111703.28031-1-hreitz@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.8 Received-SPF: pass client-ip=170.10.133.124; envelope-from=hreitz@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H4=0.001, RCVD_IN_MSPIKE_WL=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org For vhost-user devices, qemu can migrate the virtio state, but not the back-end's internal state. To do so, we need to be able to transfer this internal state between front-end (qemu) and back-end. At this point, this new feature is added for the purpose of virtio-fs migration. Because virtiofsd's internal state will not be too large, we believe it is best to transfer it as a single binary blob after the streaming phase. These are the additions to the protocol: - New vhost-user protocol feature VHOST_USER_PROTOCOL_F_MIGRATORY_STATE - SET_DEVICE_STATE_FD function: Front-end and back-end negotiate a pipe over which to transfer the state. - CHECK_DEVICE_STATE: After the state has been transferred through the pipe, the front-end invokes this function to verify success. There is no in-band way (through the pipe) to indicate failure, so we need to check explicitly. Once the transfer pipe has been established via SET_DEVICE_STATE_FD (which includes establishing the direction of transfer and migration phase), the sending side writes its data into the pipe, and the reading side reads it until it sees an EOF. Then, the front-end will check for success via CHECK_DEVICE_STATE, which on the destination side includes checking for integrity (i.e. errors during deserialization). Suggested-by: Stefan Hajnoczi Signed-off-by: Hanna Czenczek --- docs/interop/vhost-user.rst | 87 +++++++++++++++++++++++++++++++++++++ 1 file changed, 87 insertions(+) diff --git a/docs/interop/vhost-user.rst b/docs/interop/vhost-user.rst index ac6be34c4c..c98dfeca25 100644 --- a/docs/interop/vhost-user.rst +++ b/docs/interop/vhost-user.rst @@ -334,6 +334,7 @@ in the ancillary data: * ``VHOST_USER_SET_VRING_ERR`` * ``VHOST_USER_SET_BACKEND_REQ_FD`` (previous name ``VHOST_USER_SET_SLAVE_REQ_FD``) * ``VHOST_USER_SET_INFLIGHT_FD`` (if ``VHOST_USER_PROTOCOL_F_INFLIGHT_SHMFD``) +* ``VHOST_USER_SET_DEVICE_STATE_FD`` If *front-end* is unable to send the full message or receives a wrong reply it will close the connection. An optional reconnection mechanism @@ -497,6 +498,44 @@ it performs WAKE ioctl's on the userfaultfd to wake the stalled back-end. The front-end indicates support for this via the ``VHOST_USER_PROTOCOL_F_PAGEFAULT`` feature. +.. _migrating_backend_state: + +Migrating back-end state +^^^^^^^^^^^^^^^^^^^^^^^^ + +If the back-end has internal state that is to be sent from source to +destination, the front-end may be able to store and transfer it via an +internal migration stream. Support for this is negotiated with the +``VHOST_USER_PROTOCOL_F_MIGRATORY_STATE`` feature. + +First, a channel over which the state is transferred is established on +the source side using the ``VHOST_USER_SET_DEVICE_STATE_FD`` message. +This message has two parameters: + +* Direction of transfer: On the source, the data is saved, transferring + it from the back-end to the front-end. On the destination, the data + is loaded, transferring it from the front-end to the back-end. + +* Migration phase: Currently, only the period after memory transfer + before switch-over is supported, in which the device is suspended and + all of its rings are stopped. + +Then, the writing end will write all the data it has, signalling the end +of data by closing its end of the pipe. The reading end must read all +of this data and process it: + +* If saving, the front-end will transfer this data to the destination, + where it is loaded into the destination back-end. + +* If loading, the back-end must deserialize its internal state from the + transferred data and be set up to resume operation. + +After the front-end has seen all data transferred (saving: seen an EOF +on the pipe; loading: closed its end of the pipe), it sends the +``VHOST_USER_CHECK_DEVICE_STATE`` message to verify that data transfer +was successful in the back-end, too. The back-end responds once it +knows whether the tranfer and processing was successful or not. + Memory access ------------- @@ -891,6 +930,7 @@ Protocol features #define VHOST_USER_PROTOCOL_F_STATUS 16 #define VHOST_USER_PROTOCOL_F_XEN_MMAP 17 #define VHOST_USER_PROTOCOL_F_SUSPEND 18 + #define VHOST_USER_PROTOCOL_F_MIGRATORY_STATE 19 Front-end message types ----------------------- @@ -1471,6 +1511,53 @@ Front-end message types before. The back-end must again begin processing rings that are not stopped, and it may resume background operations. +``VHOST_USER_SET_DEVICE_STATE_FD`` + :id: 43 + :equivalent ioctl: N/A + :request payload: device state transfer parameters + :reply payload: ``u64`` + + The front-end negotiates a pipe over which to transfer the back-end’s + internal state during migration. For this purpose, this message is + accompanied by a file descriptor that is to be the back-end’s end of + the pipe. If the back-end can provide a more efficient pipe (i.e. + because it internally already has a pipe into/from which to + put/receive state), it can ignore this and reply with a different file + descriptor to serve as the front-end’s end. + + The request payload contains parameters for the subsequent data + transfer, as described in the :ref:`Migrating back-end state + ` section. That section also explains the + data transfer itself. + + The value returned is both an indication for success, and whether a + new pipe file descriptor is returned: Bits 0–7 are 0 on success, and + non-zero on error. Bit 8 is the invalid FD flag; this flag is set + when there is no file descriptor returned. When this flag is not set, + the front-end must use the returned file descriptor as its end of the + pipe. The back-end must not both indicate an error and return a file + descriptor. + + Using this function requires prior negotiation of the + ``VHOST_USER_PROTOCOL_F_MIGRATORY_STATE`` feature. + +``VHOST_USER_CHECK_DEVICE_STATE`` + :id: 44 + :equivalent ioctl: N/A + :request payload: N/A + :reply payload: ``u64`` + + After transferring the back-end’s internal state during migration (see + the :ref:`Migrating back-end state ` + section), check whether the back-end was able to successfully fully + process the state. + + The value returned indicates success or error; 0 is success, any + non-zero value is an error. + + Using this function requires prior negotiation of the + ``VHOST_USER_PROTOCOL_F_MIGRATORY_STATE`` feature. + Back-end message types ---------------------- From patchwork Wed Jul 12 11:17:00 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hanna Czenczek X-Patchwork-Id: 1806782 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: legolas.ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=redhat.com header.i=@redhat.com header.a=rsa-sha256 header.s=mimecast20190719 header.b=hZH2kqxd; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-ECDSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4R1FbD2NmVz20cd for ; Wed, 12 Jul 2023 21:18:00 +1000 (AEST) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qJXqP-0003I5-Ap; Wed, 12 Jul 2023 07:17:17 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qJXqL-0003FI-EK for qemu-devel@nongnu.org; Wed, 12 Jul 2023 07:17:13 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.129.124]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qJXqJ-0002zq-BF for qemu-devel@nongnu.org; Wed, 12 Jul 2023 07:17:13 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1689160630; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=rzUuAIu61CvFu4s7Pb65nb3UD1Xq4aYPpW13spuFtFM=; b=hZH2kqxdnjqlDr/FXLyzPWBNQ+apjZehGKsMlpB0LlmC1rBqpEleV9Inqw3TEjJrJKsFNV oLQRLgWnz/6Iz8QRWS4isg+pbbDEVShqFwxU8FjEfNuDghtE8ZL70M2gZmAFjwWCYU8lSJ RxInFPbv5NGZnIiekcIEuLGo7jrxnBM= Received: from mimecast-mx02.redhat.com (66.187.233.73 [66.187.233.73]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-299-XdggtyJWM4yJQHSc26zqtg-1; Wed, 12 Jul 2023 07:17:08 -0400 X-MC-Unique: XdggtyJWM4yJQHSc26zqtg-1 Received: from smtp.corp.redhat.com (int-mx05.intmail.prod.int.rdu2.redhat.com [10.11.54.5]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 8DC8F3C0ED4E for ; Wed, 12 Jul 2023 11:17:08 +0000 (UTC) Received: from localhost (unknown [10.39.193.231]) by smtp.corp.redhat.com (Postfix) with ESMTPS id F4103F66C0; Wed, 12 Jul 2023 11:17:07 +0000 (UTC) From: Hanna Czenczek To: qemu-devel@nongnu.org, virtio-fs@redhat.com Cc: Hanna Czenczek , "Michael S . Tsirkin" , Stefan Hajnoczi , German Maglione , =?utf-8?q?Eugenio_P=C3=A9rez?= Subject: [PATCH v2 2/4] vhost-user: Interface for migration state transfer Date: Wed, 12 Jul 2023 13:17:00 +0200 Message-ID: <20230712111703.28031-3-hreitz@redhat.com> In-Reply-To: <20230712111703.28031-1-hreitz@redhat.com> References: <20230712111703.28031-1-hreitz@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.5 Received-SPF: pass client-ip=170.10.129.124; envelope-from=hreitz@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H4=0.001, RCVD_IN_MSPIKE_WL=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Add the interface for transferring the back-end's state during migration as defined previously in vhost-user.rst. Signed-off-by: Hanna Czenczek Reviewed-by: Stefan Hajnoczi --- include/hw/virtio/vhost-backend.h | 24 +++++ include/hw/virtio/vhost.h | 79 ++++++++++++++++ hw/virtio/vhost-user.c | 147 ++++++++++++++++++++++++++++++ hw/virtio/vhost.c | 37 ++++++++ 4 files changed, 287 insertions(+) diff --git a/include/hw/virtio/vhost-backend.h b/include/hw/virtio/vhost-backend.h index 31a251a9f5..e59d0b53f8 100644 --- a/include/hw/virtio/vhost-backend.h +++ b/include/hw/virtio/vhost-backend.h @@ -26,6 +26,18 @@ typedef enum VhostSetConfigType { VHOST_SET_CONFIG_TYPE_MIGRATION = 1, } VhostSetConfigType; +typedef enum VhostDeviceStateDirection { + /* Transfer state from back-end (device) to front-end */ + VHOST_TRANSFER_STATE_DIRECTION_SAVE = 0, + /* Transfer state from front-end to back-end (device) */ + VHOST_TRANSFER_STATE_DIRECTION_LOAD = 1, +} VhostDeviceStateDirection; + +typedef enum VhostDeviceStatePhase { + /* The device (and all its vrings) is stopped */ + VHOST_TRANSFER_STATE_PHASE_STOPPED = 0, +} VhostDeviceStatePhase; + struct vhost_inflight; struct vhost_dev; struct vhost_log; @@ -133,6 +145,15 @@ typedef int (*vhost_set_config_call_op)(struct vhost_dev *dev, typedef void (*vhost_reset_status_op)(struct vhost_dev *dev); +typedef bool (*vhost_supports_migratory_state_op)(struct vhost_dev *dev); +typedef int (*vhost_set_device_state_fd_op)(struct vhost_dev *dev, + VhostDeviceStateDirection direction, + VhostDeviceStatePhase phase, + int fd, + int *reply_fd, + Error **errp); +typedef int (*vhost_check_device_state_op)(struct vhost_dev *dev, Error **errp); + typedef struct VhostOps { VhostBackendType backend_type; vhost_backend_init vhost_backend_init; @@ -181,6 +202,9 @@ typedef struct VhostOps { vhost_force_iommu_op vhost_force_iommu; vhost_set_config_call_op vhost_set_config_call; vhost_reset_status_op vhost_reset_status; + vhost_supports_migratory_state_op vhost_supports_migratory_state; + vhost_set_device_state_fd_op vhost_set_device_state_fd; + vhost_check_device_state_op vhost_check_device_state; } VhostOps; int vhost_backend_update_device_iotlb(struct vhost_dev *dev, diff --git a/include/hw/virtio/vhost.h b/include/hw/virtio/vhost.h index 69bf59d630..d8877496e5 100644 --- a/include/hw/virtio/vhost.h +++ b/include/hw/virtio/vhost.h @@ -346,4 +346,83 @@ int vhost_dev_set_inflight(struct vhost_dev *dev, int vhost_dev_get_inflight(struct vhost_dev *dev, uint16_t queue_size, struct vhost_inflight *inflight); bool vhost_dev_has_iommu(struct vhost_dev *dev); + +/** + * vhost_supports_migratory_state(): Checks whether the back-end + * supports transferring internal state for the purpose of migration. + * Support for this feature is required for vhost_set_device_state_fd() + * and vhost_check_device_state(). + * + * @dev: The vhost device + * + * Returns true if the device supports these commands, and false if it + * does not. + */ +bool vhost_supports_migratory_state(struct vhost_dev *dev); + +/** + * vhost_set_device_state_fd(): Begin transfer of internal state from/to + * the back-end for the purpose of migration. Data is to be transferred + * over a pipe according to @direction and @phase. The sending end must + * only write to the pipe, and the receiving end must only read from it. + * Once the sending end is done, it closes its FD. The receiving end + * must take this as the end-of-transfer signal and close its FD, too. + * + * @fd is the back-end's end of the pipe: The write FD for SAVE, and the + * read FD for LOAD. This function transfers ownership of @fd to the + * back-end, i.e. closes it in the front-end. + * + * The back-end may optionally reply with an FD of its own, if this + * improves efficiency on its end. In this case, the returned FD is + * stored in *reply_fd. The back-end will discard the FD sent to it, + * and the front-end must use *reply_fd for transferring state to/from + * the back-end. + * + * @dev: The vhost device + * @direction: The direction in which the state is to be transferred. + * For outgoing migrations, this is SAVE, and data is read + * from the back-end and stored by the front-end in the + * migration stream. + * For incoming migrations, this is LOAD, and data is read + * by the front-end from the migration stream and sent to + * the back-end to restore the saved state. + * @phase: Which migration phase we are in. Currently, there is only + * STOPPED (device and all vrings are stopped), in the future, + * more phases such as PRE_COPY or POST_COPY may be added. + * @fd: Back-end's end of the pipe through which to transfer state; note + * that ownership is transferred to the back-end, so this function + * closes @fd in the front-end. + * @reply_fd: If the back-end wishes to use a different pipe for state + * transfer, this will contain an FD for the front-end to + * use. Otherwise, -1 is stored here. + * @errp: Potential error description + * + * Returns 0 on success, and -errno on failure. + */ +int vhost_set_device_state_fd(struct vhost_dev *dev, + VhostDeviceStateDirection direction, + VhostDeviceStatePhase phase, + int fd, + int *reply_fd, + Error **errp); + +/** + * vhost_set_device_state_fd(): After transferring state from/to the + * back-end via vhost_set_device_state_fd(), i.e. once the sending end + * has closed the pipe, inquire the back-end to report any potential + * errors that have occurred on its side. This allows to sense errors + * like: + * - During outgoing migration, when the source side had already started + * to produce its state, something went wrong and it failed to finish + * - During incoming migration, when the received state is somehow + * invalid and cannot be processed by the back-end + * + * @dev: The vhost device + * @errp: Potential error description + * + * Returns 0 when the back-end reports successful state transfer and + * processing, and -errno when an error occurred somewhere. + */ +int vhost_check_device_state(struct vhost_dev *dev, Error **errp); + #endif diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c index 53a881ec2a..8e6b5485e8 100644 --- a/hw/virtio/vhost-user.c +++ b/hw/virtio/vhost-user.c @@ -76,6 +76,7 @@ enum VhostUserProtocolFeature { VHOST_USER_PROTOCOL_F_STATUS = 16, /* Feature 17 reserved for VHOST_USER_PROTOCOL_F_XEN_MMAP. */ VHOST_USER_PROTOCOL_F_SUSPEND = 18, + VHOST_USER_PROTOCOL_F_MIGRATORY_STATE = 19, VHOST_USER_PROTOCOL_F_MAX }; @@ -125,6 +126,8 @@ typedef enum VhostUserRequest { VHOST_USER_GET_STATUS = 40, VHOST_USER_SUSPEND = 41, VHOST_USER_RESUME = 42, + VHOST_USER_SET_DEVICE_STATE_FD = 43, + VHOST_USER_CHECK_DEVICE_STATE = 44, VHOST_USER_MAX } VhostUserRequest; @@ -216,6 +219,12 @@ typedef struct { uint32_t size; /* the following payload size */ } QEMU_PACKED VhostUserHeader; +/* Request payload of VHOST_USER_SET_DEVICE_STATE_FD */ +typedef struct VhostUserTransferDeviceState { + uint32_t direction; + uint32_t phase; +} VhostUserTransferDeviceState; + typedef union { #define VHOST_USER_VRING_IDX_MASK (0xff) #define VHOST_USER_VRING_NOFD_MASK (0x1 << 8) @@ -230,6 +239,7 @@ typedef union { VhostUserCryptoSession session; VhostUserVringArea area; VhostUserInflight inflight; + VhostUserTransferDeviceState transfer_state; } VhostUserPayload; typedef struct VhostUserMsg { @@ -2838,6 +2848,140 @@ static void vhost_user_reset_status(struct vhost_dev *dev) } } +static bool vhost_user_supports_migratory_state(struct vhost_dev *dev) +{ + return virtio_has_feature(dev->protocol_features, + VHOST_USER_PROTOCOL_F_MIGRATORY_STATE); +} + +static int vhost_user_set_device_state_fd(struct vhost_dev *dev, + VhostDeviceStateDirection direction, + VhostDeviceStatePhase phase, + int fd, + int *reply_fd, + Error **errp) +{ + int ret; + struct vhost_user *vu = dev->opaque; + VhostUserMsg msg = { + .hdr = { + .request = VHOST_USER_SET_DEVICE_STATE_FD, + .flags = VHOST_USER_VERSION, + .size = sizeof(msg.payload.transfer_state), + }, + .payload.transfer_state = { + .direction = direction, + .phase = phase, + }, + }; + + *reply_fd = -1; + + if (!vhost_user_supports_migratory_state(dev)) { + close(fd); + error_setg(errp, "Back-end does not support migration state transfer"); + return -ENOTSUP; + } + + ret = vhost_user_write(dev, &msg, &fd, 1); + close(fd); + if (ret < 0) { + error_setg_errno(errp, -ret, + "Failed to send SET_DEVICE_STATE_FD message"); + return ret; + } + + ret = vhost_user_read(dev, &msg); + if (ret < 0) { + error_setg_errno(errp, -ret, + "Failed to receive SET_DEVICE_STATE_FD reply"); + return ret; + } + + if (msg.hdr.request != VHOST_USER_SET_DEVICE_STATE_FD) { + error_setg(errp, + "Received unexpected message type, expected %d, received %d", + VHOST_USER_SET_DEVICE_STATE_FD, msg.hdr.request); + return -EPROTO; + } + + if (msg.hdr.size != sizeof(msg.payload.u64)) { + error_setg(errp, + "Received bad message size, expected %zu, received %" PRIu32, + sizeof(msg.payload.u64), msg.hdr.size); + return -EPROTO; + } + + if ((msg.payload.u64 & 0xff) != 0) { + error_setg(errp, "Back-end did not accept migration state transfer"); + return -EIO; + } + + if (!(msg.payload.u64 & VHOST_USER_VRING_NOFD_MASK)) { + *reply_fd = qemu_chr_fe_get_msgfd(vu->user->chr); + if (*reply_fd < 0) { + error_setg(errp, + "Failed to get back-end-provided transfer pipe FD"); + *reply_fd = -1; + return -EIO; + } + } + + return 0; +} + +static int vhost_user_check_device_state(struct vhost_dev *dev, Error **errp) +{ + int ret; + VhostUserMsg msg = { + .hdr = { + .request = VHOST_USER_CHECK_DEVICE_STATE, + .flags = VHOST_USER_VERSION, + .size = 0, + }, + }; + + if (!vhost_user_supports_migratory_state(dev)) { + error_setg(errp, "Back-end does not support migration state transfer"); + return -ENOTSUP; + } + + ret = vhost_user_write(dev, &msg, NULL, 0); + if (ret < 0) { + error_setg_errno(errp, -ret, + "Failed to send CHECK_DEVICE_STATE message"); + return ret; + } + + ret = vhost_user_read(dev, &msg); + if (ret < 0) { + error_setg_errno(errp, -ret, + "Failed to receive CHECK_DEVICE_STATE reply"); + return ret; + } + + if (msg.hdr.request != VHOST_USER_CHECK_DEVICE_STATE) { + error_setg(errp, + "Received unexpected message type, expected %d, received %d", + VHOST_USER_CHECK_DEVICE_STATE, msg.hdr.request); + return -EPROTO; + } + + if (msg.hdr.size != sizeof(msg.payload.u64)) { + error_setg(errp, + "Received bad message size, expected %zu, received %" PRIu32, + sizeof(msg.payload.u64), msg.hdr.size); + return -EPROTO; + } + + if (msg.payload.u64 != 0) { + error_setg(errp, "Back-end failed to process its internal state"); + return -EIO; + } + + return 0; +} + const VhostOps user_ops = { .backend_type = VHOST_BACKEND_TYPE_USER, .vhost_backend_init = vhost_user_backend_init, @@ -2874,4 +3018,7 @@ const VhostOps user_ops = { .vhost_set_inflight_fd = vhost_user_set_inflight_fd, .vhost_dev_start = vhost_user_dev_start, .vhost_reset_status = vhost_user_reset_status, + .vhost_supports_migratory_state = vhost_user_supports_migratory_state, + .vhost_set_device_state_fd = vhost_user_set_device_state_fd, + .vhost_check_device_state = vhost_user_check_device_state, }; diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c index 2e28e58da7..756b6d55a8 100644 --- a/hw/virtio/vhost.c +++ b/hw/virtio/vhost.c @@ -2091,3 +2091,40 @@ int vhost_net_set_backend(struct vhost_dev *hdev, return -ENOSYS; } + +bool vhost_supports_migratory_state(struct vhost_dev *dev) +{ + if (dev->vhost_ops->vhost_supports_migratory_state) { + return dev->vhost_ops->vhost_supports_migratory_state(dev); + } + + return false; +} + +int vhost_set_device_state_fd(struct vhost_dev *dev, + VhostDeviceStateDirection direction, + VhostDeviceStatePhase phase, + int fd, + int *reply_fd, + Error **errp) +{ + if (dev->vhost_ops->vhost_set_device_state_fd) { + return dev->vhost_ops->vhost_set_device_state_fd(dev, direction, phase, + fd, reply_fd, errp); + } + + error_setg(errp, + "vhost transport does not support migration state transfer"); + return -ENOSYS; +} + +int vhost_check_device_state(struct vhost_dev *dev, Error **errp) +{ + if (dev->vhost_ops->vhost_check_device_state) { + return dev->vhost_ops->vhost_check_device_state(dev, errp); + } + + error_setg(errp, + "vhost transport does not support migration state transfer"); + return -ENOSYS; +} From patchwork Wed Jul 12 11:17:01 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hanna Czenczek X-Patchwork-Id: 1806778 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: legolas.ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=redhat.com header.i=@redhat.com header.a=rsa-sha256 header.s=mimecast20190719 header.b=RylJ4GsT; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-ECDSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4R1FZw3vb2z20bt for ; Wed, 12 Jul 2023 21:17:44 +1000 (AEST) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qJXqO-0003I4-U7; Wed, 12 Jul 2023 07:17:16 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qJXqN-0003Fm-3F for qemu-devel@nongnu.org; Wed, 12 Jul 2023 07:17:15 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.133.124]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qJXqL-000305-8Y for qemu-devel@nongnu.org; Wed, 12 Jul 2023 07:17:14 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1689160632; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=ZUb7NCBfiWZ8ORKbNcTESP+dB2MgBUQp+SFFbaDhKQ8=; b=RylJ4GsTUYw2sCF8sAgVKRGtNiCvsLrw2BomtuUk8NiP1oKY4AQApKQXhModEa22EZgtmM QTDvcIvMTYk5BQJ+fpbfenuOHik59oiIRvjf5WVxy9wiVfCutRJehSJPM8QsUrxliR/jUX +TCuR80K+XhMtVD1oAXq5L8xemlCYI8= Received: from mimecast-mx02.redhat.com (66.187.233.73 [66.187.233.73]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-612-cdA7rs54NVSfXk0clwkxAA-1; Wed, 12 Jul 2023 07:17:10 -0400 X-MC-Unique: cdA7rs54NVSfXk0clwkxAA-1 Received: from smtp.corp.redhat.com (int-mx09.intmail.prod.int.rdu2.redhat.com [10.11.54.9]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 765F83C0ED49 for ; Wed, 12 Jul 2023 11:17:10 +0000 (UTC) Received: from localhost (unknown [10.39.193.231]) by smtp.corp.redhat.com (Postfix) with ESMTPS id E6F94492B01; Wed, 12 Jul 2023 11:17:09 +0000 (UTC) From: Hanna Czenczek To: qemu-devel@nongnu.org, virtio-fs@redhat.com Cc: Hanna Czenczek , "Michael S . Tsirkin" , Stefan Hajnoczi , German Maglione , =?utf-8?q?Eugenio_P=C3=A9rez?= Subject: [PATCH v2 3/4] vhost: Add high-level state save/load functions Date: Wed, 12 Jul 2023 13:17:01 +0200 Message-ID: <20230712111703.28031-4-hreitz@redhat.com> In-Reply-To: <20230712111703.28031-1-hreitz@redhat.com> References: <20230712111703.28031-1-hreitz@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.9 Received-SPF: pass client-ip=170.10.133.124; envelope-from=hreitz@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H4=0.001, RCVD_IN_MSPIKE_WL=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org vhost_save_backend_state() and vhost_load_backend_state() can be used by vhost front-ends to easily save and load the back-end's state to/from the migration stream. Because we do not know the full state size ahead of time, vhost_save_backend_state() simply reads the data in 1 MB chunks, and writes each chunk consecutively into the migration stream, prefixed by its length. EOF is indicated by a 0-length chunk. Signed-off-by: Hanna Czenczek --- include/hw/virtio/vhost.h | 35 +++++++ hw/virtio/vhost.c | 204 ++++++++++++++++++++++++++++++++++++++ 2 files changed, 239 insertions(+) diff --git a/include/hw/virtio/vhost.h b/include/hw/virtio/vhost.h index d8877496e5..0c282abd4e 100644 --- a/include/hw/virtio/vhost.h +++ b/include/hw/virtio/vhost.h @@ -425,4 +425,39 @@ int vhost_set_device_state_fd(struct vhost_dev *dev, */ int vhost_check_device_state(struct vhost_dev *dev, Error **errp); +/** + * vhost_save_backend_state(): High-level function to receive a vhost + * back-end's state, and save it in `f`. Uses + * `vhost_set_device_state_fd()` to get the data from the back-end, and + * stores it in consecutive chunks that are each prefixed by their + * respective length (be32). The end is marked by a 0-length chunk. + * + * Must only be called while the device and all its vrings are stopped + * (`VHOST_TRANSFER_STATE_PHASE_STOPPED`). + * + * @dev: The vhost device from which to save the state + * @f: Migration stream in which to save the state + * @errp: Potential error message + * + * Returns 0 on success, and -errno otherwise. + */ +int vhost_save_backend_state(struct vhost_dev *dev, QEMUFile *f, Error **errp); + +/** + * vhost_load_backend_state(): High-level function to load a vhost + * back-end's state from `f`, and send it over to the back-end. Reads + * the data from `f` in the format used by `vhost_save_state()`, and + * uses `vhost_set_device_state_fd()` to transfer it to the back-end. + * + * Must only be called while the device and all its vrings are stopped + * (`VHOST_TRANSFER_STATE_PHASE_STOPPED`). + * + * @dev: The vhost device to which to send the sate + * @f: Migration stream from which to load the state + * @errp: Potential error message + * + * Returns 0 on success, and -errno otherwise. + */ +int vhost_load_backend_state(struct vhost_dev *dev, QEMUFile *f, Error **errp); + #endif diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c index 756b6d55a8..332d49a310 100644 --- a/hw/virtio/vhost.c +++ b/hw/virtio/vhost.c @@ -2128,3 +2128,207 @@ int vhost_check_device_state(struct vhost_dev *dev, Error **errp) "vhost transport does not support migration state transfer"); return -ENOSYS; } + +int vhost_save_backend_state(struct vhost_dev *dev, QEMUFile *f, Error **errp) +{ + /* Maximum chunk size in which to transfer the state */ + const size_t chunk_size = 1 * 1024 * 1024; + void *transfer_buf = NULL; + g_autoptr(GError) g_err = NULL; + int pipe_fds[2], read_fd = -1, write_fd = -1, reply_fd = -1; + int ret; + + /* [0] for reading (our end), [1] for writing (back-end's end) */ + if (!g_unix_open_pipe(pipe_fds, FD_CLOEXEC, &g_err)) { + error_setg(errp, "Failed to set up state transfer pipe: %s", + g_err->message); + ret = -EINVAL; + goto fail; + } + + read_fd = pipe_fds[0]; + write_fd = pipe_fds[1]; + + /* + * VHOST_TRANSFER_STATE_PHASE_STOPPED means the device must be stopped. + * We cannot check dev->suspended, because the back-end may not support + * suspending. + */ + assert(!dev->started); + + /* Transfer ownership of write_fd to the back-end */ + ret = vhost_set_device_state_fd(dev, + VHOST_TRANSFER_STATE_DIRECTION_SAVE, + VHOST_TRANSFER_STATE_PHASE_STOPPED, + write_fd, + &reply_fd, + errp); + if (ret < 0) { + error_prepend(errp, "Failed to initiate state transfer: "); + goto fail; + } + + /* If the back-end wishes to use a different pipe, switch over */ + if (reply_fd >= 0) { + close(read_fd); + read_fd = reply_fd; + } + + transfer_buf = g_malloc(chunk_size); + + while (true) { + ssize_t read_ret; + + read_ret = read(read_fd, transfer_buf, chunk_size); + if (read_ret < 0) { + ret = -errno; + error_setg_errno(errp, -ret, "Failed to receive state"); + goto fail; + } + + assert(read_ret <= chunk_size); + qemu_put_be32(f, read_ret); + + if (read_ret == 0) { + /* EOF */ + break; + } + + qemu_put_buffer(f, transfer_buf, read_ret); + } + + /* + * Back-end will not really care, but be clean and close our end of the pipe + * before inquiring the back-end about whether transfer was successful + */ + close(read_fd); + read_fd = -1; + + /* Also, verify that the device is still stopped */ + assert(!dev->started); + + ret = vhost_check_device_state(dev, errp); + if (ret < 0) { + goto fail; + } + + ret = 0; +fail: + g_free(transfer_buf); + if (read_fd >= 0) { + close(read_fd); + } + + return ret; +} + +int vhost_load_backend_state(struct vhost_dev *dev, QEMUFile *f, Error **errp) +{ + size_t transfer_buf_size = 0; + void *transfer_buf = NULL; + g_autoptr(GError) g_err = NULL; + int pipe_fds[2], read_fd = -1, write_fd = -1, reply_fd = -1; + int ret; + + /* [0] for reading (back-end's end), [1] for writing (our end) */ + if (!g_unix_open_pipe(pipe_fds, FD_CLOEXEC, &g_err)) { + error_setg(errp, "Failed to set up state transfer pipe: %s", + g_err->message); + ret = -EINVAL; + goto fail; + } + + read_fd = pipe_fds[0]; + write_fd = pipe_fds[1]; + + /* + * VHOST_TRANSFER_STATE_PHASE_STOPPED means the device must be stopped. + * We cannot check dev->suspended, because the back-end may not support + * suspending. + */ + assert(!dev->started); + + /* Transfer ownership of read_fd to the back-end */ + ret = vhost_set_device_state_fd(dev, + VHOST_TRANSFER_STATE_DIRECTION_LOAD, + VHOST_TRANSFER_STATE_PHASE_STOPPED, + read_fd, + &reply_fd, + errp); + if (ret < 0) { + error_prepend(errp, "Failed to initiate state transfer: "); + goto fail; + } + + /* If the back-end wishes to use a different pipe, switch over */ + if (reply_fd >= 0) { + close(write_fd); + write_fd = reply_fd; + } + + while (true) { + size_t this_chunk_size = qemu_get_be32(f); + ssize_t write_ret; + const uint8_t *transfer_pointer; + + if (this_chunk_size == 0) { + /* End of state */ + break; + } + + if (transfer_buf_size < this_chunk_size) { + transfer_buf = g_realloc(transfer_buf, this_chunk_size); + transfer_buf_size = this_chunk_size; + } + + if (qemu_get_buffer(f, transfer_buf, this_chunk_size) < + this_chunk_size) + { + error_setg(errp, "Failed to read state"); + ret = -EINVAL; + goto fail; + } + + transfer_pointer = transfer_buf; + while (this_chunk_size > 0) { + write_ret = write(write_fd, transfer_pointer, this_chunk_size); + if (write_ret < 0) { + ret = -errno; + error_setg_errno(errp, -ret, "Failed to send state"); + goto fail; + } else if (write_ret == 0) { + error_setg(errp, "Failed to send state: Connection is closed"); + ret = -ECONNRESET; + goto fail; + } + + assert(write_ret <= this_chunk_size); + this_chunk_size -= write_ret; + transfer_pointer += write_ret; + } + } + + /* + * Close our end, thus ending transfer, before inquiring the back-end about + * whether transfer was successful + */ + close(write_fd); + write_fd = -1; + + /* Also, verify that the device is still stopped */ + assert(!dev->started); + + ret = vhost_check_device_state(dev, errp); + if (ret < 0) { + goto fail; + } + + ret = 0; +fail: + g_free(transfer_buf); + if (write_fd >= 0) { + close(write_fd); + } + + return ret; +} From patchwork Wed Jul 12 11:17:02 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hanna Czenczek X-Patchwork-Id: 1806781 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: legolas.ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=redhat.com header.i=@redhat.com header.a=rsa-sha256 header.s=mimecast20190719 header.b=f8FTJUds; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-ECDSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4R1Fb95k5nz20bt for ; Wed, 12 Jul 2023 21:17:57 +1000 (AEST) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qJXqP-0003I6-Ez; Wed, 12 Jul 2023 07:17:17 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qJXqN-0003Hf-Rm for qemu-devel@nongnu.org; Wed, 12 Jul 2023 07:17:15 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.133.124]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qJXqM-00030P-Ap for qemu-devel@nongnu.org; Wed, 12 Jul 2023 07:17:15 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1689160633; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=k34kcLwTvPEkT+3XVYFdwFc+Apb+lpmPwdTMWwux4Wg=; b=f8FTJUds1u24wJBlo7flAXl4F1kv2IRjsVHjkcGrOFS/o9mYgY84ivdAvmfLJzjhRPKnju PEkELdu+C6KdUYb3MlbLUVJwzyCt6eNJJMOWXQ517fpjg2JIUICB1yN+z5lsJPitJnHdRK WkXsxNzmbwbqAeA02ISz03qSbdf2yjM= Received: from mimecast-mx02.redhat.com (66.187.233.73 [66.187.233.73]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-436-IWj6wq6FOKO5UZVuFRRkLw-1; Wed, 12 Jul 2023 07:17:12 -0400 X-MC-Unique: IWj6wq6FOKO5UZVuFRRkLw-1 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.rdu2.redhat.com [10.11.54.4]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 25CF42815E28 for ; Wed, 12 Jul 2023 11:17:12 +0000 (UTC) Received: from localhost (unknown [10.39.193.231]) by smtp.corp.redhat.com (Postfix) with ESMTPS id AE3E6200A7CA; Wed, 12 Jul 2023 11:17:11 +0000 (UTC) From: Hanna Czenczek To: qemu-devel@nongnu.org, virtio-fs@redhat.com Cc: Hanna Czenczek , "Michael S . Tsirkin" , Stefan Hajnoczi , German Maglione , =?utf-8?q?Eugenio_P=C3=A9rez?= Subject: [PATCH v2 4/4] vhost-user-fs: Implement internal migration Date: Wed, 12 Jul 2023 13:17:02 +0200 Message-ID: <20230712111703.28031-5-hreitz@redhat.com> In-Reply-To: <20230712111703.28031-1-hreitz@redhat.com> References: <20230712111703.28031-1-hreitz@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.4 Received-SPF: pass client-ip=170.10.133.124; envelope-from=hreitz@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H4=0.001, RCVD_IN_MSPIKE_WL=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org A virtio-fs device's VM state consists of: - the virtio device (vring) state (VMSTATE_VIRTIO_DEVICE) - the back-end's (virtiofsd's) internal state We get/set the latter via the new vhost operations to transfer migratory state. It is its own dedicated subsection, so that for external migration, it can be disabled. Signed-off-by: Hanna Czenczek Reviewed-by: Stefan Hajnoczi --- hw/virtio/vhost-user-fs.c | 101 +++++++++++++++++++++++++++++++++++++- 1 file changed, 100 insertions(+), 1 deletion(-) diff --git a/hw/virtio/vhost-user-fs.c b/hw/virtio/vhost-user-fs.c index 49d699ffc2..2b30f3d442 100644 --- a/hw/virtio/vhost-user-fs.c +++ b/hw/virtio/vhost-user-fs.c @@ -298,9 +298,108 @@ static struct vhost_dev *vuf_get_vhost(VirtIODevice *vdev) return &fs->vhost_dev; } +/** + * Fetch the internal state from virtiofsd and save it to `f`. + */ +static int vuf_save_state(QEMUFile *f, void *pv, size_t size, + const VMStateField *field, JSONWriter *vmdesc) +{ + VirtIODevice *vdev = pv; + VHostUserFS *fs = VHOST_USER_FS(vdev); + Error *local_error = NULL; + int ret; + + ret = vhost_save_backend_state(&fs->vhost_dev, f, &local_error); + if (ret < 0) { + error_reportf_err(local_error, + "Error saving back-end state of %s device %s " + "(tag: \"%s\"): ", + vdev->name, vdev->parent_obj.canonical_path, + fs->conf.tag ?: ""); + return ret; + } + + return 0; +} + +/** + * Load virtiofsd's internal state from `f` and send it over to virtiofsd. + */ +static int vuf_load_state(QEMUFile *f, void *pv, size_t size, + const VMStateField *field) +{ + VirtIODevice *vdev = pv; + VHostUserFS *fs = VHOST_USER_FS(vdev); + Error *local_error = NULL; + int ret; + + ret = vhost_load_backend_state(&fs->vhost_dev, f, &local_error); + if (ret < 0) { + error_reportf_err(local_error, + "Error loading back-end state of %s device %s " + "(tag: \"%s\"): ", + vdev->name, vdev->parent_obj.canonical_path, + fs->conf.tag ?: ""); + return ret; + } + + return 0; +} + +static bool vuf_is_internal_migration(void *opaque) +{ + /* TODO: Return false when an external migration is requested */ + return true; +} + +static int vuf_check_migration_support(void *opaque) +{ + VirtIODevice *vdev = opaque; + VHostUserFS *fs = VHOST_USER_FS(vdev); + + if (!vhost_supports_migratory_state(&fs->vhost_dev)) { + error_report("Back-end of %s device %s (tag: \"%s\") does not support " + "migration through qemu", + vdev->name, vdev->parent_obj.canonical_path, + fs->conf.tag ?: ""); + return -ENOTSUP; + } + + return 0; +} + +static const VMStateDescription vuf_backend_vmstate; + static const VMStateDescription vuf_vmstate = { .name = "vhost-user-fs", - .unmigratable = 1, + .version_id = 0, + .fields = (VMStateField[]) { + VMSTATE_VIRTIO_DEVICE, + VMSTATE_END_OF_LIST() + }, + .subsections = (const VMStateDescription * []) { + &vuf_backend_vmstate, + NULL, + } +}; + +static const VMStateDescription vuf_backend_vmstate = { + .name = "vhost-user-fs-backend", + .version_id = 0, + .needed = vuf_is_internal_migration, + .pre_load = vuf_check_migration_support, + .pre_save = vuf_check_migration_support, + .fields = (VMStateField[]) { + { + .name = "back-end", + .info = &(const VMStateInfo) { + .name = "virtio-fs back-end state", + .get = vuf_load_state, + .put = vuf_save_state, + }, + }, + VMSTATE_END_OF_LIST() + }, }; static Property vuf_properties[] = {