From patchwork Fri Aug 11 15:08:27 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Fabiano Rosas X-Patchwork-Id: 1820312 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: legolas.ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=suse.de header.i=@suse.de header.a=rsa-sha256 header.s=susede2_rsa header.b=GnhZbBqY; dkim=pass header.d=suse.de header.i=@suse.de header.a=ed25519-sha256 header.s=susede2_ed25519 header.b=ji00aX4M; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-ECDSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4RMnJ10dgDz1yf6 for ; Sat, 12 Aug 2023 01:09:05 +1000 (AEST) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qUTku-0000Xg-V3; Fri, 11 Aug 2023 11:08:49 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qUTkr-0000X2-8L for qemu-devel@nongnu.org; Fri, 11 Aug 2023 11:08:45 -0400 Received: from smtp-out1.suse.de ([195.135.220.28]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1qUTkp-0006FO-RP for qemu-devel@nongnu.org; Fri, 11 Aug 2023 11:08:44 -0400 Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate 
requested) by smtp-out1.suse.de (Postfix) with ESMTPS id 0002521870; Fri, 11 Aug 2023 15:08:41 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1691766522; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=2U70nrbKPKQlvdKn3PLPV7kvKLVHMvBT1WZo0dV9/p0=; b=GnhZbBqYjdSY7NjhUkVQ5WgJbIlqWxmeQAELcr+y10gWqwuZD9RTPfvBJO940sNMPsCoP9 VboM+ZRMj3O5AJtyQopRt0I91+8+Dim4VcsQX9Jy68IxiC/QwRILeia97XZGd0zq10QQkN +j/2Yb/5bDh4sNwyACBxm03vwBa/WCo= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1691766522; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=2U70nrbKPKQlvdKn3PLPV7kvKLVHMvBT1WZo0dV9/p0=; b=ji00aX4MyPha2WJpgLS1Ee9FgOjr+SmD/wdgTNEEkrrPkXzb7jbtmq3AakykpTpDULFbSX XKbHy03dgNMUoeBw== Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by imap2.suse-dmz.suse.de (Postfix) with ESMTPS id 6B63E13592; Fri, 11 Aug 2023 15:08:40 +0000 (UTC) Received: from dovecot-director2.suse.de ([192.168.254.65]) by imap2.suse-dmz.suse.de with ESMTPSA id 4O2hDfhO1mS7KwAAMHmgww (envelope-from ); Fri, 11 Aug 2023 15:08:40 +0000 From: Fabiano Rosas To: qemu-devel@nongnu.org Cc: Juan Quintela , Peter Xu , Wei Wang , Leonardo Bras Subject: [PATCH v3 01/10] migration: Fix possible race when setting rp_state.error Date: Fri, 11 Aug 2023 12:08:27 -0300 Message-Id: <20230811150836.2895-2-farosas@suse.de> X-Mailer: git-send-email 2.35.3 In-Reply-To: <20230811150836.2895-1-farosas@suse.de> References: <20230811150836.2895-1-farosas@suse.de> 
MIME-Version: 1.0 Received-SPF: pass client-ip=195.135.220.28; envelope-from=farosas@suse.de; helo=smtp-out1.suse.de X-Spam_score_int: -43 X-Spam_score: -4.4 X-Spam_bar: ---- X-Spam_report: (-4.4 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_MED=-2.3, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org We don't need to set the rp_state.error right after a shutdown because qemu_file_shutdown() always sets the QEMUFile error, so the return path thread would have seen it and set the rp error itself. Setting the error outside of the thread is also racy because the thread could clear it after we set it. Signed-off-by: Fabiano Rosas Reviewed-by: Peter Xu --- migration/migration.c | 1 - 1 file changed, 1 deletion(-) diff --git a/migration/migration.c b/migration/migration.c index 5528acb65e..f88c86079c 100644 --- a/migration/migration.c +++ b/migration/migration.c @@ -2062,7 +2062,6 @@ static int await_return_path_close_on_source(MigrationState *ms) * waiting for the destination. 
*/ qemu_file_shutdown(ms->rp_state.from_dst_file); - mark_source_rp_bad(ms); } trace_await_return_path_close_on_source_joining(); qemu_thread_join(&ms->rp_state.rp_thread); From patchwork Fri Aug 11 15:08:28 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Fabiano Rosas X-Patchwork-Id: 1820314 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: legolas.ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=suse.de header.i=@suse.de header.a=rsa-sha256 header.s=susede2_rsa header.b=cndRm/K5; dkim=pass header.d=suse.de header.i=@suse.de header.a=ed25519-sha256 header.s=susede2_ed25519 header.b=FDVhUdw/; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-ECDSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4RMnJF2SY5z1yfH for ; Sat, 12 Aug 2023 01:09:17 +1000 (AEST) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qUTkw-0000Ye-Te; Fri, 11 Aug 2023 11:08:51 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qUTks-0000XS-Sv for qemu-devel@nongnu.org; Fri, 11 Aug 2023 11:08:47 -0400 Received: from smtp-out2.suse.de ([195.135.220.29]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1qUTkr-0006Gy-79 for qemu-devel@nongnu.org; Fri, 11 Aug 2023 11:08:46 -0400 Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de 
[192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by smtp-out2.suse.de (Postfix) with ESMTPS id E09CF1F890; Fri, 11 Aug 2023 15:08:43 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1691766523; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=oeOxeNBC49de/luEiFl9+VCrPoWsGucZKKBRosGSJO4=; b=cndRm/K5Udro8ct25MNmtZmJ5Q/TmsVy5FwYz84BkAEg2QHDl6RgdHdlA1BlpoPQmX/B0Q v1qv14zByIFmzqYKff8ZVgDOyFI5/sO2wvmPfPavu3gyuSp4ivZZWd3q1f00QmAinugFxI Bui2ffv0z3hQ1FWGT0PydLotxfp4jJ0= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1691766523; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=oeOxeNBC49de/luEiFl9+VCrPoWsGucZKKBRosGSJO4=; b=FDVhUdw/fRKr/I4EWNwjEF1x3jwoWUofrVEITheRY9hj5WRlzw+xPElVEn4ltyUkr2INV3 JMhiinxGmjkv4YDQ== Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by imap2.suse-dmz.suse.de (Postfix) with ESMTPS id 611FB13592; Fri, 11 Aug 2023 15:08:42 +0000 (UTC) Received: from dovecot-director2.suse.de ([192.168.254.65]) by imap2.suse-dmz.suse.de with ESMTPSA id MEvFCvpO1mS7KwAAMHmgww (envelope-from ); Fri, 11 Aug 2023 15:08:42 +0000 From: Fabiano Rosas To: qemu-devel@nongnu.org Cc: Juan Quintela , Peter Xu , Wei Wang , Leonardo Bras Subject: [PATCH v3 02/10] migration: Fix possible race when shutting return path Date: Fri, 11 Aug 2023 12:08:28 -0300 Message-Id: 
<20230811150836.2895-3-farosas@suse.de> X-Mailer: git-send-email 2.35.3 In-Reply-To: <20230811150836.2895-1-farosas@suse.de> References: <20230811150836.2895-1-farosas@suse.de> MIME-Version: 1.0 Received-SPF: pass client-ip=195.135.220.29; envelope-from=farosas@suse.de; helo=smtp-out2.suse.de X-Spam_score_int: -43 X-Spam_score: -4.4 X-Spam_bar: ---- X-Spam_report: (-4.4 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_MED=-2.3, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org We cannot call qemu_file_shutdown() on the return path file without taking the file lock. The return path thread could be running its cleanup code and have just cleared the pointer. This was caught by inspection; it should be rare, but the next patches will start calling this code from other places, so let's do the correct thing. Signed-off-by: Fabiano Rosas --- migration/migration.c | 20 +++++++++++--------- 1 file changed, 11 insertions(+), 9 deletions(-) diff --git a/migration/migration.c b/migration/migration.c index f88c86079c..0067c927fa 100644 --- a/migration/migration.c +++ b/migration/migration.c @@ -2052,17 +2052,19 @@ static int open_return_path_on_source(MigrationState *ms, static int await_return_path_close_on_source(MigrationState *ms) { /* - * If this is a normal exit then the destination will send a SHUT and the - * rp_thread will exit, however if there's an error we need to cause - * it to exit. + * If this is a normal exit then the destination will send a SHUT + * and the rp_thread will exit, however if there's an error we + * need to cause it to exit. 
shutdown(2), if we have it, will + * cause it to unblock if it's stuck waiting for the destination. */ - if (qemu_file_get_error(ms->to_dst_file) && ms->rp_state.from_dst_file) { - /* - * shutdown(2), if we have it, will cause it to unblock if it's stuck - * waiting for the destination. - */ - qemu_file_shutdown(ms->rp_state.from_dst_file); + if (qemu_file_get_error(ms->to_dst_file)) { + WITH_QEMU_LOCK_GUARD(&ms->qemu_file_lock) { + if (ms->rp_state.from_dst_file) { + qemu_file_shutdown(ms->rp_state.from_dst_file); + } + } } + trace_await_return_path_close_on_source_joining(); qemu_thread_join(&ms->rp_state.rp_thread); ms->rp_state.rp_thread_created = false; From patchwork Fri Aug 11 15:08:29 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Fabiano Rosas X-Patchwork-Id: 1820318 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: legolas.ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=suse.de header.i=@suse.de header.a=rsa-sha256 header.s=susede2_rsa header.b=fRAAbpFj; dkim=pass header.d=suse.de header.i=@suse.de header.a=ed25519-sha256 header.s=susede2_ed25519 header.b=vOVmr/n5; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-ECDSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4RMnKL2YKgz1yf7 for ; Sat, 12 Aug 2023 01:10:14 +1000 (AEST) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qUTl6-0000br-CY; Fri, 11 Aug 2023 11:09:00 -0400 Received: from eggs.gnu.org 
([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qUTkz-0000Zt-BC for qemu-devel@nongnu.org; Fri, 11 Aug 2023 11:08:53 -0400 Received: from smtp-out1.suse.de ([2001:67c:2178:6::1c]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1qUTkv-0006Ha-1j for qemu-devel@nongnu.org; Fri, 11 Aug 2023 11:08:51 -0400 Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by smtp-out1.suse.de (Postfix) with ESMTPS id E35C821887; Fri, 11 Aug 2023 15:08:45 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1691766525; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=Sdig4Xam+KaxFnZLj6qvoskpEIm5WZWhu/1QE/MF7bM=; b=fRAAbpFjVM1AauZWTUvg+MOX/+6px2YjBz3p1rpROVJArPXAn7hPSDSvLGEYYCHSF5ZoHr dxwvn6kZxdmlRP+fXtfuTVIAgQGcZJT3Esppx1YY+pemzFonim8R0iVimw92riiPswl9pZ ejGN5fWwJDrPaOGBFXHtyufeZbOj66E= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1691766525; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=Sdig4Xam+KaxFnZLj6qvoskpEIm5WZWhu/1QE/MF7bM=; b=vOVmr/n5X2isJoRzkhBUD5CTaXnIqd/s7L903QRDfKQxHFKwySyHrsJ/wOGjc00Kd1hGFi SCbobTNg7abN+eCA== Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by 
imap2.suse-dmz.suse.de (Postfix) with ESMTPS id 5772E13592; Fri, 11 Aug 2023 15:08:44 +0000 (UTC) Received: from dovecot-director2.suse.de ([192.168.254.65]) by imap2.suse-dmz.suse.de with ESMTPSA id AG2jCPxO1mS7KwAAMHmgww (envelope-from ); Fri, 11 Aug 2023 15:08:44 +0000 From: Fabiano Rosas To: qemu-devel@nongnu.org Cc: Juan Quintela , Peter Xu , Wei Wang , Leonardo Bras Subject: [PATCH v3 03/10] migration: Fix possible race when checking to_dst_file for errors Date: Fri, 11 Aug 2023 12:08:29 -0300 Message-Id: <20230811150836.2895-4-farosas@suse.de> X-Mailer: git-send-email 2.35.3 In-Reply-To: <20230811150836.2895-1-farosas@suse.de> References: <20230811150836.2895-1-farosas@suse.de> MIME-Version: 1.0 Received-SPF: pass client-ip=2001:67c:2178:6::1c; envelope-from=farosas@suse.de; helo=smtp-out1.suse.de X-Spam_score_int: -43 X-Spam_score: -4.4 X-Spam_bar: ---- X-Spam_report: (-4.4 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_MED=-2.3, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Checking ms->to_dst_file for errors when cleaning up the return path could race with migrate_fd_cleanup() which clears the pointer. Since migrate_fd_cleanup() is reachable via qmp_migrate(), which is issued by the user, it is safer if we take the lock when reading ms->to_dst_file. 
Signed-off-by: Fabiano Rosas --- migration/migration.c | 9 ++++----- 1 file changed, 4 insertions(+), 5 deletions(-) diff --git a/migration/migration.c b/migration/migration.c index 0067c927fa..85c171f32c 100644 --- a/migration/migration.c +++ b/migration/migration.c @@ -2057,11 +2057,10 @@ static int await_return_path_close_on_source(MigrationState *ms) * need to cause it to exit. shutdown(2), if we have it, will * cause it to unblock if it's stuck waiting for the destination. */ - if (qemu_file_get_error(ms->to_dst_file)) { - WITH_QEMU_LOCK_GUARD(&ms->qemu_file_lock) { - if (ms->rp_state.from_dst_file) { - qemu_file_shutdown(ms->rp_state.from_dst_file); - } + WITH_QEMU_LOCK_GUARD(&ms->qemu_file_lock) { + if (ms->to_dst_file && ms->rp_state.from_dst_file && + qemu_file_get_error(ms->to_dst_file)) { + qemu_file_shutdown(ms->rp_state.from_dst_file); } } From patchwork Fri Aug 11 15:08:30 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Fabiano Rosas X-Patchwork-Id: 1820315 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: legolas.ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=suse.de header.i=@suse.de header.a=rsa-sha256 header.s=susede2_rsa header.b=t8vx8ShN; dkim=pass header.d=suse.de header.i=@suse.de header.a=ed25519-sha256 header.s=susede2_ed25519 header.b=I3E5xfud; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-ECDSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4RMnJS62nwz1yf6 for ; Sat, 12 Aug 2023 01:09:28 +1000 (AEST) Received: 
from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qUTl5-0000ba-DB; Fri, 11 Aug 2023 11:09:00 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qUTkz-0000Zs-9w for qemu-devel@nongnu.org; Fri, 11 Aug 2023 11:08:53 -0400 Received: from smtp-out1.suse.de ([2001:67c:2178:6::1c]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1qUTkv-0006Hu-3k for qemu-devel@nongnu.org; Fri, 11 Aug 2023 11:08:51 -0400 Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by smtp-out1.suse.de (Postfix) with ESMTPS id CE7B62188D; Fri, 11 Aug 2023 15:08:47 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1691766527; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=TGQOHZA1EOlvWg+cTrtICJk/oIlwUWTaoHqhgAIgHuE=; b=t8vx8ShN4dS+OUzU4cVi1tvRwEx1J23jGljol5fvTyMr5zDIncmUnUYddtTfGApUOlgahM cd+xxsqYSQ1Des9gHiXOcuvhTSoUvvG4AN7I3LxEVziVEF9UfbvOCG2BGJfjLTacSMnJsw uGDUziFeaVtgFrAW5UDjAZAvGtq5Bx8= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1691766527; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=TGQOHZA1EOlvWg+cTrtICJk/oIlwUWTaoHqhgAIgHuE=; b=I3E5xfudieMG3Th5S5nkqgq3SFSvyCtrHsOGq4n8VyaaEonjMoa1S3a3+MDc6YQnk822u8 Zo2JLyXthOhdTSAg== Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de 
[192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by imap2.suse-dmz.suse.de (Postfix) with ESMTPS id 4D3BB13592; Fri, 11 Aug 2023 15:08:46 +0000 (UTC) Received: from dovecot-director2.suse.de ([192.168.254.65]) by imap2.suse-dmz.suse.de with ESMTPSA id 4GpPBv5O1mS7KwAAMHmgww (envelope-from ); Fri, 11 Aug 2023 15:08:46 +0000 From: Fabiano Rosas To: qemu-devel@nongnu.org Cc: Juan Quintela , Peter Xu , Wei Wang , Leonardo Bras Subject: [PATCH v3 04/10] migration: Fix possible race when shutting down to_dst_file Date: Fri, 11 Aug 2023 12:08:30 -0300 Message-Id: <20230811150836.2895-5-farosas@suse.de> X-Mailer: git-send-email 2.35.3 In-Reply-To: <20230811150836.2895-1-farosas@suse.de> References: <20230811150836.2895-1-farosas@suse.de> MIME-Version: 1.0 Received-SPF: pass client-ip=2001:67c:2178:6::1c; envelope-from=farosas@suse.de; helo=smtp-out1.suse.de X-Spam_score_int: -43 X-Spam_score: -4.4 X-Spam_bar: ---- X-Spam_report: (-4.4 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_MED=-2.3, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org It's not safe to call qemu_file_shutdown() on the to_dst_file without first checking for the file's presence under the lock. The cleanup of this file happens at postcopy_pause() and migrate_fd_cleanup() which are not necessarily running in the same thread as migrate_fd_cancel(). 
Signed-off-by: Fabiano Rosas Reviewed-by: Peter Xu --- migration/migration.c | 18 +++++++++++------- 1 file changed, 11 insertions(+), 7 deletions(-) diff --git a/migration/migration.c b/migration/migration.c index 85c171f32c..5e6a766235 100644 --- a/migration/migration.c +++ b/migration/migration.c @@ -1245,7 +1245,7 @@ static void migrate_fd_error(MigrationState *s, const Error *error) static void migrate_fd_cancel(MigrationState *s) { int old_state ; - QEMUFile *f = migrate_get_current()->to_dst_file; + trace_migrate_fd_cancel(); WITH_QEMU_LOCK_GUARD(&s->qemu_file_lock) { @@ -1271,11 +1271,13 @@ static void migrate_fd_cancel(MigrationState *s) * If we're unlucky the migration code might be stuck somewhere in a * send/write while the network has failed and is waiting to timeout; * if we've got shutdown(2) available then we can force it to quit. - * The outgoing qemu file gets closed in migrate_fd_cleanup that is - * called in a bh, so there is no race against this cancel. */ - if (s->state == MIGRATION_STATUS_CANCELLING && f) { - qemu_file_shutdown(f); + if (s->state == MIGRATION_STATUS_CANCELLING) { + WITH_QEMU_LOCK_GUARD(&s->qemu_file_lock) { + if (s->to_dst_file) { + qemu_file_shutdown(s->to_dst_file); + } + } } if (s->state == MIGRATION_STATUS_CANCELLING && s->block_inactive) { Error *local_err = NULL; @@ -1519,12 +1521,14 @@ void qmp_migrate_pause(Error **errp) { MigrationState *ms = migrate_get_current(); MigrationIncomingState *mis = migration_incoming_get_current(); - int ret; + int ret = 0; if (ms->state == MIGRATION_STATUS_POSTCOPY_ACTIVE) { /* Source side, during postcopy */ qemu_mutex_lock(&ms->qemu_file_lock); - ret = qemu_file_shutdown(ms->to_dst_file); + if (ms->to_dst_file) { + ret = qemu_file_shutdown(ms->to_dst_file); + } qemu_mutex_unlock(&ms->qemu_file_lock); if (ret) { error_setg(errp, "Failed to pause source migration"); From patchwork Fri Aug 11 15:08:31 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 
Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Fabiano Rosas X-Patchwork-Id: 1820319 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: legolas.ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=suse.de header.i=@suse.de header.a=rsa-sha256 header.s=susede2_rsa header.b=s7Bx98pA; dkim=pass header.d=suse.de header.i=@suse.de header.a=ed25519-sha256 header.s=susede2_ed25519 header.b=sRIZ7PIb; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-ECDSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4RMnKN54Njz1yf7 for ; Sat, 12 Aug 2023 01:10:16 +1000 (AEST) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qUTl7-0000cN-7H; Fri, 11 Aug 2023 11:09:01 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qUTl0-0000aK-RP for qemu-devel@nongnu.org; Fri, 11 Aug 2023 11:08:56 -0400 Received: from smtp-out1.suse.de ([2001:67c:2178:6::1c]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1qUTky-0006I8-GC for qemu-devel@nongnu.org; Fri, 11 Aug 2023 11:08:54 -0400 Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by smtp-out1.suse.de (Postfix) with ESMTPS id D970421885; Fri, 11 Aug 2023 15:08:49 +0000 
(UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1691766529; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=GRI8RaIpuUMZhKg3YNlyKHOCF6WdPu4dlf4zc+5ynKI=; b=s7Bx98pA/UT7MrfUH/4BPj/wqQHLlJNN6035FDOHAqWbAFL/8e1suyKc5DUS0BQL2hDMZE hW/YU/4WbmEAxSuKXUIOwWOf8KWfnhLE16u1fkRlIv35DXFU8e9NzbJbltrK0c0/onM1wK UlJ1LIC3D7onkxx+Q0k1B5r++4kqj20= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1691766529; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=GRI8RaIpuUMZhKg3YNlyKHOCF6WdPu4dlf4zc+5ynKI=; b=sRIZ7PIbS6xYB/O9eq9D03reJ+7+fhDHKEqMNyBmhe9Dw+thseGI5ifr08NQGifLsKEeXk ycRJd+rlg26Zr4CA== Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by imap2.suse-dmz.suse.de (Postfix) with ESMTPS id 492EA13592; Fri, 11 Aug 2023 15:08:48 +0000 (UTC) Received: from dovecot-director2.suse.de ([192.168.254.65]) by imap2.suse-dmz.suse.de with ESMTPSA id YLvVBABP1mS7KwAAMHmgww (envelope-from ); Fri, 11 Aug 2023 15:08:48 +0000 From: Fabiano Rosas To: qemu-devel@nongnu.org Cc: Juan Quintela , Peter Xu , Wei Wang , Leonardo Bras Subject: [PATCH v3 05/10] migration: Remove redundant cleanup of postcopy_qemufile_src Date: Fri, 11 Aug 2023 12:08:31 -0300 Message-Id: <20230811150836.2895-6-farosas@suse.de> X-Mailer: git-send-email 2.35.3 In-Reply-To: <20230811150836.2895-1-farosas@suse.de> References: <20230811150836.2895-1-farosas@suse.de> MIME-Version: 1.0 Received-SPF: pass client-ip=2001:67c:2178:6::1c; envelope-from=farosas@suse.de; 
helo=smtp-out1.suse.de X-Spam_score_int: -43 X-Spam_score: -4.4 X-Spam_bar: ---- X-Spam_report: (-4.4 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_MED=-2.3, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org This file is owned by the return path thread which is already doing cleanup. Signed-off-by: Fabiano Rosas Reviewed-by: Peter Xu --- migration/migration.c | 6 ------ 1 file changed, 6 deletions(-) diff --git a/migration/migration.c b/migration/migration.c index 5e6a766235..195726eb4a 100644 --- a/migration/migration.c +++ b/migration/migration.c @@ -1177,12 +1177,6 @@ static void migrate_fd_cleanup(MigrationState *s) qemu_fclose(tmp); } - if (s->postcopy_qemufile_src) { - migration_ioc_unregister_yank_from_file(s->postcopy_qemufile_src); - qemu_fclose(s->postcopy_qemufile_src); - s->postcopy_qemufile_src = NULL; - } - assert(!migration_is_active(s)); if (s->state == MIGRATION_STATUS_CANCELLING) { From patchwork Fri Aug 11 15:08:32 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Fabiano Rosas X-Patchwork-Id: 1820320 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: legolas.ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=suse.de header.i=@suse.de 
header.a=rsa-sha256 header.s=susede2_rsa header.b=hVPm9HJd; dkim=pass header.d=suse.de header.i=@suse.de header.a=ed25519-sha256 header.s=susede2_ed25519 header.b=PJNFP5MP; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-ECDSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4RMnKY5h0Mz1yf7 for ; Sat, 12 Aug 2023 01:10:25 +1000 (AEST) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qUTl7-0000cO-7v; Fri, 11 Aug 2023 11:09:01 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qUTl0-0000aL-Rk for qemu-devel@nongnu.org; Fri, 11 Aug 2023 11:08:56 -0400 Received: from smtp-out1.suse.de ([2001:67c:2178:6::1c]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1qUTky-0006Ir-W1 for qemu-devel@nongnu.org; Fri, 11 Aug 2023 11:08:54 -0400 Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by smtp-out1.suse.de (Postfix) with ESMTPS id C544721889; Fri, 11 Aug 2023 15:08:51 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1691766531; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=ZR0K8FF/5xgcwd7JjY88hQ9g+gfLupJHTQs3SfKo1Xk=; b=hVPm9HJd3ghc3OVfoiA/tpBe7gKa2DW8BwUZX1IJ3ufaR0B4LoTiLow2noTP+ByPWnb4w3 uiYouI8a/ReFsdfdSzrGIKQuqwdul6rCpSM7U/AN+ECnO7m12zAQUQkZwc9fnf2wuuPjF3 z05fmQd68i2MVgXvoqrgbVD0F+34c0o= DKIM-Signature: v=1; 
From: Fabiano Rosas
To: qemu-devel@nongnu.org
Cc: Juan Quintela, Peter Xu, Wei Wang, Leonardo Bras
Subject: [PATCH v3 06/10] migration: Consolidate return path closing code
Date: Fri, 11 Aug 2023 12:08:32 -0300
Message-Id: <20230811150836.2895-7-farosas@suse.de>
In-Reply-To: <20230811150836.2895-1-farosas@suse.de>
References: <20230811150836.2895-1-farosas@suse.de>

We'll start calling the await_return_path_close_on_source() function
from other parts of the code, so move all of the related checks and
tracepoints into it.

Signed-off-by: Fabiano Rosas
Reviewed-by: Peter Xu
---
 migration/migration.c | 29 ++++++++++++++---------------
 1 file changed, 14 insertions(+), 15 deletions(-)

diff --git a/migration/migration.c b/migration/migration.c
index 195726eb4a..4edbee3a5d 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -2049,6 +2049,14 @@ static int open_return_path_on_source(MigrationState *ms,
 /* Returns 0 if the RP was ok, otherwise there was an error on the RP */
 static int await_return_path_close_on_source(MigrationState *ms)
 {
+    int ret;
+
+    if (!ms->rp_state.rp_thread_created) {
+        return 0;
+    }
+
+    trace_migration_return_path_end_before();
+
     /*
      * If this is a normal exit then the destination will send a SHUT
      * and the rp_thread will exit, however if there's an error we
@@ -2066,7 +2074,10 @@ static int await_return_path_close_on_source(MigrationState *ms)
     qemu_thread_join(&ms->rp_state.rp_thread);
     ms->rp_state.rp_thread_created = false;
     trace_await_return_path_close_on_source_close();
-    return ms->rp_state.error;
+
+    ret = ms->rp_state.error;
+    trace_migration_return_path_end_after(ret);
+    return ret;
 }
 
 static inline void
@@ -2362,20 +2373,8 @@ static void migration_completion(MigrationState *s)
         goto fail;
     }
 
-    /*
-     * If rp was opened we must clean up the thread before
-     * cleaning everything else up (since if there are no failures
-     * it will wait for the destination to send it's status in
-     * a SHUT command).
-     */
-    if (s->rp_state.rp_thread_created) {
-        int rp_error;
-        trace_migration_return_path_end_before();
-        rp_error = await_return_path_close_on_source(s);
-        trace_migration_return_path_end_after(rp_error);
-        if (rp_error) {
-            goto fail;
-        }
+    if (await_return_path_close_on_source(s)) {
+        goto fail;
     }
 
     if (qemu_file_get_error(s->to_dst_file)) {

From patchwork Fri Aug 11 15:08:33 2023
X-Patchwork-Submitter: Fabiano Rosas
X-Patchwork-Id: 1820316
From: Fabiano Rosas
To: qemu-devel@nongnu.org
Cc: Juan Quintela, Peter Xu, Wei Wang, Leonardo Bras
Subject: [PATCH v3 07/10] migration: Replace the return path retry logic
Date: Fri, 11 Aug 2023 12:08:33 -0300
Message-Id: <20230811150836.2895-8-farosas@suse.de>
In-Reply-To: <20230811150836.2895-1-farosas@suse.de>
References: <20230811150836.2895-1-farosas@suse.de>

Replace the return path retry logic with finishing and restarting the
thread. This fixes a race when resuming the migration that leads to a
segfault.

Currently when doing postcopy we consider that an IO error on the
return path file could be due to a network intermittency. We then keep
the thread alive but have it do cleanup of the 'from_dst_file' and
wait on the 'postcopy_pause_rp' semaphore. When the user issues a
migrate resume, a new return path is opened and the thread is allowed
to continue.

There's a race condition in the above mechanism. It is possible for
the new return path file to be set up *before* the cleanup code in the
return path thread has had a chance to run, leading to the *new* file
being closed and the pointer set to NULL. When the thread is released
after the resume, it tries to dereference 'from_dst_file' and crashes:

Thread 7 "return path" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fffd1dbf700 (LWP 9611)]
0x00005555560e4893 in qemu_file_get_error_obj (f=0x0, errp=0x0) at ../migration/qemu-file.c:154
154         return f->last_error;
(gdb) bt
#0  0x00005555560e4893 in qemu_file_get_error_obj (f=0x0, errp=0x0) at ../migration/qemu-file.c:154
#1  0x00005555560e4983 in qemu_file_get_error (f=0x0) at ../migration/qemu-file.c:206
#2  0x0000555555b9a1df in source_return_path_thread (opaque=0x555556e06000) at ../migration/migration.c:1876
#3  0x000055555602e14f in qemu_thread_start (args=0x55555782e780) at ../util/qemu-thread-posix.c:541
#4  0x00007ffff38d76ea in start_thread (arg=0x7fffd1dbf700) at pthread_create.c:477
#5  0x00007ffff35efa6f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Here's the race (important bit is open_return_path happening before
migration_release_dst_files):

migration                 | qmp                         | return path
--------------------------+-----------------------------+---------------------------------
                            qmp_migrate_pause()
                            shutdown(ms->to_dst_file)
                            f->last_error = -EIO
migrate_detect_error()
postcopy_pause()
set_state(PAUSED)
wait(postcopy_pause_sem)
                            qmp_migrate(resume)
                            migrate_fd_connect()
                            resume = state == PAUSED
                            open_return_path <-- TOO SOON!
                            set_state(RECOVER)
                            post(postcopy_pause_sem)
                                                          (incoming closes to_src_file)
                                                          res = qemu_file_get_error(rp)
                                                          migration_release_dst_files()
                                                          ms->rp_state.from_dst_file = NULL
post(postcopy_pause_rp_sem)
                                                          postcopy_pause_return_path_thread()
                                                          wait(postcopy_pause_rp_sem)
                                                          rp = ms->rp_state.from_dst_file
                                                          goto retry
                                                          qemu_file_get_error(rp)
                                                          SIGSEGV
-------------------------------------------------------------------------------------------

We can keep the retry logic without having the thread alive and
waiting. The only piece of data used by it is the 'from_dst_file' and
it is only allowed to proceed after a migrate resume is issued and the
semaphore released at migrate_fd_connect().

Move the retry logic to outside the thread by waiting for the thread
to finish before pausing the migration.
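
For readers who want the idea in isolation, here is a minimal
standalone sketch of "join the thread before pausing, re-create it on
resume". It is not QEMU code: it uses plain pthreads, and every name in
it (struct channel, struct source_state, open_return_path,
await_return_path_close, pause_and_resume) is a hypothetical stand-in
chosen for illustration. The point it tries to show is that once the
worker has been joined, only the calling thread touches the channel
pointer, so a channel opened on resume cannot be closed by stale
cleanup code.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a return path channel; not a QEMUFile. */
struct channel {
    bool closed;
};

/* Hypothetical stand-in for the return path part of the migration state. */
struct source_state {
    bool rp_thread_created;
    bool rp_error;
    struct channel *from_dst;   /* set and cleared by the main thread only */
    pthread_t rp_thread;
};

/* The worker only reads the channel; it never closes or clears it. */
static void *return_path_thread(void *opaque)
{
    struct source_state *s = opaque;

    if (s->from_dst->closed) {
        s->rp_error = true;
    }
    return NULL;
}

/* Install the channel, then start a fresh worker for it. */
static int open_return_path(struct source_state *s, struct channel *ch)
{
    s->from_dst = ch;
    if (pthread_create(&s->rp_thread, NULL, return_path_thread, s) != 0) {
        s->from_dst = NULL;
        return -1;
    }
    s->rp_thread_created = true;
    return 0;
}

/* Join the worker (if any), then release the channel in the caller. */
static int await_return_path_close(struct source_state *s)
{
    int ret;

    if (!s->rp_thread_created) {
        return 0;
    }
    pthread_join(s->rp_thread, NULL);
    s->rp_thread_created = false;

    ret = s->rp_error ? -1 : 0;
    s->rp_error = false;
    s->from_dst = NULL;
    return ret;
}

/* Pause on a broken channel, then resume on a fresh one. 0 on success. */
int pause_and_resume(void)
{
    struct source_state s = { .rp_thread_created = false };
    struct channel broken = { .closed = true };
    struct channel fresh = { .closed = false };

    if (open_return_path(&s, &broken) != 0) {
        return -1;
    }
    /* Pausing: ignore the error and just wait for the thread to finish. */
    (void)await_return_path_close(&s);
    assert(s.from_dst == NULL && !s.rp_thread_created);

    /* Resuming: new channel, new thread; no old thread is left to race. */
    if (open_return_path(&s, &fresh) != 0) {
        return -1;
    }
    return await_return_path_close(&s);
}
```

In this sketch pthread_join() is what orders the worker's last access
before the pointer is cleared, which is the role the join inside
await_return_path_close_on_source() is assumed to play in the patch.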

Signed-off-by: Fabiano Rosas
Reviewed-by: Peter Xu
---
 migration/migration.c | 60 ++++++++-----------------------------------
 migration/migration.h |  1 -
 2 files changed, 11 insertions(+), 50 deletions(-)

diff --git a/migration/migration.c b/migration/migration.c
index 4edbee3a5d..7dfcbc3634 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -1775,18 +1775,6 @@ static void migrate_handle_rp_req_pages(MigrationState *ms, const char* rbname,
     }
 }
 
-/* Return true to retry, false to quit */
-static bool postcopy_pause_return_path_thread(MigrationState *s)
-{
-    trace_postcopy_pause_return_path();
-
-    qemu_sem_wait(&s->postcopy_pause_rp_sem);
-
-    trace_postcopy_pause_return_path_continued();
-
-    return true;
-}
-
 static int migrate_handle_rp_recv_bitmap(MigrationState *s, char *block_name)
 {
     RAMBlock *block = qemu_ram_block_by_name(block_name);
@@ -1870,7 +1858,6 @@ static void *source_return_path_thread(void *opaque)
     trace_source_return_path_thread_entry();
     rcu_register_thread();
 
-retry:
     while (!ms->rp_state.error && !qemu_file_get_error(rp) &&
            migration_is_setup_or_active(ms->state)) {
         trace_source_return_path_thread_loop_top();
@@ -1992,26 +1979,7 @@ retry:
     }
 
 out:
-    res = qemu_file_get_error(rp);
-    if (res) {
-        if (res && migration_in_postcopy()) {
-            /*
-             * Maybe there is something we can do: it looks like a
-             * network down issue, and we pause for a recovery.
-             */
-            migration_release_dst_files(ms);
-            rp = NULL;
-            if (postcopy_pause_return_path_thread(ms)) {
-                /*
-                 * Reload rp, reset the rest. Referencing it is safe since
-                 * it's reset only by us above, or when migration completes
-                 */
-                rp = ms->rp_state.from_dst_file;
-                ms->rp_state.error = false;
-                goto retry;
-            }
-        }
-
+    if (qemu_file_get_error(rp)) {
         trace_source_return_path_thread_bad_end();
         mark_source_rp_bad(ms);
     }
@@ -2022,8 +1990,7 @@ out:
     return NULL;
 }
 
-static int open_return_path_on_source(MigrationState *ms,
-                                      bool create_thread)
+static int open_return_path_on_source(MigrationState *ms)
 {
     ms->rp_state.from_dst_file = qemu_file_get_return_path(ms->to_dst_file);
     if (!ms->rp_state.from_dst_file) {
@@ -2032,11 +1999,6 @@ static int open_return_path_on_source(MigrationState *ms,
 
     trace_open_return_path_on_source();
 
-    if (!create_thread) {
-        /* We're done */
-        return 0;
-    }
-
     qemu_thread_create(&ms->rp_state.rp_thread, "return path",
                        source_return_path_thread, ms, QEMU_THREAD_JOINABLE);
     ms->rp_state.rp_thread_created = true;
@@ -2076,6 +2038,7 @@ static int await_return_path_close_on_source(MigrationState *ms)
     trace_await_return_path_close_on_source_close();
 
     ret = ms->rp_state.error;
+    ms->rp_state.error = false;
     trace_migration_return_path_end_after(ret);
     return ret;
 }
@@ -2551,6 +2514,13 @@ static MigThrError postcopy_pause(MigrationState *s)
         qemu_file_shutdown(file);
         qemu_fclose(file);
 
+        /*
+         * We're already pausing, so ignore any errors on the return
+         * path and just wait for the thread to finish. It will be
+         * re-created when we resume.
+         */
+        await_return_path_close_on_source(s);
+
         migrate_set_state(&s->state, s->state,
                           MIGRATION_STATUS_POSTCOPY_PAUSED);
 
@@ -2568,12 +2538,6 @@ static MigThrError postcopy_pause(MigrationState *s)
         if (s->state == MIGRATION_STATUS_POSTCOPY_RECOVER) {
             /* Woken up by a recover procedure. Give it a shot */
 
-            /*
-             * Firstly, let's wake up the return path now, with a new
-             * return path channel.
-             */
-            qemu_sem_post(&s->postcopy_pause_rp_sem);
-
             /* Do the resume logic */
             if (postcopy_do_resume(s) == 0) {
                 /* Let's continue! */
@@ -3263,7 +3227,7 @@ void migrate_fd_connect(MigrationState *s, Error *error_in)
      * QEMU uses the return path.
      */
     if (migrate_postcopy_ram() || migrate_return_path()) {
-        if (open_return_path_on_source(s, !resume)) {
+        if (open_return_path_on_source(s)) {
             error_setg(&local_err, "Unable to open return-path for postcopy");
             migrate_set_state(&s->state, s->state, MIGRATION_STATUS_FAILED);
             migrate_set_error(s, local_err);
@@ -3327,7 +3291,6 @@ static void migration_instance_finalize(Object *obj)
     qemu_sem_destroy(&ms->rate_limit_sem);
     qemu_sem_destroy(&ms->pause_sem);
     qemu_sem_destroy(&ms->postcopy_pause_sem);
-    qemu_sem_destroy(&ms->postcopy_pause_rp_sem);
     qemu_sem_destroy(&ms->rp_state.rp_sem);
     qemu_sem_destroy(&ms->rp_state.rp_pong_acks);
     qemu_sem_destroy(&ms->postcopy_qemufile_src_sem);
@@ -3347,7 +3310,6 @@ static void migration_instance_init(Object *obj)
     migrate_params_init(&ms->parameters);
 
     qemu_sem_init(&ms->postcopy_pause_sem, 0);
-    qemu_sem_init(&ms->postcopy_pause_rp_sem, 0);
     qemu_sem_init(&ms->rp_state.rp_sem, 0);
     qemu_sem_init(&ms->rp_state.rp_pong_acks, 0);
     qemu_sem_init(&ms->rate_limit_sem, 0);
diff --git a/migration/migration.h b/migration/migration.h
index 6eea18db36..36eb5ba70b 100644
--- a/migration/migration.h
+++ b/migration/migration.h
@@ -382,7 +382,6 @@ struct MigrationState {
 
     /* Needed by postcopy-pause state */
     QemuSemaphore postcopy_pause_sem;
-    QemuSemaphore postcopy_pause_rp_sem;
     /*
      * Whether we abort the migration if decompression errors are
      * detected at the destination. It is left at false for qemu

From patchwork Fri Aug 11 15:08:34 2023
X-Patchwork-Submitter: Fabiano Rosas
X-Patchwork-Id: 1820321
From: Fabiano Rosas
To: qemu-devel@nongnu.org
Cc: Juan Quintela, Peter Xu, Wei Wang, Leonardo Bras
Subject: [PATCH v3 08/10] migration: Move return path cleanup to main migration thread
Date: Fri, 11 Aug 2023 12:08:34 -0300
Message-Id: <20230811150836.2895-9-farosas@suse.de>
In-Reply-To: <20230811150836.2895-1-farosas@suse.de>
References: <20230811150836.2895-1-farosas@suse.de>

Now that the return path thread is allowed to finish during a paused
migration, we can move the cleanup of the QEMUFiles to the main
migration thread.

Signed-off-by: Fabiano Rosas
Reviewed-by: Peter Xu
---
 migration/migration.c | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/migration/migration.c b/migration/migration.c
index 7dfcbc3634..7fec57ad7f 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -98,6 +98,7 @@ static int migration_maybe_pause(MigrationState *s,
                                  int *current_active_state,
                                  int new_state);
 static void migrate_fd_cancel(MigrationState *s);
+static int await_return_path_close_on_source(MigrationState *s);
 
 static bool migration_needs_multiple_sockets(void)
 {
@@ -1177,6 +1178,12 @@ static void migrate_fd_cleanup(MigrationState *s)
         qemu_fclose(tmp);
     }
 
+    /*
+     * We already cleaned up to_dst_file, so errors from the return
+     * path might be due to that, ignore them.
+     */
+    await_return_path_close_on_source(s);
+
     assert(!migration_is_active(s));
 
     if (s->state == MIGRATION_STATUS_CANCELLING) {
@@ -1985,7 +1992,6 @@ out:
     }
 
     trace_source_return_path_thread_end();
-    migration_release_dst_files(ms);
     rcu_unregister_thread();
     return NULL;
 }
@@ -2039,6 +2045,9 @@ static int await_return_path_close_on_source(MigrationState *ms)
 
     ret = ms->rp_state.error;
     ms->rp_state.error = false;
+
+    migration_release_dst_files(ms);
+
     trace_migration_return_path_end_after(ret);
     return ret;
 }

From patchwork Fri Aug 11 15:08:35 2023
X-Patchwork-Submitter: Fabiano Rosas
X-Patchwork-Id: 1820322
From: Fabiano Rosas
To: qemu-devel@nongnu.org
Cc: Juan Quintela, Peter Xu, Wei Wang, Leonardo Bras
Subject: [PATCH v3 09/10] migration: Be consistent about shutdown of source shared files
Date: Fri, 11 Aug 2023 12:08:35 -0300
Message-Id: <20230811150836.2895-10-farosas@suse.de>
In-Reply-To: <20230811150836.2895-1-farosas@suse.de>
References: <20230811150836.2895-1-farosas@suse.de>

When doing cleanup, we currently close() some of the shared migration
files and shutdown() + close() others. Be consistent by always calling
shutdown() before close().

Do this only for the source files for now because the source runs
multiple threads which could cause races between the two calls. Having
them together allows us to move them to a centralized place under the
protection of a lock in the next patch.

Signed-off-by: Fabiano Rosas
---
 migration/migration.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/migration/migration.c b/migration/migration.c
index 7fec57ad7f..4df5ca25c1 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -1175,6 +1175,7 @@ static void migrate_fd_cleanup(MigrationState *s)
          * critical section won't block for long.
          */
         migration_ioc_unregister_yank_from_file(tmp);
+        qemu_file_shutdown(tmp);
         qemu_fclose(tmp);
     }
 
@@ -1844,6 +1845,7 @@ static void migration_release_dst_files(MigrationState *ms)
         ms->postcopy_qemufile_src = NULL;
     }
 
+    qemu_file_shutdown(file);
     qemu_fclose(file);
 }
 

From patchwork Fri Aug 11 15:08:36 2023
X-Patchwork-Submitter: Fabiano Rosas
X-Patchwork-Id: 1820313
From: Fabiano Rosas
To: qemu-devel@nongnu.org
Cc: Juan Quintela, Peter Xu, Wei Wang, Leonardo Bras, Lukas Straub
Subject: [PATCH v3 10/10] migration: Add a wrapper to cleanup migration files
Date: Fri, 11 Aug 2023 12:08:36 -0300
Message-Id: <20230811150836.2895-11-farosas@suse.de>
In-Reply-To: <20230811150836.2895-1-farosas@suse.de>
References: <20230811150836.2895-1-farosas@suse.de>

We currently have a pattern for cleaning up a migration QEMUFile:

  qemu_mutex_lock(&s->qemu_file_lock);
  file = s->file_name;
  s->file_name = NULL;
  qemu_mutex_unlock(&s->qemu_file_lock);

  migration_ioc_unregister_yank_from_file(file);
  qemu_file_shutdown(file);
  qemu_fclose(file);

There are some considerations for this sequence:

- we must clear the pointer under the lock, to avoid TOCTOU bugs;
- the shutdown() and close() expect to be given a non-null parameter;
- a close() in one thread should not race with a shutdown() in another;

Create a wrapper function to make sure everything works correctly.

Note: the return path did not previously call
migration_ioc_unregister_yank_from_file(), but I added it
nonetheless for uniformity.
Signed-off-by: Fabiano Rosas
---
 migration/migration.c | 92 ++++++++++++------------------------------
 util/yank.c           |  2 -
 2 files changed, 26 insertions(+), 68 deletions(-)

diff --git a/migration/migration.c b/migration/migration.c
index 4df5ca25c1..3c33e4fae4 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -217,6 +217,27 @@ MigrationIncomingState *migration_incoming_get_current(void)
     return current_incoming;
 }
 
+static void migration_file_release(QEMUFile **file)
+{
+    MigrationState *ms = migrate_get_current();
+    QEMUFile *tmp;
+
+    /*
+     * Reset the pointer before releasing it to avoid holding the lock
+     * for too long.
+     */
+    WITH_QEMU_LOCK_GUARD(&ms->qemu_file_lock) {
+        tmp = *file;
+        *file = NULL;
+    }
+
+    if (tmp) {
+        migration_ioc_unregister_yank_from_file(tmp);
+        qemu_file_shutdown(tmp);
+        qemu_fclose(tmp);
+    }
+}
+
 void migration_incoming_transport_cleanup(MigrationIncomingState *mis)
 {
     if (mis->socket_address_list) {
@@ -1155,8 +1176,6 @@ static void migrate_fd_cleanup(MigrationState *s)
     qemu_savevm_state_cleanup();
 
     if (s->to_dst_file) {
-        QEMUFile *tmp;
-
         trace_migrate_fd_cleanup();
         qemu_mutex_unlock_iothread();
         if (s->migration_thread_running) {
@@ -1166,17 +1185,7 @@ static void migrate_fd_cleanup(MigrationState *s)
         qemu_mutex_lock_iothread();
 
         multifd_save_cleanup();
-        qemu_mutex_lock(&s->qemu_file_lock);
-        tmp = s->to_dst_file;
-        s->to_dst_file = NULL;
-        qemu_mutex_unlock(&s->qemu_file_lock);
-        /*
-         * Close the file handle without the lock to make sure the
-         * critical section won't block for long.
-         */
-        migration_ioc_unregister_yank_from_file(tmp);
-        qemu_file_shutdown(tmp);
-        qemu_fclose(tmp);
+        migration_file_release(&s->to_dst_file);
     }
 
     /*
@@ -1816,39 +1825,6 @@ static int migrate_handle_rp_resume_ack(MigrationState *s, uint32_t value)
     return 0;
 }
 
-/*
- * Release ms->rp_state.from_dst_file (and postcopy_qemufile_src if
- * existed) in a safe way.
- */
-static void migration_release_dst_files(MigrationState *ms)
-{
-    QEMUFile *file;
-
-    WITH_QEMU_LOCK_GUARD(&ms->qemu_file_lock) {
-        /*
-         * Reset the from_dst_file pointer first before releasing it, as we
-         * can't block within lock section
-         */
-        file = ms->rp_state.from_dst_file;
-        ms->rp_state.from_dst_file = NULL;
-    }
-
-    /*
-     * Do the same to postcopy fast path socket too if there is. No
-     * locking needed because this qemufile should only be managed by
-     * return path thread.
-     */
-    if (ms->postcopy_qemufile_src) {
-        migration_ioc_unregister_yank_from_file(ms->postcopy_qemufile_src);
-        qemu_file_shutdown(ms->postcopy_qemufile_src);
-        qemu_fclose(ms->postcopy_qemufile_src);
-        ms->postcopy_qemufile_src = NULL;
-    }
-
-    qemu_file_shutdown(file);
-    qemu_fclose(file);
-}
-
 /*
  * Handles messages sent on the return path towards the source VM
  *
@@ -2048,7 +2024,8 @@ static int await_return_path_close_on_source(MigrationState *ms)
     ret = ms->rp_state.error;
     ms->rp_state.error = false;
 
-    migration_release_dst_files(ms);
+    migration_file_release(&ms->rp_state.from_dst_file);
+    migration_file_release(&ms->postcopy_qemufile_src);
 
     trace_migration_return_path_end_after(ret);
     return ret;
@@ -2504,26 +2481,9 @@ static MigThrError postcopy_pause(MigrationState *s)
     assert(s->state == MIGRATION_STATUS_POSTCOPY_ACTIVE);
 
     while (true) {
-        QEMUFile *file;
-
-        /*
-         * Current channel is possibly broken. Release it. Note that this is
-         * guaranteed even without lock because to_dst_file should only be
-         * modified by the migration thread. That also guarantees that the
-         * unregister of yank is safe too without the lock. It should be safe
-         * even to be within the qemu_file_lock, but we didn't do that to avoid
-         * taking more mutex (yank_lock) within qemu_file_lock. TL;DR: we make
-         * the qemu_file_lock critical section as small as possible.
-         */
+        /* Current channel is possibly broken. Release it. */
         assert(s->to_dst_file);
-        migration_ioc_unregister_yank_from_file(s->to_dst_file);
-        qemu_mutex_lock(&s->qemu_file_lock);
-        file = s->to_dst_file;
-        s->to_dst_file = NULL;
-        qemu_mutex_unlock(&s->qemu_file_lock);
-
-        qemu_file_shutdown(file);
-        qemu_fclose(file);
+        migration_file_release(&s->to_dst_file);
 
         /*
          * We're already pausing, so ignore any errors on the return
diff --git a/util/yank.c b/util/yank.c
index abf47c346d..4b6afbf589 100644
--- a/util/yank.c
+++ b/util/yank.c
@@ -146,8 +146,6 @@ void yank_unregister_function(const YankInstance *instance,
             return;
         }
     }
-
-    abort();
 }
 
 void qmp_yank(YankInstanceList *instances,
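
For readers without a QEMU tree at hand, the pattern this patch wraps (take the pointer and clear it under the lock, then shut down and close outside the lock, skipping NULL) can be sketched in standalone C. This is only an illustration under stated assumptions: `Resource`, `Owner`, `resource_shutdown()`, `resource_close()` and `owner_release()` are hypothetical names, a plain pthread mutex stands in for `qemu_file_lock`, and none of this is QEMU code.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for QEMUFile; not the real QEMU type. */
typedef struct Resource {
    int shutdown_calls;   /* counts calls to resource_shutdown() */
    int *closed_flag;     /* set to 1 when the resource is closed */
} Resource;

/* Hypothetical stand-in for MigrationState. */
typedef struct Owner {
    pthread_mutex_t lock; /* plays the role of qemu_file_lock */
    Resource *res;        /* plays the role of s->to_dst_file */
} Owner;

/* Like shutdown(): expects a non-NULL argument. */
static void resource_shutdown(Resource *r)
{
    r->shutdown_calls++;
}

/* Like close(): expects a non-NULL argument and frees the object. */
static void resource_close(Resource *r)
{
    if (r->closed_flag) {
        *r->closed_flag = 1;
    }
    free(r);
}

/*
 * Release the resource stored in *slot.
 * Returns 1 if a resource was released, 0 if the slot was already empty.
 */
static int owner_release(Owner *o, Resource **slot)
{
    Resource *tmp;

    assert(o != NULL && slot != NULL);

    /* Take the pointer and clear the slot in one critical section. */
    pthread_mutex_lock(&o->lock);
    tmp = *slot;
    *slot = NULL;
    pthread_mutex_unlock(&o->lock);

    /* Another caller may have released it first; nothing left to do. */
    if (tmp == NULL) {
        return 0;
    }

    /* Potentially slow teardown happens outside the lock. */
    resource_shutdown(tmp);
    resource_close(tmp);
    return 1;
}
```

Because the slot is swapped to NULL while the lock is held, at most one caller obtains the non-NULL pointer, so in this sketch a second `owner_release()` on the same slot returns 0 instead of shutting down or closing the object again.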