From patchwork Fri Feb 2 19:11:24 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Fabiano Rosas X-Patchwork-Id: 1894736 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=suse.de header.i=@suse.de header.a=rsa-sha256 header.s=susede2_rsa header.b=EkTl6e5T; dkim=pass header.d=suse.de header.i=@suse.de header.a=ed25519-sha256 header.s=susede2_ed25519 header.b=ZpObsHaX; dkim=pass (1024-bit key) header.d=suse.de header.i=@suse.de header.a=rsa-sha256 header.s=susede2_rsa header.b=SWLoGxui; dkim=neutral header.d=suse.de header.i=@suse.de header.a=ed25519-sha256 header.s=susede2_ed25519 header.b=HLZB66Mw; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=patchwork.ozlabs.org) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-ECDSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4TRQQd0HKrz23gx for ; Sat, 3 Feb 2024 06:12:56 +1100 (AEDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1rVyww-00020Q-Fd; Fri, 02 Feb 2024 14:11:42 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rVywu-0001zi-Fm for qemu-devel@nongnu.org; Fri, 02 Feb 2024 14:11:40 -0500 Received: from smtp-out2.suse.de ([2a07:de40:b251:101:10:150:64:2]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1rVywr-0001vj-Vb for qemu-devel@nongnu.org; Fri, 02 Feb 2024 14:11:39 -0500 Received: from imap1.dmz-prg2.suse.org (imap1.dmz-prg2.suse.org [10.150.64.97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by smtp-out2.suse.de (Postfix) with ESMTPS id 4DAE51FD32; Fri, 2 Feb 2024 19:11:34 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1706901096; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=s4kRO8p6DEl/drZgDKdPAZmCp+mm2aFo3U2Xjp+yQZ8=; b=EkTl6e5T+uMLZBp8D3LZVDyAxBEuH0q4L+gQnw98r/bWQUZYK0ASxzUo0EF1nWTZJLmfkr RM0gEi6dXK5kjpeMT27zQgZRJ+BAltBXaiv1KsEWsSZha0XWYQe9WZlaeiQW80F+j+7c9w j/m9u3zDPyXbdlHW0mOOukVRV3VWx78= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1706901096; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=s4kRO8p6DEl/drZgDKdPAZmCp+mm2aFo3U2Xjp+yQZ8=; b=ZpObsHaXHmQcBoxcBEXi8X5qlzA5W80eXLXIUOSroQH4i474TuJKDu40i/5WW3ec7nPMrK wnIP2P0ROYeRuMDg== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1706901094; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=s4kRO8p6DEl/drZgDKdPAZmCp+mm2aFo3U2Xjp+yQZ8=; b=SWLoGxuiAdmSu8UhVQqb74xVyP9jNaFrXRHGwZJTKcy0IBliY+DwO/UanCi4roSMwOAHEX HiALMsM9xMQ+CMuxFCqOF6vc/mGTLqEMOAyTw5gEWFilSh7d5YyWNftStb/AC6fbObxxEV Z+GLxECyN1NXVrsgPeV16Vp7Y4jhzpg= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1706901094; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=s4kRO8p6DEl/drZgDKdPAZmCp+mm2aFo3U2Xjp+yQZ8=; b=HLZB66Mwr7Jrj4ZCT6Ea7tMyJFwtWRW+FtpVUhFCb2MzvCzB1oOgsatUF2yX3DbdHDwK+9 3CL+/5M9sbXMOFBg== Received: from imap1.dmz-prg2.suse.org (localhost [127.0.0.1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by imap1.dmz-prg2.suse.org (Postfix) with ESMTPS id 26CD713A5F; Fri, 2 Feb 2024 19:11:32 +0000 (UTC) Received: from dovecot-director2.suse.de ([10.150.64.162]) by imap1.dmz-prg2.suse.org with ESMTPSA id +IxwN2Q+vWWlZAAAD6G6ig (envelope-from ); Fri, 02 Feb 2024 19:11:32 +0000 From: Fabiano Rosas To: qemu-devel@nongnu.org Cc: Peter Xu , Avihai Horon Subject: [PATCH 1/5] migration/multifd: Join the TLS thread Date: Fri, 2 Feb 2024 16:11:24 -0300 Message-Id: <20240202191128.1901-2-farosas@suse.de> X-Mailer: git-send-email 2.35.3 In-Reply-To: <20240202191128.1901-1-farosas@suse.de> References: <20240202191128.1901-1-farosas@suse.de> MIME-Version: 1.0 Authentication-Results: smtp-out2.suse.de; none X-Spamd-Result: default: False [4.90 / 50.00]; ARC_NA(0.00)[]; RCVD_VIA_SMTP_AUTH(0.00)[]; FROM_HAS_DN(0.00)[]; RCPT_COUNT_THREE(0.00)[3]; R_MISSING_CHARSET(2.50)[]; TO_MATCH_ENVRCPT_ALL(0.00)[]; MIME_GOOD(-0.10)[text/plain]; BROKEN_CONTENT_TYPE(1.50)[]; TO_DN_SOME(0.00)[]; RCVD_COUNT_THREE(0.00)[3]; DKIM_SIGNED(0.00)[suse.de:s=susede2_rsa,suse.de:s=susede2_ed25519]; MID_CONTAINS_FROM(1.00)[]; DBL_BLOCKED_OPENRESOLVER(0.00)[suse.de:email]; FUZZY_BLOCKED(0.00)[rspamd.com]; FROM_EQ_ENVFROM(0.00)[]; MIME_TRACE(0.00)[0:+]; RCVD_TLS_ALL(0.00)[]; BAYES_HAM(-0.00)[10.42%] X-Spam-Score: 4.90 Received-SPF: pass client-ip=2a07:de40:b251:101:10:150:64:2; envelope-from=farosas@suse.de; helo=smtp-out2.suse.de X-Spam_score_int: -43 X-Spam_score: -4.4 X-Spam_bar: ---- X-Spam_report: (-4.4 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_MED=-2.3, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org We're currently leaking the resources of the TLS thread by not joining it and also overwriting the p->thread pointer altogether. Signed-off-by: Fabiano Rosas Reviewed-by: Peter Xu --- migration/multifd.c | 8 +++++++- migration/multifd.h | 2 ++ 2 files changed, 9 insertions(+), 1 deletion(-) diff --git a/migration/multifd.c b/migration/multifd.c index 25cbc6dc6b..b557d046a9 100644 --- a/migration/multifd.c +++ b/migration/multifd.c @@ -524,6 +524,10 @@ void multifd_save_cleanup(void) for (i = 0; i < migrate_multifd_channels(); i++) { MultiFDSendParams *p = &multifd_send_state->params[i]; + if (p->tls_thread_created) { + qemu_thread_join(&p->tls_thread); + } + if (p->running) { qemu_thread_join(&p->thread); } @@ -827,7 +831,9 @@ static bool multifd_tls_channel_connect(MultiFDSendParams *p, trace_multifd_tls_outgoing_handshake_start(ioc, tioc, hostname); qio_channel_set_name(QIO_CHANNEL(tioc), "multifd-tls-outgoing"); p->c = QIO_CHANNEL(tioc); - qemu_thread_create(&p->thread, "multifd-tls-handshake-worker", + + p->tls_thread_created = true; + qemu_thread_create(&p->tls_thread, "multifd-tls-handshake-worker", multifd_tls_handshake_thread, p, QEMU_THREAD_JOINABLE); return true; diff --git a/migration/multifd.h b/migration/multifd.h index 35d11f103c..5a69ef40e2 100644 --- a/migration/multifd.h +++ b/migration/multifd.h @@ -73,6 +73,8 @@ typedef struct { char *name; /* channel thread id */ QemuThread thread; + QemuThread tls_thread; + bool tls_thread_created; /* communication channel */ QIOChannel *c; /* is the yank function registered */ From patchwork Fri Feb 2 19:11:25 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Fabiano Rosas X-Patchwork-Id: 1894740 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=suse.de header.i=@suse.de header.a=rsa-sha256 header.s=susede2_rsa header.b=eM3aN3lZ; dkim=pass header.d=suse.de header.i=@suse.de header.a=ed25519-sha256 header.s=susede2_ed25519 header.b=8f9jml4z; dkim=pass (1024-bit key) header.d=suse.de header.i=@suse.de header.a=rsa-sha256 header.s=susede2_rsa header.b=ATeJYyMH; dkim=neutral header.d=suse.de header.i=@suse.de header.a=ed25519-sha256 header.s=susede2_ed25519 header.b=a+tREucW; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=patchwork.ozlabs.org) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-ECDSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4TRQQg2zMpz23gx for ; Sat, 3 Feb 2024 06:12:59 +1100 (AEDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1rVywx-000212-Dq; Fri, 02 Feb 2024 14:11:43 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rVywv-000206-A7 for qemu-devel@nongnu.org; Fri, 02 Feb 2024 14:11:41 -0500 Received: from smtp-out1.suse.de ([2a07:de40:b251:101:10:150:64:1]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1rVywt-0001vm-Hk for qemu-devel@nongnu.org; Fri, 02 Feb 2024 14:11:41 -0500 Received: from imap1.dmz-prg2.suse.org (imap1.dmz-prg2.suse.org [10.150.64.97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by smtp-out1.suse.de (Postfix) with ESMTPS id 2D91922134; Fri, 2 Feb 2024 19:11:36 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1706901097; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=kl5qNQwTxzOz/0av9OOfER+ITJDQQMvx4JeU3LRyQY4=; b=eM3aN3lZ74pfl99zGdmIWwtDj764//8ur9FMJl1OLsjbYR2mSx5ESoDdmmL3/7JLDutwxh 5EiSy+ioY3KPMbP8BG9H+XNMDLV7tmcmfViycDtOfN6R3xCpLHCYIY0QgqOGceKNdJ6Vvt hw6BjW0BLvaMb7IJmvI7SCnM4YXnna0= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1706901097; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=kl5qNQwTxzOz/0av9OOfER+ITJDQQMvx4JeU3LRyQY4=; b=8f9jml4ztmewpJE5b6Gb7l75sSHAZmcNZTwiJuWqwJe4rrQK8IVvz91rNKTzUz/lPfNyk6 x2EAK5j0VcTy26Cw== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1706901096; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=kl5qNQwTxzOz/0av9OOfER+ITJDQQMvx4JeU3LRyQY4=; b=ATeJYyMH2/8t+y+1EGGDvPqrhEM9IVQQDOWdMNWFhgsiYoV71QxvhMgERGVM7SSDbU1gaT Eyxamc6cIg1MXSqX0M4opIo/qlL+1bsh3Cr5cf5+kczUGYoL0rzpFYM9eV5mtaLMVZu0Yv vaNAu5bMbFQFYJvCJSOKgZN9GCl5tko= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1706901096; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=kl5qNQwTxzOz/0av9OOfER+ITJDQQMvx4JeU3LRyQY4=; b=a+tREucW7VQxMuqmAYVP7lUbVxMItqRziXpyH1DKH2I+jHqxyUgvcB4qfakzF6CXg/mLDF uJIS81lSyUTf+1Dw== Received: from imap1.dmz-prg2.suse.org (localhost [127.0.0.1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by imap1.dmz-prg2.suse.org (Postfix) with ESMTPS id C0918139AB; Fri, 2 Feb 2024 19:11:34 +0000 (UTC) Received: from dovecot-director2.suse.de ([10.150.64.162]) by imap1.dmz-prg2.suse.org with ESMTPSA id SKN/IWY+vWWlZAAAD6G6ig (envelope-from ); Fri, 02 Feb 2024 19:11:34 +0000 From: Fabiano Rosas To: qemu-devel@nongnu.org Cc: Peter Xu , Avihai Horon , chenyuhui5@huawei.com Subject: [PATCH 2/5] migration/multifd: Remove p->running Date: Fri, 2 Feb 2024 16:11:25 -0300 Message-Id: <20240202191128.1901-3-farosas@suse.de> X-Mailer: git-send-email 2.35.3 In-Reply-To: <20240202191128.1901-1-farosas@suse.de> References: <20240202191128.1901-1-farosas@suse.de> MIME-Version: 1.0 Authentication-Results: smtp-out1.suse.de; none X-Spam-Score: 0.70 X-Spamd-Result: default: False [0.70 / 50.00]; ARC_NA(0.00)[]; RCVD_VIA_SMTP_AUTH(0.00)[]; FROM_HAS_DN(0.00)[]; RCPT_COUNT_THREE(0.00)[4]; R_MISSING_CHARSET(2.50)[]; TO_MATCH_ENVRCPT_ALL(0.00)[]; MIME_GOOD(-0.10)[text/plain]; BROKEN_CONTENT_TYPE(1.50)[]; TO_DN_SOME(0.00)[]; NEURAL_HAM_LONG(-1.00)[-1.000]; RCVD_COUNT_THREE(0.00)[3]; DKIM_SIGNED(0.00)[suse.de:s=susede2_rsa,suse.de:s=susede2_ed25519]; NEURAL_HAM_SHORT(-0.20)[-0.998]; MID_CONTAINS_FROM(1.00)[]; DBL_BLOCKED_OPENRESOLVER(0.00)[huawei.com:email,nvidia.com:email,suse.de:email]; FUZZY_BLOCKED(0.00)[rspamd.com]; FROM_EQ_ENVFROM(0.00)[]; MIME_TRACE(0.00)[0:+]; RCVD_TLS_ALL(0.00)[]; BAYES_HAM(-3.00)[100.00%] Received-SPF: pass client-ip=2a07:de40:b251:101:10:150:64:1; envelope-from=farosas@suse.de; helo=smtp-out1.suse.de X-Spam_score_int: -43 X-Spam_score: -4.4 X-Spam_bar: ---- X-Spam_report: (-4.4 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_MED=-2.3, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org We currently only need p->running to avoid calling qemu_thread_join() on a non existent thread if the thread has never been created. However, there are at least two bugs in this logic: 1) On the sending side, p->running is set too early and qemu_thread_create() can be skipped due to an error during TLS handshake, leaving the flag set and leading to a crash when multifd_save_cleanup() calls qemu_thread_join(). 2) During exit, the multifd thread clears the flag while holding the channel lock. The counterpart at multifd_save_cleanup() reads the flag outside of the lock and might free the mutex while the multifd thread still has it locked. Fix the first issue by setting the flag right before creating the thread. Rename it from p->running to p->thread_created to clarify its usage. Fix the second issue by not clearing the flag at the multifd thread exit. We don't have any use for that. Note that these bugs are straight-forward logic issues and not race conditions. There is still a gap for races to affect this code due to multifd_save_cleanup() being allowed to run concurrently with the thread creation loop. This issue is solved in the next patch. Fixes: 29647140157a ("migration/tls: add support for multifd tls-handshake") Reported-by: Avihai Horon Reported-by: Signed-off-by: Fabiano Rosas Reviewed-by: Peter Xu --- migration/multifd.c | 30 ++++++++++++------------------ migration/multifd.h | 7 ++----- 2 files changed, 14 insertions(+), 23 deletions(-) diff --git a/migration/multifd.c b/migration/multifd.c index b557d046a9..402b7fd776 100644 --- a/migration/multifd.c +++ b/migration/multifd.c @@ -528,7 +528,7 @@ void multifd_save_cleanup(void) qemu_thread_join(&p->tls_thread); } - if (p->running) { + if (p->thread_created) { qemu_thread_join(&p->thread); } } @@ -759,10 +759,6 @@ out: error_free(local_err); } - qemu_mutex_lock(&p->mutex); - p->running = false; - qemu_mutex_unlock(&p->mutex); - rcu_unregister_thread(); migration_threads_remove(thread); trace_multifd_send_thread_end(p->id, p->num_packets, p->total_normal_pages); @@ -859,6 +855,8 @@ static bool multifd_channel_connect(MultiFDSendParams *p, migration_ioc_register_yank(ioc); p->registered_yank = true; p->c = ioc; + + p->thread_created = true; qemu_thread_create(&p->thread, p->name, multifd_send_thread, p, QEMU_THREAD_JOINABLE); return true; @@ -890,7 +888,6 @@ static void multifd_new_send_channel_async(QIOTask *task, gpointer opaque) trace_multifd_new_send_channel_async(p->id); if (!qio_task_propagate_error(task, &local_err)) { qio_channel_set_delay(ioc, false); - p->running = true; if (multifd_channel_connect(p, ioc, &local_err)) { return; } @@ -1030,15 +1027,15 @@ void multifd_load_cleanup(void) for (i = 0; i < migrate_multifd_channels(); i++) { MultiFDRecvParams *p = &multifd_recv_state->params[i]; - if (p->running) { - /* - * multifd_recv_thread may hung at MULTIFD_FLAG_SYNC handle code, - * however try to wakeup it without harm in cleanup phase. - */ - qemu_sem_post(&p->sem_sync); - } + /* + * multifd_recv_thread may hung at MULTIFD_FLAG_SYNC handle code, + * however try to wakeup it without harm in cleanup phase. + */ + qemu_sem_post(&p->sem_sync); - qemu_thread_join(&p->thread); + if (p->thread_created) { + qemu_thread_join(&p->thread); + } } for (i = 0; i < migrate_multifd_channels(); i++) { MultiFDRecvParams *p = &multifd_recv_state->params[i]; @@ -1148,9 +1145,6 @@ static void *multifd_recv_thread(void *opaque) multifd_recv_terminate_threads(local_err); error_free(local_err); } - qemu_mutex_lock(&p->mutex); - p->running = false; - qemu_mutex_unlock(&p->mutex); rcu_unregister_thread(); trace_multifd_recv_thread_end(p->id, p->num_packets, p->total_normal_pages); @@ -1258,7 +1252,7 @@ void multifd_recv_new_channel(QIOChannel *ioc, Error **errp) /* initial packet */ p->num_packets = 1; - p->running = true; + p->thread_created = true; qemu_thread_create(&p->thread, p->name, multifd_recv_thread, p, QEMU_THREAD_JOINABLE); qatomic_inc(&multifd_recv_state->count); diff --git a/migration/multifd.h b/migration/multifd.h index 5a69ef40e2..917833c309 100644 --- a/migration/multifd.h +++ b/migration/multifd.h @@ -73,6 +73,7 @@ typedef struct { char *name; /* channel thread id */ QemuThread thread; + bool thread_created; QemuThread tls_thread; bool tls_thread_created; /* communication channel */ @@ -95,8 +96,6 @@ typedef struct { /* this mutex protects the following parameters */ QemuMutex mutex; - /* is this channel thread running */ - bool running; /* should this thread finish */ bool quit; /* multifd flags for each packet */ @@ -144,6 +143,7 @@ typedef struct { char *name; /* channel thread id */ QemuThread thread; + bool thread_created; /* communication channel */ QIOChannel *c; /* packet allocated len */ @@ -158,8 +158,6 @@ typedef struct { /* this mutex protects the following parameters */ QemuMutex mutex; - /* is this channel thread running */ - bool running; /* should this thread finish */ bool quit; /* multifd flags for each packet */ @@ -209,4 +207,3 @@ typedef struct { void multifd_register_ops(int method, MultiFDMethods *ops); #endif - From patchwork Fri Feb 2 19:11:26 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Fabiano Rosas X-Patchwork-Id: 1894737 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=suse.de header.i=@suse.de header.a=rsa-sha256 header.s=susede2_rsa header.b=Azvm8FHe; dkim=pass header.d=suse.de header.i=@suse.de header.a=ed25519-sha256 header.s=susede2_ed25519 header.b=v/QWBo8K; dkim=pass (1024-bit key) header.d=suse.de header.i=@suse.de header.a=rsa-sha256 header.s=susede2_rsa header.b=Azvm8FHe; dkim=neutral header.d=suse.de header.i=@suse.de header.a=ed25519-sha256 header.s=susede2_ed25519 header.b=v/QWBo8K; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=patchwork.ozlabs.org) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-ECDSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4TRQQd0ppbz23h9 for ; Sat, 3 Feb 2024 06:12:57 +1100 (AEDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1rVyx0-00021e-4U; Fri, 02 Feb 2024 14:11:46 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rVyww-00020m-WC for qemu-devel@nongnu.org; Fri, 02 Feb 2024 14:11:43 -0500 Received: from smtp-out2.suse.de ([2a07:de40:b251:101:10:150:64:2]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1rVywv-0001vp-63 for qemu-devel@nongnu.org; Fri, 02 Feb 2024 14:11:42 -0500 Received: from imap1.dmz-prg2.suse.org (imap1.dmz-prg2.suse.org [IPv6:2a07:de40:b281:104:10:150:64:97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by smtp-out2.suse.de (Postfix) with ESMTPS id C3D291F79A; Fri, 2 Feb 2024 19:11:37 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1706901097; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=1bFbNsiqRzuD5JCYVR0z5Z8cOAqseAA0ZutzGeWC6ho=; b=Azvm8FHeioDMPCus3CUSpfAL3uPf9ZfUBRdRnUAa/jmHFkBbU0xuODdE4NTwsLtIJJxu6B U/+jK2QLaLtiYx1hA6VnpLzfnX4tbZi+hzRlctrw7Rkdqwe+u2hYStDaBYQkygTxUBWbq7 eJ5nX0vIn4pv68YEKHNgm+ccRCjLHtg= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1706901097; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=1bFbNsiqRzuD5JCYVR0z5Z8cOAqseAA0ZutzGeWC6ho=; b=v/QWBo8KDdwDq+V/1Lb+dCoDPq1v2Atv0UG6Kml09uXWYmC81UGVjDwQH5TlgCCEUsfdc3 IEX27Ixans2XOmCA== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1706901097; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=1bFbNsiqRzuD5JCYVR0z5Z8cOAqseAA0ZutzGeWC6ho=; b=Azvm8FHeioDMPCus3CUSpfAL3uPf9ZfUBRdRnUAa/jmHFkBbU0xuODdE4NTwsLtIJJxu6B U/+jK2QLaLtiYx1hA6VnpLzfnX4tbZi+hzRlctrw7Rkdqwe+u2hYStDaBYQkygTxUBWbq7 eJ5nX0vIn4pv68YEKHNgm+ccRCjLHtg= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1706901097; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=1bFbNsiqRzuD5JCYVR0z5Z8cOAqseAA0ZutzGeWC6ho=; b=v/QWBo8KDdwDq+V/1Lb+dCoDPq1v2Atv0UG6Kml09uXWYmC81UGVjDwQH5TlgCCEUsfdc3 IEX27Ixans2XOmCA== Received: from imap1.dmz-prg2.suse.org (localhost [127.0.0.1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by imap1.dmz-prg2.suse.org (Postfix) with ESMTPS id 9D8F713A5F; Fri, 2 Feb 2024 19:11:36 +0000 (UTC) Received: from dovecot-director2.suse.de ([10.150.64.162]) by imap1.dmz-prg2.suse.org with ESMTPSA id iDQgGWg+vWWlZAAAD6G6ig (envelope-from ); Fri, 02 Feb 2024 19:11:36 +0000 From: Fabiano Rosas To: qemu-devel@nongnu.org Cc: Peter Xu , Avihai Horon Subject: [PATCH 3/5] migration/multifd: Move multifd_save_setup error handling in to the function Date: Fri, 2 Feb 2024 16:11:26 -0300 Message-Id: <20240202191128.1901-4-farosas@suse.de> X-Mailer: git-send-email 2.35.3 In-Reply-To: <20240202191128.1901-1-farosas@suse.de> References: <20240202191128.1901-1-farosas@suse.de> MIME-Version: 1.0 Authentication-Results: smtp-out2.suse.de; dkim=pass header.d=suse.de header.s=susede2_rsa header.b=Azvm8FHe; dkim=pass header.d=suse.de header.s=susede2_ed25519 header.b="v/QWBo8K" X-Spamd-Result: default: False [1.69 / 50.00]; ARC_NA(0.00)[]; RCVD_VIA_SMTP_AUTH(0.00)[]; R_DKIM_ALLOW(-0.20)[suse.de:s=susede2_rsa,suse.de:s=susede2_ed25519]; FROM_HAS_DN(0.00)[]; RCPT_COUNT_THREE(0.00)[3]; R_MISSING_CHARSET(2.50)[]; TO_MATCH_ENVRCPT_ALL(0.00)[]; MIME_GOOD(-0.10)[text/plain]; BROKEN_CONTENT_TYPE(1.50)[]; TO_DN_SOME(0.00)[]; RCVD_COUNT_THREE(0.00)[3]; DKIM_SIGNED(0.00)[suse.de:s=susede2_rsa,suse.de:s=susede2_ed25519]; DKIM_TRACE(0.00)[suse.de:+]; MX_GOOD(-0.01)[]; MID_CONTAINS_FROM(1.00)[]; DBL_BLOCKED_OPENRESOLVER(0.00)[suse.de:dkim,suse.de:email]; FUZZY_BLOCKED(0.00)[rspamd.com]; FROM_EQ_ENVFROM(0.00)[]; MIME_TRACE(0.00)[0:+]; RCVD_TLS_ALL(0.00)[]; BAYES_HAM(-3.00)[100.00%] X-Rspamd-Server: rspamd1.dmz-prg2.suse.org X-Spam-Score: 1.69 X-Rspamd-Queue-Id: C3D291F79A X-Spamd-Bar: + Received-SPF: pass client-ip=2a07:de40:b251:101:10:150:64:2; envelope-from=farosas@suse.de; helo=smtp-out2.suse.de X-Spam_score_int: -43 X-Spam_score: -4.4 X-Spam_bar: ---- X-Spam_report: (-4.4 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_MED=-2.3, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Hide the error handling inside multifd_save_setup to make it cleaner for the next patch to move the function around. Signed-off-by: Fabiano Rosas Reviewed-by: Peter Xu --- migration/migration.c | 6 +----- migration/multifd.c | 24 +++++++++++++++++------- migration/multifd.h | 2 +- 3 files changed, 19 insertions(+), 13 deletions(-) diff --git a/migration/migration.c b/migration/migration.c index d5f705ceef..55abb175cc 100644 --- a/migration/migration.c +++ b/migration/migration.c @@ -3623,11 +3623,7 @@ void migrate_fd_connect(MigrationState *s, Error *error_in) return; } - if (multifd_save_setup(&local_err) != 0) { - migrate_set_error(s, local_err); - error_report_err(local_err); - migrate_set_state(&s->state, MIGRATION_STATUS_SETUP, - MIGRATION_STATUS_FAILED); + if (!multifd_save_setup()) { migrate_fd_cleanup(s); return; } diff --git a/migration/multifd.c b/migration/multifd.c index 402b7fd776..1851206352 100644 --- a/migration/multifd.c +++ b/migration/multifd.c @@ -902,14 +902,16 @@ static void multifd_new_send_channel_create(gpointer opaque) socket_send_channel_create(multifd_new_send_channel_async, opaque); } -int multifd_save_setup(Error **errp) +bool multifd_save_setup(void) { - int thread_count; + MigrationState *s = migrate_get_current(); + Error *local_err = NULL; + int thread_count, ret = 0; uint32_t page_count = MULTIFD_PACKET_SIZE / qemu_target_page_size(); uint8_t i; if (!migrate_multifd()) { - return 0; + return true; } thread_count = migrate_multifd_channels(); @@ -953,14 +955,22 @@ int multifd_save_setup(Error **errp) for (i = 0; i < thread_count; i++) { MultiFDSendParams *p = &multifd_send_state->params[i]; - int ret; - ret = multifd_send_state->ops->send_setup(p, errp); + ret = multifd_send_state->ops->send_setup(p, &local_err); if (ret) { - return ret; + break; } } - return 0; + + if (ret) { + migrate_set_error(s, local_err); + error_report_err(local_err); + migrate_set_state(&s->state, MIGRATION_STATUS_SETUP, + MIGRATION_STATUS_FAILED); + return false; + } + + return true; } struct { diff --git a/migration/multifd.h b/migration/multifd.h index 917833c309..4acbd65e91 100644 --- a/migration/multifd.h +++ b/migration/multifd.h @@ -13,7 +13,7 @@ #ifndef QEMU_MIGRATION_MULTIFD_H #define QEMU_MIGRATION_MULTIFD_H -int multifd_save_setup(Error **errp); +bool multifd_save_setup(void); void multifd_save_cleanup(void); int multifd_load_setup(Error **errp); void multifd_load_cleanup(void); From patchwork Fri Feb 2 19:11:27 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Fabiano Rosas X-Patchwork-Id: 1894738 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=suse.de header.i=@suse.de header.a=rsa-sha256 header.s=susede2_rsa header.b=ZN32PkQx; dkim=pass header.d=suse.de header.i=@suse.de header.a=ed25519-sha256 header.s=susede2_ed25519 header.b=V1ckMp7n; dkim=pass (1024-bit key) header.d=suse.de header.i=@suse.de header.a=rsa-sha256 header.s=susede2_rsa header.b=ZN32PkQx; dkim=neutral header.d=suse.de header.i=@suse.de header.a=ed25519-sha256 header.s=susede2_ed25519 header.b=V1ckMp7n; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=patchwork.ozlabs.org) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-ECDSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4TRQQd1ktLz23hB for ; Sat, 3 Feb 2024 06:12:57 +1100 (AEDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1rVywx-000216-FD; Fri, 02 Feb 2024 14:11:43 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rVyww-00020b-LG for qemu-devel@nongnu.org; Fri, 02 Feb 2024 14:11:42 -0500 Received: from smtp-out2.suse.de ([2a07:de40:b251:101:10:150:64:2]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1rVywu-0001w6-Il for qemu-devel@nongnu.org; Fri, 02 Feb 2024 14:11:42 -0500 Received: from imap1.dmz-prg2.suse.org (imap1.dmz-prg2.suse.org [10.150.64.97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by smtp-out2.suse.de (Postfix) with ESMTPS id 64AF71F799; Fri, 2 Feb 2024 19:11:39 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1706901099; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=at+EqeXQ1TVmnDh7OeIM9ZBOhhXdwvyEyD7BG2BPYhM=; b=ZN32PkQxoeEPF2gd2BSvhtitWxgJUyK2Q/PKyK7G2Fi2Yog/WgCXMGdahY9iij+rFUiw/I 1uU5pEIF/I+Le6s6eGYp7AVA82wW8IHnp+EWWBQCNSQzvwl8kAO0S9xxQ2Hq0332N2QuBg orBgdtQ1ERv8m4dCtcSnfs2T6fEYynE= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1706901099; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=at+EqeXQ1TVmnDh7OeIM9ZBOhhXdwvyEyD7BG2BPYhM=; b=V1ckMp7nTn8A/RUpkRF0JLRkWX6pR4sHR957vOx0Cq9o0X36jnM5/4EayYOm2STLeOaqCM lZKmvG2+2eepCGCg== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1706901099; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=at+EqeXQ1TVmnDh7OeIM9ZBOhhXdwvyEyD7BG2BPYhM=; b=ZN32PkQxoeEPF2gd2BSvhtitWxgJUyK2Q/PKyK7G2Fi2Yog/WgCXMGdahY9iij+rFUiw/I 1uU5pEIF/I+Le6s6eGYp7AVA82wW8IHnp+EWWBQCNSQzvwl8kAO0S9xxQ2Hq0332N2QuBg orBgdtQ1ERv8m4dCtcSnfs2T6fEYynE= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1706901099; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=at+EqeXQ1TVmnDh7OeIM9ZBOhhXdwvyEyD7BG2BPYhM=; b=V1ckMp7nTn8A/RUpkRF0JLRkWX6pR4sHR957vOx0Cq9o0X36jnM5/4EayYOm2STLeOaqCM lZKmvG2+2eepCGCg== Received: from imap1.dmz-prg2.suse.org (localhost [127.0.0.1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by imap1.dmz-prg2.suse.org (Postfix) with ESMTPS id 3FDAB139AB; Fri, 2 Feb 2024 19:11:38 +0000 (UTC) Received: from dovecot-director2.suse.de ([10.150.64.162]) by imap1.dmz-prg2.suse.org with ESMTPSA id yH0mAmo+vWWlZAAAD6G6ig (envelope-from ); Fri, 02 Feb 2024 19:11:38 +0000 From: Fabiano Rosas To: qemu-devel@nongnu.org Cc: Peter Xu , Avihai Horon Subject: [PATCH 4/5] migration/multifd: Move multifd_save_setup into migration thread Date: Fri, 2 Feb 2024 16:11:27 -0300 Message-Id: <20240202191128.1901-5-farosas@suse.de> X-Mailer: git-send-email 2.35.3 In-Reply-To: <20240202191128.1901-1-farosas@suse.de> References: <20240202191128.1901-1-farosas@suse.de> MIME-Version: 1.0 Authentication-Results: smtp-out2.suse.de; none X-Spam-Score: 3.65 X-Spamd-Result: default: False [3.65 / 50.00]; ARC_NA(0.00)[]; RCVD_VIA_SMTP_AUTH(0.00)[]; FROM_HAS_DN(0.00)[]; RCPT_COUNT_THREE(0.00)[3]; R_MISSING_CHARSET(2.50)[]; TO_MATCH_ENVRCPT_ALL(0.00)[]; MIME_GOOD(-0.10)[text/plain]; BROKEN_CONTENT_TYPE(1.50)[]; TO_DN_SOME(0.00)[]; NEURAL_HAM_LONG(-1.00)[-1.000]; RCVD_COUNT_THREE(0.00)[3]; DKIM_SIGNED(0.00)[suse.de:s=susede2_rsa,suse.de:s=susede2_ed25519]; NEURAL_HAM_SHORT(-0.20)[-0.998]; MID_CONTAINS_FROM(1.00)[]; DBL_BLOCKED_OPENRESOLVER(0.00)[suse.de:email]; FUZZY_BLOCKED(0.00)[rspamd.com]; FROM_EQ_ENVFROM(0.00)[]; MIME_TRACE(0.00)[0:+]; RCVD_TLS_ALL(0.00)[]; BAYES_HAM(-0.05)[59.69%] Received-SPF: pass client-ip=2a07:de40:b251:101:10:150:64:2; envelope-from=farosas@suse.de; helo=smtp-out2.suse.de X-Spam_score_int: -43 X-Spam_score: -4.4 X-Spam_bar: ---- X-Spam_report: (-4.4 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_MED=-2.3, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org We currently have an unfavorable situation around multifd channels creation and the migration thread execution. We create the multifd channels with qio_channel_socket_connect_async -> qio_task_run_in_thread, but only connect them at the multifd_new_send_channel_async callback, called from qio_task_complete, which is registered as a glib event. So at multifd_save_setup() we create the channels, but they will only be actually usable after the whole multifd_save_setup() calling stack returns back to the main loop. Which means that the migration thread is already up and running without any possibility for the multifd channels to be ready on time. We currently rely on the channels-ready semaphore blocking multifd_send_sync_main() until channels start to come up and release it. However there have been bugs recently found when a channel's creation fails and multifd_save_cleanup() is allowed to run while other channels are still being created. Let's start to organize this situation by moving the multifd_save_setup() call into the migration thread. That way we unblock the main-loop to dispatch the completion callbacks and actually have a chance of getting the multifd channels ready for when the migration thread needs them. The next patches will deal with the synchronization aspects. Note that this takes multifd_save_setup() out of the BQL. Signed-off-by: Fabiano Rosas Reviewed-by: Peter Xu --- migration/migration.c | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/migration/migration.c b/migration/migration.c index 55abb175cc..c14d12497f 100644 --- a/migration/migration.c +++ b/migration/migration.c @@ -3315,6 +3315,10 @@ static void *migration_thread(void *opaque) object_ref(OBJECT(s)); update_iteration_initial_status(s); + if (!multifd_save_setup()) { + goto out; + } + bql_lock(); qemu_savevm_state_header(s->to_dst_file); bql_unlock(); @@ -3386,6 +3390,7 @@ static void *migration_thread(void *opaque) urgent = migration_rate_limit(); } +out: trace_migration_thread_after_loop(); migration_iteration_finish(s); object_unref(OBJECT(s)); @@ -3623,11 +3628,6 @@ void migrate_fd_connect(MigrationState *s, Error *error_in) return; } - if (!multifd_save_setup()) { - migrate_fd_cleanup(s); - return; - } - if (migrate_background_snapshot()) { qemu_thread_create(&s->thread, "bg_snapshot", bg_migration_thread, s, QEMU_THREAD_JOINABLE); From patchwork Fri Feb 2 19:11:28 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Fabiano Rosas X-Patchwork-Id: 1894735 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=suse.de header.i=@suse.de header.a=rsa-sha256 header.s=susede2_rsa header.b=u+xxoSi5; dkim=pass header.d=suse.de header.i=@suse.de header.a=ed25519-sha256 header.s=susede2_ed25519 header.b=11kmaQHZ; dkim=pass (1024-bit key) header.d=suse.de header.i=@suse.de header.a=rsa-sha256 header.s=susede2_rsa header.b=u+xxoSi5; dkim=neutral header.d=suse.de header.i=@suse.de header.a=ed25519-sha256 header.s=susede2_ed25519 header.b=11kmaQHZ; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=patchwork.ozlabs.org) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-ECDSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4TRQQd1mwRz23hD for ; Sat, 3 Feb 2024 06:12:57 +1100 (AEDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1rVyx0-00021i-LM; Fri, 02 Feb 2024 14:11:46 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rVywy-00021Q-UY for qemu-devel@nongnu.org; Fri, 02 Feb 2024 14:11:44 -0500 Received: from smtp-out1.suse.de ([195.135.223.130]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1rVywx-0001wO-4n for qemu-devel@nongnu.org; Fri, 02 Feb 2024 14:11:44 -0500 Received: from imap1.dmz-prg2.suse.org (imap1.dmz-prg2.suse.org [10.150.64.97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by smtp-out1.suse.de (Postfix) with ESMTPS id 0B24622177; Fri, 2 Feb 2024 19:11:41 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1706901101; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=fam93RtCT55Zr/k9jnh+9/piZGeCT1ahQwnSNnFryzY=; b=u+xxoSi5d1dtA5NHXgolQpbiUdIudJXj8cyTUuq64cVbIMefhUHJlA2pFiLasLI0JgpGqi MxHk7XWxGpqOrrbTvdFh3z4G0VgBP4IUFenIiOdCil8H85cDpK+ms8Os5iiwTKH6URoMyF ahsZKALR5QiBOE6OeWcNz8L/K6te+00= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1706901101; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=fam93RtCT55Zr/k9jnh+9/piZGeCT1ahQwnSNnFryzY=; b=11kmaQHZEKwlgWIL3RZeysww+fHZK2qD/BTPdVfCgufWXBK2G4A7syfI7ohkeTBKkLYAvw Sj6xr7OP5K4PzODA== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1706901101; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=fam93RtCT55Zr/k9jnh+9/piZGeCT1ahQwnSNnFryzY=; b=u+xxoSi5d1dtA5NHXgolQpbiUdIudJXj8cyTUuq64cVbIMefhUHJlA2pFiLasLI0JgpGqi MxHk7XWxGpqOrrbTvdFh3z4G0VgBP4IUFenIiOdCil8H85cDpK+ms8Os5iiwTKH6URoMyF ahsZKALR5QiBOE6OeWcNz8L/K6te+00= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1706901101; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=fam93RtCT55Zr/k9jnh+9/piZGeCT1ahQwnSNnFryzY=; b=11kmaQHZEKwlgWIL3RZeysww+fHZK2qD/BTPdVfCgufWXBK2G4A7syfI7ohkeTBKkLYAvw Sj6xr7OP5K4PzODA== Received: from imap1.dmz-prg2.suse.org (localhost [127.0.0.1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by imap1.dmz-prg2.suse.org (Postfix) with ESMTPS id D760B139AB; Fri, 2 Feb 2024 19:11:39 +0000 (UTC) Received: from dovecot-director2.suse.de ([10.150.64.162]) by imap1.dmz-prg2.suse.org with ESMTPSA id MPN+Jms+vWWlZAAAD6G6ig (envelope-from ); Fri, 02 Feb 2024 19:11:39 +0000 From: Fabiano Rosas To: qemu-devel@nongnu.org Cc: Peter Xu , Avihai Horon Subject: [PATCH 5/5] migration/multifd: Add a synchronization point for channel creation Date: Fri, 2 Feb 2024 16:11:28 -0300 Message-Id: <20240202191128.1901-6-farosas@suse.de> X-Mailer: git-send-email 2.35.3 In-Reply-To: <20240202191128.1901-1-farosas@suse.de> References: <20240202191128.1901-1-farosas@suse.de> MIME-Version: 1.0 Authentication-Results: smtp-out1.suse.de; none X-Spam-Score: 0.70 X-Spamd-Result: default: False [0.70 / 50.00]; ARC_NA(0.00)[]; RCVD_VIA_SMTP_AUTH(0.00)[]; FROM_HAS_DN(0.00)[]; RCPT_COUNT_THREE(0.00)[3]; R_MISSING_CHARSET(2.50)[]; TO_MATCH_ENVRCPT_ALL(0.00)[]; MIME_GOOD(-0.10)[text/plain]; BROKEN_CONTENT_TYPE(1.50)[]; TO_DN_SOME(0.00)[]; NEURAL_HAM_LONG(-1.00)[-1.000]; RCVD_COUNT_THREE(0.00)[3]; DKIM_SIGNED(0.00)[suse.de:s=susede2_rsa,suse.de:s=susede2_ed25519]; NEURAL_HAM_SHORT(-0.20)[-0.998]; MID_CONTAINS_FROM(1.00)[]; DBL_BLOCKED_OPENRESOLVER(0.00)[suse.de:email]; FUZZY_BLOCKED(0.00)[rspamd.com]; FROM_EQ_ENVFROM(0.00)[]; MIME_TRACE(0.00)[0:+]; RCVD_TLS_ALL(0.00)[]; BAYES_HAM(-3.00)[100.00%] Received-SPF: pass client-ip=195.135.223.130; envelope-from=farosas@suse.de; helo=smtp-out1.suse.de X-Spam_score_int: -43 X-Spam_score: -4.4 X-Spam_bar: ---- X-Spam_report: (-4.4 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_MED=-2.3, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org It is possible that one of the multifd channels fails to be created at multifd_new_send_channel_async() while the rest of the channel creation tasks are still in flight. This could lead to multifd_save_cleanup() executing the qemu_thread_join() loop too early and not waiting for the threads which haven't been created yet, leading to the freeing of resources that the newly created threads will try to access and crash. Add a synchronization point after which there will be no attempts at thread creation and therefore calling multifd_save_cleanup() past that point will ensure it properly waits for the threads. A note about performance: Prior to this patch, if a channel took too long to be established, other channels could finish connecting first and already start taking load. Now we're bounded by the slowest-connecting channel. Signed-off-by: Fabiano Rosas --- migration/multifd.c | 67 +++++++++++++++++++++++++-------------------- 1 file changed, 37 insertions(+), 30 deletions(-) diff --git a/migration/multifd.c b/migration/multifd.c index 1851206352..888ac8b05d 100644 --- a/migration/multifd.c +++ b/migration/multifd.c @@ -360,6 +360,11 @@ struct { MultiFDPages_t *pages; /* global number of generated multifd packets */ uint64_t packet_num; + /* + * Synchronization point past which no more channels will be + * created. + */ + QemuSemaphore channels_created; /* send channels ready */ QemuSemaphore channels_ready; /* @@ -561,6 +566,7 @@ void multifd_save_cleanup(void) error_free(local_err); } } + qemu_sem_destroy(&multifd_send_state->channels_created); qemu_sem_destroy(&multifd_send_state->channels_ready); g_free(multifd_send_state->params); multifd_send_state->params = NULL; @@ -787,13 +793,6 @@ static void multifd_tls_outgoing_handshake(QIOTask *task, trace_multifd_tls_outgoing_handshake_error(ioc, error_get_pretty(err)); migrate_set_error(migrate_get_current(), err); - /* - * Error happen, mark multifd_send_thread status as 'quit' although it - * is not created, and then tell who pay attention to me. - */ - p->quit = true; - qemu_sem_post(&multifd_send_state->channels_ready); - qemu_sem_post(&p->sem_sync); error_free(err); } @@ -862,39 +861,37 @@ static bool multifd_channel_connect(MultiFDSendParams *p, return true; } -static void multifd_new_send_channel_cleanup(MultiFDSendParams *p, - QIOChannel *ioc, Error *err) -{ - migrate_set_error(migrate_get_current(), err); - /* Error happen, we need to tell who pay attention to me */ - qemu_sem_post(&multifd_send_state->channels_ready); - qemu_sem_post(&p->sem_sync); - /* - * Although multifd_send_thread is not created, but main migration - * thread need to judge whether it is running, so we need to mark - * its status. - */ - p->quit = true; - object_unref(OBJECT(ioc)); - error_free(err); -} - static void multifd_new_send_channel_async(QIOTask *task, gpointer opaque) { MultiFDSendParams *p = opaque; QIOChannel *ioc = QIO_CHANNEL(qio_task_get_source(task)); Error *local_err = NULL; + bool ret; trace_multifd_new_send_channel_async(p->id); - if (!qio_task_propagate_error(task, &local_err)) { - qio_channel_set_delay(ioc, false); - if (multifd_channel_connect(p, ioc, &local_err)) { - return; - } + + if (qio_task_propagate_error(task, &local_err)) { + ret = false; + goto out; + } + + qio_channel_set_delay(ioc, false); + ret = multifd_channel_connect(p, ioc, &local_err); + +out: + /* + * Here we're not interested whether creation succeeded, only that + * it happened at all. + */ + qemu_sem_post(&multifd_send_state->channels_created); + if (ret) { + return; } trace_multifd_new_send_channel_async_error(p->id, local_err); - multifd_new_send_channel_cleanup(p, ioc, local_err); + migrate_set_error(migrate_get_current(), local_err); + object_unref(OBJECT(ioc)); + error_free(local_err); } static void multifd_new_send_channel_create(gpointer opaque) @@ -918,6 +915,7 @@ bool multifd_save_setup(void) multifd_send_state = g_malloc0(sizeof(*multifd_send_state)); multifd_send_state->params = g_new0(MultiFDSendParams, thread_count); multifd_send_state->pages = multifd_pages_init(page_count); + qemu_sem_init(&multifd_send_state->channels_created, 0); qemu_sem_init(&multifd_send_state->channels_ready, 0); qatomic_set(&multifd_send_state->exiting, 0); multifd_send_state->ops = multifd_ops[migrate_multifd_compression()]; @@ -953,6 +951,15 @@ bool multifd_save_setup(void) multifd_new_send_channel_create(p); } + /* + * Wait until channel creation has started for all channels. The + * creation can still fail, but no more channels will be created + * past this point. + */ + for (i = 0; i < thread_count; i++) { + qemu_sem_wait(&multifd_send_state->channels_created); + } + for (i = 0; i < thread_count; i++) { MultiFDSendParams *p = &multifd_send_state->params[i];