From patchwork Wed Oct 25 19:38:07 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hao Xiang X-Patchwork-Id: 1855254 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=bytedance.com header.i=@bytedance.com header.a=rsa-sha256 header.s=google header.b=RP/2w6pr; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=patchwork.ozlabs.org) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-ECDSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4SFzp11zGMz23jn for ; Thu, 26 Oct 2023 06:41:45 +1100 (AEDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qvjj5-0000pu-IN; Wed, 25 Oct 2023 15:39:35 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qvjj3-0000pg-Qx for qemu-devel@nongnu.org; Wed, 25 Oct 2023 15:39:33 -0400 Received: from mail-qv1-xf2c.google.com ([2607:f8b0:4864:20::f2c]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1qvjj1-00067o-1H for qemu-devel@nongnu.org; Wed, 25 Oct 2023 15:39:33 -0400 Received: by mail-qv1-xf2c.google.com with SMTP id 6a1803df08f44-66d03491a1eso740246d6.2 for ; Wed, 25 Oct 2023 12:39:30 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1698262770; x=1698867570; darn=nongnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=j1bNuRq+y2CvFip0F1f8OlNXk4NaSY7ezUEtcTBMZxc=; b=RP/2w6preyrOF9ZbpyK0EvUjkWc3nfi9/3ePZtKjMuXNhOqUrszKDBRSMQxoiZR5CW 7voZYythN52eMdRQJ0GO0ImOSYVc0tVwQrSkRXOpzP2SRcSmIUEB4R1etJY2+O1siDmr r/oQi4kPTQRMbVyXhY66o4lmq53OmuXjF9LWMRC4P5UYw49VAR8NSTeI+5a+mNLm6WD6 SRR7Zo6x6uOaiSI1td0gH/TSPzHa6tH/1Sy0olWHBLa/cnMpEGJ2c164W0QQZzzvfMZk B39xxVj2FeASsRZs2zNpSfoMjJ7iJhLkNJTxt0tnSDyq39krl8uL15vZ0ahpZPAhGdvH A4zA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1698262770; x=1698867570; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=j1bNuRq+y2CvFip0F1f8OlNXk4NaSY7ezUEtcTBMZxc=; b=hVgqiTQavjZrNU8QaSyPywON8f9Wc7VgPOa41fI28m3rGuePGs31s9mtWkD2xt87VD sW/alzPrPkUEHd4SlS/xKpFEzdw5HwPhpTZPhjMbgLvm2FtWbOkXwK5M6sd72NDgT3sh VqziB8PVAVbWZD4nZ2I/5u/LZ+UsgoXINkmAljAOizs9/XwW3uQ4BPt78xkB2CRZdLOp INkK3snS6ff/u9ZtH+yUiVu+TN4V34llLEuq3Kuw6bec3av+64CIUqwIHANL7xttdNL6 P+FaGP60j+TbgEsMhIGlZ4Hy6i9IGvUHuM9fUuxWNQ4R4Ql2IVOnJs4ycuy5LCqnKD02 16Hw== X-Gm-Message-State: AOJu0Yx1a5/xBfXKpJXEoEmt/eRmRSRGzx19qDOklwnKjQRJHMicZzky xPJeEILqqpThtSaJx4JSj7T0Nw== X-Google-Smtp-Source: AGHT+IFqmtnOCThWaFpypfvK0Klyay6WS8IU6wpDKrCSvU3xVU9fgN1NWvG0lFW9NcoE8XVPfpVQsg== X-Received: by 2002:a05:6214:246b:b0:66d:1d92:c694 with SMTP id im11-20020a056214246b00b0066d1d92c694mr22580113qvb.58.1698262769870; Wed, 25 Oct 2023 12:39:29 -0700 (PDT) Received: from n231-230-216.byted.org ([147.160.184.135]) by smtp.gmail.com with ESMTPSA id o8-20020a05620a228800b0076cdc3b5beasm4453721qkh.86.2023.10.25.12.39.29 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 25 Oct 2023 12:39:29 -0700 (PDT) From: Hao Xiang To: quintela@redhat.com, peterx@redhat.com, marcandre.lureau@redhat.com, bryan.zhang@bytedance.com, qemu-devel@nongnu.org Cc: Hao Xiang Subject: [PATCH 01/16] Cherry pick a set of patches that enables multifd zero page feature. Date: Wed, 25 Oct 2023 19:38:07 +0000 Message-Id: <20231025193822.2813204-2-hao.xiang@bytedance.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20231025193822.2813204-1-hao.xiang@bytedance.com> References: <20231025193822.2813204-1-hao.xiang@bytedance.com> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::f2c; envelope-from=hao.xiang@bytedance.com; helo=mail-qv1-xf2c.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Juan Quintela had a patchset enabling zero page checking in multifd threads. https://lore.kernel.org/all/20220802063907.18882-13-quintela@redhat.com/ Some of the changes in that patchset has already made to upstream but a few are still being reviewed. This patch contains the changes from the remaining patches. This change serves as the baseline for DSA offloading. * multifd: Add capability to enable/disable zero_page * migration: Export ram_release_page() * multifd: Support for zero pages transmission * multifd: Zero pages transmission * So we use multifd to transmit zero pages. --- migration/migration-stats.h | 8 +-- migration/multifd.c | 105 ++++++++++++++++++++++++++++-------- migration/multifd.h | 19 ++++++- migration/options.c | 12 +++++ migration/options.h | 1 + migration/ram.c | 47 +++++++++++++--- migration/trace-events | 8 +-- qapi/migration.json | 7 ++- 8 files changed, 167 insertions(+), 40 deletions(-) diff --git a/migration/migration-stats.h b/migration/migration-stats.h index 2358caad63..dca3c100b0 100644 --- a/migration/migration-stats.h +++ b/migration/migration-stats.h @@ -68,6 +68,10 @@ typedef struct { * Number of pages transferred that were not full of zeros. */ Stat64 normal_pages; + /* + * Number of pages transferred that were full of zeros. + */ + Stat64 zero_pages; /* * Number of bytes sent during postcopy. */ @@ -97,10 +101,6 @@ typedef struct { * Total number of bytes transferred. */ Stat64 transferred; - /* - * Number of pages transferred that were full of zeros. - */ - Stat64 zero_pages; } MigrationAtomicStats; extern MigrationAtomicStats mig_stats; diff --git a/migration/multifd.c b/migration/multifd.c index 0f6b203877..452fb158b8 100644 --- a/migration/multifd.c +++ b/migration/multifd.c @@ -12,6 +12,7 @@ #include "qemu/osdep.h" #include "qemu/rcu.h" +#include "qemu/cutils.h" #include "exec/target_page.h" #include "sysemu/sysemu.h" #include "exec/ramblock.h" @@ -268,6 +269,7 @@ static void multifd_send_fill_packet(MultiFDSendParams *p) packet->normal_pages = cpu_to_be32(p->normal_num); packet->next_packet_size = cpu_to_be32(p->next_packet_size); packet->packet_num = cpu_to_be64(p->packet_num); + packet->zero_pages = cpu_to_be32(p->zero_num); if (p->pages->block) { strncpy(packet->ramblock, p->pages->block->idstr, 256); @@ -279,6 +281,12 @@ static void multifd_send_fill_packet(MultiFDSendParams *p) packet->offset[i] = cpu_to_be64(temp); } + for (i = 0; i < p->zero_num; i++) { + /* there are architectures where ram_addr_t is 32 bit */ + uint64_t temp = p->zero[i]; + + packet->offset[p->normal_num + i] = cpu_to_be64(temp); + } } static int multifd_recv_unfill_packet(MultiFDRecvParams *p, Error **errp) @@ -327,7 +335,15 @@ static int multifd_recv_unfill_packet(MultiFDRecvParams *p, Error **errp) p->next_packet_size = be32_to_cpu(packet->next_packet_size); p->packet_num = be64_to_cpu(packet->packet_num); - if (p->normal_num == 0) { + p->zero_num = be32_to_cpu(packet->zero_pages); + if (p->zero_num > packet->pages_alloc - p->normal_num) { + error_setg(errp, "multifd: received packet " + "with %u zero pages and expected maximum pages are %u", + p->zero_num, packet->pages_alloc - p->normal_num) ; + return -1; + } + + if (p->normal_num == 0 && p->zero_num == 0) { return 0; } @@ -353,6 +369,18 @@ static int multifd_recv_unfill_packet(MultiFDRecvParams *p, Error **errp) p->normal[i] = offset; } + for (i = 0; i < p->zero_num; i++) { + uint64_t offset = be64_to_cpu(packet->offset[p->normal_num + i]); + + if (offset > (p->block->used_length - p->page_size)) { + error_setg(errp, "multifd: offset too long %" PRIu64 + " (max " RAM_ADDR_FMT ")", + offset, p->block->used_length); + return -1; + } + p->zero[i] = offset; + } + return 0; } @@ -432,6 +460,8 @@ static int multifd_send_pages(QEMUFile *f) p->packet_num = multifd_send_state->packet_num++; multifd_send_state->pages = p->pages; p->pages = pages; + stat64_add(&mig_stats.normal_pages, p->normal_num); + stat64_add(&mig_stats.zero_pages, p->zero_num); qemu_mutex_unlock(&p->mutex); qemu_sem_post(&p->sem); @@ -548,6 +578,8 @@ void multifd_save_cleanup(void) p->iov = NULL; g_free(p->normal); p->normal = NULL; + g_free(p->zero); + p->zero = NULL; multifd_send_state->ops->send_cleanup(p, &local_err); if (local_err) { migrate_set_error(migrate_get_current(), local_err); @@ -622,7 +654,7 @@ int multifd_send_sync_main(QEMUFile *f) } p->packet_num = multifd_send_state->packet_num++; - p->flags |= MULTIFD_FLAG_SYNC; + p->sync_needed = true; p->pending_job++; qemu_mutex_unlock(&p->mutex); qemu_sem_post(&p->sem); @@ -648,6 +680,8 @@ static void *multifd_send_thread(void *opaque) MultiFDSendParams *p = opaque; MigrationThread *thread = NULL; Error *local_err = NULL; + /* older qemu don't understand zero page on multifd channel */ + bool use_multifd_zero_page = !migrate_use_main_zero_page(); int ret = 0; bool use_zero_copy_send = migrate_zero_copy_send(); @@ -673,9 +707,17 @@ static void *multifd_send_thread(void *opaque) qemu_mutex_lock(&p->mutex); if (p->pending_job) { + RAMBlock *rb = p->pages->block; uint64_t packet_num = p->packet_num; - uint32_t flags; + p->flags = 0; + if (p->sync_needed) { + p->flags |= MULTIFD_FLAG_SYNC; + p->sync_needed = false; + } + qemu_mutex_unlock(&p->mutex); + p->normal_num = 0; + p->zero_num = 0; if (use_zero_copy_send) { p->iovs_num = 0; @@ -684,27 +726,27 @@ static void *multifd_send_thread(void *opaque) } for (int i = 0; i < p->pages->num; i++) { - p->normal[p->normal_num] = p->pages->offset[i]; - p->normal_num++; + uint64_t offset = p->pages->offset[i]; + if (use_multifd_zero_page && + buffer_is_zero(rb->host + offset, p->page_size)) { + p->zero[p->zero_num] = offset; + p->zero_num++; + ram_release_page(rb->idstr, offset); + } else { + p->normal[p->normal_num] = offset; + p->normal_num++; + } } if (p->normal_num) { ret = multifd_send_state->ops->send_prepare(p, &local_err); if (ret != 0) { - qemu_mutex_unlock(&p->mutex); break; } } multifd_send_fill_packet(p); - flags = p->flags; - p->flags = 0; - p->num_packets++; - p->total_normal_pages += p->normal_num; - p->pages->num = 0; - p->pages->block = NULL; - qemu_mutex_unlock(&p->mutex); - trace_multifd_send(p->id, packet_num, p->normal_num, flags, + trace_multifd_send(p->id, packet_num, p->normal_num, p->zero_num, p->flags, p->next_packet_size); if (use_zero_copy_send) { @@ -731,10 +773,15 @@ static void *multifd_send_thread(void *opaque) stat64_add(&mig_stats.multifd_bytes, p->next_packet_size); stat64_add(&mig_stats.transferred, p->next_packet_size); qemu_mutex_lock(&p->mutex); + p->num_packets++; + p->total_normal_pages += p->normal_num; + p->total_zero_pages += p->zero_num; + p->pages->num = 0; + p->pages->block = NULL; p->pending_job--; qemu_mutex_unlock(&p->mutex); - if (flags & MULTIFD_FLAG_SYNC) { + if (p->flags & MULTIFD_FLAG_SYNC) { qemu_sem_post(&p->sem_sync); } } else if (p->quit) { @@ -768,7 +815,8 @@ out: rcu_unregister_thread(); migration_threads_remove(thread); - trace_multifd_send_thread_end(p->id, p->num_packets, p->total_normal_pages); + trace_multifd_send_thread_end(p->id, p->num_packets, p->total_normal_pages, + p->total_zero_pages); return NULL; } @@ -944,6 +992,7 @@ int multifd_save_setup(Error **errp) p->normal = g_new0(ram_addr_t, page_count); p->page_size = qemu_target_page_size(); p->page_count = page_count; + p->zero = g_new0(ram_addr_t, page_count); if (migrate_zero_copy_send()) { p->write_flags = QIO_CHANNEL_WRITE_FLAG_ZERO_COPY; @@ -1059,6 +1108,8 @@ void multifd_load_cleanup(void) p->iov = NULL; g_free(p->normal); p->normal = NULL; + g_free(p->zero); + p->zero = NULL; multifd_recv_state->ops->recv_cleanup(p); } qemu_sem_destroy(&multifd_recv_state->sem_sync); @@ -1105,7 +1156,7 @@ static void *multifd_recv_thread(void *opaque) rcu_register_thread(); while (true) { - uint32_t flags; + bool sync_needed = false; if (p->quit) { break; @@ -1124,13 +1175,14 @@ static void *multifd_recv_thread(void *opaque) break; } - flags = p->flags; + trace_multifd_recv(p->id, p->packet_num, p->normal_num, p->zero_num, + p->flags, p->next_packet_size); + sync_needed = p->flags & MULTIFD_FLAG_SYNC; /* recv methods don't know how to handle the SYNC flag */ p->flags &= ~MULTIFD_FLAG_SYNC; - trace_multifd_recv(p->id, p->packet_num, p->normal_num, flags, - p->next_packet_size); p->num_packets++; p->total_normal_pages += p->normal_num; + p->total_zero_pages += p->zero_num; qemu_mutex_unlock(&p->mutex); if (p->normal_num) { @@ -1140,7 +1192,14 @@ static void *multifd_recv_thread(void *opaque) } } - if (flags & MULTIFD_FLAG_SYNC) { + for (int i = 0; i < p->zero_num; i++) { + void *page = p->host + p->zero[i]; + if (!buffer_is_zero(page, p->page_size)) { + memset(page, 0, p->page_size); + } + } + + if (sync_needed) { qemu_sem_post(&multifd_recv_state->sem_sync); qemu_sem_wait(&p->sem_sync); } @@ -1155,7 +1214,8 @@ static void *multifd_recv_thread(void *opaque) qemu_mutex_unlock(&p->mutex); rcu_unregister_thread(); - trace_multifd_recv_thread_end(p->id, p->num_packets, p->total_normal_pages); + trace_multifd_recv_thread_end(p->id, p->num_packets, p->total_normal_pages, + p->total_zero_pages); return NULL; } @@ -1196,6 +1256,7 @@ int multifd_load_setup(Error **errp) p->normal = g_new0(ram_addr_t, page_count); p->page_count = page_count; p->page_size = qemu_target_page_size(); + p->zero = g_new0(ram_addr_t, page_count); } for (i = 0; i < thread_count; i++) { diff --git a/migration/multifd.h b/migration/multifd.h index a835643b48..e8f90776bb 100644 --- a/migration/multifd.h +++ b/migration/multifd.h @@ -48,7 +48,10 @@ typedef struct { /* size of the next packet that contains pages */ uint32_t next_packet_size; uint64_t packet_num; - uint64_t unused[4]; /* Reserved for future use */ + /* zero pages */ + uint32_t zero_pages; + uint32_t unused32[1]; /* Reserved for future use */ + uint64_t unused64[3]; /* Reserved for future use */ char ramblock[256]; uint64_t offset[]; } __attribute__((packed)) MultiFDPacket_t; @@ -118,10 +121,14 @@ typedef struct { MultiFDPacket_t *packet; /* size of the next packet that contains pages */ uint32_t next_packet_size; + /* Do we need to do an iteration sync */ + bool sync_needed; /* packets sent through this channel */ uint64_t num_packets; /* non zero pages sent through this channel */ uint64_t total_normal_pages; + /* zero pages sent through this channel */ + uint64_t total_zero_pages; /* buffers to send */ struct iovec *iov; /* number of iovs used */ @@ -130,6 +137,10 @@ typedef struct { ram_addr_t *normal; /* num of non zero pages */ uint32_t normal_num; + /* Pages that are zero */ + ram_addr_t *zero; + /* num of zero pages */ + uint32_t zero_num; /* used for compression methods */ void *data; } MultiFDSendParams; @@ -181,12 +192,18 @@ typedef struct { uint8_t *host; /* non zero pages recv through this channel */ uint64_t total_normal_pages; + /* zero pages recv through this channel */ + uint64_t total_zero_pages; /* buffers to recv */ struct iovec *iov; /* Pages that are not zero */ ram_addr_t *normal; /* num of non zero pages */ uint32_t normal_num; + /* Pages that are zero */ + ram_addr_t *zero; + /* num of zero pages */ + uint32_t zero_num; /* used for de-compression methods */ void *data; } MultiFDRecvParams; diff --git a/migration/options.c b/migration/options.c index 6bbfd4853d..12b1c4dd71 100644 --- a/migration/options.c +++ b/migration/options.c @@ -189,6 +189,7 @@ Property migration_properties[] = { DEFINE_PROP_MIG_CAP("x-block", MIGRATION_CAPABILITY_BLOCK), DEFINE_PROP_MIG_CAP("x-return-path", MIGRATION_CAPABILITY_RETURN_PATH), DEFINE_PROP_MIG_CAP("x-multifd", MIGRATION_CAPABILITY_MULTIFD), + DEFINE_PROP_MIG_CAP("x-main-zero-page", MIGRATION_CAPABILITY_MAIN_ZERO_PAGE), DEFINE_PROP_MIG_CAP("x-background-snapshot", MIGRATION_CAPABILITY_BACKGROUND_SNAPSHOT), #ifdef CONFIG_LINUX @@ -278,6 +279,13 @@ bool migrate_multifd(void) return s->capabilities[MIGRATION_CAPABILITY_MULTIFD]; } +bool migrate_use_main_zero_page(void) +{ + MigrationState *s = migrate_get_current(); + + return s->capabilities[MIGRATION_CAPABILITY_MAIN_ZERO_PAGE]; +} + bool migrate_pause_before_switchover(void) { MigrationState *s = migrate_get_current(); @@ -431,6 +439,7 @@ INITIALIZE_MIGRATE_CAPS_SET(check_caps_background_snapshot, MIGRATION_CAPABILITY_LATE_BLOCK_ACTIVATE, MIGRATION_CAPABILITY_RETURN_PATH, MIGRATION_CAPABILITY_MULTIFD, + MIGRATION_CAPABILITY_MAIN_ZERO_PAGE, MIGRATION_CAPABILITY_PAUSE_BEFORE_SWITCHOVER, MIGRATION_CAPABILITY_AUTO_CONVERGE, MIGRATION_CAPABILITY_RELEASE_RAM, @@ -499,6 +508,9 @@ bool migrate_caps_check(bool *old_caps, bool *new_caps, Error **errp) error_setg(errp, "Postcopy is not yet compatible with multifd"); return false; } + if (new_caps[MIGRATION_CAPABILITY_MAIN_ZERO_PAGE]) { + error_setg(errp, "Postcopy is not yet compatible with main zero copy"); + } } if (new_caps[MIGRATION_CAPABILITY_BACKGROUND_SNAPSHOT]) { diff --git a/migration/options.h b/migration/options.h index 045e2a41a2..c663f637fd 100644 --- a/migration/options.h +++ b/migration/options.h @@ -85,6 +85,7 @@ int migrate_multifd_channels(void); MultiFDCompression migrate_multifd_compression(void); int migrate_multifd_zlib_level(void); int migrate_multifd_zstd_level(void); +bool migrate_use_main_zero_page(void); uint8_t migrate_throttle_trigger_threshold(void); const char *migrate_tls_authz(void); const char *migrate_tls_creds(void); diff --git a/migration/ram.c b/migration/ram.c index 2f5ce4d60b..516b5b9c59 100644 --- a/migration/ram.c +++ b/migration/ram.c @@ -1292,7 +1292,6 @@ static int ram_save_multifd_page(QEMUFile *file, RAMBlock *block, if (multifd_queue_page(file, block, offset) < 0) { return -1; } - stat64_add(&mig_stats.normal_pages, 1); return 1; } @@ -2149,17 +2148,43 @@ static int ram_save_target_page_legacy(RAMState *rs, PageSearchStatus *pss) } return res; } - /* - * Do not use multifd in postcopy as one whole host page should be - * placed. Meanwhile postcopy requires atomic update of pages, so even - * if host page size == guest page size the dest guest during run may - * still see partially copied pages which is data corruption. + * Do not use multifd for: + * 1. Compression as the first page in the new block should be posted out + * before sending the compressed page + * 2. In postcopy as one whole host page should be placed */ - if (migrate_multifd() && !migration_in_postcopy()) { + if (!save_page_use_compression(rs) && migrate_multifd() + && !migration_in_postcopy()) { + return ram_save_multifd_page(pss->pss_channel, block, offset); + } + + return ram_save_page(rs, pss); +} + +/** + * ram_save_target_page_multifd: save one target page + * + * Returns the number of pages written + * + * @rs: current RAM state + * @pss: data about the page we want to send + */ +static int ram_save_target_page_multifd(RAMState *rs, PageSearchStatus *pss) +{ + RAMBlock *block = pss->block; + ram_addr_t offset = ((ram_addr_t)pss->page) << TARGET_PAGE_BITS; + int res; + + if (!migration_in_postcopy()) { return ram_save_multifd_page(pss->pss_channel, block, offset); } + res = save_zero_page(pss, pss->pss_channel, block, offset); + if (res > 0) { + return res; + } + return ram_save_page(rs, pss); } @@ -3066,7 +3091,13 @@ static int ram_save_setup(QEMUFile *f, void *opaque) ram_control_after_iterate(f, RAM_CONTROL_SETUP); migration_ops = g_malloc0(sizeof(MigrationOps)); - migration_ops->ram_save_target_page = ram_save_target_page_legacy; + + if (migrate_multifd() && !migrate_use_main_zero_page()) { + migration_ops->ram_save_target_page = ram_save_target_page_multifd; + } else { + migration_ops->ram_save_target_page = ram_save_target_page_legacy; + } + ret = multifd_send_sync_main(f); if (ret < 0) { return ret; diff --git a/migration/trace-events b/migration/trace-events index ee9c8f4d63..3d059f3c06 100644 --- a/migration/trace-events +++ b/migration/trace-events @@ -125,21 +125,21 @@ postcopy_preempt_reset_channel(void) "" # multifd.c multifd_new_send_channel_async(uint8_t id) "channel %u" -multifd_recv(uint8_t id, uint64_t packet_num, uint32_t used, uint32_t flags, uint32_t next_packet_size) "channel %u packet_num %" PRIu64 " pages %u flags 0x%x next packet size %u" +multifd_recv(uint8_t id, uint64_t packet_num, uint32_t normal, uint32_t zero, uint32_t flags, uint32_t next_packet_size) "channel %u packet_num %" PRIu64 " normal pages %u zero pages %u flags 0x%x next packet size %u" multifd_recv_new_channel(uint8_t id) "channel %u" multifd_recv_sync_main(long packet_num) "packet num %ld" multifd_recv_sync_main_signal(uint8_t id) "channel %u" multifd_recv_sync_main_wait(uint8_t id) "channel %u" multifd_recv_terminate_threads(bool error) "error %d" -multifd_recv_thread_end(uint8_t id, uint64_t packets, uint64_t pages) "channel %u packets %" PRIu64 " pages %" PRIu64 +multifd_recv_thread_end(uint8_t id, uint64_t packets, uint64_t normal_pages, uint64_t zero_pages) "channel %u packets %" PRIu64 " normal pages %" PRIu64 " zero pages %" PRIu64 multifd_recv_thread_start(uint8_t id) "%u" -multifd_send(uint8_t id, uint64_t packet_num, uint32_t normal, uint32_t flags, uint32_t next_packet_size) "channel %u packet_num %" PRIu64 " normal pages %u flags 0x%x next packet size %u" +multifd_send(uint8_t id, uint64_t packet_num, uint32_t normalpages, uint32_t zero_pages, uint32_t flags, uint32_t next_packet_size) "channel %u packet_num %" PRIu64 " normal pages %u zero pages %u flags 0x%x next packet size %u" multifd_send_error(uint8_t id) "channel %u" multifd_send_sync_main(long packet_num) "packet num %ld" multifd_send_sync_main_signal(uint8_t id) "channel %u" multifd_send_sync_main_wait(uint8_t id) "channel %u" multifd_send_terminate_threads(bool error) "error %d" -multifd_send_thread_end(uint8_t id, uint64_t packets, uint64_t normal_pages) "channel %u packets %" PRIu64 " normal pages %" PRIu64 +multifd_send_thread_end(uint8_t id, uint64_t packets, uint64_t normal_pages, uint64_t zero_pages) "channel %u packets %" PRIu64 " normal pages %" PRIu64 " zero pages %" PRIu64 multifd_send_thread_start(uint8_t id) "%u" multifd_tls_outgoing_handshake_start(void *ioc, void *tioc, const char *hostname) "ioc=%p tioc=%p hostname=%s" multifd_tls_outgoing_handshake_error(void *ioc, const char *err) "ioc=%p err=%s" diff --git a/qapi/migration.json b/qapi/migration.json index d7dfaa5db9..3a99fe34d8 100644 --- a/qapi/migration.json +++ b/qapi/migration.json @@ -508,6 +508,11 @@ # and should not affect the correctness of postcopy migration. # (since 7.1) # +# @main-zero-page: If enabled, the detection of zero pages will be +# done on the main thread. Otherwise it is done on +# the multifd threads. +# (since 7.1) +# # @switchover-ack: If enabled, migration will not stop the source VM # and complete the migration until an ACK is received from the # destination that it's OK to do so. Exactly when this ACK is @@ -540,7 +545,7 @@ { 'name': 'x-ignore-shared', 'features': [ 'unstable' ] }, 'validate-uuid', 'background-snapshot', 'zero-copy-send', 'postcopy-preempt', 'switchover-ack', - 'dirty-limit'] } + 'dirty-limit', 'main-zero-page'] } ## # @MigrationCapabilityStatus: From patchwork Wed Oct 25 19:38:08 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hao Xiang X-Patchwork-Id: 1855251 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=bytedance.com header.i=@bytedance.com header.a=rsa-sha256 header.s=google header.b=HWO6ZxZz; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=patchwork.ozlabs.org) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-ECDSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4SFzn21502z23jn for ; Thu, 26 Oct 2023 06:40:54 +1100 (AEDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qvjj9-0000r5-3I; Wed, 25 Oct 2023 15:39:39 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qvjj6-0000qO-8K for qemu-devel@nongnu.org; Wed, 25 Oct 2023 15:39:36 -0400 Received: from mail-qk1-x731.google.com ([2607:f8b0:4864:20::731]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1qvjj4-00067x-6G for qemu-devel@nongnu.org; Wed, 25 Oct 2023 15:39:36 -0400 Received: by mail-qk1-x731.google.com with SMTP id af79cd13be357-77774120c6eso9820985a.2 for ; Wed, 25 Oct 2023 12:39:31 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1698262771; x=1698867571; darn=nongnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=gkO/a2qNKiS9VIoexxTIyQuXXrYV6TaKvpeGj8t7mHo=; b=HWO6ZxZzRteklyRhKTZFvPr3SotPCCmYKMCodTRwHvO9KLC46RDQSEShHDbyPtBrap F98Qr0YMn5pQ7Rqw9Akv8VmworE1D9/SwCIltTJKH10F6m6oOXqYxLU5Q0cyEIWjLB3k 6+syEy3IrMBF9FCf/T9xLdj/mQcbK6jOJ8yu17dOW0AEfoj35Ha8hjvHuzjmRDiVmhxT Xp6qgFfWmCIibjsIbXUdluJa6nRcXcZ7nepLqmUZwB+98iCAdvf6kh4qBrVRUkMdk+jV eivYvTjh9hsHPpWEFGb+5dmIsLRSgiWkl53iIjlG6I3NtRdXeiQ+P1W3mKM5QugeYc6w RJ9w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1698262771; x=1698867571; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=gkO/a2qNKiS9VIoexxTIyQuXXrYV6TaKvpeGj8t7mHo=; b=dFyvDKne51tg2DfrSncCOHCGo4EMCqc129OBIZm4hLA7DyWtWzCXi96Zumn3jGwYTI GRUhHxuMU1unJ7cIBFknLDhmIfVl6SCqlxgfjzxHpHSeCd6Jcv55EZXL7R6qIzLiYuth iGWcR3O3ma3ICm/5IjsEpWPbkjc+iAd08n6Pzs5YeI/i1leJQ2h9tt2CiaIPBgS9jor4 bs830CiRBUQXa5CxuO4XnmaIgWztqVN/FFYhhi3MCQW/BW5MDwTcsQgZSQQE9j44+VVM AJhz2EnWkRd1Sn2T9Vd8SzyRxhQMkGsb+aa4RZ4WYyhTSbHLdL1fkJyZeNgFA7eZG7c3 050A== X-Gm-Message-State: AOJu0YxNvRRWOHnOD+BRGS6KDkodCF9zubzfEjqKBzkbInisa6C/qOJG 1mF+LPZ8D5JUz/4JkK7nZvs7MQ== X-Google-Smtp-Source: AGHT+IFpSmF6V6lIvHNkvZQofGD3hVcEcMNevdLuz1pAGweH0jrhMdsUHqX8To7n5vdxQ8BcexusRA== X-Received: by 2002:a05:620a:981:b0:778:8f98:96a with SMTP id x1-20020a05620a098100b007788f98096amr15248990qkx.21.1698262771318; Wed, 25 Oct 2023 12:39:31 -0700 (PDT) Received: from n231-230-216.byted.org ([147.160.184.135]) by smtp.gmail.com with ESMTPSA id o8-20020a05620a228800b0076cdc3b5beasm4453721qkh.86.2023.10.25.12.39.30 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 25 Oct 2023 12:39:31 -0700 (PDT) From: Hao Xiang To: quintela@redhat.com, peterx@redhat.com, marcandre.lureau@redhat.com, bryan.zhang@bytedance.com, qemu-devel@nongnu.org Cc: Hao Xiang Subject: [PATCH 02/16] meson: Introduce new instruction set enqcmd to the build system. Date: Wed, 25 Oct 2023 19:38:08 +0000 Message-Id: <20231025193822.2813204-3-hao.xiang@bytedance.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20231025193822.2813204-1-hao.xiang@bytedance.com> References: <20231025193822.2813204-1-hao.xiang@bytedance.com> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::731; envelope-from=hao.xiang@bytedance.com; helo=mail-qk1-x731.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Enable instruction set enqcmd in build. Signed-off-by: Hao Xiang --- meson.build | 2 ++ meson_options.txt | 2 ++ scripts/meson-buildoptions.sh | 3 +++ 3 files changed, 7 insertions(+) diff --git a/meson.build b/meson.build index bd65a111aa..6ea859829c 100644 --- a/meson.build +++ b/meson.build @@ -2661,6 +2661,8 @@ config_host_data.set('CONFIG_AVX512BW_OPT', get_option('avx512bw') \ int main(int argc, char *argv[]) { return bar(argv[0]); } '''), error_message: 'AVX512BW not available').allowed()) +config_host_data.set('CONFIG_DSA_OPT', get_option('enqcmd')) + # For both AArch64 and AArch32, detect if builtins are available. config_host_data.set('CONFIG_ARM_AES_BUILTIN', cc.compiles(''' #include diff --git a/meson_options.txt b/meson_options.txt index 6a17b90968..029be1df9f 100644 --- a/meson_options.txt +++ b/meson_options.txt @@ -119,6 +119,8 @@ option('avx512f', type: 'feature', value: 'disabled', description: 'AVX512F optimizations') option('avx512bw', type: 'feature', value: 'auto', description: 'AVX512BW optimizations') +option('enqcmd', type: 'boolean', value: false, + description: 'MENQCMD optimizations') option('keyring', type: 'feature', value: 'auto', description: 'Linux keyring support') option('libkeyutils', type: 'feature', value: 'auto', diff --git a/scripts/meson-buildoptions.sh b/scripts/meson-buildoptions.sh index 2a74b0275b..768f2d7627 100644 --- a/scripts/meson-buildoptions.sh +++ b/scripts/meson-buildoptions.sh @@ -82,6 +82,7 @@ meson_options_help() { printf "%s\n" ' avx2 AVX2 optimizations' printf "%s\n" ' avx512bw AVX512BW optimizations' printf "%s\n" ' avx512f AVX512F optimizations' + printf "%s\n" ' enqcmd ENQCMD optimizations' printf "%s\n" ' blkio libblkio block device driver' printf "%s\n" ' bochs bochs image format support' printf "%s\n" ' bpf eBPF support' @@ -224,6 +225,8 @@ _meson_option_parse() { --disable-avx512bw) printf "%s" -Davx512bw=disabled ;; --enable-avx512f) printf "%s" -Davx512f=enabled ;; --disable-avx512f) printf "%s" -Davx512f=disabled ;; + --enable-enqcmd) printf "%s" -Denqcmd=true ;; + --disable-enqcmd) printf "%s" -Denqcmd=false ;; --enable-gcov) printf "%s" -Db_coverage=true ;; --disable-gcov) printf "%s" -Db_coverage=false ;; --enable-lto) printf "%s" -Db_lto=true ;; From patchwork Wed Oct 25 19:38:09 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hao Xiang X-Patchwork-Id: 1855258 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=bytedance.com header.i=@bytedance.com header.a=rsa-sha256 header.s=google header.b=SsDLvkQT; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=patchwork.ozlabs.org) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-ECDSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4SFzpL2jZrz23jn for ; Thu, 26 Oct 2023 06:42:02 +1100 (AEDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qvjj7-0000qp-PY; Wed, 25 Oct 2023 15:39:37 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qvjj5-0000qF-Pz for qemu-devel@nongnu.org; Wed, 25 Oct 2023 15:39:35 -0400 Received: from mail-qk1-x731.google.com ([2607:f8b0:4864:20::731]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1qvjj3-000687-Tz for qemu-devel@nongnu.org; Wed, 25 Oct 2023 15:39:35 -0400 Received: by mail-qk1-x731.google.com with SMTP id af79cd13be357-779fb118fe4so9479285a.2 for ; Wed, 25 Oct 2023 12:39:33 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1698262772; x=1698867572; darn=nongnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=0JqkkF9dhAzE0OkPFi/BNF1Vq1buivPQjPn42c685Sk=; b=SsDLvkQTed2tMdWXg/Wbfu7kS81LcEEOg4H9Fgk6gmYcQba0lKqSv9H+H7ifRPmo5M 8K7210lgpWiuqVBoWSiJtJxxn/WcK8YIje3Fz10xJcM27GCfrcyAnANgKsal41zwXmwx DJOnX3yRi4vMrNI7QxQeOcWgxbNp7Mj+TMOBW0nQZNWE4n5rgOu3/IrNpII22C95edg3 iQNqcZAuZGrQr/q62Zv0GvJ5S4dVDljoAghXRiQVyZ39IpNOkSP9Besq0GxjWhrsft41 WXpS3AitB5kFoRmQObsOE8S04lgreiAmLdhu+F8twpsF/Y6mKTpc34cxsRn1+G9hj8XJ kumg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1698262772; x=1698867572; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=0JqkkF9dhAzE0OkPFi/BNF1Vq1buivPQjPn42c685Sk=; b=cPuGg82q3/DQ/fgKOjXyB85qShuXNsNzRkpYzu/Wp9pOHCviGpCSguP4jVEkx6WIcy n+B2f9ra15OYFLROAsVrOP5C30mrYYcXFA4w3rZzQlxTo0rBheAbbUTRO8CllYH6IRdk n+MmYB6SNAanKVZ1jQwBdcrnKkvlIvVoBEDinzi9k1Zzc7NCEK9DBRISAPtQ/YStUzF0 mFi0jgq3iOWZuxHEEYhSXMl0XtWZPSo7b2Ear+jOKwgBGchMZ+7VNtKQwrlm+iMbUphD D5UuWQOw3l7YfYyJUOgCzumf2cKWZZHCVpjL94QrIgb12IkQi0T+RvqBDqIIWKAqluJ8 C5BA== X-Gm-Message-State: AOJu0YwU4ukIOrsx5BK/RMpzc07QN9sTgZLYYUeC0NZ1gg+Kyew/5wuV eHAT9OvxEqwD2I5VFA/fYJ5UCg== X-Google-Smtp-Source: AGHT+IFvcQoDPlOpKPqJe+cAEq3W0GjfIbenamTgPA6MxpuyOZPPkz7ktB7xbFGqEaiG56LXMaEGyA== X-Received: by 2002:a05:620a:1aaa:b0:769:89c8:4fae with SMTP id bl42-20020a05620a1aaa00b0076989c84faemr19131809qkb.52.1698262772577; Wed, 25 Oct 2023 12:39:32 -0700 (PDT) Received: from n231-230-216.byted.org ([147.160.184.135]) by smtp.gmail.com with ESMTPSA id o8-20020a05620a228800b0076cdc3b5beasm4453721qkh.86.2023.10.25.12.39.32 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 25 Oct 2023 12:39:32 -0700 (PDT) From: Hao Xiang To: quintela@redhat.com, peterx@redhat.com, marcandre.lureau@redhat.com, bryan.zhang@bytedance.com, qemu-devel@nongnu.org Cc: Hao Xiang Subject: [PATCH 03/16] util/dsa: Add dependency idxd. Date: Wed, 25 Oct 2023 19:38:09 +0000 Message-Id: <20231025193822.2813204-4-hao.xiang@bytedance.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20231025193822.2813204-1-hao.xiang@bytedance.com> References: <20231025193822.2813204-1-hao.xiang@bytedance.com> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::731; envelope-from=hao.xiang@bytedance.com; helo=mail-qk1-x731.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Idxd is the device driver for DSA (Intel Data Streaming Accelerator). The driver is fully functioning since Linux kernel 5.19. This change adds the driver's header file used for userspace development. Signed-off-by: Hao Xiang --- linux-headers/linux/idxd.h | 356 +++++++++++++++++++++++++++++++++++++ 1 file changed, 356 insertions(+) create mode 100644 linux-headers/linux/idxd.h diff --git a/linux-headers/linux/idxd.h b/linux-headers/linux/idxd.h new file mode 100644 index 0000000000..1d553bedbd --- /dev/null +++ b/linux-headers/linux/idxd.h @@ -0,0 +1,356 @@ +/* SPDX-License-Identifier: LGPL-2.1 WITH Linux-syscall-note */ +/* Copyright(c) 2019 Intel Corporation. All rights rsvd. */ +#ifndef _USR_IDXD_H_ +#define _USR_IDXD_H_ + +#ifdef __KERNEL__ +#include +#else +#include +#endif + +/* Driver command error status */ +enum idxd_scmd_stat { + IDXD_SCMD_DEV_ENABLED = 0x80000010, + IDXD_SCMD_DEV_NOT_ENABLED = 0x80000020, + IDXD_SCMD_WQ_ENABLED = 0x80000021, + IDXD_SCMD_DEV_DMA_ERR = 0x80020000, + IDXD_SCMD_WQ_NO_GRP = 0x80030000, + IDXD_SCMD_WQ_NO_NAME = 0x80040000, + IDXD_SCMD_WQ_NO_SVM = 0x80050000, + IDXD_SCMD_WQ_NO_THRESH = 0x80060000, + IDXD_SCMD_WQ_PORTAL_ERR = 0x80070000, + IDXD_SCMD_WQ_RES_ALLOC_ERR = 0x80080000, + IDXD_SCMD_PERCPU_ERR = 0x80090000, + IDXD_SCMD_DMA_CHAN_ERR = 0x800a0000, + IDXD_SCMD_CDEV_ERR = 0x800b0000, + IDXD_SCMD_WQ_NO_SWQ_SUPPORT = 0x800c0000, + IDXD_SCMD_WQ_NONE_CONFIGURED = 0x800d0000, + IDXD_SCMD_WQ_NO_SIZE = 0x800e0000, + IDXD_SCMD_WQ_NO_PRIV = 0x800f0000, + IDXD_SCMD_WQ_IRQ_ERR = 0x80100000, + IDXD_SCMD_WQ_USER_NO_IOMMU = 0x80110000, +}; + +#define IDXD_SCMD_SOFTERR_MASK 0x80000000 +#define IDXD_SCMD_SOFTERR_SHIFT 16 + +/* Descriptor flags */ +#define IDXD_OP_FLAG_FENCE 0x0001 +#define IDXD_OP_FLAG_BOF 0x0002 +#define IDXD_OP_FLAG_CRAV 0x0004 +#define IDXD_OP_FLAG_RCR 0x0008 +#define IDXD_OP_FLAG_RCI 0x0010 +#define IDXD_OP_FLAG_CRSTS 0x0020 +#define IDXD_OP_FLAG_CR 0x0080 +#define IDXD_OP_FLAG_CC 0x0100 +#define IDXD_OP_FLAG_ADDR1_TCS 0x0200 +#define IDXD_OP_FLAG_ADDR2_TCS 0x0400 +#define IDXD_OP_FLAG_ADDR3_TCS 0x0800 +#define IDXD_OP_FLAG_CR_TCS 0x1000 +#define IDXD_OP_FLAG_STORD 0x2000 +#define IDXD_OP_FLAG_DRDBK 0x4000 +#define IDXD_OP_FLAG_DSTS 0x8000 + +/* IAX */ +#define IDXD_OP_FLAG_RD_SRC2_AECS 0x010000 +#define IDXD_OP_FLAG_RD_SRC2_2ND 0x020000 +#define IDXD_OP_FLAG_WR_SRC2_AECS_COMP 0x040000 +#define IDXD_OP_FLAG_WR_SRC2_AECS_OVFL 0x080000 +#define IDXD_OP_FLAG_SRC2_STS 0x100000 +#define IDXD_OP_FLAG_CRC_RFC3720 0x200000 + +/* Opcode */ +enum dsa_opcode { + DSA_OPCODE_NOOP = 0, + DSA_OPCODE_BATCH, + DSA_OPCODE_DRAIN, + DSA_OPCODE_MEMMOVE, + DSA_OPCODE_MEMFILL, + DSA_OPCODE_COMPARE, + DSA_OPCODE_COMPVAL, + DSA_OPCODE_CR_DELTA, + DSA_OPCODE_AP_DELTA, + DSA_OPCODE_DUALCAST, + DSA_OPCODE_CRCGEN = 0x10, + DSA_OPCODE_COPY_CRC, + DSA_OPCODE_DIF_CHECK, + DSA_OPCODE_DIF_INS, + DSA_OPCODE_DIF_STRP, + DSA_OPCODE_DIF_UPDT, + DSA_OPCODE_CFLUSH = 0x20, +}; + +enum iax_opcode { + IAX_OPCODE_NOOP = 0, + IAX_OPCODE_DRAIN = 2, + IAX_OPCODE_MEMMOVE, + IAX_OPCODE_DECOMPRESS = 0x42, + IAX_OPCODE_COMPRESS, + IAX_OPCODE_CRC64, + IAX_OPCODE_ZERO_DECOMP_32 = 0x48, + IAX_OPCODE_ZERO_DECOMP_16, + IAX_OPCODE_ZERO_COMP_32 = 0x4c, + IAX_OPCODE_ZERO_COMP_16, + IAX_OPCODE_SCAN = 0x50, + IAX_OPCODE_SET_MEMBER, + IAX_OPCODE_EXTRACT, + IAX_OPCODE_SELECT, + IAX_OPCODE_RLE_BURST, + IAX_OPCODE_FIND_UNIQUE, + IAX_OPCODE_EXPAND, +}; + +/* Completion record status */ +enum dsa_completion_status { + DSA_COMP_NONE = 0, + DSA_COMP_SUCCESS, + DSA_COMP_SUCCESS_PRED, + DSA_COMP_PAGE_FAULT_NOBOF, + DSA_COMP_PAGE_FAULT_IR, + DSA_COMP_BATCH_FAIL, + DSA_COMP_BATCH_PAGE_FAULT, + DSA_COMP_DR_OFFSET_NOINC, + DSA_COMP_DR_OFFSET_ERANGE, + DSA_COMP_DIF_ERR, + DSA_COMP_BAD_OPCODE = 0x10, + DSA_COMP_INVALID_FLAGS, + DSA_COMP_NOZERO_RESERVE, + DSA_COMP_XFER_ERANGE, + DSA_COMP_DESC_CNT_ERANGE, + DSA_COMP_DR_ERANGE, + DSA_COMP_OVERLAP_BUFFERS, + DSA_COMP_DCAST_ERR, + DSA_COMP_DESCLIST_ALIGN, + DSA_COMP_INT_HANDLE_INVAL, + DSA_COMP_CRA_XLAT, + DSA_COMP_CRA_ALIGN, + DSA_COMP_ADDR_ALIGN, + DSA_COMP_PRIV_BAD, + DSA_COMP_TRAFFIC_CLASS_CONF, + DSA_COMP_PFAULT_RDBA, + DSA_COMP_HW_ERR1, + DSA_COMP_HW_ERR_DRB, + DSA_COMP_TRANSLATION_FAIL, +}; + +enum iax_completion_status { + IAX_COMP_NONE = 0, + IAX_COMP_SUCCESS, + IAX_COMP_PAGE_FAULT_IR = 0x04, + IAX_COMP_ANALYTICS_ERROR = 0x0a, + IAX_COMP_OUTBUF_OVERFLOW, + IAX_COMP_BAD_OPCODE = 0x10, + IAX_COMP_INVALID_FLAGS, + IAX_COMP_NOZERO_RESERVE, + IAX_COMP_INVALID_SIZE, + IAX_COMP_OVERLAP_BUFFERS = 0x16, + IAX_COMP_INT_HANDLE_INVAL = 0x19, + IAX_COMP_CRA_XLAT, + IAX_COMP_CRA_ALIGN, + IAX_COMP_ADDR_ALIGN, + IAX_COMP_PRIV_BAD, + IAX_COMP_TRAFFIC_CLASS_CONF, + IAX_COMP_PFAULT_RDBA, + IAX_COMP_HW_ERR1, + IAX_COMP_HW_ERR_DRB, + IAX_COMP_TRANSLATION_FAIL, + IAX_COMP_PRS_TIMEOUT, + IAX_COMP_WATCHDOG, + IAX_COMP_INVALID_COMP_FLAG = 0x30, + IAX_COMP_INVALID_FILTER_FLAG, + IAX_COMP_INVALID_INPUT_SIZE, + IAX_COMP_INVALID_NUM_ELEMS, + IAX_COMP_INVALID_SRC1_WIDTH, + IAX_COMP_INVALID_INVERT_OUT, +}; + +#define DSA_COMP_STATUS_MASK 0x7f +#define DSA_COMP_STATUS_WRITE 0x80 + +struct dsa_hw_desc { + uint32_t pasid:20; + uint32_t rsvd:11; + uint32_t priv:1; + uint32_t flags:24; + uint32_t opcode:8; + uint64_t completion_addr; + union { + uint64_t src_addr; + uint64_t rdback_addr; + uint64_t pattern; + uint64_t desc_list_addr; + }; + union { + uint64_t dst_addr; + uint64_t rdback_addr2; + uint64_t src2_addr; + uint64_t comp_pattern; + }; + union { + uint32_t xfer_size; + uint32_t desc_count; + }; + uint16_t int_handle; + uint16_t rsvd1; + union { + uint8_t expected_res; + /* create delta record */ + struct { + uint64_t delta_addr; + uint32_t max_delta_size; + uint32_t delt_rsvd; + uint8_t expected_res_mask; + }; + uint32_t delta_rec_size; + uint64_t dest2; + /* CRC */ + struct { + uint32_t crc_seed; + uint32_t crc_rsvd; + uint64_t seed_addr; + }; + /* DIF check or strip */ + struct { + uint8_t src_dif_flags; + uint8_t dif_chk_res; + uint8_t dif_chk_flags; + uint8_t dif_chk_res2[5]; + uint32_t chk_ref_tag_seed; + uint16_t chk_app_tag_mask; + uint16_t chk_app_tag_seed; + }; + /* DIF insert */ + struct { + uint8_t dif_ins_res; + uint8_t dest_dif_flag; + uint8_t dif_ins_flags; + uint8_t dif_ins_res2[13]; + uint32_t ins_ref_tag_seed; + uint16_t ins_app_tag_mask; + uint16_t ins_app_tag_seed; + }; + /* DIF update */ + struct { + uint8_t src_upd_flags; + uint8_t upd_dest_flags; + uint8_t dif_upd_flags; + uint8_t dif_upd_res[5]; + uint32_t src_ref_tag_seed; + uint16_t src_app_tag_mask; + uint16_t src_app_tag_seed; + uint32_t dest_ref_tag_seed; + uint16_t dest_app_tag_mask; + uint16_t dest_app_tag_seed; + }; + + uint8_t op_specific[24]; + }; +} __attribute__((packed)); + +struct iax_hw_desc { + uint32_t pasid:20; + uint32_t rsvd:11; + uint32_t priv:1; + uint32_t flags:24; + uint32_t opcode:8; + uint64_t completion_addr; + uint64_t src1_addr; + uint64_t dst_addr; + uint32_t src1_size; + uint16_t int_handle; + union { + uint16_t compr_flags; + uint16_t decompr_flags; + }; + uint64_t src2_addr; + uint32_t max_dst_size; + uint32_t src2_size; + uint32_t filter_flags; + uint32_t num_inputs; +} __attribute__((packed)); + +struct dsa_raw_desc { + uint64_t field[8]; +} __attribute__((packed)); + +/* + * The status field will be modified by hardware, therefore it should be + * volatile and prevent the compiler from optimize the read. + */ +struct dsa_completion_record { + volatile uint8_t status; + union { + uint8_t result; + uint8_t dif_status; + }; + uint16_t rsvd; + uint32_t bytes_completed; + uint64_t fault_addr; + union { + /* common record */ + struct { + uint32_t invalid_flags:24; + uint32_t rsvd2:8; + }; + + uint32_t delta_rec_size; + uint64_t crc_val; + + /* DIF check & strip */ + struct { + uint32_t dif_chk_ref_tag; + uint16_t dif_chk_app_tag_mask; + uint16_t dif_chk_app_tag; + }; + + /* DIF insert */ + struct { + uint64_t dif_ins_res; + uint32_t dif_ins_ref_tag; + uint16_t dif_ins_app_tag_mask; + uint16_t dif_ins_app_tag; + }; + + /* DIF update */ + struct { + uint32_t dif_upd_src_ref_tag; + uint16_t dif_upd_src_app_tag_mask; + uint16_t dif_upd_src_app_tag; + uint32_t dif_upd_dest_ref_tag; + uint16_t dif_upd_dest_app_tag_mask; + uint16_t dif_upd_dest_app_tag; + }; + + uint8_t op_specific[16]; + }; +} __attribute__((packed)); + +struct dsa_raw_completion_record { + uint64_t field[4]; +} __attribute__((packed)); + +struct iax_completion_record { + volatile uint8_t status; + uint8_t error_code; + uint16_t rsvd; + uint32_t bytes_completed; + uint64_t fault_addr; + uint32_t invalid_flags; + uint32_t rsvd2; + uint32_t output_size; + uint8_t output_bits; + uint8_t rsvd3; + uint16_t xor_csum; + uint32_t crc; + uint32_t min; + uint32_t max; + uint32_t sum; + uint64_t rsvd4[2]; +} __attribute__((packed)); + +struct iax_raw_completion_record { + uint64_t field[8]; +} __attribute__((packed)); + +#endif From patchwork Wed Oct 25 19:38:10 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hao Xiang X-Patchwork-Id: 1855252 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=bytedance.com header.i=@bytedance.com header.a=rsa-sha256 header.s=google header.b=Ou1NDxwz; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=patchwork.ozlabs.org) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-ECDSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4SFznZ41mpz23jn for ; Thu, 26 Oct 2023 06:41:22 +1100 (AEDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qvjj9-0000ro-Od; Wed, 25 Oct 2023 15:39:39 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qvjj7-0000r3-V5 for qemu-devel@nongnu.org; Wed, 25 Oct 2023 15:39:37 -0400 Received: from mail-oo1-xc30.google.com ([2607:f8b0:4864:20::c30]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1qvjj5-00068U-Mi for qemu-devel@nongnu.org; Wed, 25 Oct 2023 15:39:37 -0400 Received: by mail-oo1-xc30.google.com with SMTP id 006d021491bc7-581b6b93bd1so98031eaf.1 for ; Wed, 25 Oct 2023 12:39:35 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1698262774; x=1698867574; darn=nongnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=k0ANuzGnWOEFY5fWYAHlg+spNpTdLgGmhcWmwSge4rg=; b=Ou1NDxwzxseImjq5CdH69l6PjrdSsblSGfKMgI1vmaGknmAOCRBfWhcZe4IQD6X9kc p/74ks4Q53XRqnXq2rf3EFhuPJ6Gd3x6pqkazpSTj2k+pBNn7hLVdNEEIq6rweN1CWSh aFw5MBJhfPgftc1Y5hm5BPdpoqt5lbo9uTBdQUVrBw86s1ZXl94qMkijkazIX+YSeGrj TmqX2VAQxmUnQXzpx00Ot86UGdID/WEGMFxbBLCKl1VS+U83I+jAEsAw30nKzg0QxLjt ojwX+lgTMu4JSn9tq2+/RsjHFRCC3DyCB/a0oZEjH3CC5IKQFephnnaRRVK3ZmxSB5nb c5Dw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1698262774; x=1698867574; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=k0ANuzGnWOEFY5fWYAHlg+spNpTdLgGmhcWmwSge4rg=; b=gd8rkT6qqNXtX1gXNyZSNBoxA44DE6jQhMoXjXXwP5yWyJ3M31gxefDdmkCx+qXqGe r20Yg8Ru3zoseRiBhTEwyeiLD84tv0XdMCSmPkVBQJLd9w2ii/wdRLVqce6ziiuK1q58 ldavgXBB37xxMTgRbcNEIoXU6zqkZOGcmmZFPPBUCAAc2xOuyHiFy8dLaPvjDixYDTWO Bfd3oUIp6eiupp9FvzHhm3lYjjtutW7yaEBCm7clOoG7P0hbUE2ICUQ2l2d/uA503M1F 9GaPTNYptUyFzMs8JcbO4gmwNF9roJhsNznEVxO0aBTGnc8Ipy+0OojZGFgOqRBqDe0Q ieIw== X-Gm-Message-State: AOJu0YxbfMrI4RpTKsJfh0hOIc9un6DaAzMqfNasQKWHEqMb2gskGg3L 5xB+BKcScs75ymomxygedrq/vg== X-Google-Smtp-Source: AGHT+IFBilC6466lqF/eda52hPHX3w8gTnz8NDR3vD8wB+uzfpBleORPytdybh1pxpgDRLHW7JmgYQ== X-Received: by 2002:a05:6358:cd04:b0:168:d0a3:202f with SMTP id gv4-20020a056358cd0400b00168d0a3202fmr13387175rwb.15.1698262774119; Wed, 25 Oct 2023 12:39:34 -0700 (PDT) Received: from n231-230-216.byted.org ([147.160.184.135]) by smtp.gmail.com with ESMTPSA id o8-20020a05620a228800b0076cdc3b5beasm4453721qkh.86.2023.10.25.12.39.33 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 25 Oct 2023 12:39:33 -0700 (PDT) From: Hao Xiang To: quintela@redhat.com, peterx@redhat.com, marcandre.lureau@redhat.com, bryan.zhang@bytedance.com, qemu-devel@nongnu.org Cc: Hao Xiang Subject: [PATCH 04/16] util/dsa: Implement DSA device start and stop logic. Date: Wed, 25 Oct 2023 19:38:10 +0000 Message-Id: <20231025193822.2813204-5-hao.xiang@bytedance.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20231025193822.2813204-1-hao.xiang@bytedance.com> References: <20231025193822.2813204-1-hao.xiang@bytedance.com> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::c30; envelope-from=hao.xiang@bytedance.com; helo=mail-oo1-xc30.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org * DSA device open and close. * DSA group contains multiple DSA devices. * DSA group configure/start/stop/clean. Signed-off-by: Hao Xiang Signed-off-by: Bryan Zhang --- include/qemu/dsa.h | 49 +++++++ util/dsa.c | 338 +++++++++++++++++++++++++++++++++++++++++++++ util/meson.build | 1 + 3 files changed, 388 insertions(+) create mode 100644 include/qemu/dsa.h create mode 100644 util/dsa.c diff --git a/include/qemu/dsa.h b/include/qemu/dsa.h new file mode 100644 index 0000000000..30246b507e --- /dev/null +++ b/include/qemu/dsa.h @@ -0,0 +1,49 @@ +#ifndef QEMU_DSA_H +#define QEMU_DSA_H + +#include "qemu/thread.h" +#include "qemu/queue.h" + +#ifdef CONFIG_DSA_OPT + +#pragma GCC push_options +#pragma GCC target("enqcmd") + +#include +#include "x86intrin.h" + +#endif + +/** + * @brief Initializes DSA devices. + * + * @param dsa_parameter A list of DSA device path from migration parameter. + * @return int Zero if successful, otherwise non zero. + */ +int dsa_init(const char *dsa_parameter); + +/** + * @brief Start logic to enable using DSA. + */ +void dsa_start(void); + +/** + * @brief Stop logic to clean up DSA by halting the device group and cleaning up + * the completion thread. + */ +void dsa_stop(void); + +/** + * @brief Clean up system resources created for DSA offloading. + * This function is called during QEMU process teardown. + */ +void dsa_cleanup(void); + +/** + * @brief Check if DSA is running. + * + * @return True if DSA is running, otherwise false. + */ +bool dsa_is_running(void); + +#endif \ No newline at end of file diff --git a/util/dsa.c b/util/dsa.c new file mode 100644 index 0000000000..8edaa892ec --- /dev/null +++ b/util/dsa.c @@ -0,0 +1,338 @@ +/* + * Use Intel Data Streaming Accelerator to offload certain background + * operations. + * + * Copyright (c) 2023 Hao Xiang + * Bryan Zhang + * + * Permission is hereby granted, free of charge, to any person obtaining a copy + * of this software and associated documentation files (the "Software"), to deal + * in the Software without restriction, including without limitation the rights + * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell + * copies of the Software, and to permit persons to whom the Software is + * furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be included in + * all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, + * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN + * THE SOFTWARE. + */ + +#include "qemu/osdep.h" +#include "qemu/queue.h" +#include "qemu/memalign.h" +#include "qemu/lockable.h" +#include "qemu/cutils.h" +#include "qemu/dsa.h" +#include "qemu/bswap.h" +#include "qemu/error-report.h" +#include "qemu/rcu.h" + +#ifdef CONFIG_DSA_OPT + +#pragma GCC push_options +#pragma GCC target("enqcmd") + +#include +#include "x86intrin.h" + +#define DSA_WQ_SIZE 4096 +#define MAX_DSA_DEVICES 16 + +typedef QSIMPLEQ_HEAD(dsa_task_queue, buffer_zero_batch_task) dsa_task_queue; + +struct dsa_device { + void *work_queue; +}; + +struct dsa_device_group { + struct dsa_device *dsa_devices; + int num_dsa_devices; + uint32_t index; + bool running; + QemuMutex task_queue_lock; + QemuCond task_queue_cond; + dsa_task_queue task_queue; +}; + +uint64_t max_retry_count; +static struct dsa_device_group dsa_group; + + +/** + * @brief This function opens a DSA device's work queue and + * maps the DSA device memory into the current process. + * + * @param dsa_wq_path A pointer to the DSA device work queue's file path. + * @return A pointer to the mapped memory. + */ +static void * +map_dsa_device(const char *dsa_wq_path) +{ + void *dsa_device; + int fd; + + fd = open(dsa_wq_path, O_RDWR); + if (fd < 0) { + fprintf(stderr, "open %s failed with errno = %d.\n", + dsa_wq_path, errno); + return MAP_FAILED; + } + dsa_device = mmap(NULL, DSA_WQ_SIZE, PROT_WRITE, + MAP_SHARED | MAP_POPULATE, fd, 0); + close(fd); + if (dsa_device == MAP_FAILED) { + fprintf(stderr, "mmap failed with errno = %d.\n", errno); + return MAP_FAILED; + } + return dsa_device; +} + +/** + * @brief Initializes a DSA device structure. + * + * @param instance A pointer to the DSA device. + * @param work_queue A pointer to the DSA work queue. + */ +static void +dsa_device_init(struct dsa_device *instance, + void *dsa_work_queue) +{ + instance->work_queue = dsa_work_queue; +} + +/** + * @brief Cleans up a DSA device structure. + * + * @param instance A pointer to the DSA device to cleanup. + */ +static void +dsa_device_cleanup(struct dsa_device *instance) +{ + if (instance->work_queue != MAP_FAILED) { + munmap(instance->work_queue, DSA_WQ_SIZE); + } +} + +/** + * @brief Initializes a DSA device group. + * + * @param group A pointer to the DSA device group. + * @param num_dsa_devices The number of DSA devices this group will have. + * + * @return Zero if successful, non-zero otherwise. + */ +static int +dsa_device_group_init(struct dsa_device_group *group, + const char *dsa_parameter) +{ + if (dsa_parameter == NULL || strlen(dsa_parameter) == 0) { + return 0; + } + + int ret = 0; + char *local_dsa_parameter = g_strdup(dsa_parameter); + const char *dsa_path[MAX_DSA_DEVICES]; + int num_dsa_devices = 0; + char delim[2] = " "; + + char *current_dsa_path = strtok(local_dsa_parameter, delim); + + while (current_dsa_path != NULL) { + dsa_path[num_dsa_devices++] = current_dsa_path; + if (num_dsa_devices == MAX_DSA_DEVICES) { + break; + } + current_dsa_path = strtok(NULL, delim); + } + + group->dsa_devices = + malloc(sizeof(struct dsa_device) * num_dsa_devices); + group->num_dsa_devices = num_dsa_devices; + group->index = 0; + + group->running = false; + qemu_mutex_init(&group->task_queue_lock); + qemu_cond_init(&group->task_queue_cond); + QSIMPLEQ_INIT(&group->task_queue); + + void *dsa_wq = MAP_FAILED; + for (int i = 0; i < num_dsa_devices; i++) { + dsa_wq = map_dsa_device(dsa_path[i]); + if (dsa_wq == MAP_FAILED) { + fprintf(stderr, "map_dsa_device failed MAP_FAILED, " + "using simulation.\n"); + ret = -1; + goto exit; + } + dsa_device_init(&dsa_group.dsa_devices[i], dsa_wq); + } + +exit: + g_free(local_dsa_parameter); + return ret; +} + +/** + * @brief Starts a DSA device group. + * + * @param group A pointer to the DSA device group. + * @param dsa_path An array of DSA device path. + * @param num_dsa_devices The number of DSA devices in the device group. + */ +static void +dsa_device_group_start(struct dsa_device_group *group) +{ + group->running = true; +} + +/** + * @brief Stops a DSA device group. + * + * @param group A pointer to the DSA device group. + */ +__attribute__((unused)) +static void +dsa_device_group_stop(struct dsa_device_group *group) +{ + group->running = false; +} + +/** + * @brief Cleans up a DSA device group. + * + * @param group A pointer to the DSA device group. + */ +static void +dsa_device_group_cleanup(struct dsa_device_group *group) +{ + if (!group->dsa_devices) { + return; + } + for (int i = 0; i < group->num_dsa_devices; i++) { + dsa_device_cleanup(&group->dsa_devices[i]); + } + free(group->dsa_devices); + group->dsa_devices = NULL; + + qemu_mutex_destroy(&group->task_queue_lock); + qemu_cond_destroy(&group->task_queue_cond); +} + +/** + * @brief Returns the next available DSA device in the group. + * + * @param group A pointer to the DSA device group. + * + * @return struct dsa_device* A pointer to the next available DSA device + * in the group. + */ +__attribute__((unused)) +static struct dsa_device * +dsa_device_group_get_next_device(struct dsa_device_group *group) +{ + if (group->num_dsa_devices == 0) { + return NULL; + } + uint32_t current = qatomic_fetch_inc(&group->index); + current %= group->num_dsa_devices; + return &group->dsa_devices[current]; +} + +/** + * @brief Check if DSA is running. + * + * @return True if DSA is running, otherwise false. + */ +bool dsa_is_running(void) +{ + return false; +} + +static void +dsa_globals_init(void) +{ + max_retry_count = UINT64_MAX; +} + +/** + * @brief Initializes DSA devices. + * + * @param dsa_parameter A list of DSA device path from migration parameter. + * @return int Zero if successful, otherwise non zero. + */ +int dsa_init(const char *dsa_parameter) +{ + dsa_globals_init(); + + return dsa_device_group_init(&dsa_group, dsa_parameter); +} + +/** + * @brief Start logic to enable using DSA. + * + */ +void dsa_start(void) +{ + if (dsa_group.num_dsa_devices == 0) { + return; + } + if (dsa_group.running) { + return; + } + dsa_device_group_start(&dsa_group); +} + +/** + * @brief Stop logic to clean up DSA by halting the device group and cleaning up + * the completion thread. + * + */ +void dsa_stop(void) +{ + struct dsa_device_group *group = &dsa_group; + + if (!group->running) { + return; + } +} + +/** + * @brief Clean up system resources created for DSA offloading. + * This function is called during QEMU process teardown. + * + */ +void dsa_cleanup(void) +{ + dsa_stop(); + dsa_device_group_cleanup(&dsa_group); +} + +#else + +bool dsa_is_running(void) +{ + return false; +} + +int dsa_init(const char *dsa_parameter) +{ + fprintf(stderr, "Intel Data Streaming Accelerator is not supported " + "on this platform.\n"); + return -1; +} + +void dsa_start(void) {} + +void dsa_stop(void) {} + +void dsa_cleanup(void) {} + +#endif + diff --git a/util/meson.build b/util/meson.build index c4827fd70a..96b916a981 100644 --- a/util/meson.build +++ b/util/meson.build @@ -83,6 +83,7 @@ if have_block or have_ga endif if have_block util_ss.add(files('aio-wait.c')) + util_ss.add(files('dsa.c')) util_ss.add(files('buffer.c')) util_ss.add(files('bufferiszero.c')) util_ss.add(files('hbitmap.c')) From patchwork Wed Oct 25 19:38:11 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hao Xiang X-Patchwork-Id: 1855248 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=bytedance.com header.i=@bytedance.com header.a=rsa-sha256 header.s=google header.b=LDvwkcyf; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=patchwork.ozlabs.org) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-ECDSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4SFzmY4jg8z23jn for ; Thu, 26 Oct 2023 06:40:29 +1100 (AEDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qvjj9-0000rL-Bh; Wed, 25 Oct 2023 15:39:39 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qvjj8-0000rB-5H for qemu-devel@nongnu.org; Wed, 25 Oct 2023 15:39:38 -0400 Received: from mail-qk1-x732.google.com ([2607:f8b0:4864:20::732]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1qvjj6-00068g-An for qemu-devel@nongnu.org; Wed, 25 Oct 2023 15:39:37 -0400 Received: by mail-qk1-x732.google.com with SMTP id af79cd13be357-77774120c6eso9824285a.2 for ; Wed, 25 Oct 2023 12:39:35 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1698262775; x=1698867575; darn=nongnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=T/zXpgGhSxq4NM4ugG+KYN+fyKZfpH/m9Za1mhIGsuQ=; b=LDvwkcyfo3eY5eQL2CAAwBK1NlTISACUk71xpbDuzV2S7aHVCLQpKStg8JUu0/TsQ0 4vQZQrWwrP8JgEUXK1SDpzcAVmI93cMXB/KDElu0CLDfl5N4LspeTMTtcHPAmjcbUyX7 PQEd6rkqN280jPXhHv0D/ax44ck/gPBfAC/mDhJDKEUceuWeGbqH2UnEMhJWisdIC/Ib YPAa1+3yvg44XBOvGtboK5g7kD1jUQZm1IPXATOXjGJKxFE5koRXQ2eDEztGYSx20Fxb xXE4EULzayInQh0RS+k+Y/sM1KOAj3+uxCDyjqUw1bXGRXlfqJ0/GQHwNhc+p379qah6 DVRQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1698262775; x=1698867575; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=T/zXpgGhSxq4NM4ugG+KYN+fyKZfpH/m9Za1mhIGsuQ=; b=YglRjlRe/eV7FMUcPb+esZPbsnjA7lK8wrr3qxLl7asmhY3/MfCyiPeEG46o2E5AjL r2uDWqKipGxUaC5iaEiXgjFFKaOU7FO1q+CsVWvtoJyuDJlApbauNkyGhP4LPRHr9xWP 4S05ehm/Zkx85StG7XR7m/npPuQogX5MTGGNFXr+AXpPixClvrgyGPuHUpffNYOc2yte vr8hn6C5TXvXLnL6ZKBeaE9NY7gPZ++K5monMPdwtP1fU0VkINXUjbm2nWzxbcls40nU 6V1ROlbVcPmFsbfFMowjkwS+5jKePPTlmrv7wJB41UwvjYYO42aYG1bVJY3FKZwvZ0WV Tp/w== X-Gm-Message-State: AOJu0YygjFsMTr/eaHCCK296FmPdYNElChqnoi3MpMNTg9NV+3xdr6Xk WNDIOgX8K0hgSbkBuIW3asg9RGwBi1+pikH3RQw= X-Google-Smtp-Source: AGHT+IE9u/7e0EMrhLelZiYC+gRENhD59MfzMbBkPksiTEkteN0sGrsjrRPd5xILno4on9TLkINyFw== X-Received: by 2002:a05:620a:3189:b0:777:1e9:f71c with SMTP id bi9-20020a05620a318900b0077701e9f71cmr19216533qkb.56.1698262775438; Wed, 25 Oct 2023 12:39:35 -0700 (PDT) Received: from n231-230-216.byted.org ([147.160.184.135]) by smtp.gmail.com with ESMTPSA id o8-20020a05620a228800b0076cdc3b5beasm4453721qkh.86.2023.10.25.12.39.35 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 25 Oct 2023 12:39:35 -0700 (PDT) From: Hao Xiang To: quintela@redhat.com, peterx@redhat.com, marcandre.lureau@redhat.com, bryan.zhang@bytedance.com, qemu-devel@nongnu.org Cc: Hao Xiang Subject: [PATCH 05/16] util/dsa: Implement DSA task enqueue and dequeue. Date: Wed, 25 Oct 2023 19:38:11 +0000 Message-Id: <20231025193822.2813204-6-hao.xiang@bytedance.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20231025193822.2813204-1-hao.xiang@bytedance.com> References: <20231025193822.2813204-1-hao.xiang@bytedance.com> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::732; envelope-from=hao.xiang@bytedance.com; helo=mail-qk1-x732.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org * Use a safe thread queue for DSA task enqueue/dequeue. * Implement DSA task submission. * Implement DSA batch task submission. Signed-off-by: Hao Xiang --- include/qemu/dsa.h | 35 ++++++++ util/dsa.c | 196 +++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 231 insertions(+) diff --git a/include/qemu/dsa.h b/include/qemu/dsa.h index 30246b507e..23f55185be 100644 --- a/include/qemu/dsa.h +++ b/include/qemu/dsa.h @@ -12,6 +12,41 @@ #include #include "x86intrin.h" +enum dsa_task_type { + DSA_TASK = 0, + DSA_BATCH_TASK +}; + +enum dsa_task_status { + DSA_TASK_READY = 0, + DSA_TASK_PROCESSING, + DSA_TASK_COMPLETION +}; + +typedef void (*buffer_zero_dsa_completion_fn)(void *); + +typedef struct buffer_zero_batch_task { + struct dsa_hw_desc batch_descriptor; + struct dsa_hw_desc *descriptors; + struct dsa_completion_record batch_completion __attribute__((aligned(32))); + struct dsa_completion_record *completions; + struct dsa_device_group *group; + struct dsa_device *device; + buffer_zero_dsa_completion_fn completion_callback; + QemuSemaphore sem_task_complete; + enum dsa_task_type task_type; + enum dsa_task_status status; + bool *results; + int batch_size; + QSIMPLEQ_ENTRY(buffer_zero_batch_task) entry; +} buffer_zero_batch_task; + +#else + +struct buffer_zero_batch_task { + bool *results; +}; + #endif /** diff --git a/util/dsa.c b/util/dsa.c index 8edaa892ec..f82282ce99 100644 --- a/util/dsa.c +++ b/util/dsa.c @@ -245,6 +245,200 @@ dsa_device_group_get_next_device(struct dsa_device_group *group) return &group->dsa_devices[current]; } +/** + * @brief Empties out the DSA task queue. + * + * @param group A pointer to the DSA device group. + */ +static void +dsa_empty_task_queue(struct dsa_device_group *group) +{ + qemu_mutex_lock(&group->task_queue_lock); + dsa_task_queue *task_queue = &group->task_queue; + while (!QSIMPLEQ_EMPTY(task_queue)) { + QSIMPLEQ_REMOVE_HEAD(task_queue, entry); + } + qemu_mutex_unlock(&group->task_queue_lock); +} + +/** + * @brief Adds a task to the DSA task queue. + * + * @param group A pointer to the DSA device group. + * @param context A pointer to the DSA task to enqueue. + * + * @return int Zero if successful, otherwise a proper error code. + */ +static int +dsa_task_enqueue(struct dsa_device_group *group, + struct buffer_zero_batch_task *task) +{ + dsa_task_queue *task_queue = &group->task_queue; + QemuMutex *task_queue_lock = &group->task_queue_lock; + QemuCond *task_queue_cond = &group->task_queue_cond; + + bool notify = false; + + qemu_mutex_lock(task_queue_lock); + + if (!group->running) { + fprintf(stderr, "DSA: Tried to queue task to stopped device queue\n"); + qemu_mutex_unlock(task_queue_lock); + return -1; + } + + // The queue is empty. This enqueue operation is a 0->1 transition. + if (QSIMPLEQ_EMPTY(task_queue)) + notify = true; + + QSIMPLEQ_INSERT_TAIL(task_queue, task, entry); + + // We need to notify the waiter for 0->1 transitions. + if (notify) + qemu_cond_signal(task_queue_cond); + + qemu_mutex_unlock(task_queue_lock); + + return 0; +} + +/** + * @brief Takes a DSA task out of the task queue. + * + * @param group A pointer to the DSA device group. + * @return buffer_zero_batch_task* The DSA task being dequeued. + */ +__attribute__((unused)) +static struct buffer_zero_batch_task * +dsa_task_dequeue(struct dsa_device_group *group) +{ + struct buffer_zero_batch_task *task = NULL; + dsa_task_queue *task_queue = &group->task_queue; + QemuMutex *task_queue_lock = &group->task_queue_lock; + QemuCond *task_queue_cond = &group->task_queue_cond; + + qemu_mutex_lock(task_queue_lock); + + while (true) { + if (!group->running) + goto exit; + task = QSIMPLEQ_FIRST(task_queue); + if (task != NULL) { + break; + } + qemu_cond_wait(task_queue_cond, task_queue_lock); + } + + QSIMPLEQ_REMOVE_HEAD(task_queue, entry); + +exit: + qemu_mutex_unlock(task_queue_lock); + return task; +} + +/** + * @brief Submits a DSA work item to the device work queue. + * + * @param wq A pointer to the DSA work queue's device memory. + * @param descriptor A pointer to the DSA work item descriptor. + * + * @return Zero if successful, non-zero otherwise. + */ +static int +submit_wi_int(void *wq, struct dsa_hw_desc *descriptor) +{ + uint64_t retry = 0; + + _mm_sfence(); + + while (true) { + if (_enqcmd(wq, descriptor) == 0) { + break; + } + retry++; + if (retry > max_retry_count) { + fprintf(stderr, "Submit work retry %lu times.\n", retry); + exit(1); + } + } + + return 0; +} + +/** + * @brief Synchronously submits a DSA work item to the + * device work queue. + * + * @param wq A pointer to the DSA worjk queue's device memory. + * @param descriptor A pointer to the DSA work item descriptor. + * + * @return int Zero if successful, non-zero otherwise. + */ +__attribute__((unused)) +static int +submit_wi(void *wq, struct dsa_hw_desc *descriptor) +{ + return submit_wi_int(wq, descriptor); +} + +/** + * @brief Asynchronously submits a DSA work item to the + * device work queue. + * + * @param task A pointer to the buffer zero task. + * + * @return int Zero if successful, non-zero otherwise. + */ +__attribute__((unused)) +static int +submit_wi_async(struct buffer_zero_batch_task *task) +{ + struct dsa_device_group *device_group = task->group; + struct dsa_device *device_instance = task->device; + int ret; + + assert(task->task_type == DSA_TASK); + + task->status = DSA_TASK_PROCESSING; + + ret = submit_wi_int(device_instance->work_queue, + &task->descriptors[0]); + if (ret != 0) + return ret; + + return dsa_task_enqueue(device_group, task); +} + +/** + * @brief Asynchronously submits a DSA batch work item to the + * device work queue. + * + * @param batch_task A pointer to the batch buffer zero task. + * + * @return int Zero if successful, non-zero otherwise. + */ +__attribute__((unused)) +static int +submit_batch_wi_async(struct buffer_zero_batch_task *batch_task) +{ + struct dsa_device_group *device_group = batch_task->group; + struct dsa_device *device_instance = batch_task->device; + int ret; + + assert(batch_task->task_type == DSA_BATCH_TASK); + assert(batch_task->batch_descriptor.desc_count <= batch_task->batch_size); + assert(batch_task->status == DSA_TASK_READY); + + batch_task->status = DSA_TASK_PROCESSING; + + ret = submit_wi_int(device_instance->work_queue, + &batch_task->batch_descriptor); + if (ret != 0) + return ret; + + return dsa_task_enqueue(device_group, batch_task); +} + /** * @brief Check if DSA is running. * @@ -301,6 +495,8 @@ void dsa_stop(void) if (!group->running) { return; } + + dsa_empty_task_queue(group); } /** From patchwork Wed Oct 25 19:38:12 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hao Xiang X-Patchwork-Id: 1855246 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=bytedance.com header.i=@bytedance.com header.a=rsa-sha256 header.s=google header.b=V3wINHsx; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=patchwork.ozlabs.org) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-ECDSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4SFzmM2RTxz23jn for ; Thu, 26 Oct 2023 06:40:19 +1100 (AEDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qvjjB-0000sB-R1; Wed, 25 Oct 2023 15:39:41 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qvjjA-0000rz-EQ for qemu-devel@nongnu.org; Wed, 25 Oct 2023 15:39:40 -0400 Received: from mail-qv1-xf2c.google.com ([2607:f8b0:4864:20::f2c]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1qvjj8-000691-IO for qemu-devel@nongnu.org; Wed, 25 Oct 2023 15:39:40 -0400 Received: by mail-qv1-xf2c.google.com with SMTP id 6a1803df08f44-66cfd874520so756556d6.2 for ; Wed, 25 Oct 2023 12:39:37 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1698262777; x=1698867577; darn=nongnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=CLTEEKCthFs57uSw9LUby1piyRPg7rqXrvYS7xVttXM=; b=V3wINHsxfSZpOgEhyAw7ysR07Xh3C0fBUDXD0BZZQMUibLb3zyfS8as6Ulmmd+37rV pRuEE5zO425HOOH5OCKBWjDN3SgW2XkNCJTk4VT7TrRpnQ12z0RSJpTj48k0TSMTD9YH CHl1ugQVNh61V7P18lCUFebssFDAydfXGo3eKjkYrkq7LwoRhorE21IVUQnRVVaimPSh utVoW8hbeY+SAIs90ufkM8m6C4j7jpYa9pewiIZoUVnnkQ/OPMYn78v/LZUbKOU2AvYV 7QGk31ds/3I7asPB3BG19YE0+1gpUbBDTeUfjtUAWIaf/IVCRC1YrXZZ3iKnAdm37AES JWIw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1698262777; x=1698867577; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=CLTEEKCthFs57uSw9LUby1piyRPg7rqXrvYS7xVttXM=; b=KGPjqMqyzBwcQYPnQgsw0D021jP3Ze0EUdbX6IqMyVi6tWpjCnykFhxbVJdq9Xr2Ns d0cIyjsf28LQ/jUNJzpjc/EY8J21W7OD+jWv5FS5XIq269AVSbg+YOE8Ky6IV2OSX6Ji 9Nd/Jrl3ayi+IuffVmqRZUKOQzIrv60X5qgQjy/ZvXXNl9QunrjazQYbP9MwdAdQLip1 2/doYGeZeaAXtQRBIzP5VOjGSJy5F8OAvIYXbyvyQrykeyY9ugTZhjf3lby1hDQo2kc1 uQOoGZ7o+QwP+7kCOMQOAnVZEuBUVAKmFqxsx/uZq3tSECXAvUfsHdxABD/QNj6xr/KM 5LEA== X-Gm-Message-State: AOJu0YzFMsjJU4rJC4HEGGQJcS/c4U84+pEr0YJhok71kJtnN5X3qbuT LxOsEAvNGTTm6v7S5DVO6+zjkQ== X-Google-Smtp-Source: AGHT+IH0bHJZNyTzeLOzPEzX0h8hx60BC26KU+4v9x5tBYfjxVm1VVnWuKiHk7Y+u9PAWF8nBR0+OQ== X-Received: by 2002:a05:6214:5281:b0:66c:ff4f:a35f with SMTP id kj1-20020a056214528100b0066cff4fa35fmr17363092qvb.51.1698262776875; Wed, 25 Oct 2023 12:39:36 -0700 (PDT) Received: from n231-230-216.byted.org ([147.160.184.135]) by smtp.gmail.com with ESMTPSA id o8-20020a05620a228800b0076cdc3b5beasm4453721qkh.86.2023.10.25.12.39.36 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 25 Oct 2023 12:39:36 -0700 (PDT) From: Hao Xiang To: quintela@redhat.com, peterx@redhat.com, marcandre.lureau@redhat.com, bryan.zhang@bytedance.com, qemu-devel@nongnu.org Cc: Hao Xiang Subject: [PATCH 06/16] util/dsa: Implement DSA task asynchronous completion thread model. Date: Wed, 25 Oct 2023 19:38:12 +0000 Message-Id: <20231025193822.2813204-7-hao.xiang@bytedance.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20231025193822.2813204-1-hao.xiang@bytedance.com> References: <20231025193822.2813204-1-hao.xiang@bytedance.com> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::f2c; envelope-from=hao.xiang@bytedance.com; helo=mail-qv1-xf2c.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org * Create a dedicated thread for DSA task completion. * DSA completion thread runs a loop and poll for completed tasks. * Start and stop DSA completion thread during DSA device start stop. User space application can directly submit task to Intel DSA accelerator by writing to DSA's device memory (mapped in user space). Once a task is submitted, the device starts processing it and write the completion status back to the task. A user space application can poll the task's completion status to check for completion. This change uses a dedicated thread to perform DSA task completion checking. Signed-off-by: Hao Xiang --- util/dsa.c | 243 ++++++++++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 242 insertions(+), 1 deletion(-) diff --git a/util/dsa.c b/util/dsa.c index f82282ce99..0e68013ffb 100644 --- a/util/dsa.c +++ b/util/dsa.c @@ -44,6 +44,7 @@ #define DSA_WQ_SIZE 4096 #define MAX_DSA_DEVICES 16 +#define DSA_COMPLETION_THREAD "dsa_completion" typedef QSIMPLEQ_HEAD(dsa_task_queue, buffer_zero_batch_task) dsa_task_queue; @@ -61,8 +62,18 @@ struct dsa_device_group { dsa_task_queue task_queue; }; +struct dsa_completion_thread { + bool stopping; + bool running; + QemuThread thread; + int thread_id; + QemuSemaphore sem_init_done; + struct dsa_device_group *group; +}; + uint64_t max_retry_count; static struct dsa_device_group dsa_group; +static struct dsa_completion_thread completion_thread; /** @@ -439,6 +450,234 @@ submit_batch_wi_async(struct buffer_zero_batch_task *batch_task) return dsa_task_enqueue(device_group, batch_task); } +/** + * @brief Poll for the DSA work item completion. + * + * @param completion A pointer to the DSA work item completion record. + * @param opcode The DSA opcode. + * + * @return Zero if successful, non-zero otherwise. + */ +static int +poll_completion(struct dsa_completion_record *completion, + enum dsa_opcode opcode) +{ + uint8_t status; + uint64_t retry = 0; + + while (true) { + // The DSA operation completes successfully or fails. + status = completion->status; + if (status == DSA_COMP_SUCCESS || + status == DSA_COMP_PAGE_FAULT_NOBOF || + status == DSA_COMP_BATCH_PAGE_FAULT || + status == DSA_COMP_BATCH_FAIL) { + break; + } else if (status != DSA_COMP_NONE) { + /* TODO: Error handling here on unexpected failure. */ + fprintf(stderr, "DSA opcode %d failed with status = %d.\n", + opcode, status); + exit(1); + } + retry++; + if (retry > max_retry_count) { + fprintf(stderr, "Wait for completion retry %lu times.\n", retry); + exit(1); + } + _mm_pause(); + } + + return 0; +} + +/** + * @brief Complete a single DSA task in the batch task. + * + * @param task A pointer to the batch task structure. + */ +static void +poll_task_completion(struct buffer_zero_batch_task *task) +{ + assert(task->task_type == DSA_TASK); + + struct dsa_completion_record *completion = &task->completions[0]; + uint8_t status; + + poll_completion(completion, task->descriptors[0].opcode); + + status = completion->status; + if (status == DSA_COMP_SUCCESS) { + task->results[0] = (completion->result == 0); + return; + } + + assert(status == DSA_COMP_PAGE_FAULT_NOBOF); +} + +/** + * @brief Poll a batch task status until it completes. If DSA task doesn't + * complete properly, use CPU to complete the task. + * + * @param batch_task A pointer to the DSA batch task. + */ +static void +poll_batch_task_completion(struct buffer_zero_batch_task *batch_task) +{ + struct dsa_completion_record *batch_completion = &batch_task->batch_completion; + struct dsa_completion_record *completion; + uint8_t batch_status; + uint8_t status; + bool *results = batch_task->results; + uint32_t count = batch_task->batch_descriptor.desc_count; + + poll_completion(batch_completion, + batch_task->batch_descriptor.opcode); + + batch_status = batch_completion->status; + + if (batch_status == DSA_COMP_SUCCESS) { + if (batch_completion->bytes_completed == count) { + // Let's skip checking for each descriptors' completion status + // if the batch descriptor says all succedded. + for (int i = 0; i < count; i++) { + assert(batch_task->completions[i].status == DSA_COMP_SUCCESS); + results[i] = (batch_task->completions[i].result == 0); + } + return; + } + } else { + assert(batch_status == DSA_COMP_BATCH_FAIL || + batch_status == DSA_COMP_BATCH_PAGE_FAULT); + } + + for (int i = 0; i < count; i++) { + + completion = &batch_task->completions[i]; + status = completion->status; + + if (status == DSA_COMP_SUCCESS) { + results[i] = (completion->result == 0); + continue; + } + + if (status != DSA_COMP_PAGE_FAULT_NOBOF) { + fprintf(stderr, + "Unexpected completion status = %u.\n", status); + assert(false); + } + } +} + +/** + * @brief Handles an asynchronous DSA batch task completion. + * + * @param task A pointer to the batch buffer zero task structure. + */ +static void +dsa_batch_task_complete(struct buffer_zero_batch_task *batch_task) +{ + batch_task->status = DSA_TASK_COMPLETION; + batch_task->completion_callback(batch_task); +} + +/** + * @brief The function entry point called by a dedicated DSA + * work item completion thread. + * + * @param opaque A pointer to the thread context. + * + * @return void* Not used. + */ +static void * +dsa_completion_loop(void *opaque) +{ + struct dsa_completion_thread *thread_context = + (struct dsa_completion_thread *)opaque; + struct buffer_zero_batch_task *batch_task; + struct dsa_device_group *group = thread_context->group; + + rcu_register_thread(); + + thread_context->thread_id = qemu_get_thread_id(); + qemu_sem_post(&thread_context->sem_init_done); + + while (thread_context->running) { + batch_task = dsa_task_dequeue(group); + assert(batch_task != NULL || !group->running); + if (!group->running) { + assert(!thread_context->running); + break; + } + if (batch_task->task_type == DSA_TASK) { + poll_task_completion(batch_task); + } else { + assert(batch_task->task_type == DSA_BATCH_TASK); + poll_batch_task_completion(batch_task); + } + + dsa_batch_task_complete(batch_task); + } + + rcu_unregister_thread(); + return NULL; +} + +/** + * @brief Initializes a DSA completion thread. + * + * @param completion_thread A pointer to the completion thread context. + * @param group A pointer to the DSA device group. + */ +static void +dsa_completion_thread_init( + struct dsa_completion_thread *completion_thread, + struct dsa_device_group *group) +{ + completion_thread->stopping = false; + completion_thread->running = true; + completion_thread->thread_id = -1; + qemu_sem_init(&completion_thread->sem_init_done, 0); + completion_thread->group = group; + + qemu_thread_create(&completion_thread->thread, + DSA_COMPLETION_THREAD, + dsa_completion_loop, + completion_thread, + QEMU_THREAD_JOINABLE); + + /* Wait for initialization to complete */ + while (completion_thread->thread_id == -1) { + qemu_sem_wait(&completion_thread->sem_init_done); + } +} + +/** + * @brief Stops the completion thread (and implicitly, the device group). + * + * @param opaque A pointer to the completion thread. + */ +static void dsa_completion_thread_stop(void *opaque) +{ + struct dsa_completion_thread *thread_context = + (struct dsa_completion_thread *)opaque; + + struct dsa_device_group *group = thread_context->group; + + qemu_mutex_lock(&group->task_queue_lock); + + thread_context->stopping = true; + thread_context->running = false; + + dsa_device_group_stop(group); + + qemu_cond_signal(&group->task_queue_cond); + qemu_mutex_unlock(&group->task_queue_lock); + + qemu_thread_join(&thread_context->thread); + + qemu_sem_destroy(&thread_context->sem_init_done); +} + /** * @brief Check if DSA is running. * @@ -446,7 +685,7 @@ submit_batch_wi_async(struct buffer_zero_batch_task *batch_task) */ bool dsa_is_running(void) { - return false; + return completion_thread.running; } static void @@ -481,6 +720,7 @@ void dsa_start(void) return; } dsa_device_group_start(&dsa_group); + dsa_completion_thread_init(&completion_thread, &dsa_group); } /** @@ -496,6 +736,7 @@ void dsa_stop(void) return; } + dsa_completion_thread_stop(&completion_thread); dsa_empty_task_queue(group); } From patchwork Wed Oct 25 19:38:13 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hao Xiang X-Patchwork-Id: 1855259 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=bytedance.com header.i=@bytedance.com header.a=rsa-sha256 header.s=google header.b=R7LOucNl; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=patchwork.ozlabs.org) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-ECDSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4SFzpc1hLNz23jn for ; Thu, 26 Oct 2023 06:42:16 +1100 (AEDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qvjjD-0000sa-Dg; Wed, 25 Oct 2023 15:39:43 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qvjjB-0000sD-TX for qemu-devel@nongnu.org; Wed, 25 Oct 2023 15:39:41 -0400 Received: from mail-qv1-xf2f.google.com ([2607:f8b0:4864:20::f2f]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1qvjj9-00069H-Db for qemu-devel@nongnu.org; Wed, 25 Oct 2023 15:39:41 -0400 Received: by mail-qv1-xf2f.google.com with SMTP id 6a1803df08f44-66d134a019cso753296d6.3 for ; Wed, 25 Oct 2023 12:39:39 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1698262778; x=1698867578; darn=nongnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=GL6cpR12l80oXg3yMpIzAIPLap4SikzmVMX/10PRhJc=; b=R7LOucNlazFhwAy3UKTtl6ARgsZ52EnOyoO9rUTbOG4qF1rMepKC8b6lZlGA7MxBbK CpmiFnxL7biLTlJqZaqUphDC1nzBxxfKJCVUtSdKhsUSPt6HVbgtfnGUriBgKlBmLbJ/ tDbM80xtPz+5WOhZVosEW5RX/ZloYTE4pVfBleruSa3Trvc4cbqDPFficKJv8ZDZ7ZZ+ GJnHGIAiEFrEgrWQW6m5rM88lBmdBYLjsnSM5TC3wM1tWrmmjywxxNlFYY+X3VACpK75 T2DjiHT4vjV+aUgRQ3JQ0qgmXhtZp5dT5Vi/6rNf+wQYXPvhg1+6aSpoHpLuipQIiAia 4ipg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1698262778; x=1698867578; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=GL6cpR12l80oXg3yMpIzAIPLap4SikzmVMX/10PRhJc=; b=g/5S7dcONAHhak6tg5ArodjPa3SGfc6sZ7G4AOjoInCK2wVbw4jhWOr6VKh1pCFdrG Qu/ei7iNiTHMe4bDNfxFW5OiKvcia6LFsZ/0FbFO8ispPQmoFkoCoibr+U2iFUPvLql/ 3dMB1DrNoY7YMNcgmAnyqC7u0e8S20//t/8vGtAqcFP5ElTRpgtjrFC4LTDgIPTpkACi 7OCV7+2ipAq2E6FRyYVPfEV20zc7wxw9Z110fGs5ygmzNMkonH79D7u7pJV7DamXkF2W 6JSOQzkbmLGAoTvuDY1bt6DDWttRBvitW9/RVGHOqPKb0L0BZEKVauWRslAGb5ajCtvT 3+HA== X-Gm-Message-State: AOJu0YwhWS7x8JVcLYkgJqUML7EjTbb6G1cmDuXeRXuH/AcSdpaHCHg1 XtjuFidbuX8HKtA+ex/Wap/h6w== X-Google-Smtp-Source: AGHT+IEAQUd9RA0/oKIsJF6C4TXMuA9wJBKrEk2NwBPWZadBVa5BM2Ldw7ez3ML2YHzT0nK3F9ITSQ== X-Received: by 2002:ad4:5b86:0:b0:66d:44c9:ac8 with SMTP id 6-20020ad45b86000000b0066d44c90ac8mr19340935qvp.24.1698262778227; Wed, 25 Oct 2023 12:39:38 -0700 (PDT) Received: from n231-230-216.byted.org ([147.160.184.135]) by smtp.gmail.com with ESMTPSA id o8-20020a05620a228800b0076cdc3b5beasm4453721qkh.86.2023.10.25.12.39.37 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 25 Oct 2023 12:39:37 -0700 (PDT) From: Hao Xiang To: quintela@redhat.com, peterx@redhat.com, marcandre.lureau@redhat.com, bryan.zhang@bytedance.com, qemu-devel@nongnu.org Cc: Hao Xiang Subject: [PATCH 07/16] util/dsa: Implement zero page checking in DSA task. Date: Wed, 25 Oct 2023 19:38:13 +0000 Message-Id: <20231025193822.2813204-8-hao.xiang@bytedance.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20231025193822.2813204-1-hao.xiang@bytedance.com> References: <20231025193822.2813204-1-hao.xiang@bytedance.com> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::f2f; envelope-from=hao.xiang@bytedance.com; helo=mail-qv1-xf2f.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Create DSA task with operation code DSA_OPCODE_COMPVAL. Here we create two types of DSA tasks, a single DSA task and a batch DSA task. Batch DSA task reduces task submission overhead and hence should be the default option. However, due to the way DSA hardware works, a DSA batch task must contain at least two individual tasks. There are times we need to submit a single task and hence a single DSA task submission is also required. Signed-off-by: Hao Xiang Signed-off-by: Bryan Zhang --- include/qemu/dsa.h | 16 +++ util/dsa.c | 252 +++++++++++++++++++++++++++++++++++++++++---- 2 files changed, 247 insertions(+), 21 deletions(-) diff --git a/include/qemu/dsa.h b/include/qemu/dsa.h index 23f55185be..b10e7b8fb7 100644 --- a/include/qemu/dsa.h +++ b/include/qemu/dsa.h @@ -49,6 +49,22 @@ struct buffer_zero_batch_task { #endif +/** + * @brief Initializes a buffer zero batch task. + * + * @param task A pointer to the batch task to initialize. + * @param batch_size The number of DSA tasks in the batch. + */ +void buffer_zero_batch_task_init(struct buffer_zero_batch_task *task, + int batch_size); + +/** + * @brief Performs the proper cleanup on a DSA batch task. + * + * @param task A pointer to the batch task to cleanup. + */ +void buffer_zero_batch_task_destroy(struct buffer_zero_batch_task *task); + /** * @brief Initializes DSA devices. * diff --git a/util/dsa.c b/util/dsa.c index 0e68013ffb..3cc017b8a0 100644 --- a/util/dsa.c +++ b/util/dsa.c @@ -75,6 +75,7 @@ uint64_t max_retry_count; static struct dsa_device_group dsa_group; static struct dsa_completion_thread completion_thread; +static void buffer_zero_dsa_completion(void *context); /** * @brief This function opens a DSA device's work queue and @@ -208,7 +209,6 @@ dsa_device_group_start(struct dsa_device_group *group) * * @param group A pointer to the DSA device group. */ -__attribute__((unused)) static void dsa_device_group_stop(struct dsa_device_group *group) { @@ -244,7 +244,6 @@ dsa_device_group_cleanup(struct dsa_device_group *group) * @return struct dsa_device* A pointer to the next available DSA device * in the group. */ -__attribute__((unused)) static struct dsa_device * dsa_device_group_get_next_device(struct dsa_device_group *group) { @@ -319,7 +318,6 @@ dsa_task_enqueue(struct dsa_device_group *group, * @param group A pointer to the DSA device group. * @return buffer_zero_batch_task* The DSA task being dequeued. */ -__attribute__((unused)) static struct buffer_zero_batch_task * dsa_task_dequeue(struct dsa_device_group *group) { @@ -376,22 +374,6 @@ submit_wi_int(void *wq, struct dsa_hw_desc *descriptor) return 0; } -/** - * @brief Synchronously submits a DSA work item to the - * device work queue. - * - * @param wq A pointer to the DSA worjk queue's device memory. - * @param descriptor A pointer to the DSA work item descriptor. - * - * @return int Zero if successful, non-zero otherwise. - */ -__attribute__((unused)) -static int -submit_wi(void *wq, struct dsa_hw_desc *descriptor) -{ - return submit_wi_int(wq, descriptor); -} - /** * @brief Asynchronously submits a DSA work item to the * device work queue. @@ -400,7 +382,6 @@ submit_wi(void *wq, struct dsa_hw_desc *descriptor) * * @return int Zero if successful, non-zero otherwise. */ -__attribute__((unused)) static int submit_wi_async(struct buffer_zero_batch_task *task) { @@ -428,7 +409,6 @@ submit_wi_async(struct buffer_zero_batch_task *task) * * @return int Zero if successful, non-zero otherwise. */ -__attribute__((unused)) static int submit_batch_wi_async(struct buffer_zero_batch_task *batch_task) { @@ -678,6 +658,231 @@ static void dsa_completion_thread_stop(void *opaque) qemu_sem_destroy(&thread_context->sem_init_done); } +/** + * @brief Initializes a buffer zero comparison DSA task. + * + * @param descriptor A pointer to the DSA task descriptor. + * @param completion A pointer to the DSA task completion record. + */ +static void +buffer_zero_task_init_int(struct dsa_hw_desc *descriptor, + struct dsa_completion_record *completion) +{ + descriptor->opcode = DSA_OPCODE_COMPVAL; + descriptor->flags = IDXD_OP_FLAG_RCR | IDXD_OP_FLAG_CRAV; + descriptor->comp_pattern = (uint64_t)0; + descriptor->completion_addr = (uint64_t)completion; +} + +/** + * @brief Initializes a buffer zero batch task. + * + * @param task A pointer to the batch task to initialize. + * @param batch_size The number of DSA tasks in the batch. + */ +void +buffer_zero_batch_task_init(struct buffer_zero_batch_task *task, + int batch_size) +{ + int descriptors_size = sizeof(*task->descriptors) * batch_size; + memset(task, 0, sizeof(*task)); + + task->descriptors = + (struct dsa_hw_desc *)qemu_memalign(64, descriptors_size); + memset(task->descriptors, 0, descriptors_size); + task->completions = (struct dsa_completion_record *)qemu_memalign( + 32, sizeof(*task->completions) * batch_size); + task->results = g_new0(bool, batch_size); + task->batch_size = batch_size; + + task->batch_completion.status = DSA_COMP_NONE; + task->batch_descriptor.completion_addr = (uint64_t)&task->batch_completion; + // TODO: Ensure that we never send a batch with count <= 1 + task->batch_descriptor.desc_count = 0; + task->batch_descriptor.opcode = DSA_OPCODE_BATCH; + task->batch_descriptor.flags = IDXD_OP_FLAG_RCR | IDXD_OP_FLAG_CRAV; + task->batch_descriptor.desc_list_addr = (uintptr_t)task->descriptors; + task->status = DSA_TASK_READY; + task->group = &dsa_group; + task->device = dsa_device_group_get_next_device(&dsa_group); + + for (int i = 0; i < task->batch_size; i++) { + buffer_zero_task_init_int(&task->descriptors[i], + &task->completions[i]); + } + + qemu_sem_init(&task->sem_task_complete, 0); + task->completion_callback = buffer_zero_dsa_completion; +} + +/** + * @brief Performs the proper cleanup on a DSA batch task. + * + * @param task A pointer to the batch task to cleanup. + */ +void +buffer_zero_batch_task_destroy(struct buffer_zero_batch_task *task) +{ + qemu_vfree(task->descriptors); + qemu_vfree(task->completions); + g_free(task->results); + + qemu_sem_destroy(&task->sem_task_complete); +} + +/** + * @brief Resets a buffer zero comparison DSA batch task. + * + * @param task A pointer to the batch task. + * @param count The number of DSA tasks this batch task will contain. + */ +static void +buffer_zero_batch_task_reset(struct buffer_zero_batch_task *task, size_t count) +{ + task->batch_completion.status = DSA_COMP_NONE; + task->batch_descriptor.desc_count = count; + task->task_type = DSA_BATCH_TASK; + task->status = DSA_TASK_READY; +} + +/** + * @brief Sets a buffer zero comparison DSA task. + * + * @param descriptor A pointer to the DSA task descriptor. + * @param buf A pointer to the memory buffer. + * @param len The length of the buffer. + */ +static void +buffer_zero_task_set_int(struct dsa_hw_desc *descriptor, + const void *buf, + size_t len) +{ + struct dsa_completion_record *completion = + (struct dsa_completion_record *)descriptor->completion_addr; + + descriptor->xfer_size = len; + descriptor->src_addr = (uintptr_t)buf; + completion->status = 0; + completion->result = 0; +} + +/** + * @brief Resets a buffer zero comparison DSA batch task. + * + * @param task A pointer to the DSA batch task. + */ +static void +buffer_zero_task_reset(struct buffer_zero_batch_task *task) +{ + task->completions[0].status = DSA_COMP_NONE; + task->task_type = DSA_TASK; + task->status = DSA_TASK_READY; +} + +/** + * @brief Sets a buffer zero comparison DSA task. + * + * @param task A pointer to the DSA task. + * @param buf A pointer to the memory buffer. + * @param len The buffer length. + */ +static void +buffer_zero_task_set(struct buffer_zero_batch_task *task, + const void *buf, + size_t len) +{ + buffer_zero_task_reset(task); + buffer_zero_task_set_int(&task->descriptors[0], buf, len); +} + +/** + * @brief Sets a buffer zero comparison batch task. + * + * @param batch_task A pointer to the batch task. + * @param buf An array of memory buffers. + * @param count The number of buffers in the array. + * @param len The length of the buffers. + */ +static void +buffer_zero_batch_task_set(struct buffer_zero_batch_task *batch_task, + const void **buf, size_t count, size_t len) +{ + assert(count > 0); + assert(count <= batch_task->batch_size); + + buffer_zero_batch_task_reset(batch_task, count); + for (int i = 0; i < count; i++) { + buffer_zero_task_set_int(&batch_task->descriptors[i], buf[i], len); + } +} + +/** + * @brief Asychronously perform a buffer zero DSA operation. + * + * @param task A pointer to the batch task structure. + * @param buf A pointer to the memory buffer. + * @param len The length of the memory buffer. + * + * @return int Zero if successful, otherwise an appropriate error code. + */ +__attribute__((unused)) +static int +buffer_zero_dsa_async(struct buffer_zero_batch_task *task, + const void *buf, size_t len) +{ + buffer_zero_task_set(task, buf, len); + + return submit_wi_async(task); +} + +/** + * @brief Sends a memory comparison batch task to a DSA device and wait + * for completion. + * + * @param batch_task The batch task to be submitted to DSA device. + * @param buf An array of memory buffers to check for zero. + * @param count The number of buffers. + * @param len The buffer length. + */ +__attribute__((unused)) +static int +buffer_zero_dsa_batch_async(struct buffer_zero_batch_task *batch_task, + const void **buf, size_t count, size_t len) +{ + assert(count <= batch_task->batch_size); + buffer_zero_batch_task_set(batch_task, buf, count, len); + + return submit_batch_wi_async(batch_task); +} + +/** + * @brief The completion callback function for buffer zero + * comparison DSA task completion. + * + * @param context A pointer to the callback context. + */ +static void +buffer_zero_dsa_completion(void *context) +{ + assert(context != NULL); + + struct buffer_zero_batch_task *task = + (struct buffer_zero_batch_task *)context; + qemu_sem_post(&task->sem_task_complete); +} + +/** + * @brief Wait for the asynchronous DSA task to complete. + * + * @param batch_task A pointer to the buffer zero comparison batch task. + */ +__attribute__((unused)) +static void +buffer_zero_dsa_wait(struct buffer_zero_batch_task *batch_task) +{ + qemu_sem_wait(&batch_task->sem_task_complete); +} + /** * @brief Check if DSA is running. * @@ -753,6 +958,11 @@ void dsa_cleanup(void) #else +void buffer_zero_batch_task_init(struct buffer_zero_batch_task *task, + int batch_size) {} + +void buffer_zero_batch_task_destroy(struct buffer_zero_batch_task *task) {} + bool dsa_is_running(void) { return false; From patchwork Wed Oct 25 19:38:14 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hao Xiang X-Patchwork-Id: 1855245 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=bytedance.com header.i=@bytedance.com header.a=rsa-sha256 header.s=google header.b=FCjjf3Hp; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=patchwork.ozlabs.org) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-ECDSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4SFzmJ5Vfyz23jn for ; Thu, 26 Oct 2023 06:40:16 +1100 (AEDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qvjjE-0000tB-VN; Wed, 25 Oct 2023 15:39:44 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qvjjD-0000sb-Ek for qemu-devel@nongnu.org; Wed, 25 Oct 2023 15:39:43 -0400 Received: from mail-qk1-x734.google.com ([2607:f8b0:4864:20::734]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1qvjjB-00069W-Fx for qemu-devel@nongnu.org; Wed, 25 Oct 2023 15:39:43 -0400 Received: by mail-qk1-x734.google.com with SMTP id af79cd13be357-778a47bc09aso9238385a.3 for ; Wed, 25 Oct 2023 12:39:41 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1698262780; x=1698867580; darn=nongnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=3DH9IahCxHNZhgGh3NSSJ8rC6cgU5W5EjmHIaRYa9EQ=; b=FCjjf3HpXsE59DCTCyGaYTw58vKBcT9tAPQ2Ngt5yYGTLWiAA0JaU7eDww9wYx5LFw 8SoCwNIFJOZIe3sSlyxih/0p3zD1uoSKuN3sriVW0sJcQ0IHeCQzQjQLwXEMK20nWeKc JhXHLoc27I1m/zPaSp/5xKv4c2xvYi3Ahu8sU/H/KRj1SnunclW1ElUyL75yhdp3KKnz UFDbitn617Bcg7fyvas0l0BA6dlkTNn+6+yzypwVvvHbwAVwGhpH2JLs5e9OlpR16OeH sFYl879Ekx4CtIr11zFEmlBLw/bMSjlmxqbaHYRvsqxW+iY98fYjFHhjEpvlp5DfjjQP d8jw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1698262780; x=1698867580; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=3DH9IahCxHNZhgGh3NSSJ8rC6cgU5W5EjmHIaRYa9EQ=; b=Dhd8N+BDdhZ43Wr9s776eyLwWNn0HYZBBpsCq8IR6SxVohdAF3wBm77TKPQ7Qk9WUY MIaFQ5TdMZg5JfWHguzN5GygDRbr+7IBYMMxmXgubWin2IY45gTqLDxb+Sm6LPrteRmQ cUjisR8bivNKL8Sl+WWgABZEkUhS932+bqynokuGQKHuWBWiETBAkppia+QVoSw6anKZ 7iUi87HgnyWgGaQhkveEk6OwwMQyBjSkVCrvSrSRB6lDsThEUmaicIMaNix+J2ugBiUM 2ixDS4BMK72kK+TLitKUnI8xzZUrgst9HR3d9jobf0Z0rtZ+KE5a5gAyjKXjkttYLkIB 68HA== X-Gm-Message-State: AOJu0YyiEr5/eEBgyoavi48TsX0IwjRTSLgzGANq3CUp/dDo7j9UjeWH 1MohhKy8vDmowIRD7xedHeJb2gumDcmgR1QF0rE= X-Google-Smtp-Source: AGHT+IG+UOeZqkyuGmp09o0Bcotzyfe7GgvDIfe6XZau5AYgXj5C7qi+7ChdR0upsGVIh9w6q4lheA== X-Received: by 2002:a05:620a:1a27:b0:774:2915:d180 with SMTP id bk39-20020a05620a1a2700b007742915d180mr18821794qkb.37.1698262780129; Wed, 25 Oct 2023 12:39:40 -0700 (PDT) Received: from n231-230-216.byted.org ([147.160.184.135]) by smtp.gmail.com with ESMTPSA id o8-20020a05620a228800b0076cdc3b5beasm4453721qkh.86.2023.10.25.12.39.39 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 25 Oct 2023 12:39:39 -0700 (PDT) From: Hao Xiang To: quintela@redhat.com, peterx@redhat.com, marcandre.lureau@redhat.com, bryan.zhang@bytedance.com, qemu-devel@nongnu.org Cc: Hao Xiang Subject: [PATCH 08/16] util/dsa: Implement DSA task asynchronous submission and wait for completion. Date: Wed, 25 Oct 2023 19:38:14 +0000 Message-Id: <20231025193822.2813204-9-hao.xiang@bytedance.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20231025193822.2813204-1-hao.xiang@bytedance.com> References: <20231025193822.2813204-1-hao.xiang@bytedance.com> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::734; envelope-from=hao.xiang@bytedance.com; helo=mail-qk1-x734.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org * Add a DSA task completion callback. * DSA completion thread will call the tasks's completion callback on every task/batch task completion. * DSA submission path to wait for completion. * Implement CPU fallback if DSA is not able to complete the task. Signed-off-by: Hao Xiang Signed-off-by: Bryan Zhang --- include/qemu/dsa.h | 14 +++++ util/dsa.c | 153 ++++++++++++++++++++++++++++++++++++++++++++- 2 files changed, 164 insertions(+), 3 deletions(-) diff --git a/include/qemu/dsa.h b/include/qemu/dsa.h index b10e7b8fb7..3f8ee07004 100644 --- a/include/qemu/dsa.h +++ b/include/qemu/dsa.h @@ -65,6 +65,20 @@ void buffer_zero_batch_task_init(struct buffer_zero_batch_task *task, */ void buffer_zero_batch_task_destroy(struct buffer_zero_batch_task *task); +/** + * @brief Performs buffer zero comparison on a DSA batch task asynchronously. + * + * @param batch_task A pointer to the batch task. + * @param buf An array of memory buffers. + * @param count The number of buffers in the array. + * @param len The buffer length. + * + * @return Zero if successful, otherwise non-zero. + */ +int +buffer_is_zero_dsa_batch_async(struct buffer_zero_batch_task *batch_task, + const void **buf, size_t count, size_t len); + /** * @brief Initializes DSA devices. * diff --git a/util/dsa.c b/util/dsa.c index 3cc017b8a0..06c6fbf2ca 100644 --- a/util/dsa.c +++ b/util/dsa.c @@ -470,6 +470,41 @@ poll_completion(struct dsa_completion_record *completion, return 0; } +/** + * @brief Use CPU to complete a single zero page checking task. + * + * @param task A pointer to the task. + */ +static void +task_cpu_fallback(struct buffer_zero_batch_task *task) +{ + assert(task->task_type == DSA_TASK); + + struct dsa_completion_record *completion = &task->completions[0]; + const uint8_t *buf; + size_t len; + + if (completion->status == DSA_COMP_SUCCESS) { + return; + } + + /* + * DSA was able to partially complete the operation. Check the + * result. If we already know this is not a zero page, we can + * return now. + */ + if (completion->bytes_completed != 0 && completion->result != 0) { + task->results[0] = false; + return; + } + + /* Let's fallback to use CPU to complete it. */ + buf = (const uint8_t *)task->descriptors[0].src_addr; + len = task->descriptors[0].xfer_size; + task->results[0] = buffer_is_zero(buf + completion->bytes_completed, + len - completion->bytes_completed); +} + /** * @brief Complete a single DSA task in the batch task. * @@ -548,6 +583,62 @@ poll_batch_task_completion(struct buffer_zero_batch_task *batch_task) } } +/** + * @brief Use CPU to complete the zero page checking batch task. + * + * @param batch_task A pointer to the batch task. + */ +static void +batch_task_cpu_fallback(struct buffer_zero_batch_task *batch_task) +{ + assert(batch_task->task_type == DSA_BATCH_TASK); + + struct dsa_completion_record *batch_completion = + &batch_task->batch_completion; + struct dsa_completion_record *completion; + uint8_t status; + const uint8_t *buf; + size_t len; + bool *results = batch_task->results; + uint32_t count = batch_task->batch_descriptor.desc_count; + + // DSA is able to complete the entire batch task. + if (batch_completion->status == DSA_COMP_SUCCESS) { + assert(count == batch_completion->bytes_completed); + return; + } + + /* + * DSA encounters some error and is not able to complete + * the entire batch task. Use CPU fallback. + */ + for (int i = 0; i < count; i++) { + completion = &batch_task->completions[i]; + status = completion->status; + if (status == DSA_COMP_SUCCESS) { + continue; + } + assert(status == DSA_COMP_PAGE_FAULT_NOBOF); + + /* + * DSA was able to partially complete the operation. Check the + * result. If we already know this is not a zero page, we can + * return now. + */ + if (completion->bytes_completed != 0 && completion->result != 0) { + results[i] = false; + continue; + } + + /* Let's fallback to use CPU to complete it. */ + buf = (uint8_t *)batch_task->descriptors[i].src_addr; + len = batch_task->descriptors[i].xfer_size; + results[i] = + buffer_is_zero(buf + completion->bytes_completed, + len - completion->bytes_completed); + } +} + /** * @brief Handles an asynchronous DSA batch task completion. * @@ -825,7 +916,6 @@ buffer_zero_batch_task_set(struct buffer_zero_batch_task *batch_task, * * @return int Zero if successful, otherwise an appropriate error code. */ -__attribute__((unused)) static int buffer_zero_dsa_async(struct buffer_zero_batch_task *task, const void *buf, size_t len) @@ -844,7 +934,6 @@ buffer_zero_dsa_async(struct buffer_zero_batch_task *task, * @param count The number of buffers. * @param len The buffer length. */ -__attribute__((unused)) static int buffer_zero_dsa_batch_async(struct buffer_zero_batch_task *batch_task, const void **buf, size_t count, size_t len) @@ -876,13 +965,29 @@ buffer_zero_dsa_completion(void *context) * * @param batch_task A pointer to the buffer zero comparison batch task. */ -__attribute__((unused)) static void buffer_zero_dsa_wait(struct buffer_zero_batch_task *batch_task) { qemu_sem_wait(&batch_task->sem_task_complete); } +/** + * @brief Use CPU to complete the zero page checking task if DSA + * is not able to complete it. + * + * @param batch_task A pointer to the batch task. + */ +static void +buffer_zero_cpu_fallback(struct buffer_zero_batch_task *batch_task) +{ + if (batch_task->task_type == DSA_TASK) { + task_cpu_fallback(batch_task); + } else { + assert(batch_task->task_type == DSA_BATCH_TASK); + batch_task_cpu_fallback(batch_task); + } +} + /** * @brief Check if DSA is running. * @@ -956,6 +1061,41 @@ void dsa_cleanup(void) dsa_device_group_cleanup(&dsa_group); } +/** + * @brief Performs buffer zero comparison on a DSA batch task asynchronously. + * + * @param batch_task A pointer to the batch task. + * @param buf An array of memory buffers. + * @param count The number of buffers in the array. + * @param len The buffer length. + * + * @return Zero if successful, otherwise non-zero. + */ +int +buffer_is_zero_dsa_batch_async(struct buffer_zero_batch_task *batch_task, + const void **buf, size_t count, size_t len) +{ + if (count <= 0 || count > batch_task->batch_size) { + return -1; + } + + assert(batch_task != NULL); + assert(len != 0); + assert(buf != NULL); + + if (count == 1) { + // DSA doesn't take batch operation with only 1 task. + buffer_zero_dsa_async(batch_task, buf[0], len); + } else { + buffer_zero_dsa_batch_async(batch_task, buf, count, len); + } + + buffer_zero_dsa_wait(batch_task); + buffer_zero_cpu_fallback(batch_task); + + return 0; +} + #else void buffer_zero_batch_task_init(struct buffer_zero_batch_task *task, @@ -981,5 +1121,12 @@ void dsa_stop(void) {} void dsa_cleanup(void) {} +int +buffer_is_zero_dsa_batch_async(struct buffer_zero_batch_task *batch_task, + const void **buf, size_t count, size_t len) +{ + exit(1); +} + #endif From patchwork Wed Oct 25 19:38:15 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hao Xiang X-Patchwork-Id: 1855247 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=bytedance.com header.i=@bytedance.com header.a=rsa-sha256 header.s=google header.b=Eix6CKPj; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=patchwork.ozlabs.org) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-ECDSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4SFzmX27dbz23jn for ; Thu, 26 Oct 2023 06:40:28 +1100 (AEDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qvjjF-0000tT-IL; Wed, 25 Oct 2023 15:39:45 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qvjjE-0000t0-Ca for qemu-devel@nongnu.org; Wed, 25 Oct 2023 15:39:44 -0400 Received: from mail-qk1-x72e.google.com ([2607:f8b0:4864:20::72e]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1qvjjC-00069l-Je for qemu-devel@nongnu.org; Wed, 25 Oct 2023 15:39:44 -0400 Received: by mail-qk1-x72e.google.com with SMTP id af79cd13be357-773ac11de71so9318085a.2 for ; Wed, 25 Oct 2023 12:39:42 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1698262781; x=1698867581; darn=nongnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=d4qo/1HDAGbTl7YFYPmhgxCLUMeNwG2LmT3ZjuOoSZM=; b=Eix6CKPj7oNbdUjL5W8FhAGtWFewD3ZFmfbhhe5sn4S+tIHE2KJU9+//YV1L9SFrN1 C9NT4R+jrmK9gQq6QO19mOvE/BZiYJiwGEfJ+bNRl5/SFWWz3g4NWvmgUuqxN4RTEQNL HGus0VuktBaWLJQB8FWDCWtU3NO0SZpHEYHDSEU86Nd0fYfJS2YkhWL9ryzIMhKI5tB9 hOs6a71nVdqWNxrBfgbrRqIxnITMer8uS+mQpwh6eqvaR0lPF5cI/ASZCgSvri7x8g4O G0728R3KF3bbpFirPpxni/yQZVocgOS8BsX8cQMLkR47i6rROBM8yC8ujS07OYHScgjc wTuA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1698262781; x=1698867581; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=d4qo/1HDAGbTl7YFYPmhgxCLUMeNwG2LmT3ZjuOoSZM=; b=szvgKmtNLBxuoF7MJIUI9l1AbrQX6z5Lm3MLoIl5vidXc/PDcc8Wv86Fit4p/1lq8h P+AKbAz8VsFg7v0Va2fn8UVlK4gGG1vVMdpEgmq4j4o4LDot9B3+s0COTApLatfZBkwy DMqAndlM90VJyO2puD4l8BmxI6FNf1yrewsDDBppBTO1nrFeZjHQQA/Ubd6N2cyWVImT JpMYWodzYi1u3248hLJN6WJPZ93L+N+lFHe8eXDtsMZjzSzPT1MRITz6guDGyutN7bhB YZ0Kll80j/OQMPQbjTJ6sSV9ocLBOSDc4YPXSd42ZxaJ71bUM4K/j/g5BwBudhm4hUZV O/zw== X-Gm-Message-State: AOJu0YzhXSZVJX30MsWcMDEUP3of9MvBklwn6eVen8aTrw+p1OcMtXAp k2Sn6TyL1GcjCmo0HqVGAlM2UQ== X-Google-Smtp-Source: AGHT+IEPVmB6zKMO8XVS9qAQ4BC+UrrA7xlMwwY2CYozoz4DV6iIiqNh+14Izk3RWQejYtQc8zseiQ== X-Received: by 2002:a05:620a:8c91:b0:774:500:a18e with SMTP id ra17-20020a05620a8c9100b007740500a18emr13554836qkn.75.1698262781723; Wed, 25 Oct 2023 12:39:41 -0700 (PDT) Received: from n231-230-216.byted.org ([147.160.184.135]) by smtp.gmail.com with ESMTPSA id o8-20020a05620a228800b0076cdc3b5beasm4453721qkh.86.2023.10.25.12.39.41 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 25 Oct 2023 12:39:41 -0700 (PDT) From: Hao Xiang To: quintela@redhat.com, peterx@redhat.com, marcandre.lureau@redhat.com, bryan.zhang@bytedance.com, qemu-devel@nongnu.org Cc: Hao Xiang Subject: [PATCH 09/16] migration/multifd: Add new migration option for multifd DSA offloading. Date: Wed, 25 Oct 2023 19:38:15 +0000 Message-Id: <20231025193822.2813204-10-hao.xiang@bytedance.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20231025193822.2813204-1-hao.xiang@bytedance.com> References: <20231025193822.2813204-1-hao.xiang@bytedance.com> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::72e; envelope-from=hao.xiang@bytedance.com; helo=mail-qk1-x72e.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Intel DSA offloading is an optional feature that turns on if proper hardware and software stack is available. To turn on DSA offloading in multifd live migration: multifd-dsa-accel="[dsa_dev_path1] ] [dsa_dev_path2] ... [dsa_dev_pathX]" This feature is turned off by default. Signed-off-by: Hao Xiang --- migration/migration-hmp-cmds.c | 8 ++++++++ migration/options.c | 28 ++++++++++++++++++++++++++++ migration/options.h | 1 + qapi/migration.json | 17 ++++++++++++++--- 4 files changed, 51 insertions(+), 3 deletions(-) diff --git a/migration/migration-hmp-cmds.c b/migration/migration-hmp-cmds.c index 5b25ba24f7..bdffe9e023 100644 --- a/migration/migration-hmp-cmds.c +++ b/migration/migration-hmp-cmds.c @@ -348,6 +348,9 @@ void hmp_info_migrate_parameters(Monitor *mon, const QDict *qdict) monitor_printf(mon, "%s: '%s'\n", MigrationParameter_str(MIGRATION_PARAMETER_TLS_AUTHZ), params->tls_authz); + monitor_printf(mon, "%s: %s\n", + MigrationParameter_str(MIGRATION_PARAMETER_MULTIFD_DSA_ACCEL), + params->multifd_dsa_accel); if (params->has_block_bitmap_mapping) { const BitmapMigrationNodeAliasList *bmnal; @@ -586,6 +589,11 @@ void hmp_migrate_set_parameter(Monitor *mon, const QDict *qdict) p->has_block_incremental = true; visit_type_bool(v, param, &p->block_incremental, &err); break; + case MIGRATION_PARAMETER_MULTIFD_DSA_ACCEL: + p->multifd_dsa_accel = g_new0(StrOrNull, 1); + p->multifd_dsa_accel->type = QTYPE_QSTRING; + visit_type_str(v, param, &p->multifd_dsa_accel->u.s, &err); + break; case MIGRATION_PARAMETER_MULTIFD_CHANNELS: p->has_multifd_channels = true; visit_type_uint8(v, param, &p->multifd_channels, &err); diff --git a/migration/options.c b/migration/options.c index 12b1c4dd71..6a3a78a626 100644 --- a/migration/options.c +++ b/migration/options.c @@ -173,6 +173,8 @@ Property migration_properties[] = { DEFINE_PROP_UINT64("vcpu-dirty-limit", MigrationState, parameters.vcpu_dirty_limit, DEFAULT_MIGRATE_VCPU_DIRTY_LIMIT), + DEFINE_PROP_STRING("multifd-dsa-accel", MigrationState, + parameters.multifd_dsa_accel), /* Migration capabilities */ DEFINE_PROP_MIG_CAP("x-xbzrle", MIGRATION_CAPABILITY_XBZRLE), @@ -849,6 +851,13 @@ const char *migrate_tls_creds(void) return s->parameters.tls_creds; } +const char *migrate_multifd_dsa_accel(void) +{ + MigrationState *s = migrate_get_current(); + + return s->parameters.multifd_dsa_accel; +} + const char *migrate_tls_hostname(void) { MigrationState *s = migrate_get_current(); @@ -969,6 +978,7 @@ MigrationParameters *qmp_query_migrate_parameters(Error **errp) params->x_vcpu_dirty_limit_period = s->parameters.x_vcpu_dirty_limit_period; params->has_vcpu_dirty_limit = true; params->vcpu_dirty_limit = s->parameters.vcpu_dirty_limit; + params->multifd_dsa_accel = s->parameters.multifd_dsa_accel; return params; } @@ -977,6 +987,7 @@ void migrate_params_init(MigrationParameters *params) { params->tls_hostname = g_strdup(""); params->tls_creds = g_strdup(""); + params->multifd_dsa_accel = g_strdup(""); /* Set has_* up only for parameter checks */ params->has_compress_level = true; @@ -1288,6 +1299,11 @@ static void migrate_params_test_apply(MigrateSetParameters *params, if (params->has_vcpu_dirty_limit) { dest->vcpu_dirty_limit = params->vcpu_dirty_limit; } + + if (params->multifd_dsa_accel) { + assert(params->multifd_dsa_accel->type == QTYPE_QSTRING); + dest->multifd_dsa_accel = params->multifd_dsa_accel->u.s; + } } static void migrate_params_apply(MigrateSetParameters *params, Error **errp) @@ -1414,6 +1430,12 @@ static void migrate_params_apply(MigrateSetParameters *params, Error **errp) if (params->has_vcpu_dirty_limit) { s->parameters.vcpu_dirty_limit = params->vcpu_dirty_limit; } + + if (params->multifd_dsa_accel) { + g_free(s->parameters.multifd_dsa_accel); + assert(params->multifd_dsa_accel->type == QTYPE_QSTRING); + s->parameters.multifd_dsa_accel = g_strdup(params->multifd_dsa_accel->u.s); + } } void qmp_migrate_set_parameters(MigrateSetParameters *params, Error **errp) @@ -1439,6 +1461,12 @@ void qmp_migrate_set_parameters(MigrateSetParameters *params, Error **errp) params->tls_authz->type = QTYPE_QSTRING; params->tls_authz->u.s = strdup(""); } + if (params->multifd_dsa_accel + && params->multifd_dsa_accel->type == QTYPE_QNULL) { + qobject_unref(params->multifd_dsa_accel->u.n); + params->multifd_dsa_accel->type = QTYPE_QSTRING; + params->multifd_dsa_accel->u.s = strdup(""); + } migrate_params_test_apply(params, &tmp); diff --git a/migration/options.h b/migration/options.h index c663f637fd..f757835b4a 100644 --- a/migration/options.h +++ b/migration/options.h @@ -91,6 +91,7 @@ const char *migrate_tls_authz(void); const char *migrate_tls_creds(void); const char *migrate_tls_hostname(void); uint64_t migrate_xbzrle_cache_size(void); +const char *migrate_multifd_dsa_accel(void); /* parameters setters */ diff --git a/qapi/migration.json b/qapi/migration.json index 3a99fe34d8..201f58527e 100644 --- a/qapi/migration.json +++ b/qapi/migration.json @@ -829,6 +829,9 @@ # @vcpu-dirty-limit: Dirtyrate limit (MB/s) during live migration. # Defaults to 1. (Since 8.1) # +# @multifd-dsa-accel: If enabled, use DSA accelerator offloading for +# certain memory operations. (since 8.1) +# # Features: # # @unstable: Members @x-checkpoint-delay and @x-vcpu-dirty-limit-period @@ -844,7 +847,7 @@ 'cpu-throttle-initial', 'cpu-throttle-increment', 'cpu-throttle-tailslow', 'tls-creds', 'tls-hostname', 'tls-authz', 'max-bandwidth', - 'downtime-limit', + 'downtime-limit', 'multifd-dsa-accel', { 'name': 'x-checkpoint-delay', 'features': [ 'unstable' ] }, 'block-incremental', 'multifd-channels', @@ -995,6 +998,9 @@ # @vcpu-dirty-limit: Dirtyrate limit (MB/s) during live migration. # Defaults to 1. (Since 8.1) # +# @multifd-dsa-accel: If enabled, use DSA accelerator offloading for +# certain memory operations. (since 8.1) +# # Features: # # @unstable: Members @x-checkpoint-delay and @x-vcpu-dirty-limit-period @@ -1036,7 +1042,8 @@ '*block-bitmap-mapping': [ 'BitmapMigrationNodeAlias' ], '*x-vcpu-dirty-limit-period': { 'type': 'uint64', 'features': [ 'unstable' ] }, - '*vcpu-dirty-limit': 'uint64'} } + '*vcpu-dirty-limit': 'uint64', + '*multifd-dsa-accel': 'StrOrNull'} } ## # @migrate-set-parameters: @@ -1198,6 +1205,9 @@ # @vcpu-dirty-limit: Dirtyrate limit (MB/s) during live migration. # Defaults to 1. (Since 8.1) # +# @multifd-dsa-accel: If enabled, use DSA accelerator offloading for +# certain memory operations. (since 8.1) +# # Features: # # @unstable: Members @x-checkpoint-delay and @x-vcpu-dirty-limit-period @@ -1236,7 +1246,8 @@ '*block-bitmap-mapping': [ 'BitmapMigrationNodeAlias' ], '*x-vcpu-dirty-limit-period': { 'type': 'uint64', 'features': [ 'unstable' ] }, - '*vcpu-dirty-limit': 'uint64'} } + '*vcpu-dirty-limit': 'uint64', + '*multifd-dsa-accel': 'str'} } ## # @query-migrate-parameters: From patchwork Wed Oct 25 19:38:16 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hao Xiang X-Patchwork-Id: 1855257 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=bytedance.com header.i=@bytedance.com header.a=rsa-sha256 header.s=google header.b=emkscCTq; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=patchwork.ozlabs.org) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-ECDSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4SFzp649D3z23jp for ; Thu, 26 Oct 2023 06:41:50 +1100 (AEDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qvjjI-0000tw-0P; Wed, 25 Oct 2023 15:39:48 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qvjjG-0000tk-JF for qemu-devel@nongnu.org; Wed, 25 Oct 2023 15:39:46 -0400 Received: from mail-qk1-x732.google.com ([2607:f8b0:4864:20::732]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1qvjjE-0006AK-RO for qemu-devel@nongnu.org; Wed, 25 Oct 2023 15:39:46 -0400 Received: by mail-qk1-x732.google.com with SMTP id af79cd13be357-777754138bdso10726785a.1 for ; Wed, 25 Oct 2023 12:39:44 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1698262784; x=1698867584; darn=nongnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=LaMy1tZU02hq5CY5DrbMQay4vRzC3DngqtrEhWmQFqM=; b=emkscCTqE413IQH684jqpOXVbbEo+NNIPWkdgrSQK378rJtF2EqbOSCu//czQzHO5f l2vlnXjcY1CQfW6TjdyvC+q/n7LDqVPqGdGA9cm55SvLMX7c+61RNeXdj80KN4dab0Ei DzQMO4MZ6eU8zLhbOqdDKjBBAQ+x8xS7mR2PUkKOoge+NYZPVC56nbtnLa88pPZmXENX b15w4uqHIgyVtyuYyDVeQW1oX0ZLwLk8uRsFjI3es4dTi9QKNsjo4G+JRjhLtZdBIa6E 5kjIETPVEPdYQY0LDuFGIKmEN9GBZ1RRmBckgK/Uaq+b25ofjhJcsgQpw7YUhkU8TsCX YJCA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1698262784; x=1698867584; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=LaMy1tZU02hq5CY5DrbMQay4vRzC3DngqtrEhWmQFqM=; b=U9IJqGf7u/JqF79OjUIe9DZLrkKT0Jvqwpe8htUzJiCzuT+Atncl46zHNQ0Vv0JxMf uHHniZPyk92cU6/JPyiWy56nW53f0ZA3Ji14SRocyJ2II/5q8mVrODgICX5UNdwD4g0l AUb6OqrR+fy6YGj+w7glXh9WD8zqik67CoKyG+0v34Qpn71HRw4+gDuyJ/IF+J4AavC+ O4ibo7RnahLiR27k76UG0Zmw7Re0cyPAKUGXXDQRQjmeLgKTd1NAMOvM6pmWT+DnFoF/ fzuS7ptSkLf2aV6fgd/einmUmyK8QVu53jnzGvWuGTG9QflsVppxZ7f473Wm5pBUaWJu 9aFQ== X-Gm-Message-State: AOJu0YzOfilQfcRfVJHZzPn+6CUQyb1IpKD6umJjUObPBwh9h36U61TF WGm90kqqhDcllxCq2oiQ8lR2WQ== X-Google-Smtp-Source: AGHT+IGtqN1Zn8q4jz0CxuXgQRMUxrGp/9KdXc7XISKwfzhPIM2xNXorrbpmNDhjfEk2r1s3OVJFYA== X-Received: by 2002:a05:620a:8806:b0:777:6fc2:cb2 with SMTP id qj6-20020a05620a880600b007776fc20cb2mr15429844qkn.71.1698262783876; Wed, 25 Oct 2023 12:39:43 -0700 (PDT) Received: from n231-230-216.byted.org ([147.160.184.135]) by smtp.gmail.com with ESMTPSA id o8-20020a05620a228800b0076cdc3b5beasm4453721qkh.86.2023.10.25.12.39.43 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 25 Oct 2023 12:39:43 -0700 (PDT) From: Hao Xiang To: quintela@redhat.com, peterx@redhat.com, marcandre.lureau@redhat.com, bryan.zhang@bytedance.com, qemu-devel@nongnu.org Cc: Hao Xiang Subject: [PATCH 10/16] migration/multifd: Enable DSA offloading in multifd sender path. Date: Wed, 25 Oct 2023 19:38:16 +0000 Message-Id: <20231025193822.2813204-11-hao.xiang@bytedance.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20231025193822.2813204-1-hao.xiang@bytedance.com> References: <20231025193822.2813204-1-hao.xiang@bytedance.com> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::732; envelope-from=hao.xiang@bytedance.com; helo=mail-qk1-x732.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Multifd sender path gets an array of pages queued by the migration thread. It performs zero page checking on every page in the array. The pages are classfied as either a zero page or a normal page. This change uses Intel DSA to offload the zero page checking from CPU to the DSA accelerator. The sender thread submits a batch of pages to DSA hardware and waits for the DSA completion thread to signal for work completion. Signed-off-by: Hao Xiang --- migration/multifd.c | 101 +++++++++++++++++++++++++++++++++++++------- migration/multifd.h | 3 ++ 2 files changed, 89 insertions(+), 15 deletions(-) diff --git a/migration/multifd.c b/migration/multifd.c index 452fb158b8..79fecbd3ae 100644 --- a/migration/multifd.c +++ b/migration/multifd.c @@ -13,6 +13,8 @@ #include "qemu/osdep.h" #include "qemu/rcu.h" #include "qemu/cutils.h" +#include "qemu/dsa.h" +#include "qemu/memalign.h" #include "exec/target_page.h" #include "sysemu/sysemu.h" #include "exec/ramblock.h" @@ -555,6 +557,8 @@ void multifd_save_cleanup(void) qemu_thread_join(&p->thread); } } + dsa_stop(); + dsa_cleanup(); for (i = 0; i < migrate_multifd_channels(); i++) { MultiFDSendParams *p = &multifd_send_state->params[i]; Error *local_err = NULL; @@ -571,6 +575,11 @@ void multifd_save_cleanup(void) p->name = NULL; multifd_pages_clear(p->pages); p->pages = NULL; + g_free(p->addr); + p->addr = NULL; + buffer_zero_batch_task_destroy(p->dsa_batch_task); + qemu_vfree(p->dsa_batch_task); + p->dsa_batch_task = NULL; p->packet_len = 0; g_free(p->packet); p->packet = NULL; @@ -675,13 +684,71 @@ int multifd_send_sync_main(QEMUFile *f) return 0; } +static void set_page(MultiFDSendParams *p, bool zero_page, uint64_t offset) +{ + RAMBlock *rb = p->pages->block; + if (zero_page) { + p->zero[p->zero_num] = offset; + p->zero_num++; + ram_release_page(rb->idstr, offset); + } else { + p->normal[p->normal_num] = offset; + p->normal_num++; + } +} + +static void buffer_is_zero_use_cpu(MultiFDSendParams *p) +{ + const void **buf = (const void **)p->addr; + assert(!migrate_use_main_zero_page()); + assert(!dsa_is_running()); + + for (int i = 0; i < p->pages->num; i++) { + p->dsa_batch_task->results[i] = buffer_is_zero(buf[i], p->page_size); + } +} + +static void buffer_is_zero_use_dsa(MultiFDSendParams *p) +{ + assert(!migrate_use_main_zero_page()); + assert(dsa_is_running()); + + buffer_is_zero_dsa_batch_async(p->dsa_batch_task, + (const void **)p->addr, + p->pages->num, + p->page_size); +} + +static void multifd_zero_page_check(MultiFDSendParams *p) +{ + /* older qemu don't understand zero page on multifd channel */ + bool use_multifd_zero_page = !migrate_use_main_zero_page(); + bool use_multifd_dsa_accel = dsa_is_running(); + + RAMBlock *rb = p->pages->block; + + for (int i = 0; i < p->pages->num; i++) { + p->addr[i] = (ram_addr_t)(rb->host + p->pages->offset[i]); + } + + if (!use_multifd_zero_page || !use_multifd_dsa_accel) { + buffer_is_zero_use_cpu(p); + } else { + buffer_is_zero_use_dsa(p); + } + + for (int i = 0; i < p->pages->num; i++) { + uint64_t offset = p->pages->offset[i]; + bool zero_page = p->dsa_batch_task->results[i]; + set_page(p, zero_page, offset); + } +} + static void *multifd_send_thread(void *opaque) { MultiFDSendParams *p = opaque; MigrationThread *thread = NULL; Error *local_err = NULL; - /* older qemu don't understand zero page on multifd channel */ - bool use_multifd_zero_page = !migrate_use_main_zero_page(); int ret = 0; bool use_zero_copy_send = migrate_zero_copy_send(); @@ -707,7 +774,6 @@ static void *multifd_send_thread(void *opaque) qemu_mutex_lock(&p->mutex); if (p->pending_job) { - RAMBlock *rb = p->pages->block; uint64_t packet_num = p->packet_num; p->flags = 0; if (p->sync_needed) { @@ -725,18 +791,7 @@ static void *multifd_send_thread(void *opaque) p->iovs_num = 1; } - for (int i = 0; i < p->pages->num; i++) { - uint64_t offset = p->pages->offset[i]; - if (use_multifd_zero_page && - buffer_is_zero(rb->host + offset, p->page_size)) { - p->zero[p->zero_num] = offset; - p->zero_num++; - ram_release_page(rb->idstr, offset); - } else { - p->normal[p->normal_num] = offset; - p->normal_num++; - } - } + multifd_zero_page_check(p); if (p->normal_num) { ret = multifd_send_state->ops->send_prepare(p, &local_err); @@ -958,11 +1013,15 @@ int multifd_save_setup(Error **errp) int thread_count; uint32_t page_count = MULTIFD_PACKET_SIZE / qemu_target_page_size(); uint8_t i; + const char *dsa_parameter = migrate_multifd_dsa_accel(); if (!migrate_multifd()) { return 0; } + dsa_init(dsa_parameter); + dsa_start(); + thread_count = migrate_multifd_channels(); multifd_send_state = g_malloc0(sizeof(*multifd_send_state)); multifd_send_state->params = g_new0(MultiFDSendParams, thread_count); @@ -981,6 +1040,10 @@ int multifd_save_setup(Error **errp) p->pending_job = 0; p->id = i; p->pages = multifd_pages_init(page_count); + p->addr = g_new0(ram_addr_t, page_count); + p->dsa_batch_task = + (struct buffer_zero_batch_task *)qemu_memalign(64, sizeof(*p->dsa_batch_task)); + buffer_zero_batch_task_init(p->dsa_batch_task, page_count); p->packet_len = sizeof(MultiFDPacket_t) + sizeof(uint64_t) * page_count; p->packet = g_malloc0(p->packet_len); @@ -1014,6 +1077,7 @@ int multifd_save_setup(Error **errp) return ret; } } + return 0; } @@ -1091,6 +1155,8 @@ void multifd_load_cleanup(void) qemu_thread_join(&p->thread); } + dsa_stop(); + dsa_cleanup(); for (i = 0; i < migrate_multifd_channels(); i++) { MultiFDRecvParams *p = &multifd_recv_state->params[i]; @@ -1225,6 +1291,7 @@ int multifd_load_setup(Error **errp) int thread_count; uint32_t page_count = MULTIFD_PACKET_SIZE / qemu_target_page_size(); uint8_t i; + const char *dsa_parameter = migrate_multifd_dsa_accel(); /* * Return successfully if multiFD recv state is already initialised @@ -1234,6 +1301,9 @@ int multifd_load_setup(Error **errp) return 0; } + dsa_init(dsa_parameter); + dsa_start(); + thread_count = migrate_multifd_channels(); multifd_recv_state = g_malloc0(sizeof(*multifd_recv_state)); multifd_recv_state->params = g_new0(MultiFDRecvParams, thread_count); @@ -1270,6 +1340,7 @@ int multifd_load_setup(Error **errp) return ret; } } + return 0; } diff --git a/migration/multifd.h b/migration/multifd.h index e8f90776bb..297b055e2b 100644 --- a/migration/multifd.h +++ b/migration/multifd.h @@ -114,6 +114,9 @@ typedef struct { * pending_job != 0 -> multifd_channel can use it. */ MultiFDPages_t *pages; + /* Address of each pages in pages */ + ram_addr_t *addr; + struct buffer_zero_batch_task *dsa_batch_task; /* thread local variables. No locking required */ From patchwork Wed Oct 25 19:38:17 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hao Xiang X-Patchwork-Id: 1855253 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=bytedance.com header.i=@bytedance.com header.a=rsa-sha256 header.s=google header.b=B/rlKsom; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=patchwork.ozlabs.org) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-ECDSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4SFzny46K7z23jn for ; Thu, 26 Oct 2023 06:41:42 +1100 (AEDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qvjjK-0000wd-G7; Wed, 25 Oct 2023 15:39:50 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qvjjI-0000uB-Lm for qemu-devel@nongnu.org; Wed, 25 Oct 2023 15:39:48 -0400 Received: from mail-qk1-x735.google.com ([2607:f8b0:4864:20::735]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1qvjjG-0006At-Th for qemu-devel@nongnu.org; Wed, 25 Oct 2023 15:39:48 -0400 Received: by mail-qk1-x735.google.com with SMTP id af79cd13be357-77774120c6eso9831885a.2 for ; Wed, 25 Oct 2023 12:39:46 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1698262786; x=1698867586; darn=nongnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=qrHVfcJmjwWlckvc+ypGZtCAWYmRlr+54BSzc+Ljn00=; b=B/rlKsomehtl8tjaB10HZbeRiSqpJwRdKFPmJGllCCfSz5qtaDDrCbi4LuSEF7XynT S4kkivLPF51c2TYfvkiSWtz6X0T8CMjjWjp49ncisC1VqmWXJe6eKGC2jKVHiC9s1SNi Z0ihPetkMXoiCBFDpS/XNt32LlWjm2BS9mRumbeUCM1oDH0fxSKgPzfX3t4lzyv6kGoH VWVgPz2+iElIDyxv9wjcxkQw0hNyIC2XeJBEDWDYhss64OxIFUupPAUS4g1WLmrpHLSv qmiVGXAL4fmnj/xy3/QMNaQPwsMjErN3XA3/t8/DXq3SraCO4qXFbqGTfWqBlB4zTZk9 Kt8g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1698262786; x=1698867586; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=qrHVfcJmjwWlckvc+ypGZtCAWYmRlr+54BSzc+Ljn00=; b=dcUfYMARhijh8UvlBlXU18EnL2izP7jEHW8c+Ci/rjO88xNlP3JcUUT/TRdnFDqYSW 24X1UM0l8Hg0T+adunlJENH9mGde60DEessh5VaJA98oMMwnQEh/4XGlXTiyaDG7LlMl zjoVxJmR+o6xC6xeiTXJh9SZ78iROT69DubPTDUOEYGq++mUNzVDTcUGpYOqAqX3VdE9 JDbeAsFZQ3ASM5qGN8gAVLrQZT/xgzS1Z2eN5RHTDN8QIMb5Xeq/WBMsSc+YiuiQ7ppx MlW8GigTjPnfzTMv0MM0Wyy7nXa9y3wVozazW1eyQQtB8nh6qYGtVemuyn1rKyTW8Awq Ix8Q== X-Gm-Message-State: AOJu0YzeIcJy4f1vZb6bkij7vojgbcs8at7eg6wNqpln4B41YGtLZeC1 7ZcBBwLZx8D0xM6xKqu4BhaUag== X-Google-Smtp-Source: AGHT+IGYiQso/p/7QMMXhIIYHzqKQKgvqU1zvOVf6Gxv6pnpJUJYoZpN2Atny0R/J24w3gril4+KOQ== X-Received: by 2002:a05:620a:f01:b0:770:fc39:25e4 with SMTP id v1-20020a05620a0f0100b00770fc3925e4mr17741343qkl.45.1698262785953; Wed, 25 Oct 2023 12:39:45 -0700 (PDT) Received: from n231-230-216.byted.org ([147.160.184.135]) by smtp.gmail.com with ESMTPSA id o8-20020a05620a228800b0076cdc3b5beasm4453721qkh.86.2023.10.25.12.39.44 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 25 Oct 2023 12:39:45 -0700 (PDT) From: Hao Xiang To: quintela@redhat.com, peterx@redhat.com, marcandre.lureau@redhat.com, bryan.zhang@bytedance.com, qemu-devel@nongnu.org Cc: Hao Xiang Subject: [PATCH 11/16] migration/multifd: Add test hook to set normal page ratio. Date: Wed, 25 Oct 2023 19:38:17 +0000 Message-Id: <20231025193822.2813204-12-hao.xiang@bytedance.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20231025193822.2813204-1-hao.xiang@bytedance.com> References: <20231025193822.2813204-1-hao.xiang@bytedance.com> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::735; envelope-from=hao.xiang@bytedance.com; helo=mail-qk1-x735.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Multifd sender thread performs zero page checking. If a page is a zero page, only the page's metadata is sent to the receiver. If a page is a normal page, the entire page's content is sent to the receiver. This change adds a test hook to set the normal page ratio. A zero page will be forced to be sent as a normal page. This is useful for live migration performance analysis and optimization. Signed-off-by: Hao Xiang --- migration/options.c | 31 +++++++++++++++++++++++++++++++ migration/options.h | 1 + qapi/migration.json | 17 ++++++++++++++--- 3 files changed, 46 insertions(+), 3 deletions(-) diff --git a/migration/options.c b/migration/options.c index 6a3a78a626..9ee0ad5d89 100644 --- a/migration/options.c +++ b/migration/options.c @@ -78,6 +78,11 @@ #define DEFAULT_MIGRATE_ANNOUNCE_ROUNDS 5 #define DEFAULT_MIGRATE_ANNOUNCE_STEP 100 +/* + * Parameter for multifd normal page test hook. + */ +#define DEFAULT_MIGRATE_MULTIFD_NORMAL_PAGE_RATIO 101 + #define DEFINE_PROP_MIG_CAP(name, x) \ DEFINE_PROP_BOOL(name, MigrationState, capabilities[x], false) @@ -175,6 +180,9 @@ Property migration_properties[] = { DEFAULT_MIGRATE_VCPU_DIRTY_LIMIT), DEFINE_PROP_STRING("multifd-dsa-accel", MigrationState, parameters.multifd_dsa_accel), + DEFINE_PROP_UINT8("multifd-normal-page-ratio", MigrationState, + parameters.multifd_normal_page_ratio, + DEFAULT_MIGRATE_MULTIFD_NORMAL_PAGE_RATIO), /* Migration capabilities */ DEFINE_PROP_MIG_CAP("x-xbzrle", MIGRATION_CAPABILITY_XBZRLE), @@ -808,6 +816,12 @@ int migrate_multifd_channels(void) return s->parameters.multifd_channels; } +uint8_t migrate_multifd_normal_page_ratio(void) +{ + MigrationState *s = migrate_get_current(); + return s->parameters.multifd_normal_page_ratio; +} + MultiFDCompression migrate_multifd_compression(void) { MigrationState *s = migrate_get_current(); @@ -1192,6 +1206,14 @@ bool migrate_params_check(MigrationParameters *params, Error **errp) return false; } + if (params->has_multifd_normal_page_ratio && + params->multifd_normal_page_ratio > 100) { + error_setg(errp, QERR_INVALID_PARAMETER_VALUE, + "multifd_normal_page_ratio", + "a value between 0 and 100"); + return false; + } + return true; } @@ -1304,6 +1326,11 @@ static void migrate_params_test_apply(MigrateSetParameters *params, assert(params->multifd_dsa_accel->type == QTYPE_QSTRING); dest->multifd_dsa_accel = params->multifd_dsa_accel->u.s; } + + if (params->has_multifd_normal_page_ratio) { + dest->has_multifd_normal_page_ratio = true; + dest->multifd_normal_page_ratio = params->multifd_normal_page_ratio; + } } static void migrate_params_apply(MigrateSetParameters *params, Error **errp) @@ -1436,6 +1463,10 @@ static void migrate_params_apply(MigrateSetParameters *params, Error **errp) assert(params->multifd_dsa_accel->type == QTYPE_QSTRING); s->parameters.multifd_dsa_accel = g_strdup(params->multifd_dsa_accel->u.s); } + + if (params->has_multifd_normal_page_ratio) { + s->parameters.multifd_normal_page_ratio = params->multifd_normal_page_ratio; + } } void qmp_migrate_set_parameters(MigrateSetParameters *params, Error **errp) diff --git a/migration/options.h b/migration/options.h index f757835b4a..dafb09d6ea 100644 --- a/migration/options.h +++ b/migration/options.h @@ -92,6 +92,7 @@ const char *migrate_tls_creds(void); const char *migrate_tls_hostname(void); uint64_t migrate_xbzrle_cache_size(void); const char *migrate_multifd_dsa_accel(void); +uint8_t migrate_multifd_normal_page_ratio(void); /* parameters setters */ diff --git a/qapi/migration.json b/qapi/migration.json index 201f58527e..a667527081 100644 --- a/qapi/migration.json +++ b/qapi/migration.json @@ -832,6 +832,9 @@ # @multifd-dsa-accel: If enabled, use DSA accelerator offloading for # certain memory operations. (since 8.1) # +# @multifd-normal-page-ratio: Test hook setting the normal page ratio. +# (Since 8.1) +# # Features: # # @unstable: Members @x-checkpoint-delay and @x-vcpu-dirty-limit-period @@ -856,7 +859,7 @@ 'multifd-zlib-level', 'multifd-zstd-level', 'block-bitmap-mapping', { 'name': 'x-vcpu-dirty-limit-period', 'features': ['unstable'] }, - 'vcpu-dirty-limit'] } + 'vcpu-dirty-limit', 'multifd-normal-page-ratio'] } ## # @MigrateSetParameters: @@ -1001,6 +1004,9 @@ # @multifd-dsa-accel: If enabled, use DSA accelerator offloading for # certain memory operations. (since 8.1) # +# @multifd-normal-page-ratio: Test hook setting the normal page ratio. +# (Since 8.1) +# # Features: # # @unstable: Members @x-checkpoint-delay and @x-vcpu-dirty-limit-period @@ -1043,7 +1049,8 @@ '*x-vcpu-dirty-limit-period': { 'type': 'uint64', 'features': [ 'unstable' ] }, '*vcpu-dirty-limit': 'uint64', - '*multifd-dsa-accel': 'StrOrNull'} } + '*multifd-dsa-accel': 'StrOrNull', + '*multifd-normal-page-ratio': 'uint8'} } ## # @migrate-set-parameters: @@ -1208,6 +1215,9 @@ # @multifd-dsa-accel: If enabled, use DSA accelerator offloading for # certain memory operations. (since 8.1) # +# @multifd-normal-page-ratio: Test hook setting the normal page ratio. +# (Since 8.1) +# # Features: # # @unstable: Members @x-checkpoint-delay and @x-vcpu-dirty-limit-period @@ -1247,7 +1257,8 @@ '*x-vcpu-dirty-limit-period': { 'type': 'uint64', 'features': [ 'unstable' ] }, '*vcpu-dirty-limit': 'uint64', - '*multifd-dsa-accel': 'str'} } + '*multifd-dsa-accel': 'str', + '*multifd-normal-page-ratio': 'uint8'} } ## # @query-migrate-parameters: From patchwork Wed Oct 25 19:38:18 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hao Xiang X-Patchwork-Id: 1855255 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=bytedance.com header.i=@bytedance.com header.a=rsa-sha256 header.s=google header.b=N9Pw7aQE; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=patchwork.ozlabs.org) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-ECDSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4SFzp12ddqz23jp for ; Thu, 26 Oct 2023 06:41:45 +1100 (AEDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qvjjL-0000wp-IM; Wed, 25 Oct 2023 15:39:51 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qvjjK-0000wI-0y for qemu-devel@nongnu.org; Wed, 25 Oct 2023 15:39:50 -0400 Received: from mail-qk1-x734.google.com ([2607:f8b0:4864:20::734]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1qvjjI-0006B7-9Y for qemu-devel@nongnu.org; Wed, 25 Oct 2023 15:39:49 -0400 Received: by mail-qk1-x734.google.com with SMTP id af79cd13be357-778a108ae49so102571885a.0 for ; Wed, 25 Oct 2023 12:39:47 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1698262787; x=1698867587; darn=nongnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=4IWhl0rawXmMq+XYbVZ524V+XuPqJsOdn1fgEyYuj5I=; b=N9Pw7aQElmmryzUVZbrK18rKXV2lhGlO507t4e+8y8g5rKictmGxtx3H4z7WH/dXS1 CTZtlEdtvLHXJlXna5JjyTUC9F2mNVJKGXm9iEK0sup5kFwtQfA6Jh1faqC/vjYikXLv oNKesgPMeFfsMep5ZTgo7+9Wt7PQV3f37bgo8eKJkyo1ihLZTtXvu3a2NdCJVio8Crjx ROX4OXuL4KMtOSQ66gwZNz5S4HBTWbEfBmKo8ce6/SehEldF2mtN/fQ2ZNeJN/vsf3PC sR4k26mPq4c4nZV4z/VIsIWC6zNlx9M0R248awkx2mBu0/aQPf/tGfbGSTN9vtIOadAJ SohA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1698262787; x=1698867587; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=4IWhl0rawXmMq+XYbVZ524V+XuPqJsOdn1fgEyYuj5I=; b=fu3VordpHllrpIbOyDbZZE40h1XS/fMo5gpQmlvc5PreuNwBwkRSmS5LyT2BAC1480 aqPrT3XoDyHsCSCHSbiEsVBiovLkFhoQhQ+tT/W5MBscZOUBKqQGHaeXxr0KAIlavl2U 6rInLbAtmyf0h+eu/w2HA6oZfOFSKXkdijPNn/nbWMyG40d+Vv6d4YRGqaVNZZbCaqzq ItlWJCmW1OX+SUq6NACcDkRKy10eBKzVFoEj2YyTdCJjh1tz99A0seeRfbT2K3xesF2K PxUcM4xSlA9kk0xWy2ovbS5nzaYcouQF/Rzz+TFFgrbp88gLJEU9mKylFrp4/hFl3j6B WVZw== X-Gm-Message-State: AOJu0YyMn5ST0Bc2uQYawTZGDtKdotlYkc9UU3aVIKP7gv6DBZbSIbsH 3UYDguJXskxLev7J4bqMDEM85w== X-Google-Smtp-Source: AGHT+IH2Gm3n/5cgVIqNXsCqdxRLzA62xaairKL0zvpWrRXKqVVOUgT3XW/aPezAAZkoIVJ6jj6QHQ== X-Received: by 2002:a05:620a:29d1:b0:774:2dc0:649b with SMTP id s17-20020a05620a29d100b007742dc0649bmr689518qkp.18.1698262787479; Wed, 25 Oct 2023 12:39:47 -0700 (PDT) Received: from n231-230-216.byted.org ([147.160.184.135]) by smtp.gmail.com with ESMTPSA id o8-20020a05620a228800b0076cdc3b5beasm4453721qkh.86.2023.10.25.12.39.47 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 25 Oct 2023 12:39:47 -0700 (PDT) From: Hao Xiang To: quintela@redhat.com, peterx@redhat.com, marcandre.lureau@redhat.com, bryan.zhang@bytedance.com, qemu-devel@nongnu.org Cc: Hao Xiang Subject: [PATCH 12/16] migration/multifd: Enable set normal page ratio test hook in multifd. Date: Wed, 25 Oct 2023 19:38:18 +0000 Message-Id: <20231025193822.2813204-13-hao.xiang@bytedance.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20231025193822.2813204-1-hao.xiang@bytedance.com> References: <20231025193822.2813204-1-hao.xiang@bytedance.com> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::734; envelope-from=hao.xiang@bytedance.com; helo=mail-qk1-x734.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Test hook is disabled by default. To set it, a normal page ratio between 0 and 100 are valid. If the ratio is set to 50, it means at least 50% of all pages are sent as normal pages. Signed-off-by: Hao Xiang --- include/qemu/dsa.h | 7 ++++++- migration/migration-hmp-cmds.c | 7 +++++++ migration/multifd.c | 36 +++++++++++++++++++++++++++++++++- 3 files changed, 48 insertions(+), 2 deletions(-) diff --git a/include/qemu/dsa.h b/include/qemu/dsa.h index 3f8ee07004..bc7f652e0b 100644 --- a/include/qemu/dsa.h +++ b/include/qemu/dsa.h @@ -37,7 +37,10 @@ typedef struct buffer_zero_batch_task { enum dsa_task_type task_type; enum dsa_task_status status; bool *results; - int batch_size; + uint32_t batch_size; + // Set normal page ratio test hook. + uint32_t normal_page_index; + uint32_t normal_page_counter; QSIMPLEQ_ENTRY(buffer_zero_batch_task) entry; } buffer_zero_batch_task; @@ -45,6 +48,8 @@ typedef struct buffer_zero_batch_task { struct buffer_zero_batch_task { bool *results; + uint32_t normal_page_index; + uint32_t normal_page_counter; }; #endif diff --git a/migration/migration-hmp-cmds.c b/migration/migration-hmp-cmds.c index bdffe9e023..e1f110afbc 100644 --- a/migration/migration-hmp-cmds.c +++ b/migration/migration-hmp-cmds.c @@ -351,6 +351,9 @@ void hmp_info_migrate_parameters(Monitor *mon, const QDict *qdict) monitor_printf(mon, "%s: %s\n", MigrationParameter_str(MIGRATION_PARAMETER_MULTIFD_DSA_ACCEL), params->multifd_dsa_accel); + monitor_printf(mon, "%s: %u\n", + MigrationParameter_str(MIGRATION_PARAMETER_MULTIFD_NORMAL_PAGE_RATIO), + params->multifd_normal_page_ratio); if (params->has_block_bitmap_mapping) { const BitmapMigrationNodeAliasList *bmnal; @@ -646,6 +649,10 @@ void hmp_migrate_set_parameter(Monitor *mon, const QDict *qdict) error_setg(&err, "The block-bitmap-mapping parameter can only be set " "through QMP"); break; + case MIGRATION_PARAMETER_MULTIFD_NORMAL_PAGE_RATIO: + p->has_multifd_normal_page_ratio = true; + visit_type_uint8(v, param, &p->multifd_normal_page_ratio, &err); + break; case MIGRATION_PARAMETER_X_VCPU_DIRTY_LIMIT_PERIOD: p->has_x_vcpu_dirty_limit_period = true; visit_type_size(v, param, &p->x_vcpu_dirty_limit_period, &err); diff --git a/migration/multifd.c b/migration/multifd.c index 79fecbd3ae..a0bfb48a7e 100644 --- a/migration/multifd.c +++ b/migration/multifd.c @@ -684,6 +684,37 @@ int multifd_send_sync_main(QEMUFile *f) return 0; } +static void multifd_normal_page_test_hook(MultiFDSendParams *p) +{ + /* + * The value is between 0 to 100. If the value is 10, it means at + * least 10% of the pages are normal page. A zero page can be made + * a normal page but not the other way around. + */ + uint8_t multifd_normal_page_ratio = + migrate_multifd_normal_page_ratio(); + struct buffer_zero_batch_task *dsa_batch_task = p->dsa_batch_task; + + // Set normal page test hook is disabled. + if (multifd_normal_page_ratio > 100) { + return; + } + + for (int i = 0; i < p->pages->num; i++) { + if (dsa_batch_task->normal_page_counter < multifd_normal_page_ratio) { + // Turn a zero page into a normal page. + dsa_batch_task->results[i] = false; + } + dsa_batch_task->normal_page_index++; + dsa_batch_task->normal_page_counter++; + + if (dsa_batch_task->normal_page_index >= 100) { + dsa_batch_task->normal_page_index = 0; + dsa_batch_task->normal_page_counter = 0; + } + } +} + static void set_page(MultiFDSendParams *p, bool zero_page, uint64_t offset) { RAMBlock *rb = p->pages->block; @@ -704,7 +735,8 @@ static void buffer_is_zero_use_cpu(MultiFDSendParams *p) assert(!dsa_is_running()); for (int i = 0; i < p->pages->num; i++) { - p->dsa_batch_task->results[i] = buffer_is_zero(buf[i], p->page_size); + p->dsa_batch_task->results[i] = + buffer_is_zero(buf[i], p->page_size); } } @@ -737,6 +769,8 @@ static void multifd_zero_page_check(MultiFDSendParams *p) buffer_is_zero_use_dsa(p); } + multifd_normal_page_test_hook(p); + for (int i = 0; i < p->pages->num; i++) { uint64_t offset = p->pages->offset[i]; bool zero_page = p->dsa_batch_task->results[i]; From patchwork Wed Oct 25 19:38:19 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hao Xiang X-Patchwork-Id: 1855249 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=bytedance.com header.i=@bytedance.com header.a=rsa-sha256 header.s=google header.b=IIilD8oj; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=patchwork.ozlabs.org) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-ECDSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4SFzmq5zqTz23jn for ; Thu, 26 Oct 2023 06:40:43 +1100 (AEDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qvjjO-0000yI-4a; Wed, 25 Oct 2023 15:39:54 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qvjjM-0000wu-5Z for qemu-devel@nongnu.org; Wed, 25 Oct 2023 15:39:52 -0400 Received: from mail-qk1-x734.google.com ([2607:f8b0:4864:20::734]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1qvjjJ-0006BJ-OA for qemu-devel@nongnu.org; Wed, 25 Oct 2023 15:39:51 -0400 Received: by mail-qk1-x734.google.com with SMTP id af79cd13be357-778999c5ecfso10759785a.2 for ; Wed, 25 Oct 2023 12:39:49 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1698262789; x=1698867589; darn=nongnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=HCBwezzBVtO8oUhpBVp7AXk3ceMQWBAVIo2+dQvz4iw=; b=IIilD8ojXCBSY0ZK09ryoxhtV2Qz5QXapGMeffAei0vfqc32G0FIZZ6UuPEQiErH03 yQ7EeZ/Lqnksx2IV7QDyC1PuBY//XpsbsIQKO2PV00zQVBfFm4PWxweNX8d2eD6Rej7S pix1oVbLbB8Km53ocCgaIYasaOMSZbcEFBF+A7tiPetTBCnrHAHFojXaVTXR9HPBllZX bLTFmlDvTrPvcUi+C/LeCQYPldQSrHoidNlM9GodwYg6z0cwwNpQ5qfWEAkqS5Lqbi6G AfYPJCXsrraKQgYv1/rqH4/FNSbTb7s6Tiv64214A56hbVeVVdW/a4UlOjwjTVXEoP1D NA3A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1698262789; x=1698867589; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=HCBwezzBVtO8oUhpBVp7AXk3ceMQWBAVIo2+dQvz4iw=; b=Lw1dMJSR2HENNZmAIbIkssO7ug/0iJ363h9xe+SpvSE8dvob4m6APuiDjOuR/p1Gev C8ZegEohKnLFjrbixJWsj3wkOhT30CFtqgtr+9kvZGefUdkQ1Jc1Tg0Ryw2DVkyfsxkV IMxl+1oFX8BQ9HmfrG/8ydOUMZr2FqA/RLaPE4DOU13oFclC1AqfqsROedHCxLvPRA2G u5tPfwWOP+OhG6/7T68Apsxnoh9z1hdhSp1bR+L+wgi6VL24SYp5ZuS3L0f57OEwDHBT EJrJs0zEYbMsu2Q5vGYIshdSQJqQ3JR3yqdncICFbm/o16kLKwIJDbUUqzc9fuKOo/Mw t6sA== X-Gm-Message-State: AOJu0YwdT6ehVfLYqMtij8d2Nrvt41IztMHFf+4U6hMczkzwQeJJxb3T w8fLyPwZBMngDBoi8rWK5+VNXB4+zC0/TjKu6As= X-Google-Smtp-Source: AGHT+IGQCvPVrVN/fs7/WJdMP8OEOrE1/bKAgoC42g5T3FHcB85yJRAuoHDd3uyYv/I3ELTc0+Kcrg== X-Received: by 2002:a05:620a:2947:b0:779:f0be:8d7e with SMTP id n7-20020a05620a294700b00779f0be8d7emr5514219qkp.27.1698262788803; Wed, 25 Oct 2023 12:39:48 -0700 (PDT) Received: from n231-230-216.byted.org ([147.160.184.135]) by smtp.gmail.com with ESMTPSA id o8-20020a05620a228800b0076cdc3b5beasm4453721qkh.86.2023.10.25.12.39.48 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 25 Oct 2023 12:39:48 -0700 (PDT) From: Hao Xiang To: quintela@redhat.com, peterx@redhat.com, marcandre.lureau@redhat.com, bryan.zhang@bytedance.com, qemu-devel@nongnu.org Cc: Hao Xiang Subject: [PATCH 13/16] migration/multifd: Add migration option set packet size. Date: Wed, 25 Oct 2023 19:38:19 +0000 Message-Id: <20231025193822.2813204-14-hao.xiang@bytedance.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20231025193822.2813204-1-hao.xiang@bytedance.com> References: <20231025193822.2813204-1-hao.xiang@bytedance.com> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::734; envelope-from=hao.xiang@bytedance.com; helo=mail-qk1-x734.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org The current multifd packet size is 128 * 4kb. This change adds an option to set the packet size. Both sender and receiver needs to set the same packet size for things to work. Signed-off-by: Hao Xiang --- migration/options.c | 34 ++++++++++++++++++++++++++++++++++ migration/options.h | 1 + qapi/migration.json | 20 +++++++++++++++++--- 3 files changed, 52 insertions(+), 3 deletions(-) diff --git a/migration/options.c b/migration/options.c index 9ee0ad5d89..6cb3d19470 100644 --- a/migration/options.c +++ b/migration/options.c @@ -83,6 +83,12 @@ */ #define DEFAULT_MIGRATE_MULTIFD_NORMAL_PAGE_RATIO 101 +/* + * Parameter for multifd packet size. + */ +#define DEFAULT_MIGRATE_MULTIFD_PACKET_SIZE (512 * 1024) +#define MAX_MIGRATE_MULTIFD_PACKET_SIZE (1024 * 4 * 1024) + #define DEFINE_PROP_MIG_CAP(name, x) \ DEFINE_PROP_BOOL(name, MigrationState, capabilities[x], false) @@ -183,6 +189,9 @@ Property migration_properties[] = { DEFINE_PROP_UINT8("multifd-normal-page-ratio", MigrationState, parameters.multifd_normal_page_ratio, DEFAULT_MIGRATE_MULTIFD_NORMAL_PAGE_RATIO), + DEFINE_PROP_SIZE("multifd-packet-size", MigrationState, + parameters.multifd_packet_size, + DEFAULT_MIGRATE_MULTIFD_PACKET_SIZE), /* Migration capabilities */ DEFINE_PROP_MIG_CAP("x-xbzrle", MIGRATION_CAPABILITY_XBZRLE), @@ -822,6 +831,13 @@ uint8_t migrate_multifd_normal_page_ratio(void) return s->parameters.multifd_normal_page_ratio; } +uint64_t migrate_multifd_packet_size(void) +{ + MigrationState *s = migrate_get_current(); + + return s->parameters.multifd_packet_size; +} + MultiFDCompression migrate_multifd_compression(void) { MigrationState *s = migrate_get_current(); @@ -958,6 +974,8 @@ MigrationParameters *qmp_query_migrate_parameters(Error **errp) params->x_checkpoint_delay = s->parameters.x_checkpoint_delay; params->has_block_incremental = true; params->block_incremental = s->parameters.block_incremental; + params->has_multifd_packet_size = true; + params->multifd_packet_size = s->parameters.multifd_packet_size; params->has_multifd_channels = true; params->multifd_channels = s->parameters.multifd_channels; params->has_multifd_compression = true; @@ -1016,6 +1034,7 @@ void migrate_params_init(MigrationParameters *params) params->has_downtime_limit = true; params->has_x_checkpoint_delay = true; params->has_block_incremental = true; + params->has_multifd_packet_size = true; params->has_multifd_channels = true; params->has_multifd_compression = true; params->has_multifd_zlib_level = true; @@ -1104,6 +1123,15 @@ bool migrate_params_check(MigrationParameters *params, Error **errp) /* x_checkpoint_delay is now always positive */ + if (params->has_multifd_packet_size && + ((params->multifd_packet_size < DEFAULT_MIGRATE_MULTIFD_PACKET_SIZE) || + (params->multifd_packet_size > MAX_MIGRATE_MULTIFD_PACKET_SIZE))) { + error_setg(errp, QERR_INVALID_PARAMETER_VALUE, + "multifd_packet_size", + "a value between 524288 and 4194304"); + return false; + } + if (params->has_multifd_channels && (params->multifd_channels < 1)) { error_setg(errp, QERR_INVALID_PARAMETER_VALUE, "multifd_channels", @@ -1281,6 +1309,9 @@ static void migrate_params_test_apply(MigrateSetParameters *params, if (params->has_block_incremental) { dest->block_incremental = params->block_incremental; } + if (params->has_multifd_packet_size) { + dest->multifd_packet_size = params->multifd_packet_size; + } if (params->has_multifd_channels) { dest->multifd_channels = params->multifd_channels; } @@ -1408,6 +1439,9 @@ static void migrate_params_apply(MigrateSetParameters *params, Error **errp) if (params->has_block_incremental) { s->parameters.block_incremental = params->block_incremental; } + if (params->has_multifd_packet_size) { + s->parameters.multifd_packet_size = params->multifd_packet_size; + } if (params->has_multifd_channels) { s->parameters.multifd_channels = params->multifd_channels; } diff --git a/migration/options.h b/migration/options.h index dafb09d6ea..1170971aef 100644 --- a/migration/options.h +++ b/migration/options.h @@ -93,6 +93,7 @@ const char *migrate_tls_hostname(void); uint64_t migrate_xbzrle_cache_size(void); const char *migrate_multifd_dsa_accel(void); uint8_t migrate_multifd_normal_page_ratio(void); +uint64_t migrate_multifd_packet_size(void); /* parameters setters */ diff --git a/qapi/migration.json b/qapi/migration.json index a667527081..a492b73060 100644 --- a/qapi/migration.json +++ b/qapi/migration.json @@ -835,6 +835,10 @@ # @multifd-normal-page-ratio: Test hook setting the normal page ratio. # (Since 8.1) # +# @multifd-packet-size: Packet size used to migrate data. This value +# needs to be a multiple of qemu_target_page_size(). The default +# value is (512 * 1024) (Since 8.0) +# # Features: # # @unstable: Members @x-checkpoint-delay and @x-vcpu-dirty-limit-period @@ -859,7 +863,7 @@ 'multifd-zlib-level', 'multifd-zstd-level', 'block-bitmap-mapping', { 'name': 'x-vcpu-dirty-limit-period', 'features': ['unstable'] }, - 'vcpu-dirty-limit', 'multifd-normal-page-ratio'] } + 'vcpu-dirty-limit', 'multifd-normal-page-ratio', 'multifd-packet-size'] } ## # @MigrateSetParameters: @@ -1007,6 +1011,10 @@ # @multifd-normal-page-ratio: Test hook setting the normal page ratio. # (Since 8.1) # +# @multifd-packet-size: Packet size used to migrate data. This value +# needs to be a multiple of qemu_target_page_size(). The default +# value is (512 * 1024) (Since 8.0) +# # Features: # # @unstable: Members @x-checkpoint-delay and @x-vcpu-dirty-limit-period @@ -1050,7 +1058,8 @@ 'features': [ 'unstable' ] }, '*vcpu-dirty-limit': 'uint64', '*multifd-dsa-accel': 'StrOrNull', - '*multifd-normal-page-ratio': 'uint8'} } + '*multifd-normal-page-ratio': 'uint8', + '*multifd-packet-size' : 'uint64'} } ## # @migrate-set-parameters: @@ -1218,6 +1227,10 @@ # @multifd-normal-page-ratio: Test hook setting the normal page ratio. # (Since 8.1) # +# @multifd-packet-size: Packet size used to migrate data. This value +# needs to be a multiple of qemu_target_page_size(). The default +# value is (512 * 1024) (Since 8.0) +# # Features: # # @unstable: Members @x-checkpoint-delay and @x-vcpu-dirty-limit-period @@ -1258,7 +1271,8 @@ 'features': [ 'unstable' ] }, '*vcpu-dirty-limit': 'uint64', '*multifd-dsa-accel': 'str', - '*multifd-normal-page-ratio': 'uint8'} } + '*multifd-normal-page-ratio': 'uint8', + '*multifd-packet-size': 'uint64'} } ## # @query-migrate-parameters: From patchwork Wed Oct 25 19:38:20 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hao Xiang X-Patchwork-Id: 1855260 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=bytedance.com header.i=@bytedance.com header.a=rsa-sha256 header.s=google header.b=ZbhCQv95; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=patchwork.ozlabs.org) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-ECDSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4SFzpj3plZz23jn for ; Thu, 26 Oct 2023 06:42:21 +1100 (AEDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qvjjX-0001HP-OF; Wed, 25 Oct 2023 15:40:08 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qvjjT-00015l-Nv for qemu-devel@nongnu.org; Wed, 25 Oct 2023 15:39:59 -0400 Received: from mail-qk1-x731.google.com ([2607:f8b0:4864:20::731]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1qvjjL-0006BU-U9 for qemu-devel@nongnu.org; Wed, 25 Oct 2023 15:39:58 -0400 Received: by mail-qk1-x731.google.com with SMTP id af79cd13be357-77063481352so101000985a.1 for ; Wed, 25 Oct 2023 12:39:50 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1698262790; x=1698867590; darn=nongnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=M+NUDX6FYlsNIn76fnLfQd7sg/7pGQAjAqSwBoE6gkI=; b=ZbhCQv95whfAI03CtqgeEMnYazXc5F1rDLoYCdwqSRCiQcWGHR3kVMZiP5pTydr1Ye Puq0FkgcPdmdlc4Pg+Mp/A1xhaZrqyvIrY+HG/Z2guXH/FP+uB3AZNZl/Juv/p8RErwT LA4ELgOLmASsfYMJ9WGtUX4Ioaf3AhAXhkEgEqbWuMt8DqGx6X616FFlMamOhRBe6d3q zY5pMkRKgwDHVVpamS9dfopXf/ztMpVmL/dtBhDxaAzfl2ZgBQdZiJGwS3g05o3I8YWI 1jd1bv+KPHULneoSoVBMv4BMOgbU/JF+4WcVTTxRwv/3ITHHChPrGK+1tpse1ylIorSV PAcw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1698262790; x=1698867590; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=M+NUDX6FYlsNIn76fnLfQd7sg/7pGQAjAqSwBoE6gkI=; b=G/4kcbLJhlq5miNj1UnAKR0g19DSbhr2I6/sCDVTwmbh8TmV98yHll0JLXPkGFAio7 DK1BfGYVWrcCoQEAUCSf1XfCH4Dojx8Q5eqzkFVGF4O7rRSGE0U3HaV0teUIo2+jKJws MziH3JQrL4l+W2392nBdbDf7pVP9UTGq8qqlamLBnoSGRqysS6eEP/zqUGBLxkdWmg4R lTTK7bLCSKf01sps3trQjBi9Hx8wfFOC59/7HyMDSDP4W2ov+dJGTap+PAbEKHXX8R0g hfpf4dTmCEPKDXxvo9bMfLcd3GP40n0dzIuWD2HxK+Cy4E8RMfurdaZnPaJmqOr6jHcl r82g== X-Gm-Message-State: AOJu0YyiO6EUQ/TehPu8lsgpkuDtE+y430TwMRG0PRQpbNdRAnTOVZbj Iv/innutFtLlDEpOPTtIybpoiA== X-Google-Smtp-Source: AGHT+IEi5W40koLoD77Y1VAkgr7zah8uWafJNs5AaEdLxAKdchT3RdEGgtZxXKObIse6usYwF6x3TA== X-Received: by 2002:a05:620a:4724:b0:773:ca5c:4556 with SMTP id bs36-20020a05620a472400b00773ca5c4556mr929497qkb.10.1698262790124; Wed, 25 Oct 2023 12:39:50 -0700 (PDT) Received: from n231-230-216.byted.org ([147.160.184.135]) by smtp.gmail.com with ESMTPSA id o8-20020a05620a228800b0076cdc3b5beasm4453721qkh.86.2023.10.25.12.39.49 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 25 Oct 2023 12:39:49 -0700 (PDT) From: Hao Xiang To: quintela@redhat.com, peterx@redhat.com, marcandre.lureau@redhat.com, bryan.zhang@bytedance.com, qemu-devel@nongnu.org Cc: Hao Xiang Subject: [PATCH 14/16] migration/multifd: Enable set packet size migration option. Date: Wed, 25 Oct 2023 19:38:20 +0000 Message-Id: <20231025193822.2813204-15-hao.xiang@bytedance.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20231025193822.2813204-1-hao.xiang@bytedance.com> References: <20231025193822.2813204-1-hao.xiang@bytedance.com> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::731; envelope-from=hao.xiang@bytedance.com; helo=mail-qk1-x731.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org During live migration, if the latency between sender and receiver is high but bandwidth is high (a long and fat pipe), using a bigger packet size can help reduce migration total time. In addition, Intel DSA offloading performs better with a large batch task. Providing an option to set the packet size is useful for performance tuning. Signed-off-by: Hao Xiang --- migration/migration-hmp-cmds.c | 7 +++++++ migration/multifd-zlib.c | 4 ++-- migration/multifd-zstd.c | 4 ++-- migration/multifd.c | 6 ++++-- migration/multifd.h | 3 --- 5 files changed, 15 insertions(+), 9 deletions(-) diff --git a/migration/migration-hmp-cmds.c b/migration/migration-hmp-cmds.c index e1f110afbc..c53e4d8543 100644 --- a/migration/migration-hmp-cmds.c +++ b/migration/migration-hmp-cmds.c @@ -333,6 +333,9 @@ void hmp_info_migrate_parameters(Monitor *mon, const QDict *qdict) monitor_printf(mon, "%s: %s\n", MigrationParameter_str(MIGRATION_PARAMETER_BLOCK_INCREMENTAL), params->block_incremental ? "on" : "off"); + monitor_printf(mon, "%s: %" PRIu64 "\n", + MigrationParameter_str(MIGRATION_PARAMETER_MULTIFD_PACKET_SIZE), + params->multifd_packet_size); monitor_printf(mon, "%s: %u\n", MigrationParameter_str(MIGRATION_PARAMETER_MULTIFD_CHANNELS), params->multifd_channels); @@ -597,6 +600,10 @@ void hmp_migrate_set_parameter(Monitor *mon, const QDict *qdict) p->multifd_dsa_accel->type = QTYPE_QSTRING; visit_type_str(v, param, &p->multifd_dsa_accel->u.s, &err); break; + case MIGRATION_PARAMETER_MULTIFD_PACKET_SIZE: + p->has_multifd_packet_size = true; + visit_type_size(v, param, &p->multifd_packet_size, &err); + break; case MIGRATION_PARAMETER_MULTIFD_CHANNELS: p->has_multifd_channels = true; visit_type_uint8(v, param, &p->multifd_channels, &err); diff --git a/migration/multifd-zlib.c b/migration/multifd-zlib.c index 37ce48621e..a1b127d0d1 100644 --- a/migration/multifd-zlib.c +++ b/migration/multifd-zlib.c @@ -58,7 +58,7 @@ static int zlib_send_setup(MultiFDSendParams *p, Error **errp) goto err_free_z; } /* This is the maximum size of the compressed buffer */ - z->zbuff_len = compressBound(MULTIFD_PACKET_SIZE); + z->zbuff_len = compressBound(migrate_multifd_packet_size()); z->zbuff = g_try_malloc(z->zbuff_len); if (!z->zbuff) { err_msg = "out of memory for zbuff"; @@ -200,7 +200,7 @@ static int zlib_recv_setup(MultiFDRecvParams *p, Error **errp) return -1; } /* To be safe, we reserve twice the size of the packet */ - z->zbuff_len = MULTIFD_PACKET_SIZE * 2; + z->zbuff_len = migrate_multifd_packet_size() * 2; z->zbuff = g_try_malloc(z->zbuff_len); if (!z->zbuff) { inflateEnd(zs); diff --git a/migration/multifd-zstd.c b/migration/multifd-zstd.c index b471daadcd..0c92112702 100644 --- a/migration/multifd-zstd.c +++ b/migration/multifd-zstd.c @@ -69,7 +69,7 @@ static int zstd_send_setup(MultiFDSendParams *p, Error **errp) return -1; } /* This is the maximum size of the compressed buffer */ - z->zbuff_len = ZSTD_compressBound(MULTIFD_PACKET_SIZE); + z->zbuff_len = ZSTD_compressBound(migrate_multifd_packet_size()); z->zbuff = g_try_malloc(z->zbuff_len); if (!z->zbuff) { ZSTD_freeCStream(z->zcs); @@ -196,7 +196,7 @@ static int zstd_recv_setup(MultiFDRecvParams *p, Error **errp) } /* To be safe, we reserve twice the size of the packet */ - z->zbuff_len = MULTIFD_PACKET_SIZE * 2; + z->zbuff_len = migrate_multifd_packet_size() * 2; z->zbuff = g_try_malloc(z->zbuff_len); if (!z->zbuff) { ZSTD_freeDStream(z->zds); diff --git a/migration/multifd.c b/migration/multifd.c index a0bfb48a7e..a6ecfdd449 100644 --- a/migration/multifd.c +++ b/migration/multifd.c @@ -1045,7 +1045,8 @@ static void multifd_new_send_channel_async(QIOTask *task, gpointer opaque) int multifd_save_setup(Error **errp) { int thread_count; - uint32_t page_count = MULTIFD_PACKET_SIZE / qemu_target_page_size(); + uint32_t page_count = + migrate_multifd_packet_size() / qemu_target_page_size(); uint8_t i; const char *dsa_parameter = migrate_multifd_dsa_accel(); @@ -1323,7 +1324,8 @@ static void *multifd_recv_thread(void *opaque) int multifd_load_setup(Error **errp) { int thread_count; - uint32_t page_count = MULTIFD_PACKET_SIZE / qemu_target_page_size(); + uint32_t page_count = + migrate_multifd_packet_size() / qemu_target_page_size(); uint8_t i; const char *dsa_parameter = migrate_multifd_dsa_accel(); diff --git a/migration/multifd.h b/migration/multifd.h index 297b055e2b..8b1cf136d7 100644 --- a/migration/multifd.h +++ b/migration/multifd.h @@ -34,9 +34,6 @@ int multifd_queue_page(QEMUFile *f, RAMBlock *block, ram_addr_t offset); #define MULTIFD_FLAG_ZLIB (1 << 1) #define MULTIFD_FLAG_ZSTD (2 << 1) -/* This value needs to be a multiple of qemu_target_page_size() */ -#define MULTIFD_PACKET_SIZE (512 * 1024) - typedef struct { uint32_t magic; uint32_t version; From patchwork Wed Oct 25 19:38:21 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hao Xiang X-Patchwork-Id: 1855250 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=bytedance.com header.i=@bytedance.com header.a=rsa-sha256 header.s=google header.b=DoBqx1e/; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=patchwork.ozlabs.org) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-ECDSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4SFzms5MPPz23jn for ; Thu, 26 Oct 2023 06:40:45 +1100 (AEDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qvjjV-0001FF-PN; Wed, 25 Oct 2023 15:40:01 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qvjjS-00015V-Tk for qemu-devel@nongnu.org; Wed, 25 Oct 2023 15:39:58 -0400 Received: from mail-qv1-xf34.google.com ([2607:f8b0:4864:20::f34]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1qvjjO-0006Be-GQ for qemu-devel@nongnu.org; Wed, 25 Oct 2023 15:39:58 -0400 Received: by mail-qv1-xf34.google.com with SMTP id 6a1803df08f44-66d0ceba445so806726d6.0 for ; Wed, 25 Oct 2023 12:39:51 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1698262791; x=1698867591; darn=nongnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=ZQDZW053br4jqH8A0OjzZIKhK8kWorlpnIDYXHN+uPc=; b=DoBqx1e/J8qoZ81nM2ol3EhwzF9jXha1ltJL25a3YR+hu1dYw4VCj53oIvRDJBPSUY yMqCtboII07mCio2CHXLPmHxDj8O1s+o/kpJGcJ67hksGYnN6nxVu0JE52StKgdw/VeP 0EXPl1kYl3LWdQDtT62piTwoxpA4ZReiTzK9C5fZaz1SYa4z8L5a76iKHgBCjFgem6nV 5UyTr7e2Irgn7eY48yrA8TxuSLHvoNCa5ig0P19aHotaS+gnaqnmlt3pG93IQC2l/ir3 6uIRDGoT5uz2z3/K0zigghLV9ewiR2dwBTGZCsq6XBwi++QEOUWlW8WB+dk4TOe8K7+v Euhw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1698262791; x=1698867591; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=ZQDZW053br4jqH8A0OjzZIKhK8kWorlpnIDYXHN+uPc=; b=QwkPdISBJo6s/gWlDDifmbf63JbD44H2A+hJRS631rLrfAXuxablMor7QYFSeTFQqC QoPetoCDljy8KTTW/e1AcEUbOcGNT76Hr2YlTOofXOP9IBXhxpSZI3fD0iD1g/tOcOU7 HCIOkeTI30ujI1DrzTtp3iF40TMdn/rY7b6XQn3OuxrWmIsyApkGiLV+fD9ilVj4cfe8 jnDu0vy7A8Q+iEhXUxeh3VKZt++EbwkAe5VUbcH2SOrHjGxpCfeYuu7vwndCdf8x6NA/ 6WTAO4kbFKAYidNB5GqIQX0IV1MG2P57J1dbF2IboW8oX6Pa74kc9j3TFA1UDqylVYmQ 7JWQ== X-Gm-Message-State: AOJu0YzqakYftyD8HJZ1IyMjwgxxCkOQ9pTrQpAmyOLVTgJyP/QEQzKn LIdQjqGKlaPV8fjWXCscWDMjijEBY2l4fE0F0Ws= X-Google-Smtp-Source: AGHT+IHyrxFt+7iOa9Y36Y9YN9FlmcNbwkmrETpkQWUgp9PM5Kc1fXkoykqjE1nYsdQNTyPpI/wv3Q== X-Received: by 2002:a05:6214:29cc:b0:66d:9589:6bb6 with SMTP id gh12-20020a05621429cc00b0066d95896bb6mr15490798qvb.63.1698262791345; Wed, 25 Oct 2023 12:39:51 -0700 (PDT) Received: from n231-230-216.byted.org ([147.160.184.135]) by smtp.gmail.com with ESMTPSA id o8-20020a05620a228800b0076cdc3b5beasm4453721qkh.86.2023.10.25.12.39.50 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 25 Oct 2023 12:39:51 -0700 (PDT) From: Hao Xiang To: quintela@redhat.com, peterx@redhat.com, marcandre.lureau@redhat.com, bryan.zhang@bytedance.com, qemu-devel@nongnu.org Cc: Hao Xiang Subject: [PATCH 15/16] util/dsa: Add unit test coverage for Intel DSA task submission and completion. Date: Wed, 25 Oct 2023 19:38:21 +0000 Message-Id: <20231025193822.2813204-16-hao.xiang@bytedance.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20231025193822.2813204-1-hao.xiang@bytedance.com> References: <20231025193822.2813204-1-hao.xiang@bytedance.com> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::f34; envelope-from=hao.xiang@bytedance.com; helo=mail-qv1-xf34.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org * Test DSA start and stop path. * Test DSA configure and cleanup path. * Test DSA task submission and completion path. Signed-off-by: Bryan Zhang Signed-off-by: Hao Xiang --- tests/unit/meson.build | 6 + tests/unit/test-dsa.c | 448 +++++++++++++++++++++++++++++++++++++++++ 2 files changed, 454 insertions(+) create mode 100644 tests/unit/test-dsa.c diff --git a/tests/unit/meson.build b/tests/unit/meson.build index f33ae64b8d..e4975ae7b8 100644 --- a/tests/unit/meson.build +++ b/tests/unit/meson.build @@ -53,6 +53,12 @@ tests = { 'test-virtio-dmabuf': [meson.project_source_root() / 'hw/display/virtio-dmabuf.c'], } +if config_host_data.get('CONFIG_DSA_OPT') + tests += { + 'test-dsa': [], + } +endif + if have_system or have_tools tests += { 'test-qmp-event': [testqapi], diff --git a/tests/unit/test-dsa.c b/tests/unit/test-dsa.c new file mode 100644 index 0000000000..62af3ebec4 --- /dev/null +++ b/tests/unit/test-dsa.c @@ -0,0 +1,448 @@ +/* + * Test DSA functions. + * + * Copyright (c) 2023 Hao Xiang + * Copyright (c) 2023 Bryan Zhang + * + * This library is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * This library is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with this library; if not, see . + */ +#include "qemu/osdep.h" +#include "qemu/host-utils.h" + +#include "qemu/cutils.h" +#include "qemu/memalign.h" +#include "qemu/dsa.h" + +// TODO Make these not-hardcoded. +static const char *path1 = "/dev/dsa/wq4.0"; +static const char *path2 = "/dev/dsa/wq4.0 /dev/dsa/wq4.1"; +static const int num_devices = 2; + +static struct buffer_zero_batch_task batch_task __attribute__((aligned(64))); + +// TODO Communicate that DSA must be configured to support this batch size. +// TODO Alternatively, poke the DSA device to figure out batch size. +static int batch_size = 128; +static int page_size = 4096; + +// A helper for running a single task and checking for correctness. +static void do_single_task(void) +{ + buffer_zero_batch_task_init(&batch_task, batch_size); + char buf[page_size]; + char* ptr = buf; + + buffer_is_zero_dsa_batch_async(&batch_task, + (const void**) &ptr, + 1, + page_size); + g_assert(batch_task.results[0] == buffer_is_zero(buf, page_size)); +} + +static void test_single_zero(void) +{ + g_assert(!dsa_init(path1)); + dsa_start(); + + buffer_zero_batch_task_init(&batch_task, batch_size); + + char buf[page_size]; + char* ptr = buf; + + memset(buf, 0x0, page_size); + buffer_is_zero_dsa_batch_async(&batch_task, + (const void**) &ptr, + 1, page_size); + g_assert(batch_task.results[0]); + + dsa_cleanup(); +} + +static void test_single_zero_async(void) +{ + test_single_zero(); +} + +static void test_single_nonzero(void) +{ + g_assert(!dsa_init(path1)); + dsa_start(); + + buffer_zero_batch_task_init(&batch_task, batch_size); + + char buf[page_size]; + char* ptr = buf; + + memset(buf, 0x1, page_size); + buffer_is_zero_dsa_batch_async(&batch_task, + (const void**) &ptr, + 1, page_size); + g_assert(!batch_task.results[0]); + + dsa_cleanup(); +} + +static void test_single_nonzero_async(void) +{ + test_single_nonzero(); +} + +// count == 0 should return quickly without calling into DSA. +static void test_zero_count_async(void) +{ + char buf[page_size]; + buffer_is_zero_dsa_batch_async(&batch_task, + (const void **) &buf, + 0, + page_size); +} + +static void test_null_task_async(void) +{ + if (g_test_subprocess()) { + g_assert(!dsa_init(path1)); + + char buf[page_size * batch_size]; + char *addrs[batch_size]; + for (int i = 0; i < batch_size; i++) { + addrs[i] = buf + (page_size * i); + } + + buffer_is_zero_dsa_batch_async(NULL, (const void**) addrs, batch_size, + page_size); + } else { + g_test_trap_subprocess(NULL, 0, 0); + g_test_trap_assert_failed(); + } +} + +static void test_oversized_batch(void) +{ + g_assert(!dsa_init(path1)); + dsa_start(); + + buffer_zero_batch_task_init(&batch_task, batch_size); + + int oversized_batch_size = batch_size + 1; + char buf[page_size * oversized_batch_size]; + char *addrs[batch_size]; + for (int i = 0; i < oversized_batch_size; i++) { + addrs[i] = buf + (page_size * i); + } + + int ret = buffer_is_zero_dsa_batch_async(&batch_task, + (const void**) addrs, + oversized_batch_size, + page_size); + g_assert(ret != 0); + + dsa_cleanup(); +} + +static void test_oversized_batch_async(void) +{ + test_oversized_batch(); +} + +static void test_zero_len_async(void) +{ + if (g_test_subprocess()) { + g_assert(!dsa_init(path1)); + + buffer_zero_batch_task_init(&batch_task, batch_size); + + char buf[page_size]; + + buffer_is_zero_dsa_batch_async(&batch_task, + (const void**) &buf, + 1, + 0); + } else { + g_test_trap_subprocess(NULL, 0, 0); + g_test_trap_assert_failed(); + } +} + +static void test_null_buf_async(void) +{ + if (g_test_subprocess()) { + g_assert(!dsa_init(path1)); + + buffer_zero_batch_task_init(&batch_task, batch_size); + + buffer_is_zero_dsa_batch_async(&batch_task, NULL, 1, page_size); + } else { + g_test_trap_subprocess(NULL, 0, 0); + g_test_trap_assert_failed(); + } +} + +static void test_batch(void) +{ + g_assert(!dsa_init(path1)); + dsa_start(); + + buffer_zero_batch_task_init(&batch_task, batch_size); + + char buf[page_size * batch_size]; + char *addrs[batch_size]; + for (int i = 0; i < batch_size; i++) { + addrs[i] = buf + (page_size * i); + } + + // Using whatever is on the stack is somewhat random. + // Manually set some pages to zero and some to nonzero. + memset(buf + 0, 0, page_size * 10); + memset(buf + (10 * page_size), 0xff, page_size * 10); + + buffer_is_zero_dsa_batch_async(&batch_task, + (const void**) addrs, + batch_size, + page_size); + + bool is_zero; + for (int i = 0; i < batch_size; i++) { + is_zero = buffer_is_zero((const void*) &buf[page_size * i], page_size); + g_assert(batch_task.results[i] == is_zero); + } + dsa_cleanup(); +} + +static void test_batch_async(void) +{ + test_batch(); +} + +static void test_page_fault(void) +{ + g_assert(!dsa_init(path1)); + dsa_start(); + + char* buf[2]; + int prot = PROT_READ | PROT_WRITE; + int flags = MAP_SHARED | MAP_ANON; + buf[0] = (char*) mmap(NULL, page_size * batch_size, prot, flags, -1, 0); + assert(buf[0] != MAP_FAILED); + buf[1] = (char*) malloc(page_size * batch_size); + assert(buf[1] != NULL); + + for (int j = 0; j < 2; j++) { + buffer_zero_batch_task_init(&batch_task, batch_size); + + char *addrs[batch_size]; + for (int i = 0; i < batch_size; i++) { + addrs[i] = buf[j] + (page_size * i); + } + + buffer_is_zero_dsa_batch_async(&batch_task, + (const void**) addrs, + batch_size, + page_size); + + bool is_zero; + for (int i = 0; i < batch_size; i++) { + is_zero = buffer_is_zero((const void*) &buf[j][page_size * i], page_size); + g_assert(batch_task.results[i] == is_zero); + } + } + + assert(!munmap(buf[0], page_size * batch_size)); + free(buf[1]); + dsa_cleanup(); +} + +static void test_various_buffer_sizes(void) +{ + g_assert(!dsa_init(path1)); + dsa_start(); + + int len = 1 << 4; + for (int count = 12; count > 0; count--, len <<= 1) { + buffer_zero_batch_task_init(&batch_task, batch_size); + + char buf[len * batch_size]; + char *addrs[batch_size]; + for (int i = 0; i < batch_size; i++) { + addrs[i] = buf + (len * i); + } + + buffer_is_zero_dsa_batch_async(&batch_task, + (const void**) addrs, + batch_size, + len); + + bool is_zero; + for (int j = 0; j < batch_size; j++) { + is_zero = buffer_is_zero((const void*) &buf[len * j], len); + g_assert(batch_task.results[j] == is_zero); + } + } + + dsa_cleanup(); +} + +static void test_various_buffer_sizes_async(void) +{ + test_various_buffer_sizes(); +} + +static void test_double_start_stop(void) +{ + g_assert(!dsa_init(path1)); + // Double start + dsa_start(); + dsa_start(); + g_assert(dsa_is_running()); + do_single_task(); + + // Double stop + dsa_stop(); + g_assert(!dsa_is_running()); + dsa_stop(); + g_assert(!dsa_is_running()); + + // Restart + dsa_start(); + g_assert(dsa_is_running()); + do_single_task(); + dsa_cleanup(); +} + +static void test_is_running(void) +{ + g_assert(!dsa_init(path1)); + + g_assert(!dsa_is_running()); + dsa_start(); + g_assert(dsa_is_running()); + dsa_stop(); + g_assert(!dsa_is_running()); + dsa_cleanup(); +} + +static void test_multiple_engines(void) +{ + g_assert(!dsa_init(path2)); + dsa_start(); + + struct buffer_zero_batch_task tasks[num_devices] + __attribute__((aligned(64))); + char bufs[num_devices][page_size * batch_size]; + char *addrs[num_devices][batch_size]; + + // This is a somewhat implementation-specific way of testing that the tasks + // have unique engines assigned to them. + buffer_zero_batch_task_init(&tasks[0], batch_size); + buffer_zero_batch_task_init(&tasks[1], batch_size); + g_assert(tasks[0].device != tasks[1].device); + + for (int i = 0; i < num_devices; i++) { + for (int j = 0; j < batch_size; j++) { + addrs[i][j] = bufs[i] + (page_size * j); + } + + buffer_is_zero_dsa_batch_async(&tasks[i], + (const void**) addrs[i], + batch_size, page_size); + + bool is_zero; + for (int j = 0; j < batch_size; j++) { + is_zero = buffer_is_zero((const void*) &bufs[i][page_size * j], + page_size); + g_assert(tasks[i].results[j] == is_zero); + } + } + + dsa_cleanup(); +} + +static void test_configure_dsa_twice(void) +{ + g_assert(!dsa_init(path2)); + g_assert(!dsa_init(path2)); + dsa_start(); + do_single_task(); + dsa_cleanup(); +} + +static void test_configure_dsa_bad_path(void) +{ + const char* bad_path = "/not/a/real/path"; + g_assert(dsa_init(bad_path)); +} + +static void test_cleanup_before_configure(void) +{ + dsa_cleanup(); + g_assert(!dsa_init(path2)); +} + +static void test_configure_dsa_num_devices(void) +{ + g_assert(!dsa_init(path1)); + dsa_start(); + + do_single_task(); + dsa_stop(); + dsa_cleanup(); +} + +static void test_cleanup_twice(void) +{ + g_assert(!dsa_init(path2)); + dsa_cleanup(); + dsa_cleanup(); + + g_assert(!dsa_init(path2)); + dsa_start(); + do_single_task(); + dsa_cleanup(); +} + +int main(int argc, char **argv) +{ + g_test_init(&argc, &argv, NULL); + + if (getenv("QEMU_TEST_FLAKY_TESTS")) { + g_test_add_func("/dsa/page_fault", test_page_fault); + } + + if (num_devices > 1) { + g_test_add_func("/dsa/multiple_engines", test_multiple_engines); + } + + g_test_add_func("/dsa/async/batch", test_batch_async); + g_test_add_func("/dsa/async/various_buffer_sizes", + test_various_buffer_sizes_async); + g_test_add_func("/dsa/async/null_buf", test_null_buf_async); + g_test_add_func("/dsa/async/zero_len", test_zero_len_async); + g_test_add_func("/dsa/async/oversized_batch", test_oversized_batch_async); + g_test_add_func("/dsa/async/zero_count", test_zero_count_async); + g_test_add_func("/dsa/async/single_zero", test_single_zero_async); + g_test_add_func("/dsa/async/single_nonzero", test_single_nonzero_async); + g_test_add_func("/dsa/async/null_task", test_null_task_async); + + g_test_add_func("/dsa/double_start_stop", test_double_start_stop); + g_test_add_func("/dsa/is_running", test_is_running); + + g_test_add_func("/dsa/configure_dsa_twice", test_configure_dsa_twice); + g_test_add_func("/dsa/configure_dsa_bad_path", test_configure_dsa_bad_path); + g_test_add_func("/dsa/cleanup_before_configure", + test_cleanup_before_configure); + g_test_add_func("/dsa/configure_dsa_num_devices", + test_configure_dsa_num_devices); + g_test_add_func("/dsa/cleanup_twice", test_cleanup_twice); + + return g_test_run(); +} From patchwork Wed Oct 25 19:38:22 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hao Xiang X-Patchwork-Id: 1855256 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=bytedance.com header.i=@bytedance.com header.a=rsa-sha256 header.s=google header.b=W7WlYCRS; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=patchwork.ozlabs.org) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-ECDSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4SFzp55t9hz23jn for ; Thu, 26 Oct 2023 06:41:49 +1100 (AEDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qvjjV-0001Eq-Mb; Wed, 25 Oct 2023 15:40:01 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qvjjT-00015m-Nx for qemu-devel@nongnu.org; Wed, 25 Oct 2023 15:39:59 -0400 Received: from mail-qk1-x72c.google.com ([2607:f8b0:4864:20::72c]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1qvjjP-0006Bn-QR for qemu-devel@nongnu.org; Wed, 25 Oct 2023 15:39:59 -0400 Received: by mail-qk1-x72c.google.com with SMTP id af79cd13be357-7789aed0e46so10857085a.0 for ; Wed, 25 Oct 2023 12:39:53 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1698262793; x=1698867593; darn=nongnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=3XxU3oAzYcKmZXIlYaVmAqvLOIxUOOXKqDQ3MzE30CQ=; b=W7WlYCRSpTznFbyu+k5Igo5AhdoUgjA3dwcbxeUqOZ2KSVEfh62Opph2CP1TAA4EK7 2Ry1PyXaHk8AXgiplnEjwNw9T/nbsPhQtPYKU1Hj/gNIESqXJt2kSUyAUWNuEqm1dgpD QdGcghg1c9v3rWbUg3ong6jgciQ0sf46fbSU86TGaoo6LCo51N94UF8X6YhVzNsHFWc2 xSoWVIDukdp1DfgKm1Dn8Km2qwLVHaZdXP/vQGAVVIHPtPVS/845NU7Cl5NC4rQAceae 1c6YYcUwcj3rRcJvqyJY73Dh5kk3MQpbzDSZeDb8p6mmgRoYG8DPK8q2V5qLQNLF5Sz4 S0JQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1698262793; x=1698867593; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=3XxU3oAzYcKmZXIlYaVmAqvLOIxUOOXKqDQ3MzE30CQ=; b=IJLqSeJ0zd3yEYIDfxCchQr3eKTbqkXL0LwASk4QvTImQDbg4qKmIGtI7jG3MJ80eT GlokYrjoHeXKkQ3BSQHd73oHJDU2OhJiVZqsznLznX3ls1pPTKArIfAvqiDC0eB2OG6B kfCuiLVNSpUQ2R0QrOZnmTeeQfS1zpqpNOLAbK2CFwSrAFIdU7zPn6ozFd1gA3GfGvuz vOjOondWroX0oylYLqDx1d0wvCw7VLay/e4s1TAJm8bA+PP9BQxn9lydeK7xDWxgIOEH 0gmbW7/U7XrDOl5Y/8EV+g5TbxLztMXJm5m6C9nt2zfpF3LnOLch0lI+bVjk3Cdeps6D XyBw== X-Gm-Message-State: AOJu0Yzs6OIPxEOCgzQ3O60IcW/dF4wpCJwJXc6pSrCOuBzXAjOCGO9P 5KOtPy5IXmuzsfdFpWFY8nZ5kg== X-Google-Smtp-Source: AGHT+IFD/4kuM9OYTKovknJ8MGs00PLLznX+f69oG0rk0upaAwVW/84oeVvXaq4ESQ6gX2UtLxPVvg== X-Received: by 2002:a05:620a:880f:b0:778:9072:e45a with SMTP id qj15-20020a05620a880f00b007789072e45amr15730831qkn.74.1698262792872; Wed, 25 Oct 2023 12:39:52 -0700 (PDT) Received: from n231-230-216.byted.org ([147.160.184.135]) by smtp.gmail.com with ESMTPSA id o8-20020a05620a228800b0076cdc3b5beasm4453721qkh.86.2023.10.25.12.39.52 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 25 Oct 2023 12:39:52 -0700 (PDT) From: Hao Xiang To: quintela@redhat.com, peterx@redhat.com, marcandre.lureau@redhat.com, bryan.zhang@bytedance.com, qemu-devel@nongnu.org Cc: Hao Xiang Subject: [PATCH 16/16] migration/multifd: Add integration tests for multifd with Intel DSA offloading. Date: Wed, 25 Oct 2023 19:38:22 +0000 Message-Id: <20231025193822.2813204-17-hao.xiang@bytedance.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20231025193822.2813204-1-hao.xiang@bytedance.com> References: <20231025193822.2813204-1-hao.xiang@bytedance.com> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::72c; envelope-from=hao.xiang@bytedance.com; helo=mail-qk1-x72c.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org * Add test case to start and complete multifd live migration with DSA offloading enabled. * Add test case to start and cancel multifd live migration with DSA offloading enabled. Signed-off-by: Bryan Zhang Signed-off-by: Hao Xiang --- tests/qtest/migration-test.c | 66 +++++++++++++++++++++++++++++++++++- 1 file changed, 65 insertions(+), 1 deletion(-) diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c index 8eb2053dbb..f22d39e72e 100644 --- a/tests/qtest/migration-test.c +++ b/tests/qtest/migration-test.c @@ -631,6 +631,12 @@ typedef struct { const char *opts_target; } MigrateStart; +/* + * It requires separate steps to configure and enable DSA device. + * This test assumes that the configuration is done already. + */ +static const char* dsa_dev_path = "/dev/dsa/wq4.0"; + /* * A hook that runs after the src and dst QEMUs have been * created, but before the migration is started. This can @@ -2431,7 +2437,7 @@ static void test_multifd_tcp_tls_x509_reject_anon_client(void) * * And see that it works */ -static void test_multifd_tcp_cancel(void) +static void test_multifd_tcp_cancel_common(bool use_dsa) { MigrateStart args = { .hide_stderr = true, @@ -2452,6 +2458,10 @@ static void test_multifd_tcp_cancel(void) migrate_set_capability(from, "multifd", true); migrate_set_capability(to, "multifd", true); + if (use_dsa) { + migrate_set_parameter_str(from, "multifd-dsa-accel", dsa_dev_path); + } + /* Start incoming migration from the 1st socket */ migrate_incoming_qmp(to, "tcp:127.0.0.1:0", "{}"); @@ -2508,6 +2518,48 @@ static void test_multifd_tcp_cancel(void) test_migrate_end(from, to2, true); } +/* + * This test does: + * source target + * migrate_incoming + * migrate + * migrate_cancel + * launch another target + * migrate + * + * And see that it works + */ +static void test_multifd_tcp_cancel(void) +{ + test_multifd_tcp_cancel_common(false); +} + +#ifdef CONFIG_DSA_OPT + +static void *test_migrate_precopy_tcp_multifd_start_dsa(QTestState *from, + QTestState *to) +{ + migrate_set_parameter_str(from, "multifd-dsa-accel", dsa_dev_path); + return test_migrate_precopy_tcp_multifd_start_common(from, to, "none"); +} + +static void test_multifd_tcp_none_dsa(void) +{ + MigrateCommon args = { + .listen_uri = "defer", + .start_hook = test_migrate_precopy_tcp_multifd_start_dsa, + }; + + test_precopy_common(&args); +} + +static void test_multifd_tcp_cancel_dsa(void) +{ + test_multifd_tcp_cancel_common(true); +} + +#endif + static void calc_dirty_rate(QTestState *who, uint64_t calc_time) { qtest_qmp_assert_success(who, @@ -2921,6 +2973,18 @@ int main(int argc, char **argv) } qtest_add_func("/migration/multifd/tcp/plain/none", test_multifd_tcp_none); + +#ifdef CONFIG_DSA_OPT + if (g_str_equal(arch, "x86_64")) { + qtest_add_func("/migration/multifd/tcp/plain/none/dsa", + test_multifd_tcp_none_dsa); + } + if (getenv("QEMU_TEST_FLAKY_TESTS")) { + qtest_add_func("/migration/multifd/tcp/plain/cancel/dsa", + test_multifd_tcp_cancel_dsa); + } +#endif + /* * This test is flaky and sometimes fails in CI and otherwise: * don't run unless user opts in via environment variable.