From patchwork Fri Oct 18 16:12:50 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Gregory Price X-Patchwork-Id: 1999275 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=gourry.net header.i=@gourry.net header.a=rsa-sha256 header.s=google header.b=HNFcszvg; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=patchwork.ozlabs.org) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-ECDSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4XVVBR5mmCz1xw2 for ; Sat, 19 Oct 2024 03:13:51 +1100 (AEDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1t1pbP-0003lZ-RH; Fri, 18 Oct 2024 12:13:23 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1t1pbK-0003ke-Ak for qemu-devel@nongnu.org; Fri, 18 Oct 2024 12:13:18 -0400 Received: from mail-qk1-x735.google.com ([2607:f8b0:4864:20::735]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1t1pbH-0004Ob-DH for qemu-devel@nongnu.org; Fri, 18 Oct 2024 12:13:17 -0400 Received: by mail-qk1-x735.google.com with SMTP id af79cd13be357-7b13bf566c0so141760585a.3 for ; Fri, 18 Oct 2024 09:13:14 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gourry.net; s=google; t=1729267993; x=1729872793; darn=nongnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to 
:message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=y/0cRfb4YBF8cPInnUDV/L7Jp4pgWglmeJbCtRCfJPQ=; b=HNFcszvgCsCQpBwfNyRF30E11MocbpOraEzo9kzfMqaez1xGn4M+UvoItRolF8GcJ5 Lb5ABzCCWqtlq9YetA/ylo87VOi5jZQ0OeDWjCCN6eUjZD3UN5294YHU5JsV3pukPmbn UwQuK+ORf6pBl0MbPFMru9zP/tHluHD6VwP+JZRvx8DxbG3mbpNYDXGqEx56PjaASukQ +w9M3i/aLWxQxgbALlSvHWJVlcFwGU90Fmpn+y/F1Zr91HMnjMYJp9LuSTuPvxOyYfZV Q8/N8w32ZunGU1zZtQdk753ZgRmYxuHBan81WFaRjKPpsz/4TEChkY5H5vaN1kg7hnqk o+yQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1729267993; x=1729872793; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=y/0cRfb4YBF8cPInnUDV/L7Jp4pgWglmeJbCtRCfJPQ=; b=au6oRU4yckc0zpfSGEcYyvQ+j4B2MLwh/7ZJtC0/2AtDWP3zJubh2d8hIANIEJl7+Y OnSHYlfODRCtnECTCbcX/qilYKgq1IK0qJuxVXNLSTgjuNHVjx2OV7p+C4sF5fFtlFNT dE4G9+++qCwjqf9UI1bln5JgDy6zhP8pN//3a5sDEm0F7UNu6S+uJI7z255Q07l9f/rd O1BOfGwKsAMICMLTszHUQUBYA2xKVOiqLtbb/xdI45XED0mAqjgOfwRhdlpDW6ZYeHW2 J6ssAO6hS9H6frN+ltePIK+UKr5TVPEV3CpQvyqpIaJZr2GmccK8MezFn5IgeH4RNrtP edFg== X-Gm-Message-State: AOJu0Yx/e2SY3FhCKNXIBkNfrv5/+nF4UmCxnWKf8RnXlAdTcxyOgPwd 2UYss3kPPb6PagKmbw3xC45eCZlbDOJm0yKpf2slRQuBaxbRyHR571mYfPOItfM= X-Google-Smtp-Source: AGHT+IHgiNaZWimN4akiargJriy+PnqMz0dQ4VKMrO8LERPUv3AnoxXaArowIO1TsiS2YWVAiX8YXQ== X-Received: by 2002:a05:620a:2402:b0:7b1:4948:109f with SMTP id af79cd13be357-7b157bf13camr287097685a.57.1729267993256; Fri, 18 Oct 2024 09:13:13 -0700 (PDT) Received: from PC2K9PVX.TheFacebook.com (pool-173-79-56-208.washdc.fios.verizon.net. 
[173.79.56.208]) by smtp.gmail.com with ESMTPSA id af79cd13be357-7b156fa55cfsm81677385a.67.2024.10.18.09.13.12 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 18 Oct 2024 09:13:12 -0700 (PDT) From: Gregory Price To: linux-cxl@vger.kernel.org Cc: qemu-devel@nongnu.org, svetly.todorov@memverge.com, jonathan.cameron@huawei.com, nifan.cxl@gmail.com Subject: [PATCH RFC v3 1/3] cxl-mailbox-utils: move CXLUpdateDCExtentListInPl into header Date: Fri, 18 Oct 2024 12:12:50 -0400 Message-ID: <20241018161252.8896-2-gourry@gourry.net> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20241018161252.8896-1-gourry@gourry.net> References: <20241018161252.8896-1-gourry@gourry.net> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::735; envelope-from=gourry@gourry.net; helo=mail-qk1-x735.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org From: Svetly Todorov Allows other CXL devices to access host DCD-add-response payload. 
Signed-off-by: Gregory Price Signed-off-by: Svetly Todorov --- hw/cxl/cxl-mailbox-utils.c | 16 ---------------- include/hw/cxl/cxl_device.h | 16 ++++++++++++++++ 2 files changed, 16 insertions(+), 16 deletions(-) diff --git a/hw/cxl/cxl-mailbox-utils.c b/hw/cxl/cxl-mailbox-utils.c index 72c03d85cf..10de26605c 100644 --- a/hw/cxl/cxl-mailbox-utils.c +++ b/hw/cxl/cxl-mailbox-utils.c @@ -2446,22 +2446,6 @@ void cxl_extent_group_list_delete_front(CXLDCExtentGroupList *list) g_free(group); } -/* - * CXL r3.1 Table 8-168: Add Dynamic Capacity Response Input Payload - * CXL r3.1 Table 8-170: Release Dynamic Capacity Input Payload - */ -typedef struct CXLUpdateDCExtentListInPl { - uint32_t num_entries_updated; - uint8_t flags; - uint8_t rsvd[3]; - /* CXL r3.1 Table 8-169: Updated Extent */ - struct { - uint64_t start_dpa; - uint64_t len; - uint8_t rsvd[8]; - } QEMU_PACKED updated_entries[]; -} QEMU_PACKED CXLUpdateDCExtentListInPl; - /* * For the extents in the extent list to operate, check whether they are valid * 1. 
The extent should be in the range of a valid DC region; diff --git a/include/hw/cxl/cxl_device.h b/include/hw/cxl/cxl_device.h index c3e93b876a..b2dc7fb769 100644 --- a/include/hw/cxl/cxl_device.h +++ b/include/hw/cxl/cxl_device.h @@ -552,6 +552,22 @@ typedef struct CXLDCExtentGroup { } CXLDCExtentGroup; typedef QTAILQ_HEAD(, CXLDCExtentGroup) CXLDCExtentGroupList; +/* + * CXL r3.1 Table 8-168: Add Dynamic Capacity Response Input Payload + * CXL r3.1 Table 8-170: Release Dynamic Capacity Input Payload + */ +typedef struct CXLUpdateDCExtentListInPl { + uint32_t num_entries_updated; + uint8_t flags; + uint8_t rsvd[3]; + /* CXL r3.1 Table 8-169: Updated Extent */ + struct { + uint64_t start_dpa; + uint64_t len; + uint8_t rsvd[8]; + } QEMU_PACKED updated_entries[]; +} QEMU_PACKED CXLUpdateDCExtentListInPl; + typedef struct CXLDCRegion { uint64_t base; /* aligned to 256*MiB */ uint64_t decode_len; /* aligned to 256*MiB */ From patchwork Fri Oct 18 16:12:51 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Gregory Price X-Patchwork-Id: 1999278 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=gourry.net header.i=@gourry.net header.a=rsa-sha256 header.s=google header.b=Wh0Kg0iN; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=patchwork.ozlabs.org) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-ECDSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4XVVBl19scz1xw2 for ; Sat, 19 Oct 2024 03:14:07 +1100 (AEDT) Received: from localhost ([::1] 
helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1t1pbT-0003qn-Ib; Fri, 18 Oct 2024 12:13:27 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1t1pbO-0003ll-41 for qemu-devel@nongnu.org; Fri, 18 Oct 2024 12:13:23 -0400 Received: from mail-qk1-x735.google.com ([2607:f8b0:4864:20::735]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1t1pbK-0004Ov-38 for qemu-devel@nongnu.org; Fri, 18 Oct 2024 12:13:21 -0400 Received: by mail-qk1-x735.google.com with SMTP id af79cd13be357-7b15495f04dso122313785a.0 for ; Fri, 18 Oct 2024 09:13:17 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gourry.net; s=google; t=1729267995; x=1729872795; darn=nongnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=lWkU6s2NOVNttOr0QYI8NeG1kjlYmYzZJu3z4aGvls0=; b=Wh0Kg0iNEnSMAKd0ZUq08vj6o7j5anycn/NYjghg7It0H2xt8OJS63s6tYU9ad/BmL 8bKut87VS6JdJcCj60k5Ml23BqaANZwC2Xe2Ztn+QedvhlhHXEQReIqipTPIT9TA5yhw iXPw2OJnoS/hlKoaDZNWnhVzwoOVGK8HCTJ/NL5tEqg6PxyKU6Hv9RRZstSQZ8dlEurG Vl2avwEKJrKtgG/0W1CoYdkNOVIOU9FpL2UsHJmEdt8E88Jit6YjKoAZk67Wsg3+dnYV Ir1tmMcTmomblcJ968gQnHUss5c+++cO9fQ/wB6WPqhp78lZ43J1Er2WCJM0pc686U4P gHzA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1729267995; x=1729872795; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=lWkU6s2NOVNttOr0QYI8NeG1kjlYmYzZJu3z4aGvls0=; b=GkxVB5I0MSURJ3tQ2uS0ZYwxjPaNpVVmve4rpfWWG/dp/Ihtc7RdZQngwxFkPmanL+ QtUC2hMb33lb6WgeM0q2ZKOExi0C2AhaU1S+3rKr1IhsiwSYMkMmi6f65qQQuby4FA3v ceIOPlOFxhYcs6LzG4Z/aBahX+HFrJOVI1AbsPrnST9yTzjOkIehFRHSF+t8+1Hse/qs 
WRtTnddCyjTqUN6qHOs97feMbKv6yOz5JnpduU68/UPazdJsYJPdLnFxUftS7tj5lcWI 9T2CoAMeplz5GHYBsfROHhv09t6mSwkzKdFNS0SoBoEnB+VEQAUm2vA77X0YyvdcwAgK f7tw== X-Gm-Message-State: AOJu0Yw01eZcAJ9LCMeDkrh59v1nWpff4tO3yvE6WrrHdBcKL7SbLXFi B+5PdNROPlPQUrF0ptVIGKF0uqWkuf6NyPOZWabnt4/wHtDwO+d4tEzUV1/kfZg= X-Google-Smtp-Source: AGHT+IEqKAO0ketnSqmhUBvPOV068tf2A5Q37gpR9t7bNJ8EjmxgVJSGSr2XZ2oPQNDkOHcwr/ymVw== X-Received: by 2002:a05:620a:2492:b0:7b1:4579:61fa with SMTP id af79cd13be357-7b157beea8fmr371565785a.55.1729267994804; Fri, 18 Oct 2024 09:13:14 -0700 (PDT) Received: from PC2K9PVX.TheFacebook.com (pool-173-79-56-208.washdc.fios.verizon.net. [173.79.56.208]) by smtp.gmail.com with ESMTPSA id af79cd13be357-7b156fa55cfsm81677385a.67.2024.10.18.09.13.14 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 18 Oct 2024 09:13:14 -0700 (PDT) From: Gregory Price To: linux-cxl@vger.kernel.org Cc: qemu-devel@nongnu.org, svetly.todorov@memverge.com, jonathan.cameron@huawei.com, nifan.cxl@gmail.com Subject: [PATCH RFC v3 2/3] cxl_type3: add MHD callbacks Date: Fri, 18 Oct 2024 12:12:51 -0400 Message-ID: <20241018161252.8896-3-gourry@gourry.net> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20241018161252.8896-1-gourry@gourry.net> References: <20241018161252.8896-1-gourry@gourry.net> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::735; envelope-from=gourry@gourry.net; helo=mail-qk1-x735.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: 
qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org From: Svetly Todorov Introduce an API for validating DC adds, removes, and responses against a multi-headed device. mhd_reserve_extents() is called during a DC add request. This allows a multi-headed device to check whether the requested extents belong to another host. If not, then this function can claim those extents in the MHD state and allow the cxl_type3 code to follow suit in the host-local blk_bitmap. mhd_reclaim_extents() is called during the DC add response. It allows the MHD to reclaim extents that were preallocated to a host during the request but rejected in the response. mhd_release_extent() is called during the DC release response. It can be invoked after a host frees an extent in its local bitmap, allowing the MHD handler to release that same extent in the multi-host state. Signed-off-by: Gregory Price Signed-off-by: Svetly Todorov --- hw/cxl/cxl-mailbox-utils.c | 28 +++++++++++++++++++++++++++- hw/mem/cxl_type3.c | 17 +++++++++++++++++ include/hw/cxl/cxl_device.h | 8 ++++++++ 3 files changed, 52 insertions(+), 1 deletion(-) diff --git a/hw/cxl/cxl-mailbox-utils.c b/hw/cxl/cxl-mailbox-utils.c index 10de26605c..112272e9ac 100644 --- a/hw/cxl/cxl-mailbox-utils.c +++ b/hw/cxl/cxl-mailbox-utils.c @@ -2545,6 +2545,7 @@ static CXLRetCode cmd_dcd_add_dyn_cap_rsp(const struct cxl_cmd *cmd, { CXLUpdateDCExtentListInPl *in = (void *)payload_in; CXLType3Dev *ct3d = CXL_TYPE3(cci->d); + CXLType3Class *cvc = CXL_TYPE3_GET_CLASS(ct3d); CXLDCExtentList *extent_list = &ct3d->dc.extents; uint32_t i; uint64_t dpa, len; @@ -2579,6 +2580,11 @@ static CXLRetCode cmd_dcd_add_dyn_cap_rsp(const struct cxl_cmd *cmd, ct3d->dc.total_extent_count += 1; ct3_set_region_block_backed(ct3d, dpa, len); } + + if (cvc->mhd_reclaim_extents) + cvc->mhd_reclaim_extents(&ct3d->parent_obj, &ct3d->dc.extents_pending, + in); + /* Remove the first extent group in the pending list */ 
cxl_extent_group_list_delete_front(&ct3d->dc.extents_pending); @@ -2612,6 +2618,7 @@ static CXLRetCode cxl_dc_extent_release_dry_run(CXLType3Dev *ct3d, uint32_t *updated_list_size) { CXLDCExtent *ent, *ent_next; + CXLType3Class *cvc = CXL_TYPE3_GET_CLASS(ct3d); uint64_t dpa, len; uint32_t i; int cnt_delta = 0; @@ -2632,6 +2639,13 @@ static CXLRetCode cxl_dc_extent_release_dry_run(CXLType3Dev *ct3d, goto free_and_exit; } + /* In an MHD, check that this DPA range belongs to this host */ + if (cvc->mhd_access_valid && + !cvc->mhd_access_valid(&ct3d->parent_obj, dpa, len)) { + ret = CXL_MBOX_INVALID_PA; + goto free_and_exit; + } + /* After this point, extent overflow is the only error can happen */ while (len > 0) { QTAILQ_FOREACH(ent, updated_list, node) { @@ -2704,9 +2718,11 @@ static CXLRetCode cmd_dcd_release_dyn_cap(const struct cxl_cmd *cmd, { CXLUpdateDCExtentListInPl *in = (void *)payload_in; CXLType3Dev *ct3d = CXL_TYPE3(cci->d); + CXLType3Class *cvc = CXL_TYPE3_GET_CLASS(ct3d); CXLDCExtentList updated_list; CXLDCExtent *ent, *ent_next; - uint32_t updated_list_size; + uint32_t updated_list_size, i; + uint64_t dpa, len; CXLRetCode ret; if (in->num_entries_updated == 0) { @@ -2724,6 +2740,16 @@ static CXLRetCode cmd_dcd_release_dyn_cap(const struct cxl_cmd *cmd, return ret; } + /* Updated_entries contains the released extents. 
Free those in the MHD */ + for (i = 0; cvc->mhd_release_extent && i < in->num_entries_updated; ++i) { + dpa = in->updated_entries[i].start_dpa; + len = in->updated_entries[i].len; + + if (cvc->mhd_release_extent) { + cvc->mhd_release_extent(&ct3d->parent_obj, dpa, len); + } + } + /* * If the dry run release passes, the returned updated_list will * be the updated extent list and we just need to clear the extents diff --git a/hw/mem/cxl_type3.c b/hw/mem/cxl_type3.c index b7b24b6a32..a94b9931d2 100644 --- a/hw/mem/cxl_type3.c +++ b/hw/mem/cxl_type3.c @@ -799,6 +799,7 @@ static void cxl_destroy_dc_regions(CXLType3Dev *ct3d) { CXLDCExtent *ent, *ent_next; CXLDCExtentGroup *group, *group_next; + CXLType3Class *cvc = CXL_TYPE3_GET_CLASS(ct3d); int i; CXLDCRegion *region; @@ -817,6 +818,10 @@ static void cxl_destroy_dc_regions(CXLType3Dev *ct3d) for (i = 0; i < ct3d->dc.num_regions; i++) { region = &ct3d->dc.regions[i]; g_free(region->blk_bitmap); + if (cvc->mhd_release_extent) { + cvc->mhd_release_extent(&ct3d->parent_obj, region->base, + region->len); + } } } @@ -2077,6 +2082,7 @@ static void qmp_cxl_process_dynamic_capacity_prescriptive(const char *path, CXLEventDynamicCapacity dCap = {}; CXLEventRecordHdr *hdr = &dCap.hdr; CXLType3Dev *dcd; + CXLType3Class *cvc; uint8_t flags = 1 << CXL_EVENT_TYPE_INFO; uint32_t num_extents = 0; CxlDynamicCapacityExtentList *list; @@ -2094,6 +2100,7 @@ static void qmp_cxl_process_dynamic_capacity_prescriptive(const char *path, } dcd = CXL_TYPE3(obj); + cvc = CXL_TYPE3_GET_CLASS(dcd); if (!dcd->dc.num_regions) { error_setg(errp, "No dynamic capacity support from the device"); return; @@ -2166,6 +2173,13 @@ static void qmp_cxl_process_dynamic_capacity_prescriptive(const char *path, num_extents++; } + /* If this is an MHD, attempt to reserve the extents */ + if (type == DC_EVENT_ADD_CAPACITY && cvc->mhd_reserve_extents && + !cvc->mhd_reserve_extents(&dcd->parent_obj, records, rid)) { + error_setg(errp, "mhsld is enabled and extent reservation 
failed"); + return; + } + /* Create extent list for event being passed to host */ i = 0; list = records; @@ -2304,6 +2318,9 @@ static void ct3_class_init(ObjectClass *oc, void *data) cvc->set_cacheline = set_cacheline; cvc->mhd_get_info = NULL; cvc->mhd_access_valid = NULL; + cvc->mhd_reserve_extents = NULL; + cvc->mhd_reclaim_extents = NULL; + cvc->mhd_release_extent = NULL; } static const TypeInfo ct3d_info = { diff --git a/include/hw/cxl/cxl_device.h b/include/hw/cxl/cxl_device.h index b2dc7fb769..13c97b576f 100644 --- a/include/hw/cxl/cxl_device.h +++ b/include/hw/cxl/cxl_device.h @@ -14,6 +14,7 @@ #include "hw/pci/pci_device.h" #include "hw/register.h" #include "hw/cxl/cxl_events.h" +#include "qapi/qapi-commands-cxl.h" #include "hw/cxl/cxl_cpmu.h" /* @@ -682,6 +683,13 @@ struct CXLType3Class { size_t *len_out, CXLCCI *cci); bool (*mhd_access_valid)(PCIDevice *d, uint64_t addr, unsigned int size); + bool (*mhd_reserve_extents)(PCIDevice *d, + CxlDynamicCapacityExtentList *records, + uint8_t rid); + bool (*mhd_reclaim_extents)(PCIDevice *d, + CXLDCExtentGroupList *groups, + CXLUpdateDCExtentListInPl *in); + bool (*mhd_release_extent)(PCIDevice *d, uint64_t dpa, uint64_t len); }; struct CSWMBCCIDev { From patchwork Fri Oct 18 16:12:52 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Gregory Price X-Patchwork-Id: 1999276 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=gourry.net header.i=@gourry.net header.a=rsa-sha256 header.s=google header.b=P8YUQt0o; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=patchwork.ozlabs.org) Received: from 
lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-ECDSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4XVVBX1Xfxz1xw2 for ; Sat, 19 Oct 2024 03:13:56 +1100 (AEDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1t1pbR-0003nm-PO; Fri, 18 Oct 2024 12:13:25 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1t1pbP-0003ly-No for qemu-devel@nongnu.org; Fri, 18 Oct 2024 12:13:23 -0400 Received: from mail-qk1-x731.google.com ([2607:f8b0:4864:20::731]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1t1pbK-0004P2-KQ for qemu-devel@nongnu.org; Fri, 18 Oct 2024 12:13:22 -0400 Received: by mail-qk1-x731.google.com with SMTP id af79cd13be357-7b1467af9dbso154627885a.0 for ; Fri, 18 Oct 2024 09:13:17 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gourry.net; s=google; t=1729267997; x=1729872797; darn=nongnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=Yg1P4kmuxtDkOe9Icy7EtGIDk1BBaBntSPvV2yBw5Vg=; b=P8YUQt0osLg50ZMDdyYJozJeqJdyGW6KWVuW8kUu4mmLYx6WPrbCeX+VRnZuv0yHHB kY39/7caafGJ+lMsbZ3TvZsGYgFYomdlpei0syt0e2PHGSeSz5/rFLqMfW7g3QPh5i2r i6TWZ4w4BwQovt+HOEvHY7Bus8kSVHthhm18t32tWfFyxdxCBcn/ZzOOSVDuG4Vu/JLU +5l0TOK6k3jTTQMF96Wjvt5TFMG9e9OTSlYR7b3SfkvdGDoBU5BEd8aQ0NDXX1QrdOQ4 xVtgtSR4bSV3QSPvAujRz0piF3+W4rE+u+ZCD6cJDpmAzHHvB4hXHc21++R98wYwMjNV qCeg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1729267997; x=1729872797; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; 
bh=Yg1P4kmuxtDkOe9Icy7EtGIDk1BBaBntSPvV2yBw5Vg=; b=BL/ox9MfhLkLUtosD7qPYe1EhlG2ApiMOOjKdXvHbJRf0C08tXtsrZy4ifJN+ww45F Lu4vQCJKeWLY5VUU/DPPssWD7KBdYp6d5sl+IdKIuVnNsx8SmBavEfZcEHSdUabR0VMr UFH8rve7YuB2sfS1TpI63MOCQn3arrEslfWY00mguv8N4Zkhe+8jz3IDHNQHEYK+zZw5 ajv3uSx+1CaBV9kgutSKPBt6SmRm+ioelhj/v7Tpg3Q8Trzbf7i2fwLYA8yQsFMiin8w LL65eKx20gONR1nb/sR/WuITTWJWEaM1dF4LOnmq97TBaLCztcVSbQFzuyrAf2jsHKC2 NzwQ== X-Gm-Message-State: AOJu0YwfMeKsR5miskUHoQ8I16GNo+AWrSABxmLgE2na1wj+mGOlGgni sqWH6yZAIYNNNYRrsz00Aef2WnG6FnMKBg5s/3/AKSQOWic7xOY8uMsOzHu59pM= X-Google-Smtp-Source: AGHT+IGikF7stO1WCXDYt/0EcUOjzFHMh+z2T+w/8SlImKWFClQj57CqBtAj2rgWTq3cnNRp87CkTA== X-Received: by 2002:a05:620a:4007:b0:7ac:c359:f132 with SMTP id af79cd13be357-7b157b72801mr259516085a.26.1729267996520; Fri, 18 Oct 2024 09:13:16 -0700 (PDT) Received: from PC2K9PVX.TheFacebook.com (pool-173-79-56-208.washdc.fios.verizon.net. [173.79.56.208]) by smtp.gmail.com with ESMTPSA id af79cd13be357-7b156fa55cfsm81677385a.67.2024.10.18.09.13.15 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 18 Oct 2024 09:13:16 -0700 (PDT) From: Gregory Price To: linux-cxl@vger.kernel.org Cc: qemu-devel@nongnu.org, svetly.todorov@memverge.com, jonathan.cameron@huawei.com, nifan.cxl@gmail.com Subject: [PATCH RFC v3 3/3] mhsld: implement MHSLD device Date: Fri, 18 Oct 2024 12:12:52 -0400 Message-ID: <20241018161252.8896-4-gourry@gourry.net> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20241018161252.8896-1-gourry@gourry.net> References: <20241018161252.8896-1-gourry@gourry.net> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::731; envelope-from=gourry@gourry.net; helo=mail-qk1-x731.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: 
qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org From: Svetly Todorov Using a shared-memory bytemap, validates that DC adds, releases, and reclamations happen on extents belonging to the appropriate host. The MHSLD device inherits from the CXL_TYPE3 class and adds the following configuration options: --mhd-head= --mhd-state_file= --mhd-init= --mhd-head specifies the head ID of the host on the given device. --mhd-state_file is the name of the shared-memory-backed file used to store the MHD state. --mhd-init indicates whether this QEMU instance should initialize the state_file; if so, the instance will create the file if it does not exist, ftruncate it to the appropriate size, and initialize its header. It is assumed that the --mhd-init instance is run and allowed to completely finish configuration before any other guests access the shared state. The shared state file only needs to be initialized once. Even if a guest dies without clearing the ownership bits associated with its head-ID, future guests with that ID will clear those bits in cxl_mhsld_realize(), regardless of whether mhd_init is true or false. The following command line options create an MHSLD with 4GB of backing memory, whose state is tracked in /dev/shm/mhd_metadata. --mhd-init=true tells this instance to initialize the state as described above. ./qemu-system-x86_64 \ [... other options ...] 
\ -device pxb-cxl,id=cxl.0,bus=pcie.0,bus_nr=52 \ -device cxl-rp,id=rp0,bus=cxl.0,chassis=0,port=0,slot=0 \ -object memory-backend-ram,id=mem0,size=4G \ -device cxl-mhsld,bus=rp0,num-dc-regions=1,volatile-dc-memdev=mem0,id=cxl-mem0,sn=66667,mhd-head=0,mhd-state_file=mhd_metadata,mhd-init=true \ -M cxl-fmw.0.targets.0=cxl.0,cxl-fmw.0.size=4G \ -qmp unix:/tmp/qmp-sock-1,server,nowait Once this guest completes setup, other guests looking to access the device can be booted with the same configuration options, but with --mhd-head != 0, --mhd-init=false, and a different QMP socket. Signed-off-by: Gregory Price Signed-off-by: Svetly Todorov --- hw/cxl/Kconfig | 1 + hw/cxl/meson.build | 1 + hw/cxl/mhsld/Kconfig | 4 + hw/cxl/mhsld/meson.build | 3 + hw/cxl/mhsld/mhsld.c | 456 +++++++++++++++++++++++++++++++++++++++ hw/cxl/mhsld/mhsld.h | 75 +++++++ 6 files changed, 540 insertions(+) create mode 100644 hw/cxl/mhsld/Kconfig create mode 100644 hw/cxl/mhsld/meson.build create mode 100644 hw/cxl/mhsld/mhsld.c create mode 100644 hw/cxl/mhsld/mhsld.h diff --git a/hw/cxl/Kconfig b/hw/cxl/Kconfig index e603839a62..919e59b598 100644 --- a/hw/cxl/Kconfig +++ b/hw/cxl/Kconfig @@ -1,3 +1,4 @@ +source mhsld/Kconfig source vendor/Kconfig config CXL diff --git a/hw/cxl/meson.build b/hw/cxl/meson.build index e8c8c1355a..394750dd19 100644 --- a/hw/cxl/meson.build +++ b/hw/cxl/meson.build @@ -16,4 +16,5 @@ system_ss.add(when: 'CONFIG_I2C_MCTP_CXL', if_true: files('i2c_mctp_cxl.c')) system_ss.add(when: 'CONFIG_ALL', if_true: files('cxl-host-stubs.c')) +subdir('mhsld') subdir('vendor') diff --git a/hw/cxl/mhsld/Kconfig b/hw/cxl/mhsld/Kconfig new file mode 100644 index 0000000000..dc2be15140 --- /dev/null +++ b/hw/cxl/mhsld/Kconfig @@ -0,0 +1,4 @@ +config CXL_MHSLD + bool + depends on CXL_MEM_DEVICE + default y diff --git a/hw/cxl/mhsld/meson.build b/hw/cxl/mhsld/meson.build new file mode 100644 index 0000000000..c595558f8a --- /dev/null +++ b/hw/cxl/mhsld/meson.build @@ -0,0 +1,3 @@ +if host_os 
== 'linux' + system_ss.add(when: 'CONFIG_CXL_MHSLD', if_true: files('mhsld.c',)) +endif diff --git a/hw/cxl/mhsld/mhsld.c b/hw/cxl/mhsld/mhsld.c new file mode 100644 index 0000000000..2a3023607e --- /dev/null +++ b/hw/cxl/mhsld/mhsld.c @@ -0,0 +1,456 @@ +/* + * SPDX-License-Identifier: GPL-2.0-or-later + * + * Copyright (c) 2024 MemVerge Inc. + * + */ + +#include +#include "qemu/osdep.h" +#include "qemu/bitmap.h" +#include "hw/irq.h" +#include "migration/vmstate.h" +#include "qapi/error.h" +#include "hw/cxl/cxl.h" +#include "hw/cxl/cxl_mailbox.h" +#include "hw/cxl/cxl_device.h" +#include "hw/pci/pcie.h" +#include "hw/pci/pcie_port.h" +#include "hw/qdev-properties.h" +#include "sysemu/hostmem.h" +#include "mhsld.h" + +#define TYPE_CXL_MHSLD "cxl-mhsld" +OBJECT_DECLARE_TYPE(CXLMHSLDState, CXLMHSLDClass, CXL_MHSLD) + +/* + * CXL r3.0 section 7.6.7.5.1 - Get Multi-Headed Info (Opcode 5500h) + * + * This command retrieves the number of heads, number of supported LDs, + * and Head-to-LD mapping of a Multi-Headed device. 
+ */ +static CXLRetCode cmd_mhd_get_info(const struct cxl_cmd *cmd, + uint8_t *payload_in, size_t len_in, + uint8_t *payload_out, size_t *len_out, + CXLCCI * cci) +{ + CXLMHSLDState *s = CXL_MHSLD(cci->d); + MHDGetInfoInput *input = (void *)payload_in; + MHDGetInfoOutput *output = (void *)payload_out; + + uint8_t start_ld = input->start_ld; + uint8_t ldmap_len = input->ldmap_len; + uint8_t i; + + if (start_ld >= s->mhd_state->nr_lds) { + return CXL_MBOX_INVALID_INPUT; + } + + output->nr_lds = s->mhd_state->nr_lds; + output->nr_heads = s->mhd_state->nr_heads; + output->resv1 = 0; + output->start_ld = start_ld; + output->resv2 = 0; + + for (i = 0; i < ldmap_len && (start_ld + i) < output->nr_lds; i++) { + output->ldmap[i] = s->mhd_state->ldmap[start_ld + i]; + } + output->ldmap_len = i; + + *len_out = sizeof(*output) + output->ldmap_len; + return CXL_MBOX_SUCCESS; +} + +static const struct cxl_cmd cxl_cmd_set_mhsld[256][256] = { + [MHSLD_MHD][GET_MHD_INFO] = {"GET_MULTI_HEADED_INFO", + cmd_mhd_get_info, 2, 0}, +}; + +static Property cxl_mhsld_props[] = { + DEFINE_PROP_UINT32("mhd-head", CXLMHSLDState, mhd_head, ~(0)), + DEFINE_PROP_STRING("mhd-state_file", CXLMHSLDState, mhd_state_file), + DEFINE_PROP_BOOL("mhd-init", CXLMHSLDState, mhd_init, false), + DEFINE_PROP_END_OF_LIST(), +}; + +static int cxl_mhsld_state_open(const char *filename, int flags) +{ + char name[128]; + snprintf(name, sizeof(name), "/%s", filename); + return shm_open(name, flags, 0666); +} + +static int cxl_mhsld_state_unlink(const char *filename) +{ + char name[128]; + snprintf(name, sizeof(name), "/%s", filename); + return shm_unlink(name); +} + +static int cxl_mhsld_state_create(const char *filename, size_t size) +{ + int fd, rc; + + fd = cxl_mhsld_state_open(filename, O_RDWR | O_CREAT); + if (fd == -1) { + return -1; + } + + rc = ftruncate(fd, size); + + if (rc) { + close(fd); + return -1; + } + + return fd; +} + +static bool cxl_mhsld_state_set(CXLMHSLDState *s, size_t block_start, + size_t 
block_count) +{ + uint8_t prev = 0, val, *block; + size_t i; + + val = (1 << s->mhd_head); + + /* + * Try to claim all extents from start -> start + count; + * break early if a claimed extent is encountered + */ + for (i = 0; i < block_count; ++i) { + block = &s->mhd_state->blocks[block_start + i]; + prev = __sync_val_compare_and_swap(block, 0, val); + if (prev != 0) { + break; + } + } + + if (prev == 0) { + return true; + } + + /* Roll back incomplete claims */ + for (;; --i) { + block = &s->mhd_state->blocks[block_start + i]; + __sync_fetch_and_and(block, ~(1u << s->mhd_head)); + if (i == 0) { + break; + } + } + + return false; +} + +static void cxl_mhsld_state_clear(CXLMHSLDState *s, size_t block_start, + size_t block_count) +{ + size_t i; + uint8_t *block; + + for (i = 0; i < block_count; ++i) { + block = &s->mhd_state->blocks[block_start + i]; + __sync_fetch_and_and(block, ~(1u << s->mhd_head)); + } +} + +static void cxl_mhsld_state_initialize(CXLMHSLDState *s, size_t dc_size) +{ + if (!s->mhd_init) { + cxl_mhsld_state_clear(s, 0, dc_size / MHSLD_BLOCK_SZ); + return; + } + + memset(s->mhd_state, 0, s->mhd_state_size); + s->mhd_state->nr_heads = MHSLD_HEADS; + s->mhd_state->nr_lds = MHSLD_HEADS; + s->mhd_state->nr_blocks = dc_size / MHSLD_BLOCK_SZ; +} + +/* Returns starting index of region in MHD map. 
*/ +static inline size_t cxl_mhsld_find_dc_region_start(PCIDevice *d, + CXLDCRegion *r) +{ + CXLType3Dev *dcd = CXL_TYPE3(d); + size_t start = 0; + uint8_t rid; + + for (rid = 0; rid < dcd->dc.num_regions; ++rid) { + if (&dcd->dc.regions[rid] == r) { + break; + } + start += dcd->dc.regions[rid].len / dcd->dc.regions[rid].block_size; + } + + return start; +} + +static MHSLDSharedState *cxl_mhsld_state_map(CXLMHSLDState *s) +{ + void *map; + size_t size = s->mhd_state_size; + int fd = s->mhd_state_fd; + + if (fd < 0) { + return NULL; + } + + map = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0); + if (map == MAP_FAILED) { + return NULL; + } + + return (MHSLDSharedState *)map; +} + +/* + * Triggered during an add_capacity command to a CXL device: + * takes a list of extent records and preallocates them, + * in anticipation of a "dcd accept" response from the host. + * + * Extents that are not accepted by the host will be rolled + * back later. + */ +static bool cxl_mhsld_reserve_extents(PCIDevice *d, + CxlDynamicCapacityExtentList *records, + uint8_t rid) +{ + uint64_t len, dpa; + bool rc; + + CXLMHSLDState *s = CXL_MHSLD(d); + CxlDynamicCapacityExtentList *list = records, *rollback = NULL; + + CXLType3Dev *ct3d = CXL_TYPE3(d); + CXLDCRegion *region = &ct3d->dc.regions[rid]; + + for (; list; list = list->next) { + len = list->value->len / MHSLD_BLOCK_SZ; + dpa = (list->value->offset + region->base) / MHSLD_BLOCK_SZ; + + rc = cxl_mhsld_state_set(s, dpa, len); + + if (!rc) { + rollback = records; + break; + } + } + + /* Setting the mhd state failed. 
Roll back the extents that were added */ + for (; rollback; rollback = rollback->next) { + len = rollback->value->len / MHSLD_BLOCK_SZ; + dpa = (rollback->value->offset + region->base) / MHSLD_BLOCK_SZ; + + cxl_mhsld_state_clear(s, dpa, len); + + if (rollback == list) { + return false; + } + } + + return true; +} + +static bool cxl_mhsld_reclaim_extents(PCIDevice *d, + CXLDCExtentGroupList *ext_groups, + CXLUpdateDCExtentListInPl *in) +{ + CXLMHSLDState *s = CXL_MHSLD(d); + CXLType3Dev *ct3d = CXL_TYPE3(d); + CXLDCExtentGroup *ext_group = QTAILQ_FIRST(ext_groups); + CXLDCExtent *ent; + CXLDCRegion *region; + g_autofree unsigned long *blk_bitmap = NULL; + uint64_t dpa, off, len, size, i; + + /* Get the DCD region via the first requested extent */ + ent = QTAILQ_FIRST(&ext_group->list); + dpa = ent->start_dpa; + len = ent->len; + region = cxl_find_dc_region(ct3d, dpa, len); + size = region->len / MHSLD_BLOCK_SZ; + blk_bitmap = bitmap_new(size); + + /* Set all requested extents to 1 in a bitmap */ + QTAILQ_FOREACH(ent, &ext_group->list, node) { + off = ent->start_dpa - region->base; + len = ent->len; + bitmap_set(blk_bitmap, off / MHSLD_BLOCK_SZ, len / MHSLD_BLOCK_SZ); + } + + /* Clear bits associated with accepted extents */ + for (i = 0; i < in->num_entries_updated; i++) { + off = in->updated_entries[i].start_dpa - region->base; + len = in->updated_entries[i].len; + bitmap_clear(blk_bitmap, off / MHSLD_BLOCK_SZ, len / MHSLD_BLOCK_SZ); + } + + /* + * Reclaim only the extents that belong to unaccepted extents, + * i.e. 
those whose bits are still raised in blk_bitmap + */ + for (off = find_first_bit(blk_bitmap, size); off < size;) { + len = find_next_zero_bit(blk_bitmap, size, off) - off; + cxl_mhsld_state_clear(s, off, len); + off = find_next_bit(blk_bitmap, size, off + len); + } + + return true; +} + +static bool cxl_mhsld_release_extent(PCIDevice *d, uint64_t dpa, uint64_t len) +{ + cxl_mhsld_state_clear(CXL_MHSLD(d), dpa / MHSLD_BLOCK_SZ, + len / MHSLD_BLOCK_SZ); + return true; +} + +static bool cxl_mhsld_access_valid(PCIDevice *d, uint64_t addr, + unsigned int size) +{ + CXLType3Dev *ct3d = CXL_TYPE3(d); + CXLMHSLDState *s = CXL_MHSLD(d); + CXLDCRegion *r = cxl_find_dc_region(ct3d, addr, size); + size_t i; + + addr = addr / r->block_size; + size = size / r->block_size; + + for (i = 0; i < size; ++i) { + if (s->mhd_state->blocks[addr + i] != (1 << s->mhd_head)) { + return false; + } + } + + return true; +} + +static void cxl_mhsld_realize(PCIDevice *pci_dev, Error **errp) +{ + CXLMHSLDState *s = CXL_MHSLD(pci_dev); + MemoryRegion *mr; + int fd = -1; + size_t dc_size; + + ct3_realize(pci_dev, errp); + + /* Get number of blocks from dcd size */ + mr = host_memory_backend_get_memory(s->ct3d.dc.host_dc); + if (!mr) { + return; + } + dc_size = memory_region_size(mr); + if (!dc_size) { + error_setg(errp, "MHSLD does not have dynamic capacity to manage"); + return; + } + + s->mhd_state_size = (dc_size / MHSLD_BLOCK_SZ) + sizeof(MHSLDSharedState); + + /* Sanity check the head idx */ + if (s->mhd_head >= MHSLD_HEADS) { + error_setg(errp, "MHD Head ID must be between 0-7"); + return; + } + + /* Create the state file if this is the 'mhd_init' instance */ + if (s->mhd_init) { + fd = cxl_mhsld_state_create(s->mhd_state_file, s->mhd_state_size); + } else { + fd = cxl_mhsld_state_open(s->mhd_state_file, O_RDWR); + } + + if (fd < 0) { + error_setg(errp, "failed to open mhsld state errno %d", errno); + return; + } + + s->mhd_state_fd = fd; + + /* Map the state and initialize it as needed */ + 
s->mhd_state = cxl_mhsld_state_map(s); + if (!s->mhd_state) { + error_setg(errp, "Failed to mmap mhd state file"); + close(fd); + cxl_mhsld_state_unlink(s->mhd_state_file); + return; + } + + cxl_mhsld_state_initialize(s, dc_size); + + /* Set the LD ownership for this head to this system */ + s->mhd_state->ldmap[s->mhd_head] = s->mhd_head; + return; +} + + +static void cxl_mhsld_exit(PCIDevice *pci_dev) +{ + CXLMHSLDState *s = CXL_MHSLD(pci_dev); + + ct3_exit(pci_dev); + + if (s->mhd_state_fd) { + munmap(s->mhd_state, s->mhd_state_size); + close(s->mhd_state_fd); + cxl_mhsld_state_unlink(s->mhd_state_file); + s->mhd_state = NULL; + } +} + +static void cxl_mhsld_reset(DeviceState *d) +{ + CXLMHSLDState *s = CXL_MHSLD(d); + + ct3d_reset(d); + cxl_add_cci_commands(&s->ct3d.cci, cxl_cmd_set_mhsld, 512); + + cxl_mhsld_state_clear(s, 0, s->mhd_state->nr_blocks); +} + +/* + * Example: DCD-add events need to validate that the requested extent + * does not already have a mapping (or, if it does, it is + * a shared extent with the right tagging). + * + * Since this operates on the shared state, we will need to serialize + * these callbacks across QEMU instances via a mutex in shared state. 
+ */ + +static void cxl_mhsld_class_init(ObjectClass *klass, void *data) +{ + DeviceClass *dc = DEVICE_CLASS(klass); + PCIDeviceClass *pc = PCI_DEVICE_CLASS(klass); + + pc->realize = cxl_mhsld_realize; + pc->exit = cxl_mhsld_exit; + device_class_set_legacy_reset(dc, cxl_mhsld_reset); + device_class_set_props(dc, cxl_mhsld_props); + + CXLType3Class *cvc = CXL_TYPE3_CLASS(klass); + cvc->mhd_get_info = cmd_mhd_get_info; + cvc->mhd_access_valid = cxl_mhsld_access_valid; + cvc->mhd_reserve_extents = cxl_mhsld_reserve_extents; + cvc->mhd_reclaim_extents = cxl_mhsld_reclaim_extents; + cvc->mhd_release_extent = cxl_mhsld_release_extent; +} + +static const TypeInfo cxl_mhsld_info = { + .name = TYPE_CXL_MHSLD, + .parent = TYPE_CXL_TYPE3, + .class_size = sizeof(struct CXLMHSLDClass), + .class_init = cxl_mhsld_class_init, + .instance_size = sizeof(CXLMHSLDState), + .interfaces = (InterfaceInfo[]) { + { INTERFACE_CXL_DEVICE }, + { INTERFACE_PCIE_DEVICE }, + {} + }, +}; + +static void cxl_mhsld_register_types(void) +{ + type_register_static(&cxl_mhsld_info); +} + +type_init(cxl_mhsld_register_types) diff --git a/hw/cxl/mhsld/mhsld.h b/hw/cxl/mhsld/mhsld.h new file mode 100644 index 0000000000..e7ead1f0d2 --- /dev/null +++ b/hw/cxl/mhsld/mhsld.h @@ -0,0 +1,75 @@ +/* + * SPDX-License-Identifier: GPL-2.0-or-later + * + * Copyright (c) 2024 MemVerge Inc. + * + */ + +#ifndef CXL_MHSLD_H +#define CXL_MHSLD_H +#include +#include "hw/cxl/cxl.h" +#include "hw/cxl/cxl_mailbox.h" +#include "hw/cxl/cxl_device.h" +#include "qemu/units.h" + +#define MHSLD_BLOCK_SZ (2 * MiB) + +/* + * We limit the number of heads to prevent the shared state + * region from becoming a major memory hog. We need 512MB of + * memory space to track 8-host ownership of 4GB of memory in + * blocks of 2MB. This can change if the block size is increased. + */ +#define MHSLD_HEADS (8) + +/* + * The shared state cannot have 2 variable sized regions + * so we have to max out the ldmap. 
+ */ +typedef struct MHSLDSharedState { + uint8_t nr_heads; + uint8_t nr_lds; + uint8_t ldmap[MHSLD_HEADS]; + uint64_t nr_blocks; + uint8_t blocks[]; +} MHSLDSharedState; + +struct CXLMHSLDState { + CXLType3Dev ct3d; + bool mhd_init; + char *mhd_state_file; + int mhd_state_fd; + size_t mhd_state_size; + uint32_t mhd_head; + MHSLDSharedState *mhd_state; +}; + +struct CXLMHSLDClass { + CXLType3Class parent_class; +}; + +enum { + MHSLD_MHD = 0x55, + #define GET_MHD_INFO 0x0 +}; + +/* + * MHD Get Info Command + * Returns information about the LDs associated with this head + */ +typedef struct MHDGetInfoInput { + uint8_t start_ld; + uint8_t ldmap_len; +} QEMU_PACKED MHDGetInfoInput; + +typedef struct MHDGetInfoOutput { + uint8_t nr_lds; + uint8_t nr_heads; + uint16_t resv1; + uint8_t start_ld; + uint8_t ldmap_len; + uint16_t resv2; + uint8_t ldmap[]; +} QEMU_PACKED MHDGetInfoOutput; +#endif
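
Illustrative note, not part of the patch: the ownership scheme above keeps one byte per 2 MiB block in the shared state, where a head claims a block by compare-and-swapping the byte from 0 to (1 << head) and releases it by clearing its own bit. The standalone sketch below shows that all-or-nothing claim with rollback on a plain byte array. The names sketch_claim, sketch_release and SKETCH_HEADS are hypothetical and do not exist in QEMU; the __sync builtins are the same GCC/Clang builtins the patch uses. Unlike cxl_mhsld_state_set, this sketch rolls back only the blocks it actually claimed and treats an empty range as a success, which is an assumption about the intended behaviour.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical constant mirroring MHSLD_HEADS: one bit per head in a byte. */
#define SKETCH_HEADS 8

/*
 * Try to claim blocks[start .. start + count) for `head`.
 * Either every block in the range is claimed, or none is.
 */
static bool sketch_claim(uint8_t *blocks, size_t start, size_t count,
                         unsigned head)
{
    uint8_t val;
    size_t i;

    assert(head < SKETCH_HEADS);
    val = (uint8_t)(1u << head);

    /* Claim a block only if it is currently unowned (value 0). */
    for (i = 0; i < count; ++i) {
        if (__sync_val_compare_and_swap(&blocks[start + i], 0, val) != 0) {
            break;
        }
    }

    if (i == count) {
        return true;
    }

    /* Roll back the blocks claimed so far: [start, start + i). */
    while (i > 0) {
        --i;
        __sync_fetch_and_and(&blocks[start + i], (uint8_t)~val);
    }

    return false;
}

/* Drop this head's ownership bit from every block in the range. */
static void sketch_release(uint8_t *blocks, size_t start, size_t count,
                           unsigned head)
{
    uint8_t val;
    size_t i;

    assert(head < SKETCH_HEADS);
    val = (uint8_t)(1u << head);

    for (i = 0; i < count; ++i) {
        __sync_fetch_and_and(&blocks[start + i], (uint8_t)~val);
    }
}
```

In this sketch, if head 2 already owns block 6, an attempt by head 1 to claim blocks 4 through 7 claims 4 and 5, fails on 6, and then returns 4 and 5 to the unowned state, leaving the byte for block 6 untouched.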