From patchwork Thu Aug 8 17:40:36 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Michael Reed X-Patchwork-Id: 1970671 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=lists.ubuntu.com (client-ip=185.125.189.65; helo=lists.ubuntu.com; envelope-from=kernel-team-bounces@lists.ubuntu.com; receiver=patchwork.ozlabs.org) Received: from lists.ubuntu.com (lists.ubuntu.com [185.125.189.65]) (using TLSv1.2 with cipher ECDHE-ECDSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4WfvTl2gXXz1ybS for ; Fri, 9 Aug 2024 03:40:59 +1000 (AEST) Received: from localhost ([127.0.0.1] helo=lists.ubuntu.com) by lists.ubuntu.com with esmtp (Exim 4.86_2) (envelope-from ) id 1sc788-0001on-Ee; Thu, 08 Aug 2024 17:40:52 +0000 Received: from smtp-relay-internal-1.internal ([10.131.114.114] helo=smtp-relay-internal-1.canonical.com) by lists.ubuntu.com with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.86_2) (envelope-from ) id 1sc782-0001hV-Vx for kernel-team@lists.ubuntu.com; Thu, 08 Aug 2024 17:40:47 +0000 Received: from mail-oa1-f71.google.com (mail-oa1-f71.google.com [209.85.160.71]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by smtp-relay-internal-1.canonical.com (Postfix) with ESMTPS id B9D5F3F327 for ; Thu, 8 Aug 2024 17:40:46 +0000 (UTC) Received: by mail-oa1-f71.google.com with SMTP id 586e51a60fabf-26113a081c9so1363538fac.1 for ; Thu, 08 Aug 2024 10:40:46 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1723138845; x=1723743645; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=8e4HRJuc2KQA0mJ2sdKdvW8r56fzG+fdm4LqbY8DMBI=; b=kG+bSATRMvn/q/yCY3CSeK5WVfp6P7Ga8m25mJYVj6SMOnVBs5DGPswXnHevRiQhxu moAUt4hUYrIUJZxEnTZgHiJlA2CA6984s0vFmgDvp2szWL8b/0KhrosizrmFq0emNS2m D67G5PwMBNlz7s3HeVNcTrjqqPe3Q9i+Www9dy323bH3mG2R17h7fIERkAuv9Bt8nSbn gFLvra2f7zO1dENifGmxuUdON6e4K/l3s3esOvfIUK7LvFlVmRTn/QtetNs/nfi5MsEk QeVUeMglYUBYWbK3CZQ6iuLWrhjROGavep2TopIy5Q3DlIQ4cQG5UQJQLojY8I3q3LQ7 IihQ== X-Gm-Message-State: AOJu0Yy5lOk99q+vZjmamKK6WH2kgCThFqd72ok6LCh8oKognXQaaJQS Lp5t1XsqHIllBOOzb8CT7k8R5k8xQkSHvXxaqtW7Nw/GTF35KrV2VDhL3LPHF/vVDxXo7SWw3gD 2eqTNs04xOCp8M6ZX3qPi3uJeDKeEHH6w7Aw0WFu1XBSlYKKL2Kiaja771vSqE0RlSXIrxqYkKO fa0xqcS+IN0g== X-Received: by 2002:a05:6871:8f11:b0:26c:5312:a13b with SMTP id 586e51a60fabf-26c5312a443mr1115501fac.30.1723138845248; Thu, 08 Aug 2024 10:40:45 -0700 (PDT) X-Google-Smtp-Source: AGHT+IF6MWABW23bhntEjH+MSL99YLmn/rbFYSfcSQVUC6306s9GerFVnB0meZMOFCqA3LQ7Z9qjYg== X-Received: by 2002:a05:6871:8f11:b0:26c:5312:a13b with SMTP id 586e51a60fabf-26c5312a443mr1115489fac.30.1723138844807; Thu, 08 Aug 2024 10:40:44 -0700 (PDT) Received: from localhost ([2600:1700:1d0:5e50:4e3c:1556:5ef4:51ab]) by smtp.gmail.com with ESMTPSA id 46e09a7af769-70a31d9dd5dsm5611159a34.6.2024.08.08.10.40.44 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 08 Aug 2024 10:40:44 -0700 (PDT) From: Michael Reed To: kernel-team@lists.ubuntu.com Subject: [SRU][N/O][PATCH 5/6] scsi: mpi3mr: Prevent PCI writes from driver during PCI error recovery Date: Thu, 8 Aug 2024 12:40:36 -0500 Message-Id: <20240808174037.22803-6-michael.reed@canonical.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240808174037.22803-1-michael.reed@canonical.com> References: <20240808174037.22803-1-michael.reed@canonical.com> MIME-Version: 1.0 X-BeenThere: kernel-team@lists.ubuntu.com X-Mailman-Version: 2.1.20 Precedence: list List-Id: Kernel team discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: kernel-team-bounces@lists.ubuntu.com Sender: "kernel-team" From: Sumit Saxena BugLink: https://bugs.launchpad.net/bugs/2073583 Prevent interaction with the hardware while the error recovery in progress. Co-developed-by: Sathya Prakash Signed-off-by: Sathya Prakash Co-developed-by: Ranjan Kumar Signed-off-by: Ranjan Kumar Signed-off-by: Sumit Saxena Link: https://lore.kernel.org/r/20240627101735.18286-3-sumit.saxena@broadcom.com Signed-off-by: Martin K. Petersen (cherry picked from commit 1c342b0548e3f53a4478f8d7bcb5af34b361787f) Signed-off-by: Michael Reed --- drivers/scsi/mpi3mr/mpi3mr.h | 1 + drivers/scsi/mpi3mr/mpi3mr_app.c | 10 ++++-- drivers/scsi/mpi3mr/mpi3mr_fw.c | 22 +++++++++--- drivers/scsi/mpi3mr/mpi3mr_os.c | 49 +++++++++++++++++++++++--- drivers/scsi/mpi3mr/mpi3mr_transport.c | 39 +++++++++++++++++--- 5 files changed, 104 insertions(+), 17 deletions(-) diff --git a/drivers/scsi/mpi3mr/mpi3mr.h b/drivers/scsi/mpi3mr/mpi3mr.h index 2e86342d4f52..09af5ebd5009 100644 --- a/drivers/scsi/mpi3mr/mpi3mr.h +++ b/drivers/scsi/mpi3mr/mpi3mr.h @@ -515,6 +515,7 @@ struct mpi3mr_throttle_group_info { /* HBA port flags */ #define MPI3MR_HBA_PORT_FLAG_DIRTY 0x01 +#define MPI3MR_HBA_PORT_FLAG_NEW 0x02 /* IOCTL data transfer sge*/ #define MPI3MR_NUM_IOCTL_SGE 256 diff --git a/drivers/scsi/mpi3mr/mpi3mr_app.c b/drivers/scsi/mpi3mr/mpi3mr_app.c index 6d294c92242d..ef5de5e80da6 100644 --- a/drivers/scsi/mpi3mr/mpi3mr_app.c +++ b/drivers/scsi/mpi3mr/mpi3mr_app.c @@ -846,7 +846,7 @@ static int mpi3mr_bsg_pel_abort(struct mpi3mr_ioc *mrioc) dprint_bsg_err(mrioc, "%s: reset in progress\n", __func__); return -1; } - if (mrioc->stop_bsgs) { + if (mrioc->stop_bsgs || mrioc->block_on_pci_err) { dprint_bsg_err(mrioc, "%s: bsgs are blocked\n", __func__); return -1; } @@ -1492,6 +1492,9 @@ static long mpi3mr_bsg_adp_reset(struct mpi3mr_ioc *mrioc, goto out; } + if (mrioc->unrecoverable || mrioc->block_on_pci_err) + return -EINVAL; + sg_copy_to_buffer(job->request_payload.sg_list, job->request_payload.sg_cnt, &adpreset, sizeof(adpreset)); @@ -2575,7 +2578,7 @@ static long mpi3mr_bsg_process_mpt_cmds(struct bsg_job *job) mutex_unlock(&mrioc->bsg_cmds.mutex); goto out; } - if (mrioc->stop_bsgs) { + if (mrioc->stop_bsgs || mrioc->block_on_pci_err) { dprint_bsg_err(mrioc, "%s: bsgs are blocked\n", __func__); rval = -EAGAIN; mutex_unlock(&mrioc->bsg_cmds.mutex); @@ -3103,7 +3106,8 @@ adp_state_show(struct device *dev, struct device_attribute *attr, ioc_state = mpi3mr_get_iocstate(mrioc); if (ioc_state == MRIOC_STATE_UNRECOVERABLE) adp_state = MPI3MR_BSG_ADPSTATE_UNRECOVERABLE; - else if ((mrioc->reset_in_progress) || (mrioc->stop_bsgs)) + else if (mrioc->reset_in_progress || mrioc->stop_bsgs || + mrioc->block_on_pci_err) adp_state = MPI3MR_BSG_ADPSTATE_IN_RESET; else if (ioc_state == MRIOC_STATE_FAULT) adp_state = MPI3MR_BSG_ADPSTATE_FAULT; diff --git a/drivers/scsi/mpi3mr/mpi3mr_fw.c b/drivers/scsi/mpi3mr/mpi3mr_fw.c index 654154acce54..78fd5cd86e51 100644 --- a/drivers/scsi/mpi3mr/mpi3mr_fw.c +++ b/drivers/scsi/mpi3mr/mpi3mr_fw.c @@ -608,7 +608,7 @@ int mpi3mr_blk_mq_poll(struct Scsi_Host *shost, unsigned int queue_num) mrioc = (struct mpi3mr_ioc *)shost->hostdata; if ((mrioc->reset_in_progress || mrioc->prepare_for_reset || - mrioc->unrecoverable)) + mrioc->unrecoverable || mrioc->pci_err_recovery)) return 0; num_entries = mpi3mr_process_op_reply_q(mrioc, @@ -1686,6 +1686,12 @@ int mpi3mr_admin_request_post(struct mpi3mr_ioc *mrioc, void *admin_req, retval = -EAGAIN; goto out; } + if (mrioc->pci_err_recovery) { + ioc_err(mrioc, "admin request queue submission failed due to pci error recovery in progress\n"); + retval = -EAGAIN; + goto out; + } + areq_entry = (u8 *)mrioc->admin_req_base + (areq_pi * MPI3MR_ADMIN_REQ_FRAME_SZ); memset(areq_entry, 0, MPI3MR_ADMIN_REQ_FRAME_SZ); @@ -2356,6 +2362,11 @@ int mpi3mr_op_request_post(struct mpi3mr_ioc *mrioc, retval = -EAGAIN; goto out; } + if (mrioc->pci_err_recovery) { + ioc_err(mrioc, "operational request queue submission failed due to pci error recovery in progress\n"); + retval = -EAGAIN; + goto out; + } segment_base_addr = segments[pi / op_req_q->segment_qd].segment; req_entry = (u8 *)segment_base_addr + @@ -2620,7 +2631,7 @@ static void mpi3mr_watchdog_work(struct work_struct *work) union mpi3mr_trigger_data trigger_data; u16 reset_reason = MPI3MR_RESET_FROM_FAULT_WATCH; - if (mrioc->reset_in_progress) + if (mrioc->reset_in_progress || mrioc->pci_err_recovery) return; if (!mrioc->unrecoverable && !pci_device_is_present(mrioc->pdev)) { @@ -4259,7 +4270,7 @@ int mpi3mr_reinit_ioc(struct mpi3mr_ioc *mrioc, u8 is_resume) goto out_failed_noretry; } - if (is_resume) { + if (is_resume || mrioc->block_on_pci_err) { dprint_reset(mrioc, "setting up single ISR\n"); retval = mpi3mr_setup_isr(mrioc, 1); if (retval) { @@ -4310,7 +4321,7 @@ int mpi3mr_reinit_ioc(struct mpi3mr_ioc *mrioc, u8 is_resume) goto out_failed; } - if (is_resume) { + if (is_resume || mrioc->block_on_pci_err) { dprint_reset(mrioc, "setting up multiple ISR\n"); retval = mpi3mr_setup_isr(mrioc, 0); if (retval) { @@ -4798,7 +4809,8 @@ void mpi3mr_cleanup_ioc(struct mpi3mr_ioc *mrioc) ioc_state = mpi3mr_get_iocstate(mrioc); - if ((!mrioc->unrecoverable) && (!mrioc->reset_in_progress) && + if (!mrioc->unrecoverable && !mrioc->reset_in_progress && + !mrioc->pci_err_recovery && (ioc_state == MRIOC_STATE_READY)) { if (mpi3mr_issue_and_process_mur(mrioc, MPI3MR_RESET_FROM_CTLR_CLEANUP)) diff --git a/drivers/scsi/mpi3mr/mpi3mr_os.c b/drivers/scsi/mpi3mr/mpi3mr_os.c index 2ba9701bf67a..da0efe6478c8 100644 --- a/drivers/scsi/mpi3mr/mpi3mr_os.c +++ b/drivers/scsi/mpi3mr/mpi3mr_os.c @@ -955,7 +955,7 @@ static int mpi3mr_report_tgtdev_to_host(struct mpi3mr_ioc *mrioc, int retval = 0; struct mpi3mr_tgt_dev *tgtdev; - if (mrioc->reset_in_progress) + if (mrioc->reset_in_progress || mrioc->pci_err_recovery) return -1; tgtdev = mpi3mr_get_tgtdev_by_perst_id(mrioc, perst_id); @@ -2002,6 +2002,7 @@ static void mpi3mr_fwevt_bh(struct mpi3mr_ioc *mrioc, struct mpi3_device_page0 *dev_pg0 = NULL; u16 perst_id, handle, dev_info; struct mpi3_device0_sas_sata_format *sasinf = NULL; + unsigned int timeout; mpi3mr_fwevt_del_from_list(mrioc, fwevt); mrioc->current_event = fwevt; @@ -2092,8 +2093,18 @@ static void mpi3mr_fwevt_bh(struct mpi3mr_ioc *mrioc, } case MPI3_EVENT_WAIT_FOR_DEVICES_TO_REFRESH: { - while (mrioc->device_refresh_on) + timeout = MPI3MR_RESET_TIMEOUT * 2; + while ((mrioc->device_refresh_on || mrioc->block_on_pci_err) && + !mrioc->unrecoverable && !mrioc->pci_err_recovery) { msleep(500); + if (!timeout--) { + mrioc->unrecoverable = 1; + break; + } + } + + if (mrioc->unrecoverable || mrioc->pci_err_recovery) + break; dprint_event_bh(mrioc, "scan for non responding and newly added devices after soft reset started\n"); @@ -3791,6 +3802,13 @@ int mpi3mr_issue_tm(struct mpi3mr_ioc *mrioc, u8 tm_type, mutex_unlock(&drv_cmd->mutex); goto out; } + if (mrioc->block_on_pci_err) { + retval = -1; + dprint_tm(mrioc, "sending task management failed due to\n" + "pci error recovery in progress\n"); + mutex_unlock(&drv_cmd->mutex); + goto out; + } drv_cmd->state = MPI3MR_CMD_PENDING; drv_cmd->is_waiting = 1; @@ -4176,6 +4194,7 @@ static int mpi3mr_eh_bus_reset(struct scsi_cmnd *scmd) struct mpi3mr_sdev_priv_data *sdev_priv_data; u8 dev_type = MPI3_DEVICE_DEVFORM_VD; int retval = FAILED; + unsigned int timeout = MPI3MR_RESET_TIMEOUT; sdev_priv_data = scmd->device->hostdata; if (sdev_priv_data && sdev_priv_data->tgt_priv_data) { @@ -4186,12 +4205,24 @@ static int mpi3mr_eh_bus_reset(struct scsi_cmnd *scmd) if (dev_type == MPI3_DEVICE_DEVFORM_VD) { mpi3mr_wait_for_host_io(mrioc, MPI3MR_RAID_ERRREC_RESET_TIMEOUT); - if (!mpi3mr_get_fw_pending_ios(mrioc)) + if (!mpi3mr_get_fw_pending_ios(mrioc)) { + while (mrioc->reset_in_progress || + mrioc->prepare_for_reset || + mrioc->block_on_pci_err) { + ssleep(1); + if (!timeout--) { + retval = FAILED; + goto out; + } + } retval = SUCCESS; + goto out; + } } if (retval == FAILED) mpi3mr_print_pending_host_io(mrioc); +out: sdev_printk(KERN_INFO, scmd->device, "Bus reset is %s for scmd(%p)\n", ((retval == SUCCESS) ? "SUCCESS" : "FAILED"), scmd); @@ -4892,7 +4923,8 @@ static int mpi3mr_qcmd(struct Scsi_Host *shost, goto out; } - if (mrioc->reset_in_progress) { + if (mrioc->reset_in_progress || mrioc->prepare_for_reset + || mrioc->block_on_pci_err) { retval = SCSI_MLQUEUE_HOST_BUSY; goto out; } @@ -5370,7 +5402,14 @@ static void mpi3mr_remove(struct pci_dev *pdev) while (mrioc->reset_in_progress || mrioc->is_driver_loading) ssleep(1); - if (!pci_device_is_present(mrioc->pdev)) { + if (mrioc->block_on_pci_err) { + mrioc->block_on_pci_err = false; + scsi_unblock_requests(shost); + mrioc->unrecoverable = 1; + } + + if (!pci_device_is_present(mrioc->pdev) || + mrioc->pci_err_recovery) { mrioc->unrecoverable = 1; mpi3mr_flush_cmds_for_unrecovered_controller(mrioc); } diff --git a/drivers/scsi/mpi3mr/mpi3mr_transport.c b/drivers/scsi/mpi3mr/mpi3mr_transport.c index d32ad46318cb..037860471fb7 100644 --- a/drivers/scsi/mpi3mr/mpi3mr_transport.c +++ b/drivers/scsi/mpi3mr/mpi3mr_transport.c @@ -149,6 +149,11 @@ static int mpi3mr_report_manufacture(struct mpi3mr_ioc *mrioc, return -EFAULT; } + if (mrioc->pci_err_recovery) { + ioc_err(mrioc, "%s: pci error recovery in progress!\n", __func__); + return -EFAULT; + } + data_out_sz = sizeof(struct rep_manu_request); data_in_sz = sizeof(struct rep_manu_reply); data_out = dma_alloc_coherent(&mrioc->pdev->dev, @@ -792,6 +797,12 @@ static int mpi3mr_set_identify(struct mpi3mr_ioc *mrioc, u16 handle, return -EFAULT; } + if (mrioc->pci_err_recovery) { + ioc_err(mrioc, "%s: pci error recovery in progress!\n", + __func__); + return -EFAULT; + } + if ((mpi3mr_cfg_get_dev_pg0(mrioc, &ioc_status, &device_pg0, sizeof(device_pg0), MPI3_DEVICE_PGAD_FORM_HANDLE, handle))) { ioc_err(mrioc, "%s: device page0 read failed\n", __func__); @@ -1009,6 +1020,9 @@ mpi3mr_alloc_hba_port(struct mpi3mr_ioc *mrioc, u16 port_id) hba_port->port_id = port_id; ioc_info(mrioc, "hba_port entry: %p, port: %d is added to hba_port list\n", hba_port, hba_port->port_id); + if (mrioc->reset_in_progress || + mrioc->pci_err_recovery) + hba_port->flags = MPI3MR_HBA_PORT_FLAG_NEW; list_add_tail(&hba_port->list, &mrioc->hba_port_table_list); return hba_port; } @@ -1057,7 +1071,7 @@ void mpi3mr_update_links(struct mpi3mr_ioc *mrioc, struct mpi3mr_sas_node *mr_sas_node; struct mpi3mr_sas_phy *mr_sas_phy; - if (mrioc->reset_in_progress) + if (mrioc->reset_in_progress || mrioc->pci_err_recovery) return; spin_lock_irqsave(&mrioc->sas_node_lock, flags); @@ -1970,7 +1984,7 @@ int mpi3mr_expander_add(struct mpi3mr_ioc *mrioc, u16 handle) if (!handle) return -1; - if (mrioc->reset_in_progress) + if (mrioc->reset_in_progress || mrioc->pci_err_recovery) return -1; if ((mpi3mr_cfg_get_sas_exp_pg0(mrioc, &ioc_status, &expander_pg0, @@ -2176,7 +2190,7 @@ void mpi3mr_expander_node_remove(struct mpi3mr_ioc *mrioc, /* remove sibling ports attached to this expander */ list_for_each_entry_safe(mr_sas_port, next, &sas_expander->sas_port_list, port_list) { - if (mrioc->reset_in_progress) + if (mrioc->reset_in_progress || mrioc->pci_err_recovery) return; if (mr_sas_port->remote_identify.device_type == SAS_END_DEVICE) @@ -2226,7 +2240,7 @@ void mpi3mr_expander_remove(struct mpi3mr_ioc *mrioc, u64 sas_address, struct mpi3mr_sas_node *sas_expander; unsigned long flags; - if (mrioc->reset_in_progress) + if (mrioc->reset_in_progress || mrioc->pci_err_recovery) return; if (!hba_port) @@ -2537,6 +2551,11 @@ static int mpi3mr_get_expander_phy_error_log(struct mpi3mr_ioc *mrioc, return -EFAULT; } + if (mrioc->pci_err_recovery) { + ioc_err(mrioc, "%s: pci error recovery in progress!\n", __func__); + return -EFAULT; + } + data_out_sz = sizeof(struct phy_error_log_request); data_in_sz = sizeof(struct phy_error_log_reply); sz = data_out_sz + data_in_sz; @@ -2796,6 +2815,12 @@ mpi3mr_expander_phy_control(struct mpi3mr_ioc *mrioc, return -EFAULT; } + if (mrioc->pci_err_recovery) { + ioc_err(mrioc, "%s: pci error recovery in progress!\n", + __func__); + return -EFAULT; + } + data_out_sz = sizeof(struct phy_control_request); data_in_sz = sizeof(struct phy_control_reply); sz = data_out_sz + data_in_sz; @@ -3219,6 +3244,12 @@ mpi3mr_transport_smp_handler(struct bsg_job *job, struct Scsi_Host *shost, goto out; } + if (mrioc->pci_err_recovery) { + ioc_err(mrioc, "%s: pci error recovery in progress!\n", __func__); + rc = -EFAULT; + goto out; + } + rc = mpi3mr_map_smp_buffer(&mrioc->pdev->dev, &job->request_payload, &dma_addr_out, &dma_len_out, &addr_out); if (rc)