Message ID | 1443222852-9229-1-git-send-email-mrochs@linux.vnet.ibm.com (mailing list archive) |
---|---|
State | Not Applicable |
Headers | show |
Reviewed-by: Brian King <brking@linux.vnet.ibm.com>
You have two versions of check_state() below, which is a bit confusing. It looks like you've moved the function and also added the up/down of the read semaphore. I assume that's all that changed? > > /** > + * check_state() - checks and responds to the current adapter state > + * @cfg: Internal structure associated with the host. > + * > + * This routine can block and should only be used on process context. > + * It assumes that the caller is an ioctl thread and holding the ioctl > + * read semaphore. This is temporarily let up across the wait to allow > + * for draining actively running ioctls. Also note that when waking up > + * from waiting in reset, the state is unknown and must be checked again > + * before proceeding. > + * > + * Return: 0 on success, -errno on failure > + */ > +static int check_state(struct cxlflash_cfg *cfg) > +{ > + struct device *dev = &cfg->dev->dev; > + int rc = 0; > + > +retry: > + switch (cfg->state) { > + case STATE_LIMBO: > + dev_dbg(dev, "%s: Limbo state, going to wait...\n", __func__); > + up_read(&cfg->ioctl_rwsem); > + rc = wait_event_interruptible(cfg->limbo_waitq, > + cfg->state != STATE_LIMBO); > + down_read(&cfg->ioctl_rwsem); > + if (unlikely(rc)) > + break; > + goto retry; > + case STATE_FAILTERM: > + dev_dbg(dev, "%s: Failed/Terminating!\n", __func__); > + rc = -ENODEV; > + break; > + default: > + break; > + } > + > + return rc; > +} > + > +/** > * cxlflash_disk_attach() - attach a LUN to a context > * @sdev: SCSI device associated with LUN. > * @attach: Attach ioctl data structure. > @@ -1523,41 +1563,6 @@ err1: > } > > /** > - * check_state() - checks and responds to the current adapter state > - * @cfg: Internal structure associated with the host. > - * > - * This routine can block and should only be used on process context. > - * Note that when waking up from waiting in limbo, the state is unknown > - * and must be checked again before proceeding. > - * > - * Return: 0 on success, -errno on failure > - */ > -static int check_state(struct cxlflash_cfg *cfg) > -{ > - struct device *dev = &cfg->dev->dev; > - int rc = 0; > - > -retry: > - switch (cfg->state) { > - case STATE_LIMBO: > - dev_dbg(dev, "%s: Limbo, going to wait...\n", __func__); > - rc = wait_event_interruptible(cfg->limbo_waitq, > - cfg->state != STATE_LIMBO); > - if (unlikely(rc)) > - break; > - goto retry; > - case STATE_FAILTERM: > - dev_dbg(dev, "%s: Failed/Terminating!\n", __func__); > - rc = -ENODEV; > - break; > - default: > - break; > - } > - > - return rc; > -} > - > -/** > * cxlflash_afu_recover() - initiates AFU recovery > * @sdev: SCSI device associated with LUN. > * @recover: Recover ioctl data structure. > @@ -1646,9 +1651,14 @@ retry_recover: > /* Test if in error state */ > reg = readq_be(&afu->ctrl_map->mbox_r); > if (reg == -1) { > - dev_dbg(dev, "%s: MMIO read fail! Wait for recovery...\n", > - __func__); > - mutex_unlock(&ctxi->mutex); > + dev_dbg(dev, "%s: MMIO fail, wait for recovery.\n", __func__); > + > + /* > + * Before checking the state, put back the context obtained with > + * get_context() as it is no longer needed and sleep for a short > + * period of time (see prolog notes). > + */ > + put_context(ctxi); Is this needed for the drain to work? It looks like it fixes a refcounting bug in the function, but I'm not sure I understand how it interacts with the rest of this patch. Anyway, the patch overall looks good to me, and makes your driver interact with CXL's EEH support in the way I intended when I wrote it. Reviewed-by: Daniel Axtens <dja@axtens.net> Regards, Daniel
> On Sep 28, 2015, at 6:05 PM, Daniel Axtens <dja@axtens.net> wrote: > > You have two versions of check_state() below, which is a bit > confusing. It looks like you've moved the function and also added the > up/down of the read semaphore. I assume that's all that changed? Correct. It was originally moved to meet a dependency due to it being defined statically. > >> >> /** >> + * check_state() - checks and responds to the current adapter state >> + * @cfg: Internal structure associated with the host. >> + * >> + * This routine can block and should only be used on process context. >> + * It assumes that the caller is an ioctl thread and holding the ioctl >> + * read semaphore. This is temporarily let up across the wait to allow >> + * for draining actively running ioctls. Also note that when waking up >> + * from waiting in reset, the state is unknown and must be checked again >> + * before proceeding. >> + * >> + * Return: 0 on success, -errno on failure >> + */ >> +static int check_state(struct cxlflash_cfg *cfg) >> +{ >> + struct device *dev = &cfg->dev->dev; >> + int rc = 0; >> + >> +retry: >> + switch (cfg->state) { >> + case STATE_LIMBO: >> + dev_dbg(dev, "%s: Limbo state, going to wait...\n", __func__); >> + up_read(&cfg->ioctl_rwsem); >> + rc = wait_event_interruptible(cfg->limbo_waitq, >> + cfg->state != STATE_LIMBO); >> + down_read(&cfg->ioctl_rwsem); >> + if (unlikely(rc)) >> + break; >> + goto retry; >> + case STATE_FAILTERM: >> + dev_dbg(dev, "%s: Failed/Terminating!\n", __func__); >> + rc = -ENODEV; >> + break; >> + default: >> + break; >> + } >> + >> + return rc; >> +} >> + >> +/** >> * cxlflash_disk_attach() - attach a LUN to a context >> * @sdev: SCSI device associated with LUN. >> * @attach: Attach ioctl data structure. >> @@ -1523,41 +1563,6 @@ err1: >> } >> >> /** >> - * check_state() - checks and responds to the current adapter state >> - * @cfg: Internal structure associated with the host. >> - * >> - * This routine can block and should only be used on process context. >> - * Note that when waking up from waiting in limbo, the state is unknown >> - * and must be checked again before proceeding. >> - * >> - * Return: 0 on success, -errno on failure >> - */ >> -static int check_state(struct cxlflash_cfg *cfg) >> -{ >> - struct device *dev = &cfg->dev->dev; >> - int rc = 0; >> - >> -retry: >> - switch (cfg->state) { >> - case STATE_LIMBO: >> - dev_dbg(dev, "%s: Limbo, going to wait...\n", __func__); >> - rc = wait_event_interruptible(cfg->limbo_waitq, >> - cfg->state != STATE_LIMBO); >> - if (unlikely(rc)) >> - break; >> - goto retry; >> - case STATE_FAILTERM: >> - dev_dbg(dev, "%s: Failed/Terminating!\n", __func__); >> - rc = -ENODEV; >> - break; >> - default: >> - break; >> - } >> - >> - return rc; >> -} >> - >> -/** >> * cxlflash_afu_recover() - initiates AFU recovery >> * @sdev: SCSI device associated with LUN. >> * @recover: Recover ioctl data structure. >> @@ -1646,9 +1651,14 @@ retry_recover: >> /* Test if in error state */ >> reg = readq_be(&afu->ctrl_map->mbox_r); >> if (reg == -1) { >> - dev_dbg(dev, "%s: MMIO read fail! Wait for recovery...\n", >> - __func__); >> - mutex_unlock(&ctxi->mutex); >> + dev_dbg(dev, "%s: MMIO fail, wait for recovery.\n", __func__); >> + >> + /* >> + * Before checking the state, put back the context obtained with >> + * get_context() as it is no longer needed and sleep for a short >> + * period of time (see prolog notes). >> + */ >> + put_context(ctxi); > > Is this needed for the drain to work? It looks like it fixes a > refcounting bug in the function, but I'm not sure I understand how it > interacts with the rest of this patch. This was simply some "while I'm here" refactoring as the commit originally included a change here. The main point of this change was to replace the mutex_unlock() with put_context(), which is a wrapper around the unlocking of the context's mutex. > > Anyway, the patch overall looks good to me, and makes your driver > interact with CXL's EEH support in the way I intended when I wrote it. Thanks for reviewing.
diff --git a/drivers/scsi/cxlflash/common.h b/drivers/scsi/cxlflash/common.h index 1c56037..1abe4e0 100644 --- a/drivers/scsi/cxlflash/common.h +++ b/drivers/scsi/cxlflash/common.h @@ -16,6 +16,7 @@ #define _CXLFLASH_COMMON_H #include <linux/list.h> +#include <linux/rwsem.h> #include <linux/types.h> #include <scsi/scsi.h> #include <scsi/scsi_device.h> @@ -110,6 +111,7 @@ struct cxlflash_cfg { atomic_t recovery_threads; struct mutex ctx_recovery_mutex; struct mutex ctx_tbl_list_mutex; + struct rw_semaphore ioctl_rwsem; struct ctx_info *ctx_tbl[MAX_CONTEXT]; struct list_head ctx_err_recovery; /* contexts w/ recovery pending */ struct file_operations cxl_fops; diff --git a/drivers/scsi/cxlflash/main.c b/drivers/scsi/cxlflash/main.c index 3e3ccf1..6e85c77 100644 --- a/drivers/scsi/cxlflash/main.c +++ b/drivers/scsi/cxlflash/main.c @@ -2311,6 +2311,7 @@ static int cxlflash_probe(struct pci_dev *pdev, cfg->lr_port = -1; mutex_init(&cfg->ctx_tbl_list_mutex); mutex_init(&cfg->ctx_recovery_mutex); + init_rwsem(&cfg->ioctl_rwsem); INIT_LIST_HEAD(&cfg->ctx_err_recovery); INIT_LIST_HEAD(&cfg->lluns); @@ -2365,6 +2366,19 @@ out_remove: } /** + * drain_ioctls() - wait until all currently executing ioctls have completed + * @cfg: Internal structure associated with the host. + * + * Obtain write access to read/write semaphore that wraps ioctl + * handling to 'drain' ioctls currently executing. + */ +static void drain_ioctls(struct cxlflash_cfg *cfg) +{ + down_write(&cfg->ioctl_rwsem); + up_write(&cfg->ioctl_rwsem); +} + +/** * cxlflash_pci_error_detected() - called when a PCI error is detected * @pdev: PCI device struct. * @state: PCI channel state. @@ -2383,16 +2397,14 @@ static pci_ers_result_t cxlflash_pci_error_detected(struct pci_dev *pdev, switch (state) { case pci_channel_io_frozen: cfg->state = STATE_LIMBO; - - /* Turn off legacy I/O */ scsi_block_requests(cfg->host); + drain_ioctls(cfg); rc = cxlflash_mark_contexts_error(cfg); if (unlikely(rc)) dev_err(dev, "%s: Failed to mark user contexts!(%d)\n", __func__, rc); term_mc(cfg, UNDO_START); stop_afu(cfg); - return PCI_ERS_RESULT_NEED_RESET; case pci_channel_io_perm_failure: cfg->state = STATE_FAILTERM; diff --git a/drivers/scsi/cxlflash/superpipe.c b/drivers/scsi/cxlflash/superpipe.c index 28aa9d9..655cbf1 100644 --- a/drivers/scsi/cxlflash/superpipe.c +++ b/drivers/scsi/cxlflash/superpipe.c @@ -1214,6 +1214,46 @@ static const struct file_operations null_fops = { }; /** + * check_state() - checks and responds to the current adapter state + * @cfg: Internal structure associated with the host. + * + * This routine can block and should only be used on process context. + * It assumes that the caller is an ioctl thread and holding the ioctl + * read semaphore. This is temporarily let up across the wait to allow + * for draining actively running ioctls. Also note that when waking up + * from waiting in reset, the state is unknown and must be checked again + * before proceeding. + * + * Return: 0 on success, -errno on failure + */ +static int check_state(struct cxlflash_cfg *cfg) +{ + struct device *dev = &cfg->dev->dev; + int rc = 0; + +retry: + switch (cfg->state) { + case STATE_LIMBO: + dev_dbg(dev, "%s: Limbo state, going to wait...\n", __func__); + up_read(&cfg->ioctl_rwsem); + rc = wait_event_interruptible(cfg->limbo_waitq, + cfg->state != STATE_LIMBO); + down_read(&cfg->ioctl_rwsem); + if (unlikely(rc)) + break; + goto retry; + case STATE_FAILTERM: + dev_dbg(dev, "%s: Failed/Terminating!\n", __func__); + rc = -ENODEV; + break; + default: + break; + } + + return rc; +} + +/** * cxlflash_disk_attach() - attach a LUN to a context * @sdev: SCSI device associated with LUN. * @attach: Attach ioctl data structure. @@ -1523,41 +1563,6 @@ err1: } /** - * check_state() - checks and responds to the current adapter state - * @cfg: Internal structure associated with the host. - * - * This routine can block and should only be used on process context. - * Note that when waking up from waiting in limbo, the state is unknown - * and must be checked again before proceeding. - * - * Return: 0 on success, -errno on failure - */ -static int check_state(struct cxlflash_cfg *cfg) -{ - struct device *dev = &cfg->dev->dev; - int rc = 0; - -retry: - switch (cfg->state) { - case STATE_LIMBO: - dev_dbg(dev, "%s: Limbo, going to wait...\n", __func__); - rc = wait_event_interruptible(cfg->limbo_waitq, - cfg->state != STATE_LIMBO); - if (unlikely(rc)) - break; - goto retry; - case STATE_FAILTERM: - dev_dbg(dev, "%s: Failed/Terminating!\n", __func__); - rc = -ENODEV; - break; - default: - break; - } - - return rc; -} - -/** * cxlflash_afu_recover() - initiates AFU recovery * @sdev: SCSI device associated with LUN. * @recover: Recover ioctl data structure. @@ -1646,9 +1651,14 @@ retry_recover: /* Test if in error state */ reg = readq_be(&afu->ctrl_map->mbox_r); if (reg == -1) { - dev_dbg(dev, "%s: MMIO read fail! Wait for recovery...\n", - __func__); - mutex_unlock(&ctxi->mutex); + dev_dbg(dev, "%s: MMIO fail, wait for recovery.\n", __func__); + + /* + * Before checking the state, put back the context obtained with + * get_context() as it is no longer needed and sleep for a short + * period of time (see prolog notes). + */ + put_context(ctxi); ctxi = NULL; ssleep(1); rc = check_state(cfg); @@ -1967,6 +1977,14 @@ out: * @cmd: IOCTL command. * @arg: Userspace ioctl data structure. * + * A read/write semaphore is used to implement a 'drain' of currently + * running ioctls. The read semaphore is taken at the beginning of each + * ioctl thread and released upon concluding execution. Additionally the + * semaphore should be released and then reacquired in any ioctl execution + * path which will wait for an event to occur that is outside the scope of + * the ioctl (i.e. an adapter reset). To drain the ioctls currently running, + * a thread simply needs to acquire the write semaphore. + * * Return: 0 on success, -errno on failure */ int cxlflash_ioctl(struct scsi_device *sdev, int cmd, void __user *arg) @@ -2001,6 +2019,9 @@ int cxlflash_ioctl(struct scsi_device *sdev, int cmd, void __user *arg) {sizeof(struct dk_cxlflash_clone), (sioctl)cxlflash_disk_clone}, }; + /* Hold read semaphore so we can drain if needed */ + down_read(&cfg->ioctl_rwsem); + /* Restrict command set to physical support only for internal LUN */ if (afu->internal_lun) switch (cmd) { @@ -2082,6 +2103,7 @@ int cxlflash_ioctl(struct scsi_device *sdev, int cmd, void __user *arg) /* fall through to exit */ cxlflash_ioctl_exit: + up_read(&cfg->ioctl_rwsem); if (unlikely(rc && known_ioctl)) dev_err(dev, "%s: ioctl %s (%08X) on dev(%d/%d/%d/%llu) " "returned rc %d\n", __func__,