
hw/nvme: actually implement abort

Message ID 20240702080232.848849-1-ayush.m55@samsung.com
State New

Commit Message

Ayush Mishra July 2, 2024, 8:02 a.m. UTC
Abort was not implemented previously, but we can implement it for AERs and asynchronously for I/O.

Signed-off-by: Ayush Mishra <ayush.m55@samsung.com>
---
 hw/nvme/ctrl.c | 32 ++++++++++++++++++++++++++++++++
 1 file changed, 32 insertions(+)

Comments

Keith Busch July 2, 2024, 3:24 p.m. UTC | #1
On Tue, Jul 02, 2024 at 01:32:32PM +0530, Ayush Mishra wrote:
> Abort was not implemented previously, but we can implement it for AERs and asynchronously for I/O.

Not implemented for a reason. The target has no idea if the CID the
host requested to be aborted is from the same context that the target
has. Target may have previously completed it, and the host re-issued a
new command after the abort, and due to the queueing could have been
observed in a different order, and now you aborted the wrong command.
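
For concreteness, the reordering described above could play out like this
(a hypothetical timing, purely illustrative):

  1. The host submits command X with CID 7 on SQ 1; the device starts it.
  2. The host gives up on X and submits Abort (SQID=1, CID=7) on the admin
     queue.
  3. The device completes X normally before it gets around to the Abort;
     the host sees the completion, frees CID 7 and immediately reuses it
     for a new command Y on SQ 1.
  4. The Abort is finally processed; the only CID 7 outstanding on SQ 1 is
     now Y, so the device aborts Y -- a command the host never asked to
     abort.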
Klaus Jensen July 2, 2024, 6:55 p.m. UTC | #2
On Jul  2 09:24, Keith Busch wrote:
> On Tue, Jul 02, 2024 at 01:32:32PM +0530, Ayush Mishra wrote:
> > Abort was not implemented previously, but we can implement it for AERs and asynchronously for I/O.
> 
> Not implemented for a reason. The target has no idea if the CID the
> host requested to be aborted is from the same context that the target
> has. Target may have previously completed it, and the host re-issued a
> new command after the abort, and due to the queueing could have been
> observed in a different order, and now you aborted the wrong command.

I might be missing something here, but are you saying that the Abort
command is fundamentally flawed? Isn't this a host issue? The Abort is
for a specific CID on a specific SQID. The host *should* not screw this
up and reuse a CID it has an outstanding Abort on?

I don't think there are a lot of I/O commands that a host would be able
to cancel (in QEMU, not at all, because only the iscsi backend
actually implements blk_aio_cancel_async). But some commands that issue
multiple AIOs, like Copy, may be long running and with this it can
actually be cancelled.

And with regards to AERs, I don't see why it is not advantageous to be
able to Abort one?
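
For reference, cancellation in QEMU's block layer only takes effect when the
backend's AIOCB provides a cancel_async hook; the generic path is roughly the
following (paraphrased from memory, so treat it as a sketch rather than the
exact source):

    /* Sketch of QEMU's generic async-cancel path. If the backend's AIOCBInfo
     * has no cancel_async callback, this is effectively a no-op and the
     * request simply runs to completion -- hence the remark above that only
     * the iscsi backend can really cancel today. */
    void bdrv_aio_cancel_async(BlockAIOCB *acb)
    {
        if (acb->aiocb_info->cancel_async) {
            acb->aiocb_info->cancel_async(acb);
        }
    }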
Klaus Jensen July 10, 2024, 9:09 a.m. UTC | #3
On Jul  2 20:55, Klaus Jensen wrote:
> On Jul  2 09:24, Keith Busch wrote:
> > On Tue, Jul 02, 2024 at 01:32:32PM +0530, Ayush Mishra wrote:
> > > Abort was not implemented previously, but we can implement it for AERs and asynchronously for I/O.
> > 
> > Not implemented for a reason. The target has no idea if the CID the
> > host requested to be aborted is from the same context that the target
> > has. Target may have previously completed it, and the host re-issued a
> > new command after the abort, and due to the queueing could have been
> > observed in a different order, and now you aborted the wrong command.
> 
> I might be missing something here, but are you saying that the Abort
> command is fundamentally flawed? Isn't this a host issue? The Abort is
> for a specific CID on a specific SQID. The host *should* not screw this
> up and reuse a CID it has an outstanding Abort on?
> 
> I don't think there are a lot of I/O commands that a host would be able
> to cancel (in QEMU, not at all, because only the iscsi backend
> actually implements blk_aio_cancel_async). But some commands that issue
> multiple AIOs, like Copy, may be long running and with this it can
> actually be cancelled.
> 
> And with regards to AERs, I don't see why it is not advantageous to be
> able to Abort one?

Keith, any thoughts on this?
Keith Busch July 11, 2024, 3:01 p.m. UTC | #4
On Wed, Jul 10, 2024 at 11:09:43AM +0200, Klaus Jensen wrote:
> On Jul  2 20:55, Klaus Jensen wrote:
> > On Jul  2 09:24, Keith Busch wrote:
> > > On Tue, Jul 02, 2024 at 01:32:32PM +0530, Ayush Mishra wrote:
> > > > Abort was not implemented previously, but we can implement it for AERs and asynchronously for I/O.
> > > 
> > > Not implemented for a reason. The target has no idea if the CID the
> > > host requested to be aborted is from the same context that the target
> > > has. Target may have previously completed it, and the host re-issued a
> > > new command after the abort, and due to the queueing could have been
> > > observed in a different order, and now you aborted the wrong command.
> > 
> > I might be missing something here, but are you saying that the Abort
> > command is fundamentally flawed? Isn't this a host issue? The Abort is
> > for a specific CID on a specific SQID. The host *should* not screw this
> > up and reuse a CID it has an outstanding Abort on?
> > 
> > I don't think there are a lot of I/O commands that a host would be able
> > to cancel (in QEMU, not at all, because only the iscsi backend
> > actually implements blk_aio_cancel_async). But some commands that issue
> > multiple AIOs, like Copy, may be long running and with this it can
> > actually be cancelled.
> > 
> > And with regards to AERs, I don't see why it is not advantageous to be
> > able to Abort one?
> 
> Keith, any thoughts on this?

Oh, you can take this if you want; I'm just mentioning the pitfalls with
the abort command. While sequestering command IDs that are being
aborted may be good practice for the host, the spec doesn't say anything
about it. The Linux driver doesn't do that, at least, though it recently
created a different mechanism to avoid immediate command ID reuse.
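
A purely illustrative sketch of what such host-side "sequestering" could look
like (hypothetical host code, not something from the Linux driver or QEMU):

    #include <stdbool.h>

    #define QUEUE_DEPTH 1024

    /* Hypothetical per-SQ state: a CID stays unavailable while either the
     * command using it or an Abort naming it is still outstanding. */
    struct sq_state {
        bool cid_in_use[QUEUE_DEPTH];
        bool cid_aborting[QUEUE_DEPTH];
    };

    /* Allocate a CID, skipping any CID that is still the target of an
     * outstanding Abort; returns -1 if none is available. */
    static int alloc_cid(struct sq_state *sq)
    {
        for (int cid = 0; cid < QUEUE_DEPTH; cid++) {
            if (!sq->cid_in_use[cid] && !sq->cid_aborting[cid]) {
                sq->cid_in_use[cid] = true;
                return cid;
            }
        }
        return -1;
    }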

Patch

diff --git a/hw/nvme/ctrl.c b/hw/nvme/ctrl.c
index 127c3d2383..a38037b5f8 100644
--- a/hw/nvme/ctrl.c
+++ b/hw/nvme/ctrl.c
@@ -1759,6 +1759,10 @@  static void nvme_aio_err(NvmeRequest *req, int ret)
         break;
     }
 
+    if (ret == -ECANCELED) {
+        status = NVME_CMD_ABORT_REQ;
+    }
+
     trace_pci_nvme_err_aio(nvme_cid(req), strerror(-ret), status);
 
     error_setg_errno(&local_err, -ret, "aio failed");
@@ -5759,12 +5763,40 @@  static uint16_t nvme_identify(NvmeCtrl *n, NvmeRequest *req)
 static uint16_t nvme_abort(NvmeCtrl *n, NvmeRequest *req)
 {
     uint16_t sqid = le32_to_cpu(req->cmd.cdw10) & 0xffff;
+    uint16_t cid  = (le32_to_cpu(req->cmd.cdw10) >> 16) & 0xffff;
+    NvmeSQueue *sq = n->sq[sqid];
+    NvmeRequest *r, *next;
+    int i;
 
     req->cqe.result = 1;
     if (nvme_check_sqid(n, sqid)) {
         return NVME_INVALID_FIELD | NVME_DNR;
     }
 
+    if (sqid == 0) {
+        for (i = 0; i < n->outstanding_aers; i++) {
+            NvmeRequest *re = n->aer_reqs[i];
+            if (re->cqe.cid == cid) {
+                memmove(n->aer_reqs + i, n->aer_reqs + i + 1,
+                         (n->outstanding_aers - i - 1) * sizeof(NvmeRequest *));
+                n->outstanding_aers--;
+                re->status = NVME_CMD_ABORT_REQ;
+                req->cqe.result = 0;
+                nvme_enqueue_req_completion(&n->admin_cq, re);
+                return NVME_SUCCESS;
+            }
+        }
+    }
+
+    QTAILQ_FOREACH_SAFE(r, &sq->out_req_list, entry, next) {
+        if (r->cqe.cid == cid) {
+            if (r->aiocb) {
+                blk_aio_cancel_async(r->aiocb);
+            }
+            break;
+        }
+    }
+
     return NVME_SUCCESS;
 }
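
For exercising the path above, the host-side encoding of the Abort command's
CDW10 mirrors the decoding in nvme_abort(): SQID in bits 15:0 and CID in bits
31:16, per the NVMe base specification. A minimal illustrative helper (the
name is ours, not from QEMU or the spec):

    #include <stdint.h>

    /* Build CDW10 for an Abort command: the CID to abort in the upper 16
     * bits, the SQID it was submitted on in the lower 16 bits. */
    static uint32_t nvme_abort_cdw10(uint16_t sqid, uint16_t cid)
    {
        return ((uint32_t)cid << 16) | sqid;
    }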