Message ID | 20200822020442.2677358-2-luke.w.hsiao@gmail.com |
---|---|
State | Superseded |
Delegated to: | David Miller |
Headers | show |
Series | [net-next,v2,1/2] io_uring: allow tcp ancillary data for __sys_recvmsg_sock() | expand |
On 8/21/20 8:04 PM, Luke Hsiao wrote: > From: Luke Hsiao <lukehsiao@google.com> > > Currently, io_uring's recvmsg subscribes to both POLLERR and POLLIN. In > the context of TCP tx zero-copy, this is inefficient since we are only > reading the error queue and not using recvmsg to read POLLIN responses. > > This patch was tested by using a simple sending program to call recvmsg > using io_uring with MSG_ERRQUEUE set and verifying with printks that the > POLLIN is correctly unset when the msg flags are MSG_ERRQUEUE. Sorry, one more minor thing to fix up: > @@ -4932,6 +4934,11 @@ static bool io_arm_poll_handler(struct io_kiocb *req) > mask |= POLLIN | POLLRDNORM; > if (def->pollout) > mask |= POLLOUT | POLLWRNORM; > + > + /* If reading from MSG_ERRQUEUE using recvmsg, ignore POLLIN */ > + if (req->opcode == IORING_OP_RECVMSG && (sqe->msg_flags & MSG_ERRQUEUE)) > + mask &= ~POLLIN; > + Don't pass in the sqe here, but use req->sr_msg.msg_flags for this check. This is actually really important, as you don't want to re-read anything from the sqe. I'm actually surprised this one got past Jann :-) > @@ -6146,7 +6153,7 @@ static void __io_queue_sqe(struct io_kiocb *req, const struct io_uring_sqe *sqe, > * doesn't support non-blocking read/write attempts > */ > if (ret == -EAGAIN && !(req->flags & REQ_F_NOWAIT)) { > - if (!io_arm_poll_handler(req)) { > + if (!io_arm_poll_handler(req, sqe)) { Also means you can drop this part.
Hi Jens, On Fri, Aug 21, 2020 at 7:09 PM Jens Axboe <axboe@kernel.dk> wrote: > > On 8/21/20 8:04 PM, Luke Hsiao wrote: > > > Sorry, one more minor thing to fix up: > > > @@ -4932,6 +4934,11 @@ static bool io_arm_poll_handler(struct io_kiocb *req) > > mask |= POLLIN | POLLRDNORM; > > if (def->pollout) > > mask |= POLLOUT | POLLWRNORM; > > + > > + /* If reading from MSG_ERRQUEUE using recvmsg, ignore POLLIN */ > > + if (req->opcode == IORING_OP_RECVMSG && (sqe->msg_flags & MSG_ERRQUEUE)) > > + mask &= ~POLLIN; > > + > > Don't pass in the sqe here, but use req->sr_msg.msg_flags for this check. This > is actually really important, as you don't want to re-read anything from the > sqe. > > I'm actually surprised this one got past Jann :-) Got it, I will make the change and send v3. In Jann's defense, he reviewed the previous commit, but not this one :). Thanks for your detailed feedback. -- Luke
On 8/21/20 8:13 PM, Luke Hsiao wrote: > Hi Jens, > > On Fri, Aug 21, 2020 at 7:09 PM Jens Axboe <axboe@kernel.dk> wrote: >> >> On 8/21/20 8:04 PM, Luke Hsiao wrote: >>> >> Sorry, one more minor thing to fix up: >> >>> @@ -4932,6 +4934,11 @@ static bool io_arm_poll_handler(struct io_kiocb *req) >>> mask |= POLLIN | POLLRDNORM; >>> if (def->pollout) >>> mask |= POLLOUT | POLLWRNORM; >>> + >>> + /* If reading from MSG_ERRQUEUE using recvmsg, ignore POLLIN */ >>> + if (req->opcode == IORING_OP_RECVMSG && (sqe->msg_flags & MSG_ERRQUEUE)) >>> + mask &= ~POLLIN; >>> + >> >> Don't pass in the sqe here, but use req->sr_msg.msg_flags for this check. This >> is actually really important, as you don't want to re-read anything from the >> sqe. >> >> I'm actually surprised this one got past Jann :-) > > Got it, I will make the change and send v3. In Jann's defense, he > reviewed the previous commit, but not this one :). Thanks for your > detailed feedback. Ah right you are, I guess it was the previous patch that had his review! Thanks for taking care of this.
diff --git a/fs/io_uring.c b/fs/io_uring.c index dc506b75659c..fd5353e31a2c 100644 --- a/fs/io_uring.c +++ b/fs/io_uring.c @@ -79,6 +79,7 @@ #include <linux/splice.h> #include <linux/task_work.h> #include <linux/pagemap.h> +#include <linux/socket.h> #define CREATE_TRACE_POINTS #include <trace/events/io_uring.h> @@ -4902,7 +4903,8 @@ static __poll_t __io_arm_poll_handler(struct io_kiocb *req, return mask; } -static bool io_arm_poll_handler(struct io_kiocb *req) +static bool io_arm_poll_handler(struct io_kiocb *req, + const struct io_uring_sqe *sqe) { const struct io_op_def *def = &io_op_defs[req->opcode]; struct io_ring_ctx *ctx = req->ctx; @@ -4932,6 +4934,11 @@ static bool io_arm_poll_handler(struct io_kiocb *req) mask |= POLLIN | POLLRDNORM; if (def->pollout) mask |= POLLOUT | POLLWRNORM; + + /* If reading from MSG_ERRQUEUE using recvmsg, ignore POLLIN */ + if (req->opcode == IORING_OP_RECVMSG && (sqe->msg_flags & MSG_ERRQUEUE)) + mask &= ~POLLIN; + mask |= POLLERR | POLLPRI; ipt.pt._qproc = io_async_queue_proc; @@ -6146,7 +6153,7 @@ static void __io_queue_sqe(struct io_kiocb *req, const struct io_uring_sqe *sqe, * doesn't support non-blocking read/write attempts */ if (ret == -EAGAIN && !(req->flags & REQ_F_NOWAIT)) { - if (!io_arm_poll_handler(req)) { + if (!io_arm_poll_handler(req, sqe)) { punt: ret = io_prep_work_files(req); if (unlikely(ret))