diff mbox series

[net-next,2/2] io_uring: ignore POLLIN for recvmsg on MSG_ERRQUEUE

Message ID 20200820234954.1784522-3-luke.w.hsiao@gmail.com
State Changes Requested
Delegated to: David Miller
Headers show
Series Support reading msg errq from io_uring | expand

Commit Message

Luke Hsiao Aug. 20, 2020, 11:49 p.m. UTC
From: Luke Hsiao <lukehsiao@google.com>

Currently, io_uring's recvmsg subscribes to both POLLERR and POLLIN. In
the context of TCP tx zero-copy, this is inefficient since we are only
reading the error queue and not using recvmsg to read POLLIN responses.

This patch was tested by using a simple sending program to call recvmsg
using io_uring with MSG_ERRQUEUE set and verifying with printks that the
POLLIN is correctly unset when the msg flags are MSG_ERRQUEUE.

Signed-off-by: Arjun Roy <arjunroy@google.com>
Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Luke Hsiao <lukehsiao@google.com>
---
 fs/io_uring.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

Comments

Jakub Kicinski Aug. 21, 2020, 8:41 p.m. UTC | #1
On Thu, 20 Aug 2020 16:49:54 -0700 Luke Hsiao wrote:
> +	/* If reading from MSG_ERRQUEUE using recvmsg, ignore POLLIN */
> +	if (req->opcode == IORING_OP_RECVMSG && (sqe->msg_flags & MSG_ERRQUEUE))
> +		mask &= ~(POLLIN);

FWIW this adds another W=1 C=1 warnings to this code:

fs/io_uring.c:4940:22: warning: invalid assignment: &=
fs/io_uring.c:4940:22:    left side has type restricted __poll_t
fs/io_uring.c:4940:22:    right side has type int


And obviously the brackets around POLLIN are not necessary.
Jens Axboe Aug. 21, 2020, 9:11 p.m. UTC | #2
On 8/21/20 2:41 PM, Jakub Kicinski wrote:
> On Thu, 20 Aug 2020 16:49:54 -0700 Luke Hsiao wrote:
>> +	/* If reading from MSG_ERRQUEUE using recvmsg, ignore POLLIN */
>> +	if (req->opcode == IORING_OP_RECVMSG && (sqe->msg_flags & MSG_ERRQUEUE))
>> +		mask &= ~(POLLIN);
> 
> FWIW this adds another W=1 C=1 warnings to this code:
> 
> fs/io_uring.c:4940:22: warning: invalid assignment: &=
> fs/io_uring.c:4940:22:    left side has type restricted __poll_t
> fs/io_uring.c:4940:22:    right side has type int

Well, 8 or 9 of them don't really matter... This is something that should
be cleaned up separately at some point.

> And obviously the brackets around POLLIN are not necessary.

Agree, would be cleaner without!

Luke, with that:

Reviewed-by: Jens Axboe <axboe@kernel.dk>
Luke Hsiao Aug. 22, 2020, 12:08 a.m. UTC | #3
Hi Jakub and Jens,

Thank you for both of your reviews. Some responses inline below.

On Fri, Aug 21, 2020 at 2:11 PM Jens Axboe <axboe@kernel.dk> wrote:
>
> On 8/21/20 2:41 PM, Jakub Kicinski wrote:
> > On Thu, 20 Aug 2020 16:49:54 -0700 Luke Hsiao wrote:
> >> +    /* If reading from MSG_ERRQUEUE using recvmsg, ignore POLLIN */
> >> +    if (req->opcode == IORING_OP_RECVMSG && (sqe->msg_flags & MSG_ERRQUEUE))
> >> +            mask &= ~(POLLIN);
> >
> > FWIW this adds another W=1 C=1 warnings to this code:
> >
> > fs/io_uring.c:4940:22: warning: invalid assignment: &=
> > fs/io_uring.c:4940:22:    left side has type restricted __poll_t
> > fs/io_uring.c:4940:22:    right side has type int
>
> Well, 8 or 9 of them don't really matter... This is something that should
> be cleaned up separately at some point.

In the spirit of not adding a warning, even if it doesn't really
matter, I'd like to fix this. But, I'm struggling to reproduce these
warnings using upstream net-next and make for x86_64. With my patches
on top of net-next/master, I'm running

$ make defconfig
$ make -j`nproc` W=1

I don't see the warning you mentioned in the logs. Could you tell me
how I can repro this warning?

> > And obviously the brackets around POLLIN are not necessary.
>
> Agree, would be cleaner without!

Thanks, I'll also remove these unnecessary parens in v2 of the patch series.

--
Luke
Jens Axboe Aug. 22, 2020, 1:17 a.m. UTC | #4
On 8/21/20 6:08 PM, Luke Hsiao wrote:
> Hi Jakub and Jens,
> 
> Thank you for both of your reviews. Some responses inline below.
> 
> On Fri, Aug 21, 2020 at 2:11 PM Jens Axboe <axboe@kernel.dk> wrote:
>>
>> On 8/21/20 2:41 PM, Jakub Kicinski wrote:
>>> On Thu, 20 Aug 2020 16:49:54 -0700 Luke Hsiao wrote:
>>>> +    /* If reading from MSG_ERRQUEUE using recvmsg, ignore POLLIN */
>>>> +    if (req->opcode == IORING_OP_RECVMSG && (sqe->msg_flags & MSG_ERRQUEUE))
>>>> +            mask &= ~(POLLIN);
>>>
>>> FWIW this adds another W=1 C=1 warnings to this code:
>>>
>>> fs/io_uring.c:4940:22: warning: invalid assignment: &=
>>> fs/io_uring.c:4940:22:    left side has type restricted __poll_t
>>> fs/io_uring.c:4940:22:    right side has type int
>>
>> Well, 8 or 9 of them don't really matter... This is something that should
>> be cleaned up separately at some point.
> 
> In the spirit of not adding a warning, even if it doesn't really
> matter, I'd like to fix this. But, I'm struggling to reproduce these
> warnings using upstream net-next and make for x86_64. With my patches
> on top of net-next/master, I'm running
> 
> $ make defconfig
> $ make -j`nproc` W=1
> 
> I don't see the warning you mentioned in the logs. Could you tell me
> how I can repro this warning?

You should see them with C=1. But as I said, don't worry about it, as that's
a class of warnings and several exist already. It needs to get cleaned up
separately.

>>> And obviously the brackets around POLLIN are not necessary.
>>
>> Agree, would be cleaner without!
> 
> Thanks, I'll also remove these unnecessary parens in v2 of the patch series.

Please do, thanks!
diff mbox series

Patch

diff --git a/fs/io_uring.c b/fs/io_uring.c
index dc506b75659c..664ce8739615 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -79,6 +79,7 @@ 
 #include <linux/splice.h>
 #include <linux/task_work.h>
 #include <linux/pagemap.h>
+#include <linux/socket.h>
 
 #define CREATE_TRACE_POINTS
 #include <trace/events/io_uring.h>
@@ -4902,7 +4903,8 @@  static __poll_t __io_arm_poll_handler(struct io_kiocb *req,
 	return mask;
 }
 
-static bool io_arm_poll_handler(struct io_kiocb *req)
+static bool io_arm_poll_handler(struct io_kiocb *req,
+				const struct io_uring_sqe *sqe)
 {
 	const struct io_op_def *def = &io_op_defs[req->opcode];
 	struct io_ring_ctx *ctx = req->ctx;
@@ -4932,6 +4934,11 @@  static bool io_arm_poll_handler(struct io_kiocb *req)
 		mask |= POLLIN | POLLRDNORM;
 	if (def->pollout)
 		mask |= POLLOUT | POLLWRNORM;
+
+	/* If reading from MSG_ERRQUEUE using recvmsg, ignore POLLIN */
+	if (req->opcode == IORING_OP_RECVMSG && (sqe->msg_flags & MSG_ERRQUEUE))
+		mask &= ~(POLLIN);
+
 	mask |= POLLERR | POLLPRI;
 
 	ipt.pt._qproc = io_async_queue_proc;
@@ -6146,7 +6153,7 @@  static void __io_queue_sqe(struct io_kiocb *req, const struct io_uring_sqe *sqe,
 	 * doesn't support non-blocking read/write attempts
 	 */
 	if (ret == -EAGAIN && !(req->flags & REQ_F_NOWAIT)) {
-		if (!io_arm_poll_handler(req)) {
+		if (!io_arm_poll_handler(req, sqe)) {
 punt:
 			ret = io_prep_work_files(req);
 			if (unlikely(ret))