um: read multiple msg from virtio slave request fd

Message ID 20220601153722.181427-1-benjamin.beichler@uni-rostock.de
State Superseded
Series um: read multiple msg from virtio slave request fd

Commit Message

Benjamin Beichler June 1, 2022, 3:37 p.m. UTC
If VHOST_USER_PROTOCOL_F_INBAND_NOTIFICATIONS is activated, the user mode
Linux virtio irq handler reads only one message from the corresponding
socket per interrupt. This creates issues when the device emulation sends
multiple call requests (e.g. for multiple virtqueues), as the socket
buffer tends to fill up and the call requests are delayed.

This can lead to a deadlock when the device simulation blocks while
sending a message and the kernel side blocks while synchronously waiting
for the acknowledgement of a kick request.

Inband notifications are primarily meant to be used in combination with
the time travel protocol, but that is not a requirement, so this corner
case needs to be handled.

Anyway, in general it seems more natural to always consume all pending
messages from the socket instead of only a single one.

Fixes: 2cd097ba8c05 ("um: virtio: Implement VHOST_USER_PROTOCOL_F_SLAVE_REQ")
Signed-off-by: Benjamin Beichler <benjamin.beichler@uni-rostock.de>
---
 arch/um/drivers/virtio_uml.c | 72 ++++++++++++++++++------------------
 1 file changed, 37 insertions(+), 35 deletions(-)
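
To restate the idea of the change in generic userspace C for illustration
(a hypothetical sketch, not the kernel implementation or the vhost-user
protocol code): on each wakeup, keep reading until the socket is drained
instead of handling just one request. The sketch assumes a message-oriented
socket (e.g. SOCK_SEQPACKET) so that each recv() returns one complete
message; in the kernel driver the framing is instead handled by
vhost_user_recv_req(), and the function name drain_slave_requests() is
made up for this example.

#include <errno.h>
#include <sys/socket.h>
#include <sys/types.h>

/*
 * Hypothetical sketch: drain all pending slave requests on one wakeup.
 * Returns the number of messages handled; 0 means nothing was pending.
 */
static int drain_slave_requests(int fd)
{
	char msg[4096];
	int handled = 0;

	for (;;) {
		ssize_t n = recv(fd, msg, sizeof(msg), MSG_DONTWAIT);

		if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
			break;	/* drained: nothing more pending right now */
		if (n <= 0)
			break;	/* real error or peer closed the connection */

		/* ... dispatch the request, send a reply if requested ... */
		handled++;
	}

	return handled;
}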

Comments

Johannes Berg June 1, 2022, 5:13 p.m. UTC | #1
On Wed, 2022-06-01 at 15:37 +0000, Benjamin Beichler wrote:
> If VHOST_USER_PROTOCOL_F_INBAND_NOTIFICATIONS is activated, the user mode
> Linux virtio irq handler reads only one message from the corresponding
> socket per interrupt. This creates issues when the device emulation sends
> multiple call requests (e.g. for multiple virtqueues), as the socket
> buffer tends to fill up and the call requests are delayed.
> 
> This can lead to a deadlock when the device simulation blocks while
> sending a message and the kernel side blocks while synchronously waiting
> for the acknowledgement of a kick request.
> 
> Inband notifications are primarily meant to be used in combination with
> the time travel protocol, but that is not a requirement, so this corner
> case needs to be handled.

Hmm. How did you run into this? Why would a device send many messages
and not wait for ACK, but the kernel side actually waits for ACK? What
would the use case for that be? Seems a bit odd, if both wait for ACK
there shouldn't be an issue?

Anyway, I guess I don't mind fixing this regardless of whether I see a
use case where it could happen :-)


> +++ b/arch/um/drivers/virtio_uml.c
> @@ -363,45 +363,47 @@ static irqreturn_t vu_req_read_message(struct virtio_uml_device *vu_dev,
>  		struct vhost_user_msg msg;
>  		u8 extra_payload[512];
>  	} msg;
> -	int rc;
> -
> -	rc = vhost_user_recv_req(vu_dev, &msg.msg,
> -				 sizeof(msg.msg.payload) +
> -				 sizeof(msg.extra_payload));
> -
> -	if (rc)

This code changed a bit, you should rebase onto the uml tree's for-next
branch.

> +	while (1) {
> +		if (vhost_user_recv_req(vu_dev, &msg.msg,
> +					sizeof(msg.msg.payload)
> +					+ sizeof(msg.extra_payload)))

prefer to keep the + on the previous line.


That said, my attempt at rebasing this made it all fail completely,
maybe you have better luck :)

johannes
Benjamin Beichler June 2, 2022, 8:32 a.m. UTC | #2
On 01.06.2022 19:13, Johannes Berg wrote:
> On Wed, 2022-06-01 at 15:37 +0000, Benjamin Beichler wrote:
>
> Hmm. How did you run into this? Why would a device send many messages
> and not wait for ACK, but the kernel side actually waits for ACK? What
> would the use case for that be? Seems a bit odd, if both wait for ACK
> there shouldn't be an issue?
>
> Anyway, I guess I don't mind fixing this regardless of whether I see a
> use case where it could happen :-)

Here is my (admittedly somewhat odd) use case:

I want to use hwsim over virtio with UML, but without time travel (as a
precursor for a later version with TT).

I modified wmediumd to strip out the scheduler dependency and wrote a
very simple simulation which simply forwards all frames to all radios.
Furthermore, I use the usfstl "loop" as the main driver to poll all fds
without time travel. This leads to a deadlock when a message is put on
the RX ring of a UML instance that has concurrently sent a kick (e.g.
because it is also trying to send a frame). In the original wmediumd this
was handled by a kind of hack: the loop implementation was called to
answer the kick before sending out a call message. I had to rip out this
workaround because, without the usfstl scheduler, it created a deep
recursion of the loop implementation with additional problems.
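
To make the failure mode concrete, here is a toy, self-contained
illustration (hypothetical code, not wmediumd or UML) of the underlying
mechanism: both peers keep sending on a socketpair and neither ever
reads, so both socket buffers fill up and both ends block in send()
forever.

#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

int main(void)
{
	int sv[2];
	char buf[4096];

	memset(buf, 0, sizeof(buf));
	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv))
		return 1;

	if (fork() == 0) {
		/* "device" side: keeps emitting call requests, never reads */
		for (;;)
			send(sv[1], buf, sizeof(buf), 0);
	}

	/* "kernel" side: keeps emitting kicks, never reads either */
	for (;;)
		send(sv[0], buf, sizeof(buf), 0);
}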

Nonetheless, even if this is somewhat of an optimization: it is feasible
to wait for the ACK asynchronously, as long as it arrives at the same
point in simulation time (or, as you called it, the calendar). For many
UML instances, which could easily run in parallel, this allows an easier
implementation (at least in my planning :-) ). Of course, it would be
hard to distinguish which call request was acked, but in the end wmediumd
(and I plan to do the same) simply aborts when the ack is negative, so
knowing the exact corresponding call is not that important.

>
> This code changed a bit, you should rebase onto the uml tree's for-next
> branch.
My bad, I was not expecting someone to change something in that corner
of the kernel; I only used the latest master and not the -next tree. I
will gladly redo the patch.
>
>> +	while (1) {
>> +		if (vhost_user_recv_req(vu_dev, &msg.msg,
>> +					sizeof(msg.msg.payload)
>> +					+ sizeof(msg.extra_payload)))
> prefer to keep the + on the previous line.
>
It slightly exceeds the 80-column limit, but I would also prefer not
to break the line. :-D
> That said, my attempt at rebasing this made it all fail completely,
> maybe you have better luck :)
>
> johannes
>
kind regards

Benjamin
Benjamin Beichler June 7, 2022, 11:36 a.m. UTC | #3
Sorry for the cluttered HTML email; I had hoped I'd trained my
Thunderbird to do the right thing, but it didn't.

On 07.06.2022 13:32, Benjamin Beichler wrote:
> On 02.06.2022 10:32, Benjamin Beichler wrote:
>> On 01.06.2022 19:13, Johannes Berg wrote:
>>>
>>> prefer to keep the + on the previous line.
>>>
>> It slightly exceeds the 80-column limit, but I would also prefer not
>> to break the line. :-D
>
> Now I got your hint, and did it this way in v2.
>
>>> That said, my attempt at rebasing this made it all fail completely,
>>> maybe you have better luck :)
> Actually, because of your patch I needed to change my logic a bit
> (in particular, EAGAIN now needs to be handled properly), which slowed
> me down a bit, but now it does the right thing.

Patch

diff --git a/arch/um/drivers/virtio_uml.c b/arch/um/drivers/virtio_uml.c
index ba562d68dc04..0c171dd11414 100644
--- a/arch/um/drivers/virtio_uml.c
+++ b/arch/um/drivers/virtio_uml.c
@@ -363,45 +363,47 @@  static irqreturn_t vu_req_read_message(struct virtio_uml_device *vu_dev,
 		struct vhost_user_msg msg;
 		u8 extra_payload[512];
 	} msg;
-	int rc;
-
-	rc = vhost_user_recv_req(vu_dev, &msg.msg,
-				 sizeof(msg.msg.payload) +
-				 sizeof(msg.extra_payload));
-
-	if (rc)
-		return IRQ_NONE;
-
-	switch (msg.msg.header.request) {
-	case VHOST_USER_SLAVE_CONFIG_CHANGE_MSG:
-		vu_dev->config_changed_irq = true;
-		response = 0;
-		break;
-	case VHOST_USER_SLAVE_VRING_CALL:
-		virtio_device_for_each_vq((&vu_dev->vdev), vq) {
-			if (vq->index == msg.msg.payload.vring_state.index) {
-				response = 0;
-				vu_dev->vq_irq_vq_map |= BIT_ULL(vq->index);
-				break;
+	irqreturn_t rc = IRQ_NONE;
+
+	while (1) {
+		if (vhost_user_recv_req(vu_dev, &msg.msg,
+					sizeof(msg.msg.payload)
+					+ sizeof(msg.extra_payload)))
+			break;
+
+		switch (msg.msg.header.request) {
+		case VHOST_USER_SLAVE_CONFIG_CHANGE_MSG:
+			vu_dev->config_changed_irq = true;
+			response = 0;
+			break;
+		case VHOST_USER_SLAVE_VRING_CALL:
+			virtio_device_for_each_vq((&vu_dev->vdev), vq) {
+				if (vq->index ==
+				    msg.msg.payload.vring_state.index) {
+					response = 0;
+					vu_dev->vq_irq_vq_map |=
+						BIT_ULL(vq->index);
+					break;
+				}
 			}
+			break;
+		case VHOST_USER_SLAVE_IOTLB_MSG:
+			/* not supported - VIRTIO_F_ACCESS_PLATFORM */
+		case VHOST_USER_SLAVE_VRING_HOST_NOTIFIER_MSG:
+			/* not supported - VHOST_USER_PROTOCOL_F_HOST_NOTIFIER */
+		default:
+			vu_err(vu_dev, "unexpected slave request %d\n",
+			       msg.msg.header.request);
 		}
-		break;
-	case VHOST_USER_SLAVE_IOTLB_MSG:
-		/* not supported - VIRTIO_F_ACCESS_PLATFORM */
-	case VHOST_USER_SLAVE_VRING_HOST_NOTIFIER_MSG:
-		/* not supported - VHOST_USER_PROTOCOL_F_HOST_NOTIFIER */
-	default:
-		vu_err(vu_dev, "unexpected slave request %d\n",
-		       msg.msg.header.request);
-	}
-
-	if (ev && !vu_dev->suspended)
-		time_travel_add_irq_event(ev);
 
-	if (msg.msg.header.flags & VHOST_USER_FLAG_NEED_REPLY)
-		vhost_user_reply(vu_dev, &msg.msg, response);
+		if (ev && !vu_dev->suspended)
+			time_travel_add_irq_event(ev);
 
-	return IRQ_HANDLED;
+		if (msg.msg.header.flags & VHOST_USER_FLAG_NEED_REPLY)
+			vhost_user_reply(vu_dev, &msg.msg, response);
+		rc = IRQ_HANDLED;
+	}
+	return rc;
 }
 
 static irqreturn_t vu_req_interrupt(int irq, void *data)