diff mbox

[V2] vhost: fix check for # of outstanding buffers

Message ID 1311182592.8573.45.camel@localhost.localdomain
State Changes Requested, archived
Delegated to: David Miller
Headers show

Commit Message

Shirley Ma July 20, 2011, 5:23 p.m. UTC
Fix the check for number of outstanding buffers returns incorrect
results due to vq->pend_idx wrap around;

Signed-off-by: Shirley Ma <xma@us.ibm.com>
---

 drivers/vhost/net.c |   12 +++++++++---
 1 files changed, 9 insertions(+), 3 deletions(-)



--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Michael S. Tsirkin July 21, 2011, 8:06 a.m. UTC | #1
On Wed, Jul 20, 2011 at 10:23:12AM -0700, Shirley Ma wrote:
> Fix the check for number of outstanding buffers returns incorrect
> results due to vq->pend_idx wrap around;
> 
> Signed-off-by: Shirley Ma <xma@us.ibm.com>

OK, the logic's right now, and it's not worse
than what we had, so I applied this after
fixing up the comment (it's upend_idx and English
sentences don't need to end with a semicolumn ;)

However, I would like to see the effect of the bug
noted in the log in the future.

And the reason I mention this here, is that
I think that the whole VHOST_MAX_PEND thing
does not work as advertised: this logic only
triggers when the ring is empty, so we will happily push
more than VHOST_MAX_PEND packets if the guest manages
to give them to us.

I'm not sure why we have the limit, either: the wmem
limit in the socket still applies and seems more
effective to prevent denial of service by a malicious guest.

> ---
> 
>  drivers/vhost/net.c |   12 +++++++++---
>  1 files changed, 9 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
> index 70ac604..946a71e 100644
> --- a/drivers/vhost/net.c
> +++ b/drivers/vhost/net.c
> @@ -182,15 +182,21 @@ static void handle_tx(struct vhost_net *net)
>  			break;
>  		/* Nothing new?  Wait for eventfd to tell us they refilled. */
>  		if (head == vq->num) {
> +			int num_pends;
> +
>  			wmem = atomic_read(&sock->sk->sk_wmem_alloc);
>  			if (wmem >= sock->sk->sk_sndbuf * 3 / 4) {
>  				tx_poll_start(net, sock);
>  				set_bit(SOCK_ASYNC_NOSPACE, &sock->flags);
>  				break;
>  			}
> -			/* If more outstanding DMAs, queue the work */
> -			if (unlikely(vq->upend_idx - vq->done_idx >
> -				     VHOST_MAX_PEND)) {
> +			/* If more outstanding DMAs, queue the work
> +			 * handle upend_idx wrap around
> +			 */
> +			num_pends = (vq->upend_idx >= vq->done_idx) ?
> +				    (vq->upend_idx - vq->done_idx) :
> +				    (vq->upend_idx + UIO_MAXIOV - vq->done_idx);
> +			if (unlikely(num_pends > VHOST_MAX_PEND)) {
>  				tx_poll_start(net, sock);
>  				set_bit(SOCK_ASYNC_NOSPACE, &sock->flags);
>  				break;
> 
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Shirley Ma July 21, 2011, 4:12 p.m. UTC | #2
On Thu, 2011-07-21 at 11:06 +0300, Michael S. Tsirkin wrote:
> On Wed, Jul 20, 2011 at 10:23:12AM -0700, Shirley Ma wrote:
> > Fix the check for number of outstanding buffers returns incorrect
> > results due to vq->pend_idx wrap around;
> > 
> > Signed-off-by: Shirley Ma <xma@us.ibm.com>
> 
> OK, the logic's right now, and it's not worse
> than what we had, so I applied this after
> fixing up the comment (it's upend_idx and English
> sentences don't need to end with a semicolumn ;)
> 
> However, I would like to see the effect of the bug
> noted in the log in the future.
> 
> And the reason I mention this here, is that
> I think that the whole VHOST_MAX_PEND thing
> does not work as advertised: this logic only
> triggers when the ring is empty, so we will happily push
> more than VHOST_MAX_PEND packets if the guest manages
> to give them to us.
> 
> I'm not sure why we have the limit, either: the wmem
> limit in the socket still applies and seems more
> effective to prevent denial of service by a malicious guest.

Vhost can push more than VHOST_MAX_PEND if the guest manages to give
more. That's managed by wmem limit.

MAX_PEND is max of outstanding used buffers which lower level device
can't DMAed on time. socket destructor remains unchanged, so it can't
managed by wmem.

Since vhost handle_tx always calls vhost_zerocopy_singal_used() so this
condition is unlikely hit unless the lower device can't DMAed TX
MAX_PEND packets.

Thanks
Shirley

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index 70ac604..946a71e 100644
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -182,15 +182,21 @@  static void handle_tx(struct vhost_net *net)
 			break;
 		/* Nothing new?  Wait for eventfd to tell us they refilled. */
 		if (head == vq->num) {
+			int num_pends;
+
 			wmem = atomic_read(&sock->sk->sk_wmem_alloc);
 			if (wmem >= sock->sk->sk_sndbuf * 3 / 4) {
 				tx_poll_start(net, sock);
 				set_bit(SOCK_ASYNC_NOSPACE, &sock->flags);
 				break;
 			}
-			/* If more outstanding DMAs, queue the work */
-			if (unlikely(vq->upend_idx - vq->done_idx >
-				     VHOST_MAX_PEND)) {
+			/* If more outstanding DMAs, queue the work
+			 * handle upend_idx wrap around
+			 */
+			num_pends = (vq->upend_idx >= vq->done_idx) ?
+				    (vq->upend_idx - vq->done_idx) :
+				    (vq->upend_idx + UIO_MAXIOV - vq->done_idx);
+			if (unlikely(num_pends > VHOST_MAX_PEND)) {
 				tx_poll_start(net, sock);
 				set_bit(SOCK_ASYNC_NOSPACE, &sock->flags);
 				break;