From patchwork Thu Mar 21 14:49:56 2013
X-Patchwork-Submitter: Stefan Hajnoczi
X-Patchwork-Id: 229724
From: Stefan Hajnoczi
Date: Thu, 21 Mar 2013 15:49:56 +0100
Message-Id: <1363877399-16339-2-git-send-email-stefanha@redhat.com>
In-Reply-To: <1363877399-16339-1-git-send-email-stefanha@redhat.com>
References: <1363877399-16339-1-git-send-email-stefanha@redhat.com>
Cc: Kevin Wolf, Benoît Canet, Stefan Hajnoczi, Zhi Yong Wu
Subject: [Qemu-devel] [RFC 1/4] block: fix I/O throttling accounting blind spot

I/O throttling relies on bdrv_acct_done() which is called when a request
completes.  This leaves a blind spot since we only charge for completed
requests, not submitted requests.

For example, if there is 1 operation remaining in this time slice the
guest could submit 3 operations and they will all be submitted
successfully since they don't actually get accounted for until they
complete.

Originally we probably thought this is okay since the requests will be
accounted when the time slice is extended.  In practice it causes
fluctuations since the guest can exceed its I/O limit and it will be
punished for this later on.

Account for I/O upon submission so that I/O limits are enforced
properly.

Signed-off-by: Stefan Hajnoczi
---
 block.c                   | 19 ++++++++-----------
 include/block/block_int.h |  2 +-
 2 files changed, 9 insertions(+), 12 deletions(-)

diff --git a/block.c b/block.c
index 0a062c9..31fb0b0 100644
--- a/block.c
+++ b/block.c
@@ -141,7 +141,6 @@ void bdrv_io_limits_disable(BlockDriverState *bs)
     bs->slice_start = 0;
     bs->slice_end = 0;
     bs->slice_time = 0;
-    memset(&bs->io_base, 0, sizeof(bs->io_base));
 }
 
 static void bdrv_block_timer(void *opaque)
@@ -1329,8 +1328,8 @@ static void bdrv_move_feature_fields(BlockDriverState *bs_dest,
     bs_dest->slice_time = bs_src->slice_time;
     bs_dest->slice_start = bs_src->slice_start;
     bs_dest->slice_end = bs_src->slice_end;
+    bs_dest->slice_submitted = bs_src->slice_submitted;
     bs_dest->io_limits = bs_src->io_limits;
-    bs_dest->io_base = bs_src->io_base;
     bs_dest->throttled_reqs = bs_src->throttled_reqs;
     bs_dest->block_timer = bs_src->block_timer;
     bs_dest->io_limits_enabled = bs_src->io_limits_enabled;
@@ -3665,9 +3664,9 @@ static bool bdrv_exceed_bps_limits(BlockDriverState *bs, int nb_sectors,
     slice_time = bs->slice_end - bs->slice_start;
     slice_time /= (NANOSECONDS_PER_SECOND);
     bytes_limit = bps_limit * slice_time;
-    bytes_base = bs->nr_bytes[is_write] - bs->io_base.bytes[is_write];
+    bytes_base = bs->slice_submitted.bytes[is_write];
     if (bs->io_limits.bps[BLOCK_IO_LIMIT_TOTAL]) {
-        bytes_base += bs->nr_bytes[!is_write] - bs->io_base.bytes[!is_write];
+        bytes_base += bs->slice_submitted.bytes[!is_write];
     }
 
     /* bytes_base: the bytes of data which have been read/written; and
@@ -3683,6 +3682,7 @@ static bool bdrv_exceed_bps_limits(BlockDriverState *bs, int nb_sectors,
             *wait = 0;
         }
 
+        bs->slice_submitted.bytes[is_write] += bytes_res;
         return false;
     }
 
@@ -3725,9 +3725,9 @@ static bool bdrv_exceed_iops_limits(BlockDriverState *bs, bool is_write,
     slice_time = bs->slice_end - bs->slice_start;
     slice_time /= (NANOSECONDS_PER_SECOND);
     ios_limit = iops_limit * slice_time;
-    ios_base = bs->nr_ops[is_write] - bs->io_base.ios[is_write];
+    ios_base = bs->slice_submitted.ios[is_write];
     if (bs->io_limits.iops[BLOCK_IO_LIMIT_TOTAL]) {
-        ios_base += bs->nr_ops[!is_write] - bs->io_base.ios[!is_write];
+        ios_base += bs->slice_submitted.ios[!is_write];
     }
 
     if (ios_base + 1 <= ios_limit) {
@@ -3735,6 +3735,7 @@ static bool bdrv_exceed_iops_limits(BlockDriverState *bs, bool is_write,
             *wait = 0;
         }
 
+        bs->slice_submitted.ios[is_write]++;
         return false;
     }
 
@@ -3772,11 +3773,7 @@ static bool bdrv_exceed_io_limits(BlockDriverState *bs, int nb_sectors,
         bs->slice_start = now;
         bs->slice_end = now + bs->slice_time;
 
-        bs->io_base.bytes[is_write] = bs->nr_bytes[is_write];
-        bs->io_base.bytes[!is_write] = bs->nr_bytes[!is_write];
-
-        bs->io_base.ios[is_write] = bs->nr_ops[is_write];
-        bs->io_base.ios[!is_write] = bs->nr_ops[!is_write];
+        memset(&bs->slice_submitted, 0, sizeof(bs->slice_submitted));
     }
 
     elapsed_time = now - bs->slice_start;
diff --git a/include/block/block_int.h b/include/block/block_int.h
index ce0aa26..b461764 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -251,7 +251,7 @@ struct BlockDriverState {
     int64_t slice_start;
     int64_t slice_end;
     BlockIOLimit io_limits;
-    BlockIOBaseValue io_base;
+    BlockIOBaseValue slice_submitted;
     CoQueue throttled_reqs;
     QEMUTimer *block_timer;
     bool io_limits_enabled;
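
The blind spot described in the commit message can be reproduced outside QEMU with a small standalone model. The sketch below is not QEMU code: `Slice`, `admit_charge_on_completion()`, `admit_charge_on_submission()` and `burst()` are hypothetical names made up for illustration, and the model assumes that no request completes while the burst is being submitted. It compares a limit check that only sees completed operations with one that charges each operation when it is admitted, roughly what the `slice_submitted` counter does in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one throttling time slice; not a QEMU type. */
typedef struct {
    unsigned ios_limit;  /* operations allowed in this slice */
    unsigned completed;  /* operations charged when they complete */
    unsigned submitted;  /* operations charged when they are submitted */
} Slice;

/*
 * Completion-based accounting: the check only looks at completed
 * operations, so in-flight requests are invisible to it.
 */
bool admit_charge_on_completion(Slice *s)
{
    return s->completed + 1 <= s->ios_limit;
}

/*
 * Submission-based accounting: an admitted operation is charged
 * immediately, so the next check already sees it.
 */
bool admit_charge_on_submission(Slice *s)
{
    if (s->submitted + 1 <= s->ios_limit) {
        s->submitted++;
        return true;
    }
    return false;
}

/*
 * Submit n operations back-to-back, assuming none completes in between,
 * and return how many of them were admitted.
 */
unsigned burst(Slice *s, unsigned n, bool (*admit)(Slice *))
{
    unsigned admitted = 0;

    assert(s != NULL && admit != NULL);
    for (unsigned i = 0; i < n; i++) {
        if (admit(s)) {
            admitted++;
        }
    }
    return admitted;
}
```

With a limit of 10 operations and 9 already charged, a burst of 3 is fully admitted under completion-based accounting, while submission-based accounting admits only the one remaining operation. This matches the "1 operation remaining, 3 submitted" example in the commit message.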