From: Alberto Garcia
To: qemu-devel@nongnu.org
Cc: Kevin Wolf, Alberto Garcia, Stefan Hajnoczi, qemu-block@nongnu.org
Date: Tue, 19 May 2015 15:24:32 +0300
Message-Id: <7d0ca2d3615ca3bb156d4c31aa64e6812cda7a6e.1432037840.git.berto@igalia.com>
Subject: [Qemu-devel] [PATCH 4/8] throttle: Add throttle group support
X-Patchwork-Id: 473890

The throttle group support uses a cooperative round-robin scheduling
algorithm.

The principles of the algorithm are simple:
- Each BDS of the group is used as a token in a circular way.
- The active BDS computes if a wait must be done and arms the right
  timer.
- If a wait must be done the token timer will be armed so the token
  will become the next active BDS.

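The token selection described above can be modelled outside QEMU. The
following is a minimal sketch, not code from this patch: Member and
next_token() are hypothetical names standing in for BlockDriverState and
next_throttle_token(), and only one request type (read or write) is
tracked.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a BlockDriverState in a throttle group. */
typedef struct Member {
    unsigned pending_reqs;  /* queued requests of one type */
    struct Member *next;    /* next member; the list is circular */
} Member;

/* Starting after the current token, walk the ring until a member with
 * pending requests is found. If nobody has any, fall back to the
 * caller, whose request is the one about to be queued. */
Member *next_token(Member *current_token, Member *caller)
{
    assert(current_token != NULL && caller != NULL);

    Member *start = current_token;
    Member *token = start->next;

    while (token != start && token->pending_reqs == 0) {
        token = token->next;
    }
    if (token == start && token->pending_reqs == 0) {
        token = caller;
    }
    return token;
}
```

With three members a -> b -> c -> a and no pending requests anywhere,
next_token(&a, &c) returns &c; once c has a queued request,
next_token(&a, &b) returns &c instead.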
Signed-off-by: Alberto Garcia
Reviewed-by: Stefan Hajnoczi
---
 block.c                         |  15 +--
 block/io.c                      |  75 +++-----------
 block/qapi.c                    |   5 +-
 block/throttle-groups.c         | 214 +++++++++++++++++++++++++++++++++++++++-
 blockdev.c                      |  38 +++++--
 hmp.c                           |   4 +-
 include/block/block.h           |   3 +-
 include/block/block_int.h       |   7 +-
 include/block/throttle-groups.h |   4 +
 qapi/block-core.json            |  25 ++++-
 qemu-options.hx                 |   1 +
 qmp-commands.hx                 |   3 +-
 12 files changed, 309 insertions(+), 85 deletions(-)

diff --git a/block.c b/block.c
index e41ba10..1b249b2 100644
--- a/block.c
+++ b/block.c
@@ -1811,15 +1811,18 @@ static void bdrv_move_feature_fields(BlockDriverState *bs_dest,
     bs_dest->enable_write_cache = bs_src->enable_write_cache;
 
     /* i/o throttled req */
-    memcpy(&bs_dest->throttle_state,
-           &bs_src->throttle_state,
-           sizeof(ThrottleState));
+    bs_dest->throttle_state = bs_src->throttle_state;
+    bs_dest->io_limits_enabled = bs_src->io_limits_enabled;
+    bs_dest->pending_reqs[0] = bs_src->pending_reqs[0];
+    bs_dest->pending_reqs[1] = bs_src->pending_reqs[1];
+    bs_dest->throttled_reqs[0] = bs_src->throttled_reqs[0];
+    bs_dest->throttled_reqs[1] = bs_src->throttled_reqs[1];
+    memcpy(&bs_dest->round_robin,
+           &bs_src->round_robin,
+           sizeof(bs_dest->round_robin));
     memcpy(&bs_dest->throttle_timers,
            &bs_src->throttle_timers,
            sizeof(ThrottleTimers));
-    bs_dest->throttled_reqs[0] = bs_src->throttled_reqs[0];
-    bs_dest->throttled_reqs[1] = bs_src->throttled_reqs[1];
-    bs_dest->io_limits_enabled = bs_src->io_limits_enabled;
 
     /* r/w error */
     bs_dest->on_read_error = bs_src->on_read_error;
diff --git a/block/io.c b/block/io.c
index 1f20c9c..0373027 100644
--- a/block/io.c
+++ b/block/io.c
@@ -23,9 +23,9 @@
  */
 
 #include "trace.h"
-#include "sysemu/qtest.h"
 #include "block/blockjob.h"
 #include "block/block_int.h"
+#include "block/throttle-groups.h"
 
 #define NOT_DONE 0x7fffffff /* used while emulated sync operation in progress */
 
@@ -65,7 +65,7 @@ void bdrv_set_io_limits(BlockDriverState *bs,
 {
     int i;
 
-    throttle_config(&bs->throttle_state, &bs->throttle_timers, cfg);
+    throttle_group_config(bs, cfg);
 
     for (i = 0; i < 2; i++) {
         qemu_co_enter_next(&bs->throttled_reqs[i]);
@@ -95,76 +95,33 @@ static bool bdrv_start_throttled_reqs(BlockDriverState *bs)
 void bdrv_io_limits_disable(BlockDriverState *bs)
 {
     bs->io_limits_enabled = false;
-
     bdrv_start_throttled_reqs(bs);
-
-    throttle_timers_destroy(&bs->throttle_timers);
-}
-
-static void bdrv_throttle_read_timer_cb(void *opaque)
-{
-    BlockDriverState *bs = opaque;
-    qemu_co_enter_next(&bs->throttled_reqs[0]);
-}
-
-static void bdrv_throttle_write_timer_cb(void *opaque)
-{
-    BlockDriverState *bs = opaque;
-    qemu_co_enter_next(&bs->throttled_reqs[1]);
+    throttle_group_unregister_bs(bs);
 }
 
 /* should be called before bdrv_set_io_limits if a limit is set */
-void bdrv_io_limits_enable(BlockDriverState *bs)
+void bdrv_io_limits_enable(BlockDriverState *bs, const char *group)
 {
-    int clock_type = QEMU_CLOCK_REALTIME;
-
-    if (qtest_enabled()) {
-        /* For testing block IO throttling only */
-        clock_type = QEMU_CLOCK_VIRTUAL;
-    }
     assert(!bs->io_limits_enabled);
-    throttle_init(&bs->throttle_state);
-    throttle_timers_init(&bs->throttle_timers,
-                         bdrv_get_aio_context(bs),
-                         clock_type,
-                         bdrv_throttle_read_timer_cb,
-                         bdrv_throttle_write_timer_cb,
-                         bs);
+    throttle_group_register_bs(bs, group);
     bs->io_limits_enabled = true;
 }
 
-/* This function makes an IO wait if needed
- *
- * @nb_sectors: the number of sectors of the IO
- * @is_write: is the IO a write
- */
-static void bdrv_io_limits_intercept(BlockDriverState *bs,
-                                     unsigned int bytes,
-                                     bool is_write)
+void bdrv_io_limits_update_group(BlockDriverState *bs, const char *group)
 {
-    /* does this io must wait */
-    bool must_wait = throttle_schedule_timer(&bs->throttle_state,
-                                             &bs->throttle_timers,
-                                             is_write);
-
-    /* if must wait or any request of this type throttled queue the IO */
-    if (must_wait ||
-        !qemu_co_queue_empty(&bs->throttled_reqs[is_write])) {
-        qemu_co_queue_wait(&bs->throttled_reqs[is_write]);
+    /* this bs is not part of any group */
+    if (!bs->throttle_state) {
+        return;
     }
 
-    /* the IO will be executed, do the accounting */
-    throttle_account(&bs->throttle_state, is_write, bytes);
-
-
-    /* if the next request must wait -> do nothing */
-    if (throttle_schedule_timer(&bs->throttle_state, &bs->throttle_timers,
-                                is_write)) {
+    /* this bs is a part of the same group as the one we want */
+    if (!g_strcmp0(throttle_group_get_name(bs), group)) {
         return;
     }
 
-    /* else queue next request for execution */
-    qemu_co_queue_next(&bs->throttled_reqs[is_write]);
+    /* need to change the group this bs belongs to */
+    bdrv_io_limits_disable(bs);
+    bdrv_io_limits_enable(bs, group);
 }
 
 void bdrv_setup_io_funcs(BlockDriver *bdrv)
@@ -978,7 +935,7 @@ static int coroutine_fn bdrv_co_do_preadv(BlockDriverState *bs,
 
     /* throttling disk I/O */
     if (bs->io_limits_enabled) {
-        bdrv_io_limits_intercept(bs, bytes, false);
+        throttle_group_co_io_limits_intercept(bs, bytes, false);
     }
 
     /* Align read if necessary by padding qiov */
@@ -1219,7 +1176,7 @@ static int coroutine_fn bdrv_co_do_pwritev(BlockDriverState *bs,
 
     /* throttling disk I/O */
     if (bs->io_limits_enabled) {
-        bdrv_io_limits_intercept(bs, bytes, true);
+        throttle_group_co_io_limits_intercept(bs, bytes, true);
     }
 
     /*
diff --git a/block/qapi.c b/block/qapi.c
index 18d2b95..a5ac312 100644
--- a/block/qapi.c
+++ b/block/qapi.c
@@ -24,6 +24,7 @@
 
 #include "block/qapi.h"
 #include "block/block_int.h"
+#include "block/throttle-groups.h"
 #include "block/write-threshold.h"
 #include "qmp-commands.h"
 #include "qapi-visit.h"
@@ -65,7 +66,9 @@ BlockDeviceInfo *bdrv_block_device_info(BlockDriverState *bs, Error **errp)
 
     if (bs->io_limits_enabled) {
         ThrottleConfig cfg;
-        throttle_get_config(&bs->throttle_state, &cfg);
+
+        throttle_group_get_config(bs, &cfg);
+
         info->bps = cfg.buckets[THROTTLE_BPS_TOTAL].avg;
         info->bps_rd = cfg.buckets[THROTTLE_BPS_READ].avg;
         info->bps_wr = cfg.buckets[THROTTLE_BPS_WRITE].avg;
diff --git a/block/throttle-groups.c b/block/throttle-groups.c
index 352077f..da8c70c 100644
--- a/block/throttle-groups.c
+++ b/block/throttle-groups.c
@@ -23,6 +23,9 @@
  */
 
 #include "block/throttle-groups.h"
+#include "qemu/queue.h"
+#include "qemu/thread.h"
+#include "sysemu/qtest.h"
 
 /* The ThrottleGroup structure (with its ThrottleState) is shared
  * among different BlockDriverState and it's independent from
@@ -160,6 +163,153 @@ static BlockDriverState *throttle_group_next_bs(BlockDriverState *bs)
     return next;
 }
 
+/* Return the next BlockDriverState in the round-robin sequence with
+ * pending I/O requests.
+ *
+ * This assumes that tg->lock is held.
+ *
+ * @bs: the current BlockDriverState
+ * @is_write: the type of operation (read/write)
+ * @ret: the next BlockDriverState with pending requests, or bs
+ *       if there is none.
+ */
+static BlockDriverState *next_throttle_token(BlockDriverState *bs,
+                                             bool is_write)
+{
+    ThrottleGroup *tg = container_of(bs->throttle_state, ThrottleGroup, ts);
+    BlockDriverState *token, *start;
+
+    start = token = tg->tokens[is_write];
+
+    /* get next bs round in round robin style */
+    token = throttle_group_next_bs(token);
+    while (token != start && !token->pending_reqs[is_write]) {
+        token = throttle_group_next_bs(token);
+    }
+
+    /* If no IO are queued for scheduling on the next round robin token
+     * then decide the token is the current bs because chances are
+     * the current bs get the current request queued.
+     */
+    if (token == start && !token->pending_reqs[is_write]) {
+        token = bs;
+    }
+
+    return token;
+}
+
+/* Check if the next I/O request for a BlockDriverState needs to be
+ * throttled or not. If there's no timer set in this group, set one
+ * and update the token accordingly.
+ *
+ * This assumes that tg->lock is held.
+ *
+ * @bs: the current BlockDriverState
+ * @is_write: the type of operation (read/write)
+ * @ret: whether the I/O request needs to be throttled or not
+ */
+static bool throttle_group_schedule_timer(BlockDriverState *bs,
+                                          bool is_write)
+{
+    ThrottleState *ts = bs->throttle_state;
+    ThrottleTimers *tt = &bs->throttle_timers;
+    ThrottleGroup *tg = container_of(ts, ThrottleGroup, ts);
+    bool must_wait;
+
+    /* Check if any of the timers in this group is already armed */
+    if (tg->any_timer_armed[is_write]) {
+        return true;
+    }
+
+    must_wait = throttle_schedule_timer(ts, tt, is_write);
+
+    /* If a timer just got armed, set bs as the current token */
+    if (must_wait) {
+        tg->tokens[is_write] = bs;
+        tg->any_timer_armed[is_write] = true;
+    }
+
+    return must_wait;
+}
+
+/* Look for the next pending I/O request and schedule it.
+ *
+ * This assumes that tg->lock is held.
+ *
+ * @bs: the current BlockDriverState
+ * @is_write: the type of operation (read/write)
+ */
+static void schedule_next_request(BlockDriverState *bs, bool is_write)
+{
+    ThrottleGroup *tg = container_of(bs->throttle_state, ThrottleGroup, ts);
+    bool must_wait;
+    BlockDriverState *token;
+
+    /* Check if there's any pending request to schedule next */
+    token = next_throttle_token(bs, is_write);
+    if (!token->pending_reqs[is_write]) {
+        return;
+    }
+
+    /* Set a timer for the request if it needs to be throttled */
+    must_wait = throttle_group_schedule_timer(token, is_write);
+
+    /* If it doesn't have to wait, queue it for immediate execution */
+    if (!must_wait) {
+        /* Give preference to requests from the current bs */
+        if (qemu_in_coroutine() &&
+            qemu_co_queue_next(&bs->throttled_reqs[is_write])) {
+            token = bs;
+        } else {
+            ThrottleTimers *tt = &token->throttle_timers;
+            int64_t now = qemu_clock_get_ns(tt->clock_type);
+            timer_mod(tt->timers[is_write], now + 1);
+            tg->any_timer_armed[is_write] = true;
+        }
+        tg->tokens[is_write] = token;
+    }
+}
+
+/* Check if an I/O request needs to be throttled, wait and set a timer
+ * if necessary, and schedule the next request using a round robin
+ * algorithm.
+ *
+ * @bs: the current BlockDriverState
+ * @bytes: the number of bytes for this I/O
+ * @is_write: the type of operation (read/write)
+ */
+void coroutine_fn throttle_group_co_io_limits_intercept(BlockDriverState *bs,
+                                                        unsigned int bytes,
+                                                        bool is_write)
+{
+    bool must_wait;
+    BlockDriverState *token;
+
+    ThrottleGroup *tg = container_of(bs->throttle_state, ThrottleGroup, ts);
+    qemu_mutex_lock(&tg->lock);
+
+    /* First we check if this I/O has to be throttled. */
+    token = next_throttle_token(bs, is_write);
+    must_wait = throttle_group_schedule_timer(token, is_write);
+
+    /* Wait if there's a timer set or queued requests of this type */
+    if (must_wait || bs->pending_reqs[is_write]) {
+        bs->pending_reqs[is_write]++;
+        qemu_mutex_unlock(&tg->lock);
+        qemu_co_queue_wait(&bs->throttled_reqs[is_write]);
+        qemu_mutex_lock(&tg->lock);
+        bs->pending_reqs[is_write]--;
+    }
+
+    /* The I/O will be executed, so do the accounting */
+    throttle_account(bs->throttle_state, is_write, bytes);
+
+    /* Schedule the next request */
+    schedule_next_request(bs, is_write);
+
+    qemu_mutex_unlock(&tg->lock);
+}
+
 /* Update the throttle configuration for a particular group. Similar
  * to throttle_config(), but guarantees atomicity within the
  * throttling group.
@@ -195,9 +345,49 @@ void throttle_group_get_config(BlockDriverState *bs, ThrottleConfig *cfg)
     qemu_mutex_unlock(&tg->lock);
 }
 
-/* Register a BlockDriverState in the throttling group, also updating
- * its throttle_state pointer to point to it. If a throttling group
- * with that name does not exist yet, it will be created.
+/* ThrottleTimers callback. This wakes up a request that was waiting
+ * because it had been throttled.
+ *
+ * @bs: the BlockDriverState whose request had been throttled
+ * @is_write: the type of operation (read/write)
+ */
+static void timer_cb(BlockDriverState *bs, bool is_write)
+{
+    ThrottleState *ts = bs->throttle_state;
+    ThrottleGroup *tg = container_of(ts, ThrottleGroup, ts);
+    bool empty_queue;
+
+    /* The timer has just been fired, so we can update the flag */
+    qemu_mutex_lock(&tg->lock);
+    tg->any_timer_armed[is_write] = false;
+    qemu_mutex_unlock(&tg->lock);
+
+    /* Run the request that was waiting for this timer */
+    empty_queue = !qemu_co_enter_next(&bs->throttled_reqs[is_write]);
+
+    /* If the request queue was empty then we have to take care of
+     * scheduling the next one */
+    if (empty_queue) {
+        qemu_mutex_lock(&tg->lock);
+        schedule_next_request(bs, is_write);
+        qemu_mutex_unlock(&tg->lock);
+    }
+}
+
+static void read_timer_cb(void *opaque)
+{
+    timer_cb(opaque, false);
+}
+
+static void write_timer_cb(void *opaque)
+{
+    timer_cb(opaque, true);
+}
+
+/* Register a BlockDriverState in the throttling group, also
+ * initializing its timers and updating its throttle_state pointer to
+ * point to it. If a throttling group with that name does not exist
+ * yet, it will be created.
  *
  * @bs: the BlockDriverState to insert
  * @groupname: the name of the group
@@ -206,6 +396,12 @@ void throttle_group_register_bs(BlockDriverState *bs, const char *groupname)
 {
     int i;
     ThrottleGroup *tg = throttle_group_incref(groupname);
+    int clock_type = QEMU_CLOCK_REALTIME;
+
+    if (qtest_enabled()) {
+        /* For testing block IO throttling only */
+        clock_type = QEMU_CLOCK_VIRTUAL;
+    }
 
     bs->throttle_state = &tg->ts;
 
@@ -218,11 +414,20 @@ void throttle_group_register_bs(BlockDriverState *bs, const char *groupname)
     }
 
     QLIST_INSERT_HEAD(&tg->head, bs, round_robin);
+
+    throttle_timers_init(&bs->throttle_timers,
+                         bdrv_get_aio_context(bs),
+                         clock_type,
+                         read_timer_cb,
+                         write_timer_cb,
+                         bs);
+
     qemu_mutex_unlock(&tg->lock);
 }
 
 /* Unregister a BlockDriverState from its group, removing it from the
- * list and setting the throttle_state pointer to NULL.
+ * list, destroying the timers and setting the throttle_state pointer
+ * to NULL.
  *
  * The group will be destroyed if it's empty after this operation.
  *
@@ -247,6 +452,7 @@ void throttle_group_unregister_bs(BlockDriverState *bs)
 
     /* remove the current bs from the list */
     QLIST_REMOVE(bs, round_robin);
+    throttle_timers_destroy(&bs->throttle_timers);
     qemu_mutex_unlock(&tg->lock);
 
     throttle_group_unref(tg);
diff --git a/blockdev.c b/blockdev.c
index 5eaf77e..317923d 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -34,6 +34,7 @@
 #include "sysemu/blockdev.h"
 #include "hw/block/block.h"
 #include "block/blockjob.h"
+#include "block/throttle-groups.h"
 #include "monitor/monitor.h"
 #include "qemu/option.h"
 #include "qemu/config-file.h"
@@ -357,6 +358,7 @@ static BlockBackend *blockdev_init(const char *file, QDict *bs_opts,
     const char *id;
     bool has_driver_specific_opts;
     BlockdevDetectZeroesOptions detect_zeroes;
+    const char *throttling_group;
 
     /* Check common options by copying from bs_opts to opts, all other options
      * stay in bs_opts for processing by bdrv_open(). */
@@ -459,6 +461,8 @@ static BlockBackend *blockdev_init(const char *file, QDict *bs_opts,
 
     cfg.op_size = qemu_opt_get_number(opts, "throttling.iops-size", 0);
 
+    throttling_group = qemu_opt_get(opts, "throttling.group");
+
     if (!check_throttle_config(&cfg, &error)) {
         error_propagate(errp, error);
         goto early_err;
@@ -547,7 +551,10 @@ static BlockBackend *blockdev_init(const char *file, QDict *bs_opts,
 
     /* disk I/O throttling */
     if (throttle_enabled(&cfg)) {
-        bdrv_io_limits_enable(bs);
+        if (!throttling_group) {
+            throttling_group = blk_name(blk);
+        }
+        bdrv_io_limits_enable(bs, throttling_group);
         bdrv_set_io_limits(bs, &cfg);
     }
 
@@ -711,6 +718,8 @@ DriveInfo *drive_new(QemuOpts *all_opts, BlockInterfaceType block_default_type)
 
         { "iops_size", "throttling.iops-size" },
 
+        { "group", "throttling.group" },
+
         { "readonly", "read-only" },
     };
 
@@ -1951,7 +1960,9 @@ void qmp_block_set_io_throttle(const char *device, int64_t bps, int64_t bps_rd,
                                bool has_iops_wr_max,
                                int64_t iops_wr_max,
                                bool has_iops_size,
-                               int64_t iops_size, Error **errp)
+                               int64_t iops_size,
+                               bool has_group,
+                               const char *group, Error **errp)
 {
     ThrottleConfig cfg;
     BlockDriverState *bs;
@@ -2004,14 +2015,19 @@ void qmp_block_set_io_throttle(const char *device, int64_t bps, int64_t bps_rd,
     aio_context = bdrv_get_aio_context(bs);
     aio_context_acquire(aio_context);
 
-    if (!bs->io_limits_enabled && throttle_enabled(&cfg)) {
-        bdrv_io_limits_enable(bs);
-    } else if (bs->io_limits_enabled && !throttle_enabled(&cfg)) {
-        bdrv_io_limits_disable(bs);
-    }
-
-    if (bs->io_limits_enabled) {
+    if (throttle_enabled(&cfg)) {
+        /* Enable I/O limits if they're not enabled yet, otherwise
+         * just update the throttling group. */
+        if (!bs->io_limits_enabled) {
+            bdrv_io_limits_enable(bs, has_group ? group : device);
+        } else if (has_group) {
+            bdrv_io_limits_update_group(bs, group);
+        }
+        /* Set the new throttling configuration */
         bdrv_set_io_limits(bs, &cfg);
+    } else if (bs->io_limits_enabled) {
+        /* If all throttling settings are set to 0, disable I/O limits */
+        bdrv_io_limits_disable(bs);
     }
 
     aio_context_release(aio_context);
@@ -3190,6 +3206,10 @@ QemuOptsList qemu_common_drive_opts = {
             .type = QEMU_OPT_NUMBER,
             .help = "when limiting by iops max size of an I/O in bytes",
         },{
+            .name = "throttling.group",
+            .type = QEMU_OPT_STRING,
+            .help = "name of the block throttling group",
+        },{
             .name = "copy-on-read",
             .type = QEMU_OPT_BOOL,
             .help = "copy read data from backing file into image file",
diff --git a/hmp.c b/hmp.c
index e17852d..31aa8f2 100644
--- a/hmp.c
+++ b/hmp.c
@@ -1338,7 +1338,9 @@ void hmp_block_set_io_throttle(Monitor *mon, const QDict *qdict)
                               false,
                               0,
                               false, /* No default I/O size */
-                              0, &err);
+                              0,
+                              false,
+                              NULL, &err);
     hmp_handle_error(mon, &err);
 }
 
diff --git a/include/block/block.h b/include/block/block.h
index 7d1a717..903f229 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -173,8 +173,9 @@ void bdrv_stats_print(Monitor *mon, const QObject *data);
 void bdrv_info_stats(Monitor *mon, QObject **ret_data);
 
 /* disk I/O throttling */
-void bdrv_io_limits_enable(BlockDriverState *bs);
+void bdrv_io_limits_enable(BlockDriverState *bs, const char *group);
 void bdrv_io_limits_disable(BlockDriverState *bs);
+void bdrv_io_limits_update_group(BlockDriverState *bs, const char *group);
 
 void bdrv_init(void);
 void bdrv_init_with_whitelist(void);
diff --git a/include/block/block_int.h b/include/block/block_int.h
index 1167fb9..3071d31 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -376,10 +376,13 @@ struct BlockDriverState {
     unsigned int serialising_in_flight;
 
     /* I/O throttling */
-    ThrottleState throttle_state;
-    ThrottleTimers throttle_timers;
     CoQueue throttled_reqs[2];
     bool io_limits_enabled;
+    /* The following fields are protected by the ThrottleGroup lock.
+     * See the ThrottleGroup documentation for details. */
+    ThrottleState *throttle_state;
+    ThrottleTimers throttle_timers;
+    unsigned pending_reqs[2];
     QLIST_ENTRY(BlockDriverState) round_robin;
 
     /* I/O stats (display with "info blockstats"). */
diff --git a/include/block/throttle-groups.h b/include/block/throttle-groups.h
index b966ec7..322139a 100644
--- a/include/block/throttle-groups.h
+++ b/include/block/throttle-groups.h
@@ -36,4 +36,8 @@ void throttle_group_get_config(BlockDriverState *bs, ThrottleConfig *cfg);
 void throttle_group_register_bs(BlockDriverState *bs, const char *groupname);
 void throttle_group_unregister_bs(BlockDriverState *bs);
 
+void coroutine_fn throttle_group_co_io_limits_intercept(BlockDriverState *bs,
+                                                        unsigned int bytes,
+                                                        bool is_write);
+
 #endif
diff --git a/qapi/block-core.json b/qapi/block-core.json
index 863ffea..3eae28b 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -1043,6 +1043,27 @@
 #
 # Change I/O throttle limits for a block drive.
 #
+# Since QEMU 2.4, each device with I/O limits is a member of a throttle
+# group.
+#
+# If two or more devices are members of the same group, the limits
+# will apply to the combined I/O of the whole group in a round-robin
+# fashion. Therefore, setting new I/O limits to a device will affect
+# the whole group.
+#
+# The name of the group can be specified using the 'group' parameter.
+# If the parameter is unset, it is assumed to be the current group of
+# that device. If it's not in any group yet, the name of the device
+# will be used as the name for its group.
+#
+# The 'group' parameter can also be used to move a device to a
+# different group. In this case the limits specified in the parameters
+# will be applied to the new group only.
+#
+# I/O limits can be disabled by setting all of them to 0. In this case
+# the device will be removed from its group and the rest of its
+# members will not be affected. The 'group' parameter is ignored.
+#
 # @device: The name of the device
 #
 # @bps: total throughput limit in bytes per second
@@ -1071,6 +1092,8 @@
 #
 # @iops_size: #optional an I/O size in bytes (Since 1.7)
 #
+# @group: #optional throttle group name (Since 2.4)
+#
 # Returns: Nothing on success
 #          If @device is not a valid block device, DeviceNotFound
 #
@@ -1082,7 +1105,7 @@
             '*bps_max': 'int', '*bps_rd_max': 'int',
             '*bps_wr_max': 'int', '*iops_max': 'int',
             '*iops_rd_max': 'int', '*iops_wr_max': 'int',
-            '*iops_size': 'int' } }
+            '*iops_size': 'int', '*group': 'str' } }
 
 ##
 # @block-stream:
diff --git a/qemu-options.hx b/qemu-options.hx
index ec356f6..bf38bba 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -464,6 +464,7 @@ DEF("drive", HAS_ARG, QEMU_OPTION_drive,
     " [[,bps_max=bm]|[[,bps_rd_max=rm][,bps_wr_max=wm]]]\n"
     " [[,iops_max=im]|[[,iops_rd_max=irm][,iops_wr_max=iwm]]]\n"
     " [[,iops_size=is]]\n"
+    " [[,group=g]]\n"
     " use 'file' as a drive image\n", QEMU_ARCH_ALL)
 STEXI
 @item -drive @var{option}[,@var{option}[,@var{option}[,...]]]
diff --git a/qmp-commands.hx b/qmp-commands.hx
index 14e109e..2ad6acb 100644
--- a/qmp-commands.hx
+++ b/qmp-commands.hx
@@ -1853,7 +1853,7 @@ EQMP
 
     {
         .name = "block_set_io_throttle",
-        .args_type = "device:B,bps:l,bps_rd:l,bps_wr:l,iops:l,iops_rd:l,iops_wr:l,bps_max:l?,bps_rd_max:l?,bps_wr_max:l?,iops_max:l?,iops_rd_max:l?,iops_wr_max:l?,iops_size:l?",
+        .args_type = "device:B,bps:l,bps_rd:l,bps_wr:l,iops:l,iops_rd:l,iops_wr:l,bps_max:l?,bps_rd_max:l?,bps_wr_max:l?,iops_max:l?,iops_rd_max:l?,iops_wr_max:l?,iops_size:l?,group:s?",
         .mhandler.cmd_new = qmp_marshal_input_block_set_io_throttle,
     },
 
@@ -1879,6 +1879,7 @@ Arguments:
 - "iops_rd_max": read I/O operations max (json-int)
 - "iops_wr_max": write I/O operations max (json-int)
 - "iops_size": I/O size in bytes when limiting (json-int)
+- "group": throttle group name (json-string)
 
 Example:
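
Separately from the patch itself, the group-selection rules documented
in the qapi/block-core.json hunk can be summarised as a small decision
function. This is an illustrative sketch only: GroupAction and
pick_group_action() are hypothetical names, not QEMU APIs, and the
function merely reports what qmp_block_set_io_throttle() appears to do
in each case without touching any state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical outcomes for a device after block_set_io_throttle. */
typedef enum {
    GROUP_LEAVE,  /* all limits are 0: throttling disabled, group left */
    GROUP_JOIN,   /* device was not throttled: it joins a group */
    GROUP_MOVE,   /* device switches to a different group */
    GROUP_STAY    /* membership is unchanged */
} GroupAction;

/* current: group the device is in, or NULL if it is not throttled.
 * group:   requested group name, or NULL if the parameter was omitted.
 * *target receives the group the device ends up in (NULL for none). */
GroupAction pick_group_action(const char *current, const char *group,
                              const char *device, bool any_limit_set,
                              const char **target)
{
    assert(device != NULL && target != NULL);

    if (!any_limit_set) {
        /* 'group' is ignored when every limit is 0 */
        *target = NULL;
        return current ? GROUP_LEAVE : GROUP_STAY;
    }
    if (!current) {
        /* default group name is the device name */
        *target = group ? group : device;
        return GROUP_JOIN;
    }
    if (group && strcmp(current, group) != 0) {
        *target = group;
        return GROUP_MOVE;
    }
    *target = current;
    return GROUP_STAY;
}
```

For instance, a device named "virtio0" with no throttling and no 'group'
argument joins a group called "virtio0", while passing a different group
name for an already throttled device yields GROUP_MOVE.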