From patchwork Mon Aug 10 10:14:45 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Denis V. Lunev" X-Patchwork-Id: 1342779 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=openvz.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4BQBg36zkJz9sRK for ; Mon, 10 Aug 2020 20:15:27 +1000 (AEST) Received: from localhost ([::1]:59964 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1k54px-00055L-O2 for incoming@patchwork.ozlabs.org; Mon, 10 Aug 2020 06:15:25 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:49368) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1k54pV-0004zr-Ub; Mon, 10 Aug 2020 06:14:57 -0400 Received: from relay.sw.ru ([185.231.240.75]:50170 helo=relay3.sw.ru) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1k54pS-0007eI-EW; Mon, 10 Aug 2020 06:14:57 -0400 Received: from [192.168.15.33] (helo=iris.lishka.ru) by relay3.sw.ru with esmtp (Exim 4.93) (envelope-from ) id 1k54pD-0001FY-5H; Mon, 10 Aug 2020 13:14:39 +0300 From: "Denis V. Lunev" To: qemu-block@nongnu.org, qemu-devel@nongnu.org Subject: [PATCH 1/3] block/block-backend: add converter from BlockAcctStats to BlockBackend Date: Mon, 10 Aug 2020 13:14:45 +0300 Message-Id: <20200810101447.7380-2-den@openvz.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20200810101447.7380-1-den@openvz.org> References: <20200810101447.7380-1-den@openvz.org> Received-SPF: pass client-ip=185.231.240.75; envelope-from=den@openvz.org; helo=relay3.sw.ru X-detected-operating-system: by eggs.gnu.org: First seen = 2020/08/10 06:14:49 X-ACL-Warn: Detected OS = Linux 3.11 and newer X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Kevin Wolf , "Denis V. Lunev" , Stefan Hajnoczi , Max Reitz Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Right now BlockAcctStats is always reside on BlockBackend. This structure is not used in any other place. Thus we are able to create a converter from one pointer to another. Signed-off-by: Denis V. Lunev Reviewed-by: Vladimir Sementsov-Ogievskiy CC: Stefan Hajnoczi CC: Kevin Wolf CC: Max Reitz --- block/block-backend.c | 5 +++++ include/sysemu/block-backend.h | 1 + 2 files changed, 6 insertions(+) diff --git a/block/block-backend.c b/block/block-backend.c index 3a13cb5f0b..88e531df98 100644 --- a/block/block-backend.c +++ b/block/block-backend.c @@ -2153,6 +2153,11 @@ BlockAcctStats *blk_get_stats(BlockBackend *blk) return &blk->stats; } +BlockBackend *blk_stats2blk(BlockAcctStats *s) +{ + return container_of(s, BlockBackend, stats); +} + void *blk_aio_get(const AIOCBInfo *aiocb_info, BlockBackend *blk, BlockCompletionFunc *cb, void *opaque) { diff --git a/include/sysemu/block-backend.h b/include/sysemu/block-backend.h index 8203d7f6f9..bd4694e7bc 100644 --- a/include/sysemu/block-backend.h +++ b/include/sysemu/block-backend.h @@ -227,6 +227,7 @@ void blk_add_insert_bs_notifier(BlockBackend *blk, Notifier *notify); void blk_io_plug(BlockBackend *blk); void blk_io_unplug(BlockBackend *blk); BlockAcctStats *blk_get_stats(BlockBackend *blk); +BlockBackend *blk_stats2blk(BlockAcctStats *stats); BlockBackendRootState *blk_get_root_state(BlockBackend *blk); void blk_update_root_state(BlockBackend *blk); bool blk_get_detect_zeroes_from_root_state(BlockBackend *blk); From patchwork Mon Aug 10 10:14:46 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Denis V. Lunev" X-Patchwork-Id: 1342782 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=openvz.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4BQBhl631Bz9sTb for ; Mon, 10 Aug 2020 20:16:55 +1000 (AEST) Received: from localhost ([::1]:38332 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1k54rN-0007rA-Kb for incoming@patchwork.ozlabs.org; Mon, 10 Aug 2020 06:16:53 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:49404) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1k54pX-00053o-NU; Mon, 10 Aug 2020 06:14:59 -0400 Received: from relay.sw.ru ([185.231.240.75]:50174 helo=relay3.sw.ru) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1k54pV-0007eK-S7; Mon, 10 Aug 2020 06:14:59 -0400 Received: from [192.168.15.33] (helo=iris.lishka.ru) by relay3.sw.ru with esmtp (Exim 4.93) (envelope-from ) id 1k54pD-0001FY-8u; Mon, 10 Aug 2020 13:14:39 +0300 From: "Denis V. Lunev" To: qemu-block@nongnu.org, qemu-devel@nongnu.org Subject: [PATCH 2/3] block: add logging facility for long standing IO requests Date: Mon, 10 Aug 2020 13:14:46 +0300 Message-Id: <20200810101447.7380-3-den@openvz.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20200810101447.7380-1-den@openvz.org> References: <20200810101447.7380-1-den@openvz.org> Received-SPF: pass client-ip=185.231.240.75; envelope-from=den@openvz.org; helo=relay3.sw.ru X-detected-operating-system: by eggs.gnu.org: First seen = 2020/08/10 06:14:49 X-ACL-Warn: Detected OS = Linux 3.11 and newer X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Kevin Wolf , "Denis V. Lunev" , Stefan Hajnoczi , Max Reitz Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" There are severe delays with IO requests processing if QEMU is running in virtual machine or over software defined storage. Such delays potentially results in unpredictable guest behavior. For example, guests over IDE or SATA drive could remount filesystem read-only if write is performed longer than 10 seconds. Such reports are very complex to process. Some good starting point for this seems quite reasonable. This patch provides one. It adds logging of such potentially dangerous long IO operations. Signed-off-by: Denis V. Lunev Reviewed-by: Vladimir Sementsov-Ogievskiy CC: Stefan Hajnoczi CC: Kevin Wolf CC: Max Reitz --- block/accounting.c | 59 +++++++++++++++++++++++++++++++++++++- blockdev.c | 7 ++++- include/block/accounting.h | 5 +++- 3 files changed, 68 insertions(+), 3 deletions(-) diff --git a/block/accounting.c b/block/accounting.c index 8d41c8a83a..24f48c84b8 100644 --- a/block/accounting.c +++ b/block/accounting.c @@ -27,7 +27,9 @@ #include "block/accounting.h" #include "block/block_int.h" #include "qemu/timer.h" +#include "qemu/log.h" #include "sysemu/qtest.h" +#include "sysemu/block-backend.h" static QEMUClockType clock_type = QEMU_CLOCK_REALTIME; static const int qtest_latency_ns = NANOSECONDS_PER_SECOND / 1000; @@ -41,10 +43,11 @@ void block_acct_init(BlockAcctStats *stats) } void block_acct_setup(BlockAcctStats *stats, bool account_invalid, - bool account_failed) + bool account_failed, unsigned latency_log_threshold_ms) { stats->account_invalid = account_invalid; stats->account_failed = account_failed; + stats->latency_log_threshold_ms = latency_log_threshold_ms; } void block_acct_cleanup(BlockAcctStats *stats) @@ -182,6 +185,58 @@ void block_latency_histograms_clear(BlockAcctStats *stats) } } +static const char *block_account_type(enum BlockAcctType type) +{ + switch (type) { + case BLOCK_ACCT_READ: + return "READ"; + case BLOCK_ACCT_WRITE: + return "WRITE"; + case BLOCK_ACCT_FLUSH: + return "DISCARD"; + case BLOCK_ACCT_UNMAP: + return "TRUNCATE"; + case BLOCK_ACCT_NONE: + return "NONE"; + case BLOCK_MAX_IOTYPE: + break; + } + return "UNKNOWN"; +} + +static void block_acct_report_long(BlockAcctStats *stats, + BlockAcctCookie *cookie, int64_t latency_ns) +{ + unsigned latency_ms = latency_ns / SCALE_MS; + int64_t start_time_host_s; + char buf[64]; + struct tm t; + BlockDriverState *bs; + + if (cookie->type == BLOCK_ACCT_NONE) { + return; + } + if (stats->latency_log_threshold_ms == 0) { + return; + } + if (latency_ms < stats->latency_log_threshold_ms) { + return; + } + + start_time_host_s = cookie->start_time_ns / NANOSECONDS_PER_SECOND; + strftime(buf, sizeof(buf), "%m-%d %H:%M:%S", + localtime_r(&start_time_host_s, &t)); + + bs = blk_bs(blk_stats2blk(stats)); + qemu_log("long %s[%ld] IO request: %d.03%d since %s.%03d bs: %s(%s, %s)\n", + block_account_type(cookie->type), cookie->bytes, + (int)(latency_ms / 1000), (int)(latency_ms % 1000), buf, + (int)((cookie->start_time_ns / 1000000) % 1000), + bs == NULL ? "unknown" : bdrv_get_node_name(bs), + bs == NULL ? "unknown" : bdrv_get_format_name(bs), + bs == NULL ? "unknown" : bs->filename); +} + static void block_account_one_io(BlockAcctStats *stats, BlockAcctCookie *cookie, bool failed) { @@ -222,6 +277,8 @@ static void block_account_one_io(BlockAcctStats *stats, BlockAcctCookie *cookie, qemu_mutex_unlock(&stats->lock); + block_acct_report_long(stats, cookie, latency_ns); + cookie->type = BLOCK_ACCT_NONE; } diff --git a/blockdev.c b/blockdev.c index 3848a9c8ab..66158d1292 100644 --- a/blockdev.c +++ b/blockdev.c @@ -622,7 +622,8 @@ static BlockBackend *blockdev_init(const char *file, QDict *bs_opts, bs->detect_zeroes = detect_zeroes; - block_acct_setup(blk_get_stats(blk), account_invalid, account_failed); + block_acct_setup(blk_get_stats(blk), account_invalid, account_failed, + qemu_opt_get_number(opts, "latency-log-threshold", 0)); if (!parse_stats_intervals(blk_get_stats(blk), interval_list, errp)) { blk_unref(blk); @@ -3727,6 +3728,10 @@ QemuOptsList qemu_common_drive_opts = { .type = QEMU_OPT_BOOL, .help = "whether to account for failed I/O operations " "in the statistics", + },{ + .name = "latency-log-threshold", + .type = QEMU_OPT_NUMBER, + .help = "threshold for long I/O report (disabled if <=0), in ms", }, { /* end of list */ } }, diff --git a/include/block/accounting.h b/include/block/accounting.h index 878b4c3581..c3ea25f9aa 100644 --- a/include/block/accounting.h +++ b/include/block/accounting.h @@ -91,6 +91,9 @@ struct BlockAcctStats { bool account_invalid; bool account_failed; BlockLatencyHistogram latency_histogram[BLOCK_MAX_IOTYPE]; + + /* Threshold for long I/O detection, ms */ + unsigned latency_log_threshold_ms; }; typedef struct BlockAcctCookie { @@ -101,7 +104,7 @@ typedef struct BlockAcctCookie { void block_acct_init(BlockAcctStats *stats); void block_acct_setup(BlockAcctStats *stats, bool account_invalid, - bool account_failed); + bool account_failed, unsigned latency_log_threshold_ms); void block_acct_cleanup(BlockAcctStats *stats); void block_acct_add_interval(BlockAcctStats *stats, unsigned interval_length); BlockAcctTimedStats *block_acct_interval_next(BlockAcctStats *stats, From patchwork Mon Aug 10 10:14:47 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Denis V. Lunev" X-Patchwork-Id: 1342781 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=openvz.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4BQBhC2HS9z9sRK for ; Mon, 10 Aug 2020 20:16:25 +1000 (AEST) Received: from localhost ([::1]:35612 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1k54qs-0006kg-9j for incoming@patchwork.ozlabs.org; Mon, 10 Aug 2020 06:16:22 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:49376) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1k54pW-0004zs-04; Mon, 10 Aug 2020 06:14:58 -0400 Received: from relay.sw.ru ([185.231.240.75]:50172 helo=relay3.sw.ru) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1k54pT-0007eJ-4i; Mon, 10 Aug 2020 06:14:57 -0400 Received: from [192.168.15.33] (helo=iris.lishka.ru) by relay3.sw.ru with esmtp (Exim 4.93) (envelope-from ) id 1k54pD-0001FY-Cr; Mon, 10 Aug 2020 13:14:39 +0300 From: "Denis V. Lunev" To: qemu-block@nongnu.org, qemu-devel@nongnu.org Subject: [PATCH 3/3] block: enable long IO requests report by default Date: Mon, 10 Aug 2020 13:14:47 +0300 Message-Id: <20200810101447.7380-4-den@openvz.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20200810101447.7380-1-den@openvz.org> References: <20200810101447.7380-1-den@openvz.org> Received-SPF: pass client-ip=185.231.240.75; envelope-from=den@openvz.org; helo=relay3.sw.ru X-detected-operating-system: by eggs.gnu.org: First seen = 2020/08/10 06:14:49 X-ACL-Warn: Detected OS = Linux 3.11 and newer X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Kevin Wolf , "Denis V. Lunev" , Vladimir Sementsov-Ogievskiy , Stefan Hajnoczi , Max Reitz Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Latency threshold is set to 10 seconds following guest request timeout on legacy storage controller. Signed-off-by: Denis V. Lunev CC: Vladimir Sementsov-Ogievskiy CC: Stefan Hajnoczi CC: Kevin Wolf CC: Max Reitz Reviewed-by: Max Reitz --- blockdev.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/blockdev.c b/blockdev.c index 66158d1292..733fdd36da 100644 --- a/blockdev.c +++ b/blockdev.c @@ -622,8 +622,13 @@ static BlockBackend *blockdev_init(const char *file, QDict *bs_opts, bs->detect_zeroes = detect_zeroes; + /* + * Set log threshold to 10 seconds. Timeout choosen by observation + * of the guest behavior with legacy storage controllers. Linux + * could remount FS read-only if journal write takes this time. + */ block_acct_setup(blk_get_stats(blk), account_invalid, account_failed, - qemu_opt_get_number(opts, "latency-log-threshold", 0)); + qemu_opt_get_number(opts, "latency-log-threshold", 10000)); if (!parse_stats_intervals(blk_get_stats(blk), interval_list, errp)) { blk_unref(blk);