From patchwork Thu Oct 17 06:26:49 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Massimiliano Pellizzer
X-Patchwork-Id: 1998397
Received: from localhost.localdomain (net-93-66-99-170.cust.vodafonedsl.it.
 [93.66.99.170]) by smtp.gmail.com with ESMTPSA
 id ffacd0b85a97d-37d7fa7a09dsm6268229f8f.23.2024.10.16.23.27.00
 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256);
 Wed, 16 Oct 2024 23:27:01 -0700 (PDT)
From: Massimiliano Pellizzer
To: kernel-team@lists.ubuntu.com
Subject: [SRU][F][PATCH v2 1/1] ocfs2: fix DIO failure due to insufficient transaction credits
Date: Thu, 17 Oct 2024 08:26:49 +0200
Message-ID: <20241017062649.10459-2-massimiliano.pellizzer@canonical.com>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20241017062649.10459-1-massimiliano.pellizzer@canonical.com>
References: <20241017062649.10459-1-massimiliano.pellizzer@canonical.com>
MIME-Version: 1.0
X-BeenThere: kernel-team@lists.ubuntu.com
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: Kernel team discussions
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: kernel-team-bounces@lists.ubuntu.com
Sender: "kernel-team"

From: Jan Kara

commit be346c1a6eeb49d8fda827d2a9522124c2f72f36 upstream.

The code in ocfs2_dio_end_io_write() estimates number of necessary
transaction credits using ocfs2_calc_extend_credits().  This however does
not take into account that the IO could be arbitrarily large and can
contain arbitrary number of extents.

Extent tree manipulations do often extend the current transaction but not
in all of the cases.  For example if we have only single block extents in
the tree, ocfs2_mark_extent_written() will end up calling
ocfs2_replace_extent_rec() all the time and we will never extend the
current transaction and eventually exhaust all the transaction credits if
the IO contains many single block extents.  Once that happens a
WARN_ON(jbd2_handle_buffer_credits(handle) <= 0) is triggered in
jbd2_journal_dirty_metadata() and subsequently OCFS2 aborts in response to
this error.  This was actually triggered by one of our customers on a
heavily fragmented OCFS2 filesystem.

To fix the issue make sure the transaction always has enough credits for
one extent insert before each call of ocfs2_mark_extent_written().

Heming Zhao said:

------
PANIC: "Kernel panic - not syncing: OCFS2: (device dm-1): panic forced after error"

PID: xxx  TASK: xxxx  CPU: 5  COMMAND: "SubmitThread-CA"
  #0 machine_kexec at ffffffff8c069932
  #1 __crash_kexec at ffffffff8c1338fa
  #2 panic at ffffffff8c1d69b9
  #3 ocfs2_handle_error at ffffffffc0c86c0c [ocfs2]
  #4 __ocfs2_abort at ffffffffc0c88387 [ocfs2]
  #5 ocfs2_journal_dirty at ffffffffc0c51e98 [ocfs2]
  #6 ocfs2_split_extent at ffffffffc0c27ea3 [ocfs2]
  #7 ocfs2_change_extent_flag at ffffffffc0c28053 [ocfs2]
  #8 ocfs2_mark_extent_written at ffffffffc0c28347 [ocfs2]
  #9 ocfs2_dio_end_io_write at ffffffffc0c2bef9 [ocfs2]
 #10 ocfs2_dio_end_io at ffffffffc0c2c0f5 [ocfs2]
 #11 dio_complete at ffffffff8c2b9fa7
 #12 do_blockdev_direct_IO at ffffffff8c2bc09f
 #13 ocfs2_direct_IO at ffffffffc0c2b653 [ocfs2]
 #14 generic_file_direct_write at ffffffff8c1dcf14
 #15 __generic_file_write_iter at ffffffff8c1dd07b
 #16 ocfs2_file_write_iter at ffffffffc0c49f1f [ocfs2]
 #17 aio_write at ffffffff8c2cc72e
 #18 kmem_cache_alloc at ffffffff8c248dde
 #19 do_io_submit at ffffffff8c2ccada
 #20 do_syscall_64 at ffffffff8c004984
 #21 entry_SYSCALL_64_after_hwframe at ffffffff8c8000ba

Link: https://lkml.kernel.org/r/20240617095543.6971-1-jack@suse.cz
Link: https://lkml.kernel.org/r/20240614145243.8837-1-jack@suse.cz
Fixes: c15471f79506 ("ocfs2: fix sparse file & data ordering issue in direct io")
Signed-off-by: Jan Kara
Reviewed-by: Joseph Qi
Reviewed-by: Heming Zhao
Cc: Mark Fasheh
Cc: Joel Becker
Cc: Junxiao Bi
Cc: Changwei Ge
Cc: Gang He
Cc: Jun Piao
Cc:
Signed-off-by: Andrew Morton
Signed-off-by: Greg Kroah-Hartman
(backported from commit a68b896aa56e435506453ec8835bc991ec3ae687 linux-5.10.y)
[mpellizzer: backported using handle->h_buffer_credits instead of
jbd2_handle_buffer_credits(handle) since the latter is not defined in focal]
CVE-2024-42077
Signed-off-by: Massimiliano Pellizzer
---
 fs/ocfs2/aops.c        |  5 +++++
 fs/ocfs2/journal.c     | 17 +++++++++++++++++
 fs/ocfs2/journal.h     |  2 ++
 fs/ocfs2/ocfs2_trace.h |  2 ++
 4 files changed, 26 insertions(+)

diff --git a/fs/ocfs2/aops.c b/fs/ocfs2/aops.c
index 91702ebffe84c..6b7b411f2e495 100644
--- a/fs/ocfs2/aops.c
+++ b/fs/ocfs2/aops.c
@@ -2382,6 +2382,11 @@ static int ocfs2_dio_end_io_write(struct inode *inode,
 	}
 
 	list_for_each_entry(ue, &dwc->dw_zero_list, ue_node) {
+		ret = ocfs2_assure_trans_credits(handle, credits);
+		if (ret < 0) {
+			mlog_errno(ret);
+			break;
+		}
 		ret = ocfs2_mark_extent_written(inode, &et, handle,
 						ue->ue_cpos, 1,
 						ue->ue_phys,
diff --git a/fs/ocfs2/journal.c b/fs/ocfs2/journal.c
index 22e9a92ff9682..7d1b522a86306 100644
--- a/fs/ocfs2/journal.c
+++ b/fs/ocfs2/journal.c
@@ -449,6 +449,23 @@ int ocfs2_extend_trans(handle_t *handle, int nblocks)
 	return status;
 }
 
+/*
+ * Make sure handle has at least 'nblocks' credits available. If it does not
+ * have that many credits available, we will try to extend the handle to have
+ * enough credits. If that fails, we will restart transaction to have enough
+ * credits. Similar notes regarding data consistency and locking implications
+ * as for ocfs2_extend_trans() apply here.
+ */
+int ocfs2_assure_trans_credits(handle_t *handle, int nblocks)
+{
+	int old_nblks = handle->h_buffer_credits;
+
+	trace_ocfs2_assure_trans_credits(old_nblks);
+	if (old_nblks >= nblocks)
+		return 0;
+	return ocfs2_extend_trans(handle, nblocks - old_nblks);
+}
+
 /*
  * If we have fewer than thresh credits, extend by OCFS2_MAX_TRANS_DATA.
  * If that fails, restart the transaction & regain write access for the
diff --git a/fs/ocfs2/journal.h b/fs/ocfs2/journal.h
index eb7a21bac71ef..bc5d77cb3c500 100644
--- a/fs/ocfs2/journal.h
+++ b/fs/ocfs2/journal.h
@@ -244,6 +244,8 @@ handle_t *ocfs2_start_trans(struct ocfs2_super *osb,
 int			     ocfs2_commit_trans(struct ocfs2_super *osb,
 						handle_t *handle);
 int			     ocfs2_extend_trans(handle_t *handle, int nblocks);
+int			     ocfs2_assure_trans_credits(handle_t *handle,
+						int nblocks);
 int			     ocfs2_allocate_extend_trans(handle_t *handle,
 						int thresh);
 
diff --git a/fs/ocfs2/ocfs2_trace.h b/fs/ocfs2/ocfs2_trace.h
index dc4bce1649c1b..7a9cfd61145a0 100644
--- a/fs/ocfs2/ocfs2_trace.h
+++ b/fs/ocfs2/ocfs2_trace.h
@@ -2578,6 +2578,8 @@ DEFINE_OCFS2_ULL_UINT_EVENT(ocfs2_commit_cache_end);
 
 DEFINE_OCFS2_INT_INT_EVENT(ocfs2_extend_trans);
 
+DEFINE_OCFS2_INT_EVENT(ocfs2_assure_trans_credits);
+
 DEFINE_OCFS2_INT_EVENT(ocfs2_extend_trans_restart);
 
 DEFINE_OCFS2_INT_INT_EVENT(ocfs2_allocate_extend_trans);
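
Illustration only, not part of the patch: the credit accounting described in
the commit message can be modelled in user space. The sketch below is a
simplification under stated assumptions. struct model_handle and the
model_*() helpers are hypothetical names, each extent is assumed to consume a
fixed number of credits, and extending the handle is assumed to always
succeed, whereas the real ocfs2_extend_trans() may fail or restart the
transaction.

```c
/* Hypothetical stand-in for a journal handle: only the remaining credits. */
struct model_handle {
	int credits;
};

/*
 * Model of extending a handle by 'nblocks' credits. Assumption: this always
 * succeeds; the real ocfs2_extend_trans() can fail or restart.
 */
static int model_extend_trans(struct model_handle *h, int nblocks)
{
	h->credits += nblocks;
	return 0;
}

/*
 * Same shape as ocfs2_assure_trans_credits() in the patch: do nothing if the
 * handle already has 'nblocks' credits, otherwise extend by the shortfall.
 */
static int model_assure_trans_credits(struct model_handle *h, int nblocks)
{
	int old_nblks = h->credits;

	if (old_nblks >= nblocks)
		return 0;
	return model_extend_trans(h, nblocks - old_nblks);
}

/*
 * Walk 'extents' extents, each consuming 'cost' credits (an assumed fixed
 * cost). When 'assure' is non-zero the handle is topped up to 'needed'
 * credits before every extent, as the patched loop does. Returns the number
 * of extents completed before the credits ran out.
 */
static int model_process_extents(struct model_handle *h, int extents,
				 int needed, int cost, int assure)
{
	int done = 0;
	int i;

	for (i = 0; i < extents; i++) {
		if (assure && model_assure_trans_credits(h, needed) < 0)
			break;
		if (h->credits < cost)	/* models the exhausted-credits abort */
			break;
		h->credits -= cost;
		done++;
	}
	return done;
}
```

With a handle that starts at 8 credits, 100 extents, and an assumed cost of 2
credits per extent, model_process_extents() completes 4 extents when called
with assure == 0 and all 100 when called with assure == 1 and needed == 8.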