From patchwork Mon May 6 13:52:38 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "H.J. Lu" X-Patchwork-Id: 1932002 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20230601 header.b=lci8E+3k; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=sourceware.org (client-ip=2620:52:3:1:0:246e:9693:128c; helo=server2.sourceware.org; envelope-from=libc-alpha-bounces+incoming=patchwork.ozlabs.org@sourceware.org; receiver=patchwork.ozlabs.org) Received: from server2.sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4VY2tJ2PVnz1xnT for ; Mon, 6 May 2024 23:53:12 +1000 (AEST) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 7E865385840B for ; Mon, 6 May 2024 13:53:10 +0000 (GMT) X-Original-To: libc-alpha@sourceware.org Delivered-To: libc-alpha@sourceware.org Received: from mail-pl1-x630.google.com (mail-pl1-x630.google.com [IPv6:2607:f8b0:4864:20::630]) by sourceware.org (Postfix) with ESMTPS id 879003858D1E for ; Mon, 6 May 2024 13:52:48 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 879003858D1E Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=gmail.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 879003858D1E Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=2607:f8b0:4864:20::630 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1715003575; cv=none; b=l0jR6CsRvQM0kaaofVhJ8QTyHDTSXaVcU9GpYGo7iRAhTRl3NN/ZW8vqKrYyJa2L3CCL6EkY6MficpDkzIJX2pNjbHZP6MPwz9blwq2N+9TpyUbFr1J+74ur8O9uiHLCMH4ssmb1IVAmGsZG3S0+AhPI5eU/jWTXoyxl+qnWZqg= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1715003575; c=relaxed/simple; bh=pMfvbwFdISjKgBhAm+OMV9Yh8bCqJXEXnIBb8HVyCWw=; h=DKIM-Signature:From:To:Subject:Date:Message-ID:MIME-Version; b=t43bXzgjhnYyl9qqp5QZp47iekiAJQ9roj6tSSsvUTT5j9UbxMgE5J6ScPvR0HZ3HVVTMg4uyNA9YLGYtrIGwpQLSf7cndhfLypXKRIJkvLlhizAfdxuL4Udc77o5uLnWC5uWPL320JZWDbHS3A3fqCGn8QDWz7MmVskVIaCBWo= ARC-Authentication-Results: i=1; server2.sourceware.org Received: by mail-pl1-x630.google.com with SMTP id d9443c01a7336-1ecd3867556so12574565ad.0 for ; Mon, 06 May 2024 06:52:48 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1715003567; x=1715608367; darn=sourceware.org; h=content-transfer-encoding:mime-version:message-id:date:subject:cc :to:from:from:to:cc:subject:date:message-id:reply-to; bh=v45NNayF1SCY+YN6E2PvhuRG5nbuWhxzXzsnIUpbVVI=; b=lci8E+3kZpp3FJqFXgLcIjpgqVfvq0knr/3Zx9AbzzNzOFojqiVRU1/ZpdOxiKWfZR 9CGltlV4Wsr5FucVbrlxyCg61/oNtgxSNAAg1YhNTQOyShkvAcQt/2GaEtAjGjt9A5p4 Hp9kP/Y6z2c1D/EcA6Gjo5E7mXCx8FwjLTOOKA65/SiGlSt2xJLnSm+6sizsRM8qz+Qt pKr9iSZqVPzoCZerRSmLcBP51RV745250cHLoOBplE4c0E7n0FbNNcgOaGDQKeLh17fv t7Ov/zNsLSO2ZeT3eukUQy651rIBRIy3ILjNPr+ZwCY9jDr6OZAWamdpE6CxbPahCqTi Jh1w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1715003567; x=1715608367; h=content-transfer-encoding:mime-version:message-id:date:subject:cc :to:from:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=v45NNayF1SCY+YN6E2PvhuRG5nbuWhxzXzsnIUpbVVI=; b=MdMlYfAfVh0L+kd/17Hj4rS6zo923BQlRzG6LKSBdlZteUOz8iivlaKwz0aX1M5CWa x2aeGZGrRybiq7aub0eZhJVScw+pZE+ozYoBC7tFFvVpxxeTdYUU9I6VWENNLzhXOKXA XByDt+Hj+gx3gIiX1/xxTpiAiScb6OM6QRWYsYo6aKa57pCqsAZZOnzlMTphg/KzTS0K EjziQODj5LFfyAm2y7VTZ7Rz4om7zx5PWEgWghqEWNsf6Kh0oGbTDKUb4y+EDhuLZ6gS Qrra+jFdDZ/pj7Je6tbWjIwEJ7H2fFlpbdwmhXWPN1rZIEO2mHJpEvWhQyPjyG3t32ke Evlg== X-Gm-Message-State: AOJu0YyJNW5Bn+Bqgxdubanj93uZZ/TyhPAfM4BrDzZSH9LPutshL89E FMfGcV+/GOp2omoOS6NBhajcK3YOUdDekHA3i98koJTpzEiCtJ8J X-Google-Smtp-Source: AGHT+IEMMjlgUeOBntEa1EB/CaUcMw99abKcFK1NiBRjG/57zHC57uqzOvBwORpaQ26wkWi9TgGqHw== X-Received: by 2002:a17:902:f681:b0:1eb:5323:c320 with SMTP id l1-20020a170902f68100b001eb5323c320mr10873676plg.56.1715003567091; Mon, 06 May 2024 06:52:47 -0700 (PDT) Received: from gnu-cfl-3.localdomain ([172.56.168.158]) by smtp.gmail.com with ESMTPSA id iz14-20020a170902ef8e00b001eac94472f6sm8314762plb.93.2024.05.06.06.52.46 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 06 May 2024 06:52:46 -0700 (PDT) Received: from gnu-cfl-3.. (localhost [IPv6:::1]) by gnu-cfl-3.localdomain (Postfix) with ESMTP id 23739740151; Mon, 6 May 2024 06:52:45 -0700 (PDT) From: "H.J. Lu" To: libc-alpha@sourceware.org Cc: fweimer@redhat.com, alexandre.ferrieux@orange.com Subject: [PATCH v3] Fix #27777 - now use a doubly-linked list for _IO_list_all Date: Mon, 6 May 2024 06:52:38 -0700 Message-ID: <20240506135238.1344580-1-hjl.tools@gmail.com> X-Mailer: git-send-email 2.45.0 MIME-Version: 1.0 X-Spam-Status: No, score=-3022.2 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, FREEMAIL_FROM, GIT_PATCH_0, KAM_SHORT, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: libc-alpha@sourceware.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Libc-alpha mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: libc-alpha-bounces+incoming=patchwork.ozlabs.org@sourceware.org From: Alexandre Ferrieux This patch fixes BZ #27777 "fclose does a linear search, takes ages when many FILE* are opened". Simply put, the master list of opened (FILE*), namely _IO_list_all, is a singly-linked list. As a consequence, the removal of a single element is in O(N), which cripples the performance of fclose(). The patch switches to a doubly-linked list, yielding O(1) removal. The one padding field in struct _IO_FILE, __pad5, is renamed to _prevchain for a doubly-linked list. Since fields in struct _IO_FILE after the _lock field are internal to glibc and opaque to applications. We can change them as long as the size of struct _IO_FILE is unchanged, which is checked as the part of glibc ABI with sizes of _IO_2_1_stdin_, _IO_2_1_stdout_ and _IO_2_1_stderr_. NB: When _IO_vtable_offset (fp) == 0, copy relocation will cover the whole struct _IO_FILE. Otherwise, only fields up to the _lock field will be copied to applications at run-time. It is used to check if the _prevchain field can be safely accessed. As tst-fclose.c shows, after opening 2 million (FILE*), the fclose() of 100 of them takes quite a few seconds without the patch, and under 2 seconds with it using "make check -jN" on a loaded machine. Co-Authored-By: H.J. Lu --- libio/Makefile | 1 + libio/bits/types/struct_FILE.h | 4 +-- libio/genops.c | 26 ++++++++++++++++ libio/stdfiles.c | 15 +++++++++ libio/tst-fclose.c | 56 ++++++++++++++++++++++++++++++++++ 5 files changed, 100 insertions(+), 2 deletions(-) create mode 100644 libio/tst-fclose.c diff --git a/libio/Makefile b/libio/Makefile index 0c1f16ee3b..7f598c840c 100644 --- a/libio/Makefile +++ b/libio/Makefile @@ -94,6 +94,7 @@ tests = \ tst-eof \ tst-ext \ tst-ext2 \ + tst-fclose \ tst-fgetc-after-eof \ tst-fgetwc \ tst-fgetws \ diff --git a/libio/bits/types/struct_FILE.h b/libio/bits/types/struct_FILE.h index 7cdaae86f8..d8d26639d1 100644 --- a/libio/bits/types/struct_FILE.h +++ b/libio/bits/types/struct_FILE.h @@ -92,10 +92,10 @@ struct _IO_FILE_complete struct _IO_wide_data *_wide_data; struct _IO_FILE *_freeres_list; void *_freeres_buf; - size_t __pad5; + struct _IO_FILE **_prevchain; int _mode; /* Make sure we don't get into trouble again. */ - char _unused2[15 * sizeof (int) - 4 * sizeof (void *) - sizeof (size_t)]; + char _unused2[15 * sizeof (int) - 5 * sizeof (void *)]; }; /* These macros are used by bits/stdio.h and internal headers. */ diff --git a/libio/genops.c b/libio/genops.c index bc45e60a09..994ee9c0b1 100644 --- a/libio/genops.c +++ b/libio/genops.c @@ -48,6 +48,19 @@ flush_cleanup (void *not_used) } #endif +/* Fields in struct _IO_FILE after the _lock field are internal to + glibc and opaque to applications. We can change them as long as + the size of struct _IO_FILE is unchanged, which is checked as the + part of glibc ABI with sizes of _IO_2_1_stdin_, _IO_2_1_stdout_ + and _IO_2_1_stderr_. + + NB: When _IO_vtable_offset (fp) == 0, copy relocation will cover the + whole struct _IO_FILE. Otherwise, only fields up to the _lock field + will be copied. */ +_Static_assert (offsetof (struct _IO_FILE, _prevchain) + > offsetof (struct _IO_FILE, _lock), + "offset of _prevchain > offset of _lock"); + void _IO_un_link (struct _IO_FILE_plus *fp) { @@ -62,6 +75,14 @@ _IO_un_link (struct _IO_FILE_plus *fp) #endif if (_IO_list_all == NULL) ; + else if (_IO_vtable_offset ((FILE *) fp) == 0) + { + FILE **pr = fp->file._prevchain; + FILE *nx = fp->file._chain; + *pr = nx; + if (nx != NULL) + nx->_prevchain = pr; + } else if (fp == _IO_list_all) _IO_list_all = (struct _IO_FILE_plus *) _IO_list_all->file._chain; else @@ -95,6 +116,11 @@ _IO_link_in (struct _IO_FILE_plus *fp) _IO_flockfile ((FILE *) fp); #endif fp->file._chain = (FILE *) _IO_list_all; + if (_IO_vtable_offset ((FILE *) fp) == 0) + { + fp->file._prevchain = (FILE **) &_IO_list_all; + _IO_list_all->file._prevchain = &fp->file._chain; + } _IO_list_all = fp; #ifdef _IO_MTSAFE_IO _IO_funlockfile ((FILE *) fp); diff --git a/libio/stdfiles.c b/libio/stdfiles.c index cd8eca8bf3..d607fa02e0 100644 --- a/libio/stdfiles.c +++ b/libio/stdfiles.c @@ -54,4 +54,19 @@ DEF_STDFILE(_IO_2_1_stdout_, 1, &_IO_2_1_stdin_, _IO_NO_READS); DEF_STDFILE(_IO_2_1_stderr_, 2, &_IO_2_1_stdout_, _IO_NO_READS+_IO_UNBUFFERED); struct _IO_FILE_plus *_IO_list_all = &_IO_2_1_stderr_; + +/* Finish the double-linking for stdfiles as static initialization + cannot. */ + +__THROW __attribute__ ((constructor)) +static void +_IO_stdfiles_init (void) +{ + struct _IO_FILE **f; + for (f = (struct _IO_FILE **) &_IO_list_all; + *f != NULL; + f = &(*f)->_chain) + (*f)->_prevchain = f; +} + libc_hidden_data_def (_IO_list_all) diff --git a/libio/tst-fclose.c b/libio/tst-fclose.c new file mode 100644 index 0000000000..bc0c216e29 --- /dev/null +++ b/libio/tst-fclose.c @@ -0,0 +1,56 @@ +/* Verify fclose performance with many opened files. + Copyright (C) 2024 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . */ + +#include +#include +#include +#include + +#define N 2000000 +#define M 100 + +static int +do_test (void) +{ + FILE *ff, *keep[M]; + int i; + + fprintf (stderr, "Preparing %d FILEs ...\n", N); + fflush (stderr); + for (i = 0; i < N; i++) + { + ff = fdopen (STDIN_FILENO, "r"); + if (!ff) + { + fprintf (stderr, "### failed to fdopen: %m\n"); + return EXIT_FAILURE; + } + if (i < M) + keep[i] = ff; + } + fprintf (stderr, "Now fclosing %d FILEs ...", M); + fflush (stderr); + for (i = 0; i < M; i++) + fclose (keep[i]); + fprintf (stderr, "DONE\n"); + fflush (stderr); + return EXIT_SUCCESS; +} + +#define TIMEOUT 2 +#include