From patchwork Sat Nov 2 17:51:14 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Rong Xu
X-Patchwork-Id: 2005535
Date: Sat, 2 Nov 2024 10:51:14 -0700
In-Reply-To: <20241102175115.1769468-1-xur@google.com>
References: <20241102175115.1769468-1-xur@google.com>
X-Mailer: git-send-email 2.47.0.163.g1226f6d8fa-goog
Message-ID: <20241102175115.1769468-8-xur@google.com>
Subject: [PATCH v7 7/7] Add Propeller configuration for kernel build
From: Rong Xu
To: Alice Ryhl, Andrew Morton, Arnd Bergmann, Bill Wendling,
 Borislav Petkov, Breno Leitao, Brian Gerst, Dave Hansen, David Li,
 Han Shen, Heiko Carstens, "H. Peter Anvin", Ingo Molnar, Jann Horn,
 Jonathan Corbet, Josh Poimboeuf, Juergen Gross, Justin Stitt, Kees Cook,
 Masahiro Yamada, "Mike Rapoport (IBM)", Nathan Chancellor,
 Nick Desaulniers, Nicolas Schier, "Paul E. McKenney", Peter Zijlstra,
 Rong Xu, Sami Tolvanen, Thomas Gleixner, Wei Yang,
 workflows@vger.kernel.org, Miguel Ojeda, Maksim Panchenko, "David S.
 Miller", Andreas Larsson, Yonghong Song, Yabin Cui,
 Krzysztof Pszeniczny, Sriraman Tallam, Stephane Eranian
Cc: x86@kernel.org, linux-arch@vger.kernel.org, sparclinux@vger.kernel.org,
 linux-doc@vger.kernel.org, linux-kbuild@vger.kernel.org,
 linux-kernel@vger.kernel.org, llvm@lists.linux.dev

Add the build support for using Clang's Propeller optimizer. Like
AutoFDO, Propeller uses hardware sampling to gather information
about the frequency of execution of different code paths within a
binary. This information is then used to guide the compiler's
optimization decisions, resulting in a more efficient binary.

The support requires Clang from LLVM 19 or later, and the
create_llvm_prof tool
(https://github.com/google/autofdo/releases/tag/v0.30.1). This
commit is limited to x86 platforms that support PMU features
like LBR on Intel machines and AMD Zen3 BRS.

Here is an example workflow for building an AutoFDO+Propeller
optimized kernel:

1) Build the kernel on the host machine, with AutoFDO and Propeller
   build config
      CONFIG_AUTOFDO_CLANG=y
      CONFIG_PROPELLER_CLANG=y
   then
      $ make LLVM=1 CLANG_AUTOFDO_PROFILE=<autofdo_profile>

“<autofdo_profile>” is the profile collected when doing a non-Propeller
AutoFDO build. This step builds a kernel that has the same optimization
level as AutoFDO, plus a metadata section that records basic block
information. This kernel image runs as fast as an AutoFDO optimized
kernel.

2) Install the kernel on test/production machines.

3) Run the load tests. The '-c' option in perf specifies the sample
   event period. We suggest using a suitable prime number,
   like 500009, for this purpose.
   For Intel platforms:
      $ perf record -e BR_INST_RETIRED.NEAR_TAKEN:k -a -N -b -c <count> \
        -o <perf_file> -- <loadtest>
   For AMD platforms:
      The supported systems are: Zen3 with BRS, or Zen4 with amd_lbr_v2
      # To see if Zen3 supports LBR:
      $ cat /proc/cpuinfo | grep " brs"
      # To see if Zen4 supports LBR:
      $ cat /proc/cpuinfo | grep amd_lbr_v2
      # If the result is yes, then collect the profile using:
      $ perf record --pfm-events RETIRED_TAKEN_BRANCH_INSTRUCTIONS:k -a \
        -N -b -c <count> -o <perf_file> -- <loadtest>

4) (Optional) Download the raw perf file to the host machine.

5) Generate Propeller profile:
   $ create_llvm_prof --binary=<vmlinux> --profile=<perf_file> \
                      --format=propeller --propeller_output_module_name \
                      --out=<propeller_profile_prefix>_cc_profile.txt \
                      --propeller_symorder=<propeller_profile_prefix>_ld_profile.txt

   “create_llvm_prof” is the profile conversion tool, and a prebuilt
   binary for Linux can be found on
   https://github.com/google/autofdo/releases/tag/v0.30.1 (it can also
   be built from source).

   "<propeller_profile_prefix>" can be something like
   "/home/user/dir/any_string".

   This command generates a pair of Propeller profiles:
   "<propeller_profile_prefix>_cc_profile.txt" and
   "<propeller_profile_prefix>_ld_profile.txt".

6) Rebuild the kernel using the AutoFDO and Propeller profile files.
      CONFIG_AUTOFDO_CLANG=y
      CONFIG_PROPELLER_CLANG=y
   and
      $ make LLVM=1 CLANG_AUTOFDO_PROFILE=<autofdo_profile> \
             CLANG_PROPELLER_PROFILE_PREFIX=<propeller_profile_prefix>

Co-developed-by: Han Shen
Signed-off-by: Han Shen
Signed-off-by: Rong Xu
Suggested-by: Sriraman Tallam
Suggested-by: Krzysztof Pszeniczny
Suggested-by: Nick Desaulniers
Suggested-by: Stephane Eranian
Tested-by: Yonghong Song
Tested-by: Nathan Chancellor
Reviewed-by: Kees Cook
---
 Documentation/dev-tools/index.rst     |   1 +
 Documentation/dev-tools/propeller.rst | 162 ++++++++++++++++++++++++++
 MAINTAINERS                           |   7 ++
 Makefile                              |   1 +
 arch/Kconfig                          |  19 +++
 arch/x86/Kconfig                      |   1 +
 arch/x86/kernel/vmlinux.lds.S         |   4 +
 include/asm-generic/vmlinux.lds.h     |   6 +-
 scripts/Makefile.lib                  |  10 ++
 scripts/Makefile.propeller            |  28 +++++
 tools/objtool/check.c                 |   1 +
 11 files changed, 237 insertions(+), 3 deletions(-)
 create mode 100644 Documentation/dev-tools/propeller.rst
 create mode 100644 scripts/Makefile.propeller

diff --git a/Documentation/dev-tools/index.rst b/Documentation/dev-tools/index.rst
index 6945644f7008a..3c0ac08b27091 100644
--- a/Documentation/dev-tools/index.rst
+++ b/Documentation/dev-tools/index.rst
@@ -35,6 +35,7 @@ Documentation/dev-tools/testing-overview.rst
    checkuapi
    gpio-sloppy-logic-analyzer
    autofdo
+   propeller
 
 
 .. only:: subproject and html
diff --git a/Documentation/dev-tools/propeller.rst b/Documentation/dev-tools/propeller.rst
new file mode 100644
index 0000000000000..92195958e3dbc
--- /dev/null
+++ b/Documentation/dev-tools/propeller.rst
@@ -0,0 +1,162 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+=====================================
+Using Propeller with the Linux kernel
+=====================================
+
+This enables Propeller build support for the kernel when using the Clang
+compiler. Propeller is a profile-guided optimization (PGO) method used
+to optimize binary executables. Like AutoFDO, it utilizes hardware
+sampling to gather information about the frequency of execution of
+different code paths within a binary. Unlike AutoFDO, this information
+is then used right before the linking phase to optimize (among others)
+block layout within and across functions.
+
+A few important notes about adopting Propeller optimization:
+
+#. Although it can be used as a standalone optimization step, it is
+   strongly recommended to apply Propeller on top of AutoFDO,
+   AutoFDO+ThinLTO or Instrument FDO. The rest of this document
+   assumes this paradigm.
+
+#. Propeller uses another round of profiling on top of
+   AutoFDO/AutoFDO+ThinLTO/iFDO. The whole build process involves
+   "build-afdo - train-afdo - build-propeller - train-propeller -
+   build-optimized".
+
+#. Propeller requires LLVM 19 release or later for Clang/Clang++
+   and the linker (ld.lld).
+
+#. In addition to the LLVM toolchain, Propeller requires a profiling
+   conversion tool: https://github.com/google/autofdo with a release
+   after v0.30.1: https://github.com/google/autofdo/releases/tag/v0.30.1.
+
+The Propeller optimization process involves the following steps:
+
+#. Initial building: Build the AutoFDO or AutoFDO+ThinLTO binary as
+   you would normally do, but with a set of compile-time / link-time
+   flags, so that a special metadata section is created within the
+   kernel binary. The special section is only intended to be used by the
+   profiling tool; it is not part of the runtime image, nor does it
+   change kernel run time text sections.
+
+#. Profiling: The above kernel is then run with a representative
+   workload to gather execution frequency data. This data is collected
+   using hardware sampling, via perf. Propeller is most effective on
+   platforms supporting advanced PMU features like LBR on Intel
+   machines. This step is the same as profiling the kernel for AutoFDO
+   (the exact perf parameters can be different).
+
+#. Propeller profile generation: The perf output file is converted to a
+   pair of Propeller profiles via an offline tool.
+
+#. Optimized build: Build the AutoFDO or AutoFDO+ThinLTO optimized
+   binary as you would normally do, but with a compile-time /
+   link-time flag to pick up the Propeller compile time and link time
+   profiles. This build step uses 3 profiles - the AutoFDO profile,
+   the Propeller compile-time profile and the Propeller link-time
+   profile.
+
+#. Deployment: The optimized kernel binary is deployed and used
+   in production environments, providing improved performance
+   and reduced latency.
+
+Preparation
+===========
+
+Configure the kernel with::
+
+   CONFIG_AUTOFDO_CLANG=y
+   CONFIG_PROPELLER_CLANG=y
+
+Customization
+=============
+
+The default CONFIG_PROPELLER_CLANG setting covers kernel space objects
+for Propeller builds. One can, however, enable or disable Propeller build
+for individual files and directories by adding a line similar to the
+following to the respective kernel Makefile:
+
+- For enabling a single file (e.g. foo.o)::
+
+   PROPELLER_PROFILE_foo.o := y
+
+- For enabling all files in one directory::
+
+   PROPELLER_PROFILE := y
+
+- For disabling one file::
+
+   PROPELLER_PROFILE_foo.o := n
+
+- For disabling all files in one directory::
+
+   PROPELLER_PROFILE := n
+
+
+Workflow
+========
+
+Here is an example workflow for building an AutoFDO+Propeller kernel:
+
+1) Assuming an AutoFDO profile is already collected following
+   instructions in the AutoFDO document, build the kernel on the host
+   machine, with AutoFDO and Propeller build configs ::
+
+      CONFIG_AUTOFDO_CLANG=y
+      CONFIG_PROPELLER_CLANG=y
+
+   and ::
+
+      $ make LLVM=1 CLANG_AUTOFDO_PROFILE=<autofdo_profile>
+
+2) Install the kernel on the test machine.
+
+3) Run the load tests. The '-c' option in perf specifies the sample
+   event period. We suggest using a suitable prime number, like 500009,
+   for this purpose.
+
+   - For Intel platforms::
+
+      $ perf record -e BR_INST_RETIRED.NEAR_TAKEN:k -a -N -b -c <count> -o <perf_file> -- <loadtest>
+
+   - For AMD platforms::
+
+      $ perf record --pfm-events RETIRED_TAKEN_BRANCH_INSTRUCTIONS:k -a -N -b -c <count> -o <perf_file> -- <loadtest>
+
+   Note you can repeat the above steps to collect multiple <perf_file>s.
+
+4) (Optional) Download the raw perf file(s) to the host machine.
+
+5) Use the create_llvm_prof tool (https://github.com/google/autofdo) to
+   generate Propeller profile. ::
+
+      $ create_llvm_prof --binary=<vmlinux> --profile=<perf_file>
+                         --format=propeller --propeller_output_module_name
+                         --out=<propeller_profile_prefix>_cc_profile.txt
+                         --propeller_symorder=<propeller_profile_prefix>_ld_profile.txt
+
+   "<propeller_profile_prefix>" can be something like "/home/user/dir/any_string".
+
+   This command generates a pair of Propeller profiles:
+   "<propeller_profile_prefix>_cc_profile.txt" and
+   "<propeller_profile_prefix>_ld_profile.txt".
+
+   If there are more than 1 perf_file collected in the previous step,
+   you can create a temp list file "<perf_file_list>" with each line
+   containing one perf file name and run::
+
+      $ create_llvm_prof --binary=<vmlinux> --profile=@<perf_file_list>
+                         --format=propeller --propeller_output_module_name
+                         --out=<propeller_profile_prefix>_cc_profile.txt
+                         --propeller_symorder=<propeller_profile_prefix>_ld_profile.txt
+
+6) Rebuild the kernel using the AutoFDO and Propeller
+   profiles. ::
+
+      CONFIG_AUTOFDO_CLANG=y
+      CONFIG_PROPELLER_CLANG=y
+
+   and ::
+
+      $ make LLVM=1 CLANG_AUTOFDO_PROFILE=<autofdo_profile> CLANG_PROPELLER_PROFILE_PREFIX=<propeller_profile_prefix>
diff --git a/MAINTAINERS b/MAINTAINERS
index d6ea49433747a..42e3af0791e15 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -18449,6 +18449,13 @@ S:	Maintained
 F:	include/linux/psi*
 F:	kernel/sched/psi.c
 
+PROPELLER BUILD
+M:	Rong Xu
+M:	Han Shen
+S:	Supported
+F:	Documentation/dev-tools/propeller.rst
+F:	scripts/Makefile.propeller
+
 PRINTK
 M:	Petr Mladek
 R:	Steven Rostedt
diff --git a/Makefile b/Makefile
index b89d87b5dca79..b2830b27e1a7f 100644
--- a/Makefile
+++ b/Makefile
@@ -1019,6 +1019,7 @@ include-$(CONFIG_UBSAN)		+= scripts/Makefile.ubsan
 include-$(CONFIG_KCOV)		+= scripts/Makefile.kcov
 include-$(CONFIG_RANDSTRUCT)	+= scripts/Makefile.randstruct
 include-$(CONFIG_AUTOFDO_CLANG)	+= scripts/Makefile.autofdo
+include-$(CONFIG_PROPELLER_CLANG)	+= scripts/Makefile.propeller
 include-$(CONFIG_GCC_PLUGINS)	+= scripts/Makefile.gcc-plugins
 
 include $(addprefix $(srctree)/, $(include-y))
diff --git a/arch/Kconfig b/arch/Kconfig
index 8dca3b5e6ef53..00551f340dbe3 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -831,6 +831,25 @@ config AUTOFDO_CLANG
 
 	  If unsure, say N.
 
+config ARCH_SUPPORTS_PROPELLER_CLANG
+	bool
+
+config PROPELLER_CLANG
+	bool "Enable Clang's Propeller build"
+	depends on ARCH_SUPPORTS_PROPELLER_CLANG
+	depends on CC_IS_CLANG && CLANG_VERSION >= 190000
+	help
+	  This option enables Clang’s Propeller build. When the Propeller
+	  profiles are specified in variable CLANG_PROPELLER_PROFILE_PREFIX
+	  during the build process, Clang uses the profiles to optimize
+	  the kernel.
+
+	  If no profile is specified, Propeller options are still passed
+	  to Clang to facilitate the collection of perf data for creating
+	  the Propeller profiles in subsequent builds.
+
+	  If unsure, say N.
+
 config ARCH_SUPPORTS_CFI_CLANG
 	bool
 	help
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 9dc87661fb373..89b8fc452a7cf 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -127,6 +127,7 @@ config X86
 	select ARCH_SUPPORTS_LTO_CLANG_THIN
 	select ARCH_SUPPORTS_RT
 	select ARCH_SUPPORTS_AUTOFDO_CLANG
+	select ARCH_SUPPORTS_PROPELLER_CLANG	if X86_64
 	select ARCH_USE_BUILTIN_BSWAP
 	select ARCH_USE_CMPXCHG_LOCKREF		if X86_CMPXCHG64
 	select ARCH_USE_MEMTEST
diff --git a/arch/x86/kernel/vmlinux.lds.S b/arch/x86/kernel/vmlinux.lds.S
index b8c5741d2fb48..cf22081601ed6 100644
--- a/arch/x86/kernel/vmlinux.lds.S
+++ b/arch/x86/kernel/vmlinux.lds.S
@@ -443,6 +443,10 @@ SECTIONS
 
 	STABS_DEBUG
 	DWARF_DEBUG
+#ifdef CONFIG_PROPELLER_CLANG
+	.llvm_bb_addr_map : { *(.llvm_bb_addr_map) }
+#endif
+
 	ELF_DETAILS
 
 	DISCARDS
diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
index 8a0bb3946cf05..c995474e4c649 100644
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -95,14 +95,14 @@
  * With LTO_CLANG, the linker also splits sections by default, so we need
  * these macros to combine the sections during the final link.
  *
- * With AUTOFDO_CLANG, by default, the linker splits text sections and
- * regroups functions into subsections.
+ * With AUTOFDO_CLANG and PROPELLER_CLANG, by default, the linker splits
+ * text sections and regroups functions into subsections.
  *
  * RODATA_MAIN is not used because existing code already defines .rodata.x
  * sections to be brought in with rodata.
  */
 #if defined(CONFIG_LD_DEAD_CODE_DATA_ELIMINATION) || defined(CONFIG_LTO_CLANG) || \
-defined(CONFIG_AUTOFDO_CLANG)
+defined(CONFIG_AUTOFDO_CLANG) || defined(CONFIG_PROPELLER_CLANG)
 #define TEXT_MAIN .text .text.[0-9a-zA-Z_]*
 #else
 #define TEXT_MAIN .text
diff --git a/scripts/Makefile.lib b/scripts/Makefile.lib
index 2d0942c1a0277..e7859ad90224a 100644
--- a/scripts/Makefile.lib
+++ b/scripts/Makefile.lib
@@ -201,6 +201,16 @@ _c_flags += $(if $(patsubst n%,, \
 	$(CFLAGS_AUTOFDO_CLANG))
 endif
 
+#
+# Enable Propeller build flags except some files or directories we don't want to
+# enable (depends on variables AUTOFDO_PROPELLER_obj.o and PROPELLER_PROFILE).
+#
+ifdef CONFIG_PROPELLER_CLANG
+_c_flags += $(if $(patsubst n%,, \
+	$(AUTOFDO_PROFILE_$(target-stem).o)$(AUTOFDO_PROFILE)$(PROPELLER_PROFILE))$(is-kernel-object), \
+	$(CFLAGS_PROPELLER_CLANG))
+endif
+
 # $(src) for including checkin headers from generated source files
 # $(obj) for including generated headers from checkin source files
 ifeq ($(KBUILD_EXTMOD),)
diff --git a/scripts/Makefile.propeller b/scripts/Makefile.propeller
new file mode 100644
index 0000000000000..344190717e471
--- /dev/null
+++ b/scripts/Makefile.propeller
@@ -0,0 +1,28 @@
+# SPDX-License-Identifier: GPL-2.0
+
+# Enable available and selected Clang Propeller features.
+ifdef CLANG_PROPELLER_PROFILE_PREFIX
+  CFLAGS_PROPELLER_CLANG := -fbasic-block-sections=list=$(CLANG_PROPELLER_PROFILE_PREFIX)_cc_profile.txt -ffunction-sections
+  KBUILD_LDFLAGS += --symbol-ordering-file=$(CLANG_PROPELLER_PROFILE_PREFIX)_ld_profile.txt --no-warn-symbol-ordering
+else
+  CFLAGS_PROPELLER_CLANG := -fbasic-block-sections=labels
+endif
+
+# Propeller requires debug information to embed module names in the profiles.
+# If CONFIG_DEBUG_INFO is not enabled, set -gmlt option. Skip this for AutoFDO,
+# as the option should already be set.
+ifndef CONFIG_DEBUG_INFO
+  ifndef CONFIG_AUTOFDO_CLANG
+    CFLAGS_PROPELLER_CLANG += -gmlt
+  endif
+endif
+
+ifdef CONFIG_LTO_CLANG_THIN
+  ifdef CLANG_PROPELLER_PROFILE_PREFIX
+    KBUILD_LDFLAGS += --lto-basic-block-sections=$(CLANG_PROPELLER_PROFILE_PREFIX)_cc_profile.txt
+  else
+    KBUILD_LDFLAGS += --lto-basic-block-sections=labels
+  endif
+endif
+
+export CFLAGS_PROPELLER_CLANG
diff --git a/tools/objtool/check.c b/tools/objtool/check.c
index 4c5229991e1e0..05a0fb4a3d1a0 100644
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -4558,6 +4558,7 @@ static int validate_ibt(struct objtool_file *file)
 		    !strcmp(sec->name, "__mcount_loc")			||
 		    !strcmp(sec->name, ".kcfi_traps")			||
 		    !strcmp(sec->name, ".llvm.call-graph-profile")	||
+		    !strcmp(sec->name, ".llvm_bb_addr_map")		||
 		    strstr(sec->name, "__patchable_function_entries"))
 			continue;
 
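
Reading aid, not part of the patch: scripts/Makefile.propeller chooses
between two sets of flags depending on whether
CLANG_PROPELLER_PROFILE_PREFIX is set, and adds linker flags when
CONFIG_LTO_CLANG_THIN is enabled. The shell sketch below mirrors only
that selection so it can be checked without running a kernel build. The
function name propeller_flags, its two positional arguments and its
output format are hypothetical and exist only for this illustration; the
flag strings are copied from the Makefile fragment in this patch, and the
-gmlt handling is left out.

```shell
#!/bin/sh
# Illustration only: Kbuild does not provide this function.
# $1: value of CLANG_PROPELLER_PROFILE_PREFIX (empty for the first,
#     profile-collecting build)
# $2: "y" if CONFIG_LTO_CLANG_THIN is enabled, anything else otherwise
propeller_flags() {
    prefix=$1
    thinlto=$2
    if [ -n "$prefix" ]; then
        # Optimized build: consume the compile-time and link-time profiles.
        cflags="-fbasic-block-sections=list=${prefix}_cc_profile.txt -ffunction-sections"
        ldflags="--symbol-ordering-file=${prefix}_ld_profile.txt --no-warn-symbol-ordering"
        if [ "$thinlto" = y ]; then
            ldflags="$ldflags --lto-basic-block-sections=${prefix}_cc_profile.txt"
        fi
    else
        # Profiling build: only emit the basic block metadata section.
        cflags="-fbasic-block-sections=labels"
        ldflags=""
        if [ "$thinlto" = y ]; then
            ldflags="--lto-basic-block-sections=labels"
        fi
    fi
    printf 'CFLAGS_PROPELLER_CLANG=%s\n' "$cflags"
    printf 'KBUILD_LDFLAGS+=%s\n' "$ldflags"
}

# Example: first build without a profile, then the optimized ThinLTO build.
propeller_flags "" n
propeller_flags /home/user/dir/any_string y
```

The first call corresponds to step 1 of the workflow (no Propeller
profile yet), the second to step 6 with ThinLTO enabled.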