From patchwork Sat Sep 14 19:17:09 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Yangyu Chen
X-Patchwork-Id: 1985695
From: Yangyu Chen
To: gcc-patches@gcc.gnu.org
Cc: Jeff Law, Jan Hubicka, Andrew Carlotti, Eric Botcazou, Iain Buclaw, Sriraman Tallam, Evgeny Stupachenko, Yangyu Chen
Subject: [RFC PATCH] Allow functions with target_clones attribute to be inlined
Date: Sun, 15 Sep 2024 03:17:09 +0800
Message-ID: <20240914191709.648227-1-chenyangyu@isrc.iscas.ac.cn>
X-Mailer: git-send-email 2.45.2
List-Id: Gcc-patches mailing list

I recently found that target_clones functions cannot be inlined, even
when the caller has exactly the same target. However, if we use only
target attributes in C++ and let the compiler generate the IFUNC for us,
functions with the same target are inlined.

For example, the following code, compiled for an x86-64 target with -O3,
generates IFUNCs for foo and bar and inlines foo into bar:

```cpp
__attribute__((target("default")))
int foo(int *arr) {
    int sum = 0;
    for (int i=0;i<16;i++) sum += arr[i];
    return sum;
}

__attribute__((target("avx2")))
int foo(int *arr) {
    int sum = 0;
    for (int i=0;i<16;i++) sum += arr[i];
    return sum;
}

__attribute__((target("default")))
int bar(int *arr) {
    return foo(arr);
}

__attribute__((target("avx2")))
int bar(int *arr) {
    return foo(arr);
}
```

However, if we use the target_clones attribute instead, the
target_clones functions are not inlined:

```cpp
__attribute__((target_clones("default","avx2")))
int foo(int *arr) {
    int sum = 0;
    for (int i=0;i<16;i++) sum += arr[i];
    return sum;
}

__attribute__((target_clones("default","avx2")))
int bar(int *arr) {
    return foo(arr);
}
```

This behavior may hurt performance, because calls between target_clones
functions are never inlined. Since such a call does not go through the
PLT but uses the callee version built for the same target as the caller,
I think it is better to allow target_clones functions to be inlined.

gcc/ada/ChangeLog:

	* gcc-interface/utils.cc (handle_target_clones_attribute): Allow
	functions with target_clones attribute to be inlined.

gcc/c-family/ChangeLog:

	* c-attribs.cc (handle_target_clones_attribute): Allow
	functions with target_clones attribute to be inlined.

gcc/d/ChangeLog:

	* d-attribs.cc (d_handle_target_clones_attribute): Allow
	functions with target_clones attribute to be inlined.

Signed-off-by: Yangyu Chen
---
 gcc/ada/gcc-interface/utils.cc | 5 +----
 gcc/c-family/c-attribs.cc      | 3 ---
 gcc/d/d-attribs.cc             | 5 -----
 3 files changed, 1 insertion(+), 12 deletions(-)

diff --git a/gcc/ada/gcc-interface/utils.cc b/gcc/ada/gcc-interface/utils.cc
index 60f36b1e50d..d010b684177 100644
--- a/gcc/ada/gcc-interface/utils.cc
+++ b/gcc/ada/gcc-interface/utils.cc
@@ -7299,10 +7299,7 @@ handle_target_clones_attribute (tree *node, tree name, tree ARG_UNUSED (args),
 			       int ARG_UNUSED (flags), bool *no_add_attrs)
 {
   /* Ensure we have a function type. */
-  if (TREE_CODE (*node) == FUNCTION_DECL)
-    /* Do not inline functions with multiple clone targets. */
-    DECL_UNINLINABLE (*node) = 1;
-  else
+  if (TREE_CODE (*node) != FUNCTION_DECL)
     {
       warning (OPT_Wattributes, "%qE attribute ignored", name);
       *no_add_attrs = true;
diff --git a/gcc/c-family/c-attribs.cc b/gcc/c-family/c-attribs.cc
index 4dd2eecbea5..f8759bb1908 100644
--- a/gcc/c-family/c-attribs.cc
+++ b/gcc/c-family/c-attribs.cc
@@ -6105,9 +6105,6 @@ handle_target_clones_attribute (tree *node, tree name, tree ARG_UNUSED (args),
 		   "single %<target_clones%> attribute is ignored");
 	  *no_add_attrs = true;
 	}
-      else
-      /* Do not inline functions with multiple clone targets. */
-	DECL_UNINLINABLE (*node) = 1;
     }
   else
     {
diff --git a/gcc/d/d-attribs.cc b/gcc/d/d-attribs.cc
index 0f7ca10e017..9f67415adb1 100644
--- a/gcc/d/d-attribs.cc
+++ b/gcc/d/d-attribs.cc
@@ -788,11 +788,6 @@ d_handle_target_clones_attribute (tree *node, tree name, tree, int,
       warning (OPT_Wattributes, "%qE attribute ignored", name);
       *no_add_attrs = true;
     }
-  else
-    {
-      /* Do not inline functions with multiple clone targets. */
-      DECL_UNINLINABLE (*node) = 1;
-    }
 
   return NULL_TREE;
 }
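
For anyone who wants to try the target_clones case locally, here is a
minimal sketch based on the second example in the message. It is not part
of the patch. The MULTIVERSION macro, the glibc/x86-64 guard, and the
const qualifier on the parameter are assumptions added for illustration,
so that the file still builds on toolchains without IFUNC support.

```cpp
#include <cassert>
#include <cstdio>

// Assumption: target_clones needs IFUNC support, so only enable it for
// GCC on x86-64 with glibc; elsewhere fall back to plain functions.
#if defined(__GNUC__) && !defined(__clang__) && defined(__x86_64__) && defined(__GLIBC__)
#define MULTIVERSION __attribute__((target_clones("default", "avx2")))
#else
#define MULTIVERSION
#endif

// Callee: each clone sums 16 ints, compiled once per listed target.
MULTIVERSION
int foo(const int *arr) {
    int sum = 0;
    for (int i = 0; i < 16; i++) sum += arr[i];
    return sum;
}

// Caller with the same clone list. The question raised in the message is
// whether each clone of bar may inline the matching clone of foo.
MULTIVERSION
int bar(const int *arr) {
    return foo(arr);
}
```

Compiling this with `g++ -O3 -S` and comparing the generated clones of
bar, with and without the patch applied, should show whether the call to
foo remains or has been inlined. The exact symbol names of the clones
depend on the GCC version, so no expected assembly is given here.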