From patchwork Tue Apr 27 15:32:35 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Bill Schmidt
X-Patchwork-Id: 1470729
X-Patchwork-Original-From: Bill Schmidt via Gcc-patches
From: Bill Schmidt
Reply-To: Bill Schmidt
To: gcc-patches@gcc.gnu.org
Cc: jakub@redhat.com, Bill Schmidt, jlaw@tachyum.com, dje.gcc@gmail.com,
 segher@kernel.crashing.org
Subject: [PATCH 00/57] Replace the Power target-specific built-in machinery
Date: Tue, 27 Apr 2021 10:32:35 -0500
X-Mailer: git-send-email 2.27.0
List-Id: Gcc-patches mailing list

The design of the target-specific built-in function support in the
Power back end has not stood the test of time. The machinery is grossly
inefficient, confusing, and arcane; and adding new built-in functions
is inefficient and error-prone. This patch set introduces a
replacement.

Because of the scope of the changes, it's important to be able to
verify that the new system makes only intended changes to the functions
that are supported. Therefore this patch set adds a new mechanism, and
(in the final patch) enables it instead of the existing support, but
does not yet remove the old support. That will happen in a follow-up
patch once we're comfortable with the new system.

Most of the patches in this set are specific to the rs6000 back end.
However, the first two patches make changes in common code and require
review from the appropriate maintainers. Jakub and Jeff, I would
appreciate it if you could look at these two small patches.

After these changes are upstream, adding new built-in functions will
usually be as simple as adding two lines to a file,
rs6000-builtin-new.def, that give the prototype of the function and a
little additional information. Adding new overloaded functions will
require adding a new section to another file, rs6000-overload.def, with
one line describing the overload information, and two lines for each
function to be dispatched to from the overloaded function.

The patches are divided into the following sections.

Patches 0001-0002: Common code patches

  Patch 0001 adds a mechanism to the Makefile to allow specifying
  additional dependencies for "out_object_file", which is rs6000.o for
  the rs6000 back end. I found this necessary to be able to have
  rs6000.o depend on a header file generated during the build.

  Patch 0002 expands the gengtype machinery to scan header files
  created during the build for GC roots.

Patches 0003, 0005-0023: Generator program

  A new program, rs6000-gen-builtins, is created and executed during
  the build. It reads rs6000-builtin-new.def and rs6000-overload.def
  and produces three output files: rs6000-builtins.h,
  rs6000-builtins.c, and rs6000-vecdefines.h.

  rs6000-builtins.h defines the data structures representing the
  built-in functions, overloaded functions, overload instantiations,
  and function type specifiers.

  rs6000-builtins.c contains static initializers for the data
  structures, as well as the function rs6000_autoinit_builtins that
  performs additional run-time initialization.

  rs6000-vecdefines.h contains a set of #defines that map external
  identifiers such as vec_add to their internal builtin names, such as
  __builtin_vec_add. This replaces most of the similar #defines
  previously contained in altivec.h, which now #includes the new file
  instead.

  This set of patches adds the source for the generator program.

Patches 0024-0025: Target build machinery

  These patches make changes to config.gcc and t-rs6000 to build and
  run the new generator program, and to ensure that the garbage
  collection roots in rs6000-builtins.h are scanned by gengtype.

Patches 0004, 0026-0031, 0033-0037: Input files

  These patches build up the input files to the generator program,
  listing all of the built-in functions and overloads to be processed.

Patch 0032: Add pointer types

  This patch creates and caches a bunch of pointer type nodes. The
  existing built-in machinery, for some reason, only created base types
  up front and created the pointer types on demand (over and over and
  over again). The new mechanism needs all the type nodes available, so
  we add them here.

Patch 0038: Call rs6000_autoinit_builtins

Patch 0039: A little special handling for Darwin

Patches 0040-0041: Miscellaneous support patches

Patch 0042: Rewrite the overload processing

  Most of this code remains largely the same as before, with the same
  special handling for a few interesting built-in functions. But the
  general handling of overloaded functions is now much more efficient
  since the new data structures are designed for quick lookup, whereas
  the old machinery does a brutal linear search.

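To make the lookup contrast in patch 0042 concrete, here is a minimal
sketch, not code from the patch set. It assumes one common way of
getting quick lookup, namely a table laid out in enumeration order so
that the built-in code doubles as the index. Every identifier prefixed
with demo_ or DEMO_ is hypothetical, and the names in the table are
used for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical enumeration of built-in codes.  The real enumeration is
   generated into rs6000-builtins.h and is not reproduced here.  */
enum demo_builtin { DEMO_VEC_ADD, DEMO_VEC_SUB, DEMO_VEC_MUL, DEMO_MAX };

struct demo_info
{
  const char *name;  /* Internal built-in name.  */
  int nargs;         /* Number of arguments.  */
};

/* Table in enumeration order, so a built-in code is also an index.  */
const struct demo_info demo_table[DEMO_MAX] = {
  [DEMO_VEC_ADD] = { "__builtin_vec_add", 2 },
  [DEMO_VEC_SUB] = { "__builtin_vec_sub", 2 },
  [DEMO_VEC_MUL] = { "__builtin_vec_mul", 2 },
};

/* Linear search: compare against each entry in turn, O(n) per query.  */
const struct demo_info *
demo_lookup_linear (const char *name)
{
  for (int i = 0; i < DEMO_MAX; i++)
    if (strcmp (demo_table[i].name, name) == 0)
      return &demo_table[i];
  return NULL;
}

/* Indexed lookup: one bounds check and one array access, O(1).  */
const struct demo_info *
demo_lookup_indexed (int code)
{
  return (code >= 0 && code < DEMO_MAX) ? &demo_table[code] : NULL;
}

/* Both strategies must agree on every entry in the table.  */
int
demo_self_check (void)
{
  for (int i = 0; i < DEMO_MAX; i++)
    assert (demo_lookup_linear (demo_table[i].name)
	    == demo_lookup_indexed (i));
  return 1;
}
```

The two functions return the same entry; the difference is only how
much of the table each one touches per query, which matters when the
table has thousands of entries.
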
Patch 0043: Rewrite gimple folding

  The "rewrite" here consists entirely of changing the names of the
  builtins to be processed, since we need a separate enumeration of
  builtins for the new machinery.

Patch 0044: Vectorization support

  Small updates to the functions used for mapping built-ins to their
  vectorized counterparts.

Patches 0045-0050: Rewrite built-in function expansion

  This is where most of the meat comes in. Lookup of built-ins at
  expand time is again much more efficient, replacing the old mechanism
  of multiple linear searches over the whole built-in table. Another
  major change is that all built-in functions are always defined, but a
  test at expand time is used to determine whether they are enabled.
  This allows proper handling of built-ins in the presence of
  "#pragma target" directives. Also, handling of special cases is made
  more uniform using an attribute system, which I hope makes this much
  easier to maintain.

Patches 0051-0052: Miscellaneous changes

Patch 0053: Debug support

  Small changes here to allow gathering of a little more data from
  -mdebug=builtin. I used this to look for differences between
  functions defined by the old and new built-in support.

Patch 0054: Change altivec.h to use rs6000-vecdefines.h

Patch 0055: Test case adjustments

  Most of these changes are due to automating checks for literal
  arguments that must be within a certain range. This gives us more
  regular error messages, which don't always match the previous error
  messages. There are also some adjustments because altivec.h now
  includes rs6000-vecdefines.h.

Patch 0056: Flip the switch to enable the new support

  Victory is ours...

Patch 0057: Fix one last late-breaking change

  Keeping the code up-to-date with upstream has been fun. When I
  rebased to create the patch set, I found one new issue where a small
  change had been made to the overload handling for the vec_insert
  builtins. This patch reflects that change into the new handling.
  My version of git is having trouble with interactive rebasing, so it
  was easier to just add the extra patch.

Now, with all that done, there are a few things that are not yet done:

(1) A future patch will remove the old code.
(2) There are times where we ought to dispatch an overload to one
    function if VSX is available, and to another function if it is not.
    We need a general mechanism for allowing conditional dispatch. I've
    outlined a method for this in rs6000-overload.def that I want to
    implement down the road.
(3) I want to investigate why vec_mul requires special handling in
    rs6000-c.c; it doesn't seem like it should.
(4) Similarly, can we remove some of the special handling for vec_adde,
    vec_addec, vec_sube, and vec_subec?
(5) The parser in the generator program doesn't yet handle
    "escape-newline" sequences for breaking long lines. I should add
    that capability.
(6) Longer term, can we use a similar mechanism for built-in functions
    used for all targets in common code?

A word about compatibility: I deliberately implemented all the old
built-ins exactly as previously defined, wherever possible, despite an
overwhelming desire to pitch out a bunch of them that have already been
considered deprecated for ages. I found that it was too difficult to
both implement a new system and remove deprecated things at the same
time, and in general it seems like a dangerous thing to do. Better to
do this in stages if we're going to do it at all. Unfortunately a lot
of deprecated things still appear all over our own test suite, and I'm
afraid we can assume they appear in user code all over the place as
well. What I've done instead is to make very clear which interfaces are
considered deprecated in the input files themselves. Over time, perhaps
we can start to remove some of these, but in reality I think we're
going to just continue to be stuck with them.

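As an aside on the expand-time enablement test described under patches
0045-0050, the idea can be sketched roughly as follows. This is an
illustration under stated assumptions, not the implementation in the
patches: the feature bits, the demo_ names, and the table contents are
all hypothetical. Every built-in is always present in the table, and
the only thing that varies is the set of features active for the
function being compiled, which is what a "#pragma target" directive
would change.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical feature bits; these are not the real rs6000 flags.  */
#define DEMO_FEAT_ALTIVEC 0x1u
#define DEMO_FEAT_VSX     0x2u
#define DEMO_FEAT_P10     0x4u

struct demo_builtin
{
  const char *name;   /* Hypothetical built-in name.  */
  unsigned required;  /* Features needed before it may be expanded.  */
};

/* All built-ins are defined unconditionally, whatever the target.  */
const struct demo_builtin demo_builtins[] = {
  { "demo_altivec_op", DEMO_FEAT_ALTIVEC },
  { "demo_vsx_op", DEMO_FEAT_ALTIVEC | DEMO_FEAT_VSX },
  { "demo_p10_op", DEMO_FEAT_ALTIVEC | DEMO_FEAT_VSX | DEMO_FEAT_P10 },
};

#define DEMO_COUNT (sizeof demo_builtins / sizeof demo_builtins[0])

/* True if every required feature is in the active set.  */
bool
demo_enabled_p (const struct demo_builtin *b, unsigned active)
{
  return (b->required & ~active) == 0;
}

/* Stand-in for expansion: return the name that would be expanded, or
   NULL where a compiler would report that the built-in is not
   available with the current target options.  */
const char *
demo_expand (size_t idx, unsigned active)
{
  if (idx >= DEMO_COUNT || !demo_enabled_p (&demo_builtins[idx], active))
    return NULL;
  return demo_builtins[idx].name;
}
```

Calling demo_expand for the same table entry with two different active
sets models two functions in one translation unit compiled under
different target options: the entry is visible in both cases, and only
the check at expansion decides whether it may be used.
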
Here is a complete list of known incompatibilities with the old
mechanism:

(1) __builtin_vec_vpopcntu[bdhw] were all registered as overloads but
    didn't have any instantiations. Therefore they could not have been
    used anywhere, and I haven't implemented them.
(2) I added ten new built-ins named __builtin_vsx_xxpermx_ to be used
    for the overloaded vec_xxpermx function, instead of the bloody hack
    that was used before. The functionality of vec_xxpermx is
    unchanged.
(3) A number of built-ins used "long" for DImode types, which would
    break these for 32-bit. I changed those arguments and return values
    to "long long" to avoid such problems, when those built-ins were
    not restricted to 64-bit mode already. There aren't many such
    cases.
(4) A small handful of builtins didn't have the correct return type to
    match the mode of the pattern, so I fixed those. They are all new
    in GCC 11 and can't have worked properly.
(5) I handled the MMA internal functions slightly differently, so that
    all the ones with extra vector_quad arguments are listed as such,
    rather than having that hacked on during expand time.
(6) __builtin_vsx_xl_len_r took only a char * rather than a void *;
    fixing this was backward compatible.
(7) __builtin_vsx_splat_2d[fi] were incompletely defined and couldn't
    have ever worked; fixed.
(8) A small handful of builtins weren't marked as "const," but are
    obviously const, so I fixed those.

I've kept a complete list of discrepancies for my records, in case any
issues arise from my misunderstanding something.

I do want to thank all the people who have contributed to the built-in
design over the years. For all my griping, there are some marvelous
bits in there that I hope I have kept intact. My hope is to make the
whole system much easier to use and maintain going forward. Time will
tell.

The patches have been bootstrapped and tested on a Power10
little-endian system, and on a Power8 big-endian system with both 32-
and 64-bit enabled, with no regressions.
I'm not crazy enough to believe I don't have any errors in here, but I
have endeavoured to test and minimize them to the best of my ability.
Is this series okay for trunk, in GCC 12 stage 1?

Thanks!
Bill

Bill Schmidt (57):
  Allow targets to specify build dependencies for out_object_file
  Support scanning of build-time GC roots in gengtype
  rs6000: Initial create of rs6000-gen-builtins.c
  rs6000: Add initial input files
  rs6000: Add file support and functions for diagnostic support
  rs6000: Add helper functions for parsing
  rs6000: Add functions for matching types, part 1 of 3
  rs6000: Add functions for matching types, part 2 of 3
  rs6000: Add functions for matching types, part 3 of 3
  rs6000: Red-black tree implementation for balanced tree search
  rs6000: Main function with stubs for parsing and output
  rs6000: Parsing built-in input file, part 1 of 3
  rs6000: Parsing built-in input file, part 2 of 3
  rs6000: Parsing built-in input file, part 3 of 3
  rs6000: Parsing of overload input file
  rs6000: Build and store function type identifiers
  rs6000: Write output to the builtin definition include file
  rs6000: Write output to the builtins header file
  rs6000: Write output to the builtins init file, part 1 of 3
  rs6000: Write output to the builtins init file, part 2 of 3
  rs6000: Write output to the builtins init file, part 3 of 3
  rs6000: Write static initializations for built-in table
  rs6000: Write static initializations for overload tables
  rs6000: Incorporate new builtins code into the build machinery
  rs6000: Add gengtype handling to the build machinery
  rs6000: Add the rest of the [altivec] stanza to the builtins file
  rs6000: Add VSX builtins
  rs6000: Add available-everywhere and ancient builtins
  rs6000: Add power7 and power7-64 builtins
  rs6000: Add power8-vector builtins
  rs6000: Add Power9 builtins
  rs6000: Add more type nodes to support builtin processing
  rs6000: Add Power10 builtins
  rs6000: Add MMA builtins
  rs6000: Add miscellaneous builtins
  rs6000: Add Cell builtins
  rs6000: Add remaining overloads
  rs6000: Execute the automatic built-in initialization code
  rs6000: Darwin builtin support
  rs6000: Add sanity to V2DI_type_node definitions
  rs6000: Always initialize vector_pair and vector_quad nodes
  rs6000: Handle overloads during program parsing
  rs6000: Handle gimple folding of target built-ins
  rs6000: Support for vectorizing built-in functions
  rs6000: Builtin expansion, part 1
  rs6000: Builtin expansion, part 2
  rs6000: Builtin expansion, part 3
  rs6000: Builtin expansion, part 4
  rs6000: Builtin expansion, part 5
  rs6000: Builtin expansion, part 6
  rs6000: Update rs6000_builtin_decl
  rs6000: Miscellaneous uses of rs6000_builtin_decls_x
  rs6000: Debug support
  rs6000: Update altivec.h for automated interfaces
  rs6000: Test case adjustments
  rs6000: Enable the new builtin support
  rs6000: Adjust to late-breaking change

 gcc/Makefile.in                               |    8 +-
 gcc/config.gcc                                |    2 +
 gcc/config/rs6000/altivec.h                   |  516 +-
 gcc/config/rs6000/darwin.h                    |    8 +-
 gcc/config/rs6000/rbtree.c                    |  233 +
 gcc/config/rs6000/rbtree.h                    |   51 +
 gcc/config/rs6000/rs6000-builtin-new.def      | 3875 +++++++++++
 gcc/config/rs6000/rs6000-c.c                  | 1083 +++
 gcc/config/rs6000/rs6000-call.c               | 3371 ++++++++-
 gcc/config/rs6000/rs6000-gen-builtins.c       | 2997 ++++++++
 gcc/config/rs6000/rs6000-overload.def         | 6076 +++++++++++++++++
 gcc/config/rs6000/rs6000.c                    |  219 +-
 gcc/config/rs6000/rs6000.h                    |   82 +
 gcc/config/rs6000/t-rs6000                    |   44 +-
 gcc/gengtype-state.c                          |   29 +-
 gcc/gengtype.c                                |   19 +-
 gcc/gengtype.h                                |    5 +
 .../powerpc/bfp/scalar-extract-exp-2.c        |    2 +-
 .../powerpc/bfp/scalar-extract-sig-2.c        |    2 +-
 .../powerpc/bfp/scalar-insert-exp-2.c         |    2 +-
 .../powerpc/bfp/scalar-insert-exp-5.c         |    2 +-
 .../powerpc/bfp/scalar-insert-exp-8.c         |    2 +-
 .../powerpc/bfp/scalar-test-neg-2.c           |    2 +-
 .../powerpc/bfp/scalar-test-neg-3.c           |    2 +-
 .../powerpc/bfp/scalar-test-neg-5.c           |    2 +-
 .../gcc.target/powerpc/byte-in-set-2.c        |    2 +-
 gcc/testsuite/gcc.target/powerpc/cmpb-2.c     |    2 +-
 gcc/testsuite/gcc.target/powerpc/cmpb32-2.c   |    2 +-
 .../gcc.target/powerpc/crypto-builtin-2.c     |   14 +-
 .../powerpc/fold-vec-splat-floatdouble.c      |    4 +-
 .../powerpc/fold-vec-splat-longlong.c         |   10 +-
 .../powerpc/fold-vec-splat-misc-invalid.c     |    8 +-
 .../gcc.target/powerpc/p8vector-builtin-8.c   |    1 +
 gcc/testsuite/gcc.target/powerpc/pr80315-1.c  |    2 +-
 gcc/testsuite/gcc.target/powerpc/pr80315-2.c  |    2 +-
 gcc/testsuite/gcc.target/powerpc/pr80315-3.c  |    2 +-
 gcc/testsuite/gcc.target/powerpc/pr80315-4.c  |    2 +-
 gcc/testsuite/gcc.target/powerpc/pr88100.c    |   12 +-
 .../gcc.target/powerpc/pragma_misc9.c         |    2 +-
 .../gcc.target/powerpc/pragma_power8.c        |    2 +
 .../gcc.target/powerpc/pragma_power9.c        |    3 +
 .../powerpc/test_fpscr_drn_builtin_error.c    |    4 +-
 .../powerpc/test_fpscr_rn_builtin_error.c     |   12 +-
 gcc/testsuite/gcc.target/powerpc/test_mffsl.c |    3 +-
 gcc/testsuite/gcc.target/powerpc/vec-gnb-2.c  |    2 +-
 .../gcc.target/powerpc/vsu/vec-all-nez-7.c    |    2 +-
 .../gcc.target/powerpc/vsu/vec-any-eqz-7.c    |    2 +-
 .../gcc.target/powerpc/vsu/vec-cmpnez-7.c     |    2 +-
 .../gcc.target/powerpc/vsu/vec-cntlz-lsbb-2.c |    2 +-
 .../gcc.target/powerpc/vsu/vec-cnttz-lsbb-2.c |    2 +-
 .../gcc.target/powerpc/vsu/vec-xl-len-13.c    |    2 +-
 .../gcc.target/powerpc/vsu/vec-xst-len-12.c   |    2 +-
 52 files changed, 17915 insertions(+), 824 deletions(-)
 create mode 100644 gcc/config/rs6000/rbtree.c
 create mode 100644 gcc/config/rs6000/rbtree.h
 create mode 100644 gcc/config/rs6000/rs6000-builtin-new.def
 create mode 100644 gcc/config/rs6000/rs6000-gen-builtins.c
 create mode 100644 gcc/config/rs6000/rs6000-overload.def