From patchwork Tue Jul 9 15:41:03 2024
X-Patchwork-Submitter: Uros Bizjak
X-Patchwork-Id: 1958531
From: Uros Bizjak
Date: Tue, 9 Jul 2024 17:41:03 +0200
Subject: [committed] i386: Implement .SAT_TRUNC for unsigned integers
To: "gcc-patches@gcc.gnu.org"

The following testcase:

unsigned short foo (unsigned int x)
{
  _Bool overflow = x > (unsigned int)(unsigned short)(-1);
  return ((unsigned short)x | (unsigned short)-overflow);
}

currently compiles (-O2) to:

foo:
        xorl    %eax, %eax
        cmpl    $65535, %edi
        seta    %al
        negl    %eax
        orl     %edi, %eax
        ret

We can expand through ustrunc{m}{n}2 optab to use carry flag from the
comparison and generate code using SBB:

foo:
        cmpl    $65535, %edi
        sbbl    %eax, %eax
        orl     %edi, %eax
        ret

or CMOV instruction:

foo:
        movl    $65535, %eax
        cmpl    %eax, %edi
        cmovnc  %edi, %eax
        ret

gcc/ChangeLog:

        * config/i386/i386.md (@cmp<mode>_1): Use SWI mode iterator.
        (ustruncdi<mode>2): New expander.
        (ustruncsi<mode>2): Ditto.
        (ustrunchiqi2): Ditto.

gcc/testsuite/ChangeLog:

        * gcc.target/i386/sattrunc-1.c: New test.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Uros.

diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index 214cb2e239a..e2f30695d70 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -1533,8 +1533,8 @@ (define_insn "@ccmp<mode>"
 
 (define_expand "@cmp<mode>_1"
   [(set (reg:CC FLAGS_REG)
-        (compare:CC (match_operand:SWI48 0 "nonimmediate_operand")
-                    (match_operand:SWI48 1 "<general_operand>")))])
+        (compare:CC (match_operand:SWI 0 "nonimmediate_operand")
+                    (match_operand:SWI 1 "<general_operand>")))])
 
 (define_mode_iterator SWI1248_AVX512BWDQ_64
   [(QI "TARGET_AVX512DQ") HI
@@ -9981,6 +9981,114 @@ (define_expand "ussub<mode>3"
   DONE;
 })
 
+(define_expand "ustruncdi<mode>2"
+  [(set (match_operand:SWI124 0 "register_operand")
+        (us_truncate:DI (match_operand:DI 1 "nonimmediate_operand")))]
+  "TARGET_64BIT"
+{
+  rtx op1 = force_reg (DImode, operands[1]);
+  rtx sat = force_reg (DImode, GEN_INT (GET_MODE_MASK (<MODE>mode)));
+  rtx dst;
+
+  emit_insn (gen_cmpdi_1 (op1, sat));
+
+  if (TARGET_CMOVE)
+    {
+      rtx cmp = gen_rtx_GEU (VOIDmode, gen_rtx_REG (CCCmode, FLAGS_REG),
+                             const0_rtx);
+
+      dst = force_reg (<MODE>mode, operands[0]);
+      emit_insn (gen_movsicc (gen_lowpart (SImode, dst), cmp,
+                              gen_lowpart (SImode, op1),
+                              gen_lowpart (SImode, sat)));
+    }
+  else
+    {
+      rtx msk = gen_reg_rtx (<MODE>mode);
+
+      emit_insn (gen_x86_mov<mode>cc_0_m1_neg (msk));
+      dst = expand_simple_binop (<MODE>mode, IOR,
+                                 gen_lowpart (<MODE>mode, op1), msk,
+                                 operands[0], 1, OPTAB_WIDEN);
+    }
+
+  if (!rtx_equal_p (dst, operands[0]))
+    emit_move_insn (operands[0], dst);
+  DONE;
+})
+
+(define_expand "ustruncsi<mode>2"
+  [(set (match_operand:SWI12 0 "register_operand")
+        (us_truncate:SI (match_operand:SI 1 "nonimmediate_operand")))]
+  ""
+{
+  rtx op1 = force_reg (SImode, operands[1]);
+  rtx sat = force_reg (SImode, GEN_INT (GET_MODE_MASK (<MODE>mode)));
+  rtx dst;
+
+  emit_insn (gen_cmpsi_1 (op1, sat));
+
+  if (TARGET_CMOVE)
+    {
+      rtx cmp = gen_rtx_GEU (VOIDmode, gen_rtx_REG (CCCmode, FLAGS_REG),
+                             const0_rtx);
+
+      dst = force_reg (<MODE>mode, operands[0]);
+      emit_insn (gen_movsicc (gen_lowpart (SImode, dst), cmp,
+                              gen_lowpart (SImode, op1),
+                              gen_lowpart (SImode, sat)));
+    }
+  else
+    {
+      rtx msk = gen_reg_rtx (<MODE>mode);
+
+      emit_insn (gen_x86_mov<mode>cc_0_m1_neg (msk));
+      dst = expand_simple_binop (<MODE>mode, IOR,
+                                 gen_lowpart (<MODE>mode, op1), msk,
+                                 operands[0], 1, OPTAB_WIDEN);
+    }
+
+  if (!rtx_equal_p (dst, operands[0]))
+    emit_move_insn (operands[0], dst);
+  DONE;
+})
+
+(define_expand "ustrunchiqi2"
+  [(set (match_operand:QI 0 "register_operand")
+        (us_truncate:HI (match_operand:HI 1 "nonimmediate_operand")))]
+  ""
+{
+  rtx op1 = force_reg (HImode, operands[1]);
+  rtx sat = force_reg (HImode, GEN_INT (GET_MODE_MASK (QImode)));
+  rtx dst;
+
+  emit_insn (gen_cmphi_1 (op1, sat));
+
+  if (TARGET_CMOVE)
+    {
+      rtx cmp = gen_rtx_GEU (VOIDmode, gen_rtx_REG (CCCmode, FLAGS_REG),
+                             const0_rtx);
+
+      dst = force_reg (QImode, operands[0]);
+      emit_insn (gen_movsicc (gen_lowpart (SImode, dst), cmp,
+                              gen_lowpart (SImode, op1),
+                              gen_lowpart (SImode, sat)));
+    }
+  else
+    {
+      rtx msk = gen_reg_rtx (QImode);
+
+      emit_insn (gen_x86_movqicc_0_m1_neg (msk));
+      dst = expand_simple_binop (QImode, IOR,
+                                 gen_lowpart (QImode, op1), msk,
+                                 operands[0], 1, OPTAB_WIDEN);
+    }
+
+  if (!rtx_equal_p (dst, operands[0]))
+    emit_move_insn (operands[0], dst);
+  DONE;
+})
+
 ;; The patterns that match these are at the end of this file.
 
 (define_expand "<insn>xf3"
diff --git a/gcc/testsuite/gcc.target/i386/sattrunc-1.c b/gcc/testsuite/gcc.target/i386/sattrunc-1.c
new file mode 100644
index 00000000000..b1116a836dc
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/sattrunc-1.c
@@ -0,0 +1,24 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+/* { dg-final { scan-assembler-times "sbb|cmov" 6 { target { ! ia32 } } } } */
+/* { dg-final { scan-assembler-times "sbb|cmov" 3 { target ia32 } } } */
+
+#include <stdint.h>
+
+#define DEF_SAT_U_TRUNC(WT, NT) \
+NT sat_u_truc_##WT##_to_##NT (WT x) \
+{ \
+  _Bool overflow = x > (WT)(NT)(-1); \
+  return (NT)x | (NT)-overflow; \
+}
+
+#ifdef __x86_64__
+DEF_SAT_U_TRUNC(uint64_t, uint32_t)
+DEF_SAT_U_TRUNC(uint64_t, uint16_t)
+DEF_SAT_U_TRUNC(uint64_t, uint8_t)
+#endif
+
+DEF_SAT_U_TRUNC(uint32_t, uint16_t)
+DEF_SAT_U_TRUNC(uint32_t, uint8_t)
+
+DEF_SAT_U_TRUNC(uint16_t, uint8_t)
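
For readers who want to check the source-level semantics that the
testcase idiom encodes, here is a minimal sketch in plain C. It is not
part of the patch. The names sat_trunc_ref, sat_trunc_mask and
sat_trunc_check are hypothetical and only illustrate the uint32_t to
uint16_t case: the clamp form states the intended result directly, and
the mask form mirrors the testcase, where -overflow is either 0 or all
ones.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reference semantics (assumed for illustration): values above the
   narrow type's maximum clamp to that maximum.  */
static uint16_t
sat_trunc_ref (uint32_t x)
{
  return x > UINT16_MAX ? UINT16_MAX : (uint16_t) x;
}

/* Branchless form, same shape as the testcase in the patch:
   -overflow is 0 when x fits and all ones when it does not, so the
   OR either leaves the low bits alone or forces them to 0xffff.  */
static uint16_t
sat_trunc_mask (uint32_t x)
{
  _Bool overflow = x > (uint32_t) (uint16_t) (-1);
  return (uint16_t) x | (uint16_t) -overflow;
}

/* Hypothetical helper: compare both forms around the boundary.  */
static void
sat_trunc_check (void)
{
  static const uint32_t samples[] =
    { 0, 1, 65534, 65535, 65536, 65537, 0x12345678u, UINT32_MAX };

  for (size_t i = 0; i < sizeof samples / sizeof samples[0]; i++)
    assert (sat_trunc_ref (samples[i]) == sat_trunc_mask (samples[i]));
}
```

Calling sat_trunc_check () from a test driver exercises both sides of
the 65535 boundary; any input above it should yield 65535 from either
form.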