From patchwork Sat Oct 23 05:26:47 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Noah Goldstein
X-Patchwork-Id: 1545148
To: libc-alpha@sourceware.org
Subject: [PATCH v1] x86: Replace sse2 instructions with avx in memcmp-evex-movbe.S
Date: Sat, 23 Oct 2021 01:26:47 -0400
Message-Id: <20211023052647.535991-1-goldstein.w.n@gmail.com>
X-Mailer: git-send-email 2.29.2
List-Id: Libc-alpha mailing list
X-Patchwork-Original-From: Noah Goldstein via Libc-alpha
From: Noah Goldstein
Reply-To: Noah Goldstein

This commit replaces two usages of SSE2 'movups' with AVX 'vmovdqu'.

It could be dangerous to use SSE2 here if this function is ever called
without 'vzeroupper' having been executed beforehand. While compilers
appear to emit 'vzeroupper' before function calls if AVX2 has been
used, relying on that makes the SSE2 version more brittle. Since SSE2
is not absolutely necessary here, it should be avoided.

The change costs 2 extra bytes, but those bytes should only eat into
alignment padding.

Reviewed-by: H.J. Lu
---
 sysdeps/x86_64/multiarch/memcmp-evex-movbe.S | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/sysdeps/x86_64/multiarch/memcmp-evex-movbe.S b/sysdeps/x86_64/multiarch/memcmp-evex-movbe.S
index 2761b54f2e..640f6757fa 100644
--- a/sysdeps/x86_64/multiarch/memcmp-evex-movbe.S
+++ b/sysdeps/x86_64/multiarch/memcmp-evex-movbe.S
@@ -561,13 +561,13 @@ L(between_16_31):
 	/* From 16 to 31 bytes. No branch when size == 16. */
 	/* Use movups to save code size. */
-	movups (%rsi), %xmm2
+	vmovdqu (%rsi), %xmm2
 	VPCMP $4, (%rdi), %xmm2, %k1
 	kmovd %k1, %eax
 	testl %eax, %eax
 	jnz L(return_vec_0_lv)
 	/* Use overlapping loads to avoid branches. */
-	movups -16(%rsi, %rdx, CHAR_SIZE), %xmm2
+	vmovdqu -16(%rsi, %rdx, CHAR_SIZE), %xmm2
 	VPCMP $4, -16(%rdi, %rdx, CHAR_SIZE), %xmm2, %k1
 	addl $(CHAR_PER_VEC - (16 / CHAR_SIZE)), %edx
 	kmovd %k1, %eax
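
For readers unfamiliar with the "overlapping loads" comment in the L(between_16_31) hunk, the idea can be sketched in portable C. This is only an illustration under stated assumptions: differs_16_31 and its helpers are hypothetical names, not glibc code; the sketch assumes byte-sized characters (CHAR_SIZE == 1); and it reports only equal versus not equal, whereas the assembly goes on to compute an ordered result.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: unaligned 8-byte load, done via memcpy so it is
   well defined for any pointer alignment.  */
static uint64_t
load64 (const unsigned char *p)
{
  uint64_t v;
  memcpy (&v, p, sizeof v);
  return v;
}

/* Hypothetical helper: nonzero if the 16-byte windows at a and b differ.
   Stands in for one 16-byte vector load plus compare.  */
static int
window16_differs (const unsigned char *a, const unsigned char *b)
{
  return ((load64 (a) ^ load64 (b))
	  | (load64 (a + 8) ^ load64 (b + 8))) != 0;
}

/* Hypothetical sketch, not glibc's implementation: returns 1 if the two
   n-byte buffers differ and 0 otherwise.  Only valid for 16 <= n <= 31.  */
static int
differs_16_31 (const void *s1, const void *s2, size_t n)
{
  const unsigned char *a = s1;
  const unsigned char *b = s2;

  assert (n >= 16 && n <= 31);

  /* First window: bytes [0, 16).  */
  if (window16_differs (a, b))
    return 1;

  /* Second window: bytes [n - 16, n).  For n < 32 it overlaps the first
     window, so together they cover every byte without branching on the
     exact length.  When n == 16 it simply repeats the first window.  */
  return window16_differs (a + n - 16, b + n - 16);
}
```

The second window starts at n - 16, which mirrors the -16(%rsi, %rdx, CHAR_SIZE) addressing in the patch when CHAR_SIZE is 1.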