From patchwork Tue Jul 30 15:41:58 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Andi Kleen
X-Patchwork-Id: 1966613
From: Andi Kleen
To:
 gcc-patches@gcc.gnu.org
Cc: Andi Kleen
Subject: [PATCH 1/2] Remove MMX code path in lexer
Date: Tue, 30 Jul 2024 08:41:58 -0700
Message-ID: <20240730154159.3799008-1-ak@linux.intel.com>
X-Mailer: git-send-email 2.45.2

From: Andi Kleen

Host systems with only MMX and no SSE2 should be really rare now.
Let's remove the MMX code path to keep the number of custom
implementations the same.

The SSE2 code path is also somewhat dubious now (nearly everything
should have SSE4.2, which is >15 years old now), but the SSE2
code path is used as a fallback for others, and Solaris apparently
also uses it due to tool chain deficiencies.

libcpp/ChangeLog:

	* lex.cc (search_line_mmx): Remove function.
	(init_vectorized_lexer): Remove search_line_mmx.
---
 libcpp/lex.cc | 75 ---------------------------------------------------
 1 file changed, 75 deletions(-)

diff --git a/libcpp/lex.cc b/libcpp/lex.cc
index 16f2c23af1e1..1591dcdf151a 100644
--- a/libcpp/lex.cc
+++ b/libcpp/lex.cc
@@ -290,71 +290,6 @@ static const char repl_chars[4][16] __attribute__((aligned(16))) = {
     '?', '?', '?', '?', '?', '?', '?', '?' },
 };
 
-/* A version of the fast scanner using MMX vectorized byte compare insns.
-
-   This uses the PMOVMSKB instruction which was introduced with "MMX2",
-   which was packaged into SSE1; it is also present in the AMD MMX
-   extension.  Mark the function as using "sse" so that we emit a real
-   "emms" instruction, rather than the 3dNOW "femms" instruction.  */
-
-static const uchar *
-#ifndef __SSE__
-__attribute__((__target__("sse")))
-#endif
-search_line_mmx (const uchar *s, const uchar *end ATTRIBUTE_UNUSED)
-{
-  typedef char v8qi __attribute__ ((__vector_size__ (8)));
-  typedef int __m64 __attribute__ ((__vector_size__ (8), __may_alias__));
-
-  const v8qi repl_nl = *(const v8qi *)repl_chars[0];
-  const v8qi repl_cr = *(const v8qi *)repl_chars[1];
-  const v8qi repl_bs = *(const v8qi *)repl_chars[2];
-  const v8qi repl_qm = *(const v8qi *)repl_chars[3];
-
-  unsigned int misalign, found, mask;
-  const v8qi *p;
-  v8qi data, t, c;
-
-  /* Align the source pointer.  While MMX doesn't generate unaligned data
-     faults, this allows us to safely scan to the end of the buffer without
-     reading beyond the end of the last page.  */
-  misalign = (uintptr_t)s & 7;
-  p = (const v8qi *)((uintptr_t)s & -8);
-  data = *p;
-
-  /* Create a mask for the bytes that are valid within the first
-     16-byte block.  The Idea here is that the AND with the mask
-     within the loop is "free", since we need some AND or TEST
-     insn in order to set the flags for the branch anyway.  */
-  mask = -1u << misalign;
-
-  /* Main loop processing 8 bytes at a time.  */
-  goto start;
-  do
-    {
-      data = *++p;
-      mask = -1;
-
-    start:
-      t = __builtin_ia32_pcmpeqb(data, repl_nl);
-      c = __builtin_ia32_pcmpeqb(data, repl_cr);
-      t = (v8qi) __builtin_ia32_por ((__m64)t, (__m64)c);
-      c = __builtin_ia32_pcmpeqb(data, repl_bs);
-      t = (v8qi) __builtin_ia32_por ((__m64)t, (__m64)c);
-      c = __builtin_ia32_pcmpeqb(data, repl_qm);
-      t = (v8qi) __builtin_ia32_por ((__m64)t, (__m64)c);
-      found = __builtin_ia32_pmovmskb (t);
-      found &= mask;
-    }
-  while (!found);
-
-  __builtin_ia32_emms ();
-
-  /* FOUND contains 1 in bits for which we matched a relevant
-     character.  Conversion to the byte index is trivial.  */
-  found = __builtin_ctz(found);
-  return (const uchar *)p + found;
-}
 
 /* A version of the fast scanner using SSE2 vectorized byte compare insns.  */
 
@@ -509,8 +444,6 @@ init_vectorized_lexer (void)
   minimum = 3;
 #elif defined(__SSE2__)
   minimum = 2;
-#elif defined(__SSE__)
-  minimum = 1;
 #endif
 
   if (minimum == 3)
@@ -521,14 +454,6 @@ init_vectorized_lexer (void)
         impl = search_line_sse42;
       else if (minimum == 2 || (edx & bit_SSE2))
         impl = search_line_sse2;
-      else if (minimum == 1 || (edx & bit_SSE))
-        impl = search_line_mmx;
-    }
-  else if (__get_cpuid (0x80000001, &dummy, &dummy, &dummy, &edx))
-    {
-      if (minimum == 1
-          || (edx & (bit_MMXEXT | bit_CMOV)) == (bit_MMXEXT | bit_CMOV))
-        impl = search_line_mmx;
     }
 
   search_line_fast = impl;
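
Reviewer note: a minimal sketch of the selection order that should remain in init_vectorized_lexer once the MMX branches are gone. This is not code from lex.cc. LineSearch, pick_line_search, have_leaf1 and the Fallback enumerator are hypothetical names used only for illustration, and the outer CPUID test is an assumption, since the hunks above do not show it. The two bit constants carry the values that GCC's <cpuid.h> gives bit_SSE2 and bit_SSE4_2.

```cpp
#include <cstdint>

// Hypothetical stand-ins for the function pointers chosen in lex.cc.
enum class LineSearch { Fallback, SSE2, SSE42 };

// CPUID leaf 1 feature bits (same values as bit_SSE2 / bit_SSE4_2 in <cpuid.h>).
const std::uint32_t kBitSSE2  = 1u << 26;  // reported in EDX
const std::uint32_t kBitSSE42 = 1u << 20;  // reported in ECX

// minimum:    compile-time floor (3 = SSE4.2, 2 = SSE2, 0 = none), as in the hunk.
// have_leaf1: assumed flag saying whether CPUID leaf 1 could be queried;
//             ecx/edx are only trusted when it is true.
inline LineSearch pick_line_search (int minimum, bool have_leaf1,
                                    std::uint32_t ecx, std::uint32_t edx)
{
  if (minimum == 3)
    return LineSearch::SSE42;
  if (have_leaf1 || minimum == 2)
    {
      if (have_leaf1 && (ecx & kBitSSE42))
        return LineSearch::SSE42;
      if (minimum == 2 || (have_leaf1 && (edx & kBitSSE2)))
        return LineSearch::SSE2;
    }
  // No SSE1-only or MMXEXT branch any more: such hosts take the fallback.
  return LineSearch::Fallback;
}
```

Under these assumptions, a host that reports SSE but not SSE2 (only EDX bit 25 set) now ends up on the fallback path instead of search_line_mmx.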