From patchwork Wed Nov 30 07:17:16 2016 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Thomas Koenig X-Patchwork-Id: 700885 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 3tTBZm2ZBFz9t2g for ; Wed, 30 Nov 2016 18:17:50 +1100 (AEDT) Authentication-Results: ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=gcc.gnu.org header.i=@gcc.gnu.org header.b="wDhLOefh"; dkim-atps=neutral DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:to :from:subject:message-id:date:mime-version:content-type; q=dns; s=default; b=Sa6Frgt3lIPGXuFu4IaJrJ31V85Ct6/P1zHXmoJ9dnVgH+QCGF l0jVMRWBl1HvFjRdDtomeAyYyh9fQQmcqgT/fU4tTXqBoNFuOP48+wOho8Fx16qW gbIcBCRAOoIG9ZIXjCCAhaY/3xYkW/FXjlUXz2gGpjIpmw2ad4c0xgiCA= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:to :from:subject:message-id:date:mime-version:content-type; s= default; bh=lSPM3QkRXUPqw8GXonxafbSYgWU=; b=wDhLOefhoz9YEpYmkkcT ascPfibmGVfNziCO9H8u58XrhdQdRP8Q8sNG1ITzkDtHuCu9KuseG0JgMlUVt5h6 L3X9k0OQyuTDpxnRhNesUexkDew8k1LwnI2AxC72Rr7tXGhBAjyV0bhYzrNAUh6q Gt/gf+GGxkDPRvS9Kb579X4= Received: (qmail 111567 invoked by alias); 30 Nov 2016 07:17:35 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Received: (qmail 111483 invoked by uid 89); 30 Nov 2016 07:17:32 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-2.0 required=5.0 tests=BAYES_50, KAM_ASCII_DIVIDERS, RCVD_IN_DNSWL_LOW, RP_MATCHES_RCVD, SPF_PASS autolearn=ham version=3.3.2 spammy=cpuid.h, UD:matmul_i2.c, matmul_i2.c, UD:matmul_c8.c X-Spam-User: qpsmtpd, 2 recipients X-HELO: cc-smtpout3.netcologne.de Received: from cc-smtpout3.netcologne.de (HELO cc-smtpout3.netcologne.de) (89.1.8.213) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Wed, 30 Nov 2016 07:17:22 +0000 Received: from cc-smtpin2.netcologne.de (cc-smtpin2.netcologne.de [89.1.8.202]) by cc-smtpout3.netcologne.de (Postfix) with ESMTP id F35801280D; Wed, 30 Nov 2016 08:17:18 +0100 (CET) Received: from localhost (localhost [127.0.0.1]) by cc-smtpin2.netcologne.de (Postfix) with ESMTP id E5F8F11DF1; Wed, 30 Nov 2016 08:17:18 +0100 (CET) Received: from [78.35.158.185] (helo=cc-smtpin2.netcologne.de) by localhost with ESMTP (eXpurgate 4.1.9) (envelope-from ) id 583e7cfe-022c-7f0000012729-7f0000018281-1 for ; Wed, 30 Nov 2016 08:17:18 +0100 Received: from [192.168.178.20] (xdsl-78-35-158-185.netcologne.de [78.35.158.185]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by cc-smtpin2.netcologne.de (Postfix) with ESMTPSA; Wed, 30 Nov 2016 08:17:17 +0100 (CET) To: gcc-patches , "fortran@gcc.gnu.org" From: Thomas Koenig Subject: [patch part, libgcc] Add AVX-specific matmul Message-ID: <31a0faab-ab00-c6e4-705d-872fda554ca9@netcologne.de> Date: Wed, 30 Nov 2016 08:17:16 +0100 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.5.0 MIME-Version: 1.0 Hello world, the patch at https://gcc.gnu.org/ml/fortran/2016-11/msg00246.html (the one going to gcc-patches was rejected due to size of regernerated files) contains one libgcc change, which exposes the __cpu_model interface fox i386 to libgfortran. The Fortran bits are OKd, but I need an approval from a libgcc maintainer (or some hint how to do this better :-). I have attached the libgcc-specific part of the patch. OK for trunk? Regards Thomas 2016-11-27 Thomas Koenig PR fortran/78379 * config/i386/cpuinfo.c: Move denums for processor vendors, processor type, processor subtypes and declaration of struct __processor_model into * config/i386/cpuinfo.h: New header file. * Makefile.am: Add dependence of m4/matmul_internal_m4 to mamtul files.. * Makefile.in: Regenerated. * acinclude.m4: Check for AVX, AVX2 and AVX512F. * config.h.in: Add HAVE_AVX, HAVE_AVX2 and HAVE_AVX512F. * configure: Regenerated. * configure.ac: Use checks for AVX, AVX2 and AVX_512F. * m4/matmul_internal.m4: New file. working part of matmul.m4. * m4/matmul.m4: Implement architecture-specific switching for AVX, AVX2 and AVX512F by including matmul_internal.m4 multiple times. * generated/matmul_c10.c: Regenerated. * generated/matmul_c16.c: Regenerated. * generated/matmul_c4.c: Regenerated. * generated/matmul_c8.c: Regenerated. * generated/matmul_i1.c: Regenerated. * generated/matmul_i16.c: Regenerated. * generated/matmul_i2.c: Regenerated. * generated/matmul_i4.c: Regenerated. * generated/matmul_i8.c: Regenerated. * generated/matmul_r10.c: Regenerated. * generated/matmul_r16.c: Regenerated. * generated/matmul_r4.c: Regenerated. * generated/matmul_r8.c: Regenerated. Index: config/i386/cpuinfo.c =================================================================== --- config/i386/cpuinfo.c (Revision 242477) +++ config/i386/cpuinfo.c (Arbeitskopie) @@ -26,6 +26,7 @@ see the files COPYING3 and COPYING.RUNTIME respect #include "cpuid.h" #include "tsystem.h" #include "auto-target.h" +#include "cpuinfo.h" #ifdef HAVE_INIT_PRIORITY #define CONSTRUCTOR_PRIORITY (101) @@ -36,97 +37,9 @@ see the files COPYING3 and COPYING.RUNTIME respect int __cpu_indicator_init (void) __attribute__ ((constructor CONSTRUCTOR_PRIORITY)); -/* Processor Vendor and Models. */ +struct __processor_model __cpu_model = { }; -enum processor_vendor -{ - VENDOR_INTEL = 1, - VENDOR_AMD, - VENDOR_OTHER, - VENDOR_MAX -}; -/* Any new types or subtypes have to be inserted at the end. */ - -enum processor_types -{ - INTEL_BONNELL = 1, - INTEL_CORE2, - INTEL_COREI7, - AMDFAM10H, - AMDFAM15H, - INTEL_SILVERMONT, - INTEL_KNL, - AMD_BTVER1, - AMD_BTVER2, - AMDFAM17H, - CPU_TYPE_MAX -}; - -enum processor_subtypes -{ - INTEL_COREI7_NEHALEM = 1, - INTEL_COREI7_WESTMERE, - INTEL_COREI7_SANDYBRIDGE, - AMDFAM10H_BARCELONA, - AMDFAM10H_SHANGHAI, - AMDFAM10H_ISTANBUL, - AMDFAM15H_BDVER1, - AMDFAM15H_BDVER2, - AMDFAM15H_BDVER3, - AMDFAM15H_BDVER4, - AMDFAM17H_ZNVER1, - INTEL_COREI7_IVYBRIDGE, - INTEL_COREI7_HASWELL, - INTEL_COREI7_BROADWELL, - INTEL_COREI7_SKYLAKE, - INTEL_COREI7_SKYLAKE_AVX512, - CPU_SUBTYPE_MAX -}; - -/* ISA Features supported. New features have to be inserted at the end. */ - -enum processor_features -{ - FEATURE_CMOV = 0, - FEATURE_MMX, - FEATURE_POPCNT, - FEATURE_SSE, - FEATURE_SSE2, - FEATURE_SSE3, - FEATURE_SSSE3, - FEATURE_SSE4_1, - FEATURE_SSE4_2, - FEATURE_AVX, - FEATURE_AVX2, - FEATURE_SSE4_A, - FEATURE_FMA4, - FEATURE_XOP, - FEATURE_FMA, - FEATURE_AVX512F, - FEATURE_BMI, - FEATURE_BMI2, - FEATURE_AES, - FEATURE_PCLMUL, - FEATURE_AVX512VL, - FEATURE_AVX512BW, - FEATURE_AVX512DQ, - FEATURE_AVX512CD, - FEATURE_AVX512ER, - FEATURE_AVX512PF, - FEATURE_AVX512VBMI, - FEATURE_AVX512IFMA -}; - -struct __processor_model -{ - unsigned int __cpu_vendor; - unsigned int __cpu_type; - unsigned int __cpu_subtype; - unsigned int __cpu_features[1]; -} __cpu_model = { }; - - /* Get the specific type of AMD CPU. */ static void Index: config/i386/cpuinfo.h =================================================================== --- config/i386/cpuinfo.h (Revision 0) +++ config/i386/cpuinfo.h (Arbeitskopie) @@ -0,0 +1,90 @@ + +/* Processor Vendor and Models. */ + +enum processor_vendor +{ + VENDOR_INTEL = 1, + VENDOR_AMD, + VENDOR_OTHER, + VENDOR_MAX +}; + +/* Any new types or subtypes have to be inserted at the end. */ + +enum processor_types +{ + INTEL_BONNELL = 1, + INTEL_CORE2, + INTEL_COREI7, + AMDFAM10H, + AMDFAM15H, + INTEL_SILVERMONT, + INTEL_KNL, + AMD_BTVER1, + AMD_BTVER2, + AMDFAM17H, + CPU_TYPE_MAX +}; + +enum processor_subtypes +{ + INTEL_COREI7_NEHALEM = 1, + INTEL_COREI7_WESTMERE, + INTEL_COREI7_SANDYBRIDGE, + AMDFAM10H_BARCELONA, + AMDFAM10H_SHANGHAI, + AMDFAM10H_ISTANBUL, + AMDFAM15H_BDVER1, + AMDFAM15H_BDVER2, + AMDFAM15H_BDVER3, + AMDFAM15H_BDVER4, + AMDFAM17H_ZNVER1, + INTEL_COREI7_IVYBRIDGE, + INTEL_COREI7_HASWELL, + INTEL_COREI7_BROADWELL, + INTEL_COREI7_SKYLAKE, + INTEL_COREI7_SKYLAKE_AVX512, + CPU_SUBTYPE_MAX +}; + +/* ISA Features supported. New features have to be inserted at the end. */ + +enum processor_features +{ + FEATURE_CMOV = 0, + FEATURE_MMX, + FEATURE_POPCNT, + FEATURE_SSE, + FEATURE_SSE2, + FEATURE_SSE3, + FEATURE_SSSE3, + FEATURE_SSE4_1, + FEATURE_SSE4_2, + FEATURE_AVX, + FEATURE_AVX2, + FEATURE_SSE4_A, + FEATURE_FMA4, + FEATURE_XOP, + FEATURE_FMA, + FEATURE_AVX512F, + FEATURE_BMI, + FEATURE_BMI2, + FEATURE_AES, + FEATURE_PCLMUL, + FEATURE_AVX512VL, + FEATURE_AVX512BW, + FEATURE_AVX512DQ, + FEATURE_AVX512CD, + FEATURE_AVX512ER, + FEATURE_AVX512PF, + FEATURE_AVX512VBMI, + FEATURE_AVX512IFMA +}; + +extern struct __processor_model +{ + unsigned int __cpu_vendor; + unsigned int __cpu_type; + unsigned int __cpu_subtype; + unsigned int __cpu_features[1]; +} __cpu_model;