Subject: [PATCH V5, rs6000] Support vrotr<mode>3 for int vector types
To: Segher Boessenkool
Cc: GCC Patches, Jakub Jelinek, Richard Biener, richard.sandiford@arm.com, Bill Schmidt
From: "Kewen.Lin"
Date: Fri, 2 Aug 2019 16:59:44 +0800
In-Reply-To: <20190726141004.GA31406@gate.crashing.org>
Message-Id: <85937573-94ae-8a13-2cf6-5d4b9edf97e2@linux.ibm.com>

Hi Segher,

Sorry for the late reply.  I've addressed your comments in the attached
patch.  Some points:

  1) Removed the explicit AND part.
  2) Renamed the predicate to vint_reg_or_vint_const.
  3) Split the test cases into altivec and power8.

As to the predicate name and usage: I checked the current vector shift
expanders, and they don't need to handle const_vector specially (as the
right-to-left conversion here does).  The one exception is
"vec_shr_<mode>", but it checks for a scalar const int.

Btw, I've changed

+      rtx imm_vec =
+        simplify_const_unary_operation

back to

+      rtx imm_vec
+        = simplify_const_unary_operation

Otherwise check_GNU_style will report a "Trailing operator" error. :(

Bootstrapped and regtested on powerpc64le-unknown-linux-gnu.

Thanks,
Kewen

------------

gcc/ChangeLog

2019-08-02  Kewen Lin

        * config/rs6000/predicates.md (vint_reg_or_vint_const): New predicate.
        * config/rs6000/vector.md (vrotr<mode>3): New define_expand.

gcc/testsuite/ChangeLog

2019-08-02  Kewen Lin

        * gcc.target/powerpc/vec_rotate-1.c: New test.
        * gcc.target/powerpc/vec_rotate-2.c: New test.
        * gcc.target/powerpc/vec_rotate-3.c: New test.
        * gcc.target/powerpc/vec_rotate-4.c: New test.

diff --git a/gcc/config/rs6000/predicates.md b/gcc/config/rs6000/predicates.md
index 8ca98299950..faf057425a8 100644
--- a/gcc/config/rs6000/predicates.md
+++ b/gcc/config/rs6000/predicates.md
@@ -163,6 +163,17 @@
   return VINT_REGNO_P (REGNO (op));
 })
 
+;; Return 1 if op is a vector register that operates on integer vectors
+;; or if op is a const vector with integer vector modes.
+(define_predicate "vint_reg_or_vint_const"
+  (match_code "reg,subreg,const_vector")
+{
+  if (GET_CODE (op) == CONST_VECTOR && GET_MODE_CLASS (mode) == MODE_VECTOR_INT)
+    return 1;
+
+  return vint_operand (op, mode);
+})
+
 ;; Return 1 if op is a vector register to do logical operations on (and, or,
 ;; xor, etc.)
 (define_predicate "vlogical_operand"
diff --git a/gcc/config/rs6000/vector.md b/gcc/config/rs6000/vector.md
index 70bcfe02e22..3111ca9029f 100644
--- a/gcc/config/rs6000/vector.md
+++ b/gcc/config/rs6000/vector.md
@@ -1260,6 +1260,33 @@
   "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
   "")
 
+;; Expanders for rotatert to make use of vrotl
+(define_expand "vrotr<mode>3"
+  [(set (match_operand:VEC_I 0 "vint_operand")
+        (rotatert:VEC_I (match_operand:VEC_I 1 "vint_operand")
+                (match_operand:VEC_I 2 "vint_reg_or_vint_const")))]
+  "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
+{
+  rtx rot_count = gen_reg_rtx (<MODE>mode);
+  if (GET_CODE (operands[2]) == CONST_VECTOR)
+    {
+      machine_mode inner_mode = GET_MODE_INNER (<MODE>mode);
+      unsigned int bits = GET_MODE_PRECISION (inner_mode);
+      rtx mask_vec = gen_const_vec_duplicate (<MODE>mode, GEN_INT (bits - 1));
+      rtx imm_vec
+        = simplify_const_unary_operation (NEG, <MODE>mode, operands[2],
+                                          GET_MODE (operands[2]));
+      imm_vec
+        = simplify_const_binary_operation (AND, <MODE>mode, imm_vec, mask_vec);
+      rot_count = force_reg (<MODE>mode, imm_vec);
+    }
+  else
+    emit_insn (gen_neg<mode>2 (rot_count, operands[2]));
+
+  emit_insn (gen_vrotl<mode>3 (operands[0], operands[1], rot_count));
+  DONE;
+})
+
 ;; Expanders for arithmetic shift left on each vector element
 (define_expand "vashl<mode>3"
   [(set (match_operand:VEC_I 0 "vint_operand")
diff --git a/gcc/testsuite/gcc.target/powerpc/vec_rotate-1.c b/gcc/testsuite/gcc.target/powerpc/vec_rotate-1.c
new file mode 100644
index 00000000000..f035a578292
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/vec_rotate-1.c
@@ -0,0 +1,39 @@
+/* { dg-options "-O3" } */
+/* { dg-require-effective-target powerpc_altivec_ok } */
+
+/* Check vectorizer can exploit vector rotation instructions on Power, mainly
+   for the case rotation count is const number.
+
+   Check for instructions vrlb/vrlh/vrlw only available if altivec supported.  */
+
+#define N 256
+unsigned int suw[N], ruw[N];
+unsigned short suh[N], ruh[N];
+unsigned char sub[N], rub[N];
+
+void
+testUW ()
+{
+  for (int i = 0; i < 256; ++i)
+    ruw[i] = (suw[i] >> 8) | (suw[i] << (sizeof (suw[0]) * 8 - 8));
+}
+
+void
+testUH ()
+{
+  for (int i = 0; i < 256; ++i)
+    ruh[i] = (unsigned short) (suh[i] >> 9)
+             | (unsigned short) (suh[i] << (sizeof (suh[0]) * 8 - 9));
+}
+
+void
+testUB ()
+{
+  for (int i = 0; i < 256; ++i)
+    rub[i] = (unsigned char) (sub[i] >> 5)
+             | (unsigned char) (sub[i] << (sizeof (sub[0]) * 8 - 5));
+}
+
+/* { dg-final { scan-assembler {\mvrlw\M} } } */
+/* { dg-final { scan-assembler {\mvrlh\M} } } */
+/* { dg-final { scan-assembler {\mvrlb\M} } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/vec_rotate-2.c b/gcc/testsuite/gcc.target/powerpc/vec_rotate-2.c
new file mode 100644
index 00000000000..0a2a965ddcb
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/vec_rotate-2.c
@@ -0,0 +1,19 @@
+/* { dg-require-effective-target powerpc_p8vector_ok } */
+/* { dg-options "-O3 -mdejagnu-cpu=power8" } */
+
+/* Check vectorizer can exploit vector rotation instructions on Power8, mainly
+   for the case rotation count is const number.
+
+   Check for vrld which is available on Power8 and above.  */
+
+#define N 256
+unsigned long long sud[N], rud[N];
+
+void
+testULL ()
+{
+  for (int i = 0; i < 256; ++i)
+    rud[i] = (sud[i] >> 8) | (sud[i] << (sizeof (sud[0]) * 8 - 8));
+}
+
+/* { dg-final { scan-assembler {\mvrld\M} } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/vec_rotate-3.c b/gcc/testsuite/gcc.target/powerpc/vec_rotate-3.c
new file mode 100644
index 00000000000..5e90ae6fd63
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/vec_rotate-3.c
@@ -0,0 +1,40 @@
+/* { dg-options "-O3" } */
+/* { dg-require-effective-target powerpc_altivec_ok } */
+
+/* Check vectorizer can exploit vector rotation instructions on Power, mainly
+   for the case rotation count isn't const number.
+
+   Check for instructions vrlb/vrlh/vrlw only available if altivec supported.  */
+
+#define N 256
+unsigned int suw[N], ruw[N];
+unsigned short suh[N], ruh[N];
+unsigned char sub[N], rub[N];
+extern unsigned char rot_cnt;
+
+void
+testUW ()
+{
+  for (int i = 0; i < 256; ++i)
+    ruw[i] = (suw[i] >> rot_cnt) | (suw[i] << (sizeof (suw[0]) * 8 - rot_cnt));
+}
+
+void
+testUH ()
+{
+  for (int i = 0; i < 256; ++i)
+    ruh[i] = (unsigned short) (suh[i] >> rot_cnt)
+             | (unsigned short) (suh[i] << (sizeof (suh[0]) * 8 - rot_cnt));
+}
+
+void
+testUB ()
+{
+  for (int i = 0; i < 256; ++i)
+    rub[i] = (unsigned char) (sub[i] >> rot_cnt)
+             | (unsigned char) (sub[i] << (sizeof (sub[0]) * 8 - rot_cnt));
+}
+
+/* { dg-final { scan-assembler {\mvrlw\M} } } */
+/* { dg-final { scan-assembler {\mvrlh\M} } } */
+/* { dg-final { scan-assembler {\mvrlb\M} } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/vec_rotate-4.c b/gcc/testsuite/gcc.target/powerpc/vec_rotate-4.c
new file mode 100644
index 00000000000..0d3e8378ed6
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/vec_rotate-4.c
@@ -0,0 +1,20 @@
+/* { dg-require-effective-target powerpc_p8vector_ok } */
+/* { dg-options "-O3 -mdejagnu-cpu=power8" } */
+
+/* Check vectorizer can exploit vector rotation instructions on Power8, mainly
+   for the case rotation count isn't const number.
+
+   Check for vrld which is available on Power8 and above.  */
+
+#define N 256
+unsigned long long sud[N], rud[N];
+extern unsigned char rot_cnt;
+
+void
+testULL ()
+{
+  for (int i = 0; i < 256; ++i)
+    rud[i] = (sud[i] >> rot_cnt) | (sud[i] << (sizeof (sud[0]) * 8 - rot_cnt));
+}
+
+/* { dg-final { scan-assembler {\mvrld\M} } } */
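
For reference, the new expander in vector.md relies on the identity that a
rotate right by n is the same as a rotate left by -n, taken modulo the element
width.  Below is a minimal scalar sketch of that identity for 32-bit elements.
The helper names (rotl32, rotr32, rotr32_via_rotl, check_rotr_identity) are
hypothetical and not part of the patch; rotl32 masks its count to the low 5
bits on the assumption that this mirrors how vrlw uses only the low-order bits
of each element's count.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers for illustration only; not part of the patch.  */

/* Rotate left using only the low 5 bits of the count (assumed to match
   the way vrlw interprets each element's rotate count).  */
static inline uint32_t
rotl32 (uint32_t x, uint32_t n)
{
  n &= 31;
  return n ? (x << n) | (x >> (32 - n)) : x;
}

/* Reference rotate right, written the same way as the test loops.  */
static inline uint32_t
rotr32 (uint32_t x, uint32_t n)
{
  n &= 31;
  return n ? (x >> n) | (x << (32 - n)) : x;
}

/* Rotate right expressed as rotate left by the negated count.  The
   negation wraps modulo 2^32 and rotl32 keeps only the low 5 bits, so
   no separate AND is needed here.  */
static inline uint32_t
rotr32_via_rotl (uint32_t x, uint32_t n)
{
  return rotl32 (x, 0u - n);
}

/* Check the identity for a range of counts, including counts >= 32.  */
static inline void
check_rotr_identity (uint32_t x)
{
  for (uint32_t n = 0; n < 64; ++n)
    assert (rotr32_via_rotl (x, n) == rotr32 (x, n));
}
```

For example, rotr32_via_rotl (0x12345678u, 8) negates 8, keeps the low 5 bits
(24), and rotates left by 24, giving 0x78123456, the same value as a direct
rotate right by 8.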