From patchwork Mon Aug 21 00:59:30 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alan Modra X-Patchwork-Id: 803823 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=gcc.gnu.org (client-ip=209.132.180.131; helo=sourceware.org; envelope-from=gcc-patches-return-460615-incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=) Authentication-Results: ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=gcc.gnu.org header.i=@gcc.gnu.org header.b="H9Kx2Zm6"; dkim-atps=neutral Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 3xbFhp50kmz9s7g for ; Mon, 21 Aug 2017 10:59:51 +1000 (AEST) DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:date :from:to:cc:subject:message-id:references:mime-version :content-type:in-reply-to; q=dns; s=default; b=VBl6sOy7BlU4cL/Em XoA2IWq5Pf61ADzLzZun/mamfrY1RGBzaqVUD4mnK8RptRLrJlMYfNTU7/H59mce yYAt3k1NHDA/03ZuFd9SvelJbQF4pLUsS9un/YMGhtJGeQf9Oilxcf/Nti8GLUyc WOBhHwSy9alP/PBd05kzeITskU= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:date :from:to:cc:subject:message-id:references:mime-version :content-type:in-reply-to; s=default; bh=nOrHXZUFEKToMa6grPg0WwO V528=; b=H9Kx2Zm6ea5p1uo4y2Hi421R9maCqRKj8CtClQ70n/WIxwS3Bq88yTs 7k3bh0RzX+Bg1PSgF34ZFjx22ae9vHsgeUKgTiu7s5fyN/0FgSZUMuvdDMW06fIA fTTVlJjQ7SRuXPN01J2Z0f0fTHijSSwJLj5mL2pavcXNuDMZ+mro= Received: (qmail 14871 invoked by alias); 21 Aug 2017 00:59:41 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Received: (qmail 14853 invoked by uid 89); 21 Aug 2017 00:59:40 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-25.1 required=5.0 tests=AWL, BAYES_00, FREEMAIL_FROM, GIT_PATCH_0, GIT_PATCH_1, GIT_PATCH_2, GIT_PATCH_3, RCVD_IN_DNSWL_NONE, SPF_PASS autolearn=ham version=3.3.2 spammy=blessed, officially, controversial, Really X-HELO: mail-pg0-f53.google.com Received: from mail-pg0-f53.google.com (HELO mail-pg0-f53.google.com) (74.125.83.53) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Mon, 21 Aug 2017 00:59:38 +0000 Received: by mail-pg0-f53.google.com with SMTP id n4so26032154pgn.1 for ; Sun, 20 Aug 2017 17:59:38 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to:user-agent; bh=kgb+90a/UTzC/RWDkKFjkIfWw2pL0b9k9qbnnWKrnrI=; b=PiGLaooWgk5RE8+m5ZNp8zgemQeoq5MCpqdP/3K+AnRNa81Wb8G7ffcQyHOYRQ5/A5 JYVZCWKmxi3wO9qQo9HO+1ykSSTk2k68AJ3SmvJPny1ze85sDawOTfaEGzweB+vPNQQv 0dP/Cuar8Qgxuqv4VuNPlErudddZIi7ryUfIi2CYMs0BKgyoPNI5YutTfbNffZySt1nx B4asGbb/xu2Xe76AprjxrvUuJ7vNq/3e+LG+Lyg243J1/UnaNuVbD6N3B0212HzpBufv Ps87bfdT/klUvhsE0cTFVHDV9nICcQjPcfFPvuQ3Ep6IajBfF+pqENbu2lEgOqaTekbO hL8w== X-Gm-Message-State: AHYfb5hbUMac22o2o67lE0Wl+eCa7qShpMoAYXRARhpiYmNg+Qj6F0hD 4PZbn0y2b69se3uT X-Received: by 10.84.169.36 with SMTP id g33mr17896379plb.52.1503277176237; Sun, 20 Aug 2017 17:59:36 -0700 (PDT) Received: from bubble.grove.modra.org (CPE-58-160-71-80.tyqh2.lon.bigpond.net.au. [58.160.71.80]) by smtp.gmail.com with ESMTPSA id 204sm19408569pga.85.2017.08.20.17.59.34 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Sun, 20 Aug 2017 17:59:35 -0700 (PDT) Received: by bubble.grove.modra.org (Postfix, from userid 1000) id 385C9C2863; Mon, 21 Aug 2017 10:29:30 +0930 (ACST) Date: Mon, 21 Aug 2017 10:29:30 +0930 From: Alan Modra To: Segher Boessenkool Cc: gcc-patches@gcc.gnu.org Subject: Re: [PATCH] Asm memory constraints Message-ID: <20170821005930.GB3368@bubble.grove.modra.org> References: <20170818144935.GA3368@bubble.grove.modra.org> <20170820130053.GT13471@gate.crashing.org> MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: <20170820130053.GT13471@gate.crashing.org> User-Agent: Mutt/1.5.24 (2015-08-30) X-IsSubscribed: yes On Sun, Aug 20, 2017 at 08:00:53AM -0500, Segher Boessenkool wrote: > Hi Alan, > > On Sat, Aug 19, 2017 at 12:19:35AM +0930, Alan Modra wrote: > > +Flushing registers to memory has performance implications and may be > > +an issue for time-sensitive code. You can provide better information > > +to GCC to avoid this, as shown in the following examples. At a > > +minimum, aliasing rules allow GCC to know what memory @emph{doesn't} > > +need to be flushed. Also, if GCC can prove that all of the outputs of > > +a non-volatile @code{asm} statement are unused, then the @code{asm} > > +may be deleted. Removal of otherwise dead @code{asm} statements will > > +not happen if they clobber @code{"memory"}. > > void f(int x) { int z; asm("hcf %0,%1" : "=r"(z) : "r"(x) : "memory"); } > void g(int x) { int z; asm("hcf %0,%1" : "=r"(z) : "r"(x)); } > > Both f and g are completely removed by the first jump pass immediately > after expand (via delete_trivially_dead_insns). > > Do you have a testcase for the behaviour you saw? Oh my. I was sure that was how "memory" worked! I see though that every gcc I have lying around, going all the way back to gcc-2.95, deletes the asm in your testcase. I definitely don't want to put something in the docs that is plain wrong, or just my idea of how things ought to work, so the last two sentences quoted above need to go. Thanks for the correction. Fixed in this revised patch. The only controversial aspect now should be whether those array casts ought to be officially blessed. I've checked that "=m" (*(T (*)[]) ptr), "=m" (*(T (*)[n]) ptr), and "=m" (*(T (*)[10]) ptr), all generate reasonable MEM_ATTRS handled apparently properly by alias.c and other code. For example, at -O3 the following shows gcc moving the read of "val" before the asm, while an asm using a "memory" clobber forces the read to occur after the asm. static int f (double *x) { int res; asm ("#%0 %1 %2" : "=r" (res) : "r" (x), "m" (*(double (*)[]) x)); return res; } int val = 123; double foo[10]; int main () { int b = f (foo); __builtin_printf ("%d %d\n", val, b); return 0; } I'm also encouraged by comments like the following by rth in 2004 (gcc/c/c-typeck.c), which say that using non-kosher lvalues in memory output constraints must continue to be supported. /* ??? Really, this should not be here. Users should be using a proper lvalue, dammit. But there's a long history of using casts in the output operands. In cases like longlong.h, this becomes a primitive form of typechecking -- if the cast can be removed, then the output operand had a type of the proper width; otherwise we'll get an error. Gross, but ... */ STRIP_NOPS (output); * doc/extend.texi (Clobbers): Correct vax example. Delete old example of a memory input for a string of known length. Move commentary out of table. Add a number of new examples covering array memory inputs. testsuite/ * gcc.target/i386/asm-mem.c: New test. diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi index 649be01..940490e 100644 --- a/gcc/doc/extend.texi +++ b/gcc/doc/extend.texi @@ -8755,7 +8755,7 @@ registers: asm volatile ("movc3 %0, %1, %2" : /* No outputs. */ : "g" (from), "g" (to), "g" (count) - : "r0", "r1", "r2", "r3", "r4", "r5"); + : "r0", "r1", "r2", "r3", "r4", "r5", "memory"); @end example Also, there are two special clobber arguments: @@ -8786,14 +8786,72 @@ Note that this clobber does not prevent the @emph{processor} from doing speculative reads past the @code{asm} statement. To prevent that, you need processor-specific fence instructions. -Flushing registers to memory has performance implications and may be an issue -for time-sensitive code. You can use a trick to avoid this if the size of -the memory being accessed is known at compile time. For example, if accessing -ten bytes of a string, use a memory input like: +@end table -@code{@{"m"( (@{ struct @{ char x[10]; @} *p = (void *)ptr ; *p; @}) )@}}. +Flushing registers to memory has performance implications and may be +an issue for time-sensitive code. You can provide better information +to GCC to avoid this, as shown in the following examples. At a +minimum, aliasing rules allow GCC to know what memory @emph{doesn't} +need to be flushed. -@end table +Here is a fictitious sum of squares instruction, that takes two +pointers to floating point values in memory and produces a floating +point register output. +Notice that @code{x}, and @code{y} both appear twice in the @code{asm} +parameters, once to specify memory accessed, and once to specify a +base register used by the @code{asm}. You won't normally be wasting a +register by doing this as GCC can use the same register for both +purposes. However, it would be foolish to use both @code{%1} and +@code{%3} for @code{x} in this @code{asm} and expect them to be the +same. In fact, @code{%3} may well not be a register. It might be a +symbolic memory reference to the object pointed to by @code{x}. + +@smallexample +asm ("sumsq %0, %1, %2" + : "+f" (result) + : "r" (x), "r" (y), "m" (*x), "m" (*y)); +@end smallexample + +Here is a fictitious @code{*z++ = *x++ * *y++} instruction. +Notice that the @code{x}, @code{y} and @code{z} pointer registers +must be specified as input/output because the @code{asm} modifies +them. + +@smallexample +asm ("vecmul %0, %1, %2" + : "+r" (z), "+r" (x), "+r" (y), "=m" (*z) + : "m" (*x), "m" (*y)); +@end smallexample + +An x86 example where the string memory argument is of unknown length. + +@smallexample +asm("repne scasb" + : "=c" (count), "+D" (p) + : "m" (*(const char (*)[]) p), "0" (-1), "a" (0)); +@end smallexample + +If you know the above will only be reading a ten byte array then you +could instead use a memory input like: +@code{"m" (*(const char (*)[10]) p)}. + +Here is an example of a PowerPC vector scale implemented in assembly, +complete with vector and condition code clobbers, and some initialized +offset registers that are unchanged by the @code{asm}. + +@smallexample +void +dscal (size_t n, double *x, double alpha) +@{ + asm ("/* lots of asm here */" + : "+m" (*(double (*)[n]) x), "+r" (n), "+b" (x) + : "d" (alpha), "b" (32), "b" (48), "b" (64), + "b" (80), "b" (96), "b" (112) + : "cr0", + "vs32","vs33","vs34","vs35","vs36","vs37","vs38","vs39", + "vs40","vs41","vs42","vs43","vs44","vs45","vs46","vs47"); +@} +@end smallexample @anchor{GotoLabels} @subsubsection Goto Labels diff --git a/gcc/testsuite/gcc.target/i386/asm-mem.c b/gcc/testsuite/gcc.target/i386/asm-mem.c new file mode 100644 index 0000000..89b713f --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/asm-mem.c @@ -0,0 +1,59 @@ +/* { dg-do run } */ +/* { dg-options "-O3" } */ + +/* Check that "m" array references are effective in preventing the + array initialization from wandering past a use in the asm, and + that the casts remain supported. */ + +static int +f1 (const char *p) +{ + int count; + + __asm__ ("repne scasb" + : "=c" (count), "+D" (p) + : "m" (*(const char (*)[]) p), "0" (-1), "a" (0)); + return -2 - count; +} + +static int +f2 (const char *p) +{ + int count; + + __asm__ ("repne scasb" + : "=c" (count), "+D" (p) + : "m" (*(const char (*)[48]) p), "0" (-1), "a" (0)); + return -2 - count; +} + +static int +f3 (int n, const char *p) +{ + int count; + + __asm__ ("repne scasb" + : "=c" (count), "+D" (p) + : "m" (*(const char (*)[n]) p), "0" (-1), "a" (0)); + return -2 - count; +} + +int +main () +{ + int a; + char buff[48] = "hello world"; + buff[4] = 0; + a = f1 (buff); + if (a != 4) + __builtin_abort (); + buff[4] = 'o'; + a = f2 (buff); + if (a != 11) + __builtin_abort (); + buff[4] = 0; + a = f3 (48, buff); + if (a != 4) + __builtin_abort (); + return 0; +}