From patchwork Mon Jul 18 11:57:23 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Vladimir Murzin
X-Patchwork-Id: 1657412
X-Original-To: incoming@patchwork.ozlabs.org
Delivered-To: patchwork-incoming@bilbo.ozlabs.org
Authentication-Results: ozlabs.org;
 spf=none (no SPF record) smtp.mailfrom=uclibc-ng.org
 (client-ip=89.238.66.15; helo=helium.openadk.org;
 envelope-from=devel-bounces@uclibc-ng.org; receiver=)
Received: from helium.openadk.org (helium.openadk.org [89.238.66.15])
 (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
 key-exchange X25519 server-signature RSA-PSS (2048 bits)
 server-digest SHA256)
 (No client certificate requested)
 by bilbo.ozlabs.org (Postfix) with ESMTPS id 4LmgT30bd7z9s2R
 for ; Mon, 18 Jul 2022 21:57:57 +1000 (AEST)
Received: from helium.openadk.org (localhost [IPv6:::1])
 by helium.openadk.org (Postfix) with ESMTP id 3907E31E073F;
 Mon, 18 Jul 2022 13:57:50 +0200 (CEST)
Received: from foss.arm.com (foss.arm.com [217.140.110.172])
 by helium.openadk.org (Postfix) with ESMTP id 21DFB31E007F;
 Mon, 18 Jul 2022 13:57:45 +0200 (CEST)
Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14])
 by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 0565B1042;
 Mon, 18 Jul 2022 04:57:44 -0700 (PDT)
Received: from login2.euhpc.arm.com (login2.euhpc.arm.com [10.6.27.34])
 by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id ED5393F70D;
 Mon, 18 Jul 2022 04:57:42 -0700 (PDT)
From: Vladimir Murzin
To: devel@uclibc-ng.org
Date: Mon, 18 Jul 2022 12:57:23 +0100
Message-Id: <20220718115724.13051-1-vladimir.murzin@arm.com>
X-Mailer: git-send-email 2.24.0
MIME-Version: 1.0
Message-ID-Hash: OI2DHYRSF5AJTJAGCROGEUHK655WPEZM
X-Message-ID-Hash: OI2DHYRSF5AJTJAGCROGEUHK655WPEZM
X-MailFrom: vladimir.murzin@arm.com
X-Mailman-Rule-Misses: dmarc-mitigation; no-senders; approved; emergency;
 loop; banned-address; member-moderation; nonmember-moderation;
 administrivia; implicit-dest; max-recipients; max-size; news-moderation;
 no-subject; digests; suspicious-header
X-Mailman-Version: 3.3.3
Precedence: list
Subject: [uclibc-ng-devel] [PATCH 1/2] linuxthreads/arm: fix ldrex/strex loop
 when built with O0
List-Id: uClibc-ng Development

An O0 build results in the following codegen:

00000000 :
   0:	b480      	push	{r7}
   2:	b085      	sub	sp, #20
   4:	af00      	add	r7, sp, #0
   6:	6078      	str	r0, [r7, #4]
   8:	687b      	ldr	r3, [r7, #4]
   a:	e853 3f00 	ldrex	r3, [r3]
   e:	60fb      	str	r3, [r7, #12]
  10:	68fb      	ldr	r3, [r7, #12]
  12:	4618      	mov	r0, r3
  14:	3714      	adds	r7, #20
  16:	46bd      	mov	sp, r7
  18:	f85d 7b04 	ldr.w	r7, [sp], #4
  1c:	4770      	bx	lr

0000001e :
  1e:	b480      	push	{r7}
  20:	b085      	sub	sp, #20
  22:	af00      	add	r7, sp, #0
  24:	6078      	str	r0, [r7, #4]
  26:	6039      	str	r1, [r7, #0]
  28:	687b      	ldr	r3, [r7, #4]
  2a:	683a      	ldr	r2, [r7, #0]
  2c:	e842 3300 	strex	r3, r3, [r2]
  30:	60fb      	str	r3, [r7, #12]
  32:	68fb      	ldr	r3, [r7, #12]
  34:	4618      	mov	r0, r3
  36:	3714      	adds	r7, #20
  38:	46bd      	mov	sp, r7
  3a:	f85d 7b04 	ldr.w	r7, [sp], #4
  3e:	4770      	bx	lr

00000040 :
  40:	b590      	push	{r4, r7, lr}
  42:	b083      	sub	sp, #12
  44:	af00      	add	r7, sp, #0
  46:	6078      	str	r0, [r7, #4]
  48:	6878      	ldr	r0, [r7, #4]
  4a:	f7ff fffe 	bl	0
  4e:	4603      	mov	r3, r0
  50:	461c      	mov	r4, r3
  52:	6879      	ldr	r1, [r7, #4]
  54:	2001      	movs	r0, #1
  56:	f7ff fffe 	bl	1e
  5a:	4603      	mov	r3, r0
  5c:	2b00      	cmp	r3, #0
  5e:	d1f3      	bne.n	48
  60:	4623      	mov	r3, r4
  62:	4618      	mov	r0, r3
  64:	370c      	adds	r7, #12
  66:	46bd      	mov	sp, r7
  68:	bd90      	pop	{r4, r7, pc}

The ARM ARM states that LoadExcl/StoreExcl loops are guaranteed to make
forward progress only if, for any LoadExcl/StoreExcl loop within a
single thread of execution, the software meets all of the following
conditions:

1  Between the Load-Exclusive and the Store-Exclusive, there are no
   explicit memory accesses, preloads, direct or indirect System
   register writes, address translation instructions, cache or TLB
   maintenance instructions, exception generating
   instructions, exception returns, or indirect branches.

...

Obviously, this condition is not met for O0 builds: the code above
accesses the stack and returns through bx lr between the ldrex and the
strex.

An O2 build (which is very likely the most common setting) is able to do
the right thing, resulting in:

00000000 :
   0:	e850 0f00 	ldrex	r0, [r0]
   4:	4770      	bx	lr
   6:	bf00      	nop

00000008 :
   8:	e841 0000 	strex	r0, r0, [r1]
   c:	4770      	bx	lr
   e:	bf00      	nop

00000010 :
  10:	2101      	movs	r1, #1
  12:	4603      	mov	r3, r0
  14:	e853 0f00 	ldrex	r0, [r3]
  18:	e843 1200 	strex	r2, r1, [r3]
  1c:	2a00      	cmp	r2, #0
  1e:	d1f9      	bne.n	14
  20:	4770      	bx	lr
  22:	bf00      	nop

Rather than depending on the optimisation level, implement the whole
ldrex/strex loop in inline assembly.

Signed-off-by: Vladimir Murzin
---
 .../linuxthreads/sysdeps/arm/pt-machine.h     | 34 +++++--------------
 1 file changed, 9 insertions(+), 25 deletions(-)

diff --git a/libpthread/linuxthreads/sysdeps/arm/pt-machine.h b/libpthread/linuxthreads/sysdeps/arm/pt-machine.h
index fc17e9bc7..b00b10495 100644
--- a/libpthread/linuxthreads/sysdeps/arm/pt-machine.h
+++ b/libpthread/linuxthreads/sysdeps/arm/pt-machine.h
@@ -29,35 +29,19 @@
 #endif
 
 #if defined(__thumb2__)
 
-PT_EI long int ldrex(int *spinlock)
-{
-	long int ret;
-	__asm__ __volatile__(
-		"ldrex %0, [%1]\n"
-		: "=r"(ret)
-		: "r"(spinlock) : "memory");
-	return ret;
-}
-
-PT_EI long int strex(int val, int *spinlock)
-{
-	long int ret;
-	__asm__ __volatile__(
-		"strex %0, %1, [%2]\n"
-		: "=r"(ret)
-		: "r" (val), "r"(spinlock) : "memory");
-	return ret;
-}
-
 /* Spinlock implementation; required. */
 PT_EI long int testandset (int *spinlock)
 {
-  register unsigned int ret;
-
-  do {
-      ret = ldrex(spinlock);
-  } while (strex(1, spinlock));
+  unsigned int ret, tmp, val = 1;
+
+  __asm__ __volatile__ (
+"0:    ldrex   %0, [%2]       \n"
+"      strex   %1, %3, [%2]   \n"
+"      cmp     %1, #0         \n"
+"      bne     0b"
+  : "=&r" (ret), "=&r" (tmp)
+  : "r" (spinlock), "r" (val) : "memory", "cc");
 
   return ret;
 }

From patchwork Mon Jul 18 11:57:24 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Vladimir Murzin
X-Patchwork-Id: 1657416
X-Original-To: incoming@patchwork.ozlabs.org
Delivered-To: patchwork-incoming@bilbo.ozlabs.org
Authentication-Results: ozlabs.org;
 spf=none (no SPF record) smtp.mailfrom=uclibc-ng.org
 (client-ip=2a00:1828:2000:679::23; helo=helium.openadk.org;
 envelope-from=devel-bounces@uclibc-ng.org; receiver=)
Received: from helium.openadk.org (helium.openadk.org [IPv6:2a00:1828:2000:679::23])
 (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
 key-exchange X25519 server-signature RSA-PSS (2048 bits)
 server-digest SHA256)
 (No client certificate requested)
 by bilbo.ozlabs.org (Postfix) with ESMTPS id 4LmgTy17rrz9sFs
 for ; Mon, 18 Jul 2022 21:58:46 +1000 (AEST)
Received: from helium.openadk.org (localhost [IPv6:::1])
 by helium.openadk.org (Postfix) with ESMTP id 4007A31E073F;
 Mon, 18 Jul 2022 13:58:36 +0200 (CEST)
Received: from foss.arm.com (foss.arm.com [217.140.110.172])
 by helium.openadk.org (Postfix) with ESMTP id 83B5A31E074C;
 Mon, 18 Jul 2022 13:57:57 +0200 (CEST)
Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14])
 by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 8B24D1042;
 Mon, 18 Jul 2022 04:57:56 -0700 (PDT)
Received: from login2.euhpc.arm.com (login2.euhpc.arm.com [10.6.27.34])
 by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id 7F1BC3F70D;
 Mon, 18 Jul 2022 04:57:55 -0700 (PDT)
From: Vladimir Murzin
To: devel@uclibc-ng.org
Date: Mon, 18 Jul 2022
 12:57:24 +0100
Message-Id: <20220718115724.13051-2-vladimir.murzin@arm.com>
X-Mailer: git-send-email 2.24.0
In-Reply-To: <20220718115724.13051-1-vladimir.murzin@arm.com>
References: <20220718115724.13051-1-vladimir.murzin@arm.com>
MIME-Version: 1.0
Message-ID-Hash: 44LK6XSA7Y5MMR2T4SQJY7BHG6WNY5VN
X-Message-ID-Hash: 44LK6XSA7Y5MMR2T4SQJY7BHG6WNY5VN
X-MailFrom: vladimir.murzin@arm.com
X-Mailman-Rule-Misses: dmarc-mitigation; no-senders; approved; emergency;
 loop; banned-address; member-moderation; nonmember-moderation;
 administrivia; implicit-dest; max-recipients; max-size; news-moderation;
 no-subject; digests; suspicious-header
X-Mailman-Version: 3.3.3
Precedence: list
Subject: [uclibc-ng-devel] [PATCH 2/2] linuxthread/arm: Unlock ldrex/strex
 version of testandset for __ARM_ARCH >= 7
List-Id: uClibc-ng Development

Thomas has reported a failure when building ARM 32-bit systems for ARMv8
cores:

  CC libpthread/linuxthreads/mutex.os
/tmp/ccn8SFKU.s: Assembler messages:
/tmp/ccn8SFKU.s:162: Error: swp{b} use is obsoleted for ARMv8 and later
/tmp/ccn8SFKU.s:186: Error: swp{b} use is obsoleted for ARMv8 and later
/tmp/ccn8SFKU.s:203: Error: swp{b} use is obsoleted for ARMv8 and later
/tmp/ccn8SFKU.s:224: Error: swp{b} use is obsoleted for ARMv8 and later
make[1]: *** [Makerules:369: libpthread/linuxthreads/mutex.os] Error 1

This is due to libpthread/linuxthreads/sysdeps/arm/pt-machine.h, which
uses the swp instruction that is not allowed on ARMv8.

All targets with __ARM_ARCH >= 7 support the ldrex/strex instructions, so
unlock the testandset() variant for them.

Reported-by: Thomas Petazzoni
Signed-off-by: Vladimir Murzin
---
 libpthread/linuxthreads/sysdeps/arm/pt-machine.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/libpthread/linuxthreads/sysdeps/arm/pt-machine.h b/libpthread/linuxthreads/sysdeps/arm/pt-machine.h
index b00b10495..3250961cf 100644
--- a/libpthread/linuxthreads/sysdeps/arm/pt-machine.h
+++ b/libpthread/linuxthreads/sysdeps/arm/pt-machine.h
@@ -28,7 +28,7 @@
 # define PT_EI __extern_always_inline
 #endif
 
-#if defined(__thumb2__)
+#if __ARM_ARCH >= 7 || defined(__thumb2__)
 
 /* Spinlock implementation; required. */
 PT_EI long int testandset (int *spinlock)
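
For readers who want to try the combined effect of both patches outside the uClibc-ng tree, the following is a minimal sketch only, not the contents of pt-machine.h. The name testandset_sketch, the extra defined(__arm__) check, and the non-ARM fallback through the GCC/Clang __atomic_exchange_n builtin are assumptions added for illustration; the fallback also uses acquire ordering, which the ldrex/strex loop in the patch does not itself provide.

```c
/* Hypothetical stand-alone sketch; names and the fallback path are
   assumptions and do not appear in the patches above.  */
static inline long int testandset_sketch (int *spinlock)
{
#if defined(__arm__) && (__ARM_ARCH >= 7 || defined(__thumb2__))
  /* Same shape as the loop in patch 1/2: the whole ldrex/strex retry
     loop stays inside one asm statement, so the compiler cannot place
     memory accesses or branches between the two instructions.  */
  unsigned int ret, tmp, val = 1;

  __asm__ __volatile__ (
"0:    ldrex   %0, [%2]       \n"
"      strex   %1, %3, [%2]   \n"
"      cmp     %1, #0         \n"
"      bne     0b"
  : "=&r" (ret), "=&r" (tmp)
  : "r" (spinlock), "r" (val) : "memory", "cc");

  return ret;
#else
  /* Assumed fallback so the sketch also builds on other hosts:
     atomically store 1 and return the previous value.  */
  return __atomic_exchange_n (spinlock, 1, __ATOMIC_ACQUIRE);
#endif
}
```

The function returns the previous lock value, so a caller would treat 0 as "lock acquired" and any other value as "already held".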