Message ID | 164977061156.1938924.8949783273940123403@helium.openadk.org |
---|---|
State | Accepted |
Headers | show |
Series | [uclibc-ng-devel] Re: Bug in memset on ARM | expand |
>>>>> "tombannink" == tombannink <tombannink@gmail.com> writes: > Good catch, thanks Peter. > From 880ae1a543f6518f37a32b3af5a734a7b72f9b39 Mon Sep 17 00:00:00 2001 > From: Tom Bannink <tombannink@gmail.com> > Date: Tue, 12 Apr 2022 11:15:41 +0200 > Subject: [PATCH] Fix bug in ARM memset implementation > The ARM implementation of memset has a bug when the fill-value is negative or outside the > [0, 255] range. To reproduce: > char array[256]; > memset(array, -5, 256); > This is supposed to fill the array with int8 values -5, -5, -5, ... . On ARM, this does > not work because the implementation assumes the high bytes of the fill-value argument are > already zero. However in this test case they are filled with 1-bits. The aarch64 and x86_64 > implementations do not have this problem: they first convert the fill-value to an unsigned > byte following the specification of memset. > With GCC one can use `memset(ptr, (-5 & 0xFF), size)` as a workaround, but for clang > users that does not work: clang optimizes the `& 0xFF` away because it assumes that > memset will do it. Acked-by: Peter Korsgaard <peter@korsgaard.com>
Hi Tom, thanks applied and pushed. best regards Waldemar tombannink@gmail.com wrote, > Good catch, thanks Peter. > > From 880ae1a543f6518f37a32b3af5a734a7b72f9b39 Mon Sep 17 00:00:00 2001 > From: Tom Bannink <tombannink@gmail.com> > Date: Tue, 12 Apr 2022 11:15:41 +0200 > Subject: [PATCH] Fix bug in ARM memset implementation > > The ARM implementation of memset has a bug when the fill-value is negative or outside the > [0, 255] range. To reproduce: > > char array[256]; > memset(array, -5, 256); > > This is supposed to fill the array with int8 values -5, -5, -5, ... . On ARM, this does > not work because the implementation assumes the high bytes of the fill-value argument are > already zero. However in this test case they are filled with 1-bits. The aarch64 and x86_64 > implementations do not have this problem: they first convert the fill-value to an unsigned > byte following the specification of memset. > > With GCC one can use `memset(ptr, (-5 & 0xFF), size)` as a workaround, but for clang > users that does not work: clang optimizes the `& 0xFF` away because it assumes that > memset will do it. > --- > libc/string/arm/memset.S | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/libc/string/arm/memset.S b/libc/string/arm/memset.S > index 412270f50..29c583f16 100644 > --- a/libc/string/arm/memset.S > +++ b/libc/string/arm/memset.S > @@ -32,6 +32,7 @@ memset: > cmp r2, #8 @ at least 8 bytes to do? > bcc 2f > > + and r1, r1, #0xFF > lsl r3, r1, #8 > orr r1, r3 > lsl r3, r1, #16 > @@ -68,6 +69,7 @@ memset: > mov a4, a1 > cmp a3, $8 @ at least 8 bytes to do? > blo 2f > + and a2, a2, #0xFF > orr a2, a2, a2, lsl $8 > orr a2, a2, a2, lsl $16 > 1: > -- > 2.35.1 > _______________________________________________ > devel mailing list -- devel@uclibc-ng.org > To unsubscribe send an email to devel-leave@uclibc-ng.org >
diff --git a/libc/string/arm/memset.S b/libc/string/arm/memset.S index 412270f50..29c583f16 100644 --- a/libc/string/arm/memset.S +++ b/libc/string/arm/memset.S @@ -32,6 +32,7 @@ memset: cmp r2, #8 @ at least 8 bytes to do? bcc 2f + and r1, r1, #0xFF lsl r3, r1, #8 orr r1, r3 lsl r3, r1, #16 @@ -68,6 +69,7 @@ memset: mov a4, a1 cmp a3, $8 @ at least 8 bytes to do? blo 2f + and a2, a2, #0xFF orr a2, a2, a2, lsl $8 orr a2, a2, a2, lsl $16 1:
Good catch, thanks Peter. From 880ae1a543f6518f37a32b3af5a734a7b72f9b39 Mon Sep 17 00:00:00 2001 From: Tom Bannink <tombannink@gmail.com> Date: Tue, 12 Apr 2022 11:15:41 +0200 Subject: [PATCH] Fix bug in ARM memset implementation The ARM implementation of memset has a bug when the fill-value is negative or outside the [0, 255] range. To reproduce: char array[256]; memset(array, -5, 256); This is supposed to fill the array with int8 values -5, -5, -5, ... . On ARM, this does not work because the implementation assumes the high bytes of the fill-value argument are already zero. However in this test case they are filled with 1-bits. The aarch64 and x86_64 implementations do not have this problem: they first convert the fill-value to an unsigned byte following the specification of memset. With GCC one can use `memset(ptr, (-5 & 0xFF), size)` as a workaround, but for clang users that does not work: clang optimizes the `& 0xFF` away because it assumes that memset will do it. --- libc/string/arm/memset.S | 2 ++ 1 file changed, 2 insertions(+)