Message ID | 001801d02b72$6ce0c3c0$46a24b40$@com |
---|---|
State | New |
Headers | show |
Has to be __strnlen.
> Roland McGrath: > Has to be __strnlen. Good point, so it becomes: - return s; + size_t size = __strnlen (s2, n); + if (size != n) + memset (s1 + size, '\0', n - size); + return memcpy (s1, s2, size); Btw while on the subject on namespaces, is bcopy correctly defined? After a lot of complex inclusion and defines, it ultimately does: void bcopy () { ... } Wilco
> Btw while on the subject on namespaces, is bcopy correctly defined? > After a lot of complex inclusion and defines, it ultimately does: > > void > bcopy () > { > ... > } I'm not sure what kind of correctness you have in mind. bcopy is the only name under which that function needs to be defined. If it's in the same file as a standard function (presumably memmove) then it needs to be defined as weak. No other code anywhere in libc should call bcopy (we'd use memmove or memcpy instead), so it doesn't need to have an __ name.
> Roland McGrath wrote: > > Btw while on the subject on namespaces, is bcopy correctly defined? > > After a lot of complex inclusion and defines, it ultimately does: > > > > void > > bcopy () > > { > > ... > > } > > I'm not sure what kind of correctness you have in mind. bcopy is the only > name under which that function needs to be defined. If it's in the same > file as a standard function (presumably memmove) then it needs to be > defined as weak. No other code anywhere in libc should call bcopy (we'd > use memmove or memcpy instead), so it doesn't need to have an __ name. Well, bcopy and bzero are both used in several non-test sources in GLIBC, but one is always defined as __bzero, and the other as bcopy. It's unclear to me what the exact namespace rules are (or whether there is even a concise description of them somewhere), however it is obvious that this is not consistent. Note any rule that relies on whether a function is called or not from within the same library is risky unless you have automated checks to catch that. Wilco
> > Roland McGrath: > > Has to be __strnlen. > > Good point, so it becomes: > > - return s; > + size_t size = __strnlen (s2, n); > + if (size != n) > + memset (s1 + size, '\0', n - size); > + return memcpy (s1, s2, size); OK for trunk with __strnlen? Wilco
ping > Wilco Dijkstra wrote: > > > Roland McGrath: > > > Has to be __strnlen. > > > > Good point, so it becomes: > > > > - return s; > > + size_t size = __strnlen (s2, n); > > + if (size != n) > > + memset (s1 + size, '\0', n - size); > > + return memcpy (s1, s2, size); > > OK for trunk with __strnlen? > > Wilco
> > > Roland McGrath: > > > Has to be __strnlen. > > > > Good point, so it becomes: > > > > - return s; > > + size_t size = __strnlen (s2, n); > > + if (size != n) > > + memset (s1 + size, '\0', n - size); > > + return memcpy (s1, s2, size); > > OK for trunk with __strnlen? Style and whatnot is fine now. I have no opinion one way or the other about this being an appropriate performance change.
On Wed, Apr 15, 2015 at 01:33:22PM +0100, Wilco Dijkstra wrote: > ping > > > Wilco Dijkstra wrote: > > > > Roland McGrath: > > > > Has to be __strnlen. > > > > > > Good point, so it becomes: > > > > > > - return s; > > > + size_t size = __strnlen (s2, n); > > > + if (size != n) > > > + memset (s1 + size, '\0', n - size); > > > + return memcpy (s1, s2, size); > > > > OK for trunk with __strnlen? > > > > Wilco Ok for me.
On Wed, May 13, 2015 at 09:02:28PM +0200, Ondřej Bílka wrote: > On Wed, Apr 15, 2015 at 01:33:22PM +0100, Wilco Dijkstra wrote: > > ping > > > > > Wilco Dijkstra wrote: > > > > > Roland McGrath: > > > > > Has to be __strnlen. > > > > > > > > Good point, so it becomes: > > > > > > > > - return s; > > > > + size_t size = __strnlen (s2, n); > > > > + if (size != n) > > > > + memset (s1 + size, '\0', n - size); > > > > + return memcpy (s1, s2, size); > > > > > > OK for trunk with __strnlen? > > > > > > Wilco > > Ok for me. I would like this in. I plan to drop strncpy/stpcpy assembly on x64 and for it I need good generic implementation. I replaced stupid_strncpy in benchtest by yours and ran it on x64. As memcpy is ten times faster than current test for big size its definite improvement. For smaller sizes benchtest show only call overhead. However they miss icache pressure overhead as strncpy is dead cold function and should be optimized for size. Also randomizing alignment and size will completely change these numbers which are simply too inaccurate for small n. simple_strncpy stupid_strncpy __strncpy_ssse3 __strncpy_sse2_unaligned __strncpy_sse2 Length 16, n 16, alignment 1/ 1: 49.7344 45.875 25.5938 14.4688 30.2344 Length 16, n 16, alignment 1/ 1: 48.8125 42.875 24.625 13.0625 27.2188 Length 16, n 16, alignment 1/ 2: 49.7188 42.5156 24.2188 12.7812 37.5938 Length 16, n 16, alignment 2/ 1: 48.3906 42.5625 24.6406 12.7812 26.6094 Length 2, n 4, alignment 7/ 2: 18.8906 42.4062 21.9219 23.2188 22.6562 Length 4, n 2, alignment 2/ 7: 14.25 43.1094 11.6719 15.8594 16.2188 Length 2, n 4, alignment 7/ 2: 18.6094 42.1406 21.6406 23.5312 22.5625 Length 4, n 2, alignment 2/ 7: 13.4219 43.1094 11.4375 14.75 15.1094 Length 16, n 16, alignment 2/ 2: 48.25 43.2969 24.8594 14.1094 28.6719 Length 16, n 16, alignment 2/ 2: 48.1562 42.5938 24.4062 12.7812 26.7812 Length 16, n 16, alignment 2/ 4: 48.4844 42.5625 23.4375 12.8594 34.6406 Length 16, n 16, alignment 4/ 2: 48.1094 42.5156 23.9531 12.7812 26.0156 Length 4, n 8, alignment 6/ 4: 28.5469 44.2969 27.5312 23.25 24.2188 Length 8, n 4, alignment 4/ 6: 18.9688 44.5312 15.0781 15.3906 16.4062 Length 4, n 8, alignment 6/ 4: 27.9844 44.9844 25.5625 23.25 22.75 Length 8, n 4, alignment 4/ 6: 18.5625 44.25 14.75 14.7969 15.4531 Length 16, n 16, alignment 3/ 3: 49.1719 42.5625 24.625 13.1875 27.9844 Length 16, n 16, alignment 3/ 3: 48.2969 42.5156 24.4531 12.9062 26.2969 Length 16, n 16, alignment 3/ 6: 48.4375 42.5625 23.1562 12.8594 34 Length 16, n 16, alignment 6/ 3: 48.1562 42.5625 23.8438 12.8594 25.9688 Length 8, n 16, alignment 5/ 6: 46.1406 43.0156 25.6406 22.9844 36.7188 Length 16, n 8, alignment 6/ 5: 29.3125 44.0156 19.5312 14.4375 20.1719 Length 8, n 16, alignment 5/ 6: 45.7812 42.6562 25.5938 23.5781 33.875 Length 16, n 8, alignment 6/ 5: 28.625 43.5625 18.1094 13.0469 18.75 Length 16, n 16, alignment 4/ 4: 49.6719 42.6406 24.9062 13.2344 27.2969 Length 16, n 16, alignment 4/ 4: 48.2969 42.5156 24.2188 12.8594 26.1562 Length 16, n 16, alignment 4/ 0: 48.2031 42.5156 23.4844 12.7812 25.9531 Length 16, n 16, alignment 0/ 4: 48.3906 42.6875 24.5781 12.8594 34.4219 Length 16, n 32, alignment 4/ 0: 79.6875 41.4062 44.4844 23.3906 33.5 Length 32, n 16, alignment 0/ 4: 48.3438 43.4844 23.2031 13.4219 34.3281 Length 16, n 32, alignment 4/ 0: 77.1094 41.125 41.9688 23.2656 32.9062 Length 32, n 16, alignment 0/ 4: 48.3906 42.7344 22.7969 12.8594 33.5469 Length 16, n 16, alignment 5/ 5: 48.125 42.5156 24.2188 12.8594 26.6094 Length 16, n 16, alignment 5/ 5: 48.125 42.5938 24.2188 12.9062 26.5625 Length 16, n 16, alignment 5/ 2: 48.1719 42.6406 23.625 12.7812 26.0469 Length 16, n 16, alignment 2/ 5: 48.3906 42.6406 23.2969 12.8594 36.5312 Length 32, n 64, alignment 3/ 2: 160.766 44.3906 55.4219 30.6094 53.7656 Length 64, n 32, alignment 2/ 3: 126.375 39.8906 52.1094 15.9375 69.5312 Length 32, n 64, alignment 3/ 2: 159.422 40.7656 53.3594 28.8594 52.2969 Length 64, n 32, alignment 2/ 3: 125.453 38.875 51.1562 14.9375 69.0781 Length 16, n 16, alignment 6/ 6: 48.2031 42.6406 24.9062 13.6875 27.7656 Length 16, n 16, alignment 6/ 6: 48.25 42.5156 24.0938 12.7812 26.2969 Length 16, n 16, alignment 6/ 4: 48.2031 42.5156 24.0312 12.875 26.2812 Length 16, n 16, alignment 4/ 6: 48.4844 42.6406 23.625 12.8594 34.7031 Length 64, n 128, alignment 2/ 4: 287.5 54.1875 65.2969 33 129.094 Length 128, n 64, alignment 4/ 2: 159.062 49.4062 50.5938 22.0625 87.4062 Length 64, n 128, alignment 2/ 4: 288.047 51.9688 63.875 31.8438 130.25 Length 128, n 64, alignment 4/ 2: 158.781 48.9844 48.0781 21.0938 84.4219 Length 16, n 16, alignment 7/ 7: 49.125 42.5156 24.3906 13.7969 27.25 Length 16, n 16, alignment 7/ 7: 48.2031 42.5156 24.1719 12.7812 26.1406 Length 16, n 16, alignment 7/ 6: 48.2969 42.5156 24.8125 12.7812 26.2969 Length 16, n 16, alignment 6/ 7: 48.4375 42.7344 24.1719 12.8281 38.8906 Length 128, n 256, alignment 1/ 6: 502.25 74.4375 83.9531 52.4375 260.25 Length 256, n 128, alignment 6/ 1: 265.078 54.1406 80.2344 32.625 165.906 Length 128, n 256, alignment 1/ 6: 501.938 72.875 79.9688 51.7969 264.656 Length 256, n 128, alignment 6/ 1: 267.922 54.125 70.5781 32.5938 162.734 Length 8, n 16, alignment 0/ 0: 44.625 41.2656 26.7031 24.3906 27.2031 Length 32, n 16, alignment 0/ 0: 48.8906 42.5625 24.5469 13.5938 27.3438 Length 8, n 16, alignment 7/ 2: 43.3438 45.2656 25.1875 23.0312 25.7344 Length 32, n 16, alignment 7/ 2: 48.1719 42.5156 23.4375 12.7812 26.0625 Length 16, n 32, alignment 0/ 0: 78.3125 40.9062 42.2344 23.9375 36.2969 Length 64, n 32, alignment 0/ 0: 101.609 40.0312 31.4219 15.125 47.9688 Length 16, n 32, alignment 6/ 4: 76.75 41.1875 46.7812 23.1094 32.9531 Length 64, n 32, alignment 6/ 4: 101.562 38.7812 43.5625 14.3906 45.5938 Length 32, n 64, alignment 0/ 0: 159.156 38.5625 46.2344 31.25 57.125 Length 128, n 64, alignment 0/ 0: 158.5 46.6406 38.9688 19.8125 120.125 Length 32, n 64, alignment 5/ 6: 169.625 42.1875 66.7656 28.5938 77.5312 Length 128, n 64, alignment 5/ 6: 192.688 50.2656 55.9688 22.4688 158.688 Length 64, n 128, alignment 0/ 0: 257.266 49.8125 49.5938 31.8438 93.8906 Length 256, n 128, alignment 0/ 0: 266.453 51.7969 45.6875 25.5938 186.031 Length 64, n 128, alignment 4/ 0: 256.656 49.3594 61.0312 29.8281 90.8125 Length 256, n 128, alignment 4/ 0: 267.875 51.8906 68.2031 31.7188 183.781 Length 128, n 256, alignment 0/ 0: 417.922 60.625 63.8281 41.8594 198.484 Length 512, n 256, alignment 0/ 0: 480.891 70.2656 60.0156 38.875 343.984 Length 128, n 256, alignment 3/ 2: 417.688 71.5938 84.9688 42.375 197.516 Length 512, n 256, alignment 3/ 2: 480.422 74.2656 87.7344 43.5156 347.203 Length 256, n 512, alignment 0/ 0: 741.547 81.0625 76.1562 57.0312 363.656 Length 1024, n 512, alignment 0/ 0: 907.672 94.1562 80.1094 58.5469 661.672 Length 256, n 512, alignment 2/ 4: 899.453 166.906 102.953 58.3125 506.016 Length 1024, n 512, alignment 2/ 4: 1242.33 110.156 228.859 63.9688 943.797 Length 512, n 1024, alignment 0/ 0: 1380.11 116.5 103.219 94.2031 687.219 Length 2048, n 1024, alignment 0/ 0: 1762.77 147.984 120.953 101.422 1294.89 Length 512, n 1024, alignment 1/ 6: 1762.95 153.953 141.922 90.4375 945.688 Length 2048, n 1024, alignment 1/ 6: 2537.5 194.906 179.781 105.922 1797.06
> Wilco Dijkstra wrote: > ping > > > Wilco Dijkstra wrote: > > > > Roland McGrath: > > > > Has to be __strnlen. > > > > > > Good point, so it becomes: > > > > > > - return s; > > > + size_t size = __strnlen (s2, n); > > > + if (size != n) > > > + memset (s1 + size, '\0', n - size); > > > + return memcpy (s1, s2, size); > > > > OK for trunk with __strnlen? > > > > Wilco
diff --git a/string/strncpy.c b/string/strncpy.c index 0915e03..9c516d6 100644 --- a/string/strncpy.c +++ b/string/strncpy.c @@ -15,73 +15,21 @@ License along with the GNU C Library; if not, see <http://www.gnu.org/licenses/>. */ +#include <stddef.h> #include <string.h> -#include <memcopy.h> #undef strncpy #ifndef STRNCPY -#define STRNCPY strncpy +# define STRNCPY strncpy #endif char * STRNCPY (char *s1, const char *s2, size_t n) { - char c; - char *s = s1; - - --s1; - - if (n >= 4) - { - size_t n4 = n >> 2; - - for (;;) - { - c = *s2++; - *++s1 = c; - if (c == '\0') - break; - c = *s2++; - *++s1 = c; - if (c == '\0') - break; - c = *s2++; - *++s1 = c; - if (c == '\0') - break; - c = *s2++; - *++s1 = c; - if (c == '\0') - break; - if (--n4 == 0) - goto last_chars; - } - n = n - (s1 - s) - 1; - if (n == 0) - return s; - goto zero_fill; - } - - last_chars: - n &= 3; - if (n == 0) - return s; - - do - { - c = *s2++; - *++s1 = c; - if (--n == 0) - return s; - } - while (c != '\0'); - - zero_fill: - do - *++s1 = '\0'; - while (--n > 0); - - return s; + size_t size = strnlen (s2, n); + if (size != n) + memset (s1 + size, '\0', n - size); + return memcpy (s1, s2, size); } libc_hidden_builtin_def (strncpy)