Message ID | 20230924064516.17398-1-tirtajames45@gmail.com |
---|---|
State | New |
Headers | show |
Series | ftw.c: Check for "." and ".." branchlessly | expand |
On Sun, 2023-09-24 at 13:45 +0700, James Tirta Halim wrote: > +static int > +is_relative (const char *fname) > +{ > + typedef uint16_t w2 __attribute__ ((__may_alias__)); > + typedef uint32_t w4 __attribute__ ((__may_alias__)); > + const uint16_t n1 = '.' | '\0' SH 8; > + const uint32_t n2 = '.' | '.' SH 8 | '\0' SH 16; > + return (*(w2 *)fname == n1) | ((*(w4 *)fname | '\0' SH 24) == > n2); > +} No. The unaligned access will cause segmentation fault or bus error on some architectures.
On Sun, 2023-09-24 at 19:29 +0800, Xi Ruoyao wrote: > On Sun, 2023-09-24 at 13:45 +0700, James Tirta Halim wrote: > > +static int > > +is_relative (const char *fname) > > +{ > > + typedef uint16_t w2 __attribute__ ((__may_alias__)); > > + typedef uint32_t w4 __attribute__ ((__may_alias__)); > > + const uint16_t n1 = '.' | '\0' SH 8; > > + const uint32_t n2 = '.' | '.' SH 8 | '\0' SH 16; > > + return (*(w2 *)fname == n1) | ((*(w4 *)fname | '\0' SH 24) == > > n2); > > +} > > No. The unaligned access will cause segmentation fault or bus error on > some architectures. There are also other issues: 1. If "fname" is an empty string, there will be a buffer overread. I don't know if fname can be empty string here though. 2. If "fname" is "..", then the bytes in "w4" can be: '.', '.', '\0', **anything** The last byte is completely arbitrary (and it may be even out of a mapped memory range and then reading it will immediately cause a segfault). Again I don't know if fname can be just ".." here though. 3. We should let the compiler decide to use the branchless operation or not. If the compiler does not do the correct thing we should fix the compiler instead of writing some nasty code here because fixing the compiler will benefit both Glibc and other packages. There is some related existing GCC bug report (https://gcc.gnu.org/PR109555), the problem is using a branchless operation here is not always a win. See the comments in the bug report for reference.
diff --git a/io/ftw.c b/io/ftw.c index a72c7d5171..742369a2d2 100644 --- a/io/ftw.c +++ b/io/ftw.c @@ -66,6 +66,7 @@ char *alloca (); #include <unistd.h> #include <not-cancel.h> #include <sys/param.h> +#include <endian.h> #ifdef _LIBC # include <include/sys/stat.h> #else @@ -388,6 +389,25 @@ open_dir_stream (int *dfdp, struct ftw_data *data, struct dir_data *dirp) } +#if __BYTE_ORDER == __LITTLE_ENDIAN +# define SH << +#else +# define SH >> +#endif + +static int +is_relative (const char *fname) +{ + typedef uint16_t w2 __attribute__ ((__may_alias__)); + typedef uint32_t w4 __attribute__ ((__may_alias__)); + const uint16_t n1 = '.' | '\0' SH 8; + const uint32_t n2 = '.' | '.' SH 8 | '\0' SH 16; + return (*(w2 *)fname == n1) | ((*(w4 *)fname | '\0' SH 24) == n2); +} + +#undef SH + + static int process_entry (struct ftw_data *data, struct dir_data *dir, const char *name, size_t namlen, int d_type) @@ -397,9 +417,8 @@ process_entry (struct ftw_data *data, struct dir_data *dir, const char *name, int flag = 0; size_t new_buflen; - if (name[0] == '.' && (name[1] == '\0' - || (name[1] == '.' && name[2] == '\0'))) - /* Don't process the "." and ".." entries. */ + /* Don't process the "." and ".." entries. */ + if (is_relative(name)) return 0; new_buflen = data->ftw.base + namlen + 2;