Message ID | 20201215044812.1068400-1-siddhesh@sourceware.org |
---|---|
State | New |
Headers | show |
Series | getmntent: Consolidate character decoding | expand |
On Dez 15 2020, Siddhesh Poyarekar via Libc-alpha wrote: > + else > + { > + /* The kernel escapes special characters with their octal > + equivalents. */ > + char c = 64 * (rp[1] - '0') + 8 * (rp[2] - '0') + rp[3] - '0'; > + *wp++ = c; > + rp += 3; That should check that the characters are indeed digits and you are not running past the end. Andreas.
On 12/15/20 2:57 PM, Andreas Schwab wrote: > On Dez 15 2020, Siddhesh Poyarekar via Libc-alpha wrote: > >> + else >> + { >> + /* The kernel escapes special characters with their octal >> + equivalents. */ >> + char c = 64 * (rp[1] - '0') + 8 * (rp[2] - '0') + rp[3] - '0'; >> + *wp++ = c; >> + rp += 3; > > That should check that the characters are indeed digits and you are not > running past the end. Indeed, I'll fix it up, thanks. Siddhesh
On 12/14/20 11:48 PM, Siddhesh Poyarekar via Libc-alpha wrote: > The Linux kernel escapes some characters (whitespaces and \) by > replacing them with their octal form in the format \xxx. Use that > format to decode insteald of looking for specific bytes so that the > check is extensible. This way if the kernel escapes additional > characters, glibc code won't have to change to accommodate it. I > have, for example, proposed[1] to escape the '#' character since it > may interfere with parsing in getmntent. > > The check for '\\\\' is kept intact even though as of today, the > kernel does not write '\\\\' to escape a backslash; it write its octal > equivalent instead. > > [1] https://lore.kernel.org/linux-fsdevel/20201215042454.998361-1-siddhesh@gotplt.org/T/#u > --- > misc/mntent_r.c | 43 +++++++++++++++---------------------------- > 1 file changed, 15 insertions(+), 28 deletions(-) Please consider how we would test this. I would like to see additional coverage here to make sure this works in the future.
On 12/15/20 8:20 PM, Carlos O'Donell via Libc-alpha wrote: > On 12/14/20 11:48 PM, Siddhesh Poyarekar via Libc-alpha wrote: >> The Linux kernel escapes some characters (whitespaces and \) by >> replacing them with their octal form in the format \xxx. Use that >> format to decode insteald of looking for specific bytes so that the >> check is extensible. This way if the kernel escapes additional >> characters, glibc code won't have to change to accommodate it. I >> have, for example, proposed[1] to escape the '#' character since it >> may interfere with parsing in getmntent. >> >> The check for '\\\\' is kept intact even though as of today, the >> kernel does not write '\\\\' to escape a backslash; it write its octal >> equivalent instead. >> >> [1] https://lore.kernel.org/linux-fsdevel/20201215042454.998361-1-siddhesh@gotplt.org/T/#u >> --- >> misc/mntent_r.c | 43 +++++++++++++++---------------------------- >> 1 file changed, 15 insertions(+), 28 deletions(-) > > Please consider how we would test this. > > I would like to see additional coverage here to make sure this works > in the future. > Sure thing. I realized my first attempt was quite a lazy one and the whole set of functions needs a closer look. I'll post a v2 with all that and tests. Siddhesh
diff --git a/misc/mntent_r.c b/misc/mntent_r.c index 0e8f10007e..f80b563f39 100644 --- a/misc/mntent_r.c +++ b/misc/mntent_r.c @@ -76,35 +76,22 @@ decode_name (char *buf) char *wp = buf; do - if (rp[0] == '\\' && rp[1] == '0' && rp[2] == '4' && rp[3] == '0') + if (rp[0] == '\\') { - /* \040 is a SPACE. */ - *wp++ = ' '; - rp += 3; - } - else if (rp[0] == '\\' && rp[1] == '0' && rp[2] == '1' && rp[3] == '1') - { - /* \011 is a TAB. */ - *wp++ = '\t'; - rp += 3; - } - else if (rp[0] == '\\' && rp[1] == '0' && rp[2] == '1' && rp[3] == '2') - { - /* \012 is a NEWLINE. */ - *wp++ = '\n'; - rp += 3; - } - else if (rp[0] == '\\' && rp[1] == '\\') - { - /* We have to escape \\ to be able to represent all characters. */ - *wp++ = '\\'; - rp += 1; - } - else if (rp[0] == '\\' && rp[1] == '1' && rp[2] == '3' && rp[3] == '4') - { - /* \134 is also \\. */ - *wp++ = '\\'; - rp += 3; + if (rp[1] == '\\') + { + /* We have to escape \\ to be able to represent all characters. */ + *wp++ = '\\'; + rp += 1; + } + else + { + /* The kernel escapes special characters with their octal + equivalents. */ + char c = 64 * (rp[1] - '0') + 8 * (rp[2] - '0') + rp[3] - '0'; + *wp++ = c; + rp += 3; + } } else *wp++ = *rp;