[0/9] Use more flags parameters instead of global bits in stdio

Message ID 20180307193205.4751-1-zackw@panix.com

Message

Zack Weinberg March 7, 2018, 7:31 p.m. UTC
I got stuck on the patch to use C99-compliant scanf in _GNU_SOURCE
mode because the interaction with ldbl-is-dbl was too confusing.  The
reason it's too confusing is that C99 compliance in scanf, ldbl-is-dbl
mode in scanf, printf, and strfmon, and fortify mode in printf are
handled with mode bits on the FILE and thread-global flags that must
be set and reset at just the right times.  Correct behavior is
invariably to set and then reset around just one call to a lower-level
function, and there's a better way to do that: flags parameters.
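
The contrast described above can be sketched in a few lines of C.  This is
only an illustration under assumed names: demo_mode, DEMO_LDBL_IS_DBL and
the demo_* functions are hypothetical and are not the identifiers used in
the patches.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical mode bit, for illustration only; not glibc's name.  */
#define DEMO_LDBL_IS_DBL 0x1u

/* Old style: a thread-global mode word that the callee consults.  */
static _Thread_local unsigned int demo_mode;

static int
demo_format_global (char *buf, size_t len, double value)
{
  const char *tag = (demo_mode & DEMO_LDBL_IS_DBL) ? "dbl" : "ldbl";
  return snprintf (buf, len, "%s:%.1f", tag, value);
}

static int
demo_wrapper_global (char *buf, size_t len, double value)
{
  demo_mode |= DEMO_LDBL_IS_DBL;        /* Set the mode ...  */
  int ret = demo_format_global (buf, len, value);
  demo_mode &= ~DEMO_LDBL_IS_DBL;       /* ... and remember to reset it.  */
  assert ((demo_mode & DEMO_LDBL_IS_DBL) == 0);
  return ret;
}

/* New style: the mode travels with the call as a flags parameter.  */
static int
demo_format_internal (char *buf, size_t len, double value,
                      unsigned int mode_flags)
{
  const char *tag = (mode_flags & DEMO_LDBL_IS_DBL) ? "dbl" : "ldbl";
  return snprintf (buf, len, "%s:%.1f", tag, value);
}

static int
demo_wrapper_flags (char *buf, size_t len, double value)
{
  return demo_format_internal (buf, len, value, DEMO_LDBL_IS_DBL);
}

/* Returns 1 when both styles produce identical output.  */
static int
demo_styles_agree (void)
{
  char a[32], b[32];
  demo_wrapper_global (a, sizeof a, 1.5);
  demo_wrapper_flags (b, sizeof b, 1.5);
  return strcmp (a, b) == 0;
}
```

Both wrappers produce the same text, but the flags version has no state
left to restore, so an early return or a nested call cannot leave the
mode bit set by accident.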

This patch series implements _internal variants of scanf, printf,
strfmon, and syslog that take flag parameters that control C99
compliance, ldbl-is-dbl mode, and fortification.  I regret the length
and the messiness, and it might make sense to squash it on landing.
I have manually hacked the patches that introduce vfprintf-internal.c
and vfscanf-internal.c so the diffs are actually readable -- git doesn't
handle "rename this file and then create a new file in its place" very
well.

(N.B. It does not make sense to do the same thing to _IO_FLAGS2_NOTCANCEL,
even though we do occasionally want to turn that mode on temporarily.
It is also used for extended fopen mode "c", and the places where we
turn it on temporarily often want to do a whole string of printfs.
What would make sense is _IO_acquire_lock_disable_cancellation, but
this patch series is already quite long enough, and if we were going
to do that we should probably also make it easier to write to a stream
whose narrow/wide orientation is unknown.)

zw

Comments

Zack Weinberg March 12, 2018, 3:29 p.m. UTC | #1
On Wed, Mar 7, 2018 at 2:31 PM, Zack Weinberg <zackw@panix.com> wrote:
> I got stuck on the patch to use C99-compliant scanf in _GNU_SOURCE
> mode because the interaction with ldbl-is-dbl was too confusing.  The
> reason it's too confusing is that C99 compliance in scanf, ldbl-is-dbl
> mode in scanf, printf, and strfmon, and fortify mode in printf are
> handled with mode bits on the FILE and thread-global flags that must
> be set and reset at just the right times.  Correct behavior is
> invariably to set and then reset around just one call to a lower-level
> function, and there's a better way to do that: flags parameters.
>
> This patch series implements _internal variants of scanf, printf,
> strfmon, and syslog that take flag parameters that control C99
> compliance, ldbl-is-dbl mode, and fortification.

Ping?  These patches have now survived build-many-glibcs testing on
all supported platforms except the Hurd (I still can't successfully
build an i686-gnu cross compiler) and are waiting for review.  Note
that an expanded version of the "post-cleanup" has already been
committed.

zw

Gabriel F. T. Gomes March 26, 2018, 3:16 p.m. UTC | #2
On Wed, 07 Mar 2018, Zack Weinberg wrote:

>I got stuck on the patch to use C99-compliant scanf in _GNU_SOURCE
>mode because the interaction with ldbl-is-dbl was too confusing.  The
>reason it's too confusing is that C99 compliance in scanf, ldbl-is-dbl
>mode in scanf, printf, and strfmon, and fortify mode in printf are
>handled with mode bits on the FILE and thread-global flags that must
>be set and reset at just the right times.  Correct behavior is
>invariably to set and then reset around just one call to a lower-level
>function, and there's a better way to do that: flags parameters.
>
>This patch series implements _internal variants of scanf, printf,
>strfmon, and syslog that take flag parameters that control C99
>compliance, ldbl-is-dbl mode, and fortification.

Thanks for doing this.  It looks a lot less confusing now.

>I regret the length
>and the messiness, and it might make sense to squash it on landing.

Although I haven't tested each patch in the patch set individually, they
look self-contained and I don't see a compelling reason to squash them.

I did, however, test a branch with all the patches applied on powerpc64 and
powerpc64le.  The tests passed OK.

>I have manually hacked the patches that introduce vfprintf-internal.c
>and vfscanf-internal.c so the diffs are actually readable -- git doesn't
>handle "rename this file and then create a new file in its place" very
>well.

Thanks for pointing this out, it made it easier to know what to do in
order to apply them.


Overall, the patch set looks good to me.  I have some comments and
questions for each individual patch, which I'm sending right away.
(patches 1 through 5, that is. I didn't have time to write about patches
6 through 8).

Zack Weinberg March 26, 2018, 3:46 p.m. UTC | #3
On Mon, Mar 26, 2018 at 11:16 AM, Gabriel F. T. Gomes
<gabriel@inconstante.eti.br> wrote:
> On Wed, 07 Mar 2018, Zack Weinberg wrote:
>
>>I got stuck on the patch to use C99-compliant scanf in _GNU_SOURCE
>>mode because the interaction with ldbl-is-dbl was too confusing.  The
>>reason it's too confusing is that C99 compliance in scanf, ldbl-is-dbl
>>mode in scanf, printf, and strfmon, and fortify mode in printf are
>>handled with mode bits on the FILE and thread-global flags that must
>>be set and reset at just the right times.  Correct behavior is
>>invariably to set and then reset around just one call to a lower-level
>>function, and there's a better way to do that: flags parameters.
>>
>>This patch series implements _internal variants of scanf, printf,
>>strfmon, and syslog that take flag parameters that control C99
>>compliance, ldbl-is-dbl mode, and fortification.
>
> Thanks for doing this.  It looks a lot less confusing now.

Thanks for reviewing.  I will look at all your individual comments
when I cycle back to this patchset again, which might not be for a
while -- as I mentioned in another message, trying to get the hidden
annotations 100% correct has sent me down a rabbit hole (first "let's
write a test so that these problems are automatically detected in the
future", and then "... whoops, there are a lot of existing errors that
will need to be corrected before we can have that test", and now I'm
on "fixing some of those errors involves major surgery on libpthread
and ld.so" :-/ ) and meanwhile I do have a day job that has nothing to
do with any of this.

zw

Florian Weimer June 27, 2018, 3:49 p.m. UTC | #4
On 03/07/2018 08:31 PM, Zack Weinberg wrote:
> I got stuck on the patch to use C99-compliant scanf in _GNU_SOURCE
> mode because the interaction with ldbl-is-dbl was too confusing.  The
> reason it's too confusing is that C99 compliance in scanf, ldbl-is-dbl
> mode in scanf, printf, and strfmon, and fortify mode in printf are
> handled with mode bits on the FILE and thread-global flags that must
> be set and reset at just the right times.  Correct behavior is
> invariably to set and then reset around just one call to a lower-level
> function, and there's a better way to do that: flags parameters.

I looked at how this change interacts with printf format specifier 
callbacks.

There currently does not appear to be a way to determine in the callback 
if an L argument was of double or long double type.  There is code to 
adjust the argument type for double mode:

       case PA_DOUBLE|PA_FLAG_LONG_DOUBLE:
         if (__ldbl_is_dbl)
           {
             args_value[cnt].pa_double = va_arg (*ap_savep, double);
             args_type[cnt] &= ~PA_FLAG_LONG_DOUBLE;
           }
         else
           args_value[cnt].pa_long_double = va_arg (*ap_savep, long double);
         break;

But I don't think args_type is ever read back, and it's not really 
accessible to the second callback function afterwards.
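
For readers unfamiliar with the callback interface, here is a minimal
sketch of a custom conversion registered through glibc's
register_printf_specifier.  The %Y conversion and all demo_* names are
hypothetical and exist only for illustration.  The output callback can see
info->is_long_double, but it gets no record of how the argument was
actually fetched.

```c
#include <printf.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical arginfo callback for a made-up %Y conversion: it declares
   one floating-point argument, widened to long double when the L
   modifier is present.  */
static int
demo_arginfo (const struct printf_info *info, size_t n,
              int *argtypes, int *size)
{
  (void) size;  /* Only needed for user-defined argument types.  */
  if (n > 0)
    {
      argtypes[0] = PA_DOUBLE;
      if (info->is_long_double)
        argtypes[0] |= PA_FLAG_LONG_DOUBLE;
    }
  return 1;  /* Number of arguments this conversion consumes.  */
}

/* Hypothetical output callback.  It decides how to read the argument
   from the L modifier alone and has to trust that printf fetched the
   argument with the matching type.  */
static int
demo_convert (FILE *stream, const struct printf_info *info,
              const void *const *args)
{
  if (info->is_long_double)
    return fprintf (stream, "<%.2Lf>", *(const long double *) args[0]);
  return fprintf (stream, "<%.2f>", *(const double *) args[0]);
}

/* Register %Y and format one double and one long double.  The format
   strings are kept in variables so the compiler does not try to check
   the nonstandard conversion.  Returns 1 if both results look right.  */
static int
demo_run (void)
{
  char buf[32];
  const char *fmt_d = "%Y";
  const char *fmt_ld = "%LY";

  if (register_printf_specifier ('Y', demo_convert, demo_arginfo) != 0)
    return 0;
  snprintf (buf, sizeof buf, fmt_d, 1.5);
  if (strcmp (buf, "<1.50>") != 0)
    return 0;
  snprintf (buf, sizeof buf, fmt_ld, 2.25L);
  return strcmp (buf, "<2.25>") == 0;
}
```

On a configuration where the quoted code narrows the argument to double,
the long double branch of such a callback would presumably read the wrong
union member, which is the gap described above.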

With the thread-local variable, you can run something like this to 
determine if you are in double double or binary64 mode because snprintf 
will not reset the __no_long_double internal TLS variable:

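#include <stdbool.h>
#include <stdio.h>
#include <string.h>

static bool
is_long_double_mode (void)
{
   char buf[64];
   /* Bind directly to the plain snprintf symbol, bypassing any
      header-level redirection.  */
   extern __typeof__ (snprintf) snprintf_alias __asm__ ("snprintf");
   snprintf_alias (buf, sizeof (buf), "%.30Lf",
                   1234.0000000000000000000001L);
   puts (buf);
   return strcmp (buf, "1234.000000000000000000000099999997") == 0;
}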

There does not seem to be any other way to get at this variable, so I'm 
not sure this is something we need to support going forward.  The flag 
is not copied into the FILE * struct, either.  Considering that 
is_long_double_mode is so inefficient, I don't think this is anything to 
worry about for real code.

For the _IO_FLAGS2_FORTIFY flag, things are a bit different.  It is 
currently copied into the FILE * struct, so it is in theory accessible 
to the printf callbacks.  But it's now in an internal header, and it 
seems unlikely that any code would use it given that it was 
underdocumented before.  Again, this doesn't look like a practical problem.

This concern does not apply to _IO_FLAGS2_SCANF_STD because there are no 
scanf hooks, so there isn't any problem there.

So I think this means that the change from thread-local variable and 
in-FILE flags to an argument is conceptually valid.  I have only started 
to review the implementation, though.

Thanks,
Florian