diff mbox series

[fortran] PR109662 Namelist input with comma after name accepted

Message ID e1dbf978-ac53-8f2d-4db5-b153300abad3@gmail.com
State New
Series [fortran] PR109662 Namelist input with comma after name accepted

Commit Message

Jerry D May 6, 2023, 3:41 a.m. UTC
The attached patch adds a check for the invalid comma and emits a
runtime error if -std=f95, -std=f2003, or -std=f2018 is specified at
compile time.

Attached patch includes a new test case.

Regression tested on x86_64-linux-gnu.

OK for mainline?

Regards,

Jerry

Author: Jerry DeLisle <jvdelisle@gcc.gnu.org>
Date:   Fri May 5 20:12:25 2023 -0700

     Fortran: Namelist read with invalid input accepted.

             PR fortran/109662

     libgfortran/ChangeLog:

             * io/list_read.c: Add a check for a comma after a namelist
             name in read input. Issue a runtime error message.

     gcc/testsuite/ChangeLog:

             * gfortran.dg/pr109662.f90: New test.
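
For illustration, the gating described in the commit message can be sketched in standalone C as follows. This is a toy model, not the libgfortran code: the STD_* bit values, the function name, and the plain string interface are assumptions made for the example, and the name comparison is case-sensitive for brevity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical standard bits; the real GFC_STD_* values are not shown
   in this thread.  */
enum { STD_F95 = 1 << 0, STD_F2003 = 1 << 1, STD_F2018 = 1 << 2,
       STD_GNU = 1 << 3 };

/* INPUT must start with '&' followed by NAME.  Return false if the
   character right after the name is a comma and GNU extensions are
   not enabled in ALLOW_STD; every other character is left to later
   checks and reported as acceptable here.  */
bool
name_terminator_ok (const char *input, const char *name, int allow_std)
{
  size_t len = strlen (name);
  assert (input[0] == '&' && strncmp (input + 1, name, len) == 0);

  char c = input[1 + len];
  if (c == ',' && !(allow_std & STD_GNU))
    return false;   /* strict mode: comma after the name is an error */
  return true;
}
```

With a mask that lacks STD_GNU, the test input '&stuff, n = 759/' is rejected; with STD_GNU set, the same input passes this particular check.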

Comments

Steve Kargl via Gcc-patches May 6, 2023, 4:02 a.m. UTC | #1
On Fri, May 05, 2023 at 08:41:48PM -0700, Jerry D via Fortran wrote:
> The attached patch adds a check for the invalid comma and emits a runtime
> error if -std=f95,f2003,f2018 are specified at compile time.
> 
> Attached patch includes a new test case.
> 
> Regression tested on x86_64-linux-gnu.
> 
> OK for mainline?
> 

Yes.  Thanks for the fix.  It's been a long time since
I looked at libgfortran code and couldn't quite determine
where to start to fix this.
Harald Anlauf May 6, 2023, 6:15 p.m. UTC | #2
Hi Jerry, Steve,

I think I have to pour a little water into the wine.

The patch fixes the reported issue only for a comma after
the namelist name, but we still accept a few other illegal
characters, e.g. ';', because:

#define is_separator(c) (c == '/' ||  c == ',' || c == '\n' || c == ' ' \
                          || c == '\t' || c == '\r' || c == ';' || \
			 (dtp->u.p.namelist_mode && c == '!'))

We don't want that in standard conformance mode, or do we?

Cheers,
Harald
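
To make the point about the macro concrete, here is the same character set written as a standalone C function. The function name is made up for this sketch, and the dtp->u.p.namelist_mode flag is passed in as a plain parameter so that the code compiles outside libgfortran.

```c
#include <stdbool.h>

/* Same character set as the is_separator macro quoted above.
   NAMELIST_MODE stands in for dtp->u.p.namelist_mode.  */
bool
is_separator_demo (int c, bool namelist_mode)
{
  return c == '/' || c == ',' || c == '\n' || c == ' '
	 || c == '\t' || c == '\r' || c == ';'
	 || (namelist_mode && c == '!');
}
```

Both ',' and ';' satisfy this predicate, so a check that special-cases only the comma still lets a semicolon after the namelist name through.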

Jerry D May 7, 2023, 5:33 p.m. UTC | #3
On 5/6/23 11:15 AM, Harald Anlauf via Fortran wrote:
> Hi Jerry, Steve,
> 
> I think I have to pour a little water into the wine.
> 
> The patch fixes the reported issue only for a comma after
> the namelist name, but we still accept a few other illegal
> characters, e.g. ';', because:
> 
> #define is_separator(c) (c == '/' ||  c == ',' || c == '\n' || c == ' ' \
>                           || c == '\t' || c == '\r' || c == ';' || \
>               (dtp->u.p.namelist_mode && c == '!'))
> 
> We don't want that in standard conformance mode, or do we?
> 
> Cheers,
> Harald
> 
> On 5/6/23 06:02, Steve Kargl via Gcc-patches wrote:
>> On Fri, May 05, 2023 at 08:41:48PM -0700, Jerry D via Fortran wrote:
>>> The attached patch adds a check for the invalid comma and emits a 
>>> runtime
>>> error if -std=f95,f2003,f2018 are specified at compile time.
>>>
>>> Attached patch includes a new test case.
>>>
>>> Regression tested on x86_64-linux-gnu.
>>>
>>> OK for mainline?
>>>
>>
>> Yes.  Thanks for the fix.  It's been a long time since
>> I looked at libgfortran code and couldn't quite determine
>> where to start to fix this.
>>
> 

As I think back, I don't recall ever seeing a semi-colon used after a 
NAMELIST name, so I think we should reject it always.  The other "soft" 
blanks we should allow.

I will make another patch on trunk to reject the semi-colon and if no 
one objects here I will test and push it.

Regards,

Jerry
Harald Anlauf May 7, 2023, 6:33 p.m. UTC | #4
Hi Jerry,

I did a small survey of how compilers behave on a namelist read
from an internal unit when:

1.) there is a single input line of the type
"&stuff" // testchar // " n = 666/"

2.) the input spans 2 lines split after the testchar

3.) same as 2.) but first line right-adjusted

See attached source code.

Competitors: Intel, NAG, Nvidia, and gfortran at r14-547 with -std=f2018.

My findings were (last column is iostat, next-to-last is n read or -1):

NAG:

  Compiler version = NAG Fortran Compiler Release 7.1(Hanzomon) Build 7101
  1-line:       > < 666 0
  2-line/left:  > < 666 0
  2-line/right: > < 666 0
  1-line:       >!< -1 187
  2-line/left:  >!< -1 187
  2-line/right: >!< -1 187
  1-line:       >/< -1 187
  2-line/left:  >/< -1 187
  2-line/right: >/< -1 187
  1-line:       >,< -1 187
  2-line/left:  >,< -1 187
  2-line/right: >,< -1 187
  1-line:       >;< -1 187
  2-line/left:  >;< -1 187
  2-line/right: >;< -1 187
  1-line:       tab 666 0
  2-line/left:  tab 666 0
  2-line/right: tab 666 0
  1-line:       lf -1 187
  2-line/left:  lf -1 187
  2-line/right: lf -1 187
  1-line:       ret -1 187
  2-line/left:  ret -1 187
  2-line/right: ret -1 187

My interpretation of this is that NAG treats tab as (white)space,
everything else gives an error.  This is the strictest compiler.

Intel:

  Compiler version = Intel(R) Fortran Intel(R) 64 Compiler Classic for applications running on Intel(R) 64, Version 2021.9.0 Build 20230302_000000
  1-line:       > <         666           0
  2-line/left:  > <         666           0
  2-line/right: > <         666           0
  1-line:       >!<          -1          -1
  2-line/left:  >!<         666           0
  2-line/right: >!<         666           0
  1-line:       >/<          -1           0
  2-line/left:  >/<          -1           0
  2-line/right: >/<          -1           0
  1-line:       >,<          -1          17
  2-line/left:  >,<          -1          17
  2-line/right: >,<          -1          17
  1-line:       >;<          -1          17
  2-line/left:  >;<          -1          17
  2-line/right: >;<          -1          17
  1-line:       tab         666           0
  2-line/left:  tab         666           0
  2-line/right: tab         666           0
  1-line:       lf         666           0
  2-line/left:  lf         666           0
  2-line/right: lf         666           0
  1-line:       ret          -1          17
  2-line/left:  ret          -1          17
  2-line/right: ret          -1          17

Nvidia:

  Compiler version = nvfortran 23.3-0
  1-line:       > <          666            0
  2-line/left:  > <          666            0
  2-line/right: > <          666            0
  1-line:       >!<           -1           -1
  2-line/left:  >!<           -1           -1
  2-line/right: >!<           -1           -1
  1-line:       >/<           -1           -1
  2-line/left:  >/<           -1           -1
  2-line/right: >/<           -1           -1
  1-line:       >,<           -1           -1
  2-line/left:  >,<           -1           -1
  2-line/right: >,<           -1           -1
  1-line:       >;<           -1           -1
  2-line/left:  >;<           -1           -1
  2-line/right: >;<           -1           -1
  1-line:       tab          666            0
  2-line/left:  tab          666            0
  2-line/right: tab          666            0
  1-line:       lf           -1           -1
  2-line/left:  lf          666            0
  2-line/right: lf          666            0
  1-line:       ret          666            0
  2-line/left:  ret          666            0
  2-line/right: ret          666            0

gfortran (see above):

  Compiler version = GCC version 14.0.0 20230506 (experimental)
  1-line:       > <         666           0
  2-line/left:  > <         666           0
  2-line/right: > <         666           0
  1-line:       >!<          -1          -1
  2-line/left:  >!<          -1           0
  2-line/right: >!<         666           0
  1-line:       >/<          -1           0
  2-line/left:  >/<          -1           0
  2-line/right: >/<          -1           0
  1-line:       >,<         666        5010
  2-line/left:  >,<         666        5010
  2-line/right: >,<         666        5010
  1-line:       >;<         666           0
  2-line/left:  >;<         666           0
  2-line/right: >;<         666           0
  1-line:       tab         666           0
  2-line/left:  tab         666           0
  2-line/right: tab         666           0
  1-line:       lf         666           0
  2-line/left:  lf         666           0
  2-line/right: lf         666           0
  1-line:       ret         666           0
  2-line/left:  ret         666           0
  2-line/right: ret         666           0


So there seems to be a consensus that "," and ";" must be rejected,
and tab is accepted (makes real sense), but already the termination
character "/" and comment character "!" are treated differently.
And how do we want to treat lf and ret in internal files with
-std=f20xx?

Cheers,
Harald
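
The attached survey source is not reproduced on this page. As a rough illustration of the three input layouts described above, the following C sketch builds them as fixed-width, blank-padded records. The record width of 16, the struct, and all function names are assumptions made for the example; the original test was presumably written in Fortran.

```c
#include <stdio.h>
#include <string.h>

#define RECLEN 16   /* assumed record width; the thread does not state one */

/* Fill REC with blanks, then copy TEXT left- or right-adjusted.  */
static void
put_record (char rec[RECLEN + 1], const char *text, int right_adjust)
{
  size_t len = strlen (text);
  if (len > RECLEN)
    len = RECLEN;                      /* truncate overlong text */
  memset (rec, ' ', RECLEN);
  rec[RECLEN] = '\0';
  memcpy (rec + (right_adjust ? RECLEN - len : 0), text, len);
}

struct variants
{
  char one_line[RECLEN + 1];           /* case 1: single record */
  char two_left[2][RECLEN + 1];        /* case 2: split after testchar */
  char two_right[2][RECLEN + 1];       /* case 3: first record right-adjusted */
};

/* Build the three inputs for one test character.  */
void
build_variants (char testchar, struct variants *v)
{
  char head[8];                        /* "&stuff" + testchar + NUL */
  char whole[RECLEN + 1];

  snprintf (head, sizeof head, "&stuff%c", testchar);
  snprintf (whole, sizeof whole, "%s n = 666/", head);

  put_record (v->one_line, whole, 0);
  put_record (v->two_left[0], head, 0);
  put_record (v->two_left[1], " n = 666/", 0);
  put_record (v->two_right[0], head, 1);
  put_record (v->two_right[1], " n = 666/", 0);
}
```

In the right-adjusted case the test character is the last character of the first record, so the reader reaches the end of the record immediately after it.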


Steve Kargl via Gcc-patches May 8, 2023, 12:13 a.m. UTC | #5
Harald,
Thanks for keeping us honest.  I didn't check what other 
separators might cause a problem.

After 2 decades of working on gfortran, I've come to the conclusion
that -std=f2018 should be the default.  When f2023 is ratified,
the default becomes -std=f2023.  All GNU Fortran extensions
should be behind an option, and we should be aggressive about
eliminating extensions.

Yes, this means that 'real*4' and similar would require 
a -fallow-nonstandard-declaration option.
Harald Anlauf May 8, 2023, 7:03 p.m. UTC | #6
Steve,

On 5/8/23 02:13, Steve Kargl via Gcc-patches wrote:
> Harald,
> Thanks for keeping us honest.  I didn't check what other
> separators might cause a problem.
> 
> After 2 decades of working on gfortran, I've come to conclusion
> that -std=f2018 should be the default.  When f2023 is ratified,
> the default becomes -std=f2023.  All GNU fortran extension
> should be behind an option, and we should be aggressive
> eliminating extensions.
> 
> Yes, this means that 'real*4' and similar would require
> a -fallow-nonstandard-declaration option.
> 

Please, let's not get off-topic.

The issue behind the PR was F2018: 13.11.3.1, Namelist input,
which has

Input for a namelist input statement consists of
(1) optional blanks and namelist comments,
(2) the character & followed immediately by the namelist-group-name
    as specified in the NAMELIST statement,
(3) one or more blanks,

where "blanks" was to be interpreted.  Separators are discussed
separately.

Jerry has resolved "," and ";".  Good.

There is another weird issue that is visible in the testcase
output in my previous mail for "!".  Reducing that further now
suggests that the EOF condition of the namelist read of the
single line affects the namelist read of the next multi-line read.

So this one is actually a different bug, likely libgfortran's
internal state.
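
As a sketch of the three quoted items, the following standalone C function matches the start of namelist input against them. It is not the libgfortran implementation. Two assumptions are baked in and are exactly the points under discussion: space and tab both count as "blanks", and a newline before the '&' is skipped like a blank. The function names are made up for the example.

```c
#include <ctype.h>
#include <stddef.h>

/* Assumption: space and tab count as blanks.  */
static int
is_blank (int c)
{
  return c == ' ' || c == '\t';
}

/* Return the offset of the first character after the header (leading
   blanks and comments, '&', group name, one or more blanks), or -1 if
   the input S does not match.  */
long
match_group_header (const char *s, const char *name)
{
  size_t i = 0;

  /* (1) optional blanks and namelist comments ('!' to end of line).  */
  for (;;)
    {
      if (is_blank (s[i]) || s[i] == '\n')
	i++;
      else if (s[i] == '!')
	while (s[i] != '\0' && s[i] != '\n')
	  i++;
      else
	break;
    }

  /* (2) '&' followed immediately by the group name; Fortran names are
     case-insensitive.  */
  if (s[i++] != '&')
    return -1;
  for (size_t k = 0; name[k] != '\0'; k++, i++)
    if (tolower ((unsigned char) s[i]) != tolower ((unsigned char) name[k]))
      return -1;

  /* (3) one or more blanks.  */
  if (!is_blank (s[i]))
    return -1;
  while (is_blank (s[i]))
    i++;
  return (long) i;
}
```

Under these assumptions "&stuff n = 666/" matches, while "&stuff, n = 666/" and "&stuff; n = 666/" are rejected at step (3).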
Jerry D May 12, 2023, 8:36 p.m. UTC | #7
I plan to commit the following as a simple fix.

The issue was that a value was being modified on a short namelist read.
After the first read gives the correct EOF, a second read would give the
error but modify the variable.

diff --git a/libgfortran/io/unit.c b/libgfortran/io/unit.c
index 82664dc5f98..36d025949c2 100644
--- a/libgfortran/io/unit.c
+++ b/libgfortran/io/unit.c
@@ -504,6 +504,7 @@ set_internal_unit (st_parameter_dt *dtp, gfc_unit *iunit, int kind)
   iunit->current_record=0;
   iunit->read_bad = 0;
   iunit->endfile = NO_ENDFILE;
+  iunit->last_char = 0;

   /* Set flags for the internal unit.  */


The revised test case attached.  It has been regression tested OK.

Regards,

Jerry
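
The following toy model illustrates how a per-unit field that is not reset between reads can leak into the next read. It is only a sketch of the general failure mode: the struct, the function names, and the NO_CHAR sentinel are invented for the example and do not reflect how libgfortran stores or interprets last_char.

```c
#include <stddef.h>

#define NO_CHAR (-2)   /* toy sentinel for "nothing pushed back" */

struct toy_unit
{
  const char *buf;     /* contents of the internal unit */
  size_t pos;          /* read position in buf */
  int last_char;       /* one-character pushback */
};

/* Attach a new buffer to the unit.  RESET_PUSHBACK models whether the
   setup code clears the pushback field, as the added line in the diff
   above does for last_char.  */
void
toy_set_internal_unit (struct toy_unit *u, const char *buf,
		       int reset_pushback)
{
  u->buf = buf;
  u->pos = 0;
  if (reset_pushback)
    u->last_char = NO_CHAR;
}

/* Return the pushed-back character if there is one, else the next
   character of the buffer, or -1 at end of data.  */
int
toy_next_char (struct toy_unit *u)
{
  if (u->last_char != NO_CHAR)
    {
      int c = u->last_char;
      u->last_char = NO_CHAR;
      return c;
    }
  if (u->buf[u->pos] == '\0')
    return -1;
  return (unsigned char) u->buf[u->pos++];
}

void
toy_unget_char (struct toy_unit *u, int c)
{
  u->last_char = c;
}
```

If a read ends with a character still pushed back and the next setup call does not reset the field, the first character of the second read comes from the previous buffer rather than the new one.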

Patch

diff --git a/gcc/testsuite/gfortran.dg/pr109662.f90 b/gcc/testsuite/gfortran.dg/pr109662.f90
new file mode 100644
index 00000000000..988cfab73cc
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/pr109662.f90
@@ -0,0 +1,15 @@ 
+! { dg-do run }
+! { dg-options "-std=f2003" }
+! PR109662 a comma after namelist name accepted on input. 
+program testnmlread
+  implicit none
+  character(16) :: list = '&stuff, n = 759/'
+  character(100)::message
+  integer       :: n, ioresult
+  namelist/stuff/n
+  message = ""
+  ioresult = 0
+  n = 99
+  read(list,nml=stuff,iostat=ioresult)
+  if (ioresult == 0) STOP 13
+end program testnmlread
diff --git a/libgfortran/io/list_read.c b/libgfortran/io/list_read.c
index 109313c15b1..78bfd9e8787 100644
--- a/libgfortran/io/list_read.c
+++ b/libgfortran/io/list_read.c
@@ -3596,8 +3596,12 @@  find_nml_name:
   if (dtp->u.p.nml_read_error)
     goto find_nml_name;
 
-  /* A trailing space is required, we give a little latitude here, 10.9.1.  */
+  /* A trailing space is required, we allow a comma with std=gnu.  */
   c = next_char (dtp);
+  if (c == ',' && !(compile_options.allow_std & GFC_STD_GNU))
+    generate_error (&dtp->common, LIBERROR_READ_VALUE,
+		    "Comma after namelist name not allowed");
+
   if (!is_separator(c) && c != '!')
     {
       unget_char (dtp, c);