diff mbox series

Adjust LTO mode tables for "Machine_Mode: Extend machine_mode from 8 to 16 bits" (was: [PATCH] Machine_Mode: Extend machine_mode from 8 to 16 bits)

Message ID 87o7kxuq9s.fsf@euler.schwinge.homeip.net
State New
Headers show
Series Adjust LTO mode tables for "Machine_Mode: Extend machine_mode from 8 to 16 bits" (was: [PATCH] Machine_Mode: Extend machine_mode from 8 to 16 bits) | expand

Commit Message

Thomas Schwinge June 30, 2023, 11:46 a.m. UTC
Hi!

On 2023-05-13T16:44:41+0800, Kito Cheng via Gcc-patches <gcc-patches@gcc.gnu.org> wrote:
> Tried this patch and I ran into some issues, some variables are using
> unsigned char to hold machine mode and will have problems when the
> number of modes is larger than 255...
>
> And here is the fix:

> --- a/gcc/genmodes.cc
> +++ b/gcc/genmodes.cc
> @@ -1141,10 +1141,10 @@ inline __attribute__((__always_inline__))\n\
> #else\n\
> extern __inline__ __attribute__((__always_inline__, __gnu_inline__))\n\
> #endif\n\
> -unsigned char\n\
> +unsigned short\n\
> mode_inner_inline (machine_mode mode)\n\
> {\n\
> -  extern const unsigned char mode_inner[NUM_MACHINE_MODES];\n\
> +  extern const unsigned short mode_inner[NUM_MACHINE_MODES];\n\
>   gcc_assert (mode >= 0 && mode < NUM_MACHINE_MODES);\n\
>   switch (mode)\n\
>     {");
> @@ -1529,7 +1529,7 @@ emit_mode_wider (void)
>   int c;
>   struct mode_data *m;
>
> -  print_decl ("unsigned char", "mode_next", "NUM_MACHINE_MODES");
> +  print_decl ("unsigned short", "mode_next", "NUM_MACHINE_MODES");

Etc.

Instead of 's%char%short', shouldn't we really be using
'enum machine_mode' here?  (I understand such a change may require some
further surgery, but wouldn't it be the correct thing to do?)


And, in context of working on
<https://inbox.sourceware.org/87mt0hcp12.fsf@euler.schwinge.homeip.net>
"Streamer: Fix out of range memory access of machine mode", I found
another one, see attached
'[WIP] Adjust LTO mode tables for "Machine_Mode: Extend machine_mode from 8 to 16 bits"'
(..., which applies on top of the former.)  There, in fact, I did change
to 'enum machine_mode' instead of 's%char%short' -- correct?  Any
comments on the 'GTY' issues: (1) 'const' build error,
(2) '[build-gcc]/gcc/gtype-desc.cc' changes, and (3) is 'GTY((atomic))'
actually the right thing to use, here?

In particular, the 'lto_mode_identity_table' changes would seem necessary
to keep standard LTO ('-flto') functional for large 'machine_mode' size?


Bernhard: Fancy writing a Coccinelle check whether there are any more
places where we put what originally was a 'machine_mode' type into a
'char' (or, into a non-'machine_mode' generally)?  ;-) Hey, just a Friday
afternoon idea!


Grüße
 Thomas


-----------------
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955

Comments

Kito Cheng June 30, 2023, 12:45 p.m. UTC | #1
> On 2023-05-13T16:44:41+0800, Kito Cheng via Gcc-patches <gcc-patches@gcc.gnu.org> wrote:
> > Tried this patch and I ran into some issues, some variables are using
> > unsigned char to hold machine mode and will have problems when the
> > number of modes is larger than 255...
> >
> > And here is the fix:
>
> > --- a/gcc/genmodes.cc
> > +++ b/gcc/genmodes.cc
> > @@ -1141,10 +1141,10 @@ inline __attribute__((__always_inline__))\n\
> > #else\n\
> > extern __inline__ __attribute__((__always_inline__, __gnu_inline__))\n\
> > #endif\n\
> > -unsigned char\n\
> > +unsigned short\n\
> > mode_inner_inline (machine_mode mode)\n\
> > {\n\
> > -  extern const unsigned char mode_inner[NUM_MACHINE_MODES];\n\
> > +  extern const unsigned short mode_inner[NUM_MACHINE_MODES];\n\
> >   gcc_assert (mode >= 0 && mode < NUM_MACHINE_MODES);\n\
> >   switch (mode)\n\
> >     {");
> > @@ -1529,7 +1529,7 @@ emit_mode_wider (void)
> >   int c;
> >   struct mode_data *m;
> >
> > -  print_decl ("unsigned char", "mode_next", "NUM_MACHINE_MODES");
> > +  print_decl ("unsigned short", "mode_next", "NUM_MACHINE_MODES");
>
> Etc.
>
> Instead of 's%char%short', shouldn't we really be using
> 'enum machine_mode' here?  (I understand such a change may require some
> further surgery, but wouldn't it be the correct thing to do?)

Hmmm, I think maybe what we need is to leverage C++ language features
to declare enum with underlying types like that:

enum machine_mode : uint16_t
Thomas Schwinge June 30, 2023, 4:11 p.m. UTC | #2
Hi!

On 2023-06-30T20:45:38+0800, Kito Cheng <kito.cheng@sifive.com> wrote:
>> On 2023-05-13T16:44:41+0800, Kito Cheng via Gcc-patches <gcc-patches@gcc.gnu.org> wrote:
>> > Tried this patch and I ran into some issues, some variables are using
>> > unsigned char to hold machine mode and will have problems when the
>> > number of modes is larger than 255...
>> >
>> > And here is the fix:
>>
>> > --- a/gcc/genmodes.cc
>> > +++ b/gcc/genmodes.cc
>> > @@ -1141,10 +1141,10 @@ inline __attribute__((__always_inline__))\n\
>> > #else\n\
>> > extern __inline__ __attribute__((__always_inline__, __gnu_inline__))\n\
>> > #endif\n\
>> > -unsigned char\n\
>> > +unsigned short\n\
>> > mode_inner_inline (machine_mode mode)\n\
>> > {\n\
>> > -  extern const unsigned char mode_inner[NUM_MACHINE_MODES];\n\
>> > +  extern const unsigned short mode_inner[NUM_MACHINE_MODES];\n\
>> >   gcc_assert (mode >= 0 && mode < NUM_MACHINE_MODES);\n\
>> >   switch (mode)\n\
>> >     {");
>> > @@ -1529,7 +1529,7 @@ emit_mode_wider (void)
>> >   int c;
>> >   struct mode_data *m;
>> >
>> > -  print_decl ("unsigned char", "mode_next", "NUM_MACHINE_MODES");
>> > +  print_decl ("unsigned short", "mode_next", "NUM_MACHINE_MODES");
>>
>> Etc.
>>
>> Instead of 's%char%short', shouldn't we really be using
>> 'enum machine_mode' here?  (I understand such a change may require some
>> further surgery, but wouldn't it be the correct thing to do?)
>
> Hmmm, I think maybe what we need is to leverage C++ language features
> to declare enum with underlying types like that:
>
> enum machine_mode : uint16_t

Eh, so that's the reason/confusion (or, at least some of it...) here: my
(naïve...) assumption has been that 'enum machine_mode' already does have
a fixed underlying type -- but apparently it does not, so defaults to
'unsigned int'!

    (gdb) ptype lto_mode_identity_table
    type = const enum machine_mode : unsigned int {E_VOIDmode, E_BLKmode, E_CCmode, [...], NUM_MACHINE_MODES = 130} *

So, yeah, should we fix that, and then generally use 'enum machine_mode'
instead of 'char' vs. 'short'?  (Or, which other "detail" do I fail to
recognize this time?)


Grüße
 Thomas
-----------------
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955
Jakub Jelinek June 30, 2023, 4:37 p.m. UTC | #3
On Fri, Jun 30, 2023 at 08:45:38PM +0800, Kito Cheng wrote:
> Hmmm, I think maybe what we need is to leverage C++ language features
> to declare enum with underlying types like that:
> 
> enum machine_mode : uint16_t

What would be the advantage of doing that?
I mean, on most hosts using unsigned rather than unsigned short is
actually faster, and for the cases where we care about the size
(e.g. mode in RTL, DECLs and the like) we already use enum bitfields.

	Jakub
Thomas Schwinge July 4, 2023, 3:45 p.m. UTC | #4
Hi Jakub!

On 2023-06-30T18:37:59+0200, Jakub Jelinek <jakub@redhat.com> wrote:
> On Fri, Jun 30, 2023 at 08:45:38PM +0800, Kito Cheng wrote:
>> Hmmm, I think maybe what we need is to leverage C++ language features
>> to declare enum with underlying types like that:
>>
>> enum machine_mode : uint16_t
>
> What would be the advantage of doing that?
> I mean, on most hosts using unsigned rather than unsigned short is
> actually faster

Interesting, I had not considered that, and assumed we'd always
space-optimize such things.  But yes, I do see your point.

Is it worth adding some such rationale into a new "Data Types" (or
similar) section on <https://gcc.gnu.org/codingconventions.html>,
<https://gcc.gnu.org/codingrationale.html>?

> and for the cases where we care about the size
> (e.g. mode in RTL, DECLs and the like) we already use enum bitfields.

So, which category does (current) 'unsigned char *mode_table' in
'gcc/lto-streamer.h:struct lto_file_decl_data' fall into?  Used in
'gcc/tree-streamer.h:bp_unpack_machine_mode', normally (non-offloading
case) doing "identity-lookup" via
'gcc/lto/lto-common.cc:lto_mode_identity_table'.  Should this turn into
'ENUM_BITFIELD (machine_mode)' ("native size") or
'ENUM_BITFIELD (machine_mode) : MACHINE_MODE_BITSIZE' (space-optimized)?


Grüße
 Thomas
-----------------
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955
Thomas Schwinge Aug. 10, 2023, 1:23 p.m. UTC | #5
Hi!

On 2023-08-10T12:25:36+0000, "Li, Pan2" <pan2.li@intel.com> wrote:
> Thanks Richard for comment, let me try to promote the table to unsigned short.

I have WIP work for this issue -- which I'd already raised a month ago:
<https://inbox.sourceware.org/87o7kxuq9s.fsf@euler.schwinge.homeip.net>:

On 2023-06-30T13:46:07+0200, Thomas Schwinge <thomas@codesourcery.com> wrote:
> In particular, the 'lto_mode_identity_table' changes would seem necessary
> to keep standard LTO ('-flto') functional for large 'machine_mode' size?

... which is exactly the problem you've now run into?

However, a simple:

    -GTY(()) const unsigned char *lto_mode_identity_table;
    +GTY(()) const unsigned short *lto_mode_identity_table;

..., or:

    -GTY(()) const unsigned char *lto_mode_identity_table;
    +GTY(()) const machine_mode *lto_mode_identity_table;

... is not sufficient: that runs into GTY issues, as the current
'unsigned char *lto_mode_identity_table' is (mis-)classified by
'gengtype' as a C string.  This happens to work for this case, but still
isn't right, and only works for 'char *' but not 'short *' etc.  I have
WIP work to tighten that.  ..., which got me into other GTY issues, and
so on...  ;-) (Richard already ACKed and I pushed some of the
prerequisite changes, but there's more to come.)  I'm still planning on
resolving all that mess, but I'm tight on time right now.

However, I have a different proposal, which should address your current
issue: simply, get rid of the 'lto_mode_identity_table', which is just
that: a 1-to-1 mapping of array index to value.  Instead, in
'gcc/lto/lto-common.cc:lto_file_finalize', for '!ACCEL_COMPILER', set
'file_data->mode_table = NULL', and in the users (only
'gcc/tree-streamer.h:bp_unpack_machine_mode'?), replace (untested):

    -return (machine_mode) ib->file_data->mode_table[ix];
    +return ib->file_data->mode_table ? ib->file_data->mode_table[ix] : ix;

Jakub, as the original author of 'lto_mode_identity_table' (see
commit db847fa8f2cca6139188b8dfa0a7064319b19193 (Subversion r221005)), is
there any reason not to do it this way?


Grüße
 Thomas


> -----Original Message-----
> From: Richard Biener <richard.guenther@gmail.com>
> Sent: Thursday, August 10, 2023 7:08 PM
> To: Li, Pan2 <pan2.li@intel.com>
> Cc: richard.sandiford@arm.com; Thomas Schwinge <thomas@codesourcery.com>; jakub@redhat.com; kito.cheng@gmail.com; Jeff Law <jeffreyalaw@gmail.com>; juzhe.zhong@rivai.ai; Wang, Yanzhang <yanzhang.wang@intel.com>
> Subject: Re: Machine Mode ICE in RISC-V when LTO
>
> On Thu, Aug 10, 2023 at 10:19 AM Li, Pan2 <pan2.li@intel.com> wrote:
>>
>> Hi all,
>>
>>
>>
>> Recently I found there is still some issues for the machine mode with LTO part by fixing one
>>
>> ICE (only when compile with LTO) in RISC-V backend in , aka below case.
>>
>>
>>
>> >> ../__RISC-V_INSTALL___/bin/riscv64-unknown-elf-g++ -O2 -flto gcc/testsuite/g++.dg/torture/vshuf-v4df.C -o test.elf
>>
>> during RTL pass: expand
>>
>> gcc/testsuite/g++.dg/torture/vshuf-main.inc: In function 'main':
>>
>> gcc/testsuite/g++.dg/torture/vshuf-main.inc:15:9: internal compiler error: in as_a, at machmode.h:381
>>
>>    15 |       V r = __builtin_shuffle(in1[i], mask1[i]);
>>
>>       |         ^
>>
>> 0x7e5b8e scalar_int_mode as_a<scalar_int_mode>(machine_mode)
>>
>>         ../.././gcc/gcc/machmode.h:381
>>
>> 0x7eabdb scalar_mode as_a<scalar_mode>(machine_mode)
>>
>>         ../.././gcc/gcc/expr.cc:332
>>
>> 0x7eabdb convert_mode_scalar
>>
>>         ../.././gcc/gcc/expr.cc:325
>>
>> 0xb8485b store_expr(tree_node*, rtx_def*, int, bool, bool)
>>
>>         ../.././gcc/gcc/expr.cc:6413
>>
>> 0xb8a556 store_field
>>
>>         ../.././gcc/gcc/expr.cc:7648
>>
>> 0xb88f27 store_constructor(tree_node*, rtx_def*, int, poly_int<2u, long>, bool)
>>
>>         ../.././gcc/gcc/expr.cc:7588
>>
>> 0xb8b8b8 expand_constructor
>>
>>         ../.././gcc/gcc/expr.cc:8931
>>
>> 0xb76bc7 expand_expr_real_1(tree_node*, rtx_def*, machine_mode, expand_modifier, rtx_def**, bool)
>>
>>         ../.././gcc/gcc/expr.cc:11170
>>
>> 0xb77ef7 expand_expr_real_1(tree_node*, rtx_def*, machine_mode, expand_modifier, rtx_def**, bool)
>>
>>         ../.././gcc/gcc/expr.cc:10809
>>
>> 0xb83a80 store_expr(tree_node*, rtx_def*, int, bool, bool)
>>
>>         ../.././gcc/gcc/expr.cc:6325
>>
>> 0xb851d9 expand_assignment(tree_node*, tree_node*, bool)
>>
>>         ../.././gcc/gcc/expr.cc:6043
>>
>> 0xa48717 expand_gimple_stmt_1
>>
>>         ../.././gcc/gcc/cfgexpand.cc:3946
>>
>> 0xa48717 expand_gimple_stmt
>>
>>         ../.././gcc/gcc/cfgexpand.cc:4044
>>
>> 0xa4d030 expand_gimple_basic_block
>>
>>         ../.././gcc/gcc/cfgexpand.cc:6096
>>
>> 0xa4efd6 execute
>>
>>         ../.././gcc/gcc/cfgexpand.cc:6831
>>
>>
>>
>> I double checked the reason that comes from we add even more machine modes in the RISC-V backend,
>>
>> and then did some investigation for the root cause. It should be related to the mode_table, as well as the
>>
>> bp_unpack_machine_mode.
>>
>>
>>
>> In lto_fe_init:
>>
>>    unsigned char *table
>>
>>     = ggc_vec_alloc<unsigned char> (MAX_MACHINE_MODE);
>>
>>
>>
>>    for (int m = 0; m < MAX_MACHINE_MODE; m++)
>>
>> table[m] = m;                                                              <== May overflow here given MAX_MACHINE_MODE > 256 and table[m] is unsigned char.
>>
>>
>>
>> in bp_unpack_machine_mode:
>>
>>    unsigned ix = bp_unpack_enum (bp, machine_mode, last);
>>
>>   return (machine_mode) ib->file_data->mode_table[ix];  <== May return truncated mode here.
>>
>>
>>
>> To validate this idea, I tried below hack code for double checking and then there is no ICE anymore, which indicates
>>
>> the problem here as I bet. However, the lto is quite complicated and I am not sure how to fix it in the right way.
>>
>>
>>
>> +    = ggc_vec_alloc<unsigned char> (MAX_MACHINE_MODE * 2);
>>
>> …
>>
>> +    ((unsigned short *)table)[m] = m;
>>
>> …
>>
>> +   return (machine_mode) ((unsigned short *)ib->file_data->mode_table)[ix];
>>
>>
>>
>> Besides, I also tried to change the mode_table from char * to short * but got one weird error when building as below.
>>
>>
>>
>> gcc/lto-streamer.h:599: field `(*x).mode_table' is pointer to unimplemented type
>
> We still have some places using an array of char for the mode table.
> The above is assigned
> to lto_mode_identity_table which ends up in
> lto_file_decl_data::mode_table.  I think those
> need to be all promoted to unsigned short.
>
> Richard.
>
>>
>>
>> Pan
>>
>>
-----------------
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955
Li, Pan2 via Gcc-patches Aug. 10, 2023, 2:14 p.m. UTC | #6
Thanks Thomas for the information, great to learn you have a fix WIP.

> ... is not sufficient: that runs into GTY issues, as the current
> 'unsigned char *lto_mode_identity_table' is (mis-)classified by
> 'gengtype' as a C string.  This happens to work for this case, but still
> isn't right, and only works for 'char *' but not 'short *' etc

Does it reports something like " gcc/lto-streamer.h:599: field `(*x).mode_table' is pointer to unimplemented type" when changing to short *?

>    -return (machine_mode) ib->file_data->mode_table[ix];
>    +return ib->file_data->mode_table ? ib->file_data->mode_table[ix] : ix;

Got the point and the mode_table is constant up to a point.

Pan

-----Original Message-----
From: Thomas Schwinge <thomas@codesourcery.com> 
Sent: Thursday, August 10, 2023 9:24 PM
To: Li, Pan2 <pan2.li@intel.com>; Richard Biener <richard.guenther@gmail.com>; Jakub Jelinek <jakub@redhat.com>
Cc: gcc-patches@gcc.gnu.org; richard.sandiford@arm.com; kito.cheng@gmail.com; Jeff Law <jeffreyalaw@gmail.com>; juzhe.zhong@rivai.ai; Wang, Yanzhang <yanzhang.wang@intel.com>
Subject: RE: Machine Mode ICE in RISC-V when LTO

Hi!

On 2023-08-10T12:25:36+0000, "Li, Pan2" <pan2.li@intel.com> wrote:
> Thanks Richard for comment, let me try to promote the table to unsigned short.

I have WIP work for this issue -- which I'd already raised a month ago:
<https://inbox.sourceware.org/87o7kxuq9s.fsf@euler.schwinge.homeip.net>:

On 2023-06-30T13:46:07+0200, Thomas Schwinge <thomas@codesourcery.com> wrote:
> In particular, the 'lto_mode_identity_table' changes would seem necessary
> to keep standard LTO ('-flto') functional for large 'machine_mode' size?

... which is exactly the problem you've now run into?

However, a simple:

    -GTY(()) const unsigned char *lto_mode_identity_table;
    +GTY(()) const unsigned short *lto_mode_identity_table;

..., or:

    -GTY(()) const unsigned char *lto_mode_identity_table;
    +GTY(()) const machine_mode *lto_mode_identity_table;

... is not sufficient: that runs into GTY issues, as the current
'unsigned char *lto_mode_identity_table' is (mis-)classified by
'gengtype' as a C string.  This happens to work for this case, but still
isn't right, and only works for 'char *' but not 'short *' etc.  I have
WIP work to tighten that.  ..., which got me into other GTY issues, and
so on...  ;-) (Richard already ACKed and I pushed some of the
prerequisite changes, but there's more to come.)  I'm still planning on
resolving all that mess, but I'm tight on time right now.

However, I have a different proposal, which should address your current
issue: simply, get rid of the 'lto_mode_identity_table', which is just
that: a 1-to-1 mapping of array index to value.  Instead, in
'gcc/lto/lto-common.cc:lto_file_finalize', for '!ACCEL_COMPILER', set
'file_data->mode_table = NULL', and in the users (only
'gcc/tree-streamer.h:bp_unpack_machine_mode'?), replace (untested):

    -return (machine_mode) ib->file_data->mode_table[ix];
    +return ib->file_data->mode_table ? ib->file_data->mode_table[ix] : ix;

Jakub, as the original author of 'lto_mode_identity_table' (see
commit db847fa8f2cca6139188b8dfa0a7064319b19193 (Subversion r221005)), is
there any reason not to do it this way?


Grüße
 Thomas


> -----Original Message-----
> From: Richard Biener <richard.guenther@gmail.com>
> Sent: Thursday, August 10, 2023 7:08 PM
> To: Li, Pan2 <pan2.li@intel.com>
> Cc: richard.sandiford@arm.com; Thomas Schwinge <thomas@codesourcery.com>; jakub@redhat.com; kito.cheng@gmail.com; Jeff Law <jeffreyalaw@gmail.com>; juzhe.zhong@rivai.ai; Wang, Yanzhang <yanzhang.wang@intel.com>
> Subject: Re: Machine Mode ICE in RISC-V when LTO
>
> On Thu, Aug 10, 2023 at 10:19 AM Li, Pan2 <pan2.li@intel.com> wrote:
>>
>> Hi all,
>>
>>
>>
>> Recently I found there is still some issues for the machine mode with LTO part by fixing one
>>
>> ICE (only when compile with LTO) in RISC-V backend in , aka below case.
>>
>>
>>
>> >> ../__RISC-V_INSTALL___/bin/riscv64-unknown-elf-g++ -O2 -flto gcc/testsuite/g++.dg/torture/vshuf-v4df.C -o test.elf
>>
>> during RTL pass: expand
>>
>> gcc/testsuite/g++.dg/torture/vshuf-main.inc: In function 'main':
>>
>> gcc/testsuite/g++.dg/torture/vshuf-main.inc:15:9: internal compiler error: in as_a, at machmode.h:381
>>
>>    15 |       V r = __builtin_shuffle(in1[i], mask1[i]);
>>
>>       |         ^
>>
>> 0x7e5b8e scalar_int_mode as_a<scalar_int_mode>(machine_mode)
>>
>>         ../.././gcc/gcc/machmode.h:381
>>
>> 0x7eabdb scalar_mode as_a<scalar_mode>(machine_mode)
>>
>>         ../.././gcc/gcc/expr.cc:332
>>
>> 0x7eabdb convert_mode_scalar
>>
>>         ../.././gcc/gcc/expr.cc:325
>>
>> 0xb8485b store_expr(tree_node*, rtx_def*, int, bool, bool)
>>
>>         ../.././gcc/gcc/expr.cc:6413
>>
>> 0xb8a556 store_field
>>
>>         ../.././gcc/gcc/expr.cc:7648
>>
>> 0xb88f27 store_constructor(tree_node*, rtx_def*, int, poly_int<2u, long>, bool)
>>
>>         ../.././gcc/gcc/expr.cc:7588
>>
>> 0xb8b8b8 expand_constructor
>>
>>         ../.././gcc/gcc/expr.cc:8931
>>
>> 0xb76bc7 expand_expr_real_1(tree_node*, rtx_def*, machine_mode, expand_modifier, rtx_def**, bool)
>>
>>         ../.././gcc/gcc/expr.cc:11170
>>
>> 0xb77ef7 expand_expr_real_1(tree_node*, rtx_def*, machine_mode, expand_modifier, rtx_def**, bool)
>>
>>         ../.././gcc/gcc/expr.cc:10809
>>
>> 0xb83a80 store_expr(tree_node*, rtx_def*, int, bool, bool)
>>
>>         ../.././gcc/gcc/expr.cc:6325
>>
>> 0xb851d9 expand_assignment(tree_node*, tree_node*, bool)
>>
>>         ../.././gcc/gcc/expr.cc:6043
>>
>> 0xa48717 expand_gimple_stmt_1
>>
>>         ../.././gcc/gcc/cfgexpand.cc:3946
>>
>> 0xa48717 expand_gimple_stmt
>>
>>         ../.././gcc/gcc/cfgexpand.cc:4044
>>
>> 0xa4d030 expand_gimple_basic_block
>>
>>         ../.././gcc/gcc/cfgexpand.cc:6096
>>
>> 0xa4efd6 execute
>>
>>         ../.././gcc/gcc/cfgexpand.cc:6831
>>
>>
>>
>> I double checked the reason that comes from we add even more machine modes in the RISC-V backend,
>>
>> and then did some investigation for the root cause. It should be related to the mode_table, as well as the
>>
>> bp_unpack_machine_mode.
>>
>>
>>
>> In lto_fe_init:
>>
>>    unsigned char *table
>>
>>     = ggc_vec_alloc<unsigned char> (MAX_MACHINE_MODE);
>>
>>
>>
>>    for (int m = 0; m < MAX_MACHINE_MODE; m++)
>>
>> table[m] = m;                                                              <== May overflow here given MAX_MACHINE_MODE > 256 and table[m] is unsigned char.
>>
>>
>>
>> in bp_unpack_machine_mode:
>>
>>    unsigned ix = bp_unpack_enum (bp, machine_mode, last);
>>
>>   return (machine_mode) ib->file_data->mode_table[ix];  <== May return truncated mode here.
>>
>>
>>
>> To validate this idea, I tried below hack code for double checking and then there is no ICE anymore, which indicates
>>
>> the problem here as I bet. However, the lto is quite complicated and I am not sure how to fix it in the right way.
>>
>>
>>
>> +    = ggc_vec_alloc<unsigned char> (MAX_MACHINE_MODE * 2);
>>
>> …
>>
>> +    ((unsigned short *)table)[m] = m;
>>
>> …
>>
>> +   return (machine_mode) ((unsigned short *)ib->file_data->mode_table)[ix];
>>
>>
>>
>> Besides, I also tried to change the mode_table from char * to short * but got one weird error when building as below.
>>
>>
>>
>> gcc/lto-streamer.h:599: field `(*x).mode_table' is pointer to unimplemented type
>
> We still have some places using an array of char for the mode table.
> The above is assigned
> to lto_mode_identity_table which ends up in
> lto_file_decl_data::mode_table.  I think those
> need to be all promoted to unsigned short.
>
> Richard.
>
>>
>>
>> Pan
>>
>>
-----------------
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955
Li, Pan2 via Gcc-patches Aug. 11, 2023, 6:40 a.m. UTC | #7
Hi Thomas,

Just FYI that tried your proposal as below for the lto ICE in RISC-V, it can resolve the ICE (gcc/testsuite/g++.dg/torture/vshuf-v4df.C) as expected. Let's wait Jakub's comments for this.

diff --git a/gcc/lto/lto-common.cc b/gcc/lto/lto-common.cc
index 703e665b698..970e3ea11ac 100644
--- a/gcc/lto/lto-common.cc
+++ b/gcc/lto/lto-common.cc
@@ -2277,7 +2277,7 @@ lto_file_finalize (struct lto_file_decl_data *file_data, lto_file *file,
 #ifdef ACCEL_COMPILER
   lto_input_mode_table (file_data);
 #else
-  file_data->mode_table = lto_mode_identity_table;
+  file_data->mode_table = NULL;
   file_data->mode_bits = ceil_log2 (MAX_MACHINE_MODE);
 #endif

diff --git a/gcc/tree-streamer.h b/gcc/tree-streamer.h
index ff49d1ba637..30208692bc7 100644
--- a/gcc/tree-streamer.h
+++ b/gcc/tree-streamer.h
@@ -118,7 +118,11 @@ bp_unpack_machine_mode (struct bitpack_d *bp)
   lto_input_block *ib = (class lto_input_block *) bp->stream;
   int last = 1 << ib->file_data->mode_bits;
   unsigned ix = bp_unpack_enum (bp, machine_mode, last);
-  return (machine_mode) ib->file_data->mode_table[ix];
+
+  if (ib->file_data->mode_table)
+    return (machine_mode) ib->file_data->mode_table[ix];
+  else
+    return (machine_mode) ix;
 }

 #endif  /* GCC_TREE_STREAMER_H  */

Pan

-----Original Message-----
From: Li, Pan2 
Sent: Thursday, August 10, 2023 10:14 PM
To: Thomas Schwinge <thomas@codesourcery.com>; Richard Biener <richard.guenther@gmail.com>; Jakub Jelinek <jakub@redhat.com>
Cc: gcc-patches@gcc.gnu.org; richard.sandiford@arm.com; kito.cheng@gmail.com; Jeff Law <jeffreyalaw@gmail.com>; juzhe.zhong@rivai.ai; Wang, Yanzhang <yanzhang.wang@intel.com>
Subject: RE: Machine Mode ICE in RISC-V when LTO

Thanks Thomas for the information, great to learn you have a fix WIP.

> ... is not sufficient: that runs into GTY issues, as the current
> 'unsigned char *lto_mode_identity_table' is (mis-)classified by
> 'gengtype' as a C string.  This happens to work for this case, but still
> isn't right, and only works for 'char *' but not 'short *' etc

Does it reports something like " gcc/lto-streamer.h:599: field `(*x).mode_table' is pointer to unimplemented type" when changing to short *?

>    -return (machine_mode) ib->file_data->mode_table[ix];
>    +return ib->file_data->mode_table ? ib->file_data->mode_table[ix] : ix;

Got the point and the mode_table is constant up to a point.

Pan

-----Original Message-----
From: Thomas Schwinge <thomas@codesourcery.com> 
Sent: Thursday, August 10, 2023 9:24 PM
To: Li, Pan2 <pan2.li@intel.com>; Richard Biener <richard.guenther@gmail.com>; Jakub Jelinek <jakub@redhat.com>
Cc: gcc-patches@gcc.gnu.org; richard.sandiford@arm.com; kito.cheng@gmail.com; Jeff Law <jeffreyalaw@gmail.com>; juzhe.zhong@rivai.ai; Wang, Yanzhang <yanzhang.wang@intel.com>
Subject: RE: Machine Mode ICE in RISC-V when LTO

Hi!

On 2023-08-10T12:25:36+0000, "Li, Pan2" <pan2.li@intel.com> wrote:
> Thanks Richard for comment, let me try to promote the table to unsigned short.

I have WIP work for this issue -- which I'd already raised a month ago:
<https://inbox.sourceware.org/87o7kxuq9s.fsf@euler.schwinge.homeip.net>:

On 2023-06-30T13:46:07+0200, Thomas Schwinge <thomas@codesourcery.com> wrote:
> In particular, the 'lto_mode_identity_table' changes would seem necessary
> to keep standard LTO ('-flto') functional for large 'machine_mode' size?

... which is exactly the problem you've now run into?

However, a simple:

    -GTY(()) const unsigned char *lto_mode_identity_table;
    +GTY(()) const unsigned short *lto_mode_identity_table;

..., or:

    -GTY(()) const unsigned char *lto_mode_identity_table;
    +GTY(()) const machine_mode *lto_mode_identity_table;

... is not sufficient: that runs into GTY issues, as the current
'unsigned char *lto_mode_identity_table' is (mis-)classified by
'gengtype' as a C string.  This happens to work for this case, but still
isn't right, and only works for 'char *' but not 'short *' etc.  I have
WIP work to tighten that.  ..., which got me into other GTY issues, and
so on...  ;-) (Richard already ACKed and I pushed some of the
prerequisite changes, but there's more to come.)  I'm still planning on
resolving all that mess, but I'm tight on time right now.

However, I have a different proposal, which should address your current
issue: simply, get rid of the 'lto_mode_identity_table', which is just
that: a 1-to-1 mapping of array index to value.  Instead, in
'gcc/lto/lto-common.cc:lto_file_finalize', for '!ACCEL_COMPILER', set
'file_data->mode_table = NULL', and in the users (only
'gcc/tree-streamer.h:bp_unpack_machine_mode'?), replace (untested):

    -return (machine_mode) ib->file_data->mode_table[ix];
    +return ib->file_data->mode_table ? ib->file_data->mode_table[ix] : ix;

Jakub, as the original author of 'lto_mode_identity_table' (see
commit db847fa8f2cca6139188b8dfa0a7064319b19193 (Subversion r221005)), is
there any reason not to do it this way?


Grüße
 Thomas


> -----Original Message-----
> From: Richard Biener <richard.guenther@gmail.com>
> Sent: Thursday, August 10, 2023 7:08 PM
> To: Li, Pan2 <pan2.li@intel.com>
> Cc: richard.sandiford@arm.com; Thomas Schwinge <thomas@codesourcery.com>; jakub@redhat.com; kito.cheng@gmail.com; Jeff Law <jeffreyalaw@gmail.com>; juzhe.zhong@rivai.ai; Wang, Yanzhang <yanzhang.wang@intel.com>
> Subject: Re: Machine Mode ICE in RISC-V when LTO
>
> On Thu, Aug 10, 2023 at 10:19 AM Li, Pan2 <pan2.li@intel.com> wrote:
>>
>> Hi all,
>>
>>
>>
>> Recently I found there is still some issues for the machine mode with LTO part by fixing one
>>
>> ICE (only when compile with LTO) in RISC-V backend in , aka below case.
>>
>>
>>
>> >> ../__RISC-V_INSTALL___/bin/riscv64-unknown-elf-g++ -O2 -flto gcc/testsuite/g++.dg/torture/vshuf-v4df.C -o test.elf
>>
>> during RTL pass: expand
>>
>> gcc/testsuite/g++.dg/torture/vshuf-main.inc: In function 'main':
>>
>> gcc/testsuite/g++.dg/torture/vshuf-main.inc:15:9: internal compiler error: in as_a, at machmode.h:381
>>
>>    15 |       V r = __builtin_shuffle(in1[i], mask1[i]);
>>
>>       |         ^
>>
>> 0x7e5b8e scalar_int_mode as_a<scalar_int_mode>(machine_mode)
>>
>>         ../.././gcc/gcc/machmode.h:381
>>
>> 0x7eabdb scalar_mode as_a<scalar_mode>(machine_mode)
>>
>>         ../.././gcc/gcc/expr.cc:332
>>
>> 0x7eabdb convert_mode_scalar
>>
>>         ../.././gcc/gcc/expr.cc:325
>>
>> 0xb8485b store_expr(tree_node*, rtx_def*, int, bool, bool)
>>
>>         ../.././gcc/gcc/expr.cc:6413
>>
>> 0xb8a556 store_field
>>
>>         ../.././gcc/gcc/expr.cc:7648
>>
>> 0xb88f27 store_constructor(tree_node*, rtx_def*, int, poly_int<2u, long>, bool)
>>
>>         ../.././gcc/gcc/expr.cc:7588
>>
>> 0xb8b8b8 expand_constructor
>>
>>         ../.././gcc/gcc/expr.cc:8931
>>
>> 0xb76bc7 expand_expr_real_1(tree_node*, rtx_def*, machine_mode, expand_modifier, rtx_def**, bool)
>>
>>         ../.././gcc/gcc/expr.cc:11170
>>
>> 0xb77ef7 expand_expr_real_1(tree_node*, rtx_def*, machine_mode, expand_modifier, rtx_def**, bool)
>>
>>         ../.././gcc/gcc/expr.cc:10809
>>
>> 0xb83a80 store_expr(tree_node*, rtx_def*, int, bool, bool)
>>
>>         ../.././gcc/gcc/expr.cc:6325
>>
>> 0xb851d9 expand_assignment(tree_node*, tree_node*, bool)
>>
>>         ../.././gcc/gcc/expr.cc:6043
>>
>> 0xa48717 expand_gimple_stmt_1
>>
>>         ../.././gcc/gcc/cfgexpand.cc:3946
>>
>> 0xa48717 expand_gimple_stmt
>>
>>         ../.././gcc/gcc/cfgexpand.cc:4044
>>
>> 0xa4d030 expand_gimple_basic_block
>>
>>         ../.././gcc/gcc/cfgexpand.cc:6096
>>
>> 0xa4efd6 execute
>>
>>         ../.././gcc/gcc/cfgexpand.cc:6831
>>
>>
>>
>> I double checked the reason that comes from we add even more machine modes in the RISC-V backend,
>>
>> and then did some investigation for the root cause. It should be related to the mode_table, as well as the
>>
>> bp_unpack_machine_mode.
>>
>>
>>
>> In lto_fe_init:
>>
>>    unsigned char *table
>>
>>     = ggc_vec_alloc<unsigned char> (MAX_MACHINE_MODE);
>>
>>
>>
>>    for (int m = 0; m < MAX_MACHINE_MODE; m++)
>>
>> table[m] = m;                                                              <== May overflow here given MAX_MACHINE_MODE > 256 and table[m] is unsigned char.
>>
>>
>>
>> in bp_unpack_machine_mode:
>>
>>    unsigned ix = bp_unpack_enum (bp, machine_mode, last);
>>
>>   return (machine_mode) ib->file_data->mode_table[ix];  <== May return truncated mode here.
>>
>>
>>
>> To validate this idea, I tried below hack code for double checking and then there is no ICE anymore, which indicates
>>
>> the problem here as I bet. However, the lto is quite complicated and I am not sure how to fix it in the right way.
>>
>>
>>
>> +    = ggc_vec_alloc<unsigned char> (MAX_MACHINE_MODE * 2);
>>
>> …
>>
>> +    ((unsigned short *)table)[m] = m;
>>
>> …
>>
>> +   return (machine_mode) ((unsigned short *)ib->file_data->mode_table)[ix];
>>
>>
>>
>> Besides, I also tried to change the mode_table from char * to short * but got one weird error when building as below.
>>
>>
>>
>> gcc/lto-streamer.h:599: field `(*x).mode_table' is pointer to unimplemented type
>
> We still have some places using an array of char for the mode table.
> The above is assigned
> to lto_mode_identity_table which ends up in
> lto_file_decl_data::mode_table.  I think those
> need to be all promoted to unsigned short.
>
> Richard.
>
>>
>>
>> Pan
>>
>>
-----------------
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955
Robin Dapp Sept. 15, 2023, 1:33 p.m. UTC | #8
Hi Thomas, Jakub,

is there anything we can do to assist from the riscv side in order to help
with this?  I haven't really been involved with it but was wondering
what's missing.  If I understand correctly Thomas has a major cleanup
operation in plan but might not get to it soon.  The fix he proposed
helps for the riscv case, however, even without the rework?

If so, I'd kindly ping Jakub to check if the fix is reasonable.

Thank you.

Regards
 Robin
diff mbox series

Patch

From 0fd8f65bb87b11ef8ae366a797aec572d67b284f Mon Sep 17 00:00:00 2001
From: Thomas Schwinge <thomas@codesourcery.com>
Date: Fri, 30 Jun 2023 13:23:55 +0200
Subject: [PATCH] [WIP] Adjust LTO mode tables for "Machine_Mode: Extend
 machine_mode from 8 to 16 bits"

---
 gcc/lto-streamer-in.cc |  2 +-
 gcc/lto-streamer.h     | 56 +++++++++++++++++++++++++++++++++++++++++-
 gcc/lto/lto-common.cc  | 10 ++++----
 gcc/lto/lto-common.h   |  2 +-
 gcc/tree-streamer.h    |  2 +-
 5 files changed, 63 insertions(+), 9 deletions(-)

diff --git a/gcc/lto-streamer-in.cc b/gcc/lto-streamer-in.cc
index 1876e1967ec..bbd44504ff8 100644
--- a/gcc/lto-streamer-in.cc
+++ b/gcc/lto-streamer-in.cc
@@ -1997,7 +1997,7 @@  lto_input_mode_table (struct lto_file_decl_data *file_data)
   bitpack_d bp = streamer_read_bitpack (&ib);
 
   unsigned mode_bits = bp_unpack_value (&bp, 5);
-  unsigned char *table = ggc_cleared_vec_alloc<unsigned char> (1 << mode_bits);
+  machine_mode *table = ggc_cleared_vec_alloc<machine_mode> (1 << mode_bits);
 
   file_data->mode_table = table;
   file_data->mode_bits = mode_bits;
diff --git a/gcc/lto-streamer.h b/gcc/lto-streamer.h
index 0556b34c837..4d83741e4c6 100644
--- a/gcc/lto-streamer.h
+++ b/gcc/lto-streamer.h
@@ -596,7 +596,61 @@  struct GTY(()) lto_file_decl_data
   hash_map<tree, ld_plugin_symbol_resolution> * GTY((skip)) resolution_map;
 
   /* Mode translation table.  */
-  const unsigned char *mode_table;
+  /*TODO const
+With 'const', we get:
+
+    gtype-desc.cc: In function 'void gt_pch_nx_lto_file_decl_data(void*)':
+    gtype-desc.cc:6531:34: error: invalid conversion from 'const void*' to 'void*' [-fpermissive]
+             gt_pch_note_object ((*x).mode_table, x, gt_pch_p_18lto_file_decl_data);
+                                      ^
+    In file included from [...]/source-gcc/gcc/hash-table.h:247:0,
+                     from [...]/source-gcc/gcc/coretypes.h:486,
+                     from gtype-desc.cc:23:
+    [...]/source-gcc/gcc/ggc.h:47:12: note:   initializing argument 1 of 'int gt_pch_note_object(void*, void*, gt_note_pointers, size_t)'
+     extern int gt_pch_note_object (void *, void *, gt_note_pointers,
+                ^
+    make[2]: *** [Makefile:1180: gtype-desc.o] Error 1
+   */
+  machine_mode * GTY((atomic)) mode_table;
+  /*
+This (without 'const') changes '[build-gcc]/gcc/gtype-desc.cc' as follows:
+
+    @@ -2566,7 +2566,9 @@ gt_ggc_mx_lto_file_decl_data (void *x_p)
+           gt_ggc_m_17lto_in_decl_state ((*x).global_decl_state);
+           gt_ggc_m_29hash_table_decl_state_hasher_ ((*x).function_decl_states);
+           gt_ggc_m_18lto_file_decl_data ((*x).next);
+    -      gt_ggc_m_S ((*x).mode_table);
+    +      if ((*x).mode_table != NULL) {
+    +        ggc_mark ((*x).mode_table);
+    +      }
+         }
+     }
+     
+    @@ -6525,7 +6527,9 @@ gt_pch_nx_lto_file_decl_data (void *x_p)
+           gt_pch_n_17lto_in_decl_state ((*x).global_decl_state);
+           gt_pch_n_29hash_table_decl_state_hasher_ ((*x).function_decl_states);
+           gt_pch_n_18lto_file_decl_data ((*x).next);
+    -      gt_pch_n_S ((*x).mode_table);
+    +      if ((*x).mode_table != NULL) {
+    +        gt_pch_note_object ((*x).mode_table, x, gt_pch_p_18lto_file_decl_data);
+    +      }
+         }
+     }
+     
+    @@ -10929,8 +10933,10 @@ gt_pch_p_18lto_file_decl_data (ATTRIBUTE
+         op (&((*x).function_decl_states), NULL, cookie);
+       if ((void *)(x) == this_obj)
+         op (&((*x).next), NULL, cookie);
+    -  if ((void *)(x) == this_obj)
+    -    op (&((*x).mode_table), NULL, cookie);
+    +  if ((*x).mode_table != NULL) {
+    +    if ((void *)(x) == this_obj)
+    +      op (&((*x).mode_table), NULL, cookie);
+    +  }
+     }
+
+Given that the '[...]_S' routines are for strings (due to previous 'char *', I suppose), that's probably correct?
+   */
 
   /* Read LTO section.  */
   lto_section lto_section_header;
diff --git a/gcc/lto/lto-common.cc b/gcc/lto/lto-common.cc
index 973ab791712..f483f42997e 100644
--- a/gcc/lto/lto-common.cc
+++ b/gcc/lto/lto-common.cc
@@ -64,7 +64,7 @@  static bool type_streaming_finished = false;
 
 GTY(()) tree first_personality_decl;
 
-GTY(()) const unsigned char *lto_mode_identity_table;
+GTY(()) const machine_mode *lto_mode_identity_table;
 
 /* Returns a hash code for P.  */
 
@@ -2274,7 +2274,7 @@  lto_file_finalize (struct lto_file_decl_data *file_data, lto_file *file,
 #ifdef ACCEL_COMPILER
   lto_input_mode_table (file_data);
 #else
-  file_data->mode_table = lto_mode_identity_table;
+  file_data->mode_table = /*TODO const*/ (machine_mode *) lto_mode_identity_table;
   file_data->mode_bits = ceil_log2 (MAX_MACHINE_MODE);
 #endif
 
@@ -3116,10 +3116,10 @@  lto_fe_init (void)
   bitmap_obstack_initialize (NULL);
   gimple_register_cfg_hooks ();
 #ifndef ACCEL_COMPILER
-  unsigned char *table
-    = ggc_vec_alloc<unsigned char> (MAX_MACHINE_MODE);
+  machine_mode *table
+    = ggc_vec_alloc<machine_mode> (MAX_MACHINE_MODE);
   for (int m = 0; m < MAX_MACHINE_MODE; m++)
-    table[m] = m;
+    table[m] = (machine_mode) m;
   lto_mode_identity_table = table;
 #endif
 }
diff --git a/gcc/lto/lto-common.h b/gcc/lto/lto-common.h
index 24b2445673b..2b868085139 100644
--- a/gcc/lto/lto-common.h
+++ b/gcc/lto/lto-common.h
@@ -26,7 +26,7 @@  void print_lto_report_1 (void);
 
 extern tree lto_eh_personality_decl;
 extern GTY(()) vec<tree, va_gc> *tree_with_vars;
-extern const unsigned char *lto_mode_identity_table;
+extern const machine_mode *lto_mode_identity_table;
 extern tree first_personality_decl;
 
 #endif
diff --git a/gcc/tree-streamer.h b/gcc/tree-streamer.h
index ff49d1ba637..1e346b775e6 100644
--- a/gcc/tree-streamer.h
+++ b/gcc/tree-streamer.h
@@ -118,7 +118,7 @@  bp_unpack_machine_mode (struct bitpack_d *bp)
   lto_input_block *ib = (class lto_input_block *) bp->stream;
   int last = 1 << ib->file_data->mode_bits;
   unsigned ix = bp_unpack_enum (bp, machine_mode, last);
-  return (machine_mode) ib->file_data->mode_table[ix];
+  return ib->file_data->mode_table[ix];
 }
 
 #endif  /* GCC_TREE_STREAMER_H  */
-- 
2.34.1