Message ID | 359f2531-46d3-b2ce-666b-1267d28c1303@netcologne.de |
---|---|
State | New |
Headers | show |
Series | [fortran] Fix PR 82567 | expand |
On 10/17/2017 03:36 PM, Thomas Koenig wrote: > Hello world, > > this patch fixes a regression with long compile times, > which came about due to our handling of array constructors > at compile time. This, togeteher with a simplification in > front end optimization, led to long compile times and large > code. > > Regression-tested. OK for trunk and the other affected branches? > Well I know 42 is the answer to the ultimate question of the universe so this must be OK. I just don't know what the question is. OK and thanks, Jerry +#define CONSTR_LEN_MAX 42 +
On Tue, Oct 17, 2017 at 06:14:16PM -0700, Jerry DeLisle wrote: > On 10/17/2017 03:36 PM, Thomas Koenig wrote: > > Hello world, > > > > this patch fixes a regression with long compile times, > > which came about due to our handling of array constructors > > at compile time. This, togeteher with a simplification in > > front end optimization, led to long compile times and large > > code. > > > > Regression-tested. OK for trunk and the other affected branches? > > > > Well I know 42 is the answer to the ultimate question of the universe so this > must be OK. I just don't know what the question is. > > OK and thanks, > > Jerry > > +#define CONSTR_LEN_MAX 42 Actually, I was wondering about the choice myself. With most common hardware having fairly robust L1 and L2 cache sizes, a double precision array constructor with 42 elements only occupies 336 bytes. Seems small.
Le 18/10/2017 à 04:05, Steve Kargl a écrit : > On Tue, Oct 17, 2017 at 06:14:16PM -0700, Jerry DeLisle wrote: >> On 10/17/2017 03:36 PM, Thomas Koenig wrote: >>> Hello world, >>> >>> this patch fixes a regression with long compile times, >>> which came about due to our handling of array constructors >>> at compile time. This, togeteher with a simplification in >>> front end optimization, led to long compile times and large >>> code. >>> >>> Regression-tested. OK for trunk and the other affected branches? >>> >> >> Well I know 42 is the answer to the ultimate question of the universe so this >> must be OK. I just don't know what the question is. >> >> OK and thanks, >> >> Jerry >> >> +#define CONSTR_LEN_MAX 42 > > Actually, I was wondering about the choice myself. With > most common hardware having fairly robust L1 and L2 cache > sizes, a double precision array constructor with 42 > elements only occupies 336 bytes. Seems small. > There is a -fmax-array-constructor=n option. Can’t we use it for the limit?
Hi Jerry and Steve, >> Well I know 42 is the answer to the ultimate question of the universe so this >> must be OK. I just don't know what the question is. >> >> OK and thanks, >> >> Jerry >> >> +#define CONSTR_LEN_MAX 42 > Actually, I was wondering about the choice myself. With > most common hardware having fairly robust L1 and L2 cache > sizes, a double precision array constructor with 42 > elements only occupies 336 bytes. Seems small. Well, the answer is that I didn't know how to chose a reasonable constant. I now actually ran some benchmarks using rdtsc, and these seem to indicate that the optimum value for CONST_LEN_MAX is actually quite short, 3 or 4, otherwise I just got a slowdown or a break even. So, I committed (r253872) with a length of 4 as a limit. If anybody comes up with a better number, we can always change this. So, thanks for the review and the comments. Regards Thomas If somebody wants to check, here is the test case: main.f90: module tick interface function rdtsc() integer(kind=8) :: rdtsc end function rdtsc end interface end module tick program main use tick use tst implicit none integer(8) :: t1, t2 t1 = rdtsc() call sub1(2.0) t2 = rdtsc() ! print *,"sub1 : ", t2-t1 t1 = rdtsc() do i=1,10000 call sub1(2.0) end do t2 = rdtsc() print *,"sub1 : ", t2-t1 t1 = rdtsc() do i=1,10000 call sub2(2.0) end do t2 = rdtsc() print *,"sub2 : ", t2-t1 end program main tst.f90: module tst integer, parameter :: n=4 real, dimension(n) :: x real, dimension(n), parameter :: s = [(i,i=1,n)] contains subroutine sub1(a) real, intent(in) :: a x(1) = a * 1.0 x(2) = a * 2.0 x(3) = a * 3.0 x(4) = a * 3.0 end subroutine sub1 subroutine sub2(a) x(:) = a * s(:) end subroutine sub2 end module tst rdtsc.s: .file "rdtsc.s" .text .globl rdtsc_ .type rdtsc_, @function rdtsc_: .LFB0: .cfi_startproc rdtsc shl $32, %rdx or %rdx, %rax ret .cfi_endproc .LFE0: .size rdtsc_, .-rdtsc_ .section .note.GNU-stack,"",@progbits
Index: frontend-passes.c =================================================================== --- frontend-passes.c (Revision 253768) +++ frontend-passes.c (Arbeitskopie) @@ -1635,6 +1635,8 @@ combine_array_constructor (gfc_expr *e) gfc_constructor *c, *new_c; gfc_constructor_base oldbase, newbase; bool scalar_first; + int n_elem; + bool all_const; /* Array constructors have rank one. */ if (e->rank != 1) @@ -1674,12 +1676,38 @@ combine_array_constructor (gfc_expr *e) if (op2->ts.type == BT_CHARACTER) return false; - scalar = create_var (gfc_copy_expr (op2), "constr"); + /* This might be an expanded constructor with very many constant values. If + we perform the operation here, we might end up with a long compile time, + so an arbitrary length bound is in order here. If the constructor + constains something which is not a constant, it did not come from an + expansion, so leave it alone. */ +#define CONSTR_LEN_MAX 42 + oldbase = op1->value.constructor; + + n_elem = 0; + all_const = true; + for (c = gfc_constructor_first (oldbase); c; c = gfc_constructor_next(c)) + { + if (c->expr->expr_type != EXPR_CONSTANT) + { + all_const = false; + break; + } + n_elem += 1; + } + + if (all_const && n_elem > CONSTR_LEN_MAX) + return false; + +#undef CONSTR_LEN_MAX + newbase = NULL; e->expr_type = EXPR_ARRAY; + scalar = create_var (gfc_copy_expr (op2), "constr"); + for (c = gfc_constructor_first (oldbase); c; c = gfc_constructor_next (c)) {