Message ID | 531ee7a8-0446-8f7b-7504-928426a257c5@codesourcery.com |
---|---|
State | New |
Headers | show |
Series | [wwwdocs] gcc-12/changes.html (GCN): >1 workers per gang | expand |
Hi! On 2021-08-09T15:55:07+0200, Tobias Burnus <tobias@codesourcery.com> wrote: > Now that the GCN/OpenACC patches for this have been committed today, > I think it makes sense to add it to the documentation. Thanks for thinking of this. > (I was told that some follow-up items are still pending, but as > the feature does work ...) ACK. > Cf. also Andrew's talk of last year, > https://linuxplumbersconf.org/event/7/contributions/749/attachments/560/988/AMD_GCN_Update_-_LPC_2020.pdf > which I utilized when writing the attached wwwdocs patch. > > Comments and/or suggestions? > gcc-12/changes.html (GCN): >1 workers per gang ACK for that aspect specifically, but: > --- a/htdocs/gcc-12/changes.html > +++ b/htdocs/gcc-12/changes.html > + <li>When used as OpenACC device: the limitation of 1 worker per gang, 2 gangs > + per CU has been lifted; now up to 16 workers per gang and 40 gangs per CU > + are supported. (Except that the hardware limit of 40 workers total may > + not be exceeded.)</li> I haven't changed anything related to a "limitation of [...] 2 gangs per CU has been lifted". Maybe that has already been done earlier, maybe that still has to be done? I don't know -- Julian? Grüße Thomas ----------------- Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955
On Mon, 9 Aug 2021, Tobias Burnus wrote:
> Comments and/or suggestions?
Looks good from my perspective, with the feedback that Thomas
provided.
(Is "CU" a sufficiently established term, or might it make sense
to spell it out?)
Thanks,
Gerald
Hi all, On 09.08.21 20:53, Gerald Pfeifer wrote: > (Is "CU" a sufficiently established term, or might it make sense > to spell it out?) I don't know – but we could use "per compute unit (CU)". On 09.08.21 16:27, Thomas Schwinge wrote: > On 2021-08-09T15:55:07+0200, Tobias Burnus<tobias@codesourcery.com> wrote: >> +++ b/htdocs/gcc-12/changes.html >> + <li>When used as OpenACC device: the limitation of 1 worker per gang, 2 gangs >> + per CU has been lifted; now up to 16 workers per gang and 40 gangs per CU >> + are supported. (Except that the hardware limit of 40 workers total may >> + not be exceeded.)</li> > I haven't changed anything related to a "limitation of [...] 2 gangs per > CU has been lifted". Maybe that has already been done earlier, maybe > that still has to be done? I don't know -- Julian? Looking at the current code, it has: if (dims[0] == 0) dims[0] = get_cu_count (kernel->agent); /* Gangs. */ Thus at least when nothing else has been specified, it uses #CUs of gangs, running on #CUs CUs, i.e. 1 gang per CU. [OG11 – but not mainline] What's needed is something like: dims[0] = get_cu_count (kernel->agent) * (32 / dims[1]); which I see in OG11 – oddly, I also see there code like: def->gdims[0] = get_cu_count (agent); // * (40 / gcn_threads); In other words: For gangs > #CUs or >1 gang per CU, the following patch is needed: [OG11] https://gcc.gnu.org/g:4dcd1e1f4e6b451aac44f919b8eb3ac49292b308 [email] https://gcc.gnu.org/pipermail/gcc-patches/2020-July/550102.html "not suitable for mainline until the multiple-worker support is merged there" @Andrew + @Julian: Do you intent to commit it relatively soon? Regarding the wwwdocs patch, I can hold off until that commit or reword it to only cover the workers part. Thanks Thomas & Gerald for the comments! Tobias ----------------- Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955
> In other words: For gangs > #CUs or >1 gang per CU, the following patch > is needed: > [OG11] https://gcc.gnu.org/g:4dcd1e1f4e6b451aac44f919b8eb3ac49292b308 > [email] https://gcc.gnu.org/pipermail/gcc-patches/2020-July/550102.html > "not suitable for mainline until the multiple-worker support is merged > there" > > @Andrew + @Julian: Do you intent to commit it relatively soon? > Regarding the wwwdocs patch, I can hold off until that commit or reword > it to only cover the workers part. Were these not part of the patch set Thomas was working on? Andrew
Hi! On 2021-08-16T10:34:34+0200, "Stubbs, Andrew" <Andrew_Stubbs@mentor.com> wrote: >> In other words: For gangs > #CUs or >1 gang per CU, the following patch >> is needed: >> [OG11] https://gcc.gnu.org/g:4dcd1e1f4e6b451aac44f919b8eb3ac49292b308 >> [email] https://gcc.gnu.org/pipermail/gcc-patches/2020-July/550102.html >> "not suitable for mainline until the multiple-worker support is merged >> there" >> >> @Andrew + @Julian: Do you intent to commit it relatively soon? >> Regarding the wwwdocs patch, I can hold off until that commit or reword >> it to only cover the workers part. > > Were these not part of the patch set Thomas was working on? That one is "amdgcn: Tune default OpenMP/OpenACC GPU utilization", which indeed I'm planning to handle in the next few weeks. So I suggest we simply hold back Tobias' 'gcc-12/changes.html' patch until that time. Grüße Thomas ----------------- Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955
On 09.08.21 15:55, Tobias Burnus wrote: > Now that the GCN/OpenACC patches for this have been committed today, > I think it makes sense to add it to the documentation. > (I was told that some follow-up items are still pending, but as > the feature does work ...) I think the follow-up patches have now been committed. How about the attached patch? Tobias ----------------- Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955
On 02/02/2022 15:39, Tobias Burnus wrote: > On 09.08.21 15:55, Tobias Burnus wrote: >> Now that the GCN/OpenACC patches for this have been committed today, >> I think it makes sense to add it to the documentation. >> (I was told that some follow-up items are still pending, but as >> the feature does work ...) > > I think the follow-up patches have now been committed. > How about the attached patch? We should probably add a qualification "(up to a limit of 40 wavefronts in total, per CU)" or else were suggesting that there can be 40 workgroups of 16 wavefronts, which the hardware will not do. Andrew
Now committed, taking Andrew's comments into account: https://gcc.gnu.org/gcc-12/changes.html#amdgcn Tobias ----------------- Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955
gcc-12/changes.html (GCN): >1 workers per gang diff --git a/htdocs/gcc-12/changes.html b/htdocs/gcc-12/changes.html index 9c2799cf..05e24737 100644 --- a/htdocs/gcc-12/changes.html +++ b/htdocs/gcc-12/changes.html @@ -119,10 +119,14 @@ <h3 id="amdgcn">AMD Radeon (GCN)</h3> <ul> <li>Debug experience with ROCGDB has been improved.</li> <li>Support for the type <code>__int128_t</code>/<code>integer(kind=16)</code> was added.</li> + <li>When used as OpenACC device: the limitation of 1 worker per gang, 2 gangs + per CU has been lifted; now up to 16 workers per gang and 40 gangs per CU + are supported. (Except that the hardware limit of 40 workers total may + not be exceeded.)</li> </ul> <!-- <h3 id="arc">ARC</h3> --> <!-- <h3 id="arm">ARM</h3> -->