mbox series

[RFC,00/10] Improve ARM target's support for LLVM toolchain

Message ID 20230520205547.1009254-1-CFSworks@gmail.com
Headers show
Series Improve ARM target's support for LLVM toolchain | expand

Message

Sam Edwards May 20, 2023, 8:55 p.m. UTC
Here is a series of patches aimed at improving support for the LLVM
toolchain (clang, lld, and to a lesser extent, llvm-objcopy) when
targeting ARM. This toolchain is a cross-compiler "by default" -- a user
generally should not need to install anything more to target their board
of choice if they already have Clang installed. Additionally, Clang has
a few nice diagnostics that should help appreciably with code quality,
if Clang starts to be used regularly by a few more U-Boot developers. ;)

Most of these patches are trivial and as such they should be pretty easy
to review, but the later patches in the patchset start making some
pretty big changes to the linker scripts. There are no behavioral
changes with those (U-Boot should still function the same) but there is
always the risk of compatibility with some third-party tool or loader
being broken. Fortunately, I foresee any problems making themselves
immediately apparent upon testing.

I have tested booting on sunxi/T113, and building for Raspberry Pi
(32-bit and 64-bit). The remaining testing efforts should be focused on:
- Exynos
- EFI loader (esp. on Lenovo X13s, which Heinrich Schuchardt had
    mentioned in a previous commit as being fickle)
- Zynq
- Rockchip TPL
- AArch64 SPL

I'm submitting this as an RFC since this doesn't completely guarantee
LLVM toolchain compatibility yet; my focus here is mostly on ensuring
that I haven't caused any regressions in GNU-land. Also, I haven't
discussed most of these changes before doing them. Perhaps alternate
approaches to some of these things can be proposed - I'm all ears.

Outstanding problems are:
- LLD sometimes puts .hash outside of the image binary area, though this
  is flagged by binary_size_check
- The main makefile uses --gap-fill=0xff, which is not supported in
  llvm-objcopy
- llvm-objcopy also doesn't appear to speak S-Record; the u-boot.srec
  target has to be deleted manually
- llvm-objcopy gets upset at some of the EFI code, since the EFI linker
  scripts preserve dynamic sections that llvm-objcopy doesn't want to
  strip off

Cheers,
Sam


Sam Edwards (10):
  makefile: Fix symbol typo in binary_size_check
  arm: set alignment properly for asm funcs
  arm: exclude eabi_compat from LTO
  arm: add __aeabi_memclr in eabi_compat
  arm: add aligned-memory aliases to eabi_compat
  arm: discard .gnu.version* sections
  arm: efi_loader: discard hash, unwind information
  arm: efi_loader: move .dynamic out of .text in EFI
  arm: discard all .dyn* sections
  arm: migrate away from sections.c

 Makefile                                 |   2 +-
 arch/arm/cpu/armv8/spl_data.c            |   4 +-
 arch/arm/cpu/armv8/u-boot-spl.lds        |  26 ++----
 arch/arm/cpu/armv8/u-boot.lds            |  48 +++--------
 arch/arm/cpu/u-boot.lds                  | 103 +++++++----------------
 arch/arm/include/asm/linkage.h           |   4 +-
 arch/arm/lib/Makefile                    |   3 +-
 arch/arm/lib/eabi_compat.c               |  17 ++++
 arch/arm/lib/elf_aarch64_efi.lds         |   3 +-
 arch/arm/lib/elf_arm_efi.lds             |   3 +
 arch/arm/lib/sections.c                  |  36 --------
 arch/arm/mach-rockchip/u-boot-tpl-v8.lds |  26 ++----
 arch/arm/mach-zynq/u-boot.lds            |  73 +++-------------
 13 files changed, 96 insertions(+), 252 deletions(-)
 delete mode 100644 arch/arm/lib/sections.c

Comments

Sam Edwards May 21, 2023, 4:59 a.m. UTC | #1
On 5/20/23 22:26, Heinrich Schuchardt wrote:
> Hello Sam,

Hi Heinrich! Good to hear from you.

> I guess the documentation and the CI testing would also have to be adjusted.

Ah, yeah, those are going to be big things for me to look at when this 
series starts to mature out of the RFC phase. CI is definitely important 
so that the hard-won compatibility doesn't just decay away. :)

> What about non-ARM architectures?

If there's a groundswell of demand for building U-Boot on LLVM, I'd be 
willing to collaborate with others on getting the other architectures up 
to parity with GNU. But since the linker scripts, relocation thunks, 
sections, and whatnot are all arch-specific, I'm only focusing on ARM 
for now (which is both the arch I need and one of the more common ones).

Is there a particular arch you'd like to see next? It seems everything 
U-Boot supports is supported by LLVM, except for Microblaze, NIOS2, and SH.

> Could you, please, describe how built with lld so that reviewers can test it.

I've been building with:

nice make CC='clang --target=armv7l-none-eabi' \
   LTO_FINAL_LDFLAGS=-fuse-ld=lld LD=ld.lld OBJCOPY=llvm-objcopy

...though mostly at this stage I'm just hoping for folks to confirm that 
this patchset causes no regressions in their existing GNU environments. 
(Feedback from LLVM-land would be appreciated nonetheless, though!!!)

> I find reviewing hard when receiving only 3 out of 10 patches.

Oof sorry about that. I haven't actually joined the U-Boot mailing list 
yet so the other patches are probably caught in the moderation queue. I 
guess wait for moderation approval and see if the other 7 arrive then -- 
I can resend the whole series to you specifically if not.

Cheers,
Sam
Ilias Apalodimas May 22, 2023, 6:52 a.m. UTC | #2
Hi Sam,

On Sun, 21 May 2023 at 07:59, Sam Edwards <cfsworks@gmail.com> wrote:
>
> On 5/20/23 22:26, Heinrich Schuchardt wrote:
> > Hello Sam,
>
> Hi Heinrich! Good to hear from you.
>
> > I guess the documentation and the CI testing would also have to be adjusted.
>
> Ah, yeah, those are going to be big things for me to look at when this
> series starts to mature out of the RFC phase. CI is definitely important
> so that the hard-won compatibility doesn't just decay away. :)
>
> > What about non-ARM architectures?
>
> If there's a groundswell of demand for building U-Boot on LLVM, I'd be
> willing to collaborate with others on getting the other architectures up
> to parity with GNU. But since the linker scripts, relocation thunks,
> sections, and whatnot are all arch-specific, I'm only focusing on ARM
> for now (which is both the arch I need and one of the more common ones).

I can help clean up the arm architecture even further.  I was toying
with the idea of having page-aligned sections and eventually map
u-boot with proper permissions per section.  Right now (at least for
the majority of arm platforms) we are doing RWX for all the memory,
apart from devices that are mapped as RW. I do have an awfully hacky
PoC around, but the linker script cleanup is more than welcome.

>
> Is there a particular arch you'd like to see next? It seems everything
> U-Boot supports is supported by LLVM, except for Microblaze, NIOS2, and SH.
>
> > Could you, please, describe how built with lld so that reviewers can test it.
>
> I've been building with:
>
> nice make CC='clang --target=armv7l-none-eabi' \
>    LTO_FINAL_LDFLAGS=-fuse-ld=lld LD=ld.lld OBJCOPY=llvm-objcopy
>
> ...though mostly at this stage I'm just hoping for folks to confirm that
> this patchset causes no regressions in their existing GNU environments.
> (Feedback from LLVM-land would be appreciated nonetheless, though!!!)
>
> > I find reviewing hard when receiving only 3 out of 10 patches.
>
> Oof sorry about that. I haven't actually joined the U-Boot mailing list
> yet so the other patches are probably caught in the moderation queue. I
> guess wait for moderation approval and see if the other 7 arrive then --
> I can resend the whole series to you specifically if not.

It's probably not a mailing list issue.  I only got the efi related
patches on my mailbox.  The recipients were generated with
get_maintainers.pl?  Heinirch and I only received the efi* portions as
we maintain that subsystem

Thanks!
/Ilias
>
> Cheers,
> Sam
Michal Simek May 22, 2023, 10:39 a.m. UTC | #3
Hi

On 5/21/23 06:59, Sam Edwards wrote:
> On 5/20/23 22:26, Heinrich Schuchardt wrote:
>> Hello Sam,
> 
> Hi Heinrich! Good to hear from you.
> 
>> I guess the documentation and the CI testing would also have to be adjusted.
> 
> Ah, yeah, those are going to be big things for me to look at when this series 
> starts to mature out of the RFC phase. CI is definitely important so that the 
> hard-won compatibility doesn't just decay away. :)
> 
>> What about non-ARM architectures?
> 
> If there's a groundswell of demand for building U-Boot on LLVM, I'd be willing 
> to collaborate with others on getting the other architectures up to parity with 
> GNU. But since the linker scripts, relocation thunks, sections, and whatnot are 
> all arch-specific, I'm only focusing on ARM for now (which is both the arch I 
> need and one of the more common ones).
> 
> Is there a particular arch you'd like to see next? It seems everything U-Boot 
> supports is supported by LLVM, except for Microblaze, NIOS2, and SH.
> 
>> Could you, please, describe how built with lld so that reviewers can test it.
> 
> I've been building with:
> 
> nice make CC='clang --target=armv7l-none-eabi' \
>    LTO_FINAL_LDFLAGS=-fuse-ld=lld LD=ld.lld OBJCOPY=llvm-objcopy
> 
> ...though mostly at this stage I'm just hoping for folks to confirm that this 
> patchset causes no regressions in their existing GNU environments. (Feedback 
> from LLVM-land would be appreciated nonetheless, though!!!)

Dockerfile in repo as I see is using 3 toolchain categories.
1. llvm deb repo
2. kernel.org
3. others - xtensa/arc

For CI loop you should pretty much provide a way how to get toolchain.
That's why would be good to figure it out and then I am happy to take a look at 
changed you have done for Zynq.
Definitely nice to see this happening and I expect more warnings will be visible 
  and they should be fixed.

Thanks,
Michal
Tom Rini May 22, 2023, 3:30 p.m. UTC | #4
On Mon, May 22, 2023 at 12:39:04PM +0200, Michal Simek wrote:
> Hi
> 
> On 5/21/23 06:59, Sam Edwards wrote:
> > On 5/20/23 22:26, Heinrich Schuchardt wrote:
> > > Hello Sam,
> > 
> > Hi Heinrich! Good to hear from you.
> > 
> > > I guess the documentation and the CI testing would also have to be adjusted.
> > 
> > Ah, yeah, those are going to be big things for me to look at when this
> > series starts to mature out of the RFC phase. CI is definitely important
> > so that the hard-won compatibility doesn't just decay away. :)
> > 
> > > What about non-ARM architectures?
> > 
> > If there's a groundswell of demand for building U-Boot on LLVM, I'd be
> > willing to collaborate with others on getting the other architectures up
> > to parity with GNU. But since the linker scripts, relocation thunks,
> > sections, and whatnot are all arch-specific, I'm only focusing on ARM
> > for now (which is both the arch I need and one of the more common ones).
> > 
> > Is there a particular arch you'd like to see next? It seems everything
> > U-Boot supports is supported by LLVM, except for Microblaze, NIOS2, and
> > SH.
> > 
> > > Could you, please, describe how built with lld so that reviewers can test it.
> > 
> > I've been building with:
> > 
> > nice make CC='clang --target=armv7l-none-eabi' \
> >    LTO_FINAL_LDFLAGS=-fuse-ld=lld LD=ld.lld OBJCOPY=llvm-objcopy
> > 
> > ...though mostly at this stage I'm just hoping for folks to confirm that
> > this patchset causes no regressions in their existing GNU environments.
> > (Feedback from LLVM-land would be appreciated nonetheless, though!!!)
> 
> Dockerfile in repo as I see is using 3 toolchain categories.
> 1. llvm deb repo
> 2. kernel.org
> 3. others - xtensa/arc
> 
> For CI loop you should pretty much provide a way how to get toolchain.
> That's why would be good to figure it out and then I am happy to take a look
> at changed you have done for Zynq.
> Definitely nice to see this happening and I expect more warnings will be
> visible  and they should be fixed.

So, we can trivially add lld to the Dockerfile, it's just listing lld-16
in the install list.  I think objcopy is a bit of a stretch at this
point and it's not clear from the above if you're also making use of the
assembler.  We might also want to look at backporting
scripts/Makefile.clang from the current kernel build system and then
adapting the "guess the --target argument" logic based on CONFIG_$ARCH
rather than ARCH= (which we don't use).  That would also solve the LTO
problem as that's a result of us missing some flags that the kernel has
as LLVM+LTO (logically) requires LLVM LD not GNU LD.

At that point, and once the EFI guid_t warning is resolved to everyones
satisfaction we can put qemu_arm* + clang in the CI loop, to catch new
warnings there.  I've already got clang + Pi in my CI loop, but that
doesn't fail on warnings.
Sam Edwards May 22, 2023, 7:37 p.m. UTC | #5
Hi Ilias,

On 5/22/23 00:52, Ilias Apalodimas wrote:
> I can help clean up the arm architecture even further.  I was toying
> with the idea of having page-aligned sections and eventually map
> u-boot with proper permissions per section.  Right now (at least for
> the majority of arm platforms) we are doing RWX for all the memory,
> apart from devices that are mapped as RW. I do have an awfully hacky
> PoC around, but the linker script cleanup is more than welcome.

Glad to hear it (and excited by the idea of proper W^X)! The linker 
script cleanup (i.e. deleting those pesky `sections.c` files and going 
back to linker-assigned symbols) can really happen whenever; it won't 
cause a problem on any version of GNU ld from <7 years ago. Perhaps a 
series of patches (one per arch) doing that should be landed first?

> It's probably not a mailing list issue.  I only got the efi related
> patches on my mailbox.  The recipients were generated with
> get_maintainers.pl?  Heinirch and I only received the efi* portions as
> we maintain that subsystem

Well, it's true that you and Heinrich weren't Cc: on every email in the 
series. I just went with patman's default behavior.

But every patch was sent To: the u-boot list, and I do see the whole 
series showing up on the archive. Did you not even receive the other 
patches in the series via the list?

Cheers,
Sam
Sam Edwards May 22, 2023, 7:59 p.m. UTC | #6
Hi Tom,

On 5/22/23 09:30, Tom Rini wrote:
> I think objcopy is a bit of a stretch at this
> point and it's not clear from the above if you're also making use of the
> assembler.

I agree, since getting llvm-objcopy to play nice with this currently 
requires that I make a handful of small hack edits to the makefiles. It 
does offer a great advantage over GNU's objcopy in that it doesn't balk 
at ELFs from a foreign arch, but I'm only supporting it 
"opportunistically" right now.

I do make use of LLVM's assembler, but LLVM bundles its assembler inside 
the Clang binary (`clang -cc1as` and/or just pass .S files to Clang) 
rather than installing a separate program. This is not to be confused 
with `llvm-as` which is for bitcode/IR manipulation only. But in 
general, if you're using Clang, you're also using the LLVM assembler.

> We might also want to look at backporting
> scripts/Makefile.clang from the current kernel build system and then
> adapting the "guess the --target argument" logic based on CONFIG_$ARCH
> rather than ARCH= (which we don't use).  That would also solve the LTO
> problem as that's a result of us missing some flags that the kernel has
> as LLVM+LTO (logically) requires LLVM LD not GNU LD.

Having something like Linux's `LLVM=1` to enable LLVM would be ideal, I 
think. I probably won't be doing that backporting in this patch series 
since my goal for now is just to fill in some of the pitfalls for people 
like me who are too stubborn to install a GNU toolchain.

> At that point, and once the EFI guid_t warning is resolved to everyones
> satisfaction we can put qemu_arm* + clang in the CI loop, to catch new
> warnings there.  I've already got clang + Pi in my CI loop, but that
> doesn't fail on warnings.

Do you mean, on the current master branch, and only Clang (no LLD)? I'm 
in favor, but since a few of the patches in this series (#3-5) are to 
support some of the libcalls that LLVM's codegen likes to emit, I'd be 
surprised if that worked on all targets. Feel free to pull whatever 
necessary patches of mine here into your own series though. :)

Cheers,
Sam
Heinrich Schuchardt May 22, 2023, 8:15 p.m. UTC | #7
Am 22. Mai 2023 21:37:26 MESZ schrieb Sam Edwards <cfsworks@gmail.com>:
>Hi Ilias,
>
>On 5/22/23 00:52, Ilias Apalodimas wrote:
>> I can help clean up the arm architecture even further.  I was toying
>> with the idea of having page-aligned sections and eventually map
>> u-boot with proper permissions per section.  Right now (at least for
>> the majority of arm platforms) we are doing RWX for all the memory,
>> apart from devices that are mapped as RW. I do have an awfully hacky
>> PoC around, but the linker script cleanup is more than welcome.
>
>Glad to hear it (and excited by the idea of proper W^X)! The linker script cleanup (i.e. deleting those pesky `sections.c` files and going back to linker-assigned symbols) can really happen whenever; it won't cause a problem on any version of GNU ld from <7 years ago. Perhaps a series of patches (one per arch) doing that should be landed first?
>
>> It's probably not a mailing list issue.  I only got the efi related
>> patches on my mailbox.  The recipients were generated with
>> get_maintainers.pl?  Heinirch and I only received the efi* portions as
>> we maintain that subsystem
>
>Well, it's true that you and Heinrich weren't Cc: on every email in the series. I just went with patman's default behavior.
>
>But every patch was sent To: the u-boot list, and I do see the whole series showing up on the archive. Did you not even receive the other patches in the series via the list?

The series can be retrieved from patchwork. I personally dislike patman for this eclectic behavior. Git send-email is doing the expected.

Best regards

Heinrich

>
>Cheers,
>Sam
Tom Rini May 22, 2023, 9:13 p.m. UTC | #8
On Mon, May 22, 2023 at 01:59:33PM -0600, Sam Edwards wrote:
> Hi Tom,
> 
> On 5/22/23 09:30, Tom Rini wrote:
> > I think objcopy is a bit of a stretch at this
> > point and it's not clear from the above if you're also making use of the
> > assembler.
> 
> I agree, since getting llvm-objcopy to play nice with this currently
> requires that I make a handful of small hack edits to the makefiles. It does
> offer a great advantage over GNU's objcopy in that it doesn't balk at ELFs
> from a foreign arch, but I'm only supporting it "opportunistically" right
> now.
> 
> I do make use of LLVM's assembler, but LLVM bundles its assembler inside the
> Clang binary (`clang -cc1as` and/or just pass .S files to Clang) rather than
> installing a separate program. This is not to be confused with `llvm-as`
> which is for bitcode/IR manipulation only. But in general, if you're using
> Clang, you're also using the LLVM assembler.

Ah, ok.

> > We might also want to look at backporting
> > scripts/Makefile.clang from the current kernel build system and then
> > adapting the "guess the --target argument" logic based on CONFIG_$ARCH
> > rather than ARCH= (which we don't use).  That would also solve the LTO
> > problem as that's a result of us missing some flags that the kernel has
> > as LLVM+LTO (logically) requires LLVM LD not GNU LD.
> 
> Having something like Linux's `LLVM=1` to enable LLVM would be ideal, I
> think. I probably won't be doing that backporting in this patch series since
> my goal for now is just to fill in some of the pitfalls for people like me
> who are too stubborn to install a GNU toolchain.

It honestly shouldn't be too much work to just backport that file (and
move/remove some of the logic we have scattered elsewhere instead).

> > At that point, and once the EFI guid_t warning is resolved to everyones
> > satisfaction we can put qemu_arm* + clang in the CI loop, to catch new
> > warnings there.  I've already got clang + Pi in my CI loop, but that
> > doesn't fail on warnings.
> 
> Do you mean, on the current master branch, and only Clang (no LLD)? I'm in
> favor, but since a few of the patches in this series (#3-5) are to support
> some of the libcalls that LLVM's codegen likes to emit, I'd be surprised if
> that worked on all targets. Feel free to pull whatever necessary patches of
> mine here into your own series though. :)

The reason it's not in CI right now is that there's compiler warnings,
due to the EFI guid_t alignment thing that's being discussed in another
thread.  I am running (and so booting and running pytest) in my CI loop
clang-built U-Boot for Pi 3 (32bit and 64bit) and a few other targets
too.  But there's also certainly room to improve and I think some of the
issues you've addressed here are why other targets for example are too
large to boot.
Ilias Apalodimas May 23, 2023, 6:54 a.m. UTC | #9
On Mon, May 22, 2023 at 01:37:26PM -0600, Sam Edwards wrote:
> Hi Ilias,
>
> On 5/22/23 00:52, Ilias Apalodimas wrote:
> > I can help clean up the arm architecture even further.  I was toying
> > with the idea of having page-aligned sections and eventually map
> > u-boot with proper permissions per section.  Right now (at least for
> > the majority of arm platforms) we are doing RWX for all the memory,
> > apart from devices that are mapped as RW. I do have an awfully hacky
> > PoC around, but the linker script cleanup is more than welcome.
>
> Glad to hear it (and excited by the idea of proper W^X)!

Yes that's my end goal here.  Looking around the code we have in U-Boot
regarding the mmu configuration, it's a lot easier (and cleaner) to fix
the linker script, add symbols for ro_start, rw_start etc and layout the
binary in a way we can easily map it, instead of leaving it as is and try
to fix the mapping in c code.

> The linker script
> cleanup (i.e. deleting those pesky `sections.c` files and going back to
> linker-assigned symbols) can really happen whenever; it won't cause a
> problem on any version of GNU ld from <7 years ago. Perhaps a series of
> patches (one per arch) doing that should be landed first?

Yes probably, because I specifically remember digging through the history
of why the sections were defined like that in the first place.

>
> > It's probably not a mailing list issue.  I only got the efi related
> > patches on my mailbox.  The recipients were generated with
> > get_maintainers.pl?  Heinirch and I only received the efi* portions as
> > we maintain that subsystem
>
> Well, it's true that you and Heinrich weren't Cc: on every email in the
> series. I just went with patman's default behavior.
>
> But every patch was sent To: the u-boot list, and I do see the whole series
> showing up on the archive. Did you not even receive the other patches in the
> series via the list?
>
> Cheers,
> Sam

Cheers
/Ilias