[v2,00/25] Apple M1 SoC platform bring-up

Message ID	20210215121713.57687-1-marcan@marcan.st
Headers	show Return-Path: <devicetree-owner@vger.kernel.org> sender: hector@marcansoft.com) by mail.marcansoft.com (Postfix) with ESMTPSA id A15B741E96; Mon, 15 Feb 2021 12:17:33 +0000 (UTC) From: Hector Martin <marcan@marcan.st> To: linux-arm-kernel@lists.infradead.org Cc: Hector Martin <marcan@marcan.st>, Marc Zyngier <maz@kernel.org>, Rob Herring <robh@kernel.org>, Arnd Bergmann <arnd@kernel.org>, Olof Johansson <olof@lixom.net>, Krzysztof Kozlowski <krzk@kernel.org>, Mark Kettenis <mark.kettenis@xs4all.nl>, Tony Lindgren <tony@atomide.com>, Mohamed Mediouni <mohamed.mediouni@caramail.com>, Stan Skowronek <stan@corellium.com>, Alexander Graf <graf@amazon.com>, Will Deacon <will@kernel.org>, Linus Walleij <linus.walleij@linaro.org>, Mark Rutland <mark.rutland@arm.com>, devicetree@vger.kernel.org, linux-kernel@vger.kernel.org Subject: [PATCH v2 00/25] Apple M1 SoC platform bring-up Date: Mon, 15 Feb 2021 21:16:48 +0900 Message-Id: <20210215121713.57687-1-marcan@marcan.st> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Precedence: bulk
Series	Apple M1 SoC platform bring-up \| expand [v2,00/25] Apple M1 SoC platform bring-up [v2,01/25] dt-bindings: vendor-prefixes: Add apple prefix [v2,02/25] dt-bindings: arm: apple: Add bindings for Apple ARM platforms [v2,03/25] dt-bindings: arm: cpus: Add apple,firestorm & icestorm compatibles [v2,05/25] dt-bindings: timer: arm,arch_timer: Add interrupt-names support [v2,12/25] of/address: Add infrastructure to declare MMIO as non-posted [v2,14/25] dt-bindings: interrupt-controller: Add DT bindings for apple-aic [v2,21/25] dt-bindings: serial: samsung: Add apple,s5l-uart compatible [v2,24/25] dt-bindings: display: Add apple,simple-framebuffer

Hector Martin Feb. 15, 2021, 12:16 p.m. UTC

This series brings up initial support for the Apple M1 SoC, used in the
2020 Mac Mini, MacBook Pro, and MacBook Air models.

The following features are supported in this initial port:

- UART (samsung-style) with earlycon support
- Interrupts, including affinity and IPIs (Apple Interrupt Controller)
- SMP (through standard spin-table support)
- simplefb-based framebuffer
- Devicetree for the Mac Mini (should work for the others too at this
  stage)

See below for an overview of changes since v1.

== Patch overview ==

- 01-02 Core platform DT bindings
- 03-04 CPU DT bindings and MIDR defines
- 05-06 Add interrupt-names support to the ARM timer driver
        This is used to cleanly express the lack of a secure timer;
        platforms in the past have used various hacks like dummy
        IRQs here.
- 07-09 FIQ support
        These platforms require FIQ support, as some interrupt sources
        are hard-wired to FIQ. We settled on always keeping the FIQ/IRQ
        mask state synced, and only redirecting the vectors using
        alternatives on these CPUs. Other platforms retain the existing
        panic-on-FIQ behavior.
- 10-12 ioremap_np() (nGnRnE) support
        The fabric in these SoCs only supports nGnRnE accesses for
        standard MMIO, except for PCI ranges which use nGnRE. Linux
        currently defaults to the latter on ARM64, so this adds a new
        ioremap type and DT properties to automatically select it for
        drivers using OF and devm abstractions, under buses specified
        in DT.
- 13-15 AIC (Apple Interrupt Controller) driver and support defines
        This also embeds FIQ handling for this platform.
- 16    Introduce CONFIG_ARCH_APPLE & add it to defconfig
- 17-23 Add Apple SoC support to the samsung_tty driver
        This includes several refactoring patches to be able to more
        cleanly introduce this, as well as a switch to
        devm_ioremap_resource to be able to use the nGnRnE support
        introduced above. Earlycon support is included with a
        special-case nGnRnE hack, as earlycon is too early to use the
        generic infrastructure.
- 24    simple-framebuffer bindings for Apple (trivial)
- 25    Add the initial M1 Mac Mini (j274) devicetree

== About the hardware ==

These machines officially support booting unsigned/user-provided
XNU-like kernels, with a very different boot protocol and devicetree
format. We are developing an initial bootloader, m1n1 [1], to take care
of as many hardware peculiarities as possible and present a standard
Linux arm64 boot protocol and device tree. In the future, I expect that
production setups will add U-Boot and perhaps GRUB into the boot chain,
to make the boot process similar to other ARM64 platforms.

The machines expose their debug UART over USB Type C, triggered with
vendor-specific USB-PD commands. Currently, the easiest way to get a
serial console on these machines is to use a second M1 box and a simple
USB C cable [2]. You can also build a DIY interface using an Arduino, a
FUSB302 chip or board, and a 1.2V UART-TTL adapter [3]. In the coming
weeks we will be designing an open hardware project to provide
serial/debug connectivity to these machines (and, hopefully, also
support other UART-over-Type C setups from other vendors). Please
contact me privately if you are interested in getting an early prototype
version of one of these devices.

A quickstart guide to booting Linux kernels on these machines is
available at [4], and we are documenting the hardware at [5].

[1] https://github.com/AsahiLinux/m1n1/
[2] https://github.com/AsahiLinux/macvdmtool/
[3] https://github.com/AsahiLinux/vdmtool/
[4] https://github.com/AsahiLinux/docs/wiki/Developer-Quickstart
[5] https://github.com/AsahiLinux/docs/wiki

== Project Blurb ==

Asahi Linux is an open community project dedicated to developing and
maintaining mainline support for Apple Silicon on Linux. Feel free to
drop by #asahi and #asahi-dev on freenode to chat with us, or check
our website for more information on the project:

https://asahilinux.org/

== Changes since v1 ==

* DT vendor prefix AAPL -> apple
* Added interrupt-names support to the ARMv8 timer driver, to better
  represent the lack of secure timer
* Added sysreg_apple.h to define Apple vendor specific CPU system
  registers
* No longer gating the DAIF.F bit handling via alternatives; now we
  always keep it in sync with DAIF.I on all platforms.
* ARCH_APPLE selects APPLE_AIC
* Fixed copyright of DT (GPL-2.0+ OR MIT)
* Made all the copyright lines for new files
  `Copyright The Asahi Linux Contributors`
* Split DT into apple-m1.dtsi and apple-j274.dts
* Cleaner solution for FIQ alternatives in entry.S from Marc
* Implemented proper nGnRnE solution using the `[non]posted-mmio`
  DT properties and ioremap_np()
* Added (tentative) support for guest timer FIQs in AIC
* Added masked initialization and basic fallback handling for other FIQs
* Reworked AIC to not use chained IRQ domains, and split up FIQ irqchip
* More refactoring of the samsung_tty driver to cleanly introduce Apple
  support
* Fixed many style issues, bugs, and other nits

Note: this keeps the `apple,arm-platform` compatible, which is now used
to gate the OF nonposted-mmio logic to Apple platforms only.

Hector Martin (24):
  dt-bindings: vendor-prefixes: Add apple prefix
  dt-bindings: arm: apple: Add bindings for Apple ARM platforms
  dt-bindings: arm: cpus: Add apple,firestorm & icestorm compatibles
  arm64: cputype: Add CPU implementor & types for the Apple M1 cores
  dt-bindings: timer: arm,arch_timer: Add interrupt-names support
  arm64: arch_timer: implement support for interrupt-names
  arm64: cpufeature: Add a feature for FIQ support
  arm64: Always keep DAIF.[IF] in sync
  asm-generic/io.h:  Add a non-posted variant of ioremap()
  arm64: Implement ioremap_np() to map MMIO as nGnRnE
  of/address: Add infrastructure to declare MMIO as non-posted
  arm64: Add Apple vendor-specific system registers
  dt-bindings: interrupt-controller: Add DT bindings for apple-aic
  irqchip/apple-aic: Add support for the Apple Interrupt Controller
  arm64: Kconfig: Introduce CONFIG_ARCH_APPLE
  tty: serial: samsung_tty: Separate S3C64XX ops structure
  tty: serial: samsung_tty: add s3c24xx_port_type
  tty: serial: samsung_tty: IRQ rework
  tty: serial: samsung_tty: Use devm_ioremap_resource
  dt-bindings: serial: samsung: Add apple,s5l-uart compatible
  tty: serial: samsung_tty: Add support for Apple UARTs
  tty: serial: samsung_tty: Add earlycon support for Apple UARTs
  dt-bindings: display: Add apple,simple-framebuffer
  arm64: apple: Add initial Mac Mini 2020 (M1) devicetree

Marc Zyngier (1):
  arm64: entry: Map the FIQ vector to IRQ on NEEDS_FIQ platforms

 .../devicetree/bindings/arm/apple.yaml        |  36 +
 .../devicetree/bindings/arm/cpus.yaml         |   2 +
 .../bindings/display/simple-framebuffer.yaml  |   5 +
 .../interrupt-controller/apple,aic.yaml       |  88 +++
 .../bindings/serial/samsung_uart.yaml         |   4 +-
 .../bindings/timer/arm,arch_timer.yaml        |  14 +
 .../devicetree/bindings/vendor-prefixes.yaml  |   2 +
 MAINTAINERS                                   |  15 +
 arch/arm64/Kconfig                            |  11 +
 arch/arm64/Kconfig.platforms                  |   8 +
 arch/arm64/boot/dts/Makefile                  |   1 +
 arch/arm64/boot/dts/apple/Makefile            |   2 +
 arch/arm64/boot/dts/apple/apple-j274.dts      |  41 ++
 arch/arm64/boot/dts/apple/apple-m1.dtsi       | 124 ++++
 arch/arm64/configs/defconfig                  |   1 +
 arch/arm64/include/asm/assembler.h            |   6 +-
 arch/arm64/include/asm/cpucaps.h              |   3 +-
 arch/arm64/include/asm/cputype.h              |   6 +
 arch/arm64/include/asm/daifflags.h            |   4 +-
 arch/arm64/include/asm/io.h                   |   1 +
 arch/arm64/include/asm/irqflags.h             |  19 +-
 arch/arm64/include/asm/sysreg_apple.h         |  69 ++
 arch/arm64/kernel/cpufeature.c                |  14 +
 arch/arm64/kernel/entry.S                     |  27 +-
 drivers/clocksource/arm_arch_timer.c          |  25 +-
 drivers/irqchip/Kconfig                       |  10 +
 drivers/irqchip/Makefile                      |   1 +
 drivers/irqchip/irq-apple-aic.c               | 647 ++++++++++++++++++
 drivers/of/address.c                          |  72 +-
 drivers/tty/serial/Kconfig                    |   2 +-
 drivers/tty/serial/samsung_tty.c              | 499 +++++++++++---
 include/asm-generic/io.h                      |   8 +-
 include/clocksource/arm_arch_timer.h          |   1 +
 .../interrupt-controller/apple-aic.h          |  15 +
 include/linux/cpuhotplug.h                    |   1 +
 include/linux/io.h                            |   2 +
 include/linux/ioport.h                        |   1 +
 include/linux/of_address.h                    |   1 +
 include/linux/serial_s3c.h                    |  16 +
 lib/devres.c                                  |  22 +
 40 files changed, 1685 insertions(+), 141 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/arm/apple.yaml
 create mode 100644 Documentation/devicetree/bindings/interrupt-controller/apple,aic.yaml
 create mode 100644 arch/arm64/boot/dts/apple/Makefile
 create mode 100644 arch/arm64/boot/dts/apple/apple-j274.dts
 create mode 100644 arch/arm64/boot/dts/apple/apple-m1.dtsi
 create mode 100644 arch/arm64/include/asm/sysreg_apple.h
 create mode 100644 drivers/irqchip/irq-apple-aic.c
 create mode 100644 include/dt-bindings/interrupt-controller/apple-aic.h

--
2.30.0

Arnd Bergmann Feb. 15, 2021, 12:57 p.m. UTC | #1

On Mon, Feb 15, 2021 at 1:16 PM Hector Martin <marcan@marcan.st> wrote:
>
> This series brings up initial support for the Apple M1 SoC, used in the
> 2020 Mac Mini, MacBook Pro, and MacBook Air models.
>
> The following features are supported in this initial port:
>
> - UART (samsung-style) with earlycon support
> - Interrupts, including affinity and IPIs (Apple Interrupt Controller)
> - SMP (through standard spin-table support)
> - simplefb-based framebuffer
> - Devicetree for the Mac Mini (should work for the others too at this
>   stage)

I am essentially happy with the state of this series, the comments I had
in v1 by email and IRC are all addressed, but of course with the timing
during the merge window, it is not going to be in v5.12.

(adding maintainers for the serial/irqchip/clocksource drivers and
arch/arm64 to cc)

I would suggest merging it together as a series through the soc tree for
v5.13, once each patch has been reviewed by the respective subsystem
maintainers, with possible add-on patches on the same branch for
additional drivers that may become ready during the 5.12-rc cycle.
After the initial merge, driver patches will of course go through subsystem
trees as normal.

Let me know if that works for everyone.

     Arnd

Greg Kroah-Hartman Feb. 15, 2021, 1:22 p.m. UTC | #2

On Mon, Feb 15, 2021 at 01:57:39PM +0100, Arnd Bergmann wrote:
> On Mon, Feb 15, 2021 at 1:16 PM Hector Martin <marcan@marcan.st> wrote:
> >
> > This series brings up initial support for the Apple M1 SoC, used in the
> > 2020 Mac Mini, MacBook Pro, and MacBook Air models.
> >
> > The following features are supported in this initial port:
> >
> > - UART (samsung-style) with earlycon support
> > - Interrupts, including affinity and IPIs (Apple Interrupt Controller)
> > - SMP (through standard spin-table support)
> > - simplefb-based framebuffer
> > - Devicetree for the Mac Mini (should work for the others too at this
> >   stage)
> 
> I am essentially happy with the state of this series, the comments I had
> in v1 by email and IRC are all addressed, but of course with the timing
> during the merge window, it is not going to be in v5.12.
> 
> (adding maintainers for the serial/irqchip/clocksource drivers and
> arch/arm64 to cc)
> 
> I would suggest merging it together as a series through the soc tree for
> v5.13, once each patch has been reviewed by the respective subsystem
> maintainers, with possible add-on patches on the same branch for
> additional drivers that may become ready during the 5.12-rc cycle.
> After the initial merge, driver patches will of course go through subsystem
> trees as normal.
> 
> Let me know if that works for everyone.

Sure, as long as the maintainers get to see the patches, I don't think
I've seen the serial ones at all...

Marc Zyngier Feb. 15, 2021, 1:28 p.m. UTC | #3

On Mon, 15 Feb 2021 12:16:54 +0000,
Hector Martin <marcan@marcan.st> wrote:
> 
> This allows the devicetree to correctly represent the available set of
> timers, which varies from device to device, without the need for fake
> dummy interrupts for unavailable slots.
> 
> Also add the hyp-virt timer/PPI, which is not currently used, but worth
> representing.
> 
> Signed-off-by: Hector Martin <marcan@marcan.st>
> ---
>  drivers/clocksource/arm_arch_timer.c | 25 +++++++++++++++++++++----
>  include/clocksource/arm_arch_timer.h |  1 +
>  2 files changed, 22 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/clocksource/arm_arch_timer.c b/drivers/clocksource/arm_arch_timer.c
> index d0177824c518..0e87b6d1ce97 100644
> --- a/drivers/clocksource/arm_arch_timer.c
> +++ b/drivers/clocksource/arm_arch_timer.c
> @@ -63,6 +63,14 @@ struct arch_timer {
>  static u32 arch_timer_rate;
>  static int arch_timer_ppi[ARCH_TIMER_MAX_TIMER_PPI];
>  
> +static const char *arch_timer_ppi_names[ARCH_TIMER_MAX_TIMER_PPI] = {
> +	"phys-secure",
> +	"phys",
> +	"virt",
> +	"hyp-phys",
> +	"hyp-virt",

nit: I'd prefer it if the array was described as:

	[ARCH_TIMER_PHYS_SECURE_PPI] = "phys-secure",
	[...]

just to avoid that the two get out of sync.

> +};
> +
>  static struct clock_event_device __percpu *arch_timer_evt;
>  
>  static enum arch_timer_ppi_nr arch_timer_uses_ppi = ARCH_TIMER_VIRT_PPI;
> @@ -1280,17 +1288,26 @@ static void __init arch_timer_populate_kvm_info(void)
>  
>  static int __init arch_timer_of_init(struct device_node *np)
>  {
> -	int i, ret;
> +	int i, irq, ret;
>  	u32 rate;
> +	bool has_names;
>  
>  	if (arch_timers_present & ARCH_TIMER_TYPE_CP15) {
>  		pr_warn("multiple nodes in dt, skipping\n");
>  		return 0;
>  	}
>  
> -	arch_timers_present |= ARCH_TIMER_TYPE_CP15;

You probably didn't want to drop this line, did you?

> -	for (i = ARCH_TIMER_PHYS_SECURE_PPI; i < ARCH_TIMER_MAX_TIMER_PPI; i++)
> -		arch_timer_ppi[i] = irq_of_parse_and_map(np, i);
> +
> +	has_names = of_property_read_bool(np, "interrupt-names");
> +
> +	for (i = ARCH_TIMER_PHYS_SECURE_PPI; i < ARCH_TIMER_MAX_TIMER_PPI; i++) {
> +		if (has_names)
> +			irq = of_irq_get_byname(np, arch_timer_ppi_names[i]);
> +		else
> +			irq = of_irq_get(np, i);
> +		if (irq > 0)
> +			arch_timer_ppi[i] = irq;
> +	}
>  
>  	arch_timer_populate_kvm_info();
>  
> diff --git a/include/clocksource/arm_arch_timer.h b/include/clocksource/arm_arch_timer.h
> index 1d68d5613dae..73c7139c866f 100644
> --- a/include/clocksource/arm_arch_timer.h
> +++ b/include/clocksource/arm_arch_timer.h
> @@ -32,6 +32,7 @@ enum arch_timer_ppi_nr {
>  	ARCH_TIMER_PHYS_NONSECURE_PPI,
>  	ARCH_TIMER_VIRT_PPI,
>  	ARCH_TIMER_HYP_PPI,
> +	ARCH_TIMER_HYP_VIRT_PPI,
>  	ARCH_TIMER_MAX_TIMER_PPI
>  };

Otherwise looks OK to me.

Thanks,

	M.

Hector Martin Feb. 15, 2021, 3:13 p.m. UTC | #4

On 15/02/2021 22.28, Marc Zyngier wrote:
> nit: I'd prefer it if the array was described as:
> 
> 	[ARCH_TIMER_PHYS_SECURE_PPI] = "phys-secure",
> 	[...]
> 
> just to avoid that the two get out of sync.

Good point, changed it for v3.

>>   
>> -	arch_timers_present |= ARCH_TIMER_TYPE_CP15;
> 
> You probably didn't want to drop this line, did you?

Ouch. Thanks for catching that.

Hector Martin Feb. 15, 2021, 3:57 p.m. UTC | #5

On 15/02/2021 22.22, gregkh wrote:
> On Mon, Feb 15, 2021 at 01:57:39PM +0100, Arnd Bergmann wrote:
>> (adding maintainers for the serial/irqchip/clocksource drivers and
>> arch/arm64 to cc)
>>
>> I would suggest merging it together as a series through the soc tree for
>> v5.13, once each patch has been reviewed by the respective subsystem
>> maintainers, with possible add-on patches on the same branch for
>> additional drivers that may become ready during the 5.12-rc cycle.
>> After the initial merge, driver patches will of course go through subsystem
>> trees as normal.
>>
>> Let me know if that works for everyone.
> 
> Sure, as long as the maintainers get to see the patches, I don't think
> I've seen the serial ones at all...

Sorry, I figured Krzysztof would take a look at it first and I didn't 
want to spam too much. I'm still getting used to figuring out who to CC...

Do you want to take a look at v2, or wait for v3?

Greg Kroah-Hartman Feb. 15, 2021, 4:12 p.m. UTC | #6

On Tue, Feb 16, 2021 at 12:57:27AM +0900, Hector Martin wrote:
> On 15/02/2021 22.22, gregkh wrote:
> > On Mon, Feb 15, 2021 at 01:57:39PM +0100, Arnd Bergmann wrote:
> > > (adding maintainers for the serial/irqchip/clocksource drivers and
> > > arch/arm64 to cc)
> > > 
> > > I would suggest merging it together as a series through the soc tree for
> > > v5.13, once each patch has been reviewed by the respective subsystem
> > > maintainers, with possible add-on patches on the same branch for
> > > additional drivers that may become ready during the 5.12-rc cycle.
> > > After the initial merge, driver patches will of course go through subsystem
> > > trees as normal.
> > > 
> > > Let me know if that works for everyone.
> > 
> > Sure, as long as the maintainers get to see the patches, I don't think
> > I've seen the serial ones at all...
> 
> Sorry, I figured Krzysztof would take a look at it first and I didn't want
> to spam too much. I'm still getting used to figuring out who to CC...

scripts/get_maintainer.pl is your friend.

> Do you want to take a look at v2, or wait for v3?

v3 is fine, I can't do anything until after 5.12-rc1 is out anyway.

thanks,

greg k-h

Hector Martin Feb. 15, 2021, 4:54 p.m. UTC | #7

On 16/02/2021 01.12, gregkh wrote:
> On Tue, Feb 16, 2021 at 12:57:27AM +0900, Hector Martin wrote:
>> On 15/02/2021 22.22, gregkh wrote:
>>> On Mon, Feb 15, 2021 at 01:57:39PM +0100, Arnd Bergmann wrote:
>>>> (adding maintainers for the serial/irqchip/clocksource drivers and
>>>> arch/arm64 to cc)
>>>>
>>>> I would suggest merging it together as a series through the soc tree for
>>>> v5.13, once each patch has been reviewed by the respective subsystem
>>>> maintainers, with possible add-on patches on the same branch for
>>>> additional drivers that may become ready during the 5.12-rc cycle.
>>>> After the initial merge, driver patches will of course go through subsystem
>>>> trees as normal.
>>>>
>>>> Let me know if that works for everyone.
>>>
>>> Sure, as long as the maintainers get to see the patches, I don't think
>>> I've seen the serial ones at all...
>>
>> Sorry, I figured Krzysztof would take a look at it first and I didn't want
>> to spam too much. I'm still getting used to figuring out who to CC...
> 
> scripts/get_maintainer.pl is your friend.

It's the additional step of figuring out who to include from 
get_maintainer.pl output, and whether any subsetting is warranted at 
all, that I'm finding less well documented... :-)

(In particular for a bring-up series such as this one, where most people 
are only concerned with a few patches... but maybe I'm just overthinking 
things)

>> Do you want to take a look at v2, or wait for v3?
> 
> v3 is fine, I can't do anything until after 5.12-rc1 is out anyway.

Got it, thanks!

Krzysztof Kozlowski Feb. 15, 2021, 5:43 p.m. UTC | #8

On Tue, Feb 16, 2021 at 01:54:25AM +0900, Hector Martin wrote:
> On 16/02/2021 01.12, gregkh wrote:
> > On Tue, Feb 16, 2021 at 12:57:27AM +0900, Hector Martin wrote:
> > > On 15/02/2021 22.22, gregkh wrote:
> > > > On Mon, Feb 15, 2021 at 01:57:39PM +0100, Arnd Bergmann wrote:
> > > > > (adding maintainers for the serial/irqchip/clocksource drivers and
> > > > > arch/arm64 to cc)
> > > > > 
> > > > > I would suggest merging it together as a series through the soc tree for
> > > > > v5.13, once each patch has been reviewed by the respective subsystem
> > > > > maintainers, with possible add-on patches on the same branch for
> > > > > additional drivers that may become ready during the 5.12-rc cycle.
> > > > > After the initial merge, driver patches will of course go through subsystem
> > > > > trees as normal.
> > > > > 
> > > > > Let me know if that works for everyone.
> > > > 
> > > > Sure, as long as the maintainers get to see the patches, I don't think
> > > > I've seen the serial ones at all...
> > > 
> > > Sorry, I figured Krzysztof would take a look at it first and I didn't want
> > > to spam too much. I'm still getting used to figuring out who to CC...
> > 
> > scripts/get_maintainer.pl is your friend.
> 
> It's the additional step of figuring out who to include from
> get_maintainer.pl output, and whether any subsetting is warranted at all,
> that I'm finding less well documented... :-)
> 
> (In particular for a bring-up series such as this one, where most people are
> only concerned with a few patches... but maybe I'm just overthinking things)

For bigger patchsets, you should combine get_maintainer.pl with sending
emails so individual patches will go to all role-based entries from
get_maintainer.pl and to all mailing lists. You can (and sometimes even
worth to) skip proposals based on git-history.

Then the cover letter which should go to everyone... or most of people
and linked from individual patches.

For example the easiest I think:
1. Put all folks and lists as "Cc:" in cover letter.
2. Send everything:
   git send-email --cc-cmd "scripts/get_maintainer.pl --no-git --no-roles --no-rolestats" --to linux-kernel@vger.kernel.org 0*

Optionally you could add --no-git-fallback to get_maintainer if it
proposes some irrelevant addresses.

Other way is to send first cover letter and then reference it via
in-reply-to:

1. To get everyone for cover letter:
   scripts/get_maintainer.pl --no-multiline --interactive --separator=\'' --to '\' --no-git --no-roles --no-rolestats 00*

2. git send-email --cc linux-kernel@vger.kernel.org ... 0000*

3. Finally all patches with in-reply-to:
   git send-email --cc-cmd "scripts/get_maintainer.pl --no-git --no-roles --no-rolestats" --to linux-kernel@vger.kernel.org --no-thread --in-reply-to='foo-bar-xxxxxxx' 000[123456789]-* 00[123456789]*-*

Best regards,
Krzysztof

Krzysztof Kozlowski Feb. 15, 2021, 6:06 p.m. UTC | #9

On Mon, Feb 15, 2021 at 09:17:05PM +0900, Hector Martin wrote:
> Instead of patching a single global ops structure depending on the port
> type, use a separate s3c64xx_serial_ops for the S3C64XX type. This
> allows us to mark the structures as const.
> 
> Also split out s3c64xx_serial_shutdown into a separate function now that
> we have a separate ops structure; this avoids excessive branching
> control flow and mirrors s3c64xx_serial_startup. tx_claimed and
> rx_claimed are only used in the S3C24XX functions.
> 
> Signed-off-by: Hector Martin <marcan@marcan.st>
> ---
>  drivers/tty/serial/samsung_tty.c | 69 ++++++++++++++++++++++++--------
>  1 file changed, 53 insertions(+), 16 deletions(-)
> 
> diff --git a/drivers/tty/serial/samsung_tty.c b/drivers/tty/serial/samsung_tty.c
> index 8ae3e03fbd8c..6b661f3ec1ae 100644
> --- a/drivers/tty/serial/samsung_tty.c
> +++ b/drivers/tty/serial/samsung_tty.c
> @@ -1098,27 +1098,36 @@ static void s3c24xx_serial_shutdown(struct uart_port *port)
>  	struct s3c24xx_uart_port *ourport = to_ourport(port);
>  
>  	if (ourport->tx_claimed) {
> -		if (!s3c24xx_serial_has_interrupt_mask(port))
> -			free_irq(ourport->tx_irq, ourport);
> +		free_irq(ourport->tx_irq, ourport);
>  		ourport->tx_enabled = 0;
>  		ourport->tx_claimed = 0;
>  		ourport->tx_mode = 0;
>  	}
>  
>  	if (ourport->rx_claimed) {
> -		if (!s3c24xx_serial_has_interrupt_mask(port))
> -			free_irq(ourport->rx_irq, ourport);
> +		free_irq(ourport->rx_irq, ourport);
>  		ourport->rx_claimed = 0;
>  		ourport->rx_enabled = 0;
>  	}
>  
> -	/* Clear pending interrupts and mask all interrupts */
> -	if (s3c24xx_serial_has_interrupt_mask(port)) {
> -		free_irq(port->irq, ourport);
> +	if (ourport->dma)
> +		s3c24xx_serial_release_dma(ourport);
>  
> -		wr_regl(port, S3C64XX_UINTP, 0xf);
> -		wr_regl(port, S3C64XX_UINTM, 0xf);
> -	}
> +	ourport->tx_in_progress = 0;
> +}
> +
> +static void s3c64xx_serial_shutdown(struct uart_port *port)
> +{
> +	struct s3c24xx_uart_port *ourport = to_ourport(port);
> +
> +	free_irq(port->irq, ourport);
> +
> +	wr_regl(port, S3C64XX_UINTP, 0xf);
> +	wr_regl(port, S3C64XX_UINTM, 0xf);
> +
> +	ourport->tx_enabled = 0;
> +	ourport->tx_mode = 0;
> +	ourport->rx_enabled = 0;

For S3C64xx type this is not equivalent: the assignments were
happening before free_irq() and wr_regl(). Honestly I don't know whether
it matters (except some barriers coming from these functions) but please
do not change the order of code in this patch. If needed, the
re-ordering should be a patch on its own. With explanation why.

>  
>  	if (ourport->dma)
>  		s3c24xx_serial_release_dma(ourport);
> @@ -1193,9 +1202,7 @@ static int s3c64xx_serial_startup(struct uart_port *port)
>  
>  	/* For compatibility with s3c24xx Soc's */
>  	ourport->rx_enabled = 1;
> -	ourport->rx_claimed = 1;
>  	ourport->tx_enabled = 0;
> -	ourport->tx_claimed = 1;
>  
>  	spin_lock_irqsave(&port->lock, flags);
>  
> @@ -1631,6 +1638,29 @@ static struct uart_ops s3c24xx_serial_ops = {
>  #endif
>  };
>  
> +static const struct uart_ops s3c64xx_serial_ops = {
> +	.pm		= s3c24xx_serial_pm,
> +	.tx_empty	= s3c24xx_serial_tx_empty,
> +	.get_mctrl	= s3c24xx_serial_get_mctrl,
> +	.set_mctrl	= s3c24xx_serial_set_mctrl,
> +	.stop_tx	= s3c24xx_serial_stop_tx,
> +	.start_tx	= s3c24xx_serial_start_tx,
> +	.stop_rx	= s3c24xx_serial_stop_rx,
> +	.break_ctl	= s3c24xx_serial_break_ctl,
> +	.startup	= s3c64xx_serial_startup,
> +	.shutdown	= s3c64xx_serial_shutdown,
> +	.set_termios	= s3c24xx_serial_set_termios,
> +	.type		= s3c24xx_serial_type,
> +	.release_port	= s3c24xx_serial_release_port,
> +	.request_port	= s3c24xx_serial_request_port,
> +	.config_port	= s3c24xx_serial_config_port,
> +	.verify_port	= s3c24xx_serial_verify_port,
> +#if defined(CONFIG_SERIAL_SAMSUNG_CONSOLE) && defined(CONFIG_CONSOLE_POLL)
> +	.poll_get_char = s3c24xx_serial_get_poll_char,
> +	.poll_put_char = s3c24xx_serial_put_poll_char,
> +#endif
> +};
> +

Make the s3c24xx_serial_ops const as well.

Best regards,
Krzysztof

Marc Zyngier Feb. 15, 2021, 6:09 p.m. UTC | #10

On Mon, 15 Feb 2021 12:17:03 +0000,
Hector Martin <marcan@marcan.st> wrote:
> 
> This is the root interrupt controller used on Apple ARM SoCs such as the
> M1. This irqchip driver performs multiple functions:
> 
> * Discriminates between IRQs and FIQs
> 
> * Drives the AIC peripheral itself (which handles IRQs)
> 
> * Dispatches FIQs to downstream hard-wired clients (currently the ARM
>   timer).
> 
> This patch introduces basic UP irqchip support, without SMP/IPI support.

This last comment seems outdated now.

> Signed-off-by: Hector Martin <marcan@marcan.st>
> ---
>  MAINTAINERS                     |   2 +
>  drivers/irqchip/Kconfig         |  10 +
>  drivers/irqchip/Makefile        |   1 +
>  drivers/irqchip/irq-apple-aic.c | 647 ++++++++++++++++++++++++++++++++
>  include/linux/cpuhotplug.h      |   1 +
>  5 files changed, 661 insertions(+)
>  create mode 100644 drivers/irqchip/irq-apple-aic.c
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 9fe723033e63..a8f258fbb5f1 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -1636,6 +1636,8 @@ T:	git https://github.com/AsahiLinux/linux.git
>  F:	Documentation/devicetree/bindings/arm/apple.yaml
>  F:	Documentation/devicetree/bindings/interrupt-controller/apple,aic.yaml
>  F:	arch/arm64/include/asm/sysreg_apple.h
> +F:	drivers/irqchip/irq-apple-aic.c
> +F:	include/dt-bindings/interrupt-controller/apple-aic.h
>  
>  ARM/ARTPEC MACHINE SUPPORT
>  M:	Jesper Nilsson <jesper.nilsson@axis.com>
> diff --git a/drivers/irqchip/Kconfig b/drivers/irqchip/Kconfig
> index b147f22a78f4..fca080640c1a 100644
> --- a/drivers/irqchip/Kconfig
> +++ b/drivers/irqchip/Kconfig
> @@ -590,4 +590,14 @@ config MST_IRQ
>  	help
>  	  Support MStar Interrupt Controller.
>  
> +config APPLE_AIC
> +	bool "Apple Interrupt Controller (AIC)"
> +	depends on ARM64
> +	default ARCH_APPLE
> +	select IRQ_DOMAIN
> +	select IRQ_DOMAIN_HIERARCHY

arm64 selects GENERIC_IRQ_IPI, which selects IRQ_DOMAIN_HIERARCHY,
which selects IRQ_DOMAIN. So these two lines are superfluous.

> +	help
> +	  Support for the Apple Interrupt Controller found on Apple Silicon SoCs,
> +	  such as the M1.
> +
>  endmenu
> diff --git a/drivers/irqchip/Makefile b/drivers/irqchip/Makefile
> index 0ac93bfaec61..0e2ba7c2dce7 100644
> --- a/drivers/irqchip/Makefile
> +++ b/drivers/irqchip/Makefile
> @@ -113,3 +113,4 @@ obj-$(CONFIG_LOONGSON_PCH_PIC)		+= irq-loongson-pch-pic.o
>  obj-$(CONFIG_LOONGSON_PCH_MSI)		+= irq-loongson-pch-msi.o
>  obj-$(CONFIG_MST_IRQ)			+= irq-mst-intc.o
>  obj-$(CONFIG_SL28CPLD_INTC)		+= irq-sl28cpld.o
> +obj-$(CONFIG_APPLE_AIC)			+= irq-apple-aic.o
> diff --git a/drivers/irqchip/irq-apple-aic.c b/drivers/irqchip/irq-apple-aic.c
> new file mode 100644
> index 000000000000..604dbb8fd898
> --- /dev/null
> +++ b/drivers/irqchip/irq-apple-aic.c
> @@ -0,0 +1,647 @@
> +// SPDX-License-Identifier: GPL-2.0-or-later
> +/*
> + * Copyright The Asahi Linux Contributors
> + *
> + * Based on irq-lpc32xx:
> + *   Copyright 2015-2016 Vladimir Zapolskiy <vz@mleia.com>
> + * Based on irq-bcm2836:
> + *   Copyright 2015 Broadcom
> + */
> +
> +/*
> + * AIC is a fairly simple interrupt controller with the following features:
> + *
> + * - 896 level-triggered hardware IRQs
> + *   - Single mask bit per IRQ
> + *   - Per-IRQ affinity setting
> + *   - Automatic masking on event delivery (auto-ack)
> + *   - Software triggering (ORed with hw line)
> + * - 2 per-CPU IPIs (meant as "self" and "other", but they are interchangeable if not symmetric)
> + * - Automatic prioritization (single event/ack register per CPU, lower IRQs = higher priority)
> + * - Automatic masking on ack
> + * - Default "this CPU" register view and explicit per-CPU views
> + *
> + * In addition, this driver also handles FIQs, as these are routed to the same IRQ vector. These
> + * are used for Fast IPIs (TODO), the ARMv8 timer IRQs, and performance counters (TODO).
> + *

nit: A bit of comment formatting could be helpful.

> + * Implementation notes:
> + *
> + * - This driver creates two IRQ domains, one for HW IRQs and internal FIQs, and one for IPIs
> + * - Since Linux needs more than 2 IPIs, we implement a software IRQ controller and funnel all IPIs
> + *   into one per-CPU IPI (the second "self" IPI is unused).
> + * - FIQ hwirq numbers are assigned after true hwirqs, and are per-cpu
> + * - DT bindings use 3-cell form (like GIC):
> + *   - <0 nr flags> - hwirq #nr
> + *   - <1 nr flags> - FIQ #nr
> + *     - nr=0  Physical HV timer
> + *     - nr=1  Virtual HV timer
> + *     - nr=2  Physical guest timer
> + *     - nr=3  Virtual guest timer
> + *
> + */
> +
> +#define pr_fmt(fmt) "%s: " fmt, __func__
> +
> +#include <linux/bits.h>
> +#include <linux/bitfield.h>
> +#include <linux/cpuhotplug.h>
> +#include <linux/io.h>
> +#include <linux/irqchip.h>
> +#include <linux/irqchip/arm-gic-v3.h>

I'd rather you move the ICH_HCR_* definitions to sysreg.h rather than
including the GICv3 stuff. They are only there for historical reasons
(such as supporting KVM on 32bit systems), none of which apply anymore.

> +#include <linux/irqdomain.h>
> +#include <linux/of_address.h>
> +#include <linux/slab.h>
> +#include <asm/exception.h>
> +#include <asm/sysreg.h>
> +#include <asm/sysreg_apple.h>
> +
> +#include <dt-bindings/interrupt-controller/apple-aic.h>
> +
> +#define AIC_INFO		0x0004
> +#define AIC_INFO_NR_HW		GENMASK(15, 0)
> +
> +#define AIC_CONFIG		0x0010
> +
> +#define AIC_WHOAMI		0x2000
> +#define AIC_EVENT		0x2004
> +#define AIC_EVENT_TYPE		GENMASK(31, 16)
> +#define AIC_EVENT_NUM		GENMASK(15, 0)
> +
> +#define AIC_EVENT_TYPE_HW	1
> +#define AIC_EVENT_TYPE_IPI	4
> +#define AIC_EVENT_IPI_OTHER	1
> +#define AIC_EVENT_IPI_SELF	2
> +
> +#define AIC_IPI_SEND		0x2008
> +#define AIC_IPI_ACK		0x200c
> +#define AIC_IPI_MASK_SET	0x2024
> +#define AIC_IPI_MASK_CLR	0x2028
> +
> +#define AIC_IPI_SEND_CPU(cpu)	BIT(cpu)
> +
> +#define AIC_IPI_OTHER		BIT(0)
> +#define AIC_IPI_SELF		BIT(31)
> +
> +#define AIC_TARGET_CPU		0x3000
> +#define AIC_SW_SET		0x4000
> +#define AIC_SW_CLR		0x4080
> +#define AIC_MASK_SET		0x4100
> +#define AIC_MASK_CLR		0x4180
> +
> +#define AIC_CPU_IPI_SET(cpu)	(0x5008 + ((cpu) << 7))
> +#define AIC_CPU_IPI_CLR(cpu)	(0x500c + ((cpu) << 7))
> +#define AIC_CPU_IPI_MASK_SET(cpu) (0x5024 + ((cpu) << 7))
> +#define AIC_CPU_IPI_MASK_CLR(cpu) (0x5028 + ((cpu) << 7))
> +
> +#define MASK_REG(x)		(4 * ((x) >> 5))
> +#define MASK_BIT(x)		BIT((x) & 0x1f)
> +
> +#define AIC_NR_FIQ		4
> +#define AIC_NR_SWIPI		32
> +
> +/*
> + * Max 31 bits in IPI SEND register (top bit is self).
> + * >=32-core chips will need code changes anyway.
> + */
> +#define AIC_MAX_CPUS		31
> +
> +struct aic_irq_chip {
> +	void __iomem *base;
> +	struct irq_domain *hw_domain;
> +	struct irq_domain *ipi_domain;
> +	int nr_hw;
> +	int ipi_hwirq;
> +};
> +
> +static atomic_t aic_vipi_flag[AIC_MAX_CPUS];
> +static atomic_t aic_vipi_mask[AIC_MAX_CPUS];
> +
> +static struct aic_irq_chip *aic_irqc;
> +
> +static void aic_handle_ipi(struct pt_regs *regs);
> +
> +static u32 aic_ic_read(struct aic_irq_chip *ic, u32 reg)
> +{
> +	return readl_relaxed(ic->base + reg);
> +}
> +
> +static void aic_ic_write(struct aic_irq_chip *ic, u32 reg, u32 val)
> +{
> +	writel_relaxed(val, ic->base + reg);
> +}
> +
> +/*
> + * IRQ irqchip
> + */
> +
> +static void aic_irq_mask(struct irq_data *d)
> +{
> +	struct aic_irq_chip *ic = irq_data_get_irq_chip_data(d);
> +
> +	aic_ic_write(ic, AIC_MASK_SET + MASK_REG(d->hwirq),
> +		     MASK_BIT(d->hwirq));
> +}
> +
> +static void aic_irq_unmask(struct irq_data *d)
> +{
> +	struct aic_irq_chip *ic = irq_data_get_irq_chip_data(d);
> +
> +	aic_ic_write(ic, AIC_MASK_CLR + MASK_REG(d->hwirq),
> +		     MASK_BIT(d->hwirq));
> +}
> +
> +static void aic_irq_eoi(struct irq_data *d)
> +{
> +	/*
> +	 * Reading the interrupt reason automatically acknowledges and masks
> +	 * the IRQ, so we just unmask it here if needed.
> +	 */
> +	if (!irqd_irq_disabled(d) && !irqd_irq_masked(d))
> +		aic_irq_unmask(d);
> +}
> +
> +static void aic_handle_irq(struct pt_regs *regs)
> +{
> +	struct aic_irq_chip *ic = aic_irqc;
> +	u32 event, type, irq;
> +
> +	do {
> +		/*
> +		 * We cannot use a relaxed read here, as DMA needs to be
> +		 * ordered with respect to the IRQ firing.
> +		 */
> +		event = readl(ic->base + AIC_EVENT);
> +		type = FIELD_GET(AIC_EVENT_TYPE, event);
> +		irq = FIELD_GET(AIC_EVENT_NUM, event);
> +
> +		if (type == AIC_EVENT_TYPE_HW)
> +			handle_domain_irq(aic_irqc->hw_domain, irq, regs);
> +		else if (type == AIC_EVENT_TYPE_IPI && irq == 1)
> +			aic_handle_ipi(regs);
> +		else if (event != 0)
> +			pr_err("Unknown IRQ event %d, %d\n", type, irq);
> +	} while (event);
> +
> +	/*
> +	 * vGIC maintenance interrupts end up here too, so we need to check
> +	 * for them separately. Just report and disable vGIC for now, until
> +	 * we implement this properly.
> +	 */
> +	if ((read_sysreg_s(SYS_ICH_HCR_EL2) & ICH_HCR_EN) &&
> +		read_sysreg_s(SYS_ICH_MISR_EL2) != 0) {
> +		pr_err("vGIC IRQ fired, disabling.\n");
> +		sysreg_clear_set_s(SYS_ICH_HCR_EL2, ICH_HCR_EN, 0);
> +	}
> +}
> +
> +static int aic_irq_set_affinity(struct irq_data *d,
> +				const struct cpumask *mask_val, bool force)
> +{
> +	irq_hw_number_t hwirq = irqd_to_hwirq(d);
> +	struct aic_irq_chip *ic = irq_data_get_irq_chip_data(d);
> +	int cpu;
> +
> +	if (hwirq > ic->nr_hw)
> +		return -EINVAL;
> +
> +	if (force)
> +		cpu = cpumask_first(mask_val);
> +	else
> +		cpu = cpumask_any_and(mask_val, cpu_online_mask);
> +
> +	aic_ic_write(ic, AIC_TARGET_CPU + hwirq * 4, BIT(cpu));
> +	irq_data_update_effective_affinity(d, cpumask_of(cpu));

It is fine to pick a single CPU out of the whole affinity set, but you
should tell the kernel that this is the case (irqd_set_single_target()).

> +
> +	return IRQ_SET_MASK_OK;
> +}
> +
> +static struct irq_chip aic_chip = {
> +	.name = "AIC",
> +	.irq_mask = aic_irq_mask,
> +	.irq_unmask = aic_irq_unmask,
> +	.irq_eoi = aic_irq_eoi,
> +	.irq_set_affinity = aic_irq_set_affinity,
> +};
> +
> +/*
> + * FIQ irqchip
> + */
> +
> +static void aic_fiq_mask(struct irq_data *d)
> +{
> +	/* Only the guest timers have real mask bits, unfortunately. */
> +	switch (d->hwirq) {
> +	case AIC_TMR_GUEST_PHYS:
> +		sysreg_clear_set_s(SYS_APL_VM_TMR_MASK,
> +					VM_TMR_MASK_P, 0);
> +		break;
> +	case AIC_TMR_GUEST_VIRT:
> +		sysreg_clear_set_s(SYS_APL_VM_TMR_MASK,
> +					VM_TMR_MASK_V, 0);
> +		break;
> +	}
> +}
> +
> +static void aic_fiq_unmask(struct irq_data *d)
> +{
> +	switch (d->hwirq) {
> +	case AIC_TMR_GUEST_PHYS:
> +		sysreg_clear_set_s(SYS_APL_VM_TMR_MASK,
> +					0, VM_TMR_MASK_P);
> +		break;
> +	case AIC_TMR_GUEST_VIRT:
> +		sysreg_clear_set_s(SYS_APL_VM_TMR_MASK,
> +					0, VM_TMR_MASK_V);
> +		break;
> +	}
> +}
> +
> +static void aic_fiq_eoi(struct irq_data *d)
> +{
> +	/* We mask to ack (where we can), so we need to unmask at EOI. */
> +	if (!irqd_irq_disabled(d) && !irqd_irq_masked(d))
> +		aic_fiq_unmask(d);
> +}
> +
> +#define TIMER_FIRING(x)                                                        \
> +	(((x) & (ARCH_TIMER_CTRL_ENABLE | ARCH_TIMER_CTRL_IT_MASK |            \
> +		 ARCH_TIMER_CTRL_IT_STAT)) ==                                  \
> +	 (ARCH_TIMER_CTRL_ENABLE | ARCH_TIMER_CTRL_IT_STAT))
> +
> +static void aic_handle_fiq(struct pt_regs *regs)
> +{
> +	/*
> +	 * It would be really if we had a system register that lets us get
> +	 * the FIQ source state without having to peek down into sources...
> +	 * but such a register does not seem to exist.
> +	 *
> +	 * So, we have these potential sources to test for:
> +	 *  - Fast IPIs (not yet used)
> +	 *  - The 4 timers (CNTP, CNTV for each of HV and guest)
> +	 *  - Per-core PMCs (not yet supported)
> +	 *  - Per-cluster uncore PMCs (not yet supported)
> +	 *
> +	 * Since not dealing with any of these results in a FIQ storm,
> +	 * we check for everything here, even things we don't support yet.
> +	 */
> +
> +	if (read_sysreg_s(SYS_APL_IPI_SR) & IPI_SR_PENDING) {
> +		pr_warn("Fast IPI fired. Acking.\n");
> +		write_sysreg_s(IPI_SR_PENDING, SYS_APL_IPI_SR);
> +	}
> +
> +	if (TIMER_FIRING(read_sysreg(cntp_ctl_el0)))
> +		handle_domain_irq(aic_irqc->hw_domain,
> +				  aic_irqc->nr_hw + AIC_TMR_HV_PHYS, regs);
> +
> +	if (TIMER_FIRING(read_sysreg(cntv_ctl_el0)))
> +		handle_domain_irq(aic_irqc->hw_domain,
> +				  aic_irqc->nr_hw + AIC_TMR_HV_VIRT, regs);
> +
> +	if (TIMER_FIRING(read_sysreg_s(SYS_CNTP_CTL_EL02)))
> +		handle_domain_irq(aic_irqc->hw_domain,
> +				  aic_irqc->nr_hw + AIC_TMR_GUEST_PHYS, regs);
> +
> +	if (TIMER_FIRING(read_sysreg_s(SYS_CNTV_CTL_EL02)))
> +		handle_domain_irq(aic_irqc->hw_domain,
> +				  aic_irqc->nr_hw + AIC_TMR_GUEST_VIRT, regs);
> +
> +	if ((read_sysreg_s(SYS_APL_PMCR0) & (PMCR0_IMODE | PMCR0_IACT))
> +			== (FIELD_PREP(PMCR0_IMODE, PMCR0_IMODE_FIQ) | PMCR0_IACT)) {
> +		/*
> +		 * Not supported yet, let's figure out how to handle this when
> +		 * we implement these proprietary performance counters. For now,
> +		 * just mask it and move on.
> +		 */
> +		pr_warn("PMC FIQ fired. Masking.\n");
> +		sysreg_clear_set_s(SYS_APL_PMCR0, PMCR0_IMODE | PMCR0_IACT,
> +				   FIELD_PREP(PMCR0_IMODE, PMCR0_IMODE_OFF));
> +	}
> +
> +	if (FIELD_GET(UPMCR0_IMODE, read_sysreg_s(SYS_APL_UPMCR0)) == UPMCR0_IMODE_FIQ &&
> +			(read_sysreg_s(SYS_APL_UPMSR) & UPMSR_IACT)) {
> +		/* Same story with uncore PMCs */
> +		pr_warn("Uncore PMC FIQ fired. Masking.\n");
> +		sysreg_clear_set_s(SYS_APL_UPMCR0, UPMCR0_IMODE,
> +				   FIELD_PREP(UPMCR0_IMODE, UPMCR0_IMODE_OFF));
> +	}
> +}
> +
> +static struct irq_chip fiq_chip = {
> +	.name = "AIC-FIQ",
> +	.irq_mask = aic_fiq_mask,
> +	.irq_unmask = aic_fiq_unmask,
> +	.irq_ack = aic_fiq_mask,
> +	.irq_eoi = aic_fiq_eoi,
> +};
> +
> +/*
> + * Main IRQ domain
> + */
> +
> +static void __exception_irq_entry aic_handle_irq_or_fiq(struct pt_regs *regs)
> +{
> +	u64 isr = read_sysreg(isr_el1);
> +
> +	if (isr & PSR_F_BIT)
> +		aic_handle_fiq(regs);
> +
> +	if (isr & PSR_I_BIT)
> +		aic_handle_irq(regs);
> +}
> +
> +static int aic_irq_domain_map(struct irq_domain *id, unsigned int irq,
> +			      irq_hw_number_t hw)
> +{
> +	struct aic_irq_chip *ic = id->host_data;
> +
> +	irq_set_chip_data(irq, ic);
> +	if (hw < ic->nr_hw) {
> +		irq_set_chip_and_handler(irq, &aic_chip, handle_fasteoi_irq);
> +	} else {
> +		irq_set_percpu_devid(irq);
> +		irq_set_chip_and_handler(irq, &fiq_chip,
> +					 handle_percpu_devid_irq);
> +	}
> +
> +	irq_set_status_flags(irq, IRQ_LEVEL);

I'm definitely not keen on this override, and the trigger information
should be the one coming from the DT, which is already set for you.
It'd probably be useful to provide an irq_set_type() callback that
returns an error when fed an unsupported trigger.

> +	irq_set_noprobe(irq);

This seems to be cargo-culted, and I don't believe this is necessary.

> +
> +	return 0;
> +}
> +
> +static void aic_irq_domain_unmap(struct irq_domain *id, unsigned int irq)
> +{
> +	irq_set_chip_and_handler(irq, NULL, NULL);
> +}
> +
> +static int aic_irq_domain_xlate(struct irq_domain *id,
> +				struct device_node *ctrlr, const u32 *intspec,
> +				unsigned int intsize,
> +				irq_hw_number_t *out_hwirq,
> +				unsigned int *out_type)
> +{
> +	struct aic_irq_chip *ic = id->host_data;
> +
> +	if (intsize != 3)
> +		return -EINVAL;
> +
> +	if (intspec[0] == AIC_IRQ && intspec[1] < ic->nr_hw)
> +		*out_hwirq = intspec[1];
> +	else if (intspec[0] == AIC_FIQ && intspec[1] < AIC_NR_FIQ)
> +		*out_hwirq = ic->nr_hw + intspec[1];
> +	else
> +		return -EINVAL;
> +
> +	*out_type = intspec[2] & IRQ_TYPE_SENSE_MASK;
> +
> +	return 0;
> +}
> +
> +static const struct irq_domain_ops aic_irq_domain_ops = {
> +	.map = aic_irq_domain_map,
> +	.unmap = aic_irq_domain_unmap,
> +	.xlate = aic_irq_domain_xlate,
> +};

You are mixing two APIs: the older OF-specific one, and the newer one
that uses fwnode_handle for hierarchical support. That's OK for older
drivers that were forcefully converted to using generic IPIs, but as
this is a brand new driver, I'd rather it consistently used the new
API. See a proposed rework at [1] (compile tested only).

> +
> +/*
> + * IPI irqchip
> + */
> +
> +static void aic_ipi_mask(struct irq_data *d)
> +{
> +	struct aic_irq_chip *ic = irq_data_get_irq_chip_data(d);
> +	u32 irq_bit = BIT(irqd_to_hwirq(d));
> +	int this_cpu = smp_processor_id();
> +
> +	atomic_and(~irq_bit, &aic_vipi_mask[this_cpu]);

atomic_andnot()?

> +
> +	if (!atomic_read(&aic_vipi_mask[this_cpu]))
> +		aic_ic_write(ic, AIC_IPI_MASK_SET, AIC_IPI_OTHER);

This is odd. It means that you still perform the MMIO write if the bit
was already clear. I think this could be written as:

	u32 val;
	val = atomic_fetch_andnot(irq_bit, &aic_vipi_mask[this_cpu]);
	if (val && !(val & ~irq_bit))
		aic_ic_write();

> +}
> +
> +static void aic_ipi_unmask(struct irq_data *d)
> +{
> +	struct aic_irq_chip *ic = irq_data_get_irq_chip_data(d);
> +	u32 irq_bit = BIT(irqd_to_hwirq(d));
> +	int this_cpu = smp_processor_id();
> +
> +	atomic_or(irq_bit, &aic_vipi_mask[this_cpu]);
> +
> +	aic_ic_write(ic, AIC_IPI_MASK_CLR, AIC_IPI_OTHER);

	val  = atomic_fetch_or(irq_bit, &aic_vipi_mask[this_cpu]);
	if (!val)
		aic_ic_write();

> +}
> +
> +static void aic_ipi_send_mask(struct irq_data *d, const struct cpumask *mask)
> +{
> +	struct aic_irq_chip *ic = irq_data_get_irq_chip_data(d);
> +	u32 irq_bit = BIT(irqd_to_hwirq(d));
> +	u32 send = 0;
> +	int cpu;
> +
> +	/*
> +	 * Ensure that stores to normal memory are visible to the
> +	 * other CPUs before issuing the IPI. This needs to happen
> +	 * before setting any vIPI flag bits, since that can race the
> +	 * atomic_xchg below.
> +	 */
> +	wmb();
> +
> +	for_each_cpu(cpu, mask) {
> +		if (atomic_read(&aic_vipi_mask[cpu]) & irq_bit) {
> +			atomic_or(irq_bit, &aic_vipi_flag[cpu]);
> +			send |= AIC_IPI_SEND_CPU(cpu);

That's really odd. A masked IPI should be made pending, and delivered
on unmask. I think this all works because we never mask individual
IPIs, as this would otherwise drop interrupts on the floor.

> +		}
> +	}
> +
> +	if (send) {
> +		/*
> +		 * Ensure that the vIPI flag writes complete before issuing
> +		 * the physical IPI.
> +		 */
> +		wmb();
> +		aic_ic_write(ic, AIC_IPI_SEND, send);
> +	}
> +}
> +
> +static struct irq_chip ipi_chip = {
> +	.name = "AIC-IPI",
> +	.irq_mask = aic_ipi_mask,
> +	.irq_unmask = aic_ipi_unmask,
> +	.ipi_send_mask = aic_ipi_send_mask,
> +};
> +
> +/*
> + * IPI IRQ domain
> + */
> +
> +static void aic_handle_ipi(struct pt_regs *regs)
> +{
> +	int this_cpu = smp_processor_id();
> +	int i;
> +	unsigned long firing;
> +
> +	aic_ic_write(aic_irqc, AIC_IPI_ACK, AIC_IPI_OTHER);
> +
> +	/*
> +	 * Ensure that we've received and acked the IPI before we load the vIPI
> +	 * flags. This pairs with the second wmb() above.
> +	 */
> +	mb();

I don't get your ordering here.

If you are trying to order against something that has happened on
another CPU (which is pretty likely in the case of an IPI), why isn't
this a smp_mb_before_atomic() (and conversely smp_mb_after_atomic() in
aic_ipi_send_mask())?

Although this looks to me like a good case for _acquire/_release
semantics.

> +
> +	firing = atomic_xchg(&aic_vipi_flag[this_cpu], 0);
> +
> +	/*
> +	 * Ensure that we've exchanged the vIPI flags before running any IPI
> +	 * handler code. This pairs with the first wmb() above.
> +	 */
> +	rmb();
> +
> +	for_each_set_bit(i, &firing, AIC_NR_SWIPI) {
> +		handle_domain_irq(aic_irqc->ipi_domain, i, regs);
> +	}
> +
> +	aic_ic_write(aic_irqc, AIC_IPI_MASK_CLR, AIC_IPI_OTHER);
> +}
> +
> +static int aic_ipi_alloc(struct irq_domain *d, unsigned int virq,
> +			 unsigned int nr_irqs, void *args)
> +{
> +	int i;
> +
> +	for (i = 0; i < nr_irqs; i++) {
> +		irq_set_percpu_devid(virq + i);
> +		irq_domain_set_info(d, virq + i, i, &ipi_chip, d->host_data,
> +				    handle_percpu_devid_irq, NULL, NULL);
> +	}
> +
> +	return 0;
> +}
> +
> +static void aic_ipi_free(struct irq_domain *d, unsigned int virq, unsigned int nr_irqs)
> +{
> +	/* Not freeing IPIs */
> +}
> +
> +static const struct irq_domain_ops aic_ipi_domain_ops = {
> +	.alloc = aic_ipi_alloc,
> +	.free = aic_ipi_free,
> +};
> +
> +static int aic_init_smp(struct aic_irq_chip *irqc, struct device_node *node)
> +{
> +	int base_ipi;
> +
> +	irqc->ipi_domain = irq_domain_create_linear(irqc->hw_domain->fwnode, AIC_NR_SWIPI,
> +						    &aic_ipi_domain_ops, irqc);
> +	if (WARN_ON(!irqc->ipi_domain))
> +		return -ENODEV;
> +
> +	irqc->ipi_domain->flags |= IRQ_DOMAIN_FLAG_IPI_SINGLE;
> +	irq_domain_update_bus_token(irqc->ipi_domain, DOMAIN_BUS_IPI);
> +
> +	base_ipi = __irq_domain_alloc_irqs(irqc->ipi_domain, -1, AIC_NR_SWIPI,
> +					   NUMA_NO_NODE, NULL, false, NULL);
> +
> +	if (WARN_ON(!base_ipi)) {
> +		irq_domain_remove(irqc->ipi_domain);
> +		return -ENODEV;
> +	}
> +
> +	set_smp_ipi_range(base_ipi, AIC_NR_SWIPI);
> +
> +	return 0;
> +}
> +
> +static int aic_init_cpu(unsigned int cpu)
> +{
> +	/* Mask all hard-wired per-CPU IRQ/FIQ sources */
> +
> +	/* vGIC maintenance IRQ */
> +	sysreg_clear_set_s(SYS_ICH_HCR_EL2, ICH_HCR_EN, 0);
> +
> +	/* Pending Fast IPI FIQs */
> +	write_sysreg_s(IPI_SR_PENDING, SYS_APL_IPI_SR);
> +
> +	/* Timer FIQs */
> +	sysreg_clear_set(cntp_ctl_el0, 0, ARCH_TIMER_CTRL_IT_MASK);
> +	sysreg_clear_set(cntv_ctl_el0, 0, ARCH_TIMER_CTRL_IT_MASK);
> +	sysreg_clear_set_s(SYS_CNTP_CTL_EL02, 0, ARCH_TIMER_CTRL_IT_MASK);
> +	sysreg_clear_set_s(SYS_CNTV_CTL_EL02, 0, ARCH_TIMER_CTRL_IT_MASK);
> +
> +	/* PMC FIQ */
> +	sysreg_clear_set_s(SYS_APL_PMCR0, PMCR0_IMODE | PMCR0_IACT,
> +			   FIELD_PREP(PMCR0_IMODE, PMCR0_IMODE_OFF));
> +
> +	/* Uncore PMC FIQ */
> +	sysreg_clear_set_s(SYS_APL_UPMCR0, UPMCR0_IMODE,
> +			   FIELD_PREP(UPMCR0_IMODE, UPMCR0_IMODE_OFF));
> +
> +	/*
> +	 * Make sure the kernel's idea of logical CPU order is the same as AIC's
> +	 * If we ever end up with a mismatch here, we will have to introduce
> +	 * a mapping table similar to what other irqchip drivers do.
> +	 */
> +	WARN_ON(aic_ic_read(aic_irqc, AIC_WHOAMI) != smp_processor_id());

This is unlikely to work as soon as you get kexec up and running. You
may not have to worry about this for some time...

> +
> +	return 0;
> +
> +}
> +
> +static int __init aic_of_ic_init(struct device_node *node, struct device_node *parent)
> +{
> +	int i;
> +	void __iomem *regs;
> +	u32 info;
> +	struct aic_irq_chip *irqc;
> +
> +	regs = of_iomap(node, 0);
> +	if (WARN_ON(!regs))
> +		return -EIO;
> +
> +	irqc = kzalloc(sizeof(*irqc), GFP_KERNEL);
> +	if (!irqc)
> +		return -ENOMEM;
> +
> +	aic_irqc = irqc;
> +	irqc->base = regs;
> +
> +	info = aic_ic_read(irqc, AIC_INFO);
> +	irqc->nr_hw = FIELD_GET(AIC_INFO_NR_HW, info);
> +
> +	irqc->hw_domain = irq_domain_add_linear(node, irqc->nr_hw + AIC_NR_FIQ,
> +						&aic_irq_domain_ops, irqc);
> +	if (WARN_ON(!irqc->hw_domain)) {
> +		iounmap(irqc->base);
> +		kfree(irqc);
> +		return -ENODEV;
> +	}
> +
> +	irq_domain_update_bus_token(irqc->hw_domain, DOMAIN_BUS_WIRED);
> +
> +	if (aic_init_smp(irqc, node)) {
> +		irq_domain_remove(irqc->hw_domain);
> +		iounmap(irqc->base);
> +		kfree(irqc);
> +		return -ENODEV;
> +	}
> +
> +	set_handle_irq(aic_handle_irq_or_fiq);
> +
> +	for (i = 0; i < BITS_TO_U32(irqc->nr_hw); i++)
> +		aic_ic_write(irqc, AIC_MASK_SET + i * 4, ~0);
> +	for (i = 0; i < BITS_TO_U32(irqc->nr_hw); i++)
> +		aic_ic_write(irqc, AIC_SW_CLR + i * 4, ~0);
> +	for (i = 0; i < irqc->nr_hw; i++)
> +		aic_ic_write(irqc, AIC_TARGET_CPU + i * 4, 1);
> +
> +	cpuhp_setup_state(CPUHP_AP_IRQ_APPLE_AIC_STARTING,
> +			  "irqchip/apple-aic/ipi:starting",
> +			  aic_init_cpu, NULL);
> +
> +	pr_info("AIC: initialized with %d IRQs, %d FIQs, %d vIPIs\n",
> +		irqc->nr_hw, AIC_NR_FIQ, AIC_NR_SWIPI);
> +
> +	return 0;
> +}
> +
> +IRQCHIP_DECLARE(apple_m1_aic, "apple,aic", aic_of_ic_init);
> diff --git a/include/linux/cpuhotplug.h b/include/linux/cpuhotplug.h
> index 0042ef362511..7ea1fc537f12 100644
> --- a/include/linux/cpuhotplug.h
> +++ b/include/linux/cpuhotplug.h
> @@ -100,6 +100,7 @@ enum cpuhp_state {
>  	CPUHP_AP_CPU_PM_STARTING,
>  	CPUHP_AP_IRQ_GIC_STARTING,
>  	CPUHP_AP_IRQ_HIP04_STARTING,
> +	CPUHP_AP_IRQ_APPLE_AIC_STARTING,
>  	CPUHP_AP_IRQ_ARMADA_XP_STARTING,
>  	CPUHP_AP_IRQ_BCM2836_STARTING,
>  	CPUHP_AP_IRQ_MIPS_GIC_STARTING,

Thanks,

	M.

[1] https://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms.git/commit/?h=hack/m1-aic&id=17e2fdb4f551f98a2ff483f3ac2501648bfba33e

Tony Lindgren Feb. 15, 2021, 6:23 p.m. UTC | #11

* Hector Martin <marcan@marcan.st> [210215 12:18]:
> This allows the devicetree to correctly represent the available set of
> timers, which varies from device to device, without the need for fake
> dummy interrupts for unavailable slots.

I like the idea of using interrupt-names property for mapping timers :)

Similar approach might help other SoCs too. And clocksources never really
had similar issues.

With Marc's comments addressed, please feel free to add:

Reviewed-by: Tony Lindgren <tony@atomide.com>

Krzysztof Kozlowski Feb. 15, 2021, 6:26 p.m. UTC | #12

On Mon, Feb 15, 2021 at 09:17:06PM +0900, Hector Martin wrote:
> This decouples the TTY layer PORT_ types, which are exposed to
> userspace, from the driver-internal flag of what kind of port this is.
> 
> This removes s3c24xx_serial_has_interrupt_mask, which was just checking
> for a specific type anyway, and adds the ucon_mask port info member to
> avoid having S3C2440 as a distinct type.

Please split setting the ucon_mask to separate patch. It's a nice
code simplification on its own.

> 
> Signed-off-by: Hector Martin <marcan@marcan.st>
> ---
>  drivers/tty/serial/samsung_tty.c | 131 ++++++++++++++++++-------------
>  1 file changed, 77 insertions(+), 54 deletions(-)
> 
> diff --git a/drivers/tty/serial/samsung_tty.c b/drivers/tty/serial/samsung_tty.c
> index 6b661f3ec1ae..21955be680a4 100644
> --- a/drivers/tty/serial/samsung_tty.c
> +++ b/drivers/tty/serial/samsung_tty.c
> @@ -56,9 +56,15 @@
>  /* flag to ignore all characters coming in */
>  #define RXSTAT_DUMMY_READ (0x10000000)
>  
> +enum s3c24xx_port_type {
> +	TYPE_S3C24XX,
> +	TYPE_S3C6400,
> +};
> +
>  struct s3c24xx_uart_info {
>  	char			*name;
> -	unsigned int		type;
> +	enum s3c24xx_port_type	type;
> +	unsigned int		port_type;
>  	unsigned int		fifosize;
>  	unsigned long		rx_fifomask;
>  	unsigned long		rx_fifoshift;
> @@ -70,6 +76,7 @@ struct s3c24xx_uart_info {
>  	unsigned long		num_clks;
>  	unsigned long		clksel_mask;
>  	unsigned long		clksel_shift;
> +	unsigned long		ucon_mask;
>  
>  	/* uart port features */
>  
> @@ -228,16 +235,6 @@ static int s3c24xx_serial_txempty_nofifo(struct uart_port *port)
>  	return rd_regl(port, S3C2410_UTRSTAT) & S3C2410_UTRSTAT_TXE;
>  }
>  
> -/*
> - * s3c64xx and later SoC's include the interrupt mask and status registers in
> - * the controller itself, unlike the s3c24xx SoC's which have these registers
> - * in the interrupt controller. Check if the port type is s3c64xx or higher.
> - */
> -static int s3c24xx_serial_has_interrupt_mask(struct uart_port *port)
> -{
> -	return to_ourport(port)->info->type == PORT_S3C6400;
> -}
> -
>  static void s3c24xx_serial_rx_enable(struct uart_port *port)
>  {
>  	struct s3c24xx_uart_port *ourport = to_ourport(port);
> @@ -289,10 +286,14 @@ static void s3c24xx_serial_stop_tx(struct uart_port *port)
>  	if (!ourport->tx_enabled)
>  		return;
>  
> -	if (s3c24xx_serial_has_interrupt_mask(port))
> +	switch (ourport->info->type) {
> +	case TYPE_S3C6400:
>  		s3c24xx_set_bit(port, S3C64XX_UINTM_TXD, S3C64XX_UINTM);
> -	else
> +		break;
> +	default:
>  		disable_irq_nosync(ourport->tx_irq);
> +		break;
> +	}
>  
>  	if (dma && dma->tx_chan && ourport->tx_in_progress == S3C24XX_TX_DMA) {
>  		dmaengine_pause(dma->tx_chan);
> @@ -353,10 +354,14 @@ static void enable_tx_dma(struct s3c24xx_uart_port *ourport)
>  	u32 ucon;
>  
>  	/* Mask Tx interrupt */
> -	if (s3c24xx_serial_has_interrupt_mask(port))
> +	switch (ourport->info->type) {
> +	case TYPE_S3C6400:
>  		s3c24xx_set_bit(port, S3C64XX_UINTM_TXD, S3C64XX_UINTM);
> -	else
> +		break;
> +	default:
>  		disable_irq_nosync(ourport->tx_irq);
> +		break;
> +	}
>  
>  	/* Enable tx dma mode */
>  	ucon = rd_regl(port, S3C2410_UCON);
> @@ -386,11 +391,14 @@ static void enable_tx_pio(struct s3c24xx_uart_port *ourport)
>  	wr_regl(port,  S3C2410_UCON, ucon);
>  
>  	/* Unmask Tx interrupt */
> -	if (s3c24xx_serial_has_interrupt_mask(port))
> -		s3c24xx_clear_bit(port, S3C64XX_UINTM_TXD,
> -				  S3C64XX_UINTM);
> -	else
> +	switch (ourport->info->type) {
> +	case TYPE_S3C6400:
> +		s3c24xx_clear_bit(port, S3C64XX_UINTM_TXD, S3C64XX_UINTM);

Please do not re-wrap. It makes reviewing more difficult. You can
perform proper re-wrapping as a separate cleanup patch.

> +		break;
> +	default:
>  		enable_irq(ourport->tx_irq);
> +		break;
> +	}
>  
>  	ourport->tx_mode = S3C24XX_TX_PIO;
>  }
> @@ -513,11 +521,14 @@ static void s3c24xx_serial_stop_rx(struct uart_port *port)
>  
>  	if (ourport->rx_enabled) {
>  		dev_dbg(port->dev, "stopping rx\n");
> -		if (s3c24xx_serial_has_interrupt_mask(port))
> -			s3c24xx_set_bit(port, S3C64XX_UINTM_RXD,
> -					S3C64XX_UINTM);

The same.

Best regards,
Krzysztof

Krzysztof Kozlowski Feb. 15, 2021, 6:40 p.m. UTC | #13

On Mon, Feb 15, 2021 at 09:17:07PM +0900, Hector Martin wrote:
> * Split out s3c24xx_serial_tx_chars from s3c24xx_serial_tx_irq,
>   where only the latter acquires the port lock.

I miss here information why you do all this.

> 
> * For S3C64xx, return IRQ_NONE if no flag bits were set, so the
>   interrupt core can detect IRQ storms. Note that both IRQ handlers
>   always return IRQ_HANDLED anyway, so 'or' logic isn't necessary here,
>   if either handler ran we are always going to return IRQ_HANDLED.

It looks like separate patch. Your patches should do only one thing at
once. The fact that you have here three bullet points is a warning
sign. This is even more important if you do some refactorings and
cleanups which should not affect functionality. Hiding there changes
which could affect functionality is a no-go.

> 
> * Rename s3c24xx_serial_rx_chars to s3c24xx_serial_rx_irq for
>   consistency with the above. All it does now is call two other
>   functions anyway.

Separate patch for trivial renaming.

> 
> Signed-off-by: Hector Martin <marcan@marcan.st>
> ---
>  drivers/tty/serial/samsung_tty.c | 41 +++++++++++++++++++-------------
>  1 file changed, 24 insertions(+), 17 deletions(-)
> 
> diff --git a/drivers/tty/serial/samsung_tty.c b/drivers/tty/serial/samsung_tty.c
> index 21955be680a4..821cd0e4f870 100644
> --- a/drivers/tty/serial/samsung_tty.c
> +++ b/drivers/tty/serial/samsung_tty.c
> @@ -151,6 +151,9 @@ struct s3c24xx_uart_port {
>  #endif
>  };
>  
> +static void s3c24xx_serial_start_next_tx(struct s3c24xx_uart_port *ourport);
> +static void s3c24xx_serial_tx_chars(struct s3c24xx_uart_port *ourport);
> +
>  /* conversion functions */
>  
>  #define s3c24xx_dev_to_port(__dev) dev_get_drvdata(__dev)
> @@ -316,8 +319,6 @@ static void s3c24xx_serial_stop_tx(struct uart_port *port)
>  	ourport->tx_mode = 0;
>  }
>  
> -static void s3c24xx_serial_start_next_tx(struct s3c24xx_uart_port *ourport);
> -

Why moving this? Why adding s3c24xx_serial_tx_chars() forward
declaration?

Best regards,
Krzysztof

Krzysztof Kozlowski Feb. 15, 2021, 6:51 p.m. UTC | #14

On Mon, Feb 15, 2021 at 09:17:08PM +0900, Hector Martin wrote:
> This picks up the non-posted I/O mode needed for Apple platforms to
> work properly.
> 
> This removes the request/release functions, which are no longer
> necessary, since devm_ioremap_resource takes care of that already. Most
> other drivers already do it this way, anyway.
> 
> Also fix a bug checking the return value, which should use IS_ERR().

No, no, no. We never, never combine fixing bugs with some rework.
However devm_ioremap() returns NULL so where is the error?

Did you test your patches on existing platforms? If not, please mark all
of them as RFT on next submission, so Greg does not pick them too fast.

Best regards,
Krzysztof

Marc Zyngier Feb. 15, 2021, 7:11 p.m. UTC | #15

On Mon, 15 Feb 2021 17:43:59 +0000,
Krzysztof Kozlowski <krzk@kernel.org> wrote:

> For bigger patchsets, you should combine get_maintainer.pl with sending
> emails so individual patches will go to all role-based entries from
> get_maintainer.pl and to all mailing lists. You can (and sometimes even
> worth to) skip proposals based on git-history.
> 
> Then the cover letter which should go to everyone... or most of people
> and linked from individual patches.
> 
> For example the easiest I think:
> 1. Put all folks and lists as "Cc:" in cover letter.
> 2. Send everything:
>    git send-email --cc-cmd "scripts/get_maintainer.pl --no-git --no-roles --no-rolestats" --to linux-kernel@vger.kernel.org 0*
> 
> Optionally you could add --no-git-fallback to get_maintainer if it
> proposes some irrelevant addresses.
> 
> Other way is to send first cover letter and then reference it via
> in-reply-to:
> 
> 1. To get everyone for cover letter:
>    scripts/get_maintainer.pl --no-multiline --interactive --separator=\'' --to '\' --no-git --no-roles --no-rolestats 00*
> 
> 2. git send-email --cc linux-kernel@vger.kernel.org ... 0000*
> 
> 3. Finally all patches with in-reply-to:
>    git send-email --cc-cmd "scripts/get_maintainer.pl --no-git --no-roles --no-rolestats" --to linux-kernel@vger.kernel.org --no-thread --in-reply-to='foo-bar-xxxxxxx' 000[123456789]-* 00[123456789]*-*

I personally dislike this method as a recipient. Context matters, and
seeing how an isolated patch fits in a bigger series is pretty
important. I can mark the patches I don't care about quicker than you
can say "Don't care", and it saves me having to drag the mbox via lore
and import it.

That's only me, but I suspect I'm not the only one with that
particular flow.

Thanks,

	M.

Krzysztof Kozlowski Feb. 15, 2021, 7:13 p.m. UTC | #16

On Mon, Feb 15, 2021 at 09:17:10PM +0900, Hector Martin wrote:
> Apple SoCs are a distant descendant of Samsung designs and use yet
> another variant of their UART style, with different interrupt handling.
> 
> In particular, this variant has the following differences with existing
> ones:
> 
> * It includes a built-in interrupt controller with different registers,
>   using only a single platform IRQ
> 
> * Internal interrupt sources are treated as edge-triggered, even though
>   the IRQ output is level-triggered. This chiefly affects the TX IRQ
>   path: the driver can no longer rely on the TX buffer empty IRQ
>   immediately firing after TX is enabled, but instead must prime the
>   FIFO with data directly.
> 
> Signed-off-by: Hector Martin <marcan@marcan.st>
> ---
>  drivers/tty/serial/Kconfig       |   2 +-
>  drivers/tty/serial/samsung_tty.c | 228 +++++++++++++++++++++++++++++--
>  include/linux/serial_s3c.h       |  16 +++
>  3 files changed, 236 insertions(+), 10 deletions(-)
> 
> diff --git a/drivers/tty/serial/Kconfig b/drivers/tty/serial/Kconfig
> index 34a2899e69c0..9bfe4ec1e761 100644
> --- a/drivers/tty/serial/Kconfig
> +++ b/drivers/tty/serial/Kconfig
> @@ -236,7 +236,7 @@ config SERIAL_CLPS711X_CONSOLE
>  
>  config SERIAL_SAMSUNG
>  	tristate "Samsung SoC serial support"
> -	depends on PLAT_SAMSUNG || ARCH_S5PV210 || ARCH_EXYNOS || COMPILE_TEST
> +	depends on PLAT_SAMSUNG || ARCH_S5PV210 || ARCH_EXYNOS || ARCH_APPLE || COMPILE_TEST
>  	select SERIAL_CORE
>  	help
>  	  Support for the on-chip UARTs on the Samsung S3C24XX series CPUs,
> diff --git a/drivers/tty/serial/samsung_tty.c b/drivers/tty/serial/samsung_tty.c
> index 619bc4864e2a..e7ab0b9d89a7 100644
> --- a/drivers/tty/serial/samsung_tty.c
> +++ b/drivers/tty/serial/samsung_tty.c
> @@ -59,6 +59,7 @@
>  enum s3c24xx_port_type {
>  	TYPE_S3C24XX,
>  	TYPE_S3C6400,
> +	TYPE_APPLE_S5L,
>  };
>  
>  struct s3c24xx_uart_info {
> @@ -290,6 +291,9 @@ static void s3c24xx_serial_stop_tx(struct uart_port *port)
>  		return;
>  
>  	switch (ourport->info->type) {
> +	case TYPE_APPLE_S5L:
> +		s3c24xx_clear_bit(port, APPLE_S5L_UCON_TXTHRESH_ENA, S3C2410_UCON);
> +		break;
>  	case TYPE_S3C6400:
>  		s3c24xx_set_bit(port, S3C64XX_UINTM_TXD, S3C64XX_UINTM);
>  		break;
> @@ -356,6 +360,9 @@ static void enable_tx_dma(struct s3c24xx_uart_port *ourport)
>  
>  	/* Mask Tx interrupt */
>  	switch (ourport->info->type) {
> +	case TYPE_APPLE_S5L:
> +		WARN_ON(1); // No DMA
> +		break;
>  	case TYPE_S3C6400:
>  		s3c24xx_set_bit(port, S3C64XX_UINTM_TXD, S3C64XX_UINTM);
>  		break;
> @@ -389,10 +396,12 @@ static void enable_tx_pio(struct s3c24xx_uart_port *ourport)
>  	ucon = rd_regl(port, S3C2410_UCON);
>  	ucon &= ~(S3C64XX_UCON_TXMODE_MASK);
>  	ucon |= S3C64XX_UCON_TXMODE_CPU;
> -	wr_regl(port,  S3C2410_UCON, ucon);
>  
>  	/* Unmask Tx interrupt */
>  	switch (ourport->info->type) {
> +	case TYPE_APPLE_S5L:
> +		ucon |= APPLE_S5L_UCON_TXTHRESH_ENA_MSK;
> +		break;
>  	case TYPE_S3C6400:
>  		s3c24xx_clear_bit(port, S3C64XX_UINTM_TXD, S3C64XX_UINTM);
>  		break;
> @@ -401,7 +410,16 @@ static void enable_tx_pio(struct s3c24xx_uart_port *ourport)
>  		break;
>  	}
>  
> +	wr_regl(port,  S3C2410_UCON, ucon);

You are now configuring the PIO mode after unmasking interrupt. I don't
think it's a good idea to change the order... and if it were, it
would deserve a separate patch.

> +
>  	ourport->tx_mode = S3C24XX_TX_PIO;
> +
> +	/*
> +	 * The Apple version only has edge triggered TX IRQs, so we need
> +	 * to kick off the process by sending some characters here.
> +	 */
> +	if (ourport->info->type == TYPE_APPLE_S5L)
> +		s3c24xx_serial_tx_chars(ourport);
>  }
>  
>  static void s3c24xx_serial_start_tx_pio(struct s3c24xx_uart_port *ourport)
> @@ -523,6 +541,10 @@ static void s3c24xx_serial_stop_rx(struct uart_port *port)
>  	if (ourport->rx_enabled) {
>  		dev_dbg(port->dev, "stopping rx\n");
>  		switch (ourport->info->type) {
> +		case TYPE_APPLE_S5L:
> +			s3c24xx_clear_bit(port, APPLE_S5L_UCON_RXTHRESH_ENA, S3C2410_UCON);
> +			s3c24xx_clear_bit(port, APPLE_S5L_UCON_RXTO_ENA, S3C2410_UCON);
> +			break;
>  		case TYPE_S3C6400:
>  			s3c24xx_set_bit(port, S3C64XX_UINTM_RXD, S3C64XX_UINTM);
>  			break;
> @@ -663,14 +685,18 @@ static void enable_rx_pio(struct s3c24xx_uart_port *ourport)
>  
>  	/* set Rx mode to DMA mode */
>  	ucon = rd_regl(port, S3C2410_UCON);
> -	ucon &= ~(S3C64XX_UCON_TIMEOUT_MASK |
> -			S3C64XX_UCON_EMPTYINT_EN |
> -			S3C64XX_UCON_DMASUS_EN |
> -			S3C64XX_UCON_TIMEOUT_EN |
> -			S3C64XX_UCON_RXMODE_MASK);
> -	ucon |= 0xf << S3C64XX_UCON_TIMEOUT_SHIFT |
> -			S3C64XX_UCON_TIMEOUT_EN |
> -			S3C64XX_UCON_RXMODE_CPU;
> +	ucon &= ~S3C64XX_UCON_RXMODE_MASK;
> +	ucon |= S3C64XX_UCON_RXMODE_CPU;
> +
> +	/* Apple types use these bits for IRQ masks */
> +	if (ourport->info->type != TYPE_APPLE_S5L) {
> +		ucon &= ~(S3C64XX_UCON_TIMEOUT_MASK |
> +				S3C64XX_UCON_EMPTYINT_EN |
> +				S3C64XX_UCON_DMASUS_EN |
> +				S3C64XX_UCON_TIMEOUT_EN);
> +		ucon |= 0xf << S3C64XX_UCON_TIMEOUT_SHIFT |
> +				S3C64XX_UCON_TIMEOUT_EN;
> +	}
>  	wr_regl(port, S3C2410_UCON, ucon);
>  
>  	ourport->rx_mode = S3C24XX_RX_PIO;
> @@ -934,6 +960,27 @@ static irqreturn_t s3c64xx_serial_handle_irq(int irq, void *id)
>  	return ret;
>  }
>  
> +/* interrupt handler for Apple SoC's.*/
> +static irqreturn_t apple_serial_handle_irq(int irq, void *id)
> +{
> +	struct s3c24xx_uart_port *ourport = id;
> +	struct uart_port *port = &ourport->port;
> +	unsigned int pend = rd_regl(port, S3C2410_UTRSTAT);
> +	irqreturn_t ret = IRQ_NONE;
> +
> +	if (pend & (APPLE_S5L_UTRSTAT_RXTHRESH | APPLE_S5L_UTRSTAT_RXTO)) {
> +		wr_regl(port, S3C2410_UTRSTAT,
> +			APPLE_S5L_UTRSTAT_RXTHRESH | APPLE_S5L_UTRSTAT_RXTO);
> +		ret = s3c24xx_serial_rx_irq(irq, id);
> +	}
> +	if (pend & APPLE_S5L_UTRSTAT_TXTHRESH) {
> +		wr_regl(port, S3C2410_UTRSTAT, APPLE_S5L_UTRSTAT_TXTHRESH);
> +		ret = s3c24xx_serial_tx_irq(irq, id);
> +	}
> +
> +	return ret;
> +}
> +
>  static unsigned int s3c24xx_serial_tx_empty(struct uart_port *port)
>  {
>  	struct s3c24xx_uart_info *info = s3c24xx_port_to_info(port);
> @@ -1153,6 +1200,32 @@ static void s3c64xx_serial_shutdown(struct uart_port *port)
>  	ourport->tx_in_progress = 0;
>  }
>  
> +static void apple_s5l_serial_shutdown(struct uart_port *port)
> +{
> +	struct s3c24xx_uart_port *ourport = to_ourport(port);
> +
> +	unsigned int ucon;
> +
> +	ucon = rd_regl(port, S3C2410_UCON);
> +	ucon &= ~(APPLE_S5L_UCON_TXTHRESH_ENA_MSK |
> +		  APPLE_S5L_UCON_RXTHRESH_ENA_MSK |
> +		  APPLE_S5L_UCON_RXTO_ENA_MSK);
> +	wr_regl(port, S3C2410_UCON, ucon);
> +
> +	wr_regl(port, S3C2410_UTRSTAT, APPLE_S5L_UTRSTAT_ALL_FLAGS);
> +
> +	free_irq(port->irq, ourport);
> +
> +	ourport->tx_enabled = 0;
> +	ourport->tx_mode = 0;
> +	ourport->rx_enabled = 0;
> +
> +	if (ourport->dma)
> +		s3c24xx_serial_release_dma(ourport);
> +
> +	ourport->tx_in_progress = 0;
> +}
> +
>  static int s3c24xx_serial_startup(struct uart_port *port)
>  {
>  	struct s3c24xx_uart_port *ourport = to_ourport(port);
> @@ -1240,6 +1313,45 @@ static int s3c64xx_serial_startup(struct uart_port *port)
>  	return ret;
>  }
>  
> +static int apple_s5l_serial_startup(struct uart_port *port)
> +{
> +	struct s3c24xx_uart_port *ourport = to_ourport(port);
> +	unsigned long flags;
> +	unsigned int ufcon;
> +	int ret;
> +
> +	wr_regl(port, S3C2410_UTRSTAT, APPLE_S5L_UTRSTAT_ALL_FLAGS);
> +
> +	ret = request_irq(port->irq, apple_serial_handle_irq, 0,
> +			  s3c24xx_serial_portname(port), ourport);
> +	if (ret) {
> +		dev_err(port->dev, "cannot get irq %d\n", port->irq);
> +		return ret;
> +	}
> +
> +	/* For compatibility with s3c24xx Soc's */
> +	ourport->rx_enabled = 1;
> +	ourport->tx_enabled = 0;
> +
> +	spin_lock_irqsave(&port->lock, flags);
> +
> +	ufcon = rd_regl(port, S3C2410_UFCON);
> +	ufcon |= S3C2410_UFCON_RESETRX | S5PV210_UFCON_RXTRIG8;
> +	if (!uart_console(port))
> +		ufcon |= S3C2410_UFCON_RESETTX;
> +	wr_regl(port, S3C2410_UFCON, ufcon);
> +
> +	enable_rx_pio(ourport);
> +
> +	spin_unlock_irqrestore(&port->lock, flags);
> +
> +	/* Enable Rx Interrupt */
> +	s3c24xx_set_bit(port, APPLE_S5L_UCON_RXTHRESH_ENA, S3C2410_UCON);
> +	s3c24xx_set_bit(port, APPLE_S5L_UCON_RXTO_ENA, S3C2410_UCON);
> +
> +	return ret;
> +}
> +
>  /* power power management control */
>  
>  static void s3c24xx_serial_pm(struct uart_port *port, unsigned int level,
> @@ -1567,6 +1679,8 @@ static const char *s3c24xx_serial_type(struct uart_port *port)
>  		return "S3C24XX";
>  	case TYPE_S3C6400:
>  		return "S3C6400/10";
> +	case TYPE_APPLE_S5L:
> +		return "APPLE S5L";
>  	default:
>  		return NULL;
>  	}
> @@ -1658,6 +1772,27 @@ static const struct uart_ops s3c64xx_serial_ops = {
>  #endif
>  };
>  
> +static const struct uart_ops apple_s5l_serial_ops = {
> +	.pm		= s3c24xx_serial_pm,
> +	.tx_empty	= s3c24xx_serial_tx_empty,
> +	.get_mctrl	= s3c24xx_serial_get_mctrl,
> +	.set_mctrl	= s3c24xx_serial_set_mctrl,
> +	.stop_tx	= s3c24xx_serial_stop_tx,
> +	.start_tx	= s3c24xx_serial_start_tx,
> +	.stop_rx	= s3c24xx_serial_stop_rx,
> +	.break_ctl	= s3c24xx_serial_break_ctl,
> +	.startup	= apple_s5l_serial_startup,
> +	.shutdown	= apple_s5l_serial_shutdown,
> +	.set_termios	= s3c24xx_serial_set_termios,
> +	.type		= s3c24xx_serial_type,
> +	.config_port	= s3c24xx_serial_config_port,
> +	.verify_port	= s3c24xx_serial_verify_port,
> +#if defined(CONFIG_SERIAL_SAMSUNG_CONSOLE) && defined(CONFIG_CONSOLE_POLL)
> +	.poll_get_char = s3c24xx_serial_get_poll_char,
> +	.poll_put_char = s3c24xx_serial_put_poll_char,
> +#endif
> +};
> +
>  static struct uart_driver s3c24xx_uart_drv = {
>  	.owner		= THIS_MODULE,
>  	.driver_name	= "s3c2410_serial",
> @@ -1969,6 +2104,18 @@ static int s3c24xx_serial_init_port(struct s3c24xx_uart_port *ourport,
>  
>  	/* Keep all interrupts masked and cleared */
>  	switch (ourport->info->type) {
> +	case TYPE_APPLE_S5L: {

Usually you put TYPE_APPLE at the end of switch, so please keep it
consistent. Can be first or last - just everywhere the same, unless you
have a fall-through on purpose.

> +		unsigned int ucon;
> +
> +		ucon = rd_regl(port, S3C2410_UCON);
> +		ucon &= ~(APPLE_S5L_UCON_TXTHRESH_ENA_MSK |
> +			APPLE_S5L_UCON_RXTHRESH_ENA_MSK |
> +			APPLE_S5L_UCON_RXTO_ENA_MSK);
> +		wr_regl(port, S3C2410_UCON, ucon);
> +
> +		wr_regl(port, S3C2410_UTRSTAT, APPLE_S5L_UTRSTAT_ALL_FLAGS);
> +		break;
> +	}
>  	case TYPE_S3C6400:
>  		wr_regl(port, S3C64XX_UINTM, 0xf);
>  		wr_regl(port, S3C64XX_UINTP, 0xf);
> @@ -2053,6 +2200,9 @@ static int s3c24xx_serial_probe(struct platform_device *pdev)
>  	case TYPE_S3C6400:
>  		ourport->port.ops = &s3c64xx_serial_ops;
>  		break;
> +	case TYPE_APPLE_S5L:
> +		ourport->port.ops = &apple_s5l_serial_ops;
> +		break;
>  	}
>  
>  	if (np) {
> @@ -2179,6 +2329,32 @@ static int s3c24xx_serial_resume_noirq(struct device *dev)
>  	if (port) {
>  		/* restore IRQ mask */
>  		switch (ourport->info->type) {
> +		case TYPE_APPLE_S5L: {
> +			unsigned int ucon;
> +
> +			clk_prepare_enable(ourport->clk);
> +			if (!IS_ERR(ourport->baudclk))
> +				clk_prepare_enable(ourport->baudclk);

We should start checking the return values of clk operations. I know
that existing code does it only in few places, so basically you are not
making it worse...

> +
> +			ucon = rd_regl(port, S3C2410_UCON);
> +
> +			ucon &= ~(APPLE_S5L_UCON_TXTHRESH_ENA_MSK |
> +				  APPLE_S5L_UCON_RXTHRESH_ENA_MSK |
> +				  APPLE_S5L_UCON_RXTO_ENA_MSK);
> +
> +			if (ourport->tx_enabled)
> +				ucon |= APPLE_S5L_UCON_TXTHRESH_ENA_MSK;
> +			if (ourport->rx_enabled)
> +				ucon |= APPLE_S5L_UCON_RXTHRESH_ENA_MSK |
> +					APPLE_S5L_UCON_RXTO_ENA_MSK;
> +
> +			wr_regl(port, S3C2410_UCON, ucon);
> +
> +			if (!IS_ERR(ourport->baudclk))
> +				clk_disable_unprepare(ourport->baudclk);
> +			clk_disable_unprepare(ourport->clk);
> +			break;
> +		}
>  		case TYPE_S3C6400: {
>  			unsigned int uintm = 0xf;
>  
> @@ -2604,6 +2780,35 @@ static struct s3c24xx_serial_drv_data exynos5433_serial_drv_data = {
>  #define EXYNOS5433_SERIAL_DRV_DATA (kernel_ulong_t)NULL
>  #endif
>  
> +#ifdef CONFIG_ARCH_APPLE
> +static struct s3c24xx_serial_drv_data s5l_serial_drv_data = {
> +	.info = &(struct s3c24xx_uart_info) {
> +		.name		= "Apple S5L UART",
> +		.type		= TYPE_APPLE_S5L,
> +		.port_type	= PORT_8250,
> +		.fifosize	= 16,
> +		.rx_fifomask	= S3C2410_UFSTAT_RXMASK,
> +		.rx_fifoshift	= S3C2410_UFSTAT_RXSHIFT,
> +		.rx_fifofull	= S3C2410_UFSTAT_RXFULL,
> +		.tx_fifofull	= S3C2410_UFSTAT_TXFULL,
> +		.tx_fifomask	= S3C2410_UFSTAT_TXMASK,
> +		.tx_fifoshift	= S3C2410_UFSTAT_TXSHIFT,
> +		.def_clk_sel	= S3C2410_UCON_CLKSEL0,
> +		.num_clks	= 1,
> +		.clksel_mask	= 0,
> +		.clksel_shift	= 0,
> +	},
> +	.def_cfg = &(struct s3c2410_uartcfg) {
> +		.ucon		= APPLE_S5L_UCON_DEFAULT,
> +		.ufcon		= S3C2410_UFCON_DEFAULT,
> +	},
> +};
> +#define S5L_SERIAL_DRV_DATA ((kernel_ulong_t)&s5l_serial_drv_data)
> +#else
> +#define S5L_SERIAL_DRV_DATA ((kernel_ulong_t)NULL)
> +#endif
> +
> +

Only one line break.

Best regards,
Krzysztof

Krzysztof Kozlowski Feb. 15, 2021, 7:17 p.m. UTC | #17

On Mon, Feb 15, 2021 at 09:17:11PM +0900, Hector Martin wrote:
> Earlycon support is identical to S3C2410, but Apple SoCs also need
> MMIO mapped as nGnRnE. This is handled generically for normal drivers
> including the normal UART path here, but earlycon uses fixmap and
> runs before that scaffolding is ready.
> 
> Since this is the only case where we need this fix, it makes more
> sense to do it here in the UART driver instead of introducing a
> whole fdt nonposted-mmio resolver just for earlycon/fixmap.
> 
> Suggested-by: Arnd Bergmann <arnd@arndb.de>
> Signed-off-by: Hector Martin <marcan@marcan.st>
> ---
>  drivers/tty/serial/samsung_tty.c | 17 +++++++++++++++++
>  1 file changed, 17 insertions(+)
> 
> diff --git a/drivers/tty/serial/samsung_tty.c b/drivers/tty/serial/samsung_tty.c
> index e7ab0b9d89a7..00262f0e704b 100644
> --- a/drivers/tty/serial/samsung_tty.c
> +++ b/drivers/tty/serial/samsung_tty.c
> @@ -2988,6 +2988,23 @@ OF_EARLYCON_DECLARE(s5pv210, "samsung,s5pv210-uart",
>  			s5pv210_early_console_setup);
>  OF_EARLYCON_DECLARE(exynos4210, "samsung,exynos4210-uart",
>  			s5pv210_early_console_setup);
> +
> +/* Apple S5L */
> +static int __init apple_s5l_early_console_setup(struct earlycon_device *device,
> +						const char *opt)
> +{
> +	/* Close enough to S3C2410 for earlycon... */
> +	device->port.private_data = &s3c2410_early_console_data;
> +
> +#ifdef CONFIG_ARM64

if IS_ENABLED()
(unless it cannot be used due to missing symbol?)

> +	/* ... but we need to override the existing fixmap entry as nGnRnE */
> +	__set_fixmap(FIX_EARLYCON_MEM_BASE, device->port.mapbase,
> +		     __pgprot(PROT_DEVICE_nGnRnE));
> +#endif
> +	return samsung_early_console_setup(device, opt);

Don't you need to handle the error code - set PROT_DEFAULT() or whatever
was there before?

Best regards,
Krzysztof

Krzysztof Kozlowski Feb. 15, 2021, 7:29 p.m. UTC | #18

On Mon, Feb 15, 2021 at 09:17:13PM +0900, Hector Martin wrote:
> This currently supports:
> 
> * SMP (via spin-tables)
> * AIC IRQs
> * Serial (with earlycon)
> * Framebuffer
> 
> A number of properties are dynamic, and based on system firmware
> decisions that vary from version to version. These are expected
> to be filled in by the loader.
> 
> Signed-off-by: Hector Martin <marcan@marcan.st>
> ---
>  MAINTAINERS                              |   1 +
>  arch/arm64/boot/dts/Makefile             |   1 +
>  arch/arm64/boot/dts/apple/Makefile       |   2 +
>  arch/arm64/boot/dts/apple/apple-j274.dts |  41 ++++++++
>  arch/arm64/boot/dts/apple/apple-m1.dtsi  | 124 +++++++++++++++++++++++
>  5 files changed, 169 insertions(+)
>  create mode 100644 arch/arm64/boot/dts/apple/Makefile
>  create mode 100644 arch/arm64/boot/dts/apple/apple-j274.dts
>  create mode 100644 arch/arm64/boot/dts/apple/apple-m1.dtsi
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index a8f258fbb5f1..87db1c947f45 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -1635,6 +1635,7 @@ C:	irc://chat.freenode.net/asahi-dev
>  T:	git https://github.com/AsahiLinux/linux.git
>  F:	Documentation/devicetree/bindings/arm/apple.yaml
>  F:	Documentation/devicetree/bindings/interrupt-controller/apple,aic.yaml
> +F:	arch/arm64/boot/dts/apple/
>  F:	arch/arm64/include/asm/sysreg_apple.h
>  F:	drivers/irqchip/irq-apple-aic.c
>  F:	include/dt-bindings/interrupt-controller/apple-aic.h
> diff --git a/arch/arm64/boot/dts/Makefile b/arch/arm64/boot/dts/Makefile
> index 9b1170658d60..64f055d94948 100644
> --- a/arch/arm64/boot/dts/Makefile
> +++ b/arch/arm64/boot/dts/Makefile
> @@ -6,6 +6,7 @@ subdir-y += amazon
>  subdir-y += amd
>  subdir-y += amlogic
>  subdir-y += apm
> +subdir-y += apple
>  subdir-y += arm
>  subdir-y += bitmain
>  subdir-y += broadcom
> diff --git a/arch/arm64/boot/dts/apple/Makefile b/arch/arm64/boot/dts/apple/Makefile
> new file mode 100644
> index 000000000000..ec03c474efd4
> --- /dev/null
> +++ b/arch/arm64/boot/dts/apple/Makefile
> @@ -0,0 +1,2 @@
> +# SPDX-License-Identifier: GPL-2.0
> +dtb-$(CONFIG_ARCH_APPLE) += apple-j274.dtb
> diff --git a/arch/arm64/boot/dts/apple/apple-j274.dts b/arch/arm64/boot/dts/apple/apple-j274.dts
> new file mode 100644
> index 000000000000..9a1be91a2cf0
> --- /dev/null
> +++ b/arch/arm64/boot/dts/apple/apple-j274.dts
> @@ -0,0 +1,41 @@
> +// SPDX-License-Identifier: GPL-2.0+ OR MIT

() around licenses, so:
(GPL-2.0+ OR MIT)

> +/*
> + * Copyright The Asahi Linux Contributors
> + */
> +
> +/dts-v1/;
> +
> +#include "apple-m1.dtsi"
> +
> +/ {
> +	compatible = "apple,j274", "apple,m1", "apple,arm-platform";
> +	model = "Apple Mac Mini M1 2020";
> +
> +	aliases {
> +		serial0 = &serial0;
> +	};
> +
> +	chosen {
> +		#address-cells = <2>;
> +		#size-cells = <2>;
> +		ranges;
> +
> +		stdout-path = "serial0:1500000";
> +
> +		framebuffer0: framebuffer@0 {
> +			compatible = "apple,simple-framebuffer", "simple-framebuffer";
> +			reg = <0 0 0 0>; /* To be filled by loader */
> +			/* Format properties will be added by loader */
> +			status = "disabled";
> +		};
> +	};
> +
> +	memory@800000000 {
> +		device_type = "memory";
> +		reg = <0 0 0 0>; /* To be filled by loader */

dtc and dtschema might complain, so could you set here fake memory
address 0x800000000? Would that work for your bootloader?

> +	};
> +};
> +
> +&serial0 {
> +	status = "okay";
> +};
> diff --git a/arch/arm64/boot/dts/apple/apple-m1.dtsi b/arch/arm64/boot/dts/apple/apple-m1.dtsi
> new file mode 100644
> index 000000000000..45c87771b057
> --- /dev/null
> +++ b/arch/arm64/boot/dts/apple/apple-m1.dtsi
> @@ -0,0 +1,124 @@
> +// SPDX-License-Identifier: GPL-2.0+ OR MIT
> +/*
> + * Copyright The Asahi Linux Contributors
> + */
> +
> +#include <dt-bindings/interrupt-controller/apple-aic.h>
> +#include <dt-bindings/interrupt-controller/irq.h>
> +
> +/ {
> +	compatible = "apple,m1", "apple,arm-platform";
> +
> +	#address-cells = <2>;
> +	#size-cells = <2>;
> +
> +	cpus {
> +		#address-cells = <2>;
> +		#size-cells = <0>;
> +
> +		cpu0: cpu@0 {
> +			compatible = "apple,icestorm";
> +			device_type = "cpu";
> +			reg = <0x0 0x0>;
> +			enable-method = "spin-table";
> +			cpu-release-addr = <0 0>; /* To be filled by loader */
> +		};

New line after every device node, please.

With this minor changes, fine for me:
Reviewed-by: Krzysztof Kozlowski <krzk@kernel.org>

Best regards,
Krzysztof

Randy Dunlap Feb. 15, 2021, 9 p.m. UTC | #19

On 2/15/21 11:29 AM, Krzysztof Kozlowski wrote:
>> diff --git a/arch/arm64/boot/dts/apple/apple-j274.dts b/arch/arm64/boot/dts/apple/apple-j274.dts
>> new file mode 100644
>> index 000000000000..9a1be91a2cf0
>> --- /dev/null
>> +++ b/arch/arm64/boot/dts/apple/apple-j274.dts
>> @@ -0,0 +1,41 @@
>> +// SPDX-License-Identifier: GPL-2.0+ OR MIT
> () around licenses, so:
> (GPL-2.0+ OR MIT)

Hi,
When/where does that "() around licenses" apply, please?

thanks.

Krzysztof Kozlowski Feb. 16, 2021, 7:31 a.m. UTC | #20

On Mon, Feb 15, 2021 at 01:00:58PM -0800, Randy Dunlap wrote:
> On 2/15/21 11:29 AM, Krzysztof Kozlowski wrote:
> >> diff --git a/arch/arm64/boot/dts/apple/apple-j274.dts b/arch/arm64/boot/dts/apple/apple-j274.dts
> >> new file mode 100644
> >> index 000000000000..9a1be91a2cf0
> >> --- /dev/null
> >> +++ b/arch/arm64/boot/dts/apple/apple-j274.dts
> >> @@ -0,0 +1,41 @@
> >> +// SPDX-License-Identifier: GPL-2.0+ OR MIT
> > () around licenses, so:
> > (GPL-2.0+ OR MIT)
> 
> Hi,
> When/where does that "() around licenses" apply, please?

Hmm, somehow this pattern got into me but now I checked on the spdx.dev
and the preferred syntax is indeed without (). Skip this comment then.

Best regards,
Krzysztof

Arnd Bergmann Feb. 16, 2021, 10:18 a.m. UTC | #21

On Mon, Feb 15, 2021 at 8:19 PM Krzysztof Kozlowski <krzk@kernel.org> wrote:
> On Mon, Feb 15, 2021 at 09:17:11PM +0900, Hector Martin wrote:

> > +
> > +/* Apple S5L */
> > +static int __init apple_s5l_early_console_setup(struct earlycon_device *device,
> > +                                             const char *opt)
> > +{
> > +     /* Close enough to S3C2410 for earlycon... */
> > +     device->port.private_data = &s3c2410_early_console_data;
> > +
> > +#ifdef CONFIG_ARM64
>
> if IS_ENABLED()
> (unless it cannot be used due to missing symbol?)
>
> > +     /* ... but we need to override the existing fixmap entry as nGnRnE */
> > +     __set_fixmap(FIX_EARLYCON_MEM_BASE, device->port.mapbase,
> > +                  __pgprot(PROT_DEVICE_nGnRnE));
> > +#endif

It has to be a preprocessor conditional because PROT_DEVICE_nGnRnE
is only defined on arm64. We could add a FIXMAP_PAGE_NONPOSTED
alias for it that defaults to FIXMAP_PAGE_IO, but in the end this is
really an architecture specific thing and I think leaving it guarded by
the architecture is appropriate.

> > +     return samsung_early_console_setup(device, opt);
>
> Don't you need to handle the error code - set PROT_DEFAULT() or whatever
> was there before?

__set_fixmap() has no return value, it just writes a page table entry and
does not fail.

      Arnd

Krzysztof Kozlowski Feb. 16, 2021, 10:20 a.m. UTC | #22

On Tue, 16 Feb 2021 at 11:19, Arnd Bergmann <arnd@kernel.org> wrote:
> > > +     return samsung_early_console_setup(device, opt);
> >
> > Don't you need to handle the error code - set PROT_DEFAULT() or whatever
> > was there before?
>
> __set_fixmap() has no return value, it just writes a page table entry and
> does not fail.

I meant, handle samsung_early_console_setup() error code (NULL).

Best regards,
Krzysztof

Arnd Bergmann Feb. 16, 2021, 10:29 a.m. UTC | #23

On Tue, Feb 16, 2021 at 11:20 AM Krzysztof Kozlowski <krzk@kernel.org> wrote:
>
> On Tue, 16 Feb 2021 at 11:19, Arnd Bergmann <arnd@kernel.org> wrote:
> > > > +     return samsung_early_console_setup(device, opt);
> > >
> > > Don't you need to handle the error code - set PROT_DEFAULT() or whatever
> > > was there before?
> >
> > __set_fixmap() has no return value, it just writes a page table entry and
> > does not fail.
>
> I meant, handle samsung_early_console_setup() error code (NULL).

Ah, I see.

I don't think it makes a difference -- if ->setup() fails, the page table entry
is just left in place unused, and the type of the unused mapping doesn't
matter. If earlycon tried to unmap the page, the type also would not
change anything.

With earlycon, I'd generally lean towards keeping things as simple as possible,
in order to increase the chance of seeing anything at all. It clearly wouldn't
hurt to try to add minimal error handling here.

       Arnd

Hector Martin Feb. 16, 2021, 10:50 a.m. UTC | #24

On 16/02/2021 19.29, Arnd Bergmann wrote:
> On Tue, Feb 16, 2021 at 11:20 AM Krzysztof Kozlowski <krzk@kernel.org> wrote:
>>
>> On Tue, 16 Feb 2021 at 11:19, Arnd Bergmann <arnd@kernel.org> wrote:
>>>>> +     return samsung_early_console_setup(device, opt);
>>>>
>>>> Don't you need to handle the error code - set PROT_DEFAULT() or whatever
>>>> was there before?
>>>
>>> __set_fixmap() has no return value, it just writes a page table entry and
>>> does not fail.
>>
>> I meant, handle samsung_early_console_setup() error code (NULL).
> 
> Ah, I see.
> 
> I don't think it makes a difference -- if ->setup() fails, the page table entry
> is just left in place unused, and the type of the unused mapping doesn't
> matter. If earlycon tried to unmap the page, the type also would not
> change anything.
> 
> With earlycon, I'd generally lean towards keeping things as simple as possible,
> in order to increase the chance of seeing anything at all. It clearly wouldn't
> hurt to try to add minimal error handling here.

There's no logic to clean this up in earlycon itself anyway, so there's 
no point in trying to do it for the override. If another earlycon driver 
ends up getting instantiated for some reason, it will override the 
mapping with a normal one again.

Christoph Hellwig Feb. 16, 2021, 10:53 a.m. UTC | #25

Your new iomremap variant needs proper documentation explaining the
semantics in detail.

Hector Martin Feb. 16, 2021, 2:33 p.m. UTC | #26

On 16/02/2021 03.23, Tony Lindgren wrote:
> * Hector Martin <marcan@marcan.st> [210215 12:18]:
>> This allows the devicetree to correctly represent the available set of
>> timers, which varies from device to device, without the need for fake
>> dummy interrupts for unavailable slots.
> 
> I like the idea of using interrupt-names property for mapping timers :)
> 
> Similar approach might help other SoCs too. And clocksources never really
> had similar issues.

Yeah, there are some SoCs using dummy IRQs right now for this reason, so 
this should help with those too (though it's too late to introduce it 
now into those DTs without breaking backwards-compat with older kernels, 
sadly).

> With Marc's comments addressed, please feel free to add:
> 
> Reviewed-by: Tony Lindgren <tony@atomide.com>

Thank you!

Mark Rutland Feb. 17, 2021, 11:49 a.m. UTC | #27

Hi Hector,

On Mon, Feb 15, 2021 at 09:16:57PM +0900, Hector Martin wrote:
> From: Marc Zyngier <maz@kernel.org>
> 
> By default, FIQ exceptions trigger a panic. On platforms that need to
> deliver interrupts via FIQ, this gets redirected via an alternative to
> instead handle FIQ the same way as IRQ. It is up to the irqchip handler
> to discriminate between the two.
> 
> Signed-off-by: Marc Zyngier <maz@kernel.org>
> Signed-off-by: Hector Martin <marcan@marcan.st>

Since the use of FIQ is a platform integration detail rather than a CPU
implementation detail (and e.g. can differ across bare-metal and VM),
I'd prefer to always have separate registered handlers for IRQ/FIQ (also
avoiding the need for patching). That way we can explicitly opt-in to
FIQ when required, and avoid edge-cases where an unexpected FIQ could
livelock an unaware IRQ handler.

Marc and I had a quick play with that, and I have a series of patches
I've pushed to:

https://git.kernel.org/pub/scm/linux/kernel/git/mark/linux.git/log/?h=arm64/fiq
git://git.kernel.org/pub/scm/linux/kernel/git/mark/linux.git arm64/fiq

... which I'll post out shortly.

Thanks,
Mark.

> ---
>  arch/arm64/kernel/entry.S | 21 ++++++++++++++++++---
>  1 file changed, 18 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
> index ba5f9aa379ce..bcfd1ac72636 100644
> --- a/arch/arm64/kernel/entry.S
> +++ b/arch/arm64/kernel/entry.S
> @@ -547,18 +547,18 @@ SYM_CODE_START(vectors)
>  
>  	kernel_ventry	1, sync				// Synchronous EL1h
>  	kernel_ventry	1, irq				// IRQ EL1h
> -	kernel_ventry	1, fiq_invalid			// FIQ EL1h
> +	kernel_ventry	1, fiq				// FIQ EL1h
>  	kernel_ventry	1, error			// Error EL1h
>  
>  	kernel_ventry	0, sync				// Synchronous 64-bit EL0
>  	kernel_ventry	0, irq				// IRQ 64-bit EL0
> -	kernel_ventry	0, fiq_invalid			// FIQ 64-bit EL0
> +	kernel_ventry	0, fiq				// FIQ 64-bit EL0
>  	kernel_ventry	0, error			// Error 64-bit EL0
>  
>  #ifdef CONFIG_COMPAT
>  	kernel_ventry	0, sync_compat, 32		// Synchronous 32-bit EL0
>  	kernel_ventry	0, irq_compat, 32		// IRQ 32-bit EL0
> -	kernel_ventry	0, fiq_invalid_compat, 32	// FIQ 32-bit EL0
> +	kernel_ventry	0, fiq_compat, 32		// FIQ 32-bit EL0
>  	kernel_ventry	0, error_compat, 32		// Error 32-bit EL0
>  #else
>  	kernel_ventry	0, sync_invalid, 32		// Synchronous 32-bit EL0
> @@ -658,6 +658,10 @@ SYM_CODE_START_LOCAL_NOALIGN(el1_sync)
>  SYM_CODE_END(el1_sync)
>  
>  	.align	6
> +SYM_CODE_START_LOCAL_NOALIGN(el1_fiq)
> +alternative_if_not ARM64_NEEDS_FIQ
> +	b	el1_fiq_invalid
> +alternative_else_nop_endif
>  SYM_CODE_START_LOCAL_NOALIGN(el1_irq)
>  	kernel_entry 1
>  	gic_prio_irq_setup pmr=x20, tmp=x1
> @@ -688,6 +692,7 @@ alternative_else_nop_endif
>  
>  	kernel_exit 1
>  SYM_CODE_END(el1_irq)
> +SYM_CODE_END(el1_fiq)
>  
>  /*
>   * EL0 mode handlers.
> @@ -710,10 +715,15 @@ SYM_CODE_START_LOCAL_NOALIGN(el0_sync_compat)
>  SYM_CODE_END(el0_sync_compat)
>  
>  	.align	6
> +SYM_CODE_START_LOCAL_NOALIGN(el0_fiq_compat)
> +alternative_if_not ARM64_NEEDS_FIQ
> +	b	el0_fiq_invalid_compat
> +alternative_else_nop_endif
>  SYM_CODE_START_LOCAL_NOALIGN(el0_irq_compat)
>  	kernel_entry 0, 32
>  	b	el0_irq_naked
>  SYM_CODE_END(el0_irq_compat)
> +SYM_CODE_END(el0_fiq_compat)
>  
>  SYM_CODE_START_LOCAL_NOALIGN(el0_error_compat)
>  	kernel_entry 0, 32
> @@ -722,6 +732,10 @@ SYM_CODE_END(el0_error_compat)
>  #endif
>  
>  	.align	6
> +SYM_CODE_START_LOCAL_NOALIGN(el0_fiq)
> +alternative_if_not ARM64_NEEDS_FIQ
> +	b	el0_fiq_invalid
> +alternative_else_nop_endif
>  SYM_CODE_START_LOCAL_NOALIGN(el0_irq)
>  	kernel_entry 0
>  el0_irq_naked:
> @@ -736,6 +750,7 @@ el0_irq_naked:
>  
>  	b	ret_to_user
>  SYM_CODE_END(el0_irq)
> +SYM_CODE_END(el0_fiq)
>  
>  SYM_CODE_START_LOCAL(el1_error)
>  	kernel_entry 1
> -- 
> 2.30.0
>

Mark Rutland Feb. 17, 2021, 12:22 p.m. UTC | #28

Hi Hector,

On Mon, Feb 15, 2021 at 09:16:56PM +0900, Hector Martin wrote:
> Apple SoCs (A11 and newer) have some interrupt sources hardwired to the
> FIQ line. We implement support for this by simply treating IRQs and FIQs
> the same way in the interrupt vectors.
> 
> To support these systems, the FIQ mask bit needs to be kept in sync with
> the IRQ mask bit, so both kinds of exceptions are masked together. No
> other platforms should be delivering FIQ exceptions right now, and we
> already unmask FIQ in normal process context, so this should not have an
> effect on other systems - if spurious FIQs were arriving, they would
> already panic the kernel.

Keeping these in sync sounds fine to me, FWIW.

> Root irqchip drivers can discriminate between IRQs and FIQs by checking
> the ISR_EL1 system register.

I think we can remove this note for now. If we go with seperate handlers
this won't be necessary, and if not this would be better placed on a
commit adding the FIQ handling capability.

> Signed-off-by: Hector Martin <marcan@marcan.st>
> ---
>  arch/arm64/include/asm/assembler.h |  6 +++---
>  arch/arm64/include/asm/daifflags.h |  4 ++--
>  arch/arm64/include/asm/irqflags.h  | 19 +++++++++++--------
>  arch/arm64/kernel/entry.S          |  6 +++---
>  4 files changed, 19 insertions(+), 16 deletions(-)

Judging by `git grep -Wi daif -- arch/arm64`, with this patch applied,
we'll also need fixups in:

* gic_arch_enable_irqs() in arch/arm64/include/asm/arch_gicv3.h
* save_and_disable_irq() in arch/arm64/include/asm/assembler.h (noted below)
* local_daif_save_flags() in arch/arm64/include/asm/daifflags.h
  (the fake DAIF should have F set too)
* __cpu_do_idle_irqprio() in arch/arm64/kernel/process.c

> diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h
> index bf125c591116..ac4c823bf2b6 100644
> --- a/arch/arm64/include/asm/assembler.h
> +++ b/arch/arm64/include/asm/assembler.h
> @@ -40,9 +40,9 @@
>  	msr	daif, \flags
>  	.endm
>  
> -	/* IRQ is the lowest priority flag, unconditionally unmask the rest. */
> -	.macro enable_da_f
> -	msr	daifclr, #(8 | 4 | 1)
> +	/* IRQ/FIQ are the lowest priority flags, unconditionally unmask the rest. */
> +	.macro enable_da
> +	msr	daifclr, #(8 | 4)
>  	.endm

I think save_and_diable_irq below needs to be updated too, since it
only sets DAIF.I and leaves DAIF.F as-is.

[...]

> diff --git a/arch/arm64/include/asm/irqflags.h b/arch/arm64/include/asm/irqflags.h
> index ff328e5bbb75..125201dced5f 100644
> --- a/arch/arm64/include/asm/irqflags.h
> +++ b/arch/arm64/include/asm/irqflags.h
> @@ -12,15 +12,18 @@
>  
>  /*
>   * Aarch64 has flags for masking: Debug, Asynchronous (serror), Interrupts and
> - * FIQ exceptions, in the 'daif' register. We mask and unmask them in 'dai'
> + * FIQ exceptions, in the 'daif' register. We mask and unmask them in 'daif'
>   * order:
>   * Masking debug exceptions causes all other exceptions to be masked too/
> - * Masking SError masks irq, but not debug exceptions. Masking irqs has no
> - * side effects for other flags. Keeping to this order makes it easier for
> - * entry.S to know which exceptions should be unmasked.
> + * Masking SError masks IRQ/FIQ, but not debug exceptions. IRQ and FIQ are
> + * always masked and unmasked together, and have no side effects for other
> + * flags. Keeping to this order makes it easier for entry.S to know which
> + * exceptions should be unmasked.
>   *

This sounds good.

> - * FIQ is never expected, but we mask it when we disable debug exceptions, and
> - * unmask it at all other times.
> + * FIQ is never expected on most platforms, but we keep it synchronized
> + * with the IRQ mask status. On platforms that do not expect FIQ, that vector
> + * triggers a kernel panic. On platforms that do, the FIQ vector is unified
> + * with the IRQ vector.
>   */

Can we please delete this bit, though? Now that we say IRQ and FIQ are
masked/unmasked together, I don't think the rest is necessary to
understand the masking logic, and it's one less thing to keep in sync
with changes to the entry code.

Otherwise this looks good to me.

Thanks,
Mark.

Marc Zyngier Feb. 17, 2021, 2:38 p.m. UTC | #29

On Wed, 17 Feb 2021 11:49:23 +0000,
Mark Rutland <mark.rutland@arm.com> wrote:
> 
> Hi Hector,
> 
> On Mon, Feb 15, 2021 at 09:16:57PM +0900, Hector Martin wrote:
> > From: Marc Zyngier <maz@kernel.org>
> > 
> > By default, FIQ exceptions trigger a panic. On platforms that need to
> > deliver interrupts via FIQ, this gets redirected via an alternative to
> > instead handle FIQ the same way as IRQ. It is up to the irqchip handler
> > to discriminate between the two.
> > 
> > Signed-off-by: Marc Zyngier <maz@kernel.org>
> > Signed-off-by: Hector Martin <marcan@marcan.st>
> 
> Since the use of FIQ is a platform integration detail rather than a CPU
> implementation detail (and e.g. can differ across bare-metal and VM),
> I'd prefer to always have separate registered handlers for IRQ/FIQ (also
> avoiding the need for patching). That way we can explicitly opt-in to
> FIQ when required, and avoid edge-cases where an unexpected FIQ could
> livelock an unaware IRQ handler.
> 
> Marc and I had a quick play with that, and I have a series of patches
> I've pushed to:
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/mark/linux.git/log/?h=arm64/fiq
> git://git.kernel.org/pub/scm/linux/kernel/git/mark/linux.git arm64/fiq
> 
> ... which I'll post out shortly.

FWIW, I've just posted a more complete version of the first patch in
this series[1], which you may want to use directly (though I plan to
take it as a fix for 5.12).

Thanks,

	M.

[1] https://lore.kernel.org/r/20210217142800.2547737-1-maz@kernel.org

Hector Martin Feb. 18, 2021, 12:51 p.m. UTC | #30

On 17/02/2021 21.22, Mark Rutland wrote:
>> Root irqchip drivers can discriminate between IRQs and FIQs by checking
>> the ISR_EL1 system register.
> 
> I think we can remove this note for now. If we go with seperate handlers
> this won't be necessary, and if not this would be better placed on a
> commit adding the FIQ handling capability.

Indeed, this doesn't make sense any more. Changed for v3.

> Judging by `git grep -Wi daif -- arch/arm64`, with this patch applied,
> we'll also need fixups in:
> 
> * gic_arch_enable_irqs() in arch/arm64/include/asm/arch_gicv3.h
> * save_and_disable_irq() in arch/arm64/include/asm/assembler.h (noted below)
> * local_daif_save_flags() in arch/arm64/include/asm/daifflags.h
>    (the fake DAIF should have F set too)
> * __cpu_do_idle_irqprio() in arch/arm64/kernel/process.c

Good catches. A few of those are irrelevant for M1 but need to be done 
now that we're making this change globally, others I just missed from 
the beginning.

There's also an incorrect comment in entry.S:

	/*
	 * DA_F were cleared at start of handling. If anything is set in
	 * DAIF, we come back from an NMI, so skip preemption
	 */
	mrs	x0, daif
	orr	x24, x24, x0

Now only DA__ are cleared. This actually pairs with 
gic_arch_enable_irqs() and begs the question: in priority masking 
systems, do we unmask both IRQ and FIQ (the gic_arch_enable_irqs 
change), or do we leave FIQ masked (which instead would need an AND in 
that part of entry.S so as to not consider FIQ masked as meaning we're 
coming back from an NMI)?

And a minor related one: should init_gic_priority_masking() WARN if FIQ 
is masked too? This probably goes with the above.

Either way, this was nontrivial to make sense of, so I'll make that 
entry.S comment clearer while I'm touching it.

> I think save_and_diable_irq below needs to be updated too, since it
> only sets DAIF.I and leaves DAIF.F as-is.

Totally missed this one! Fixed for v3.

>> - * FIQ is never expected, but we mask it when we disable debug exceptions, and
>> - * unmask it at all other times.
>> + * FIQ is never expected on most platforms, but we keep it synchronized
>> + * with the IRQ mask status. On platforms that do not expect FIQ, that vector
>> + * triggers a kernel panic. On platforms that do, the FIQ vector is unified
>> + * with the IRQ vector.
>>    */
> 
> Can we please delete this bit, though? Now that we say IRQ and FIQ are
> masked/unmasked together, I don't think the rest is necessary to
> understand the masking logic, and it's one less thing to keep in sync
> with changes to the entry code.

Gone :)

Hector Martin Feb. 18, 2021, 1:08 p.m. UTC | #31

On 16/02/2021 19.53, Christoph Hellwig wrote:
> Your new iomremap variant needs proper documentation explaining the
> semantics in detail.

What's the right place for this? I haven't found any generic ioremap 
documentation page in the tree yet. I suppose I could write an 
ARM64-specific blurb on IO mapping modes, much like x86 has one...

Hector Martin Feb. 18, 2021, 1:24 p.m. UTC | #32

On 16/02/2021 03.06, Krzysztof Kozlowski wrote:
> On Mon, Feb 15, 2021 at 09:17:05PM +0900, Hector Martin wrote:
>> +static void s3c64xx_serial_shutdown(struct uart_port *port)
>> +{
>> +	struct s3c24xx_uart_port *ourport = to_ourport(port);
>> +
>> +	free_irq(port->irq, ourport);
>> +
>> +	wr_regl(port, S3C64XX_UINTP, 0xf);
>> +	wr_regl(port, S3C64XX_UINTM, 0xf);
>> +
>> +	ourport->tx_enabled = 0;
>> +	ourport->tx_mode = 0;
>> +	ourport->rx_enabled = 0;
> 
> For S3C64xx type this is not equivalent: the assignments were
> happening before free_irq() and wr_regl(). Honestly I don't know whether
> it matters (except some barriers coming from these functions) but please
> do not change the order of code in this patch. If needed, the
> re-ordering should be a patch on its own. With explanation why.

Honestly, I think if anything the masking should happen first (to make 
sure no IRQs go off), but at this point it's probably better to play it 
safe and not introduce any logic changes, so I've moved the assignments 
first to retain the old behavior.

> Make the s3c24xx_serial_ops const as well.

Done for v3, thanks.

Hector Martin Feb. 18, 2021, 1:37 p.m. UTC | #33

On 16/02/2021 03.26, Krzysztof Kozlowski wrote:
>> This removes s3c24xx_serial_has_interrupt_mask, which was just checking
>> for a specific type anyway, and adds the ucon_mask port info member to
>> avoid having S3C2440 as a distinct type.
> 
> Please split setting the ucon_mask to separate patch. It's a nice
> code simplification on its own.

Done for v3.

>>   	/* Unmask Tx interrupt */
>> -	if (s3c24xx_serial_has_interrupt_mask(port))
>> -		s3c24xx_clear_bit(port, S3C64XX_UINTM_TXD,
>> -				  S3C64XX_UINTM);
>> -	else
>> +	switch (ourport->info->type) {
>> +	case TYPE_S3C6400:
>> +		s3c24xx_clear_bit(port, S3C64XX_UINTM_TXD, S3C64XX_UINTM);
> 
> Please do not re-wrap. It makes reviewing more difficult. You can
> perform proper re-wrapping as a separate cleanup patch.

>>   	if (ourport->rx_enabled) {
>>   		dev_dbg(port->dev, "stopping rx\n");
>> -		if (s3c24xx_serial_has_interrupt_mask(port))
>> -			s3c24xx_set_bit(port, S3C64XX_UINTM_RXD,
>> -					S3C64XX_UINTM);
> 
> The same.

Reverted those two lines for v3.

Hector Martin Feb. 18, 2021, 1:53 p.m. UTC | #34

On 16/02/2021 03.40, Krzysztof Kozlowski wrote:
> On Mon, Feb 15, 2021 at 09:17:07PM +0900, Hector Martin wrote:
>> * Split out s3c24xx_serial_tx_chars from s3c24xx_serial_tx_irq,
>>    where only the latter acquires the port lock.
> 
> I miss here information why you do all this.

Added an explanation for v3. This is used by a subsequent patch in the 
series.

>>
>> * For S3C64xx, return IRQ_NONE if no flag bits were set, so the
>>    interrupt core can detect IRQ storms. Note that both IRQ handlers
>>    always return IRQ_HANDLED anyway, so 'or' logic isn't necessary here,
>>    if either handler ran we are always going to return IRQ_HANDLED.
> 
> It looks like separate patch. Your patches should do only one thing at
> once. The fact that you have here three bullet points is a warning
> sign. This is even more important if you do some refactorings and
> cleanups which should not affect functionality. Hiding there changes
> which could affect functionality is a no-go.

I've reverted this one. I don't think it should affect functionality, 
but I don't have any way to test on these devices, so I'll leave it to 
someone else to be safe :)

>>
>> * Rename s3c24xx_serial_rx_chars to s3c24xx_serial_rx_irq for
>>    consistency with the above. All it does now is call two other
>>    functions anyway.
> 
> Separate patch for trivial renaming.

I think it makes sense to do this rename together with the first change 
above, as it keeps both functions symmetric. Otherwise we end up with an 
inconsistent function naming between both patches. If you really want it 
separate though, I can do that.

> 
>>
>> Signed-off-by: Hector Martin <marcan@marcan.st>
>> ---
>>   drivers/tty/serial/samsung_tty.c | 41 +++++++++++++++++++-------------
>>   1 file changed, 24 insertions(+), 17 deletions(-)
>>
>> diff --git a/drivers/tty/serial/samsung_tty.c b/drivers/tty/serial/samsung_tty.c
>> index 21955be680a4..821cd0e4f870 100644
>> --- a/drivers/tty/serial/samsung_tty.c
>> +++ b/drivers/tty/serial/samsung_tty.c
>> @@ -151,6 +151,9 @@ struct s3c24xx_uart_port {
>>   #endif
>>   };
>>   
>> +static void s3c24xx_serial_start_next_tx(struct s3c24xx_uart_port *ourport);
>> +static void s3c24xx_serial_tx_chars(struct s3c24xx_uart_port *ourport);
>> +
>>   /* conversion functions */
>>   
>>   #define s3c24xx_dev_to_port(__dev) dev_get_drvdata(__dev)
>> @@ -316,8 +319,6 @@ static void s3c24xx_serial_stop_tx(struct uart_port *port)
>>   	ourport->tx_mode = 0;
>>   }
>>   
>> -static void s3c24xx_serial_start_next_tx(struct s3c24xx_uart_port *ourport);
>> -
> 
> Why moving this? Why adding s3c24xx_serial_tx_chars() forward
> declaration?

This should've gone in the next patch. A previous reviewer told me to 
put declarations at the top of the file, so I put it there and moved 
this one along with it, but I'll keep it to the additon only for v3.

Hector Martin Feb. 18, 2021, 2:01 p.m. UTC | #35

On 16/02/2021 03.51, Krzysztof Kozlowski wrote:
>> Also fix a bug checking the return value, which should use IS_ERR().
> 
> No, no, no. We never, never combine fixing bugs with some rework.
> However devm_ioremap() returns NULL so where is the error?

Sorry, this was a commit message mistake. The code is correct and so is 
the patch: just the NULL check is correct for the previous variant and 
IS_ERR is correct for devm_ioremap_resource. I confused myself while 
writing the commit message after the fact.

> Did you test your patches on existing platforms? If not, please mark all
> of them as RFT on next submission, so Greg does not pick them too fast.

I unfortunately don't have any Exynos devices where I could test the 
code (I have a couple but no serial connections, and I have no idea if 
mailine would run on them). I'll mark v3 as RFT.

Hector Martin Feb. 18, 2021, 2:16 p.m. UTC | #36

On 16/02/2021 04.13, Krzysztof Kozlowski wrote:
> On Mon, Feb 15, 2021 at 09:17:10PM +0900, Hector Martin wrote:
>> @@ -389,10 +396,12 @@ static void enable_tx_pio(struct s3c24xx_uart_port *ourport)
>>   	ucon = rd_regl(port, S3C2410_UCON);
>>   	ucon &= ~(S3C64XX_UCON_TXMODE_MASK);
>>   	ucon |= S3C64XX_UCON_TXMODE_CPU;
>> -	wr_regl(port,  S3C2410_UCON, ucon);
>>   
>>   	/* Unmask Tx interrupt */
>>   	switch (ourport->info->type) {
>> +	case TYPE_APPLE_S5L:
>> +		ucon |= APPLE_S5L_UCON_TXTHRESH_ENA_MSK;
>> +		break;
>>   	case TYPE_S3C6400:
>>   		s3c24xx_clear_bit(port, S3C64XX_UINTM_TXD, S3C64XX_UINTM);
>>   		break;
>> @@ -401,7 +410,16 @@ static void enable_tx_pio(struct s3c24xx_uart_port *ourport)
>>   		break;
>>   	}
>>   
>> +	wr_regl(port,  S3C2410_UCON, ucon);
> 
> You are now configuring the PIO mode after unmasking interrupt. I don't
> think it's a good idea to change the order... and if it were, it
> would deserve a separate patch.

For v3 I moved the wr_regl back and just write it again in the 
TYPE_APPLE_S5L branch; that way, setting the PIO mode and unmasking the 
interrupt are two discrete operations on S5L, like they are on other types.

>>   	/* Keep all interrupts masked and cleared */
>>   	switch (ourport->info->type) {
>> +	case TYPE_APPLE_S5L: {
> 
> Usually you put TYPE_APPLE at the end of switch, so please keep it
> consistent. Can be first or last - just everywhere the same, unless you
> have a fall-through on purpose.

Good point, thanks, moved it for v3. It was actually inconsistent in 
more places, I made all the orders the same (the enum order, and 
default: always goes at the end).

>> @@ -2179,6 +2329,32 @@ static int s3c24xx_serial_resume_noirq(struct device *dev)
>>   	if (port) {
>>   		/* restore IRQ mask */
>>   		switch (ourport->info->type) {
>> +		case TYPE_APPLE_S5L: {
>> +			unsigned int ucon;
>> +
>> +			clk_prepare_enable(ourport->clk);
>> +			if (!IS_ERR(ourport->baudclk))
>> +				clk_prepare_enable(ourport->baudclk);
> 
> We should start checking the return values of clk operations. I know
> that existing code does it only in few places, so basically you are not
> making it worse...

Added error checking for these for v3, thanks.

>> +#define S5L_SERIAL_DRV_DATA ((kernel_ulong_t)&s5l_serial_drv_data)
>> +#else
>> +#define S5L_SERIAL_DRV_DATA ((kernel_ulong_t)NULL)
>> +#endif
>> +
>> +
> 
> Only one line break.

Fixed in v3.

Thank you for the reviews!

Mark Rutland Feb. 18, 2021, 2:22 p.m. UTC | #37

On Thu, Feb 18, 2021 at 09:51:40PM +0900, Hector Martin wrote:
> On 17/02/2021 21.22, Mark Rutland wrote:
> > > Root irqchip drivers can discriminate between IRQs and FIQs by checking
> > > the ISR_EL1 system register.
> > 
> > I think we can remove this note for now. If we go with seperate handlers
> > this won't be necessary, and if not this would be better placed on a
> > commit adding the FIQ handling capability.
> 
> Indeed, this doesn't make sense any more. Changed for v3.
> 
> > Judging by `git grep -Wi daif -- arch/arm64`, with this patch applied,
> > we'll also need fixups in:
> > 
> > * gic_arch_enable_irqs() in arch/arm64/include/asm/arch_gicv3.h
> > * save_and_disable_irq() in arch/arm64/include/asm/assembler.h (noted below)
> > * local_daif_save_flags() in arch/arm64/include/asm/daifflags.h
> >    (the fake DAIF should have F set too)
> > * __cpu_do_idle_irqprio() in arch/arm64/kernel/process.c
> 
> Good catches. A few of those are irrelevant for M1 but need to be done now
> that we're making this change globally, others I just missed from the
> beginning.

Sure; my general view is that we should aim for consistency, and should
ensure that DAIF.F==DAIF.I at all times on all platforms unless we have
a strong reason to violate that rule. That generally makes it easier to
reason about the code and avoid accidentally breaking M1/non-M1 if/when
we refactor masking logic.

> There's also an incorrect comment in entry.S:
> 
> 	/*
> 	 * DA_F were cleared at start of handling. If anything is set in
> 	 * DAIF, we come back from an NMI, so skip preemption
> 	 */
> 	mrs	x0, daif
> 	orr	x24, x24, x0
> 
> Now only DA__ are cleared. This actually pairs with gic_arch_enable_irqs()
> and begs the question: in priority masking systems, do we unmask both IRQ
> and FIQ (the gic_arch_enable_irqs change), or do we leave FIQ masked (which
> instead would need an AND in that part of entry.S so as to not consider FIQ
> masked as meaning we're coming back from an NMI)?

I think that for consistency we always want to keep IRQ and FIQ in-sync,
even when using GIC priorities. So when handling a pseudo-NMI we should
unmask DAIF.DA and leave DAIF.IF masked.

> And a minor related one: should init_gic_priority_masking() WARN if FIQ is
> masked too? This probably goes with the above.

I think it should, yes.

> Either way, this was nontrivial to make sense of, so I'll make that entry.S
> comment clearer while I'm touching it.

Sounds good; thanks!

> > I think save_and_diable_irq below needs to be updated too, since it
> > only sets DAIF.I and leaves DAIF.F as-is.
> 
> Totally missed this one! Fixed for v3.
> 
> > > - * FIQ is never expected, but we mask it when we disable debug exceptions, and
> > > - * unmask it at all other times.
> > > + * FIQ is never expected on most platforms, but we keep it synchronized
> > > + * with the IRQ mask status. On platforms that do not expect FIQ, that vector
> > > + * triggers a kernel panic. On platforms that do, the FIQ vector is unified
> > > + * with the IRQ vector.
> > >    */
> > 
> > Can we please delete this bit, though? Now that we say IRQ and FIQ are
> > masked/unmasked together, I don't think the rest is necessary to
> > understand the masking logic, and it's one less thing to keep in sync
> > with changes to the entry code.
> 
> Gone :)

Thanks,
Mark.

Mark Rutland Feb. 18, 2021, 2:36 p.m. UTC | #38

On Mon, Feb 15, 2021 at 09:16:48PM +0900, Hector Martin wrote:
> This series brings up initial support for the Apple M1 SoC, used in the
> 2020 Mac Mini, MacBook Pro, and MacBook Air models.
> 
> The following features are supported in this initial port:
> 
> - UART (samsung-style) with earlycon support
> - Interrupts, including affinity and IPIs (Apple Interrupt Controller)
> - SMP (through standard spin-table support)
> - simplefb-based framebuffer
> - Devicetree for the Mac Mini (should work for the others too at this
>   stage)

IIUC, the CPUs in these parts have some IMP-DEF instructions that can be
used at EL0 which might have some IMP-DEF state. Our general expectation
is that FW should configure such things to trap, but I don't know
whether the M1 FW does that, and I fear that this will end up being a
problem for us -- even if that doesn't affect EL1/EL2, IMP-DEF state is
an interesting covert channel between EL0 tasks, and not generally safe
to use thanks to context-switch and idle, so I'd like to make sure we
can catch usage and make it SIGILL.

Do you happen to know whether all of that is configured to trap, and if
not, is it possible to adjust the bootloader to ensure it is?

Thanks,
Mark.

Hector Martin Feb. 18, 2021, 2:42 p.m. UTC | #39

On 18/02/2021 23.22, Mark Rutland wrote:
> I think that for consistency we always want to keep IRQ and FIQ in-sync,
> even when using GIC priorities. So when handling a pseudo-NMI we should
> unmask DAIF.DA and leave DAIF.IF masked.

In that case there's one more, in daifflags.h:local_daif_restore():

			/*
			 * If interrupts are disabled but we can take
			 * asynchronous errors, we can take NMIs
			 */
			flags &= PSR_I_BIT;
			pmr = GIC_PRIO_IRQOFF;

>> And a minor related one: should init_gic_priority_masking() WARN if FIQ is
>> masked too? This probably goes with the above.
> 
> I think it should, yes.

Done for v3 then. Thanks!

Mark Rutland Feb. 18, 2021, 3:26 p.m. UTC | #40

On Thu, Feb 18, 2021 at 11:42:01PM +0900, Hector Martin wrote:
> On 18/02/2021 23.22, Mark Rutland wrote:
> > I think that for consistency we always want to keep IRQ and FIQ in-sync,
> > even when using GIC priorities. So when handling a pseudo-NMI we should
> > unmask DAIF.DA and leave DAIF.IF masked.
> 
> In that case there's one more, in daifflags.h:local_daif_restore():
> 
> 			/*
> 			 * If interrupts are disabled but we can take
> 			 * asynchronous errors, we can take NMIs
> 			 */
> 			flags &= PSR_I_BIT;
> 			pmr = GIC_PRIO_IRQOFF;

Good spot, yes!

I did a quick scan with `git grep 'PSR_[IF]_BIT' -- arch/arm64`, and
AFAICT that's the last one.

> > > And a minor related one: should init_gic_priority_masking() WARN if FIQ is
> > > masked too? This probably goes with the above.
> > 
> > I think it should, yes.
> 
> Done for v3 then. Thanks!

Cool!

Mark.

Krzysztof Kozlowski Feb. 20, 2021, 7:11 p.m. UTC | #41

On Thu, Feb 18, 2021 at 10:53:10PM +0900, Hector Martin wrote:
> > Separate patch for trivial renaming.
> 
> I think it makes sense to do this rename together with the first change
> above, as it keeps both functions symmetric. Otherwise we end up with an
> inconsistent function naming between both patches. If you really want it
> separate though, I can do that.

OK, keep your version.

> 
> > 
> > > 
> > > Signed-off-by: Hector Martin <marcan@marcan.st>
> > > ---
> > >   drivers/tty/serial/samsung_tty.c | 41 +++++++++++++++++++-------------
> > >   1 file changed, 24 insertions(+), 17 deletions(-)
> > > 
> > > diff --git a/drivers/tty/serial/samsung_tty.c b/drivers/tty/serial/samsung_tty.c
> > > index 21955be680a4..821cd0e4f870 100644
> > > --- a/drivers/tty/serial/samsung_tty.c
> > > +++ b/drivers/tty/serial/samsung_tty.c
> > > @@ -151,6 +151,9 @@ struct s3c24xx_uart_port {
> > >   #endif
> > >   };
> > > +static void s3c24xx_serial_start_next_tx(struct s3c24xx_uart_port *ourport);
> > > +static void s3c24xx_serial_tx_chars(struct s3c24xx_uart_port *ourport);
> > > +
> > >   /* conversion functions */
> > >   #define s3c24xx_dev_to_port(__dev) dev_get_drvdata(__dev)
> > > @@ -316,8 +319,6 @@ static void s3c24xx_serial_stop_tx(struct uart_port *port)
> > >   	ourport->tx_mode = 0;
> > >   }
> > > -static void s3c24xx_serial_start_next_tx(struct s3c24xx_uart_port *ourport);
> > > -
> > 
> > Why moving this? Why adding s3c24xx_serial_tx_chars() forward
> > declaration?
> 
> This should've gone in the next patch. A previous reviewer told me to put
> declarations at the top of the file, so I put it there and moved this one
> along with it, but I'll keep it to the additon only for v3.

Maybe I missed something in the context but it looked like
forward declaration s3c24xx_serial_tx_chars() was not needed? In such
case no need to move it.

Best regards,
Krzysztof

Krzysztof Kozlowski Feb. 20, 2021, 7:13 p.m. UTC | #42

On Thu, Feb 18, 2021 at 11:01:21PM +0900, Hector Martin wrote:
> On 16/02/2021 03.51, Krzysztof Kozlowski wrote:
> > > Also fix a bug checking the return value, which should use IS_ERR().
> > 
> > No, no, no. We never, never combine fixing bugs with some rework.
> > However devm_ioremap() returns NULL so where is the error?
> 
> Sorry, this was a commit message mistake. The code is correct and so is the
> patch: just the NULL check is correct for the previous variant and IS_ERR is
> correct for devm_ioremap_resource. I confused myself while writing the
> commit message after the fact.
> 
> > Did you test your patches on existing platforms? If not, please mark all
> > of them as RFT on next submission, so Greg does not pick them too fast.
> 
> I unfortunately don't have any Exynos devices where I could test the code (I
> have a couple but no serial connections, and I have no idea if mailine would
> run on them). I'll mark v3 as RFT.

If you have one of Odroid boards with Exynos, then you can nicely test
Exynos. Others - depends, on board.
Anyway I can test them for you. I just want to be sure that Greg waits
for this testing.

Best regards,
Krzysztof

Marc Zyngier Feb. 20, 2021, 7:17 p.m. UTC | #43

On 2021-02-20 19:13, Krzysztof Kozlowski wrote:
> On Thu, Feb 18, 2021 at 11:01:21PM +0900, Hector Martin wrote:
>> On 16/02/2021 03.51, Krzysztof Kozlowski wrote:
>> > > Also fix a bug checking the return value, which should use IS_ERR().
>> >
>> > No, no, no. We never, never combine fixing bugs with some rework.
>> > However devm_ioremap() returns NULL so where is the error?
>> 
>> Sorry, this was a commit message mistake. The code is correct and so 
>> is the
>> patch: just the NULL check is correct for the previous variant and 
>> IS_ERR is
>> correct for devm_ioremap_resource. I confused myself while writing the
>> commit message after the fact.
>> 
>> > Did you test your patches on existing platforms? If not, please mark all
>> > of them as RFT on next submission, so Greg does not pick them too fast.
>> 
>> I unfortunately don't have any Exynos devices where I could test the 
>> code (I
>> have a couple but no serial connections, and I have no idea if mailine 
>> would
>> run on them). I'll mark v3 as RFT.
> 
> If you have one of Odroid boards with Exynos, then you can nicely test
> Exynos. Others - depends, on board.
> Anyway I can test them for you. I just want to be sure that Greg waits
> for this testing.

Worse case, QEMU has some Exynos4210 emulation that is usable.

Thanks,

         M.

Hector Martin Feb. 21, 2021, 1:43 p.m. UTC | #44

On 21/02/2021 04.11, Krzysztof Kozlowski wrote:
> On Thu, Feb 18, 2021 at 10:53:10PM +0900, Hector Martin wrote:
>> This should've gone in the next patch. A previous reviewer told me to put
>> declarations at the top of the file, so I put it there and moved this one
>> along with it, but I'll keep it to the additon only for v3.
> 
> Maybe I missed something in the context but it looked like
> forward declaration s3c24xx_serial_tx_chars() was not needed? In such
> case no need to move it.

It's needed in patch #22 in this series; having it in this patch was a 
mistake I made while splitting up the changes. I have moved that line to 
the Apple support patch for v3.

Hector Martin Feb. 21, 2021, 2:38 p.m. UTC | #45

On 21/02/2021 04.17, Marc Zyngier wrote:
> On 2021-02-20 19:13, Krzysztof Kozlowski wrote:
>> On Thu, Feb 18, 2021 at 11:01:21PM +0900, Hector Martin wrote:
>>> On 16/02/2021 03.51, Krzysztof Kozlowski wrote:
>>>>> Also fix a bug checking the return value, which should use IS_ERR().
>>>>
>>>> No, no, no. We never, never combine fixing bugs with some rework.
>>>> However devm_ioremap() returns NULL so where is the error?
>>>
>>> Sorry, this was a commit message mistake. The code is correct and so
>>> is the
>>> patch: just the NULL check is correct for the previous variant and
>>> IS_ERR is
>>> correct for devm_ioremap_resource. I confused myself while writing the
>>> commit message after the fact.
>>>
>>>> Did you test your patches on existing platforms? If not, please mark all
>>>> of them as RFT on next submission, so Greg does not pick them too fast.
>>>
>>> I unfortunately don't have any Exynos devices where I could test the
>>> code (I
>>> have a couple but no serial connections, and I have no idea if mailine
>>> would
>>> run on them). I'll mark v3 as RFT.
>>
>> If you have one of Odroid boards with Exynos, then you can nicely test
>> Exynos. Others - depends, on board.
>> Anyway I can test them for you. I just want to be sure that Greg waits
>> for this testing.
> 
> Worse case, QEMU has some Exynos4210 emulation that is usable.

That's a good point; better than nothing, certainly.

Does anyone have a known good example of booting an exynos kernel under 
qemu? I tried building a plain 5.11 arm exynos_defconfig and booting it, 
but without much luck:

$ qemu-system-arm -kernel arch/arm/boot/zImage -append 
"console=ttySAC0,115200n8 debug" -dtb 
arch/arm/boot/dts/exynos4210-universal_c210.dtb -nographic -serial 
mon:stdio -M smdkc210 -smp 2

(I also tried without the -dtb option, in case qemu provides something 
usable)

Of course I'll still mark v3 as RFT, I just thought I might as well try 
qemu.

Hector Martin Feb. 21, 2021, 2:43 p.m. UTC | #46

On 16/02/2021 04.29, Krzysztof Kozlowski wrote:
> On Mon, Feb 15, 2021 at 09:17:13PM +0900, Hector Martin wrote:
>> +	memory@800000000 {
>> +		device_type = "memory";
>> +		reg = <0 0 0 0>; /* To be filled by loader */
> 
> dtc and dtschema might complain, so could you set here fake memory
> address 0x800000000? Would that work for your bootloader?

Yeah, the bootloader just replaces the entire property anyway. I'll fill 
in some dummy values (the real usable memory range is to some extent 
dynamic and depends on firmware).

>> +	};
>> +};
>> +
>> +&serial0 {
>> +	status = "okay";
>> +};
>> diff --git a/arch/arm64/boot/dts/apple/apple-m1.dtsi b/arch/arm64/boot/dts/apple/apple-m1.dtsi
>> new file mode 100644
>> index 000000000000..45c87771b057
>> --- /dev/null
>> +++ b/arch/arm64/boot/dts/apple/apple-m1.dtsi
>> @@ -0,0 +1,124 @@
>> +// SPDX-License-Identifier: GPL-2.0+ OR MIT
>> +/*
>> + * Copyright The Asahi Linux Contributors
>> + */
>> +
>> +#include <dt-bindings/interrupt-controller/apple-aic.h>
>> +#include <dt-bindings/interrupt-controller/irq.h>
>> +
>> +/ {
>> +	compatible = "apple,m1", "apple,arm-platform";
>> +
>> +	#address-cells = <2>;
>> +	#size-cells = <2>;
>> +
>> +	cpus {
>> +		#address-cells = <2>;
>> +		#size-cells = <0>;
>> +
>> +		cpu0: cpu@0 {
>> +			compatible = "apple,icestorm";
>> +			device_type = "cpu";
>> +			reg = <0x0 0x0>;
>> +			enable-method = "spin-table";
>> +			cpu-release-addr = <0 0>; /* To be filled by loader */
>> +		};
> 
> New line after every device node, please.

Added newlines after all the CPU nodes.

> With this minor changes, fine for me:
> Reviewed-by: Krzysztof Kozlowski <krzk@kernel.org>

Thanks!

v3 will rename this file to apple/t8103.dtsi and the board file to 
t8103-j274.dts to better match other platforms (and to use the proper 
SoC ID for the M1); please let me know if you're okay keeping the 
Reviewed-by for that.

Marc Zyngier Feb. 21, 2021, 2:59 p.m. UTC | #47

On Sun, 21 Feb 2021 14:38:16 +0000,
Hector Martin <marcan@marcan.st> wrote:
> 
> On 21/02/2021 04.17, Marc Zyngier wrote:
> > On 2021-02-20 19:13, Krzysztof Kozlowski wrote:
> >> On Thu, Feb 18, 2021 at 11:01:21PM +0900, Hector Martin wrote:
> >>> On 16/02/2021 03.51, Krzysztof Kozlowski wrote:
> >>>>> Also fix a bug checking the return value, which should use IS_ERR().
> >>>> 
> >>>> No, no, no. We never, never combine fixing bugs with some rework.
> >>>> However devm_ioremap() returns NULL so where is the error?
> >>> 
> >>> Sorry, this was a commit message mistake. The code is correct and so
> >>> is the
> >>> patch: just the NULL check is correct for the previous variant and
> >>> IS_ERR is
> >>> correct for devm_ioremap_resource. I confused myself while writing the
> >>> commit message after the fact.
> >>> 
> >>>> Did you test your patches on existing platforms? If not, please mark all
> >>>> of them as RFT on next submission, so Greg does not pick them too fast.
> >>> 
> >>> I unfortunately don't have any Exynos devices where I could test the
> >>> code (I
> >>> have a couple but no serial connections, and I have no idea if mailine
> >>> would
> >>> run on them). I'll mark v3 as RFT.
> >> 
> >> If you have one of Odroid boards with Exynos, then you can nicely test
> >> Exynos. Others - depends, on board.
> >> Anyway I can test them for you. I just want to be sure that Greg waits
> >> for this testing.
> > 
> > Worse case, QEMU has some Exynos4210 emulation that is usable.
> 
> That's a good point; better than nothing, certainly.
> 
> Does anyone have a known good example of booting an exynos kernel
> under qemu? I tried building a plain 5.11 arm exynos_defconfig and
> booting it, but without much luck:
> 
> $ qemu-system-arm -kernel arch/arm/boot/zImage -append
> "console=ttySAC0,115200n8 debug" -dtb
> arch/arm/boot/dts/exynos4210-universal_c210.dtb -nographic -serial
> mon:stdio -M smdkc210 -smp 2

Here's what I've been using last time I had to muck with the 4210
stuff:

<quote>
qemu-system-arm \
	-kernel arch/arm/boot/zImage -M smdkc210 \
	-append "console=ttySAC0,115200n8 earlycon=smh root=/dev/mmcblk0p2 rootwait" \
	-nographic -semihosting -smp 2 \
	-dtb arch/arm/boot/dts/exynos4210-smdkv310.dtb \
	-drive if=sd,driver=null-co -drive if=sd,driver=null-co \
	-drive if=sd,file=../vminstall/bullseye32/MsiKFRxxujYIkiKT.img,format=raw
</quote>

where the last line points to a standard Debian image created
separately.

	M.

Hector Martin Feb. 21, 2021, 3:20 p.m. UTC | #48

On 18/02/2021 23.36, Mark Rutland wrote:
> IIUC, the CPUs in these parts have some IMP-DEF instructions that can be
> used at EL0 which might have some IMP-DEF state. Our general expectation
> is that FW should configure such things to trap, but I don't know
> whether the M1 FW does that, and I fear that this will end up being a
> problem for us -- even if that doesn't affect EL1/EL2, IMP-DEF state is
> an interesting covert channel between EL0 tasks, and not generally safe
> to use thanks to context-switch and idle, so I'd like to make sure we
> can catch usage and make it SIGILL.
> 
> Do you happen to know whether all of that is configured to trap, and if
> not, is it possible to adjust the bootloader to ensure it is?

Very good point!

If only they were IMP-DEF... they're straight in Unallocated space. I 
spent some time the other day exhaustively searching the chunk of the 
encoding space where it looks like all these "fun" additions are,
at EL2, and I documented what I found here:

https://github.com/AsahiLinux/docs/wiki/HW:Apple-Instructions

I haven't tested things at EL0 yet, but it looks like the stateful 
instructions known to be usable in EL0 (AMX) already default to trap on 
this platform, so we should be safe there. Everything else looks like it 
probably either shouldn't work in EL0 (I sure hope the address 
translation one doesn't...) or is probably stateless. I'll dig deeper 
and test EL0 in the future, but so far things look OK (for some 
questionable values of OK :) ).

Krzysztof Kozlowski Feb. 21, 2021, 3:32 p.m. UTC | #49

On Sun, Feb 21, 2021 at 11:43:52PM +0900, Hector Martin wrote:
> On 16/02/2021 04.29, Krzysztof Kozlowski wrote:
> > On Mon, Feb 15, 2021 at 09:17:13PM +0900, Hector Martin wrote:
> > > +	memory@800000000 {
> > > +		device_type = "memory";
> > > +		reg = <0 0 0 0>; /* To be filled by loader */
> > 
> > dtc and dtschema might complain, so could you set here fake memory
> > address 0x800000000? Would that work for your bootloader?
> 
> Yeah, the bootloader just replaces the entire property anyway. I'll fill in
> some dummy values (the real usable memory range is to some extent dynamic
> and depends on firmware).
> 
> > > +	};
> > > +};
> > > +
> > > +&serial0 {
> > > +	status = "okay";
> > > +};
> > > diff --git a/arch/arm64/boot/dts/apple/apple-m1.dtsi b/arch/arm64/boot/dts/apple/apple-m1.dtsi
> > > new file mode 100644
> > > index 000000000000..45c87771b057
> > > --- /dev/null
> > > +++ b/arch/arm64/boot/dts/apple/apple-m1.dtsi
> > > @@ -0,0 +1,124 @@
> > > +// SPDX-License-Identifier: GPL-2.0+ OR MIT
> > > +/*
> > > + * Copyright The Asahi Linux Contributors
> > > + */
> > > +
> > > +#include <dt-bindings/interrupt-controller/apple-aic.h>
> > > +#include <dt-bindings/interrupt-controller/irq.h>
> > > +
> > > +/ {
> > > +	compatible = "apple,m1", "apple,arm-platform";
> > > +
> > > +	#address-cells = <2>;
> > > +	#size-cells = <2>;
> > > +
> > > +	cpus {
> > > +		#address-cells = <2>;
> > > +		#size-cells = <0>;
> > > +
> > > +		cpu0: cpu@0 {
> > > +			compatible = "apple,icestorm";
> > > +			device_type = "cpu";
> > > +			reg = <0x0 0x0>;
> > > +			enable-method = "spin-table";
> > > +			cpu-release-addr = <0 0>; /* To be filled by loader */
> > > +		};
> > 
> > New line after every device node, please.
> 
> Added newlines after all the CPU nodes.
> 
> > With this minor changes, fine for me:
> > Reviewed-by: Krzysztof Kozlowski <krzk@kernel.org>
> 
> Thanks!
> 
> v3 will rename this file to apple/t8103.dtsi and the board file to
> t8103-j274.dts to better match other platforms (and to use the proper SoC ID
> for the M1); please let me know if you're okay keeping the Reviewed-by for
> that.

It's fine, keep my review tag.

Best regards,
Krzysztof

Hector Martin Feb. 21, 2021, 5:09 p.m. UTC | #50

On 21/02/2021 23.59, Marc Zyngier wrote:
> Here's what I've been using last time I had to muck with the 4210
> stuff:
> 
> <quote>
> qemu-system-arm \
> 	-kernel arch/arm/boot/zImage -M smdkc210 \
> 	-append "console=ttySAC0,115200n8 earlycon=smh root=/dev/mmcblk0p2 rootwait" \
> 	-nographic -semihosting -smp 2 \
> 	-dtb arch/arm/boot/dts/exynos4210-smdkv310.dtb \
> 	-drive if=sd,driver=null-co -drive if=sd,driver=null-co \
> 	-drive if=sd,file=../vminstall/bullseye32/MsiKFRxxujYIkiKT.img,format=raw
> </quote>
> 
> where the last line points to a standard Debian image created
> separately.

Hah, exynos4210-smdkv310.dtb is what did it. And here I was thinking 
something with "c210" in the name would be more likely to work with qemu 
machine "smdkc210"... :-)

Hector Martin Feb. 22, 2021, 7:35 p.m. UTC | #51

On 16/02/2021 03.09, Marc Zyngier wrote:
> On Mon, 15 Feb 2021 12:17:03 +0000,
> Hector Martin <marcan@marcan.st> wrote:
>> This patch introduces basic UP irqchip support, without SMP/IPI support.
> 
> This last comment seems outdated now.

Heh, I forgot to reword this one. Thanks :)

>> +config APPLE_AIC
>> +	bool "Apple Interrupt Controller (AIC)"
>> +	depends on ARM64
>> +	default ARCH_APPLE
>> +	select IRQ_DOMAIN
>> +	select IRQ_DOMAIN_HIERARCHY
> 
> arm64 selects GENERIC_IRQ_IPI, which selects IRQ_DOMAIN_HIERARCHY,
> which selects IRQ_DOMAIN. So these two lines are superfluous.

Ack, removing these for v3.

>> + * In addition, this driver also handles FIQs, as these are routed to the same IRQ vector. These
>> + * are used for Fast IPIs (TODO), the ARMv8 timer IRQs, and performance counters (TODO).
>> + *
> 
> nit: A bit of comment formatting could be helpful.

Wrapped this to 80 columns for v3.

>> +#include <linux/bits.h>
>> +#include <linux/bitfield.h>
>> +#include <linux/cpuhotplug.h>
>> +#include <linux/io.h>
>> +#include <linux/irqchip.h>
>> +#include <linux/irqchip/arm-gic-v3.h>
> 
> I'd rather you move the ICH_HCR_* definitions to sysreg.h rather than
> including the GICv3 stuff. They are only there for historical reasons
> (such as supporting KVM on 32bit systems), none of which apply anymore.

Just ICH_HCR, or should I bring all of the ICH_ and ICC_ defines along 
with it?

>> +	aic_ic_write(ic, AIC_TARGET_CPU + hwirq * 4, BIT(cpu));
>> +	irq_data_update_effective_affinity(d, cpumask_of(cpu));
> 
> It is fine to pick a single CPU out of the whole affinity set, but you
> should tell the kernel that this is the case (irqd_set_single_target()).

>> +
>> +	irq_set_status_flags(irq, IRQ_LEVEL);
> 
> I'm definitely not keen on this override, and the trigger information
> should be the one coming from the DT, which is already set for you.
> It'd probably be useful to provide an irq_set_type() callback that
> returns an error when fed an unsupported trigger.
> 

>> +	irq_set_noprobe(irq);
> 
> This seems to be cargo-culted, and I don't believe this is necessary.

>> +static const struct irq_domain_ops aic_irq_domain_ops = {
>> +	.map = aic_irq_domain_map,
>> +	.unmap = aic_irq_domain_unmap,
>> +	.xlate = aic_irq_domain_xlate,
>> +};
> 
> You are mixing two APIs: the older OF-specific one, and the newer one
> that uses fwnode_handle for hierarchical support. That's OK for older
> drivers that were forcefully converted to using generic IPIs, but as
> this is a brand new driver, I'd rather it consistently used the new
> API. See a proposed rework at [1] (compile tested only).

Applying your fixups for these, thanks! :)

>> +	atomic_and(~irq_bit, &aic_vipi_mask[this_cpu]);
> 
> atomic_andnot()?
> 
>> +
>> +	if (!atomic_read(&aic_vipi_mask[this_cpu]))
>> +		aic_ic_write(ic, AIC_IPI_MASK_SET, AIC_IPI_OTHER);
> 
> This is odd. It means that you still perform the MMIO write if the bit
> was already clear. I think this could be written as:
> 
> 	u32 val;
> 	val = atomic_fetch_andnot(irq_bit, &aic_vipi_mask[this_cpu]);
> 	if (val && !(val & ~irq_bit))
> 		aic_ic_write();

> 
> 	val  = atomic_fetch_or(irq_bit, &aic_vipi_mask[this_cpu]);
> 	if (!val)
> 		aic_ic_write();

This makes more sense to avoid the redundant MMIO writes. I need to get 
more familiar with all the available atomic ops... lots of useful stuff 
in there I didn't know about.

>> +	for_each_cpu(cpu, mask) {
>> +		if (atomic_read(&aic_vipi_mask[cpu]) & irq_bit) {
>> +			atomic_or(irq_bit, &aic_vipi_flag[cpu]);
>> +			send |= AIC_IPI_SEND_CPU(cpu);
> 
> That's really odd. A masked IPI should be made pending, and delivered
> on unmask. I think this all works because we never mask individual
> IPIs, as this would otherwise drop interrupts on the floor.

I wasn't really sure whether IPIs are supposed to end up pending like 
that; indeed if that's how it's supposed to work, then I also need logic 
at mask/unmask time to fire off any pending IPIs. I'll do it like that 
for v3.

Now I wonder how other drivers do it... I'm guessing this never gets 
tested, since the IPI code only exercises a fraction of the irq features...

>> +static void aic_handle_ipi(struct pt_regs *regs)
>> +{
>> +	int this_cpu = smp_processor_id();
>> +	int i;
>> +	unsigned long firing;
>> +
>> +	aic_ic_write(aic_irqc, AIC_IPI_ACK, AIC_IPI_OTHER);
>> +
>> +	/*
>> +	 * Ensure that we've received and acked the IPI before we load the vIPI
>> +	 * flags. This pairs with the second wmb() above.
>> +	 */
>> +	mb();
> 
> I don't get your ordering here.
> 
> If you are trying to order against something that has happened on
> another CPU (which is pretty likely in the case of an IPI), why isn't
> this a smp_mb_before_atomic() (and conversely smp_mb_after_atomic() in
> aic_ipi_send_mask())?
> 
> Although this looks to me like a good case for _acquire/_release
> semantics.

This is trying to order the atomic ops with the IPI IRQ itself, in 
particular the ACK in the preceding line. If they execute in reverse 
order (or more precisely if the ACK takes effect after the xchg), this 
happens and we lose an IPI:

CPU1              CPU2
set vIPI #0
fire IPI
                   IPI IRQ
                   read EVENT
                   / xchg vIPI
set vIPI #1       X
fire IPI          |
                   \ ACK IPI

The converse race can also happen in the CPU1 path, of course, if the 
IPI ends up fired before the vIPI is set, hence the barrier there.

What I'm not sure is how the smp_ ops order with regard to 
writel_relaxed. It seems like mb() is dsb(sy) and smp_mb() is dmb(ish) 
and the atomic versions just default to aliasing smp_mb()). An 
inner-sharable dmb doesn't sound like it would safely satisfy this 
requirement, as MMIO out to AIC is involved (and we don't know if it's 
in the inner sharable domain or not for these purposes). Since the MMIO 
is nGnRnE, I would expect that a dsb(sy) would satisfy this requirement, 
as then the write op really shouldn't complete until it has taken effect 
in AIC.

>> +	/*
>> +	 * Make sure the kernel's idea of logical CPU order is the same as AIC's
>> +	 * If we ever end up with a mismatch here, we will have to introduce
>> +	 * a mapping table similar to what other irqchip drivers do.
>> +	 */
>> +	WARN_ON(aic_ic_read(aic_irqc, AIC_WHOAMI) != smp_processor_id());
> 
> This is unlikely to work as soon as you get kexec up and running. You
> may not have to worry about this for some time...

Ah, can kexec randomly shuffle CPUs around?

The solution here is obvious, but at this point I'm more keen on punting 
this to a future patch instead of introducing more complexity into the 
initial series; gotta leave behind some bugs to fix later :)

Hector Martin Feb. 23, 2021, 9:11 a.m. UTC | #52

On 22/02/2021 05.41, Andy Shevchenko wrote:
> Hector, I would like to be cc’ed in the next version

Noted, thanks!

Marc Zyngier Feb. 23, 2021, 5:37 p.m. UTC | #53

Hi Hector,

On Mon, 22 Feb 2021 19:35:03 +0000,
Hector Martin <marcan@marcan.st> wrote:
> 
> On 16/02/2021 03.09, Marc Zyngier wrote:
> > On Mon, 15 Feb 2021 12:17:03 +0000,
> > Hector Martin <marcan@marcan.st> wrote:
> >> This patch introduces basic UP irqchip support, without SMP/IPI support.
> > 
> > This last comment seems outdated now.
> 
> Heh, I forgot to reword this one. Thanks :)
> 
> >> +config APPLE_AIC
> >> +	bool "Apple Interrupt Controller (AIC)"
> >> +	depends on ARM64
> >> +	default ARCH_APPLE
> >> +	select IRQ_DOMAIN
> >> +	select IRQ_DOMAIN_HIERARCHY
> > 
> > arm64 selects GENERIC_IRQ_IPI, which selects IRQ_DOMAIN_HIERARCHY,
> > which selects IRQ_DOMAIN. So these two lines are superfluous.
> 
> Ack, removing these for v3.
> 
> >> + * In addition, this driver also handles FIQs, as these are routed to the same IRQ vector. These
> >> + * are used for Fast IPIs (TODO), the ARMv8 timer IRQs, and performance counters (TODO).
> >> + *
> > 
> > nit: A bit of comment formatting could be helpful.
> 
> Wrapped this to 80 columns for v3.
> 
> >> +#include <linux/bits.h>
> >> +#include <linux/bitfield.h>
> >> +#include <linux/cpuhotplug.h>
> >> +#include <linux/io.h>
> >> +#include <linux/irqchip.h>
> >> +#include <linux/irqchip/arm-gic-v3.h>
> > 
> > I'd rather you move the ICH_HCR_* definitions to sysreg.h rather than
> > including the GICv3 stuff. They are only there for historical reasons
> > (such as supporting KVM on 32bit systems), none of which apply anymore.
> 
> Just ICH_HCR, or should I bring all of the ICH_ and ICC_ defines along
> with it?

Just ICH_HCR for now would be fine.

> 
> >> +	aic_ic_write(ic, AIC_TARGET_CPU + hwirq * 4, BIT(cpu));
> >> +	irq_data_update_effective_affinity(d, cpumask_of(cpu));
> > 
> > It is fine to pick a single CPU out of the whole affinity set, but you
> > should tell the kernel that this is the case (irqd_set_single_target()).
> 
> >> +
> >> +	irq_set_status_flags(irq, IRQ_LEVEL);
> > 
> > I'm definitely not keen on this override, and the trigger information
> > should be the one coming from the DT, which is already set for you.
> > It'd probably be useful to provide an irq_set_type() callback that
> > returns an error when fed an unsupported trigger.
> > 
> 
> >> +	irq_set_noprobe(irq);
> > 
> > This seems to be cargo-culted, and I don't believe this is necessary.
> 
> >> +static const struct irq_domain_ops aic_irq_domain_ops = {
> >> +	.map = aic_irq_domain_map,
> >> +	.unmap = aic_irq_domain_unmap,
> >> +	.xlate = aic_irq_domain_xlate,
> >> +};
> > 
> > You are mixing two APIs: the older OF-specific one, and the newer one
> > that uses fwnode_handle for hierarchical support. That's OK for older
> > drivers that were forcefully converted to using generic IPIs, but as
> > this is a brand new driver, I'd rather it consistently used the new
> > API. See a proposed rework at [1] (compile tested only).
> 
> Applying your fixups for these, thanks! :)

My pleasure!

> 
> >> +	atomic_and(~irq_bit, &aic_vipi_mask[this_cpu]);
> > 
> > atomic_andnot()?
> > 
> >> +
> >> +	if (!atomic_read(&aic_vipi_mask[this_cpu]))
> >> +		aic_ic_write(ic, AIC_IPI_MASK_SET, AIC_IPI_OTHER);
> > 
> > This is odd. It means that you still perform the MMIO write if the bit
> > was already clear. I think this could be written as:
> > 
> > 	u32 val;
> > 	val = atomic_fetch_andnot(irq_bit, &aic_vipi_mask[this_cpu]);
> > 	if (val && !(val & ~irq_bit))
> > 		aic_ic_write();
> 
> > 
> > 	val  = atomic_fetch_or(irq_bit, &aic_vipi_mask[this_cpu]);
> > 	if (!val)
> > 		aic_ic_write();
> 
> This makes more sense to avoid the redundant MMIO writes. I need to
> get more familiar with all the available atomic ops... lots of useful
> stuff in there I didn't know about.
> 
> >> +	for_each_cpu(cpu, mask) {
> >> +		if (atomic_read(&aic_vipi_mask[cpu]) & irq_bit) {
> >> +			atomic_or(irq_bit, &aic_vipi_flag[cpu]);
> >> +			send |= AIC_IPI_SEND_CPU(cpu);
> > 
> > That's really odd. A masked IPI should be made pending, and delivered
> > on unmask. I think this all works because we never mask individual
> > IPIs, as this would otherwise drop interrupts on the floor.
> 
> I wasn't really sure whether IPIs are supposed to end up pending like
> that; indeed if that's how it's supposed to work, then I also need
> logic at mask/unmask time to fire off any pending IPIs. I'll do it
> like that for v3.

Yes, unmask() needs to release pending IPIs.

> Now I wonder how other drivers do it... I'm guessing this never gets
> tested, since the IPI code only exercises a fraction of the irq
> features...

In most cases, the HW does it for you, fortunately. A masked IPI that
is made pending stays pending until unmasked. On the GIC, the SGIs
(which is the interrupt type we use for IPIs) are just dealt with like
any other interrupt, and are subject to the same life cycle. I think
this is the first SW-based IPI we have on arm64.

> 
> >> +static void aic_handle_ipi(struct pt_regs *regs)
> >> +{
> >> +	int this_cpu = smp_processor_id();
> >> +	int i;
> >> +	unsigned long firing;
> >> +
> >> +	aic_ic_write(aic_irqc, AIC_IPI_ACK, AIC_IPI_OTHER);
> >> +
> >> +	/*
> >> +	 * Ensure that we've received and acked the IPI before we load the vIPI
> >> +	 * flags. This pairs with the second wmb() above.
> >> +	 */
> >> +	mb();
> > 
> > I don't get your ordering here.
> > 
> > If you are trying to order against something that has happened on
> > another CPU (which is pretty likely in the case of an IPI), why isn't
> > this a smp_mb_before_atomic() (and conversely smp_mb_after_atomic() in
> > aic_ipi_send_mask())?
> > 
> > Although this looks to me like a good case for _acquire/_release
> > semantics.
> 
> This is trying to order the atomic ops with the IPI IRQ itself, in
> particular the ACK in the preceding line. If they execute in reverse
> order (or more precisely if the ACK takes effect after the xchg), this
> happens and we lose an IPI:
> 
> CPU1              CPU2
> set vIPI #0
> fire IPI
>                   IPI IRQ
>                   read EVENT
>                   / xchg vIPI
> set vIPI #1       X
> fire IPI          |
>                   \ ACK IPI
> 
> The converse race can also happen in the CPU1 path, of course, if the
> IPI ends up fired before the vIPI is set, hence the barrier there.

I'm going to be brave and suggest something. Will can chime in and
explain why I'm totally wrong! :D

I *think* this scenario should be solved with:

- sender:
	// virtually signal the remote CPU, making sure this RMW
	// is ordered after all the previous writes
	atomic_or_release(irq_bit, &vipi[target]);
	// Signal the IPI
	writel(...);

  although I think the _release is superfluous given that the writel()
  is a pretty heavy hammer already. You may need to rejig this to
  account for the loop

- receiver
        // Ack the interrupt, all further memory accesses are ordered
        // after it
	irq = readl();
	// Order all further loads after this
	ipis = atomic_xchg_acquire(&vipi[me], 0);
	// handle all interrupts described in ipis, which may include
	// more than a single source
	handle_ipis(ipis);

  Here, we may end-up with multiple IPIs (#0 and #1 in your
  example). It doesn't really matter, and the only ill effect is a
  potential spurious IPI if we have consumed more than a single one.

> What I'm not sure is how the smp_ ops order with regard to
> writel_relaxed. It seems like mb() is dsb(sy) and smp_mb() is dmb(ish)
> and the atomic versions just default to aliasing smp_mb()). An
> inner-sharable dmb doesn't sound like it would safely satisfy this
> requirement, as MMIO out to AIC is involved (and we don't know if it's
> in the inner sharable domain or not for these purposes). Since the
> MMIO is nGnRnE, I would expect that a dsb(sy) would satisfy this
> requirement, as then the write op really shouldn't complete until it
> has taken effect in AIC.

In the example above, I used a non-relaxed write, as it guarantees
that it is ordered after any DMA agent that can access the data
ordered before (and I assume that if DMA can see it, another CPU can
as well).

I'm not convinced that the DSB(SY) solves anything here, *unless*
there are some other rules specific to the AIC. DSB doesn't guarantee
that the device has actually changed state as a consequence of the
MMIO access. If I remember well, you actually need a read from the
same device to ensure the state has been changed. But after all, a
state change isn't what you're after, only ordering.

> 
> >> +	/*
> >> +	 * Make sure the kernel's idea of logical CPU order is the same as AIC's
> >> +	 * If we ever end up with a mismatch here, we will have to introduce
> >> +	 * a mapping table similar to what other irqchip drivers do.
> >> +	 */
> >> +	WARN_ON(aic_ic_read(aic_irqc, AIC_WHOAMI) != smp_processor_id());
> > 
> > This is unlikely to work as soon as you get kexec up and running. You
> > may not have to worry about this for some time...
> 
> Ah, can kexec randomly shuffle CPUs around?

Not completely randomly ;-). kexec can execute on any CPU, and all the
others will be shut down. When that CPU enters the secondary kernel,
it will be CPU0, no matter what it was in the previous instance.

> The solution here is obvious, but at this point I'm more keen on
> punting this to a future patch instead of introducing more complexity
> into the initial series; gotta leave behind some bugs to fix later
> :)

I think that's fine for now. kexec is a long way away (we need to
solve the firmware story first), so keeping it on the back burner
seems like a sensible approach.

Thanks,

	M.

Hector Martin Feb. 24, 2021, 3:55 p.m. UTC | #54

On 22/02/2021 00.20, Hector Martin wrote:
> I haven't tested things at EL0 yet, but it looks like the stateful
> instructions known to be usable in EL0 (AMX) already default to trap on
> this platform, so we should be safe there. Everything else looks like it
> probably either shouldn't work in EL0 (I sure hope the address
> translation one doesn't...) or is probably stateless. I'll dig deeper
> and test EL0 in the future, but so far things look OK (for some
> questionable values of OK :) ).

Follow-up: I have EL0 testing scaffolding now, and I found some more 
mutable state (an IMP-DEF, pre-standard version of FEAT_AFP, using a 
separate status register for the bits), but thankfully it traps at EL0 
by default.

And then I found some other mutable IMP-DEF state that does not trap at 
EL0. And which is a 0-day CVE in macOS, because it doesn't 
save/restore/clear it either, nor does it trap there.

E-mailing security@apple.com...

[v2,00/25] Apple M1 SoC platform bring-up

Message

Comments