mbox series

[v3,00/13] arm64/firmware: Software Delegated Exception Interface

Message ID 20170922182614.27885-1-james.morse@arm.com
Headers show
Series arm64/firmware: Software Delegated Exception Interface | expand

Message

James Morse Sept. 22, 2017, 6:26 p.m. UTC
Hello!

The Software Delegated Exception Interface (SDEI) is an ARM specification
for registering callbacks from the platform firmware into the OS.
This is intended to be used to implement firmware-first RAS notifications,
but also supports vendor-defined events and binding IRQs as events.

The document is here:
http://infocenter.arm.com/help/topic/com.arm.doc.den0054a/ARM_DEN0054A_Software_Delegated_Exception_Interface.pdf

I anticpate once reviewed this series will go via the arm-soc tree, as the
bulk of the code is under /drivers/firmware/.

The major change in v3 is due to VMAP stacks. We can't trust sp when we take
an event, so need our own SDEI stack. Events can nest, so we need two. Private
events can be delivered per-cpu, so we need two per-cpu SDEI stacks.

The APCICA support for the SDEI table is now in mainline, so this series has
grown the detection code for non-DT systems. The APEI/GHES wire-up is still
being held back as APEI's ghes_proc() isn't (yet) safe for multiple sources
of NMI.

Exposing the HVC range through KVM to user-space is gone, this will be
re-incarnated as something that better fits KVMs architecture emulation,
as opposed to assuming all-the-world's-SMC-CC.


This series (juggles some registers with KVM+VHE, then) adds a DT binding to
trigger probing of the interface and support for the SDEI API.

SDEI runs between adjacent exception levels, so events will always be delivered
to EL2 if firmware is at EL3. For VHE hosts we run the SDEI event handler
behind KVM's back with all exceptions masked. Once the handler has done its
work we return to the appropriate vbar+irq entry. This allows KVM to
world-switch and deliver any signals sent by the handler to Qemu/kvmtool. We
do the same thing if we interrupt host EL0. If we interrupted code with
interrupts masked, we use a different API call to return to the interrupted
context.

What about non-VHE KVM? If you don't have VHE support and boot at EL2, the
kernel drops to EL1. This driver will print an error message then give up. This
is because events would still be delivered to EL2 hitting either KVM, or the
hyp-stub. Supporting this is complicated, but because the main use-case is
RAS, and ARM v8.2's RAS extensions imply v8.1's Virtual Host Extensions, we
can assume all platforms with SDEI will support VHE too. (I have some ideas
on how to support non-VHE).

Running the event handler behind VHE-KVM's back has some side effects: The
event handler will blindly use any registers that are shared between the host
and guest. The two that I think matter are TPIDR_EL1, and the debug state. The
guest may have set MDSCR_EL1 so debug exceptions must remain masked. The
guest's TPIDR_EL1 will be used by the event handler if it accesses per-cpu
variables. This needs fixing. The first part of this series juggles KVMs use
of TPIDR_EL2 so that we share it with the host on VHE systems. An equivalent
change for 32bit is (still) on my todo list. (the alternative to this is to
have a parody world switch in the SDEI event handler, but this would mean
special casing interrupted guests, and be an ABI link to KVM.)

Is this another begins-with-S RAS mechanism for arm64? Yes.
Why? Any notification delivered as an exception will overwrite the exception
registers. This is fatal for the running thread if it happens during entry.S's
kernel_enter or kernel_exit. Instead of adding masking and routing controls,
events are delivered to a registered address at a fixed exception level and
don't change the exception registers when delivered.

This series can be retrieved from:
git://linux-arm.org/linux-jm.git -b sdei/v3/base

Questions and contradictions welcome!

Thanks,

James

[Changes since previous versions are noted on each patch]


James Morse (13):
  KVM: arm64: Store vcpu on the stack during __guest_enter()
  KVM: arm/arm64: Convert kvm_host_cpu_state to a static per-cpu
    allocation
  KVM: arm64: Change hyp_panic()s dependency on tpidr_el2
  arm64: alternatives: use tpidr_el2 on VHE hosts
  KVM: arm64: Stop save/restoring host tpidr_el1 on VHE
  Docs: dt: add devicetree binding for describing arm64 SDEI firmware
  firmware: arm_sdei: Add driver for Software Delegated Exceptions
  arm64: Add vmap_stack header file
  arm64: kernel: Add arch-specific SDEI entry code and CPU masking
  firmware: arm_sdei: Add support for CPU and system power states
  firmware: arm_sdei: add support for CPU private events
  arm64: acpi: Remove __init from acpi_psci_use_hvc() for use by SDEI
  firmware: arm_sdei: Discover SDEI support via ACPI

 .../devicetree/bindings/arm/firmware/sdei.txt      |   42 +
 arch/arm64/Kconfig                                 |    2 +-
 arch/arm64/include/asm/assembler.h                 |    8 +
 arch/arm64/include/asm/kvm_host.h                  |    2 +
 arch/arm64/include/asm/percpu.h                    |   11 +-
 arch/arm64/include/asm/processor.h                 |    1 +
 arch/arm64/include/asm/sdei.h                      |   63 ++
 arch/arm64/include/asm/stacktrace.h                |    3 +
 arch/arm64/include/asm/vmap_stack.h                |   41 +
 arch/arm64/kernel/Makefile                         |    1 +
 arch/arm64/kernel/acpi.c                           |    2 +-
 arch/arm64/kernel/asm-offsets.c                    |    5 +
 arch/arm64/kernel/cpufeature.c                     |   23 +
 arch/arm64/kernel/irq.c                            |   13 +-
 arch/arm64/kernel/sdei-entry.S                     |  122 +++
 arch/arm64/kernel/sdei.c                           |  235 +++++
 arch/arm64/kernel/smp.c                            |   11 +-
 arch/arm64/kvm/hyp-init.S                          |    4 +
 arch/arm64/kvm/hyp/entry.S                         |   10 +-
 arch/arm64/kvm/hyp/hyp-entry.S                     |   18 +-
 arch/arm64/kvm/hyp/switch.c                        |   25 +-
 arch/arm64/kvm/hyp/sysreg-sr.c                     |   16 +-
 arch/arm64/mm/proc.S                               |    8 +
 drivers/firmware/Kconfig                           |    8 +
 drivers/firmware/Makefile                          |    1 +
 drivers/firmware/arm_sdei.c                        | 1096 ++++++++++++++++++++
 include/linux/cpuhotplug.h                         |    1 +
 include/linux/psci.h                               |    3 +-
 include/linux/sdei.h                               |  105 ++
 include/uapi/linux/sdei.h                          |   91 ++
 virt/kvm/arm/arm.c                                 |   18 +-
 31 files changed, 1931 insertions(+), 58 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/arm/firmware/sdei.txt
 create mode 100644 arch/arm64/include/asm/sdei.h
 create mode 100644 arch/arm64/include/asm/vmap_stack.h
 create mode 100644 arch/arm64/kernel/sdei-entry.S
 create mode 100644 arch/arm64/kernel/sdei.c
 create mode 100644 drivers/firmware/arm_sdei.c
 create mode 100644 include/linux/sdei.h
 create mode 100644 include/uapi/linux/sdei.h

Comments

Catalin Marinas Oct. 13, 2017, 3:31 p.m. UTC | #1
On Fri, Sep 22, 2017 at 07:26:05PM +0100, James Morse wrote:
> diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
> index cd52d365d1f0..8e4c7da2b126 100644
> --- a/arch/arm64/kernel/cpufeature.c
> +++ b/arch/arm64/kernel/cpufeature.c
> @@ -865,6 +865,7 @@ static const struct arm64_cpu_capabilities arm64_features[] = {
>  		.capability = ARM64_HAS_VIRT_HOST_EXTN,
>  		.def_scope = SCOPE_SYSTEM,
>  		.matches = runs_at_el2,
> +		.enable = cpu_copy_el2regs,
>  	},
>  	{
>  		.desc = "32-bit EL0 Support",
> @@ -1308,3 +1309,25 @@ static int __init enable_mrs_emulation(void)
>  }
>  
>  late_initcall(enable_mrs_emulation);
> +
> +int cpu_copy_el2regs(void *__unused)
> +{
> +	int do_copyregs = 0;
> +
> +	/*
> +	 * Copy register values that aren't redirected by hardware.
> +	 *
> +	 * Before code patching, we only set tpidr_el1, all CPUs need to copy
> +	 * this value to tpidr_el2 before we patch the code. Once we've done
> +	 * that, freshly-onlined CPUs will set tpidr_el2, so we don't need to
> +	 * do anything here.
> +	 */
> +	asm volatile(ALTERNATIVE("mov %0, #1", "mov %0, #0",
> +				 ARM64_HAS_VIRT_HOST_EXTN)
> +		    : "=r" (do_copyregs) : : );

Can you just do:

	if (cpu_have_const_cap(ARM64_HAS_VIRT_HOST_EXTN))
		write_sysreg(read_sysreg(tpidr_el1), tpidr_el2);

At this point the capability bits should be set and the jump labels
enabled.

Otherwise:

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Catalin Marinas Oct. 13, 2017, 3:42 p.m. UTC | #2
On Fri, Sep 22, 2017 at 07:26:09PM +0100, James Morse wrote:
> diff --git a/arch/arm64/include/asm/vmap_stack.h b/arch/arm64/include/asm/vmap_stack.h
> new file mode 100644
> index 000000000000..f41d043cac31
> --- /dev/null
> +++ b/arch/arm64/include/asm/vmap_stack.h
> @@ -0,0 +1,41 @@
> +/*
> + * Copyright (C) 2017 ARM Ltd.
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program.  If not, see <http://www.gnu.org/licenses/>.
> + */
> +#ifndef __ASM_VMAP_STACK_H
> +#define __ASM_VMAP_STACK_H
> +
> +#include <linux/vmalloc.h>
> +#include <asm/memory.h>
> +#include <asm/pgtable.h>
> +#include <asm/thread_info.h>
> +
> +#ifdef CONFIG_VMAP_STACK
> +/*
> + * To ensure that VMAP'd stack overflow detection works correctly, all VMAP'd
> + * stacks need to have the same alignment.
> + */
> +static inline unsigned long *arch_alloc_vmap_stack(size_t stack_size, int node)
> +{
> +	return __vmalloc_node_range(stack_size, THREAD_ALIGN,
> +				    VMALLOC_START, VMALLOC_END,
> +				    THREADINFO_GFP, PAGE_KERNEL, 0, node,
> +				    __builtin_return_address(0));
> +}
> +
> +#else
> +unsigned long *arch_alloc_vmap_stack(size_t stack_size, int node); // link error

Do you actually need this here? The calling site (at least in this
patch), is surrounded by CONFIG_VMAP_STACK. Removing the dummy
declaration should be fine (I haven't read the rest of the patches yet,
maybe it's needed later; would a BUILD_BUG do?).
Catalin Marinas Oct. 13, 2017, 3:49 p.m. UTC | #3
On Fri, Sep 22, 2017 at 07:26:08PM +0100, James Morse wrote:
> The Software Delegated Exception Interface (SDEI) is an ARM standard
> for registering callbacks from the platform firmware into the OS.
> This is typically used to implement firmware notifications (such as
> firmware-first RAS) or promote an IRQ that has been promoted to a
> firmware-assisted NMI.
> 
> Add the code for detecting the SDEI version and the framework for
> registering and unregistering events. Subsequent patches will add the
> arch-specific backend code and the necessary power management hooks.
> 
> Only shared events are supported, power management, private events and
> discovery for ACPI systems will be added by later patches.
> 
> Signed-off-by: James Morse <james.morse@arm.com>
> ---
> Changes since v2:
>  * Copy the priority into the structure the arch asm handler gets. This
>    is used for VMAP stacks where we can't know if a critical event interrupted
>    a normal priority event, thus they need separate stacks.
> 
> Changes since v1:
>  * Changed entry point to unsigned long, if we support non-vhe systems this
>    won't be a valid pointer
>  * Made invoke_sdei_fn() pass the only register we are interested in, instead
>    of the whole arm_smccc_res
>  * Made all the locking WARN_ON()s lockdep_assert_held()s.
>  * Moved more messages from 'err' to 'warn'.
>  * Made IS_SDEI_CALL() not depend on whether the config option is selected.
>  * Made 'event failed' messages rate limited.
> 
>  drivers/firmware/Kconfig    |   7 +
>  drivers/firmware/Makefile   |   1 +
>  drivers/firmware/arm_sdei.c | 616 ++++++++++++++++++++++++++++++++++++++++++++
>  include/linux/sdei.h        | 100 +++++++
>  include/uapi/linux/sdei.h   |  91 +++++++
>  5 files changed, 815 insertions(+)
>  create mode 100644 drivers/firmware/arm_sdei.c
>  create mode 100644 include/linux/sdei.h
>  create mode 100644 include/uapi/linux/sdei.h

I haven't reviewed this patch (yet...). However, I think it's worth
adding a MAINTAINERS entry with your name on it (unless you really want
to distance yourself from this code once merged ;)).

Thanks.
James Morse Oct. 13, 2017, 4:50 p.m. UTC | #4
Hi Catalin,

On 13/10/17 16:31, Catalin Marinas wrote:
> On Fri, Sep 22, 2017 at 07:26:05PM +0100, James Morse wrote:
>> diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
>> index cd52d365d1f0..8e4c7da2b126 100644
>> --- a/arch/arm64/kernel/cpufeature.c
>> +++ b/arch/arm64/kernel/cpufeature.c

>> @@ -1308,3 +1309,25 @@ static int __init enable_mrs_emulation(void)
>>  }
>>  
>>  late_initcall(enable_mrs_emulation);
>> +
>> +int cpu_copy_el2regs(void *__unused)
>> +{
>> +	int do_copyregs = 0;
>> +
>> +	/*
>> +	 * Copy register values that aren't redirected by hardware.
>> +	 *
>> +	 * Before code patching, we only set tpidr_el1, all CPUs need to copy
>> +	 * this value to tpidr_el2 before we patch the code. Once we've done
>> +	 * that, freshly-onlined CPUs will set tpidr_el2, so we don't need to
>> +	 * do anything here.
>> +	 */
>> +	asm volatile(ALTERNATIVE("mov %0, #1", "mov %0, #0",
>> +				 ARM64_HAS_VIRT_HOST_EXTN)
>> +		    : "=r" (do_copyregs) : : );
> 
> Can you just do:
> 
> 	if (cpu_have_const_cap(ARM64_HAS_VIRT_HOST_EXTN))
> 		write_sysreg(read_sysreg(tpidr_el1), tpidr_el2);
> 
> At this point the capability bits should be set and the jump labels
> enabled.

These are enabled too early, we haven't done patching yet.

We need to copy tpidr_el1 -> tpidr_el2 on all CPUs that are online before code
patching.

After code patching new CPUs set tpidr_el2 when they come online, so we don't
need to do the copy... but this enable method is still called. There is nothing
for us to do, and tpidr_el1 now contains junk, so the copy


cpu_have_const_cap() is great for knowing if we have a feature, here we want to
know if we've done the patching for this feature.

I can wrap the ALTERNATIVE() into a helper, something like:
> arm64_alternatives_applied(ARM64_HAS_VIRT_HOST_EXTN)

which should make it clearer.

Christoffer had the same question at connect, so I evidently haven't found the
right way of describing this yet.


> Otherwise:
> 
> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>

Thanks for taking a look, I'll leave this RB until your happy with the
ALTERNATIVE() hackery above.


James
--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Catalin Marinas Oct. 16, 2017, 10:17 a.m. UTC | #5
On Fri, Sep 22, 2017 at 07:26:05PM +0100, James Morse wrote:
> diff --git a/arch/arm64/include/asm/processor.h b/arch/arm64/include/asm/processor.h
> index 29adab8138c3..8f2d0f7d193b 100644
> --- a/arch/arm64/include/asm/processor.h
> +++ b/arch/arm64/include/asm/processor.h
> @@ -193,5 +193,6 @@ static inline void spin_lock_prefetch(const void *ptr)
>  
>  int cpu_enable_pan(void *__unused);
>  int cpu_enable_cache_maint_trap(void *__unused);
> +int cpu_copy_el2regs(void *__unused);
>  
>  #endif /* __ASM_PROCESSOR_H */
> diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
> index cd52d365d1f0..8e4c7da2b126 100644
> --- a/arch/arm64/kernel/cpufeature.c
> +++ b/arch/arm64/kernel/cpufeature.c
[...]
> +int cpu_copy_el2regs(void *__unused)

Can this be static? I couldn't find it used anywhere else apart from
cpufeature.c
Catalin Marinas Oct. 16, 2017, 1:41 p.m. UTC | #6
On Fri, Sep 22, 2017 at 07:26:10PM +0100, James Morse wrote:
> diff --git a/arch/arm64/include/asm/sdei.h b/arch/arm64/include/asm/sdei.h
> new file mode 100644
> index 000000000000..ed329e01a301
> --- /dev/null
> +++ b/arch/arm64/include/asm/sdei.h
> @@ -0,0 +1,63 @@
> +/*
> + * Copyright (C) 2017 ARM Ltd.
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program.  If not, see <http://www.gnu.org/licenses/>.
> + */
> +#ifndef __ASM_SDEI_H
> +#define __ASM_SDEI_H
> +
> +/* Values for sdei_exit_mode */
> +#define SDEI_EXIT_HVC  0
> +#define SDEI_EXIT_SMC  1
> +
> +#define SDEI_STACK_SIZE		IRQ_STACK_SIZE
> +
> +#ifndef __ASSEMBLY__
> +
> +#include <linux/linkage.h>
> +#include <linux/types.h>
> +
> +#include <asm/virt.h>
> +
> +extern unsigned long sdei_exit_mode;
> +
> +/* Software Delegated Exception entry point from firmware*/
> +asmlinkage void __sdei_asm_handler(unsigned long event_num, unsigned long arg,
> +				   unsigned long pc, unsigned long pstate);
> +
> +/*
> + * The above entry point does the minimum to call C code. This function does
> + * anything else, before calling the driver.
> + */
> +struct sdei_registered_event;
> +asmlinkage unsigned long __sdei_handler(struct pt_regs *regs,
> +					struct sdei_registered_event *arg);
> +
> +extern unsigned long sdei_arch_get_entry_point(int conduit);

Nitpick: drop the "extern" here.

> diff --git a/arch/arm64/kernel/sdei-entry.S b/arch/arm64/kernel/sdei-entry.S
> new file mode 100644
> index 000000000000..cf12f8da0789
> --- /dev/null
> +++ b/arch/arm64/kernel/sdei-entry.S
> @@ -0,0 +1,122 @@
> +/*
> + * Software Delegated Exception entry point from firmware to the kernel.
> + *
> + * Copyright (C) 2017 ARM Ltd.
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program.  If not, see <http://www.gnu.org/licenses/>.
> + */
> +
> +#include <linux/linkage.h>
> +
> +#include <asm/alternative.h>
> +#include <asm/asm-offsets.h>
> +#include <asm/assembler.h>
> +#include <asm/memory.h>
> +#include <asm/ptrace.h>
> +#include <asm/sdei.h>
> +#include <uapi/linux/sdei.h>
> +
> +/*
> + * Software Delegated Exception entry point.
> + *
> + * x0: Event number

Currently the only event number is 0. Shall we plan for having other
events in the future or we just ignore this argument?

> + * x1: struct sdei_registered_event argument from registration time.
> + * x2: interrupted PC
> + * x3: interrupted PSTATE
[...]
> +/*
> + * __sdei_handler() returns one of:
> + *  SDEI_EV_HANDLED -  success, return to the interrupted context.
> + *  SDEI_EV_FAILED  -  failure, return this error code to firmare.
> + *  virtual-address -  success, return to this address.
> + */
> +static __kprobes unsigned long _sdei_handler(struct pt_regs *regs,
> +					     struct sdei_registered_event *arg)
> +{
> +	u32 mode;
> +	int i, err = 0;
> +	int clobbered_registers = 4;

Maybe const int here if you want to avoid magic numbers in the 'for'
loop below. Otherwise it looks like some variable you intend to modify
in this function.

> +	u64 elr = read_sysreg(elr_el1);
> +	u32 kernel_mode = read_sysreg(CurrentEL) | 1;	/* +SPSel */
> +	unsigned long vbar = read_sysreg(vbar_el1);
> +
> +	/* Retrieve the missing registers values */
> +	for (i = 0; i < clobbered_registers; i++) {
> +		/* from within the handler, this call always succeeds */
> +		sdei_api_event_context(i, &regs->regs[i]);
> +	}
> +
> +	/*
> +	 * We didn't take an exception to get here, set PAN. UAO will be cleared
> +	 * by sdei_event_handler()s set_fs(USER_DS) call.
> +	 */
> +	asm(ALTERNATIVE("nop", SET_PSTATE_PAN(1), ARM64_HAS_PAN,
> +			CONFIG_ARM64_PAN));

Can you use uaccess_disable() directly?

> diff --git a/drivers/firmware/Kconfig b/drivers/firmware/Kconfig
> index d8a9859f6102..1f6633b337aa 100644
> --- a/drivers/firmware/Kconfig
> +++ b/drivers/firmware/Kconfig
> @@ -50,6 +50,7 @@ config ARM_SCPI_POWER_DOMAIN
>  
>  config ARM_SDE_INTERFACE
>  	bool "ARM Software Delegated Exception Interface (SDEI)"
> +	depends on ARM64
>  	help
>  	  The Software Delegated Exception Interface (SDEI) is an ARM
>  	  standard for registering callbacks from the platform firmware
> diff --git a/drivers/firmware/arm_sdei.c b/drivers/firmware/arm_sdei.c
> index 4b3c7fd99c34..3ea6a19ae439 100644
> --- a/drivers/firmware/arm_sdei.c
> +++ b/drivers/firmware/arm_sdei.c
> @@ -154,6 +154,7 @@ int sdei_api_event_context(u32 query, u64 *result)
>  	return invoke_sdei_fn(SDEI_1_0_FN_SDEI_EVENT_CONTEXT, query, 0, 0, 0, 0,
>  			      result);
>  }
> +NOKPROBE_SYMBOL(sdei_api_event_context);

Should this be part of the patch introducing arm_sdei.c?

>  
>  static int sdei_api_event_get_info(u32 event, u32 info, u64 *result)
>  {
> diff --git a/include/linux/sdei.h b/include/linux/sdei.h
> index bb3dd000771e..df3fe6373e32 100644
> --- a/include/linux/sdei.h
> +++ b/include/linux/sdei.h
> @@ -32,6 +32,8 @@ enum sdei_conduit_types {
>  	CONDUIT_HVC,
>  };
>  
> +#include <asm/sdei.h>
> +
>  /* Arch code should override this to set the entry point from firmware... */
>  #ifndef sdei_arch_get_entry_point
>  #define sdei_arch_get_entry_point(conduit)	(0)

Same here. This should not be built before the Kconfig change anyway.
Catalin Marinas Oct. 16, 2017, 1:52 p.m. UTC | #7
On Fri, Sep 22, 2017 at 07:26:11PM +0100, James Morse wrote:
> diff --git a/include/linux/cpuhotplug.h b/include/linux/cpuhotplug.h
> index f24bfb2b9a2d..466b949474df 100644
> --- a/include/linux/cpuhotplug.h
> +++ b/include/linux/cpuhotplug.h
> @@ -88,6 +88,7 @@ enum cpuhp_state {
>  	CPUHP_AP_PERF_XTENSA_STARTING,
>  	CPUHP_AP_PERF_METAG_STARTING,
>  	CPUHP_AP_MIPS_OP_LOONGSON3_STARTING,
> +	CPUHP_AP_SDEI_STARTING,

Nitpick: how generic is this as to apply to other architectures?
Probably not, so shall we prefix this with ARM_?

Not actually a strong preferences but similar question for the
definitions in the sdei.h files (and maybe the filenames themselves).
James Morse Oct. 17, 2017, 4:34 p.m. UTC | #8
Hi Catalin,

On 16/10/17 14:41, Catalin Marinas wrote:
> On Fri, Sep 22, 2017 at 07:26:10PM +0100, James Morse wrote:
>> diff --git a/arch/arm64/kernel/sdei-entry.S b/arch/arm64/kernel/sdei-entry.S
>> new file mode 100644
>> index 000000000000..cf12f8da0789
>> --- /dev/null
>> +++ b/arch/arm64/kernel/sdei-entry.S
>> @@ -0,0 +1,122 @@
>> +/*
>> + * Software Delegated Exception entry point from firmware to the kernel.
>> + *
>> + * Copyright (C) 2017 ARM Ltd.
>> + *
>> + * This program is free software; you can redistribute it and/or modify
>> + * it under the terms of the GNU General Public License version 2 as
>> + * published by the Free Software Foundation.
>> + *
>> + * This program is distributed in the hope that it will be useful,
>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>> + * GNU General Public License for more details.
>> + *
>> + * You should have received a copy of the GNU General Public License
>> + * along with this program.  If not, see <http://www.gnu.org/licenses/>.
>> + */
>> +
>> +#include <linux/linkage.h>
>> +
>> +#include <asm/alternative.h>
>> +#include <asm/asm-offsets.h>
>> +#include <asm/assembler.h>
>> +#include <asm/memory.h>
>> +#include <asm/ptrace.h>
>> +#include <asm/sdei.h>
>> +#include <uapi/linux/sdei.h>
>> +
>> +/*
>> + * Software Delegated Exception entry point.
>> + *
>> + * x0: Event number
> 
> Currently the only event number is 0. Shall we plan for having other
> events in the future or we just ignore this argument?

'0' is the only event number specified by the SDEI specification. For use with
RAS the event number is read from the HEST. (and there may me more than one of
them).



>> + * x1: struct sdei_registered_event argument from registration time.
>> + * x2: interrupted PC
>> + * x3: interrupted PSTATE
> [...]
>> +/*
>> + * __sdei_handler() returns one of:
>> + *  SDEI_EV_HANDLED -  success, return to the interrupted context.
>> + *  SDEI_EV_FAILED  -  failure, return this error code to firmare.
>> + *  virtual-address -  success, return to this address.
>> + */
>> +static __kprobes unsigned long _sdei_handler(struct pt_regs *regs,
>> +					     struct sdei_registered_event *arg)
>> +{
>> +	u32 mode;
>> +	int i, err = 0;
>> +	int clobbered_registers = 4;

> Maybe const int here if you want to avoid magic numbers in the 'for'
> loop below. Otherwise it looks like some variable you intend to modify
> in this function.

Sure.

(this was modified in my various attempts to get this working on non-vhe systems
with KVM).


>> +	u64 elr = read_sysreg(elr_el1);
>> +	u32 kernel_mode = read_sysreg(CurrentEL) | 1;	/* +SPSel */
>> +	unsigned long vbar = read_sysreg(vbar_el1);
>> +
>> +	/* Retrieve the missing registers values */
>> +	for (i = 0; i < clobbered_registers; i++) {
>> +		/* from within the handler, this call always succeeds */
>> +		sdei_api_event_context(i, &regs->regs[i]);
>> +	}
>> +
>> +	/*
>> +	 * We didn't take an exception to get here, set PAN. UAO will be cleared
>> +	 * by sdei_event_handler()s set_fs(USER_DS) call.
>> +	 */
>> +	asm(ALTERNATIVE("nop", SET_PSTATE_PAN(1), ARM64_HAS_PAN,
>> +			CONFIG_ARM64_PAN));

> Can you use uaccess_disable() directly?

Wouldn't this invoke sw-pan's __uaccess_ttbr0_disable():
> write_sysreg(ttbr, ttbr0_el1);

Probing depends on VHE if booted at EL2, can we guarantee to have PAN (and
therefore not-SW-PAN) too?

(otherwise I can add 'depends on !ARM64_SW_TTBR0_PAN' to the Kconfig)


>> diff --git a/drivers/firmware/Kconfig b/drivers/firmware/Kconfig
>> index d8a9859f6102..1f6633b337aa 100644
>> --- a/drivers/firmware/Kconfig
>> +++ b/drivers/firmware/Kconfig
>> @@ -50,6 +50,7 @@ config ARM_SCPI_POWER_DOMAIN
>>  
>>  config ARM_SDE_INTERFACE
>>  	bool "ARM Software Delegated Exception Interface (SDEI)"
>> +	depends on ARM64
>>  	help
>>  	  The Software Delegated Exception Interface (SDEI) is an ARM
>>  	  standard for registering callbacks from the platform firmware
>> diff --git a/drivers/firmware/arm_sdei.c b/drivers/firmware/arm_sdei.c
>> index 4b3c7fd99c34..3ea6a19ae439 100644
>> --- a/drivers/firmware/arm_sdei.c
>> +++ b/drivers/firmware/arm_sdei.c
>> @@ -154,6 +154,7 @@ int sdei_api_event_context(u32 query, u64 *result)
>>  	return invoke_sdei_fn(SDEI_1_0_FN_SDEI_EVENT_CONTEXT, query, 0, 0, 0, 0,
>>  			      result);
>>  }
>> +NOKPROBE_SYMBOL(sdei_api_event_context);

> Should this be part of the patch introducing arm_sdei.c?

Yes.


>>  static int sdei_api_event_get_info(u32 event, u32 info, u64 *result)
>>  {
>> diff --git a/include/linux/sdei.h b/include/linux/sdei.h
>> index bb3dd000771e..df3fe6373e32 100644
>> --- a/include/linux/sdei.h
>> +++ b/include/linux/sdei.h
>> @@ -32,6 +32,8 @@ enum sdei_conduit_types {
>>  	CONDUIT_HVC,
>>  };
>>  
>> +#include <asm/sdei.h>
>> +
>>  /* Arch code should override this to set the entry point from firmware... */
>>  #ifndef sdei_arch_get_entry_point
>>  #define sdei_arch_get_entry_point(conduit)	(0)
> 
> Same here. This should not be built before the Kconfig change anyway.

(I've been building this on x86 to check I'd not done anything arm64 specific
outside the arch code...)

Sure, I'll move this hunk and the Kconfig create a place holder asm/sdei.h in
the earlier patch.


Thanks,

James

--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
James Morse Oct. 17, 2017, 4:34 p.m. UTC | #9
Hi Catalin,

On 13/10/17 16:42, Catalin Marinas wrote:
> On Fri, Sep 22, 2017 at 07:26:09PM +0100, James Morse wrote:
>> diff --git a/arch/arm64/include/asm/vmap_stack.h b/arch/arm64/include/asm/vmap_stack.h
>> new file mode 100644
>> index 000000000000..f41d043cac31
>> --- /dev/null
>> +++ b/arch/arm64/include/asm/vmap_stack.h
>> @@ -0,0 +1,41 @@

>> +#ifndef __ASM_VMAP_STACK_H
>> +#define __ASM_VMAP_STACK_H
>> +
>> +#include <linux/vmalloc.h>
>> +#include <asm/memory.h>
>> +#include <asm/pgtable.h>
>> +#include <asm/thread_info.h>
>> +
>> +#ifdef CONFIG_VMAP_STACK
>> +/*
>> + * To ensure that VMAP'd stack overflow detection works correctly, all VMAP'd
>> + * stacks need to have the same alignment.
>> + */
>> +static inline unsigned long *arch_alloc_vmap_stack(size_t stack_size, int node)
>> +{
>> +	return __vmalloc_node_range(stack_size, THREAD_ALIGN,
>> +				    VMALLOC_START, VMALLOC_END,
>> +				    THREADINFO_GFP, PAGE_KERNEL, 0, node,
>> +				    __builtin_return_address(0));
>> +}
>> +
>> +#else
>> +unsigned long *arch_alloc_vmap_stack(size_t stack_size, int node); // link error

> Do you actually need this here? The calling site (at least in this
> patch), is surrounded by CONFIG_VMAP_STACK. Removing the dummy
> declaration should be fine (I haven't read the rest of the patches yet,
> maybe it's needed later;

This was to avoid having to #ifdef the calling site I add in a later patch.


> would a BUILD_BUG do?).

Ah, you can put those in header files, yes that would be clearer.


Thanks,

James





--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
James Morse Oct. 17, 2017, 4:34 p.m. UTC | #10
Hi Catalin,

On 16/10/17 14:52, Catalin Marinas wrote:
> On Fri, Sep 22, 2017 at 07:26:11PM +0100, James Morse wrote:
>> diff --git a/include/linux/cpuhotplug.h b/include/linux/cpuhotplug.h
>> index f24bfb2b9a2d..466b949474df 100644
>> --- a/include/linux/cpuhotplug.h
>> +++ b/include/linux/cpuhotplug.h
>> @@ -88,6 +88,7 @@ enum cpuhp_state {
>>  	CPUHP_AP_PERF_XTENSA_STARTING,
>>  	CPUHP_AP_PERF_METAG_STARTING,
>>  	CPUHP_AP_MIPS_OP_LOONGSON3_STARTING,
>> +	CPUHP_AP_SDEI_STARTING,
> 
> Nitpick: how generic is this as to apply to other architectures?
> Probably not, so shall we prefix this with ARM_?

This may get used on 32bit ARM. I blindly copied PSCI, which doesn't have a
prefix. But it makes sense to have one.


> Not actually a strong preferences but similar question for the
> definitions in the sdei.h files (and maybe the filenames themselves).


Thanks,

James
--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
James Morse Oct. 17, 2017, 4:36 p.m. UTC | #11
Hi Catalin,

On 16/10/17 10:58, Catalin Marinas wrote:
> On Fri, Oct 13, 2017 at 05:50:45PM +0100, James Morse wrote:
>> On 13/10/17 16:31, Catalin Marinas wrote:
>>> On Fri, Sep 22, 2017 at 07:26:05PM +0100, James Morse wrote:
>>>> diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
>>>> index cd52d365d1f0..8e4c7da2b126 100644
>>>> --- a/arch/arm64/kernel/cpufeature.c
>>>> +++ b/arch/arm64/kernel/cpufeature.c
>>
>>>> @@ -1308,3 +1309,25 @@ static int __init enable_mrs_emulation(void)
>>>>  }
>>>>  
>>>>  late_initcall(enable_mrs_emulation);
>>>> +
>>>> +int cpu_copy_el2regs(void *__unused)
>>>> +{
>>>> +	int do_copyregs = 0;
>>>> +
>>>> +	/*
>>>> +	 * Copy register values that aren't redirected by hardware.
>>>> +	 *
>>>> +	 * Before code patching, we only set tpidr_el1, all CPUs need to copy
>>>> +	 * this value to tpidr_el2 before we patch the code. Once we've done
>>>> +	 * that, freshly-onlined CPUs will set tpidr_el2, so we don't need to
>>>> +	 * do anything here.
>>>> +	 */
>>>> +	asm volatile(ALTERNATIVE("mov %0, #1", "mov %0, #0",
>>>> +				 ARM64_HAS_VIRT_HOST_EXTN)
>>>> +		    : "=r" (do_copyregs) : : );
>>>
>>> Can you just do:
>>>
>>> 	if (cpu_have_const_cap(ARM64_HAS_VIRT_HOST_EXTN))
>>> 		write_sysreg(read_sysreg(tpidr_el1), tpidr_el2);
>>>
>>> At this point the capability bits should be set and the jump labels
>>> enabled.
>>
>> These are enabled too early, we haven't done patching yet.
>>
>> We need to copy tpidr_el1 -> tpidr_el2 on all CPUs that are online before code
>> patching.
>>
>> After code patching new CPUs set tpidr_el2 when they come online, so we don't
>> need to do the copy... but this enable method is still called. There is nothing
>> for us to do, and tpidr_el1 now contains junk, so the copy

> Ah, I get it now (should've read the comment but I usually expect the
> code to be obvious; it wasn't, at least to me, in this case ;)).

> You could have added the sysreg copying directly in the asm volatile.

I was trying to stick to the sysreg C accessors, and thought there would be more
registers that needed copying. (I discovered this VHE doesn't remap all the _ELx
registers quite late.)


> Anyway, I think it's better if we keep it entirely in C with this hunk
> (untested):

[...]

Yes, that looks much better. I got tangled up in 'which alternative', but you're
right, they are all applied in one go so it doesn't matter.


>	if (!READ_ONCE(alternatives_applied))
> 		write_sysreg(read_sysreg(tpidr_el1), tpidr_el2);

I don't think this READ_ONCE() is needed, that only matters within the
stop_machine()/alternatives-patching code that modifies the value on one CPU.


Thanks,

James

--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Catalin Marinas Oct. 18, 2017, 10:54 a.m. UTC | #12
On Tue, Oct 17, 2017 at 05:34:13PM +0100, James Morse wrote:
> On 16/10/17 14:41, Catalin Marinas wrote:
> > On Fri, Sep 22, 2017 at 07:26:10PM +0100, James Morse wrote:
> >> +	u64 elr = read_sysreg(elr_el1);
> >> +	u32 kernel_mode = read_sysreg(CurrentEL) | 1;	/* +SPSel */
> >> +	unsigned long vbar = read_sysreg(vbar_el1);
> >> +
> >> +	/* Retrieve the missing registers values */
> >> +	for (i = 0; i < clobbered_registers; i++) {
> >> +		/* from within the handler, this call always succeeds */
> >> +		sdei_api_event_context(i, &regs->regs[i]);
> >> +	}
> >> +
> >> +	/*
> >> +	 * We didn't take an exception to get here, set PAN. UAO will be cleared
> >> +	 * by sdei_event_handler()s set_fs(USER_DS) call.
> >> +	 */
> >> +	asm(ALTERNATIVE("nop", SET_PSTATE_PAN(1), ARM64_HAS_PAN,
> >> +			CONFIG_ARM64_PAN));
> 
> > Can you use uaccess_disable() directly?
> 
> Wouldn't this invoke sw-pan's __uaccess_ttbr0_disable():
> > write_sysreg(ttbr, ttbr0_el1);
> 
> Probing depends on VHE if booted at EL2, can we guarantee to have PAN (and
> therefore not-SW-PAN) too?
> 
> (otherwise I can add 'depends on !ARM64_SW_TTBR0_PAN' to the Kconfig)

We want the Kconfig to be able to include all features. What you can do
though is:

	select ARM64_PAN if ARM64_SW_TTBR0_PAN

With VHE (ARMv8.2) we can guarantee that hardware PAN is around.