Message ID | 20171017174432.1684-1-james.morse@arm.com |
---|---|
Headers | show |
Series | arm64/firmware: Software Delegated Exception Interface | expand |
On Tue, Oct 17, 2017 at 06:44:23PM +0100, James Morse wrote: > Now that KVM uses tpidr_el2 in the same way as Linux's cpu_offset in > tpidr_el1, merge the two. This saves KVM from save/restoring tpidr_el1 > on VHE hosts, and allows future code to blindly access per-cpu variables > without triggering world-switch. > > Signed-off-by: James Morse <james.morse@arm.com> > Reviewed-by: Christoffer Dall <cdall@linaro.org> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> -- To unsubscribe from this list: send the line "unsubscribe devicetree" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, Oct 17, 2017 at 06:44:26PM +0100, James Morse wrote: > The Software Delegated Exception Interface (SDEI) is an ARM standard > for registering callbacks from the platform firmware into the OS. > This is typically used to implement firmware notifications (such as > firmware-first RAS) or promote an IRQ that has been promoted to a > firmware-assisted NMI. > > Add the code for detecting the SDEI version and the framework for > registering and unregistering events. Subsequent patches will add the > arch-specific backend code and the necessary power management hooks. > > Only shared events are supported, power management, private events and > discovery for ACPI systems will be added by later patches. > > Signed-off-by: James Morse <james.morse@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> -- To unsubscribe from this list: send the line "unsubscribe devicetree" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, Oct 17, 2017 at 06:44:27PM +0100, James Morse wrote: > Today the arm64 arch code allocates an extra IRQ stack per-cpu. If we > also have SDEI and VMAP stacks we need two extra per-cpu VMAP stacks. > > Move the VMAP stack allocation out to a helper in a new header file. > This avoids missing THREADINFO_GFP, or getting the all-important alignment > wrong. > > Signed-off-by: James Morse <james.morse@arm.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> -- To unsubscribe from this list: send the line "unsubscribe devicetree" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, Oct 17, 2017 at 06:44:29PM +0100, James Morse wrote: > When a CPU enters an idle lower-power state or is powering off, we > need to mask SDE events so that no events can be delivered while we > are messing with the MMU as the registered entry points won't be valid. > > If the system reboots, we want to unregister all events and mask the CPUs. > For kexec this allows us to hand a clean slate to the next kernel > instead of relying on it to call sdei_{private,system}_data_reset(). > > For hibernate we unregister all events and re-register them on restore, > in case we restored with the SDE code loaded at a different address. > (e.g. KASLR). > > Add all the notifiers necessary to do this. We only support shared events > so all events are left registered and enabled over CPU hotplug. > > Signed-off-by: James Morse <james.morse@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> -- To unsubscribe from this list: send the line "unsubscribe devicetree" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, Oct 17, 2017 at 06:44:30PM +0100, James Morse wrote: > Private SDE events are per-cpu, and need to be registered and enabled > on each CPU. > > Hide this detail from the caller by adapting our {,un}register and > {en,dis}able calls to send an IPI to each CPU if the event is private. > > CPU private events are unregistered when the CPU is powered-off, and > re-registered when the CPU is brought back online. This saves bringing > secondary cores back online to call private_reset() on shutdown, kexec > and resume from hibernate. > > Signed-off-by: James Morse <james.morse@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> -- To unsubscribe from this list: send the line "unsubscribe devicetree" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, Oct 17, 2017 at 06:44:31PM +0100, James Morse wrote: > SDEI inherits the 'use hvc' bit that is also used by PSCI. PSCI does all > its initialisation early, SDEI does its late. > > Remove the __init annotation from acpi_psci_use_hvc(). > > Signed-off-by: James Morse <james.morse@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> -- To unsubscribe from this list: send the line "unsubscribe devicetree" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, Oct 17, 2017 at 06:44:32PM +0100, James Morse wrote: > SDEI defines a new ACPI table to indicate the presence of the interface. > The conduit is discovered in the same way as PSCI. > > For ACPI we need to create the platform device ourselves as SDEI doesn't > have an entry in the DSDT. > > The SDEI platform device should be created after ACPI has been initialised > so that we can parse the table, but before GHES devices are created, which > may register SDE events if they use SDEI as their notification type. > > Signed-off-by: James Morse <james.morse@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> -- To unsubscribe from this list: send the line "unsubscribe devicetree" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, Oct 17, 2017 at 06:44:28PM +0100, James Morse wrote: > The Software Delegated Exception Interface (SDEI) is an ARM standard > for registering callbacks from the platform firmware into the OS. > This is typically used to implement RAS notifications. > > Such notifications enter the kernel at the registered entry-point > with the register values of the interrupted CPU context. Because this > is not a CPU exception, it cannot reuse the existing entry code. > (crucially we don't implicitly know which exception level we interrupted), > > Add an sdei-entry.S asm file to set us up for calling into C code. If > the event interrupted code that had interrupts masked, we always return > to that location. Otherwise we pretend this was an IRQ, and use SDEI's > complete_and_resume call to return to vbar_el1 + offset. > > This allows the kernel to deliver signals to user space processes. For > KVM this triggers the world switch, a quick spin round vcpu_run, then > back into the guest, unless there are pending signals. > > Add sdei_mask_local_cpu() calls to the smp_send_stop() code, this covers > the panic() code-path, which doesn't invoke cpuhotplug notifiers. > > Because we can interrupt entry-from/exit-to another EL, we can't trust the > value in sp_el0 or x29, even if we interrupted the kernel, in this case > the code in entry.S will save/restore sp_el0 and use the value in > __entry_task. > > When we have VMAP stacks we can interrupt the stack-overflow test, which > stirs x0 into sp, meaning we have to have our own VMAP stack. > > Signed-off-by: James Morse <james.morse@arm.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> > +static __kprobes unsigned long _sdei_handler(struct pt_regs *regs, > + struct sdei_registered_event *arg) > +{ > + u32 mode; > + int i, err = 0; > + const int clobbered_registers = 4; > + u64 elr = read_sysreg(elr_el1); > + u32 kernel_mode = read_sysreg(CurrentEL) | 1; /* +SPSel */ > + unsigned long vbar = read_sysreg(vbar_el1); > + > + /* Retrieve the missing registers values */ > + for (i = 0; i < clobbered_registers; i++) { > + /* from within the handler, this call always succeeds */ > + sdei_api_event_context(i, ®s->regs[i]); > + } > + > + /* > + * We didn't take an exception to get here, set PAN. UAO will be cleared > + * by sdei_event_handler()s set_fs(USER_DS) call. > + */ > + asm(ALTERNATIVE("nop", SET_PSTATE_PAN(1), ARM64_HAS_PAN, > + CONFIG_ARM64_PAN)); We have a similar asm in __cpu_suspend_exit(), so there's precedent. As a separate patch (not part of this series), we should move them into a macro, e.g. __uaccess_disable_hw_pan() (enabling as well for consistency).
Hi James, On Tue, Oct 17, 2017 at 06:44:19PM +0100, James Morse wrote: > The Software Delegated Exception Interface (SDEI) is an ARM specification > for registering callbacks from the platform firmware into the OS. > This is intended to be used to implement firmware-first RAS notifications, > but also supports vendor-defined events and binding IRQs as events. > > The document is here: > http://infocenter.arm.com/help/topic/com.arm.doc.den0054a/ARM_DEN0054A_Software_Delegated_Exception_Interface.pdf > > I anticpate once reviewed this series will go via the arm-soc tree, as the > bulk of the code is under /drivers/firmware/. Given the close coupling with the arch code, I'm actually fine taking this all via arm64 so save it being spread across two merge windows. The driver code is a new file, so the scope for conflicts is minimal. Will -- To unsubscribe from this list: send the line "unsubscribe devicetree" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, Oct 17, 2017 at 06:44:29PM +0100, James Morse wrote: > When a CPU enters an idle lower-power state or is powering off, we > need to mask SDE events so that no events can be delivered while we > are messing with the MMU as the registered entry points won't be valid. > > If the system reboots, we want to unregister all events and mask the CPUs. > For kexec this allows us to hand a clean slate to the next kernel > instead of relying on it to call sdei_{private,system}_data_reset(). > > For hibernate we unregister all events and re-register them on restore, > in case we restored with the SDE code loaded at a different address. > (e.g. KASLR). > > Add all the notifiers necessary to do this. We only support shared events > so all events are left registered and enabled over CPU hotplug. > > Signed-off-by: James Morse <james.morse@arm.com> > > --- > Changes since v3: > * Renamed CPUHP enum entry to have an ARM_ prefix. > > drivers/firmware/arm_sdei.c | 228 +++++++++++++++++++++++++++++++++++++++++++- > include/linux/arm_sdei.h | 3 + > include/linux/cpuhotplug.h | 1 + > 3 files changed, 231 insertions(+), 1 deletion(-) [...] > static void sdei_smccc_smc(unsigned long function_id, > unsigned long arg0, unsigned long arg1, > unsigned long arg2, unsigned long arg3, > @@ -544,9 +742,36 @@ static int sdei_probe(struct platform_device *pdev) > return 0; > } > > + err = cpuhp_setup_state_nocalls(CPUHP_AP_ARM_SDEI_STARTING, "SDEI", > + &sdei_cpuhp_up, &sdei_cpuhp_down); > + if (err) { > + pr_warn("Failed to register CPU hotplug notifier...\n"); > + return err; > + } What prevents CPU hotplug events coming in here? Will -- To unsubscribe from this list: send the line "unsubscribe devicetree" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, Oct 17, 2017 at 06:44:30PM +0100, James Morse wrote: > Private SDE events are per-cpu, and need to be registered and enabled > on each CPU. > > Hide this detail from the caller by adapting our {,un}register and > {en,dis}able calls to send an IPI to each CPU if the event is private. > > CPU private events are unregistered when the CPU is powered-off, and > re-registered when the CPU is brought back online. This saves bringing > secondary cores back online to call private_reset() on shutdown, kexec > and resume from hibernate. > > Signed-off-by: James Morse <james.morse@arm.com> > --- > drivers/firmware/arm_sdei.c | 242 +++++++++++++++++++++++++++++++++++++++++--- > 1 file changed, 228 insertions(+), 14 deletions(-) > > diff --git a/drivers/firmware/arm_sdei.c b/drivers/firmware/arm_sdei.c > index 28e4c4cbb16d..5598d9ba8b5d 100644 > --- a/drivers/firmware/arm_sdei.c > +++ b/drivers/firmware/arm_sdei.c > @@ -21,9 +21,11 @@ > #include <linux/acpi.h> > #include <linux/arm_sdei.h> > #include <linux/arm-smccc.h> > +#include <linux/atomic.h> > #include <linux/bitops.h> > #include <linux/compiler.h> > #include <linux/cpuhotplug.h> > +#include <linux/cpu.h> > #include <linux/cpu_pm.h> > #include <linux/errno.h> > #include <linux/hardirq.h> > @@ -64,12 +66,49 @@ struct sdei_event { > u8 priority; > > /* This pointer is handed to firmware as the event argument. */ > - struct sdei_registered_event *registered; > + union { > + /* Shared events */ > + struct sdei_registered_event *registered; > + > + /* CPU private events */ > + struct sdei_registered_event __percpu *private_registered; > + }; > }; > > static LIST_HEAD(sdei_events); > static DEFINE_SPINLOCK(sdei_events_lock); > > +/* When frozen, cpu-hotplug notifiers shouldn't unregister/re-register events */ > +static bool frozen; It looks like this can be accessed concurrently, so you need at least READ_ONCE/WRITE_ONCE. > /* When entering idle, mask/unmask events for this cpu */ > static int sdei_pm_notifier(struct notifier_block *nb, unsigned long action, > void *data) > @@ -610,6 +813,7 @@ static int sdei_device_freeze(struct device *dev) > { > int err; > > + frozen = true; > err = sdei_event_unregister_all(); Are the release semantics from spin_unlock in sdei_event_unregister_all sufficient for the ordering guarantees you need? Will -- To unsubscribe from this list: send the line "unsubscribe devicetree" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Hi Will, On 18/10/17 18:19, Will Deacon wrote: > On Tue, Oct 17, 2017 at 06:44:30PM +0100, James Morse wrote: >> Private SDE events are per-cpu, and need to be registered and enabled >> on each CPU. >> >> Hide this detail from the caller by adapting our {,un}register and >> {en,dis}able calls to send an IPI to each CPU if the event is private. >> >> CPU private events are unregistered when the CPU is powered-off, and >> re-registered when the CPU is brought back online. This saves bringing >> secondary cores back online to call private_reset() on shutdown, kexec >> and resume from hibernate. >> diff --git a/drivers/firmware/arm_sdei.c b/drivers/firmware/arm_sdei.c >> index 28e4c4cbb16d..5598d9ba8b5d 100644 >> --- a/drivers/firmware/arm_sdei.c >> +++ b/drivers/firmware/arm_sdei.c >> @@ -610,6 +813,7 @@ static int sdei_device_freeze(struct device *dev) >> { >> int err; >> >> + frozen = true; >> err = sdei_event_unregister_all(); > Are the release semantics from spin_unlock in sdei_event_unregister_all > sufficient for the ordering guarantees you need? ... ordering ... The hotplug notifiers don't touch that lock, so at the first level: no. It looks like I was relying on the cpu-hotplug code using stop-machine for its work, and the spinlocks changing the pre-empt count for this to work. Which is not something I want to debug if it changes! I'll post a patch changing this bool to a more sensible atomic type. Thanks, James -- To unsubscribe from this list: send the line "unsubscribe devicetree" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Hi Will, On 18/10/17 18:17, Will Deacon wrote: > On Tue, Oct 17, 2017 at 06:44:29PM +0100, James Morse wrote: >> When a CPU enters an idle lower-power state or is powering off, we >> need to mask SDE events so that no events can be delivered while we >> are messing with the MMU as the registered entry points won't be valid. >> >> If the system reboots, we want to unregister all events and mask the CPUs. >> For kexec this allows us to hand a clean slate to the next kernel >> instead of relying on it to call sdei_{private,system}_data_reset(). >> >> For hibernate we unregister all events and re-register them on restore, >> in case we restored with the SDE code loaded at a different address. >> (e.g. KASLR). >> >> Add all the notifiers necessary to do this. We only support shared events >> so all events are left registered and enabled over CPU hotplug. >> static void sdei_smccc_smc(unsigned long function_id, >> unsigned long arg0, unsigned long arg1, >> unsigned long arg2, unsigned long arg3, >> @@ -544,9 +742,36 @@ static int sdei_probe(struct platform_device *pdev) >> return 0; >> } >> >> + err = cpuhp_setup_state_nocalls(CPUHP_AP_ARM_SDEI_STARTING, "SDEI", >> + &sdei_cpuhp_up, &sdei_cpuhp_down); >> + if (err) { >> + pr_warn("Failed to register CPU hotplug notifier...\n"); >> + return err; >> + } > > What prevents CPU hotplug events coming in here? Nothing, but what would trigger them? This is after the arch code has brought up secondaries, but before user space is running. No events are registered by this point, so all that could go wrong is a freshly-arrived CPU is unmasked, then this probe method fails and assumes it left cores masked. If you think its possible I can post a patch to bolt it down some more. Thanks for taking a look! Thanks, James -- To unsubscribe from this list: send the line "unsubscribe devicetree" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, Oct 24, 2017 at 06:34:19PM +0100, James Morse wrote: > Hi Will, > > On 18/10/17 18:17, Will Deacon wrote: > > On Tue, Oct 17, 2017 at 06:44:29PM +0100, James Morse wrote: > >> When a CPU enters an idle lower-power state or is powering off, we > >> need to mask SDE events so that no events can be delivered while we > >> are messing with the MMU as the registered entry points won't be valid. > >> > >> If the system reboots, we want to unregister all events and mask the CPUs. > >> For kexec this allows us to hand a clean slate to the next kernel > >> instead of relying on it to call sdei_{private,system}_data_reset(). > >> > >> For hibernate we unregister all events and re-register them on restore, > >> in case we restored with the SDE code loaded at a different address. > >> (e.g. KASLR). > >> > >> Add all the notifiers necessary to do this. We only support shared events > >> so all events are left registered and enabled over CPU hotplug. > > >> static void sdei_smccc_smc(unsigned long function_id, > >> unsigned long arg0, unsigned long arg1, > >> unsigned long arg2, unsigned long arg3, > >> @@ -544,9 +742,36 @@ static int sdei_probe(struct platform_device *pdev) > >> return 0; > >> } > >> > >> + err = cpuhp_setup_state_nocalls(CPUHP_AP_ARM_SDEI_STARTING, "SDEI", > >> + &sdei_cpuhp_up, &sdei_cpuhp_down); > >> + if (err) { > >> + pr_warn("Failed to register CPU hotplug notifier...\n"); > >> + return err; > >> + } > > > > What prevents CPU hotplug events coming in here? > > Nothing, but what would trigger them? This is after the arch code has brought up > secondaries, but before user space is running. But as we recently discovered (ae2e972dae3c), userspace can be running really early due to an initrd and I don't think that necessarily rules out hotplug operations. > No events are registered by this point, so all that could go wrong is a > freshly-arrived CPU is unmasked, then this probe method fails and assumes it > left cores masked. If you think its possible I can post a patch to bolt it down > some more. If the race is benign, could you avoid using the _nocalls registration function and drop the subsequent IPI? Will -- To unsubscribe from this list: send the line "unsubscribe devicetree" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Hi Will, On 24/10/17 18:34, James Morse wrote: > On 18/10/17 18:19, Will Deacon wrote: >> On Tue, Oct 17, 2017 at 06:44:30PM +0100, James Morse wrote: >>> Private SDE events are per-cpu, and need to be registered and enabled >>> on each CPU. >>> >>> Hide this detail from the caller by adapting our {,un}register and >>> {en,dis}able calls to send an IPI to each CPU if the event is private. >>> >>> CPU private events are unregistered when the CPU is powered-off, and >>> re-registered when the CPU is brought back online. This saves bringing >>> secondary cores back online to call private_reset() on shutdown, kexec >>> and resume from hibernate. > >>> diff --git a/drivers/firmware/arm_sdei.c b/drivers/firmware/arm_sdei.c >>> index 28e4c4cbb16d..5598d9ba8b5d 100644 >>> --- a/drivers/firmware/arm_sdei.c >>> +++ b/drivers/firmware/arm_sdei.c >>> @@ -610,6 +813,7 @@ static int sdei_device_freeze(struct device *dev) >>> { >>> int err; >>> >>> + frozen = true; >>> err = sdei_event_unregister_all(); > >> Are the release semantics from spin_unlock in sdei_event_unregister_all >> sufficient for the ordering guarantees you need? > > ... ordering ... > > The hotplug notifiers don't touch that lock, so at the first level: no. > > It looks like I was relying on the cpu-hotplug code using stop-machine for its > work, and the spinlocks changing the pre-empt count for this to work. Which is > not something I want to debug if it changes! > > I'll post a patch changing this bool to a more sensible atomic type. Here I show my ignorance: atomic_t didn't do what I thought. But I do take the spinlock inside the hotplug callback, so I'll move the variable in there. Thanks, James -- To unsubscribe from this list: send the line "unsubscribe devicetree" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 01/11/17 15:59, James Morse wrote: > dpm_suspend() calls the freeze/thaw callbacks for hibernate before > disable_non_bootcpus() takes down secondaries. > > This leads to a fun race where the freeze/thaw callbacks reset the > SDEI interface (as we may be restoring a kernel with a different > layout due to KASLR), then the cpu-hotplug callbacks come in to > save the current state, which has already been reset. > > We solve this with a 'frozen' flag that stops the hotplug callback > from overwriting the saved values. > > This patch moves the flag under the 'events' spinlock we take > in the hotplug callbacks, to avoid depending on cpu-hotplug's > mechanics to ensure the callback sees the correct value. Scratch this. All this has really done is moved the race around. Will had a much better suggestion that makes it look like all the CPUs are down. That makes hotplug and power-management behave the same. Thanks, James -- To unsubscribe from this list: send the line "unsubscribe devicetree" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, Nov 08, 2017 at 04:06:24PM +0000, James Morse wrote: > dpm_suspend() calls the freeze/thaw callbacks for hibernate before > disable_non_bootcpus() takes down secondaries. > > This leads to a fun race where the freeze/thaw callbacks reset the > SDEI interface (as we may be restoring a kernel with a different > layout due to KASLR), then the cpu-hotplug callbacks come in to > save the current state, which has already been reset. > > I tried to solve this with a 'frozen' flag that stops the hotplug > callback from overwriting the saved values. Instead this just > moves the race around and makes it even harder to think about. > > Instead, make it look like the secondaries have gone offline. > Call cpuhp_remove_state() in the freeze callback, this will call the > teardown hook on all online CPUs, then remove the state. This saves > all private events and makes future CPU up/down events invisible. > > Change sdei_event_unregister_all()/sdei_reregister_events() to > only save/restore shared events, which are all that is left. With > this we can remove the frozen flag. We can remove the device > suspend/resume calls too as cpuhotplug's teardown call has masked > the CPUs. > > All that is left is the reboot notifier, (which was abusing the > frozen flag). Call cpuhp_remove_state() to make it look like > secondary CPUs have gone offline. > > Suggested-by: Will Deacon <will.deacon@arm.com> > Signed-off-by: James Morse <james.morse@arm.com> > --- > drivers/firmware/arm_sdei.c | 60 +++++++++++++++++++++++---------------------- > 1 file changed, 31 insertions(+), 29 deletions(-) Thanks, this appears to address my concerns. It's too late for 4.15 now, but please resend for 4.16 and Catalin can pick this series up. Will -- To unsubscribe from this list: send the line "unsubscribe devicetree" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html