Message ID: alpine.DEB.2.00.1106151349520.11405@localhost6.localdomain6
On Thu, 16 Jun 2011, Russell King - ARM Linux wrote:

> On Wed, Jun 15, 2011 at 02:35:26PM +0100, Frank Hofmann wrote:
>> this change is perfect; with this, the hibernation support code turns
>> into the attached.
>
> Should I assume that's an ack for all the patches?

Sorry for the late reply, I've been away.

I've only tested on s3c64xx (for s2ram and s2disk) and omap3 (s2disk only,
because the omap code hasn't been "genericized" yet); the changes are ok
there. The remainder apply and compile ok.

If that's sufficient, then:
Acked-by: Frank Hofmann <frank.hofmann@tomtom.com>

>> That's both better and simpler: perform a full suspend/resume cycle
>> (via resetting in the cpu_suspend "finisher") after the snapshot image
>> has been created, instead of shoehorning a return into this.
>
> It's now not so difficult to have an error code returned from the
> finisher function - the only thing we have to make sure of is that
> the cpu_do_suspend helper just saves state and does not make any
> state changes.
>
> We can then do this, which makes it possible for the finisher to
> return, and we propagate its return value. We also ensure that a
> path through from cpu_resume will result in a zero return value
> from cpu_suspend.
>
> This isn't an invitation for people to make the S2RAM path return
> after they time out waiting for suspend to happen - that's potentially
> dangerous, because in that case the suspend may happen while we're
> resuming devices, which wouldn't be nice.

You're right that there's some risk the ability to return an error here
gets misinterpreted as an invitation to use error returns for indicating
state machine conditions.

What I'm wondering about is the use case for having an error return
opportunity in this case (beyond reporting the error as such). Isn't it
rather that most finishers would succeed, and at best do a BUG_ON() on
failure, because the system state at that point isn't such as to make
failure trivially recoverable?

For hibernation, the ability to force the entire down/up transition
before writing the snapshot image out is actually very beneficial for
reliability - one knows that the device/cpu side of suspend/resume has
already worked when the snapshot is being written, without having to
wait for reboot/image load/resume to test that. One would want to go
through suspend/resume even if the memory snapshot operation,
swsusp_save, errors. A failure of swsusp_save doesn't make
suspend/resume itself a failure; it's therefore desirable to propagate
that return code in other ways (and keep the finisher "unconditional").

I'm not opposed to this addition as such, but I'm curious:

* If an error occurs, how can the caller of cpu_suspend recover from it?
* What state is the system in after an error in this codepath?
* What subsystems / users would do anything with it other than BUG_ON()?

Also, consider possible errors in the SoC-specific code on the resume
side; situations like a failure to perform SoC-iROM calls, e.g. for an
attempted secure state restore, would result in errors that can't be
propagated by this mechanism. I.e. there are still failure modes which
require platform-specific intervention of sorts, and platform-specific
error propagation/handling, even were cpu_suspend to return error codes.

FrankH.
> diff -u b/arch/arm/kernel/sleep.S b/arch/arm/kernel/sleep.S
> --- b/arch/arm/kernel/sleep.S
> +++ b/arch/arm/kernel/sleep.S
> @@ -12,7 +12,6 @@
>   * r1 = v:p offset
>   * r2 = suspend function arg0
>   * r3 = suspend function
> - * Note: does not return until system resumes
>   */
>  ENTRY(cpu_suspend)
>  	stmfd	sp!, {r4 - r11, lr}
> @@ -26,7 +25,7 @@
>  #endif
>  	mov	r6, sp			@ current virtual SP
>  	sub	sp, sp, r5		@ allocate CPU state on stack
> -	mov	r0, sp			@ save pointer
> +	mov	r0, sp			@ save pointer to CPU save block
>  	add	ip, ip, r1		@ convert resume fn to phys
>  	stmfd	sp!, {r1, r6, ip}	@ save v:p, virt SP, phys resume fn
>  	ldr	r5, =sleep_save_sp
> @@ -55,10 +54,17 @@
>  #else
>  	bl	__cpuc_flush_kern_all
>  #endif
> +	adr	lr, BSYM(cpu_suspend_abort)
>  	ldmfd	sp!, {r0, pc}		@ call suspend fn
>  ENDPROC(cpu_suspend)
>  	.ltorg
>
> +cpu_suspend_abort:
> +	ldmia	sp!, {r1 - r3}		@ pop v:p, virt SP, phys resume fn
> +	mov	sp, r2
> +	ldmfd	sp!, {r4 - r11, pc}
> +ENDPROC(cpu_suspend_abort)
> +
>  /*
>   * r0 = control register value
>   * r1 = v:p offset (preserved by cpu_do_resume)
> @@ -88,6 +94,7 @@
>  cpu_resume_after_mmu:
>  	str	r5, [r2, r4, lsl #2]	@ restore old mapping
>  	mcr	p15, 0, r0, c1, c0, 0	@ turn on D-cache
> +	mov	r0, #0			@ return zero on success
>  	ldmfd	sp!, {r4 - r11, pc}
>  ENDPROC(cpu_resume_after_mmu)
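To illustrate the resulting contract, here is a minimal sketch of a
finisher using the new error return, assuming an int-returning
cpu_suspend prototype to match the assembly above; the acmesoc_* names
and the -EIO value are hypothetical, not from any posted patch:

	/*
	 * Finisher: runs after cpu_suspend has saved the CPU state.
	 * With the change above it may now return; a nonzero value
	 * unwinds through cpu_suspend_abort and becomes cpu_suspend's
	 * return value, while a real resume returns 0 via cpu_resume.
	 */
	static int acmesoc_suspend_finisher(unsigned long arg)
	{
		if (!acmesoc_enter_lowpower())	/* hypothetical SoC hook */
			return -EIO;
		return 0;	/* not reached if power is actually cut */
	}

	int ret = cpu_suspend(0, virt_to_phys(0), 0,
			      acmesoc_suspend_finisher);
	/* ret == 0: resumed through cpu_resume; ret < 0: finisher bailed */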
On Mon, Jun 20, 2011 at 01:32:47PM +0100, Frank Hofmann wrote:
> I've only tested on s3c64xx (for s2ram and s2disk) and omap3 (s2disk only
> due to the omap code not having been "genericized" yet), the changes are
> ok there.
> The remainder apply and compile ok.
>
> If that's sufficient, then:
> Acked-by: Frank Hofmann <frank.hofmann@tomtom.com>

I've added that to the core stuff and s3c64xx.

> You're right there's some risk that the ability to return an error here
> is misinterpreted as an invitation to use error returns for indicating
> state machine conditions.

Well, an error return undoes (and probably corrupts) the state which
cpu_suspend saved, so it wouldn't be useful for state machine stuff - I
wonder whether we should poison the saved pointers just to reinforce
this point.

> What I'm wondering about is the use case for having an error return
> opportunity in this case (beyond reporting the error as such). Isn't it
> rather that most finishers would succeed, and at best do a BUG_ON() on
> failure, because the system state then isn't such as to make failure
> trivially recoverable?

For s2ram, there's no reason to fail at this point. The only thing left
to do is whatever is required to cause the system to enter the low power
mode.

In the case where you need to place the SDRAM into self-refresh mode
manually, the timing from entering self-refresh to the point where power
is cut to the CPU is normally well defined, and if for whatever reason
power doesn't get cut, there's very little which can be done to re-awaken
the system (you'd have to ensure that the code to do that was already in
the I-cache and whatever data was required was already in registers).

And as I mentioned, if you time out waiting for the system to power off,
and you've done everything you should have done, are you sure that the
system won't power off on you unexpectedly if you return with an error?
That would mean you have corrupted state saved (and possibly corrupted
filesystems).

So, I don't think S2RAM would have any practical use for an error return
path. It's more for hibernation.

> For hibernation, the ability to force the entire down/up transition
> before writing the snapshot image out is actually very beneficial for
> reliability - one knows that the device/cpu side of suspend/resume has
> already worked when the snapshot is being written, without having to
> wait for reboot/image load/resume to test that.

It's not a test of the suspend/resume path, though - device state is very
different between suspending in some way followed by an immediate resume
without resetting, and a real suspend, power off, resume cycle. The
former doesn't really test the resume paths - a register which has been
forgotten won't be found any other way than by cutting the power, and
you'll only ever find that on a proper resume-from-power-off.

> I'm not opposed to this addition as such, but I'm curious:
> * If an error occurs, how can the caller of cpu_suspend recover from it?

If an error occurs from cpu_suspend(), the caller needs to undo any
modification to state which it made while saving, and return an error.

> * What state is the system in after an error in this codepath?

We can arrange for another callback into CPU-specific code to sort out
anything it has done - iow, an error return from cpu_suspend should
ensure that we have the same state present as at the point it was
called.
We currently do not need that, as cpu_suspend() just saves state without
modifying anything, but if we ended up with a CPU which required stuff to
be modified, we'd have to undo those modifications before returning an
error.

> * What subsystems / users would do anything with it other than BUG_ON()?

The error code from the platform_suspend_ops ->enter method is already
dealt with - by going through the resume steps before returning the
error code. Whether hibernate has similar functionality I don't know.

What I do think is that if hibernate needs to return an error code, we'd
better do that without doing things like taking the MMU down and running
the resume paths just to get there. If we have some error to propagate,
we want to propagate it back as easily as possible.
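As a sketch of that already-handled ->enter path (again with
hypothetical acmesoc_* names, and assuming the int-returning cpu_suspend
discussed above), a platform could simply hand the finisher's error back
to the suspend core, which resumes devices before reporting it:

	static int acmesoc_pm_enter(suspend_state_t state)
	{
		/* any error from the finisher propagates out of ->enter;
		 * the suspend core then walks the device resume path
		 * before returning the error to userspace */
		return cpu_suspend(0, virt_to_phys(0), 0,
				   acmesoc_suspend_finisher);
	}

	static const struct platform_suspend_ops acmesoc_pm_ops = {
		.valid	= suspend_valid_only_mem,
		.enter	= acmesoc_pm_enter,
	};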
Frank Hofmann wrote:
> On Mon, 13 Jun 2011, Russell King - ARM Linux wrote:
>
> Hi Russell,
>
> this change is perfect; with this, the hibernation support code turns
> into the attached.
>
> That's both better and simpler: perform a full suspend/resume cycle
> (via resetting in the cpu_suspend "finisher") after the snapshot image
> has been created, instead of shoehorning a return into this.
>
> FrankH.

> +
> +u8 __swsusp_resume_stk[PAGE_SIZE/2] __nosavedata;

This looks dangerous: there is no alignment constraint, but the stack
should be aligned to 8 bytes.

Matthieu
On Wed, 29 Jun 2011, Matthieu CASTET wrote:

> Frank Hofmann wrote:
>> On Mon, 13 Jun 2011, Russell King - ARM Linux wrote:
>>
>> Hi Russell,
>>
>> this change is perfect; with this, the hibernation support code turns
>> into the attached.
>>
>> That's both better and simpler: perform a full suspend/resume cycle
>> (via resetting in the cpu_suspend "finisher") after the snapshot image
>> has been created, instead of shoehorning a return into this.
>>
>> FrankH.
>
>> +
>> +u8 __swsusp_resume_stk[PAGE_SIZE/2] __nosavedata;
> This looks dangerous: there is no alignment constraint, but the stack
> should be aligned to 8 bytes.

Uh - sorry. I used to have both the __nosavedata and
__attribute__((__aligned__(PAGE_SIZE/2))) attributes there. That must
have got lost at some point.

It's an artifact of the build that things turn out ok by default: the
__nosavedata forces a separate section (page), and arch/arm is linked
before kernel/power (the other user of __nosavedata), hence this block,
due to the way the kernel build works, ends up just fine. But, as you
say, not by intent / declaration.

Have you seen Will Deacon's suggested kexec changes? That keeps a
"reset stack" page around, _elsewhere_, and I've been considering using
that. In the end, all swsusp_arch_resume() really requires is a stack
page that's guaranteed to be outside the target kernel data, and thereby
left alone by the restore. __nosavedata is merely one way.

> Matthieu

Thanks,
FrankH.
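The declaration with both attributes, as Frank describes it, would read
as follows (a sketch of the fix he outlines, not a posted patch):

	/* kept out of the snapshot image by __nosavedata; aligned
	 * explicitly rather than by accident of link order */
	u8 __swsusp_resume_stk[PAGE_SIZE/2]
		__nosavedata __attribute__((__aligned__(PAGE_SIZE/2)));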
Hi Frank,

On Wed, Jun 29, 2011 at 04:14:23PM +0100, Frank Hofmann wrote:
> Have you seen Will Deacon's suggested kexec changes? That keeps a
> "reset stack" page around, _elsewhere_, and I've been considering using
> that. In the end, all swsusp_arch_resume() really requires is a stack
> page that's guaranteed to be outside the target kernel data, and thereby
> left alone by the restore. __nosavedata is merely one way.

For what it's worth, in v4 of that series (hopefully I'll post it soon)
I'm starting to think about SMP kexec, which necessitates that the
reserved stack page is placed at a fixed location. I'm planning to
memblock_reserve the page immediately below swapper for this.

Will
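In outline, that reservation might look like the following one-liner; a
sketch only, inferring the address choice from Will's description rather
than from any posted patch:

	/* reserve the page immediately below swapper_pg_dir so the
	 * reset stack has a fixed, known location across kernels */
	memblock_reserve(__pa(swapper_pg_dir) - PAGE_SIZE, PAGE_SIZE);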
Frank Hofmann wrote:
> On Mon, 13 Jun 2011, Russell King - ARM Linux wrote:
>
>> On Mon, Jun 13, 2011 at 02:20:12PM +0100, Frank Hofmann wrote:
>>>
>>> On Mon, 13 Jun 2011, Russell King - ARM Linux wrote:
>>>
>>>> On Mon, Jun 13, 2011 at 01:04:02PM +0100, Frank Hofmann wrote:
>>>>> To make it clear: IF AND ONLY IF your suspend(-to-ram) func looks like:
>>>>>
>>>>> ENTRY(acmeSoC_cpu_suspend)
>>>>> 	stmfd	sp!, {r4-r12,lr}
>>>>> 	ldr	r3, resume_mmu_done
>>>>> 	bl	cpu_suspend
>>>>> resume_mmu_done:
>>>>> 	ldmfd	sp!, {r3-r12,pc}
>>>>> ENDPROC(acmeSoC_cpu_suspend)
>>>>
>>>> Nothing has that - because you can't execute that ldmfd after the call
>>>> to cpu_suspend returns. I don't think you've understood what I said on
>>>> that subject in the previous thread.
>>>
>>> Ok, to illustrate a bit more, what is ok and what not.
>>
>> Actually, we can do something about cpu_suspend.
>>
>> Currently cpu_suspend is not like a normal C function - when it's called
>> it returns normally to a bunch of code which is not expected to return.
>> The return path is via code pointed to by 'r3'.
>>
>> It also corrupts a bunch of registers in ways which make it non-compliant
>> with a C API.
>>
>> If we do make this compliant as a normal C-like function, it eliminates
>> this register saving. We also swap 'lr' and 'r3', so cpu_suspend
>> effectively only returns to following code on resume - and r3 points
>> to the suspend code.
>
> Hi Russell,
>
> this change is perfect; with this, the hibernation support code turns
> into the attached.
>
> That's both better and simpler: perform a full suspend/resume cycle
> (via resetting in the cpu_suspend "finisher") after the snapshot image
> has been created, instead of shoehorning a return into this.
>
> +static void notrace __swsusp_arch_restore_image(void)
> +{
> +	extern struct pbe *restore_pblist;
> +	struct pbe *pbe;
> +
> +	cpu_switch_mm(swapper_pg_dir, &init_mm);
> +
> +	for (pbe = restore_pblist; pbe; pbe = pbe->next)
> +		copy_page(pbe->orig_address, pbe->address);
> +

One question: isn't it dangerous to modify the code we are running from?

I believe the code shouldn't change too much between the kernel doing
the resume and the resumed kernel, and the copy routine should fit in
the instruction cache, but I want to be sure it doesn't cause any
problems on recent ARM cores (instruction prefetching, ...).

Matthieu
Frank Hofmann wrote:
> -----Original Message-----
> From: Matthieu CASTET [mailto:matthieu.castet@parrot.com]
> Frank Hofmann wrote:
>> [ ... ]
>> > +static void notrace __swsusp_arch_restore_image(void)
>> > +{
>> > +	extern struct pbe *restore_pblist;
>> > +	struct pbe *pbe;
>> > +
>> > +	cpu_switch_mm(swapper_pg_dir, &init_mm);
>> > +
>> > +	for (pbe = restore_pblist; pbe; pbe = pbe->next)
>> > +		copy_page(pbe->orig_address, pbe->address);
>> > +
>>
>> One question: isn't it dangerous to modify the code we are running from?
>>
>> I believe the code shouldn't change too much between the kernel doing
>> the resume and the resumed kernel, and the copy routine should fit in
>> the instruction cache, but I want to be sure it doesn't cause any
>> problems on recent ARM cores (instruction prefetching, ...).
>>
>> Matthieu
>
> Hi Matthieu,
>
> this isn't new behaviour in _this_ rev of the patch ...

Yes.

> and yes, it is dangerous to modify code where you're running. Except
> that this isn't happening in the current environment;

You are modifying it - you're just putting the same code back in place
(modulo dynamic patching of the code: ftrace, kprobes, ...).

> If you're resuming via some other mechanism but the kernel's snapshot
> image loading code (and only jump into swsusp_arch_resume to kickstart
> resume) then it's up to you how you get the kernel text into place.

Yes.

> I've not experimented with resuming "foreign" images; how would one
> create such, and bypass the checks on load?

I wasn't suggesting that.

> It's less of a problem for the copy loop itself, really, as that's
> definitely cache-hot. But it's an issue for what happens _after_ the
> copy loop. If the code for cpu_resume / cpu_reset is not at the places
> where the resuming code expects it to be (i.e. if there's a mismatch
> between the resuming and to-be-resumed kernels wrt. that), then things
> will jump off into code nirvana.
>
> Why are you asking about this?

While reading the code I was surprised by that, and by the fact that
there wasn't any comment about it. Looking at other implementations,
only x86_64 seems to need to relocate the copy code.

Matthieu
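For reference, the cache/TLB maintenance that guards the jump after the
copy loop lives in __swsusp_arch_reset() in the patch below; condensed
here with explanatory comments added (the comments are editorial, not
part of the patch):

	static void notrace __swsusp_arch_reset(void)
	{
		void (*phys_reset)(unsigned long) =
			(void (*)(unsigned long))virt_to_phys(cpu_reset);

		/* push the freshly copied pages out of the data cache and
		 * drop any stale TLB entries before executing from them */
		flush_tlb_all();
		flush_cache_all();

		/* 1:1 map the kernel so the MMU-off reset path can run */
		identity_mapping_add(swapper_pg_dir, 0, TASK_SIZE);
		flush_tlb_all();
		flush_cache_all();

		cpu_proc_fin();		/* shut caches/MMU down cleanly */
		flush_tlb_all();
		flush_cache_all();

		/* jump to the physical address of cpu_resume; from here
		 * on only the restored kernel's text is executed */
		phys_reset(virt_to_phys(cpu_resume));
	}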
diff --git a/arch/arm/include/asm/memory.h b/arch/arm/include/asm/memory.h
index af44a8f..e78d4f5 100644
--- a/arch/arm/include/asm/memory.h
+++ b/arch/arm/include/asm/memory.h
@@ -246,6 +246,7 @@ static inline void *phys_to_virt(phys_addr_t x)
  */
 #define __pa(x)			__virt_to_phys((unsigned long)(x))
 #define __va(x)			((void *)__phys_to_virt((unsigned long)(x)))
+#define __pa_symbol(x)		__pa(RELOC_HIDE((unsigned long)(x),0))
 #define pfn_to_kaddr(pfn)	__va((pfn) << PAGE_SHIFT)
 
 /*
diff --git a/arch/arm/kernel/Makefile b/arch/arm/kernel/Makefile
index a5b31af..c64d715 100644
--- a/arch/arm/kernel/Makefile
+++ b/arch/arm/kernel/Makefile
@@ -30,6 +30,7 @@ obj-$(CONFIG_ARTHUR)		+= arthur.o
 obj-$(CONFIG_ISA_DMA)		+= dma-isa.o
 obj-$(CONFIG_PCI)		+= bios32.o isa.o
 obj-$(CONFIG_PM_SLEEP)		+= sleep.o
+obj-$(CONFIG_HIBERNATION)	+= hibernate.o
 obj-$(CONFIG_HAVE_SCHED_CLOCK)	+= sched_clock.o
 obj-$(CONFIG_SMP)		+= smp.o smp_tlb.o
 obj-$(CONFIG_HAVE_ARM_SCU)	+= smp_scu.o
diff --git a/arch/arm/kernel/hibernate.c b/arch/arm/kernel/hibernate.c
new file mode 100644
index 0000000..bf1d2ef
--- /dev/null
+++ b/arch/arm/kernel/hibernate.c
@@ -0,0 +1,152 @@
+/*
+ * Hibernation support specific for ARM
+ *
+ * Derived from work on ARM hibernation support by:
+ *
+ * Ubuntu project, hibernation support for mach-dove
+ * Copyright (C) 2010 Nokia Corporation (Hiroshi Doyu)
+ * Copyright (C) 2010 Texas Instruments, Inc. (Teerth Reddy et al.)
+ * https://lkml.org/lkml/2010/6/18/4
+ * https://lists.linux-foundation.org/pipermail/linux-pm/2010-June/027422.html
+ * https://patchwork.kernel.org/patch/96442/
+ *
+ * Copyright (C) 2006 Rafael J. Wysocki <rjw@sisk.pl>
+ *
+ * License terms: GNU General Public License (GPL) version 2
+ */
+
+#include <linux/mm.h>
+#include <linux/suspend.h>
+#include <asm/tlbflush.h>
+#include <asm/cacheflush.h>
+
+extern const void __nosave_begin, __nosave_end;
+
+int pfn_is_nosave(unsigned long pfn)
+{
+	unsigned long nosave_begin_pfn = __pa_symbol(&__nosave_begin) >> PAGE_SHIFT;
+	unsigned long nosave_end_pfn = PAGE_ALIGN(__pa_symbol(&__nosave_end)) >> PAGE_SHIFT;
+
+	return (pfn >= nosave_begin_pfn) && (pfn < nosave_end_pfn);
+}
+
+void notrace save_processor_state(void)
+{
+	WARN_ON(num_online_cpus() != 1);
+	flush_thread();
+	local_fiq_disable();
+}
+
+void notrace restore_processor_state(void)
+{
+	local_fiq_enable();
+}
+
+u8 __swsusp_resume_stk[PAGE_SIZE/2] __nosavedata;
+static int __swsusp_arch_resume_ret;
+
+static void notrace __swsusp_arch_reset(void)
+{
+	extern void cpu_resume(void);
+	void (*phys_reset)(unsigned long) =
+		(void (*)(unsigned long))virt_to_phys(cpu_reset);
+
+	flush_tlb_all();
+	flush_cache_all();
+
+	identity_mapping_add(swapper_pg_dir, 0, TASK_SIZE);
+	flush_tlb_all();
+	flush_cache_all();
+
+	cpu_proc_fin();
+	flush_tlb_all();
+	flush_cache_all();
+
+	phys_reset(virt_to_phys(cpu_resume));
+}
+
+static int __swsusp_arch_resume_finish(void)
+{
+	cpu_init();
+	identity_mapping_del(swapper_pg_dir, 0, TASK_SIZE);
+	flush_tlb_all();
+	flush_cache_all();
+
+	return __swsusp_arch_resume_ret;
+}
+
+/*
+ * Snapshot kernel memory and reset the system.
+ * After resume, the hibernation snapshot is written out.
+ */
+static void notrace __swsusp_arch_save_image(void)
+{
+	extern int swsusp_save(void);
+
+	__swsusp_arch_resume_ret = swsusp_save();
+
+	cpu_switch_mm(swapper_pg_dir, &init_mm);
+	__swsusp_arch_reset();
+}
+
+/*
+ * The framework loads the hibernation image into a linked list anchored
+ * at restore_pblist, for swsusp_arch_resume() to copy back to the proper
+ * destinations.
+ *
+ * To make this work if resume is triggered from initramfs, the
+ * pagetables need to be switched to allow writes to kernel mem.
+ */
+static void notrace __swsusp_arch_restore_image(void)
+{
+	extern struct pbe *restore_pblist;
+	struct pbe *pbe;
+
+	cpu_switch_mm(swapper_pg_dir, &init_mm);
+
+	for (pbe = restore_pblist; pbe; pbe = pbe->next)
+		copy_page(pbe->orig_address, pbe->address);
+
+	/*
+	 * Done - reset and resume from the hibernation image.
+	 */
+	__swsusp_arch_resume_ret = 0;	/* success at this point */
+	__swsusp_arch_reset();
+}
+
+/*
+ * Save the current CPU state before suspend / poweroff.
+ */
+int notrace swsusp_arch_suspend(void)
+{
+	extern void cpu_suspend(unsigned long, unsigned long, unsigned long, void *);
+
+	cpu_suspend(0, virt_to_phys(0), 0, __swsusp_arch_save_image);
+
+	/*
+	 * After resume, execution restarts here. Clean up and return.
+	 */
+	return __swsusp_arch_resume_finish();
+}
+
+/*
+ * Resume from the hibernation image.
+ * Due to the kernel heap / data restore, stack contents change underneath
+ * and that would make function calls impossible; switch to a temporary
+ * stack within the nosave region to avoid that problem.
+ */
+int __naked swsusp_arch_resume(void)
+{
+	cpu_init();	/* get a clean PSR */
+
+	/*
+	 * when switch_stack() becomes available, should use that
+	 */
+	__asm__ __volatile__("mov	sp, %0\n\t"
+		: : "r"(__swsusp_resume_stk + sizeof(__swsusp_resume_stk))
+		: "memory", "cc");
+
+	__swsusp_arch_restore_image();
+
+	return 0;
+}
diff --git a/arch/arm/mm/Kconfig b/arch/arm/mm/Kconfig
index 0074b8d..c668f8f 100644
--- a/arch/arm/mm/Kconfig
+++ b/arch/arm/mm/Kconfig
@@ -627,6 +627,11 @@ config CPU_USE_DOMAINS
 config IO_36
 	bool
 
+config ARCH_HIBERNATION_POSSIBLE
+	bool
+	depends on MMU
+	default y if CPU_ARM920T || CPU_ARM926T || CPU_SA1100 || CPU_XSCALE || CPU_XSC3 || CPU_V6 || CPU_V6K || CPU_V7
+
 comment "Processor Features"
 
 config ARM_THUMB