Message ID | 1288077826-9544-1-git-send-email-daniel@gaisler.com |
---|---|
State | Accepted |
Delegated to: | David Miller |
Headers | show |
From: Daniel Hellstrom <daniel@gaisler.com> Date: Tue, 26 Oct 2010 09:23:46 +0200 > Signed-off-by: Daniel Hellstrom <daniel@gaisler.com> Applied, thanks! -- To unsubscribe from this list: send the line "unsubscribe sparclinux" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Thanks for applying the patch. I don't know if I have introduced myself before.. I'm working at Aeroflex Gaisler as a software engineer on the LEON{2,3,4} architectures. I saw the discussion about LEON previously on the list, and I feel that I should say something short about LEON and SMP. We are of course very thankfull for being in the Linux kernel tree. There are multiple multi-core designs of LEON in FPGAs and ASICs. For example the dual-core LEON4 200MHz eAsic that Sam mentioned, and a new chip GR712 is being tested at the fab as we speak. More chips are in the design phase. The LEON4 architecture was released earlier this year, it is aiming more on SMP than the LEON3, even though the LEON3 can do SMP as well. The LEON4 has for example L2 cache and wider buses. Of course all LEONs are SPARCv8 compatible. I have been working with a new quad-core LEON4 design since mid-september, we have made some changes to the LEON port of Linux port which we would like to submit. Some of them are not in a clean state yet, so there are some work still before I will try to post them here. Now to the question... The LEON do not have internal timers as some CPUs does, it has one/multiple General Purpose TIMERs on the Processor Local Bus. On single-CPU/SMP systems the first Timer is used for System Clock, and on SMP systems timer two is also used to generate a simultaneous IRQ on all CPUs for profiling etc. (leon_percpu_timer_interrupt()). On the quad-core SMP system I discovered that since the per-cpu timer is generated at the same frequency (and almost simultaneously) as the System Clock Timer. I have made a patch that uses only one Timer for SMP systems, the Timer generates a per-cpu tick as before, however on CPU0 the handler_irq() is also called after profiling has been done, this is to handle the System Clock Tick. I seems to work successfully, and it saves me HZ interrupts per second and a Timer instance. What is you opinion about that? Is it possible to use the same timer for System Clock and for per-cpu profiling etc.? Best Regards, Daniel Hellstrom -- To unsubscribe from this list: send the line "unsubscribe sparclinux" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
From: Daniel Hellstrom <daniel@gaisler.com> Date: Tue, 26 Oct 2010 19:50:36 +0200 > The LEON do not have internal timers as some CPUs does, it has > one/multiple General Purpose TIMERs on the Processor Local Bus. On > single-CPU/SMP systems the first Timer is used for System Clock, and > on SMP systems timer two is also used to generate a simultaneous IRQ > on all CPUs for profiling etc. (leon_percpu_timer_interrupt()). On the > quad-core SMP system I discovered that since the per-cpu timer is > generated at the same frequency (and almost simultaneously) as the > System Clock Timer. I have made a patch that uses only one Timer for > SMP systems, the Timer generates a per-cpu tick as before, however on > CPU0 the handler_irq() is also called after profiling has been done, > this is to handle the System Clock Tick. I seems to work successfully, > and it saves me HZ interrupts per second and a Timer instance. What is > you opinion about that? Is it possible to use the same timer for > System Clock and for per-cpu profiling etc.? You only need to generate one timer interrupt per-cpu, and the kernel generically decides to run the global timer actions (jiffies update, etc.) on a choosen cpu, transparently, in the per-cpu periodic timer code. -- To unsubscribe from this list: send the line "unsubscribe sparclinux" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
David Miller wrote: >From: Daniel Hellstrom <daniel@gaisler.com> >Date: Tue, 26 Oct 2010 19:50:36 +0200 > > > >>The LEON do not have internal timers as some CPUs does, it has >>one/multiple General Purpose TIMERs on the Processor Local Bus. On >>single-CPU/SMP systems the first Timer is used for System Clock, and >>on SMP systems timer two is also used to generate a simultaneous IRQ >>on all CPUs for profiling etc. (leon_percpu_timer_interrupt()). On the >>quad-core SMP system I discovered that since the per-cpu timer is >>generated at the same frequency (and almost simultaneously) as the >>System Clock Timer. I have made a patch that uses only one Timer for >>SMP systems, the Timer generates a per-cpu tick as before, however on >>CPU0 the handler_irq() is also called after profiling has been done, >>this is to handle the System Clock Tick. I seems to work successfully, >>and it saves me HZ interrupts per second and a Timer instance. What is >>you opinion about that? Is it possible to use the same timer for >>System Clock and for per-cpu profiling etc.? >> >> > >You only need to generate one timer interrupt per-cpu, and the kernel >generically decides to run the global timer actions (jiffies update, >etc.) on a choosen cpu, transparently, in the per-cpu periodic timer >code. > > > That is interesting, I didn't even think about that. So then I can even remove the extra call to handler_irq() from within the per-cpu timer IRQ handler, and I will probably have to fix some code in the System Clock Timer setup as well. I will have to investigate this further then. So actually it is bad to make 5 timer IRQs per tick on a quad core system, and one of them is calling handler_irq. But I can see that the system clock is progressing in the correct pace, unless my watch is bad :) I will get back to this issue later on. Thank you, Daniel -- To unsubscribe from this list: send the line "unsubscribe sparclinux" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Daniel Hellstrom wrote: > David Miller wrote: > >> From: Daniel Hellstrom <daniel@gaisler.com> >> Date: Tue, 26 Oct 2010 19:50:36 +0200 >> >> >> >>> The LEON do not have internal timers as some CPUs does, it has >>> one/multiple General Purpose TIMERs on the Processor Local Bus. On >>> single-CPU/SMP systems the first Timer is used for System Clock, and >>> on SMP systems timer two is also used to generate a simultaneous IRQ >>> on all CPUs for profiling etc. (leon_percpu_timer_interrupt()). On the >>> quad-core SMP system I discovered that since the per-cpu timer is >>> generated at the same frequency (and almost simultaneously) as the >>> System Clock Timer. I have made a patch that uses only one Timer for >>> SMP systems, the Timer generates a per-cpu tick as before, however on >>> CPU0 the handler_irq() is also called after profiling has been done, >>> this is to handle the System Clock Tick. I seems to work successfully, >>> and it saves me HZ interrupts per second and a Timer instance. What is >>> you opinion about that? Is it possible to use the same timer for >>> System Clock and for per-cpu profiling etc.? >>> >> >> >> You only need to generate one timer interrupt per-cpu, and the kernel >> generically decides to run the global timer actions (jiffies update, >> etc.) on a choosen cpu, transparently, in the per-cpu periodic timer >> code. >> >> >> > That is interesting, I didn't even think about that. So then I can > even remove the extra call to handler_irq() from within the per-cpu > timer IRQ handler, and I will probably have to fix some code in the > System Clock Timer setup as well. I will have to investigate this > further then. > > So actually it is bad to make 5 timer IRQs per tick on a quad core > system, and one of them is calling handler_irq. But I can see that the > system clock is progressing in the correct pace, unless my watch is > bad :) I will get back to this issue later on. I have started working on this timer patch again... I tried looking a sun4d and sun4m to get an example of how to implement this in a better way, however they seem to implement the per-cpu ticker using hardcoded IRQ number 14 and a custom trap handler for the per-cpu timer ticker (see bottom of kernel/sun4m_irq.c: sun4m_init_timers()). The system clock is implemented using the time-tick at IRQ10. I'm not sure why the time-tick timer is used at all, unless the hardware requires it (the per-cpu timer perhaps can not get the time or limits HZ). The LEON port mimics this behaviour, however the LEON CPU does not have a "per-cpu" timer. The new patch uses only one timer to generate one IRQ on each CPU simultaneously, and only CPU0 will call handler_irq() to handle the standard system clock tick interrupt handler. Please see the patch I will submit to the list soon as reference. This patch is bad if one would want the system clock tick to run at a different rate than the profiling per-cpu ticks, or if the system clock tick is turned off/on as it will affect the profiling, however I have not seen code indicating such a behaviour. Thanks for applying the other patches, Daniel -- To unsubscribe from this list: send the line "unsubscribe sparclinux" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
> > I have started working on this timer patch again... > > I tried looking a sun4d and sun4m to get an example of how to implement > this in a better way, however they seem to implement the per-cpu ticker > using hardcoded IRQ number 14 and a custom trap handler for the per-cpu > timer ticker (see bottom of kernel/sun4m_irq.c: sun4m_init_timers()). I am slowly looking into introducing generic IRQ support for SPARC. If I succeed then we will shift to a more dynamic numbering of interrupts - like sparc64 does. Right now I am in a situation where I try to analyse SPARC, existing codebase and genirq in the kernel. So it will take a while before I get anywhere with this. Sam -- To unsubscribe from this list: send the line "unsubscribe sparclinux" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Sam Ravnborg wrote: >>I have started working on this timer patch again... >> >>I tried looking a sun4d and sun4m to get an example of how to implement >>this in a better way, however they seem to implement the per-cpu ticker >>using hardcoded IRQ number 14 and a custom trap handler for the per-cpu >>timer ticker (see bottom of kernel/sun4m_irq.c: sun4m_init_timers()). >> >> > >I am slowly looking into introducing generic IRQ support for SPARC. >If I succeed then we will shift to a more dynamic numbering >of interrupts - like sparc64 does. > > That would be great. I have not looked so much into the other SPARC32 ports or the SPARC64, however the LEON port handles IRQ always on the CPU calling request_irq(), since CPU0 initializes everything during startup CPU0 will end up doing a lot if IRQ work. I wish there where a way of implementing IRQ routing to different CPUs. In best case during runtime, however a static configuration is good enough. >Right now I am in a situation where I try to analyse SPARC, existing >codebase and genirq in the kernel. So it will take >a while before I get anywhere with this. > > I understand, I'm appreciating your efforts. Daniel -- To unsubscribe from this list: send the line "unsubscribe sparclinux" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
From: Daniel Hellstrom <daniel@gaisler.com> Date: Wed, 05 Jan 2011 11:20:19 +0100 > I tried looking a sun4d and sun4m to get an example of how to > implement this in a better way, however they seem to implement the > per-cpu ticker using hardcoded IRQ number 14 and a custom trap handler > for the per-cpu timer ticker (see bottom of kernel/sun4m_irq.c: > sun4m_init_timers()). The system clock is implemented using the > time-tick at IRQ10. I'm not sure why the time-tick timer is used at > all, unless the hardware requires it (the per-cpu timer perhaps can > not get the time or limits HZ). sun4c, sun4d, and sun4m have two timer sources that deliver interrupts on level 10 and level 14 and these mappings are not changable. The level 10 one is a global interrupt which can target one cpu, whereas the level 14 one can be used in a per-cpu manner and is usually referred to as the profiling timer. Therefore there is one count/limit register pair for the level 10 timer, and NR_CPUS pairs of count/limit registers for level 14. The best thing to do is the use only the level 14 timer, and register is as a clockevent with the generic code. Then you won't have to deal with any details like which cpu to invoke the scheduler tick on, etc. It'll all be taken care of for you. Something like: static struct clock_event_device sparc32_clockevent = { .features = CLOCK_EVT_FEAT_ONESHOT, .set_mode = sparc32_timer_setup, .set_next_event = sparc32_next_event, .rating = 100, .shift = 30, .irq = -1, }; static DEFINE_PER_CPU(struct clock_event_device, sparc32_events); You implement sparc32_timer_setup, which will look something like: static void sparc32_timer_setup(enum clock_event_mode mode, struct clock_event_device *evt) { switch (mode) { case CLOCK_EVT_MODE_ONESHOT: case CLOCK_EVT_MODE_RESUME: break; case CLOCK_EVT_MODE_SHUTDOWN: disable_level14_timer(); break; case CLOCK_EVT_MODE_PERIODIC: case CLOCK_EVT_MODE_UNUSED: WARN_ON(1); break; }; } and sparc32_next_event, which advances the limit register to the next interrupt count. static int sparc64_next_event(unsigned long delta, struct clock_event_device *evt) { unsigned long orig_count, new_count, new_limit; orig_count = sparc32_read_level14_count(); new_limit = orig_count + delta; sparc32_write_level14_limit(new_limit); new_count = sparc32_read_level14_count(); /* If the new limit has been passed already, let the caller * know. */ if (((long)(new_count - (orig_count + delta))) > 0L) return -ETIME; return 0; } The level 14 timer interrupt should go: int cpu = smp_processor_id(); struct clock_event_device *evt = &per_cpu(sparc32_events, cpu); ... if (unlikely(!evt->event_handler)) { printk(KERN_WARNING "Spurious SPARC32 timer interrupt on cpu %d\n", cpu); } else evt->event_handler(evt); ... Finally, on bootup and on each cpu, initialize (but do not start) the level 14 timer and then go: struct clock_event_device *sevt; sevt = &__get_cpu_var(sparc32_events); memcpy(sevt, &sparc32_clockevent, sizeof(*sevt)); sevt->cpumask = cpumask_of(smp_processor_id()); clockevents_register_device(sevt); The clockevents system will make ->next_event() calls on this clockevents device to start the timers firing. -- To unsubscribe from this list: send the line "unsubscribe sparclinux" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/arch/sparc/kernel/leon_smp.c b/arch/sparc/kernel/leon_smp.c index e1656fc..7524689 100644 --- a/arch/sparc/kernel/leon_smp.c +++ b/arch/sparc/kernel/leon_smp.c @@ -56,8 +56,8 @@ void __init leon_configure_cache_smp(void); static inline unsigned long do_swap(volatile unsigned long *ptr, unsigned long val) { - __asm__ __volatile__("swapa [%1] %2, %0\n\t" : "=&r"(val) - : "r"(ptr), "i"(ASI_LEON_DCACHE_MISS) + __asm__ __volatile__("swapa [%2] %3, %0\n\t" : "=&r"(val) + : "0"(val), "r"(ptr), "i"(ASI_LEON_DCACHE_MISS) : "memory"); return val; }
Signed-off-by: Daniel Hellstrom <daniel@gaisler.com> --- arch/sparc/kernel/leon_smp.c | 4 ++-- 1 files changed, 2 insertions(+), 2 deletions(-)