From patchwork Fri Jun 2 18:40:52 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Pavel Tatashin
X-Patchwork-Id: 770549
X-Patchwork-Delegate: davem@davemloft.net
Return-Path:
X-Original-To: patchwork-incoming@ozlabs.org
Delivered-To: patchwork-incoming@ozlabs.org
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67])
	by ozlabs.org (Postfix) with ESMTP id 3wfY2L0C55z9s76
	for ; Sat, 3 Jun 2017 04:41:14 +1000 (AEST)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1751007AbdFBSlN (ORCPT );
	Fri, 2 Jun 2017 14:41:13 -0400
Received: from userp1040.oracle.com ([156.151.31.81]:37908 "EHLO
	userp1040.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1751201AbdFBSlL (ORCPT );
	Fri, 2 Jun 2017 14:41:11 -0400
Received: from userv0021.oracle.com (userv0021.oracle.com [156.151.31.71])
	by userp1040.oracle.com (Sentrion-MTA-4.3.2/Sentrion-MTA-4.3.2)
	with ESMTP id v52If864019312
	(version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK);
	Fri, 2 Jun 2017 18:41:08 GMT
Received: from userv0122.oracle.com (userv0122.oracle.com [156.151.31.75])
	by userv0021.oracle.com (8.14.4/8.14.4) with ESMTP id v52If8RD026738
	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK);
	Fri, 2 Jun 2017 18:41:08 GMT
Received: from abhmp0006.oracle.com (abhmp0006.oracle.com [141.146.116.12])
	by userv0122.oracle.com (8.14.4/8.14.4) with ESMTP id v52If7NX004664;
	Fri, 2 Jun 2017 18:41:07 GMT
Received: from ca-ldom103.us.oracle.com (/10.129.68.23)
	by default (Oracle Beehive Gateway v4.0) with ESMTP ;
	Fri, 02 Jun 2017 11:41:07 -0700
From: Pavel Tatashin
To: sparclinux@vger.kernel.org, davem@davemloft.net
Subject: [v1 4/6] sparc64: optimize loads in clock_sched()
Date: Fri, 2 Jun 2017 14:40:52 -0400
Message-Id: <1496428854-936411-5-git-send-email-pasha.tatashin@oracle.com>
X-Mailer: git-send-email 1.7.1
In-Reply-To: <1496428854-936411-1-git-send-email-pasha.tatashin@oracle.com>
References: <1496428854-936411-1-git-send-email-pasha.tatashin@oracle.com>
X-Source-IP: userv0021.oracle.com [156.151.31.71]
Sender: sparclinux-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: sparclinux@vger.kernel.org

In sched_clock() we now have three loads:
- Function pointer
- quotient for multiplication
- offset

However, it is possible to improve performance substantially by
guaranteeing that all three loads are from the same cacheline.

By moving these three values first in sparc64_tick_ops, and by having
tick_operations 64-byte aligned, we guarantee this.

Signed-off-by: Pavel Tatashin
Reviewed-by: Shannon Nelson
Reviewed-by: Steven Sistare
---
 arch/sparc/include/asm/timer_64.h |  5 +++++
 arch/sparc/kernel/time_64.c       | 17 +++++++----------
 2 files changed, 12 insertions(+), 10 deletions(-)

diff --git a/arch/sparc/include/asm/timer_64.h b/arch/sparc/include/asm/timer_64.h
index fce4150..bde2cc4 100644
--- a/arch/sparc/include/asm/timer_64.h
+++ b/arch/sparc/include/asm/timer_64.h
@@ -9,7 +9,12 @@
 #include
 #include
 
+/* The most frequently accessed fields should be first,
+ * to fit into the same cacheline.
+ */
 struct sparc64_tick_ops {
+	unsigned long ticks_per_nsec_quotient;
+	unsigned long offset;
 	unsigned long long (*get_tick)(void);
 	int (*add_compare)(unsigned long);
 	unsigned long softint_mask;
diff --git a/arch/sparc/kernel/time_64.c b/arch/sparc/kernel/time_64.c
index 5f53b74..d654f5c 100644
--- a/arch/sparc/kernel/time_64.c
+++ b/arch/sparc/kernel/time_64.c
@@ -164,7 +164,7 @@ static unsigned long tick_add_tick(unsigned long adj)
 	return new_tick;
 }
 
-static struct sparc64_tick_ops tick_operations __read_mostly = {
+static struct sparc64_tick_ops tick_operations __aligned(64) __read_mostly = {
 	.name = "tick",
 	.init_tick = tick_init_tick,
 	.disable_irq = tick_disable_irq,
@@ -391,9 +391,6 @@ static int hbtick_add_compare(unsigned long adj)
 	.softint_mask = 1UL << 0,
 };
 
-static unsigned long timer_ticks_per_nsec_quotient __read_mostly;
-static unsigned long timer_offset __read_mostly;
-
 unsigned long cmos_regs;
 EXPORT_SYMBOL(cmos_regs);
 
@@ -784,11 +781,11 @@ void __init time_init(void)
 
 	tb_ticks_per_usec = freq / USEC_PER_SEC;
 
-	timer_ticks_per_nsec_quotient =
+	tick_operations.ticks_per_nsec_quotient =
 		clocksource_hz2mult(freq, SPARC64_NSEC_PER_CYC_SHIFT);
 
-	timer_offset = (tick_operations.get_tick()
-			* timer_ticks_per_nsec_quotient)
+	tick_operations.offset = (tick_operations.get_tick()
+			* tick_operations.ticks_per_nsec_quotient)
 			>> SPARC64_NSEC_PER_CYC_SHIFT;
 
 	clocksource_tick.name = tick_operations.name;
@@ -816,11 +813,11 @@ void __init time_init(void)
 
 unsigned long long sched_clock(void)
 {
+	unsigned long quotient = tick_operations.ticks_per_nsec_quotient;
+	unsigned long offset = tick_operations.offset;
 	unsigned long ticks = tick_operations.get_tick();
 
-	return ((ticks * timer_ticks_per_nsec_quotient)
-		>> SPARC64_NSEC_PER_CYC_SHIFT)
-		- timer_offset;
+	return ((ticks * quotient) >> SPARC64_NSEC_PER_CYC_SHIFT) - offset;
 }
 
 int read_current_timer(unsigned long *timer_val)
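
The layout argument made in the changelog can be checked outside the kernel. What follows is a minimal userspace sketch, not the kernel code: struct tick_ops_sketch, fake_get_tick(), NSEC_SHIFT and the sample values are hypothetical stand-ins, C11 _Alignas replaces the kernel's __aligned(64), and a 64-byte cacheline is assumed. It places the three hot fields first and asserts that they end within the first 64 bytes of an aligned object, so the three loads in the sched_clock()-style function touch a single line.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHELINE_BYTES 64 /* assumed line size, matching __aligned(64) in the patch */
#define NSEC_SHIFT 10      /* hypothetical stand-in for SPARC64_NSEC_PER_CYC_SHIFT */

/* Hypothetical mirror of the reordered struct: hot fields first. */
struct tick_ops_sketch {
	unsigned long ticks_per_nsec_quotient;
	unsigned long offset;
	unsigned long long (*get_tick)(void);
	/* Colder fields follow and may spill into later cachelines. */
	int (*add_compare)(unsigned long);
	unsigned long softint_mask;
};

/* Compile-time check: the last hot field ends inside the first line. */
_Static_assert(offsetof(struct tick_ops_sketch, get_tick) +
	       sizeof(unsigned long long (*)(void)) <= CACHELINE_BYTES,
	       "hot fields must fit in one cacheline");

/* Fake tick source returning a fixed count, for illustration only. */
static unsigned long long fake_get_tick(void)
{
	return 4096;
}

/* Aligning the object makes offset 0 the start of a cacheline. */
static _Alignas(CACHELINE_BYTES) struct tick_ops_sketch ops = {
	.ticks_per_nsec_quotient = 1UL << NSEC_SHIFT, /* 1 ns per tick */
	.offset = 96,
	.get_tick = fake_get_tick,
};

/* Returns 1 when the first and last hot bytes share a cacheline. */
static int hot_fields_share_line(void)
{
	uintptr_t first = (uintptr_t)&ops;
	uintptr_t last = (uintptr_t)&ops.get_tick + sizeof(ops.get_tick) - 1;

	return first / CACHELINE_BYTES == last / CACHELINE_BYTES;
}

/* Same shape as the patched sched_clock(): three loads, one multiply. */
static unsigned long long sched_clock_sketch(void)
{
	unsigned long quotient = ops.ticks_per_nsec_quotient;
	unsigned long offset = ops.offset;
	unsigned long long ticks = ops.get_tick();

	assert(hot_fields_share_line());
	return ((ticks * quotient) >> NSEC_SHIFT) - offset;
}
```

With the sample values, sched_clock_sketch() computes ((4096 * 1024) >> 10) - 96. Alignment alone is not sufficient: without the field reordering, the quotient and offset would still live in separate variables (as before the patch) or at arbitrary offsets in the struct, so both halves of the change are needed for the guarantee.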