From patchwork Fri Nov 1 09:36:36 2013 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Markus Pargmann X-Patchwork-Id: 287750 X-Patchwork-Delegate: davem@davemloft.net Return-Path: X-Original-To: patchwork-incoming@ozlabs.org Delivered-To: patchwork-incoming@ozlabs.org Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id BAC822C00E4 for ; Fri, 1 Nov 2013 20:37:13 +1100 (EST) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752750Ab3KAJgs (ORCPT ); Fri, 1 Nov 2013 05:36:48 -0400 Received: from metis.ext.pengutronix.de ([92.198.50.35]:54800 "EHLO metis.ext.pengutronix.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751681Ab3KAJgq (ORCPT ); Fri, 1 Nov 2013 05:36:46 -0400 Received: from dude.hi.pengutronix.de ([2001:6f8:1178:2:21e:67ff:fe11:9c5c]) by metis.ext.pengutronix.de with esmtp (Exim 4.72) (envelope-from ) id 1VcB9l-0007iH-E2; Fri, 01 Nov 2013 10:36:41 +0100 Received: from mpa by dude.hi.pengutronix.de with local (Exim 4.80) (envelope-from ) id 1VcB9k-0001oE-Mo; Fri, 01 Nov 2013 10:36:40 +0100 From: Markus Pargmann To: Marc Kleine-Budde , Wolfgang Grandegger Cc: Joe Perches , linux-can@vger.kernel.org, netdev@vger.kernel.org, linux-kernel@vger.kernel.org, kernel@pengutronix.de, Markus Pargmann Subject: [PATCH v3] can: c_can: Speed up rx_poll function Date: Fri, 1 Nov 2013 10:36:36 +0100 Message-Id: <1383298596-18385-1-git-send-email-mpa@pengutronix.de> X-Mailer: git-send-email 1.8.4.rc3 X-SA-Exim-Connect-IP: 2001:6f8:1178:2:21e:67ff:fe11:9c5c X-SA-Exim-Mail-From: mpa@pengutronix.de X-SA-Exim-Scanned: No (on metis.ext.pengutronix.de); SAEximRunCond expanded to false X-PTX-Original-Recipient: netdev@vger.kernel.org Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org This patch speeds up the rx_poll function by reducing the number of register reads. Replace the 32bit register read by a 16bit register read. Currently the 32bit register read is implemented by using 2 16bit reads. This is inefficient as we only use the lower 16bit in rx_poll. The for loop reads the pending interrupts in every iteration. This leads up to 16 reads of pending interrupts. The patch introduces a new outer loop to read the pending interrupts as long as 'quota' is above 0. This reduces the total number of reads. The third change is to replace the for-loop by a ffs loop. Tested on AM335x. I removed all 'static' and 'inline' from c_can.c to see the timings for all functions. I used the function tracer with trace_stats. 125kbit: Function Hit Time Avg s^2 -------- --- ---- --- --- c_can_do_rx_poll 63960 10168178 us 158.977 us 1493056 us With patch: c_can_do_rx_poll 63941 3764057 us 58.867 us 776162.2 us 1Mbit: Function Hit Time Avg s^2 -------- --- ---- --- --- c_can_do_rx_poll 69489 30049498 us 432.435 us 9271851 us With patch: c_can_do_rx_poll 207109 24322185 us 117.436 us 171469047 us Signed-off-by: Markus Pargmann --- Notes: Changes in v3: - Update commit message (measurements and ffs) Changes in v2: - Small changes, find_next_bit -> ffs and other drivers/net/can/c_can/c_can.c | 22 ++++++++++++---------- 1 file changed, 12 insertions(+), 10 deletions(-) diff --git a/drivers/net/can/c_can/c_can.c b/drivers/net/can/c_can/c_can.c index a668cd4..428681e 100644 --- a/drivers/net/can/c_can/c_can.c +++ b/drivers/net/can/c_can/c_can.c @@ -798,17 +798,19 @@ static int c_can_do_rx_poll(struct net_device *dev, int quota) u32 num_rx_pkts = 0; unsigned int msg_obj, msg_ctrl_save; struct c_can_priv *priv = netdev_priv(dev); - u32 val = c_can_read_reg32(priv, C_CAN_INTPND1_REG); + u16 val; + + /* + * It is faster to read only one 16bit register. This is only possible + * for a maximum number of 16 objects. + */ + BUILD_BUG_ON_MSG(C_CAN_MSG_OBJ_RX_LAST > 16, + "Implementation does not support more message objects than 16"); + + while (quota > 0 && (val = priv->read_reg(priv, C_CAN_INTPND1_REG))) { + while ((msg_obj = ffs(val)) && quota > 0) { + val &= ~BIT(msg_obj - 1); - for (msg_obj = C_CAN_MSG_OBJ_RX_FIRST; - msg_obj <= C_CAN_MSG_OBJ_RX_LAST && quota > 0; - val = c_can_read_reg32(priv, C_CAN_INTPND1_REG), - msg_obj++) { - /* - * as interrupt pending register's bit n-1 corresponds to - * message object n, we need to handle the same properly. - */ - if (val & (1 << (msg_obj - 1))) { c_can_object_get(dev, 0, msg_obj, IF_COMM_ALL & ~IF_COMM_TXRQST); msg_ctrl_save = priv->read_reg(priv,