From patchwork Fri Sep 2 13:24:23 2011
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Ming Lei
X-Patchwork-Id: 113137
Return-Path:
X-Original-To: incoming@patchwork.ozlabs.org
Delivered-To: patchwork-incoming@bilbo.ozlabs.org
Received: from chlorine.canonical.com (chlorine.canonical.com [91.189.94.204])
	by ozlabs.org (Postfix) with ESMTP id 08E5AB6F88
	for ; Fri, 2 Sep 2011 23:25:02 +1000 (EST)
Received: from localhost ([127.0.0.1] helo=chlorine.canonical.com)
	by chlorine.canonical.com with esmtp (Exim 4.71) (envelope-from )
	id 1QzTjg-0001Sd-L9; Fri, 02 Sep 2011 13:24:44 +0000
Received: from mail-pz0-f49.google.com ([209.85.210.49])
	by chlorine.canonical.com with esmtp (Exim 4.71) (envelope-from )
	id 1QzTje-0001ST-Qb
	for kernel-team@lists.ubuntu.com; Fri, 02 Sep 2011 13:24:43 +0000
Received: by pzk6 with SMTP id 6so5134861pzk.22
	for ; Fri, 02 Sep 2011 06:24:41 -0700 (PDT)
Received: by 10.68.6.100 with SMTP id z4mr783038pbz.310.1314969881515;
	Fri, 02 Sep 2011 06:24:41 -0700 (PDT)
Received: from localhost ([183.37.192.34])
	by mx.google.com with ESMTPS id e3sm10175975pbi.7.2011.09.02.06.24.35
	(version=TLSv1/SSLv3 cipher=OTHER);
	Fri, 02 Sep 2011 06:24:38 -0700 (PDT)
From: ming.lei@canonical.com
To: kernel-team@lists.ubuntu.com
Subject: [PATCH][Oneiric ARM] usb: ehci: make HC see up-to-date qh/qtd
	descriptor ASAP
Date: Fri, 2 Sep 2011 21:24:23 +0800
Message-Id: <1314969863-4914-1-git-send-email-ming.lei@canonical.com>
X-Mailer: git-send-email 1.7.4.1
X-BeenThere: kernel-team@lists.ubuntu.com
X-Mailman-Version: 2.1.13
Precedence: list
List-Id: Kernel team discussions
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
MIME-Version: 1.0
Sender: kernel-team-bounces@lists.ubuntu.com
Errors-To: kernel-team-bounces@lists.ubuntu.com

From: Ming Lei

This patch introduces the ehci_sync_mem helper to flush qtd/qh into
memory immediately on some ARM platforms, so that the HC can see the
up-to-date qtd/qh
descriptor ASAP.

This patch fixes a performance bug on ARM Cortex-A9 dual-core platforms
that has been reported on quite a few ARM machines (OMAP4, Tegra 2,
Snowball...); for details see https://bugs.launchpad.net/bugs/709245.

The patch has been tested OK on an OMAP4 Panda A1 board: 'dd' throughput
over USB mass storage increases from 4~5MB/sec to 14~16MB/sec after
applying this patch.

SRU Justification:

Impact:
  - Without the patch, 'dd' over USB mass storage runs at about
    4~5MB/sec.

Fix:
  - After applying the patch, 'dd' over USB mass storage runs at about
    14~16MB/sec.

BugLink: http://bugs.launchpad.net/bugs/709245

Upstream discussion:
	https://patchwork.kernel.org/patch/1113332/

Signed-off-by: Ming Lei
---
The patch has been agreed to (Signed-off-by) by the upstream EHCI
maintainer (Alan Stern), but has not yet been merged upstream. The
current upstream discussion is focused on whether a new DMA API should
be introduced to flush data into DMA-coherent memory. I think the patch
will land in 3.2 rather than 3.1 if a new DMA API needs to be
introduced, so I am posting it here because it fixes this Oneiric
beta 1 bug.
---
 drivers/usb/host/ehci-q.c |   18 ++++++++++++++++++
 drivers/usb/host/ehci.h   |   17 +++++++++++++++++
 2 files changed, 35 insertions(+), 0 deletions(-)

diff --git a/drivers/usb/host/ehci-q.c b/drivers/usb/host/ehci-q.c
index 0917e3a..2719879 100644
--- a/drivers/usb/host/ehci-q.c
+++ b/drivers/usb/host/ehci-q.c
@@ -995,6 +995,12 @@ static void qh_link_async (struct ehci_hcd *ehci, struct ehci_qh *qh)
 	head->qh_next.qh = qh;
 	head->hw->hw_next = dma;
 
+	/*
+	 * flush qh descriptor into memory immediately,
+	 * see comments in qh_append_tds.
+	 */
+	ehci_sync_mem();
+
 	qh_get(qh);
 	qh->xacterrs = 0;
 	qh->qh_state = QH_STATE_LINKED;
@@ -1082,6 +1088,18 @@ static struct ehci_qh *qh_append_tds (
 			wmb ();
 			dummy->hw_token = token;
 
+			/*
+			 * Writing to dma coherent buffer on ARM may
+			 * be delayed to reach memory, so HC may not see
+			 * hw_token of dummy qtd in time, which can cause
+			 * the qtd transaction to be executed very late,
+			 * and degrade performance a lot. ehci_sync_mem
+			 * is added to flush 'token' immediately into
+			 * memory, so that ehci can execute the transaction
+			 * ASAP.
+			 */
+			ehci_sync_mem();
+
 			urb->hcpriv = qh_get (qh);
 		}
 	}
diff --git a/drivers/usb/host/ehci.h b/drivers/usb/host/ehci.h
index cc7d337..313d9d6 100644
--- a/drivers/usb/host/ehci.h
+++ b/drivers/usb/host/ehci.h
@@ -738,6 +738,23 @@ static inline u32 hc32_to_cpup (const struct ehci_hcd *ehci, const __hc32 *x)
 
 #endif
 
+/*
+ * Writing to dma coherent memory on ARM may be delayed via L2
+ * writing buffer, so introduce the helper which can flush L2 writing
+ * buffer into memory immediately, especially used to flush ehci
+ * descriptor to memory.
+ */
+#ifdef CONFIG_ARM_DMA_MEM_BUFFERABLE
+static inline void ehci_sync_mem(void)
+{
+	mb();
+}
+#else
+static inline void ehci_sync_mem(void)
+{
+}
+#endif
+
 /*-------------------------------------------------------------------------*/
 
 #ifndef DEBUG
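
For readers outside the kernel tree, the ordering that the patch relies on (fill in the descriptor, wmb(), write hw_token, then ehci_sync_mem()) can be modelled in plain userspace C. The following is only a rough sketch under stated assumptions: struct demo_qtd, DEMO_QTD_ACTIVE, DEMO_MEM_BUFFERABLE, demo_sync_mem() and demo_publish_qtd() are hypothetical names invented for illustration and are not part of the EHCI driver, and C11 fences merely stand in for the kernel's wmb() and mb(). A userspace fence cannot reproduce the write-buffer latency that the patch addresses on real hardware; the sketch shows the sequence of steps only.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical handover bit, chosen to mirror the role of the qTD Active bit. */
#define DEMO_QTD_ACTIVE (1u << 7)

/* Hypothetical, simplified model of a transfer descriptor (not struct ehci_qtd). */
struct demo_qtd {
	uint32_t hw_buf;		/* payload the "controller" reads */
	_Atomic uint32_t hw_token;	/* handover word, written last */
};

/*
 * Stand-in for ehci_sync_mem(): a full fence when the hypothetical
 * DEMO_MEM_BUFFERABLE macro is defined, an empty function otherwise.
 */
#ifdef DEMO_MEM_BUFFERABLE
static inline void demo_sync_mem(void)
{
	atomic_thread_fence(memory_order_seq_cst);	/* plays the role of mb() */
}
#else
static inline void demo_sync_mem(void)
{
}
#endif

/* Publish a descriptor in the same order as qh_append_tds() in the patch. */
static inline void demo_publish_qtd(struct demo_qtd *qtd, uint32_t buf,
				    uint32_t token)
{
	assert(qtd != NULL);

	qtd->hw_buf = buf;				/* 1. fill in the descriptor */
	atomic_thread_fence(memory_order_release);	/* 2. plays the role of wmb() */
	atomic_store_explicit(&qtd->hw_token, token,
			      memory_order_relaxed);	/* 3. hand over the token */
	demo_sync_mem();				/* 4. the step this patch adds */
}
```

Building with -DDEMO_MEM_BUFFERABLE selects the fenced variant, in the same way that CONFIG_ARM_DMA_MEM_BUFFERABLE selects mb() in the patch; without it the helper compiles to nothing, so other configurations pay no cost.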