From patchwork Fri Aug 10 16:27:23 2012
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Benoît Thébaudeau
X-Patchwork-Id: 176530
X-Patchwork-Delegate: marek.vasut@gmail.com
Date: Fri, 10 Aug 2012 18:27:23 +0200 (CEST)
From: Benoît Thébaudeau
To: u-boot@lists.denx.de
Message-ID: <330258397.2278501.1344616043805.JavaMail.root@advansee.com>
In-Reply-To: <277000258.2278273.1344615697596.JavaMail.root@advansee.com>
Cc: Marek Vasut, Ilya Yanok
Subject: [U-Boot] [PATCH v4 7/7] ehci: Optimize qTD allocations
List-Id: U-Boot discussion

Relax the qTD transfer alignment constraints in order to need fewer qTDs for
buffers that are aligned to 512 bytes but not to pages.

Signed-off-by: Benoît Thébaudeau
Cc: Marek Vasut
Cc: Ilya Yanok
Cc: Stefan Herbrechtsmeier
---
Changes for v2: N/A.
Changes for v3:
 - New patch.
Changes for v4:
 - Optimize away the qtd_toggle variable.
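To make the effect of the relaxed alignment concrete, here is a minimal standalone sketch (not part of the patch) that mimics the splitting loop of ehci_submit_async() and the upper bound used for the allocation. The helper names count_qtds() and qtd_bound() are hypothetical. The values 4096 for EHCI_PAGE_SIZE and 5 for QT_BUFFER_CNT are assumed from the EHCI qTD layout; this mail does not state them.

```c
#include <assert.h>
#include <stdint.h>

#define EHCI_PAGE_SIZE	4096u	/* assumed value */
#define QT_BUFFER_CNT	5u	/* assumed value */
#define PKT_ALIGN	512u

/*
 * Count the qTDs used for a data payload of length bytes starting at bus
 * address addr, mimicking the splitting loop of ehci_submit_async().
 * Pass EHCI_PAGE_SIZE as align for the old behavior, PKT_ALIGN for the new.
 */
static unsigned int count_qtds(uint32_t addr, uint32_t length, uint32_t align)
{
	unsigned int n = 0;

	/* The mask below only works for a power-of-2 alignment. */
	assert(align != 0 && (align & (align - 1)) == 0);

	do {
		uint32_t xfr_bytes = QT_BUFFER_CNT * EHCI_PAGE_SIZE;

		/* The part of the first page before addr is unusable. */
		xfr_bytes -= addr & (EHCI_PAGE_SIZE - 1);
		/* Keep each packet within a single qTD transfer. */
		xfr_bytes &= ~(align - 1);
		if (xfr_bytes > length)
			xfr_bytes = length;
		addr += xfr_bytes;
		length -= xfr_bytes;
		n++;
	} while (length > 0);

	return n;
}

/* Upper bound on the data qTD count, as computed before the allocation. */
static unsigned int qtd_bound(uint32_t addr, uint32_t length)
{
	uint32_t xfr_sz = QT_BUFFER_CNT;

	if (addr & (PKT_ALIGN - 1))
		xfr_sz--;
	xfr_sz *= EHCI_PAGE_SIZE;

	return 2 + length / xfr_sz;
}
```

For a 120 KiB buffer that starts 512 bytes into a page, count_qtds() gives 8 qTDs with EHCI_PAGE_SIZE alignment and 7 with PKT_ALIGN, while qtd_bound() gives 8. Assuming 64 bytes per qTD (the patch comment equates 2 qTDs with 128 bytes), the worst case of 65535 blocks of 512 bytes in an unaligned buffer yields a bound of 2049 qTDs, i.e. 64 + 128 * 1024 bytes, which matches the new CONFIG_SYS_MALLOC_LEN threshold.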
 .../drivers/usb/host/ehci-hcd.c                    |   67 +++++++++++---------
 1 file changed, 37 insertions(+), 30 deletions(-)

diff --git u-boot-usb-4f8254e.orig/drivers/usb/host/ehci-hcd.c u-boot-usb-4f8254e/drivers/usb/host/ehci-hcd.c
index a0ef5db..18b4bc6 100644
--- u-boot-usb-4f8254e.orig/drivers/usb/host/ehci-hcd.c
+++ u-boot-usb-4f8254e/drivers/usb/host/ehci-hcd.c
@@ -215,7 +215,7 @@ ehci_submit_async(struct usb_device *dev, unsigned long pipe, void *buffer,
 	volatile struct qTD *vtd;
 	unsigned long ts;
 	uint32_t *tdp;
-	uint32_t endpt, token, usbsts;
+	uint32_t endpt, maxpacket, token, usbsts;
 	uint32_t c, toggle;
 	uint32_t cmd;
 	int timeout;
@@ -230,6 +230,7 @@ ehci_submit_async(struct usb_device *dev, unsigned long pipe, void *buffer,
 		      le16_to_cpu(req->value), le16_to_cpu(req->value),
 		      le16_to_cpu(req->index));
 
+#define PKT_ALIGN	512
 	/*
 	 * The USB transfer is split into qTD transfers. Eeach qTD transfer is
 	 * described by a transfer descriptor (the qTD). The qTDs form a linked
@@ -251,43 +252,41 @@ ehci_submit_async(struct usb_device *dev, unsigned long pipe, void *buffer,
 	if (length > 0 || req == NULL) {
 		/*
 		 * Determine the qTD transfer size that will be used for the
-		 * data payload (not considering the final qTD transfer, which
-		 * may be shorter).
+		 * data payload (not considering the first qTD transfer, which
+		 * may be longer or shorter, and the final one, which may be
+		 * shorter).
 		 *
 		 * In order to keep each packet within a qTD transfer, the qTD
-		 * transfer size is aligned to EHCI_PAGE_SIZE, which is a
-		 * multiple of wMaxPacketSize (except in some cases for
-		 * interrupt transfers, see comment in submit_int_msg()).
+		 * transfer size is aligned to PKT_ALIGN, which is a multiple of
+		 * wMaxPacketSize (except in some cases for interrupt transfers,
+		 * see comment in submit_int_msg()).
 		 *
-		 * By default, i.e. if the input buffer is page-aligned,
+		 * By default, i.e. if the input buffer is aligned to PKT_ALIGN,
 		 * QT_BUFFER_CNT full pages will be used.
 		 */
 		int xfr_sz = QT_BUFFER_CNT;
 		/*
-		 * However, if the input buffer is not page-aligned, the qTD
-		 * transfer size will be one page shorter, and the first qTD
+		 * However, if the input buffer is not aligned to PKT_ALIGN, the
+		 * qTD transfer size will be one page shorter, and the first qTD
 		 * data buffer of each transfer will be page-unaligned.
 		 */
-		if ((uint32_t)buffer & (EHCI_PAGE_SIZE - 1))
+		if ((uint32_t)buffer & (PKT_ALIGN - 1))
 			xfr_sz--;
 		/* Convert the qTD transfer size to bytes. */
 		xfr_sz *= EHCI_PAGE_SIZE;
 		/*
-		 * Determine the number of qTDs that will be required for the
-		 * data payload. This value has to be rounded up since the final
-		 * qTD transfer may be shorter than the regular qTD transfer
-		 * size that has just been computed.
+		 * Approximate by excess the number of qTDs that will be
+		 * required for the data payload. The exact formula is way more
+		 * complicated and saves at most 2 qTDs, i.e. a total of 128
+		 * bytes.
 		 */
-		qtd_count += DIV_ROUND_UP(length, xfr_sz);
-		/* ZLPs also need a qTD. */
-		if (!qtd_count)
-			qtd_count++;
+		qtd_count += 2 + length / xfr_sz;
 	}
 /*
- * Threshold value based on the worst-case total size of the qTDs to allocate
- * for a mass-storage transfer of 65535 blocks of 512 bytes.
+ * Threshold value based on the worst-case total size of the allocated qTDs for
+ * a mass-storage transfer of 65535 blocks of 512 bytes.
  */
-#if CONFIG_SYS_MALLOC_LEN <= 128 * 1024
+#if CONFIG_SYS_MALLOC_LEN <= 64 + 128 * 1024
 #warning CONFIG_SYS_MALLOC_LEN may be too small for EHCI
 #endif
 	qtd = memalign(USB_DMA_MINALIGN, qtd_count * sizeof(struct qTD));
@@ -313,8 +312,9 @@ ehci_submit_async(struct usb_device *dev, unsigned long pipe, void *buffer,
 	 */
 	qh->qh_link = cpu_to_hc32((uint32_t)qh_list | QH_LINK_TYPE_QH);
 	c = usb_pipespeed(pipe) != USB_SPEED_HIGH && !usb_pipeendpoint(pipe);
+	maxpacket = usb_maxpacket(dev, pipe);
 	endpt = QH_ENDPT1_RL(8) | QH_ENDPT1_C(c) |
-		QH_ENDPT1_MAXPKTLEN(usb_maxpacket(dev, pipe)) | QH_ENDPT1_H(0) |
+		QH_ENDPT1_MAXPKTLEN(maxpacket) | QH_ENDPT1_H(0) |
 		QH_ENDPT1_DTC(QH_ENDPT1_DTC_DT_FROM_QTD) |
 		QH_ENDPT1_EPS(usb_pipespeed(pipe)) |
 		QH_ENDPT1_ENDPT(usb_pipeendpoint(pipe)) | QH_ENDPT1_I(0) |
@@ -373,9 +373,9 @@ ehci_submit_async(struct usb_device *dev, unsigned long pipe, void *buffer,
 			xfr_bytes -= (uint32_t)buf_ptr & (EHCI_PAGE_SIZE - 1);
 			/*
 			 * In order to keep each packet within a qTD transfer,
-			 * align the qTD transfer size to EHCI_PAGE_SIZE.
+			 * align the qTD transfer size to PKT_ALIGN.
 			 */
-			xfr_bytes &= ~(EHCI_PAGE_SIZE - 1);
+			xfr_bytes &= ~(PKT_ALIGN - 1);
 			/*
 			 * This transfer may be shorter than the available qTD
 			 * transfer size that has just been computed.
@@ -411,6 +411,13 @@ ehci_submit_async(struct usb_device *dev, unsigned long pipe, void *buffer,
 			/* Update previous qTD! */
 			*tdp = cpu_to_hc32((uint32_t)&qtd[qtd_counter]);
 			tdp = &qtd[qtd_counter++].qt_next;
+			/*
+			 * Data toggle has to be adjusted since the qTD transfer
+			 * size is not always an even multiple of
+			 * wMaxPacketSize.
+			 */
+			if ((xfr_bytes / maxpacket) & 1)
+				toggle ^= 1;
 			buf_ptr += xfr_bytes;
 			left_length -= xfr_bytes;
 		} while (left_length > 0);
@@ -426,7 +433,7 @@ ehci_submit_async(struct usb_device *dev, unsigned long pipe, void *buffer,
 		 */
 		qtd[qtd_counter].qt_next = cpu_to_hc32(QT_NEXT_TERMINATE);
 		qtd[qtd_counter].qt_altnext = cpu_to_hc32(QT_NEXT_TERMINATE);
-		token = QT_TOKEN_DT(toggle) | QT_TOKEN_TOTALBYTES(0) |
+		token = QT_TOKEN_DT(1) | QT_TOKEN_TOTALBYTES(0) |
 			QT_TOKEN_IOC(1) | QT_TOKEN_CPAGE(0) | QT_TOKEN_CERR(3) |
 			QT_TOKEN_PID(usb_pipein(pipe) ?
 				QT_TOKEN_PID_OUT : QT_TOKEN_PID_IN) |
@@ -931,11 +938,11 @@ submit_int_msg(struct usb_device *dev, unsigned long pipe, void *buffer,
 	 * because bInterval is ignored.
 	 *
 	 * Also, ehci_submit_async() relies on wMaxPacketSize being a power of 2
-	 * if several qTDs are required, while the USB specification does not
-	 * constrain this for interrupt transfers. That means that
-	 * ehci_submit_async() would support interrupt transfers requiring
-	 * several transactions only as long as the transfer size does not
-	 * require more than a single qTD.
+	 * <= PKT_ALIGN if several qTDs are required, while the USB
+	 * specification does not constrain this for interrupt transfers. That
+	 * means that ehci_submit_async() would support interrupt transfers
+	 * requiring several transactions only as long as the transfer size does
+	 * not require more than a single qTD.
 	 */
 	if (length > usb_maxpacket(dev, pipe)) {
 		printf("%s: Interrupt transfers requiring several transactions "
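
The data toggle adjustment added in the splitting loop can be checked in isolation. The following is a minimal sketch (not part of the patch) with hypothetical helper names. It compares the parity test used in the patch against a reference that flips the toggle once per packet, and assumes that xfr_bytes is a multiple of maxpacket, which is what the PKT_ALIGN alignment is meant to ensure for every qTD except possibly the final one.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Data toggle expected by the first packet of the next qTD, computed with
 * the same parity test as in the patch.
 */
static unsigned int toggle_after_qtd(unsigned int toggle, uint32_t xfr_bytes,
				     uint32_t maxpacket)
{
	assert(maxpacket != 0);

	/* An odd number of packets leaves the toggle flipped. */
	if ((xfr_bytes / maxpacket) & 1)
		toggle ^= 1;

	return toggle;
}

/*
 * Reference for comparison only: flip the toggle once per packet.
 * Assumes xfr_bytes is a multiple of maxpacket.
 */
static unsigned int toggle_per_packet(unsigned int toggle, uint32_t xfr_bytes,
				      uint32_t maxpacket)
{
	uint32_t packets;

	assert(maxpacket != 0);

	for (packets = xfr_bytes / maxpacket; packets > 0; packets--)
		toggle ^= 1;

	return toggle;
}
```

With a wMaxPacketSize of 512, a first qTD of 19968 bytes carries 39 packets and flips the toggle, whereas a full qTD of 20480 bytes carries 40 packets and leaves it unchanged. With the previous EHCI_PAGE_SIZE alignment, every non-final qTD size was a multiple of 4096 bytes, hence an even number of packets for any power-of-2 wMaxPacketSize up to 512, so no such adjustment was needed.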