| Message ID | 20100715142418.GA26491@host-a-229.ustcsz.edu.cn |
|---|---|
| State | RFC, archived |
| Delegated to: | David Miller |
On Thu, 2010-07-15 at 22:24 +0800, Junchang Wang wrote:
> Hi list,
> My understanding of the way that NICs deliver packets to the kernel is
> as follows. Correct me if any of this is wrong. Thanks.
>
> 1) The device buffer is fixed. When the kernel is notified of the arrival
> of a new packet, it dynamically allocates a new skb and copies the packet
> into it. For example, 8139too.
>
> 2) The device buffer is mapped by streaming DMA. When the kernel is
> notified of the arrival of a new packet, it unmaps the region previously
> mapped. Obviously, there is NO memcpy operation. The additional cost is
> the streaming DMA map/unmap operations. For example, e100 and e1000.
>
> Here come my questions:
> 1) Is there a principle indicating which one is better? Are streaming DMA
> map/unmap operations more expensive than a memcpy operation?

DMA should result in lower CPU usage and higher maximum performance.

> 2) Why does r8169 bias towards the first approach even though it supports
> both? I converted r8169 to the second one and got a 5% performance boost.
> Below is the result of running a netperf TCP_STREAM test with 1.6K byte
> packet length.
>            scheme 1    scheme 2    Imp.
> r8169      683M        718M        5%
[...]

You should also compare the CPU usage.

Ben.
On Thu, 15 Jul 2010 15:33:37 +0100 Ben Hutchings <bhutchings@solarflare.com> wrote:
> On Thu, 2010-07-15 at 22:24 +0800, Junchang Wang wrote:
> [...]
> > 2) Why does r8169 bias towards the first approach even though it supports
> > both? I converted r8169 to the second one and got a 5% performance boost.
> > Below is the result of running a netperf TCP_STREAM test with 1.6K byte
> > packet length.
> >            scheme 1    scheme 2    Imp.
> > r8169      683M        718M        5%
> [...]
>
> You should also compare the CPU usage.

Also, many drivers copy small receives into a new buffer, which saves space and often gives better performance.

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Junchang Wang <junchangwang@gmail.com> :
[...]
> 2) Why does r8169 bias towards the first approach even though it supports both?
It is a simple, straightforward fix for an 8169 hardware bug.
See commit c0cd884af045338476b8e69a61fceb3f34ff22f1.
> > You should also compare the CPU usage.
> >
> > Ben.

Hi Ben,
I added options -c -C to netperf's command line. The result is as follows:
              scheme 1    scheme 2    Imp.
Throughput:   683M        718M        5%
CPU usage:    47.8%       45.6%

That really surprised me, because the "top" command showed the CPU usage was fluctuating between 0.5% and 1.5% rather than between 45% and 50%. How can I get the exact CPU usage? Thanks.
> It is a simple, straightforward fix for an 8169 hardware bug.
>
> See commit c0cd884af045338476b8e69a61fceb3f34ff22f1.

Fortunately, it seems my device is unaffected by this issue. :)

Thanks Francois.
Junchang Wang wrote:
> > You should also compare the CPU usage.
> >
> > Ben.
>
> Hi Ben,
> I added options -c -C to netperf's command line. The result is as follows:
>               scheme 1    scheme 2    Imp.
> Throughput:   683M        718M        5%
> CPU usage:    47.8%       45.6%
>
> That really surprised me because the "top" command showed the CPU usage
> was fluctuating between 0.5% and 1.5% rather than between 45% and 50%.

Can you tell us a bit more about the system, and which version of netperf you are using? Any chance that the CPU utilization you were looking at in top was just that being charged to the netperf process? "Network processing" does not often get charged to the responsible process, so netperf reports system-wide CPU utilization on the assumption that it is the only thing causing the CPUs to be utilized.

happy benchmarking,

rick jones
On Fri, Jul 16, 2010 at 10:58:46AM -0700, Rick Jones wrote:
> > Hi Ben,
> > I added options -c -C to netperf's command line. The result is as follows:
> >               scheme 1    scheme 2    Imp.
> > Throughput:   683M        718M        5%
> > CPU usage:    47.8%       45.6%
> >
> > That really surprised me because the "top" command showed the CPU usage
> > was fluctuating between 0.5% and 1.5% rather than between 45% and 50%.

Hi Rick,
Very sorry for my late reply. Just recovered from the final exam. :)

> Can you tell us a bit more about the system, and which version of
> netperf you are using?

The target machine is a Pentium Dual-Core E2200 desktop with an r8169 gigabit NIC. (I couldn't find a better server with an old PCI slot.)

The other machine is a Nehalem-based system with an Intel 82576 NIC.

The target machine runs netserver and the Nehalem machine runs netperf. The version of netperf is 2.4.5.

> Any chance that the CPU utilization you were
> looking at in top was just that being charged to the netperf process?

What I see on the target machine is as follows:

top - 21:37:12 up 21 min,  6 users,  load average: 0.43, 0.28, 0.19
Tasks: 152 total,   2 running, 149 sleeping,   0 stopped,   1 zombie
Cpu(s):  2.3%us,  1.5%sy,  0.1%ni, 89.5%id,  2.7%wa,  0.0%hi,  3.9%si,  0.0%
Mem:   2074064k total,   690200k used,  1383864k free,    39372k buffers
Swap:  2096476k total,        0k used,  2096476k free,   435044k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 3916 root      20   0  2228  584  296 R 84.6  0.0   0:07.12 netserver

It shows the CPU usage of the target machine is around 10%, while the Nehalem machine's report is as follows:

TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.2.1 (192.168.2.1) port 0 AF_INET
Recv   Send    Send                          Utilization       Service Demand
Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
Size   Size    Size     Time     Throughput  local    remote   local   remote
bytes  bytes   bytes    secs.    10^6bits/s  % S      % S      us/KB   us/KB

 87380  16384  16384    10.05       679.79   1.63     48.27    1.571   11.634

It shows the CPU usage of the target machine is 48.27%.
> "Network processing" does not often get charged to the responsible
> process, so netperf reports system-wide CPU utilization on the
> assumption it is the only thing causing the CPUs to be utilized.

My understanding of your comments is:

1) Except when running in ksoftirqd, network processing cannot be correctly counted, because it runs in interrupt contexts that do not get charged to a particular process. So "top" misses a lot of CPU usage in high-interrupt-rate network situations.

2) As you mentioned in netperf's manual, netperf uses /proc/stat on Linux to retrieve the time spent in idle mode. In other words, it accumulates the CPU time spent in all other modes, including hardware interrupt, software interrupt, etc., making the CPU usage more accurate in high-interrupt situations.

3) Since most processes on the target machine are sleeping, the CPU usage of network processing is actually very close to 48.27%.

Right? Correct me if any of these are incorrect. Thanks.

--Junchang
Junchang Wang wrote:
> On Fri, Jul 16, 2010 at 10:58:46AM -0700, Rick Jones wrote:
> > > Hi Ben,
> > > I added options -c -C to netperf's command line. The result is as follows:
> > >               scheme 1    scheme 2    Imp.
> > > Throughput:   683M        718M        5%
> > > CPU usage:    47.8%       45.6%
> > >
> > > That really surprised me because the "top" command showed the CPU usage
> > > was fluctuating between 0.5% and 1.5% rather than between 45% and 50%.
>
> Hi Rick,
> Very sorry for my late reply. Just recovered from the final exam. :)
>
> > Can you tell us a bit more about the system, and which version of
> > netperf you are using?
>
> The target machine is a Pentium Dual-Core E2200 desktop with an r8169
> gigabit NIC. (I couldn't find a better server with an old PCI slot.)
>
> The other machine is a Nehalem-based system with an Intel 82576 NIC.
>
> The target machine runs netserver and the Nehalem machine runs netperf.
> The version of netperf is 2.4.5.
>
> > Any chance that the CPU utilization you were
> > looking at in top was just that being charged to the netperf process?
>
> What I see on the target machine is as follows:
>
> top - 21:37:12 up 21 min,  6 users,  load average: 0.43, 0.28, 0.19
> Tasks: 152 total,   2 running, 149 sleeping,   0 stopped,   1 zombie
> Cpu(s):  2.3%us,  1.5%sy,  0.1%ni, 89.5%id,  2.7%wa,  0.0%hi,  3.9%si,  0.0%
> Mem:   2074064k total,   690200k used,  1383864k free,    39372k buffers
> Swap:  2096476k total,        0k used,  2096476k free,   435044k cached
>
>   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
>  3916 root      20   0  2228  584  296 R 84.6  0.0   0:07.12 netserver

You said this was a dual-core system, right? So two cores, no threads? If so, then that does look odd: if netserver is consuming 84% of a CPU (core) and there are only two CPUs (cores) in the system, how the system can be 89.5% idle is beyond me. The 48% reported by netperf below makes better sense.

If you press "1" while top is running, it should start to show per-CPU statistics.

> It shows the CPU usage of the target machine is around 10%,
> while the Nehalem machine's report is as follows:
>
> TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.2.1 (192.168.2.1) port 0 AF_INET
> Recv   Send    Send                          Utilization       Service Demand
> Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
> Size   Size    Size     Time     Throughput  local    remote   local   remote
> bytes  bytes   bytes    secs.    10^6bits/s  % S      % S      us/KB   us/KB
>
>  87380  16384  16384    10.05       679.79   1.63     48.27    1.571   11.634
>
> It shows the CPU usage of the target machine is 48.27%.

Clearly something is out of joint - let's go off-list (or on to netperf-talk@netperf.org) and hash that out to see what may be happening. It will probably involve variations on grabbing the top-of-trunk, adding the debug option, etc.

> > "Network processing" does not often get charged to the responsible
> > process, so netperf reports system-wide CPU utilization on the
> > assumption it is the only thing causing the CPUs to be utilized.
>
> My understanding of your comments is:
> 1) Except when running in ksoftirqd, network processing cannot be correctly
> counted, because it runs in interrupt contexts that do not get charged to a
> particular process. So "top" misses a lot of CPU usage in high-interrupt-rate
> network situations.

Top *shouldn't* miss it as far as reporting overall CPU utilization goes. It just may not be charged to the process on whose behalf the work is done.

> 2) As you mentioned in netperf's manual, netperf uses /proc/stat on Linux
> to retrieve the time spent in idle mode. In other words, it accumulates the
> CPU time spent in all other modes, including hardware interrupt, software
> interrupt, etc., making the CPU usage more accurate in high-interrupt
> situations.

That is the theory. In practice, however... while the top output you've provided looks like there is an "issue" in top, netperf has been known to have a bug or three.

> 3) Since most processes on the target machine are sleeping, the CPU usage
> of network processing is actually very close to 48.27%.
I do not expect there to be a huge discrepancy between the overall CPU utilization reported by top and the CPU utilization reported by netperf. That there seems to be such a discrepancy has me wanting to make certain that netperf is operating correctly.

happy benchmarking,

rick jones

> Right? Correct me if any of these are incorrect. Thanks.
>
> --Junchang
Hi list,

> Clearly something is out of joint - let's go off-list (or on to
> netperf-talk@netperf.org) and hash that out to see what may be happening.
> It will probably involve variations on grabbing the top-of-trunk, adding
> the debug option etc.

The discrepancy between netperf and top has been worked out. It seems top produces buggy data when I try to send its output to a file. For example, "top -b > output" gives my previous buggy data in its first iteration. The report from top should actually be:

top - 21:37:15 up 21 min,  6 users,  load average: 0.43, 0.28, 0.19
Tasks: 152 total,   2 running, 149 sleeping,   0 stopped,   1 zombie
Cpu(s):  0.2%us,  5.4%sy,  0.0%ni, 50.9%id,  0.0%wa,  0.0%hi, 43.5%si,  0.0%
Mem:   2074064k total,   690192k used,  1383872k free,    39372k buffers
Swap:  2096476k total,        0k used,  2096476k free,   435056k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 3916 root      20   0  2228  584  296 R 86.3  0.0   0:09.72 netserver

I think 50.9% system idle makes sense, because this is a dual-core system and netserver is consuming 86.3% of a core. On average, the CPU usage of the whole system reported by top can be regarded as between 46.2% and 50.1%. netperf's report of 48% is right and testifies that "there is no huge discrepancy between the overall CPU utilization reported by top and the CPU utilization reported by netperf."

Thanks Rick.
diff --git a/drivers/net/r8169.c b/drivers/net/r8169.c
index 239d7ef..707876f 100644
--- a/drivers/net/r8169.c
+++ b/drivers/net/r8169.c
@@ -4556,15 +4556,9 @@ static int rtl8169_rx_interrupt(struct net_device *dev,
 
 			rtl8169_rx_csum(skb, desc);
 
-			if (rtl8169_try_rx_copy(&skb, tp, pkt_size, addr)) {
-				pci_dma_sync_single_for_device(pdev, addr,
-					pkt_size, PCI_DMA_FROMDEVICE);
-				rtl8169_mark_to_asic(desc, tp->rx_buf_sz);
-			} else {
-				pci_unmap_single(pdev, addr, tp->rx_buf_sz,
-						 PCI_DMA_FROMDEVICE);
-				tp->Rx_skbuff[entry] = NULL;
-			}
+			pci_unmap_single(pdev, addr, tp->rx_buf_sz,
+					 PCI_DMA_FROMDEVICE);
+			tp->Rx_skbuff[entry] = NULL;
 
 			skb_put(skb, pkt_size);
 			skb->protocol = eth_type_trans(skb, dev);