Message ID | 1430457130-16003-1-git-send-email-ast@plumgrid.com |
---|---|
State | Changes Requested, archived |
Delegated to: | David Miller |
Headers | show |
On Thu, 2015-04-30 at 22:12 -0700, Alexei Starovoitov wrote: > Introduce 'RX' mode for pktgen which generates the packets > using familiar pktgen commands, but feeds them into > netif_receive_skb() instead of ndo_start_xmit(). > Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> > --- > v2->v3: addressed more Eric comments. Thanks! > > v1->v2: as suggested by Eric: > - dropped 'clone_skb' flag, now it will return enotsupp > - fix rps/rfs bug by checking skb->users after every netif_receive_skb > - tested with RPS/RFS, taps, veth, physical devs, various tc cls/act Acked-by: Eric Dumazet <edumazet@google.com> Thanks ! -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, 30 Apr 2015 22:12:10 -0700 Alexei Starovoitov <ast@plumgrid.com> wrote: > Introduce 'RX' mode for pktgen which generates the packets > using familiar pktgen commands, but feeds them into > netif_receive_skb() instead of ndo_start_xmit(). > > It is designed to test netif_receive_skb and ingress qdisc > performace only. Make sure to understand how it works before > using it for other rx benchmarking. Hi Alexei First of all I love the idea of modifying pktgen to performance test the RX path. I'm not sure the simple "rx" flag is a good "name". It likely conflicts with other work where pktgen can receive it own packets, e.g. https://people.kth.se/~danieltt/pktgen/ or Ben Greer's solution. In your patch several things are not pktgen "compliant": 1. Flags in pktgen are normally in upper-case "RX" 2. Flags also require a disable "!RX" option 3. You didn't add the flag to list of supported flags 4. You don't output if the flag is enabled 5. You didn't update the documentation (Documentation/networking/pktgen.txt) Cc.ed Robert and Ben, and left the patch below for their review. > Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> > --- > v2->v3: addressed more Eric comments. Thanks! > > v1->v2: as suggested by Eric: > - dropped 'clone_skb' flag, now it will return enotsupp > - fix rps/rfs bug by checking skb->users after every netif_receive_skb > - tested with RPS/RFS, taps, veth, physical devs, various tc cls/act > > net/core/pktgen.c | 49 +++++++++++++++++++++++++++++++++++++++++++++++-- > 1 file changed, 47 insertions(+), 2 deletions(-) > > diff --git a/net/core/pktgen.c b/net/core/pktgen.c > index 508155b283dd..34fd5ece2681 100644 > --- a/net/core/pktgen.c > +++ b/net/core/pktgen.c > @@ -203,6 +203,7 @@ > #define F_NODE (1<<15) /* Node memory alloc*/ > #define F_UDPCSUM (1<<16) /* Include UDP checksum */ > #define F_NO_TIMESTAMP (1<<17) /* Don't timestamp packets (default TS) */ > +#define F_DO_RX (1<<18) /* generate packets for RX */ > > /* Thread control flag bits */ > #define T_STOP (1<<0) /* Stop run */ > @@ -1081,7 +1082,8 @@ static ssize_t pktgen_if_write(struct file *file, > if (len < 0) > return len; > if ((value > 0) && > - (!(pkt_dev->odev->priv_flags & IFF_TX_SKB_SHARING))) > + ((pkt_dev->flags & F_DO_RX) || > + !(pkt_dev->odev->priv_flags & IFF_TX_SKB_SHARING))) > return -ENOTSUPP; > i += len; > pkt_dev->clone_skb = value; > @@ -1089,6 +1091,12 @@ static ssize_t pktgen_if_write(struct file *file, > sprintf(pg_result, "OK: clone_skb=%d", pkt_dev->clone_skb); > return count; > } > + if (!strcmp(name, "rx")) { > + pkt_dev->flags |= F_DO_RX; > + > + sprintf(pg_result, "OK: RX is ON"); > + return count; > + } > if (!strcmp(name, "count")) { > len = num_arg(&user_buffer[i], 10, &value); > if (len < 0) > @@ -1134,7 +1142,7 @@ static ssize_t pktgen_if_write(struct file *file, > return len; > > i += len; > - if ((value > 1) && > + if ((value > 1) && !(pkt_dev->flags & F_DO_RX) && > (!(pkt_dev->odev->priv_flags & IFF_TX_SKB_SHARING))) > return -ENOTSUPP; > pkt_dev->burst = value < 1 ? 1 : value; > @@ -3317,6 +3325,7 @@ static void pktgen_xmit(struct pktgen_dev *pkt_dev) > unsigned int burst = ACCESS_ONCE(pkt_dev->burst); > struct net_device *odev = pkt_dev->odev; > struct netdev_queue *txq; > + struct sk_buff *skb; > int ret; > > /* If device is offline, then don't send */ > @@ -3349,11 +3358,46 @@ static void pktgen_xmit(struct pktgen_dev *pkt_dev) > pkt_dev->last_pkt_size = pkt_dev->skb->len; > pkt_dev->allocated_skbs++; > pkt_dev->clone_count = 0; /* reset counter */ > + if (pkt_dev->flags & F_DO_RX) > + pkt_dev->skb->protocol = eth_type_trans(pkt_dev->skb, > + pkt_dev->skb->dev); > } > > if (pkt_dev->delay && pkt_dev->last_ok) > spin(pkt_dev, pkt_dev->next_tx); > > + if (pkt_dev->flags & F_DO_RX) { > + skb = pkt_dev->skb; > + atomic_add(burst, &skb->users); > + local_bh_disable(); > + do { > + ret = netif_receive_skb(skb); > + if (ret == NET_RX_DROP) > + pkt_dev->errors++; > + pkt_dev->sofar++; > + pkt_dev->seq_num++; > + if (atomic_read(&skb->users) != burst) { > + /* skb was queued by rps/rfs or taps, > + * so cannot reuse this skb > + */ > + atomic_sub(burst - 1, &skb->users); > + /* get out of the loop and wait > + * until skb is consumed > + */ > + pkt_dev->last_ok = 1; > + pkt_dev->clone_skb = 0; > + break; > + } > + /* skb was 'freed' by stack, so clean few > + * bits and reuse it > + */ > +#ifdef CONFIG_NET_CLS_ACT > + skb->tc_verd = 0; /* reset reclass/redir ttl */ > +#endif > + } while (--burst > 0); > + goto out; > + } > + > txq = skb_get_tx_queue(odev, pkt_dev->skb); > > local_bh_disable(); > @@ -3401,6 +3445,7 @@ xmit_more: > unlock: > HARD_TX_UNLOCK(odev, txq); > > +out: > local_bh_enable(); > > /* If pkt_dev->count is zero, then run forever */
Hi Jesper, On 05/02/2015 10:46 AM, Jesper Dangaard Brouer wrote: ... > First of all I love the idea of modifying pktgen to performance test > the RX path. > > I'm not sure the simple "rx" flag is a good "name". It likely > conflicts with other work where pktgen can receive it own packets, e.g. > https://people.kth.se/~danieltt/pktgen/ or Ben Greer's solution. Why do we start caring about out of tree code now? We never have, really. If there is no interest in merging this stuff upstream, then it's always the case that _their code_ needs to adapt iff you want to run on a more recent kernel; the kernel dictates the uapi, not some out of tree module. ;) Cheers, Daniel -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Sat, 02 May 2015 11:44:45 +0200 Daniel Borkmann <daniel@iogearbox.net> wrote: > Hi Jesper, > > On 05/02/2015 10:46 AM, Jesper Dangaard Brouer wrote: > ... > > First of all I love the idea of modifying pktgen to performance test > > the RX path. > > > > I'm not sure the simple "rx" flag is a good "name". It likely > > conflicts with other work where pktgen can receive it own packets, e.g. > > https://people.kth.se/~danieltt/pktgen/ or Ben Greer's solution. > > Why do we start caring about out of tree code now? We never have, > really. If there is no interest in merging this stuff upstream, > then it's always the case that _their code_ needs to adapt iff you > want to run on a more recent kernel; the kernel dictates the uapi, > not some out of tree module. ;) Sure, out-of-tree code should not control our choices. I personally just don't like the "RX" flag name. What about "STACK_RX" or "RX_INJECT"?
On 05/02/2015 11:54 AM, Jesper Dangaard Brouer wrote: ... > I personally just don't like the "RX" flag name. > What about "STACK_RX" or "RX_INJECT"? I'm not sure if it actually qualifies as a flag from user space point of view, besides that it's used as an implementation detail. You could have an option "xmit_mode [rx|tx]" (where obviously tx is the default), perhaps that makes it clearer that you xmit into the kernel's rx path. Cheers, Daniel -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 5/2/15 1:46 AM, Jesper Dangaard Brouer wrote: > On Thu, 30 Apr 2015 22:12:10 -0700 > Alexei Starovoitov <ast@plumgrid.com> wrote: > >> Introduce 'RX' mode for pktgen which generates the packets >> using familiar pktgen commands, but feeds them into >> netif_receive_skb() instead of ndo_start_xmit(). >> >> It is designed to test netif_receive_skb and ingress qdisc >> performace only. Make sure to understand how it works before >> using it for other rx benchmarking. > > Hi Alexei > > First of all I love the idea of modifying pktgen to performance test > the RX path. > > I'm not sure the simple "rx" flag is a good "name". It likely > conflicts with other work where pktgen can receive it own packets, e.g. > https://people.kth.se/~danieltt/pktgen/ or Ben Greer's solution. > > In your patch several things are not pktgen "compliant": > 1. Flags in pktgen are normally in upper-case "RX" > 2. Flags also require a disable "!RX" option > 3. You didn't add the flag to list of supported flags > 4. You don't output if the flag is enabled > 5. You didn't update the documentation (Documentation/networking/pktgen.txt) It's actually not a flag, because it cannot be disabled by design. It cannot be flipped back and forth, because it affects what other real flags can be applied. It's a _mode_. I don't see yet how I can safely switch this mode back into tx while things are running. It would need a whole new mechanism of stopping things and so on. I wanted to start simple. For 5, yeah, agree, need to update the doc. As far as name, I don't have preferences. Will 'stack_inject' sound better? I can respin with that name if you like, but disabling it on the fly is a major change. I'd rather do it in small steps. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Sat, 02 May 2015 09:01:27 -0700 Alexei Starovoitov <ast@plumgrid.com> wrote: > On 5/2/15 1:46 AM, Jesper Dangaard Brouer wrote: > > On Thu, 30 Apr 2015 22:12:10 -0700 > > Alexei Starovoitov <ast@plumgrid.com> wrote: > > > >> Introduce 'RX' mode for pktgen which generates the packets > >> using familiar pktgen commands, but feeds them into > >> netif_receive_skb() instead of ndo_start_xmit(). > >> > >> It is designed to test netif_receive_skb and ingress qdisc > >> performace only. Make sure to understand how it works before > >> using it for other rx benchmarking. > > > > Hi Alexei > > > > First of all I love the idea of modifying pktgen to performance test > > the RX path. > > > > I'm not sure the simple "rx" flag is a good "name". It likely > > conflicts with other work where pktgen can receive it own packets, e.g. > > https://people.kth.se/~danieltt/pktgen/ or Ben Greer's solution. > > > > In your patch several things are not pktgen "compliant": > > 1. Flags in pktgen are normally in upper-case "RX" > > 2. Flags also require a disable "!RX" option > > 3. You didn't add the flag to list of supported flags > > 4. You don't output if the flag is enabled > > 5. You didn't update the documentation (Documentation/networking/pktgen.txt) > > It's actually not a flag, because it cannot be disabled by design. > It cannot be flipped back and forth, because it affects what other > real flags can be applied. It's a _mode_. > I don't see yet how I can safely switch this mode back into tx while > things are running. It would need a whole new mechanism of stopping > things and so on. I wanted to start simple. > For 5, yeah, agree, need to update the doc. > As far as name, I don't have preferences. Will 'stack_inject' sound > better? I can respin with that name if you like, but disabling it on > the fly is a major change. I'd rather do it in small steps. Okay, I guess it makes sense as a "mode". Calling it "stack_inject" is fairly descriptive, but I also like Daniel's idea of option "xmit_mode [rx|tx]". I would not allow it to be reconfigured on the fly, but only when the test/thread is stopped (!pkt_dev->running), allow re-configuring back to "TX" mode. Although, removing (and free) the entire pkt_dev, would implicitly reset it back.
On 5/2/15 9:46 AM, Jesper Dangaard Brouer wrote: > Okay, I guess it makes sense as a "mode". > > Calling it "stack_inject" is fairly descriptive, but I also like > Daniel's idea of option "xmit_mode [rx|tx]". sure. that's fine too. > I would not allow it to be reconfigured on the fly, but only when the > test/thread is stopped (!pkt_dev->running), allow re-configuring back > to "TX" mode. Although, removing (and free) the entire pkt_dev, would > implicitly reset it back. all makes sense. btw, I'm off the grid till Monday and it sounds like you see concrete spots in the code to hack this stuff in, so if you want to take over, please go ahead :) -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Sat, 02 May 2015 10:07:48 -0700 Alexei Starovoitov <ast@plumgrid.com> wrote: > On 5/2/15 9:46 AM, Jesper Dangaard Brouer wrote: [...] > > all makes sense. > btw, I'm off the grid till Monday and it sounds like you see concrete > spots in the code to hack this stuff in, so if you want to take over, > please go ahead :) Okay, I'll address my own concerns and post a patch in your name("From:") As I would like to see this feature in pktgen, as it provides a really easy and quick way to measure/profile changes in the ingress path...
On 5/5/15 11:15 AM, Jesper Dangaard Brouer wrote: > > On Sat, 02 May 2015 10:07:48 -0700 Alexei Starovoitov <ast@plumgrid.com> wrote: >> On 5/2/15 9:46 AM, Jesper Dangaard Brouer wrote: > [...] >> >> all makes sense. >> btw, I'm off the grid till Monday and it sounds like you see concrete >> spots in the code to hack this stuff in, so if you want to take over, >> please go ahead :) > > Okay, I'll address my own concerns and post a patch in your name("From:") lol :) I seriously don't care about commit count. Thank you for taking over. I'm swamped with tracing stuff at the moment. > As I would like to see this feature in pktgen, as it provides a really > easy and quick way to measure/profile changes in the ingress path... same here. my motivation for this patch is to have a common tool to measure ingress speed, so we can have concrete numbers to talk about. Also, as you mentioned, there is always a danger of 'zooming in'. Any type of benchmarking shouldn't be blindly followed. At the same time all numbers are useful. They are different datapoints that make a complete picture. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
The following series introduce some pktgen changes Patch01: Cleanup my own work when I introduced NO_TIMESTAMP. Patch02: Took over patch from Alexei, and addressed my own concerns, as Alexie is too busy with other work, and this will provide an easy tool for measuring ingress path performance, which is a hot topic ATM. Changes were primarily user interface related. Introduced a separate "xmit_mode" setting, instead of stealing one of the dev flags like Alexei did. --- Alexei Starovoitov (1): pktgen: introduce xmit_mode 'rx_inject' Jesper Dangaard Brouer (1): pktgen: adjust flag NO_TIMESTAMP to be more pktgen compliant Documentation/networking/pktgen.txt | 9 ++++ net/core/pktgen.c | 87 +++++++++++++++++++++++++++++++++-- 2 files changed, 91 insertions(+), 5 deletions(-)
diff --git a/net/core/pktgen.c b/net/core/pktgen.c index 508155b283dd..34fd5ece2681 100644 --- a/net/core/pktgen.c +++ b/net/core/pktgen.c @@ -203,6 +203,7 @@ #define F_NODE (1<<15) /* Node memory alloc*/ #define F_UDPCSUM (1<<16) /* Include UDP checksum */ #define F_NO_TIMESTAMP (1<<17) /* Don't timestamp packets (default TS) */ +#define F_DO_RX (1<<18) /* generate packets for RX */ /* Thread control flag bits */ #define T_STOP (1<<0) /* Stop run */ @@ -1081,7 +1082,8 @@ static ssize_t pktgen_if_write(struct file *file, if (len < 0) return len; if ((value > 0) && - (!(pkt_dev->odev->priv_flags & IFF_TX_SKB_SHARING))) + ((pkt_dev->flags & F_DO_RX) || + !(pkt_dev->odev->priv_flags & IFF_TX_SKB_SHARING))) return -ENOTSUPP; i += len; pkt_dev->clone_skb = value; @@ -1089,6 +1091,12 @@ static ssize_t pktgen_if_write(struct file *file, sprintf(pg_result, "OK: clone_skb=%d", pkt_dev->clone_skb); return count; } + if (!strcmp(name, "rx")) { + pkt_dev->flags |= F_DO_RX; + + sprintf(pg_result, "OK: RX is ON"); + return count; + } if (!strcmp(name, "count")) { len = num_arg(&user_buffer[i], 10, &value); if (len < 0) @@ -1134,7 +1142,7 @@ static ssize_t pktgen_if_write(struct file *file, return len; i += len; - if ((value > 1) && + if ((value > 1) && !(pkt_dev->flags & F_DO_RX) && (!(pkt_dev->odev->priv_flags & IFF_TX_SKB_SHARING))) return -ENOTSUPP; pkt_dev->burst = value < 1 ? 1 : value; @@ -3317,6 +3325,7 @@ static void pktgen_xmit(struct pktgen_dev *pkt_dev) unsigned int burst = ACCESS_ONCE(pkt_dev->burst); struct net_device *odev = pkt_dev->odev; struct netdev_queue *txq; + struct sk_buff *skb; int ret; /* If device is offline, then don't send */ @@ -3349,11 +3358,46 @@ static void pktgen_xmit(struct pktgen_dev *pkt_dev) pkt_dev->last_pkt_size = pkt_dev->skb->len; pkt_dev->allocated_skbs++; pkt_dev->clone_count = 0; /* reset counter */ + if (pkt_dev->flags & F_DO_RX) + pkt_dev->skb->protocol = eth_type_trans(pkt_dev->skb, + pkt_dev->skb->dev); } if (pkt_dev->delay && pkt_dev->last_ok) spin(pkt_dev, pkt_dev->next_tx); + if (pkt_dev->flags & F_DO_RX) { + skb = pkt_dev->skb; + atomic_add(burst, &skb->users); + local_bh_disable(); + do { + ret = netif_receive_skb(skb); + if (ret == NET_RX_DROP) + pkt_dev->errors++; + pkt_dev->sofar++; + pkt_dev->seq_num++; + if (atomic_read(&skb->users) != burst) { + /* skb was queued by rps/rfs or taps, + * so cannot reuse this skb + */ + atomic_sub(burst - 1, &skb->users); + /* get out of the loop and wait + * until skb is consumed + */ + pkt_dev->last_ok = 1; + pkt_dev->clone_skb = 0; + break; + } + /* skb was 'freed' by stack, so clean few + * bits and reuse it + */ +#ifdef CONFIG_NET_CLS_ACT + skb->tc_verd = 0; /* reset reclass/redir ttl */ +#endif + } while (--burst > 0); + goto out; + } + txq = skb_get_tx_queue(odev, pkt_dev->skb); local_bh_disable(); @@ -3401,6 +3445,7 @@ xmit_more: unlock: HARD_TX_UNLOCK(odev, txq); +out: local_bh_enable(); /* If pkt_dev->count is zero, then run forever */
Introduce 'RX' mode for pktgen which generates the packets using familiar pktgen commands, but feeds them into netif_receive_skb() instead of ndo_start_xmit(). It is designed to test netif_receive_skb and ingress qdisc performace only. Make sure to understand how it works before using it for other rx benchmarking. Sample script 'pktgen.sh': \#!/bin/bash function pgset() { local result echo $1 > $PGDEV result=`cat $PGDEV | fgrep "Result: OK:"` if [ "$result" = "" ]; then cat $PGDEV | fgrep Result: fi } ETH=$1 PGDEV=/proc/net/pktgen/kpktgend_0 pgset "rem_device_all" pgset "add_device $ETH" PGDEV=/proc/net/pktgen/$ETH pgset "rx" pgset "pkt_size 60" pgset "dst 99.1.1.2" pgset "dst_mac 90:e2:ba:6e:a8:e5" pgset "count 10000000" pgset "burst 32" PGDEV=/proc/net/pktgen/pgctrl echo "Running... ctrl^C to stop" pgset "start" echo "Done" cat /proc/net/pktgen/$ETH Usage: $ sudo ./pktgen.sh eth2 ... Result: OK: 232376(c232372+d3) usec, 10000000 (60byte,0frags) 43033682pps 20656Mb/sec (20656167360bps) errors: 10000000 Raw netif_receive_skb speed should be ~43 million packet per second on 3.7Ghz x86 and 'perf report' should look like: 37.69% kpktgend_0 [kernel.vmlinux] [k] __netif_receive_skb_core 25.81% kpktgend_0 [kernel.vmlinux] [k] kfree_skb 7.22% kpktgend_0 [kernel.vmlinux] [k] ip_rcv 5.68% kpktgend_0 [pktgen] [k] pktgen_thread_worker if fib_table_lookup is seen on top, it means skb was processed by the stack. To benchmark netif_receive_skb only make sure that 'dst_mac' of your pktgen script is different from receiving device mac and it will be dropped by ip_rcv Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> --- v2->v3: addressed more Eric comments. Thanks! v1->v2: as suggested by Eric: - dropped 'clone_skb' flag, now it will return enotsupp - fix rps/rfs bug by checking skb->users after every netif_receive_skb - tested with RPS/RFS, taps, veth, physical devs, various tc cls/act net/core/pktgen.c | 49 +++++++++++++++++++++++++++++++++++++++++++++++-- 1 file changed, 47 insertions(+), 2 deletions(-)