
[net-next] hv_netvsc: don't make assumptions on struct flow_keys layout

Message ID 877fjlfrid.fsf@vitty.brq.redhat.com
State RFC, archived
Delegated to: David Miller

Commit Message

Vitaly Kuznetsov Jan. 7, 2016, 1:28 p.m. UTC
Eric Dumazet <eric.dumazet@gmail.com> writes:

> On Thu, 2016-01-07 at 10:33 +0100, Vitaly Kuznetsov wrote:
>> Recent changes to 'struct flow_keys' (e.g. commit d34af823ff40 ("net: Add
>>  VLAN ID to flow_keys")) introduced a performance regression in the netvsc
>> driver. The problem is, however, not the above-mentioned commit but the
>> fact that the netvsc_set_hash() function made assumptions about the struct
>> flow_keys data layout, and this is wrong. We need to extract the data we
>> need (src/dst addresses and ports) after the dissect.
>>
>> The issue could also be solved in a completely different way: as suggested
>> by Eric, instead of our own homegrown netvsc_set_hash() we could use
>> skb_get_hash(), which does more or less the same. Unfortunately, the
>> testing done by Simon showed that Hyper-V hosts are not happy with our
>> Jenkins hash; selecting the output queue with the current algorithm based
>> on Toeplitz hash works significantly better.
>
> Were tests done on IPv6 traffic ?
>

Simon, could you please test this patch for IPv6 and show us the numbers?

> Toeplitz hash takes at least 100 ns to hash 12 bytes (one iteration per
> bit : 96 iterations)
>
> For IPv6 it is 3 times this, since we have to hash 36 bytes.
>
> I do not see how it can compete with skb_get_hash() that directly gives
> skb->hash for local TCP flows.
>

My guess is that this is not the bottleneck; something is happening
behind the scenes with our packets in the Hyper-V host (e.g. re-distributing
them to hardware queues?), but I don't know the internals. Microsoft
folks could probably comment.


> See commits b73c3d0e4f0e1961e15bec18720e48aabebe2109
> ("net: Save TX flow hash in sock and set in skbuf on xmit")
> and 877d1f6291f8e391237e324be58479a3e3a7407c
> ("net: Set sk_txhash from a random number")
>
> I understand Microsoft loves Toeplitz, but this looks not well placed
> here.
>
> I suspect there is another problem.
>
> Please share your numbers and test methodology, and the alternative
> patch Simon tested so that we can double check it.
>

An alternative patch which uses skb_get_hash() is attached. Simon, could you
please share the rest (environment, methodology, numbers) with us here?
Thanks!

> Thanks.
>
> PS: For the time being this patch can probably be applied on -net tree,
> as it fixes a real bug.

Comments

John Fastabend Jan. 8, 2016, 1:02 a.m. UTC | #1
On 16-01-07 05:28 AM, Vitaly Kuznetsov wrote:
> Eric Dumazet <eric.dumazet@gmail.com> writes:
> 
>> On Thu, 2016-01-07 at 10:33 +0100, Vitaly Kuznetsov wrote:
>>> Recent changes to 'struct flow_keys' (e.g. commit d34af823ff40 ("net: Add
>>>  VLAN ID to flow_keys")) introduced a performance regression in the netvsc
>>> driver. The problem is, however, not the above-mentioned commit but the
>>> fact that the netvsc_set_hash() function made assumptions about the struct
>>> flow_keys data layout, and this is wrong. We need to extract the data we
>>> need (src/dst addresses and ports) after the dissect.
>>>
>>> The issue could also be solved in a completely different way: as suggested
>>> by Eric, instead of our own homegrown netvsc_set_hash() we could use
>>> skb_get_hash(), which does more or less the same. Unfortunately, the
>>> testing done by Simon showed that Hyper-V hosts are not happy with our
>>> Jenkins hash; selecting the output queue with the current algorithm based
>>> on Toeplitz hash works significantly better.
>>

Also, can I ask a maybe-naive question? It looks like the hypervisor
is populating some table via a mailbox msg and this is used to select
the queues, I guess with some sort of weighting function?

What happens if you just remove select_queue altogether? Or maybe just:
what is this 16-entry table doing? How does this work on my larger
systems with 64+ cores? Can I only use 16 cores? Sorry, I really have
no experience with Hyper-V and this got me curious.

Thanks,
John

>> Were tests done on IPv6 traffic ?
>>
> 
> Simon, could you please test this patch for IPv6 and show us the numbers?
> 
>> Toeplitz hash takes at least 100 ns to hash 12 bytes (one iteration per
>> bit : 96 iterations)
>>
>> For IPv6 it is 3 times this, since we have to hash 36 bytes.
>>
>> I do not see how it can compete with skb_get_hash() that directly gives
>> skb->hash for local TCP flows.
>>
> 
> My guess is that this is not the bottleneck; something is happening
> behind the scenes with our packets in the Hyper-V host (e.g. re-distributing
> them to hardware queues?), but I don't know the internals. Microsoft
> folks could probably comment.
> 
> 
>> See commits b73c3d0e4f0e1961e15bec18720e48aabebe2109
>> ("net: Save TX flow hash in sock and set in skbuf on xmit")
>> and 877d1f6291f8e391237e324be58479a3e3a7407c
>> ("net: Set sk_txhash from a random number")
>>
>> I understand Microsoft loves Toeplitz, but this looks not well placed
>> here.
>>
>> I suspect there is another problem.
>>
>> Please share your numbers and test methodology, and the alternative
>> patch Simon tested so that we can double check it.
>>
> 
> An alternative patch which uses skb_get_hash() is attached. Simon, could you
> please share the rest (environment, methodology, numbers) with us here?
> Thanks!
> 
>> Thanks.
>>
>> PS: For the time being this patch can probably be applied on -net tree,
>> as it fixes a real bug.
>
KY Srinivasan Jan. 8, 2016, 3:49 a.m. UTC | #2
> -----Original Message-----
> From: John Fastabend [mailto:john.fastabend@gmail.com]
> Sent: Thursday, January 7, 2016 5:02 PM
> To: Vitaly Kuznetsov <vkuznets@redhat.com>; Simon Xiao
> <sixiao@microsoft.com>; Eric Dumazet <eric.dumazet@gmail.com>
> Cc: Tom Herbert <tom@herbertland.com>; netdev@vger.kernel.org; KY
> Srinivasan <kys@microsoft.com>; Haiyang Zhang <haiyangz@microsoft.com>;
> devel@linuxdriverproject.org; linux-kernel@vger.kernel.org; David Miller
> <davem@davemloft.net>
> Subject: Re: [PATCH net-next] hv_netvsc: don't make assumptions on struct
> flow_keys layout
> 
> On 16-01-07 05:28 AM, Vitaly Kuznetsov wrote:
> > Eric Dumazet <eric.dumazet@gmail.com> writes:
> >
> >> On Thu, 2016-01-07 at 10:33 +0100, Vitaly Kuznetsov wrote:
> >>> Recent changes to 'struct flow_keys' (e.g. commit d34af823ff40 ("net: Add
> >>>  VLAN ID to flow_keys")) introduced a performance regression in the netvsc
> >>> driver. The problem is, however, not the above-mentioned commit but the
> >>> fact that the netvsc_set_hash() function made assumptions about the struct
> >>> flow_keys data layout, and this is wrong. We need to extract the data we
> >>> need (src/dst addresses and ports) after the dissect.
> >>>
> >>> The issue could also be solved in a completely different way: as suggested
> >>> by Eric, instead of our own homegrown netvsc_set_hash() we could use
> >>> skb_get_hash(), which does more or less the same. Unfortunately, the
> >>> testing done by Simon showed that Hyper-V hosts are not happy with our
> >>> Jenkins hash; selecting the output queue with the current algorithm based
> >>> on Toeplitz hash works significantly better.
> >>
> 
> Also, can I ask a maybe-naive question? It looks like the hypervisor
> is populating some table via a mailbox msg and this is used to select
> the queues, I guess with some sort of weighting function?
> 
> What happens if you just remove select_queue altogether? Or maybe just:
> what is this 16-entry table doing? How does this work on my larger
> systems with 64+ cores? Can I only use 16 cores? Sorry, I really have
> no experience with Hyper-V and this got me curious.

We will limit the number of VRSS channels to the number of CPUs in
a NUMA node. If the number of CPUs in a NUMA node exceeds 8, we
will only open up 8 VRSS channels. On the host side currently traffic
spreading is done in software and we have found that limiting to 8 CPUs
gives us the best throughput. In Windows Server 2016, we will be 
distributing traffic on the host in hardware; the heuristics in the guest
may change.
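
A rough sketch of the heuristic described above (hypothetical code, not the
actual netvsc implementation; "host_offered" is a placeholder for whatever
channel count the host offers):

static unsigned int sketch_num_vrss_channels(struct device *dev,
					     unsigned int host_offered)
{
	int node = dev_to_node(dev);
	unsigned int node_cpus;
	unsigned int chn;

	/* CPUs on the device's NUMA node, falling back to all online CPUs */
	node_cpus = (node == NUMA_NO_NODE) ? num_online_cpus() :
		cpumask_weight(cpumask_of_node(node));

	/* never more channels than the host offers or the node has CPUs... */
	chn = min(host_offered, node_cpus);

	/* ...and cap at 8, per the throughput findings mentioned above */
	return min_t(unsigned int, chn, 8);
}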

Regards,

K. Y
> 
> Thanks,
> John
> 
> >> Were tests done on IPv6 traffic ?
> >>
> >
> > Simon, could you please test this patch for IPv6 and show us the numbers?
> >
> >> Toeplitz hash takes at least 100 ns to hash 12 bytes (one iteration per
> >> bit : 96 iterations)
> >>
> >> For IPv6 it is 3 times this, since we have to hash 36 bytes.
> >>
> >> I do not see how it can compete with skb_get_hash() that directly gives
> >> skb->hash for local TCP flows.
> >>
> >
> > My guess is that this is not the bottleneck; something is happening
> > behind the scenes with our packets in the Hyper-V host (e.g. re-distributing
> > them to hardware queues?), but I don't know the internals. Microsoft
> > folks could probably comment.
> >
> >
> >> See commits b73c3d0e4f0e1961e15bec18720e48aabebe2109
> >> ("net: Save TX flow hash in sock and set in skbuf on xmit")
> >> and 877d1f6291f8e391237e324be58479a3e3a7407c
> >> ("net: Set sk_txhash from a random number")
> >>
> >> I understand Microsoft loves Toeplitz, but this looks not well placed
> >> here.
> >>
> >> I suspect there is another problem.
> >>
> >> Please share your numbers and test methodology, and the alternative
> >> patch Simon tested so that we can double check it.
> >>
> >
> > An alternative patch which uses skb_get_hash() is attached. Simon, could you
> > please share the rest (environment, methodology, numbers) with us here?
> > Thanks!
> >
> >> Thanks.
> >>
> >> PS: For the time being this patch can probably be applied on -net tree,
> >> as it fixes a real bug.
> >
John Fastabend Jan. 8, 2016, 6:16 a.m. UTC | #3
On 16-01-07 07:49 PM, KY Srinivasan wrote:
> 
> 
>> -----Original Message-----
>> From: John Fastabend [mailto:john.fastabend@gmail.com]
>> Sent: Thursday, January 7, 2016 5:02 PM
>> To: Vitaly Kuznetsov <vkuznets@redhat.com>; Simon Xiao
>> <sixiao@microsoft.com>; Eric Dumazet <eric.dumazet@gmail.com>
>> Cc: Tom Herbert <tom@herbertland.com>; netdev@vger.kernel.org; KY
>> Srinivasan <kys@microsoft.com>; Haiyang Zhang <haiyangz@microsoft.com>;
>> devel@linuxdriverproject.org; linux-kernel@vger.kernel.org; David Miller
>> <davem@davemloft.net>
>> Subject: Re: [PATCH net-next] hv_netvsc: don't make assumptions on struct
>> flow_keys layout
>>
>> On 16-01-07 05:28 AM, Vitaly Kuznetsov wrote:
>>> Eric Dumazet <eric.dumazet@gmail.com> writes:
>>>
>>>> On Thu, 2016-01-07 at 10:33 +0100, Vitaly Kuznetsov wrote:
>>>>> Recent changes to 'struct flow_keys' (e.g. commit d34af823ff40 ("net: Add
>>>>>  VLAN ID to flow_keys")) introduced a performance regression in the netvsc
>>>>> driver. The problem is, however, not the above-mentioned commit but the
>>>>> fact that the netvsc_set_hash() function made assumptions about the struct
>>>>> flow_keys data layout, and this is wrong. We need to extract the data we
>>>>> need (src/dst addresses and ports) after the dissect.
>>>>>
>>>>> The issue could also be solved in a completely different way: as suggested
>>>>> by Eric, instead of our own homegrown netvsc_set_hash() we could use
>>>>> skb_get_hash(), which does more or less the same. Unfortunately, the
>>>>> testing done by Simon showed that Hyper-V hosts are not happy with our
>>>>> Jenkins hash; selecting the output queue with the current algorithm based
>>>>> on Toeplitz hash works significantly better.
>>>>
>>
>> Also, can I ask a maybe-naive question? It looks like the hypervisor
>> is populating some table via a mailbox msg and this is used to select
>> the queues, I guess with some sort of weighting function?
>>
>> What happens if you just remove select_queue altogether? Or maybe just:
>> what is this 16-entry table doing? How does this work on my larger
>> systems with 64+ cores? Can I only use 16 cores? Sorry, I really have
>> no experience with Hyper-V and this got me curious.
> 
> We will limit the number of VRSS channels to the number of CPUs in
> a NUMA node. If the number of CPUs in a NUMA node exceeds 8, we
> will only open up 8 VRSS channels. On the host side currently traffic
> spreading is done in software and we have found that limiting to 8 CPUs
> gives us the best throughput. In Windows Server 2016, we will be 
> distributing traffic on the host in hardware; the heuristics in the guest
> may change.
> 
> Regards,
> 
> K. Y

I think a better way to do this would be to query the numa node when
the interface comes online via dev_to_node() and then use cpu_to_node()
or create/find some better variant to get a list of cpus on the numa
node.

At this point you can use the xps mapping interface
netif_set_xps_queue() to get the right queue to cpu binding. If you want
to cap it to max 8 queues that works as well. I don't think there is
any value to have more tx queues than number of cpus in use.

If you do it this way all the normal mechanisms to setup queue mappings
will work for users who are doing some special configuration and the
default will still be what you want.
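
A minimal sketch of that XPS setup, assuming the generic dev_to_node(),
cpumask_of_node() and netif_set_xps_queue() APIs (hypothetical helper, not
actual netvsc code):

static void sketch_set_xps_by_node(struct net_device *ndev)
{
	int node = dev_to_node(&ndev->dev);
	const struct cpumask *node_cpus;
	cpumask_var_t mask;
	u16 queue = 0;
	int cpu;

	/* CPUs on the device's NUMA node, or all online CPUs if unknown */
	node_cpus = (node == NUMA_NO_NODE) ? cpu_online_mask :
		cpumask_of_node(node);

	if (!alloc_cpumask_var(&mask, GFP_KERNEL))
		return;

	/* one NUMA-local CPU per TX queue; unmapped queues keep the default */
	for_each_cpu(cpu, node_cpus) {
		if (queue >= ndev->real_num_tx_queues)
			break;
		cpumask_clear(mask);
		cpumask_set_cpu(cpu, mask);
		netif_set_xps_queue(ndev, mask, queue++);
	}

	free_cpumask_var(mask);
}

Users could still override such a default per queue through the usual
/sys/class/net/<dev>/queues/tx-*/xps_cpus files.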

I guess I should go do this numa mapping for ixgbe and friends now that
I mention it. Last perf numbers I had showed cross numa affinitizing
was pretty bad.

Thanks,
John
KY Srinivasan Jan. 8, 2016, 6:01 p.m. UTC | #4
> -----Original Message-----
> From: John Fastabend [mailto:john.fastabend@gmail.com]
> Sent: Thursday, January 7, 2016 10:17 PM
> To: KY Srinivasan <kys@microsoft.com>; Vitaly Kuznetsov
> <vkuznets@redhat.com>; Simon Xiao <sixiao@microsoft.com>; Eric Dumazet
> <eric.dumazet@gmail.com>
> Cc: Tom Herbert <tom@herbertland.com>; netdev@vger.kernel.org;
> Haiyang Zhang <haiyangz@microsoft.com>; devel@linuxdriverproject.org;
> linux-kernel@vger.kernel.org; David Miller <davem@davemloft.net>
> Subject: Re: [PATCH net-next] hv_netvsc: don't make assumptions on struct
> flow_keys layout
> 
> On 16-01-07 07:49 PM, KY Srinivasan wrote:
> >
> >
> >> -----Original Message-----
> >> From: John Fastabend [mailto:john.fastabend@gmail.com]
> >> Sent: Thursday, January 7, 2016 5:02 PM
> >> To: Vitaly Kuznetsov <vkuznets@redhat.com>; Simon Xiao
> >> <sixiao@microsoft.com>; Eric Dumazet <eric.dumazet@gmail.com>
> >> Cc: Tom Herbert <tom@herbertland.com>; netdev@vger.kernel.org; KY
> >> Srinivasan <kys@microsoft.com>; Haiyang Zhang <haiyangz@microsoft.com>;
> >> devel@linuxdriverproject.org; linux-kernel@vger.kernel.org; David Miller
> >> <davem@davemloft.net>
> >> Subject: Re: [PATCH net-next] hv_netvsc: don't make assumptions on struct
> >> flow_keys layout
> >>
> >> On 16-01-07 05:28 AM, Vitaly Kuznetsov wrote:
> >>> Eric Dumazet <eric.dumazet@gmail.com> writes:
> >>>
> >>>> On Thu, 2016-01-07 at 10:33 +0100, Vitaly Kuznetsov wrote:
> >>>>> Recent changes to 'struct flow_keys' (e.g. commit d34af823ff40 ("net: Add
> >>>>>  VLAN ID to flow_keys")) introduced a performance regression in the netvsc
> >>>>> driver. The problem is, however, not the above-mentioned commit but the
> >>>>> fact that the netvsc_set_hash() function made assumptions about the struct
> >>>>> flow_keys data layout, and this is wrong. We need to extract the data we
> >>>>> need (src/dst addresses and ports) after the dissect.
> >>>>>
> >>>>> The issue could also be solved in a completely different way: as suggested
> >>>>> by Eric, instead of our own homegrown netvsc_set_hash() we could use
> >>>>> skb_get_hash(), which does more or less the same. Unfortunately, the
> >>>>> testing done by Simon showed that Hyper-V hosts are not happy with our
> >>>>> Jenkins hash; selecting the output queue with the current algorithm based
> >>>>> on Toeplitz hash works significantly better.
> >>>>
> >>
> >> Also, can I ask a maybe-naive question? It looks like the hypervisor
> >> is populating some table via a mailbox msg and this is used to select
> >> the queues, I guess with some sort of weighting function?
> >>
> >> What happens if you just remove select_queue altogether? Or maybe just:
> >> what is this 16-entry table doing? How does this work on my larger
> >> systems with 64+ cores? Can I only use 16 cores? Sorry, I really have
> >> no experience with Hyper-V and this got me curious.
> >
> > We will limit the number of VRSS channels to the number of CPUs in
> > a NUMA node. If the number of CPUs in a NUMA node exceeds 8, we
> > will only open up 8 VRSS channels. On the host side currently traffic
> > spreading is done in software and we have found that limiting to 8 CPUs
> > gives us the best throughput. In Windows Server 2016, we will be
> > distributing traffic on the host in hardware; the heuristics in the guest
> > may change.
> >
> > Regards,
> >
> > K. Y
> 
> I think a better way to do this would be to query the numa node when
> the interface comes online via dev_to_node() and then use cpu_to_node()
> or create/find some better variant to get a list of cpus on the numa
> node.
> 
> At this point you can use the xps mapping interface
> netif_set_xps_queue() to get the right queue to cpu binding. If you want
> to cap it to max 8 queues that works as well. I don't think there is
> any value to have more tx queues than number of cpus in use.
> 
> If you do it this way all the normal mechanisms to setup queue mappings
> will work for users who are doing some special configuration and the
> default will still be what you want.
> 
> I guess I should go do this numa mapping for ixgbe and friends now that
> I mention it. Last perf numbers I had showed cross numa affinitizing
> was pretty bad.
> 
> Thanks,
> John
John,

I am a little confused. In the guest, we need to first open the sub-channels (VRSS queues) based on what the host is offering. While we cannot open more sub-channels than what the host is offering, the guest can certainly open fewer sub-channels. I was describing the heuristics for how many sub-channels the guest currently opens. This is based on the NUMA topology presented to the guest and the number of VCPUs provisioned for the guest. The binding of VCPUs to the channels occurs at the point of opening these channels.

Regards,

K. Y
Haiyang Zhang Jan. 8, 2016, 9:07 p.m. UTC | #5
> -----Original Message-----
> From: Vitaly Kuznetsov [mailto:vkuznets@redhat.com]
> Sent: Thursday, January 7, 2016 8:28 AM
> To: Simon Xiao <sixiao@microsoft.com>; Eric Dumazet
> <eric.dumazet@gmail.com>
> Cc: Tom Herbert <tom@herbertland.com>; netdev@vger.kernel.org; KY
> Srinivasan <kys@microsoft.com>; Haiyang Zhang <haiyangz@microsoft.com>;
> devel@linuxdriverproject.org; linux-kernel@vger.kernel.org; David Miller
> <davem@davemloft.net>
> Subject: Re: [PATCH net-next] hv_netvsc: don't make assumptions on
> struct flow_keys layout
> 
> Eric Dumazet <eric.dumazet@gmail.com> writes:
> 
> > On Thu, 2016-01-07 at 10:33 +0100, Vitaly Kuznetsov wrote:
> >> Recent changes to 'struct flow_keys' (e.g. commit d34af823ff40 ("net: Add
> >>  VLAN ID to flow_keys")) introduced a performance regression in the netvsc
> >> driver. The problem is, however, not the above-mentioned commit but the
> >> fact that the netvsc_set_hash() function made assumptions about the struct
> >> flow_keys data layout, and this is wrong. We need to extract the data we
> >> need (src/dst addresses and ports) after the dissect.
> >>
> >> The issue could also be solved in a completely different way: as suggested
> >> by Eric, instead of our own homegrown netvsc_set_hash() we could use
> >> skb_get_hash(), which does more or less the same. Unfortunately, the
> >> testing done by Simon showed that Hyper-V hosts are not happy with our
> >> Jenkins hash; selecting the output queue with the current algorithm based
> >> on Toeplitz hash works significantly better.
> >
> > Were tests done on IPv6 traffic ?
> >
> 
> Simon, could you please test this patch for IPv6 and show us the numbers?
> 
> > Toeplitz hash takes at least 100 ns to hash 12 bytes (one iteration per
> > bit : 96 iterations)
> >
> > For IPv6 it is 3 times this, since we have to hash 36 bytes.
> >
> > I do not see how it can compete with skb_get_hash() that directly gives
> > skb->hash for local TCP flows.
> >
> 
> My guess is that this is not the bottleneck; something is happening
> behind the scenes with our packets in the Hyper-V host (e.g. re-distributing
> them to hardware queues?), but I don't know the internals. Microsoft
> folks could probably comment.

The Hyper-V vRSS protocol lets us use the Toeplitz hash algorithm. We are
currently running further tests, including IPv6 too, and will share the 
results when available.

Thanks,
- Haiyang

Patch

From 0040e79c1303bd225ddbbce679ea944ea11ad0bd Mon Sep 17 00:00:00 2001
From: Vitaly Kuznetsov <vkuznets@redhat.com>
Date: Wed, 6 Jan 2016 12:14:10 +0100
Subject: [PATCH] hv_netvsc: use skb_get_hash() instead of a homegrown
 implementation

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 drivers/net/hyperv/netvsc_drv.c | 67 ++---------------------------------------
 1 file changed, 3 insertions(+), 64 deletions(-)

diff --git a/drivers/net/hyperv/netvsc_drv.c b/drivers/net/hyperv/netvsc_drv.c
index 409b48e..038bf4f 100644
--- a/drivers/net/hyperv/netvsc_drv.c
+++ b/drivers/net/hyperv/netvsc_drv.c
@@ -195,65 +195,6 @@  static void *init_ppi_data(struct rndis_message *msg, u32 ppi_size,
 	return ppi;
 }
 
-union sub_key {
-	u64 k;
-	struct {
-		u8 pad[3];
-		u8 kb;
-		u32 ka;
-	};
-};
-
-/* Toeplitz hash function
- * data: network byte order
- * return: host byte order
- */
-static u32 comp_hash(u8 *key, int klen, void *data, int dlen)
-{
-	union sub_key subk;
-	int k_next = 4;
-	u8 dt;
-	int i, j;
-	u32 ret = 0;
-
-	subk.k = 0;
-	subk.ka = ntohl(*(u32 *)key);
-
-	for (i = 0; i < dlen; i++) {
-		subk.kb = key[k_next];
-		k_next = (k_next + 1) % klen;
-		dt = ((u8 *)data)[i];
-		for (j = 0; j < 8; j++) {
-			if (dt & 0x80)
-				ret ^= subk.ka;
-			dt <<= 1;
-			subk.k <<= 1;
-		}
-	}
-
-	return ret;
-}
-
-static bool netvsc_set_hash(u32 *hash, struct sk_buff *skb)
-{
-	struct flow_keys flow;
-	int data_len;
-
-	if (!skb_flow_dissect_flow_keys(skb, &flow, 0) ||
-	    !(flow.basic.n_proto == htons(ETH_P_IP) ||
-	      flow.basic.n_proto == htons(ETH_P_IPV6)))
-		return false;
-
-	if (flow.basic.ip_proto == IPPROTO_TCP)
-		data_len = 12;
-	else
-		data_len = 8;
-
-	*hash = comp_hash(netvsc_hash_key, HASH_KEYLEN, &flow, data_len);
-
-	return true;
-}
-
 static u16 netvsc_select_queue(struct net_device *ndev, struct sk_buff *skb,
 			void *accel_priv, select_queue_fallback_t fallback)
 {
@@ -266,11 +207,9 @@  static u16 netvsc_select_queue(struct net_device *ndev, struct sk_buff *skb,
 	if (nvsc_dev == NULL || ndev->real_num_tx_queues <= 1)
 		return 0;
 
-	if (netvsc_set_hash(&hash, skb)) {
-		q_idx = nvsc_dev->send_table[hash % VRSS_SEND_TAB_SIZE] %
-			ndev->real_num_tx_queues;
-		skb_set_hash(skb, hash, PKT_HASH_TYPE_L3);
-	}
+	hash = skb_get_hash(skb);
+	q_idx = nvsc_dev->send_table[hash % VRSS_SEND_TAB_SIZE] %
+		ndev->real_num_tx_queues;
 
 	return q_idx;
 }
-- 
2.4.3