[net-next,2/3] net: Add SRIOV VGT+ support

Message ID	1572468274-30748-3-git-send-email-lariel@mellanox.com
State	Superseded
Delegated to:	David Miller
Series	VGT+ support: [net-next,0/3] VGT+ support; [net-next,1/3] net: Support querying specific VF properties; [net-next,2/3] net: Add SRIOV VGT+ support; [net-next,3/3] net/mlx5: Add SRIOV VGT+ support

Headers

From: Ariel Levkovich <lariel@mellanox.com>
To: "netdev@vger.kernel.org" <netdev@vger.kernel.org>
CC: Saeed Mahameed <saeedm@mellanox.com>,
	"sd@queasysnail.net" <sd@queasysnail.net>,
	"sbrivio@redhat.com" <sbrivio@redhat.com>,
	"nikolay@cumulusnetworks.com" <nikolay@cumulusnetworks.com>,
	Jiri Pirko <jiri@mellanox.com>, "dsahern@gmail.com" <dsahern@gmail.com>,
	Ariel Levkovich <lariel@mellanox.com>
Subject: [PATCH net-next 2/3] net: Add SRIOV VGT+ support
Thread-Topic: [PATCH net-next 2/3] net: Add SRIOV VGT+ support
Thread-Index: AQHVj2Lc0LGQTyCfw0CZhjRV2YLakg==
Date: Wed, 30 Oct 2019 20:44:44 +0000
Message-ID: <1572468274-30748-3-git-send-email-lariel@mellanox.com>
References: <1572468274-30748-1-git-send-email-lariel@mellanox.com>
In-Reply-To: <1572468274-30748-1-git-send-email-lariel@mellanox.com>
Accept-Language: en-US
Content-Language: en-US
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
Sender: netdev-owner@vger.kernel.org
Precedence: bulk
List-ID: <netdev.vger.kernel.org>
X-Mailing-List: netdev@vger.kernel.org

Commit Message

Ariel Levkovich Oct. 30, 2019, 8:44 p.m. UTC

VGT+ is a security feature that lets the administrator control the
list of vlan-ids a VF is allowed to transmit and receive.
The allowed vlan-ids list is called the "trunk".
The admin can add or remove a range of allowed vlan-ids with the ip tool.
Example:
After the following sequence of commands:
1) ip link set eth3 vf 0 trunk add 10 100 (allow vlan-id 10-100, default tpid 0x8100)
2) ip link set eth3 vf 0 trunk add 105 proto 802.1q (allow vlan-id 105 tpid 0x8100)
3) ip link set eth3 vf 0 trunk add 105 proto 802.1ad (allow vlan-id 105 tpid 0x88a8)
4) ip link set eth3 vf 0 trunk rem 90 (block vlan-id 90)
5) ip link set eth3 vf 0 trunk rem 50 60 (block vlan-ids 50-60)

VF 0 can only communicate on vlan-ids 10-49, 61-89, 91-100 and 105 with
tpid 0x8100, and on vlan-id 105 with tpid 0x88a8.

For this purpose we added the following netlink sr-iov commands:

1) IFLA_VF_VLAN_RANGE: used to add/remove allowed vlan-ids range.
We added the ifla_vf_vlan_range struct to specify the range we want to
add/remove from the userspace.
We added ndo_add_vf_vlan_trunk_range and ndo_del_vf_vlan_trunk_range
netdev ops to add/remove allowed vlan-ids range in the netdev.

2) IFLA_VF_VLAN_TRUNK: used to query the allowed vlan-ids trunk.
We added trunk bitmap to the ifla_vf_info struct to get the current
allowed vlan-ids trunk from the netdev.
We added ifla_vf_vlan_trunk struct for sending the allowed vlan-ids
trunk to the userspace.
Since the trunk bitmap needs one bit per possible vlan id, the size
added to ifla_vf_info is significant and may overrun the attribute
length when querying all the VFs.

Therefore, the full bitmap is returned only when the admin queries a
specific VF. For the VF list query we introduce a new vf_info
attribute, ifla_vf_vlan_mode, that reports the current VF tagging
mode: VGT, VST or VGT+ (trunk).

Signed-off-by: Ariel Levkovich <lariel@mellanox.com>
---
 include/linux/if_link.h      |   3 ++
 include/linux/netdevice.h    |  12 +++++
 include/uapi/linux/if_link.h |  34 ++++++++++++
 net/core/rtnetlink.c         | 122 ++++++++++++++++++++++++++++++++-----------
 4 files changed, 140 insertions(+), 31 deletions(-)

Comments

Jakub Kicinski Oct. 30, 2019, 9:34 p.m. UTC | #1

On Wed, 30 Oct 2019 20:44:44 +0000, Ariel Levkovich wrote:

> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
> index 3207e0b..da79976 100644
> --- a/include/linux/netdevice.h
> +++ b/include/linux/netdevice.h
> @@ -1067,6 +1067,10 @@ struct netdev_name_node {
>   *      Hash Key. This is needed since on some devices VF share this information
>   *      with PF and querying it may introduce a theoretical security risk.
>   * int (*ndo_set_vf_rss_query_en)(struct net_device *dev, int vf, bool setting);
> + * int (*ndo_add_vf_vlan_trunk_range)(struct net_device *dev, int vf,
> + *				      u16 start_vid, u16 end_vid, __be16 proto);
> + * int (*ndo_del_vf_vlan_trunk_range)(struct net_device *dev, int vf,
> + *				      u16 start_vid, u16 end_vid, __be16 proto);
>   * int (*ndo_get_vf_port)(struct net_device *dev, int vf, struct sk_buff *skb);
>   * int (*ndo_setup_tc)(struct net_device *dev, enum tc_setup_type type,
>   *		       void *type_data);
> @@ -1332,6 +1336,14 @@ struct net_device_ops {
>  	int			(*ndo_set_vf_rss_query_en)(
>  						   struct net_device *dev,
>  						   int vf, bool setting);
> +	int			(*ndo_add_vf_vlan_trunk_range)(
> +						   struct net_device *dev,
> +						   int vf, u16 start_vid,
> +						   u16 end_vid, __be16 proto);
> +	int			(*ndo_del_vf_vlan_trunk_range)(
> +						   struct net_device *dev,
> +						   int vf, u16 start_vid,
> +						   u16 end_vid, __be16 proto);
>  	int			(*ndo_setup_tc)(struct net_device *dev,
>  						enum tc_setup_type type,
>  						void *type_data);

Is this an official Mellanox patch submission, or do you guys need time to
decide among yourselves whether you like legacy VF ndos or not? ;-)

Saeed Mahameed Oct. 30, 2019, 10:50 p.m. UTC | #2

On Wed, 2019-10-30 at 14:34 -0700, Jakub Kicinski wrote:
> On Wed, 30 Oct 2019 20:44:44 +0000, Ariel Levkovich wrote:
> 
> Is this an official Mellanox patch submission, or do you guys need time
> to decide among yourselves whether you like legacy VF ndos or not? ;-)

It is official :). As much as we want to move away from legacy mode, we
still have two major customers that are not quite ready to move to
switchdev mode. The silver lining here is that they are willing to
move to an upstream kernel (advanced distros), but we need this feature in
legacy mode.

The ability to configure vlan filters in the per-VF ACL tables is a must.

I tried to think of an API where we can expose the whole VF ACL tables
to users and let them configure them the way they want, maybe with TC
flower (a sort of hybrid legacy-switchdev mode that can act only on the
VF ACL tables but not on the FDB). The problem with this is that it can
easily conflict with VST/trust mode and other settings that can be done
via legacy VF ndos... so I guess the complexity of such an API is not
worth it, and a simple vlan list filter API is more natural for legacy
sriov?!

Jakub Kicinski Oct. 31, 2019, 12:59 a.m. UTC | #3

On Wed, 30 Oct 2019 22:50:06 +0000, Saeed Mahameed wrote:
> On Wed, 2019-10-30 at 14:34 -0700, Jakub Kicinski wrote:
> > On Wed, 30 Oct 2019 20:44:44 +0000, Ariel Levkovich wrote:  
> > 
> > Is this an official Mellanox patch submission, or do you guys need
> > time to decide among yourselves whether you like legacy VF ndos or
> > not? ;-)
> 
> It is official :). As much as we want to move away from legacy mode, we
> still have two major customers that are not quite ready to move to
> switchdev mode. The silver lining here is that they are willing to
> move to an upstream kernel (advanced distros), but we need this feature in
> legacy mode.

So they are willing to update the kernel, just not willing to move the
orchestration to the new way of doing things? Sounds familiar :(

> The ability to configure vlan filters in the per-VF ACL tables is a must.
> 
> I tried to think of an API where we can expose the whole VF ACL tables
> to users and let them configure them the way they want, maybe with TC
> flower (a sort of hybrid legacy-switchdev mode that can act only on the
> VF ACL tables but not on the FDB). The problem with this is that it can
> easily conflict with VST/trust mode and other settings that can be done
> via legacy VF ndos... so I guess the complexity of such an API is not
> worth it, and a simple vlan list filter API is more natural for legacy
> sriov?!

The "we don't want any more legacy VF ndos" policy which I think we
wanted to follow is much easier to stick to than "we don't want any
more legacy VF ndos, unless..".

There's nothing here that can't be done in switchdev mode (perhaps
bridge offload would actually be more suitable than just flower),
and the uAPI extension is not an insignificant one.

I don't think we should be growing both legacy and switchdev APIs, at
some point we got to pick one. The switchdev extension to set hwaddr
for which patches were posted recently had been implemented through
legacy API a while ago (by Chelsio IIRC) so it's not that we're looking
towards switchdev where legacy API is impossible to extend. It's purely
a policy decision to pick one and deprecate the other.

Only if we freeze the legacy API completely will the orchestration
layers have an incentive to support switchdev. And we can save the few
hundred lines of code per feature in every driver..

diff --git a/include/linux/if_link.h b/include/linux/if_link.h
index 622658d..7146181 100644
--- a/include/linux/if_link.h
+++ b/include/linux/if_link.h
@@ -28,6 +28,9 @@  struct ifla_vf_info {
 	__u32 max_tx_rate;
 	__u32 rss_query_en;
 	__u32 trusted;
+	__u32 vlan_mode;
+	__u64 trunk_8021q[VF_VLAN_BITMAP];
+	__u64 trunk_8021ad[VF_VLAN_BITMAP];
 	__be16 vlan_proto;
 };
 #endif /* _LINUX_IF_LINK_H */
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 3207e0b..da79976 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -1067,6 +1067,10 @@  struct netdev_name_node {
  *      Hash Key. This is needed since on some devices VF share this information
  *      with PF and querying it may introduce a theoretical security risk.
  * int (*ndo_set_vf_rss_query_en)(struct net_device *dev, int vf, bool setting);
+ * int (*ndo_add_vf_vlan_trunk_range)(struct net_device *dev, int vf,
+ *				      u16 start_vid, u16 end_vid, __be16 proto);
+ * int (*ndo_del_vf_vlan_trunk_range)(struct net_device *dev, int vf,
+ *				      u16 start_vid, u16 end_vid, __be16 proto);
  * int (*ndo_get_vf_port)(struct net_device *dev, int vf, struct sk_buff *skb);
  * int (*ndo_setup_tc)(struct net_device *dev, enum tc_setup_type type,
  *		       void *type_data);
@@ -1332,6 +1336,14 @@  struct net_device_ops {
 	int			(*ndo_set_vf_rss_query_en)(
 						   struct net_device *dev,
 						   int vf, bool setting);
+	int			(*ndo_add_vf_vlan_trunk_range)(
+						   struct net_device *dev,
+						   int vf, u16 start_vid,
+						   u16 end_vid, __be16 proto);
+	int			(*ndo_del_vf_vlan_trunk_range)(
+						   struct net_device *dev,
+						   int vf, u16 start_vid,
+						   u16 end_vid, __be16 proto);
 	int			(*ndo_setup_tc)(struct net_device *dev,
 						enum tc_setup_type type,
 						void *type_data);
diff --git a/include/uapi/linux/if_link.h b/include/uapi/linux/if_link.h
index 797e214..35ab210 100644
--- a/include/uapi/linux/if_link.h
+++ b/include/uapi/linux/if_link.h
@@ -180,6 +180,8 @@  enum {
 #ifndef __KERNEL__
 #define IFLA_RTA(r)  ((struct rtattr*)(((char*)(r)) + NLMSG_ALIGN(sizeof(struct ifinfomsg))))
 #define IFLA_PAYLOAD(n) NLMSG_PAYLOAD(n,sizeof(struct ifinfomsg))
+#define BITS_PER_BYTE 8
+#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))
 #endif
 
 enum {
@@ -699,6 +701,9 @@  enum {
 	IFLA_VF_IB_PORT_GUID,	/* VF Infiniband port GUID */
 	IFLA_VF_VLAN_LIST,	/* nested list of vlans, option for QinQ */
 	IFLA_VF_BROADCAST,	/* VF broadcast */
+	IFLA_VF_VLAN_MODE,	/* vlan tagging mode */
+	IFLA_VF_VLAN_RANGE,	/* add/delete vlan range filtering */
+	IFLA_VF_VLAN_TRUNK,	/* vlan trunk filtering */
 	__IFLA_VF_MAX,
 };
 
@@ -713,6 +718,19 @@  struct ifla_vf_broadcast {
 	__u8 broadcast[32];
 };
 
+enum {
+	IFLA_VF_VLAN_MODE_UNSPEC,
+	IFLA_VF_VLAN_MODE_VGT,
+	IFLA_VF_VLAN_MODE_VST,
+	IFLA_VF_VLAN_MODE_TRUNK,
+	__IFLA_VF_VLAN_MODE_MAX,
+};
+
+struct ifla_vf_vlan_mode {
+	__u32 vf;
+	__u32 mode; /* The VLAN tagging mode */
+};
+
 struct ifla_vf_vlan {
 	__u32 vf;
 	__u32 vlan; /* 0 - 4095, 0 disables VLAN filter */
@@ -727,6 +745,7 @@  enum {
 
 #define IFLA_VF_VLAN_INFO_MAX (__IFLA_VF_VLAN_INFO_MAX - 1)
 #define MAX_VLAN_LIST_LEN 1
+#define VF_VLAN_N_VID 4096
 
 struct ifla_vf_vlan_info {
 	__u32 vf;
@@ -735,6 +754,21 @@  struct ifla_vf_vlan_info {
 	__be16 vlan_proto; /* VLAN protocol either 802.1Q or 802.1ad */
 };
 
+struct ifla_vf_vlan_range {
+	__u32 vf;
+	__u32 start_vid;   /* 1 - 4095 */
+	__u32 end_vid;     /* 1 - 4095 */
+	__u32 setting;
+	__be16 vlan_proto; /* VLAN protocol either 802.1Q or 802.1ad */
+};
+
+#define VF_VLAN_BITMAP	DIV_ROUND_UP(VF_VLAN_N_VID, sizeof(__u64) * BITS_PER_BYTE)
+struct ifla_vf_vlan_trunk {
+	__u32 vf;
+	__u64 allowed_vlans_8021q_bm[VF_VLAN_BITMAP];
+	__u64 allowed_vlans_8021ad_bm[VF_VLAN_BITMAP];
+};
+
 struct ifla_vf_tx_rate {
 	__u32 vf;
 	__u32 rate; /* Max TX bandwidth in Mbps, 0 disables throttling */
diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index 31fa0af..e273abb 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -911,8 +911,10 @@  static inline int rtnl_vfinfo_size(const struct net_device *dev,
 		int num_vfs = dev_num_vf(dev->dev.parent);
 		size_t size = nla_total_size(0);
 
-		if (num_vfs && (ext_filter_mask & RTEXT_FILTER_VF_EXT))
+		if (num_vfs && (ext_filter_mask & RTEXT_FILTER_VF_EXT)) {
 			num_vfs = 1;
+			size += nla_total_size(sizeof(struct ifla_vf_vlan_trunk));
+		}
 
 		size += num_vfs *
 			(nla_total_size(0) +
@@ -927,6 +929,7 @@  static inline int rtnl_vfinfo_size(const struct net_device *dev,
 			 nla_total_size(sizeof(struct ifla_vf_rate)) +
 			 nla_total_size(sizeof(struct ifla_vf_link_state)) +
 			 nla_total_size(sizeof(struct ifla_vf_rss_query_en)) +
+			 nla_total_size(sizeof(struct ifla_vf_vlan_mode)) +
 			 nla_total_size(0) + /* nest IFLA_VF_STATS */
 			 /* IFLA_VF_STATS_RX_PACKETS */
 			 nla_total_size_64bit(sizeof(__u64)) +
@@ -1216,7 +1219,9 @@  static noinline_for_stack int rtnl_fill_vfinfo(struct sk_buff *skb,
 	struct nlattr *vf, *vfstats, *vfvlanlist;
 	struct ifla_vf_link_state vf_linkstate;
 	struct ifla_vf_vlan_info vf_vlan_info;
+	struct ifla_vf_vlan_mode vf_vlan_mode;
 	struct ifla_vf_spoofchk vf_spoofchk;
+	struct ifla_vf_vlan_trunk *vf_trunk;
 	struct ifla_vf_tx_rate vf_tx_rate;
 	struct ifla_vf_stats vf_stats;
 	struct ifla_vf_trust vf_trust;
@@ -1224,25 +1229,36 @@  static noinline_for_stack int rtnl_fill_vfinfo(struct sk_buff *skb,
 	struct ifla_vf_rate vf_rate;
 	struct ifla_vf_mac vf_mac;
 	struct ifla_vf_broadcast vf_broadcast;
-	struct ifla_vf_info ivi;
+	struct ifla_vf_info *ivi;
+
+	ivi = kzalloc(sizeof(*ivi), GFP_KERNEL);
+	if (!ivi)
+		return -ENOMEM;
 
-	memset(&ivi, 0, sizeof(ivi));
+	vf_trunk = kzalloc(sizeof(*vf_trunk), GFP_KERNEL);
+	if (!vf_trunk) {
+		kfree(ivi);
+		return -ENOMEM;
+	}
 
 	/* Not all SR-IOV capable drivers support the
 	 * spoofcheck and "RSS query enable" query.  Preset to
 	 * -1 so the user space tool can detect that the driver
 	 * didn't report anything.
 	 */
-	ivi.spoofchk = -1;
-	ivi.rss_query_en = -1;
-	ivi.trusted = -1;
+	ivi->spoofchk = -1;
+	ivi->rss_query_en = -1;
+	ivi->trusted = -1;
+	memset(ivi->mac, 0, sizeof(ivi->mac));
+	memset(ivi->trunk_8021q, 0, sizeof(ivi->trunk_8021q));
+	memset(ivi->trunk_8021ad, 0, sizeof(ivi->trunk_8021ad));
 	/* The default value for VF link state is "auto"
 	 * IFLA_VF_LINK_STATE_AUTO which equals zero
 	 */
-	ivi.linkstate = 0;
+	ivi->linkstate = 0;
 	/* VLAN Protocol by default is 802.1Q */
-	ivi.vlan_proto = htons(ETH_P_8021Q);
-	if (dev->netdev_ops->ndo_get_vf_config(dev, vfs_num, &ivi))
+	ivi->vlan_proto = htons(ETH_P_8021Q);
+	if (dev->netdev_ops->ndo_get_vf_config(dev, vfs_num, ivi))
 		return 0;
 
 	memset(&vf_vlan_info, 0, sizeof(vf_vlan_info));
@@ -1255,22 +1271,26 @@  static noinline_for_stack int rtnl_fill_vfinfo(struct sk_buff *skb,
 		vf_spoofchk.vf =
 		vf_linkstate.vf =
 		vf_rss_query_en.vf =
-		vf_trust.vf = ivi.vf;
-
-	memcpy(vf_mac.mac, ivi.mac, sizeof(ivi.mac));
-	memcpy(vf_broadcast.broadcast, dev->broadcast, dev->addr_len);
-	vf_vlan.vlan = ivi.vlan;
-	vf_vlan.qos = ivi.qos;
-	vf_vlan_info.vlan = ivi.vlan;
-	vf_vlan_info.qos = ivi.qos;
-	vf_vlan_info.vlan_proto = ivi.vlan_proto;
-	vf_tx_rate.rate = ivi.max_tx_rate;
-	vf_rate.min_tx_rate = ivi.min_tx_rate;
-	vf_rate.max_tx_rate = ivi.max_tx_rate;
-	vf_spoofchk.setting = ivi.spoofchk;
-	vf_linkstate.link_state = ivi.linkstate;
-	vf_rss_query_en.setting = ivi.rss_query_en;
-	vf_trust.setting = ivi.trusted;
+		vf_vlan_mode.vf =
+		vf_trunk->vf =
+		vf_trust.vf = ivi->vf;
+
+	memcpy(vf_mac.mac, ivi->mac, sizeof(ivi->mac));
+	memcpy(vf_trunk->allowed_vlans_8021q_bm, ivi->trunk_8021q, sizeof(ivi->trunk_8021q));
+	memcpy(vf_trunk->allowed_vlans_8021ad_bm, ivi->trunk_8021ad, sizeof(ivi->trunk_8021ad));
+	vf_vlan_mode.mode = ivi->vlan_mode;
+	vf_vlan.vlan = ivi->vlan;
+	vf_vlan.qos = ivi->qos;
+	vf_vlan_info.vlan = ivi->vlan;
+	vf_vlan_info.qos = ivi->qos;
+	vf_vlan_info.vlan_proto = ivi->vlan_proto;
+	vf_tx_rate.rate = ivi->max_tx_rate;
+	vf_rate.min_tx_rate = ivi->min_tx_rate;
+	vf_rate.max_tx_rate = ivi->max_tx_rate;
+	vf_spoofchk.setting = ivi->spoofchk;
+	vf_linkstate.link_state = ivi->linkstate;
+	vf_rss_query_en.setting = ivi->rss_query_en;
+	vf_trust.setting = ivi->trusted;
 	vf = nla_nest_start_noflag(skb, IFLA_VF_INFO);
 	if (!vf)
 		goto nla_put_vfinfo_failure;
@@ -1289,7 +1309,11 @@  static noinline_for_stack int rtnl_fill_vfinfo(struct sk_buff *skb,
 		    sizeof(vf_rss_query_en),
 		    &vf_rss_query_en) ||
 	    nla_put(skb, IFLA_VF_TRUST,
-		    sizeof(vf_trust), &vf_trust))
+		    sizeof(vf_trust), &vf_trust) ||
+	    nla_put(skb, IFLA_VF_VLAN_MODE,
+		    sizeof(vf_vlan_mode), &vf_vlan_mode) ||
+	    (vf_ext && nla_put(skb, IFLA_VF_VLAN_TRUNK,
+			       sizeof(*vf_trunk), vf_trunk)))
 		goto nla_put_vf_failure;
 	vfvlanlist = nla_nest_start_noflag(skb, IFLA_VF_VLAN_LIST);
 	if (!vfvlanlist)
@@ -1328,12 +1352,16 @@  static noinline_for_stack int rtnl_fill_vfinfo(struct sk_buff *skb,
 	}
 	nla_nest_end(skb, vfstats);
 	nla_nest_end(skb, vf);
+	kfree(vf_trunk);
+	kfree(ivi);
 	return 0;
 
 nla_put_vf_failure:
 	nla_nest_cancel(skb, vf);
 nla_put_vfinfo_failure:
 	nla_nest_cancel(skb, vfinfo);
+	kfree(vf_trunk);
+	kfree(ivi);
 	return -EMSGSIZE;
 }
 
@@ -1843,6 +1871,9 @@  static int rtnl_fill_ifinfo(struct sk_buff *skb,
 	[IFLA_VF_TRUST]		= { .len = sizeof(struct ifla_vf_trust) },
 	[IFLA_VF_IB_NODE_GUID]	= { .len = sizeof(struct ifla_vf_guid) },
 	[IFLA_VF_IB_PORT_GUID]	= { .len = sizeof(struct ifla_vf_guid) },
+	[IFLA_VF_VLAN_MODE]	= { .len = sizeof(struct ifla_vf_vlan_mode) },
+	[IFLA_VF_VLAN_RANGE]	= { .len = sizeof(struct ifla_vf_vlan_range) },
+	[IFLA_VF_VLAN_TRUNK]	= { .len = sizeof(struct ifla_vf_vlan_trunk) },
 };
 
 static const struct nla_policy ifla_port_policy[IFLA_PORT_MAX+1] = {
@@ -2285,6 +2316,26 @@  static int do_setvfinfo(struct net_device *dev, struct nlattr **tb)
 			return err;
 	}
 
+	if (tb[IFLA_VF_VLAN_RANGE]) {
+		struct ifla_vf_vlan_range *ivvr =
+					nla_data(tb[IFLA_VF_VLAN_RANGE]);
+		bool add = !!ivvr->setting;
+
+		err = -EOPNOTSUPP;
+		if (add && ops->ndo_add_vf_vlan_trunk_range)
+			err = ops->ndo_add_vf_vlan_trunk_range(dev, ivvr->vf,
+							       ivvr->start_vid,
+							       ivvr->end_vid,
+							       ivvr->vlan_proto);
+		else if (!add && ops->ndo_del_vf_vlan_trunk_range)
+			err = ops->ndo_del_vf_vlan_trunk_range(dev, ivvr->vf,
+							       ivvr->start_vid,
+							       ivvr->end_vid,
+							       ivvr->vlan_proto);
+		if (err < 0)
+			return err;
+	}
+
 	if (tb[IFLA_VF_VLAN_LIST]) {
 		struct ifla_vf_vlan_info *ivvl[MAX_VLAN_LIST_LEN];
 		struct nlattr *attr;
@@ -2316,21 +2367,30 @@  static int do_setvfinfo(struct net_device *dev, struct nlattr **tb)
 
 	if (tb[IFLA_VF_TX_RATE]) {
 		struct ifla_vf_tx_rate *ivt = nla_data(tb[IFLA_VF_TX_RATE]);
-		struct ifla_vf_info ivf;
+		struct ifla_vf_info *ivf;
+
+		ivf = kzalloc(sizeof(*ivf), GFP_KERNEL);
+		if (!ivf)
+			return -ENOMEM;
 
 		err = -EOPNOTSUPP;
 		if (ops->ndo_get_vf_config)
-			err = ops->ndo_get_vf_config(dev, ivt->vf, &ivf);
-		if (err < 0)
+			err = ops->ndo_get_vf_config(dev, ivt->vf, ivf);
+		if (err < 0) {
+			kfree(ivf);
 			return err;
+		}
 
 		err = -EOPNOTSUPP;
 		if (ops->ndo_set_vf_rate)
 			err = ops->ndo_set_vf_rate(dev, ivt->vf,
-						   ivf.min_tx_rate,
+						   ivf->min_tx_rate,
 						   ivt->rate);
-		if (err < 0)
+		if (err < 0) {
+			kfree(ivf);
 			return err;
+		}
+		kfree(ivf);
 	}
 
 	if (tb[IFLA_VF_RATE]) {
