Message ID | alpine.LFD.2.21.1905031603340.11823@ehc-opti7040.uk.solarflarecom.com |
---|---|
State | RFC |
Delegated to: | David Miller |
Headers | show |
Series | flow_offload: Re-add various features that disappeared | expand |
On Fri, 3 May 2019 16:06:55 +0100, Edward Cree wrote: > Introduce a new offload command TC_CLSFLOWER_STATS_BYINDEX, similar to > the existing TC_CLSFLOWER_STATS but specifying an action_index (the > tcfa_index of the action), which is called for each stats-having action > on the rule. Drivers should implement either, but not both, of these > commands. > > Signed-off-by: Edward Cree <ecree@solarflare.com> > --- > include/net/pkt_cls.h | 2 ++ > net/sched/cls_flower.c | 30 ++++++++++++++++++++++++++++++ > 2 files changed, 32 insertions(+) It feels a little strange to me to call the new stats updates from cls_flower, if we really want to support action sharing correctly. Can RTM_GETACTION not be used to dump actions without dumping the classifiers? If we dump from the classifiers wouldn't that lead to stale stats being returned?
On 2019-05-04 2:27 a.m., Jakub Kicinski wrote: > On Fri, 3 May 2019 16:06:55 +0100, Edward Cree wrote: >> Introduce a new offload command TC_CLSFLOWER_STATS_BYINDEX, similar to >> the existing TC_CLSFLOWER_STATS but specifying an action_index (the >> tcfa_index of the action), which is called for each stats-having action >> on the rule. Drivers should implement either, but not both, of these >> commands. [..] > > It feels a little strange to me to call the new stats updates from > cls_flower, if we really want to support action sharing correctly. > > Can RTM_GETACTION not be used to dump actions without dumping the > classifiers? If we dump from the classifiers wouldn't that lead to > stale stats being returned? Not sure about the staleness factor, but: For efficiency reasons we certainly need the RTM_GETACTION approach (as you stated above we dont need to dump all that classifier info if all we want are stats). This becomes a big deal if you have a lot of stats/rules. But we also need to support the reference from the classifier rules, if for no other reason, to support tc semantics. If we are going to support RTM_GETACTION, then it is important to note one caveat: tc uses the index as an identifier for the action; meaning attributes + stats. You specify the index when i want to change an attribute or optionally when you create an action, or do a get attributes and stats. Example, to change an existing skbedit to set attribute skbmark of 10 instead of whatever existing value was there: tc actions replace action skbedit mark 10 index 2 and to get the attributes + stats for a specific action instance: tc actions get action skbedit index 2 or dump for specific action type as such: tc actions ls action police Most H/W i have seen has a global indexed stats table which is shared by different action types (droppers, accept, mirror etc). The specific actions may also have their own tables which also then refer to the 32 bit index used in the stats table[1]. So for this to work well, the action will need at minimal to have two indices one that is used in hardware stats table and another that is kernel mapped to identify the attributes. Of course we'll need to have a skip_sw flag etc. cheers, jamal [1] except for the policers/meters which tend have their own state tables and often counters.
On 06/05/2019 13:41, Jamal Hadi Salim wrote: > On 2019-05-04 2:27 a.m., Jakub Kicinski wrote: >> On Fri, 3 May 2019 16:06:55 +0100, Edward Cree wrote: >>> Introduce a new offload command TC_CLSFLOWER_STATS_BYINDEX, similar to >>> the existing TC_CLSFLOWER_STATS but specifying an action_index (the >>> tcfa_index of the action), which is called for each stats-having action >>> on the rule. Drivers should implement either, but not both, of these >>> commands. > > [..] >> >> It feels a little strange to me to call the new stats updates from >> cls_flower, if we really want to support action sharing correctly. >> >> Can RTM_GETACTION not be used to dump actions without dumping the >> classifiers? If we dump from the classifiers wouldn't that lead to >> stale stats being returned? > > Not sure about the staleness factor, but: > For efficiency reasons we certainly need the RTM_GETACTION approach > (as you stated above we dont need to dump all that classifier info if > all we want are stats). This becomes a big deal if you have a lot > of stats/rules. I don't know much of anything about RTM_GETACTION, but it doesn't appear to be part of the current "tc offload" world, which AIUI is very much centred around cls_flower. I'm just trying to make counters in cls_flower offload do 'the right thing' (whatever that may be), anything else is out of scope. > Most H/W i have seen has a global indexed stats table which is > shared by different action types (droppers, accept, mirror etc). > The specific actions may also have their own tables which also > then refer to the 32 bit index used in the stats table[1]. > So for this to work well, the action will need at minimal to have > two indices one that is used in hardware stats table > and another that is kernel mapped to identify the attributes. Of > course we'll need to have a skip_sw flag etc. I'm not sure I'm parsing this correctly, but are you saying that the index namespace is per-action type? I.e. a mirred and a drop action could have the same index yet expect to have separate counters? My approach here has assumed that in such a case they would share their counters.
On Tue, 7 May 2019 13:27:15 +0100, Edward Cree wrote: > On 06/05/2019 13:41, Jamal Hadi Salim wrote: > > On 2019-05-04 2:27 a.m., Jakub Kicinski wrote: > >> On Fri, 3 May 2019 16:06:55 +0100, Edward Cree wrote: > >>> Introduce a new offload command TC_CLSFLOWER_STATS_BYINDEX, similar to > >>> the existing TC_CLSFLOWER_STATS but specifying an action_index (the > >>> tcfa_index of the action), which is called for each stats-having action > >>> on the rule. Drivers should implement either, but not both, of these > >>> commands. > >> > >> It feels a little strange to me to call the new stats updates from > >> cls_flower, if we really want to support action sharing correctly. > >> > >> Can RTM_GETACTION not be used to dump actions without dumping the > >> classifiers? If we dump from the classifiers wouldn't that lead to > >> stale stats being returned? > > > > Not sure about the staleness factor, but: > > For efficiency reasons we certainly need the RTM_GETACTION approach > > (as you stated above we dont need to dump all that classifier info if > > all we want are stats). This becomes a big deal if you have a lot > > of stats/rules. > > I don't know much of anything about RTM_GETACTION, but it doesn't appear > to be part of the current "tc offload" world, which AIUI is very much > centred around cls_flower. I'm just trying to make counters in > cls_flower offload do 'the right thing' (whatever that may be), anything > else is out of scope. I understand, and you are correct that the action sharing is not part of the current "tc offload" world. My point is that you are making it part of the offload world, so it'd be best if it could be done well from the start. People trying to make the code do the right thing just in their narrow use case brings us the periodic rewrites of the TC offload infrastructure, and every time we do it some use cases get broken :/
On 2019-05-07 8:27 a.m., Edward Cree wrote: > On 06/05/2019 13:41, Jamal Hadi Salim wrote: >> On 2019-05-04 2:27 a.m., Jakub Kicinski wrote: >>> On Fri, 3 May 2019 16:06:55 +0100, Edward Cree wrote: [..] > I don't know much of anything about RTM_GETACTION, but it doesn't appear > to be part of the current "tc offload" world, which AIUI is very much > centred around cls_flower. I'm just trying to make counters in > cls_flower offload do 'the right thing' (whatever that may be), anything > else is out of scope. > The lazy thing most people have done is essentially assume that there is a stat per filter rule. From tc semantics that is an implicit "pipe" action (which then carries stats). And most times they implement a single action per match. I wouldnt call it the 'the right thing'... >> Most H/W i have seen has a global indexed stats table which is >> shared by different action types (droppers, accept, mirror etc). >> The specific actions may also have their own tables which also >> then refer to the 32 bit index used in the stats table[1]. >> So for this to work well, the action will need at minimal to have >> two indices one that is used in hardware stats table >> and another that is kernel mapped to identify the attributes. Of >> course we'll need to have a skip_sw flag etc. > I'm not sure I'm parsing this correctly, but are you saying that the > index namespace is per-action type? I.e. a mirred and a drop action > could have the same index yet expect to have separate counters? My > approach here has assumed that in such a case they would share their > counters. Yes, the index at tc semantics level is per-action type. So "mirred index 1" and "drop index 1" are not the same stats counter. Filters on the other hand can share per-action type stats by virtue of sharing the same index per-action. e.g flower match foo action drop index 1 flower match bar action drop index 1 will share the same stats. cheers, jamal
On 08/05/2019 15:02, Jamal Hadi Salim wrote: > The lazy thing most people have done is essentially assume that > there is a stat per filter rule... > I wouldnt call it the 'the right thing' Yup, that's why I'm trying to not do that ;-) > Yes, the index at tc semantics level is per-action type. > So "mirred index 1" and "drop index 1" are not the same stats counter. Ok, then that kills the design I used here that relied entirely on the index to specify counters. I guess instead I'll have to go with the approach Pablo suggested, passing an array of struct flow_stats in the callback, thus using the index into that array (which corresponds to the index in f->exts->actions) to identify different counters. Which means I will have to change all the existing drivers, which will largely revert (from the drivers' perspective) the change when Pablo took f->exts away from them — they will go back to calling something that looks a lot like tcf_exts_stats_update(). However, that'll mean the API has in-tree users, so it might be considered mergeable(?) -Ed
On 2019-05-08 1:07 p.m., Edward Cree wrote: > On 08/05/2019 15:02, Jamal Hadi Salim wrote: >> The lazy thing most people have done is essentially assume that >> there is a stat per filter rule... >> I wouldnt call it the 'the right thing' > Yup, that's why I'm trying to not do that ;-) Thank you ;-> > >> Yes, the index at tc semantics level is per-action type. >> So "mirred index 1" and "drop index 1" are not the same stats counter. > Ok, then that kills the design I used here that relied entirely on the > index to specify counters. > I guess instead I'll have to go with the approach Pablo suggested, > passing an array of struct flow_stats in the callback, thus using > the index into that array (which corresponds to the index in > f->exts->actions) to identify different counters. > Which means I will have to change all the existing drivers, which will > largely revert (from the drivers' perspective) the change when Pablo > took f->exts away from them — they will go back to calling something > that looks a lot like tcf_exts_stats_update(). > However, that'll mean the API has in-tree users, so it might be > considered mergeable(?) I would say yes, but post the patches and lets have the stakeholders chime in. Would it be simpler to just restore the f->exts? cheers, jamal
diff --git a/include/net/pkt_cls.h b/include/net/pkt_cls.h index d5e7a1af346f..0e33c52c23a8 100644 --- a/include/net/pkt_cls.h +++ b/include/net/pkt_cls.h @@ -762,6 +762,7 @@ enum tc_fl_command { TC_CLSFLOWER_REPLACE, TC_CLSFLOWER_DESTROY, TC_CLSFLOWER_STATS, + TC_CLSFLOWER_STATS_BYINDEX, TC_CLSFLOWER_TMPLT_CREATE, TC_CLSFLOWER_TMPLT_DESTROY, }; @@ -773,6 +774,7 @@ struct tc_cls_flower_offload { struct flow_rule *rule; struct flow_stats stats; u32 classid; + u32 action_index; /* for TC_CLSFLOWER_STATS_BYINDEX */ }; static inline struct flow_rule * diff --git a/net/sched/cls_flower.c b/net/sched/cls_flower.c index f6685fc53119..be339cd6a86e 100644 --- a/net/sched/cls_flower.c +++ b/net/sched/cls_flower.c @@ -474,6 +474,10 @@ static void fl_hw_update_stats(struct tcf_proto *tp, struct cls_fl_filter *f, { struct tc_cls_flower_offload cls_flower = {}; struct tcf_block *block = tp->chain->block; +#ifdef CONFIG_NET_CLS_ACT + struct tc_action *a; + int i; +#endif if (!rtnl_held) rtnl_lock(); @@ -489,6 +493,32 @@ static void fl_hw_update_stats(struct tcf_proto *tp, struct cls_fl_filter *f, cls_flower.stats.pkts, cls_flower.stats.lastused); +#ifdef CONFIG_NET_CLS_ACT + for (i = 0; i < f->exts.nr_actions; i++) { + a = f->exts.actions[i]; + + if (!a->ops->stats_update) + continue; + memset(&cls_flower, 0, sizeof(cls_flower)); + tc_cls_common_offload_init(&cls_flower.common, tp, f->flags, NULL); + cls_flower.command = TC_CLSFLOWER_STATS_BYINDEX; + cls_flower.cookie = (unsigned long) f; + cls_flower.classid = f->res.classid; + cls_flower.action_index = a->tcfa_index; + + tc_setup_cb_call(block, TC_SETUP_CLSFLOWER, &cls_flower, false); + + /* Some ->stats_update() use percpu variables and must thus be + * called with preemption disabled. + */ + preempt_disable(); + a->ops->stats_update(a, cls_flower.stats.bytes, + cls_flower.stats.pkts, + cls_flower.stats.lastused, true); + preempt_enable(); + } +#endif + if (!rtnl_held) rtnl_unlock(); }
Introduce a new offload command TC_CLSFLOWER_STATS_BYINDEX, similar to the existing TC_CLSFLOWER_STATS but specifying an action_index (the tcfa_index of the action), which is called for each stats-having action on the rule. Drivers should implement either, but not both, of these commands. Signed-off-by: Edward Cree <ecree@solarflare.com> --- include/net/pkt_cls.h | 2 ++ net/sched/cls_flower.c | 30 ++++++++++++++++++++++++++++++ 2 files changed, 32 insertions(+)