Message ID | cover.1590512901.git.petrm@mellanox.com |
---|---|
Headers | show |
Series | TC: Introduce qevents | expand |
On Tue, May 26, 2020 at 10:11 AM Petr Machata <petrm@mellanox.com> wrote: > > The Spectrum hardware allows execution of one of several actions as a > result of queue management events: tail-dropping, early-dropping, marking a > packet, or passing a configured latency threshold or buffer size. Such > packets can be mirrored, trapped, or sampled. > > Modeling the action to be taken as simply a TC action is very attractive, > but it is not obvious where to put these actions. At least with ECN marking > one could imagine a tree of qdiscs and classifiers that effectively > accomplishes this task, albeit in an impractically complex manner. But > there is just no way to match on dropped-ness of a packet, let alone > dropped-ness due to a particular reason. > > To allow configuring user-defined actions as a result of inner workings of > a qdisc, this patch set introduces a concept of qevents. Those are attach > points for TC blocks, where filters can be put that are executed as the > packet hits well-defined points in the qdisc algorithms. The attached > blocks can be shared, in a manner similar to clsact ingress and egress > blocks, arbitrary classifiers with arbitrary actions can be put on them, > etc. This concept does not fit well into qdisc, essentially you still want to install filters (and actions) somewhere on qdisc, but currently all filters are executed at enqueue, basically you want to execute them at other pre-defined locations too, for example early drop. So, perhaps adding a "position" in tc filter is better? Something like: tc qdisc add dev x root handle 1: ... # same as before tc filter add dev x parent 1:0 position early_drop matchall action.... And obviously default position must be "enqueue". Makes sense? (The word "position" may be not accurate, but hope you get my point here.) Thanks.
Cong Wang <xiyou.wangcong@gmail.com> writes: > On Tue, May 26, 2020 at 10:11 AM Petr Machata <petrm@mellanox.com> wrote: >> >> The Spectrum hardware allows execution of one of several actions as a >> result of queue management events: tail-dropping, early-dropping, marking a >> packet, or passing a configured latency threshold or buffer size. Such >> packets can be mirrored, trapped, or sampled. >> >> Modeling the action to be taken as simply a TC action is very attractive, >> but it is not obvious where to put these actions. At least with ECN marking >> one could imagine a tree of qdiscs and classifiers that effectively >> accomplishes this task, albeit in an impractically complex manner. But >> there is just no way to match on dropped-ness of a packet, let alone >> dropped-ness due to a particular reason. >> >> To allow configuring user-defined actions as a result of inner workings of >> a qdisc, this patch set introduces a concept of qevents. Those are attach >> points for TC blocks, where filters can be put that are executed as the >> packet hits well-defined points in the qdisc algorithms. The attached >> blocks can be shared, in a manner similar to clsact ingress and egress >> blocks, arbitrary classifiers with arbitrary actions can be put on them, >> etc. > > This concept does not fit well into qdisc, essentially you still want to > install filters (and actions) somewhere on qdisc, but currently all filters > are executed at enqueue, basically you want to execute them at other > pre-defined locations too, for example early drop. > > So, perhaps adding a "position" in tc filter is better? Something like: > > tc qdisc add dev x root handle 1: ... # same as before > tc filter add dev x parent 1:0 position early_drop matchall action.... Position would just be a chain index. > And obviously default position must be "enqueue". Makes sense? Chain 0. So if I understand the proposal correctly: a qdisc has a classification block (cl_ops->tcf_block). Qevents then reference a chain on that classification block. If the above is correct, I disagree that this is a better model. RED does not need to classify anything, modelling this as a classification block is wrong. It should be another block. (Also shareable, because as an operator you likely want to treat all, say, early drops the same, and therefore to add just one rule for all 128 or so of your qdiscs.)
Hi Petr, On Tue, 26 May 2020 at 20:11, Petr Machata <petrm@mellanox.com> wrote: > > The Spectrum hardware allows execution of one of several actions as a > result of queue management events: tail-dropping, early-dropping, marking a > packet, or passing a configured latency threshold or buffer size. Such > packets can be mirrored, trapped, or sampled. > > Modeling the action to be taken as simply a TC action is very attractive, > but it is not obvious where to put these actions. At least with ECN marking > one could imagine a tree of qdiscs and classifiers that effectively > accomplishes this task, albeit in an impractically complex manner. But > there is just no way to match on dropped-ness of a packet, let alone > dropped-ness due to a particular reason. > > To allow configuring user-defined actions as a result of inner workings of > a qdisc, this patch set introduces a concept of qevents. Those are attach > points for TC blocks, where filters can be put that are executed as the > packet hits well-defined points in the qdisc algorithms. The attached > blocks can be shared, in a manner similar to clsact ingress and egress > blocks, arbitrary classifiers with arbitrary actions can be put on them, > etc. > > For example: > > # tc qdisc add dev eth0 root handle 1: \ > red limit 500K avpkt 1K qevent early block 10 > # tc filter add block 10 \ > matchall action mirred egress mirror dev eth1 > > Patch #1 of this set introduces several helpers to allow easy and uniform > addition of qevents to qdiscs. The following two patches, #2 and #3, then > add two qevents to the RED qdisc: "early" qevent fires when a packet is > early-dropped; "mark" qevent, when it is ECN-marked. > > This patch set does not deal with offloading. The idea there is that a > driver will be able to figure out that a given block is used in qevent > context by looking at binder type. A future patch-set will add a qdisc > pointer to struct flow_block_offload, which a driver will be able to > consult to glean the TC or other relevant attributes. > > Petr Machata (3): > net: sched: Introduce helpers for qevent blocks > net: sched: sch_red: Split init and change callbacks > net: sched: sch_red: Add qevents "early" and "mark" > > include/net/flow_offload.h | 2 + > include/net/pkt_cls.h | 48 +++++++++++++++ > include/uapi/linux/pkt_sched.h | 2 + > net/sched/cls_api.c | 107 +++++++++++++++++++++++++++++++++ > net/sched/sch_red.c | 100 ++++++++++++++++++++++++++---- > 5 files changed, 247 insertions(+), 12 deletions(-) > > -- > 2.20.1 > I only took a cursory glance at your patches. Can these "qevents" be added to code outside of the packet scheduler, like to the bridge, for example? Or can the bridge mark the packets somehow, and then any generic qdisc be able to recognize this mark without specific code? A very common use case which is currently not possible to implement is to rate-limit flooded (broadcast, unknown unicast, unknown multicast) traffic. Can your "qevents" be used to describe this, or must it be described separately? Thanks, -Vladimir
Vladimir Oltean <olteanv@gmail.com> writes: > I only took a cursory glance at your patches. Can these "qevents" be > added to code outside of the packet scheduler, like to the bridge, for > example? Or can the bridge mark the packets somehow, and then any > generic qdisc be able to recognize this mark without specific code? > A very common use case which is currently not possible to implement is > to rate-limit flooded (broadcast, unknown unicast, unknown multicast) > traffic. Can your "qevents" be used to describe this, or must it be > described separately? You mean something like a "flood" qevent? In principle nothing prevents this, but it does not strike me as a very good fit. These events are meant to be used on qdiscs, hence the "q" in the name. I am not sure it makes sense to reuse them for bridge traffic policing. Peeking in 802.1Q, I see "Managed objects for per-stream filtering and policing". If that's related, it seems like it would be conservative to model the policing directly in the bridge, instead of this round-about through TC.
On Wed, May 27, 2020 at 2:56 AM Petr Machata <petrm@mellanox.com> wrote: > > > Cong Wang <xiyou.wangcong@gmail.com> writes: > > > On Tue, May 26, 2020 at 10:11 AM Petr Machata <petrm@mellanox.com> wrote: > >> > >> The Spectrum hardware allows execution of one of several actions as a > >> result of queue management events: tail-dropping, early-dropping, marking a > >> packet, or passing a configured latency threshold or buffer size. Such > >> packets can be mirrored, trapped, or sampled. > >> > >> Modeling the action to be taken as simply a TC action is very attractive, > >> but it is not obvious where to put these actions. At least with ECN marking > >> one could imagine a tree of qdiscs and classifiers that effectively > >> accomplishes this task, albeit in an impractically complex manner. But > >> there is just no way to match on dropped-ness of a packet, let alone > >> dropped-ness due to a particular reason. > >> > >> To allow configuring user-defined actions as a result of inner workings of > >> a qdisc, this patch set introduces a concept of qevents. Those are attach > >> points for TC blocks, where filters can be put that are executed as the > >> packet hits well-defined points in the qdisc algorithms. The attached > >> blocks can be shared, in a manner similar to clsact ingress and egress > >> blocks, arbitrary classifiers with arbitrary actions can be put on them, > >> etc. > > > > This concept does not fit well into qdisc, essentially you still want to > > install filters (and actions) somewhere on qdisc, but currently all filters > > are executed at enqueue, basically you want to execute them at other > > pre-defined locations too, for example early drop. > > > > So, perhaps adding a "position" in tc filter is better? Something like: > > > > tc qdisc add dev x root handle 1: ... # same as before > > tc filter add dev x parent 1:0 position early_drop matchall action.... > > Position would just be a chain index. Why? By position, I mean a place where we _execute_ tc filters on a qdisc, currently there is only "enqueue". I don't see how this is close to a chain which is basically a group of filters. > > > And obviously default position must be "enqueue". Makes sense? > > Chain 0. > > So if I understand the proposal correctly: a qdisc has a classification > block (cl_ops->tcf_block). Qevents then reference a chain on that > classification block. No, I am suggesting to replace your qevents with position, because as I said it does not fit well there. > > If the above is correct, I disagree that this is a better model. RED > does not need to classify anything, modelling this as a classification > block is wrong. It should be another block. (Also shareable, because as > an operator you likely want to treat all, say, early drops the same, and > therefore to add just one rule for all 128 or so of your qdiscs.) You can still choose not to classify anything by using matchall. No one is saying you have to do classification. If you want to jump to a block, you can just use a goto action like normal. So your above example can be replaced with: # tc qdisc add dev eth0 root handle 1: \ red limit 500K avpkt 1K # tc filter add dev eth0 parent 1:0 position early_drop matchall \ action goto chain 10 # tc chain add dev eth0 index 10 ... See the difference? Thanks.
Cong Wang <xiyou.wangcong@gmail.com> writes: > On Wed, May 27, 2020 at 2:56 AM Petr Machata <petrm@mellanox.com> wrote: >> >> >> Cong Wang <xiyou.wangcong@gmail.com> writes: >> >> > On Tue, May 26, 2020 at 10:11 AM Petr Machata <petrm@mellanox.com> wrote: >> >> >> >> The Spectrum hardware allows execution of one of several actions as a >> >> result of queue management events: tail-dropping, early-dropping, marking a >> >> packet, or passing a configured latency threshold or buffer size. Such >> >> packets can be mirrored, trapped, or sampled. >> >> >> >> Modeling the action to be taken as simply a TC action is very attractive, >> >> but it is not obvious where to put these actions. At least with ECN marking >> >> one could imagine a tree of qdiscs and classifiers that effectively >> >> accomplishes this task, albeit in an impractically complex manner. But >> >> there is just no way to match on dropped-ness of a packet, let alone >> >> dropped-ness due to a particular reason. >> >> >> >> To allow configuring user-defined actions as a result of inner workings of >> >> a qdisc, this patch set introduces a concept of qevents. Those are attach >> >> points for TC blocks, where filters can be put that are executed as the >> >> packet hits well-defined points in the qdisc algorithms. The attached >> >> blocks can be shared, in a manner similar to clsact ingress and egress >> >> blocks, arbitrary classifiers with arbitrary actions can be put on them, >> >> etc. >> > >> > This concept does not fit well into qdisc, essentially you still want to >> > install filters (and actions) somewhere on qdisc, but currently all filters >> > are executed at enqueue, basically you want to execute them at other >> > pre-defined locations too, for example early drop. >> > >> > So, perhaps adding a "position" in tc filter is better? Something like: >> > >> > tc qdisc add dev x root handle 1: ... # same as before >> > tc filter add dev x parent 1:0 position early_drop matchall action.... >> >> Position would just be a chain index. > > Why? By position, I mean a place where we _execute_ tc filters on > a qdisc, currently there is only "enqueue". I don't see how this is > close to a chain which is basically a group of filters. OK, I misunderstood what you mean. So you propose to have further division within the block? To have sort of namespaces within blocks or chains, where depending on the context, only filters in the corresponding namespace are executed? >> >> > And obviously default position must be "enqueue". Makes sense? >> >> Chain 0. >> >> So if I understand the proposal correctly: a qdisc has a classification >> block (cl_ops->tcf_block). Qevents then reference a chain on that >> classification block. > > No, I am suggesting to replace your qevents with position, because > as I said it does not fit well there. > >> >> If the above is correct, I disagree that this is a better model. RED >> does not need to classify anything, modelling this as a classification >> block is wrong. It should be another block. (Also shareable, because as >> an operator you likely want to treat all, say, early drops the same, and >> therefore to add just one rule for all 128 or so of your qdiscs.) > > You can still choose not to classify anything by using matchall. No > one is saying you have to do classification. The point here is filter reuse, not classification. > If you want to jump to a block, you can just use a goto action like I don't think you can jump to a block. You can jump to a chain within the same block. > normal. So your above example can be replaced with: > > # tc qdisc add dev eth0 root handle 1: \ > red limit 500K avpkt 1K > > # tc filter add dev eth0 parent 1:0 position early_drop matchall \ > action goto chain 10 > > # tc chain add dev eth0 index 10 ... > > See the difference?
On Thu, May 28, 2020 at 2:48 AM Petr Machata <petrm@mellanox.com> wrote: > So you propose to have further division within the block? To have sort > of namespaces within blocks or chains, where depending on the context, > only filters in the corresponding namespace are executed? What I suggest is to let filters (or chain or block) decide where they belong to, because I think that fit naturally. What you suggest is to let qdisc's decide where they put filters, which looks odd to me. Hope it is all clear now. Thanks.
Cong Wang <xiyou.wangcong@gmail.com> writes: > On Thu, May 28, 2020 at 2:48 AM Petr Machata <petrm@mellanox.com> wrote: >> So you propose to have further division within the block? To have sort >> of namespaces within blocks or chains, where depending on the context, >> only filters in the corresponding namespace are executed? > > What I suggest is to let filters (or chain or block) decide where > they belong to, because I think that fit naturally. So filters would have this attribute that marks them for execution in the qevent context? Does "goto chain" keep this position? I.e. are the executed filters from the "position" context or not? If they keep the position, then this new system can be equivalently modeled as simply a block binding point. If "goto chain" loses the position, that is just a block binding point and a chain to start on, instead of the default zero. So introducing the position does not seem to allow us to model anything that could not be modeled before. But it is a more complicated system. > What you suggest is to let qdisc's decide where they put filters, > which looks odd to me. Ultimately the qdisc decides what qevents to expose. Qevents are closely tied to the inner workings of a qdisc algorithm, they can't be probed as modules the way qdiscs, filters and actions can. If a user wishes to make use of them, they will have to let qdiscs "dictate" where to put the filters one way or another.
Thu, May 28, 2020 at 11:48:50AM CEST, petrm@mellanox.com wrote: > >Cong Wang <xiyou.wangcong@gmail.com> writes: > >> On Wed, May 27, 2020 at 2:56 AM Petr Machata <petrm@mellanox.com> wrote: >>> >>> >>> Cong Wang <xiyou.wangcong@gmail.com> writes: >>> >>> > On Tue, May 26, 2020 at 10:11 AM Petr Machata <petrm@mellanox.com> wrote: >>> >> >>> >> The Spectrum hardware allows execution of one of several actions as a >>> >> result of queue management events: tail-dropping, early-dropping, marking a >>> >> packet, or passing a configured latency threshold or buffer size. Such >>> >> packets can be mirrored, trapped, or sampled. >>> >> >>> >> Modeling the action to be taken as simply a TC action is very attractive, >>> >> but it is not obvious where to put these actions. At least with ECN marking >>> >> one could imagine a tree of qdiscs and classifiers that effectively >>> >> accomplishes this task, albeit in an impractically complex manner. But >>> >> there is just no way to match on dropped-ness of a packet, let alone >>> >> dropped-ness due to a particular reason. >>> >> >>> >> To allow configuring user-defined actions as a result of inner workings of >>> >> a qdisc, this patch set introduces a concept of qevents. Those are attach >>> >> points for TC blocks, where filters can be put that are executed as the >>> >> packet hits well-defined points in the qdisc algorithms. The attached >>> >> blocks can be shared, in a manner similar to clsact ingress and egress >>> >> blocks, arbitrary classifiers with arbitrary actions can be put on them, >>> >> etc. >>> > >>> > This concept does not fit well into qdisc, essentially you still want to >>> > install filters (and actions) somewhere on qdisc, but currently all filters >>> > are executed at enqueue, basically you want to execute them at other >>> > pre-defined locations too, for example early drop. >>> > >>> > So, perhaps adding a "position" in tc filter is better? Something like: >>> > >>> > tc qdisc add dev x root handle 1: ... # same as before >>> > tc filter add dev x parent 1:0 position early_drop matchall action.... >>> >>> Position would just be a chain index. >> >> Why? By position, I mean a place where we _execute_ tc filters on >> a qdisc, currently there is only "enqueue". I don't see how this is >> close to a chain which is basically a group of filters. > >OK, I misunderstood what you mean. > >So you propose to have further division within the block? To have sort >of namespaces within blocks or chains, where depending on the context, >only filters in the corresponding namespace are executed? Please take the block as a whole entity. It has one entry ->chain0 processing. The gotos to other chains should be contained within the block. Please don't divide the block. If you want to process the block from a different entry point, please process it as a whole. > >>> >>> > And obviously default position must be "enqueue". Makes sense? >>> >>> Chain 0. >>> >>> So if I understand the proposal correctly: a qdisc has a classification >>> block (cl_ops->tcf_block). Qevents then reference a chain on that >>> classification block. >> >> No, I am suggesting to replace your qevents with position, because >> as I said it does not fit well there. >> >>> >>> If the above is correct, I disagree that this is a better model. RED >>> does not need to classify anything, modelling this as a classification >>> block is wrong. It should be another block. (Also shareable, because as >>> an operator you likely want to treat all, say, early drops the same, and >>> therefore to add just one rule for all 128 or so of your qdiscs.) >> >> You can still choose not to classify anything by using matchall. No >> one is saying you have to do classification. > >The point here is filter reuse, not classification. > >> If you want to jump to a block, you can just use a goto action like > >I don't think you can jump to a block. You can jump to a chain within >the same block. Correct. entrypoint-> block X chain 0 chain A chain B chain ... There's no goto/jump out of the block. > >> normal. So your above example can be replaced with: >> >> # tc qdisc add dev eth0 root handle 1: \ >> red limit 500K avpkt 1K >> >> # tc filter add dev eth0 parent 1:0 position early_drop matchall \ >> action goto chain 10 >> >> # tc chain add dev eth0 index 10 ... >> >> See the difference?
Wed, May 27, 2020 at 06:09:03AM CEST, xiyou.wangcong@gmail.com wrote: >On Tue, May 26, 2020 at 10:11 AM Petr Machata <petrm@mellanox.com> wrote: >> >> The Spectrum hardware allows execution of one of several actions as a >> result of queue management events: tail-dropping, early-dropping, marking a >> packet, or passing a configured latency threshold or buffer size. Such >> packets can be mirrored, trapped, or sampled. >> >> Modeling the action to be taken as simply a TC action is very attractive, >> but it is not obvious where to put these actions. At least with ECN marking >> one could imagine a tree of qdiscs and classifiers that effectively >> accomplishes this task, albeit in an impractically complex manner. But >> there is just no way to match on dropped-ness of a packet, let alone >> dropped-ness due to a particular reason. >> >> To allow configuring user-defined actions as a result of inner workings of >> a qdisc, this patch set introduces a concept of qevents. Those are attach >> points for TC blocks, where filters can be put that are executed as the >> packet hits well-defined points in the qdisc algorithms. The attached >> blocks can be shared, in a manner similar to clsact ingress and egress >> blocks, arbitrary classifiers with arbitrary actions can be put on them, >> etc. > >This concept does not fit well into qdisc, essentially you still want to >install filters (and actions) somewhere on qdisc, but currently all filters >are executed at enqueue, basically you want to execute them at other >pre-defined locations too, for example early drop. > >So, perhaps adding a "position" in tc filter is better? Something like: > >tc qdisc add dev x root handle 1: ... # same as before >tc filter add dev x parent 1:0 position early_drop matchall action.... > >And obviously default position must be "enqueue". Makes sense? Well, if you look at the examples in the cover letter, I think that they are showing something very similar you are talking about: # tc qdisc add dev eth0 root handle 1: \ red limit 500K avpkt 1K qevent early block 10 # tc filter add block 10 \ matchall action mirred egress mirror dev eth1 The first command just says "early drop position should be processed by block 10" The second command just adds a filter to the block 10. We have this concept of blocks, we use them in "tc filter" as a handle. The block as a unit could be attached to be processed not only to "enqueue" but to anything else, like some qdisc stage. Looks quite neat to me. > >(The word "position" may be not accurate, but hope you get my point >here.) > >Thanks.
On Mon, Jun 1, 2020 at 6:40 AM Jiri Pirko <jiri@resnulli.us> wrote: > The first command just says "early drop position should be processed by > block 10" > > The second command just adds a filter to the block 10. This is exactly why it looks odd to me, because it _reads_ like 'tc qdisc' creates the block to hold tc filters... I think tc filters should create whatever placeholder for themselves. I know in memory block (or chain or filters) are stored in qdisc, but it is still different to me who initiates the creation. Thanks.
On Sat, May 30, 2020 at 1:55 AM Petr Machata <petrm@mellanox.com> wrote: > > > Cong Wang <xiyou.wangcong@gmail.com> writes: > > > On Thu, May 28, 2020 at 2:48 AM Petr Machata <petrm@mellanox.com> wrote: > >> So you propose to have further division within the block? To have sort > >> of namespaces within blocks or chains, where depending on the context, > >> only filters in the corresponding namespace are executed? > > > > What I suggest is to let filters (or chain or block) decide where > > they belong to, because I think that fit naturally. > > So filters would have this attribute that marks them for execution in > the qevent context? If you view it as position rather than qevent, sure, we already need to specify the "context" for tc filter creations anyway, "dev... parent ..." is is context to locate the filter placeholders. This is why it makes sense to add one more piece, say "position", that is "dev... parent... position...". It is you who calls it qevent which of course looks like only belong to qdisc's. > > Ultimately the qdisc decides what qevents to expose. Qevents are closely > tied to the inner workings of a qdisc algorithm, they can't be probed as > modules the way qdiscs, filters and actions can. If a user wishes to > make use of them, they will have to let qdiscs "dictate" where to put > the filters one way or another. For a specific event like early drop, sure. But if we think it generally, "enqueue" has the same problem, there are a few qdisc's don't even support filtering (noop). We can just return ENOSUPP anyway.
Cong Wang <xiyou.wangcong@gmail.com> writes: > On Mon, Jun 1, 2020 at 6:40 AM Jiri Pirko <jiri@resnulli.us> wrote: >> The first command just says "early drop position should be processed by >> block 10" >> >> The second command just adds a filter to the block 10. > This is exactly why it looks odd to me, because it _reads_ like > 'tc qdisc' creates the block to hold tc filters... I think tc filters should > create whatever placeholder for themselves. Look at clsact. It creates blocks in exactly the same way. > I know in memory block (or chain or filters) are stored in qdisc, but > it is still different to me who initiates the creation. The block binding mechanics are not new. The patch just reuses them. If you are unhappy about how this is currently done, I too would see merit in creating a block explicitly, like with chains. But it has nothing to do with this patchset, which would just naturally pick up whatever this new mechanic is.
Mon, Jun 01, 2020 at 09:50:23PM CEST, xiyou.wangcong@gmail.com wrote: >On Mon, Jun 1, 2020 at 6:40 AM Jiri Pirko <jiri@resnulli.us> wrote: >> The first command just says "early drop position should be processed by >> block 10" >> >> The second command just adds a filter to the block 10. > >This is exactly why it looks odd to me, because it _reads_ like >'tc qdisc' creates the block to hold tc filters... I think tc filters should >create whatever placeholder for themselves. That is how it is done already today. We have the block object. The instances are created separatelly (clsact for example), user may specify the block id to identify the block. Then the user may use this block id to add/remove filters directly to/from the block. What you propose with "position" on the other hand look unnatural for me. It would slice the exising blocks in some way. Currently, the block has 1 clearly defined entrypoint. With "positions", all of the sudden there would be many of them? I can't really imagine how that is supposed to work :/ > >I know in memory block (or chain or filters) are stored in qdisc, but >it is still different to me who initiates the creation. Qdiscs create blocks. > >Thanks.
On Mon, Jun 1, 2020 at 11:05 PM Jiri Pirko <jiri@resnulli.us> wrote: > > Mon, Jun 01, 2020 at 09:50:23PM CEST, xiyou.wangcong@gmail.com wrote: > >On Mon, Jun 1, 2020 at 6:40 AM Jiri Pirko <jiri@resnulli.us> wrote: > >> The first command just says "early drop position should be processed by > >> block 10" > >> > >> The second command just adds a filter to the block 10. > > > >This is exactly why it looks odd to me, because it _reads_ like > >'tc qdisc' creates the block to hold tc filters... I think tc filters should > >create whatever placeholder for themselves. > > That is how it is done already today. We have the block object. The > instances are created separatelly (clsact for example), user may specify > the block id to identify the block. Then the user may use this block id > to add/remove filters directly to/from the block. > > What you propose with "position" on the other hand look unnatural for > me. It would slice the exising blocks in some way. Currently, the block > has 1 clearly defined entrypoint. With "positions", all of the sudden > there would be many of them? I can't really imagine how that is supposed > to work :/ I imagine we could introduce multiple blocks for a qdisc. Currently we have: struct some_qdisc_data { struct tcf_block *block; }; Maybe we can extend it to: struct some_qdisc_data { struct tcf_block *blocks[3]; }; #define ENQUEUE 0 #define DEQUEUE 1 #define DROP 2 static struct tcf_block *foo_tcf_block(struct Qdisc *sch, unsigned long cl, struct netlink_ext_ack *extack, int position) { struct some_qdisc_data *q = qdisc_priv(sch); if (cl) return NULL; return q->block[position]; } Just some scratches on my mind. Thanks.
Cong Wang <xiyou.wangcong@gmail.com> writes: > I imagine we could introduce multiple blocks for a qdisc. Yes, and that's what the patchset does. If you look at struct tcf_qevent, it is just some block bookkeeping and an attribute name. > Currently we have: > > struct some_qdisc_data { > struct tcf_block *block; > }; > > Maybe we can extend it to: > > struct some_qdisc_data { > struct tcf_block *blocks[3]; Yeah, except not all qdiscs will implement all qevents, so let's instead make it a handful of fields, like in the patchset. > }; > > #define ENQUEUE 0 > #define DEQUEUE 1 > #define DROP 2 > > static struct tcf_block *foo_tcf_block(struct Qdisc *sch, unsigned long cl, > struct netlink_ext_ack > *extack, int position) > { > struct some_qdisc_data *q = qdisc_priv(sch); > > if (cl) > return NULL; > return q->block[position]; > } Interestingly, this is close to my original approach, pre-RFC. But there needs to be this global list of all existing qevents. On its own, that's a negative--at least it's an extra uAPI to maintain. What does it bring? It theoretically allows one to refer to blocks symbolically, through binding point coordinates (dev D parent P qevent Q) not by indices (block B). But then one block could be referenced by several different coordinates, which is confusing. That is the reason TC disallows editing filters on shared blocks. Qevents should be the same. What else is there?