diff mbox series

[ovs-dev,net-next,v2,7/9] net: openvswitch: do not notify drops inside sample

Message ID 20240603185647.2310748-8-amorenoz@redhat.com
State Handled Elsewhere
Delegated to: Simon Horman
Headers show
Series net: openvswitch: Add sample multicasting. | expand

Commit Message

Adrián Moreno June 3, 2024, 6:56 p.m. UTC
The OVS_ACTION_ATTR_SAMPLE action is, in essence,
observability-oriented.

Apart from some corner case in which it's used a replacement of clone()
for old kernels, it's really only used for sFlow, IPFIX and now,
local emit_sample.

With this in mind, it doesn't make much sense to report
OVS_DROP_LAST_ACTION inside sample actions.

For instance, if the flow:

  actions:sample(..,emit_sample(..)),2

triggers a OVS_DROP_LAST_ACTION skb drop event, it would be extremely
confusing for users since the packet did reach its destination.

This patch makes internal action execution silently consume the skb
instead of notifying a drop for this case.

Unfortunately, this patch does not remove all potential sources of
confusion since, if the sample action itself is the last action, e.g:

    actions:sample(..,emit_sample(..))

we actually _should_ generate a OVS_DROP_LAST_ACTION event, but we aren't.

Sadly, this case is difficult to solve without breaking the
optimization by which the skb is not cloned on last sample actions.
But, given explicit drop actions are now supported, OVS can just add one
after the last sample() and rewrite the flow as:

    actions:sample(..,emit_sample(..)),drop

Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
---
 net/openvswitch/actions.c | 13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)

Comments

Simon Horman June 14, 2024, 4:17 p.m. UTC | #1
On Mon, Jun 03, 2024 at 08:56:41PM +0200, Adrian Moreno wrote:
> The OVS_ACTION_ATTR_SAMPLE action is, in essence,
> observability-oriented.
> 
> Apart from some corner case in which it's used a replacement of clone()
> for old kernels, it's really only used for sFlow, IPFIX and now,
> local emit_sample.
> 
> With this in mind, it doesn't make much sense to report
> OVS_DROP_LAST_ACTION inside sample actions.
> 
> For instance, if the flow:
> 
>   actions:sample(..,emit_sample(..)),2
> 
> triggers a OVS_DROP_LAST_ACTION skb drop event, it would be extremely
> confusing for users since the packet did reach its destination.
> 
> This patch makes internal action execution silently consume the skb
> instead of notifying a drop for this case.
> 
> Unfortunately, this patch does not remove all potential sources of
> confusion since, if the sample action itself is the last action, e.g:
> 
>     actions:sample(..,emit_sample(..))
> 
> we actually _should_ generate a OVS_DROP_LAST_ACTION event, but we aren't.
> 
> Sadly, this case is difficult to solve without breaking the
> optimization by which the skb is not cloned on last sample actions.
> But, given explicit drop actions are now supported, OVS can just add one
> after the last sample() and rewrite the flow as:
> 
>     actions:sample(..,emit_sample(..)),drop
> 
> Signed-off-by: Adrian Moreno <amorenoz@redhat.com>

Reviewed-by: Simon Horman <horms@kernel.org>
Ilya Maximets June 17, 2024, 11:55 a.m. UTC | #2
On 6/3/24 20:56, Adrian Moreno wrote:
> The OVS_ACTION_ATTR_SAMPLE action is, in essence,
> observability-oriented.
> 
> Apart from some corner case in which it's used a replacement of clone()
> for old kernels, it's really only used for sFlow, IPFIX and now,
> local emit_sample.
> 
> With this in mind, it doesn't make much sense to report
> OVS_DROP_LAST_ACTION inside sample actions.
> 
> For instance, if the flow:
> 
>   actions:sample(..,emit_sample(..)),2
> 
> triggers a OVS_DROP_LAST_ACTION skb drop event, it would be extremely
> confusing for users since the packet did reach its destination.
> 
> This patch makes internal action execution silently consume the skb
> instead of notifying a drop for this case.
> 
> Unfortunately, this patch does not remove all potential sources of
> confusion since, if the sample action itself is the last action, e.g:
> 
>     actions:sample(..,emit_sample(..))
> 
> we actually _should_ generate a OVS_DROP_LAST_ACTION event, but we aren't.
> 
> Sadly, this case is difficult to solve without breaking the
> optimization by which the skb is not cloned on last sample actions.
> But, given explicit drop actions are now supported, OVS can just add one
> after the last sample() and rewrite the flow as:
> 
>     actions:sample(..,emit_sample(..)),drop
> 
> Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
> ---
>  net/openvswitch/actions.c | 13 +++++++++++--
>  1 file changed, 11 insertions(+), 2 deletions(-)
> 
> diff --git a/net/openvswitch/actions.c b/net/openvswitch/actions.c
> index 33f6d93ba5e4..54fc1abcff95 100644
> --- a/net/openvswitch/actions.c
> +++ b/net/openvswitch/actions.c
> @@ -82,6 +82,15 @@ static struct action_fifo __percpu *action_fifos;
>  static struct action_flow_keys __percpu *flow_keys;
>  static DEFINE_PER_CPU(int, exec_actions_level);
>  
> +static inline void ovs_drop_skb_last_action(struct sk_buff *skb)
> +{
> +	/* Do not emit packet drops inside sample(). */
> +	if (OVS_CB(skb)->probability)
> +		consume_skb(skb);
> +	else
> +		ovs_kfree_skb_reason(skb, OVS_DROP_LAST_ACTION);
> +}
> +
>  /* Make a clone of the 'key', using the pre-allocated percpu 'flow_keys'
>   * space. Return NULL if out of key spaces.
>   */
> @@ -1061,7 +1070,7 @@ static int sample(struct datapath *dp, struct sk_buff *skb,
>  	if ((arg->probability != U32_MAX) &&
>  	    (!arg->probability || get_random_u32() > arg->probability)) {
>  		if (last)
> -			ovs_kfree_skb_reason(skb, OVS_DROP_LAST_ACTION);
> +			ovs_drop_skb_last_action(skb);
>  		return 0;
>  	}
>  
> @@ -1579,7 +1588,7 @@ static int do_execute_actions(struct datapath *dp, struct sk_buff *skb,
>  		}
>  	}
>  
> -	ovs_kfree_skb_reason(skb, OVS_DROP_LAST_ACTION);
> +	ovs_drop_skb_last_action(skb);

I don't think I agree with this one.  If we have a sample() action with
a lot of different actions inside and we reached the end while the last
action didn't consume the skb, then we should report that.  E.g.
"sample(emit_sample(),push_vlan(),set(eth())),2"  should report that the
cloned skb was dropped.  "sample(push_vlan(),emit_sample())" should not.

The only actions that are actually consuming the skb are "output",
"userspace", "recirc" and now "emit_sample".  "output" and "recirc" are
consuming the skb "naturally" by stealing it when it is the last action.
"userspace" has an explicit check to consume the skb if it is the last
action.  "emit_sample" should have the similar check.  It should likely
be added at the point of action introduction instead of having a separate
patch.

Best regards, Ilya Maximets.
Ilya Maximets June 17, 2024, 12:10 p.m. UTC | #3
On 6/17/24 13:55, Ilya Maximets wrote:
> On 6/3/24 20:56, Adrian Moreno wrote:
>> The OVS_ACTION_ATTR_SAMPLE action is, in essence,
>> observability-oriented.
>>
>> Apart from some corner case in which it's used a replacement of clone()
>> for old kernels, it's really only used for sFlow, IPFIX and now,
>> local emit_sample.
>>
>> With this in mind, it doesn't make much sense to report
>> OVS_DROP_LAST_ACTION inside sample actions.
>>
>> For instance, if the flow:
>>
>>   actions:sample(..,emit_sample(..)),2
>>
>> triggers a OVS_DROP_LAST_ACTION skb drop event, it would be extremely
>> confusing for users since the packet did reach its destination.
>>
>> This patch makes internal action execution silently consume the skb
>> instead of notifying a drop for this case.
>>
>> Unfortunately, this patch does not remove all potential sources of
>> confusion since, if the sample action itself is the last action, e.g:
>>
>>     actions:sample(..,emit_sample(..))
>>
>> we actually _should_ generate a OVS_DROP_LAST_ACTION event, but we aren't.
>>
>> Sadly, this case is difficult to solve without breaking the
>> optimization by which the skb is not cloned on last sample actions.
>> But, given explicit drop actions are now supported, OVS can just add one
>> after the last sample() and rewrite the flow as:
>>
>>     actions:sample(..,emit_sample(..)),drop
>>
>> Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
>> ---
>>  net/openvswitch/actions.c | 13 +++++++++++--
>>  1 file changed, 11 insertions(+), 2 deletions(-)
>>
>> diff --git a/net/openvswitch/actions.c b/net/openvswitch/actions.c
>> index 33f6d93ba5e4..54fc1abcff95 100644
>> --- a/net/openvswitch/actions.c
>> +++ b/net/openvswitch/actions.c
>> @@ -82,6 +82,15 @@ static struct action_fifo __percpu *action_fifos;
>>  static struct action_flow_keys __percpu *flow_keys;
>>  static DEFINE_PER_CPU(int, exec_actions_level);
>>  
>> +static inline void ovs_drop_skb_last_action(struct sk_buff *skb)
>> +{
>> +	/* Do not emit packet drops inside sample(). */
>> +	if (OVS_CB(skb)->probability)
>> +		consume_skb(skb);
>> +	else
>> +		ovs_kfree_skb_reason(skb, OVS_DROP_LAST_ACTION);
>> +}
>> +
>>  /* Make a clone of the 'key', using the pre-allocated percpu 'flow_keys'
>>   * space. Return NULL if out of key spaces.
>>   */
>> @@ -1061,7 +1070,7 @@ static int sample(struct datapath *dp, struct sk_buff *skb,
>>  	if ((arg->probability != U32_MAX) &&
>>  	    (!arg->probability || get_random_u32() > arg->probability)) {
>>  		if (last)
>> -			ovs_kfree_skb_reason(skb, OVS_DROP_LAST_ACTION);
>> +			ovs_drop_skb_last_action(skb);

Always consuming the skb at this point makes sense, since having smaple()
as a last action is a reasonable thing to have.  But this looks more like
a fix for the original drop reason patch set.

>>  		return 0;
>>  	}
>>  
>> @@ -1579,7 +1588,7 @@ static int do_execute_actions(struct datapath *dp, struct sk_buff *skb,
>>  		}
>>  	}
>>  
>> -	ovs_kfree_skb_reason(skb, OVS_DROP_LAST_ACTION);
>> +	ovs_drop_skb_last_action(skb);
> 
> I don't think I agree with this one.  If we have a sample() action with
> a lot of different actions inside and we reached the end while the last
> action didn't consume the skb, then we should report that.  E.g.
> "sample(emit_sample(),push_vlan(),set(eth())),2"  should report that the
> cloned skb was dropped.  "sample(push_vlan(),emit_sample())" should not.
> 
> The only actions that are actually consuming the skb are "output",
> "userspace", "recirc" and now "emit_sample".  "output" and "recirc" are
> consuming the skb "naturally" by stealing it when it is the last action.
> "userspace" has an explicit check to consume the skb if it is the last
> action.  "emit_sample" should have the similar check.  It should likely
> be added at the point of action introduction instead of having a separate
> patch.
> 
> Best regards, Ilya Maximets.
Adrián Moreno June 18, 2024, 7 a.m. UTC | #4
On Mon, Jun 17, 2024 at 02:10:37PM GMT, Ilya Maximets wrote:
> On 6/17/24 13:55, Ilya Maximets wrote:
> > On 6/3/24 20:56, Adrian Moreno wrote:
> >> The OVS_ACTION_ATTR_SAMPLE action is, in essence,
> >> observability-oriented.
> >>
> >> Apart from some corner case in which it's used a replacement of clone()
> >> for old kernels, it's really only used for sFlow, IPFIX and now,
> >> local emit_sample.
> >>
> >> With this in mind, it doesn't make much sense to report
> >> OVS_DROP_LAST_ACTION inside sample actions.
> >>
> >> For instance, if the flow:
> >>
> >>   actions:sample(..,emit_sample(..)),2
> >>
> >> triggers a OVS_DROP_LAST_ACTION skb drop event, it would be extremely
> >> confusing for users since the packet did reach its destination.
> >>
> >> This patch makes internal action execution silently consume the skb
> >> instead of notifying a drop for this case.
> >>
> >> Unfortunately, this patch does not remove all potential sources of
> >> confusion since, if the sample action itself is the last action, e.g:
> >>
> >>     actions:sample(..,emit_sample(..))
> >>
> >> we actually _should_ generate a OVS_DROP_LAST_ACTION event, but we aren't.
> >>
> >> Sadly, this case is difficult to solve without breaking the
> >> optimization by which the skb is not cloned on last sample actions.
> >> But, given explicit drop actions are now supported, OVS can just add one
> >> after the last sample() and rewrite the flow as:
> >>
> >>     actions:sample(..,emit_sample(..)),drop
> >>
> >> Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
> >> ---
> >>  net/openvswitch/actions.c | 13 +++++++++++--
> >>  1 file changed, 11 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/net/openvswitch/actions.c b/net/openvswitch/actions.c
> >> index 33f6d93ba5e4..54fc1abcff95 100644
> >> --- a/net/openvswitch/actions.c
> >> +++ b/net/openvswitch/actions.c
> >> @@ -82,6 +82,15 @@ static struct action_fifo __percpu *action_fifos;
> >>  static struct action_flow_keys __percpu *flow_keys;
> >>  static DEFINE_PER_CPU(int, exec_actions_level);
> >>
> >> +static inline void ovs_drop_skb_last_action(struct sk_buff *skb)
> >> +{
> >> +	/* Do not emit packet drops inside sample(). */
> >> +	if (OVS_CB(skb)->probability)
> >> +		consume_skb(skb);
> >> +	else
> >> +		ovs_kfree_skb_reason(skb, OVS_DROP_LAST_ACTION);
> >> +}
> >> +
> >>  /* Make a clone of the 'key', using the pre-allocated percpu 'flow_keys'
> >>   * space. Return NULL if out of key spaces.
> >>   */
> >> @@ -1061,7 +1070,7 @@ static int sample(struct datapath *dp, struct sk_buff *skb,
> >>  	if ((arg->probability != U32_MAX) &&
> >>  	    (!arg->probability || get_random_u32() > arg->probability)) {
> >>  		if (last)
> >> -			ovs_kfree_skb_reason(skb, OVS_DROP_LAST_ACTION);
> >> +			ovs_drop_skb_last_action(skb);
>
> Always consuming the skb at this point makes sense, since having smaple()
> as a last action is a reasonable thing to have.  But this looks more like
> a fix for the original drop reason patch set.
>

I don't think consuming the skb at this point makes sense. It was very
intentionally changed to a drop since a very common use-case for
sampling is drop-sampling, i.e: replacing an empty action list (that
triggers OVS_DROP_LAST_ACTION) with a sample(emit_sample()). Ideally,
that replacement should not have any effect on the number of
OVS_DROP_LAST_ACTION being reported as the packets are being treated in
the same way (only observed in one case).


> >>  		return 0;
> >>  	}
> >>
> >> @@ -1579,7 +1588,7 @@ static int do_execute_actions(struct datapath *dp, struct sk_buff *skb,
> >>  		}
> >>  	}
> >>
> >> -	ovs_kfree_skb_reason(skb, OVS_DROP_LAST_ACTION);
> >> +	ovs_drop_skb_last_action(skb);
> >
> > I don't think I agree with this one.  If we have a sample() action with
> > a lot of different actions inside and we reached the end while the last
> > action didn't consume the skb, then we should report that.  E.g.
> > "sample(emit_sample(),push_vlan(),set(eth())),2"  should report that the
> > cloned skb was dropped.  "sample(push_vlan(),emit_sample())" should not.
> >

What is the use case for such action list? Having an action branch
executed randomly doesn't make sense to me if it's not some
observability thing (which IMHO should not trigger drops).

> > The only actions that are actually consuming the skb are "output",
> > "userspace", "recirc" and now "emit_sample".  "output" and "recirc" are
> > consuming the skb "naturally" by stealing it when it is the last action.
> > "userspace" has an explicit check to consume the skb if it is the last
> > action.  "emit_sample" should have the similar check.  It should likely
> > be added at the point of action introduction instead of having a separate
> > patch.
> >

Unlinke "output", "recirc", "userspace", etc. with emit_sample the
packet does not continue it's way through the datapath.

It would be very confusing if OVS starts monitoring drops and adds a bunch
of flows such as "actions:emit_sample()" and suddently it stops reporting such
drops via standard kfree_skb_reason. Packets _are_ being dropped here,
we are just observing them.

And if we change emit_sample to trigger a drop if it's the last action,
then "sample(50%, emit_sample()),2" will trigger a drop half of the times
which is also terribly confusing.

I think we should try to be clear and informative with what we
_actually_ drop and not require the user that is just running
"dropwatch" to understand the internals of the OVS module.

So if you don't want to accept the "observational" nature of sample(),
the only other solution that does not bring even more confusion to OVS
drops would be to have userspace add explicit drop actions. WDYT?
Ilya Maximets June 18, 2024, 10:22 a.m. UTC | #5
On 6/18/24 09:00, Adrián Moreno wrote:
> On Mon, Jun 17, 2024 at 02:10:37PM GMT, Ilya Maximets wrote:
>> On 6/17/24 13:55, Ilya Maximets wrote:
>>> On 6/3/24 20:56, Adrian Moreno wrote:
>>>> The OVS_ACTION_ATTR_SAMPLE action is, in essence,
>>>> observability-oriented.
>>>>
>>>> Apart from some corner case in which it's used a replacement of clone()
>>>> for old kernels, it's really only used for sFlow, IPFIX and now,
>>>> local emit_sample.
>>>>
>>>> With this in mind, it doesn't make much sense to report
>>>> OVS_DROP_LAST_ACTION inside sample actions.
>>>>
>>>> For instance, if the flow:
>>>>
>>>>   actions:sample(..,emit_sample(..)),2
>>>>
>>>> triggers a OVS_DROP_LAST_ACTION skb drop event, it would be extremely
>>>> confusing for users since the packet did reach its destination.
>>>>
>>>> This patch makes internal action execution silently consume the skb
>>>> instead of notifying a drop for this case.
>>>>
>>>> Unfortunately, this patch does not remove all potential sources of
>>>> confusion since, if the sample action itself is the last action, e.g:
>>>>
>>>>     actions:sample(..,emit_sample(..))
>>>>
>>>> we actually _should_ generate a OVS_DROP_LAST_ACTION event, but we aren't.
>>>>
>>>> Sadly, this case is difficult to solve without breaking the
>>>> optimization by which the skb is not cloned on last sample actions.
>>>> But, given explicit drop actions are now supported, OVS can just add one
>>>> after the last sample() and rewrite the flow as:
>>>>
>>>>     actions:sample(..,emit_sample(..)),drop
>>>>
>>>> Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
>>>> ---
>>>>  net/openvswitch/actions.c | 13 +++++++++++--
>>>>  1 file changed, 11 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/net/openvswitch/actions.c b/net/openvswitch/actions.c
>>>> index 33f6d93ba5e4..54fc1abcff95 100644
>>>> --- a/net/openvswitch/actions.c
>>>> +++ b/net/openvswitch/actions.c
>>>> @@ -82,6 +82,15 @@ static struct action_fifo __percpu *action_fifos;
>>>>  static struct action_flow_keys __percpu *flow_keys;
>>>>  static DEFINE_PER_CPU(int, exec_actions_level);
>>>>
>>>> +static inline void ovs_drop_skb_last_action(struct sk_buff *skb)
>>>> +{
>>>> +	/* Do not emit packet drops inside sample(). */
>>>> +	if (OVS_CB(skb)->probability)
>>>> +		consume_skb(skb);
>>>> +	else
>>>> +		ovs_kfree_skb_reason(skb, OVS_DROP_LAST_ACTION);
>>>> +}
>>>> +
>>>>  /* Make a clone of the 'key', using the pre-allocated percpu 'flow_keys'
>>>>   * space. Return NULL if out of key spaces.
>>>>   */
>>>> @@ -1061,7 +1070,7 @@ static int sample(struct datapath *dp, struct sk_buff *skb,
>>>>  	if ((arg->probability != U32_MAX) &&
>>>>  	    (!arg->probability || get_random_u32() > arg->probability)) {
>>>>  		if (last)
>>>> -			ovs_kfree_skb_reason(skb, OVS_DROP_LAST_ACTION);
>>>> +			ovs_drop_skb_last_action(skb);
>>
>> Always consuming the skb at this point makes sense, since having smaple()
>> as a last action is a reasonable thing to have.  But this looks more like
>> a fix for the original drop reason patch set.
>>
> 
> I don't think consuming the skb at this point makes sense. It was very
> intentionally changed to a drop since a very common use-case for
> sampling is drop-sampling, i.e: replacing an empty action list (that
> triggers OVS_DROP_LAST_ACTION) with a sample(emit_sample()). Ideally,
> that replacement should not have any effect on the number of
> OVS_DROP_LAST_ACTION being reported as the packets are being treated in
> the same way (only observed in one case).
> 
> 
>>>>  		return 0;
>>>>  	}
>>>>
>>>> @@ -1579,7 +1588,7 @@ static int do_execute_actions(struct datapath *dp, struct sk_buff *skb,
>>>>  		}
>>>>  	}
>>>>
>>>> -	ovs_kfree_skb_reason(skb, OVS_DROP_LAST_ACTION);
>>>> +	ovs_drop_skb_last_action(skb);
>>>
>>> I don't think I agree with this one.  If we have a sample() action with
>>> a lot of different actions inside and we reached the end while the last
>>> action didn't consume the skb, then we should report that.  E.g.
>>> "sample(emit_sample(),push_vlan(),set(eth())),2"  should report that the
>>> cloned skb was dropped.  "sample(push_vlan(),emit_sample())" should not.
>>>
> 
> What is the use case for such action list? Having an action branch
> executed randomly doesn't make sense to me if it's not some
> observability thing (which IMHO should not trigger drops).

It is exactly my point.  A list of actions that doesn't end is some sort
of a terminal action (output, drop, etc) does not make a lot of sense and
hence should be signaled as an unexpected drop, so users can re-check the
pipeline in case they missed the terminal action somehow.

> 
>>> The only actions that are actually consuming the skb are "output",
>>> "userspace", "recirc" and now "emit_sample".  "output" and "recirc" are
>>> consuming the skb "naturally" by stealing it when it is the last action.
>>> "userspace" has an explicit check to consume the skb if it is the last
>>> action.  "emit_sample" should have the similar check.  It should likely
>>> be added at the point of action introduction instead of having a separate
>>> patch.
>>>
> 
> Unlinke "output", "recirc", "userspace", etc. with emit_sample the
> packet does not continue it's way through the datapath.

After "output" the packet leaves the datapath too, i.e. does not continue
it's way through OVS datapath.

> 
> It would be very confusing if OVS starts monitoring drops and adds a bunch
> of flows such as "actions:emit_sample()" and suddently it stops reporting such
> drops via standard kfree_skb_reason. Packets _are_ being dropped here,
> we are just observing them.

This might make sense from the higher logic in user space application, but
it doesn't from the datapath perspective.  And also, if the user adds the
'emit_sample' action for drop monitring, they already know where to find
packet samples, they don't need to use tools like dropwatch anymore.
This packet is not dropped from the datapath perspective, it is sampled.

> 
> And if we change emit_sample to trigger a drop if it's the last action,
> then "sample(50%, emit_sample()),2" will trigger a drop half of the times
> which is also terribly confusing.

If emit_sample is the last action, then skb should be consumed silently.
The same as for "output" and "userspace".

> 
> I think we should try to be clear and informative with what we
> _actually_ drop and not require the user that is just running
> "dropwatch" to understand the internals of the OVS module.

If someone is already using sampling to watch their packet drops, why would
they use dropwatch?

> 
> So if you don't want to accept the "observational" nature of sample(),
> the only other solution that does not bring even more confusion to OVS
> drops would be to have userspace add explicit drop actions. WDYT?
> 

These are not drops from the datapath perspective.  Users can add explicit
drop actions if they want to, but I'm really not sure why they would do that
if they are already capturing all these packets in psample, sFlow or IPFIX.

Best regards, Ilya Maximets.
Adrián Moreno June 18, 2024, 10:50 a.m. UTC | #6
On Tue, Jun 18, 2024 at 12:22:23PM GMT, Ilya Maximets wrote:
> On 6/18/24 09:00, Adrián Moreno wrote:
> > On Mon, Jun 17, 2024 at 02:10:37PM GMT, Ilya Maximets wrote:
> >> On 6/17/24 13:55, Ilya Maximets wrote:
> >>> On 6/3/24 20:56, Adrian Moreno wrote:
> >>>> The OVS_ACTION_ATTR_SAMPLE action is, in essence,
> >>>> observability-oriented.
> >>>>
> >>>> Apart from some corner case in which it's used a replacement of clone()
> >>>> for old kernels, it's really only used for sFlow, IPFIX and now,
> >>>> local emit_sample.
> >>>>
> >>>> With this in mind, it doesn't make much sense to report
> >>>> OVS_DROP_LAST_ACTION inside sample actions.
> >>>>
> >>>> For instance, if the flow:
> >>>>
> >>>>   actions:sample(..,emit_sample(..)),2
> >>>>
> >>>> triggers a OVS_DROP_LAST_ACTION skb drop event, it would be extremely
> >>>> confusing for users since the packet did reach its destination.
> >>>>
> >>>> This patch makes internal action execution silently consume the skb
> >>>> instead of notifying a drop for this case.
> >>>>
> >>>> Unfortunately, this patch does not remove all potential sources of
> >>>> confusion since, if the sample action itself is the last action, e.g:
> >>>>
> >>>>     actions:sample(..,emit_sample(..))
> >>>>
> >>>> we actually _should_ generate a OVS_DROP_LAST_ACTION event, but we aren't.
> >>>>
> >>>> Sadly, this case is difficult to solve without breaking the
> >>>> optimization by which the skb is not cloned on last sample actions.
> >>>> But, given explicit drop actions are now supported, OVS can just add one
> >>>> after the last sample() and rewrite the flow as:
> >>>>
> >>>>     actions:sample(..,emit_sample(..)),drop
> >>>>
> >>>> Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
> >>>> ---
> >>>>  net/openvswitch/actions.c | 13 +++++++++++--
> >>>>  1 file changed, 11 insertions(+), 2 deletions(-)
> >>>>
> >>>> diff --git a/net/openvswitch/actions.c b/net/openvswitch/actions.c
> >>>> index 33f6d93ba5e4..54fc1abcff95 100644
> >>>> --- a/net/openvswitch/actions.c
> >>>> +++ b/net/openvswitch/actions.c
> >>>> @@ -82,6 +82,15 @@ static struct action_fifo __percpu *action_fifos;
> >>>>  static struct action_flow_keys __percpu *flow_keys;
> >>>>  static DEFINE_PER_CPU(int, exec_actions_level);
> >>>>
> >>>> +static inline void ovs_drop_skb_last_action(struct sk_buff *skb)
> >>>> +{
> >>>> +	/* Do not emit packet drops inside sample(). */
> >>>> +	if (OVS_CB(skb)->probability)
> >>>> +		consume_skb(skb);
> >>>> +	else
> >>>> +		ovs_kfree_skb_reason(skb, OVS_DROP_LAST_ACTION);
> >>>> +}
> >>>> +
> >>>>  /* Make a clone of the 'key', using the pre-allocated percpu 'flow_keys'
> >>>>   * space. Return NULL if out of key spaces.
> >>>>   */
> >>>> @@ -1061,7 +1070,7 @@ static int sample(struct datapath *dp, struct sk_buff *skb,
> >>>>  	if ((arg->probability != U32_MAX) &&
> >>>>  	    (!arg->probability || get_random_u32() > arg->probability)) {
> >>>>  		if (last)
> >>>> -			ovs_kfree_skb_reason(skb, OVS_DROP_LAST_ACTION);
> >>>> +			ovs_drop_skb_last_action(skb);
> >>
> >> Always consuming the skb at this point makes sense, since having smaple()
> >> as a last action is a reasonable thing to have.  But this looks more like
> >> a fix for the original drop reason patch set.
> >>
> >
> > I don't think consuming the skb at this point makes sense. It was very
> > intentionally changed to a drop since a very common use-case for
> > sampling is drop-sampling, i.e: replacing an empty action list (that
> > triggers OVS_DROP_LAST_ACTION) with a sample(emit_sample()). Ideally,
> > that replacement should not have any effect on the number of
> > OVS_DROP_LAST_ACTION being reported as the packets are being treated in
> > the same way (only observed in one case).
> >
> >
> >>>>  		return 0;
> >>>>  	}
> >>>>
> >>>> @@ -1579,7 +1588,7 @@ static int do_execute_actions(struct datapath *dp, struct sk_buff *skb,
> >>>>  		}
> >>>>  	}
> >>>>
> >>>> -	ovs_kfree_skb_reason(skb, OVS_DROP_LAST_ACTION);
> >>>> +	ovs_drop_skb_last_action(skb);
> >>>
> >>> I don't think I agree with this one.  If we have a sample() action with
> >>> a lot of different actions inside and we reached the end while the last
> >>> action didn't consume the skb, then we should report that.  E.g.
> >>> "sample(emit_sample(),push_vlan(),set(eth())),2"  should report that the
> >>> cloned skb was dropped.  "sample(push_vlan(),emit_sample())" should not.
> >>>
> >
> > What is the use case for such action list? Having an action branch
> > executed randomly doesn't make sense to me if it's not some
> > observability thing (which IMHO should not trigger drops).
>
> It is exactly my point.  A list of actions that doesn't end is some sort
> of a terminal action (output, drop, etc) does not make a lot of sense and
> hence should be signaled as an unexpected drop, so users can re-check the
> pipeline in case they missed the terminal action somehow.
>
> >
> >>> The only actions that are actually consuming the skb are "output",
> >>> "userspace", "recirc" and now "emit_sample".  "output" and "recirc" are
> >>> consuming the skb "naturally" by stealing it when it is the last action.
> >>> "userspace" has an explicit check to consume the skb if it is the last
> >>> action.  "emit_sample" should have the similar check.  It should likely
> >>> be added at the point of action introduction instead of having a separate
> >>> patch.
> >>>
> >
> > Unlinke "output", "recirc", "userspace", etc. with emit_sample the
> > packet does not continue it's way through the datapath.
>
> After "output" the packet leaves the datapath too, i.e. does not continue
> it's way through OVS datapath.
>

I meant a broader concept of "datapath". The packet continues. For the
userspace action this is true only for the CONTROLLER ofp action but
since the datapath does not know which action it's implementing, we
cannot do better.

> >
> > It would be very confusing if OVS starts monitoring drops and adds a bunch
> > of flows such as "actions:emit_sample()" and suddently it stops reporting such
> > drops via standard kfree_skb_reason. Packets _are_ being dropped here,
> > we are just observing them.
>
> This might make sense from the higher logic in user space application, but
> it doesn't from the datapath perspective.  And also, if the user adds the
> 'emit_sample' action for drop monitring, they already know where to find
> packet samples, they don't need to use tools like dropwatch anymore.
> This packet is not dropped from the datapath perspective, it is sampled.
>
> >
> > And if we change emit_sample to trigger a drop if it's the last action,
> > then "sample(50%, emit_sample()),2" will trigger a drop half of the times
> > which is also terribly confusing.
>
> If emit_sample is the last action, then skb should be consumed silently.
> The same as for "output" and "userspace".
>
> >
> > I think we should try to be clear and informative with what we
> > _actually_ drop and not require the user that is just running
> > "dropwatch" to understand the internals of the OVS module.
>
> If someone is already using sampling to watch their packet drops, why would
> they use dropwatch?
>
> >
> > So if you don't want to accept the "observational" nature of sample(),
> > the only other solution that does not bring even more confusion to OVS
> > drops would be to have userspace add explicit drop actions. WDYT?
> >
>
> These are not drops from the datapath perspective.  Users can add explicit
> drop actions if they want to, but I'm really not sure why they would do that
> if they are already capturing all these packets in psample, sFlow or IPFIX.

Because there is not a single "user". Tools and systems can be built on
top of tracepoints and samples and they might not be coordinated between
them. Some observability application can be always enabled and doing
constant network monitoring or statistics while other lower level tools
can be run at certain moments to troubleshoot issues.

In order to run dropwatch in a node you don't need to have rights to
access the OpenFlow controller and ask it to change the OpenFlow rules
or else dropwatch simply will not show actual packet drops.

To me it seems obvious that drop sampling (via emit_sample) "includes"
drop reporting via emit_sample. In both cases you get the packet
headers, but in one case you also get OFP controller metadata. Now even
if there is a system that uses both, does it make sense to push to them
the responsibility of dealing with them being mutually exclusive?

I think this makes debugging OVS datapath unnecessarily obscure when we
know the packet is actually being dropped intentionally by OVS.

What's the problem with having OVS write the following?
    "sample(50%, emit_sample()),drop(0)"

Thanks,
Adrián
Ilya Maximets June 18, 2024, 3:44 p.m. UTC | #7
On 6/18/24 12:50, Adrián Moreno wrote:
> On Tue, Jun 18, 2024 at 12:22:23PM GMT, Ilya Maximets wrote:
>> On 6/18/24 09:00, Adrián Moreno wrote:
>>> On Mon, Jun 17, 2024 at 02:10:37PM GMT, Ilya Maximets wrote:
>>>> On 6/17/24 13:55, Ilya Maximets wrote:
>>>>> On 6/3/24 20:56, Adrian Moreno wrote:
>>>>>> The OVS_ACTION_ATTR_SAMPLE action is, in essence,
>>>>>> observability-oriented.
>>>>>>
>>>>>> Apart from some corner case in which it's used a replacement of clone()
>>>>>> for old kernels, it's really only used for sFlow, IPFIX and now,
>>>>>> local emit_sample.
>>>>>>
>>>>>> With this in mind, it doesn't make much sense to report
>>>>>> OVS_DROP_LAST_ACTION inside sample actions.
>>>>>>
>>>>>> For instance, if the flow:
>>>>>>
>>>>>>   actions:sample(..,emit_sample(..)),2
>>>>>>
>>>>>> triggers a OVS_DROP_LAST_ACTION skb drop event, it would be extremely
>>>>>> confusing for users since the packet did reach its destination.
>>>>>>
>>>>>> This patch makes internal action execution silently consume the skb
>>>>>> instead of notifying a drop for this case.
>>>>>>
>>>>>> Unfortunately, this patch does not remove all potential sources of
>>>>>> confusion since, if the sample action itself is the last action, e.g:
>>>>>>
>>>>>>     actions:sample(..,emit_sample(..))
>>>>>>
>>>>>> we actually _should_ generate a OVS_DROP_LAST_ACTION event, but we aren't.
>>>>>>
>>>>>> Sadly, this case is difficult to solve without breaking the
>>>>>> optimization by which the skb is not cloned on last sample actions.
>>>>>> But, given explicit drop actions are now supported, OVS can just add one
>>>>>> after the last sample() and rewrite the flow as:
>>>>>>
>>>>>>     actions:sample(..,emit_sample(..)),drop
>>>>>>
>>>>>> Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
>>>>>> ---
>>>>>>  net/openvswitch/actions.c | 13 +++++++++++--
>>>>>>  1 file changed, 11 insertions(+), 2 deletions(-)
>>>>>>
>>>>>> diff --git a/net/openvswitch/actions.c b/net/openvswitch/actions.c
>>>>>> index 33f6d93ba5e4..54fc1abcff95 100644
>>>>>> --- a/net/openvswitch/actions.c
>>>>>> +++ b/net/openvswitch/actions.c
>>>>>> @@ -82,6 +82,15 @@ static struct action_fifo __percpu *action_fifos;
>>>>>>  static struct action_flow_keys __percpu *flow_keys;
>>>>>>  static DEFINE_PER_CPU(int, exec_actions_level);
>>>>>>
>>>>>> +static inline void ovs_drop_skb_last_action(struct sk_buff *skb)
>>>>>> +{
>>>>>> +	/* Do not emit packet drops inside sample(). */
>>>>>> +	if (OVS_CB(skb)->probability)
>>>>>> +		consume_skb(skb);
>>>>>> +	else
>>>>>> +		ovs_kfree_skb_reason(skb, OVS_DROP_LAST_ACTION);
>>>>>> +}
>>>>>> +
>>>>>>  /* Make a clone of the 'key', using the pre-allocated percpu 'flow_keys'
>>>>>>   * space. Return NULL if out of key spaces.
>>>>>>   */
>>>>>> @@ -1061,7 +1070,7 @@ static int sample(struct datapath *dp, struct sk_buff *skb,
>>>>>>  	if ((arg->probability != U32_MAX) &&
>>>>>>  	    (!arg->probability || get_random_u32() > arg->probability)) {
>>>>>>  		if (last)
>>>>>> -			ovs_kfree_skb_reason(skb, OVS_DROP_LAST_ACTION);
>>>>>> +			ovs_drop_skb_last_action(skb);
>>>>
>>>> Always consuming the skb at this point makes sense, since having smaple()
>>>> as a last action is a reasonable thing to have.  But this looks more like
>>>> a fix for the original drop reason patch set.
>>>>
>>>
>>> I don't think consuming the skb at this point makes sense. It was very
>>> intentionally changed to a drop since a very common use-case for
>>> sampling is drop-sampling, i.e: replacing an empty action list (that
>>> triggers OVS_DROP_LAST_ACTION) with a sample(emit_sample()). Ideally,
>>> that replacement should not have any effect on the number of
>>> OVS_DROP_LAST_ACTION being reported as the packets are being treated in
>>> the same way (only observed in one case).
>>>
>>>
>>>>>>  		return 0;
>>>>>>  	}
>>>>>>
>>>>>> @@ -1579,7 +1588,7 @@ static int do_execute_actions(struct datapath *dp, struct sk_buff *skb,
>>>>>>  		}
>>>>>>  	}
>>>>>>
>>>>>> -	ovs_kfree_skb_reason(skb, OVS_DROP_LAST_ACTION);
>>>>>> +	ovs_drop_skb_last_action(skb);
>>>>>
>>>>> I don't think I agree with this one.  If we have a sample() action with
>>>>> a lot of different actions inside and we reached the end while the last
>>>>> action didn't consume the skb, then we should report that.  E.g.
>>>>> "sample(emit_sample(),push_vlan(),set(eth())),2"  should report that the
>>>>> cloned skb was dropped.  "sample(push_vlan(),emit_sample())" should not.
>>>>>
>>>
>>> What is the use case for such action list? Having an action branch
>>> executed randomly doesn't make sense to me if it's not some
>>> observability thing (which IMHO should not trigger drops).
>>
>> It is exactly my point.  A list of actions that doesn't end is some sort
>> of a terminal action (output, drop, etc) does not make a lot of sense and
>> hence should be signaled as an unexpected drop, so users can re-check the
>> pipeline in case they missed the terminal action somehow.
>>
>>>
>>>>> The only actions that are actually consuming the skb are "output",
>>>>> "userspace", "recirc" and now "emit_sample".  "output" and "recirc" are
>>>>> consuming the skb "naturally" by stealing it when it is the last action.
>>>>> "userspace" has an explicit check to consume the skb if it is the last
>>>>> action.  "emit_sample" should have the similar check.  It should likely
>>>>> be added at the point of action introduction instead of having a separate
>>>>> patch.
>>>>>
>>>
>>> Unlinke "output", "recirc", "userspace", etc. with emit_sample the
>>> packet does not continue it's way through the datapath.
>>
>> After "output" the packet leaves the datapath too, i.e. does not continue
>> it's way through OVS datapath.
>>
> 
> I meant a broader concept of "datapath". The packet continues. For the
> userspace action this is true only for the CONTROLLER ofp action but
> since the datapath does not know which action it's implementing, we
> cannot do better.

It's not only controller() action.  Packets can be brought to userspace
for various reason including just an explicit ask to execute some actions
in userspace.  In any case the packet sent to userspace kind of reached its
destination and it's not the "datapath drops the packet" situation.

> 
>>>
>>> It would be very confusing if OVS starts monitoring drops and adds a bunch
>>> of flows such as "actions:emit_sample()" and suddently it stops reporting such
>>> drops via standard kfree_skb_reason. Packets _are_ being dropped here,
>>> we are just observing them.
>>
>> This might make sense from the higher logic in user space application, but
>> it doesn't from the datapath perspective.  And also, if the user adds the
>> 'emit_sample' action for drop monitring, they already know where to find
>> packet samples, they don't need to use tools like dropwatch anymore.
>> This packet is not dropped from the datapath perspective, it is sampled.
>>
>>>
>>> And if we change emit_sample to trigger a drop if it's the last action,
>>> then "sample(50%, emit_sample()),2" will trigger a drop half of the times
>>> which is also terribly confusing.
>>
>> If emit_sample is the last action, then skb should be consumed silently.
>> The same as for "output" and "userspace".
>>
>>>
>>> I think we should try to be clear and informative with what we
>>> _actually_ drop and not require the user that is just running
>>> "dropwatch" to understand the internals of the OVS module.
>>
>> If someone is already using sampling to watch their packet drops, why would
>> they use dropwatch?
>>
>>>
>>> So if you don't want to accept the "observational" nature of sample(),
>>> the only other solution that does not bring even more confusion to OVS
>>> drops would be to have userspace add explicit drop actions. WDYT?
>>>
>>
>> These are not drops from the datapath perspective.  Users can add explicit
>> drop actions if they want to, but I'm really not sure why they would do that
>> if they are already capturing all these packets in psample, sFlow or IPFIX.
> 
> Because there is not a single "user". Tools and systems can be built on
> top of tracepoints and samples and they might not be coordinated between
> them. Some observability application can be always enabled and doing
> constant network monitoring or statistics while other lower level tools
> can be run at certain moments to troubleshoot issues.
> 
> In order to run dropwatch in a node you don't need to have rights to
> access the OpenFlow controller and ask it to change the OpenFlow rules
> or else dropwatch simply will not show actual packet drops.

The point is that these are not drops in this scenario.  The packet was
delivered to its destination and hence should not be reported as dropped.
In the observability use-case that you're describing even OpenFlow layer
in OVS doesn't know if these supposed to be treated as packet drops for
the user or if these are just samples with the sampling being the only
intended destination.  For OpenFlow and OVS userspace components these
two scenarios are indistinguishable.  Only the OpenFlow controller knows
that these rules were put in place because it was an ACL created by some
user or tool.  And since OVS in user space can't make such a distinction,
kernel can't make it either, and so shouldn't guess what the user two
levels of abstraction higher up meant.

> 
> To me it seems obvious that drop sampling (via emit_sample) "includes"
> drop reporting via emit_sample. In both cases you get the packet
> headers, but in one case you also get OFP controller metadata. Now even
> if there is a system that uses both, does it make sense to push to them
> the responsibility of dealing with them being mutually exclusive?
> 
> I think this makes debugging OVS datapath unnecessarily obscure when we
> know the packet is actually being dropped intentionally by OVS.

I don't think we know that we're in a drop sampling scenario.  We don't
have enough information even in OVS userspace to tell.

And having different behavior between "userspace" and "emit_sample" in
the kernel may cause even more confusion, because now two ways of sampling
packets will result in packets showing up in dropwatch in one case, but
not in the other.

> 
> What's the problem with having OVS write the following?
>     "sample(50%, emit_sample()),drop(0)"

It's a valid sequence of actions, but we shouldn't guess what the end
user meant by putting those actions into the kernel.  If we see such a
sequence in the kernel, then we should report an explicit drop.  If
there was only the "sample(50%, emit_sample())" then we should simply
consume the skb as it reached its destination in the psample.

For the question if OVS in user space should put explicit drop action
while preparing to emit sample, this doesn't sound reasonable for the
same reason - OVS in user space doesn't know what the intention was of
the user or tool that put the sampling action into OpenFlow pipeline.


I actually became more confused about what are we arguing about.
To recap:

                                     This patch     My proposal

1. emit_sample() is the last            consume        consume  
    inside the sample()

2. the end of the action list           consume        drop
    inside the sample()

3. emit_sample() is the last            drop           consume
    outside the sample()

4. the end of the action list           drop           drop
    outside the sample()

5. sample() is the last action          consume        consume
    and probability failed


I don't think cases 1 and 3 should differ, i.e. the behavior should
be the same regardless of emit_sample() being inside or outside of
the sample().  As a side point, OVS in user space will omit the 100%
rate sample() action and will just list inner actions instead.  This
means that 100% probability sampling will generate drops and 99% will
not.  Doesn't sound right.

Case 2 should likely never happen, but I'd like to see a drop reported
if that ever happens, because it is not a meaningful list of actions.

Best regards, Ilya Maximets.
Adrián Moreno June 19, 2024, 6:35 a.m. UTC | #8
On Tue, Jun 18, 2024 at 05:44:05PM GMT, Ilya Maximets wrote:
> On 6/18/24 12:50, Adrián Moreno wrote:
> > On Tue, Jun 18, 2024 at 12:22:23PM GMT, Ilya Maximets wrote:
> >> On 6/18/24 09:00, Adrián Moreno wrote:
> >>> On Mon, Jun 17, 2024 at 02:10:37PM GMT, Ilya Maximets wrote:
> >>>> On 6/17/24 13:55, Ilya Maximets wrote:
> >>>>> On 6/3/24 20:56, Adrian Moreno wrote:
> >>>>>> The OVS_ACTION_ATTR_SAMPLE action is, in essence,
> >>>>>> observability-oriented.
> >>>>>>
> >>>>>> Apart from some corner case in which it's used a replacement of clone()
> >>>>>> for old kernels, it's really only used for sFlow, IPFIX and now,
> >>>>>> local emit_sample.
> >>>>>>
> >>>>>> With this in mind, it doesn't make much sense to report
> >>>>>> OVS_DROP_LAST_ACTION inside sample actions.
> >>>>>>
> >>>>>> For instance, if the flow:
> >>>>>>
> >>>>>>   actions:sample(..,emit_sample(..)),2
> >>>>>>
> >>>>>> triggers a OVS_DROP_LAST_ACTION skb drop event, it would be extremely
> >>>>>> confusing for users since the packet did reach its destination.
> >>>>>>
> >>>>>> This patch makes internal action execution silently consume the skb
> >>>>>> instead of notifying a drop for this case.
> >>>>>>
> >>>>>> Unfortunately, this patch does not remove all potential sources of
> >>>>>> confusion since, if the sample action itself is the last action, e.g:
> >>>>>>
> >>>>>>     actions:sample(..,emit_sample(..))
> >>>>>>
> >>>>>> we actually _should_ generate a OVS_DROP_LAST_ACTION event, but we aren't.
> >>>>>>
> >>>>>> Sadly, this case is difficult to solve without breaking the
> >>>>>> optimization by which the skb is not cloned on last sample actions.
> >>>>>> But, given explicit drop actions are now supported, OVS can just add one
> >>>>>> after the last sample() and rewrite the flow as:
> >>>>>>
> >>>>>>     actions:sample(..,emit_sample(..)),drop
> >>>>>>
> >>>>>> Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
> >>>>>> ---
> >>>>>>  net/openvswitch/actions.c | 13 +++++++++++--
> >>>>>>  1 file changed, 11 insertions(+), 2 deletions(-)
> >>>>>>
> >>>>>> diff --git a/net/openvswitch/actions.c b/net/openvswitch/actions.c
> >>>>>> index 33f6d93ba5e4..54fc1abcff95 100644
> >>>>>> --- a/net/openvswitch/actions.c
> >>>>>> +++ b/net/openvswitch/actions.c
> >>>>>> @@ -82,6 +82,15 @@ static struct action_fifo __percpu *action_fifos;
> >>>>>>  static struct action_flow_keys __percpu *flow_keys;
> >>>>>>  static DEFINE_PER_CPU(int, exec_actions_level);
> >>>>>>
> >>>>>> +static inline void ovs_drop_skb_last_action(struct sk_buff *skb)
> >>>>>> +{
> >>>>>> +	/* Do not emit packet drops inside sample(). */
> >>>>>> +	if (OVS_CB(skb)->probability)
> >>>>>> +		consume_skb(skb);
> >>>>>> +	else
> >>>>>> +		ovs_kfree_skb_reason(skb, OVS_DROP_LAST_ACTION);
> >>>>>> +}
> >>>>>> +
> >>>>>>  /* Make a clone of the 'key', using the pre-allocated percpu 'flow_keys'
> >>>>>>   * space. Return NULL if out of key spaces.
> >>>>>>   */
> >>>>>> @@ -1061,7 +1070,7 @@ static int sample(struct datapath *dp, struct sk_buff *skb,
> >>>>>>  	if ((arg->probability != U32_MAX) &&
> >>>>>>  	    (!arg->probability || get_random_u32() > arg->probability)) {
> >>>>>>  		if (last)
> >>>>>> -			ovs_kfree_skb_reason(skb, OVS_DROP_LAST_ACTION);
> >>>>>> +			ovs_drop_skb_last_action(skb);
> >>>>
> >>>> Always consuming the skb at this point makes sense, since having smaple()
> >>>> as a last action is a reasonable thing to have.  But this looks more like
> >>>> a fix for the original drop reason patch set.
> >>>>
> >>>
> >>> I don't think consuming the skb at this point makes sense. It was very
> >>> intentionally changed to a drop since a very common use-case for
> >>> sampling is drop-sampling, i.e: replacing an empty action list (that
> >>> triggers OVS_DROP_LAST_ACTION) with a sample(emit_sample()). Ideally,
> >>> that replacement should not have any effect on the number of
> >>> OVS_DROP_LAST_ACTION being reported as the packets are being treated in
> >>> the same way (only observed in one case).
> >>>
> >>>
> >>>>>>  		return 0;
> >>>>>>  	}
> >>>>>>
> >>>>>> @@ -1579,7 +1588,7 @@ static int do_execute_actions(struct datapath *dp, struct sk_buff *skb,
> >>>>>>  		}
> >>>>>>  	}
> >>>>>>
> >>>>>> -	ovs_kfree_skb_reason(skb, OVS_DROP_LAST_ACTION);
> >>>>>> +	ovs_drop_skb_last_action(skb);
> >>>>>
> >>>>> I don't think I agree with this one.  If we have a sample() action with
> >>>>> a lot of different actions inside and we reached the end while the last
> >>>>> action didn't consume the skb, then we should report that.  E.g.
> >>>>> "sample(emit_sample(),push_vlan(),set(eth())),2"  should report that the
> >>>>> cloned skb was dropped.  "sample(push_vlan(),emit_sample())" should not.
> >>>>>
> >>>
> >>> What is the use case for such action list? Having an action branch
> >>> executed randomly doesn't make sense to me if it's not some
> >>> observability thing (which IMHO should not trigger drops).
> >>
> >> It is exactly my point.  A list of actions that doesn't end is some sort
> >> of a terminal action (output, drop, etc) does not make a lot of sense and
> >> hence should be signaled as an unexpected drop, so users can re-check the
> >> pipeline in case they missed the terminal action somehow.
> >>
> >>>
> >>>>> The only actions that are actually consuming the skb are "output",
> >>>>> "userspace", "recirc" and now "emit_sample".  "output" and "recirc" are
> >>>>> consuming the skb "naturally" by stealing it when it is the last action.
> >>>>> "userspace" has an explicit check to consume the skb if it is the last
> >>>>> action.  "emit_sample" should have the similar check.  It should likely
> >>>>> be added at the point of action introduction instead of having a separate
> >>>>> patch.
> >>>>>
> >>>
> >>> Unlinke "output", "recirc", "userspace", etc. with emit_sample the
> >>> packet does not continue it's way through the datapath.
> >>
> >> After "output" the packet leaves the datapath too, i.e. does not continue
> >> it's way through OVS datapath.
> >>
> >
> > I meant a broader concept of "datapath". The packet continues. For the
> > userspace action this is true only for the CONTROLLER ofp action but
> > since the datapath does not know which action it's implementing, we
> > cannot do better.
>
> It's not only controller() action.  Packets can be brought to userspace
> for various reason including just an explicit ask to execute some actions
> in userspace.  In any case the packet sent to userspace kind of reached its
> destination and it's not the "datapath drops the packet" situation.
>
> >
> >>>
> >>> It would be very confusing if OVS starts monitoring drops and adds a bunch
> >>> of flows such as "actions:emit_sample()" and suddently it stops reporting such
> >>> drops via standard kfree_skb_reason. Packets _are_ being dropped here,
> >>> we are just observing them.
> >>
> >> This might make sense from the higher logic in user space application, but
> >> it doesn't from the datapath perspective.  And also, if the user adds the
> >> 'emit_sample' action for drop monitring, they already know where to find
> >> packet samples, they don't need to use tools like dropwatch anymore.
> >> This packet is not dropped from the datapath perspective, it is sampled.
> >>
> >>>
> >>> And if we change emit_sample to trigger a drop if it's the last action,
> >>> then "sample(50%, emit_sample()),2" will trigger a drop half of the times
> >>> which is also terribly confusing.
> >>
> >> If emit_sample is the last action, then skb should be consumed silently.
> >> The same as for "output" and "userspace".
> >>
> >>>
> >>> I think we should try to be clear and informative with what we
> >>> _actually_ drop and not require the user that is just running
> >>> "dropwatch" to understand the internals of the OVS module.
> >>
> >> If someone is already using sampling to watch their packet drops, why would
> >> they use dropwatch?
> >>
> >>>
> >>> So if you don't want to accept the "observational" nature of sample(),
> >>> the only other solution that does not bring even more confusion to OVS
> >>> drops would be to have userspace add explicit drop actions. WDYT?
> >>>
> >>
> >> These are not drops from the datapath perspective.  Users can add explicit
> >> drop actions if they want to, but I'm really not sure why they would do that
> >> if they are already capturing all these packets in psample, sFlow or IPFIX.
> >
> > Because there is not a single "user". Tools and systems can be built on
> > top of tracepoints and samples and they might not be coordinated between
> > them. Some observability application can be always enabled and doing
> > constant network monitoring or statistics while other lower level tools
> > can be run at certain moments to troubleshoot issues.
> >
> > In order to run dropwatch in a node you don't need to have rights to
> > access the OpenFlow controller and ask it to change the OpenFlow rules
> > or else dropwatch simply will not show actual packet drops.
>
> The point is that these are not drops in this scenario.  The packet was
> delivered to its destination and hence should not be reported as dropped.
> In the observability use-case that you're describing even OpenFlow layer
> in OVS doesn't know if these supposed to be treated as packet drops for
> the user or if these are just samples with the sampling being the only
> intended destination.  For OpenFlow and OVS userspace components these
> two scenarios are indistinguishable.  Only the OpenFlow controller knows
> that these rules were put in place because it was an ACL created by some
> user or tool.  And since OVS in user space can't make such a distinction,
> kernel can't make it either, and so shouldn't guess what the user two
> levels of abstraction higher up meant.
>
> >
> > To me it seems obvious that drop sampling (via emit_sample) "includes"
> > drop reporting via emit_sample. In both cases you get the packet
> > headers, but in one case you also get OFP controller metadata. Now even
> > if there is a system that uses both, does it make sense to push to them
> > the responsibility of dealing with them being mutually exclusive?
> >
> > I think this makes debugging OVS datapath unnecessarily obscure when we
> > know the packet is actually being dropped intentionally by OVS.
>
> I don't think we know that we're in a drop sampling scenario.  We don't
> have enough information even in OVS userspace to tell.
>
> And having different behavior between "userspace" and "emit_sample" in
> the kernel may cause even more confusion, because now two ways of sampling
> packets will result in packets showing up in dropwatch in one case, but
> not in the other.
>
> >
> > What's the problem with having OVS write the following?
> >     "sample(50%, emit_sample()),drop(0)"
>
> It's a valid sequence of actions, but we shouldn't guess what the end
> user meant by putting those actions into the kernel.  If we see such a
> sequence in the kernel, then we should report an explicit drop.  If
> there was only the "sample(50%, emit_sample())" then we should simply
> consume the skb as it reached its destination in the psample.
>
> For the question if OVS in user space should put explicit drop action
> while preparing to emit sample, this doesn't sound reasonable for the
> same reason - OVS in user space doesn't know what the intention was of
> the user or tool that put the sampling action into OpenFlow pipeline.
>

I don't see it that way. The spec says that packets whose action sets
(the result of classification) have no output action and no group action
must be dropped. Even if OFP sample action is an extension, I don't see
it invalidating that semantics.
So, IMHO, OVS does know that a flow that is just sampled is a drop.

> I actually became more confused about what are we arguing about.
> To recap:
>
>                                      This patch     My proposal
>
> 1. emit_sample() is the last            consume        consume
>     inside the sample()
>
> 2. the end of the action list           consume        drop
>     inside the sample()
>
> 3. emit_sample() is the last            drop           consume
>     outside the sample()
>
> 4. the end of the action list           drop           drop
>     outside the sample()
>
> 5. sample() is the last action          consume        consume
>     and probability failed
>
>
> I don't think cases 1 and 3 should differ, i.e. the behavior should
> be the same regardless of emit_sample() being inside or outside of
> the sample().  As a side point, OVS in user space will omit the 100%
> rate sample() action and will just list inner actions instead.  This
> means that 100% probability sampling will generate drops and 99% will
> not.  Doesn't sound right.
>

That's what I was refering to in the commit message, we still OVS to
write:
    actions:sample(..,emit_sample(..)),drop

> Case 2 should likely never happen, but I'd like to see a drop reported
> if that ever happens, because it is not a meaningful list of actions.
>
> Best regards, Ilya Maximets.
>

I think we could drop this patch if we agree that OVS could write
explicit drops when it knows the packet is being dropped and sampled
(the action only has OFP sample actions).

The drop could be placed inside the odp sample action to avoid
breaking the clone optimization:
    actions:sample(50%, actions(emit_sample(),drop)))

or outside if the sample itself is optimized out:
    actions:emit_sample(),drop

IIUC, if we don't do that, we are saying that sampling is incompatible
with decent drop reporting via kfree_skb infrastructure used by tools
like dropwatch or retis (among many others). And I think that is
unnecessarily and deliberately making OVS datapath more difficult to
troubleshoot.

Thanks,
Adrián
Ilya Maximets June 19, 2024, 6:21 p.m. UTC | #9
On 6/19/24 08:35, Adrián Moreno wrote:
> On Tue, Jun 18, 2024 at 05:44:05PM GMT, Ilya Maximets wrote:
>> On 6/18/24 12:50, Adrián Moreno wrote:
>>> On Tue, Jun 18, 2024 at 12:22:23PM GMT, Ilya Maximets wrote:
>>>> On 6/18/24 09:00, Adrián Moreno wrote:
>>>>> On Mon, Jun 17, 2024 at 02:10:37PM GMT, Ilya Maximets wrote:
>>>>>> On 6/17/24 13:55, Ilya Maximets wrote:
>>>>>>> On 6/3/24 20:56, Adrian Moreno wrote:
>>>>>>>> The OVS_ACTION_ATTR_SAMPLE action is, in essence,
>>>>>>>> observability-oriented.
>>>>>>>>
>>>>>>>> Apart from some corner case in which it's used a replacement of clone()
>>>>>>>> for old kernels, it's really only used for sFlow, IPFIX and now,
>>>>>>>> local emit_sample.
>>>>>>>>
>>>>>>>> With this in mind, it doesn't make much sense to report
>>>>>>>> OVS_DROP_LAST_ACTION inside sample actions.
>>>>>>>>
>>>>>>>> For instance, if the flow:
>>>>>>>>
>>>>>>>>   actions:sample(..,emit_sample(..)),2
>>>>>>>>
>>>>>>>> triggers a OVS_DROP_LAST_ACTION skb drop event, it would be extremely
>>>>>>>> confusing for users since the packet did reach its destination.
>>>>>>>>
>>>>>>>> This patch makes internal action execution silently consume the skb
>>>>>>>> instead of notifying a drop for this case.
>>>>>>>>
>>>>>>>> Unfortunately, this patch does not remove all potential sources of
>>>>>>>> confusion since, if the sample action itself is the last action, e.g:
>>>>>>>>
>>>>>>>>     actions:sample(..,emit_sample(..))
>>>>>>>>
>>>>>>>> we actually _should_ generate a OVS_DROP_LAST_ACTION event, but we aren't.
>>>>>>>>
>>>>>>>> Sadly, this case is difficult to solve without breaking the
>>>>>>>> optimization by which the skb is not cloned on last sample actions.
>>>>>>>> But, given explicit drop actions are now supported, OVS can just add one
>>>>>>>> after the last sample() and rewrite the flow as:
>>>>>>>>
>>>>>>>>     actions:sample(..,emit_sample(..)),drop
>>>>>>>>
>>>>>>>> Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
>>>>>>>> ---
>>>>>>>>  net/openvswitch/actions.c | 13 +++++++++++--
>>>>>>>>  1 file changed, 11 insertions(+), 2 deletions(-)
>>>>>>>>
>>>>>>>> diff --git a/net/openvswitch/actions.c b/net/openvswitch/actions.c
>>>>>>>> index 33f6d93ba5e4..54fc1abcff95 100644
>>>>>>>> --- a/net/openvswitch/actions.c
>>>>>>>> +++ b/net/openvswitch/actions.c
>>>>>>>> @@ -82,6 +82,15 @@ static struct action_fifo __percpu *action_fifos;
>>>>>>>>  static struct action_flow_keys __percpu *flow_keys;
>>>>>>>>  static DEFINE_PER_CPU(int, exec_actions_level);
>>>>>>>>
>>>>>>>> +static inline void ovs_drop_skb_last_action(struct sk_buff *skb)
>>>>>>>> +{
>>>>>>>> +	/* Do not emit packet drops inside sample(). */
>>>>>>>> +	if (OVS_CB(skb)->probability)
>>>>>>>> +		consume_skb(skb);
>>>>>>>> +	else
>>>>>>>> +		ovs_kfree_skb_reason(skb, OVS_DROP_LAST_ACTION);
>>>>>>>> +}
>>>>>>>> +
>>>>>>>>  /* Make a clone of the 'key', using the pre-allocated percpu 'flow_keys'
>>>>>>>>   * space. Return NULL if out of key spaces.
>>>>>>>>   */
>>>>>>>> @@ -1061,7 +1070,7 @@ static int sample(struct datapath *dp, struct sk_buff *skb,
>>>>>>>>  	if ((arg->probability != U32_MAX) &&
>>>>>>>>  	    (!arg->probability || get_random_u32() > arg->probability)) {
>>>>>>>>  		if (last)
>>>>>>>> -			ovs_kfree_skb_reason(skb, OVS_DROP_LAST_ACTION);
>>>>>>>> +			ovs_drop_skb_last_action(skb);
>>>>>>
>>>>>> Always consuming the skb at this point makes sense, since having smaple()
>>>>>> as a last action is a reasonable thing to have.  But this looks more like
>>>>>> a fix for the original drop reason patch set.
>>>>>>
>>>>>
>>>>> I don't think consuming the skb at this point makes sense. It was very
>>>>> intentionally changed to a drop since a very common use-case for
>>>>> sampling is drop-sampling, i.e: replacing an empty action list (that
>>>>> triggers OVS_DROP_LAST_ACTION) with a sample(emit_sample()). Ideally,
>>>>> that replacement should not have any effect on the number of
>>>>> OVS_DROP_LAST_ACTION being reported as the packets are being treated in
>>>>> the same way (only observed in one case).
>>>>>
>>>>>
>>>>>>>>  		return 0;
>>>>>>>>  	}
>>>>>>>>
>>>>>>>> @@ -1579,7 +1588,7 @@ static int do_execute_actions(struct datapath *dp, struct sk_buff *skb,
>>>>>>>>  		}
>>>>>>>>  	}
>>>>>>>>
>>>>>>>> -	ovs_kfree_skb_reason(skb, OVS_DROP_LAST_ACTION);
>>>>>>>> +	ovs_drop_skb_last_action(skb);
>>>>>>>
>>>>>>> I don't think I agree with this one.  If we have a sample() action with
>>>>>>> a lot of different actions inside and we reached the end while the last
>>>>>>> action didn't consume the skb, then we should report that.  E.g.
>>>>>>> "sample(emit_sample(),push_vlan(),set(eth())),2"  should report that the
>>>>>>> cloned skb was dropped.  "sample(push_vlan(),emit_sample())" should not.
>>>>>>>
>>>>>
>>>>> What is the use case for such action list? Having an action branch
>>>>> executed randomly doesn't make sense to me if it's not some
>>>>> observability thing (which IMHO should not trigger drops).
>>>>
>>>> It is exactly my point.  A list of actions that doesn't end is some sort
>>>> of a terminal action (output, drop, etc) does not make a lot of sense and
>>>> hence should be signaled as an unexpected drop, so users can re-check the
>>>> pipeline in case they missed the terminal action somehow.
>>>>
>>>>>
>>>>>>> The only actions that are actually consuming the skb are "output",
>>>>>>> "userspace", "recirc" and now "emit_sample".  "output" and "recirc" are
>>>>>>> consuming the skb "naturally" by stealing it when it is the last action.
>>>>>>> "userspace" has an explicit check to consume the skb if it is the last
>>>>>>> action.  "emit_sample" should have the similar check.  It should likely
>>>>>>> be added at the point of action introduction instead of having a separate
>>>>>>> patch.
>>>>>>>
>>>>>
>>>>> Unlinke "output", "recirc", "userspace", etc. with emit_sample the
>>>>> packet does not continue it's way through the datapath.
>>>>
>>>> After "output" the packet leaves the datapath too, i.e. does not continue
>>>> it's way through OVS datapath.
>>>>
>>>
>>> I meant a broader concept of "datapath". The packet continues. For the
>>> userspace action this is true only for the CONTROLLER ofp action but
>>> since the datapath does not know which action it's implementing, we
>>> cannot do better.
>>
>> It's not only controller() action.  Packets can be brought to userspace
>> for various reason including just an explicit ask to execute some actions
>> in userspace.  In any case the packet sent to userspace kind of reached its
>> destination and it's not the "datapath drops the packet" situation.
>>
>>>
>>>>>
>>>>> It would be very confusing if OVS starts monitoring drops and adds a bunch
>>>>> of flows such as "actions:emit_sample()" and suddently it stops reporting such
>>>>> drops via standard kfree_skb_reason. Packets _are_ being dropped here,
>>>>> we are just observing them.
>>>>
>>>> This might make sense from the higher logic in user space application, but
>>>> it doesn't from the datapath perspective.  And also, if the user adds the
>>>> 'emit_sample' action for drop monitring, they already know where to find
>>>> packet samples, they don't need to use tools like dropwatch anymore.
>>>> This packet is not dropped from the datapath perspective, it is sampled.
>>>>
>>>>>
>>>>> And if we change emit_sample to trigger a drop if it's the last action,
>>>>> then "sample(50%, emit_sample()),2" will trigger a drop half of the times
>>>>> which is also terribly confusing.
>>>>
>>>> If emit_sample is the last action, then skb should be consumed silently.
>>>> The same as for "output" and "userspace".
>>>>
>>>>>
>>>>> I think we should try to be clear and informative with what we
>>>>> _actually_ drop and not require the user that is just running
>>>>> "dropwatch" to understand the internals of the OVS module.
>>>>
>>>> If someone is already using sampling to watch their packet drops, why would
>>>> they use dropwatch?
>>>>
>>>>>
>>>>> So if you don't want to accept the "observational" nature of sample(),
>>>>> the only other solution that does not bring even more confusion to OVS
>>>>> drops would be to have userspace add explicit drop actions. WDYT?
>>>>>
>>>>
>>>> These are not drops from the datapath perspective.  Users can add explicit
>>>> drop actions if they want to, but I'm really not sure why they would do that
>>>> if they are already capturing all these packets in psample, sFlow or IPFIX.
>>>
>>> Because there is not a single "user". Tools and systems can be built on
>>> top of tracepoints and samples and they might not be coordinated between
>>> them. Some observability application can be always enabled and doing
>>> constant network monitoring or statistics while other lower level tools
>>> can be run at certain moments to troubleshoot issues.
>>>
>>> In order to run dropwatch in a node you don't need to have rights to
>>> access the OpenFlow controller and ask it to change the OpenFlow rules
>>> or else dropwatch simply will not show actual packet drops.
>>
>> The point is that these are not drops in this scenario.  The packet was
>> delivered to its destination and hence should not be reported as dropped.
>> In the observability use-case that you're describing even OpenFlow layer
>> in OVS doesn't know if these supposed to be treated as packet drops for
>> the user or if these are just samples with the sampling being the only
>> intended destination.  For OpenFlow and OVS userspace components these
>> two scenarios are indistinguishable.  Only the OpenFlow controller knows
>> that these rules were put in place because it was an ACL created by some
>> user or tool.  And since OVS in user space can't make such a distinction,
>> kernel can't make it either, and so shouldn't guess what the user two
>> levels of abstraction higher up meant.
>>
>>>
>>> To me it seems obvious that drop sampling (via emit_sample) "includes"
>>> drop reporting via emit_sample. In both cases you get the packet
>>> headers, but in one case you also get OFP controller metadata. Now even
>>> if there is a system that uses both, does it make sense to push to them
>>> the responsibility of dealing with them being mutually exclusive?
>>>
>>> I think this makes debugging OVS datapath unnecessarily obscure when we
>>> know the packet is actually being dropped intentionally by OVS.
>>
>> I don't think we know that we're in a drop sampling scenario.  We don't
>> have enough information even in OVS userspace to tell.
>>
>> And having different behavior between "userspace" and "emit_sample" in
>> the kernel may cause even more confusion, because now two ways of sampling
>> packets will result in packets showing up in dropwatch in one case, but
>> not in the other.
>>
>>>
>>> What's the problem with having OVS write the following?
>>>     "sample(50%, emit_sample()),drop(0)"
>>
>> It's a valid sequence of actions, but we shouldn't guess what the end
>> user meant by putting those actions into the kernel.  If we see such a
>> sequence in the kernel, then we should report an explicit drop.  If
>> there was only the "sample(50%, emit_sample())" then we should simply
>> consume the skb as it reached its destination in the psample.
>>
>> For the question if OVS in user space should put explicit drop action
>> while preparing to emit sample, this doesn't sound reasonable for the
>> same reason - OVS in user space doesn't know what the intention was of
>> the user or tool that put the sampling action into OpenFlow pipeline.
>>
> 
> I don't see it that way. The spec says that packets whose action sets
> (the result of classification) have no output action and no group action
> must be dropped. Even if OFP sample action is an extension, I don't see
> it invalidating that semantics.
> So, IMHO, OVS does know that a flow that is just sampled is a drop.

This applies to "action sets", but most users are actually using "action
lists" supplied via "Apply-actions" OF instruction and the action sets
always remain empty.  So, from the OF perspective, strictly speaking, we
are dropping every single packet.  So, this is not a good analogy.

> 
>> I actually became more confused about what are we arguing about.
>> To recap:
>>
>>                                      This patch     My proposal
>>
>> 1. emit_sample() is the last            consume        consume
>>     inside the sample()
>>
>> 2. the end of the action list           consume        drop
>>     inside the sample()
>>
>> 3. emit_sample() is the last            drop           consume
>>     outside the sample()
>>
>> 4. the end of the action list           drop           drop
>>     outside the sample()
>>
>> 5. sample() is the last action          consume        consume
>>     and probability failed
>>
>>
>> I don't think cases 1 and 3 should differ, i.e. the behavior should
>> be the same regardless of emit_sample() being inside or outside of
>> the sample().  As a side point, OVS in user space will omit the 100%
>> rate sample() action and will just list inner actions instead.  This
>> means that 100% probability sampling will generate drops and 99% will
>> not.  Doesn't sound right.
>>
> 
> That's what I was refering to in the commit message, we still OVS to
> write:
>     actions:sample(..,emit_sample(..)),drop
> 
>> Case 2 should likely never happen, but I'd like to see a drop reported
>> if that ever happens, because it is not a meaningful list of actions.
>>
>> Best regards, Ilya Maximets.
>>
> 
> I think we could drop this patch if we agree that OVS could write
> explicit drops when it knows the packet is being dropped and sampled
> (the action only has OFP sample actions).
> 
> The drop could be placed inside the odp sample action to avoid
> breaking the clone optimization:
>     actions:sample(50%, actions(emit_sample(),drop)))
> 
> or outside if the sample itself is optimized out:
>     actions:emit_sample(),drop
> 
> IIUC, if we don't do that, we are saying that sampling is incompatible
> with decent drop reporting via kfree_skb infrastructure used by tools
> like dropwatch or retis (among many others). And I think that is
> unnecessarily and deliberately making OVS datapath more difficult to
> troubleshoot.

This makes some sense, so let's ensure that semantics is consistent
within the kernel and discuss how to make the tools happy from the
user space perspective.

But we shouldn't simply drop this patch, we still need to consume the
skb after emit_sample() when it is the last action.  The same as we
do for the userpsace() action.  Though it should be done at the point
of the action introduction.  Having both actions consistent will allow
us to solve the observability problem for both in the same way by
adding explicit drop actions from user space.

On a side note:
I wonder if probability-induced drop needs a separate reason... i.e.
it could have been consumed by emit_smaple()/userspace() but wasn't.

Best regards, Ilya Maximets.
Adrián Moreno June 19, 2024, 8:40 p.m. UTC | #10
On Wed, Jun 19, 2024 at 08:21:02PM GMT, Ilya Maximets wrote:
> On 6/19/24 08:35, Adrián Moreno wrote:
> > On Tue, Jun 18, 2024 at 05:44:05PM GMT, Ilya Maximets wrote:
> >> On 6/18/24 12:50, Adrián Moreno wrote:
> >>> On Tue, Jun 18, 2024 at 12:22:23PM GMT, Ilya Maximets wrote:
> >>>> On 6/18/24 09:00, Adrián Moreno wrote:
> >>>>> On Mon, Jun 17, 2024 at 02:10:37PM GMT, Ilya Maximets wrote:
> >>>>>> On 6/17/24 13:55, Ilya Maximets wrote:
> >>>>>>> On 6/3/24 20:56, Adrian Moreno wrote:
> >>>>>>>> The OVS_ACTION_ATTR_SAMPLE action is, in essence,
> >>>>>>>> observability-oriented.
> >>>>>>>>
> >>>>>>>> Apart from some corner case in which it's used a replacement of clone()
> >>>>>>>> for old kernels, it's really only used for sFlow, IPFIX and now,
> >>>>>>>> local emit_sample.
> >>>>>>>>
> >>>>>>>> With this in mind, it doesn't make much sense to report
> >>>>>>>> OVS_DROP_LAST_ACTION inside sample actions.
> >>>>>>>>
> >>>>>>>> For instance, if the flow:
> >>>>>>>>
> >>>>>>>>   actions:sample(..,emit_sample(..)),2
> >>>>>>>>
> >>>>>>>> triggers a OVS_DROP_LAST_ACTION skb drop event, it would be extremely
> >>>>>>>> confusing for users since the packet did reach its destination.
> >>>>>>>>
> >>>>>>>> This patch makes internal action execution silently consume the skb
> >>>>>>>> instead of notifying a drop for this case.
> >>>>>>>>
> >>>>>>>> Unfortunately, this patch does not remove all potential sources of
> >>>>>>>> confusion since, if the sample action itself is the last action, e.g:
> >>>>>>>>
> >>>>>>>>     actions:sample(..,emit_sample(..))
> >>>>>>>>
> >>>>>>>> we actually _should_ generate a OVS_DROP_LAST_ACTION event, but we aren't.
> >>>>>>>>
> >>>>>>>> Sadly, this case is difficult to solve without breaking the
> >>>>>>>> optimization by which the skb is not cloned on last sample actions.
> >>>>>>>> But, given explicit drop actions are now supported, OVS can just add one
> >>>>>>>> after the last sample() and rewrite the flow as:
> >>>>>>>>
> >>>>>>>>     actions:sample(..,emit_sample(..)),drop
> >>>>>>>>
> >>>>>>>> Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
> >>>>>>>> ---
> >>>>>>>>  net/openvswitch/actions.c | 13 +++++++++++--
> >>>>>>>>  1 file changed, 11 insertions(+), 2 deletions(-)
> >>>>>>>>
> >>>>>>>> diff --git a/net/openvswitch/actions.c b/net/openvswitch/actions.c
> >>>>>>>> index 33f6d93ba5e4..54fc1abcff95 100644
> >>>>>>>> --- a/net/openvswitch/actions.c
> >>>>>>>> +++ b/net/openvswitch/actions.c
> >>>>>>>> @@ -82,6 +82,15 @@ static struct action_fifo __percpu *action_fifos;
> >>>>>>>>  static struct action_flow_keys __percpu *flow_keys;
> >>>>>>>>  static DEFINE_PER_CPU(int, exec_actions_level);
> >>>>>>>>
> >>>>>>>> +static inline void ovs_drop_skb_last_action(struct sk_buff *skb)
> >>>>>>>> +{
> >>>>>>>> +	/* Do not emit packet drops inside sample(). */
> >>>>>>>> +	if (OVS_CB(skb)->probability)
> >>>>>>>> +		consume_skb(skb);
> >>>>>>>> +	else
> >>>>>>>> +		ovs_kfree_skb_reason(skb, OVS_DROP_LAST_ACTION);
> >>>>>>>> +}
> >>>>>>>> +
> >>>>>>>>  /* Make a clone of the 'key', using the pre-allocated percpu 'flow_keys'
> >>>>>>>>   * space. Return NULL if out of key spaces.
> >>>>>>>>   */
> >>>>>>>> @@ -1061,7 +1070,7 @@ static int sample(struct datapath *dp, struct sk_buff *skb,
> >>>>>>>>  	if ((arg->probability != U32_MAX) &&
> >>>>>>>>  	    (!arg->probability || get_random_u32() > arg->probability)) {
> >>>>>>>>  		if (last)
> >>>>>>>> -			ovs_kfree_skb_reason(skb, OVS_DROP_LAST_ACTION);
> >>>>>>>> +			ovs_drop_skb_last_action(skb);
> >>>>>>
> >>>>>> Always consuming the skb at this point makes sense, since having smaple()
> >>>>>> as a last action is a reasonable thing to have.  But this looks more like
> >>>>>> a fix for the original drop reason patch set.
> >>>>>>
> >>>>>
> >>>>> I don't think consuming the skb at this point makes sense. It was very
> >>>>> intentionally changed to a drop since a very common use-case for
> >>>>> sampling is drop-sampling, i.e: replacing an empty action list (that
> >>>>> triggers OVS_DROP_LAST_ACTION) with a sample(emit_sample()). Ideally,
> >>>>> that replacement should not have any effect on the number of
> >>>>> OVS_DROP_LAST_ACTION being reported as the packets are being treated in
> >>>>> the same way (only observed in one case).
> >>>>>
> >>>>>
> >>>>>>>>  		return 0;
> >>>>>>>>  	}
> >>>>>>>>
> >>>>>>>> @@ -1579,7 +1588,7 @@ static int do_execute_actions(struct datapath *dp, struct sk_buff *skb,
> >>>>>>>>  		}
> >>>>>>>>  	}
> >>>>>>>>
> >>>>>>>> -	ovs_kfree_skb_reason(skb, OVS_DROP_LAST_ACTION);
> >>>>>>>> +	ovs_drop_skb_last_action(skb);
> >>>>>>>
> >>>>>>> I don't think I agree with this one.  If we have a sample() action with
> >>>>>>> a lot of different actions inside and we reached the end while the last
> >>>>>>> action didn't consume the skb, then we should report that.  E.g.
> >>>>>>> "sample(emit_sample(),push_vlan(),set(eth())),2"  should report that the
> >>>>>>> cloned skb was dropped.  "sample(push_vlan(),emit_sample())" should not.
> >>>>>>>
> >>>>>
> >>>>> What is the use case for such action list? Having an action branch
> >>>>> executed randomly doesn't make sense to me if it's not some
> >>>>> observability thing (which IMHO should not trigger drops).
> >>>>
> >>>> It is exactly my point.  A list of actions that doesn't end is some sort
> >>>> of a terminal action (output, drop, etc) does not make a lot of sense and
> >>>> hence should be signaled as an unexpected drop, so users can re-check the
> >>>> pipeline in case they missed the terminal action somehow.
> >>>>
> >>>>>
> >>>>>>> The only actions that are actually consuming the skb are "output",
> >>>>>>> "userspace", "recirc" and now "emit_sample".  "output" and "recirc" are
> >>>>>>> consuming the skb "naturally" by stealing it when it is the last action.
> >>>>>>> "userspace" has an explicit check to consume the skb if it is the last
> >>>>>>> action.  "emit_sample" should have the similar check.  It should likely
> >>>>>>> be added at the point of action introduction instead of having a separate
> >>>>>>> patch.
> >>>>>>>
> >>>>>
> >>>>> Unlinke "output", "recirc", "userspace", etc. with emit_sample the
> >>>>> packet does not continue it's way through the datapath.
> >>>>
> >>>> After "output" the packet leaves the datapath too, i.e. does not continue
> >>>> it's way through OVS datapath.
> >>>>
> >>>
> >>> I meant a broader concept of "datapath". The packet continues. For the
> >>> userspace action this is true only for the CONTROLLER ofp action but
> >>> since the datapath does not know which action it's implementing, we
> >>> cannot do better.
> >>
> >> It's not only controller() action.  Packets can be brought to userspace
> >> for various reason including just an explicit ask to execute some actions
> >> in userspace.  In any case the packet sent to userspace kind of reached its
> >> destination and it's not the "datapath drops the packet" situation.
> >>
> >>>
> >>>>>
> >>>>> It would be very confusing if OVS starts monitoring drops and adds a bunch
> >>>>> of flows such as "actions:emit_sample()" and suddently it stops reporting such
> >>>>> drops via standard kfree_skb_reason. Packets _are_ being dropped here,
> >>>>> we are just observing them.
> >>>>
> >>>> This might make sense from the higher logic in user space application, but
> >>>> it doesn't from the datapath perspective.  And also, if the user adds the
> >>>> 'emit_sample' action for drop monitring, they already know where to find
> >>>> packet samples, they don't need to use tools like dropwatch anymore.
> >>>> This packet is not dropped from the datapath perspective, it is sampled.
> >>>>
> >>>>>
> >>>>> And if we change emit_sample to trigger a drop if it's the last action,
> >>>>> then "sample(50%, emit_sample()),2" will trigger a drop half of the times
> >>>>> which is also terribly confusing.
> >>>>
> >>>> If emit_sample is the last action, then skb should be consumed silently.
> >>>> The same as for "output" and "userspace".
> >>>>
> >>>>>
> >>>>> I think we should try to be clear and informative with what we
> >>>>> _actually_ drop and not require the user that is just running
> >>>>> "dropwatch" to understand the internals of the OVS module.
> >>>>
> >>>> If someone is already using sampling to watch their packet drops, why would
> >>>> they use dropwatch?
> >>>>
> >>>>>
> >>>>> So if you don't want to accept the "observational" nature of sample(),
> >>>>> the only other solution that does not bring even more confusion to OVS
> >>>>> drops would be to have userspace add explicit drop actions. WDYT?
> >>>>>
> >>>>
> >>>> These are not drops from the datapath perspective.  Users can add explicit
> >>>> drop actions if they want to, but I'm really not sure why they would do that
> >>>> if they are already capturing all these packets in psample, sFlow or IPFIX.
> >>>
> >>> Because there is not a single "user". Tools and systems can be built on
> >>> top of tracepoints and samples and they might not be coordinated between
> >>> them. Some observability application can be always enabled and doing
> >>> constant network monitoring or statistics while other lower level tools
> >>> can be run at certain moments to troubleshoot issues.
> >>>
> >>> In order to run dropwatch in a node you don't need to have rights to
> >>> access the OpenFlow controller and ask it to change the OpenFlow rules
> >>> or else dropwatch simply will not show actual packet drops.
> >>
> >> The point is that these are not drops in this scenario.  The packet was
> >> delivered to its destination and hence should not be reported as dropped.
> >> In the observability use-case that you're describing even OpenFlow layer
> >> in OVS doesn't know if these supposed to be treated as packet drops for
> >> the user or if these are just samples with the sampling being the only
> >> intended destination.  For OpenFlow and OVS userspace components these
> >> two scenarios are indistinguishable.  Only the OpenFlow controller knows
> >> that these rules were put in place because it was an ACL created by some
> >> user or tool.  And since OVS in user space can't make such a distinction,
> >> kernel can't make it either, and so shouldn't guess what the user two
> >> levels of abstraction higher up meant.
> >>
> >>>
> >>> To me it seems obvious that drop sampling (via emit_sample) "includes"
> >>> drop reporting via emit_sample. In both cases you get the packet
> >>> headers, but in one case you also get OFP controller metadata. Now even
> >>> if there is a system that uses both, does it make sense to push to them
> >>> the responsibility of dealing with them being mutually exclusive?
> >>>
> >>> I think this makes debugging OVS datapath unnecessarily obscure when we
> >>> know the packet is actually being dropped intentionally by OVS.
> >>
> >> I don't think we know that we're in a drop sampling scenario.  We don't
> >> have enough information even in OVS userspace to tell.
> >>
> >> And having different behavior between "userspace" and "emit_sample" in
> >> the kernel may cause even more confusion, because now two ways of sampling
> >> packets will result in packets showing up in dropwatch in one case, but
> >> not in the other.
> >>
> >>>
> >>> What's the problem with having OVS write the following?
> >>>     "sample(50%, emit_sample()),drop(0)"
> >>
> >> It's a valid sequence of actions, but we shouldn't guess what the end
> >> user meant by putting those actions into the kernel.  If we see such a
> >> sequence in the kernel, then we should report an explicit drop.  If
> >> there was only the "sample(50%, emit_sample())" then we should simply
> >> consume the skb as it reached its destination in the psample.
> >>
> >> For the question if OVS in user space should put explicit drop action
> >> while preparing to emit sample, this doesn't sound reasonable for the
> >> same reason - OVS in user space doesn't know what the intention was of
> >> the user or tool that put the sampling action into OpenFlow pipeline.
> >>
> >
> > I don't see it that way. The spec says that packets whose action sets
> > (the result of classification) have no output action and no group action
> > must be dropped. Even if OFP sample action is an extension, I don't see
> > it invalidating that semantics.
> > So, IMHO, OVS does know that a flow that is just sampled is a drop.
>
> This applies to "action sets", but most users are actually using "action
> lists" supplied via "Apply-actions" OF instruction and the action sets
> always remain empty.  So, from the OF perspective, strictly speaking, we
> are dropping every single packet.  So, this is not a good analogy.
>
> >
> >> I actually became more confused about what are we arguing about.
> >> To recap:
> >>
> >>                                      This patch     My proposal
> >>
> >> 1. emit_sample() is the last            consume        consume
> >>     inside the sample()
> >>
> >> 2. the end of the action list           consume        drop
> >>     inside the sample()
> >>
> >> 3. emit_sample() is the last            drop           consume
> >>     outside the sample()
> >>
> >> 4. the end of the action list           drop           drop
> >>     outside the sample()
> >>
> >> 5. sample() is the last action          consume        consume
> >>     and probability failed
> >>
> >>
> >> I don't think cases 1 and 3 should differ, i.e. the behavior should
> >> be the same regardless of emit_sample() being inside or outside of
> >> the sample().  As a side point, OVS in user space will omit the 100%
> >> rate sample() action and will just list inner actions instead.  This
> >> means that 100% probability sampling will generate drops and 99% will
> >> not.  Doesn't sound right.
> >>
> >
> > That's what I was refering to in the commit message, we still OVS to
> > write:
> >     actions:sample(..,emit_sample(..)),drop
> >
> >> Case 2 should likely never happen, but I'd like to see a drop reported
> >> if that ever happens, because it is not a meaningful list of actions.
> >>
> >> Best regards, Ilya Maximets.
> >>
> >
> > I think we could drop this patch if we agree that OVS could write
> > explicit drops when it knows the packet is being dropped and sampled
> > (the action only has OFP sample actions).
> >
> > The drop could be placed inside the odp sample action to avoid
> > breaking the clone optimization:
> >     actions:sample(50%, actions(emit_sample(),drop)))
> >
> > or outside if the sample itself is optimized out:
> >     actions:emit_sample(),drop
> >
> > IIUC, if we don't do that, we are saying that sampling is incompatible
> > with decent drop reporting via kfree_skb infrastructure used by tools
> > like dropwatch or retis (among many others). And I think that is
> > unnecessarily and deliberately making OVS datapath more difficult to
> > troubleshoot.
>
> This makes some sense, so let's ensure that semantics is consistent
> within the kernel and discuss how to make the tools happy from the
> user space perspective.
>
> But we shouldn't simply drop this patch, we still need to consume the
> skb after emit_sample() when it is the last action.  The same as we
> do for the userpsace() action.  Though it should be done at the point
> of the action introduction.  Having both actions consistent will allow
> us to solve the observability problem for both in the same way by
> adding explicit drop actions from user space.

OK. I'll resend the series dropping this patch (and consuming the skb
apropriately).

>
> On a side note:
> I wonder if probability-induced drop needs a separate reason... i.e.
> it could have been consumed by emit_smaple()/userspace() but wasn't.
>

You mean in sample action "get_random_u32() > arg->probability"?
It only makes sense to drop it if the last action so currently uses
OVS_DROP_LAST_ACTION.

> Best regards, Ilya Maximets.
>

Thanks for the great discussion.
Adrián
Ilya Maximets June 19, 2024, 8:56 p.m. UTC | #11
On 6/19/24 22:40, Adrián Moreno wrote:
> On Wed, Jun 19, 2024 at 08:21:02PM GMT, Ilya Maximets wrote:
>> On 6/19/24 08:35, Adrián Moreno wrote:
>>> On Tue, Jun 18, 2024 at 05:44:05PM GMT, Ilya Maximets wrote:
>>>> On 6/18/24 12:50, Adrián Moreno wrote:
>>>>> On Tue, Jun 18, 2024 at 12:22:23PM GMT, Ilya Maximets wrote:
>>>>>> On 6/18/24 09:00, Adrián Moreno wrote:
>>>>>>> On Mon, Jun 17, 2024 at 02:10:37PM GMT, Ilya Maximets wrote:
>>>>>>>> On 6/17/24 13:55, Ilya Maximets wrote:
>>>>>>>>> On 6/3/24 20:56, Adrian Moreno wrote:
>>>>>>>>>> The OVS_ACTION_ATTR_SAMPLE action is, in essence,
>>>>>>>>>> observability-oriented.
>>>>>>>>>>
>>>>>>>>>> Apart from some corner case in which it's used a replacement of clone()
>>>>>>>>>> for old kernels, it's really only used for sFlow, IPFIX and now,
>>>>>>>>>> local emit_sample.
>>>>>>>>>>
>>>>>>>>>> With this in mind, it doesn't make much sense to report
>>>>>>>>>> OVS_DROP_LAST_ACTION inside sample actions.
>>>>>>>>>>
>>>>>>>>>> For instance, if the flow:
>>>>>>>>>>
>>>>>>>>>>   actions:sample(..,emit_sample(..)),2
>>>>>>>>>>
>>>>>>>>>> triggers a OVS_DROP_LAST_ACTION skb drop event, it would be extremely
>>>>>>>>>> confusing for users since the packet did reach its destination.
>>>>>>>>>>
>>>>>>>>>> This patch makes internal action execution silently consume the skb
>>>>>>>>>> instead of notifying a drop for this case.
>>>>>>>>>>
>>>>>>>>>> Unfortunately, this patch does not remove all potential sources of
>>>>>>>>>> confusion since, if the sample action itself is the last action, e.g:
>>>>>>>>>>
>>>>>>>>>>     actions:sample(..,emit_sample(..))
>>>>>>>>>>
>>>>>>>>>> we actually _should_ generate a OVS_DROP_LAST_ACTION event, but we aren't.
>>>>>>>>>>
>>>>>>>>>> Sadly, this case is difficult to solve without breaking the
>>>>>>>>>> optimization by which the skb is not cloned on last sample actions.
>>>>>>>>>> But, given explicit drop actions are now supported, OVS can just add one
>>>>>>>>>> after the last sample() and rewrite the flow as:
>>>>>>>>>>
>>>>>>>>>>     actions:sample(..,emit_sample(..)),drop
>>>>>>>>>>
>>>>>>>>>> Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
>>>>>>>>>> ---
>>>>>>>>>>  net/openvswitch/actions.c | 13 +++++++++++--
>>>>>>>>>>  1 file changed, 11 insertions(+), 2 deletions(-)
>>>>>>>>>>
>>>>>>>>>> diff --git a/net/openvswitch/actions.c b/net/openvswitch/actions.c
>>>>>>>>>> index 33f6d93ba5e4..54fc1abcff95 100644
>>>>>>>>>> --- a/net/openvswitch/actions.c
>>>>>>>>>> +++ b/net/openvswitch/actions.c
>>>>>>>>>> @@ -82,6 +82,15 @@ static struct action_fifo __percpu *action_fifos;
>>>>>>>>>>  static struct action_flow_keys __percpu *flow_keys;
>>>>>>>>>>  static DEFINE_PER_CPU(int, exec_actions_level);
>>>>>>>>>>
>>>>>>>>>> +static inline void ovs_drop_skb_last_action(struct sk_buff *skb)
>>>>>>>>>> +{
>>>>>>>>>> +	/* Do not emit packet drops inside sample(). */
>>>>>>>>>> +	if (OVS_CB(skb)->probability)
>>>>>>>>>> +		consume_skb(skb);
>>>>>>>>>> +	else
>>>>>>>>>> +		ovs_kfree_skb_reason(skb, OVS_DROP_LAST_ACTION);
>>>>>>>>>> +}
>>>>>>>>>> +
>>>>>>>>>>  /* Make a clone of the 'key', using the pre-allocated percpu 'flow_keys'
>>>>>>>>>>   * space. Return NULL if out of key spaces.
>>>>>>>>>>   */
>>>>>>>>>> @@ -1061,7 +1070,7 @@ static int sample(struct datapath *dp, struct sk_buff *skb,
>>>>>>>>>>  	if ((arg->probability != U32_MAX) &&
>>>>>>>>>>  	    (!arg->probability || get_random_u32() > arg->probability)) {
>>>>>>>>>>  		if (last)
>>>>>>>>>> -			ovs_kfree_skb_reason(skb, OVS_DROP_LAST_ACTION);
>>>>>>>>>> +			ovs_drop_skb_last_action(skb);
>>>>>>>>
>>>>>>>> Always consuming the skb at this point makes sense, since having smaple()
>>>>>>>> as a last action is a reasonable thing to have.  But this looks more like
>>>>>>>> a fix for the original drop reason patch set.
>>>>>>>>
>>>>>>>
>>>>>>> I don't think consuming the skb at this point makes sense. It was very
>>>>>>> intentionally changed to a drop since a very common use-case for
>>>>>>> sampling is drop-sampling, i.e: replacing an empty action list (that
>>>>>>> triggers OVS_DROP_LAST_ACTION) with a sample(emit_sample()). Ideally,
>>>>>>> that replacement should not have any effect on the number of
>>>>>>> OVS_DROP_LAST_ACTION being reported as the packets are being treated in
>>>>>>> the same way (only observed in one case).
>>>>>>>
>>>>>>>
>>>>>>>>>>  		return 0;
>>>>>>>>>>  	}
>>>>>>>>>>
>>>>>>>>>> @@ -1579,7 +1588,7 @@ static int do_execute_actions(struct datapath *dp, struct sk_buff *skb,
>>>>>>>>>>  		}
>>>>>>>>>>  	}
>>>>>>>>>>
>>>>>>>>>> -	ovs_kfree_skb_reason(skb, OVS_DROP_LAST_ACTION);
>>>>>>>>>> +	ovs_drop_skb_last_action(skb);
>>>>>>>>>
>>>>>>>>> I don't think I agree with this one.  If we have a sample() action with
>>>>>>>>> a lot of different actions inside and we reached the end while the last
>>>>>>>>> action didn't consume the skb, then we should report that.  E.g.
>>>>>>>>> "sample(emit_sample(),push_vlan(),set(eth())),2"  should report that the
>>>>>>>>> cloned skb was dropped.  "sample(push_vlan(),emit_sample())" should not.
>>>>>>>>>
>>>>>>>
>>>>>>> What is the use case for such action list? Having an action branch
>>>>>>> executed randomly doesn't make sense to me if it's not some
>>>>>>> observability thing (which IMHO should not trigger drops).
>>>>>>
>>>>>> It is exactly my point.  A list of actions that doesn't end is some sort
>>>>>> of a terminal action (output, drop, etc) does not make a lot of sense and
>>>>>> hence should be signaled as an unexpected drop, so users can re-check the
>>>>>> pipeline in case they missed the terminal action somehow.
>>>>>>
>>>>>>>
>>>>>>>>> The only actions that are actually consuming the skb are "output",
>>>>>>>>> "userspace", "recirc" and now "emit_sample".  "output" and "recirc" are
>>>>>>>>> consuming the skb "naturally" by stealing it when it is the last action.
>>>>>>>>> "userspace" has an explicit check to consume the skb if it is the last
>>>>>>>>> action.  "emit_sample" should have the similar check.  It should likely
>>>>>>>>> be added at the point of action introduction instead of having a separate
>>>>>>>>> patch.
>>>>>>>>>
>>>>>>>
>>>>>>> Unlinke "output", "recirc", "userspace", etc. with emit_sample the
>>>>>>> packet does not continue it's way through the datapath.
>>>>>>
>>>>>> After "output" the packet leaves the datapath too, i.e. does not continue
>>>>>> it's way through OVS datapath.
>>>>>>
>>>>>
>>>>> I meant a broader concept of "datapath". The packet continues. For the
>>>>> userspace action this is true only for the CONTROLLER ofp action but
>>>>> since the datapath does not know which action it's implementing, we
>>>>> cannot do better.
>>>>
>>>> It's not only controller() action.  Packets can be brought to userspace
>>>> for various reason including just an explicit ask to execute some actions
>>>> in userspace.  In any case the packet sent to userspace kind of reached its
>>>> destination and it's not the "datapath drops the packet" situation.
>>>>
>>>>>
>>>>>>>
>>>>>>> It would be very confusing if OVS starts monitoring drops and adds a bunch
>>>>>>> of flows such as "actions:emit_sample()" and suddently it stops reporting such
>>>>>>> drops via standard kfree_skb_reason. Packets _are_ being dropped here,
>>>>>>> we are just observing them.
>>>>>>
>>>>>> This might make sense from the higher logic in user space application, but
>>>>>> it doesn't from the datapath perspective.  And also, if the user adds the
>>>>>> 'emit_sample' action for drop monitring, they already know where to find
>>>>>> packet samples, they don't need to use tools like dropwatch anymore.
>>>>>> This packet is not dropped from the datapath perspective, it is sampled.
>>>>>>
>>>>>>>
>>>>>>> And if we change emit_sample to trigger a drop if it's the last action,
>>>>>>> then "sample(50%, emit_sample()),2" will trigger a drop half of the times
>>>>>>> which is also terribly confusing.
>>>>>>
>>>>>> If emit_sample is the last action, then skb should be consumed silently.
>>>>>> The same as for "output" and "userspace".
>>>>>>
>>>>>>>
>>>>>>> I think we should try to be clear and informative with what we
>>>>>>> _actually_ drop and not require the user that is just running
>>>>>>> "dropwatch" to understand the internals of the OVS module.
>>>>>>
>>>>>> If someone is already using sampling to watch their packet drops, why would
>>>>>> they use dropwatch?
>>>>>>
>>>>>>>
>>>>>>> So if you don't want to accept the "observational" nature of sample(),
>>>>>>> the only other solution that does not bring even more confusion to OVS
>>>>>>> drops would be to have userspace add explicit drop actions. WDYT?
>>>>>>>
>>>>>>
>>>>>> These are not drops from the datapath perspective.  Users can add explicit
>>>>>> drop actions if they want to, but I'm really not sure why they would do that
>>>>>> if they are already capturing all these packets in psample, sFlow or IPFIX.
>>>>>
>>>>> Because there is not a single "user". Tools and systems can be built on
>>>>> top of tracepoints and samples and they might not be coordinated between
>>>>> them. Some observability application can be always enabled and doing
>>>>> constant network monitoring or statistics while other lower level tools
>>>>> can be run at certain moments to troubleshoot issues.
>>>>>
>>>>> In order to run dropwatch in a node you don't need to have rights to
>>>>> access the OpenFlow controller and ask it to change the OpenFlow rules
>>>>> or else dropwatch simply will not show actual packet drops.
>>>>
>>>> The point is that these are not drops in this scenario.  The packet was
>>>> delivered to its destination and hence should not be reported as dropped.
>>>> In the observability use-case that you're describing even OpenFlow layer
>>>> in OVS doesn't know if these supposed to be treated as packet drops for
>>>> the user or if these are just samples with the sampling being the only
>>>> intended destination.  For OpenFlow and OVS userspace components these
>>>> two scenarios are indistinguishable.  Only the OpenFlow controller knows
>>>> that these rules were put in place because it was an ACL created by some
>>>> user or tool.  And since OVS in user space can't make such a distinction,
>>>> kernel can't make it either, and so shouldn't guess what the user two
>>>> levels of abstraction higher up meant.
>>>>
>>>>>
>>>>> To me it seems obvious that drop sampling (via emit_sample) "includes"
>>>>> drop reporting via emit_sample. In both cases you get the packet
>>>>> headers, but in one case you also get OFP controller metadata. Now even
>>>>> if there is a system that uses both, does it make sense to push to them
>>>>> the responsibility of dealing with them being mutually exclusive?
>>>>>
>>>>> I think this makes debugging OVS datapath unnecessarily obscure when we
>>>>> know the packet is actually being dropped intentionally by OVS.
>>>>
>>>> I don't think we know that we're in a drop sampling scenario.  We don't
>>>> have enough information even in OVS userspace to tell.
>>>>
>>>> And having different behavior between "userspace" and "emit_sample" in
>>>> the kernel may cause even more confusion, because now two ways of sampling
>>>> packets will result in packets showing up in dropwatch in one case, but
>>>> not in the other.
>>>>
>>>>>
>>>>> What's the problem with having OVS write the following?
>>>>>     "sample(50%, emit_sample()),drop(0)"
>>>>
>>>> It's a valid sequence of actions, but we shouldn't guess what the end
>>>> user meant by putting those actions into the kernel.  If we see such a
>>>> sequence in the kernel, then we should report an explicit drop.  If
>>>> there was only the "sample(50%, emit_sample())" then we should simply
>>>> consume the skb as it reached its destination in the psample.
>>>>
>>>> For the question if OVS in user space should put explicit drop action
>>>> while preparing to emit sample, this doesn't sound reasonable for the
>>>> same reason - OVS in user space doesn't know what the intention was of
>>>> the user or tool that put the sampling action into OpenFlow pipeline.
>>>>
>>>
>>> I don't see it that way. The spec says that packets whose action sets
>>> (the result of classification) have no output action and no group action
>>> must be dropped. Even if OFP sample action is an extension, I don't see
>>> it invalidating that semantics.
>>> So, IMHO, OVS does know that a flow that is just sampled is a drop.
>>
>> This applies to "action sets", but most users are actually using "action
>> lists" supplied via "Apply-actions" OF instruction and the action sets
>> always remain empty.  So, from the OF perspective, strictly speaking, we
>> are dropping every single packet.  So, this is not a good analogy.
>>
>>>
>>>> I actually became more confused about what are we arguing about.
>>>> To recap:
>>>>
>>>>                                      This patch     My proposal
>>>>
>>>> 1. emit_sample() is the last            consume        consume
>>>>     inside the sample()
>>>>
>>>> 2. the end of the action list           consume        drop
>>>>     inside the sample()
>>>>
>>>> 3. emit_sample() is the last            drop           consume
>>>>     outside the sample()
>>>>
>>>> 4. the end of the action list           drop           drop
>>>>     outside the sample()
>>>>
>>>> 5. sample() is the last action          consume        consume
>>>>     and probability failed
>>>>
>>>>
>>>> I don't think cases 1 and 3 should differ, i.e. the behavior should
>>>> be the same regardless of emit_sample() being inside or outside of
>>>> the sample().  As a side point, OVS in user space will omit the 100%
>>>> rate sample() action and will just list inner actions instead.  This
>>>> means that 100% probability sampling will generate drops and 99% will
>>>> not.  Doesn't sound right.
>>>>
>>>
>>> That's what I was refering to in the commit message, we still OVS to
>>> write:
>>>     actions:sample(..,emit_sample(..)),drop
>>>
>>>> Case 2 should likely never happen, but I'd like to see a drop reported
>>>> if that ever happens, because it is not a meaningful list of actions.
>>>>
>>>> Best regards, Ilya Maximets.
>>>>
>>>
>>> I think we could drop this patch if we agree that OVS could write
>>> explicit drops when it knows the packet is being dropped and sampled
>>> (the action only has OFP sample actions).
>>>
>>> The drop could be placed inside the odp sample action to avoid
>>> breaking the clone optimization:
>>>     actions:sample(50%, actions(emit_sample(),drop)))
>>>
>>> or outside if the sample itself is optimized out:
>>>     actions:emit_sample(),drop
>>>
>>> IIUC, if we don't do that, we are saying that sampling is incompatible
>>> with decent drop reporting via kfree_skb infrastructure used by tools
>>> like dropwatch or retis (among many others). And I think that is
>>> unnecessarily and deliberately making OVS datapath more difficult to
>>> troubleshoot.
>>
>> This makes some sense, so let's ensure that semantics is consistent
>> within the kernel and discuss how to make the tools happy from the
>> user space perspective.
>>
>> But we shouldn't simply drop this patch, we still need to consume the
>> skb after emit_sample() when it is the last action.  The same as we
>> do for the userpsace() action.  Though it should be done at the point
>> of the action introduction.  Having both actions consistent will allow
>> us to solve the observability problem for both in the same way by
>> adding explicit drop actions from user space.
> 
> OK. I'll resend the series dropping this patch (and consuming the skb
> apropriately).

Thanks!

> 
>>
>> On a side note:
>> I wonder if probability-induced drop needs a separate reason... i.e.
>> it could have been consumed by emit_smaple()/userspace() but wasn't.
>>
> 
> You mean in sample action "get_random_u32() > arg->probability"?
> It only makes sense to drop it if the last action so currently uses
> OVS_DROP_LAST_ACTION.

Sure, but, for example:
  actions:sample(50%,userspace())
In 50% cases we will consume the skb, in 50% we will report a LAST_ACTION
drop.  Looks a little inconsistent.  That's why I was saying that always
consuming on probability check failure is a sane option.  But if we have
  actions:sample(50%,userspace(),drop)
Then reporting a drop makes more sense.  So, I was thinking that maybe the
LAST_ACTION is just not the right drop reason to report.  e.g. something
like OVS_DROP_SAMPLE_PROBABILITY may be more appropriate to report in both
cases.

Anyways, this is only kind of related to this set and may be a separate
change if we decide it is needed.

> 
>> Best regards, Ilya Maximets.
>>
> 
> Thanks for the great discussion.
> Adrián
>
diff mbox series

Patch

diff --git a/net/openvswitch/actions.c b/net/openvswitch/actions.c
index 33f6d93ba5e4..54fc1abcff95 100644
--- a/net/openvswitch/actions.c
+++ b/net/openvswitch/actions.c
@@ -82,6 +82,15 @@  static struct action_fifo __percpu *action_fifos;
 static struct action_flow_keys __percpu *flow_keys;
 static DEFINE_PER_CPU(int, exec_actions_level);
 
+static inline void ovs_drop_skb_last_action(struct sk_buff *skb)
+{
+	/* Do not emit packet drops inside sample(). */
+	if (OVS_CB(skb)->probability)
+		consume_skb(skb);
+	else
+		ovs_kfree_skb_reason(skb, OVS_DROP_LAST_ACTION);
+}
+
 /* Make a clone of the 'key', using the pre-allocated percpu 'flow_keys'
  * space. Return NULL if out of key spaces.
  */
@@ -1061,7 +1070,7 @@  static int sample(struct datapath *dp, struct sk_buff *skb,
 	if ((arg->probability != U32_MAX) &&
 	    (!arg->probability || get_random_u32() > arg->probability)) {
 		if (last)
-			ovs_kfree_skb_reason(skb, OVS_DROP_LAST_ACTION);
+			ovs_drop_skb_last_action(skb);
 		return 0;
 	}
 
@@ -1579,7 +1588,7 @@  static int do_execute_actions(struct datapath *dp, struct sk_buff *skb,
 		}
 	}
 
-	ovs_kfree_skb_reason(skb, OVS_DROP_LAST_ACTION);
+	ovs_drop_skb_last_action(skb);
 	return 0;
 }