[net-next,1/2] tcp: call tcp_drop() in tcp collapse

Message ID 1532746900-11710-1-git-send-email-laoar.shao@gmail.com
State Changes Requested, archived
Delegated to: David Miller

Commit Message

Yafang Shao July 28, 2018, 3:01 a.m. UTC
When this SKB is dropped, we should increment the sk_drops counter.
That would help us track this behavior better.

Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
---
 net/ipv4/tcp_input.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Eric Dumazet July 28, 2018, 3:06 a.m. UTC | #1
On Fri, Jul 27, 2018 at 8:02 PM Yafang Shao <laoar.shao@gmail.com> wrote:
>
> When this SKB is dropped, we should add the counter sk_drops.
> That could help us better tracking this behavior.
>
> Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
> ---
>  net/ipv4/tcp_input.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
> index d51fa35..90f83eb 100644
> --- a/net/ipv4/tcp_input.c
> +++ b/net/ipv4/tcp_input.c
> @@ -4802,7 +4802,7 @@ static struct sk_buff *tcp_collapse_one(struct sock *sk, struct sk_buff *skb,
>         else
>                 rb_erase(&skb->rbnode, root);
>
> -       __kfree_skb(skb);
> +       tcp_drop(sk, skb);


Absolutely not.

We do not drop the packet, we have simply lowered the memory overhead.
Yafang Shao July 28, 2018, 3:34 a.m. UTC | #2
On Sat, Jul 28, 2018 at 11:06 AM, Eric Dumazet <edumazet@google.com> wrote:
> On Fri, Jul 27, 2018 at 8:02 PM Yafang Shao <laoar.shao@gmail.com> wrote:
>>
>> When this SKB is dropped, we should add the counter sk_drops.
>> That could help us better tracking this behavior.
>>
>> Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
>> ---
>>  net/ipv4/tcp_input.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
>> index d51fa35..90f83eb 100644
>> --- a/net/ipv4/tcp_input.c
>> +++ b/net/ipv4/tcp_input.c
>> @@ -4802,7 +4802,7 @@ static struct sk_buff *tcp_collapse_one(struct sock *sk, struct sk_buff *skb,
>>         else
>>                 rb_erase(&skb->rbnode, root);
>>
>> -       __kfree_skb(skb);
>> +       tcp_drop(sk, skb);
>
>
> Absolutely not.
>
> We do not drop the packet, we have simply lowered the memory overhead.

So what about LINUX_MIB_TCPOFOMERGE?
In the LINUX_MIB_TCPOFOMERGE case, an skb is already covered by another
skb. Is that dropping the packet, or simply lowering the memory
overhead?

Thanks
Yafang
Eric Dumazet July 28, 2018, 3:38 a.m. UTC | #3
On Fri, Jul 27, 2018 at 8:35 PM Yafang Shao <laoar.shao@gmail.com> wrote:

> So what about LINUX_MIB_TCPOFOMERGE ?
> Regarding LINUX_MIB_TCPOFOMERGE,  a skb is already covered by another
> skb, is that dropping the packet or simply lowering the memory
> overhead ?

What do you think?

If you receive the same payload twice, don't you have to drop one
of the duplicates?

There is a big difference between the two cases.
Yafang Shao July 28, 2018, 7:42 a.m. UTC | #4
On Sat, Jul 28, 2018 at 11:38 AM, Eric Dumazet <edumazet@google.com> wrote:
> On Fri, Jul 27, 2018 at 8:35 PM Yafang Shao <laoar.shao@gmail.com> wrote:
>
>> So what about LINUX_MIB_TCPOFOMERGE ?
>> Regarding LINUX_MIB_TCPOFOMERGE,  a skb is already covered by another
>> skb, is that dropping the packet or simply lowering the memory
>> overhead ?
>
> What do you think ?
>
> If you receive two times the same payload, don't you have to drop one
> of the duplicate ?
>
> There is a a big difference between the two cases.

If the drop causes data loss (which may then cause a retransmission
or something similar), then it is a real DROP.
If the drop does not cause any data loss, meaning it is harmless,
I think it should not be defined as a DROP.
This is only my suggestion.

Thanks
Yafang
Eric Dumazet July 28, 2018, 4:28 p.m. UTC | #5
On Sat, Jul 28, 2018 at 12:43 AM Yafang Shao <laoar.shao@gmail.com> wrote:
>
> On Sat, Jul 28, 2018 at 11:38 AM, Eric Dumazet <edumazet@google.com> wrote:
> > On Fri, Jul 27, 2018 at 8:35 PM Yafang Shao <laoar.shao@gmail.com> wrote:
> >
> >> So what about LINUX_MIB_TCPOFOMERGE ?
> >> Regarding LINUX_MIB_TCPOFOMERGE,  a skb is already covered by another
> >> skb, is that dropping the packet or simply lowering the memory
> >> overhead ?
> >
> > What do you think ?
> >
> > If you receive two times the same payload, don't you have to drop one
> > of the duplicate ?
> >
> > There is a a big difference between the two cases.
>
> If the drop caused some data lost (which may then cause retransmition
> or something), then this is a really DROP.
> While if the drop won't cause any data lost, meaning it is a
> non-harmful behavior, I think it should not be defined as DROP.
> This is my suggestion anyway.

Sigh.

We do not count drops because they are 'bad' or because 'something
went wrong'.

If the TCP stack receives the same sequence (same payload) twice, we
_drop_ one of the duplicates, so we account for this event.

When 'collapsing', we reorganize our own storage, not because we have
to drop a payload, but because of memory pressure.
We have specific SNMP counters to account for these events, and we do
not want to pretend a packet was 'dropped' when it was not.

If we have to _drop_ some packets, it is called pruning, and we do
properly account for these drops.
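
The three cases distinguished above can be sketched in userspace C. This
is a minimal model under stated assumptions, not kernel code: rx_event,
rx_stats and account() are hypothetical names, `drops` stands in for
sk_drops, and `collapsed` stands in for the LINUX_MIB_TCPRCVCOLLAPSED
counter. Each event is counted once here, whereas the real tcp_drop()
counts segments.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace model of the three receive-side events. */
enum rx_event { RX_DUPLICATE, RX_COLLAPSE, RX_PRUNE };

struct rx_stats {
    unsigned long drops;      /* models sk_drops */
    unsigned long collapsed;  /* models LINUX_MIB_TCPRCVCOLLAPSED */
    unsigned long bytes_lost; /* payload the sender must retransmit */
};

static void account(struct rx_stats *st, enum rx_event ev, size_t payload_len)
{
    assert(st != NULL);

    switch (ev) {
    case RX_DUPLICATE:
        /* One copy is discarded; the other copy still holds the payload. */
        st->drops++;
        break;
    case RX_COLLAPSE:
        /* Payload was copied into another buffer before the old one was
         * freed, so nothing is dropped and nothing is lost. */
        st->collapsed++;
        break;
    case RX_PRUNE:
        /* Payload is discarded under memory pressure and is lost. */
        st->drops++;
        st->bytes_lost += payload_len;
        break;
    }
}
```

In this model only RX_PRUNE loses payload, and only RX_COLLAPSE leaves
`drops` untouched, which is the distinction the patch would have blurred.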
Yafang Shao July 30, 2018, 2:06 a.m. UTC | #6
On Sun, Jul 29, 2018 at 12:28 AM, Eric Dumazet <edumazet@google.com> wrote:
> On Sat, Jul 28, 2018 at 12:43 AM Yafang Shao <laoar.shao@gmail.com> wrote:
>>
>> On Sat, Jul 28, 2018 at 11:38 AM, Eric Dumazet <edumazet@google.com> wrote:
>> > On Fri, Jul 27, 2018 at 8:35 PM Yafang Shao <laoar.shao@gmail.com> wrote:
>> >
>> >> So what about LINUX_MIB_TCPOFOMERGE ?
>> >> Regarding LINUX_MIB_TCPOFOMERGE,  a skb is already covered by another
>> >> skb, is that dropping the packet or simply lowering the memory
>> >> overhead ?
>> >
>> > What do you think ?
>> >
>> > If you receive two times the same payload, don't you have to drop one
>> > of the duplicate ?
>> >
>> > There is a a big difference between the two cases.
>>
>> If the drop caused some data lost (which may then cause retransmition
>> or something), then this is a really DROP.
>> While if the drop won't cause any data lost, meaning it is a
>> non-harmful behavior, I think it should not be defined as DROP.
>> This is my suggestion anyway.
>
> Sigh.
>
> We count drops, not because they are ' bad or something went wrong'.
>
> If TCP stack receives twice the same sequence (same payload), we
> _drop_ one of the duplicate, so we account for this event.
>
> When ' collapsing'  we reorganize our own storage, not because we have
> to drop a payload,
> but for some memory pressure reason.

Thanks for your clarification.
So what about LINUX_MIB_TCPOFODROP?

        if (unlikely(tcp_try_rmem_schedule(sk, skb, skb->truesize))) {
                NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPOFODROP);
                tcp_drop(sk, skb);
                return;
        }


This is also caused by our own memory pressure, but we call tcp_drop() here.

I do not mean to disagree with you. I am just confused and want to
get this clear.

> We have specific SNMP counters to account for these, we do not want to
> pretend a packet was ' dropped' since it was not.
>
> If we have to _drop_ some packets, it is called Pruning, and we do
> properly account for these drops.

Agreed.


Thanks
Yafang
Eric Dumazet July 30, 2018, 2:27 a.m. UTC | #7
On Sun, Jul 29, 2018 at 7:06 PM Yafang Shao <laoar.shao@gmail.com> wrote:
>
> On Sun, Jul 29, 2018 at 12:28 AM, Eric Dumazet <edumazet@google.com> wrote:
> > On Sat, Jul 28, 2018 at 12:43 AM Yafang Shao <laoar.shao@gmail.com> wrote:
> >>
> >> On Sat, Jul 28, 2018 at 11:38 AM, Eric Dumazet <edumazet@google.com> wrote:
> >> > On Fri, Jul 27, 2018 at 8:35 PM Yafang Shao <laoar.shao@gmail.com> wrote:
> >> >
> >> >> So what about LINUX_MIB_TCPOFOMERGE ?
> >> >> Regarding LINUX_MIB_TCPOFOMERGE,  a skb is already covered by another
> >> >> skb, is that dropping the packet or simply lowering the memory
> >> >> overhead ?
> >> >
> >> > What do you think ?
> >> >
> >> > If you receive two times the same payload, don't you have to drop one
> >> > of the duplicate ?
> >> >
> >> > There is a a big difference between the two cases.
> >>
> >> If the drop caused some data lost (which may then cause retransmition
> >> or something), then this is a really DROP.
> >> While if the drop won't cause any data lost, meaning it is a
> >> non-harmful behavior, I think it should not be defined as DROP.
> >> This is my suggestion anyway.
> >
> > Sigh.
> >
> > We count drops, not because they are ' bad or something went wrong'.
> >
> > If TCP stack receives twice the same sequence (same payload), we
> > _drop_ one of the duplicate, so we account for this event.
> >
> > When ' collapsing'  we reorganize our own storage, not because we have
> > to drop a payload,
> > but for some memory pressure reason.
>
> Thanks for you clarification.
> So what about LINUX_MIB_TCPOFODROP ?
>
>         if (unlikely(tcp_try_rmem_schedule(sk, skb, skb->truesize))) {
>                 NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPOFODROP);
>                 tcp_drop(sk, skb);
>                 return;
>         }
>
>
> It is also because of our own memory pressure, but we call tcp_drop() here.

Yes, we _drop_ a packet.

It is pretty clear that the payload is dropped, and that the sender
will have to _retransmit_.

>
> I am not mean to disagree with you. I am just confused and  want to
> make it clear.


Collapsing is:

For (a bunch of packets)
   Try (to compress them in order to reduce memory overhead)

No payload is dropped here. The sender won't have to retransmit.
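
The pseudocode above can be illustrated with a small userspace sketch.
It is not the kernel's tcp_collapse(); `struct seg` and collapse() are
hypothetical names, and segments are assumed to be contiguous in
sequence space. The point it shows is that the original buffers are
freed only after their payload has been copied, so no byte is lost.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a queued segment. */
struct seg {
    unsigned int seq;     /* first sequence number covered */
    size_t len;           /* payload length in bytes */
    unsigned char *data;  /* heap-allocated payload */
};

/*
 * Merge n contiguous segments into one newly allocated segment.
 * Returns 0 on success; returns -1 (leaving the inputs untouched)
 * if the segments are not contiguous or the allocation fails.
 */
static int collapse(struct seg *segs, size_t n, struct seg *out)
{
    size_t total = 0, off = 0, i;

    if (n == 0)
        return -1;

    for (i = 0; i < n; i++) {
        /* Only contiguous ranges can be merged. */
        if (i > 0 &&
            segs[i].seq != segs[i - 1].seq + (unsigned int)segs[i - 1].len)
            return -1;
        total += segs[i].len;
    }

    out->data = malloc(total ? total : 1);
    if (!out->data)
        return -1;
    out->seq = segs[0].seq;
    out->len = total;

    for (i = 0; i < n; i++) {
        if (segs[i].len)
            memcpy(out->data + off, segs[i].data, segs[i].len);
        off += segs[i].len;
        free(segs[i].data);  /* storage released, payload already copied */
        segs[i].data = NULL;
        segs[i].len = 0;
    }
    assert(off == total);
    return 0;
}
```

After a successful call, the caller holds one buffer instead of n, with
the same sequence range and the same bytes.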
Yafang Shao July 30, 2018, 5:40 a.m. UTC | #8
On Mon, Jul 30, 2018 at 10:27 AM, Eric Dumazet <edumazet@google.com> wrote:
> On Sun, Jul 29, 2018 at 7:06 PM Yafang Shao <laoar.shao@gmail.com> wrote:
>>
>> On Sun, Jul 29, 2018 at 12:28 AM, Eric Dumazet <edumazet@google.com> wrote:
>> > On Sat, Jul 28, 2018 at 12:43 AM Yafang Shao <laoar.shao@gmail.com> wrote:
>> >>
>> >> On Sat, Jul 28, 2018 at 11:38 AM, Eric Dumazet <edumazet@google.com> wrote:
>> >> > On Fri, Jul 27, 2018 at 8:35 PM Yafang Shao <laoar.shao@gmail.com> wrote:
>> >> >
>> >> >> So what about LINUX_MIB_TCPOFOMERGE ?
>> >> >> Regarding LINUX_MIB_TCPOFOMERGE,  a skb is already covered by another
>> >> >> skb, is that dropping the packet or simply lowering the memory
>> >> >> overhead ?
>> >> >
>> >> > What do you think ?
>> >> >
>> >> > If you receive two times the same payload, don't you have to drop one
>> >> > of the duplicate ?
>> >> >
>> >> > There is a a big difference between the two cases.
>> >>
>> >> If the drop caused some data lost (which may then cause retransmition
>> >> or something), then this is a really DROP.
>> >> While if the drop won't cause any data lost, meaning it is a
>> >> non-harmful behavior, I think it should not be defined as DROP.
>> >> This is my suggestion anyway.
>> >
>> > Sigh.
>> >
>> > We count drops, not because they are ' bad or something went wrong'.
>> >
>> > If TCP stack receives twice the same sequence (same payload), we
>> > _drop_ one of the duplicate, so we account for this event.
>> >
>> > When ' collapsing'  we reorganize our own storage, not because we have
>> > to drop a payload,
>> > but for some memory pressure reason.
>>
>> Thanks for you clarification.
>> So what about LINUX_MIB_TCPOFODROP ?
>>
>>         if (unlikely(tcp_try_rmem_schedule(sk, skb, skb->truesize))) {
>>                 NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPOFODROP);
>>                 tcp_drop(sk, skb);
>>                 return;
>>         }
>>
>>
>> It is also because of our own memory pressure, but we call tcp_drop() here.
>
> Yes, we _drop_ a packet.
>
> That is pretty clear that the payload is dropped, and that the sender
> will have to _retransmit_.
>
>>
>> I am not mean to disagree with you. I am just confused and  want to
>> make it clear.
>
>
> Collapsing is :
>
> For (a bunch of packets)
>    Try (to compress them in order to reduce memory overhead)
>
> No drop of payload happens here. Sender wont have to retransmit.

OK.
Thanks for your patience.

Should we put NET_INC_STATS(sock_net(sk), mib_idx) into the function
tcp_drop()?
Then we could easily relate sk_drops to the SNMP counters.

Something like this:

    static void tcp_drop(struct sock *sk, struct sk_buff *skb, int mib_idx)
    {
        int segs = max_t(u16, 1, skb_shinfo(skb)->gso_segs);

        atomic_add(segs, &sk->sk_drops);
        NET_ADD_STATS(sock_net(sk), mib_idx, segs);
        __kfree_skb(skb);
    }


Thanks
Yafang
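
The accounting in the proposed tcp_drop() can be modeled in userspace as
follows. This is a sketch under assumptions, not the kernel helper:
sock_model, skb_model, tcp_drop_model() and the mib_idx enum values are
hypothetical stand-ins, and freeing the buffer is omitted. Note that the
proposal uses NET_ADD_STATS with the segment count, so the SNMP counter
would advance per segment rather than once per event as NET_INC_STATS
does.

```c
#include <assert.h>

/* Hypothetical indices standing in for LINUX_MIB_* values. */
enum mib_idx { MIB_TCPOFODROP, MIB_TCPOFOMERGE, MIB_MAX };

struct sock_model {
    unsigned long sk_drops;      /* models sk->sk_drops */
    unsigned long mib[MIB_MAX];  /* models the per-netns SNMP counters */
};

struct skb_model {
    unsigned short gso_segs;     /* 0 for a non-GSO packet */
};

static void tcp_drop_model(struct sock_model *sk,
                           const struct skb_model *skb,
                           enum mib_idx idx)
{
    /* Count at least one segment, as max_t(u16, 1, gso_segs) does. */
    unsigned int segs = skb->gso_segs ? skb->gso_segs : 1;

    assert((unsigned int)idx < MIB_MAX);

    sk->sk_drops += segs;   /* socket-level drop counter */
    sk->mib[idx] += segs;   /* reason-specific counter, kept in step */
}
```

With both counters updated in one place, the sum of the per-reason
counters for a socket matches its sk_drops value in this model.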
Eric Dumazet July 30, 2018, 3:56 p.m. UTC | #9
On Sun, Jul 29, 2018 at 10:40 PM Yafang Shao <laoar.shao@gmail.com> wrote:

> Should we put NET_INC_STATS(sock_net(sk),  mib_idx) into the funtion
> tcp_drop() ?
> Then we could easily relate the sk_drops with the SNMP counters.
>
> Something like that,
>
>     static void tcp_drop(struct sock *sk, struct sk_buff *skb, int mib_idx)
>     {
>         int segs = max_t(u16, 1, skb_shinfo(skb)->gso_segs);
>
>         atomic_add(segs, &sk->sk_drops);
>         NET_ADD_STATS(sock_net(sk), mib_idx, segs);
>         __kfree_skb(skb);
>     }

We had a discussion during netconf, and Brendan Gregg was working on
an idea like that, so that distinct events could be traced/reported.

I prefer letting Brendan submit his patch, which not only refactors
things, but also adds new functionality.

Thanks.
Yafang Shao July 31, 2018, 12:48 a.m. UTC | #10
On Mon, Jul 30, 2018 at 11:56 PM, Eric Dumazet <edumazet@google.com> wrote:
> On Sun, Jul 29, 2018 at 10:40 PM Yafang Shao <laoar.shao@gmail.com> wrote:
>
>> Should we put NET_INC_STATS(sock_net(sk),  mib_idx) into the funtion
>> tcp_drop() ?
>> Then we could easily relate the sk_drops with the SNMP counters.
>>
>> Something like that,
>>
>>     static void tcp_drop(struct sock *sk, struct sk_buff *skb, int mib_idx)
>>     {
>>         int segs = max_t(u16, 1, skb_shinfo(skb)->gso_segs);
>>
>>         atomic_add(segs, &sk->sk_drops);
>>         NET_ADD_STATS(sock_net(sk), mib_idx, segs);
>>         __kfree_skb(skb);
>>     }
>
> We had a discussion during netconf, and Brendan Gregg was working on
> an idea like that,
> so that distinct events could be traced/reported.
>

Oh yes, introducing a new tracepoint for it would be better, something like:
trace_tcp_probe(sk, skb, mib_idx);

> I prefer letting Brendan submit his patch, which not only refactors
> things, but add new functionality.
>

OK.
Patch

diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index d51fa35..90f83eb 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -4802,7 +4802,7 @@  static struct sk_buff *tcp_collapse_one(struct sock *sk, struct sk_buff *skb,
 	else
 		rb_erase(&skb->rbnode, root);
 
-	__kfree_skb(skb);
+	tcp_drop(sk, skb);
 	NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPRCVCOLLAPSED);
 
 	return next;