[bpf-next,v3,0/5] tcp: increase flexibility of EBPF congestion control initialization

Message ID 20200910193536.2980613-1-ncardwell.kernel@gmail.com

Neal Cardwell Sept. 10, 2020, 7:35 p.m. UTC
From: Neal Cardwell <ncardwell@google.com>

This patch series reorganizes TCP congestion control initialization so that if
EBPF code called by tcp_init_transfer() sets the congestion control algorithm
by calling setsockopt(TCP_CONGESTION) then the TCP stack initializes the
congestion control module immediately, instead of having tcp_init_transfer()
later initialize the congestion control module.

This increases flexibility for the EBPF code that runs at connection
establishment time, and simplifies the code.

This has the following benefits:

(1) Allows CC module customizations made by the EBPF code called in
    tcp_init_transfer() to persist, and not be wiped out by a later
    call to tcp_init_congestion_control() in tcp_init_transfer().

(2) Does not flip the order of EBPF and CC init, to avoid causing bugs
    for existing code upstream that depends on the current order.

(3) Does not cause two initializations of the CC in the case where the
    EBPF code called in tcp_init_transfer() wants to set the CC to a new
    CC algorithm.

(4) Allows follow-on simplifications to the code in net/core/filter.c
    and net/ipv4/tcp_cong.c, which currently both have some complexity
    to special-case CC initialization to avoid double CC
    initialization if EBPF sets the CC.
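
The ordering described above can be sketched with a toy userspace model.
This is only an illustration under stated assumptions, not the kernel code:
struct toy_sock, the ca_initialized flag, and every toy_* function are
hypothetical stand-ins, and "bbr"/"cubic" are used purely as example names.
The hook runs first; if it selects a CC, that CC is initialized right away,
and the later init step is skipped.

```c
#include <stdbool.h>
#include <stdio.h>

/* Toy stand-in for per-socket CC state (hypothetical, not a kernel struct). */
struct toy_sock {
    char ca_name[16];    /* name of the selected CC algorithm */
    bool ca_initialized; /* assumed "CC already initialized" marker */
    int  init_calls;     /* number of times CC init has run */
    int  custom_param;   /* a CC setting a hook may tune after init */
};

/* Models CC module init: resets CC state and marks it initialized. */
static void toy_init_cc(struct toy_sock *sk)
{
    sk->custom_param = 0;
    sk->init_calls++;
    sk->ca_initialized = true;
}

/* Models setsockopt(TCP_CONGESTION) from the hook: switch, then init now. */
static void toy_set_cc(struct toy_sock *sk, const char *name)
{
    snprintf(sk->ca_name, sizeof(sk->ca_name), "%s", name);
    toy_init_cc(sk);
}

typedef void (*toy_hook_fn)(struct toy_sock *sk);

/* Models connection establishment: hook first, CC init only if still needed. */
static void toy_init_transfer(struct toy_sock *sk, toy_hook_fn hook)
{
    if (hook)
        hook(sk);
    if (!sk->ca_initialized)
        toy_init_cc(sk);
}

/* Example hook: picks a CC and then customizes it. */
static void toy_hook_pick_and_tune(struct toy_sock *sk)
{
    toy_set_cc(sk, "bbr");
    sk->custom_param = 42;
}

/* Returns a fresh socket with a default CC that is not yet initialized. */
static struct toy_sock toy_new_sock(void)
{
    struct toy_sock sk = { .ca_name = "cubic" };
    return sk;
}
```

In this model, calling toy_init_transfer() with toy_hook_pick_and_tune
runs CC init exactly once and leaves custom_param as the hook set it,
which corresponds to benefits (1) and (3). Calling it with a NULL hook
still initializes the default CC once, so the hook-then-init order of
benefit (2) is kept.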

changes in v2:

o rebase onto bpf-next

o add another follow-on simplification suggested by Martin KaFai Lau:
   "tcp: simplify tcp_set_congestion_control() load=false case"

changes in v3:

o no change in commits

o resent patch series from @gmail.com, since mail from ncardwell@google.com
  stopped being accepted at netdev@vger.kernel.org mid-way through processing
  the v2 patch series (between patches 2 and 3), confusing patchwork about
  which patches belonged to the v2 patch series

Neal Cardwell (5):
  tcp: only init congestion control if not initialized already
  tcp: simplify EBPF TCP_CONGESTION to always init CC
  tcp: simplify tcp_set_congestion_control(): always reinitialize
  tcp: simplify _bpf_setsockopt(): remove flags argument
  tcp: simplify tcp_set_congestion_control() load=false case

 include/net/inet_connection_sock.h |  3 ++-
 include/net/tcp.h                  |  2 +-
 net/core/filter.c                  | 18 ++++--------------
 net/ipv4/tcp.c                     |  3 ++-
 net/ipv4/tcp_cong.c                | 27 +++++++--------------------
 net/ipv4/tcp_input.c               |  4 +++-
 6 files changed, 19 insertions(+), 38 deletions(-)

Comments

Martin KaFai Lau Sept. 11, 2020, 3:28 a.m. UTC | #1
On Thu, Sep 10, 2020 at 03:35:31PM -0400, Neal Cardwell wrote:
> [...]
Acked-by: Martin KaFai Lau <kafai@fb.com>

Alexei Starovoitov Sept. 11, 2020, 4:20 a.m. UTC | #2
On Thu, Sep 10, 2020 at 8:28 PM Martin KaFai Lau <kafai@fb.com> wrote:
>
> On Thu, Sep 10, 2020 at 03:35:31PM -0400, Neal Cardwell wrote:
> > [...]
> Acked-by: Martin KaFai Lau <kafai@fb.com>

Applied.

Martin, thanks for the review.

Neal, please keep Acks when you resubmit patches without changes in the future.
Also please follow up with a selftests/bpf test based on test_progs to
cover the new functionality.

Thanks