Message ID | 1520936711-16784-3-git-send-email-ilpo.jarvinen@helsinki.fi |
---|---|
State | Deferred, archived |
Delegated to: | David Miller |
Headers | show |
Series | tcp: fixes to non-SACK TCP | expand |
On Tue, Mar 13, 2018 at 3:25 AM, Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> wrote: > > If SACK is not enabled and the first cumulative ACK after the RTO > retransmission covers more than the retransmitted skb, a spurious > FRTO undo will trigger (assuming FRTO is enabled for that RTO). > The reason is that any non-retransmitted segment acknowledged will > set FLAG_ORIG_SACK_ACKED in tcp_clean_rtx_queue even if there is > no indication that it would have been delivered for real (the > scoreboard is not kept with TCPCB_SACKED_ACKED bits in the non-SACK > case so the check for that bit won't help like it does with SACK). > Having FLAG_ORIG_SACK_ACKED set results in the spurious FRTO undo > in tcp_process_loss. > > We need to use more strict condition for non-SACK case and check > that none of the cumulatively ACKed segments were retransmitted > to prove that progress is due to original transmissions. Only then > keep FLAG_ORIG_SACK_ACKED set, allowing FRTO undo to proceed in > non-SACK case. > > Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> > --- > net/ipv4/tcp_input.c | 9 +++++++++ > 1 file changed, 9 insertions(+) > > diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c > index 4a26c09..c60745c 100644 > --- a/net/ipv4/tcp_input.c > +++ b/net/ipv4/tcp_input.c > @@ -3166,6 +3166,15 @@ static int tcp_clean_rtx_queue(struct sock *sk, u32 prior_fack, > pkts_acked = rexmit_acked + newdata_acked; > > tcp_remove_reno_sacks(sk, pkts_acked); > + > + /* If any of the cumulatively ACKed segments was > + * retransmitted, non-SACK case cannot confirm that > + * progress was due to original transmission due to > + * lack of TCPCB_SACKED_ACKED bits even if some of > + * the packets may have been never retransmitted. > + */ > + if (flag & FLAG_RETRANS_DATA_ACKED) > + flag &= ~FLAG_ORIG_SACK_ACKED; How about keeping your excellent comment but move the fix to F-RTO code directly so it's more clear? this way the flag remains clear that indicates some never-retransmitted data are acked/sacked. // pseudo code for illustration diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c index 8d480542aa07..f7f3357de618 100644 --- a/net/ipv4/tcp_input.c +++ b/net/ipv4/tcp_input.c @@ -2629,8 +2629,15 @@ static void tcp_process_loss(struct sock *sk, int flag, bool is_dupack, if (tp->frto) { /* F-RTO RFC5682 sec 3.1 (sack enhanced version). */ /* Step 3.b. A timeout is spurious if not all data are * lost, i.e., never-retransmitted data are (s)acked. + * + * If any of the cumulatively ACKed segments was + * retransmitted, non-SACK case cannot confirm that + * progress was due to original transmission due to + * lack of TCPCB_SACKED_ACKED bits even if some of + * the packets may have been never retransmitted. */ if ((flag & FLAG_ORIG_SACK_ACKED) && + (tcp_is_sack(tp) || !FLAG_RETRANS_DATA_ACKED) && tcp_try_undo_loss(sk, true)) return; > } else { > int delta; > > -- > 2.7.4 >
On Wed, 28 Mar 2018, Yuchung Cheng wrote: > On Tue, Mar 13, 2018 at 3:25 AM, Ilpo Järvinen > <ilpo.jarvinen@helsinki.fi> wrote: > > > > If SACK is not enabled and the first cumulative ACK after the RTO > > retransmission covers more than the retransmitted skb, a spurious > > FRTO undo will trigger (assuming FRTO is enabled for that RTO). > > The reason is that any non-retransmitted segment acknowledged will > > set FLAG_ORIG_SACK_ACKED in tcp_clean_rtx_queue even if there is > > no indication that it would have been delivered for real (the > > scoreboard is not kept with TCPCB_SACKED_ACKED bits in the non-SACK > > case so the check for that bit won't help like it does with SACK). > > Having FLAG_ORIG_SACK_ACKED set results in the spurious FRTO undo > > in tcp_process_loss. > > > > We need to use more strict condition for non-SACK case and check > > that none of the cumulatively ACKed segments were retransmitted > > to prove that progress is due to original transmissions. Only then > > keep FLAG_ORIG_SACK_ACKED set, allowing FRTO undo to proceed in > > non-SACK case. > > > > Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> > > --- > > net/ipv4/tcp_input.c | 9 +++++++++ > > 1 file changed, 9 insertions(+) > > > > diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c > > index 4a26c09..c60745c 100644 > > --- a/net/ipv4/tcp_input.c > > +++ b/net/ipv4/tcp_input.c > > @@ -3166,6 +3166,15 @@ static int tcp_clean_rtx_queue(struct sock *sk, u32 prior_fack, > > pkts_acked = rexmit_acked + newdata_acked; > > > > tcp_remove_reno_sacks(sk, pkts_acked); > > + > > + /* If any of the cumulatively ACKed segments was > > + * retransmitted, non-SACK case cannot confirm that > > + * progress was due to original transmission due to > > + * lack of TCPCB_SACKED_ACKED bits even if some of > > + * the packets may have been never retransmitted. > > + */ > > + if (flag & FLAG_RETRANS_DATA_ACKED) > > + flag &= ~FLAG_ORIG_SACK_ACKED; > > How about keeping your excellent comment but move the fix to F-RTO > code directly so it's more clear? this way the flag remains clear that > indicates some never-retransmitted data are acked/sacked. > > // pseudo code for illustration > > diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c > index 8d480542aa07..f7f3357de618 100644 > --- a/net/ipv4/tcp_input.c > +++ b/net/ipv4/tcp_input.c > @@ -2629,8 +2629,15 @@ static void tcp_process_loss(struct sock *sk, > int flag, bool is_dupack, > if (tp->frto) { /* F-RTO RFC5682 sec 3.1 (sack enhanced version). */ > /* Step 3.b. A timeout is spurious if not all data are > * lost, i.e., never-retransmitted data are (s)acked. > + * > + * If any of the cumulatively ACKed segments was > + * retransmitted, non-SACK case cannot confirm that > + * progress was due to original transmission due to > + * lack of TCPCB_SACKED_ACKED bits even if some of > + * the packets may have been never retransmitted. > */ > if ((flag & FLAG_ORIG_SACK_ACKED) && > + (tcp_is_sack(tp) || !FLAG_RETRANS_DATA_ACKED) && > tcp_try_undo_loss(sk, true)) > return; Of course I could put the back there but I really like the new place more (which was a result of your suggestion to place the code elsewhere). IMHO, it makes more sense to have it in tcp_clean_rtx_queue() because we weren't successful in proving (there in tcp_clean_rtx_queue) that progress was due original transmission and thus I would not want falsely indicate it with that flag. And there's the non-SACK related block anyway already there so it keeps the non-SACK "pollution" off from the SACK code paths. (In addition, I'd actually also like to rename FLAG_ORIG_SACK_ACKED to FLAG_ORIG_PROGRESS, the latter is more descriptive about the condition we're after regardless of SACK and less ambiguous in non-SACK case).
On Wed, Apr 4, 2018 at 6:35 AM Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> wrote: > On Wed, 28 Mar 2018, Yuchung Cheng wrote: > > On Tue, Mar 13, 2018 at 3:25 AM, Ilpo Järvinen > > <ilpo.jarvinen@helsinki.fi> wrote: > > > > > > If SACK is not enabled and the first cumulative ACK after the RTO > > > retransmission covers more than the retransmitted skb, a spurious > > > FRTO undo will trigger (assuming FRTO is enabled for that RTO). > > > The reason is that any non-retransmitted segment acknowledged will > > > set FLAG_ORIG_SACK_ACKED in tcp_clean_rtx_queue even if there is > > > no indication that it would have been delivered for real (the > > > scoreboard is not kept with TCPCB_SACKED_ACKED bits in the non-SACK > > > case so the check for that bit won't help like it does with SACK). > > > Having FLAG_ORIG_SACK_ACKED set results in the spurious FRTO undo > > > in tcp_process_loss. > > > > > > We need to use more strict condition for non-SACK case and check > > > that none of the cumulatively ACKed segments were retransmitted > > > to prove that progress is due to original transmissions. Only then > > > keep FLAG_ORIG_SACK_ACKED set, allowing FRTO undo to proceed in > > > non-SACK case. > > > > > > Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> > > > --- > > > net/ipv4/tcp_input.c | 9 +++++++++ > > > 1 file changed, 9 insertions(+) > > > > > > diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c > > > index 4a26c09..c60745c 100644 > > > --- a/net/ipv4/tcp_input.c > > > +++ b/net/ipv4/tcp_input.c > > > @@ -3166,6 +3166,15 @@ static int tcp_clean_rtx_queue(struct sock *sk, u32 prior_fack, > > > pkts_acked = rexmit_acked + newdata_acked; > > > > > > tcp_remove_reno_sacks(sk, pkts_acked); > > > + > > > + /* If any of the cumulatively ACKed segments was > > > + * retransmitted, non-SACK case cannot confirm that > > > + * progress was due to original transmission due to > > > + * lack of TCPCB_SACKED_ACKED bits even if some of > > > + * the packets may have been never retransmitted. > > > + */ > > > + if (flag & FLAG_RETRANS_DATA_ACKED) > > > + flag &= ~FLAG_ORIG_SACK_ACKED; FWIW I'd vote for this version. > Of course I could put the back there but I really like the new place more > (which was a result of your suggestion to place the code elsewhere). > IMHO, it makes more sense to have it in tcp_clean_rtx_queue() because we > weren't successful in proving (there in tcp_clean_rtx_queue) that progress > was due original transmission and thus I would not want falsely indicate > it with that flag. And there's the non-SACK related block anyway already > there so it keeps the non-SACK "pollution" off from the SACK code paths. I think that's a compelling argument. In particular, in terms of long-term maintenance it seems risky to allow an unsound/buggy FLAG_ORIG_SACK_ACKED bit be returned by tcp_clean_rtx_queue(). If we return an incorrect/imcomplete FLAG_ORIG_SACK_ACKED bit then I worry that one day we will forget that for non-SACK flows that bit is incorrect/imcomplete, and we will add code using that bit but forgetting to check (tcp_is_sack(tp) || !FLAG_RETRANS_DATA_ACKED). > (In addition, I'd actually also like to rename FLAG_ORIG_SACK_ACKED to > FLAG_ORIG_PROGRESS, the latter is more descriptive about the condition > we're after regardless of SACK and less ambiguous in non-SACK case). I'm neutral on this. Not sure if the extra clarity is worth the code churn. cheers, neal
On Wed, Apr 4, 2018 at 9:33 AM, Neal Cardwell <ncardwell@google.com> wrote: > > On Wed, Apr 4, 2018 at 6:35 AM Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> > wrote: > > > On Wed, 28 Mar 2018, Yuchung Cheng wrote: > > > > On Tue, Mar 13, 2018 at 3:25 AM, Ilpo Järvinen > > > <ilpo.jarvinen@helsinki.fi> wrote: > > > > > > > > If SACK is not enabled and the first cumulative ACK after the RTO > > > > retransmission covers more than the retransmitted skb, a spurious > > > > FRTO undo will trigger (assuming FRTO is enabled for that RTO). > > > > The reason is that any non-retransmitted segment acknowledged will > > > > set FLAG_ORIG_SACK_ACKED in tcp_clean_rtx_queue even if there is > > > > no indication that it would have been delivered for real (the > > > > scoreboard is not kept with TCPCB_SACKED_ACKED bits in the non-SACK > > > > case so the check for that bit won't help like it does with SACK). > > > > Having FLAG_ORIG_SACK_ACKED set results in the spurious FRTO undo > > > > in tcp_process_loss. > > > > > > > > We need to use more strict condition for non-SACK case and check > > > > that none of the cumulatively ACKed segments were retransmitted > > > > to prove that progress is due to original transmissions. Only then > > > > keep FLAG_ORIG_SACK_ACKED set, allowing FRTO undo to proceed in > > > > non-SACK case. > > > > > > > > Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> > > > > --- > > > > net/ipv4/tcp_input.c | 9 +++++++++ > > > > 1 file changed, 9 insertions(+) > > > > > > > > diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c > > > > index 4a26c09..c60745c 100644 > > > > --- a/net/ipv4/tcp_input.c > > > > +++ b/net/ipv4/tcp_input.c > > > > @@ -3166,6 +3166,15 @@ static int tcp_clean_rtx_queue(struct sock > *sk, u32 prior_fack, > > > > pkts_acked = rexmit_acked + > newdata_acked; > > > > > > > > tcp_remove_reno_sacks(sk, pkts_acked); > > > > + > > > > + /* If any of the cumulatively ACKed segments > was > > > > + * retransmitted, non-SACK case cannot > confirm that > > > > + * progress was due to original transmission > due to > > > > + * lack of TCPCB_SACKED_ACKED bits even if > some of > > > > + * the packets may have been never > retransmitted. > > > > + */ > > > > + if (flag & FLAG_RETRANS_DATA_ACKED) > > > > + flag &= ~FLAG_ORIG_SACK_ACKED; > > FWIW I'd vote for this version. > > > Of course I could put the back there but I really like the new place more > > (which was a result of your suggestion to place the code elsewhere). > > IMHO, it makes more sense to have it in tcp_clean_rtx_queue() because we > > weren't successful in proving (there in tcp_clean_rtx_queue) that progress > > was due original transmission and thus I would not want falsely indicate > > it with that flag. And there's the non-SACK related block anyway already > > there so it keeps the non-SACK "pollution" off from the SACK code paths. > > I think that's a compelling argument. In particular, in terms of long-term > maintenance it seems risky to allow an unsound/buggy FLAG_ORIG_SACK_ACKED > bit be returned by tcp_clean_rtx_queue(). If we return an > incorrect/imcomplete FLAG_ORIG_SACK_ACKED bit then I worry that one day we > will forget that for non-SACK flows that bit is incorrect/imcomplete, and > we will add code using that bit but forgetting to check (tcp_is_sack(tp) || > !FLAG_RETRANS_DATA_ACKED). Agreed. That's a good point. And I would much preferred to rename that to FLAG_ORIG_PROGRESS (w/ updated comment). so I think we're in agreement to use existing patch w/ the new name FLAG_ORIG_PROGRESS > > > (In addition, I'd actually also like to rename FLAG_ORIG_SACK_ACKED to > > FLAG_ORIG_PROGRESS, the latter is more descriptive about the condition > > we're after regardless of SACK and less ambiguous in non-SACK case). > > I'm neutral on this. Not sure if the extra clarity is worth the code churn. > > cheers, > neal
n Wed, Apr 4, 2018 at 1:13 PM Yuchung Cheng <ycheng@google.com> wrote: > Agreed. That's a good point. And I would much preferred to rename that > to FLAG_ORIG_PROGRESS (w/ updated comment). > so I think we're in agreement to use existing patch w/ the new name > FLAG_ORIG_PROGRESS Yes, SGTM. I guess this "prevent bogus FRTO undos" patch would go to "net" branch and the s/FLAG_ORIG_SACK_ACKED/FLAG_ORIG_PROGRESS/ would go in "net-next" branch? neal
On Wed, Apr 4, 2018 at 10:22 AM, Neal Cardwell <ncardwell@google.com> wrote: > n Wed, Apr 4, 2018 at 1:13 PM Yuchung Cheng <ycheng@google.com> wrote: >> Agreed. That's a good point. And I would much preferred to rename that >> to FLAG_ORIG_PROGRESS (w/ updated comment). > >> so I think we're in agreement to use existing patch w/ the new name >> FLAG_ORIG_PROGRESS > > Yes, SGTM. > > I guess this "prevent bogus FRTO undos" patch would go to "net" branch and > the s/FLAG_ORIG_SACK_ACKED/FLAG_ORIG_PROGRESS/ would go in "net-next" > branch? huh? why not one patch ... this is getting close to patch-split paralyses. > > neal
On Wed, Apr 4, 2018 at 1:41 PM Yuchung Cheng <ycheng@google.com> wrote: > On Wed, Apr 4, 2018 at 10:22 AM, Neal Cardwell <ncardwell@google.com> wrote: > > n Wed, Apr 4, 2018 at 1:13 PM Yuchung Cheng <ycheng@google.com> wrote: > >> Agreed. That's a good point. And I would much preferred to rename that > >> to FLAG_ORIG_PROGRESS (w/ updated comment). > > > >> so I think we're in agreement to use existing patch w/ the new name > >> FLAG_ORIG_PROGRESS > > > > Yes, SGTM. > > > > I guess this "prevent bogus FRTO undos" patch would go to "net" branch and > > the s/FLAG_ORIG_SACK_ACKED/FLAG_ORIG_PROGRESS/ would go in "net-next" > > branch? > huh? why not one patch ... this is getting close to patch-split paralyses. The flag rename seemed like a cosmetic issue that was not needed for the fix. Smelled like net-next to me. But I don't feel strongly. However you guys want to package it is fine with me. :-) neal
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c index 4a26c09..c60745c 100644 --- a/net/ipv4/tcp_input.c +++ b/net/ipv4/tcp_input.c @@ -3166,6 +3166,15 @@ static int tcp_clean_rtx_queue(struct sock *sk, u32 prior_fack, pkts_acked = rexmit_acked + newdata_acked; tcp_remove_reno_sacks(sk, pkts_acked); + + /* If any of the cumulatively ACKed segments was + * retransmitted, non-SACK case cannot confirm that + * progress was due to original transmission due to + * lack of TCPCB_SACKED_ACKED bits even if some of + * the packets may have been never retransmitted. + */ + if (flag & FLAG_RETRANS_DATA_ACKED) + flag &= ~FLAG_ORIG_SACK_ACKED; } else { int delta;
If SACK is not enabled and the first cumulative ACK after the RTO retransmission covers more than the retransmitted skb, a spurious FRTO undo will trigger (assuming FRTO is enabled for that RTO). The reason is that any non-retransmitted segment acknowledged will set FLAG_ORIG_SACK_ACKED in tcp_clean_rtx_queue even if there is no indication that it would have been delivered for real (the scoreboard is not kept with TCPCB_SACKED_ACKED bits in the non-SACK case so the check for that bit won't help like it does with SACK). Having FLAG_ORIG_SACK_ACKED set results in the spurious FRTO undo in tcp_process_loss. We need to use more strict condition for non-SACK case and check that none of the cumulatively ACKed segments were retransmitted to prove that progress is due to original transmissions. Only then keep FLAG_ORIG_SACK_ACKED set, allowing FRTO undo to proceed in non-SACK case. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> --- net/ipv4/tcp_input.c | 9 +++++++++ 1 file changed, 9 insertions(+)