Message ID | 20200517014451.954F05026DE@novek.ru |
---|---|
State | Changes Requested |
Delegated to: | David Miller |
Series | net/tls: fix encryption error checking |
On Sun, 17 May 2020 02:48:39 +0300 Vadim Fedorenko wrote:
> tls_push_record can return -EAGAIN because of tcp layer. In that
> case open_rec is already in the tx_record list and should not be
> freed.
> Also the record size can be more than the size requested to write
> in tls_sw_do_sendpage(). That leads to overflow of copied variable
> and wrong return code.
>
> Fixes: d10523d0b3d7 ("net/tls: free the record on encryption error")
> Signed-off-by: Vadim Fedorenko <vfedorenko@novek.ru>

Doesn't this return -EAGAIN back to user space? Meaning even tho we
queued the user space will try to send it again?

On 19.05.2020 01:30, Jakub Kicinski wrote:
> > tls_push_record can return -EAGAIN because of tcp layer. In that
> > case open_rec is already in the tx_record list and should not be
> > freed.
> > Also the record size can be more than the size requested to write
> > in tls_sw_do_sendpage(). That leads to overflow of copied variable
> > and wrong return code.
> >
> > Fixes: d10523d0b3d7 ("net/tls: free the record on encryption error")
> > Signed-off-by: Vadim Fedorenko <vfedorenko@novek.ru>
>
> Doesn't this return -EAGAIN back to user space? Meaning even tho we
> queued the user space will try to send it again?

Before patch it was sending negative value back to user space.
After patch it sends the amount of data encrypted in last call. It is checked
by:
    return (copied > 0) ? copied : ret;
and returns -EAGAIN only if data is not sent to open record.

On Tue, 19 May 2020 02:05:29 +0300 Vadim Fedorenko wrote:
> On 19.05.2020 01:30, Jakub Kicinski wrote:
> > > tls_push_record can return -EAGAIN because of tcp layer. In that
> > > case open_rec is already in the tx_record list and should not be
> > > freed.
> > > Also the record size can be more than the size requested to write
> > > in tls_sw_do_sendpage(). That leads to overflow of copied variable
> > > and wrong return code.
> > >
> > > Fixes: d10523d0b3d7 ("net/tls: free the record on encryption error")
> > > Signed-off-by: Vadim Fedorenko <vfedorenko@novek.ru>
> >
> > Doesn't this return -EAGAIN back to user space? Meaning even tho we
> > queued the user space will try to send it again?
> Before patch it was sending negative value back to user space.
> After patch it sends the amount of data encrypted in last call. It is checked
> by:
>     return (copied > 0) ? copied : ret;
> and returns -EAGAIN only if data is not sent to open record.

I see, you're fixing two different bugs in one patch. Could you please
split the fixes into two? (BTW no need for parenthesis around the
condition in the ternary operator.) I think you need more fixes tags,
too. Commit d3b18ad31f93 ("tls: add bpf support to sk_msg handling")
already added one instance of the problem, right?

What do you think about Pooja's patch to consume the EAGAIN earlier?
There doesn't seem to be anything reasonable we can do with the error
anyway, not sure there is a point checking for it..

On 19.05.2020 02:23, Jakub Kicinski wrote:
> On Tue, 19 May 2020 02:05:29 +0300 Vadim Fedorenko wrote:
>> On 19.05.2020 01:30, Jakub Kicinski wrote:
>>>> tls_push_record can return -EAGAIN because of tcp layer. In that
>>>> case open_rec is already in the tx_record list and should not be
>>>> freed.
>>>> Also the record size can be more than the size requested to write
>>>> in tls_sw_do_sendpage(). That leads to overflow of copied variable
>>>> and wrong return code.
>>>>
>>>> Fixes: d10523d0b3d7 ("net/tls: free the record on encryption error")
>>>> Signed-off-by: Vadim Fedorenko <vfedorenko@novek.ru>
>>> Doesn't this return -EAGAIN back to user space? Meaning even tho we
>>> queued the user space will try to send it again?
>> Before patch it was sending negative value back to user space.
>> After patch it sends the amount of data encrypted in last call. It is checked
>> by:
>>     return (copied > 0) ? copied : ret;
>> and returns -EAGAIN only if data is not sent to open record.
> I see, you're fixing two different bugs in one patch. Could you please
> split the fixes into two? (BTW no need for parenthesis around the
> condition in the ternary operator.) I think you need more fixes tags,
> too. Commit d3b18ad31f93 ("tls: add bpf support to sk_msg handling")
> already added one instance of the problem, right?

Sure, will split it into two. Also the problem with overflow is possible in
tls_sw_sendmsg(). But I'm not sure about correctness of freeing whole
open record in bpf_exec_tx_verdict.

> What do you think about Pooja's patch to consume the EAGAIN earlier?
> There doesn't seem to be anything reasonable we can do with the error
> anyway, not sure there is a point checking for it..

Yes, it's a good idea to consume this error earlier. I think it's better to fix
tls_push_record() instead of dealing with it every possible caller.

So I suggest to accept Pooja's patch and will resend only ssize_t checking fix.

On Tue, 19 May 2020 02:55:16 +0300 Vadim Fedorenko wrote:
> On 19.05.2020 02:23, Jakub Kicinski wrote:
> > On Tue, 19 May 2020 02:05:29 +0300 Vadim Fedorenko wrote:
> >> On 19.05.2020 01:30, Jakub Kicinski wrote:
> >>>> tls_push_record can return -EAGAIN because of tcp layer. In that
> >>>> case open_rec is already in the tx_record list and should not be
> >>>> freed.
> >>>> Also the record size can be more than the size requested to write
> >>>> in tls_sw_do_sendpage(). That leads to overflow of copied variable
> >>>> and wrong return code.
> >>>>
> >>>> Fixes: d10523d0b3d7 ("net/tls: free the record on encryption error")
> >>>> Signed-off-by: Vadim Fedorenko <vfedorenko@novek.ru>
> >>> Doesn't this return -EAGAIN back to user space? Meaning even tho we
> >>> queued the user space will try to send it again?
> >> Before patch it was sending negative value back to user space.
> >> After patch it sends the amount of data encrypted in last call. It is checked
> >> by:
> >>     return (copied > 0) ? copied : ret;
> >> and returns -EAGAIN only if data is not sent to open record.
> > I see, you're fixing two different bugs in one patch. Could you please
> > split the fixes into two? (BTW no need for parenthesis around the
> > condition in the ternary operator.) I think you need more fixes tags,
> > too. Commit d3b18ad31f93 ("tls: add bpf support to sk_msg handling")
> > already added one instance of the problem, right?
> Sure, will split it into two. Also the problem with overflow is possible in
> tls_sw_sendmsg(). But I'm not sure about correctness of freeing whole
> open record in bpf_exec_tx_verdict.

Yeah, as a matter of fact checking if copied is negative is just
papering over the issue. Cleaning up the record so it can be
re-submitted again would be better.

> > What do you think about Pooja's patch to consume the EAGAIN earlier?
> > There doesn't seem to be anything reasonable we can do with the error
> > anyway, not sure there is a point checking for it..
> Yes, it's a good idea to consume this error earlier. I think it's better to fix
> tls_push_record() instead of dealing with it every possible caller.
>
> So I suggest to accept Pooja's patch and will resend only ssize_t checking fix.

Cool, thanks!

diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c
index e23f94a..d4acbd1 100644
--- a/net/tls/tls_sw.c
+++ b/net/tls/tls_sw.c
@@ -796,7 +796,7 @@ static int bpf_exec_tx_verdict(struct sk_msg *msg, struct sock *sk,
 	psock = sk_psock_get(sk);
 	if (!psock || !policy) {
 		err = tls_push_record(sk, flags, record_type);
-		if (err && err != -EINPROGRESS) {
+		if (err && err != -EINPROGRESS && err != -EAGAIN) {
 			*copied -= sk_msg_free(sk, msg);
 			tls_free_open_rec(sk);
 		}
@@ -824,7 +824,7 @@ static int bpf_exec_tx_verdict(struct sk_msg *msg, struct sock *sk,
 	switch (psock->eval) {
 	case __SK_PASS:
 		err = tls_push_record(sk, flags, record_type);
-		if (err && err != -EINPROGRESS) {
+		if (err && err != -EINPROGRESS && err != -EAGAIN) {
 			*copied -= sk_msg_free(sk, msg);
 			tls_free_open_rec(sk);
 			goto out_err;
@@ -1132,7 +1132,7 @@ static int tls_sw_do_sendpage(struct sock *sk, struct page *page,
 	struct sk_msg *msg_pl;
 	struct tls_rec *rec;
 	int num_async = 0;
-	size_t copied = 0;
+	ssize_t copied = 0;
 	bool full_record;
 	int record_room;
 	int ret = 0;
@@ -1234,7 +1234,7 @@ static int tls_sw_do_sendpage(struct sock *sk, struct page *page,
 	}
 sendpage_end:
 	ret = sk_stream_error(sk, flags, ret);
-	return copied ? copied : ret;
+	return (copied > 0) ? copied : ret;
 }
 
 int tls_sw_sendpage_locked(struct sock *sk, struct page *page,

tls_push_record can return -EAGAIN because of tcp layer. In that
case open_rec is already in the tx_record list and should not be
freed.
Also the record size can be more than the size requested to write
in tls_sw_do_sendpage(). That leads to overflow of copied variable
and wrong return code.

Fixes: d10523d0b3d7 ("net/tls: free the record on encryption error")
Signed-off-by: Vadim Fedorenko <vfedorenko@novek.ru>
---
 net/tls/tls_sw.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)
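
The first half of the description comes down to which return values of tls_push_record() should lead to freeing the open record. As a rough user-space illustration only, the condition from the patch can be written as a predicate. The names should_free_open_rec() and check_examples() are hypothetical and do not exist in the kernel, and -ENOMEM is used merely as an example of some other error.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical helper mirroring the patched condition:
 * free the open record only on errors other than
 * -EINPROGRESS and -EAGAIN. */
static bool should_free_open_rec(int err)
{
	return err && err != -EINPROGRESS && err != -EAGAIN;
}

/* Small usage example for the sketch above. */
static void check_examples(void)
{
	assert(!should_free_open_rec(0));		/* success */
	assert(!should_free_open_rec(-EINPROGRESS));	/* excluded before the patch too */
	assert(!should_free_open_rec(-EAGAIN));		/* record already queued */
	assert(should_free_open_rec(-ENOMEM));		/* any other error */
}
```

With the pre-patch condition the -EAGAIN case would have returned true, which is the situation the description warns about: the record is already in the tx_record list and must not be freed.
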