diff mbox series

net/tls: fix encryption error checking

Message ID 20200517014451.954F05026DE@novek.ru
State Changes Requested
Delegated to: David Miller
Series net/tls: fix encryption error checking

Commit Message

Vadim Fedorenko May 16, 2020, 11:48 p.m. UTC
tls_push_record() can return -EAGAIN from the TCP layer. In that
case open_rec is already on the tx_record list and must not be
freed.
Also, the record size can be larger than the size requested to write
in tls_sw_do_sendpage(). That makes the unsigned copied variable wrap
around and produces a wrong return code.

Fixes: d10523d0b3d7 ("net/tls: free the record on encryption error")
Signed-off-by: Vadim Fedorenko <vfedorenko@novek.ru>
---
 net/tls/tls_sw.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)
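
For illustration, the condition that both bpf_exec_tx_verdict() hunks change
can be written as a standalone predicate. This is only a sketch: the helper
name push_error_frees_open_rec() is hypothetical and does not exist in
net/tls/tls_sw.c; it just mirrors the check from the patch.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical helper (not part of the kernel): decide whether an error
 * from tls_push_record() should free the open record.
 *
 * -EINPROGRESS: encryption was handed off asynchronously.
 * -EAGAIN:      per the commit message, the record is already on the
 *               tx_record list, so freeing it here would be wrong.
 */
static bool push_error_frees_open_rec(int err)
{
	return err && err != -EINPROGRESS && err != -EAGAIN;
}
```

With this predicate, 0, -EINPROGRESS and -EAGAIN leave the record alone,
while any other error (for example -ENOMEM) still takes the cleanup path.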

Comments

Jakub Kicinski May 18, 2020, 10:30 p.m. UTC | #1
On Sun, 17 May 2020 02:48:39 +0300 Vadim Fedorenko wrote:
> tls_push_record can return -EAGAIN because of tcp layer. In that
> case open_rec is already in the tx_record list and should not be
> freed.
> Also the record size can be more than the size requested to write
> in tls_sw_do_sendpage(). That leads to overflow of copied variable
> and wrong return code.
> 
> Fixes: d10523d0b3d7 ("net/tls: free the record on encryption error")
> Signed-off-by: Vadim Fedorenko <vfedorenko@novek.ru>

Doesn't this return -EAGAIN back to user space? Meaning that even though
we queued the record, user space will try to send it again?
Vadim Fedorenko May 18, 2020, 11:05 p.m. UTC | #2
On 19.05.2020 01:30, Jakub Kicinski wrote:
> > tls_push_record can return -EAGAIN because of tcp layer. In that
> > case open_rec is already in the tx_record list and should not be
> > freed.
> > Also the record size can be more than the size requested to write
> > in tls_sw_do_sendpage(). That leads to overflow of copied variable
> > and wrong return code.
> >
> > Fixes: d10523d0b3d7 ("net/tls: free the record on encryption error")
> > Signed-off-by: Vadim Fedorenko <vfedorenko@novek.ru>
>
> Doesn't this return -EAGAIN back to user space? Meaning even tho we
> queued the user space will try to send it again?
Before the patch, it returned a negative value to user space.
After the patch, it returns the amount of data encrypted in the last call.
This is checked by:
  return (copied > 0) ? copied : ret;
which returns -EAGAIN only if no data was sent to the open record.
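
A small user-space sketch of the size_t versus ssize_t behaviour described
above. The function names and the byte counts are made up for illustration;
only the two return expressions are taken from the patch. It assumes the
error path subtracts the size of the whole freed record from copied, as
`*copied -= sk_msg_free(sk, msg)` does.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>

/* Old logic: copied is unsigned, so subtracting more than was added wraps. */
static int ret_unsigned(size_t requested, size_t freed, int ret)
{
	size_t copied = 0;

	copied += requested;	/* bytes accepted from the caller */
	copied -= freed;	/* freed record may be larger than requested */

	/*
	 * copied is now a huge non-zero value, so ret is ignored. Converting
	 * it to int is implementation-defined and yields a bogus result.
	 */
	return (int)(copied ? copied : (size_t)ret);
}

/* Patched logic: copied is signed and only a positive count is returned. */
static int ret_signed(size_t requested, size_t freed, int ret)
{
	ssize_t copied = 0;

	copied += (ssize_t)requested;
	copied -= (ssize_t)freed;

	return (copied > 0) ? (int)copied : ret;
}
```

With requested = 100 and freed = 300, ret_signed() returns the error code
passed in, whereas ret_unsigned() returns a value derived from the wrapped
counter instead of the error.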
Jakub Kicinski May 18, 2020, 11:23 p.m. UTC | #3
On Tue, 19 May 2020 02:05:29 +0300 Vadim Fedorenko wrote:
> On 19.05.2020 01:30, Jakub Kicinski wrote:
> > > tls_push_record can return -EAGAIN because of tcp layer. In that
> > > case open_rec is already in the tx_record list and should not be
> > > freed.
> > > Also the record size can be more than the size requested to write
> > > in tls_sw_do_sendpage(). That leads to overflow of copied variable
> > > and wrong return code.
> > >
> > > Fixes: d10523d0b3d7 ("net/tls: free the record on encryption error")
> > > Signed-off-by: Vadim Fedorenko <vfedorenko@novek.ru>  
> >
> > Doesn't this return -EAGAIN back to user space? Meaning even tho we
> > queued the user space will try to send it again?  
> Before patch it was sending negative value back to user space.
> After patch it sends the amount of data encrypted in last call. It is checked
> by:
>   return (copied > 0) ? copied : ret;
> and returns -EAGAIN only if data is not sent to open record.

I see, you're fixing two different bugs in one patch. Could you please
split the fixes into two? (BTW no need for parentheses around the
condition in the ternary operator.) I think you need more Fixes tags,
too. Commit d3b18ad31f93 ("tls: add bpf support to sk_msg handling")
already added one instance of the problem, right?

What do you think about Pooja's patch to consume the EAGAIN earlier?
There doesn't seem to be anything reasonable we can do with the error
anyway, not sure there is a point checking for it..
Vadim Fedorenko May 18, 2020, 11:55 p.m. UTC | #4
On 19.05.2020 02:23, Jakub Kicinski wrote:
> On Tue, 19 May 2020 02:05:29 +0300 Vadim Fedorenko wrote:
>> On 19.05.2020 01:30, Jakub Kicinski wrote:
>>>> tls_push_record can return -EAGAIN because of tcp layer. In that
>>>> case open_rec is already in the tx_record list and should not be
>>>> freed.
>>>> Also the record size can be more than the size requested to write
>>>> in tls_sw_do_sendpage(). That leads to overflow of copied variable
>>>> and wrong return code.
>>>>
>>>> Fixes: d10523d0b3d7 ("net/tls: free the record on encryption error")
>>>> Signed-off-by: Vadim Fedorenko <vfedorenko@novek.ru>
>>> Doesn't this return -EAGAIN back to user space? Meaning even tho we
>>> queued the user space will try to send it again?
>> Before patch it was sending negative value back to user space.
>> After patch it sends the amount of data encrypted in last call. It is checked
>> by:
>>    return (copied > 0) ? copied : ret;
>> and returns -EAGAIN only if data is not sent to open record.
> I see, you're fixing two different bugs in one patch. Could you please
> split the fixes into two? (BTW no need for parenthesis around the
> condition in the ternary operator.) I think you need more fixes tags,
> too. Commit d3b18ad31f93 ("tls: add bpf support to sk_msg handling")
> already added one instance of the problem, right?
Sure, I will split it into two. The overflow problem is also possible in
tls_sw_sendmsg(). But I'm not sure about the correctness of freeing the whole
open record in bpf_exec_tx_verdict().
> What do you think about Pooja's patch to consume the EAGAIN earlier?
> There doesn't seem to be anything reasonable we can do with the error
> anyway, not sure there is a point checking for it..
Yes, it's a good idea to consume this error earlier. I think it's better to fix
tls_push_record() instead of dealing with it in every possible caller.

So I suggest accepting Pooja's patch, and I will resend only the ssize_t checking fix.
Jakub Kicinski May 19, 2020, 12:26 a.m. UTC | #5
On Tue, 19 May 2020 02:55:16 +0300 Vadim Fedorenko wrote:
> On 19.05.2020 02:23, Jakub Kicinski wrote:
> > On Tue, 19 May 2020 02:05:29 +0300 Vadim Fedorenko wrote:  
> >> On 19.05.2020 01:30, Jakub Kicinski wrote:  
> >>>> tls_push_record can return -EAGAIN because of tcp layer. In that
> >>>> case open_rec is already in the tx_record list and should not be
> >>>> freed.
> >>>> Also the record size can be more than the size requested to write
> >>>> in tls_sw_do_sendpage(). That leads to overflow of copied variable
> >>>> and wrong return code.
> >>>>
> >>>> Fixes: d10523d0b3d7 ("net/tls: free the record on encryption error")
> >>>> Signed-off-by: Vadim Fedorenko <vfedorenko@novek.ru>  
> >>> Doesn't this return -EAGAIN back to user space? Meaning even tho we
> >>> queued the user space will try to send it again?  
> >> Before patch it was sending negative value back to user space.
> >> After patch it sends the amount of data encrypted in last call. It is checked
> >> by:
> >>    return (copied > 0) ? copied : ret;
> >> and returns -EAGAIN only if data is not sent to open record.  
> > I see, you're fixing two different bugs in one patch. Could you please
> > split the fixes into two? (BTW no need for parenthesis around the
> > condition in the ternary operator.) I think you need more fixes tags,
> > too. Commit d3b18ad31f93 ("tls: add bpf support to sk_msg handling")
> > already added one instance of the problem, right?  
> Sure, will split it into two. Also the problem with overflow is possible in
> tls_sw_sendmsg(). But I'm not sure about correctness of freeing whole
> open record in bpf_exec_tx_verdict.

Yeah, as a matter of fact, checking whether copied is negative is just
papering over the issue. Cleaning up the record so it can be
re-submitted would be better.

> > What do you think about Pooja's patch to consume the EAGAIN earlier?
> > There doesn't seem to be anything reasonable we can do with the error
> > anyway, not sure there is a point checking for it..  
> Yes, it's a good idea to consume this error earlier. I think it's better to fix
> tls_push_record() instead of dealing with it every possible caller.
> 
> So I suggest to accept Pooja's patch and will resend only ssize_t checking fix.

Cool, thanks!

Patch

diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c
index e23f94a..d4acbd1 100644
--- a/net/tls/tls_sw.c
+++ b/net/tls/tls_sw.c
@@ -796,7 +796,7 @@  static int bpf_exec_tx_verdict(struct sk_msg *msg, struct sock *sk,
 	psock = sk_psock_get(sk);
 	if (!psock || !policy) {
 		err = tls_push_record(sk, flags, record_type);
-		if (err && err != -EINPROGRESS) {
+		if (err && err != -EINPROGRESS && err != -EAGAIN) {
 			*copied -= sk_msg_free(sk, msg);
 			tls_free_open_rec(sk);
 		}
@@ -824,7 +824,7 @@  static int bpf_exec_tx_verdict(struct sk_msg *msg, struct sock *sk,
 	switch (psock->eval) {
 	case __SK_PASS:
 		err = tls_push_record(sk, flags, record_type);
-		if (err && err != -EINPROGRESS) {
+		if (err && err != -EINPROGRESS && err != -EAGAIN) {
 			*copied -= sk_msg_free(sk, msg);
 			tls_free_open_rec(sk);
 			goto out_err;
@@ -1132,7 +1132,7 @@  static int tls_sw_do_sendpage(struct sock *sk, struct page *page,
 	struct sk_msg *msg_pl;
 	struct tls_rec *rec;
 	int num_async = 0;
-	size_t copied = 0;
+	ssize_t copied = 0;
 	bool full_record;
 	int record_room;
 	int ret = 0;
@@ -1234,7 +1234,7 @@  static int tls_sw_do_sendpage(struct sock *sk, struct page *page,
 	}
 sendpage_end:
 	ret = sk_stream_error(sk, flags, ret);
-	return copied ? copied : ret;
+	return (copied > 0) ? copied : ret;
 }
 
 int tls_sw_sendpage_locked(struct sock *sk, struct page *page,