diff mbox series

[net] net/tls(TLS_SW): Fix integrity issue with non-blocking sw KTLS request

Message ID 1589732796-22839-1-git-send-email-pooja.trivedi@stackpath.com
State Changes Requested
Delegated to: David Miller
Headers show
Series [net] net/tls(TLS_SW): Fix integrity issue with non-blocking sw KTLS request | expand

Commit Message

Pooja Trivedi May 17, 2020, 4:26 p.m. UTC
In pure sw ktls(AES-NI), -EAGAIN from tcp layer (do_tcp_sendpages for
encrypted record) gets treated as error, subtracts the offset, and
returns to application. Because of this, application sends data from
subtracted offset, which leads to data integrity issue. Since record is
already encrypted, ktls module marks it as partially sent and pushes the
packet to tcp layer in the following iterations (either from bottom half
or when pushing next chunk). So returning success in case of EAGAIN
will fix the issue.

Fixes: a42055e8d2c3 ("net/tls: Add support for async encryption")
Signed-off-by: Pooja Trivedi <pooja.trivedi@stackpath.com>
Reviewed-by: Mallesham Jatharkonda <mallesham.jatharkonda@oneconvergence.com>
Reviewed-by: Josh Tway <josh.tway@stackpath.com>
---
 net/tls/tls_sw.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

Comments

Jakub Kicinski May 18, 2020, 10:50 p.m. UTC | #1
On Sun, 17 May 2020 16:26:36 +0000 Pooja Trivedi wrote:
> In pure sw ktls(AES-NI), -EAGAIN from tcp layer (do_tcp_sendpages for
> encrypted record) gets treated as error, subtracts the offset, and
> returns to application. Because of this, application sends data from
> subtracted offset, which leads to data integrity issue. Since record is
> already encrypted, ktls module marks it as partially sent and pushes the
> packet to tcp layer in the following iterations (either from bottom half
> or when pushing next chunk). So returning success in case of EAGAIN
> will fix the issue.
> 
> Fixes: a42055e8d2c3 ("net/tls: Add support for async encryption")
> Signed-off-by: Pooja Trivedi <pooja.trivedi@stackpath.com>
> Reviewed-by: Mallesham Jatharkonda <mallesham.jatharkonda@oneconvergence.com>
> Reviewed-by: Josh Tway <josh.tway@stackpath.com>

This looks reasonable, I think. Next time user space calls if no new
buffer space was made available it will get a -EAGAIN, right?

Two questions - is there any particular application or use case that
runs into this? Seems a bit surprising to see a patch from Vadim and
you guys come at the same time.

Could you also add test for this bug? 
In tools/testing/selftests/net/tls.c
Pooja Trivedi May 19, 2020, 5:21 p.m. UTC | #2
On Mon, May 18, 2020 at 6:50 PM Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Sun, 17 May 2020 16:26:36 +0000 Pooja Trivedi wrote:
> > In pure sw ktls(AES-NI), -EAGAIN from tcp layer (do_tcp_sendpages for
> > encrypted record) gets treated as error, subtracts the offset, and
> > returns to application. Because of this, application sends data from
> > subtracted offset, which leads to data integrity issue. Since record is
> > already encrypted, ktls module marks it as partially sent and pushes the
> > packet to tcp layer in the following iterations (either from bottom half
> > or when pushing next chunk). So returning success in case of EAGAIN
> > will fix the issue.
> >
> > Fixes: a42055e8d2c3 ("net/tls: Add support for async encryption")
> > Signed-off-by: Pooja Trivedi <pooja.trivedi@stackpath.com>
> > Reviewed-by: Mallesham Jatharkonda <mallesham.jatharkonda@oneconvergence.com>
> > Reviewed-by: Josh Tway <josh.tway@stackpath.com>
>
> This looks reasonable, I think. Next time user space calls if no new
> buffer space was made available it will get a -EAGAIN, right?
>

Yes, this fix should only affect encrypted record. Plain text calls from
user space should be unaffected.

>
> Two questions - is there any particular application or use case that
> runs into this?
>

We are running into this case when we hit our kTLS-enabled homegrown
webserver with a 'pipeline' test tool, also homegrown. The issue basically
happens whenever the send buffer on the server gets full and TCP layer
returns EAGAIN when attempting to TX the encrypted record. In fact, we
are also able to reproduce the issue by using a simple wget with a large
file, if/when sndbuf fills up.

> Seems a bit surprising to see a patch from Vadim and
> you guys come at the same time.
>

Not familiar with Vadim or her/his patch. Could you please point me to it?

>
> Could you also add test for this bug?
> In tools/testing/selftests/net/tls.c
>

Sure, yes. Let me look into this.
Jakub Kicinski May 19, 2020, 9:42 p.m. UTC | #3
On Tue, 19 May 2020 13:21:56 -0400 Pooja Trivedi wrote:
> On Mon, May 18, 2020 at 6:50 PM Jakub Kicinski <kuba@kernel.org> wrote:
> > On Sun, 17 May 2020 16:26:36 +0000 Pooja Trivedi wrote:  
> > > In pure sw ktls(AES-NI), -EAGAIN from tcp layer (do_tcp_sendpages for
> > > encrypted record) gets treated as error, subtracts the offset, and
> > > returns to application. Because of this, application sends data from
> > > subtracted offset, which leads to data integrity issue. Since record is
> > > already encrypted, ktls module marks it as partially sent and pushes the
> > > packet to tcp layer in the following iterations (either from bottom half
> > > or when pushing next chunk). So returning success in case of EAGAIN
> > > will fix the issue.
> > >
> > > Fixes: a42055e8d2c3 ("net/tls: Add support for async encryption")
> > > Signed-off-by: Pooja Trivedi <pooja.trivedi@stackpath.com>
> > > Reviewed-by: Mallesham Jatharkonda <mallesham.jatharkonda@oneconvergence.com>
> > > Reviewed-by: Josh Tway <josh.tway@stackpath.com>  
> >
> > This looks reasonable, I think. Next time user space calls if no new
> > buffer space was made available it will get a -EAGAIN, right?
> 
> Yes, this fix should only affect encrypted record. Plain text calls from
> user space should be unaffected.

AFAICS if TCP layer is full next call from user space should hit
sk_stream_wait_memory() immediately and if it has MSG_DONTWAIT set 
exit with EAGAIN. Which I believe to be correct behavior.

> > Two questions - is there any particular application or use case that
> > runs into this?
> 
> We are running into this case when we hit our kTLS-enabled homegrown
> webserver with a 'pipeline' test tool, also homegrown. The issue basically
> happens whenever the send buffer on the server gets full and TCP layer
> returns EAGAIN when attempting to TX the encrypted record. In fact, we
> are also able to reproduce the issue by using a simple wget with a large
> file, if/when sndbuf fills up.

I see just a coincidence, then, no worries.

> > Seems a bit surprising to see a patch from Vadim and
> > you guys come at the same time.
> 
> Not familiar with Vadim or her/his patch. Could you please point me to it?

http://patchwork.ozlabs.org/project/netdev/patch/20200517014451.954F05026DE@novek.ru/

> > Could you also add test for this bug?
> > In tools/testing/selftests/net/tls.c
> >  
> 
> Sure, yes. Let me look into this.

Thanks!
Pooja Trivedi May 20, 2020, 7:56 p.m. UTC | #4
On Tue, May 19, 2020 at 5:43 PM Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Tue, 19 May 2020 13:21:56 -0400 Pooja Trivedi wrote:
> > On Mon, May 18, 2020 at 6:50 PM Jakub Kicinski <kuba@kernel.org> wrote:
> > > On Sun, 17 May 2020 16:26:36 +0000 Pooja Trivedi wrote:
> > > > In pure sw ktls(AES-NI), -EAGAIN from tcp layer (do_tcp_sendpages for
> > > > encrypted record) gets treated as error, subtracts the offset, and
> > > > returns to application. Because of this, application sends data from
> > > > subtracted offset, which leads to data integrity issue. Since record is
> > > > already encrypted, ktls module marks it as partially sent and pushes the
> > > > packet to tcp layer in the following iterations (either from bottom half
> > > > or when pushing next chunk). So returning success in case of EAGAIN
> > > > will fix the issue.
> > > >
> > > > Fixes: a42055e8d2c3 ("net/tls: Add support for async encryption")
> > > > Signed-off-by: Pooja Trivedi <pooja.trivedi@stackpath.com>
> > > > Reviewed-by: Mallesham Jatharkonda <mallesham.jatharkonda@oneconvergence.com>
> > > > Reviewed-by: Josh Tway <josh.tway@stackpath.com>
> > >
> > > This looks reasonable, I think. Next time user space calls if no new
> > > buffer space was made available it will get a -EAGAIN, right?
> >
> > Yes, this fix should only affect encrypted record. Plain text calls from
> > user space should be unaffected.
>
> AFAICS if TCP layer is full next call from user space should hit
> sk_stream_wait_memory() immediately and if it has MSG_DONTWAIT set
> exit with EAGAIN. Which I believe to be correct behavior.
>

The flow is tls_sw_sendmsg/tls_sw_do_sendpage --> bpf_exec_tx_verdict -->
tls_push_record --> tls_tx_records --> tls_push_sg --> do_tcp_sendpages

do_tcp_sendpages() sends partial record, 'retry:' label is exercised wherein
do_tcp_sendpages gets called again and returns -EAGAIN.
tls_push_sg sets partially_sent_record/partially_sent_offset and
returns -EAGAIN. -EAGAIN bubbles up to bpf_exec_tx_verdict.
In bpf_exec_tx_verdict, the following code causes 'copied' variable to
get updated to a negative value and returns -EAGAIN.

                err = tls_push_record(sk, flags, record_type);
                if (err && err != -EINPROGRESS) {
                        *copied -= sk_msg_free(sk, msg);
                        tls_free_open_rec(sk);
                }
                return err;

-EAGAIN returned by bpf_exec_tx_verdict causes
tls_sw_sendmsg/tls_sw_do_sendpage to 'continue' in the while loop and
call sk_stream_wait_memory(). sk_stream_wait_memory returns -EAGAIN
also and control reaches the 'send_end:' label. The following return
statement causes a negative 'copied' variable value to be returned to the
user space.

        return copied ? copied : ret;

User space applies this negative value as offset for the next send, causing
part of the record that was already sent to be pushed again.

Hope this clarifies it.


>
> > > Two questions - is there any particular application or use case that
> > > runs into this?
> >
> > We are running into this case when we hit our kTLS-enabled homegrown
> > webserver with a 'pipeline' test tool, also homegrown. The issue basically
> > happens whenever the send buffer on the server gets full and TCP layer
> > returns EAGAIN when attempting to TX the encrypted record. In fact, we
> > are also able to reproduce the issue by using a simple wget with a large
> > file, if/when sndbuf fills up.
>
> I see just a coincidence, then, no worries.
>
> > > Seems a bit surprising to see a patch from Vadim and
> > > you guys come at the same time.
> >
> > Not familiar with Vadim or her/his patch. Could you please point me to it?
>
> http://patchwork.ozlabs.org/project/netdev/patch/20200517014451.954F05026DE@novek.ru/
>

Ah, looks like Vadim ran into the exact issue!

>
> > > Could you also add test for this bug?
> > > In tools/testing/selftests/net/tls.c
> > >
> >
> > Sure, yes. Let me look into this.
>
> Thanks!
Jakub Kicinski May 20, 2020, 8:12 p.m. UTC | #5
On Wed, 20 May 2020 15:56:56 -0400 Pooja Trivedi wrote:
> On Tue, May 19, 2020 at 5:43 PM Jakub Kicinski <kuba@kernel.org> wrote:
> >
> > On Tue, 19 May 2020 13:21:56 -0400 Pooja Trivedi wrote:  
> > > On Mon, May 18, 2020 at 6:50 PM Jakub Kicinski <kuba@kernel.org> wrote:  
> > > > On Sun, 17 May 2020 16:26:36 +0000 Pooja Trivedi wrote:  
> > > > > In pure sw ktls(AES-NI), -EAGAIN from tcp layer (do_tcp_sendpages for
> > > > > encrypted record) gets treated as error, subtracts the offset, and
> > > > > returns to application. Because of this, application sends data from
> > > > > subtracted offset, which leads to data integrity issue. Since record is
> > > > > already encrypted, ktls module marks it as partially sent and pushes the
> > > > > packet to tcp layer in the following iterations (either from bottom half
> > > > > or when pushing next chunk). So returning success in case of EAGAIN
> > > > > will fix the issue.
> > > > >
> > > > > Fixes: a42055e8d2c3 ("net/tls: Add support for async encryption")
> > > > > Signed-off-by: Pooja Trivedi <pooja.trivedi@stackpath.com>
> > > > > Reviewed-by: Mallesham Jatharkonda <mallesham.jatharkonda@oneconvergence.com>
> > > > > Reviewed-by: Josh Tway <josh.tway@stackpath.com>  
> > > >
> > > > This looks reasonable, I think. Next time user space calls if no new
> > > > buffer space was made available it will get a -EAGAIN, right?  
> > >
> > > Yes, this fix should only affect encrypted record. Plain text calls from
> > > user space should be unaffected.  
> >
> > AFAICS if TCP layer is full next call from user space should hit
> > sk_stream_wait_memory() immediately and if it has MSG_DONTWAIT set
> > exit with EAGAIN. Which I believe to be correct behavior.
> >  
> 
> The flow is tls_sw_sendmsg/tls_sw_do_sendpage --> bpf_exec_tx_verdict -->
> tls_push_record --> tls_tx_records --> tls_push_sg --> do_tcp_sendpages
> 
> do_tcp_sendpages() sends partial record, 'retry:' label is exercised wherein
> do_tcp_sendpages gets called again and returns -EAGAIN.
> tls_push_sg sets partially_sent_record/partially_sent_offset and
> returns -EAGAIN. -EAGAIN bubbles up to bpf_exec_tx_verdict.
> In bpf_exec_tx_verdict, the following code causes 'copied' variable to
> get updated to a negative value and returns -EAGAIN.
> 
>                 err = tls_push_record(sk, flags, record_type);
>                 if (err && err != -EINPROGRESS) {
>                         *copied -= sk_msg_free(sk, msg);
>                         tls_free_open_rec(sk);
>                 }
>                 return err;
> 
> -EAGAIN returned by bpf_exec_tx_verdict causes
> tls_sw_sendmsg/tls_sw_do_sendpage to 'continue' in the while loop and
> call sk_stream_wait_memory(). sk_stream_wait_memory returns -EAGAIN
> also and control reaches the 'send_end:' label. The following return
> statement causes a negative 'copied' variable value to be returned to the
> user space.
> 
>         return copied ? copied : ret;
> 
> User space applies this negative value as offset for the next send, causing
> part of the record that was already sent to be pushed again.
> 
> Hope this clarifies it.

Oh yes, sorry I was talking about the behavior _after_ your patch, on
the _next_ sendmsg/sendpage call.

It should now work like this:

	bpf_exec_tx_verdict() returns success, next iteration of the
	sendmsg/sendpage loop hits sk_stream_wait_memory(), we return
	positive copied which is counts the entire record, even though
	some of it is still in partially_sent_record. If user space
	calls sendmsg again we will hit sk_stream_wait_memory() ->
	send_end -> this time copied is 0, so user space will see
	-EAGAIN.

If I'm still not making sense don't worry about it, I think it should
be easy to explain based on the selftest.
diff mbox series

Patch

diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c
index e23f94a..d8ebdfc 100644
--- a/net/tls/tls_sw.c
+++ b/net/tls/tls_sw.c
@@ -415,10 +415,12 @@  int tls_tx_records(struct sock *sk, int flags)
 	}
 
 tx_err:
-	if (rc < 0 && rc != -EAGAIN)
+	if (rc < 0 && rc != -EAGAIN) {
 		tls_err_abort(sk, EBADMSG);
+		return rc;
+	}
 
-	return rc;
+	return 0;
 }
 
 static void tls_encrypt_done(struct crypto_async_request *req, int err)