
net: ipv4: emulate READ_ONCE() on ->hdrincl bit-field in raw_sendmsg()

Message ID 20180102163020.32473-1-nstange@suse.de
State Changes Requested, archived
Delegated to: David Miller

Commit Message

Nicolai Stange Jan. 2, 2018, 4:30 p.m. UTC
Commit 8f659a03a0ba ("net: ipv4: fix for a race condition in
raw_sendmsg") fixed the issue of possibly inconsistent ->hdrincl handling
due to concurrent updates by reading this bit-field member into a local
variable and using the thus stabilized value in subsequent tests.

However, the aforementioned commit also adds the (correct) comment that

  /* hdrincl should be READ_ONCE(inet->hdrincl)
   * but READ_ONCE() doesn't work with bit fields
   */

because, as it stands, the compiler is free to short-circuit or even
eliminate the local variable at will.

Note that I have not seen anything like this happen in practice; the
concern is thus a theoretical one.

However, in order to be on the safe side, emulate a READ_ONCE() on the
bit-field by introducing an intermediate local variable and doing a
READ_ONCE() from it:

	int __hdrincl = inet->hdrincl;
	int hdrincl = READ_ONCE(__hdrincl);

This breaks the chain in the sense that the compiler is not allowed
to replace subsequent reads from hdrincl with reloads from inet->hdrincl.

Fixes: 8f659a03a0ba ("net: ipv4: fix for a race condition in raw_sendmsg")
Signed-off-by: Nicolai Stange <nstange@suse.de>
---
 Compile-tested only (with inspection of compiler output on x86_64).
 Applicable to linux-next-20180102.

 net/ipv4/raw.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)
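
The emulation described in the commit message can be tried outside the
kernel. The following is only a minimal userspace sketch under stated
assumptions: READ_ONCE_SKETCH() is a simplified stand-in for the kernel's
READ_ONCE() (a single volatile load, relying on the GCC/Clang __typeof__
extension), and struct fake_inet, snapshot_hdrincl() and demo() are
hypothetical names that do not appear in the patch.

```c
#include <assert.h>

/* Simplified stand-in for the kernel's READ_ONCE(): one volatile load.
 * It takes the address of its argument, so it cannot be applied to a
 * bit-field directly. __typeof__ is a GCC/Clang extension.
 */
#define READ_ONCE_SKETCH(x) (*(const volatile __typeof__(x) *)&(x))

/* Hypothetical stand-in for struct inet_sock; only the bit-field matters. */
struct fake_inet {
	unsigned int hdrincl:1;
	unsigned int other:7;
};

/* Take one snapshot of the bit-field, mirroring the pattern in the patch. */
int snapshot_hdrincl(const struct fake_inet *inet)
{
	int tmp = inet->hdrincl;		/* plain load of the bit-field */
	int hdrincl = READ_ONCE_SKETCH(tmp);	/* volatile load of the copy */

	return hdrincl;
}

/* Self-check: the snapshot reflects the field's value at call time. */
void demo(void)
{
	struct fake_inet inet = { .hdrincl = 1, .other = 0 };

	assert(snapshot_hdrincl(&inet) == 1);
	inet.hdrincl = 0;
	assert(snapshot_hdrincl(&inet) == 0);
}
```

Here tmp plays the role of __hdrincl. The sketch only shows that the
pattern compiles and returns the expected value; it does not demonstrate
what a given compiler would do to the code without the volatile access.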

Comments

Stefano Brivio Jan. 2, 2018, 9:12 p.m. UTC | #1
Hi,

On Tue,  2 Jan 2018 17:30:20 +0100
Nicolai Stange <nstange@suse.de> wrote:

> [...]
>
> diff --git a/net/ipv4/raw.c b/net/ipv4/raw.c
> index 5b9bd5c33d9d..e84290c28c0c 100644
> --- a/net/ipv4/raw.c
> +++ b/net/ipv4/raw.c
> @@ -513,16 +513,18 @@ static int raw_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
>  	int err;
>  	struct ip_options_data opt_copy;
>  	struct raw_frag_vec rfv;
> -	int hdrincl;
> +	int hdrincl, __hdrincl;
>  
>  	err = -EMSGSIZE;
>  	if (len > 0xFFFF)
>  		goto out;
>  
>  	/* hdrincl should be READ_ONCE(inet->hdrincl)
> -	 * but READ_ONCE() doesn't work with bit fields
> +	 * but READ_ONCE() doesn't work with bit fields.
> +	 * Emulate it by doing the READ_ONCE() from an intermediate int.
>  	 */
> -	hdrincl = inet->hdrincl;
> +	__hdrincl = inet->hdrincl;
> +	hdrincl = READ_ONCE(__hdrincl);

I guess you don't actually need to use a third variable. What about
doing READ_ONCE() on hdrincl itself after the first assignment?

Perhaps something like the patch below -- applies to net.git, yields
same binary output as your version with gcc 6, looks IMHO more
straightforward:

diff --git a/net/ipv4/raw.c b/net/ipv4/raw.c
index 125c1eab3eaa..8c2f783a95fc 100644
--- a/net/ipv4/raw.c
+++ b/net/ipv4/raw.c
@@ -519,10 +519,12 @@ static int raw_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
 	if (len > 0xFFFF)
 		goto out;
 
-	/* hdrincl should be READ_ONCE(inet->hdrincl)
-	 * but READ_ONCE() doesn't work with bit fields
+	/* hdrincl should be READ_ONCE(inet->hdrincl) but READ_ONCE() doesn't
+	 * work with bit fields. Emulate it by adding a further sequence point.
 	 */
 	hdrincl = inet->hdrincl;
+	hdrincl = READ_ONCE(hdrincl);
+
 	/*
 	 *	Check the flags.
 	 */
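
For comparison, the two variants discussed so far can be put side by side
in a small userspace sketch. This is an illustration under assumptions, not
kernel code: READ_ONCE_SKETCH() is a simplified stand-in for READ_ONCE()
(a volatile load using the GCC/Clang __typeof__ extension), and struct
demo_sock and both function names are hypothetical.

```c
#include <assert.h>

/* Simplified stand-in for the kernel's READ_ONCE(): one volatile load.
 * __typeof__ is a GCC/Clang extension.
 */
#define READ_ONCE_SKETCH(x) (*(const volatile __typeof__(x) *)&(x))

/* Hypothetical stand-in for the socket structure. */
struct demo_sock {
	unsigned int hdrincl:1;
};

/* Variant from the original patch: separate intermediate variable. */
int read_hdrincl_two_vars(const struct demo_sock *sk)
{
	int tmp, hdrincl;

	tmp = sk->hdrincl;
	hdrincl = READ_ONCE_SKETCH(tmp);
	return hdrincl;
}

/* Variant suggested in the reply: volatile re-read of hdrincl itself. */
int read_hdrincl_one_var(const struct demo_sock *sk)
{
	int hdrincl;

	hdrincl = sk->hdrincl;
	hdrincl = READ_ONCE_SKETCH(hdrincl);
	return hdrincl;
}
```

Both functions return the same value for the same input. Whether they also
compile to the same machine code depends on the compiler and its options;
the thread reports identical binary output for the real patches with gcc 6.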
Nicolai Stange Jan. 3, 2018, 9:28 a.m. UTC | #2
Hi Stefano,

Stefano Brivio <sbrivio@redhat.com> writes:

> On Tue,  2 Jan 2018 17:30:20 +0100
> Nicolai Stange <nstange@suse.de> wrote:
>
>> [...]
>>
>> diff --git a/net/ipv4/raw.c b/net/ipv4/raw.c
>> index 5b9bd5c33d9d..e84290c28c0c 100644
>> --- a/net/ipv4/raw.c
>> +++ b/net/ipv4/raw.c
>> @@ -513,16 +513,18 @@ static int raw_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
>>  	int err;
>>  	struct ip_options_data opt_copy;
>>  	struct raw_frag_vec rfv;
>> -	int hdrincl;
>> +	int hdrincl, __hdrincl;
>>  
>>  	err = -EMSGSIZE;
>>  	if (len > 0xFFFF)
>>  		goto out;
>>  
>>  	/* hdrincl should be READ_ONCE(inet->hdrincl)
>> -	 * but READ_ONCE() doesn't work with bit fields
>> +	 * but READ_ONCE() doesn't work with bit fields.
>> +	 * Emulate it by doing the READ_ONCE() from an intermediate int.
>>  	 */
>> -	hdrincl = inet->hdrincl;
>> +	__hdrincl = inet->hdrincl;
>> +	hdrincl = READ_ONCE(__hdrincl);
>
> I guess you don't actually need to use a third variable. What about
> doing READ_ONCE() on hdrincl itself after the first assignment?
>
> Perhaps something like the patch below -- applies to net.git, yields
> same binary output as your version with gcc 6, looks IMHO more
> straightforward:
>
> diff --git a/net/ipv4/raw.c b/net/ipv4/raw.c
> index 125c1eab3eaa..8c2f783a95fc 100644
> --- a/net/ipv4/raw.c
> +++ b/net/ipv4/raw.c
> @@ -519,10 +519,12 @@ static int raw_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
>  	if (len > 0xFFFF)
>  		goto out;
>  
> -	/* hdrincl should be READ_ONCE(inet->hdrincl)
> -	 * but READ_ONCE() doesn't work with bit fields
> +	/* hdrincl should be READ_ONCE(inet->hdrincl) but READ_ONCE() doesn't
> +	 * work with bit fields. Emulate it by adding a further sequence point.
>  	 */
>  	hdrincl = inet->hdrincl;
> +	hdrincl = READ_ONCE(hdrincl);
> +

Yes, this does also work. In fact, after having been lowered into SSA
form, it should be equivalent to what I posted.

So, it's a matter of preference/style and I'd leave the decision on
this to the maintainers -- for me, either way is fine.

I don't like the "sequence point" wording in the comment above though:
AFAICS, if taken in the meaning of C99, it's not any sequence point but
the volatile access in READ_ONCE() which ensures that there won't be any
reloads from ->hdrincl. If you don't mind, I'll adjust that comment if
asked to resend with your solution.

Thanks,

Nicolai
Stefano Brivio Jan. 3, 2018, 10:37 a.m. UTC | #3
Hi Nicolai,

On Wed, 03 Jan 2018 10:28:20 +0100
Nicolai Stange <nstange@suse.de> wrote:

> Hi Stefano,
> 
> Stefano Brivio <sbrivio@redhat.com> writes:
> 
> > On Tue,  2 Jan 2018 17:30:20 +0100
> > Nicolai Stange <nstange@suse.de> wrote:
> >  
> >> [...]
> >>
> >> diff --git a/net/ipv4/raw.c b/net/ipv4/raw.c
> >> index 5b9bd5c33d9d..e84290c28c0c 100644
> >> --- a/net/ipv4/raw.c
> >> +++ b/net/ipv4/raw.c
> >> @@ -513,16 +513,18 @@ static int raw_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
> >>  	int err;
> >>  	struct ip_options_data opt_copy;
> >>  	struct raw_frag_vec rfv;
> >> -	int hdrincl;
> >> +	int hdrincl, __hdrincl;
> >>  
> >>  	err = -EMSGSIZE;
> >>  	if (len > 0xFFFF)
> >>  		goto out;
> >>  
> >>  	/* hdrincl should be READ_ONCE(inet->hdrincl)
> >> -	 * but READ_ONCE() doesn't work with bit fields
> >> +	 * but READ_ONCE() doesn't work with bit fields.
> >> +	 * Emulate it by doing the READ_ONCE() from an intermediate int.
> >>  	 */
> >> -	hdrincl = inet->hdrincl;
> >> +	__hdrincl = inet->hdrincl;
> >> +	hdrincl = READ_ONCE(__hdrincl);  
> >
> > I guess you don't actually need to use a third variable. What about
> > doing READ_ONCE() on hdrincl itself after the first assignment?
> >
> > Perhaps something like the patch below -- applies to net.git, yields
> > same binary output as your version with gcc 6, looks IMHO more
> > straightforward:
> >
> > diff --git a/net/ipv4/raw.c b/net/ipv4/raw.c
> > index 125c1eab3eaa..8c2f783a95fc 100644
> > --- a/net/ipv4/raw.c
> > +++ b/net/ipv4/raw.c
> > @@ -519,10 +519,12 @@ static int raw_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
> >  	if (len > 0xFFFF)
> >  		goto out;
> >  
> > -	/* hdrincl should be READ_ONCE(inet->hdrincl)
> > -	 * but READ_ONCE() doesn't work with bit fields
> > +	/* hdrincl should be READ_ONCE(inet->hdrincl) but READ_ONCE() doesn't
> > +	 * work with bit fields. Emulate it by adding a further sequence point.
> >  	 */
> >  	hdrincl = inet->hdrincl;
> > +	hdrincl = READ_ONCE(hdrincl);
> > +  
> 
> Yes, this does also work. In fact, after having been lowered into SSA
> form, it should be equivalent to what I posted.
> 
> So, it's a matter of preference/style and I'd leave the decision on
> this to the maintainers -- for me, either way is fine.
> 
> I don't like the "sequence point" wording in the comment above though:
> AFAICS, if taken in the meaning of C99, it's not any sequence point but
> the volatile access in READ_ONCE() which ensures that there won't be any
> reloads from ->hdrincl. If you don't mind, I'll adjust that comment if
> asked to resend with your solution.

Well, with "adding a further sequence point" I refer to what we have
to do to emulate READ_ONCE(), not to the reason why we need READ_ONCE().

However, this is a likely sign that my comment isn't that clear either.
So unless you have better ideas, I would go with:

+	/* hdrincl should be READ_ONCE(inet->hdrincl) but READ_ONCE() doesn't
+	 * work with bit fields. Doing this indirectly yields the same result.

but I really hope you have a better idea. :)

Patch

diff --git a/net/ipv4/raw.c b/net/ipv4/raw.c
index 5b9bd5c33d9d..e84290c28c0c 100644
--- a/net/ipv4/raw.c
+++ b/net/ipv4/raw.c
@@ -513,16 +513,18 @@  static int raw_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
 	int err;
 	struct ip_options_data opt_copy;
 	struct raw_frag_vec rfv;
-	int hdrincl;
+	int hdrincl, __hdrincl;
 
 	err = -EMSGSIZE;
 	if (len > 0xFFFF)
 		goto out;
 
 	/* hdrincl should be READ_ONCE(inet->hdrincl)
-	 * but READ_ONCE() doesn't work with bit fields
+	 * but READ_ONCE() doesn't work with bit fields.
+	 * Emulate it by doing the READ_ONCE() from an intermediate int.
 	 */
-	hdrincl = inet->hdrincl;
+	__hdrincl = inet->hdrincl;
+	hdrincl = READ_ONCE(__hdrincl);
 	/*
 	 *	Check the flags.
 	 */