diff mbox

[net-next,v2] block/drbd: use nla_put_u64_64bit()

Message ID 20160503100644.GE16459@soda.linbit
State RFC, archived
Delegated to: David Miller
Headers show

Commit Message

Lars Ellenberg May 3, 2016, 10:06 a.m. UTC
On Tue, May 03, 2016 at 11:39:18AM +0200, Nicolas Dichtel wrote:
> Two new handlers have been defined in genl_magic_ headers:
>  - __field2: the corresponding nla_put() function (nla_put_flag()) takes
>              only two args
>  - __field4: the corresponding nla_put() function (nla_put_u64_64bit())
>              takes four args
> 
> __field2 allows us to define __unspec_field for padding attribute.
> __field4 allows us to update the definition of __u64_field: the pad
> attribute should now be specified.

Please just NOT use an additional "field",
but always use 0 to pad.

Patch is much shorter as well, see below.

Attribute type "0" is not used,
and will never be of semantic value,
but always be ignored in the DRBD netlink family.

Whereas using some arbitrary value will be wrong,
and will needlessly break userland.

Thanks,

    Lars Ellenberg

Comments

Nicolas Dichtel May 3, 2016, 12:07 p.m. UTC | #1
Le 03/05/2016 12:06, Lars Ellenberg a écrit :
> On Tue, May 03, 2016 at 11:39:18AM +0200, Nicolas Dichtel wrote:
>> Two new handlers have been defined in genl_magic_ headers:
>>  - __field2: the corresponding nla_put() function (nla_put_flag()) takes
>>              only two args
>>  - __field4: the corresponding nla_put() function (nla_put_u64_64bit())
>>              takes four args
>>
>> __field2 allows us to define __unspec_field for padding attribute.
>> __field4 allows us to update the definition of __u64_field: the pad
>> attribute should now be specified.
> 
> Please just NOT use an additional "field",
> but always use 0 to pad.
> 
> Patch is much shorter as well, see below.
I don't think that the goal is to make the shortest patch...
But frankly, I don't care. The goal was to use the new interface in a proper
way, like every other subsystem.

> 
> Attribute type "0" is not used,
> and will never be of semantic value,
> but always be ignored in the DRBD netlink family.
> 
> Whereas using some arbitrary value will be wrong,
> and will needlessly break userland.
An application should always ignore unknown attribute, this is a golden rule.
Now if you know that this patch will break applications (which one exactly?), we
can use your proposal.


Regards,
Nicolas
David Miller May 3, 2016, 4:05 p.m. UTC | #2
From: Lars Ellenberg <lars.ellenberg@linbit.com>
Date: Tue, 3 May 2016 12:06:44 +0200

> Please just NOT use an additional "field",
> but always use 0 to pad.

You can't, it doesn't work.

We are adding a new field to every netlink protocol family that has this
alignment problem.
David Miller May 3, 2016, 4:06 p.m. UTC | #3
From: Lars Ellenberg <lars.ellenberg@linbit.com>
Date: Tue, 3 May 2016 12:06:44 +0200

> Whereas using some arbitrary value will be wrong,
> and will needlessly break userland.

It cannot break userland.

A fundamental property of netlink is that all code must silently
ignore netlink attributes it does not understand.

This is why netlink is easily extensible.

If code isn't doing that, it is broken and must be fixed.
Lars Ellenberg May 4, 2016, 9:05 a.m. UTC | #4
On Tue, May 03, 2016 at 12:05:56PM -0400, David Miller wrote:
> From: Lars Ellenberg <lars.ellenberg@linbit.com>
> Date: Tue, 3 May 2016 12:06:44 +0200
> 
> > Please just NOT use an additional "field",
> > but always use 0 to pad.
> 
> You can't, it doesn't work.

I did, and it *did* work.
At least, it appeared to.

I'm not talking about every user of netlink out there.
That I don't know. But specifically for DRBD netlink,
from what my experiments tell me, it works just fine.

> We are adding a new field to every netlink protocol family that has
> this alignment problem.

We don't have an "alignment problem" there, btw.
Last time I checked, we did work fine without this alignment magic,
we already take care of that, yes, even on affected architectures.

On Tue, May 03, 2016 at 12:06:52PM -0400, David Miller wrote:
> From: Lars Ellenberg <lars.ellenberg@linbit.com>
> Date: Tue, 3 May 2016 12:06:44 +0200
> 
> > Whereas using some arbitrary value will be wrong,
> > and will needlessly break userland.
> 
> It cannot break userland.

It can, if those tags have been used already.
There is DRBD out-of-tree as well,
it usually is ahead of in-tree DRBD.

But yes, I could obviously check and assign and reserve some
not-yet-used tag to all of them.

I don't see why, though, given that 0 (appearently) works fine.

Can you elaborate why and how that does not work?

	Lars
Nicolas Dichtel May 4, 2016, 12:49 p.m. UTC | #5
Le 04/05/2016 11:05, Lars Ellenberg a écrit :
[snip]
> We don't have an "alignment problem" there, btw.
> Last time I checked, we did work fine without this alignment magic,
> we already take care of that, yes, even on affected architectures.
The code adds several consecutive u64 attributes. The nl attribute header is 4
bytes, thus the full attribute length is 12 bytes. If the first u64 is aligned
on 8 (nla_data()), the next one is not aligned on 8: it starts 12 bytes (8 (u64)
+ 4 (nl attr hdr)) after the previous u64.
Lars Ellenberg May 4, 2016, 12:52 p.m. UTC | #6
On Wed, May 04, 2016 at 02:49:00PM +0200, Nicolas Dichtel wrote:
> Le 04/05/2016 11:05, Lars Ellenberg a écrit :
> [snip]
> > We don't have an "alignment problem" there, btw.
> > Last time I checked, we did work fine without this alignment magic,
> > we already take care of that, yes, even on affected architectures.
> The code adds several consecutive u64 attributes. The nl attribute header is 4
> bytes, thus the full attribute length is 12 bytes. If the first u64 is aligned
> on 8 (nla_data()), the next one is not aligned on 8: it starts 12 bytes (8 (u64)
> + 4 (nl attr hdr)) after the previous u64.

Yes.  Which in our case is not a problem.
But I don't object to the padding per se,
if that is how things "should be".

I try to understand why you so much object to using 0 as pad.

    Lars
Eric Dumazet May 4, 2016, 2:27 p.m. UTC | #7
On Wed, 2016-05-04 at 14:49 +0200, Nicolas Dichtel wrote:
> Le 04/05/2016 11:05, Lars Ellenberg a écrit :
> [snip]
> > We don't have an "alignment problem" there, btw.
> > Last time I checked, we did work fine without this alignment magic,
> > we already take care of that, yes, even on affected architectures.
> The code adds several consecutive u64 attributes. The nl attribute header is 4
> bytes, thus the full attribute length is 12 bytes. If the first u64 is aligned
> on 8 (nla_data()), the next one is not aligned on 8: it starts 12 bytes (8 (u64)
> + 4 (nl attr hdr)) after the previous u64.

As I mentioned earlier ( https://lkml.org/lkml/2016/4/22/706 ), if both
kernel and user land do not blindly use *(u64 *)ptr to put/get these
values, there was no issue.

kernel was fine, and most user land apps were fine as well.

Only for compound struct  like tcp_info this was nice to get alignment
for performance reason (it removed one memcpy(), or the use of
put_unaligned() helpers)
David Miller May 4, 2016, 4:47 p.m. UTC | #8
From: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Date: Wed, 4 May 2016 14:49:00 +0200

> Le 04/05/2016 11:05, Lars Ellenberg a écrit :
> [snip]
>> We don't have an "alignment problem" there, btw.
>> Last time I checked, we did work fine without this alignment magic,
>> we already take care of that, yes, even on affected architectures.
> The code adds several consecutive u64 attributes. The nl attribute header is 4
> bytes, thus the full attribute length is 12 bytes. If the first u64 is aligned
> on 8 (nla_data()), the next one is not aligned on 8: it starts 12 bytes (8 (u64)
> + 4 (nl attr hdr)) after the previous u64.

Right.
David Miller May 4, 2016, 4:50 p.m. UTC | #9
From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Wed, 04 May 2016 07:27:06 -0700

> kernel was fine, and most user land apps were fine as well.

Userland should really not have to deal with garbage like this.

And because it quietly works just fine on x86-64, nothing makes
sure that applications will universally get this right anyways.

Better to align things properly and magically all of these issues
simply disappear.
Eric Dumazet May 4, 2016, 5:13 p.m. UTC | #10
On Wed, 2016-05-04 at 12:50 -0400, David Miller wrote:
> From: Eric Dumazet <eric.dumazet@gmail.com>
> Date: Wed, 04 May 2016 07:27:06 -0700
> 
> > kernel was fine, and most user land apps were fine as well.
> 
> Userland should really not have to deal with garbage like this.
> 
> And because it quietly works just fine on x86-64, nothing makes
> sure that applications will universally get this right anyways.
> 
> Better to align things properly and magically all of these issues
> simply disappear.

Sure, but in practice we end up consuming 16 bytes (instead of 12) per
u64 attribute, only on some arches. 33 % space overhead.

So maybe some dumps will abort on those arches, while on x86 the size of
skb might be below some magic limit.

I guess this is fine, we do not break ABI in any way.
diff mbox

Patch

diff --git a/include/linux/genl_magic_struct.h b/include/linux/genl_magic_struct.h
index eecd19b..6270a56 100644
--- a/include/linux/genl_magic_struct.h
+++ b/include/linux/genl_magic_struct.h
@@ -62,6 +62,11 @@  extern void CONCAT_(GENL_MAGIC_FAMILY, _genl_unregister)(void);
 
 /* MAGIC helpers							{{{2 */
 
+static inline int nla_put_u64_0pad(struct sk_buff *skb, int attrtype, u64 value)
+{
+	return nla_put_64bit(skb, attrtype, sizeof(u64), &value, 0);
+}
+
 /* possible field types */
 #define __flg_field(attr_nr, attr_flag, name) \
 	__field(attr_nr, attr_flag, name, NLA_U8, char, \
@@ -80,7 +85,7 @@  extern void CONCAT_(GENL_MAGIC_FAMILY, _genl_unregister)(void);
 			nla_get_u32, nla_put_u32, true)
 #define __u64_field(attr_nr, attr_flag, name)	\
 	__field(attr_nr, attr_flag, name, NLA_U64, __u64, \
-			nla_get_u64, nla_put_u64, false)
+			nla_get_u64, nla_put_u64_0pad, false)
 #define __str_field(attr_nr, attr_flag, name, maxlen) \
 	__array(attr_nr, attr_flag, name, NLA_NUL_STRING, char, maxlen, \
 			nla_strlcpy, nla_put, false)
diff --git a/drivers/block/drbd/drbd_nl.c b/drivers/block/drbd/drbd_nl.c
index 1fd1dcc..206cc76 100644
--- a/drivers/block/drbd/drbd_nl.c
+++ b/drivers/block/drbd/drbd_nl.c
@@ -3633,14 +3633,14 @@  static int nla_put_status_info(struct sk_buff *skb, struct drbd_device *device,
 		goto nla_put_failure;
 	if (nla_put_u32(skb, T_sib_reason, sib ? sib->sib_reason : SIB_GET_STATUS_REPLY) ||
 	    nla_put_u32(skb, T_current_state, device->state.i) ||
-	    nla_put_u64(skb, T_ed_uuid, device->ed_uuid) ||
-	    nla_put_u64(skb, T_capacity, drbd_get_capacity(device->this_bdev)) ||
-	    nla_put_u64(skb, T_send_cnt, device->send_cnt) ||
-	    nla_put_u64(skb, T_recv_cnt, device->recv_cnt) ||
-	    nla_put_u64(skb, T_read_cnt, device->read_cnt) ||
-	    nla_put_u64(skb, T_writ_cnt, device->writ_cnt) ||
-	    nla_put_u64(skb, T_al_writ_cnt, device->al_writ_cnt) ||
-	    nla_put_u64(skb, T_bm_writ_cnt, device->bm_writ_cnt) ||
+	    nla_put_u64_0pad(skb, T_ed_uuid, device->ed_uuid) ||
+	    nla_put_u64_0pad(skb, T_capacity, drbd_get_capacity(device->this_bdev)) ||
+	    nla_put_u64_0pad(skb, T_send_cnt, device->send_cnt) ||
+	    nla_put_u64_0pad(skb, T_recv_cnt, device->recv_cnt) ||
+	    nla_put_u64_0pad(skb, T_read_cnt, device->read_cnt) ||
+	    nla_put_u64_0pad(skb, T_writ_cnt, device->writ_cnt) ||
+	    nla_put_u64_0pad(skb, T_al_writ_cnt, device->al_writ_cnt) ||
+	    nla_put_u64_0pad(skb, T_bm_writ_cnt, device->bm_writ_cnt) ||
 	    nla_put_u32(skb, T_ap_bio_cnt, atomic_read(&device->ap_bio_cnt)) ||
 	    nla_put_u32(skb, T_ap_pending_cnt, atomic_read(&device->ap_pending_cnt)) ||
 	    nla_put_u32(skb, T_rs_pending_cnt, atomic_read(&device->rs_pending_cnt)))
@@ -3657,13 +3657,13 @@  static int nla_put_status_info(struct sk_buff *skb, struct drbd_device *device,
 			goto nla_put_failure;
 
 		if (nla_put_u32(skb, T_disk_flags, device->ldev->md.flags) ||
-		    nla_put_u64(skb, T_bits_total, drbd_bm_bits(device)) ||
-		    nla_put_u64(skb, T_bits_oos, drbd_bm_total_weight(device)))
+		    nla_put_u64_0pad(skb, T_bits_total, drbd_bm_bits(device)) ||
+		    nla_put_u64_0pad(skb, T_bits_oos, drbd_bm_total_weight(device)))
 			goto nla_put_failure;
 		if (C_SYNC_SOURCE <= device->state.conn &&
 		    C_PAUSED_SYNC_T >= device->state.conn) {
-			if (nla_put_u64(skb, T_bits_rs_total, device->rs_total) ||
-			    nla_put_u64(skb, T_bits_rs_failed, device->rs_failed))
+			if (nla_put_u64_0pad(skb, T_bits_rs_total, device->rs_total) ||
+			    nla_put_u64_0pad(skb, T_bits_rs_failed, device->rs_failed))
 				goto nla_put_failure;
 		}
 	}