Message ID | 20190716115910.23093-1-iii@linux.ibm.com
---|---
State | Changes Requested
Delegated to | BPF Maintainers
Series | [bpf] bpf: fix narrower loads on s390
[sorry, resending again as the previous one came through with the text
messed up due to networking issues]

On Tue, Jul 16, 2019 at 10:08 PM Y Song <ys114321@gmail.com> wrote:
>
> On Tue, Jul 16, 2019 at 4:59 AM Ilya Leoshkevich <iii@linux.ibm.com> wrote:
> >
> > test_pkt_md_access is failing on s390, since the associated eBPF prog
> > returns TC_ACT_SHOT, which in turn happens because loading a part of a
> > struct __sk_buff field produces an incorrect result.
> >
> > The problem is that when verifier emits the code to replace partial load
> > of a field with a full load, a shift and a bitwise AND, it assumes that
> > the machine is little endian.
> >
> > Adjust shift count calculation to account for endianness.
> >
> > Fixes: 31fd85816dbe ("bpf: permits narrower load from bpf program context fields")
> > Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
> > ---
> >  kernel/bpf/verifier.c | 8 ++++++--
> >  1 file changed, 6 insertions(+), 2 deletions(-)
> >
> > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> > index 5900cbb966b1..3f9353653558 100644
> > --- a/kernel/bpf/verifier.c
> > +++ b/kernel/bpf/verifier.c
> > @@ -8616,8 +8616,12 @@ static int convert_ctx_accesses(struct bpf_verifier_env *env)
> >                 }
> >
> >                 if (is_narrower_load && size < target_size) {
> > -                       u8 shift = (off & (size_default - 1)) * 8;
> > -
> > +                       u8 load_off = off & (size_default - 1);
> > +#ifdef __LITTLE_ENDIAN
> > +                       u8 shift = load_off * 8;
> > +#else
> > +                       u8 shift = (size_default - (load_off + size)) * 8;
> > +#endif

All the values are in register. The shifting operations should be the
same for big endian and little endian, e.g., value 64 >> 2 = 16 when
value "64" is in register. So I did not see a problem here.

Could you elaborate which field access in test_pkt_md_access caused the
problem? It would be good if you could give a detailed memory layout and
register values to illustrate the problem.

> >                         if (ctx_field_size <= 4) {
> >                                 if (shift)
> >                                         insn_buf[cnt++] = BPF_ALU32_IMM(BPF_RSH,
> > --
> > 2.21.0
> On 17.07.2019 at 07:11, Y Song <ys114321@gmail.com> wrote:
>
> [sorry, resending again as the previous one came through with the text
> messed up due to networking issues]
>
> On Tue, Jul 16, 2019 at 10:08 PM Y Song <ys114321@gmail.com> wrote:
>>
>> On Tue, Jul 16, 2019 at 4:59 AM Ilya Leoshkevich <iii@linux.ibm.com> wrote:
>>>
>>> test_pkt_md_access is failing on s390, since the associated eBPF prog
>>> returns TC_ACT_SHOT, which in turn happens because loading a part of a
>>> struct __sk_buff field produces an incorrect result.
>>> [...]
>>>                 if (is_narrower_load && size < target_size) {
>>> -                       u8 shift = (off & (size_default - 1)) * 8;
>>> -
>>> +                       u8 load_off = off & (size_default - 1);
>>> +#ifdef __LITTLE_ENDIAN
>>> +                       u8 shift = load_off * 8;
>>> +#else
>>> +                       u8 shift = (size_default - (load_off + size)) * 8;
>>> +#endif
>
> All the values are in register. The shifting operations should be the
> same for big endian and little endian, e.g., value 64 >> 2 = 16 when
> value "64" is in register. So I did not see a problem here.
>
> Could you elaborate which field access in test_pkt_md_access
> caused the problem?

The very first one: TEST_FIELD(__u8, len, 0xFF);

> It would be good if you could give a detailed memory layout and register
> values to illustrate the problem.

Suppose len = 0x11223344. On a big endian system, this would be

    11 22 33 44

Now, we would like to do *(u8 *)&len; the desired result is 0x11. The
verifier should emit ((*(u32 *)&len) >> 24) & 0xff, but as of today it
misses the shift.

On a little endian system the layout is:

    44 33 22 11

and the desired result is different - 0x44. The verifier correctly emits
(*(u32 *)&len) & 0xff.

>
>>
>>>                         if (ctx_field_size <= 4) {
>>>                                 if (shift)
>>>                                         insn_buf[cnt++] = BPF_ALU32_IMM(BPF_RSH,
>>> --
>>> 2.21.0
> On 17.07.2019 at 11:21, Ilya Leoshkevich <iii@linux.ibm.com> wrote:
>
> [...]
>
> Suppose len = 0x11223344. On a big endian system, this would be
>
>     11 22 33 44
>
> Now, we would like to do *(u8 *)&len; the desired result is 0x11. The
> verifier should emit ((*(u32 *)&len) >> 24) & 0xff, but as of today it
> misses the shift.
>
> [...]

I've just realized that this example does not reflect what the test is
doing on big-endian systems (there is an #ifdef for those).

Here is a better one: len = 0x11223344, and we would like to do
((u8 *)&len)[3].

len is represented as `11 22 33 44` in memory, so the desired result is
0x44. It can be obtained by doing (*(u32 *)&len) & 0xff, but today the
verifier does ((*(u32 *)&len) >> 24) & 0xff instead.
On Wed, Jul 17, 2019 at 3:36 AM Ilya Leoshkevich <iii@linux.ibm.com> wrote:
>
> [...]
>
> I've just realized that this example does not reflect what the test is
> doing on big-endian systems (there is an #ifdef for those).
>
> Here is a better one: len = 0x11223344, and we would like to do
> ((u8 *)&len)[3].
>
> len is represented as `11 22 33 44` in memory, so the desired result is
> 0x44. It can be obtained by doing (*(u32 *)&len) & 0xff, but today the
> verifier does ((*(u32 *)&len) >> 24) & 0xff instead.

What you described above for the memory layout all makes sense. The root
cause is that for big endian, we should do *((u8 *)&len + 3). This is
exactly what the macros in test_pkt_md_access.c try to do:

#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
#define TEST_FIELD(TYPE, FIELD, MASK)					\
{									\
	TYPE tmp = *(volatile TYPE *)&skb->FIELD;			\
	if (tmp != ((*(volatile __u32 *)&skb->FIELD) & MASK))		\
		return TC_ACT_SHOT;					\
}
#else
#define TEST_FIELD_OFFSET(a, b)	((sizeof(a) - sizeof(b)) / sizeof(b))
#define TEST_FIELD(TYPE, FIELD, MASK)					\
{									\
	TYPE tmp = *((volatile TYPE *)&skb->FIELD +			\
		     TEST_FIELD_OFFSET(skb->FIELD, TYPE));		\
	if (tmp != ((*(volatile __u32 *)&skb->FIELD) & MASK))		\
		return TC_ACT_SHOT;					\
}
#endif

Could you check whether your __BYTE_ORDER__ is set correctly for this
case? You may need to tweak the Makefile if you are doing cross
compilation; I am not sure how, as I do not have the environment.
> On 17.07.2019 at 18:25, Y Song <ys114321@gmail.com> wrote:
>
> [...]
>
> Could you check whether your __BYTE_ORDER__ is set correctly for this
> case? You may need to tweak the Makefile if you are doing cross
> compilation; I am not sure how, as I do not have the environment.

I'm building natively on s390.

Here is the (formatted) preprocessed C code for the first condition:

{
	__u8 tmp = *((volatile __u8 *)&skb->len +
		     ((sizeof(skb->len) - sizeof(__u8)) / sizeof(__u8)));
	if (tmp != ((*(volatile __u32 *)&skb->len) & 0xFF))
		return 2;
};

So I believe the endianness is chosen correctly.

Here is the clang-generated BPF bytecode for the first condition:

# llvm-objdump -d test_pkt_md_access.o

0000000000000000 process:
       0:	71 21 00 03 00 00 00 00	r2 = *(u8 *)(r1 + 3)
       1:	61 31 00 00 00 00 00 00	r3 = *(u32 *)(r1 + 0)
       2:	57 30 00 00 00 00 00 ff	r3 &= 255
       3:	5d 23 00 1d 00 00 00 00	if r2 != r3 goto +29 <LBB0_10>

This also looks good to me.

Finally, here is the verifier-generated BPF bytecode:

# bpftool prog dump xlated id 14
; TEST_FIELD(__u8, len, 0xFF);
   0: (61) r2 = *(u32 *)(r1 +104)
   1: (bc) w2 = w2
   2: (74) w2 >>= 24
   3: (bc) w2 = w2
   4: (54) w2 &= 255
   5: (bc) w2 = w2

Here we can see the shift that I'm referring to. I believe we should
translate *(u8 *)(r1 + 3) in this case without this shift on big-endian
machines.

Best regards,
Ilya
On Wed, Jul 17, 2019 at 1:52 PM Ilya Leoshkevich <iii@linux.ibm.com> wrote:
>
> [...]
>
> Finally, here is the verifier-generated BPF bytecode:
>
> # bpftool prog dump xlated id 14
> ; TEST_FIELD(__u8, len, 0xFF);
>    0: (61) r2 = *(u32 *)(r1 +104)
>    1: (bc) w2 = w2
>    2: (74) w2 >>= 24
>    3: (bc) w2 = w2
>    4: (54) w2 &= 255
>    5: (bc) w2 = w2
>
> Here we can see the shift that I'm referring to. I believe we should
> translate *(u8 *)(r1 + 3) in this case without this shift on big-endian
> machines.

Thanks for the detailed illustration. Now, with your detailed output of
the byte codes and the xlated program, it becomes apparent that the
verifier should not do the shift at insn 2.

I was correct that after insn 0, register r2 holds the same value on big
and little endian. But I missed in the first review that insn->off of
the original first insn differs: it is r2 = *(u8 *)(r1 + 3) on big
endian, and r2 = *(u8 *)r1 on little endian, so the two cannot use the
same shift amount. Therefore, indeed, the shift amount is different
between big and little endian, and your code is correct.

Could you add a macro in linux/filter.h? Most narrow-load-related macros
are there. This way, we keep verifier.c free of byte-order macros.

Also, could you put your above analysis in the commit message? It will
help with reasoning about this change later on.

Thanks!

> Best regards,
> Ilya
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 5900cbb966b1..3f9353653558 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -8616,8 +8616,12 @@ static int convert_ctx_accesses(struct bpf_verifier_env *env)
 		}
 
 		if (is_narrower_load && size < target_size) {
-			u8 shift = (off & (size_default - 1)) * 8;
-
+			u8 load_off = off & (size_default - 1);
+#ifdef __LITTLE_ENDIAN
+			u8 shift = load_off * 8;
+#else
+			u8 shift = (size_default - (load_off + size)) * 8;
+#endif
 			if (ctx_field_size <= 4) {
 				if (shift)
 					insn_buf[cnt++] = BPF_ALU32_IMM(BPF_RSH,
test_pkt_md_access is failing on s390, since the associated eBPF prog
returns TC_ACT_SHOT, which in turn happens because loading a part of a
struct __sk_buff field produces an incorrect result.

The problem is that when verifier emits the code to replace partial load
of a field with a full load, a shift and a bitwise AND, it assumes that
the machine is little endian.

Adjust shift count calculation to account for endianness.

Fixes: 31fd85816dbe ("bpf: permits narrower load from bpf program context fields")
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
---
 kernel/bpf/verifier.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)