[v2,1/3] powerpc: sstep: Fix load and update emulation

Message ID 20210203063841.431063-1-sandipan@linux.ibm.com
State Changes Requested

Checks

Context Check Description
snowpatch_ozlabs/apply_patch success Successfully applied on branch powerpc/merge (44158b256b30415079588d0fcb1bccbdc2ccd009)
snowpatch_ozlabs/checkpatch warning total: 0 errors, 1 warnings, 0 checks, 87 lines checked
snowpatch_ozlabs/needsstable warning Please consider tagging this patch for stable!

Commit Message

Sandipan Das Feb. 3, 2021, 6:38 a.m. UTC
The Power ISA says that the fixed-point load and update
instructions must neither use R0 for the base address (RA)
nor have the destination (RT) and the base address (RA) as
the same register. In these cases, the instruction is
invalid. This applies to the following instructions.
  * Load Byte and Zero with Update (lbzu)
  * Load Byte and Zero with Update Indexed (lbzux)
  * Load Halfword and Zero with Update (lhzu)
  * Load Halfword and Zero with Update Indexed (lhzux)
  * Load Halfword Algebraic with Update (lhau)
  * Load Halfword Algebraic with Update Indexed (lhaux)
  * Load Word and Zero with Update (lwzu)
  * Load Word and Zero with Update Indexed (lwzux)
  * Load Word Algebraic with Update Indexed (lwaux)
  * Load Doubleword with Update (ldu)
  * Load Doubleword with Update Indexed (ldux)

However, the following behaviour is observed using some
invalid opcodes where RA = RT.

A userspace program using an invalid instruction word like
0xe9ce0001, i.e. "ldu r14, 0(r14)", runs and exits without
getting terminated abruptly. The instruction performs the
load operation but does not write the effective address to
the base address register. Attaching an uprobe at that
instruction's address results in emulation which writes the
effective address to the base register. Thus, the final value
of the base address register is different.

To remove any inconsistencies, this adds an additional check
for the aforementioned instructions to make sure that they
are treated as unknown by the emulation infrastructure when
RA = 0 or RA = RT. The kernel will then fall back to executing
the instruction on hardware.

Fixes: 0016a4cf5582 ("powerpc: Emulate most Book I instructions in emulate_step()")
Reviewed-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Signed-off-by: Sandipan Das <sandipan@linux.ibm.com>
---
Previous versions can be found at:
v1: https://lore.kernel.org/linuxppc-dev/20201119054139.244083-1-sandipan@linux.ibm.com/

Changes in v2:
- Jump to unknown_opcode instead of returning -1 for invalid
  instruction forms.

---
 arch/powerpc/lib/sstep.c | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)
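
As a reading aid for the example word in the commit message, the decoding can be done by hand. The sketch below is an illustration only and is not part of the patch. It assumes the DS-form layout (primary opcode in the top 6 bits, then RT, then RA, a 14-bit DS field and a 2-bit extended opcode); the names `ds_form`, `decode_ds_form` and `is_invalid_update_form` are hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper type: the fields of a DS-form instruction word. */
struct ds_form {
	unsigned int opcode;	/* primary opcode, 58 for ld/ldu/lwa */
	unsigned int rt;	/* destination register */
	unsigned int ra;	/* base address register */
	int disp;		/* byte displacement (DS field with two zero bits appended) */
	unsigned int xo;	/* extended opcode: 0 = ld, 1 = ldu, 2 = lwa */
};

/* Split a 32-bit instruction word into its DS-form fields. */
static struct ds_form decode_ds_form(uint32_t word)
{
	struct ds_form f;

	f.opcode = word >> 26;
	f.rt = (word >> 21) & 0x1f;
	f.ra = (word >> 16) & 0x1f;
	f.disp = (int16_t)(word & 0xfffc);	/* sign-extend the low 16 bits */
	f.xo = word & 0x3;
	return f;
}

/* Non-zero when a load-with-update word uses RA = 0 or RA = RT. */
static int is_invalid_update_form(const struct ds_form *f)
{
	return f->ra == 0 || f->ra == f->rt;
}
```

Decoding 0xe9ce0001 this way gives opcode 58, RT = RA = 14, a displacement of 0 and an extended opcode of 1, which is "ldu r14, 0(r14)" and therefore an invalid form.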

Comments

Naveen N. Rao Feb. 3, 2021, 9:49 a.m. UTC | #1
On 2021/02/03 12:08PM, Sandipan Das wrote:
> The Power ISA says that the fixed-point load and update
> instructions must neither use R0 for the base address (RA)
> nor have the destination (RT) and the base address (RA) as
> the same register. In these cases, the instruction is
> invalid. This applies to the following instructions.
>   * Load Byte and Zero with Update (lbzu)
>   * Load Byte and Zero with Update Indexed (lbzux)
>   * Load Halfword and Zero with Update (lhzu)
>   * Load Halfword and Zero with Update Indexed (lhzux)
>   * Load Halfword Algebraic with Update (lhau)
>   * Load Halfword Algebraic with Update Indexed (lhaux)
>   * Load Word and Zero with Update (lwzu)
>   * Load Word and Zero with Update Indexed (lwzux)
>   * Load Word Algebraic with Update Indexed (lwaux)
>   * Load Doubleword with Update (ldu)
>   * Load Doubleword with Update Indexed (ldux)
> 
> However, the following behaviour is observed using some
> invalid opcodes where RA = RT.
> 
> A userspace program using an invalid instruction word like
> 0xe9ce0001, i.e. "ldu r14, 0(r14)", runs and exits without
> getting terminated abruptly. The instruction performs the
> load operation but does not write the effective address to
> the base address register. 

While the processor (p8 in my test) doesn't seem to be throwing an 
exception, I don't think it is necessarily loading the value. Qemu 
throws an exception though. It's probably best to term the behavior as 
being undefined.

> Attaching an uprobe at that
> instruction's address results in emulation which writes the
> effective address to the base register. Thus, the final value
> of the base address register is different.
> 
> To remove any inconsistencies, this adds an additional check
> for the aforementioned instructions to make sure that they
> are treated as unknown by the emulation infrastructure when
> RA = 0 or RA = RT. The kernel will then fall back to executing
> the instruction on hardware.
> 
> Fixes: 0016a4cf5582 ("powerpc: Emulate most Book I instructions in emulate_step()")
> Reviewed-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
> Signed-off-by: Sandipan Das <sandipan@linux.ibm.com>
> ---
> Previous versions can be found at:
> v1: https://lore.kernel.org/linuxppc-dev/20201119054139.244083-1-sandipan@linux.ibm.com/
> 
> Changes in v2:
> - Jump to unknown_opcode instead of returning -1 for invalid
>   instruction forms.
> 
> ---
>  arch/powerpc/lib/sstep.c | 22 ++++++++++++++++++++++
>  1 file changed, 22 insertions(+)

Wouldn't it be easier to just do the below at the end? Or, am I missing something?

diff --git a/arch/powerpc/lib/sstep.c b/arch/powerpc/lib/sstep.c
index ede093e9623472..a2d726d2a5e9d1 100644
--- a/arch/powerpc/lib/sstep.c
+++ b/arch/powerpc/lib/sstep.c
@@ -2980,6 +2980,10 @@ int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
        }
 #endif /* CONFIG_VSX */

+       if (GETTYPE(op->type) == LOAD && (op->type & UPDATE) &&
+                       (ra == 0 || ra == rd))
+               goto unknown_opcode;
+
        return 0;

  logical_done:


- Naveen
Sandipan Das Feb. 3, 2021, 10:35 a.m. UTC | #2
On 03/02/21 3:19 pm, Naveen N. Rao wrote:
> [...]
> 
> Wouldn't it be easier to just do the below at the end? Or, am I missing something?
> 
> diff --git a/arch/powerpc/lib/sstep.c b/arch/powerpc/lib/sstep.c
> index ede093e9623472..a2d726d2a5e9d1 100644
> --- a/arch/powerpc/lib/sstep.c
> +++ b/arch/powerpc/lib/sstep.c
> @@ -2980,6 +2980,10 @@ int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
>         }
>  #endif /* CONFIG_VSX */
> 
> +       if (GETTYPE(op->type) == LOAD && (op->type & UPDATE) &&
> +                       (ra == 0 || ra == rd))
> +               goto unknown_opcode;
> +
>         return 0;
> 
>   logical_done:
> 

Thanks, that's much cleaner! We might need something similar for
the FP load/store and update instructions where an instruction is
invalid if RA is 0. I'll send a new revision with these changes.

- Sandipan
Sandipan Das Feb. 3, 2021, 11:37 a.m. UTC | #3
On 03/02/21 3:19 pm, Naveen N. Rao wrote:
> [...]
> 
> Wouldn't it be easier to just do the below at the end? Or, am I missing something?
> 
> diff --git a/arch/powerpc/lib/sstep.c b/arch/powerpc/lib/sstep.c
> index ede093e9623472..a2d726d2a5e9d1 100644
> --- a/arch/powerpc/lib/sstep.c
> +++ b/arch/powerpc/lib/sstep.c
> @@ -2980,6 +2980,10 @@ int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
>         }
>  #endif /* CONFIG_VSX */
> 
> +       if (GETTYPE(op->type) == LOAD && (op->type & UPDATE) &&
> +                       (ra == 0 || ra == rd))
> +               goto unknown_opcode;
> +
>         return 0;
> 
>   logical_done:
> 

Does this look good?

diff --git a/arch/powerpc/lib/sstep.c b/arch/powerpc/lib/sstep.c
index e96cff845ef7..a9c149bfd2f5 100644
--- a/arch/powerpc/lib/sstep.c
+++ b/arch/powerpc/lib/sstep.c
@@ -3017,6 +3017,21 @@ int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
 
        }
 
+       if (op->type & UPDATE) {
+               if (ra == rd && GETTYPE(op->type) == LOAD)
+                       goto unknown_opcode;
+               else if (ra == 0)
+                       switch(GETTYPE(op->type)) {
+                       case LOAD:
+                       case STORE:
+#ifdef CONFIG_PPC_FPU
+                       case LOAD_FP:
+                       case STORE_FP:
+#endif
+                               goto unknown_opcode;
+                       }
+       }
+
 #ifdef CONFIG_VSX
        if ((GETTYPE(op->type) == LOAD_VSX ||
             GETTYPE(op->type) == STORE_VSX) &&


- Sandipan
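
The two rules in the proposal above can be read as a single predicate. The following is a minimal standalone sketch, not kernel code: the `SK_` constants are stand-ins invented for illustration (the real `LOAD`, `STORE`, `LOAD_FP`, `STORE_FP`, `UPDATE` and `GETTYPE()` definitions live in the kernel's sstep header and their values may differ), and `sk_invalid_update_form` is a hypothetical name.

```c
/*
 * Stand-in definitions so that the sketch compiles on its own.
 * These are assumptions for illustration, not the kernel's values.
 */
enum sketch_type { SK_LOAD = 1, SK_LOAD_FP, SK_STORE, SK_STORE_FP, SK_OTHER };
#define SK_TYPE_MASK	0x1f
#define SK_UPDATE	0x40
#define SK_GETTYPE(t)	((t) & SK_TYPE_MASK)

/* Returns 1 when the decoded op should be treated as an unknown opcode. */
static int sk_invalid_update_form(int type, unsigned int ra, unsigned int rd)
{
	/* Non-update forms are never rejected by this check. */
	if (!(type & SK_UPDATE))
		return 0;

	/* Only fixed-point loads are rejected when RA == RT. */
	if (ra == rd && SK_GETTYPE(type) == SK_LOAD)
		return 1;

	/* The listed load and store update forms are rejected when RA == 0. */
	if (ra == 0) {
		switch (SK_GETTYPE(type)) {
		case SK_LOAD:
		case SK_STORE:
		case SK_LOAD_FP:
		case SK_STORE_FP:
			return 1;
		}
	}
	return 0;
}
```

For example, `sk_invalid_update_form(SK_LOAD | SK_UPDATE, 14, 14)` returns 1, while a store with update where RA equals the source register but is non-zero returns 0.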
Segher Boessenkool Feb. 3, 2021, 9:17 p.m. UTC | #4
On Wed, Feb 03, 2021 at 03:19:09PM +0530, Naveen N. Rao wrote:
> On 2021/02/03 12:08PM, Sandipan Das wrote:
> > The Power ISA says that the fixed-point load and update
> > instructions must neither use R0 for the base address (RA)
> > nor have the destination (RT) and the base address (RA) as
> > the same register. In these cases, the instruction is
> > invalid.

> > However, the following behaviour is observed using some
> > invalid opcodes where RA = RT.
> > 
> > A userspace program using an invalid instruction word like
> > 0xe9ce0001, i.e. "ldu r14, 0(r14)", runs and exits without
> > getting terminated abruptly. The instruction performs the
> > load operation but does not write the effective address to
> > the base address register. 
> 
> While the processor (p8 in my test) doesn't seem to be throwing an 
> exception, I don't think it is necessarily loading the value. Qemu 
> throws an exception though. It's probably best to term the behavior as 
> being undefined.

Power8 does:

  Load with Update Instructions (RA = 0)
    EA is placed into R0.
  Load with Update Instructions (RA = RT)
    EA is placed into RT. The storage operand addressed by EA is
    accessed, but the data returned by the load is discarded.

Power9 does:

  Load with Update Instructions (RA = 0)
    EA is placed into R0.
  Load with Update Instructions (RA = RT)
    The storage operand addressed by EA is accessed. The displacement
    field is added to the data returned by the load and placed into RT.

Both UMs also say

  Invalid Forms
    In general, the POWER9 core handles invalid forms of instructions in
    the manner that is most convenient for the particular case (within
    the scope of meeting the boundedly-undefined definition described in
    the Power ISA). This document specifies the behavior for these
    cases.  However, it is not recommended that software or other system
    facilities make use of the POWER9 behavior in these cases because
    such behavior might be different in another processor that
    implements the Power ISA.

(or POWER8 instead of POWER9 of course).  Always complaining about most
invalid forms seems wise, certainly if not all recent CPUs behave the
same :-)


Segher
Michael Ellerman Feb. 4, 2021, 12:53 a.m. UTC | #5
Sandipan Das <sandipan@linux.ibm.com> writes:
> On 03/02/21 3:19 pm, Naveen N. Rao wrote:
>> [...]
>> 
>> Wouldn't it be easier to just do the below at the end? Or, am I missing something?
>> 
>> diff --git a/arch/powerpc/lib/sstep.c b/arch/powerpc/lib/sstep.c
>> index ede093e9623472..a2d726d2a5e9d1 100644
>> --- a/arch/powerpc/lib/sstep.c
>> +++ b/arch/powerpc/lib/sstep.c
>> @@ -2980,6 +2980,10 @@ int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
>>         }
>>  #endif /* CONFIG_VSX */
>> 
>> +       if (GETTYPE(op->type) == LOAD && (op->type & UPDATE) &&
>> +                       (ra == 0 || ra == rd))
>> +               goto unknown_opcode;
>> +
>>         return 0;
>> 
>>   logical_done:
>> 
>
> This looks good?
>
> diff --git a/arch/powerpc/lib/sstep.c b/arch/powerpc/lib/sstep.c
> index e96cff845ef7..a9c149bfd2f5 100644
> --- a/arch/powerpc/lib/sstep.c
> +++ b/arch/powerpc/lib/sstep.c
> @@ -3017,6 +3017,21 @@ int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
>  
>         }
>  
> +       if (op->type & UPDATE) {
> +               if (ra == rd && GETTYPE(op->type) == LOAD)
> +                       goto unknown_opcode;
> +               else if (ra == 0)
> +                       switch(GETTYPE(op->type)) {
> +                       case LOAD:
> +                       case STORE:
> +#ifdef CONFIG_PPC_FPU
> +                       case LOAD_FP:
> +                       case STORE_FP:
> +#endif

Why make it conditional?

cheers
Naveen N. Rao Feb. 4, 2021, 8:27 a.m. UTC | #6
On 2021/02/03 03:17PM, Segher Boessenkool wrote:
> On Wed, Feb 03, 2021 at 03:19:09PM +0530, Naveen N. Rao wrote:
> > On 2021/02/03 12:08PM, Sandipan Das wrote:
> > > The Power ISA says that the fixed-point load and update
> > > instructions must neither use R0 for the base address (RA)
> > > nor have the destination (RT) and the base address (RA) as
> > > the same register. In these cases, the instruction is
> > > invalid.
> 
> > > However, the following behaviour is observed using some
> > > invalid opcodes where RA = RT.
> > > 
> > > A userspace program using an invalid instruction word like
> > > 0xe9ce0001, i.e. "ldu r14, 0(r14)", runs and exits without
> > > getting terminated abruptly. The instruction performs the
> > > load operation but does not write the effective address to
> > > the base address register. 
> > 
> > While the processor (p8 in my test) doesn't seem to be throwing an 
> > exception, I don't think it is necessarily loading the value. Qemu 
> > throws an exception though. It's probably best to term the behavior as 
> > being undefined.
> 
> Power8 does:
> 
>   Load with Update Instructions (RA = 0)
>     EA is placed into R0.
>   Load with Update Instructions (RA = RT)
>     EA is placed into RT. The storage operand addressed by EA is
>     accessed, but the data returned by the load is discarded.

I'm actually not seeing that. This is what I am testing with:
	li      8,0xaaa
	mr      6,1
	std     8,64(6)
	#ldu    6,64(6)
	.long	0xe8c60041

And, r6 always ends up with 0xaea. It changes with the value I put into 
r6 though.

Granted, this is all up in the air, but it does look like there is more 
going on and the value isn't the EA or the value at the address.

> 
> Power9 does:
> 
>   Load with Update Instructions (RA = 0)
>     EA is placed into R0.
>   Load with Update Instructions (RA = RT)
>     The storage operand addressed by EA is accessed. The displacement
>     field is added to the data returned by the load and placed into RT.
> 
> Both UMs also say
> 
>   Invalid Forms
>     In general, the POWER9 core handles invalid forms of instructions in
>     the manner that is most convenient for the particular case (within
>     the scope of meeting the boundedly-undefined definition described in
>     the Power ISA). This document specifies the behavior for these
>     cases.  However, it is not recommended that software or other system
>     facilities make use of the POWER9 behavior in these cases because
>     such behavior might be different in another processor that
>     implements the Power ISA.
> 
> (or POWER8 instead of POWER9 of course).  Always complaining about most
> invalid forms seems wise, certainly if not all recent CPUs behave the
> same :-)

Agreed.

- Naveen
David Laight Feb. 4, 2021, 10:29 a.m. UTC | #7
From: Segher Boessenkool
> Sent: 03 February 2021 21:18
...
> Power9 does:
> 
>   Load with Update Instructions (RA = 0)
>     EA is placed into R0.

Does that change the value of 0?
Rather reminds me of some 1960s era systems that had the small integers
at fixed (global) addresses.
FORTRAN always passes by reference, pass 0 and the address of the global
zero location was passed, the called function could change 0 to 1 for
the entire computer!

>   Load with Update Instructions (RA = RT)
>     The storage operand addressed by EA is accessed. The displacement
>     field is added to the data returned by the load and placed into RT.

Shame that isn't standard - could be used to optimise some code.

	David

Segher Boessenkool March 2, 2021, 2:37 a.m. UTC | #8
Hi!

I didn't see this until now, almost a month later, sorry about that :-)

On Thu, Feb 04, 2021 at 01:57:53PM +0530, Naveen N. Rao wrote:
> On 2021/02/03 03:17PM, Segher Boessenkool wrote:
> > Power8 does:
> > 
> >   Load with Update Instructions (RA = 0)
> >     EA is placed into R0.
> >   Load with Update Instructions (RA = RT)
> >     EA is placed into RT. The storage operand addressed by EA is
> >     accessed, but the data returned by the load is discarded.
> 
> I'm actually not seeing that. This is what I am testing with:
> 	li      8,0xaaa
> 	mr      6,1
> 	std     8,64(6)
> 	#ldu    6,64(6)
> 	.long	0xe8c60041
> 
> And, r6 always ends up with 0xaea. It changes with the value I put into 
> r6 though.

That is exactly the behaviour specified for p8.  0aaa+0040=0aea.

> Granted, this is all up in the air, but it does look like there is more 
> going on and the value isn't the EA or the value at the address.

That *is* the EA.  The EA is the address the insn does the access at.


Segher
Naveen N. Rao March 3, 2021, 4:31 p.m. UTC | #9
On 2021/03/01 08:37PM, Segher Boessenkool wrote:
> Hi!
> 
> I didn't see this until now, almost a month later, sorry about that :-)

No problem.

> 
> On Thu, Feb 04, 2021 at 01:57:53PM +0530, Naveen N. Rao wrote:
> > On 2021/02/03 03:17PM, Segher Boessenkool wrote:
> > > Power8 does:
> > > 
> > >   Load with Update Instructions (RA = 0)
> > >     EA is placed into R0.
> > >   Load with Update Instructions (RA = RT)
> > >     EA is placed into RT. The storage operand addressed by EA is
> > >     accessed, but the data returned by the load is discarded.
> > 
> > I'm actually not seeing that. This is what I am testing with:
> > 	li      8,0xaaa
> > 	mr      6,1
> > 	std     8,64(6)
> > 	#ldu    6,64(6)
> > 	.long	0xe8c60041
> > 
> > And, r6 always ends up with 0xaea. It changes with the value I put into 
> > r6 though.
> 
> That is exactly the behaviour specified for p8.  0aaa+0040=0aea.
> 
> > Granted, this is all up in the air, but it does look like there is more 
> > going on and the value isn't the EA or the value at the address.
> 
> That *is* the EA.  The EA is the address the insn does the access at.

I'm probably missing something here. 0xaaa is the value I stored at an 
offset of 64 bytes from the stack pointer (r1 is copied into r6). In the 
ldu instruction above, the EA is 64(r6), which should translate to 
r1+64.  The data returned by the load would be 0xaaa, which should be 
discarded per the description you provided above. So, I would expect to 
see a 0xc0.. address in r6.

In fact, this looks to be the behavior documented for P9:

> > Power9 does:
> >
> >   Load with Update Instructions (RA = 0)
> >     EA is placed into R0.
> >   Load with Update Instructions (RA = RT)
> >     The storage operand addressed by EA is accessed. The 
> >     displacement
> >     field is added to the data returned by the load and placed into 
> >     RT.

- Naveen
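
To make the arithmetic in this exchange concrete, the two documented RA = RT behaviours quoted above can be modelled as plain functions. This is a sketch for illustration only: the function names are hypothetical, and the code models nothing beyond the two user-manual excerpts quoted in the thread.

```c
#include <stdint.h>

/*
 * Documented POWER8 behaviour for RA = RT: EA is placed into RT and
 * the data returned by the load is discarded.
 */
static uint64_t p8_documented_rt(uint64_t ea, uint64_t loaded, int64_t disp)
{
	(void)loaded;
	(void)disp;
	return ea;
}

/*
 * Documented POWER9 behaviour for RA = RT: the displacement field is
 * added to the data returned by the load and placed into RT.
 */
static uint64_t p9_documented_rt(uint64_t ea, uint64_t loaded, int64_t disp)
{
	(void)ea;
	return loaded + (uint64_t)disp;
}
```

With 0xaaa stored at the effective address and a displacement of 64 (0x40), `p9_documented_rt()` yields 0xaea, the value observed in r6, whereas `p8_documented_rt()` yields the effective address itself, i.e. r1 + 64.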
Naveen N. Rao March 4, 2021, 1:06 a.m. UTC | #10
On 2021/03/04 09:45AM, Segher Boessenkool wrote:
> On Wed, Mar 03, 2021 at 10:01:27PM +0530, Naveen N. Rao wrote:
> > On 2021/03/01 08:37PM, Segher Boessenkool wrote:
> > > > And, r6 always ends up with 0xaea. It changes with the value I put into 
> > > > r6 though.
> > > 
> > > That is exactly the behaviour specified for p8.  0aaa+0040=0aea.
> > > 
> > > > Granted, this is all up in the air, but it does look like there is more 
> > > > going on and the value isn't the EA or the value at the address.
> > > 
> > > That *is* the EA.  The EA is the address the insn does the access at.
> > 
> > I'm probably missing something here. 0xaaa is the value I stored at an 
> > offset of 64 bytes from the stack pointer (r1 is copied into r6). In the 
> > ldu instruction above, the EA is 64(r6), which should translate to 
> > r1+64.  The data returned by the load would be 0xaaa, which should be 
> > discarded per the description you provided above. So, I would expect to 
> > see a 0xc0.. address in r6.
> 
> Yes, I misread your code it seems.
> 
> > In fact, this looks to be the behavior documented for P9:
> > 
> > > > Power9 does:
> > > >
> > > >   Load with Update Instructions (RA = 0)
> > > >     EA is placed into R0.
> > > >   Load with Update Instructions (RA = RT)
> > > >     The storage operand addressed by EA is accessed. The 
> > > >     displacement
> > > >     field is added to the data returned by the load and placed into 
> > > >     RT.
> 
> Yup.  So on what cpu did you test?

I tested this on two processors:
2.0 (pvr 004d 0200)
2.1 (pvr 004b 0201)

I guess the behavior changed some time during P8, but I don't have a P9 
to test this on.

In any case, this shouldn't matter too much for us as you rightly point 
out:

> 
> Either way, the kernel should not emulate any particular cpu here, I'd
> say, esp. since recent cpus do different things for this invalid form.

Ack.


Thanks!
- Naveen
Segher Boessenkool March 4, 2021, 3:45 p.m. UTC | #11
On Wed, Mar 03, 2021 at 10:01:27PM +0530, Naveen N. Rao wrote:
> On 2021/03/01 08:37PM, Segher Boessenkool wrote:
> > > And, r6 always ends up with 0xaea. It changes with the value I put into 
> > > r6 though.
> > 
> > That is exactly the behaviour specified for p8.  0aaa+0040=0aea.
> > 
> > > Granted, this is all up in the air, but it does look like there is more 
> > > going on and the value isn't the EA or the value at the address.
> > 
> > That *is* the EA.  The EA is the address the insn does the access at.
> 
> I'm probably missing something here. 0xaaa is the value I stored at an 
> offset of 64 bytes from the stack pointer (r1 is copied into r6). In the 
> ldu instruction above, the EA is 64(r6), which should translate to 
> r1+64.  The data returned by the load would be 0xaaa, which should be 
> discarded per the description you provided above. So, I would expect to 
> see a 0xc0.. address in r6.

Yes, I misread your code it seems.

> In fact, this looks to be the behavior documented for P9:
> 
> > > Power9 does:
> > >
> > >   Load with Update Instructions (RA = 0)
> > >     EA is placed into R0.
> > >   Load with Update Instructions (RA = RT)
> > >     The storage operand addressed by EA is accessed. The 
> > >     displacement
> > >     field is added to the data returned by the load and placed into 
> > >     RT.

Yup.  So on what cpu did you test?

Either way, the kernel should not emulate any particular cpu here, I'd
say, esp. since recent cpus do different things for this invalid form.


Segher

Patch

diff --git a/arch/powerpc/lib/sstep.c b/arch/powerpc/lib/sstep.c
index e96cff845ef7..db824fec6165 100644
--- a/arch/powerpc/lib/sstep.c
+++ b/arch/powerpc/lib/sstep.c
@@ -2232,11 +2232,15 @@  int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
 
 		case 23:	/* lwzx */
 		case 55:	/* lwzux */
+			if (u && (ra == 0 || ra == rd))
+				goto unknown_opcode;
 			op->type = MKOP(LOAD, u, 4);
 			break;
 
 		case 87:	/* lbzx */
 		case 119:	/* lbzux */
+			if (u && (ra == 0 || ra == rd))
+				goto unknown_opcode;
 			op->type = MKOP(LOAD, u, 1);
 			break;
 
@@ -2290,6 +2294,8 @@  int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
 #ifdef __powerpc64__
 		case 21:	/* ldx */
 		case 53:	/* ldux */
+			if (u && (ra == 0 || ra == rd))
+				goto unknown_opcode;
 			op->type = MKOP(LOAD, u, 8);
 			break;
 
@@ -2311,18 +2317,24 @@  int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
 
 		case 279:	/* lhzx */
 		case 311:	/* lhzux */
+			if (u && (ra == 0 || ra == rd))
+				goto unknown_opcode;
 			op->type = MKOP(LOAD, u, 2);
 			break;
 
 #ifdef __powerpc64__
 		case 341:	/* lwax */
 		case 373:	/* lwaux */
+			if (u && (ra == 0 || ra == rd))
+				goto unknown_opcode;
 			op->type = MKOP(LOAD, SIGNEXT | u, 4);
 			break;
 #endif
 
 		case 343:	/* lhax */
 		case 375:	/* lhaux */
+			if (u && (ra == 0 || ra == rd))
+				goto unknown_opcode;
 			op->type = MKOP(LOAD, SIGNEXT | u, 2);
 			break;
 
@@ -2656,12 +2668,16 @@  int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
 
 	case 32:	/* lwz */
 	case 33:	/* lwzu */
+		if (u && (ra == 0 || ra == rd))
+			goto unknown_opcode;
 		op->type = MKOP(LOAD, u, 4);
 		op->ea = dform_ea(word, regs);
 		break;
 
 	case 34:	/* lbz */
 	case 35:	/* lbzu */
+		if (u && (ra == 0 || ra == rd))
+			goto unknown_opcode;
 		op->type = MKOP(LOAD, u, 1);
 		op->ea = dform_ea(word, regs);
 		break;
@@ -2680,12 +2696,16 @@  int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
 
 	case 40:	/* lhz */
 	case 41:	/* lhzu */
+		if (u && (ra == 0 || ra == rd))
+			goto unknown_opcode;
 		op->type = MKOP(LOAD, u, 2);
 		op->ea = dform_ea(word, regs);
 		break;
 
 	case 42:	/* lha */
 	case 43:	/* lhau */
+		if (u && (ra == 0 || ra == rd))
+			goto unknown_opcode;
 		op->type = MKOP(LOAD, SIGNEXT | u, 2);
 		op->ea = dform_ea(word, regs);
 		break;
@@ -2779,6 +2799,8 @@  int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
 			op->type = MKOP(LOAD, 0, 8);
 			break;
 		case 1:		/* ldu */
+			if (ra == 0 || ra == rd)
+				goto unknown_opcode;
 			op->type = MKOP(LOAD, UPDATE, 8);
 			break;
 		case 2:		/* lwa */