diff mbox series

sparc64: Fix opcode filtering in handling of no fault loads

Message ID 1614577696-27586-1-git-send-email-rob.gardner@oracle.com
State Accepted
Delegated to: David Miller
Headers show
Series sparc64: Fix opcode filtering in handling of no fault loads | expand

Commit Message

Rob Gardner March 1, 2021, 5:48 a.m. UTC
is_no_fault_exception() has two bugs which were discovered via random
opcode testing with stress-ng. Both are caused by improper filtering
of opcodes.

The first bug can be triggered by a floating point store with a no-fault
ASI, for instance "sta %f0, [%g0] #ASI_PNF", opcode C1A01040.

The code first tests op3[5] (0x1000000), which denotes a floating
point instruction, and then tests op3[2] (0x200000), which denotes a
store instruction. But these bits are not mutually exclusive, and the
above mentioned opcode has both bits set. The intent is to filter out
stores, so the test for stores must be done first in order to have
any effect.

The second bug can be triggered by a floating point load with one of
the invalid ASI values 0x8e or 0x8f, which pass this check in
is_no_fault_exception():
     if ((asi & 0xf2) == ASI_PNF)

An example instruction is "ldqa [%l7 + %o7] #ASI 0x8f, %f38",
opcode CF95D1EF. Asi values greater than 0x8b (ASI_SNFL) are fatal
in handle_ldf_stq(), and is_no_fault_exception() must not allow these
invalid asi values to make it that far.

In both of these cases, handle_ldf_stq() reacts by calling
sun4v_data_access_exception() or spitfire_data_access_exception(),
which call is_no_fault_exception() and results in an infinite
recursion.

Signed-off-by: Rob Gardner <rob.gardner@oracle.com>
---
 arch/sparc/kernel/traps_64.c | 13 ++++++-------
 1 file changed, 6 insertions(+), 7 deletions(-)

Comments

Anatoly Pugachev March 1, 2021, 12:56 p.m. UTC | #1
On Mon, Mar 1, 2021 at 9:09 AM Rob Gardner <rob.gardner@oracle.com> wrote:
>
> is_no_fault_exception() has two bugs which were discovered via random
> opcode testing with stress-ng. Both are caused by improper filtering
> of opcodes.

Rob, tested on my ldom, works perfectly now...

$ uname -a
Linux ttip 5.12.0-rc1-dirty #195 SMP Mon Mar 1 15:46:15 MSK 2021
sparc64 GNU/Linux

$ stress-ng --opcode 1 --timeout 60 --metrics-brief
stress-ng: info:  [945] dispatching hogs: 1 opcode
stress-ng: info:  [945] successful run completed in 60.00s (1 min, 0.00 secs)
stress-ng: info:  [945] stressor       bogo ops real time  usr time
sys time   bogo ops/s   bogo ops/s
stress-ng: info:  [945]                           (secs)    (secs)
(secs)   (real time) (usr+sys time)
stress-ng: info:  [945] opcode            17847     60.00     27.45
 34.03       297.45       290.29

Thank you for a quick fix.
Rob Gardner March 1, 2021, 3:05 p.m. UTC | #2
On 3/1/21 5:56 AM, Anatoly Pugachev wrote:
> On Mon, Mar 1, 2021 at 9:09 AM Rob Gardner <rob.gardner@oracle.com> wrote:
>> is_no_fault_exception() has two bugs which were discovered via random
>> opcode testing with stress-ng. Both are caused by improper filtering
>> of opcodes.
> Rob, tested on my ldom, works perfectly now...
>
> $ uname -a
> Linux ttip 5.12.0-rc1-dirty #195 SMP Mon Mar 1 15:46:15 MSK 2021
> sparc64 GNU/Linux
>
> $ stress-ng --opcode 1 --timeout 60 --metrics-brief
> stress-ng: info:  [945] dispatching hogs: 1 opcode
> stress-ng: info:  [945] successful run completed in 60.00s (1 min, 0.00 secs)
> stress-ng: info:  [945] stressor       bogo ops real time  usr time
> sys time   bogo ops/s   bogo ops/s
> stress-ng: info:  [945]                           (secs)    (secs)
> (secs)   (real time) (usr+sys time)
> stress-ng: info:  [945] opcode            17847     60.00     27.45
>   34.03       297.45       290.29
>
> Thank you for a quick fix.


You're welcome. Please add your "tested-by" if you like.

Rob
Anatoly Pugachev March 1, 2021, 4:03 p.m. UTC | #3
On Mon, Mar 1, 2021 at 9:09 AM Rob Gardner <rob.gardner@oracle.com> wrote:
>
> is_no_fault_exception() has two bugs which were discovered via random
> opcode testing with stress-ng. Both are caused by improper filtering
> of opcodes.

fixes the issue for me. thanks again

Tested-by: Anatoly Pugachev <matorola@gmail.com>
John Paul Adrian Glaubitz March 2, 2021, 2:44 p.m. UTC | #4
Hi!

On 3/1/21 4:05 PM, Rob Gardner wrote:
> On 3/1/21 5:56 AM, Anatoly Pugachev wrote:
>> On Mon, Mar 1, 2021 at 9:09 AM Rob Gardner <rob.gardner@oracle.com> wrote:
>>> is_no_fault_exception() has two bugs which were discovered via random
>>> opcode testing with stress-ng. Both are caused by improper filtering
>>> of opcodes.
>> Rob, tested on my ldom, works perfectly now...
>>
>> $ uname -a
>> Linux ttip 5.12.0-rc1-dirty #195 SMP Mon Mar 1 15:46:15 MSK 2021
>> sparc64 GNU/Linux
>>
>> $ stress-ng --opcode 1 --timeout 60 --metrics-brief
>> stress-ng: info:  [945] dispatching hogs: 1 opcode
>> stress-ng: info:  [945] successful run completed in 60.00s (1 min, 0.00 secs)
>> stress-ng: info:  [945] stressor       bogo ops real time  usr time
>> sys time   bogo ops/s   bogo ops/s
>> stress-ng: info:  [945]                           (secs)    (secs)
>> (secs)   (real time) (usr+sys time)
>> stress-ng: info:  [945] opcode            17847     60.00     27.45
>>   34.03       297.45       290.29
>>
>> Thank you for a quick fix.
> 
> 
> You're welcome. Please add your "tested-by" if you like.

Any chance we could get this patch into 5.12 and the stable kernels? (CC Greg)

Adrian
Greg Kroah-Hartman March 2, 2021, 2:46 p.m. UTC | #5
On Tue, Mar 02, 2021 at 03:44:54PM +0100, John Paul Adrian Glaubitz wrote:
> Hi!
> 
> On 3/1/21 4:05 PM, Rob Gardner wrote:
> > On 3/1/21 5:56 AM, Anatoly Pugachev wrote:
> >> On Mon, Mar 1, 2021 at 9:09 AM Rob Gardner <rob.gardner@oracle.com> wrote:
> >>> is_no_fault_exception() has two bugs which were discovered via random
> >>> opcode testing with stress-ng. Both are caused by improper filtering
> >>> of opcodes.
> >> Rob, tested on my ldom, works perfectly now...
> >>
> >> $ uname -a
> >> Linux ttip 5.12.0-rc1-dirty #195 SMP Mon Mar 1 15:46:15 MSK 2021
> >> sparc64 GNU/Linux
> >>
> >> $ stress-ng --opcode 1 --timeout 60 --metrics-brief
> >> stress-ng: info:  [945] dispatching hogs: 1 opcode
> >> stress-ng: info:  [945] successful run completed in 60.00s (1 min, 0.00 secs)
> >> stress-ng: info:  [945] stressor       bogo ops real time  usr time
> >> sys time   bogo ops/s   bogo ops/s
> >> stress-ng: info:  [945]                           (secs)    (secs)
> >> (secs)   (real time) (usr+sys time)
> >> stress-ng: info:  [945] opcode            17847     60.00     27.45
> >>   34.03       297.45       290.29
> >>
> >> Thank you for a quick fix.
> > 
> > 
> > You're welcome. Please add your "tested-by" if you like.
> 
> Any chance we could get this patch into 5.12 and the stable kernels? (CC Greg)

<formletter>

This is not the correct way to submit patches for inclusion in the
stable kernel tree.  Please read:
    https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html
for how to do this properly.

</formletter>
John Paul Adrian Glaubitz March 2, 2021, 2:50 p.m. UTC | #6
Hi Greg!

On 3/2/21 3:46 PM, Greg Kroah-Hartman wrote:
>> On 3/1/21 4:05 PM, Rob Gardner wrote:
>>> On 3/1/21 5:56 AM, Anatoly Pugachev wrote:
>>>> On Mon, Mar 1, 2021 at 9:09 AM Rob Gardner <rob.gardner@oracle.com> wrote:
>>>>> is_no_fault_exception() has two bugs which were discovered via random
>>>>> opcode testing with stress-ng. Both are caused by improper filtering
>>>>> of opcodes.
>>>> Rob, tested on my ldom, works perfectly now...
>>>>
>>>> $ uname -a
>>>> Linux ttip 5.12.0-rc1-dirty #195 SMP Mon Mar 1 15:46:15 MSK 2021
>>>> sparc64 GNU/Linux
>>>>
>>>> $ stress-ng --opcode 1 --timeout 60 --metrics-brief
>>>> stress-ng: info:  [945] dispatching hogs: 1 opcode
>>>> stress-ng: info:  [945] successful run completed in 60.00s (1 min, 0.00 secs)
>>>> stress-ng: info:  [945] stressor       bogo ops real time  usr time
>>>> sys time   bogo ops/s   bogo ops/s
>>>> stress-ng: info:  [945]                           (secs)    (secs)
>>>> (secs)   (real time) (usr+sys time)
>>>> stress-ng: info:  [945] opcode            17847     60.00     27.45
>>>>   34.03       297.45       290.29
>>>>
>>>> Thank you for a quick fix.
>>>
>>>
>>> You're welcome. Please add your "tested-by" if you like.
>>
>> Any chance we could get this patch into 5.12 and the stable kernels? (CC Greg)
> 
> <formletter>
> 
> This is not the correct way to submit patches for inclusion in the
> stable kernel tree.  Please read:
>     https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html
> for how to do this properly.
> 
> </formletter>

Thanks for the heads-up. I'll leave it to Rob to decide when to send it.

Adrian
Anatoly Pugachev March 9, 2021, 2:52 p.m. UTC | #7
On Mon, Mar 1, 2021 at 7:03 PM Anatoly Pugachev <matorola@gmail.com> wrote:
>
> On Mon, Mar 1, 2021 at 9:09 AM Rob Gardner <rob.gardner@oracle.com> wrote:
> >
> > is_no_fault_exception() has two bugs which were discovered via random
> > opcode testing with stress-ng. Both are caused by improper filtering
> > of opcodes.
>
> fixes the issue for me. thanks again
>
> Tested-by: Anatoly Pugachev <matorola@gmail.com>

Rob,

you might want to add:

Fixes: b6fe1089667a7afcc2cf92cdaec590c7b8381715

as well , on a patch submission.
diff mbox series

Patch

diff --git a/arch/sparc/kernel/traps_64.c b/arch/sparc/kernel/traps_64.c
index d92e5ea..a850dcc 100644
--- a/arch/sparc/kernel/traps_64.c
+++ b/arch/sparc/kernel/traps_64.c
@@ -275,14 +275,13 @@  bool is_no_fault_exception(struct pt_regs *regs)
 			asi = (regs->tstate >> 24); /* saved %asi       */
 		else
 			asi = (insn >> 5);	    /* immediate asi    */
-		if ((asi & 0xf2) == ASI_PNF) {
-			if (insn & 0x1000000) {     /* op3[5:4]=3       */
-				handle_ldf_stq(insn, regs);
-				return true;
-			} else if (insn & 0x200000) { /* op3[2], stores */
+		if ((asi & 0xf6) == ASI_PNF) {
+			if (insn & 0x200000)        /* op3[2], stores   */
 				return false;
-			}
-			handle_ld_nf(insn, regs);
+			if (insn & 0x1000000)       /* op3[5:4]=3 (fp)  */
+				handle_ldf_stq(insn, regs);
+			else
+				handle_ld_nf(insn, regs);
 			return true;
 		}
 	}