[1/2] powerpc: Fix user data corruption with P9N DD2.1 VSX CI load workaround emulation

Message ID: 20201013043741.743413-1-mikey@neuling.org
State: Accepted
Commit: 1da4a0272c5469169f78cd76cf175ff984f52f06
Series: [1/2] powerpc: Fix user data corruption with P9N DD2.1 VSX CI load workaround emulation

Checks

Context                      | Check   | Description
-----------------------------+---------+------------------------------------------------------------
snowpatch_ozlabs/apply_patch | success | Successfully applied on branch powerpc/merge (118be7377c97e35c33819bcb3bbbae5a42a4ac43)
snowpatch_ozlabs/checkpatch  | success | total: 0 errors, 0 warnings, 0 checks, 8 lines checked
snowpatch_ozlabs/needsstable | success | Patch is tagged for stable

Commit Message

Michael Neuling Oct. 13, 2020, 4:37 a.m. UTC
__get_user_atomic_128_aligned() stores to kaddr using stvx, which is a
VMX store instruction, hence kaddr must be 16-byte aligned, otherwise
the store won't occur as expected.

Unfortunately, when we call __get_user_atomic_128_aligned() in
p9_hmi_special_emu(), the buffer we pass as kaddr (i.e. vbuf) isn't
guaranteed to be 16-byte aligned. This means that the write to vbuf in
__get_user_atomic_128_aligned() has the bottom bits of the address
truncated. This results in other local variables being
overwritten. Also, vbuf will not contain the correct data, which
results in the userspace emulation being wrong and hence user data
corruption.
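
As an aside, the truncation is easy to picture in a plain user-space
sketch (illustrative only, not kernel code): the Power ISA has stvx
clear the low four bits of the effective address, so a 16-byte store
through a misaligned pointer silently lands on the preceding 16-byte
boundary.

#include <stdint.h>
#include <stdio.h>

/* Sketch only: per the Power ISA, stvx stores 16 bytes at EA & ~0xf,
 * i.e. the low four bits of the effective address are silently dropped. */
static uintptr_t stvx_effective_addr(uintptr_t ea)
{
	return ea & ~(uintptr_t)0xf;
}

int main(void)
{
	uint8_t frame[32] __attribute__((aligned(16)));
	uintptr_t kaddr = (uintptr_t)&frame[4];	/* deliberately misaligned */

	printf("requested store at %#lx\n", (unsigned long)kaddr);
	printf("actual store at    %#lx\n",
	       (unsigned long)stvx_effective_addr(kaddr));
	/* The 16-byte store lands 4 bytes below the intended address,
	 * clobbering whatever sits there, i.e. other locals in our case. */
	return 0;
}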

In the past we've been mostly lucky as vbuf has ended up aligned but
this is fragile and isn't always true. CONFIG_STACKPROTECTOR in
particular can change the stack arrangement enough that our luck runs
out.

This issue only occurs on POWER9 Nimbus <= DD2.1 bare metal.

The fix is to align vbuf to a 16-byte boundary.

Fixes: 5080332c2c89 ("powerpc/64s: Add workaround for P9 vector CI load issue")
Signed-off-by: Michael Neuling <mikey@neuling.org>
Cc: <stable@vger.kernel.org> # v4.15+
---
 arch/powerpc/kernel/traps.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Michael Ellerman Oct. 14, 2020, 12:13 a.m. UTC | #1
Michael Neuling <mikey@neuling.org> writes:
> __get_user_atomic_128_aligned() stores to kaddr using stvx, which is a
> VMX store instruction, hence kaddr must be 16-byte aligned, otherwise
> the store won't occur as expected.
>
> Unfortunately, when we call __get_user_atomic_128_aligned() in
> p9_hmi_special_emu(), the buffer we pass as kaddr (i.e. vbuf) isn't
> guaranteed to be 16-byte aligned. This means that the write to vbuf in
> __get_user_atomic_128_aligned() has the bottom bits of the address
> truncated. This results in other local variables being
> overwritten. Also, vbuf will not contain the correct data, which
> results in the userspace emulation being wrong and hence user data
> corruption.
>
> In the past we've been mostly lucky as vbuf has ended up aligned but
> this is fragile and isn't always true. CONFIG_STACKPROTECTOR in
> particular can change the stack arrangement enough that our luck runs
> out.

Actually I'm yet to find a kernel with CONFIG_STACKPROTECTOR=n that is
vulnerable to the bug.

Turning on STACKPROTECTOR changes the order GCC allocates locals on the
stack, from bottom-up to top-down. That, in conjunction with the 8-byte
stack canary, means we end up with 8 bytes of space below the locals,
which misaligns vbuf.

But obviously other things can change the stack layout too, so no
guarantees that CONFIG_STACKPROTECTOR=n makes it safe.
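
As a rough illustration (plain user space, GCC assumed; a sketch rather
than the kernel case), one way to see the reordering is to build a
function with an unannotated 16-byte local both ways and compare its
address:

#include <stdint.h>
#include <stdio.h>

/* Build both ways and compare the low bits of vbuf's address:
 *   gcc -O2 -fno-stack-protector align-demo.c && ./a.out
 *   gcc -O2 -fstack-protector-strong align-demo.c && ./a.out
 * The exact layout is compiler- and version-dependent, so results vary. */
int main(void)
{
	unsigned long ea = 0, msr = 0;	/* stand-ins for the other locals */
	uint8_t vbuf[16];		/* no __aligned(16) annotation */

	printf("vbuf at %p, low bits %#lx\n",
	       (void *)vbuf, (unsigned long)((uintptr_t)vbuf & 0xf));
	return (int)(ea + msr);		/* keep the locals live */
}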

cheers
Michael Ellerman Oct. 14, 2020, 12:32 a.m. UTC | #2
Michael Neuling <mikey@neuling.org> writes:
> __get_user_atomic_128_aligned() stores to kaddr using stvx which is a
> VMX store instruction, hence kaddr must be 16 byte aligned otherwise
> the store won't occur as expected.
>
> Unfortunately when we call __get_user_atomic_128_aligned() in
> p9_hmi_special_emu(), the buffer we pass as kaddr (ie. vbuf) isn't
> guaranteed to be 16B aligned. This means that the write to vbuf in
> __get_user_atomic_128_aligned() has the bottom bits of the address
> truncated. This results in other local variables being
> overwritten. Also vbuf will not contain the correct data which results
> in the userspace emulation being wrong and hence user data corruption.
>
> In the past we've been mostly lucky as vbuf has ended up aligned but
> this is fragile and isn't always true. CONFIG_STACKPROTECTOR in
> particular can change the stack arrangement enough that our luck runs
> out.

Below is a script which takes a System.map and vmlinux (or objdump
output) and tries to check if the stack layout is susceptible to the
bug.

cheers



#!/usr/bin/python3

import os
import sys
import re
from subprocess import Popen, PIPE


# eg: c00000000002ea88:       ce 49 00 7c     stvx    v0,0,r9
stvx_pattern = re.compile(r'^c[0-9a-f]{15}:\s+(?:[0-9a-f]{2} ){4}\s+stvx\s+v0,0,(r\d+)\s*')

# eg: c00000000002ea80:       28 00 21 39     addi    r9,r1,40
addi_pattern = r'^c[0-9a-f]{15}:\s+(?:[0-9a-f]{2} ){4}\s+addi\s+%s,r1,(\d+)\s*'


def main(args):
    if len(args) != 2:
        print('Usage: %s <objdump|vmlinux> <System.map>' % sys.argv[0])
        return -1

    # main() receives sys.argv[1:], so args[0]/args[1] are the two inputs
    if os.path.basename(args[0]).startswith('vmlinu'):
        # Disassemble the vmlinux on the fly rather than requiring a dump file
        dump = Popen(['objdump', '-d', args[0]], stdout=PIPE, encoding='utf-8').stdout
    else:
        dump = open(args[0])

    syms = read_symbols(args[1])

    func_lines = extract_func(dump, 'handle_hmi_exception', syms)
    if func_lines is None:
        print("Error: couldn't find handle_hmi_exception in objdump output")
        return -1

    # Scan forwards for the stvx (from the inlined
    # __get_user_atomic_128_aligned()) and note which register holds the
    # destination address.
    match = None
    i = 0
    while i < len(func_lines):
        match = stvx_pattern.match(func_lines[i])
        if match:
            break
        i += 1

    if match is None:
        print("Error: couldn't find stvx in handle_hmi_exception")
        return -1

    stvx_reg = match.group(1)
    print('stvx found using register %s:\n%s\n' % (stvx_reg, match.group(0).rstrip()))

    # Scan backwards for the addi that computes that register from r1 (the
    # stack pointer); its immediate is vbuf's offset within the stack frame.
    pattern = re.compile(addi_pattern % stvx_reg)
    match = None
    i -= 1
    while i > 0:
        match = pattern.match(func_lines[i])
        if match:
            break
        i -= 1

    if match is None:
        print("Error: couldn't find addi in handle_hmi_exception")
        return -1

    stack_offset = int(match.group(1))
    print('addi found using offset %d:\n%s\n' % (stack_offset, match.group(0).rstrip()))

    # stvx needs a 16-byte-aligned address, and r1 is 16-byte aligned per
    # the ABI, so a misaligned frame offset means a misaligned vbuf.
    if stack_offset & 0xf:
        print('!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!')
        print('!! Offset is misaligned - bug present !!')
        print('!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!')
        return 1
    else:
        print('OK - offset is aligned')

    return 0


def extract_func(f, func_name, syms):
    func_addr, func_size = find_symbol_and_size(syms, func_name)
    if func_addr is None:
        return None
    num_lines = func_size // 4  # one objdump line per 4-byte instruction

    pattern = re.compile('^%016x:' % func_addr)

    match = None
    line = f.readline()
    while len(line):
        match = pattern.match(line)
        if match:
            break
        line = f.readline()

    if match is None:
        return None

    lines = []
    for i in range(0, num_lines):
        lines.append(f.readline())

    return lines


def read_symbols(map_path):
    lines = open(map_path).readlines()

    addrs = []
    last_addr = 0
    for line in lines:
        tokens = line.split()
        if len(tokens) == 3:
            addr = int(tokens[0], 16)
            sym_type = tokens[1]
            name = tokens[2]
        elif len(tokens) == 2:
            addr = last_addr
            sym_type = tokens[0]
            name = tokens[1]
        else:
            raise Exception("Couldn't grok System.map")

        addrs.append((addr, name, sym_type))
        last_addr = addr

    return addrs


def find_symbol_and_size(symbol_map, name):
    # ELFv1 kernels prefix function text symbols with a dot, so accept both
    dot_name = '.%s' % name
    saddr = None
    i = 0
    for addr, cur_name, sym_type in symbol_map:
        if cur_name == name or cur_name == dot_name:
            saddr = addr
            break
        i += 1

    if saddr is None:
        return (None, None)

    i += 1
    if i >= len(symbol_map):
        size = -1
    else:
        size = symbol_map[i][0] - saddr

    return (saddr, size)


if __name__ == '__main__':
    sys.exit(main(sys.argv[1:]))
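
(For reference, with a hypothetical script name: run it as
"./check-stack-align.py vmlinux System.map", or pass a saved "objdump -d"
listing in place of the vmlinux; it prints the warning banner and exits
non-zero when handle_hmi_exception's vbuf offset from r1 is not a
multiple of 16.)
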
Michael Ellerman Oct. 20, 2020, 12:23 p.m. UTC | #3
On Tue, 13 Oct 2020 15:37:40 +1100, Michael Neuling wrote:
> __get_user_atomic_128_aligned() stores to kaddr using stvx, which is a
> VMX store instruction, hence kaddr must be 16-byte aligned, otherwise
> the store won't occur as expected.
>
> Unfortunately, when we call __get_user_atomic_128_aligned() in
> p9_hmi_special_emu(), the buffer we pass as kaddr (i.e. vbuf) isn't
> guaranteed to be 16-byte aligned. This means that the write to vbuf in
> __get_user_atomic_128_aligned() has the bottom bits of the address
> truncated. This results in other local variables being
> overwritten. Also, vbuf will not contain the correct data, which
> results in the userspace emulation being wrong and hence user data
> corruption.
> 
> [...]

Applied to powerpc/fixes.

[1/2] powerpc: Fix undetected data corruption with P9N DD2.1 VSX CI load emulation
      https://git.kernel.org/powerpc/c/1da4a0272c5469169f78cd76cf175ff984f52f06
[2/2] selftests/powerpc: Make alignment handler test P9N DD2.1 vector CI load workaround
      https://git.kernel.org/powerpc/c/d1781f23704707d350b8c9006e2bdf5394bf91b2

cheers

Patch

diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
index c5f39f13e96e..5006dcbe1d9f 100644
--- a/arch/powerpc/kernel/traps.c
+++ b/arch/powerpc/kernel/traps.c
@@ -885,7 +885,7 @@  static void p9_hmi_special_emu(struct pt_regs *regs)
 {
 	unsigned int ra, rb, t, i, sel, instr, rc;
 	const void __user *addr;
-	u8 vbuf[16], *vdst;
+	u8 vbuf[16] __aligned(16), *vdst;
 	unsigned long ea, msr, msr_mask;
 	bool swap;
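
For context on the annotation: __aligned(16) is the kernel's shorthand
for the GCC alignment attribute, so the fix pins vbuf to a 16-byte
boundary regardless of how the surrounding locals are laid out. A
minimal stand-alone sketch of the effect (the macro is redefined here
to mirror the kernel's definition):

#include <stdint.h>
#include <stdio.h>

/* Stand-alone equivalent of the kernel's __aligned() helper macro. */
#define __aligned(x) __attribute__((__aligned__(x)))

int main(void)
{
	uint8_t vbuf[16] __aligned(16);

	/* With the annotation the low four bits of the address are always
	 * zero, so the stvx in __get_user_atomic_128_aligned() stores
	 * exactly where the caller expects. */
	printf("vbuf low bits: %#lx\n", (unsigned long)((uintptr_t)vbuf & 0xf));
	return 0;
}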