Message ID | 20240620060014.605563-2-marcin.juszkiewicz@linaro.org |
---|---|
State | New |
Series | tests/avocado: make sbsa-ref working with >1 core |
On Thu, 20 Jun 2024 at 07:00, Marcin Juszkiewicz
<marcin.juszkiewicz@linaro.org> wrote:
>
> I was wondering why avocado tests passed with firmware which crashes
> when anyone else is using it.
>
> Turned out that amount of cores matters. Have to find out why still.

This commit message confuses me. It reads like "running with two cores
will make the guest crash", i.e. "apply this patch and the test suite
will stop passing". I assume that's not the case, but what's actually
going on here?

thanks
-- PMM
On 20.06.2024 at 11:34, Peter Maydell wrote:
> On Thu, 20 Jun 2024 at 07:00, Marcin Juszkiewicz
> <marcin.juszkiewicz@linaro.org> wrote:
>>
>> I was wondering why avocado tests passed with firmware which
>> crashes when anyone else is using it.
>>
>> Turned out that amount of cores matters. Have to find out why
>> still.
>
> This commit message confuses me.

Had no idea how to write it in a more readable form. Will reword it for
v3 (with reverse order of patches, as recommended by Philippe).

> It reads like "running with two cores will make the guest crash",
> i.e. "apply this patch and the test suite will stop passing". I
> assume that's not the case, but what's actually going on here?

That's exactly the case. With the sbsa-ref firmware which qemu uses now
we get a crash if more than 1 core is used. The Avocado test hardcoded
"-smp 1" and was passing fine.

And I forgot to mail qemu-devel when I got hit by that crash.

This week Rebecca Cran pointed out to me that the crash is in
BootLogoLib in EDK2, and I wrote a workaround to make things work. Then
Ard Biesheuvel found the real reason, fixed QemuVideoDxe in EDK2, and we
got sbsa-ref running with any number of cores.

The commit message of the fix:

commit c1d1910be6e04a8b1a73090cf2881fb698947a6e
Author: Ard Biesheuvel <ardb@kernel.org>
Date:   Mon Jun 17 17:07:41 2024 +0200

    OvmfPkg/QemuVideoDxe: add feature PCD to remap framebuffer W/C

    Some platforms (such as SBSA-QEMU on recent builds of the emulator)
    only tolerate misaligned accesses to normal memory, and raise
    alignment faults on such accesses to device memory, which is the
    default for PCIe MMIO BARs.

    When emulating a PCIe graphics controller, the framebuffer is
    typically exposed via a MMIO BAR, while the disposition of the
    region is closer to memory (no side effects on reads or writes,
    except for the changing picture on the screen; direct random access
    to any pixel in the image).

    In order to permit the use of such controllers on platforms that
    only tolerate these types of accesses for normal memory, it is
    necessary to remap the memory. Use the DXE services to set the
    desired capabilities and attributes.

    Hide this behavior under a feature PCD so only platforms that
    really need it can enable it. (OVMF on x86 has no need for this)
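As a small illustration of the "misaligned access" that the commit message above refers to, here is a minimal sketch of a natural-alignment check. The function name and the example addresses are hypothetical and are not taken from EDK2 or QEMU; the sketch only encodes the usual rule that an access of N bytes is naturally aligned when its address is a multiple of N.

```python
def is_naturally_aligned(address: int, access_size: int) -> bool:
    """Return True if an access of `access_size` bytes at `address`
    is naturally aligned (address is a multiple of the access size)."""
    # Access sizes are expected to be powers of two (1, 2, 4, 8, ...).
    if access_size <= 0 or access_size & (access_size - 1):
        raise ValueError("access_size must be a positive power of two")
    return address % access_size == 0


# Hypothetical framebuffer base address, for illustration only.
FRAMEBUFFER_BASE = 0x8000_0000

# A 4-byte write at base + 4 is aligned; the same write at base + 6 is not.
aligned = is_naturally_aligned(FRAMEBUFFER_BASE + 4, 4)
misaligned = not is_naturally_aligned(FRAMEBUFFER_BASE + 6, 4)
```

According to the commit message, it is the second kind of access that raises an alignment fault when the framebuffer BAR is mapped as device memory, and that is tolerated once the region is remapped.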
On Thu, 20 Jun 2024 at 10:55, Marcin Juszkiewicz
<marcin.juszkiewicz@linaro.org> wrote:
>
> On 20.06.2024 at 11:34, Peter Maydell wrote:
> > On Thu, 20 Jun 2024 at 07:00, Marcin Juszkiewicz
> > <marcin.juszkiewicz@linaro.org> wrote:
> >>
> >> I was wondering why avocado tests passed with firmware which
> >> crashes when anyone else is using it.
> >>
> >> Turned out that amount of cores matters. Have to find out why
> >> still.
> >
> > This commit message confuses me.
>
> Had no idea how to write in more readable form. Will reword it for v3
> (with reverse order of patches as recommended by Philippe.
>
> > It reads like "running with two cores will make the guest crash",
> > i.e. "apply this patch and the test suite will stop passing". I
> > assume that's not the case, but what's actually going on here?
>
> That's exactly the case. With sbsa-ref firmware which qemu uses now we
> have crash if more than 1 core is used. Avocado test hardcoded "-smp 1"
> and was passing fine.
>
> And I forgot to mail qemu-devel when I got hit by that crash.
>
> This week Rebecca Cran pointed me that crash is in BootLogoLib in EDK2
> and I wrote some workaround for make things work. Then Ard Biesheuvel
> found the real reason, fixed QemuVideoDxe in EDK2 and we got sbsa-ref
> running with any amount of cores.

Oh, OK, so it's just random bad luck that enabling the second CPU means
that we end up doing an unaligned access to the framebuffer, I guess.

Then, yes, Philippe is right and we need to update the sbsa-ref
firmware we're using for the test first, to avoid breaking bisection.

For a commit message for this patch, maybe something like:

  The version of the sbsa-ref EDK2 firmware we used to use in this test
  had a bug where it might make an unaligned access to the framebuffer,
  which causes a guest crash on newer versions of QEMU where we enforce
  the architectural requirement that unaligned accesses to Device
  memory should take an exception.

  We happened to not notice this because our test was booting with
  "-smp 1" and through luck this didn't write the boot logo to the
  framebuffer at an unaligned address; but trying to boot the same
  firmware with two CPUs would result in a guest crash.

  Now we have updated the firmware we're using for the test, we can
  make the test use all the cores on the board, so we are testing the
  SMP boot path.

?

thanks
-- PMM
diff --git a/tests/avocado/machine_aarch64_sbsaref.py b/tests/avocado/machine_aarch64_sbsaref.py
index 6bb82f2a03..136b495096 100644
--- a/tests/avocado/machine_aarch64_sbsaref.py
+++ b/tests/avocado/machine_aarch64_sbsaref.py
@@ -75,8 +75,6 @@ def fetch_firmware(self):
             f"if=pflash,file={fs0_path},format=raw",
             "-drive",
             f"if=pflash,file={fs1_path},format=raw",
-            "-smp",
-            "1",
             "-machine",
             "sbsa-ref",
         )
I was wondering why avocado tests passed with firmware which crashes
when anyone else is using it.

Turned out that amount of cores matters. Have to find out why still.

Signed-off-by: Marcin Juszkiewicz <marcin.juszkiewicz@linaro.org>
---
 tests/avocado/machine_aarch64_sbsaref.py | 2 --
 1 file changed, 2 deletions(-)
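To make the effect of the two deleted lines concrete, here is a minimal standalone sketch of how the argument list changes. The helper `build_sbsa_ref_args` and its `smp` parameter are hypothetical and are not part of the avocado test, which passes these strings directly; only the option strings themselves are taken from the diff.

```python
from typing import List, Optional


def build_sbsa_ref_args(fs0_path: str, fs1_path: str,
                        smp: Optional[int] = None) -> List[str]:
    """Build a QEMU argument list like the one in the test (hypothetical helper).

    When `smp` is None, no "-smp" option is emitted, so QEMU falls back to
    the default CPU count of the sbsa-ref machine.
    """
    args = [
        "-drive", f"if=pflash,file={fs0_path},format=raw",
        "-drive", f"if=pflash,file={fs1_path},format=raw",
    ]
    if smp is not None:
        # Behaviour before the patch: the test pinned this to 1.
        args += ["-smp", str(smp)]
    args += ["-machine", "sbsa-ref"]
    return args


# Before the patch (single core) and after it (machine default).
before = build_sbsa_ref_args("flash0.fd", "flash1.fd", smp=1)
after = build_sbsa_ref_args("flash0.fd", "flash1.fd")
```

With `smp=None` the list matches what the patched test passes, which per the thread lets the test exercise the SMP boot path once the firmware has been updated.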