diff mbox series

[v3] hw/arm/virt: Avoid unexpected warning from Linux guest on host with Fujitsu CPUs

Message ID 20240612020506.307793-1-zhenyzha@redhat.com
State New
Headers show
Series [v3] hw/arm/virt: Avoid unexpected warning from Linux guest on host with Fujitsu CPUs | expand

Commit Message

Zhenyu Zhang June 12, 2024, 2:05 a.m. UTC
Multiple warning messages and corresponding backtraces are observed when Linux
guest is booted on the host with Fujitsu CPUs. One of them is shown as below.

[    0.032443] ------------[ cut here ]------------
[    0.032446] uart-pl011 9000000.pl011: ARCH_DMA_MINALIGN smaller than
CTR_EL0.CWG (128 < 256)
[    0.032454] WARNING: CPU: 0 PID: 1 at arch/arm64/mm/dma-mapping.c:54
arch_setup_dma_ops+0xbc/0xcc
[    0.032470] Modules linked in:
[    0.032475] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.14.0-452.el9.aarch64
[    0.032481] Hardware name: linux,dummy-virt (DT)
[    0.032484] pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[    0.032490] pc : arch_setup_dma_ops+0xbc/0xcc
[    0.032496] lr : arch_setup_dma_ops+0xbc/0xcc
[    0.032501] sp : ffff80008003b860
[    0.032503] x29: ffff80008003b860 x28: 0000000000000000 x27: ffffaae4b949049c
[    0.032510] x26: 0000000000000000 x25: 0000000000000000 x24: 0000000000000000
[    0.032517] x23: 0000000000000100 x22: 0000000000000000 x21: 0000000000000000
[    0.032523] x20: 0000000100000000 x19: ffff2f06c02ea400 x18: ffffffffffffffff
[    0.032529] x17: 00000000208a5f76 x16: 000000006589dbcb x15: ffffaae4ba071c89
[    0.032535] x14: 0000000000000000 x13: ffffaae4ba071c84 x12: 455f525443206e61
[    0.032541] x11: 68742072656c6c61 x10: 0000000000000029 x9 : ffffaae4b7d21da4
[    0.032547] x8 : 0000000000000029 x7 : 4c414e494d5f414d x6 : 0000000000000029
[    0.032553] x5 : 000000000000000f x4 : ffffaae4b9617a00 x3 : 0000000000000001
[    0.032558] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff2f06c029be40
[    0.032564] Call trace:
[    0.032566]  arch_setup_dma_ops+0xbc/0xcc
[    0.032572]  of_dma_configure_id+0x138/0x300
[    0.032591]  amba_dma_configure+0x34/0xc0
[    0.032600]  really_probe+0x78/0x3dc
[    0.032614]  __driver_probe_device+0x108/0x160
[    0.032619]  driver_probe_device+0x44/0x114
[    0.032624]  __device_attach_driver+0xb8/0x14c
[    0.032629]  bus_for_each_drv+0x88/0xe4
[    0.032634]  __device_attach+0xb0/0x1e0
[    0.032638]  device_initial_probe+0x18/0x20
[    0.032643]  bus_probe_device+0xa8/0xb0
[    0.032648]  device_add+0x4b4/0x6c0
[    0.032652]  amba_device_try_add.part.0+0x48/0x360
[    0.032657]  amba_device_add+0x104/0x144
[    0.032662]  of_amba_device_create.isra.0+0x100/0x1c4
[    0.032666]  of_platform_bus_create+0x294/0x35c
[    0.032669]  of_platform_populate+0x5c/0x150
[    0.032672]  of_platform_default_populate_init+0xd0/0xec
[    0.032697]  do_one_initcall+0x4c/0x2e0
[    0.032701]  do_initcalls+0x100/0x13c
[    0.032707]  kernel_init_freeable+0x1c8/0x21c
[    0.032712]  kernel_init+0x28/0x140
[    0.032731]  ret_from_fork+0x10/0x20
[    0.032735] ---[ end trace 0000000000000000 ]---

In Linux, a check is applied to every device which is exposed through
device-tree node. The warning message is raised when the device isn't
DMA coherent and the cache line size is larger than ARCH_DMA_MINALIGN
(128 bytes). The cache line is sorted from CTR_EL0[CWG], which corresponds
to 256 bytes on the guest CPUs. The DMA coherent capability is claimed
through 'dma-coherent' in their device-tree nodes or parent nodes.

Fix the issue by adding 'dma-coherent' property to the device-tree root
node, meaning all devices are capable of DMA coherent by default.

Signed-off-by: Zhenyu Zhang <zhenyzha@redhat.com>
---
v3: Add comments explaining why we add 'dma-coherent' property (Peter)
---
 hw/arm/virt.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

Comments

Gavin Shan June 12, 2024, 2:25 a.m. UTC | #1
On 6/12/24 12:05, Zhenyu Zhang wrote:
> Multiple warning messages and corresponding backtraces are observed when Linux
> guest is booted on the host with Fujitsu CPUs. One of them is shown as below.
> 
> [    0.032443] ------------[ cut here ]------------
> [    0.032446] uart-pl011 9000000.pl011: ARCH_DMA_MINALIGN smaller than
> CTR_EL0.CWG (128 < 256)
> [    0.032454] WARNING: CPU: 0 PID: 1 at arch/arm64/mm/dma-mapping.c:54
> arch_setup_dma_ops+0xbc/0xcc
> [    0.032470] Modules linked in:
> [    0.032475] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.14.0-452.el9.aarch64
> [    0.032481] Hardware name: linux,dummy-virt (DT)
> [    0.032484] pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> [    0.032490] pc : arch_setup_dma_ops+0xbc/0xcc
> [    0.032496] lr : arch_setup_dma_ops+0xbc/0xcc
> [    0.032501] sp : ffff80008003b860
> [    0.032503] x29: ffff80008003b860 x28: 0000000000000000 x27: ffffaae4b949049c
> [    0.032510] x26: 0000000000000000 x25: 0000000000000000 x24: 0000000000000000
> [    0.032517] x23: 0000000000000100 x22: 0000000000000000 x21: 0000000000000000
> [    0.032523] x20: 0000000100000000 x19: ffff2f06c02ea400 x18: ffffffffffffffff
> [    0.032529] x17: 00000000208a5f76 x16: 000000006589dbcb x15: ffffaae4ba071c89
> [    0.032535] x14: 0000000000000000 x13: ffffaae4ba071c84 x12: 455f525443206e61
> [    0.032541] x11: 68742072656c6c61 x10: 0000000000000029 x9 : ffffaae4b7d21da4
> [    0.032547] x8 : 0000000000000029 x7 : 4c414e494d5f414d x6 : 0000000000000029
> [    0.032553] x5 : 000000000000000f x4 : ffffaae4b9617a00 x3 : 0000000000000001
> [    0.032558] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff2f06c029be40
> [    0.032564] Call trace:
> [    0.032566]  arch_setup_dma_ops+0xbc/0xcc
> [    0.032572]  of_dma_configure_id+0x138/0x300
> [    0.032591]  amba_dma_configure+0x34/0xc0
> [    0.032600]  really_probe+0x78/0x3dc
> [    0.032614]  __driver_probe_device+0x108/0x160
> [    0.032619]  driver_probe_device+0x44/0x114
> [    0.032624]  __device_attach_driver+0xb8/0x14c
> [    0.032629]  bus_for_each_drv+0x88/0xe4
> [    0.032634]  __device_attach+0xb0/0x1e0
> [    0.032638]  device_initial_probe+0x18/0x20
> [    0.032643]  bus_probe_device+0xa8/0xb0
> [    0.032648]  device_add+0x4b4/0x6c0
> [    0.032652]  amba_device_try_add.part.0+0x48/0x360
> [    0.032657]  amba_device_add+0x104/0x144
> [    0.032662]  of_amba_device_create.isra.0+0x100/0x1c4
> [    0.032666]  of_platform_bus_create+0x294/0x35c
> [    0.032669]  of_platform_populate+0x5c/0x150
> [    0.032672]  of_platform_default_populate_init+0xd0/0xec
> [    0.032697]  do_one_initcall+0x4c/0x2e0
> [    0.032701]  do_initcalls+0x100/0x13c
> [    0.032707]  kernel_init_freeable+0x1c8/0x21c
> [    0.032712]  kernel_init+0x28/0x140
> [    0.032731]  ret_from_fork+0x10/0x20
> [    0.032735] ---[ end trace 0000000000000000 ]---
> 
> In Linux, a check is applied to every device which is exposed through
> device-tree node. The warning message is raised when the device isn't
> DMA coherent and the cache line size is larger than ARCH_DMA_MINALIGN
> (128 bytes). The cache line is sorted from CTR_EL0[CWG], which corresponds
> to 256 bytes on the guest CPUs. The DMA coherent capability is claimed
> through 'dma-coherent' in their device-tree nodes or parent nodes.
> 
> Fix the issue by adding 'dma-coherent' property to the device-tree root
> node, meaning all devices are capable of DMA coherent by default.
> 
> Signed-off-by: Zhenyu Zhang <zhenyzha@redhat.com>
> ---
> v3: Add comments explaining why we add 'dma-coherent' property (Peter)
> ---
>   hw/arm/virt.c | 11 +++++++++++
>   1 file changed, 11 insertions(+)
> 

Reviewed-by: Gavin Shan <gshan@redhat.com>
Donald Dutile June 12, 2024, 3:04 a.m. UTC | #2
On 6/11/24 10:05 PM, Zhenyu Zhang wrote:
> Multiple warning messages and corresponding backtraces are observed when Linux
> guest is booted on the host with Fujitsu CPUs. One of them is shown as below.
> 
> [    0.032443] ------------[ cut here ]------------
> [    0.032446] uart-pl011 9000000.pl011: ARCH_DMA_MINALIGN smaller than
> CTR_EL0.CWG (128 < 256)
> [    0.032454] WARNING: CPU: 0 PID: 1 at arch/arm64/mm/dma-mapping.c:54
> arch_setup_dma_ops+0xbc/0xcc
> [    0.032470] Modules linked in:
> [    0.032475] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.14.0-452.el9.aarch64
> [    0.032481] Hardware name: linux,dummy-virt (DT)
> [    0.032484] pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> [    0.032490] pc : arch_setup_dma_ops+0xbc/0xcc
> [    0.032496] lr : arch_setup_dma_ops+0xbc/0xcc
> [    0.032501] sp : ffff80008003b860
> [    0.032503] x29: ffff80008003b860 x28: 0000000000000000 x27: ffffaae4b949049c
> [    0.032510] x26: 0000000000000000 x25: 0000000000000000 x24: 0000000000000000
> [    0.032517] x23: 0000000000000100 x22: 0000000000000000 x21: 0000000000000000
> [    0.032523] x20: 0000000100000000 x19: ffff2f06c02ea400 x18: ffffffffffffffff
> [    0.032529] x17: 00000000208a5f76 x16: 000000006589dbcb x15: ffffaae4ba071c89
> [    0.032535] x14: 0000000000000000 x13: ffffaae4ba071c84 x12: 455f525443206e61
> [    0.032541] x11: 68742072656c6c61 x10: 0000000000000029 x9 : ffffaae4b7d21da4
> [    0.032547] x8 : 0000000000000029 x7 : 4c414e494d5f414d x6 : 0000000000000029
> [    0.032553] x5 : 000000000000000f x4 : ffffaae4b9617a00 x3 : 0000000000000001
> [    0.032558] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff2f06c029be40
> [    0.032564] Call trace:
> [    0.032566]  arch_setup_dma_ops+0xbc/0xcc
> [    0.032572]  of_dma_configure_id+0x138/0x300
> [    0.032591]  amba_dma_configure+0x34/0xc0
> [    0.032600]  really_probe+0x78/0x3dc
> [    0.032614]  __driver_probe_device+0x108/0x160
> [    0.032619]  driver_probe_device+0x44/0x114
> [    0.032624]  __device_attach_driver+0xb8/0x14c
> [    0.032629]  bus_for_each_drv+0x88/0xe4
> [    0.032634]  __device_attach+0xb0/0x1e0
> [    0.032638]  device_initial_probe+0x18/0x20
> [    0.032643]  bus_probe_device+0xa8/0xb0
> [    0.032648]  device_add+0x4b4/0x6c0
> [    0.032652]  amba_device_try_add.part.0+0x48/0x360
> [    0.032657]  amba_device_add+0x104/0x144
> [    0.032662]  of_amba_device_create.isra.0+0x100/0x1c4
> [    0.032666]  of_platform_bus_create+0x294/0x35c
> [    0.032669]  of_platform_populate+0x5c/0x150
> [    0.032672]  of_platform_default_populate_init+0xd0/0xec
> [    0.032697]  do_one_initcall+0x4c/0x2e0
> [    0.032701]  do_initcalls+0x100/0x13c
> [    0.032707]  kernel_init_freeable+0x1c8/0x21c
> [    0.032712]  kernel_init+0x28/0x140
> [    0.032731]  ret_from_fork+0x10/0x20
> [    0.032735] ---[ end trace 0000000000000000 ]---
> 
> In Linux, a check is applied to every device which is exposed through
> device-tree node. The warning message is raised when the device isn't
> DMA coherent and the cache line size is larger than ARCH_DMA_MINALIGN
> (128 bytes). The cache line is sorted from CTR_EL0[CWG], which corresponds
> to 256 bytes on the guest CPUs. The DMA coherent capability is claimed
> through 'dma-coherent' in their device-tree nodes or parent nodes.
> 
> Fix the issue by adding 'dma-coherent' property to the device-tree root
> node, meaning all devices are capable of DMA coherent by default.
> 
> Signed-off-by: Zhenyu Zhang <zhenyzha@redhat.com>
> ---
> v3: Add comments explaining why we add 'dma-coherent' property (Peter)
> ---
>   hw/arm/virt.c | 11 +++++++++++
>   1 file changed, 11 insertions(+)
> 
> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> index 3c93c0c0a6..3cefac6d43 100644
> --- a/hw/arm/virt.c
> +++ b/hw/arm/virt.c
> @@ -271,6 +271,17 @@ static void create_fdt(VirtMachineState *vms)
>       qemu_fdt_setprop_cell(fdt, "/", "#size-cells", 0x2);
>       qemu_fdt_setprop_string(fdt, "/", "model", "linux,dummy-virt");
>   
> +    /*
> +     * For QEMU, all DMA is coherent. Advertising this in the root node
> +     * has two benefits:
> +     *
> +     * - It avoids potential bugs where we forget to mark a DMA
> +     *   capable device as being dma-coherent
> +     * - It avoids spurious warnings from the Linux kernel about
> +     *   devices which can't do DMA at all
> +     */
> +    qemu_fdt_setprop(fdt, "/", "dma-coherent", NULL, 0);
> +
>       /* /chosen must exist for load_dtb to fill in necessary properties later */
>       qemu_fdt_add_subnode(fdt, "/chosen");
>       if (vms->dtb_randomness) {

+1 to Peter's suggested comment, otherwise, unless privy to this thread,
one would wonder how/why.

Reviewed-by: Donald Dutile <ddutile@redhat.com
Philippe Mathieu-Daudé June 12, 2024, 12:33 p.m. UTC | #3
Hi Zhenyu,

On 12/6/24 04:05, Zhenyu Zhang wrote:
> Multiple warning messages and corresponding backtraces are observed when Linux
> guest is booted on the host with Fujitsu CPUs. One of them is shown as below.
> 
> [    0.032443] ------------[ cut here ]------------
> [    0.032446] uart-pl011 9000000.pl011: ARCH_DMA_MINALIGN smaller than
> CTR_EL0.CWG (128 < 256)
> [    0.032454] WARNING: CPU: 0 PID: 1 at arch/arm64/mm/dma-mapping.c:54
> arch_setup_dma_ops+0xbc/0xcc
> [    0.032470] Modules linked in:
> [    0.032475] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.14.0-452.el9.aarch64
> [    0.032481] Hardware name: linux,dummy-virt (DT)
> [    0.032484] pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> [    0.032490] pc : arch_setup_dma_ops+0xbc/0xcc
> [    0.032496] lr : arch_setup_dma_ops+0xbc/0xcc
> [    0.032501] sp : ffff80008003b860
> [    0.032503] x29: ffff80008003b860 x28: 0000000000000000 x27: ffffaae4b949049c
> [    0.032510] x26: 0000000000000000 x25: 0000000000000000 x24: 0000000000000000
> [    0.032517] x23: 0000000000000100 x22: 0000000000000000 x21: 0000000000000000
> [    0.032523] x20: 0000000100000000 x19: ffff2f06c02ea400 x18: ffffffffffffffff
> [    0.032529] x17: 00000000208a5f76 x16: 000000006589dbcb x15: ffffaae4ba071c89
> [    0.032535] x14: 0000000000000000 x13: ffffaae4ba071c84 x12: 455f525443206e61
> [    0.032541] x11: 68742072656c6c61 x10: 0000000000000029 x9 : ffffaae4b7d21da4
> [    0.032547] x8 : 0000000000000029 x7 : 4c414e494d5f414d x6 : 0000000000000029
> [    0.032553] x5 : 000000000000000f x4 : ffffaae4b9617a00 x3 : 0000000000000001
> [    0.032558] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff2f06c029be40
> [    0.032564] Call trace:
> [    0.032566]  arch_setup_dma_ops+0xbc/0xcc
> [    0.032572]  of_dma_configure_id+0x138/0x300
> [    0.032591]  amba_dma_configure+0x34/0xc0
> [    0.032600]  really_probe+0x78/0x3dc
> [    0.032614]  __driver_probe_device+0x108/0x160
> [    0.032619]  driver_probe_device+0x44/0x114
> [    0.032624]  __device_attach_driver+0xb8/0x14c
> [    0.032629]  bus_for_each_drv+0x88/0xe4
> [    0.032634]  __device_attach+0xb0/0x1e0
> [    0.032638]  device_initial_probe+0x18/0x20
> [    0.032643]  bus_probe_device+0xa8/0xb0
> [    0.032648]  device_add+0x4b4/0x6c0
> [    0.032652]  amba_device_try_add.part.0+0x48/0x360
> [    0.032657]  amba_device_add+0x104/0x144
> [    0.032662]  of_amba_device_create.isra.0+0x100/0x1c4
> [    0.032666]  of_platform_bus_create+0x294/0x35c
> [    0.032669]  of_platform_populate+0x5c/0x150
> [    0.032672]  of_platform_default_populate_init+0xd0/0xec
> [    0.032697]  do_one_initcall+0x4c/0x2e0
> [    0.032701]  do_initcalls+0x100/0x13c
> [    0.032707]  kernel_init_freeable+0x1c8/0x21c
> [    0.032712]  kernel_init+0x28/0x140
> [    0.032731]  ret_from_fork+0x10/0x20
> [    0.032735] ---[ end trace 0000000000000000 ]---
> 
> In Linux, a check is applied to every device which is exposed through
> device-tree node. The warning message is raised when the device isn't
> DMA coherent and the cache line size is larger than ARCH_DMA_MINALIGN
> (128 bytes). The cache line is sorted from CTR_EL0[CWG], which corresponds
> to 256 bytes on the guest CPUs. The DMA coherent capability is claimed
> through 'dma-coherent' in their device-tree nodes or parent nodes.
> 
> Fix the issue by adding 'dma-coherent' property to the device-tree root
> node, meaning all devices are capable of DMA coherent by default.
> 
> Signed-off-by: Zhenyu Zhang <zhenyzha@redhat.com>
> ---
> v3: Add comments explaining why we add 'dma-coherent' property (Peter)
> ---
>   hw/arm/virt.c | 11 +++++++++++
>   1 file changed, 11 insertions(+)
> 
> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> index 3c93c0c0a6..3cefac6d43 100644
> --- a/hw/arm/virt.c
> +++ b/hw/arm/virt.c
> @@ -271,6 +271,17 @@ static void create_fdt(VirtMachineState *vms)
>       qemu_fdt_setprop_cell(fdt, "/", "#size-cells", 0x2);
>       qemu_fdt_setprop_string(fdt, "/", "model", "linux,dummy-virt");
>   
> +    /*
> +     * For QEMU, all DMA is coherent. Advertising this in the root node
> +     * has two benefits:
> +     *
> +     * - It avoids potential bugs where we forget to mark a DMA
> +     *   capable device as being dma-coherent
> +     * - It avoids spurious warnings from the Linux kernel about
> +     *   devices which can't do DMA at all
> +     */
> +    qemu_fdt_setprop(fdt, "/", "dma-coherent", NULL, 0);

OK, but why restrict that to the Aarch64 virt machine?
Shouldn't advertise this generically in create_device_tree()?
Or otherwise at least in the other virt machines?

>       /* /chosen must exist for load_dtb to fill in necessary properties later */
>       qemu_fdt_add_subnode(fdt, "/chosen");
>       if (vms->dtb_randomness) {
Peter Maydell June 12, 2024, 12:48 p.m. UTC | #4
On Wed, 12 Jun 2024 at 13:33, Philippe Mathieu-Daudé <philmd@linaro.org> wrote:
>
> Hi Zhenyu,
>
> On 12/6/24 04:05, Zhenyu Zhang wrote:
> > diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> > index 3c93c0c0a6..3cefac6d43 100644
> > --- a/hw/arm/virt.c
> > +++ b/hw/arm/virt.c
> > @@ -271,6 +271,17 @@ static void create_fdt(VirtMachineState *vms)
> >       qemu_fdt_setprop_cell(fdt, "/", "#size-cells", 0x2);
> >       qemu_fdt_setprop_string(fdt, "/", "model", "linux,dummy-virt");
> >
> > +    /*
> > +     * For QEMU, all DMA is coherent. Advertising this in the root node
> > +     * has two benefits:
> > +     *
> > +     * - It avoids potential bugs where we forget to mark a DMA
> > +     *   capable device as being dma-coherent
> > +     * - It avoids spurious warnings from the Linux kernel about
> > +     *   devices which can't do DMA at all
> > +     */
> > +    qemu_fdt_setprop(fdt, "/", "dma-coherent", NULL, 0);
>
> OK, but why restrict that to the Aarch64 virt machine?
> Shouldn't advertise this generically in create_device_tree()?
> Or otherwise at least in the other virt machines?

create_device_tree() creates an empty device tree, not one
with stuff in it. It seems reasonable to me for this property
on the root to be set in the same place we set other properties
of the root node.

thanks
-- PMM
Philippe Mathieu-Daudé June 12, 2024, 12:50 p.m. UTC | #5
On 12/6/24 14:48, Peter Maydell wrote:
> On Wed, 12 Jun 2024 at 13:33, Philippe Mathieu-Daudé <philmd@linaro.org> wrote:
>>
>> Hi Zhenyu,
>>
>> On 12/6/24 04:05, Zhenyu Zhang wrote:
>>> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
>>> index 3c93c0c0a6..3cefac6d43 100644
>>> --- a/hw/arm/virt.c
>>> +++ b/hw/arm/virt.c
>>> @@ -271,6 +271,17 @@ static void create_fdt(VirtMachineState *vms)
>>>        qemu_fdt_setprop_cell(fdt, "/", "#size-cells", 0x2);
>>>        qemu_fdt_setprop_string(fdt, "/", "model", "linux,dummy-virt");
>>>
>>> +    /*
>>> +     * For QEMU, all DMA is coherent. Advertising this in the root node
>>> +     * has two benefits:
>>> +     *
>>> +     * - It avoids potential bugs where we forget to mark a DMA
>>> +     *   capable device as being dma-coherent
>>> +     * - It avoids spurious warnings from the Linux kernel about
>>> +     *   devices which can't do DMA at all
>>> +     */
>>> +    qemu_fdt_setprop(fdt, "/", "dma-coherent", NULL, 0);
>>
>> OK, but why restrict that to the Aarch64 virt machine?
>> Shouldn't advertise this generically in create_device_tree()?
>> Or otherwise at least in the other virt machines?
> 
> create_device_tree() creates an empty device tree, not one
> with stuff in it. It seems reasonable to me for this property
> on the root to be set in the same place we set other properties
> of the root node.

OK. Still the question about other virt machines remains
unanswered :)
Robin Murphy June 12, 2024, 5:48 p.m. UTC | #6
On 2024-06-12 1:50 pm, Philippe Mathieu-Daudé wrote:
> On 12/6/24 14:48, Peter Maydell wrote:
>> On Wed, 12 Jun 2024 at 13:33, Philippe Mathieu-Daudé 
>> <philmd@linaro.org> wrote:
>>>
>>> Hi Zhenyu,
>>>
>>> On 12/6/24 04:05, Zhenyu Zhang wrote:
>>>> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
>>>> index 3c93c0c0a6..3cefac6d43 100644
>>>> --- a/hw/arm/virt.c
>>>> +++ b/hw/arm/virt.c
>>>> @@ -271,6 +271,17 @@ static void create_fdt(VirtMachineState *vms)
>>>>        qemu_fdt_setprop_cell(fdt, "/", "#size-cells", 0x2);
>>>>        qemu_fdt_setprop_string(fdt, "/", "model", "linux,dummy-virt");
>>>>
>>>> +    /*
>>>> +     * For QEMU, all DMA is coherent. Advertising this in the root 
>>>> node
>>>> +     * has two benefits:
>>>> +     *
>>>> +     * - It avoids potential bugs where we forget to mark a DMA
>>>> +     *   capable device as being dma-coherent
>>>> +     * - It avoids spurious warnings from the Linux kernel about
>>>> +     *   devices which can't do DMA at all
>>>> +     */
>>>> +    qemu_fdt_setprop(fdt, "/", "dma-coherent", NULL, 0);
>>>
>>> OK, but why restrict that to the Aarch64 virt machine?
>>> Shouldn't advertise this generically in create_device_tree()?
>>> Or otherwise at least in the other virt machines?
>>
>> create_device_tree() creates an empty device tree, not one
>> with stuff in it. It seems reasonable to me for this property
>> on the root to be set in the same place we set other properties
>> of the root node.
> 
> OK. Still the question about other virt machines remains
> unanswered :)

 From the DT consumer point of view, the interpretation and assumptions 
around coherency *are* generally architecture- or platform-specific. For 
instance on RISC-V, many platforms want to assume coherency by default 
(and potentially use "dma-noncoherent" to mark individual devices that 
aren't), while others may still want to do the opposite and use 
"dma-coherent" in the same manner as Arm and AArch64. Neither property 
existed back in ePAPR, so typical PowerPC systems wouldn't even be 
looking and will just make their own assumptions by other means.

Thanks,
Robin.
Zhenyu Zhang June 17, 2024, 5:26 a.m. UTC | #7
On Thu, Jun 13, 2024 at 1:48 AM Robin Murphy <robin.murphy@arm.com> wrote:
>
> On 2024-06-12 1:50 pm, Philippe Mathieu-Daudé wrote:
> > On 12/6/24 14:48, Peter Maydell wrote:
> >> On Wed, 12 Jun 2024 at 13:33, Philippe Mathieu-Daudé
> >> <philmd@linaro.org> wrote:
> >>>
> >>> Hi Zhenyu,
> >>>
Hello Philippe,

> >>> On 12/6/24 04:05, Zhenyu Zhang wrote:
> >>>> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> >>>> index 3c93c0c0a6..3cefac6d43 100644
> >>>> --- a/hw/arm/virt.c
> >>>> +++ b/hw/arm/virt.c
> >>>> @@ -271,6 +271,17 @@ static void create_fdt(VirtMachineState *vms)
> >>>>        qemu_fdt_setprop_cell(fdt, "/", "#size-cells", 0x2);
> >>>>        qemu_fdt_setprop_string(fdt, "/", "model", "linux,dummy-virt");
> >>>>
> >>>> +    /*
> >>>> +     * For QEMU, all DMA is coherent. Advertising this in the root
> >>>> node
> >>>> +     * has two benefits:
> >>>> +     *
> >>>> +     * - It avoids potential bugs where we forget to mark a DMA
> >>>> +     *   capable device as being dma-coherent
> >>>> +     * - It avoids spurious warnings from the Linux kernel about
> >>>> +     *   devices which can't do DMA at all
> >>>> +     */
> >>>> +    qemu_fdt_setprop(fdt, "/", "dma-coherent", NULL, 0);
> >>>
> >>> OK, but why restrict that to the Aarch64 virt machine?
> >>> Shouldn't advertise this generically in create_device_tree()?
> >>> Or otherwise at least in the other virt machines?
> >>
> >> create_device_tree() creates an empty device tree, not one
> >> with stuff in it. It seems reasonable to me for this property
> >> on the root to be set in the same place we set other properties
> >> of the root node.
> >
> > OK. Still the question about other virt machines remains
> > unanswered :)
>
>  From the DT consumer point of view, the interpretation and assumptions
> around coherency *are* generally architecture- or platform-specific. For
> instance on RISC-V, many platforms want to assume coherency by default
> (and potentially use "dma-noncoherent" to mark individual devices that
> aren't), while others may still want to do the opposite and use
> "dma-coherent" in the same manner as Arm and AArch64. Neither property
> existed back in ePAPR, so typical PowerPC systems wouldn't even be
> looking and will just make their own assumptions by other means.
>
As Robin's comment says, each platform wants to assume coherency
by default may be different. Adding it to all virt machines may
introduce new risks. Currently, the issue is only valid on Fujitsu CPUs
where the cache line size is 256 bytes, meaning the combination of
kvm+virt-platform is needed to trigger the warning. So I'd be inclined
to add this "dma-coherent" property for the "virt" platform first
and advertise the property to other platforms if we hit the issue
on those platforms.

Thanks,
Zhenyu
Philippe Mathieu-Daudé June 17, 2024, 9:36 a.m. UTC | #8
On 17/6/24 07:26, Zhenyu Zhang wrote:
> On Thu, Jun 13, 2024 at 1:48 AM Robin Murphy <robin.murphy@arm.com> wrote:
>>
>> On 2024-06-12 1:50 pm, Philippe Mathieu-Daudé wrote:
>>> On 12/6/24 14:48, Peter Maydell wrote:
>>>> On Wed, 12 Jun 2024 at 13:33, Philippe Mathieu-Daudé
>>>> <philmd@linaro.org> wrote:
>>>>>
>>>>> Hi Zhenyu,
>>>>>
> Hello Philippe,
> 
>>>>> On 12/6/24 04:05, Zhenyu Zhang wrote:
>>>>>> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
>>>>>> index 3c93c0c0a6..3cefac6d43 100644
>>>>>> --- a/hw/arm/virt.c
>>>>>> +++ b/hw/arm/virt.c
>>>>>> @@ -271,6 +271,17 @@ static void create_fdt(VirtMachineState *vms)
>>>>>>         qemu_fdt_setprop_cell(fdt, "/", "#size-cells", 0x2);
>>>>>>         qemu_fdt_setprop_string(fdt, "/", "model", "linux,dummy-virt");
>>>>>>
>>>>>> +    /*
>>>>>> +     * For QEMU, all DMA is coherent. Advertising this in the root
>>>>>> node
>>>>>> +     * has two benefits:
>>>>>> +     *
>>>>>> +     * - It avoids potential bugs where we forget to mark a DMA
>>>>>> +     *   capable device as being dma-coherent
>>>>>> +     * - It avoids spurious warnings from the Linux kernel about
>>>>>> +     *   devices which can't do DMA at all
>>>>>> +     */
>>>>>> +    qemu_fdt_setprop(fdt, "/", "dma-coherent", NULL, 0);
>>>>>
>>>>> OK, but why restrict that to the Aarch64 virt machine?
>>>>> Shouldn't advertise this generically in create_device_tree()?
>>>>> Or otherwise at least in the other virt machines?
>>>>
>>>> create_device_tree() creates an empty device tree, not one
>>>> with stuff in it. It seems reasonable to me for this property
>>>> on the root to be set in the same place we set other properties
>>>> of the root node.
>>>
>>> OK. Still the question about other virt machines remains
>>> unanswered :)
>>
>>   From the DT consumer point of view, the interpretation and assumptions
>> around coherency *are* generally architecture- or platform-specific. For
>> instance on RISC-V, many platforms want to assume coherency by default
>> (and potentially use "dma-noncoherent" to mark individual devices that
>> aren't), while others may still want to do the opposite and use
>> "dma-coherent" in the same manner as Arm and AArch64. Neither property
>> existed back in ePAPR, so typical PowerPC systems wouldn't even be
>> looking and will just make their own assumptions by other means.

TIL :)

> As Robin's comment says, each platform wants to assume coherency
> by default may be different. Adding it to all virt machines may
> introduce new risks. Currently, the issue is only valid on Fujitsu CPUs
> where the cache line size is 256 bytes, meaning the combination of
> kvm+virt-platform is needed to trigger the warning. So I'd be inclined
> to add this "dma-coherent" property for the "virt" platform first
> and advertise the property to other platforms if we hit the issue
> on those platforms.

Thanks Robin & Zhenyu,

Phil.
Peter Maydell June 21, 2024, 1:01 p.m. UTC | #9
On Wed, 12 Jun 2024 at 03:05, Zhenyu Zhang <zhenyzha@redhat.com> wrote:
>
> Multiple warning messages and corresponding backtraces are observed when Linux
> guest is booted on the host with Fujitsu CPUs. One of them is shown as below.
>
> [    0.032443] ------------[ cut here ]------------
> [    0.032446] uart-pl011 9000000.pl011: ARCH_DMA_MINALIGN smaller than
> CTR_EL0.CWG (128 < 256)
> [    0.032454] WARNING: CPU: 0 PID: 1 at arch/arm64/mm/dma-mapping.c:54
> arch_setup_dma_ops+0xbc/0xcc
> [    0.032470] Modules linked in:
> [    0.032475] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.14.0-452.el9.aarch64
> [    0.032481] Hardware name: linux,dummy-virt (DT)
> [    0.032484] pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> [    0.032490] pc : arch_setup_dma_ops+0xbc/0xcc
> [    0.032496] lr : arch_setup_dma_ops+0xbc/0xcc
> [    0.032501] sp : ffff80008003b860
> [    0.032503] x29: ffff80008003b860 x28: 0000000000000000 x27: ffffaae4b949049c
> [    0.032510] x26: 0000000000000000 x25: 0000000000000000 x24: 0000000000000000
> [    0.032517] x23: 0000000000000100 x22: 0000000000000000 x21: 0000000000000000
> [    0.032523] x20: 0000000100000000 x19: ffff2f06c02ea400 x18: ffffffffffffffff
> [    0.032529] x17: 00000000208a5f76 x16: 000000006589dbcb x15: ffffaae4ba071c89
> [    0.032535] x14: 0000000000000000 x13: ffffaae4ba071c84 x12: 455f525443206e61
> [    0.032541] x11: 68742072656c6c61 x10: 0000000000000029 x9 : ffffaae4b7d21da4
> [    0.032547] x8 : 0000000000000029 x7 : 4c414e494d5f414d x6 : 0000000000000029
> [    0.032553] x5 : 000000000000000f x4 : ffffaae4b9617a00 x3 : 0000000000000001
> [    0.032558] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff2f06c029be40
> [    0.032564] Call trace:
> [    0.032566]  arch_setup_dma_ops+0xbc/0xcc
> [    0.032572]  of_dma_configure_id+0x138/0x300
> [    0.032591]  amba_dma_configure+0x34/0xc0
> [    0.032600]  really_probe+0x78/0x3dc
> [    0.032614]  __driver_probe_device+0x108/0x160
> [    0.032619]  driver_probe_device+0x44/0x114
> [    0.032624]  __device_attach_driver+0xb8/0x14c
> [    0.032629]  bus_for_each_drv+0x88/0xe4
> [    0.032634]  __device_attach+0xb0/0x1e0
> [    0.032638]  device_initial_probe+0x18/0x20
> [    0.032643]  bus_probe_device+0xa8/0xb0
> [    0.032648]  device_add+0x4b4/0x6c0
> [    0.032652]  amba_device_try_add.part.0+0x48/0x360
> [    0.032657]  amba_device_add+0x104/0x144
> [    0.032662]  of_amba_device_create.isra.0+0x100/0x1c4
> [    0.032666]  of_platform_bus_create+0x294/0x35c
> [    0.032669]  of_platform_populate+0x5c/0x150
> [    0.032672]  of_platform_default_populate_init+0xd0/0xec
> [    0.032697]  do_one_initcall+0x4c/0x2e0
> [    0.032701]  do_initcalls+0x100/0x13c
> [    0.032707]  kernel_init_freeable+0x1c8/0x21c
> [    0.032712]  kernel_init+0x28/0x140
> [    0.032731]  ret_from_fork+0x10/0x20
> [    0.032735] ---[ end trace 0000000000000000 ]---
>
> In Linux, a check is applied to every device which is exposed through
> device-tree node. The warning message is raised when the device isn't
> DMA coherent and the cache line size is larger than ARCH_DMA_MINALIGN
> (128 bytes). The cache line is sorted from CTR_EL0[CWG], which corresponds
> to 256 bytes on the guest CPUs. The DMA coherent capability is claimed
> through 'dma-coherent' in their device-tree nodes or parent nodes.
>
> Fix the issue by adding 'dma-coherent' property to the device-tree root
> node, meaning all devices are capable of DMA coherent by default.
>
> Signed-off-by: Zhenyu Zhang <zhenyzha@redhat.com>



Applied to target-arm.next, thanks.

-- PMM
diff mbox series

Patch

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 3c93c0c0a6..3cefac6d43 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -271,6 +271,17 @@  static void create_fdt(VirtMachineState *vms)
     qemu_fdt_setprop_cell(fdt, "/", "#size-cells", 0x2);
     qemu_fdt_setprop_string(fdt, "/", "model", "linux,dummy-virt");
 
+    /*
+     * For QEMU, all DMA is coherent. Advertising this in the root node
+     * has two benefits:
+     *
+     * - It avoids potential bugs where we forget to mark a DMA
+     *   capable device as being dma-coherent
+     * - It avoids spurious warnings from the Linux kernel about
+     *   devices which can't do DMA at all
+     */
+    qemu_fdt_setprop(fdt, "/", "dma-coherent", NULL, 0);
+
     /* /chosen must exist for load_dtb to fill in necessary properties later */
     qemu_fdt_add_subnode(fdt, "/chosen");
     if (vms->dtb_randomness) {