mbox series

[v6,0/6] Allwinner H6 Mali GPU support

Message ID 20190521161102.29620-1-peron.clem@gmail.com
Headers show
Series Allwinner H6 Mali GPU support | expand

Message

Clément Péron May 21, 2019, 4:10 p.m. UTC
Hi,

The Allwinner H6 has a Mali-T720 MP2 which should be supported by
the new panfrost driver. This series fix two issues and introduce the
dt-bindings but a simple benchmark show that it's still NOT WORKING.

I'm pushing it in case someone want to continue the work.

This has been tested with Mesa3D 19.1.0-RC2 and a GPU bitness patch[1].

One patch is from Icenowy Zheng where I changed the order as required
by Rob Herring[2].

Thanks,
Clement

[1] https://gitlab.freedesktop.org/kszaq/mesa/tree/panfrost_64_32
[2] https://patchwork.kernel.org/patch/10699829/


[  345.204813] panfrost 1800000.gpu: mmu irq status=1
[  345.209617] panfrost 1800000.gpu: Unhandled Page fault in AS0 at VA
0x0000000002400400
[  345.209617] Reason: TODO
[  345.209617] raw fault status: 0x800002C1
[  345.209617] decoded fault status: SLAVE FAULT
[  345.209617] exception type 0xC1: TRANSLATION_FAULT_LEVEL1
[  345.209617] access type 0x2: READ
[  345.209617] source id 0x8000
[  345.729957] panfrost 1800000.gpu: gpu sched timeout, js=0,
status=0x8, head=0x2400400, tail=0x2400400, sched_job=000000009e204de9
[  346.055876] panfrost 1800000.gpu: mmu irq status=1
[  346.060680] panfrost 1800000.gpu: Unhandled Page fault in AS0 at VA
0x0000000002C00A00
[  346.060680] Reason: TODO
[  346.060680] raw fault status: 0x810002C1
[  346.060680] decoded fault status: SLAVE FAULT
[  346.060680] exception type 0xC1: TRANSLATION_FAULT_LEVEL1
[  346.060680] access type 0x2: READ
[  346.060680] source id 0x8100
[  346.561955] panfrost 1800000.gpu: gpu sched timeout, js=1,
status=0x8, head=0x2c00a00, tail=0x2c00a00, sched_job=00000000b55a9a85
[  346.573913] panfrost 1800000.gpu: mmu irq status=1
[  346.578707] panfrost 1800000.gpu: Unhandled Page fault in AS0 at VA
0x0000000002C00B80

Change in v5:
 - Remove fix indent

Changes in v4:
 - Add bus_clock probe
 - Fix sanity check in io-pgtable
 - Add vramp-delay
 - Merge all boards into one patch
 - Remove upstreamed Neil A. patch

Change in v3 (Thanks to Maxime Ripard):
 - Reauthor Icenowy for her path

Changes in v2 (Thanks to Maxime Ripard):
 - Drop GPU OPP Table
 - Add clocks and clock-names in required

Clément Péron (5):
  drm: panfrost: add optional bus_clock
  iommu: io-pgtable: fix sanity check for non 48-bit mali iommu
  dt-bindings: gpu: mali-midgard: Add H6 mali gpu compatible
  arm64: dts: allwinner: Add ARM Mali GPU node for H6
  arm64: dts: allwinner: Add mali GPU supply for H6 boards

Icenowy Zheng (1):
  dt-bindings: gpu: add bus clock for Mali Midgard GPUs

 .../bindings/gpu/arm,mali-midgard.txt         | 15 ++++++++++++-
 .../dts/allwinner/sun50i-h6-beelink-gs1.dts   |  6 +++++
 .../dts/allwinner/sun50i-h6-orangepi-3.dts    |  6 +++++
 .../dts/allwinner/sun50i-h6-orangepi.dtsi     |  6 +++++
 .../boot/dts/allwinner/sun50i-h6-pine-h64.dts |  6 +++++
 arch/arm64/boot/dts/allwinner/sun50i-h6.dtsi  | 14 ++++++++++++
 drivers/gpu/drm/panfrost/panfrost_device.c    | 22 +++++++++++++++++++
 drivers/gpu/drm/panfrost/panfrost_device.h    |  1 +
 drivers/iommu/io-pgtable-arm.c                |  2 +-
 9 files changed, 76 insertions(+), 2 deletions(-)

Comments

Rob Herring May 22, 2019, 7:27 p.m. UTC | #1
On Tue, May 21, 2019 at 11:11 AM Clément Péron <peron.clem@gmail.com> wrote:
>
> Hi,
>
> The Allwinner H6 has a Mali-T720 MP2 which should be supported by
> the new panfrost driver. This series fix two issues and introduce the
> dt-bindings but a simple benchmark show that it's still NOT WORKING.
>
> I'm pushing it in case someone want to continue the work.
>
> This has been tested with Mesa3D 19.1.0-RC2 and a GPU bitness patch[1].
>
> One patch is from Icenowy Zheng where I changed the order as required
> by Rob Herring[2].
>
> Thanks,
> Clement
>
> [1] https://gitlab.freedesktop.org/kszaq/mesa/tree/panfrost_64_32
> [2] https://patchwork.kernel.org/patch/10699829/
>
>
> [  345.204813] panfrost 1800000.gpu: mmu irq status=1
> [  345.209617] panfrost 1800000.gpu: Unhandled Page fault in AS0 at VA
> 0x0000000002400400
> [  345.209617] Reason: TODO
> [  345.209617] raw fault status: 0x800002C1
> [  345.209617] decoded fault status: SLAVE FAULT
> [  345.209617] exception type 0xC1: TRANSLATION_FAULT_LEVEL1
> [  345.209617] access type 0x2: READ
> [  345.209617] source id 0x8000
> [  345.729957] panfrost 1800000.gpu: gpu sched timeout, js=0,
> status=0x8, head=0x2400400, tail=0x2400400, sched_job=000000009e204de9
> [  346.055876] panfrost 1800000.gpu: mmu irq status=1
> [  346.060680] panfrost 1800000.gpu: Unhandled Page fault in AS0 at VA
> 0x0000000002C00A00
> [  346.060680] Reason: TODO
> [  346.060680] raw fault status: 0x810002C1
> [  346.060680] decoded fault status: SLAVE FAULT
> [  346.060680] exception type 0xC1: TRANSLATION_FAULT_LEVEL1
> [  346.060680] access type 0x2: READ
> [  346.060680] source id 0x8100
> [  346.561955] panfrost 1800000.gpu: gpu sched timeout, js=1,
> status=0x8, head=0x2c00a00, tail=0x2c00a00, sched_job=00000000b55a9a85
> [  346.573913] panfrost 1800000.gpu: mmu irq status=1
> [  346.578707] panfrost 1800000.gpu: Unhandled Page fault in AS0 at VA
> 0x0000000002C00B80
>
> Change in v5:
>  - Remove fix indent
>
> Changes in v4:
>  - Add bus_clock probe
>  - Fix sanity check in io-pgtable
>  - Add vramp-delay
>  - Merge all boards into one patch
>  - Remove upstreamed Neil A. patch
>
> Change in v3 (Thanks to Maxime Ripard):
>  - Reauthor Icenowy for her path
>
> Changes in v2 (Thanks to Maxime Ripard):
>  - Drop GPU OPP Table
>  - Add clocks and clock-names in required
>
> Clément Péron (5):
>   drm: panfrost: add optional bus_clock
>   iommu: io-pgtable: fix sanity check for non 48-bit mali iommu
>   dt-bindings: gpu: mali-midgard: Add H6 mali gpu compatible
>   arm64: dts: allwinner: Add ARM Mali GPU node for H6
>   arm64: dts: allwinner: Add mali GPU supply for H6 boards
>
> Icenowy Zheng (1):
>   dt-bindings: gpu: add bus clock for Mali Midgard GPUs

I've applied patches 1 and 3 to drm-misc. I was going to do patch 4
too, but it doesn't apply.

Patch 2 can go in via the iommu tree and the rest via the allwinner tree.

Rob
Clément Péron May 22, 2019, 7:41 p.m. UTC | #2
Hi Rob,

On Wed, 22 May 2019 at 21:27, Rob Herring <robh+dt@kernel.org> wrote:
>
> On Tue, May 21, 2019 at 11:11 AM Clément Péron <peron.clem@gmail.com> wrote:
> >
> > Hi,
> >
> > The Allwinner H6 has a Mali-T720 MP2 which should be supported by
> > the new panfrost driver. This series fix two issues and introduce the
> > dt-bindings but a simple benchmark show that it's still NOT WORKING.
> >
> > I'm pushing it in case someone want to continue the work.
> >
> > This has been tested with Mesa3D 19.1.0-RC2 and a GPU bitness patch[1].
> >
> > One patch is from Icenowy Zheng where I changed the order as required
> > by Rob Herring[2].
> >
> > Thanks,
> > Clement
> >
> > [1] https://gitlab.freedesktop.org/kszaq/mesa/tree/panfrost_64_32
> > [2] https://patchwork.kernel.org/patch/10699829/
> >
> >
> > [  345.204813] panfrost 1800000.gpu: mmu irq status=1
> > [  345.209617] panfrost 1800000.gpu: Unhandled Page fault in AS0 at VA
> > 0x0000000002400400
> > [  345.209617] Reason: TODO
> > [  345.209617] raw fault status: 0x800002C1
> > [  345.209617] decoded fault status: SLAVE FAULT
> > [  345.209617] exception type 0xC1: TRANSLATION_FAULT_LEVEL1
> > [  345.209617] access type 0x2: READ
> > [  345.209617] source id 0x8000
> > [  345.729957] panfrost 1800000.gpu: gpu sched timeout, js=0,
> > status=0x8, head=0x2400400, tail=0x2400400, sched_job=000000009e204de9
> > [  346.055876] panfrost 1800000.gpu: mmu irq status=1
> > [  346.060680] panfrost 1800000.gpu: Unhandled Page fault in AS0 at VA
> > 0x0000000002C00A00
> > [  346.060680] Reason: TODO
> > [  346.060680] raw fault status: 0x810002C1
> > [  346.060680] decoded fault status: SLAVE FAULT
> > [  346.060680] exception type 0xC1: TRANSLATION_FAULT_LEVEL1
> > [  346.060680] access type 0x2: READ
> > [  346.060680] source id 0x8100
> > [  346.561955] panfrost 1800000.gpu: gpu sched timeout, js=1,
> > status=0x8, head=0x2c00a00, tail=0x2c00a00, sched_job=00000000b55a9a85
> > [  346.573913] panfrost 1800000.gpu: mmu irq status=1
> > [  346.578707] panfrost 1800000.gpu: Unhandled Page fault in AS0 at VA
> > 0x0000000002C00B80
> >
> > Change in v5:
> >  - Remove fix indent
> >
> > Changes in v4:
> >  - Add bus_clock probe
> >  - Fix sanity check in io-pgtable
> >  - Add vramp-delay
> >  - Merge all boards into one patch
> >  - Remove upstreamed Neil A. patch
> >
> > Change in v3 (Thanks to Maxime Ripard):
> >  - Reauthor Icenowy for her path
> >
> > Changes in v2 (Thanks to Maxime Ripard):
> >  - Drop GPU OPP Table
> >  - Add clocks and clock-names in required
> >
> > Clément Péron (5):
> >   drm: panfrost: add optional bus_clock
> >   iommu: io-pgtable: fix sanity check for non 48-bit mali iommu
> >   dt-bindings: gpu: mali-midgard: Add H6 mali gpu compatible
> >   arm64: dts: allwinner: Add ARM Mali GPU node for H6
> >   arm64: dts: allwinner: Add mali GPU supply for H6 boards
> >
> > Icenowy Zheng (1):
> >   dt-bindings: gpu: add bus clock for Mali Midgard GPUs
>
> I've applied patches 1 and 3 to drm-misc. I was going to do patch 4
> too, but it doesn't apply.

Thanks,

I have tried to applied on drm-misc/for-linux-next but it doesn't apply too.
It looks like commit d5ff1adb3809e2f74a3b57cea2e57c8e37d693c4 is
missing on drm-misc ?
https://github.com/torvalds/linux/commit/d5ff1adb3809e2f74a3b57cea2e57c8e37d693c4#diff-c3172f5d421d492ff91d7bb44dd44917

Clément

>
> Patch 2 can go in via the iommu tree and the rest via the allwinner tree.
>
> Rob
Rob Herring May 22, 2019, 8:51 p.m. UTC | #3
On Wed, May 22, 2019 at 2:41 PM Clément Péron <peron.clem@gmail.com> wrote:
>
> Hi Rob,
>
> On Wed, 22 May 2019 at 21:27, Rob Herring <robh+dt@kernel.org> wrote:
> >
> > On Tue, May 21, 2019 at 11:11 AM Clément Péron <peron.clem@gmail.com> wrote:
> > >
> > > Hi,
> > >
> > > The Allwinner H6 has a Mali-T720 MP2 which should be supported by
> > > the new panfrost driver. This series fix two issues and introduce the
> > > dt-bindings but a simple benchmark show that it's still NOT WORKING.
> > >
> > > I'm pushing it in case someone want to continue the work.
> > >
> > > This has been tested with Mesa3D 19.1.0-RC2 and a GPU bitness patch[1].
> > >
> > > One patch is from Icenowy Zheng where I changed the order as required
> > > by Rob Herring[2].
> > >
> > > Thanks,
> > > Clement
> > >
> > > [1] https://gitlab.freedesktop.org/kszaq/mesa/tree/panfrost_64_32
> > > [2] https://patchwork.kernel.org/patch/10699829/
> > >
> > >
> > > [  345.204813] panfrost 1800000.gpu: mmu irq status=1
> > > [  345.209617] panfrost 1800000.gpu: Unhandled Page fault in AS0 at VA
> > > 0x0000000002400400
> > > [  345.209617] Reason: TODO
> > > [  345.209617] raw fault status: 0x800002C1
> > > [  345.209617] decoded fault status: SLAVE FAULT
> > > [  345.209617] exception type 0xC1: TRANSLATION_FAULT_LEVEL1
> > > [  345.209617] access type 0x2: READ
> > > [  345.209617] source id 0x8000
> > > [  345.729957] panfrost 1800000.gpu: gpu sched timeout, js=0,
> > > status=0x8, head=0x2400400, tail=0x2400400, sched_job=000000009e204de9
> > > [  346.055876] panfrost 1800000.gpu: mmu irq status=1
> > > [  346.060680] panfrost 1800000.gpu: Unhandled Page fault in AS0 at VA
> > > 0x0000000002C00A00
> > > [  346.060680] Reason: TODO
> > > [  346.060680] raw fault status: 0x810002C1
> > > [  346.060680] decoded fault status: SLAVE FAULT
> > > [  346.060680] exception type 0xC1: TRANSLATION_FAULT_LEVEL1
> > > [  346.060680] access type 0x2: READ
> > > [  346.060680] source id 0x8100
> > > [  346.561955] panfrost 1800000.gpu: gpu sched timeout, js=1,
> > > status=0x8, head=0x2c00a00, tail=0x2c00a00, sched_job=00000000b55a9a85
> > > [  346.573913] panfrost 1800000.gpu: mmu irq status=1
> > > [  346.578707] panfrost 1800000.gpu: Unhandled Page fault in AS0 at VA
> > > 0x0000000002C00B80
> > >
> > > Change in v5:
> > >  - Remove fix indent
> > >
> > > Changes in v4:
> > >  - Add bus_clock probe
> > >  - Fix sanity check in io-pgtable
> > >  - Add vramp-delay
> > >  - Merge all boards into one patch
> > >  - Remove upstreamed Neil A. patch
> > >
> > > Change in v3 (Thanks to Maxime Ripard):
> > >  - Reauthor Icenowy for her path
> > >
> > > Changes in v2 (Thanks to Maxime Ripard):
> > >  - Drop GPU OPP Table
> > >  - Add clocks and clock-names in required
> > >
> > > Clément Péron (5):
> > >   drm: panfrost: add optional bus_clock
> > >   iommu: io-pgtable: fix sanity check for non 48-bit mali iommu
> > >   dt-bindings: gpu: mali-midgard: Add H6 mali gpu compatible
> > >   arm64: dts: allwinner: Add ARM Mali GPU node for H6
> > >   arm64: dts: allwinner: Add mali GPU supply for H6 boards
> > >
> > > Icenowy Zheng (1):
> > >   dt-bindings: gpu: add bus clock for Mali Midgard GPUs
> >
> > I've applied patches 1 and 3 to drm-misc. I was going to do patch 4
> > too, but it doesn't apply.
>
> Thanks,
>
> I have tried to applied on drm-misc/for-linux-next but it doesn't apply too.
> It looks like commit d5ff1adb3809e2f74a3b57cea2e57c8e37d693c4 is
> missing on drm-misc ?
> https://github.com/torvalds/linux/commit/d5ff1adb3809e2f74a3b57cea2e57c8e37d693c4#diff-c3172f5d421d492ff91d7bb44dd44917

5.2-rc1 is merged in now and I've applied patch 4.

Rob
Robin Murphy May 24, 2019, 5:25 p.m. UTC | #4
On 21/05/2019 17:10, Clément Péron wrote:
> Hi,
> 
> The Allwinner H6 has a Mali-T720 MP2 which should be supported by
> the new panfrost driver. This series fix two issues and introduce the
> dt-bindings but a simple benchmark show that it's still NOT WORKING.
> 
> I'm pushing it in case someone want to continue the work.
> 
> This has been tested with Mesa3D 19.1.0-RC2 and a GPU bitness patch[1].
> 
> One patch is from Icenowy Zheng where I changed the order as required
> by Rob Herring[2].
> 
> Thanks,
> Clement
> 
> [1] https://gitlab.freedesktop.org/kszaq/mesa/tree/panfrost_64_32
> [2] https://patchwork.kernel.org/patch/10699829/
> 
> 
> [  345.204813] panfrost 1800000.gpu: mmu irq status=1
> [  345.209617] panfrost 1800000.gpu: Unhandled Page fault in AS0 at VA
> 0x0000000002400400
> [  345.209617] Reason: TODO
> [  345.209617] raw fault status: 0x800002C1
> [  345.209617] decoded fault status: SLAVE FAULT
> [  345.209617] exception type 0xC1: TRANSLATION_FAULT_LEVEL1
> [  345.209617] access type 0x2: READ
> [  345.209617] source id 0x8000
> [  345.729957] panfrost 1800000.gpu: gpu sched timeout, js=0,
> status=0x8, head=0x2400400, tail=0x2400400, sched_job=000000009e204de9
> [  346.055876] panfrost 1800000.gpu: mmu irq status=1
> [  346.060680] panfrost 1800000.gpu: Unhandled Page fault in AS0 at VA
> 0x0000000002C00A00
> [  346.060680] Reason: TODO
> [  346.060680] raw fault status: 0x810002C1
> [  346.060680] decoded fault status: SLAVE FAULT
> [  346.060680] exception type 0xC1: TRANSLATION_FAULT_LEVEL1
> [  346.060680] access type 0x2: READ
> [  346.060680] source id 0x8100
> [  346.561955] panfrost 1800000.gpu: gpu sched timeout, js=1,
> status=0x8, head=0x2c00a00, tail=0x2c00a00, sched_job=00000000b55a9a85
> [  346.573913] panfrost 1800000.gpu: mmu irq status=1
> [  346.578707] panfrost 1800000.gpu: Unhandled Page fault in AS0 at VA
> 0x0000000002C00B80

FWIW I seem to have reproduced the same behaviour on a different T720 
setup, so this may well be more about the GPU than the platform. There 
doesn't look to be anything obviously wrong with the pagetables, but if 
I can find some more free time I may have a bit more of a poke around.

Robin.

> 
> Change in v5:
>   - Remove fix indent
> 
> Changes in v4:
>   - Add bus_clock probe
>   - Fix sanity check in io-pgtable
>   - Add vramp-delay
>   - Merge all boards into one patch
>   - Remove upstreamed Neil A. patch
> 
> Change in v3 (Thanks to Maxime Ripard):
>   - Reauthor Icenowy for her path
> 
> Changes in v2 (Thanks to Maxime Ripard):
>   - Drop GPU OPP Table
>   - Add clocks and clock-names in required
> 
> Clément Péron (5):
>    drm: panfrost: add optional bus_clock
>    iommu: io-pgtable: fix sanity check for non 48-bit mali iommu
>    dt-bindings: gpu: mali-midgard: Add H6 mali gpu compatible
>    arm64: dts: allwinner: Add ARM Mali GPU node for H6
>    arm64: dts: allwinner: Add mali GPU supply for H6 boards
> 
> Icenowy Zheng (1):
>    dt-bindings: gpu: add bus clock for Mali Midgard GPUs
> 
>   .../bindings/gpu/arm,mali-midgard.txt         | 15 ++++++++++++-
>   .../dts/allwinner/sun50i-h6-beelink-gs1.dts   |  6 +++++
>   .../dts/allwinner/sun50i-h6-orangepi-3.dts    |  6 +++++
>   .../dts/allwinner/sun50i-h6-orangepi.dtsi     |  6 +++++
>   .../boot/dts/allwinner/sun50i-h6-pine-h64.dts |  6 +++++
>   arch/arm64/boot/dts/allwinner/sun50i-h6.dtsi  | 14 ++++++++++++
>   drivers/gpu/drm/panfrost/panfrost_device.c    | 22 +++++++++++++++++++
>   drivers/gpu/drm/panfrost/panfrost_device.h    |  1 +
>   drivers/iommu/io-pgtable-arm.c                |  2 +-
>   9 files changed, 76 insertions(+), 2 deletions(-)
>
Tomeu Vizoso May 29, 2019, 3:09 p.m. UTC | #5
On Tue, 21 May 2019 at 18:11, Clément Péron <peron.clem@gmail.com> wrote:
>
[snip]
> [  345.204813] panfrost 1800000.gpu: mmu irq status=1
> [  345.209617] panfrost 1800000.gpu: Unhandled Page fault in AS0 at VA
> 0x0000000002400400

From what I can see here, 0x0000000002400400 points to the first byte
of the first submitted job descriptor.

So mapping buffers for the GPU doesn't seem to be working at all on
64-bit T-760.

Steven, Robin, do you have any idea of why this could be?

Thanks,

Tomeu
Clément Péron May 29, 2019, 3:37 p.m. UTC | #6
Hi Tomeu,

On Wed, 29 May 2019 at 17:33, Tomeu Vizoso <tomeu.vizoso@collabora.com> wrote:
>
> The GPU driver needs them to change the clock frequency and regulator
> voltage depending on the load.

As requested by Maxime, I have dropped these OPP table as It's taken
from vendor and untested with Panfrost.

https://lore.kernel.org/patchwork/patch/1060374/

Regards,
Clément

>
> Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
> Cc: Clément Péron <peron.clem@gmail.com>
>
> ---
>
> Feel free to pick up this patch if you are going to keep pushing this
> series forward.
>
> Thanks,
>
> Tomeu
> ---
>  arch/arm64/boot/dts/allwinner/sun50i-h6.dtsi | 66 ++++++++++++++++++++
>  1 file changed, 66 insertions(+)
>
> diff --git a/arch/arm64/boot/dts/allwinner/sun50i-h6.dtsi b/arch/arm64/boot/dts/allwinner/sun50i-h6.dtsi
> index 6aad06095c40..decf7b56e2df 100644
> --- a/arch/arm64/boot/dts/allwinner/sun50i-h6.dtsi
> +++ b/arch/arm64/boot/dts/allwinner/sun50i-h6.dtsi
> @@ -157,6 +157,71 @@
>                         allwinner,sram = <&ve_sram 1>;
>                 };
>
> +               gpu_opp_table: opp-table2 {
> +                       compatible = "operating-points-v2";
> +
> +                       opp00 {
> +                               opp-hz = /bits/ 64 <756000000>;
> +                               opp-microvolt = <1040000>;
> +                       };
> +                       opp01 {
> +                               opp-hz = /bits/ 64 <624000000>;
> +                               opp-microvolt = <950000>;
> +                       };
> +                       opp02 {
> +                               opp-hz = /bits/ 64 <576000000>;
> +                               opp-microvolt = <930000>;
> +                       };
> +                       opp03 {
> +                               opp-hz = /bits/ 64 <540000000>;
> +                               opp-microvolt = <910000>;
> +                       };
> +                       opp04 {
> +                               opp-hz = /bits/ 64 <504000000>;
> +                               opp-microvolt = <890000>;
> +                       };
> +                       opp05 {
> +                               opp-hz = /bits/ 64 <456000000>;
> +                               opp-microvolt = <870000>;
> +                       };
> +                       opp06 {
> +                               opp-hz = /bits/ 64 <432000000>;
> +                               opp-microvolt = <860000>;
> +                       };
> +                       opp07 {
> +                               opp-hz = /bits/ 64 <420000000>;
> +                               opp-microvolt = <850000>;
> +                       };
> +                       opp08 {
> +                               opp-hz = /bits/ 64 <408000000>;
> +                               opp-microvolt = <840000>;
> +                       };
> +                       opp09 {
> +                               opp-hz = /bits/ 64 <384000000>;
> +                               opp-microvolt = <830000>;
> +                       };
> +                       opp10 {
> +                               opp-hz = /bits/ 64 <360000000>;
> +                               opp-microvolt = <820000>;
> +                       };
> +                       opp11 {
> +                               opp-hz = /bits/ 64 <336000000>;
> +                               opp-microvolt = <810000>;
> +                       };
> +                       opp12 {
> +                               opp-hz = /bits/ 64 <312000000>;
> +                               opp-microvolt = <810000>;
> +                       };
> +                       opp13 {
> +                               opp-hz = /bits/ 64 <264000000>;
> +                               opp-microvolt = <810000>;
> +                       };
> +                       opp14 {
> +                               opp-hz = /bits/ 64 <216000000>;
> +                               opp-microvolt = <810000>;
> +                       };
> +               };
> +
>                 gpu: gpu@1800000 {
>                         compatible = "allwinner,sun50i-h6-mali",
>                                      "arm,mali-t720";
> @@ -168,6 +233,7 @@
>                         clocks = <&ccu CLK_GPU>, <&ccu CLK_BUS_GPU>;
>                         clock-names = "core", "bus";
>                         resets = <&ccu RST_BUS_GPU>;
> +                       operating-points-v2 = <&gpu_opp_table>;
>                         status = "disabled";
>                 };
>
> --
> 2.20.1
>
Tomeu Vizoso May 29, 2019, 3:42 p.m. UTC | #7
On Wed, 29 May 2019 at 17:37, Clément Péron <peron.clem@gmail.com> wrote:
>
> Hi Tomeu,
>
> On Wed, 29 May 2019 at 17:33, Tomeu Vizoso <tomeu.vizoso@collabora.com> wrote:
> >
> > The GPU driver needs them to change the clock frequency and regulator
> > voltage depending on the load.
>
> As requested by Maxime, I have dropped these OPP table as It's taken
> from vendor and untested with Panfrost.
>
> https://lore.kernel.org/patchwork/patch/1060374/

Ok, guess this series should wait then until we can run Panfrost on it
and check how DVFS is working.

Thanks,

Tomeu

> Regards,
> Clément
>
> >
> > Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
> > Cc: Clément Péron <peron.clem@gmail.com>
> >
> > ---
> >
> > Feel free to pick up this patch if you are going to keep pushing this
> > series forward.
> >
> > Thanks,
> >
> > Tomeu
> > ---
> >  arch/arm64/boot/dts/allwinner/sun50i-h6.dtsi | 66 ++++++++++++++++++++
> >  1 file changed, 66 insertions(+)
> >
> > diff --git a/arch/arm64/boot/dts/allwinner/sun50i-h6.dtsi b/arch/arm64/boot/dts/allwinner/sun50i-h6.dtsi
> > index 6aad06095c40..decf7b56e2df 100644
> > --- a/arch/arm64/boot/dts/allwinner/sun50i-h6.dtsi
> > +++ b/arch/arm64/boot/dts/allwinner/sun50i-h6.dtsi
> > @@ -157,6 +157,71 @@
> >                         allwinner,sram = <&ve_sram 1>;
> >                 };
> >
> > +               gpu_opp_table: opp-table2 {
> > +                       compatible = "operating-points-v2";
> > +
> > +                       opp00 {
> > +                               opp-hz = /bits/ 64 <756000000>;
> > +                               opp-microvolt = <1040000>;
> > +                       };
> > +                       opp01 {
> > +                               opp-hz = /bits/ 64 <624000000>;
> > +                               opp-microvolt = <950000>;
> > +                       };
> > +                       opp02 {
> > +                               opp-hz = /bits/ 64 <576000000>;
> > +                               opp-microvolt = <930000>;
> > +                       };
> > +                       opp03 {
> > +                               opp-hz = /bits/ 64 <540000000>;
> > +                               opp-microvolt = <910000>;
> > +                       };
> > +                       opp04 {
> > +                               opp-hz = /bits/ 64 <504000000>;
> > +                               opp-microvolt = <890000>;
> > +                       };
> > +                       opp05 {
> > +                               opp-hz = /bits/ 64 <456000000>;
> > +                               opp-microvolt = <870000>;
> > +                       };
> > +                       opp06 {
> > +                               opp-hz = /bits/ 64 <432000000>;
> > +                               opp-microvolt = <860000>;
> > +                       };
> > +                       opp07 {
> > +                               opp-hz = /bits/ 64 <420000000>;
> > +                               opp-microvolt = <850000>;
> > +                       };
> > +                       opp08 {
> > +                               opp-hz = /bits/ 64 <408000000>;
> > +                               opp-microvolt = <840000>;
> > +                       };
> > +                       opp09 {
> > +                               opp-hz = /bits/ 64 <384000000>;
> > +                               opp-microvolt = <830000>;
> > +                       };
> > +                       opp10 {
> > +                               opp-hz = /bits/ 64 <360000000>;
> > +                               opp-microvolt = <820000>;
> > +                       };
> > +                       opp11 {
> > +                               opp-hz = /bits/ 64 <336000000>;
> > +                               opp-microvolt = <810000>;
> > +                       };
> > +                       opp12 {
> > +                               opp-hz = /bits/ 64 <312000000>;
> > +                               opp-microvolt = <810000>;
> > +                       };
> > +                       opp13 {
> > +                               opp-hz = /bits/ 64 <264000000>;
> > +                               opp-microvolt = <810000>;
> > +                       };
> > +                       opp14 {
> > +                               opp-hz = /bits/ 64 <216000000>;
> > +                               opp-microvolt = <810000>;
> > +                       };
> > +               };
> > +
> >                 gpu: gpu@1800000 {
> >                         compatible = "allwinner,sun50i-h6-mali",
> >                                      "arm,mali-t720";
> > @@ -168,6 +233,7 @@
> >                         clocks = <&ccu CLK_GPU>, <&ccu CLK_BUS_GPU>;
> >                         clock-names = "core", "bus";
> >                         resets = <&ccu RST_BUS_GPU>;
> > +                       operating-points-v2 = <&gpu_opp_table>;
> >                         status = "disabled";
> >                 };
> >
> > --
> > 2.20.1
> >
Tomeu Vizoso May 31, 2019, 12:04 p.m. UTC | #8
On Wed, 29 May 2019 at 19:38, Robin Murphy <robin.murphy@arm.com> wrote:
>
> On 29/05/2019 16:09, Tomeu Vizoso wrote:
> > On Tue, 21 May 2019 at 18:11, Clément Péron <peron.clem@gmail.com> wrote:
> >>
> > [snip]
> >> [  345.204813] panfrost 1800000.gpu: mmu irq status=1
> >> [  345.209617] panfrost 1800000.gpu: Unhandled Page fault in AS0 at VA
> >> 0x0000000002400400
> >
> >  From what I can see here, 0x0000000002400400 points to the first byte
> > of the first submitted job descriptor.
> >
> > So mapping buffers for the GPU doesn't seem to be working at all on
> > 64-bit T-760.
> >
> > Steven, Robin, do you have any idea of why this could be?
>
> I tried rolling back to the old panfrost/nondrm shim, and it works fine
> with kbase, and I also found that T-820 falls over in the exact same
> manner, so the fact that it seemed to be common to the smaller 33-bit
> designs rather than anything to do with the other
> job_descriptor_size/v4/v5 complication turned out to be telling.

Is this complication something you can explain? I don't know what v4
and v5 are meant here.

> [ as an aside, are 64-bit jobs actually known not to work on v4 GPUs, or
> is it just that nobody's yet observed a 64-bit blob driving one? ]

I'm looking right now at getting Panfrost working on T720 with 64-bit
descriptors, with the ultimate goal of making Panfrost
64-bit-descriptor only so we can have a single build of Mesa in
distros.

> Long story short, it appears that 'Mali LPAE' is also lacking the start
> level notion of VMSA, and expects a full 4-level table even for <40 bits
> when level 0 effectively redundant. Thus walking the 3-level table that
> io-pgtable comes back with ends up going wildly wrong. The hack below
> seems to do the job for me; if Clément can confirm (on T-720 you'll
> still need the userspace hack to force 32-bit jobs as well) then I think
> I'll cook up a proper refactoring of the allocator to put things right.

Mmaps seem to work with this patch, thanks.

The main complication I'm facing right now seems to be that the SFBD
descriptor on T720 seems to be different from the one we already had
(tested on T6xx?).

Cheers,

Tomeu

> Robin.
>
>
> ----->8-----
> diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
> index 546968d8a349..f29da6e8dc08 100644
> --- a/drivers/iommu/io-pgtable-arm.c
> +++ b/drivers/iommu/io-pgtable-arm.c
> @@ -1023,12 +1023,14 @@ arm_mali_lpae_alloc_pgtable(struct
> io_pgtable_cfg *cfg, void *cookie)
>         iop = arm_64_lpae_alloc_pgtable_s1(cfg, cookie);
>         if (iop) {
>                 u64 mair, ttbr;
> +               struct arm_lpae_io_pgtable *data = io_pgtable_ops_to_data(&iop->ops);
>
> +               data->levels = 4;
>                 /* Copy values as union fields overlap */
>                 mair = cfg->arm_lpae_s1_cfg.mair[0];
>                 ttbr = cfg->arm_lpae_s1_cfg.ttbr[0];
>
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel
Robin Murphy May 31, 2019, 1:47 p.m. UTC | #9
On 31/05/2019 13:04, Tomeu Vizoso wrote:
> On Wed, 29 May 2019 at 19:38, Robin Murphy <robin.murphy@arm.com> wrote:
>>
>> On 29/05/2019 16:09, Tomeu Vizoso wrote:
>>> On Tue, 21 May 2019 at 18:11, Clément Péron <peron.clem@gmail.com> wrote:
>>>>
>>> [snip]
>>>> [  345.204813] panfrost 1800000.gpu: mmu irq status=1
>>>> [  345.209617] panfrost 1800000.gpu: Unhandled Page fault in AS0 at VA
>>>> 0x0000000002400400
>>>
>>>   From what I can see here, 0x0000000002400400 points to the first byte
>>> of the first submitted job descriptor.
>>>
>>> So mapping buffers for the GPU doesn't seem to be working at all on
>>> 64-bit T-760.
>>>
>>> Steven, Robin, do you have any idea of why this could be?
>>
>> I tried rolling back to the old panfrost/nondrm shim, and it works fine
>> with kbase, and I also found that T-820 falls over in the exact same
>> manner, so the fact that it seemed to be common to the smaller 33-bit
>> designs rather than anything to do with the other
>> job_descriptor_size/v4/v5 complication turned out to be telling.
> 
> Is this complication something you can explain? I don't know what v4
> and v5 are meant here.

I was alluding to BASE_HW_FEATURE_V4, which I believe refers to the 
Midgard architecture version - the older versions implemented by T6xx 
and T720 seem to be collectively treated as "v4", while T760 and T8xx 
would effectively be "v5".

>> [ as an aside, are 64-bit jobs actually known not to work on v4 GPUs, or
>> is it just that nobody's yet observed a 64-bit blob driving one? ]
> 
> I'm looking right now at getting Panfrost working on T720 with 64-bit
> descriptors, with the ultimate goal of making Panfrost
> 64-bit-descriptor only so we can have a single build of Mesa in
> distros.

Cool, I'll keep an eye out, and hope that it might be enough for T620 on 
Juno, too :)

>> Long story short, it appears that 'Mali LPAE' is also lacking the start
>> level notion of VMSA, and expects a full 4-level table even for <40 bits
>> when level 0 effectively redundant. Thus walking the 3-level table that
>> io-pgtable comes back with ends up going wildly wrong. The hack below
>> seems to do the job for me; if Clément can confirm (on T-720 you'll
>> still need the userspace hack to force 32-bit jobs as well) then I think
>> I'll cook up a proper refactoring of the allocator to put things right.
> 
> Mmaps seem to work with this patch, thanks.
> 
> The main complication I'm facing right now seems to be that the SFBD
> descriptor on T720 seems to be different from the one we already had
> (tested on T6xx?).

OK - with the 32-bit hack pointed to up-thread, a quick kmscube test 
gave me the impression that T720 works fine, but on closer inspection 
some parts of glmark2 do seem to go a bit wonky (although I suspect at 
least some of it is just down to the FPGA setup being both very slow and 
lacking in memory bandwidth), and the "nv12-1img" mode of kmscube turns 
out to render in some delightfully wrong colours.

I'll try to get a 'proper' version of the io-pgtable patch posted soon.

Thanks,
Robin.

> 
> Cheers,
> 
> Tomeu
> 
>> Robin.
>>
>>
>> ----->8-----
>> diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
>> index 546968d8a349..f29da6e8dc08 100644
>> --- a/drivers/iommu/io-pgtable-arm.c
>> +++ b/drivers/iommu/io-pgtable-arm.c
>> @@ -1023,12 +1023,14 @@ arm_mali_lpae_alloc_pgtable(struct
>> io_pgtable_cfg *cfg, void *cookie)
>>          iop = arm_64_lpae_alloc_pgtable_s1(cfg, cookie);
>>          if (iop) {
>>                  u64 mair, ttbr;
>> +               struct arm_lpae_io_pgtable *data = io_pgtable_ops_to_data(&iop->ops);
>>
>> +               data->levels = 4;
>>                  /* Copy values as union fields overlap */
>>                  mair = cfg->arm_lpae_s1_cfg.mair[0];
>>                  ttbr = cfg->arm_lpae_s1_cfg.ttbr[0];
>>
>> _______________________________________________
>> dri-devel mailing list
>> dri-devel@lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/dri-devel
Clément Péron June 3, 2019, 5:27 p.m. UTC | #10
Hi Maxime, Joerg,

On Wed, 22 May 2019 at 21:27, Rob Herring <robh+dt@kernel.org> wrote:
>
> On Tue, May 21, 2019 at 11:11 AM Clément Péron <peron.clem@gmail.com> wrote:
> >
> > Hi,
> >
> > The Allwinner H6 has a Mali-T720 MP2 which should be supported by
> > the new panfrost driver. This series fix two issues and introduce the
> > dt-bindings but a simple benchmark show that it's still NOT WORKING.
> >
> > I'm pushing it in case someone want to continue the work.
> >
> > This has been tested with Mesa3D 19.1.0-RC2 and a GPU bitness patch[1].
> >
> > One patch is from Icenowy Zheng where I changed the order as required
> > by Rob Herring[2].
> >
> > Thanks,
> > Clement
> >
> > [1] https://gitlab.freedesktop.org/kszaq/mesa/tree/panfrost_64_32
> > [2] https://patchwork.kernel.org/patch/10699829/
> >
> >
> > [  345.204813] panfrost 1800000.gpu: mmu irq status=1
> > [  345.209617] panfrost 1800000.gpu: Unhandled Page fault in AS0 at VA
> > 0x0000000002400400
> > [  345.209617] Reason: TODO
> > [  345.209617] raw fault status: 0x800002C1
> > [  345.209617] decoded fault status: SLAVE FAULT
> > [  345.209617] exception type 0xC1: TRANSLATION_FAULT_LEVEL1
> > [  345.209617] access type 0x2: READ
> > [  345.209617] source id 0x8000
> > [  345.729957] panfrost 1800000.gpu: gpu sched timeout, js=0,
> > status=0x8, head=0x2400400, tail=0x2400400, sched_job=000000009e204de9
> > [  346.055876] panfrost 1800000.gpu: mmu irq status=1
> > [  346.060680] panfrost 1800000.gpu: Unhandled Page fault in AS0 at VA
> > 0x0000000002C00A00
> > [  346.060680] Reason: TODO
> > [  346.060680] raw fault status: 0x810002C1
> > [  346.060680] decoded fault status: SLAVE FAULT
> > [  346.060680] exception type 0xC1: TRANSLATION_FAULT_LEVEL1
> > [  346.060680] access type 0x2: READ
> > [  346.060680] source id 0x8100
> > [  346.561955] panfrost 1800000.gpu: gpu sched timeout, js=1,
> > status=0x8, head=0x2c00a00, tail=0x2c00a00, sched_job=00000000b55a9a85
> > [  346.573913] panfrost 1800000.gpu: mmu irq status=1
> > [  346.578707] panfrost 1800000.gpu: Unhandled Page fault in AS0 at VA
> > 0x0000000002C00B80
> >
> > Change in v5:
> >  - Remove fix indent
> >
> > Changes in v4:
> >  - Add bus_clock probe
> >  - Fix sanity check in io-pgtable
> >  - Add vramp-delay
> >  - Merge all boards into one patch
> >  - Remove upstreamed Neil A. patch
> >
> > Change in v3 (Thanks to Maxime Ripard):
> >  - Reauthor Icenowy for her path
> >
> > Changes in v2 (Thanks to Maxime Ripard):
> >  - Drop GPU OPP Table
> >  - Add clocks and clock-names in required
> >
> > Clément Péron (5):
> >   drm: panfrost: add optional bus_clock
> >   iommu: io-pgtable: fix sanity check for non 48-bit mali iommu
> >   dt-bindings: gpu: mali-midgard: Add H6 mali gpu compatible
> >   arm64: dts: allwinner: Add ARM Mali GPU node for H6
> >   arm64: dts: allwinner: Add mali GPU supply for H6 boards
> >
> > Icenowy Zheng (1):
> >   dt-bindings: gpu: add bus clock for Mali Midgard GPUs
>
> I've applied patches 1 and 3 to drm-misc. I was going to do patch 4
> too, but it doesn't apply.
>
> Patch 2 can go in via the iommu tree and the rest via the allwinner tree.

Is this OK for you to pick up this series?

Thanks,
Clément

>
> Rob
Tomeu Vizoso June 10, 2019, 1:30 p.m. UTC | #11
On Wed, 29 May 2019 at 19:38, Robin Murphy <robin.murphy@arm.com> wrote:
>
> On 29/05/2019 16:09, Tomeu Vizoso wrote:
> > On Tue, 21 May 2019 at 18:11, Clément Péron <peron.clem@gmail.com> wrote:
> >>
> > [snip]
> >> [  345.204813] panfrost 1800000.gpu: mmu irq status=1
> >> [  345.209617] panfrost 1800000.gpu: Unhandled Page fault in AS0 at VA
> >> 0x0000000002400400
> >
> >  From what I can see here, 0x0000000002400400 points to the first byte
> > of the first submitted job descriptor.
> >
> > So mapping buffers for the GPU doesn't seem to be working at all on
> > 64-bit T-760.
> >
> > Steven, Robin, do you have any idea of why this could be?
>
> I tried rolling back to the old panfrost/nondrm shim, and it works fine
> with kbase, and I also found that T-820 falls over in the exact same
> manner, so the fact that it seemed to be common to the smaller 33-bit
> designs rather than anything to do with the other
> job_descriptor_size/v4/v5 complication turned out to be telling.
>
> [ as an aside, are 64-bit jobs actually known not to work on v4 GPUs, or
> is it just that nobody's yet observed a 64-bit blob driving one? ]

Do you know if 64-bit descriptors work on v4 GPUs with our kernel
driver but with the DDK?

Wonder if there something else to be fixed in the kernel for that scenario.

Thanks,

Tomeu

> Long story short, it appears that 'Mali LPAE' is also lacking the start
> level notion of VMSA, and expects a full 4-level table even for <40 bits
> when level 0 effectively redundant. Thus walking the 3-level table that
> io-pgtable comes back with ends up going wildly wrong. The hack below
> seems to do the job for me; if Clément can confirm (on T-720 you'll
> still need the userspace hack to force 32-bit jobs as well) then I think
> I'll cook up a proper refactoring of the allocator to put things right.
>
> Robin.
>
>
> ----->8-----
> diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
> index 546968d8a349..f29da6e8dc08 100644
> --- a/drivers/iommu/io-pgtable-arm.c
> +++ b/drivers/iommu/io-pgtable-arm.c
> @@ -1023,12 +1023,14 @@ arm_mali_lpae_alloc_pgtable(struct
> io_pgtable_cfg *cfg, void *cookie)
>         iop = arm_64_lpae_alloc_pgtable_s1(cfg, cookie);
>         if (iop) {
>                 u64 mair, ttbr;
> +               struct arm_lpae_io_pgtable *data = io_pgtable_ops_to_data(&iop->ops);
>
> +               data->levels = 4;
>                 /* Copy values as union fields overlap */
>                 mair = cfg->arm_lpae_s1_cfg.mair[0];
>                 ttbr = cfg->arm_lpae_s1_cfg.ttbr[0];
>
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel
Neil Armstrong Aug. 28, 2019, 11:28 a.m. UTC | #12
Hi Robyn,

On 31/05/2019 15:47, Robin Murphy wrote:
> On 31/05/2019 13:04, Tomeu Vizoso wrote:
>> On Wed, 29 May 2019 at 19:38, Robin Murphy <robin.murphy@arm.com> wrote:
>>>
>>> On 29/05/2019 16:09, Tomeu Vizoso wrote:
>>>> On Tue, 21 May 2019 at 18:11, Clément Péron <peron.clem@gmail.com> wrote:
>>>>>
>>>> [snip]
>>>>> [  345.204813] panfrost 1800000.gpu: mmu irq status=1
>>>>> [  345.209617] panfrost 1800000.gpu: Unhandled Page fault in AS0 at VA
>>>>> 0x0000000002400400
>>>>
>>>>   From what I can see here, 0x0000000002400400 points to the first byte
>>>> of the first submitted job descriptor.
>>>>
>>>> So mapping buffers for the GPU doesn't seem to be working at all on
>>>> 64-bit T-760.
>>>>
>>>> Steven, Robin, do you have any idea of why this could be?
>>>
>>> I tried rolling back to the old panfrost/nondrm shim, and it works fine
>>> with kbase, and I also found that T-820 falls over in the exact same
>>> manner, so the fact that it seemed to be common to the smaller 33-bit
>>> designs rather than anything to do with the other
>>> job_descriptor_size/v4/v5 complication turned out to be telling.
>>
>> Is this complication something you can explain? I don't know what v4
>> and v5 are meant here.
> 
> I was alluding to BASE_HW_FEATURE_V4, which I believe refers to the Midgard architecture version - the older versions implemented by T6xx and T720 seem to be collectively treated as "v4", while T760 and T8xx would effectively be "v5".
> 
>>> [ as an aside, are 64-bit jobs actually known not to work on v4 GPUs, or
>>> is it just that nobody's yet observed a 64-bit blob driving one? ]
>>
>> I'm looking right now at getting Panfrost working on T720 with 64-bit
>> descriptors, with the ultimate goal of making Panfrost
>> 64-bit-descriptor only so we can have a single build of Mesa in
>> distros.
> 
> Cool, I'll keep an eye out, and hope that it might be enough for T620 on Juno, too :)
> 
>>> Long story short, it appears that 'Mali LPAE' is also lacking the start
>>> level notion of VMSA, and expects a full 4-level table even for <40 bits
>>> when level 0 effectively redundant. Thus walking the 3-level table that
>>> io-pgtable comes back with ends up going wildly wrong. The hack below
>>> seems to do the job for me; if Clément can confirm (on T-720 you'll
>>> still need the userspace hack to force 32-bit jobs as well) then I think
>>> I'll cook up a proper refactoring of the allocator to put things right.
>>
>> Mmaps seem to work with this patch, thanks.
>>
>> The main complication I'm facing right now seems to be that the SFBD
>> descriptor on T720 seems to be different from the one we already had
>> (tested on T6xx?).
> 
> OK - with the 32-bit hack pointed to up-thread, a quick kmscube test gave me the impression that T720 works fine, but on closer inspection some parts of glmark2 do seem to go a bit wonky (although I suspect at least some of it is just down to the FPGA setup being both very slow and lacking in memory bandwidth), and the "nv12-1img" mode of kmscube turns out to render in some delightfully wrong colours.
> 
> I'll try to get a 'proper' version of the io-pgtable patch posted soon.

I'm trying to collect all the fixes needed to make T820 work again, and
I was wondering if you finally have a proper patch for this and "cfg->ias > 48"
hack ? Or one I can test ?

Thanks,
Neil

> 
> Thanks,
> Robin.
> 
>>
>> Cheers,
>>
>> Tomeu
>>
>>> Robin.
>>>
>>>
>>> ----->8-----
>>> diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
>>> index 546968d8a349..f29da6e8dc08 100644
>>> --- a/drivers/iommu/io-pgtable-arm.c
>>> +++ b/drivers/iommu/io-pgtable-arm.c
>>> @@ -1023,12 +1023,14 @@ arm_mali_lpae_alloc_pgtable(struct
>>> io_pgtable_cfg *cfg, void *cookie)
>>>          iop = arm_64_lpae_alloc_pgtable_s1(cfg, cookie);
>>>          if (iop) {
>>>                  u64 mair, ttbr;
>>> +               struct arm_lpae_io_pgtable *data = io_pgtable_ops_to_data(&iop->ops);
>>>
>>> +               data->levels = 4;
>>>                  /* Copy values as union fields overlap */
>>>                  mair = cfg->arm_lpae_s1_cfg.mair[0];
>>>                  ttbr = cfg->arm_lpae_s1_cfg.ttbr[0];
>>>
>>> _______________________________________________
>>> dri-devel mailing list
>>> dri-devel@lists.freedesktop.org
>>> https://lists.freedesktop.org/mailman/listinfo/dri-devel
Robin Murphy Aug. 28, 2019, 11:49 a.m. UTC | #13
Hi Neil,

On 28/08/2019 12:28, Neil Armstrong wrote:
> Hi Robin,
> 
> On 31/05/2019 15:47, Robin Murphy wrote:
>> On 31/05/2019 13:04, Tomeu Vizoso wrote:
>>> On Wed, 29 May 2019 at 19:38, Robin Murphy <robin.murphy@arm.com> wrote:
>>>>
>>>> On 29/05/2019 16:09, Tomeu Vizoso wrote:
>>>>> On Tue, 21 May 2019 at 18:11, Clément Péron <peron.clem@gmail.com> wrote:
>>>>>>
>>>>> [snip]
>>>>>> [  345.204813] panfrost 1800000.gpu: mmu irq status=1
>>>>>> [  345.209617] panfrost 1800000.gpu: Unhandled Page fault in AS0 at VA
>>>>>> 0x0000000002400400
>>>>>
>>>>>    From what I can see here, 0x0000000002400400 points to the first byte
>>>>> of the first submitted job descriptor.
>>>>>
>>>>> So mapping buffers for the GPU doesn't seem to be working at all on
>>>>> 64-bit T-760.
>>>>>
>>>>> Steven, Robin, do you have any idea of why this could be?
>>>>
>>>> I tried rolling back to the old panfrost/nondrm shim, and it works fine
>>>> with kbase, and I also found that T-820 falls over in the exact same
>>>> manner, so the fact that it seemed to be common to the smaller 33-bit
>>>> designs rather than anything to do with the other
>>>> job_descriptor_size/v4/v5 complication turned out to be telling.
>>>
>>> Is this complication something you can explain? I don't know what v4
>>> and v5 are meant here.
>>
>> I was alluding to BASE_HW_FEATURE_V4, which I believe refers to the Midgard architecture version - the older versions implemented by T6xx and T720 seem to be collectively treated as "v4", while T760 and T8xx would effectively be "v5".
>>
>>>> [ as an aside, are 64-bit jobs actually known not to work on v4 GPUs, or
>>>> is it just that nobody's yet observed a 64-bit blob driving one? ]
>>>
>>> I'm looking right now at getting Panfrost working on T720 with 64-bit
>>> descriptors, with the ultimate goal of making Panfrost
>>> 64-bit-descriptor only so we can have a single build of Mesa in
>>> distros.
>>
>> Cool, I'll keep an eye out, and hope that it might be enough for T620 on Juno, too :)
>>
>>>> Long story short, it appears that 'Mali LPAE' is also lacking the start
>>>> level notion of VMSA, and expects a full 4-level table even for <40 bits
>>>> when level 0 effectively redundant. Thus walking the 3-level table that
>>>> io-pgtable comes back with ends up going wildly wrong. The hack below
>>>> seems to do the job for me; if Clément can confirm (on T-720 you'll
>>>> still need the userspace hack to force 32-bit jobs as well) then I think
>>>> I'll cook up a proper refactoring of the allocator to put things right.
>>>
>>> Mmaps seem to work with this patch, thanks.
>>>
>>> The main complication I'm facing right now seems to be that the SFBD
>>> descriptor on T720 seems to be different from the one we already had
>>> (tested on T6xx?).
>>
>> OK - with the 32-bit hack pointed to up-thread, a quick kmscube test gave me the impression that T720 works fine, but on closer inspection some parts of glmark2 do seem to go a bit wonky (although I suspect at least some of it is just down to the FPGA setup being both very slow and lacking in memory bandwidth), and the "nv12-1img" mode of kmscube turns out to render in some delightfully wrong colours.
>>
>> I'll try to get a 'proper' version of the io-pgtable patch posted soon.
> 
> I'm trying to collect all the fixes needed to make T820 work again, and
> I was wondering if you finally have a proper patch for this and "cfg->ias > 48"
> hack ? Or one I can test ?

I do have a handful of io-pgtable patches written up and ready to go, 
I'm just treading carefully and waiting for the internal approval box to 
be ticked before I share anything :(

Robin.

> 
> Thanks,
> Neil
> 
>>
>> Thanks,
>> Robin.
>>
>>>
>>> Cheers,
>>>
>>> Tomeu
>>>
>>>> Robin.
>>>>
>>>>
>>>> ----->8-----
>>>> diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
>>>> index 546968d8a349..f29da6e8dc08 100644
>>>> --- a/drivers/iommu/io-pgtable-arm.c
>>>> +++ b/drivers/iommu/io-pgtable-arm.c
>>>> @@ -1023,12 +1023,14 @@ arm_mali_lpae_alloc_pgtable(struct
>>>> io_pgtable_cfg *cfg, void *cookie)
>>>>           iop = arm_64_lpae_alloc_pgtable_s1(cfg, cookie);
>>>>           if (iop) {
>>>>                   u64 mair, ttbr;
>>>> +               struct arm_lpae_io_pgtable *data = io_pgtable_ops_to_data(&iop->ops);
>>>>
>>>> +               data->levels = 4;
>>>>                   /* Copy values as union fields overlap */
>>>>                   mair = cfg->arm_lpae_s1_cfg.mair[0];
>>>>                   ttbr = cfg->arm_lpae_s1_cfg.ttbr[0];
>>>>
>>>> _______________________________________________
>>>> dri-devel mailing list
>>>> dri-devel@lists.freedesktop.org
>>>> https://lists.freedesktop.org/mailman/listinfo/dri-devel
>
Neil Armstrong Aug. 28, 2019, 11:51 a.m. UTC | #14
On 28/08/2019 13:49, Robin Murphy wrote:
> Hi Neil,
> 
> On 28/08/2019 12:28, Neil Armstrong wrote:
>> Hi Robin,
>>

[...]
>>>
>>> OK - with the 32-bit hack pointed to up-thread, a quick kmscube test gave me the impression that T720 works fine, but on closer inspection some parts of glmark2 do seem to go a bit wonky (although I suspect at least some of it is just down to the FPGA setup being both very slow and lacking in memory bandwidth), and the "nv12-1img" mode of kmscube turns out to render in some delightfully wrong colours.
>>>
>>> I'll try to get a 'proper' version of the io-pgtable patch posted soon.
>>
>> I'm trying to collect all the fixes needed to make T820 work again, and
>> I was wondering if you finally have a proper patch for this and "cfg->ias > 48"
>> hack ? Or one I can test ?
> 
> I do have a handful of io-pgtable patches written up and ready to go, I'm just treading carefully and waiting for the internal approval box to be ticked before I share anything :(

Great !

No problem, it can totally wait until approval,

Thanks,
Neil

> 
> Robin.
> 
>>
>> Thanks,
>> Neil
>>
>>>
>>> Thanks,
>>> Robin.
>>>
>>>>
>>>> Cheers,
>>>>
>>>> Tomeu
>>>>
>>>>> Robin.
>>>>>
>>>>>
>>>>> ----->8-----
>>>>> diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
>>>>> index 546968d8a349..f29da6e8dc08 100644
>>>>> --- a/drivers/iommu/io-pgtable-arm.c
>>>>> +++ b/drivers/iommu/io-pgtable-arm.c
>>>>> @@ -1023,12 +1023,14 @@ arm_mali_lpae_alloc_pgtable(struct
>>>>> io_pgtable_cfg *cfg, void *cookie)
>>>>>           iop = arm_64_lpae_alloc_pgtable_s1(cfg, cookie);
>>>>>           if (iop) {
>>>>>                   u64 mair, ttbr;
>>>>> +               struct arm_lpae_io_pgtable *data = io_pgtable_ops_to_data(&iop->ops);
>>>>>
>>>>> +               data->levels = 4;
>>>>>                   /* Copy values as union fields overlap */
>>>>>                   mair = cfg->arm_lpae_s1_cfg.mair[0];
>>>>>                   ttbr = cfg->arm_lpae_s1_cfg.ttbr[0];
>>>>>
>>>>> _______________________________________________
>>>>> dri-devel mailing list
>>>>> dri-devel@lists.freedesktop.org
>>>>> https://lists.freedesktop.org/mailman/listinfo/dri-devel
>>