mbox series

[v3,00/24] MT8192 IOMMU support

Message ID 20200930070647.10188-1-yong.wu@mediatek.com
Headers show
Series MT8192 IOMMU support | expand

Message

Yong Wu (吴勇) Sept. 30, 2020, 7:06 a.m. UTC
This patch mainly adds support for mt8192 IOMMU and SMI.

mt8192 also is MTK IOMMU gen2 which uses ARM Short-Descriptor translation
table format. The M4U-SMI HW diagram is as below:

                          EMI
                           |
                          M4U
                           |
                      ------------
                       SMI Common
                      ------------
                           |
  +-------+------+------+----------------------+-------+
  |       |      |      |       ......         |       |
  |       |      |      |                      |       |
larb0   larb1  larb2  larb4     ......      larb19   larb20
disp0   disp1   mdp    vdec                   IPE      IPE

All the connections are HW fixed, SW can NOT adjust it.

Comparing with the preview SoC, this patchset mainly adds two new functions:
a) add iova 34 bits support.
b) add multi domains support since several HW has the special iova
region requirement.

this patchset depend on v5.9-rc1.

change note:
v3:
  a) Fix DT schema issue commented from Rob.
  b) Fix a v7s issue. Use "_lvl" instead of "_l" in the macro(ARM_V7S_PTES_PER_LVL) since 
  it is called in ARM_V7S_LVL_IDX which has already used "_l".
  c) Fix a PM suspend issue: Avoid pm suspend in pm runtime case.

v2: https://lore.kernel.org/linux-iommu/20200905080920.13396-1-yong.wu@mediatek.com/
  a) Convert IOMMU/SMI dt-binding to DT schema.
  b) Fix some comment from Pi-Hsun and Nicolas. like use
  generic_iommu_put_resv_regions.
  c) Reword some comment, like add how to use domain-id.

v1: https://lore.kernel.org/linux-iommu/20200711064846.16007-1-yong.wu@mediatek.com/

Yong Wu (24):
  dt-bindings: iommu: mediatek: Convert IOMMU to DT schema
  dt-bindings: memory: mediatek: Convert SMI to DT schema
  dt-bindings: memory: mediatek: Add a common larb-port header file
  dt-bindings: memory: mediatek: Extend LARB_NR_MAX to 32
  dt-bindings: memory: mediatek: Add domain definition
  dt-bindings: mediatek: Add binding for mt8192 IOMMU
  iommu/mediatek: Use the common mtk-smi-larb-port.h
  iommu/io-pgtable-arm-v7s: Use ias to check the valid iova in unmap
  iommu/io-pgtable-arm-v7s: Extend PA34 for MediaTek
  iommu/io-pgtable-arm-v7s: Add cfg as a param in some macros
  iommu/io-pgtable-arm-v7s: Quad lvl1 pgtable for MediaTek
  iommu/mediatek: Move hw_init into attach_device
  iommu/mediatek: Add device link for smi-common and m4u
  iommu/mediatek: Add pm runtime callback
  iommu/mediatek: Add power-domain operation
  iommu/mediatek: Add iova reserved function
  iommu/mediatek: Add single domain
  iommu/mediatek: Support master use iova over 32bit
  iommu/mediatek: Support up to 34bit iova in tlb flush
  iommu/mediatek: Support report iova 34bit translation fault in ISR
  iommu/mediatek: Add support for multi domain
  iommu/mediatek: Adjust the structure
  iommu/mediatek: Add mt8192 support
  memory: mtk-smi: Add mt8192 support

 .../bindings/iommu/mediatek,iommu.txt         | 103 -------
 .../bindings/iommu/mediatek,iommu.yaml        | 159 +++++++++++
 .../mediatek,smi-common.txt                   |  49 ----
 .../mediatek,smi-common.yaml                  | 101 +++++++
 .../memory-controllers/mediatek,smi-larb.txt  |  49 ----
 .../memory-controllers/mediatek,smi-larb.yaml |  92 +++++++
 drivers/iommu/io-pgtable-arm-v7s.c            |  57 ++--
 drivers/iommu/mtk_iommu.c                     | 256 +++++++++++++++---
 drivers/iommu/mtk_iommu.h                     |  11 +-
 drivers/memory/mtk-smi.c                      |  27 ++
 include/dt-bindings/memory/mt2712-larb-port.h |   2 +-
 include/dt-bindings/memory/mt6779-larb-port.h |   2 +-
 include/dt-bindings/memory/mt8173-larb-port.h |   2 +-
 include/dt-bindings/memory/mt8183-larb-port.h |   2 +-
 include/dt-bindings/memory/mt8192-larb-port.h | 239 ++++++++++++++++
 .../dt-bindings/memory/mtk-smi-larb-port.h    |  22 ++
 include/linux/io-pgtable.h                    |   4 +-
 include/soc/mediatek/smi.h                    |   3 +-
 18 files changed, 903 insertions(+), 277 deletions(-)
 delete mode 100644 Documentation/devicetree/bindings/iommu/mediatek,iommu.txt
 create mode 100644 Documentation/devicetree/bindings/iommu/mediatek,iommu.yaml
 delete mode 100644 Documentation/devicetree/bindings/memory-controllers/mediatek,smi-common.txt
 create mode 100644 Documentation/devicetree/bindings/memory-controllers/mediatek,smi-common.yaml
 delete mode 100644 Documentation/devicetree/bindings/memory-controllers/mediatek,smi-larb.txt
 create mode 100644 Documentation/devicetree/bindings/memory-controllers/mediatek,smi-larb.yaml
 create mode 100644 include/dt-bindings/memory/mt8192-larb-port.h
 create mode 100644 include/dt-bindings/memory/mtk-smi-larb-port.h

Comments

Krzysztof Kozlowski Oct. 2, 2020, 11:15 a.m. UTC | #1
On Wed, Sep 30, 2020 at 03:06:47PM +0800, Yong Wu wrote:
> Add mt8192 smi support.
> 
> Signed-off-by: Yong Wu <yong.wu@mediatek.com>
> ---
>  drivers/memory/mtk-smi.c | 19 +++++++++++++++++++
>  1 file changed, 19 insertions(+)

Does it depend on any of the previous patches (so can it be applied
independently)?

Best regards,
Krzysztof
Krzysztof Kozlowski Oct. 6, 2020, 7:18 a.m. UTC | #2
On Wed, Sep 30, 2020 at 03:06:41PM +0800, Yong Wu wrote:
> After extending v7s, our pagetable already support iova reach
> 16GB(34bit). the master got the iova via dma_alloc_attrs may reach
> 34bits, but its HW register still is 32bit. then how to set the
> bit32/bit33 iova? this depend on a SMI larb setting(bank_sel).
> 
> we separate whole 16GB iova to four banks:
> bank: 0: 0~4G; 1: 4~8G; 2: 8-12G; 3: 12-16G;
> The bank number is (iova >> 32).
> 
> We will preassign which bank the larbs belong to. currently we don't
> have a interface for master to adjust its bank number.
> 
> Each a bank is a iova_region which is a independent iommu-domain.
> the iova range for each iommu-domain can't cross 4G.
> 
> Signed-off-by: Yong Wu <yong.wu@mediatek.com>
> ---
>  drivers/iommu/mtk_iommu.c  | 12 +++++++++---
>  drivers/memory/mtk-smi.c   |  7 +++++++
>  include/soc/mediatek/smi.h |  1 +
>  3 files changed, 17 insertions(+), 3 deletions(-)


For the memory part:
Acked-by: Krzysztof Kozlowski <krzk@kernel.org>

Best regards,
Krzysztof
Will Deacon Oct. 23, 2020, 11:17 a.m. UTC | #3
On Wed, Sep 30, 2020 at 03:06:31PM +0800, Yong Wu wrote:
> Use the ias for the valid iova checking in arm_v7s_unmap. This is a
> preparing patch for supporting iova 34bit for MediaTek.
> BTW, change the ias/oas checking format in arm_v7s_map.
> 
> Signed-off-by: Yong Wu <yong.wu@mediatek.com>
> ---
>  drivers/iommu/io-pgtable-arm-v7s.c | 5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/iommu/io-pgtable-arm-v7s.c b/drivers/iommu/io-pgtable-arm-v7s.c
> index a688f22cbe3b..4c9d8dccfc5a 100644
> --- a/drivers/iommu/io-pgtable-arm-v7s.c
> +++ b/drivers/iommu/io-pgtable-arm-v7s.c
> @@ -526,8 +526,7 @@ static int arm_v7s_map(struct io_pgtable_ops *ops, unsigned long iova,
>  	if (!(prot & (IOMMU_READ | IOMMU_WRITE)))
>  		return 0;
>  
> -	if (WARN_ON(iova >= (1ULL << data->iop.cfg.ias) ||
> -		    paddr >= (1ULL << data->iop.cfg.oas)))
> +	if (WARN_ON(iova >> data->iop.cfg.ias || paddr >> data->iop.cfg.oas))
>  		return -ERANGE;

As discussed when reviewing these for Android, please leave this code as-is.

>  
>  	ret = __arm_v7s_map(data, iova, paddr, size, prot, 1, data->pgd, gfp);
> @@ -717,7 +716,7 @@ static size_t arm_v7s_unmap(struct io_pgtable_ops *ops, unsigned long iova,
>  {
>  	struct arm_v7s_io_pgtable *data = io_pgtable_ops_to_data(ops);
>  
> -	if (WARN_ON(upper_32_bits(iova)))
> +	if (WARN_ON(iova >> data->iop.cfg.ias))
>  		return 0;

And avoid the UB here for 32-bit machines by comparing with 1ULL <<
iop.cfg.ias instead.

Thanks,

Will
Will Deacon Oct. 23, 2020, 11:21 a.m. UTC | #4
On Wed, Sep 30, 2020 at 03:06:34PM +0800, Yong Wu wrote:
> The standard input iova bits is 32. MediaTek quad the lvl1 pagetable
> (4 * lvl1). No change for lvl2 pagetable. Then the iova bits can reach
> 34bit.
> 
> Signed-off-by: Yong Wu <yong.wu@mediatek.com>
> ---
>  drivers/iommu/io-pgtable-arm-v7s.c | 13 ++++++++++---
>  drivers/iommu/mtk_iommu.c          |  2 +-
>  2 files changed, 11 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/iommu/io-pgtable-arm-v7s.c b/drivers/iommu/io-pgtable-arm-v7s.c
> index 8362fdf76657..306bae2755ed 100644
> --- a/drivers/iommu/io-pgtable-arm-v7s.c
> +++ b/drivers/iommu/io-pgtable-arm-v7s.c
> @@ -50,10 +50,17 @@
>   */
>  #define ARM_V7S_ADDR_BITS		32

If we rename this to _ARM_V7S_ADDR_BITS can we then have ARM_V7S_ADDR_BITS
take a cfg parameter and move the arm_v7s_is_mtk_enabled() check in there?
Same for _ARM_V7S_LVL_BITS.

That would avoid scattering arm_v7s_is_mtk_enabled() into all the users.

Will
Will Deacon Oct. 23, 2020, 11:22 a.m. UTC | #5
On Wed, Sep 30, 2020 at 03:06:32PM +0800, Yong Wu wrote:
> MediaTek extend the bit5 in lvl1 and lvl2 descriptor as PA34.
> 
> Signed-off-by: Yong Wu <yong.wu@mediatek.com>
> ---
>  drivers/iommu/io-pgtable-arm-v7s.c | 9 +++++++--
>  drivers/iommu/mtk_iommu.c          | 2 +-
>  include/linux/io-pgtable.h         | 4 ++--
>  3 files changed, 10 insertions(+), 5 deletions(-)

Acked-by: Will Deacon <will@kernel.org>

Will
Will Deacon Oct. 23, 2020, 11:23 a.m. UTC | #6
On Wed, Sep 30, 2020 at 03:06:33PM +0800, Yong Wu wrote:
> Add "cfg" as a parameter for some macros. This is a preparing patch for
> mediatek extend the lvl1 pgtable. No functional change.
> 
> Signed-off-by: Yong Wu <yong.wu@mediatek.com>
> ---
>  drivers/iommu/io-pgtable-arm-v7s.c | 34 +++++++++++++++---------------
>  1 file changed, 17 insertions(+), 17 deletions(-)

Acked-by: Will Deacon <will@kernel.org>

(but see my later comments above doing this for some of the 'constants' too)

Will
Robin Murphy Oct. 23, 2020, 2:10 p.m. UTC | #7
On 2020-09-30 08:06, Yong Wu wrote:
> The standard input iova bits is 32. MediaTek quad the lvl1 pagetable
> (4 * lvl1). No change for lvl2 pagetable. Then the iova bits can reach
> 34bit.
> 
> Signed-off-by: Yong Wu <yong.wu@mediatek.com>
> ---
>   drivers/iommu/io-pgtable-arm-v7s.c | 13 ++++++++++---
>   drivers/iommu/mtk_iommu.c          |  2 +-
>   2 files changed, 11 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/iommu/io-pgtable-arm-v7s.c b/drivers/iommu/io-pgtable-arm-v7s.c
> index 8362fdf76657..306bae2755ed 100644
> --- a/drivers/iommu/io-pgtable-arm-v7s.c
> +++ b/drivers/iommu/io-pgtable-arm-v7s.c
> @@ -50,10 +50,17 @@
>    */
>   #define ARM_V7S_ADDR_BITS		32
>   #define _ARM_V7S_LVL_BITS(lvl)		(16 - (lvl) * 4)
> +/* MediaTek: totally 34bits, 14bits at lvl1 and 8bits at lvl2. */
> +#define _ARM_V7S_LVL_BITS_MTK(lvl)	(20 - (lvl) * 6)

This should defined in terms of both lvl and cfg->ias. The formula here 
is nothing more than a disgusting trick I made up since a linear 
interpolation happened to fit the required numbers. That said, all of 
these bits pretending that short-descriptor is a well-defined recursive 
format only served to allow the rest of the code to look more like the 
LPAE code - IIRC they've already diverged a fair bit since then, so 
frankly a lot of this could stand to be unpicked and made considerably 
clearer by simply accepting that level 1 and level 2 are different from 
each other.

Robin.

>   #define ARM_V7S_LVL_SHIFT(lvl)		(ARM_V7S_ADDR_BITS - (4 + 8 * (lvl)))
>   #define ARM_V7S_TABLE_SHIFT		10
>   
> -#define ARM_V7S_PTES_PER_LVL(lvl, cfg)	(1 << _ARM_V7S_LVL_BITS(lvl))
> +#define ARM_V7S_PTES_PER_LVL(lvl, cfg)	({				\
> +	int _lvl = lvl;							\
> +	!arm_v7s_is_mtk_enabled(cfg) ?					\
> +	 (1 << _ARM_V7S_LVL_BITS(_lvl)) : (1 << _ARM_V7S_LVL_BITS_MTK(_lvl));\
> +})
> +
>   #define ARM_V7S_TABLE_SIZE(lvl, cfg)					\
>   	(ARM_V7S_PTES_PER_LVL(lvl, cfg) * sizeof(arm_v7s_iopte))
>   
> @@ -63,7 +70,7 @@
>   #define _ARM_V7S_IDX_MASK(lvl, cfg)	(ARM_V7S_PTES_PER_LVL(lvl, cfg) - 1)
>   #define ARM_V7S_LVL_IDX(addr, lvl, cfg)	({			\
>   	int _l = lvl;							\
> -	((u32)(addr) >> ARM_V7S_LVL_SHIFT(_l)) & _ARM_V7S_IDX_MASK(_l, cfg); \
> +	((addr) >> ARM_V7S_LVL_SHIFT(_l)) & _ARM_V7S_IDX_MASK(_l, cfg); \
>   })
>   
>   /*
> @@ -755,7 +762,7 @@ static struct io_pgtable *arm_v7s_alloc_pgtable(struct io_pgtable_cfg *cfg,
>   {
>   	struct arm_v7s_io_pgtable *data;
>   
> -	if (cfg->ias > ARM_V7S_ADDR_BITS)
> +	if (cfg->ias > (arm_v7s_is_mtk_enabled(cfg) ? 34 : ARM_V7S_ADDR_BITS))
>   		return NULL;
>   
>   	if (cfg->oas > (arm_v7s_is_mtk_enabled(cfg) ? 35 : ARM_V7S_ADDR_BITS))
> diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
> index f6a2e3eb59d2..6e85c9976a33 100644
> --- a/drivers/iommu/mtk_iommu.c
> +++ b/drivers/iommu/mtk_iommu.c
> @@ -316,7 +316,7 @@ static int mtk_iommu_domain_finalise(struct mtk_iommu_domain *dom)
>   			IO_PGTABLE_QUIRK_TLBI_ON_MAP |
>   			IO_PGTABLE_QUIRK_ARM_MTK_EXT,
>   		.pgsize_bitmap = mtk_iommu_ops.pgsize_bitmap,
> -		.ias = 32,
> +		.ias = 34,
>   		.oas = 35,
>   		.tlb = &mtk_iommu_flush_ops,
>   		.iommu_dev = data->dev,
>
Yong Wu (吴勇) Oct. 26, 2020, 7:41 a.m. UTC | #8
On Fri, 2020-10-23 at 15:10 +0100, Robin Murphy wrote:
> On 2020-09-30 08:06, Yong Wu wrote:
> > The standard input iova bits is 32. MediaTek quad the lvl1 pagetable
> > (4 * lvl1). No change for lvl2 pagetable. Then the iova bits can reach
> > 34bit.
> > 
> > Signed-off-by: Yong Wu <yong.wu@mediatek.com>
> > ---
> >   drivers/iommu/io-pgtable-arm-v7s.c | 13 ++++++++++---
> >   drivers/iommu/mtk_iommu.c          |  2 +-
> >   2 files changed, 11 insertions(+), 4 deletions(-)
> > 
> > diff --git a/drivers/iommu/io-pgtable-arm-v7s.c b/drivers/iommu/io-pgtable-arm-v7s.c
> > index 8362fdf76657..306bae2755ed 100644
> > --- a/drivers/iommu/io-pgtable-arm-v7s.c
> > +++ b/drivers/iommu/io-pgtable-arm-v7s.c
> > @@ -50,10 +50,17 @@
> >    */
> >   #define ARM_V7S_ADDR_BITS		32
> >   #define _ARM_V7S_LVL_BITS(lvl)		(16 - (lvl) * 4)
> > +/* MediaTek: totally 34bits, 14bits at lvl1 and 8bits at lvl2. */
> > +#define _ARM_V7S_LVL_BITS_MTK(lvl)	(20 - (lvl) * 6)
> 
> This should defined in terms of both lvl and cfg->ias. The formula here 
> is nothing more than a disgusting trick I made up since a linear 
> interpolation happened to fit the required numbers. That said, all of 
> these bits pretending that short-descriptor is a well-defined recursive 
> format only served to allow the rest of the code to look more like the 
> LPAE code - IIRC they've already diverged a fair bit since then, so 
> frankly a lot of this could stand to be unpicked and made considerably 
> clearer by simply accepting that level 1 and level 2 are different from 
> each other.

If the formula is not good and make it clearer, How about this?


/*
 * We have 32 bits total; 12 bits resolved at level 1, 8 bits at level
2,
-* and 12 bits in a page. With some carefully-chosen coefficients we can
-* hide the ugly inconsistencies behind these macros and at least let
the
-* rest of the code pretend to be somewhat sane.
+* and 12 bits in a page.
+*
+* MediaTek extend 2 bits to reach 34 bits, 14 bits at lvl1 and 8 bits
at lvl2.
 */

-#define _ARM_V7S_LVL_BITS(lvl)		(16 - (lvl) * 4)
+#define _ARM_V7S_LVL1_BITS_NR(cfg)     (((cfg)->ias == 32) ? 12 : 14)
+#define _ARM_V7S_LVL2_BITS_NR		8
+
+#define _ARM_V7S_LVL_BITS(lvl, cfg)    \
+      (((lvl) == 1) ? _ARM_V7S_LVL1_BITS_NR(cfg):_ARM_V7S_LVL2_BITS_NR)

> Robin.
> 
> >   #define ARM_V7S_LVL_SHIFT(lvl)		(ARM_V7S_ADDR_BITS - (4 + 8 * (lvl)))
> >   #define ARM_V7S_TABLE_SHIFT		10
> >   
> > -#define ARM_V7S_PTES_PER_LVL(lvl, cfg)	(1 << _ARM_V7S_LVL_BITS(lvl))
> > +#define ARM_V7S_PTES_PER_LVL(lvl, cfg)	({				\
> > +	int _lvl = lvl;							\
> > +	!arm_v7s_is_mtk_enabled(cfg) ?					\
> > +	 (1 << _ARM_V7S_LVL_BITS(_lvl)) : (1 << _ARM_V7S_LVL_BITS_MTK(_lvl));\
> > +})
> > +
> >   #define ARM_V7S_TABLE_SIZE(lvl, cfg)					\
> >   	(ARM_V7S_PTES_PER_LVL(lvl, cfg) * sizeof(arm_v7s_iopte))
> >   
> > @@ -63,7 +70,7 @@
> >   #define _ARM_V7S_IDX_MASK(lvl, cfg)	(ARM_V7S_PTES_PER_LVL(lvl, cfg) - 1)
> >   #define ARM_V7S_LVL_IDX(addr, lvl, cfg)	({			\
> >   	int _l = lvl;							\
> > -	((u32)(addr) >> ARM_V7S_LVL_SHIFT(_l)) & _ARM_V7S_IDX_MASK(_l, cfg); \
> > +	((addr) >> ARM_V7S_LVL_SHIFT(_l)) & _ARM_V7S_IDX_MASK(_l, cfg); \
> >   })
> >   
> >   /*
> > @@ -755,7 +762,7 @@ static struct io_pgtable *arm_v7s_alloc_pgtable(struct io_pgtable_cfg *cfg,
> >   {
> >   	struct arm_v7s_io_pgtable *data;
> >   
> > -	if (cfg->ias > ARM_V7S_ADDR_BITS)
> > +	if (cfg->ias > (arm_v7s_is_mtk_enabled(cfg) ? 34 : ARM_V7S_ADDR_BITS))
> >   		return NULL;
> >   
> >   	if (cfg->oas > (arm_v7s_is_mtk_enabled(cfg) ? 35 : ARM_V7S_ADDR_BITS))
> > diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
> > index f6a2e3eb59d2..6e85c9976a33 100644
> > --- a/drivers/iommu/mtk_iommu.c
> > +++ b/drivers/iommu/mtk_iommu.c
> > @@ -316,7 +316,7 @@ static int mtk_iommu_domain_finalise(struct mtk_iommu_domain *dom)
> >   			IO_PGTABLE_QUIRK_TLBI_ON_MAP |
> >   			IO_PGTABLE_QUIRK_ARM_MTK_EXT,
> >   		.pgsize_bitmap = mtk_iommu_ops.pgsize_bitmap,
> > -		.ias = 32,
> > +		.ias = 34,
> >   		.oas = 35,
> >   		.tlb = &mtk_iommu_flush_ops,
> >   		.iommu_dev = data->dev,
> >
Yong Wu (吴勇) Oct. 26, 2020, 7:45 a.m. UTC | #9
On Fri, 2020-10-23 at 12:21 +0100, Will Deacon wrote:
> On Wed, Sep 30, 2020 at 03:06:34PM +0800, Yong Wu wrote:
> > The standard input iova bits is 32. MediaTek quad the lvl1 pagetable
> > (4 * lvl1). No change for lvl2 pagetable. Then the iova bits can reach
> > 34bit.
> > 
> > Signed-off-by: Yong Wu <yong.wu@mediatek.com>
> > ---
> >  drivers/iommu/io-pgtable-arm-v7s.c | 13 ++++++++++---
> >  drivers/iommu/mtk_iommu.c          |  2 +-
> >  2 files changed, 11 insertions(+), 4 deletions(-)
> > 
> > diff --git a/drivers/iommu/io-pgtable-arm-v7s.c b/drivers/iommu/io-pgtable-arm-v7s.c
> > index 8362fdf76657..306bae2755ed 100644
> > --- a/drivers/iommu/io-pgtable-arm-v7s.c
> > +++ b/drivers/iommu/io-pgtable-arm-v7s.c
> > @@ -50,10 +50,17 @@
> >   */
> >  #define ARM_V7S_ADDR_BITS		32
> 
> If we rename this to _ARM_V7S_ADDR_BITS can we then have ARM_V7S_ADDR_BITS
> take a cfg parameter and move the arm_v7s_is_mtk_enabled() check in there?
> Same for _ARM_V7S_LVL_BITS.
> 
> That would avoid scattering arm_v7s_is_mtk_enabled() into all the users.

I added "cfg" for _ARM_V7S_LVL_BITS in Robin's mail. is that ok?

Regarding ARM_V7S_ADDR_BITS, I'd like to keep it as is(Don't add cfg),
this macro only is used in ARM_V7S_LVL_SHIFT and checking the value of
ias/oas.

a) ARM_V7S_LVL_SHIFT always expect ARM_V7S_ADDR_BITS is 32.

b) our ias/oas is different(ias is 34 while oas is 35). If we define a
new macro, we need two about this, like:
#define ARM_V7S_IADDR_BITS(cfg)	(!arm_v7s_is_mtk_enabled(cfg) ? 32 : 34)
#define ARM_V7S_OADDR_BITS(cfg)	(!arm_v7s_is_mtk_enabled(cfg) ? 32 : 35)
and the two will only are used in the checking of ias/oas.

thus it looks unnecessary?

> 
> Will
Yong Wu (吴勇) Oct. 26, 2020, 7:49 a.m. UTC | #10
On Fri, 2020-10-23 at 12:17 +0100, Will Deacon wrote:
> On Wed, Sep 30, 2020 at 03:06:31PM +0800, Yong Wu wrote:
> > Use the ias for the valid iova checking in arm_v7s_unmap. This is a
> > preparing patch for supporting iova 34bit for MediaTek.
> > BTW, change the ias/oas checking format in arm_v7s_map.
> > 
> > Signed-off-by: Yong Wu <yong.wu@mediatek.com>
> > ---
> >  drivers/iommu/io-pgtable-arm-v7s.c | 5 ++---
> >  1 file changed, 2 insertions(+), 3 deletions(-)
> > 
> > diff --git a/drivers/iommu/io-pgtable-arm-v7s.c b/drivers/iommu/io-pgtable-arm-v7s.c
> > index a688f22cbe3b..4c9d8dccfc5a 100644
> > --- a/drivers/iommu/io-pgtable-arm-v7s.c
> > +++ b/drivers/iommu/io-pgtable-arm-v7s.c
> > @@ -526,8 +526,7 @@ static int arm_v7s_map(struct io_pgtable_ops *ops, unsigned long iova,
> >  	if (!(prot & (IOMMU_READ | IOMMU_WRITE)))
> >  		return 0;
> >  
> > -	if (WARN_ON(iova >= (1ULL << data->iop.cfg.ias) ||
> > -		    paddr >= (1ULL << data->iop.cfg.oas)))
> > +	if (WARN_ON(iova >> data->iop.cfg.ias || paddr >> data->iop.cfg.oas))
> >  		return -ERANGE;
> 
> As discussed when reviewing these for Android, please leave this code as-is.
> 
> >  
> >  	ret = __arm_v7s_map(data, iova, paddr, size, prot, 1, data->pgd, gfp);
> > @@ -717,7 +716,7 @@ static size_t arm_v7s_unmap(struct io_pgtable_ops *ops, unsigned long iova,
> >  {
> >  	struct arm_v7s_io_pgtable *data = io_pgtable_ops_to_data(ops);
> >  
> > -	if (WARN_ON(upper_32_bits(iova)))
> > +	if (WARN_ON(iova >> data->iop.cfg.ias))
> >  		return 0;
> 
> And avoid the UB here for 32-bit machines by comparing with 1ULL <<
> iop.cfg.ias instead.

Thanks. I will fix it in next version.

> 
> Thanks,
> 
> Will
Robin Murphy Oct. 26, 2020, 11:35 a.m. UTC | #11
On 2020-10-26 07:41, Yong Wu wrote:
> On Fri, 2020-10-23 at 15:10 +0100, Robin Murphy wrote:
>> On 2020-09-30 08:06, Yong Wu wrote:
>>> The standard input iova bits is 32. MediaTek quad the lvl1 pagetable
>>> (4 * lvl1). No change for lvl2 pagetable. Then the iova bits can reach
>>> 34bit.
>>>
>>> Signed-off-by: Yong Wu <yong.wu@mediatek.com>
>>> ---
>>>    drivers/iommu/io-pgtable-arm-v7s.c | 13 ++++++++++---
>>>    drivers/iommu/mtk_iommu.c          |  2 +-
>>>    2 files changed, 11 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/drivers/iommu/io-pgtable-arm-v7s.c b/drivers/iommu/io-pgtable-arm-v7s.c
>>> index 8362fdf76657..306bae2755ed 100644
>>> --- a/drivers/iommu/io-pgtable-arm-v7s.c
>>> +++ b/drivers/iommu/io-pgtable-arm-v7s.c
>>> @@ -50,10 +50,17 @@
>>>     */
>>>    #define ARM_V7S_ADDR_BITS		32
>>>    #define _ARM_V7S_LVL_BITS(lvl)		(16 - (lvl) * 4)
>>> +/* MediaTek: totally 34bits, 14bits at lvl1 and 8bits at lvl2. */
>>> +#define _ARM_V7S_LVL_BITS_MTK(lvl)	(20 - (lvl) * 6)
>>
>> This should defined in terms of both lvl and cfg->ias. The formula here
>> is nothing more than a disgusting trick I made up since a linear
>> interpolation happened to fit the required numbers. That said, all of
>> these bits pretending that short-descriptor is a well-defined recursive
>> format only served to allow the rest of the code to look more like the
>> LPAE code - IIRC they've already diverged a fair bit since then, so
>> frankly a lot of this could stand to be unpicked and made considerably
>> clearer by simply accepting that level 1 and level 2 are different from
>> each other.
> 
> If the formula is not good and make it clearer, How about this?
> 
> 
> /*
>   * We have 32 bits total; 12 bits resolved at level 1, 8 bits at level
> 2,
> -* and 12 bits in a page. With some carefully-chosen coefficients we can
> -* hide the ugly inconsistencies behind these macros and at least let
> the
> -* rest of the code pretend to be somewhat sane.
> +* and 12 bits in a page.
> +*
> +* MediaTek extend 2 bits to reach 34 bits, 14 bits at lvl1 and 8 bits
> at lvl2.
>   */
> 
> -#define _ARM_V7S_LVL_BITS(lvl)		(16 - (lvl) * 4)
> +#define _ARM_V7S_LVL1_BITS_NR(cfg)     (((cfg)->ias == 32) ? 12 : 14)
> +#define _ARM_V7S_LVL2_BITS_NR		8
> +
> +#define _ARM_V7S_LVL_BITS(lvl, cfg)    \
> +      (((lvl) == 1) ? _ARM_V7S_LVL1_BITS_NR(cfg):_ARM_V7S_LVL2_BITS_NR)

Well, I'd have gone for something really simple and clear like:

#define ARM_V7S_LVL_BITS(lvl, cfg) ((lvl) == 1 ? (cfg)->ias - 20 : 8)
#define ARM_V7S_LVL_SHIFT(lvl)     ((lvl) == 1 ? 20 : 12)

Then maybe see if enough of the users could resolve lvl significantly 
earlier to make it worth splitting things up further.

Robin.

>>>    #define ARM_V7S_LVL_SHIFT(lvl)		(ARM_V7S_ADDR_BITS - (4 + 8 * (lvl)))
>>>    #define ARM_V7S_TABLE_SHIFT		10
>>>    
>>> -#define ARM_V7S_PTES_PER_LVL(lvl, cfg)	(1 << _ARM_V7S_LVL_BITS(lvl))
>>> +#define ARM_V7S_PTES_PER_LVL(lvl, cfg)	({				\
>>> +	int _lvl = lvl;							\
>>> +	!arm_v7s_is_mtk_enabled(cfg) ?					\
>>> +	 (1 << _ARM_V7S_LVL_BITS(_lvl)) : (1 << _ARM_V7S_LVL_BITS_MTK(_lvl));\
>>> +})
>>> +
>>>    #define ARM_V7S_TABLE_SIZE(lvl, cfg)					\
>>>    	(ARM_V7S_PTES_PER_LVL(lvl, cfg) * sizeof(arm_v7s_iopte))
>>>    
>>> @@ -63,7 +70,7 @@
>>>    #define _ARM_V7S_IDX_MASK(lvl, cfg)	(ARM_V7S_PTES_PER_LVL(lvl, cfg) - 1)
>>>    #define ARM_V7S_LVL_IDX(addr, lvl, cfg)	({			\
>>>    	int _l = lvl;							\
>>> -	((u32)(addr) >> ARM_V7S_LVL_SHIFT(_l)) & _ARM_V7S_IDX_MASK(_l, cfg); \
>>> +	((addr) >> ARM_V7S_LVL_SHIFT(_l)) & _ARM_V7S_IDX_MASK(_l, cfg); \
>>>    })
>>>    
>>>    /*
>>> @@ -755,7 +762,7 @@ static struct io_pgtable *arm_v7s_alloc_pgtable(struct io_pgtable_cfg *cfg,
>>>    {
>>>    	struct arm_v7s_io_pgtable *data;
>>>    
>>> -	if (cfg->ias > ARM_V7S_ADDR_BITS)
>>> +	if (cfg->ias > (arm_v7s_is_mtk_enabled(cfg) ? 34 : ARM_V7S_ADDR_BITS))
>>>    		return NULL;
>>>    
>>>    	if (cfg->oas > (arm_v7s_is_mtk_enabled(cfg) ? 35 : ARM_V7S_ADDR_BITS))
>>> diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
>>> index f6a2e3eb59d2..6e85c9976a33 100644
>>> --- a/drivers/iommu/mtk_iommu.c
>>> +++ b/drivers/iommu/mtk_iommu.c
>>> @@ -316,7 +316,7 @@ static int mtk_iommu_domain_finalise(struct mtk_iommu_domain *dom)
>>>    			IO_PGTABLE_QUIRK_TLBI_ON_MAP |
>>>    			IO_PGTABLE_QUIRK_ARM_MTK_EXT,
>>>    		.pgsize_bitmap = mtk_iommu_ops.pgsize_bitmap,
>>> -		.ias = 32,
>>> +		.ias = 34,
>>>    		.oas = 35,
>>>    		.tlb = &mtk_iommu_flush_ops,
>>>    		.iommu_dev = data->dev,
>>>
>
Krzysztof Kozlowski Oct. 26, 2020, 8:08 p.m. UTC | #12
On Wed, Sep 30, 2020 at 03:06:23PM +0800, Yong Wu wrote:
> This patch mainly adds support for mt8192 IOMMU and SMI.
> 
> mt8192 also is MTK IOMMU gen2 which uses ARM Short-Descriptor translation
> table format. The M4U-SMI HW diagram is as below:
> 
>                           EMI
>                            |
>                           M4U
>                            |
>                       ------------
>                        SMI Common
>                       ------------
>                            |
>   +-------+------+------+----------------------+-------+
>   |       |      |      |       ......         |       |
>   |       |      |      |                      |       |
> larb0   larb1  larb2  larb4     ......      larb19   larb20
> disp0   disp1   mdp    vdec                   IPE      IPE
> 
> All the connections are HW fixed, SW can NOT adjust it.
> 
> Comparing with the preview SoC, this patchset mainly adds two new functions:
> a) add iova 34 bits support.
> b) add multi domains support since several HW has the special iova
> region requirement.
> 
> this patchset depend on v5.9-rc1.

Hi,

I think there will be v4 of this, right? If yes, please also describe
the dependencies between the patches. If the entire patchset is strictly
ordered, then mention this as well.

Best regards,
Krzysztof