Message ID | 20230104110013.24738-1-marcan@marcan.st |
---|---|
Series | iommu: dart: Apple t8110 DART support |
On Wed, Jan 4, 2023, at 12:00, Hector Martin wrote:
> T8110 only has one TTBR per stream, so un-hardcode that.
>
> Signed-off-by: Hector Martin <marcan@marcan.st>
> ---
>  drivers/iommu/apple-dart.c | 26 ++++++++++++++++++--------
>  1 file changed, 18 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/iommu/apple-dart.c b/drivers/iommu/apple-dart.c
> index 48743bcd5b9d..189487c1d978 100644
> --- a/drivers/iommu/apple-dart.c
> +++ b/drivers/iommu/apple-dart.c
> @@ -77,15 +77,21 @@
>  #define DART_TCR_BYPASS0_ENABLE BIT(8)
>  #define DART_TCR_BYPASS1_ENABLE BIT(12)
>
> -#define DART_TTBR(sid, idx) (0x200 + 16 * (sid) + 4 * (idx))
>  #define DART_TTBR_VALID BIT(31)
>  #define DART_TTBR_SHIFT 12
>
> +#define DART_TTBR(dart, sid, idx) (0x200 + \
> +				(((dart)->hw->ttbr_count * (sid)) << 2) + \
> +				((idx) << 2))
> +
> +
>  struct apple_dart_hw {
>  	u32 oas;
>  	enum io_pgtable_fmt fmt;
>
>  	int max_sid_count;
> +
> +	int ttbr_count;
>  };
>
>  /*
> @@ -245,7 +251,7 @@ static void apple_dart_hw_set_ttbr(struct
> apple_dart_stream_map *stream_map,
>  	WARN_ON(paddr & ((1 << DART_TTBR_SHIFT) - 1));
>  	for_each_set_bit(sid, stream_map->sidmap, dart->num_streams)
>  		writel(DART_TTBR_VALID | (paddr >> DART_TTBR_SHIFT),
> -		       dart->regs + DART_TTBR(sid, idx));
> +		       dart->regs + DART_TTBR(dart, sid, idx));
>  }
>
>  static void apple_dart_hw_clear_ttbr(struct apple_dart_stream_map
> *stream_map,
> @@ -255,7 +261,7 @@ static void apple_dart_hw_clear_ttbr(struct
> apple_dart_stream_map *stream_map,
>  	int sid;
>
>  	for_each_set_bit(sid, stream_map->sidmap, dart->num_streams)
> -		writel(0, dart->regs + DART_TTBR(sid, idx));
> +		writel(0, dart->regs + DART_TTBR(dart, sid, idx));
>  }
>
>  static void
> @@ -263,7 +269,7 @@ apple_dart_hw_clear_all_ttbrs(struct
> apple_dart_stream_map *stream_map)
>  {
>  	int i;
>
> -	for (i = 0; i < DART_MAX_TTBR; ++i)
> +	for (i = 0; i < stream_map->dart->hw->ttbr_count; ++i)
>  		apple_dart_hw_clear_ttbr(stream_map, i);
>  }
>
> @@ -415,7 +421,7 @@ apple_dart_setup_translation(struct
> apple_dart_domain *domain,
>  	for (i = 0; i < pgtbl_cfg->apple_dart_cfg.n_ttbrs; ++i)
>  		apple_dart_hw_set_ttbr(stream_map, i,
>  				       pgtbl_cfg->apple_dart_cfg.ttbr[i]);
> -	for (; i < DART_MAX_TTBR; ++i)
> +	for (; i < stream_map->dart->hw->ttbr_count; ++i)
>  		apple_dart_hw_clear_ttbr(stream_map, i);
>
>  	apple_dart_hw_enable_translation(stream_map);
> @@ -956,11 +962,15 @@ static const struct apple_dart_hw apple_dart_hw_t8103 = {
>  	.oas = 36,
>  	.fmt = APPLE_DART,
>  	.max_sid_count = 16,
> +
> +	.ttbr_count = 4,
>  };
>  static const struct apple_dart_hw apple_dart_hw_t6000 = {
>  	.oas = 42,
>  	.fmt = APPLE_DART2,
>  	.max_sid_count = 16,
> +
> +	.ttbr_count = 4,
>  };
>
>  static __maybe_unused int apple_dart_suspend(struct device *dev)
> @@ -970,9 +980,9 @@ static __maybe_unused int apple_dart_suspend(struct
> device *dev)
>
>  	for (sid = 0; sid < dart->num_streams; sid++) {
>  		dart->save_tcr[sid] = readl_relaxed(dart->regs + DART_TCR(sid));
> -		for (idx = 0; idx < DART_MAX_TTBR; idx++)
> +		for (idx = 0; idx < dart->hw->ttbr_count; idx++)
>  			dart->save_ttbr[sid][idx] =
> -				readl(dart->regs + DART_TTBR(sid, idx));
> +				readl(dart->regs + DART_TTBR(dart, sid, idx));
>  	}
>
>  	return 0;
> @@ -993,7 +1003,7 @@ static __maybe_unused int apple_dart_resume(struct
> device *dev)
>  	for (sid = 0; sid < dart->num_streams; sid++) {
>  		for (idx = 0; idx < DART_MAX_TTBR; idx++)

s/DART_MAX_TTBR/dart->hw->ttbr_count/ I think.

With that fixed:

Reviewed-by: Sven Peter <sven@svenpeter.dev>

Sven

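The offset arithmetic of the new DART_TTBR(dart, sid, idx) macro quoted above can be checked in isolation. The following is only a userspace sketch, not driver code: struct ttbr_layout and the function names are hypothetical stand-ins, and the single-TTBR layout reuses the 0x200 base purely for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the fields of struct apple_dart_hw used here. */
struct ttbr_layout {
	uint32_t ttbr_base;	/* 0x200 in the quoted patch */
	uint32_t ttbr_count;	/* TTBRs per stream: 4 today, 1 per the commit message for T8110 */
};

/* The macro removed by the patch: fixed stride of four TTBRs per stream. */
static inline uint32_t ttbr_offset_old(uint32_t sid, uint32_t idx)
{
	return 0x200 + 16 * sid + 4 * idx;
}

/* Same arithmetic as the new DART_TTBR(dart, sid, idx) macro. */
static inline uint32_t ttbr_offset(const struct ttbr_layout *hw,
				   uint32_t sid, uint32_t idx)
{
	return hw->ttbr_base + ((hw->ttbr_count * sid) << 2) + (idx << 2);
}

/* Sanity checks on the two layouts discussed in the patch. */
static inline void ttbr_offset_selfcheck(void)
{
	const struct ttbr_layout four = { .ttbr_base = 0x200, .ttbr_count = 4 };
	const struct ttbr_layout one = { .ttbr_base = 0x200, .ttbr_count = 1 };

	/* Old and new forms agree when there are four TTBRs per stream. */
	assert(ttbr_offset(&four, 3, 2) == ttbr_offset_old(3, 2));
	/* With one TTBR per stream, idx 1 of stream 0 aliases idx 0 of stream 1. */
	assert(ttbr_offset(&one, 0, 1) == ttbr_offset(&one, 1, 0));
}
```

The second check is presumably why the resume loop needs the same dart->hw->ttbr_count bound as the suspend loop: iterating up to DART_MAX_TTBR on a single-TTBR DART would touch the neighbouring stream's register.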
On Wed, Jan 4, 2023, at 12:00, Hector Martin wrote:
> They didn't have the PARAMS reg index in them, but they should.
>
> Signed-off-by: Hector Martin <marcan@marcan.st>
> ---

Reviewed-by: Sven Peter <sven@svenpeter.dev>

On Wed, Jan 4, 2023, at 12:00, Hector Martin wrote:
> T8110 DARTs have up to 256 SIDs, so we need to switch to a bitmap to
> handle them properly.
>
> Signed-off-by: Hector Martin <marcan@marcan.st>
> ---
>  drivers/iommu/apple-dart.c | 114 +++++++++++++++++++++++--------------
>  1 file changed, 71 insertions(+), 43 deletions(-)
>
> diff --git a/drivers/iommu/apple-dart.c b/drivers/iommu/apple-dart.c
> index 2458416122f8..48743bcd5b9d 100644
> --- a/drivers/iommu/apple-dart.c
> +++ b/drivers/iommu/apple-dart.c
> @@ -34,11 +34,10 @@
>
>  #include "dma-iommu.h"
>
> -#define DART_MAX_STREAMS 16
> +#define DART_MAX_STREAMS 256

Feels a bit wasteful to allocate 256-wide sid2group and save_{tcr,ttbr} arrays
even for the M1 where 16 are enough. But then again, that's still <100 KiB for
all DARTs combined and these machines have >8 GiB of RAM, so it probably won't
make a difference.

>  #define DART_MAX_TTBR 4
>  #define MAX_DARTS_PER_DEVICE 2
>
> -#define DART_STREAM_ALL 0xffff
>
>  #define DART_PARAMS1 0x00
>  #define DART_PARAMS_PAGE_SHIFT GENMASK(27, 24)
> @@ -85,6 +84,8 @@
>  struct apple_dart_hw {
>  	u32 oas;
>  	enum io_pgtable_fmt fmt;
> +
> +	int max_sid_count;
>  };
>
>  /*
> @@ -116,6 +117,7 @@ struct apple_dart {
>  	spinlock_t lock;
>
>  	u32 pgsize;
> +	u32 num_streams;
>  	u32 supports_bypass : 1;
>  	u32 force_bypass : 1;
>
> @@ -143,11 +145,11 @@ struct apple_dart {
>   */
>  struct apple_dart_stream_map {
>  	struct apple_dart *dart;
> -	unsigned long sidmap;
> +	DECLARE_BITMAP(sidmap, DART_MAX_STREAMS);
>  };
>  struct apple_dart_atomic_stream_map {
>  	struct apple_dart *dart;
> -	atomic64_t sidmap;
> +	atomic_long_t sidmap[BITS_TO_LONGS(DART_MAX_STREAMS)];
>  };
>
>  /*
> @@ -205,50 +207,55 @@ static struct apple_dart_domain
> *to_dart_domain(struct iommu_domain *dom)
>  static void
>  apple_dart_hw_enable_translation(struct apple_dart_stream_map
> *stream_map)
>  {
> +	struct apple_dart *dart = stream_map->dart;
>  	int sid;
>
> -	for_each_set_bit(sid, &stream_map->sidmap, DART_MAX_STREAMS)
> +	for_each_set_bit(sid, stream_map->sidmap, dart->num_streams)
>  		writel(DART_TCR_TRANSLATE_ENABLE,
> -		       stream_map->dart->regs + DART_TCR(sid));
> +		       dart->regs + DART_TCR(sid));
>  }
>
>  static void apple_dart_hw_disable_dma(struct apple_dart_stream_map *stream_map)
>  {
> +	struct apple_dart *dart = stream_map->dart;
>  	int sid;
>
> -	for_each_set_bit(sid, &stream_map->sidmap, DART_MAX_STREAMS)
> -		writel(0, stream_map->dart->regs + DART_TCR(sid));
> +	for_each_set_bit(sid, stream_map->sidmap, dart->num_streams)
> +		writel(0, dart->regs + DART_TCR(sid));
>  }
>
>  static void
>  apple_dart_hw_enable_bypass(struct apple_dart_stream_map *stream_map)
>  {
> +	struct apple_dart *dart = stream_map->dart;
>  	int sid;
>
>  	WARN_ON(!stream_map->dart->supports_bypass);
> -	for_each_set_bit(sid, &stream_map->sidmap, DART_MAX_STREAMS)
> +	for_each_set_bit(sid, stream_map->sidmap, dart->num_streams)
>  		writel(DART_TCR_BYPASS0_ENABLE | DART_TCR_BYPASS1_ENABLE,
> -		       stream_map->dart->regs + DART_TCR(sid));
> +		       dart->regs + DART_TCR(sid));
>  }
>
>  static void apple_dart_hw_set_ttbr(struct apple_dart_stream_map *stream_map,
>  				   u8 idx, phys_addr_t paddr)
>  {
> +	struct apple_dart *dart = stream_map->dart;
>  	int sid;
>
>  	WARN_ON(paddr & ((1 << DART_TTBR_SHIFT) - 1));
> -	for_each_set_bit(sid, &stream_map->sidmap, DART_MAX_STREAMS)
> +	for_each_set_bit(sid, stream_map->sidmap, dart->num_streams)
>  		writel(DART_TTBR_VALID | (paddr >> DART_TTBR_SHIFT),
> -		       stream_map->dart->regs + DART_TTBR(sid, idx));
> +		       dart->regs + DART_TTBR(sid, idx));
>  }
>
>  static void apple_dart_hw_clear_ttbr(struct apple_dart_stream_map *stream_map,
>  				     u8 idx)
>  {
> +	struct apple_dart *dart = stream_map->dart;
>  	int sid;
>
> -	for_each_set_bit(sid, &stream_map->sidmap, DART_MAX_STREAMS)
> -		writel(0, stream_map->dart->regs + DART_TTBR(sid, idx));
> +	for_each_set_bit(sid, stream_map->sidmap, dart->num_streams)
> +		writel(0, dart->regs + DART_TTBR(sid, idx));
>  }
>
>  static void
> @@ -270,7 +277,7 @@ apple_dart_hw_stream_command(struct
> apple_dart_stream_map *stream_map,
>
>  	spin_lock_irqsave(&stream_map->dart->lock, flags);
>
> -	writel(stream_map->sidmap, stream_map->dart->regs + DART_STREAM_SELECT);
> +	writel(stream_map->sidmap[0], stream_map->dart->regs + DART_STREAM_SELECT);
>  	writel(command, stream_map->dart->regs + DART_STREAM_COMMAND);
>
>  	ret = readl_poll_timeout_atomic(
> @@ -283,7 +290,7 @@ apple_dart_hw_stream_command(struct
> apple_dart_stream_map *stream_map,
>  	if (ret) {
>  		dev_err(stream_map->dart->dev,
>  			"busy bit did not clear after command %x for streams %lx\n",
> -			command, stream_map->sidmap);
> +			command, stream_map->sidmap[0]);
>  		return ret;
>  	}
>
> @@ -301,6 +308,7 @@ static int apple_dart_hw_reset(struct apple_dart *dart)
>  {
>  	u32 config;
>  	struct apple_dart_stream_map stream_map;
> +	int i;
>
>  	config = readl(dart->regs + DART_CONFIG);
>  	if (config & DART_CONFIG_LOCK) {
> @@ -310,12 +318,14 @@ static int apple_dart_hw_reset(struct apple_dart *dart)
>  	}
>
>  	stream_map.dart = dart;
> -	stream_map.sidmap = DART_STREAM_ALL;
> +	bitmap_zero(stream_map.sidmap, DART_MAX_STREAMS);
> +	bitmap_set(stream_map.sidmap, 0, dart->num_streams);
>  	apple_dart_hw_disable_dma(&stream_map);
>  	apple_dart_hw_clear_all_ttbrs(&stream_map);
>
>  	/* enable all streams globally since TCR is used to control isolation */
> -	writel(DART_STREAM_ALL, dart->regs + DART_STREAMS_ENABLE);
> +	for (i = 0; i < BITS_TO_U32(dart->num_streams); i++)
> +		writel(U32_MAX, dart->regs + DART_STREAMS_ENABLE);

This seems weird: this code writes U32_MAX to the same register again and
again.

>
>  	/* clear any pending errors before the interrupt is unmasked */
>  	writel(readl(dart->regs + DART_ERROR), dart->regs + DART_ERROR);
> @@ -325,13 +335,16 @@ static int apple_dart_hw_reset(struct apple_dart *dart)
>
>  static void apple_dart_domain_flush_tlb(struct apple_dart_domain *domain)
>  {
> -	int i;
> +	int i, j;
>  	struct apple_dart_atomic_stream_map *domain_stream_map;
>  	struct apple_dart_stream_map stream_map;
>
>  	for_each_stream_map(i, domain, domain_stream_map) {
>  		stream_map.dart = domain_stream_map->dart;
> -		stream_map.sidmap = atomic64_read(&domain_stream_map->sidmap);
> +
> +		for (j = 0; j < BITS_TO_LONGS(stream_map.dart->num_streams); j++)
> +			stream_map.sidmap[j] =
> atomic_long_read(&domain_stream_map->sidmap[j]);
> +
>  		apple_dart_hw_invalidate_tlb(&stream_map);
>  	}
>  }
> @@ -416,7 +429,7 @@ static int apple_dart_finalize_domain(struct
> iommu_domain *domain,
>  	struct apple_dart *dart = cfg->stream_maps[0].dart;
>  	struct io_pgtable_cfg pgtbl_cfg;
>  	int ret = 0;
> -	int i;
> +	int i, j;
>
>  	mutex_lock(&dart_domain->init_lock);
>
> @@ -425,8 +438,9 @@ static int apple_dart_finalize_domain(struct
> iommu_domain *domain,
>
>  	for (i = 0; i < MAX_DARTS_PER_DEVICE; ++i) {
>  		dart_domain->stream_maps[i].dart = cfg->stream_maps[i].dart;
> -		atomic64_set(&dart_domain->stream_maps[i].sidmap,
> -			     cfg->stream_maps[i].sidmap);
> +		for (j = 0; j < BITS_TO_LONGS(dart->num_streams); j++)
> +			atomic_long_set(&dart_domain->stream_maps[i].sidmap[j],
> +					cfg->stream_maps[i].sidmap[j]);
>  	}
>
>  	pgtbl_cfg = (struct io_pgtable_cfg){
> @@ -461,7 +475,7 @@ apple_dart_mod_streams(struct
> apple_dart_atomic_stream_map *domain_maps,
>  		       struct apple_dart_stream_map *master_maps,
>  		       bool add_streams)
>  {
> -	int i;
> +	int i, j;
>
>  	for (i = 0; i < MAX_DARTS_PER_DEVICE; ++i) {
>  		if (domain_maps[i].dart != master_maps[i].dart)
> @@ -471,12 +485,14 @@ apple_dart_mod_streams(struct
> apple_dart_atomic_stream_map *domain_maps,
>  	for (i = 0; i < MAX_DARTS_PER_DEVICE; ++i) {
>  		if (!domain_maps[i].dart)
>  			break;
> -		if (add_streams)
> -			atomic64_or(master_maps[i].sidmap,
> -				    &domain_maps[i].sidmap);
> -		else
> -			atomic64_and(~master_maps[i].sidmap,
> -				     &domain_maps[i].sidmap);
> +		for (j = 0; j < BITS_TO_LONGS(domain_maps[i].dart->num_streams);
> j++) {
> +			if (add_streams)
> +				atomic_long_or(master_maps[i].sidmap[j],
> +					       &domain_maps[i].sidmap[j]);
> +			else
> +				atomic_long_and(~master_maps[i].sidmap[j],
> +						&domain_maps[i].sidmap[j]);
> +		}
>  	}
>
>  	return 0;
> @@ -640,14 +656,14 @@ static int apple_dart_of_xlate(struct device
> *dev, struct of_phandle_args *args)
>
>  	for (i = 0; i < MAX_DARTS_PER_DEVICE; ++i) {
>  		if (cfg->stream_maps[i].dart == dart) {
> -			cfg->stream_maps[i].sidmap |= 1 << sid;
> +			set_bit(sid, cfg->stream_maps[i].sidmap);
>  			return 0;
>  		}
>  	}
>  	for (i = 0; i < MAX_DARTS_PER_DEVICE; ++i) {
>  		if (!cfg->stream_maps[i].dart) {
>  			cfg->stream_maps[i].dart = dart;
> -			cfg->stream_maps[i].sidmap = 1 << sid;
> +			set_bit(sid, cfg->stream_maps[i].sidmap);
>  			return 0;
>  		}
>  	}
> @@ -666,7 +682,7 @@ static void apple_dart_release_group(void *iommu_data)
>  	mutex_lock(&apple_dart_groups_lock);
>
>  	for_each_stream_map(i, group_master_cfg, stream_map)
> -		for_each_set_bit(sid, &stream_map->sidmap, DART_MAX_STREAMS)
> +		for_each_set_bit(sid, stream_map->sidmap, stream_map->dart->num_streams)
>  			stream_map->dart->sid2group[sid] = NULL;
>
>  	kfree(iommu_data);
> @@ -685,7 +701,7 @@ static struct iommu_group
> *apple_dart_device_group(struct device *dev)
>  	mutex_lock(&apple_dart_groups_lock);
>
>  	for_each_stream_map(i, cfg, stream_map) {
> -		for_each_set_bit(sid, &stream_map->sidmap, DART_MAX_STREAMS) {
> +		for_each_set_bit(sid, stream_map->sidmap, stream_map->dart->num_streams) {
>  			struct iommu_group *stream_group =
>  				stream_map->dart->sid2group[sid];
>
> @@ -724,7 +740,7 @@ static struct iommu_group
> *apple_dart_device_group(struct device *dev)
>  					 apple_dart_release_group);
>
>  	for_each_stream_map(i, cfg, stream_map)
> -		for_each_set_bit(sid, &stream_map->sidmap, DART_MAX_STREAMS)
> +		for_each_set_bit(sid, stream_map->sidmap, stream_map->dart->num_streams)
>  			stream_map->dart->sid2group[sid] = group;
>
>  	res = group;
> @@ -869,16 +885,26 @@ static int apple_dart_probe(struct platform_device *pdev)
>  	if (ret)
>  		return ret;
>
> -	ret = apple_dart_hw_reset(dart);
> -	if (ret)
> -		goto err_clk_disable;
> -
>  	dart_params[0] = readl(dart->regs + DART_PARAMS1);
>  	dart_params[1] = readl(dart->regs + DART_PARAMS2);
>  	dart->pgsize = 1 << FIELD_GET(DART_PARAMS_PAGE_SHIFT, dart_params[0]);
>  	dart->supports_bypass = dart_params[1] & DART_PARAMS_BYPASS_SUPPORT;
> +
> +	dart->num_streams = dart->hw->max_sid_count;
> +
> +	if (dart->num_streams > DART_MAX_STREAMS) {
> +		dev_err(&pdev->dev, "Too many streams (%d > %d)\n",
> +			dart->num_streams, DART_MAX_STREAMS);
> +		ret = -EINVAL;
> +		goto err_clk_disable;
> +	}
> +
>  	dart->force_bypass = dart->pgsize > PAGE_SIZE;
>
> +	ret = apple_dart_hw_reset(dart);
> +	if (ret)
> +		goto err_clk_disable;
> +
>  	ret = request_irq(dart->irq, apple_dart_irq, IRQF_SHARED,
>  			  "apple-dart fault handler", dart);
>  	if (ret)
> @@ -897,8 +923,8 @@ static int apple_dart_probe(struct platform_device *pdev)
>
>  	dev_info(
>  		&pdev->dev,
> -		"DART [pagesize %x, bypass support: %d, bypass forced: %d]
> initialized\n",
> -		dart->pgsize, dart->supports_bypass, dart->force_bypass);
> +		"DART [pagesize %x, %d streams, bypass support: %d, bypass forced:
> %d] initialized\n",
> +		dart->pgsize, dart->num_streams, dart->supports_bypass,
> dart->force_bypass);
>  	return 0;
>
>  err_sysfs_remove:
> @@ -929,10 +955,12 @@ static int apple_dart_remove(struct platform_device *pdev)
>  static const struct apple_dart_hw apple_dart_hw_t8103 = {
>  	.oas = 36,
>  	.fmt = APPLE_DART,
> +	.max_sid_count = 16,
>  };
>  static const struct apple_dart_hw apple_dart_hw_t6000 = {
>  	.oas = 42,
>  	.fmt = APPLE_DART2,
> +	.max_sid_count = 16,
>  };
>
>  static __maybe_unused int apple_dart_suspend(struct device *dev)
> @@ -940,7 +968,7 @@ static __maybe_unused int apple_dart_suspend(struct
> device *dev)
>  	struct apple_dart *dart = dev_get_drvdata(dev);
>  	unsigned int sid, idx;
>
> -	for (sid = 0; sid < DART_MAX_STREAMS; sid++) {
> +	for (sid = 0; sid < dart->num_streams; sid++) {
>  		dart->save_tcr[sid] = readl_relaxed(dart->regs + DART_TCR(sid));
>  		for (idx = 0; idx < DART_MAX_TTBR; idx++)
>  			dart->save_ttbr[sid][idx] =
> @@ -962,7 +990,7 @@ static __maybe_unused int apple_dart_resume(struct
> device *dev)
>  		return ret;
>  	}
>
> -	for (sid = 0; sid < DART_MAX_STREAMS; sid++) {
> +	for (sid = 0; sid < dart->num_streams; sid++) {
>  		for (idx = 0; idx < DART_MAX_TTBR; idx++)
>  			writel(dart->save_ttbr[sid][idx],
>  			       dart->regs + DART_TTBR(sid, idx));
> --
> 2.35.1

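The review comment about the stream-enable loop in apple_dart_hw_reset() can be illustrated with a small userspace model. This is a sketch under a loud assumption: that the enable registers form a contiguous array of 32-bit words, one bit per stream, which this patch does not state. struct enable_bank and both function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_MAX_STREAMS	256
#define SKETCH_BITS_TO_U32(n)	(((n) + 31) / 32)

/* Hypothetical model of a bank of 32-bit stream-enable registers. */
struct enable_bank {
	uint32_t word[SKETCH_BITS_TO_U32(SKETCH_MAX_STREAMS)];
};

/* Loop shape as quoted in the patch: every iteration hits the same word. */
static inline void enable_all_as_quoted(struct enable_bank *bank,
					unsigned int num_streams)
{
	assert(num_streams <= SKETCH_MAX_STREAMS);
	for (unsigned int i = 0; i < SKETCH_BITS_TO_U32(num_streams); i++)
		bank->word[0] = UINT32_MAX;
}

/* One possible fix: advance by one 32-bit word per iteration. */
static inline void enable_all_indexed(struct enable_bank *bank,
				      unsigned int num_streams)
{
	assert(num_streams <= SKETCH_MAX_STREAMS);
	for (unsigned int i = 0; i < SKETCH_BITS_TO_U32(num_streams); i++)
		bank->word[i] = UINT32_MAX;
}
```

With 16 streams both variants behave identically, since only one word is written, so the difference would only show up on hardware with more than 32 streams. In the driver, the indexed form would correspond to adding 4 * i to the register offset inside the writel() call, again assuming that register layout.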
On Wed, Jan 4, 2023, at 12:00, Hector Martin wrote:
> T8110 has a new register layout. To accommodate this, first move all the
> register offsets to the hw structure, and rename all the existing
> registers to DART_T8020_*.
>
> Signed-off-by: Hector Martin <marcan@marcan.st>
> ---

Reviewed-by: Sven Peter <sven@svenpeter.dev>

>  drivers/iommu/apple-dart.c | 188 ++++++++++++++++++++++++-------------
>  1 file changed, 125 insertions(+), 63 deletions(-)
>
> diff --git a/drivers/iommu/apple-dart.c b/drivers/iommu/apple-dart.c
> index 03a3cb5638ba..396da83f2f9e 100644
> --- a/drivers/iommu/apple-dart.c
> +++ b/drivers/iommu/apple-dart.c
> @@ -38,6 +38,7 @@
>  #define DART_MAX_TTBR 4
>  #define MAX_DARTS_PER_DEVICE 2
>
> +/* Common registers */
>
>  #define DART_PARAMS1 0x00
>  #define DART_PARAMS1_PAGE_SHIFT GENMASK(27, 24)
> @@ -45,52 +46,79 @@
>  #define DART_PARAMS2 0x04
>  #define DART_PARAMS2_BYPASS_SUPPORT BIT(0)
>
> -#define DART_STREAM_COMMAND 0x20
> -#define DART_STREAM_COMMAND_BUSY BIT(2)
> -#define DART_STREAM_COMMAND_INVALIDATE BIT(20)
> +/* T8020/T6000 registers */
>
> -#define DART_STREAM_SELECT 0x34
> +#define DART_T8020_STREAM_COMMAND 0x20
> +#define DART_T8020_STREAM_COMMAND_BUSY BIT(2)
> +#define DART_T8020_STREAM_COMMAND_INVALIDATE BIT(20)
>
> -#define DART_ERROR 0x40
> -#define DART_ERROR_STREAM GENMASK(27, 24)
> -#define DART_ERROR_CODE GENMASK(11, 0)
> -#define DART_ERROR_FLAG BIT(31)
> +#define DART_T8020_STREAM_SELECT 0x34
>
> -#define DART_ERROR_READ_FAULT BIT(4)
> -#define DART_ERROR_WRITE_FAULT BIT(3)
> -#define DART_ERROR_NO_PTE BIT(2)
> -#define DART_ERROR_NO_PMD BIT(1)
> -#define DART_ERROR_NO_TTBR BIT(0)
> +#define DART_T8020_ERROR 0x40
> +#define DART_T8020_ERROR_STREAM GENMASK(27, 24)
> +#define DART_T8020_ERROR_CODE GENMASK(11, 0)
> +#define DART_T8020_ERROR_FLAG BIT(31)
>
> -#define DART_CONFIG 0x60
> -#define DART_CONFIG_LOCK BIT(15)
> +#define DART_T8020_ERROR_READ_FAULT BIT(4)
> +#define DART_T8020_ERROR_WRITE_FAULT BIT(3)
> +#define DART_T8020_ERROR_NO_PTE BIT(2)
> +#define DART_T8020_ERROR_NO_PMD BIT(1)
> +#define DART_T8020_ERROR_NO_TTBR BIT(0)
> +
> +#define DART_T8020_CONFIG 0x60
> +#define DART_T8020_CONFIG_LOCK BIT(15)
>
>  #define DART_STREAM_COMMAND_BUSY_TIMEOUT 100
>
> -#define DART_ERROR_ADDR_HI 0x54
> -#define DART_ERROR_ADDR_LO 0x50
> +#define DART_T8020_ERROR_ADDR_HI 0x54
> +#define DART_T8020_ERROR_ADDR_LO 0x50
> +
> +#define DART_T8020_STREAMS_ENABLE 0xfc
>
> -#define DART_STREAMS_ENABLE 0xfc
> +#define DART_T8020_TCR 0x100
> +#define DART_T8020_TCR_TRANSLATE_ENABLE BIT(7)
> +#define DART_T8020_TCR_BYPASS_DART BIT(8)
> +#define DART_T8020_TCR_BYPASS_DAPF BIT(12)
>
> -#define DART_TCR(sid) (0x100 + 4 * (sid))
> -#define DART_TCR_TRANSLATE_ENABLE BIT(7)
> -#define DART_TCR_BYPASS0_ENABLE BIT(8)
> -#define DART_TCR_BYPASS1_ENABLE BIT(12)
> +#define DART_T8020_TTBR 0x200
> +#define DART_T8020_TTBR_VALID BIT(31)
> +#define DART_T8020_TTBR_ADDR_OFF 0
> +#define DART_T8020_TTBR_SHIFT 12
>
> -#define DART_TTBR_VALID BIT(31)
> -#define DART_TTBR_SHIFT 12
> +#define DART_TCR(dart, sid) ((dart)->hw->tcr + ((sid) << 2))
>
> -#define DART_TTBR(dart, sid, idx) (0x200 + \
> +#define DART_TTBR(dart, sid, idx) ((dart)->hw->ttbr + \
>  				(((dart)->hw->ttbr_count * (sid)) << 2) + \
>  				((idx) << 2))
>
> +struct apple_dart_stream_map;
>
>  struct apple_dart_hw {
> +	irqreturn_t (*irq_handler)(int irq, void *dev);
> +	int (*invalidate_tlb)(struct apple_dart_stream_map *stream_map);
> +
>  	u32 oas;
>  	enum io_pgtable_fmt fmt;
>
>  	int max_sid_count;
>
> +	u64 lock;
> +	u64 lock_bit;
> +
> +	u64 error;
> +
> +	u64 enable_streams;
> +	u64 disable_streams;

I don't think disable_streams is used anywhere. I assume you just left it in
here to document it?

> +
> +	u64 tcr;
> +	u64 tcr_enabled;
> +	u64 tcr_disabled;
> +	u64 tcr_bypass;
> +
> +	u64 ttbr;
> +	u64 ttbr_valid;
> +	u64 ttbr_addr_off;

This name confused me a bit since off sounds like offset to me while this is
actually another shift. Can't really think of a better name right now though.
I'd at least add a comment here to describe it.

> +	u64 ttbr_shift;
>  	int ttbr_count;
>  };
>
> @@ -217,8 +245,7 @@ apple_dart_hw_enable_translation(struct
> apple_dart_stream_map *stream_map)
>  	int sid;
>
>  	for_each_set_bit(sid, stream_map->sidmap, dart->num_streams)
> -		writel(DART_TCR_TRANSLATE_ENABLE,
> -		       dart->regs + DART_TCR(sid));
> +		writel(dart->hw->tcr_enabled, dart->regs + DART_TCR(dart, sid));
>  }
>
>  static void apple_dart_hw_disable_dma(struct apple_dart_stream_map
> *stream_map)
> @@ -227,7 +254,7 @@ static void apple_dart_hw_disable_dma(struct
> apple_dart_stream_map *stream_map)
>  	int sid;
>
>  	for_each_set_bit(sid, stream_map->sidmap, dart->num_streams)
> -		writel(0, dart->regs + DART_TCR(sid));
> +		writel(dart->hw->tcr_disabled, dart->regs + DART_TCR(dart, sid));
>  }
>
>  static void
> @@ -238,8 +265,8 @@ apple_dart_hw_enable_bypass(struct
> apple_dart_stream_map *stream_map)
>
>  	WARN_ON(!stream_map->dart->supports_bypass);
>  	for_each_set_bit(sid, stream_map->sidmap, dart->num_streams)
> -		writel(DART_TCR_BYPASS0_ENABLE | DART_TCR_BYPASS1_ENABLE,
> -		       dart->regs + DART_TCR(sid));
> +		writel(dart->hw->tcr_bypass,
> +		       dart->regs + DART_TCR(dart, sid));
>  }
>
>  static void apple_dart_hw_set_ttbr(struct apple_dart_stream_map
> *stream_map,
> @@ -248,9 +275,10 @@ static void apple_dart_hw_set_ttbr(struct
> apple_dart_stream_map *stream_map,
>  	struct apple_dart *dart = stream_map->dart;
>  	int sid;
>
> -	WARN_ON(paddr & ((1 << DART_TTBR_SHIFT) - 1));
> +	WARN_ON(paddr & ((1 << dart->hw->ttbr_shift) - 1));
>  	for_each_set_bit(sid, stream_map->sidmap, dart->num_streams)
> -		writel(DART_TTBR_VALID | (paddr >> DART_TTBR_SHIFT),
> +		writel(dart->hw->ttbr_valid |
> +		       (paddr >> dart->hw->ttbr_shift) << dart->hw->ttbr_addr_off,
>  		       dart->regs + DART_TTBR(dart, sid, idx));
>  }
>
> @@ -274,7 +302,7 @@ apple_dart_hw_clear_all_ttbrs(struct
> apple_dart_stream_map *stream_map)
>  }
>
>  static int
> -apple_dart_hw_stream_command(struct apple_dart_stream_map *stream_map,
> +apple_dart_t8020_hw_stream_command(struct apple_dart_stream_map
> *stream_map,
>  			     u32 command)
>  {
>  	unsigned long flags;
> @@ -283,12 +311,12 @@ apple_dart_hw_stream_command(struct
> apple_dart_stream_map *stream_map,
>
>  	spin_lock_irqsave(&stream_map->dart->lock, flags);
>
> -	writel(stream_map->sidmap[0], stream_map->dart->regs +
> DART_STREAM_SELECT);
> -	writel(command, stream_map->dart->regs + DART_STREAM_COMMAND);
> +	writel(stream_map->sidmap[0], stream_map->dart->regs +
> DART_T8020_STREAM_SELECT);
> +	writel(command, stream_map->dart->regs + DART_T8020_STREAM_COMMAND);
>
>  	ret = readl_poll_timeout_atomic(
> -		stream_map->dart->regs + DART_STREAM_COMMAND, command_reg,
> -		!(command_reg & DART_STREAM_COMMAND_BUSY), 1,
> +		stream_map->dart->regs + DART_T8020_STREAM_COMMAND, command_reg,
> +		!(command_reg & DART_T8020_STREAM_COMMAND_BUSY), 1,
>  		DART_STREAM_COMMAND_BUSY_TIMEOUT);
>
>  	spin_unlock_irqrestore(&stream_map->dart->lock, flags);
> @@ -304,10 +332,10 @@ apple_dart_hw_stream_command(struct
> apple_dart_stream_map *stream_map,
>  }
>
>  static int
> -apple_dart_hw_invalidate_tlb(struct apple_dart_stream_map *stream_map)
> +apple_dart_t8020_hw_invalidate_tlb(struct apple_dart_stream_map *stream_map)
>  {
> -	return apple_dart_hw_stream_command(stream_map,
> -					    DART_STREAM_COMMAND_INVALIDATE);
> +	return apple_dart_t8020_hw_stream_command(
> +		stream_map, DART_T8020_STREAM_COMMAND_INVALIDATE);
>  }
>
>  static int apple_dart_hw_reset(struct apple_dart *dart)
> @@ -316,8 +344,8 @@ static int apple_dart_hw_reset(struct apple_dart *dart)
>  	struct apple_dart_stream_map stream_map;
>  	int i;
>
> -	config = readl(dart->regs + DART_CONFIG);
> -	if (config & DART_CONFIG_LOCK) {
> +	config = readl(dart->regs + dart->hw->lock);
> +	if (config & dart->hw->lock_bit) {
>  		dev_err(dart->dev, "DART is locked down until reboot: %08x\n",
>  			config);
>  		return -EINVAL;
> @@ -331,12 +359,12 @@ static int apple_dart_hw_reset(struct apple_dart *dart)
>
>  	/* enable all streams globally since TCR is used to control isolation */
>  	for (i = 0; i < BITS_TO_U32(dart->num_streams); i++)
> -		writel(U32_MAX, dart->regs + DART_STREAMS_ENABLE);
> +		writel(U32_MAX, dart->regs + dart->hw->enable_streams);
>
>  	/* clear any pending errors before the interrupt is unmasked */
> -	writel(readl(dart->regs + DART_ERROR), dart->regs + DART_ERROR);
> +	writel(readl(dart->regs + dart->hw->error), dart->regs + dart->hw->error);
>
> -	return apple_dart_hw_invalidate_tlb(&stream_map);
> +	return dart->hw->invalidate_tlb(&stream_map);
>  }
>
>  static void apple_dart_domain_flush_tlb(struct apple_dart_domain
> *domain)
> @@ -351,7 +379,7 @@ static void apple_dart_domain_flush_tlb(struct
> apple_dart_domain *domain)
>  		for (j = 0; j < BITS_TO_LONGS(stream_map.dart->num_streams); j++)
>  			stream_map.sidmap[j] =
> atomic_long_read(&domain_stream_map->sidmap[j]);
>
> -		apple_dart_hw_invalidate_tlb(&stream_map);
> +		stream_map.dart->hw->invalidate_tlb(&stream_map);
>  	}
>  }
>
> @@ -425,7 +453,7 @@ apple_dart_setup_translation(struct
> apple_dart_domain *domain,
>  		apple_dart_hw_clear_ttbr(stream_map, i);
>
>  	apple_dart_hw_enable_translation(stream_map);
> -	apple_dart_hw_invalidate_tlb(stream_map);
> +	stream_map->dart->hw->invalidate_tlb(stream_map);
>  }
>
>  static int apple_dart_finalize_domain(struct iommu_domain *domain,
> @@ -816,30 +844,30 @@ static const struct iommu_ops apple_dart_iommu_ops = {
>  	}
>  };
>
> -static irqreturn_t apple_dart_irq(int irq, void *dev)
> +static irqreturn_t apple_dart_t8020_irq(int irq, void *dev)
>  {
>  	struct apple_dart *dart = dev;
>  	const char *fault_name = NULL;
> -	u32 error = readl(dart->regs + DART_ERROR);
> -	u32 error_code = FIELD_GET(DART_ERROR_CODE, error);
> -	u32 addr_lo = readl(dart->regs + DART_ERROR_ADDR_LO);
> -	u32 addr_hi = readl(dart->regs + DART_ERROR_ADDR_HI);
> +	u32 error = readl(dart->regs + DART_T8020_ERROR);
> +	u32 error_code = FIELD_GET(DART_T8020_ERROR_CODE, error);
> +	u32 addr_lo = readl(dart->regs + DART_T8020_ERROR_ADDR_LO);
> +	u32 addr_hi = readl(dart->regs + DART_T8020_ERROR_ADDR_HI);
>  	u64 addr = addr_lo | (((u64)addr_hi) << 32);
> -	u8 stream_idx = FIELD_GET(DART_ERROR_STREAM, error);
> +	u8 stream_idx = FIELD_GET(DART_T8020_ERROR_STREAM, error);
>
> -	if (!(error & DART_ERROR_FLAG))
> +	if (!(error & DART_T8020_ERROR_FLAG))
>  		return IRQ_NONE;
>
>  	/* there should only be a single bit set but let's use == to be sure */
> -	if (error_code == DART_ERROR_READ_FAULT)
> +	if (error_code == DART_T8020_ERROR_READ_FAULT)
>  		fault_name = "READ FAULT";
> -	else if (error_code == DART_ERROR_WRITE_FAULT)
> +	else if (error_code == DART_T8020_ERROR_WRITE_FAULT)
>  		fault_name = "WRITE FAULT";
> -	else if (error_code == DART_ERROR_NO_PTE)
> +	else if (error_code == DART_T8020_ERROR_NO_PTE)
>  		fault_name = "NO PTE FOR IOVA";
> -	else if (error_code == DART_ERROR_NO_PMD)
> +	else if (error_code == DART_T8020_ERROR_NO_PMD)
>  		fault_name = "NO PMD FOR IOVA";
> -	else if (error_code == DART_ERROR_NO_TTBR)
> +	else if (error_code == DART_T8020_ERROR_NO_TTBR)
>  		fault_name = "NO TTBR FOR IOVA";
>  	else
>  		fault_name = "unknown";
> @@ -849,7 +877,7 @@ static irqreturn_t apple_dart_irq(int irq, void *dev)
>  		"translation fault: status:0x%x stream:%d code:0x%x (%s) at 0x%llx",
>  		error, stream_idx, error_code, fault_name, addr);
>
> -	writel(error, dart->regs + DART_ERROR);
> +	writel(error, dart->regs + DART_T8020_ERROR);
>  	return IRQ_HANDLED;
>  }
>
> @@ -911,7 +939,7 @@ static int apple_dart_probe(struct platform_device *pdev)
>  	if (ret)
>  		goto err_clk_disable;
>
> -	ret = request_irq(dart->irq, apple_dart_irq, IRQF_SHARED,
> +	ret = request_irq(dart->irq, dart->hw->irq_handler, IRQF_SHARED,
>  			  "apple-dart fault handler", dart);
>  	if (ret)
>  		goto err_clk_disable;
> @@ -959,17 +987,51 @@ static int apple_dart_remove(struct platform_device *pdev)
>  }
>
>  static const struct apple_dart_hw apple_dart_hw_t8103 = {
> +	.irq_handler = apple_dart_t8020_irq,
> +	.invalidate_tlb = apple_dart_t8020_hw_invalidate_tlb,
>  	.oas = 36,
>  	.fmt = APPLE_DART,
>  	.max_sid_count = 16,
>
> +	.enable_streams = DART_T8020_STREAMS_ENABLE,
> +	.lock = DART_T8020_CONFIG,
> +	.lock_bit = DART_T8020_CONFIG_LOCK,
> +
> +	.error = DART_T8020_ERROR,
> +
> +	.tcr = DART_T8020_TCR,
> +	.tcr_enabled = DART_T8020_TCR_TRANSLATE_ENABLE,
> +	.tcr_disabled = 0,
> +	.tcr_bypass = DART_T8020_TCR_BYPASS_DAPF | DART_T8020_TCR_BYPASS_DART,
> +
> +	.ttbr = DART_T8020_TTBR,
> +	.ttbr_valid = DART_T8020_TTBR_VALID,
> +	.ttbr_addr_off = DART_T8020_TTBR_ADDR_OFF,
> +	.ttbr_shift = DART_T8020_TTBR_SHIFT,
>  	.ttbr_count = 4,
>  };
>  static const struct apple_dart_hw apple_dart_hw_t6000 = {
> +	.irq_handler = apple_dart_t8020_irq,
> +	.invalidate_tlb = apple_dart_t8020_hw_invalidate_tlb,
>  	.oas = 42,
>  	.fmt = APPLE_DART2,
>  	.max_sid_count = 16,
>
> +	.enable_streams = DART_T8020_STREAMS_ENABLE,
> +	.lock = DART_T8020_CONFIG,
> +	.lock_bit = DART_T8020_CONFIG_LOCK,
> +
> +	.error = DART_T8020_ERROR,
> +
> +	.tcr = DART_T8020_TCR,
> +	.tcr_enabled = DART_T8020_TCR_TRANSLATE_ENABLE,
> +	.tcr_disabled = 0,
> +	.tcr_bypass = DART_T8020_TCR_BYPASS_DAPF | DART_T8020_TCR_BYPASS_DART,
> +
> +	.ttbr = DART_T8020_TTBR,
> +	.ttbr_valid = DART_T8020_TTBR_VALID,
> +	.ttbr_addr_off = DART_T8020_TTBR_ADDR_OFF,
> +	.ttbr_shift = DART_T8020_TTBR_SHIFT,
>  	.ttbr_count = 4,
>  };
>
> @@ -979,7 +1041,7 @@ static __maybe_unused int
> apple_dart_suspend(struct device *dev)
>  	unsigned int sid, idx;
>
>  	for (sid = 0; sid < dart->num_streams; sid++) {
> -		dart->save_tcr[sid] = readl_relaxed(dart->regs + DART_TCR(sid));
> +		dart->save_tcr[sid] = readl_relaxed(dart->regs + DART_TCR(dart,
> sid));
>  		for (idx = 0; idx < dart->hw->ttbr_count; idx++)
>  			dart->save_ttbr[sid][idx] =
>  				readl(dart->regs + DART_TTBR(dart, sid, idx));
> @@ -1004,7 +1066,7 @@ static __maybe_unused int
> apple_dart_resume(struct device *dev)
>  		for (idx = 0; idx < DART_MAX_TTBR; idx++)
>  			writel(dart->save_ttbr[sid][idx],
>  			       dart->regs + DART_TTBR(dart, sid, idx));
> -		writel(dart->save_tcr[sid], dart->regs + DART_TCR(sid));
> +		writel(dart->save_tcr[sid], dart->regs + DART_TCR(dart, sid));
>  	}
>
>  	return 0;
> --
> 2.35.1

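Since the review asks what ttbr_addr_off means next to ttbr_shift, here is a small userspace sketch of the value computed in the new apple_dart_hw_set_ttbr(). struct ttbr_fmt and ttbr_encode() are hypothetical names. The first set of constants in the usage note comes from the DART_T8020_TTBR_* defines in this patch, the second from the DART_T8110_TTBR_* defines in the T8110 patch of the same series.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical mirror of the three TTBR encoding fields in struct apple_dart_hw. */
struct ttbr_fmt {
	uint32_t valid;		/* valid flag OR-ed into the register value */
	unsigned int shift;	/* low physical-address bits dropped (alignment) */
	unsigned int addr_off;	/* bit position of the address field in the register */
};

/* Same expression as the patched writel() argument in apple_dart_hw_set_ttbr(). */
static inline uint32_t ttbr_encode(const struct ttbr_fmt *fmt, uint64_t paddr)
{
	/* Mirrors the WARN_ON: paddr must be aligned to 1 << shift. */
	assert((paddr & ((UINT64_C(1) << fmt->shift) - 1)) == 0);
	return fmt->valid | (uint32_t)((paddr >> fmt->shift) << fmt->addr_off);
}
```

With the T8020 constants (valid = BIT(31), shift = 12, addr_off = 0), a table at 0x80004000 encodes to 0x80080004. With the T8110 constants (valid = BIT(0), shift = 14, addr_off = 2), the same address encodes to 0x80005. In both cases ttbr_shift drops alignment bits from the physical address, while ttbr_addr_off moves the result left to where the address field starts, so it is a bit position and not a register offset.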
On Wed, Jan 4, 2023, at 12:00, Hector Martin wrote:
> Now that we have the driver properly parameterized, we can add support
> for T8110 DARTs. These DARTs drop the multiple TTBRs (which only make
> sense with legacy 4K page platforms) and instead add support for new
> features and more stream IDs. The register layout is different, but the
> pagetable format is the same as T6000.
>
> Signed-off-by: Hector Martin <marcan@marcan.st>
> ---

One minor nit below, otherwise

Reviewed-by: Sven Peter <sven@svenpeter.dev>

>  drivers/iommu/apple-dart.c | 206 ++++++++++++++++++++++++++++++++++++-
>  1 file changed, 201 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/iommu/apple-dart.c b/drivers/iommu/apple-dart.c
> index 396da83f2f9e..e9cbdb45448c 100644
> --- a/drivers/iommu/apple-dart.c
> +++ b/drivers/iommu/apple-dart.c
> @@ -85,6 +85,62 @@
>  #define DART_T8020_TTBR_ADDR_OFF 0
>  #define DART_T8020_TTBR_SHIFT 12
>
> +/* T8110 registers */
> +
> +#define DART_T8110_PARAMS3 0x08
> +#define DART_T8110_PARAMS3_PA_WIDTH GENMASK(29, 24)
> +#define DART_T8110_PARAMS3_VA_WIDTH GENMASK(21, 16)
> +#define DART_T8110_PARAMS3_VER_MAJ GENMASK(15, 8)
> +#define DART_T8110_PARAMS3_VER_MIN GENMASK(7, 0)
> +
> +#define DART_T8110_PARAMS4 0x0c
> +#define DART_T8110_PARAMS4_NUM_CLIENTS GENMASK(24, 16)
> +#define DART_T8110_PARAMS4_NUM_SIDS GENMASK(8, 0)
> +
> +#define DART_T8110_TLB_CMD 0x80
> +#define DART_T8110_TLB_CMD_BUSY BIT(31)
> +#define DART_T8110_TLB_CMD_OP GENMASK(10, 8)
> +#define DART_T8110_TLB_CMD_OP_FLUSH_ALL 0
> +#define DART_T8110_TLB_CMD_OP_FLUSH_SID 1
> +#define DART_T8110_TLB_CMD_STREAM GENMASK(7, 0)
> +
> +#define DART_T8110_ERROR 0x100
> +#define DART_T8110_ERROR_STREAM GENMASK(27, 20)
> +#define DART_T8110_ERROR_CODE GENMASK(14, 0)
> +#define DART_T8110_ERROR_FLAG BIT(31)
> +
> +#define DART_T8110_ERROR_MASK 0x104
> +
> +#define DART_T8110_ERROR_READ_FAULT BIT(4)
> +#define DART_T8110_ERROR_WRITE_FAULT BIT(3)
> +#define DART_T8110_ERROR_NO_PTE BIT(3)
> +#define DART_T8110_ERROR_NO_PMD BIT(2)
> +#define DART_T8110_ERROR_NO_PGD BIT(1)
> +#define DART_T8110_ERROR_NO_TTBR BIT(0)
> +
> +#define DART_T8110_ERROR_ADDR_LO 0x170
> +#define DART_T8110_ERROR_ADDR_HI 0x174
> +
> +#define DART_T8110_PROTECT 0x200
> +#define DART_T8110_UNPROTECT 0x204
> +#define DART_T8110_PROTECT_LOCK 0x208
> +#define DART_T8110_PROTECT_TTBR_TCR BIT(0)

Do you have any more details on these registers? For the 8103 DART we called
it _CONFIG but I assume for the t8110 DART it can actually lock different
parts of the HW instead of just a global lock?

> +
> +#define DART_T8110_ENABLE_STREAMS 0xc00
> +#define DART_T8110_DISABLE_STREAMS 0xc20
> +
> +#define DART_T8110_TCR 0x1000
> +#define DART_T8110_TCR_REMAP GENMASK(11, 8)
> +#define DART_T8110_TCR_REMAP_EN BIT(7)
> +#define DART_T8110_TCR_BYPASS_DAPF BIT(2)
> +#define DART_T8110_TCR_BYPASS_DART BIT(1)
> +#define DART_T8110_TCR_TRANSLATE_ENABLE BIT(0)
> +
> +#define DART_T8110_TTBR 0x1400
> +#define DART_T8110_TTBR_VALID BIT(0)
> +#define DART_T8110_TTBR_ADDR_OFF 2
> +#define DART_T8110_TTBR_SHIFT 14
> +
>  #define DART_TCR(dart, sid) ((dart)->hw->tcr + ((sid) << 2))
>
>  #define DART_TTBR(dart, sid, idx) ((dart)->hw->ttbr + \
> @@ -93,7 +149,14 @@
>
>  struct apple_dart_stream_map;
>
> +enum dart_type {

Minor nit: enum apple_dart_type to be consistent with the rest of the driver.

> +	DART_T8020,
> +	DART_T6000,
> +	DART_T8110,
> +};
> +
>  struct apple_dart_hw {
> +	enum dart_type type;
>  	irqreturn_t (*irq_handler)(int irq, void *dev);
>  	int (*invalidate_tlb)(struct apple_dart_stream_map *stream_map);
>
> @@ -150,6 +213,8 @@ struct apple_dart {
>
>  	spinlock_t lock;
>
> +	u32 ias;
> +	u32 oas;
>  	u32 pgsize;
>  	u32 num_streams;
>  	u32 supports_bypass : 1;
> @@ -331,6 +396,44 @@ apple_dart_t8020_hw_stream_command(struct
> apple_dart_stream_map *stream_map,
>  	return 0;
>  }
>
> +static int
> +apple_dart_t8110_hw_tlb_command(struct apple_dart_stream_map
> *stream_map,
> +				u32 command)
> +{
> +	struct apple_dart *dart = stream_map->dart;
> +	unsigned long flags;
> +	int ret = 0;
> +	int sid;
> +
> +	spin_lock_irqsave(&dart->lock, flags);
> +
> +	for_each_set_bit(sid, stream_map->sidmap, dart->num_streams) {
> +		u32 val = FIELD_PREP(DART_T8110_TLB_CMD_OP, command) |
> +			FIELD_PREP(DART_T8110_TLB_CMD_STREAM, sid);
> +		writel(val, dart->regs + DART_T8110_TLB_CMD);
> +
> +		ret = readl_poll_timeout_atomic(
> +			dart->regs + DART_T8110_TLB_CMD, val,
> +			!(val & DART_T8110_TLB_CMD_BUSY), 1,
> +			DART_STREAM_COMMAND_BUSY_TIMEOUT);
> +
> +		if (ret)
> +			break;
> +
> +	}
> +
> +	spin_unlock_irqrestore(&dart->lock, flags);
> +
> +	if (ret) {
> +		dev_err(stream_map->dart->dev,
> +			"busy bit did not clear after command %x for stream %d\n",
> +			command, sid);
> +		return ret;
> +	}
> +
> +	return 0;
> +}
> +
>  static int
>  apple_dart_t8020_hw_invalidate_tlb(struct apple_dart_stream_map
> *stream_map)
>  {
> @@ -338,6 +441,13 @@ apple_dart_t8020_hw_invalidate_tlb(struct
> apple_dart_stream_map *stream_map)
>  		stream_map, DART_T8020_STREAM_COMMAND_INVALIDATE);
>  }
>
> +static int
> +apple_dart_t8110_hw_invalidate_tlb(struct apple_dart_stream_map *stream_map)
> +{
> +	return apple_dart_t8110_hw_tlb_command(
> +		stream_map, DART_T8110_TLB_CMD_OP_FLUSH_SID);
> +}
> +
>  static int apple_dart_hw_reset(struct apple_dart *dart)
>  {
>  	u32 config;
> @@ -364,6 +474,9 @@ static int
apple_dart_hw_reset(struct apple_dart *dart) > /* clear any pending errors before the interrupt is unmasked */ > writel(readl(dart->regs + dart->hw->error), dart->regs + dart->hw->error); > > + if (dart->hw->type == DART_T8110) > + writel(0, dart->regs + DART_T8110_ERROR_MASK); > + > return dart->hw->invalidate_tlb(&stream_map); > } > > @@ -479,8 +592,8 @@ static int apple_dart_finalize_domain(struct > iommu_domain *domain, > > pgtbl_cfg = (struct io_pgtable_cfg){ > .pgsize_bitmap = dart->pgsize, > - .ias = 32, > - .oas = dart->hw->oas, > + .ias = dart->ias, > + .oas = dart->oas, > .coherent_walk = 1, > .iommu_dev = dart->dev, > }; > @@ -494,7 +607,7 @@ static int apple_dart_finalize_domain(struct > iommu_domain *domain, > > domain->pgsize_bitmap = pgtbl_cfg.pgsize_bitmap; > domain->geometry.aperture_start = 0; > - domain->geometry.aperture_end = DMA_BIT_MASK(32); > + domain->geometry.aperture_end = DMA_BIT_MASK(dart->ias); > domain->geometry.force_aperture = true; > > dart_domain->finalized = true; > @@ -881,10 +994,49 @@ static irqreturn_t apple_dart_t8020_irq(int irq, > void *dev) > return IRQ_HANDLED; > } > > +static irqreturn_t apple_dart_t8110_irq(int irq, void *dev) > +{ > + struct apple_dart *dart = dev; > + const char *fault_name = NULL; > + u32 error = readl(dart->regs + DART_T8110_ERROR); > + u32 error_code = FIELD_GET(DART_T8110_ERROR_CODE, error); > + u32 addr_lo = readl(dart->regs + DART_T8110_ERROR_ADDR_LO); > + u32 addr_hi = readl(dart->regs + DART_T8110_ERROR_ADDR_HI); > + u64 addr = addr_lo | (((u64)addr_hi) << 32); > + u8 stream_idx = FIELD_GET(DART_T8110_ERROR_STREAM, error); > + > + if (!(error & DART_T8110_ERROR_FLAG)) > + return IRQ_NONE; > + > + /* there should only be a single bit set but let's use == to be sure */ > + if (error_code == DART_T8110_ERROR_READ_FAULT) > + fault_name = "READ FAULT"; > + else if (error_code == DART_T8110_ERROR_WRITE_FAULT) > + fault_name = "WRITE FAULT"; > + else if (error_code == DART_T8110_ERROR_NO_PTE) > + 
fault_name = "NO PTE FOR IOVA"; > + else if (error_code == DART_T8110_ERROR_NO_PMD) > + fault_name = "NO PMD FOR IOVA"; > + else if (error_code == DART_T8110_ERROR_NO_PGD) > + fault_name = "NO PGD FOR IOVA"; > + else if (error_code == DART_T8110_ERROR_NO_TTBR) > + fault_name = "NO TTBR FOR IOVA"; > + else > + fault_name = "unknown"; > + > + dev_err_ratelimited( > + dart->dev, > + "translation fault: status:0x%x stream:%d code:0x%x (%s) at 0x%llx", > + error, stream_idx, error_code, fault_name, addr); > + > + writel(error, dart->regs + DART_T8110_ERROR); > + return IRQ_HANDLED; > +} > + > static int apple_dart_probe(struct platform_device *pdev) > { > int ret; > - u32 dart_params[2]; > + u32 dart_params[4]; > struct resource *res; > struct apple_dart *dart; > struct device *dev = &pdev->dev; > @@ -924,7 +1076,22 @@ static int apple_dart_probe(struct platform_device *pdev) > dart->pgsize = 1 << FIELD_GET(DART_PARAMS1_PAGE_SHIFT, dart_params[0]); > dart->supports_bypass = dart_params[1] & DART_PARAMS2_BYPASS_SUPPORT; > > - dart->num_streams = dart->hw->max_sid_count; > + switch (dart->hw->type) { > + case DART_T8020: > + case DART_T6000: > + dart->ias = 32; > + dart->oas = dart->hw->oas; > + dart->num_streams = dart->hw->max_sid_count; > + break; > + > + case DART_T8110: > + dart_params[2] = readl(dart->regs + DART_T8110_PARAMS3); > + dart_params[3] = readl(dart->regs + DART_T8110_PARAMS4); > + dart->ias = FIELD_GET(DART_T8110_PARAMS3_VA_WIDTH, dart_params[2]); > + dart->oas = FIELD_GET(DART_T8110_PARAMS3_PA_WIDTH, dart_params[2]); > + dart->num_streams = FIELD_GET(DART_T8110_PARAMS4_NUM_SIDS, dart_params[3]); > + break; > + } > > if (dart->num_streams > DART_MAX_STREAMS) { > dev_err(&pdev->dev, "Too many streams (%d > %d)\n", > @@ -987,6 +1154,7 @@ static int apple_dart_remove(struct platform_device *pdev) > } > > static const struct apple_dart_hw apple_dart_hw_t8103 = { > + .type = DART_T8020, > .irq_handler = apple_dart_t8020_irq, > .invalidate_tlb = 
apple_dart_t8020_hw_invalidate_tlb, > .oas = 36, > @@ -1011,6 +1179,7 @@ static const struct apple_dart_hw apple_dart_hw_t8103 = { > .ttbr_count = 4, > }; > static const struct apple_dart_hw apple_dart_hw_t6000 = { > + .type = DART_T6000, > .irq_handler = apple_dart_t8020_irq, > .invalidate_tlb = apple_dart_t8020_hw_invalidate_tlb, > .oas = 42, > @@ -1035,6 +1204,32 @@ static const struct apple_dart_hw apple_dart_hw_t6000 = { > .ttbr_count = 4, > }; > > +static const struct apple_dart_hw apple_dart_hw_t8110 = { > + .type = DART_T8110, > + .irq_handler = apple_dart_t8110_irq, > + .invalidate_tlb = apple_dart_t8110_hw_invalidate_tlb, > + .fmt = APPLE_DART2, > + .max_sid_count = 256, > + > + .enable_streams = DART_T8110_ENABLE_STREAMS, > + .disable_streams = DART_T8110_DISABLE_STREAMS, > + .lock = DART_T8110_PROTECT, > + .lock_bit = DART_T8110_PROTECT_TTBR_TCR, > + > + .error = DART_T8110_ERROR, > + > + .tcr = DART_T8110_TCR, > + .tcr_enabled = DART_T8110_TCR_TRANSLATE_ENABLE, > + .tcr_disabled = 0, > + .tcr_bypass = DART_T8110_TCR_BYPASS_DAPF | DART_T8110_TCR_BYPASS_DART, > + > + .ttbr = DART_T8110_TTBR, > + .ttbr_valid = DART_T8110_TTBR_VALID, > + .ttbr_addr_off = DART_T8110_TTBR_ADDR_OFF, > + .ttbr_shift = DART_T8110_TTBR_SHIFT, > + .ttbr_count = 1, > +}; > + > static __maybe_unused int apple_dart_suspend(struct device *dev) > { > struct apple_dart *dart = dev_get_drvdata(dev); > @@ -1076,6 +1271,7 @@ DEFINE_SIMPLE_DEV_PM_OPS(apple_dart_pm_ops, > apple_dart_suspend, apple_dart_resum > > static const struct of_device_id apple_dart_of_match[] = { > { .compatible = "apple,t8103-dart", .data = &apple_dart_hw_t8103 }, > + { .compatible = "apple,t8110-dart", .data = &apple_dart_hw_t8110 }, > { .compatible = "apple,t6000-dart", .data = &apple_dart_hw_t6000 }, > {}, > }; > -- > 2.35.1
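For readers following the probe change above, here is a minimal userspace sketch of how the T8110 branch derives ias, oas and num_streams from PARAMS3 and PARAMS4. The field masks are copied from the patch; `GENMASK32`, `field_get`, `struct dart_t8110_params` and `dart_t8110_decode` are hypothetical stand-ins for the kernel's GENMASK()/FIELD_GET() and are not part of the driver, and the register values in the usage note are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace stand-in for the kernel's GENMASK(h, l), 32-bit only. */
#define GENMASK32(h, l) \
	((uint32_t)((~UINT32_C(0) << (l)) & (~UINT32_C(0) >> (31 - (h)))))

/* Field masks as defined in the patch. */
#define DART_T8110_PARAMS3_PA_WIDTH	GENMASK32(29, 24)
#define DART_T8110_PARAMS3_VA_WIDTH	GENMASK32(21, 16)
#define DART_T8110_PARAMS4_NUM_SIDS	GENMASK32(8, 0)

/* Hypothetical container for the decoded values. */
struct dart_t8110_params {
	uint32_t ias;		/* input (VA) address size in bits */
	uint32_t oas;		/* output (PA) address size in bits */
	uint32_t num_streams;	/* number of stream IDs */
};

/* Stand-in for FIELD_GET(): mask the field, then shift it down to bit 0. */
static uint32_t field_get(uint32_t mask, uint32_t reg)
{
	assert(mask != 0);
	/* mask & (~mask + 1) isolates the lowest set bit of the mask. */
	return (reg & mask) / (mask & (~mask + 1));
}

static struct dart_t8110_params dart_t8110_decode(uint32_t params3,
						  uint32_t params4)
{
	struct dart_t8110_params p;

	p.ias = field_get(DART_T8110_PARAMS3_VA_WIDTH, params3);
	p.oas = field_get(DART_T8110_PARAMS3_PA_WIDTH, params3);
	p.num_streams = field_get(DART_T8110_PARAMS4_NUM_SIDS, params4);
	return p;
}
```

With the made-up inputs params3 = 0x2A260000 and params4 = 0x00100100, this yields oas 42, ias 38 and 256 streams. Real hardware may report different values.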
On 2023/01/04 22:37, Sven Peter wrote:
>>  #include "dma-iommu.h"
>>
>> -#define DART_MAX_STREAMS 16
>> +#define DART_MAX_STREAMS 256
>
> Feels a bit wasteful to allocate 256-wide sid2group and save_{tcr,ttbr}
> arrays even for the M1 where 16 are enough. But then again, that's still <100 KiB
> for all DARTs combined and these machine have >8 GiB of RAM so it probably won't
> make a difference

Yeah, I don't think this is worth the extra fumbling around with dynamic
allocation.

>>  	/* enable all streams globally since TCR is used to control isolation */
>> -	writel(DART_STREAM_ALL, dart->regs + DART_STREAMS_ENABLE);
>> +	for (i = 0; i < BITS_TO_U32(dart->num_streams); i++)
>> +		writel(U32_MAX, dart->regs + DART_STREAMS_ENABLE);
>
> This seems weird: this code writes U32_MAX to the same register
> again and again.

Whoops, that was supposed to have a `+ 4 * i` in there. Fixed for v2.

- Hector
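To illustrate the `+ 4 * i` fix discussed in this reply, here is a small userspace sketch of the corrected loop. It is not the v2 patch itself: `fake_writel` and `dart_enable_all_streams` are hypothetical helpers that write into a plain array instead of MMIO, and `BITS_TO_U32` is redefined locally to mirror the kernel macro.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Number of 32-bit words needed to hold n bits (mirrors the kernel macro). */
#define BITS_TO_U32(n) (((n) + 31u) / 32u)

/* Hypothetical writel() stand-in: stores into an array indexed by byte offset. */
static void fake_writel(uint32_t val, uint32_t *regs, size_t byte_off)
{
	assert(byte_off % 4 == 0);
	regs[byte_off / 4] = val;
}

/*
 * Enable every stream by writing all-ones to each 32-bit word of the
 * enable bitmap. The "+ 4 * i" advances to the next word; without it the
 * loop would hit the first word repeatedly and leave the rest untouched.
 */
static void dart_enable_all_streams(uint32_t *regs, size_t enable_off,
				    unsigned int num_streams)
{
	for (unsigned int i = 0; i < BITS_TO_U32(num_streams); i++)
		fake_writel(UINT32_MAX, regs, enable_off + 4 * i);
}
```

With 256 streams and the T8110 enable offset 0xc00 from the series, the loop covers eight words, 0xc00 through 0xc1c, and stops just short of DART_T8110_DISABLE_STREAMS at 0xc20.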
On 2023/01/04 22:18, Sven Peter wrote:
>> @@ -993,7 +1003,7 @@ static __maybe_unused int apple_dart_resume(struct device *dev)
>>  	for (sid = 0; sid < dart->num_streams; sid++) {
>>  		for (idx = 0; idx < DART_MAX_TTBR; idx++)
>
> s/DART_MAX_TTBR/dart->hw->ttbr_count/ I think.
>
> With that fixed:
>
> Reviewed-by: Sven Peter <sven@svenpeter.dev>

Yup, good catch, thanks!

- Hector
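A short sketch of why the loop bound matters, assuming the DART_TTBR() offset formula from the series (base + ((ttbr_count * sid) << 2) + (idx << 2)). `dart_ttbr_offset` is a hypothetical standalone version of that macro. The bases 0x200 (T8020) and 0x1400 (T8110) are taken from the patches.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Byte offset of TTBR number idx for stream sid, for hardware that has
 * ttbr_count TTBRs per stream. Same arithmetic as the DART_TTBR() macro
 * in the series, with the per-hardware values passed in explicitly.
 */
static uint32_t dart_ttbr_offset(uint32_t ttbr_base, unsigned int ttbr_count,
				 unsigned int sid, unsigned int idx)
{
	assert(ttbr_count > 0);
	return ttbr_base + ((ttbr_count * sid) << 2) + (idx << 2);
}
```

With ttbr_count = 4 this reduces to the old hardcoded 0x200 + 16 * sid + 4 * idx. With ttbr_count = 1, an idx of 1 for stream 0 lands on the same offset as idx 0 for stream 1, so a loop that still runs to DART_MAX_TTBR would touch registers that belong to neighboring streams.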
On 2023/01/04 22:43, Sven Peter wrote:
> On Wed, Jan 4, 2023, at 12:00, Hector Martin wrote:
>> +	u64 enable_streams;
>> +	u64 disable_streams;
> I don't think disable_streams is used anywhere. I assume you just left it in
> here to document it?

Yeah, we don't use this field ever, so we might as well drop it. I'll
leave the #define for T8110 in though, as documentation.

>> +	u64 ttbr;
>> +	u64 ttbr_valid;
>> +	u64 ttbr_addr_off;
>
> This name confused me a bit since off sounds like offset to me while
> this is actually another shift. Can't really think of a better name
> right now though. I'd at least a comment here to describe it.

How about `ttbr_addr_field_shift`?

- Hector
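To make the two shifts concrete, here is a sketch of how a TTBR value could be composed. It assumes the register value is valid | ((paddr >> ttbr_shift) << ttbr_addr_field_shift), which matches the pre-series code when the field shift is 0. The exact expression in the driver is not quoted in this thread, so treat it as an assumption. `struct dart_ttbr_layout` and `dart_ttbr_encode` are hypothetical names; the constants come from the DART_T8020_* and DART_T8110_* defines.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-hardware description of the TTBR register layout. */
struct dart_ttbr_layout {
	uint32_t valid;			/* valid bit within the register */
	unsigned int shift;		/* right shift applied to the physical address */
	unsigned int field_shift;	/* left shift placing the result in the register */
};

/* Values taken from the defines in the series. */
static const struct dart_ttbr_layout t8020_ttbr = { UINT32_C(1) << 31, 12, 0 };
static const struct dart_ttbr_layout t8110_ttbr = { UINT32_C(1) << 0, 14, 2 };

static uint32_t dart_ttbr_encode(const struct dart_ttbr_layout *l,
				 uint64_t paddr)
{
	/* The table must be aligned to 1 << shift (the driver WARNs otherwise). */
	assert((paddr & ((UINT64_C(1) << l->shift) - 1)) == 0);
	return l->valid | (uint32_t)((paddr >> l->shift) << l->field_shift);
}
```

For example, a table at 0x800004000 would encode as 0x00800005 with the T8110 layout: the address is first reduced to a 16K frame number, then moved up by two bits so that it sits above the valid bit. This is the sense in which the second value is a field shift and not an offset.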
On 2023/01/04 22:50, Sven Peter wrote:
> Do you have any more details on this registers? For the 8103 DART
> we called it _CONFIG but I assume for the t8110 DART it can
> actually lock different parts of the HW instead of just a global lock?

This is based on R's reverse engineering here:

https://github.com/AsahiLinux/m1n1/blob/main/proxyclient/m1n1/hw/dart8110.py#L87

I don't think they ever fully nailed down exactly what the lock bit
behavior is, though.

- Hector
Hi,

On Thu, Jan 5, 2023, at 06:19, Hector Martin wrote:
> On 2023/01/04 22:50, Sven Peter wrote:
>> Do you have any more details on this registers? For the 8103 DART
>> we called it _CONFIG but I assume for the t8110 DART it can
>> actually lock different parts of the HW instead of just a global lock?
>
> This is based on R's reverse engineering here:
>
> https://github.com/AsahiLinux/m1n1/blob/main/proxyclient/m1n1/hw/dart8110.py#L87
>
> I don't think they ever fully nailed down exactly what the lock bit
> behavior is, though.

Fair enough, I was mostly curious if it was actually _PROTECT and not just
_CONFIG with different bit assignments. Sounds like it does mostly set up
protections though.

Sven
Hi,

On Thu, Jan 5, 2023, at 06:16, Hector Martin wrote:
> On 2023/01/04 22:43, Sven Peter wrote:
>> On Wed, Jan 4, 2023, at 12:00, Hector Martin wrote:
>>> +	u64 enable_streams;
>>> +	u64 disable_streams;
>> I don't think disable_streams is used anywhere. I assume you just left it in
>> here to document it?
>
> Yeah, we don't use this field ever, so we might as well drop it. I'll
> leave the #define for T8110 in though, as documentation.
>
>>> +	u64 ttbr;
>>> +	u64 ttbr_valid;
>>> +	u64 ttbr_addr_off;
>>
>> This name confused me a bit since off sounds like offset to me while
>> this is actually another shift. Can't really think of a better name
>> right now though. I'd at least a comment here to describe it.
>
> How about `ttbr_addr_field_shift`?

Sounds good to me!

Sven
Hi,

On Thu, Jan 5, 2023, at 05:43, Hector Martin wrote:
> On 2023/01/04 22:37, Sven Peter wrote:
>>>  #include "dma-iommu.h"
>>>
>>> -#define DART_MAX_STREAMS 16
>>> +#define DART_MAX_STREAMS 256
>>
>> Feels a bit wasteful to allocate 256-wide sid2group and save_{tcr,ttbr}
>> arrays even for the M1 where 16 are enough. But then again, that's still <100 KiB
>> for all DARTs combined and these machine have >8 GiB of RAM so it probably won't
>> make a difference
>
> Yeah, I don't think this is worth the extra fumbling around with dynamic
> allocation.
>
>>>  	/* enable all streams globally since TCR is used to control isolation */
>>> -	writel(DART_STREAM_ALL, dart->regs + DART_STREAMS_ENABLE);
>>> +	for (i = 0; i < BITS_TO_U32(dart->num_streams); i++)
>>> +		writel(U32_MAX, dart->regs + DART_STREAMS_ENABLE);
>>
>> This seems weird: this code writes U32_MAX to the same register
>> again and again.
>
> Whoops, that was supposed to have a `+ 4 * i` in there. Fixed for v2.

Great! Feel free to also add

Reviewed-by: Sven Peter <sven@svenpeter.dev>

then.

Best,

Sven