Message ID | 20180302101050.6191-1-vivek.gautam@codeaurora.org |
---|---|
Series | iommu/arm-smmu: Add runtime pm/sleep support |
Hi Vivek, On Fri, Mar 2, 2018 at 7:10 PM, Vivek Gautam <vivek.gautam@codeaurora.org> wrote: > This series provides the support for turning on the arm-smmu's > clocks/power domains using runtime pm. This is done using the > recently introduced device links patches, which lets the smmu's > runtime to follow the master's runtime pm, so the smmu remains > powered only when the masters use it. > > It also adds support for Qcom's arm-smmu-v2 variant that > has different clocks and power requirements. > > Took some reference from the exynos runtime patches [1]. > > After another round of discussion [3], we now finally seem to be > in agreement to add a flag based on compatible, a flag that would > indicate if a particular implementation of arm-smmu supports > runtime pm or not. > This lets us to use the much-argued pm_runtime_get_sync/put_sync() > calls in map/unmap callbacks so that the clients do not have to > worry about handling any of the arm-smmu's power. > The patch that exported couple of pm_runtime suppliers APIS, viz. > pm_runtime_get_suppliers(), and pm_runtime_put_suppliers() can be > dropped since we don't have a user now for these APIs. > Thanks Rafael for reviewing the changes, but looks like we don't > need to export those APIs for some more time. :) > > Previous version of this patch series is @ [5]. Thanks for addressing my comments. There is still a bit of space for improving the granularity of power management, as far as I understood how it works on SDM845 correctly, but as a first step, this should at least let things work. Reviewed-by: Tomasz Figa <tfiga@chromium.org> Best regards, Tomasz -- To unsubscribe from this list: send the line "unsubscribe devicetree" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Hi Tomasz, On 3/5/2018 6:55 PM, Tomasz Figa wrote: > Hi Vivek, > > On Fri, Mar 2, 2018 at 7:10 PM, Vivek Gautam > <vivek.gautam@codeaurora.org> wrote: >> This series provides the support for turning on the arm-smmu's >> clocks/power domains using runtime pm. This is done using the >> recently introduced device links patches, which lets the smmu's >> runtime to follow the master's runtime pm, so the smmu remains >> powered only when the masters use it. >> >> It also adds support for Qcom's arm-smmu-v2 variant that >> has different clocks and power requirements. >> >> Took some reference from the exynos runtime patches [1]. >> >> After another round of discussion [3], we now finally seem to be >> in agreement to add a flag based on compatible, a flag that would >> indicate if a particular implementation of arm-smmu supports >> runtime pm or not. >> This lets us to use the much-argued pm_runtime_get_sync/put_sync() >> calls in map/unmap callbacks so that the clients do not have to >> worry about handling any of the arm-smmu's power. >> The patch that exported couple of pm_runtime suppliers APIS, viz. >> pm_runtime_get_suppliers(), and pm_runtime_put_suppliers() can be >> dropped since we don't have a user now for these APIs. >> Thanks Rafael for reviewing the changes, but looks like we don't >> need to export those APIs for some more time. :) >> >> Previous version of this patch series is @ [5]. > Thanks for addressing my comments. There is still a bit of space for > improving the granularity of power management, as far as I understood > how it works on SDM845 correctly, but as a first step, this should at > least let things work. Sure. I will be sending a patch, based on this series, to add 'qcom,smmu-500' that enables the *rpm_supported* flag for us. We can try to take care of some of the things with that. > Reviewed-by: Tomasz Figa <tfiga@chromium.org> Thanks for the review.
regards Vivek > > Best regards, > Tomasz
On 02/03/18 10:10, Vivek Gautam wrote: > If we fail after initializing domain_context, we should destroy > the context to free up resources. Have another think about why the "problem" this patch caters for cannot ever happen (hint: consider how domain->smmu is used in arm_smmu_init_domain_context()). And then also about the really catastrophically bad problem it actually introduces (hint: "iommu_attach(domain, good_dev); iommu_attach(domain, bad_dev);") Robin. > Signed-off-by: Vivek Gautam <vivek.gautam@codeaurora.org> > --- > > * New patch added in this series. > > drivers/iommu/arm-smmu.c | 7 ++++++- > 1 file changed, 6 insertions(+), 1 deletion(-) > > diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c > index 69e7c60792a8..ffc152c36002 100644 > --- a/drivers/iommu/arm-smmu.c > +++ b/drivers/iommu/arm-smmu.c > @@ -1223,11 +1223,16 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev) > dev_err(dev, > "cannot attach to SMMU %s whilst already attached to domain on SMMU %s\n", > dev_name(smmu_domain->smmu->dev), dev_name(smmu->dev)); > - return -EINVAL; > + ret = -EINVAL; > + goto destroy_domain; > } > > /* Looks ok, so add the device to the domain */ > return arm_smmu_domain_add_master(smmu_domain, fwspec); > + > +destroy_domain: > + arm_smmu_destroy_domain_context(domain); > + return ret; > } > > static int arm_smmu_map(struct iommu_domain *domain, unsigned long iova, >
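Both of Robin's hints can be made concrete with a small user-space model. This is only a sketch: it is not driver code, and every `model_*` name is hypothetical. It assumes that the domain context is set up by the first attach and is a no-op afterwards, which is what the first hint points at. On a fresh domain the SMMU mismatch check therefore cannot fail, and when it does fail the context already belongs to a device that attached earlier.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver structures; not the kernel's types. */
struct model_smmu { int id; };

struct model_domain {
	struct model_smmu *smmu;   /* set once, by the first attach */
	bool ctx_valid;            /* models the initialised context bank */
	int nr_masters;            /* devices relying on the context */
};

/* Models the init step: does nothing once the domain has an SMMU. */
static void model_init_context(struct model_domain *d, struct model_smmu *s)
{
	if (d->smmu)
		return;
	d->smmu = s;
	d->ctx_valid = true;
}

static void model_destroy_context(struct model_domain *d)
{
	d->ctx_valid = false;
	d->smmu = NULL;
}

/*
 * Attach a device sitting behind SMMU 's'. With destroy_on_error set,
 * the error path mimics the "goto destroy_domain" added by the patch.
 */
static int model_attach(struct model_domain *d, struct model_smmu *s,
			bool destroy_on_error)
{
	assert(d && s);
	model_init_context(d, s);

	if (d->smmu != s) {        /* domain already lives on another SMMU */
		if (destroy_on_error)
			model_destroy_context(d);
		return -22;        /* -EINVAL */
	}

	d->nr_masters++;
	return 0;
}

/* A domain with attached masters must still have a valid context. */
static bool model_domain_consistent(const struct model_domain *d)
{
	return d->nr_masters == 0 || d->ctx_valid;
}
```

Attaching a device behind one SMMU and then a device behind another, with `destroy_on_error` set, leaves `model_domain_consistent()` false: the first device is still counted as attached, but its context is gone. With the flag clear, the second attach fails and the first device is unaffected.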
On 02/03/18 10:10, Vivek Gautam wrote: > From: Sricharan R <sricharan@codeaurora.org> > > The smmu device probe/remove and add/remove master device callbacks > gets called when the smmu is not linked to its master, that is without > the context of the master device. So calling runtime apis in those places > separately. > > Signed-off-by: Sricharan R <sricharan@codeaurora.org> > [vivek: Cleanup pm runtime calls] > Signed-off-by: Vivek Gautam <vivek.gautam@codeaurora.org> > --- > drivers/iommu/arm-smmu.c | 96 ++++++++++++++++++++++++++++++++++++++++++++---- > 1 file changed, 88 insertions(+), 8 deletions(-) > > diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c > index c8b16f53f597..3d6a1875431f 100644 > --- a/drivers/iommu/arm-smmu.c > +++ b/drivers/iommu/arm-smmu.c > @@ -209,6 +209,8 @@ struct arm_smmu_device { > struct clk_bulk_data *clks; > int num_clks; > > + bool rpm_supported; > + Can we not automatically infer this from whether clocks and/or power domains are specified or not, then just use pm_runtime_enabled() as the fast-path check as Tomasz originally proposed? I worry that relying on statically-defined matchdata is just going to blow up the driver and DT binding into a maintenance nightmare; I really don't want to start needing separate definitions for e.g. "arm,juno-etr-mmu-401" and "arm,juno-hdlcd-mmu-401" just because one otherwise-identical instance within the SoC is in a separate controllable power domain while the others aren't. Robin. 
> u32 cavium_id_base; /* Specific to Cavium */ > > spinlock_t global_sync_lock; > @@ -268,6 +270,20 @@ static struct arm_smmu_option_prop arm_smmu_options[] = { > { 0, NULL}, > }; > > +static inline int arm_smmu_rpm_get(struct arm_smmu_device *smmu) > +{ > + if (smmu->rpm_supported) > + return pm_runtime_get_sync(smmu->dev); > + > + return 0; > +} > + > +static inline void arm_smmu_rpm_put(struct arm_smmu_device *smmu) > +{ > + if (smmu->rpm_supported) > + pm_runtime_put(smmu->dev); > +} > + > static struct arm_smmu_domain *to_smmu_domain(struct iommu_domain *dom) > { > return container_of(dom, struct arm_smmu_domain, domain); > @@ -913,11 +929,15 @@ static void arm_smmu_destroy_domain_context(struct iommu_domain *domain) > struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain); > struct arm_smmu_device *smmu = smmu_domain->smmu; > struct arm_smmu_cfg *cfg = &smmu_domain->cfg; > - int irq; > + int ret, irq; > > if (!smmu || domain->type == IOMMU_DOMAIN_IDENTITY) > return; > > + ret = arm_smmu_rpm_get(smmu); > + if (ret < 0) > + return; > + > /* > * Disable the context bank and free the page tables before freeing > * it. > @@ -932,6 +952,8 @@ static void arm_smmu_destroy_domain_context(struct iommu_domain *domain) > > free_io_pgtable_ops(smmu_domain->pgtbl_ops); > __arm_smmu_free_bitmap(smmu->context_map, cfg->cbndx); > + > + arm_smmu_rpm_put(smmu); > } > > static struct iommu_domain *arm_smmu_domain_alloc(unsigned type) > @@ -1213,10 +1235,15 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev) > return -ENODEV; > > smmu = fwspec_smmu(fwspec); > + > + ret = arm_smmu_rpm_get(smmu); > + if (ret < 0) > + return ret; > + > /* Ensure that the domain is finalised */ > ret = arm_smmu_init_domain_context(domain, smmu); > if (ret < 0) > - return ret; > + goto rpm_put; > > /* > * Sanity check the domain. 
We don't support domains across > @@ -1231,10 +1258,17 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev) > } > > /* Looks ok, so add the device to the domain */ > - return arm_smmu_domain_add_master(smmu_domain, fwspec); > + ret = arm_smmu_domain_add_master(smmu_domain, fwspec); > + > + arm_smmu_rpm_put(smmu); > + > + return ret; > > destroy_domain: > arm_smmu_destroy_domain_context(domain); > +rpm_put: > + arm_smmu_rpm_put(smmu); > + > return ret; > } > > @@ -1242,22 +1276,36 @@ static int arm_smmu_map(struct iommu_domain *domain, unsigned long iova, > phys_addr_t paddr, size_t size, int prot) > { > struct io_pgtable_ops *ops = to_smmu_domain(domain)->pgtbl_ops; > + struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain); > + struct arm_smmu_device *smmu = smmu_domain->smmu; > + int ret; > > if (!ops) > return -ENODEV; > > - return ops->map(ops, iova, paddr, size, prot); > + arm_smmu_rpm_get(smmu); > + ret = ops->map(ops, iova, paddr, size, prot); > + arm_smmu_rpm_put(smmu); > + > + return ret; > } > > static size_t arm_smmu_unmap(struct iommu_domain *domain, unsigned long iova, > size_t size) > { > struct io_pgtable_ops *ops = to_smmu_domain(domain)->pgtbl_ops; > + struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain); > + struct arm_smmu_device *smmu = smmu_domain->smmu; > + size_t ret; > > if (!ops) > return 0; > > - return ops->unmap(ops, iova, size); > + arm_smmu_rpm_get(smmu); > + ret = ops->unmap(ops, iova, size); > + arm_smmu_rpm_put(smmu); > + > + return ret; > } > > static void arm_smmu_iotlb_sync(struct iommu_domain *domain) > @@ -1412,14 +1460,22 @@ static int arm_smmu_add_device(struct device *dev) > while (i--) > cfg->smendx[i] = INVALID_SMENDX; > > + ret = arm_smmu_rpm_get(smmu); > + if (ret < 0) > + goto out_cfg_free; > + > ret = arm_smmu_master_alloc_smes(dev); > if (ret) > - goto out_cfg_free; > + goto out_rpm_put; > > iommu_device_link(&smmu->iommu, dev); > > + arm_smmu_rpm_put(smmu); > + > return 0; 
> > +out_rpm_put: > + arm_smmu_rpm_put(smmu); > out_cfg_free: > kfree(cfg); > out_free: > @@ -1432,7 +1488,7 @@ static void arm_smmu_remove_device(struct device *dev) > struct iommu_fwspec *fwspec = dev->iommu_fwspec; > struct arm_smmu_master_cfg *cfg; > struct arm_smmu_device *smmu; > - > + int ret; > > if (!fwspec || fwspec->ops != &arm_smmu_ops) > return; > @@ -1440,8 +1496,15 @@ static void arm_smmu_remove_device(struct device *dev) > cfg = fwspec->iommu_priv; > smmu = cfg->smmu; > > + ret = arm_smmu_rpm_get(smmu); > + if (ret < 0) > + return; > + > iommu_device_unlink(&smmu->iommu, dev); > arm_smmu_master_free_smes(fwspec); > + > + arm_smmu_rpm_put(smmu); > + > iommu_group_remove_device(dev); > kfree(fwspec->iommu_priv); > iommu_fwspec_free(dev); > @@ -1907,6 +1970,7 @@ struct arm_smmu_match_data { > enum arm_smmu_implementation model; > const char * const *clks; > int num_clks; > + bool rpm_supported; > }; > > #define ARM_SMMU_MATCH_DATA(name, ver, imp) \ > @@ -2029,6 +2093,7 @@ static int arm_smmu_device_dt_probe(struct platform_device *pdev, > smmu->version = data->version; > smmu->model = data->model; > smmu->num_clks = data->num_clks; > + smmu->rpm_supported = data->rpm_supported; > > arm_smmu_fill_clk_data(smmu, data->clks); > > @@ -2129,6 +2194,8 @@ static int arm_smmu_device_probe(struct platform_device *pdev) > smmu->irqs[i] = irq; > } > > + platform_set_drvdata(pdev, smmu); > + > err = devm_clk_bulk_get(smmu->dev, smmu->num_clks, smmu->clks); > if (err) > return err; > @@ -2137,6 +2204,13 @@ static int arm_smmu_device_probe(struct platform_device *pdev) > if (err) > return err; > > + if (smmu->rpm_supported) > + pm_runtime_enable(dev); > + > + err = arm_smmu_rpm_get(smmu); > + if (err < 0) > + return err; > + > err = arm_smmu_device_cfg_probe(smmu); > if (err) > return err; > @@ -2178,10 +2252,11 @@ static int arm_smmu_device_probe(struct platform_device *pdev) > return err; > } > > - platform_set_drvdata(pdev, smmu); > arm_smmu_device_reset(smmu); > 
arm_smmu_test_smr_masks(smmu); > + arm_smmu_rpm_put(smmu); > + > /* > * For ACPI and generic DT bindings, an SMMU will be probed before > * any device which might need it, so we want the bus ops in place > @@ -2217,9 +2292,14 @@ static int arm_smmu_device_remove(struct platform_device *pdev) > if (!bitmap_empty(smmu->context_map, ARM_SMMU_MAX_CBS)) > dev_err(&pdev->dev, "removing device with active domains!\n"); > > + arm_smmu_rpm_get(smmu); > /* Turn the thing off */ > writel(sCR0_CLIENTPD, ARM_SMMU_GR0_NS(smmu) + ARM_SMMU_GR0_sCR0); > > + arm_smmu_rpm_put(smmu); > + if (smmu->rpm_supported) > + pm_runtime_disable(smmu->dev); > + > clk_bulk_unprepare(smmu->num_clks, smmu->clks); > > return 0; >
On 02/03/18 10:10, Vivek Gautam wrote: > From: Sricharan R <sricharan@codeaurora.org> > > Finally add the device link between the master device and > smmu, so that the smmu gets runtime enabled/disabled only when the > master needs it. This is done from add_device callback which gets > called once when the master is added to the smmu. > > Signed-off-by: Sricharan R <sricharan@codeaurora.org> > Signed-off-by: Vivek Gautam <vivek.gautam@codeaurora.org> > --- > drivers/iommu/arm-smmu.c | 21 +++++++++++++++++++++ > 1 file changed, 21 insertions(+) > > diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c > index 3d6a1875431f..bb1ea82c1003 100644 > --- a/drivers/iommu/arm-smmu.c > +++ b/drivers/iommu/arm-smmu.c > @@ -217,6 +217,9 @@ struct arm_smmu_device { > > /* IOMMU core code handle */ > struct iommu_device iommu; > + > + /* runtime PM link to master */ > + struct device_link *link; Just the one? > }; > > enum arm_smmu_context_fmt { > @@ -1470,10 +1473,26 @@ static int arm_smmu_add_device(struct device *dev) > > iommu_device_link(&smmu->iommu, dev); > > + /* > + * Establish the link between smmu and master, so that the > + * smmu gets runtime enabled/disabled as per the master's > + * needs. > + */ > + smmu->link = device_link_add(dev, smmu->dev, DL_FLAG_PM_RUNTIME); Maybe I've misunderstood how the API works, but AFAICS the second and subsequent devices are all just going to overwrite (and leak) the link of the previous one... 
> + if (!smmu->link) { > + dev_warn(smmu->dev, "Unable to create device link between %s and %s\n", > + dev_name(smmu->dev), dev_name(dev)); > + ret = -ENODEV; > + goto out_unlink; > + } > + > arm_smmu_rpm_put(smmu); > > return 0; > > +out_unlink: > + iommu_device_unlink(&smmu->iommu, dev); > + arm_smmu_master_free_smes(fwspec); > out_rpm_put: > arm_smmu_rpm_put(smmu); > out_cfg_free: > @@ -1496,6 +1515,8 @@ static void arm_smmu_remove_device(struct device *dev) > cfg = fwspec->iommu_priv; > smmu = cfg->smmu; > > + device_link_del(smmu->link); ...and equivalently you end up with a double-free (or more) here of a link which may not have belonged to dev anyway. Robin. > + > ret = arm_smmu_rpm_get(smmu); > if (ret < 0) > return; >
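One way to avoid both the overwrite and the double-free is to keep the link with the per-master data rather than in the single SMMU structure, so that each master deletes only the link it created. The following is a user-space model of that ownership rule, not a proposed patch: `model_link_add()` and `model_link_del()` are hypothetical stand-ins for `device_link_add()` and `device_link_del()`, and the thread does not say which fix was eventually chosen.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct device_link. */
struct model_link { int consumer_id; };

/* Number of links currently allocated; lets a caller spot leaks. */
static int model_live_links;

static struct model_link *model_link_add(int consumer_id)
{
	struct model_link *l = malloc(sizeof(*l));

	if (!l)
		return NULL;
	l->consumer_id = consumer_id;
	model_live_links++;
	return l;
}

static void model_link_del(struct model_link *l)
{
	assert(l);                 /* deleting a missing link is a bug */
	free(l);
	model_live_links--;
}

/* Per-master state: each master owns exactly one link to the SMMU. */
struct model_master {
	int id;
	struct model_link *link;
};

static int model_add_master(struct model_master *m)
{
	m->link = model_link_add(m->id);
	return m->link ? 0 : -19;  /* -ENODEV, as in the patch's error path */
}

static void model_remove_master(struct model_master *m)
{
	model_link_del(m->link);
	m->link = NULL;            /* guards against a second delete */
}
```

Adding two masters yields two distinct links, and removing both brings `model_live_links` back to zero. With a single shared pointer the second add would overwrite the first link, and the two removes would free the same object twice.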
On Wed, Mar 7, 2018 at 9:38 PM, Robin Murphy <robin.murphy@arm.com> wrote: > On 02/03/18 10:10, Vivek Gautam wrote: >> >> From: Sricharan R <sricharan@codeaurora.org> >> >> The smmu device probe/remove and add/remove master device callbacks >> gets called when the smmu is not linked to its master, that is without >> the context of the master device. So calling runtime apis in those places >> separately. >> >> Signed-off-by: Sricharan R <sricharan@codeaurora.org> >> [vivek: Cleanup pm runtime calls] >> Signed-off-by: Vivek Gautam <vivek.gautam@codeaurora.org> >> --- >> drivers/iommu/arm-smmu.c | 96 >> ++++++++++++++++++++++++++++++++++++++++++++---- >> 1 file changed, 88 insertions(+), 8 deletions(-) >> >> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c >> index c8b16f53f597..3d6a1875431f 100644 >> --- a/drivers/iommu/arm-smmu.c >> +++ b/drivers/iommu/arm-smmu.c >> @@ -209,6 +209,8 @@ struct arm_smmu_device { >> struct clk_bulk_data *clks; >> int num_clks; >> + bool rpm_supported; >> + > > > Can we not automatically infer this from whether clocks and/or power domains > are specified or not, then just use pm_runtime_enabled() as the fast-path > check as Tomasz originally proposed? I wouldn't tie this to presence of clocks, since as a next step we would want to actually control the clocks separately. (As far as I understand, on QCom SoCs we might want to have runtime PM active for the translation to work, but clocks gated whenever access to SMMU registers is not needed.) Moreover, you might still have some super high scale thousand-core systems that require clocks to be prepare-enabled, but runtime PM would be undesirable for the reasons we discussed before. > > I worry that relying on statically-defined matchdata is just going to blow > up the driver and DT binding into a maintenance nightmare; I really don't > want to start needing separate definitions for e.g. 
"arm,juno-etr-mmu-401" > and "arm,juno-hdlcd-mmu-401" just because one otherwise-identical instance > within the SoC is in a separate controllable power domain while the others > aren't. I don't see a reason why both couldn't just have RPM supported regardless of whether there is a real power domain. It would effectively be just a no-op for those that don't have one. IMHO the only reason to avoid having the RPM enabled is the scalability issue we discussed before. Best regards, Tomasz
On 07/03/18 13:52, Tomasz Figa wrote: > On Wed, Mar 7, 2018 at 9:38 PM, Robin Murphy <robin.murphy@arm.com> wrote: >> On 02/03/18 10:10, Vivek Gautam wrote: >>> >>> From: Sricharan R <sricharan@codeaurora.org> >>> >>> The smmu device probe/remove and add/remove master device callbacks >>> gets called when the smmu is not linked to its master, that is without >>> the context of the master device. So calling runtime apis in those places >>> separately. >>> >>> Signed-off-by: Sricharan R <sricharan@codeaurora.org> >>> [vivek: Cleanup pm runtime calls] >>> Signed-off-by: Vivek Gautam <vivek.gautam@codeaurora.org> >>> --- >>> drivers/iommu/arm-smmu.c | 96 >>> ++++++++++++++++++++++++++++++++++++++++++++---- >>> 1 file changed, 88 insertions(+), 8 deletions(-) >>> >>> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c >>> index c8b16f53f597..3d6a1875431f 100644 >>> --- a/drivers/iommu/arm-smmu.c >>> +++ b/drivers/iommu/arm-smmu.c >>> @@ -209,6 +209,8 @@ struct arm_smmu_device { >>> struct clk_bulk_data *clks; >>> int num_clks; >>> + bool rpm_supported; >>> + >> >> >> Can we not automatically infer this from whether clocks and/or power domains >> are specified or not, then just use pm_runtime_enabled() as the fast-path >> check as Tomasz originally proposed? > > I wouldn't tie this to presence of clocks, since as a next step we > would want to actually control the clocks separately. (As far as I > understand, on QCom SoCs we might want to have runtime PM active for > the translation to work, but clocks gated whenever access to SMMU > registers is not needed.) Moreover, you might still have some super > high scale thousand-core systems that require clocks to be > prepare-enabled, but runtime PM would be undesirable for the reasons > we discussed before. 
> >> >> I worry that relying on statically-defined matchdata is just going to blow >> up the driver and DT binding into a maintenance nightmare; I really don't >> want to start needing separate definitions for e.g. "arm,juno-etr-mmu-401" >> and "arm,juno-hdlcd-mmu-401" just because one otherwise-identical instance >> within the SoC is in a separate controllable power domain while the others >> aren't. > > I don't see a reason why both couldn't just have RPM supported > regardless of whether there is a real power domain. It would > effectively be just a no-op for those that don't have one. Because you're then effectively defining "compatible" values for the sake of attaching software policy to them, rather than actually describing different hardware implementations. The fact that RPM can't do anything meaningful unless relevant clock/power aspects *are* described, however, means that we shouldn't need additional information redundant with that. Much like the fact that we don't *already* have an "arm,juno-hdlcd-mmu-401" compatible to account for those being integrated such that IDR0.CTTW has the wrong value, since the presence or not of the "dma-coherent" property already describes the truth in that regard. > IMHO the > only reason to avoid having the RPM enabled is the scalability issue > we discussed before. Yes, but that's kind of my point; in reality high throughput/minimal latency and aggressive power management are more or less mutually exclusive. Mobile SoCs with fine-grained clock trees and power domains won't have multiple 40GBe/NVMf/whatever links running flat out in parallel; conversely networking/infrastructure/server SoCs aren't designed around saving every last microamp of leakage current - even in the (fairly unlikely) case of the interconnect clocks being software-gateable at all I would be very surprised if that were ever exposed directly to Linux (FWIW I believe ACPI essentially *requires* clocks to be abstracted behind firmware). 
Realistically then, explicit clocks are only expected on systems which care about power management. We can always revisit that assumption if anything crazy where it isn't the case ever becomes non-theoretical, but for now it's one I'm entirely comfortable with. If on the other hand it turns out that we can rely on just a power domain being present wherever we want RPM, making clocks moot, then all the better. Robin.
On Thu, Mar 8, 2018 at 1:58 AM, Robin Murphy <robin.murphy@arm.com> wrote: > On 07/03/18 13:52, Tomasz Figa wrote: >> >> On Wed, Mar 7, 2018 at 9:38 PM, Robin Murphy <robin.murphy@arm.com> wrote: >>> >>> On 02/03/18 10:10, Vivek Gautam wrote: >>>> >>>> >>>> From: Sricharan R <sricharan@codeaurora.org> >>>> >>>> The smmu device probe/remove and add/remove master device callbacks >>>> gets called when the smmu is not linked to its master, that is without >>>> the context of the master device. So calling runtime apis in those >>>> places >>>> separately. >>>> >>>> Signed-off-by: Sricharan R <sricharan@codeaurora.org> >>>> [vivek: Cleanup pm runtime calls] >>>> Signed-off-by: Vivek Gautam <vivek.gautam@codeaurora.org> >>>> --- >>>> drivers/iommu/arm-smmu.c | 96 >>>> ++++++++++++++++++++++++++++++++++++++++++++---- >>>> 1 file changed, 88 insertions(+), 8 deletions(-) >>>> >>>> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c >>>> index c8b16f53f597..3d6a1875431f 100644 >>>> --- a/drivers/iommu/arm-smmu.c >>>> +++ b/drivers/iommu/arm-smmu.c >>>> @@ -209,6 +209,8 @@ struct arm_smmu_device { >>>> struct clk_bulk_data *clks; >>>> int num_clks; >>>> + bool rpm_supported; >>>> + >>> >>> >>> >>> Can we not automatically infer this from whether clocks and/or power >>> domains >>> are specified or not, then just use pm_runtime_enabled() as the fast-path >>> check as Tomasz originally proposed? >> >> >> I wouldn't tie this to presence of clocks, since as a next step we >> would want to actually control the clocks separately. (As far as I >> understand, on QCom SoCs we might want to have runtime PM active for >> the translation to work, but clocks gated whenever access to SMMU >> registers is not needed.) Moreover, you might still have some super >> high scale thousand-core systems that require clocks to be >> prepare-enabled, but runtime PM would be undesirable for the reasons >> we discussed before. 
>> >>> >>> I worry that relying on statically-defined matchdata is just going to >>> blow >>> up the driver and DT binding into a maintenance nightmare; I really don't >>> want to start needing separate definitions for e.g. >>> "arm,juno-etr-mmu-401" >>> and "arm,juno-hdlcd-mmu-401" just because one otherwise-identical >>> instance >>> within the SoC is in a separate controllable power domain while the >>> others >>> aren't. >> >> >> I don't see a reason why both couldn't just have RPM supported >> regardless of whether there is a real power domain. It would >> effectively be just a no-op for those that don't have one. > > > Because you're then effectively defining "compatible" values for the sake of > attaching software policy to them, rather than actually describing different > hardware implementations. > > The fact that RPM can't do anything meaningful unless relevant clock/power > aspects *are* described, however, means that we shouldn't need additional > information redundant with that. Much like the fact that we don't *already* > have an "arm,juno-hdlcd-mmu-401" compatible to account for those being > integrated such that IDR0.CTTW has the wrong value, since the presence or > not of the "dma-coherent" property already describes the truth in that > regard. Fair enough. > >> IMHO the >> only reason to avoid having the RPM enabled is the scalability issue >> we discussed before. > > > Yes, but that's kind of my point; in reality high throughput/minimal latency > and aggressive power management are more or less mutually exclusive. 
Mobile > SoCs with fine-grained clock trees and power domains won't have multiple > 40GBe/NVMf/whatever links running flat out in parallel; conversely > networking/infrastructure/server SoCs aren't designed around saving every > last microamp of leakage current - even in the (fairly unlikely) case of the > interconnect clocks being software-gateable at all I would be very surprised > if that were ever exposed directly to Linux (FWIW I believe ACPI essentially > *requires* clocks to be abstracted behind firmware). > > Realistically then, explicit clocks are only expected on systems which care > about power management. We can always revisit that assumption if anything > crazy where it isn't the case ever becomes non-theoretical, but for now it's > one I'm entirely comfortable with. If on the other hand it turns out that we > can rely on just a power domain being present wherever we want RPM, making > clocks moot, then all the better. Alright. Since Qcom would be the only user of clock and power handling for the time being, I think checking power domain presence could work for us. +/- the fact that clocks need to be handled even if power domain is not present, but we should normally always have both. Now we need a way to do the check. Perhaps for the time being it would be enough to just check for the power-domains property in DT? Best regards, Tomasz
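The direction the discussion converges on can be sketched as a user-space model. Runtime PM is enabled at probe only when the node has a "power-domains" property, and the get/put wrappers test the enabled state, where the driver would call `pm_runtime_enabled()`, instead of a per-compatible `rpm_supported` flag. All `model_*` names are hypothetical. The node is modelled as a plain list of property names, and real get/put calls would also resume and suspend the device, which this sketch leaves out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of a DT node: just a list of property names. */
struct model_node {
	const char *const *props;
	size_t nr_props;
};

/* Hypothetical model of the SMMU device's runtime PM state. */
struct model_smmu {
	bool rpm_enabled;   /* what pm_runtime_enabled() would report */
	int usage_count;    /* what get/put would adjust */
};

static bool model_has_prop(const struct model_node *np, const char *name)
{
	for (size_t i = 0; i < np->nr_props; i++)
		if (strcmp(np->props[i], name) == 0)
			return true;
	return false;
}

/* Probe-time decision: enable runtime PM only if a power domain is described. */
static void model_probe(struct model_smmu *smmu, const struct model_node *np)
{
	assert(smmu && np);
	smmu->usage_count = 0;
	smmu->rpm_enabled = model_has_prop(np, "power-domains");
}

/* Fast-path check on the enabled state; no match-data flag involved. */
static int model_rpm_get(struct model_smmu *smmu)
{
	if (smmu->rpm_enabled)
		smmu->usage_count++;
	return 0;
}

static void model_rpm_put(struct model_smmu *smmu)
{
	if (smmu->rpm_enabled)
		smmu->usage_count--;
}
```

With this shape, two otherwise identical SMMU instances can share one compatible string: the one whose node lists "power-domains" takes part in runtime PM, and for the other the wrappers do nothing.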
On Wed, Mar 7, 2018 at 6:17 PM, Robin Murphy <robin.murphy@arm.com> wrote: > On 02/03/18 10:10, Vivek Gautam wrote: >> >> From: Sricharan R <sricharan@codeaurora.org> >> >> Finally add the device link between the master device and >> smmu, so that the smmu gets runtime enabled/disabled only when the >> master needs it. This is done from add_device callback which gets >> called once when the master is added to the smmu. >> >> Signed-off-by: Sricharan R <sricharan@codeaurora.org> >> Signed-off-by: Vivek Gautam <vivek.gautam@codeaurora.org> >> --- >> drivers/iommu/arm-smmu.c | 21 +++++++++++++++++++++ >> 1 file changed, 21 insertions(+) >> >> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c >> index 3d6a1875431f..bb1ea82c1003 100644 >> --- a/drivers/iommu/arm-smmu.c >> +++ b/drivers/iommu/arm-smmu.c >> @@ -217,6 +217,9 @@ struct arm_smmu_device { >> /* IOMMU core code handle */ >> struct iommu_device iommu; >> + >> + /* runtime PM link to master */ >> + struct device_link *link; > > > Just the one? > >> }; >> enum arm_smmu_context_fmt { >> @@ -1470,10 +1473,26 @@ static int arm_smmu_add_device(struct device *dev) >> iommu_device_link(&smmu->iommu, dev); >> + /* >> + * Establish the link between smmu and master, so that the >> + * smmu gets runtime enabled/disabled as per the master's >> + * needs. >> + */ >> + smmu->link = device_link_add(dev, smmu->dev, DL_FLAG_PM_RUNTIME); > > > Maybe I've misunderstood how the API works, but AFAICS the second and > subsequent devices are all just going to overwrite (and leak) the link of > the previous one... Sorry, my bad. Will take care of this. 
regards Vivek > >> + if (!smmu->link) { >> + dev_warn(smmu->dev, "Unable to create device link between >> %s and %s\n", >> + dev_name(smmu->dev), dev_name(dev)); >> + ret = -ENODEV; >> + goto out_unlink; >> + } >> + >> arm_smmu_rpm_put(smmu); >> return 0; >> +out_unlink: >> + iommu_device_unlink(&smmu->iommu, dev); >> + arm_smmu_master_free_smes(fwspec); >> out_rpm_put: >> arm_smmu_rpm_put(smmu); >> out_cfg_free: >> @@ -1496,6 +1515,8 @@ static void arm_smmu_remove_device(struct device >> *dev) >> cfg = fwspec->iommu_priv; >> smmu = cfg->smmu; >> + device_link_del(smmu->link); > > > ...and equivalently you end up with a double-free (or more) here of a link > which may not have belonged to dev anyway. > > Robin. > > >> + >> ret = arm_smmu_rpm_get(smmu); >> if (ret < 0) >> return; >> >
On Wed, Mar 7, 2018 at 5:50 PM, Robin Murphy <robin.murphy@arm.com> wrote: > On 02/03/18 10:10, Vivek Gautam wrote: >> >> If we fail after initializing domain_context, we should destroy >> the context to free up resources. > > > Have another think about why the "problem" this patch caters for cannot ever > happen (hint: consider how domain->smmu is used in > arm_smmu_init_domain_context()). And then also about the really > catastrophically bad problem it actually introduces (hint: > "iommu_attach(domain, good_dev); iommu_attach(domain, bad_dev);") Got it, we would end up destroying good_dev's domain context with this. Thanks regards Vivek > > Robin. > > >> Signed-off-by: Vivek Gautam <vivek.gautam@codeaurora.org> >> --- >> >> * New patch added in this series. >> >> drivers/iommu/arm-smmu.c | 7 ++++++- >> 1 file changed, 6 insertions(+), 1 deletion(-) >> >> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c >> index 69e7c60792a8..ffc152c36002 100644 >> --- a/drivers/iommu/arm-smmu.c >> +++ b/drivers/iommu/arm-smmu.c >> @@ -1223,11 +1223,16 @@ static int arm_smmu_attach_dev(struct iommu_domain >> *domain, struct device *dev) >> dev_err(dev, >> "cannot attach to SMMU %s whilst already attached >> to domain on SMMU %s\n", >> dev_name(smmu_domain->smmu->dev), >> dev_name(smmu->dev)); >> - return -EINVAL; >> + ret = -EINVAL; >> + goto destroy_domain; >> } >> /* Looks ok, so add the device to the domain */ >> return arm_smmu_domain_add_master(smmu_domain, fwspec); >> + >> +destroy_domain: >> + arm_smmu_destroy_domain_context(domain); >> + return ret; >> } >> static int arm_smmu_map(struct iommu_domain *domain, unsigned long >> iova, >> >
On 08/03/18 04:33, Tomasz Figa wrote:
> On Thu, Mar 8, 2018 at 1:58 AM, Robin Murphy <robin.murphy@arm.com> wrote:
>> On 07/03/18 13:52, Tomasz Figa wrote:
>>>
>>> On Wed, Mar 7, 2018 at 9:38 PM, Robin Murphy <robin.murphy@arm.com> wrote:
>>>>
>>>> On 02/03/18 10:10, Vivek Gautam wrote:
>>>>>
>>>>>
>>>>> From: Sricharan R <sricharan@codeaurora.org>
>>>>>
>>>>> The smmu device probe/remove and add/remove master device callbacks
>>>>> gets called when the smmu is not linked to its master, that is without
>>>>> the context of the master device. So calling runtime apis in those
>>>>> places
>>>>> separately.
>>>>>
>>>>> Signed-off-by: Sricharan R <sricharan@codeaurora.org>
>>>>> [vivek: Cleanup pm runtime calls]
>>>>> Signed-off-by: Vivek Gautam <vivek.gautam@codeaurora.org>
>>>>> ---
>>>>>   drivers/iommu/arm-smmu.c | 96
>>>>> ++++++++++++++++++++++++++++++++++++++++++++----
>>>>>   1 file changed, 88 insertions(+), 8 deletions(-)
>>>>>
>>>>> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
>>>>> index c8b16f53f597..3d6a1875431f 100644
>>>>> --- a/drivers/iommu/arm-smmu.c
>>>>> +++ b/drivers/iommu/arm-smmu.c
>>>>> @@ -209,6 +209,8 @@ struct arm_smmu_device {
>>>>>         struct clk_bulk_data            *clks;
>>>>>         int                             num_clks;
>>>>>   +     bool                            rpm_supported;
>>>>> +
>>>>
>>>>
>>>>
>>>> Can we not automatically infer this from whether clocks and/or power
>>>> domains
>>>> are specified or not, then just use pm_runtime_enabled() as the fast-path
>>>> check as Tomasz originally proposed?
>>>
>>>
>>> I wouldn't tie this to presence of clocks, since as a next step we
>>> would want to actually control the clocks separately. (As far as I
>>> understand, on QCom SoCs we might want to have runtime PM active for
>>> the translation to work, but clocks gated whenever access to SMMU
>>> registers is not needed.) Moreover, you might still have some super
>>> high scale thousand-core systems that require clocks to be
>>> prepare-enabled, but runtime PM would be undesirable for the reasons
>>> we discussed before.
>>>
>>>>
>>>> I worry that relying on statically-defined matchdata is just going to
>>>> blow
>>>> up the driver and DT binding into a maintenance nightmare; I really don't
>>>> want to start needing separate definitions for e.g.
>>>> "arm,juno-etr-mmu-401"
>>>> and "arm,juno-hdlcd-mmu-401" just because one otherwise-identical
>>>> instance
>>>> within the SoC is in a separate controllable power domain while the
>>>> others
>>>> aren't.
>>>
>>>
>>> I don't see a reason why both couldn't just have RPM supported
>>> regardless of whether there is a real power domain. It would
>>> effectively be just a no-op for those that don't have one.
>>
>>
>> Because you're then effectively defining "compatible" values for the sake of
>> attaching software policy to them, rather than actually describing different
>> hardware implementations.
>>
>> The fact that RPM can't do anything meaningful unless relevant clock/power
>> aspects *are* described, however, means that we shouldn't need additional
>> information redundant with that. Much like the fact that we don't *already*
>> have an "arm,juno-hdlcd-mmu-401" compatible to account for those being
>> integrated such that IDR0.CTTW has the wrong value, since the presence or
>> not of the "dma-coherent" property already describes the truth in that
>> regard.
>
> Fair enough.
>
>>
>>> IMHO the
>>> only reason to avoid having the RPM enabled is the scalability issue
>>> we discussed before.
>>
>>
>> Yes, but that's kind of my point; in reality high throughput/minimal latency
>> and aggressive power management are more or less mutually exclusive. Mobile
>> SoCs with fine-grained clock trees and power domains won't have multiple
>> 40GBe/NVMf/whatever links running flat out in parallel; conversely
>> networking/infrastructure/server SoCs aren't designed around saving every
>> last microamp of leakage current - even in the (fairly unlikely) case of the
>> interconnect clocks being software-gateable at all I would be very surprised
>> if that were ever exposed directly to Linux (FWIW I believe ACPI essentially
>> *requires* clocks to be abstracted behind firmware).
>>
>> Realistically then, explicit clocks are only expected on systems which care
>> about power management. We can always revisit that assumption if anything
>> crazy where it isn't the case ever becomes non-theoretical, but for now it's
>> one I'm entirely comfortable with. If on the other hand it turns out that we
>> can rely on just a power domain being present wherever we want RPM, making
>> clocks moot, then all the better.
>
> Alright. Since Qcom would be the only user of clock and power handling
> for the time being, I think checking power domain presence could work
> for us. +/- the fact that clocks need to be handled even if power
> domain is not present, but we should normally always have both.

Great! (the issue of Qcom-specific clock handling is a separate argument
which I don't feel like reigniting just now...)

> Now we need a way to do the check. Perhaps for the time being it would
> be enough to just check for the power-domains property in DT?

AFAICS, it might be as simple as arm_smmu_probe() doing this:

	/*
	 * We want to avoid touching dev->power.lock in fastpaths unless
	 * it's really going to do something useful - pm_runtime_enabled()
	 * can serve as an ideal proxy for that decision.
	 */
	if (dev->pm_domain)
		pm_runtime_enable(dev);

or maybe even just gate all the calls with "if (smmu->dev.pm_domain)"
directly (like pcie-mediatek does), but I'm not sure which would be
conceptually cleaner.

Robin.
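The gating idea in the message above can be modelled in userspace to show what the fast path buys. This is a minimal sketch, not kernel code: `struct toy_dev` and the `toy_*` functions are hypothetical, `rpm_enabled` stands in for what pm_runtime_enabled() would report, and `lock_taken` merely counts how often a slow path guarded by something like dev->power.lock would have been entered.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a device; not the kernel's struct device. */
struct toy_dev {
	void *pm_domain;	/* non-NULL when a power domain is attached */
	bool rpm_enabled;	/* stand-in for pm_runtime_enabled() */
	int usage_count;	/* runtime PM usage counter */
	int lock_taken;		/* how many times the slow path was entered */
};

/* Probe-time decision: enable runtime PM only if there is a power domain. */
void toy_probe(struct toy_dev *dev)
{
	assert(dev);
	if (dev->pm_domain)
		dev->rpm_enabled = true;
}

/* Hot-path helper: return early without touching any lock when disabled. */
int toy_rpm_get(struct toy_dev *dev)
{
	assert(dev);
	if (!dev->rpm_enabled)
		return 0;
	dev->lock_taken++;
	dev->usage_count++;
	return 0;
}

void toy_rpm_put(struct toy_dev *dev)
{
	assert(dev);
	if (!dev->rpm_enabled)
		return;
	dev->lock_taken++;
	dev->usage_count--;
}
```

A device probed without a power domain goes through toy_rpm_get()/toy_rpm_put() with lock_taken staying at zero, which is the property wanted for map/unmap on systems that do not care about runtime PM. A device with a power domain pays for the slow path on every get and put.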
On Thu, Mar 8, 2018 at 9:12 PM, Robin Murphy <robin.murphy@arm.com> wrote:
> On 08/03/18 04:33, Tomasz Figa wrote:
>>
>> On Thu, Mar 8, 2018 at 1:58 AM, Robin Murphy <robin.murphy@arm.com> wrote:
>>>
>>> On 07/03/18 13:52, Tomasz Figa wrote:
>>>>
>>>>
>>>> On Wed, Mar 7, 2018 at 9:38 PM, Robin Murphy <robin.murphy@arm.com>
>>>> wrote:
>>>>>
>>>>>
>>>>> On 02/03/18 10:10, Vivek Gautam wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>> From: Sricharan R <sricharan@codeaurora.org>
>>>>>>
>>>>>> The smmu device probe/remove and add/remove master device callbacks
>>>>>> gets called when the smmu is not linked to its master, that is without
>>>>>> the context of the master device. So calling runtime apis in those
>>>>>> places
>>>>>> separately.
>>>>>>
>>>>>> Signed-off-by: Sricharan R <sricharan@codeaurora.org>
>>>>>> [vivek: Cleanup pm runtime calls]
>>>>>> Signed-off-by: Vivek Gautam <vivek.gautam@codeaurora.org>
>>>>>> ---
>>>>>>    drivers/iommu/arm-smmu.c | 96
>>>>>> ++++++++++++++++++++++++++++++++++++++++++++----
>>>>>>    1 file changed, 88 insertions(+), 8 deletions(-)
>>>>>>
>>>>>> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
>>>>>> index c8b16f53f597..3d6a1875431f 100644
>>>>>> --- a/drivers/iommu/arm-smmu.c
>>>>>> +++ b/drivers/iommu/arm-smmu.c
>>>>>> @@ -209,6 +209,8 @@ struct arm_smmu_device {
>>>>>>          struct clk_bulk_data            *clks;
>>>>>>          int                             num_clks;
>>>>>>    +     bool                            rpm_supported;
>>>>>> +
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> Can we not automatically infer this from whether clocks and/or power
>>>>> domains
>>>>> are specified or not, then just use pm_runtime_enabled() as the
>>>>> fast-path
>>>>> check as Tomasz originally proposed?
>>>>
>>>>
>>>>
>>>> I wouldn't tie this to presence of clocks, since as a next step we
>>>> would want to actually control the clocks separately. (As far as I
>>>> understand, on QCom SoCs we might want to have runtime PM active for
>>>> the translation to work, but clocks gated whenever access to SMMU
>>>> registers is not needed.) Moreover, you might still have some super
>>>> high scale thousand-core systems that require clocks to be
>>>> prepare-enabled, but runtime PM would be undesirable for the reasons
>>>> we discussed before.
>>>>
>>>>>
>>>>> I worry that relying on statically-defined matchdata is just going to
>>>>> blow
>>>>> up the driver and DT binding into a maintenance nightmare; I really
>>>>> don't
>>>>> want to start needing separate definitions for e.g.
>>>>> "arm,juno-etr-mmu-401"
>>>>> and "arm,juno-hdlcd-mmu-401" just because one otherwise-identical
>>>>> instance
>>>>> within the SoC is in a separate controllable power domain while the
>>>>> others
>>>>> aren't.
>>>>
>>>>
>>>>
>>>> I don't see a reason why both couldn't just have RPM supported
>>>> regardless of whether there is a real power domain. It would
>>>> effectively be just a no-op for those that don't have one.
>>>
>>>
>>>
>>> Because you're then effectively defining "compatible" values for the sake
>>> of
>>> attaching software policy to them, rather than actually describing
>>> different
>>> hardware implementations.
>>>
>>> The fact that RPM can't do anything meaningful unless relevant
>>> clock/power
>>> aspects *are* described, however, means that we shouldn't need additional
>>> information redundant with that. Much like the fact that we don't
>>> *already*
>>> have an "arm,juno-hdlcd-mmu-401" compatible to account for those being
>>> integrated such that IDR0.CTTW has the wrong value, since the presence or
>>> not of the "dma-coherent" property already describes the truth in that
>>> regard.
>>
>>
>> Fair enough.
>>
>>>
>>>> IMHO the
>>>> only reason to avoid having the RPM enabled is the scalability issue
>>>> we discussed before.
>>>
>>>
>>>
>>> Yes, but that's kind of my point; in reality high throughput/minimal
>>> latency
>>> and aggressive power management are more or less mutually exclusive.
>>> Mobile
>>> SoCs with fine-grained clock trees and power domains won't have multiple
>>> 40GBe/NVMf/whatever links running flat out in parallel; conversely
>>> networking/infrastructure/server SoCs aren't designed around saving every
>>> last microamp of leakage current - even in the (fairly unlikely) case of
>>> the
>>> interconnect clocks being software-gateable at all I would be very
>>> surprised
>>> if that were ever exposed directly to Linux (FWIW I believe ACPI
>>> essentially
>>> *requires* clocks to be abstracted behind firmware).
>>>
>>> Realistically then, explicit clocks are only expected on systems which
>>> care
>>> about power management. We can always revisit that assumption if anything
>>> crazy where it isn't the case ever becomes non-theoretical, but for now
>>> it's
>>> one I'm entirely comfortable with. If on the other hand it turns out that
>>> we
>>> can rely on just a power domain being present wherever we want RPM,
>>> making
>>> clocks moot, then all the better.
>>
>>
>> Alright. Since Qcom would be the only user of clock and power handling
>> for the time being, I think checking power domain presence could work
>> for us. +/- the fact that clocks need to be handled even if power
>> domain is not present, but we should normally always have both.
>
>
> Great! (the issue of Qcom-specific clock handling is a separate argument
> which I don't feel like reigniting just now...)
>
>> Now we need a way to do the check. Perhaps for the time being it would
>> be enough to just check for the power-domains property in DT?
>
>
> AFAICS, it might be as simple as arm_smmu_probe() doing this:
>
>         /*
>          * We want to avoid touching dev->power.lock in fastpaths unless
>          * it's really going to do something useful - pm_runtime_enabled()
>          * can serve as an ideal proxy for that decision.
>          */
>         if (dev->pm_domain)
>                 pm_runtime_enable(dev);
>
> or maybe even just gate all the calls with "if (smmu->dev.pm_domain)"
> directly (like pcie-mediatek does), but I'm not sure which would be
> conceptually cleaner.

Okay, that was easier than I expected. Thanks. :)

Actually, there is one more thing that might need rechecking. Are you
sure that dev->pm_domain is NULL for the devices, for which we don't
want runtime PM to be enabled? I think ACPI was mentioned and ACPI
includes the concept of PM domains.

Best regards,
Tomasz
On Thu, Mar 8, 2018 at 10:29 AM, Vivek Gautam
<vivek.gautam@codeaurora.org> wrote:
> On Wed, Mar 7, 2018 at 6:17 PM, Robin Murphy <robin.murphy@arm.com> wrote:
>> On 02/03/18 10:10, Vivek Gautam wrote:
>>>
>>> From: Sricharan R <sricharan@codeaurora.org>
>>>
>>> Finally add the device link between the master device and
>>> smmu, so that the smmu gets runtime enabled/disabled only when the
>>> master needs it. This is done from add_device callback which gets
>>> called once when the master is added to the smmu.
>>>
>>> Signed-off-by: Sricharan R <sricharan@codeaurora.org>
>>> Signed-off-by: Vivek Gautam <vivek.gautam@codeaurora.org>
>>> ---
>>>   drivers/iommu/arm-smmu.c | 21 +++++++++++++++++++++
>>>   1 file changed, 21 insertions(+)
>>>
>>> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
>>> index 3d6a1875431f..bb1ea82c1003 100644
>>> --- a/drivers/iommu/arm-smmu.c
>>> +++ b/drivers/iommu/arm-smmu.c
>>> @@ -217,6 +217,9 @@ struct arm_smmu_device {
>>>         /* IOMMU core code handle */
>>>         struct iommu_device             iommu;
>>> +
>>> +       /* runtime PM link to master */
>>> +       struct device_link *link;
>>
>>
>> Just the one?

We will either have to count all the devices that are present on the
iommu bus, or
maintain a list to which all the links can be added.
But to add the list, we will have to initialize a LIST_HEAD in struct
device_link
as well.

Or, I think we don't even need to maintain a pointer to link with smmu.
In arm_smmu_remove_device(), we can find out the correct link, and delete it.

    list_for_each_entry(link, &dev->links.suppliers, c_node)
            if (link->supplier == smmu->dev);
                    device_link_del(link);

Should that be fine?

Rafael, does the above snippet look right to you? Context: smmu->dev
is the supplier, and dev is the consumer. We want to find the link,
and delete it.

regards
Vivek

>>
>>>   };
>>>     enum arm_smmu_context_fmt {
>>> @@ -1470,10 +1473,26 @@ static int arm_smmu_add_device(struct device *dev)
>>>         iommu_device_link(&smmu->iommu, dev);
>>>   +     /*
>>> +        * Establish the link between smmu and master, so that the
>>> +        * smmu gets runtime enabled/disabled as per the master's
>>> +        * needs.
>>> +        */
>>> +       smmu->link = device_link_add(dev, smmu->dev, DL_FLAG_PM_RUNTIME);
>>
>>
>> Maybe I've misunderstood how the API works, but AFAICS the second and
>> subsequent devices are all just going to overwrite (and leak) the link of
>> the previous one...
>
> Sorry, my bad. Will take care of this.
>
> regards
> Vivek
>
>>
>>> +       if (!smmu->link) {
>>> +               dev_warn(smmu->dev, "Unable to create device link between
>>> %s and %s\n",
>>> +                        dev_name(smmu->dev), dev_name(dev));
>>> +               ret = -ENODEV;
>>> +               goto out_unlink;
>>> +       }
>>> +
>>>         arm_smmu_rpm_put(smmu);
>>>         return 0;
>>>   +out_unlink:
>>> +       iommu_device_unlink(&smmu->iommu, dev);
>>> +       arm_smmu_master_free_smes(fwspec);
>>>   out_rpm_put:
>>>         arm_smmu_rpm_put(smmu);
>>>   out_cfg_free:
>>> @@ -1496,6 +1515,8 @@ static void arm_smmu_remove_device(struct device
>>> *dev)
>>>         cfg = fwspec->iommu_priv;
>>>         smmu = cfg->smmu;
>>>   +     device_link_del(smmu->link);
>>
>>
>> ...and equivalently you end up with a double-free (or more) here of a link
>> which may not have belonged to dev anyway.
>>
>> Robin.
>>
>>
>>> +
>>>         ret = arm_smmu_rpm_get(smmu);
>>>         if (ret < 0)
>>>                 return;
>>>
>>
>
>
>
> --
> QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
> of Code Aurora Forum, hosted by The Linux Foundation
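The lookup-and-delete idea from the message above can be sketched in plain userspace C. This is a hypothetical model, not kernel code: the `lnk_*` names are invented and a singly linked list stands in for the kernel's list_head. It also reflects two details the proposed loop would need. The `if` must not end in a semicolon, otherwise the delete call is no longer part of the loop body and runs once after the loop. And the walk has to stop (or use a removal-safe iteration) once the matching entry has been freed.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

struct lnk_device;

/* Hypothetical link record; one per (consumer, supplier) pair. */
struct lnk_link {
	struct lnk_device *supplier;
	struct lnk_link *next;
};

struct lnk_device {
	const char *name;
	struct lnk_link *suppliers;	/* links in which this device is the consumer */
};

/* Record that consumer depends on supplier. */
int lnk_add(struct lnk_device *consumer, struct lnk_device *supplier)
{
	struct lnk_link *link;

	assert(consumer && supplier);
	link = malloc(sizeof(*link));
	if (!link)
		return -ENOMEM;
	link->supplier = supplier;
	link->next = consumer->suppliers;
	consumer->suppliers = link;
	return 0;
}

/*
 * Find the consumer's link to the given supplier and delete it.
 * Returns 1 if a link was removed, 0 if none was found.
 */
int lnk_del_by_supplier(struct lnk_device *consumer,
			struct lnk_device *supplier)
{
	struct lnk_link **pp;

	assert(consumer && supplier);
	for (pp = &consumer->suppliers; *pp; pp = &(*pp)->next) {
		struct lnk_link *link = *pp;

		if (link->supplier == supplier) {
			*pp = link->next;	/* unlink before freeing */
			free(link);
			return 1;		/* stop: the entry is gone */
		}
	}
	return 0;
}
```

With this shape the SMMU side keeps no pointer at all: the remove path only needs the consumer and the supplier to locate the right link, and links to other suppliers are left alone.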
On Wed, Mar 7, 2018 at 6:17 PM, Robin Murphy <robin.murphy@arm.com> wrote:
> On 02/03/18 10:10, Vivek Gautam wrote:
>>
>> From: Sricharan R <sricharan@codeaurora.org>
>>
>> Finally add the device link between the master device and
>> smmu, so that the smmu gets runtime enabled/disabled only when the
>> master needs it. This is done from add_device callback which gets
>> called once when the master is added to the smmu.
>>
>> Signed-off-by: Sricharan R <sricharan@codeaurora.org>
>> Signed-off-by: Vivek Gautam <vivek.gautam@codeaurora.org>
>> ---
>>   drivers/iommu/arm-smmu.c | 21 +++++++++++++++++++++
>>   1 file changed, 21 insertions(+)
>>
>> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
>> index 3d6a1875431f..bb1ea82c1003 100644
>> --- a/drivers/iommu/arm-smmu.c
>> +++ b/drivers/iommu/arm-smmu.c
>> @@ -217,6 +217,9 @@ struct arm_smmu_device {
>>         /* IOMMU core code handle */
>>         struct iommu_device             iommu;
>> +
>> +       /* runtime PM link to master */
>> +       struct device_link *link;
>
>
> Just the one?
>
>>   };
>>     enum arm_smmu_context_fmt {
>> @@ -1470,10 +1473,26 @@ static int arm_smmu_add_device(struct device *dev)
>>         iommu_device_link(&smmu->iommu, dev);
>>   +     /*
>> +        * Establish the link between smmu and master, so that the
>> +        * smmu gets runtime enabled/disabled as per the master's
>> +        * needs.
>> +        */
>> +       smmu->link = device_link_add(dev, smmu->dev, DL_FLAG_PM_RUNTIME);
>
>
> Maybe I've misunderstood how the API works, but AFAICS the second and
> subsequent devices are all just going to overwrite (and leak) the link of
> the previous one...

Also, I noticed one more thing while testing on sdm845.
When we enable runtime PM conditionally, we should also create the
device link conditionally, i.e. only when smmu->dev has runtime PM
enabled should we create the device link between the smmu and the
master device.

Otherwise, when the master does a pm_runtime_get() on itself, the device
link ensures that pm_runtime_get() is done for the smmu first.
That fails when runtime PM is not enabled for the smmu, and so the
master device's pm_runtime_get() fails too.

Will fix this in the next version.

Thanks
Vivek

>
>> +       if (!smmu->link) {
>> +               dev_warn(smmu->dev, "Unable to create device link between
>> %s and %s\n",
>> +                        dev_name(smmu->dev), dev_name(dev));
>> +               ret = -ENODEV;
>> +               goto out_unlink;
>> +       }
>> +
>>         arm_smmu_rpm_put(smmu);
>>         return 0;
>>   +out_unlink:
>> +       iommu_device_unlink(&smmu->iommu, dev);
>> +       arm_smmu_master_free_smes(fwspec);
>>   out_rpm_put:
>>         arm_smmu_rpm_put(smmu);
>>   out_cfg_free:
>> @@ -1496,6 +1515,8 @@ static void arm_smmu_remove_device(struct device
>> *dev)
>>         cfg = fwspec->iommu_priv;
>>         smmu = cfg->smmu;
>>   +     device_link_del(smmu->link);
>
>
> ...and equivalently you end up with a double-free (or more) here of a link
> which may not have belonged to dev anyway.
>
> Robin.
>
>
>> +
>>         ret = arm_smmu_rpm_get(smmu);
>>         if (ret < 0)
>>                 return;
>>
>
On 09/03/18 07:11, Vivek Gautam wrote:
> On Thu, Mar 8, 2018 at 10:29 AM, Vivek Gautam
> <vivek.gautam@codeaurora.org> wrote:
>> On Wed, Mar 7, 2018 at 6:17 PM, Robin Murphy <robin.murphy@arm.com> wrote:
>>> On 02/03/18 10:10, Vivek Gautam wrote:
>>>>
>>>> From: Sricharan R <sricharan@codeaurora.org>
>>>>
>>>> Finally add the device link between the master device and
>>>> smmu, so that the smmu gets runtime enabled/disabled only when the
>>>> master needs it. This is done from add_device callback which gets
>>>> called once when the master is added to the smmu.
>>>>
>>>> Signed-off-by: Sricharan R <sricharan@codeaurora.org>
>>>> Signed-off-by: Vivek Gautam <vivek.gautam@codeaurora.org>
>>>> ---
>>>>    drivers/iommu/arm-smmu.c | 21 +++++++++++++++++++++
>>>>    1 file changed, 21 insertions(+)
>>>>
>>>> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
>>>> index 3d6a1875431f..bb1ea82c1003 100644
>>>> --- a/drivers/iommu/arm-smmu.c
>>>> +++ b/drivers/iommu/arm-smmu.c
>>>> @@ -217,6 +217,9 @@ struct arm_smmu_device {
>>>>          /* IOMMU core code handle */
>>>>          struct iommu_device             iommu;
>>>> +
>>>> +       /* runtime PM link to master */
>>>> +       struct device_link *link;
>>>
>>>
>>> Just the one?
>
> we will either have to count all the devices that are present on the
> iommu bus, or
> maintain a list to which all the links can be added.
> But to add the list, we will have to initialize a LIST_HEAD in struct
> device_link
> as well.
>
> Or, I think we don't even need to maintain a pointer to link with smmu.
> In arm_smmu_remove_device(), we can find out the correct link, and delete it.
>
>     list_for_each_entry(link, &dev->links.suppliers, c_node)
>             if (link->supplier == smmu->dev);
>                     device_link_del(link);
>
> Should that be fine?
>
> Rafael, does the above snippet looks right to you? Context: smmu->dev
> is the supplier, and dev is the consumer. We want to find the link,
> and delete it.

Actually, looking at the existing code, it seems like device_link_add()
will in fact look up and return any existing link between a given
supplier and consumer - is that intentional API behaviour that users may
rely on to avoid keeping track of explicit link pointers? (or
conversely, might it be reasonable to factor out a device_link_find()
function?)

Robin.
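The behaviour Robin describes, an add that returns the already existing link for a given supplier and consumer, together with the factored-out find he suggests, can be sketched as a userspace model. This is an illustration only: the `dl_*` names are hypothetical, the list handling is simplified, and nothing here claims to match what the driver core actually guarantees.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct dl_device;

/* Hypothetical link record between one consumer and one supplier. */
struct dl_link {
	struct dl_device *supplier;
	struct dl_device *consumer;
	struct dl_link *next;
};

struct dl_device {
	const char *name;
	struct dl_link *suppliers;	/* links in which this device is the consumer */
};

/* Return the existing link between consumer and supplier, or NULL. */
struct dl_link *dl_find(struct dl_device *consumer, struct dl_device *supplier)
{
	struct dl_link *link;

	assert(consumer && supplier);
	for (link = consumer->suppliers; link; link = link->next)
		if (link->supplier == supplier)
			return link;
	return NULL;
}

/* Create a link, or hand back the one that already exists for this pair. */
struct dl_link *dl_add(struct dl_device *consumer, struct dl_device *supplier)
{
	struct dl_link *link = dl_find(consumer, supplier);

	if (link)
		return link;

	link = malloc(sizeof(*link));
	if (!link)
		return NULL;
	link->supplier = supplier;
	link->consumer = consumer;
	link->next = consumer->suppliers;
	consumer->suppliers = link;
	return link;
}
```

In this model two masters behind the same SMMU get two distinct links, so a single pointer in the SMMU structure could only ever remember one of them, while a repeated add for the same pair returns the same link and nothing needs to be tracked by the caller.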
[ +Lorenzo ]

On 09/03/18 04:50, Tomasz Figa wrote:
[...]
>>> Now we need a way to do the check. Perhaps for the time being it would
>>> be enough to just check for the power-domains property in DT?
>>
>>
>> AFAICS, it might be as simple as arm_smmu_probe() doing this:
>>
>>          /*
>>           * We want to avoid touching dev->power.lock in fastpaths unless
>>           * it's really going to do something useful - pm_runtime_enabled()
>>           * can serve as an ideal proxy for that decision.
>>           */
>>          if (dev->pm_domain)
>>                  pm_runtime_enable(dev);
>>
>> or maybe even just gate all the calls with "if (smmu->dev.pm_domain)"
>> directly (like pcie-mediatek does), but I'm not sure which would be
>> conceptually cleaner.
>
> Okay, that was easier than I expected. Thanks. :)
>
> Actually, there is one more thing that might need rechecking. Are you
> sure that dev->pm_domain is NULL for the devices, for which we don't
> want runtime PM to be enabled? I think ACPI was mentioned and ACPI
> includes the concept of PM domains.

Thanks for pointing that out - thankfully, I've confirmed that the SMMUs
on my Juno don't have dev->pm_domain set when booting with ACPI, and
double-checking the ACPI code I think we're OK here. Since the SMMUs are
only described in the static IORT table and not in the ACPI namespace,
they won't have the ACPI companion device that acpi_dev_pm_attach()
looks for, and thus should always be ignored. Lorenzo, do I have that
right?

Robin.
On Fri, Mar 9, 2018 at 6:04 PM, Robin Murphy <robin.murphy@arm.com> wrote:
> On 09/03/18 07:11, Vivek Gautam wrote:
>>
>> On Thu, Mar 8, 2018 at 10:29 AM, Vivek Gautam
>> <vivek.gautam@codeaurora.org> wrote:
>>>
>>> On Wed, Mar 7, 2018 at 6:17 PM, Robin Murphy <robin.murphy@arm.com>
>>> wrote:
>>>>
>>>> On 02/03/18 10:10, Vivek Gautam wrote:
>>>>>
>>>>>
>>>>> From: Sricharan R <sricharan@codeaurora.org>
>>>>>
>>>>> Finally add the device link between the master device and
>>>>> smmu, so that the smmu gets runtime enabled/disabled only when the
>>>>> master needs it. This is done from add_device callback which gets
>>>>> called once when the master is added to the smmu.
>>>>>
>>>>> Signed-off-by: Sricharan R <sricharan@codeaurora.org>
>>>>> Signed-off-by: Vivek Gautam <vivek.gautam@codeaurora.org>
>>>>> ---
>>>>>    drivers/iommu/arm-smmu.c | 21 +++++++++++++++++++++
>>>>>    1 file changed, 21 insertions(+)
>>>>>
>>>>> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
>>>>> index 3d6a1875431f..bb1ea82c1003 100644
>>>>> --- a/drivers/iommu/arm-smmu.c
>>>>> +++ b/drivers/iommu/arm-smmu.c
>>>>> @@ -217,6 +217,9 @@ struct arm_smmu_device {
>>>>>          /* IOMMU core code handle */
>>>>>          struct iommu_device             iommu;
>>>>> +
>>>>> +       /* runtime PM link to master */
>>>>> +       struct device_link *link;
>>>>
>>>>
>>>>
>>>> Just the one?
>>
>>
>> we will either have to count all the devices that are present on the
>> iommu bus, or
>> maintain a list to which all the links can be added.
>> But to add the list, we will have to initialize a LIST_HEAD in struct
>> device_link
>> as well.
>>
>> Or, I think we don't even need to maintain a pointer to link with smmu.
>> In arm_smmu_remove_device(), we can find out the correct link, and delete
>> it.
>>
>>      list_for_each_entry(link, &dev->links.suppliers, c_node)
>>              if (link->supplier == smmu->dev);
>>                      device_link_del(link);
>>
>> Should that be fine?
>>
>> Rafael, does the above snippet looks right to you? Context: smmu->dev
>> is the supplier, and dev is the consumer. We want to find the link,
>> and delete it.
>
>
> Actually, looking at the existing code, it seems like device_link_add() will
> in fact look up and return any existing link between a given supplier and
> consumer - is that intentional API behaviour that users may rely on to avoid
> keeping track of explicit link pointers?

> (or conversely, might it be
> reasonable to factor out a device_link_find() function?)

Yea, that sounds better.

regards
Vivek

>
> Robin.
>