Message ID | 20150221190050.GA20184@linux.vnet.ibm.com (mailing list archive) |
---|---|
State | Accepted |
Delegated to: | Michael Ellerman |
On Sat, 2015-21-02 at 19:00:50 UTC, Nishanth Aravamudan wrote:
> On 20.02.2015 [15:31:29 +1100], Michael Ellerman wrote:
> > On Thu, 2015-02-19 at 10:41 -0800, Nishanth Aravamudan wrote:
> > > After d905c5df9aef ("PPC: POWERNV: move iommu_add_device earlier"), the
> > > refcnt on the kobject backing the IOMMU group for a PCI device is
> > > elevated by each call to pci_dma_dev_setup_pSeriesLP() (via
> > > set_iommu_table_base_and_group). When we go to dlpar a multi-function
> > > PCI device out:
> > >
> > >     iommu_reconfig_notifier ->
> > >         iommu_free_table ->
> > >             iommu_group_put
> > >             BUG_ON(tbl->it_group)
> > >
> > > We trip this BUG_ON, because there are still references on the table, so
> > > it is not freed. Fix this by also adding a bus notifier identical to
> > > PowerNV for pSeries.
> >
> > Please put it somewhere common, arch/powerpc/kernel/iommu.c perhaps, and just
> > add a second machine_init_call() for pseries.
>
> How does this look? Only compile-tested with CONFIG_IOMMU_API on/off so
> far, waiting for access to the test LPAR (should have it on Monday).

Yeah that looks better, thanks.

It probably doesn't build with CONFIG_PCI=n though, but I don't think
CONFIG_PCI=n builds anyway.

cheers
On 23.02.2015 [13:27:24 +1100], Michael Ellerman wrote:
> On Sat, 2015-21-02 at 19:00:50 UTC, Nishanth Aravamudan wrote:
> > On 20.02.2015 [15:31:29 +1100], Michael Ellerman wrote:
> > > On Thu, 2015-02-19 at 10:41 -0800, Nishanth Aravamudan wrote:
> > > > After d905c5df9aef ("PPC: POWERNV: move iommu_add_device earlier"), the
> > > > refcnt on the kobject backing the IOMMU group for a PCI device is
> > > > elevated by each call to pci_dma_dev_setup_pSeriesLP() (via
> > > > set_iommu_table_base_and_group). When we go to dlpar a multi-function
> > > > PCI device out:
> > > >
> > > >     iommu_reconfig_notifier ->
> > > >         iommu_free_table ->
> > > >             iommu_group_put
> > > >             BUG_ON(tbl->it_group)
> > > >
> > > > We trip this BUG_ON, because there are still references on the table, so
> > > > it is not freed. Fix this by also adding a bus notifier identical to
> > > > PowerNV for pSeries.
> > >
> > > Please put it somewhere common, arch/powerpc/kernel/iommu.c perhaps, and just
> > > add a second machine_init_call() for pseries.
> >
> > How does this look? Only compile-tested with CONFIG_IOMMU_API on/off so
> > far, waiting for access to the test LPAR (should have it on Monday).
>
> Yeah that looks better, thanks.
>
> It probably doesn't build with CONFIG_PCI=n though, but I don't think
> CONFIG_PCI=n builds anyway.

Indeed it doesn't. Started looking at CONFIG_PCI=n and immediately hit
the following:

PCI_MSI depends on PCI

PCI can be manually turned off

PSERIES (and a bunch of other platforms) select PCI_MSI

So you end up with PCI_MSI on and PCI off and the build breaks.

Should the platforms depend on PCI_MSI instead? Per the Documentation:

"
select should be used with care. select will force a symbol to a value
without visiting the dependencies. By abusing select you are able to
select a symbol FOO even if FOO depends on BAR that is not set."

Thanks,
Nish
On 21.02.2015 [11:00:50 -0800], Nishanth Aravamudan wrote:
> On 20.02.2015 [15:31:29 +1100], Michael Ellerman wrote:
> > On Thu, 2015-02-19 at 10:41 -0800, Nishanth Aravamudan wrote:
> > > After d905c5df9aef ("PPC: POWERNV: move iommu_add_device earlier"), the
> > > refcnt on the kobject backing the IOMMU group for a PCI device is
> > > elevated by each call to pci_dma_dev_setup_pSeriesLP() (via
> > > set_iommu_table_base_and_group). When we go to dlpar a multi-function
> > > PCI device out:
> > >
> > >     iommu_reconfig_notifier ->
> > >         iommu_free_table ->
> > >             iommu_group_put
> > >             BUG_ON(tbl->it_group)
> > >
> > > We trip this BUG_ON, because there are still references on the table, so
> > > it is not freed. Fix this by also adding a bus notifier identical to
> > > PowerNV for pSeries.
> >
> > Please put it somewhere common, arch/powerpc/kernel/iommu.c perhaps, and just
> > add a second machine_init_call() for pseries.
>
> How does this look? Only compile-tested with CONFIG_IOMMU_API on/off so
> far, waiting for access to the test LPAR (should have it on Monday).
>
>
> After d905c5df9aef ("PPC: POWERNV: move iommu_add_device earlier"), the
> refcnt on the kobject backing the IOMMU group for a PCI device is
> elevated by each call to pci_dma_dev_setup_pSeriesLP() (via
> set_iommu_table_base_and_group). When we go to dlpar a multi-function
> PCI device out:
>
>     iommu_reconfig_notifier ->
>         iommu_free_table ->
>             iommu_group_put
>             BUG_ON(tbl->it_group)
>
> We trip this BUG_ON, because there are still references on the table, so
> it is not freed. Fix this by moving the PowerNV bus notifier to common
> code and calling it for both PowerNV and pSeries.

Survived a remove -> add -> remove cycle, which always resulted in the
BUG_ON without the change.

> Fixes: d905c5df9aef ("PPC: POWERNV: move iommu_add_device earlier")
> Signed-off-by: Nishanth Aravamudan <nacc@linux.vnet.ibm.com>
> Cc: stable@kernel.org (3.13+)

Tested-by: Nishanth Aravamudan <nacc@linux.vnet.ibm.com>
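The reference imbalance described in the commit message can be sketched in
user space. This is only a toy model, not kernel code: every toy_* name is
hypothetical, the group starts with a single reference standing in for the
one the table holds, and toy_add_device()/toy_del_device() merely imitate
the get/put pairing of iommu_add_device()/iommu_del_device(). The point it
illustrates is that the final put releases the group only if each device
dropped its reference first, which is what the bus notifier's
BUS_NOTIFY_DEL_DEVICE case arranges.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_FUNCS 8	/* a PCI device has at most 8 functions */

/* Toy stand-in for the kobject refcount behind an IOMMU group. */
struct toy_group {
	int refcount;
};

struct toy_dev {
	struct toy_group *group;	/* NULL until the device joins a group */
};

/* Imitates iommu_add_device(): the device takes one group reference. */
static void toy_add_device(struct toy_dev *dev, struct toy_group *grp)
{
	grp->refcount++;
	dev->group = grp;
}

/* Imitates iommu_del_device(): drops the reference taken at add time. */
static void toy_del_device(struct toy_dev *dev)
{
	if (!dev->group)
		return;
	dev->group->refcount--;
	dev->group = NULL;
}

/*
 * Imitates the put in iommu_free_table(): drop the table's own reference
 * and report whether the group was actually released (count reached zero).
 */
static bool toy_put_table_ref(struct toy_group *grp)
{
	return --grp->refcount == 0;
}

/*
 * Add nr_functions devices, remove them, then free the table.
 * Returns true if the group was released, false if references leaked
 * (the situation in which the real code hits BUG_ON(tbl->it_group)).
 */
bool toy_hotplug_cycle(int nr_functions, bool notifier_registered)
{
	struct toy_group grp = { .refcount = 1 };	/* table's reference */
	struct toy_dev devs[TOY_MAX_FUNCS] = { { NULL } };
	int i;

	if (nr_functions > TOY_MAX_FUNCS)
		nr_functions = TOY_MAX_FUNCS;

	for (i = 0; i < nr_functions; i++)
		toy_add_device(&devs[i], &grp);

	/* The delete path only runs if a bus notifier was registered. */
	if (notifier_registered)
		for (i = 0; i < nr_functions; i++)
			toy_del_device(&devs[i]);

	return toy_put_table_ref(&grp);
}
```

In this model toy_hotplug_cycle(2, false) leaves two references behind, so
the group is not released, while toy_hotplug_cycle(2, true) balances every
add with a del and the last put releases it.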
On Mon, 2015-02-23 at 10:54 -0800, Nishanth Aravamudan wrote:
> On 23.02.2015 [13:27:24 +1100], Michael Ellerman wrote:
> > On Sat, 2015-21-02 at 19:00:50 UTC, Nishanth Aravamudan wrote:
> > > On 20.02.2015 [15:31:29 +1100], Michael Ellerman wrote:
> > > > On Thu, 2015-02-19 at 10:41 -0800, Nishanth Aravamudan wrote:
> > > > > After d905c5df9aef ("PPC: POWERNV: move iommu_add_device earlier"), the
> > > > > refcnt on the kobject backing the IOMMU group for a PCI device is
> > > > > elevated by each call to pci_dma_dev_setup_pSeriesLP() (via
> > > > > set_iommu_table_base_and_group). When we go to dlpar a multi-function
> > > > > PCI device out:
> > > > >
> > > > >     iommu_reconfig_notifier ->
> > > > >         iommu_free_table ->
> > > > >             iommu_group_put
> > > > >             BUG_ON(tbl->it_group)
> > > > >
> > > > > We trip this BUG_ON, because there are still references on the table, so
> > > > > it is not freed. Fix this by also adding a bus notifier identical to
> > > > > PowerNV for pSeries.
> > > >
> > > > Please put it somewhere common, arch/powerpc/kernel/iommu.c perhaps, and just
> > > > add a second machine_init_call() for pseries.
> > >
> > > How does this look? Only compile-tested with CONFIG_IOMMU_API on/off so
> > > far, waiting for access to the test LPAR (should have it on Monday).
> >
> > Yeah that looks better, thanks.
> >
> > It probably doesn't build with CONFIG_PCI=n though, but I don't think
> > CONFIG_PCI=n builds anyway.
>
> Indeed it doesn't. Started looking at CONFIG_PCI=n and immediately hit
> the following:
>
> PCI_MSI depends on PCI
>
> PCI can be manually turned off
>
> PSERIES (and a bunch of other platforms) select PCI_MSI
>
> So you end up with PCI_MSI on and PCI off and the build breaks.
>
> Should the platforms depend on PCI_MSI instead?

No, they don't depend on it, they would just like it if PCI is enabled.

That can be fixed fairly easily by making it:

config PSERIES
	select PCI_MSI if PCI

But you then discover that there are ten other places where the build breaks
for PCI=n.

I'm starting to think we should just force PCI on for PSERIES and be done with
it, we could all spend less of our time chasing build breaks for configurations
no one actually cares about in practice (ie. PSERIES=y PCI=n).

cheers
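The difference between a plain select and "select PCI_MSI if PCI" can be
illustrated with a small truth-table model. This is a sketch under stated
assumptions, not how Kconfig resolves symbols internally: the toy_* names
are hypothetical, only the three symbols from the discussion are modelled,
and PCI_MSI is assumed to be set by the select alone (no manual choice).

```c
#include <assert.h>
#include <stdbool.h>

/* Toy snapshot of the three symbols discussed in the thread. */
struct toy_config {
	bool pci;
	bool pseries;
	bool pci_msi;
};

/* "select PCI_MSI": forced on whenever PSERIES is on; PCI is not consulted. */
struct toy_config toy_resolve_plain_select(bool pci, bool pseries)
{
	struct toy_config c = { .pci = pci, .pseries = pseries };

	c.pci_msi = pseries;
	return c;
}

/* "select PCI_MSI if PCI": the select only fires when PCI is also on. */
struct toy_config toy_resolve_conditional_select(bool pci, bool pseries)
{
	struct toy_config c = { .pci = pci, .pseries = pseries };

	c.pci_msi = pseries && pci;
	return c;
}

/* PCI_MSI depends on PCI, so PCI_MSI=y with PCI=n is inconsistent. */
bool toy_config_is_consistent(const struct toy_config *c)
{
	return !c->pci_msi || c->pci;
}
```

With PSERIES=y and PCI=n, the plain select yields the inconsistent
PCI_MSI=y, whereas the conditional form leaves PCI_MSI off. As noted above,
that only removes this one breakage; the other PCI=n build failures are
outside what the model covers.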
diff -urpN linux-3.19/arch/powerpc/include/asm/iommu.h linux-3.19-dev/arch/powerpc/include/asm/iommu.h
--- linux-3.19/arch/powerpc/include/asm/iommu.h	2015-02-08 18:54:22.000000000 -0800
+++ linux-3.19-dev/arch/powerpc/include/asm/iommu.h	2015-02-21 09:03:55.960995053 -0800
@@ -113,6 +113,7 @@ extern void iommu_register_group(struct
 				 int pci_domain_number, unsigned long pe_num);
 extern int iommu_add_device(struct device *dev);
 extern void iommu_del_device(struct device *dev);
+extern int __init tce_iommu_bus_notifier_init(void);
 #else
 static inline void iommu_register_group(struct iommu_table *tbl,
 					int pci_domain_number,
@@ -128,6 +129,11 @@ static inline int iommu_add_device(struc
 static inline void iommu_del_device(struct device *dev)
 {
 }
+
+static inline int __init tce_iommu_bus_notifier_init(void)
+{
+	return 0;
+}
 #endif /* !CONFIG_IOMMU_API */
 
 static inline void set_iommu_table_base_and_group(struct device *dev,
diff -urpN linux-3.19/arch/powerpc/kernel/iommu.c linux-3.19-dev/arch/powerpc/kernel/iommu.c
--- linux-3.19/arch/powerpc/kernel/iommu.c	2015-02-08 18:54:22.000000000 -0800
+++ linux-3.19-dev/arch/powerpc/kernel/iommu.c	2015-02-20 17:50:19.229927080 -0800
@@ -1175,4 +1175,30 @@ void iommu_del_device(struct device *dev
 }
 EXPORT_SYMBOL_GPL(iommu_del_device);
 
+static int tce_iommu_bus_notifier(struct notifier_block *nb,
+		unsigned long action, void *data)
+{
+	struct device *dev = data;
+
+	switch (action) {
+	case BUS_NOTIFY_ADD_DEVICE:
+		return iommu_add_device(dev);
+	case BUS_NOTIFY_DEL_DEVICE:
+		if (dev->iommu_group)
+			iommu_del_device(dev);
+		return 0;
+	default:
+		return 0;
+	}
+}
+
+static struct notifier_block tce_iommu_bus_nb = {
+	.notifier_call = tce_iommu_bus_notifier,
+};
+
+int __init tce_iommu_bus_notifier_init(void)
+{
+	bus_register_notifier(&pci_bus_type, &tce_iommu_bus_nb);
+	return 0;
+}
 #endif /* CONFIG_IOMMU_API */
diff -urpN linux-3.19/arch/powerpc/platforms/powernv/pci.c linux-3.19-dev/arch/powerpc/platforms/powernv/pci.c
--- linux-3.19/arch/powerpc/platforms/powernv/pci.c	2015-02-08 18:54:22.000000000 -0800
+++ linux-3.19-dev/arch/powerpc/platforms/powernv/pci.c	2015-02-20 17:50:33.917927464 -0800
@@ -866,30 +866,4 @@ void __init pnv_pci_init(void)
 #endif
 }
 
-static int tce_iommu_bus_notifier(struct notifier_block *nb,
-		unsigned long action, void *data)
-{
-	struct device *dev = data;
-
-	switch (action) {
-	case BUS_NOTIFY_ADD_DEVICE:
-		return iommu_add_device(dev);
-	case BUS_NOTIFY_DEL_DEVICE:
-		if (dev->iommu_group)
-			iommu_del_device(dev);
-		return 0;
-	default:
-		return 0;
-	}
-}
-
-static struct notifier_block tce_iommu_bus_nb = {
-	.notifier_call = tce_iommu_bus_notifier,
-};
-
-static int __init tce_iommu_bus_notifier_init(void)
-{
-	bus_register_notifier(&pci_bus_type, &tce_iommu_bus_nb);
-	return 0;
-}
 machine_subsys_initcall_sync(powernv, tce_iommu_bus_notifier_init);
diff -urpN linux-3.19/arch/powerpc/platforms/pseries/iommu.c linux-3.19-dev/arch/powerpc/platforms/pseries/iommu.c
--- linux-3.19/arch/powerpc/platforms/pseries/iommu.c	2015-02-08 18:54:22.000000000 -0800
+++ linux-3.19-dev/arch/powerpc/platforms/pseries/iommu.c	2015-02-20 17:51:23.265928866 -0800
@@ -1340,3 +1340,5 @@ static int __init disable_multitce(char
 }
 
 __setup("multitce=", disable_multitce);
+
+machine_subsys_initcall_sync(pseries, tce_iommu_bus_notifier_init);