diff mbox series

[RFC,2/2] pci: create device tree node for selected devices

Message ID 1661809417-11370-3-git-send-email-lizhi.hou@amd.com
State New
Headers show
Series Generate device tree node for pci devices | expand

Commit Message

Lizhi Hou Aug. 29, 2022, 9:43 p.m. UTC
The PCI endpoint device such as Xilinx Alveo PCI card maps the register
spaces from multiple hardware peripherals to its PCI BAR. Normally,
the PCI core discovers devices and BARs using the PCI enumeration process.
And the process does not provide a way to discover the hardware peripherals
been mapped to PCI BARs. For Alveo PCI card, the card firmware provides a
flattened device tree to describe the hardware peripherals on its BARs.
And the Alveo card driver can load this flattened device tree and leverage
device tree framework to generate platform devices for the hardware
peripherals eventually.

Apparently, the device tree framework requires a device tree node for the
PCI device. Thus, it can generate the device tree nodes for hardware
peripherals underneath. Because PCI is self discoverable bus, there might
not be a device tree node created for PCI devices. This patch is to add
support to generate device tree node for PCI devices. It introduces a
kernel option. When the option is turned on, the kernel will generate
device tree nodes for PCI bridges unconditionally. It will also generate
a device tree node for Xilinx Alveo U50 by using PCI quirks. The generated
device tree nodes do not have any property. The future patches will add
necessary properties.

Signed-off-by: Lizhi Hou <lizhi.hou@amd.com>
Signed-off-by: Sonal Santan <sonal.santan@amd.com>
Signed-off-by: Max Zhen <max.zhen@amd.com>
Signed-off-by: Brian Xu <brian.xu@amd.com>
---
 drivers/pci/Kconfig         |  11 ++++
 drivers/pci/bus.c           |   2 +
 drivers/pci/msi/irqdomain.c |   6 +-
 drivers/pci/of.c            | 106 ++++++++++++++++++++++++++++++++++++
 drivers/pci/pci-driver.c    |   3 +-
 drivers/pci/pci.h           |  16 ++++++
 drivers/pci/quirks.c        |  11 ++++
 drivers/pci/remove.c        |   1 +
 8 files changed, 153 insertions(+), 3 deletions(-)

Comments

Rob Herring (Arm) Sept. 2, 2022, 6:54 p.m. UTC | #1
On Mon, Aug 29, 2022 at 02:43:37PM -0700, Lizhi Hou wrote:
> The PCI endpoint device such as Xilinx Alveo PCI card maps the register
> spaces from multiple hardware peripherals to its PCI BAR. Normally,
> the PCI core discovers devices and BARs using the PCI enumeration process.
> And the process does not provide a way to discover the hardware peripherals
> been mapped to PCI BARs.

This sentence doesn't make sense.

> For Alveo PCI card, the card firmware provides a
> flattened device tree to describe the hardware peripherals on its BARs.
> And the Alveo card driver can load this flattened device tree and leverage
> device tree framework to generate platform devices for the hardware
> peripherals eventually.
> 
> Apparently, the device tree framework requires a device tree node for the
> PCI device. Thus, it can generate the device tree nodes for hardware
> peripherals underneath. Because PCI is self discoverable bus, there might
> not be a device tree node created for PCI devices. This patch is to add
> support to generate device tree node for PCI devices. It introduces a
> kernel option. When the option is turned on, the kernel will generate
> device tree nodes for PCI bridges unconditionally. It will also generate
> a device tree node for Xilinx Alveo U50 by using PCI quirks. The generated
> device tree nodes do not have any property. The future patches will add
> necessary properties.
> 
> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com>
> Signed-off-by: Sonal Santan <sonal.santan@amd.com>
> Signed-off-by: Max Zhen <max.zhen@amd.com>
> Signed-off-by: Brian Xu <brian.xu@amd.com>
> ---
>  drivers/pci/Kconfig         |  11 ++++
>  drivers/pci/bus.c           |   2 +
>  drivers/pci/msi/irqdomain.c |   6 +-
>  drivers/pci/of.c            | 106 ++++++++++++++++++++++++++++++++++++
>  drivers/pci/pci-driver.c    |   3 +-
>  drivers/pci/pci.h           |  16 ++++++
>  drivers/pci/quirks.c        |  11 ++++
>  drivers/pci/remove.c        |   1 +
>  8 files changed, 153 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/pci/Kconfig b/drivers/pci/Kconfig
> index 55c028af4bd9..9eca5420330b 100644
> --- a/drivers/pci/Kconfig
> +++ b/drivers/pci/Kconfig
> @@ -198,6 +198,17 @@ config PCI_HYPERV
>  	  The PCI device frontend driver allows the kernel to import arbitrary
>  	  PCI devices from a PCI backend to support PCI driver domains.
>  
> +config PCI_OF

We already have OF_PCI so this is confusing. Maybe 
'PCI_DYNAMIC_OF_NODES'.


> +	bool "Device tree node for PCI devices"
> +	select OF_DYNAMIC
> +	help
> +	  This option enables support for generating device tree nodes for some
> +	  PCI devices. Thus, the driver of this kind can load and overlay
> +	  flattened device tree for its downstream devices.
> +
> +	  Once this option is selected, the device tree nodes will be generated
> +	  for all PCI/PCIE bridges.
> +
>  choice
>  	prompt "PCI Express hierarchy optimization setting"
>  	default PCIE_BUS_DEFAULT
> diff --git a/drivers/pci/bus.c b/drivers/pci/bus.c
> index 3cef835b375f..f752b804ad1f 100644
> --- a/drivers/pci/bus.c
> +++ b/drivers/pci/bus.c
> @@ -316,6 +316,8 @@ void pci_bus_add_device(struct pci_dev *dev)
>  	 */
>  	pcibios_bus_add_device(dev);
>  	pci_fixup_device(pci_fixup_final, dev);
> +	if (dev->hdr_type == PCI_HEADER_TYPE_BRIDGE)

Would pci_is_bridge() work here? That would include cardbus, but I think 
that won't matter in practice.

> +		of_pci_make_dev_node(dev);


>  	pci_create_sysfs_dev_files(dev);
>  	pci_proc_attach_device(dev);
>  	pci_bridge_d3_update(dev);
> diff --git a/drivers/pci/msi/irqdomain.c b/drivers/pci/msi/irqdomain.c
> index e9cf318e6670..eeaf44169bfd 100644
> --- a/drivers/pci/msi/irqdomain.c
> +++ b/drivers/pci/msi/irqdomain.c
> @@ -230,8 +230,10 @@ u32 pci_msi_domain_get_msi_rid(struct irq_domain *domain, struct pci_dev *pdev)
>  	pci_for_each_dma_alias(pdev, get_msi_id_cb, &rid);
>  
>  	of_node = irq_domain_get_of_node(domain);
> -	rid = of_node ? of_msi_map_id(&pdev->dev, of_node, rid) :
> -			iort_msi_map_id(&pdev->dev, rid);
> +	if (of_node && !of_node_check_flag(of_node, OF_DYNAMIC))
> +		rid = of_msi_map_id(&pdev->dev, of_node, rid);
> +	else
> +		rid = iort_msi_map_id(&pdev->dev, rid);
>  
>  	return rid;
>  }
> diff --git a/drivers/pci/of.c b/drivers/pci/of.c
> index 196834ed44fe..19856d42e147 100644
> --- a/drivers/pci/of.c
> +++ b/drivers/pci/of.c
> @@ -469,6 +469,8 @@ static int of_irq_parse_pci(const struct pci_dev *pdev, struct of_phandle_args *
>  		} else {
>  			/* We found a P2P bridge, check if it has a node */
>  			ppnode = pci_device_to_OF_node(ppdev);
> +			if (of_node_check_flag(ppnode, OF_DYNAMIC))
> +				ppnode = NULL;
>  		}
>  
>  		/*
> @@ -599,6 +601,110 @@ int devm_of_pci_bridge_init(struct device *dev, struct pci_host_bridge *bridge)
>  	return pci_parse_request_of_pci_ranges(dev, bridge);
>  }
>  
> +#if IS_ENABLED(CONFIG_PCI_OF)
> +struct of_pci_node {
> +	struct list_head node;
> +	struct device_node *dt_node;
> +	struct of_changeset cset;
> +};
> +
> +static LIST_HEAD(of_pci_node_list);
> +static DEFINE_MUTEX(of_pci_node_lock);

There is a 'data' pointer in device_node which you could use to store 
the changeset. Then you wouldn't need a list. (though 'data' is rarely 
used and I hoped to get rid of it)

> +
> +static void of_pci_free_node(struct of_pci_node *node)
> +{
> +	of_changeset_destroy(&node->cset);
> +	kfree(node->dt_node->full_name);
> +	if (node->dt_node)
> +		of_node_put(node->dt_node);

You free full_name before freeing the node, so you could have a UAF. 
That needs to be taken care of when releasing the node.

> +	kfree(node);
> +}
> +
> +void of_pci_remove_node(struct pci_dev *pdev)
> +{
> +	struct list_head *ele, *next;
> +	struct of_pci_node *node;
> +
> +	mutex_lock(&of_pci_node_lock);
> +
> +	list_for_each_safe(ele, next, &of_pci_node_list) {
> +		node = list_entry(ele, struct of_pci_node, node);
> +		if (node->dt_node == pdev->dev.of_node) {
> +			list_del(&node->node);
> +			mutex_unlock(&of_pci_node_lock);
> +			of_pci_free_node(node);
> +			break;
> +		}
> +	}
> +	mutex_unlock(&of_pci_node_lock);
> +}

The above bits aren't really particular to PCI, so they probably 
belong in the DT core code. Frank will probably have thoughts on what 
this should look like.

> +
> +void of_pci_make_dev_node(struct pci_dev *pdev)
> +{
> +	const char *pci_type = "dev";
> +	struct device_node *parent;
> +	struct of_pci_node *node;
> +	int ret;
> +
> +	/*
> +	 * if there is already a device tree node linked to this device,
> +	 * return immediately.
> +	 */
> +	if (pci_device_to_OF_node(pdev))
> +		return;
> +
> +	/* check if there is device tree node for parent device */
> +	if (!pdev->bus->self)
> +		parent = pdev->bus->dev.of_node;
> +	else
> +		parent = pdev->bus->self->dev.of_node;
> +	if (!parent)
> +		return;
> +
> +	node = kzalloc(sizeof(*node), GFP_KERNEL);
> +	if (!node)
> +		return;
> +	of_changeset_init(&node->cset);
> +
> +	node->dt_node = of_node_alloc(NULL);
> +	if (!node->dt_node) {
> +		ret = -ENOMEM;
> +		goto failed;
> +	}
> +	node->dt_node->parent = parent;
> +
> +	if (pci_is_bridge(pdev))
> +		pci_type = "pci";
> +
> +	node->dt_node->full_name = kasprintf(GFP_KERNEL, "%s@%x,%x", pci_type,
> +					     PCI_SLOT(pdev->devfn),
> +					     PCI_FUNC(pdev->devfn));
> +	if (!node->dt_node->full_name) {
> +		ret = -ENOMEM;
> +		goto failed;
> +	}

Wait, aren't you missing adding properties? You need 'reg', 
'compatbile', and 'device_type = "pci"' for bridges.

> +
> +	ret = of_changeset_attach_node(&node->cset, node->dt_node);
> +	if (ret)
> +		goto failed;
> +
> +	ret = of_changeset_apply(&node->cset);
> +	if (ret)
> +		goto failed;
> +
> +	pdev->dev.of_node = node->dt_node;
> +
> +	mutex_lock(&of_pci_node_lock);
> +	list_add(&node->node, &of_pci_node_list);
> +	mutex_unlock(&of_pci_node_lock);
> +
> +	return;
> +
> +failed:
> +	of_pci_free_node(node);
> +}

Pass in the parent node and node name, and this function is not PCI 
specific either.

> +#endif
> +
>  #endif /* CONFIG_PCI */
>  
>  /**
> diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
> index 49238ddd39ee..1540c4c9a770 100644
> --- a/drivers/pci/pci-driver.c
> +++ b/drivers/pci/pci-driver.c
> @@ -1628,7 +1628,8 @@ static int pci_dma_configure(struct device *dev)
>  	bridge = pci_get_host_bridge_device(to_pci_dev(dev));
>  
>  	if (IS_ENABLED(CONFIG_OF) && bridge->parent &&
> -	    bridge->parent->of_node) {
> +	    bridge->parent->of_node &&
> +	    !of_node_check_flag(bridge->parent->of_node, OF_DYNAMIC)) {
>  		ret = of_dma_configure(dev, bridge->parent->of_node, true);
>  	} else if (has_acpi_companion(bridge)) {
>  		struct acpi_device *adev = to_acpi_device_node(bridge->fwnode);
> diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
> index 785f31086313..319b79920d40 100644
> --- a/drivers/pci/pci.h
> +++ b/drivers/pci/pci.h
> @@ -678,6 +678,22 @@ static inline int devm_of_pci_bridge_init(struct device *dev, struct pci_host_br
>  
>  #endif /* CONFIG_OF */
>  
> +#ifdef CONFIG_PCI_OF
> +void of_pci_make_dev_node(struct pci_dev *pdev);
> +void of_pci_remove_node(struct pci_dev *pdev);
> +
> +#else
> +static inline void
> +of_pci_make_dev_node(struct pci_dev *pdev)
> +{
> +}
> +
> +static inline void
> +of_pci_remove_node(struct pci_dev *pdev);
> +{
> +}
> +#endif /* CONFIG_OF_DYNAMIC */
> +
>  #ifdef CONFIG_PCIEAER
>  void pci_no_aer(void);
>  void pci_aer_init(struct pci_dev *dev);
> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> index 4944798e75b5..58a81e9ff0ed 100644
> --- a/drivers/pci/quirks.c
> +++ b/drivers/pci/quirks.c
> @@ -5956,3 +5956,14 @@ DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x56b1, aspm_l1_acceptable_latency
>  DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x56c0, aspm_l1_acceptable_latency);
>  DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x56c1, aspm_l1_acceptable_latency);
>  #endif
> +
> +/*
> + * For PCIe device which have multiple downstream devices, its driver may use
> + * a flattened device tree to describe the downstream devices.
> + * To overlay the flattened device tree, the PCI device and all its ancestor
> + * devices need to have device tree nodes on system base device tree. Thus,
> + * before driver probing, it might need to add a device tree node as the final
> + * fixup.
> + */
> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_XILINX, 0x5020, of_pci_make_dev_node);
> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_XILINX, 0x5021, of_pci_make_dev_node);
> diff --git a/drivers/pci/remove.c b/drivers/pci/remove.c
> index 4c54c75050dc..0eaa9d9a3609 100644
> --- a/drivers/pci/remove.c
> +++ b/drivers/pci/remove.c
> @@ -23,6 +23,7 @@ static void pci_stop_dev(struct pci_dev *dev)
>  		device_release_driver(&dev->dev);
>  		pci_proc_detach_device(dev);
>  		pci_remove_sysfs_dev_files(dev);
> +		of_pci_remove_node(dev);
>  
>  		pci_dev_assign_added(dev, false);
>  	}
> -- 
> 2.27.0
> 
>
Frank Rowand Sept. 12, 2022, 6:33 a.m. UTC | #2
On 9/2/22 13:54, Rob Herring wrote:
> On Mon, Aug 29, 2022 at 02:43:37PM -0700, Lizhi Hou wrote:
>> The PCI endpoint device such as Xilinx Alveo PCI card maps the register
>> spaces from multiple hardware peripherals to its PCI BAR. Normally,
>> the PCI core discovers devices and BARs using the PCI enumeration process.
>> And the process does not provide a way to discover the hardware peripherals
>> been mapped to PCI BARs.

< snip >

> 
> The above bits aren't really particular to PCI, so they probably 
> belong in the DT core code. Frank will probably have thoughts on what 
> this should look like.

< snip >

I will try to look through this patch series later today (Monday 9/12
USA time - I will not be in Dublin for the conferences this week.)

-Frank
Lizhi Hou Sept. 13, 2022, 5:49 a.m. UTC | #3
On 9/2/22 11:54, Rob Herring wrote:
> On Mon, Aug 29, 2022 at 02:43:37PM -0700, Lizhi Hou wrote:
>> The PCI endpoint device such as Xilinx Alveo PCI card maps the register
>> spaces from multiple hardware peripherals to its PCI BAR. Normally,
>> the PCI core discovers devices and BARs using the PCI enumeration process.
>> And the process does not provide a way to discover the hardware peripherals
>> been mapped to PCI BARs.
> This sentence doesn't make sense.

How about changing it to:

And it does not discover the hardware peripherals that are present in a 
PCI device, and which can be accessed through the PCI BARs.

>
>> For Alveo PCI card, the card firmware provides a
>> flattened device tree to describe the hardware peripherals on its BARs.
>> And the Alveo card driver can load this flattened device tree and leverage
>> device tree framework to generate platform devices for the hardware
>> peripherals eventually.
>>
>> Apparently, the device tree framework requires a device tree node for the
>> PCI device. Thus, it can generate the device tree nodes for hardware
>> peripherals underneath. Because PCI is self discoverable bus, there might
>> not be a device tree node created for PCI devices. This patch is to add
>> support to generate device tree node for PCI devices. It introduces a
>> kernel option. When the option is turned on, the kernel will generate
>> device tree nodes for PCI bridges unconditionally. It will also generate
>> a device tree node for Xilinx Alveo U50 by using PCI quirks. The generated
>> device tree nodes do not have any property. The future patches will add
>> necessary properties.
>>
>> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com>
>> Signed-off-by: Sonal Santan <sonal.santan@amd.com>
>> Signed-off-by: Max Zhen <max.zhen@amd.com>
>> Signed-off-by: Brian Xu <brian.xu@amd.com>
>> ---
>>   drivers/pci/Kconfig         |  11 ++++
>>   drivers/pci/bus.c           |   2 +
>>   drivers/pci/msi/irqdomain.c |   6 +-
>>   drivers/pci/of.c            | 106 ++++++++++++++++++++++++++++++++++++
>>   drivers/pci/pci-driver.c    |   3 +-
>>   drivers/pci/pci.h           |  16 ++++++
>>   drivers/pci/quirks.c        |  11 ++++
>>   drivers/pci/remove.c        |   1 +
>>   8 files changed, 153 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/pci/Kconfig b/drivers/pci/Kconfig
>> index 55c028af4bd9..9eca5420330b 100644
>> --- a/drivers/pci/Kconfig
>> +++ b/drivers/pci/Kconfig
>> @@ -198,6 +198,17 @@ config PCI_HYPERV
>>   	  The PCI device frontend driver allows the kernel to import arbitrary
>>   	  PCI devices from a PCI backend to support PCI driver domains.
>>   
>> +config PCI_OF
> We already have OF_PCI so this is confusing. Maybe
> 'PCI_DYNAMIC_OF_NODES'.
Sure. I will change it to PCI_DYNAMIC_OF_NODES
>
>
>> +	bool "Device tree node for PCI devices"
>> +	select OF_DYNAMIC
>> +	help
>> +	  This option enables support for generating device tree nodes for some
>> +	  PCI devices. Thus, the driver of this kind can load and overlay
>> +	  flattened device tree for its downstream devices.
>> +
>> +	  Once this option is selected, the device tree nodes will be generated
>> +	  for all PCI/PCIE bridges.
>> +
>>   choice
>>   	prompt "PCI Express hierarchy optimization setting"
>>   	default PCIE_BUS_DEFAULT
>> diff --git a/drivers/pci/bus.c b/drivers/pci/bus.c
>> index 3cef835b375f..f752b804ad1f 100644
>> --- a/drivers/pci/bus.c
>> +++ b/drivers/pci/bus.c
>> @@ -316,6 +316,8 @@ void pci_bus_add_device(struct pci_dev *dev)
>>   	 */
>>   	pcibios_bus_add_device(dev);
>>   	pci_fixup_device(pci_fixup_final, dev);
>> +	if (dev->hdr_type == PCI_HEADER_TYPE_BRIDGE)
> Would pci_is_bridge() work here? That would include cardbus, but I think
> that won't matter in practice.
ok. I will use pci_is_bridge() here.
>
>> +		of_pci_make_dev_node(dev);
>
>>   	pci_create_sysfs_dev_files(dev);
>>   	pci_proc_attach_device(dev);
>>   	pci_bridge_d3_update(dev);
>> diff --git a/drivers/pci/msi/irqdomain.c b/drivers/pci/msi/irqdomain.c
>> index e9cf318e6670..eeaf44169bfd 100644
>> --- a/drivers/pci/msi/irqdomain.c
>> +++ b/drivers/pci/msi/irqdomain.c
>> @@ -230,8 +230,10 @@ u32 pci_msi_domain_get_msi_rid(struct irq_domain *domain, struct pci_dev *pdev)
>>   	pci_for_each_dma_alias(pdev, get_msi_id_cb, &rid);
>>   
>>   	of_node = irq_domain_get_of_node(domain);
>> -	rid = of_node ? of_msi_map_id(&pdev->dev, of_node, rid) :
>> -			iort_msi_map_id(&pdev->dev, rid);
>> +	if (of_node && !of_node_check_flag(of_node, OF_DYNAMIC))
>> +		rid = of_msi_map_id(&pdev->dev, of_node, rid);
>> +	else
>> +		rid = iort_msi_map_id(&pdev->dev, rid);
>>   
>>   	return rid;
>>   }
>> diff --git a/drivers/pci/of.c b/drivers/pci/of.c
>> index 196834ed44fe..19856d42e147 100644
>> --- a/drivers/pci/of.c
>> +++ b/drivers/pci/of.c
>> @@ -469,6 +469,8 @@ static int of_irq_parse_pci(const struct pci_dev *pdev, struct of_phandle_args *
>>   		} else {
>>   			/* We found a P2P bridge, check if it has a node */
>>   			ppnode = pci_device_to_OF_node(ppdev);
>> +			if (of_node_check_flag(ppnode, OF_DYNAMIC))
>> +				ppnode = NULL;
>>   		}
>>   
>>   		/*
>> @@ -599,6 +601,110 @@ int devm_of_pci_bridge_init(struct device *dev, struct pci_host_bridge *bridge)
>>   	return pci_parse_request_of_pci_ranges(dev, bridge);
>>   }
>>   
>> +#if IS_ENABLED(CONFIG_PCI_OF)
>> +struct of_pci_node {
>> +	struct list_head node;
>> +	struct device_node *dt_node;
>> +	struct of_changeset cset;
>> +};
>> +
>> +static LIST_HEAD(of_pci_node_list);
>> +static DEFINE_MUTEX(of_pci_node_lock);
> There is a 'data' pointer in device_node which you could use to store
> the changeset. Then you wouldn't need a list. (though 'data' is rarely
> used and I hoped to get rid of it)

Ok. So if I understand correctly, in of_pci_removed_node(), it may check 
the flag and assume 'data' is pointing to cset if OF_DYNAMIC is set.

>> +
>> +static void of_pci_free_node(struct of_pci_node *node)
>> +{
>> +	of_changeset_destroy(&node->cset);
>> +	kfree(node->dt_node->full_name);
>> +	if (node->dt_node)
>> +		of_node_put(node->dt_node);
> You free full_name before freeing the node, so you could have a UAF.
> That needs to be taken care of when releasing the node.
Got it. I will fix this.
>
>> +	kfree(node);
>> +}
>> +
>> +void of_pci_remove_node(struct pci_dev *pdev)
>> +{
>> +	struct list_head *ele, *next;
>> +	struct of_pci_node *node;
>> +
>> +	mutex_lock(&of_pci_node_lock);
>> +
>> +	list_for_each_safe(ele, next, &of_pci_node_list) {
>> +		node = list_entry(ele, struct of_pci_node, node);
>> +		if (node->dt_node == pdev->dev.of_node) {
>> +			list_del(&node->node);
>> +			mutex_unlock(&of_pci_node_lock);
>> +			of_pci_free_node(node);
>> +			break;
>> +		}
>> +	}
>> +	mutex_unlock(&of_pci_node_lock);
>> +}
> The above bits aren't really particular to PCI, so they probably
> belong in the DT core code. Frank will probably have thoughts on what
> this should look like.
Sure. Looking forward Frank's comment.
>
>> +
>> +void of_pci_make_dev_node(struct pci_dev *pdev)
>> +{
>> +	const char *pci_type = "dev";
>> +	struct device_node *parent;
>> +	struct of_pci_node *node;
>> +	int ret;
>> +
>> +	/*
>> +	 * if there is already a device tree node linked to this device,
>> +	 * return immediately.
>> +	 */
>> +	if (pci_device_to_OF_node(pdev))
>> +		return;
>> +
>> +	/* check if there is device tree node for parent device */
>> +	if (!pdev->bus->self)
>> +		parent = pdev->bus->dev.of_node;
>> +	else
>> +		parent = pdev->bus->self->dev.of_node;
>> +	if (!parent)
>> +		return;
>> +
>> +	node = kzalloc(sizeof(*node), GFP_KERNEL);
>> +	if (!node)
>> +		return;
>> +	of_changeset_init(&node->cset);
>> +
>> +	node->dt_node = of_node_alloc(NULL);
>> +	if (!node->dt_node) {
>> +		ret = -ENOMEM;
>> +		goto failed;
>> +	}
>> +	node->dt_node->parent = parent;
>> +
>> +	if (pci_is_bridge(pdev))
>> +		pci_type = "pci";
>> +
>> +	node->dt_node->full_name = kasprintf(GFP_KERNEL, "%s@%x,%x", pci_type,
>> +					     PCI_SLOT(pdev->devfn),
>> +					     PCI_FUNC(pdev->devfn));
>> +	if (!node->dt_node->full_name) {
>> +		ret = -ENOMEM;
>> +		goto failed;
>> +	}
> Wait, aren't you missing adding properties? You need 'reg',
> 'compatbile', and 'device_type = "pci"' for bridges.
In this patch series nobody consumes the dynamic generated node yet, 
Thus I did not add any property. I will add one or two patches to this 
series for the properties you listed.
>
>> +
>> +	ret = of_changeset_attach_node(&node->cset, node->dt_node);
>> +	if (ret)
>> +		goto failed;
>> +
>> +	ret = of_changeset_apply(&node->cset);
>> +	if (ret)
>> +		goto failed;
>> +
>> +	pdev->dev.of_node = node->dt_node;
>> +
>> +	mutex_lock(&of_pci_node_lock);
>> +	list_add(&node->node, &of_pci_node_list);
>> +	mutex_unlock(&of_pci_node_lock);
>> +
>> +	return;
>> +
>> +failed:
>> +	of_pci_free_node(node);
>> +}
> Pass in the parent node and node name, and this function is not PCI
> specific either.

Ok. How about introducing new functions of_changeset_create_node(), 
of_changeset_create_property_*()  to of/dynamic.c?

So the function could be like:

of_pci_make_dev_node ()

{

     node = of_changeset_create_node (cset, full_name, parent); //alloc  
of_node and add to cset

     of_changeset_create_property_string(cset, node, name, string);  
//alloc of_property and add to cset

     of_changeset_create_property_u32_array(cset, node, name, array, len);

     ....  add more properties;

     of_changeset_apply(cset)

}


Thanks,

Lizhi

>
>> +#endif
>> +
>>   #endif /* CONFIG_PCI */
>>   
>>   /**
>> diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
>> index 49238ddd39ee..1540c4c9a770 100644
>> --- a/drivers/pci/pci-driver.c
>> +++ b/drivers/pci/pci-driver.c
>> @@ -1628,7 +1628,8 @@ static int pci_dma_configure(struct device *dev)
>>   	bridge = pci_get_host_bridge_device(to_pci_dev(dev));
>>   
>>   	if (IS_ENABLED(CONFIG_OF) && bridge->parent &&
>> -	    bridge->parent->of_node) {
>> +	    bridge->parent->of_node &&
>> +	    !of_node_check_flag(bridge->parent->of_node, OF_DYNAMIC)) {
>>   		ret = of_dma_configure(dev, bridge->parent->of_node, true);
>>   	} else if (has_acpi_companion(bridge)) {
>>   		struct acpi_device *adev = to_acpi_device_node(bridge->fwnode);
>> diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
>> index 785f31086313..319b79920d40 100644
>> --- a/drivers/pci/pci.h
>> +++ b/drivers/pci/pci.h
>> @@ -678,6 +678,22 @@ static inline int devm_of_pci_bridge_init(struct device *dev, struct pci_host_br
>>   
>>   #endif /* CONFIG_OF */
>>   
>> +#ifdef CONFIG_PCI_OF
>> +void of_pci_make_dev_node(struct pci_dev *pdev);
>> +void of_pci_remove_node(struct pci_dev *pdev);
>> +
>> +#else
>> +static inline void
>> +of_pci_make_dev_node(struct pci_dev *pdev)
>> +{
>> +}
>> +
>> +static inline void
>> +of_pci_remove_node(struct pci_dev *pdev);
>> +{
>> +}
>> +#endif /* CONFIG_OF_DYNAMIC */
>> +
>>   #ifdef CONFIG_PCIEAER
>>   void pci_no_aer(void);
>>   void pci_aer_init(struct pci_dev *dev);
>> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
>> index 4944798e75b5..58a81e9ff0ed 100644
>> --- a/drivers/pci/quirks.c
>> +++ b/drivers/pci/quirks.c
>> @@ -5956,3 +5956,14 @@ DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x56b1, aspm_l1_acceptable_latency
>>   DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x56c0, aspm_l1_acceptable_latency);
>>   DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x56c1, aspm_l1_acceptable_latency);
>>   #endif
>> +
>> +/*
>> + * For PCIe device which have multiple downstream devices, its driver may use
>> + * a flattened device tree to describe the downstream devices.
>> + * To overlay the flattened device tree, the PCI device and all its ancestor
>> + * devices need to have device tree nodes on system base device tree. Thus,
>> + * before driver probing, it might need to add a device tree node as the final
>> + * fixup.
>> + */
>> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_XILINX, 0x5020, of_pci_make_dev_node);
>> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_XILINX, 0x5021, of_pci_make_dev_node);
>> diff --git a/drivers/pci/remove.c b/drivers/pci/remove.c
>> index 4c54c75050dc..0eaa9d9a3609 100644
>> --- a/drivers/pci/remove.c
>> +++ b/drivers/pci/remove.c
>> @@ -23,6 +23,7 @@ static void pci_stop_dev(struct pci_dev *dev)
>>   		device_release_driver(&dev->dev);
>>   		pci_proc_detach_device(dev);
>>   		pci_remove_sysfs_dev_files(dev);
>> +		of_pci_remove_node(dev);
>>   
>>   		pci_dev_assign_added(dev, false);
>>   	}
>> -- 
>> 2.27.0
>>
>>
Frank Rowand Sept. 13, 2022, 7:03 a.m. UTC | #4
On 9/12/22 01:33, Frank Rowand wrote:
> On 9/2/22 13:54, Rob Herring wrote:
>> On Mon, Aug 29, 2022 at 02:43:37PM -0700, Lizhi Hou wrote:
>>> The PCI endpoint device such as Xilinx Alveo PCI card maps the register
>>> spaces from multiple hardware peripherals to its PCI BAR. Normally,
>>> the PCI core discovers devices and BARs using the PCI enumeration process.
>>> And the process does not provide a way to discover the hardware peripherals
>>> been mapped to PCI BARs.
> 
> < snip >
> 
>>
>> The above bits aren't really particular to PCI, so they probably 
>> belong in the DT core code. Frank will probably have thoughts on what 
>> this should look like.
> 
> < snip >
> 
> I will try to look through this patch series later today (Monday 9/12
> USA time - I will not be in Dublin for the conferences this week.)
> 
> -Frank

I have collected nearly 500 emails on the history behind this patch and
also another set of patch series that has been trying to achieve some
somewhat similar functionality.  Expect me to take a while to wade through
all of this.

-Frank
Frank Rowand Sept. 16, 2022, 11:20 p.m. UTC | #5
On 9/13/22 02:03, Frank Rowand wrote:
> On 9/12/22 01:33, Frank Rowand wrote:
>> On 9/2/22 13:54, Rob Herring wrote:
>>> On Mon, Aug 29, 2022 at 02:43:37PM -0700, Lizhi Hou wrote:
>>>> The PCI endpoint device such as Xilinx Alveo PCI card maps the register
>>>> spaces from multiple hardware peripherals to its PCI BAR. Normally,
>>>> the PCI core discovers devices and BARs using the PCI enumeration process.
>>>> And the process does not provide a way to discover the hardware peripherals
>>>> been mapped to PCI BARs.
>>
>> < snip >
>>
>>>
>>> The above bits aren't really particular to PCI, so they probably 
>>> belong in the DT core code. Frank will probably have thoughts on what 
>>> this should look like.
>>
>> < snip >
>>
>> I will try to look through this patch series later today (Monday 9/12
>> USA time - I will not be in Dublin for the conferences this week.)
>>
>> -Frank
> 
> I have collected nearly 500 emails on the history behind this patch and
> also another set of patch series that has been trying to achieve some
> somewhat similar functionality.  Expect me to take a while to wade through
> all of this.

I'm still working at understanding the full picture of patch 2/2.

-Frank

> 
> -Frank
diff mbox series

Patch

diff --git a/drivers/pci/Kconfig b/drivers/pci/Kconfig
index 55c028af4bd9..9eca5420330b 100644
--- a/drivers/pci/Kconfig
+++ b/drivers/pci/Kconfig
@@ -198,6 +198,17 @@  config PCI_HYPERV
 	  The PCI device frontend driver allows the kernel to import arbitrary
 	  PCI devices from a PCI backend to support PCI driver domains.
 
+config PCI_OF
+	bool "Device tree node for PCI devices"
+	select OF_DYNAMIC
+	help
+	  This option enables support for generating device tree nodes for some
+	  PCI devices. Thus, the driver of this kind can load and overlay
+	  flattened device tree for its downstream devices.
+
+	  Once this option is selected, the device tree nodes will be generated
+	  for all PCI/PCIE bridges.
+
 choice
 	prompt "PCI Express hierarchy optimization setting"
 	default PCIE_BUS_DEFAULT
diff --git a/drivers/pci/bus.c b/drivers/pci/bus.c
index 3cef835b375f..f752b804ad1f 100644
--- a/drivers/pci/bus.c
+++ b/drivers/pci/bus.c
@@ -316,6 +316,8 @@  void pci_bus_add_device(struct pci_dev *dev)
 	 */
 	pcibios_bus_add_device(dev);
 	pci_fixup_device(pci_fixup_final, dev);
+	if (dev->hdr_type == PCI_HEADER_TYPE_BRIDGE)
+		of_pci_make_dev_node(dev);
 	pci_create_sysfs_dev_files(dev);
 	pci_proc_attach_device(dev);
 	pci_bridge_d3_update(dev);
diff --git a/drivers/pci/msi/irqdomain.c b/drivers/pci/msi/irqdomain.c
index e9cf318e6670..eeaf44169bfd 100644
--- a/drivers/pci/msi/irqdomain.c
+++ b/drivers/pci/msi/irqdomain.c
@@ -230,8 +230,10 @@  u32 pci_msi_domain_get_msi_rid(struct irq_domain *domain, struct pci_dev *pdev)
 	pci_for_each_dma_alias(pdev, get_msi_id_cb, &rid);
 
 	of_node = irq_domain_get_of_node(domain);
-	rid = of_node ? of_msi_map_id(&pdev->dev, of_node, rid) :
-			iort_msi_map_id(&pdev->dev, rid);
+	if (of_node && !of_node_check_flag(of_node, OF_DYNAMIC))
+		rid = of_msi_map_id(&pdev->dev, of_node, rid);
+	else
+		rid = iort_msi_map_id(&pdev->dev, rid);
 
 	return rid;
 }
diff --git a/drivers/pci/of.c b/drivers/pci/of.c
index 196834ed44fe..19856d42e147 100644
--- a/drivers/pci/of.c
+++ b/drivers/pci/of.c
@@ -469,6 +469,8 @@  static int of_irq_parse_pci(const struct pci_dev *pdev, struct of_phandle_args *
 		} else {
 			/* We found a P2P bridge, check if it has a node */
 			ppnode = pci_device_to_OF_node(ppdev);
+			if (of_node_check_flag(ppnode, OF_DYNAMIC))
+				ppnode = NULL;
 		}
 
 		/*
@@ -599,6 +601,110 @@  int devm_of_pci_bridge_init(struct device *dev, struct pci_host_bridge *bridge)
 	return pci_parse_request_of_pci_ranges(dev, bridge);
 }
 
+#if IS_ENABLED(CONFIG_PCI_OF)
+struct of_pci_node {
+	struct list_head node;
+	struct device_node *dt_node;
+	struct of_changeset cset;
+};
+
+static LIST_HEAD(of_pci_node_list);
+static DEFINE_MUTEX(of_pci_node_lock);
+
+static void of_pci_free_node(struct of_pci_node *node)
+{
+	of_changeset_destroy(&node->cset);
+	kfree(node->dt_node->full_name);
+	if (node->dt_node)
+		of_node_put(node->dt_node);
+	kfree(node);
+}
+
+void of_pci_remove_node(struct pci_dev *pdev)
+{
+	struct list_head *ele, *next;
+	struct of_pci_node *node;
+
+	mutex_lock(&of_pci_node_lock);
+
+	list_for_each_safe(ele, next, &of_pci_node_list) {
+		node = list_entry(ele, struct of_pci_node, node);
+		if (node->dt_node == pdev->dev.of_node) {
+			list_del(&node->node);
+			mutex_unlock(&of_pci_node_lock);
+			of_pci_free_node(node);
+			break;
+		}
+	}
+	mutex_unlock(&of_pci_node_lock);
+}
+
+void of_pci_make_dev_node(struct pci_dev *pdev)
+{
+	const char *pci_type = "dev";
+	struct device_node *parent;
+	struct of_pci_node *node;
+	int ret;
+
+	/*
+	 * if there is already a device tree node linked to this device,
+	 * return immediately.
+	 */
+	if (pci_device_to_OF_node(pdev))
+		return;
+
+	/* check if there is device tree node for parent device */
+	if (!pdev->bus->self)
+		parent = pdev->bus->dev.of_node;
+	else
+		parent = pdev->bus->self->dev.of_node;
+	if (!parent)
+		return;
+
+	node = kzalloc(sizeof(*node), GFP_KERNEL);
+	if (!node)
+		return;
+	of_changeset_init(&node->cset);
+
+	node->dt_node = of_node_alloc(NULL);
+	if (!node->dt_node) {
+		ret = -ENOMEM;
+		goto failed;
+	}
+	node->dt_node->parent = parent;
+
+	if (pci_is_bridge(pdev))
+		pci_type = "pci";
+
+	node->dt_node->full_name = kasprintf(GFP_KERNEL, "%s@%x,%x", pci_type,
+					     PCI_SLOT(pdev->devfn),
+					     PCI_FUNC(pdev->devfn));
+	if (!node->dt_node->full_name) {
+		ret = -ENOMEM;
+		goto failed;
+	}
+
+	ret = of_changeset_attach_node(&node->cset, node->dt_node);
+	if (ret)
+		goto failed;
+
+	ret = of_changeset_apply(&node->cset);
+	if (ret)
+		goto failed;
+
+	pdev->dev.of_node = node->dt_node;
+
+	mutex_lock(&of_pci_node_lock);
+	list_add(&node->node, &of_pci_node_list);
+	mutex_unlock(&of_pci_node_lock);
+
+	return;
+
+failed:
+	of_pci_free_node(node);
+}
+#endif
+
 #endif /* CONFIG_PCI */
 
 /**
diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
index 49238ddd39ee..1540c4c9a770 100644
--- a/drivers/pci/pci-driver.c
+++ b/drivers/pci/pci-driver.c
@@ -1628,7 +1628,8 @@  static int pci_dma_configure(struct device *dev)
 	bridge = pci_get_host_bridge_device(to_pci_dev(dev));
 
 	if (IS_ENABLED(CONFIG_OF) && bridge->parent &&
-	    bridge->parent->of_node) {
+	    bridge->parent->of_node &&
+	    !of_node_check_flag(bridge->parent->of_node, OF_DYNAMIC)) {
 		ret = of_dma_configure(dev, bridge->parent->of_node, true);
 	} else if (has_acpi_companion(bridge)) {
 		struct acpi_device *adev = to_acpi_device_node(bridge->fwnode);
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index 785f31086313..319b79920d40 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -678,6 +678,22 @@  static inline int devm_of_pci_bridge_init(struct device *dev, struct pci_host_br
 
 #endif /* CONFIG_OF */
 
+#ifdef CONFIG_PCI_OF
+void of_pci_make_dev_node(struct pci_dev *pdev);
+void of_pci_remove_node(struct pci_dev *pdev);
+
+#else
+static inline void
+of_pci_make_dev_node(struct pci_dev *pdev)
+{
+}
+
+static inline void
+of_pci_remove_node(struct pci_dev *pdev);
+{
+}
+#endif /* CONFIG_OF_DYNAMIC */
+
 #ifdef CONFIG_PCIEAER
 void pci_no_aer(void);
 void pci_aer_init(struct pci_dev *dev);
diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index 4944798e75b5..58a81e9ff0ed 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -5956,3 +5956,14 @@  DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x56b1, aspm_l1_acceptable_latency
 DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x56c0, aspm_l1_acceptable_latency);
 DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x56c1, aspm_l1_acceptable_latency);
 #endif
+
+/*
+ * For PCIe device which have multiple downstream devices, its driver may use
+ * a flattened device tree to describe the downstream devices.
+ * To overlay the flattened device tree, the PCI device and all its ancestor
+ * devices need to have device tree nodes on system base device tree. Thus,
+ * before driver probing, it might need to add a device tree node as the final
+ * fixup.
+ */
+DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_XILINX, 0x5020, of_pci_make_dev_node);
+DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_XILINX, 0x5021, of_pci_make_dev_node);
diff --git a/drivers/pci/remove.c b/drivers/pci/remove.c
index 4c54c75050dc..0eaa9d9a3609 100644
--- a/drivers/pci/remove.c
+++ b/drivers/pci/remove.c
@@ -23,6 +23,7 @@  static void pci_stop_dev(struct pci_dev *dev)
 		device_release_driver(&dev->dev);
 		pci_proc_detach_device(dev);
 		pci_remove_sysfs_dev_files(dev);
+		of_pci_remove_node(dev);
 
 		pci_dev_assign_added(dev, false);
 	}