Message ID: 1581120007-5280-1-git-send-email-jonathan.derrick@intel.com
Series: PCIe Hotplug Slot Emulation driver
On Fri, Feb 07, 2020 at 04:59:58PM -0700, Jon Derrick wrote:
> This set adds an emulation driver for PCIe Hotplug. There may be
> platforms with specific configurations that can support hotplug but
> don't provide the logical slot hotplug hardware. For instance, the
> platform may use an electrically-tolerant interposer between the slot
> and the device.

The code seems like one giant hack to me. What is the real life use
case for this? Another Intel chipset fuckup like vmd or the ahci
remapping?
On Mon, 2020-02-10 at 08:01 +0100, Christoph Hellwig wrote:
> On Fri, Feb 07, 2020 at 04:59:58PM -0700, Jon Derrick wrote:
> > This set adds an emulation driver for PCIe Hotplug. There may be
> > platforms with specific configurations that can support hotplug but
> > don't provide the logical slot hotplug hardware. For instance, the
> > platform may use an electrically-tolerant interposer between the
> > slot and the device.
>
> The code seems like one giant hack to me. What is the real life
> use case for this? Another Intel chipset fuckup like vmd or the ahci
> remapping?

Exactly as the cover letter describes. An interposer being used on a
non-hotplug slot.
On Mon, Feb 10, 2020 at 03:05:47PM +0000, Derrick, Jonathan wrote:
> > The code seems like one giant hack to me. What is the real life
> > use case for this? Another Intel chipset fuckup like vmd or the
> > ahci remapping?
>
> Exactly as the cover letter describes. An interposer being used on a
> non-hotplug slot.

That isn't a use case, that is a description of the implementation.
Why would you want this code?
On Mon, 2020-02-10 at 17:58 +0100, hch@lst.de wrote:
> On Mon, Feb 10, 2020 at 03:05:47PM +0000, Derrick, Jonathan wrote:
> > > The code seems like one giant hack to me. What is the real life
> > > use case for this? Another Intel chipset fuckup like vmd or the
> > > ahci remapping?
> >
> > Exactly as the cover letter describes. An interposer being used on
> > a non-hotplug slot.
>
> That isn't a use case, that is a description of the implementation.
> Why would you want this code?

It allows non-hotplug slots to take advantage of the kernel's robust
hotplug ecosystem, if the platform configuration can tolerate the
events. This could also reduce BOM cost by eliminating some slot
controllers. Say you had something that only needed to be hotplugged
very infrequently, like RAIDed OS drives, versus something needing to
be hotplugged very frequently, like data drives.

Granted, it probably could be fit into pciehp_poll, but that seemed to
have a different objective (emulating a slot).
[+cc Stuart, Lukas]

On Fri, Feb 07, 2020 at 04:59:58PM -0700, Jon Derrick wrote:
> This set adds an emulation driver for PCIe Hotplug. There may be
> platforms with specific configurations that can support hotplug but
> don't provide the logical slot hotplug hardware. For instance, the
> platform may use an electrically-tolerant interposer between the slot
> and the device.
>
> This driver utilizes the pci-bridge-emul architecture to manage
> register reads and writes. The underlying functionality of the
> hotplug emulation driver uses the Data Link Layer Link Active
> Reporting mechanism in a polling loop, but can tolerate other event
> sources such as AER or DPC.
>
> When enabled and a slot is managed by the driver, all port services
> are managed by the kernel. This is done to ensure that firmware
> hotplug and error architecture does not (correctly) halt/machine
> check the system when hotplug is performed on a non-hotplug slot.
>
> The driver offers two active modes: Auto and Force.
>   auto:  the driver will bind to non-hotplug slots
>   force: the driver will bind to all slots and override the slot's
>          services
>
> There are three kernel params:
>   pciehp.pciehp_emul_mode={off, auto, force}
>   pciehp.pciehp_emul_time=<msecs polling time> (def 1000, min 100,
>                                                 max 60000)
>   pciehp.pciehp_emul_ports=<PCI [S]BDF/ID format string>
>
> The pciehp_emul_ports kernel parameter takes a semicolon-tokenized
> string representing PCI [S]BDFs and IDs. The pciehp_emul_mode will
> then be applied to only those slots, leaving other slots unmanaged by
> pciehp_emul.
>
> The string follows the pci_dev_str_match() format:
>
>   [<domain>:]<bus>:<device>.<func>[/<device>.<func>]*
>   pci:<vendor>:<device>[:<subvendor>:<subdevice>]
>
> When using the path format, the path for the device can be obtained
> using 'lspci -t' and further specified using the upstream bridge and
> the downstream port's device-function to be more robust against bus
> renumbering.
>
> When using the vendor-device format, a value of '0' in any field acts
> as a wildcard for that field, matching all values.
>
> The driver is enabled with CONFIG_HOTPLUG_PCI_PCIE_EMUL=y.
>
> The driver should be considered 'use at own risk' unless the
> platform/hardware vendor recommends this mode.

There's a lot of good work in here, and I don't claim to understand
the use case and all the benefits. But it seems like quite a lot of
additional code and complexity in an area that's already pretty
subtle, so I'm not yet convinced that it's all worthwhile.

It seems like this would rely on Data Link Layer Link Active
Reporting. Is that something we could add to pciehp as a generic
feature without making a separate driver for it? I haven't looked at
this for a while, but I would assume that if we find out that a link
went down, pciehp could/should be smart enough to notice that even if
it didn't come via the usual pciehp Slot Status path.

> Jon Derrick (9):
>   PCI: pci-bridge-emul: Update PCIe register behaviors
>   PCI: pci-bridge-emul: Eliminate reserved member
>   PCI: pci-bridge-emul: Provide a helper to set behavior
>   PCI: pciehp: Indirect slot register operations
>   PCI: Add pcie_port_slot_emulated stub
>   PCI: pciehp: Expose the poll loop to other drivers
>   PCI: Move pci_dev_str_match to search.c
>   PCI: pciehp: Add hotplug slot emulation driver
>   PCI: pciehp: Wire up pcie_port_emulate_slot and pciehp_emul
>
>  drivers/pci/hotplug/Makefile      |   4 +
>  drivers/pci/hotplug/pciehp.h      |  28 +++
>  drivers/pci/hotplug/pciehp_emul.c | 378 ++++++++++++++++++++++++++++++++++++++
>  drivers/pci/hotplug/pciehp_hpc.c  | 136 ++++++++++----
>  drivers/pci/pci-acpi.c            |   3 +
>  drivers/pci/pci-bridge-emul.c     |  95 +++++-----
>  drivers/pci/pci-bridge-emul.h     |  10 +
>  drivers/pci/pci.c                 | 163 ----------------
>  drivers/pci/pcie/Kconfig          |  14 ++
>  drivers/pci/pcie/portdrv_core.c   |  14 +-
>  drivers/pci/probe.c               |   2 +-
>  drivers/pci/search.c              | 162 ++++++++++++++++
>  include/linux/pci.h               |   8 +
>  13 files changed, 775 insertions(+), 242 deletions(-)
>  create mode 100644 drivers/pci/hotplug/pciehp_emul.c
>
> --
> 1.8.3.1
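For reference, the three module parameters described in the cover
letter above would be combined on the kernel command line roughly like
this. The values and the [S]BDF are purely illustrative, not
recommendations, and assume the patch set as posted:

```shell
# Hypothetical example: force slot emulation on a single root port,
# polling Data Link Layer Link Active state every 2000 ms.
# The port selector follows the pci_dev_str_match() formats quoted
# in the cover letter.
GRUB_CMDLINE_LINUX="pciehp.pciehp_emul_mode=force \
pciehp.pciehp_emul_time=2000 \
pciehp.pciehp_emul_ports=0000:00:1c.0"
```

With `pciehp_emul_ports` omitted, the chosen mode would apply to all
eligible slots rather than only the listed ones.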
Hi Bjorn,

On Sat, 2020-03-28 at 16:51 -0500, Bjorn Helgaas wrote:
> [+cc Stuart, Lukas]
>
> On Fri, Feb 07, 2020 at 04:59:58PM -0700, Jon Derrick wrote:
> > This set adds an emulation driver for PCIe Hotplug. There may be
> > platforms with specific configurations that can support hotplug but
> > don't provide the logical slot hotplug hardware. For instance, the
> > platform may use an electrically-tolerant interposer between the
> > slot and the device.
> > [...]
>
> There's a lot of good work in here, and I don't claim to understand
> the use case and all the benefits.

I've received more info that the customer use case is an AIC that
breaks out 1-4 M.2 cards which have been made hotplug tolerant.

> But it seems like quite a lot of additional code and complexity in an
> area that's already pretty subtle, so I'm not yet convinced that it's
> all worthwhile.
>
> It seems like this would rely on Data Link Layer Link Active
> Reporting. Is that something we could add to pciehp as a generic
> feature without making a separate driver for it? I haven't looked at
> this for a while, but I would assume that if we find out that a link
> went down, pciehp could/should be smart enough to notice that even if
> it didn't come via the usual pciehp Slot Status path.

I had a plan to do V2 by intercepting bus_ops rather than indirecting
slot_ops in pciehp. That should touch /a lot/ less code.

The problem I saw with adding DLLLA as a primary signal in pciehp is
that most of the pciehp boilerplate relies on valid Slot register
logic. I don't know how reliable pciehp will be if there's no backing
slot register logic, emulated or real. Consider how many slot
capability reads are in hpc. I could add a non-slot flag check to each
of those callers, but it might be worse than the emulation
alternative.

What do you think?

Thanks

> Jon Derrick (9):
> [...]
On Mon, Mar 30, 2020 at 05:43:33PM +0000, Derrick, Jonathan wrote:
> > There's a lot of good work in here, and I don't claim to understand
> > the use case and all the benefits.
>
> I've received more info that the customer use case is an AIC that
> breaks out 1-4 M.2 cards which have been made hotplug tolerant.

Which sounds completely bogus. M.2 cards are electrically not designed
for this at all, and neither is the AIC. If people want to support
similar use cases they need to get them standardized by PCI SIG and
handled by hotplug capable slots.
On Mon, Mar 30, 2020 at 12:43 PM Derrick, Jonathan
<jonathan.derrick@intel.com> wrote:
> On Sat, 2020-03-28 at 16:51 -0500, Bjorn Helgaas wrote:
> > On Fri, Feb 07, 2020 at 04:59:58PM -0700, Jon Derrick wrote:
> > > This set adds an emulation driver for PCIe Hotplug. There may be
> > > platforms with specific configurations that can support hotplug
> > > but don't provide the logical slot hotplug hardware. For
> > > instance, the platform may use an electrically-tolerant
> > > interposer between the slot and the device.
> > > ...
> > There's a lot of good work in here, and I don't claim to understand
> > the use case and all the benefits.
>
> I've received more info that the customer use case is an AIC that
> breaks out 1-4 M.2 cards which have been made hotplug tolerant.
>
> > But it seems like quite a lot of additional code and complexity in
> > an area that's already pretty subtle, so I'm not yet convinced that
> > it's all worthwhile.
> >
> > It seems like this would rely on Data Link Layer Link Active
> > Reporting. Is that something we could add to pciehp as a generic
> > feature without making a separate driver for it? I haven't looked
> > at this for a while, but I would assume that if we find out that a
> > link went down, pciehp could/should be smart enough to notice that
> > even if it didn't come via the usual pciehp Slot Status path.
>
> I had a plan to do V2 by intercepting bus_ops rather than indirecting
> slot_ops in pciehp. That should touch /a lot/ less code.

I assume this is something like pci_bus_set_ops() or
pci_bus_set_aer_ops()? Probably touches less code, but I'm not really
a fan of either of those current situations because they make things
magical -- there's a lot of stuff going on behind the curtain, and it
makes another thing to consider when we evaluate every pciehp change.

> The problem I saw with adding DLLLA as a primary signal in pciehp is
> that most of the pciehp boilerplate relies on valid Slot register
> logic. I don't know how reliable pciehp will be if there's no backing
> slot register logic, emulated or real. Consider how many slot
> capability reads are in hpc.
>
> I could add a non-slot flag check to each of those callers, but it
> might be worse than the emulation alternative.

I see what you mean -- there are quite a few reads of PCI_EXP_SLTSTA.
I'm not 100% sure all of those really need to exist. I would expect
that we'd read it once in the ISR and then operate on that value. So
maybe there's some middle ground of restructuring to remove some of
those reads and making the remaining few more generic.

All that being said, I'm also sympathetic to Christoph's concern about
cluttering pciehp to deal with non-standard topologies. At some point
if you need to do non-standard things you may have to write and
maintain your own drivers. I don't know where that point is yet.

Bjorn