Message ID | 20180417002825.2737-1-fred@fredlawl.com |
---|---|
Headers | show |
Series | IB/hfi1: Cleanup PCIe link configuration | expand |
On Mon, Apr 16, 2018 at 07:28:23PM -0500, Frederick Lawler wrote: > The IB/hfi1 driver uses custom macros to configure the target link speed. Some > macros were previously replaced, but not fully. This patch series addresses the > configuration inconsistencies by adding PCI_EXP_LNKCTL2_TLS* macros to PCI, > and then use them in the following IB/hfi1 patch. Btw, why is the driver configuring the PCIe link speed? Isn't this something we should be handling in the PCI core?
[+cc Bartlomiej, who recently changed do_pcie_gen3_transition(), LKML] On Mon, Apr 16, 2018 at 11:53:43PM -0700, Christoph Hellwig wrote: > On Mon, Apr 16, 2018 at 07:28:23PM -0500, Frederick Lawler wrote: > > The IB/hfi1 driver uses custom macros to configure the target link speed. Some > > macros were previously replaced, but not fully. This patch series addresses the > > configuration inconsistencies by adding PCI_EXP_LNKCTL2_TLS* macros to PCI, > > and then use them in the following IB/hfi1 patch. > > Btw, why is the driver configuring the PCIe link speed? Isn't this > something we should be handling in the PCI core? Good question. I think this sort of code definitely should not be in drivers unless it's to work around some kind of defect in the device. I think the intent of the spec is that neither the OS nor the driver has to deal with link training and the link should train to the highest speed supported by both ends. For example, PCIe r4.0, sec 1.2 says Initialization – During hardware initialization, each PCI Express Link is set up following a negotiation of Lane widths and frequency of operation by the two agents at each end of the Link. No firmware or operating system software is involved. Maybe the Intel guys can comment on why HFI needs this code. Presumably they didn't write it just for fun, so I assume there's some reason. Bjorn
> > Btw, why is the driver configuring the PCIe link speed? Isn't this > something we should be handling in the PCI core? The device comes out of reset at the 5GT/s speed. The driver downloads device firmware, programs PCIe registers, and co-ordinates the transition to 8GT/s. This recipe is device specific and is therefore implemented in the hfi1 driver built on top of PCI core functions and macros. Mike
On Tue, Apr 17, 2018 at 03:56:13PM +0000, Marciniszyn, Mike wrote: > > > > Btw, why is the driver configuring the PCIe link speed? Isn't this > > something we should be handling in the PCI core? > > The device comes out of reset at the 5GT/s speed. The driver > downloads device firmware, programs PCIe registers, and co-ordinates > the transition to 8GT/s. > > This recipe is device specific and is therefore implemented in the > hfi1 driver built on top of PCI core functions and macros. Do you think this behavior conforms to the spec, or is this a workaround for a hardware erratum? I don't think it's feasible to have every driver deal with this level of PCIe detail. Do you have to do something similar in the Windows driver? Is there something we can do in the PCI core to do the reconfiguration? If you can't go to 8GT/s before doing some device-specific initialization, could the driver do that setup and then use a generic PCI core interface to reconfigure the link? Or maybe if the driver finds the device at 5GT/s, it could download the firmware and reset the device. Would the device then negotiate at 8GT/s? Bjorn
On 4/17/2018 10:32 AM, Bjorn Helgaas wrote: > [+cc Bartlomiej, who recently changed do_pcie_gen3_transition(), LKML] > > On Mon, Apr 16, 2018 at 11:53:43PM -0700, Christoph Hellwig wrote: >> On Mon, Apr 16, 2018 at 07:28:23PM -0500, Frederick Lawler wrote: >>> The IB/hfi1 driver uses custom macros to configure the target link speed. Some >>> macros were previously replaced, but not fully. This patch series addresses the >>> configuration inconsistencies by adding PCI_EXP_LNKCTL2_TLS* macros to PCI, >>> and then use them in the following IB/hfi1 patch. >> >> Btw, why is the driver configuring the PCIe link speed? Isn't this >> something we should be handling in the PCI core? > > Good question. I think this sort of code definitely should not be in > drivers unless it's to work around some kind of defect in the device. > > I think the intent of the spec is that neither the OS nor the driver > has to deal with link training and the link should train to the > highest speed supported by both ends. For example, PCIe r4.0, sec 1.2 > says > > Initialization – During hardware initialization, each PCI Express > Link is set up following a negotiation of Lane widths and frequency > of operation by the two agents at each end of the Link. No firmware > or operating system software is involved. > > Maybe the Intel guys can comment on why HFI needs this code. > Presumably they didn't write it just for fun, so I assume there's > some reason. Yeah there was a reason. It just escapes me what it was right now. Let me find out and I'll update. -Denny
On 4/17/2018 1:19 PM, Bjorn Helgaas wrote: > On Tue, Apr 17, 2018 at 03:56:13PM +0000, Marciniszyn, Mike wrote: >>> >>> Btw, why is the driver configuring the PCIe link speed? Isn't this >>> something we should be handling in the PCI core? >> >> The device comes out of reset at the 5GT/s speed. The driver >> downloads device firmware, programs PCIe registers, and co-ordinates >> the transition to 8GT/s. >> >> This recipe is device specific and is therefore implemented in the >> hfi1 driver built on top of PCI core functions and macros. > > Do you think this behavior conforms to the spec, or is this a > workaround for a hardware erratum? Can't speculate as to why, but this is just the way this hardware works. > I don't think it's feasible to have every driver deal with this level > of PCIe detail. Do you have to do something similar in the Windows > driver? No Windows driver. > Is there something we can do in the PCI core to do the > reconfiguration? If you can't go to 8GT/s before doing some > device-specific initialization, could the driver do that setup and > then use a generic PCI core interface to reconfigure the link? > > Or maybe if the driver finds the device at 5GT/s, it could download > the firmware and reset the device. Would the device then negotiate at > 8GT/s? Yes. In fact this is what we should be doing. We use the SBR to trigger renegotiation of the link to 8GT/s. -Denny
On Mon, 2018-04-16 at 19:28 -0500, Frederick Lawler wrote: > The IB/hfi1 driver uses custom macros to configure the target link speed. Some > macros were previously replaced, but not fully. This patch series addresses the > configuration inconsistencies by adding PCI_EXP_LNKCTL2_TLS* macros to PCI, > and then use them in the following IB/hfi1 patch. > > V3: > * PCI: Add PCI_EXP_LNKCTL2_TLS_* macros > * Drop patch to use extract_speed() in do_pcie_gen3_transition() > V2: > * s/LINK/LNK > > Frederick Lawler (2): > PCI: Add PCI_EXP_LNKCTL2_TLS* macros > IB/hfi1: Replace custom hfi1 macros with PCIe macros > > drivers/infiniband/hw/hfi1/pcie.c | 24 ++++++++---------------- > include/uapi/linux/pci_regs.h | 5 +++++ > 2 files changed, 13 insertions(+), 16 deletions(-) > Where do we stand on this series? People talked about what the core might be able to do, or how the driver might be able to do things a little differently, but I didn't hear a clear conclusion on whether or not to take this series as is (it seems to be a sensible cleanup and possibly worthwhile even if we end up changing the hfi1 driver to do a firmware updated followed by a device reset to get the same results). And more importantly as far as I'm concerned, is this sitting on my plate or on the PCI people's plate?
On Mon, Apr 16, 2018 at 07:28:23PM -0500, Frederick Lawler wrote: > The IB/hfi1 driver uses custom macros to configure the target link speed. Some > macros were previously replaced, but not fully. This patch series addresses the > configuration inconsistencies by adding PCI_EXP_LNKCTL2_TLS* macros to PCI, > and then use them in the following IB/hfi1 patch. > > V3: > * PCI: Add PCI_EXP_LNKCTL2_TLS_* macros > * Drop patch to use extract_speed() in do_pcie_gen3_transition() > V2: > * s/LINK/LNK > > Frederick Lawler (2): > PCI: Add PCI_EXP_LNKCTL2_TLS* macros > IB/hfi1: Replace custom hfi1 macros with PCIe macros > > drivers/infiniband/hw/hfi1/pcie.c | 24 ++++++++---------------- > include/uapi/linux/pci_regs.h | 5 +++++ > 2 files changed, 13 insertions(+), 16 deletions(-) I applied these, with Michael's reviewed-by on the second, to pci/misc for v4.18, thanks! Ideally we could pull some of this link speed management code into the PCI core someday, but I think these patches are useful as-is. Oh, I did s/PCI_EXP_LNKCTL2_TLS_2_5GB/PCI_EXP_LNKCTL2_TLS_2_5GT/ because the link speeds are actually defined in terms of GT/s, not GB/s.
On Fri, 2018-04-27 at 12:58 -0500, Bjorn Helgaas wrote: > On Mon, Apr 16, 2018 at 07:28:23PM -0500, Frederick Lawler wrote: > > The IB/hfi1 driver uses custom macros to configure the target link speed. Some > > macros were previously replaced, but not fully. This patch series addresses the > > configuration inconsistencies by adding PCI_EXP_LNKCTL2_TLS* macros to PCI, > > and then use them in the following IB/hfi1 patch. > > > > V3: > > * PCI: Add PCI_EXP_LNKCTL2_TLS_* macros > > * Drop patch to use extract_speed() in do_pcie_gen3_transition() > > V2: > > * s/LINK/LNK > > > > Frederick Lawler (2): > > PCI: Add PCI_EXP_LNKCTL2_TLS* macros > > IB/hfi1: Replace custom hfi1 macros with PCIe macros > > > > drivers/infiniband/hw/hfi1/pcie.c | 24 ++++++++---------------- > > include/uapi/linux/pci_regs.h | 5 +++++ > > 2 files changed, 13 insertions(+), 16 deletions(-) > > I applied these, with Michael's reviewed-by on the second, to > pci/misc for v4.18, thanks! > > Ideally we could pull some of this link speed management code into the > PCI core someday, but I think these patches are useful as-is. > > Oh, I did s/PCI_EXP_LNKCTL2_TLS_2_5GB/PCI_EXP_LNKCTL2_TLS_2_5GT/ > because the link speeds are actually defined in terms of GT/s, not > GB/s. Thanks :-)