Message ID: 20230214140858.1133292-1-rick.wertenbroek@gmail.com
Series: PCI: rockchip: Fix RK3399 PCIe endpoint controller driver
On 2/14/23 23:08, Rick Wertenbroek wrote:
> This is a series of patches that fixes the PCIe endpoint controller driver
> for the Rockchip RK3399 SoC. The driver was introduced in
> cf590b078391 ("PCI: rockchip: Add EP driver for Rockchip PCIe controller")
> The original driver had issues and would not allow for the RK3399 to
> operate in PCIe endpoint mode correctly. This patch series fixes that so
> that the PCIe core controller of the RK3399 SoC can now act as a PCIe
> endpoint. This is v2 of the patch series and addresses the concerns that
> were raised during the review of the first version.
>
> Thank you in advance for reviewing these changes and hopefully
> getting this merged. Having a functional PCIe endpoint controller
> driver for the RK3399 would allow to develop further PCIe endpoint
> functions through the Linux PCIe endpoint framework using this SoC.
>
> Problem: The Rockchip RK3399 PCIe endpoint controller driver introduced in
> cf590b078391 ("PCI: rockchip: Add EP driver for Rockchip PCIe controller")
> did not work.
>
> Summary of problems with the driver :
>
> * Missing dtsi entry
> * Could not update Device ID (DID)
> * The endpoint could not be configured by a host computer because the
>   endpoint kept sending Configuration Request Retry Status (CRS) messages
> * The kernel would sometimes hang on probe due to access to registers in
>   a clock domain of which the PLLs were not locked
> * The memory window mapping and address translation mechanism had
>   conflicting mappings and did not follow the technical reference manual
>   as to how the address translation should be done
> * Legacy IRQs were not generated by the endpoint
> * Message Signaled interrupts (MSI) were not generated by the endpoint
>
> The problems have been addressed and validated through tests (see below).
>
> Summary of changes :
>
> This patch series is composed of 9 patches that do the following :
> * Remove writes to unused registers in the PCIe core register space.
>   The registers that were written to are marked "unused" and read
>   only in the technical reference manual of the RK3399 SoC.
> * Write PCI Device ID (DID) to correct register, the DID was written to
>   a read only register and therefore would not update the DID.
> * Assert PCI Configuration Enable bit after probe so that it would stop
>   sending Configuration Request Retry Status (CRS) messages to the
>   host once configured, without this the host would retry until
>   timeout and cancel the PCI configuration.
> * Add poll and timeout to wait for PHY PLLs to be locked, this
>   is the only patch that also applies to the root complex function
>   of the PCIe core controller, without this the kernel would
>   sometimes access registers in the PHY PLL clock domain when the PLLs
>   were not yet locked and the system would hang. This was hackily solved
>   in other non mainline patches (e.g., in armbian) with a "msleep()"
>   that was added after PHY PLL configuration but without realizing
>   why it was needed. A poll with timeout seems like a sane approach.
> * Add dtsi entry for RK3399 PCIe endpoint core. The new entry is
>   in "disabled" status by default, so unless it is explicitly enabled
>   it will not conflict with the PCIe root complex controller entry.
>   Developers that will enable it would know that the root complex function
>   then must be disabled, this can be done in the board level DTS.
> * Fix window mapping and address translation for endpoint. The window
>   mapping and address translation did not follow the technical reference
>   manual and a single memory region was used which resulted in conflicting
>   address translations for memory allocated in that region. The current
>   patch allows to allocate up to 32 memory windows with 1MB pages.
> * Fix legacy IRQ generation for RK3399 PCIe endpoint core, the legacy IRQs
>   were not sent by the device because their generation did not follow the
>   instructions in the technical reference manual. They now work.
> * Use u32 variable to access 32-bit registers, u16 variables were used to
>   access and manipulate data of 32-bit registers, this would lead to
>   overflows e.g., when left shifting more than 16 bits.
> * Add parameter check for RK3399 PCIe endpoint core set_msi(), return
>   -EINVAL when incompatible parameters are passed.
>
> Validation on real hardware:
>
> This patch series has been tested with kernel 6.0.19 (and 5.19)
> on real hardware, a FriendlyElec NanoPC-T4 RK3399 based single board
> computer connected to a host computer through PCIe x1 and x4. The PCIe
> endpoint test function driver was loaded on the SoC and the PCIe endpoint
> test driver was loaded on the host computer. The following tests were
> executed through this setup :
>
> * enumeration of the PCIe endpoint device (lspci)
>   lspci -vvv
> * validation of PCI header and capabilities
>   setpci and lspci -xxxx
> * device was recognized by host computer and the PCIe endpoint test driver
>   was loaded
>   lspci -v states "Kernel modules: pci_endpoint_test"
> * tested the BARs 0-5
>   sudo /usr/bin/pcitest -b 0
>   ...
>   sudo /usr/bin/pcitest -b 5
> * tested legacy interrupt through the test driver
>   sudo /usr/bin/pcitest -i 0
>   sudo /usr/bin/pcitest -l
> * tested MSI interrupt through the test driver
>   sudo /usr/bin/pcitest -i 1
>   sudo /usr/bin/pcitest -m 1
> * tested read/write to and from host through the test driver with checksum
>   sudo /usr/bin/pcitest -r -s 1024
>   sudo /usr/bin/pcitest -w -s 1024
> * tested read/write with DMA enabled (all read/write tests also did IRQ)
>   sudo /usr/bin/pcitest -r -d -s 8192
>   sudo /usr/bin/pcitest -w -d -s 8192
>
> Commands used on the SoC to launch the endpoint function (configfs) :
>
> modprobe -i pci-epf-test
> mkdir -p /sys/kernel/config/pci_ep/functions/pci_epf_test/pci_epf_test.0
> echo 0xb500 > /sys/kernel/config/pci_ep/functions/pci_epf_test/pci_epf_test.0/deviceid
> echo 0x104c > /sys/kernel/config/pci_ep/functions/pci_epf_test/pci_epf_test.0/vendorid
> echo 16 > /sys/kernel/config/pci_ep/functions/pci_epf_test/pci_epf_test.0/msi_interrupts
> ln -s /sys/kernel/config/pci_ep/functions/pci_epf_test/pci_epf_test.0 \
>       /sys/kernel/config/pci_ep/controllers/fd000000.pcie-ep/
> echo 1 > /sys/kernel/config/pci_ep/controllers/fd000000.pcie-ep/start
>
> Note: to enable the endpoint controller on the board the file
> arch/arm64/boot/dts/rockchip/rk3399-nanopc-t4.dts
> was edited to set the status of &pcie0 to "disabled" and &pcie0_ep
> to "okay". This is not submitted as a patch because most users
> will use the PCIe core controller in host (root complex) mode
> rather than endpoint mode.
>
> I have tested and confirmed all basic functionality required for the
> endpoint with the test driver and tools. With the previous state of
> the driver the device would not even be enumerated by the host
> computer (mainly because of CRS messages being sent back to the root
> complex) and tests would not pass (driver would not even be loaded
> because DID was not set correctly) and then only the BAR test would
> pass. Now all tests pass as stated above.

Note about that: with your series applied, nothing was working for me on
my Pine64 RockPro64 board (AMD Ryzen host). I got weird/unstable behavior
and the host IOMMU screaming about IO page faults due to the endpoint
doing weird pci accesses. Running the host with the IOMMU on really helps
in debugging this stuff :)

With the few fixes to your series I commented about, things started to
work better, but still very unstable. More debugging and I found out that
the pci-epf-test drivers, both host and endpoint sides, have nasty
problems that lead to reporting failures when things are actually working,
or outright dummy things being done that trigger errors (e.g. bad DMA
synchronization triggers IOMMU page fault reports). I have a dozen fix
patches for these drivers. Will clean them up and post ASAP.

With the test drivers fixed + the fixes to your series, I have the
pci_test.sh tests passing 100% of the time, repeatedly (in a loop). All
solid.

However, I am still seeing issues with my ongoing work with an NVMe
endpoint driver function: I see everything working when the host BIOS
pokes at the NVMe "drive" it sees (all good, that is normal), but once the
Linux nvme driver probe kicks in, IRQs are essentially dead: the nvme
driver does not see anything strange and allocates IRQs (1 first, which
ends up being INTX, then multiple MSI, one for each completion queue), but
on the endpoint side, attempting to raise MSI or INTX IRQs results in an
error as the rockchip-ep driver sees both INTX and MSI as disabled. No
clue what is going on. I suspect that a PCI reset may have happened and
corrupted the core configuration. However, the EPC/EPF infrastructure does
not catch/process PCI resets as far as I can tell. That may be the issue.

I do not see this issue with the epf test driver, because I suspect the
host BIOS, not knowing anything about that device, does not touch it.
This all may depend on the host & BIOS. Not sure. Need to try with
different hosts. Just FYI :)
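Two of the checks mentioned above are easy to reproduce from a shell. This is
only a sketch: the AMD-Vi log pattern assumes an AMD (Ryzen) host IOMMU, and
the test script is the one shipped in the kernel source as tools/pci/pcitest.sh
(referred to as pci_test.sh above); adjust paths and names to your setup.

# On the host: watch for IOMMU page faults caused by stray endpoint accesses
dmesg -w | grep -i -E 'AMD-Vi|IO_PAGE_FAULT'

# On the host: run the endpoint test script in a loop to check stability
for i in $(seq 1 100); do
        sh tools/pci/pcitest.sh || { echo "failed at iteration $i"; break; }
done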
On Wed, Feb 15, 2023 at 2:51 AM Damien Le Moal
<damien.lemoal@opensource.wdc.com> wrote:
>
> Note about that: with your series applied, nothing was working for me on
> my Pine64 RockPro64 board (AMD Ryzen host). I got weird/unstable behavior
> and the host IOMMU screaming about IO page faults due to the endpoint
> doing weird pci accesses. Running the host with the IOMMU on really helps
> in debugging this stuff :)

Thank you for testing, I have also tested with a Ryzen host, I have IOMMU
enabled as well.

> With the few fixes to your series I commented about, things started to
> work better, but still very unstable. More debugging and I found out that
> the pci-epf-test drivers, both host and endpoint sides, have nasty
> problems that lead to reporting failures when things are actually working,
> or outright dummy things being done that trigger errors (e.g. bad DMA
> synchronization triggers IOMMU page fault reports). I have a dozen fix
> patches for these drivers. Will clean them up and post ASAP.
>
> With the test drivers fixed + the fixes to your series, I have the
> pci_test.sh tests passing 100% of the time, repeatedly (in a loop). All
> solid.

Good to hear that it now works, I'll try them as well.

> However, I am still seeing issues with my ongoing work with an NVMe
> endpoint driver function: [...] on the endpoint side, attempting to raise
> MSI or INTX IRQs results in an error as the rockchip-ep driver sees both
> INTX and MSI as disabled. No clue what is going on. I suspect that a PCI
> reset may have happened and corrupted the core configuration. [...]
> This all may depend on the host & BIOS. Not sure. Need to try with
> different hosts. Just FYI :)

Interesting that you are working on this, I started to patch the RK3399 PCIe
endpoint controller driver for a similar project, I want to run our NVMe
firmware in a Linux PCIe endpoint function.

For the IRQs there are two things that come to mind:
1) The host driver could actually disable them and work in polling mode,
   I have seen that with different versions of the Linux kernel NVMe driver
   sometimes it would choose to use polling instead of IRQs for the queues.
   So maybe it's just that.
2) The RK3399 PCIe endpoint controller is said to be able only to generate
   one type of interrupt at a given time. "It is capable of generating MSI
   or Legacy interrupt if the PCIe is configured as EP. Notes that one PCIe
   component can't generate both types of interrupts. It is either one or
   the other." (see TRM 17.5.9 Interrupt Support).
   I don't know exactly what the TRM means by the controller not being able
   to use both interrupt types at the same time, but this might be a path
   to explore.
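Regarding point 1), whether the host's nvme driver really fell back to polling
can be checked from the host shell. A small sketch (the sysfs knob is the nvme
driver's poll_queues module parameter; vector names and counts will differ on
your system):

# 0 means no dedicated poll queues were requested, so I/O queues use IRQs
cat /sys/module/nvme/parameters/poll_queues

# Per-vector interrupt counts for the NVMe device; counters that never
# increase while I/O is running suggest completions are polled or the
# IRQs never reach the host
grep nvme /proc/interrupts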
On 2/15/23 19:28, Rick Wertenbroek wrote:
> On Wed, Feb 15, 2023 at 2:51 AM Damien Le Moal
> <damien.lemoal@opensource.wdc.com> wrote:

[...]

> For the IRQs there are two things that come to mind:
> 1) The host driver could actually disable them and work in polling mode,
>    I have seen that with different versions of the Linux kernel NVMe driver
>    sometimes it would choose to use polling instead of IRQs for the queues.
> 2) The RK3399 PCIe endpoint controller is said to be able only to generate
>    one type of interrupt at a given time. [...]
>    I don't know exactly what the TRM means by the controller not being able
>    to use both interrupt types at the same time, but this might be a path
>    to explore.

The host says that both INTX is enabled and MSI disabled when the nvme
driver starts probing. That driver starts probe with a single vector to
enable the device first and use the admin SQ/CQ for identify etc. Then,
that IRQ is freed and multiple MSI vectors are allocated, one for each
admin + IO queue pair.

The problem is that on the endpoint, the driver says that both INTX and
MSI are disabled, but the host at least sees INTX enabled, and the first
IRQ allocated for the probe enables MSI and gets one vector. But that MSI
enable is not seen by the EP, and the EP also says that INTX is disabled,
contrary to what the host says.

When the BIOS probes the drive, both INTX and MSI are OK. Only one IRQ is
used by the BIOS and I tried both by setting & disabling MSI.

What I think happens is that there may be a PCI reset/FLR or something
similar, and that screws up the core config... I do not have a PCI bus
analyzer, so hard to debug :) I did hack both the host nvme driver and EP
driver to print PCI link status etc, but I do not see anything strange
there.

Furthermore, the BAR accesses and the admin SQ/CQ command and cqe exchange
are working, as I get the identify commands from the host and the host sees
the cqe, but only after a timeout as it never receives any IRQ...

I would like to try testing without the BIOS touching the EP nvme
controller. But not sure how to do that. Probably should ignore the first
CC.EN enable event I see, which is from the BIOS.
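For reference, the host-side view of the endpoint's interrupt state can be
dumped with the standard pciutils tools. A sketch, with 01:00.0 standing in
for the endpoint's actual bus address:

# INTx pin and MSI capability state as seen by the host
sudo lspci -s 01:00.0 -vv | grep -E 'Interrupt:|MSI:'

# PCI Command register: bit 10 set means the host has disabled INTx
sudo setpci -s 01:00.0 COMMAND

# MSI Message Control word (capability offset +2): bit 0 is MSI Enable
sudo setpci -s 01:00.0 CAP_MSI+2.w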
On 2/14/23 23:08, Rick Wertenbroek wrote:
> This is a series of patches that fixes the PCIe endpoint controller driver
> for the Rockchip RK3399 SoC. The driver was introduced in
> cf590b078391 ("PCI: rockchip: Add EP driver for Rockchip PCIe controller")
> The original driver had issues and would not allow for the RK3399 to
> operate in PCIe endpoint mode correctly. This patch series fixes that so
> that the PCIe core controller of the RK3399 SoC can now act as a PCIe
> endpoint. This is v2 of the patch series and addresses the concerns that
> were raised during the review of the first version.

Rick,

Are you going to send a rebased V3 soon ? I have a couple of additional
patches to add on top of your series...

[...]

> Best regards
> Rick
>
> Rick Wertenbroek (9):
>   PCI: rockchip: Remove writes to unused registers
>   PCI: rockchip: Write PCI Device ID to correct register
>   PCI: rockchip: Assert PCI Configuration Enable bit after probe
>   PCI: rockchip: Add poll and timeout to wait for PHY PLLs to be locked
>   arm64: dts: rockchip: Add dtsi entry for RK3399 PCIe endpoint core
>   PCI: rockchip: Fix window mapping and address translation for endpoint
>   PCI: rockchip: Fix legacy IRQ generation for RK3399 PCIe endpoint core
>   PCI: rockchip: Use u32 variable to access 32-bit registers
>   PCI: rockchip: Add parameter check for RK3399 PCIe endpoint core
>     set_msi()
>
>  arch/arm64/boot/dts/rockchip/rk3399.dtsi  |  23 ++++
>  drivers/pci/controller/pcie-rockchip-ep.c | 143 ++++++++++------------
>  drivers/pci/controller/pcie-rockchip.c    |  16 +++
>  drivers/pci/controller/pcie-rockchip.h    |  36 ++++--
>  4 files changed, 128 insertions(+), 90 deletions(-)
On Tue, Mar 14, 2023 at 1:02 AM Damien Le Moal
<damien.lemoal@opensource.wdc.com> wrote:
>
> On 2/14/23 23:08, Rick Wertenbroek wrote:
> > This is a series of patches that fixes the PCIe endpoint controller driver
> > for the Rockchip RK3399 SoC. [...]
> > This is v2 of the patch series and addresses the concerns that
> > were raised during the review of the first version.
>
> Rick,
>
> Are you going to send a rebased V3 soon ? I have a couple of additional
> patches to add on top of your series...

I'll try to send a V3 this week. The changes to V2 will be the issues
raised and discussed on the V2 here in the mailing list with the additional
code for removing the unsupported MSI-X capability list (was discussed
in the mailing list as well).

[...]
On 3/14/23 16:57, Rick Wertenbroek wrote:
> On Tue, Mar 14, 2023 at 1:02 AM Damien Le Moal
> <damien.lemoal@opensource.wdc.com> wrote:
>>
>> [...]
>>
>> Rick,
>>
>> Are you going to send a rebased V3 soon ? I have a couple of additional
>> patches to add on top of your series...
>
> I'll try to send a V3 this week. The changes to V2 will be the issues
> raised and discussed on the V2 here in the mailing list with the additional
> code for removing the unsupported MSI-X capability list (was discussed
> in the mailing list as well).

Thanks.

Additional patch needed to avoid problems with this controller is that
we need to set ".align = 256" in the features. Otherwise, things do not
work well. This is because the ATU drops the low 8 bits of the PCI
addresses. It is a one liner patch, so feel free to add it to your series.

I also noticed random issues which seem to be due to link-up timing... We
probably will need to implement a poll thread to detect and notify with
the linkup callback when we actually have the link established with the
host (see the dw-ep controller which does something similar).

From: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Date: Thu, 9 Mar 2023 16:37:24 +0900
Subject: [PATCH] PCI: rockchip: Set address alignment for endpoint mode

The address translation unit of the rockchip EP controller does not use
the lower 8 bits of a PCIe-space address to map local memory. Thus we
must set the align feature field to 256 to let the user know about this
constraint.

Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
---
 drivers/pci/controller/pcie-rockchip-ep.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/pci/controller/pcie-rockchip-ep.c b/drivers/pci/controller/pcie-rockchip-ep.c
index 12db9a9d92af..c6a23db84967 100644
--- a/drivers/pci/controller/pcie-rockchip-ep.c
+++ b/drivers/pci/controller/pcie-rockchip-ep.c
@@ -471,6 +471,7 @@ static const struct pci_epc_features rockchip_pcie_epc_features = {
 	.linkup_notifier = false,
 	.msi_capable = true,
 	.msix_capable = false,
+	.align = 256,
 };

 static const struct pci_epc_features*
On Tue, Mar 14, 2023 at 9:10 AM Damien Le Moal
<damien.lemoal@opensource.wdc.com> wrote:
>
> On 3/14/23 16:57, Rick Wertenbroek wrote:
> > I'll try to send a V3 this week. The changes to V2 will be the issues
> > raised and discussed on the V2 here in the mailing list with the additional
> > code for removing the unsupported MSI-X capability list (was discussed
> > in the mailing list as well).
>
> Thanks.
>
> Additional patch needed to avoid problems with this controller is that
> we need to set ".align = 256" in the features. Otherwise, things do not
> work well. This is because the ATU drops the low 8 bits of the PCI
> addresses. It is a one liner patch, so feel free to add it to your series.
>
> I also noticed random issues which seem to be due to link-up timing... We
> probably will need to implement a poll thread to detect and notify with
> the linkup callback when we actually have the link established with the
> host (see the dw-ep controller which does something similar).

Hello Damien,
I also noticed random issues I suspect to be related to link status or power
state, in my case it sometimes happens that the BARs (0-6) in the config
space get reset to 0. This is not due to the driver because the driver never
ever accesses these registers (@0xfd80'0010 to 0xfd80'0024, TRM
17.6.4.1.5-17.6.4.1.10).

I don't think the host rewrites them because lspci shows the BARs as
"[virtual]", which means they have been assigned by the host but have a 0
value in the endpoint device (when lspci rereads the PCI config header).
See https://github.com/pciutils/pciutils/blob/master/lspci.c#L422

So I suspect the controller detects something related to link status or
power state and internally (in hardware) resets those registers. It's not
the kernel code, it never accesses these regs. The problem occurs
very randomly, sometimes in a few seconds, sometimes I cannot see
it for a whole day.

Is this similar to what you are experiencing ?

Do you have any idea as to what could cause these registers to be reset
(I could not find anything in the TRM, also nothing in the driver seems to
cause it) ?

> From: Damien Le Moal <damien.lemoal@opensource.wdc.com>
> Subject: [PATCH] PCI: rockchip: Set address alignment for endpoint mode
> [...]

Do you want me to include this patch in the V3 series or will you
submit another patch series for the changes you applied on the RK3399 PCIe
endpoint controller ? I don't know if you prefer to build the V3
together or if you prefer to submit another patch series on top of mine.
Let me know.

Best regards.
Rick
On 3/14/23 23:53, Rick Wertenbroek wrote:
> Hello Damien,
> I also noticed random issues I suspect to be related to link status or power
> state, in my case it sometimes happens that the BARs (0-6) in the config
> space get reset to 0. This is not due to the driver because the driver never
> ever accesses these registers (@0xfd80'0010 to 0xfd80'0024, TRM
> 17.6.4.1.5-17.6.4.1.10).
>
> I don't think the host rewrites them because lspci shows the BARs as
> "[virtual]", which means they have been assigned by the host but have a 0
> value in the endpoint device (when lspci rereads the PCI config header).
> See https://github.com/pciutils/pciutils/blob/master/lspci.c#L422
>
> So I suspect the controller detects something related to link status or
> power state and internally (in hardware) resets those registers. It's not
> the kernel code, it never accesses these regs. The problem occurs
> very randomly, sometimes in a few seconds, sometimes I cannot see
> it for a whole day.
>
> Is this similar to what you are experiencing ?

Yes. I sometimes get NMIs after starting the function driver, when my function
driver starts probing the bar registers after seeing the host changing one
register. And the link also comes up with 4 lanes or 2 lanes, random.

> Do you have any idea as to what could cause these registers to be reset
> (I could not find anything in the TRM, also nothing in the driver seems to
> cause it) ?

My thinking is that since we do not have a linkup notifier, the function driver
starts setting things up without the link established (e.g. when the host is
still powered down). Once the host starts booting and the pci link is
established, things may be reset in the hardware... That is the only thing I
can think of.

And yes, there is definitely something going on with the power states too I
think: if I let things idle for a few minutes, everything stops working: no
activity seen on the endpoint over the BARs. I tried enabling the sys and
client interrupts to see if I can see power state changes, or if clearing the
interrupts helps (they are masked by default), but no change. And booting the
host with pci_aspm=off does not help either. Also tried to change all the
capabilities related to link & power states to "off" (not supported), and no
change either. So currently, I am out of ideas regarding that one.

I am trying to make progress on my endpoint driver (nvme function) to be sure
it is not a bug there that breaks things. I may still have something bad
because when I enable the BIOS native NVMe driver on the host, either the host
does not boot, or grub crashes with memory corruptions. Overall, not yet very
stable and still trying to sort out the root cause of that.

> Do you want me to include this patch in the V3 series or will you
> submit another patch series for the changes you applied on the RK3399 PCIe
> endpoint controller ? I don't know if you prefer to build the V3
> together or if you prefer to submit another patch series on top of mine.
> Let me know.

If it is no trouble, please include it with your series. Will be easier to
retest everything together :)
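The host-side symptom described above can be confirmed with pciutils. A
sketch, with 01:00.0 again standing in for the endpoint's actual address:

# A region shown as "[virtual]" means the kernel has a resource assigned
# but the device returned 0 when the BAR register was re-read
sudo lspci -s 01:00.0 -vv | grep -i 'region'

# Raw config header: BAR0-BAR5 live at offsets 0x10-0x24
sudo lspci -s 01:00.0 -xxx

# Read BAR0/BAR1 directly
sudo setpci -s 01:00.0 BASE_ADDRESS_0 BASE_ADDRESS_1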
On 3/15/23 07:54, Damien Le Moal wrote:
> On 3/14/23 23:53, Rick Wertenbroek wrote:
>> So I suspect the controller detects something related to link status or
>> power state and internally (in hardware) resets those registers. [...]
>>
>> Is this similar to what you are experiencing ?
>
> [...]
>
> I am trying to make progress on my endpoint driver (nvme function) to be
> sure it is not a bug there that breaks things. I may still have something
> bad because when I enable the BIOS native NVMe driver on the host, either
> the host does not boot, or grub crashes with memory corruptions. Overall,
> not yet very stable and still trying to sort out the root cause of that.

By the way, enabling the interrupts to see the error notifications, I do see a
lot of retry timeout and other recoverable errors. So the issues I am seeing
could be due to my PCI cable setup that is not ideal (bad signal, ground loops,
... ?). Not sure. I do not have a PCI analyzer handy :)

I attached the patches I used to enable the EP interrupts. Enabling debug
prints will tell you what is going on. That may give you some hints on your
setup ?
On Wed, Mar 15, 2023 at 1:00 AM Damien Le Moal
<damien.lemoal@opensource.wdc.com> wrote:
>
> On 3/15/23 07:54, Damien Le Moal wrote:
> > On 3/14/23 23:53, Rick Wertenbroek wrote:
> >> Is this similar to what you are experiencing ?
> >
> > Yes. I sometimes get NMIs after starting the function driver, when my
> > function driver starts probing the bar registers after seeing the host
> > changing one register. And the link also comes up with 4 lanes or 2 lanes,
> > random.

Hello, I have never had it come up with only 2 lanes, I get 4 consistently.
I have it connected through an M.2 to female PCIe 16x (4x electrically
connected) adapter, then through a male-to-male PCIe 4x cable with TX/RX
swap, then through a 16x extender. All three cables are approx 25cm. It
seems stable.

> >> Do you have any idea as to what could cause these registers to be reset
> >> (I could not find anything in the TRM, also nothing in the driver seems
> >> to cause it) ?
> >
> > My thinking is that since we do not have a linkup notifier, the function
> > driver starts setting things up without the link established (e.g. when
> > the host is still powered down). Once the host starts booting and the pci
> > link is established, things may be reset in the hardware... That is the
> > only thing I can think of.

This might be worth investigating, I'll look into it, but it seems many of
the EP drivers don't have a linkup notifier,
drivers/pci/controller/dwc/pci-dra7xx.c has one, but most of the other EP
drivers don't have them, so it might not be absolutely required.

> > And yes, there is definitely something going on with the power states too
> > I think: if I let things idle for a few minutes, everything stops working:
> > no activity seen on the endpoint over the BARs. [...] So currently, I am
> > out of ideas regarding that one.
> >
> > I am trying to make progress on my endpoint driver (nvme function) to be
> > sure it is not a bug there that breaks things. I may still have something
> > bad because when I enable the BIOS native NVMe driver on the host, either
> > the host does not boot, or grub crashes with memory corruptions. Overall,
> > not yet very stable and still trying to sort out the root cause of that.

I am also working on an NVMe driver but I have our NVMe firmware running in
userspace, so our endpoint function driver only exposes the BARs as UIO
mapped memory and has a simple interface to generate IRQs to the host /
initiate DMA transfers.

So that driver does very little in itself and I still have problems with the
BARs getting unmapped (reset to 0) randomly. I hope your patches for
monitoring the IRQs will shed some light on this. I also observed the BARs
getting reset with the pcie ep test function driver, so I don't think it
necessarily is the function that is to blame, rather the controller itself
(also because none of the kernel code should / does access the BAR registers
@0xfd80'0010).

> By the way, enabling the interrupts to see the error notifications, I do
> see a lot of retry timeout and other recoverable errors. So the issues I am
> seeing could be due to my PCI cable setup that is not ideal (bad signal,
> ground loops, ... ?). Not sure. I do not have a PCI analyzer handy :)
>
> I attached the patches I used to enable the EP interrupts. Enabling debug
> prints will tell you what is going on. That may give you some hints on your
> setup ?

Thank you for these patches. I will try them and see if they give me more
info.

Also, I will delay the release of the v3 of my patch series because of these
issues. The v3 only incorporates the changes discussed here in the mailing
list so your version should be up to date. If you want me to send you the
series in its current state let me know.

But I will need some more debugging, I'll release the v3 when the driver is
more stable. I don't know when, I don't have that much time on this project.
Thanks for your understanding.

Rick
On Thu, Mar 16, 2023 at 1:52 PM Rick Wertenbroek <rick.wertenbroek@gmail.com> wrote: > > On Wed, Mar 15, 2023 at 1:00 AM Damien Le Moal > <damien.lemoal@opensource.wdc.com> wrote: > > > > On 3/15/23 07:54, Damien Le Moal wrote: > > > On 3/14/23 23:53, Rick Wertenbroek wrote: > > >> Hello Damien, > > >> I also noticed random issues I suspect to be related to link status or power > > >> state, in my case it sometimes happens that the BARs (0-6) in the config > > >> space get reset to 0. This is not due to the driver because the driver never > > >> ever accesses these registers (@0xfd80'0010 to 0xfd80'0024 TRM > > >> 17.6.4.1.5-17.6.4.1.10). > > >> I don't think the host rewrites them because lspci shows the BARs as > > >> "[virtual]" which means they have been assigned by host but have 0 > > >> value in the endpoint device (when lspci rereads the PCI config header). > > >> See https://github.com/pciutils/pciutils/blob/master/lspci.c#L422 > > >> > > >> So I suspect the controller detects something related to link status or > > >> power state and internally (in hardware) resets those registers. It's not > > >> the kernel code, it never accesses these regs. The problem occurs > > >> very randomly, sometimes in a few seconds, sometimes I cannot see > > >> it for a whole day. > > >> > > >> Is this similar to what you are experiencing ? > > > > > > Yes. I sometimes get NMIs after starting the function driver, when my function > > > driver starts probing the bar registers after seeing the host changing one > > > register. And the link also comes up with 4 lanes or 2 lanes, random. > > Hello, I have never had it come up with only 2 lanes, I get 4 consistently. > I have it connected through a M.2 to female PCIe 16x (4x electrically > connected), > then through a male-to-male PCIe 4x cable with TX/RX swap, then through a > 16x extender. All three cables are approx 25cm. It seems stable. > > > > > > >> Do you have any idea as to what could make these registers to be reset > > >> (I could not find anything in the TRM, also nothing in the driver seems to > > >> cause it). > > > > > > My thinking is that since we do not have a linkup notifier, the function driver > > > starts setting things up without the link established (e.g. when the host is > > > still powered down). Once the host start booting and pic link is established, > > > things may be reset in the hardware... That is the only thing I can think of. > > This might be worth investigating, I'll look into it, but it seems > many of the EP > drivers don't have a Linkup notifier, > drivers/pci/controller/dwc/pci-dra7xx.c has > one, but most of the other EP drivers don't have them, so it might not be > absolutely required. > > > > > > > And yes, there are definitely something going on with the power states too I > > > think: if I let things idle for a few minutes, everything stops working: no > > > activity seen on the endpoint over the BARs. I tried enabling the sys and client > > > interrupts to see if I can see power state changes, or if clearing the > > > interrupts helps (they are masked by default), but no change. And booting the > > > host with pci_aspm=off does not help either. Also tried to change all the > > > capabilities related to link & power states to "off" (not supported), and no > > > change either. So currently, I am out of ideas regarding that one. > > > > > > I am trying to make progress on my endpoint driver (nvme function) to be sure it > > > is not a bug there that breaks things. 
I may still have something bad because > > > when I enable the BIOS native NVMe driver on the host, either the host does not > > > boot, or grub crashes with memory corruptions. Overall, not yet very stable and > > > still trying to sort out the root cause of that. > > I am also working on an NVMe driver but I have our NVMe firmware running in > userspace so our endpoint function driver only exposes the BARs as UIO > mapped memory and has a simple interface to generate IRQs to host / initiate > DMA transfers. > > So that driver does very little in itself and I still have problems > with the BARs > getting unmapped (reset to 0) randomly. I hope your patches for monitoring > the IRQs will shed some light on this. I also observed the BARs getting reset > with the pcie ep test function driver, so I don't think it necessarily > is the function > that is to blame, rather the controller itself (also because none of > the kernel code > should / does access the BARs registers @0xfd80'0010). > > > > > By the way, enabling the interrupts to see the error notifications, I do see a > > lot of retry timeout and other recoverable errors. So the issues I am seeing > > could be due to my PCI cable setup that is not ideal (bad signal, ground loops, > > ... ?). Not sure. I do not have a PCI analyzer handy :) I have enabled the IRQs and messages thanks to your patches but I don't get messages from the IRQs (it seems no IRQs are fired). My PCIe link seems stable. The main issue I face is still that after a random amount of time, the BARs are reset to 0, I don't have a PCIe analyzer so I cannot chase config space TLPs (e.g., host writing the BAR values to the config header), but I don't think that the problem comes from a TLP issued from the host. (it might be). I don't think it's a buffer overflow / out-of-bounds access by kernel code for two reasons 1) The values in the config space around the BARs is coherent and unchanged 2) The bars are reset to 0 and not a random value I suspect a hardware reset of those registers issued internally in the PCIe controller, I don't know why (it might be a link related event or power state related event). I have also experienced very slow behavior with the PCI endpoint test driver, e.g., pcitest -w 1024 -d would take tens of seconds to complete. It seems to come from LCRC errors, when I check the "LCRC Error count register" @0xFD90'0214 I can see it drastically increase between two calls of pcitest (when I mean drastically it means by 6607 (0x19CF) for example). The "ECC Correctable Error Count Register" @0xFD90'0218 reads 0 though. I have tried to shorten the cabling by removing one of the PCIe extenders, that didn't change the issues much. Any ideas as to why I see a large number of TLPs with LCRC errors in them ? Do you experience the same ? What are your values in 0xFD90'0214 when running e.g., pcitest -w 1024 -d (note: you can reset the counter by writing 0xFFFF to it in case it reaches the maximum value of 0xFFFF). > > > > I attached the patches I used to enable the EP interrupts. Enabling debug prints > > will tell you what is going on. That may give you some hints on your setup ? > > > > -- > > Damien Le Moal > > Western Digital Research > > Thank you for these patches. I will try them and see if they give me more info. > > Also, I will delay the release of the v3 of my patch series because of > these issues. > The v3 only incorporates the changes discussed here in the mailing list so your > version should be up to date. 
If you want me to send you the series in > its current > state let me know. > > But I will need some more debugging; I'll release the v3 when the driver is more > stable. I don't know when; I don't have that much time for this project. Thanks for > your understanding. > > Rick
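For reference, here is a minimal user-space sketch of how the two counters mentioned above (the LCRC Error Count Register @0xFD90'0214 and the ECC Correctable Error Count Register @0xFD90'0218) could be read and cleared on the endpoint SoC. The register addresses and the write-0xFFFF-to-clear behaviour are taken from the discussion above; the /dev/mem mapping, the 32-bit access width, and the register-block layout are assumptions (and CONFIG_STRICT_DEVMEM may block this entirely), so treat it as an illustration rather than a tested tool.

/*
 * Sketch: read and clear the RK3399 PCIe LCRC/ECC error counters via /dev/mem.
 * The register addresses (0xFD900214, 0xFD900218) are the ones quoted in the
 * thread; access through /dev/mem and 32-bit reads are assumptions and may be
 * blocked by CONFIG_STRICT_DEVMEM. Run as root on the endpoint SoC.
 */
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

#define APB_BASE	0xfd900000UL	/* page-aligned base of the register block */
#define LCRC_ERR_CNT	0x214		/* LCRC Error Count Register offset */
#define ECC_CORR_CNT	0x218		/* ECC Correctable Error Count Register offset */

int main(void)
{
	int fd = open("/dev/mem", O_RDWR | O_SYNC);
	if (fd < 0) {
		perror("open /dev/mem");
		return 1;
	}

	volatile uint32_t *apb = (volatile uint32_t *)mmap(NULL, 0x1000,
					PROT_READ | PROT_WRITE, MAP_SHARED,
					fd, APB_BASE);
	if (apb == MAP_FAILED) {
		perror("mmap");
		close(fd);
		return 1;
	}

	printf("LCRC error count: %u\n", (unsigned int)apb[LCRC_ERR_CNT / 4]);
	printf("ECC correctable error count: %u\n", (unsigned int)apb[ECC_CORR_CNT / 4]);

	/* Per the thread, the LCRC counter saturates at 0xFFFF and can be
	 * cleared by writing 0xFFFF to it. */
	apb[LCRC_ERR_CNT / 4] = 0xffff;

	munmap((void *)apb, 0x1000);
	close(fd);
	return 0;
}

Run between two pcitest invocations, something like this would show how much the LCRC counter moves for a given transfer, similar to the numbers quoted above.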
On 3/17/23 01:34, Rick Wertenbroek wrote: >>> By the way, enabling the interrupts to see the error notifications, I do see a >>> lot of retry timeout and other recoverable errors. So the issues I am seeing >>> could be due to my PCI cable setup that is not ideal (bad signal, ground loops, >>> ... ?). Not sure. I do not have a PCI analyzer handy :) > > I have enabled the IRQs and messages thanks to your patches but I don't get > messages from the IRQs (it seems no IRQs are fired). My PCIe link seems stable. > The main issue I face is still that after a random amount of time, the BARs are > reset to 0, I don't have a PCIe analyzer so I cannot chase config space TLPs > (e.g., host writing the BAR values to the config header), but I don't think that > the problem comes from a TLP issued from the host. (it might be). Hmmm... I am getting lots of IRQs, especially the ones signaling "replay timer timed out" and "replay timer rolled over after 4 transmissions of the same TLP" but also some "phy error detected on receive side"... Need to try to rework my cable setup I guess. As for the BARs being reset to 0, I have not checked, but it may be why I see things not working after some inactivity. Will check that. We may be seeing the same regarding that. > I don't think it's a buffer overflow / out-of-bounds access by kernel > code for two reasons > 1) The values in the config space around the BARs is coherent and unchanged > 2) The bars are reset to 0 and not a random value > > I suspect a hardware reset of those registers issued internally in the > PCIe controller, > I don't know why (it might be a link related event or power state > related event). > > I have also experienced very slow behavior with the PCI endpoint test driver, > e.g., pcitest -w 1024 -d would take tens of seconds to complete. It seems to > come from LCRC errors, when I check the "LCRC Error count register" > @0xFD90'0214 I can see it drastically increase between two calls of pcitest > (when I mean drastically it means by 6607 (0x19CF) for example). > > The "ECC Correctable Error Count Register" @0xFD90'0218 reads 0 though. > > I have tried to shorten the cabling by removing one of the PCIe extenders, that > didn't change the issues much. > > Any ideas as to why I see a large number of TLPs with LCRC errors in them ? > Do you experience the same ? What are your values in 0xFD90'0214 when > running e.g., pcitest -w 1024 -d (note: you can reset the counter by writing > 0xFFFF to it in case it reaches the maximum value of 0xFFFF). I have not checked. But I will look at these counters to see what I have there.
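To catch the BAR reset that both sides describe, one option is to watch the endpoint's own copy of the BARs in the PCIe core configuration header (@0xfd80'0010 to 0xfd80'0024, the registers referenced earlier in the thread) and log when they drop back to 0. Below is a rough sketch of such a watcher; again the /dev/mem access, the 32-bit reads, and the 100 ms poll interval are assumptions for illustration, not something validated on the hardware.

/*
 * Sketch: watch the endpoint's BAR registers in the RK3399 PCIe core
 * configuration space (0xfd800010-0xfd800024, as referenced in the thread)
 * and report when their values change, e.g. when they are reset to 0.
 * The first pass prints the initial values since prev[] starts at 0.
 */
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/mman.h>
#include <time.h>
#include <unistd.h>

#define CFG_BASE	0xfd800000UL	/* page-aligned base of the config header */
#define BAR0_OFF	0x10		/* BAR0..BAR5 live at 0x10..0x24 */
#define NUM_BARS	6

int main(void)
{
	int fd = open("/dev/mem", O_RDONLY | O_SYNC);
	if (fd < 0) {
		perror("open /dev/mem");
		return 1;
	}

	volatile uint32_t *cfg = (volatile uint32_t *)mmap(NULL, 0x1000,
					PROT_READ, MAP_SHARED, fd, CFG_BASE);
	if (cfg == MAP_FAILED) {
		perror("mmap");
		close(fd);
		return 1;
	}

	uint32_t prev[NUM_BARS] = { 0 };

	for (;;) {
		for (int i = 0; i < NUM_BARS; i++) {
			uint32_t val = cfg[(BAR0_OFF / 4) + i];

			if (val != prev[i]) {
				printf("[%ld] BAR%d changed: 0x%08x -> 0x%08x\n",
				       (long)time(NULL), i, prev[i], val);
				prev[i] = val;
			}
		}
		usleep(100000);	/* poll every 100 ms */
	}
}

Correlating the timestamps from such a watcher with host-side events (reboot, rescan, suspend) might help tell a link- or power-state event apart from a host-initiated config write.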
On Fri, Mar 17, 2023 at 07:09:04AM +0900, Damien Le Moal wrote: > On 3/17/23 01:34, Rick Wertenbroek wrote: > >>> By the way, enabling the interrupts to see the error notifications, I do see a > >>> lot of retry timeout and other recoverable errors. So the issues I am seeing > >>> could be due to my PCI cable setup that is not ideal (bad signal, ground loops, > >>> ... ?). Not sure. I do not have a PCI analyzer handy :) > > > > I have enabled the IRQs and messages thanks to your patches but I don't get > > messages from the IRQs (it seems no IRQs are fired). My PCIe link seems stable. > > The main issue I face is still that after a random amount of time, the BARs are > > reset to 0, I don't have a PCIe analyzer so I cannot chase config space TLPs > > (e.g., host writing the BAR values to the config header), but I don't think that > > the problem comes from a TLP issued from the host. (it might be). > > Hmmm... I am getting lots of IRQs, especially the ones signaling "replay timer > timed out" and "replay timer rolled over after 4 transmissions of the same TLP" > but also some "phy error detected on receive side"... Need to try to rework my > cable setup I guess. > > As for the BARs being reset to 0, I have not checked, but it may be why I see > things not working after some inactivity. Will check that. We may be seeing the > same regarding that. > > > I don't think it's a buffer overflow / out-of-bounds access by kernel > > code for two reasons > > 1) The values in the config space around the BARs is coherent and unchanged > > 2) The bars are reset to 0 and not a random value > > > > I suspect a hardware reset of those registers issued internally in the > > PCIe controller, > > I don't know why (it might be a link related event or power state > > related event). > > > > I have also experienced very slow behavior with the PCI endpoint test driver, > > e.g., pcitest -w 1024 -d would take tens of seconds to complete. It seems to > > come from LCRC errors, when I check the "LCRC Error count register" > > @0xFD90'0214 I can see it drastically increase between two calls of pcitest > > (when I mean drastically it means by 6607 (0x19CF) for example). > > > > The "ECC Correctable Error Count Register" @0xFD90'0218 reads 0 though. > > > > I have tried to shorten the cabling by removing one of the PCIe extenders, that > > didn't change the issues much. > > > > Any ideas as to why I see a large number of TLPs with LCRC errors in them ? > > Do you experience the same ? What are your values in 0xFD90'0214 when > > running e.g., pcitest -w 1024 -d (note: you can reset the counter by writing > > 0xFFFF to it in case it reaches the maximum value of 0xFFFF). > > I have not checked. But I will look at these counters to see what I have there. Hi, checking where are we with this thread and whether there is something to consider for v6.4, if testing succeeds. Thanks, Lorenzo
On Thu, Apr 13, 2023 at 3:49 PM Lorenzo Pieralisi <lpieralisi@kernel.org> wrote: > > On Fri, Mar 17, 2023 at 07:09:04AM +0900, Damien Le Moal wrote: > > On 3/17/23 01:34, Rick Wertenbroek wrote: > > >>> By the way, enabling the interrupts to see the error notifications, I do see a > > >>> lot of retry timeout and other recoverable errors. So the issues I am seeing > > >>> could be due to my PCI cable setup that is not ideal (bad signal, ground loops, > > >>> ... ?). Not sure. I do not have a PCI analyzer handy :) > > > > > > I have enabled the IRQs and messages thanks to your patches but I don't get > > > messages from the IRQs (it seems no IRQs are fired). My PCIe link seems stable. > > > The main issue I face is still that after a random amount of time, the BARs are > > > reset to 0, I don't have a PCIe analyzer so I cannot chase config space TLPs > > > (e.g., host writing the BAR values to the config header), but I don't think that > > > the problem comes from a TLP issued from the host. (it might be). > > > > Hmmm... I am getting lots of IRQs, especially the ones signaling "replay timer > > timed out" and "replay timer rolled over after 4 transmissions of the same TLP" > > but also some "phy error detected on receive side"... Need to try to rework my > > cable setup I guess. > > > > As for the BARs being reset to 0, I have not checked, but it may be why I see > > things not working after some inactivity. Will check that. We may be seeing the > > same regarding that. > > > > > I don't think it's a buffer overflow / out-of-bounds access by kernel > > > code for two reasons > > > 1) The values in the config space around the BARs is coherent and unchanged > > > 2) The bars are reset to 0 and not a random value > > > > > > I suspect a hardware reset of those registers issued internally in the > > > PCIe controller, > > > I don't know why (it might be a link related event or power state > > > related event). > > > > > > I have also experienced very slow behavior with the PCI endpoint test driver, > > > e.g., pcitest -w 1024 -d would take tens of seconds to complete. It seems to > > > come from LCRC errors, when I check the "LCRC Error count register" > > > @0xFD90'0214 I can see it drastically increase between two calls of pcitest > > > (when I mean drastically it means by 6607 (0x19CF) for example). > > > > > > The "ECC Correctable Error Count Register" @0xFD90'0218 reads 0 though. > > > > > > I have tried to shorten the cabling by removing one of the PCIe extenders, that > > > didn't change the issues much. > > > > > > Any ideas as to why I see a large number of TLPs with LCRC errors in them ? > > > Do you experience the same ? What are your values in 0xFD90'0214 when > > > running e.g., pcitest -w 1024 -d (note: you can reset the counter by writing > > > 0xFFFF to it in case it reaches the maximum value of 0xFFFF). > > > > I have not checked. But I will look at these counters to see what I have there. > > Hi, > > checking where are we with this thread and whether there is something to > consider for v6.4, if testing succeeds. > > Thanks, > Lorenzo Hello, Thank you for considering this. There is a V3 of this patch series [1] that fixes the issues encountered with the V2. The debugging following this thread was discussed off-list with Damien Le Moal, and the V3 has been tested successfully by Damien Le Moal [2]. I will submit a V4 next week, since there are minor changes that were suggested in the threads for the V3 (mostly minor changes in code style, macros, comments).
I hope it can be considered for v6.4, thank you. [1] https://lore.kernel.org/linux-pci/29a5ccc3-d2c8-b844-a333-28bc20657942@fastmail.com/T/#mc8f2589ff04862175cb0c906b38cb37a90db0e42 [2] https://lore.kernel.org/linux-pci/29a5ccc3-d2c8-b844-a333-28bc20657942@fastmail.com/ Notes on what was discovered off-list: The issues regarding the BAR reset were due to power supply issues (the PCIe cable was jumping the host 3V3 supply to the SoC 3V3 supply) and are fixed with proper cabling. A few LCRC errors are normal with PCIe; the number depends on signal integrity at the physical layer (cabling). Best regards, Rick