[SRU,J:linux-bluefield,v7,0/6] Add VFIO P2P support

Message ID 20240925172214.1511114-1-witu@nvidia.com

Message

William Tu Sept. 25, 2024, 5:22 p.m. UTC
BugLink: https://bugs.launchpad.net/bugs/2077887

This series adds support for the VFIO P2P feature used for NVMe target offload.
It is based on the patch series below (not upstreamed), with some fixes to make it work.
https://patchwork.kernel.org/project/linux-rdma/cover/0-v2-472615b3877e+28f7-vfio_dma_buf_jgg@nvidia.com/

* Kernel Config
The following options need to be enabled:
CONFIG_PCI_P2PDMA (enables P2P DMA transactions)
CONFIG_DMABUF_MOVE_NOTIFY (no need to pin down the memory)

The feature doesn't work with the IOMMU, so either disable the IOMMU
system-wide by adding "iommu.passthrough=1 iommu=off" to the kernel boot
parameters, or conditionally disable it for VFIO.
CONFIG_VFIO_NOIOMMU (optional)
- Enable CONFIG_VFIO_NOIOMMU in the kernel config if your vfio module
does not list enable_unsafe_noiommu_mode among its parameters.
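
As a rough aid for checking the options above, here is a minimal Python sketch that scans the text of a kernel config file and reports which of the named options are not enabled. The function names and the REQUIRED/OPTIONAL split are illustrative assumptions, not part of the series.

```python
import re

# Options named in this cover letter. CONFIG_VFIO_NOIOMMU is described as optional.
REQUIRED = ("CONFIG_PCI_P2PDMA", "CONFIG_DMABUF_MOVE_NOTIFY")
OPTIONAL = ("CONFIG_VFIO_NOIOMMU",)

# Matches lines such as "CONFIG_FOO=y"; "# CONFIG_FOO is not set" does not match.
_CONFIG_LINE = re.compile(r"^(CONFIG_[A-Za-z0-9_]+)=(.*)$")


def enabled_options(config_text):
    """Return the set of CONFIG_* symbols set to 'y' or 'm' in the given text."""
    enabled = set()
    for line in config_text.splitlines():
        match = _CONFIG_LINE.match(line.strip())
        if match and match.group(2) in ("y", "m"):
            enabled.add(match.group(1))
    return enabled


def missing_options(config_text, wanted=REQUIRED):
    """Return the symbols from 'wanted' that are not enabled, in the given order."""
    enabled = enabled_options(config_text)
    return [name for name in wanted if name not in enabled]
```

The text passed in would typically be the contents of the running kernel's config file (on Ubuntu this is usually found under /boot, which is an assumption about the install). An empty list from missing_options() means every required option is set.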

* /etc/modprobe.d/vfio.conf
$ cat /etc/modprobe.d/vfio.conf
# The kernel hangs when 0000:06:00.0 (Non-Volatile memory controller: KIOXIA
# Corporation Device 0001) is bound to the vfio-pci module, and the following
# error is printed:
#
# vfio-pci 0000:06:00.0: Unable to change power state from D3hot to D0, device inaccessible
#
# The configuration below is a workaround for the issue.
options vfio-pci disable_idle_d3=1

# Allow using VFIO without IOMMU. This is only required when SMMU is disabled.
options vfio enable_unsafe_noiommu_mode=1
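
To double-check that a modprobe.d file carries the two settings shown above, a small parser like the following could be used. This is only a sketch: parse_modprobe_options is a hypothetical helper, and it handles only "options" lines and whole-line "#" comments.

```python
def parse_modprobe_options(text):
    """Map module name -> {parameter: value} for 'options' lines in modprobe.d text."""
    result = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        # Only "options <module> <param>=<value> ..." lines are of interest here.
        if fields[0] != "options" or len(fields) < 3:
            continue
        # modprobe treats '-' and '_' in module names as equivalent.
        module = fields[1].replace("-", "_")
        params = result.setdefault(module, {})
        for item in fields[2:]:
            key, _, value = item.partition("=")
            params[key] = value
    return result
```

For the file above, the result would contain disable_idle_d3 under "vfio_pci" and enable_unsafe_noiommu_mode under "vfio", both with the value "1".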

* Setup and Test
1. For ease of testing, we take a BF3 and boot the OS from mmc instead
of the NVMe drive, leaving the NVMe drive free to use.

2. Use a test program to verify correctness.
The program creates an ibv RDMA program, opens the NVMe drive, creates a
VFIO dma-buf from the NVMe drive, and lets the RDMA program read (via
P2P DMA) the version number of the NVMe drive.
https://gitlab.com/Mellanox/spdk_team/vfio-dmabuf-test

* UAPI changes
See "vfio/pci: Allow MMIO regions to be exported through
    dma-buf" for details.

v7:
- add back the missing vfio_pci_dma_buf_move code, suggested by Bartlomiej

v6: 
- remove comments, fix the vfio_pci_set_power_state code

v5:
- add back the missing power management code, suggested by Bartlomiej

v4:
- fix Makefile

v1->v2:
- introduce new ioctl uAPI for vfio dma-buf

v2->v3:
- squash the ioctl uapi change into
  "vfio/pci: Allow MMIO regions to be exported through dma-buf"

Jason Gunthorpe (4):
  UBUNTU: SAUCE: dma-buf: Add dma_buf_try_get()
  UBUNTU: SAUCE: vfio: Add vfio_device_get()
  UBUNTU: SAUCE: vfio_pci: Do not open code pci_try_reset_function()
  UBUNTU: SAUCE: vfio/pci: Allow MMIO regions to be exported through
    dma-buf

Sergey Gorenko (1):
  UBUNTU: SAUCE: vfio/pci: Fix p2p address

William Tu (1):
  UBUNTU: [Config] bluefield: add config for VFIO P2P

 debian.bluefield/config/annotations |   2 +
 drivers/vfio/pci/Makefile           |   1 +
 drivers/vfio/pci/dma_buf.c          | 265 ++++++++++++++++++++++++++++
 drivers/vfio/pci/vfio_pci_config.c  |  24 +--
 drivers/vfio/pci/vfio_pci_core.c    |  76 +++++---
 drivers/vfio/pci/vfio_pci_priv.h    |  28 +++
 include/linux/dma-buf.h             |  13 ++
 include/linux/vfio.h                |   5 +
 include/linux/vfio_pci_core.h       |   1 +
 include/uapi/linux/vfio.h           |  17 ++
 10 files changed, 395 insertions(+), 37 deletions(-)
 create mode 100644 drivers/vfio/pci/dma_buf.c
 create mode 100644 drivers/vfio/pci/vfio_pci_priv.h

Comments

Bartlomiej Zolnierkiewicz Sept. 26, 2024, 2:54 p.m. UTC | #1
Acked-by: Bartlomiej Zolnierkiewicz <bartlomiej.zolnierkiewicz@canonical.com>

--
Best regards,
Bartlomiej
Agathe Porte Oct. 2, 2024, 12:22 p.m. UTC | #2
Acked-by: Agathe Porte <agathe.porte@canonical.com>
Bartlomiej Zolnierkiewicz Oct. 7, 2024, 3:46 p.m. UTC | #3
Applied to jammy:linux-bluefield/master-next. Thanks.

--
Best regards,
Bartlomiej
