Message ID | 20210914015326.1494-2-jiangkunkun@huawei.com |
---|---|
State | New |
Headers | show |
Series | vfio: Some fixes about vfio-pci MMIO RAM mapping | expand |
Hi Kunkun, On 9/14/21 3:53 AM, Kunkun Jiang wrote: > We expand MemoryRegions of vfio-pci sub-page MMIO BARs to > vfio_pci_write_config to improve IO performance. s/to vfio_pci_write_config/ in vfio_pci_write_config() > The MemoryRegions of destination VM will not be expanded > successful in live migration, because their addresses have s/will not be expanded successful in live migration/are not expanded anymore after live migration (?) Is that the correct symptom? > been updated in vmstate_load_state (vfio_pci_load_config). > > So iterate BARs in vfio_pci_write_config and try to update > sub-page BARs. > > Fixes: c5e2fb3ce4d (vfio: Add save and load functions for VFIO PCI devices) is it an actual fix or an optimization? > Reported-by: Nianyao Tang <tangnianyao@huawei.com> > Reported-by: Qixin Gan <ganqixin@huawei.com> > Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com> > --- > hw/vfio/pci.c | 15 ++++++++++++++- > 1 file changed, 14 insertions(+), 1 deletion(-) > > diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c > index e1ea1d8a23..43c7e93153 100644 > --- a/hw/vfio/pci.c > +++ b/hw/vfio/pci.c > @@ -2453,7 +2453,12 @@ static int vfio_pci_load_config(VFIODevice *vbasedev, QEMUFile *f) > { > VFIOPCIDevice *vdev = container_of(vbasedev, VFIOPCIDevice, vbasedev); > PCIDevice *pdev = &vdev->pdev; > - int ret; > + pcibus_t old_addr[PCI_NUM_REGIONS - 1]; > + int bar, ret; > + > + for (bar = 0; bar < PCI_ROM_SLOT; bar++) { > + old_addr[bar] = pdev->io_regions[bar].addr; > + } what are those values before the vmstate_load_state ie. can't you do the vfio_sub_page_bar_update_mapping() unconditionnaly on old_addr[bar] != pdev->io_regions[bar].addr > > ret = vmstate_load_state(f, &vmstate_vfio_pci_config, vdev, 1); > if (ret) { > @@ -2463,6 +2468,14 @@ static int vfio_pci_load_config(VFIODevice *vbasedev, QEMUFile *f) > vfio_pci_write_config(pdev, PCI_COMMAND, > pci_get_word(pdev->config + PCI_COMMAND), 2); > > + for (bar = 0; bar < PCI_ROM_SLOT; bar++) { > + if (old_addr[bar] != pdev->io_regions[bar].addr && > + vdev->bars[bar].region.size > 0 && > + vdev->bars[bar].region.size < qemu_real_host_page_size) { > + vfio_sub_page_bar_update_mapping(pdev, bar); > + } > + } > + > if (msi_enabled(pdev)) { > vfio_msi_enable(vdev); > } else if (msix_enabled(pdev)) { Thanks Eric
Hi Eric, On 2021/10/22 0:15, Eric Auger wrote: > Hi Kunkun, > > On 9/14/21 3:53 AM, Kunkun Jiang wrote: >> We expand MemoryRegions of vfio-pci sub-page MMIO BARs to >> vfio_pci_write_config to improve IO performance. > s/to vfio_pci_write_config/ in vfio_pci_write_config() Thank you for your review. I will correct it in v3. >> The MemoryRegions of destination VM will not be expanded >> successful in live migration, because their addresses have > s/will not be expanded successful in live migration/are not expanded > anymore after live migration (?) Is that the correct symptom? You are right. They are not expanded anymore after live migration, not expanded unsuccessfully. I used the wrong words. >> been updated in vmstate_load_state (vfio_pci_load_config). >> >> So iterate BARs in vfio_pci_write_config and try to update >> sub-page BARs. >> >> Fixes: c5e2fb3ce4d (vfio: Add save and load functions for VFIO PCI devices) > is it an actual fix or an optimization? I recently realized that this is actually an optimization. The VF driver in VM use the assembly language instructions, which can operate two registers simultaneously, like stp, ldp. These instructions would result in non-ISV data abort, which cannot be handled well at the moment. If the driver doesn't use such instructions, not expanding only affects performance. I will add these to the commit message in the next version. >> Reported-by: Nianyao Tang <tangnianyao@huawei.com> >> Reported-by: Qixin Gan <ganqixin@huawei.com> >> Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com> >> --- >> hw/vfio/pci.c | 15 ++++++++++++++- >> 1 file changed, 14 insertions(+), 1 deletion(-) >> >> diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c >> index e1ea1d8a23..43c7e93153 100644 >> --- a/hw/vfio/pci.c >> +++ b/hw/vfio/pci.c >> @@ -2453,7 +2453,12 @@ static int vfio_pci_load_config(VFIODevice *vbasedev, QEMUFile *f) >> { >> VFIOPCIDevice *vdev = container_of(vbasedev, VFIOPCIDevice, vbasedev); >> PCIDevice *pdev = &vdev->pdev; >> - int ret; >> + pcibus_t old_addr[PCI_NUM_REGIONS - 1]; >> + int bar, ret; >> + >> + for (bar = 0; bar < PCI_ROM_SLOT; bar++) { >> + old_addr[bar] = pdev->io_regions[bar].addr; >> + } > what are those values before the vmstate_load_state ie. can't you do the Are you referring to pdev->io_regions[bar].addr ? All of the bars addr is PCI_BAR_UNMAPPED (~(pcibus_t)0) before the vmstate_load_state. > vfio_sub_page_bar_update_mapping() unconditionnaly on old_addr[bar] != > pdev->io_regions[bar].addr The size of Bar is a power of 2. The Bar, whose size is greater than host page size, doesn't need to be expanded. Can you explain more? May be I misunderstood you. Thanks, Kunkun Jiang >> >> ret = vmstate_load_state(f, &vmstate_vfio_pci_config, vdev, 1); >> if (ret) { >> @@ -2463,6 +2468,14 @@ static int vfio_pci_load_config(VFIODevice *vbasedev, QEMUFile *f) >> vfio_pci_write_config(pdev, PCI_COMMAND, >> pci_get_word(pdev->config + PCI_COMMAND), 2); >> >> + for (bar = 0; bar < PCI_ROM_SLOT; bar++) { >> + if (old_addr[bar] != pdev->io_regions[bar].addr && >> + vdev->bars[bar].region.size > 0 && >> + vdev->bars[bar].region.size < qemu_real_host_page_size) { >> + vfio_sub_page_bar_update_mapping(pdev, bar); >> + } >> + } >> + >> if (msi_enabled(pdev)) { >> vfio_msi_enable(vdev); >> } else if (msix_enabled(pdev)) { > Thanks > > Eric > > .
Hi Kunkun, On 10/22/21 12:01 PM, Kunkun Jiang wrote: > Hi Eric, > > On 2021/10/22 0:15, Eric Auger wrote: >> Hi Kunkun, >> >> On 9/14/21 3:53 AM, Kunkun Jiang wrote: >>> We expand MemoryRegions of vfio-pci sub-page MMIO BARs to >>> vfio_pci_write_config to improve IO performance. >> s/to vfio_pci_write_config/ in vfio_pci_write_config() > Thank you for your review. I will correct it in v3. >>> The MemoryRegions of destination VM will not be expanded >>> successful in live migration, because their addresses have >> s/will not be expanded successful in live migration/are not expanded >> anymore after live migration (?) Is that the correct symptom? > You are right. They are not expanded anymore after live migration, > not expanded unsuccessfully. I used the wrong words. >>> been updated in vmstate_load_state (vfio_pci_load_config). >>> >>> So iterate BARs in vfio_pci_write_config and try to update >>> sub-page BARs. >>> >>> Fixes: c5e2fb3ce4d (vfio: Add save and load functions for VFIO PCI >>> devices) >> is it an actual fix or an optimization? > I recently realized that this is actually an optimization. > > The VF driver in VM use the assembly language instructions, > which can operate two registers simultaneously, like stp, ldp. > These instructions would result in non-ISV data abort, which > cannot be handled well at the moment. > > If the driver doesn't use such instructions, not expanding > only affects performance. > > I will add these to the commit message in the next version. >>> Reported-by: Nianyao Tang <tangnianyao@huawei.com> >>> Reported-by: Qixin Gan <ganqixin@huawei.com> >>> Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com> >>> --- >>> hw/vfio/pci.c | 15 ++++++++++++++- >>> 1 file changed, 14 insertions(+), 1 deletion(-) >>> >>> diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c >>> index e1ea1d8a23..43c7e93153 100644 >>> --- a/hw/vfio/pci.c >>> +++ b/hw/vfio/pci.c >>> @@ -2453,7 +2453,12 @@ static int vfio_pci_load_config(VFIODevice >>> *vbasedev, QEMUFile *f) >>> { >>> VFIOPCIDevice *vdev = container_of(vbasedev, VFIOPCIDevice, >>> vbasedev); >>> PCIDevice *pdev = &vdev->pdev; >>> - int ret; >>> + pcibus_t old_addr[PCI_NUM_REGIONS - 1]; >>> + int bar, ret; >>> + >>> + for (bar = 0; bar < PCI_ROM_SLOT; bar++) { >>> + old_addr[bar] = pdev->io_regions[bar].addr; >>> + } >> what are those values before the vmstate_load_state ie. can't you do the > Are you referring to pdev->io_regions[bar].addr ? All of the bars addr is > PCI_BAR_UNMAPPED (~(pcibus_t)0) before the vmstate_load_state. OK that was my assumption. What I meant is in that case you always have old_addr[bar] != pdev->io_regions[bar].addr, right? In the positive this check is not needed and you don't need old_addr at all. In the original code this was needed since one wanted to call vfio_sub_page_bar_update_mapping() only for the bar base address that were changed during the vfio_pci_write_config. Thanks Eric >> vfio_sub_page_bar_update_mapping() unconditionnaly on old_addr[bar] != >> pdev->io_regions[bar].addr > The size of Bar is a power of 2. The Bar, whose size is greater than host > page size, doesn't need to be expanded. > > Can you explain more? May be I misunderstood you. > > Thanks, > Kunkun Jiang >>> ret = vmstate_load_state(f, &vmstate_vfio_pci_config, vdev, 1); >>> if (ret) { >>> @@ -2463,6 +2468,14 @@ static int vfio_pci_load_config(VFIODevice >>> *vbasedev, QEMUFile *f) >>> vfio_pci_write_config(pdev, PCI_COMMAND, >>> pci_get_word(pdev->config + >>> PCI_COMMAND), 2); >>> + for (bar = 0; bar < PCI_ROM_SLOT; bar++) { >>> + if (old_addr[bar] != pdev->io_regions[bar].addr && >>> + vdev->bars[bar].region.size > 0 && >>> + vdev->bars[bar].region.size < qemu_real_host_page_size) { >>> + vfio_sub_page_bar_update_mapping(pdev, bar); >>> + } >>> + } >>> + >>> if (msi_enabled(pdev)) { >>> vfio_msi_enable(vdev); >>> } else if (msix_enabled(pdev)) { >> Thanks >> >> Eric >> >> . >
Hi Eric, On 2021/10/23 22:26, Eric Auger wrote: > Hi Kunkun, > > On 10/22/21 12:01 PM, Kunkun Jiang wrote: >> Hi Eric, >> >> On 2021/10/22 0:15, Eric Auger wrote: >>> Hi Kunkun, >>> >>> On 9/14/21 3:53 AM, Kunkun Jiang wrote: >>>> We expand MemoryRegions of vfio-pci sub-page MMIO BARs to >>>> vfio_pci_write_config to improve IO performance. >>> s/to vfio_pci_write_config/ in vfio_pci_write_config() >> Thank you for your review. I will correct it in v3. >>>> The MemoryRegions of destination VM will not be expanded >>>> successful in live migration, because their addresses have >>> s/will not be expanded successful in live migration/are not expanded >>> anymore after live migration (?) Is that the correct symptom? >> You are right. They are not expanded anymore after live migration, >> not expanded unsuccessfully. I used the wrong words. >>>> been updated in vmstate_load_state (vfio_pci_load_config). >>>> >>>> So iterate BARs in vfio_pci_write_config and try to update >>>> sub-page BARs. >>>> >>>> Fixes: c5e2fb3ce4d (vfio: Add save and load functions for VFIO PCI >>>> devices) >>> is it an actual fix or an optimization? >> I recently realized that this is actually an optimization. >> >> The VF driver in VM use the assembly language instructions, >> which can operate two registers simultaneously, like stp, ldp. >> These instructions would result in non-ISV data abort, which >> cannot be handled well at the moment. >> >> If the driver doesn't use such instructions, not expanding >> only affects performance. >> >> I will add these to the commit message in the next version. >>>> Reported-by: Nianyao Tang <tangnianyao@huawei.com> >>>> Reported-by: Qixin Gan <ganqixin@huawei.com> >>>> Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com> >>>> --- >>>> hw/vfio/pci.c | 15 ++++++++++++++- >>>> 1 file changed, 14 insertions(+), 1 deletion(-) >>>> >>>> diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c >>>> index e1ea1d8a23..43c7e93153 100644 >>>> --- a/hw/vfio/pci.c >>>> +++ b/hw/vfio/pci.c >>>> @@ -2453,7 +2453,12 @@ static int vfio_pci_load_config(VFIODevice >>>> *vbasedev, QEMUFile *f) >>>> { >>>> VFIOPCIDevice *vdev = container_of(vbasedev, VFIOPCIDevice, >>>> vbasedev); >>>> PCIDevice *pdev = &vdev->pdev; >>>> - int ret; >>>> + pcibus_t old_addr[PCI_NUM_REGIONS - 1]; >>>> + int bar, ret; >>>> + >>>> + for (bar = 0; bar < PCI_ROM_SLOT; bar++) { >>>> + old_addr[bar] = pdev->io_regions[bar].addr; >>>> + } >>> what are those values before the vmstate_load_state ie. can't you do the >> Are you referring to pdev->io_regions[bar].addr ? All of the bars addr is >> PCI_BAR_UNMAPPED (~(pcibus_t)0) before the vmstate_load_state. > OK that was my assumption. What I meant is in that case you always have > > old_addr[bar] != pdev->io_regions[bar].addr, right? In the positive this check is not needed and you don't need old_addr at all. > In the original code this was needed since one wanted to call > vfio_sub_page_bar_update_mapping() only for the bar base address that were changed during the > vfio_pci_write_config. As far as I know, there is at least one case. If the VF driver is not loaded (insmod xxx.ko) in the VM, we will have old_addr[bar] == pdev->io_regions[bar].addr. The vfio_sub_page_bar_update_mapping() will be called when 0 < bar size < host page size. But vfio_sub_page_bar_update_mapping() will not change anything in this case. Thanks, Kunkun Jiang > > Thanks > > Eric > >>> vfio_sub_page_bar_update_mapping() unconditionnaly on old_addr[bar] != >>> pdev->io_regions[bar].addr >> The size of Bar is a power of 2. The Bar, whose size is greater than host >> page size, doesn't need to be expanded. >> >> Can you explain more? May be I misunderstood you. >> >> Thanks, >> Kunkun Jiang >>>> ret = vmstate_load_state(f, &vmstate_vfio_pci_config, vdev, 1); >>>> if (ret) { >>>> @@ -2463,6 +2468,14 @@ static int vfio_pci_load_config(VFIODevice >>>> *vbasedev, QEMUFile *f) >>>> vfio_pci_write_config(pdev, PCI_COMMAND, >>>> pci_get_word(pdev->config + >>>> PCI_COMMAND), 2); >>>> + for (bar = 0; bar < PCI_ROM_SLOT; bar++) { >>>> + if (old_addr[bar] != pdev->io_regions[bar].addr && >>>> + vdev->bars[bar].region.size > 0 && >>>> + vdev->bars[bar].region.size < qemu_real_host_page_size) { >>>> + vfio_sub_page_bar_update_mapping(pdev, bar); >>>> + } >>>> + } >>>> + >>>> if (msi_enabled(pdev)) { >>>> vfio_msi_enable(vdev); >>>> } else if (msix_enabled(pdev)) { >>> Thanks >>> >>> Eric >>> >>> . > .
diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c index e1ea1d8a23..43c7e93153 100644 --- a/hw/vfio/pci.c +++ b/hw/vfio/pci.c @@ -2453,7 +2453,12 @@ static int vfio_pci_load_config(VFIODevice *vbasedev, QEMUFile *f) { VFIOPCIDevice *vdev = container_of(vbasedev, VFIOPCIDevice, vbasedev); PCIDevice *pdev = &vdev->pdev; - int ret; + pcibus_t old_addr[PCI_NUM_REGIONS - 1]; + int bar, ret; + + for (bar = 0; bar < PCI_ROM_SLOT; bar++) { + old_addr[bar] = pdev->io_regions[bar].addr; + } ret = vmstate_load_state(f, &vmstate_vfio_pci_config, vdev, 1); if (ret) { @@ -2463,6 +2468,14 @@ static int vfio_pci_load_config(VFIODevice *vbasedev, QEMUFile *f) vfio_pci_write_config(pdev, PCI_COMMAND, pci_get_word(pdev->config + PCI_COMMAND), 2); + for (bar = 0; bar < PCI_ROM_SLOT; bar++) { + if (old_addr[bar] != pdev->io_regions[bar].addr && + vdev->bars[bar].region.size > 0 && + vdev->bars[bar].region.size < qemu_real_host_page_size) { + vfio_sub_page_bar_update_mapping(pdev, bar); + } + } + if (msi_enabled(pdev)) { vfio_msi_enable(vdev); } else if (msix_enabled(pdev)) {
We expand MemoryRegions of vfio-pci sub-page MMIO BARs to vfio_pci_write_config to improve IO performance. The MemoryRegions of destination VM will not be expanded successful in live migration, because their addresses have been updated in vmstate_load_state (vfio_pci_load_config). So iterate BARs in vfio_pci_write_config and try to update sub-page BARs. Fixes: c5e2fb3ce4d (vfio: Add save and load functions for VFIO PCI devices) Reported-by: Nianyao Tang <tangnianyao@huawei.com> Reported-by: Qixin Gan <ganqixin@huawei.com> Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com> --- hw/vfio/pci.c | 15 ++++++++++++++- 1 file changed, 14 insertions(+), 1 deletion(-)