Message ID | 1430817191-6231-1-git-send-email-j.fanguede@virtualopensystems.com |
---|---|
State | New |
Headers | show |
Hi Jérémy, On Tue, May 05, 2015 at 11:13:11AM +0200, Jérémy Fanguède wrote: > To maintain cache coherency on ARM, we may need a mechanism to flush > the data cache. In addition to generally just making this functionality available (see below), do you have an actual use case in mind for this? To solve the VGA issue, for example, we already have a patch series from Drew trying to address this. Does that not work for you? There was a long discussion about this here: https://lists.cs.columbia.edu/pipermail/kvmarm/2015-February/013593.html Drew then created a patch set, here: https://lists.nongnu.org/archive/html/qemu-devel/2015-03/msg01254.html and replied to himself, here: https://www.marc.info/?l=android-virt&m=142670523929132&w=3 Which basically says that he doesn't like having to do flushes all over QEMU (IIUC), so he sent this version instead: https://lists.cs.columbia.edu/pipermail/kvmarm/2015-March/014027.html Which he now said he'd respin. Meanwhile there was also the idea of simply convincing the guest to access devices as coherent through added DT properties etc., but I'm not sure where that landed. I think it would be better to agree to the overall approach before sending out further patches. On this particular patch, it does feel like sidestepping general kernel functionality for arm64 through KVM, which I think is a bit weird. I'd like to hear Will and Catalin's thoughts on this before we'd go ahead with something like this. > > This patch implements KVM_FLUSH_DCACHE_GPA vm ioctl which flushes the > data cache at a specified address range. The input argument is a > struct kvm_mem_addr containing the guest physical address and the > length. > > Signed-off-by: Jérémy Fanguède <j.fanguede@virtualopensystems.com> > --- > arch/arm/kvm/arm.c | 13 +++++++++++++ > include/uapi/linux/kvm.h | 6 ++++++ > 2 files changed, 19 insertions(+) > > diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c > index d9631ec..8638fd2 100644 > --- a/arch/arm/kvm/arm.c > +++ b/arch/arm/kvm/arm.c > @@ -886,6 +886,19 @@ long kvm_arch_vm_ioctl(struct file *filp, > > return 0; > } > + case KVM_FLUSH_DCACHE_GPA: { > + struct kvm_mem_addr mem_addr; > + hva_t hva; > + gpa_t gpa; > + > + if (copy_from_user(&mem_addr, argp, sizeof(mem_addr))) > + return -EFAULT; > + > + gpa = mem_addr.addr; > + hva = gfn_to_hva(kvm, gpa_to_gfn(gpa)) | (gpa & ~PAGE_MASK); > + kvm_flush_dcache_to_poc((void *)hva, mem_addr.len); In case you end up moving further with this, see the following commit: 0d3e4d4 (arm/arm64: KVM: Use kernel mapping to perform invalidation on page fault, 2015-01-05) I think you end up with the same problem here. -Christoffer
On Wed, May 6, 2015 at 4:12 PM, Christoffer Dall <christoffer.dall@linaro.org> wrote: > Hi Jérémy, > > On Tue, May 05, 2015 at 11:13:11AM +0200, Jérémy Fanguède wrote: >> To maintain cache coherency on ARM, we may need a mechanism to flush >> the data cache. > > In addition to generally just making this functionality available (see > below), do you have an actual use case in mind for this? To solve the > VGA issue, for example, we already have a patch series from Drew trying > to address this. Does that not work for you? > > There was a long discussion about this here: > https://lists.cs.columbia.edu/pipermail/kvmarm/2015-February/013593.html > > Drew then created a patch set, here: > https://lists.nongnu.org/archive/html/qemu-devel/2015-03/msg01254.html > > and replied to himself, here: > https://www.marc.info/?l=android-virt&m=142670523929132&w=3 > > Which basically says that he doesn't like having to do flushes all over > QEMU (IIUC), so he sent this version instead: > https://lists.cs.columbia.edu/pipermail/kvmarm/2015-March/014027.html > > Which he now said he'd respin. In fact, I used this ioctl in pairs with this QEMU patch series: https://lists.nongnu.org/archive/html/qemu-devel/2015-05/msg00407.html My current work doesn't do anything about vga ram, so vga issue probably still persists, but it solves others issues with some emulated devices (mainly PCI) which were failing before and now work fine with this patch.
On Thu, May 07, 2015 at 12:50:50PM +0200, Jérémy Fanguède wrote: > On Wed, May 6, 2015 at 4:12 PM, Christoffer Dall > <christoffer.dall@linaro.org> wrote: > > Hi Jérémy, > > > > On Tue, May 05, 2015 at 11:13:11AM +0200, Jérémy Fanguède wrote: > >> To maintain cache coherency on ARM, we may need a mechanism to flush > >> the data cache. > > > > In addition to generally just making this functionality available (see > > below), do you have an actual use case in mind for this? To solve the > > VGA issue, for example, we already have a patch series from Drew trying > > to address this. Does that not work for you? > > > > There was a long discussion about this here: > > https://lists.cs.columbia.edu/pipermail/kvmarm/2015-February/013593.html > > > > Drew then created a patch set, here: > > https://lists.nongnu.org/archive/html/qemu-devel/2015-03/msg01254.html > > > > and replied to himself, here: > > https://www.marc.info/?l=android-virt&m=142670523929132&w=3 > > > > Which basically says that he doesn't like having to do flushes all over > > QEMU (IIUC), so he sent this version instead: > > https://lists.cs.columbia.edu/pipermail/kvmarm/2015-March/014027.html > > > > Which he now said he'd respin. > > In fact, I used this ioctl in pairs with this QEMU patch series: > https://lists.nongnu.org/archive/html/qemu-devel/2015-05/msg00407.html > My current work doesn't do anything about vga ram, so vga issue > probably still persists, but it solves others issues with some > emulated devices (mainly PCI) which were failing before and now work > fine with this patch. Why does Drew's approach not work and your approach works here? What is the case that we haven't though about yet? -Christoffer
On Thu, May 7, 2015 at 1:20 PM, Christoffer Dall <christoffer.dall@linaro.org> wrote: > On Thu, May 07, 2015 at 12:50:50PM +0200, Jérémy Fanguède wrote: >> On Wed, May 6, 2015 at 4:12 PM, Christoffer Dall >> <christoffer.dall@linaro.org> wrote: >> > Hi Jérémy, >> > >> > On Tue, May 05, 2015 at 11:13:11AM +0200, Jérémy Fanguède wrote: >> >> To maintain cache coherency on ARM, we may need a mechanism to flush >> >> the data cache. >> > >> > In addition to generally just making this functionality available (see >> > below), do you have an actual use case in mind for this? To solve the >> > VGA issue, for example, we already have a patch series from Drew trying >> > to address this. Does that not work for you? >> > >> > There was a long discussion about this here: >> > https://lists.cs.columbia.edu/pipermail/kvmarm/2015-February/013593.html >> > >> > Drew then created a patch set, here: >> > https://lists.nongnu.org/archive/html/qemu-devel/2015-03/msg01254.html >> > >> > and replied to himself, here: >> > https://www.marc.info/?l=android-virt&m=142670523929132&w=3 >> > >> > Which basically says that he doesn't like having to do flushes all over >> > QEMU (IIUC), so he sent this version instead: >> > https://lists.cs.columbia.edu/pipermail/kvmarm/2015-March/014027.html >> > >> > Which he now said he'd respin. >> >> In fact, I used this ioctl in pairs with this QEMU patch series: >> https://lists.nongnu.org/archive/html/qemu-devel/2015-05/msg00407.html >> My current work doesn't do anything about vga ram, so vga issue >> probably still persists, but it solves others issues with some >> emulated devices (mainly PCI) which were failing before and now work >> fine with this patch. > > Why does Drew's approach not work and your approach works here? What is > the case that we haven't though about yet? The first patch from Andrew, (which is for arm64 only) doesn't permit me to make some emulated PCI devices working with virt, probably because some flushes/cleans are missing. As for the second patch, it focuses, for now, only on VGA ram. I quickly tried to extend the KVM_MEM_UNCACHED flag to the whole guest memory, but then the guest fails to boot; but even if it was working, does it make sense to map as uncached all the ram of the guest? Since we can not guess which region of the guest memory will be accessed. Simple PCI devices like e1000 or usb-ehci (with usb-kbd for instance) are not usable, with or without these patches, but if I flush a precise memory range, on reads and writes performed by emulated devices on guest ram, (It's exactly what the QEMU patch series that I sent does, with this ioctl), emulated PCI devices are now working.
On Thu, May 7, 2015 at 4:50 PM, Jérémy Fanguède <j.fanguede@virtualopensystems.com> wrote: > On Thu, May 7, 2015 at 1:20 PM, Christoffer Dall > <christoffer.dall@linaro.org> wrote: >> On Thu, May 07, 2015 at 12:50:50PM +0200, Jérémy Fanguède wrote: >>> On Wed, May 6, 2015 at 4:12 PM, Christoffer Dall >>> <christoffer.dall@linaro.org> wrote: >>> > Hi Jérémy, >>> > >>> > On Tue, May 05, 2015 at 11:13:11AM +0200, Jérémy Fanguède wrote: >>> >> To maintain cache coherency on ARM, we may need a mechanism to flush >>> >> the data cache. >>> > >>> > In addition to generally just making this functionality available (see >>> > below), do you have an actual use case in mind for this? To solve the >>> > VGA issue, for example, we already have a patch series from Drew trying >>> > to address this. Does that not work for you? >>> > >>> > There was a long discussion about this here: >>> > https://lists.cs.columbia.edu/pipermail/kvmarm/2015-February/013593.html >>> > >>> > Drew then created a patch set, here: >>> > https://lists.nongnu.org/archive/html/qemu-devel/2015-03/msg01254.html >>> > >>> > and replied to himself, here: >>> > https://www.marc.info/?l=android-virt&m=142670523929132&w=3 >>> > >>> > Which basically says that he doesn't like having to do flushes all over >>> > QEMU (IIUC), so he sent this version instead: >>> > https://lists.cs.columbia.edu/pipermail/kvmarm/2015-March/014027.html >>> > >>> > Which he now said he'd respin. >>> >>> In fact, I used this ioctl in pairs with this QEMU patch series: >>> https://lists.nongnu.org/archive/html/qemu-devel/2015-05/msg00407.html >>> My current work doesn't do anything about vga ram, so vga issue >>> probably still persists, but it solves others issues with some >>> emulated devices (mainly PCI) which were failing before and now work >>> fine with this patch. >> >> Why does Drew's approach not work and your approach works here? What is >> the case that we haven't though about yet? > > The first patch from Andrew, (which is for arm64 only) doesn't permit > me to make some emulated PCI devices working with virt, probably > because some flushes/cleans are missing. > As for the second patch, it focuses, for now, only on VGA ram. I > quickly tried to extend the KVM_MEM_UNCACHED flag to the whole guest > memory, but then the guest fails to boot; but even if it was working, > does it make sense to map as uncached all the ram of the guest? Since > we can not guess which region of the guest memory will be accessed. > > Simple PCI devices like e1000 or usb-ehci (with usb-kbd for instance) > are not usable, with or without these patches, but if I flush a > precise memory range, on reads and writes performed by emulated > devices on guest ram, (It's exactly what the QEMU patch series that I > sent does, with this ioctl), emulated PCI devices are now working. I understand all this. What I'd like for us to find out is why we are having coherency issues. We knew that for the VGA adapter, the guest maps the memory as uncached (because that's how the real hardware works), and QEMU maps the memory as cached (because it's just normal memory), and unsurprisingly the two views of that memory is not coherent. What are the cases you are seeing with e1000 or usb-ehci? Hint: We can make a lot of things work by just sticking cache flushes all over, but it's not a good engineering approach. Thanks, -Christoffer
On Thu, May 7, 2015 at 5:34 PM, Christoffer Dall <christoffer.dall@linaro.org> wrote: > On Thu, May 7, 2015 at 4:50 PM, Jérémy Fanguède > <j.fanguede@virtualopensystems.com> wrote: >> On Thu, May 7, 2015 at 1:20 PM, Christoffer Dall >> <christoffer.dall@linaro.org> wrote: >>> On Thu, May 07, 2015 at 12:50:50PM +0200, Jérémy Fanguède wrote: >>>> On Wed, May 6, 2015 at 4:12 PM, Christoffer Dall >>>> <christoffer.dall@linaro.org> wrote: >>>> > Hi Jérémy, >>>> > >>>> > On Tue, May 05, 2015 at 11:13:11AM +0200, Jérémy Fanguède wrote: >>>> >> To maintain cache coherency on ARM, we may need a mechanism to flush >>>> >> the data cache. >>>> > >>>> > In addition to generally just making this functionality available (see >>>> > below), do you have an actual use case in mind for this? To solve the >>>> > VGA issue, for example, we already have a patch series from Drew trying >>>> > to address this. Does that not work for you? >>>> > >>>> > There was a long discussion about this here: >>>> > https://lists.cs.columbia.edu/pipermail/kvmarm/2015-February/013593.html >>>> > >>>> > Drew then created a patch set, here: >>>> > https://lists.nongnu.org/archive/html/qemu-devel/2015-03/msg01254.html >>>> > >>>> > and replied to himself, here: >>>> > https://www.marc.info/?l=android-virt&m=142670523929132&w=3 >>>> > >>>> > Which basically says that he doesn't like having to do flushes all over >>>> > QEMU (IIUC), so he sent this version instead: >>>> > https://lists.cs.columbia.edu/pipermail/kvmarm/2015-March/014027.html >>>> > >>>> > Which he now said he'd respin. >>>> >>>> In fact, I used this ioctl in pairs with this QEMU patch series: >>>> https://lists.nongnu.org/archive/html/qemu-devel/2015-05/msg00407.html >>>> My current work doesn't do anything about vga ram, so vga issue >>>> probably still persists, but it solves others issues with some >>>> emulated devices (mainly PCI) which were failing before and now work >>>> fine with this patch. >>> >>> Why does Drew's approach not work and your approach works here? What is >>> the case that we haven't though about yet? >> >> The first patch from Andrew, (which is for arm64 only) doesn't permit >> me to make some emulated PCI devices working with virt, probably >> because some flushes/cleans are missing. >> As for the second patch, it focuses, for now, only on VGA ram. I >> quickly tried to extend the KVM_MEM_UNCACHED flag to the whole guest >> memory, but then the guest fails to boot; but even if it was working, >> does it make sense to map as uncached all the ram of the guest? Since >> we can not guess which region of the guest memory will be accessed. >> >> Simple PCI devices like e1000 or usb-ehci (with usb-kbd for instance) >> are not usable, with or without these patches, but if I flush a >> precise memory range, on reads and writes performed by emulated >> devices on guest ram, (It's exactly what the QEMU patch series that I >> sent does, with this ioctl), emulated PCI devices are now working. > > I understand all this. What I'd like for us to find out is why we are > having coherency issues. We knew that for the VGA adapter, the guest > maps the memory as uncached (because that's how the real hardware > works), and QEMU maps the memory as cached (because it's just normal > memory), and unsurprisingly the two views of that memory is not > coherent. > > What are the cases you are seeing with e1000 or usb-ehci? USB devices fail with a timeout error, as if the communication between the kernel and the devices fail at a certain point: usb 1-1: device not accepting address 5, error -110 usb usb1-port1: unable to enumerate USB device e1000 fails when the userspace tries to use it, with these type of kernel messages: e1000 0000:00:02.0 eth0: Detected Tx Unit Hang Tx Queue <0> TDH <d> TDT <d> next_to_use <d> next_to_clean <9> buffer_info[next_to_clean] time_stamp <ffff9311> next_to_watch <a> jiffies <ffff956a> next_to_watch.status <0> > > Hint: We can make a lot of things work by just sticking cache flushes > all over, but it's not a good engineering approach. Yes, it's probably not the best approach here, but currently it's the only solution that I have to make some PCI devices usable on ARM.
On 07/05/2015 18:56, Jérémy Fanguède wrote: > USB devices fail with a timeout error, as if the communication between > the kernel and the devices fail at a certain point: > usb 1-1: device not accepting address 5, error -110 > usb usb1-port1: unable to enumerate USB device > > e1000 fails when the userspace tries to use it, with these type of > kernel messages: > e1000 0000:00:02.0 eth0: Detected Tx Unit Hang > Tx Queue <0> > TDH <d> > TDT <d> > next_to_use <d> > next_to_clean <9> > buffer_info[next_to_clean] > time_stamp <ffff9311> > next_to_watch <a> > jiffies <ffff956a> > next_to_watch.status <0> Can you find out what memory attributes the guest is using for the memory---and if it's uncached, why? Paolo
On 05/07/15 19:01, Paolo Bonzini wrote: > > > On 07/05/2015 18:56, Jérémy Fanguède wrote: >> USB devices fail with a timeout error, as if the communication between >> the kernel and the devices fail at a certain point: >> usb 1-1: device not accepting address 5, error -110 >> usb usb1-port1: unable to enumerate USB device This is consistent with what I saw in my earlier testing. >> e1000 fails when the userspace tries to use it, with these type of >> kernel messages: >> e1000 0000:00:02.0 eth0: Detected Tx Unit Hang >> Tx Queue <0> >> TDH <d> >> TDT <d> >> next_to_use <d> >> next_to_clean <9> >> buffer_info[next_to_clean] >> time_stamp <ffff9311> >> next_to_watch <a> >> jiffies <ffff956a> >> next_to_watch.status <0> > > Can you find out what memory attributes the guest is using for the > memory---and if it's uncached, why? For USB, see "drivers/usb/core/hcd-pci.c", function usb_hcd_pci_probe(): it uses ioremap_nocache(). On the "why", that ioremap_nocache() call can be tracked to http://git.kernel.org/cgit/linux/kernel/git/tglx/history.git/commit/?id=a914dd8b (Feb 2002), which predates the kernel's move to git. I guess ioremap_nocache() is used simply because USB host controllers are supposed to programmed like that. And, from "arch/arm64/include/asm/io.h": #define ioremap_nocache(addr, size) __ioremap((addr), (size), __pgprot(PROT_DEVICE_nGnRE)) Thanks Laszlo
On Fri, May 15, 2015 at 01:43:57PM +0200, Laszlo Ersek wrote: > On 05/07/15 19:01, Paolo Bonzini wrote: > > > > > > On 07/05/2015 18:56, Jérémy Fanguède wrote: > >> USB devices fail with a timeout error, as if the communication between > >> the kernel and the devices fail at a certain point: > >> usb 1-1: device not accepting address 5, error -110 > >> usb usb1-port1: unable to enumerate USB device > > This is consistent with what I saw in my earlier testing. > > >> e1000 fails when the userspace tries to use it, with these type of > >> kernel messages: > >> e1000 0000:00:02.0 eth0: Detected Tx Unit Hang > >> Tx Queue <0> > >> TDH <d> > >> TDT <d> > >> next_to_use <d> > >> next_to_clean <9> > >> buffer_info[next_to_clean] > >> time_stamp <ffff9311> > >> next_to_watch <a> > >> jiffies <ffff956a> > >> next_to_watch.status <0> > > > > Can you find out what memory attributes the guest is using for the > > memory---and if it's uncached, why? > > For USB, see "drivers/usb/core/hcd-pci.c", function usb_hcd_pci_probe(): > it uses ioremap_nocache(). > > On the "why", that ioremap_nocache() call can be tracked to > > http://git.kernel.org/cgit/linux/kernel/git/tglx/history.git/commit/?id=a914dd8b > > (Feb 2002), which predates the kernel's move to git. I guess > ioremap_nocache() is used simply because USB host controllers are > supposed to programmed like that. > > And, from "arch/arm64/include/asm/io.h": > > #define ioremap_nocache(addr, size) __ioremap((addr), (size), > __pgprot(PROT_DEVICE_nGnRE)) > So this just means that these devices should be mapped as device memory (like the VGA case before) right? And therefore should work with Drew's patches (assuming they are actually correct and you add the right QEMU annotations to set the memory regions and non-cacheable), correct? Thanks, -Christoffer
On 15/05/2015 17:12, Christoffer Dall wrote: >>> > > Can you find out what memory attributes the guest is using for the >>> > > memory---and if it's uncached, why? >> > >> > For USB, see "drivers/usb/core/hcd-pci.c", function usb_hcd_pci_probe(): >> > it uses ioremap_nocache(). >> > >> > On the "why", that ioremap_nocache() call can be tracked to >> > >> > http://git.kernel.org/cgit/linux/kernel/git/tglx/history.git/commit/?id=a914dd8b >> > >> > (Feb 2002), which predates the kernel's move to git. I guess >> > ioremap_nocache() is used simply because USB host controllers are >> > supposed to programmed like that. >> > >> > And, from "arch/arm64/include/asm/io.h": >> > >> > #define ioremap_nocache(addr, size) __ioremap((addr), (size), >> > __pgprot(PROT_DEVICE_nGnRE)) >> > > So this just means that these devices should be mapped as device memory > (like the VGA case before) right? And therefore should work with Drew's > patches (assuming they are actually correct and you add the right QEMU > annotations to set the memory regions and non-cacheable), correct? As far as I understand ioremap_nocache() is used on a MMIO BAR. QEMU does not back those with RAM at all, each access causes a userspace exit. Paolo
diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c index d9631ec..8638fd2 100644 --- a/arch/arm/kvm/arm.c +++ b/arch/arm/kvm/arm.c @@ -886,6 +886,19 @@ long kvm_arch_vm_ioctl(struct file *filp, return 0; } + case KVM_FLUSH_DCACHE_GPA: { + struct kvm_mem_addr mem_addr; + hva_t hva; + gpa_t gpa; + + if (copy_from_user(&mem_addr, argp, sizeof(mem_addr))) + return -EFAULT; + + gpa = mem_addr.addr; + hva = gfn_to_hva(kvm, gpa_to_gfn(gpa)) | (gpa & ~PAGE_MASK); + kvm_flush_dcache_to_poc((void *)hva, mem_addr.len); + return 0; + } default: return -EINVAL; } diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h index 4b60056..3bc599e 100644 --- a/include/uapi/linux/kvm.h +++ b/include/uapi/linux/kvm.h @@ -978,6 +978,11 @@ struct kvm_arm_device_addr { __u64 addr; }; +struct kvm_mem_addr { + __u64 addr; + __u32 len; +}; + /* * Device control API, available with KVM_CAP_DEVICE_CTRL */ @@ -1199,6 +1204,7 @@ struct kvm_s390_ucas_mapping { /* Available with KVM_CAP_S390_IRQ_STATE */ #define KVM_S390_SET_IRQ_STATE _IOW(KVMIO, 0xb5, struct kvm_s390_irq_state) #define KVM_S390_GET_IRQ_STATE _IOW(KVMIO, 0xb6, struct kvm_s390_irq_state) +#define KVM_FLUSH_DCACHE_GPA _IOW(KVMIO, 0xb7, struct kvm_mem_addr) #define KVM_DEV_ASSIGN_ENABLE_IOMMU (1 << 0) #define KVM_DEV_ASSIGN_PCI_2_3 (1 << 1)
To maintain cache coherency on ARM, we may need a mechanism to flush the data cache. This patch implements KVM_FLUSH_DCACHE_GPA vm ioctl which flushes the data cache at a specified address range. The input argument is a struct kvm_mem_addr containing the guest physical address and the length. Signed-off-by: Jérémy Fanguède <j.fanguede@virtualopensystems.com> --- arch/arm/kvm/arm.c | 13 +++++++++++++ include/uapi/linux/kvm.h | 6 ++++++ 2 files changed, 19 insertions(+)