Message ID | 1494403315-12760-5-git-send-email-peterx@redhat.com |
---|---|
State | New |
Headers | show |
On Wed, May 10, 2017 at 04:01:47PM +0800, Peter Xu wrote: > This function has an assumption that we will definitely call translate() > once (or say, the addr will be located inside one IOMMU memory region), > otherwise an empty IOTLB will be returned. Nevertheless, this is not > what we want. When there is no IOMMU memory region, we should build up a > static mapping for the caller, instead of an invalid IOTLB. > > We won't trigger this path before VT-d passthrough mode. When > passthrough mode for a vhost device is setup, VT-d is possible to > disable the IOMMU region for that device. Without current patch, we'll > get a vhost boot failure, and it'll be failed over to virtio userspace > mode. This doesn't look right to me. You're assuming the target is address_space_memory, which might not be the case - and you should be able to check from the MR you do hit. Furthermore it doesn't look like you're accounting for the trivial translation if the section's offset in the address space is different from its offset in the MR. I think for the fallback path you're going to want something based on address_space_translate() instead. > CC: Paolo Bonzini <pbonzini@redhat.com> > CC: Jason Wang <jasowang@redhat.com> > CC: Michael S. Tsirkin <mst@redhat.com> > Signed-off-by: Peter Xu <peterx@redhat.com> > --- > exec.c | 20 +++++++++++++++++++- > 1 file changed, 19 insertions(+), 1 deletion(-) > > diff --git a/exec.c b/exec.c > index 072de5d..5cfdacd 100644 > --- a/exec.c > +++ b/exec.c > @@ -463,12 +463,13 @@ address_space_translate_internal(AddressSpaceDispatch *d, hwaddr addr, hwaddr *x > } > > /* Called from RCU critical section */ > -IOMMUTLBEntry address_space_get_iotlb_entry(AddressSpace *as, hwaddr addr, > +IOMMUTLBEntry address_space_get_iotlb_entry(AddressSpace *as, hwaddr iova, > bool is_write) > { > IOMMUTLBEntry iotlb = {0}; > MemoryRegionSection *section; > MemoryRegion *mr; > + hwaddr addr = iova, psize; > > for (;;) { > AddressSpaceDispatch *d = atomic_rcu_read(&as->dispatch); > @@ -478,6 +479,23 @@ IOMMUTLBEntry address_space_get_iotlb_entry(AddressSpace *as, hwaddr addr, > mr = section->mr; > > if (!mr->iommu_ops) { > + /* > + * We didn't translate() but reached here. It possibly > + * means it's a static mapping. If so (it should be RAM), > + * we set the IOTLB up. > + */ > + if (!iotlb.target_as && memory_region_is_ram(mr) && > + !memory_region_is_rom(mr)) { > + psize = mr->ram_block->page_size; > + iova &= ~(psize - 1); > + iotlb = (IOMMUTLBEntry) { > + .target_as = &address_space_memory, > + .iova = iova, > + .translated_addr = iova, > + .addr_mask = psize - 1, > + .perm = IOMMU_RW, > + }; > + } > break; > } >
On Thu, May 11, 2017 at 11:56:38AM +1000, David Gibson wrote: > On Wed, May 10, 2017 at 04:01:47PM +0800, Peter Xu wrote: > > This function has an assumption that we will definitely call translate() > > once (or say, the addr will be located inside one IOMMU memory region), > > otherwise an empty IOTLB will be returned. Nevertheless, this is not > > what we want. When there is no IOMMU memory region, we should build up a > > static mapping for the caller, instead of an invalid IOTLB. > > > > We won't trigger this path before VT-d passthrough mode. When > > passthrough mode for a vhost device is setup, VT-d is possible to > > disable the IOMMU region for that device. Without current patch, we'll > > get a vhost boot failure, and it'll be failed over to virtio userspace > > mode. > > This doesn't look right to me. You're assuming the target is > address_space_memory, which might not be the case - and you should be > able to check from the MR you do hit. Furthermore it doesn't look > like you're accounting for the trivial translation if the section's > offset in the address space is different from its offset in the MR. Do you mean this line? addr = addr - section->offset_within_address_space + section->offset_within_region; I thought it was calculating the relative address against that memory region. That should only be useful if we want to do further translate(), right? For the path that this patch tries to handle (when there is no translate() call), then this "addr" is useless here? Regarding to the address space assignment - do you mean, e.g., I should use section->address_space here instead of &system_address_space? If so, I can do the switch. But after all, for now address_space_get_iotlb_entry() is only used by vhost codes, and it only check against iotlb.target_as == NULL, so the address space didn't count too much here... Another reason I used &address_space_memory is that in vfio_iommu_map_notify() we have a check against it: if (iotlb->target_as != &address_space_memory) { error_report("Wrong target AS \"%s\", only system memory is allowed", iotlb->target_as->name ? iotlb->target_as->name : "none"); return; } Or say, we have some assumptions (not only in this patch) that assumes this iotlb.target_as should be system_address_space. Thanks, > > I think for the fallback path you're going to want something based on > address_space_translate() instead. > > > CC: Paolo Bonzini <pbonzini@redhat.com> > > CC: Jason Wang <jasowang@redhat.com> > > CC: Michael S. Tsirkin <mst@redhat.com> > > Signed-off-by: Peter Xu <peterx@redhat.com> > > --- > > exec.c | 20 +++++++++++++++++++- > > 1 file changed, 19 insertions(+), 1 deletion(-) > > > > diff --git a/exec.c b/exec.c > > index 072de5d..5cfdacd 100644 > > --- a/exec.c > > +++ b/exec.c > > @@ -463,12 +463,13 @@ address_space_translate_internal(AddressSpaceDispatch *d, hwaddr addr, hwaddr *x > > } > > > > /* Called from RCU critical section */ > > -IOMMUTLBEntry address_space_get_iotlb_entry(AddressSpace *as, hwaddr addr, > > +IOMMUTLBEntry address_space_get_iotlb_entry(AddressSpace *as, hwaddr iova, > > bool is_write) > > { > > IOMMUTLBEntry iotlb = {0}; > > MemoryRegionSection *section; > > MemoryRegion *mr; > > + hwaddr addr = iova, psize; > > > > for (;;) { > > AddressSpaceDispatch *d = atomic_rcu_read(&as->dispatch); > > @@ -478,6 +479,23 @@ IOMMUTLBEntry address_space_get_iotlb_entry(AddressSpace *as, hwaddr addr, > > mr = section->mr; > > > > if (!mr->iommu_ops) { > > + /* > > + * We didn't translate() but reached here. It possibly > > + * means it's a static mapping. If so (it should be RAM), > > + * we set the IOTLB up. > > + */ > > + if (!iotlb.target_as && memory_region_is_ram(mr) && > > + !memory_region_is_rom(mr)) { > > + psize = mr->ram_block->page_size; > > + iova &= ~(psize - 1); > > + iotlb = (IOMMUTLBEntry) { > > + .target_as = &address_space_memory, > > + .iova = iova, > > + .translated_addr = iova, > > + .addr_mask = psize - 1, > > + .perm = IOMMU_RW, > > + }; > > + } > > break; > > } > > > > -- > David Gibson | I'll have my music baroque, and my code > david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ > | _way_ _around_! > http://www.ozlabs.org/~dgibson
On Thu, May 11, 2017 at 05:36:03PM +0800, Peter Xu wrote: > On Thu, May 11, 2017 at 11:56:38AM +1000, David Gibson wrote: > > On Wed, May 10, 2017 at 04:01:47PM +0800, Peter Xu wrote: > > > This function has an assumption that we will definitely call translate() > > > once (or say, the addr will be located inside one IOMMU memory region), > > > otherwise an empty IOTLB will be returned. Nevertheless, this is not > > > what we want. When there is no IOMMU memory region, we should build up a > > > static mapping for the caller, instead of an invalid IOTLB. > > > > > > We won't trigger this path before VT-d passthrough mode. When > > > passthrough mode for a vhost device is setup, VT-d is possible to > > > disable the IOMMU region for that device. Without current patch, we'll > > > get a vhost boot failure, and it'll be failed over to virtio userspace > > > mode. > > > > This doesn't look right to me. You're assuming the target is > > address_space_memory, which might not be the case - and you should be > > able to check from the MR you do hit. Furthermore it doesn't look > > like you're accounting for the trivial translation if the section's > > offset in the address space is different from its offset in the MR. > > Do you mean this line? > > addr = addr - section->offset_within_address_space > + section->offset_within_region; Uh.. where is that line? But.. wait, yes, I think I was mistaken. I saw: .translated_addr = iova, and thought that meant you were assuming an identify mapping from iova to translated addr. But thinking more carefull, IIRC iova and translated_addr are both relative to the MR, not the AS, so I think that is correct after all. > I thought it was calculating the relative address against that memory > region. That should only be useful if we want to do further > translate(), right? For the path that this patch tries to handle (when > there is no translate() call), then this "addr" is useless here? > > Regarding to the address space assignment - do you mean, e.g., I > should use section->address_space here instead of > &system_address_space? If so, I can do the switch. Yes, I think you should. > But after all, for > now address_space_get_iotlb_entry() is only used by vhost codes, and > it only check against iotlb.target_as == NULL, so the address space > didn't count too much here... > Another reason I used &address_space_memory is that in > vfio_iommu_map_notify() we have a check against it: > > if (iotlb->target_as != &address_space_memory) { > error_report("Wrong target AS \"%s\", only system memory is allowed", > iotlb->target_as->name ? iotlb->target_as->name : "none"); > return; > } > > Or say, we have some assumptions (not only in this patch) that assumes > this iotlb.target_as should be system_address_space. Right, the vhost code can only handle some IOMMU setups - something like nested IOMMUs would break it. But this way if someone sets up a machine with an IOMMU configuration that vhost can't handle, you'll get an error message, rather than accesses to unexpected locations, which could cause really hard to debug corruption. In other words we make assumptions, but we should _test_ those assumptions. I also think it would make sense to use address_space_translate() if we can, since it's an existing interface for a very similar operation. > > Thanks, > > > > > I think for the fallback path you're going to want something based on > > address_space_translate() instead. > > > > > CC: Paolo Bonzini <pbonzini@redhat.com> > > > CC: Jason Wang <jasowang@redhat.com> > > > CC: Michael S. Tsirkin <mst@redhat.com> > > > Signed-off-by: Peter Xu <peterx@redhat.com> > > > --- > > > exec.c | 20 +++++++++++++++++++- > > > 1 file changed, 19 insertions(+), 1 deletion(-) > > > > > > diff --git a/exec.c b/exec.c > > > index 072de5d..5cfdacd 100644 > > > --- a/exec.c > > > +++ b/exec.c > > > @@ -463,12 +463,13 @@ address_space_translate_internal(AddressSpaceDispatch *d, hwaddr addr, hwaddr *x > > > } > > > > > > /* Called from RCU critical section */ > > > -IOMMUTLBEntry address_space_get_iotlb_entry(AddressSpace *as, hwaddr addr, > > > +IOMMUTLBEntry address_space_get_iotlb_entry(AddressSpace *as, hwaddr iova, > > > bool is_write) > > > { > > > IOMMUTLBEntry iotlb = {0}; > > > MemoryRegionSection *section; > > > MemoryRegion *mr; > > > + hwaddr addr = iova, psize; > > > > > > for (;;) { > > > AddressSpaceDispatch *d = atomic_rcu_read(&as->dispatch); > > > @@ -478,6 +479,23 @@ IOMMUTLBEntry address_space_get_iotlb_entry(AddressSpace *as, hwaddr addr, > > > mr = section->mr; > > > > > > if (!mr->iommu_ops) { > > > + /* > > > + * We didn't translate() but reached here. It possibly > > > + * means it's a static mapping. If so (it should be RAM), > > > + * we set the IOTLB up. > > > + */ > > > + if (!iotlb.target_as && memory_region_is_ram(mr) && > > > + !memory_region_is_rom(mr)) { > > > + psize = mr->ram_block->page_size; > > > + iova &= ~(psize - 1); > > > + iotlb = (IOMMUTLBEntry) { > > > + .target_as = &address_space_memory, > > > + .iova = iova, > > > + .translated_addr = iova, > > > + .addr_mask = psize - 1, > > > + .perm = IOMMU_RW, > > > + }; > > > + } > > > break; > > > } > > > > > > > >
On Fri, May 12, 2017 at 03:25:08PM +1000, David Gibson wrote: > On Thu, May 11, 2017 at 05:36:03PM +0800, Peter Xu wrote: > > On Thu, May 11, 2017 at 11:56:38AM +1000, David Gibson wrote: > > > On Wed, May 10, 2017 at 04:01:47PM +0800, Peter Xu wrote: > > > > This function has an assumption that we will definitely call translate() > > > > once (or say, the addr will be located inside one IOMMU memory region), > > > > otherwise an empty IOTLB will be returned. Nevertheless, this is not > > > > what we want. When there is no IOMMU memory region, we should build up a > > > > static mapping for the caller, instead of an invalid IOTLB. > > > > > > > > We won't trigger this path before VT-d passthrough mode. When > > > > passthrough mode for a vhost device is setup, VT-d is possible to > > > > disable the IOMMU region for that device. Without current patch, we'll > > > > get a vhost boot failure, and it'll be failed over to virtio userspace > > > > mode. > > > > > > This doesn't look right to me. You're assuming the target is > > > address_space_memory, which might not be the case - and you should be > > > able to check from the MR you do hit. Furthermore it doesn't look > > > like you're accounting for the trivial translation if the section's > > > offset in the address space is different from its offset in the MR. > > > > Do you mean this line? > > > > addr = addr - section->offset_within_address_space > > + section->offset_within_region; > > Uh.. where is that line? But.. wait, yes, I think I was mistaken. I saw: > .translated_addr = iova, > > and thought that meant you were assuming an identify mapping from iova > to translated addr. But thinking more carefull, IIRC iova and > translated_addr are both relative to the MR, not the AS, so I think > that is correct after all. > > > I thought it was calculating the relative address against that memory > > region. That should only be useful if we want to do further > > translate(), right? For the path that this patch tries to handle (when > > there is no translate() call), then this "addr" is useless here? > > > > Regarding to the address space assignment - do you mean, e.g., I > > should use section->address_space here instead of > > &system_address_space? If so, I can do the switch. > > Yes, I think you should. > > > But after all, for > > now address_space_get_iotlb_entry() is only used by vhost codes, and > > it only check against iotlb.target_as == NULL, so the address space > > didn't count too much here... > > > Another reason I used &address_space_memory is that in > > vfio_iommu_map_notify() we have a check against it: > > > > if (iotlb->target_as != &address_space_memory) { > > error_report("Wrong target AS \"%s\", only system memory is allowed", > > iotlb->target_as->name ? iotlb->target_as->name : "none"); > > return; > > } > > > > Or say, we have some assumptions (not only in this patch) that assumes > > this iotlb.target_as should be system_address_space. > > Right, the vhost code can only handle some IOMMU setups - something > like nested IOMMUs would break it. But this way if someone sets up a > machine with an IOMMU configuration that vhost can't handle, you'll > get an error message, rather than accesses to unexpected locations, > which could cause really hard to debug corruption. > > In other words we make assumptions, but we should _test_ those > assumptions. > > I also think it would make sense to use address_space_translate() if we > can, since it's an existing interface for a very similar operation. I did give it a shot to generalize these two functions in this series (I just posted it): [PATCH 0/4] exec: address space translation cleanups Please see whether that'll be a good replacement of this single patch. Thanks,
diff --git a/exec.c b/exec.c index 072de5d..5cfdacd 100644 --- a/exec.c +++ b/exec.c @@ -463,12 +463,13 @@ address_space_translate_internal(AddressSpaceDispatch *d, hwaddr addr, hwaddr *x } /* Called from RCU critical section */ -IOMMUTLBEntry address_space_get_iotlb_entry(AddressSpace *as, hwaddr addr, +IOMMUTLBEntry address_space_get_iotlb_entry(AddressSpace *as, hwaddr iova, bool is_write) { IOMMUTLBEntry iotlb = {0}; MemoryRegionSection *section; MemoryRegion *mr; + hwaddr addr = iova, psize; for (;;) { AddressSpaceDispatch *d = atomic_rcu_read(&as->dispatch); @@ -478,6 +479,23 @@ IOMMUTLBEntry address_space_get_iotlb_entry(AddressSpace *as, hwaddr addr, mr = section->mr; if (!mr->iommu_ops) { + /* + * We didn't translate() but reached here. It possibly + * means it's a static mapping. If so (it should be RAM), + * we set the IOTLB up. + */ + if (!iotlb.target_as && memory_region_is_ram(mr) && + !memory_region_is_rom(mr)) { + psize = mr->ram_block->page_size; + iova &= ~(psize - 1); + iotlb = (IOMMUTLBEntry) { + .target_as = &address_space_memory, + .iova = iova, + .translated_addr = iova, + .addr_mask = psize - 1, + .perm = IOMMU_RW, + }; + } break; }
This function has an assumption that we will definitely call translate() once (or say, the addr will be located inside one IOMMU memory region), otherwise an empty IOTLB will be returned. Nevertheless, this is not what we want. When there is no IOMMU memory region, we should build up a static mapping for the caller, instead of an invalid IOTLB. We won't trigger this path before VT-d passthrough mode. When passthrough mode for a vhost device is setup, VT-d is possible to disable the IOMMU region for that device. Without current patch, we'll get a vhost boot failure, and it'll be failed over to virtio userspace mode. CC: Paolo Bonzini <pbonzini@redhat.com> CC: Jason Wang <jasowang@redhat.com> CC: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> --- exec.c | 20 +++++++++++++++++++- 1 file changed, 19 insertions(+), 1 deletion(-)