Message ID | 1e2824fd77e8eeb351c6c6246f384d0d89fd2d58.1553093421.git.andreyknvl@google.com |
---|---|
State | Not Applicable |
Delegated to: | David Miller |
Series | arm64: untag user pointers passed to the kernel |
(trimmed down the cc list slightly as the message bounces)

On Mon, Apr 29, 2019 at 09:09:15PM +0300, Leon Romanovsky wrote:
> On Wed, Mar 20, 2019 at 03:51:30PM +0100, Andrey Konovalov wrote:
> > This patch is a part of a series that extends arm64 kernel ABI to allow to
> > pass tagged user pointers (with the top byte set to something else other
> > than 0x00) as syscall arguments.
> >
> > mlx4_get_umem_mr() uses provided user pointers for vma lookups, which can
> > only be done with untagged pointers.
> >
> > Untag user pointers in this function.
> >
> > Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
> > ---
> >  drivers/infiniband/hw/mlx4/mr.c | 7 ++++---
> >  1 file changed, 4 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/infiniband/hw/mlx4/mr.c b/drivers/infiniband/hw/mlx4/mr.c
> > index 395379a480cb..9a35ed2c6a6f 100644
> > --- a/drivers/infiniband/hw/mlx4/mr.c
> > +++ b/drivers/infiniband/hw/mlx4/mr.c
> > @@ -378,6 +378,7 @@ static struct ib_umem *mlx4_get_umem_mr(struct ib_udata *udata, u64 start,
> >  	 * again
> >  	 */
> >  	if (!ib_access_writable(access_flags)) {
> > +		unsigned long untagged_start = untagged_addr(start);
> >  		struct vm_area_struct *vma;
> >
> >  		down_read(&current->mm->mmap_sem);
> > @@ -386,9 +387,9 @@ static struct ib_umem *mlx4_get_umem_mr(struct ib_udata *udata, u64 start,
> >  		 * cover the memory, but for now it requires a single vma to
> >  		 * entirely cover the MR to support RO mappings.
> >  		 */
> > -		vma = find_vma(current->mm, start);
> > -		if (vma && vma->vm_end >= start + length &&
> > -		    vma->vm_start <= start) {
> > +		vma = find_vma(current->mm, untagged_start);
> > +		if (vma && vma->vm_end >= untagged_start + length &&
> > +		    vma->vm_start <= untagged_start) {
> >  			if (vma->vm_flags & VM_WRITE)
> >  				access_flags |= IB_ACCESS_LOCAL_WRITE;
> >  		} else {
> > --
>
> Thanks,
> Reviewed-by: Leon Romanovsky <leonro@mellanox.com>

Thanks for the review.

> Interesting, the followup question is why mlx4 is the only driver in IB which
> needs such code in umem_mr. I'll take a look at it.

I don't know. Just using the light heuristics of find_vma() shows some
other places. For example, ib_umem_odp_get() gets the umem->address via
ib_umem_start(). This was previously set in ib_umem_get() as called from
mlx4_get_umem_mr(). Should the above patch have just untagged "start" on
entry?

BTW, what's the provenance of such "start" address here? Is it
something that the user would have malloc()'ed? We try to impose some
restrictions on what is allowed to be tagged in user space so that we
don't have to untag the addresses in the kernel. For example, if it was
the result of an mmap() on the device file, we don't allow tagging.

Thanks.
On Tue, Apr 30, 2019 at 12:16:25PM +0100, Catalin Marinas wrote:
> (trimmed down the cc list slightly as the message bounces)
>
> On Mon, Apr 29, 2019 at 09:09:15PM +0300, Leon Romanovsky wrote:
> > On Wed, Mar 20, 2019 at 03:51:30PM +0100, Andrey Konovalov wrote:
> > > This patch is a part of a series that extends arm64 kernel ABI to allow to
> > > pass tagged user pointers (with the top byte set to something else other
> > > than 0x00) as syscall arguments.
> > >
> > > mlx4_get_umem_mr() uses provided user pointers for vma lookups, which can
> > > only be done with untagged pointers.
> > >
> > > Untag user pointers in this function.
> > >
> > > Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
> > > ---
> > >  drivers/infiniband/hw/mlx4/mr.c | 7 ++++---
> > >  1 file changed, 4 insertions(+), 3 deletions(-)
> > >
> > > diff --git a/drivers/infiniband/hw/mlx4/mr.c b/drivers/infiniband/hw/mlx4/mr.c
> > > index 395379a480cb..9a35ed2c6a6f 100644
> > > --- a/drivers/infiniband/hw/mlx4/mr.c
> > > +++ b/drivers/infiniband/hw/mlx4/mr.c
> > > @@ -378,6 +378,7 @@ static struct ib_umem *mlx4_get_umem_mr(struct ib_udata *udata, u64 start,
> > >  	 * again
> > >  	 */
> > >  	if (!ib_access_writable(access_flags)) {
> > > +		unsigned long untagged_start = untagged_addr(start);
> > >  		struct vm_area_struct *vma;
> > >
> > >  		down_read(&current->mm->mmap_sem);
> > > @@ -386,9 +387,9 @@ static struct ib_umem *mlx4_get_umem_mr(struct ib_udata *udata, u64 start,
> > >  		 * cover the memory, but for now it requires a single vma to
> > >  		 * entirely cover the MR to support RO mappings.
> > >  		 */
> > > -		vma = find_vma(current->mm, start);
> > > -		if (vma && vma->vm_end >= start + length &&
> > > -		    vma->vm_start <= start) {
> > > +		vma = find_vma(current->mm, untagged_start);
> > > +		if (vma && vma->vm_end >= untagged_start + length &&
> > > +		    vma->vm_start <= untagged_start) {
> > >  			if (vma->vm_flags & VM_WRITE)
> > >  				access_flags |= IB_ACCESS_LOCAL_WRITE;
> > >  		} else {
> > > --
> >
> > Thanks,
> > Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
>
> Thanks for the review.
>
> > Interesting, the followup question is why mlx4 is the only driver in IB which
> > needs such code in umem_mr. I'll take a look at it.
>
> I don't know. Just using the light heuristics of find_vma() shows some
> other places. For example, ib_umem_odp_get() gets the umem->address via
> ib_umem_start(). This was previously set in ib_umem_get() as called from
> mlx4_get_umem_mr(). Should the above patch have just untagged "start" on
> entry?

ODP flows are not applicable to any driver except mlx5. According to the
commit message of d8f9cc328c88 ("IB/mlx4: Mark user MR as writable if
actual virtual memory is writable"), the code in its current form is
needed to deal with different mappings between the requested RDMA memory
and the underlying VMA memory.

>
> BTW, what's the provenance of such "start" address here? Is it
> something that the user would have malloc()'ed? We try to impose some
> restrictions on what is allowed to be tagged in user space so that we
> don't have to untag the addresses in the kernel. For example, if it was
> the result of an mmap() on the device file, we don't allow tagging.

The *_reg_user_mr() is called from userspace through the ibv_reg_mr()
call [1], and this is how "address" and access flags are provided. Right
now, the address should point to memory accessible by get_user_pages();
however, mmap-ed memory uses remap_pfn_range() to provide such pages,
which makes them unusable for get_user_pages().

I would be glad to see this as a current limitation of the RDMA stack and
not as a final design decision.

[1] https://linux.die.net/man/3/ibv_reg_mr

>
> Thanks.
>
> --
> Catalin
On Tue, Apr 30, 2019 at 12:16:25PM +0100, Catalin Marinas wrote:

> > Interesting, the followup question is why mlx4 is the only driver in IB which
> > needs such code in umem_mr. I'll take a look at it.
>
> I don't know. Just using the light heuristics of find_vma() shows some
> other places. For example, ib_umem_odp_get() gets the umem->address via
> ib_umem_start(). This was previously set in ib_umem_get() as called from
> mlx4_get_umem_mr(). Should the above patch have just untagged "start" on
> entry?

I have a feeling that there needs to be something for this in the odp
code..

Presumably mmu notifiers and what not also use untagged pointers? Most
likely then the umem should also be storing untagged pointers.

This probably becomes problematic because we do want the tag in cases
talking about the base VA of the MR..

Jason
Thanks Jason and Leon for the information.

On Thu, May 02, 2019 at 03:44:42PM -0300, Jason Gunthorpe wrote:
> On Tue, Apr 30, 2019 at 12:16:25PM +0100, Catalin Marinas wrote:
> > > Interesting, the followup question is why mlx4 is the only driver in IB which
> > > needs such code in umem_mr. I'll take a look at it.
> >
> > I don't know. Just using the light heuristics of find_vma() shows some
> > other places. For example, ib_umem_odp_get() gets the umem->address via
> > ib_umem_start(). This was previously set in ib_umem_get() as called from
> > mlx4_get_umem_mr(). Should the above patch have just untagged "start" on
> > entry?
>
> I have a feeling that there needs to be something for this in the odp
> code..
>
> Presumably mmu notifiers and what not also use untagged pointers? Most
> likely then the umem should also be storing untagged pointers.

Yes.

> This probably becomes problematic because we do want the tag in cases
> talking about the base VA of the MR..

It depends on whether the tag is relevant to the kernel or not. The only
useful case so far is for the kernel performing copy_from_user() etc.
accesses so they'd get checked in the presence of hardware memory
tagging (MTE; but it's not mandatory, a 0 tag would do as well).

If we talk about a memory range where the content is relatively opaque
(or irrelevant) to the kernel code, we don't really need the tag. I'm
not familiar with RDMA but I presume it would be a device accessing such
MR but not through the user VA directly. The tag is a property of the
buffer address/pointer when accessed by the CPU at that specific address
range. Any DMA or even kernel accessing it through the linear mapping
(get_user_pages()) would drop such a tag.
On Fri, May 03, 2019 at 05:28:46PM +0100, Catalin Marinas wrote:
> Thanks Jason and Leon for the information.
>
> On Thu, May 02, 2019 at 03:44:42PM -0300, Jason Gunthorpe wrote:
> > On Tue, Apr 30, 2019 at 12:16:25PM +0100, Catalin Marinas wrote:
> > > > Interesting, the followup question is why mlx4 is the only driver in IB which
> > > > needs such code in umem_mr. I'll take a look at it.
> > >
> > > I don't know. Just using the light heuristics of find_vma() shows some
> > > other places. For example, ib_umem_odp_get() gets the umem->address via
> > > ib_umem_start(). This was previously set in ib_umem_get() as called from
> > > mlx4_get_umem_mr(). Should the above patch have just untagged "start" on
> > > entry?
> >
> > I have a feeling that there needs to be something for this in the odp
> > code..
> >
> > Presumably mmu notifiers and what not also use untagged pointers? Most
> > likely then the umem should also be storing untagged pointers.
>
> Yes.
>
> > This probably becomes problematic because we do want the tag in cases
> > talking about the base VA of the MR..
>
> It depends on whether the tag is relevant to the kernel or not. The only
> useful case so far is for the kernel performing copy_from_user() etc.
> accesses so they'd get checked in the presence of hardware memory
> tagging (MTE; but it's not mandatory, a 0 tag would do as well).
>
> If we talk about a memory range where the content is relatively opaque
> (or irrelevant) to the kernel code, we don't really need the tag. I'm
> not familiar with RDMA but I presume it would be a device accessing such
> MR but not through the user VA directly.

RDMA exposes the user VA directly (the IOVA) as part of the wire
protocol, so we must preserve the tag in these cases, as that is what
userspace is using for the pointer.

So the ODP stuff will definitely need some adjusting when it interacts
with the mmu notifiers and get_user_pages().

Jason
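The constraint discussed in this exchange (keep the tagged value as the IOVA, but use an untagged value for anything that talks to the mm) can be pictured with a small standalone sketch. Everything here is hypothetical and for illustration only: `sketch_untagged_addr()` is a simplified stand-in for the kernel's `untagged_addr()`, and `struct sketch_mr` is not the layout of `struct ib_umem` or of any real MR structure.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for untagged_addr(): make bits 63:56 copy bit 55. */
static uint64_t sketch_untagged_addr(uint64_t addr)
{
	const uint64_t tag_mask = 0xffULL << 56;

	return (addr & (1ULL << 55)) ? (addr | tag_mask) : (addr & ~tag_mask);
}

/* Hypothetical bookkeeping, not a real RDMA core structure. */
struct sketch_mr {
	uint64_t iova;		/* pointer value as userspace passed it, tag included */
	uint64_t mm_addr;	/* untagged value for VMA or notifier style lookups */
	uint64_t length;	/* size of the region in bytes */
};

/* Record both views of the same user address. */
static struct sketch_mr sketch_mr_init(uint64_t user_addr, uint64_t length)
{
	struct sketch_mr mr = {
		.iova = user_addr,
		.mm_addr = sketch_untagged_addr(user_addr),
		.length = length,
	};

	return mr;
}

/* True if an untagged range [start, end) overlaps the MR. */
static bool sketch_mr_overlaps(const struct sketch_mr *mr,
			       uint64_t start, uint64_t end)
{
	return start < mr->mm_addr + mr->length && mr->mm_addr < end;
}
```

In this sketch, a range check in the style of an mmu notifier callback compares against `mm_addr`, while the value reported to the remote side stays `iova`. Whether the real ODP code should store one value and derive the other is exactly the open question in the thread.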
diff --git a/drivers/infiniband/hw/mlx4/mr.c b/drivers/infiniband/hw/mlx4/mr.c
index 395379a480cb..9a35ed2c6a6f 100644
--- a/drivers/infiniband/hw/mlx4/mr.c
+++ b/drivers/infiniband/hw/mlx4/mr.c
@@ -378,6 +378,7 @@ static struct ib_umem *mlx4_get_umem_mr(struct ib_udata *udata, u64 start,
 	 * again
 	 */
 	if (!ib_access_writable(access_flags)) {
+		unsigned long untagged_start = untagged_addr(start);
 		struct vm_area_struct *vma;
 
 		down_read(&current->mm->mmap_sem);
@@ -386,9 +387,9 @@ static struct ib_umem *mlx4_get_umem_mr(struct ib_udata *udata, u64 start,
 		 * cover the memory, but for now it requires a single vma to
 		 * entirely cover the MR to support RO mappings.
 		 */
-		vma = find_vma(current->mm, start);
-		if (vma && vma->vm_end >= start + length &&
-		    vma->vm_start <= start) {
+		vma = find_vma(current->mm, untagged_start);
+		if (vma && vma->vm_end >= untagged_start + length &&
+		    vma->vm_start <= untagged_start) {
 			if (vma->vm_flags & VM_WRITE)
 				access_flags |= IB_ACCESS_LOCAL_WRITE;
 		} else {
This patch is a part of a series that extends arm64 kernel ABI to allow to
pass tagged user pointers (with the top byte set to something else other
than 0x00) as syscall arguments.

mlx4_get_umem_mr() uses provided user pointers for vma lookups, which can
only be done with untagged pointers.

Untag user pointers in this function.

Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
---
 drivers/infiniband/hw/mlx4/mr.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)
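
As a rough illustration of why the lookup needs an untagged address, the following standalone C sketch mimics the check in userspace. It is not kernel code: `sketch_untagged_addr()` is a simplified stand-in for `untagged_addr()`, `struct sketch_vma` holds only the two fields the check reads, and `sketch_find_vma()` is a hypothetical linear search that assumes a sorted array and returns the first range ending above the address.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for untagged_addr(): make bits 63:56 copy bit 55. */
static uint64_t sketch_untagged_addr(uint64_t addr)
{
	const uint64_t tag_mask = 0xffULL << 56;

	return (addr & (1ULL << 55)) ? (addr | tag_mask) : (addr & ~tag_mask);
}

/* Hypothetical stand-in for the VMA fields the check reads. */
struct sketch_vma {
	uint64_t vm_start;	/* first address covered */
	uint64_t vm_end;	/* first address past the end */
};

/*
 * Simplified lookup: first range whose end lies above addr, or NULL.
 * Assumes the array is sorted by address and ranges do not overlap.
 */
static const struct sketch_vma *
sketch_find_vma(const struct sketch_vma *vmas, size_t n, uint64_t addr)
{
	for (size_t i = 0; i < n; i++)
		if (addr < vmas[i].vm_end)
			return &vmas[i];
	return NULL;
}

/* The single-VMA coverage test from the patch, applied to an untagged start. */
static bool sketch_range_covered(const struct sketch_vma *vmas, size_t n,
				 uint64_t start, uint64_t length)
{
	uint64_t untagged_start = sketch_untagged_addr(start);
	const struct sketch_vma *vma =
		sketch_find_vma(vmas, n, untagged_start);

	return vma && vma->vm_end >= untagged_start + length &&
	       vma->vm_start <= untagged_start;
}
```

With a non-zero top byte, the raw pointer value compares as larger than every user mapping, so a lookup on the tagged value finds nothing. After untagging, the same range check succeeds as it would for an untagged pointer.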