| Message ID | 20220607201440.41464-3-imp@bsdimp.com |
|---|---|
| State | New |
| Series | bsd-user upstreaming: read, write and exit |
On 6/7/22 13:14, Warner Losh wrote:
> +void unlock_iovec(struct iovec *vec, abi_ulong target_addr,
> +                  int count, int copy)
> +{
> +    struct target_iovec *target_vec;
> +
> +    target_vec = lock_user(VERIFY_READ, target_addr,
> +                           count * sizeof(struct target_iovec), 1);
> +    if (target_vec) {

Locking the same region twice seems like a bad idea. How about something like

typedef struct {
    struct target_iovec *target;
    abi_ulong target_addr;
    int count;
    struct iovec host[];
} IOVecMap;

IOVecMap *lock_iovec(abi_ulong target_addr, int count, bool copy_in)
{
    IOVecMap *map;

    if (count == 0) ...
    if (count < 0) ...

    map = g_try_malloc0(sizeof(IOVecMap) + sizeof(struct iovec) * count);
    if (!map) ...

    map->target = lock_user(...);
    if (!map->target) {
        g_free(map);
        errno = EFAULT;
        return NULL;
    }
    map->target_addr = target_addr;
    map->count = count;

    // lock loop

fail:
    unlock_iovec(map, false);
    errno = err;
    return NULL;
}

void unlock_iovec(IOVecMap *map, bool copy_out)
{
    for (int i = 0, count = map->count; i < count; ++i) {
        if (map->host[i].iov_base) {
            abi_ulong target_base = tswapal(map->target[i].iov_base);
            unlock_user(map->host[i].iov_base, target_base,
                        copy_out ? map->host[i].iov_len : 0);
        }
    }
    unlock_user(map->target, map->target_addr, 0);
    g_free(map);
}

r~
On Tue, Jun 7, 2022 at 2:28 PM Richard Henderson <richard.henderson@linaro.org> wrote:
> On 6/7/22 13:14, Warner Losh wrote:
> > +void unlock_iovec(struct iovec *vec, abi_ulong target_addr,
> > +                  int count, int copy)
> > +{
> > +    struct target_iovec *target_vec;
> > +
> > +    target_vec = lock_user(VERIFY_READ, target_addr,
> > +                           count * sizeof(struct target_iovec), 1);
> > +    if (target_vec) {
>
> Locking the same region twice seems like a bad idea.

We unlock the iovec memory in lock_iovec().

> How about something like
>
> typedef struct {
>     struct target_iovec *target;
>     abi_ulong target_addr;
>     int count;
>     struct iovec host[];
> } IOVecMap;
>
> IOVecMap *lock_iovec(abi_ulong target_addr, int count, bool copy_in)
> {
>     IOVecMap *map;
>
>     if (count == 0) ...
>     if (count < 0) ...
>
>     map = g_try_malloc0(sizeof(IOVecMap) + sizeof(struct iovec) * count);
>     if (!map) ...
>
>     map->target = lock_user(...);
>     if (!map->target) {
>         g_free(map);
>         errno = EFAULT;
>         return NULL;
>     }
>     map->target_addr = target_addr;
>     map->count = count;
>
>     // lock loop
>
> fail:
>     unlock_iovec(map, false);
>     errno = err;
>     return NULL;
> }
>
> void unlock_iovec(IOVecMap *map, bool copy_out)
> {
>     for (int i = 0, count = map->count; i < count; ++i) {
>         if (map->host[i].iov_base) {
>             abi_ulong target_base = tswapal(map->target[i].iov_base);
>             unlock_user(map->host[i].iov_base, target_base,
>                         copy_out ? map->host[i].iov_len : 0);
>         }

And wouldn't we want to filter out the iov_base that == 0 since
we may terminate the loop before we get to the count. When the
I/O is done, we'll call it not with the number we mapped, but with
the original number... Or am I not understanding something here...

Warner

>     }
>     unlock_user(map->target, map->target_addr, 0);
>     g_free(map);
> }
>
> r~
On 6/7/22 14:51, Warner Losh wrote:
> void unlock_iovec(IOVecMap *map, bool copy_out)
> {
>     for (int i = 0, count = map->count; i < count; ++i) {
>         if (map->host[i].iov_base) {
>             abi_ulong target_base = tswapal(map->target[i].iov_base);
>             unlock_user(map->host[i].iov_base, target_base,
>                         copy_out ? map->host[i].iov_len : 0);
>         }
>
> And wouldn't we want to filter out the iov_base that == 0 since
> we may terminate the loop before we get to the count. When the
> I/O is done, we'll call it not with the number we mapped, but with
> the original number... Or am I not understanding something here...

I'm not following -- when and why are you adjusting count?

r~
> On Jun 7, 2022, at 3:23 PM, Richard Henderson <richard.henderson@linaro.org> wrote:
>
> On 6/7/22 14:51, Warner Losh wrote:
>> void unlock_iovec(IOVecMap *map, bool copy_out)
>> {
>>     for (int i = 0, count = map->count; i < count; ++i) {
>>         if (map->host[i].iov_base) {
>>             abi_ulong target_base = tswapal(map->target[i].iov_base);
>>             unlock_user(map->host[i].iov_base, target_base,
>>                         copy_out ? map->host[i].iov_len : 0);
>>         }
>> And wouldn't we want to filter out the iov_base that == 0 since
>> we may terminate the loop before we get to the count. When the
>> I/O is done, we'll call it not with the number we mapped, but with
>> the original number... Or am I not understanding something here...
>
> I'm not following -- when and why are you adjusting count?

When we hit a memory range we can't map after the first one, we
effectively stop mapping (in the current linux code we do map after,
but then destroy the length). So that means we'll have entries in the
iovec that are zero, and this code doesn't account for that. We're not
changing the count, per se, but have a scenario where they might wind
up NULL. I'll add "if I understand all this right" because I'm a little
shaky still on these aspects of qemu's soft mmu.

Warner
On 6/7/22 16:35, Warner Losh wrote:
>> On Jun 7, 2022, at 3:23 PM, Richard Henderson <richard.henderson@linaro.org> wrote:
>>
>> On 6/7/22 14:51, Warner Losh wrote:
>>> void unlock_iovec(IOVecMap *map, bool copy_out)
>>> {
>>>     for (int i = 0, count = map->count; i < count; ++i) {
>>>         if (map->host[i].iov_base) {
>>>             abi_ulong target_base = tswapal(map->target[i].iov_base);
>>>             unlock_user(map->host[i].iov_base, target_base,
>>>                         copy_out ? map->host[i].iov_len : 0);
>>>         }
>>> And wouldn't we want to filter out the iov_base that == 0 since
>>> we may terminate the loop before we get to the count. When the
>>> I/O is done, we'll call it not with the number we mapped, but with
>>> the original number... Or am I not understanding something here...
>>
>> I'm not following -- when and why are you adjusting count?
>
> When we hit a memory range we can't map after the first one,
> we effectively stop mapping (in the current linux code we
> do map after, but then destroy the length). So that means
> we'll have entries in the iovec that are zero, and this code
> doesn't account for that. We're not changing the count, per
> se, but have a scenario where they might wind up NULL.

... and so skip them with the if.

I mean, I suppose you could set map->count on error, as you say, so that
we don't iterate so far, but... duh, error case. So long as you don't
actively fail, there's no point in optimizing for it.

r~
On Tue, Jun 7, 2022 at 7:02 PM Richard Henderson <richard.henderson@linaro.org> wrote:
> On 6/7/22 16:35, Warner Losh wrote:
> >
> >> On Jun 7, 2022, at 3:23 PM, Richard Henderson <richard.henderson@linaro.org> wrote:
> >>
> >> On 6/7/22 14:51, Warner Losh wrote:
> >>> void unlock_iovec(IOVecMap *map, bool copy_out)
> >>> {
> >>>     for (int i = 0, count = map->count; i < count; ++i) {
> >>>         if (map->host[i].iov_base) {
> >>>             abi_ulong target_base = tswapal(map->target[i].iov_base);
> >>>             unlock_user(map->host[i].iov_base, target_base,
> >>>                         copy_out ? map->host[i].iov_len : 0);
> >>>         }
> >>> And wouldn't we want to filter out the iov_base that == 0 since
> >>> we may terminate the loop before we get to the count. When the
> >>> I/O is done, we'll call it not with the number we mapped, but with
> >>> the original number... Or am I not understanding something here...
> >>
> >> I'm not following -- when and why are you adjusting count?
> >
> > When we hit a memory range we can't map after the first one,
> > we effectively stop mapping (in the current linux code we
> > do map after, but then destroy the length). So that means
> > we'll have entries in the iovec that are zero, and this code
> > doesn't account for that. We're not changing the count, per
> > se, but have a scenario where they might wind up NULL.
>
> ... and so skip them with the if.
>
> I mean, I suppose you could set map->count on error, as you say, so that
> we don't iterate so far, but... duh, error case. So long as you don't
> actively fail, there's no point in optimizing for it.

Setting the count would be hard because we'd have to allocate and free
state that we're not currently doing. Better to just skip it with an if.
We allocate a vector that's used in a number of places, and we'd have to
change that code if we did things differently. While I'm open to
suggestions here, I think that just accounting for the possible error
with an if is our best bet for now.
I have a lot of code to get in, and am hoping to not rewrite things unless there's some clear benefit over the existing structure (like fixing bugs, matching linux-user, or increasing performance). Warner
diff --git a/bsd-user/freebsd/os-syscall.c b/bsd-user/freebsd/os-syscall.c
index c41ef0eda40..510307f29d9 100644
--- a/bsd-user/freebsd/os-syscall.c
+++ b/bsd-user/freebsd/os-syscall.c
@@ -186,6 +186,20 @@ fail2:
     return NULL;
 }
 
+void unlock_iovec(struct iovec *vec, abi_ulong target_addr,
+                  int count, int copy)
+{
+    struct target_iovec *target_vec;
+
+    target_vec = lock_user(VERIFY_READ, target_addr,
+                           count * sizeof(struct target_iovec), 1);
+    if (target_vec) {
+        helper_unlock_iovec(target_vec, target_addr, vec, count, copy);
+    }
+
+    g_free(vec);
+}
+
 /*
  * do_syscall() should always have a single exit point at the end so that
  * actions, such as logging of syscall results, can be performed. All errnos
Releases the references to the iovec created by lock_iovec.

Signed-off-by: Warner Losh <imp@bsdimp.com>
---
 bsd-user/freebsd/os-syscall.c | 14 ++++++++++++++
 1 file changed, 14 insertions(+)