[v2,00/23] migration/multifd: Refactor ->send_prepare() and cleanups

Message ID 20240202102857.110210-1-peterx@redhat.com

Message

Peter Xu Feb. 2, 2024, 10:28 a.m. UTC
From: Peter Xu <peterx@redhat.com>

v1: https://lore.kernel.org/r/20240131103111.306523-1-peterx@redhat.com

This v2 patchset contains quite a few refactorings of the current multifd
code:

 1) Redefines the send_prepare() interface to be:

      p->pages ----------->  send_prepare() -------------> IOVs

    A major goal is to prepare for others to add new hardware accelerators
    for multifd compression, or accelerators for zeropage detection.

    It turns out we don't yet need more hooks to achieve that, so hopefully
    most new accelerator code will not need many changes when rebased onto
    this (please still wait until this series is fully reviewed and merged;
    hopefully that happens soon).  Please note that p->normal[] is
    dropped; please use p->pages->offset[] for the same purpose instead.
    Please check the code for details.

    We may want a separate set of OPs for file later; I'll leave that
    decision to Fabiano.

    This also prepares for possibly replacing p->pages with raw buffers
    (VFIO usage).  But that's left for later too.  Logically, with this
    patchset applied, it should be much easier.

 2) [new in v2] Fixed one more racy use of MultiFDSendParams.packet_num,
    as reported by Fabiano during his review of v1.

 3) [new in v2] Made the multifd sender side lockless by dropping the
    mutex, as suggested by Fabiano in his review of v1.

 4) A lot of cleanups to the multifd code.  This picks up some patches from
    an old series of mine [0] (the last patches were dropped, though; I did
    the cleanup slightly differently), and adds a bunch of other cleanups
    that I either noticed while doing that, or that Fabiano requested when
    reviewing v1.

    Note: when I worked on this I found even more things to clean up.  But I
    decided to stop at mostly what Fabiano requested in v1, to keep this
    series from growing further.

1) above is mostly done in:

    migration/multifd: Move header prepare/fill into send_prepare()

2) above is mostly done in:

    migration/multifd: Fix MultiFDSendParams.packet_num race

3) above is mostly done in:

    migration/multifd: Optimize sender side to be lockless

The rest of the patches all fall into category 4).  Please have a look,
thanks.
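
For readers who want a picture of the interface in 1), here is a minimal,
self-contained sketch of a no-compression send_prepare() that turns the
queued pages into IOVs.  It is not the code from the series: the Sketch*
types, the field names other than pages->offset[], the fixed page size, and
the placement of the packet header in iov[0] are all simplifying
assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/uio.h>   /* struct iovec */

#define SKETCH_PAGE_SIZE 4096u   /* assumed fixed page size */
#define SKETCH_MAX_PAGES 128u    /* assumed per-packet page limit */

/* Hypothetical, simplified stand-in for the queued page set (p->pages). */
typedef struct {
    uint8_t *host_base;                  /* start of the RAM block */
    uint32_t num;                        /* number of queued pages */
    uint64_t offset[SKETCH_MAX_PAGES];   /* page offsets within the block */
} SketchPages;

/* Hypothetical, simplified stand-in for the per-channel send parameters. */
typedef struct {
    SketchPages *pages;
    uint8_t header[64];                        /* placeholder packet header */
    struct iovec iov[SKETCH_MAX_PAGES + 1];    /* iov[0] carries the header */
    uint32_t iovs_num;
    uint64_t next_packet_size;
} SketchSendParams;

/*
 * No-compression flavour: consume p->pages, produce p->iov[].
 * The header is prepared inside the hook, then one IOV is added per page.
 * Returns 0 on success.
 */
static int sketch_nocomp_send_prepare(SketchSendParams *p)
{
    SketchPages *pages = p->pages;

    assert(pages->num <= SKETCH_MAX_PAGES);

    /* Header goes first (assumption of this sketch). */
    p->iovs_num = 0;
    p->iov[p->iovs_num].iov_base = p->header;
    p->iov[p->iovs_num].iov_len = sizeof(p->header);
    p->iovs_num++;

    /* One IOV per queued page, addressed through pages->offset[]. */
    for (uint32_t i = 0; i < pages->num; i++) {
        p->iov[p->iovs_num].iov_base = pages->host_base + pages->offset[i];
        p->iov[p->iovs_num].iov_len = SKETCH_PAGE_SIZE;
        p->iovs_num++;
    }

    p->next_packet_size = (uint64_t)pages->num * SKETCH_PAGE_SIZE;
    return 0;
}
```

A compression or zeropage-detection backend would presumably keep the same
input (the page offsets) and only change what ends up in the IOVs, which is
why no extra hooks should be needed.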

Avihai/Fabiano, I hope my understanding is right that we can still consider
this series before a separate patchset to fix the dangling thread issue.
If not, please shoot.

v2 changelog:

- When a spurious wakeup happens in a multifd sender thread, crash hard
  rather than causing a deadlock later.
- Always use atomic operations on pending_job / pending_sync
- Moved the setup of zerocopy write_flags from multifd_save_setup() into
  the no-comp setup() phase.
- Added the patches below, each requested in some form by Fabiano:
  migration/multifd: Split multifd_send_terminate_threads()
  migration/multifd: Change retval of multifd_queue_page()
  migration/multifd: Change retval of multifd_send_pages()
  migration/multifd: Rewrite multifd_queue_page()
  migration/multifd: Cleanup multifd_save_cleanup()
  migration/multifd: Cleanup multifd_load_cleanup()
  migration/multifd: Stick with send/recv on function names
  migration/multifd: Fix MultiFDSendParams.packet_num race
  migration/multifd: Optimize sender side to be lockless
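
To illustrate the first two changelog items together with the lockless
sender, below is a rough sketch of a mutex-free handoff using C11 atomics.
Only the flag names pending_job and pending_sync come from the series; the
Sketch* types, the function names, the payload field, and the chosen memory
orderings are assumptions, and the real code also involves semaphores and
error paths that are left out here.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical, simplified per-channel state. */
typedef struct {
    atomic_bool pending_job;    /* a data job is queued for the sender */
    atomic_bool pending_sync;   /* a SYNC request is queued for the sender */
    int payload;                /* written by the producer before publishing */
} SketchChannel;

typedef enum { SKETCH_WORK_JOB, SKETCH_WORK_SYNC } SketchWork;

/* Producer side: hand over a job only if the channel is idle. */
static bool sketch_try_queue_job(SketchChannel *c, int payload)
{
    if (atomic_load_explicit(&c->pending_job, memory_order_acquire)) {
        return false;   /* sender still busy with the previous job */
    }
    c->payload = payload;
    /* Release store publishes the payload; real code would then wake the sender. */
    atomic_store_explicit(&c->pending_job, true, memory_order_release);
    return true;
}

/* Producer side: SYNC requests are tracked separately from normal jobs. */
static void sketch_request_sync(SketchChannel *c)
{
    atomic_store_explicit(&c->pending_sync, true, memory_order_release);
}

/* Sender side: called once per wakeup, with no mutex held. */
static SketchWork sketch_sender_wakeup(SketchChannel *c, int *payload_out)
{
    if (atomic_load_explicit(&c->pending_job, memory_order_acquire)) {
        *payload_out = c->payload;
        /* Clearing the flag hands the channel back to the producer. */
        atomic_store_explicit(&c->pending_job, false, memory_order_release);
        return SKETCH_WORK_JOB;
    }

    /* Neither flag set would be a spurious wakeup: crash hard, don't hang. */
    bool sync = atomic_load_explicit(&c->pending_sync, memory_order_acquire);
    assert(sync && "spurious wakeup of sender thread");
    atomic_store_explicit(&c->pending_sync, false, memory_order_release);
    return SKETCH_WORK_SYNC;
}
```

The point of the sketch is that each flag has exactly one writer at any
time (the producer sets it, the sender clears it), so the acquire/release
pair is enough to order the payload access without a lock.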

[0] https://lore.kernel.org/r/20231022201211.452861-1-peterx@redhat.com
[1] https://lore.kernel.org/qemu-devel/20240126221943.26628-1-farosas@suse.de

Peter Xu (23):
  migration/multifd: Drop stale comment for multifd zero copy
  migration/multifd: multifd_send_kick_main()
  migration/multifd: Drop MultiFDSendParams.quit, cleanup error paths
  migration/multifd: Postpone reset of MultiFDPages_t
  migration/multifd: Drop MultiFDSendParams.normal[] array
  migration/multifd: Separate SYNC request with normal jobs
  migration/multifd: Simplify locking in sender thread
  migration/multifd: Drop pages->num check in sender thread
  migration/multifd: Rename p->num_packets and clean it up
  migration/multifd: Move total_normal_pages accounting
  migration/multifd: Move trace_multifd_send|recv()
  migration/multifd: multifd_send_prepare_header()
  migration/multifd: Move header prepare/fill into send_prepare()
  migration/multifd: Forbid spurious wakeups
  migration/multifd: Split multifd_send_terminate_threads()
  migration/multifd: Change retval of multifd_queue_page()
  migration/multifd: Change retval of multifd_send_pages()
  migration/multifd: Rewrite multifd_queue_page()
  migration/multifd: Cleanup multifd_save_cleanup()
  migration/multifd: Cleanup multifd_load_cleanup()
  migration/multifd: Stick with send/recv on function names
  migration/multifd: Fix MultiFDSendParams.packet_num race
  migration/multifd: Optimize sender side to be lockless

 migration/multifd.h      |  50 ++--
 migration/migration.c    |  12 +-
 migration/multifd-zlib.c |  11 +-
 migration/multifd-zstd.c |  11 +-
 migration/multifd.c      | 630 ++++++++++++++++++++++-----------------
 migration/ram.c          |   2 +-
 migration/trace-events   |   2 +-
 7 files changed, 404 insertions(+), 314 deletions(-)

Comments

Peter Xu Feb. 6, 2024, 3:05 a.m. UTC | #1
queued.