mbox series

[0/5] migration/multifd: Prerequisite cleanups for ongoing work

Message ID 20240126221943.26628-1-farosas@suse.de
Headers show
Series migration/multifd: Prerequisite cleanups for ongoing work | expand

Message

Fabiano Rosas Jan. 26, 2024, 10:19 p.m. UTC
Hi,

Here are two cleanups that are prerequiste for the fixed-ram work, but
also affect the other series on the list at the moment, so I want to
make sure it works for everyone:

1) Separate multifd_ops from compression. The multifd_ops are
   currently coupled with the multifd_compression parameter.

We're adding new multifd_ops in the fixed-ram work and adding new
compression ops in the compression work.

2) Add a new send hook. The multifd_send_thread code currently does
   some twists to support zero copy, which is a socket-only feature.

This might affect the zero page and DSA work which add code to
multifd_send_thread.

CI run: https://gitlab.com/farosas/qemu/-/pipelines/1154332360

(I also tested zero copy locally. We cannot add a test for it because
it needs root due to memory locking limits)

Fabiano Rosas (5):
  migration/multifd: Separate compression ops from non-compression
  migration/multifd: Move multifd_socket_ops to socket.c
  migration/multifd: Add multifd_ops->send
  migration/multifd: Simplify zero copy send
  migration/multifd: Move zero copy flag into multifd_socket_setup

 migration/multifd-zlib.c |   9 ++-
 migration/multifd-zstd.c |   9 ++-
 migration/multifd.c      | 164 +++++----------------------------------
 migration/multifd.h      |   6 +-
 migration/socket.c       |  90 ++++++++++++++++++++-
 5 files changed, 128 insertions(+), 150 deletions(-)

Comments

Liu, Yuan1 Jan. 29, 2024, 1:41 a.m. UTC | #1
> -----Original Message-----
> From: Fabiano Rosas <farosas@suse.de>
> Sent: Saturday, January 27, 2024 6:20 AM
> To: qemu-devel@nongnu.org
> Cc: Peter Xu <peterx@redhat.com>; Hao Xiang <hao.xiang@bytedance.com>;
> Liu, Yuan1 <yuan1.liu@intel.com>; Bryan Zhang <bryan.zhang@bytedance.com>
> Subject: [PATCH 0/5] migration/multifd: Prerequisite cleanups for ongoing
> work
> 
> Hi,
> 
> Here are two cleanups that are prerequiste for the fixed-ram work, but
> also affect the other series on the list at the moment, so I want to make
> sure it works for everyone:
> 
> 1) Separate multifd_ops from compression. The multifd_ops are
>    currently coupled with the multifd_compression parameter.
> 
> We're adding new multifd_ops in the fixed-ram work and adding new
> compression ops in the compression work.
> 2) Add a new send hook. The multifd_send_thread code currently does
>    some twists to support zero copy, which is a socket-only feature.
> 
> This might affect the zero page and DSA work which add code to
> multifd_send_thread.

Thank you for your reminder, I reviewed the patch set and there is 
a question.

Because this change has an impact on the previous live migration 
With IAA Patch, does the submission of the next version needs 
to be submitted based on this change?

> 
> CI run: https://gitlab.com/farosas/qemu/-/pipelines/1154332360
> 
> (I also tested zero copy locally. We cannot add a test for it because it
> needs root due to memory locking limits)
> 
> Fabiano Rosas (5):
>   migration/multifd: Separate compression ops from non-compression
>   migration/multifd: Move multifd_socket_ops to socket.c
>   migration/multifd: Add multifd_ops->send
>   migration/multifd: Simplify zero copy send
>   migration/multifd: Move zero copy flag into multifd_socket_setup
> 
>  migration/multifd-zlib.c |   9 ++-
>  migration/multifd-zstd.c |   9 ++-
>  migration/multifd.c      | 164 +++++----------------------------------
>  migration/multifd.h      |   6 +-
>  migration/socket.c       |  90 ++++++++++++++++++++-
>  5 files changed, 128 insertions(+), 150 deletions(-)
> 
> --
> 2.35.3
Peter Xu Jan. 29, 2024, 7:36 a.m. UTC | #2
On Mon, Jan 29, 2024 at 01:41:01AM +0000, Liu, Yuan1 wrote:
> Because this change has an impact on the previous live migration 
> With IAA Patch, does the submission of the next version needs 
> to be submitted based on this change?

I'd say hold off a little while until we're more certain on the planned
interface changes, to avoid you rebase your code back and forth; unless
you're pretty confident that this will be the right approach.

I apologize on not having looked at any of the QAT/IAA compression / zero
detection series posted on the list; I do plan to read them very soon too
after Fabiano.  So I may not have a complete full picture here yet, please
bare with me.

If this series is trying to provide a base ground for all the efforts,
it'll be great if we can thoroughly discuss here and settle an approach
soon that will satisfy everyone.

Thanks,
Fabiano Rosas Jan. 29, 2024, 12:51 p.m. UTC | #3
Peter Xu <peterx@redhat.com> writes:

> On Mon, Jan 29, 2024 at 01:41:01AM +0000, Liu, Yuan1 wrote:
>> Because this change has an impact on the previous live migration 
>> With IAA Patch, does the submission of the next version needs 
>> to be submitted based on this change?
>
> I'd say hold off a little while until we're more certain on the planned
> interface changes, to avoid you rebase your code back and forth; unless
> you're pretty confident that this will be the right approach.
>
> I apologize on not having looked at any of the QAT/IAA compression / zero
> detection series posted on the list; I do plan to read them very soon too
> after Fabiano.  So I may not have a complete full picture here yet, please
> bare with me.
>
> If this series is trying to provide a base ground for all the efforts,
> it'll be great if we can thoroughly discuss here and settle an approach
> soon that will satisfy everyone.

Just a summary if it helps:

For compression work (IAA/QPL, QAT) the discussion is around having a
new "compression acceleration" option that enables the accelerators and
is complementary to the existing zlib compression method. We'd choose
those automatically based on availability and we'd make HW accelerated
compression produce a stream that is compatible with QEMU's zlib stream
so we could migrate between solutions.

For zero page work and zero page acceleration (DSA), the question is how
to fit zero page detection into multifd and whether we need a new hook
multifd_ops->zero_page_detect() (or similar) to allow client code to
provide it's own zero page detection methods. My worry here is that
teaching multifd to recognize zero pages is one more coupling to the
"pages" data type. Ideallly we'd find a way to include that operation as
a prepare() responsibility and the client code would deal with it.
Peter Xu Jan. 31, 2024, 9:29 a.m. UTC | #4
On Mon, Jan 29, 2024 at 09:51:06AM -0300, Fabiano Rosas wrote:
> Peter Xu <peterx@redhat.com> writes:
> 
> > On Mon, Jan 29, 2024 at 01:41:01AM +0000, Liu, Yuan1 wrote:
> >> Because this change has an impact on the previous live migration 
> >> With IAA Patch, does the submission of the next version needs 
> >> to be submitted based on this change?
> >
> > I'd say hold off a little while until we're more certain on the planned
> > interface changes, to avoid you rebase your code back and forth; unless
> > you're pretty confident that this will be the right approach.
> >
> > I apologize on not having looked at any of the QAT/IAA compression / zero
> > detection series posted on the list; I do plan to read them very soon too
> > after Fabiano.  So I may not have a complete full picture here yet, please
> > bare with me.
> >
> > If this series is trying to provide a base ground for all the efforts,
> > it'll be great if we can thoroughly discuss here and settle an approach
> > soon that will satisfy everyone.
> 
> Just a summary if it helps:
> 
> For compression work (IAA/QPL, QAT) the discussion is around having a
> new "compression acceleration" option that enables the accelerators and
> is complementary to the existing zlib compression method. We'd choose
> those automatically based on availability and we'd make HW accelerated
> compression produce a stream that is compatible with QEMU's zlib stream
> so we could migrate between solutions.
> 
> For zero page work and zero page acceleration (DSA), the question is how
> to fit zero page detection into multifd and whether we need a new hook
> multifd_ops->zero_page_detect() (or similar) to allow client code to
> provide it's own zero page detection methods. My worry here is that
> teaching multifd to recognize zero pages is one more coupling to the
> "pages" data type. Ideallly we'd find a way to include that operation as
> a prepare() responsibility and the client code would deal with it.

Thanks Fabiano.

Since I'm preparing the old series to post for some fundamental cleanups
around multifd, and when I'm looking around the code, I noticed that
_maybe_ it'll also be eaiser to apply such a series if we can cleanup more
things then move towards a clean base to add more accelerators.

I agree many ideas in your this series, but I may address it slightly
different (e.g., I want to avoid send(), but you can consider that in the
fixed-ram series instead), also it'll be after some other cleanup I plan to
give a stab at which is not yet covered in this series.  I hope I can add
your "Co-developed-by" in some of the patches there.  If you haven't spend
more time on new version of this series, please wait 1-2 days so I can post
my thoughts.
Fabiano Rosas Jan. 31, 2024, 1:19 p.m. UTC | #5
Peter Xu <peterx@redhat.com> writes:

> On Mon, Jan 29, 2024 at 09:51:06AM -0300, Fabiano Rosas wrote:
>> Peter Xu <peterx@redhat.com> writes:
>> 
>> > On Mon, Jan 29, 2024 at 01:41:01AM +0000, Liu, Yuan1 wrote:
>> >> Because this change has an impact on the previous live migration 
>> >> With IAA Patch, does the submission of the next version needs 
>> >> to be submitted based on this change?
>> >
>> > I'd say hold off a little while until we're more certain on the planned
>> > interface changes, to avoid you rebase your code back and forth; unless
>> > you're pretty confident that this will be the right approach.
>> >
>> > I apologize on not having looked at any of the QAT/IAA compression / zero
>> > detection series posted on the list; I do plan to read them very soon too
>> > after Fabiano.  So I may not have a complete full picture here yet, please
>> > bare with me.
>> >
>> > If this series is trying to provide a base ground for all the efforts,
>> > it'll be great if we can thoroughly discuss here and settle an approach
>> > soon that will satisfy everyone.
>> 
>> Just a summary if it helps:
>> 
>> For compression work (IAA/QPL, QAT) the discussion is around having a
>> new "compression acceleration" option that enables the accelerators and
>> is complementary to the existing zlib compression method. We'd choose
>> those automatically based on availability and we'd make HW accelerated
>> compression produce a stream that is compatible with QEMU's zlib stream
>> so we could migrate between solutions.
>> 
>> For zero page work and zero page acceleration (DSA), the question is how
>> to fit zero page detection into multifd and whether we need a new hook
>> multifd_ops->zero_page_detect() (or similar) to allow client code to
>> provide it's own zero page detection methods. My worry here is that
>> teaching multifd to recognize zero pages is one more coupling to the
>> "pages" data type. Ideallly we'd find a way to include that operation as
>> a prepare() responsibility and the client code would deal with it.
>
> Thanks Fabiano.
>
> Since I'm preparing the old series to post for some fundamental cleanups
> around multifd, and when I'm looking around the code, I noticed that
> _maybe_ it'll also be eaiser to apply such a series if we can cleanup more
> things then move towards a clean base to add more accelerators.
>
> I agree many ideas in your this series, but I may address it slightly
> different (e.g., I want to avoid send(), but you can consider that in the
> fixed-ram series instead), also it'll be after some other cleanup I plan to
> give a stab at which is not yet covered in this series.  I hope I can add
> your "Co-developed-by" in some of the patches there.  If you haven't spend
> more time on new version of this series, please wait 1-2 days so I can post
> my thoughts.

Sure, go ahead.
Hao Xiang Feb. 1, 2024, 1:11 a.m. UTC | #6
On Wed, Jan 31, 2024 at 5:19 AM Fabiano Rosas <farosas@suse.de> wrote:
>
> Peter Xu <peterx@redhat.com> writes:
>
> > On Mon, Jan 29, 2024 at 09:51:06AM -0300, Fabiano Rosas wrote:
> >> Peter Xu <peterx@redhat.com> writes:
> >>
> >> > On Mon, Jan 29, 2024 at 01:41:01AM +0000, Liu, Yuan1 wrote:
> >> >> Because this change has an impact on the previous live migration
> >> >> With IAA Patch, does the submission of the next version needs
> >> >> to be submitted based on this change?
> >> >
> >> > I'd say hold off a little while until we're more certain on the planned
> >> > interface changes, to avoid you rebase your code back and forth; unless
> >> > you're pretty confident that this will be the right approach.
> >> >
> >> > I apologize on not having looked at any of the QAT/IAA compression / zero
> >> > detection series posted on the list; I do plan to read them very soon too
> >> > after Fabiano.  So I may not have a complete full picture here yet, please
> >> > bare with me.
> >> >
> >> > If this series is trying to provide a base ground for all the efforts,
> >> > it'll be great if we can thoroughly discuss here and settle an approach
> >> > soon that will satisfy everyone.
> >>
> >> Just a summary if it helps:
> >>
> >> For compression work (IAA/QPL, QAT) the discussion is around having a
> >> new "compression acceleration" option that enables the accelerators and
> >> is complementary to the existing zlib compression method. We'd choose
> >> those automatically based on availability and we'd make HW accelerated
> >> compression produce a stream that is compatible with QEMU's zlib stream
> >> so we could migrate between solutions.
> >>
> >> For zero page work and zero page acceleration (DSA), the question is how
> >> to fit zero page detection into multifd and whether we need a new hook
> >> multifd_ops->zero_page_detect() (or similar) to allow client code to
> >> provide it's own zero page detection methods. My worry here is that
> >> teaching multifd to recognize zero pages is one more coupling to the
> >> "pages" data type. Ideallly we'd find a way to include that operation as
> >> a prepare() responsibility and the client code would deal with it.
> >
> > Thanks Fabiano.

Hi Fabiano,

Your current refactoring assumes that compression ops and multifd
socket ops are mutually exclusive. Both of them need to implement the
entire MultiFDMethods interface. I think this works fine for now. Once
we introduce multifd zero page checking and we add a new interface for
that, we are adding a new method zero_page_detect() on the
MultiFDMethods interface. If we do that, zero_page_detect() needs to
be implemented in multifd_socket_ops and it also needs to be
implemented in zlib and zstd. On top of that, if we add an accelerator
to offload zero_page_detect(), that accelerator configuration can
co-exist with compression or socket. That makes things quite
complicated in my opinion.

Can we create an instance of MultiFDMethods at runtime and fill each
method depending on the configuration? If methods are not filled, we
fallback to fill it with the default implementation (like what
socket.c provides) For instance, if zstd is enabled and zero page
checking using CPU, the interface will be filled with all the
functions zstd currently implements and since zstd doesn't implement
zero_page_detect(), we will fallback to fill zero_page_detect() with
the default multifd zero page checking implementation.

> >
> > Since I'm preparing the old series to post for some fundamental cleanups
> > around multifd, and when I'm looking around the code, I noticed that
> > _maybe_ it'll also be eaiser to apply such a series if we can cleanup more
> > things then move towards a clean base to add more accelerators.
> >
> > I agree many ideas in your this series, but I may address it slightly
> > different (e.g., I want to avoid send(), but you can consider that in the
> > fixed-ram series instead), also it'll be after some other cleanup I plan to
> > give a stab at which is not yet covered in this series.  I hope I can add
> > your "Co-developed-by" in some of the patches there.  If you haven't spend
> > more time on new version of this series, please wait 1-2 days so I can post
> > my thoughts.
>
> Sure, go ahead.
>
Fabiano Rosas Feb. 1, 2024, 1:23 p.m. UTC | #7
Hao Xiang <hao.xiang@bytedance.com> writes:

> On Wed, Jan 31, 2024 at 5:19 AM Fabiano Rosas <farosas@suse.de> wrote:
>>
>> Peter Xu <peterx@redhat.com> writes:
>>
>> > On Mon, Jan 29, 2024 at 09:51:06AM -0300, Fabiano Rosas wrote:
>> >> Peter Xu <peterx@redhat.com> writes:
>> >>
>> >> > On Mon, Jan 29, 2024 at 01:41:01AM +0000, Liu, Yuan1 wrote:
>> >> >> Because this change has an impact on the previous live migration
>> >> >> With IAA Patch, does the submission of the next version needs
>> >> >> to be submitted based on this change?
>> >> >
>> >> > I'd say hold off a little while until we're more certain on the planned
>> >> > interface changes, to avoid you rebase your code back and forth; unless
>> >> > you're pretty confident that this will be the right approach.
>> >> >
>> >> > I apologize on not having looked at any of the QAT/IAA compression / zero
>> >> > detection series posted on the list; I do plan to read them very soon too
>> >> > after Fabiano.  So I may not have a complete full picture here yet, please
>> >> > bare with me.
>> >> >
>> >> > If this series is trying to provide a base ground for all the efforts,
>> >> > it'll be great if we can thoroughly discuss here and settle an approach
>> >> > soon that will satisfy everyone.
>> >>
>> >> Just a summary if it helps:
>> >>
>> >> For compression work (IAA/QPL, QAT) the discussion is around having a
>> >> new "compression acceleration" option that enables the accelerators and
>> >> is complementary to the existing zlib compression method. We'd choose
>> >> those automatically based on availability and we'd make HW accelerated
>> >> compression produce a stream that is compatible with QEMU's zlib stream
>> >> so we could migrate between solutions.
>> >>
>> >> For zero page work and zero page acceleration (DSA), the question is how
>> >> to fit zero page detection into multifd and whether we need a new hook
>> >> multifd_ops->zero_page_detect() (or similar) to allow client code to
>> >> provide it's own zero page detection methods. My worry here is that
>> >> teaching multifd to recognize zero pages is one more coupling to the
>> >> "pages" data type. Ideallly we'd find a way to include that operation as
>> >> a prepare() responsibility and the client code would deal with it.
>> >
>> > Thanks Fabiano.
>
> Hi Fabiano,
>
> Your current refactoring assumes that compression ops and multifd
> socket ops are mutually exclusive. Both of them need to implement the
> entire MultiFDMethods interface. I think this works fine for now. Once
> we introduce multifd zero page checking and we add a new interface for
> that, we are adding a new method zero_page_detect() on the
> MultiFDMethods interface. If we do that, zero_page_detect() needs to
> be implemented in multifd_socket_ops and it also needs to be
> implemented in zlib and zstd. On top of that, if we add an accelerator
> to offload zero_page_detect(), that accelerator configuration can
> co-exist with compression or socket. That makes things quite
> complicated in my opinion.

Peter has proposed an alternate scheme. Take a look at his series. But
it basically keeps the compression as is and moves some code into the
prepare() phase:

https://lore.kernel.org/r/20240131103111.306523-1-peterx@redhat.com

> Can we create an instance of MultiFDMethods at runtime and fill each
> method depending on the configuration? If methods are not filled, we
> fallback to fill it with the default implementation (like what
> socket.c provides) For instance, if zstd is enabled and zero page
> checking using CPU, the interface will be filled with all the
> functions zstd currently implements and since zstd doesn't implement
> zero_page_detect(), we will fallback to fill zero_page_detect() with
> the default multifd zero page checking implementation.

Take a look whether incorporating zero_page_detect() in the prepare()
phase would work. We're trying to walk toward a multifd_ops model that
is not tied to the pages concept.

>> >
>> > Since I'm preparing the old series to post for some fundamental cleanups
>> > around multifd, and when I'm looking around the code, I noticed that
>> > _maybe_ it'll also be eaiser to apply such a series if we can cleanup more
>> > things then move towards a clean base to add more accelerators.
>> >
>> > I agree many ideas in your this series, but I may address it slightly
>> > different (e.g., I want to avoid send(), but you can consider that in the
>> > fixed-ram series instead), also it'll be after some other cleanup I plan to
>> > give a stab at which is not yet covered in this series.  I hope I can add
>> > your "Co-developed-by" in some of the patches there.  If you haven't spend
>> > more time on new version of this series, please wait 1-2 days so I can post
>> > my thoughts.
>>
>> Sure, go ahead.
>>