[4/3] migration: Add tracepoints for downtime checkpoints

Message ID: 20231026190159.608025-1-peterx@redhat.com
State: New
Series: migration: Downtime tracepoints

Commit Message

Peter Xu Oct. 26, 2023, 7:01 p.m. UTC
Add tracepoints for major downtime checkpoints on both src and dst.  They
all share the same tracepoint, with a string identifying the stage.

On src, we have these checkpoints added:

  - downtime-start: right before vm stops on src
  - vm-stopped: after vm is fully stopped
  - iterable-saved: after all iterables saved (END sections)
  - non-iterable-saved: after all non-iterables saved (FULL sections)
  - downtime-end: migration fully completed

On dst, we have these checkpoints added:

  - precopy-loadvm-completed: after loadvm is all done for precopy
  - precopy-bh-*: record BH steps to resume VM for precopy
  - postcopy-bh-*: record BH steps to resume VM for postcopy

On the dst side, we don't have a good way to trace the total time consumed
by the iterable or non-iterable phases for now.  We could mark it at the
first FULL / END section received, but rather than doing that, let's just
rely on the other tracepoints added for vmstates to back up the information.

With the series up to this patch, one can enable "vmstate_downtime*" and it
will enable all the tracepoints needed for downtime measurements.
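
For example (illustrative only; this assumes the stock -trace glob syntax
and a trace backend that prints per-event timestamps, such as "log"):

  qemu-system-x86_64 -trace 'vmstate_downtime*' ...

Per-stage downtime can then be derived by subtracting the timestamps of
consecutive checkpoint events.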

Since the downtime timestamp tracepoints will cover postcopy too, drop the
loadvm_postcopy_handle_run_bh() tracepoint alongside, because it served the
same purpose, but only for postcopy.  We then have a unified prefix for all
downtime-relevant tracepoints.

Signed-off-by: Peter Xu <peterx@redhat.com>
---
 migration/migration.c  | 16 +++++++++++++++-
 migration/savevm.c     | 14 +++++++++-----
 migration/trace-events |  2 +-
 3 files changed, 25 insertions(+), 7 deletions(-)

Comments

Joao Martins Oct. 26, 2023, 7:43 p.m. UTC | #1
On 26/10/2023 20:01, Peter Xu wrote:
> Add tracepoints for major downtime checkpoints on both src and dst.  They
> all share the same tracepoint, with a string identifying the stage.
> 
> On src, we have these checkpoints added:
> 
>   - downtime-start: right before vm stops on src
>   - vm-stopped: after vm is fully stopped
>   - iterable-saved: after all iterables saved (END sections)
>   - non-iterable-saved: after all non-iterables saved (FULL sections)
>   - downtime-end: migration fully completed
> 
> On dst, we have these checkpoints added:
> 
>   - precopy-loadvm-completed: after loadvm is all done for precopy
>   - precopy-bh-*: record BH steps to resume VM for precopy
>   - postcopy-bh-*: record BH steps to resume VM for postcopy
> 
> On the dst side, we don't have a good way to trace the total time consumed
> by the iterable or non-iterable phases for now.  We could mark it at the
> first FULL / END section received, but rather than doing that, let's just
> rely on the other tracepoints added for vmstates to back up the information.
> 
> With the series up to this patch, one can enable "vmstate_downtime*" and it
> will enable all the tracepoints needed for downtime measurements.
> 
> Since the downtime timestamp tracepoints will cover postcopy too, drop the
> loadvm_postcopy_handle_run_bh() tracepoint alongside, because it served the
> same purpose, but only for postcopy.  We then have a unified prefix for all
> downtime-relevant tracepoints.
> 
> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---
>  migration/migration.c  | 16 +++++++++++++++-
>  migration/savevm.c     | 14 +++++++++-----
>  migration/trace-events |  2 +-
>  3 files changed, 25 insertions(+), 7 deletions(-)
> 
> diff --git a/migration/migration.c b/migration/migration.c
> index 9013c1b500..f1f1d2ae2b 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -103,6 +103,7 @@ static int close_return_path_on_source(MigrationState *s);
>  
>  static void migration_downtime_start(MigrationState *s)
>  {
> +    trace_vmstate_downtime_timestamp("downtime-start");
>      s->downtime_start = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
>  }
>  
> @@ -117,6 +118,8 @@ static void migration_downtime_end(MigrationState *s)
>      if (!s->downtime) {
>          s->downtime = now - s->downtime_start;
>      }
> +
> +    trace_vmstate_downtime_timestamp("downtime-end");
>  }
> 

Considering we aren't including any downtime timestamps in the tracing, is this
a way to say that the tracing tool printing timestamps is what we use to extract
downtime contribution?

It might be obvious, but perhaps should be spelled out in the commit message?
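
For example, with the "log" trace backend each event line already carries a
pid@seconds.microseconds prefix, so per-stage downtime could be read off by
subtracting consecutive checkpoint lines (the timestamps below are made up):

  12345@1698346789.123456:vmstate_downtime_timestamp downtime-start
  12345@1698346790.654321:vmstate_downtime_timestamp vm-stopped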
Peter Xu Oct. 26, 2023, 8:08 p.m. UTC | #2
On Thu, Oct 26, 2023 at 08:43:59PM +0100, Joao Martins wrote:
> Considering we aren't including any downtime timestamps in the tracing, is this
> a way to say that the tracing tool printing timestamps is what we use to extract
> downtime contribution?
> 
> It might be obvious, but perhaps should be spelled out in the commit message?

Sure, I'll state that in the commit message in a new version.
Peter Xu Oct. 26, 2023, 8:14 p.m. UTC | #3
On Thu, Oct 26, 2023 at 04:08:20PM -0400, Peter Xu wrote:
> On Thu, Oct 26, 2023 at 08:43:59PM +0100, Joao Martins wrote:
> > Considering we aren't including any downtime timestamps in the tracing, is this
> > a way to say that the tracing tool printing timestamps is what we use to extract
> > downtime contribution?
> > 
> > It might be obvious, but perhaps should be spelled out in the commit message?
> 
> Sure, I'll state that in the commit message in a new version.

After a second thought, I'll rename it to vmstate_downtime_checkpoint().

I'll wait for 1-2 more days for further review comments before a repost.

Thanks,

Patch

diff --git a/migration/migration.c b/migration/migration.c
index 9013c1b500..f1f1d2ae2b 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -103,6 +103,7 @@  static int close_return_path_on_source(MigrationState *s);
 
 static void migration_downtime_start(MigrationState *s)
 {
+    trace_vmstate_downtime_timestamp("downtime-start");
     s->downtime_start = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
 }
 
@@ -117,6 +118,8 @@  static void migration_downtime_end(MigrationState *s)
     if (!s->downtime) {
         s->downtime = now - s->downtime_start;
     }
+
+    trace_vmstate_downtime_timestamp("downtime-end");
 }
 
 static bool migration_needs_multiple_sockets(void)
@@ -151,7 +154,11 @@  static gint page_request_addr_cmp(gconstpointer ap, gconstpointer bp)
 
 int migration_stop_vm(RunState state)
 {
-    return vm_stop_force_state(state);
+    int ret = vm_stop_force_state(state);
+
+    trace_vmstate_downtime_timestamp("vm-stopped");
+
+    return ret;
 }
 
 void migration_object_init(void)
@@ -500,6 +507,8 @@  static void process_incoming_migration_bh(void *opaque)
     Error *local_err = NULL;
     MigrationIncomingState *mis = opaque;
 
+    trace_vmstate_downtime_timestamp("precopy-bh-enter");
+
     /* If capability late_block_activate is set:
      * Only fire up the block code now if we're going to restart the
      * VM, else 'cont' will do it.
@@ -525,6 +534,8 @@  static void process_incoming_migration_bh(void *opaque)
      */
     qemu_announce_self(&mis->announce_timer, migrate_announce_params());
 
+    trace_vmstate_downtime_timestamp("precopy-bh-announced");
+
     multifd_load_shutdown();
 
     dirty_bitmap_mig_before_vm_start();
@@ -542,6 +553,7 @@  static void process_incoming_migration_bh(void *opaque)
     } else {
         runstate_set(global_state_get_runstate());
     }
+    trace_vmstate_downtime_timestamp("precopy-bh-vm-started");
     /*
      * This must happen after any state changes since as soon as an external
      * observer sees this event they might start to prod at the VM assuming
@@ -576,6 +588,8 @@  process_incoming_migration_co(void *opaque)
     ret = qemu_loadvm_state(mis->from_src_file);
     mis->loadvm_co = NULL;
 
+    trace_vmstate_downtime_timestamp("precopy-loadvm-completed");
+
     ps = postcopy_state_get();
     trace_process_incoming_migration_co_end(ret, ps);
     if (ps != POSTCOPY_INCOMING_NONE) {
diff --git a/migration/savevm.c b/migration/savevm.c
index cd6d6ba493..49cbbd151c 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -1494,6 +1494,8 @@  int qemu_savevm_state_complete_precopy_iterable(QEMUFile *f, bool in_postcopy)
                                     end_ts_each - start_ts_each);
     }
 
+    trace_vmstate_downtime_timestamp("iterable-saved");
+
     return 0;
 }
 
@@ -1560,6 +1562,8 @@  int qemu_savevm_state_complete_precopy_non_iterable(QEMUFile *f,
     json_writer_free(vmdesc);
     ms->vmdesc = NULL;
 
+    trace_vmstate_downtime_timestamp("non-iterable-saved");
+
     return 0;
 }
 
@@ -2102,18 +2106,18 @@  static void loadvm_postcopy_handle_run_bh(void *opaque)
     Error *local_err = NULL;
     MigrationIncomingState *mis = opaque;
 
-    trace_loadvm_postcopy_handle_run_bh("enter");
+    trace_vmstate_downtime_timestamp("postcopy-bh-enter");
 
     /* TODO we should move all of this lot into postcopy_ram.c or a shared code
      * in migration.c
      */
     cpu_synchronize_all_post_init();
 
-    trace_loadvm_postcopy_handle_run_bh("after cpu sync");
+    trace_vmstate_downtime_timestamp("postcopy-bh-cpu-synced");
 
     qemu_announce_self(&mis->announce_timer, migrate_announce_params());
 
-    trace_loadvm_postcopy_handle_run_bh("after announce");
+    trace_vmstate_downtime_timestamp("postcopy-bh-announced");
 
     /* Make sure all file formats throw away their mutable metadata.
      * If we get an error here, just don't restart the VM yet. */
@@ -2124,7 +2128,7 @@  static void loadvm_postcopy_handle_run_bh(void *opaque)
         autostart = false;
     }
 
-    trace_loadvm_postcopy_handle_run_bh("after invalidate cache");
+    trace_vmstate_downtime_timestamp("postcopy-bh-cache-invalidated");
 
     dirty_bitmap_mig_before_vm_start();
 
@@ -2138,7 +2142,7 @@  static void loadvm_postcopy_handle_run_bh(void *opaque)
 
     qemu_bh_delete(mis->bh);
 
-    trace_loadvm_postcopy_handle_run_bh("return");
+    trace_vmstate_downtime_timestamp("postcopy-bh-vm-started");
 }
 
 /* After all discards we can start running and asking for pages */
diff --git a/migration/trace-events b/migration/trace-events
index 5820add1f3..aebdd939b7 100644
--- a/migration/trace-events
+++ b/migration/trace-events
@@ -17,7 +17,6 @@  loadvm_handle_recv_bitmap(char *s) "%s"
 loadvm_postcopy_handle_advise(void) ""
 loadvm_postcopy_handle_listen(const char *str) "%s"
 loadvm_postcopy_handle_run(void) ""
-loadvm_postcopy_handle_run_bh(const char *str) "%s"
 loadvm_postcopy_handle_resume(void) ""
 loadvm_postcopy_ram_handle_discard(void) ""
 loadvm_postcopy_ram_handle_discard_end(void) ""
@@ -50,6 +49,7 @@  vmstate_save(const char *idstr, const char *vmsd_name) "%s, %s"
 vmstate_load(const char *idstr, const char *vmsd_name) "%s, %s"
 vmstate_downtime_save(const char *type, const char *idstr, uint32_t instance_id, int64_t downtime) "type=%s idstr=%s instance_id=%d downtime=%"PRIi64
 vmstate_downtime_load(const char *type, const char *idstr, uint32_t instance_id, int64_t downtime) "type=%s idstr=%s instance_id=%d downtime=%"PRIi64
+vmstate_downtime_timestamp(const char *checkpoint) "%s"
 postcopy_pause_incoming(void) ""
 postcopy_pause_incoming_continued(void) ""
 postcopy_page_req_sync(void *host_addr) "sync page req %p"