Message ID | 20230814163135.187882-5-npiggin@gmail.com |
---|---|
State | New |
Headers | show |
Series | Assorted replay patches | expand |
On 14.08.2023 19:31, Nicholas Piggin wrote: > record makes an initial snapshot when the machine is created, to enable > reverse-debugging. Often the issue being debugged appears near the end of > the trace, so it is important for performance to keep snapshots close to > the end. > > This implements a periodic snapshot mode that keeps a rolling set of > recent snapshots. > > Arguably this should be done by the debugger or a program that talks to > QMP, but for setting up simple scenarios and tests, it is convenient to > have this feature. > > Signed-off-by: Nicholas Piggin <npiggin@gmail.com> > --- > docs/system/replay.rst | 5 ++++ > include/sysemu/replay.h | 11 ++++++++ > qemu-options.hx | 9 +++++-- > replay/replay-snapshot.c | 57 ++++++++++++++++++++++++++++++++++++++++ > replay/replay.c | 25 ++++++++++++++++++ > softmmu/vl.c | 9 +++++++ > 6 files changed, 114 insertions(+), 2 deletions(-) > > diff --git a/docs/system/replay.rst b/docs/system/replay.rst > index 3105327423..bef9ea4171 100644 > --- a/docs/system/replay.rst > +++ b/docs/system/replay.rst > @@ -156,6 +156,11 @@ for storing VM snapshots. Here is the example of the command line for this: > ``empty.qcow2`` drive does not connected to any virtual block device and used > for VM snapshots only. > > +``rrsnapmode`` can be used to select just an initial snapshot or periodic > +snapshots, with ``rrsnapcount`` specifying the number of periodic snapshots > +to maintain, and ``rrsnaptime`` the amount of run time in seconds between > +periodic snapshots. > + > .. _network-label: > > Network devices > diff --git a/include/sysemu/replay.h b/include/sysemu/replay.h > index 08aae5869f..a83e54afc6 100644 > --- a/include/sysemu/replay.h > +++ b/include/sysemu/replay.h > @@ -45,6 +45,17 @@ typedef enum ReplayCheckpoint ReplayCheckpoint; > > typedef struct ReplayNetState ReplayNetState; > > +enum ReplaySnapshotMode { > + REPLAY_SNAPSHOT_MODE_INITIAL, > + REPLAY_SNAPSHOT_MODE_PERIODIC, > +}; > +typedef enum ReplaySnapshotMode ReplaySnapshotMode; > + > +extern ReplaySnapshotMode replay_snapshot_mode; > + > +extern uint64_t replay_snapshot_periodic_delay; > +extern int replay_snapshot_periodic_nr_keep; > + It seems that all of these doesn't have to be exported, you can add it into the internal replay header. > /* Name of the initial VM snapshot */ > extern char *replay_snapshot > > diff --git a/qemu-options.hx b/qemu-options.hx > index 29b98c3d4c..0dce93eeab 100644 > --- a/qemu-options.hx > +++ b/qemu-options.hx > @@ -4530,13 +4530,13 @@ SRST > ERST > > DEF("icount", HAS_ARG, QEMU_OPTION_icount, \ > - "-icount [shift=N|auto][,align=on|off][,sleep=on|off][,rr=record|replay,rrfile=<filename>[,rrsnapshot=<snapshot>]]\n" \ > + "-icount [shift=N|auto][,align=on|off][,sleep=on|off][,rr=record|replay,rrfile=<filename>[,rrsnapshot=<snapshot>][,rrsnapmode=initial|periodic][,rrsnaptime=secs][,rrsnapcount=N]\n" \ > " enable virtual instruction counter with 2^N clock ticks per\n" \ > " instruction, enable aligning the host and virtual clocks\n" \ > " or disable real time cpu sleeping, and optionally enable\n" \ > " record-and-replay mode\n", QEMU_ARCH_ALL) > SRST > -``-icount [shift=N|auto][,align=on|off][,sleep=on|off][,rr=record|replay,rrfile=filename[,rrsnapshot=snapshot]]`` > +``-icount [shift=N|auto][,align=on|off][,sleep=on|off][,rr=record|replay,rrfile=filename[,rrsnapshot=snapshot][,rrsnapmode=initial|periodic][,rrsnaptime=secs][,rrsnapcount=N]]`` > Enable virtual instruction counter. The virtual cpu will execute one > instruction every 2^N ns of virtual time. If ``auto`` is specified > then the virtual cpu speed will be automatically adjusted to keep > @@ -4578,6 +4578,11 @@ SRST > name. In record mode, a new VM snapshot with the given name is created > at the start of execution recording. In replay mode this option > specifies the snapshot name used to load the initial VM state. > + ``rrsnapmode=periodic`` will additionally cause a periodic snapshot to > + be created after ``rrsnaptime=secs`` seconds of real runtime. The last > + ``rrsnapcount=N`` periodic snapshots (not including the initial) will > + be kept (0 for infinite). Periodic snapshots are useful to speed > + reverse debugging operations near the end of the recorded trace. > ERST > > DEF("watchdog-action", HAS_ARG, QEMU_OPTION_watchdog_action, \ > diff --git a/replay/replay-snapshot.c b/replay/replay-snapshot.c > index 10a7cf7992..38eac61c43 100644 > --- a/replay/replay-snapshot.c > +++ b/replay/replay-snapshot.c > @@ -69,6 +69,53 @@ void replay_vmstate_register(void) > vmstate_register(NULL, 0, &vmstate_replay, &replay_state); > } > > +static QEMUTimer *replay_snapshot_timer; > +static int replay_snapshot_count; > + > +static void replay_snapshot_timer_cb(void *opaque) > +{ > + Error *err = NULL; > + char *name; > + > + if (!replay_can_snapshot()) { > + /* Try again soon */ > + timer_mod(replay_snapshot_timer, > + qemu_clock_get_ms(QEMU_CLOCK_REALTIME) + > + replay_snapshot_periodic_delay / 10); > + return; > + } > + > + name = g_strdup_printf("%s-%d", replay_snapshot, replay_snapshot_count); > + if (!save_snapshot(name, > + true, NULL, false, NULL, &err)) { > + error_report_err(err); > + error_report("Could not create periodic snapshot " > + "for icount record, disabling"); > + g_free(name); > + return; > + } > + g_free(name); > + replay_snapshot_count++; > + > + if (replay_snapshot_periodic_nr_keep >= 1 && > + replay_snapshot_count > replay_snapshot_periodic_nr_keep) { > + int del_nr; > + > + del_nr = replay_snapshot_count - replay_snapshot_periodic_nr_keep - 1; > + name = g_strdup_printf("%s-%d", replay_snapshot, del_nr); > + if (!delete_snapshot(name, false, NULL, &err)) { > + error_report_err(err); > + error_report("Could not delete periodic snapshot " > + "for icount record"); > + } > + g_free(name); > + } > + > + timer_mod(replay_snapshot_timer, > + qemu_clock_get_ms(QEMU_CLOCK_REALTIME) + > + replay_snapshot_periodic_delay); I'm not sure that realtime is not the best choice for such of a timer. Virtual machine may be stopped or slowed down for some reason. > +} > + > void replay_vmstate_init(void) > { > Error *err = NULL; > @@ -81,6 +128,16 @@ void replay_vmstate_init(void) > error_report("Could not create snapshot for icount record"); > exit(1); > } > + > + if (replay_snapshot_mode == REPLAY_SNAPSHOT_MODE_PERIODIC) { > + replay_snapshot_timer = timer_new_ms(QEMU_CLOCK_REALTIME, > + replay_snapshot_timer_cb, > + NULL); > + timer_mod(replay_snapshot_timer, > + qemu_clock_get_ms(QEMU_CLOCK_REALTIME) + > + replay_snapshot_periodic_delay); > + } > + Please also delete placeholder comment for the snapshotting timer in replay_enable function. > } else if (replay_mode == REPLAY_MODE_PLAY) { > if (!load_snapshot(replay_snapshot, NULL, false, NULL, &err)) { > error_report_err(err); > diff --git a/replay/replay.c b/replay/replay.c > index e64f71209a..fa5930700d 100644 > --- a/replay/replay.c > +++ b/replay/replay.c > @@ -29,6 +29,10 @@ > ReplayMode replay_mode = REPLAY_MODE_NONE; > char *replay_snapshot; > > +ReplaySnapshotMode replay_snapshot_mode; > +uint64_t replay_snapshot_periodic_delay; > +int replay_snapshot_periodic_nr_keep; > + > /* Name of replay file */ > static char *replay_filename; > ReplayState replay_state; > @@ -313,6 +317,27 @@ void replay_configure(QemuOpts *opts) > } > > replay_snapshot = g_strdup(qemu_opt_get(opts, "rrsnapshot")); > + if (replay_snapshot && mode == REPLAY_MODE_RECORD) { Can such a snapshotting may be useful in replay mode? > + const char *snapmode; > + > + snapmode = qemu_opt_get(opts, "rrsnapmode"); > + if (!snapmode || !strcmp(snapmode, "initial")) { > + replay_snapshot_mode = REPLAY_SNAPSHOT_MODE_INITIAL; > + } else if (!strcmp(snapmode, "periodic")) { > + replay_snapshot_mode = REPLAY_SNAPSHOT_MODE_PERIODIC; > + } else { > + error_report("Invalid rrsnapmode option: %s", snapmode); > + exit(1); > + } > + > + /* Default 10 host seconds of machine runtime per snapshot. */ > + replay_snapshot_periodic_delay = > + qemu_opt_get_number(opts, "rrsnaptime", 10) * 1000; > + > + /* Default 2, to cover at least the last 10 host seconds of runtime. */ > + replay_snapshot_periodic_nr_keep = > + qemu_opt_get_number(opts, "rrsnapcount", 2); > + } > replay_vmstate_register(); > replay_enable(fname, mode); > > diff --git a/softmmu/vl.c b/softmmu/vl.c > index b0b96f67fa..e032eb45e8 100644 > --- a/softmmu/vl.c > +++ b/softmmu/vl.c > @@ -446,6 +446,15 @@ static QemuOptsList qemu_icount_opts = { > }, { > .name = "rrsnapshot", > .type = QEMU_OPT_STRING, > + }, { > + .name = "rrsnapmode", > + .type = QEMU_OPT_STRING, > + }, { > + .name = "rrsnaptime", > + .type = QEMU_OPT_NUMBER, > + }, { > + .name = "rrsnapcount", > + .type = QEMU_OPT_NUMBER, > }, > { /* end of list */ } > },
On Fri Aug 18, 2023 at 2:36 PM AEST, Pavel Dovgalyuk wrote: > On 14.08.2023 19:31, Nicholas Piggin wrote: > > record makes an initial snapshot when the machine is created, to enable > > reverse-debugging. Often the issue being debugged appears near the end of > > the trace, so it is important for performance to keep snapshots close to > > the end. > > > > This implements a periodic snapshot mode that keeps a rolling set of > > recent snapshots. > > > > Arguably this should be done by the debugger or a program that talks to > > QMP, but for setting up simple scenarios and tests, it is convenient to > > have this feature. I'm looking at resurrecting this to help add a bit of testing... [snip] > > +static void replay_snapshot_timer_cb(void *opaque) > > +{ > > + Error *err = NULL; > > + char *name; > > + > > + if (!replay_can_snapshot()) { > > + /* Try again soon */ > > + timer_mod(replay_snapshot_timer, > > + qemu_clock_get_ms(QEMU_CLOCK_REALTIME) + > > + replay_snapshot_periodic_delay / 10); > > + return; > > + } > > + > > + name = g_strdup_printf("%s-%d", replay_snapshot, replay_snapshot_count); > > + if (!save_snapshot(name, > > + true, NULL, false, NULL, &err)) { > > + error_report_err(err); > > + error_report("Could not create periodic snapshot " > > + "for icount record, disabling"); > > + g_free(name); > > + return; > > + } > > + g_free(name); > > + replay_snapshot_count++; > > + > > + if (replay_snapshot_periodic_nr_keep >= 1 && > > + replay_snapshot_count > replay_snapshot_periodic_nr_keep) { > > + int del_nr; > > + > > + del_nr = replay_snapshot_count - replay_snapshot_periodic_nr_keep - 1; > > + name = g_strdup_printf("%s-%d", replay_snapshot, del_nr); > > + if (!delete_snapshot(name, false, NULL, &err)) { > > + error_report_err(err); > > + error_report("Could not delete periodic snapshot " > > + "for icount record"); > > + } > > + g_free(name); > > + } > > + > > + timer_mod(replay_snapshot_timer, > > + qemu_clock_get_ms(QEMU_CLOCK_REALTIME) + > > + replay_snapshot_periodic_delay); > > I'm not sure that realtime is not the best choice for such of a timer. > Virtual machine may be stopped or slowed down for some reason. My thinking was that, say if you snapshot every 10 seconds of real time executed, then you should have an upper limit on the order of 10 seconds to perform a reverse-debug operation (so long as you don't exceed your nr_keep limit). Is it worth worrying about complexity of slowdowns and vm pausing? Maybe it could stop snapshotting on a host pause. > > +} > > + > > void replay_vmstate_init(void) > > { > > Error *err = NULL; > > @@ -81,6 +128,16 @@ void replay_vmstate_init(void) > > error_report("Could not create snapshot for icount record"); > > exit(1); > > } > > + > > + if (replay_snapshot_mode == REPLAY_SNAPSHOT_MODE_PERIODIC) { > > + replay_snapshot_timer = timer_new_ms(QEMU_CLOCK_REALTIME, > > + replay_snapshot_timer_cb, > > + NULL); > > + timer_mod(replay_snapshot_timer, > > + qemu_clock_get_ms(QEMU_CLOCK_REALTIME) + > > + replay_snapshot_periodic_delay); > > + } > > + > > Please also delete placeholder comment for the snapshotting timer > in replay_enable function. Wil do. > > } else if (replay_mode == REPLAY_MODE_PLAY) { > > if (!load_snapshot(replay_snapshot, NULL, false, NULL, &err)) { > > error_report_err(err); > > diff --git a/replay/replay.c b/replay/replay.c > > index e64f71209a..fa5930700d 100644 > > --- a/replay/replay.c > > +++ b/replay/replay.c > > @@ -29,6 +29,10 @@ > > ReplayMode replay_mode = REPLAY_MODE_NONE; > > char *replay_snapshot; > > > > +ReplaySnapshotMode replay_snapshot_mode; > > +uint64_t replay_snapshot_periodic_delay; > > +int replay_snapshot_periodic_nr_keep; > > + > > /* Name of replay file */ > > static char *replay_filename; > > ReplayState replay_state; > > @@ -313,6 +317,27 @@ void replay_configure(QemuOpts *opts) > > } > > > > replay_snapshot = g_strdup(qemu_opt_get(opts, "rrsnapshot")); > > + if (replay_snapshot && mode == REPLAY_MODE_RECORD) { > > Can such a snapshotting may be useful in replay mode? Does snapshotting do anything in replay mode? I assume if we did snapshotting based on the machine timer then we'd have to support it here so the timer events get replayed properly, at least. But I was trying to get by with minimum complexity :) Thanks, Nick
On 26.02.2024 10:36, Nicholas Piggin wrote: > On Fri Aug 18, 2023 at 2:36 PM AEST, Pavel Dovgalyuk wrote: >> On 14.08.2023 19:31, Nicholas Piggin wrote: >>> record makes an initial snapshot when the machine is created, to enable >>> reverse-debugging. Often the issue being debugged appears near the end of >>> the trace, so it is important for performance to keep snapshots close to >>> the end. >>> >>> This implements a periodic snapshot mode that keeps a rolling set of >>> recent snapshots. >>> >>> Arguably this should be done by the debugger or a program that talks to >>> QMP, but for setting up simple scenarios and tests, it is convenient to >>> have this feature. > > I'm looking at resurrecting this to help add a bit of testing... > > [snip] > >>> +static void replay_snapshot_timer_cb(void *opaque) >>> +{ >>> + Error *err = NULL; >>> + char *name; >>> + >>> + if (!replay_can_snapshot()) { >>> + /* Try again soon */ >>> + timer_mod(replay_snapshot_timer, >>> + qemu_clock_get_ms(QEMU_CLOCK_REALTIME) + >>> + replay_snapshot_periodic_delay / 10); >>> + return; >>> + } >>> + >>> + name = g_strdup_printf("%s-%d", replay_snapshot, replay_snapshot_count); >>> + if (!save_snapshot(name, >>> + true, NULL, false, NULL, &err)) { >>> + error_report_err(err); >>> + error_report("Could not create periodic snapshot " >>> + "for icount record, disabling"); >>> + g_free(name); >>> + return; >>> + } >>> + g_free(name); >>> + replay_snapshot_count++; >>> + >>> + if (replay_snapshot_periodic_nr_keep >= 1 && >>> + replay_snapshot_count > replay_snapshot_periodic_nr_keep) { >>> + int del_nr; >>> + >>> + del_nr = replay_snapshot_count - replay_snapshot_periodic_nr_keep - 1; >>> + name = g_strdup_printf("%s-%d", replay_snapshot, del_nr); >>> + if (!delete_snapshot(name, false, NULL, &err)) { >>> + error_report_err(err); >>> + error_report("Could not delete periodic snapshot " >>> + "for icount record"); >>> + } >>> + g_free(name); >>> + } >>> + >>> + timer_mod(replay_snapshot_timer, >>> + qemu_clock_get_ms(QEMU_CLOCK_REALTIME) + >>> + replay_snapshot_periodic_delay); >> >> I'm not sure that realtime is not the best choice for such of a timer. >> Virtual machine may be stopped or slowed down for some reason. > > My thinking was that, say if you snapshot every 10 seconds of real time > executed, then you should have an upper limit on the order of 10 seconds > to perform a reverse-debug operation (so long as you don't exceed your > nr_keep limit). But in some cases savevm itself could take more than 10 seconds. We'll have infinite saving in this case. That's why I propose using virtual clock with the QEMU_TIMER_ATTR_EXTERNAL attribute. > > Is it worth worrying about complexity of slowdowns and vm pausing? > Maybe it could stop snapshotting on a host pause. > >>> +} >>> + >>> void replay_vmstate_init(void) >>> { >>> Error *err = NULL; >>> @@ -81,6 +128,16 @@ void replay_vmstate_init(void) >>> error_report("Could not create snapshot for icount record"); >>> exit(1); >>> } >>> + >>> + if (replay_snapshot_mode == REPLAY_SNAPSHOT_MODE_PERIODIC) { >>> + replay_snapshot_timer = timer_new_ms(QEMU_CLOCK_REALTIME, >>> + replay_snapshot_timer_cb, >>> + NULL); >>> + timer_mod(replay_snapshot_timer, >>> + qemu_clock_get_ms(QEMU_CLOCK_REALTIME) + >>> + replay_snapshot_periodic_delay); >>> + } >>> + >> >> Please also delete placeholder comment for the snapshotting timer >> in replay_enable function. > > Wil do. > >>> } else if (replay_mode == REPLAY_MODE_PLAY) { >>> if (!load_snapshot(replay_snapshot, NULL, false, NULL, &err)) { >>> error_report_err(err); >>> diff --git a/replay/replay.c b/replay/replay.c >>> index e64f71209a..fa5930700d 100644 >>> --- a/replay/replay.c >>> +++ b/replay/replay.c >>> @@ -29,6 +29,10 @@ >>> ReplayMode replay_mode = REPLAY_MODE_NONE; >>> char *replay_snapshot; >>> >>> +ReplaySnapshotMode replay_snapshot_mode; >>> +uint64_t replay_snapshot_periodic_delay; >>> +int replay_snapshot_periodic_nr_keep; >>> + >>> /* Name of replay file */ >>> static char *replay_filename; >>> ReplayState replay_state; >>> @@ -313,6 +317,27 @@ void replay_configure(QemuOpts *opts) >>> } >>> >>> replay_snapshot = g_strdup(qemu_opt_get(opts, "rrsnapshot")); >>> + if (replay_snapshot && mode == REPLAY_MODE_RECORD) { >> >> Can such a snapshotting may be useful in replay mode? > > Does snapshotting do anything in replay mode? Yes, you can create as many snapshots as you want if 'snapshot=on' option of the disk image was not used. > I assume if we did > snapshotting based on the machine timer then we'd have to support > it here so the timer events get replayed properly, at least. But > I was trying to get by with minimum complexity :) Use QEMU_TIMER_ATTR_EXTERNAL attribute for the timer and then its events will not affect the replay. > > Thanks, > Nick
On Wed Feb 28, 2024 at 3:07 PM AEST, Pavel Dovgalyuk wrote: > On 26.02.2024 10:36, Nicholas Piggin wrote: > > On Fri Aug 18, 2023 at 2:36 PM AEST, Pavel Dovgalyuk wrote: > >> On 14.08.2023 19:31, Nicholas Piggin wrote: > >>> record makes an initial snapshot when the machine is created, to enable > >>> reverse-debugging. Often the issue being debugged appears near the end of > >>> the trace, so it is important for performance to keep snapshots close to > >>> the end. > >>> > >>> This implements a periodic snapshot mode that keeps a rolling set of > >>> recent snapshots. > >>> > >>> Arguably this should be done by the debugger or a program that talks to > >>> QMP, but for setting up simple scenarios and tests, it is convenient to > >>> have this feature. > > > > I'm looking at resurrecting this to help add a bit of testing... > > > > [snip] > > > >>> +static void replay_snapshot_timer_cb(void *opaque) > >>> +{ > >>> + Error *err = NULL; > >>> + char *name; > >>> + > >>> + if (!replay_can_snapshot()) { > >>> + /* Try again soon */ > >>> + timer_mod(replay_snapshot_timer, > >>> + qemu_clock_get_ms(QEMU_CLOCK_REALTIME) + > >>> + replay_snapshot_periodic_delay / 10); > >>> + return; > >>> + } > >>> + > >>> + name = g_strdup_printf("%s-%d", replay_snapshot, replay_snapshot_count); > >>> + if (!save_snapshot(name, > >>> + true, NULL, false, NULL, &err)) { > >>> + error_report_err(err); > >>> + error_report("Could not create periodic snapshot " > >>> + "for icount record, disabling"); > >>> + g_free(name); > >>> + return; > >>> + } > >>> + g_free(name); > >>> + replay_snapshot_count++; > >>> + > >>> + if (replay_snapshot_periodic_nr_keep >= 1 && > >>> + replay_snapshot_count > replay_snapshot_periodic_nr_keep) { > >>> + int del_nr; > >>> + > >>> + del_nr = replay_snapshot_count - replay_snapshot_periodic_nr_keep - 1; > >>> + name = g_strdup_printf("%s-%d", replay_snapshot, del_nr); > >>> + if (!delete_snapshot(name, false, NULL, &err)) { > >>> + error_report_err(err); > >>> + error_report("Could not delete periodic snapshot " > >>> + "for icount record"); > >>> + } > >>> + g_free(name); > >>> + } > >>> + > >>> + timer_mod(replay_snapshot_timer, > >>> + qemu_clock_get_ms(QEMU_CLOCK_REALTIME) + > >>> + replay_snapshot_periodic_delay); > >> > >> I'm not sure that realtime is not the best choice for such of a timer. > >> Virtual machine may be stopped or slowed down for some reason. > > > > My thinking was that, say if you snapshot every 10 seconds of real time > > executed, then you should have an upper limit on the order of 10 seconds > > to perform a reverse-debug operation (so long as you don't exceed your > > nr_keep limit). > > But in some cases savevm itself could take more than 10 seconds. > We'll have infinite saving in this case. That's why I propose using > virtual clock with the QEMU_TIMER_ATTR_EXTERNAL attribute. It shouldn't do that because it sets the next timer after end of snapshot time + delay. With virtual time, won't an idle machine just keep warping the time to the next snapshot save and that *could* cause infinite snapshotting. I worry that icount time is difficult to predict and understand. Both have issues but I think real time is better. > > > > Is it worth worrying about complexity of slowdowns and vm pausing? > > Maybe it could stop snapshotting on a host pause. > > > >>> +} > >>> + > >>> void replay_vmstate_init(void) > >>> { > >>> Error *err = NULL; > >>> @@ -81,6 +128,16 @@ void replay_vmstate_init(void) > >>> error_report("Could not create snapshot for icount record"); > >>> exit(1); > >>> } > >>> + > >>> + if (replay_snapshot_mode == REPLAY_SNAPSHOT_MODE_PERIODIC) { > >>> + replay_snapshot_timer = timer_new_ms(QEMU_CLOCK_REALTIME, > >>> + replay_snapshot_timer_cb, > >>> + NULL); > >>> + timer_mod(replay_snapshot_timer, > >>> + qemu_clock_get_ms(QEMU_CLOCK_REALTIME) + > >>> + replay_snapshot_periodic_delay); > >>> + } > >>> + > >> > >> Please also delete placeholder comment for the snapshotting timer > >> in replay_enable function. > > > > Wil do. > > > >>> } else if (replay_mode == REPLAY_MODE_PLAY) { > >>> if (!load_snapshot(replay_snapshot, NULL, false, NULL, &err)) { > >>> error_report_err(err); > >>> diff --git a/replay/replay.c b/replay/replay.c > >>> index e64f71209a..fa5930700d 100644 > >>> --- a/replay/replay.c > >>> +++ b/replay/replay.c > >>> @@ -29,6 +29,10 @@ > >>> ReplayMode replay_mode = REPLAY_MODE_NONE; > >>> char *replay_snapshot; > >>> > >>> +ReplaySnapshotMode replay_snapshot_mode; > >>> +uint64_t replay_snapshot_periodic_delay; > >>> +int replay_snapshot_periodic_nr_keep; > >>> + > >>> /* Name of replay file */ > >>> static char *replay_filename; > >>> ReplayState replay_state; > >>> @@ -313,6 +317,27 @@ void replay_configure(QemuOpts *opts) > >>> } > >>> > >>> replay_snapshot = g_strdup(qemu_opt_get(opts, "rrsnapshot")); > >>> + if (replay_snapshot && mode == REPLAY_MODE_RECORD) { > >> > >> Can such a snapshotting may be useful in replay mode? > > > > Does snapshotting do anything in replay mode? > > Yes, you can create as many snapshots as you want if 'snapshot=on' > option of the disk image was not used. Oh, so it will make real snapshots? I just assumed it would "replay" the motions of the snapshot. In that case actually that could be useful if your recording didn't have snapshots and you want to debug it. I'll try it. Thanks, Nick
On Wed Feb 28, 2024 at 3:07 PM AEST, Pavel Dovgalyuk wrote: > On 26.02.2024 10:36, Nicholas Piggin wrote: > >>> @@ -313,6 +317,27 @@ void replay_configure(QemuOpts *opts) > >>> } > >>> > >>> replay_snapshot = g_strdup(qemu_opt_get(opts, "rrsnapshot")); > >>> + if (replay_snapshot && mode == REPLAY_MODE_RECORD) { > >> > >> Can such a snapshotting may be useful in replay mode? > > > > Does snapshotting do anything in replay mode? > > Yes, you can create as many snapshots as you want if 'snapshot=on' > option of the disk image was not used. Actually something goes wrong and the machine crashes when I try this, so it's not so simple. It's a nice idea to be able to do because it really makes reverse debugging much faster, but no time to look into it further at the moment. So I will resubmit basically as is, which works for me nicely. Tweaks can be made or other timer behaviour can be added later, it's not a hard API with precisely defined semantics. So it would be nice to get the basic mechanisms merged and tests added. Thanks, Nick
diff --git a/docs/system/replay.rst b/docs/system/replay.rst index 3105327423..bef9ea4171 100644 --- a/docs/system/replay.rst +++ b/docs/system/replay.rst @@ -156,6 +156,11 @@ for storing VM snapshots. Here is the example of the command line for this: ``empty.qcow2`` drive does not connected to any virtual block device and used for VM snapshots only. +``rrsnapmode`` can be used to select just an initial snapshot or periodic +snapshots, with ``rrsnapcount`` specifying the number of periodic snapshots +to maintain, and ``rrsnaptime`` the amount of run time in seconds between +periodic snapshots. + .. _network-label: Network devices diff --git a/include/sysemu/replay.h b/include/sysemu/replay.h index 08aae5869f..a83e54afc6 100644 --- a/include/sysemu/replay.h +++ b/include/sysemu/replay.h @@ -45,6 +45,17 @@ typedef enum ReplayCheckpoint ReplayCheckpoint; typedef struct ReplayNetState ReplayNetState; +enum ReplaySnapshotMode { + REPLAY_SNAPSHOT_MODE_INITIAL, + REPLAY_SNAPSHOT_MODE_PERIODIC, +}; +typedef enum ReplaySnapshotMode ReplaySnapshotMode; + +extern ReplaySnapshotMode replay_snapshot_mode; + +extern uint64_t replay_snapshot_periodic_delay; +extern int replay_snapshot_periodic_nr_keep; + /* Name of the initial VM snapshot */ extern char *replay_snapshot; diff --git a/qemu-options.hx b/qemu-options.hx index 29b98c3d4c..0dce93eeab 100644 --- a/qemu-options.hx +++ b/qemu-options.hx @@ -4530,13 +4530,13 @@ SRST ERST DEF("icount", HAS_ARG, QEMU_OPTION_icount, \ - "-icount [shift=N|auto][,align=on|off][,sleep=on|off][,rr=record|replay,rrfile=<filename>[,rrsnapshot=<snapshot>]]\n" \ + "-icount [shift=N|auto][,align=on|off][,sleep=on|off][,rr=record|replay,rrfile=<filename>[,rrsnapshot=<snapshot>][,rrsnapmode=initial|periodic][,rrsnaptime=secs][,rrsnapcount=N]\n" \ " enable virtual instruction counter with 2^N clock ticks per\n" \ " instruction, enable aligning the host and virtual clocks\n" \ " or disable real time cpu sleeping, and optionally enable\n" \ " record-and-replay mode\n", QEMU_ARCH_ALL) SRST -``-icount [shift=N|auto][,align=on|off][,sleep=on|off][,rr=record|replay,rrfile=filename[,rrsnapshot=snapshot]]`` +``-icount [shift=N|auto][,align=on|off][,sleep=on|off][,rr=record|replay,rrfile=filename[,rrsnapshot=snapshot][,rrsnapmode=initial|periodic][,rrsnaptime=secs][,rrsnapcount=N]]`` Enable virtual instruction counter. The virtual cpu will execute one instruction every 2^N ns of virtual time. If ``auto`` is specified then the virtual cpu speed will be automatically adjusted to keep @@ -4578,6 +4578,11 @@ SRST name. In record mode, a new VM snapshot with the given name is created at the start of execution recording. In replay mode this option specifies the snapshot name used to load the initial VM state. + ``rrsnapmode=periodic`` will additionally cause a periodic snapshot to + be created after ``rrsnaptime=secs`` seconds of real runtime. The last + ``rrsnapcount=N`` periodic snapshots (not including the initial) will + be kept (0 for infinite). Periodic snapshots are useful to speed + reverse debugging operations near the end of the recorded trace. ERST DEF("watchdog-action", HAS_ARG, QEMU_OPTION_watchdog_action, \ diff --git a/replay/replay-snapshot.c b/replay/replay-snapshot.c index 10a7cf7992..38eac61c43 100644 --- a/replay/replay-snapshot.c +++ b/replay/replay-snapshot.c @@ -69,6 +69,53 @@ void replay_vmstate_register(void) vmstate_register(NULL, 0, &vmstate_replay, &replay_state); } +static QEMUTimer *replay_snapshot_timer; +static int replay_snapshot_count; + +static void replay_snapshot_timer_cb(void *opaque) +{ + Error *err = NULL; + char *name; + + if (!replay_can_snapshot()) { + /* Try again soon */ + timer_mod(replay_snapshot_timer, + qemu_clock_get_ms(QEMU_CLOCK_REALTIME) + + replay_snapshot_periodic_delay / 10); + return; + } + + name = g_strdup_printf("%s-%d", replay_snapshot, replay_snapshot_count); + if (!save_snapshot(name, + true, NULL, false, NULL, &err)) { + error_report_err(err); + error_report("Could not create periodic snapshot " + "for icount record, disabling"); + g_free(name); + return; + } + g_free(name); + replay_snapshot_count++; + + if (replay_snapshot_periodic_nr_keep >= 1 && + replay_snapshot_count > replay_snapshot_periodic_nr_keep) { + int del_nr; + + del_nr = replay_snapshot_count - replay_snapshot_periodic_nr_keep - 1; + name = g_strdup_printf("%s-%d", replay_snapshot, del_nr); + if (!delete_snapshot(name, false, NULL, &err)) { + error_report_err(err); + error_report("Could not delete periodic snapshot " + "for icount record"); + } + g_free(name); + } + + timer_mod(replay_snapshot_timer, + qemu_clock_get_ms(QEMU_CLOCK_REALTIME) + + replay_snapshot_periodic_delay); +} + void replay_vmstate_init(void) { Error *err = NULL; @@ -81,6 +128,16 @@ void replay_vmstate_init(void) error_report("Could not create snapshot for icount record"); exit(1); } + + if (replay_snapshot_mode == REPLAY_SNAPSHOT_MODE_PERIODIC) { + replay_snapshot_timer = timer_new_ms(QEMU_CLOCK_REALTIME, + replay_snapshot_timer_cb, + NULL); + timer_mod(replay_snapshot_timer, + qemu_clock_get_ms(QEMU_CLOCK_REALTIME) + + replay_snapshot_periodic_delay); + } + } else if (replay_mode == REPLAY_MODE_PLAY) { if (!load_snapshot(replay_snapshot, NULL, false, NULL, &err)) { error_report_err(err); diff --git a/replay/replay.c b/replay/replay.c index e64f71209a..fa5930700d 100644 --- a/replay/replay.c +++ b/replay/replay.c @@ -29,6 +29,10 @@ ReplayMode replay_mode = REPLAY_MODE_NONE; char *replay_snapshot; +ReplaySnapshotMode replay_snapshot_mode; +uint64_t replay_snapshot_periodic_delay; +int replay_snapshot_periodic_nr_keep; + /* Name of replay file */ static char *replay_filename; ReplayState replay_state; @@ -313,6 +317,27 @@ void replay_configure(QemuOpts *opts) } replay_snapshot = g_strdup(qemu_opt_get(opts, "rrsnapshot")); + if (replay_snapshot && mode == REPLAY_MODE_RECORD) { + const char *snapmode; + + snapmode = qemu_opt_get(opts, "rrsnapmode"); + if (!snapmode || !strcmp(snapmode, "initial")) { + replay_snapshot_mode = REPLAY_SNAPSHOT_MODE_INITIAL; + } else if (!strcmp(snapmode, "periodic")) { + replay_snapshot_mode = REPLAY_SNAPSHOT_MODE_PERIODIC; + } else { + error_report("Invalid rrsnapmode option: %s", snapmode); + exit(1); + } + + /* Default 10 host seconds of machine runtime per snapshot. */ + replay_snapshot_periodic_delay = + qemu_opt_get_number(opts, "rrsnaptime", 10) * 1000; + + /* Default 2, to cover at least the last 10 host seconds of runtime. */ + replay_snapshot_periodic_nr_keep = + qemu_opt_get_number(opts, "rrsnapcount", 2); + } replay_vmstate_register(); replay_enable(fname, mode); diff --git a/softmmu/vl.c b/softmmu/vl.c index b0b96f67fa..e032eb45e8 100644 --- a/softmmu/vl.c +++ b/softmmu/vl.c @@ -446,6 +446,15 @@ static QemuOptsList qemu_icount_opts = { }, { .name = "rrsnapshot", .type = QEMU_OPT_STRING, + }, { + .name = "rrsnapmode", + .type = QEMU_OPT_STRING, + }, { + .name = "rrsnaptime", + .type = QEMU_OPT_NUMBER, + }, { + .name = "rrsnapcount", + .type = QEMU_OPT_NUMBER, }, { /* end of list */ } },
record makes an initial snapshot when the machine is created, to enable reverse-debugging. Often the issue being debugged appears near the end of the trace, so it is important for performance to keep snapshots close to the end. This implements a periodic snapshot mode that keeps a rolling set of recent snapshots. Arguably this should be done by the debugger or a program that talks to QMP, but for setting up simple scenarios and tests, it is convenient to have this feature. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> --- docs/system/replay.rst | 5 ++++ include/sysemu/replay.h | 11 ++++++++ qemu-options.hx | 9 +++++-- replay/replay-snapshot.c | 57 ++++++++++++++++++++++++++++++++++++++++ replay/replay.c | 25 ++++++++++++++++++ softmmu/vl.c | 9 +++++++ 6 files changed, 114 insertions(+), 2 deletions(-)