Message ID | 20230705163502.331007-7-peterx@redhat.com |
---|---|
State | New |
Series | migration: Better error handling in return path thread |
Peter Xu <peterx@redhat.com> writes:

> There're a lot of cases where we only have an errno set in last_error but
> without a detailed error description. When this happens, try to generate
> an error contains the errno as a descriptive error.
>
> This will be helpful in cases where one relies on the Error*. E.g.,
> migration state only caches Error* in MigrationState.error. With this,
> we'll display correct error messages in e.g. query-migrate when the error
> was only set by qemu_file_set_error().
>
> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---
>  migration/qemu-file.c | 15 ++++++++++++---
>  1 file changed, 12 insertions(+), 3 deletions(-)
>
> diff --git a/migration/qemu-file.c b/migration/qemu-file.c
> index acc282654a..419b4092e7 100644
> --- a/migration/qemu-file.c
> +++ b/migration/qemu-file.c
> @@ -156,15 +156,24 @@ void qemu_file_set_hooks(QEMUFile *f, const QEMUFileHooks *hooks)
>   *
>   * Return negative error value if there has been an error on previous
>   * operations, return 0 if no error happened.
> - * Optional, it returns Error* in errp, but it may be NULL even if return value
> - * is not 0.
>   *
> + * If errp is specified, a verbose error message will be copied over.
>   */
>  int qemu_file_get_error_obj(QEMUFile *f, Error **errp)
>  {
> +    if (!f->last_error) {
> +        return 0;
> +    }
> +
> +    /* There is an error */
>      if (errp) {
> -        *errp = f->last_error_obj ? error_copy(f->last_error_obj) : NULL;
> +        if (f->last_error_obj) {
> +            *errp = error_copy(f->last_error_obj);
> +        } else {
> +            error_setg_errno(errp, -f->last_error, "Channel error");

There are a couple of places that do:

    ret = vmstate_save(f, se, ms->vmdesc);
    if (ret) {
        qemu_file_set_error(f, ret);
        break;
    }

and vmstate_save() can return > 0 on error. This would make this message
say "Unknown error". This is minor.

But take a look at qemu_fclose(). It can return f->last_error while the
function documentation says it should return negative on error.

Should we make qemu_file_set_error() check 'ret' and always set a
negative value for f->last_error?
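The sign concern above can be illustrated with a small standalone sketch in plain libc, not QEMU code. format_channel_error() is a hypothetical stand-in for the error_setg_errno() call in the patch, and the "<message>: <strerror text>" layout is an assumption about how that message is composed. The point is only that negating a value that was already positive hands strerror() a negative number.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical stand-in for
 *     error_setg_errno(errp, -f->last_error, "Channel error");
 * Assumption: the resulting text looks like "<msg>: <strerror(errno)>".
 */
void format_channel_error(char *buf, size_t len, int last_error)
{
    assert(buf != NULL && len > 0);
    /* Same negation as the patch: only correct if last_error is negative. */
    snprintf(buf, len, "Channel error: %s", strerror(-last_error));
}

/* Print the message produced for one stored last_error value. */
void show_channel_error(int last_error)
{
    char buf[128];

    format_channel_error(buf, sizeof(buf), last_error);
    printf("last_error=%d -> %s\n", last_error, buf);
}
```

Calling show_channel_error(-5) should print the text for errno 5, while show_channel_error(1) passes -1 to strerror(), which is where the "Unknown error" wording mentioned above would come from on common libc implementations.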
On Wed, Jul 05, 2023 at 06:54:37PM -0300, Fabiano Rosas wrote:
> Peter Xu <peterx@redhat.com> writes:
>
> > There're a lot of cases where we only have an errno set in last_error but
> > without a detailed error description. When this happens, try to generate
> > an error contains the errno as a descriptive error.
> >
> > This will be helpful in cases where one relies on the Error*. E.g.,
> > migration state only caches Error* in MigrationState.error. With this,
> > we'll display correct error messages in e.g. query-migrate when the error
> > was only set by qemu_file_set_error().
> >
> > Signed-off-by: Peter Xu <peterx@redhat.com>
> > ---
> >  migration/qemu-file.c | 15 ++++++++++++---
> >  1 file changed, 12 insertions(+), 3 deletions(-)
> >
> > diff --git a/migration/qemu-file.c b/migration/qemu-file.c
> > index acc282654a..419b4092e7 100644
> > --- a/migration/qemu-file.c
> > +++ b/migration/qemu-file.c
> > @@ -156,15 +156,24 @@ void qemu_file_set_hooks(QEMUFile *f, const QEMUFileHooks *hooks)
> >   *
> >   * Return negative error value if there has been an error on previous
> >   * operations, return 0 if no error happened.
> > - * Optional, it returns Error* in errp, but it may be NULL even if return value
> > - * is not 0.
> >   *
> > + * If errp is specified, a verbose error message will be copied over.
> >   */
> >  int qemu_file_get_error_obj(QEMUFile *f, Error **errp)
> >  {
> > +    if (!f->last_error) {
> > +        return 0;
> > +    }
> > +
> > +    /* There is an error */
> >      if (errp) {
> > -        *errp = f->last_error_obj ? error_copy(f->last_error_obj) : NULL;
> > +        if (f->last_error_obj) {
> > +            *errp = error_copy(f->last_error_obj);
> > +        } else {
> > +            error_setg_errno(errp, -f->last_error, "Channel error");
>
> There are a couple of places that do:
>
>     ret = vmstate_save(f, se, ms->vmdesc);
>     if (ret) {
>         qemu_file_set_error(f, ret);
>         break;
>     }
>
> and vmstate_save() can return > 0 on error. This would make this message
> say "Unknown error". This is minor.
>
> But take a look at qemu_fclose(). It can return f->last_error while the
> function documentation says it should return negative on error.
>
> Should we make qemu_file_set_error() check 'ret' and always set a
> negative value for f->last_error?

Yeah, maybe we can add a sanity check, but logically it's better we just
fix vmstate_save() to make sure it always returns a <0 error.

It seems to me there're so many hooks in vmstate_save_state_v() that it can
return random things. What's the one you spot? If it's an obvious issue
we can fix them.
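For illustration only, the sanity check floated in this exchange could look roughly like the sketch below. DemoFile and demo_file_set_error() are hypothetical stand-ins, not the QEMU types or functions, and recording only the first error is an assumption of the sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for QEMUFile; only the field discussed here. */
typedef struct DemoFile {
    int last_error;   /* 0 = no error, otherwise meant to be negative */
} DemoFile;

/*
 * Sketch of a setter that normalizes the sign: whatever the caller
 * passes, a non-zero value is stored as a negative number.
 * Assumption: only the first error is recorded, later ones are ignored.
 */
void demo_file_set_error(DemoFile *f, int ret)
{
    assert(f != NULL);
    if (f->last_error == 0 && ret != 0) {
        f->last_error = ret > 0 ? -ret : ret;
    }
}
```

A check like this repairs only the sign. A return value of 1 that was never an errno would be stored as -1 and still map to an unrelated message, which fits the preference stated above for fixing the callers instead.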
Peter Xu <peterx@redhat.com> writes:

> On Wed, Jul 05, 2023 at 06:54:37PM -0300, Fabiano Rosas wrote:
>> Peter Xu <peterx@redhat.com> writes:
>>
>> > There're a lot of cases where we only have an errno set in last_error but
>> > without a detailed error description. When this happens, try to generate
>> > an error contains the errno as a descriptive error.
>> >
>> > This will be helpful in cases where one relies on the Error*. E.g.,
>> > migration state only caches Error* in MigrationState.error. With this,
>> > we'll display correct error messages in e.g. query-migrate when the error
>> > was only set by qemu_file_set_error().
>> >
>> > Signed-off-by: Peter Xu <peterx@redhat.com>
>> > ---
>> >  migration/qemu-file.c | 15 ++++++++++++---
>> >  1 file changed, 12 insertions(+), 3 deletions(-)
>> >
>> > diff --git a/migration/qemu-file.c b/migration/qemu-file.c
>> > index acc282654a..419b4092e7 100644
>> > --- a/migration/qemu-file.c
>> > +++ b/migration/qemu-file.c
>> > @@ -156,15 +156,24 @@ void qemu_file_set_hooks(QEMUFile *f, const QEMUFileHooks *hooks)
>> >   *
>> >   * Return negative error value if there has been an error on previous
>> >   * operations, return 0 if no error happened.
>> > - * Optional, it returns Error* in errp, but it may be NULL even if return value
>> > - * is not 0.
>> >   *
>> > + * If errp is specified, a verbose error message will be copied over.
>> >   */
>> >  int qemu_file_get_error_obj(QEMUFile *f, Error **errp)
>> >  {
>> > +    if (!f->last_error) {
>> > +        return 0;
>> > +    }
>> > +
>> > +    /* There is an error */
>> >      if (errp) {
>> > -        *errp = f->last_error_obj ? error_copy(f->last_error_obj) : NULL;
>> > +        if (f->last_error_obj) {
>> > +            *errp = error_copy(f->last_error_obj);
>> > +        } else {
>> > +            error_setg_errno(errp, -f->last_error, "Channel error");
>>
>> There are a couple of places that do:
>>
>>     ret = vmstate_save(f, se, ms->vmdesc);
>>     if (ret) {
>>         qemu_file_set_error(f, ret);
>>         break;
>>     }
>>
>> and vmstate_save() can return > 0 on error. This would make this message
>> say "Unknown error". This is minor.
>>
>> But take a look at qemu_fclose(). It can return f->last_error while the
>> function documentation says it should return negative on error.
>>
>> Should we make qemu_file_set_error() check 'ret' and always set a
>> negative value for f->last_error?
>
> Yeah, maybe we can add a sanity check, but logically it's better we just
> fix vmstate_save() to make sure it always returns a <0 error.
>
> It seems to me there're so many hooks in vmstate_save_state_v() that it can
> return random things. What's the one you spot? If it's an obvious issue
> we can fix them.

I see at least:

    ret = field->info->put(f, curr_elem, size, field, vmdesc_loop);

with put_power() from target/arm/machine.c returning 1.

Since vmstate_save_state_v() is quite involved I don't think we should
block this series because of it. I can do a closer audit and send a
separate patch with it.

So:

Reviewed-by: Fabiano Rosas <farosas@suse.de>
diff --git a/migration/qemu-file.c b/migration/qemu-file.c
index acc282654a..419b4092e7 100644
--- a/migration/qemu-file.c
+++ b/migration/qemu-file.c
@@ -156,15 +156,24 @@ void qemu_file_set_hooks(QEMUFile *f, const QEMUFileHooks *hooks)
  *
  * Return negative error value if there has been an error on previous
  * operations, return 0 if no error happened.
- * Optional, it returns Error* in errp, but it may be NULL even if return value
- * is not 0.
  *
+ * If errp is specified, a verbose error message will be copied over.
  */
 int qemu_file_get_error_obj(QEMUFile *f, Error **errp)
 {
+    if (!f->last_error) {
+        return 0;
+    }
+
+    /* There is an error */
     if (errp) {
-        *errp = f->last_error_obj ? error_copy(f->last_error_obj) : NULL;
+        if (f->last_error_obj) {
+            *errp = error_copy(f->last_error_obj);
+        } else {
+            error_setg_errno(errp, -f->last_error, "Channel error");
+        }
     }
+
     return f->last_error;
 }
There're a lot of cases where we only have an errno set in last_error but
without a detailed error description. When this happens, try to generate
an error contains the errno as a descriptive error.

This will be helpful in cases where one relies on the Error*. E.g.,
migration state only caches Error* in MigrationState.error. With this,
we'll display correct error messages in e.g. query-migrate when the error
was only set by qemu_file_set_error().

Signed-off-by: Peter Xu <peterx@redhat.com>
---
 migration/qemu-file.c | 15 ++++++++++++---
 1 file changed, 12 insertions(+), 3 deletions(-)
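The behaviour this commit message describes can be modelled outside QEMU with a short sketch. DemoFile, DemoError and demo_file_get_error_obj() are hypothetical stand-ins for QEMUFile, Error and qemu_file_get_error_obj(); the struct copy stands in for error_copy(), and the snprintf() call for error_setg_errno(), whose exact message layout is assumed here.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for QEMU's Error and QEMUFile types. */
typedef struct DemoError {
    char msg[128];
} DemoError;

typedef struct DemoFile {
    int last_error;             /* 0, or a negative errno value */
    DemoError *last_error_obj;  /* detailed error, may be NULL */
} DemoFile;

/*
 * Model of the patched getter: return 0 when there is no error,
 * otherwise always hand back a description if errp is given.
 * The caller owns *errp and must free() it.
 */
int demo_file_get_error_obj(DemoFile *f, DemoError **errp)
{
    assert(f != NULL);
    if (!f->last_error) {
        return 0;               /* no error: *errp is left untouched */
    }

    if (errp) {
        DemoError *e = malloc(sizeof(*e));

        if (!e) {
            abort();
        }
        if (f->last_error_obj) {
            *e = *f->last_error_obj;    /* copy the detailed error */
        } else {
            /* Only an errno is known: build a description from it. */
            snprintf(e->msg, sizeof(e->msg), "Channel error: %s",
                     strerror(-f->last_error));
        }
        *errp = e;
    }

    return f->last_error;
}
```

In this model a caller that passes a non-NULL errp gets a message whenever the return value is non-zero, which is the property the migration code relies on when it caches only the Error* for query-migrate.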