diff mbox series

[v2,23/53] migration/rdma: Fix QEMUFileHooks method return values

Message ID 20230928132019.2544702-24-armbru@redhat.com
State New
Headers show
Series migration/rdma: Error handling fixes | expand

Commit Message

Markus Armbruster Sept. 28, 2023, 1:19 p.m. UTC
The QEMUFileHooks methods don't come with a written contract.  Digging
through the code calling them, we find:

* save_page():

  Negative values RAM_SAVE_CONTROL_DELAYED and
  RAM_SAVE_CONTROL_NOT_SUPP are special.  Any other negative value is
  an unspecified error.

  qemu_rdma_save_page() returns -EIO or rdma->error_state on error.  I
  believe the latter is always negative.  Nothing stops either of them
  to clash with the special values, though.  Feels unlikely, but fix
  it anyway to return only the special values and -1.

* before_ram_iterate(), after_ram_iterate():

  Negative value means error.  qemu_rdma_registration_start() and
  qemu_rdma_registration_stop() comply as far as I can tell.  Make
  them comply *obviously*, by returning -1 on error.

* hook_ram_load:

  Negative value means error.  rdma_load_hook() already returns -1 on
  error.  Leave it alone.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
---
 migration/rdma.c | 79 +++++++++++++++++++++++-------------------------
 1 file changed, 37 insertions(+), 42 deletions(-)

Comments

Juan Quintela Oct. 4, 2023, 3:28 p.m. UTC | #1
Markus Armbruster <armbru@redhat.com> wrote:
> The QEMUFileHooks methods don't come with a written contract.  Digging
> through the code calling them, we find:
>
> * save_page():
>
>   Negative values RAM_SAVE_CONTROL_DELAYED and
>   RAM_SAVE_CONTROL_NOT_SUPP are special.  Any other negative value is
>   an unspecified error.
>
>   qemu_rdma_save_page() returns -EIO or rdma->error_state on error.  I
>   believe the latter is always negative.  Nothing stops either of them
>   to clash with the special values, though.  Feels unlikely, but fix
>   it anyway to return only the special values and -1.
>
> * before_ram_iterate(), after_ram_iterate():
>
>   Negative value means error.  qemu_rdma_registration_start() and
>   qemu_rdma_registration_stop() comply as far as I can tell.  Make
>   them comply *obviously*, by returning -1 on error.
>
> * hook_ram_load:
>
>   Negative value means error.  rdma_load_hook() already returns -1 on
>   error.  Leave it alone.
>
> Signed-off-by: Markus Armbruster <armbru@redhat.com>
> Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>

I "remove" QEMUFileHooks on this series:

https://lists.gnu.org/archive/html/qemu-devel/2023-10/msg01037.html

Do you mean waiting for the series to land before adding this one?

As that hooks only get rdma use, I just changed them to be normal
functions, no hooks.

And yes, it was not fun.  I know you have feel the "pleasure" of hacking
this file O:-)

Thanks, Juan.
Juan Quintela Oct. 4, 2023, 4:22 p.m. UTC | #2
Markus Armbruster <armbru@redhat.com> wrote:
> The QEMUFileHooks methods don't come with a written contract.  Digging
> through the code calling them, we find:
>
> * save_page():
>
>   Negative values RAM_SAVE_CONTROL_DELAYED and
>   RAM_SAVE_CONTROL_NOT_SUPP are special.  Any other negative value is
>   an unspecified error.
>
>   qemu_rdma_save_page() returns -EIO or rdma->error_state on error.  I
>   believe the latter is always negative.  Nothing stops either of them
>   to clash with the special values, though.  Feels unlikely, but fix
>   it anyway to return only the special values and -1.
>
> * before_ram_iterate(), after_ram_iterate():
>
>   Negative value means error.  qemu_rdma_registration_start() and
>   qemu_rdma_registration_stop() comply as far as I can tell.  Make
>   them comply *obviously*, by returning -1 on error.
>
> * hook_ram_load:
>
>   Negative value means error.  rdma_load_hook() already returns -1 on
>   error.  Leave it alone.
>
> Signed-off-by: Markus Armbruster <armbru@redhat.com>
> Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>

Reviewed-by: Juan Quintela <quintela@redhat.com>

Changed idea.  Will include this and rebase mines on top.
Markus Armbruster Oct. 4, 2023, 4:37 p.m. UTC | #3
Juan Quintela <quintela@redhat.com> writes:

> Markus Armbruster <armbru@redhat.com> wrote:
>> The QEMUFileHooks methods don't come with a written contract.  Digging
>> through the code calling them, we find:
>>
>> * save_page():
>>
>>   Negative values RAM_SAVE_CONTROL_DELAYED and
>>   RAM_SAVE_CONTROL_NOT_SUPP are special.  Any other negative value is
>>   an unspecified error.
>>
>>   qemu_rdma_save_page() returns -EIO or rdma->error_state on error.  I
>>   believe the latter is always negative.  Nothing stops either of them
>>   to clash with the special values, though.  Feels unlikely, but fix
>>   it anyway to return only the special values and -1.
>>
>> * before_ram_iterate(), after_ram_iterate():
>>
>>   Negative value means error.  qemu_rdma_registration_start() and
>>   qemu_rdma_registration_stop() comply as far as I can tell.  Make
>>   them comply *obviously*, by returning -1 on error.
>>
>> * hook_ram_load:
>>
>>   Negative value means error.  rdma_load_hook() already returns -1 on
>>   error.  Leave it alone.
>>
>> Signed-off-by: Markus Armbruster <armbru@redhat.com>
>> Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
>
> Reviewed-by: Juan Quintela <quintela@redhat.com>
>
> Changed idea.  Will include this and rebase mines on top.

Thanks!
diff mbox series

Patch

diff --git a/migration/rdma.c b/migration/rdma.c
index 1ae2f87906..a58c2734e3 100644
--- a/migration/rdma.c
+++ b/migration/rdma.c
@@ -3250,12 +3250,11 @@  static size_t qemu_rdma_save_page(QEMUFile *f,
     rdma = qatomic_rcu_read(&rioc->rdmaout);
 
     if (!rdma) {
-        return -EIO;
+        return -1;
     }
 
-    ret = check_error_state(rdma);
-    if (ret) {
-        return ret;
+    if (check_error_state(rdma)) {
+        return -1;
     }
 
     qemu_fflush(f);
@@ -3321,9 +3320,10 @@  static size_t qemu_rdma_save_page(QEMUFile *f,
     }
 
     return RAM_SAVE_CONTROL_DELAYED;
+
 err:
     rdma->error_state = ret;
-    return ret;
+    return -1;
 }
 
 static void rdma_accept_incoming_migration(void *opaque);
@@ -3569,12 +3569,11 @@  static int qemu_rdma_registration_handle(QEMUFile *f)
     rdma = qatomic_rcu_read(&rioc->rdmain);
 
     if (!rdma) {
-        return -EIO;
+        return -1;
     }
 
-    ret = check_error_state(rdma);
-    if (ret) {
-        return ret;
+    if (check_error_state(rdma)) {
+        return -1;
     }
 
     local = &rdma->local_ram_blocks;
@@ -3607,7 +3606,7 @@  static int qemu_rdma_registration_handle(QEMUFile *f)
                              (unsigned int)comp->block_idx,
                              rdma->local_ram_blocks.nb_blocks);
                 ret = -EIO;
-                goto out;
+                goto err;
             }
             block = &(rdma->local_ram_blocks.block[comp->block_idx]);
 
@@ -3619,7 +3618,7 @@  static int qemu_rdma_registration_handle(QEMUFile *f)
 
         case RDMA_CONTROL_REGISTER_FINISHED:
             trace_qemu_rdma_registration_handle_finished();
-            goto out;
+            return 0;
 
         case RDMA_CONTROL_RAM_BLOCKS_REQUEST:
             trace_qemu_rdma_registration_handle_ram_blocks();
@@ -3640,7 +3639,7 @@  static int qemu_rdma_registration_handle(QEMUFile *f)
                 if (ret) {
                     error_report("rdma migration: error dest "
                                     "registering ram blocks");
-                    goto out;
+                    goto err;
                 }
             }
 
@@ -3679,7 +3678,7 @@  static int qemu_rdma_registration_handle(QEMUFile *f)
 
             if (ret < 0) {
                 error_report("rdma migration: error sending remote info");
-                goto out;
+                goto err;
             }
 
             break;
@@ -3706,7 +3705,7 @@  static int qemu_rdma_registration_handle(QEMUFile *f)
                                  (unsigned int)reg->current_index,
                                  rdma->local_ram_blocks.nb_blocks);
                     ret = -ENOENT;
-                    goto out;
+                    goto err;
                 }
                 block = &(rdma->local_ram_blocks.block[reg->current_index]);
                 if (block->is_ram_block) {
@@ -3716,7 +3715,7 @@  static int qemu_rdma_registration_handle(QEMUFile *f)
                             block->block_name, block->offset,
                             reg->key.current_addr);
                         ret = -ERANGE;
-                        goto out;
+                        goto err;
                     }
                     host_addr = (block->local_host_addr +
                                 (reg->key.current_addr - block->offset));
@@ -3732,7 +3731,7 @@  static int qemu_rdma_registration_handle(QEMUFile *f)
                             " chunk: %" PRIx64,
                             block->block_name, reg->key.chunk);
                         ret = -ERANGE;
-                        goto out;
+                        goto err;
                     }
                 }
                 chunk_start = ram_chunk_start(block, chunk);
@@ -3744,7 +3743,7 @@  static int qemu_rdma_registration_handle(QEMUFile *f)
                             chunk, chunk_start, chunk_end)) {
                     error_report("cannot get rkey");
                     ret = -EINVAL;
-                    goto out;
+                    goto err;
                 }
                 reg_result->rkey = tmp_rkey;
 
@@ -3761,7 +3760,7 @@  static int qemu_rdma_registration_handle(QEMUFile *f)
 
             if (ret < 0) {
                 error_report("Failed to send control buffer");
-                goto out;
+                goto err;
             }
             break;
         case RDMA_CONTROL_UNREGISTER_REQUEST:
@@ -3784,7 +3783,7 @@  static int qemu_rdma_registration_handle(QEMUFile *f)
                 if (ret != 0) {
                     perror("rdma unregistration chunk failed");
                     ret = -ret;
-                    goto out;
+                    goto err;
                 }
 
                 rdma->total_registrations--;
@@ -3797,24 +3796,23 @@  static int qemu_rdma_registration_handle(QEMUFile *f)
 
             if (ret < 0) {
                 error_report("Failed to send control buffer");
-                goto out;
+                goto err;
             }
             break;
         case RDMA_CONTROL_REGISTER_RESULT:
             error_report("Invalid RESULT message at dest.");
             ret = -EIO;
-            goto out;
+            goto err;
         default:
             error_report("Unknown control message %s", control_desc(head.type));
             ret = -EIO;
-            goto out;
+            goto err;
         }
     } while (1);
-out:
-    if (ret < 0) {
-        rdma->error_state = ret;
-    }
-    return ret;
+
+err:
+    rdma->error_state = ret;
+    return -1;
 }
 
 /* Destination:
@@ -3836,7 +3834,7 @@  rdma_block_notification_handle(QEMUFile *f, const char *name)
     rdma = qatomic_rcu_read(&rioc->rdmain);
 
     if (!rdma) {
-        return -EIO;
+        return -1;
     }
 
     /* Find the matching RAMBlock in our local list */
@@ -3849,7 +3847,7 @@  rdma_block_notification_handle(QEMUFile *f, const char *name)
 
     if (found == -1) {
         error_report("RAMBlock '%s' not found on destination", name);
-        return -ENOENT;
+        return -1;
     }
 
     rdma->local_ram_blocks.block[curr].src_index = rdma->next_src_index;
@@ -3879,7 +3877,6 @@  static int qemu_rdma_registration_start(QEMUFile *f,
 {
     QIOChannelRDMA *rioc = QIO_CHANNEL_RDMA(qemu_file_get_ioc(f));
     RDMAContext *rdma;
-    int ret;
 
     if (migration_in_postcopy()) {
         return 0;
@@ -3888,12 +3885,11 @@  static int qemu_rdma_registration_start(QEMUFile *f,
     RCU_READ_LOCK_GUARD();
     rdma = qatomic_rcu_read(&rioc->rdmaout);
     if (!rdma) {
-        return -EIO;
+        return -1;
     }
 
-    ret = check_error_state(rdma);
-    if (ret) {
-        return ret;
+    if (check_error_state(rdma)) {
+        return -1;
     }
 
     trace_qemu_rdma_registration_start(flags);
@@ -3922,12 +3918,11 @@  static int qemu_rdma_registration_stop(QEMUFile *f,
     RCU_READ_LOCK_GUARD();
     rdma = qatomic_rcu_read(&rioc->rdmaout);
     if (!rdma) {
-        return -EIO;
+        return -1;
     }
 
-    ret = check_error_state(rdma);
-    if (ret) {
-        return ret;
+    if (check_error_state(rdma)) {
+        return -1;
     }
 
     qemu_fflush(f);
@@ -3958,7 +3953,7 @@  static int qemu_rdma_registration_stop(QEMUFile *f,
                     qemu_rdma_reg_whole_ram_blocks : NULL);
         if (ret < 0) {
             fprintf(stderr, "receiving remote info!");
-            return ret;
+            return -1;
         }
 
         nb_dest_blocks = resp.len / sizeof(RDMADestBlock);
@@ -3981,7 +3976,7 @@  static int qemu_rdma_registration_stop(QEMUFile *f,
                     "not identical on both the source and destination.",
                     local->nb_blocks, nb_dest_blocks);
             rdma->error_state = -EINVAL;
-            return -EINVAL;
+            return -1;
         }
 
         qemu_rdma_move_header(rdma, reg_result_idx, &resp);
@@ -3997,7 +3992,7 @@  static int qemu_rdma_registration_stop(QEMUFile *f,
                         local->block[i].length,
                         rdma->dest_blocks[i].length);
                 rdma->error_state = -EINVAL;
-                return -EINVAL;
+                return -1;
             }
             local->block[i].remote_host_addr =
                     rdma->dest_blocks[i].remote_host_addr;
@@ -4017,7 +4012,7 @@  static int qemu_rdma_registration_stop(QEMUFile *f,
     return 0;
 err:
     rdma->error_state = ret;
-    return ret;
+    return -1;
 }
 
 static const QEMUFileHooks rdma_read_hooks = {