Message ID | 20230918144206.560120-17-armbru@redhat.com |
---|---|
State | New |
Headers | show |
Series | migration/rdma: Error handling fixes | expand |
Markus Armbruster <armbru@redhat.com> writes: > QIOChannelClass methods qio_channel_rdma_readv() and > qio_channel_rdma_writev() violate their method contract when > rdma->error_state is non-zero: > > 1. They return whatever is in rdma->error_state then. Only -1 will be > fine. -2 will be misinterpreted as "would block". Anything less > than -2 isn't defined in the contract. A positive value would be > misinterpreted as success, but I believe that's not actually > possible. > > 2. They neglect to set an error then. If something up the call stack > dereferences the error when failure is returned, it will crash. If > it ignores the return value and checks the error instead, it will > miss the error. > > Crap like this happens when return statements hide in macros, > especially when their uses are far away from the definition. > > I elected not to investigate how callers are impacted. > > Expand the two bad macro uses, so we can set an error and return -1. > The next commit will then get rid of the macro altogether. > > Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Fabiano Rosas <farosas@suse.de>
On 18/09/2023 22:41, Markus Armbruster wrote: > QIOChannelClass methods qio_channel_rdma_readv() and > qio_channel_rdma_writev() violate their method contract when > rdma->error_state is non-zero: > > 1. They return whatever is in rdma->error_state then. Only -1 will be > fine. -2 will be misinterpreted as "would block". Anything less > than -2 isn't defined in the contract. A positive value would be > misinterpreted as success, but I believe that's not actually > possible. > > 2. They neglect to set an error then. If something up the call stack > dereferences the error when failure is returned, it will crash. If > it ignores the return value and checks the error instead, it will > miss the error. > > Crap like this happens when return statements hide in macros, > especially when their uses are far away from the definition. > > I elected not to investigate how callers are impacted. > > Expand the two bad macro uses, so we can set an error and return -1. > The next commit will then get rid of the macro altogether. > > Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Li Zhijian <lizhijian@fujitsu.com> > --- > migration/rdma.c | 12 ++++++++++-- > 1 file changed, 10 insertions(+), 2 deletions(-) > > diff --git a/migration/rdma.c b/migration/rdma.c > index 2173cb076f..30e6dff875 100644 > --- a/migration/rdma.c > +++ b/migration/rdma.c > @@ -2761,7 +2761,11 @@ static ssize_t qio_channel_rdma_writev(QIOChannel *ioc, > return -1; > } > > - CHECK_ERROR_STATE(); > + if (rdma->error_state) { > + error_setg(errp, > + "RDMA is in an error state waiting migration to abort!"); > + return -1; > + } > > /* > * Push out any writes that > @@ -2847,7 +2851,11 @@ static ssize_t qio_channel_rdma_readv(QIOChannel *ioc, > return -1; > } > > - CHECK_ERROR_STATE(); > + if (rdma->error_state) { > + error_setg(errp, > + "RDMA is in an error state waiting migration to abort!"); > + return -1; > + } > > for (i = 0; i < niov; i++) { > size_t want = iov[i].iov_len;
diff --git a/migration/rdma.c b/migration/rdma.c index 2173cb076f..30e6dff875 100644 --- a/migration/rdma.c +++ b/migration/rdma.c @@ -2761,7 +2761,11 @@ static ssize_t qio_channel_rdma_writev(QIOChannel *ioc, return -1; } - CHECK_ERROR_STATE(); + if (rdma->error_state) { + error_setg(errp, + "RDMA is in an error state waiting migration to abort!"); + return -1; + } /* * Push out any writes that @@ -2847,7 +2851,11 @@ static ssize_t qio_channel_rdma_readv(QIOChannel *ioc, return -1; } - CHECK_ERROR_STATE(); + if (rdma->error_state) { + error_setg(errp, + "RDMA is in an error state waiting migration to abort!"); + return -1; + } for (i = 0; i < niov; i++) { size_t want = iov[i].iov_len;
QIOChannelClass methods qio_channel_rdma_readv() and qio_channel_rdma_writev() violate their method contract when rdma->error_state is non-zero: 1. They return whatever is in rdma->error_state then. Only -1 will be fine. -2 will be misinterpreted as "would block". Anything less than -2 isn't defined in the contract. A positive value would be misinterpreted as success, but I believe that's not actually possible. 2. They neglect to set an error then. If something up the call stack dereferences the error when failure is returned, it will crash. If it ignores the return value and checks the error instead, it will miss the error. Crap like this happens when return statements hide in macros, especially when their uses are far away from the definition. I elected not to investigate how callers are impacted. Expand the two bad macro uses, so we can set an error and return -1. The next commit will then get rid of the macro altogether. Signed-off-by: Markus Armbruster <armbru@redhat.com> --- migration/rdma.c | 12 ++++++++++-- 1 file changed, 10 insertions(+), 2 deletions(-)