Message ID | 20240815122747.3053871-1-peter.maydell@linaro.org |
---|---|
State | New |
Headers | show |
Series | [for-9.2] kvm: Use 'unsigned long' for request argument in functions wrapping ioctl() | expand |
Ping for code review, please? thanks -- PMM On Thu, 15 Aug 2024 at 13:27, Peter Maydell <peter.maydell@linaro.org> wrote: > > From: Johannes Stoelp <johannes.stoelp@googlemail.com> > > Change the data type of the ioctl _request_ argument from 'int' to > 'unsigned long' for the various accel/kvm functions which are > essentially wrappers around the ioctl() syscall. > > The correct type for ioctl()'s 'request' argument is confused: > * POSIX defines the request argument as 'int' > * glibc uses 'unsigned long' in the prototype in sys/ioctl.h > * the glibc info documentation uses 'int' > * the Linux manpage uses 'unsigned long' > * the Linux implementation of the syscall uses 'unsigned int' > > If we wrap ioctl() with another function which uses 'int' as the > type for the request argument, then requests with the 0x8000_0000 > bit set will be sign-extended when the 'int' is cast to > 'unsigned long' for the call to ioctl(). > > On x86_64 one such example is the KVM_IRQ_LINE_STATUS request. > Bit requests with the _IOC_READ direction bit set, will have the high > bit set. > > Fortunately the Linux Kernel truncates the upper 32bit of the request > on 64bit machines (because it uses 'unsigned int', and see also Linus > Torvalds' comments in > https://sourceware.org/bugzilla/show_bug.cgi?id=14362 ) > so this doesn't cause active problems for us. However it is more > consistent to follow the glibc ioctl() prototype when we define > functions that are essentially wrappers around ioctl(). > > This resolves a Coverity issue where it points out that in > kvm_get_xsave() we assign a value (KVM_GET_XSAVE or KVM_GET_XSAVE2) > to an 'int' variable which can't hold it without overflow. > > Resolves: Coverity CID 1547759 > Signed-off-by: Johannes Stoelp <johannes.stoelp@gmail.com> > [PMM: Rebased patch, adjusted commit message, included note about > Coverity fix, updated the type of the local var in kvm_get_xsave, > updated the comment in the KVMState struct definition] > Signed-off-by: Peter Maydell <peter.maydell@linaro.org> > --- > This is a patch that was posted back in 2021, and I reviewed it > at the time > https://lore.kernel.org/qemu-devel/20210901213426.360748-1-johannes.stoelp@gmail.com/ > but we never actually took it into the tree. I was reminded of it > by the Coverity issue, where a change to Coverity means it now > complains about the potential integer overflow when we put one > of these high-bit-set ioctls into an "int". So I thought it would > be worth dusting this off and getting it upstream. > > For more discussion of the ioctl request datatype see also the > review thread on the previous version of the patch: > https://lore.kernel.org/qemu-devel/CAFEAcA8TRQdj33Ycm=XzmuUUNApaXVgeDexfS+3Ycg6kLnpmyg@mail.gmail.com/ > > Since this doesn't actually cause any incorrect behaviour this > is obviously for-9.2 material. > --- > include/sysemu/kvm.h | 8 ++++---- > include/sysemu/kvm_int.h | 16 ++++++++++++---- > accel/kvm/kvm-all.c | 8 ++++---- > target/i386/kvm/kvm.c | 3 ++- > accel/kvm/trace-events | 8 ++++---- > 5 files changed, 26 insertions(+), 17 deletions(-) > > diff --git a/include/sysemu/kvm.h b/include/sysemu/kvm.h > index 9cf14ca3d5b..613d3f7581f 100644 > --- a/include/sysemu/kvm.h > +++ b/include/sysemu/kvm.h > @@ -235,11 +235,11 @@ static inline int kvm_update_guest_debug(CPUState *cpu, unsigned long reinject_t > > /* internal API */ > > -int kvm_ioctl(KVMState *s, int type, ...); > +int kvm_ioctl(KVMState *s, unsigned long type, ...); > > -int kvm_vm_ioctl(KVMState *s, int type, ...); > +int kvm_vm_ioctl(KVMState *s, unsigned long type, ...); > > -int kvm_vcpu_ioctl(CPUState *cpu, int type, ...); > +int kvm_vcpu_ioctl(CPUState *cpu, unsigned long type, ...); > > /** > * kvm_device_ioctl - call an ioctl on a kvm device > @@ -248,7 +248,7 @@ int kvm_vcpu_ioctl(CPUState *cpu, int type, ...); > * > * Returns: -errno on error, nonnegative on success > */ > -int kvm_device_ioctl(int fd, int type, ...); > +int kvm_device_ioctl(int fd, unsigned long type, ...); > > /** > * kvm_vm_check_attr - check for existence of a specific vm attribute > diff --git a/include/sysemu/kvm_int.h b/include/sysemu/kvm_int.h > index 1d8fb1473bd..b52e3483511 100644 > --- a/include/sysemu/kvm_int.h > +++ b/include/sysemu/kvm_int.h > @@ -122,10 +122,18 @@ struct KVMState > bool sync_mmu; > bool guest_state_protected; > uint64_t manual_dirty_log_protect; > - /* The man page (and posix) say ioctl numbers are signed int, but > - * they're not. Linux, glibc and *BSD all treat ioctl numbers as > - * unsigned, and treating them as signed here can break things */ > - unsigned irq_set_ioctl; > + /* > + * POSIX says that ioctl numbers are signed int, but in practice > + * they are not. Linux, glibc and *BSD all treat ioctl numbers as > + * unsigned, and real-world ioctl values like KVM_GET_XSAVE have > + * bit 31 set, which means that passing them via an 'int' will > + * result in sign-extension when they get converted back to the > + * 'unsigned long' which the ioctl() prototype uses. Luckily Linux > + * always treats the argument as an unsigned 32-bit int, so any > + * possible sign-extension is deliberately ignored, but for > + * consistency we keep to the same type that glibc is using. > + */ > + unsigned long irq_set_ioctl; > unsigned int sigmask_len; > GHashTable *gsimap; > #ifdef KVM_CAP_IRQ_ROUTING > diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c > index 75d11a07b2b..beb1988d12c 100644 > --- a/accel/kvm/kvm-all.c > +++ b/accel/kvm/kvm-all.c > @@ -3170,7 +3170,7 @@ int kvm_cpu_exec(CPUState *cpu) > return ret; > } > > -int kvm_ioctl(KVMState *s, int type, ...) > +int kvm_ioctl(KVMState *s, unsigned long type, ...) > { > int ret; > void *arg; > @@ -3188,7 +3188,7 @@ int kvm_ioctl(KVMState *s, int type, ...) > return ret; > } > > -int kvm_vm_ioctl(KVMState *s, int type, ...) > +int kvm_vm_ioctl(KVMState *s, unsigned long type, ...) > { > int ret; > void *arg; > @@ -3208,7 +3208,7 @@ int kvm_vm_ioctl(KVMState *s, int type, ...) > return ret; > } > > -int kvm_vcpu_ioctl(CPUState *cpu, int type, ...) > +int kvm_vcpu_ioctl(CPUState *cpu, unsigned long type, ...) > { > int ret; > void *arg; > @@ -3228,7 +3228,7 @@ int kvm_vcpu_ioctl(CPUState *cpu, int type, ...) > return ret; > } > > -int kvm_device_ioctl(int fd, int type, ...) > +int kvm_device_ioctl(int fd, unsigned long type, ...) > { > int ret; > void *arg; > diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c > index 2fa88ef1e37..ada581c5d6e 100644 > --- a/target/i386/kvm/kvm.c > +++ b/target/i386/kvm/kvm.c > @@ -4102,7 +4102,8 @@ static int kvm_get_xsave(X86CPU *cpu) > { > CPUX86State *env = &cpu->env; > void *xsave = env->xsave_buf; > - int type, ret; > + unsigned long type; > + int ret; > > type = has_xsave2 ? KVM_GET_XSAVE2 : KVM_GET_XSAVE; > ret = kvm_vcpu_ioctl(CPU(cpu), type, xsave); > diff --git a/accel/kvm/trace-events b/accel/kvm/trace-events > index 37626c1ac5d..82c65fd2ab8 100644 > --- a/accel/kvm/trace-events > +++ b/accel/kvm/trace-events > @@ -1,11 +1,11 @@ > # See docs/devel/tracing.rst for syntax documentation. > > # kvm-all.c > -kvm_ioctl(int type, void *arg) "type 0x%x, arg %p" > -kvm_vm_ioctl(int type, void *arg) "type 0x%x, arg %p" > -kvm_vcpu_ioctl(int cpu_index, int type, void *arg) "cpu_index %d, type 0x%x, arg %p" > +kvm_ioctl(unsigned long type, void *arg) "type 0x%lx, arg %p" > +kvm_vm_ioctl(unsigned long type, void *arg) "type 0x%lx, arg %p" > +kvm_vcpu_ioctl(int cpu_index, unsigned long type, void *arg) "cpu_index %d, type 0x%lx, arg %p" > kvm_run_exit(int cpu_index, uint32_t reason) "cpu_index %d, reason %d" > -kvm_device_ioctl(int fd, int type, void *arg) "dev fd %d, type 0x%x, arg %p" > +kvm_device_ioctl(int fd, unsigned long type, void *arg) "dev fd %d, type 0x%lx, arg %p" > kvm_failed_reg_get(uint64_t id, const char *msg) "Warning: Unable to retrieve ONEREG %" PRIu64 " from KVM: %s" > kvm_failed_reg_set(uint64_t id, const char *msg) "Warning: Unable to set ONEREG %" PRIu64 " to KVM: %s" > kvm_init_vcpu(int cpu_index, unsigned long arch_cpu_id) "index: %d id: %lu" > -- > 2.34.1
On Thu, Aug 15, 2024 at 01:27:47PM GMT, Peter Maydell wrote: > From: Johannes Stoelp <johannes.stoelp@googlemail.com> > > Change the data type of the ioctl _request_ argument from 'int' to > 'unsigned long' for the various accel/kvm functions which are > essentially wrappers around the ioctl() syscall. > > The correct type for ioctl()'s 'request' argument is confused: > * POSIX defines the request argument as 'int' A bit of history: POSIX defined ioctl() because of the old Solaris STREAMS interface, that ended up deprecated in Issue 7 (2004) and removed in Issue 8 (just released this year). So POSIX no longer specified a signature for ioctl(). Admittedly, it DID add a new interface posix_devctl() designed to be a "portable" replacement for what ioctl() did (by being a new interface, there is no longer a question on what type it has), with the intent that libraries will eventually implement it (perhaps as a wrapper around the real ioctl), and that one uses 'int' for the command, and replaces the varargs of ioctl with a more direct specification of pointer and length. https://pubs.opengroup.org/onlinepubs/9799919799/functions/posix_devctl.html > * glibc uses 'unsigned long' in the prototype in sys/ioctl.h > * the glibc info documentation uses 'int' Arguably, that's a bug in glibc for being self-inconsistent. > * the Linux manpage uses 'unsigned long' > * the Linux implementation of the syscall uses 'unsigned int' And there's more of the history, which didn't become apparent until 64-bit architectures became common, but where we now have fallout like this thread. > > If we wrap ioctl() with another function which uses 'int' as the > type for the request argument, then requests with the 0x8000_0000 > bit set will be sign-extended when the 'int' is cast to > 'unsigned long' for the call to ioctl(). > > On x86_64 one such example is the KVM_IRQ_LINE_STATUS request. > Bit requests with the _IOC_READ direction bit set, will have the high > bit set. > > Fortunately the Linux Kernel truncates the upper 32bit of the request > on 64bit machines (because it uses 'unsigned int', and see also Linus > Torvalds' comments in > https://sourceware.org/bugzilla/show_bug.cgi?id=14362 ) > so this doesn't cause active problems for us. However it is more > consistent to follow the glibc ioctl() prototype when we define > functions that are essentially wrappers around ioctl(). > > This resolves a Coverity issue where it points out that in > kvm_get_xsave() we assign a value (KVM_GET_XSAVE or KVM_GET_XSAVE2) > to an 'int' variable which can't hold it without overflow. For the record, despite POSIX having picked a signed type, I am totally in favor of an unsigned type in any of our uses. > > Resolves: Coverity CID 1547759 > Signed-off-by: Johannes Stoelp <johannes.stoelp@gmail.com> > [PMM: Rebased patch, adjusted commit message, included note about > Coverity fix, updated the type of the local var in kvm_get_xsave, > updated the comment in the KVMState struct definition] > Signed-off-by: Peter Maydell <peter.maydell@linaro.org> > --- > This is a patch that was posted back in 2021, and I reviewed it > at the time > https://lore.kernel.org/qemu-devel/20210901213426.360748-1-johannes.stoelp@gmail.com/ > but we never actually took it into the tree. I was reminded of it > by the Coverity issue, where a change to Coverity means it now > complains about the potential integer overflow when we put one > of these high-bit-set ioctls into an "int". So I thought it would > be worth dusting this off and getting it upstream. > > For more discussion of the ioctl request datatype see also the > review thread on the previous version of the patch: > https://lore.kernel.org/qemu-devel/CAFEAcA8TRQdj33Ycm=XzmuUUNApaXVgeDexfS+3Ycg6kLnpmyg@mail.gmail.com/ > > Since this doesn't actually cause any incorrect behaviour this > is obviously for-9.2 material. > --- > include/sysemu/kvm.h | 8 ++++---- > include/sysemu/kvm_int.h | 16 ++++++++++++---- > accel/kvm/kvm-all.c | 8 ++++---- > target/i386/kvm/kvm.c | 3 ++- > accel/kvm/trace-events | 8 ++++---- > 5 files changed, 26 insertions(+), 17 deletions(-) Reviewed-by: Eric Blake <eblake@redhat.com> > +++ b/include/sysemu/kvm_int.h > @@ -122,10 +122,18 @@ struct KVMState > bool sync_mmu; > bool guest_state_protected; > uint64_t manual_dirty_log_protect; > - /* The man page (and posix) say ioctl numbers are signed int, but > - * they're not. Linux, glibc and *BSD all treat ioctl numbers as > - * unsigned, and treating them as signed here can break things */ > - unsigned irq_set_ioctl; > + /* > + * POSIX says that ioctl numbers are signed int, but in practice Maybe s/POSIX/older POSIX/ (given that newer POSIX does not specify ioctl at all) > + * they are not. Linux, glibc and *BSD all treat ioctl numbers as > + * unsigned, and real-world ioctl values like KVM_GET_XSAVE have > + * bit 31 set, which means that passing them via an 'int' will > + * result in sign-extension when they get converted back to the > + * 'unsigned long' which the ioctl() prototype uses. Luckily Linux > + * always treats the argument as an unsigned 32-bit int, so any > + * possible sign-extension is deliberately ignored, but for > + * consistency we keep to the same type that glibc is using. > + */ > + unsigned long irq_set_ioctl; > unsigned int sigmask_len; > GHashTable *gsimap;
diff --git a/include/sysemu/kvm.h b/include/sysemu/kvm.h index 9cf14ca3d5b..613d3f7581f 100644 --- a/include/sysemu/kvm.h +++ b/include/sysemu/kvm.h @@ -235,11 +235,11 @@ static inline int kvm_update_guest_debug(CPUState *cpu, unsigned long reinject_t /* internal API */ -int kvm_ioctl(KVMState *s, int type, ...); +int kvm_ioctl(KVMState *s, unsigned long type, ...); -int kvm_vm_ioctl(KVMState *s, int type, ...); +int kvm_vm_ioctl(KVMState *s, unsigned long type, ...); -int kvm_vcpu_ioctl(CPUState *cpu, int type, ...); +int kvm_vcpu_ioctl(CPUState *cpu, unsigned long type, ...); /** * kvm_device_ioctl - call an ioctl on a kvm device @@ -248,7 +248,7 @@ int kvm_vcpu_ioctl(CPUState *cpu, int type, ...); * * Returns: -errno on error, nonnegative on success */ -int kvm_device_ioctl(int fd, int type, ...); +int kvm_device_ioctl(int fd, unsigned long type, ...); /** * kvm_vm_check_attr - check for existence of a specific vm attribute diff --git a/include/sysemu/kvm_int.h b/include/sysemu/kvm_int.h index 1d8fb1473bd..b52e3483511 100644 --- a/include/sysemu/kvm_int.h +++ b/include/sysemu/kvm_int.h @@ -122,10 +122,18 @@ struct KVMState bool sync_mmu; bool guest_state_protected; uint64_t manual_dirty_log_protect; - /* The man page (and posix) say ioctl numbers are signed int, but - * they're not. Linux, glibc and *BSD all treat ioctl numbers as - * unsigned, and treating them as signed here can break things */ - unsigned irq_set_ioctl; + /* + * POSIX says that ioctl numbers are signed int, but in practice + * they are not. Linux, glibc and *BSD all treat ioctl numbers as + * unsigned, and real-world ioctl values like KVM_GET_XSAVE have + * bit 31 set, which means that passing them via an 'int' will + * result in sign-extension when they get converted back to the + * 'unsigned long' which the ioctl() prototype uses. Luckily Linux + * always treats the argument as an unsigned 32-bit int, so any + * possible sign-extension is deliberately ignored, but for + * consistency we keep to the same type that glibc is using. + */ + unsigned long irq_set_ioctl; unsigned int sigmask_len; GHashTable *gsimap; #ifdef KVM_CAP_IRQ_ROUTING diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c index 75d11a07b2b..beb1988d12c 100644 --- a/accel/kvm/kvm-all.c +++ b/accel/kvm/kvm-all.c @@ -3170,7 +3170,7 @@ int kvm_cpu_exec(CPUState *cpu) return ret; } -int kvm_ioctl(KVMState *s, int type, ...) +int kvm_ioctl(KVMState *s, unsigned long type, ...) { int ret; void *arg; @@ -3188,7 +3188,7 @@ int kvm_ioctl(KVMState *s, int type, ...) return ret; } -int kvm_vm_ioctl(KVMState *s, int type, ...) +int kvm_vm_ioctl(KVMState *s, unsigned long type, ...) { int ret; void *arg; @@ -3208,7 +3208,7 @@ int kvm_vm_ioctl(KVMState *s, int type, ...) return ret; } -int kvm_vcpu_ioctl(CPUState *cpu, int type, ...) +int kvm_vcpu_ioctl(CPUState *cpu, unsigned long type, ...) { int ret; void *arg; @@ -3228,7 +3228,7 @@ int kvm_vcpu_ioctl(CPUState *cpu, int type, ...) return ret; } -int kvm_device_ioctl(int fd, int type, ...) +int kvm_device_ioctl(int fd, unsigned long type, ...) { int ret; void *arg; diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c index 2fa88ef1e37..ada581c5d6e 100644 --- a/target/i386/kvm/kvm.c +++ b/target/i386/kvm/kvm.c @@ -4102,7 +4102,8 @@ static int kvm_get_xsave(X86CPU *cpu) { CPUX86State *env = &cpu->env; void *xsave = env->xsave_buf; - int type, ret; + unsigned long type; + int ret; type = has_xsave2 ? KVM_GET_XSAVE2 : KVM_GET_XSAVE; ret = kvm_vcpu_ioctl(CPU(cpu), type, xsave); diff --git a/accel/kvm/trace-events b/accel/kvm/trace-events index 37626c1ac5d..82c65fd2ab8 100644 --- a/accel/kvm/trace-events +++ b/accel/kvm/trace-events @@ -1,11 +1,11 @@ # See docs/devel/tracing.rst for syntax documentation. # kvm-all.c -kvm_ioctl(int type, void *arg) "type 0x%x, arg %p" -kvm_vm_ioctl(int type, void *arg) "type 0x%x, arg %p" -kvm_vcpu_ioctl(int cpu_index, int type, void *arg) "cpu_index %d, type 0x%x, arg %p" +kvm_ioctl(unsigned long type, void *arg) "type 0x%lx, arg %p" +kvm_vm_ioctl(unsigned long type, void *arg) "type 0x%lx, arg %p" +kvm_vcpu_ioctl(int cpu_index, unsigned long type, void *arg) "cpu_index %d, type 0x%lx, arg %p" kvm_run_exit(int cpu_index, uint32_t reason) "cpu_index %d, reason %d" -kvm_device_ioctl(int fd, int type, void *arg) "dev fd %d, type 0x%x, arg %p" +kvm_device_ioctl(int fd, unsigned long type, void *arg) "dev fd %d, type 0x%lx, arg %p" kvm_failed_reg_get(uint64_t id, const char *msg) "Warning: Unable to retrieve ONEREG %" PRIu64 " from KVM: %s" kvm_failed_reg_set(uint64_t id, const char *msg) "Warning: Unable to set ONEREG %" PRIu64 " to KVM: %s" kvm_init_vcpu(int cpu_index, unsigned long arch_cpu_id) "index: %d id: %lu"