Message ID | 1475583448-21013-1-git-send-email-clg@kaod.org |
---|---|
State | New |
Headers | show |
On 4 October 2016 at 13:17, Cédric Le Goater <clg@kaod.org> wrote: > Some test scenarios require to access memory regions using a specific > endianness, such as a device region, but the current qtest memory > accessors are done in native endian, which means that the values are > byteswapped in qtest if the endianness of the guest and the host are > different. > > To maintain the endianness of a value, we need a new set of memory > accessor. This can be done in two ways: > > - first, convert the value to the required endianness in libqtest and > then use the memread/write routines so that qtest accesses the guest > memory without doing any supplementary byteswapping > > - an alternative method would be to handle the byte swapping on the > qtest side. For that, we would need to extend the read/write > protocol with an ending word : "native|le|be" and modify the tswap > calls accordingly under the qtest_process_command() routine. > > The result is the same and the first method is simpler. The difficulty with this patch is that it's hard to tell whether it's really required, or if this is just adding an extra layer of byteswapping that should really be done in some other location in the stack. What's the actual test case here? thanks -- PMM
On 04/10/2016 14:36, Peter Maydell wrote: > On 4 October 2016 at 13:17, Cédric Le Goater <clg@kaod.org> wrote: >> Some test scenarios require to access memory regions using a specific >> endianness, such as a device region, but the current qtest memory >> accessors are done in native endian, which means that the values are >> byteswapped in qtest if the endianness of the guest and the host are >> different. >> >> To maintain the endianness of a value, we need a new set of memory >> accessor. This can be done in two ways: >> >> - first, convert the value to the required endianness in libqtest and >> then use the memread/write routines so that qtest accesses the guest >> memory without doing any supplementary byteswapping >> >> - an alternative method would be to handle the byte swapping on the >> qtest side. For that, we would need to extend the read/write >> protocol with an ending word : "native|le|be" and modify the tswap >> calls accordingly under the qtest_process_command() routine. >> >> The result is the same and the first method is simpler. > > The difficulty with this patch is that it's hard to tell whether > it's really required, or if this is just adding an extra layer > of byteswapping that should really be done in some other location > in the stack. What's the actual test case here? > We need that to implement PCI qtest with SPAPR, as PCI is always little-endian and SPAPR is big-endian by default, see my patch http://patchwork.ozlabs.org/patch/676591/ I'm going to resend my series based on Cédric's patch. Thanks, Laurent
On 04/10/2016 14:36, Peter Maydell wrote: > > - first, convert the value to the required endianness in libqtest and > > then use the memread/write routines so that qtest accesses the guest > > memory without doing any supplementary byteswapping > > > > - an alternative method would be to handle the byte swapping on the > > qtest side. For that, we would need to extend the read/write > > protocol with an ending word : "native|le|be" and modify the tswap > > calls accordingly under the qtest_process_command() routine. > > > > The result is the same and the first method is simpler. > > The difficulty with this patch is that it's hard to tell whether > it's really required, or if this is just adding an extra layer > of byteswapping that should really be done in some other location > in the stack. What's the actual test case here? I think the first method makes sense. The second would be the wrong side. Paolo
On Tue, Oct 04, 2016 at 02:17:28PM +0200, Cédric Le Goater wrote: > Some test scenarios require to access memory regions using a specific > endianness, such as a device region, but the current qtest memory > accessors are done in native endian, which means that the values are > byteswapped in qtest if the endianness of the guest and the host are > different. > > To maintain the endianness of a value, we need a new set of memory > accessor. This can be done in two ways: > > - first, convert the value to the required endianness in libqtest and > then use the memread/write routines so that qtest accesses the guest > memory without doing any supplementary byteswapping > > - an alternative method would be to handle the byte swapping on the > qtest side. For that, we would need to extend the read/write > protocol with an ending word : "native|le|be" and modify the tswap > calls accordingly under the qtest_process_command() routine. > > The result is the same and the first method is simpler. > > Signed-off-by: Cédric Le Goater <clg@kaod.org> > --- > > tests/libqtest.c | 91 +++++++++++++++++++++++++++++++++++++++++++++++++++++++ > tests/libqtest.h | 71 ++++++++++++++++++++++++++++++++++++++++++ > 2 files changed, 162 insertions(+) > > Index: qemu-aspeed.git/tests/libqtest.h > =================================================================== > --- qemu-aspeed.git.orig/tests/libqtest.h > +++ qemu-aspeed.git/tests/libqtest.h > @@ -887,4 +887,75 @@ void qmp_fd_send(int fd, const char *fmt > QDict *qmp_fdv(int fd, const char *fmt, va_list ap); > QDict *qmp_fd(int fd, const char *fmt, ...); > > +/* > + * BE/LE read/write accessors > + */ > +uint16_t qtest_readw_be(QTestState *s, uint64_t addr); > +uint32_t qtest_readl_be(QTestState *s, uint64_t addr); > +uint64_t qtest_readq_be(QTestState *s, uint64_t addr); > + > +static inline uint16_t readw_be(uint64_t addr) > +{ > + return qtest_readw_be(global_qtest, addr); > +} > +static inline uint32_t readl_be(uint64_t addr) > +{ > + return qtest_readl_be(global_qtest, addr); > +} > +static inline uint64_t readq_be(uint64_t addr) > +{ > + return qtest_readq_be(global_qtest, addr); > +} > + > +void qtest_writew_be(QTestState *s, uint64_t addr, uint16_t value); > +void qtest_writel_be(QTestState *s, uint64_t addr, uint32_t value); > +void qtest_writeq_be(QTestState *s, uint64_t addr, uint64_t value); > + > +static inline void writew_be(uint64_t addr, uint16_t value) > +{ > + qtest_writew_be(global_qtest, addr, value); > +} > +static inline void writel_be(uint64_t addr, uint32_t value) > +{ > + qtest_writel_be(global_qtest, addr, value); > +} > +static inline void writeq_be(uint64_t addr, uint64_t value) > +{ > + qtest_writeq_be(global_qtest, addr, value); > +} > + > +uint16_t qtest_readw_le(QTestState *s, uint64_t addr); > +uint32_t qtest_readl_le(QTestState *s, uint64_t addr); > +uint64_t qtest_readq_le(QTestState *s, uint64_t addr); > + > +static inline uint16_t readw_le(uint64_t addr) > +{ > + return qtest_readw_le(global_qtest, addr); > +} > +static inline uint32_t readl_le(uint64_t addr) > +{ > + return qtest_readl_le(global_qtest, addr); > +} > +static inline uint64_t readq_le(uint64_t addr) > +{ > + return qtest_readq_le(global_qtest, addr); > +} > + > +void qtest_writew_le(QTestState *s, uint64_t addr, uint16_t value); > +void qtest_writel_le(QTestState *s, uint64_t addr, uint32_t value); > +void qtest_writeq_le(QTestState *s, uint64_t addr, uint64_t value); > + > +static inline void writew_le(uint64_t addr, uint16_t value) > +{ > + qtest_writew_le(global_qtest, addr, value); > +} > +static inline void writel_le(uint64_t addr, uint32_t value) > +{ > + qtest_writel_le(global_qtest, addr, value); > +} > +static inline void writeq_le(uint64_t addr, uint64_t value) > +{ > + qtest_writeq_le(global_qtest, addr, value); > +} > + > #endif > Index: qemu-aspeed.git/tests/libqtest.c > =================================================================== > --- qemu-aspeed.git.orig/tests/libqtest.c > +++ qemu-aspeed.git/tests/libqtest.c > @@ -15,6 +15,7 @@ > * > */ > #include "qemu/osdep.h" > +#include "qemu/bswap.h" > #include "libqtest.h" > > #include <sys/socket.h> > @@ -688,6 +689,42 @@ void qtest_writeq(QTestState *s, uint64_ > qtest_write(s, "writeq", addr, value); > } > > +void qtest_writew_be(QTestState *s, uint64_t addr, uint16_t value) > +{ > + value = cpu_to_be16(value); > + qtest_memread(s, addr, &value, sizeof(value)); Uh.. this surely needs to be qtest_memwrite(). > +} > + > +void qtest_writel_be(QTestState *s, uint64_t addr, uint32_t value) > +{ > + value = cpu_to_be32(value); > + qtest_memread(s, addr, &value, sizeof(value)); and here, > +} > + > +void qtest_writeq_be(QTestState *s, uint64_t addr, uint64_t value) > +{ > + value = cpu_to_be64(value); > + qtest_memread(s, addr, &value, sizeof(value)); here > +} > + > +void qtest_writew_le(QTestState *s, uint64_t addr, uint16_t value) > +{ > + value = cpu_to_le16(value); > + qtest_memread(s, addr, &value, sizeof(value)); here > +} > + > +void qtest_writel_le(QTestState *s, uint64_t addr, uint32_t value) > +{ > + value = cpu_to_le32(value); > + qtest_memread(s, addr, &value, sizeof(value)); here > +} > + > +void qtest_writeq_le(QTestState *s, uint64_t addr, uint64_t value) > +{ > + value = cpu_to_le64(value); > + qtest_memread(s, addr, &value, sizeof(value)); and here > +} > + > static uint64_t qtest_read(QTestState *s, const char *cmd, uint64_t addr) > { > gchar **args; > @@ -721,6 +758,60 @@ uint64_t qtest_readq(QTestState *s, uint > return qtest_read(s, "readq", addr); > } > > +uint16_t qtest_readw_be(QTestState *s, uint64_t addr) > +{ > + uint16_t value; > + > + qtest_memread(s, addr, &value, sizeof(value)); > + > + return be16_to_cpu(value); > +} > + > +uint32_t qtest_readl_be(QTestState *s, uint64_t addr) > +{ > + uint32_t value; > + > + qtest_memread(s, addr, &value, sizeof(value)); > + > + return be32_to_cpu(value); > +} > + > +uint64_t qtest_readq_be(QTestState *s, uint64_t addr) > +{ > + uint64_t value; > + > + qtest_memread(s, addr, &value, sizeof(value)); > + > + return be64_to_cpu(value); > +} > + > +uint16_t qtest_readw_le(QTestState *s, uint64_t addr) > +{ > + uint16_t value; > + > + qtest_memread(s, addr, &value, sizeof(value)); > + > + return le16_to_cpu(value); > +} > + > +uint32_t qtest_readl_le(QTestState *s, uint64_t addr) > +{ > + uint32_t value; > + > + qtest_memread(s, addr, &value, sizeof(value)); > + > + return le32_to_cpu(value); > +} > + > +uint64_t qtest_readq_le(QTestState *s, uint64_t addr) > +{ > + uint64_t value; > + > + qtest_memread(s, addr, &value, sizeof(value)); > + > + return le64_to_cpu(value); > +} > + > static int hex2nib(char ch) > { > if (ch >= '0' && ch <= '9') { >
On Tue, Oct 04, 2016 at 01:36:09PM +0100, Peter Maydell wrote: > On 4 October 2016 at 13:17, Cédric Le Goater <clg@kaod.org> wrote: > > Some test scenarios require to access memory regions using a specific > > endianness, such as a device region, but the current qtest memory > > accessors are done in native endian, which means that the values are > > byteswapped in qtest if the endianness of the guest and the host are > > different. > > > > To maintain the endianness of a value, we need a new set of memory > > accessor. This can be done in two ways: > > > > - first, convert the value to the required endianness in libqtest and > > then use the memread/write routines so that qtest accesses the guest > > memory without doing any supplementary byteswapping > > > > - an alternative method would be to handle the byte swapping on the > > qtest side. For that, we would need to extend the read/write > > protocol with an ending word : "native|le|be" and modify the tswap > > calls accordingly under the qtest_process_command() routine. > > > > The result is the same and the first method is simpler. > > The difficulty with this patch is that it's hard to tell whether > it's really required, or if this is just adding an extra layer > of byteswapping that should really be done in some other location Actually, it's neither. It's not essential for anything, but it *removes* an extra layer of byteswapping that really never should have been done in the first place. > in the stack. What's the actual test case here? The current readw, readl, etc. all work in "guest endianness". But guest endianness is not well defined - there are a number of targets which can support either. And it's doubly meaningless since it's a property of the guest cpu, which we're essentially replacing with the qtest stub anyway. Furthermore "guest endianness" isn't useful. With a tiny handful of exceptions, all peripherals have their own endianness which is known independent of the target. It makes more sense for test cases to explicitly do their accesses in the correct endianness for the device, without having to compensate for the fact that it'll be swapped into the essentially arbitrary "guest endianness" along the way. Basically, the existing byteswaps, instead of removing the need for them in the testcase code, instead mean that target-conditional byteswaps will be *required* in nearly every testcase. It's a recipe for endianness-unsafe testcases.
On 10/05/2016 01:26 AM, David Gibson wrote: > On Tue, Oct 04, 2016 at 02:17:28PM +0200, Cédric Le Goater wrote: >> Some test scenarios require to access memory regions using a specific >> endianness, such as a device region, but the current qtest memory >> accessors are done in native endian, which means that the values are >> byteswapped in qtest if the endianness of the guest and the host are >> different. >> >> To maintain the endianness of a value, we need a new set of memory >> accessor. This can be done in two ways: >> >> - first, convert the value to the required endianness in libqtest and >> then use the memread/write routines so that qtest accesses the guest >> memory without doing any supplementary byteswapping >> >> - an alternative method would be to handle the byte swapping on the >> qtest side. For that, we would need to extend the read/write >> protocol with an ending word : "native|le|be" and modify the tswap >> calls accordingly under the qtest_process_command() routine. >> >> The result is the same and the first method is simpler. >> >> Signed-off-by: Cédric Le Goater <clg@kaod.org> >> --- >> >> tests/libqtest.c | 91 +++++++++++++++++++++++++++++++++++++++++++++++++++++++ >> tests/libqtest.h | 71 ++++++++++++++++++++++++++++++++++++++++++ >> 2 files changed, 162 insertions(+) >> >> Index: qemu-aspeed.git/tests/libqtest.h >> =================================================================== >> --- qemu-aspeed.git.orig/tests/libqtest.h >> +++ qemu-aspeed.git/tests/libqtest.h >> @@ -887,4 +887,75 @@ void qmp_fd_send(int fd, const char *fmt >> QDict *qmp_fdv(int fd, const char *fmt, va_list ap); >> QDict *qmp_fd(int fd, const char *fmt, ...); >> >> +/* >> + * BE/LE read/write accessors >> + */ >> +uint16_t qtest_readw_be(QTestState *s, uint64_t addr); >> +uint32_t qtest_readl_be(QTestState *s, uint64_t addr); >> +uint64_t qtest_readq_be(QTestState *s, uint64_t addr); >> + >> +static inline uint16_t readw_be(uint64_t addr) >> +{ >> + return qtest_readw_be(global_qtest, addr); >> +} >> +static inline uint32_t readl_be(uint64_t addr) >> +{ >> + return qtest_readl_be(global_qtest, addr); >> +} >> +static inline uint64_t readq_be(uint64_t addr) >> +{ >> + return qtest_readq_be(global_qtest, addr); >> +} >> + >> +void qtest_writew_be(QTestState *s, uint64_t addr, uint16_t value); >> +void qtest_writel_be(QTestState *s, uint64_t addr, uint32_t value); >> +void qtest_writeq_be(QTestState *s, uint64_t addr, uint64_t value); >> + >> +static inline void writew_be(uint64_t addr, uint16_t value) >> +{ >> + qtest_writew_be(global_qtest, addr, value); >> +} >> +static inline void writel_be(uint64_t addr, uint32_t value) >> +{ >> + qtest_writel_be(global_qtest, addr, value); >> +} >> +static inline void writeq_be(uint64_t addr, uint64_t value) >> +{ >> + qtest_writeq_be(global_qtest, addr, value); >> +} >> + >> +uint16_t qtest_readw_le(QTestState *s, uint64_t addr); >> +uint32_t qtest_readl_le(QTestState *s, uint64_t addr); >> +uint64_t qtest_readq_le(QTestState *s, uint64_t addr); >> + >> +static inline uint16_t readw_le(uint64_t addr) >> +{ >> + return qtest_readw_le(global_qtest, addr); >> +} >> +static inline uint32_t readl_le(uint64_t addr) >> +{ >> + return qtest_readl_le(global_qtest, addr); >> +} >> +static inline uint64_t readq_le(uint64_t addr) >> +{ >> + return qtest_readq_le(global_qtest, addr); >> +} >> + >> +void qtest_writew_le(QTestState *s, uint64_t addr, uint16_t value); >> +void qtest_writel_le(QTestState *s, uint64_t addr, uint32_t value); >> +void qtest_writeq_le(QTestState *s, uint64_t addr, uint64_t value); >> + >> +static inline void writew_le(uint64_t addr, uint16_t value) >> +{ >> + qtest_writew_le(global_qtest, addr, value); >> +} >> +static inline void writel_le(uint64_t addr, uint32_t value) >> +{ >> + qtest_writel_le(global_qtest, addr, value); >> +} >> +static inline void writeq_le(uint64_t addr, uint64_t value) >> +{ >> + qtest_writeq_le(global_qtest, addr, value); >> +} >> + >> #endif >> Index: qemu-aspeed.git/tests/libqtest.c >> =================================================================== >> --- qemu-aspeed.git.orig/tests/libqtest.c >> +++ qemu-aspeed.git/tests/libqtest.c >> @@ -15,6 +15,7 @@ >> * >> */ >> #include "qemu/osdep.h" >> +#include "qemu/bswap.h" >> #include "libqtest.h" >> >> #include <sys/socket.h> >> @@ -688,6 +689,42 @@ void qtest_writeq(QTestState *s, uint64_ >> qtest_write(s, "writeq", addr, value); >> } >> >> +void qtest_writew_be(QTestState *s, uint64_t addr, uint16_t value) >> +{ >> + value = cpu_to_be16(value); >> + qtest_memread(s, addr, &value, sizeof(value)); > > Uh.. this surely needs to be qtest_memwrite(). arg. I missed a quilt refresh before sending ... Thanks, C. > >> +} >> + >> +void qtest_writel_be(QTestState *s, uint64_t addr, uint32_t value) >> +{ >> + value = cpu_to_be32(value); >> + qtest_memread(s, addr, &value, sizeof(value)); > > and here, > >> +} >> + >> +void qtest_writeq_be(QTestState *s, uint64_t addr, uint64_t value) >> +{ >> + value = cpu_to_be64(value); >> + qtest_memread(s, addr, &value, sizeof(value)); > > here > >> +} >> + >> +void qtest_writew_le(QTestState *s, uint64_t addr, uint16_t value) >> +{ >> + value = cpu_to_le16(value); >> + qtest_memread(s, addr, &value, sizeof(value)); > > here > >> +} >> + >> +void qtest_writel_le(QTestState *s, uint64_t addr, uint32_t value) >> +{ >> + value = cpu_to_le32(value); >> + qtest_memread(s, addr, &value, sizeof(value)); > > here > >> +} >> + >> +void qtest_writeq_le(QTestState *s, uint64_t addr, uint64_t value) >> +{ >> + value = cpu_to_le64(value); >> + qtest_memread(s, addr, &value, sizeof(value)); > > and here > >> +} >> + >> static uint64_t qtest_read(QTestState *s, const char *cmd, uint64_t addr) >> { >> gchar **args; >> @@ -721,6 +758,60 @@ uint64_t qtest_readq(QTestState *s, uint >> return qtest_read(s, "readq", addr); >> } >> >> +uint16_t qtest_readw_be(QTestState *s, uint64_t addr) >> +{ >> + uint16_t value; >> + >> + qtest_memread(s, addr, &value, sizeof(value)); >> + >> + return be16_to_cpu(value); >> +} >> + >> +uint32_t qtest_readl_be(QTestState *s, uint64_t addr) >> +{ >> + uint32_t value; >> + >> + qtest_memread(s, addr, &value, sizeof(value)); >> + >> + return be32_to_cpu(value); >> +} >> + >> +uint64_t qtest_readq_be(QTestState *s, uint64_t addr) >> +{ >> + uint64_t value; >> + >> + qtest_memread(s, addr, &value, sizeof(value)); >> + >> + return be64_to_cpu(value); >> +} >> + >> +uint16_t qtest_readw_le(QTestState *s, uint64_t addr) >> +{ >> + uint16_t value; >> + >> + qtest_memread(s, addr, &value, sizeof(value)); >> + >> + return le16_to_cpu(value); >> +} >> + >> +uint32_t qtest_readl_le(QTestState *s, uint64_t addr) >> +{ >> + uint32_t value; >> + >> + qtest_memread(s, addr, &value, sizeof(value)); >> + >> + return le32_to_cpu(value); >> +} >> + >> +uint64_t qtest_readq_le(QTestState *s, uint64_t addr) >> +{ >> + uint64_t value; >> + >> + qtest_memread(s, addr, &value, sizeof(value)); >> + >> + return le64_to_cpu(value); >> +} >> + >> static int hex2nib(char ch) >> { >> if (ch >= '0' && ch <= '9') { >> >
On 10/05/2016 01:43 AM, David Gibson wrote: > On Tue, Oct 04, 2016 at 01:36:09PM +0100, Peter Maydell wrote: >> On 4 October 2016 at 13:17, Cédric Le Goater <clg@kaod.org> wrote: >>> Some test scenarios require to access memory regions using a specific >>> endianness, such as a device region, but the current qtest memory >>> accessors are done in native endian, which means that the values are >>> byteswapped in qtest if the endianness of the guest and the host are >>> different. >>> >>> To maintain the endianness of a value, we need a new set of memory >>> accessor. This can be done in two ways: >>> >>> - first, convert the value to the required endianness in libqtest and >>> then use the memread/write routines so that qtest accesses the guest >>> memory without doing any supplementary byteswapping >>> >>> - an alternative method would be to handle the byte swapping on the >>> qtest side. For that, we would need to extend the read/write >>> protocol with an ending word : "native|le|be" and modify the tswap >>> calls accordingly under the qtest_process_command() routine. >>> >>> The result is the same and the first method is simpler. >> >> The difficulty with this patch is that it's hard to tell whether >> it's really required, or if this is just adding an extra layer >> of byteswapping that should really be done in some other location > > Actually, it's neither. It's not essential for anything, but it > *removes* an extra layer of byteswapping that really never should have > been done in the first place. So may be removing the tswap in qtest_process_command() is a better solution ? I don't know how much of an earthquake this will be in the tests results. Tests will have to be explicit on which endianness they expect else they will be using the qtest default endianness which is the host. >> in the stack. What's the actual test case here? > > The current readw, readl, etc. all work in "guest endianness". But > guest endianness is not well defined - there are a number of targets > which can support either. And it's doubly meaningless since it's a > property of the guest cpu, which we're essentially replacing with the > qtest stub anyway. > > Furthermore "guest endianness" isn't useful. With a tiny handful of > exceptions, all peripherals have their own endianness which is known > independent of the target. It makes more sense for test cases to > explicitly do their accesses in the correct endianness for the device, > without having to compensate for the fact that it'll be swapped into > the essentially arbitrary "guest endianness" along the way. > > Basically, the existing byteswaps, instead of removing the need for > them in the testcase code, instead mean that target-conditional > byteswaps will be *required* in nearly every testcase. It's a recipe > for endianness-unsafe testcases. and plenty of baroque work arounds. Thanks, C.
On 4 October 2016 at 16:43, David Gibson <david@gibson.dropbear.id.au> wrote: > On Tue, Oct 04, 2016 at 01:36:09PM +0100, Peter Maydell wrote: >> The difficulty with this patch is that it's hard to tell whether >> it's really required, or if this is just adding an extra layer >> of byteswapping that should really be done in some other location > > Actually, it's neither. It's not essential for anything, but it > *removes* an extra layer of byteswapping that really never should have > been done in the first place. The patch is very clearly adding calls to swapping functions. It looks like it's mostly convenience functions for not doing those swaps explicitly in the test cases. >> in the stack. What's the actual test case here? > > The current readw, readl, etc. all work in "guest endianness". But > guest endianness is not well defined - there are a number of targets > which can support either. It's guest bus endianness, and it's pretty well defined I think. (ARM for instance is LE bus even if the CPU is doing BE writes.) > And it's doubly meaningless since it's a > property of the guest cpu, which we're essentially replacing with the > qtest stub anyway. The stub sits on the same bus the guest cpu would. > Furthermore "guest endianness" isn't useful. With a tiny handful of > exceptions, all peripherals have their own endianness which is known > independent of the target. It makes more sense for test cases to > explicitly do their accesses in the correct endianness for the device, > without having to compensate for the fact that it'll be swapped into > the essentially arbitrary "guest endianness" along the way. Here I definitely disagree. I think it makes much more sense for writes to be "what the guest CPU would write", because that's where we're injecting them. If we had a test framework where the test was talking directly to the device, then maybe, but we don't. > Basically, the existing byteswaps, instead of removing the need for > them in the testcase code, instead mean that target-conditional > byteswaps will be *required* in nearly every testcase. It's a recipe > for endianness-unsafe testcases. I prefer the swaps to be explicit in the test cases. If your test case running on the real CPU would have had to do "swap to LE and then write this word", so does the test case in our test framework. If your test case just does "write the word", then so does the test. Most devices IME do not require byteswaps by guest code, and I think these functions are confusing -- if you try to use them to write tests for ARM devices, for instance, the _le versions of these functions will introduce an incorrect byteswap on a BE host, even though ARM CPUs and devices are typically all LE. thanks -- PMM
On 10/05/2016 02:31 PM, Peter Maydell wrote: > On 4 October 2016 at 16:43, David Gibson <david@gibson.dropbear.id.au> wrote: >> On Tue, Oct 04, 2016 at 01:36:09PM +0100, Peter Maydell wrote: >>> The difficulty with this patch is that it's hard to tell whether >>> it's really required, or if this is just adding an extra layer >>> of byteswapping that should really be done in some other location >> >> Actually, it's neither. It's not essential for anything, but it >> *removes* an extra layer of byteswapping that really never should have >> been done in the first place. > > The patch is very clearly adding calls to swapping functions. > It looks like it's mostly convenience functions for not doing > those swaps explicitly in the test cases. > >>> in the stack. What's the actual test case here? >> >> The current readw, readl, etc. all work in "guest endianness". But >> guest endianness is not well defined - there are a number of targets >> which can support either. > > It's guest bus endianness, and it's pretty well defined I think. > (ARM for instance is LE bus even if the CPU is doing BE writes.) > >> And it's doubly meaningless since it's a >> property of the guest cpu, which we're essentially replacing with the >> qtest stub anyway. > > The stub sits on the same bus the guest cpu would. > >> Furthermore "guest endianness" isn't useful. With a tiny handful of >> exceptions, all peripherals have their own endianness which is known >> independent of the target. It makes more sense for test cases to >> explicitly do their accesses in the correct endianness for the device, >> without having to compensate for the fact that it'll be swapped into >> the essentially arbitrary "guest endianness" along the way. > > Here I definitely disagree. I think it makes much more sense > for writes to be "what the guest CPU would write", because that's > where we're injecting them. If we had a test framework where the > test was talking directly to the device, then maybe, but we don't. This is how it all started with the test I wrote. Quite ignorant of the middle layers, I used cpu_to_be32() as I would have done on the guest and then realized that qtest was byte-swapping also in some cases and so the test failed on a BE host. >> Basically, the existing byteswaps, instead of removing the need for >> them in the testcase code, instead mean that target-conditional >> byteswaps will be *required* in nearly every testcase. It's a recipe >> for endianness-unsafe testcases. > > I prefer the swaps to be explicit in the test cases. If your > test case running on the real CPU would have had to do > "swap to LE and then write this word", so does the test > case in our test framework. If your test case just does > "write the word", then so does the test. > > Most devices IME do not require byteswaps by guest code, > and I think these functions are confusing -- if you try > to use them to write tests for ARM devices, for instance, > the _le versions of these functions will introduce an > incorrect byteswap on a BE host, even though ARM CPUs and > devices are typically all LE. hmmm, yes I agree :/ My scenario is for a ARM SoC running LE on a BE host and accessing a BE device. So how do we handle the possible bswap due to the host and guest endianness being different ? I am confused. Should we try to remove the tswap from qtest ? Thanks C.
On 5 October 2016 at 06:49, Cédric Le Goater <clg@kaod.org> wrote: > On 10/05/2016 02:31 PM, Peter Maydell wrote: >> Here I definitely disagree. I think it makes much more sense >> for writes to be "what the guest CPU would write", because that's >> where we're injecting them. If we had a test framework where the >> test was talking directly to the device, then maybe, but we don't. > > This is how it all started with the test I wrote. Quite ignorant > of the middle layers, I used cpu_to_be32() as I would have done > on the guest and then realized that qtest was byte-swapping also > in some cases and so the test failed on a BE host. At this point we're back to "I'd need to look at the actual test and device code to identify what's going wrong"... >>> Basically, the existing byteswaps, instead of removing the need for >>> them in the testcase code, instead mean that target-conditional >>> byteswaps will be *required* in nearly every testcase. It's a recipe >>> for endianness-unsafe testcases. >> >> I prefer the swaps to be explicit in the test cases. If your >> test case running on the real CPU would have had to do >> "swap to LE and then write this word", so does the test >> case in our test framework. If your test case just does >> "write the word", then so does the test. >> >> Most devices IME do not require byteswaps by guest code, >> and I think these functions are confusing -- if you try >> to use them to write tests for ARM devices, for instance, >> the _le versions of these functions will introduce an >> incorrect byteswap on a BE host, even though ARM CPUs and >> devices are typically all LE. > > hmmm, yes I agree :/ My scenario is for a ARM SoC running LE > on a BE host and accessing a BE device. > > So how do we handle the possible bswap due to the host and > guest endianness being different ? I am confused. Should we > try to remove the tswap from qtest ? Which tswap? Last time I worked through the stack of what happens I thought that we had the right set of swaps in the right places. thanks -- PMM
On 10/05/2016 03:53 PM, Peter Maydell wrote: > On 5 October 2016 at 06:49, Cédric Le Goater <clg@kaod.org> wrote: >> On 10/05/2016 02:31 PM, Peter Maydell wrote: >>> Here I definitely disagree. I think it makes much more sense >>> for writes to be "what the guest CPU would write", because that's >>> where we're injecting them. If we had a test framework where the >>> test was talking directly to the device, then maybe, but we don't. >> >> This is how it all started with the test I wrote. Quite ignorant >> of the middle layers, I used cpu_to_be32() as I would have done >> on the guest and then realized that qtest was byte-swapping also >> in some cases and so the test failed on a BE host. > > At this point we're back to "I'd need to look at the actual test > and device code to identify what's going wrong"... ok. This is clearly not critical. When ever you have time. >>>> Basically, the existing byteswaps, instead of removing the need for >>>> them in the testcase code, instead mean that target-conditional >>>> byteswaps will be *required* in nearly every testcase. It's a recipe >>>> for endianness-unsafe testcases. >>> >>> I prefer the swaps to be explicit in the test cases. If your >>> test case running on the real CPU would have had to do >>> "swap to LE and then write this word", so does the test >>> case in our test framework. If your test case just does >>> "write the word", then so does the test. >>> >>> Most devices IME do not require byteswaps by guest code, >>> and I think these functions are confusing -- if you try >>> to use them to write tests for ARM devices, for instance, >>> the _le versions of these functions will introduce an >>> incorrect byteswap on a BE host, even though ARM CPUs and >>> devices are typically all LE. >> >> hmmm, yes I agree :/ My scenario is for a ARM SoC running LE >> on a BE host and accessing a BE device. >> >> So how do we handle the possible bswap due to the host and >> guest endianness being different ? I am confused. Should we >> try to remove the tswap from qtest ? > > Which tswap? Last time I worked through the stack of > what happens I thought that we had the right set of > swaps in the right places. The one I am talking about are under qtest_process_command(), see below. Thanks, C. } else if (strcmp(words[0], "writeb") == 0 || strcmp(words[0], "writew") == 0 || strcmp(words[0], "writel") == 0 || strcmp(words[0], "writeq") == 0) { uint64_t addr; uint64_t value; g_assert(words[1] && words[2]); g_assert(qemu_strtoull(words[1], NULL, 0, &addr) == 0); g_assert(qemu_strtoull(words[2], NULL, 0, &value) == 0); if (words[0][5] == 'b') { uint8_t data = value; cpu_physical_memory_write(addr, &data, 1); } else if (words[0][5] == 'w') { uint16_t data = value; tswap16s(&data); cpu_physical_memory_write(addr, &data, 2); } else if (words[0][5] == 'l') { uint32_t data = value; tswap32s(&data); cpu_physical_memory_write(addr, &data, 4); } else if (words[0][5] == 'q') { uint64_t data = value; tswap64s(&data); cpu_physical_memory_write(addr, &data, 8); } qtest_send_prefix(chr); qtest_send(chr, "OK\n"); } else if (strcmp(words[0], "readb") == 0 || strcmp(words[0], "readw") == 0 || strcmp(words[0], "readl") == 0 || strcmp(words[0], "readq") == 0) { uint64_t addr; uint64_t value = UINT64_C(-1); g_assert(words[1]); g_assert(qemu_strtoull(words[1], NULL, 0, &addr) == 0); if (words[0][4] == 'b') { uint8_t data; cpu_physical_memory_read(addr, &data, 1); value = data; } else if (words[0][4] == 'w') { uint16_t data; cpu_physical_memory_read(addr, &data, 2); value = tswap16(data); } else if (words[0][4] == 'l') { uint32_t data; cpu_physical_memory_read(addr, &data, 4); value = tswap32(data); } else if (words[0][4] == 'q') { cpu_physical_memory_read(addr, &value, 8); tswap64s(&value); } qtest_send_prefix(chr); qtest_sendf(chr, "OK 0x%016" PRIx64 "\n", value);
On 5 October 2016 at 07:00, Cédric Le Goater <clg@kaod.org> wrote: > On 10/05/2016 03:53 PM, Peter Maydell wrote: >> Which tswap? Last time I worked through the stack of >> what happens I thought that we had the right set of >> swaps in the right places. > > The one I am talking about are under qtest_process_command(), > see below. Those are correct and required, and they do not change the overall behaviour of the system depending on the host endianness. (They convert 32-bit values to "bag of bytes in guest order" which is what the cpu_physical_memory_* functions want.) https://lists.gnu.org/archive/html/qemu-arm/2016-07/msg00037.html is the explanation from last time around, I think. If your test is giving different answers on different host endiannesses, it is presumably because it is incorrectly issuing different readl/etc commands in the two cases. thanks -- PMM
On 10/05/2016 04:20 PM, Peter Maydell wrote: > On 5 October 2016 at 07:00, Cédric Le Goater <clg@kaod.org> wrote: >> On 10/05/2016 03:53 PM, Peter Maydell wrote: >>> Which tswap? Last time I worked through the stack of >>> what happens I thought that we had the right set of >>> swaps in the right places. >> >> The one I am talking about are under qtest_process_command(), >> see below. > > Those are correct and required, and they do not change > the overall behaviour of the system depending on the host > endianness. (They convert 32-bit values to "bag of > bytes in guest order" which is what the cpu_physical_memory_* > functions want.) > > https://lists.gnu.org/archive/html/qemu-arm/2016-07/msg00037.html > is the explanation from last time around, I think. > > If your test is giving different answers on different > host endiannesses, it is presumably because it is > incorrectly issuing different readl/etc commands > in the two cases. OK. I think my brain is starting to see things from the right angle. Let's try that : my test is simulating a Little Endian CPU which is writing a Big Endian value, so it *always* needs to bswap() that value. cpu_to_be32 is incorrect as it implies the endianness of the host which can be wrong (when running BE). The qtest layer is a just fixing layer. Thanks, C.
On 5 October 2016 at 10:17, Cédric Le Goater <clg@kaod.org> wrote: > OK. I think my brain is starting to see things from the right > angle. > > Let's try that : my test is simulating a Little Endian CPU which > is writing a Big Endian value, so it *always* needs to bswap() > that value. Yes. If your code on the (LE) guest CPU would be "bswap 32-bit value X; store 32-bit value to address" then your code in the test should be: "bswap 32-bit value X; store 32-bit value to address" If your code on the guest CPU is just "write 32-bit value X to address" then the code in the test is just "write 32-bit value X to address" > cpu_to_be32 is incorrect as it implies the endianness > of the host which can be wrong (when running BE). The main time you might want it in a test is if you do what the IDE and e1000e tests do: build up a data structure locally, and then memwrite() it to the guest memory as a pile of bytes (which you then point the device-under-test at). thanks -- PMM
On Wed, Oct 05, 2016 at 05:31:07AM -0700, Peter Maydell wrote: > On 4 October 2016 at 16:43, David Gibson <david@gibson.dropbear.id.au> wrote: > > On Tue, Oct 04, 2016 at 01:36:09PM +0100, Peter Maydell wrote: > >> The difficulty with this patch is that it's hard to tell whether > >> it's really required, or if this is just adding an extra layer > >> of byteswapping that should really be done in some other location > > > > Actually, it's neither. It's not essential for anything, but it > > *removes* an extra layer of byteswapping that really never should have > > been done in the first place. > > The patch is very clearly adding calls to swapping functions. > It looks like it's mostly convenience functions for not doing > those swaps explicitly in the test cases. It's adding 1 swap on top of the memread/memwrite path - that's the path which had no existing swaps (intended primarily for bag-o'-bytes block access AFAICT). It's less swaps compared to the alternative approach of using the readw/writew/readl/writel set of functions. Those already include a tswap, and will in nearly all cases require an additional (conditional) swap in the caller. > >> in the stack. What's the actual test case here? > > > > The current readw, readl, etc. all work in "guest endianness". But > > guest endianness is not well defined - there are a number of targets > > which can support either. > > It's guest bus endianness, and it's pretty well defined I think. > (ARM for instance is LE bus even if the CPU is doing BE writes.) I don't see that guest bus endianness is any better defined, or any more useful than "guest endianness". It might have a vague meaning for ARM (or embedded Power) in the sense that the on-SoC devices will use that endianness. But since the SoC devices are generally unique to the architecture anyway, you still know their endianness independent of any notion of guest endianness. > > And it's doubly meaningless since it's a > > property of the guest cpu, which we're essentially replacing with the > > qtest stub anyway. > > The stub sits on the same bus the guest cpu would. > > > Furthermore "guest endianness" isn't useful. With a tiny handful of > > exceptions, all peripherals have their own endianness which is known > > independent of the target. It makes more sense for test cases to > > explicitly do their accesses in the correct endianness for the device, > > without having to compensate for the fact that it'll be swapped into > > the essentially arbitrary "guest endianness" along the way. > > Here I definitely disagree. I think it makes much more sense > for writes to be "what the guest CPU would write", because that's > where we're injecting them. If we had a test framework where the > test was talking directly to the device, then maybe, but we don't. When I say that guest endianness is not well defined, what I mean is precisely that "what the guest CPU would write" is not well defined. For example on Power, the endianness of a given access will depend on: - For server chips, whether the CPU is in LE or BE mode - For embedded chips, the endianness flag in the TLB - For all chips, whether the plain or byte-reversed load/store instructions are used [There might even be variants with both a global mode flag and per-page flags in the TLB, I'm not sure] All of those components are not present in the qtest model. So attempting to do "what the cpu would do" - by making a bunch of essentially arbitrary assumptions - gives us zero extra coverage. The model proposed here is that our testcases, on every access specify a specific final-result endianness. The alternative is that every write from qtest has to do: 1) Start with a value (host endian) 2) Conditional swap for host endian vs. device endian difference 3) Conditional swap for host endian vs. "guest endian" difference simply to compensate for.. 4) [Existing tswap() within writew/writel] Conditional swap for host endian vs. "guest endian difference > > Basically, the existing byteswaps, instead of removing the need for > > them in the testcase code, instead mean that target-conditional > > byteswaps will be *required* in nearly every testcase. It's a recipe > > for endianness-unsafe testcases. > > I prefer the swaps to be explicit in the test cases. If your The are explicit in the testcases, in the sense that on every read or write you say this is an LE access or a BE access. Potentially we could have a "guest native" access option as well, but I think that's useful > test case running on the real CPU would have had to do > "swap to LE and then write this word", so does the test > case in our test framework. If your test case just does > "write the word", then so does the test. > > Most devices IME do not require byteswaps by guest code, What!? So speaks the person working on a historically LE platform. In the kernel, if a driver isn't littered with cpu_to_leXX() it's almost certainly broken on a BE platform. The only devices I can think of which don't have a fixed endianness regardless of platform are: * legacy virtio: this was an insane design choice, which is why it went away in virtio-1.0 * graphics card framebuffers, which can usually be configured either way (or have both BE and LE views of the framebuffer) * a handful of simple devices, usually from a while back, where some idiot thought a BE version of a previously LE device (or vice versa) was a good idea > and I think these functions are confusing -- if you try > to use them to write tests for ARM devices, for instance, > the _le versions of these functions will introduce an > incorrect byteswap on a BE host, even though ARM CPUs and > devices are typically all LE. Huh? How so? If you're accessing an LE device (like PCI) you say "write_le" and there are no byteswaps at all on an LE host, and a single byteswap on a BE host, which is just what you want. For a PCI device "write_le" is the correct option regardless of both host and guest platform. That makes testcases straightforward to write. As a bonus, it forces the test author to think momentarily about what the right endianness is for each access, which is a good habit to be in (not thinking about this is why we've so often had endianness-broken drivers). Switching to an arbitrary "guest endianness" and back again makes the test more complicated, and accomplishes no extra coverage (because all the swaps under consideration are in the test code anyway). It gains nothing but confusion.
On Wed, Oct 05, 2016 at 03:49:18PM +0200, Cédric Le Goater wrote: > On 10/05/2016 02:31 PM, Peter Maydell wrote: > > On 4 October 2016 at 16:43, David Gibson <david@gibson.dropbear.id.au> wrote: > >> On Tue, Oct 04, 2016 at 01:36:09PM +0100, Peter Maydell wrote: > >>> The difficulty with this patch is that it's hard to tell whether > >>> it's really required, or if this is just adding an extra layer > >>> of byteswapping that should really be done in some other location > >> > >> Actually, it's neither. It's not essential for anything, but it > >> *removes* an extra layer of byteswapping that really never should have > >> been done in the first place. > > > > The patch is very clearly adding calls to swapping functions. > > It looks like it's mostly convenience functions for not doing > > those swaps explicitly in the test cases. > > > >>> in the stack. What's the actual test case here? > >> > >> The current readw, readl, etc. all work in "guest endianness". But > >> guest endianness is not well defined - there are a number of targets > >> which can support either. > > > > It's guest bus endianness, and it's pretty well defined I think. > > (ARM for instance is LE bus even if the CPU is doing BE writes.) > > > >> And it's doubly meaningless since it's a > >> property of the guest cpu, which we're essentially replacing with the > >> qtest stub anyway. > > > > The stub sits on the same bus the guest cpu would. > > > >> Furthermore "guest endianness" isn't useful. With a tiny handful of > >> exceptions, all peripherals have their own endianness which is known > >> independent of the target. It makes more sense for test cases to > >> explicitly do their accesses in the correct endianness for the device, > >> without having to compensate for the fact that it'll be swapped into > >> the essentially arbitrary "guest endianness" along the way. > > > > Here I definitely disagree. I think it makes much more sense > > for writes to be "what the guest CPU would write", because that's > > where we're injecting them. If we had a test framework where the > > test was talking directly to the device, then maybe, but we don't. > > This is how it all started with the test I wrote. Quite ignorant > of the middle layers, I used cpu_to_be32() as I would have done > on the guest and then realized that qtest was byte-swapping also > in some cases and so the test failed on a BE host. > > >> Basically, the existing byteswaps, instead of removing the need for > >> them in the testcase code, instead mean that target-conditional > >> byteswaps will be *required* in nearly every testcase. It's a recipe > >> for endianness-unsafe testcases. > > > > I prefer the swaps to be explicit in the test cases. If your > > test case running on the real CPU would have had to do > > "swap to LE and then write this word", so does the test > > case in our test framework. If your test case just does > > "write the word", then so does the test. > > > > Most devices IME do not require byteswaps by guest code, > > and I think these functions are confusing -- if you try > > to use them to write tests for ARM devices, for instance, > > the _le versions of these functions will introduce an > > incorrect byteswap on a BE host, even though ARM CPUs and > > devices are typically all LE. > > hmmm, yes I agree :/ My scenario is for a ARM SoC running LE > on a BE host and accessing a BE device. This case is handled perfectly well by this proposal. The ARM devices are LE, so testcases testing them should use the write_le operations. That will do the right thing regardless of host endianness. It will also do the right thing regardless of guest endianness, if for whatever reason you try to attach an ARM SoC device to a (notionally) BE target. > So how do we handle the possible bswap due to the host and > guest endianness being different ? I am confused. Should we > try to remove the tswap from qtest ? > > Thanks > > C. >
On Wed, Oct 05, 2016 at 07:20:52AM -0700, Peter Maydell wrote: > On 5 October 2016 at 07:00, Cédric Le Goater <clg@kaod.org> wrote: > > On 10/05/2016 03:53 PM, Peter Maydell wrote: > >> Which tswap? Last time I worked through the stack of > >> what happens I thought that we had the right set of > >> swaps in the right places. > > > > The one I am talking about are under qtest_process_command(), > > see below. > > Those are correct and required, and they do not change > the overall behaviour of the system depending on the host > endianness. (They convert 32-bit values to "bag of > bytes in guest order" which is what the cpu_physical_memory_* > functions want.) These functions are correct for the defined semantics of the readw/readl operations, but those semantics are not useful. This proposal is introducing alternate functions with the more useful semantics which are "convert a 32-bit value to a bag of bytes in LE order" or "convert a 32-bit value to a bag of bytes in BE order" depending on which variant you choose. qtest is about testing hardware, not the guest cpu, and whether you're accessing MMIO space or buffers the hardware will read via DMA, it's the hardware which determines the correct endianness, not the guest cpu. > https://lists.gnu.org/archive/html/qemu-arm/2016-07/msg00037.html > is the explanation from last time around, I think. > > If your test is giving different answers on different > host endiannesses, it is presumably because it is > incorrectly issuing different readl/etc commands > in the two cases. > > thanks > -- PMM >
On Thu, Oct 06, 2016 at 02:38:36PM +1100, David Gibson wrote: > On Wed, Oct 05, 2016 at 05:31:07AM -0700, Peter Maydell wrote: > > On 4 October 2016 at 16:43, David Gibson <david@gibson.dropbear.id.au> wrote: > > > On Tue, Oct 04, 2016 at 01:36:09PM +0100, Peter Maydell wrote: > > >> The difficulty with this patch is that it's hard to tell whether > > >> it's really required, or if this is just adding an extra layer > > >> of byteswapping that should really be done in some other location > > > > > > Actually, it's neither. It's not essential for anything, but it > > > *removes* an extra layer of byteswapping that really never should have > > > been done in the first place. > > > > The patch is very clearly adding calls to swapping functions. > > It looks like it's mostly convenience functions for not doing > > those swaps explicitly in the test cases. > > It's adding 1 swap on top of the memread/memwrite path - that's the > path which had no existing swaps (intended primarily for bag-o'-bytes > block access AFAICT). > > It's less swaps compared to the alternative approach of using the > readw/writew/readl/writel set of functions. Those already include a > tswap, and will in nearly all cases require an additional > (conditional) swap in the caller. > > > >> in the stack. What's the actual test case here? > > > > > > The current readw, readl, etc. all work in "guest endianness". But > > > guest endianness is not well defined - there are a number of targets > > > which can support either. > > > > It's guest bus endianness, and it's pretty well defined I think. > > (ARM for instance is LE bus even if the CPU is doing BE writes.) > > I don't see that guest bus endianness is any better defined, or any > more useful than "guest endianness". It might have a vague meaning > for ARM (or embedded Power) in the sense that the on-SoC devices will > use that endianness. But since the SoC devices are generally unique > to the architecture anyway, you still know their endianness > independent of any notion of guest endianness. > > > > And it's doubly meaningless since it's a > > > property of the guest cpu, which we're essentially replacing with the > > > qtest stub anyway. > > > > The stub sits on the same bus the guest cpu would. > > > > > Furthermore "guest endianness" isn't useful. With a tiny handful of > > > exceptions, all peripherals have their own endianness which is known > > > independent of the target. It makes more sense for test cases to > > > explicitly do their accesses in the correct endianness for the device, > > > without having to compensate for the fact that it'll be swapped into > > > the essentially arbitrary "guest endianness" along the way. > > > > Here I definitely disagree. I think it makes much more sense > > for writes to be "what the guest CPU would write", because that's > > where we're injecting them. If we had a test framework where the > > test was talking directly to the device, then maybe, but we don't. > > When I say that guest endianness is not well defined, what I mean is > precisely that "what the guest CPU would write" is not well defined. > For example on Power, the endianness of a given access will depend on: > - For server chips, whether the CPU is in LE or BE mode > - For embedded chips, the endianness flag in the TLB > - For all chips, whether the plain or byte-reversed load/store > instructions are used > > [There might even be variants with both a global mode flag and > per-page flags in the TLB, I'm not sure] > > All of those components are not present in the qtest model. So > attempting to do "what the cpu would do" - by making a bunch of > essentially arbitrary assumptions - gives us zero extra coverage. > > The model proposed here is that our testcases, on every access specify > a specific final-result endianness. The alternative is that every > write from qtest has to do: > 1) Start with a value (host endian) > 2) Conditional swap for host endian vs. device endian difference > 3) Conditional swap for host endian vs. "guest endian" difference > simply to compensate for.. > 4) [Existing tswap() within writew/writel] > Conditional swap for host endian vs. "guest endian difference > > > > Basically, the existing byteswaps, instead of removing the need for > > > them in the testcase code, instead mean that target-conditional > > > byteswaps will be *required* in nearly every testcase. It's a recipe > > > for endianness-unsafe testcases. > > > > I prefer the swaps to be explicit in the test cases. If your > > The are explicit in the testcases, in the sense that on every read or > write you say this is an LE access or a BE access. Potentially we > could have a "guest native" access option as well, but I think that's > useful > > > test case running on the real CPU would have had to do > > "swap to LE and then write this word", so does the test > > case in our test framework. If your test case just does > > "write the word", then so does the test. > > > > Most devices IME do not require byteswaps by guest code, > > What!? So speaks the person working on a historically LE platform. > In the kernel, if a driver isn't littered with cpu_to_leXX() it's > almost certainly broken on a BE platform. Clarifying here. In the kernel, byteswaps are not typically required on writew(), writel() operations. That's because the kernel writew() and writel() are defined to have always-LE operation, and most hardware is LE. In other words, the kernel writew() and writel() have the same semantics as the proposed writew_le() and writel_le() *NOT* the existing qtest writew() and writel() (they're more like the rarely used kernel __raw_writew() and __raw_writel()).
On 06/10/2016 05:45, David Gibson wrote: > qtest is about testing hardware, not the guest cpu, and whether you're > accessing MMIO space or buffers the hardware will read via DMA, it's > the hardware which determines the correct endianness, not the guest > cpu. Well, sort of... qtest writes are definitely injecting things from the point of view of the guest CPU. For example see Laurent's patches that add RTAS commands to qtest. Paolo
On Thu, Oct 06, 2016 at 09:23:11AM +0200, Paolo Bonzini wrote: > > > On 06/10/2016 05:45, David Gibson wrote: > > qtest is about testing hardware, not the guest cpu, and whether you're > > accessing MMIO space or buffers the hardware will read via DMA, it's > > the hardware which determines the correct endianness, not the guest > > cpu. > > Well, sort of... qtest writes are definitely injecting things from the > point of view of the guest CPU. For example see Laurent's patches that > add RTAS commands to qtest. Sure. My point is that any guest CPU can do both LE and BE accesses pretty simply, so we might as well specify which we want.
On 06/10/2016 10:37, David Gibson wrote: > On Thu, Oct 06, 2016 at 09:23:11AM +0200, Paolo Bonzini wrote: >> >> >> On 06/10/2016 05:45, David Gibson wrote: >>> qtest is about testing hardware, not the guest cpu, and whether you're >>> accessing MMIO space or buffers the hardware will read via DMA, it's >>> the hardware which determines the correct endianness, not the guest >>> cpu. >> >> Well, sort of... qtest writes are definitely injecting things from the >> point of view of the guest CPU. For example see Laurent's patches that >> add RTAS commands to qtest. > > Sure. My point is that any guest CPU can do both LE and BE accesses > pretty simply, so we might as well specify which we want. On this I do agree. Paolo
On 10/06/2016 11:40 AM, Paolo Bonzini wrote: > > > On 06/10/2016 10:37, David Gibson wrote: >> On Thu, Oct 06, 2016 at 09:23:11AM +0200, Paolo Bonzini wrote: >>> >>> >>> On 06/10/2016 05:45, David Gibson wrote: >>>> qtest is about testing hardware, not the guest cpu, and whether you're >>>> accessing MMIO space or buffers the hardware will read via DMA, it's >>>> the hardware which determines the correct endianness, not the guest >>>> cpu. >>> >>> Well, sort of... qtest writes are definitely injecting things from the >>> point of view of the guest CPU. For example see Laurent's patches that >>> add RTAS commands to qtest. >> >> Sure. My point is that any guest CPU can do both LE and BE accesses >> pretty simply, so we might as well specify which we want. > > On this I do agree. Yes. It is pretty confusing today from the test side. To clarify things a little, I came up with this : +static inline uint32_t make_be32(uint32_t data) +{ + return bswap32(data); +} ... + writel(ASPEED_FLASH_BASE, make_be32(some_page_addr)); Which is what a Little Endian CPU would do to write a BE value. So it should probably be called, make_b32_on_le_cpu() to be correct. I am not sure there is a good interface for these accesses ... C.
On 6 October 2016 at 04:45, David Gibson <david@gibson.dropbear.id.au> wrote: > On Wed, Oct 05, 2016 at 07:20:52AM -0700, Peter Maydell wrote: >> On 5 October 2016 at 07:00, Cédric Le Goater <clg@kaod.org> wrote: >> > On 10/05/2016 03:53 PM, Peter Maydell wrote: >> >> Which tswap? Last time I worked through the stack of >> >> what happens I thought that we had the right set of >> >> swaps in the right places. >> > >> > The one I am talking about are under qtest_process_command(), >> > see below. >> >> Those are correct and required, and they do not change >> the overall behaviour of the system depending on the host >> endianness. (They convert 32-bit values to "bag of >> bytes in guest order" which is what the cpu_physical_memory_* >> functions want.) > > These functions are correct for the defined semantics of the > readw/readl operations, but those semantics are not useful. ?? They're the most obvious and required semantics: "act like the CPU just did a word/halfword/byte read/write". There's a reason we've got this far without needing anything else, and it's that this is the most straightforward use case. > This proposal is introducing alternate functions with the more useful > semantics which are "convert a 32-bit value to a bag of bytes in LE > order" or "convert a 32-bit value to a bag of bytes in BE order" > depending on which variant you choose. It's adding functions whose semantics are "act like the CPU wrote this value to some RAM and then memcpy()ed it to the device". I think devices whose usage model is "memcpy bytes to me" are rare to nonexistent. thanks -- PMM
On 6 October 2016 at 04:38, David Gibson <david@gibson.dropbear.id.au> wrote: > On Wed, Oct 05, 2016 at 05:31:07AM -0700, Peter Maydell wrote: >> On 4 October 2016 at 16:43, David Gibson <david@gibson.dropbear.id.au> wrote: >> > On Tue, Oct 04, 2016 at 01:36:09PM +0100, Peter Maydell wrote: >> >> The difficulty with this patch is that it's hard to tell whether >> >> it's really required, or if this is just adding an extra layer >> >> of byteswapping that should really be done in some other location >> > >> > Actually, it's neither. It's not essential for anything, but it >> > *removes* an extra layer of byteswapping that really never should have >> > been done in the first place. >> >> The patch is very clearly adding calls to swapping functions. >> It looks like it's mostly convenience functions for not doing >> those swaps explicitly in the test cases. > > It's adding 1 swap on top of the memread/memwrite path - that's the > path which had no existing swaps (intended primarily for bag-o'-bytes > block access AFAICT). Yeah, I hadn't noticed it was using the memread/memwrite path. I disagree with using that code path for what ought to be register read/writes (among other things, it's not clear to me that it guarantees that a 4-byte access by the test code is always a 4-byte access on the device, etc). >> >> in the stack. What's the actual test case here? >> > >> > The current readw, readl, etc. all work in "guest endianness". But >> > guest endianness is not well defined - there are a number of targets >> > which can support either. >> >> It's guest bus endianness, and it's pretty well defined I think. >> (ARM for instance is LE bus even if the CPU is doing BE writes.) > > I don't see that guest bus endianness is any better defined, or any > more useful than "guest endianness". It might have a vague meaning > for ARM (or embedded Power) in the sense that the on-SoC devices will > use that endianness. But since the SoC devices are generally unique > to the architecture anyway, you still know their endianness > independent of any notion of guest endianness. It means "the endianness that you would see if you could snoop the data bus at the output of the CPU". It is definitely well defined for every QEMU target because that's what TARGET_WORDS_BIGENDIAN tells you. It's not the same as whatever the device thinks it might interpret values as (though obviously people don't often design systems where they differ). >> > And it's doubly meaningless since it's a >> > property of the guest cpu, which we're essentially replacing with the >> > qtest stub anyway. >> >> The stub sits on the same bus the guest cpu would. >> >> > Furthermore "guest endianness" isn't useful. With a tiny handful of >> > exceptions, all peripherals have their own endianness which is known >> > independent of the target. It makes more sense for test cases to >> > explicitly do their accesses in the correct endianness for the device, >> > without having to compensate for the fact that it'll be swapped into >> > the essentially arbitrary "guest endianness" along the way. >> >> Here I definitely disagree. I think it makes much more sense >> for writes to be "what the guest CPU would write", because that's >> where we're injecting them. If we had a test framework where the >> test was talking directly to the device, then maybe, but we don't. > > When I say that guest endianness is not well defined, what I mean is > precisely that "what the guest CPU would write" is not well defined. > For example on Power, the endianness of a given access will depend on: > - For server chips, whether the CPU is in LE or BE mode > - For embedded chips, the endianness flag in the TLB > - For all chips, whether the plain or byte-reversed load/store > instructions are used > > [There might even be variants with both a global mode flag and > per-page flags in the TLB, I'm not sure] AFAIK, these things all take effect inside the guest CPU, and affect the value the guest eventually writes to the CPU bus. > The model proposed here is that our testcases, on every access specify > a specific final-result endianness. The alternative is that every > write from qtest has to do: > 1) Start with a value (host endian) > 2) Conditional swap for host endian vs. device endian difference > 3) Conditional swap for host endian vs. "guest endian" difference > simply to compensate for.. > 4) [Existing tswap() within writew/writel] > Conditional swap for host endian vs. "guest endian difference > >> > Basically, the existing byteswaps, instead of removing the need for >> > them in the testcase code, instead mean that target-conditional >> > byteswaps will be *required* in nearly every testcase. It's a recipe >> > for endianness-unsafe testcases. >> >> I prefer the swaps to be explicit in the test cases. If your > > The are explicit in the testcases, in the sense that on every read or > write you say this is an LE access or a BE access. Potentially we > could have a "guest native" access option as well, but I think that's > useful > >> test case running on the real CPU would have had to do >> "swap to LE and then write this word", so does the test >> case in our test framework. If your test case just does >> "write the word", then so does the test. >> >> Most devices IME do not require byteswaps by guest code, > > What!? So speaks the person working on a historically LE platform. Well, platforms where the devices and the CPU agree about endianness. > In the kernel, if a driver isn't littered with cpu_to_leXX() it's > almost certainly broken on a BE platform. > > The only devices I can think of which don't have a fixed endianness > regardless of platform are: > * legacy virtio: this was an insane design choice, which is why it > went away in virtio-1.0 > * graphics card framebuffers, which can usually be configured > either way (or have both BE and LE views of the framebuffer) > * a handful of simple devices, usually from a while back, where > some idiot thought a BE version of a previously LE device (or vice > versa) was a good idea > >> and I think these functions are confusing -- if you try >> to use them to write tests for ARM devices, for instance, >> the _le versions of these functions will introduce an >> incorrect byteswap on a BE host, even though ARM CPUs and >> devices are typically all LE. > > Huh? How so? This was the result of my mis-reading of the code: I thought they were doing a byteswap and then calling the readl/readw/etc functions, not the readmem etc functions. > If you're accessing an LE device (like PCI) you say "write_le" and > there are no byteswaps at all on an LE host, and a single byteswap on > a BE host, which is just what you want. For a PCI device "write_le" > is the correct option regardless of both host and guest platform. > > That makes testcases straightforward to write. As a bonus, it forces > the test author to think momentarily about what the right endianness > is for each access, which is a good habit to be in (not thinking about > this is why we've so often had endianness-broken drivers). > > Switching to an arbitrary "guest endianness" and back again makes the > test more complicated, and accomplishes no extra coverage (because all > the swaps under consideration are in the test code anyway). It gains > nothing but confusion. I think I need to think about this a bit... thanks -- PMM
On Thu, 6 Oct 2016 12:03:34 +0100 Peter Maydell <peter.maydell@linaro.org> wrote: > On 6 October 2016 at 04:38, David Gibson <david@gibson.dropbear.id.au> wrote: > > On Wed, Oct 05, 2016 at 05:31:07AM -0700, Peter Maydell wrote: > >> On 4 October 2016 at 16:43, David Gibson <david@gibson.dropbear.id.au> wrote: > >> > On Tue, Oct 04, 2016 at 01:36:09PM +0100, Peter Maydell wrote: > >> >> The difficulty with this patch is that it's hard to tell whether > >> >> it's really required, or if this is just adding an extra layer > >> >> of byteswapping that should really be done in some other location > >> > > >> > Actually, it's neither. It's not essential for anything, but it > >> > *removes* an extra layer of byteswapping that really never should have > >> > been done in the first place. > >> > >> The patch is very clearly adding calls to swapping functions. > >> It looks like it's mostly convenience functions for not doing > >> those swaps explicitly in the test cases. > > > > It's adding 1 swap on top of the memread/memwrite path - that's the > > path which had no existing swaps (intended primarily for bag-o'-bytes > > block access AFAICT). > > Yeah, I hadn't noticed it was using the memread/memwrite path. > I disagree with using that code path for what ought to be > register read/writes (among other things, it's not clear to me > that it guarantees that a 4-byte access by the test code is > always a 4-byte access on the device, etc). > FWIW, Cedric had another proposal which apparently went unnoticed: <fc24ad74-da26-a713-9312-a2c2d07fb6a7@kaod.org> The idea is to add an optional endianness argument to the read*/write* commands in the qtest protocol: - libqtest then provides explicit _le and _be APIs - no extra byteswap is performed on the test program side: qtest actually handles that and does exactly 1 or 0 byteswap. - it does not use memread/memwrite - the current 'guest native' API where qtest tswaps is preserved > >> >> in the stack. What's the actual test case here? > >> > > >> > The current readw, readl, etc. all work in "guest endianness". But > >> > guest endianness is not well defined - there are a number of targets > >> > which can support either. > >> > >> It's guest bus endianness, and it's pretty well defined I think. > >> (ARM for instance is LE bus even if the CPU is doing BE writes.) > > > > I don't see that guest bus endianness is any better defined, or any > > more useful than "guest endianness". It might have a vague meaning > > for ARM (or embedded Power) in the sense that the on-SoC devices will > > use that endianness. But since the SoC devices are generally unique > > to the architecture anyway, you still know their endianness > > independent of any notion of guest endianness. > > It means "the endianness that you would see if you could snoop > the data bus at the output of the CPU". It is definitely well > defined for every QEMU target because that's what TARGET_WORDS_BIGENDIAN > tells you. It's not the same as whatever the device thinks it > might interpret values as (though obviously people don't > often design systems where they differ). > > >> > And it's doubly meaningless since it's a > >> > property of the guest cpu, which we're essentially replacing with the > >> > qtest stub anyway. > >> > >> The stub sits on the same bus the guest cpu would. > >> > >> > Furthermore "guest endianness" isn't useful. With a tiny handful of > >> > exceptions, all peripherals have their own endianness which is known > >> > independent of the target. It makes more sense for test cases to > >> > explicitly do their accesses in the correct endianness for the device, > >> > without having to compensate for the fact that it'll be swapped into > >> > the essentially arbitrary "guest endianness" along the way. > >> > >> Here I definitely disagree. I think it makes much more sense > >> for writes to be "what the guest CPU would write", because that's > >> where we're injecting them. If we had a test framework where the > >> test was talking directly to the device, then maybe, but we don't. > > > > When I say that guest endianness is not well defined, what I mean is > > precisely that "what the guest CPU would write" is not well defined. > > For example on Power, the endianness of a given access will depend on: > > - For server chips, whether the CPU is in LE or BE mode > > - For embedded chips, the endianness flag in the TLB > > - For all chips, whether the plain or byte-reversed load/store > > instructions are used > > > > [There might even be variants with both a global mode flag and > > per-page flags in the TLB, I'm not sure] > > AFAIK, these things all take effect inside the guest CPU, and affect > the value the guest eventually writes to the CPU bus. > > > The model proposed here is that our testcases, on every access specify > > a specific final-result endianness. The alternative is that every > > write from qtest has to do: > > 1) Start with a value (host endian) > > 2) Conditional swap for host endian vs. device endian difference > > 3) Conditional swap for host endian vs. "guest endian" difference > > simply to compensate for.. > > 4) [Existing tswap() within writew/writel] > > Conditional swap for host endian vs. "guest endian difference > > > >> > Basically, the existing byteswaps, instead of removing the need for > >> > them in the testcase code, instead mean that target-conditional > >> > byteswaps will be *required* in nearly every testcase. It's a recipe > >> > for endianness-unsafe testcases. > >> > >> I prefer the swaps to be explicit in the test cases. If your > > > > The are explicit in the testcases, in the sense that on every read or > > write you say this is an LE access or a BE access. Potentially we > > could have a "guest native" access option as well, but I think that's > > useful > > > >> test case running on the real CPU would have had to do > >> "swap to LE and then write this word", so does the test > >> case in our test framework. If your test case just does > >> "write the word", then so does the test. > >> > >> Most devices IME do not require byteswaps by guest code, > > > > What!? So speaks the person working on a historically LE platform. > > Well, platforms where the devices and the CPU agree about > endianness. > > > In the kernel, if a driver isn't littered with cpu_to_leXX() it's > > almost certainly broken on a BE platform. > > > > The only devices I can think of which don't have a fixed endianness > > regardless of platform are: > > * legacy virtio: this was an insane design choice, which is why it > > went away in virtio-1.0 > > * graphics card framebuffers, which can usually be configured > > either way (or have both BE and LE views of the framebuffer) > > * a handful of simple devices, usually from a while back, where > > some idiot thought a BE version of a previously LE device (or vice > > versa) was a good idea > > > >> and I think these functions are confusing -- if you try > >> to use them to write tests for ARM devices, for instance, > >> the _le versions of these functions will introduce an > >> incorrect byteswap on a BE host, even though ARM CPUs and > >> devices are typically all LE. > > > > Huh? How so? > > This was the result of my mis-reading of the code: I thought > they were doing a byteswap and then calling the readl/readw/etc > functions, not the readmem etc functions. > > > If you're accessing an LE device (like PCI) you say "write_le" and > > there are no byteswaps at all on an LE host, and a single byteswap on > > a BE host, which is just what you want. For a PCI device "write_le" > > is the correct option regardless of both host and guest platform. > > > > That makes testcases straightforward to write. As a bonus, it forces > > the test author to think momentarily about what the right endianness > > is for each access, which is a good habit to be in (not thinking about > > this is why we've so often had endianness-broken drivers). > > > > Switching to an arbitrary "guest endianness" and back again makes the > > test more complicated, and accomplishes no extra coverage (because all > > the swaps under consideration are in the test code anyway). It gains > > nothing but confusion. > > I think I need to think about this a bit... > > thanks > -- PMM >
On 06/10/2016 16:11, Greg Kurz wrote: > FWIW, Cedric had another proposal which apparently went unnoticed: > > <fc24ad74-da26-a713-9312-a2c2d07fb6a7@kaod.org> > > The idea is to add an optional endianness argument to the read*/write* > commands in the qtest protocol: > - libqtest then provides explicit _le and _be APIs > - no extra byteswap is performed on the test program side: qtest > actually handles that and does exactly 1 or 0 byteswap. > - it does not use memread/memwrite > - the current 'guest native' API where qtest tswaps is preserved > No, this is a worse idea, because the right place to do the swap is in the "program" (libqtest) not in the "CPU" (QEMU). Paolo
On 6 October 2016 at 16:36, Paolo Bonzini <pbonzini@redhat.com> wrote: > > > On 06/10/2016 16:11, Greg Kurz wrote: >> FWIW, Cedric had another proposal which apparently went unnoticed: >> >> <fc24ad74-da26-a713-9312-a2c2d07fb6a7@kaod.org> >> >> The idea is to add an optional endianness argument to the read*/write* >> commands in the qtest protocol: >> - libqtest then provides explicit _le and _be APIs >> - no extra byteswap is performed on the test program side: qtest >> actually handles that and does exactly 1 or 0 byteswap. >> - it does not use memread/memwrite >> - the current 'guest native' API where qtest tswaps is preserved >> > > No, this is a worse idea, because the right place to do the swap is in > the "program" (libqtest) not in the "CPU" (QEMU). Speaking of the right place to do things, perhaps we should reimplement qtest_big_endian() in libqtest.c to send a query to the QEMU-under-test to ask it what TARGET_BIG_ENDIAN says, rather than hardcoding a big list of architectures... thanks -- PMM
On 10/06/2016 05:36 PM, Paolo Bonzini wrote: > > > On 06/10/2016 16:11, Greg Kurz wrote: >> FWIW, Cedric had another proposal which apparently went unnoticed: >> >> <fc24ad74-da26-a713-9312-a2c2d07fb6a7@kaod.org> >> >> The idea is to add an optional endianness argument to the read*/write* >> commands in the qtest protocol: >> - libqtest then provides explicit _le and _be APIs >> - no extra byteswap is performed on the test program side: qtest >> actually handles that and does exactly 1 or 0 byteswap. >> - it does not use memread/memwrite >> - the current 'guest native' API where qtest tswaps is preserved >> > > No, this is a worse idea, because the right place to do the swap is in > the "program" (libqtest) not in the "CPU" (QEMU). So the current patch, minus the typos, seems to be a "better" solution. Thanks, C.
On 06/10/2016 17:44, Cédric Le Goater wrote: > On 10/06/2016 05:36 PM, Paolo Bonzini wrote: >> >> >> On 06/10/2016 16:11, Greg Kurz wrote: >>> FWIW, Cedric had another proposal which apparently went unnoticed: >>> >>> <fc24ad74-da26-a713-9312-a2c2d07fb6a7@kaod.org> >>> >>> The idea is to add an optional endianness argument to the read*/write* >>> commands in the qtest protocol: >>> - libqtest then provides explicit _le and _be APIs >>> - no extra byteswap is performed on the test program side: qtest >>> actually handles that and does exactly 1 or 0 byteswap. >>> - it does not use memread/memwrite >>> - the current 'guest native' API where qtest tswaps is preserved >>> >> >> No, this is a worse idea, because the right place to do the swap is in >> the "program" (libqtest) not in the "CPU" (QEMU). > > So the current patch, minus the typos, seems to be a "better" solution. Yes, at least I don't think it's a problem that it uses "memread" and "memwrite". It's just address_space_rw under the hood, and we know that it does the right thing with MMIO addresses. Paolo
On 06/10/2016 17:41, Peter Maydell wrote: > On 6 October 2016 at 16:36, Paolo Bonzini <pbonzini@redhat.com> wrote: >> >> >> On 06/10/2016 16:11, Greg Kurz wrote: >>> FWIW, Cedric had another proposal which apparently went unnoticed: >>> >>> <fc24ad74-da26-a713-9312-a2c2d07fb6a7@kaod.org> >>> >>> The idea is to add an optional endianness argument to the read*/write* >>> commands in the qtest protocol: >>> - libqtest then provides explicit _le and _be APIs >>> - no extra byteswap is performed on the test program side: qtest >>> actually handles that and does exactly 1 or 0 byteswap. >>> - it does not use memread/memwrite >>> - the current 'guest native' API where qtest tswaps is preserved >>> >> >> No, this is a worse idea, because the right place to do the swap is in >> the "program" (libqtest) not in the "CPU" (QEMU). > > Speaking of the right place to do things, perhaps we should > reimplement qtest_big_endian() in libqtest.c to send a query > to the QEMU-under-test to ask it what TARGET_BIG_ENDIAN says, > rather than hardcoding a big list of architectures... Yes, it's a good idea. I can do that... Laurent
On Thu, Oct 06, 2016 at 11:47:36AM +0100, Peter Maydell wrote: > On 6 October 2016 at 04:45, David Gibson <david@gibson.dropbear.id.au> wrote: > > On Wed, Oct 05, 2016 at 07:20:52AM -0700, Peter Maydell wrote: > >> On 5 October 2016 at 07:00, Cédric Le Goater <clg@kaod.org> wrote: > >> > On 10/05/2016 03:53 PM, Peter Maydell wrote: > >> >> Which tswap? Last time I worked through the stack of > >> >> what happens I thought that we had the right set of > >> >> swaps in the right places. > >> > > >> > The one I am talking about are under qtest_process_command(), > >> > see below. > >> > >> Those are correct and required, and they do not change > >> the overall behaviour of the system depending on the host > >> endianness. (They convert 32-bit values to "bag of > >> bytes in guest order" which is what the cpu_physical_memory_* > >> functions want.) > > > > These functions are correct for the defined semantics of the > > readw/readl operations, but those semantics are not useful. > > ?? They're the most obvious and required semantics: "act > like the CPU just did a word/halfword/byte read/write". The CPU with what mode and options? > There's a reason we've got this far without needing anything > else, and it's that this is the most straightforward use case. No, I'm pretty sure we've got this far because most of the tests haven't yet been enabled for a traditionally BE target. > > This proposal is introducing alternate functions with the more useful > > semantics which are "convert a 32-bit value to a bag of bytes in LE > > order" or "convert a 32-bit value to a bag of bytes in BE order" > > depending on which variant you choose. > > It's adding functions whose semantics are "act like the > CPU wrote this value to some RAM and then memcpy()ed it to > the device". I think devices whose usage model is "memcpy bytes > to me" are rare to nonexistent. Uh, that's not the intention. I have some comments on this elsewhere. The intended semantics are that we do a single atomic write to an address, but with a specific endianness.
On Thu, Oct 06, 2016 at 12:03:34PM +0100, Peter Maydell wrote: > On 6 October 2016 at 04:38, David Gibson <david@gibson.dropbear.id.au> wrote: > > On Wed, Oct 05, 2016 at 05:31:07AM -0700, Peter Maydell wrote: > >> On 4 October 2016 at 16:43, David Gibson <david@gibson.dropbear.id.au> wrote: > >> > On Tue, Oct 04, 2016 at 01:36:09PM +0100, Peter Maydell wrote: > >> >> The difficulty with this patch is that it's hard to tell whether > >> >> it's really required, or if this is just adding an extra layer > >> >> of byteswapping that should really be done in some other location > >> > > >> > Actually, it's neither. It's not essential for anything, but it > >> > *removes* an extra layer of byteswapping that really never should have > >> > been done in the first place. > >> > >> The patch is very clearly adding calls to swapping functions. > >> It looks like it's mostly convenience functions for not doing > >> those swaps explicitly in the test cases. > > > > It's adding 1 swap on top of the memread/memwrite path - that's the > > path which had no existing swaps (intended primarily for bag-o'-bytes > > block access AFAICT). > > Yeah, I hadn't noticed it was using the memread/memwrite path. > I disagree with using that code path for what ought to be > register read/writes (among other things, it's not clear to me > that it guarantees that a 4-byte access by the test code is > always a 4-byte access on the device, etc). I was concerned about that when I first saw it as well. But, I traced the path and both the memread() path and the readw/readl path backend onto cpu_physical_memory_read() with the appropriate width. So if the proposed ops are broken in this way, so are readw/readl. If we're concerned that that won't always be the case we can still implement the same operations, but instead of having them be just wrappers on memread/memwrite, they'd be new primitives passed over the pipe to the qtest accelerator. > > >> >> in the stack. What's the actual test case here? > >> > > >> > The current readw, readl, etc. all work in "guest endianness". But > >> > guest endianness is not well defined - there are a number of targets > >> > which can support either. > >> > >> It's guest bus endianness, and it's pretty well defined I think. > >> (ARM for instance is LE bus even if the CPU is doing BE writes.) > > > > I don't see that guest bus endianness is any better defined, or any > > more useful than "guest endianness". It might have a vague meaning > > for ARM (or embedded Power) in the sense that the on-SoC devices will > > use that endianness. But since the SoC devices are generally unique > > to the architecture anyway, you still know their endianness > > independent of any notion of guest endianness. > > It means "the endianness that you would see if you could snoop > the data bus at the output of the CPU". Hrm. I see two cases here, neither of them makes this clear: 1) The bus spec defines which data lines are MSB, and which are LSB. In this case, the endianness depends on how those are mapped to byte addresses when you get out to RAM - that could involve several intermediate bridges. 2) The bus spec defines which data lines are byte0, 1, 2, etc (in byte address order). In this case it really is the CPU which determines the endianness of its accesses - and that in turn could depend on modes, TLB entries or instruction formss within the CPU. So the question is: the endianness you see if you snoop the bus when the CPU does... what.. exactly? > It is definitely well > defined for every QEMU target because that's what TARGET_WORDS_BIGENDIAN > tells you. It's not the same as whatever the device thinks it > might interpret values as (though obviously people don't > often design systems where they differ). Honestly I think TAGET_WORDS_BIGENDIAN is just a hangover from when nearly all CPUs worked exclusively in one endianness. It's never terribly well defined. > >> > And it's doubly meaningless since it's a > >> > property of the guest cpu, which we're essentially replacing with the > >> > qtest stub anyway. > >> > >> The stub sits on the same bus the guest cpu would. > >> > >> > Furthermore "guest endianness" isn't useful. With a tiny handful of > >> > exceptions, all peripherals have their own endianness which is known > >> > independent of the target. It makes more sense for test cases to > >> > explicitly do their accesses in the correct endianness for the device, > >> > without having to compensate for the fact that it'll be swapped into > >> > the essentially arbitrary "guest endianness" along the way. > >> > >> Here I definitely disagree. I think it makes much more sense > >> for writes to be "what the guest CPU would write", because that's > >> where we're injecting them. If we had a test framework where the > >> test was talking directly to the device, then maybe, but we don't. > > > > When I say that guest endianness is not well defined, what I mean is > > precisely that "what the guest CPU would write" is not well defined. > > For example on Power, the endianness of a given access will depend on: > > - For server chips, whether the CPU is in LE or BE mode > > - For embedded chips, the endianness flag in the TLB > > - For all chips, whether the plain or byte-reversed load/store > > instructions are used > > > > [There might even be variants with both a global mode flag and > > per-page flags in the TLB, I'm not sure] > > AFAIK, these things all take effect inside the guest CPU, and affect > the value the guest eventually writes to the CPU bus. > > > The model proposed here is that our testcases, on every access specify > > a specific final-result endianness. The alternative is that every > > write from qtest has to do: > > 1) Start with a value (host endian) > > 2) Conditional swap for host endian vs. device endian difference > > 3) Conditional swap for host endian vs. "guest endian" difference > > simply to compensate for.. > > 4) [Existing tswap() within writew/writel] > > Conditional swap for host endian vs. "guest endian difference > > > >> > Basically, the existing byteswaps, instead of removing the need for > >> > them in the testcase code, instead mean that target-conditional > >> > byteswaps will be *required* in nearly every testcase. It's a recipe > >> > for endianness-unsafe testcases. > >> > >> I prefer the swaps to be explicit in the test cases. If your > > > > The are explicit in the testcases, in the sense that on every read or > > write you say this is an LE access or a BE access. Potentially we > > could have a "guest native" access option as well, but I think that's > > useful > > > >> test case running on the real CPU would have had to do > >> "swap to LE and then write this word", so does the test > >> case in our test framework. If your test case just does > >> "write the word", then so does the test. > >> > >> Most devices IME do not require byteswaps by guest code, > > > > What!? So speaks the person working on a historically LE platform. > > Well, platforms where the devices and the CPU agree about > endianness. Which means historically LE platforms, since nearly all commodity, cross platform hardware is LE. > > In the kernel, if a driver isn't littered with cpu_to_leXX() it's > > almost certainly broken on a BE platform. > > > > The only devices I can think of which don't have a fixed endianness > > regardless of platform are: > > * legacy virtio: this was an insane design choice, which is why it > > went away in virtio-1.0 > > * graphics card framebuffers, which can usually be configured > > either way (or have both BE and LE views of the framebuffer) > > * a handful of simple devices, usually from a while back, where > > some idiot thought a BE version of a previously LE device (or vice > > versa) was a good idea > > > >> and I think these functions are confusing -- if you try > >> to use them to write tests for ARM devices, for instance, > >> the _le versions of these functions will introduce an > >> incorrect byteswap on a BE host, even though ARM CPUs and > >> devices are typically all LE. > > > > Huh? How so? > > This was the result of my mis-reading of the code: I thought > they were doing a byteswap and then calling the readl/readw/etc > functions, not the readmem etc functions. Ah, ok, yes. That would certainly have been wrong. > > If you're accessing an LE device (like PCI) you say "write_le" and > > there are no byteswaps at all on an LE host, and a single byteswap on > > a BE host, which is just what you want. For a PCI device "write_le" > > is the correct option regardless of both host and guest platform. > > > > That makes testcases straightforward to write. As a bonus, it forces > > the test author to think momentarily about what the right endianness > > is for each access, which is a good habit to be in (not thinking about > > this is why we've so often had endianness-broken drivers). > > > > Switching to an arbitrary "guest endianness" and back again makes the > > test more complicated, and accomplishes no extra coverage (because all > > the swaps under consideration are in the test code anyway). It gains > > nothing but confusion. > > I think I need to think about this a bit... > > thanks > -- PMM >
On Thu, Oct 06, 2016 at 04:11:42PM +0200, Greg Kurz wrote: > On Thu, 6 Oct 2016 12:03:34 +0100 > Peter Maydell <peter.maydell@linaro.org> wrote: > > > On 6 October 2016 at 04:38, David Gibson <david@gibson.dropbear.id.au> wrote: > > > On Wed, Oct 05, 2016 at 05:31:07AM -0700, Peter Maydell wrote: > > >> On 4 October 2016 at 16:43, David Gibson <david@gibson.dropbear.id.au> wrote: > > >> > On Tue, Oct 04, 2016 at 01:36:09PM +0100, Peter Maydell wrote: > > >> >> The difficulty with this patch is that it's hard to tell whether > > >> >> it's really required, or if this is just adding an extra layer > > >> >> of byteswapping that should really be done in some other location > > >> > > > >> > Actually, it's neither. It's not essential for anything, but it > > >> > *removes* an extra layer of byteswapping that really never should have > > >> > been done in the first place. > > >> > > >> The patch is very clearly adding calls to swapping functions. > > >> It looks like it's mostly convenience functions for not doing > > >> those swaps explicitly in the test cases. > > > > > > It's adding 1 swap on top of the memread/memwrite path - that's the > > > path which had no existing swaps (intended primarily for bag-o'-bytes > > > block access AFAICT). > > > > Yeah, I hadn't noticed it was using the memread/memwrite path. > > I disagree with using that code path for what ought to be > > register read/writes (among other things, it's not clear to me > > that it guarantees that a 4-byte access by the test code is > > always a 4-byte access on the device, etc). > > > > FWIW, Cedric had another proposal which apparently went unnoticed: > > <fc24ad74-da26-a713-9312-a2c2d07fb6a7@kaod.org> > > The idea is to add an optional endianness argument to the read*/write* > commands in the qtest protocol: > - libqtest then provides explicit _le and _be APIs > - no extra byteswap is performed on the test program side: qtest > actually handles that and does exactly 1 or 0 byteswap. > - it does not use memread/memwrite > - the current 'guest native' API where qtest tswaps is preserved That would also be fine by me.
On Thu, Oct 06, 2016 at 05:59:00PM +0200, Laurent Vivier wrote: > > > On 06/10/2016 17:41, Peter Maydell wrote: > > On 6 October 2016 at 16:36, Paolo Bonzini <pbonzini@redhat.com> wrote: > >> > >> > >> On 06/10/2016 16:11, Greg Kurz wrote: > >>> FWIW, Cedric had another proposal which apparently went unnoticed: > >>> > >>> <fc24ad74-da26-a713-9312-a2c2d07fb6a7@kaod.org> > >>> > >>> The idea is to add an optional endianness argument to the read*/write* > >>> commands in the qtest protocol: > >>> - libqtest then provides explicit _le and _be APIs > >>> - no extra byteswap is performed on the test program side: qtest > >>> actually handles that and does exactly 1 or 0 byteswap. > >>> - it does not use memread/memwrite > >>> - the current 'guest native' API where qtest tswaps is preserved > >>> > >> > >> No, this is a worse idea, because the right place to do the swap is in > >> the "program" (libqtest) not in the "CPU" (QEMU). > > > > Speaking of the right place to do things, perhaps we should > > reimplement qtest_big_endian() in libqtest.c to send a query > > to the QEMU-under-test to ask it what TARGET_BIG_ENDIAN says, > > rather than hardcoding a big list of architectures... > > Yes, it's a good idea. I disagree. TARGET_BIG_ENDIAN is simply not well defined - we should avoid using it at all.
On Thu, Oct 06, 2016 at 05:36:32PM +0200, Paolo Bonzini wrote: > > > On 06/10/2016 16:11, Greg Kurz wrote: > > FWIW, Cedric had another proposal which apparently went unnoticed: > > > > <fc24ad74-da26-a713-9312-a2c2d07fb6a7@kaod.org> > > > > The idea is to add an optional endianness argument to the read*/write* > > commands in the qtest protocol: > > - libqtest then provides explicit _le and _be APIs > > - no extra byteswap is performed on the test program side: qtest > > actually handles that and does exactly 1 or 0 byteswap. > > - it does not use memread/memwrite > > - the current 'guest native' API where qtest tswaps is preserved > > > > No, this is a worse idea, because the right place to do the swap is in > the "program" (libqtest) not in the "CPU" (QEMU). Hrm.. I guess that makes sense from an x86 perspective when load/stores always operate in LE. Not so much for something like Power where the CPU can perform both LE and BE load/stores trivially. You can select with CPU mode combined with which instruction form you use. e.g. the always-LE writel() on a BE Power kernel is a single byte-reversed store instruction[0]. there's no "swap" as such, and the swapped value never appears in a register. I'm not certain if gcc is smart enough to translate foo->bar = cpu_to_le32(val) into a byte-reversed store, but it might be. The value passed across the pipe to readw etc. is text, so it has no endianness, just as a value in a cpu register has no endianness. To me it makes perfect sense to tell the qtest "cpu" which endianness of load/store you want it to do with that. [0] Well, ok, there's a memory barrier too, so it's not quite 1 instruction.
On 07/10/2016 01:34, David Gibson wrote: >> On 06/10/2016 17:41, Peter Maydell wrote: >>> Speaking of the right place to do things, perhaps we should >>> reimplement qtest_big_endian() in libqtest.c to send a query >>> to the QEMU-under-test to ask it what TARGET_BIG_ENDIAN says, >>> rather than hardcoding a big list of architectures... >> >> Yes, it's a good idea. > > I disagree. > > TARGET_BIG_ENDIAN is simply not well defined - we should avoid using > it at all. I think it is at least better to send a query to ask the endianness of the guest instead of using some guessing on the test application side. For the moment, we can use TARGET_WORDS_BIGENDIAN, because this is simple and its purpose. TARGET_WORDS_BIGENDIAN defines the default endianness of the guest. But I don't think we can't use something else: in the case of POWER, for instance, the endianness can be changed, but the processor is always big endian by default at reset, and in the case of qtest, the processor is never started (or in some special cases, never exits from the firmware, which always runs in big endian mode), so the endianness will always stay to the one given by TARGET_WORDS_BIGENDIAN. Thanks, Laurent
On 7 October 2016 at 00:31, David Gibson <david@gibson.dropbear.id.au> wrote: > Honestly I think TAGET_WORDS_BIGENDIAN is just a hangover from when > nearly all CPUs worked exclusively in one endianness. It's never > terribly well defined. Feel free to try to get rid of it, but it's baked into the KVM ABI as a concept and also important for TCG efficiency. It works fine both for "always one endianness" systems and for "programmably variable" CPUs like ARM. thanks -- PMM
Index: qemu-aspeed.git/tests/libqtest.h =================================================================== --- qemu-aspeed.git.orig/tests/libqtest.h +++ qemu-aspeed.git/tests/libqtest.h @@ -887,4 +887,75 @@ void qmp_fd_send(int fd, const char *fmt QDict *qmp_fdv(int fd, const char *fmt, va_list ap); QDict *qmp_fd(int fd, const char *fmt, ...); +/* + * BE/LE read/write accessors + */ +uint16_t qtest_readw_be(QTestState *s, uint64_t addr); +uint32_t qtest_readl_be(QTestState *s, uint64_t addr); +uint64_t qtest_readq_be(QTestState *s, uint64_t addr); + +static inline uint16_t readw_be(uint64_t addr) +{ + return qtest_readw_be(global_qtest, addr); +} +static inline uint32_t readl_be(uint64_t addr) +{ + return qtest_readl_be(global_qtest, addr); +} +static inline uint64_t readq_be(uint64_t addr) +{ + return qtest_readq_be(global_qtest, addr); +} + +void qtest_writew_be(QTestState *s, uint64_t addr, uint16_t value); +void qtest_writel_be(QTestState *s, uint64_t addr, uint32_t value); +void qtest_writeq_be(QTestState *s, uint64_t addr, uint64_t value); + +static inline void writew_be(uint64_t addr, uint16_t value) +{ + qtest_writew_be(global_qtest, addr, value); +} +static inline void writel_be(uint64_t addr, uint32_t value) +{ + qtest_writel_be(global_qtest, addr, value); +} +static inline void writeq_be(uint64_t addr, uint64_t value) +{ + qtest_writeq_be(global_qtest, addr, value); +} + +uint16_t qtest_readw_le(QTestState *s, uint64_t addr); +uint32_t qtest_readl_le(QTestState *s, uint64_t addr); +uint64_t qtest_readq_le(QTestState *s, uint64_t addr); + +static inline uint16_t readw_le(uint64_t addr) +{ + return qtest_readw_le(global_qtest, addr); +} +static inline uint32_t readl_le(uint64_t addr) +{ + return qtest_readl_le(global_qtest, addr); +} +static inline uint64_t readq_le(uint64_t addr) +{ + return qtest_readq_le(global_qtest, addr); +} + +void qtest_writew_le(QTestState *s, uint64_t addr, uint16_t value); +void qtest_writel_le(QTestState *s, uint64_t addr, uint32_t value); +void qtest_writeq_le(QTestState *s, uint64_t addr, uint64_t value); + +static inline void writew_le(uint64_t addr, uint16_t value) +{ + qtest_writew_le(global_qtest, addr, value); +} +static inline void writel_le(uint64_t addr, uint32_t value) +{ + qtest_writel_le(global_qtest, addr, value); +} +static inline void writeq_le(uint64_t addr, uint64_t value) +{ + qtest_writeq_le(global_qtest, addr, value); +} + #endif Index: qemu-aspeed.git/tests/libqtest.c =================================================================== --- qemu-aspeed.git.orig/tests/libqtest.c +++ qemu-aspeed.git/tests/libqtest.c @@ -15,6 +15,7 @@ * */ #include "qemu/osdep.h" +#include "qemu/bswap.h" #include "libqtest.h" #include <sys/socket.h> @@ -688,6 +689,42 @@ void qtest_writeq(QTestState *s, uint64_ qtest_write(s, "writeq", addr, value); } +void qtest_writew_be(QTestState *s, uint64_t addr, uint16_t value) +{ + value = cpu_to_be16(value); + qtest_memread(s, addr, &value, sizeof(value)); +} + +void qtest_writel_be(QTestState *s, uint64_t addr, uint32_t value) +{ + value = cpu_to_be32(value); + qtest_memread(s, addr, &value, sizeof(value)); +} + +void qtest_writeq_be(QTestState *s, uint64_t addr, uint64_t value) +{ + value = cpu_to_be64(value); + qtest_memread(s, addr, &value, sizeof(value)); +} + +void qtest_writew_le(QTestState *s, uint64_t addr, uint16_t value) +{ + value = cpu_to_le16(value); + qtest_memread(s, addr, &value, sizeof(value)); +} + +void qtest_writel_le(QTestState *s, uint64_t addr, uint32_t value) +{ + value = cpu_to_le32(value); + qtest_memread(s, addr, &value, sizeof(value)); +} + +void qtest_writeq_le(QTestState *s, uint64_t addr, uint64_t value) +{ + value = cpu_to_le64(value); + qtest_memread(s, addr, &value, sizeof(value)); +} + static uint64_t qtest_read(QTestState *s, const char *cmd, uint64_t addr) { gchar **args; @@ -721,6 +758,60 @@ uint64_t qtest_readq(QTestState *s, uint return qtest_read(s, "readq", addr); } +uint16_t qtest_readw_be(QTestState *s, uint64_t addr) +{ + uint16_t value; + + qtest_memread(s, addr, &value, sizeof(value)); + + return be16_to_cpu(value); +} + +uint32_t qtest_readl_be(QTestState *s, uint64_t addr) +{ + uint32_t value; + + qtest_memread(s, addr, &value, sizeof(value)); + + return be32_to_cpu(value); +} + +uint64_t qtest_readq_be(QTestState *s, uint64_t addr) +{ + uint64_t value; + + qtest_memread(s, addr, &value, sizeof(value)); + + return be64_to_cpu(value); +} + +uint16_t qtest_readw_le(QTestState *s, uint64_t addr) +{ + uint16_t value; + + qtest_memread(s, addr, &value, sizeof(value)); + + return le16_to_cpu(value); +} + +uint32_t qtest_readl_le(QTestState *s, uint64_t addr) +{ + uint32_t value; + + qtest_memread(s, addr, &value, sizeof(value)); + + return le32_to_cpu(value); +} + +uint64_t qtest_readq_le(QTestState *s, uint64_t addr) +{ + uint64_t value; + + qtest_memread(s, addr, &value, sizeof(value)); + + return le64_to_cpu(value); +} + static int hex2nib(char ch) { if (ch >= '0' && ch <= '9') {
Some test scenarios require to access memory regions using a specific endianness, such as a device region, but the current qtest memory accessors are done in native endian, which means that the values are byteswapped in qtest if the endianness of the guest and the host are different. To maintain the endianness of a value, we need a new set of memory accessor. This can be done in two ways: - first, convert the value to the required endianness in libqtest and then use the memread/write routines so that qtest accesses the guest memory without doing any supplementary byteswapping - an alternative method would be to handle the byte swapping on the qtest side. For that, we would need to extend the read/write protocol with an ending word : "native|le|be" and modify the tswap calls accordingly under the qtest_process_command() routine. The result is the same and the first method is simpler. Signed-off-by: Cédric Le Goater <clg@kaod.org> --- tests/libqtest.c | 91 +++++++++++++++++++++++++++++++++++++++++++++++++++++++ tests/libqtest.h | 71 ++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 162 insertions(+)