Message ID | 20230904161703.3996734-1-peter.maydell@linaro.org |
---|---|
State | New |
Headers | show |
Series | docs/devel/loads-stores: Fix git grep regexes | expand |
On 4/9/23 18:17, Peter Maydell wrote: > The loads-and-stores documentation includes git grep regexes to find > occurrences of the various functions. Some of these regexes have > errors, typically failing to escape the '?', '(' and ')' when they > should be metacharacters (since these are POSIX basic REs). We also > weren't consistent about whether to have a ':' on the end of the > line introducing the list of regexes in each section. > > Fix the errors. > > The following shell rune will complain about any REs in the > file which don't have any matches in the codebase: > for re in $(sed -ne 's/ - ``\(\\<.*\)``/\1/p' docs/devel/loads-stores.rst); do git grep -q "$re" || echo "no matches for re $re"; done > > Signed-off-by: Peter Maydell <peter.maydell@linaro.org> > --- > docs/devel/loads-stores.rst | 40 ++++++++++++++++++------------------- > 1 file changed, 20 insertions(+), 20 deletions(-) Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
On Mon, Sep 04, 2023 at 05:17:03PM +0100, Peter Maydell wrote: > The loads-and-stores documentation includes git grep regexes to find > occurrences of the various functions. Some of these regexes have > errors, typically failing to escape the '?', '(' and ')' when they > should be metacharacters (since these are POSIX basic REs). We also > weren't consistent about whether to have a ':' on the end of the > line introducing the list of regexes in each section. > > Fix the errors. > > The following shell rune will complain about any REs in the > file which don't have any matches in the codebase: > for re in $(sed -ne 's/ - ``\(\\<.*\)``/\1/p' docs/devel/loads-stores.rst); do git grep -q "$re" || echo "no matches for re $re"; done > > Signed-off-by: Peter Maydell <peter.maydell@linaro.org> > --- > docs/devel/loads-stores.rst | 40 ++++++++++++++++++------------------- > 1 file changed, 20 insertions(+), 20 deletions(-) > > diff --git a/docs/devel/loads-stores.rst b/docs/devel/loads-stores.rst > index dab6dfa0acc..ec627aa9c06 100644 > --- a/docs/devel/loads-stores.rst > +++ b/docs/devel/loads-stores.rst > @@ -63,12 +63,12 @@ which stores ``val`` to ``ptr`` as an ``{endian}`` order value > of size ``sz`` bytes. > > > -Regexes for git grep > +Regexes for git grep: > - ``\<ld[us]\?[bwlq]\(_[hbl]e\)\?_p\>`` This claims that ldul_be_p() is a valid function name (which I would expect to take a pointer to a 32-bit integer and produce an unsigned result suitable for assigning into a 64-bit value). But it does not exist, and the fact that ldl_be_p() returns 'int' means I had to add a cast to avoid unintended sign-extension: https://lists.gnu.org/archive/html/qemu-devel/2023-08/msg05234.html cast added in relation to v5 patch at https://lists.gnu.org/archive/html/qemu-devel/2023-08/msg04923.html > - ``\<st[bwlq]\(_[hbl]e\)\?_p\>`` > - ``\<st24\(_[hbl]e\)\?_p\>`` > - - ``\<ldn_\([hbl]e\)?_p\>`` > - - ``\<stn_\([hbl]e\)?_p\>`` > + - ``\<ldn_\([hbl]e\)\?_p\>`` > + - ``\<stn_\([hbl]e\)\?_p\>`` So as long as we are touching the docs, is it worth considering the larger task of auditing whether it is appropriate to have all of the ld*_ functions return unsigned values, and/or implement ldu/lds variants that guarantee zero or sign extension for widening 32-bit values when assigning to 64-bit destinations? > > ``cpu_{ld,st}*_mmu`` > ~~~~~~~~~~~~~~~~~~~~ > @@ -121,8 +121,8 @@ store: ``cpu_st{size}{end}_mmu(env, ptr, val, oi, retaddr)`` > - ``_le`` : little endian > > Regexes for git grep: > - - ``\<cpu_ld[bwlq](_[bl]e)\?_mmu\>`` As a counterpoint, the cpu_ldl_* functions already return uint32_t.
On Tue, 5 Sept 2023 at 15:31, Eric Blake <eblake@redhat.com> wrote: > > On Mon, Sep 04, 2023 at 05:17:03PM +0100, Peter Maydell wrote: > > The loads-and-stores documentation includes git grep regexes to find > > occurrences of the various functions. Some of these regexes have > > errors, typically failing to escape the '?', '(' and ')' when they > > should be metacharacters (since these are POSIX basic REs). We also > > weren't consistent about whether to have a ':' on the end of the > > line introducing the list of regexes in each section. > > > > Fix the errors. > > > > The following shell rune will complain about any REs in the > > file which don't have any matches in the codebase: > > for re in $(sed -ne 's/ - ``\(\\<.*\)``/\1/p' docs/devel/loads-stores.rst); do git grep -q "$re" || echo "no matches for re $re"; done > > > > Signed-off-by: Peter Maydell <peter.maydell@linaro.org> > > --- > > docs/devel/loads-stores.rst | 40 ++++++++++++++++++------------------- > > 1 file changed, 20 insertions(+), 20 deletions(-) > > > > diff --git a/docs/devel/loads-stores.rst b/docs/devel/loads-stores.rst > > index dab6dfa0acc..ec627aa9c06 100644 > > --- a/docs/devel/loads-stores.rst > > +++ b/docs/devel/loads-stores.rst > > @@ -63,12 +63,12 @@ which stores ``val`` to ``ptr`` as an ``{endian}`` order value > > of size ``sz`` bytes. > > > > > > -Regexes for git grep > > +Regexes for git grep: > > - ``\<ld[us]\?[bwlq]\(_[hbl]e\)\?_p\>`` > > This claims that ldul_be_p() is a valid function name No, it's not claiming that. It's just claiming that this regex will catch all the function names defined in this section, not that it will avoid matching on some non-existent function names. The documentation section above tells you what is actually valid, and that says that "sign" is empty for 32 or 64 bit accesses. > (which I would > expect to take a pointer to a 32-bit integer and produce an unsigned > result suitable for assigning into a 64-bit value). But it does not > exist, and the fact that ldl_be_p() returns 'int' means I had to add a > cast to avoid unintended sign-extension: > > https://lists.gnu.org/archive/html/qemu-devel/2023-08/msg05234.html > > cast added in relation to v5 patch at > https://lists.gnu.org/archive/html/qemu-devel/2023-08/msg04923.html > > > - ``\<st[bwlq]\(_[hbl]e\)\?_p\>`` > > - ``\<st24\(_[hbl]e\)\?_p\>`` > > - - ``\<ldn_\([hbl]e\)?_p\>`` > > - - ``\<stn_\([hbl]e\)?_p\>`` > > + - ``\<ldn_\([hbl]e\)\?_p\>`` > > + - ``\<stn_\([hbl]e\)\?_p\>`` > > So as long as we are touching the docs, is it worth considering the > larger task of auditing whether it is appropriate to have all of the > ld*_ functions return unsigned values, and/or implement ldu/lds > variants that guarantee zero or sign extension for widening 32-bit > values when assigning to 64-bit destinations? No, I think it clearly is not. All I want to do here is fix the busted regular expressions. If you would like to try to tidy up some of the semantics of the load/store APIs you're welcome to have a go at that. The major obstacle is the obvious one that there are an absolute ton of existing uses for all of these API families. thanks -- PMM
diff --git a/docs/devel/loads-stores.rst b/docs/devel/loads-stores.rst index dab6dfa0acc..ec627aa9c06 100644 --- a/docs/devel/loads-stores.rst +++ b/docs/devel/loads-stores.rst @@ -63,12 +63,12 @@ which stores ``val`` to ``ptr`` as an ``{endian}`` order value of size ``sz`` bytes. -Regexes for git grep +Regexes for git grep: - ``\<ld[us]\?[bwlq]\(_[hbl]e\)\?_p\>`` - ``\<st[bwlq]\(_[hbl]e\)\?_p\>`` - ``\<st24\(_[hbl]e\)\?_p\>`` - - ``\<ldn_\([hbl]e\)?_p\>`` - - ``\<stn_\([hbl]e\)?_p\>`` + - ``\<ldn_\([hbl]e\)\?_p\>`` + - ``\<stn_\([hbl]e\)\?_p\>`` ``cpu_{ld,st}*_mmu`` ~~~~~~~~~~~~~~~~~~~~ @@ -121,8 +121,8 @@ store: ``cpu_st{size}{end}_mmu(env, ptr, val, oi, retaddr)`` - ``_le`` : little endian Regexes for git grep: - - ``\<cpu_ld[bwlq](_[bl]e)\?_mmu\>`` - - ``\<cpu_st[bwlq](_[bl]e)\?_mmu\>`` + - ``\<cpu_ld[bwlq]\(_[bl]e\)\?_mmu\>`` + - ``\<cpu_st[bwlq]\(_[bl]e\)\?_mmu\>`` ``cpu_{ld,st}*_mmuidx_ra`` @@ -155,8 +155,8 @@ store: ``cpu_st{size}{end}_mmuidx_ra(env, ptr, val, mmuidx, retaddr)`` - ``_le`` : little endian Regexes for git grep: - - ``\<cpu_ld[us]\?[bwlq](_[bl]e)\?_mmuidx_ra\>`` - - ``\<cpu_st[bwlq](_[bl]e)\?_mmuidx_ra\>`` + - ``\<cpu_ld[us]\?[bwlq]\(_[bl]e\)\?_mmuidx_ra\>`` + - ``\<cpu_st[bwlq]\(_[bl]e\)\?_mmuidx_ra\>`` ``cpu_{ld,st}*_data_ra`` ~~~~~~~~~~~~~~~~~~~~~~~~ @@ -193,8 +193,8 @@ store: ``cpu_st{size}{end}_data_ra(env, ptr, val, ra)`` - ``_le`` : little endian Regexes for git grep: - - ``\<cpu_ld[us]\?[bwlq](_[bl]e)\?_data_ra\>`` - - ``\<cpu_st[bwlq](_[bl]e)\?_data_ra\>`` + - ``\<cpu_ld[us]\?[bwlq]\(_[bl]e\)\?_data_ra\>`` + - ``\<cpu_st[bwlq]\(_[bl]e\)\?_data_ra\>`` ``cpu_{ld,st}*_data`` ~~~~~~~~~~~~~~~~~~~~~ @@ -231,9 +231,9 @@ store: ``cpu_st{size}{end}_data(env, ptr, val)`` - ``_be`` : big endian - ``_le`` : little endian -Regexes for git grep - - ``\<cpu_ld[us]\?[bwlq](_[bl]e)\?_data\>`` - - ``\<cpu_st[bwlq](_[bl]e)\?_data\+\>`` +Regexes for git grep: + - ``\<cpu_ld[us]\?[bwlq]\(_[bl]e\)\?_data\>`` + - ``\<cpu_st[bwlq]\(_[bl]e\)\?_data\+\>`` ``cpu_ld*_code`` ~~~~~~~~~~~~~~~~ @@ -296,7 +296,7 @@ swap: ``translator_ld{sign}{size}_swap(env, ptr, swap)`` - ``l`` : 32 bits - ``q`` : 64 bits -Regexes for git grep +Regexes for git grep: - ``\<translator_ld[us]\?[bwlq]\(_swap\)\?\>`` ``helper_{ld,st}*_mmu`` @@ -325,7 +325,7 @@ store: ``helper_{size}_mmu(env, addr, val, opindex, retaddr)`` - ``l`` : 32 bits - ``q`` : 64 bits -Regexes for git grep +Regexes for git grep: - ``\<helper_ld[us]\?[bwlq]_mmu\>`` - ``\<helper_st[bwlq]_mmu\>`` @@ -382,7 +382,7 @@ succeeded using a MemTxResult return code. The ``_{endian}`` suffix is omitted for byte accesses. -Regexes for git grep +Regexes for git grep: - ``\<address_space_\(read\|write\|rw\)\>`` - ``\<address_space_ldu\?[bwql]\(_[lb]e\)\?\>`` - ``\<address_space_st[bwql]\(_[lb]e\)\?\>`` @@ -400,7 +400,7 @@ Note that portions of the write which attempt to write data to a device will be silently ignored -- only real RAM and ROM will be written to. -Regexes for git grep +Regexes for git grep: - ``address_space_write_rom`` ``{ld,st}*_phys`` @@ -438,7 +438,7 @@ device doing the access has no way to report such an error. The ``_{endian}_`` infix is omitted for byte accesses. -Regexes for git grep +Regexes for git grep: - ``\<ldu\?[bwlq]\(_[bl]e\)\?_phys\>`` - ``\<st[bwlq]\(_[bl]e\)\?_phys\>`` @@ -462,7 +462,7 @@ For new code they are better avoided: ``cpu_physical_memory_rw`` -Regexes for git grep +Regexes for git grep: - ``\<cpu_physical_memory_\(read\|write\|rw\)\>`` ``cpu_memory_rw_debug`` @@ -497,7 +497,7 @@ make sure our existing code is doing things correctly. ``dma_memory_rw`` -Regexes for git grep +Regexes for git grep: - ``\<dma_memory_\(read\|write\|rw\)\>`` - ``\<ldu\?[bwlq]\(_[bl]e\)\?_dma\>`` - ``\<st[bwlq]\(_[bl]e\)\?_dma\>`` @@ -538,7 +538,7 @@ correct address space for that device. The ``_{endian}_`` infix is omitted for byte accesses. -Regexes for git grep +Regexes for git grep: - ``\<pci_dma_\(read\|write\|rw\)\>`` - ``\<ldu\?[bwlq]\(_[bl]e\)\?_pci_dma\>`` - ``\<st[bwlq]\(_[bl]e\)\?_pci_dma\>``
The loads-and-stores documentation includes git grep regexes to find occurrences of the various functions. Some of these regexes have errors, typically failing to escape the '?', '(' and ')' when they should be metacharacters (since these are POSIX basic REs). We also weren't consistent about whether to have a ':' on the end of the line introducing the list of regexes in each section. Fix the errors. The following shell rune will complain about any REs in the file which don't have any matches in the codebase: for re in $(sed -ne 's/ - ``\(\\<.*\)``/\1/p' docs/devel/loads-stores.rst); do git grep -q "$re" || echo "no matches for re $re"; done Signed-off-by: Peter Maydell <peter.maydell@linaro.org> --- docs/devel/loads-stores.rst | 40 ++++++++++++++++++------------------- 1 file changed, 20 insertions(+), 20 deletions(-)