diff mbox series

s390x: Fix segfault in wcsncmp [BZ #31934]

Message ID 20240711092853.3810283-1-stli@linux.ibm.com
State New
Headers show
Series s390x: Fix segfault in wcsncmp [BZ #31934] | expand

Commit Message

Stefan Liebler July 11, 2024, 9:28 a.m. UTC
The z13/vector-optimized wcsncmp implementation segfaults if n=1
and there is only one character (equal on both strings) before
the page end.  Then it loads and compares one character and misses
to check n again.  The following load fails.

This patch removes the extra load and compare of the first character
and just start with the loop which uses vector-load-to-block-boundary.
This code-path also checks n.

With this patch both tests are passing:
- the simplified one mentioned in the bugzilla 31934
- the full one in Florian Weimer's patch:
"manual: Document a GNU extension for strncmp/wcsncmp"
(https://patchwork.sourceware.org/project/glibc/patch/874j9eml6y.fsf@oldenburg.str.redhat.com/):
On s390x-linux-gnu (z16), the new wcsncmp test fails due to bug 31934.
---
 sysdeps/s390/wcsncmp-vx.S | 10 +---------
 1 file changed, 1 insertion(+), 9 deletions(-)

Comments

Carlos O'Donell July 11, 2024, 12:57 p.m. UTC | #1
On 7/11/24 5:28 AM, Stefan Liebler wrote:
> The z13/vector-optimized wcsncmp implementation segfaults if n=1
> and there is only one character (equal on both strings) before
> the page end.  Then it loads and compares one character and misses
> to check n again.  The following load fails.

LGTM. Thank you for fixing this!

Please feel free to push since we're currently in bug fixing phase and this has
no ABI impact.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>

> This patch removes the extra load and compare of the first character
> and just start with the loop which uses vector-load-to-block-boundary.
> This code-path also checks n.
> 
> With this patch both tests are passing:
> - the simplified one mentioned in the bugzilla 31934
> - the full one in Florian Weimer's patch:
> "manual: Document a GNU extension for strncmp/wcsncmp"
> (https://patchwork.sourceware.org/project/glibc/patch/874j9eml6y.fsf@oldenburg.str.redhat.com/):
> On s390x-linux-gnu (z16), the new wcsncmp test fails due to bug 31934.
> ---
>  sysdeps/s390/wcsncmp-vx.S | 10 +---------
>  1 file changed, 1 insertion(+), 9 deletions(-)
> 
> diff --git a/sysdeps/s390/wcsncmp-vx.S b/sysdeps/s390/wcsncmp-vx.S
> index bf6dfa6bc2..8b081567a2 100644
> --- a/sysdeps/s390/wcsncmp-vx.S
> +++ b/sysdeps/s390/wcsncmp-vx.S
> @@ -59,14 +59,7 @@ ENTRY(WCSNCMP_Z13)
>  	sllg	%r4,%r4,2	/* Convert character-count to byte-count.  */
>  	locgrne	%r4,%r1		/* Use max byte-count, if bit 0/1 was one.  */
>  
> -	/* Check first character without vector load.  */
> -	lghi	%r5,4		/* current_len = 4 bytes.  */
> -	/* Check s1/2[0].  */
> -	lt	%r0,0(%r2)
> -	l	%r1,0(%r3)
> -	je	.Lend_cmp_one_char
> -	crjne	%r0,%r1,.Lend_cmp_one_char

OK. Removes the unrolled processing of just 1 character.

> -
> +	lghi	%r5,0		/* current_len = 0 bytes.  */

OK. Current progress is set to 0 at the start (as expected).

>  .Lloop:
>  	vlbb	%v17,0(%r5,%r3),6 /* Load s2 to block boundary.  */
>  	vlbb	%v16,0(%r5,%r2),6 /* Load s1 to block boundary.  */

... and we use vlbb to avoid crossing the block boundary (page boundary).

> @@ -167,7 +160,6 @@ ENTRY(WCSNCMP_Z13)
>  	srl	%r4,2		/* And convert it to character-index.  */
>  	vlgvf	%r0,%v16,0(%r4)	/* Load character-values.  */
>  	vlgvf	%r1,%v17,0(%r4)
> -.Lend_cmp_one_char:

OK. Label no longer needed.

>  	cr	%r0,%r1
>  	je	.Lend_equal
>  	lghi	%r2,1
Stefan Liebler July 11, 2024, 1:17 p.m. UTC | #2
On 11.07.24 14:57, Carlos O'Donell wrote:
> On 7/11/24 5:28 AM, Stefan Liebler wrote:
>> The z13/vector-optimized wcsncmp implementation segfaults if n=1
>> and there is only one character (equal on both strings) before
>> the page end.  Then it loads and compares one character and misses
>> to check n again.  The following load fails.
> 
> LGTM. Thank you for fixing this!
> 
> Please feel free to push since we're currently in bug fixing phase and this has
> no ABI impact.
> 
> Reviewed-by: Carlos O'Donell <carlos@redhat.com>

Thanks for reviewing.
I've just committed it.

Note that the wcsncmp implementation was introduced in glibc 2.23:
https://sourceware.org/git/?p=glibc.git;a=commit;h=cee82e70ccb7b2f054cd781b0a603ae244523e72
'S390: Optimize strncmp and wcsncmp.'

Can you please advice which release-branches are maintained and to which
one I should backport them?
To all from glibc 2.39 down to 2.23?

Thanks,
Stefan
> 
>> This patch removes the extra load and compare of the first character
>> and just start with the loop which uses vector-load-to-block-boundary.
>> This code-path also checks n.
>>
>> With this patch both tests are passing:
>> - the simplified one mentioned in the bugzilla 31934
>> - the full one in Florian Weimer's patch:
>> "manual: Document a GNU extension for strncmp/wcsncmp"
>> (https://patchwork.sourceware.org/project/glibc/patch/874j9eml6y.fsf@oldenburg.str.redhat.com/):
>> On s390x-linux-gnu (z16), the new wcsncmp test fails due to bug 31934.
>> ---
>>  sysdeps/s390/wcsncmp-vx.S | 10 +---------
>>  1 file changed, 1 insertion(+), 9 deletions(-)
>>
>> diff --git a/sysdeps/s390/wcsncmp-vx.S b/sysdeps/s390/wcsncmp-vx.S
>> index bf6dfa6bc2..8b081567a2 100644
>> --- a/sysdeps/s390/wcsncmp-vx.S
>> +++ b/sysdeps/s390/wcsncmp-vx.S
>> @@ -59,14 +59,7 @@ ENTRY(WCSNCMP_Z13)
>>  	sllg	%r4,%r4,2	/* Convert character-count to byte-count.  */
>>  	locgrne	%r4,%r1		/* Use max byte-count, if bit 0/1 was one.  */
>>  
>> -	/* Check first character without vector load.  */
>> -	lghi	%r5,4		/* current_len = 4 bytes.  */
>> -	/* Check s1/2[0].  */
>> -	lt	%r0,0(%r2)
>> -	l	%r1,0(%r3)
>> -	je	.Lend_cmp_one_char
>> -	crjne	%r0,%r1,.Lend_cmp_one_char
> 
> OK. Removes the unrolled processing of just 1 character.
> 
>> -
>> +	lghi	%r5,0		/* current_len = 0 bytes.  */
> 
> OK. Current progress is set to 0 at the start (as expected).
> 
>>  .Lloop:
>>  	vlbb	%v17,0(%r5,%r3),6 /* Load s2 to block boundary.  */
>>  	vlbb	%v16,0(%r5,%r2),6 /* Load s1 to block boundary.  */
> 
> ... and we use vlbb to avoid crossing the block boundary (page boundary).
> 
>> @@ -167,7 +160,6 @@ ENTRY(WCSNCMP_Z13)
>>  	srl	%r4,2		/* And convert it to character-index.  */
>>  	vlgvf	%r0,%v16,0(%r4)	/* Load character-values.  */
>>  	vlgvf	%r1,%v17,0(%r4)
>> -.Lend_cmp_one_char:
> 
> OK. Label no longer needed.
> 
>>  	cr	%r0,%r1
>>  	je	.Lend_equal
>>  	lghi	%r2,1
>
Andreas K. Huettel July 11, 2024, 1:49 p.m. UTC | #3
Am Donnerstag, 11. Juli 2024, 15:17:11 CEST schrieb Stefan Liebler:
> On 11.07.24 14:57, Carlos O'Donell wrote:
> > On 7/11/24 5:28 AM, Stefan Liebler wrote:
> >> The z13/vector-optimized wcsncmp implementation segfaults if n=1
> >> and there is only one character (equal on both strings) before
> >> the page end.  Then it loads and compares one character and misses
> >> to check n again.  The following load fails.
> > 
> > LGTM. Thank you for fixing this!
> > 
> > Please feel free to push since we're currently in bug fixing phase and this has
> > no ABI impact.
> > 
> > Reviewed-by: Carlos O'Donell <carlos@redhat.com>
> 
> Thanks for reviewing.
> I've just committed it.
> 
> Note that the wcsncmp implementation was introduced in glibc 2.23:
> https://sourceware.org/git/?p=glibc.git;a=commit;h=cee82e70ccb7b2f054cd781b0a603ae244523e72
> 'S390: Optimize strncmp and wcsncmp.'
> 
> Can you please advice which release-branches are maintained and to which
> one I should backport them?
> To all from glibc 2.39 down to 2.23?

Let's see what Carlos says, but we also have the table on the wiki for this:

https://sourceware.org/glibc/wiki/Release

I took the liberty of splitting it up into "active" and "EOL'ed" (just now).
If anything has been placed wrong there, please feel free to correct.


> 
> Thanks,
> Stefan
> >
Carlos O'Donell July 11, 2024, 2:53 p.m. UTC | #4
On 7/11/24 9:17 AM, Stefan Liebler wrote:
> On 11.07.24 14:57, Carlos O'Donell wrote:
>> On 7/11/24 5:28 AM, Stefan Liebler wrote:
>>> The z13/vector-optimized wcsncmp implementation segfaults if n=1
>>> and there is only one character (equal on both strings) before
>>> the page end.  Then it loads and compares one character and misses
>>> to check n again.  The following load fails.
>>
>> LGTM. Thank you for fixing this!
>>
>> Please feel free to push since we're currently in bug fixing phase and this has
>> no ABI impact.
>>
>> Reviewed-by: Carlos O'Donell <carlos@redhat.com>
> 
> Thanks for reviewing.
> I've just committed it.
> 
> Note that the wcsncmp implementation was introduced in glibc 2.23:
> https://sourceware.org/git/?p=glibc.git;a=commit;h=cee82e70ccb7b2f054cd781b0a603ae244523e72
> 'S390: Optimize strncmp and wcsncmp.'
> 
> Can you please advice which release-branches are maintained and to which
> one I should backport them?
> To all from glibc 2.39 down to 2.23?

The active release branches are here:
https://sourceware.org/glibc/wiki/Release/#Current_branches

You need only backport from 2.39 down to 2.32.

We are actively trying to limit the open branches to those matching
downstream distribution efforts.

>>
>>> This patch removes the extra load and compare of the first character
>>> and just start with the loop which uses vector-load-to-block-boundary.
>>> This code-path also checks n.
>>>
>>> With this patch both tests are passing:
>>> - the simplified one mentioned in the bugzilla 31934
>>> - the full one in Florian Weimer's patch:
>>> "manual: Document a GNU extension for strncmp/wcsncmp"
>>> (https://patchwork.sourceware.org/project/glibc/patch/874j9eml6y.fsf@oldenburg.str.redhat.com/):
>>> On s390x-linux-gnu (z16), the new wcsncmp test fails due to bug 31934.
>>> ---
>>>  sysdeps/s390/wcsncmp-vx.S | 10 +---------
>>>  1 file changed, 1 insertion(+), 9 deletions(-)
>>>
>>> diff --git a/sysdeps/s390/wcsncmp-vx.S b/sysdeps/s390/wcsncmp-vx.S
>>> index bf6dfa6bc2..8b081567a2 100644
>>> --- a/sysdeps/s390/wcsncmp-vx.S
>>> +++ b/sysdeps/s390/wcsncmp-vx.S
>>> @@ -59,14 +59,7 @@ ENTRY(WCSNCMP_Z13)
>>>  	sllg	%r4,%r4,2	/* Convert character-count to byte-count.  */
>>>  	locgrne	%r4,%r1		/* Use max byte-count, if bit 0/1 was one.  */
>>>  
>>> -	/* Check first character without vector load.  */
>>> -	lghi	%r5,4		/* current_len = 4 bytes.  */
>>> -	/* Check s1/2[0].  */
>>> -	lt	%r0,0(%r2)
>>> -	l	%r1,0(%r3)
>>> -	je	.Lend_cmp_one_char
>>> -	crjne	%r0,%r1,.Lend_cmp_one_char
>>
>> OK. Removes the unrolled processing of just 1 character.
>>
>>> -
>>> +	lghi	%r5,0		/* current_len = 0 bytes.  */
>>
>> OK. Current progress is set to 0 at the start (as expected).
>>
>>>  .Lloop:
>>>  	vlbb	%v17,0(%r5,%r3),6 /* Load s2 to block boundary.  */
>>>  	vlbb	%v16,0(%r5,%r2),6 /* Load s1 to block boundary.  */
>>
>> ... and we use vlbb to avoid crossing the block boundary (page boundary).
>>
>>> @@ -167,7 +160,6 @@ ENTRY(WCSNCMP_Z13)
>>>  	srl	%r4,2		/* And convert it to character-index.  */
>>>  	vlgvf	%r0,%v16,0(%r4)	/* Load character-values.  */
>>>  	vlgvf	%r1,%v17,0(%r4)
>>> -.Lend_cmp_one_char:
>>
>> OK. Label no longer needed.
>>
>>>  	cr	%r0,%r1
>>>  	je	.Lend_equal
>>>  	lghi	%r2,1
>>
>
Stefan Liebler July 11, 2024, 3:11 p.m. UTC | #5
On 11.07.24 16:53, Carlos O'Donell wrote:
> On 7/11/24 9:17 AM, Stefan Liebler wrote:
>> On 11.07.24 14:57, Carlos O'Donell wrote:
>>> On 7/11/24 5:28 AM, Stefan Liebler wrote:
>>>> The z13/vector-optimized wcsncmp implementation segfaults if n=1
>>>> and there is only one character (equal on both strings) before
>>>> the page end.  Then it loads and compares one character and misses
>>>> to check n again.  The following load fails.
>>>
>>> LGTM. Thank you for fixing this!
>>>
>>> Please feel free to push since we're currently in bug fixing phase and this has
>>> no ABI impact.
>>>
>>> Reviewed-by: Carlos O'Donell <carlos@redhat.com>
>>
>> Thanks for reviewing.
>> I've just committed it.
>>
>> Note that the wcsncmp implementation was introduced in glibc 2.23:
>> https://sourceware.org/git/?p=glibc.git;a=commit;h=cee82e70ccb7b2f054cd781b0a603ae244523e72
>> 'S390: Optimize strncmp and wcsncmp.'
>>
>> Can you please advice which release-branches are maintained and to which
>> one I should backport them?
>> To all from glibc 2.39 down to 2.23?
> 
> The active release branches are here:
> https://sourceware.org/glibc/wiki/Release/#Current_branches
> 
> You need only backport from 2.39 down to 2.32.
> 
> We are actively trying to limit the open branches to those matching
> downstream distribution efforts.
Thanks Andreas, Carlos,
then I will cherry-pick it to 2.39 down to 2.32 release-branches.
> 
>>>
>>>> This patch removes the extra load and compare of the first character
>>>> and just start with the loop which uses vector-load-to-block-boundary.
>>>> This code-path also checks n.
>>>>
>>>> With this patch both tests are passing:
>>>> - the simplified one mentioned in the bugzilla 31934
>>>> - the full one in Florian Weimer's patch:
>>>> "manual: Document a GNU extension for strncmp/wcsncmp"
>>>> (https://patchwork.sourceware.org/project/glibc/patch/874j9eml6y.fsf@oldenburg.str.redhat.com/):
>>>> On s390x-linux-gnu (z16), the new wcsncmp test fails due to bug 31934.
>>>> ---
>>>>  sysdeps/s390/wcsncmp-vx.S | 10 +---------
>>>>  1 file changed, 1 insertion(+), 9 deletions(-)
>>>>
>>>> diff --git a/sysdeps/s390/wcsncmp-vx.S b/sysdeps/s390/wcsncmp-vx.S
>>>> index bf6dfa6bc2..8b081567a2 100644
>>>> --- a/sysdeps/s390/wcsncmp-vx.S
>>>> +++ b/sysdeps/s390/wcsncmp-vx.S
>>>> @@ -59,14 +59,7 @@ ENTRY(WCSNCMP_Z13)
>>>>  	sllg	%r4,%r4,2	/* Convert character-count to byte-count.  */
>>>>  	locgrne	%r4,%r1		/* Use max byte-count, if bit 0/1 was one.  */
>>>>  
>>>> -	/* Check first character without vector load.  */
>>>> -	lghi	%r5,4		/* current_len = 4 bytes.  */
>>>> -	/* Check s1/2[0].  */
>>>> -	lt	%r0,0(%r2)
>>>> -	l	%r1,0(%r3)
>>>> -	je	.Lend_cmp_one_char
>>>> -	crjne	%r0,%r1,.Lend_cmp_one_char
>>>
>>> OK. Removes the unrolled processing of just 1 character.
>>>
>>>> -
>>>> +	lghi	%r5,0		/* current_len = 0 bytes.  */
>>>
>>> OK. Current progress is set to 0 at the start (as expected).
>>>
>>>>  .Lloop:
>>>>  	vlbb	%v17,0(%r5,%r3),6 /* Load s2 to block boundary.  */
>>>>  	vlbb	%v16,0(%r5,%r2),6 /* Load s1 to block boundary.  */
>>>
>>> ... and we use vlbb to avoid crossing the block boundary (page boundary).
>>>
>>>> @@ -167,7 +160,6 @@ ENTRY(WCSNCMP_Z13)
>>>>  	srl	%r4,2		/* And convert it to character-index.  */
>>>>  	vlgvf	%r0,%v16,0(%r4)	/* Load character-values.  */
>>>>  	vlgvf	%r1,%v17,0(%r4)
>>>> -.Lend_cmp_one_char:
>>>
>>> OK. Label no longer needed.
>>>
>>>>  	cr	%r0,%r1
>>>>  	je	.Lend_equal
>>>>  	lghi	%r2,1
>>>
>>
>
diff mbox series

Patch

diff --git a/sysdeps/s390/wcsncmp-vx.S b/sysdeps/s390/wcsncmp-vx.S
index bf6dfa6bc2..8b081567a2 100644
--- a/sysdeps/s390/wcsncmp-vx.S
+++ b/sysdeps/s390/wcsncmp-vx.S
@@ -59,14 +59,7 @@  ENTRY(WCSNCMP_Z13)
 	sllg	%r4,%r4,2	/* Convert character-count to byte-count.  */
 	locgrne	%r4,%r1		/* Use max byte-count, if bit 0/1 was one.  */
 
-	/* Check first character without vector load.  */
-	lghi	%r5,4		/* current_len = 4 bytes.  */
-	/* Check s1/2[0].  */
-	lt	%r0,0(%r2)
-	l	%r1,0(%r3)
-	je	.Lend_cmp_one_char
-	crjne	%r0,%r1,.Lend_cmp_one_char
-
+	lghi	%r5,0		/* current_len = 0 bytes.  */
 .Lloop:
 	vlbb	%v17,0(%r5,%r3),6 /* Load s2 to block boundary.  */
 	vlbb	%v16,0(%r5,%r2),6 /* Load s1 to block boundary.  */
@@ -167,7 +160,6 @@  ENTRY(WCSNCMP_Z13)
 	srl	%r4,2		/* And convert it to character-index.  */
 	vlgvf	%r0,%v16,0(%r4)	/* Load character-values.  */
 	vlgvf	%r1,%v17,0(%r4)
-.Lend_cmp_one_char:
 	cr	%r0,%r1
 	je	.Lend_equal
 	lghi	%r2,1