diff mbox series

varasm: Shorten assembly of strings with larger zero regions

Message ID ZpfNPBSLF+QDqr85@tucnak
State New
Headers show
Series varasm: Shorten assembly of strings with larger zero regions | expand

Commit Message

Jakub Jelinek July 17, 2024, 1:55 p.m. UTC
Hi!

When not using .base64 directive, we emit for long sequences of zeros
	.string	"foobarbaz"
	.string ""
	.string ""
	.string ""
	.string ""
	.string ""
	.string ""
	.string ""
	.string ""
	.string ""
	.string ""
	.string ""
	.string ""
The following patch changes that to
	.string	"foobarbaz"
	.zero	12
It keeps emitting .string "" if there is just one zero or two zeros where
the first one is preceded by non-zeros, so we can have
	.string "foobarbaz"
	.string ""
or
	.base64 "VG8gYmUgb3Igbm90IHRvIGJlLCB0aGF0IGlzIHRoZSBxdWVzdGlvbg=="
	.string ""
but not 2 .string "" in a row.

On a testcase I have with around 310440 0-255 unsigned char character
constants mostly derived from cc1plus start but with too long sequences of
0s which broke transformation to STRING_CST adjusted to have at most 126
consecutive 0s, I see:
1504498 bytes long assembly without this patch on i686-linux (without
.base64 support in binutils)
1155071 bytes long assembly with this patch on i686-linux (without .base64
support in binutils)
431390 bytes long assembly without this patch on x86_64-linux (with
.base64 support in binutils)
427593 bytes long assembly with this patch on x86_64-linux (with .base64
support in binutils)
All 4 assemble to identical *.o file when using x86_64-linux .base64
supporting gas, and the former 2 when using older x86_64-linux gas assemble
to identical content as well.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2024-07-17  Jakub Jelinek  <jakub@redhat.com>

	* varasm.cc (default_elf_asm_output_ascii): Use ASM_OUTPUT_SKIP instead
	of 2 or more default_elf_asm_output_limited_string (f, "") calls and
	adjust base64 heuristics correspondingly.


	Jakub

Comments

Richard Biener July 17, 2024, 2:15 p.m. UTC | #1
> Am 17.07.2024 um 15:55 schrieb Jakub Jelinek <jakub@redhat.com>:
> 
> Hi!
> 
> When not using .base64 directive, we emit for long sequences of zeros
>    .string    "foobarbaz"
>    .string ""
>    .string ""
>    .string ""
>    .string ""
>    .string ""
>    .string ""
>    .string ""
>    .string ""
>    .string ""
>    .string ""
>    .string ""
>    .string ""
> The following patch changes that to
>    .string    "foobarbaz"
>    .zero    12
> It keeps emitting .string "" if there is just one zero or two zeros where
> the first one is preceded by non-zeros, so we can have
>    .string "foobarbaz"
>    .string ""
> or
>    .base64 "VG8gYmUgb3Igbm90IHRvIGJlLCB0aGF0IGlzIHRoZSBxdWVzdGlvbg=="
>    .string ""
> but not 2 .string "" in a row.
> 
> On a testcase I have with around 310440 0-255 unsigned char character
> constants mostly derived from cc1plus start but with too long sequences of
> 0s which broke transformation to STRING_CST adjusted to have at most 126
> consecutive 0s, I see:
> 1504498 bytes long assembly without this patch on i686-linux (without
> .base64 support in binutils)
> 1155071 bytes long assembly with this patch on i686-linux (without .base64
> support in binutils)
> 431390 bytes long assembly without this patch on x86_64-linux (with
> .base64 support in binutils)
> 427593 bytes long assembly with this patch on x86_64-linux (with .base64
> support in binutils)
> All 4 assemble to identical *.o file when using x86_64-linux .base64
> supporting gas, and the former 2 when using older x86_64-linux gas assemble
> to identical content as well.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Ok.  Is there a more general repeat byte op available?

Richard 

> 2024-07-17  Jakub Jelinek  <jakub@redhat.com>
> 
>    * varasm.cc (default_elf_asm_output_ascii): Use ASM_OUTPUT_SKIP instead
>    of 2 or more default_elf_asm_output_limited_string (f, "") calls and
>    adjust base64 heuristics correspondingly.
> 
> --- gcc/varasm.cc.jj    2024-07-16 13:36:54.259748720 +0200
> +++ gcc/varasm.cc    2024-07-16 14:08:19.211753867 +0200
> @@ -8538,6 +8538,7 @@ default_elf_asm_output_ascii (FILE *f, c
>       if (s >= last_base64)
>    {
>      unsigned cnt = 0;
> +      unsigned char prev_c = ' ';
>      const char *t;
>      for (t = s; t < limit && (t - s) < (long) ELF_STRING_LIMIT - 1; t++)
>        {
> @@ -8560,7 +8561,13 @@ default_elf_asm_output_ascii (FILE *f, c
>          break;
>        case 1:
>          if (c == 0)
> -            cnt += 2 + strlen (STRING_ASM_OP) + 1;
> +            {
> +              if (prev_c == 0
> +              && t + 1 < limit
> +              && (t + 1 - s) < (long) ELF_STRING_LIMIT - 1)
> +            break;
> +              cnt += 2 + strlen (STRING_ASM_OP) + 1;
> +            }
>          else
>            cnt += 4;
>          break;
> @@ -8568,6 +8575,7 @@ default_elf_asm_output_ascii (FILE *f, c
>          cnt += 2;
>          break;
>        }
> +          prev_c = c;
>        }
>      if (cnt > ((unsigned) (t - s) + 2) / 3 * 4 && (t - s) >= 3)
>        {
> @@ -8633,8 +8641,18 @@ default_elf_asm_output_ascii (FILE *f, c
>          bytes_in_chunk = 0;
>        }
> 
> -      default_elf_asm_output_limited_string (f, s);
> -      s = p;
> +      if (p == s && p + 1 < limit && p[1] == '\0')
> +        {
> +          for (p = s + 2; p < limit && *p == '\0'; p++)
> +        continue;
> +          ASM_OUTPUT_SKIP (f, (unsigned HOST_WIDE_INT) (p - s));
> +          s = p - 1;
> +        }
> +      else
> +        {
> +          default_elf_asm_output_limited_string (f, s);
> +          s = p;
> +        }
>    }
>       else
>    {
> 
>    Jakub
>
Jakub Jelinek July 17, 2024, 2:45 p.m. UTC | #2
On Wed, Jul 17, 2024 at 04:15:16PM +0200, Richard Biener wrote:
> Ok.  Is there a more general repeat byte op available?

I think
	.skip	bytes, fill
but not sure what assemblers do support that, not sure it is that
common to have long sequences of non-zero identical bytes and
for '\0's we were already splitting the string into separate directives,
while say for
	.string "fooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo"
we aren't.
The intent of this patch was just to get some common savings without
sacrifying readability too much.

	Jakub
Richard Biener July 17, 2024, 3:26 p.m. UTC | #3
> Am 17.07.2024 um 16:45 schrieb Jakub Jelinek <jakub@redhat.com>:
> 
> On Wed, Jul 17, 2024 at 04:15:16PM +0200, Richard Biener wrote:
>> Ok.  Is there a more general repeat byte op available?
> 
> I think
>    .skip    bytes, fill
> but not sure what assemblers do support that, not sure it is that
> common to have long sequences of non-zero identical bytes and
> for '\0's we were already splitting the string into separate directives,
> while say for
>    .string "fooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo"
> we aren't.
> The intent of this patch was just to get some common savings without
> sacrifying readability too much.

Fair enough, I was just wondering.

Richard 

>    Jakub
>
diff mbox series

Patch

--- gcc/varasm.cc.jj	2024-07-16 13:36:54.259748720 +0200
+++ gcc/varasm.cc	2024-07-16 14:08:19.211753867 +0200
@@ -8538,6 +8538,7 @@  default_elf_asm_output_ascii (FILE *f, c
       if (s >= last_base64)
 	{
 	  unsigned cnt = 0;
+	  unsigned char prev_c = ' ';
 	  const char *t;
 	  for (t = s; t < limit && (t - s) < (long) ELF_STRING_LIMIT - 1; t++)
 	    {
@@ -8560,7 +8561,13 @@  default_elf_asm_output_ascii (FILE *f, c
 		  break;
 		case 1:
 		  if (c == 0)
-		    cnt += 2 + strlen (STRING_ASM_OP) + 1;
+		    {
+		      if (prev_c == 0
+			  && t + 1 < limit
+			  && (t + 1 - s) < (long) ELF_STRING_LIMIT - 1)
+			break;
+		      cnt += 2 + strlen (STRING_ASM_OP) + 1;
+		    }
 		  else
 		    cnt += 4;
 		  break;
@@ -8568,6 +8575,7 @@  default_elf_asm_output_ascii (FILE *f, c
 		  cnt += 2;
 		  break;
 		}
+	      prev_c = c;
 	    }
 	  if (cnt > ((unsigned) (t - s) + 2) / 3 * 4 && (t - s) >= 3)
 	    {
@@ -8633,8 +8641,18 @@  default_elf_asm_output_ascii (FILE *f, c
 	      bytes_in_chunk = 0;
 	    }
 
-	  default_elf_asm_output_limited_string (f, s);
-	  s = p;
+	  if (p == s && p + 1 < limit && p[1] == '\0')
+	    {
+	      for (p = s + 2; p < limit && *p == '\0'; p++)
+		continue;
+	      ASM_OUTPUT_SKIP (f, (unsigned HOST_WIDE_INT) (p - s));
+	      s = p - 1;
+	    }
+	  else
+	    {
+	      default_elf_asm_output_limited_string (f, s);
+	      s = p;
+	    }
 	}
       else
 	{