diff mbox series

libcpp: Fix up UB in finish_embed

Message ID ZuQU3AWe3ktvlmJk@tucnak
State New
Headers show
Series libcpp: Fix up UB in finish_embed | expand

Commit Message

Jakub Jelinek Sept. 13, 2024, 10:33 a.m. UTC
Hi!

Jonathan reported on IRC that certain unnamed proprietary static analyzer
is unhappy about the new finish_embed function and it is actually right.
On a testcase like:
#embed __FILE__ limit (0) if_empty (0)
params->if_empty.count is 1, limit is 0, so count is 0 (we need just
a single token and one fits into pfile->directive_result).  Because
count is 0, we don't allocate toks, so it stays NULL, and then in
1301	  if (prefix->count)
1302	    {
1303	      *tok = *prefix->base_run.base;
1304	      tok = toks;
1305	      tokenrun *cur_run = &prefix->base_run;
1306	      while (cur_run)
1307		{
1308		  size_t cnt = (cur_run->next ? cur_run->limit
1309				: prefix->cur_token) - cur_run->base;
1310		  cpp_token *t = cur_run->base;
1311		  if (cur_run == &prefix->base_run)
1312		    {
1313		      t++;
1314		      cnt--;
1315		    }
1316		  memcpy (tok, t, cnt * sizeof (cpp_token));
1317		  tok += cnt;
1318		  cur_run = cur_run->next;
1319		}
1320	    }
the *tok = *prefix->base_run.base; assignment will copy the only
token.  cur_run is still non-NULL, cnt will be initially 1 and
then decremented to 0, but we invoke UB because we do
memcpy (NULL, cur_run->base + 1, 0 * sizeof (cpp_token));
and then the loop stops because cur_run->next must be NULL.

As we don't really copy anything, toks can be anything non-NULL,
so the following patch fixes that by initializing toks also to
&pfile->directive_result (just something known to be non-NULL).
This should be harmless even for the
#embed __FILE__ limit (1)
case (no non-empty prefix/suffix) where toks isn't allocated
either, but in that case prefix->count will be 0 and in the
1321	  for (size_t i = 0; i < limit; ++i)
1322	    {
1323	      tok->src_loc = params->loc;
1324	      tok->type = CPP_NUMBER;
1325	      tok->flags = NO_EXPAND;
1326	      if (i == 0)
1327		tok->flags |= PREV_WHITE;
1328	      tok->val.str.text = s;
1329	      tok->val.str.len = sprintf ((char *) s, "%d", buffer[i]);
1330	      s += tok->val.str.len + 1;
1331	      if (tok == &pfile->directive_result)
1332		tok = toks;
1333	      else
1334		tok++;
1335	      if (i < limit - 1)
1336		{
1337		  tok->src_loc = params->loc;
1338		  tok->type = CPP_COMMA;
1339		  tok->flags = NO_EXPAND;
1340		  tok++;
1341		}
1342	    }
loop limit will be 1, so tok is initially &pfile->directive_result,
that is stilled in, then tok = toks; (previously setting tok to NULL,
now to &pfile->directive_result again) and because 0 < 1 - 1 is
false, nothing further will happen and the loop will finish (and as
params->suffix.count will be 0, nothing further will use tok).

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2024-09-12  Jakub Jelinek  <jakub@redhat.com>

	* files.cc (finish_embed): Initialize toks to tok rather
	than NULL.


	Jakub

Comments

Marek Polacek Sept. 13, 2024, 1:55 p.m. UTC | #1
On Fri, Sep 13, 2024 at 12:33:00PM +0200, Jakub Jelinek wrote:
> Hi!
> 
> Jonathan reported on IRC that certain unnamed proprietary static analyzer
> is unhappy about the new finish_embed function and it is actually right.
> On a testcase like:
> #embed __FILE__ limit (0) if_empty (0)
> params->if_empty.count is 1, limit is 0, so count is 0 (we need just
> a single token and one fits into pfile->directive_result).  Because
> count is 0, we don't allocate toks, so it stays NULL, and then in
> 1301	  if (prefix->count)
> 1302	    {
> 1303	      *tok = *prefix->base_run.base;
> 1304	      tok = toks;
> 1305	      tokenrun *cur_run = &prefix->base_run;
> 1306	      while (cur_run)
> 1307		{
> 1308		  size_t cnt = (cur_run->next ? cur_run->limit
> 1309				: prefix->cur_token) - cur_run->base;
> 1310		  cpp_token *t = cur_run->base;
> 1311		  if (cur_run == &prefix->base_run)
> 1312		    {
> 1313		      t++;
> 1314		      cnt--;
> 1315		    }
> 1316		  memcpy (tok, t, cnt * sizeof (cpp_token));
> 1317		  tok += cnt;
> 1318		  cur_run = cur_run->next;
> 1319		}
> 1320	    }
> the *tok = *prefix->base_run.base; assignment will copy the only
> token.  cur_run is still non-NULL, cnt will be initially 1 and
> then decremented to 0, but we invoke UB because we do
> memcpy (NULL, cur_run->base + 1, 0 * sizeof (cpp_token));
> and then the loop stops because cur_run->next must be NULL.
> 
> As we don't really copy anything, toks can be anything non-NULL,
> so the following patch fixes that by initializing toks also to
> &pfile->directive_result (just something known to be non-NULL).
> This should be harmless even for the
> #embed __FILE__ limit (1)
> case (no non-empty prefix/suffix) where toks isn't allocated
> either, but in that case prefix->count will be 0 and in the
> 1321	  for (size_t i = 0; i < limit; ++i)
> 1322	    {
> 1323	      tok->src_loc = params->loc;
> 1324	      tok->type = CPP_NUMBER;
> 1325	      tok->flags = NO_EXPAND;
> 1326	      if (i == 0)
> 1327		tok->flags |= PREV_WHITE;
> 1328	      tok->val.str.text = s;
> 1329	      tok->val.str.len = sprintf ((char *) s, "%d", buffer[i]);
> 1330	      s += tok->val.str.len + 1;
> 1331	      if (tok == &pfile->directive_result)
> 1332		tok = toks;
> 1333	      else
> 1334		tok++;
> 1335	      if (i < limit - 1)
> 1336		{
> 1337		  tok->src_loc = params->loc;
> 1338		  tok->type = CPP_COMMA;
> 1339		  tok->flags = NO_EXPAND;
> 1340		  tok++;
> 1341		}
> 1342	    }
> loop limit will be 1, so tok is initially &pfile->directive_result,
> that is stilled in, then tok = toks; (previously setting tok to NULL,
> now to &pfile->directive_result again) and because 0 < 1 - 1 is
> false, nothing further will happen and the loop will finish (and as
> params->suffix.count will be 0, nothing further will use tok).
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

OK.
 
> 2024-09-12  Jakub Jelinek  <jakub@redhat.com>
> 
> 	* files.cc (finish_embed): Initialize toks to tok rather
> 	than NULL.
> 
> --- libcpp/files.cc.jj	2024-09-12 18:16:49.995409074 +0200
> +++ libcpp/files.cc	2024-09-12 22:55:01.619765364 +0200
> @@ -1284,7 +1284,7 @@ finish_embed (cpp_reader *pfile, _cpp_fi
>      }
>    uchar *s = len ? _cpp_unaligned_alloc (pfile, len) : NULL;
>    _cpp_buff *tok_buff = NULL;
> -  cpp_token *toks = NULL, *tok = &pfile->directive_result;
> +  cpp_token *tok = &pfile->directive_result, *toks = tok;
>    size_t count = 0;
>    if (limit)
>      count = (params->prefix.count + limit * 2 - 1
> 
> 	Jakub
> 

Marek
diff mbox series

Patch

--- libcpp/files.cc.jj	2024-09-12 18:16:49.995409074 +0200
+++ libcpp/files.cc	2024-09-12 22:55:01.619765364 +0200
@@ -1284,7 +1284,7 @@  finish_embed (cpp_reader *pfile, _cpp_fi
     }
   uchar *s = len ? _cpp_unaligned_alloc (pfile, len) : NULL;
   _cpp_buff *tok_buff = NULL;
-  cpp_token *toks = NULL, *tok = &pfile->directive_result;
+  cpp_token *tok = &pfile->directive_result, *toks = tok;
   size_t count = 0;
   if (limit)
     count = (params->prefix.count + limit * 2 - 1