diff mbox series

Tweak vector::_M_realloc_insert for code size

Message ID alpine.DEB.2.20.1711111854310.7206@stedding.saclay.inria.fr
State New
Headers show
Series Tweak vector::_M_realloc_insert for code size | expand

Commit Message

Marc Glisse Nov. 11, 2017, 6:14 p.m. UTC
Hello,

operator new can clobber memory, it is hard to teach the compiler 
otherwise since it is replaceable. Here I cache a couple values before the 
call to the allocator. I checked the result on this simple example:

#include <vector>
void f(std::vector<int>&v){ v.push_back(0); }

The patch does not affect the fast path where no reallocation is needed. 
Compiling with -O3 (everything gets inlined into f), I see a nice decrease 
in code size on x86_64

$ size old.o new.o
    text	   data	    bss	    dec	    hex	filename
     462	      0	      0	    462	    1ce	old.o
     376	      0	      0	    376	    178	new.o

Even at -O2 where _M_realloc_insert is not inlined, I get a slight 
decrease in code size (490 -> 470). On x86, that's 531 -> 519 (465 -> 
387 at -O3).

I'm not going to modify every function like that, I just happened to be 
looking at this example for other reasons, and the size gain is larger 
than I expected, so I am posting the patch.

Bootstrap+regtest on powerpc64le-unknown-linux-gnu.

2017-11-13  Marc Glisse  <marc.glisse@inria.fr>

 	* include/bits/vector.tcc (vector::_M_realloc_insert): Cache old
 	values before the allocation.

Comments

Jonathan Wakely Nov. 14, 2017, 5:09 p.m. UTC | #1
On 11/11/17 19:14 +0100, Marc Glisse wrote:
>Hello,
>
>operator new can clobber memory, it is hard to teach the compiler 
>otherwise since it is replaceable. Here I cache a couple values before 
>the call to the allocator. I checked the result on this simple 
>example:
>
>#include <vector>
>void f(std::vector<int>&v){ v.push_back(0); }
>
>The patch does not affect the fast path where no reallocation is 
>needed. Compiling with -O3 (everything gets inlined into f), I see a 
>nice decrease in code size on x86_64
>
>$ size old.o new.o
>   text	   data	    bss	    dec	    hex	filename
>    462	      0	      0	    462	    1ce	old.o
>    376	      0	      0	    376	    178	new.o
>
>Even at -O2 where _M_realloc_insert is not inlined, I get a slight 
>decrease in code size (490 -> 470). On x86, that's 531 -> 519 (465 -> 
>387 at -O3).
>
>I'm not going to modify every function like that, I just happened to 
>be looking at this example for other reasons, and the size gain is 
>larger than I expected, so I am posting the patch.

That's a nice improvement, OK for trunk. Thanks.
diff mbox series

Patch

Index: libstdc++-v3/include/bits/vector.tcc
===================================================================
--- libstdc++-v3/include/bits/vector.tcc	(revision 254629)
+++ libstdc++-v3/include/bits/vector.tcc	(working copy)
@@ -414,20 +414,22 @@  _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
       _M_realloc_insert(iterator __position, _Args&&... __args)
 #else
   template<typename _Tp, typename _Alloc>
     void
     vector<_Tp, _Alloc>::
     _M_realloc_insert(iterator __position, const _Tp& __x)
 #endif
     {
       const size_type __len =
 	_M_check_len(size_type(1), "vector::_M_realloc_insert");
+      pointer __old_start = this->_M_impl._M_start;
+      pointer __old_finish = this->_M_impl._M_finish;
       const size_type __elems_before = __position - begin();
       pointer __new_start(this->_M_allocate(__len));
       pointer __new_finish(__new_start);
       __try
 	{
 	  // The order of the three operations is dictated by the C++11
 	  // case, where the moves could alter a new element belonging
 	  // to the existing vector.  This is an issue only for callers
 	  // taking the element by lvalue ref (see last bullet of C++11
 	  // [res.on.arguments]).
@@ -435,45 +437,44 @@  _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
 				   __new_start + __elems_before,
 #if __cplusplus >= 201103L
 				   std::forward<_Args>(__args)...);
 #else
 				   __x);
 #endif
 	  __new_finish = pointer();
 
 	  __new_finish
 	    = std::__uninitialized_move_if_noexcept_a
-	    (this->_M_impl._M_start, __position.base(),
+	    (__old_start, __position.base(),
 	     __new_start, _M_get_Tp_allocator());
 
 	  ++__new_finish;
 
 	  __new_finish
 	    = std::__uninitialized_move_if_noexcept_a
-	    (__position.base(), this->_M_impl._M_finish,
+	    (__position.base(), __old_finish,
 	     __new_finish, _M_get_Tp_allocator());
 	}
       __catch(...)
 	{
 	  if (!__new_finish)
 	    _Alloc_traits::destroy(this->_M_impl,
 				   __new_start + __elems_before);
 	  else
 	    std::_Destroy(__new_start, __new_finish, _M_get_Tp_allocator());
 	  _M_deallocate(__new_start, __len);
 	  __throw_exception_again;
 	}
       _GLIBCXX_ASAN_ANNOTATE_REINIT;
-      std::_Destroy(this->_M_impl._M_start, this->_M_impl._M_finish,
-		    _M_get_Tp_allocator());
-      _M_deallocate(this->_M_impl._M_start,
-		    this->_M_impl._M_end_of_storage - this->_M_impl._M_start);
+      std::_Destroy(__old_start, __old_finish, _M_get_Tp_allocator());
+      _M_deallocate(__old_start,
+		    this->_M_impl._M_end_of_storage - __old_start);
       this->_M_impl._M_start = __new_start;
       this->_M_impl._M_finish = __new_finish;
       this->_M_impl._M_end_of_storage = __new_start + __len;
     }
 
   template<typename _Tp, typename _Alloc>
     void
     vector<_Tp, _Alloc>::
     _M_fill_insert(iterator __position, size_type __n, const value_type& __x)
     {