diff mbox

[3/N] std::regex refactoring - _Executor DFS / BFS

Message ID 20140428192950.GE928@redhat.com
State New
Headers show

Commit Message

Jonathan Wakely April 28, 2014, 7:29 p.m. UTC
On 28/04/14 15:24 -0400, Tim Shen wrote:
>Worth a try. Will you make the change or will I? It seems to be
>simpler doing than talking.

Yes :-)

I'm testing the attached patch now. It compiles slightly faster
(-ftime-report shows, as expected, that less time is spent in template
instantiation).

I'd also like to change __match_mode from a bool to an enum like:

   enum _Match_mode { _S_exact_match, _S_prefix_match };

Because I find it easier to read something like:

   if (__match_mode == _S_exact_match)
     // ...

rather than

   if (__match_mode)
     // ...

Comments

Tim Shen April 28, 2014, 8:05 p.m. UTC | #1
On Mon, Apr 28, 2014 at 3:29 PM, Jonathan Wakely <jwakely@redhat.com> wrote:
> I'm testing the attached patch now. It compiles slightly faster
> (-ftime-report shows, as expected, that less time is spent in template
> instantiation).
>
> I'd also like to change __match_mode from a bool to an enum like:
>
>   enum _Match_mode { _S_exact_match, _S_prefix_match };
>
> Because I find it easier to read something like:
>
>   if (__match_mode == _S_exact_match)
>     // ...
>
> rather than
>
>   if (__match_mode)
>     // ...

Oh this is nice, good to know.

Thanks!
Jonathan Wakely April 28, 2014, 8:18 p.m. UTC | #2
I thought I'd make a 5x speedup to the run-time of the regex matching,
but I was comparing the wrong version and the improvement actually
came from one of your patches yesterday - maybe this one:
http://gcc.gnu.org/ml/gcc-patches/2014-04/msg01725.html

Nice work!

My changes don't seem to make anything worse, and the first one makes
a big improvement to the worst case performance of std::wregex.
Tim Shen April 28, 2014, 9:14 p.m. UTC | #3
On Mon, Apr 28, 2014 at 4:18 PM, Jonathan Wakely <jwakely@redhat.com> wrote:
> I thought I'd make a 5x speedup to the run-time of the regex matching,
> but I was comparing the wrong version and the improvement actually
> came from one of your patches yesterday - maybe this one:
> http://gcc.gnu.org/ml/gcc-patches/2014-04/msg01725.html
>
> Nice work!

That's surprising. May I ask for the performance testcase?

> My changes don't seem to make anything worse, and the first one makes
> a big improvement to the worst case performance of std::wregex.

Glad to see that!

Thanks!
Jonathan Wakely April 29, 2014, 10:17 a.m. UTC | #4
On 28/04/14 17:14 -0400, Tim Shen wrote:
>On Mon, Apr 28, 2014 at 4:18 PM, Jonathan Wakely <jwakely@redhat.com> wrote:
>> I thought I'd make a 5x speedup to the run-time of the regex matching,
>> but I was comparing the wrong version and the improvement actually
>> came from one of your patches yesterday - maybe this one:
>> http://gcc.gnu.org/ml/gcc-patches/2014-04/msg01725.html
>>
>> Nice work!
>
>That's surprising. May I ask for the performance testcase?

See below. These are just microbenchmarks, so not very good or
reliable, but the <regex> code is large enough and complicated enough
that the results are fairly consistent and reproducible.
(I use __builtin_printf and __builtin_puts when I'm too lazy to
include a file, there's no other reason for that!)

I was using something like this to test basic_regex<char>:

#include <regex>

int main()
{
   auto re = std::regex("[abc]{3}d*x?");
   int count = 0;
   for (int i = 0; i < 100000; ++i)
     if (std::regex_match("bab", re)
         && std::regex_search("aacdddddddddddddddddddddddxaa", re))
       ++count;
   __builtin_printf("%d\n", count);
}

This runs much faster on trunk than with 4.9.0, so we might want to
backport your recent patches to the gcc-4_9-branch.

I was using examples like this for testing _BracketMatcher with
basic_regex<wchar_t>

#include <regex>

int main()
{
   auto re = std::wregex(L'[' + std::wstring(300, L'a') + L"bc"
                         + std::wstring(1000, 'a') + L"d]");
   int count = 0;
   for (int i = 0; i < 100000; ++i)
     if (std::regex_match(L"b", re) && std::regex_match(L"d", re))
       ++count;
   __builtin_printf("%d\n", count);
}

This runs faster with my patch to use std::sort and std::unique on the
_M_char_set vector, because the regex compiles to be equivalent to
std::wregex(L"[abcd]")
Tim Shen April 29, 2014, 4:54 p.m. UTC | #5
On Tue, Apr 29, 2014 at 6:17 AM, Jonathan Wakely <jwakely@redhat.com> wrote:
> This runs much faster on trunk than with 4.9.0, so we might want to
> backport your recent patches to the gcc-4_9-branch.

It's faster but nothing to do with optimization. It's because my first
patch set the DFS approach by default, which is faster in common case
but has bad worst case performance. Try your testcase and
regex_match("aaaaaaaaaaaaaaaaaaaaaaaaaaaaaa", regex("(a*){30}"))
with/without defining _GLIBCXX_REGEX_USE_THOMPSON_NFA in the trunk
version.
Jonathan Wakely April 29, 2014, 5 p.m. UTC | #6
On 29/04/14 12:54 -0400, Tim Shen wrote:
>On Tue, Apr 29, 2014 at 6:17 AM, Jonathan Wakely <jwakely@redhat.com> wrote:
>> This runs much faster on trunk than with 4.9.0, so we might want to
>> backport your recent patches to the gcc-4_9-branch.
>
>It's faster but nothing to do with optimization. It's because my first
>patch set the DFS approach by default, which is faster in common case
>but has bad worst case performance. Try your testcase and
>regex_match("aaaaaaaaaaaaaaaaaaaaaaaaaaaaaa", regex("(a*){30}"))
>with/without defining _GLIBCXX_REGEX_USE_THOMPSON_NFA in the trunk
>version.

Ah yes, of course ... but that's still a nice improvement for people
using a regex with no back references (at least for the common cases).
Paolo Carlini April 29, 2014, 5:25 p.m. UTC | #7
.. are we going to add some of these (micro-)benchmarks to the 
performance testsuite, yes? ;)

Paolo.
Tim Shen April 29, 2014, 9:06 p.m. UTC | #8
On Tue, Apr 29, 2014 at 1:25 PM, Paolo Carlini <paolo.carlini@oracle.com> wrote:
> .. are we going to add some of these (micro-)benchmarks to the performance
> testsuite, yes? ;)

I think the wregex one is great :)
diff mbox

Patch

diff --git a/libstdc++-v3/include/bits/regex_executor.h b/libstdc++-v3/include/bits/regex_executor.h
index 064e3df..617d1a4 100644
--- a/libstdc++-v3/include/bits/regex_executor.h
+++ b/libstdc++-v3/include/bits/regex_executor.h
@@ -88,7 +88,7 @@  _GLIBCXX_BEGIN_NAMESPACE_VERSION
       _M_match()
       {
 	_M_current = _M_begin;
-	return _M_main<true>();
+	return _M_main(true);
       }
 
       // Set matched when some prefix of the string matches the pattern.
@@ -96,33 +96,28 @@  _GLIBCXX_BEGIN_NAMESPACE_VERSION
       _M_search_from_first()
       {
 	_M_current = _M_begin;
-	return _M_main<false>();
+	return _M_main(false);
       }
 
       bool
       _M_search();
 
     private:
-      template<bool __match_mode>
-	void
-	_M_rep_once_more(_StateIdT);
-
-      template<bool __match_mode>
-	void
-	_M_dfs(_StateIdT __start);
-
-      template<bool __match_mode>
-	bool
-	_M_main()
-	{ return _M_main_dispatch<__match_mode>(__search_mode{}); }
-
-      template<bool __match_mode>
-	bool
-	_M_main_dispatch(__dfs);
-
-      template<bool __match_mode>
-	bool
-	_M_main_dispatch(__bfs);
+      void
+      _M_rep_once_more(bool __match_mode, _StateIdT);
+
+      void
+      _M_dfs(bool __match_mode, _StateIdT __start);
+
+      bool
+      _M_main(bool __match_mode)
+      { return _M_main_dispatch(__match_mode, __search_mode{}); }
+
+      bool
+      _M_main_dispatch(bool __match_mode, __dfs);
+
+      bool
+      _M_main_dispatch(bool __match_mode, __bfs);
 
       bool
       _M_is_word(_CharT __ch) const
diff --git a/libstdc++-v3/include/bits/regex_executor.tcc b/libstdc++-v3/include/bits/regex_executor.tcc
index a7351de..c0c75fa 100644
--- a/libstdc++-v3/include/bits/regex_executor.tcc
+++ b/libstdc++-v3/include/bits/regex_executor.tcc
@@ -45,7 +45,7 @@  _GLIBCXX_BEGIN_NAMESPACE_VERSION
       do
 	{
 	  _M_current = __cur;
-	  if (_M_main<false>())
+	  if (_M_main(false))
 	    return true;
 	}
       // Continue when __cur == _M_end
@@ -78,13 +78,12 @@  _GLIBCXX_BEGIN_NAMESPACE_VERSION
   //
   template<typename _BiIter, typename _Alloc, typename _TraitsT,
 	   bool __dfs_mode>
-  template<bool __match_mode>
     bool _Executor<_BiIter, _Alloc, _TraitsT, __dfs_mode>::
-    _M_main_dispatch(__dfs)
+    _M_main_dispatch(bool __match_mode, __dfs)
     {
       _M_has_sol = false;
       _M_cur_results = _M_results;
-      _M_dfs<__match_mode>(_M_states._M_start);
+      _M_dfs(__match_mode, _M_states._M_start);
       return _M_has_sol;
     }
 
@@ -112,9 +111,8 @@  _GLIBCXX_BEGIN_NAMESPACE_VERSION
   //                   O(_M_nfa.size() * match_results.size())
   template<typename _BiIter, typename _Alloc, typename _TraitsT,
 	   bool __dfs_mode>
-  template<bool __match_mode>
     bool _Executor<_BiIter, _Alloc, _TraitsT, __dfs_mode>::
-    _M_main_dispatch(__bfs)
+    _M_main_dispatch(bool __match_mode, __bfs)
     {
       _M_states._M_queue(_M_states._M_start, _M_results);
       bool __ret = false;
@@ -128,7 +126,7 @@  _GLIBCXX_BEGIN_NAMESPACE_VERSION
 	  for (auto& __task : __old_queue)
 	    {
 	      _M_cur_results = std::move(__task.second);
-	      _M_dfs<__match_mode>(__task.first);
+	      _M_dfs(__match_mode, __task.first);
 	    }
 	  if (!__match_mode)
 	    __ret |= _M_has_sol;
@@ -168,9 +166,8 @@  _GLIBCXX_BEGIN_NAMESPACE_VERSION
   // we need to spare one more time for potential group capture.
   template<typename _BiIter, typename _Alloc, typename _TraitsT,
     bool __dfs_mode>
-  template<bool __match_mode>
     void _Executor<_BiIter, _Alloc, _TraitsT, __dfs_mode>::
-    _M_rep_once_more(_StateIdT __i)
+    _M_rep_once_more(bool __match_mode, _StateIdT __i)
     {
       const auto& __state = _M_nfa[__i];
       auto& __rep_count = _M_rep_count[__i];
@@ -179,7 +176,7 @@  _GLIBCXX_BEGIN_NAMESPACE_VERSION
 	  auto __back = __rep_count;
 	  __rep_count.first = _M_current;
 	  __rep_count.second = 1;
-	  _M_dfs<__match_mode>(__state._M_alt);
+	  _M_dfs(__match_mode, __state._M_alt);
 	  __rep_count = __back;
 	}
       else
@@ -187,7 +184,7 @@  _GLIBCXX_BEGIN_NAMESPACE_VERSION
 	  if (__rep_count.second < 2)
 	    {
 	      __rep_count.second++;
-	      _M_dfs<__match_mode>(__state._M_alt);
+	      _M_dfs(__match_mode, __state._M_alt);
 	      __rep_count.second--;
 	    }
 	}
@@ -195,9 +192,8 @@  _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
   template<typename _BiIter, typename _Alloc, typename _TraitsT,
 	   bool __dfs_mode>
-  template<bool __match_mode>
     void _Executor<_BiIter, _Alloc, _TraitsT, __dfs_mode>::
-    _M_dfs(_StateIdT __i)
+    _M_dfs(bool __match_mode, _StateIdT __i)
     {
       if (_M_states._M_visited(__i))
 	return;
@@ -216,19 +212,19 @@  _GLIBCXX_BEGIN_NAMESPACE_VERSION
 	    // Greedy.
 	    if (!__state._M_neg)
 	      {
-		_M_rep_once_more<__match_mode>(__i);
+		_M_rep_once_more(__match_mode, __i);
 		// If it's DFS executor and already accepted, we're done.
 		if (!__dfs_mode || !_M_has_sol)
-		  _M_dfs<__match_mode>(__state._M_next);
+		  _M_dfs(__match_mode, __state._M_next);
 	      }
 	    else // Non-greedy mode
 	      {
 		if (__dfs_mode)
 		  {
 		    // vice-versa.
-		    _M_dfs<__match_mode>(__state._M_next);
+		    _M_dfs(__match_mode, __state._M_next);
 		    if (!_M_has_sol)
-		      _M_rep_once_more<__match_mode>(__i);
+		      _M_rep_once_more(__match_mode, __i);
 		  }
 		else
 		  {
@@ -237,12 +233,12 @@  _GLIBCXX_BEGIN_NAMESPACE_VERSION
 		    // better by attempting its next node.
 		    if (!_M_has_sol)
 		      {
-			_M_dfs<__match_mode>(__state._M_next);
+			_M_dfs(__match_mode, __state._M_next);
 			// DON'T attempt anything if it's already accepted. An
 			// accepted state *must* be better than a solution that
 			// matches a non-greedy quantifier one more time.
 			if (!_M_has_sol)
-			  _M_rep_once_more<__match_mode>(__i);
+			  _M_rep_once_more(__match_mode, __i);
 		      }
 		  }
 	      }
@@ -253,7 +249,7 @@  _GLIBCXX_BEGIN_NAMESPACE_VERSION
 	    auto& __res = _M_cur_results[__state._M_subexpr];
 	    auto __back = __res.first;
 	    __res.first = _M_current;
-	    _M_dfs<__match_mode>(__state._M_next);
+	    _M_dfs(__match_mode, __state._M_next);
 	    __res.first = __back;
 	  }
 	  break;
@@ -263,27 +259,27 @@  _GLIBCXX_BEGIN_NAMESPACE_VERSION
 	    auto __back = __res;
 	    __res.second = _M_current;
 	    __res.matched = true;
-	    _M_dfs<__match_mode>(__state._M_next);
+	    _M_dfs(__match_mode, __state._M_next);
 	    __res = __back;
 	  }
 	  break;
 	case _S_opcode_line_begin_assertion:
 	  if (_M_at_begin())
-	    _M_dfs<__match_mode>(__state._M_next);
+	    _M_dfs(__match_mode, __state._M_next);
 	  break;
 	case _S_opcode_line_end_assertion:
 	  if (_M_at_end())
-	    _M_dfs<__match_mode>(__state._M_next);
+	    _M_dfs(__match_mode, __state._M_next);
 	  break;
 	case _S_opcode_word_boundary:
 	  if (_M_word_boundary(__state) == !__state._M_neg)
-	    _M_dfs<__match_mode>(__state._M_next);
+	    _M_dfs(__match_mode, __state._M_next);
 	  break;
 	// Here __state._M_alt offers a single start node for a sub-NFA.
 	// We recursively invoke our algorithm to match the sub-NFA.
 	case _S_opcode_subexpr_lookahead:
 	  if (_M_lookahead(__state) == !__state._M_neg)
-	    _M_dfs<__match_mode>(__state._M_next);
+	    _M_dfs(__match_mode, __state._M_next);
 	  break;
 	case _S_opcode_match:
 	  if (__dfs_mode)
@@ -291,7 +287,7 @@  _GLIBCXX_BEGIN_NAMESPACE_VERSION
 	      if (_M_current != _M_end && __state._M_matches(*_M_current))
 		{
 		  ++_M_current;
-		  _M_dfs<__match_mode>(__state._M_next);
+		  _M_dfs(__match_mode, __state._M_next);
 		  --_M_current;
 		}
 	    }
@@ -322,11 +318,11 @@  _GLIBCXX_BEGIN_NAMESPACE_VERSION
 		  {
 		    auto __backup = _M_current;
 		    _M_current = __last;
-		    _M_dfs<__match_mode>(__state._M_next);
+		    _M_dfs(__match_mode, __state._M_next);
 		    _M_current = __backup;
 		  }
 		else
-		  _M_dfs<__match_mode>(__state._M_next);
+		  _M_dfs(__match_mode, __state._M_next);
 	      }
 	  }
 	  break;
@@ -358,9 +354,9 @@  _GLIBCXX_BEGIN_NAMESPACE_VERSION
 	    }
 	  break;
 	case _S_opcode_alternative:
-	  _M_dfs<__match_mode>(__state._M_alt);
+	  _M_dfs(__match_mode, __state._M_alt);
 	  if (!__dfs_mode || !_M_has_sol)
-	    _M_dfs<__match_mode>(__state._M_next);
+	    _M_dfs(__match_mode, __state._M_next);
 	  break;
 	default:
 	  _GLIBCXX_DEBUG_ASSERT(false);