google / re2

RE2 is a fast, safe, thread-friendly alternative to backtracking regular expression engines like those used in PCRE, Perl, and Python. It is a C++ library.

Lockless re2

rschu1ze opened this issue

Hi,
we (ClickHouse) used to use a variant of re2 where internal locking was patched out (in an admittedly hacky way). Details of the source-to-source transformation from re2 to "re2_st", i.e. "single-threaded", can be found here:

The underlying idea was that the calling program already ensures that pattern compilation and evaluation are properly synchronized, i.e. they run single-threaded or are protected by mutexes. The re2-to-re2_st transformation broke for us when re2 moved from its own locking to Abseil locking.

I believe it is rather common for higher layers to provide synchronization themselves. Would it therefore be possible to relax the locking in re2, either as a runtime option (e.g. via some flag during pattern compilation) or at build time (e.g. via some build system option)?

Sorry, this isn't something that I'm keen to support upstream. Having said that, if the reader–writer mutex in the DFA class is the primary target here, there is a way that you can swap in another mechanism somewhat more conveniently:

re2/re2/dfa.cc

Lines 168 to 169 in 09de536

// Make it easier to swap in a scalable reader-writer mutex.
using CacheMutex = absl::Mutex;

Within Google, we simply do something like this:

#if defined(…)
#include …
#endif

…

  // Make it easier to swap in a scalable reader-writer mutex.
#if defined(…)
  using CacheMutex = …;
#else
  using CacheMutex = absl::Mutex;
#endif
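
For readers who want to see the alias-swap idea in isolation, here is a minimal, self-contained sketch. It is not RE2 code: `NullMutex`, `ExclusiveMutex`, `Counter`, and the macro `EXAMPLE_SINGLE_THREADED` are hypothetical names made up for illustration, and the sketch assumes the code behind the alias only calls `ReaderLock`/`ReaderUnlock`/`WriterLock`/`WriterUnlock` (methods that `absl::Mutex` provides). Whether a no-op mutex is actually safe depends entirely on the caller serializing all access.

```cpp
#include <cassert>
#include <mutex>

// Stand-in for absl::Mutex so the sketch builds without Abseil.
// Unlike a real reader-writer mutex, readers are exclusive here too.
class ExclusiveMutex {
 public:
  void ReaderLock() { mu_.lock(); }
  void ReaderUnlock() { mu_.unlock(); }
  void WriterLock() { mu_.lock(); }
  void WriterUnlock() { mu_.unlock(); }

 private:
  std::mutex mu_;
};

// Hypothetical no-op mutex: every operation does nothing. Only valid
// when the caller guarantees that access is already serialized.
class NullMutex {
 public:
  void ReaderLock() {}
  void ReaderUnlock() {}
  void WriterLock() {}
  void WriterUnlock() {}
};

// Same shape as the alias in the thread; the macro name is an assumption.
#if defined(EXAMPLE_SINGLE_THREADED)
using CacheMutex = NullMutex;
#else
using CacheMutex = ExclusiveMutex;
#endif

// Toy guarded state, templated on the mutex type so that both
// variants can be exercised side by side.
template <typename Mutex>
class Counter {
 public:
  // Read the value under a reader lock.
  int Get() {
    mu_.ReaderLock();
    int v = value_;
    mu_.ReaderUnlock();
    return v;
  }

  // Modify the value under a writer lock.
  void Increment() {
    mu_.WriterLock();
    ++value_;
    mu_.WriterUnlock();
  }

 private:
  Mutex mu_;
  int value_ = 0;
};
```

Because the rest of the code only names `CacheMutex`, switching between the two types is a one-line change (or a single `-D` flag at build time), which is what makes this approach less invasive than a source-to-source transformation.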

Thanks, I'll try that!