Original author(s) | |
---|---|
Initial release | March 11, 2010 |
Stable release | 2021-04-01 / April 1, 2021 |
Repository | |
Written in | C++ |
Operating system | Cross-platform |
Type | Pattern matching library |
License | BSD |
Website | github |
RE2 is a software library that implements a regular expression engine. In contrast to most other regular expression libraries, which rely on backtracking, it is built on finite-state machines. RE2 provides a C++ interface.
RE2 was developed at Google, which uses it in its own products. It uses an "on-the-fly" deterministic finite-state automaton (DFA) algorithm based on Ken Thompson's Plan 9 grep.
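The idea of building a DFA "on the fly" can be illustrated with a minimal sketch in Go. It is not RE2's actual code: the NFA for the pattern (a|b)*abb is written by hand, and the names `delta`, `lazyDFA`, `step` and `match` are hypothetical. Each DFA state is a set of NFA states, and a DFA transition is computed only the first time the input requires it, then cached.

```go
package main

import "fmt"

// delta is a tiny hand-built NFA for (a|b)*abb.
// Each NFA state is one bit in a uint8 set; delta[s][c] is the set of
// NFA states reachable from state s on input byte c.
var delta = [4]map[byte]uint8{
	0: {'a': 1<<0 | 1<<1, 'b': 1 << 0},
	1: {'b': 1 << 2},
	2: {'b': 1 << 3},
	3: {},
}

const (
	start  uint8 = 1 << 0 // NFA state 0
	accept uint8 = 1 << 3 // NFA state 3
)

// edge identifies one DFA transition: a set of NFA states plus an input byte.
type edge struct {
	set uint8
	c   byte
}

// lazyDFA caches DFA transitions as they are discovered.
type lazyDFA struct {
	cache map[edge]uint8
}

// step returns the DFA state reached from set on byte c, computing and
// caching the transition the first time it is needed.
func (d *lazyDFA) step(set uint8, c byte) uint8 {
	k := edge{set, c}
	if next, ok := d.cache[k]; ok {
		return next
	}
	var next uint8
	for s := 0; s < len(delta); s++ {
		if set&(1<<s) != 0 {
			next |= delta[s][c]
		}
	}
	d.cache[k] = next
	return next
}

// match reports whether the whole input is in the language (a|b)*abb.
// It reads each input byte exactly once.
func (d *lazyDFA) match(input string) bool {
	set := start
	for i := 0; i < len(input); i++ {
		set = d.step(set, input[i])
	}
	return set&accept != 0
}

func main() {
	d := &lazyDFA{cache: make(map[edge]uint8)}
	for _, in := range []string{"abb", "aabb", "abab", "babb"} {
		fmt.Printf("%q matches: %v\n", in, d.match(in))
	}
	fmt.Println("DFA transitions built on demand:", len(d.cache))
}
```

Because only the transitions actually exercised by the input are built, the full DFA (which can be exponentially larger than the NFA) never has to be constructed up front.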
Comparison to PCRE
RE2 performs comparably to Perl Compatible Regular Expressions (PCRE). For certain regular expression operators, such as | (alternation, or logical disjunction), it outperforms PCRE. Unlike PCRE, which supports features such as lookarounds, backreferences and recursion, RE2 can recognize only regular languages, because it is built on the Thompson DFA algorithm. It is also slightly slower than PCRE for parenthetic capturing operations.
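The restriction to regular languages can be observed with Go's standard `regexp` package, which accepts RE2 syntax. The following is a small sketch rather than a test of RE2 itself; the helper `compiles` is a hypothetical name introduced for illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

// compiles reports whether Go's regexp package (RE2 syntax) accepts the pattern.
func compiles(pattern string) bool {
	_, err := regexp.Compile(pattern)
	return err == nil
}

func main() {
	patterns := []string{
		`cat|dog`,    // alternation: a regular operation, accepted
		`(\w+) \1`,   // backreference: rejected
		`foo(?=bar)`, // lookahead: rejected
		`a(?R)?b`,    // recursion: rejected
	}
	for _, p := range patterns {
		fmt.Printf("%-14q compiles: %v\n", p, compiles(p))
	}
}
```

Patterns that rely on these PCRE-only features fail at compile time instead of being matched slowly or incorrectly.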
PCRE can use a large recursive stack, with correspondingly high memory usage, and can take exponential time on certain patterns. In contrast, RE2 uses a fixed stack size and guarantees that its runtime increases linearly (not exponentially) with the size of the input. The maximum memory allocated by RE2 is configurable. This can make it more suitable for server applications, which require bounds on memory usage and computation time.
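The linear-time behaviour can be sketched with Go's standard `regexp` package, which follows the same approach. The pattern `^(a+)+$` is a commonly cited case of catastrophic backtracking when the input is a run of "a" followed by a non-matching character; the names `nested` and `matchRun` are hypothetical, and the measured timings will vary by machine.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
	"time"
)

// nested uses a nested quantifier, a shape that can cause exponential
// backtracking in backtracking engines on non-matching input.
var nested = regexp.MustCompile(`^(a+)+$`)

// matchRun matches n copies of "a" followed by "b" and reports the result
// together with the elapsed wall-clock time.
func matchRun(n int) (bool, time.Duration) {
	input := strings.Repeat("a", n) + "b"
	begin := time.Now()
	ok := nested.MatchString(input)
	return ok, time.Since(begin)
}

func main() {
	for _, n := range []int{1000, 10000, 100000} {
		ok, elapsed := matchRun(n)
		fmt.Printf("n=%d matched=%v elapsed=%v\n", n, ok, elapsed)
	}
}
```

The match fails for every input size, since the trailing "b" can never satisfy the pattern, and the elapsed time should grow roughly in proportion to the input length.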
Adoption
Use in Google products
RE2 is available to users of Google Docs and Google Sheets. Google Sheets supports RE2 syntax except for Unicode character class matching. RegexExtract does not use grouping.
Use in Go
Go's built-in "regexp" package uses the same patterns and implementation as RE2, though it is written in Go. This is unsurprising, given that Go shares developers with the Plan 9 team.
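As a brief illustration of the "regexp" package, the sketch below parses a date-style release tag such as 2021-04-01 using capture groups. The names `releaseRE` and `parseRelease` are hypothetical and are not part of RE2 or Go.

```go
package main

import (
	"fmt"
	"regexp"
)

// releaseRE matches a date-style tag such as "2021-04-01".
var releaseRE = regexp.MustCompile(`^(\d{4})-(\d{2})-(\d{2})$`)

// parseRelease splits a tag into year, month and day.
// ok is false when the tag does not have the expected shape.
func parseRelease(tag string) (year, month, day string, ok bool) {
	m := releaseRE.FindStringSubmatch(tag)
	if m == nil {
		return "", "", "", false
	}
	return m[1], m[2], m[3], true
}

func main() {
	for _, tag := range []string{"2021-04-01", "v1.2.3"} {
		y, m, d, ok := parseRelease(tag)
		fmt.Println(tag, ok, y, m, d)
	}
}
```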
Related libraries
The RE2 algorithm has been rewritten in Rust as the package "regex". Cloudflare's web application firewall uses this package because the RE2 algorithm is immune to regular expression denial of service (ReDoS).
Russ Cox also wrote RE1, an earlier regular expression engine based on a bytecode interpreter. OpenResty uses an RE1 fork called "sregex".