![regex non capturing group](https://i.stack.imgur.com/ao9Xg.jpg)
An atomic group is a group that, when the regex engine exits from it, automatically throws away all backtracking positions remembered by any tokens inside the group. Atomic grouping is supported by most modern regular expression flavors, including the JGsoft flavor, Java, and PCRE. Most of these also support possessive quantifiers, which are essentially a notational convenience for atomic grouping.

An example will make the behavior of atomic groups clear. The regular expression `a(bc|b)c` (capturing group) matches `abcc` and `abc`. The regex `a(?>bc|b)c` (atomic group) matches `abcc` but not `abc`.

When applied to `abc`, both regexes match `a` to `a` and `bc` to `bc`, and then `c` fails to match at the end of the string. The regex with the capturing group has remembered a backtracking position for the alternation. The group gives up its match, the second alternative `b` then matches `b`, and `c` matches `c`.

The regex with the atomic group, however, exited from the atomic group after `bc` was matched. At that point, all backtracking positions for tokens inside the group were discarded. In this example, the alternation's option to try `b` at the second position in the string is discarded. As a result, when `c` fails, the regex engine has no alternatives left to try.

Of course, the above example isn't very useful. But it does illustrate very clearly how atomic grouping eliminates certain matches. Or more importantly, it eliminates certain match attempts.

![regex non capturing group](https://i.stack.imgur.com/uJher.png)

Regex Optimization Using Atomic Grouping

Consider the regex `\b(integer|insert|in)\b` and the subject `integers`. Obviously, because of the word boundaries, these don't match. What's not so obvious is that the regex engine will spend quite some effort figuring this out.

`\b` matches at the start of the string, and `integer` matches `integer`. The regex engine makes note that there are two more alternatives in the group, and continues with `\b`. This fails to match between the `r` and the `s`. So the engine backtracks to try the second alternative inside the group. The second alternative, `insert`, matches `in`, but its `s` then fails to match the `t`. So the engine backtracks once more to the third alternative. `in` matches `in`, but `\b` fails between the `n` and the `t`. The regex engine has no more remembered backtracking positions, so it declares failure.

This is quite a lot of work to figure out that `integers` isn't in our list of words. We can optimize this by telling the regular expression engine that if it can't match `\b` after it matched `integer`, then it shouldn't bother trying any of the other words. The word we've encountered in the subject string is a longer word, and it isn't in our list.

We can do this by turning the capturing group into an atomic group: `\b(?>integer|insert|in)\b`. Now, when `integer` matches, the engine exits from the atomic group and throws away the backtracking positions it stored for the alternation.