10.6. PHP Efficiency Issues

PHP's preg routines use PCRE, an optimized NFA regular-expression engine, so many of the techniques discussed in Chapters 4 through 6 apply directly. This includes benchmarking critical sections of code to understand practically, and not just theoretically, what is fast and what is not. Chapter 6 shows an example of benchmarking in PHP (☞ 234).

For particularly time-critical code, remember that a callback is generally faster than using the e pattern modifier (☞ 465), and that named capture with very long strings can result in a lot of extra data copying.

Regular expressions are compiled as they're encountered at runtime, but PHP has a huge 4,096-entry cache (☞ 242), so in practice, a particular pattern string is compiled only the first time it is encountered.

The S pattern modifier deserves special coverage: it "studies" a regex to try to achieve a faster match. (This is unrelated, by the way, to Perl's study function, which works with target text rather than a regular expression ☞ 359.)

10.6.1. The S Pattern Modifier: "Study"

Using the S pattern modifier instructs the preg engine to spend a little extra time …
Currently, the situations where study can and can't help are fairly well defined: it enhances what Chapter 6 calls the initial class discrimination optimization (☞ 247).

To begin with, unless you intend to apply a regex to a lot of text, there's probably not much time to save in the first place. You need to be concerned with the S pattern modifier only when applying the same regex to large chunks of text, or to many small chunks.

10.6.1.1. Standard optimizations, without the S pattern modifier

Consider a simple expression such as … When every match of an expression must begin with one particular character, the engine can presearch the target text for that character and apply the full regex only where it appears. This simple presearch can be much faster than a full regex application, and therein lies the optimization. In particular, the less frequently the character in question appears in the target text, the greater the optimization. Also, the more work a regex engine must do to detect a first-character failure, the greater the benefit of the optimization. This optimization helps …

10.6.1.2. Enhancing the optimization with the S pattern modifier

The preg engine is smart enough to apply this optimization to most expressions that have only a single character that must start any match, as in the previous examples. However, the S pattern modifier tells the engine to preanalyze the expression to enable this optimization for expressions whose possible matches have multiple starting characters. Here are several sample expressions, some of which we've already seen in this chapter, that require the S pattern modifier to be optimized in this way:
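As a minimal sketch of what "multiple starting characters" means in practice, consider the following. The pattern and sample text are hypothetical illustrations, not the chapter's own samples: any match must begin with "J", "F", or "M", so no single required first character exists. The S modifier should change only how quickly matches are found, never which matches are found, and the sketch checks exactly that. (The PHP manual notes that as of PHP 7.3 the S flag is accepted but has no effect, so a speed difference could only appear on older versions.)

```php
<?php
// Hypothetical pattern: every match must begin with "J", "F", or "M",
// so there is a small set of possible starting characters rather than one.
$pattern = '(?:Jan|Feb|Mar)[a-z]*';
$text    = 'Due in February, reviewed in March, shipped in June.';

// Apply the same expression without and with the S ("study") modifier.
preg_match_all("/$pattern/",  $text, $plain);
preg_match_all("/$pattern/S", $text, $studied);

// The modifier is purely an optimization hint: both runs must agree.
var_dump($plain[0] === $studied[0]);   // bool(true)
print_r($studied[0]);                  // "February" and "March"; "June" does not match
```

Whether the engine actually uses the starting-character set for a given expression is an internal detail, so treat this as a correctness check only, not as evidence of a speedup.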
10.6.1.3. When the S pattern modifier can't help

It's instructive to look at the types of expressions that don't benefit from the S pattern modifier:
10.6.1.4. Suggested use

It doesn't take long for the preg engine to do the extra analysis invoked by the S pattern modifier, so if you'll be applying a regex to relatively large chunks of text, it doesn't hurt to use it. If you think there's any chance it might apply, the potential benefit makes it worthwhile.
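Since the advice throughout is to measure rather than guess, a small timing harness along the following lines can show whether the modifier matters for a particular regex and data set. This is only a sketch under stated assumptions: the function name timeRegex, the pattern, and the generated test text are all hypothetical, and the numbers it prints will vary by machine and PHP version.

```php
<?php
// Hypothetical helper: apply $regex to $text $iterations times and
// report the elapsed wall-clock time plus the match count of the last run.
function timeRegex(string $regex, string $text, int $iterations = 20): array
{
    $count = 0;
    $start = microtime(true);
    for ($i = 0; $i < $iterations; $i++) {
        $count = preg_match_all($regex, $text);
    }
    return ['seconds' => microtime(true) - $start, 'matches' => $count];
}

// Hypothetical test data: a large text in which the possible starting
// characters (J, F, M) are rare, the case where a presearch helps most.
$text = str_repeat("lorem ipsum dolor sit amet 2024-Feb-11 consectetur\n", 5000);

$plain   = timeRegex('/(?:Jan|Feb|Mar)[a-z]*/',  $text);
$studied = timeRegex('/(?:Jan|Feb|Mar)[a-z]*/S', $text);

printf("without S: %.4f s (%d matches)\n", $plain['seconds'],   $plain['matches']);
printf("with S:    %.4f s (%d matches)\n", $studied['seconds'], $studied['matches']);
```

Both runs should report the same number of matches; only the timings are of interest. Running the comparison several times, and on text resembling the real workload, gives a more trustworthy picture than a single measurement.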