3 perlreapi - Perl regular expression plugin interface
7 As of Perl 5.9.5 there is a new interface for plugging and using
8 regular expression engines other than the default one.
10 Each engine is supposed to provide access to a constant structure of the
13 typedef struct regexp_engine {
14 REGEXP* (*comp) (pTHX_
15 const SV * const pattern, const U32 flags);
19 char* strend, char* strbeg,
20 SSize_t minend, SV* sv,
21 void* data, U32 flags);
22 char* (*intuit) (pTHX_
23 REGEXP * const rx, SV *sv,
24 const char * const strbeg,
25 char *strpos, char *strend, U32 flags,
26 struct re_scream_pos_data_s *data);
27 SV* (*checkstr) (pTHX_ REGEXP * const rx);
28 void (*free) (pTHX_ REGEXP * const rx);
29 void (*numbered_buff_FETCH) (pTHX_
33 void (*numbered_buff_STORE) (pTHX_
36 SV const * const value);
37 I32 (*numbered_buff_LENGTH) (pTHX_
41 SV* (*named_buff) (pTHX_
46 SV* (*named_buff_iter) (pTHX_
48 const SV * const lastkey,
50 SV* (*qr_package)(pTHX_ REGEXP * const rx);
52 void* (*dupe) (pTHX_ REGEXP * const rx, CLONE_PARAMS *param);
54 REGEXP* (*op_comp) (...);
57 When a regexp is compiled, its C<engine> field is then set to point at
58 the appropriate structure, so that when it needs to be used Perl can find
59 the right routines to do so.
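The dispatch described above can be sketched in isolation. This is only an illustration using stand-in types, not the real C<REGEXP> and C<regexp_engine> definitions from F<regexp.h>; every C<toy_*> name is hypothetical.

```c
struct toy_regexp;                       /* stand-in for REGEXP */

/* Stand-in for regexp_engine: a constant table of callbacks. */
typedef struct toy_engine {
    int (*exec)(struct toy_regexp *rx, const char *s);
} toy_engine;

typedef struct toy_regexp {
    const toy_engine *engine;            /* set when the pattern is compiled */
    char literal;                        /* toy "pattern": one literal byte  */
} toy_regexp;

/* A toy exec callback: succeeds if the literal byte occurs in s. */
static int toy_exec(toy_regexp *rx, const char *s)
{
    for (; *s; s++)
        if (*s == rx->literal)
            return 1;
    return 0;
}

static const toy_engine toy_engine_table = { toy_exec };

/* "Compile": record the pattern and point the engine field at the table. */
static toy_regexp toy_comp(char literal)
{
    toy_regexp rx = { &toy_engine_table, literal };
    return rx;
}

/* The caller never names toy_exec; it goes through rx->engine. */
static int toy_match(toy_regexp *rx, const char *s)
{
    return rx->engine->exec(rx, s);
}
```

Because the caller only ever looks at C<< rx->engine >>, two regexps compiled by different engines can coexist in the same program.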
In order to install a new regexp handler, C<$^H{regcomp}> is set
to an integer which (when cast appropriately) resolves to one of these
structures. When compiling, the C<comp> method is executed, and the
resulting C<regexp> structure's engine field is expected to point back at
67 The pTHX_ symbol in the definition is a macro used by Perl under threading
68 to provide an extra argument to the routine holding a pointer back to
69 the interpreter that is executing the regexp. So under threading all
70 routines get an extra argument.
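As a rough model of how such a macro pair behaves, the sketch below defines its own C<EX_pTHX_> and C<EX_aTHX_>. These names, C<EX_THREADED> and the C<ex_interp> type are hypothetical; the real definitions live in Perl's headers and differ in detail.

```c
/* Define EX_THREADED to model a threaded build. */
#define EX_THREADED 1

typedef struct ex_interp { int match_count; } ex_interp;

#ifdef EX_THREADED
#  define EX_pTHX_   ex_interp *my_perl,   /* parameter list prefix */
#  define EX_aTHX_   my_perl,              /* argument list prefix  */
#  define EX_INTERP  my_perl
#else
static ex_interp ex_global;                /* single global interpreter */
#  define EX_pTHX_
#  define EX_aTHX_
#  define EX_INTERP  (&ex_global)
#endif

/* Under EX_THREADED this expands to
 *   static int ex_exec(ex_interp *my_perl, const char *s)        */
static int ex_exec(EX_pTHX_ const char *s)
{
    EX_INTERP->match_count++;
    return s[0] != '\0';
}

/* A caller forwards the context with the argument-side macro. */
static int ex_caller(EX_pTHX_ const char *s)
{
    return ex_exec(EX_aTHX_ s);
}
```

The same source compiles in both configurations. Only the threaded one carries the extra leading argument.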
76 REGEXP* comp(pTHX_ const SV * const pattern, const U32 flags);
78 Compile the pattern stored in C<pattern> using the given C<flags> and
79 return a pointer to a prepared C<REGEXP> structure that can perform
80 the match. See L</The REGEXP structure> below for an explanation of
81 the individual fields in the REGEXP struct.
83 The C<pattern> parameter is the scalar that was used as the
84 pattern. Previous versions of Perl would pass two C<char*> indicating
85 the start and end of the stringified pattern; the following snippet can
86 be used to get the old parameters:
89 char* exp = SvPV(pattern, plen);
90 char* xend = exp + plen;
92 Since any scalar can be passed as a pattern, it's possible to implement
93 an engine that does something with an array (C<< "ook" =~ [ qw/ eek
94 hlagh / ] >>) or with the non-stringified form of a compiled regular
95 expression (C<< "ook" =~ qr/eek/ >>). Perl's own engine will always
96 stringify everything using the snippet above, but that doesn't mean
97 other engines have to.
The C<flags> parameter is a bitfield which indicates which of the
C<msixp> flags the regex was compiled with. It also contains
additional info, such as whether C<use locale> is in effect.
103 The C<eogc> flags are stripped out before being passed to the comp
104 routine. The regex engine does not need to know if any of these
105 are set, as those flags should only affect what Perl does with the
106 pattern and its match variables, not how it gets compiled and
109 By the time the comp callback is called, some of these flags have
110 already had effect (noted below where applicable). However most of
111 their effect occurs after the comp callback has run, in routines that
112 read the C<< rx->extflags >> field which it populates.
114 In general the flags should be preserved in C<< rx->extflags >> after
115 compilation, although the regex engine might want to add or delete
116 some of them to invoke or disable some special behavior in Perl. The
117 flags along with any special behavior they cause are documented below:
119 The pattern modifiers:
123 =item C</m> - RXf_PMf_MULTILINE
125 If this is in C<< rx->extflags >> it will be passed to
126 C<Perl_fbm_instr> by C<pp_split> which will treat the subject string
127 as a multi-line string.
129 =item C</s> - RXf_PMf_SINGLELINE
131 =item C</i> - RXf_PMf_FOLD
133 =item C</x> - RXf_PMf_EXTENDED
135 If present on a regex, C<"#"> comments will be handled differently by the
136 tokenizer in some cases.
138 TODO: Document those cases.
140 =item C</p> - RXf_PMf_KEEPCOPY
146 The character set rules are determined by an enum that is contained
147 in this field. This is still experimental and subject to change, but
148 the current interface returns the rules by use of the in-line function
149 C<get_regex_charset(const U32 flags)>. The only currently documented
150 value returned from it is REGEX_LOCALE_CHARSET, which is set if
151 C<use locale> is in effect. If present in C<< rx->extflags >>,
152 C<split> will use the locale dependent definition of whitespace
153 when RXf_SKIPWHITE or RXf_WHITE is in effect. ASCII whitespace
154 is defined as per L<isSPACE|perlapi/isSPACE>, and by the internal
155 macros C<is_utf8_space> under UTF-8, and C<isSPACE_LC> under C<use
166 This flag was removed in perl 5.18.0. C<split ' '> is now special-cased
167 solely in the parser. RXf_SPLIT is still #defined, so you can test for it.
168 This is how it used to work:
170 If C<split> is invoked as C<split ' '> or with no arguments (which
171 really means C<split(' ', $_)>, see L<split|perlfunc/split>), Perl will
172 set this flag. The regex engine can then check for it and set the
173 SKIPWHITE and WHITE extflags. To do this, the Perl engine does:
175 if (flags & RXf_SPLIT && r->prelen == 1 && r->precomp[0] == ' ')
176 r->extflags |= (RXf_SKIPWHITE|RXf_WHITE);
180 These flags can be set during compilation to enable optimizations in
181 the C<split> operator.
187 This flag was removed in perl 5.18.0. It is still #defined, so you can
188 set it, but doing so will have no effect. This is how it used to work:
190 If the flag is present in C<< rx->extflags >> C<split> will delete
191 whitespace from the start of the subject string before it's operated
192 on. What is considered whitespace depends on if the subject is a
193 UTF-8 string and if the C<RXf_PMf_LOCALE> flag is set.
195 If RXf_WHITE is set in addition to this flag, C<split> will behave like
196 C<split " "> under the Perl engine.
200 Tells the split operator to split the target string on newlines
201 (C<\n>) without invoking the regex engine.
Perl's engine sets this if the pattern is C</^/> (C<plen == 1 && *exp
== '^'>), even under C</^/s>; see L<split|perlfunc/split>. Of course a
different regex engine might want to use the same optimizations
with a different syntax.
Tells the split operator to split the target string on whitespace
without invoking the regex engine. The definition of whitespace varies
depending on whether the target string is a UTF-8 string and on
whether RXf_PMf_LOCALE is set.
215 Perl's engine sets this flag if the pattern is C<\s+>.
219 Tells the split operator to split the target string on
220 characters. The definition of character varies depending on if
221 the target string is a UTF-8 string.
Perl's engine sets this flag on empty patterns. This optimization
makes C<split //> much faster than it would otherwise be. It's even
faster than C<unpack>.
227 =item RXf_NO_INPLACE_SUBST
229 Added in perl 5.18.0, this flag indicates that a regular expression might
230 perform an operation that would interfere with inplace substitution. For
231 instance it might contain lookbehind, or assign to non-magical variables
232 (such as $REGMARK and $REGERROR) during matching. C<s///> will skip
233 certain optimisations when this is set.
239 I32 exec(pTHX_ REGEXP * const rx,
240 char *stringarg, char* strend, char* strbeg,
241 SSize_t minend, SV* sv,
242 void* data, U32 flags);
244 Execute a regexp. The arguments are
250 The regular expression to execute.
254 This is the SV to be matched against. Note that the
255 actual char array to be matched against is supplied by the arguments
256 described below; the SV is just used to determine UTF8ness, C<pos()> etc.
260 Pointer to the physical start of the string.
264 Pointer to the character following the physical end of the string (i.e.
269 Pointer to the position in the string where matching should start; it might
270 not be equal to C<strbeg> (for example in a later iteration of C</.../g>).
274 Minimum length of string (measured in bytes from C<stringarg>) that must
275 match; if the engine reaches the end of the match but hasn't reached this
276 position in the string, it should fail.
280 Optimisation data; subject to change.
284 Optimisation flags; subject to change.
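The C<minend> rule can be sketched as a simple pointer comparison. The function name C<ex_reaches_minend> is hypothetical, and the sketch assumes plain byte pointers, not anything taken from Perl's own engine.

```c
#include <stddef.h>

/* Returns 1 if a match ending at match_end extends at least minend
 * bytes past stringarg, 0 if the engine should report failure. */
static int ex_reaches_minend(const char *stringarg,
                             const char *match_end,
                             ptrdiff_t minend)
{
    return (match_end - stringarg) >= minend;
}
```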
293 const char * const strbeg,
297 struct re_scream_pos_data_s *data);
299 Find the start position where a regex match should be attempted,
300 or possibly if the regex engine should not be run because the
301 pattern can't match. This is called, as appropriate, by the core,
302 depending on the values of the C<extflags> member of the C<regexp>
    rx:     the regex to match against
    sv:     the SV being matched: only used for utf8 flag; the string
            itself is accessed via the pointers below. Note that on
            something like an overloaded SV, SvPOK(sv) may be false
            and the string pointers may point to something unrelated to
    strbeg: real beginning of string
    strpos: the point in the string at which to begin matching
    strend: pointer to the byte following the last char of the string
    flags:  currently unused; set to 0
    data:   currently unused; set to NULL
322 SV* checkstr(pTHX_ REGEXP * const rx);
Return an SV containing a string that must appear in the pattern. Used
by C<split> for optimising matches.
329 void free(pTHX_ REGEXP * const rx);
331 Called by Perl when it is freeing a regexp pattern so that the engine
332 can release any resources pointed to by the C<pprivate> member of the
333 C<regexp> structure. This is only responsible for freeing private data;
334 Perl will handle releasing anything else contained in the C<regexp> structure.
336 =head2 Numbered capture callbacks
338 Called to get/set the value of C<$`>, C<$'>, C<$&> and their named
339 equivalents, ${^PREMATCH}, ${^POSTMATCH} and ${^MATCH}, as well as the
340 numbered capture groups (C<$1>, C<$2>, ...).
342 The C<paren> parameter will be C<1> for C<$1>, C<2> for C<$2> and so
343 forth, and have these symbolic values for the special variables:
345 ${^PREMATCH} RX_BUFF_IDX_CARET_PREMATCH
346 ${^POSTMATCH} RX_BUFF_IDX_CARET_POSTMATCH
347 ${^MATCH} RX_BUFF_IDX_CARET_FULLMATCH
348 $` RX_BUFF_IDX_PREMATCH
349 $' RX_BUFF_IDX_POSTMATCH
350 $& RX_BUFF_IDX_FULLMATCH
352 Note that in Perl 5.17.3 and earlier, the last three constants were also
353 used for the caret variants of the variables.
The names have been chosen by analogy with L<Tie::Scalar> method
names, with an additional B<LENGTH> callback for efficiency. However,
named capture variables are currently not tied internally but
implemented via magic.
361 =head3 numbered_buff_FETCH
363 void numbered_buff_FETCH(pTHX_ REGEXP * const rx, const I32 paren,
Fetch a specified numbered capture. C<sv> should be set to the scalar
to return. The scalar is passed as an argument rather than being
returned from the function because when it's called Perl already has a
scalar to store the value; creating another one would be
redundant. The scalar can be set with C<sv_setsv>, C<sv_setpvn> and
friends; see L<perlapi>.
373 This callback is where Perl untaints its own capture variables under
374 taint mode (see L<perlsec>). See the C<Perl_reg_numbered_buff_fetch>
375 function in F<regcomp.c> for how to untaint capture variables if
376 that's something you'd like your engine to do as well.
378 =head3 numbered_buff_STORE
380 void (*numbered_buff_STORE) (pTHX_
383 SV const * const value);
385 Set the value of a numbered capture variable. C<value> is the scalar
386 that is to be used as the new value. It's up to the engine to make
387 sure this is used as the new value (or reject it).
391 if ("ook" =~ /(o*)/) {
392 # 'paren' will be '1' and 'value' will be 'ee'
Perl's own engine will croak on any attempt to modify the capture
variables. To do this in another engine, use the following callback
(copied from C<Perl_reg_numbered_buff_store>):
401 Example_reg_numbered_buff_store(pTHX_
404 SV const * const value)
407 PERL_UNUSED_ARG(paren);
408 PERL_UNUSED_ARG(value);
411 Perl_croak(aTHX_ PL_no_modify);
414 Actually Perl will not I<always> croak in a statement that looks
415 like it would modify a numbered capture variable. This is because the
416 STORE callback will not be called if Perl can determine that it
417 doesn't have to modify the value. This is exactly how tied variables
418 behave in the same situation:
421 use parent 'Tie::Scalar';
423 sub TIESCALAR { bless [] }
425 sub STORE { die "This doesn't get called" }
429 tie my $sv => "CaptureVar";
432 Because C<$sv> is C<undef> when the C<y///> operator is applied to it,
433 the transliteration won't actually execute and the program won't
434 C<die>. This is different to how 5.8 and earlier versions behaved
435 since the capture variables were READONLY variables then; now they'll
436 just die when assigned to in the default engine.
438 =head3 numbered_buff_LENGTH
440 I32 numbered_buff_LENGTH (pTHX_
Get the C<length> of a capture variable. There's a special callback
for this so that Perl doesn't have to do a FETCH and run C<length> on
the result. Since the length is (in Perl's case) known from an offset
stored in C<< rx->offs >>, this is much more efficient:
450 I32 s1 = rx->offs[paren].start;
451 I32 s2 = rx->offs[paren].end;
454 This is a little bit more complex in the case of UTF-8, see what
455 C<Perl_reg_numbered_buff_length> does with
456 L<is_utf8_string_loclen|perlapi/is_utf8_string_loclen>.
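For the byte-oriented case, the offset arithmetic can be sketched as follows. The C<ex_paren_pair> type and C<ex_capture_length> function are hypothetical stand-ins. Returning 0 for a group that did not match is an assumption of this sketch, and UTF-8 character counting is deliberately left out.

```c
typedef struct ex_paren_pair {
    long start;   /* byte offset where the capture begins, or -1 */
    long end;     /* byte offset just past the capture, or -1    */
} ex_paren_pair;

/* Byte length of capture `paren`; 0 if the group did not take
 * part in the match. */
static long ex_capture_length(const ex_paren_pair *offs, int paren)
{
    long s = offs[paren].start;
    long e = offs[paren].end;

    if (s == -1 || e == -1)
        return 0;
    return e - s;
}
```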
458 =head2 Named capture callbacks
460 Called to get/set the value of C<%+> and C<%->, as well as by some
461 utility functions in L<re>.
There are two callbacks: C<named_buff> is called in all the cases the
FETCH, STORE, DELETE, CLEAR, EXISTS and SCALAR L<Tie::Hash> callbacks
would be on changes to C<%+> and C<%->, and C<named_buff_iter> in the
same cases as FIRSTKEY and NEXTKEY.
468 The C<flags> parameter can be used to determine which of these
469 operations the callbacks should respond to. The following flags are
Which L<Tie::Hash> operation is being performed from the Perl level on
C<%+> or C<%->, if any:
484 If C<%+> or C<%-> is being operated on, if any.
489 If this is being called as C<re::regname>, C<re::regnames> or
490 C<re::regnames_count>, if any. The first two will be combined with
491 C<RXapif_ONE> or C<RXapif_ALL>.
495 RXapif_REGNAMES_COUNT
497 Internally C<%+> and C<%-> are implemented with a real tied interface
498 via L<Tie::Hash::NamedCapture>. The methods in that package will call
499 back into these functions. However the usage of
500 L<Tie::Hash::NamedCapture> for this purpose might change in future
501 releases. For instance this might be implemented by magic instead
502 (would need an extension to mgvtbl).
506 SV* (*named_buff) (pTHX_ REGEXP * const rx, SV * const key,
507 SV * const value, U32 flags);
509 =head3 named_buff_iter
511 SV* (*named_buff_iter) (pTHX_
513 const SV * const lastkey,
518 SV* qr_package(pTHX_ REGEXP * const rx);
The package the qr// magic object is blessed into (as seen by C<ref
qr//>). It is recommended that engines change this to their package
name for identification regardless of whether they implement methods
525 The package this method returns should also have the internal
526 C<Regexp> package in its C<@ISA>. C<< qr//->isa("Regexp") >> should always
527 be true regardless of what engine is being used.
529 Example implementation might be:
532 Example_qr_package(pTHX_ REGEXP * const rx)
535 return newSVpvs("re::engine::Example");
538 Any method calls on an object created with C<qr//> will be dispatched to the
539 package as a normal object.
541 use re::engine::Example;
543 $re->meth; # dispatched to re::engine::Example::meth()
545 To retrieve the C<REGEXP> object from the scalar in an XS function use
546 the C<SvRX> macro, see L<"REGEXP Functions" in perlapi|perlapi/REGEXP
551 REGEXP * re = SvRX(sv);
555 void* dupe(pTHX_ REGEXP * const rx, CLONE_PARAMS *param);
On threaded builds a regexp may need to be duplicated so that the pattern
can be used by multiple threads. This routine is expected to handle the
duplication of any private data pointed to by the C<pprivate> member of
the C<regexp> structure. It will be called with the preconstructed new
C<regexp> structure as an argument. The C<pprivate> member will point at
the B<old> private structure, and it is this routine's responsibility to
construct a copy and return a pointer to it (which Perl will then use to
overwrite the field as passed to this routine).
566 This allows the engine to dupe its private data but also if necessary
567 modify the final structure if it really must.
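A minimal sketch of that contract, using stand-in types without C<pTHX_> or C<CLONE_PARAMS>. The C<ex_private> layout and the C<ex_dupe> name are hypothetical. The sketch assumes the private data is a single heap buffer of non-zero length.

```c
#include <stdlib.h>
#include <string.h>

typedef struct ex_private {
    char  *program;   /* engine-specific compiled form */
    size_t len;       /* size of program in bytes      */
} ex_private;

typedef struct ex_regexp {
    void *pprivate;   /* on entry: still the OLD private data */
} ex_regexp;

/* Deep-copies the private data that new_rx->pprivate points at and
 * returns the copy; the caller stores the result into pprivate.
 * Returns NULL if allocation fails. */
static void *ex_dupe(ex_regexp *new_rx)
{
    const ex_private *old = new_rx->pprivate;
    ex_private *copy = malloc(sizeof *copy);

    if (!copy)
        return NULL;
    copy->program = malloc(old->len);
    if (!copy->program) {
        free(copy);
        return NULL;
    }
    memcpy(copy->program, old->program, old->len);
    copy->len = old->len;
    return copy;
}
```

A shallow copy of the pointer would leave two threads sharing one buffer, which is what the routine exists to avoid.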
569 On unthreaded builds this field doesn't exist.
573 This is private to the Perl core and subject to change. Should be left
576 =head1 The REGEXP structure
578 The REGEXP struct is defined in F<regexp.h>.
579 All regex engines must be able to
580 correctly build such a structure in their L</comp> routine.
The REGEXP structure contains all the data that Perl needs to be aware of
to properly work with the regular expression. It includes data about
optimisations that Perl can use to determine if the regex engine should
really be used, and various other control info that is needed to properly
execute patterns in various contexts, such as whether the pattern is
anchored in some way, what flags were used during the compile, or whether
the program contains special constructs that Perl needs to be aware of.
590 In addition it contains two fields that are intended for the private
591 use of the regex engine that compiled the pattern. These are the
592 C<intflags> and C<pprivate> members. C<pprivate> is a void pointer to
593 an arbitrary structure, whose use and management is the responsibility
594 of the compiling engine. Perl will never modify either of these
597 typedef struct regexp {
598 /* what engine created this regexp? */
599 const struct regexp_engine* engine;
601 /* what re is this a lightweight copy of? */
602 struct regexp* mother_re;
604 /* Information about the match that the Perl core uses to manage
606 U32 extflags; /* Flags used both externally and internally */
    I32 minlen;      /* minimum possible number of chars in */
    I32 minlenret;   /* minimum possible number of chars in $& */
610 U32 gofs; /* chars left of pos that we search from */
612 /* substring data about strings that must appear
613 in the final match, used for optimisations */
614 struct reg_substr_data *substrs;
616 U32 nparens; /* number of capture groups */
618 /* private engine specific data */
619 U32 intflags; /* Engine Specific Internal flags */
620 void *pprivate; /* Data private to the regex engine which
621 created this object. */
623 /* Data about the last/current match. These are modified during
625 U32 lastparen; /* highest close paren matched ($+) */
626 U32 lastcloseparen; /* last close paren matched ($^N) */
627 regexp_paren_pair *swap; /* Swap copy of *offs */
628 regexp_paren_pair *offs; /* Array of offsets for (@-) and
631 char *subbeg; /* saved or original string so \digit works
633 SV_SAVED_COPY /* If non-NULL, SV which is COW from original */
634 I32 sublen; /* Length of string pointed by subbeg */
635 I32 suboffset; /* byte offset of subbeg from logical start of
637 I32 subcoffset; /* suboffset equiv, but in chars (for @-/@+) */
639 /* Information about the match that isn't often used */
640 I32 prelen; /* length of precomp */
641 const char *precomp; /* pre-compilation regular expression */
643 char *wrapped; /* wrapped version of the pattern */
644 I32 wraplen; /* length of wrapped */
646 I32 seen_evals; /* number of eval groups in the pattern - for
648 HV *paren_names; /* Optional hash of paren names */
650 /* Refcount of this regexp */
651 I32 refcnt; /* Refcount of this regexp */
654 The fields are discussed in more detail below:
658 This field points at a C<regexp_engine> structure which contains pointers
659 to the subroutines that are to be used for performing a match. It
660 is the compiling routine's responsibility to populate this field before
661 returning the regexp object.
Internally this is set to C<NULL> unless a custom engine is specified in
C<$^H{regcomp}>. Perl's own set of callbacks can be accessed in the struct
pointed to by C<RE_ENGINE_PTR>.
669 TODO, see L<http://www.mail-archive.com/perl5-changes@perl.org/msg17328.html>
This will be used by Perl to see what flags the regexp was compiled
with. It will normally be set to the value of the flags parameter by
the L<comp|/comp> callback. See the L<comp|/comp> documentation for
678 =head2 C<minlen> C<minlenret>
The minimum string length (in characters) required for the pattern to match.
This is used to
prune the search space by not bothering to match any closer to the end of a
string than would allow a match. For instance there is no point in even
starting the regex engine if the minlen is 10 but the string is only 5
characters long. There is no way that the pattern can match.
687 C<minlenret> is the minimum length (in characters) of the string that would
688 be found in $& after a match.
690 The difference between C<minlen> and C<minlenret> can be seen in the
695 where the C<minlen> would be 3 but C<minlenret> would only be 2 as the \d is
696 required to match but is not actually
697 included in the matched content. This
698 distinction is particularly important as the substitution logic uses the
699 C<minlenret> to tell if it can do in-place substitutions (these can
700 result in considerable speed-up).
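The pruning that C<minlen> permits can be sketched with two small helpers. Both names are hypothetical and the lengths are plain character counts; this is not how Perl's core expresses the check.

```c
#include <stddef.h>

/* Returns 1 if a match attempt is worth making, 0 if the remaining
 * string is shorter than the pattern's minimum match length. */
static int ex_worth_trying(size_t chars_remaining, size_t minlen)
{
    return chars_remaining >= minlen;
}

/* Last start position (in characters) at which a match could still
 * fit, or -1 if the string is too short for any match. */
static long ex_last_start(size_t str_chars, size_t minlen)
{
    if (str_chars < minlen)
        return -1;
    return (long)(str_chars - minlen);
}
```

With a C<minlen> of 10 and a 5-character string, C<ex_worth_trying> rejects the attempt before the engine is started.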
704 Left offset from pos() to start match at.
708 Substring data about strings that must appear in the final match. This
709 is currently only used internally by Perl's engine, but might be
710 used in the future for all engines for optimisations.
712 =head2 C<nparens>, C<lastparen>, and C<lastcloseparen>
714 These fields are used to keep track of how many paren groups could be matched
715 in the pattern, which was the last open paren to be entered, and which was
716 the last close paren to be entered.
720 The engine's private copy of the flags the pattern was compiled with. Usually
721 this is the same as C<extflags> unless the engine chose to modify one of them.
725 A void* pointing to an engine-defined
726 data structure. The Perl engine uses the
727 C<regexp_internal> structure (see L<perlreguts/Base Structures>) but a custom
728 engine should use something else.
732 Unused. Left in for compatibility with Perl 5.10.0.
A C<regexp_paren_pair> structure which defines offsets into the string being
matched which correspond to the C<$&> and C<$1>, C<$2> etc. captures. The
C<regexp_paren_pair> struct is defined as follows:
740 typedef struct regexp_paren_pair {
745 If C<< ->offs[num].start >> or C<< ->offs[num].end >> is C<-1> then that
746 capture group did not match.
747 C<< ->offs[0].start/end >> represents C<$&> (or
748 C<${^MATCH}> under C<//p>) and C<< ->offs[paren].end >> matches C<$$paren> where
751 =head2 C<precomp> C<prelen>
753 Used for optimisations. C<precomp> holds a copy of the pattern that
754 was compiled and C<prelen> its length. When a new pattern is to be
755 compiled (such as inside a loop) the internal C<regcomp> operator
756 checks if the last compiled C<REGEXP>'s C<precomp> and C<prelen>
757 are equivalent to the new one, and if so uses the old pattern instead
758 of compiling a new one.
760 The relevant snippet from C<Perl_pp_regcomp>:
762 if (!re || !re->precomp || re->prelen != (I32)len ||
763 memNE(re->precomp, t, len))
764 /* Compile a new pattern */
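The same comparison can be sketched outside the core. Here C<memcmp> stands in for Perl's C<memNE> macro, and the C<ex_regexp> type and C<ex_needs_recompile> name are hypothetical.

```c
#include <string.h>

typedef struct ex_regexp {
    const char *precomp;   /* pattern text this regexp was built from */
    long        prelen;    /* its length in bytes                     */
} ex_regexp;

/* Returns 1 if pattern t (len bytes) differs from what `re` was
 * compiled from, so a new compilation is needed; 0 if `re` can be
 * reused as-is. */
static int ex_needs_recompile(const ex_regexp *re,
                              const char *t, size_t len)
{
    return !re || !re->precomp || re->prelen != (long)len
        || memcmp(re->precomp, t, len) != 0;
}
```

Checking the length first means the byte comparison only runs when the two patterns could actually be equal.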
766 =head2 C<paren_names>
This is a hash used internally to track named capture groups and their
offsets. The keys are the names of the buffers; the values are dualvars,
with the IV slot holding the number of buffers with the given name and the
pv being an embedded array of I32. The values may also be contained
independently in the data array in cases where named backreferences are
Holds information on the longest string that must occur at a fixed
offset from the start of the pattern, and the longest string that must
occur at a floating offset from the start of the pattern. Used to do
Fast-Boyer-Moore searches on the string to find out if it's worth using
the regex engine at all, and if so where in the string to search.
783 =head2 C<subbeg> C<sublen> C<saved_copy> C<suboffset> C<subcoffset>
785 Used during the execution phase for managing search and replace patterns,
786 and for providing the text for C<$&>, C<$1> etc. C<subbeg> points to a
787 buffer (either the original string, or a copy in the case of
788 C<RX_MATCH_COPIED(rx)>), and C<sublen> is the length of the buffer. The
789 C<RX_OFFS> start and end indices index into this buffer.
791 In the presence of the C<REXEC_COPY_STR> flag, but with the addition of
792 the C<REXEC_COPY_SKIP_PRE> or C<REXEC_COPY_SKIP_POST> flags, an engine
793 can choose not to copy the full buffer (although it must still do so in
794 the presence of C<RXf_PMf_KEEPCOPY> or the relevant bits being set in
795 C<PL_sawampersand>). In this case, it may set C<suboffset> to indicate the
796 number of bytes from the logical start of the buffer to the physical start
797 (i.e. C<subbeg>). It should also set C<subcoffset>, the number of
798 characters in the offset. The latter is needed to support C<@-> and C<@+>
799 which work in characters, not bytes.
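To show why both offsets are needed, the sketch below derives a character offset from a byte offset in a UTF-8 buffer. The helper name C<ex_utf8_chars> is hypothetical. It assumes well-formed UTF-8 and a byte offset that falls on a character boundary.

```c
#include <stddef.h>

/* Count the UTF-8 characters in the first nbytes of s by counting
 * the bytes that are not continuation bytes (10xxxxxx). */
static size_t ex_utf8_chars(const unsigned char *s, size_t nbytes)
{
    size_t chars = 0, i;

    for (i = 0; i < nbytes; i++)
        if ((s[i] & 0xC0) != 0x80)
            chars++;
    return chars;
}
```

If an engine skips a prefix of three bytes that holds one two-byte character and one ASCII character, C<suboffset> would be 3 while C<subcoffset> would be 2.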
801 =head2 C<wrapped> C<wraplen>
803 Stores the string C<qr//> stringifies to. The Perl engine for example
804 stores C<(?^:eek)> in the case of C<qr/eek/>.
When using a custom engine that doesn't support the C<(?:)> construct
for inline modifiers, it's probably best to have C<qr//> stringify to
the supplied pattern. Note that this will create undesired patterns in
811 my $x = qr/a|b/; # "a|b"
812 my $y = qr/c/i; # "c"
813 my $z = qr/$x$y/; # "a|bc"
815 There's no solution for this problem other than making the custom
816 engine understand a construct like C<(?:)>.
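For contrast, the following sketch shows how a wrapped form keeps interpolated patterns separate. It models the C<(?^:eek)> style described above. The C<ex_wrap> helper is hypothetical and only formats strings; it does not compile anything.

```c
#include <stdio.h>

/* Write "(?^flags:pattern)" into out and return the length that
 * snprintf reports (excluding the terminating NUL). */
static int ex_wrap(char *out, size_t outlen,
                   const char *flags, const char *pat)
{
    return snprintf(out, outlen, "(?^%s:%s)", flags, pat);
}
```

Wrapping C<a|b> and C<c> with the C<i> flag and concatenating the results gives C<(?^:a|b)(?^i:c)>, in which the alternation and the case-insensitivity each stay confined to their own group.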
820 This stores the number of eval groups in
821 the pattern. This is used for security
822 purposes when embedding compiled regexes into larger patterns with C<qr//>.
826 The number of times the structure is referenced. When
827 this falls to 0, the regexp is automatically freed
828 by a call to pregfree. This should be set to 1 in
829 each engine's L</comp> routine.
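The reference-counting lifecycle can be sketched with stand-in types. All C<ex_*> names are hypothetical, and C<ex_release> only models the role that the text assigns to C<pregfree>; it is not Perl's implementation.

```c
#include <stdlib.h>

typedef struct ex_regexp {
    int   refcnt;     /* number of owners                  */
    void *pprivate;   /* engine-private data, may be NULL  */
} ex_regexp;

static int ex_freed_count = 0;   /* instrumentation for the sketch only */

/* "Compile": a freshly built regexp starts with exactly one owner. */
static ex_regexp *ex_comp(void)
{
    ex_regexp *rx = calloc(1, sizeof *rx);

    if (rx)
        rx->refcnt = 1;
    return rx;
}

/* Take an additional reference. */
static ex_regexp *ex_retain(ex_regexp *rx)
{
    rx->refcnt++;
    return rx;
}

/* Drop a reference; free everything when the last one goes away. */
static void ex_release(ex_regexp *rx)
{
    if (--rx->refcnt == 0) {
        free(rx->pprivate);
        free(rx);
        ex_freed_count++;
    }
}
```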
833 Originally part of L<perlreguts>.
837 Originally written by Yves Orton, expanded by E<AElig>var ArnfjE<ouml>rE<eth>
842 Copyright 2006 Yves Orton and 2007 E<AElig>var ArnfjE<ouml>rE<eth> Bjarmason.
844 This program is free software; you can redistribute it and/or modify it under
845 the same terms as Perl itself.