=head1 NAME

perlreref - Perl Regular Expressions Reference

=head1 DESCRIPTION

This is a quick reference to Perl's regular expressions.
For full information see L<perlre> and L<perlop>, as well
as the L</"SEE ALSO"> section in this document.

=head2 OPERATORS

C<=~> determines to which variable the regex is applied.
In its absence, $_ is used.

    $var =~ /foo/;

C<!~> determines to which variable the regex is applied,
and negates the result of the match; it returns
false if the match succeeds, and true if it fails.

    $var !~ /foo/;

C<m/pattern/msixpogcdual> searches a string for a pattern match,
applying the given options.

    m  Multiline mode - ^ and $ match internal lines
    s  match as a Single line - . matches \n
    i  case-Insensitive
    x  eXtended legibility - free whitespace and comments
    p  Preserve a copy of the matched string -
       ${^PREMATCH}, ${^MATCH}, ${^POSTMATCH} will be defined.
    o  compile pattern Once
    g  Global - all occurrences
    c  don't reset pos on failed matches when using /g
    a  restrict \d, \s, \w and [:posix:] to match ASCII only
    aa (two a's) as /a, and /i never matches ASCII with non-ASCII
    l  match according to current locale
    u  match according to Unicode rules
    d  match according to native rules unless something indicates
       Unicode

If 'pattern' is an empty string, the last I<successfully> matched
regex is used. Delimiters other than '/' may be used for both this
operator and the following ones. The leading C<m> can be omitted
if the delimiter is '/'.
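A minimal sketch of a few of these modifiers in use. The sample text and
variable names are invented for illustration only.

```perl
use strict;
use warnings;

my $text = "Foo bar\nfoo baz\n";

# /i ignores case; /g in list context returns every match.
my @all = $text =~ /foo/ig;               # ("Foo", "foo")

# /m lets ^ match at the start of each line, not just the string.
my @line_starts = $text =~ /^(\w+)/mg;    # ("Foo", "foo")

# /x ignores whitespace and comments inside the pattern.
my ($second) = $text =~ /
    foo \s+      # lowercase "foo" followed by whitespace
    (ba\w)       # capture the word that follows
/x;                                       # "baz"

# /s lets . match a newline, so the match can span two lines.
my $spans = $text =~ /bar.foo/s ? 1 : 0;  # 1

print "@all | @line_starts | $second | $spans\n";
```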

C<qr/pattern/msixpodual> lets you store a regex in a variable,
or pass one around. Modifiers are as for C<m//>, and are stored
within the regex.
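A short sketch of how a C<qr//> object carries its modifiers with it. The
names C<$word>, C<$direct> and so on are illustrative, not part of any API.

```perl
use strict;
use warnings;

# The /i given here is stored inside the compiled regex.
my $word = qr/perl/i;

# Use the object directly, or interpolate it into a larger pattern.
my $direct   = "I like Perl" =~ $word           ? 1 : 0;   # 1
my $embedded = "PERL 5"      =~ /^$word\s+\d+$/ ? 1 : 0;   # 1

# The outer pattern has no /i, so only the interpolated part ignores case.
my $outer = "perl V" =~ /^$word\s+v$/ ? 1 : 0;             # 0

print "$direct $embedded $outer\n";
```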

C<s/pattern/replacement/msixpogcedualr> substitutes matches of
'pattern' with 'replacement'. Modifiers as for C<m//>,
with two additions:

    e  Evaluate 'replacement' as an expression
    r  Return substitution and leave the original string untouched.

'e' may be specified multiple times. 'replacement' is interpreted
as a double quoted string unless a single-quote (C<'>) is the delimiter.
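The following sketch shows C</e>, C</r> and the single-quote delimiter
side by side. It assumes Perl 5.14 or later, since C</r> is not available
before that; the sample strings are made up.

```perl
use strict;
use warnings;

my $price = "cost: 10 and 20";

# /e evaluates the replacement as Perl code; /g applies it to every match.
(my $doubled = $price) =~ s/(\d+)/$1 * 2/ge;   # "cost: 20 and 40"

# /r returns the changed copy and leaves $price untouched.
my $upper = $price =~ s/cost/COST/r;           # "COST: 10 and 20"

# With single-quote delimiters the replacement is not interpolated,
# so the result is the two literal characters '$' and '1'.
(my $literal = "x") =~ s'x'$1';

print "$doubled\n$upper\n$price\n$literal\n";
```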

C<?pattern?> is like C<m/pattern/> but matches only once. No alternate
delimiters can be used.  Must be reset with reset().

=head2 SYNTAX

 \       Escapes the character immediately following it
 .       Matches any single character except a newline (unless /s is
           used)
 ^       Matches at the beginning of the string (or line, if /m is used)
 $       Matches at the end of the string (or line, if /m is used)
 *       Matches the preceding element 0 or more times
 +       Matches the preceding element 1 or more times
 ?       Matches the preceding element 0 or 1 times
 {...}   Specifies a range of occurrences for the element preceding it
 [...]   Matches any one of the characters contained within the brackets
 (...)   Groups subexpressions for capturing to $1, $2...
 (?:...) Groups subexpressions without capturing (cluster)
 |       Matches either the subexpression preceding or following it
 \g1 or \g{1}, \g2 ...    Matches the text from the Nth group
 \1, \2, \3 ...           Matches the text from the Nth group
 \g-1 or \g{-1}, \g-2 ... Matches the text from the Nth previous group
 \g{name}     Named backreference
 \k<name>     Named backreference
 \k'name'     Named backreference
 (?P=name)    Named backreference (python syntax)
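The backreference forms above can be sketched as follows. The input
strings are invented, and the named and relative forms assume Perl 5.10
or later.

```perl
use strict;
use warnings;

# \1 refers back to the text captured by the first group.
my ($dup) = "this is is a test" =~ /\b(\w+)\s+\1\b/;          # "is"

# \g{-1} is the group just before this point, \g{-2} the one before that.
my $relative = "abba" =~ /(\w)(\w)\g{-1}\g{-2}/ ? 1 : 0;      # 1

# \k<name> refers to a named group: the closing quote must match the opener.
my $quoted = '';
if (q{say "hi" now} =~ /(?<q>["'])(.*?)\k<q>/) {
    $quoted = $2;                                             # "hi"
}

# (?P=name) is the Python-style spelling of the same thing.
my $python = "xyzxyz" =~ /(?<w>xyz)(?P=w)/ ? 1 : 0;           # 1

print "$dup $relative $quoted $python\n";
```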

=head2 ESCAPE SEQUENCES

These work as in normal strings.

   \a       Alarm (beep)
   \e       Escape
   \f       Formfeed
   \n       Newline
   \r       Carriage return
   \t       Tab
   \037     Char whose ordinal is the 3 octal digits, max \777
   \o{2307} Char whose ordinal is the octal number, unrestricted
   \x7f     Char whose ordinal is the 2 hex digits, max \xFF
   \x{263a} Char whose ordinal is the hex number, unrestricted
   \cx      Control-x
   \N{name} A named Unicode character or character sequence
   \N{U+263D} A Unicode character by hex ordinal

   \l  Lowercase next character
   \u  Titlecase next character
   \L  Lowercase until \E
   \U  Uppercase until \E
   \Q  Disable pattern metacharacters until \E
   \E  End modification

For Titlecase, see L</Titlecase>.

This one works differently from normal strings:

   \b  An assertion, not backspace, except in a character class
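A brief sketch of C<\Q>, the case-changing escapes, and the two meanings
of C<\b>. All variable names and sample strings are assumptions made for
the example.

```perl
use strict;
use warnings;

my $needle = "a.b*c";

# Without \Q the dot and star are metacharacters; with it they are literal.
my $as_pattern = "aXbbbc" =~ /$needle/      ? 1 : 0;   # 1
my $as_literal = "aXbbbc" =~ /\Q$needle\E/  ? 1 : 0;   # 0
my $found      = "xa.b*cx" =~ /\Q$needle\E/ ? 1 : 0;   # 1

# \L lowercases up to \E, and \u titlecases the next character.
my $name  = "jOHN";
my $fixed = "\u\L$name\E";                             # "John"

# Outside a character class \b is a word boundary ...
my @words = "cat concat" =~ /\bcat\b/g;                # one match
# ... inside one it is a backspace character.
my $backspace = "a\bc" =~ /[\b]/ ? 1 : 0;              # 1

print "$as_pattern $as_literal $found $fixed ", scalar(@words), " $backspace\n";
```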

=head2 CHARACTER CLASSES

   [amy]    Match 'a', 'm' or 'y'
   [f-j]    Dash specifies "range"
   [f-j-]   Dash escaped or at start or end means 'dash'
   [^f-j]   Caret indicates "match any character _except_ these"

The following sequences (except C<\N>) work within or without a character class.
The first six are locale aware, all are Unicode aware. See L<perllocale>
and L<perlunicode> for details.

   \d      A digit
   \D      A nondigit
   \w      A word character
   \W      A non-word character
   \s      A whitespace character
   \S      A non-whitespace character
   \h      A horizontal whitespace
   \H      A non-horizontal whitespace
   \N      A non newline (when not followed by '{NAME}'; experimental;
           not valid in a character class; equivalent to [^\n]; it's
           like '.' without /s modifier)
   \v      A vertical whitespace
   \V      A non-vertical whitespace
   \R      A generic newline           (?>\v|\x0D\x0A)

   \C      Match a byte (with Unicode, '.' matches a character)
   \pP     Match P-named (Unicode) property
   \p{...} Match Unicode property with name longer than 1 character
   \PP     Match non-P
   \P{...} Match lack of Unicode property with name longer than 1 char
   \X      Match Unicode extended grapheme cluster

POSIX character classes and their Unicode and Perl equivalents:

            ASCII-         Full-
   POSIX    range          range    backslash
 [[:...:]]  \p{...}        \p{...}   sequence    Description

 -----------------------------------------------------------------------
 alnum   PosixAlnum       XPosixAlnum            Alpha plus Digit
 alpha   PosixAlpha       XPosixAlpha            Alphabetic characters
 ascii   ASCII                                   Any ASCII character
 blank   PosixBlank       XPosixBlank   \h       Horizontal whitespace;
                                                   full-range also
                                                   written as
                                                   \p{HorizSpace} (GNU
                                                   extension)
 cntrl   PosixCntrl       XPosixCntrl            Control characters
 digit   PosixDigit       XPosixDigit   \d       Decimal digits
 graph   PosixGraph       XPosixGraph            Alnum plus Punct
 lower   PosixLower       XPosixLower            Lowercase characters
 print   PosixPrint       XPosixPrint            Graph plus Space, but
                                                   not any Cntrls
 punct   PosixPunct       XPosixPunct            Punctuation and Symbols
                                                   in ASCII-range; just
                                                   punct outside it
 space   PosixSpace       XPosixSpace            [\s\cK]
         PerlSpace        XPerlSpace    \s       Perl's whitespace def'n
 upper   PosixUpper       XPosixUpper            Uppercase characters
 word    PosixWord        XPosixWord    \w       Alnum + Unicode marks +
                                                   connectors, like '_'
                                                   (Perl extension)
 xdigit  ASCII_Hex_Digit  XPosixXDigit           Hexadecimal digit,
                                                   ASCII-range is
                                                   [0-9A-Fa-f]

Also, various synonyms like C<\p{Alpha}> for C<\p{XPosixAlpha}>; all are
listed in L<perluniprops/Properties accessible through \p{} and \P{}>.

Within a character class:

    POSIX      traditional   Unicode
  [:digit:]       \d        \p{Digit}
  [:^digit:]      \D        \P{Digit}
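To illustrate some of the classes above, here is a small sketch. It
assumes Perl 5.14 or later for the C</a> modifier, and the test string is
invented; it contains one non-ASCII digit so that the Unicode-aware and
ASCII-only behaviours differ.

```perl
use strict;
use warnings;

# "\x{663}" is ARABIC-INDIC DIGIT THREE, a digit outside the ASCII range.
my $s = "abc 123 \x{663}";

my @any_digit   = $s =~ /\d/g;              # 4 matches under Unicode rules
my @ascii_digit = $s =~ /\d/ag;             # 3 matches: /a limits \d to 0-9

# POSIX classes are written inside a bracketed character class.
my @letters = $s =~ /[[:alpha:]]/g;         # ("a", "b", "c")

# \p{Lu} matches an uppercase letter.
my ($cap) = "perl Regex" =~ /(\p{Lu})/;     # "R"

# A leading caret negates the class.
my @not_fj = "fghijk" =~ /[^f-j]/g;         # ("k")

# \h matches horizontal whitespace such as a tab.
my $tab = "a\tb" =~ /a\hb/ ? 1 : 0;         # 1

print scalar(@any_digit), " ", scalar(@ascii_digit),
      " @letters $cap @not_fj $tab\n";
```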

=head2 ANCHORS

All are zero-width assertions.

   ^  Match string start (or line, if /m is used)
   $  Match string end (or line, if /m is used) or before newline
   \b Match word boundary (between \w and \W)
   \B Match except at word boundary (between \w and \w or \W and \W)
   \A Match string start (regardless of /m)
   \Z Match string end (before optional newline)
   \z Match absolute string end
   \G Match where previous m//g left off
   \K Keep the stuff left of the \K, don't include it in $&
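A sketch contrasting the end-of-string anchors, followed by a toy
tokenizer built on C<\G> and a C<\K> substitution. The token labels
(C<NUM>, C<OP>) and the input strings are hypothetical; C<\K> assumes
Perl 5.10 or later.

```perl
use strict;
use warnings;

my $line = "end\n";
my $dollar  = $line =~ /end$/  ? 1 : 0;   # 1: $ allows a trailing newline
my $big_z   = $line =~ /end\Z/ ? 1 : 0;   # 1: so does \Z
my $small_z = $line =~ /end\z/ ? 1 : 0;   # 0: \z is the absolute end

# \G anchors each attempt where the previous m//g match on $src stopped;
# /c keeps pos() intact when an alternative fails.
my $src = "12+34";
my @tokens;
while (1) {
    if    ($src =~ /\G(\d+)/gc)   { push @tokens, "NUM:$1" }
    elsif ($src =~ /\G([-+])/gc)  { push @tokens, "OP:$1"  }
    else                          { last }
}

# \K drops everything matched so far from the text being replaced.
(my $path = "dir/file.txt") =~ s/\.\K\w+$/bak/;   # "dir/file.bak"

print "$dollar $big_z $small_z | @tokens | $path\n";
```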

=head2 QUANTIFIERS

Quantifiers are greedy by default and match the B<longest> leftmost.

   Maximal Minimal Possessive Allowed range
   ------- ------- ---------- -------------
   {n,m}   {n,m}?  {n,m}+     Must occur at least n times
                              but no more than m times
   {n,}    {n,}?   {n,}+      Must occur at least n times
   {n}     {n}?    {n}+       Must occur exactly n times
   *       *?      *+         0 or more times (same as {0,})
   +       +?      ++         1 or more times (same as {1,})
   ?       ??      ?+         0 or 1 time (same as {0,1})

The possessive forms (new in Perl 5.10) prevent backtracking: what gets
matched by a pattern with a possessive quantifier will not be backtracked
into, even if that causes the whole match to fail.

There is no quantifier C<{,n}>. That's interpreted as a literal string.
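The difference between the three quantifier flavours can be sketched as
below. The sample strings are made up, and the possessive form assumes
Perl 5.10 or later.

```perl
use strict;
use warnings;

my $html = "<b>one</b><b>two</b>";

# Maximal (greedy): .* runs to the last possible </b>.
my ($greedy) = $html =~ /<b>(.*)<\/b>/;     # "one</b><b>two"

# Minimal: .*? stops at the first </b>.
my ($lazy)   = $html =~ /<b>(.*?)<\/b>/;    # "one"

# a+ can give back one "a" so the final "a" matches ...
my $backtracks = "aaa" =~ /a+a/  ? 1 : 0;   # 1
# ... but possessive a++ keeps all three, so the match fails.
my $possessive = "aaa" =~ /a++a/ ? 1 : 0;   # 0

# {n,m} is greedy too: it takes as many as allowed, here three.
my ($run) = "xaaaaay" =~ /(a{2,3})/;        # "aaa"

print "$greedy | $lazy | $backtracks $possessive | $run\n";
```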

=head2 EXTENDED CONSTRUCTS

   (?#text)          A comment
   (?:...)           Groups subexpressions without capturing (cluster)
   (?pimsx-imsx:...) Enable/disable option (as per m// modifiers)
   (?=...)           Zero-width positive lookahead assertion
   (?!...)           Zero-width negative lookahead assertion
   (?<=...)          Zero-width positive lookbehind assertion
   (?<!...)          Zero-width negative lookbehind assertion
   (?>...)           Grab what we can, prohibit backtracking
   (?|...)           Branch reset
   (?<name>...)      Named capture
   (?'name'...)      Named capture
   (?P<name>...)     Named capture (python syntax)
   (?{ code })       Embedded code, return value becomes $^R
   (??{ code })      Dynamic regex, return value used as regex
   (?N)              Recurse into subpattern number N
   (?-N), (?+N)      Recurse into Nth previous/next subpattern
   (?R), (?0)        Recurse at the beginning of the whole pattern
   (?&name)          Recurse into a named subpattern
   (?P>name)         Recurse into a named subpattern (python syntax)
   (?(cond)yes|no)
   (?(cond)yes)      Conditional expression, where "cond" can be:
                     (?=pat)   look-ahead
                     (?!pat)   negative look-ahead
                     (?<=pat)  look-behind
                     (?<!pat)  negative look-behind
                     (N)       subpattern N has matched something
                     (<name>)  named subpattern has matched something
                     ('name')  named subpattern has matched something
                     (?{code}) code condition
                     (R)       true if recursing
                     (RN)      true if recursing into Nth subpattern
                     (R&name)  true if recursing into named subpattern
                     (DEFINE)  always false, no no-pattern allowed
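A few of the constructs above, sketched with invented inputs. Everything
here assumes Perl 5.10 or later; the names C<$paren> and C<$balanced> are
illustrative only.

```perl
use strict;
use warnings;

# Lookahead: digits only when followed by "USD"; "USD" is not consumed.
my ($amount) = "10EUR 25USD" =~ /(\d+)(?=USD)/;              # "25"

# Named captures are available through %+.
my %date;
if ("2024-05-17" =~ /(?<year>\d{4})-(?<month>\d\d)/) {
    %date = %+;
}

# Branch reset: both alternatives capture into $1.
my @ids = map { /(?|id=(\d+)|#(\d+))/ ? $1 : () } ("id=7", "#42");

# Conditional: require ")" only if group 1 matched an opening "(".
my $paren = qr/^(\()?\w+(?(1)\))$/;
my @ok = grep { $_ =~ $paren } ("abc", "(abc)", "(abc", "abc)");

# Recursion: (?1) re-enters group 1 to match nested parentheses.
my $balanced = qr/^(\((?:[^()]++|(?1))*\))$/;
my @nested = grep { $_ =~ $balanced } ("(a(b)c)", "(a(b)c");

print "$amount | $date{year} $date{month} | @ids | @ok | @nested\n";
```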

=head2 VARIABLES

   $_    Default variable for operators to use

   $`    Everything prior to matched string
   $&    Entire matched string
   $'    Everything after matched string

   ${^PREMATCH}   Everything prior to matched string
   ${^MATCH}      Entire matched string
   ${^POSTMATCH}  Everything after matched string

The use of C<$`>, C<$&> or C<$'> will slow down B<all> regex use
within your program. Consult L<perlvar> for C<@->
to see equivalent expressions that won't cause a slowdown.
See also L<Devel::SawAmpersand>. Starting with Perl 5.10, you
can also use the equivalent variables C<${^PREMATCH}>, C<${^MATCH}>
and C<${^POSTMATCH}>, but for them to be defined, you have to
specify the C</p> (preserve) modifier on your regular expression.

   $1, $2 ...  hold the Xth captured expr
   $+    Last parenthesized pattern match
   $^N   Holds the most recently closed capture
   $^R   Holds the result of the last (?{...}) expr
   @-    Offsets of starts of groups. $-[0] holds start of whole match
   @+    Offsets of ends of groups. $+[0] holds end of whole match
   %+    Named capture groups
   %-    Named capture groups, as array refs

Captured groups are numbered according to their I<opening> paren.
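As a sketch of how these variables relate to one match, the example below
collects several of them into a hash. The hash C<%seen>, its keys and the
input string are invented for the example; C</p> and the named captures
assume Perl 5.10 or later.

```perl
use strict;
use warnings;

my $str = "set key=value now";
my %seen;

# /p makes ${^PREMATCH}, ${^MATCH} and ${^POSTMATCH} available.
if ($str =~ /(?<k>\w+)=(?<v>\w+)/p) {
    %seen = (
        pre        => ${^PREMATCH},     # "set "
        match      => ${^MATCH},        # "key=value"
        post       => ${^POSTMATCH},    # " now"
        start      => $-[0],            # 4
        end        => $+[0],            # 13
        key        => $+{k},            # "key"
        key_list   => $-{k}[0],         # "key", via the array ref in %-
        last_paren => $+,               # "value"
        # @- and @+ give offsets, so group 2 can be cut out with substr.
        by_offset  => substr($str, $-[2], $+[2] - $-[2]),   # "value"
    );
}

print "$_=$seen{$_}\n" for sort keys %seen;
```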

=head2 FUNCTIONS

   lc          Lowercase a string
   lcfirst     Lowercase first char of a string
   uc          Uppercase a string
   ucfirst     Titlecase first char of a string

   pos         Return or set current match position
   quotemeta   Quote metacharacters
   reset       Reset ?pattern? status
   study       Analyze string for optimizing matching

   split       Use a regex to split a string into parts

The first four of these are like the escape sequences C<\L>, C<\l>,
C<\U>, and C<\u>.  For Titlecase, see L</Titlecase>.
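A short sketch of some of these functions working alongside a regex. The
sample data and variable names are assumptions for illustration.

```perl
use strict;
use warnings;

# quotemeta escapes metacharacters so the text matches only itself.
my $safe    = quotemeta("1.5");                      # "1\.5"
my $exact   = "1.5" =~ /\A$safe\z/ ? 1 : 0;          # 1
my $similar = "1x5" =~ /\A$safe\z/ ? 1 : 0;          # 0

# split takes a regex as its separator.
my @fields = split /\s*,\s*/, "a , b,c";             # ("a", "b", "c")

# pos reports where the last m//g match on the string ended.
my $s = "aXbXc";
my @ends;
while ($s =~ /X/g) {
    push @ends, pos($s);                             # 2, then 4
}

# lc and ucfirst combine to normalise capitalisation.
my $tidy = ucfirst(lc("pERL"));                      # "Perl"

print "$safe $exact $similar | @fields | @ends | $tidy\n";
```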

=head2 TERMINOLOGY

=head3 Titlecase

Unicode concept which most often is equal to uppercase, but for
certain characters like the German "sharp s" there is a difference.
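A minimal sketch of the difference, using the digraph letter U+01C6
(LATIN SMALL LETTER DZ WITH CARON) rather than the sharp s, because its
titlecase and uppercase forms are distinct single characters. The code
compares code points instead of printing the characters.

```perl
use strict;
use warnings;

my $dz = "\x{01C6}";          # lowercase digraph

my $title = ucfirst($dz);     # titlecase form, U+01C5
my $upper = uc($dz);          # uppercase form, U+01C4
my $via_u = "\u$dz";          # \u titlecases, like ucfirst

printf "title=U+%04X upper=U+%04X escape=U+%04X\n",
    ord($title), ord($upper), ord($via_u);
```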

=head1 AUTHOR

Iain Truskett. Updated by the Perl 5 Porters.

This document may be distributed under the same terms as Perl itself.

=head1 SEE ALSO

=over 4

=item *

L<perlretut> for a tutorial on regular expressions.

=item *

L<perlrequick> for a rapid tutorial.

=item *

L<perlre> for more details.

=item *

L<perlvar> for details on the variables.

=item *

L<perlop> for details on the operators.

=item *

L<perlfunc> for details on the functions.

=item *

L<perlfaq6> for FAQs on regular expressions.

=item *

L<perlrebackslash> for a reference on backslash sequences.

=item *

L<perlrecharclass> for a reference on character classes.

=item *

The L<re> module to alter behaviour and aid
debugging.

=item *

L<perldebug/"Debugging Regular Expressions">

=item *

L<perluniintro>, L<perlunicode>, L<charnames> and L<perllocale>
for details on regexes and internationalisation.

=item *

I<Mastering Regular Expressions> by Jeffrey Friedl
(F<http://oreilly.com/catalog/9780596528126/>) for a thorough grounding and
reference on the topic.

=back

=head1 THANKS

David P.C. Wollmann,
Richard Soderberg,
Sean M. Burke,
Tom Christiansen,
Jim Cromie,
and
Jeffrey Goff
for useful advice.

=cut