Warn on unescaped /[]}]/ under re strict

diff --git a/pod/perldiag.pod b/pod/perldiag.pod
index bc08652..4de4574 100644
--- a/pod/perldiag.pod
+++ b/pod/perldiag.pod
@@ -2011,7 +2011,7 @@ than to create a dangling reference.
  
  =item Did not produce a valid header
  
-See Server error.
+See L</500 Server error>.
  
  =item %s did not return a true value
  
@@ -2044,7 +2044,7 @@ you called it with no args and C<$@> was empty.
  
  =item Document contains no data
  
-See Server error.
+See L</500 Server error>.
  
  =item %s does not define %s::VERSION--version check failed
  
@@ -2733,6 +2733,26 @@ parent '%s'
  C3-consistent, and you have enabled the C3 MRO for this class.  See the C3
  documentation in L<mro> for more information.
  
+=item Indentation on line %d of here-doc doesn't match delimiter
+
+(F) You have an indented here-document where one or more of its lines
+have whitespace at the beginning that does not match the closing
+delimiter.
+
+For example, line 2 below is wrong because it does not have at least
+2 spaces, but lines 1 and 3 are fine because they have at least 2:
+
+    if ($something) {
+      print <<~EOF;
+        Line 1
+       Line 2 not
+          Line 3
+        EOF
+    }
+
+Note that tabs and spaces are compared strictly, meaning 1 tab will
+not match 8 spaces.
+
  =item Infinite recursion in regex
  
  (F) You used a pattern that references itself without consuming any input
@@ -2741,11 +2761,10 @@ either consume text or fail.
  
  =item Initialization of state variables in list context currently forbidden
  
-(F) Currently the implementation of "state" only permits the
-initialization of scalar variables in scalar context.  Re-write
-C<state ($a) = 42> as C<state $a = 42> to change from list to scalar
-context.  Constructions such as C<state (@a) = foo()> will be
-supported in a future perl release.
+(F) C<state> only permits initializing a single scalar variable, in scalar
+context.  So C<state $a = 42> is allowed, but not C<state ($a) = 42>.  To apply
+state semantics to a hash or array, store a hash or array reference in a
+scalar variable.
  
  =item %%s[%s] in scalar context better written as $%s[%s]
  
@@ -3074,7 +3093,7 @@ L<perlrebackslash/\b{}, \b, \B{}, \B>.
  
  =item %s() is deprecated on :utf8 handles
  
-(W deprecated) The sysread(), recv(), syswrite() and send() operators are
+(D deprecated) The sysread(), recv(), syswrite() and send() operators are
  deprecated on handles that have the C<:utf8> layer, either explicitly, or
  implicitly, eg., with the C<:encoding(UTF-16LE)> layer.
  
@@ -3345,10 +3364,13 @@ Perhaps the function's author was trying to write a subroutine signature
  but didn't enable that feature first (C<use feature 'signatures'>),
  so the signature was instead interpreted as a bad prototype.
  
-=item Malformed UTF-8 character (%s)
+=item Malformed UTF-8 character%s
  
-(S utf8)(F) Perl detected a string that didn't comply with UTF-8
-encoding rules, even though it had the UTF8 flag on.
+(S utf8)(F) Perl detected a string that should be UTF-8, but didn't
+comply with UTF-8 encoding rules, or represents a code point whose
+ordinal integer value doesn't fit into the word size of the current
+platform (overflows).  Details as to the exact malformation are given in
+the variable, C<%s>, part of the message.
  
  One possible cause is that you set the UTF8 flag yourself for data that
  you thought to be in UTF-8 but it wasn't (it was for example legacy
@@ -3468,7 +3490,7 @@ doesn't resolve to a valid subroutine.  See L<overload>.
  
  =item Method %s not permitted
  
-See Server error.
+See L</500 Server error>.
  
  =item Might be a runaway multi-line %s string starting on line %d
  
@@ -3546,11 +3568,12 @@ can vary from one line to the next.
  (S syntax) This is an educated guess made in conjunction with the message
  "%s found where operator expected".  Often the missing operator is a comma.
  
-=item Missing or undefined argument to require
+=item Missing or undefined argument to %s
  
-(F) You tried to call require with no argument or with an undefined
+(F) You tried to call require or do with no argument or with an undefined
  value as an argument.  Require expects either a package name or a
-file-specification as an argument.  See L<perlfunc/require>.
+file-specification as an argument; do expects a filename.  See
+L<perlfunc/require EXPR> and L<perlfunc/do EXPR>.
  
  =item Missing right brace on \%c{} in regex; marked by S<<-- HERE> in m/%s/
  
@@ -4207,14 +4230,6 @@ C<sysread()>ing a file, or when seeking past the end of a scalar opened
  for I/O (in anticipation of future reads and to imitate the behavior
  with real files).
  
-=item Only one /x regex modifier is allowed
-
-=item Only one /x regex modifier is allowed in regex; marked by <-- HERE in m/%s/
-
-(F) You used the C</x> regular expression pattern modifier at least twice in a
-string of modifiers.  This has been made illegal, in order to allow future
-extensions to the Perl language.
-
  =item %s() on unopened %s
  
  (W unopened) An I/O operation was attempted on a filehandle that was
@@ -4431,10 +4446,6 @@ able to initialize properly.
  
  (P) Failed an internal consistency check trying to compile a grep.
  
-=item panic: ck_split, type=%u
-
-(P) Failed an internal consistency check trying to compile a split.
-
  =item panic: corrupt saved stack index %ld
  
  (P) The savestack was requested to restore more localized values than
@@ -4561,10 +4572,6 @@ and freeing temporaries and lexicals from.
  (P) The internal pp_match() routine was called with invalid operational
  data.
  
-=item panic: pp_split, pm=%p, s=%p
-
-(P) Something terrible went wrong in setting up for the split.
-
  =item panic: realloc, %s
  
  (P) Something requested a negative number of bytes of realloc.
@@ -4656,17 +4663,6 @@ Remember that "my", "our", "local" and "state" bind tighter than comma.
  (F) Parsing code supplied by an extension violated the parser's API in
  a detectable way.
  
-=item Passing malformed UTF-8 to "%s" is deprecated
-
-(D deprecated, utf8) This message indicates a bug either in the Perl
-core or in XS code.  Such code was trying to find out if a character,
-allegedly stored internally encoded as UTF-8, was of a given type, such
-as being punctuation or a digit.  But the character was not encoded in
-legal UTF-8.  The C<%s> is replaced by a string that can be used by
-knowledgeable people to determine what the type being checked against
-was.  If C<utf8> warnings are enabled, a further message is raised,
-giving details of the malformation.
-
  =item Pattern subroutine nesting without pos change exceeded limit in regex
  
  (F) You used a pattern that uses too many nested subpattern calls without
@@ -4970,7 +4966,7 @@ of "||".
  
  =item Premature end of script headers
  
-See Server error.
+See L</500 Server error>.
  
  =item printf() on closed filehandle %s
  
@@ -5466,7 +5462,7 @@ in the regular expression the problem was discovered.
  (F) An C<(?R)> or C<(?0)> sequence in a regular expression was missing the
  final parenthesis.
  
-=item Server error (a.k.a. "500 Server error")
+=item Z<>500 Server error
  
  (A) This is the error message generally seen in a browser window
  when trying to run a CGI program (including SSI) over the web.  The
@@ -6331,6 +6327,26 @@ as the first character following a quantifier
  
  =back
  
+=item Unescaped literal '%c' in regex; marked by <-- HERE in m/%s/
+
+(W regexp) (only under C<S<use re 'strict'>>)
+
+Within the scope of C<S<use re 'strict'>> in a regular expression
+pattern, you included an unescaped C<}> or C<]> which was interpreted
+literally.  These two characters are sometimes metacharacters, and
+sometimes literals, depending on what precedes them in the
+pattern.  This is unlike the similar C<)> which is always a
+metacharacter unless escaped.
+
+This action at a distance, perhaps a large distance, can lead to Perl
+silently misinterpreting what you meant, so when you specify that you
+want extra checking by C<S<use re 'strict'>>, this warning is generated.
+If you meant the character as a literal, simply confirm that to Perl by
+preceding the character with a backslash, or make it into a bracketed
+character class (like C<[}]>).  If you meant it as closing a
+corresponding C<[> or C<{>, you'll need to look back through the pattern
+to find out why that isn't happening.
+
  =item unexec of %s into %s failed!
  
  (F) The unexec() routine failed for some reason.  See your local FSF
@@ -6922,6 +6938,13 @@ separated by commas, not just aligned on a line.
  it may skip items, or visit items more than once.  Consider using
  C<keys()> instead of C<each()>.
  
+=item Infinite recursion via empty pattern
+
+(F) You tried to use the empty pattern inside of a regex code block,
+for instance C</(?{ s!!! })/>, which resulted in re-executing
+the same pattern, which is an infinite loop which is broken by
+throwing an exception.
+
  =item Use of := for an empty attribute list is not allowed
  
  (F) The construction C<my $x := 42> used to parse as equivalent to
@@ -7053,6 +7076,31 @@ arguments and at least one of them is tainted.  This used to be allowed
  but will become a fatal error in a future version of perl.  Untaint your
  arguments.  See L<perlsec>.
  
+=item Use of unassigned code point or non-standalone grapheme for a
+delimiter will be a fatal error starting in Perl v5.30
+
+(D deprecated)
+A grapheme is what appears to a native-speaker of a language to be a
+character.  In Unicode (and hence Perl) a grapheme may actually be
+several adjacent characters that together form a complete grapheme.  For
+example, there can be a base character, like "R" and an accent, like a
+circumflex "^", that appear when displayed to be a single character with
+the circumflex hovering over the "R".  Perl currently allows things like
+that circumflex to be delimiters of strings, patterns, I<etc>.  When
+displayed, the circumflex would look like it belongs to the character
+just to the left of it.  In order to move the language to be able to
+accept graphemes as delimiters, we have to deprecate the use of
+delimiters which aren't graphemes by themselves.  Also, a delimiter must
+already be assigned (or known to be never going to be assigned) to try
+to future-proof code, for otherwise code that works today would fail to
+compile if the currently unassigned delimiter ends up being something
+that isn't a stand-alone grapheme.  Because Unicode is never going to
+assign
+L<non-character code points|perlunicode/Noncharacter code points>, nor
+L<code points that are above the legal Unicode maximum|
+perlunicode/Beyond Unicode code points>, those can be delimiters, and
+their use won't raise this warning.
+
  =item Use of uninitialized value%s
  
  (W uninitialized) An undefined value was used as if it were already
@@ -7192,7 +7240,7 @@ front of your variable.
  (F) Lookbehind is allowed only for subexpressions whose length is fixed and
  known at compile time.  For positive lookbehind, you can use the C<\K>
  regex construct as a way to get the equivalent functionality.  See
-L<perlre/(?<=pattern) \K>.
+L<(?<=pattern) and \K in perlre|perlre/\K>.
  
  There are non-obvious Unicode rules under C</i> that can match variably,
  but which you might not think could.  For example, the substring C<"ss">