Use "UTF-8" consistently in perldelta

diff --git a/pod/perldiag.pod b/pod/perldiag.pod
index 20696d8..93ae13b 100644
--- a/pod/perldiag.pod
+++ b/pod/perldiag.pod
@@ -189,6 +189,11 @@ alternatives.
  that expected a numeric value instead.  If you're fortunate the message
  will identify which operator was so unfortunate.
  
+Note that for C<Inf> and C<NaN> (infinity and not-a-number) the
+definition of "numeric" is somewhat unusual: the strings themselves
+(like "Inf") are considered numeric, and anything following them is
+considered non-numeric.
+
  =item Argument list not closed for PerlIO layer "%s"
  
  (W layer) When pushing a layer with arguments onto the Perl I/O
@@ -469,6 +474,11 @@ that wasn't a symbol table entry.
  (P) An internal request asked to add a hash entry to something that
  wasn't a symbol table entry.
  
+=item Bad symbol for scalar
+
+(P) An internal request asked to add a scalar entry to something that
+wasn't a symbol table entry.
+
  =item Bareword found in conditional
  
  (W bareword) The compiler found a bareword where it expected a
@@ -549,6 +559,22 @@ copiable.
  (P) When starting a new thread or returning values from a thread, Perl
  encountered an invalid data type.
  
+=item Both or neither range ends should be Unicode in regex; marked by
+S<<-- HERE> in m/%s/
+
+(W regexp) (only under C<S<use re 'strict'>> or within C<(?[...])>)
+
+In a bracketed character class in a regular expression pattern, you
+had a range which has exactly one end of it specified using C<\N{}>, and
+the other end is specified using a non-portable mechanism.  Perl treats
+the range as a Unicode range, that is, all the characters in it are
+considered to be Unicode characters, which may have different code
+points on some platforms Perl runs on.  For example, C<[\N{U+06}-\x08]>
+is treated as if you had instead said C<[\N{U+06}-\N{U+08}]>, that is it
+matches the characters whose code points in Unicode are 6, 7, and 8.
+But that C<\x08> might indicate that you meant something different, so
+the warning gets raised.
+
  =item Buffer overflow in prime_env_iter: %s
  
  (W internal) A warning peculiar to VMS.  While Perl was preparing to
@@ -1533,7 +1559,7 @@ defined in the C<:alias> import argument to C<use charnames>, but they
  could be defined by a translator installed into C<$^H{charnames}>.
  See L<charnames/CUSTOM ALIASES>.
  
-=item \C is deprecated in regex; marked by <-- HERE in m/%s/
+=item \C is deprecated in regex; marked by S<<-- HERE> in m/%s/
  
  (D deprecated, regexp) The \C character class is deprecated, and will
  become a compile-time error in a future release of perl (tentatively
@@ -1692,7 +1718,20 @@ workarounds.
  (F) The parser found inconsistencies either while attempting
  to define an overloaded constant, or when trying to find the
  character name specified in the C<\N{...}> escape.  Perhaps you
-forgot to load the corresponding L<overload> pragma?.
+forgot to load the corresponding L<overload> pragma?
+
+=item :const is experimental
+
+(S experimental::const_attr) The "const" attribute is experimental.
+If you want to use the feature, disable the warning with C<no warnings
+'experimental::const_attr'>, but know that in doing so you are taking
+the risk that your code may break in a future Perl version.
+
+=item :const is not permitted on named subroutines
+
+(F) The "const" attribute causes an anonymous subroutine to be run and
+its value captured at the time that it is cloned.  Named subroutines are
+not cloned like this, so the attribute does not make sense on them.
  
  =item Copy method did not return a reference
  
@@ -1762,7 +1801,7 @@ S<<-- HERE> in m/%s/
  most likely cause of this error is that you left out a parenthesis inside
  of the C<....> part.
  
-The <-- HERE shows whereabouts in the regular expression the problem was
+The S<<-- HERE> shows whereabouts in the regular expression the problem was
  discovered.
  
  =item %s defines neither package nor VERSION--version check failed
@@ -1985,7 +2024,7 @@ S<<-- HERE> in m/%s/
  (F) You used a pattern that nested too many EVAL calls without consuming
  any text.  Restructure the pattern so that text is consumed.
  
-The <-- HERE shows whereabouts in the regular expression the problem was
+The S<<-- HERE> shows whereabouts in the regular expression the problem was
  discovered.
  
  =item Excessively long <> operator
@@ -2728,13 +2767,6 @@ instead, except within S<C<(?[   ])>>, where it is a fatal error.
  The S<<-- HERE> shows whereabouts in the regular expression the
  escape was discovered.
  
-=item %s: Invalid handshake key got %p needed %p, binaries are mismatched
-
-(P) A dynamic loading library C<.so> or C<.dll> was being loaded into the
-process that was built against a different build of perl than the
-said library was compiled against.  Reinstalling the XS module will
-likely fix this error.
-
  =item Invalid hexadecimal number in \N{U+...}
  
  =item Invalid hexadecimal number in \N{U+...} in regex; marked by
@@ -2770,6 +2802,13 @@ character (U+FFFD).
  with the B<-D> option with no flags to see the list of acceptable values.
  See also L<perlrun/-Dletters>.
  
+=item Invalid quantifier in {,} in regex; marked by S<<-- HERE> in m/%s/
+
+(F) The pattern looks like a {min,max} quantifier, but the min or max
+could not be parsed as a valid number - either it has leading zeroes,
+or it represents too big a number to cope with.  The S<<-- HERE> shows
+where in the regular expression the problem was discovered.  See L<perlre>.
+
  =item Invalid [] range "%s" in regex; marked by S<<-- HERE> in m/%s/
  
  (F) The range specified in a character class had a minimum character
@@ -2862,6 +2901,20 @@ with 'useperlio'.
  (F) Your machine doesn't implement the sockatmark() functionality,
  neither as a system call nor an ioctl call (SIOCATMARK).
  
+=item '%s' is an unknown bound type in regex; marked by S<<-- HERE> in m/%s/
+
+(F) You used C<\b{...}> or C<\B{...}> and the C<...> is not known to
+Perl.  The current valid ones are given in
+L<perlrebackslash/\b{}, \b, \B{}, \B>.
+
+=item "%s" is more clearly written simply as "%s" in regex; marked by S<<-- HERE> in m/%s/
+
+(W regexp) (only under C<S<use re 'strict'>> or within C<(?[...])>)
+
+You specified a character that has the given plainer way of writing it,
+and which is also portable to platforms running with different character
+sets.
+
  =item $* is no longer supported
  
  (D deprecated, syntax) The special variable C<$*>, deprecated in older
@@ -2980,18 +3033,25 @@ L<perlfunc/listen>.
  form of C<open> does not support pipes, such as C<open($pipe, '|-', @args)>.
  Use the two-argument C<open($pipe, '|prog arg1 arg2...')> form instead.
  
+=item %s: loadable library and perl binaries are mismatched (got handshake key %p, needed %p)
+
+(P) A dynamic loading library C<.so> or C<.dll> was being loaded into the
+process that was built against a different build of perl than the
+said library was compiled against.  Reinstalling the XS module will
+likely fix this error.
+
  =item Locale '%s' may not work well.%s
  
-(W locale) The named locale that Perl is now trying to use is not fully
-compatible with Perl.  The second C<%s> gives a reason.
+(W locale) You are using the named locale, which is a non-UTF-8 one, and
+which Perl has determined is not fully compatible with Perl.  The second
+C<%s> gives a reason.
  
  By far the most common reason is that the locale has characters in it
  that are represented by more than one byte.  The only such locales that
  Perl can handle are the UTF-8 locales.  Most likely the specified locale
  is a non-UTF-8 one for an East Asian language such as Chinese or
  Japanese.  If the locale is a superset of ASCII, the ASCII portion of it
-may work in Perl.  Read on for problems when it isn't a superset of
-ASCII.
+may work in Perl.
  
  Some essentially obsolete locales that aren't supersets of ASCII, mainly
  those in ISO 646 or other 7-bit locales, such as ASMO 449, can also have
@@ -2999,6 +3059,18 @@ problems, depending on what portions of the ASCII character set get
  changed by the locale and are also used by the program.
  The warning message lists the determinable conflicting characters.
  
+Note that not all incompatibilities are found.
+
+If this happens to you, there's not much you can do except switch to a
+different locale or use L<Encode> to translate from the locale into
+UTF-8; if that's impracticable, you have been warned that some things
+may break.
+
+This message is output once each time a bad locale is switched into
+within the scope of C<S<use locale>>, or on the first possibly-affected
+operation if the C<S<use locale>> inherits a bad one.  It is not raised
+for any operations from the L<POSIX> module.
+
  =item localtime(%f) failed
  
  (W overflow) You called C<localtime> with a number that it could not handle:
@@ -3540,8 +3612,7 @@ bracketed character class, for the same reason that C<.> in a character
  class loses its specialness: it matches almost everything, which is
  probably not what you want.
  
-=item \N{} in inverted character class or as a range end-point is restricted to one character in regex; marked
-by S<<-- HERE> in m/%s/
+=item \N{} in inverted character class or as a range end-point is restricted to one character in regex; marked by <-- HERE in m/%s/
  
  (F) Named Unicode character escapes (C<\N{...}>) may return a
  multi-character sequence.  Even though a character class is
@@ -3687,6 +3758,12 @@ in the remaining packages of the MRO of this class.  If you don't want
  it throwing an exception, use C<maybe::next::method>
  or C<next::can>.  See L<mro>.
  
+=item Non-finite repeat count does nothing
+
+(W numeric) You tried to execute the
+L<C<x>|perlop/Multiplicative Operators> repetition operator C<Inf> (or
+C<-Inf>) or C<NaN> times, which doesn't make sense.
+
  =item Non-hex character in regex; marked by S<<-- HERE> in m/%s/
  
  (F) In a regular expression, there was a non-hexadecimal character where
@@ -3735,7 +3812,7 @@ find the name of the file to which to write data destined for stdout.
  
  (F) Fully qualified variable names are not allowed in "our"
  declarations, because that doesn't make much sense under existing
-semantics.  Such syntax is reserved for future extensions.
+rules.  Such syntax is reserved for future extensions.
  
  =item No Perl script found in input
  
@@ -3990,7 +4067,7 @@ the C<fallback> overloading key is specified to be true.  See L<overload>.
  
  =item Operation "%s" returns its argument for non-Unicode code point 0x%X
  
-(S non_unicode) You performed an operation requiring Unicode semantics
+(S non_unicode) You performed an operation requiring Unicode rules
  on a code point that is not in Unicode, so what it should do is not
  defined.  Perl has chosen to have it do nothing, and warn you.
  
@@ -4003,9 +4080,9 @@ C<no warnings 'non_unicode';>.
  =item Operation "%s" returns its argument for UTF-16 surrogate U+%X
  
  (S surrogate) You performed an operation requiring Unicode
-semantics on a Unicode surrogate.  Unicode frowns upon the use
+rules on a Unicode surrogate.  Unicode frowns upon the use
  of surrogates for anything but storing strings in UTF-16, but
-semantics are (reluctantly) defined for the surrogates, and
+rules are (reluctantly) defined for the surrogates, and
  they are to do nothing for this operation.  Because the use of
  surrogates can be dangerous, Perl warns.
  
@@ -4522,7 +4599,7 @@ take the risk of using this feature, simply disable this warning:
  
      no warnings "experimental::autoderef";
  
-=item POSIX class [:%s:] unknown in regex; marked by S<< <-- HERE in m/%s/ >>
+=item POSIX class [:%s:] unknown in regex; marked by S<<-- HERE> in m/%s/
  
  (F) The class in the character class [: :] syntax is unknown.  The S<<-- HERE>
  shows whereabouts in the regular expression the problem was discovered.
@@ -4639,7 +4716,7 @@ Note this may be also triggered for constructs like:
  
      sub { 1 if die; }
  
-=item Possible precedence problem on bitwise %c operator
+=item Possible precedence problem on bitwise %s operator
  
  (W precedence) Your program uses a bitwise logical operator in conjunction
  with a numeric comparison operator, like this :
@@ -4764,14 +4841,13 @@ take the risk of using this feature, simply disable this warning:
  
      no warnings "experimental::autoderef";
  
-=item Quantifier follows nothing in regex; marked by S<< <-- HERE in m/%s/ >>
+=item Quantifier follows nothing in regex; marked by S<<-- HERE> in m/%s/
  
  (F) You started a regular expression with a quantifier.  Backslash it if
  you meant it literally.  The S<<-- HERE> shows whereabouts in the regular
  expression the problem was discovered.  See L<perlre>.
  
-=item Quantifier in {,} bigger than %d in regex; marked by S<<-- HERE> in
-m/%s/
+=item Quantifier in {,} bigger than %d in regex; marked by S<<-- HERE> in m/%s/
  
  (F) There is currently a limit to the size of the min and max values of
  the {min,max} construct.  The S<<-- HERE> shows whereabouts in the regular
@@ -4800,6 +4876,45 @@ are outside the range which can be represented by integers internally.
  One possible workaround is to force Perl to use magical string increment
  by prepending "0" to your numbers.
  
+=item Ranges of ASCII printables should be some subset of "0-9", "A-Z", or
+"a-z" in regex; marked by S<<-- HERE> in m/%s/
+
+(W regexp) (only under C<S<use re 'strict'>> or within C<(?[...])>)
+
+Stricter rules help to find typos and other errors.  Perhaps you didn't
+even intend a range here, if the C<"-"> was meant to be some other
+character, or should have been escaped (like C<"\-">).  If you did
+intend a range, the one that was used is not portable between ASCII and
+EBCDIC platforms, and doesn't have an obvious meaning to a casual
+reader.
+
+ [3-7]    # OK; Obvious and portable
+ [d-g]    # OK; Obvious and portable
+ [A-Y]    # OK; Obvious and portable
+ [A-z]    # WRONG; Not portable; not clear what is meant
+ [a-Z]    # WRONG; Not portable; not clear what is meant
+ [%-.]    # WRONG; Not portable; not clear what is meant
+ [\x41-Z] # WRONG; Not portable; not obvious to non-geek
+
+(You can force portability by specifying a Unicode range, which means that
+the endpoints are specified by
+L<C<\N{...}>|perlrecharclass/Character Ranges>, but the meaning may
+still not be obvious.)
+The stricter rules require that ranges that start or stop with an ASCII
+character that is not a control have all their endpoints be the literal
+character, and not some escape sequence (like C<"\x41">), and the ranges
+must be all digits, or all uppercase letters, or all lowercase letters.
+
+=item Ranges of digits should be from the same group in regex; marked by
+S<<-- HERE> in m/%s/
+
+(W regexp) (only under C<S<use re 'strict'>> or within C<(?[...])>)
+
+Stricter rules help to find typos and other errors.  You included a
+range, and at least one of the end points is a decimal digit.  Under the
+stricter rules, when this happens, both end points should be digits in
+the same group of 10 consecutive digits.
+
  =item readdir() attempted on invalid dirhandle %s
  
  (W io) The dirhandle you're reading from is either closed or not really
@@ -4894,7 +5009,7 @@ not at least seven sets of capturing parentheses in the expression.  If
  you wanted to have the character with ordinal 7 inserted into the regular
  expression, prepend zeroes to make it three digits long: C<\007>
  
-The <-- HERE shows whereabouts in the regular expression the problem was
+The S<<-- HERE> shows whereabouts in the regular expression the problem was
  discovered.
  
  =item Reference to nonexistent named group in regex; marked by S<<-- HERE>
@@ -4905,7 +5020,7 @@ expression, but there is no corresponding named capturing parentheses
  such as C<(?'NAME'...)> or C<< (?<NAME>...) >>.  Check if the name has been
  spelled correctly both in the backreference and the declaration.
  
-The <-- HERE shows whereabouts in the regular expression the problem was
+The S<<-- HERE> shows whereabouts in the regular expression the problem was
  discovered.
  
  =item Reference to nonexistent or unclosed group in regex; marked by
@@ -4915,7 +5030,7 @@ S<<-- HERE> in m/%s/
  are not at least seven sets of closed capturing parentheses in the
  expression before where the C<\g{-7}> was located.
  
-The <-- HERE shows whereabouts in the regular expression the problem was
+The S<<-- HERE> shows whereabouts in the regular expression the problem was
  discovered.
  
  =item regexp memory corruption
@@ -5223,7 +5338,7 @@ L<perlfunc/setsockopt>.
  
  =item Setting ${^ENCODING} is deprecated
  
-(D deprecated) You assiged a non-C<undef> value to C<${^ENCODING}>.
+(D deprecated) You assigned a non-C<undef> value to C<${^ENCODING}>.
  This is deprecated; see C<L<perlvar/${^ENCODING}>> for details.
  
  =item Setting $/ to a reference to %s as a form of slurp is deprecated, treating as undef
@@ -5440,6 +5555,15 @@ the previous instance.  This is almost always a typographical error.
  Note that the earlier subroutine will still exist until the end of
  the scope or until all closure references to it are destroyed.
  
+=item Subroutine %s redefined
+
+(W redefine) You redefined a subroutine.  To suppress this warning, say
+
+    {
+       no warnings 'redefine';
+       eval "sub name { ... }";
+    }
+
  =item Subroutine "%s" will not stay shared
  
  (W closure) An inner (nested) I<named> subroutine is referencing a "my"
@@ -5459,15 +5583,6 @@ anonymous, using the C<sub {}> syntax.  When inner anonymous subs that
  reference lexical subroutines in outer subroutines are created, they
  are automatically rebound to the current values of such lexical subs.
  
-=item Subroutine %s redefined
-
-(W redefine) You redefined a subroutine.  To suppress this warning, say
-
-    {
-       no warnings 'redefine';
-       eval "sub name { ... }";
-    }
-
  =item Substitution loop
  
  (P) The substitution was looping infinitely.  (Obviously, a substitution
@@ -5534,7 +5649,7 @@ is not known.  The condition must be one of the following:
   (R&NAME)           true if directly inside named capture
   (DEFINE)           always false; for defining named subpatterns
  
-The <-- HERE shows whereabouts in the regular expression the problem was
+The S<<-- HERE> shows whereabouts in the regular expression the problem was
  discovered.  See L<perlre>.
  
  =item Switch (?(condition)... not terminated in regex; marked by
@@ -5646,6 +5761,18 @@ as a compiler directive.  You may say only one of
  This is to prevent the problem of one module changing the array base out
  from under another module inadvertently.  See L<perlvar/$[> and L<arybase>.
  
+=item The bitwise feature is experimental
+
+(S experimental::bitwise) This warning is emitted if you use bitwise
+operators (C<& | ^ ~ &. |. ^. ~.>) with the "bitwise" feature enabled.
+Simply suppress the warning if you want to use the feature, but know
+that in doing so you are taking the risk of using an experimental
+feature which may change or be removed in a future Perl version:
+
+    no warnings "experimental::bitwise";
+    use feature "bitwise";
+    $x |.= $y;
+
  =item The crypt() function is unimplemented due to excessive paranoia.
  
  (F) Configure couldn't find the crypt() function on your machine,
@@ -5935,7 +6062,7 @@ C<undef *foo>.
  Check the #! line, or manually feed your script into Perl yourself.
  
  =item Unescaped left brace in regex is deprecated, passed through in regex;
-marked by <-- HERE in m/%s/
+marked by S<<-- HERE> in m/%s/
  
  (D deprecated, regexp) You used a literal C<"{"> character in a regular
  expression pattern.  You should change to use C<"\{"> instead, because a
@@ -6108,7 +6235,7 @@ is not known.  The condition must be one of the following:
   (R&NAME)           true if directly inside named capture
   (DEFINE)           always false; for defining named subpatterns
  
-The <-- HERE shows whereabouts in the regular expression the problem was
+The S<<-- HERE> shows whereabouts in the regular expression the problem was
  discovered.  See L<perlre>.
  
  =item Unknown Unicode option letter '%c'
@@ -6387,7 +6514,7 @@ must be written as
  
      if ($string =~ /$pattern/) { ... }
  
-The <-- HERE shows whereabouts in the regular expression the problem was
+The S<<-- HERE> shows whereabouts in the regular expression the problem was
  discovered.  See L<perlre>.
  
  =item Useless localization of %s
@@ -6408,9 +6535,16 @@ must be written as
  
      if ($string =~ /$pattern/o) { ... }
  
-The <-- HERE shows whereabouts in the regular expression the problem was
+The S<<-- HERE> shows whereabouts in the regular expression the problem was
  discovered.  See L<perlre>.
  
+=item Useless use of attribute "const"
+
+(W misc) The "const" attribute has no effect except
+on anonymous closure prototypes.  You applied it to
+a subroutine via L<attributes.pm|attributes>.  This is only useful
+inside an attribute handler for an anonymous subroutine.
+
  =item Useless use of /d modifier in transliteration operator
  
  (W misc) You have used the /d modifier where the searchlist has the
@@ -6515,6 +6649,15 @@ is deprecated.  See L<perlvar/"$[">.
  form if you wish to use an empty line as the terminator of the
  here-document.
  
+=item Use of \b{} for non-UTF-8 locale is wrong.  Assuming a UTF-8 locale
+
+(W locale) You are matching a regular expression using locale rules,
+and a Unicode boundary is being matched, but the locale is not a Unicode
+one.  This doesn't make sense.  Perl will continue, assuming a Unicode
+(UTF-8) locale, but the results could well be wrong except if the locale
+happens to be ISO-8859-1 (Latin1) where this message is spurious and can
+be ignored.
+
  =item Use of chdir('') or chdir(undef) as chdir() deprecated
  
  (D deprecated) chdir() with no arguments is documented to change to
@@ -6699,6 +6842,14 @@ optimized into C<"that " . $foo>, and the warning will refer to the
  C<concatenation (.)> operator, even though there is no C<.> in
  your program.
  
+=item "use re 'strict'" is experimental
+
+(S experimental::re_strict) The things that are different when a regular
+expression pattern is compiled under C<'strict'> are subject to change
+in future Perl releases in incompatible ways.  This means that a pattern
+that compiles today may not in a future Perl release.  This warning is
+to alert you to that risk.
+
  =item Use \x{...} for more than two hex characters in regex; marked by
  S<<-- HERE> in m/%s/
  
@@ -6728,6 +6879,15 @@ a range.  For these, what should happen isn't clear at all.  In
  these circumstances, Perl discards all but the first character
  of the returned sequence, which is not likely what you want.
  
+=item Using /u for '%s' instead of /%s in regex; marked by S<<-- HERE> in m/%s/
+
+(W regexp) You used a Unicode boundary (C<\b{...}> or C<\B{...}>) in a
+portion of a regular expression where the character set modifiers C</a>
+or C</aa> are in effect.  These two modifiers indicate an ASCII
+interpretation, and this doesn't make sense for a Unicode definition.
+The generated regular expression will compile so that the boundary uses
+all of Unicode.  No other portion of the regular expression is affected.
+
  =item Using !~ with %s doesn't make sense
  
  (F) Using the C<!~> operator with C<s///r>, C<tr///r> or C<y///r> is
@@ -6941,6 +7101,20 @@ warning is to add C<no warnings 'utf8';> but that is often closer to
  cheating.  In general, you are supposed to explicitly mark the
  filehandle with an encoding, see L<open> and L<perlfunc/binmode>.
  
+=item Wide character (U+%X) in %s
+
+(W locale) While in a single-byte locale (I<i.e.>, a non-UTF-8
+one), a multi-byte character was encountered.  Perl considers this
+character to be the specified Unicode code point.  Combining non-UTF-8
+locales and Unicode is dangerous.  Almost certainly some characters
+will have two different representations.  For example, in the ISO 8859-7
+(Greek) locale, the code point 0xC3 represents a Capital Gamma.  But so
+also does 0x393.  This will make string comparisons unreliable.
+
+You likely need to figure out how this multi-byte character got mixed up
+with your single-byte locale (or perhaps you thought you had a UTF-8
+locale, but Perl disagrees).
+
  =item Within []-length '%c' not allowed
  
  (F) The count in the (un)pack template may be replaced by C<[TEMPLATE]>