=head1 NAME

perllocale - Perl locale handling (internationalization and localization)

=head1 DESCRIPTION

Locales these days have mostly been supplanted by Unicode, but Perl
continues to support them. See L</Unicode and UTF-8> below.

Perl supports language-specific notions of data such as "is this
a letter", "what is the uppercase equivalent of this letter", and
"which of these letters comes first". These are important issues,
especially for languages other than English--but also for English: it
would be naE<iuml>ve to imagine that C<A-Za-z> defines all the "letters"
needed to write correct English. Perl is also aware that some character other
than "." may be preferred as a decimal point, and that output date
representations may be language-specific. The process of making an
application take account of its users' preferences in such matters is
called B<internationalization> (often abbreviated as B<i18n>); telling
such an application about a particular set of preferences is known as
B<localization> (B<l10n>).

Perl can understand language-specific data via the standardized (ISO C,
XPG4, POSIX 1.c) method called "the locale system". The locale system is
controlled per application using one pragma, one function call, and
several environment variables.

B<NOTE>: This feature is new in Perl 5.004, and does not apply unless an
application specifically requests it--see L<Backward compatibility>.
The one exception is that write() now B<always> uses the current
locale--see L<"NOTES">.

=head1 PREPARING TO USE LOCALES

If Perl applications are to understand and present your data
correctly according to a locale of your choice, B<all> of the following
must be true:

=over 4

=item *

B<Your operating system must support the locale system>. If it does,
you should find that the setlocale() function is a documented part of
its C library.

=item *

B<Definitions for locales that you use must be installed>. You, or
your system administrator, must make sure that this is the case. The
available locales, the location in which they are kept, and the manner
in which they are installed all vary from system to system. Some systems
provide only a few, hard-wired locales and do not allow more to be
added. Others allow you to add "canned" locales provided by the system
supplier. Still others allow you or the system administrator to define
and add arbitrary locales. (You may have to ask your supplier to
provide canned locales that are not delivered with your operating
system.) Read your system documentation for further illumination.

=item *

B<Perl must believe that the locale system is supported>. If it does,
C<perl -V:d_setlocale> will say that the value for C<d_setlocale> is
C<define>.

=back
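
The last of these conditions can also be checked from inside a program.
The following is a minimal sketch, assuming only the core C<Config>
module, whose C<%Config> hash holds the same build-time values that
C<perl -V:name> prints. The variable name C<$has_setlocale> is made up
for the example.

```perl
use strict;
use warnings;
use Config;   # exposes the build-time configuration as %Config

# d_setlocale is "define" when Perl was built with setlocale() support
my $has_setlocale =
    defined $Config{d_setlocale} && $Config{d_setlocale} eq 'define';

print "setlocale() support compiled in: ",
    ($has_setlocale ? "yes" : "no"), "\n";
```

This says nothing about the first two conditions: the operating system
and the installed locale definitions still have to be checked separately.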

If you want a Perl application to process and present your data
according to a particular locale, the application code should include
the S<C<use locale>> pragma (see L<The use locale pragma>) where
appropriate, and B<at least one> of the following must be true:

=over 4

=item 1

B<The locale-determining environment variables (see L<"ENVIRONMENT">)
must be correctly set up> at the time the application is started, either
by yourself or by whoever set up your system account; or

=item 2

B<The application must set its own locale> using the method described in
L<The setlocale function>.

=back

=head1 USING LOCALES

=head2 The use locale pragma

By default, Perl ignores the current locale. The S<C<use locale>>
pragma and the C</l> regular expression modifier tell Perl to use the
current locale for some operations (C</l> for just pattern matching).

The current locale is set at execution time by
L<setlocale()|/The setlocale function> described below. If that function
hasn't yet been called in the course of the program's execution, the
current locale is that which was determined by the L<"ENVIRONMENT"> in
effect at the start of the program, except that
C<L<LC_NUMERIC|/Category LC_NUMERIC: Numeric Formatting>> is always
initialized to the C locale (mentioned under L<Finding locales>).
If there is no valid environment, the current locale is undefined. It
is likely, but not necessarily, the "C" locale.

The operations that are affected by locale are:

=over 4

=item *

B<The comparison operators> (C<lt>, C<le>, C<cmp>, C<ge>, and C<gt>) and
the POSIX string collation functions strcoll() and strxfrm() use
C<LC_COLLATE>. sort() is also affected if used without an
explicit comparison function, because it uses C<cmp> by default.

B<Note:> C<eq> and C<ne> are unaffected by locale: they always
perform a char-by-char comparison of their scalar operands. What's
more, if C<cmp> finds that its operands are equal according to the
collation sequence specified by the current locale, it goes on to
perform a char-by-char comparison, and only returns I<0> (equal) if the
operands are char-for-char identical. If you really want to know whether
two strings--which C<eq> and C<cmp> may consider different--are equal
as far as collation in the locale is concerned, see the discussion in
L<Category LC_COLLATE: Collation>.

=item *

B<Regular expressions and case-modification functions> (uc(), lc(),
ucfirst(), and lcfirst()) use C<LC_CTYPE>.

=item *

B<Format declarations> (format()) use C<LC_NUMERIC>.

=item *

B<The POSIX date formatting function> (strftime()) uses C<LC_TIME>.

=back

C<LC_COLLATE>, C<LC_CTYPE>, and so on, are discussed further in
L<LOCALE CATEGORIES>.

The default behavior is restored with the S<C<no locale>> pragma, or
upon reaching the end of the block enclosing C<use locale>.

The string result of any operation that uses locale
information is tainted, as it is possible for a locale to be
untrustworthy. See L<"SECURITY">.
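
The block scoping of the pragma can be illustrated with a short sketch.
The word list and variable names are invented for the example, and the
order of C<@localized> depends entirely on whichever C<LC_COLLATE>
locale is current when it runs.

```perl
use strict;
use warnings;

my @words = qw(banana Apple cherry apple Banana);

# Outside the pragma: plain char-by-char ordering.
my @native = sort @words;

my @localized;
{
    use locale;                  # in effect only inside this block
    @localized = sort @words;    # ordered by the current LC_COLLATE
}

# Back outside the block, the default behavior applies again.
print "native:    @native\n";
print "localized: @localized\n";
```

Under the "C" locale both lines are likely to match; under a
dictionary-style locale the second line may interleave upper and lower
case.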

=head2 The setlocale function

You can switch locales as often as you wish at run time with the
POSIX::setlocale() function:

    # This functionality not usable prior to Perl 5.004
    require 5.004;

    # Import locale-handling tool set from POSIX module.
    # This example uses: setlocale -- the function call
    #                    LC_CTYPE -- explained below
    use POSIX qw(locale_h);

    # query and save the old locale
    $old_locale = setlocale(LC_CTYPE);

    setlocale(LC_CTYPE, "fr_CA.ISO8859-1");
    # LC_CTYPE now in locale "French, Canada, codeset ISO 8859-1"

    setlocale(LC_CTYPE, "");
    # LC_CTYPE now reset to default defined by LC_ALL/LC_CTYPE/LANG
    # environment variables. See below for documentation.

    # restore the old locale
    setlocale(LC_CTYPE, $old_locale);

The first argument of setlocale() gives the B<category>, the second the
B<locale>. The category tells in what aspect of data processing you
want to apply locale-specific rules. Category names are discussed in
L<LOCALE CATEGORIES> and L<"ENVIRONMENT">. The locale is the name of a
collection of customization information corresponding to a particular
combination of language, country or territory, and codeset. Read on for
hints on the naming of locales: not all systems name locales as in the
example.

If no second argument is provided and the category is something other
than LC_ALL, the function returns a string naming the current locale
for the category. You can use this value as the second argument in a
subsequent call to setlocale().

If no second argument is provided and the category is LC_ALL, the
result is implementation-dependent. It may be a string of
concatenated locale names (separator also implementation-dependent)
or a single locale name. Please consult your setlocale(3) man page for
details.

If a second argument is given and it corresponds to a valid locale,
the locale for the category is set to that value, and the function
returns the now-current locale value. You can then use this in yet
another call to setlocale(). (In some implementations, the return
value may sometimes differ from the value you gave as the second
argument--think of it as an alias for the value you gave.)

As the example shows, if the second argument is an empty string, the
category's locale is returned to the default specified by the
corresponding environment variables. Generally, this results in a
return to the default that was in force when Perl started up: changes
to the environment made by the application after startup may or may not
be noticed, depending on your system's C library.

If the second argument does not correspond to a valid locale, the locale
for the category is not changed, and the function returns I<undef>.

For further information about the categories, consult setlocale(3).
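
Because setlocale() returns I<undef> for a locale it does not accept,
its result is worth checking. Here is a minimal sketch; the helper name
C<try_ctype_locale> and the locale name C<xx_YY.bogus> are made up for
illustration, and some C libraries accept names that others reject.

```perl
use strict;
use warnings;
use POSIX qw(setlocale LC_CTYPE);

# Hypothetical helper: try to switch LC_CTYPE and warn on failure.
# On failure setlocale() leaves the category unchanged.
sub try_ctype_locale {
    my ($name) = @_;
    my $now = setlocale(LC_CTYPE, $name);
    warn "locale '$name' not available, LC_CTYPE unchanged\n"
        unless defined $now;
    return $now;
}

my $old = setlocale(LC_CTYPE);               # query and save
try_ctype_locale("xx_YY.bogus");             # expected to fail on most systems
try_ctype_locale("C");                       # expected to succeed
setlocale(LC_CTYPE, $old) if defined $old;   # restore
```

Returning the value from setlocale() lets the caller keep the name the
system actually reports, which may be an alias of the name requested.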

=head2 Finding locales

For locales available in your system, consult also setlocale(3) to
see whether it leads to the list of available locales (search for the
I<SEE ALSO> section). If that fails, try the following command lines:

    locale -a

    nlsinfo

    ls /usr/lib/nls/loc

    ls /usr/lib/locale

    ls /usr/lib/nls

    ls /usr/share/locale

and see whether they list something resembling these

    en_US.ISO8859-1     de_DE.ISO8859-1     ru_RU.ISO8859-5
    en_US.iso88591      de_DE.iso88591      ru_RU.iso88595
    en_US               de_DE               ru_RU
    en                  de                  ru
    english             german              russian
    english.iso88591    german.iso88591     russian.iso88595
    english.roman8                          russian.koi8r

Sadly, even though the calling interface for setlocale() has been
standardized, names of locales and the directories where the
configuration resides have not been. The basic form of the name is
I<language_territory>B<.>I<codeset>, but the latter parts after
I<language> are not always present. The I<language> and I<territory>
are usually from the standards B<ISO 639> and B<ISO 3166>, the
two-letter abbreviations for the languages and the countries of the
world, respectively. The I<codeset> part often mentions some B<ISO
8859> character set, the Latin codesets. For example, C<ISO 8859-1>
is the so-called "Western European codeset" that can be used to encode
most Western European languages adequately. Again, there are several
ways to write even the name of that one standard. Lamentably.
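
The basic I<language_territory>B<.>I<codeset> form can be taken apart
with a single pattern. This is only a sketch: the helper name
C<split_locale_name> is made up, and the pattern assumes nothing more
than the basic form described above, so names with extra
system-specific decorations are not handled specially.

```perl
use strict;
use warnings;

# Hypothetical helper: split "language_territory.codeset" into parts.
# Parts that are absent come back as undef.
sub split_locale_name {
    my ($name) = @_;
    my ($language, $territory, $codeset) =
        $name =~ /\A([^_.]+)(?:_([^.]+))?(?:\.(.+))?\z/
        or return;
    return {
        language  => $language,
        territory => $territory,
        codeset   => $codeset,
    };
}

for my $name (qw(en_US.ISO8859-1 de_DE ru english.roman8)) {
    my $parts = split_locale_name($name) or next;
    printf "%-18s language=%s territory=%s codeset=%s\n", $name,
        map { defined $_ ? $_ : '-' }
            @$parts{qw(language territory codeset)};
}
```

Splitting a name this way is useful for reporting or for guessing
near-matches; whether a name is actually usable can only be learned by
passing it to setlocale().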

Two special locales are worth particular mention: "C" and "POSIX".
Currently these are effectively the same locale: the difference is
mainly that the first one is defined by the C standard, the second by
the POSIX standard. They define the B<default locale> in which
every program starts in the absence of locale information in its
environment. (The I<default> default locale, if you will.) Its language
is (American) English and its character codeset ASCII.
B<Warning>. The C locale delivered by some vendors may not
actually exactly match what the C standard calls for. So beware.

B<NOTE>: Not all systems have the "POSIX" locale (not all systems are
POSIX-conformant), so use "C" when you need explicitly to specify this
default locale.

=head2 LOCALE PROBLEMS

You may encounter the following warning message at Perl startup:

    perl: warning: Setting locale failed.
    perl: warning: Please check that your locale settings:
            LC_ALL = "En_US",
            LANG = (unset)
        are supported and installed on your system.
    perl: warning: Falling back to the standard locale ("C").

This means that your locale settings had LC_ALL set to "En_US" and
LANG is not set. Perl tried to believe you but could not.
Instead, Perl gave up and fell back to the "C" locale, the default locale
that is supposed to work no matter what. This usually means your locale
settings were wrong, they mention locales your system has never heard
of, or the locale installation in your system has problems (for example,
some system files are broken or missing). There are quick and temporary
fixes to these problems, as well as more thorough and lasting fixes.

=head2 Temporarily fixing locale problems

The two quickest fixes are either to render Perl silent about any
locale inconsistencies or to run Perl under the default locale "C".

Perl's moaning about locale problems can be silenced by setting the
environment variable PERL_BADLANG to a zero value, for example "0".
This method really just sweeps the problem under the carpet: you tell
Perl to shut up even when Perl sees that something is wrong. Do not
be surprised if later something locale-dependent misbehaves.

Perl can be run under the "C" locale by setting the environment
variable LC_ALL to "C". This method is perhaps a bit more civilized
than the PERL_BADLANG approach, but setting LC_ALL (or
other locale variables) may affect other programs as well, not just
Perl. In particular, external programs run from within Perl will see
these changes. If you make the new settings permanent (read on), all
programs you run see the changes. See L<"ENVIRONMENT"> for
the full list of relevant environment variables and L<USING LOCALES>
for their effects in Perl. Effects in other programs are
easily deducible. For example, the variable LC_COLLATE may well affect
your B<sort> program (or whatever the program that arranges "records"
alphabetically in your system is called).
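
The point about external programs can be seen from within Perl itself.
In this sketch a second perl process stands in for "any external
program"; C<local> limits the change to one block, and the list form of
a piped open() is assumed to be available on your platform.

```perl
use strict;
use warnings;

my $child_value;
{
    # local() restores the previous value of $ENV{LC_ALL} at block exit
    local $ENV{LC_ALL} = "C";

    # $^X is the running perl; the child prints the LC_ALL it inherited
    open(my $child, '-|', $^X, '-e', 'print $ENV{LC_ALL}')
        or die "cannot run child perl: $!";
    $child_value = <$child>;
    close $child;
}

print "child saw LC_ALL=$child_value\n";
```

Scoping the assignment with C<local> keeps the override from leaking
into programs started later by the same script.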

You can test out changing these variables temporarily, and if the
new settings seem to help, put those settings into your shell startup
files. Consult your local documentation for the exact details. For
example, in Bourne-like shells (B<sh>, B<ksh>, B<bash>, B<zsh>):

    LC_ALL=en_US.ISO8859-1
    export LC_ALL

This assumes that we saw the locale "en_US.ISO8859-1" using the commands
discussed above. We decided to try that instead of the above faulty
locale "En_US"--and in Cshish shells (B<csh>, B<tcsh>)

    setenv LC_ALL en_US.ISO8859-1

or if you have the "env" application you can do in any shell

    env LC_ALL=en_US.ISO8859-1 perl ...

If you do not know what shell you have, consult your local
helpdesk or the equivalent.

=head2 Permanently fixing locale problems

The slower but superior fix is to correct the misconfiguration of your
own environment variables yourself, where you are able to. The
mis(sing)configuration of the whole system's locales usually requires
the help of your friendly system administrator.

First, see earlier in this document about L<Finding locales>. That tells
how to find which locales are really supported--and more importantly,
installed--on your system. In our example error message, environment
variables affecting the locale are listed in the order of decreasing
importance (and unset variables do not matter). Therefore, having
LC_ALL set to "En_US" must have been the bad choice, as shown by the
error message. First try fixing locale settings listed first.

Second, if using the listed commands you see something B<exactly>
(prefix matches do not count and case usually counts) like "En_US"
without the quotes, then you should be okay because you are using a
locale name that should be installed and available in your system.
In this case, see L<Permanently fixing your system's locale configuration>.

=head2 Permanently fixing your system's locale configuration

This is when you see something like:

    perl: warning: Please check that your locale settings:
            LC_ALL = "En_US",
            LANG = (unset)
        are supported and installed on your system.

but then cannot see "En_US" listed by the above-mentioned
commands. You may see things like "en_US.ISO8859-1", but that isn't
the same. In this case, try running under a locale
that you can list and which somehow matches what you tried. The
rules for matching locale names are a bit vague because
standardization is weak in this area. See L<Finding locales> again
for the general rules.

=head2 Fixing system locale configuration

Contact a system administrator (preferably your own) and report the exact
error message you get, and ask them to read this same documentation you
are now reading. They should be able to check whether there is something
wrong with the locale configuration of the system. The L<Finding locales>
section is unfortunately a bit vague about the exact commands and places
because these things are not that standardized.

=head2 The localeconv function

The POSIX::localeconv() function allows you to get particulars of the
locale-dependent numeric formatting information specified by the current
C<LC_NUMERIC> and C<LC_MONETARY> locales. (If you just want the name of
the current locale for a particular category, use POSIX::setlocale()
with a single parameter--see L<The setlocale function>.)

    use POSIX qw(locale_h);

    # Get a reference to a hash of locale-dependent info
    $locale_values = localeconv();

    # Output sorted list of the values
    for (sort keys %$locale_values) {
        printf "%-20s = %s\n", $_, $locale_values->{$_};
    }

localeconv() takes no arguments, and returns B<a reference to> a hash.
The keys of this hash are variable names for formatting, such as
C<decimal_point> and C<thousands_sep>. The values are the
corresponding, er, values. See L<POSIX/localeconv> for a longer
example listing the categories an implementation might be expected to
provide; some provide more and others fewer. You don't need an
explicit C<use locale>, because localeconv() always observes the
current locale.

Here's a simple-minded example program that rewrites its command-line
parameters as integers correctly formatted in the current locale:

    # See comments in previous example
    require 5.004;
    use POSIX qw(locale_h);

    # Get some of locale's numeric formatting parameters
    my ($thousands_sep, $grouping) =
        @{localeconv()}{'thousands_sep', 'grouping'};

    # Apply defaults if values are missing
    $thousands_sep = ',' unless $thousands_sep;

    # grouping and mon_grouping are packed lists
    # of small integers (characters) telling the
    # grouping (thousands_sep and mon_thousands_sep
    # being the group dividers) of numbers and
    # monetary quantities. The integers' meanings:
    # 255 means no more grouping, 0 means repeat
    # the previous grouping, 1-254 means use that
    # as the current grouping. Grouping goes from
    # right to left (low to high digits). In the
    # below we cheat slightly by never using anything
    # else than the first grouping (whatever that is).
    if ($grouping) {
        @grouping = unpack("C*", $grouping);
    } else {
        @grouping = (3);
    }

    # Format command line params for current locale.
    # \Q...\E keeps a separator such as "." from being
    # treated as a regular expression metacharacter.
    for (@ARGV) {
        $_ = int;    # Chop non-integer part
        1 while
        s/(\d)(\d{$grouping[0]}($|\Q$thousands_sep\E))/$1$thousands_sep$2/;
        print "$_ ";
    }
    print "\n";

=head2 I18N::Langinfo

Another interface for querying locale-dependent information is the
I18N::Langinfo::langinfo() function, available at least in Unix-like
systems and VMS.

The following example will import the langinfo() function itself and
three constants to be used as arguments to langinfo(): a constant for
the abbreviated first day of the week (the numbering starts from
Sunday = 1) and two more constants for the affirmative and negative
answers for a yes/no question in the current locale.

    use I18N::Langinfo qw(langinfo ABDAY_1 YESSTR NOSTR);

    my ($abday_1, $yesstr, $nostr) =
        map { langinfo($_) } (ABDAY_1, YESSTR, NOSTR);

    print "$abday_1? [$yesstr/$nostr] ";

In other words, in the "C" (or English) locale the above will probably
print something like:

    Sun? [yes/no]

See L<I18N::Langinfo> for more information.

=head1 LOCALE CATEGORIES

The following subsections describe basic locale categories. Beyond these,
some combination categories allow manipulation of more than one
basic category at a time. See L<"ENVIRONMENT"> for a discussion of these.

=head2 Category LC_COLLATE: Collation

In the scope of S<C<use locale>>, Perl looks to the C<LC_COLLATE>
environment variable to determine the application's notions on collation
(ordering) of characters. For example, "b" follows "a" in Latin
alphabets, but where do "E<aacute>" and "E<aring>" belong? And while
"color" follows "chocolate" in English, what about in Spanish?

The following collations all make sense and you may meet any of them
if you "use locale".

    A B C D E a b c d e
    A a B b C c D d E e
    a A b B c C d D e E
    a b c d e A B C D E

Here is a code snippet to tell what "word"
characters are in the current locale, in that locale's order:

    use locale;
    print +(sort grep /\w/, map { chr } 0..255), "\n";

Compare this with the characters that you see and their order if you
state explicitly that the locale should be ignored:

    no locale;
    print +(sort grep /\w/, map { chr } 0..255), "\n";

This machine-native collation (which is what you get unless S<C<use
locale>> has appeared earlier in the same block) must be used for
sorting raw binary data, whereas the locale-dependent collation of the
first example is useful for natural text.

As noted in L<USING LOCALES>, C<cmp> compares according to the current
collation locale when C<use locale> is in effect, but falls back to a
char-by-char comparison for strings that the locale says are equal. You
can use POSIX::strcoll() if you don't want this fall-back:

    use POSIX qw(strcoll);
    $equal_in_locale =
        !strcoll("space and case ignored", "SpaceAndCaseIgnored");

$equal_in_locale will be true if the collation locale specifies a
dictionary-like ordering that ignores space characters completely and
which folds case.

If you have a single string that you want to check for "equality in
locale" against several others, you might think you could gain a little
efficiency by using POSIX::strxfrm() in conjunction with C<eq>:

    use POSIX qw(strxfrm);
    $xfrm_string = strxfrm("Mixed-case string");
    print "locale collation ignores spaces\n"
        if $xfrm_string eq strxfrm("Mixed-casestring");
    print "locale collation ignores hyphens\n"
        if $xfrm_string eq strxfrm("Mixedcase string");
    print "locale collation ignores case\n"
        if $xfrm_string eq strxfrm("mixed-case string");

strxfrm() takes a string and maps it into a transformed string for use
in char-by-char comparisons against other transformed strings during
collation. "Under the hood", locale-affected Perl comparison operators
call strxfrm() for both operands, then do a char-by-char
comparison of the transformed strings. By calling strxfrm() explicitly
and using a non locale-affected comparison, the example attempts to save
a couple of transformations. But in fact, it doesn't save anything: Perl
magic (see L<perlguts/Magic Variables>) creates the transformed version of a
string the first time it's needed in a comparison, then keeps this version around
in case it's needed again. An example rewritten the easy way with
C<cmp> runs just about as fast. It also copes with null characters
embedded in strings; if you call strxfrm() directly, it treats the first
null it finds as a terminator. Don't expect the transformed strings
it produces to be portable across systems--or even from one revision
of your operating system to the next. In short, don't call strxfrm()
directly: let Perl do it for you.

Note: C<use locale> isn't shown in some of these examples because it isn't
needed: strcoll() and strxfrm() exist only to generate locale-dependent
results, and so always obey the current C<LC_COLLATE> locale.
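
Putting the advice of this section together, one possible arrangement
is to sort with plain C<sort> under C<use locale> and to test "equality
in locale" with strcoll(). This is a sketch only; the helper name
C<equal_in_locale> and the sample words are made up, and what it prints
depends on the current C<LC_COLLATE> locale.

```perl
use strict;
use warnings;
use POSIX qw(strcoll);

# Hypothetical helper: true when the current LC_COLLATE locale gives
# both strings the same collation position (strcoll() returns 0).
sub equal_in_locale {
    my ($x, $y) = @_;
    return strcoll($x, $y) == 0;
}

my @names = qw(delta Alpha charlie bravo);
my @sorted;
{
    use locale;
    @sorted = sort @names;    # no explicit strxfrm() calls needed
}

print "@sorted\n";
print "case is ignored by this locale's collation\n"
    if equal_in_locale("Alpha", "alpha");
```

Keeping the equality test in a helper makes it clear at the call site
that the comparison is locale-based rather than char-by-char.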

=head2 Category LC_CTYPE: Character Types

In the scope of S<C<use locale>>, Perl obeys the C<LC_CTYPE> locale
setting. This controls the application's notion of which characters are
alphabetic. This affects Perl's C<\w> regular expression metanotation,
which stands for alphanumeric characters--that is, alphabetic and
numeric characters plus the underscore. (Consult L<perlre> for more
information about
regular expressions.) Thanks to C<LC_CTYPE>, depending on your locale
setting, characters like "E<aelig>", "E<eth>", "E<szlig>", and
"E<oslash>" may be understood as C<\w> characters.

The C<LC_CTYPE> locale also provides the map used in transliterating
characters between lower and uppercase. This affects the case-mapping
functions--lc(), lcfirst(), uc(), and ucfirst(); case-mapping
interpolation with C<\l>, C<\L>, C<\u>, or C<\U> in double-quoted strings
and C<s///> substitutions; and case-independent regular expression
pattern matching using the C<i> modifier.

Finally, C<LC_CTYPE> affects the POSIX character-class test
functions--isalpha(), islower(), and so on. For example, if you move
from the "C" locale to a 7-bit Scandinavian one, you may find--possibly
to your surprise--that "|" moves from the ispunct() class to isalpha().

B<Note:> A broken or malicious C<LC_CTYPE> locale definition may result
in clearly ineligible characters being considered to be alphanumeric by
your application. For strict matching of (mundane) ASCII letters and
digits--for example, in command strings--locale-aware applications
should use C<\w> with the C</a> regular expression modifier. See L<"SECURITY">.
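
A strict check of the kind recommended in the note might look like the
following sketch. The function name C<is_plain_token> and the sample
strings are made up; the C</a> modifier needs a Perl recent enough to
support it (5.14 or later).

```perl
use strict;
use warnings;

# Hypothetical validator: accept only ASCII letters, digits and "_".
# With /a, \w matches ASCII word characters only, whatever LC_CTYPE says.
sub is_plain_token {
    my ($string) = @_;
    return $string =~ /\A\w+\z/a ? 1 : 0;
}

print is_plain_token("backup_2024") ? "ok\n" : "rejected\n";
print is_plain_token("rm|more")     ? "ok\n" : "rejected\n";
```

Anchoring with C<\A> and C<\z> matters as much as the modifier: without
them a single acceptable character anywhere in the string would be
enough to pass.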

=head2 Category LC_NUMERIC: Numeric Formatting

After a proper POSIX::setlocale() call, Perl obeys the C<LC_NUMERIC>
locale information, which controls an application's idea of how numbers
should be formatted for human readability by the printf(), sprintf(), and
write() functions. String-to-numeric conversion by the POSIX::strtod()
function is also affected. In most implementations the only effect is to
change the character used for the decimal point--perhaps from "." to ",".
These functions aren't aware of such niceties as thousands separation and
so on. (See L<The localeconv function> if you care about these things.)

Output produced by print() is also affected by the current locale, as
are Perl's internal conversions between numeric and string formats:

    use POSIX qw(strtod setlocale LC_NUMERIC);

    setlocale LC_NUMERIC, "";

    $n = 5/2;   # Assign numeric 2.5 to $n

    $a = " $n"; # Locale-dependent conversion to string

    print "half five is $n\n";       # Locale-dependent output

    printf "half five is %g\n", $n;  # Locale-dependent output

    print "DECIMAL POINT IS COMMA\n"
        if $n == (strtod("2,5"))[0]; # Locale-dependent conversion

See also L<I18N::Langinfo> and C<RADIXCHAR>.
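
As a sketch of the C<RADIXCHAR> route, the decimal point character can
be queried directly instead of being inferred from formatted output.
The variable name C<$radix> is made up, and the value reported depends
on the C<LC_NUMERIC> locale picked up from the environment.

```perl
use strict;
use warnings;
use POSIX qw(setlocale LC_NUMERIC);
use I18N::Langinfo qw(langinfo RADIXCHAR);

setlocale(LC_NUMERIC, "");          # adopt the environment's setting
my $radix = langinfo(RADIXCHAR);    # the locale's decimal point character

print "decimal point character: $radix\n";
```

Asking for the character is more direct than formatting a number and
inspecting the result.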

=head2 Category LC_MONETARY: Formatting of monetary amounts

The C standard defines the C<LC_MONETARY> category, but not a function
that is affected by its contents. (Those with experience of standards
committees will recognize that the working group decided to punt on the
issue.) Consequently, Perl takes no notice of it. If you really want
to use C<LC_MONETARY>, you can query its contents--see
L<The localeconv function>--and use the information that it returns in your
application's own formatting of currency amounts. However, you may well
find that the information, voluminous and complex though it may be, still
does not quite meet your requirements: currency formatting is a hard nut
to crack.

See also L<I18N::Langinfo> and C<CRNCYSTR>.
639
5f05dabc
PP
640=head2 LC_TIME
641
5a964f20 642Output produced by POSIX::strftime(), which builds a formatted
5f05dabc
PP
643human-readable date/time string, is affected by the current C<LC_TIME>
644locale. Thus, in a French locale, the output produced by the C<%B>
645format element (full month name) for the first month of the year would
5a964f20 646be "janvier". Here's how to get a list of long month names in the
5f05dabc
PP
647current locale:
648
649 use POSIX qw(strftime);
14280422
DD
650 for (0..11) {
651 $long_month_name[$_] =
652 strftime("%B", 0, 0, 0, 1, $_, 96);
5f05dabc
PP
653 }
654
5a964f20 655Note: C<use locale> isn't needed in this example: as a function that
14280422
DD
656exists only to generate locale-dependent results, strftime() always
657obeys the current C<LC_TIME> locale.
5f05dabc 658
See also L<I18N::Langinfo> and C<ABDAY_1>..C<ABDAY_7>, C<DAY_1>..C<DAY_7>,
C<ABMON_1>..C<ABMON_12>, and C<MON_1>..C<MON_12>.
4bbcc6e8 661
5f05dabc
PP
662=head2 Other categories
663
5a964f20
TC
664The remaining locale category, C<LC_MESSAGES> (possibly supplemented
665by others in particular implementations) is not currently used by
98a6f11e 666Perl--except possibly to affect the behavior of library functions
667called by extensions outside the standard Perl distribution and by the
668operating system and its utilities. Note especially that the string
669value of C<$!> and the error messages given by external utilities may
670be changed by C<LC_MESSAGES>. If you want to have portable error
265f5c4a 671codes, use C<%!>. See L<Errno>.
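
A short sketch of the C<%!> approach follows. The helper name
C<open_failed_because_missing> is made up for this example; the point is
that the test is made on the symbolic error code rather than on the
(possibly translated) message text in C<$!>.

```perl
use strict;
use warnings;
use Errno qw(ENOENT);

# Hypothetical helper: returns 1 if opening $path fails because the file
# does not exist, and 0 otherwise (including when the open succeeds).
sub open_failed_because_missing {
    my ($path) = @_;
    return 0 if open(my $fh, '<', $path);

    # $!{ENOENT} is true when the last system error was ENOENT, whatever
    # language the text of $! happens to be in.
    return $!{ENOENT} ? 1 : 0;
}

print "no such file\n"
    if open_failed_because_missing('/no/such/directory/no-such-file');
```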
14280422
DD
672
673=head1 SECURITY
674
5a964f20 675Although the main discussion of Perl security issues can be found in
14280422
DD
676L<perlsec>, a discussion of Perl's locale handling would be incomplete
677if it did not draw your attention to locale-dependent security issues.
5a964f20
TC
678Locales--particularly on systems that allow unprivileged users to
679build their own locales--are untrustworthy. A malicious (or just plain
14280422
DD
680broken) locale can make a locale-aware application give unexpected
681results. Here are a few possibilities:
682
683=over 4
684
685=item *
686
687Regular expression checks for safe file names or mail addresses using
5a964f20 688C<\w> may be spoofed by an C<LC_CTYPE> locale that claims that
14280422
DD
689characters such as "E<gt>" and "|" are alphanumeric.
690
691=item *
692
e38874e2
DD
693String interpolation with case-mapping, as in, say, C<$dest =
694"C:\U$name.$ext">, may produce dangerous results if a bogus LC_CTYPE
695case-mapping table is in effect.
696
697=item *
698
14280422
DD
699A sneaky C<LC_COLLATE> locale could result in the names of students with
700"D" grades appearing ahead of those with "A"s.
701
702=item *
703
5a964f20 704An application that takes the trouble to use information in
14280422 705C<LC_MONETARY> may format debits as if they were credits and vice versa
5a964f20 706if that locale has been subverted. Or it might make payments in US
14280422
DD
707dollars instead of Hong Kong dollars.
708
709=item *
710
The date and day names in dates formatted by strftime() could be
manipulated to advantage by a malicious user able to subvert the
C<LC_TIME> locale. ("Look--it says I wasn't in the building on
Sunday.")
715
716=back
717
718Such dangers are not peculiar to the locale system: any aspect of an
5a964f20 719application's environment which may be modified maliciously presents
14280422 720similar challenges. Similarly, they are not specific to Perl: any
5a964f20 721programming language that allows you to write programs that take
14280422
DD
722account of their environment exposes you to these issues.
723
5a964f20
TC
724Perl cannot protect you from all possibilities shown in the
725examples--there is no substitute for your own vigilance--but, when
14280422 726C<use locale> is in effect, Perl uses the tainting mechanism (see
5a964f20 727L<perlsec>) to mark string results that become locale-dependent, and
14280422 728which may be untrustworthy in consequence. Here is a summary of the
5a964f20 729tainting behavior of operators and functions that may be affected by
14280422
DD
730the locale:
731
732=over 4
733
551e1d92
RB
734=item *
735
736B<Comparison operators> (C<lt>, C<le>, C<ge>, C<gt> and C<cmp>):
14280422
DD
737
738Scalar true/false (or less/equal/greater) result is never tainted.
739
551e1d92
RB
740=item *
741
B<Case-mapping interpolation> (with C<\l>, C<\L>, C<\u> or C<\U>):
e38874e2
DD
743
744Result string containing interpolated material is tainted if
745C<use locale> is in effect.
746
551e1d92
RB
747=item *
748
749B<Matching operator> (C<m//>):
14280422
DD
750
Scalar true/false result is never tainted.

Subpatterns, either delivered as a list-context result or as $1 etc.,
are tainted if C<use locale> is in effect, and the subpattern regular
expression contains C<\w> (to match an alphanumeric character), C<\W>
(non-alphanumeric character), C<\s> (whitespace character), or C<\S>
(non-whitespace character). The matched-pattern variables $&, $`
(pre-match), $' (post-match), and $+ (last match) are also tainted if
C<use locale> is in effect and the regular expression contains C<\w>,
C<\W>, C<\s>, or C<\S>.
14280422 761
551e1d92
RB
762=item *
763
764B<Substitution operator> (C<s///>):
14280422 765
Has the same behavior as the match operator. Also, the left
operand of C<=~> becomes tainted when C<use locale> is in effect
if modified as a result of a substitution based on a regular
expression match involving C<\w>, C<\W>, C<\s>, or C<\S>; or of
case-mapping with C<\l>, C<\L>, C<\u> or C<\U>.
14280422 771
551e1d92
RB
772=item *
773
774B<Output formatting functions> (printf() and write()):
14280422 775
3cf03d68
JH
776Results are never tainted because otherwise even output from print,
777for example C<print(1/7)>, should be tainted if C<use locale> is in
778effect.
14280422 779
551e1d92
RB
780=item *
781
782B<Case-mapping functions> (lc(), lcfirst(), uc(), ucfirst()):
14280422
DD
783
784Results are tainted if C<use locale> is in effect.
785
551e1d92
RB
786=item *
787
788B<POSIX locale-dependent functions> (localeconv(), strcoll(),
14280422
DD
789strftime(), strxfrm()):
790
791Results are never tainted.
792
551e1d92
RB
793=item *
794
795B<POSIX character class tests> (isalnum(), isalpha(), isdigit(),
14280422
DD
796isgraph(), islower(), isprint(), ispunct(), isspace(), isupper(),
797isxdigit()):
798
799True/false results are never tainted.
800
801=back
802
803Three examples illustrate locale-dependent tainting.
804The first program, which ignores its locale, won't run: a value taken
54310121 805directly from the command line may not be used to name an output file
14280422
DD
806when taint checks are enabled.
807
 #!/usr/local/bin/perl -T
 # Run with taint checking

 # Command line sanity check omitted...
 $tainted_output_file = shift;

 open(F, ">$tainted_output_file")
     or warn "Open of $tainted_output_file failed: $!\n";
816
817The program can be made to run by "laundering" the tainted value through
5a964f20
TC
818a regular expression: the second example--which still ignores locale
819information--runs, creating the file named on its command line
14280422
DD
820if it can.
821
 #!/usr/local/bin/perl -T

 $tainted_output_file = shift;
 $tainted_output_file =~ m%[\w/]+%;
 $untainted_output_file = $&;

 open(F, ">$untainted_output_file")
     or warn "Open of $untainted_output_file failed: $!\n";
830
5a964f20 831Compare this with a similar but locale-aware program:
14280422
DD
832
 #!/usr/local/bin/perl -T

 $tainted_output_file = shift;
 use locale;
 $tainted_output_file =~ m%[\w/]+%;
 $localized_output_file = $&;

 open(F, ">$localized_output_file")
     or warn "Open of $localized_output_file failed: $!\n";
842
843This third program fails to run because $& is tainted: it is the result
5a964f20 844of a match involving C<\w> while C<use locale> is in effect.
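
One way to sidestep the problem, sketched here under the assumption that
file names are restricted to a fixed ASCII set, is to launder the value
with an explicit character class instead of C<\w>, and to do so outside
the scope of C<use locale>. The helper name C<launder_file_name> and the
permitted character set are choices made for this illustration, not a
recommendation for every application.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the name if it consists only of the listed
# ASCII characters, or undef otherwise. The class is spelled out so that
# no locale's idea of an "alphanumeric" character is consulted.
sub launder_file_name {
    my ($name) = @_;
    no locale;    # keep locale rules out of this match
    return $name =~ m{\A([A-Za-z0-9_./-]+)\z} ? $1 : undef;
}

my $file = launder_file_name('out/report.txt');
print defined $file ? "accepted: $file\n" : "rejected\n";
```

Unlike the examples above, the pattern is anchored at both ends, so a
name containing any other character is rejected outright rather than
silently truncated.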
5f05dabc
PP
845
846=head1 ENVIRONMENT
847
848=over 12
849
850=item PERL_BADLANG
851
A string that can suppress Perl's warning about failed locale settings
at startup. Failure can occur if the locale support in the operating
system is lacking (broken) in some way--or if you mistyped the name of
a locale when you set up your environment. Perl complains about locale
setting failures unless this environment variable is set to a value
that evaluates to integer zero--that is, "0" or "".
5f05dabc 859
14280422
DD
860B<NOTE>: PERL_BADLANG only gives you a way to hide the warning message.
861The message tells about some problem in your system's locale support,
862and you should investigate what the problem is.
5f05dabc
PP
863
864=back
865
866The following environment variables are not specific to Perl: They are
14280422
DD
867part of the standardized (ISO C, XPG4, POSIX 1.c) setlocale() method
868for controlling an application's opinion on data.
5f05dabc
PP
869
870=over 12
871
872=item LC_ALL
873
5a964f20 874C<LC_ALL> is the "override-all" locale environment variable. If
5f05dabc
PP
875set, it overrides all the rest of the locale environment variables.
876
528d65ad
JH
877=item LANGUAGE
878
B<NOTE>: C<LANGUAGE> is a GNU extension; it affects you only if you
are using the GNU libc, as is the case on, for example, Linux.
If you are using a "commercial" Unix, you are most probably I<not>
using GNU libc and can ignore C<LANGUAGE>.

If you are using C<LANGUAGE>, it affects the
language of informational, warning, and error messages output by
commands (in other words, it's like C<LC_MESSAGES>) but it has higher
priority than C<LC_ALL>. Moreover, it's not a single value but
instead a "path" (":"-separated list) of I<languages> (not locales).
See the GNU C<gettext> library documentation for more information.
528d65ad 890
5f05dabc
PP
=item LC_CTYPE

In the absence of C<LC_ALL>, C<LC_CTYPE> chooses the character type
locale. In the absence of both C<LC_ALL> and C<LC_CTYPE>, C<LANG>
chooses the character type locale.

=item LC_COLLATE

In the absence of C<LC_ALL>, C<LC_COLLATE> chooses the collation
(sorting) locale. In the absence of both C<LC_ALL> and C<LC_COLLATE>,
C<LANG> chooses the collation locale.

=item LC_MONETARY

In the absence of C<LC_ALL>, C<LC_MONETARY> chooses the monetary
formatting locale. In the absence of both C<LC_ALL> and C<LC_MONETARY>,
C<LANG> chooses the monetary formatting locale.

=item LC_NUMERIC

In the absence of C<LC_ALL>, C<LC_NUMERIC> chooses the numeric format
locale. In the absence of both C<LC_ALL> and C<LC_NUMERIC>, C<LANG>
chooses the numeric format locale.

=item LC_TIME

In the absence of C<LC_ALL>, C<LC_TIME> chooses the date and time
formatting locale. In the absence of both C<LC_ALL> and C<LC_TIME>,
C<LANG> chooses the date and time formatting locale.

=item LANG

C<LANG> is the "catch-all" locale environment variable. If it is set, it
is used as the last resort after the overall C<LC_ALL> and the
category-specific C<LC_...>.
926
927=back
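
The order of precedence among these variables can be summarized in a few
lines of code. The function below is purely illustrative: it is not how
setlocale() is implemented, the name C<effective_locale> is invented
here, and the final fallback to C<"C"> is an assumption (what a system
does when nothing is set is up to its C library). It ignores the
GNU-specific C<LANGUAGE> variable.

```perl
use strict;
use warnings;

# Hypothetical model of the lookup order for one category:
# LC_ALL first, then the category's own variable, then LANG.
sub effective_locale {
    my ($category, $env) = @_;
    for my $var ('LC_ALL', $category, 'LANG') {
        my $value = $env->{$var};
        # An empty value is treated like an unset variable.
        return $value if defined $value && length $value;
    }
    return 'C';    # assumed default when nothing is set
}

print effective_locale('LC_TIME', \%ENV), "\n";
```

Passing the environment as a hash reference keeps the function easy to
try with made-up settings, for example
C<< effective_locale('LC_TIME', { LANG => 'en_US', LC_TIME => 'de_DE' }) >>.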
928
7e4353e9
RGS
929=head2 Examples
930
The C<LC_NUMERIC> category controls numeric output:

 use locale;
 use POSIX qw(locale_h); # Imports setlocale() and the LC_ constants.
 setlocale(LC_NUMERIC, "fr_FR") or die "Pardon";
 printf "%g\n", 1.23; # If the "fr_FR" succeeded, probably shows 1,23.

and also how strings are parsed by POSIX::strtod() as numbers:

 use locale;
 use POSIX qw(locale_h strtod);
 setlocale(LC_NUMERIC, "de_DE") or die "Entschuldigung";
 my $x = strtod("2,34") + 5;
 print $x, "\n"; # Probably shows 7,34.
945
5f05dabc
PP
946=head1 NOTES
947
948=head2 Backward compatibility
949
b0c42ed9 950Versions of Perl prior to 5.004 B<mostly> ignored locale information,
5a964f20
TC
951generally behaving as if something similar to the C<"C"> locale were
952always in force, even if the program environment suggested otherwise
953(see L<The setlocale function>). By default, Perl still behaves this
954way for backward compatibility. If you want a Perl application to pay
955attention to locale information, you B<must> use the S<C<use locale>>
062ca197
KW
956pragma (see L<The use locale pragma>) or, in the unlikely event
957that you want to do so for just pattern matching, the
70709c68
KW
958C</l> regular expression modifier (see L<perlre/Character set
959modifiers>) to instruct it to do so.
b0c42ed9
JH
960
961Versions of Perl from 5.002 to 5.003 did use the C<LC_CTYPE>
5a964f20
TC
962information if available; that is, C<\w> did understand what
963were the letters according to the locale environment variables.
b0c42ed9
JH
964The problem was that the user had no control over the feature:
965if the C library supported locales, Perl used them.
966
=head2 I18N::Collate obsolete
968
5a964f20 969In versions of Perl prior to 5.004, per-locale collation was possible
b0c42ed9
JH
970using the C<I18N::Collate> library module. This module is now mildly
971obsolete and should be avoided in new applications. The C<LC_COLLATE>
972functionality is now integrated into the Perl core language: One can
973use locale-specific scalar data completely normally with C<use locale>,
974so there is no longer any need to juggle with the scalar references of
975C<I18N::Collate>.
5f05dabc 976
14280422 977=head2 Sort speed and memory use impacts
5f05dabc
PP
978
979Comparing and sorting by locale is usually slower than the default
14280422
DD
980sorting; slow-downs of two to four times have been observed. It will
981also consume more memory: once a Perl scalar variable has participated
982in any string comparison or sorting operation obeying the locale
983collation rules, it will take 3-15 times more memory than before. (The
984exact multiplier depends on the string's contents, the operating system
985and the locale.) These downsides are dictated more by the operating
986system's implementation of the locale system than by Perl.
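
Where the same strings are compared many times, one possible mitigation
is to compute a collation key for each string once with POSIX::strxfrm()
and then sort on the keys with a plain bytewise C<cmp>. The sketch below
shows the idea; the function name C<sort_by_locale> is hypothetical, and
whether it is actually faster or smaller than sorting under
C<use locale> depends on the system and should be measured.

```perl
use strict;
use warnings;
use POSIX qw(strxfrm);

# Hypothetical helper: sorts its arguments by the current LC_COLLATE
# locale, transforming each string exactly once.
sub sort_by_locale {
    my @keyed;
    {
        use locale;    # request locale collation for the transformation
        @keyed = map { [ strxfrm($_), $_ ] } @_;
    }
    # The keys are compared bytewise; "use locale" is out of scope here.
    return map { $_->[1] } sort { $a->[0] cmp $b->[0] } @keyed;
}

print join(' ', sort_by_locale(qw(pear apple fig))), "\n";
```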
5f05dabc 987
e38874e2
DD
988=head2 write() and LC_NUMERIC
989
903eb63f
NT
990If a program's environment specifies an LC_NUMERIC locale and C<use
991locale> is in effect when the format is declared, the locale is used
992to specify the decimal point character in formatted output. Formatted
993output cannot be controlled by C<use locale> at the time when write()
994is called.
e38874e2 995
5f05dabc
PP
996=head2 Freely available locale definitions
997
08d7a6b2
LB
998There is a large collection of locale definitions at:
999
1000 http://std.dkuug.dk/i18n/WG15-collection/locales/
1001
1002You should be aware that it is
14280422 1003unsupported, and is not claimed to be fit for any purpose. If your
5a964f20 1004system allows installation of arbitrary locales, you may find the
14280422
DD
1005definitions useful as they are, or as a basis for the development of
1006your own locales.
5f05dabc 1007
14280422 1008=head2 I18n and l10n
5f05dabc 1009
b0c42ed9
JH
1010"Internationalization" is often abbreviated as B<i18n> because its first
1011and last letters are separated by eighteen others. (You may guess why
1012the internalin ... internaliti ... i18n tends to get abbreviated.) In
1013the same way, "localization" is often abbreviated to B<l10n>.
14280422
DD
1014
1015=head2 An imperfect standard
1016
1017Internationalization, as defined in the C and POSIX standards, can be
1018criticized as incomplete, ungainly, and having too large a granularity.
1019(Locales apply to a whole process, when it would arguably be more useful
1020to have them apply to a single thread, window group, or whatever.) They
1021also have a tendency, like standards groups, to divide the world into
1022nations, when we all know that the world can equally well be divided
e199995e 1023into bankers, bikers, gamers, and so on.
5f05dabc 1024
b310b053
JH
1025=head1 Unicode and UTF-8
1026
e199995e 1027The support of Unicode is new starting from Perl version 5.6, and more fully
b4ffc3db
TC
1028implemented in version 5.8 and later. See L<perluniintro>. Perl tries to
1029work with both Unicode and locales--but of course, there are problems.
e199995e
KW
1030
Perl does not handle multi-byte locales such as Big5 or Shift JIS, which
have been used for various Asian languages. However, the increasingly common
multi-byte UTF-8 locales, if properly implemented, tend to work
reasonably well in Perl, simply because both they and Perl store
characters that take up multiple bytes the same way.
1036
Perl generally uses locale rules on code points that can fit
in a single byte, and Unicode rules for those that can't (though this wasn't
uniformly applied prior to Perl 5.14). This prevents many problems in locales
that aren't UTF-8. Suppose the locale is ISO8859-7, Greek. The character at
0xD7 there is a capital Chi. But in the ISO8859-1 locale, Latin1, it is a
multiplication sign. The POSIX regular expression character class
C<[[:alpha:]]> will magically match 0xD7 in the Greek locale but not in the
Latin one, even if the string is encoded in UTF-8, which would normally imply
Unicode semantics. (The "U" in UTF-8 stands for Unicode.)
e199995e
KW
1046
However, there are places where this breaks down. Certain constructs are
for Unicode only, such as C<\p{Alpha}>. They assume that 0xD7 always has its
Unicode meaning (or the equivalent on EBCDIC platforms). Since Latin1 is a
subset of Unicode and 0xD7 is the multiplication sign in both Latin1 and
Unicode, C<\p{Alpha}> will never match it, regardless of locale. A similar
issue occurs with C<\N{...}>. It is therefore a bad idea to use C<\p{}> or
C<\N{}> under C<use locale>--I<unless> you can guarantee that the locale will
be an ISO8859-1 or UTF-8 one. Use POSIX character classes instead.
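
The fixed, Unicode-only meaning of C<\p{Alpha}> is easy to observe. The
small sketch below (the function name C<is_unicode_alpha> is made up for
the example, and an ASCII platform is assumed) tests individual code
points without C<use locale> in effect.

```perl
use strict;
use warnings;

# Hypothetical helper: true if the code point is alphabetic by Unicode
# rules, which is what \p{Alpha} always tests.
sub is_unicode_alpha {
    my ($code_point) = @_;
    return chr($code_point) =~ /\p{Alpha}/ ? 1 : 0;
}

printf "0x%02X: %s\n", $_, is_unicode_alpha($_) ? 'alphabetic' : 'not alphabetic'
    for 0xD6, 0xD7;
```

Here 0xD7 is reported as not alphabetic because Unicode defines it as the
multiplication sign, even though a Greek ISO8859-7 locale would call the
same byte a letter.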
1055
1057The same problem ensues if you enable automatic UTF-8-ification of your
1058standard file handles, default C<open()> layer, and C<@ARGV> on non-ISO8859-1,
b4ffc3db
TC
1059non-UTF-8 locales (by using either the B<-C> command line switch or the
1060C<PERL_UNICODE> environment variable; see L<perlrun>).
1061Things are read in as UTF-8, which would normally imply a Unicode
1062interpretation, but the presence of a locale causes them to be interpreted
1063in that locale instead. For example, a 0xD7 code point in the Unicode
1064input, which should mean the multiplication sign, won't be interpreted by
1065Perl that way under the Greek locale. Again, this is not a problem
1066I<provided> you make certain that all locales will always and only be either
1067an ISO8859-1 or a UTF-8 locale.
1068
1069Vendor locales are notoriously buggy, and it is difficult for Perl to test
1070its locale-handling code because this interacts with code that Perl has no
1071control over; therefore the locale-handling code in Perl may be buggy as
1072well. But if you I<do> have locales that work, using them may be
1073worthwhile for certain specific purposes, as long as you keep in mind the
1074gotchas already mentioned. For example, collation runs faster under
1075locales than under L<Unicode::Collate> (albeit with less flexibility), and
1076you gain access to such things as the local currency symbol and the names
1077of the months and days of the week.
b310b053 1078
5f05dabc
PP
1079=head1 BUGS
1080
1081=head2 Broken systems
1082
5a964f20 1083In certain systems, the operating system's locale support
2bdf8add 1084is broken and cannot be fixed or used by Perl. Such deficiencies can
b4ffc3db 1085and will result in mysterious hangs and/or Perl core dumps when
2bdf8add 1086C<use locale> is in effect. When confronted with such a system,
7f2de2d2 1087please report in excruciating detail to <F<perlbug@perl.org>>, and
b4ffc3db 1088also contact your vendor: bug fixes may exist for these problems
2bdf8add
JH
1089in your operating system. Sometimes such bug fixes are called an
1090operating system upgrade.
5f05dabc
PP
1091
1092=head1 SEE ALSO
1093
b310b053
JH
L<I18N::Langinfo>, L<perluniintro>, L<perlunicode>, L<open>,
L<POSIX/isalnum>, L<POSIX/isalpha>,
L<POSIX/isdigit>, L<POSIX/isgraph>, L<POSIX/islower>,
L<POSIX/isprint>, L<POSIX/ispunct>, L<POSIX/isspace>,
L<POSIX/isupper>, L<POSIX/isxdigit>, L<POSIX/localeconv>,
L<POSIX/setlocale>, L<POSIX/strcoll>, L<POSIX/strftime>,
L<POSIX/strtod>, L<POSIX/strxfrm>.
5f05dabc
PP
1101
1102=head1 HISTORY
1103
b0c42ed9 1104Jarkko Hietaniemi's original F<perli18n.pod> heavily hacked by Dominic
5a964f20 1105Dunlop, assisted by the perl5-porters. Prose worked over a bit by
c052850d 1106Tom Christiansen, and updated by Perl 5 porters.