abbreviated as B<i18n>); telling such an application about a particular
set of preferences is known as B<localization> (B<l10n>).
-Perl was extended, starting in 5.004, to support the locale system. This
+Perl was extended to support the locale system. This
is controlled per application by using one pragma, one function call,
and several environment variables.
You can switch locales as often as you wish at run time with the
POSIX::setlocale() function:
- # This functionality not usable prior to Perl 5.004
- require 5.004;
-
# Import locale-handling tool set from POSIX module.
# This example uses: setlocale -- the function call
# LC_CTYPE -- explained below
Here's a simple-minded example program that rewrites its command-line
parameters as integers correctly formatted in the current locale:
- # See comments in previous example
- require 5.004;
use POSIX qw(locale_h);
# Get some of locale's numeric formatting parameters
environment variable to determine the application's notions on collation
(ordering) of characters. For example, "b" follows "a" in Latin
alphabets, but where do "E<aacute>" and "E<aring>" belong? And while
-"color" follows "chocolate" in English, what about in Spanish?
+"color" follows "chocolate" in English, what about in traditional Spanish?
The following collations all make sense and you may meet any of them
if you "use locale".
Unfortunately, this creates big problems for regular expressions. "|" still
means alternation even though it matches C<\w>.
+Note that quite a few things are unaffected by the current locale.
+All the escape sequences for particular characters, C<\n> for example,
+always mean the platform's native one. This means, for example, that
+C<\N> in regular expressions (every character but new-line) works on
+the platform character set.
+
B<Note:> A broken or malicious C<LC_CTYPE> locale definition may result
in clearly ineligible characters being considered to be alphanumeric by
your application. For strict matching of (mundane) ASCII letters and