=head1 NAME

perlsec - Perl security

=head1 DESCRIPTION

Perl is designed to make it easy to program securely even when running
with extra privileges, like setuid or setgid programs. Unlike most
command line shells, which are based on multiple substitution passes on
each line of the script, Perl uses a more conventional evaluation scheme
with fewer hidden snags. Additionally, because the language has more
builtin functionality, it can rely less upon external (and possibly
untrustworthy) programs to accomplish its purposes.

=head1 SECURITY VULNERABILITY CONTACT INFORMATION

If you believe you have found a security vulnerability in Perl, please
email the details to perl5-security-report@perl.org. This creates a new
Request Tracker ticket in a special queue which isn't initially publicly
accessible. The email will also be copied to a closed subscription
unarchived mailing list which includes all the core committers, who will
be able to help assess the impact of issues, figure out a resolution, and
help co-ordinate the release of patches to mitigate or fix the problem
across all platforms on which Perl is supported. Please only use this
address for security issues in the Perl core, not for modules
independently distributed on CPAN.

When sending an initial request to the security email address, please
don't Cc any other parties, because if they reply to all, the reply will
generate yet another new ticket. Once you have received an initial reply
with a C<[perl #NNNNNN]> ticket number in the headline, it's okay to Cc
subsequent replies to third parties: all emails to the
perl5-security-report address with the ticket number in the subject line
will be added to the ticket; without it, a new ticket will be created.

=head1 SECURITY MECHANISMS AND CONCERNS

=head2 Taint mode

Perl automatically enables a set of special security checks, called I<taint
mode>, when it detects its program running with differing real and effective
user or group IDs. The setuid bit in Unix permissions is mode 04000, the
setgid bit mode 02000; either or both may be set. You can also enable taint
mode explicitly by using the B<-T> command line flag. This flag is
I<strongly> suggested for server programs and any program run on behalf of
someone else, such as a CGI script. Once taint mode is on, it's on for
the remainder of your script.
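
A program can check at run time whether taint mode is actually in effect.
The following is a minimal sketch, assuming the C<${^TAINT}> variable
described in L<perlvar>; the helper name C<taint_mode_status> is made up
for illustration.

```perl
use strict;
use warnings;

# ${^TAINT} is 1 under -T, -1 when only taint warnings are enabled
# (for example with -t), and 0 when taint mode is off.
sub taint_mode_status {
    return ${^TAINT} > 0 ? 'enforced'
         : ${^TAINT} < 0 ? 'warn-only'
         :                 'off';
}

print "taint mode: ", taint_mode_status(), "\n";
```

A server program might refuse to start unless this reports C<enforced>,
rather than trusting that it was launched with the right switches.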

While in this mode, Perl takes special precautions called I<taint
checks> to prevent both obvious and subtle traps. Some of these checks
are reasonably simple, such as verifying that path directories aren't
writable by others; careful programmers have always used checks like
these. Other checks, however, are best supported by the language itself,
and it is these checks especially that contribute to making a set-id Perl
program more secure than the corresponding C program.

You may not use data derived from outside your program to affect
something else outside your program--at least, not by accident. All
command line arguments, environment variables, locale information (see
L<perllocale>), results of certain system calls (C<readdir()>,
C<readlink()>, the variable of C<shmread()>, the messages returned by
C<msgrcv()>, the password, gcos and shell fields returned by the
C<getpwxxx()> calls), and all file input are marked as "tainted".
Tainted data may not be used directly or indirectly in any command
that invokes a sub-shell, nor in any command that modifies files,
directories, or processes, B<with the following exceptions>:

=over 4

=item *

Arguments to C<print> and C<syswrite> are B<not> checked for taintedness.

=item *

Symbolic methods

    $obj->$method(@args);

and symbolic sub references

    &{$foo}(@args);
    $foo->(@args);

are not checked for taintedness. This requires extra care
unless you want external data to affect your control flow. Unless
you carefully limit what these symbolic values are, people are able
to call functions B<outside> your Perl code, such as POSIX::system,
in which case they are able to run arbitrary external code.

=item *

Hash keys are B<never> tainted.

=back

For efficiency reasons, Perl takes a conservative view of
whether data is tainted. If an expression contains tainted data,
any subexpression may be considered tainted, even if the value
of the subexpression is not itself affected by the tainted data.

Because taintedness is associated with each scalar value, some
elements of an array or hash can be tainted and others not.
The keys of a hash are B<never> tainted.

For example:

    $arg = shift;               # $arg is tainted
    $hid = $arg . 'bar';        # $hid is also tainted
    $line = <>;                 # Tainted
    $line = <STDIN>;            # Also tainted
    open FOO, "/home/me/bar" or die $!;
    $line = <FOO>;              # Still tainted
    $path = $ENV{'PATH'};       # Tainted, but see below
    $data = 'abc';              # Not tainted

    system "echo $arg";         # Insecure
    system "/bin/echo", $arg;   # Considered insecure
                                # (Perl doesn't know about /bin/echo)
    system "echo $hid";         # Insecure
    system "echo $data";        # Insecure until PATH set

    $path = $ENV{'PATH'};       # $path now tainted

    $ENV{'PATH'} = '/bin:/usr/bin';
    delete @ENV{'IFS', 'CDPATH', 'ENV', 'BASH_ENV'};

    $path = $ENV{'PATH'};       # $path now NOT tainted
    system "echo $data";        # Is secure now!

    open(FOO, "< $arg");        # OK - read-only file
    open(FOO, "> $arg");        # Not OK - trying to write

    open(FOO,"echo $arg|");     # Not OK
    open(FOO,"-|")
        or exec 'echo', $arg;   # Also not OK

    $shout = `echo $arg`;       # Insecure, $shout now tainted

    unlink $data, $arg;         # Insecure
    umask $arg;                 # Insecure

    exec "echo $arg";           # Insecure
    exec "echo", $arg;          # Insecure
    exec "sh", '-c', $arg;      # Very insecure!

    @files = <*.c>;             # insecure (uses readdir() or similar)
    @files = glob('*.c');       # insecure (uses readdir() or similar)

    # In either case, the results of glob are tainted, since the list of
    # filenames comes from outside of the program.

    $bad = ($arg, 23);          # $bad will be tainted
    $arg, `true`;               # Insecure (although it isn't really)

If you try to do something insecure, you will get a fatal error saying
something like "Insecure dependency" or "Insecure $ENV{PATH}".

The exception to the principle of "one tainted value taints the whole
expression" is with the ternary conditional operator C<?:>. Since code
with a ternary conditional

    $result = $tainted_value ? "Untainted" : "Also untainted";

is effectively

    if ( $tainted_value ) {
        $result = "Untainted";
    } else {
        $result = "Also untainted";
    }

it doesn't make sense for C<$result> to be tainted.

=head2 Laundering and Detecting Tainted Data

To test whether a variable contains tainted data, whose use would
thus trigger an "Insecure dependency" message, you can use the
C<tainted()> function of the Scalar::Util module, available in your
nearby CPAN mirror, and included in Perl starting from the release 5.8.0.
Or you may be able to use the following C<is_tainted()> function.

    sub is_tainted {
        local $@;   # Don't pollute caller's value.
        return ! eval { eval("#" . substr(join("", @_), 0, 0)); 1 };
    }

This function makes use of the fact that the presence of tainted data
anywhere within an expression renders the entire expression tainted. It
would be inefficient for every operator to test every argument for
taintedness. Instead, the slightly more efficient and conservative
approach is used that if any tainted value has been accessed within the
same expression, the whole expression is considered tainted.
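
As an illustration of the C<Scalar::Util> route, here is a small sketch.
The helper C<taint_report> and the sample values are assumptions made for
this example only. When the program is not run under B<-T>, C<tainted()>
simply returns false for everything.

```perl
use strict;
use warnings;
use Scalar::Util qw(tainted);

# Hypothetical helper: map each name to 1 if its value is tainted, else 0.
# Copying a value into the hash keeps its taintedness, because taint is
# attached to each scalar value.
sub taint_report {
    my (%values) = @_;
    return map { ($_ => tainted($values{$_}) ? 1 : 0) } sort keys %values;
}

my %report = taint_report(
    literal => 'abc',                 # a constant is never tainted
    path    => ($ENV{PATH} // ''),    # tainted only when running under -T
);
print "$_: ", ($report{$_} ? "tainted" : "clean"), "\n" for sort keys %report;
```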

But testing for taintedness gets you only so far. Sometimes you just have
to clear your data's taintedness. Values may be untainted by using them
as keys in a hash; otherwise the only way to bypass the tainting
mechanism is by referencing subpatterns from a regular expression match.
Perl presumes that if you reference a substring using $1, $2, etc. in a
non-tainting pattern, you knew what you were doing when you wrote that
pattern. That means using a bit of thought--don't just blindly untaint
anything, or you defeat the entire mechanism. It's better to verify that
the variable has only good characters (for certain values of "good")
rather than checking whether it has any bad characters. That's because
it's far too easy to miss bad characters that you never thought of.

Here's a test to make sure that the data contains nothing but "word"
characters (alphabetics, numerics, and underscores), a hyphen, an at sign,
or a dot.

    if ($data =~ /^([-\@\w.]+)$/) {
        $data = $1;                     # $data now untainted
    } else {
        die "Bad data in '$data'";      # log this somewhere
    }

This is fairly secure because C</\w+/> doesn't normally match shell
metacharacters, nor are dot, dash, or at going to mean something special
to the shell. Use of C</.+/> would have been insecure in theory because
it lets everything through, but Perl doesn't check for that. The lesson
is that when untainting, you must be exceedingly careful with your patterns.
Apart from the hash-key approach mentioned above, laundering data using a
regular expression is the I<only> mechanism for untainting dirty data,
unless you use the strategy detailed below to fork a child of lesser
privilege.

The example does not untaint C<$data> if C<use locale> is in effect,
because the characters matched by C<\w> are determined by the locale.
Perl considers that locale definitions are untrustworthy because they
contain data from outside the program. If you are writing a
locale-aware program, and want to launder data with a regular expression
containing C<\w>, put C<no locale> ahead of the expression in the same
block. See L<perllocale/SECURITY> for further discussion and examples.
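
The laundering test and the C<no locale> advice can be combined in one
routine. This is only a sketch: the name C<launder_word> is hypothetical,
and it anchors with C<\A> and C<\z> rather than C<^> and C<$>, so that a
trailing newline is rejected instead of silently dropped.

```perl
use strict;
use warnings;

# Hypothetical helper: return a laundered copy of $value if it consists
# only of word characters, hyphens, at signs, and dots; otherwise undef.
sub launder_word {
    my ($value) = @_;
    return undef unless defined $value;
    no locale;    # keep \w independent of the (untrusted) locale
    return $value =~ /\A([-\@\w.]+)\z/ ? $1 : undef;
}

for my $candidate ('report-2024.txt', 'x; rm -rf /') {
    my $clean = launder_word($candidate);
    print defined $clean ? "accepted: $clean\n" : "rejected\n";
}
```

Returning C<undef> instead of dying leaves it to the caller to decide how
to log and report bad input.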

=head2 Switches On the "#!" Line

When you make a script executable, in order to make it usable as a
command, the system will pass switches to perl from the script's #!
line. Perl checks that any command line switches given to a setuid
(or setgid) script actually match the ones set on the #! line. Some
Unix and Unix-like environments impose a one-switch limit on the #!
line, so you may need to use something like C<-wU> instead of C<-w -U>
under such systems. (This issue should arise only in Unix or
Unix-like environments that support #! and setuid or setgid scripts.)

=head2 Taint mode and @INC

When the taint mode (C<-T>) is in effect, the "." directory is removed
from C<@INC>, and the environment variables C<PERL5LIB> and C<PERLLIB>
are ignored by Perl. You can still adjust C<@INC> from outside the
program by using the C<-I> command line option as explained in
L<perlrun>. The two environment variables are ignored because
they are obscured, and a user running a program could be unaware that
they are set, whereas the C<-I> option is clearly visible and
therefore permitted.

Another way to modify C<@INC> without modifying the program is to use
the C<lib> pragma, e.g.:

    perl -Mlib=/foo program

The benefit of using C<-Mlib=/foo> over C<-I/foo> is that the former
will automagically remove any duplicated directories, while the latter
will not.

Note that if a tainted string is added to C<@INC>, the following
problem will be reported:

    Insecure dependency in require while running with -T switch

=head2 Cleaning Up Your Path

For "Insecure C<$ENV{PATH}>" messages, you need to set C<$ENV{'PATH'}> to
a known value, and each directory in the path must be absolute and
non-writable by others than its owner and group. You may be surprised to
get this message even if the pathname to your executable is fully
qualified. This is I<not> generated because you didn't supply a full path
to the program; instead, it's generated because you never set your PATH
environment variable, or you didn't set it to something that was safe.
Because Perl can't guarantee that the executable in question isn't itself
going to turn around and execute some other program that is dependent on
your PATH, it makes sure you set the PATH.

The PATH isn't the only environment variable which can cause problems.
Because some shells may use the variables IFS, CDPATH, ENV, and
BASH_ENV, Perl checks that those are either empty or untainted when
starting subprocesses. You may wish to add something like this to your
set-id and taint-checking scripts.

    delete @ENV{qw(IFS CDPATH ENV BASH_ENV)};   # Make %ENV safer

It's also possible to get into trouble with other operations that don't
care whether they use tainted values. Make judicious use of the file
tests in dealing with any user-supplied filenames. When possible, do
opens and such B<after> properly dropping any special user (or group!)
privileges. Perl doesn't prevent you from opening tainted filenames for
reading, so be careful what you print out. The tainting mechanism is
intended to prevent stupid mistakes, not to remove the need for thought.

Perl does not call the shell to expand wild cards when you pass C<system>
and C<exec> explicit parameter lists instead of strings with possible shell
wildcards in them. Unfortunately, the C<open>, C<glob>, and
backtick functions provide no such alternate calling convention, so more
subterfuge will be required.
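
To illustrate the list form of C<system>, the sketch below passes an
argument full of shell metacharacters to a child process. The running perl
(C<$^X>) stands in for an external program, and the variable names are
made up for the example. Under B<-T> you would still need to clean up
C<%ENV> and launder the argument first.

```perl
use strict;
use warnings;

# A value that would be dangerous if a shell ever parsed it.
my $arg = '*; echo injected';

# List form: no shell is involved, so $arg reaches the child as a single
# literal argument.  The block names the program to run explicitly.
my @cmd = ($^X, '-e', 'print "got: $ARGV[0]\n"', $arg);
my $status = system { $cmd[0] } @cmd;
die "command failed: $?" if $status != 0;
```

The child sees the asterisk and the semicolon as plain characters; nothing
is expanded and no second command is run.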

Perl provides a reasonably safe way to open a file or pipe from a setuid
or setgid program: just create a child process with reduced privilege who
does the dirty work for you. First, fork a child using the special
C<open> syntax that connects the parent and child by a pipe. Now the
child resets its ID set and any other per-process attributes, like
environment variables, umasks, current working directories, back to the
originals or known safe values. Then the child process, which no longer
has any special permissions, does the C<open> or other system call.
Finally, the child passes the data it managed to access back to the
parent. Because the file or pipe was opened in the child while running
under less privilege than the parent, it's not apt to be tricked into
doing something it shouldn't.

Here's a way to do backticks reasonably safely. Notice how the C<exec> is
not called with a string that the shell could expand. This is by far the
best way to call something that might be subjected to shell escapes: just
never call the shell at all.

    use English;
    die "Can't fork: $!" unless defined($pid = open(KID, "-|"));
    if ($pid) {           # parent
        while (<KID>) {
            # do something
        }
        close KID;
    } else {
        my @temp     = ($EUID, $EGID);
        my $orig_uid = $UID;
        my $orig_gid = $GID;
        $EUID = $UID;
        $EGID = $GID;
        # Drop privileges
        $UID  = $orig_uid;
        $GID  = $orig_gid;
        # Make sure privs are really gone
        ($EUID, $EGID) = @temp;
        die "Can't drop privileges"
            unless $UID == $EUID  && $GID eq $EGID;
        $ENV{PATH} = "/bin:/usr/bin"; # Minimal PATH.
        # Consider sanitizing the environment even more.
        exec 'myprog', 'arg1', 'arg2'
            or die "can't exec myprog: $!";
    }

A similar strategy would work for wildcard expansion via C<glob>, although
you can use C<readdir> instead.
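
A rough C<readdir>-based replacement for a pattern such as C<*.c> might
look like the following. The helper name C<files_with_suffix> is an
assumption for illustration, and it is not an exact equivalent of
C<glob>: for instance, it does not skip names that begin with a dot.

```perl
use strict;
use warnings;

# Hypothetical helper: list the plain files in $dir whose names end in
# $suffix, without invoking a shell.  Under taint mode the returned names
# are still tainted, because they come from readdir().
sub files_with_suffix {
    my ($dir, $suffix) = @_;
    opendir(my $dh, $dir) or die "Can't opendir $dir: $!";
    my @names = sort grep { /\Q$suffix\E\z/ && -f "$dir/$_" } readdir($dh);
    closedir($dh);
    return @names;
}

print "$_\n" for files_with_suffix('.', '.c');
```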

Taint checking is most useful when, although you trust yourself not to have
written a program to give away the farm, you don't necessarily trust those
who end up using it not to try to trick it into doing something bad. This
is the kind of security checking that's useful for set-id programs and
programs launched on someone else's behalf, like CGI programs.

This is quite different, however, from not even trusting the writer of the
code not to try to do something evil. That's the kind of trust needed
when someone hands you a program you've never seen before and says, "Here,
run this." For that kind of safety, you might want to check out the Safe
module, included standard in the Perl distribution. This module allows the
programmer to set up special compartments in which all system operations
are trapped and namespace access is carefully controlled. Safe should
not be considered bullet-proof, though: it will not prevent the foreign
code from setting up infinite loops, allocating gigabytes of memory, or
even abusing perl bugs to make the host interpreter crash or behave in
unpredictable ways. In any case it's better avoided completely if you're
really concerned about security.
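
For readers who want to see what a compartment looks like, here is a
minimal sketch using the Safe module's C<new> and C<reval> methods with
the default operator mask. The code strings are invented for the example,
and the caveats above still apply: this shows the mechanism and is not a
recommendation to rely on it.

```perl
use strict;
use warnings;
use Safe;

my $compartment = Safe->new;    # default operator mask

# Plain arithmetic is permitted by the default mask.
my $sum = $compartment->reval('2 + 3');
print "sum: $sum\n";

# system() is not among the operators permitted by default, so this code
# is rejected when reval() compiles it, and $@ holds the reason.
my $result = $compartment->reval('system("echo should not run")');
my $error  = $@;
print "blocked: $error" if !defined $result && $error;
```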

=head2 Security Bugs

Beyond the obvious problems that stem from giving special privileges to
systems as flexible as scripts, on many versions of Unix, set-id scripts
are inherently insecure right from the start. The problem is a race
condition in the kernel. Between the time the kernel opens the file to
see which interpreter to run and when the (now-set-id) interpreter turns
around and reopens the file to interpret it, the file in question may have
changed, especially if you have symbolic links on your system.

Fortunately, sometimes this kernel "feature" can be disabled.
Unfortunately, there are two ways to disable it. The system can simply
outlaw scripts with any set-id bit set, which doesn't help much.
Alternately, it can simply ignore the set-id bits on scripts.

However, if the kernel set-id script feature isn't disabled, Perl will
complain loudly that your set-id script is insecure. You'll need to
either disable the kernel set-id script feature, or put a C wrapper around
the script. A C wrapper is just a compiled program that does nothing
except call your Perl program. Compiled programs are not subject to the
kernel bug that plagues set-id scripts. Here's a simple wrapper, written
in C:

    #include <unistd.h>
    #include <stdio.h>
    #include <string.h>
    #include <errno.h>

    #define REAL_PATH "/path/to/script"

    int main(int argc, char **argv)
    {
        execv(REAL_PATH, argv);
        fprintf(stderr, "%s: %s: %s\n",
                        argv[0], REAL_PATH, strerror(errno));
        return 127;
    }

Compile this wrapper into a binary executable and then make I<it> rather
than your script setuid or setgid.

In recent years, vendors have begun to supply systems free of this
inherent security bug. On such systems, when the kernel passes the name
of the set-id script to open to the interpreter, rather than using a
pathname subject to meddling, it instead passes I</dev/fd/3>. This is a
special file already opened on the script, so that there can be no race
condition for evil scripts to exploit. On these systems, Perl should be
compiled with C<-DSETUID_SCRIPTS_ARE_SECURE_NOW>. The F<Configure>
program that builds Perl tries to figure this out for itself, so you
should never have to specify this yourself. Most modern releases of
SysVr4 and BSD 4.4 use this approach to avoid the kernel race condition.

=head2 Protecting Your Programs

There are a number of ways to hide the source to your Perl programs,
with varying levels of "security".

First of all, however, you I<can't> take away read permission, because
the source code has to be readable in order to be compiled and
interpreted. (That doesn't mean that a CGI script's source is
readable by people on the web, though.) So you have to leave the
permissions at the socially friendly 0755 level. This lets only
people on your local system see your source.

Some people mistakenly regard this as a security problem. If your program does
insecure things, and relies on people not knowing how to exploit those
insecurities, it is not secure. It is often possible for someone to
determine the insecure things and exploit them without viewing the
source. Security through obscurity, the name for hiding your bugs
instead of fixing them, is little security indeed.

You can try using encryption via source filters (Filter::* from CPAN,
or Filter::Util::Call and Filter::Simple since Perl 5.8).
But crackers might be able to decrypt it. You can try using the byte
code compiler and interpreter described below, but crackers might be
able to de-compile it. You can try using the native-code compiler
described below, but crackers might be able to disassemble it. These
pose varying degrees of difficulty to people wanting to get at your
code, but none can definitively conceal it (this is true of every
language, not just Perl).

If you're concerned about people profiting from your code, then the
bottom line is that nothing but a restrictive license will give you
legal security. License your software and pepper it with threatening
statements like "This is unpublished proprietary software of XYZ Corp.
Your access to it does not give you permission to use it blah blah
blah." You should see a lawyer to be sure your license's wording will
stand up in court.

=head2 Unicode

Unicode is a new and complex technology and one may easily overlook
certain security pitfalls. See L<perluniintro> for an overview and
L<perlunicode> for details, and L<perlunicode/"Security Implications
of Unicode"> for security implications in particular.

=head2 Algorithmic Complexity Attacks

Certain internal algorithms used in the implementation of Perl can
be attacked by choosing the input carefully to consume large amounts
of either time or space or both. This can lead to so-called
I<Denial of Service> (DoS) attacks.

=over 4

=item *

Hash Algorithm - Hash algorithms like the one used in Perl are well
known to be vulnerable to collision attacks on their hash function.
Such attacks involve constructing a set of keys which collide into
the same bucket producing inefficient behavior. Such attacks often
depend on discovering the seed of the hash function used to map the
keys to buckets. That seed is then used to brute-force a key set which
can be used to mount a denial of service attack. In Perl 5.8.1 changes
were introduced to harden Perl against such attacks, and then later in
Perl 5.18.0 these features were enhanced and additional protections
added.

At the time of this writing, Perl 5.18.0 is considered to be
well-hardened against algorithmic complexity attacks on its hash
implementation. This is largely owing to the following measures,
which mitigate attacks:

=over 4

=item Hash Seed Randomization

In order to make it impossible to know what seed to generate an attack
key set for, this seed is randomly initialized at process start. This
may be overridden by using the PERL_HASH_SEED environment variable, see
L<perlrun/PERL_HASH_SEED>. This environment variable controls how
items are actually stored, not how they are presented via
C<keys>, C<values> and C<each>.

=item Hash Traversal Randomization

Independent of which seed is used in the hash function, C<keys>,
C<values>, and C<each> return items in a per-hash randomized order.
Modifying a hash by insertion will change the iteration order of that hash.
This behavior can be overridden by using C<hash_traversal_mask()> from
L<Hash::Util> or by using the PERL_PERTURB_KEYS environment variable,
see L<perlrun/PERL_PERTURB_KEYS>. Note that this feature controls the
"visible" order of the keys, and not the actual order they are stored in.

=item Bucket Order Perturbance

When items collide into a given hash bucket the order they are stored in
the chain is no longer predictable in Perl 5.18. This is intended to
make it harder to observe a collision. This behavior can be overridden
by using the PERL_PERTURB_KEYS environment variable, see
L<perlrun/PERL_PERTURB_KEYS>.

=item New Default Hash Function

The default hash function has been modified with the intention of making
it harder to infer the hash seed.

=item Alternative Hash Functions

The source code includes multiple hash algorithms to choose from. While we
believe that the default perl hash is robust to attack, we have included the
hash function Siphash as a fall-back option. At the time of release of
Perl 5.18.0 Siphash is believed to be of cryptographic strength. This is
not the default as it is much slower than the default hash.

=back

Without compiling a special Perl, there is no way to get the exact same
behavior of any versions prior to Perl 5.18.0. The closest one can get
is by setting PERL_PERTURB_KEYS to 0 and setting the PERL_HASH_SEED
to a known value. We do not advise those settings for production use
due to the above security considerations.

B<Perl has never guaranteed any ordering of the hash keys>, and
the ordering has already changed several times during the lifetime of
Perl 5. Also, the ordering of hash keys has always been, and continues
to be, affected by the insertion order and the history of changes made
to the hash over its lifetime.

Also note that while the order of the hash elements might be
randomized, this "pseudo-ordering" should B<not> be used for
applications like shuffling a list randomly (use C<List::Util::shuffle()>
for that, see L<List::Util>, a standard core module since Perl 5.8.0;
or the CPAN module C<Algorithm::Numerical::Shuffle>), or for generating
permutations (use e.g. the CPAN modules C<Algorithm::Permute> or
C<Algorithm::FastPermute>), or for any cryptographic applications.

Tied hashes may have their own ordering and algorithmic complexity
attacks.

=item *

Regular expressions - Perl's regular expression engine is a so-called NFA
(Non-deterministic Finite Automaton), which among other things means that
it can rather easily consume large amounts of both time and space if the
regular expression may match in several ways. Careful crafting of the
regular expressions can help but quite often there really isn't much
one can do (the book "Mastering Regular Expressions" is required
reading, see L<perlfaq2>). Running out of space manifests itself by
Perl running out of memory.

=item *

Sorting - the quicksort algorithm used in Perls before 5.8.0 to
implement the sort() function is very easy to trick into misbehaving
so that it consumes a lot of time. Starting from Perl 5.8.0 a different
sorting algorithm, mergesort, is used by default. Mergesort cannot
misbehave on any input.

=back
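
Regarding the hash item above: since the order returned by C<keys> is not
guaranteed, code that needs a particular order should impose one itself.
The sketch below uses an invented hash, C<%port_for>, and shows a sorted
traversal for repeatable output and C<List::Util::shuffle()> for a
deliberately random one.

```perl
use strict;
use warnings;
use List::Util qw(shuffle);

# Sample data, made up for this example.
my %port_for = (ssh => 22, smtp => 25, http => 80, https => 443);

# For repeatable output, impose an order rather than relying on keys().
my @by_name = sort keys %port_for;
print "sorted:   @by_name\n";

# For a random order, shuffle explicitly instead of using hash order.
my @random = shuffle(keys %port_for);
print "shuffled: @random\n";
```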

See L<https://www.usenix.org/legacy/events/sec03/tech/full_papers/crosby/crosby.pdf>
for more information, and any computer science textbook on algorithmic
complexity.
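
As one example of the careful crafting mentioned in the regular expression
item above, possessive quantifiers (available since Perl 5.10) stop the
engine from retrying alternative ways of dividing up the input after a
failure. This is a sketch only: the validator C<is_word_list> and its
input format are assumptions, and the technique helps only for patterns
that can be written this way.

```perl
use strict;
use warnings;

# Hypothetical validator: accept one or more comma-separated words.
# The possessive quantifiers ++ and *+ never give back what they have
# matched, so a failing input is rejected without backtracking into
# earlier words.
sub is_word_list {
    my ($text) = @_;
    return $text =~ /\A\w++(?:,\w++)*+\z/ ? 1 : 0;
}

print is_word_list('alpha,beta,gamma') ? "ok\n" : "rejected\n";
print is_word_list('alpha,,gamma')     ? "ok\n" : "rejected\n";
```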

=head1 SEE ALSO

L<perlrun> for its description of cleaning up environment variables.