=head1 NAME

perlsec - Perl security

=head1 DESCRIPTION

Perl is designed to make it easy to program securely even when running
with extra privileges, like setuid or setgid programs. Unlike most
command line shells, which are based on multiple substitution passes on
each line of the script, Perl uses a more conventional evaluation scheme
with fewer hidden snags. Additionally, because the language has more
builtin functionality, it can rely less upon external (and possibly
untrustworthy) programs to accomplish its purposes.

=head1 SECURITY VULNERABILITY CONTACT INFORMATION

If you believe you have found a security vulnerability in Perl, please
email the details to perl5-security-report@perl.org. This creates a new
Request Tracker ticket in a special queue which isn't initially publicly
accessible. The email will also be copied to a closed subscription
unarchived mailing list which includes all the core committers, who will
be able to help assess the impact of issues, figure out a resolution, and
help co-ordinate the release of patches to mitigate or fix the problem
across all platforms on which Perl is supported. Please only use this
address for security issues in the Perl core, not for modules
independently distributed on CPAN.

When sending an initial request to the security email address, please
don't Cc any other parties, because if they reply to all, the reply will
generate yet another new ticket. Once you have received an initial reply
with a C<[perl #NNNNNN]> ticket number in the headline, it's okay to Cc
subsequent replies to third parties: all emails to the
perl5-security-report address with the ticket number in the subject line
will be added to the ticket; without it, a new ticket will be created.

=head1 SECURITY MECHANISMS AND CONCERNS

=head2 Taint mode

Perl automatically enables a set of special security checks, called I<taint
mode>, when it detects its program running with differing real and effective
user or group IDs. The setuid bit in Unix permissions is mode 04000, the
setgid bit mode 02000; either or both may be set. You can also enable taint
mode explicitly by using the B<-T> command line flag. This flag is
I<strongly> suggested for server programs and any program run on behalf of
someone else, such as a CGI script. Once taint mode is on, it's on for
the remainder of your script.

While in this mode, Perl takes special precautions called I<taint
checks> to prevent both obvious and subtle traps. Some of these checks
are reasonably simple, such as verifying that path directories aren't
writable by others; careful programmers have always used checks like
these. Other checks, however, are best supported by the language itself,
and it is these checks especially that contribute to making a set-id Perl
program more secure than the corresponding C program.

You may not use data derived from outside your program to affect
something else outside your program--at least, not by accident. All
command line arguments, environment variables, locale information (see
L<perllocale>), results of certain system calls (C<readdir()>,
C<readlink()>, the variable of C<shmread()>, the messages returned by
C<msgrcv()>, the password, gcos and shell fields returned by the
C<getpwxxx()> calls), and all file input are marked as "tainted".
Tainted data may not be used directly or indirectly in any command
that invokes a sub-shell, nor in any command that modifies files,
directories, or processes, B<with the following exceptions>:

=over 4

=item *

Arguments to C<print> and C<syswrite> are B<not> checked for taintedness.

=item *

Symbolic methods

    $obj->$method(@args);

and symbolic sub references

    &{$foo}(@args);
    $foo->(@args);

are not checked for taintedness. This requires extra care
unless you want external data to affect your control flow. Unless
you carefully limit what these symbolic values are, people are able
to call functions B<outside> your Perl code, such as POSIX::system,
in which case they are able to run arbitrary external code.

=item *

Hash keys are B<never> tainted.

=back

For efficiency reasons, Perl takes a conservative view of
whether data is tainted. If an expression contains tainted data,
any subexpression may be considered tainted, even if the value
of the subexpression is not itself affected by the tainted data.

Because taintedness is associated with each scalar value, some
elements of an array or hash can be tainted and others not.
The keys of a hash are B<never> tainted.

For example:

    $arg = shift;               # $arg is tainted
    $hid = $arg . 'bar';        # $hid is also tainted
    $line = <>;                 # Tainted
    $line = <STDIN>;            # Also tainted
    open FOO, "/home/me/bar" or die $!;
    $line = <FOO>;              # Still tainted
    $path = $ENV{'PATH'};       # Tainted, but see below
    $data = 'abc';              # Not tainted

    system "echo $arg";         # Insecure
    system "/bin/echo", $arg;   # Considered insecure
                                # (Perl doesn't know about /bin/echo)
    system "echo $hid";         # Insecure
    system "echo $data";        # Insecure until PATH set

    $path = $ENV{'PATH'};       # $path now tainted

    $ENV{'PATH'} = '/bin:/usr/bin';
    delete @ENV{'IFS', 'CDPATH', 'ENV', 'BASH_ENV'};

    $path = $ENV{'PATH'};       # $path now NOT tainted
    system "echo $data";        # Is secure now!

    open(FOO, "< $arg");        # OK - read-only file
    open(FOO, "> $arg");        # Not OK - trying to write

    open(FOO,"echo $arg|");     # Not OK
    open(FOO,"-|")
        or exec 'echo', $arg;   # Also not OK

    $shout = `echo $arg`;       # Insecure, $shout now tainted

    unlink $data, $arg;         # Insecure
    umask $arg;                 # Insecure

    exec "echo $arg";           # Insecure
    exec "echo", $arg;          # Insecure
    exec "sh", '-c', $arg;      # Very insecure!

    @files = <*.c>;             # insecure (uses readdir() or similar)
    @files = glob('*.c');       # insecure (uses readdir() or similar)

    # In either case, the results of glob are tainted, since the list of
    # filenames comes from outside of the program.

    $bad = ($arg, 23);          # $bad will be tainted
    $arg, `true`;               # Insecure (although it isn't really)

If you try to do something insecure, you will get a fatal error saying
something like "Insecure dependency" or "Insecure $ENV{PATH}".

The exception to the principle of "one tainted value taints the whole
expression" is with the ternary conditional operator C<?:>. Since code
with a ternary conditional

    $result = $tainted_value ? "Untainted" : "Also untainted";

is effectively

    if ( $tainted_value ) {
        $result = "Untainted";
    } else {
        $result = "Also untainted";
    }

it doesn't make sense for C<$result> to be tainted.

=head2 Laundering and Detecting Tainted Data

To test whether a variable contains tainted data, whose use would
thus trigger an "Insecure dependency" message, you can use the
C<tainted()> function of the Scalar::Util module, available from your
nearest CPAN mirror and included in Perl since release 5.8.0.
Or you may be able to use the following C<is_tainted()> function.

    sub is_tainted {
        local $@;   # Don't pollute caller's value.
        return ! eval { eval("#" . substr(join("", @_), 0, 0)); 1 };
    }

This function makes use of the fact that the presence of tainted data
anywhere within an expression renders the entire expression tainted. It
would be inefficient for every operator to test every argument for
taintedness. Instead, the slightly more efficient and conservative
approach is used that if any tainted value has been accessed within the
same expression, the whole expression is considered tainted.
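
As a minimal sketch of the C<Scalar::Util> route mentioned above, the following reports whether taint mode is active and whether two values are tainted. The helper name C<report()> and the choice of C<$ENV{PATH}> as the outside value are assumptions made for illustration; without B<-T> both values are reported as not tainted.

```perl
use strict;
use warnings;
use Scalar::Util qw(tainted);

# ${^TAINT} is nonzero when taint mode is active.
my $taint_mode = ${^TAINT} ? 1 : 0;

# report() is a hypothetical helper used only for this illustration.
sub report {
    my ($label, $value) = @_;
    return sprintf "%-8s %s", $label,
        tainted($value) ? "tainted" : "not tainted";
}

my $literal = 'abc';                                 # constant from the program
my $env     = defined $ENV{PATH} ? $ENV{PATH} : '';  # normally from outside

print "taint mode: ", ($taint_mode ? "on" : "off"), "\n";
print report('literal', $literal), "\n";
print report('env', $env), "\n";
```

Running the same script once with C<perl -T> and once without is a quick way to see which of your inputs Perl treats as tainted.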

But testing for taintedness gets you only so far. Sometimes you just have
to clear your data's taintedness. Values may be untainted by using them
as keys in a hash; otherwise the only way to bypass the tainting
mechanism is by referencing subpatterns from a regular expression match.
Perl presumes that if you reference a substring using $1, $2, etc. in a
non-tainting pattern, you knew what you were doing when you wrote that
pattern. That means using
a bit of thought--don't just blindly untaint anything, or you defeat the
entire mechanism. It's better to verify that the variable has only good
characters (for certain values of "good") rather than checking whether it
has any bad characters. That's because it's far too easy to miss bad
characters that you never thought of.

Here's a test to make sure that the data contains nothing but "word"
characters (alphabetics, numerics, and underscores), a hyphen, an at sign,
or a dot.

    if ($data =~ /^([-\@\w.]+)$/) {
        $data = $1;                     # $data now untainted
    } else {
        die "Bad data in '$data'";      # log this somewhere
    }

This is fairly secure because C</\w+/> doesn't normally match shell
metacharacters, nor are dot, dash, or at going to mean something special
to the shell. Use of C</.+/> would have been insecure in theory because
it lets everything through, but Perl doesn't check for that. The lesson
is that when untainting, you must be exceedingly careful with your patterns.
Laundering data using a regular expression is the I<only> mechanism for
untainting dirty data, unless you use the strategy detailed below to fork
a child of lesser privilege.

The example does not untaint C<$data> if C<use locale> is in effect,
because the characters matched by C<\w> are determined by the locale.
Perl considers that locale definitions are untrustworthy because they
contain data from outside the program. If you are writing a
locale-aware program, and want to launder data with a regular expression
containing C<\w>, put C<no locale> ahead of the expression in the same
block. See L<perllocale/SECURITY> for further discussion and examples.
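
The allow-list approach described in this section can be wrapped in a small function. This is only a sketch: C<untaint_token()> is a hypothetical name, not part of Perl or any module, and it assumes the same character set as the test above. It anchors with C<\A> and C<\z> rather than C<^> and C<$>, which is slightly stricter because a trailing newline is rejected as well.

```perl
use strict;
use warnings;

# untaint_token() is a hypothetical helper for this illustration.
# It returns the captured (and therefore untainted) value when the whole
# input consists only of allowed characters, and undef otherwise.
sub untaint_token {
    my ($value) = @_;
    return undef unless defined $value;
    no locale;    # keep \w independent of any locale in effect
    if ($value =~ /\A([-\@\w.]+)\z/) {
        return $1;
    }
    return undef;
}

for my $input ('report-2024.txt', 'user@example.com', 'rm -rf /; echo', '') {
    my $clean = untaint_token($input);
    printf "%-20s => %s\n", "'$input'",
        defined $clean ? "accepted" : "rejected";
}
```

Returning C<undef> instead of dying leaves it to the caller to decide how to log and handle rejected input.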

=head2 Switches On the "#!" Line

When you make a script executable, in order to make it usable as a
command, the system will pass switches to perl from the script's #!
line. Perl checks that any command line switches given to a setuid
(or setgid) script actually match the ones set on the #! line. Some
Unix and Unix-like environments impose a one-switch limit on the #!
line, so you may need to use something like C<-wU> instead of C<-w -U>
under such systems. (This issue should arise only in Unix or
Unix-like environments that support #! and setuid or setgid scripts.)

=head2 Taint mode and @INC

When the taint mode (C<-T>) is in effect, the "." directory is removed
from C<@INC>, and the environment variables C<PERL5LIB> and C<PERLLIB>
are ignored by Perl. You can still adjust C<@INC> from outside the
program by using the C<-I> command line option as explained in
L<perlrun>. The two environment variables are ignored because
they are obscured, and a user running a program could be unaware that
they are set, whereas the C<-I> option is clearly visible and
therefore permitted.

Another way to modify C<@INC> without modifying the program is to use
the C<lib> pragma, e.g.:

    perl -Mlib=/foo program

The benefit of using C<-Mlib=/foo> over C<-I/foo> is that the former
will automagically remove any duplicated directories, while the latter
will not.
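
The duplicate removal done by the C<lib> pragma can be observed from inside a program. The sketch below requests the same directory twice and counts how often it ends up in C<@INC>; the directory name F</opt/myapp/lib> is made up for the example and does not need to exist.

```perl
use strict;
use warnings;

# '/opt/myapp/lib' is a hypothetical directory used only for illustration.
use lib '/opt/myapp/lib';
use lib '/opt/myapp/lib';    # requested a second time on purpose

my $count = grep { $_ eq '/opt/myapp/lib' } @INC;
print "entries for /opt/myapp/lib in \@INC: $count\n";
```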

Note that if a tainted string is added to C<@INC>, the following
problem will be reported:

    Insecure dependency in require while running with -T switch

=head2 Cleaning Up Your Path

For "Insecure C<$ENV{PATH}>" messages, you need to set C<$ENV{'PATH'}> to
a known value, and each directory in the path must be absolute and
non-writable by others than its owner and group. You may be surprised to
get this message even if the pathname to your executable is fully
qualified. This is I<not> generated because you didn't supply a full path
to the program; instead, it's generated because you never set your PATH
environment variable, or you didn't set it to something that was safe.
Because Perl can't guarantee that the executable in question isn't itself
going to turn around and execute some other program that is dependent on
your PATH, it makes sure you set the PATH.

The PATH isn't the only environment variable which can cause problems.
Because some shells may use the variables IFS, CDPATH, ENV, and
BASH_ENV, Perl checks that those are either empty or untainted when
starting subprocesses. You may wish to add something like this to your
set-id and taint-checking scripts.

    delete @ENV{qw(IFS CDPATH ENV BASH_ENV)};   # Make %ENV safer

It's also possible to get into trouble with other operations that don't
care whether they use tainted values. Make judicious use of the file
tests in dealing with any user-supplied filenames. When possible, do
opens and such B<after> properly dropping any special user (or group!)
privileges. Perl doesn't prevent you from
opening tainted filenames for reading,
so be careful what you print out. The tainting mechanism is intended to
prevent stupid mistakes, not to remove the need for thought.

Perl does not call the shell to expand wild cards when you pass C<system>
and C<exec> explicit parameter lists instead of strings with possible shell
wildcards in them. Unfortunately, the C<open>, C<glob>, and
backtick functions provide no such alternate calling convention, so more
subterfuge will be required.

Perl provides a reasonably safe way to open a file or pipe from a setuid
or setgid program: just create a child process with reduced privilege that
does the dirty work for you. First, fork a child using the special
C<open> syntax that connects the parent and child by a pipe. Now the
child resets its ID set and any other per-process attributes, like
environment variables, umasks, current working directories, back to the
originals or known safe values. Then the child process, which no longer
has any special permissions, does the C<open> or other system call.
Finally, the child passes the data it managed to access back to the
parent. Because the file or pipe was opened in the child while running
under less privilege than the parent, it's not apt to be tricked into
doing something it shouldn't.

Here's a way to do backticks reasonably safely. Notice how the C<exec> is
not called with a string that the shell could expand. This is by far the
best way to call something that might be subjected to shell escapes: just
never call the shell at all.

    use English;
    die "Can't fork: $!" unless defined($pid = open(KID, "-|"));
    if ($pid) {           # parent
        while (<KID>) {
            # do something
        }
        close KID;
    } else {
        my @temp     = ($EUID, $EGID);
        my $orig_uid = $UID;
        my $orig_gid = $GID;
        $EUID = $UID;
        $EGID = $GID;
        # Drop privileges
        $UID  = $orig_uid;
        $GID  = $orig_gid;
        # Make sure privs are really gone
        ($EUID, $EGID) = @temp;
        die "Can't drop privileges"
            unless $UID == $EUID  && $GID eq $EGID;
        $ENV{PATH} = "/bin:/usr/bin"; # Minimal PATH.
        # Consider sanitizing the environment even more.
        exec 'myprog', 'arg1', 'arg2'
            or die "can't exec myprog: $!";
    }
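
Where no privilege drop is needed, the list form of a piped C<open> (available since Perl 5.8) is a shorter way to read a command's output without involving the shell. The sketch below is an illustration under that assumption, not a replacement for the fork-and-drop approach above: C<run_and_capture()> is a hypothetical helper, and C<$^X> (the running perl) merely stands in for C<myprog>. Under taint mode you would still need a clean C<$ENV{PATH}> and untainted arguments.

```perl
use strict;
use warnings;

# run_and_capture() is a hypothetical helper for this illustration.
# The list form of a piped open hands each argument directly to the
# program, so no shell ever interprets them.
sub run_and_capture {
    my ($program, @args) = @_;
    open(my $kid, '-|', $program, @args)
        or die "Can't run $program: $!";
    my @lines = <$kid>;
    close($kid)
        or die "$program failed: ", ($! || "exit status $?");
    chomp @lines;
    return @lines;
}

# The last argument contains shell metacharacters, which reach the
# child unchanged because no shell is involved.
my @out = run_and_capture($^X, '-e', 'print "$ARGV[0]\n"', 'a; echo b');
print "child saw: $_\n" for @out;
```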

A similar strategy would work for wildcard expansion via C<glob>, although
you can use C<readdir> instead.

Taint checking is most useful when, although you trust yourself not to have
written a program that gives away the farm, you don't necessarily trust those
who end up using it not to try to trick it into doing something bad. This
is the kind of security checking that's useful for set-id programs and
programs launched on someone else's behalf, like CGI programs.

This is quite different, however, from not even trusting the writer of the
code not to try to do something evil. That's the kind of trust needed
when someone hands you a program you've never seen before and says, "Here,
run this." For that kind of safety, you might want to check out the Safe
module, included standard in the Perl distribution. This module allows the
programmer to set up special compartments in which all system operations
are trapped and namespace access is carefully controlled. Safe should
not be considered bullet-proof, though: it will not prevent the foreign
code from setting up infinite loops, allocating gigabytes of memory, or even
abusing perl bugs to make the host interpreter crash or behave in
unpredictable ways. In any case it's better avoided completely if you're
really concerned about security.

=head2 Security Bugs

Beyond the obvious problems that stem from giving special privileges to
systems as flexible as scripts, on many versions of Unix, set-id scripts
are inherently insecure right from the start. The problem is a race
condition in the kernel. Between the time the kernel opens the file to
see which interpreter to run and when the (now-set-id) interpreter turns
around and reopens the file to interpret it, the file in question may have
changed, especially if you have symbolic links on your system.

Fortunately, sometimes this kernel "feature" can be disabled.
Unfortunately, there are two ways to disable it. The system can simply
outlaw scripts with any set-id bit set, which doesn't help much.
Alternately, it can simply ignore the set-id bits on scripts.

However, if the kernel set-id script feature isn't disabled, Perl will
complain loudly that your set-id script is insecure. You'll need to
either disable the kernel set-id script feature, or put a C wrapper around
the script. A C wrapper is just a compiled program that does nothing
except call your Perl program. Compiled programs are not subject to the
kernel bug that plagues set-id scripts. Here's a simple wrapper, written
in C:

    #include <stdio.h>
    #include <unistd.h>

    #define REAL_PATH "/path/to/script"

    int main(int ac, char **av)
    {
        execv(REAL_PATH, av);
        perror(REAL_PATH);      /* reached only if execv() failed */
        return 1;
    }

Compile this wrapper into a binary executable and then make I<it> rather
than your script setuid or setgid.

In recent years, vendors have begun to supply systems free of this
inherent security bug. On such systems, when the kernel passes the name
of the set-id script to open to the interpreter, rather than using a
pathname subject to meddling, it instead passes I</dev/fd/3>. This is a
special file already opened on the script, so that there can be no race
condition for evil scripts to exploit. On these systems, Perl should be
compiled with C<-DSETUID_SCRIPTS_ARE_SECURE_NOW>. The F<Configure>
program that builds Perl tries to figure this out for itself, so you
should never have to specify this yourself. Most modern releases of
SysVr4 and BSD 4.4 use this approach to avoid the kernel race condition.

=head2 Protecting Your Programs

There are a number of ways to hide the source to your Perl programs,
with varying levels of "security".

First of all, however, you I<can't> take away read permission, because
the source code has to be readable in order to be compiled and
interpreted. (That doesn't mean that a CGI script's source is
readable by people on the web, though.) So you have to leave the
permissions at the socially friendly 0755 level. This lets
only people on your local system see your source.

Some people mistakenly regard this as a security problem. If your program does
insecure things, and relies on people not knowing how to exploit those
insecurities, it is not secure. It is often possible for someone to
determine the insecure things and exploit them without viewing the
source. Security through obscurity, the name for hiding your bugs
instead of fixing them, is little security indeed.

You can try using encryption via source filters (Filter::* from CPAN,
or Filter::Util::Call and Filter::Simple since Perl 5.8).
But crackers might be able to decrypt it. You can try using the byte
code compiler and interpreter described below, but crackers might be
able to de-compile it. You can try using the native-code compiler
described below, but crackers might be able to disassemble it. These
pose varying degrees of difficulty to people wanting to get at your
code, but none can definitively conceal it (this is true of every
language, not just Perl).

If you're concerned about people profiting from your code, then the
bottom line is that nothing but a restrictive license will give you
legal security. License your software and pepper it with threatening
statements like "This is unpublished proprietary software of XYZ Corp.
Your access to it does not give you permission to use it blah blah
blah." You should see a lawyer to be sure your license's wording will
stand up in court.

=head2 Unicode

Unicode is a new and complex technology and one may easily overlook
certain security pitfalls. See L<perluniintro> for an overview and
L<perlunicode> for details, and L<perlunicode/"Security Implications
of Unicode"> for security implications in particular.

=head2 Algorithmic Complexity Attacks

Certain internal algorithms used in the implementation of Perl can
be attacked by choosing the input carefully to consume large amounts
of either time or space or both. This can lead to so-called
I<Denial of Service> (DoS) attacks.

=over 4

=item *

Hash Algorithm - Hash algorithms like the one used in Perl are well
known to be vulnerable to collision attacks on their hash function.
Such attacks involve constructing a set of keys which collide into
the same bucket, producing inefficient behavior. Such attacks often
depend on discovering the seed of the hash function used to map the
keys to buckets. That seed is then used to brute-force a key set which
can be used to mount a denial of service attack. In Perl 5.8.1 changes
were introduced to harden Perl against such attacks, and then later in
Perl 5.18.0 these features were enhanced and additional protections
added.

At the time of this writing, Perl 5.18.0 is considered to be
well-hardened against algorithmic complexity attacks on its hash
implementation. This is largely owed to the following measures,
which mitigate attacks:

=over 4

=item Hash Seed Randomization

In order to make it impossible to know what seed to generate an attack
key set for, this seed is randomly initialized at process start. This
may be overridden by using the PERL_HASH_SEED environment variable, see
L<perlrun/PERL_HASH_SEED>. This environment variable controls how
items are actually stored, not how they are presented via
C<keys>, C<values> and C<each>.

=item Hash Traversal Randomization

Independent of which seed is used in the hash function, C<keys>,
C<values>, and C<each> return items in a per-hash randomized order.
Modifying a hash by insertion will change the iteration order of that hash.
This behavior can be overridden by using C<hash_traversal_mask()> from
L<Hash::Util> or by using the PERL_PERTURB_KEYS environment variable,
see L<perlrun/PERL_PERTURB_KEYS>. Note that this feature controls the
"visible" order of the keys, and not the actual order they are stored in.

=item Bucket Order Perturbance

When items collide into a given hash bucket the order they are stored in
the chain is no longer predictable in Perl 5.18. This
is intended to make it harder to observe a
collision. This behavior can be overridden by using
the PERL_PERTURB_KEYS environment variable, see L<perlrun/PERL_PERTURB_KEYS>.

=item New Default Hash Function

The default hash function has been modified with the intention of making
it harder to infer the hash seed.

=item Alternative Hash Functions

The source code includes multiple hash algorithms to choose from. While we
believe that the default perl hash is robust to attack, we have included the
hash function Siphash as a fall-back option. At the time of release of
Perl 5.18.0 Siphash is believed to be of cryptographic strength. This is
not the default as it is much slower than the default hash.

=back

Without compiling a special Perl, there is no way to get the exact same
behavior as in versions prior to Perl 5.18.0. The closest one can get
is by setting PERL_PERTURB_KEYS to 0 and setting the PERL_HASH_SEED
to a known value. We do not advise those settings for production use
due to the above security considerations.

B<Perl has never guaranteed any ordering of the hash keys>, and
the ordering has already changed several times during the lifetime of
Perl 5. Also, the ordering of hash keys has always been, and continues
to be, affected by the insertion order and the history of changes made
to the hash over its lifetime.

Also note that while the order of the hash elements might be
randomized, this "pseudo-ordering" should B<not> be used for
applications like shuffling a list randomly (use C<List::Util::shuffle()>
for that, see L<List::Util>, a standard core module since Perl 5.8.0;
or the CPAN module C<Algorithm::Numerical::Shuffle>), or for generating
permutations (use e.g. the CPAN modules C<Algorithm::Permute> or
C<Algorithm::FastPermute>), or for any cryptographic applications.

Tied hashes may have their own ordering and algorithmic complexity
attacks.

=item *

Regular expressions - Perl's regular expression engine is a so-called NFA
(Non-deterministic Finite Automaton), which among other things means that
it can rather easily consume large amounts of both time and space if the
regular expression may match in several ways. Careful crafting of the
regular expressions can help but quite often there really isn't much
one can do (the book "Mastering Regular Expressions" is required
reading, see L<perlfaq2>). Running out of space manifests itself by
Perl running out of memory.

=item *

Sorting - the quicksort algorithm used in Perls before 5.8.0 to
implement the sort() function is very easy to trick into misbehaving
so that it consumes a lot of time. Starting from Perl 5.8.0 a different
sorting algorithm, mergesort, is used by default. Mergesort cannot
misbehave on any input.

=back
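
Because hash order is neither guaranteed nor suitable as a source of randomness, code that needs a particular order should impose it explicitly. The following sketch shows both cases discussed in the hash item above: a reproducible order via C<sort>, and a random order via C<List::Util::shuffle()>. The hash contents and variable names are invented for the example, and C<shuffle()> is not meant for cryptographic use.

```perl
use strict;
use warnings;
use List::Util qw(shuffle);

# Example data, invented for this illustration.
my %port_for = (ssh => 22, smtp => 25, http => 80, https => 443);

# For reproducible output, impose an order explicitly instead of
# relying on whatever order keys() happens to return.
my @by_name = sort keys %port_for;
print "sorted:   @by_name\n";

# For a random order, shuffle a list; do not lean on hash randomization.
my @random = shuffle(@by_name);
print "shuffled: @random\n";
```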

See L<https://www.usenix.org/legacy/events/sec03/tech/full_papers/crosby/crosby.pdf> for more information,
and any computer science textbook on algorithmic complexity.

=head1 SEE ALSO

L<perlrun> for its description of cleaning up environment variables.