=head1 NAME

perlsec - Perl security

=head1 DESCRIPTION

Perl is designed to make it easy to program securely even when running
with extra privileges, like setuid or setgid programs. Unlike most
command line shells, which are based on multiple substitution passes on
each line of the script, Perl uses a more conventional evaluation scheme
with fewer hidden snags. Additionally, because the language has more
builtin functionality, it can rely less upon external (and possibly
untrustworthy) programs to accomplish its purposes.

=head1 SECURITY VULNERABILITY CONTACT INFORMATION

If you believe you have found a security vulnerability in Perl, please
email the details to perl5-security-report@perl.org. This creates a new
Request Tracker ticket in a special queue which isn't initially publicly
accessible. The email will also be copied to a closed subscription
unarchived mailing list which includes all the core committers, who will
be able to help assess the impact of issues, figure out a resolution, and
help co-ordinate the release of patches to mitigate or fix the problem
across all platforms on which Perl is supported. Please only use this
address for security issues in the Perl core, not for modules
independently distributed on CPAN.

When sending an initial request to the security email address, please
don't Cc any other parties, because if they reply to all, the reply will
generate yet another new ticket. Once you have received an initial reply
with a C<[perl #NNNNNN]> ticket number in the headline, it's okay to Cc
subsequent replies to third parties: all emails to the
perl5-security-report address with the ticket number in the subject line
will be added to the ticket; without it, a new ticket will be created.

=head1 SECURITY MECHANISMS AND CONCERNS

=head2 Taint mode

Perl automatically enables a set of special security checks, called I<taint
mode>, when it detects its program running with differing real and effective
user or group IDs. The setuid bit in Unix permissions is mode 04000, the
setgid bit mode 02000; either or both may be set. You can also enable taint
mode explicitly by using the B<-T> command line flag. This flag is
I<strongly> suggested for server programs and any program run on behalf of
someone else, such as a CGI script. Once taint mode is on, it's on for
the remainder of your script.

While in this mode, Perl takes special precautions called I<taint
checks> to prevent both obvious and subtle traps. Some of these checks
are reasonably simple, such as verifying that path directories aren't
writable by others; careful programmers have always used checks like
these. Other checks, however, are best supported by the language itself,
and it is these checks especially that contribute to making a set-id Perl
program more secure than the corresponding C program.

You may not use data derived from outside your program to affect
something else outside your program--at least, not by accident. All
command line arguments, environment variables, locale information (see
L<perllocale>), results of certain system calls (C<readdir()>,
C<readlink()>, the variable of C<shmread()>, the messages returned by
C<msgrcv()>, the password, gcos and shell fields returned by the
C<getpwxxx()> calls), and all file input are marked as "tainted".
Tainted data may not be used directly or indirectly in any command
that invokes a sub-shell, nor in any command that modifies files,
directories, or processes, B<with the following exceptions>:

=over 4

=item *

Arguments to C<print> and C<syswrite> are B<not> checked for taintedness.

=item *

Symbolic methods

    $obj->$method(@args);

and symbolic sub references

    &{$foo}(@args);
    $foo->(@args);

are not checked for taintedness. This requires extra carefulness
unless you want external data to affect your control flow. Unless
you carefully limit what these symbolic values are, people are able
to call functions B<outside> your Perl code, such as POSIX::system,
in which case they are able to run arbitrary external code.

=item *

Hash keys are B<never> tainted.

=back

For efficiency reasons, Perl takes a conservative view of
whether data is tainted. If an expression contains tainted data,
any subexpression may be considered tainted, even if the value
of the subexpression is not itself affected by the tainted data.

Because taintedness is associated with each scalar value, some
elements of an array or hash can be tainted and others not.
The keys of a hash are B<never> tainted.

For example:

    $arg = shift;                   # $arg is tainted
    $hid = $arg . 'bar';            # $hid is also tainted
    $line = <>;                     # Tainted
    $line = <STDIN>;                # Also tainted
    open FOO, "/home/me/bar" or die $!;
    $line = <FOO>;                  # Still tainted
    $path = $ENV{'PATH'};           # Tainted, but see below
    $data = 'abc';                  # Not tainted

    system "echo $arg";             # Insecure
    system "/bin/echo", $arg;       # Considered insecure
                                    # (Perl doesn't know about /bin/echo)
    system "echo $hid";             # Insecure
    system "echo $data";            # Insecure until PATH set

    $path = $ENV{'PATH'};           # $path now tainted

    $ENV{'PATH'} = '/bin:/usr/bin';
    delete @ENV{'IFS', 'CDPATH', 'ENV', 'BASH_ENV'};

    $path = $ENV{'PATH'};           # $path now NOT tainted
    system "echo $data";            # Is secure now!

    open(FOO, "< $arg");            # OK - read-only file
    open(FOO, "> $arg");            # Not OK - trying to write

    open(FOO,"echo $arg|");         # Not OK
    open(FOO,"-|")
        or exec 'echo', $arg;       # Also not OK

    $shout = `echo $arg`;           # Insecure, $shout now tainted

    unlink $data, $arg;             # Insecure
    umask $arg;                     # Insecure

    exec "echo $arg";               # Insecure
    exec "echo", $arg;              # Insecure
    exec "sh", '-c', $arg;          # Very insecure!

    @files = <*.c>;                 # insecure (uses readdir() or similar)
    @files = glob('*.c');           # insecure (uses readdir() or similar)

    # In either case, the results of glob are tainted, since the list of
    # filenames comes from outside of the program.

    $bad = ($arg, 23);              # $bad will be tainted
    $arg, `true`;                   # Insecure (although it isn't really)

If you try to do something insecure, you will get a fatal error saying
something like "Insecure dependency" or "Insecure $ENV{PATH}".

The exception to the principle of "one tainted value taints the whole
expression" is with the ternary conditional operator C<?:>. Since code
with a ternary conditional

    $result = $tainted_value ? "Untainted" : "Also untainted";

is effectively

    if ( $tainted_value ) {
        $result = "Untainted";
    } else {
        $result = "Also untainted";
    }

it doesn't make sense for C<$result> to be tainted.

=head2 Laundering and Detecting Tainted Data

To test whether a variable contains tainted data, and whose use would
thus trigger an "Insecure dependency" message, you can use the
C<tainted()> function of the Scalar::Util module, available in your
nearby CPAN mirror, and included in Perl starting from the release 5.8.0.
Or you may be able to use the following C<is_tainted()> function.

    sub is_tainted {
        local $@;   # Don't pollute caller's value.
        return ! eval { eval("#" . substr(join("", @_), 0, 0)); 1 };
    }

This function makes use of the fact that the presence of tainted data
anywhere within an expression renders the entire expression tainted. It
would be inefficient for every operator to test every argument for
taintedness. Instead, the slightly more efficient and conservative
approach is used that if any tainted value has been accessed within the
same expression, the whole expression is considered tainted.

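As a rough sketch of the Scalar::Util route, the helper below labels each
of its arguments as tainted or clean. The name C<taint_report> is invented
for this illustration. When the script runs without C<-T>, C<tainted()>
reports false for every value, so the interesting output only appears under
taint mode.

```perl
use strict;
use warnings;
use Scalar::Util qw(tainted);

# Hypothetical helper: label each argument as tainted or clean.
# Without -T (or -t), tainted() reports false for everything.
sub taint_report {
    return map { tainted($_) ? 'tainted' : 'clean' } @_;
}

my $literal = 'abc';                    # a literal is never tainted
my ($status) = taint_report($literal);
print "literal is $status\n";

# Under -T, command line arguments would be reported as tainted.
print "first argument is $_\n" for taint_report(@ARGV ? $ARGV[0] : ());
```
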
But testing for taintedness gets you only so far. Sometimes you have just
to clear your data's taintedness. Values may be untainted by using them
as keys in a hash; otherwise the only way to bypass the tainting
mechanism is by referencing subpatterns from a regular expression match.
Perl presumes that if you reference a substring using $1, $2, etc. in a
non-tainting pattern, you knew what you were doing when you wrote that
pattern. That means using
a bit of thought--don't just blindly untaint anything, or you defeat the
entire mechanism. It's better to verify that the variable has only good
characters (for certain values of "good") rather than checking whether it
has any bad characters. That's because it's far too easy to miss bad
characters that you never thought of.

Here's a test to make sure that the data contains nothing but "word"
characters (alphabetics, numerics, and underscores), a hyphen, an at sign,
or a dot.

    if ($data =~ /^([-\@\w.]+)$/) {
        $data = $1;                     # $data now untainted
    } else {
        die "Bad data in '$data'";      # log this somewhere
    }

This is fairly secure because C</\w+/> doesn't normally match shell
metacharacters, nor are dot, dash, or at going to mean something special
to the shell. Use of C</.+/> would have been insecure in theory because
it lets everything through, but Perl doesn't check for that. The lesson
is that when untainting, you must be exceedingly careful with your patterns.
Laundering data using a regular expression is the I<only> mechanism for
untainting dirty data, unless you use the strategy detailed below to fork
a child of lesser privilege.

The example does not untaint C<$data> if C<use locale> is in effect,
because the characters matched by C<\w> are determined by the locale.
Perl considers that locale definitions are untrustworthy because they
contain data from outside the program. If you are writing a
locale-aware program, and want to launder data with a regular expression
containing C<\w>, put C<no locale> ahead of the expression in the same
block. See L<perllocale/SECURITY> for further discussion and examples.

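The laundering test can be wrapped in a small function so that every caller
applies the same whitelist. This is only a sketch: C<launder_token> is a
made-up name, and it anchors the pattern with C<\A> and C<\z> rather than
C<^> and C<$>, a slightly stricter variation that also rejects a trailing
newline.

```perl
use strict;
use warnings;

# Hypothetical helper: return an untainted copy of $value if it consists
# solely of word characters, hyphens, at signs, and dots; otherwise undef.
sub launder_token {
    my ($value) = @_;
    no locale;    # keep \w independent of the (untrusted) locale
    return undef unless defined $value;
    return $value =~ /\A([-\@\w.]+)\z/ ? $1 : undef;
}

for my $candidate ('report-2024.txt', 'x; rm -rf /') {
    my $clean = launder_token($candidate);
    print defined $clean ? "accepted: $clean\n" : "rejected: $candidate\n";
}
```

Returning C<undef> instead of dying leaves the logging and error policy to
the caller.
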
=head2 Switches On the "#!" Line

When you make a script executable, in order to make it usable as a
command, the system will pass switches to perl from the script's #!
line. Perl checks that any command line switches given to a setuid
(or setgid) script actually match the ones set on the #! line. Some
Unix and Unix-like environments impose a one-switch limit on the #!
line, so you may need to use something like C<-wU> instead of C<-w -U>
under such systems. (This issue should arise only in Unix or
Unix-like environments that support #! and setuid or setgid scripts.)

=head2 Taint mode and @INC

When the taint mode (C<-T>) is in effect, the "." directory is removed
from C<@INC>, and the environment variables C<PERL5LIB> and C<PERLLIB>
are ignored by Perl. You can still adjust C<@INC> from outside the
program by using the C<-I> command line option as explained in
L<perlrun>. The two environment variables are ignored because
they are obscured, and a user running a program could be unaware that
they are set, whereas the C<-I> option is clearly visible and
therefore permitted.

Another way to modify C<@INC> without modifying the program is to use
the C<lib> pragma, e.g.:

    perl -Mlib=/foo program

The benefit of using C<-Mlib=/foo> over C<-I/foo> is that the former
will automagically remove any duplicated directories, while the latter
will not.

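The duplicate removal can be observed from inside a program as well. The
sketch below loads the C<lib> pragma twice with the same directory and
counts how often that directory ends up in C<@INC>. The path
F</opt/myapp/lib> is an invented example and does not need to exist.

```perl
use strict;
use warnings;

# '/opt/myapp/lib' is a made-up directory used only for illustration.
use lib '/opt/myapp/lib';
use lib '/opt/myapp/lib';    # repeated on purpose

my $copies = grep { $_ eq '/opt/myapp/lib' } @INC;
print "entries for /opt/myapp/lib in \@INC: $copies\n";
```
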
Note that if a tainted string is added to C<@INC>, the following
problem will be reported:

    Insecure dependency in require while running with -T switch

=head2 Cleaning Up Your Path

For "Insecure C<$ENV{PATH}>" messages, you need to set C<$ENV{'PATH'}> to
a known value, and each directory in the path must be absolute and
non-writable by others than its owner and group. You may be surprised to
get this message even if the pathname to your executable is fully
qualified. This is I<not> generated because you didn't supply a full path
to the program; instead, it's generated because you never set your PATH
environment variable, or you didn't set it to something that was safe.
Because Perl can't guarantee that the executable in question isn't itself
going to turn around and execute some other program that is dependent on
your PATH, it makes sure you set the PATH.

The PATH isn't the only environment variable which can cause problems.
Because some shells may use the variables IFS, CDPATH, ENV, and
BASH_ENV, Perl checks that those are either empty or untainted when
starting subprocesses. You may wish to add something like this to your
setid and taint-checking scripts.

    delete @ENV{qw(IFS CDPATH ENV BASH_ENV)};   # Make %ENV safer

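Both steps, setting PATH to a known value and deleting the risky variables,
can be collected in one routine. The following is a minimal sketch; the
name C<sanitize_env> is an assumption of this example, and it takes a hash
reference so that it can be tried on a scratch hash before being pointed at
C<\%ENV>.

```perl
use strict;
use warnings;

# Hypothetical helper: put an environment hash into a known state.
# Pass \%ENV in a real program; a plain hash works for experiments.
sub sanitize_env {
    my ($env) = @_;
    $env->{PATH} = '/bin:/usr/bin';               # absolute directories only
    delete @{$env}{qw(IFS CDPATH ENV BASH_ENV)};  # variables some shells honor
    return $env;
}

my %sample = (PATH => '.:/tmp', IFS => ' ', HOME => '/home/me');
sanitize_env(\%sample);
print join(', ', map { "$_=$sample{$_}" } sort keys %sample), "\n";
```
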
It's also possible to get into trouble with other operations that don't
care whether they use tainted values. Make judicious use of the file
tests in dealing with any user-supplied filenames. When possible, do
opens and such B<after> properly dropping any special user (or group!)
privileges. Perl doesn't prevent you from opening tainted filenames for
reading, so be careful what you print out. The tainting mechanism is
intended to prevent stupid mistakes, not to remove the need for thought.

Perl does not call the shell to expand wild cards when you pass C<system>
and C<exec> explicit parameter lists instead of strings with possible shell
wildcards in them. Unfortunately, the C<open>, C<glob>, and
backtick functions provide no such alternate calling convention, so more
subterfuge will be required.

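To see the list form of C<system> at work, the sketch below hands a child
process an argument full of shell metacharacters. It uses the running perl
binary (C<$^X>) as a stand-in for an external program, which is purely a
convenience of this example. The child exits with status 0 only if the
argument arrived exactly as written.

```perl
use strict;
use warnings;

# Run the current perl ($^X) as a stand-in for an external program.
# The child exits 0 only if its argument arrived character for character.
my $tricky  = '$(echo hi); *';
my @command = ($^X, '-e', 'exit($ARGV[0] eq q{$(echo hi); *} ? 0 : 1)',
               $tricky);

my $status = system(@command);    # list form: no shell is involved
print $status == 0 ? "argument passed through unchanged\n"
                   : "argument was altered or the command failed\n";
```
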
Perl provides a reasonably safe way to open a file or pipe from a setuid
or setgid program: just create a child process with reduced privilege that
does the dirty work for you. First, fork a child using the special
C<open> syntax that connects the parent and child by a pipe. Now the
child resets its ID set and any other per-process attributes, like
environment variables, umasks, current working directories, back to the
originals or known safe values. Then the child process, which no longer
has any special permissions, does the C<open> or other system call.
Finally, the child passes the data it managed to access back to the
parent. Because the file or pipe was opened in the child while running
under less privilege than the parent, it's not apt to be tricked into
doing something it shouldn't.

Here's a way to do backticks reasonably safely. Notice how the C<exec> is
not called with a string that the shell could expand. This is by far the
best way to call something that might be subjected to shell escapes: just
never call the shell at all.

    use English;
    die "Can't fork: $!" unless defined($pid = open(KID, "-|"));
    if ($pid) {           # parent
        while (<KID>) {
            # do something
        }
        close KID;
    } else {
        my @temp     = ($EUID, $EGID);
        my $orig_uid = $UID;
        my $orig_gid = $GID;
        $EUID = $UID;
        $EGID = $GID;
        # Drop privileges
        $UID  = $orig_uid;
        $GID  = $orig_gid;
        # Make sure privs are really gone
        ($EUID, $EGID) = @temp;
        die "Can't drop privileges"
            unless $UID == $EUID && $GID eq $EGID;
        $ENV{PATH} = "/bin:/usr/bin"; # Minimal PATH.
        # Consider sanitizing the environment even more.
        exec 'myprog', 'arg1', 'arg2'
            or die "can't exec myprog: $!";
    }

A similar strategy would work for wildcard expansion via C<glob>, although
you can use C<readdir> instead.

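A C<readdir>-based replacement for a simple wildcard such as C<*.c> might
look like the sketch below. The function name C<files_with_suffix> is
hypothetical. Under taint mode the returned names are tainted, just as the
results of C<glob> are, so they still need laundering before being passed
to anything that modifies files.

```perl
use strict;
use warnings;

# Hypothetical helper: list plain files in $dir whose names end in $suffix,
# without involving a shell.
sub files_with_suffix {
    my ($dir, $suffix) = @_;
    opendir(my $dh, $dir) or die "Can't open directory $dir: $!";
    my @names = sort grep { /\Q$suffix\E\z/ && -f "$dir/$_" } readdir($dh);
    closedir($dh);
    return @names;
}

print "$_\n" for files_with_suffix('.', '.c');
```
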
Taint checking is most useful when, although you trust yourself not to have
written a program to give away the farm, you don't necessarily trust those
who end up using it not to try to trick it into doing something bad. This
is the kind of security checking that's useful for set-id programs and
programs launched on someone else's behalf, like CGI programs.

This is quite different, however, from not even trusting the writer of the
code not to try to do something evil. That's the kind of trust needed
when someone hands you a program you've never seen before and says, "Here,
run this." For that kind of safety, you might want to check out the Safe
module, included standard in the Perl distribution. This module allows the
programmer to set up special compartments in which all system operations
are trapped and namespace access is carefully controlled. Safe should
not be considered bullet-proof, though: it will not prevent the foreign
code from setting up infinite loops, allocating gigabytes of memory, or
even abusing perl bugs to make the host interpreter crash or behave in
unpredictable ways. In any case it's better avoided completely if you're
really concerned about security.

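For orientation only, here is a minimal sketch of what a Safe compartment
does with its default operator mask. It evaluates one harmless expression
and one that tries to call C<system>. The caveats above still apply: this
shows the trapping mechanism and is not a recommendation to rely on it.

```perl
use strict;
use warnings;
use Safe;

my $compartment = Safe->new;

# Plain arithmetic is allowed by the default operator mask.
my $sum = $compartment->reval('2 + 3');
print "sum: $sum\n";

# system() is trapped by the default mask; reval returns undef and sets $@.
my $result = $compartment->reval('system("echo", "hello")');
my $error  = $@;
print "trapped: $error" if !defined $result && $error;
```
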
=head2 Shebang Race Condition

Beyond the obvious problems that stem from giving special privileges to
systems as flexible as scripts, on many versions of Unix, set-id scripts
are inherently insecure right from the start. The problem is a race
condition in the kernel. Between the time the kernel opens the file to
see which interpreter to run and when the (now-set-id) interpreter turns
around and reopens the file to interpret it, the file in question may have
changed, especially if you have symbolic links on your system.

Some Unixes, especially more recent ones, are free of this
inherent security bug. On such systems, when the kernel passes the name
of the set-id script to open to the interpreter, rather than using a
pathname subject to meddling, it instead passes I</dev/fd/3>. This is a
special file already opened on the script, so that there can be no race
condition for evil scripts to exploit. On these systems, Perl should be
compiled with C<-DSETUID_SCRIPTS_ARE_SECURE_NOW>. The F<Configure>
program that builds Perl tries to figure this out for itself, so you
should never have to specify this yourself. Most modern releases of
SysVr4 and BSD 4.4 use this approach to avoid the kernel race condition.

If you don't have the safe version of set-id scripts, all is not lost.
Sometimes this kernel "feature" can be disabled, so that the kernel
either doesn't run set-id scripts with the set-id or doesn't run them
at all. Either way avoids the exploitability of the race condition,
but doesn't help in actually running scripts set-id.

If the kernel set-id script feature isn't disabled, then any set-id
script provides an exploitable vulnerability. Perl can't avoid being
exploitable, but will point out vulnerable scripts where it can. If Perl
detects that it is being applied to a set-id script then it will complain
loudly that your set-id script is insecure, and won't run it. When Perl
complains, you need to remove the set-id bit from the script to eliminate
the vulnerability. Refusing to run the script doesn't in itself close
the vulnerability; it is just Perl's way of encouraging you to do this.

To actually run a script set-id, if you don't have the safe version of
set-id scripts, you'll need to put a C wrapper around
the script. A C wrapper is just a compiled program that does nothing
except call your Perl program. Compiled programs are not subject to the
kernel bug that plagues set-id scripts. Here's a simple wrapper, written
in C:

    #include <unistd.h>
    #include <stdio.h>
    #include <string.h>
    #include <errno.h>

    #define REAL_PATH "/path/to/script"

    int main(int argc, char **argv)
    {
        execv(REAL_PATH, argv);
        fprintf(stderr, "%s: %s: %s\n",
                        argv[0], REAL_PATH, strerror(errno));
        return 127;
    }

Compile this wrapper into a binary executable and then make I<it> rather
than your script setuid or setgid. Note that this wrapper isn't doing
anything to sanitise the execution environment other than ensuring
that a safe path to the script is used. It only avoids the shebang
race condition. It relies on Perl's own features, and on the script
itself being careful, to make it safe enough to run the script set-id.

=head2 Protecting Your Programs

There are a number of ways to hide the source to your Perl programs,
with varying levels of "security".

First of all, however, you I<can't> take away read permission, because
the source code has to be readable in order to be compiled and
interpreted. (That doesn't mean that a CGI script's source is
readable by people on the web, though.) So you have to leave the
permissions at the socially friendly 0755 level. This lets
only people on your local system see your source.

Some people mistakenly regard this as a security problem. If your program does
insecure things, and relies on people not knowing how to exploit those
insecurities, it is not secure. It is often possible for someone to
determine the insecure things and exploit them without viewing the
source. Security through obscurity, the name for hiding your bugs
instead of fixing them, is little security indeed.

You can try using encryption via source filters (Filter::* from CPAN,
or Filter::Util::Call and Filter::Simple since Perl 5.8).
But crackers might be able to decrypt it. You can try using the byte
code compiler and interpreter described below, but crackers might be
able to de-compile it. You can try using the native-code compiler
described below, but crackers might be able to disassemble it. These
pose varying degrees of difficulty to people wanting to get at your
code, but none can definitively conceal it (this is true of every
language, not just Perl).

If you're concerned about people profiting from your code, then the
bottom line is that nothing but a restrictive license will give you
legal security. License your software and pepper it with threatening
statements like "This is unpublished proprietary software of XYZ Corp.
Your access to it does not give you permission to use it blah blah
blah." You should see a lawyer to be sure your license's wording will
stand up in court.

=head2 Unicode

Unicode is a new and complex technology and one may easily overlook
certain security pitfalls. See L<perluniintro> for an overview and
L<perlunicode> for details, and L<perlunicode/"Security Implications
of Unicode"> for security implications in particular.

=head2 Algorithmic Complexity Attacks

Certain internal algorithms used in the implementation of Perl can
be attacked by choosing the input carefully to consume large amounts
of either time or space or both. This can lead to so-called
I<Denial of Service> (DoS) attacks.

=over 4

=item *

Hash Algorithm - Hash algorithms like the one used in Perl are well
known to be vulnerable to collision attacks on their hash function.
Such attacks involve constructing a set of keys which collide into
the same bucket producing inefficient behavior. Such attacks often
depend on discovering the seed of the hash function used to map the
keys to buckets. That seed is then used to brute-force a key set which
can be used to mount a denial of service attack. In Perl 5.8.1 changes
were introduced to harden Perl against such attacks, and then later in
Perl 5.18.0 these features were enhanced and additional protections
added.

At the time of this writing, Perl 5.18.0 is considered to be
well-hardened against algorithmic complexity attacks on its hash
implementation. This is largely owing to the following measures,
which mitigate attacks:

=over 4

=item Hash Seed Randomization

In order to make it impossible to know what seed to generate an attack
key set for, this seed is randomly initialized at process start. This
may be overridden by using the PERL_HASH_SEED environment variable, see
L<perlrun/PERL_HASH_SEED>. This environment variable controls how
items are actually stored, not how they are presented via
C<keys>, C<values> and C<each>.

=item Hash Traversal Randomization

Independent of which seed is used in the hash function, C<keys>,
C<values>, and C<each> return items in a per-hash randomized order.
Modifying a hash by insertion will change the iteration order of that hash.
This behavior can be overridden by using C<hash_traversal_mask()> from
L<Hash::Util> or by using the PERL_PERTURB_KEYS environment variable,
see L<perlrun/PERL_PERTURB_KEYS>. Note that this feature controls the
"visible" order of the keys, and not the actual order they are stored in.

=item Bucket Order Perturbance

When items collide into a given hash bucket the order they are stored in
the chain is no longer predictable in Perl 5.18. This
is intended to make it harder to observe a
collision. This behavior can be overridden by using
the PERL_PERTURB_KEYS environment variable, see L<perlrun/PERL_PERTURB_KEYS>.

=item New Default Hash Function

The default hash function has been modified with the intention of making
it harder to infer the hash seed.

=item Alternative Hash Functions

The source code includes multiple hash algorithms to choose from. While we
believe that the default perl hash is robust to attack, we have included the
hash function Siphash as a fall-back option. At the time of release of
Perl 5.18.0 Siphash is believed to be of cryptographic strength. This is
not the default as it is much slower than the default hash.

=back

Without compiling a special Perl, there is no way to get the exact same
behavior of any versions prior to Perl 5.18.0. The closest one can get
is by setting PERL_PERTURB_KEYS to 0 and setting the PERL_HASH_SEED
to a known value. We do not advise those settings for production use
due to the above security considerations.

B<Perl has never guaranteed any ordering of the hash keys>, and
the ordering has already changed several times during the lifetime of
Perl 5. Also, the ordering of hash keys has always been, and continues
to be, affected by the insertion order and the history of changes made
to the hash over its lifetime.

Also note that while the order of the hash elements might be
randomized, this "pseudo-ordering" should B<not> be used for
applications like shuffling a list randomly (use C<List::Util::shuffle()>
for that, see L<List::Util>, a standard core module since Perl 5.8.0;
or the CPAN module C<Algorithm::Numerical::Shuffle>), or for generating
permutations (use e.g. the CPAN modules C<Algorithm::Permute> or
C<Algorithm::FastPermute>), or for any cryptographic applications.

Tied hashes may have their own ordering and algorithmic complexity
attacks.

=item *

Regular expressions - Perl's regular expression engine is a so-called NFA
(Non-deterministic Finite Automaton), which among other things means that
it can rather easily consume large amounts of both time and space if the
regular expression may match in several ways. Careful crafting of the
regular expressions can help but quite often there really isn't much
one can do (the book "Mastering Regular Expressions" is required
reading, see L<perlfaq2>). Running out of space manifests itself by
Perl running out of memory.

=item *

Sorting - the quicksort algorithm used in Perls before 5.8.0 to
implement the sort() function was very easy to trick into misbehaving
so that it consumes a lot of time. Starting from Perl 5.8.0 a different
sorting algorithm, mergesort, is used by default. Mergesort cannot
misbehave on any input.

=back

See L<https://www.usenix.org/legacy/events/sec03/tech/full_papers/crosby/crosby.pdf> for more information,
and any computer science textbook on algorithmic complexity.

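Because hash traversal order carries no guarantee, code that needs a
particular order has to impose one itself. The sketch below, built around
an invented C<%port_for> hash, sorts the keys explicitly and uses
C<List::Util::shuffle()> where a random order is wanted.

```perl
use strict;
use warnings;
use List::Util qw(shuffle);

my %port_for = (http => 80, https => 443, ssh => 22);   # sample data

# Impose an order explicitly instead of depending on hash traversal order.
my @by_name = sort keys %port_for;
my @by_port = sort { $port_for{$a} <=> $port_for{$b} } keys %port_for;
print "by name: @by_name\n";
print "by port: @by_port\n";

# For a random order, shuffle rather than leaning on hash randomization.
my @random = shuffle(@by_name);
print "shuffled: @random\n";
```
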
=head2 Using Sudo

The popular tool C<sudo> provides a controlled way for users to be able
to run programs as other users. It sanitises the execution environment
to some extent, and will avoid the L<shebang race condition|/"Shebang
Race Condition">. If you don't have the safe version of set-id scripts,
then C<sudo> may be a more convenient way of executing a script as
another user than writing a C wrapper would be.

However, C<sudo> sets the real user or group ID to that of the target
identity, not just the effective ID as set-id bits do. As a result, Perl
can't detect that it is running under C<sudo>, and so won't automatically
take its own security precautions such as turning on taint mode. Where
C<sudo> configuration dictates exactly which command can be run, the
approved command may include a C<-T> option to perl to enable taint mode.

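Since taint mode is not switched on automatically under C<sudo>, a script
can check for it at startup. The sketch below reads the C<${^TAINT}>
variable. The helper name C<taint_state> and the idea of refusing to run
are assumptions of this example, not something C<sudo> or Perl does for
you.

```perl
use strict;
use warnings;

# Hypothetical helper: describe the interpreter's taint state.
# ${^TAINT} is 1 under -T, -1 under -t (warnings only), and 0 otherwise.
sub taint_state {
    return ${^TAINT} > 0 ? 'enforced'
         : ${^TAINT} < 0 ? 'warnings only'
         :                 'off';
}

my $state = taint_state();
print "taint mode: $state\n";

# A script meant to run under sudo might refuse to continue here, e.g.
#   die "Refusing to run without -T\n" unless $state eq 'enforced';
```
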
In general, it is necessary to evaluate the suitability of a script to
run under C<sudo> specifically with that kind of execution environment
in mind. It is neither necessary nor sufficient for the same script to
be suitable to run in a traditional set-id arrangement, though many of
the issues overlap.

=head1 SEE ALSO

L<perlrun> for its description of cleaning up environment variables.