=head1 NAME

perlsec - Perl security

=head1 DESCRIPTION

Perl is designed to make it easy to program securely even when running
with extra privileges, like setuid or setgid programs. Unlike most
command line shells, which are based on multiple substitution passes on
each line of the script, Perl uses a more conventional evaluation scheme
with fewer hidden snags. Additionally, because the language has more
builtin functionality, it can rely less upon external (and possibly
untrustworthy) programs to accomplish its purposes.

=head1 SECURITY VULNERABILITY CONTACT INFORMATION

If you believe you have found a security vulnerability in Perl, please
email the details to perl5-security-report@perl.org. This creates a new
Request Tracker ticket in a special queue which isn't initially publicly
accessible. The email will also be copied to a closed subscription
unarchived mailing list which includes all the core committers, who will
be able to help assess the impact of issues, figure out a resolution, and
help co-ordinate the release of patches to mitigate or fix the problem
across all platforms on which Perl is supported. Please only use this
address for security issues in the Perl core, not for modules
independently distributed on CPAN.

When sending an initial request to the security email address, please
don't Cc any other parties, because if they reply to all, the reply will
generate yet another new ticket. Once you have received an initial reply
with a C<[perl #NNNNNN]> ticket number in the headline, it's okay to Cc
subsequent replies to third parties: all emails to the
perl5-security-report address with the ticket number in the subject line
will be added to the ticket; without it, a new ticket will be created.

=head1 SECURITY MECHANISMS AND CONCERNS

=head2 Taint mode

Perl automatically enables a set of special security checks, called I<taint
mode>, when it detects its program running with differing real and effective
user or group IDs. The setuid bit in Unix permissions is mode 04000, the
setgid bit mode 02000; either or both may be set. You can also enable taint
mode explicitly by using the B<-T> command line flag. This flag is
I<strongly> suggested for server programs and any program run on behalf of
someone else, such as a CGI script. Once taint mode is on, it's on for
the remainder of your script.

While in this mode, Perl takes special precautions called I<taint
checks> to prevent both obvious and subtle traps. Some of these checks
are reasonably simple, such as verifying that path directories aren't
writable by others; careful programmers have always used checks like
these. Other checks, however, are best supported by the language itself,
and it is these checks especially that contribute to making a set-id Perl
program more secure than the corresponding C program.

You may not use data derived from outside your program to affect
something else outside your program--at least, not by accident. All
command line arguments, environment variables, locale information (see
L<perllocale>), results of certain system calls (C<readdir()>,
C<readlink()>, the variable of C<shmread()>, the messages returned by
C<msgrcv()>, the password, gcos and shell fields returned by the
C<getpwxxx()> calls), and all file input are marked as "tainted".
Tainted data may not be used directly or indirectly in any command
that invokes a sub-shell, nor in any command that modifies files,
directories, or processes, B<with the following exceptions>:

=over 4

=item *

Arguments to C<print> and C<syswrite> are B<not> checked for taintedness.

=item *

Symbolic methods

    $obj->$method(@args);

and symbolic sub references

    &{$foo}(@args);
    $foo->(@args);

are not checked for taintedness. This requires extra care
unless you want external data to affect your control flow. Unless
you carefully limit what these symbolic values are, people are able
to call functions B<outside> your Perl code, such as POSIX::system,
in which case they are able to run arbitrary external code.

=item *

Hash keys are B<never> tainted.

=back

For efficiency reasons, Perl takes a conservative view of
whether data is tainted. If an expression contains tainted data,
any subexpression may be considered tainted, even if the value
of the subexpression is not itself affected by the tainted data.

Because taintedness is associated with each scalar value, some
elements of an array or hash can be tainted and others not.
The keys of a hash are B<never> tainted.

For example:

    $arg = shift;               # $arg is tainted
    $hid = $arg . 'bar';        # $hid is also tainted
    $line = <>;                 # Tainted
    $line = <STDIN>;            # Also tainted
    open FOO, "/home/me/bar" or die $!;
    $line = <FOO>;              # Still tainted
    $path = $ENV{'PATH'};       # Tainted, but see below
    $data = 'abc';              # Not tainted

    system "echo $arg";         # Insecure
    system "/bin/echo", $arg;   # Considered insecure
                                # (Perl doesn't know about /bin/echo)
    system "echo $hid";         # Insecure
    system "echo $data";        # Insecure until PATH set

    $path = $ENV{'PATH'};       # $path now tainted

    $ENV{'PATH'} = '/bin:/usr/bin';
    delete @ENV{'IFS', 'CDPATH', 'ENV', 'BASH_ENV'};

    $path = $ENV{'PATH'};       # $path now NOT tainted
    system "echo $data";        # Is secure now!

    open(FOO, "< $arg");        # OK - read-only file
    open(FOO, "> $arg");        # Not OK - trying to write

    open(FOO,"echo $arg|");     # Not OK
    open(FOO,"-|")
        or exec 'echo', $arg;   # Also not OK

    $shout = `echo $arg`;       # Insecure, $shout now tainted

    unlink $data, $arg;         # Insecure
    umask $arg;                 # Insecure

    exec "echo $arg";           # Insecure
    exec "echo", $arg;          # Insecure
    exec "sh", '-c', $arg;      # Very insecure!

    @files = <*.c>;             # insecure (uses readdir() or similar)
    @files = glob('*.c');       # insecure (uses readdir() or similar)

    # In either case, the results of glob are tainted, since the list of
    # filenames comes from outside of the program.

    $bad = ($arg, 23);          # $bad will be tainted
    $arg, `true`;               # Insecure (although it isn't really)

If you try to do something insecure, you will get a fatal error saying
something like "Insecure dependency" or "Insecure $ENV{PATH}".

The exception to the principle of "one tainted value taints the whole
expression" is with the ternary conditional operator C<?:>. Since code
with a ternary conditional

    $result = $tainted_value ? "Untainted" : "Also untainted";

is effectively

    if ( $tainted_value ) {
        $result = "Untainted";
    } else {
        $result = "Also untainted";
    }

it doesn't make sense for C<$result> to be tainted.

=head2 Laundering and Detecting Tainted Data

To test whether a variable contains tainted data, whose use would
thus trigger an "Insecure dependency" message, you can use the
C<tainted()> function of the Scalar::Util module, available in your
nearby CPAN mirror, and included in Perl starting from the release 5.8.0.
Or you may be able to use the following C<is_tainted()> function.

    sub is_tainted {
        local $@;   # Don't pollute caller's value.
        return ! eval { eval("#" . substr(join("", @_), 0, 0)); 1 };
    }

This function makes use of the fact that the presence of tainted data
anywhere within an expression renders the entire expression tainted. It
would be inefficient for every operator to test every argument for
taintedness. Instead, the slightly more efficient and conservative
approach is used that if any tainted value has been accessed within the
same expression, the whole expression is considered tainted.

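As a rough illustration of the check described above, the sketch below
reports the taint status of a few named values using
C<Scalar::Util::tainted()>. The helper name C<taint_report> is hypothetical
and not part of any module. Without B<-T>, C<tainted()> returns false for
every value, so the script has to be run as C<perl -T> for the two values
to be reported differently.

```perl
use strict;
use warnings;
use Scalar::Util qw(tainted);

# Hypothetical helper: takes name => value pairs and returns one
# "name: tainted" or "name: clean" string per pair, sorted by name.
sub taint_report {
    my (%value_for) = @_;
    my @lines;
    for my $name (sort keys %value_for) {
        my $state = tainted($value_for{$name}) ? 'tainted' : 'clean';
        push @lines, "$name: $state";
    }
    return @lines;
}

# $ENV{PATH} comes from outside the program; 'abc' is a literal.
print "$_\n" for taint_report(
    path    => $ENV{PATH} // '',
    literal => 'abc',
);
```

The values are passed as hash I<values>, which keep their taintedness;
only hash keys lose it.
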
But testing for taintedness gets you only so far. Sometimes you have just
to clear your data's taintedness. Values may be untainted by using them
as keys in a hash; otherwise the only way to bypass the tainting
mechanism is by referencing subpatterns from a regular expression match.
Perl presumes that if you reference a substring using $1, $2, etc. in a
non-tainting pattern, you knew what you were doing when you wrote that
pattern. That means using a bit of thought--don't just blindly untaint
anything, or you defeat the entire mechanism. It's better to verify that
the variable has only good characters (for certain values of "good")
rather than checking whether it has any bad characters. That's because
it's far too easy to miss bad characters that you never thought of.

Here's a test to make sure that the data contains nothing but "word"
characters (alphabetics, numerics, and underscores), a hyphen, an at sign,
or a dot.

    if ($data =~ /^([-\@\w.]+)$/) {
        $data = $1;                     # $data now untainted
    } else {
        die "Bad data in '$data'";      # log this somewhere
    }

This is fairly secure because C</\w+/> doesn't normally match shell
metacharacters, nor are dot, dash, or at going to mean something special
to the shell. Use of C</.+/> would have been insecure in theory because
it lets everything through, but Perl doesn't check for that. The lesson
is that when untainting, you must be exceedingly careful with your patterns.
Apart from the hash-key route mentioned above, laundering data using a
regular expression is the I<only> mechanism for untainting dirty data,
unless you use the strategy detailed below to fork a child of lesser
privilege.

The example does not untaint C<$data> if C<use locale> is in effect,
because the characters matched by C<\w> are determined by the locale.
Perl considers that locale definitions are untrustworthy because they
contain data from outside the program. If you are writing a
locale-aware program, and want to launder data with a regular expression
containing C<\w>, put C<no locale> ahead of the expression in the same
block. See L<perllocale/SECURITY> for further discussion and examples.

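The allow-list test above can be wrapped in a small reusable routine. The
following is a minimal sketch, not a recommended API: the name
C<launder_word> is hypothetical, and the C<\A> and C<\z> anchors are a
deliberate substitution for C<^> and C<$>, because C<$> also matches just
before a trailing newline. As noted above, the capture is only untainted
when C<use locale> is not in effect.

```perl
use strict;
use warnings;

# Hypothetical helper, not a core function: returns an untainted copy of
# $value when it contains only word characters, hyphen, at sign, or dot;
# dies otherwise.
sub launder_word {
    my ($value) = @_;
    die "Bad data: undefined value\n" unless defined $value;

    # \A and \z anchor at the very start and end of the string; unlike $,
    # \z does not tolerate a trailing newline.
    if ($value =~ /\A([-\@\w.]+)\z/) {
        return $1;    # a capture from a successful match is untainted
    }
    die "Bad data in '$value'\n";    # log this somewhere
}

my $login = launder_word('jane.doe@example.org');
print "accepted: $login\n";
```

Callers that prefer to handle rejection themselves can wrap the call in
C<eval { ... }> and inspect C<$@>.
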
=head2 Switches On the "#!" Line

When you make a script executable, in order to make it usable as a
command, the system will pass switches to perl from the script's #!
line. Perl checks that any command line switches given to a setuid
(or setgid) script actually match the ones set on the #! line. Some
Unix and Unix-like environments impose a one-switch limit on the #!
line, so you may need to use something like C<-wU> instead of C<-w -U>
under such systems. (This issue should arise only in Unix or
Unix-like environments that support #! and setuid or setgid scripts.)

=head2 Taint mode and @INC

When the taint mode (C<-T>) is in effect, the environment variables
C<PERL5LIB> and C<PERLLIB>
are ignored by Perl. You can still adjust C<@INC> from outside the
program by using the C<-I> command line option as explained in
L<perlrun|perlrun/-Idirectory>. The two environment variables are
ignored because they are obscured, and a user running a program could
be unaware that they are set, whereas the C<-I> option is clearly
visible and therefore permitted.

Another way to modify C<@INC> without modifying the program is to use
the C<lib> pragma, e.g.:

    perl -Mlib=/foo program

The benefit of using C<-Mlib=/foo> over C<-I/foo> is that the former
will automagically remove any duplicated directories, while the latter
will not.

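The duplicate-removal behaviour can also be observed from inside a
program, since C<-Mlib=/foo> amounts to C<use lib '/foo'>. A minimal
sketch, assuming F</foo> is only a placeholder directory as in the command
above (it does not need to exist):

```perl
use strict;
use warnings;

# Loading the same directory twice leaves a single entry in @INC.
use lib '/foo';
use lib '/foo';

my $count = grep { $_ eq '/foo' } @INC;
print "/foo appears $count time(s) in \@INC\n";
```
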
Note that if a tainted string is added to C<@INC>, the following
problem will be reported:

    Insecure dependency in require while running with -T switch

On versions of Perl before 5.26, activating taint mode will also remove
the current directory (".") from the default value of C<@INC>. Since
version 5.26, the current directory isn't included in C<@INC> by
default.

=head2 Cleaning Up Your Path

For "Insecure C<$ENV{PATH}>" messages, you need to set C<$ENV{'PATH'}> to
a known value, and each directory in the path must be absolute and
non-writable by others than its owner and group. You may be surprised to
get this message even if the pathname to your executable is fully
qualified. This is I<not> generated because you didn't supply a full path
to the program; instead, it's generated because you never set your PATH
environment variable, or you didn't set it to something that was safe.
Because Perl can't guarantee that the executable in question isn't itself
going to turn around and execute some other program that is dependent on
your PATH, it makes sure you set the PATH.

The PATH isn't the only environment variable which can cause problems.
Because some shells may use the variables IFS, CDPATH, ENV, and
BASH_ENV, Perl checks that those are either empty or untainted when
starting subprocesses. You may wish to add something like this to your
set-id and taint-checking scripts:

    delete @ENV{qw(IFS CDPATH ENV BASH_ENV)};   # Make %ENV safer

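The two steps described above, fixing PATH and removing the shell-related
variables, can be combined in one routine. This is a sketch under stated
assumptions: C<sanitize_env> is a hypothetical name, and F</bin:/usr/bin>
is simply the value used elsewhere in this document; it is only safe if
those directories are not writable by others on your system.

```perl
use strict;
use warnings;

# Hypothetical helper: give an environment hash a fixed PATH and drop the
# variables that some shells act upon. Pass \%ENV to clean the real one.
sub sanitize_env {
    my ($env) = @_;
    $env->{PATH} = '/bin:/usr/bin';               # known, absolute dirs
    delete @{$env}{qw(IFS CDPATH ENV BASH_ENV)};  # Make %ENV safer
    return $env;
}

# Demonstrated on a copy so the sketch has no side effects.
my %copy = (%ENV, IFS => ' ', CDPATH => '/tmp');
sanitize_env(\%copy);
print "PATH is now $copy{PATH}\n";
```

In a real set-id script the call would be C<sanitize_env(\%ENV)>, made
before any subprocess is started.
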
It's also possible to get into trouble with other operations that don't
care whether they use tainted values. Make judicious use of the file
tests in dealing with any user-supplied filenames. When possible, do
opens and such B<after> properly dropping any special user (or group!)
privileges. Perl doesn't prevent you from
opening tainted filenames for reading,
so be careful what you print out. The tainting mechanism is intended to
prevent stupid mistakes, not to remove the need for thought.

Perl does not call the shell to expand wild cards when you pass C<system>
and C<exec> explicit parameter lists instead of strings with possible shell
wildcards in them. Unfortunately, the C<open>, C<glob>, and
backtick functions provide no such alternate calling convention, so more
subterfuge will be required.

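To make the list-form calling convention concrete, here is a small sketch.
The wrapper name C<run_no_shell> is hypothetical; it uses the
C<system { PROGRAM } LIST> form so that list semantics apply even when
there are no arguments. C<$^X> (the running perl binary) stands in for a
real command only to keep the example self-contained.

```perl
use strict;
use warnings;

# Hypothetical helper: run a program with an explicit argument list and
# return its exit code. No shell is involved, so metacharacters in the
# arguments are passed through literally.
sub run_no_shell {
    my ($program, @args) = @_;
    # The block form names the program separately, which keeps list
    # semantics even when @args is empty.
    my $status = system { $program } $program, @args;
    die "Can't run $program: $!\n" if $status == -1;
    return $status >> 8;
}

# The '*.c' argument reaches the child as a literal string, unexpanded.
my $code = run_no_shell($^X, '-e', 'print "got: @ARGV\n"', '*.c');
print "exit code: $code\n";
```

Under taint mode the arguments would still need to be untainted first;
avoiding the shell does not bypass the taint checks.
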
Perl provides a reasonably safe way to open a file or pipe from a setuid
or setgid program: just create a child process with reduced privilege who
does the dirty work for you. First, fork a child using the special
C<open> syntax that connects the parent and child by a pipe. Now the
child resets its ID set and any other per-process attributes, like
environment variables, umasks, current working directories, back to the
originals or known safe values. Then the child process, which no longer
has any special permissions, does the C<open> or other system call.
Finally, the child passes the data it managed to access back to the
parent. Because the file or pipe was opened in the child while running
under less privilege than the parent, it's not apt to be tricked into
doing something it shouldn't.

Here's a way to do backticks reasonably safely. Notice how the C<exec> is
not called with a string that the shell could expand. This is by far the
best way to call something that might be subjected to shell escapes: just
never call the shell at all.

    use English;
    die "Can't fork: $!" unless defined($pid = open(KID, "-|"));
    if ($pid) {           # parent
        while (<KID>) {
            # do something
        }
        close KID;
    } else {
        my @temp     = ($EUID, $EGID);
        my $orig_uid = $UID;
        my $orig_gid = $GID;
        $EUID = $UID;
        $EGID = $GID;
        # Drop privileges
        $UID  = $orig_uid;
        $GID  = $orig_gid;
        # Make sure privs are really gone
        ($EUID, $EGID) = @temp;
        die "Can't drop privileges"
            unless $UID == $EUID && $GID eq $EGID;
        $ENV{PATH} = "/bin:/usr/bin"; # Minimal PATH.
        # Consider sanitizing the environment even more.
        exec 'myprog', 'arg1', 'arg2'
            or die "can't exec myprog: $!";
    }

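Where dropping privileges is not the concern and only the shell needs to
be avoided, the list form of a piping C<open> is a more compact way to
read a command's output. The sketch below is an assumption-laden
alternative, not a replacement for the example above: C<capture_no_shell>
is a hypothetical name, and the routine does nothing about user or group
IDs.

```perl
use strict;
use warnings;

# Hypothetical helper: return the output lines of a command that is run
# from an explicit argument list. It does NOT drop privileges; combine it
# with the ID handling shown above where that matters.
sub capture_no_shell {
    my ($program, @args) = @_;
    # With a single command string perl may still consult the shell,
    # so insist on a real argument list.
    die "capture_no_shell needs at least one argument\n" unless @args;
    open(my $kid, '-|', $program, @args)
        or die "Can't run $program: $!\n";
    my @lines = <$kid>;
    close $kid;
    return @lines;
}

# $^X is the running perl binary, used only to keep the sketch portable.
my @out = capture_no_shell($^X, '-e', 'print "$_\n" for @ARGV', 'a b', '*');
print "child said: $_" for @out;
```
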
A similar strategy would work for wildcard expansion via C<glob>, although
you can use C<readdir> instead.

Taint checking is most useful when, although you trust yourself not to have
written a program to give away the farm, you don't necessarily trust those
who end up using it not to try to trick it into doing something bad. This
is the kind of security checking that's useful for set-id programs and
programs launched on someone else's behalf, like CGI programs.

This is quite different, however, from not even trusting the writer of the
code not to try to do something evil. That's the kind of trust needed
when someone hands you a program you've never seen before and says, "Here,
run this." For that kind of safety, you might want to check out the Safe
module, included standard in the Perl distribution. This module allows the
programmer to set up special compartments in which all system operations
are trapped and namespace access is carefully controlled. Safe should
not be considered bullet-proof, though: it will not prevent the foreign
code from setting up infinite loops, allocating gigabytes of memory, or
even abusing perl bugs to make the host interpreter crash or behave in
unpredictable ways. In any case it's better avoided completely if you're
really concerned about security.

=head2 Shebang Race Condition

Beyond the obvious problems that stem from giving special privileges to
systems as flexible as scripts, on many versions of Unix, set-id scripts
are inherently insecure right from the start. The problem is a race
condition in the kernel. Between the time the kernel opens the file to
see which interpreter to run and when the (now-set-id) interpreter turns
around and reopens the file to interpret it, the file in question may have
changed, especially if you have symbolic links on your system.

Some Unixes, especially more recent ones, are free of this
inherent security bug. On such systems, when the kernel passes the name
of the set-id script to open to the interpreter, rather than using a
pathname subject to meddling, it instead passes I</dev/fd/3>. This is a
special file already opened on the script, so that there can be no race
condition for evil scripts to exploit. On these systems, Perl should be
compiled with C<-DSETUID_SCRIPTS_ARE_SECURE_NOW>. The F<Configure>
program that builds Perl tries to figure this out for itself, so you
should never have to specify this yourself. Most modern releases of
SysVr4 and BSD 4.4 use this approach to avoid the kernel race condition.

If you don't have the safe version of set-id scripts, all is not lost.
Sometimes this kernel "feature" can be disabled, so that the kernel
either doesn't run set-id scripts with the set-id or doesn't run them
at all. Either way avoids the exploitability of the race condition,
but doesn't help in actually running scripts set-id.

If the kernel set-id script feature isn't disabled, then any set-id
script provides an exploitable vulnerability. Perl can't avoid being
exploitable, but will point out vulnerable scripts where it can. If Perl
detects that it is being applied to a set-id script then it will complain
loudly that your set-id script is insecure, and won't run it. When Perl
complains, you need to remove the set-id bit from the script to eliminate
the vulnerability. Refusing to run the script doesn't in itself close
the vulnerability; it is just Perl's way of encouraging you to do this.

To actually run a script set-id, if you don't have the safe version of
set-id scripts, you'll need to put a C wrapper around
the script. A C wrapper is just a compiled program that does nothing
except call your Perl program. Compiled programs are not subject to the
kernel bug that plagues set-id scripts. Here's a simple wrapper, written
in C:

    #include <unistd.h>
    #include <stdio.h>
    #include <string.h>
    #include <errno.h>

    #define REAL_PATH "/path/to/script"

    int main(int argc, char **argv)
    {
        execv(REAL_PATH, argv);
        fprintf(stderr, "%s: %s: %s\n",
                        argv[0], REAL_PATH, strerror(errno));
        return 127;
    }

Compile this wrapper into a binary executable and then make I<it> rather
than your script setuid or setgid. Note that this wrapper isn't doing
anything to sanitise the execution environment other than ensuring
that a safe path to the script is used. It only avoids the shebang
race condition. It relies on Perl's own features, and on the script
itself being careful, to make it safe enough to run the script set-id.

=head2 Protecting Your Programs

There are a number of ways to hide the source to your Perl programs,
with varying levels of "security".

First of all, however, you I<can't> take away read permission, because
the source code has to be readable in order to be compiled and
interpreted. (That doesn't mean that a CGI script's source is
readable by people on the web, though.) So you have to leave the
permissions at the socially friendly 0755 level. This lets
only people on your local system see your source.

Some people mistakenly regard this as a security problem. If your program does
insecure things, and relies on people not knowing how to exploit those
insecurities, it is not secure. It is often possible for someone to
determine the insecure things and exploit them without viewing the
source. Security through obscurity, the name for hiding your bugs
instead of fixing them, is little security indeed.

You can try using encryption via source filters (Filter::* from CPAN,
or Filter::Util::Call and Filter::Simple since Perl 5.8).
But crackers might be able to decrypt it. You can try using the byte
code compiler and interpreter described below, but crackers might be
able to de-compile it. You can try using the native-code compiler
described below, but crackers might be able to disassemble it. These
pose varying degrees of difficulty to people wanting to get at your
code, but none can definitively conceal it (this is true of every
language, not just Perl).

If you're concerned about people profiting from your code, then the
bottom line is that nothing but a restrictive license will give you
legal security. License your software and pepper it with threatening
statements like "This is unpublished proprietary software of XYZ Corp.
Your access to it does not give you permission to use it blah blah
blah." You should see a lawyer to be sure your license's wording will
stand up in court.

=head2 Unicode

Unicode is a new and complex technology and one may easily overlook
certain security pitfalls. See L<perluniintro> for an overview and
L<perlunicode> for details, and L<perlunicode/"Security Implications
of Unicode"> for security implications in particular.

=head2 Algorithmic Complexity Attacks

Certain internal algorithms used in the implementation of Perl can
be attacked by choosing the input carefully to consume large amounts
of either time or space or both. This can lead to so-called
I<Denial of Service> (DoS) attacks.

=over 4

=item *

Hash Algorithm - Hash algorithms like the one used in Perl are well
known to be vulnerable to collision attacks on their hash function.
Such attacks involve constructing a set of keys which collide into
the same bucket producing inefficient behavior. Such attacks often
depend on discovering the seed of the hash function used to map the
keys to buckets. That seed is then used to brute-force a key set which
can be used to mount a denial of service attack. In Perl 5.8.1 changes
were introduced to harden Perl against such attacks, and then later in
Perl 5.18.0 these features were enhanced and additional protections
added.

At the time of this writing, Perl 5.18.0 is considered to be
well-hardened against algorithmic complexity attacks on its hash
implementation. This is largely owed to the following measures that
mitigate attacks:

=over 4

=item Hash Seed Randomization

In order to make it impossible to know what seed to generate an attack
key set for, this seed is randomly initialized at process start. This
may be overridden by using the PERL_HASH_SEED environment variable, see
L<perlrun/PERL_HASH_SEED>. This environment variable controls how
items are actually stored, not how they are presented via
C<keys>, C<values> and C<each>.

=item Hash Traversal Randomization

Independent of which seed is used in the hash function, C<keys>,
C<values>, and C<each> return items in a per-hash randomized order.
Modifying a hash by insertion will change the iteration order of that hash.
This behavior can be overridden by using C<hash_traversal_mask()> from
L<Hash::Util> or by using the PERL_PERTURB_KEYS environment variable,
see L<perlrun/PERL_PERTURB_KEYS>. Note that this feature controls the
"visible" order of the keys, and not the actual order they are stored in.

=item Bucket Order Perturbance

When items collide into a given hash bucket the order they are stored in
the chain is no longer predictable in Perl 5.18. This
is intended to make it harder to observe a
collision. This behavior can be overridden by using
the PERL_PERTURB_KEYS environment variable, see L<perlrun/PERL_PERTURB_KEYS>.

=item New Default Hash Function

The default hash function has been modified with the intention of making
it harder to infer the hash seed.

=item Alternative Hash Functions

The source code includes multiple hash algorithms to choose from. While we
believe that the default perl hash is robust to attack, we have included the
hash function Siphash as a fall-back option. At the time of release of
Perl 5.18.0 Siphash is believed to be of cryptographic strength. This is
not the default as it is much slower than the default hash.

=back

Without compiling a special Perl, there is no way to get the exact same
behavior of any versions prior to Perl 5.18.0. The closest one can get
is by setting PERL_PERTURB_KEYS to 0 and setting the PERL_HASH_SEED
to a known value. We do not advise those settings for production use
due to the above security considerations.

B<Perl has never guaranteed any ordering of the hash keys>, and
the ordering has already changed several times during the lifetime of
Perl 5. Also, the ordering of hash keys has always been, and continues
to be, affected by the insertion order and the history of changes made
to the hash over its lifetime.

Also note that while the order of the hash elements might be
randomized, this "pseudo-ordering" should B<not> be used for
applications like shuffling a list randomly (use C<List::Util::shuffle()>
for that, see L<List::Util>, a standard core module since Perl 5.8.0;
or the CPAN module C<Algorithm::Numerical::Shuffle>), or for generating
permutations (use e.g. the CPAN modules C<Algorithm::Permute> or
C<Algorithm::FastPermute>), or for any cryptographic applications.

Tied hashes may have their own ordering and algorithmic complexity
attacks.

=item *

Regular expressions - Perl's regular expression engine is a so-called NFA
(Non-deterministic Finite Automaton), which among other things means that
it can rather easily consume large amounts of both time and space if the
regular expression may match in several ways. Careful crafting of the
regular expressions can help but quite often there really isn't much
one can do (the book "Mastering Regular Expressions" is required
reading, see L<perlfaq2>). Running out of space manifests itself by
Perl running out of memory.

=item *

Sorting - the quicksort algorithm used in Perls before 5.8.0 to
implement the sort() function was very easy to trick into misbehaving
so that it consumes a lot of time. Starting from Perl 5.8.0 a different
sorting algorithm, mergesort, is used by default. Mergesort cannot
misbehave on any input.

=back

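The advice on hash ordering given in the list above can be illustrated
with a short sketch: sort the keys when a reproducible order is needed,
and call C<List::Util::shuffle()> when a random order is needed. The hash
contents are made up for the example.

```perl
use strict;
use warnings;
use List::Util qw(shuffle);

my %port_for = (ftp => 21, ssh => 22, smtp => 25, http => 80);

# Reproducible order: sort the keys explicitly instead of relying on
# whatever order keys() happens to return.
my @stable = sort keys %port_for;
print "stable:   @stable\n";

# Random order: shuffle the list; do not treat hash order as a shuffle.
my @shuffled = shuffle(@stable);
print "shuffled: @shuffled\n";
```

Neither hash order nor C<shuffle()> should be relied on for cryptographic
purposes.
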
See L<https://www.usenix.org/legacy/events/sec03/tech/full_papers/crosby/crosby.pdf> for more information,
and any computer science textbook on algorithmic complexity.

=head2 Using Sudo

The popular tool C<sudo> provides a controlled way for users to be able
to run programs as other users. It sanitises the execution environment
to some extent, and will avoid the L<shebang race condition|/"Shebang
Race Condition">. If you don't have the safe version of set-id scripts,
then C<sudo> may be a more convenient way of executing a script as
another user than writing a C wrapper would be.

However, C<sudo> sets the real user or group ID to that of the target
identity, not just the effective ID as set-id bits do. As a result, Perl
can't detect that it is running under C<sudo>, and so won't automatically
take its own security precautions such as turning on taint mode. Where
C<sudo> configuration dictates exactly which command can be run, the
approved command may include a C<-T> option to perl to enable taint mode.

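Because taint mode is not switched on automatically under C<sudo>, a
script can also check for it and refuse to continue. The following is a
sketch only: C<require_taint_mode> is a hypothetical name, and the check
relies on the special variable C<${^TAINT}>, which is 1 under B<-T>, -1
when only taint warnings are enabled, and 0 otherwise.

```perl
use strict;
use warnings;

# Hypothetical guard: die unless full taint mode is active. Pass it
# ${^TAINT}, which is 1 under -T, -1 under -t (warnings only), and 0
# when taint checks are off.
sub require_taint_mode {
    my ($taint_flag) = @_;
    die "Refusing to run: taint mode (-T) is not enabled\n"
        unless $taint_flag == 1;
    return 1;
}

# Typical use near the top of a script meant to run under sudo:
#     require_taint_mode(${^TAINT});
print "taint flag is ", ${^TAINT}, "\n";
```

Taking the flag as a parameter keeps the routine easy to exercise without
restarting perl under a different switch.
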
In general, it is necessary to evaluate the suitability of a script to
run under C<sudo> specifically with that kind of execution environment
in mind. It is neither necessary nor sufficient for the same script to
be suitable to run in a traditional set-id arrangement, though many of
the issues overlap.

=head1 SEE ALSO

L<perlrun/ENVIRONMENT> for its description of cleaning up environment
variables.