=head1 NAME

perlsec - Perl security

=head1 DESCRIPTION

Perl is designed to make it easy to program securely even when running
with extra privileges, like setuid or setgid programs. Unlike most
command line shells, which are based on multiple substitution passes on
each line of the script, Perl uses a more conventional evaluation scheme
with fewer hidden snags. Additionally, because the language has more
builtin functionality, it can rely less upon external (and possibly
untrustworthy) programs to accomplish its purposes.

=head1 SECURITY VULNERABILITY CONTACT INFORMATION

If you believe you have found a security vulnerability in the Perl
interpreter or modules maintained in the core Perl codebase,
email the details to
L<perl-security@perl.org|mailto:perl-security@perl.org>.
This address is a closed membership mailing list monitored by the Perl
security team.

See L<perlsecpolicy> for additional information.

=head1 SECURITY MECHANISMS AND CONCERNS

=head2 Taint mode

Perl automatically enables a set of special security checks, called I<taint
mode>, when it detects its program running with differing real and effective
user or group IDs. The setuid bit in Unix permissions is mode 04000, the
setgid bit mode 02000; either or both may be set. You can also enable taint
mode explicitly by using the B<-T> command line flag. This flag is
I<strongly> suggested for server programs and any program run on behalf of
someone else, such as a CGI script. Once taint mode is on, it's on for
the remainder of your script.

While in this mode, Perl takes special precautions called I<taint
checks> to prevent both obvious and subtle traps. Some of these checks
are reasonably simple, such as verifying that path directories aren't
writable by others; careful programmers have always used checks like
these. Other checks, however, are best supported by the language itself,
and it is these checks especially that contribute to making a set-id Perl
program more secure than the corresponding C program.

You may not use data derived from outside your program to affect
something else outside your program--at least, not by accident. All
command line arguments, environment variables, locale information (see
L<perllocale>), results of certain system calls (C<readdir()>,
C<readlink()>, the variable of C<shmread()>, the messages returned by
C<msgrcv()>, the password, gcos and shell fields returned by the
C<getpwxxx()> calls), and all file input are marked as "tainted".
Tainted data may not be used directly or indirectly in any command
that invokes a sub-shell, nor in any command that modifies files,
directories, or processes, B<with the following exceptions>:

=over 4

=item *

Arguments to C<print> and C<syswrite> are B<not> checked for taintedness.

=item *

Symbolic methods

    $obj->$method(@args);

and symbolic sub references

    &{$foo}(@args);
    $foo->(@args);

are not checked for taintedness. This requires extra care
unless you want external data to affect your control flow. Unless
you carefully limit what these symbolic values are, people are able
to call functions B<outside> your Perl code, such as POSIX::system,
in which case they are able to run arbitrary external code.

=item *

Hash keys are B<never> tainted.

=back

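One way to limit what a symbolic method name may be is to check it
against an explicit allow-list before the call. The following is a
minimal sketch only: the class C<Demo::Service>, the hash
C<%ALLOWED_METHOD> and the helper C<call_allowed()> are hypothetical
names invented for this illustration, not part of Perl.

```perl
use strict;
use warnings;

# Hypothetical allow-list: the only method names a caller may request.
my %ALLOWED_METHOD = map { $_ => 1 } qw(status version);

# Hypothetical class used only for this illustration.
{
    package Demo::Service;
    sub new     { return bless {}, shift }
    sub status  { return 'ok' }
    sub version { return '1.0' }
}

# Call $method on $obj only if its name is on the allow-list.
sub call_allowed {
    my ($obj, $method, @args) = @_;
    die "Method not permitted\n"
        unless defined $method && $ALLOWED_METHOD{$method};
    return $obj->$method(@args);
}
```

With this in place, C<call_allowed($obj, 'status')> goes through, while
a request for a name such as C<POSIX::system> dies before any symbolic
call is attempted.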
For efficiency reasons, Perl takes a conservative view of
whether data is tainted. If an expression contains tainted data,
any subexpression may be considered tainted, even if the value
of the subexpression is not itself affected by the tainted data.

Because taintedness is associated with each scalar value, some
elements of an array or hash can be tainted and others not.
The keys of a hash are B<never> tainted.

For example:

    $arg = shift;               # $arg is tainted
    $hid = $arg . 'bar';        # $hid is also tainted
    $line = <>;                 # Tainted
    $line = <STDIN>;            # Also tainted
    open FOO, "/home/me/bar" or die $!;
    $line = <FOO>;              # Still tainted
    $path = $ENV{'PATH'};       # Tainted, but see below
    $data = 'abc';              # Not tainted

    system "echo $arg";         # Insecure
    system "/bin/echo", $arg;   # Considered insecure
                                # (Perl doesn't know about /bin/echo)
    system "echo $hid";         # Insecure
    system "echo $data";        # Insecure until PATH set

    $path = $ENV{'PATH'};       # $path now tainted

    $ENV{'PATH'} = '/bin:/usr/bin';
    delete @ENV{'IFS', 'CDPATH', 'ENV', 'BASH_ENV'};

    $path = $ENV{'PATH'};       # $path now NOT tainted
    system "echo $data";        # Is secure now!

    open(FOO, "< $arg");        # OK - read-only file
    open(FOO, "> $arg");        # Not OK - trying to write

    open(FOO,"echo $arg|");     # Not OK
    open(FOO,"-|")
        or exec 'echo', $arg;   # Also not OK

    $shout = `echo $arg`;       # Insecure, $shout now tainted

    unlink $data, $arg;         # Insecure
    umask $arg;                 # Insecure

    exec "echo $arg";           # Insecure
    exec "echo", $arg;          # Insecure
    exec "sh", '-c', $arg;      # Very insecure!

    @files = <*.c>;             # insecure (uses readdir() or similar)
    @files = glob('*.c');       # insecure (uses readdir() or similar)

    # In either case, the results of glob are tainted, since the list of
    # filenames comes from outside of the program.

    $bad = ($arg, 23);          # $bad will be tainted
    $arg, `true`;               # Insecure (although it isn't really)

If you try to do something insecure, you will get a fatal error saying
something like "Insecure dependency" or "Insecure $ENV{PATH}".

The exception to the principle of "one tainted value taints the whole
expression" is with the ternary conditional operator C<?:>. Since code
with a ternary conditional

    $result = $tainted_value ? "Untainted" : "Also untainted";

is effectively

    if ( $tainted_value ) {
        $result = "Untainted";
    } else {
        $result = "Also untainted";
    }

it doesn't make sense for C<$result> to be tainted.

=head2 Laundering and Detecting Tainted Data

To test whether a variable contains tainted data, whose use would
therefore trigger an "Insecure dependency" message, you can use the
C<tainted()> function of the Scalar::Util module, available on CPAN
and included in Perl starting from release 5.8.0.
Or you may be able to use the following C<is_tainted()> function.

    sub is_tainted {
        local $@;   # Don't pollute caller's value.
        return ! eval { eval("#" . substr(join("", @_), 0, 0)); 1 };
    }

This function makes use of the fact that the presence of tainted data
anywhere within an expression renders the entire expression tainted. It
would be inefficient for every operator to test every argument for
taintedness. Instead, the slightly more efficient and conservative
approach is used that if any tainted value has been accessed within the
same expression, the whole expression is considered tainted.

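As a brief illustration of the C<tainted()> function from Scalar::Util,
the sketch below labels each value it is given. The helper name
C<taint_report()> is hypothetical and exists only for this example.
When taint mode is off, nothing is ever tainted, so every value is
reported as clean.

```perl
use strict;
use warnings;
use Scalar::Util qw(tainted);

# Return one "name: tainted" or "name: clean" line per named value.
# Hash values keep their taintedness; only hash keys are never tainted.
sub taint_report {
    my (%values) = @_;
    return map {
        sprintf "%s: %s", $_, (tainted($values{$_}) ? "tainted" : "clean")
    } sort keys %values;
}
```

A call such as C<taint_report(literal =E<gt> 'abc', path =E<gt> $ENV{PATH})>
would be expected to report the environment value as tainted only when
the program runs under C<-T>.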
But testing for taintedness gets you only so far. Sometimes you just have
to clear your data's taintedness. Values may be untainted by using them
as keys in a hash; otherwise the only way to bypass the tainting
mechanism is by referencing subpatterns from a regular expression match.
Perl presumes that if you reference a substring using $1, $2, etc. in a
non-tainting pattern,
you knew what you were doing when you wrote that pattern. That means using
a bit of thought--don't just blindly untaint anything, or you defeat the
entire mechanism. It's better to verify that the variable has only good
characters (for certain values of "good") rather than checking whether it
has any bad characters. That's because it's far too easy to miss bad
characters that you never thought of.

Here's a test to make sure that the data contains nothing but "word"
characters (alphabetics, numerics, and underscores), a hyphen, an at sign,
or a dot.

    if ($data =~ /^([-\@\w.]+)$/) {
        $data = $1;                     # $data now untainted
    } else {
        die "Bad data in '$data'";      # log this somewhere
    }

This is fairly secure because C</\w+/> doesn't normally match shell
metacharacters, nor are dot, dash, or at going to mean something special
to the shell. Use of C</.+/> would have been insecure in theory because
it lets everything through, but Perl doesn't check for that. The lesson
is that when untainting, you must be exceedingly careful with your patterns.
Laundering data using a regular expression is the I<only> mechanism for
untainting dirty data, unless you use the strategy detailed below to fork
a child of lesser privilege.

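The same check can be packaged as a small helper that returns the
laundered value or C<undef>. This is a sketch under stated assumptions,
not a prescribed interface: the name C<launder_name()> is hypothetical,
and it anchors the pattern with C<\A> and C<\z> instead of C<^> and
C<$>, a stricter choice that also rejects a trailing newline.

```perl
use strict;
use warnings;

# Return an untainted copy of $value if it consists only of word
# characters, hyphens, at signs and dots; return undef otherwise.
sub launder_name {
    my ($value) = @_;
    return undef unless defined $value;
    no locale;    # keep \w independent of the (untrusted) locale
    if ($value =~ /\A([-\@\w.]+)\z/) {
        return $1;    # the captured text is what gets untainted
    }
    return undef;
}
```

A caller would typically treat an undefined result as a reason to log
the rejected input and stop, as the C<die> in the example above does.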
The example does not untaint C<$data> if C<use locale> is in effect,
because the characters matched by C<\w> are determined by the locale.
Perl considers that locale definitions are untrustworthy because they
contain data from outside the program. If you are writing a
locale-aware program, and want to launder data with a regular expression
containing C<\w>, put C<no locale> ahead of the expression in the same
block. See L<perllocale/SECURITY> for further discussion and examples.

=head2 Switches On the "#!" Line

When you make a script executable, in order to make it usable as a
command, the system will pass switches to perl from the script's #!
line. Perl checks that any command line switches given to a setuid
(or setgid) script actually match the ones set on the #! line. Some
Unix and Unix-like environments impose a one-switch limit on the #!
line, so you may need to use something like C<-wU> instead of C<-w -U>
under such systems. (This issue should arise only in Unix or
Unix-like environments that support #! and setuid or setgid scripts.)

=head2 Taint mode and @INC

When the taint mode (C<-T>) is in effect, the environment variables
C<PERL5LIB> and C<PERLLIB>
are ignored by Perl. You can still adjust C<@INC> from outside the
program by using the C<-I> command line option as explained in
L<perlrun|perlrun/-Idirectory>. The two environment variables are
ignored because they are obscured, and a user running a program could
be unaware that they are set, whereas the C<-I> option is clearly
visible and therefore permitted.

Another way to modify C<@INC> without modifying the program is to use
the C<lib> pragma, e.g.:

    perl -Mlib=/foo program

The benefit of using C<-Mlib=/foo> over C<-I/foo> is that the former
will automatically remove any duplicated directories, while the latter
will not.

Note that if a tainted string is added to C<@INC>, the following
problem will be reported:

    Insecure dependency in require while running with -T switch

On versions of Perl before 5.26, activating taint mode will also remove
the current directory (".") from the default value of C<@INC>. Since
version 5.26, the current directory isn't included in C<@INC> by
default.

=head2 Cleaning Up Your Path

For "Insecure C<$ENV{PATH}>" messages, you need to set C<$ENV{'PATH'}> to
a known value, and each directory in the path must be absolute and
writable by no one other than its owner and group. You may be surprised to
get this message even if the pathname to your executable is fully
qualified. This is I<not> generated because you didn't supply a full path
to the program; instead, it's generated because you never set your PATH
environment variable, or you didn't set it to something that was safe.
Because Perl can't guarantee that the executable in question isn't itself
going to turn around and execute some other program that is dependent on
your PATH, it makes sure you set the PATH.

The PATH isn't the only environment variable which can cause problems.
Because some shells may use the variables IFS, CDPATH, ENV, and
BASH_ENV, Perl checks that those are either empty or untainted when
starting subprocesses. You may wish to add something like this to your
setid and taint-checking scripts.

    delete @ENV{qw(IFS CDPATH ENV BASH_ENV)};   # Make %ENV safer

It's also possible to get into trouble with other operations that don't
care whether they use tainted values. Make judicious use of the file
tests in dealing with any user-supplied filenames. When possible, do
opens and such B<after> properly dropping any special user (or group!)
privileges. Perl doesn't prevent you from
opening tainted filenames for reading,
so be careful what you print out. The tainting mechanism is intended to
prevent stupid mistakes, not to remove the need for thought.

Perl does not call the shell to expand wild cards when you pass C<system>
and C<exec> explicit parameter lists instead of strings with possible shell
wildcards in them. Unfortunately, the C<open>, C<glob>, and
backtick functions provide no such alternate calling convention, so more
subterfuge will be required.

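The explicit parameter list form of C<system> described above can be
wrapped in a small helper. The following is only a sketch; the name
C<run_without_shell()> is hypothetical. It uses the indirect object
form of C<system>, so that even a call with no extra arguments is
treated as a list and never handed to the shell.

```perl
use strict;
use warnings;

# Run a program with an explicit argument list. No shell is involved,
# so metacharacters in @args reach the program as literal text.
# Returns the program's exit status, or dies if it could not be run.
sub run_without_shell {
    my ($program, @args) = @_;
    my $rc = system { $program } $program, @args;
    die "Cannot run $program: $!\n" if $rc == -1;
    die "$program died from signal " . ($rc & 127) . "\n" if $rc & 127;
    return $rc >> 8;
}
```

Under taint mode this helper is still subject to the usual checks: a
tainted C<$program> or argument, or an unsafe C<$ENV{PATH}>, produces
the same "Insecure" errors as a direct call to C<system>.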
Perl provides a reasonably safe way to open a file or pipe from a setuid
or setgid program: just create a child process with reduced privilege that
does the dirty work for you. First, fork a child using the special
C<open> syntax that connects the parent and child by a pipe. Now the
child resets its ID set and any other per-process attributes, like
environment variables, umasks, current working directories, back to the
originals or known safe values. Then the child process, which no longer
has any special permissions, does the C<open> or other system call.
Finally, the child passes the data it managed to access back to the
parent. Because the file or pipe was opened in the child while running
under less privilege than the parent, it's not apt to be tricked into
doing something it shouldn't.

Here's a way to do backticks reasonably safely. Notice how the C<exec> is
not called with a string that the shell could expand. This is by far the
best way to call something that might be subjected to shell escapes: just
never call the shell at all.

    use English;
    die "Can't fork: $!" unless defined($pid = open(KID, "-|"));
    if ($pid) {           # parent
        while (<KID>) {
            # do something
        }
        close KID;
    } else {
        my @temp     = ($EUID, $EGID);
        my $orig_uid = $UID;
        my $orig_gid = $GID;
        $EUID = $UID;
        $EGID = $GID;
        # Drop privileges
        $UID  = $orig_uid;
        $GID  = $orig_gid;
        # Make sure privs are really gone
        ($EUID, $EGID) = @temp;
        die "Can't drop privileges"
            unless $UID == $EUID  && $GID eq $EGID;
        $ENV{PATH} = "/bin:/usr/bin"; # Minimal PATH.
        # Consider sanitizing the environment even more.
        exec 'myprog', 'arg1', 'arg2'
            or die "can't exec myprog: $!";
    }

A similar strategy would work for wildcard expansion via C<glob>, although
you can use C<readdir> instead.

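The C<readdir> alternative to C<glob> might look like the sketch below.
The helper name C<matching_files()> is hypothetical, and the caller
supplies the pattern as a compiled regular expression in place of a
shell wildcard.

```perl
use strict;
use warnings;

# Return the sorted names in $dir that match $pattern (a compiled regex),
# read with readdir() so that no shell or glob expansion is involved.
# Under taint mode the returned names are still tainted.
sub matching_files {
    my ($dir, $pattern) = @_;
    opendir(my $dh, $dir) or die "Cannot open directory $dir: $!\n";
    my @names = sort grep { $_ =~ $pattern } readdir $dh;
    closedir $dh;
    return @names;
}
```

For instance, C<matching_files('.', qr/\.c\z/)> plays the role of
C<glob('*.c')>. The names come from outside the program either way, so
they still need laundering before being used in anything sensitive.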
Taint checking is most useful when, although you trust yourself not to have
written a program to give away the farm, you don't necessarily trust those
who end up using it not to try to trick it into doing something bad. This
is the kind of security checking that's useful for set-id programs and
programs launched on someone else's behalf, like CGI programs.

This is quite different, however, from not even trusting the writer of the
code not to try to do something evil. That's the kind of trust needed
when someone hands you a program you've never seen before and says, "Here,
run this." For that kind of safety, you might want to check out the Safe
module, included standard in the Perl distribution. This module allows the
programmer to set up special compartments in which all system operations
are trapped and namespace access is carefully controlled. Safe should
not be considered bullet-proof, though: it will not prevent the foreign
code from setting up infinite loops, allocating gigabytes of memory, or even
abusing perl bugs to make the host interpreter crash or behave in
unpredictable ways. In any case it's better avoided completely if you're
really concerned about security.

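For readers who want to see what a Safe compartment looks like in
practice, here is a minimal sketch using the module's C<new()> and
C<reval()> methods with the default operation mask. The wrapper name
C<eval_in_compartment()> is hypothetical. It illustrates the trapping
behaviour only, and the caveats above about Safe still apply.

```perl
use strict;
use warnings;
use Safe;

# Evaluate $code in a fresh compartment with the default restrictions.
# Returns (value, error); error is the empty string on success.
sub eval_in_compartment {
    my ($code) = @_;
    my $compartment = Safe->new;
    my $value = $compartment->reval($code);
    my $error = $@ ? "$@" : '';
    return ($value, $error);
}
```

Plain arithmetic such as C<'2 + 3'> evaluates normally, while code that
tries to call C<system> is rejected by the default mask and the reason
is returned as the error string.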
=head2 Shebang Race Condition

Beyond the obvious problems that stem from giving special privileges to
systems as flexible as scripts, on many versions of Unix, set-id scripts
are inherently insecure right from the start. The problem is a race
condition in the kernel. Between the time the kernel opens the file to
see which interpreter to run and when the (now-set-id) interpreter turns
around and reopens the file to interpret it, the file in question may have
changed, especially if you have symbolic links on your system.

Some Unixes, especially more recent ones, are free of this
inherent security bug. On such systems, when the kernel passes the name
of the set-id script to open to the interpreter, rather than using a
pathname subject to meddling, it instead passes I</dev/fd/3>. This is a
special file already opened on the script, so that there can be no race
condition for evil scripts to exploit. On these systems, Perl should be
compiled with C<-DSETUID_SCRIPTS_ARE_SECURE_NOW>. The F<Configure>
program that builds Perl tries to figure this out for itself, so you
should never have to specify this yourself. Most modern releases of
SysVr4 and BSD 4.4 use this approach to avoid the kernel race condition.

If you don't have the safe version of set-id scripts, all is not lost.
Sometimes this kernel "feature" can be disabled, so that the kernel
either doesn't run set-id scripts with the set-id or doesn't run them
at all. Either way avoids the exploitability of the race condition,
but doesn't help in actually running scripts set-id.

If the kernel set-id script feature isn't disabled, then any set-id
script provides an exploitable vulnerability. Perl can't avoid being
exploitable, but will point out vulnerable scripts where it can. If Perl
detects that it is being applied to a set-id script then it will complain
loudly that your set-id script is insecure, and won't run it. When Perl
complains, you need to remove the set-id bit from the script to eliminate
the vulnerability. Refusing to run the script doesn't in itself close
the vulnerability; it is just Perl's way of encouraging you to do this.

To actually run a script set-id, if you don't have the safe version of
set-id scripts, you'll need to put a C wrapper around
the script. A C wrapper is just a compiled program that does nothing
except call your Perl program. Compiled programs are not subject to the
kernel bug that plagues set-id scripts. Here's a simple wrapper, written
in C:

    #include <unistd.h>
    #include <stdio.h>
    #include <string.h>
    #include <errno.h>

    #define REAL_PATH "/path/to/script"

    int main(int argc, char **argv)
    {
        execv(REAL_PATH, argv);
        fprintf(stderr, "%s: %s: %s\n",
                        argv[0], REAL_PATH, strerror(errno));
        return 127;
    }

Compile this wrapper into a binary executable and then make I<it> rather
than your script setuid or setgid. Note that this wrapper isn't doing
anything to sanitise the execution environment other than ensuring
that a safe path to the script is used. It only avoids the shebang
race condition. It relies on Perl's own features, and on the script
itself being careful, to make it safe enough to run the script set-id.

=head2 Protecting Your Programs

There are a number of ways to hide the source to your Perl programs,
with varying levels of "security".

First of all, however, you I<can't> take away read permission, because
the source code has to be readable in order to be compiled and
interpreted. (That doesn't mean that a CGI script's source is
readable by people on the web, though.) So you have to leave the
permissions at the socially friendly 0755 level. This lets
only people on your local system see your source.

Some people mistakenly regard this as a security problem. If your program does
insecure things, and relies on people not knowing how to exploit those
insecurities, it is not secure. It is often possible for someone to
determine the insecure things and exploit them without viewing the
source. Security through obscurity, the name for hiding your bugs
instead of fixing them, is little security indeed.

You can try using encryption via source filters (Filter::* from CPAN,
or Filter::Util::Call and Filter::Simple since Perl 5.8).
But crackers might be able to decrypt it. You can try using the byte
code compiler and interpreter described below, but crackers might be
able to de-compile it. You can try using the native-code compiler
described below, but crackers might be able to disassemble it. These
pose varying degrees of difficulty to people wanting to get at your
code, but none can definitively conceal it (this is true of every
language, not just Perl).

If you're concerned about people profiting from your code, then the
bottom line is that nothing but a restrictive license will give you
legal security. License your software and pepper it with threatening
statements like "This is unpublished proprietary software of XYZ Corp.
Your access to it does not give you permission to use it blah blah
blah." You should see a lawyer to be sure your license's wording will
stand up in court.

=head2 Unicode

Unicode is a new and complex technology and one may easily overlook
certain security pitfalls. See L<perluniintro> for an overview and
L<perlunicode> for details, and L<perlunicode/"Security Implications
of Unicode"> for security implications in particular.

=head2 Algorithmic Complexity Attacks

Certain internal algorithms used in the implementation of Perl can
be attacked by choosing the input carefully to consume large amounts
of either time or space or both. This can lead to so-called
I<Denial of Service> (DoS) attacks.

=over 4

=item *

Hash Algorithm - Hash algorithms like the one used in Perl are well
known to be vulnerable to collision attacks on their hash function.
Such attacks involve constructing a set of keys which collide into
the same bucket, producing inefficient behavior. Such attacks often
depend on discovering the seed of the hash function used to map the
keys to buckets. That seed is then used to brute-force a key set which
can be used to mount a denial of service attack. In Perl 5.8.1 changes
were introduced to harden Perl against such attacks, and then later in
Perl 5.18.0 these features were enhanced and additional protections
added.

At the time of this writing, Perl 5.18.0 is considered to be
well-hardened against algorithmic complexity attacks on its hash
implementation. This is largely owing to the following measures,
which mitigate attacks:

=over 4

=item Hash Seed Randomization

In order to make it impossible to know what seed to generate an attack
key set for, this seed is randomly initialized at process start. This
may be overridden by using the PERL_HASH_SEED environment variable, see
L<perlrun/PERL_HASH_SEED>. This environment variable controls how
items are actually stored, not how they are presented via
C<keys>, C<values> and C<each>.

=item Hash Traversal Randomization

Independent of which seed is used in the hash function, C<keys>,
C<values>, and C<each> return items in a per-hash randomized order.
Modifying a hash by insertion will change the iteration order of that hash.
This behavior can be overridden by using C<hash_traversal_mask()> from
L<Hash::Util> or by using the PERL_PERTURB_KEYS environment variable,
see L<perlrun/PERL_PERTURB_KEYS>. Note that this feature controls the
"visible" order of the keys, and not the actual order they are stored in.

=item Bucket Order Perturbance

When items collide into a given hash bucket the order they are stored in
the chain is no longer predictable in Perl 5.18. This
is intended to make it harder to observe a
collision. This behavior can be overridden by using
the PERL_PERTURB_KEYS environment variable, see L<perlrun/PERL_PERTURB_KEYS>.

=item New Default Hash Function

The default hash function has been modified with the intention of making
it harder to infer the hash seed.

=item Alternative Hash Functions

The source code includes multiple hash algorithms to choose from. While we
believe that the default perl hash is robust to attack, we have included the
hash function Siphash as a fall-back option. At the time of release of
Perl 5.18.0 Siphash is believed to be of cryptographic strength. This is
not the default as it is much slower than the default hash.

=back

Without compiling a special Perl, there is no way to get the exact same
behavior of any versions prior to Perl 5.18.0. The closest one can get
is by setting PERL_PERTURB_KEYS to 0 and setting the PERL_HASH_SEED
to a known value. We do not advise those settings for production use
due to the above security considerations.

B<Perl has never guaranteed any ordering of the hash keys>, and
the ordering has already changed several times during the lifetime of
Perl 5. Also, the ordering of hash keys has always been, and continues
to be, affected by the insertion order and the history of changes made
to the hash over its lifetime.

Also note that while the order of the hash elements might be
randomized, this "pseudo-ordering" should B<not> be used for
applications like shuffling a list randomly (use C<List::Util::shuffle()>
for that, see L<List::Util>, a standard core module since Perl 5.8.0;
or the CPAN module C<Algorithm::Numerical::Shuffle>), or for generating
permutations (use e.g. the CPAN modules C<Algorithm::Permute> or
C<Algorithm::FastPermute>), or for any cryptographic applications.

Tied hashes may have their own ordering and algorithmic complexity
attacks.

=item *

Regular expressions - Perl's regular expression engine is a so-called NFA
(Non-deterministic Finite Automaton), which among other things means that
it can rather easily consume large amounts of both time and space if the
regular expression may match in several ways. Careful crafting of the
regular expressions can help but quite often there really isn't much
one can do (the book "Mastering Regular Expressions" is required
reading, see L<perlfaq2>). Running out of space manifests itself by
Perl running out of memory.

=item *

Sorting - the quicksort algorithm used in Perls before 5.8.0 to
implement the sort() function was very easy to trick into misbehaving
so that it consumes a lot of time. Starting from Perl 5.8.0 a different
sorting algorithm, mergesort, is used by default. Mergesort cannot
misbehave on any input.

=back

See L<https://www.usenix.org/legacy/events/sec03/tech/full_papers/crosby/crosby.pdf> for more information,
and any computer science textbook on algorithmic complexity.

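Because hash key order is never guaranteed, code that needs a stable or
a shuffled order has to impose one itself. The sketch below shows both
cases. The hash C<%port_for> and its contents are made up for the
illustration.

```perl
use strict;
use warnings;
use List::Util qw(shuffle);

# Hypothetical sample data for this illustration.
my %port_for = (http => 80, https => 443, ssh => 22);

# Deterministic output: sort the keys explicitly rather than relying on
# whatever order keys() happens to return in this process.
my @by_name = sort keys %port_for;
my @by_port = sort { $port_for{$a} <=> $port_for{$b} } keys %port_for;

# A shuffled order: use a real shuffle, not hash traversal order.
# (List::Util::shuffle is not a cryptographic random source either.)
my @shuffled = shuffle(keys %port_for);
```

Sorting costs a little extra time on each traversal, but it keeps
program output and test expectations independent of the hash seed.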
=head2 Using Sudo

The popular tool C<sudo> provides a controlled way for users to be able
to run programs as other users. It sanitises the execution environment
to some extent, and will avoid the L<shebang race condition|/"Shebang
Race Condition">. If you don't have the safe version of set-id scripts,
then C<sudo> may be a more convenient way of executing a script as
another user than writing a C wrapper would be.

However, C<sudo> sets the real user or group ID to that of the target
identity, not just the effective ID as set-id bits do. As a result, Perl
can't detect that it is running under C<sudo>, and so won't automatically
take its own security precautions such as turning on taint mode. Where
C<sudo> configuration dictates exactly which command can be run, the
approved command may include a C<-T> option to perl to enable taint mode.

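Since Perl will not turn taint mode on by itself under C<sudo>, a
script can at least check whether it was started with C<-T>. The sketch
below reads the C<${^TAINT}> variable documented in L<perlvar>. The
helper name C<taint_state()> is hypothetical, and refusing to run is
just one possible policy.

```perl
use strict;
use warnings;

# Describe the interpreter's taint state using the read-only
# ${^TAINT} variable: 1 under -T, -1 under -t (warnings only), 0 if off.
sub taint_state {
    return ${^TAINT} ==  1 ? 'enforced'
         : ${^TAINT} == -1 ? 'warnings only'
         :                   'off';
}

# A script meant to run under sudo could refuse to continue otherwise:
# die "Refusing to run without -T\n" unless taint_state() eq 'enforced';
```

This check only reports how the interpreter was started. It does not
replace putting C<-T> into the approved C<sudo> command.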
In general, it is necessary to evaluate the suitability of a script to
run under C<sudo> specifically with that kind of execution environment
in mind. It is neither necessary nor sufficient for the same script to
be suitable to run in a traditional set-id arrangement, though many of
the issues overlap.

=head1 SEE ALSO

L<perlrun/ENVIRONMENT> for its description of cleaning up environment
variables.