=head1 NAME

perlsec - Perl security

=head1 DESCRIPTION

7 Perl is designed to make it easy to program securely even when running
8 with extra privileges, like setuid or setgid programs. Unlike most
9 command line shells, which are based on multiple substitution passes on
10 each line of the script, Perl uses a more conventional evaluation scheme
11 with fewer hidden snags. Additionally, because the language has more
12 builtin functionality, it can rely less upon external (and possibly
13 untrustworthy) programs to accomplish its purposes.
=head1 SECURITY VULNERABILITY CONTACT INFORMATION

If you believe you have found a security vulnerability in the Perl
interpreter or modules maintained in the core Perl codebase,
email the details to
L<perl-security@perl.org|mailto:perl-security@perl.org>.
This address is a closed membership mailing list monitored by the Perl
security team.

See L<perlsecpolicy> for additional information.

=head1 SECURITY MECHANISMS AND CONCERNS

=head2 Taint mode

30 Perl automatically enables a set of special security checks, called I<taint
31 mode>, when it detects its program running with differing real and effective
32 user or group IDs. The setuid bit in Unix permissions is mode 04000, the
33 setgid bit mode 02000; either or both may be set. You can also enable taint
34 mode explicitly by using the B<-T> command line flag. This flag is
35 I<strongly> suggested for server programs and any program run on behalf of
36 someone else, such as a CGI script. Once taint mode is on, it's on for
37 the remainder of your script.
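
As a minimal sketch of how a program can tell which mode it is in, the
special variable C<${^TAINT}> (documented in L<perlvar>) can be inspected
at run time. The variable names below are illustrative only; run the
snippet as C<perl -T script.pl> to see the "on" case.

    use strict;
    use warnings;

    # ${^TAINT} is 1 when taint mode is on, 0 when it is off, and -1 when
    # only taint warnings are enabled (the -t switch).
    my $state = ${^TAINT} ==  1 ? "on"
              : ${^TAINT} == -1 ? "warnings only"
              :                   "off";
    print "taint mode: $state\n";
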
39 While in this mode, Perl takes special precautions called I<taint
40 checks> to prevent both obvious and subtle traps. Some of these checks
41 are reasonably simple, such as verifying that path directories aren't
42 writable by others; careful programmers have always used checks like
43 these. Other checks, however, are best supported by the language itself,
44 and it is these checks especially that contribute to making a set-id Perl
45 program more secure than the corresponding C program.
47 You may not use data derived from outside your program to affect
48 something else outside your program--at least, not by accident. All
49 command line arguments, environment variables, locale information (see
50 L<perllocale>), results of certain system calls (C<readdir()>,
51 C<readlink()>, the variable of C<shmread()>, the messages returned by
52 C<msgrcv()>, the password, gcos and shell fields returned by the
53 C<getpwxxx()> calls), and all file input are marked as "tainted".
54 Tainted data may not be used directly or indirectly in any command
55 that invokes a sub-shell, nor in any command that modifies files,
56 directories, or processes, B<with the following exceptions>:

=over 4

=item *

Arguments to C<print> and C<syswrite> are B<not> checked for taintedness.

=item *

Symbolic methods (C<< $obj->$method(@args) >>) and symbolic sub
references (C<&{$foo}(@args)> or C<< $foo->(@args) >>)
are not checked for taintedness. This requires extra care
unless you want external data to affect your control flow. Unless
you carefully limit what these symbolic values are, people are able
to call functions B<outside> your Perl code, such as POSIX::system,
in which case they are able to run arbitrary external code.

=item *

Hash keys are B<never> tainted.

=back

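
One way to limit what a symbolic sub reference can reach is to look the
name up in a fixed table instead of calling it directly. The following is
only a sketch: the C<%dispatch> table, the C<run_action> helper and the
action names are hypothetical and not part of Perl.

    use strict;
    use warnings;

    # Hypothetical handlers; only names listed here can ever be called.
    my %dispatch = (
        start => sub { return "starting @_" },
        stop  => sub { return "stopping @_" },
    );

    sub run_action {
        my ($name, @args) = @_;
        my $handler = $dispatch{$name}
            or die "Unknown action '$name'\n";
        return $handler->(@args);
    }

    print run_action('start', 'job1'), "\n";

Because an externally supplied name is used only as a hash key, a request
for something like C<POSIX::system> is rejected rather than called.
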
87 For efficiency reasons, Perl takes a conservative view of
88 whether data is tainted. If an expression contains tainted data,
89 any subexpression may be considered tainted, even if the value
90 of the subexpression is not itself affected by the tainted data.
92 Because taintedness is associated with each scalar value, some
93 elements of an array or hash can be tainted and others not.
94 The keys of a hash are B<never> tainted.

For example:

    $arg = shift;               # $arg is tainted
    $hid = $arg . 'bar';        # $hid is also tainted
    $line = <>;                 # Tainted
    $line = <STDIN>;            # Also tainted
    open FOO, "/home/me/bar" or die $!;
    $line = <FOO>;              # Still tainted
    $path = $ENV{'PATH'};       # Tainted, but see below
    $data = 'abc';              # Not tainted

    system "echo $arg";         # Insecure
    system "/bin/echo", $arg;   # Considered insecure
                                # (Perl doesn't know about /bin/echo)
    system "echo $hid";         # Insecure
    system "echo $data";        # Insecure until PATH set

    $path = $ENV{'PATH'};       # $path now tainted

    $ENV{'PATH'} = '/bin:/usr/bin';
    delete @ENV{'IFS', 'CDPATH', 'ENV', 'BASH_ENV'};

    $path = $ENV{'PATH'};       # $path now NOT tainted
    system "echo $data";        # Is secure now!

    open(FOO, "< $arg");        # OK - read-only file
    open(FOO, "> $arg");        # Not OK - trying to write

    open(FOO,"echo $arg|");     # Not OK
    open(FOO,"-|")
        or exec 'echo', $arg;   # Also not OK

    $shout = `echo $arg`;       # Insecure, $shout now tainted

    unlink $data, $arg;         # Insecure
    umask $arg;                 # Insecure

    exec "echo $arg";           # Insecure
    exec "echo", $arg;          # Insecure
    exec "sh", '-c', $arg;      # Very insecure!

    @files = <*.c>;             # insecure (uses readdir() or similar)
    @files = glob('*.c');       # insecure (uses readdir() or similar)

    # In either case, the results of glob are tainted, since the list of
    # filenames comes from outside of the program.

    $bad = ($arg, 23);          # $bad will be tainted
    $arg, `true`;               # Insecure (although it isn't really)

146 If you try to do something insecure, you will get a fatal error saying
147 something like "Insecure dependency" or "Insecure $ENV{PATH}".
149 The exception to the principle of "one tainted value taints the whole
150 expression" is with the ternary conditional operator C<?:>. Since code
with a ternary conditional

    $result = $tainted_value ? "Untainted" : "Also untainted";

is effectively

    if ( $tainted_value ) {
        $result = "Untainted";
    } else {
        $result = "Also untainted";
    }

it doesn't make sense for C<$result> to be tainted.

165 =head2 Laundering and Detecting Tainted Data
167 To test whether a variable contains tainted data, and whose use would
168 thus trigger an "Insecure dependency" message, you can use the
169 C<tainted()> function of the Scalar::Util module, available in your
170 nearby CPAN mirror, and included in Perl starting from the release 5.8.0.
Or you may be able to use the following C<is_tainted()> function.

    sub is_tainted {
        local $@;   # Don't pollute caller's value.
        return ! eval { eval("#" . substr(join("", @_), 0, 0)); 1 };
    }

178 This function makes use of the fact that the presence of tainted data
179 anywhere within an expression renders the entire expression tainted. It
180 would be inefficient for every operator to test every argument for
181 taintedness. Instead, the slightly more efficient and conservative
182 approach is used that if any tainted value has been accessed within the
183 same expression, the whole expression is considered tainted.
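
As a small sketch of the C<Scalar::Util> approach, the snippet below
reports on one value taken from the command line and one literal. The
variable names are illustrative; without B<-T> nothing is ever reported
as tainted.

    use strict;
    use warnings;
    use Scalar::Util qw(tainted);

    my $arg  = @ARGV ? $ARGV[0] : '';
    my $data = 'abc';

    # Under -T, $arg reports as tainted when it came from the command
    # line; $data, a literal from the program itself, never does.
    printf "arg:  %s\n", tainted($arg)  ? "tainted" : "not tainted";
    printf "data: %s\n", tainted($data) ? "tainted" : "not tainted";
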
185 But testing for taintedness gets you only so far. Sometimes you have just
186 to clear your data's taintedness. Values may be untainted by using them
187 as keys in a hash; otherwise the only way to bypass the tainting
188 mechanism is by referencing subpatterns from a regular expression match.
189 Perl presumes that if you reference a substring using $1, $2, etc. in a
190 non-tainting pattern, that
191 you knew what you were doing when you wrote that pattern. That means using
192 a bit of thought--don't just blindly untaint anything, or you defeat the
193 entire mechanism. It's better to verify that the variable has only good
194 characters (for certain values of "good") rather than checking whether it
195 has any bad characters. That's because it's far too easy to miss bad
196 characters that you never thought of.
Here's a test to make sure that the data contains nothing but "word"
characters (alphabetics, numerics, and underscores), a hyphen, an at sign,
or a dot.

    if ($data =~ /^([-\@\w.]+)$/) {
        $data = $1;                     # $data now untainted
    } else {
        die "Bad data in '$data'";      # log this somewhere
    }

208 This is fairly secure because C</\w+/> doesn't normally match shell
209 metacharacters, nor are dot, dash, or at going to mean something special
210 to the shell. Use of C</.+/> would have been insecure in theory because
211 it lets everything through, but Perl doesn't check for that. The lesson
212 is that when untainting, you must be exceedingly careful with your patterns.
Apart from the hash-key behavior noted above, laundering data using a
regular expression is the I<only> mechanism for untainting dirty data,
unless you use the strategy detailed below to fork a child of lesser
privilege.
217 The example does not untaint C<$data> if C<use locale> is in effect,
218 because the characters matched by C<\w> are determined by the locale.
219 Perl considers that locale definitions are untrustworthy because they
220 contain data from outside the program. If you are writing a
221 locale-aware program, and want to launder data with a regular expression
222 containing C<\w>, put C<no locale> ahead of the expression in the same
223 block. See L<perllocale/SECURITY> for further discussion and examples.
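
The laundering test can be wrapped in a small helper so that the
C<no locale> declaration and the pattern always travel together. This is
a sketch only; C<launder_word> is a hypothetical name, and the set of
permitted characters should be chosen for your own data.

    use strict;
    use warnings;

    # Hypothetical helper: returns an untainted copy of $value if it
    # consists only of word characters, hyphens, at signs and dots;
    # returns undef otherwise.
    sub launder_word {
        my ($value) = @_;
        no locale;    # keep \w independent of the (untrusted) locale
        if ($value =~ /^([-\@\w.]+)$/) {
            return $1;
        }
        return;
    }

    my $clean = launder_word('user@example.com');
    defined $clean or die "Bad data\n";
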
225 =head2 Switches On the "#!" Line
227 When you make a script executable, in order to make it usable as a
228 command, the system will pass switches to perl from the script's #!
229 line. Perl checks that any command line switches given to a setuid
230 (or setgid) script actually match the ones set on the #! line. Some
231 Unix and Unix-like environments impose a one-switch limit on the #!
232 line, so you may need to use something like C<-wU> instead of C<-w -U>
233 under such systems. (This issue should arise only in Unix or
234 Unix-like environments that support #! and setuid or setgid scripts.)
236 =head2 Taint mode and @INC
238 When the taint mode (C<-T>) is in effect, the environment variables
239 C<PERL5LIB> and C<PERLLIB>
240 are ignored by Perl. You can still adjust C<@INC> from outside the
241 program by using the C<-I> command line option as explained in
242 L<perlrun|perlrun/-Idirectory>. The two environment variables are
243 ignored because they are obscured, and a user running a program could
244 be unaware that they are set, whereas the C<-I> option is clearly
245 visible and therefore permitted.
Another way to modify C<@INC> without modifying the program is to use
the C<lib> pragma, e.g.:

    perl -Mlib=/foo program

The benefit of using C<-Mlib=/foo> over C<-I/foo> is that the former
will automagically remove any duplicated directories, while the latter
will not.

Note that if a tainted string is added to C<@INC>, the following
problem will be reported:

    Insecure dependency in require while running with -T switch

On versions of Perl before 5.26, activating taint mode will also remove
the current directory (".") from the default value of C<@INC>. Since
version 5.26, the current directory isn't included in C<@INC> by
default.

266 =head2 Cleaning Up Your Path
268 For "Insecure C<$ENV{PATH}>" messages, you need to set C<$ENV{'PATH'}> to
269 a known value, and each directory in the path must be absolute and
not writable by anyone other than its owner and group. You may be surprised to
271 get this message even if the pathname to your executable is fully
272 qualified. This is I<not> generated because you didn't supply a full path
273 to the program; instead, it's generated because you never set your PATH
274 environment variable, or you didn't set it to something that was safe.
275 Because Perl can't guarantee that the executable in question isn't itself
276 going to turn around and execute some other program that is dependent on
277 your PATH, it makes sure you set the PATH.
279 The PATH isn't the only environment variable which can cause problems.
280 Because some shells may use the variables IFS, CDPATH, ENV, and
281 BASH_ENV, Perl checks that those are either empty or untainted when
starting subprocesses. You may wish to add something like this to your
set-id and taint-checking scripts.

    delete @ENV{qw(IFS CDPATH ENV BASH_ENV)};   # Make %ENV safer

287 It's also possible to get into trouble with other operations that don't
288 care whether they use tainted values. Make judicious use of the file
289 tests in dealing with any user-supplied filenames. When possible, do
290 opens and such B<after> properly dropping any special user (or group!)
291 privileges. Perl doesn't prevent you from
292 opening tainted filenames for reading,
293 so be careful what you print out. The tainting mechanism is intended to
294 prevent stupid mistakes, not to remove the need for thought.
296 Perl does not call the shell to expand wild cards when you pass C<system>
297 and C<exec> explicit parameter lists instead of strings with possible shell
298 wildcards in them. Unfortunately, the C<open>, C<glob>, and
299 backtick functions provide no such alternate calling convention, so more
300 subterfuge will be required.
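
A sketch of the list-form calling convention follows. It also uses the
list form of a pipe C<open>, which L<perlfunc/open> documents for perls
and platforms that support it; treat its availability on your system as
an assumption to check. The directory and the F</bin/ls> path are
placeholders.

    use strict;
    use warnings;

    # Taint mode insists on a known-safe environment before running
    # anything.
    $ENV{PATH} = '/bin:/usr/bin';
    delete @ENV{qw(IFS CDPATH ENV BASH_ENV)};

    my $dir = '/tmp';    # hypothetical, already-validated value

    # List form of system: arguments go straight to the program,
    # so no shell ever interprets $dir.
    system('/bin/ls', '-l', '--', $dir) == 0
        or warn "ls failed: $?\n";

    # List form of a pipe open, used here as a backtick replacement.
    open(my $ls, '-|', '/bin/ls', '--', $dir)
        or die "Can't run ls: $!";
    my @names = <$ls>;
    close($ls) or warn "ls exited with status $?\n";
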
302 Perl provides a reasonably safe way to open a file or pipe from a setuid
or setgid program: just create a child process with reduced privilege that
304 does the dirty work for you. First, fork a child using the special
305 C<open> syntax that connects the parent and child by a pipe. Now the
306 child resets its ID set and any other per-process attributes, like
307 environment variables, umasks, current working directories, back to the
308 originals or known safe values. Then the child process, which no longer
309 has any special permissions, does the C<open> or other system call.
310 Finally, the child passes the data it managed to access back to the
311 parent. Because the file or pipe was opened in the child while running
312 under less privilege than the parent, it's not apt to be tricked into
313 doing something it shouldn't.
315 Here's a way to do backticks reasonably safely. Notice how the C<exec> is
316 not called with a string that the shell could expand. This is by far the
317 best way to call something that might be subjected to shell escapes: just
318 never call the shell at all.

    use English;
    die "Can't fork: $!" unless defined($pid = open(KID, "-|"));
    if ($pid) {                             # parent
        while (<KID>) { print }             # use the child's output
        close KID;
    } else {                                # child
        my @temp = ($EUID, $EGID);
        my ($orig_uid, $orig_gid) = ($UID, $GID);
        ($EUID, $EGID) = ($UID, $GID);      # Drop privileges
        ($UID, $GID)   = ($orig_uid, $orig_gid);
        # Make sure privs are really gone
        ($EUID, $EGID) = @temp;
        die "Can't drop privileges"
            unless $UID == $EUID && $GID eq $EGID;
        $ENV{PATH} = "/bin:/usr/bin";       # Minimal PATH.
        # Consider sanitizing the environment even more.
        exec 'myprog', 'arg1', 'arg2'
            or die "can't exec myprog: $!";
    }

346 A similar strategy would work for wildcard expansion via C<glob>, although
347 you can use C<readdir> instead.
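
The C<readdir> alternative might look like the following sketch, which
lists the C<.c> files in a directory without involving a shell. The
directory and the filename pattern are illustrative; each name is
captured through a pattern so that only names of the expected shape are
kept.

    use strict;
    use warnings;

    my $dir = '.';
    opendir(my $dh, $dir) or die "Can't open directory $dir: $!";

    my @files;
    while (defined(my $name = readdir($dh))) {
        # Accept only plain names ending in .c and keep the captured
        # copy rather than the raw directory entry.
        if ($name =~ /^([\w.-]+\.c)\z/) {
            push @files, $1;
        }
    }
    closedir($dh);

    print "$_\n" for sort @files;
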
Taint checking is most useful when, although you trust yourself not to have
350 written a program to give away the farm, you don't necessarily trust those
351 who end up using it not to try to trick it into doing something bad. This
352 is the kind of security checking that's useful for set-id programs and
353 programs launched on someone else's behalf, like CGI programs.
355 This is quite different, however, from not even trusting the writer of the
356 code not to try to do something evil. That's the kind of trust needed
357 when someone hands you a program you've never seen before and says, "Here,
358 run this." For that kind of safety, you might want to check out the Safe
359 module, included standard in the Perl distribution. This module allows the
360 programmer to set up special compartments in which all system operations
361 are trapped and namespace access is carefully controlled. Safe should
not be considered bullet-proof, though: it will not prevent the foreign
code from setting up infinite loops, allocating gigabytes of memory, or even
abusing perl bugs to make the host interpreter crash or behave in
365 unpredictable ways. In any case it's better avoided completely if you're
366 really concerned about security.
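
For orientation only, here is a minimal sketch of a Safe compartment
using the module's default operator mask. The code strings are
illustrative, and the caveats above still apply.

    use strict;
    use warnings;
    use Safe;

    my $cpt = Safe->new;    # compartment with the default operator mask

    my $sum = $cpt->reval('2 + 2');
    print "2 + 2 = $sum\n";

    # The default mask traps system(), so this code is rejected inside
    # the compartment and the reason is left in $@.
    my $out = $cpt->reval('system("echo hello")');
    print "blocked: $@" if $@;
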
368 =head2 Shebang Race Condition
370 Beyond the obvious problems that stem from giving special privileges to
371 systems as flexible as scripts, on many versions of Unix, set-id scripts
372 are inherently insecure right from the start. The problem is a race
373 condition in the kernel. Between the time the kernel opens the file to
374 see which interpreter to run and when the (now-set-id) interpreter turns
375 around and reopens the file to interpret it, the file in question may have
376 changed, especially if you have symbolic links on your system.
378 Some Unixes, especially more recent ones, are free of this
379 inherent security bug. On such systems, when the kernel passes the name
380 of the set-id script to open to the interpreter, rather than using a
381 pathname subject to meddling, it instead passes I</dev/fd/3>. This is a
382 special file already opened on the script, so that there can be no race
383 condition for evil scripts to exploit. On these systems, Perl should be
384 compiled with C<-DSETUID_SCRIPTS_ARE_SECURE_NOW>. The F<Configure>
385 program that builds Perl tries to figure this out for itself, so you
386 should never have to specify this yourself. Most modern releases of
387 SysVr4 and BSD 4.4 use this approach to avoid the kernel race condition.
389 If you don't have the safe version of set-id scripts, all is not lost.
390 Sometimes this kernel "feature" can be disabled, so that the kernel
391 either doesn't run set-id scripts with the set-id or doesn't run them
392 at all. Either way avoids the exploitability of the race condition,
393 but doesn't help in actually running scripts set-id.
395 If the kernel set-id script feature isn't disabled, then any set-id
396 script provides an exploitable vulnerability. Perl can't avoid being
397 exploitable, but will point out vulnerable scripts where it can. If Perl
398 detects that it is being applied to a set-id script then it will complain
399 loudly that your set-id script is insecure, and won't run it. When Perl
400 complains, you need to remove the set-id bit from the script to eliminate
401 the vulnerability. Refusing to run the script doesn't in itself close
402 the vulnerability; it is just Perl's way of encouraging you to do this.
404 To actually run a script set-id, if you don't have the safe version of
405 set-id scripts, you'll need to put a C wrapper around
406 the script. A C wrapper is just a compiled program that does nothing
407 except call your Perl program. Compiled programs are not subject to the
kernel bug that plagues set-id scripts. Here's a simple wrapper, written
in C:

    #include <unistd.h>
    #include <stdio.h>
    #include <string.h>
    #include <errno.h>

    #define REAL_PATH "/path/to/script"

    int main(int argc, char **argv) {
        execv(REAL_PATH, argv);     /* returns only on failure */
        fprintf(stderr, "%s: %s: %s\n", argv[0], REAL_PATH, strerror(errno));
        return 1;
    }

426 Compile this wrapper into a binary executable and then make I<it> rather
427 than your script setuid or setgid. Note that this wrapper isn't doing
428 anything to sanitise the execution environment other than ensuring
429 that a safe path to the script is used. It only avoids the shebang
430 race condition. It relies on Perl's own features, and on the script
431 itself being careful, to make it safe enough to run the script set-id.
433 =head2 Protecting Your Programs
435 There are a number of ways to hide the source to your Perl programs,
436 with varying levels of "security".
438 First of all, however, you I<can't> take away read permission, because
439 the source code has to be readable in order to be compiled and
440 interpreted. (That doesn't mean that a CGI script's source is
441 readable by people on the web, though.) So you have to leave the
permissions at the socially friendly 0755 level. This lets
only people on your local system see your source.
445 Some people mistakenly regard this as a security problem. If your program does
446 insecure things, and relies on people not knowing how to exploit those
447 insecurities, it is not secure. It is often possible for someone to
448 determine the insecure things and exploit them without viewing the
449 source. Security through obscurity, the name for hiding your bugs
450 instead of fixing them, is little security indeed.
452 You can try using encryption via source filters (Filter::* from CPAN,
453 or Filter::Util::Call and Filter::Simple since Perl 5.8).
454 But crackers might be able to decrypt it. You can try using the byte
455 code compiler and interpreter described below, but crackers might be
456 able to de-compile it. You can try using the native-code compiler
457 described below, but crackers might be able to disassemble it. These
458 pose varying degrees of difficulty to people wanting to get at your
459 code, but none can definitively conceal it (this is true of every
460 language, not just Perl).
462 If you're concerned about people profiting from your code, then the
463 bottom line is that nothing but a restrictive license will give you
464 legal security. License your software and pepper it with threatening
465 statements like "This is unpublished proprietary software of XYZ Corp.
466 Your access to it does not give you permission to use it blah blah
blah." You should see a lawyer to be sure your license's wording will
stand up in court.

=head2 Unicode

Unicode is a new and complex technology and one may easily overlook
473 certain security pitfalls. See L<perluniintro> for an overview and
474 L<perlunicode> for details, and L<perlunicode/"Security Implications
475 of Unicode"> for security implications in particular.
477 =head2 Algorithmic Complexity Attacks
479 Certain internal algorithms used in the implementation of Perl can
480 be attacked by choosing the input carefully to consume large amounts
of either time or space or both. This can lead to so-called
I<Denial of Service> (DoS) attacks.
488 Hash Algorithm - Hash algorithms like the one used in Perl are well
489 known to be vulnerable to collision attacks on their hash function.
490 Such attacks involve constructing a set of keys which collide into
491 the same bucket producing inefficient behavior. Such attacks often
492 depend on discovering the seed of the hash function used to map the
493 keys to buckets. That seed is then used to brute-force a key set which
can be used to mount a denial of service attack. In Perl 5.8.1 changes
were introduced to harden Perl against such attacks, and then later in
Perl 5.18.0 these features were enhanced and additional protections
added.

At the time of this writing, Perl 5.18.0 is considered to be
well-hardened against algorithmic complexity attacks on its hash
implementation. This is largely owed to the following measures,
which mitigate attacks:

506 =item Hash Seed Randomization
508 In order to make it impossible to know what seed to generate an attack
509 key set for, this seed is randomly initialized at process start. This
510 may be overridden by using the PERL_HASH_SEED environment variable, see
511 L<perlrun/PERL_HASH_SEED>. This environment variable controls how
512 items are actually stored, not how they are presented via
513 C<keys>, C<values> and C<each>.
515 =item Hash Traversal Randomization
517 Independent of which seed is used in the hash function, C<keys>,
518 C<values>, and C<each> return items in a per-hash randomized order.
519 Modifying a hash by insertion will change the iteration order of that hash.
520 This behavior can be overridden by using C<hash_traversal_mask()> from
521 L<Hash::Util> or by using the PERL_PERTURB_KEYS environment variable,
522 see L<perlrun/PERL_PERTURB_KEYS>. Note that this feature controls the
523 "visible" order of the keys, and not the actual order they are stored in.
525 =item Bucket Order Perturbance
527 When items collide into a given hash bucket the order they are stored in
the chain is no longer predictable in Perl 5.18. This is intended to
make it harder to observe a collision. This behavior can be overridden
by using
531 the PERL_PERTURB_KEYS environment variable, see L<perlrun/PERL_PERTURB_KEYS>.
533 =item New Default Hash Function
535 The default hash function has been modified with the intention of making
536 it harder to infer the hash seed.
538 =item Alternative Hash Functions
540 The source code includes multiple hash algorithms to choose from. While we
541 believe that the default perl hash is robust to attack, we have included the
542 hash function Siphash as a fall-back option. At the time of release of
543 Perl 5.18.0 Siphash is believed to be of cryptographic strength. This is
544 not the default as it is much slower than the default hash.
548 Without compiling a special Perl, there is no way to get the exact same
549 behavior of any versions prior to Perl 5.18.0. The closest one can get
550 is by setting PERL_PERTURB_KEYS to 0 and setting the PERL_HASH_SEED
551 to a known value. We do not advise those settings for production use
552 due to the above security considerations.
554 B<Perl has never guaranteed any ordering of the hash keys>, and
555 the ordering has already changed several times during the lifetime of
556 Perl 5. Also, the ordering of hash keys has always been, and continues
557 to be, affected by the insertion order and the history of changes made
558 to the hash over its lifetime.
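
In practice this means sorting explicitly whenever output must be
reproducible. A minimal sketch, with a made-up hash:

    use strict;
    use warnings;

    my %port_for = (http => 80, https => 443, ssh => 22);

    # keys() order may differ between runs and between hashes;
    # sort explicitly whenever the order matters.
    for my $service (sort keys %port_for) {
        printf "%-6s %d\n", $service, $port_for{$service};
    }
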
560 Also note that while the order of the hash elements might be
561 randomized, this "pseudo-ordering" should B<not> be used for
562 applications like shuffling a list randomly (use C<List::Util::shuffle()>
563 for that, see L<List::Util>, a standard core module since Perl 5.8.0;
564 or the CPAN module C<Algorithm::Numerical::Shuffle>), or for generating
565 permutations (use e.g. the CPAN modules C<Algorithm::Permute> or
566 C<Algorithm::FastPermute>), or for any cryptographic applications.
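
A short sketch of shuffling with C<List::Util> instead of relying on hash
order. The list is illustrative, and like hash ordering this is not
meant for cryptographic use.

    use strict;
    use warnings;
    use List::Util qw(shuffle);

    my @deck     = (1 .. 10);
    my @shuffled = shuffle(@deck);
    print "@shuffled\n";
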
Tied hashes may have their own ordering and algorithmic complexity
attacks.
Regular expressions - Perl's regular expression engine is a so-called NFA
574 (Non-deterministic Finite Automaton), which among other things means that
575 it can rather easily consume large amounts of both time and space if the
576 regular expression may match in several ways. Careful crafting of the
577 regular expressions can help but quite often there really isn't much
578 one can do (the book "Mastering Regular Expressions" is required
579 reading, see L<perlfaq2>). Running out of space manifests itself by
580 Perl running out of memory.
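
Two common mitigations are to bound the length of untrusted input before
matching and to write patterns that cannot match the same text in many
ways. The sketch below assumes Perl 5.10 or later for possessive
quantifiers; C<looks_like_word_list> and C<$MAX_LEN> are hypothetical
names.

    use strict;
    use warnings;

    my $MAX_LEN = 256;    # hypothetical limit; pick one that fits your data

    sub looks_like_word_list {
        my ($input) = @_;
        return 0 if length($input) > $MAX_LEN;    # bound the work up front
        # Possessive quantifiers (++, *+) never give back what they
        # matched, so a failing match cannot retry the same text many ways.
        return $input =~ /\A\w++(?:,\w++)*+\z/ ? 1 : 0;
    }

    print looks_like_word_list('alpha,beta,gamma') ? "ok\n" : "rejected\n";
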
584 Sorting - the quicksort algorithm used in Perls before 5.8.0 to
585 implement the sort() function was very easy to trick into misbehaving
586 so that it consumes a lot of time. Starting from Perl 5.8.0 a different
587 sorting algorithm, mergesort, is used by default. Mergesort cannot
588 misbehave on any input.
592 See L<https://www.usenix.org/legacy/events/sec03/tech/full_papers/crosby/crosby.pdf> for more information,
593 and any computer science textbook on algorithmic complexity.
=head2 Using Sudo

The popular tool C<sudo> provides a controlled way for users to be able
598 to run programs as other users. It sanitises the execution environment
599 to some extent, and will avoid the L<shebang race condition|/"Shebang
600 Race Condition">. If you don't have the safe version of set-id scripts,
601 then C<sudo> may be a more convenient way of executing a script as
602 another user than writing a C wrapper would be.
604 However, C<sudo> sets the real user or group ID to that of the target
605 identity, not just the effective ID as set-id bits do. As a result, Perl
606 can't detect that it is running under C<sudo>, and so won't automatically
607 take its own security precautions such as turning on taint mode. Where
608 C<sudo> configuration dictates exactly which command can be run, the
609 approved command may include a C<-T> option to perl to enable taint mode.
611 In general, it is necessary to evaluate the suitability of a script to
612 run under C<sudo> specifically with that kind of execution environment
613 in mind. It is neither necessary nor sufficient for the same script to
be suitable to run in a traditional set-id arrangement, though many of
the issues overlap.

=head1 SEE ALSO

L<perlrun/ENVIRONMENT> for its description of cleaning up environment
variables.